On Sunday 15 January 2006 06:27, Jon Masters wrote:
> On 1/14/06, Kent Ryhorchuk <kryhorchuk@yahoo.com> wrote:
> > I've got it working on a 2GB Samsung NAND flash. This
> > flash supports interleaved operation (two chip
> > selects) and cached programming (odd and even banks on
> > each chip). The host CPU is a 180MHz ARM9 that has a
> > dedicated smart media interface with HW ECC
> > calculation and DMA.
>
> Are you using the DMA engine? It won't use itself :-)

Put a scope on the NAND chip's CE#, ALE and CLE lines and check
that you are driving the chip as tightly as you can. I have found, for
example, that the embedded DMA controller in a Sharp SoC ARM can
generate tighter bus i/o read cycles than the CPU -- often DMA
controllers are designed to perform short bursts of cycles
efficiently. So even if the CPU just spins, waiting for the
DMA to complete, it's a win.

One might be able to do this from the CPU with the cache enabled
for the appropriate I/O space, but caching will break normal
NAND I/O (like page addressing and status polling). I have done
this for NOR reads, but then NOR is much like RAM. Of course with
ARM you could double-map the NAND I/O space, one mapping
non-cached and one cached, and see if a cache-line load generates
tighter read cycles.

Another technique that worked for me on the MPC5200 (PPC) was to
issue 32-bit reads to the (8-bit) NAND chip. The memory
controller converted each of these to a burst of 4 back-to-back
byte reads, and I got a 70% boost. Whatever you do, this kind of
thing gets very hardware specific -- but I have attained much
improved NAND read performance on various platforms/CPUs using
these techniques.
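
A minimal sketch of the 32-bit read idea in C, assuming the memory
controller splits a 32-bit access to an 8-bit device into four byte
cycles and packs the bytes in memory order. The names
nand_read_buf32 and data_reg are hypothetical, not taken from any
driver; byte ordering and whether the burst really is back-to-back
depend on the controller, so check it on the scope.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read len bytes from an 8-bit NAND data register using 32-bit bus
 * reads. data_reg is a hypothetical pointer to the memory-mapped,
 * non-cached NAND data register. Every access targets the same
 * address; the chip advances its own column pointer on each byte
 * cycle. Assumes len is a multiple of 4.
 */
static void nand_read_buf32(volatile const uint32_t *data_reg,
                            uint8_t *buf, size_t len)
{
    assert(len % 4 == 0);

    for (size_t i = 0; i < len; i += 4) {
        /* One 32-bit load; the controller is assumed to turn this
         * into four byte reads on the NAND bus. */
        uint32_t word = *data_reg;

        /* Store the word in host memory order, without assuming
         * buf is 32-bit aligned. */
        memcpy(buf + i, &word, sizeof word);
    }
}
```

A page read would then issue the usual command and address cycles
with byte accesses and call this only for the data phase, e.g.
nand_read_buf32(data_reg, page_buf, 2048) for a 2048-byte page.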

The trick is figuring out how to get the CPU/memory controller in
your design to drive the NAND chip as close to spec as possible.
With regular CPU byte reads, it may be far from optimal.

-imcd