On Tuesday 16 January 2007 12:52 pm, you wrote:
> On Tuesday 16 January 2007 19:47, Chuck Tilbury wrote:
> > I am having a weird problem with my nand driver.  Since I am using yaffs
> > and I have that spew, I hope it is OK that I post here.  Let me know if
> > it's not.
>
> You might get better nand driver support on the mtd list. If you can
> reproduce this with mtd-only functions then it will help cut down your
> debug time.
>
> > When I copy a file to the yaffs file system, I get a lot of spew
> > that looks like this:
> >
> > **>> Block 1 needs retiring
> > **>> Block 2 needs retiring
> > **>> Block 2 retired
> > **>> Block 3 needs retiring
> > **>> Block 3 retired
> > **>> Block 4 needs retiring
> > **>> Block 4 retired
> > ..... messages with sequential block numbers deleted ......
> > **>> Block 779 needs retiring
> > **>> Block 779 retired
> > **>> yaffs write required 780 attempts
> > **>> Block 1 retired
> > # ls -las /mnt
> >    0 drw-rw-rw-    1 0        0             512 .
> >    1 drwxrwxrwx   14 0        0            1024 ..
> >    0 drw-rw-rw-    1 0        0             512 lost+found
> >    0 -rw-r--r--    1 0        0             426 rc
> >
> > The file /etc/rc was written correctly to the FS.
>
> This does look like a nand driver problem. Because it sometimes works, I
> would think that you have marginal timing.
>
> > When I turn on any debugging, it does not retire blocks at all.  In fact,
> > if I insert a printk() anywhere such that it gets called once in a while,
>
> Is this debugging in the nand driver?
> If the printk() is inserted inside the timing critical portions then
> obviously it would be stretching timing from marginal to OK.
> If however it is inserted elsewhere then you need another theory... I
> wonder if it could be a power decoupling issue? Check Vcc for glitching at
> the NAND.
>
> > the retire messages go away and the status in /proc/yaffs shows that no
> > blocks were retired.  If I get rid of all of the debugging messages, I
> > run out of space in a few writes.  (it's a 64M byte part)
> >
> > With all debugging on, I get this spew:
> > # cp /etc/inittab /mnt
> > yaffs_mknod: making oject for inittab, mode 81a4 dev 0
> > Writing chunk 24931 tags 262 0
> > Chunk -1 not found zero instead
> > Chunk -1 not found zero instead
> > Chunk -1 not found zero instead
> > Chunk -1 not found zero instead
> > Chunk -1 not found zero instead
> > Chunk -1 not found zero instead
> > Chunk -1 not found zero instead
> > Chunk -1 not found zero instead
> > Chunk -1 not found zero instead
> > Writing chunk 24932 tags 262 1
> > Writing chunk 24933 tags 262 0
> > line 3142 delete of chunk 24963
> > Writing chunk 24931 tags 0 0
> >
> > # ls -las /mnt
> > yaffs_readdir: starting at 0
> > yaffs_readdir: starting at 5
> >    0 drw-rw-rw-    1 0        0             512 .
> >    1 drwxrwxrwx   14 0        0            1024 ..
> >    0 -rw-r--r--    1 0        0              82 inittab
> >    0 drw-rw-rw-    1 0        0             512 lost+found
> >    0 -rw-r--r--    1 0        0             426 rc
> >
> > Again a correct write.
> >
> > I have been messing around with the timing on the GPIO lines that control
> > the part and I always come back to this.  Can anyone give a pointer?
>
> You need to be very careful with timing. There are many issues which can
> make timing hard to predict.
>
> > Is
> > the "Chunk -1 ..." message an indication of a failure?
>
> No.  You were trying to read a file at a file location that does not exist
> (i.e. a hole in a sparse file or past the end of a file), so yaffs is
> serving up a chunk that is all zeros.
>
> Depending on alignment, reading is part of writing. In misaligned cases
> yaffs must read the chunk, modify it, then write it back.
>
> -- Charles
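
For anyone finding this in the archive, the point above about misaligned
writes (read the chunk, modify it, write it back, with missing chunks read
as zeros) can be sketched roughly as below.  This is a toy in-memory model
and not yaffs code: struct toy_file, toy_read_chunk(), toy_write_chunk(),
toy_write() and the 512-byte chunk size are all made up for illustration.

```c
#include <assert.h>
#include <string.h>

#define CHUNK_SIZE 512   /* assumed chunk data size, for illustration only */
#define MAX_CHUNKS 8     /* toy file capacity, in chunks */

/* Hypothetical in-memory stand-in for a file's chunks; not a yaffs structure. */
struct toy_file {
    unsigned char data[MAX_CHUNKS][CHUNK_SIZE];
    int present[MAX_CHUNKS];   /* 0 = no chunk stored at this index (a hole) */
    int reads;                 /* number of chunk reads performed */
    int holes;                 /* reads that hit a missing chunk */
};

/* Read one chunk; a missing chunk is served as all zeros. */
static void toy_read_chunk(struct toy_file *f, int idx, unsigned char *buf)
{
    assert(idx >= 0 && idx < MAX_CHUNKS);
    f->reads++;
    if (!f->present[idx]) {
        f->holes++;                      /* the "not found, zero instead" case */
        memset(buf, 0, CHUNK_SIZE);
        return;
    }
    memcpy(buf, f->data[idx], CHUNK_SIZE);
}

/* Store one whole chunk. */
static void toy_write_chunk(struct toy_file *f, int idx, const unsigned char *buf)
{
    assert(idx >= 0 && idx < MAX_CHUNKS);
    memcpy(f->data[idx], buf, CHUNK_SIZE);
    f->present[idx] = 1;
}

/* Write len bytes at byte offset off.
 * Returns the number of bytes written, or -1 if the range does not fit. */
static int toy_write(struct toy_file *f, size_t off, const void *src, size_t len)
{
    const unsigned char *p = src;
    size_t done = 0;

    if (off + len > (size_t)MAX_CHUNKS * CHUNK_SIZE)
        return -1;

    while (done < len) {
        int idx = (int)((off + done) / CHUNK_SIZE);
        size_t start = (off + done) % CHUNK_SIZE;   /* offset inside the chunk */
        size_t n = CHUNK_SIZE - start;              /* room left in the chunk */
        unsigned char buf[CHUNK_SIZE];

        if (n > len - done)
            n = len - done;

        if (n == CHUNK_SIZE) {
            /* Aligned, full chunk: nothing to preserve, so no read is needed. */
            toy_write_chunk(f, idx, p + done);
        } else {
            /* Partial chunk: read, patch the affected bytes, write back. */
            toy_read_chunk(f, idx, buf);
            memcpy(buf + start, p + done, n);
            toy_write_chunk(f, idx, buf);
        }
        done += n;
    }
    return (int)done;
}
```

In this model a small write into a chunk that does not exist yet bumps
both the reads and holes counters, which is the same situation the
"Chunk -1 not found zero instead" messages report: a read of missing
data during a write, not a failure.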

I thought I would post the solution to the problem.  One of the guys in the
office backported the MTD system from a 2.4.19 kernel, which we already had
working.  We installed that along with the new NAND driver and it works
perfectly.

The bottom line is that there appears to be a major problem in the MTD system
in the 2.6.16 kernel that breaks the NAND functionality.

-=Chuck=-