Re: [Yaffs] Kernel upgrade 2.6.10 to 2.6.22 causing yaffs co…

Author: Ben Dooks
Date:  
To: yaffs
Subject: Re: [Yaffs] Kernel upgrade 2.6.10 to 2.6.22 causing yaffs corruption
On Thu, Oct 11, 2007 at 10:58:11PM +0100, Ben Dooks wrote:
> On Thu, Oct 11, 2007 at 08:44:14PM +0100, Ben Dooks wrote:
> > On Thu, Oct 11, 2007 at 10:35:55AM -0400, Ian McDonnell wrote:
> > > Ben,
> > >
> > > On Thursday 11 October 2007 07:25, Ben Dooks wrote:
> > > > I'm working on upgrading a client's kernel from 2.6.10 to
> > > > 2.6.22. They are using yaffs to store their configuration
> > > > data.
> > > >
> > > > When I start the 2.6.22 kernel, I get a number of the
> > > > following: **>> yaffs chunk 32 was not erased
> > > >
> > > > If I run 2.6.10 after writing the files from 2.6.22 I get
> > > > blocks with ECC errors in them, and the files get corrupted.
> > >
> > > The problem is with a change to MTD -- MTD's NAND interface had
> > > an overhaul between 2.6.17 and mid-late 2.6.18. MTD's default
> > > layout of oob data in the NAND page's spare area changed. This
> > > means that yaffs' tag metadata gets scrambled when you jump
> > > between versions of MTD. There are ways to fix this: you can
> > > supply MTD with your own layout policy. Rummage through the mail
> > > archive to find some discussion on the topic.
> > > http://aleph1.co.uk/lurker/list/yaffs.en.html
> >
> > I didn't find anything relevant to this. I have found one problem:
> > if CONFIG_YAFFS_DOES_ECC is set, and the mtd driver has an ECC
> > method, then it seems to be using the mtd hardware ECC code during
> > reads. The Kconfig help text seems to imply this should override
> > anything NAND is up to?
> >
> > Even with the hardware ECC disabled, nipping between the two
> > kernels is still giving:
> >
> > yaffs_ECCCorrect: READ ff,ff,bf WANT ff,ff,ff, DIFF 00,00,40
> >
> > The diff value is _always_ 00,00,40 whatever the ECC being read.
>
> With the diff always being this, I believe the mtd oob layout is now
> synchronised between the two kernel versions... both kernels can see
> the file in question, but the ecc correction causes the file to end
> up being 'corrected' when it does not need it.
>
> I will continue running debug on this, because before the debugging
> the two ecc generators were exactly the same file.
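
As a quick sanity check on the DIFF line quoted above, here is a minimal
standalone C sketch that XORs the READ and WANT bytes and counts the
differing bits. It is not taken from the yaffs source, and ecc_diff_bits
is a hypothetical helper name used only for illustration. For ff,ff,bf
against ff,ff,ff it gives 00,00,40, that is, exactly one differing bit, in
the third ECC byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of yaffs): XOR two 3-byte ECC values
 * into diff[] and return how many bits differ between them.
 */
static int ecc_diff_bits(const uint8_t read_ecc[3],
                         const uint8_t want_ecc[3],
                         uint8_t diff[3])
{
    int bits = 0;

    assert(read_ecc != NULL && want_ecc != NULL && diff != NULL);

    for (size_t i = 0; i < 3; i++) {
        diff[i] = read_ecc[i] ^ want_ecc[i];

        /* Count set bits by clearing the lowest set bit each pass. */
        for (uint8_t b = diff[i]; b != 0; b &= (uint8_t)(b - 1))
            bits++;
    }
    return bits;
}
```

A lone differing bit that is always in the same position would be
consistent with one bit of the stored ECC itself being written or read
differently, rather than with a flipped data bit, but that reading is an
assumption and has not been confirmed here.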


I've captured debug output from both 2.6.10 and 2.6.22 showing the data in the
OOB:

writing file on 2.6.10:

nandmtd_WriteChunkToNAND: writing chunk 32 oob00,00,10,00 ff,ff,05,01
nandmtd_WriteChunkToNAND: writing chunk 33 oob01,00,10,80 ff,ff,05,01
nandmtd_WriteChunkToNAND: writing chunk 34 oob02,00,10,80 ff,ff,05,01
nandmtd_WriteChunkToNAND: writing chunk 35 oob03,00,10,80 ff,ff,05,01
nandmtd_WriteChunkToNAND: writing chunk 36 oob04,00,10,80 ff,ff,05,01

on 2.6.22 on mount:

==> chunk 0: OOB read00,00,10,00 05,01,04,40
==> chunk 32: OOB read00,00,10,00 00,ff,05,01
==> chunk 64: OOB read20,00,10,80 ff,ff,05,01

It seems to be reading the same data from chunk 32, but it is now
ignoring the file on the filesystem. Chunk 0 is on the first block
which now seems to be worn out from the repeated erase/program cycles
of testing.
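
The two chunk 32 dumps can also be compared byte by byte. The following is
a minimal standalone C sketch, not yaffs code; first_oob_mismatch is a
hypothetical helper name. Fed the eight bytes logged on write
(00,00,10,00 ff,ff,05,01) and the eight logged on mount
(00,00,10,00 00,ff,05,01), it reports offset 4, where ff was written and
00 was read back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of yaffs): return the offset of the
 * first byte that differs between two OOB dumps, or -1 if all len
 * bytes match.
 */
static int first_oob_mismatch(const uint8_t *written,
                              const uint8_t *read_back,
                              size_t len)
{
    assert(written != NULL && read_back != NULL);

    for (size_t i = 0; i < len; i++) {
        if (written[i] != read_back[i])
            return (int)i;
    }
    return -1;
}
```

If the dump follows the yaffs1 spare layout in order (an assumption, since
the MTD layout is exactly what is in question), offset 4 would be the page
status byte, so it may be worth checking whether that byte is what makes
the scan skip the chunk.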

--
Ben

Q:      What's a light-year?
A:      One-third less calories than a regular year.