Re: [Yaffs] Fix for yaffs_ECCCorrectOther

Author: Charles Manning
Date:  
To: yaffs
CC: Blair Barnett, Paul Lima, Gennady Dagman
Subject: Re: [Yaffs] Fix for yaffs_ECCCorrectOther
On Friday 15 September 2006 06:14, Gennady Dagman wrote:
> Hello,
>
> While investigating a Linux kernel oops (attached), I found:
>
> The traceback: nandmtd2_ReadChunkWithTagsFromNAND -> yaffs_UnpackTags2 ->
> yaffs_ECCCorrectOther
>
> The line of code that caused the crash in yaffs_ECCCorrectOther:
>         data[lDelta] ^= (1 << bit);
> At this point the data address is in r4 = c34f3cd4 and the offset (lDelta)
> is in r1 = 30303034 (WRONG! It should not be bigger than
> sizeof(yaffs_PackedTags2TagsPart)-1 = 15)

>
> The real cause of this failure was a mysteriously corrupted tag and
> tag's ECC read from flash. Register and stack analysis shows that at
> this point we had:
>
> data(tag) = 303030cc 303030fe 30183030 18303030
> test_ecc = 15 00000004 fffffffb
> read_ecc = 30 30303030 30303030
>
> Questions:
> ----------
>
> 1) I believe yaffs_ECCCorrectOther is lacking this robustness check:
>
>  if(lDelta >=  nBytes)
>    return -1;

>
> just before: data[lDelta] ^= (1 << bit);
>
> 2) Probably we should also check that
>     read_ecc->lineParity = ~read_ecc->lineParityPrime
> or, at least, the difference between
>     read_ecc->lineParity and ~read_ecc->lineParityPrime
> not more than 1 bit ??


Thank you. I will investigate this further.
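
For reference, here is a minimal standalone sketch of the check proposed in point 1. It is not the YAFFS source: the names ecc_other, ecc_correct_other_checked and replay_reported_case are made up for illustration, the decode logic is assumed to follow the usual line/column parity scheme, and the path that repairs a single-bit error in the ECC itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the tags ECC record; not the YAFFS definition. */
typedef struct {
    unsigned char colParity;
    uint32_t lineParity;
    uint32_t lineParityPrime;
} ecc_other;

/*
 * Returns 0 if the ECCs match, 1 if a single data bit was corrected,
 * -1 otherwise. Assumed decode scheme: lDelta is the byte index,
 * cDelta selects the bit within that byte.
 */
int ecc_correct_other_checked(unsigned char *data, size_t nBytes,
                              const ecc_other *read_ecc,
                              const ecc_other *test_ecc)
{
    unsigned char cDelta;
    uint32_t lDelta, lDeltaPrime;
    unsigned bit = 0;

    assert(data && read_ecc && test_ecc);

    cDelta = (unsigned char)(read_ecc->colParity ^ test_ecc->colParity);
    lDelta = read_ecc->lineParity ^ test_ecc->lineParity;
    lDeltaPrime = read_ecc->lineParityPrime ^ test_ecc->lineParityPrime;

    if ((cDelta | lDelta | lDeltaPrime) == 0)
        return 0;                       /* no error */

    if (lDelta == (uint32_t)~lDeltaPrime &&
        ((cDelta ^ (cDelta >> 1)) & 0x15) == 0x15) {
        /* Looks like a single-bit error in the data. */
        if (cDelta & 0x20) bit |= 0x04;
        if (cDelta & 0x08) bit |= 0x02;
        if (cDelta & 0x02) bit |= 0x01;

        /* The proposed guard: never index outside the buffer. */
        if (lDelta >= nBytes)
            return -1;

        data[lDelta] ^= (unsigned char)(1u << bit);
        return 1;                       /* corrected */
    }

    return -1;                          /* treated as unrecoverable here */
}

/* Replays the ECC values from the report; buffer contents do not matter. */
int replay_reported_case(void)
{
    unsigned char tag[16];
    const ecc_other test_ecc = { 0x15, 0x00000004u, 0xfffffffbu };
    const ecc_other read_ecc = { 0x30, 0x30303030u, 0x30303030u };

    memset(tag, 0x30, sizeof tag);
    return ecc_correct_other_checked(tag, sizeof tag, &read_ecc, &test_ecc);
}
```

Under these assumptions the reported values give lDelta = 0x30303034 and lDeltaPrime = 0xcfcfcfcb, so lDelta equals ~lDeltaPrime and the column parity pattern also passes. The corrupt ECC is therefore mistaken for a correctable single-bit error, and only the bounds check stops the out-of-range write.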

>
> 3) What could be the reason for this strange tag+ecc corruption? It looks
> like it was (over)written with ASCII '0' characters. Could it be the
> result of an unfinished (interrupted by power off, h/w reset etc) nand write?


I doubt this is caused by an interrupted NAND write. It looks more like a
bus integrity or driver issue causing an incorrect address to be set up for
the NAND write.

>
> 4) I see that currently ecc errors during nand_read are ignored silently,
> as are all other read errors. Maybe it's worth reporting them at
> least: error counters, yaffs traces? As for ecc errors, I suggest
> putting in the following trace:
>
> if (t->eccResult)
> {
> T(YAFFS_TRACE_ALWAYS,
> (TSTR("ecc error=%d colParity=0x%0x lineParity=0x%0x lineParityPrime=0x%0x" TENDSTR),
> t->eccResult, pt->ecc.colParity, pt->ecc.lineParity,
> pt->ecc.lineParityPrime));
> }
>
> in yaffs_UnpackTags2, just after the call to yaffs_ECCCorrectOther. It
> could make life easier.
>
> Thanks,
>
> Gennady Dagman.
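
The counters-plus-trace idea in point 4 could look roughly like the following standalone sketch. Everything in it is hypothetical (the ecc_result codes, ecc_stats, ecc_other and ecc_report are illustration names, not YAFFS types or functions), and snprintf into a caller-supplied buffer stands in for the T() trace macro so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result codes, standing in for whatever t->eccResult holds. */
typedef enum { ECC_OK = 0, ECC_FIXED = 1, ECC_UNFIXED = 2 } ecc_result;

/* Hypothetical per-device error counters. */
typedef struct {
    unsigned long fixed;    /* errors that were corrected */
    unsigned long unfixed;  /* errors that could not be corrected */
} ecc_stats;

/* Hypothetical stand-in for the tags ECC record. */
typedef struct {
    unsigned char colParity;
    unsigned int lineParity;
    unsigned int lineParityPrime;
} ecc_other;

/*
 * Bump the matching counter and format one trace line into buf.
 * Returns 0 when there is nothing to report, otherwise the value
 * returned by snprintf.
 */
int ecc_report(ecc_stats *stats, ecc_result result, const ecc_other *ecc,
               char *buf, size_t len)
{
    assert(stats && ecc && buf && len > 0);

    if (result == ECC_OK) {
        buf[0] = '\0';
        return 0;
    }

    if (result == ECC_FIXED)
        stats->fixed++;
    else
        stats->unfixed++;

    return snprintf(buf, len,
                    "ecc error=%d colParity=0x%x lineParity=0x%x "
                    "lineParityPrime=0x%x",
                    (int)result, (unsigned int)ecc->colParity,
                    ecc->lineParity, ecc->lineParityPrime);
}
```

Keeping the counters separate from the trace text means the totals stay available even when tracing is switched off.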