On Friday 15 September 2006 06:14, Gennady Dagman wrote:
> Hello,
>
> While investigating linux kernel oops crash (attached) I found:
>
> The traceback: nandmtd2_ReadChunkWithTagsFromNAN -> yaffs_UnpackTags2 ->
> yaffs_ECCCorrectOther
>
> The line of code that caused the crash in yaffs_ECCCorrectOther:
>     data[lDelta] ^= (1 << bit);
> At this point data address is in r4 = c34f3cd4 and offset (lDelta) is in
> r1 = 30303034 (WRONG ! should not be bigger than
> sizeof(yaffs_PackedTags2TagsPart)-1 = 15)
>
> The real cause of this failure was a mysteriously corrupted tag and
> tag's ECC read from flash. Register and stack analysis shows that at
> this point we had:
>
> data(tag) = 303030cc 303030fe 30183030 18303030
> test_ecc  = 15 00000004 fffffffb
> read_ecc  = 30 30303030 30303030
>
> Questions:
> ----------
>
> 1) I believe yaffs_ECCCorrectOther is lacking this robustness check:
>
>     if(lDelta >= nBytes)
>         return -1;
>
> just before: data[lDelta] ^= (1 << bit);
>
> 2) Probably we should also check that
>     read_ecc->lineParity = ~read_ecc->lineParityPrime
> or, at least, the difference between
> read_ecc->lineParity and ~read_ecc->lineParityPrime
> not more than 1 bit ??

Thank you. I will investigate this further.

> 3) What could be the reason for this strange tag+ecc corruption ? Looks
> like it was (over)written with ascii 0s.. Could it be the result of
> unfinished (interrupted by power off, h/w reset etc) nand write ?

I doubt this is caused by an interrupted NAND write. It looks more like
a bus integrity or driver issue causing the incorrect address to be set
up in the NAND write.

> 4) I see that currently ecc errors during nand_read are ignored silently
> as long as all other read errors. Maybe it's worth to report them at
> least: error counters, yaffs traces ? As for ecc errors I suggest to
> put the following trace:
>
>     if (t->eccResult)
>     {
>         T(YAFFS_TRACE_ALWAYS,
>           (TSTR("ecc error=%d colParity=0x%0x lineParity=0x%0x lineParityPrime=0x%0x" TENDSTR),
>            t->eccResult, pt->ecc.colParity, pt->ecc.lineParity,
>            pt->ecc.lineParityPrime));
>     }
>
> in yaffs_UnpackTags2 just after call to yaffs_ECCCorrectOther it could
> make life easier.
>
> Thanks,
>
> Gennady Dagman.
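
For reference while looking into question 1, here is a minimal, standalone sketch of how the proposed bounds check would behave on the values from the report. It is not the yaffs source: the type ecc_other_sketch and the functions ecc_correct_checked and replay_reported_case are hypothetical names, the field widths are assumed, and the single-bit test is the usual Hamming-style signature, written here as an assumption about what the real routine checks. The "error in the ECC itself" case is left out and simply reported as uncorrectable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ECC record. Field names follow the trace
 * in the report; the field widths are an assumption. */
typedef struct {
    uint8_t  colParity;
    uint32_t lineParity;
    uint32_t lineParityPrime;
} ecc_other_sketch;

/* Returns 0 if stored and computed ECC agree, 1 if a single data bit was
 * corrected, -1 if the error cannot be corrected safely. */
static int ecc_correct_checked(uint8_t *data, size_t nBytes,
                               const ecc_other_sketch *read_ecc,
                               const ecc_other_sketch *test_ecc)
{
    uint8_t  cDelta = (uint8_t)(read_ecc->colParity ^ test_ecc->colParity);
    uint32_t lDelta = read_ecc->lineParity ^ test_ecc->lineParity;
    uint32_t lDeltaPrime =
        read_ecc->lineParityPrime ^ test_ecc->lineParityPrime;
    unsigned bit = 0;

    if (cDelta == 0 && lDelta == 0 && lDeltaPrime == 0)
        return 0;

    /* Assumed single-bit signature: the line deltas are complements and
     * each column parity pair differs in exactly one of its two bits. */
    if (lDelta != (uint32_t)~lDeltaPrime ||
        ((cDelta ^ (cDelta >> 1)) & 0x15) != 0x15)
        return -1;

    /* Bit position inside the byte, taken from the column parity delta. */
    if (cDelta & 0x20)
        bit |= 0x04;
    if (cDelta & 0x08)
        bit |= 0x02;
    if (cDelta & 0x02)
        bit |= 0x01;

    /* The proposed robustness check: never index outside the buffer. */
    if (lDelta >= nBytes)
        return -1;

    data[lDelta] ^= (uint8_t)(1u << bit);
    return 1;
}

/* Replays the ECC values from the report against a 16-byte tag buffer.
 * The tag contents are placeholder bytes; they do not affect the check. */
static int replay_reported_case(void)
{
    uint8_t tag[16];
    const ecc_other_sketch read_ecc = { 0x30, 0x30303030u, 0x30303030u };
    const ecc_other_sketch test_ecc = { 0x15, 0x00000004u, 0xfffffffbu };
    size_t i;

    for (i = 0; i < sizeof tag; i++)
        tag[i] = 0x30;

    return ecc_correct_checked(tag, sizeof tag, &read_ecc, &test_ecc);
}
```

With the reported values, lDelta works out to 0x30303034 (the value seen in r1) and the corrupted ECC happens to satisfy the single-bit signature, so without the bounds check the write lands far outside the 16-byte tag. With the check in place, replay_reported_case() returns -1 and the buffer is left untouched.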