[Yaffs] Fix for yaffs_ECCCorrectOther

Author: Gennady Dagman
Date:  
To: yaffs
CC: Blair Barnett, Paul Lima
Subject: [Yaffs] Fix for yaffs_ECCCorrectOther
Hello,

While investigating a Linux kernel oops (attached), I found the following.

The traceback: nandmtd2_ReadChunkWithTagsFromNAN -> yaffs_UnpackTags2 ->
yaffs_ECCCorrectOther

The line of code that caused the crash in yaffs_ECCCorrectOther:
        data[lDelta] ^= (1 << bit);
At this point the data address is in r4 = c34f3cd4 and the offset (lDelta) is in
r1 = 30303034 (WRONG! It should not be bigger than
sizeof(yaffs_PackedTags2TagsPart)-1 = 15).


The real cause of this failure was a mysteriously corrupted tag, and
corrupted tag ECC, read from flash. Register and stack analysis shows
that at this point we had:

data(tag) = 303030cc 303030fe 30183030 18303030
test_ecc = 15 00000004 fffffffb
read_ecc = 30 30303030 30303030
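
The dumped values can be cross-checked against the faulting offset. The sketch below assumes that yaffs_ECCCorrectOther derives its line-parity deltas by XOR-ing the ECC read from flash with the ECC recomputed from the data; the helper and constant names are illustrative and are not YAFFS identifiers.

```c
#include <stdint.h>

/* Hypothetical helper: delta between a stored and a recomputed parity word. */
static uint32_t parity_delta(uint32_t read_parity, uint32_t test_parity)
{
    return read_parity ^ test_parity;
}

/* Values copied from the register and stack analysis above. */
static const uint32_t READ_LINE_PARITY       = 0x30303030u;
static const uint32_t READ_LINE_PARITY_PRIME = 0x30303030u;
static const uint32_t TEST_LINE_PARITY       = 0x00000004u;
static const uint32_t TEST_LINE_PARITY_PRIME = 0xfffffffbu;
```

Under that assumption, parity_delta(READ_LINE_PARITY, TEST_LINE_PARITY) evaluates to 0x30303034, the value seen in r1, and parity_delta(READ_LINE_PARITY_PRIME, TEST_LINE_PARITY_PRIME) evaluates to 0xcfcfcfcb, the value seen in r5. The two deltas are exact bitwise complements of each other.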

Questions:
----------

1) I believe yaffs_ECCCorrectOther is lacking this robustness check:

        if (lDelta >= nBytes)
                return -1;

just before: data[lDelta] ^= (1 << bit);
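
As a minimal standalone sketch of the check proposed in 1), the bit flip can be isolated in a helper that refuses out-of-range offsets. The function name and signature are hypothetical, and the extra test on the bit index is an addition for illustration only; it is not part of the proposal above.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the final step of the correction routine:
 * flip one bit at byte offset `offset`, refusing offsets outside the buffer.
 * Returns 1 if a bit was flipped, -1 if the request is out of range.
 */
static int flip_bit_checked(unsigned char *data, size_t n_bytes,
                            size_t offset, unsigned bit)
{
    if (offset >= n_bytes || bit > 7)
        return -1;              /* corrupt ECC: leave memory untouched */

    data[offset] ^= (unsigned char)(1u << bit);
    return 1;
}
```

With a 16-byte tag buffer, an offset of 0x30303034 is rejected with -1 instead of writing far outside the buffer, while offsets 0 to 15 are still corrected as before.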

2) We should probably also check that
    read_ecc->lineParity == ~read_ecc->lineParityPrime
or, at least, that
    read_ecc->lineParity and ~read_ecc->lineParityPrime
differ by no more than 1 bit?
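
One way the test in 2) could be expressed is to count the bits in which lineParity differs from the complement of lineParityPrime. This is only a sketch: the helper names are hypothetical, and whether every valid stored ECC actually satisfies this relation depends on how the ECC is calculated, which is not established here.

```c
#include <stdint.h>

/* Count set bits in a 32-bit word (portable, no compiler builtins). */
static unsigned count_bits32(uint32_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;             /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Hypothetical check: number of bits separating line_parity from the
 * complement of line_parity_prime. 0 means they are exact complements.
 */
static unsigned parity_mismatch_bits(uint32_t line_parity,
                                     uint32_t line_parity_prime)
{
    return count_bits32(line_parity ^ (uint32_t)~line_parity_prime);
}
```

For the values in this report, the recomputed ECC (00000004, fffffffb) gives a mismatch of 0 bits, while the ECC read from flash (30303030, 30303030) gives 32 bits, so a "no more than 1 bit" threshold would have rejected it.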


3) What could be the reason for this strange tag+ECC corruption? It looks
as if it was (over)written with ASCII '0' characters (0x30). Could it be
the result of an unfinished NAND write (interrupted by power off, h/w
reset, etc.)?
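
To put a number on the "written with 0x30" observation in 3), a small helper can count how many bytes of the dumped tag words equal 0x30. The helper name is hypothetical; the byte count does not depend on endianness because every byte of every word is examined.

```c
#include <stddef.h>
#include <stdint.h>

/* Count how many bytes of the given 32-bit words equal ASCII '0' (0x30). */
static size_t count_ascii_zero_bytes(const uint32_t *words, size_t n_words)
{
    size_t count = 0;
    size_t i;
    unsigned shift;

    for (i = 0; i < n_words; i++) {
        for (shift = 0; shift < 32; shift += 8) {
            if (((words[i] >> shift) & 0xffu) == 0x30u)
                count++;
        }
    }
    return count;
}
```

Applied to the four tag words above (303030cc 303030fe 30183030 18303030), this counts 12 of the 16 bytes as 0x30; the read ECC words (30303030 30303030) consist entirely of 0x30 bytes.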

4) I see that ECC errors during nand_read are currently ignored silently,
as are all other read errors. It may be worth at least reporting them,
e.g. via error counters or yaffs traces. For ECC errors I suggest
adding the following trace:

if (t->eccResult)
{
        T(YAFFS_TRACE_ALWAYS,
          (TSTR("ecc error=%d colParity=0x%0x lineParity=0x%0x lineParityPrime=0x%0x" TENDSTR),
           t->eccResult, pt->ecc.colParity, pt->ecc.lineParity,
           pt->ecc.lineParityPrime));
}

Placed in yaffs_UnpackTags2 just after the call to yaffs_ECCCorrectOther,
it could make life easier.
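
The error counters mentioned in 4) could look roughly like the sketch below. The structure and function are hypothetical and do not exist in YAFFS; the sketch assumes the ECC result follows the convention 0 = no error, positive = corrected, negative = uncorrectable (the -1 matches the return value proposed in 1), the other two values are an assumption).

```c
/* Hypothetical per-device ECC statistics; not an existing YAFFS structure. */
struct ecc_stats {
    unsigned long clean;      /* result == 0: no error detected   */
    unsigned long corrected;  /* result  > 0: error was corrected */
    unsigned long failed;     /* result  < 0: uncorrectable error */
};

/* Record one ECC result; returns nonzero when the result deserves a trace. */
static int ecc_stats_record(struct ecc_stats *s, int ecc_result)
{
    if (ecc_result == 0) {
        s->clean++;
        return 0;
    }
    if (ecc_result > 0)
        s->corrected++;
    else
        s->failed++;
    return 1;
}
```

The return value could gate the trace shown above, so that a clean read costs one increment and nothing is printed.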

Thanks,

Gennady Dagman.



Unable to handle kernel paging request at virtual address f37f6d08
pgd = c3a2c000
[f37f6d08] *pgd=00000000, *pmd = 00000000
Internal error: Oops: f5
CPU: 0
pc : [<c00836fc>]    lr : [<00000004>]    Not tainted
sp : c34f3c74  ip : 00000025  fp : c34f3c94
r10: c34f3d5c  r9 : 00000000  r8 : 00000025
r7 : c34f3ce4  r6 : c34f3c98  r5 : cfcfcfcb  r4 : c34f3cd4
r3 : 00000015  r2 : 00000001  r1 : 30303034  r0 : ffffffff
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  Segment user
Control: 397F  Table: A3A2C000  DAC: 00000015
Process CNI_gui (pid: 331, stack limit = 0xc34f2368)
Stack: (0xc34f3c74 to 0xc34f4000)
3c60:                                              c34f3c98 c34f3d5c c34f3cd4
3c80: 03788000 00000000 c34f3cbc c34f3c98 c0084258 c0083628 c3ea2015 00000004
3ca0: fffffffb c34f3cd4 c3ea2000 c02d6a00 c34f3d18 c34f3cc0 c0084900 c0084208
3cc0: c34f3cd0 c029d9e0 c010ac58 0100007f 00000040 303030cc 303030fe 30183030
3ce0: 18303030 c34f3d30 30303030 30303030 00000010 000001bd c3ea2000 00006f50
3d00: 00000000 000003a1 c34f3d5c c34f3d28 c34f3d1c c0085470 c008483c c34f3d54
3d20: c34f3d2c c007bbc8 c0085440 c3e861c0 c3ea2000 000003d1 c34f3d5c ffffffff
3d40: c3ea2000 001e8000 c34f3dc4 c34f3d58 c007e160 c007bae4 000003d1 aaaaaaaa
3d60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3d80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3da0: 55555555 c3e861c0 c30c5000 00000000 000003d1 00001000 c34f3ddc c34f3dc8
3dc0: c007e3dc c007e100 00000000 00000800 c34f3e14 c34f3de0 c007fd34 c007e3cc
3de0: 00000000 c30c5000 c3e861c0 c30c5000 c026c200 c3e861c0 c026c218 c3ea2000
3e00: c01c5444 00000000 c34f3e40 c34f3e18 c00789cc c007fba0 c026c200 c365c2d4
3e20: c026c200 00000000 c02fd808 c36e7ca0 c36e7ca0 c34f3e58 c34f3e44 c0078aec
3e40: c0078938 00000000 c365c2d4 c34f3e80 c34f3e5c c003a9ec c0078ae4 000001e8
3e60: 00000007 0000021e c36e7ca0 c389aea0 c365c2d4 c34f3ea0 c34f3e84 c003aa80
3e80: c003a92c 00000000 000001eb c02fd814 0000021e c34f3ed4 c34f3ea4 c003c6a0
3ea0: c003aa34 c365c264 0000021e c389aea0 c3890fcc 001f3000 001f3000 c00133a0
3ec0: c34f3fb0 00000000 c34f3f08 c34f3ed8 c0037be8 c003c578 c0037b6c c0021a18
3ee0: c34f2000 c00133a0 00000000 001f3000 c389aea0 c34f3fb0 00000007 c34f3f34
3f00: c34f3f0c c0037e24 c0037b90 c3890fcc c389aea0 001f3000 c389aea0 00000000
3f20: c34f2000 c00133a0 c34f3f5c c34f3f38 c00211dc c0037db0 c00133bc c34f2000
3f40: c00133a0 c34f3fb0 001f3938 00000007 c34f3f8c c34f3f60 c0021324 c0021174
3f60: 002b9000 00000007 c017506c 001f3938 c34f3fb0 0025d530 0022d398 002b7610
3f80: c34f3fac c34f3f90 c002177c c00212c8 ffffffff c34f3fec 00000000 0000397f
3fa0: 00000000 c34f3fb0 c001b49c c0021754 001f3938 0000002a 002b8588 001f3938
3fc0: 001f3938 00000000 00000000 001f3938 0025d530 0022d398 002b7610 002b7570
3fe0: bffffd50 bffffd38 001aad40 001abbd8 20000010 ffffffff ffffffff ffffffff
Backtrace:
Function entered at [<c008361c>] from [<c0084258>]
 r8 = 00000000  r7 = 03788000  r6 = C34F3CD4  r5 = C34F3D5C
 r4 = C34F3C98
Function entered at [<c00841fc>] from [<c0084900>]
 r6 = C02D6A00  r5 = C3EA2000  r4 = C34F3CD4
Function entered at [<c0084830>] from [<c0085470>]
Function entered at [<c0085434>] from [<c007bbc8>]
Function entered at [<c007bad8>] from [<c007e160>]
Function entered at [<c007e0f4>] from [<c007e3dc>]
 r8 = 00001000  r7 = 000003D1  r6 = 00000000  r5 = C30C5000
 r4 = C3E861C0
Function entered at [<c007e3c0>] from [<c007fd34>]
 r5 = 00000800  r4 = 00000000
Function entered at [<c007fb94>] from [<c00789cc>]
Function entered at [<c007892c>] from [<c0078aec>]
Function entered at [<c0078ad8>] from [<c003a9ec>]
 r5 = C365C2D4  r4 = 00000000
Function entered at [<c003a920>] from [<c003aa80>]
Function entered at [<c003aa28>] from [<c003c6a0>]
 r7 = 0000021E  r6 = C02FD814  r5 = 000001EB  r4 = 00000000
Function entered at [<c003c56c>] from [<c0037be8>]
Function entered at [<c0037b84>] from [<c0037e24>]
Function entered at [<c0037da4>] from [<c00211dc>]
 r8 = C00133A0  r7 = C34F2000  r6 = 00000000  r5 = C389AEA0
 r4 = 001F3000
Function entered at [<c0021168>] from [<c0021324>]
Function entered at [<c00212bc>] from [<c002177c>]
Function entered at [<c0021748>] from [<c001b49c>]
 r7 = 0000397F  r6 = 00000000  r5 = C34F3FEC  r4 = FFFFFFFF
Code: 138ee002 e31c0002 138ee001 e3a02001 (e7d43001)