Hi Charles,
Thanks for your reply.
I've looked deeper into the code; my comments are inline below.
> On Fri, May 5, 2017 at 10:34 AM, Lukasz Majewski <lukma@denx.de>
> wrote:
>
> > Dear All,
> >
> > I'm working on embedded system equipped with NAND Flash memory.
> >
> > The code is pretty old and corresponds to SHA1:
> > 60f5ecebdeee37d56f33374c407376f596baa468
> >
> > from: git://www.aleph1.co.uk/yaffs2
> >
> >
> > From my debugging I do see two bit flips (should be 1s, but I read
> > 0s) happening in the same chunk of data (0x100), from which ECC is
> > calculated.
> > As far as I know the ECC will be "correct" for two bit flips.
> >
> >
> > I've looked to the yaffs_ecc.* files and found following comment:
> >
> >
> > /*
> > * This code implements the ECC algorithm used in SmartMedia.
> > *
> > * The ECC comprises 22 bits of parity information and is stuffed
> > * into 3 bytes.
> > * The two unused bit are set to 1.
> > * The ECC can correct single bit errors in a 256-byte page of data.
> > * Thus, two such ECC blocks are used on a 512-byte NAND page.
> > *
> > */
> >
> > So it seems like two bit flips are not detected - only a single bit
> > flip is detected and corrected.
> >
> > Is there any way to mitigate this issue?
> >
>
> There are many ways to do ECC in the system. Yaffs provides an ECC
> function that is really intended for older flash devices (SmartMedia
> etc - really, really old) running in Yaffs1 mode where 1-bit
> correcting ECC was the norm. That Yaffs ECC code is also used for
> protecting very small blocks of unprotected data (eg. tags).
I'm using yaffs2 with useNANDECC = 1, so I rely on the Linux MTD
subsystem (NAND flash driver -> nand_ecc.c) to calculate the ECC.
Unfortunately, the Linux version I use (2.6.27) only supports
NAND_ECC_SOFT, which corresponds to Hamming 1bit/256B correction.
I think I will switch to BCH codes (4bit/512B).
One question - will yaffs2 work if the ECC stored in the OOB grows
from 24B to 28B?
(I'm also wondering how I could set up a test environment to validate
the switch to the BCH ECC scheme - use the nandsim driver from the kernel?)
Or would you recommend something better?
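
For the 24B -> 28B figures above, here is a quick arithmetic sketch
(Python, for illustration only). It assumes that the Hamming scheme
stores 3 bytes per 256B step, as in the SmartMedia comment quoted
earlier, and that a t-bit BCH code over a 512B step needs m*t parity
bits, where m is the smallest value for which a codeword of 2^m - 1 bits
holds data plus parity. The function names are made up for this example;
they are not kernel or yaffs2 APIs.

```python
import math


def hamming_ecc_bytes(page_size, step=256):
    """ECC bytes per page for a 1-bit Hamming code: 3 bytes per step."""
    return (page_size // step) * 3


def bch_ecc_bytes(page_size, step=512, t=4):
    """ECC bytes per page for a t-bit BCH code over `step` data bytes."""
    # Find the smallest m such that a codeword of 2**m - 1 bits can hold
    # the data bits plus the m*t parity bits.
    m = 1
    while (1 << m) - 1 < step * 8 + m * t:
        m += 1
    # Parity bits are rounded up to whole bytes for each step.
    return (page_size // step) * math.ceil(m * t / 8)


print(hamming_ecc_bytes(2048))  # 24
print(bch_ecc_bytes(2048))      # 28
```

With a 2KiB page this gives 8 steps x 3B = 24B for Hamming and
4 steps x 7B = 28B for BCH (m = 13, 52 parity bits per step).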
>
> In most circumstances (ie. running Yaffs2 mode), the actual ECC that
> is used is done in the driver and it is not part of Yaffs.
>
>
Yes, you are right - the ECC is calculated in the NAND driver. However,
with my kernel both algorithms are the same (1bit/256B ECC), so 24B of
ECC are stored in the OOB.
>
>
> >
> > IMHO the NAND flash page for this data chunk shall be marked as BAD
> > -> but we cannot detect such errors.
> >
> > Is there any plan to implement new algorithm?
> >
>
> When Yaffs2 was introduced (many years ago now), the biggest
> motivation was that there are different NAND types with different ECC
> requirements, different write order requirements etc.
>
> ie. the decision was made that it is impossible to try handle all the
> ECC variants (hw/sw/BCH/...) within Yaffs and it was better to move
> that into the driver.
>
> If you can post a bit more info about the flash parts you are using
> etc, then I can give a better appraisal.
I'm using a Samsung NAND flash memory - 128M x 8B. It uses 2KiB pages
with 128KiB erase blocks. Its internal controller does not implement
ECC (only EDC).
I'm also wondering what would happen if:
- I store a file in the YAFFS2 FS
- This file is mostly RO (some library)
- During torture testing it happens that:
-- I get 1 bit flip -> no problem, ECC will correct it
-- I get 2 bit flips in the same 256B of ECC-covered data -
is this data regarded as valid, or is eccResult =
YAFFS_ECC_RESULT_UNFIXED set?
On my systems it seems that the Yaffs2 chunk is treated as
a correct one (data is read from this file, but its checksum
differs from that of the factory file).
What can be done in such a situation? How can it be fixed?
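
To make the 1-flip / 2-flip cases above concrete, here is a toy Python
model of a 22-bit parity code over 256 bytes, of the kind described in
the quoted yaffs_ecc comment. It is only a sketch: the bit layout and
the names calc_ecc() and correct() are assumptions for illustration,
and it is neither the yaffs_ecc.c nor the nand_ecc.c implementation.
Each of the 2048 data bits has an 11-bit address, and every address bit
gets an "odd" and an "even" parity bit (11 pairs = 22 bits).

```python
ADDR_BITS = 11                    # 256 bytes * 8 bits = 2048 positions
MASK = (1 << ADDR_BITS) - 1


def calc_ecc(data):
    """Return the (odd, even) parity words for a 256-byte block."""
    assert len(data) == 256
    odd = even = 0
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if (byte >> bit) & 1:
                addr = (byte_index << 3) | bit
                odd ^= addr             # parity over address bits == 1
                even ^= ~addr & MASK    # parity over address bits == 0
    return odd, even


def correct(data, stored):
    """Return 0 = clean, 1 = single bit corrected, -1 = uncorrectable."""
    odd, even = calc_ecc(bytes(data))
    s_odd, s_even = odd ^ stored[0], even ^ stored[1]
    if s_odd == 0 and s_even == 0:
        return 0
    if s_odd ^ s_even == MASK:
        # Exactly one bit of every parity pair differs: single data bit
        # flip, and s_odd is the address of the flipped bit.
        data[s_odd >> 3] ^= 1 << (s_odd & 7)
        return 1
    if bin(s_odd).count("1") + bin(s_even).count("1") == 1:
        return 1                        # the flip is in the ECC itself
    return -1


good = bytes(range(256))
ecc = calc_ecc(good)

one = bytearray(good)
one[0x10] ^= 0x04                       # one bit flip
print(correct(one, ecc), bytes(one) == good)    # 1 True

two = bytearray(good)
two[0x10] ^= 0x04                       # two bit flips in the same 256B
two[0x80] ^= 0x20
print(correct(two, ecc))                # -1
```

In this model two flips inside one 256B block give a syndrome that is
neither zero nor a valid single-bit pattern, so they are reported as
uncorrectable rather than silently accepted (three or more flips can
still be miscorrected). If the real driver code behaves the same way,
the open question is whether that "uncorrectable" result reaches yaffs2
as YAFFS_ECC_RESULT_UNFIXED.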