Re: [Yaffs] ECC algorithm is hamming code now, will any new …

Author: Thomas Gleixner
Date:  
To: Charles Manning
CC: yaffs
Subject: Re: [Yaffs] ECC algorithm is hamming code now, will any new algorithm enter YAFFS?
On Sun, 2005-08-07 at 16:07 +1200, Charles Manning wrote:
> > M-systems uses a BCH code to support MLC inside its DOC products. This
> > algorithm can detect and correct 4-bit errors. Will YAFFS employ
> > any other new ECC algorithm?
>
> Are the 4 bits 4 bits per page, or what? With most ECC structures used with
> NAND, the ECC corrects one bad bit per 256 bytes. Correcting more requires
> larger ECC areas and more ECC computation (hardware or software).
>
> Since ECC is part of mtd (or whatever NAND layer you are using), this is really
> independent of YAFFS.
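
For reference, the standard single-bit ECC Charles refers to is just
row/column parity over each 256 byte block. A rough sketch of the
computation follows; the packing of the 22 parity bits into the 3 ECC
bytes is illustrative only, real implementations (e.g. nand_ecc.c in
Linux) order and invert the bits differently.

#include <stdint.h>

/* Parity of a byte: 1 if an odd number of bits are set. */
static uint8_t parity(uint8_t b)
{
    b ^= b >> 4;
    b ^= b >> 2;
    b ^= b >> 1;
    return b & 1;
}

/*
 * Compute the 22 parity bits (16 line parities LP0..LP15 plus 6 column
 * parities CP0..CP5) over a 256 byte block, SmartMedia style.
 */
void ecc_256(const uint8_t *data, uint8_t ecc[3])
{
    uint16_t lp = 0;  /* line parities over the byte address bits  */
    uint8_t col = 0;  /* XOR of all bytes, source of column parity */
    uint8_t cp;
    int i, b;

    for (i = 0; i < 256; i++) {
        col ^= data[i];
        if (parity(data[i])) {
            /* Byte has odd parity: fold it into LP(2b+1) if address
             * bit b of i is set, into LP(2b) if it is clear. */
            for (b = 0; b < 8; b++)
                lp ^= 1u << (2 * b + ((i >> b) & 1));
        }
    }

    cp  = parity(col & 0x55) << 0;  /* CP0: data bits 0,2,4,6 */
    cp |= parity(col & 0xAA) << 1;  /* CP1: data bits 1,3,5,7 */
    cp |= parity(col & 0x33) << 2;  /* CP2: data bits 0,1,4,5 */
    cp |= parity(col & 0xCC) << 3;  /* CP3: data bits 2,3,6,7 */
    cp |= parity(col & 0x0F) << 4;  /* CP4: data bits 0..3    */
    cp |= parity(col & 0xF0) << 5;  /* CP5: data bits 4..7    */

    /* Illustrative packing; real drivers use a different bit layout. */
    ecc[0] = lp & 0xff;
    ecc[1] = lp >> 8;
    ecc[2] = cp;
}

On read you XOR the stored and the recomputed ECC: all zero means no
error, exactly one bit set in each of the 11 LP/CP pairs means a
correctable single-bit error (the set bits encode its position),
anything else is uncorrectable.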


The NAND/MTD layer supports a couple of different ECC solutions. The DoC
devices use a Reed-Solomon code, which is supported by an en/decoder
library.
Reed-Solomon codes can detect and correct more errors than the
SmartMedia Hamming code, which has been the standard ECC since NAND came
along. OTOH such codes need hardware support, because the calculation in
software would be too time consuming. DoCs have a built-in hardware RS
encoder.
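
For the curious, the parameters the Linux diskonchip driver feeds to
the rslib decoder look roughly like this. This is a sketch from memory,
not a verbatim excerpt; check drivers/mtd/nand/diskonchip.c and
lib/reed_solomon/ for the real thing.

#include <linux/errno.h>
#include <linux/rslib.h>

/*
 * RS code over GF(2^10): 10-bit symbols, field polynomial 0x409,
 * first consecutive root 510, primitive element 1, 4 parity symbols.
 * 4 parity symbols correct up to 2 symbol errors per codeword, and
 * since a symbol is 10 bits wide, a burst of bit errors inside one
 * symbol still counts as a single error.
 */
static struct rs_control *rs_decoder;

static int doc_rs_init(void)
{
    rs_decoder = init_rs(10, 0x409, 510, 1, 4);
    return rs_decoder ? 0 : -ENOMEM;
}

From what I recall of the driver, the hardware computes the checksum
side on the fly, and software only has to run the correction step when
a read actually fails; that is what makes RS affordable there.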

> It is also important to consider the most likely failure modes. I am not
> familiar with MLC failure modes, but single bit errors (as corrected by ECC)
> are typically very rare with NAND (as used by YAFFS). Double bit errors are
> even rarer. I have done tests a few times where over 100 Gbytes of data
> was written to a file system without a single bit of corruption. Since
> 100 Gbytes translates into many lifetimes of most mobile/embedded products, I
> am pretty confident that for most usages bit errors are not a significant
> problem when used with single-bit ECC.


100 GiB of data relative to which partition size?

Let's assume a 16 MiB partition to which you write 100 GiB of data. Let's
further assume that, due to garbage collection, wear levelling and so on,
a total of 256 GiB actually gets written to the FLASH.

256 GiB / 16 MiB = 16384

That means we erased / programmed each block of the FLASH 16k times.
This is nowhere near the typical 100k erase cycle limit.
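
In code, the back-of-the-envelope check (the 2.5x blow-up from 100 GiB
to 256 GiB is the assumption from the example above):

#include <stdio.h>

int main(void)
{
    unsigned long long partition = 16ULL << 20;   /* 16 MiB partition       */
    unsigned long long written   = 256ULL << 30;  /* 256 GiB hits the FLASH */

    /* Every block gets erased/programmed this many times on average. */
    printf("cycles per block: %llu\n", written / partition);  /* 16384 */
    return 0;
}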

We conducted long-term tests, where we definitely encountered multi-bit
errors in different contexts. The effects start to show up when you
reach the 60k erase/program cycle range. There are also known effects
with occasional bitflips on read, but those seem to be single-bit only.


tglx