Fwd: [Yaffs] bit error rates]

Author: Jon Masters
Date:  
To: yaffs mail list
Subject: Fwd: [Yaffs] bit error rates]
---------- Forwarded message ----------
From: Jon Masters <>
Date: Feb 15, 2006 11:53 PM
Subject: Re: [Yaffs] bit error rates]
To: William Watson <>


On 2/15/06, William Watson <> wrote:

> I will also note that a NAND vendor who paid us a visit at about that same
> time said that we should expect WORSE soft error behaviour with succeeding
> generations of NAND flash chips. The geometries would get smaller and
> smaller, the chip dies would get larger and larger, and the amount of time
> for production testing of each chip would not increase, or at least, not
> increase as fast as the total storage of a chip. Thus, the testing per page
> would only go down in subsequent generations of chips. These two statements
> seemed to say that we would see both (1) increased rates of ECC errors, and
> (2) an increase in the number of marginal blocks not marked bad by the chip
> vendor.


Given that, it sounds like it would be better to be extra cautious.
I agree that marking OOB data is a good idea; maybe I'll get to look
at that.

> Another obvious alternative strategy for preventing data loss due to
> accumulation of multiple bit errors would be to periodically read the entire
> data array, checking for ECC errors. You'd want to calculate the impact
> that such reading would have on the rate of appearance of errors, as well as
> the impact on system and NAND performance. For a standard file system, it
> might suffice to perform one additional data chunk read for every N read
> requests, incrementing the "scrub" page each time. This would ensure a
> complete read scrub at a fixed percentage overhead. One could also perform
> a read scrub every M write operations, if desired.
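
The "one extra chunk read for every N read requests" idea quoted above can be sketched roughly as follows. This is only an illustration under assumptions: the struct and function names (scrub_state, scrub_init, scrub_on_read) are hypothetical and are not part of YAFFS or MTD, and the actual chunk read and ECC check are left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scrub bookkeeping; none of these names come from YAFFS or MTD. */
struct scrub_state {
    size_t total_chunks;        /* number of data chunks in the array */
    size_t next_chunk;          /* next chunk the scrubber will read */
    unsigned reads_per_scrub;   /* N: host reads between scrub reads */
    unsigned reads_since_scrub; /* host reads seen since the last scrub read */
    unsigned long passes;       /* completed full passes over the array */
};

void scrub_init(struct scrub_state *s, size_t total_chunks, unsigned n)
{
    assert(total_chunks > 0 && n > 0);
    s->total_chunks = total_chunks;
    s->next_chunk = 0;
    s->reads_per_scrub = n;
    s->reads_since_scrub = 0;
    s->passes = 0;
}

/*
 * Call once per host read request. Returns 1 and stores the chunk to
 * scrub in *chunk when this request is the Nth since the last scrub
 * read; otherwise returns 0 and leaves *chunk untouched.
 */
int scrub_on_read(struct scrub_state *s, size_t *chunk)
{
    if (++s->reads_since_scrub < s->reads_per_scrub)
        return 0;
    s->reads_since_scrub = 0;
    *chunk = s->next_chunk;
    if (++s->next_chunk == s->total_chunks) {
        s->next_chunk = 0;      /* wrap around: one full pass is done */
        s->passes++;
    }
    return 1;
}
```

With this scheme the extra read load is 1/N of the host read traffic (1% for N = 100), and a full pass over the array completes after N times total_chunks host reads. A "scrub every M writes" variant would use the same counter driven from the write path.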


A low-priority kernel thread that gets woken up about as often as
kswapd could do this, and it probably wouldn't have much impact if it
only runs when nothing else is using the flash part. This would be
better placed in the MTD layer, though, and might necessitate some
changes to the locking currently used.
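
As a rough userspace sketch of the "only run when nothing else is using the flash" part: the scrubber tries to claim the device without waiting and simply backs off if it is busy. Everything here is an assumption for illustration. The flash_dev type and its functions are hypothetical, a C11 atomic_flag stands in for whatever locking MTD really uses, and the actual chunk read and ECC check are omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical device handle; real MTD locking looks different. */
struct flash_dev {
    atomic_flag busy;        /* set while any user owns the device */
    size_t total_chunks;     /* number of data chunks on the part */
    size_t next_chunk;       /* next chunk the scrubber will visit */
    unsigned long scrubbed;  /* chunks visited by the scrubber so far */
};

void flash_init(struct flash_dev *d, size_t total_chunks)
{
    assert(total_chunks > 0);
    atomic_flag_clear(&d->busy);
    d->total_chunks = total_chunks;
    d->next_chunk = 0;
    d->scrubbed = 0;
}

/* Try to claim the device without waiting; returns 1 on success. */
int flash_try_claim(struct flash_dev *d)
{
    return !atomic_flag_test_and_set(&d->busy);
}

void flash_release(struct flash_dev *d)
{
    atomic_flag_clear(&d->busy);
}

/*
 * One wakeup of the background scrubber: visit a single chunk if the
 * device is idle, otherwise back off. Returns 1 if a chunk was visited.
 */
int scrub_step_if_idle(struct flash_dev *d)
{
    if (!flash_try_claim(d))
        return 0;            /* someone else is using the flash */
    /* A real scrubber would read d->next_chunk and check ECC here. */
    d->next_chunk = (d->next_chunk + 1) % d->total_chunks;
    d->scrubbed++;
    flash_release(d);
    return 1;
}
```

Doing one chunk per wakeup keeps the time the scrubber holds the device short, so a foreground request never waits behind more than a single scrub read.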

Jon.