Hi, Ricard and Charles
> > But now, MTD layer read function only return -EUCLEAN to YAFFS2, YAFFS2 cannot
> > get how many bit flip occur.
>
> Since a couple of kernel versions back, the mtd layer now includes a
> method of notifying how many bits were corrected during a read operation,
> and not just -EUCLEAN. So the mechanism to find out how many bitflips have
> occurred actually exists now.
> (Not all NAND flash controllers with hardware ECC support this however,
> i.e. they correct the data but it's not possible to read from the hardware
> how many bits were corrected (or in some cases even if bit correction
> occurred).)
Yes, you are right. I checked the latest Linux kernel just now, and there is indeed a mechanism
to find out how many bitflips have occurred.
> > But actually, in NAND flash, only program and erase error can be marked bad
> > block. Bit flip symptom is easy happen after a page is read many cycles.
> >
> > If a system use YAFFS2, and never power down this system, user will see a
> > lot of bad block after they run a time, But this block isn't a real bad block.
> >
> > How about just refresh the block when bit flip occur, but not record the bit
> > flip count, and mark it as bad block?
>
> I would think that one factor in deciding if the block is going really bad
> would be to estimate how many read cycles have been done since the last
> rewrite. If there have been a lot of read cycles, or a lot of time has
> passed since the last rewrite, it would be more expected for bitflips to
> occur and hence the block should not be marked as bad.
Exactly, but I am afraid it is hard to track read cycles for a block in YAFFS2.
As the standard Linux driver can report the number of bitflips, I think it is possible to use the
method Charles just mentioned, with the following error levels:
* NO_ERROR : No problems.
* REFRESH_NEEDED: (new) Refresh block, don't worry about it going bad.
* FIXED_BUT_SUSPECT: Treated same as FIXED is now, retire the block if it does this again.
* UNFIXED
Define two thresholds, one for REFRESH_NEEDED and the other for FIXED_BUT_SUSPECT.
The REFRESH_NEEDED threshold should be lower than the FIXED_BUT_SUSPECT threshold.
Under normal conditions, a block is refreshed as soon as its bitflip count reaches the REFRESH_NEEDED threshold,
so it should never exceed the FIXED_BUT_SUSPECT threshold. When the bitflip count does exceed the FIXED_BUT_SUSPECT
threshold, I assume something abnormal has occurred. YAFFS2 can record this for the block, and if it happens more
than 3 times, YAFFS2 can mark the block as bad.
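
To make the idea concrete, here is a rough sketch in C of how the two thresholds could be applied to a read result. It is only an illustration, not existing YAFFS2 code: the names classify_read, block_should_retire, struct ecc_policy and struct block_state are made up, and the threshold values are assumptions that would have to be tuned to the ECC strength of the actual NAND part.

```c
#include <assert.h>
#include <stdbool.h>

/* Error levels from the list above. */
enum ecc_result {
    NO_ERROR,
    REFRESH_NEEDED,
    FIXED_BUT_SUSPECT,
    UNFIXED
};

/* Hypothetical tuning values; real ones depend on the ECC strength. */
struct ecc_policy {
    unsigned refresh_threshold;   /* bitflip count that triggers a refresh */
    unsigned suspect_threshold;   /* bitflip count treated as abnormal */
    unsigned max_suspect_events;  /* abnormal events tolerated per block */
};

/* Hypothetical per-block bookkeeping. */
struct block_state {
    unsigned suspect_events;      /* times the block exceeded suspect_threshold */
};

/* Map one read (corrected bitflips, or an uncorrectable error) to a level. */
static enum ecc_result classify_read(const struct ecc_policy *p,
                                     unsigned bitflips, bool uncorrectable)
{
    /* The refresh threshold must sit below the suspect threshold. */
    assert(p->refresh_threshold < p->suspect_threshold);

    if (uncorrectable)
        return UNFIXED;
    if (bitflips > p->suspect_threshold)
        return FIXED_BUT_SUSPECT;
    if (bitflips >= p->refresh_threshold)
        return REFRESH_NEEDED;
    return NO_ERROR;
}

/*
 * Record the result for a block and report whether it should be retired.
 * Only FIXED_BUT_SUSPECT results count; handling of UNFIXED is left out here.
 */
static bool block_should_retire(const struct ecc_policy *p,
                                struct block_state *b, enum ecc_result r)
{
    if (r == FIXED_BUT_SUSPECT)
        b->suspect_events++;
    return b->suspect_events > p->max_suspect_events;
}
```

With, say, refresh_threshold = 2, suspect_threshold = 4 and max_suspect_events = 3, a read with 2 to 4 corrected bits only schedules a refresh, and a block is retired only on its fourth read with more than 4 corrected bits.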
I am not sure whether this process makes sense; what do you think? If it is OK with you, I will make a YAFFS2 patch
and submit it to you.
Thanks