Hi Ricard and Charles,

> > But now, MTD layer read function only return -EUCLEAN to YAFFS2, YAFFS2 cannot
> > get how many bit flip occur.
>
> Since a couple of kernel versions back, the mtd layer now includes a 
> method of notifying how many bits were corrected during a read operation, 
> and not just -EUCLEAN. So the mechanism to find out how many bitflips have 
> occurred actually exists now.

> (Not all NAND flash controllers with hardware ECC support this however, 
> i.e. they correct the data but it's not possible to read from the hardware 
> how many bits were corrected (or in some cases even if bit correction 
> occurred).)

Yes, you are right. I checked the latest Linux kernel just now, and there is indeed
a mechanism to find out how many bitflips have occurred.

> > But actually, in NAND flash, only program and erase error can be marked bad
> > block. Bit flip symptom is easy happen after a page is read many cycles.
> > 
> > If a system use YAFFS2, and never power down this system, user will see a
> > lot of bad block after they run a time, But this block isn't a real bad block.
> > 
> > How about just refresh the block when bit flip occur, but not record the bit
> > flip count, and mark it as bad block?
>
> I would think that one factor in deciding if the block is going really bad 
> would be to estimate how many read cycles have been done since the last 
> rewrite. If there have been a lot of read cycles, or a lot of time has 
> passed since the last rewrite, it would be more expected for bitflips to 
> occur and hence the block should not be marked as bad.

Exactly, but I am afraid it is hard to track read cycles for a block in YAFFS2.
Since the standard Linux driver can report the number of bitflips, I think it is
possible to use the method Charles just mentioned, with these error levels:
* NO_ERROR : No problems.
* REFRESH_NEEDED: (new) Refresh block, don't worry about it going bad.
* FIXED_BUT_SUSPECT: Treated same as FIXED is now, retire the block if it does this again.
* UNFIXED

Define two thresholds: one for REFRESH_NEEDED and the other for FIXED_BUT_SUSPECT.
The REFRESH_NEEDED threshold should be lower than the FIXED_BUT_SUSPECT threshold.
Under normal conditions, a block is refreshed as soon as its bitflip count reaches the
REFRESH_NEEDED threshold, so it should never exceed the FIXED_BUT_SUSPECT threshold.
When the bitflip count does exceed the FIXED_BUT_SUSPECT threshold, I assume something
abnormal has occurred. YAFFS2 can record this for the block, and if it happens more than
3 times, YAFFS2 can mark the block as bad.
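
To make the idea concrete, here is a rough sketch in C of how the classification
could look. All names here (classify_read, handle_read, struct ecc_policy,
struct block_state) are made up for illustration and are not existing YAFFS2 code.
I assume the lower threshold belongs to REFRESH_NEEDED and the higher one to
FIXED_BUT_SUSPECT, and that the driver reports the bitflip count for each read.

```c
#include <assert.h>

/* Hypothetical names throughout; this is not existing YAFFS2 code. */
enum ecc_result {
    ECC_NO_ERROR,
    ECC_REFRESH_NEEDED,
    ECC_FIXED_BUT_SUSPECT,
    ECC_UNFIXED
};

struct ecc_policy {
    unsigned refresh_threshold;   /* bitflips at which the block is refreshed */
    unsigned suspect_threshold;   /* bitflips above which the read is abnormal */
    unsigned max_suspect_events;  /* abnormal reads tolerated before retiring */
};

struct block_state {
    unsigned suspect_events;  /* abnormal reads seen so far on this block */
    int needs_refresh;        /* block should be rewritten elsewhere */
    int retired;              /* block should be marked bad */
};

/* Map one read result to an error level. */
enum ecc_result classify_read(const struct ecc_policy *p,
                              unsigned bitflips, int uncorrectable)
{
    /* The scheme only makes sense if refresh happens first. */
    assert(p->refresh_threshold < p->suspect_threshold);

    if (uncorrectable)
        return ECC_UNFIXED;
    if (bitflips > p->suspect_threshold)
        return ECC_FIXED_BUT_SUSPECT;
    if (bitflips >= p->refresh_threshold)
        return ECC_REFRESH_NEEDED;
    return ECC_NO_ERROR;
}

/* Update the per-block bookkeeping for one classified read. */
void handle_read(const struct ecc_policy *p, struct block_state *b,
                 enum ecc_result r)
{
    switch (r) {
    case ECC_NO_ERROR:
        break;
    case ECC_REFRESH_NEEDED:
        b->needs_refresh = 1;  /* expected wear-out of the data, not the block */
        break;
    case ECC_FIXED_BUT_SUSPECT:
        b->needs_refresh = 1;
        b->suspect_events++;
        if (b->suspect_events > p->max_suspect_events)
            b->retired = 1;    /* abnormal too often: treat as bad block */
        break;
    case ECC_UNFIXED:
        /* Uncorrectable data: left to the existing handling, not modelled here. */
        break;
    }
}
```

With max_suspect_events set to 3, a block would only be retired on its fourth
abnormal read, which matches "more than 3 times" above. The actual threshold
values would have to depend on the ECC strength of the NAND in use.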

I am not sure whether this process makes sense. What do you think? If it is OK with you,
I will prepare a YAFFS2 patch and submit it to you.
Thanks