>>
>>I think that's a bad idea. The block should be marked as bad. It's not
>>worth losing data just to avoid theoretically marking a good block
>>bad - it doesn't seem to happen in practice. I'd rather lose all of
>>the good blocks than lose any data, and so would many other people.
>>
> But if you lose ALL the good blocks, you lose your data too! ;-)
I tend to agree with you. Early retirement of blocks with correctable ECC
errors doesn't seem so clearly a good idea to me either.
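For anyone following along, the policy in question, as I understand it,
amounts to something like the sketch below. This is hypothetical pseudo-C,
not the actual YAFFS source; all of the names (handle_read_result(),
copy_block_elsewhere(), mark_block_bad()) are invented for illustration.

    /* Hypothetical sketch of the policy being debated -- NOT the actual
     * YAFFS code.  All names here are invented for illustration. */
    #include <stdio.h>

    enum ecc_result { ECC_OK, ECC_CORRECTED, ECC_UNCORRECTABLE };

    /* Stubs standing in for the real flash layer. */
    static void copy_block_elsewhere(int blk) { printf("rewriting data from block %d\n", blk); }
    static void mark_block_bad(int blk)       { printf("retiring block %d\n", blk); }

    /* Called after every chunk read, once the ECC result is known. */
    static void handle_read_result(int block, enum ecc_result ecc)
    {
        if (ecc == ECC_OK)
            return;

        /* In both remaining cases the data is copied off first,
         * so no data is lost either way. */
        copy_block_elsewhere(block);

        /* Early retirement: even a *corrected* error costs the block. */
        mark_block_bad(block);
    }

    int main(void)
    {
        handle_read_result(42, ECC_CORRECTED);   /* block 42 is retired */
        return 0;
    }

So the debate is only about that last step for ECC_CORRECTED reads: the
data is safe either way; the question is whether the block should survive.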
Even if new flash chips don't exhibit this failure mode often in
"accelerated lifetime" testing, that's not to say a future generation of
NAND flash chips won't (or that old ones won't be resurrected in future
surplus markets). Since this behavior is already documented by Toshiba, you
can be certain that if a future NAND flash manufacturing process has a side
effect of this happening more often, the manufacturers will use it in a
heartbeat to keep their yields high. Also, one version of "accelerated
lifetime" testing may not quite mesh with what actually happens to real
devices over several years of deployment in the field. The environments
people place their embedded designs in can be surprisingly hostile.
>
>>
>>>That's because Toshiba document says about soft errors: "This condition
>>>is cleared by a block erase".
>>>
>>
>>Sure. But it might be indicative of a problem nonetheless.
>>
> Is this statement based on any documentation, or on your personal
> experience? The Toshiba document also says: "Although random bit errors
> may occur during use, this does not necessarily mean that a block is
> bad." However, if this is not true, can you point me to other
> documentation on the relationship between random bit errors and permanent
> block failures?
>
> Sorry, I don't want to start a flame war; I just want to understand the
> YAFFS bad block marking policy, and whether there is a better solution.
> Excuse my English, it's not my native language.
I would at least hope for a knob so that one could turn off this
"early retirement" of good blocks.
Even if it's rare, the probability of it happening tracks with the amount
of reading (not writing) you do on the flash, and an application that reads
flash at full speed for years on end (a not-so-rare embedded design) could
accumulate a large number of bad blocks this way. I would hope not to have
to worry about the flash wearing out under 100% read loads. It's also
harder to do accelerated lifetime analysis on such a design, since the
application is already reading at close to maximum speed.
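A back-of-envelope illustration of why this worries me (every rate below is
an assumption I've picked for illustration, not a measured figure):

    #include <stdio.h>

    int main(void)
    {
        /* All numbers here are invented for illustration only. */
        double reads_per_sec   = 2000.0;     /* assumed sustained page reads/s */
        double p_soft_per_read = 1e-9;       /* assumed corrected-ECC
                                                probability per read */
        double seconds = 5.0 * 365.25 * 24 * 3600;   /* 5 years of uptime */

        /* With early retirement, each soft error costs one good block. */
        printf("expected blocks retired: %.0f\n",
               reads_per_sec * p_soft_per_read * seconds);
        return 0;
    }

Even with a one-in-a-billion chance per read, those numbers come out to
roughly 300 good blocks retired over five years of flat-out reading.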
This issue particularly resonates with me because we've seen a small number
of our boards running YAFFS and NAND flash returned to us with large
numbers of blocks marked bad that weren't actually bad. I'm not sure
whether the YAFFS policy of retiring blocks early was to blame, but it's
the most likely explanation so far. Needless to say, our customers weren't
impressed, since they believed their application did very little flash
writing (but a lot of flash reading).
//Jesse Off