Re: [Yaffs] Bad eraseblocks and NAND / ECC layouts

Author: Rong Shen
Date:
To: Noah Fontes
CC: yaffs
Subject: Re: [Yaffs] Bad eraseblocks and NAND / ECC layouts

On Fri, May 29, 2009 at 3:23 PM, Noah Fontes <nfontes@cynigram.com> wrote:
> > Rong Shen wrote:
>>> This would miss certain classes of error - such as charge leakage (e.g.
>>> the cells will erase or set themselves over a day/week/month/year).
>>>
>>> http://lists.openmoko.org/pipermail/community/2007-July/008471.html
>
> So, from what I understand of this message, the BBT is first generated
> by a capable bootloader (which the RouterBOARD doesn't have) out of the
> factory and then regenerated when necessary by Linux when errors are
> detected? Neat...

AFAIK, most manufacturers put bad block marks in the spare area.
In my case the bad block table is created in memory when Linux boots,
though there may be setups where the BBT is stored on flash. But you
are right, blocks can be marked as bad whenever an error happens.
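
To illustrate the in-memory table, here is a rough sketch that scans the
spare-area marks and builds a one-bit-per-block table. It runs against a
simulated chip, not a real MTD device: the geometry, the marker offset
(BAD_MARK_OFFS) and all function names are my own assumptions for
illustration, and the real marker position depends on the chip, so check
the datasheet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical geometry of a tiny simulated chip. */
#define NUM_BLOCKS      8
#define PAGES_PER_BLOCK 4
#define OOB_SIZE        16
#define BAD_MARK_OFFS   5   /* assumed marker offset; chip dependent */

/* Simulated spare (OOB) area: one record per page. */
static uint8_t oob[NUM_BLOCKS][PAGES_PER_BLOCK][OOB_SIZE];

/* In-memory bad block table: one bit per block, 1 = bad. */
static uint8_t bbt[(NUM_BLOCKS + 7) / 8];

/* Reset the simulation: erased flash reads back as 0xFF everywhere. */
static void sim_init(void)
{
    memset(oob, 0xFF, sizeof oob);
    memset(bbt, 0, sizeof bbt);
}

/* Simulate a factory bad block mark in the first page of a block. */
static void sim_mark_bad(int block)
{
    oob[block][0][BAD_MARK_OFFS] = 0x00;
}

/* A block counts as bad if the marker byte in either of its
 * first two pages is anything other than 0xFF. */
static int block_marked_bad(int block)
{
    for (int page = 0; page < 2; page++)
        if (oob[block][page][BAD_MARK_OFFS] != 0xFF)
            return 1;
    return 0;
}

/* Scan every block and rebuild the table; returns the bad block count. */
static int build_bbt(void)
{
    int bad = 0;

    memset(bbt, 0, sizeof bbt);
    for (int b = 0; b < NUM_BLOCKS; b++) {
        if (block_marked_bad(b)) {
            bbt[b / 8] |= (uint8_t)(1u << (b % 8));
            bad++;
        }
    }
    return bad;
}

/* Look a block up in the in-memory table. */
static int bbt_is_bad(int block)
{
    assert(block >= 0 && block < NUM_BLOCKS);
    return (bbt[block / 8] >> (block % 8)) & 1;
}
```

Since the table lives only in RAM in this sketch, it has to be rebuilt by
scanning on every boot.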

>
>> You are probably right on this, but no one can guarantee a good block
>> will never turn to bad block, even most of manufactures would say in
>> their datasheet that bad blocks may develop later on. So if there's no
>> 100 percent reliable way to detect bad blocks, why don't we try the
>> above process, which can at least detect non-intermittent errors.
>
> Isn't this how the scanner in the Linux kernel works anyway? From the
> code...
>
>     /* Compare the pattern */
>     for (i = 0; i < td->len; i++) {
>         if (p[td->offs + i] != td->pattern[i])
>             return -1;
>     }
>     return 0;
>
> and this is simply run at boot time. I don't see the harm in
> implementing something similar to that in a memtest86 fashion. Or am I
> missing something here?

That's about right, but more specifically, you need to
1. erase a block
2. write patterns (pre-filled in a buffer) to it - most NAND flash
would allow multiple writes to a page, so better just write a whole
block in one go.
3. read the block back into a buffer
4. do the comparing step as you mentioned - and mark the block as bad
if an error happens
5. repeat the above steps for the other blocks
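
The steps above can be sketched roughly as follows. This is only a
simulation under my own assumptions: the flash is an array in RAM, the
sim_* helpers and the stuck-bit defect model are hypothetical, and on
real hardware the erase/write/read would go through the MTD device
instead. Note that the test is destructive and, as said earlier in the
thread, only catches non-intermittent errors.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical geometry of a tiny simulated chip. */
#define NUM_BLOCKS 4
#define BLOCK_SIZE 64
#define NO_STUCK   (-1)

static uint8_t flash[NUM_BLOCKS][BLOCK_SIZE];
/* Defect model: per block, index of a byte whose bit 0 is stuck at 0. */
static int stuck_byte[NUM_BLOCKS];
static int marked_bad[NUM_BLOCKS];

static void sim_init(void)
{
    memset(flash, 0xFF, sizeof flash);
    for (int i = 0; i < NUM_BLOCKS; i++) {
        stuck_byte[i] = NO_STUCK;
        marked_bad[i] = 0;
    }
}

/* Force the stuck bit (if any) back to 0 after every operation. */
static void apply_defect(int block)
{
    if (stuck_byte[block] != NO_STUCK)
        flash[block][stuck_byte[block]] &= 0xFE;
}

static void sim_erase(int block)
{
    memset(flash[block], 0xFF, BLOCK_SIZE);
    apply_defect(block);
}

/* NAND programming can only clear bits (1 -> 0), hence the AND. */
static void sim_write(int block, const uint8_t *buf)
{
    for (int i = 0; i < BLOCK_SIZE; i++)
        flash[block][i] &= buf[i];
    apply_defect(block);
}

static void sim_read(int block, uint8_t *buf)
{
    memcpy(buf, flash[block], BLOCK_SIZE);
}

/* Steps 1-4 for one block and one pattern: 0 if it reads back intact. */
static int test_block_pattern(int block, uint8_t pattern)
{
    uint8_t wbuf[BLOCK_SIZE], rbuf[BLOCK_SIZE];

    assert(block >= 0 && block < NUM_BLOCKS);
    memset(wbuf, pattern, sizeof wbuf);   /* pre-filled pattern buffer */
    sim_erase(block);                     /* 1. erase */
    sim_write(block, wbuf);               /* 2. write whole block */
    sim_read(block, rbuf);                /* 3. read back */
    return memcmp(wbuf, rbuf, sizeof wbuf) == 0 ? 0 : -1;   /* 4. compare */
}

/* Step 5: repeat for every block; returns how many were marked bad. */
static int scan_all_blocks(void)
{
    static const uint8_t patterns[] = { 0x55, 0xAA };
    int bad = 0;

    for (int b = 0; b < NUM_BLOCKS; b++) {
        for (size_t p = 0; p < sizeof patterns; p++) {
            if (test_block_pattern(b, patterns[p]) != 0) {
                marked_bad[b] = 1;
                bad++;
                break;
            }
        }
        sim_erase(b);   /* leave the block erased afterwards */
    }
    return bad;
}
```

Two complementary patterns are used because a single one can hide a
stuck bit: in this model a bit stuck at 0 still reads back correctly
under 0xAA and only shows up under 0x55.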

>
> With regard to the 'uncorrectable errors', is there still a chance that
> there's a driver problem if the ECC layout allows the filesystem to be
> mounted correctly? (That said - mtdblock0 contains an ELF image of a
> Linux kernel, not a filesystem, and mtdblock1 contains the Yaffs
> filesystem itself. I'm not sure this matters though...) I'll try to

Possible. I found that on my system U-Boot (the bootloader) and Yaffs
put ECC at different locations, which should be fine as long as neither
accesses areas it is not supposed to. In other words, U-Boot accessing
blocks used by Yaffs would cause a lot of ECC errors.

But looking at your output, it doesn't seem to be the case.

> search around and see if I can force those blocks to be reprobed or so.

To do so, you just need to call the MEMWRITEOOB ioctl on an MTD char device.

> Anyway, thanks for the tips! Since the device is non-critical, I'm
> considering just wiping it and dealing with the consequences if they pop
> up. I'll have a look at the Hynix docs before doing so just to make
> sure, as suggested.
>
> Regards,
>
> Noah
>
> - --
> Noah Fontes
> Cynigram
> http://www.cynigram.com/
> Phone: +1 919 349 9826
> >

--
Rong
