Re: [Yaffs] yaffs2 and power fail problems

Author: Charles Manning
Date:  
To: yaffs
CC: ian
Subject: Re: [Yaffs] yaffs2 and power fail problems
On Friday 16 February 2007 05:12, wrote:
> On Wednesday 14 February 2007 18:49, Blair Barnett wrote:
> > We're experiencing what appears to be file system corruption
> > due to power fail. I just got done looking at a nanddump of a
> > yaffs2 file system that looks like the block header (first
> > page in the block) was overwritten with garbage data (can't
> > tell whether it's "good" data from somewhere else yet).
>
> If power is lost during an erase, the memory cells are left in an
> undefined state with a "half bucket" of electrons which can be
> read as a one or a zero depending on the temperature, supply
> voltage, age and the wind ;-) The garbage data would very
> probably produce a data ECC error (mtd ecc), but this is not
> guaranteed.


This scenario seems to make the most sense.

While this sort of thing is pretty easy to see on NOR (which takes a long time
to erase), there is only a very small window for this to happen on NAND since
an erase only takes 2 or 3 msec or so. How often do you see this?

I would suggest that you check the power rails etc. If your CPU can run at
voltages where the NAND is marginal, then you have the potential to be
telling the NAND to do stuff which it can't do properly. That is, during a power
failure you'd ideally be shutting down the CPU while the NAND still has
enough power to be sane.

Also look carefully at the WP pin. Yanking the WP pin during a program/erase
can cause problems.

You should ideally be doing a power-OK check in the NAND driver before starting
an erase or write, to ensure that the system will have enough residual power to
complete the operation.
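
To make the idea concrete, here is a rough sketch of what such a check could
look like. This is not code from yaffs or mtd. Every name in it
(board_read_supply_mv, board_nand_raw_erase, nand_erase_guarded) and the
3000 mV threshold are made up for illustration, and the supply reading is
stubbed with a variable so the logic can be tried on a host. On real hardware
you would read a supply monitor or a power-good GPIO instead.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: lowest supply voltage at which the NAND is still trusted to
 * finish an erase/program. Take the real figure from the part's datasheet. */
#define NAND_MIN_SAFE_MV 3000u

/* Stub standing in for a real supply monitor (ADC, PMIC, power-good pin). */
static uint32_t fake_supply_mv = 3300;

/* Hypothetical board hook: returns the current supply voltage in mV. */
static uint32_t board_read_supply_mv(void)
{
    return fake_supply_mv;
}

/* Counts how often the (stubbed) low-level erase was actually issued. */
static int raw_erase_calls;

/* Hypothetical low-level erase; a real driver would issue the NAND
 * erase command sequence here. */
static int board_nand_raw_erase(unsigned int block)
{
    (void)block;
    raw_erase_calls++;
    return 0;
}

/* True if there should be enough residual power to complete an operation. */
static bool nand_power_ok(void)
{
    return board_read_supply_mv() >= NAND_MIN_SAFE_MV;
}

/* Refuse to start an erase when power is already marginal, so the chip is
 * never left with a half-erased block. Returns 0 on success, -EAGAIN if
 * the erase was not started. */
int nand_erase_guarded(unsigned int block)
{
    if (!nand_power_ok())
        return -EAGAIN;

    return board_nand_raw_erase(block);
}
```

The same guard would go in front of page programs. It only narrows the window:
power can still drop after the check, so the hold-up time of the supply has to
cover the longest erase or program.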

>
> If an error is detected by a failed read (mtd ecc), I would
> expect Yaffs to recover. If not, and Yaffs' test of its own
> spare/tags "mini ECC" looks good, then the data would be assumed
> valid and presented as part of a Yaffs file.
> [Have I got this right Charles?]


There are places where the ecc is ignored.

>
> If you compute the ECC data for the "corrupted" block header
> page, is it correct?
> Is the mini-ecc that runs over the Yaffs tags correct?
>
> > I'm running 2.4.27 linux with the latest yaffs2 tarball. I'm
> > unable to quickly move to 2.6.
>
> Linux 2.4 vs. 2.6 should not matter; there are many others
> in the same boat (real world).


Correct. That's why we consider it important to support 2.4.x rather than give
out abuse!

>
> -imcd