On Friday 16 February 2007 05:12, ian@brightstareng.com wrote:
> On Wednesday 14 February 2007 18:49, Blair Barnett wrote:
> > We're experiencing what appear to be file system corruption
> > due to power fail. I just got done looking at a nanddump of a
> > yaffs2 file system that looks like the block header (first
> > page in the block) was overwritten with garbage data (can't
> > tell whether it's "good" data from somewhere else yet).
>
> If power is lost during an erase, the memory cells are left in an
> undefined state with a "half bucket" of electrons which can be
> read as a one or a zero depending on the temperature, supply
> voltage, age and the wind ;-) The garbage data would very
> probably produce a data ECC error (mtd ecc), but this is not
> guaranteed.

This scenario seems to make the most sense. While this sort of thing is
pretty easy to see on NOR (which takes a long time to erase), there is
only a very small window for this to happen on NAND, since an erase only
takes 2 or 3 msec or so.

How often do you see this?

I would suggest that you check the power rails etc. If your CPU can run
at voltages where the NAND is marginal, then you have the potential to be
telling the NAND to do stuff which it can't do properly. i.e. during a
power failure you'd ideally be shutting down the CPU while the NAND still
has enough power to be sane.

Also look carefully at the WP pin. Yanking the WP pin during a
program/erase can cause problems.

You should ideally be doing a power OK check in the NAND driver before
starting an erase or write, to ensure that the system will have enough
residual power to complete the operation.

>
> If an error is detected by a failed read (mtd ecc), I would
> expect Yaffs to recover. If not, and Yaffs' test of its own
> spare/tags "mini ECC" looks good, then the data would be assumed
> valid and presented as part of a Yaffs file.
> [Have I got this right Charles?]

There are places where the ecc is ignored.
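
As a rough sketch of the power OK check suggested above: the idea is to refuse to start an erase (or a page program) unless the supply has enough headroom to finish it. Everything named here is hypothetical and not part of yaffs or mtd: `read_supply_mv()` stands in for whatever supply monitor the board has (ADC, comparator, PMIC flag), `raw_erase_block()` stands in for the real low-level erase, and the two voltage figures are made-up placeholders that would have to come from the NAND data sheet and the board's hold-up time.

```c
#include <stdbool.h>

/* All names and numbers below are hypothetical, for illustration only. */
#define NAND_MIN_SAFE_MV 2700 /* assumed lowest voltage at which the NAND is in spec */
#define HOLDUP_MARGIN_MV  200 /* assumed headroom to ride out one 2-3 ms erase */

/* Stand-in for a board-specific supply monitor. */
static int fake_rail_mv = 3300;
static int read_supply_mv(void)
{
    return fake_rail_mv;
}

/* Stand-in for the real low-level erase; counts calls so the guard is observable. */
static int erase_calls;
static int raw_erase_block(int block)
{
    (void)block;
    erase_calls++;
    return 0;
}

/* True if the rail is far enough above the NAND minimum to finish an operation. */
static bool power_ok(void)
{
    return read_supply_mv() >= NAND_MIN_SAFE_MV + HOLDUP_MARGIN_MV;
}

/* Returns 0 on success, -1 if the erase was refused because power looks marginal. */
static int guarded_erase_block(int block)
{
    if (!power_ok())
        return -1; /* never start: a half-finished erase leaves cells undefined */
    return raw_erase_block(block);
}
```

The same guard would sit in front of the page program path. A refused operation is reported as a failure to the caller rather than retried, on the assumption that the system is already on its way down.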
>
> If you compute the ECC data for the "corrupted" block header
> page, is it correct?
> Is the mini-ecc that runs over the Yaffs tags correct?
>
> > I'm running 2.4.27 linux with the latest yaffs2 tarball. I'm
> > unable to quickly move to 2.6.
>
> Linux 2.5 vs. 2.6 should not matter; there are many others
> in the same boat (real world).

Correct. That's why we consider it important to support 2.4.x rather than
give out abuse!

>
> -imcd
>
> _______________________________________________
> yaffs mailing list
> yaffs@lists.aleph1.co.uk
> http://lists.aleph1.co.uk/cgi-bin/mailman/listinfo/yaffs