[Yaffs] Re: bit error rates

Charles Manning manningc2 at actrix.gen.nz
Fri Feb 10 21:13:01 GMT 2006


On Friday 10 February 2006 12:55, Peter Barada wrote:
> > I would be interested to know if anybody else has seen this behavior.
>
> Yes, I have.  I use a YAFFS1 NOR-based system, and in the writes, we lay
> down the data chunk, and then the tag.  In the unlikely event that a
> power-cycle occurs while writing the data, the tag is still empty, but
> some of the data chunk is not erased, and then next time a write occurs
> into that chunk, YAFFS sees that the write fails since the previous data
> was written (and retires the whole block), even though the tag indicated
> the chunk is empty.
>
> To fix this, I used two bits in the pageStatus byte in the tag, and
> write the tag first, then the data, and then update the tag.  Assuming
> that the pageStatus starts out as 0xff, then the first tag write puts in
> the value of the tag, but writes a pageStatus byte of 0xfe to indicate
> that a write is in progress, then writes the chunk data, and then comes
> back and re-writes the tag with the same data, and a pageStatus of 0xfc.
> In the rest of the code, the chunk is assumed to be valid if the
> pageStatus is 0xff (and objectId is not 0xfffff) or 0xfc, empty if the
> objectId is 0xfffff, and deleted if the pageStatus is either 0xfe or
> 0x00 (the value written to delete a tag).
>
> This solved the problem for me.  I assume an approach like this would
> work for NAND...
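
For readers following along, here is a minimal sketch of the scheme quoted
above, simulated in RAM. The struct, function names and CHUNK_SIZE are
hypothetical and are not the YAFFS source; only the pageStatus values and the
0xfffff objectId come from the message. It assumes flash programming can only
clear bits (modelled with &=), that an erased objectId always means "empty",
and that any unexpected pageStatus is treated as unusable.

```c
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE       512       /* hypothetical chunk size */
#define PS_ERASED        0xffu     /* erased, or written without the scheme */
#define PS_IN_PROGRESS   0xfeu     /* tag written, data not yet confirmed */
#define PS_COMPLETE      0xfcu     /* tag rewritten after the data */
#define PS_DELETED       0x00u     /* value written to delete a tag */
#define OBJECT_ID_ERASED 0xfffffu  /* value quoted in the message */

enum chunk_state { CHUNK_EMPTY, CHUNK_VALID, CHUNK_DELETED };

struct chunk {
    uint8_t  data[CHUNK_SIZE];
    uint32_t object_id;
    uint8_t  page_status;
};

/* Erasing sets every bit to 1. */
void chunk_erase(struct chunk *c)
{
    memset(c->data, 0xff, sizeof c->data);
    c->object_id = OBJECT_ID_ERASED;
    c->page_status = PS_ERASED;
}

/* Classification rules as described in the quoted text. */
enum chunk_state classify_chunk(uint8_t page_status, uint32_t object_id)
{
    if (page_status == PS_COMPLETE)
        return CHUNK_VALID;
    if (page_status == PS_IN_PROGRESS || page_status == PS_DELETED)
        return CHUNK_DELETED;
    if (page_status == PS_ERASED)
        return object_id == OBJECT_ID_ERASED ? CHUNK_EMPTY : CHUNK_VALID;
    return CHUNK_DELETED;  /* assumption: unknown status is unusable */
}

/*
 * Three-step write: tag (0xfe), data, tag again (0xfc).
 * fail_after simulates power loss after that many data bytes;
 * pass a value >= len for an uninterrupted write.
 * Returns 0 on success, -1 if interrupted or len is too large.
 */
int chunk_write(struct chunk *c, uint32_t object_id,
                const uint8_t *src, size_t len, size_t fail_after)
{
    if (len > CHUNK_SIZE)
        return -1;

    /* Step 1: write the tag, marking the write as in progress. */
    c->object_id &= object_id;
    c->page_status &= PS_IN_PROGRESS;

    /* Step 2: write the data, which on NOR can stop part-way. */
    for (size_t i = 0; i < len; i++) {
        if (i == fail_after)
            return -1;
        c->data[i] &= src[i];
    }

    /* Step 3: rewrite the tag to mark the chunk complete. */
    c->page_status &= PS_COMPLETE;
    return 0;
}
```

With this ordering an interrupted write leaves pageStatus at 0xfe, so the
chunk reads as deleted rather than as empty with dirty data behind it.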

I think this is a bit different for NOR because NOR must be written as a whole 
bunch of interruptible writes.

The whole NAND page is written as a single write operation, so this is less 
likely to be an issue. This means that doing a pre-write and post-write of 
the tag with NAND is probably not going to do anything useful.  Hunch!

I think an interrupted erase is probably more likely to cause problems, but 
again this is just a hunch.

Dealing with an interrupted write is relatively straightforward. It will 
always be the last page written before the system went down. Most of the time 
(except when it was the last page written to a block), we can detect that 
page because it is the last page written in the currently allocated block.
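
A rough sketch of that detection, not the actual YAFFS code: the function
name, the per-page "used" flags and PAGES_PER_BLOCK are all hypothetical. It
assumes pages within a block are written in ascending order, so the highest
used page in a partly filled block is the one that may have been interrupted.

```c
#include <stdbool.h>

#define PAGES_PER_BLOCK 32  /* hypothetical block geometry */

/*
 * Given one flag per page saying whether the page has been written,
 * return the index of the page that may have been interrupted,
 * or -1 if this block cannot tell us.
 */
int suspect_page(const bool used[PAGES_PER_BLOCK])
{
    int last = -1;

    for (int i = 0; i < PAGES_PER_BLOCK; i++) {
        if (used[i])
            last = i;
    }

    /* Nothing written yet: no page in this block to suspect. */
    if (last < 0)
        return -1;

    /*
     * The block is full, so it is no longer the block being allocated
     * from. This is the exception mentioned above.
     */
    if (last == PAGES_PER_BLOCK - 1)
        return -1;

    return last;
}
```

The full-block case is the awkward one: the interrupted page could be the
final page of a block that now looks complete, so some other check would be
needed there.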

Failures due to interrupted erases are a bit more difficult to track.

Sergei, I know you've done a lot of tests that highlighted the problem. Were 
you able to break down the failure stats as to what operation was interrupted 
and caused the problem?

It would be nice to improve this, but as Jon says, I think data integrity 
should always come first!

-- Charles



