Hello

You really have two independent, but related issues going on here...

On Thu, Feb 22, 2018 at 4:45 AM, de Brebisson, Cyrille (Calculator Division) <cyrille@hp.com> wrote:

Hello,

 

I am responsible for implementing yaffs in a small, bare metal OS system with a 512MB NAND Flash (2K pages, 64 pages per block).

 

Our Flash rom subsystem is sometimes failing reads. This is happening on blocks that are supposed to be good and contain file data! Furthermore, later reads of the same page might succeed (after a system reboot).

 

Of course, this is wreaking havoc on our system’s reliability.

 

So, a couple of questions:

  • have you ever seen that in your experience?

Bad flash drivers are quite common. One area where people often make mistakes is in waiting until the data is properly ready before reading.
Also, make sure the flash memory power supplies are good and low ripple.  Remember that cells (and NAND flash cells in particular) are really analogue elements and too much ripple on the power rails can cause problems.
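
To illustrate the "wait until the data is ready" point, here is a minimal sketch of a large-page NAND read that polls the ready/busy signal before clocking data out. Everything in it is an assumption for illustration: the nand_hw_t hooks, the five address cycles, the poll limit and the simulated chip are hypothetical stand-ins and are not taken from your driver or from Yaffs. Check the command sequence and timings against your part's datasheet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware hooks; a real driver would map these onto its
 * own NAND controller registers or GPIOs. */
typedef struct {
    void    (*write_cmd)(uint8_t cmd);
    void    (*write_addr)(uint8_t addr);
    void    (*short_delay)(void);  /* covers the gap before busy is asserted */
    bool    (*is_ready)(void);     /* samples the ready/busy signal */
    uint8_t (*read_data)(void);
} nand_hw_t;

#define NAND_CMD_READ0      0x00u
#define NAND_CMD_READSTART  0x30u
#define NAND_READY_POLL_MAX 100000u  /* assumed limit, tune for the part */

/* Returns 0 on success, -1 if the chip never signalled ready. */
int nand_read_page(const nand_hw_t *hw, uint32_t page, uint8_t *buf, size_t len)
{
    assert(hw != NULL && buf != NULL);

    hw->write_cmd(NAND_CMD_READ0);
    hw->write_addr(0);                             /* column, low byte  */
    hw->write_addr(0);                             /* column, high byte */
    hw->write_addr((uint8_t)(page & 0xFFu));       /* row address bytes */
    hw->write_addr((uint8_t)((page >> 8) & 0xFFu));
    hw->write_addr((uint8_t)((page >> 16) & 0xFFu));
    hw->write_cmd(NAND_CMD_READSTART);

    /* The chip may take a moment to assert busy, so polling straight away
     * can see a stale "ready". Wait briefly first, then poll with a limit. */
    hw->short_delay();
    for (uint32_t polls = 0; !hw->is_ready(); polls++) {
        if (polls >= NAND_READY_POLL_MAX)
            return -1;
    }

    for (size_t i = 0; i < len; i++)
        buf[i] = hw->read_data();
    return 0;
}

/* Simulated chip so the sketch can be exercised on a host machine. */
static struct {
    int     busy_polls;  /* polls for which the chip stays busy */
    bool    stuck;       /* if set, the chip never becomes ready */
    uint8_t next;        /* next data byte to return */
} sim;

static void sim_cmd(uint8_t c)
{
    if (c == NAND_CMD_READSTART) {
        sim.busy_polls = 3;
        sim.next = 0;
    }
}
static void sim_addr(uint8_t a) { (void)a; }
static void sim_delay(void) { }
static bool sim_ready(void)
{
    if (sim.stuck)
        return false;
    if (sim.busy_polls > 0) {
        sim.busy_polls--;
        return false;
    }
    return true;
}
/* Data read while the chip is still busy comes back as 0xFF garbage. */
static uint8_t sim_data(void)
{
    return sim.busy_polls > 0 ? (uint8_t)0xFF : sim.next++;
}

static const nand_hw_t sim_hw = {
    sim_cmd, sim_addr, sim_delay, sim_ready, sim_data
};
```

Calling nand_read_page(&sim_hw, page, buf, len) against the simulated chip returns 0 once the busy period has passed. Reporting the timeout as a failure, rather than reading anyway, is what lets the layer above see the error.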
 
  • When this happens, I make sure that the flash driver to yaffs interface returns YAFFS_FAIL. However, yaffs still seems to return data to the user that called the file read, incorrectly returning known bad bytes at this point. Is this normal?
Yes, there is an issue here. Yaffs can either report an IO error (EIO) or try to give back as much data as it can. Yaffs currently takes the latter approach.
I have some changes underway to allow Yaffs to return EIO instead.
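
To make the two policies concrete, here is a small sketch of a chunk-by-chunk read loop that either keeps going after a failed chunk or stops with EIO. This is not Yaffs code: read_chunks, chunk_reader_t, the policy enum and the tiny CHUNK_SIZE are hypothetical names and values chosen only to show the difference in behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 4  /* tiny for illustration; real chunks here would be 2K */

/* Hypothetical low-level reader: returns 0 on success, -1 on failure. */
typedef int (*chunk_reader_t)(int chunk, unsigned char *out);

enum read_policy {
    READ_BEST_EFFORT,  /* hand back as much data as possible */
    READ_STRICT_EIO    /* fail the whole read on the first bad chunk */
};

/* Reads `count` chunks starting at `first` into `buf`.
 * Returns the number of bytes delivered, or -EIO under the strict policy. */
long read_chunks(chunk_reader_t rd, int first, int count,
                 unsigned char *buf, enum read_policy policy)
{
    assert(rd != NULL && buf != NULL);

    long total = 0;
    for (int i = 0; i < count; i++) {
        unsigned char *dst = buf + (size_t)i * CHUNK_SIZE;

        if (rd(first + i, dst) != 0 && policy == READ_STRICT_EIO)
            return -EIO;

        /* Best effort: the chunk is counted even though dst may hold
         * whatever the failed read left behind. */
        total += CHUNK_SIZE;
    }
    return total;
}

/* Demo reader in which chunk 1 always fails and leaves 0xEE in the buffer. */
static int demo_reader(int chunk, unsigned char *out)
{
    memset(out, chunk == 1 ? 0xEE : 0xAA, CHUNK_SIZE);
    return chunk == 1 ? -1 : 0;
}
```

With demo_reader, the best-effort policy reports a full-length read that silently contains the 0xEE bytes from the failed chunk, while the strict policy returns -EIO so the caller knows not to trust the buffer.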

  • After such a bad read, yaffs does not seem to do anything to the block, such as marking it as bad. Is this intended?
It should be marked for future cleanup.

 

I am working with a checkout of yaffs which is around 6 months old (taken in June 2017) in a bare metal setup.


I doubt there have been significant changes in the last 6 months that affect this issue.

The major thing to do is find out why the flash reads are failing so badly.

Regards

Charles