Raj,

On Thursday 02 August 2007 14:20, Raj Kumar Yadav wrote:
> > > Do you think, it is necessary to enable the option "page
> > > verify after write" in nand driver under MTD layer. Will
> > > it solve the problem ?
> >
> > Turning on verify while you figure out what's going on is not a
> > bad idea.  Verification has a cost, however: the data is read
> > back and compared after the write, so there's additional I/O
> > and CPU involved.  The NAND chip itself is supposed to check
> > that the write/erase is successful as the operation completes
> > -- it indicates this in a status byte that's read by MTD -- so
> > it should not be necessary to read back the data to verify the
> > write.  Now there could be other issues outside the NAND chip
> > that cause data errors, in which case having MTD read back and
> > verify data just written may be useful.
>
> I have a doubt: if a block has gone bad, writing 0xA5 may end
> up as, for example, 0xF5 or similar at some byte, somewhere in
> the page.  In this case the flash controller will not return an
> error -- or will it?

Not sure what you mean by 'flash controller' -- do you have a
platform equipped with hardware-based NAND ECC (under MTD)?

The software ECC that MTD uses can detect and correct a single-bit
error in each 256-byte segment of the page, and detect 2-bit errors;
anything more may slip through.

MTD does not check the Yaffs tag data that is placed in the NAND
'spare' bytes.  Tag data errors are only detected/corrected by Yaffs
(as seen in the trace output in your original posting).

> I agree on the cost part; however, if write verify makes the
> solution more robust, at least I am not averse to the idea.

If one has problems with data integrity on the bus between the CPU
and the NAND chip, ECC and write verification are not really going
to help -- I'm not saying you actually have any such problem.

If you think you may have erased a bad-block marker (during
development), that is very probably the explanation for all this.
Mark the block bad again by zeroing the marker byte -- MTD has a
function to do this -- not sure if there's a tool/app for it.

It looks to me like Yaffs has a problem: clearly it should not have
abandoned the scan just because of a chunk/page read error.

As an aside: if a block fails with a bit stuck on (1), then I wonder
whether the NAND chip's test for a successful erase will pass.  I
don't see why it wouldn't -- I assume it is just testing that all the
cells read as one (1).  In that case the failure of the cell would
only be seen when writing a page that has a 0 at the location of the
broken cell, and MTD should be informed of such a failure by the NAND
chip's write status.  If the data bit is 1, then the broken bit goes
unnoticed and the failure is benign.

Retirement: perhaps on a hard read failure it would be good to
attempt to write all-zeros to the page.  If this fails, that is more
evidence that there is a real problem with the page, and it can be
marked for retirement.  If a cell is stuck at 0, that should manifest
itself later as an erase failure, and the block can be marked for
retirement then.  But if a write fails, yet it *is* possible to zero
all the bits, then ignore the write failure and move on.  While doing
all this testing, don't actually zero any block status bytes.

-imcd
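
For what it's worth, a crude write-and-read-back check can also be
done from user space against the mtd character device while you are
debugging, without touching the driver's verify option.  The sketch
below is only illustrative: it assumes /dev/mtd0, a 2048-byte page, a
page-aligned offset that I made up, and that the block was erased
beforehand; it does not look at the OOB/spare area at all.

/* Rough sketch: write one NAND page through the mtd char device and
 * read it back to compare.  Assumes /dev/mtd0, 2048-byte pages and a
 * page-aligned offset in an already-erased block. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE 2048

int main(void)
{
	unsigned char wbuf[PAGE_SIZE], rbuf[PAGE_SIZE];
	off_t offset = 0x20000;		/* hypothetical page offset */
	int fd = open("/dev/mtd0", O_RDWR);

	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(wbuf, 0xA5, sizeof(wbuf));	/* test pattern from the thread */

	if (pwrite(fd, wbuf, sizeof(wbuf), offset) != (ssize_t)sizeof(wbuf)) {
		perror("pwrite");
		return 1;
	}
	if (pread(fd, rbuf, sizeof(rbuf), offset) != (ssize_t)sizeof(rbuf)) {
		perror("pread");
		return 1;
	}

	if (memcmp(wbuf, rbuf, sizeof(wbuf)))
		printf("verify FAILED: read-back differs from written data\n");
	else
		printf("verify OK\n");

	close(fd);
	return 0;
}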
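
On marking the block bad again: in the kernel it is the mtd_info
block_markbad method; from user space the mtd character device
accepts a MEMSETBADBLOCK ioctl (see mtd-abi.h), so a small throw-away
program will do if there's no tool handy.  Rough sketch, assuming
/dev/mtd0 and that 0x20000 (my made-up number) is the start of the
offending block:

/* Rough sketch: mark the NAND block containing 'offset' bad again
 * via the MEMSETBADBLOCK ioctl.  Assumes /dev/mtd0; the offset must
 * point into the block you erased the marker from. */
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <mtd/mtd-user.h>

int main(void)
{
	loff_t offset = 0x20000;	/* hypothetical block start */
	int fd = open("/dev/mtd0", O_RDWR);
	int bad;

	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Check what MTD currently thinks of the block first. */
	bad = ioctl(fd, MEMGETBADBLOCK, &offset);
	if (bad < 0) {
		perror("MEMGETBADBLOCK");
		return 1;
	}
	printf("block at 0x%llx currently marked %s\n",
	       (unsigned long long)offset, bad ? "bad" : "good");

	if (!bad && ioctl(fd, MEMSETBADBLOCK, &offset) < 0)
		perror("MEMSETBADBLOCK");

	close(fd);
	return 0;
}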
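
And the retirement idea, roughly in code.  This is only a sketch
against the mtd_info method pointers as they stand today; the
function name and the policy are mine, nothing like it exists in MTD
or Yaffs:

/* Sketch of the retirement heuristic: on a hard read failure, try to
 * write an all-zeros page.  If even that fails, recommend retiring
 * the block; if the zeros do stick, ignore the failure and move on.
 * A cell stuck at 0 will show up later as an erase failure and can
 * be retired then.  Hypothetical helper, not part of MTD or Yaffs. */
#include <linux/mtd/mtd.h>
#include <linux/slab.h>

/* Returns 1 if the block containing 'page_addr' looks worth retiring. */
static int nand_page_looks_retirable(struct mtd_info *mtd, loff_t page_addr)
{
	size_t retlen = 0;
	u_char *zeros;
	int ret, retire = 0;

	zeros = kzalloc(mtd->writesize, GFP_KERNEL);
	if (!zeros)
		return 0;			/* can't test, don't retire */

	/* Try to force every data cell in the page to 0. */
	ret = mtd->write(mtd, page_addr, mtd->writesize, &retlen, zeros);
	if (ret < 0 || retlen != mtd->writesize)
		retire = 1;			/* can't even zero it: real trouble */

	/* NB: this is for testing the page only -- it deliberately does
	 * not touch any block status bytes. */

	kfree(zeros);
	return retire;
}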