Hi Charles, hi all,
I use Micron NAND with the yaffs2 module, which runs on top of the
Linux MTD driver.
Several months after installation, the NAND shows file corruption and
many bad blocks.
According to our debug messages, we think there is a bug in how yaffs2
handles write errors reported by the lower layer. Here are the details:
1. yaffs version: patched up to commit 8a3135f
2. The code that may contain the bug:
file: yaffs_guts.c
488 static int yaffs_write_new_chunk(struct yaffs_dev *dev,
489 +-- 2 lines: const u8 *data, ------
491 {
492 int attempts = 0;
493 int write_ok = 0;
494 int chunk;
495
496 yaffs2_checkpt_invalidate(dev);
497
498 do {
499 struct yaffs_block_info *bi = 0;
500 int erased_ok = 0;
501
......
560 if (write_ok != YAFFS_OK) {
561 /* Clean up aborted write, skip to next block and
562 * try another chunk */
563 yaffs_handle_chunk_wr_error(dev, chunk, erased_ok);
564 continue;
565 }
and:
232 static void yaffs_handle_chunk_wr_error(struct yaffs_dev *dev, int nand_chunk,
233 int erased_ok)
234 {
235 int flash_block = nand_chunk / dev->param.chunks_per_block;
236 struct yaffs_block_info *bi = yaffs_get_block_info(dev, flash_block);
237
238 yaffs_handle_chunk_error(dev, bi);
239
240 if (erased_ok) {
241 /* Was an actual write failure,
242 * so mark the block for retirement.*/
243 bi->needs_retiring = 1;
244 yaffs_trace(YAFFS_TRACE_ERROR | YAFFS_TRACE_BAD_BLOCKS,
245 "**>> Block %d needs retiring", flash_block);
246 }
3. Why we think it is a bug:
In the code above, yaffs2 first verifies that the block being written
has been erased (it checks only the first chunk and skips the check for
the other chunks of the same block).
Then it tries to write the chunk to NAND. The logic should be that a
write error on any chunk leads to the block being retired.
Currently, because of line 500 (int erased_ok = 0;), erased_ok stays 0
whenever the erase check is skipped, so a write failure on any of the
remaining chunks (i.e. chunks 1..63) does not retire the block.
So I think the default value of erased_ok should be 1, and a chunk
write failure caused by power loss should instead be detected when
scanning the whole NAND or when mounting yaffs2.
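To make the reasoning above concrete, here is a minimal sketch of the
decision path. It is a simplified model and not the real yaffs2 code:
struct model_block, model_handle_wr_error(), model_failed_write(),
model_check() and the MODEL_* constants are hypothetical names made up
for illustration, and the model assumes that the erase check always
passes when it runs and that the low-level write always fails.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins; these are NOT the real yaffs2 types. */
#define MODEL_OK   1
#define MODEL_FAIL 0

struct model_block {
	int skip_erased_check; /* 1 once an earlier chunk passed the erase check */
	int needs_retiring;    /* set when the block should be retired */
};

/* Mirrors only the "if (erased_ok)" branch of yaffs_handle_chunk_wr_error(). */
static void model_handle_wr_error(struct model_block *bi, int erased_ok)
{
	if (erased_ok)
		bi->needs_retiring = 1;
}

/*
 * Models one failed write attempt to a chunk of block bi.
 * erased_ok_default plays the role of the initialiser on line 500.
 * Assumption: the erase check, when it runs, passes.
 * Returns 1 if the block ends up marked for retirement, else 0.
 */
static int model_failed_write(struct model_block *bi, int erased_ok_default)
{
	int erased_ok = erased_ok_default;
	int write_ok = MODEL_FAIL; /* the low-level write reports an error */

	if (!bi->skip_erased_check)
		erased_ok = MODEL_OK;

	if (write_ok != MODEL_OK)
		model_handle_wr_error(bi, erased_ok);

	return bi->needs_retiring;
}

/* Compares the current default (0) with the proposed default (1). */
void model_check(void)
{
	struct model_block first = { 0, 0 };       /* chunk 0: check runs */
	struct model_block later = { 1, 0 };       /* chunks 1..63: check skipped */
	struct model_block later_fixed = { 1, 0 }; /* same, with default of 1 */

	assert(model_failed_write(&first, 0) == 1);       /* retired */
	assert(model_failed_write(&later, 0) == 0);       /* NOT retired */
	assert(model_failed_write(&later_fixed, 1) == 1); /* retired */
}
```

In this model, a failed write on the first chunk of a block marks the
block for retirement, a failed write on a later chunk does not while
the default is 0, and it does once the default is 1. Whether the real
driver behaves the same way depends on the code elided at "......"
above.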
If I have made any mistake, please point it out. Thanks,
Best Regards,
SAM