[Yaffs] yaffs failure on writing to a bad block

Author: Raj Kumar Yadav
Date:
To: yaffs
Subject: [Yaffs] yaffs failure on writing to a bad block

Hi Ian & all,

I am using an ARM9-based board with 512MB of large-page NAND flash (2K
page, 256K block). The Linux kernel version is 2.6.14, with the yaffs
source dated July 12, 2007 from
http://www.aleph1.co.uk/cgi-bin/viewcvs.cgi/yaffs2/.
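
For reference, the geometry stated above works out as follows. This is only a small arithmetic sketch; it assumes the sizes are binary units (MiB/KiB), which the description does not state explicitly.

```python
# Geometry from the description above; binary units are an assumption.
FLASH_SIZE = 512 * 1024 * 1024   # 512MB device
BLOCK_SIZE = 256 * 1024          # 256K erase block
PAGE_SIZE = 2 * 1024             # 2K page

pages_per_block = BLOCK_SIZE // PAGE_SIZE
total_blocks = FLASH_SIZE // BLOCK_SIZE

print(pages_per_block)  # 128
print(total_blocks)     # 2048
```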

I am using the NAND controller built into the SoC, which broadly provides
an interface on top of the NAND SMI interface. Instead of reading and
writing on the SMI lines directly, we read from and write to the
controller's FIFO buffer (512 bytes long). All commands, data transfers
and status reads go through the controller registers.

I am using NAND software ECC for error correction.

yaffs configuration is as follows:

CONFIG_YAFFS_FS=y
CONFIG_YAFFS_YAFFS1=y
# CONFIG_YAFFS_DOES_ECC is not set
CONFIG_YAFFS_YAFFS2=y
CONFIG_YAFFS_AUTO_YAFFS2=y
# CONFIG_YAFFS_DISABLE_LAZY_LOAD is not set
CONFIG_YAFFS_CHECKPOINT_RESERVED_BLOCKS=10
# CONFIG_YAFFS_DISABLE_WIDE_TNODES is not set
# CONFIG_YAFFS_ALWAYS_CHECK_CHUNK_ERASED is not set
CONFIG_YAFFS_SHORT_NAMES_IN_RAM=y
# CONFIG_JFFS_FS is not set

MTD configuration:
#
# NAND Flash Device Drivers
#
CONFIG_MTD_NAND=y
CONFIG_MTD_NAND_VERIFY_WRITE=y
CONFIG_MTD_NAND_MM6PLUS=y
CONFIG_MTD_NAND_IDS=y
# CONFIG_MTD_NAND_DISKONCHIP is not set
# CONFIG_MTD_NAND_NANDSIM is not set

The NAND flash has 5 partitions, the last of which is ~450MB. The kernel
boots from the 2nd partition, which holds a read-only yaffs2 image as the
rootfs. After boot, we mount the ~450MB partition as yaffs2 and do
read/write operations on it as needed.

To ensure that each page write completes successfully, I am using MTD
write verify (CONFIG_MTD_NAND_VERIFY_WRITE).
Until now, all read/write/mount/unmount operations were working fine.

1) Now, on unmounting the ~450MB partition, the NAND verify failed on one
particular block, but YAFFS did not rewrite that block's data to some
other block. The next mount then took longer (15 seconds) than
expected (~1 second).

It looks like checkpointing failed during the last unmount, possibly due
to the NAND write verify failure. Is this the correct
behavior?
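
To make the expected behavior concrete, here is a minimal Python sketch of "verify after write, and move the data to another block on failure". It is a toy model, not YAFFS or MTD code: FakeNand, write_with_verify and write_chunk are hypothetical names, and a "stuck" block is simply one that ignores programming.

```python
class FakeNand:
    """Toy in-memory NAND: pages in 'stuck' blocks ignore programming."""

    def __init__(self, page_size, stuck_blocks=()):
        self.page_size = page_size
        self.stuck = set(stuck_blocks)
        self.data = {}  # (block, page) -> bytes

    def write_page(self, block, page, buf):
        # A stuck block accepts the command but its cells do not change.
        if block not in self.stuck:
            self.data[(block, page)] = bytes(buf)

    def read_page(self, block, page):
        # Unwritten pages read back as erased (all 0xFF).
        return self.data.get((block, page), b"\xff" * self.page_size)


def write_with_verify(nand, block, page, buf):
    """Program a page, read it back, and report whether it matches."""
    nand.write_page(block, page, buf)
    return nand.read_page(block, page) == bytes(buf)


def write_chunk(nand, candidate_blocks, page, buf, retired):
    """Try each candidate block in turn; retire blocks that fail verify."""
    for block in candidate_blocks:
        if block in retired:
            continue
        if write_with_verify(nand, block, page, buf):
            return block
        retired.add(block)  # remember the failure so the block is not reused
    raise IOError("no usable block left for this chunk")


nand = FakeNand(page_size=2048, stuck_blocks={1})
retired = set()
payload = b"\xa5" * 2048
used = write_chunk(nand, [1, 2, 3], 0, payload, retired)
print(used, sorted(retired))  # 2 [1]
```

In this model the write to block 1 fails verification, the data lands in block 2, and block 1 is remembered as retired. Whether YAFFS is supposed to do the equivalent for checkpoint data at unmount is exactly the question above.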

2) It turns out that the block on which the NAND verify failed is actually
bad: I am unable to erase or write that block using nand-utils or
the custom bootloader commands.

The NAND controller reports success status after erase/write.

But after the erase, all bytes on the 1st, 3rd, 5th, ... pages are
0xFF, and all bytes on the 2nd, 4th, 6th, ... pages are 0x00.

Similarly, writing a data pattern to any of the pages in that block
has no effect on the page data.
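
Since the controller status cannot be trusted here, the erase result has to be checked by reading the block back. The following Python sketch shows such a check on a raw block dump. It is an illustration only: unerased_pages is a hypothetical helper, the dump is synthetic (built to match the pattern described above), and the 128 pages per block follow from the 2K page and 256K block sizes.

```python
def unerased_pages(block_dump, page_size=2048):
    """Return 0-based indices of pages holding any byte other than 0xFF."""
    erased = b"\xff" * page_size
    bad = []
    for offset in range(0, len(block_dump), page_size):
        if block_dump[offset:offset + page_size] != erased:
            bad.append(offset // page_size)
    return bad


# Synthetic dump: 1st, 3rd, ... pages are 0xFF; 2nd, 4th, ... pages are 0x00.
PAGES = 128
dump = b"".join(
    (b"\xff" if i % 2 == 0 else b"\x00") * 2048 for i in range(PAGES)
)

failed = unerased_pages(dump)
print(len(failed), failed[:4])  # 64 [1, 3, 5, 7]
```

With this pattern, half of the pages in the block (every second one) fail the erased check even though the erase command reported success.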

This also means that I cannot mark the block bad, as the first page
is all 0xFF and a NAND write to that page has no effect on the page data.

So we end up in a situation where the block will never be
marked bad, and writes to its pages will always fail (due to MTD NAND
verify).
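
The deadlock can be shown with a small Python model. This is a sketch under assumptions, not driver code: the class and function names are hypothetical, and the bad-block marker is assumed to be a single non-0xFF byte in the spare area of the block's first page (offset 0 and a 64-byte spare area are assumed for illustration).

```python
OOB_SIZE = 64        # assumed spare-area size for a 2K page
MARKER_OFFSET = 0    # assumed marker position within the spare area


class StuckBlock:
    """First page of a block whose cells ignore program operations."""

    def __init__(self):
        self.oob = bytearray(b"\xff" * OOB_SIZE)

    def program_oob(self, offset, value):
        pass  # the write is accepted, but the cells do not change

    def read_oob(self):
        return bytes(self.oob)


class HealthyBlock(StuckBlock):
    def program_oob(self, offset, value):
        self.oob[offset] &= value  # NAND programming can only clear bits


def mark_bad(block):
    """Write a zero marker byte into the spare area."""
    block.program_oob(MARKER_OFFSET, 0x00)


def is_marked_bad(block):
    """A block counts as bad when its marker byte is not 0xFF."""
    return block.read_oob()[MARKER_OFFSET] != 0xFF


for blk in (HealthyBlock(), StuckBlock()):
    mark_bad(blk)
    print(type(blk).__name__, is_marked_bad(blk))
# HealthyBlock True
# StuckBlock False
```

On the stuck block the marker write is silently lost, so every later scan still treats the block as good and it gets selected for writing again.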

Has anybody faced a similar problem? Please suggest a solution.

Will enabling the yaffs option
"CONFIG_YAFFS_ALWAYS_CHECK_CHUNK_ERASED=y" solve the issue?

Will it make the erase slower?

All suggestions/comments are welcome.

Thanks,
Raj Kumar Yadav
