[Yaffs] Crash in yaffs_CheckpointClose after corruption.

Author: Chris Paulson-Ellis
Date:  
To: yaffs
Subject: [Yaffs] Crash in yaffs_CheckpointClose after corruption.
Hi,

I have a yaffs2 filesystem on a 1Gbyte NAND flash that has somehow become corrupt. The filesystem mounts, but yaffs crashes on unmount if any changes are made to the directory with the corrupt entries.

My pattern of access is continuous writing (at about 200kBytes/s) with a background task removing the oldest files when the filesystem reports less than 1Mbyte free. The files so created and deleted are about 135MBytes, so there are only a few of them stored at once and space is freed in large blocks.
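The cleanup policy described above can be sketched roughly as follows. This is an illustration only, not the actual background task: the function names, the `(name, mtime, size)` tuple layout, and the assumption that a deleted file's space becomes free immediately are all hypothetical. On a log-structured filesystem such as yaffs2, space from deleted files is only reusable after the affected blocks have been garbage collected and erased, so the real free count may lag behind what this sketch projects.

```python
def pick_files_to_delete(files, free_bytes, min_free_bytes):
    """Return names of the oldest files to delete until the projected
    free space reaches min_free_bytes.

    files: iterable of (name, mtime, size_bytes) tuples (hypothetical layout).
    Assumes, optimistically, that deleted space is freed immediately.
    """
    doomed = []
    # Oldest first: sort by modification time.
    for name, _mtime, size in sorted(files, key=lambda f: f[1]):
        if free_bytes >= min_free_bytes:
            break
        doomed.append(name)
        free_bytes += size
    return doomed


def headroom_seconds(min_free_bytes, write_rate_bytes_per_s):
    """Time the writer needs to consume the free-space threshold."""
    return min_free_bytes / write_rate_bytes_per_s
```

With the figures quoted above (a 1 Mbyte threshold and about 200 kBytes/s of writes), `headroom_seconds` gives roughly 5 seconds, so the cleanup task and any garbage collection would need to release space within that window to keep the filesystem from filling.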

The corrupt filesystem had become full, perhaps because the cleanup mechanism described above does not take into account whether yaffs can garbage collect blocks fast enough, or the space yaffs needs to write a checkpoint.

My current yaffs code was taken from CVS on 13-Nov-2007. The version running when the filesystem apparently first became corrupt was taken from CVS on 30-Sep-2007, so it was almost exactly the same.

It crashes on unmount like this:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c3dc0000
[00000000] *pgd=23db6031, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] PREEMPT
Modules linked in: pss_nand qhal
CPU: 0    Not tainted  (2.6.22.2 #1)
PC is at yaffs_CheckpointClose+0x9c/0x128
LR is at yaffs_CheckpointSave+0x478/0x4d8
pc : [<c00eb0a8>]    lr : [<c00e9590>]    psr: 80000013
sp : c0017e00  ip : c0017e20  fp : c0017e1c
r10: befaeedc  r9 : c3cef000  r8 : 00000000
r7 : 00000000  r6 : 00000000  r5 : c3dc9e00  r4 : c3cef000
r3 : 00000000  r2 : c0016000  r1 : c0016000  r0 : c01fc018
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  Segment user
Control: c000317f  Table: 23dc0000  DAC: 00000015
Process umount (pid: 155, stack limit = 0xc0016260)
Stack: (0xc0017e00 to 0xc0018000)
7e00: c3cef000 00000000 c3dc9e00 c02e0f80 c0017e8c c0017e20 c00e9590 c00eb01c
7e20: c0086610 c3cef000 c3dc9e00 00000000 00000000 00000000 c0016000 befaeedc
7e40: c0017e64 c0017e50 c002df64 c002db88 c0016000 c0017e6c c0017e8c c0017e78
7e60: c00df754 c3cef000 c3dc9e00 c02e0f80 00000000 00000000 c0016000 befaeedc
7e80: c0017ea4 c0017e90 c00dfd88 c00e9128 c3dc9e00 c022b3a8 c0017ebc c0017ea8
7ea0: c0073cec c00dfd58 c02e6060 c022b448 c0017ed4 c0017ec0 c0073d9c c0073c7c
7ec0: c004638c c3dc9e00 c0017eec c0017ed8 c0073e9c c0073d94 c3dc9e00 c02e0f20
7ee0: c0017f0c c0017ef0 c00892b0 c0073e48 c0017f28 c02e0f20 c3dc9e00 00000000
7f00: c0017f24 c0017f10 c00794c8 c0089248 00000000 c023cf78 c0017fa4 c0017f28
7f20: c008ac7c c00794b4 c3db5878 c02e0f20 c0017fa4 c0017f40 c00757ec 00000001
7f40: 00000001 00000000 01f0000b 000041b6 00000001 00000000 00000000 00000000
7f60: 00000800 00000000 34aadccf 00000000 34aadccf 00000000 c0017f78 c0017f78
7f80: 00001000 00053008 000532c0 00053298 00000034 c001f044 00000000 c0017fa8
7fa0: c001eec0 c008aa98 00053008 000532c0 000532c0 00000000 00000008 00000000
7fc0: 00053008 000532c0 00053298 00000034 00000000 00000000 befaeedc 00000000
7fe0: befacb7c befacb50 400f9b08 400f9b2c 60000010 000532c0 ebffe5b7 e59f216c
Backtrace:
[<c00eb00c>] (yaffs_CheckpointClose+0x0/0x128) from [<c00e9590>] (yaffs_CheckpointSave+0x478/0x4d8)
 r6:c02e0f80 r5:c3dc9e00 r4:00000000
[<c00e9118>] (yaffs_CheckpointSave+0x0/0x4d8) from [<c00dfd88>] (yaffs_put_super+0x40/0xb8)
[<c00dfd48>] (yaffs_put_super+0x0/0xb8) from [<c0073cec>] (generic_shutdown_super+0x80/0x118)
 r5:c022b3a8 r4:c3dc9e00
[<c0073c6c>] (generic_shutdown_super+0x0/0x118) from [<c0073d9c>] (kill_block_super+0x18/0x2c)
 r5:c022b448 r4:c02e6060
[<c0073d84>] (kill_block_super+0x0/0x2c) from [<c0073e9c>] (deactivate_super+0x64/0x7c)
 r4:c3dc9e00
[<c0073e38>] (deactivate_super+0x0/0x7c) from [<c00892b0>] (mntput_no_expire+0x78/0xc0)
 r5:c02e0f20 r4:c3dc9e00
[<c0089238>] (mntput_no_expire+0x0/0xc0) from [<c00794c8>] (path_release_on_umount+0x24/0x28)
 r7:00000000 r6:c3dc9e00 r5:c02e0f20 r4:c0017f28
[<c00794a4>] (path_release_on_umount+0x0/0x28) from [<c008ac7c>] (sys_umount+0x1f4/0x208)
 r4:c023cf78
[<c008aa88>] (sys_umount+0x0/0x208) from [<c001eec0>] (ret_fast_syscall+0x0/0x2c)
 r8:c001f044 r7:00000034 r6:00053298 r5:000532c0 r4:00053008
Code: e59f0088 e1560003 aa000004 e59430f4 (e7935106)
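As a reading aid for the oops above, the instruction word in parentheses on the Code: line (the one at the faulting PC) can be decoded by hand. The sketch below is a minimal decoder for ARM (A32) word/byte LDR/STR instructions with plain offset addressing only; the function name is made up for illustration, and the condition field is ignored. It is not a general disassembler.

```python
SHIFTS = ("lsl", "lsr", "asr", "ror")


def decode_ldr_str(word):
    """Decode an A32 LDR/STR (word or byte) using offset addressing.

    Only the pre-indexed, no-writeback forms are handled; anything else
    raises ValueError. The condition field (bits 31-28) is ignored.
    """
    if (word >> 26) & 0b11 != 0b01:
        raise ValueError("not a single data transfer instruction")
    reg_offset = (word >> 25) & 1
    pre_index = (word >> 24) & 1
    up = (word >> 23) & 1
    byte = (word >> 22) & 1
    writeback = (word >> 21) & 1
    load = (word >> 20) & 1
    if not pre_index or writeback:
        raise ValueError("only plain offset addressing is handled")
    rn = (word >> 16) & 0xF  # base register
    rd = (word >> 12) & 0xF  # destination/source register
    mnemonic = ("ldr" if load else "str") + ("b" if byte else "")
    sign = "" if up else "-"
    if reg_offset:
        if (word >> 4) & 1:
            raise ValueError("not a register-offset load/store")
        rm = word & 0xF
        shift_type = (word >> 5) & 0b11
        shift_imm = (word >> 7) & 0x1F
        offset = f"{sign}r{rm}"
        if shift_imm:
            offset += f", {SHIFTS[shift_type]} #{shift_imm}"
        elif shift_type != 0:
            raise ValueError("zero-amount shift special cases not handled")
    else:
        offset = f"#{sign}{word & 0xFFF}"
    return f"{mnemonic} r{rd}, [r{rn}, {offset}]"


print(decode_ldr_str(0xE59430F4))  # word just before the faulting one
print(decode_ldr_str(0xE7935106))  # faulting instruction
```

Under this decoding, the faulting word is a load from `r3 + (r6 << 2)`. The register dump shows r3 = 00000000 and r6 = 00000000, which gives address 0 and matches the reported fault address. The preceding word loads r3 from offset 244 (0xf4) of r4, so one reading is that a pointer field fetched from the structure at r4 was NULL and was then indexed as an array of 4-byte elements; confirming which field that is would need the disassembly and source.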



Here is an (ARM9) disassembly of the file containing yaffs_CheckpointClose:

http://www.edesix.com/yaffs/yaffs_checkptrw.o.dis


Here is a trace of the mount after doing:

# echo +all > /proc/yaffs
# echo -mtd > /proc/yaffs
# echo 9 > /proc/sys/kernel/printk
# mount -t yaffs2 /dev/mtdblock11 /nand/7

http://www.edesix.com/yaffs/mount.txt


Here is the trace of the (failing) write operation after doing:

# echo +all > /proc/yaffs
# touch /nand/7/video/test

http://www.edesix.com/yaffs/touch.txt


Here is a trace of the umount (including above crash) after doing:

# echo +all > /proc/yaffs
# echo -mtd > /proc/yaffs
# umount /nand/7

http://www.edesix.com/yaffs/umount.txt


Here is a dump of the NAND (warning - 1Gbyte), created with:

# nanddump -f nand7.dump /dev/mtd11
ECC failed: 0
ECC corrected: 0
Number of bad blocks: 5
Number of bbt blocks: 0
Block size 131072, page size 2048, OOB size 64
Dumping data starting at 0x00000000 and ending at 0x40000000...

http://www.edesix.com/yaffs/nand7.dump
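For anyone working with the dump, the geometry reported by nanddump above implies the following figures. This is plain arithmetic on the printed values; the function name is hypothetical, and whether the dump file actually contains the OOB bytes depends on the nanddump version and options used, so the with-OOB size is an assumption to check against the real file size.

```python
def nand_geometry(total_bytes, block_size, page_size, oob_size):
    """Derive page/block counts and dump sizes from nanddump's summary."""
    pages = total_bytes // page_size
    return {
        "pages_per_block": block_size // page_size,
        "blocks": total_bytes // block_size,
        "pages": pages,
        "dump_bytes_without_oob": pages * page_size,
        # Assumes each page in the dump is followed by its OOB area.
        "dump_bytes_with_oob": pages * (page_size + oob_size),
    }


geometry = nand_geometry(0x40000000, 131072, 2048, 64)
for key, value in geometry.items():
    print(f"{key}: {value}")
```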


Regards,
Chris.