Re: [Yaffs] Crash in yaffs_CheckpointClose after corruption.

Author: Charles Manning
Date:  
To: yaffs
Subject: Re: [Yaffs] Crash in yaffs_CheckpointClose after corruption.
I've been thinking about this a bit.

This is probably being triggered by running out of checkpoint blocks.

Try setting CONFIG_YAFFS_CHECKPOINT_RESERVED_BLOCKS to a larger number (say
20).
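
To put that number in context, here is a rough sketch of how much raw flash a given count of reserved blocks corresponds to, using the geometry from the nanddump output quoted below (131072-byte blocks, 0x40000000 bytes dumped). The helper name `reserved_bytes` is made up for illustration, and the comparison with the 1 Mbyte free-space threshold assumes 1 Mbyte means 2^20 bytes; none of this is something YAFFS computes itself.

```python
# Geometry reported by nanddump in the quoted message.
BLOCK_SIZE = 131072        # bytes per erase block
DEVICE_SIZE = 0x40000000   # bytes dumped (1 GiB), bad blocks included

def reserved_bytes(reserved_blocks, block_size=BLOCK_SIZE):
    """Raw flash covered by a number of reserved blocks (hypothetical helper)."""
    return reserved_blocks * block_size

total_blocks = DEVICE_SIZE // BLOCK_SIZE
threshold_blocks = (1024 * 1024) // BLOCK_SIZE  # assumes 1 Mbyte = 2**20 bytes

print("erase blocks on device:", total_blocks)
print("20 reserved blocks, MiB:", reserved_bytes(20) / (1024 * 1024))
print("1 Mbyte threshold, blocks:", threshold_blocks)
```

With these figures, 20 reserved blocks come to 2.5 MiB out of 8192 erase blocks, while a 1 Mbyte free-space threshold is only 8 blocks.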


-- Charles


On Thursday 15 November 2007 00:49:06 Chris Paulson-Ellis wrote:
> Hi,
>
> I have a yaffs2 filesystem on a 1Gbyte NAND flash that has somehow become
> corrupt. The filesystem mounts, but yaffs crashes on unmount if any changes
> are made to the directory with the corrupt entries.
>
> My pattern of access is continuous writing (at about 200kBytes/s) with a
> background task removing the oldest files when the filesystem reports less
> than 1Mbyte free. The files so created and deleted are about 135MBytes, so
> there are only a few of them stored at once and space is freed in large
> blocks.
>
> The corrupt filesystem had become full, perhaps due to a failure of the
> mechanism described above to take into account the ability of yaffs to
> garbage collect blocks fast enough or to checkpoint.
>
> My current yaffs code was taken from CVS on 13-Nov-2007. The version
> running when the filesystem apparently first became corrupt was taken from
> CVS on 30-Sep-2007, so was almost exactly the same.
>
> It crashes on unmount like this:
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
> pgd = c3dc0000
> [00000000] *pgd=23db6031, *pte=00000000, *ppte=00000000
> Internal error: Oops: 17 [#1] PREEMPT
> Modules linked in: pss_nand qhal
> CPU: 0    Not tainted  (2.6.22.2 #1)
> PC is at yaffs_CheckpointClose+0x9c/0x128
> LR is at yaffs_CheckpointSave+0x478/0x4d8
> pc : [<c00eb0a8>]    lr : [<c00e9590>]    psr: 80000013
> sp : c0017e00  ip : c0017e20  fp : c0017e1c
> r10: befaeedc  r9 : c3cef000  r8 : 00000000
> r7 : 00000000  r6 : 00000000  r5 : c3dc9e00  r4 : c3cef000
> r3 : 00000000  r2 : c0016000  r1 : c0016000  r0 : c01fc018
> Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  Segment user
> Control: c000317f  Table: 23dc0000  DAC: 00000015
> Process umount (pid: 155, stack limit = 0xc0016260)
> Stack: (0xc0017e00 to 0xc0018000)
> 7e00: c3cef000 00000000 c3dc9e00 c02e0f80 c0017e8c c0017e20 c00e9590 c00eb01c
> 7e20: c0086610 c3cef000 c3dc9e00 00000000 00000000 00000000 c0016000 befaeedc
> 7e40: c0017e64 c0017e50 c002df64 c002db88 c0016000 c0017e6c c0017e8c c0017e78
> 7e60: c00df754 c3cef000 c3dc9e00 c02e0f80 00000000 00000000 c0016000 befaeedc
> 7e80: c0017ea4 c0017e90 c00dfd88 c00e9128 c3dc9e00 c022b3a8 c0017ebc c0017ea8
> 7ea0: c0073cec c00dfd58 c02e6060 c022b448 c0017ed4 c0017ec0 c0073d9c c0073c7c
> 7ec0: c004638c c3dc9e00 c0017eec c0017ed8 c0073e9c c0073d94 c3dc9e00 c02e0f20
> 7ee0: c0017f0c c0017ef0 c00892b0 c0073e48 c0017f28 c02e0f20 c3dc9e00 00000000
> 7f00: c0017f24 c0017f10 c00794c8 c0089248 00000000 c023cf78 c0017fa4 c0017f28
> 7f20: c008ac7c c00794b4 c3db5878 c02e0f20 c0017fa4 c0017f40 c00757ec 00000001
> 7f40: 00000001 00000000 01f0000b 000041b6 00000001 00000000 00000000 00000000
> 7f60: 00000800 00000000 34aadccf 00000000 34aadccf 00000000 c0017f78 c0017f78
> 7f80: 00001000 00053008 000532c0 00053298 00000034 c001f044 00000000 c0017fa8
> 7fa0: c001eec0 c008aa98 00053008 000532c0 000532c0 00000000 00000008 00000000
> 7fc0: 00053008 000532c0 00053298 00000034 00000000 00000000 befaeedc 00000000
> 7fe0: befacb7c befacb50 400f9b08 400f9b2c 60000010 000532c0 ebffe5b7 e59f216c
> Backtrace:
> [<c00eb00c>] (yaffs_CheckpointClose+0x0/0x128) from [<c00e9590>] (yaffs_CheckpointSave+0x478/0x4d8)
>  r6:c02e0f80 r5:c3dc9e00 r4:00000000
> [<c00e9118>] (yaffs_CheckpointSave+0x0/0x4d8) from [<c00dfd88>] (yaffs_put_super+0x40/0xb8)
> [<c00dfd48>] (yaffs_put_super+0x0/0xb8) from [<c0073cec>] (generic_shutdown_super+0x80/0x118)
>  r5:c022b3a8 r4:c3dc9e00
> [<c0073c6c>] (generic_shutdown_super+0x0/0x118) from [<c0073d9c>] (kill_block_super+0x18/0x2c)
>  r5:c022b448 r4:c02e6060
> [<c0073d84>] (kill_block_super+0x0/0x2c) from [<c0073e9c>] (deactivate_super+0x64/0x7c)
>  r4:c3dc9e00
> [<c0073e38>] (deactivate_super+0x0/0x7c) from [<c00892b0>] (mntput_no_expire+0x78/0xc0)
>  r5:c02e0f20 r4:c3dc9e00
> [<c0089238>] (mntput_no_expire+0x0/0xc0) from [<c00794c8>] (path_release_on_umount+0x24/0x28)
>  r7:00000000 r6:c3dc9e00 r5:c02e0f20 r4:c0017f28
> [<c00794a4>] (path_release_on_umount+0x0/0x28) from [<c008ac7c>] (sys_umount+0x1f4/0x208)
>  r4:c023cf78
> [<c008aa88>] (sys_umount+0x0/0x208) from [<c001eec0>] (ret_fast_syscall+0x0/0x2c)
>  r8:c001f044 r7:00000034 r6:00053298 r5:000532c0 r4:00053008
> Code: e59f0088 e1560003 aa000004 e59430f4 (e7935106)
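
The word in parentheses on the Code: line above is the instruction at the faulting PC. As a rough aid to reading it, the sketch below unpacks the fields of an ARM single-data-transfer (LDR/STR) word that uses a shifted-register offset. The function name `decode_ldr_str_reg` is made up for illustration, and it handles only this one instruction form, not the whole ARM instruction set.

```python
def decode_ldr_str_reg(word):
    """Unpack an ARM LDR/STR word with a shifted-register offset (illustrative helper)."""
    # This form has bits 27..25 == 0b011 and bit 4 == 0.
    if (word >> 25) & 0x7 != 0b011 or (word >> 4) & 1:
        raise ValueError("not a register-offset LDR/STR encoding")
    return {
        "load": bool((word >> 20) & 1),    # L bit: True = LDR, False = STR
        "byte": bool((word >> 22) & 1),    # B bit: True = byte, False = word
        "add": bool((word >> 23) & 1),     # U bit: True = offset is added
        "rn": (word >> 16) & 0xF,          # base register
        "rd": (word >> 12) & 0xF,          # destination register
        "shift_imm": (word >> 7) & 0x1F,   # shift amount applied to rm
        "shift_type": (word >> 5) & 0x3,   # 0 = LSL
        "rm": word & 0xF,                  # offset register
    }

fields = decode_ldr_str_reg(0xE7935106)
print(fields)
```

For 0xe7935106 this gives a word load into r5 from r3 + (r6 << 2). With r3 and r6 both 00000000 in the register dump, that is address 0, which is consistent with the reported fault address.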

>
>
> Here is an (ARM9) disassembly of the file containing yaffs_CheckpointClose:
>
> http://www.edesix.com/yaffs/yaffs_checkptrw.o.dis
>
>
> Here is a trace of the mount after doing:
>
> # echo +all > /proc/yaffs
> # echo -mtd > /proc/yaffs
> # echo 9 > /proc/sys/kernel/printk
> # mount -t yaffs2 /dev/mtdblock11 /nand/7
>
> http://www.edesix.com/yaffs/mount.txt
>
>
> Here is the trace of the (failing) write operation after doing:
>
> # echo +all > /proc/yaffs
> # touch /nand/7/video/test
>
> http://www.edesix.com/yaffs/touch.txt
>
>
> Here is a trace of the umount (including above crash) after doing:
>
> # echo +all > /proc/yaffs
> # echo -mtd > /proc/yaffs
> # umount /nand/7
>
> http://www.edesix.com/yaffs/umount.txt
>
>
> Here is a dump of the NAND (warning - 1Gbyte), created with:
>
> # nanddump -f nand7.dump /dev/mtd11
> ECC failed: 0
> ECC corrected: 0
> Number of bad blocks: 5
> Number of bbt blocks: 0
> Block size 131072, page size 2048, OOB size 64
> Dumping data starting at 0x00000000 and ending at 0x40000000...
>
> http://www.edesix.com/yaffs/nand7.dump
>
>
> Regards,
> Chris.