Author: Paul Fidler
Date:
To: balloon
Subject: [Balloon] Nand/Yaffs Wierdness?

I have a Balloon 3 (P2) that has spent most of the afternoon spitting
out error messages like this whenever I've tried to boot it:
BUG: soft lockup detected on CPU#0!
Pid: 711, comm: depmod
CPU: 0
PC is at nand_read_byte+0x18/0x20
LR is at nand_command+0x114/0x1f4
pc : [<c0145850>] lr : [<c0145ea4>] Not tainted
sp : c3cbd9cc ip : c3cbd9dc fp : c3cbd9d8
r10: 00000000 r9 : 00000000 r8 : ffffffff
r7 : 000000ff r6 : c3c46000 r5 : c3c46180 r4 : ffffffff
r3 : c3c46180 r2 : c4812000 r1 : c4814014 r0 : 00000000
Flags: nZcv IRQs on FIQs on Mode SVC_32 Segment user
Control: 397F Table: A3AB4000 DAC: 00000015
[<c002393c>] (show_regs+0x0/0x4c) from [<c00595f0>] (softlockup_tick+0x68/0x80)
 r4 = C3CBD984
[<c0059588>] (softlockup_tick+0x0/0x80) from [<c0043804>] (do_timer+0x70/0x10c)
 r4 = 00000001
[<c0043794>] (do_timer+0x0/0x10c) from [<c0026ecc>] (timer_tick+0xe0/0x134)
[<c0026dec>] (timer_tick+0x0/0x134) from [<c002d488>] (pxa_timer_interrupt+0x34/0x84)
 r6 = 00000001 r5 = C3CBD984 r4 = F2A00000
[<c002d454>] (pxa_timer_interrupt+0x0/0x84) from [<c0022980>] (__do_irq+0x4c/0x90)
 r8 = 00000000 r7 = 0000001A r6 = C3CBD984 r5 = 00000000
 r4 = C027B2F8
[<c0022934>] (__do_irq+0x0/0x90) from [<c0022bd0>] (do_level_IRQ+0x68/0xc0)
 r8 = FFFFFFFF r7 = 000000FF r6 = C3CBD984 r5 = 0000001A
 r4 = C02D22C8
[<c0022b68>] (do_level_IRQ+0x0/0xc0) from [<c0022d70>] (asm_do_IRQ+0x4c/0x74)
 r6 = 04000000 r5 = C3CBD9B8 r4 = C3CBD984
[<c0022d24>] (asm_do_IRQ+0x0/0x74) from [<c0021924>] (__irq_svc+0x24/0x60)
 r4 = FFFFFFFF
[<c0145838>] (nand_read_byte+0x0/0x20) from [<c0145ea4>] (nand_command+0x114/0x1f4)
[<c0145d90>] (nand_command+0x0/0x1f4) from [<c0147a20>] (nand_write_oob+0xc8/0x22c)
 r8 = C3C46000 r7 = C3C46180 r6 = FFFFFFFB r5 = FFFFFFFF
 r4 = 00000000
[<c0147958>] (nand_write_oob+0x0/0x22c) from [<c013aed4>] (part_write_oob+0x78/0xb0)
[<c013ae5c>] (part_write_oob+0x0/0xb0) from [<c00c5780>] (nandmtd_WriteChunkToNAND+0x108/0x114)
 r6 = C3DAF600 r5 = 00000000 r4 = 00F83E00
[<c00c5678>] (nandmtd_WriteChunkToNAND+0x0/0x114) from [<c00c4eb0>] (yaffs_WriteChunkToNAND+0x4c/0x54)
 r8 = C3CBDAB8 r7 = 00000000 r6 = C3DEF000 r5 = 00000000
 r4 = 00007C1F
[<c00c4e64>] (yaffs_WriteChunkToNAND+0x0/0x54) from [<c00c5354>] (yaffs_TagsCompatabilityWriteChunkWithTagsToNAND+0xf0/0xfc)
 r6 = 00007C1F r5 = C3DEF000 r4 = C3CBDB1C
[<c00c5264>] (yaffs_TagsCompatabilityWriteChunkWithTagsToNAND+0x0/0xfc) from [<c00c603c>] (yaffs_WriteChunkWithTagsToNAND+0xac/0x114)
[<c00c5f90>] (yaffs_WriteChunkWithTagsToNAND+0x0/0x114) from [<c00bfed0>] (yaffs_DeleteChunk+0x20c/0x264)
 r8 = 00000000 r7 = 00007C3F r6 = 000003E1 r5 = C3DEF000
 r4 = C3CBDB1C
[<c00bfcc4>] (yaffs_DeleteChunk+0x0/0x264) from [<c00bccfc>] (yaffs_HandleWriteChunkError+0x94/0xb4)
[<c00bcc68>] (yaffs_HandleWriteChunkError+0x0/0xb4) from [<c00bcb80>] (yaffs_WriteNewChunkWithTagsToNAND+0xc0/0xe0)
 r6 = 00000000 r5 = C3DEF000 r4 = 00007C3F
[<c00bcac0>] (yaffs_WriteNewChunkWithTagsToNAND+0x0/0xe0) from [<c00bffac>] (yaffs_WriteChunkDataToObject+0x84/0xc8)
[<c00bff28>] (yaffs_WriteChunkDataToObject+0x0/0xc8) from [<c00c1854>] (yaffs_WriteDataToFile+0x2a0/0x2bc)
[<c00c15b4>] (yaffs_WriteDataToFile+0x0/0x2bc) from [<c00ba98c>] (yaffs_file_write+0x98/0x190)
[<c00ba8f4>] (yaffs_file_write+0x0/0x190) from [<c00ba560>] (yaffs_commit_write+0x90/0x160)
[<c00ba4d0>] (yaffs_commit_write+0x0/0x160) from [<c005c074>] (generic_file_buffered_write+0x26c/0x614)
[<c005be0c>] (generic_file_buffered_write+0x4/0x614) from [<c005c67c>] (__generic_file_aio_write_nolock+0x260/0x500)
[<c005c41c>] (__generic_file_aio_write_nolock+0x0/0x500) from [<c005ca28>] (__generic_file_write_nolock+0x84/0xb0)
[<c005c9a4>] (__generic_file_write_nolock+0x0/0xb0) from [<c005cc48>] (generic_file_write+0x48/0xbc)
[<c005cc00>] (generic_file_write+0x0/0xbc) from [<c007a6bc>] (vfs_write+0xc0/0x184)
[<c007a5fc>] (vfs_write+0x0/0x184) from [<c007a84c>] (sys_write+0x50/0x7c)
[<c007a7fc>] (sys_write+0x0/0x7c) from [<c0021ca0>] (ret_fast_syscall+0x0/0x2c)
 r9 = C3CBC000 r8 = C0021E44 r7 = 00000004 r6 = 000185F8
 r5 = 00001000 r4 = 40330000
**>> Block 993 retired
**>> yaffs chunk 31808 was not erased
**>> yaffs chunk 31809 was not erased
**>> yaffs chunk 31810 was not erased
**>> yaffs chunk 31811 was not erased
**>> yaffs chunk 31812 was not erased
**>> yaffs chunk 31813 was not erased
**>> yaffs chunk 31814 was not erased
**>> yaffs chunk 31815 was not erased
**>> yaffs chunk 31816 was not erased
**>> yaffs chunk 31817 was not erased
**>> yaffs chunk 31818 was not erased
**>> yaffs chunk 31819 was not erased
**>> yaffs chunk 31820 was not erased
**>> yaffs chunk 31821 was not erased
**>> yaffs chunk 31822 was not erased
**>> yaffs chunk 31823 was not erased
**>> yaffs chunk 31824 was not erased
**>> yaffs chunk 31825 was not erased
It keeps doing this (but with progressively higher chunk numbers) until
grinding to a halt after several minutes. It also occasionally throws
out a message about retiring a block.
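When reading messages like these, it can help to see which erase block each reported chunk belongs to. The sketch below assumes 32 chunks per erase block, a common geometry for small-page NAND, and assumes YAFFS numbers chunks as block * chunks-per-block + offset. Neither the geometry nor the helper names (`chunk_to_block`, `blocks_touched`) come from the log; they are illustrative only, so adjust the constant to match the actual part.

```python
# Assumption: 32 chunks per erase block. The log does not state the NAND
# geometry, so change this constant to match the real device.
CHUNKS_PER_BLOCK = 32


def chunk_to_block(chunk, chunks_per_block=CHUNKS_PER_BLOCK):
    """Return (block number, chunk offset within that block)."""
    return divmod(chunk, chunks_per_block)


def blocks_touched(chunks, chunks_per_block=CHUNKS_PER_BLOCK):
    """Return the sorted list of block numbers covered by the given chunks."""
    return sorted({c // chunks_per_block for c in chunks})


# Chunk numbers taken from the "was not erased" messages above.
reported = range(31808, 31826)

print(chunk_to_block(31808))     # (994, 0)
print(blocks_touched(reported))  # [994]
```

Under that assumption, all eighteen reported chunks fall in block 994, starting at its first chunk, which is the block immediately after the retired block 993.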
This evening, however, it has started behaving normally again and boots
all the way through to the login prompt, even though I hadn't done
anything to try to fix it.
The same board was producing similar 'yaffs chunk n was not erased'
messages on Thursday, but while trying to download a big file via zmodem,
not while booting. On that occasion the only way to cure it and get a
stable system again was a force-erase of the NAND (a non-forced erase
didn't seem to work). It's been back to TCL and there doesn't seem to be
anything wrong with the hardware. TCL were also able to cure it by doing a
force-erase and managed to squirt several MB over zmodem without
reproducing any symptoms.
Any suggestions as to what's going on? Is it just hitting bad NAND blocks
that are now marked as good (thanks to the forced erasing), or is there
something else going on?
Paul Fidler
--
Cambridge University Engineering Department | Tel: +44 1223 332816
Trumpington Street, Cambridge, CB2 1PZ, UK | Fax: +44 1223 332662