Author: Frank Mandarino
Date:  
To: yaffs
Subject: [Yaffs] [PATCH] Fix to stop retiring blocks when partition fills
Hi,

I am running Linux-2.6.19 with the latest CVS version of yaffs2
installed, patched with Vitaly's yaffs2-fixups.patch, on a system using
512/16 small page flash memory.

During testing, whenever I filled a partition I got messages of
the form:

yaffs: Block struck out
**>> Block NNN retired

The blocks would remain marked bad until the partition was erased with
flash_eraseall.

I created a small 256K partition to demonstrate and to work on the
problem. The attached file "before.log.gz" shows how blocks are retired
until there are no usable blocks left in the partition.

I traced through the code and I believe the problem occurs due to the
following sequence:

1) yaffs_FindChunkInGroup() calls yaffs_ReadChunkWithTagsFromNAND() with
a NULL data buffer, since it just needs the tags.

2) yaffs_ReadChunkWithTagsFromNAND() passes the call to
yaffs_TagsCompatabilityReadChunkWithTagsFromNAND().

3) yaffs_TagsCompatabilityReadChunkWithTagsFromNAND() calls
yaffs_ReadChunkFromNAND(), passing a pointer to a yaffs_ECCResult
variable, eccResult, and with the doErrorCorrection flag set to one.

4) yaffs_ReadChunkFromNAND() only performs the ECC result checking if
both the data buffer is non-NULL and doErrorCorrection is set. In this
case the data buffer is NULL, so no value is returned in the eccResult
pointer.

5) yaffs_TagsCompatabilityReadChunkWithTagsFromNAND() sets the
etags->eccResult field to the local eccResult value, which is garbage
because no value was set by yaffs_ReadChunkFromNAND().

6) yaffs_ReadChunkWithTagsFromNAND() checks the tags->eccResult value,
and calls yaffs_HandleChunkError() when the garbage value is greater
than YAFFS_ECC_RESULT_NO_ERROR.

7) yaffs_HandleChunkError() sets the block's gcPrioritise flag and
increments the block's chunkErrorStrikes count.

8) After a few garbage collections, the chunkErrorStrikes count exceeds
three, so yaffs_HandleChunkError() sets the block's needsRetiring flag
and the block is later marked dead.
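
The sequence above can be modelled in a few lines of C. This is a
simplified sketch, not the yaffs2 source: the names (read_chunk,
read_tags_only, handle_chunk_error, block_info, block_retired_after) are
hypothetical stand-ins, and the stack_garbage parameter is an assumption
that makes the uninitialised local deterministic, since reading a truly
uninitialised variable is undefined behaviour. The sketch also counts one
strike per reported error and ignores the garbage collections that
happen between strikes.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the yaffs types; not the real definitions. */
typedef enum {
    ECC_RESULT_UNKNOWN,
    ECC_RESULT_NO_ERROR,
    ECC_RESULT_FIXED,
    ECC_RESULT_UNFIXED
} ecc_result;

typedef struct {
    int gc_prioritise;
    int chunk_error_strikes;
    int needs_retiring;
} block_info;

/* Step 4: the ECC result is written only when a data buffer is supplied. */
static void read_chunk(const unsigned char *data, int do_error_correction,
                       ecc_result *result)
{
    if (data != NULL && do_error_correction)
        *result = ECC_RESULT_NO_ERROR; /* pretend the ECC check passed */
    /* otherwise *result is left untouched */
}

/* Steps 7-8: each reported error is a strike; more than three strikes
 * flags the block for retirement. */
static void handle_chunk_error(block_info *bi)
{
    assert(bi != NULL);
    bi->gc_prioritise = 1;
    bi->chunk_error_strikes++;
    if (bi->chunk_error_strikes > 3)
        bi->needs_retiring = 1;
}

/* Steps 1-6: a tags-only read. stack_garbage models whatever the
 * uninitialised local happened to contain (an assumption of this sketch). */
static void read_tags_only(block_info *bi, ecc_result stack_garbage)
{
    ecc_result ecc = stack_garbage;
    read_chunk(NULL, 1, &ecc);      /* NULL buffer: ecc is never set */
    if (ecc > ECC_RESULT_NO_ERROR)  /* step 6: garbage looks like an error */
        handle_chunk_error(bi);
}

/* Returns 1 if the block is flagged for retirement after the given
 * number of tags-only reads, 0 otherwise. */
int block_retired_after(int reads, ecc_result stack_garbage)
{
    block_info bi = {0, 0, 0};
    for (int i = 0; i < reads; i++)
        read_tags_only(&bi, stack_garbage);
    return bi.needs_retiring;
}
```

In this model, four tags-only reads with a garbage value above
ECC_RESULT_NO_ERROR are enough to flag a healthy block for retirement,
while any number of reads with a value at or below ECC_RESULT_NO_ERROR
leave it alone.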


I'm not exactly sure how this should be fixed. It looks like ECC
results are not available when using yaffs1 compatibility mode in
2.6.18+ because translate_oob2spare() always sets eccres1 and eccres2 to
0. There is a /* FIXME */ comment on the line where they are set to 0,
but I don't know what that fix should be.


The attached patch simply initializes the local eccResult variable in
yaffs_TagsCompatabilityReadChunkWithTagsFromNAND() to
YAFFS_ECC_RESULT_NO_ERROR to stop tags->eccResult from being set to a
garbage value.
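
As a rough illustration of why the initialiser is enough, here is a
self-contained sketch of the pattern. It is not the yaffs2 code: the
names (read_chunk, read_with_tags, ecc_result) are hypothetical, and the
"simulated" parameter is an assumption standing in for whatever the real
ECC check would report.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the yaffs ECC result type. */
typedef enum { ECC_UNKNOWN, ECC_NO_ERROR, ECC_FIXED, ECC_UNFIXED } ecc_result;

/* Hypothetical low-level read: it reports an ECC result only when it has
 * a data buffer to check. "simulated" is the result it would report. */
static void read_chunk(const unsigned char *data, int do_error_correction,
                       ecc_result simulated, ecc_result *result)
{
    assert(result != NULL);
    if (data != NULL && do_error_correction)
        *result = simulated;
    /* with a NULL buffer, *result keeps the caller's initial value */
}

/* Caller written in the style of the patch: the local starts out as
 * "no error", so a skipped ECC check cannot leave garbage behind. */
ecc_result read_with_tags(const unsigned char *data, ecc_result simulated)
{
    ecc_result ecc = ECC_NO_ERROR;  /* the one-line initialisation */
    read_chunk(data, 1, simulated, &ecc);
    return ecc;
}
```

With the initialiser in place, a tags-only read (NULL buffer) always
returns ECC_NO_ERROR in this sketch, while a read that does supply a
buffer still returns whatever the ECC check reports, so genuine errors
on data reads are not masked.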

The attached file "after.log.gz" shows how the patched version no longer
retires any blocks.


Please consider committing this patch to stabilize the code until a
proper fix can be worked out.

Regards,
../fam
-- 
Frank Mandarino                                fmandarino(a)endrelia.com
Endrelia Technologies Inc.
Toronto, Ontario, Canada

Index: yaffs_tagscompat.c
===================================================================
RCS file: /home/aleph1/cvs/yaffs2/yaffs_tagscompat.c,v
retrieving revision 1.8
diff -u -r1.8 yaffs_tagscompat.c
--- yaffs_tagscompat.c    29 Nov 2005 20:54:32 -0000    1.8
+++ yaffs_tagscompat.c    22 Dec 2006 19:39:26 -0000
@@ -437,7 +437,7 @@

     yaffs_Spare spare;
     yaffs_Tags tags;
-    yaffs_ECCResult eccResult;
+    yaffs_ECCResult eccResult = YAFFS_ECC_RESULT_NO_ERROR;

     static yaffs_Spare spareFF;
     static int init;