[Yaffs] Can removing chunkErrorStrikes check cause yaffs2 to…

Author: CHEN XUEQIN
Date:  
To: yaffs
Subject: [Yaffs] Can removing chunkErrorStrikes check cause yaffs2 too many Block struck out ?
Hi:

     I am running yaffs2 on a device with the following environment:
         * kernel 2.6.23
         * cpu:  powerpc MPC8323
         * nand: Samsung NAND 1GiB, 4 Level Cell, 3.3V 8-bit


     Sometimes I find bit flips in some key files, and the program
that uses the damaged file then goes wrong. These bit flips are
not detected by the hardware ECC. I thought a more aggressive
error-handling step should be taken to reduce the problem, so I
changed the source code of yaffs_HandleChunkError from:


void yaffs_HandleChunkError(yaffs_Device *dev, yaffs_BlockInfo *bi)
{
    if (!bi->gcPrioritise) {
        bi->gcPrioritise = 1;
        dev->hasPendingPrioritisedGCs = 1;
        bi->chunkErrorStrikes++;

        if (bi->chunkErrorStrikes > 3) {
            bi->needsRetiring = 1; /* Too many strikes, so retire this */
            T(YAFFS_TRACE_ALWAYS, (TSTR("yaffs: Block struck out" TENDSTR)));
        }
    }
}


to:

void yaffs_HandleChunkError(yaffs_Device *dev, yaffs_BlockInfo *bi)
{
    if (!bi->gcPrioritise) {
        bi->gcPrioritise = 1;
        dev->hasPendingPrioritisedGCs = 1;
        bi->chunkErrorStrikes++;

        bi->needsRetiring = 1; /* Too many stikes, so retire this */
        T(YAFFS_TRACE_ALWAYS, (TSTR("yaffs: Block struck out" TENDSTR)));
    }
}


     In other words, with the above patch, any write or verify error
causes yaffs2 to mark the affected block bad. Recently I found faults in
some devices: the kernel prints many consecutive bad blocks. The log
looks like this:


// 3 consecutive bad blocks
block 773 is bad
block 774 is bad
block 775 is bad
// 7 consecutive bad blocks
block 777 is bad
block 778 is bad
block 779 is bad
block 780 is bad
block 781 is bad
block 782 is bad
block 783 is bad

// 44 consecutive bad blocks
block 816 is bad
block 817 is bad
block 818 is bad
block 819 is bad
block 820 is bad
block 821 is bad
block 822 is bad
block 823 is bad
block 824 is bad
block 825 is bad
block 826 is bad
block 827 is bad
block 828 is bad
block 829 is bad
block 830 is bad
block 831 is bad
...
...
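
      The run lengths noted in the log comments above can be tallied
mechanically. The following is only a minimal sketch, assuming every
log line has exactly the form "block N is bad" and that the lines
arrive in ascending block order. The helper names parse_bad_block and
longest_run are hypothetical and are not part of yaffs2 or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse one log line of the assumed form
 * "block N is bad". Returns 1 and stores N on success, 0 otherwise.
 * %n is only assigned when the whole literal text has matched, so a
 * line such as "block 5 is good" is rejected.
 */
static int parse_bad_block(const char *line, int *block)
{
    int end = 0;

    assert(line != NULL && block != NULL);
    return sscanf(line, "block %d is bad%n", block, &end) == 1 && end > 0;
}

/*
 * Hypothetical helper: length of the longest run of consecutive block
 * numbers in an array that is assumed to be sorted in ascending order.
 */
static size_t longest_run(const int *blocks, size_t n)
{
    size_t best = 0;
    size_t cur = 0;

    assert(blocks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* Extend the current run, or start a new run of length 1. */
        cur = (i > 0 && blocks[i] == blocks[i - 1] + 1) ? cur + 1 : 1;
        if (cur > best)
            best = cur;
    }
    return best;
}
```

      Feeding each log line to parse_bad_block, collecting the numbers
into an array, and passing that array to longest_run gives the longest
stretch of adjacent bad blocks.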

      Here are my questions:
          1. Is my patch wrong?
          2. Why does the official yaffs2 code require more than 3
             chunkErrorStrikes before retiring a block? Would reducing
             the limit to 1 chunkErrorStrike wrongly mark good blocks bad?
          3. Should I remove the patch?
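
      To make question 2 concrete, here is a minimal standalone model
of the strike counter. It is not the yaffs2 source: struct block_state,
handle_chunk_error, collect_block and errors_until_retired are
hypothetical names, and the sketch assumes that garbage collection
clears the prioritise flag while the strike count is kept. With the
limit at 3 (the original "> 3" check) the block is retired on the
fourth counted error; with the limit at 0 (equivalent to my patch) it
is retired on the first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-block state; not yaffs_BlockInfo. */
struct block_state {
    int  chunk_error_strikes;
    bool gc_prioritise;
    bool needs_retiring;
};

/*
 * Record one chunk error. The block is retired once the strike count
 * exceeds max_strikes: 3 mirrors the original check, 0 mirrors the patch.
 */
static void handle_chunk_error(struct block_state *b, int max_strikes)
{
    assert(b != NULL);
    if (b->gc_prioritise)
        return; /* already queued for GC, so this error is not counted */

    b->gc_prioritise = true;
    b->chunk_error_strikes++;
    if (b->chunk_error_strikes > max_strikes)
        b->needs_retiring = true;
}

/* Assumption: collecting the block clears the flag but keeps the strikes. */
static void collect_block(struct block_state *b)
{
    assert(b != NULL);
    b->gc_prioritise = false;
}

/* Count the errors, each followed by a GC pass, needed to retire a block. */
static int errors_until_retired(int max_strikes)
{
    struct block_state b = {0, false, false};
    int errors = 0;

    while (!b.needs_retiring) {
        handle_chunk_error(&b, max_strikes);
        errors++;
        if (!b.needs_retiring)
            collect_block(&b);
    }
    return errors;
}
```

      Keeping the limit as a parameter, instead of deleting the check,
would let an intermediate value such as 1 or 2 be tried without
editing the function again.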


      Thanks a lot for your advice.