Re: [Yaffs] cvs YAFFS + MTD cvs + 2.4.27-vrs1 problems

Author: Aras Vaichas
Date:  
CC: YAFFS mailing list
Subject: Re: [Yaffs] cvs YAFFS + MTD cvs + 2.4.27-vrs1 problems
Aras Vaichas wrote:
> Hello all,
>
> I have managed to get the latest MTD code to compile with a 2.4.27-vrs1
> kernel. I compiled the latest YAFFS code (from CVS) into the kernel. I
> mounted a 16MB block of Smartmedia NAND and tried to create some files, but
> I got a load of errors. I reset the machine because it was obviously
> going crazy. I then noticed that my NAND suddenly had a load of "bad
> blocks"


It's been a long journey, but I'm almost there.

I've finally nailed down a solution to what was causing the occasional bit flip
and thus all the lost/bad-pages on my NAND.

If I comment out the USE_NANDECC line in my YAFFS Makefile (so that it reads
"#USE_NANDECC = -DCONFIG_YAFFS_USE_NANDECC"), then my files are read back
correctly. The only side effect is that I now get the "Reading data from NAND
FLASH without ECC is not recommended" warning message, but my system appears to
be working.

I am guessing that the ECC stuff is being handled twice or it is being
mishandled somewhere when I define CONFIG_YAFFS_USE_NANDECC. Comments?

I am now able to copy, say, a 15MB file from NAND without any problems, and
without losing massive numbers of blocks due to incorrectly labelled bad blocks.

Thank you very much to those that helped me get this far, especially Thomas and
Charles.

regards,

Aras Vaichas


------- test details ------

I designed a test to see if multiple readings of the same file from NAND would
produce bit errors in different locations in the copied file. This was found to
be true, and therefore something was wrong with the reading and processing of
the data from the Flash.
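
The idea behind the test can be sketched in a few lines of Python. This is not
the readbacktest.sh script used here (that script is not shown); the function
names diff_offsets and readback_runs are hypothetical, and the sketch assumes
each copy really re-reads the flash rather than being served from the page
cache.

```python
import os
import shutil
import tempfile


def diff_offsets(original_path, copy_path, chunk_size=65536):
    """Return the byte offsets at which two equally sized files differ."""
    offsets = []
    position = 0
    with open(original_path, "rb") as original, open(copy_path, "rb") as copy:
        while True:
            a = original.read(chunk_size)
            b = copy.read(chunk_size)
            if not a and not b:
                break
            if a != b:
                # Only walk the chunk byte by byte when it actually differs.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        offsets.append(position + i)
            position += max(len(a), len(b))
    return offsets


def readback_runs(flash_path, reference_path, runs=2):
    """Copy flash_path to local storage `runs` times.

    Returns one set of differing offsets per run, each compared against
    the known-good reference file.
    """
    results = []
    for _ in range(runs):
        fd, local_copy = tempfile.mkstemp()
        os.close(fd)
        try:
            shutil.copyfile(flash_path, local_copy)
            results.append(set(diff_offsets(reference_path, local_copy)))
        finally:
            os.remove(local_copy)
    return results
```

If the sets returned for different runs are not identical, the stored data
cannot be the only culprit: something in the read path is corrupting it.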

For example, with CONFIG_YAFFS_USE_NANDECC defined, I ran the same test twice
and got two different results:

/root # ll /mnt/y1
drw-rw-rw-    1 root     root          512 Dec  9 14:57 ./
drwxrwxr-x   15 563      100          4096 Dec  6 11:51 ../
drw-rw-rw-    1 root     root          512 Dec  9 14:57 lost+found/
-rw-rw-r--    1 root     root     15728640 Dec  9 12:00 random.copy.bin


/root # ./readbacktest.sh /mnt/y1
copying random.copy.bin from FLASH to local ...
converting with hexdump ...
comparing copy to original ...
19096301 65 61
... 11 errors in total ...
44852526 146 142

/root # ./readbacktest.sh /mnt/y1
copying random.copy.bin from FLASH to local ...
converting with hexdump ...
comparing copy to original ...
13223325 142 63
... 14 errors in total ...
41558756 64 66

The first column is the offset of the difference between the original and the
copy; the second and third columns are the octal values of the differing bytes
(I'm using cmp in Busybox). This test shows that about 1 bit in 10 million is
corrupt after a read, and you can see that the locations of the bit flips are
not the same between reads. Therefore something caused by the hardware is
creating this problem and it isn't being fixed correctly in software.
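
As a rough cross-check of those numbers, the cmp lines can be decoded with a
short Python sketch. It assumes that cmp was run on the hexdump text rather
than on the raw binaries (the script prints "converting with hexdump", and the
offsets are larger than the 15728640-byte file), so each octal value is read as
the ASCII code of one hex digit. The decode helper is hypothetical.

```python
# Lines copied from the cmp output above:
# offset, octal byte in one file, octal byte in the other.
CMP_LINES = [
    "19096301 65 61",
    "44852526 146 142",
    "13223325 142 63",
    "41558756 64 66",
]

FILE_BYTES = 15728640  # size of random.copy.bin from the directory listing


def decode(line):
    """Return (offset, digit_a, digit_b, differing_bits) for one cmp line.

    Assumption: the octal values are ASCII codes of hex digits, because
    the comparison was made on hexdump output.
    """
    offset, a, b = line.split()
    digit_a, digit_b = chr(int(a, 8)), chr(int(b, 8))
    differing_bits = bin(int(digit_a, 16) ^ int(digit_b, 16)).count("1")
    return int(offset), digit_a, digit_b, differing_bits


for line in CMP_LINES:
    offset, digit_a, digit_b, differing_bits = decode(line)
    print(f"offset {offset}: {digit_a} vs {digit_b}, {differing_bits} bit(s) differ")

# Rough error rate, assuming each reported difference is one flipped bit.
total_bits = FILE_BYTES * 8
for errors in (11, 14):
    print(f"{errors} errors: about 1 bit in {total_bits // errors:,}")
```

Under that assumption, every listed pair decodes to two hex digits that differ
in exactly one bit, and 11 to 14 errors in a 15728640-byte file works out to
roughly one flipped bit per 9 to 11 million, in line with the estimate above.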

If I don't define CONFIG_YAFFS_USE_NANDECC, I don't get any errors after
reading back a 15MB file, but I do get warnings from nand_base.c:

/root # ./readbacktest.sh /mnt/y2
copying random.copy.bin from FLASH to local ...
Reading data from NAND FLASH without ECC is not recommended
... SNIP ...
Reading data from NAND FLASH without ECC is not recommended
converting with hexdump ...
comparing copy to original ...
no differences found
/root #