On Tuesday 26 January 2010 19:24:19 Peter Barada wrote:
> On Tue, 2010-01-26 at 12:46 +1300, Charles Manning wrote:
> > Hello Peter
> >
> > On Tuesday 26 January 2010 09:39:46 Peter Barada wrote:
> > > I've run into a problem using the latest YAFFS code on linux-2.6.28-rc8
> > > using today's YAFFS CVS code.
> >
> > Does the test work with older yaffs?
>
> No, it did not - however it exhibited some random behavior. Previous
> version was pulled from 20090909 and my thought was with the current
> changes to handle large file yaffs_Tnode handling it would help -
> version strings are:

What changes were those? There have been no changes wrt file size in the
tnode trees for ages.

As far as yaffs is concerned a 30MB file is tiny. If you were dealing with
file sizes around the 2^31 integer roll-over or something then I could
understand bugs creeping in.

>
> peter@blitz:~/work/logic/eps_svn/software/products/linux/LTIB/trunk/ltib-20091102-som/rpm/BUILD/linux-2.6.28-rc8$ grep '\$Id:' fs/yaffs2/*.[hc]
> fs/yaffs2/yaffs_checkptrw.c: "$Id: yaffs_checkptrw.c,v 1.20 2009-09-09 03:03:01 charles Exp $";
> fs/yaffs2/yaffs_ecc.c: "$Id: yaffs_ecc.c,v 1.11 2009-03-06 17:20:50 wookey Exp $";
> fs/yaffs2/yaffs_fs.c: "$Id: yaffs_fs.c,v 1.82 2009-09-18 00:39:21 charles Exp $";
> fs/yaffs2/yaffs_guts.c: "$Id: yaffs_guts.c,v 1.89 2009-09-09 00:56:53 charles Exp $";
> fs/yaffs2/yaffs_mtdif1.c:const char *yaffs_mtdif1_c_version = "$Id: yaffs_mtdif1.c,v 1.11 2009-09-09 03:03:01 charles Exp $";
> fs/yaffs2/yaffs_mtdif2.c: "$Id: yaffs_mtdif2.c,v 1.23 2009-03-06 17:20:53 wookey Exp $";
> fs/yaffs2/yaffs_mtdif.c: "$Id: yaffs_mtdif.c,v 1.22 2009-03-06 17:20:51 wookey Exp $";
> fs/yaffs2/yaffs_nand.c: "$Id: yaffs_nand.c,v 1.11 2009-09-09 03:03:01 charles Exp $";
>
> I'll go back and re-test with that version to generate the output.
> The original test did the dd command with
> "dd if=/dev/urandom of=somefile.$i count=0 bs=0 skip=30M"
> to seek out 30MB after the open and then close
> the file - initially I thought the test was off as that dd command
> wouldn't generate any actual data on an EXT3 device.
>
> The MTD layer has performed flawlessly with the previous version(s),

Previous versions of what? yaffs? linux?

> so
> I'm not thinking the MTD ECC handling itself is in error - I can add
> code to dump it if it finds an ECC error on read; I noticed that the
> current code doesn't verify the data written if
> "CONFIG_YAFFS_ALWAYS_CHECK_CHUNK_ERASED" is not set - do you have a
> development patch that enables the readback to verify the chunk is
> written correctly so I can test that my MTD layer is still operating
> correctly?

No, I don't have a patch like that, but it would be helpful if yaffs did
verification to help check the mtd layer more effectively. I'll look at
adding that.

> > > With this code, I'm seeing 30MB files that are created have mismatching
> > > checksums while running the attached test script. The output from the
> > > test looks like:
> > >
> > > OMAP-35x# . /media/mmcblk0p1/x
> > > Create 30M file and get md5sum
> > > 30720+0 records in
> > > 30720+0 records out
> > > 5b04790304a4221f1016a8c310da4746  somefile.1
> > > **>> Block 710 needs retiring
> > > **>> yaffs write required 2 attempts
> > > **>> Block 710 retired
> > > Block 710 is in state 9 after gc, should be erased
> > > Calculate md5sums for copied files
> > > 5b04790304a4221f1016a8c310da4746  somefile.1
> > > 5b04790304a4221f1016a8c310da4746  somefile.2
> > > 5b04790304a4221f1016a8c310da4746  somefile.3
> > > 8c8d5a7974d0b9da747bc59edd1991f6  somefile.4
> > > execute sync and recalculate md5sums
> > > save exit: isCheckpointed 1
> > > 5b04790304a4221f1016a8c310da4746  somefile.1
> > > 5b04790304a4221f1016a8c310da4746  somefile.2
> > > 5b04790304a4221f1016a8c310da4746  somefile.3
> > > 5b04790304a4221f1016a8c310da4746  somefile.4
> > > Delete one of the files
> > > 5b04790304a4221f1016a8c310da4746  somefile.1
> > > 5b04790304a4221f1016a8c310da4746  somefile.3
> > > 3cb7668eb7760202d96970a6a9a3361f  somefile.4
> > > recopy the deleted file
> > > f6dba6d5af7a7a89481da1849035a417  somefile.1
> > > 5b04790304a4221f1016a8c310da4746  somefile.3
> > > 5b04790304a4221f1016a8c310da4746  somefile.4
> > > f6dba6d5af7a7a89481da1849035a417  somefile.7
> > > Creating test folder and some junk files in that folder
> > > 1+0 records in
> > > 1+0 records out
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.1
> > > md5sums of all files in test folder
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.1
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.2
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.3
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.4
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.5
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.6
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.7
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.8
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.9
> > > execute sync and recalculate md5sums
> > > save exit: isCheckpointed 1
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.1
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.2
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.3
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.4
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.5
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.6
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.7
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.8
> > > ae1028b8d6aef86d020c9edfae29ca3d  junk.9
> > > Remove some files and recreate them
> > > Calculate md5sums for 30M files again
> > > f6dba6d5af7a7a89481da1849035a417  somefile.1
> > > 5b04790304a4221f1016a8c310da4746  somefile.3
> > > 5b04790304a4221f1016a8c310da4746  somefile.4
> > > 1fee3f481bfa5cf3403efe9e481a0374  somefile.7
> > > execute sync and recalculate md5sums
> > > save exit: isCheckpointed 1
> > > a7b6ccfa31115aa75a0fdca07073293d  somefile.1
> > > 5b04790304a4221f1016a8c310da4746  somefile.3
> > > 1abb3d578e2d129341df26916090b869  somefile.4
> > > f6dba6d5af7a7a89481da1849035a417  somefile.7
> > > OMAP-35x#
> > >
> > > In the output, note that the md5sum of "somefile.*" should all match.
> > >
> > > Anyone seen anything like this before? Test attached.
> >
> > I just ran the test on both 2.6.24-xxx and 2.6.31-xxx using nandsim on a
> > PC and had no problems.
> > Here's one run:
> >
> > root@linux-dual-head:/mnt# ~charles/Dropbox/yaffs-30M-test
> > Create 30M file and get md5sum
> > 30720+0 records in
> > 30720+0 records out
> > 31457280 bytes (31 MB) copied, 7.72198 s, 4.1 MB/s
> > dc7fd1b9553217a9a1becbb101271eab  somefile.1
> > Calculate md5sums for copied files
> > dc7fd1b9553217a9a1becbb101271eab  somefile.1
> > dc7fd1b9553217a9a1becbb101271eab  somefile.2
> > dc7fd1b9553217a9a1becbb101271eab  somefile.3
> > dc7fd1b9553217a9a1becbb101271eab  somefile.4
> > execute sync and recalculate md5sums
> > dc7fd1b9553217a9a1becbb101271eab  somefile.1
> > dc7fd1b9553217a9a1becbb101271eab  somefile.2
> > dc7fd1b9553217a9a1becbb101271eab  somefile.3
> > dc7fd1b9553217a9a1becbb101271eab  somefile.4
> > Delete one of the files
> > dc7fd1b9553217a9a1becbb101271eab  somefile.1
> > dc7fd1b9553217a9a1becbb101271eab  somefile.3
> > dc7fd1b9553217a9a1becbb101271eab  somefile.4
> > recopy the deleted file
> > dc7fd1b9553217a9a1becbb101271eab  somefile.1
> > dc7fd1b9553217a9a1becbb101271eab  somefile.3
> > dc7fd1b9553217a9a1becbb101271eab  somefile.4
> > dc7fd1b9553217a9a1becbb101271eab  somefile.7
> > Creating test folder and some junk files in that folder
> > 1+0 records in
> > 1+0 records out
> > 1024 bytes (1.0 kB) copied, 0.000442124 s, 2.3 MB/s
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.1
> > md5sums of all files in test folder
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.1
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.2
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.3
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.4
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.5
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.6
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.7
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.8
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.9
> > execute sync and recalculate md5sums
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.1
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.2
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.3
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.4
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.5
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.6
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.7
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.8
> > 0bc0c6e9588ee2bf6c89463208c5a0e9  junk.9
> > Remove some files and recreate them
> > Calculate md5sums for 30M files again
> > dc7fd1b9553217a9a1becbb101271eab  somefile.1
> > dc7fd1b9553217a9a1becbb101271eab  somefile.3
> > dc7fd1b9553217a9a1becbb101271eab  somefile.4
> > dc7fd1b9553217a9a1becbb101271eab  somefile.7
> > execute sync and recalculate md5sums
> > dc7fd1b9553217a9a1becbb101271eab  somefile.1
> > dc7fd1b9553217a9a1becbb101271eab  somefile.3
> > dc7fd1b9553217a9a1becbb101271eab  somefile.4
> > dc7fd1b9553217a9a1becbb101271eab  somefile.7
> >
> >
> > Perhaps the retirement of the blocks indicates that some data was being
> > corrupted.
>
> Could be - I'll re-nuke the flash (since those blocks on this particular
> board should not be bad) and try again. I'm wondering if I'm caught in
> limbo with the particular version of the kernel I have that on the
> OMAP35x exhibits some caching behavior that isn't caught in the changes
> you've made. Unfortunately this is a production release and if you have
> a suggestion on how to go backwards (i.e. undo some of the caching
> changes that I'm caught in the middle of), I'd be appreciative - I'm
> looking for stability, not necessarily efficiency compared to previous
> kernel versions.

Caches are an easy way to get data inconsistency. Which cache are you
talking about here? yaffs should not be changing to support changes in
mtd-level or OMAP-specific caching.

There are two caches that yaffs **should** be aware of and should play
nice with:
* Its own cache. Try disabling that to see if that makes any difference.
  You can do that by mounting with -o "no-cache"
* The page cache. There have been some changes in this area recently. fsx
  (which really pounds on the page cache interface) runs but you might
  have uncovered a hole that fsx does not.
The page cache can be thrown out by
    # echo 3 > /proc/sys/vm/drop_caches
which will force yaffs to read all the data back again.

Thus if you do
    sync
    md5sum foo
    echo 3 > /proc/sys/vm/drop_caches
    md5sum foo
and the two md5sums differ, then it indicates that the data in the cache
was inconsistent with the data on flash.

>
> At some point it would be nice if there were tags on the YAFFS CVS tree
> so I can snap to a known version and apply it to a kernel and walk
> forward or backwards in time to capture logical changes to the YAFFS
> source and test with each.

Tagging each checkin would pollute the tags space pretty quickly. cvs does
not provide a checkin Id like svn or git but you can use -D to fetch as of
a specific date
    cvs update -D "2009-10-31"

-- Charles
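
P.S. The sync / md5sum / drop_caches / md5sum check described above could be wrapped in a small shell function so it can be run over every test file. This is only a sketch: the function name check_cached_vs_flash and the DROP_CACHES override variable are made up for illustration (the override just lets the logic be dry-run without root), and it assumes md5sum and sync are on the target.

```shell
#!/bin/sh
# Sketch: compare a file's md5sum before and after dropping the page cache.
# DROP_CACHES is a hypothetical override for dry runs without root;
# by default it points at the real kernel interface.
DROP_CACHES="${DROP_CACHES:-/proc/sys/vm/drop_caches}"

# check_cached_vs_flash FILE
# Returns 0 if both checksums match, 1 if they differ.
check_cached_vs_flash() {
    file="$1"
    sync                                   # flush dirty data to flash first
    before=$(md5sum "$file" | cut -d' ' -f1)
    echo 3 > "$DROP_CACHES"                # drop page cache, dentries, inodes
    after=$(md5sum "$file" | cut -d' ' -f1)
    if [ "$before" = "$after" ]; then
        echo "OK $file $before"
        return 0
    fi
    echo "MISMATCH $file cached=$before flash=$after"
    return 1
}
```

For example, run as root in the test directory: for f in somefile.*; do check_cached_vs_flash "$f"; done. A MISMATCH line would point at the page cache path rather than at the data on flash.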