Charles,
I recently updated my kernel to the latest CVS YAFFS code and
discovered rather serious filesystem corruption apparently triggered
by heavy file unlink and creation activity. The symptoms are easily
reproduced by repeatedly extracting a tar archive containing several
files and symbolic links in an initially empty YAFFS file system.
Soon, tar reports "tar: Couldnt remove old file: Directory not empty"
for a random file which was not supposed to be a directory! Other
symptoms are YAFFS errors which read "**>> yaffs chunk 792 was not
erased **>> yaffs write required 2 attempts".
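The reproduction described above can be sketched roughly as the loop below. This is only an illustration: the function name repro_extract, the archive path, and the pass count are hypothetical and not taken from the actual test setup.

```shell
# Hypothetical helper: repeatedly extract ARCHIVE into TARGET_DIR.
# Every pass after the first unlinks and recreates each file and
# symlink in the archive, which is the activity that seems to
# trigger the corruption.
# Usage: repro_extract ARCHIVE TARGET_DIR [PASSES]
repro_extract() {
    archive=$1
    target=$2
    passes=${3:-100}
    i=1
    while [ "$i" -le "$passes" ]; do
        if ! tar -xf "$archive" -C "$target"; then
            # Stop at the first failure so the state can be inspected.
            echo "tar failed on pass $i" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $passes passes"
}
```

With the YAFFS partition mounted on /mnt, an invocation might look like "repro_extract /tmp/rootfs.tar /mnt 200", where the archive name is again just an assumption.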
After the errors, the filesystem shows corrupted directories with ls
output like:
~ # ls -l /mnt/bin
ls: /mnt/bin/ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ: No such file or directory
ls: /mnt/bin/ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ: No such file or directory
ls: /mnt/bin/ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ: No such file or directory
ls: /mnt/bin/ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ: No such file or directory
lrwxrwxrwx 1 root root 14 Dec 8 18:24 [ -> ../bin/busybox
lrwxrwxrwx 1 root root 14 Dec 8 18:24 ash -> ../bin/busybox
lrwxrwxrwx 1 root root 14 Dec 8 18:24 awk -> ../bin/busybox
lrwxrwxrwx 1 root root 14 Dec 8 18:24 basename -> ../bin/busybox
Unmounting and remounting the file system seems to make the directory
corruption go away (at least most of the time).
My kernel is based on 2.4.26. The architecture is ppc. YAFFS is
running over a pretty stock MTD/NAND layer.
A condensed summary of configuration from my Makefile:
#USE_RAM_FOR_TEST = -DCONFIG_YAFFS_RAM_ENABLED
USE_MTD = -DCONFIG_YAFFS_MTD_ENABLED
#USE_OLD_MTD = -DCONFIG_YAFFS_USE_OLD_MTD
#USE_NANDECC = -DCONFIG_YAFFS_USE_NANDECC
#USE_WRONGECC = -DCONFIG_YAFFS_ECC_WRONG_ORDER
USE_GENERIC_RW = -DCONFIG_YAFFS_USE_GENERIC_RW
#USE_HEADER_FILE_SIZE = -DCONFIG_YAFFS_USE_HEADER_FILE_SIZE
#IGNORE_CHUNK_ERASED = -DCONFIG_YAFFS_DISABLE_CHUNK_ERASED_CHECK
#IGNORE_WRITE_VERIFY = -DCONFIG_YAFFS_DISBLE_WRITE_VERIFY
ENABLE_SHORT_NAMES_IN_RAM = -DCONFIG_SHORT_NAMES_IN_RAM
I have isolated the change which introduced this behavior to the CVS
changes made on 10/21/2004. That is, "cvs diff -c -D 2004/10/20 -D
2004/10/21" will show the changes that seem to be causing the problem.
CVS 2004/10/20 seems to work fine and I would just drop back to that
revision, but, of course, that leaves the bug which Michael found to
bite me later.
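For anyone wanting to repeat the comparison, stepping a checked-out tree between the two dates might look like the sketch below. The function name cvs_at_date and the CVS override variable are hypothetical conveniences; the update options are standard cvs ones, but this is not the exact procedure that was used here.

```shell
# Hypothetical helper: move the working tree to its state as of DATE.
# Setting CVS=echo prints the arguments instead of running cvs (dry run).
cvs_at_date() {
    # -d creates new directories, -P prunes empty ones,
    # -D selects the most recent revision no later than DATE.
    "${CVS:-cvs}" -q update -d -P -D "$1"
}
```

Calling "cvs_at_date 2004/10/20", rebuilding, and rerunning the test, then doing the same for 2004/10/21, brackets the change. Note that -D leaves a sticky date on the working copy; "cvs update -A" clears it again.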
I'm not familiar enough with the VFS layer to guess at the cause, but
this is quite reproducible. Any ideas where to look? Any suggestions
on narrowing this down to a specific VFS interaction?
BTW, my trek back thru CVS history might have been less confusing with
fewer "empty log messages". :-)
Thanks,
--Brad Bosch
Quite some time ago, Charles Manning wrote:
>
> I have just checked in changes to yaffs_fs.c, yaffs_guts.c, yaffs_guts.h to
> fix this problem.
>
> Now yaffs Objects in the object lookup hash table are not freed until the
> corresponding inode is cleared.
>
> I did some tests with a smaller bucket size (8) and observed that the
> recycling problem does not happen. Object numbers are now only recycled when
> the Linux cache tells us it is OK.
>
> This mechanism does not use any new kernel calls and should thus be good with
> older kernels.
>
> Thanx to Michael for his efforts in hunting down the problem.
>
> -- Charles