Hmmm
OK, it looks like the entry is staying "alive" in the dentry, which means that
the directory is not becoming empty. This causes problems because, when a
look-up is attempted, the dentry no longer has an associated inode, which
would be a BadThing. The deferred deletion might be causing this problem.
I'll do some tests here, but if you have a system all set up and faulting, it
would be interesting to see a trace of yaffs messages as well as the output of
"ls -ial" on a broken directory. The -i option gives inode info, which helps
to tie up the faulting file.
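
If it is easier to capture, something along these lines would collect roughly
the same information as "ls -ial". This is only a sketch: it assumes Python 3
is available on the target (quite possibly not the case on a small busybox
system), and list_with_inodes is a name made up for this example. Unlike
"ls -a" it does not list "." and "..".

```python
import os
import stat


def list_with_inodes(directory):
    """Return (name, inode, info) tuples, roughly like "ls -ial".

    Entries that appear in the directory listing but cannot be
    lstat()ed are returned with inode None, since those are the
    interesting ones when a directory looks broken.
    """
    rows = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            st = os.lstat(path)  # lstat: do not follow symlinks
        except OSError as err:
            rows.append((name, None, "lstat failed: %s" % err.strerror))
            continue
        rows.append((name, st.st_ino, stat.filemode(st.st_mode)))
    return rows


def print_listing(directory):
    """Print one line per entry; %r keeps odd bytes in names readable."""
    for name, inode, info in list_with_inodes(directory):
        print("%10s %s %r" % (inode, info, name))
```

Calling print_listing("/mnt/bin") on the faulting system would then show which
names have no usable inode behind them.
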
-- Charles
On Thursday 06 January 2005 20:35, bbosch@iphase.com wrote:
> Charles,
>
> I recently updated my kernel to the latest CVS YAFFS code and
> discovered rather serious filesystem corruption apparently triggered
> by heavy file unlink and creation activity. The symptoms are easily
> reproduced by repeatedly extracting a tar archive containing several
> files and symbolic links in an initially empty YAFFS file system.
> Soon, tar reports "tar: Couldnt remove old file: Directory not empty"
> for a random file which was not supposed to be a directory! Other
> symptoms are YAFFS errors which read "**>> yaffs chunk 792 was not
> erased **>> yaffs write required 2 attempts".
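
For anyone wanting to try this on their own board, a minimal sketch of such a
reproduction loop follows. It is not the test that was actually run: it
assumes Python 3 on the target, the archive contents are invented, and
build_archive and extract_repeatedly are hypothetical names. The explicit
unlink before each extract is an assumption about what tar does, based on the
"remove old file" message above.

```python
import io
import os
import tarfile


def build_archive(n_files=8, n_links=8):
    """Build an in-memory tar holding a few small files and symlinks."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for i in range(n_files):
            data = ("file %d\n" % i).encode()
            info = tarfile.TarInfo("bin/file%d" % i)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
        for i in range(n_links):
            link = tarfile.TarInfo("bin/link%d" % i)
            link.type = tarfile.SYMTYPE
            link.linkname = "file%d" % (i % n_files)
            tar.addfile(link)
    return buf.getvalue()


def extract_repeatedly(archive_bytes, target, rounds=100):
    """Extract the same archive over itself; return (round, error) pairs."""
    failures = []
    for n in range(rounds):
        try:
            with tarfile.open(fileobj=io.BytesIO(archive_bytes)) as tar:
                for member in tar:
                    path = os.path.join(target, member.name)
                    if os.path.lexists(path):
                        os.unlink(path)  # remove the old file first
                    tar.extract(member, target)
        except (OSError, tarfile.TarError) as err:
            failures.append((n, str(err)))
    return failures
```

On a healthy file system extract_repeatedly(build_archive(), "/mnt") should
return an empty list; any entries in the result give the round number and the
error text for each failed pass.
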
>
> After the errors, the filesystem shows corrupted directories with ls
> output like:
>
> ~ # ls -l /mnt/bin
> ls:
> /mnt/bin/ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ: No such file or directory ls:
> /mnt/bin/ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ: No such file or directory ls:
> /mnt/bin/ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ: No such file or directory ls:
> /mnt/bin/ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
>ÿÿÿÿÿÿÿÿÿ ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ: No such file or directory
> lrwxrwxrwx 1 root root 14 Dec 8 18:24 [ -> ../bin/busybox
> lrwxrwxrwx 1 root root 14 Dec 8 18:24 ash -> ../bin/busybox
> lrwxrwxrwx 1 root root 14 Dec 8 18:24 awk -> ../bin/busybox
> lrwxrwxrwx 1 root root 14 Dec 8 18:24 basename -> ../bin/busybox
>
> Unmounting and remounting the file system seems to make the directory
> corruption go away (at least most of the time).
>
> My kernel is based on 2.4.26. The architecture is ppc. YAFFS is
> running over a pretty stock MTD/NAND layer.
>
> A condensed summary of configuration from my Makefile:
>
> #USE_RAM_FOR_TEST = -DCONFIG_YAFFS_RAM_ENABLED
> USE_MTD = -DCONFIG_YAFFS_MTD_ENABLED
> #USE_OLD_MTD = -DCONFIG_YAFFS_USE_OLD_MTD
> #USE_NANDECC = -DCONFIG_YAFFS_USE_NANDECC
> #USE_WRONGECC = -DCONFIG_YAFFS_ECC_WRONG_ORDER
> USE_GENERIC_RW = -DCONFIG_YAFFS_USE_GENERIC_RW
> #USE_HEADER_FILE_SIZE = -DCONFIG_YAFFS_USE_HEADER_FILE_SIZE
> #IGNORE_CHUNK_ERASED = -DCONFIG_YAFFS_DISABLE_CHUNK_ERASED_CHECK
> #IGNORE_WRITE_VERIFY = -DCONFIG_YAFFS_DISBLE_WRITE_VERIFY
> ENABLE_SHORT_NAMES_IN_RAM = -DCONFIG_SHORT_NAMES_IN_RAM
>
> I have isolated the change which introduced this behavior to the CVS
> changes made on 10/21/2004. I.e., "cvs diff -c -D 2004/10/20 -D
> 2004/10/21" will show the changes that seem to be causing the problem.
> CVS 2004/10/20 seems to work fine and I would just drop back to that
> revision, but, of course, that leaves the bug which Michael found to
> bite me later.
>
> I'm not familiar enough with the VFS layer to guess at the cause, but
> this is quite reproducible. Any ideas where to look? Any suggestions
> on narrowing this down to a specific VFS interaction?
>
> BTW, my trek back thru CVS history might have been less confusing with
> fewer "empty log messages". :-)
>
> Thanks,
>
> --Brad Bosch
>
> Quite some time ago, Charles Manning wrote:
> > I have just checked in changes to yaffs_fs.c, yaffs_guts.c, yaffs_guts.h
> > to fix this problem.
> >
> > Now yaffs Objects in the object lookup hash table are not freed until
> > the corresponding inode is cleared.
> >
> > I did some tests with a smaller bucket size (8) and observed that the
> > recycling problem does not happen. Object numbers are now only recycled
> > when the Linux cache tells us it is OK.
> >
> > This mechanism does not use any new kernel calls and should thus be good
> > with older kernels.
> >
> > Thanx to Michael for his efforts in hunting down the problem.
> >
> > -- Charles
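
To make the idea in the older message concrete, here is a toy model of
object numbers that are only recycled once the inode has been cleared. It is
a sketch of the principle only, not the code in yaffs_guts.c: ObjectTable and
its methods are invented for illustration, and the inode cache is simulated by
a flag.

```python
class ObjectTable:
    """Toy allocator: a deleted object's number stays reserved until the
    (simulated) inode cache reports that the inode has been cleared."""

    def __init__(self):
        self.live = set()      # numbers of objects that still exist
        self.deferred = set()  # unlinked, but the inode is still cached

    def allocate(self):
        # Hand out the lowest number that is neither live nor deferred.
        n = 1
        while n in self.live or n in self.deferred:
            n += 1
        self.live.add(n)
        return n

    def unlink(self, n, inode_cached):
        self.live.remove(n)
        if inode_cached:
            self.deferred.add(n)  # must not be recycled yet

    def clear_inode(self, n):
        # The cache has dropped the inode, so the number is free again.
        self.deferred.discard(n)
```

With this model, a number unlinked while its inode is still cached is skipped
by allocate() until clear_inode() has been called for it, so a stale cached
inode can never end up pointing at a newly created object.
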