On Tuesday 26 September 2006 12:06, Gennady Dagman wrote:
> Thanks, Charles,
>
> I also tend to think that applying your latest checkins
> could be the right thing to do.
>
> Could you please just confirm that writing to a chunk that is not erased
> could really cause trouble (fs corruption etc.)?

It is likely to fix #2, but it could fix #1 too.

Of course we shouldn't oops no matter what is on the flash :-).

> Regards,
>
> Gennady Dagman.
>
> On Tue, 2006-09-26 at 11:31 +1200, Charles Manning wrote:
> > On Tuesday 26 September 2006 06:11, Gennady Dagman wrote:
> > > Hello,
> > >
> > > We ran into this linux kernel crash while mounting a yaffs2 partition
> > > (please find full Oops file attached below)
> > > and from the trace-back and register analysis I conclude that:
> > >
> > > The trace-back function call chain:
> > > get_sb_bdev ->
> > > yaffs_internal_read_super ->
> > > yaffs_GutsInitialise ->
> > > yaffs_CheckpointRestore ->
> > > yaffs_ReadCheckpointData ->
> > > yaffs_ReadCheckpointObjects ->
> > > yaffs_ReadCheckpointTnodes ->
> > > yaffs_AddOrFindLevel0Tnode -> memcpy
> > >
> > > From looking into the yaffs_AddOrFindLevel0Tnode code it's pretty clear
> > > that the only reason for memcpy
> > > (at the end of yaffs_AddOrFindLevel0Tnode) to crash is having both
> > > fStruct->topLevel = 0 and fStruct->top = 0.
> > >
> > > Looks like this problem is not easily reproducible - we have seen it only
> > > once so far and I suspect
> > > the root cause of it (as well as a few other odd problems we run into
> > > from time to time - see, for example,
> > > http://aleph1.co.uk/lurker/message/20060914.181435.951c1454.en.html) is
> > > a flash file system corruption.
> > >
> > > Questions:
> > > ---------------
> > >
> > > 1) Can you imagine what could be the reason (other than flash fs
> > > corruption) for this Oops crash ?
> >
> > I'll have a look at that oops to see what happened.
> >
> > > 2) I see that currently in our yaffs code we have defined:
> > >
> > > #define CONFIG_YAFFS_DISABLE_CHUNK_ERASED_CHECK
> > >
> > > which means that the erasure check of NAND chunks is NOT performed before
> > > write, but I know for sure that from time to time we do encounter
> > > chunks that are not erased, as a result of power-off during block erasure.
> > > What could be the consequences of using chunks that are not erased for
> > > yaffs_WriteChunkWithTagsToNAND ? Could it cause fs corruptions ?
> > > problems like
> > > http://aleph1.co.uk/lurker/message/20060914.181435.951c1454.en.html ??
> > > or this current crash ???
> >
> > Try out the new code I recently checked in for handling the retirement
> > better. Amongst other things this handles the erased checking far better
> > and I have a hunch it will fix the problem you describe here without
> > burdening you with lots of erase checks.
> >
> > See http://aleph1.co.uk/lurker/message/20060921.084052.7a405cd0.en.html
>
> _______________________________________________
> yaffs mailing list
> yaffs@lists.aleph1.co.uk
> http://lists.aleph1.co.uk/cgi-bin/mailman/listinfo/yaffs
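
For readers wondering why writing to a chunk that is not erased matters at all, here is a rough sketch in plain C. It is not yaffs code: the names chunk_is_erased, nand_program and write_chunk_checked are made up for illustration, and the "flash" is just a byte array in memory. The one fact it relies on is that programming NAND can only change bits from 1 to 0, so an erased chunk reads as all 0xFF and writing over leftover data gives you the AND of old and new bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if every byte is 0xFF (the erased
 * state of NAND), otherwise 0. */
static int chunk_is_erased(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0xFF)
            return 0;
    }
    return 1;
}

/* Hypothetical model of programming a chunk: NAND programming can
 * only clear bits (1 -> 0), so the stored result is old AND new. */
static void nand_program(unsigned char *chunk, const unsigned char *data,
                         size_t len)
{
    assert(chunk != NULL && data != NULL);
    for (size_t i = 0; i < len; i++)
        chunk[i] &= data[i];
}

/* Hypothetical checked write: refuses to program a chunk that is not
 * fully erased. Returns 0 on success, -1 if the chunk was not erased. */
static int write_chunk_checked(unsigned char *chunk,
                               const unsigned char *data, size_t len)
{
    if (!chunk_is_erased(chunk, len))
        return -1;
    nand_program(chunk, data, len);
    return 0;
}
```

With the check in place, a chunk left half-erased by a power-off is rejected instead of silently ending up with data that differs from what was written. Calling nand_program directly on such a chunk (which is roughly what skipping the check amounts to in this toy model) stores mangled bytes. Whether that explains the oops above is a separate question; this only illustrates the mechanism being asked about.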