[Yaffs] Yaffs crash during small-page scan

Author: Ian McDonnell
Date:  
To: yaffs
Subject: [Yaffs] Yaffs crash during small-page scan
Yaffers,

[it's my turn to ask for help!]

We're using yaffs_guts.c revision 1.10 circa 2005/07/26 from
yaffs2 cvs on a well established product. We are getting a few
cases of a crash/oops during the scan (small page). It crashes
in yaffs_AddObjectToDirectory called from yaffs_Scan -- there's
a bad pointer in the parent directory's children list, so the
call to list_add() oopses.


NAND state is such that during the scan yaffs finds:

1) a file object header (call it object A) whose
parent ID is 4, the deleted-file-dir.

2) further on several data chunks belonging to object A
are seen -- the in-core data for object A created
in (1) was freed (cuz the parent was deleted(id=4)),
so now a new object A is created, and because this
step has found data chunks, the new A has type=file.

3) further on a directory object is seen with the same
ID as A. This is not good, and deserves investigation
of its own, BUT I don't expect yaffs to crash.

When the scan gets to (3) it sees that the object has changed
type and calls yaffs_DestroyObject(in) in the section with the
comment "This should not happen..." to zap the existing object.

The scan then goes on and calls yaffs_FindOrCreateObjectByNumber
with the second objectId 'A', now with type=dir. Things go bad when
yaffs_AddObjectToDirectory is later called because the directory
object was incorrectly (re)initialized when it morphed from a
file into a directory -- this is why the child list pointers are
junk.
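
To make the failure mode concrete, here is a minimal standalone sketch
(not the real yaffs code). It assumes the per-type data lives in a
union, so a directory's children list head shares storage with the file
fields. All names here (struct obj, obj_set_type, add_to_directory,
morph_demo) are hypothetical, and list_add is a local stand-in for the
kernel helper. The point is only that the list head must be
re-initialized whenever an object takes on type=dir, otherwise
list_add() dereferences stale file data.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's struct list_head (assumption: same shape). */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

/* Insert item right after head. Dereferences head->next, so it faults
 * if head was never initialized. */
static void list_add(struct list_head *item, struct list_head *head)
{
    item->next = head->next;
    item->prev = head;
    head->next->prev = item;
    head->next = item;
}

enum obj_type { OBJ_UNKNOWN, OBJ_FILE, OBJ_DIRECTORY };

/* Hypothetical, heavily simplified object. */
struct obj {
    int id;
    enum obj_type type;
    struct list_head siblings;              /* link in parent's children list */
    union {
        struct { unsigned file_size; void *tnode_top; } file;
        struct { struct list_head children; } dir;
    } variant;
};

/* Set the type AND initialize the variant that belongs to that type. */
static void obj_set_type(struct obj *o, enum obj_type type)
{
    o->type = type;
    init_list_head(&o->siblings);
    switch (type) {
    case OBJ_FILE:
        o->variant.file.file_size = 0;
        o->variant.file.tnode_top = NULL;
        break;
    case OBJ_DIRECTORY:
        init_list_head(&o->variant.dir.children);
        break;
    default:
        break;
    }
}

static void add_to_directory(struct obj *dir, struct obj *child)
{
    assert(dir->type == OBJ_DIRECTORY);
    list_add(&child->siblings, &dir->variant.dir.children);
}

static int count_children(const struct obj *dir)
{
    const struct list_head *head = &dir->variant.dir.children;
    const struct list_head *p;
    int n = 0;

    for (p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Object A starts life as a file, then reappears as a directory. */
static int morph_demo(void)
{
    struct obj a, child;

    a.id = 0x101;
    obj_set_type(&a, OBJ_FILE);
    a.variant.file.file_size = 4096;  /* overlays dir.children storage */

    obj_set_type(&a, OBJ_DIRECTORY);  /* re-init wipes the stale file data */

    child.id = 0x102;
    obj_set_type(&child, OBJ_FILE);
    add_to_directory(&a, &child);
    return count_children(&a);
}
```

If the type were changed by assigning a.type alone, the children head
would still hold the old file fields and list_add() would follow a
garbage pointer, which matches the symptom described above.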

Now, I'd like to simply upgrade to the latest and greatest version
of Yaffs, but that's not so easy with thousands of units in the
field at risk. I'd need to understand what should be happening
during the scan when this situation arises and fix up the code
to handle it better. I'd also like to be sure this is fixed
appropriately in the current codebase -- the scan code handling
this case looks unchanged.

-imcd