On Saturday 03 December 2005 07:56, Lawson.Reed wrote:
> Found it.... took me 5 work days.
> And this deadlock issue IS in the YAFFS2 tip code on CVS here:
> http://www.aleph1.co.uk/cgi-bin/viewcvs.cgi/yaffs2/yaffs_fs.c?rev=1.34&view=auto
>

Thanks for this huge effort Reed. I am quite surprised that nobody else has
stumbled on this. Does it only affect 2.4.x kernels, or 2.6 as well?

> So, no one has seen this???
>
> Here is what is happening:
>
> Process 'A' grabs the YAFFS2 grossLock.
> Process 'B' preempts and its job is to free unused inodes everywhere.
> (hint: 'B' is kswapd). So, 'B' sets I_FREEING. Then it calls
> yaffs_clear_inode() which needs the grossLock. So, it goes
> on the wait queue because 'A' has the grossLock.
>
> Now process 'A' runs. It's holding the grossLock. It calls
> yaffs_get_inode() which calls BACK UP to iget()... With
> the grossLock held! That calls find_inode(). It finds
> I_FREEING set and then gets put on a wait queue in
> __wait_on_freeing_inode().
>
> Presto chango deadlock.
>
> So, my solution is to make sure the grossLock is not held when
> calling yaffs_get_inode(). Plus, I added grossLocking to
> yaffs_read_inode() since NB's comment in there is no longer
> true.
>
> I ran my 20 thread torture test which usually deadlocks in under 30
> seconds. It ran overnight with this fix. The test found no compare
> errors in the 20 files that it reads and writes at random times with
> random data and random lengths.
>
> So, I strongly suggest that someone close to the YAFFS effort review
> this change and incorporate it. I am kinda new to all this and I'm
> not even sure what the correct way to submit the changes is.
> So, let me know how I can help.

Did you mean to include some code? The easiest way is to send a patch made
with "diff -Naur old new", but other formats are fine too.

Thanx

Charles
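
For anyone who wants to convince themselves of the wait cycle Reed describes, here is a minimal user-space sketch in C. It is not YAFFS or kernel code, and every name in it (struct model, acquire, release, scenario_deadlocks and so on) is made up for illustration. It assumes I_FREEING can be treated as a resource that kswapd "holds" until clear_inode finishes, which is only an approximation of what __wait_on_freeing_inode() does. The model records who holds what and who sleeps on what, then looks for a cycle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names come from YAFFS. */
enum { NONE = -1 };
enum { PROC_A, KSWAPD, NUM_PROC };
enum { GROSS_LOCK, FREEING_INODE, NUM_RES };

struct model {
    int owner[NUM_RES];      /* process holding each resource, or NONE */
    int waits_for[NUM_PROC]; /* resource each process sleeps on, or NONE */
};

static void model_init(struct model *m)
{
    for (int r = 0; r < NUM_RES; r++)
        m->owner[r] = NONE;
    for (int p = 0; p < NUM_PROC; p++)
        m->waits_for[p] = NONE;
}

/* Take the resource if it is free, otherwise record that p sleeps on it. */
static void acquire(struct model *m, int p, int r)
{
    assert(m->waits_for[p] == NONE); /* a sleeping process cannot run */
    if (m->owner[r] == NONE)
        m->owner[r] = p;
    else
        m->waits_for[p] = r;
}

/* Release the resource and hand it to the first sleeper, if any. */
static void release(struct model *m, int p, int r)
{
    assert(m->owner[r] == p);
    m->owner[r] = NONE;
    for (int q = 0; q < NUM_PROC; q++) {
        if (m->waits_for[q] == r) {
            m->waits_for[q] = NONE;
            m->owner[r] = q;
            break;
        }
    }
}

/* True if some process waits, directly or indirectly, on itself. */
static bool deadlocked(const struct model *m)
{
    for (int start = 0; start < NUM_PROC; start++) {
        int p = start;
        for (int step = 0; step < NUM_PROC; step++) {
            int res = m->waits_for[p];
            if (res == NONE)
                break;
            int holder = m->owner[res];
            if (holder == NONE)
                break;
            if (holder == start)
                return true;
            p = holder;
        }
    }
    return false;
}

/* Replay the sequence from the report, with or without the proposed fix. */
bool scenario_deadlocks(bool drop_lock_before_iget)
{
    struct model m;
    model_init(&m);
    acquire(&m, PROC_A, GROSS_LOCK);     /* 'A' grabs grossLock */
    acquire(&m, KSWAPD, FREEING_INODE);  /* kswapd sets I_FREEING */
    acquire(&m, KSWAPD, GROSS_LOCK);     /* clear_inode sleeps on grossLock */
    if (drop_lock_before_iget)
        release(&m, PROC_A, GROSS_LOCK); /* fix: drop the lock first */
    acquire(&m, PROC_A, FREEING_INODE);  /* iget() sleeps on the freeing inode */
    return deadlocked(&m);
}
```

Calling scenario_deadlocks(false) replays the reported ordering and finds the cycle; scenario_deadlocks(true) drops the lock before the iget() step, so kswapd gets the lock and 'A' only waits on a process that can still make progress. This only checks the lock ordering argument. It says nothing about whether the real patch is complete.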