Hi Charles,
 
I have two findings from investigating the write performance issue in GC, both of which should improve performance during writes.
 
1.    Once oldestDirtySequence has been calculated in yaffs_BlockNotDisqualifiedFromGC, we do not have to keep recomputing it until the block associated with oldestDirtySequence has actually been ERASED (see the sketch after point 2).
    This removes the overhead of recomputing oldestDirtySequence and improves write performance.
 
2.   While grabbing the prioritised block in yaffs_FindBlockForGarbageCollection, there is extra overhead in the following scenario:
      if a block is marked gcPrioritise and also hasShrinkHeader, write performance suffers heavily until all the prioritised blocks have been garbage collected. It would be very helpful if a solution could be found for this problem.
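To make point 1 concrete, here is a small standalone sketch of the caching idea. The types and the cachedOldestDirtySeq field are illustrative only, not existing YAFFS2 code; in the real code the cache would presumably hang off yaffs_Device:

    /* Sketch: cache the oldest dirty sequence number so the full block
     * scan only runs when the cache is empty (0 means "unknown"). */

    typedef unsigned u32;

    struct block {
        u32 seq;            /* block sequence number */
        int full;           /* stands in for YAFFS_BLOCK_STATE_FULL */
        int deletedChunks;  /* non-zero if the block holds deleted chunks */
    };

    struct dev {
        struct block blocks[32];
        int nBlocks;
        u32 cachedOldestDirtySeq;   /* 0 => recompute on next use */
    };

    static u32 oldest_dirty_seq(struct dev *d)
    {
        u32 seq = 0;
        int i;

        if (d->cachedOldestDirtySeq)        /* cache hit: no scan needed */
            return d->cachedOldestDirtySeq;

        for (i = 0; i < d->nBlocks; i++) {  /* the scan we want to avoid */
            struct block *b = &d->blocks[i];
            if (b->full && b->deletedChunks && (seq == 0 || b->seq < seq))
                seq = b->seq;
        }
        d->cachedOldestDirtySeq = seq;
        return seq;
    }

The cache then only needs to be dropped (reset to 0) when the block owning that sequence number is actually erased; the mail quoted below suggests doing that in yaffs_BlockBecameDirty.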
 
I would be very grateful for your suggestions on the points mentioned above.
 
Thanks,
Asif
 
-----Original Message-----
From: "Asheefikbal" <Asheef@mistralsolutions.com>
To: yaffs@lists.aleph1.co.uk
Cc: "Charles Manning" <manningc2@actrix.gen.nz>,aravind@mistralsolutions.com,athif@mistralsolutio ns.com
Date: Mon, 10 Mar 2008 19:23:02 +0530
Subject: Re: [Yaffs] Critical BUG in YAFFS2 !!!


Hi Charles,
 
Thanks for the response. I totally agree with your points.
 
I am curious to know what ideas you are planning to implement, and also how to get hold of the new implementation.
 
I also have some ideas, stated below, which I thought I would share with you:
 
- Even if many object headers are marked as shrink headers, I think the overhead is in finding the lowest sequence number on every call to yaffs_FindBlockForGarbageCollection.
 
- If the above point makes sense: at present we reinitialise oldestDirtySequence to 0 on every call to yaffs_FindBlockForGarbageCollection.
 
- My thinking is that once oldestDirtySequence has been found in yaffs_BlockNotDisqualifiedFromGC, we can save it until the block holding it has actually been erased. It can then be reset to zero in yaffs_BlockBecameDirty if the erase is successful, as in the sketch below.
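A minimal sketch of that last idea, with illustrative types (cachedOldestDirtySeq is a hypothetical new field; the real reset would sit inside yaffs_BlockBecameDirty in yaffs_guts.c, after the erase succeeds):

    /* Sketch: when a block is successfully erased, drop the cached
     * oldest dirty sequence number if that block owned it, forcing a
     * recompute on the next disqualification check. */

    typedef unsigned u32;

    struct block {
        u32 seq;                    /* block sequence number */
    };

    struct dev {
        u32 cachedOldestDirtySeq;   /* 0 => unknown, recompute */
    };

    static void block_became_dirty_hook(struct dev *d, const struct block *b)
    {
        if (b->seq == d->cachedOldestDirtySeq)
            d->cachedOldestDirtySeq = 0;
    }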
 
Thanks,
Asif
 
 
 
 

-----Original Message-----
From: Charles Manning <manningc2@actrix.gen.nz>
To: yaffs@lists.aleph1.co.uk
Cc: "Asheefikbal" <Asheef@mistralsolutions.com>
Date: Mon, 10 Mar 2008 08:09:31 +1300
Subject: Re: [Yaffs] Critical BUG in YAFFS2 !!!

Asif,

Thank you for your in-depth investigation. This highlights a performance issue
with shrink headers.

However, this is *not* a bug. The code is this way for a particular reason.

The reason for doing the disqualification is to ensure correct operation for
two situations, one pretty common and the other quite subtle:

* deletion of files
* creating files with holes

When we delete a file we need to keep track of that deletion until all the
data chunks that were in the file have been erased by garbage collection. If
we do not then those chunks can come back to life as files in the lost+found
directory.

Creating files with holes is a particularly subtle situation that is probably
only encountered very infrequently. However, this is part of POSIX, so we try
the best we can.

Consider the situation when you have a file that goes through the following
operations:

A) Create [YAFFS writes object header size 0].
B) Write 3MB
C) Truncate file to 1MB. [YAFFS writes object header with size 1MB. This is
marked as a shrink header].
D) Seek to 3MB
E) Write 1MB of data.
F) Close file [YAFFS writes object header with size 4MB]

The file is now 4MB in size, but there is a 2MB hole in the middle that POSIX
tells us should read back as zero.
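For reference, steps A-F map onto this small POSIX program (error handling omitted; "holes.dat" is just an illustrative name):

    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define MB (1024 * 1024)

    int main(void)
    {
        char *buf = calloc(1, MB);   /* zeroed 1MB buffer */
        int fd = open("holes.dat", O_CREAT | O_RDWR | O_TRUNC, 0644); /* A */
        int i;

        for (i = 0; i < 3; i++)
            write(fd, buf, MB);      /* B: write 3MB */
        ftruncate(fd, 1 * MB);       /* C: shrink to 1MB -> shrink header */
        lseek(fd, 3 * MB, SEEK_SET); /* D: seek to 3MB, past the new EOF */
        write(fd, buf, MB);          /* E: write 1MB at offset 3MB */
        close(fd);                   /* F: 4MB file with a 2MB hole */

        /* Bytes 1MB..3MB were never rewritten after the truncate, so
         * POSIX requires them to read back as zeros. */
        free(buf);
        return 0;
    }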

If the object header written in step C was erased (through garbage collection)
before the data written in step B is erased, then if a reboot happened at the
wrong time we'd "forget" that we'd actually truncated the data and the data
would reappear in the file.

The way to prevent this happening is to mark those special object headers in a
way that prevents them being deleted until it is safe to do so. The way we do
this is with a shrink header flag which is applied to the object header and
also to the block.

Because we cannot keep track of all the dependencies, it is easier to just do
the following test when we want to erase (GC) a block:
* If a block does not have any shrink headers then it is OK to GC.
* If it does have a shrink header, and there are older blocks (ie with a
lower sequence number) that have some deleted chunks in them, then don't GC.
* Otherwise, GC.

That is the existing code in yaffs_BlockNotDisqualifiedFromGC(). Modifying the
code as you suggest would break the above behaviour.
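In condensed form, the test amounts to something like this sketch (illustrative types, not the verbatim yaffs_guts.c code; oldestDirtySeq stands for the lowest sequence number among full blocks that still hold deleted chunks):

    struct gc_block {
        unsigned seq;           /* block sequence number */
        int hasShrinkHeader;    /* set when a shrink header lives here */
    };

    static int ok_to_gc(const struct gc_block *b, unsigned oldestDirtySeq)
    {
        if (!b->hasShrinkHeader)
            return 1;   /* no shrink header: always OK to GC */
        if (oldestDirtySeq < b->seq)
            return 0;   /* older blocks still hold deleted chunks: hold back */
        return 1;       /* nothing older is dirty: safe to GC */
    }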

The real problem is that far too many object headers are being marked as
shrink headers, which causes this test to be performed far too often. As you
point out, this leads to poor garbage collection choices, which cause extra
overhead in writes.

I do have some ideas and want to implement those quite soon.

Thanks again

-- Charles


On Saturday 08 March 2008 02:37:49 Asheefikbal wrote:
> Hi,
>
> There is a bug in YAFFS2 in GC implementation.
>
> While checking whether a block is qualified for GC (in the function
> yaffs_BlockNotDisqualifiedFromGC), we are supposed to find the oldest dirty
> sequence number. But in the present implementation, we are finding only the
> oldest sequence number.
>
> Hence, it needs to be changed to find the oldest dirty sequence number, as
> shown below.
>
> Current Implementation
> ----------------------
> "if (b->blockState == YAFFS_BLOCK_STATE_FULL &&
>      (b->pagesInUse - b->softDeletions) < dev->nChunksPerBlock &&
>      b->sequenceNumber < seq)"
>
> Required Implementation (to find the oldest dirty sequence number)
> -------------------------------------------------------------------
> "if (b->blockState == YAFFS_BLOCK_STATE_FULL &&
>      (b->pagesInUse - b->softDeletions) <= (bi->pagesInUse - bi->softDeletions) &&
>      b->sequenceNumber < seq)"
>
> After the disk has been in use for a long time (many usage cycles), even if
> the disk has a block valid for GC whose sequence number is newer than the
> oldest sequence number, that block is not qualified for GC (with the current
> implementation). This also leads to poor write performance...
>
> I have tested the modified implementation; it works consistently, with
> proper disk usage and also improved write performance.
>
> Hope this makes sense...
>
> Regards,
> Asif
