Hi Charles,

Thanks for the response. I totally agree with your points.

I am curious to know which ideas you are planning to implement, and also how to get hold of the new implementations.

I too have some ideas, stated below, which I thought I would share with you:
- Even if many object headers are marked as shrink headers, I think the real overhead is in finding the lowest dirty sequence number on every call to "yaffs_FindBlockForGarbageCollection".
- If the above point makes sense, note that we also initialise "oldestDirtySequence" to '0' on every call to "yaffs_FindBlockForGarbageCollection".
- My thought is that once the oldest dirty sequence has been found in "yaffs_BlockNotDisqualifiedFromGC", we could save it until that block has actually been erased, and then reset it to ZERO in "yaffs_BlockBecameDirty" if the erase is successful. A sketch of this idea follows below.
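Something like the following sketch (the structures and helper names here are only illustrative, not the real YAFFS2 definitions):

struct blockInfoSketch {
    unsigned sequenceNumber;
    int hasDeletedChunks;    /* stand-in for the "dirty" test */
};

struct deviceSketch {
    struct blockInfoSketch *blocks;
    int nBlocks;
    unsigned cachedOldestDirtySequence;    /* 0 == not yet known */
};

/* Would be used by yaffs_BlockNotDisqualifiedFromGC(): rescan all
 * blocks only when the cached value has been invalidated, instead
 * of on every call. */
static unsigned FindOldestDirtySequence(struct deviceSketch *dev)
{
    int i;
    unsigned seq = 0;

    if (dev->cachedOldestDirtySequence)
        return dev->cachedOldestDirtySequence;

    for (i = 0; i < dev->nBlocks; i++) {
        struct blockInfoSketch *b = &dev->blocks[i];

        if (b->hasDeletedChunks &&
            (seq == 0 || b->sequenceNumber < seq))
            seq = b->sequenceNumber;
    }

    dev->cachedOldestDirtySequence = seq;
    return seq;
}

/* Would be called from yaffs_BlockBecameDirty() when the erase is
 * successful: the erased block may have held the oldest dirty
 * sequence, so reset the cache to ZERO and force a rescan on the
 * next query. */
static void InvalidateOldestDirtySequence(struct deviceSketch *dev)
{
    dev->cachedOldestDirtySequence = 0;
}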
Thanks,
Asif
-----Original Message-----
From: Charles Manning <manningc2@actrix.gen.nz>
To: yaffs@lists.aleph1.co.uk
Cc: "Asheefikbal" <Asheef@mistralsolutions.com>
Date: Mon, 10 Mar 2008 08:09:31 +1300
Subject: Re: [Yaffs] Critical BUG in YAFFS2 !!!
Asif,

Thank you for your in-depth investigation. This highlights a performance issue with shrink headers.

However, this is *not* a bug. The code is this way for a particular reason.
The reason for doing the disqualification is to ensure correct operation in two situations, one pretty common and the other quite subtle:

* deletion of files
* creation of files with holes

When we delete a file, we need to keep track of that deletion until all the data chunks that were in the file have been erased by garbage collection. If we do not, then those chunks can come back to life as files in the lost+found directory.
Creating files with holes is a particularly subtle situation that is probably encountered only very infrequently. However, this is part of POSIX, so we try the best we can.

Consider the situation where you have a file that goes through the following operations:

A) Create [YAFFS writes an object header with size 0].
B) Write 3MB of data.
C) Truncate the file to 1MB [YAFFS writes an object header with size 1MB. This is marked as a shrink header].
D) Seek to 3MB.
E) Write 1MB of data.
F) Close the file [YAFFS writes an object header with size 4MB].

The file is now 4MB in size, but there is a 2MB hole in the middle that POSIX tells us should read back as zero.
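In ordinary POSIX calls, the sequence looks something like this (the mount point and file name are made up, and error checking is omitted):

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define MB (1024 * 1024)

int main(void)
{
    static char buf[MB];
    int fd, i;

    memset(buf, 0xAA, sizeof(buf));

    /* A: create; the file starts at size 0. */
    fd = open("/mnt/yaffs/holey", O_CREAT | O_RDWR | O_TRUNC, 0644);

    /* B: write 3MB of data. */
    for (i = 0; i < 3; i++)
        write(fd, buf, MB);

    /* C: truncate the file back to 1MB. */
    ftruncate(fd, 1 * MB);

    /* D: seek past the end of the file. */
    lseek(fd, 3 * MB, SEEK_SET);

    /* E: write 1MB at offset 3MB. */
    write(fd, buf, MB);

    /* F: close; the file is 4MB with a 2MB hole from 1MB to 3MB
     * that must read back as zero. */
    close(fd);

    return 0;
}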
If the object header written in step C were erased (through garbage collection) before the data written in step B was erased, then if a reboot happened at the wrong time we'd "forget" that we'd actually truncated the file, and the data would reappear in it.
The way to prevent this happening is to mark those special object headers in a way that prevents them being deleted until it is safe to do so. The way we do this is with a shrink header flag which is applied to the object header and also to the block.
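Schematically, the flagging looks something like this (simplified stand-ins, not the actual YAFFS2 declarations):

/* Illustrative only: the real field names may differ. */
struct chunkTagsSketch {
    unsigned isShrink:1;    /* set on the shrink object header chunk */
    /* ... other per-chunk tag fields ... */
};

struct blockFlagsSketch {
    unsigned hasShrinkHeader:1;    /* set on the block holding that chunk */
    /* ... other per-block state ... */
};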
Because we cannot keep track of all the dependencies, it is easier to just do the following test when we want to erase (GC) a block:

* If a block does not have any shrink headers, then it is OK to GC.
* If it does have a shrink header, then if there are older blocks (ie with a lower sequence number) that have some deleted chunks in them, don't GC.
* Otherwise, GC.
That is the existing code in yaffs_BlockNotDisqualifiedFromGC(). Modifying the code as you suggest would break the above behaviour. Roughly, the test looks like the sketch below.
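This is only a paraphrase with simplified structures, not the actual source:

/* "Deleted chunks" here means chunks deleted but not yet erased. */
struct gcBlockSketch {
    unsigned sequenceNumber;
    int hasShrinkHeader;
    int hasDeletedChunks;
};

/* Returns nonzero if the candidate block may be garbage collected. */
static int BlockNotDisqualifiedSketch(const struct gcBlockSketch *blocks,
                                      int nBlocks,
                                      const struct gcBlockSketch *candidate)
{
    int i;

    /* No shrink headers in this block: always OK to GC. */
    if (!candidate->hasShrinkHeader)
        return 1;

    /* Has a shrink header: refuse to GC while any older block
     * (lower sequence number) still holds deleted chunks. */
    for (i = 0; i < nBlocks; i++) {
        const struct gcBlockSketch *b = &blocks[i];

        if (b->hasDeletedChunks &&
            b->sequenceNumber < candidate->sequenceNumber)
            return 0;
    }

    /* Otherwise, GC. */
    return 1;
}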
The real problem is that far too many object headers are being marked as shrink headers, which causes this test to be performed far too often. As you point out, this causes poor garbage collection choices, which cause extra overhead in writes.

I do have some ideas and want to implement them quite soon.
Thanks again
-- Charles
On Saturday 08 March 2008 02:37:49, Asheefikbal wrote:
> Hi,
>
> There is a bug in the YAFFS2 GC implementation.
>
> While checking whether a block is qualified for GC (in the function
> yaffs_BlockNotDisqualifiedFromGC), we are supposed to find the oldest
> dirty sequence number. But in the present implementation, we are finding
> only the oldest sequence number.
>
> Hence, it needs to be changed to find the oldest dirty sequence number,
> as under.
>
> Current Implementation
> ----------------------
> "if (b->blockState == YAFFS_BLOCK_STATE_FULL &&
>      (b->pagesInUse - b->softDeletions) < dev->nChunksPerBlock &&
>      b->sequenceNumber < seq)"
>
> Required Implementation (to find the oldest dirty sequence number)
> -------------------------------------------------------------------
> "if (b->blockState == YAFFS_BLOCK_STATE_FULL &&
>      (b->pagesInUse - b->softDeletions) <=
>          (bi->pagesInUse - bi->softDeletions) &&
>      b->sequenceNumber < seq)"
>
> After the disk has been used for a long time (multiple usages), even if
> the disk has a block valid for GC whose sequence number is newer than the
> oldest sequence number, that block is not qualified for GC (with the
> current implementation). This also leads to poor write performance...
>
> I have tested the modified implementation, and it is working consistently
> fine, with proper disk usage and also improved write performance.
>
> Hope this makes sense...
>
> Regards,
> Asif