Re: [Yaffs] Disadvantage of using yaffs checkpointing?

Author: peterlingoal
Date:  
To: Charles Manning
CC: yaffs
Subject: Re: [Yaffs] Disadvantage of using yaffs checkpointing?
I did a quick test using HEAD yaffs to search for a checkpoint block on
my NAND, and it returned the same one in the bad block area. Even if it is
rejected by the later checks, checkpointing will fail every time as long
as this bad block comes first in the search.

Should the checkpoint be ignored, and the search continue, if it is in a
bad block?
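
For anyone following the thread, the skip-and-continue behaviour asked about above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the yaffs code: `struct demo_block` and `demo_find_checkpoint_block` are made-up names, and the `is_bad` flag stands in for whatever the flash layer reports (the quick fix quoted below used `mtd->block_isbad`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one NAND block; not a real yaffs struct. */
struct demo_block {
    bool is_bad;         /* what the flash layer reports for this block */
    bool has_checkpoint; /* the block's tags look like checkpoint data */
};

/*
 * Return the index of the first block that holds checkpoint data and is
 * not marked bad, or -1 if there is no such block.
 */
int demo_find_checkpoint_block(const struct demo_block *blocks, size_t n_blocks)
{
    assert(blocks != NULL || n_blocks == 0);

    for (size_t i = 0; i < n_blocks; i++) {
        if (blocks[i].is_bad)
            continue; /* ignore leftover data in a bad block, keep searching */
        if (blocks[i].has_checkpoint)
            return (int)i;
    }
    return -1;
}
```

With a stale checkpoint sitting in a bad block at index 0 and a valid one at index 2, this search returns 2 instead of stopping at the stale copy. If no usable checkpoint is found, the caller would fall back to a full scan.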

On Fri, May 18, 2012 at 9:39 AM, peterlingoal <>wrote:

> Yes, we are using a pretty old version (back in Sep 2010), and now we
> are trying to upgrade to latest.
> Could you please help point out which checksum would prevent an old
> checkpoint being used? Right now I could not simply try a new version as
> the version mismatch would always force a re-scan.
>
> BTW, HowYaffsWorks is a great document, however there's no download link
> in yaffsDotnet. I didn't find this doc until I googled for the file
> directly. Could this be fixed so newbies like me could read the document
> first before asking questions?
>
> Thanks,
> Peter
>
>
> On Fri, May 18, 2012 at 5:26 AM, Charles Manning <>wrote:
>
>> On Thursday 17 May 2012 22:29:42 peterlingoal wrote:
>> > After spending some time looking around in my corrupted NAND, I think I
>> > am clear what's going on there:
>> >
>> > There's an *outdated* checkpoint block in the bad blocks portion, and
>> > the real good one is located at a later block. During mounting, yaffs
>> > first found the *outdated* checkpoint block and loaded from there. That's
>> > why loading from checkpoint always results in a corrupted FS, even after
>> > re-scanning all the blocks with no-checkpoint-read.
>> >
>> > now the question part:
>> >
>> >    1. Why are there checkpoint blocks 'left over' in the bad blocks in
>> >    the first place? Should they be erased?
>> It is generally a bad idea to erase bad blocks.
>> >    2. While looking for a checkpoint block, should the block status be
>> >    checked? Or is there a better way to handle this situation? I simply
>> >    used mtd->block_isbad and continued searching, and it seemed to work.
>> That should be happening. I'll fix it if that is broken.
>>
>> Now my question :-):
>> Are you using an old version of yaffs or the latest? There are various
>> checksums on the checkpoint data which should fail if old data is found.
>>
>> >
>> > regards,
>> > Peter
>> >
>> > On Mon, May 7, 2012 at 3:08 AM, Charles Manning
>> <>wrote:
>> > > On Friday 04 May 2012 00:30:55 peterlingoal wrote:
>> > > > Hi Charles,
>> > > >
>> > > > Thanks for the reply.
>> > > >
>> > > > I am quite confused about the bad block management methodology; it
>> > > > seems both MTD and yaffs2 have some kind of bad block control. The
>> > > > problem in my case is that, after some period of usage, the yaffs2
>> > > > file system on some NANDs begins to fail. Remounting while ignoring
>> > > > the checkpoint recovers the file system, but only once. The file
>> > > > system is broken again after a reboot and mount (with checkpoint).
>> > > >
>> > > > I tried to read the yaffs2 code for scanning when the checkpoint is
>> > > > ignored, and got confused. It seems the yaffs2 driver queries the
>> > > > status of each block (in function yaffs2_scan_backwards). My
>> > > > questions are:
>> > >
>> > > I suggest you read the HowYaffsWorks doc. You can find that on
>> > > yaffs.net or find the openoffice doc on the yaffs git.
>> > >
>> > > >    1. What does function yaffs2_scan_backwards do?
>> > >
>> > > This function scans the nand partition if there is no checkpoint. It
>> > > reads the tags and builds up the file system state.
>> > >
>> > > >    2. MTD keeps a BBT (in NAND in my case); how does the yaffs2 module
>> > > >    obtain the BBT information? Why is a backward rescan needed in my
>> > > >    case in order to recover the file system?
>> > >
>> > > Yaffs calls the MTD function to determine if a block is good or bad.
>> > > Yaffs does not know or care if mtd used a bad block table or not.
>> > >
>> > > >    3. After recovering the system, it seems the bad block information
>> > > >    is not saved, so a re-scan is still needed after a reboot. This is
>> > > >    my guess; please correct me if I am wrong.
>> > > >
>> > > > Also I am using a quite old version of yaffs2 (back in 2010). What's
>> > > > the most recommended stable version of yaffs2,
>> > >
>> > > I suggest using a more recent version. I would recommend using the
>> > > current HEAD.
>> > >
>> > > > and the kernel MTD driver
>> > > > version?
>> > >
>> > > Sorry, I don't keep current with all mtd changes and can't advise that
>> > > off the top of my head.
>> > >
>> > > > To cut some boot up time I am saving the BBT on NAND and reusing it
>> > > > after reboot; will this have any negative impact?
>> > >
>> > > I don't see that this will cause any problems. yaffs does not care
>> > > how or if you store bbt info.
>> > >
>> > > > I am interested in block
>> > > > summaries, but I would like to stick to checkpoint at the moment.
>> > >
>> > > If you use the new code you will get summaries as part of the
>> > > improvement.
>> > >
>> > > > I am new to kernel level debugging, so I am quite lost here. Any
>> help
>> > > > is appreciated. Thanks!
>> > >
>> > > We've all been there.
>> > >
>> > > > regards,
>> > > > Peter
>> > > >
>> > > > On Mon, Apr 30, 2012 at 7:41 AM, Charles Manning
>> > >
>> > > <>wrote:
>> > > > > On Saturday 28 April 2012 05:26:23 Peter Lin wrote:
>> > > > > > I have several NANDs on which the yaffs2 module considers itself
>> > > > > > successfully recovered from checkpointing and skips scanning, but
>> > > > > > the filesystem is not usable. Mounting with the option
>> > > > > > no-checkpoint-read recovers the filesystem.
>> > > > > >
>> > > > > > I understand that bad block management should be provided by the
>> > > > > > MTD layer, and the fact that rescanning fixes the problem suggests
>> > > > > > MTD is doing its job. But I do have some questions:
>> > > > > >
>> > > > > > 1. Why did the checkpoint restore succeed in the first place but
>> > > > > > leave a corrupted filesystem?
>> > > > >
>> > > > > It is impossible to say with so little info.
>> > > > >
>> > > > > > 2. What would happen if a used block becomes a bad
>> > > > > > block?
>> > > > >
>> > > > > That block will not be scanned. But blocks don't just "go bad". We
>> > > > > have to mark them as bad. That normally means we have time to
>> > > > > extract the useful data first.
>> > > > >
>> > > > > > Will the whole filesystem go crazy?
>> > > > >
>> > > > > No. Yaffs uses a log structure with tags. That means there is no
>> > > > > "master table" or such which holds all the information.
>> > > > >
>> > > > > > Any way to recover from it?
>> > > > > >
>> > > > > > 3. Is there any way to check for or indicate an inconsistency in
>> > > > > > the filesystem, so the mounting script could try with the option
>> > > > > > no-checkpoint-read?
>> > > > >
>> > > > > There is no such provision at present. Since there is no scanning
>> > > > > if the checkpoint works, it is really hard to see how you would
>> > > > > decide that the checkpoint was bad.
>> > > > >
>> > > > > If you are having problems with checkpoint, then consider just
>> > > > > turning it off.
>> > > > > Since block summaries were introduced, the boot speed up benefits
>> > > > > of checkpointing are not as dramatic as they were.
>> > > > >
>> > > > > > Thanks for your work and help. Please let me know if there's any
>> > > > > > mistake in my understanding.
>> > > > > >
>> > > > > > regards,
>> > > > > > Peter
>> > > > > >
>> > > > > > Does the official kernel have this function enabled, or is there
>> > > > > > any option that controls it?
>> > > > > >
>> > > > > > On 2010-03-04 20:55, Charles Manning wrote:
>> > > > > > > On Friday 05 March 2010 07:14:59 Shivdas Gujare wrote:
>> > > > > > > > Hi Charles,
>> > > > > > > >
>> > > > > > > > Thanks a lot for your help.
>> > > > > > > >
>> > > > > > > > On Wed, Mar 3, 2010 at 12:34 PM, Charles Manning wrote:
>> > > > > > > > > On Wednesday 03 March 2010 23:33:31 Sven Van Asbroeck wrote:
>> > > > > > > > >> Hello Shivdas,
>> > > > > > > > >>
>> > > > > > > > >> > So, what does "check pointing" actually save on
>> > > > > > > > >> > unmount?
>> > > > > > > > >>
>> > > > > > > > >> It's my understanding that the check point consists of
>> > > > > > > > >> the RAM data structure which is assembled when a yaffs
>> > > > > > > > >> partition is scanned. It consists of meta-information
>> > > > > > > > >> associated with each chunk and block. If you'd like to
>> > > > > > > > >> know more, I recommend reading the 'How Yaffs works'
>> > > > > > > > >> document, which is available in CVS.
>> > > > > > > > >
>> > > > > > > > > A full scan builds up a set of data structures that define
>> > > > > > > > > the file system state. A checkpoint captures a reduced
>> > > > > > > > > version of that, enough to reconstitute the main part of
>> > > > > > > > > the state, and the rest can be built up on a lazy basis.
>> > > > > > > > >
>> > > > > > > > >> > and is it
>> > > > > > > > >> > safe to always use check-pointing in a final product?
>> > > > > > > > >>
>> > > > > > > > >> According to Charles, checkpointing is designed to be
>> > > > > > > > >> used in the way you describe. To my knowledge, no open
>> > > > > > > > >> checkpointing issues exist, but you should search the
>> > > > > > > > >> archives. If you are concerned about the checkpoint
>> > > > > > > > >> diverging from the meta-information on flash, you could
>> > > > > > > > >> a) disable checkpointing altogether, or b) submit a
>> > > > > > > > >> patch implementing a checkpoint counter ;-)
>> > > > > > > > >
>> > > > > > > > > You can also choose to mount ignoring checkpointing with
>> > > > > > > > >
>> > > > > > > > > mount -t yaffs2 -o"no-checkpoint-read" ..
>> > > > > > > >
>> > > > > > > > This is not an option for me, since in the final product the
>> > > > > > > > end user should not be able to change system data (i.e.
>> > > > > > > > mount flags). Nor can I change it unless the rootfs is
>> > > > > > > > flashed on the device, since yaffs2/nand partitions are
>> > > > > > > > mounted from the rcS script.
>> > > > > > >
>> > > > > > > You don't need to do this. Just leave checkpointing on.
>> > > > > > >
>> > > > > > > -- Charles
>> > > > > >
>> > > > > > -Peter
>> > > > > > _______________________________________________
>> > > > > > yaffs mailing list
>> > > > > >
>> > > > > > http://lists.aleph1.co.uk/cgi-bin/mailman/listinfo/yaffs
>> > >
>>
>>
>>
>