Hi Charles,

Thanks for the reply.

I am quite confused about the bad block management methodology; it seems both MTD and yaffs2 have some kind of bad block control. The problem in my case is that, after some period of usage, the yaffs2 file system on some NANDs begins to fail. Remounting while ignoring the checkpoint recovers the file system, but only once. The file system is broken again after a reboot and a normal mount (with checkpoint).

I tried to read the yaffs2 code that handles scanning when the checkpoint is ignored, and got confused. It seems the yaffs2 driver queries the status of each block (in the function yaffs2_scan_backwards). My questions are:
  1. What does the function yaffs2_scan_backwards do?
  2. MTD keeps a BBT (in NAND in my case). How does the yaffs2 module obtain the BBT information? Why is a backward rescan needed in my case in order to recover the file system?
  3. After recovering the system, it seems the bad block information is not saved, so a rescan is still needed after a reboot. This is my guess; please correct me if I am wrong.
Also, I am using a quite old version of yaffs2 (from 2010). What is the recommended stable version of yaffs2, and of the kernel MTD driver? To cut some boot time I am saving the BBT on NAND and reusing it after reboot; will this have any negative impact? I am interested in block summaries, but I would like to stick with checkpointing for the moment.

I am new to kernel level debugging, so I am quite lost here. Any help is appreciated. Thanks!

regards,
Peter


On Mon, Apr 30, 2012 at 7:41 AM, Charles Manning <manningc2@actrix.gen.nz> wrote:
On Saturday 28 April 2012 05:26:23 Peter Lin wrote:
> I have several NANDs on which the yaffs2 module considers itself successfully
> restored from the checkpoint and skips scanning, but the filesystem is not
> usable. Mounting with the option no-checkpoint-read recovers the
> filesystem.
>
> I understand that bad block management should be provided by the MTD layer,
> and the fact that rescanning fixes the problem shows MTD is doing its job.
> But I do have some questions:
>
> 1. Why did the checkpoint restore succeed in the first place but leave a
> corrupted filesystem?

It is impossible to say with so little info.

> 2. What would happen if a used block becomes a bad
> block?
That block will not be scanned. But blocks don't just "go bad". We have to mark
them as bad. That normally means we have time to extract the useful data
first.

> Will the whole filesystem go crazy?

No. Yaffs uses a log structure with tags. That means there is no "master
table" or such which holds all the information.
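
As a rough illustration of what "log structure with tags" means for scanning, here is a toy model in Python. It is not the yaffs2 code: the class and field names (Chunk, Block, obj_id, chunk_id, seq) are made up for the example. It only shows the idea that the state can be rebuilt from per-chunk tags alone, walking from the newest block to the oldest and skipping blocks that are marked bad.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    obj_id: int    # hypothetical tag: which object (file) the chunk belongs to
    chunk_id: int  # hypothetical tag: position of the chunk inside the object
    data: bytes


@dataclass
class Block:
    seq: int           # block sequence number, higher means written later
    chunks: list       # chunks in the order they were written
    bad: bool = False  # a block marked bad is skipped by the scan


def scan_backwards(blocks):
    """Rebuild {(obj_id, chunk_id): data} from the tags, newest data first."""
    state = {}
    for block in sorted(blocks, key=lambda b: b.seq, reverse=True):
        if block.bad:
            continue  # bad blocks are not scanned at all
        for chunk in reversed(block.chunks):
            key = (chunk.obj_id, chunk.chunk_id)
            # The first copy seen is the newest one; older copies are stale.
            state.setdefault(key, chunk.data)
    return state


blocks = [
    Block(seq=1, chunks=[Chunk(10, 1, b"old"), Chunk(10, 2, b"tail")]),
    Block(seq=2, chunks=[Chunk(10, 1, b"new")]),
    Block(seq=3, chunks=[Chunk(11, 1, b"gone")], bad=True),
]
state = scan_backwards(blocks)
```

In this toy run, chunk 1 of object 10 resolves to the copy in the newer block, and the data in the bad block simply does not appear. No other table has to be consulted.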

> Any way to recover from it?

> 3.
> Is there any way to check for or indicate an inconsistency in the filesystem,
> so the mounting script could retry with the option no-checkpoint-read?

There is no such provision at present. Since there is no scanning if the
checkpoint works, it is really hard to see how you would decide that the
checkpoint was bad.

If you are having problems with checkpoint, then consider just turning it off.
Since block summaries were introduced, the boot speed up benefits of
checkpointing are not as dramatic as they were.
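
For anyone who still wants a mount script that falls back to a full scan, a minimal sketch follows. It assumes you can write your own sanity test; yaffs2 does not provide one, so looks_sane below is a hypothetical caller-supplied check (for example, that a known file can be read). The only yaffs2-specific piece is the no-checkpoint-read mount option mentioned in this thread. The run parameter exists just so the logic can be exercised without touching real devices.

```python
import subprocess


def mount_with_fallback(device, mountpoint, looks_sane, run=subprocess.call):
    """Mount yaffs2; retry with a full scan if the caller's check fails.

    looks_sane(mountpoint) is a hypothetical, caller-supplied test that
    returns True when the mounted tree seems usable.
    Returns the mount options that ended up in use, or None on failure.
    """
    base = ["mount", "-t", "yaffs2"]

    # First attempt: normal mount, which restores from the checkpoint if any.
    if run(base + [device, mountpoint]) == 0:
        if looks_sane(mountpoint):
            return ""
        run(["umount", mountpoint])

    # Second attempt: ignore the checkpoint so yaffs2 rescans the flash.
    opts = "no-checkpoint-read"
    if run(base + ["-o", opts, device, mountpoint]) == 0 and looks_sane(mountpoint):
        return opts

    # Either the mount failed or the tree still looks wrong; the caller
    # decides what to do next (the partition may still be mounted).
    return None
```

A call might look like mount_with_fallback("/dev/mtdblock3", "/mnt/data", check), where the device, mount point and check function are placeholders for your own setup. The quality of the result depends entirely on how good the sanity check is.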


>
> Thanks for your work and help. Please let me know if there's any mistake in
> my understanding.
>
> regards,
> Peter
>
> Does the official kernel have this function enabled, or is there any option
> that controls it?
>
> On 2010-03-04 20:55, Charles Manning wrote:
> > On Friday 05 March 2010 07:14:59 Shivdas Gujare wrote:
> > > Hi Charles,
> > >
> > > Thanks a lot for your help.
> > >
> > > On Wed, Mar 3, 2010 at 12:34 PM, Charles Manning wrote:
> > > > On Wednesday 03 March 2010 23:33:31 Sven Van Asbroeck wrote:
> > > >> Hello Shivdas,
> > > >>
> > > >> > So, what does "check pointing" actually save at
> > > >> > unmount?
> > > >>
> > > >> It's my understanding that the check point consists of the RAM data
> > > >> structure which is assembled when a yaffs partition is scanned. It
> > > >> consists of meta-information associated with each chunk and block.
> > > >> If you'd like to know more, I recommend reading the 'How Yaffs
> > > >> works' document, which is available in CVS.
> > > >
> > > > A full scan builds up a set of data structures that define the file
> > > > system state. A checkpoint captures a reduced version of that, enough
> > > > to reconstitute the main part of the state and the rest can be built
> > > > up on a lazy basis.
> > > >
> > > >> > And is it
> > > >> > safe to always use check-pointing in the final product?
> > > >>
> > > >> According to Charles, checkpointing is designed to be used in the
> > > >> way you describe. To my knowledge, no open checkpointing issues
> > > >> exist, but you should search the archives. If you are concerned
> > > >> about the checkpoint diverging from the meta-information on flash,
> > > >> you could a) disable checkpointing altogether, or b) submit a patch
> > > >> implementing a checkpoint counter ;-)
> > > >
> > > > You can also choose to mount ignoring checkpointing with
> > > >
> > > > mount -t yaffs2 -o"no-checkpoint-read" ..
> > >
> > > This is not an option for me, since in the final product the end user
> > > should not be able to change system data (i.e. mount flags). Nor can I
> > > change it unless the rootfs is flashed on the device, since the
> > > yaffs2/nand partitions are mounted from the rcS script.
> >
> > You don't need to do this. Just leave checkpointing on.
> >
> > -- Charles
>
> -Peter
> _______________________________________________
> yaffs mailing list
> yaffs@lists.aleph1.co.uk
> http://lists.aleph1.co.uk/cgi-bin/mailman/listinfo/yaffs