On Sunday 06 September 2009 02:01:46 Yeasah Pell wrote:
> Ross Younger wrote:
> > Yeasah Pell wrote:
> >> > Bad blocks are actually managed at the MTD level, not the filesystem;
> >> > the exact method used to store the bad block table varies depending on
> >> > the exact NAND configuration, but often the NAND pages themselves have
> >> > a marker in the header which indicates "badness".
> >
> > Every chip has its own concept of factory-bad block markers. Many (most?)
> > chip spec sheets then recommend that the driver reads this information
> > and uses it to populate a Bad Block Table on the device. You can find the
> > code which does this in a recent kernel source tree, in
> > drivers/mtd/nand/nand_bbt.c. (Take care - there's a general
> > implementation, which some chips deviate from. You might want to read its
> > chip detection logic too.)
> >
> > If your chip has a BBT, then you're probably better off erasing just the
> > blocks comprising that table than you are erasing the device completely.
> > (That may yet require hacking the driver.) The net result would be to
> > forget all the blocks that the driver believes have gone bad in use -
> > which I've found can happen during development if you mess up your
> > drivers.
> >
> >
> > Ross
>
> It's an excellent point, some drivers do enable an on-flash BBT. This
> isn't determined by the NAND chip in use though, it's determined by the
> NAND *controller* in use. The option which enables this
> (NAND_USE_FLASH_BBT) is easily grepped for and gives the following list
> of drivers:

Not entirely true.

Yes, some controllers use BBTs, but some NAND chips need them too.

You need to look at both the flash chip and the controller to figure out a
suitable strategy.

For example, some flash parts define a factory-marked bad block as any
block that is not all 1s when shipped. Since this policy has no
well-defined bad block marker, and the information is lost as soon as the
block is erased or written, you really have to build a BBT to use a chip
like this.

For such a device you might even use a hybrid approach: a BBT for
factory-marked bad blocks and regular marking for blocks that go bad at
runtime.
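
To make the hybrid idea concrete, here is a minimal sketch in plain C. It
is not MTD code: the flash is just a byte array, and the geometry, the
marker offset and all function names (scan_factory_bad, mark_block_worn,
block_is_bad) are made up for illustration. It assumes a part where
"factory bad" means "not all 0xFF as shipped", so the scan has to run
before anything is erased or written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy geometry, not taken from any data sheet. */
#define BLOCK_DATA   32u                       /* data bytes per block */
#define BLOCK_OOB    4u                        /* spare (OOB) bytes per block */
#define BLOCK_TOTAL  (BLOCK_DATA + BLOCK_OOB)
#define NUM_BLOCKS   8u
#define MARKER_OFS   BLOCK_DATA                /* assumed: first OOB byte is the marker */

/* True if every byte of the block (data and OOB) reads back as 0xFF. */
bool block_is_all_ones(const uint8_t *block)
{
    for (size_t i = 0; i < BLOCK_TOTAL; i++)
        if (block[i] != 0xFF)
            return false;
    return true;
}

/*
 * First-use scan: any block that is not all 1s is recorded as factory bad.
 * bbt[b] is 1 for a factory-bad block and 0 otherwise.
 * Returns the number of factory-bad blocks found.
 */
size_t scan_factory_bad(const uint8_t *flash, size_t nblocks, uint8_t *bbt)
{
    size_t bad = 0;

    assert(flash != NULL && bbt != NULL);
    for (size_t b = 0; b < nblocks; b++) {
        bbt[b] = block_is_all_ones(flash + b * BLOCK_TOTAL) ? 0 : 1;
        bad += bbt[b];
    }
    return bad;
}

/* Runtime failure: use a regular in-block marker and leave the table alone. */
void mark_block_worn(uint8_t *flash, size_t block)
{
    flash[block * BLOCK_TOTAL + MARKER_OFS] = 0x00;
}

/* A block is unusable if the factory table or its OOB marker says so. */
bool block_is_bad(const uint8_t *flash, const uint8_t *bbt, size_t block)
{
    return bbt[block] != 0 ||
           flash[block * BLOCK_TOTAL + MARKER_OFS] != 0xFF;
}
```

On a real device the table built by the scan would of course have to be
written somewhere persistent before the first erase; the sketch keeps it in
RAM only. The point is just that the factory information lives in the
table while runtime failures can still use an ordinary marker.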


>
> autcpu12: SmartMediaCard on autronix autcpu12 board
> cs553x_nand: NAND support for CS5535/CS5536 (AMD Geode companion chip)
> cafe_nand: NAND support for OLPC CAFÉ chip
> diskonchip: DiskOnChip 2000, Millennium and Millennium Plus (NAND
> reimplementation) (EXPERIMENTAL)
> fsl_elbc_nand: NAND support for Freescale eLBC controllers
> rtc_from4: Renesas Flash ROM 4-slot interface board (FROM_BOARD4)
> nand_pasemi: NAND support for PA Semi PWRficient
>
> I'm not sure how many of those controllers are widely used (the OLPC one
> must see significant use at least), but I've never used any of these
> controllers myself. But if you do use one of these controllers, it would
> indeed be best to just try to erase the on-flash table. Of course MTD
> protects these blocks from conventional erasure too (otherwise
> flash_eraseall would clear the bbt), so you'd have to disable that
> protection. Temporarily changing the third parameter of the call to
> nand_erase_nand in nand_base.c to 1 seems like it would trivially
> accomplish this. As long as the bbt isn't rewritten from memory before
> you can reboot, I think that should work fine.
>
> One other thing that occurred to me the first time I encountered this
> (this is specific to drivers which do not use an on-chip bad block
> table) is that there is probably some forensically identifiable
> difference between the OOB in a factory marked bad block versus the OOB
> in a MTD marked bad block, especially for chips where the marker
> specification diverges somewhat from the MTD code (which seems to be
> most of them). I never pursued this very far, as I always had a log of
> the original factory bad block list that I could manually restore from.

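Keeping a log of the original factory bad block list is a very good idea
during testing and development.

It can also help to disable actual bad block marking during early
development, so that a misbehaving driver cannot mark good blocks as bad.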

-- Charles