Mr. Manning, et al:
We have enabled on-die ECC handling for the Micron NAND part MT29F4G08AACWC TSOP-I-48 (or MT29F8G08ADBDAH4). In the course of doing this we ran across a use case that required us to change YAFFS. The changes (patches) are included as an attachment to this email.
You may find them useful enough to put them into the mainline sources... or not. In any case, here they are...
They do the following:
#1: A patch to yaffs_mtdif2_multi.c to allow EUCLEAN and EBADMSG to percolate up into YAFFS bbt handling.
We were returning EUCLEAN from corrected read operations, as required, but YAFFS was not acting on it. It appears that the check tags->ecc_result == YAFFS_ECC_RESULT_NO_ERROR was preventing the bad block handling from running.
We could not find a better way to ensure that tags->ecc_result holds the value YAFFS_ECC_RESULT_NO_ERROR at that point, so that the conditional passes.
Our hack was to simply remove the condition from the expression - this hack is really only valid for our use case. We are using inband-tags (too little spare area due to on-die ECC bytes) so the tags are already covered by the on-die ECC and we don't need YAFFS to protect them with additional ECC.
Would turning ECC on the tags on or off fix this? Maybe ecc_result is uninitialized, or is incorrectly being set to something other than YAFFS_ECC_RESULT_NO_ERROR?
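To make the suspected failure mode concrete, here is a minimal standalone model of the kind of check the patch touches. It is a sketch only, not the YAFFS source: the enum, the function classify_read() and its require_no_error flag are hypothetical stand-ins, and the errno fallbacks are the usual Linux values. It only shows that when the prior ecc_result is anything other than "no error", a guarded check drops the driver's EUCLEAN/EBADMSG result, while the unguarded (patched) form passes it through.

```c
#include <errno.h>

/* Fallbacks for platforms whose <errno.h> lacks these (Linux values). */
#ifndef EUCLEAN
#define EUCLEAN 117
#endif
#ifndef EBADMSG
#define EBADMSG 74
#endif

/* Hypothetical stand-in for the YAFFS ECC result states. */
enum ecc_result_model {
	ECC_RESULT_UNKNOWN,
	ECC_RESULT_NO_ERROR,
	ECC_RESULT_FIXED,
	ECC_RESULT_UNFIXED
};

/*
 * retval:           value returned by the flash read (0, -EUCLEAN, -EBADMSG)
 * current:          ecc_result already stored in the tags
 * require_no_error: 1 models the original guarded check,
 *                   0 models the patched (unguarded) check
 */
static enum ecc_result_model classify_read(int retval,
					   enum ecc_result_model current,
					   int require_no_error)
{
	/* Guarded form: ignore the driver result unless tags say "no error". */
	if (require_no_error && current != ECC_RESULT_NO_ERROR)
		return current;

	if (retval == -EBADMSG)
		return ECC_RESULT_UNFIXED;	/* uncorrectable error */
	if (retval == -EUCLEAN)
		return ECC_RESULT_FIXED;	/* corrected error */
	return current;
}
```

In this model, classify_read(-EUCLEAN, ECC_RESULT_UNKNOWN, 1) leaves the result at "unknown", which would match the behaviour we saw if ecc_result is not being set to YAFFS_ECC_RESULT_NO_ERROR when inband tags are in use. Whether that is the actual cause is still an open question.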
#2: Add a mount option to yaffs2 to supply the maximum strike count for a given block. Modifies yaffs_guts.c and yaffs_vfs_multi.c.
The limit was hard-coded in the YAFFS code (as 3), and we found it helpful during our testing to be able to change it. Our NAND part experiences a fairly high number of random correctable ECC errors (the on-die ECC engine provides 4-bit correction and 5-bit detection).
It seems that a fixed maximum is somewhat problematic in the presence of these kinds of errors - given enough time without rebooting/remounting, a large percentage of the blocks on this device could eventually be marked as bad.
Given that the errors are cumulative over time, this check should probably be for a high "rate of bad reads" relative to a moving window average... or something slightly more nuanced than a simple maximum. See this discussion on the MTD list:
http://permalink.gmane.org/gmane.linux.drivers.mtd/40618
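As one possible shape for such a check, the sketch below keeps a per-block history of the last 32 reads and flags the block only when the number of corrected reads inside that window exceeds a limit. This is not from YAFFS or the patch: the struct, the function names, the window size and the limit are all assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define WINDOW_READS 32u	/* hypothetical window size: last 32 reads */

/* Hypothetical per-block state; not a YAFFS structure. */
struct block_err_window {
	uint32_t history;	/* bit i set: the read i steps ago needed correction */
	unsigned int reads;	/* reads recorded so far, saturates at WINDOW_READS */
};

/* Count set bits (Kernighan's method). */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;
		n++;
	}
	return n;
}

/* Record one read; the oldest entry falls off the top of the mask. */
static void window_record(struct block_err_window *w, bool corrected)
{
	w->history = (w->history << 1) | (corrected ? 1u : 0u);
	if (w->reads < WINDOW_READS)
		w->reads++;
}

/* True when more than max_errors of the last WINDOW_READS reads were corrected. */
static bool window_should_retire(const struct block_err_window *w,
				 unsigned int max_errors)
{
	if (w->reads < WINDOW_READS)
		return false;	/* not enough history yet */
	return popcount32(w->history) > max_errors;
}
```

Unlike a cumulative strike counter, old errors age out here: a block that saw a burst of corrected reads and then reads cleanly for a full window is no longer a candidate for retirement.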
It looks like the strength of the ECC needs to be taken into account somehow. In our use case, given that the units are rebooted fairly frequently for other reasons, simply providing a larger max may be sufficient. Our testing continues in order to determine the correct maximum value to use for this NAND part.
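One way ECC strength could enter the decision is to count a strike only when a read used most of the correction capability. The sketch below assumes the driver can report how many bits were corrected, which may not be true for every on-die ECC part; the function name and the three-quarters policy are hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical policy: a corrected read counts as a strike only when it
 * used at least three quarters (rounded up) of the ECC correction strength.
 *
 * corrected_bits: bitflips corrected on this read (assumed to be reported
 *                 by the driver)
 * ecc_strength:   maximum correctable bits, e.g. 4 for the part above
 */
static bool read_counts_as_strike(unsigned int corrected_bits,
				  unsigned int ecc_strength)
{
	unsigned int threshold = (3u * ecc_strength + 3u) / 4u;

	if (threshold == 0)
		threshold = 1;	/* never count a clean read as a strike */
	return corrected_bits >= threshold;
}
```

With 4-bit correction the threshold works out to 3, so single-bit and two-bit corrections would not count against the block.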
Thanks for all your hard work on YAFFS and for your quick response to our questions.
Let us know how we can help.
Alan
#1: mods to allow EUCLEAN and EBADMSG to percolate up into YAFFS bbt handling
diff -r yaffs2-8a3135f/yaffs_mtdif2_multi.c yaffs2-8a3135f.patched/yaffs_mtdif2_multi.c
194c194
< && tags->ecc_result == YAFFS_ECC_RESULT_NO_ERROR) {
---
> /* KAM && tags->ecc_result == YAFFS_ECC_RESULT_NO_ERROR*/) {
199c199
< && tags->ecc_result == YAFFS_ECC_RESULT_NO_ERROR) {
---
> /* KAM && tags->ecc_result == YAFFS_ECC_RESULT_NO_ERROR*/) {
#2: mods to create new yaffs mount option "chunk-error-strikes-max-xx"
diff -r yaffs2-8a3135f/yaffs_guts.c yaffs2-8a3135f.patched/yaffs_guts.c
37a38,41
> /* Extern declarations */
>
> extern unsigned int yaffs_chunk_error_strikes_max; // KAM: see: yaffs_vfs_multi.c
>
219c223
< if (bi->chunk_error_strikes > 3) {
---
> if (bi->chunk_error_strikes > yaffs_chunk_error_strikes_max) {
diff -r yaffs2-8a3135f/yaffs_vfs_multi.c yaffs2-8a3135f.patched/yaffs_vfs_multi.c
38a39
> #include <linux/kernel.h> // KAM - for simple_strtol
187a189,190
> unsigned int yaffs_chunk_error_strikes_max = 3; // KAM - could not find a way to access options in yaffs_guts.c - created global
>
2540a2544
> int chunk_error_strikes_max; // KAM
2542a2547,2561
> // KAM - start new code
> #define STRIKES_OPTION_PREFIX "chunk-error-strikes-max-"
>
> static int get_strikes_from_option(char* option) {
> int strikes = 3; // default
> char* start = strstr(option, STRIKES_OPTION_PREFIX);
> if (start != NULL) {
> char* end;
> start = start + strlen(STRIKES_OPTION_PREFIX); // skip prefix
> strikes = (int)simple_strtol(start,&end,10);
> }
> return strikes;
> }
> // KAM - end new code
>
2598a2618,2619
> } else if (strstr(cur_opt, STRIKES_OPTION_PREFIX) != NULL) { // KAM
> options->chunk_error_strikes_max = get_strikes_from_option(cur_opt); // KAM
2717a2739,2741
> yaffs_trace(YAFFS_TRACE_OS, "chunk_error_strikes_max %d", options.chunk_error_strikes_max); // KAM
> yaffs_chunk_error_strikes_max = options.chunk_error_strikes_max; // KAM - could not find a way to access options in yaffs_guts.c - created global
>
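For comparison, a stricter variant of the option parsing could look like the sketch below. It is a user-space illustration rather than kernel code (it uses strtoul from the C library, not the kernel helpers), and parse_strikes_option() and STRIKES_DEFAULT are hypothetical names. It falls back to the default of 3 when the option is absent, has no digits, carries a sign, has trailing characters, or overflows.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define STRIKES_OPTION_PREFIX "chunk-error-strikes-max-"
#define STRIKES_DEFAULT 3u	/* same default as the hard-coded limit */

/* Parse "chunk-error-strikes-max-<n>"; return STRIKES_DEFAULT on any error. */
static unsigned int parse_strikes_option(const char *option)
{
	size_t prefix_len = strlen(STRIKES_OPTION_PREFIX);
	const char *digits;
	char *end;
	unsigned long value;

	/* The option must start with the prefix. */
	if (strncmp(option, STRIKES_OPTION_PREFIX, prefix_len) != 0)
		return STRIKES_DEFAULT;

	/* Require a leading digit: rejects "", "-5", "+5" and whitespace. */
	digits = option + prefix_len;
	if (*digits < '0' || *digits > '9')
		return STRIKES_DEFAULT;

	errno = 0;
	value = strtoul(digits, &end, 10);
	if (errno != 0 || *end != '\0' || value > UINT_MAX)
		return STRIKES_DEFAULT;

	return (unsigned int)value;
}
```

With the patch applied, the option would be passed alongside the other YAFFS mount options, for example -o inband-tags,chunk-error-strikes-max-10.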