Re: [Yaffs] background GC


Author: Hugo Etchegoyen
Date:
To: yaffs
Old-Topics: Re: [Yaffs] Proposals for speeding up shutdown
Subject: Re: [Yaffs] background GC

Dear Charles,

I'm trying the latest version of yaffs after some time of inactivity. It
looks great, with background collection and block refreshing, and the
granularity of collection has been reduced so that collection can be
interrupted more quickly. This seems to address my previous concerns about
wear leveling and quick reaction to power failure.

I have one new concern now, based on some experimenting and studying
the code. Please forgive me if I didn't get it right; the GC code is
rather complex.

I've noticed that after a while all dirty blocks get erased, even if
they have just a single unused chunk, due to this code in
yaffs_find_gc_block():

     /*
      * If nothing has been selected for a while, try the oldest dirty
      * because that's gumming up the works.
      */

     if (!selected && dev->param.is_yaffs2 &&
         dev->gc_not_done >= (background ? 10 : 20)) {
         yaffs2_find_oldest_dirty_seq(dev);
         if (dev->oldest_dirty_block > 0) {
             selected = dev->oldest_dirty_block;
             dev->gc_dirtiest = selected;
             dev->oldest_dirty_gc_count++;
             bi = yaffs_get_block_info(dev, selected);
             dev->gc_pages_in_use =
                 bi->pages_in_use - bi->soft_del_pages;
         } else {
             dev->gc_not_done = 0;
         }
     }

If no block has been selected for erasure by the time this code runs,
there are enough erased blocks and no dirty block with enough garbage
was found. Once gc_not_done reaches 10 (background) or 20 (otherwise),
the oldest dirty block is selected and erased anyway.
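
To make the frequency concrete, here is a toy model of the skip counter.
It is not yaffs code. It assumes that gc_not_done is incremented on every
pass that selects nothing and reset when a block is selected, that normal
selection never finds a block, and that an oldest dirty block always
exists. The function name forced_selections is made up for illustration.

```c
#include <assert.h>

/*
 * Toy model, not yaffs code. Assumptions: gc_not_done is incremented on
 * every pass that selects nothing and reset to 0 when a block is
 * selected; normal selection never finds a block; an oldest dirty block
 * always exists.
 */
int forced_selections(int passes, int threshold)
{
    int gc_not_done = 0;
    int forced = 0;
    int i;

    assert(passes >= 0 && threshold > 0);

    for (i = 0; i < passes; i++) {
        int selected = 0;

        /* Same comparison as the quoted fallback. */
        if (gc_not_done >= threshold) {
            selected = 1;
            forced++;
        }

        if (selected)
            gc_not_done = 0;
        else
            gc_not_done++;
    }
    return forced;
}
```

Under these assumptions the fallback fires once every threshold + 1
passes, so a threshold of 10 forces roughly one erasure per 11 idle
passes no matter how little garbage the oldest dirty block holds.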

I'm not sure about the rationale behind this code. I guess it tends to
keep the nand flash de-fragmented, since in the long run it will
eliminate all garbage. But I'm concerned about the number of erasures
caused by this logic. I ran a test on a 128 MB partition which was about
40% full and 60% free. I repeatedly created and deleted a very small
file, each time waiting for yaffs to stabilize until no more blocks were
erased (I was checking erasures by tracing). I'm aware this is the least
friendly thing to do, because it does not allow yaffs to "consolidate"
successive changes.

Now /proc/yaffs reads as follows:

    Multi-version YAFFS built:Mar 21 2011 16:41:07

    Device 0 "nand_filesystem"
    start_block.......... 0
    end_block............ 999
    total_bytes_per_chunk 2048
    use_nand_ecc......... 1
    no_tags_ecc.......... 0
    is_yaffs2............ 1
    inband_tags.......... 0
    empty_lost_n_found... 0
    disable_lazy_load.... 0
    refresh_period....... 500
    n_caches............. 10
    n_reserved_blocks.... 5
    always_check_erased.. 0

    data_bytes_per_chunk. 2048
    chunk_grp_bits....... 0
    chunk_grp_size....... 1
    n_erased_blocks...... 571
    blocks_in_checkpt.... 1

    n_tnodes............. 2409
    n_obj................ 241
    n_free_chunks........ 36581

    n_page_writes........ 3241
    n_page_reads......... 3108
    n_erasures........... 55
    n_gc_copies.......... 3088
    all_gcs.............. 645
    passive_gc_count..... 645
    oldest_dirty_gc_count 50
    n_gc_blocks.......... 51
    bg_gcs............... 51
    n_retired_writes..... 0
    n_retired_blocks..... 0
    n_ecc_fixed.......... 0
    n_ecc_unfixed........ 0
    n_tags_ecc_fixed..... 0
    n_tags_ecc_unfixed... 0
    cache_hits........... 0
    n_deleted_files...... 0
    n_unlinked_files..... 64
    refresh_count........ 1
    n_bg_deletions....... 0

There have been 55 block erasures; one of them is a block refresh and 50
are "oldest dirty block" erasures. This leaves us with only 4 "good"
erasures of blocks with a reasonable amount of garbage. Tracing showed
that the 50 "oldest dirty" erasures were performed on blocks with very
little garbage, often just one chunk. At first sight this looks
excessive, considering that more than half of the blocks in the flash
are empty. The difference in flash wear is big (around 10 to 1 in this
case).
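
The breakdown above is plain arithmetic on the /proc/yaffs counters. A
small sketch, assuming each refresh and each "oldest dirty" selection
accounts for exactly one erasure (the helper names are made up):

```c
#include <assert.h>

/* Erasures that are neither block refreshes nor "oldest dirty" picks. */
int ordinary_gc_erasures(int n_erasures, int refresh_count,
                         int oldest_dirty_gc_count)
{
    assert(n_erasures >= refresh_count + oldest_dirty_gc_count);
    return n_erasures - refresh_count - oldest_dirty_gc_count;
}

/* Integer percentage (rounded down) of erasures caused by "oldest dirty". */
int oldest_dirty_share_pct(int n_erasures, int oldest_dirty_gc_count)
{
    assert(n_erasures > 0);
    return oldest_dirty_gc_count * 100 / n_erasures;
}
```

With n_erasures = 55, refresh_count = 1 and oldest_dirty_gc_count = 50,
the first helper gives the 4 "good" erasures and the second shows that
about 90% of all erasures came from the "oldest dirty" path.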

What do you think? Maybe increasing the skip values (10 and 20) so that
the "oldest dirty" code runs less frequently will give a better balance
between long-term de-fragmentation and too many erasures?
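
To illustrate what I mean, the two skip values could be turned into
parameters instead of constants. This is only a sketch: struct
gc_skip_params and oldest_dirty_fallback_due() are hypothetical names
and do not exist in yaffs, which hard-codes 10 and 20 in the test quoted
above.

```c
#include <assert.h>

/* Hypothetical tunables; yaffs hard-codes 10 and 20 in the quoted code. */
struct gc_skip_params {
    int background_skip;   /* threshold used when background != 0 */
    int foreground_skip;   /* threshold used otherwise */
};

/* Returns nonzero when the "oldest dirty" fallback would be attempted. */
int oldest_dirty_fallback_due(const struct gc_skip_params *p,
                              int gc_not_done, int background)
{
    int threshold;

    assert(p && p->background_skip > 0 && p->foreground_skip > 0);

    threshold = background ? p->background_skip : p->foreground_skip;
    return gc_not_done >= threshold;
}
```

With {10, 20} this reproduces the current condition; with, say,
{100, 200} the fallback would be attempted about ten times less often.
Those larger values are only an example, not a tested recommendation.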

Best regards,
Hugo

--

Ing. Hugo Eduardo Etchegoyen
Manager, Base Software Department

Compañía Hasar | Grupo Hasar
Marcos Sastre y José Ingenieros
El Talar. Pacheco
[B1618CSD] Buenos Aires. Argentina
Tel [54 11] 4117 8900 | Fax [54 11] 4117 8998
E-mail: hetchegoyen@hasar.com
Visit us at: www.hasar.com
Legal information and confidentiality policy:
www.grupohasar.com/disclaimer
