On Thursday 19 April 2012 23:50:21 Ezio Zhang wrote:
> shrink hdr is used to identify file hole and deleted files.
> for the second situation,if a file deleted without using shrink hdr the
> file will appear in the next start.(howyaffswork.pdf says "Shrink headers
> are also used to indicate that a file has been deleted and that the record
> of the file deletion is not lost.")
> but what i am puzzling is that when it is safe to remove shrink hdr?

The best way to think about this is to think about what a shrink header
does and the information it provides.

Typically we only care about the most up-to-date version of some
information. For example, if a file name is changed, then I only care about
the new name and not the old name. The old header with the old name can be
erased at any time.

There are times when the history matters, and this is when the shrink
header flag is used.

The first of those is file holes. Consider the sequence:

  Open new file.
  Write 2MB of data.
  Truncate back to 1MB.
  Seek to 2MB.
  Write 1MB of data.

The file should now contain 1MB of real data, a 1MB "hole", then 1MB of
real data.

So how do we go about remembering that the hole is there? The most recent
file header just tells us that the file is 3MB in size. If we relied only
on that information, it would not be possible to tell which data in the
file should be treated as part of the hole.

To this end, yaffs2 considers all the old headers too. After all, when a
header was written, that was the file size at the time, and any data beyond
that limit should be treated as deleted. Thus when the file header
indicating the truncation back to 1MB is seen, we know that any older data
beyond 1MB should be deleted.

The file header that tells us this is now old, but it still contains useful
information. We can't erase that information until the data chunks in the
hole have all been erased. This is where shrink header markers come in.
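To make the hole example concrete, here is a toy replay of the sequence
above. It is only a sketch under simplifying assumptions, not yaffs2 code:
the record layout, the names (struct rec, chunk_is_live, example_log) and
the 1MB chunk granularity are all invented for illustration, and the real
scanner works on flash chunks and tags rather than an in-memory array.

```c
#include <stddef.h>

/* Hypothetical, simplified log record. NOT the yaffs2 on-flash format. */
enum rec_kind { REC_HEADER, REC_DATA };

struct rec {
    enum rec_kind kind;
    unsigned mb; /* REC_HEADER: file size in MB; REC_DATA: chunk offset in MB */
};

/* The sequence from the text, in write order (oldest first). */
const struct rec example_log[] = {
    { REC_HEADER, 0 }, /* open new file                         */
    { REC_DATA,   0 }, /* write 2MB: chunk at 0MB               */
    { REC_DATA,   1 }, /*            chunk at 1MB               */
    { REC_HEADER, 1 }, /* truncate back to 1MB (shrink header)  */
    { REC_DATA,   2 }, /* seek to 2MB, write 1MB                */
    { REC_HEADER, 3 }, /* newest header: file size is 3MB       */
};

/*
 * Return 1 if the data chunk at log[idx] should still be treated as live.
 * A chunk is dead if some header written AFTER it recorded a file size at
 * or below the chunk's offset. With use_old_headers == 0 only the newest
 * header is consulted, which is the case where the hole is forgotten.
 */
int chunk_is_live(const struct rec *log, size_t n, size_t idx,
                  int use_old_headers)
{
    unsigned offset = log[idx].mb;
    size_t i;

    /* Walk from the newest record back to just after the chunk. */
    for (i = n; i > idx + 1; i--) {
        const struct rec *r = &log[i - 1];

        if (r->kind != REC_HEADER)
            continue;
        if (r->mb <= offset)
            return 0; /* file was this short after the chunk was written */
        if (!use_old_headers)
            break;    /* only the newest header is considered */
    }
    return 1;
}
```

With the old headers considered, the chunk at offset 1MB (index 2) is
reported dead because the truncation header follows it. If only the newest
3MB header is consulted, the same chunk wrongly looks live, which is the
information that would be lost if the shrink header were erased too early.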
The shrink header marker is a flag that tells us that the header cannot be
erased (i.e. cannot be garbage collected) until the data in the hole has
been erased.

Now clearly it would be very time consuming and complex to track the
location of every chunk in every hole (there might, in theory, be
thousands). Instead we use a "trick". Since the data in the hole was
written before the shrink header, it must have an older sequence number.
Since it was deleted, the blocks holding that data must have some deleted
chunks. Therefore we can prevent the garbage collector from erasing the
shrink header too early by ensuring that there are no older blocks with
deleted data before the block holding the shrink header is collected. That
is the job of the function yaffs_block_ok_for_gc().

Now a similar issue applies for deleted files. If we erase the file header
of a deleted file before the data chunks have themselves been erased, the
data chunks would be reconstructed into a file and placed in lost+found.
Thus the shrink header mechanism is used again.

The downside of shrink headers is that they can force the garbage collector
to ignore blocks with a lot of garbage (i.e. blocks that are fast to gc).
That is why I added the feature to only use shrink headers for larger
holes.

I think there are some improvements that can be made, particularly around
handling deleted files.

-- Charles
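As an illustration of the ordering "trick" described above, the check can
be sketched roughly as follows. This is a simplified model and not the
actual body of yaffs_block_ok_for_gc(): the structure, field names and
example arrays are hypothetical, and the linear scan over all blocks is
used here only for clarity.

```c
#include <stddef.h>

/* Hypothetical, simplified per-block state. Not the real yaffs structures. */
struct blk {
    unsigned seq;           /* sequence number: higher means written later */
    int has_shrink_hdr;     /* block holds at least one shrink header      */
    int has_deleted_chunks; /* block holds at least one deleted chunk      */
};

/*
 * Return 1 if blocks[candidate] may be garbage collected.
 * A block without a shrink header is always eligible. A block with one is
 * eligible only when no OLDER block still holds deleted chunks, since those
 * chunks might be the hole (or deleted-file) data the header is masking.
 */
int block_ok_for_gc_sketch(const struct blk *blocks, size_t n,
                           size_t candidate)
{
    size_t i;

    if (!blocks[candidate].has_shrink_hdr)
        return 1; /* no history to protect */

    for (i = 0; i < n; i++) {
        if (blocks[i].seq < blocks[candidate].seq &&
            blocks[i].has_deleted_chunks)
            return 0; /* older dirty block exists: keep the shrink header */
    }
    return 1;
}

/* Block with seq 10 still holds the truncated data; seq 11 holds the header. */
const struct blk example_before[] = {
    { 10, 0, 1 },
    { 11, 1, 1 },
    { 12, 0, 0 },
};

/* Same device after the seq 10 block has been collected and is clean. */
const struct blk example_after[] = {
    { 10, 0, 0 },
    { 11, 1, 1 },
    { 12, 0, 0 },
};
```

In example_before the block with sequence number 11 is refused even though
it contains garbage, because the older block 10 still has deleted chunks.
Block 10 itself is eligible, and once it has been collected
(example_after) block 11 becomes eligible too. This also shows the cost
mentioned above: a block full of garbage can be held back by an older one.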