This bug was diagnosed and the fix was identified by Nathaniel Filardo
of Johns Hopkins. All I'm providing is an easy test case and a writeup.

There is a permanent-data-loss bug in venti/copy: if it is copying
a block of hashes, and the last byte(s) of the last hash in the block
are zero, then the copy loop skips copying that hash (and all of its
descendants). This is low-probability (I think roughly
1/(256*VtEntrySize), or about 0.01%) but fully reproducible once you
have a triggering block.

I have enclosed these files to help reproduce the bug:

* "find-trouble.c" locates an 8K hunk of the fortune database which
  has a SHA1 hash ending with a zero byte
* "trouble" is an 8K file containing such a hunk
* "transcript" shows how to use vac to store an "un-copyable" vac
  archive into one venti and how to verify that venti/copy skipped
  copying the "trouble" file from that venti to another. It is
  probably important to make sure that, before running vac, the
  "trouble" file was added to the directory last, so that vac will
  add the file's hash to the *end* of the relevant block.

P.S. The particular form of the change to the loop test matches
some code in P9P's venti/copy: http://tinyurl.com/neuujzc