Hi,

I use git-annex 3.20120123 on a Debian testing amd64 machine with software RAID6 and LVM2. I needed to move the whole /home directory to another LV (the new LV is on an encrypted PV; the old LV is encrypted and not properly aligned, as I'm switching from an encrypted /home only to everything encrypted except /boot), so I used rsync -aAXH from a read-only-mounted /home to the new LV mounted on /mnt/home_2. After the move was complete I ran git annex fsck on my (4 TB of) data. fsck found some files bad and moved them to the ..../bad directory. So far so good; this is how it should be, right?

But I also keep a file with the sha1sums of all my files, so I checked the 'bad' file against it. It was OK. Then I computed the SHA256 of the file, which is the checksum git annex fsck uses. It was OK too. So how did the file get marked as bad? Am I missing something here? Could it be related to the hardware (HDDs) and silent data corruption? Or is it an undesirable effect of rsync? Or is fsck itself at fault?

Any ideas?

Well, it should only move files to .git/annex/bad/ if their filesize is wrong, or their checksum is wrong.

You can try moving a file out of .git/annex/bad/ and re-run fsck and see if it fails it again. (And if it does, paste in a log!)

To do that: suppose you have a file .git/annex/bad/SHA256-s33--5dc45521382f1c7974d9dbfcff1246370404b952 and you know that the file foobar was supposed to have that content (you can check that foobar is a symlink to that SHA value). Then reinject it:

git annex reinject .git/annex/bad/SHA256-s33--5dc45521382f1c7974d9dbfcff1246370404b952 foobar
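
Before reinjecting, it can be worth confirming by hand that the content really matches the key, since a SHA256 key name embeds both the expected size (after "-s") and the expected hash (after the final "--"). A minimal sketch of pulling those out with plain shell string manipulation (this is not a git-annex command, just parsing of the example key above):

```shell
# The key from the example above; its fields can be parsed with
# ordinary shell parameter expansion.
key="SHA256-s33--5dc45521382f1c7974d9dbfcff1246370404b952"

hash="${key##*--}"                           # everything after the last "--"
size="${key#SHA256-s}"; size="${size%%--*}"  # the number between "-s" and "--"

echo "expected hash: $hash"
echo "expected size: $size bytes"
```

You can then compare the hash against `sha256sum` of the file in .git/annex/bad/, and the size against `stat -c %s`, before running reinject.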

Comment by http://joey.kitenet.net/ Tue Feb 14 16:58:33 2012

Thanks, joey, but I still do not know why the file, which has been (and still is) OK according to separate sha1 and sha256 checks, was marked 'bad' by fsck and moved to .git/annex/bad. What could be the reason for that? Could rsync have caused it? I know too little about the internal workings of git-annex to answer this question.

But one thing I know for certain: false positives should not happen unless something is actually wrong with the file. If fsck is unreliable and I have to check everything twice, it is useless; I might as well just keep checksums of all the files and do all the checks by hand...
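
For what it's worth, the checking-by-hand workflow is easy to script with coreutils alone; a minimal sketch, independent of git-annex (the file names here are made up for illustration):

```shell
# Record checksums once in a manifest, verify later.
dir=$(mktemp -d)
cd "$dir"
printf 'some data' > file1          # stand-in for a real data file
sha256sum file1 > manifest.sha256   # the manifest to keep alongside the data
sha256sum -c manifest.sha256        # re-hashes and prints "file1: OK" if intact
cd / && rm -rf "$dir"
```

This is essentially the same comparison fsck makes, just driven by a manifest file instead of key names.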

Comment by antymat Tue Feb 14 22:48:37 2012

All that git annex fsck does is checksum the file and move it away if the checksum fails.

If bad data was somehow read from the disk that one time, what you describe could occur. I cannot think of any other way it could happen.
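
In other words, the expected checksum travels in the object's file name, and fsck just re-hashes what is on disk and compares the two strings. A stand-in for that comparison using only coreutils (the file and the hard-coded hash here are invented for illustration; this is a sketch, not git-annex code):

```shell
# Create a stand-in "annexed object" whose expected hash we know.
tmp=$(mktemp)
printf 'hello' > "$tmp"

# SHA256 of the string "hello"; in git-annex this would come from the key name.
expected="2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

actual=$(sha256sum "$tmp" | cut -d' ' -f1)   # re-hash what is on disk now

if [ "$actual" = "$expected" ]; then
    echo "content matches key: file is good"
else
    echo "mismatch: fsck would move this file to .git/annex/bad/"
fi
rm -f "$tmp"
```

A transient bad read from the disk would make `actual` differ that one time, which matches the scenario described above.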

Comment by http://joey.kitenet.net/ Tue Feb 14 22:57:29 2012

OK, thanks. I was just wondering whether the issue could have been caused by rsync, since git(-annex) uses symlinks, and hard links too.

I will keep an eye on that and run checks with my own checksums as well as fsck from time to time, and see what happens. I will post my results here, but a whole run (fsck or checksum) takes almost 2 days, so I will not do it too often... ;)

Comment by antymat Wed Feb 15 07:13:12 2012

The symlinks are in the git repository, so if rsync damaged one, git would see the change. And nothing that happens to the symlinks can affect fsck.

git-annex does not use hard links at all.

fsck corrects mangled file permissions. It is possible to screw up the permissions so badly that it cannot see the files at all (i.e., chmod 000 on a file under .git/annex/objects), but then fsck will complain and give up, not move the files to bad. So I don't see how a botched rsync could result in fsck moving a file with correct content to bad.
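
To make the permissions point concrete, here is a small illustration with a temporary file standing in for an annexed object (a sketch assuming, as described above, that fsck restores sane read-only permissions on objects; the modes are the interesting part):

```shell
f=$(mktemp)            # stand-in for a file under .git/annex/objects
chmod 000 "$f"         # "screwed up" permissions: the file cannot be opened
stat -c %a "$f"        # prints 0
chmod 444 "$f"         # the kind of read-only mode fsck would restore
stat -c %a "$f"        # prints 444
rm -f "$f"
```

In the mode-000 state a non-root process cannot even read the file to hash it, which is why fsck complains and gives up there instead of moving anything to bad.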

Comment by http://joey.kitenet.net/ Wed Feb 15 15:22:56 2012
Comments on this page are closed.