I'm starting out with git-annex and running into some confusion with setting up the remotes.

I have three systems I'm trying to set up (domains edited):

  • psychosis: ssh://psychosis.foo.com/vid
  • bacon: ssh://bucket.foo.com/vid
  • bucket: ssh://bucket.bar.org/vid

And one bare repository so that I can have a single place to push/pull:

  • origin: https://git.foo.com/jim/vid.git

On psychosis:

psychosis$ git config --list | grep ^remote | sort
remote.bacon.annex-uuid=8f1f0898-f8c1-11e0-9bf2-b387af26ee63
remote.bacon.fetch=+refs/heads/*:refs/remotes/bacon/*
remote.bacon.url=ssh://bucket.foo.com/vid
remote.bucket.annex-uuid=82814942-f8e0-11e0-b053-e70a61e98e19
remote.bucket.fetch=+refs/heads/*:refs/remotes/bucket/*
remote.bucket.url=ssh://bucket.bar.org/vid
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
remote.origin.url=https://git.foo.com/jim/vid.git

psychosis$ git annex status
supported backends: WORM SHA1 SHA256 SHA512 SHA224 SHA384 SHA1E SHA256E SHA512E SHA224E SHA384E URL
supported remote types: git S3 bup directory rsync web hook
known repositories: 
        09c0b436-f8de-11e0-842f-b7644539d57f -- here (psychosis)
        82814942-f8e0-11e0-b053-e70a61e98e19 -- bucket
local annex keys: 2256
local annex size: 449 gigabytes
total annex keys: 2256
total annex size: 449 gigabytes
backend usage: 
        WORM: 2256

First point of confusion: Why doesn't "bacon" show up in "git annex status"? I can "git annex copy --to bacon filename" and it will copy it there. Is there some step of setting it up that I missed? I basically just did "git remote add bacon ssh://bucket.foo.com/vid".

Now I've started setting up the remotes on each host:

On bacon:

bacon$ git config --list | grep ^remote | sort
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
remote.origin.url=https://git.foo.com/jim/vid.git
remote.psychosis.annex-uuid=09c0b436-f8de-11e0-842f-b7644539d57f
remote.psychosis.fetch=+refs/heads/*:refs/remotes/psychosis/*
remote.psychosis.url=ssh://psychosis.foo.com/vid

bacon$ git annex status
supported backends: WORM SHA1 SHA256 SHA512 SHA224 SHA384 SHA1E SHA256E SHA512E SHA224E SHA384E URL
supported remote types: git S3 bup directory rsync web hook
known repositories: 
        09c0b436-f8de-11e0-842f-b7644539d57f -- psychosis
        8f1f0898-f8c1-11e0-9bf2-b387af26ee63 -- here (bacon)
temporary directory size: 366 megabytes (clean up with git-annex unused)
local annex keys: 1
local annex size: 308 bytes
total annex keys: 2256
total annex size: 449 gigabytes
backend usage: 
        WORM: 2256

On bucket:

bucket$ git config --list | grep ^remote | sort
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
remote.origin.url=https://git.foo.com/jim/vid.git
remote.psychosis.annex-uuid=09c0b436-f8de-11e0-842f-b7644539d57f
remote.psychosis.fetch=+refs/heads/*:refs/remotes/psychosis/*
remote.psychosis.url=ssh://psychosis.foo.com/vid

bucket$ git annex status
supported backends: WORM SHA1 SHA256 SHA512 SHA224 SHA384 SHA1E SHA256E SHA512E SHA224E SHA384E URL
supported remote types: git S3 bup directory rsync web hook
known repositories: 
        09c0b436-f8de-11e0-842f-b7644539d57f -- psychosis
        82814942-f8e0-11e0-b053-e70a61e98e19 -- here (bucket)
temporary directory size: 183 megabytes (clean up with git-annex unused)
local annex keys: 3
local annex size: 550 megabytes
total annex keys: 2256
total annex size: 449 gigabytes
backend usage: 
        WORM: 2256

But I'm getting weird results if I try to show the map from psychosis:

psychosis$ git annex map
map /vid/tv ok
map bacon (sshing...) 
ok
map bucket (sshing...) 
ok
map origin 
failed
map psychosis (sshing...) 
jim@psychosis.foo.com's password: 
ok
map psychosis (sshing...) 
jim@psychosis.foo.com's password: 
ok

  running: dot -Tx11 map.dot

Second confusion: it's as if psychosis was considered a new remote each time? The generated map has psychosis listed with several redundant links:

[Image: map]

Is this some bug or do I just need to be hit with the clue bat?

My guess is that psychosis has not pulled the git-annex branch since bacon was set up (or that bacon's git-annex branch has not been pushed to origin). "git annex status" only shows remotes present in git-annex:uuid.log. This may be a bug.

The duplicate links in the map I don't quite understand. I only see duplicate links in my maps when I have the same repository configured as two different git remotes (for example, because the same repository can be accessed two different ways). You don't seem to have that in your config.

Comment by http://joey.kitenet.net/ Mon Oct 17 19:01:21 2011
Actually, there is a hint: while you ran "git annex map" on psychosis, it decided to ssh to itself two times. That seems to be where the duplicate links came from. I guess you must have some git remotes you did not show.
Comment by http://joey.kitenet.net/ Mon Oct 17 19:02:50 2011

No extra remotes (that I'm aware of); that output was only edited to change hostnames.

On all three hosts, "git push origin" and "git pull origin" say everything is up to date.

I'm using git-annex 3.20111011 on all hosts (although some were running 3.20110928 when I created the repositories).

Regarding the multiple links, I've put a copy of the dot file here. It shows psychosis in three separate subgraphs that are just getting rendered together as one node, if that helps clarify anything.

Wait, I just realized you said "the git-annex branch". My origin only has "master". Do you mean the one specifically named "git-annex"? I thought that was something that gets managed automatically, or is it something I need to manually check out and deal with?

Any other info I could provide?

Ok, after pushing the "git-annex" branch to origin, then "git annex status" knows all repositories on all hosts, so that part makes sense now. Thanks for the tip. But the "git annex map" output hasn't changed.
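
For anyone else hitting this, the fix above can be sketched roughly as follows. This is only a sketch: the three function names are hypothetical wrappers made up for illustration, the remote name defaults to "origin" as in this thread, and whether a plain "git push origin" includes the git-annex branch depends on your push settings, so the branch is named explicitly here.

```shell
# Hypothetical helper names; only the git and git-annex commands inside are real.

# Run on the host whose UUID is missing elsewhere (bacon, in this thread).
publish_annex_branch() {
    # Name the git-annex branch explicitly so it is pushed even if the
    # remote does not have it yet.
    git push "${1:-origin}" git-annex
}

# Run on the host that should learn about the others (psychosis, in this thread).
update_annex_branch() {
    # Fetch the remote's branches, then let git-annex merge the fetched
    # git-annex branches into the local one.
    git fetch "${1:-origin}" &&
    git annex merge
}

# Show the repositories recorded in this clone's git-annex branch.
show_known_uuids() {
    git show git-annex:uuid.log
}
```

After running publish_annex_branch on one host and update_annex_branch on another, show_known_uuids on the second host should list the first host's UUID, which is what "git annex status" reads.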

I think:

  • The first extra edge is because bucket had "ssh://psychosis.foo.com/vid/", while bacon had "ssh://psychosis.foo.com/vid" with no trailing slash. That got lost in the hostname/path editing I did, sorry. Maybe those should be considered matching?
  • The second extra edge is because, when running "git annex map" from psychosis, it doesn't recognize the remote's remote URL as pointing back to itself.

For the second case, after the "spurious" SSH, it could still recognize that the repositories are the same by the duplicated annex uuid, which currently shows up in map.dot twice. I wonder what it would take to avoid the spurious SSH -- maybe some config that lists "alternate" URLs that should be considered the same as the current repository? Or actually list URLs in uuid.log? Fortunately, I think this only affects the map, so it's not a big problem.
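
To illustrate the trailing-slash idea from the first bullet, here is a minimal sketch of comparing two remote URLs after stripping trailing slashes. It is not how git-annex compares remotes; normalize_url and same_repo_url are hypothetical names, and it does nothing about the second case (a URL that points back at the local host).

```shell
# Hypothetical helpers; a sketch of the idea, not git-annex's behaviour.

# Print the URL with any trailing slashes removed (a bare "/" is left alone).
normalize_url() {
    url=$1
    while [ "${url%/}" != "$url" ] && [ -n "${url%/}" ]; do
        url=${url%/}
    done
    printf '%s\n' "$url"
}

# Succeed if both URLs are identical once trailing slashes are ignored.
same_repo_url() {
    [ "$(normalize_url "$1")" = "$(normalize_url "$2")" ]
}
```

With this, "ssh://psychosis.foo.com/vid/" and "ssh://psychosis.foo.com/vid" compare as the same repository, while different hosts or paths still compare as different.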

Hmm, I don't see the spurious ssh edge in the dot file -- that is, I don't see any ssh:// uris in it?
Comment by http://joey.kitenet.net/ Sat Oct 22 01:18:27 2011

I think that's because the SSH was successful (I entered the password and let it connect), so it got the UUID and put that in the .dot instead. The same UUID (for psychosis) then ended up in two different "subgraph" stanzas, and Graphviz just plotted them together as one node.

Maybe this will clarify:

On psychosis, run "git annex map" and press ^C at the ssh password prompt: map-nossh.dot [Image: map]

On psychosis, run "git annex map" and type the correct password: map-goodssh.dot [Image: map]

As I see it:

  • psychosis ("localhost") connects to each of its remotes
  • some of them point back to ssh://psychosis
  • psychosis doesn't know that ssh://psychosis is itself, so it tries to connect
  • if successful:
    • psychosis gets put twice in the .dot as if it was two different hosts, one "local" and one "ssh://psychosis"
    • graphviz recognizes it as the same node because the UUID is the same, but graphviz still draws the extra connecting lines
  • if unsuccessful:
    • ssh://psychosis is shown as an additional host that can't be reached
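
One quick way to check the "if successful" case above is to count how often a repository's UUID appears in the generated dot file. This is a rough sketch: count_uuid_mentions is a hypothetical helper, and it only counts lines containing the UUID, making no assumptions about the exact layout of map.dot.

```shell
# Hypothetical helper: print how many lines of a dot file mention a UUID.
# Usage: count_uuid_mentions UUID DOTFILE
count_uuid_mentions() {
    # -F treats the UUID as a fixed string, -c prints the number of matching lines.
    grep -c -F -- "$1" "$2"
}

# Example, using the psychosis UUID from this thread:
#   count_uuid_mentions 09c0b436-f8de-11e0-842f-b7644539d57f map.dot
```

If the UUID turns up in more stanzas than expected, that matches the picture of the same repository being recorded once as the local host and again as an ssh remote.
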