The walkthrough builds up a decentralized git repository setup, but git-annex can also be used with a centralized bare repository, just like git can. This tutorial shows how to set up a centralized repository hosted on GitHub.

set up the repository, and make a checkout

I've created a repository for technical talk videos, which you can fork on GitHub. Or make your own repository on GitHub (or elsewhere) now.

On your laptop, install git-annex, and clone the repository:

# git clone git@github.com:joeyh/techtalks.git
# cd techtalks

Tell git-annex to use the repository, and describe where this clone is located:

# git annex init 'my laptop'
init my laptop ok

Let's tell git-annex that GitHub doesn't support running git-annex-shell there. This means you can't store annexed file contents on GitHub; it would really be better to host the bare repository on your own server, which would not have this limitation. (If you want to do that, check out using gitolite with git-annex.)

# git config remote.origin.annex-ignore true
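To confirm the setting took effect, you can read it back. The sketch below runs in a throwaway repository created under a temporary directory, which is an assumption made only to keep the example self-contained; in the real clone, the two `git config` commands are all you need.

```shell
# Throwaway repository standing in for the techtalks clone (illustration only).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git remote add origin git@github.com:joeyh/techtalks.git

# Mark the origin remote as one git-annex should not store file contents on.
git config remote.origin.annex-ignore true

# Read the value back from the repository configuration.
ignore=$(git config --get remote.origin.annex-ignore)
echo "annex-ignore for origin: $ignore"
```

The setting lives in the clone's own `.git/config`, so it has to be repeated in every checkout that talks to GitHub.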

add files to the repository

Add some files, however you obtained them.

# youtube-dl -t 'http://www.youtube.com/watch?v=b9FagOVqxmI'
# git annex add *.mp4
add Haskell_Amuse_Bouche-b9FagOVqxmI.mp4 (checksum) ok
(Recording state in git...)
# git commit -m "added a video. I have not watched it yet but it sounds interesting"

This next file is available directly from the web, so git-annex can download it:

# git annex addurl http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg
addurl kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg
(downloading http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg ...)
(checksum...) ok
(Recording state in git...)
# git commit -a -m 'added a screencast I made'

Feel free to rename the files, etc, using normal git commands:

# git mv Haskell_Amuse_Bouche-b9FagOVqxmI.mp4 Haskell_Amuse_Bouche.mp4
# git mv kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg git-annex_coding_in_haskell.ogg
# git commit -m 'better filenames'

Now push your changes back to the central repository. This first time, remember to push the git-annex branch, which is used to track the file contents.

# git push origin master git-annex
To git@github.com:joeyh/techtalks.git
 * [new branch]      master -> master
 * [new branch]      git-annex -> git-annex
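If you want to double-check that both branches reached the central repository, `git ls-remote --heads` lists the branches a remote has. The following is a minimal sketch using a local bare repository as a stand-in for GitHub. The git-annex branch is created by hand here purely for illustration; in a real clone, `git annex init` creates it.

```shell
# Stand-ins for the central repository and the laptop clone (illustration only).
work=$(mktemp -d)
git init -q --bare "$work/central.git"
git init -q "$work/laptop"
cd "$work/laptop"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/central.git"

# One commit on master, plus a hand-made git-annex branch. In a real clone,
# "git annex init" creates the git-annex branch; it is faked here.
echo demo > README
git add README
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial commit'
git branch git-annex

# Push both branches, then list the branches the central repository now has.
git push -q origin master git-annex
heads=$(git ls-remote --heads origin)
echo "$heads"
```

Against the real repository, only the last `git ls-remote --heads origin` command is needed; it should show both `refs/heads/master` and `refs/heads/git-annex`.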

That push went fast, because it didn't upload large videos to GitHub. To check this, you can ask git-annex where the contents of the videos are:

# git annex whereis
whereis Haskell_Amuse_Bouche.mp4 (1 copy) 
    767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
ok
whereis git-annex_coding_in_haskell.ogg (2 copies) 
    00000000-0000-0000-0000-000000000001 -- web
    767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
ok

make more checkouts

So far you have a central repository, and a checkout on a laptop. Let's make another checkout that's used as a backup. You can put it anywhere you like, as long as your laptop can access it. A few options:

  • Put it on a USB drive that you can plug into the laptop.
  • Put it on a desktop.
  • Put it on some server in the local network.
  • Put it on a remote VPS.

I'll use the VPS option, but these instructions should work for any of the above.

# ssh server
server# sudo apt-get install git-annex

Clone the central repository as before. (If the clone fails, you need to add your server's ssh public key to GitHub -- see this page.)

server# git clone git@github.com:joeyh/techtalks.git
server# cd techtalks
server# git config remote.origin.annex-ignore true
server# git annex init 'backup'
init backup (merging origin/git-annex into git-annex...) ok

Notice that the server does not have the contents of any of the files yet. If you run ls, you'll see broken symlinks. We want to populate this backup with the file contents, by copying them from your laptop.
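A symlink is broken when it exists but its target does not. As a rough sketch, plain shell tests can list such files. The directory, file names, and symlink target below are made up for illustration and only imitate what a fresh checkout looks like.

```shell
# Throwaway directory with one dangling symlink, standing in for a fresh
# checkout whose annexed contents have not arrived yet (illustration only).
demo=$(mktemp -d)
cd "$demo"
ln -s .git/annex/objects/missing-content video.mp4
echo 'ordinary file' > notes.txt

# A symlink whose target does not exist is "broken": -L is true, -e is false.
broken=""
for f in *; do
    if [ -L "$f" ] && [ ! -e "$f" ]; then
        broken="$broken$f "
        echo "no content yet: $f"
    fi
done
```

Once the content has been copied into the repository, the same loop no longer reports the file, because the symlink target then exists.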

Back on your laptop, you need to configure a git remote for the backup. Adjust the ssh url as needed to point to wherever the backup is. (If it was on a local USB drive, you'd use the path to the repository instead.)

# git remote add backup ssh://server/~/techtalks
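The two forms of remote mentioned above can be compared side by side. This sketch again uses a throwaway repository; the remote name `usbdrive` and the mount point `/media/usb` are hypothetical and should be replaced with whatever matches your setup.

```shell
# Throwaway repository standing in for the laptop clone (illustration only).
demo=$(mktemp -d)
cd "$demo"
git init -q .

# The ssh form from the text, for a backup on a server.
git remote add backup ssh://server/~/techtalks

# A plain path works for a local drive. Both the remote name "usbdrive" and
# the mount point /media/usb are hypothetical.
git remote add usbdrive /media/usb/techtalks

# Read the configured URLs back.
backup_url=$(git config --get remote.backup.url)
usb_url=$(git config --get remote.usbdrive.url)
echo "backup:   $backup_url"
echo "usbdrive: $usb_url"
```

`git remote add` only records the URL; it does not contact the remote, so a typo shows up later, the first time git or git-annex tries to reach it.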

Now git-annex on your laptop knows how to reach the backup repository, and can do things like copy files to it:

# git annex copy --to backup git-annex_coding_in_haskell.ogg
copy git-annex_coding_in_haskell.ogg (checking backup...)
12877824   2%  255.11kB/s    00:00
ok

You can also git annex move files to it, to free up space on your laptop. And then you can git annex get files back to your laptop later on, as desired.
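For example, the move-then-get round trip could be wrapped in two small shell functions. The function names `offload` and `restore` are hypothetical conveniences, not part of git-annex, and the sketch assumes the remote is named `backup` as in the text.

```shell
# Move file contents to the backup remote, freeing space locally.
# Assumes a remote named "backup", as configured earlier in the text.
offload() {
    git annex move --to backup "$@"
}

# Fetch file contents back into this repository from wherever they are kept.
restore() {
    git annex get "$@"
}
```

With these defined, `offload Haskell_Amuse_Bouche.mp4` would send the video's content to the backup and drop the local copy, and `restore Haskell_Amuse_Bouche.mp4` would bring it back later.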

After you use git-annex to move files around, remember to push, which shares git-annex's updated location information with the other repositories.

# git push

take it farther

Of course you can create as many checkouts as you desire. If you have a desktop machine too, you can make a checkout there, and use git remote add to also let your desktop access the backup repository.

You can add remotes for each direct connection between machines you find you need -- so make the laptop have the desktop as a remote, and the desktop have the laptop as a remote, and then on either machine git-annex can access files stored on the other.
