My svnbackup script is the second most popular page on my site, after
the PyMOTW home page, and search terms such as “svn backup” and “svn
backup script” regularly appear at the top of the list of sources of
traffic to my site. The link to svnbackup doesn’t appear on the first
page of Google’s search results, a sign I take to mean that this problem
isn’t well understood or solved (otherwise, why would so many people
page through the search results to find it?).
I created svnbackup to manage off-site backups of my svn repository
and the repository we run at work. The requirements were pretty basic:
- Incremental backups.
- Easy to restore.
- Safe if the backups were running during a transaction.
Both repositories use FSFS to store the repository contents, so
according to some reports it would be safe to simply rsync (or otherwise
back up) the raw repository data as long as no transaction was in
progress (or possibly even if one was). It turns out not to be so
difficult to do a safer backup, though, so that’s what I went with.
The obvious solution is to use “svnadmin dump” to extract transaction
information from the repository. svnbackup.sh is a wrapper around
svnadmin to produce reasonably-sized chunks for the backups. The only
real problems I had to solve were how to track what had been backed up
and how to move the backup output off-site.
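To illustrate the chunking idea, here is a minimal sketch in Python rather than shell. It is not the code from svnbackup.sh: the function names and the chunk size are hypothetical, and the only thing taken from Subversion itself is the "svnadmin dump" command with its --incremental and -r options.

```python
def chunk_ranges(first_rev, last_rev, chunk_size):
    """Split first_rev..last_rev (inclusive) into consecutive revision ranges."""
    ranges = []
    start = first_rev
    while start <= last_rev:
        end = min(start + chunk_size - 1, last_rev)
        ranges.append((start, end))
        start = end + 1
    return ranges


def dump_command(repo_path, start, end):
    """Build the argument list for one incremental dump of revisions start..end.

    With --incremental, the first revision in the range is dumped as a diff
    against the previous revision, so consecutive chunks can be loaded in order.
    """
    return ["svnadmin", "dump", repo_path, "--incremental", "-r", f"{start}:{end}"]


if __name__ == "__main__":
    # Hypothetical repository path and chunk size, for illustration only.
    for start, end in chunk_ranges(0, 250, 100):
        print(" ".join(dump_command("/var/svn/repo", start, end)))
```

Each command could be run with subprocess and its output redirected to a file named after the revision range, which keeps any single backup file to a manageable size.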
Tracking the last revision number which had been backed up is easy
using a simple text file on the svn server. If the information is lost,
the worst thing that happens is the next run backs up the entire
repository again. That can be time consuming, but is not destructive.
Copying the files off-site is handled via scp.
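The bookkeeping and the copy step might look something like the following sketch. Again, this is Python for illustration, not the script itself: the state file format (a single integer), the function names, and the example destination are all assumptions.

```python
from pathlib import Path


def read_last_revision(state_file):
    """Return the last backed-up revision, or -1 if the record is missing or unreadable.

    Returning -1 means the next run starts at revision 0, which backs up the
    entire repository again: slow, but not destructive.
    """
    try:
        return int(Path(state_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return -1


def record_last_revision(state_file, revision):
    """Overwrite the state file with the newest backed-up revision."""
    Path(state_file).write_text(f"{revision}\n")


def scp_command(dump_file, destination):
    """Build the scp argument list; destination looks like user@host:/path/."""
    return ["scp", str(dump_file), destination]
```

A run would read the last revision, dump everything after it, copy the dump files with scp, and only then record the new revision, so a failed copy is retried on the next run.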
Other alternatives have more or different options. I like Python, and
obviously use it a lot, but I’m not sure I would have used it as a shell
script replacement as the folks at collab.net did. On the other hand,
I didn’t care that my tool doesn’t run on Windows (though it might,
with Cygwin), and they did. If I had found their tool when I needed it, I
probably would not have written my own, since the features are largely
the same. Their off-site support uses FTP, not scp, but it looks like it
would be straightforward to add scp support.