Using Subversion+Apache Webserver as Backup Storage

Tagging

To make my life easier and to find out which version was valid at which point in time, I also implemented a tags subdirectory. After each backup commit to the Subversion server, I automatically store a tag of the current version using the remote copy functionality of the Subversion client (for details see svn copy --help, use case “URL -> URL”). I know that this is not strictly necessary with Subversion. It is purely for my personal convenience, as I do not like to specify dates when checking out data from a versioning system like Subversion.
Please note that using the local copy function (use case “WC -> WC”) is a bad idea in this case, because it also copies the entire data locally on your filesystem, which will eat up your disk space quite quickly over time. The remote variant makes the Subversion server create a reference copy (lazy copy), which takes only a few kilobytes for storing a reference in a transaction file, even if you have a large number of huge files stored in the repository. The downside is that the tag cannot be created together with the commit but is executed as a separate transaction on the Subversion server. This should not pose a major issue, however, as the number of transactions is practically unlimited.
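As an illustration of the “URL -> URL” use case, here is a minimal Python sketch of how such a tag could be created after the backup commit. It is not taken from my scripts: the repository URL, the trunk source directory, the backup-YYYYMMDD-HHMMSS tag naming and both function names are assumptions made for this example.

```python
import subprocess
from datetime import datetime


def build_tag_command(repo_url, tag_name, source_dir="trunk"):
    """Build an `svn copy` URL -> URL command line for a server-side tag.

    The `trunk` source directory is an assumed repository layout.
    """
    base = repo_url.rstrip("/")
    source_url = f"{base}/{source_dir}"
    tag_url = f"{base}/tags/{tag_name}"
    message = f"Tag backup as {tag_name}"
    # Both arguments are URLs, so the copy happens on the server only.
    return ["svn", "copy", source_url, tag_url, "-m", message]


def tag_backup(repo_url, now=None):
    """Create the tag as a separate transaction after the backup commit."""
    now = now or datetime.now()
    tag_name = now.strftime("backup-%Y%m%d-%H%M%S")  # hypothetical naming scheme
    subprocess.run(build_tag_command(repo_url, tag_name), check=True)
    return tag_name
```

Calling tag_backup("http://backupserver/svn/backup") right after the commit would then create one tag per backup run. Because both arguments to svn copy are URLs, nothing is copied in the working copy.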

The Script

If you want to set up a similar scenario you may benefit from the following references:

Download this file: Subversion Backup Scripts (1.3 KiB; see the license information for these files). It contains my set of scripts and a collection of subdirectories which might be handy for you. In any case, make sure that you adjust the repository's path in the file backup_run.cmd.
To run these scripts you will need the Windows Subversion command line tool installed on the system.

Further Aspects

If you do not have a Linux box for backup at hand but another Windows system, you might find this blog on installing WAMP with a Subversion server useful.
Please note that you now have one more problem which you did not know about before: How do you get rid of old versions? I will elaborate on this topic in a separate posting.
Wherever there is light, you will also find shadows. The dark side of this approach is that the entire data is stored twice on the system being backed up: once as the original (and possibly changed) data and once again in the corresponding .svn subdirectories. This is known behaviour of the Subversion client, which enables it to efficiently determine what has been touched and to commit only the data that is necessary. In our case, however, this implies that you double the required disk space.
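To get a feeling for this overhead on your own system, a small Python sketch like the following could add up the bytes inside and outside the .svn subdirectories of a working copy. The function name svn_overhead is made up for this example, and the sketch simply treats everything below a directory called .svn as Subversion's administrative data.

```python
import os


def svn_overhead(root):
    """Return (data_bytes, svn_bytes) for the directory tree under root."""
    data_bytes = 0
    svn_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # A file counts as Subversion data if any parent directory is `.svn`.
        rel = os.path.relpath(dirpath, root)
        in_svn = ".svn" in rel.split(os.sep)
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if in_svn:
                svn_bytes += size
            else:
                data_bytes += size
    return data_bytes, svn_bytes
```

If the second number is roughly as large as the first, you are seeing the doubling described above.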
Besides the required disk space for those .svn subdirectories there is another drawback: In some rare cases the existence of this additional .svn subdirectory in each directory of your backup directory structure may pose problems or cause the application to behave differently. This may happen, for example, if the application wants to delete an otherwise empty directory but cannot, because the .svn directory is still there. Thus those applications need to be programmed intelligently enough to ignore it automatically.
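What “ignoring this automatically” could look like is sketched below in Python. This is only an illustration of the idea, not code from any particular application; the function name is_effectively_empty is an assumption of this example.

```python
import os


def is_effectively_empty(path, ignored=(".svn",)):
    """True if the directory holds nothing besides Subversion's admin directory."""
    return all(entry in ignored for entry in os.listdir(path))
```

An application could use such a check instead of a plain "is the directory empty?" test. Whether it should then actually delete the directory, including its .svn subdirectory, is a separate question, because that would leave the working copy inconsistent until the next cleanup.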
A very similar result could also be achieved via the recent “Previous Versions” functionality of current Windows versions on NTFS volumes. Microsoft refers to the underlying technology as Volume Shadow Copy. In my case it was not an option, because the operating system version of the Windows server is too old. Thus, I have not played around with it yet.

Additional Resources

If you are interested in a quick tutorial on Subversion, you might want to read http://aymanh.com/subversion-a-quick-tutorial
For a discussion about the drawbacks and impact of using Subversion as a backup server you can also read http://stackoverflow.com/questions/61888/using-subversion-for-general-purpose-backup


Nico's Blog

Hints that matter
