Hackwater sunsetting Subversion in favor of Git

Posted on 06 January 2013 by jose

I've been working on moving my personal code repository from Subversion to Git. My main guide for this is John Albin Wilkins' article on converting a Subversion repository to Git. In it, he outlines the steps he took to convert a Subversion repo to a bare Git repository; he follows that article with another announcing a script that does all the heavy lifting for a user who has a number of SVN repos to convert. If your SVN repository layout is fairly traditional, that script will work well for you. I have two repositories: one for a project I did as a Hackwater.com technical consultant, and another that holds similar projects plus all of my personal code. John's script worked well for that first SVN repo; it was laid out predictably in the accepted SVN style (reponame/{trunk|tags|branches}/code), so the conversion was very smooth.

My other repository is, of course, larger and far more complex. It has grown organically (in other words, haphazardly), and it was definitely not laid out as described above. Certain subdirectories eventually reach the {trunk|tags|branches}/code convention, but in some cases that layout sits fairly deep in the hierarchy, and the git svn clone command in John's script did not handle the repository well without extra parameters. Given that I had followed a fairly traditional SVN architecture and made a monolithic repository for all my code (something discouraged in the Git model), I decided that, while I could fiddle with the parameters passed to John's script, it would make more sense to follow his initial guide and supplement it with a few other guides to break that huge repository into a number of project-specific Git repositories.
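
For anyone in the same spot, here is a rough sketch of what pointing git svn clone at one nested project can look like. It is not John's script, just the general shape of the command; the clone_project name, the DRY_RUN switch, and the authors.txt mapping file are my own assumptions for illustration.

```shell
# Hypothetical helper: clone a single project subdirectory that has its own
# trunk/branches/tags directories beneath it.
# Set DRY_RUN=1 to print the command instead of running it.
clone_project() {
    svn_url=$1                  # URL of the project directory inside the SVN repo
    target_dir=$2               # directory for the new Git repository
    authors=${3:-authors.txt}   # assumed SVN-to-Git author mapping file

    # Build the command in the positional parameters so it can be
    # either printed or executed unchanged.
    set -- git svn clone \
        --trunk=trunk --branches=branches --tags=tags \
        --authors-file="$authors" \
        "$svn_url" "$target_dir"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Running it with DRY_RUN=1 first is a cheap way to check the paths before committing to a conversion that may take hours. If the project uses exactly the standard layout, --stdlayout is shorthand for the three layout options.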

I found Dane Petersen's Splitting Subversion into Multiple Git Repositories invaluable in this regard. Combining its information with other articles and blog entries found on the web, I was able to split my huge repository into a number of smaller ones, keeping all their SVN history, branches, and tags.

One other challenge was that I wanted to raise my VPS memory allocation as little as possible during the conversion (as it happens, one of the goals of the conversion is to let me move the Git repos, or at least the bare Git repos, off of my VPS). For the big SVN repo this was difficult, because the SVN+HTTP process was eating up all my memory. I solved this by using the local SVN repo URL (file:///home/user/svn/reponame/subdirectory-I-want-to-split-and-convert) and making liberal use of the screen command (I love screen). It took a long time to split and convert, but most of that time was simply letting the machine churn away at the data and checking every few hours to see whether it had completed. And if I lost my SSH connection to my VPS, I could simply SSH back in and reattach to screen to see how my process was doing.
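
As a minimal sketch of the screen side of this, a long job can be started in a named, detached session and picked up again later. The run_in_screen helper, the DRY_RUN switch, and the svn2git session name are hypothetical, chosen only for this example.

```shell
# Hypothetical helper: start a long-running command in a detached,
# named screen session. Set DRY_RUN=1 to print the screen command instead.
run_in_screen() {
    session=$1
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "screen -dmS $session $*"
    else
        screen -dmS "$session" "$@"
    fi
}

# Example (not run here):
#   run_in_screen svn2git git svn clone --stdlayout \
#       file:///home/user/svn/reponame/subdirectory-I-want-to-split-and-convert converted
#   screen -ls          # list running sessions
#   screen -r svn2git   # reattach, e.g. after a dropped SSH connection
#                       # (Ctrl-a d detaches again)
```

One caveat: a session started this way closes as soon as the command exits, so redirecting the output to a log file is worth considering if you want to see how the run ended.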

I've thus far converted all my Subversion repositories to Git. I've even managed to split the big SVN repo into several Git repos. I'm now in the process of converting all of these Git repos to bare Git repos. I've also been cleaning up tags in the repos I've converted, as one side effect of the tag and branch conversion is that I've got a few duplicate tags and branches, marked by the '@' symbol and a Subversion revision number. So far, every one of these tags or branches has been identical to one not using the '@' symbol, so a little git diffing has let me confirm which are identical and remove the copies carrying the SVN revision number.
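
The duplicate check can be sketched roughly like this, assuming the converted refs already exist as local tags. The function names are my own, and the loop only prints candidates; it deletes nothing.

```shell
# Strip the trailing "@<revision>" suffix from a duplicated ref name.
base_name() {
    printf '%s\n' "${1%@*}"
}

# Print every tag containing '@' whose content is identical to the tag
# with the same name minus the "@<revision>" suffix.
list_duplicate_tags() {
    git tag --list '*@*' | while read -r tag; do
        base=$(base_name "$tag")
        # Skip tags that have no plain-named counterpart.
        git rev-parse --verify --quiet "refs/tags/$base" >/dev/null || continue
        # git diff --quiet exits 0 when the two trees do not differ.
        if git diff --quiet "refs/tags/$tag" "refs/tags/$base"; then
            echo "$tag"
        fi
    done
}
```

After looking over the list, each printed name can be removed with git tag -d. The same idea applies to branches, using git branch --list and git branch -D instead.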

Once I've finished this, I'll look into authenticated Git hosting (either on my own server or on a service like GitHub) so that I can pull from and push to a remote on a separate machine (for redundancy/backup reasons).
