Why Subversion is not my favorite VCS

Subversion was one of those things I really, really wanted to like. After using CVS, and being frustrated with its limitations, the idea of Subversion was great: Build a better CVS. Unfortunately, while it has partially accomplished that goal, it has failed in other ways. In some areas, it is no better than CVS, and in others, it is worse. That’s a shame.

Where svn gets it right:

1. Not versioning file-by-file

In CVS, everything — tags, branches, revisions — applies on a file-by-file basis. A “branch” as a collection of files exists only insofar as you have a bunch of files that all have the same branch identifier in them. It’s based on the old RCS file format. It’s surprisingly robust for all of that, but it does make fixing things annoyingly difficult at times.

2. Branches are lazy copies

This is definitely the right way to do branching, as Perforce pretty clearly showed. It makes it easy to create a branch without taking up a lot of disk space, and it also makes a branch a collection of files, which is, of course, how most people think of them.
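To make “lazy copy” concrete, here is a minimal Python sketch. It is a toy model of the idea, not how Subversion or Perforce actually store data; the LazyRepo class and its method names are made up for illustration. Creating a branch just stores a second reference to the same tree, and a new tree is built only when one side changes.

```python
from types import MappingProxyType


class LazyRepo:
    """Toy repository: each root name points at an immutable path -> content tree."""

    def __init__(self, files):
        # "trunk" is just another name in the table; nothing about it is special.
        self.roots = {"trunk": MappingProxyType(dict(files))}

    def copy(self, src, dst):
        # Branching stores a second reference to the same tree.
        # No file content is duplicated, however large the tree is.
        self.roots[dst] = self.roots[src]

    def write(self, root, path, content):
        # Copy-on-write: only the root being changed gets a new tree.
        # (A real system would share unchanged subtrees instead of
        # rebuilding the whole mapping; this sketch keeps it simple.)
        new_tree = dict(self.roots[root])
        new_tree[path] = content
        self.roots[root] = MappingProxyType(new_tree)
```

In this model the cost of creating a branch does not depend on how many files it contains, which is the whole appeal.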

What svn gets wrong:

1. The whole branches/tags/trunk standard

Branches should not be in a separate namespace; this just complicates things and makes “trunk” seem more special than it really is. Granted, this is just a convention, but it’s so strongly adhered to that many supporting tools break if you violate it. (Also, raise your hand if someone has ever created “branches/trunk” in your repo. Yeah, I thought so.)

2. No tag support

No, really: The “tags” in the branches/tags/trunk standard are not tags as they are usually defined in a version control system. A tag is, semantically, a human-readable reference to a revision number, and should be substitutable any place a revision number is. The developers of Subversion seem to have missed this crucial point. As a result, in Subversion, you cannot, for example, say something like

 svn log -r"tag1":"tag2"

… you have to do something more complicated. Nor can you do something like

 svn checkout -r"tag1" svn://repo/branches/feature1

You have to say

 svn checkout svn://repo/tags/tag1

This is not just a matter of taste; in Subversion, there is no association between the tag and what it tagged.
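Here is what I mean by “substitutable,” as a small Python sketch. This is a hypothetical model, not anything Subversion provides: the TAGS table, the revision numbers in it, and both function names are invented for illustration. The point is only that a tag resolves to a revision number, so it can appear anywhere a revision number can.

```python
# Hypothetical tag table: each tag is just a readable name for one revision.
TAGS = {"tag1": 1200, "tag2": 1350}


def resolve_revision(token, tags):
    """Return an integer revision for either a number ("1300") or a tag name."""
    if token.isdigit():
        return int(token)
    if token not in tags:
        raise KeyError(f"unknown tag or revision: {token!r}")
    return tags[token]


def resolve_range(spec, tags):
    """Turn "tag1:tag2", "1250:tag2", or a single token into (start, end)."""
    start, _, end = spec.partition(":")
    # A single token such as "tag1" means a one-revision range.
    return resolve_revision(start, tags), resolve_revision(end or start, tags)
```

With something like this in place, a range such as "tag1:tag2" and a range of plain numbers go through exactly the same code path.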

3. Merging is hard

I never adopted Subversion voluntarily because it had no merge tracking. The lack of merge tracking is the number one thing that made branching and merging in CVS difficult, and the number one reason I used Perforce at several employers. I was forced to use Subversion at my current employer, and we looked forward eagerly to the promised merge tracking in 1.5. Unfortunately, when it arrived, it was not possible to “import” our previous merge history into the new merge tracking mechanism, which made it more or less impossible to start using it effectively. Even in cases where we were starting fresh, it still suffered from the fact that it was a major kludge: The designers of Subversion didn’t build it in a way that made adding merge tracking easy. The result is not pretty; bolting wings onto a car doesn’t make it an airplane.
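For anyone who hasn’t used a system with merge tracking, the core bookkeeping is simple to sketch. The Python below is a toy model under my own assumptions (the MergeTracker class, the branch names, and the revision numbers are hypothetical), not Subversion’s implementation. It remembers which source revisions have already been merged into a target, so the next merge only has to consider what is left.

```python
class MergeTracker:
    """Toy merge tracking: remember which revisions went from where to where."""

    def __init__(self):
        # (source branch, target branch) -> set of source revisions already merged
        self.merged = {}

    def record(self, source, target, revisions):
        # Called after a successful merge of these revisions into the target.
        self.merged.setdefault((source, target), set()).update(revisions)

    def eligible(self, source, target, source_revisions):
        # Revisions on the source that have not yet been merged into the target.
        done = self.merged.get((source, target), set())
        return sorted(set(source_revisions) - done)
```

Without a record like this, every merge means working out by hand which changes have already been carried over, which is exactly the chore in CVS.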

4. Remote use is awkward

It’s not as bad as Perforce (caveat: I haven’t used the latest version of Perforce, which has some new features related to remote use), but still, there’s no getting around the fact that Subversion is fundamentally tied to a single central server and there is no way to commit or examine history unless you are connected to the server.

5. The repo format is poorly documented and there are no recovery tools

6. …unless you use the BDB backend, which has its own problems

The default new repo format is FSFS, which is a format created by the Subversion developers. There is some documentation for it, but it’s not the clearest stuff in the world. (Of course, you could always read the code — if the code wasn’t impenetrable.) This would be fine if you never encountered any kind of repository corruption, or if you could always restore from backups. However, backups are slow (see below), so you may have a significant gap between them. If your repository gets corrupted, you look hopefully into the manual and see this:

 svnadmin recover REPOS_PATH

Yay! You are saved!

Well, no, you aren’t: That only works on BDB repos. :-( Which you didn’t use because, well, the Subversion folks recommend against using BDB because it has problems with NFS (this is actually a good recommendation). It turns out that there are no recovery commands for FSFS. So you email the svn list for help and discover that apparently this can’t happen, because no one ever responds to you except several people asking you if you ever found a solution. So you dig into the code to see if you can fix it yourself, only to find that

7. The code is inscrutable

With all due respect to the Subversion team, the svn code is some of the most godawful code I have ever tried to read. While it’s true that version control is not exactly trivial, I find the code for git much, much easier to read, to the point that I can actually figure out what it’s doing. Even the old CVS code was more clear.

8. One revision number for the whole repository.

This is actually not a big deal if the repository contains the code for a single unit of development, e.g., a library or an application. In such a case the revision number basically corresponds to a change set. The problem is that many Subversion repositories contain multiple projects that are only loosely related to each other, with the result that it’s not always entirely clear what, exactly, revision NNNNN refers to. While this is partly a problem with how people use Subversion rather than Subversion itself, it’s also the case that Subversion encourages this sort of use because of the way the svn server is tied to the repository.

9. There are no logs

If you come to svn from Perforce, this is surprising. Subversion doesn’t keep any sort of logs, because the repository is separate from the programs that interact with it. That is, the svn server is not the only thing that can make changes to the repo. Knowing which program made a change to the repository, and when, can be very useful in figuring out what caused data corruption, for example.

Minor annoyances:

1. It’s slow. I realize that for some people, this is a major annoyance.

2. Backups are slow. If you have remote people worldwide, you don’t want to take the server offline to do the backups, so you do hot backups, but even they affect the performance, making it slower than usual.

3. Repositories are large. Our main Subversion repository at my current workplace occupied about 5.3 GB. The git import was less than 2 GB. This is fairly typical.

Overall, I think that Subversion only partially accomplished its original goal, to be a better CVS. By failing to design the software with branching and merging in mind, the developers created a system where branching is easy but merging is difficult. Branches are useless without good merging, and CVS’s biggest failing was in its lack of merge tracking. Unfortunately, this is the one area that Subversion got seriously wrong, and I believe that fixing it properly would require a major rewrite.

In future articles, I’ll talk about the pros and cons of various other version control systems, as well as continuous integration servers, bug-tracking systems, build systems (make, maven, ant, etc.), code review systems, and so forth.

16 Responses to Why Subversion is not my favorite VCS

  1. Tom Murray says:

    This comment is directed at the “recovery” mentioned in bullet 6. I’m wondering if this still stands since apparently

    svnadmin recover REPOS_PATH

    was changed in version 1.5.X to specifically address the issue with recreating db/current in an FSFS repository from the existing revision files. I’m getting this information from this URL:

    http://www.farside.org.uk/200703/svnadmin_recover

    • ebneter says:

      Unfortunately, as far as I know, it will recover db/current but it won’t help if you have corrupted revision files, which was the situation I found myself in. What I don’t remember at the moment is if our corruption occurred before or after we switched to 1.5.x.

      At any rate, although anything is better than nothing, I’d still stand by the statement that there are other VCSs with more robust recovery mechanisms.

  2. Josh says:

    Branches are not lazy copies in Perforce. I work in a Perforce repository that contains 10s of gigabytes in a given branch. Subversion (which I am not defending) creates a new branch lightning fast. Perforce has to internally ‘copy’ (symbolically link, I suppose, within their database) all the files in the branch to the new location within Perforce. You then have to sync those 10s of gigabytes down before you can do anything. Subversion does an in-place switch, changing out only the differences between the branches.

  3. Warren says:

    I hated the “edit” thing in Perforce even more when my IDE was insisting on writing to files that I had no intention of directly changing (because it does that sort of thing, don’t ask) but Perforce was set up for read-only-until-edit, which would crash or lock up my IDE. I fixed that, and can now live with the “reconcile” feature which catches “unintentional edits” and lets me turn them into checkouts, or revert them.

    W

  4. Dan says:

    I totally agree on “wrong” points 1-3. For some reason, item 2 (tags) bugs me like hell, probably more than the others. Tags just seem so easy to “get right”, considering my use of gmail, del.icio.us, etc… tag metadata should have been added by now. Using separate directories is a huge work-around. Very clumsy.

    Having said that, I think SVN is a huge leap forward compared to CVS. Which was a huge leap forward compared to RCS. Which was a huge leap forward compared to SCCS.

    SVN has definitely made big improvements in the 5 years or so I’ve been using it. Sure, new kids on the block have some neat stuff (“standing on the shoulders of giants”), but for my purposes (small team), SVN is the right fit.

    I’ve heard great things about Perforce. I’ve just never had the time (or *made* the time) to explore it.

  6. Warren says:

    I find the “edit” workflow element in Perforce ridiculous beyond belief. Apparently someone thinks that the server should be notified if I locally modify a file. Draconian much? Subversion gets the “edit is not a versionable action” concept right. I have a hard time taking Perforce seriously for that reason alone.

    I worked with a group of people who couldn’t handle the idea of working from a common synchronized set of files at all, and as a concession to them I investigated distributed version control systems. I looked at bazaar, mercurial, and git. I like Git.
    But I love Mercurial.

    For more on why Mercurial, and DVCS in general, are the Right Way to do it, check out http://www.hginit.com.

    Warren

    • ebneter says:

      In fairness, the “edit” feature is part of the way Perforce is able to be so fast: It keeps track of which file(s) you’re changing so it doesn’t have to examine everything when you do updates or commits. I’ve found that people get used to it pretty quickly, and it’s also possible to do it after the fact.

      In general, I’m happy to use anything that meets some basic criteria. One of my next posts will be about those criteria, in fact.

    • Trimbo says:

      Warren: central checkout is an essential feature for unmergeable files like PNGs or some other sort of binary asset. This might not be a big deal for a lot of folks but it is for e.g. game developers who have gigabytes of files like this in their source repository. Also, as ebneter mentioned, it’s a lot faster to reconcile. Try “git status” on a 60 gig repo.

      • Masklinn says:

        Warren: central checkout is an essential feature for unmergeable files like PNGs or some other sort of binary asset.

        Yes, but it’s not necessary for all file types, and there is no reason to use it on text files.
        SVN does that much better: there’s an svn:needs-lock property, which can be auto-set on predefined file types. Set it on binary file types but not on text file types; altering binary files will then require that you request a lock from the server (exclusive editing/central checkout), while text files won’t, and you can just edit them as you want.

        • ebneter says:

          I think we’re conflating two separate things here. In Perforce, you tell the system that you’re editing a file. This is non-exclusive; anyone else can edit the file as well and you and they will have to resolve your differences at the appropriate time. Perforce does allow you to create a lock on, e.g., binary files, the same as Subversion. The normal ‘p4 edit’ command just tells Perforce that you are working on a file, so that it treats it specially on updates from the server.

  7. Josh says:

    On Remote Use

    “It’s not as bad as Perforce”

    I am guessing you see git as having done it right? I guess, as a perforce advocate, I’m not really sure what you mean here. Subversion is somehow better than perforce in this regard?

    • ebneter says:

      Perforce requires a connection to the server to edit a file; svn does not. At least, that used to be true — I haven’t used the latest versions of Perforce, which I understand have some improvements in that area, hence the caveat in the post. For truly distributed remote use, Perforce proxies are not a good solution, although they work well when you have multi-user satellite sites.

      Don’t get me wrong: I love Perforce, and would much rather use it than Subversion. My current company chose git for a variety of reasons, but I would have happily converted to Perforce. I’ve used it, administered it, and converted to it (from CVS) at several jobs.

  8. Trimbo says:

    I ditched SVN the first time I tried to recover an old repository and ran into BDB compatibility issues. It was like “yeah keeping my source in this… THAT sounds like a great long term idea.”

    At work, we’ve always used Perforce. For home stuff I used the free Perforce config for a long while but now use git exclusively. I assume you’re going to write about Git soon?

    • admin says:

      Yep, working on a git article right now. Another topic will be “what makes a good VCS,” to help people sort out what’s good for them.
