November 22, 2010

Git And Continuous Integration

Originally published 13 Nov 2008

Subversion is the de facto version control system of the day, but Git is the rising star.  More and more people are using Git, but I'm a bit concerned about its effects on Continuous Integration (CI).

First some background...

Subversion follows a centralized repository model.  So do CVS and many others.  There's one server that everybody commits to.  Git can follow suit or be configured in a distributed fashion.  In a fully distributed model, each developer has a private copy of the repository.  Mercurial is another example of a distributed version control system.

My concern is really with the distributed model.  One of the appeals of this model, particularly of having your own private repository, is the ability to experiment and commit at a much finer grain than with a centralized model.  With a centralized model, you need to be more careful with your commits.  Otherwise, you could break the build for everybody else.  Consequently, you commit less frequently.

But what's better in the context of CI?  I have to be honest and admit that I've never used the distributed model on a project, so I'm being theoretical here.  My gut feeling is that with a distributed model, integration will be less frequent.  I believe people will commit to their private repositories more often, but will push those changes to the main integration branch less frequently than people committing straight to the main branch in a centralized model.  In a centralized model, integration is in your face.  You can't commit without thinking about it.  In a distributed model, you can get carried away in your own little world.  And we've learned in the agile community that we should be integrating early and often, haven't we?
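
To make the commit-versus-integrate distinction concrete, here's a toy sketch in Python.  It is not how Git works internally, and everything in it (the Developer class, its commit and push methods, the mainline list) is made up for illustration.  It only shows that many private commits can add up to a single integration event.

```python
from dataclasses import dataclass, field


@dataclass
class Developer:
    """Toy stand-in for one developer's private repository (hypothetical)."""
    name: str
    unpushed: list = field(default_factory=list)

    def commit(self, message: str) -> None:
        # A local commit: recorded privately, invisible to the team and to CI.
        self.unpushed.append(message)

    def push(self, mainline: list) -> int:
        # A push: the moment of integration, when the shared branch
        # (and therefore the CI server) first sees the work.
        count = len(self.unpushed)
        mainline.extend(self.unpushed)
        self.unpushed.clear()
        return count


mainline = []
dev = Developer("alice")

# Five fine-grained commits in the private repository...
for i in range(5):
    dev.commit(f"small step {i}")

pending_before_push = len(dev.unpushed)  # commits the team has not seen yet
integrated = dev.push(mainline)          # ...but only one integration event
```

In this sketch, CI gets exactly one chance to catch a problem across all five commits.  In a centralized model, each of those commits would have been its own integration.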

I compare the distributed model with multiple repositories to a centralized model with multiple branches.  If you can accept this analogy, you can probably see how integration would be less frequent.  I fear that with Git, people will adopt more of a distributed model, and thus CI will suffer.  This is pure speculation on my part, and I'm interested to see how things play out.

So my point here is to be mindful of using lots of repositories with Git, and of the potential negative consequences for CI.

For some more comments on using Git and CI, particularly on larger teams, see this.
