This has come up in a few different contexts recently, and while I understand that git is the current trendy choice for version control, I am not convinced that it is the best tool for every job. There are clearly places where a distributed version control system has advantages, and the most obvious of those is the one that git was designed for: a large open source software project with lots of independent contributors.
When working in a small company, with a group of engineers who are on the payroll, there doesn’t seem to me to be as much need for the additional complexity of a DVCS, even if those employees are distributed geographically.
It seems to me that there is a misunderstanding about the word distributed in DVCS. Being geographically distributed doesn’t mean your team can’t use Subversion or Perforce. As long as they are connected to the internet & able to connect to the server when needed, there should be little difference. And I pity the poor soul on a slow connection who is trying to clone a large git repository… Especially if they are just trying to look at the latest version of the source to learn!
What is being distributed is the entire repository. Rather than checking out just the version you are interested in, when you clone a git repository you get (by default) every version and all the metadata. That’s great if you work offline a lot, or if you don’t have a trustworthy, backed-up place to keep the repository. But if you are serious about software development, and especially if you are building a company on top of that software, you really should have a central repository anyway: the one you do your release builds from, and the one you back up.
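To be fair to the poor soul on the slow connection, git does offer a partial escape hatch: a shallow clone fetches only recent history rather than every version. A minimal sketch, using a throwaway local repository in place of a real server (the repository name and commit contents are illustrative):

```shell
# Compare a full clone against a shallow (--depth 1) clone.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Build a small repository with three commits to stand in for a server.
git init -q origin-repo
cd origin-repo
git config user.email dev@example.com
git config user.name "Example Dev"
for i in 1 2 3; do
  echo "revision $i" > file.txt
  git add file.txt
  git commit -qm "commit $i"
done
cd ..

# A full clone copies every commit and all the metadata...
git clone -q "file://$tmp/origin-repo" full
# ...while --depth 1 fetches only the most recent commit.
git clone -q --depth 1 "file://$tmp/origin-repo" shallow

echo "full history:    $(git -C full rev-list --count HEAD) commits"
echo "shallow history: $(git -C shallow rev-list --count HEAD) commits"
```

This only mitigates the download; it doesn’t change the fact that a plain `git clone` defaults to taking everything.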
Linus Torvalds presented his case for distributed version control at a Google event, but almost all of his advantages are in the context of an open source project with contributors of unknown ability. In that environment, enabling all potential contributors to work on the project & track their changes locally and then submit them for review in a standardized way makes a lot of sense.
It should also be noted that he does maintain a central copy of the Linux source code at kernel.org, so the idea that a git project can be managed without a central repository is a myth. In open source even more than in a small startup, people need to know where they can go to get the latest official (trustworthy) version of the source. For that a central repository is pretty much essential. I am certain that nobody will be able to clone the git repository from Linus’ personal computer!
One of the most recent discussions I have had about git started with this tweet:
What struck me about this post was the suggestion that the choice of version control tool is as important as the other three; that seems wrong to me. So I responded to Mr Srinivasan with that comment. What came back surprised me even more:
If I am reading that correctly, he feels that using git implies that a software team is more sophisticated than if they used subversion (or presumably Perforce or some other non-distributed version control). I could understand the argument that not using version control at all was a concern, but the specific tool choice seems irrelevant. I wonder if the choice of editor matters too?
Complex By Design
There are several places on the web where it is claimed that the complexity in git is there by design and that spending time trying to master its arcane way of doing things is a rite of passage. As if to confirm this, one third party response to my discussion above said this:
I appreciate that studying the tool in great detail, perhaps even downloading the source and really trying to understand it might be an interesting academic exercise. But I would prefer in a corporate setting that the version control tool be as simple as possible to understand and use. If I need to train every new hire (or test their understanding of the VCS during the hiring process), I would say the tool has failed.
Of course, if we look back at git’s origins we can see why having a more complex tool might serve a purpose. In that unmanaged open source world, where contributors are not interviewed or hired, having a tool that can weed out the less committed might be seen as a good thing. Then again, it could also be off-putting to a lot of talented engineers who would rather spend their time working on their passion than learning how an intentionally arcane tool works!
The workflow for a large open source project, with thousands of contributors, is very different from the flow inside a small startup team. External contributors branch off of the approved mainline, make their changes there, and then send a request to the project’s owner to pull their changes back into the mainline. The project’s maintainers might receive tens or hundreds of those pull requests each week. Certainly, in that environment they wouldn’t want to just open the floodgates & let unknown developers commit their changes directly.
While that flow could work in a corporate setting too (and I could easily build a peer review process around it), I would argue that a more typical workflow would be to allow your engineers to push their changes into the mainline once they are happy with them. Having one person in a small team responsible for pulling everybody’s changes into the mainline seems like an inefficient use of resources. Even more so if that person needs to keep pushing back changes to have conflicts resolved. IMHO, it is far better in a fast-paced corporate environment to have each developer commit their changes to the central repository as soon as possible and have them resolve conflicts.
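That everyone-pushes workflow can be sketched as follows, again with a throwaway bare repository standing in for the central server (developer names, branch name, and files are illustrative; assumes git 2.28+ for `init -b`):

```shell
# Two developers pushing directly to a shared mainline, no gatekeeper.
set -e
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
tmp=$(mktemp -d)
cd "$tmp"

# A bare repository stands in for the central (backed-up) server.
git init -q --bare -b main central.git

# Developer 1 seeds the mainline.
git clone -q central.git dev1
( cd dev1 && echo one > a.txt && git add a.txt \
  && git commit -qm "first change" && git push -q origin HEAD:main )

# Developer 2 clones and commits locally.
git clone -q central.git dev2
( cd dev2 && echo two > b.txt && git add b.txt \
  && git commit -qm "second change" )

# Meanwhile developer 1 lands another change on the mainline.
( cd dev1 && echo three >> a.txt && git commit -qam "third change" \
  && git push -q origin HEAD:main )

# Developer 2 picks up the latest mainline, resolves any conflicts
# locally, and pushes as soon as the change is ready.
( cd dev2 && git pull -q --rebase origin main \
  && git push -q origin HEAD:main )
```

The conflict resolution happens on the machine of the developer who made the change, rather than piling up on one designated integrator.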
I am also a big believer in only having peer reviews happen for changes that are checked in. Not because I think anybody I work with would intentionally make a change post-review to slip something into the code unnoticed, but simply because it is all too easy to forget to commit one file, especially when adding new files as part of a change. It is also easier for remote reviewers to get the code if it is checked in somewhere.
My own personal experience with git merge is something I would prefer never to repeat: two days lost trying to repair the mess it made of a simple merge of more recent changes in the Android set of repositories into my own branch (the deltas between the release tag my branch forked from and a slightly newer tag on the same release branch).
That aside, Linus’ comparisons of git to CVS are somewhat unfair. Sure, git tracks merges better than CVS, but I doubt anybody would consider CVS to be a state-of-the-art version control system. In fact, I don’t know anybody still using it. I would agree that Subversion is still weak in this area, but other options like Perforce or the crazily feature-rich ClearCase track every merge and are not tripped up by attempts to merge changes that have already been merged.
And the tools for merging conflicting files in Perforce are far superior to anything I’ve seen available for git (or subversion for that matter).