Version control is fundamental to work in modern software delivery teams. Users are hooked on Git version control's design and features, but commercialized distributions depart significantly from the pure open source technology.
Version control -- interchangeably referred to as source control -- is fundamental to DevOps because it enables software and infrastructure developers to iterate and update code independently, without creating a tangled mess of old and new versions, missing dependencies and faulty code.
Git on board
When users put application and infrastructure as code configuration files under source control, there is one consistent package of information for every system that interacts with it. With version control, distributed software developers each write one piece of the application and then combine and vet their work. Infrastructure developers use version control to eliminate brittle, snowflake IT deployments and iterate on a consistent architecture compliant with operational rules.
Version control tools include a mix of open source, supported and proprietary commercial offerings, including:
- Git version control, which is the open source basis for GitHub, GitLab and Atlassian Bitbucket and is incorporated into other tools;
- Mercurial, which also underpins Bitbucket;
- Microsoft's Team Foundation Version Control (TFVC) in Team Foundation Server (TFS);
- the open source Concurrent Versions System;
- Apache Subversion; and
Git version control -- and commercial products based upon it -- is well-liked by its users, several of whom shared their opinions on version control tools at ChefConf 2018 in Chicago.
Git is ubiquitous in part because of the tool's open source nature, said Sean Wilbur, director of DevOps solution architecture and delivery at Perficient, a St. Louis-based systems integrator. When Git came about, other tools of varying quality enabled version control, but the developers had to go to senior management to request a tool purchase. Git was built on both technological need and involvement from the user community.
"The moment I realized the power of Git, it was like a religious experience," said Daniel Stone, systems engineer at a U.S.-based oil and gas company. Stone has worked with GitHub in the past for an open source project and now uses a GitLab internal server because his enterprise cannot publicly share source code.
The merge commit step, which generates an updated copy of the source code, was a "soul-destroying experience" until Stone switched to Git. "Git has patterns of design to prevent conflicts from occurring [during merge], and when they do, it's simpler to resolve [than in other tools]," he said. Git's three-way merge capability, wherein the tool automatically compares the files on two branches against a shared ancestor of those branches, makes it easy for code owners to work in isolation.
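The decision rule behind a three-way merge can be sketched in a few lines of Python. This is a simplified illustration, not Git's implementation: it compares whole files only, whereas Git also merges changes within a file. The function name, file names and contents are hypothetical.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots against their shared ancestor, file by file.

    Each argument maps a file name to its content; a missing key means the
    file does not exist in that snapshot. Returns (merged, conflicts).
    """
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:        # both sides agree (same edit, or no edit at all)
            result = o
        elif o == b:      # only "theirs" changed the file
            result = t
        elif t == b:      # only "ours" changed the file
            result = o
        else:             # both sides changed it differently
            conflicts.append(name)
            continue
        if result is not None:  # None means the file was deleted
            merged[name] = result
    return merged, conflicts


# Hypothetical snapshots: the shared ancestor and two branches built on it.
base = {"app.py": "v1", "README": "hello", "conf.yml": "a: 1"}
ours = {"app.py": "v2", "README": "hello", "conf.yml": "a: 2"}
theirs = {"app.py": "v1", "README": "hello world", "conf.yml": "a: 3"}

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # files that merged cleanly
print(conflicts)  # files a person has to resolve
```

Because the ancestor shows which side actually changed each file, only conf.yml, which both branches edited differently, is left for a person to resolve.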
Git's version history storage method quickly finds the first common ancestor of the two versions for a three-way merge, preventing conflicts and developer frustration, said Noah Kantrowitz, DevOps expert and consultant at Coderanger. Other tools, such as Microsoft TFVC and Mercurial, have adopted similar ancestor data storage and three-way merges with differentiating features in recent years. For example, Subversion merge tracking automatically manages the flow of changes between lines of development.
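As a rough illustration of what finding a common ancestor involves, the sketch below models history as a mapping from each commit to its parents and walks backward from both versions. It is an assumption-laden simplification, not how Git stores or searches history; Git's own merge-base logic also handles cases with more than one candidate ancestor. The commit names and function names are hypothetical.

```python
from collections import deque


def ancestors(parents, start):
    """Return every commit reachable from start (inclusive) via parent links."""
    seen, queue = {start}, deque([start])
    while queue:
        commit = queue.popleft()
        for parent in parents.get(commit, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def nearest_common_ancestor(parents, a, b):
    """Walk back from b breadth-first and return the first commit that is
    also reachable from a, or None if the histories are unrelated."""
    reachable_from_a = ancestors(parents, a)
    seen, queue = {b}, deque([b])
    while queue:
        commit = queue.popleft()
        if commit in reachable_from_a:
            return commit
        for parent in parents.get(commit, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None


# Hypothetical history: c1 <- c2 <- c3 on one branch, c2 <- f1 <- f2 on another.
history = {"c1": [], "c2": ["c1"], "c3": ["c2"], "f1": ["c2"], "f2": ["f1"]}
merge_base = nearest_common_ancestor(history, "c3", "f2")
print(merge_base)
```

Here the two branches diverged at c2, so that commit serves as the shared ancestor for a three-way merge.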
Git pros and cons
Git's other major draw among version control systems is that it is truly distributed, said Jeff Druin, DevOps manager at a healthcare provider. Other tools, such as Subversion, are centralized. Centralized version control means that one main copy of a project exists in one place. Microsoft TFS supports both centralized hosted version control in TFVC and distributed version control in Git. In an email statement, a Microsoft representative said that more than half of the developers who use TFS and Visual Studio Team Services manage source code in Git.
DevOps pros say distributed tools, which give the user a full local copy of the repository, enable more independent work and reliable collaboration. "I can check out a repository and get on a plane," Wilbur said, and it isn't hard to create a repository or branch, check out branches and switch between branches. The lightweight design also encourages an app team with dozens of application components to create an equal number of repositories and stitch them together consistently, rather than share one.
Git makes it easy to swap views between versions and see what changed from one to another, Stone added. "Git diff shows everything so you can just look at the feature branches in a repo and compare them to yours." The Git diff command is used to show changes between two things, such as two commits or two trees, and it is natively built into Git.
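To give a feel for the kind of output a diff produces, the sketch below uses Python's standard difflib module rather than Git itself. It prints a unified diff, the same general style Git diff displays. The file name and contents are hypothetical.

```python
import difflib

# Two hypothetical versions of the same configuration file, as lists of lines.
old_version = ["name: web", "replicas: 2", "image: app:1.0"]
new_version = ["name: web", "replicas: 3", "image: app:1.0", "port: 8080"]

# unified_diff yields header lines, then context, removed (-) and added (+) lines.
diff_lines = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="a/deploy.yml",
        tofile="b/deploy.yml",
        lineterm="",
    )
)
print("\n".join(diff_lines))
```

Lines prefixed with a minus sign exist only in the old version and lines prefixed with a plus sign exist only in the new one, which is how a reviewer sees at a glance that the replica count changed and a port was added.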
Git version control also has downsides. "The main source of problems these days are performance issues when dealing with truly massive code repositories," a problem compounded over time as teams add to their source control history, Kantrowitz said. Google supports a protocol to speed up Git's network operations. Microsoft, another company with a huge code repository built over many years, collaborates with GitHub on tools to handle huge repos. Microsoft moved to acquire GitHub in June 2018.
In general, user friendliness is an area where Git could improve. Git version control technology has a steeper learning curve than Mercurial, Kantrowitz said. "It takes a while for people to learn the ins and outs of how Git stores history and how to manipulate that history," he said.
Commercial products have stepped in to fill gaps. Vendors apply governance to the underlying Git version control project, adding an access control layer, unified hook management and other conveniences. Commercial products enable members of the IT team, such as release engineers, to safely interface with and consume Git, Wilbur said.
Git and commercial distributions
Git's rise in source control is inseparable from the growth of GitHub, the best-known of the commercial distributions. GitHub, a hosted version control system, is frequently conflated with its underpinning Git technology, Kantrowitz noted. GitHub is home to a panoply of open source software, and developers had to learn Git to interact with projects hosted there -- the two grew recursively but are not equal, he said.
For example, GitHub's pull request system offers user-friendly, web-based code review with advanced features. Druin credited pull requests as the biggest benefit that vaults GitHub above other version control tools. His DevOps team is wholly on GitHub, while the rest of the company uses Microsoft TFS.
Other Git-based commercial products, such as Bitbucket and GitLab, also offer pre-merge code review features. GitLab builds in a full end-to-end continuous integration pipeline and offers features to tailor the user experience. Gerrit, which provides web-based code review and repository management for Git, enables a granular breakdown of which portions of a single branch are ready to merge and which aren't, Kantrowitz added.
However, this style of pull requests is not available from the open source, community-driven Git, which uses an email message-based workflow design that Kantrowitz called "unfriendly."
GitHub Desktop is an open source UI for Git on Mac or Windows, designed to streamline workflows and avoid command-line interactions. Microsoft's GitHub buy would combine a tech giant known for point-and-click administrator interfaces with the Git version control system in as yet undetermined ways.