If you’re like many shops that develop in an Agile fashion, you’ve probably discovered the beauty of delivering working software early and often. You’ve probably found that use of a continuous integration tool can begin to become a continuous delivery tool and help your DevOps team deploy production releases efficiently -- better, faster and cheaper. But you’ve probably run into a wall with your delivery team and the handoff of database changes to DevOps, where the reality of how different and difficult the care and feeding of databases is, compared to code. In this tip, we’ll explore why changes to the database are so difficult and we’ll discuss solutions and tools that will help your DevOps team make sure changes to the code and database stay in sync.
Why are changes to databases so hard?
The fundamental root cause of why databases are hard lays in the nature of databases themselves. They contain both structural elements -- the schema, stored procedures, etc., as well as data already stored in the database. What’s nice and easy to do in a development environment, such as blow away the database during the course of development simply isn’t an option when we must deploy the new code into production. So, when we get to continuous delivery for our products, we need to come to terms with a mindset and toolset to enable us to handle the challenges for database automation as it relates to DevOps deployment of the iterative and incremental functionality cycles that Agility wants us to work in.
The mindset: Treat your database like code
Start by treating your database like code! You would never consider just taking bits and pieces of code randomly scattered all over the place and trying to bundle that all up for a release. Your DevOps folks would spin you around with a “do over” if you even suggested that! Yet, many of these same teams are treating their databases in that fashion. Your code is coupled with other code in the same fashion as your code is coupled with your database. That’s why we commit related changes to the code repository together atomically. Yet it is still way too common for developers to make local changes in the database schemas, and expect that the database schema will be manually updated by DBAs and DevOps prior to the code being put in place, so errors in production do not occur. That’s crazy talk!
Each time that a change in functionality changes something on the database back end, you usually have a dual-pronged problem. Developers need to proceed in a database sandbox, and many times, this sandbox is on a different database platform than that which will be used in production (due to factors such as cost). And if you deploy code running against the wrong version of the database, your code will just not work.
To be successful, we need a database tool that will:
- Support database schema changes in a database vendor agnostic fashion (so we don’t have to worry about a schema change being syntactically different between database vendors).
- Support multiple developers all contributing code and database changes at once. Branching and merging code, too.
- Preserves the semantic meanings of the syntactic changes (we shouldn’t have to guess that EMP_ID and EMPLOYEE_ID are really the same attribute).
- Be general purpose enough to allow all possible forms of database changes, or refactorings (see Refactoring Databases: Evolutionary Database Design, by Scott Ambler and Pramod Kumar Sadalage, Addison-Wesley, 2006).
- Generates documentation for the database from the source, making documentation one less thing to worry about.
There are many ways that we attempt to fix the problem of database control and migration. Here’s a quick sample:
- Have a DBA fully design the database once and for all time in a BDUF (Big Design Up Front). Deploy the database once. Never allow changes. Hmmm. That doesn’t seem very Agile does it? Wonder if there could ever be a problem if requirements change, or if new ones come up?
- Take a complete laissez-faire attitude with respect to databases. Let any developer make whatever changes they feel are called for, and let DevOps figure it out when it comes time to deploy. That seems a little error prone to me, and probably to you, too. In fact, that’s pretty much the reason for this tip!
- Use a framework that supports database migrations, such as Ruby on Rails, and its ActiveRecord Migration features. There are several vendors that support the ideas (and RoR is a pretty good one), but they all are lacking in several important areas, such as vendor tie-in and lack of support of the general problem. For example, the ActiveRecord design pattern does not support many of the important many-to-many relationships you probably want to use in on your back end.
- Use some ad-hoc tools, such as general purpose “database diff” generators. Here too, we have lack of generalized solutions and are forced to mire in database details (such as different SQL dialects), and lose focus on the bigger problem.
A better solution can be found in an open-source tool called LiquiBase. Here, we have a tool that:
- Uses a single changelog to describe the database, and gets changed and checked in with the rest of the codebase.
- Is invoked from the command line, making it perfect for inclusion with your automation tooling.
- Works against a wide range of target databases, allowing the databases to be built anywhere from developer desktop to production.
- Allows for automatic upgrading of databases or for DBA ready scripts to be generated and then manually invoked.
- Puts the documentation into the changelog, and generates JavaDoc style documentation for your database, including who made changes and when the changes occurred.
Action plan for smooth delivery
- Start developing and checking-in your database along with the rest of your code, in an iterative and incremental fashion.
- Use LiquiBase to make the targeting of the database as easy as targeting a platform for the code to compile to.
- Use the DbDoc feature of LiquiBase to help humans understand the database better.
- Break down the problem of database migration and data scrubbing into PreChange, Change, and PostChange events with respect to when LiquiBase is run in the automation process.
- Now that your DBAs and DevOps people are freed of the hard and mundane activities previously required of them during deployments, allow them to get back to their real jobs, such as managing the hard problems surrounding non-functional requirements, for concerns such as throughput, capacity, and latency.
For a comprehensive resource on continuous integration, see Continuous integration: Achieving speed and quality in release management.
Q&A: Expert says prepare for DevOps and testing changes