Managing data processes and warehousing can make or break a business. Keeping business information secure while moving data seamlessly between systems is an intricate job. Change control tools are vital to development projects: They keep code drift to a minimum, provide a way to track issues and help speed up feedback cycles.
Brokering European vacations requires U.K.-based TUI Travel to manage complex data structures. Information from various airlines, hotels, clients and internal sources needs to be accurate and visible across the organization to accomplish the company's core mission -- matching tourists with the best vacation at the best price. Capturing and analyzing the necessary data requires a data warehouse. Managing the data warehouse for TUI U.K. was a big job that landed on the plate of John Beeston.
Today, Beeston is a freelance data warehouse specialist, but he spent more than four years working with TUI's data warehouse. I spoke with Beeston in August 2014, near the end of his tenure as a development team lead at TUI. At that time, Beeston's primary focus was helping TUI overcome version control and release-management concerns.
Beeston stressed the importance of keeping changes in the databases orderly. Without proper controls in place, reference errors between databases are likely, just as dependency errors tend to creep in when source code isn't properly managed.
Development teams need a change management system that all developers can and will follow. For enterprise teams, that generally means using a specialized change control tool -- sometimes more than one. Beeston noted that many tools are available for source code control, but that similar controls for databases aren't as readily available.
A change management trio
"The industry standard for source code control is SVN (or Subversion), and it's great for managing source files," Beeston said. Unfortunately, a lot of the work Beeston oversaw wasn't stored in source files.
Most of his team's work was done with Informatica, an ETL (extract, transform and load) tool. (Remember, most of his team's mission revolved around improving data quality and availability.) Informatica provides a GUI that helps with creating and managing ETL rules, but it doesn't store those as files that are easy to integrate with SVN. Beeston said it is possible to integrate Informatica with SVN, "but you end up exporting individual objects as XML. It's slow and all the changes have to be duplicated across SVN for everything in Informatica."
So Beeston used Informatica for version control on his ETL rules and SVN to manage the source code files. But neither one was a perfect fit for Beeston's Oracle databases. This was a big problem for Beeston because about 15% to 20% of the work his team did was in Oracle databases. For the Oracle databases, Beeston used DBMaestro's TeamWork. Beeston noted the tool wouldn't let his developers update any of the databases without checking them in and out. Beeston said the only build issues they had on the database side after implementing TeamWork were "because a developer hadn't tagged something that should go into a particular build."
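The check-out rule and the untagged-change problem Beeston described can be illustrated with a small sketch. It is not TeamWork's API or behavior in any detail; the class `SchemaRepository`, its methods and the object name `FARES` are all hypothetical, and the "database" here is just an in-memory dictionary.

```python
class CheckoutRequired(Exception):
    """Raised when a change is attempted without a valid check-out."""


class SchemaRepository:
    """Hypothetical in-memory stand-in for a database change control tool."""

    def __init__(self):
        self._definitions = {}   # object name -> latest definition text
        self._checked_out = {}   # object name -> developer holding it
        self._history = []       # (object name, developer, build tag or None)

    def check_out(self, name, developer):
        holder = self._checked_out.get(name)
        if holder is not None and holder != developer:
            raise CheckoutRequired(f"{name} is already checked out by {holder}")
        self._checked_out[name] = developer

    def check_in(self, name, developer, definition, build_tag=None):
        # Refuse any update that was not preceded by a check-out.
        if self._checked_out.get(name) != developer:
            raise CheckoutRequired(f"{developer} must check out {name} first")
        self._definitions[name] = definition
        self._history.append((name, developer, build_tag))
        del self._checked_out[name]

    def untagged_changes(self):
        # Changes with no build tag are the ones a build would silently miss.
        return [name for name, _, tag in self._history if tag is None]


repo = SchemaRepository()
repo.check_out("FARES", "dev_a")
repo.check_in("FARES", "dev_a", "CREATE TABLE fares (id NUMBER)")
print(repo.untagged_changes())
```

Listing untagged changes before a build is one way to catch the kind of omission Beeston mentioned; whether a real tool exposes such a report is an assumption here.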
Bringing it all together
That means three separate change control systems were at play for Beeston's team. To keep things manageable for developers, Beeston used a Hudson CI server. "Hudson gives a front end that brings it together," Beeston said. "It provides a nice interface, security structure, good logging and monitoring capabilities." Hudson is open source and freely available. Beeston said Hudson figures out which change management system to run and automatically logs the results.
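To make the idea of a single front end over three change control systems concrete, here is a minimal sketch of the routing step. It is not Hudson's API and not the team's Perl or SQL glue. The file-extension mapping, the system labels and the function names `route_change` and `plan_build` are assumptions made for illustration only.

```python
import logging
from pathlib import PurePosixPath

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("build-router")

# Hypothetical mapping from file type to the system that owns it.
# The article names the three systems; the extensions are assumed.
ROUTES = {
    ".xml": "informatica",   # exported ETL objects
    ".sql": "teamwork",      # Oracle database changes
}
DEFAULT_SYSTEM = "svn"       # everything else is ordinary source code


def route_change(path):
    """Return the label of the change control system for one changed path."""
    suffix = PurePosixPath(path).suffix.lower()
    return ROUTES.get(suffix, DEFAULT_SYSTEM)


def plan_build(changed_paths):
    """Group changed paths by owning system and log each routing decision."""
    plan = {}
    for path in changed_paths:
        system = route_change(path)
        plan.setdefault(system, []).append(path)
        log.info("routed %s -> %s", path, system)
    return plan


if __name__ == "__main__":
    plan_build(["etl/load_bookings.xml", "db/create_fares.sql", "scripts/deploy.pl"])
```

In a real setup the CI server would then invoke each system's own tooling for its group of changes; that step is deliberately left out because its details are not given in the interview.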
That doesn't mean it was easy to connect all the pieces. Beeston said it took a lot of custom Perl scripting and SQL scripts to stabilize the connections between each component. And Beeston still wanted to make the integration better. "The dream is that developers will just be able to check code in and if there are no errors, they don't have to worry about it anymore."
Beeston envisioned a continuous delivery system wherein developers don't have to go to Hudson, specifically. Each developer would check code into the system in which he or she is working and the changes would automatically be checked into all the appropriate change management systems.
"Ideally, I'd like that to be an instantaneous process, but getting to a daily build would be a big enough step for now." As of June 2014, Beeston's team was doing two to three builds per month and would release at the end of each project. Beeston was working on building a system that would let the team build at least once per day and release once per month, on a regular schedule. "Whatever is ready to go on release day, that's what we'll release," he said.
Beeston knew that getting there would require a lot more work. "Continuous integration and test-driven development (TDD) are tough to implement without the discipline to use automation." Feedback loops need to be much shorter. Beeston wanted to ramp up automated testing and take more novel approaches to development. "TDD is definitely on the docket for building automation," Beeston said. "There's also a whole mentality change that needs to come; most developers here have the old-fashioned approach of just 'check it in and forget it.'"