Software applications that "grew up with the business" often have the outdated practices and behaviors of the business so intertwined in them, in the form of hard-coded business rules and processes, that they seem inseparable. Now, after 40 years of serious business software development, "legacy applications" are something that larger organizations finally have to contend with, but few are looking forward to it. It is akin to opening Pandora's Box.
For the business to remain healthy — which includes being responsive to its customers, being able to take full advantage of opportunities, and being agile in its ability to adjust to new conditions — updates to these applications need to be timely, predictable, reliable, low-risk, and done at a reasonable cost.
The key challenge with maintaining legacy systems is developing new functionality and enhancements, often without a clear understanding of how the system works. The good news is that products and approaches are emerging to help solve these problems.
What is a legacy system?
Perhaps the first order of business is to agree on a definition of what legacy applications are. The one I prefer, and that I have applied in this article, is as follows:
Legacy software applications are those that have outlived their original user requirements but have remained in operation and been modified so often that the application is substantially different from the original. The system functions correctly, but in reality a large percentage of the code is obsolete and the remainder works in ways that are not fully understood by those who maintain it.
What development needs to be done with legacy applications?
If legacy applications are to continue supporting the business, they need to evolve along with it. This can mean changes to support internal process change, or exposing what were originally internal applications to new, external users in response to e-business demands. At some point, when changes can no longer be sustained in a timely, predictable, reliable, and low-risk manner, the application needs to be replaced if the business is to remain "healthy."
Recent studies on legacy applications show that the most common initiatives dealing with legacy systems today are to add new functionality, redesign the user interface, or replace the system. Since delving into the unknown is always risky, most IT professionals attempt to do this work in as non-invasive a manner as possible through a process called "wrapping" the application, or by keeping the changes as superficial as possible. In all cases, the more you understand about the application — or at least the portions you're going to be operating on — the less risky a proposition the operation becomes. This means not only unraveling how the application was implemented (design), but also what it was supposed to do (features). This is essential if support for these features is to continue or if they are to be extended or adjusted.
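To make "wrapping" concrete, here is a minimal sketch in Python. It assumes a hypothetical legacy routine, legacy_calc_price, that returns a pipe-delimited status string; the routine, its format, and the PricingService and Quote names are all invented for illustration and do not come from any particular system. The point is only that new callers talk to the wrapper while the legacy code itself stays untouched.

```python
from dataclasses import dataclass


def legacy_calc_price(cust_code: str, qty: int) -> str:
    """Hypothetical stand-in for a legacy routine that nobody wants to modify.

    Returns "STATUS|AMOUNT_IN_CENTS", with the amount zero-padded to 8 digits.
    """
    unit_cents = 250 if cust_code.startswith("W") else 300
    return f"OK|{unit_cents * qty:08d}"


@dataclass(frozen=True)
class Quote:
    customer: str
    quantity: int
    total: float  # in currency units, not cents


class PricingService:
    """Wrapper: validates input and translates the legacy output format."""

    def quote(self, customer: str, quantity: int) -> Quote:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        # Call the legacy routine exactly as existing callers do.
        raw = legacy_calc_price(customer, quantity)
        status, cents = raw.split("|")
        if status != "OK":
            raise RuntimeError(f"legacy pricing failed: {status}")
        return Quote(customer, quantity, int(cents) / 100)
```

A new web front end, for example, could call PricingService().quote("W123", 4) and never see the legacy string format. The trade-off is that the wrapper hides the old behavior rather than explaining it, which is why understanding the wrapped portion still matters.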
What is it about legacy applications that complicates this work?
What characterizes legacy applications is that the information relating to implementation and features isn't complete, accurate, current, or in one place. Often it is missing altogether. Worse still, the documentation that does exist is often filled with information from previous versions of the application that is no longer relevant and therefore misleading.
Other problems that commonly plague legacy development include:
- A lot of intellectual property is "embedded" in the application and no place else
- There's incomplete or sometimes no documentation, and the accuracy and currency of the documentation that does exist is suspect
- The original designers are no longer around
- There have been many "surgeons" who've performed a variety of operations over the years, but none bothered to take notes on what these operations were
- The application is based on older technologies (languages, middleware, frameworks, interfaces, etc.)
- Skill sets needed to work with the old technologies are no longer available
What are some of the techniques being employed?
"What one man can invent, another can discover."
— Sherlock Holmes.
The skills of a forensic detective are required to gain an understanding of a legacy application's implementation and its purpose. This understanding is essential to reducing risk and to making development feasible. It is achieved by identifying the possible sources of information, prioritizing them, filtering the relevant from the irrelevant, and piecing together a jigsaw puzzle that shows how the application has grown and changed over time. This understanding then provides the basis for moving forward with the needed development and, ideally, a foundation for subsequent development.
In addition to the application and its source code, there are usually many other sources for background information. These include user documentation and training materials, the users, regression test sets, execution traces, models or prototypes created for past development, old requirements specifications, contracts, and personal notes.
Certain sources can be better resources for providing the different types of information sought. For example, observing users of the system can be good for identifying the core functions, but poor at finding infrequently used functions and the back-end data processing that's being performed. Conversely, studying the source code is a good way to understand the data processing and algorithms being used. Together, these two techniques can help piece together the system's features and what they are intended to accomplish. The downside is that these techniques are poor at identifying non-user-oriented functions.
What advances in automation help the situation?
The majority of tools whose purpose is to help with legacy application development have tended to focus on one source. Source code analyzers parse and analyze the source code and data stores in order to produce metrics and graphically depict the application's structure from different views. Another group of tools focuses on monitoring transactions at interfaces in order to deduce the application's behavior.
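As a rough illustration of what a source code analyzer does, the sketch below uses Python's standard ast module to build a simple call graph and to list functions that nothing else in the file calls. It assumes, purely for the sake of the example, that the code under study is Python; real legacy systems are more often written in older languages and need commercial or language-specific parsers. The function names and the SAMPLE source are hypothetical.

```python
import ast

# Hypothetical fragment of the application under study.
SAMPLE = '''
def load(path):
    return open(path).read()

def report(path):
    data = load(path)
    print(len(data))
'''


def call_graph(source: str) -> dict[str, set[str]]:
    """Map each function name to the plain names it calls.

    Only direct calls such as f(x) are recorded; method calls like
    obj.f(x) are skipped. Calls inside nested functions are attributed
    to the enclosing function as well.
    """
    graph: dict[str, set[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            callees = graph.setdefault(node.name, set())
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
    return graph


def uncalled_functions(graph: dict[str, set[str]]) -> list[str]:
    """Functions defined in the file that no function in the file calls."""
    called = set().union(*graph.values())
    return sorted(set(graph) - called)
```

Running call_graph(SAMPLE) shows that report depends on load, and uncalled_functions flags report as an entry point or a candidate for dead code. Note that this is Solution Domain information only: it says how the code hangs together, not what business feature report exists to serve.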
While this information is useful, it usually provides a small portion of the information needed to significantly reduce the risk associated with legacy application development. A key pitfall of many development projects is not recognizing that there are two main "domains" in software development efforts: the "Problem Domain" and the "Solution Domain."
Business clients and end users tend to think and talk in the Problem Domain where the focus is on "features," while developers and IT professionals tend to think and talk in the Solution Domain where the focus is on the products of development. Source code analysis and transaction monitoring tools focus only on the Solution Domain. In other words, they're focused more on understanding how the legacy system was built rather than what it is intended to accomplish and why.
More recent and innovative solutions can handle the wide variety of sources required to develop a useful understanding and can extend this understanding from the Solution Domain up into the Problem Domain. This helps users understand a product's features, and allows them to map these features to business needs. It is like reconstructing an aircraft from its pieces following a crash in order to gain an understanding of what happened.
The most advanced of these tools let you create a model of the legacy application from the various "puzzle pieces" of information that the user has been able to gather. The model, or even portions of it, can be simulated to let the user and others analyze and validate that the legacy application has been represented correctly. This model then provides a basis for moving forward with enhancements or replacement.
The models created by these modern tools are a representation of (usually a portion of) the legacy application. In essence, the knowledge that was "trapped" in the legacy application has been extracted and represented in a model that can be manipulated to demonstrate the proposed changes to the application. The models will also allow for validation that any new development to the legacy application will support the new business need before an organization commits to spending money and time in development.
Once the decision is made to proceed, many tools can generate the artifacts needed to build and test the application. Compuware's Optimal Trace and Ravenflow's RAVEN, for example, can automatically generate documentation and other assets. And Blueprint's Requirements Center can generate complete workflow diagrams, simulations/prototypes, requirements, activity diagrams, documentation, and a complete set of tests.
People typically won't invest the time required to create a model of the entire legacy application. This is fine because in most cases, a comprehensive model is not necessary. Usually just creating a model of that portion of the application that is being changed, and some peripheral areas, is all that's needed. Over time, after several enhancements have been done in this manner, models can be combined to gradually "expose" more and more understanding of the legacy application. This will make future enhancements easier to implement and more feasible, thereby increasing the application's longevity.
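The idea of combining partial models can be sketched very simply. The example below assumes a deliberately reduced notion of a "model": a mapping from a feature (Problem Domain) to the code units that implement it (Solution Domain). Commercial modeling tools capture far more than this, and the feature names and unit identifiers here are invented for illustration.

```python
from collections import defaultdict

# A partial model: feature name -> code units known to implement it.
PartialModel = dict[str, set[str]]


def combine(*models: PartialModel) -> PartialModel:
    """Merge partial models; units for the same feature are unioned."""
    merged: defaultdict[str, set[str]] = defaultdict(set)
    for model in models:
        for feature, units in model.items():
            merged[feature] |= units
    return dict(merged)


def coverage(model: PartialModel, all_units: set[str]) -> float:
    """Fraction of the application's code units explained by some feature."""
    if not all_units:
        return 0.0
    explained = set().union(*model.values())
    return len(explained & all_units) / len(all_units)


# Hypothetical models produced by two separate enhancement projects.
billing_change: PartialModel = {"invoice customer": {"INV010", "TAX020"}}
shipping_change: PartialModel = {
    "ship order": {"SHP100"},
    "invoice customer": {"INV030"},
}
ALL_UNITS = {"INV010", "INV030", "TAX020", "SHP100", "RPT900", "ARC500"}
```

Calling combine(billing_change, shipping_change) yields one model in which "invoice customer" is traced to three units, and coverage of that result against ALL_UNITS shows how much of the application is still unexplained. Each new enhancement that contributes a partial model raises that figure a little.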
Current trends toward new software delivery models also show promise in alleviating many of the current problems with legacy applications. Traditional software delivery models required customers to purchase perpetual licenses and "host" the software in-house. Upgrades to new versions were significant events with expensive logistics required to "certify" new releases, to upgrade all user installations, to convert datasets to the new version, and to train users on all the new and changed features. As a result, upgrades did not happen very often; maybe once a year at most.
Software delivery models are evolving, however. At the other end of the spectrum, already available in some markets like Customer Relationship Management, is Software as a Service (SaaS). It is similar to getting cable TV. The customer subscribes to the SaaS, and the service is provided "online." The customer does not have to deal with issues of certification, installation, and data conversion. In this model, the provider upgrades the application almost continually, often without the users even realizing it. With this continual stream of small changes, there's no "big-bang" event of the magnitude described above. The application seemingly just evolves in sync with the business and, hopefully, the issue of legacy applications becomes a curious chapter in the history of computing. Time will tell.