How does the availability of applications via the cloud, and Software as a Service availability, change the priorities in application lifecycle management?
Any system is only as good as its availability. When one is developing an application that will be deployed to the cloud, it is important to understand that the remoteness of the execution platform, the cloud, is a disadvantage when an application fails. Do you know how long it will take you to deploy an emergency fix to your cloud environment?
Development processes are executed every day, and we routinely deploy fixes in a methodical and thoughtful manner. But what happens when the system in the cloud is down inexplicably?
Part of our approach to designing applications for the cloud is to design them in a fail-safe and fail-soft manner. The application architect and designers need to give as much attention to detecting and handling when the system is failing as they do to the normal operation of the system.
Customer interactions with a system that is failing must be handled in a way that preserves customer data and tries to recover the user input. It is not uncommon to find cute messages like “Oops, something went wrong” or “Well, this is embarrassing” in modern cloud-based applications that tell us the system knows there’s a problem and it’s being handled.
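To make the fail-soft idea concrete, here is a minimal Python sketch. The names `submit_order`, `save_to_backend` and `draft_dir` are hypothetical, and writing a local JSON draft is only one assumed way of preserving input; a real application might use browser storage or a queue instead.

```python
import json
import tempfile
from pathlib import Path


def submit_order(form_data, save_to_backend, draft_dir):
    """Try the normal path; on failure, keep the user's input and degrade gracefully."""
    try:
        save_to_backend(form_data)
        return {"ok": True, "message": "Order saved."}
    except Exception:
        # Fail soft: persist what the customer typed so it can be recovered later.
        draft_path = Path(draft_dir) / "draft.json"
        draft_path.write_text(json.dumps(form_data))
        return {
            "ok": False,
            "message": "Oops, something went wrong. Your input has been saved.",
            "draft": str(draft_path),
        }


def failing_backend(_data):
    # Simulated outage, for illustration only.
    raise ConnectionError("backend unreachable")


with tempfile.TemporaryDirectory() as tmp:
    result = submit_order({"item": "book", "qty": 2}, failing_backend, tmp)
    # Read the draft back to show that the customer's input survived the failure.
    recovered = json.loads(Path(result["draft"]).read_text())
```

The point of the sketch is that the failure path is designed as deliberately as the success path: the customer gets a friendly message and loses nothing they entered.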
We must also have technology in place that reports these incidents with diagnostic data to the service management teams. The first we hear of an outage should not be the 6 o’clock news or an email from an irate customer. This too must be part of the architecture and design of the system.
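As a rough illustration, incident reporting could look like the Python sketch below. The function names, the report fields and the `send` callback are all assumptions made for this example; in practice `send` would post to whatever alerting or ticketing channel the service management team uses.

```python
import json
import platform
import traceback
from datetime import datetime, timezone


def build_incident_report(exc, component):
    """Collect diagnostic data about a failure into a JSON-serializable dict."""
    return {
        "component": component,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "error_type": type(exc).__name__,
        "error_message": str(exc),
        "stack_trace": traceback.format_exception(type(exc), exc, exc.__traceback__),
        "host": platform.node(),
    }


def report_incident(exc, component, send):
    """Serialize the report and hand it to a caller-supplied delivery function."""
    send(json.dumps(build_incident_report(exc, component)))


# Usage: a plain list stands in for the real delivery channel.
outbox = []
try:
    raise TimeoutError("database did not respond")  # simulated failure
except TimeoutError as err:
    report_incident(err, "checkout", outbox.append)
```

Keeping the delivery function as a parameter lets the same reporting code be exercised in tests without a live alerting system.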
Of course, critical to our cloud applications are the numerous services we incorporate from third parties that give our applications that rich user experience and save us the coding effort. Wherever possible, it is important to get Service Level Agreements in place with these publishers, especially around the volatility of the service’s interface.
The Software-as-a-Service (SaaS) features we incorporate may change without warning, or may be unavailable from time to time. And so we need to build detection of SaaS changes and outages into our architecture too.
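One possible shape for such detection is sketched below in Python. It assumes the third-party service returns a dictionary-like response and that we know which fields our application depends on; `check_service`, `fetch` and the field names are hypothetical, not part of any real SaaS API.

```python
def check_service(fetch, required_fields):
    """Classify a third-party service as 'ok', 'unavailable', or 'interface_changed'."""
    try:
        response = fetch()
    except Exception:
        # Any failure to get a response is treated as an outage in this sketch.
        return "unavailable"
    if not isinstance(response, dict):
        return "interface_changed"
    # A missing field suggests the publisher changed the interface we rely on.
    missing = [field for field in required_fields if field not in response]
    return "interface_changed" if missing else "ok"


def healthy():
    return {"rate": 1.08, "currency": "EUR"}


def renamed_field():
    return {"fx_rate": 1.08, "currency": "EUR"}  # 'rate' was renamed


def down():
    raise ConnectionError("service unreachable")


statuses = [
    check_service(healthy, ["rate", "currency"]),
    check_service(renamed_field, ["rate", "currency"]),
    check_service(down, ["rate", "currency"]),
]
```

Run on a schedule, a check like this can feed the same incident reporting path used for our own failures, so that a changed or missing service is noticed before customers notice it.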
All software systems fail from time to time, and cloud-based systems are not an exception. When we have to face this challenge, the detection, management and remediation should be a well-rehearsed process that can be executed calmly and efficiently. Make sure you build this activity into your test planning.
This was first published in November 2011