As a software testing professional with over 16 years of experience, if I’ve learned one thing, it’s that production is the wrong place to execute ANY testing activities. Testing is typically a destructive process that poses a high risk to the environment in which it runs. It also has far-reaching implications in production environments where auditing and other reporting are important.
The first and most important reason not to test in production is that testing should never be the last phase of a project. Any team testing security in production is a team which will fail -- security flaws found in production are the most expensive flaws to fix. By the time new code has reached a production environment, a web of dependencies has formed between application components and functions, which makes fixing an implementation flaw difficult and time-consuming. Moreover, most security vulnerabilities are actually the result of poor design rather than bad coding; redesigning an application that is already in production can take considerable effort. Time and again it has been shown that the earlier in the lifecycle a defect is mitigated or fixed, the less expensive it is to fix, whether measured in time, effort or money.
Keeping in mind that confidentiality, availability and integrity are the key aspects of security, consider the impact that testing for each of them might have in a production environment. Suppose our production environment is a personal health record repository (a Web application which stores health information about subscribed individuals). If a confidentiality flaw is discovered in production, and the vulnerability poses a high enough risk, the application may need to be pulled from production. Customers are unhappy when a service they rely on is unavailable, so the company’s reputation is at risk in this scenario. Consider also an availability defect being tested for and discovered in production. Testing for an availability defect means trying to degrade the application or bring it down -- and again, site availability is a sensitive factor in a company’s reputation. Finally, integrity defects discovered in production may corrupt production data and reduce user confidence in the application, putting the company at risk of reputational harm and possible punitive damages.
One other factor to consider is similar to the observer effect (often loosely attributed to the Heisenberg uncertainty principle): testing in production changes the very system being tested. Audit logs are filled with meaningless data. Application resources are consumed by testing teams, making them unavailable to actual customers. As logs fill and resources become constrained, the application may begin to behave in unexpected ways. If the site is audited at a later time, the polluted audit logs may draw undesired attention to the service.
The two testing challenges most often ‘answered’ by testing in production are the need for production-like scale and the need for production-like data. However tempting the ‘easy access’ to both may be, teams are best served by developing test plans which account for them, and by spending the extra time and money to create appropriate testing environments.
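As one illustration of how a test environment can get production-like data without touching production, the sketch below generates synthetic records for the health record example used earlier. It is a minimal sketch rather than a recommended implementation: the field names, the sample value lists and the functions `make_synthetic_record` and `make_synthetic_dataset` are all hypothetical, and a real data set would need to mirror the actual schema and data distributions.

```python
import random
import uuid
from datetime import date, timedelta

# Hypothetical sample values, chosen only for illustration.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan"]
LAST_NAMES = ["Rivera", "Chen", "Okafor", "Novak", "Haddad"]
CONDITIONS = ["asthma", "hypertension", "type 2 diabetes", "migraine", "none"]


def make_synthetic_record(rng):
    """Build one fake health record from random values (hypothetical schema)."""
    # Pick a birth date within roughly 68 years after 1 January 1940.
    birth = date(1940, 1, 1) + timedelta(days=rng.randrange(0, 25000))
    return {
        # Derive the ID from the seeded generator so runs are repeatable.
        "record_id": str(uuid.UUID(int=rng.getrandbits(128))),
        "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
        "date_of_birth": birth.isoformat(),
        "condition": rng.choice(CONDITIONS),
    }


def make_synthetic_dataset(count, seed=0):
    """Return `count` fake records; the same seed yields the same data set."""
    rng = random.Random(seed)
    return [make_synthetic_record(rng) for _ in range(count)]


if __name__ == "__main__":
    for record in make_synthetic_dataset(3, seed=42):
        print(record)
```

Seeding the generator keeps the data set reproducible, so a failing test can be rerun against identical records, and the `count` argument can be raised to approximate production volumes in a dedicated test environment.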
This was first published in September 2011