Performance testing is expensive. Hosting the application under test, designing the tests themselves, generating the load and managing the virtual users all have significant cost. Furthermore, since the results of any performance testing
In an ideal world, we would know that a certain percentage of users will follow Path A, the rest will follow Path B, and no user would be perverse enough to actually attempt Path C. But our users are always surprising us by the ways in which they put our software to use. If we build performance measurements into our ongoing testing and into the application itself, the cost of performance testing can be reduced or even eliminated.
This approach is not suitable if your application is going to experience an exceptional event. For instance, if your company has purchased its first Super Bowl advertisement, doing some performance testing in advance is likely a good investment. But if the load on your application can be described as normal within some wide boundaries, then this technique may save you a lot of expense.
Identify the interfaces
Every application has interfaces. Many have a User Interface (UI) for human beings to use. Many have an interface to a database, where the results of SQL statements are received. Many have an interface to a back-end server, where the results of system calls of some sort are received. For example, in a RESTful Web application, the UI code will issue various HTTP calls to a Web server to trigger certain functions in the application. The UI code receives the results of those calls and displays them to the user.
Regardless of the interface, it is always possible to measure how long it takes any given transaction to occur at that interface. If we create ongoing logs of the time that all of the transactions take in a production (or production-like) environment, we can mine the information in those logs to discover the particular calls to the particular interfaces that point to performance bottlenecks in our applications.
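As a rough illustration of mining such a log, here is a minimal Python sketch. The log line format ("METHOD PATH took N ms"), the function name slow_calls, and the threshold are all assumptions made for this example; a real server log will look different and the pattern would need to be adapted.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) log format: "... GET /orders took 12.5 ms"
LINE_RE = re.compile(
    r"(?P<method>[A-Z]+) (?P<path>\S+) took (?P<ms>\d+(?:\.\d+)?) ms"
)


def slow_calls(lines, threshold_ms):
    """Group the durations that exceed threshold_ms by (method, path)."""
    slow = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a timing line; ignore it
        ms = float(match.group("ms"))
        if ms > threshold_ms:
            slow[(match.group("method"), match.group("path"))].append(ms)
    return dict(slow)
```

Calling slow_calls(open("server.log"), threshold_ms=500) on a log in this format would return the calls worth investigating first, along with how slow each occurrence was. The file name here is also only a placeholder.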
Measure each interaction
To take the example of the RESTful architecture, it is very likely that all the functions that answer REST calls will share interface code, thus making it possible to implement a single timer for every call on the system, and to have the results of that timer posted to the server log. I have worked with such an application, and we would routinely scan the production server logs looking for REST calls that took longer than they should have. Whenever we found one, we tracked down and fixed the bottleneck behind it. This is a very effective strategy; over time, the performance profile of the application becomes quite uniform, and the code itself improves as we iron out the performance glitches.
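The single shared timer might look something like the following Python sketch. It is not the code from the application described above; the decorator name timed, the logger name, and the dispatch function standing in for the shared interface code are all hypothetical.

```python
import functools
import logging
import time

logger = logging.getLogger("rest.timing")  # hypothetical logger name


def timed(handler):
    """Wrap a handler so that every call logs its elapsed wall-clock time."""
    @functools.wraps(handler)
    def wrapper(method, path, *args, **kwargs):
        start = time.perf_counter()
        try:
            return handler(method, path, *args, **kwargs)
        finally:
            # Runs even if the handler raises, so failed calls are timed too.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            logger.info("%s %s took %.1f ms", method, path, elapsed_ms)
    return wrapper


@timed
def dispatch(method, path):
    """Stand-in for the shared entry point that answers every REST call."""
    time.sleep(0.01)  # simulated work
    return {"status": 200, "path": path}
```

Because the timer wraps the one shared entry point, no individual REST function needs to know it is being measured, and new functions are timed automatically.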
Another fine place to take measurements is at the UI. For example, the open-source Web test tool Watir (Web Application Testing in Ruby) contains a built-in timer that reports the time the browser takes to load the page as a result of each call issued by the tool. Again, it is a trivial matter to write each result to a log file, so that we can mine the results of our automated UI tests for examples where the page took too long to load. This is a particularly effective technique when checking the performance of the application against multiple browsers, since different browsers handle elements in pages differently.
Finally, it might be valuable to measure the amount of time each SQL statement takes. Because of the nature of relational databases, even queries that once were efficient may become less efficient as the database itself is altered over time.
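One way to capture per-statement timings is to route queries through a small wrapper. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for whatever database the application really uses; the helper name timed_execute and the list-based log are assumptions for illustration.

```python
import sqlite3
import time


def timed_execute(conn, sql, params=(), log=None):
    """Run one statement, fetch its rows, and record how long that took."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    if log is not None:
        # Store the statement text with its duration for later analysis.
        log.append((sql, elapsed_ms))
    return rows
```

Recording the statement text rather than the bound parameters keeps the log easy to group by query, which helps show when a once-efficient query starts to slow down as the data changes.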
The key to this sort of measurement is a commitment to analyzing the results and fixing the bottlenecks found in the production environment as part of the ongoing work of the development team. All the measurement in the world will not help if the team is not also dedicated to making the improvements indicated by undesirable performance measurements.
Performance measurement, not testing
An architecture that allows the convenient measurement of performance at each of its interfaces is likely to be better written and more maintainable than one that does not allow such measurement. So the first thing we gain by such measurement is a better architecture.
When we start measuring performance at interfaces, it is likely that we will find a number of bottlenecks. If we are conscientious about finding and fixing such bottlenecks when we discover them, then over time the performance profile at the interface evens out. Each call to each interface takes a relatively small amount of time, and each call takes about as much time as every other call; or else we know the reason why.
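A simple check of how even the profile has become could be sketched in Python as follows. The rule used here, flagging any call whose median time is more than three times the median across all calls, is an arbitrary assumption chosen for illustration, as are the function and parameter names.

```python
from statistics import median


def outlier_calls(durations_by_call, factor=3.0):
    """Return calls whose median time exceeds factor * the overall median."""
    # Median duration per call, skipping calls with no measurements.
    medians = {
        call: median(times)
        for call, times in durations_by_call.items()
        if times
    }
    if not medians:
        return {}
    baseline = median(medians.values())
    return {call: m for call, m in medians.items() if m > factor * baseline}
```

An empty result suggests that each call takes about as much time as every other call; anything returned is either a bottleneck to fix or a call whose cost we should be able to explain.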
And when we achieve that, when the performance profiles at our critical interfaces are uniformly in agreement with our expectations, and we have ironed out our bottlenecks in a production environment, we can be fairly sure that there will be no performance problems even if the load on the system rises significantly. We already handled our performance bottlenecks as they occurred, and we know our application is solid and robust.
And we know it without having to test it.
About the author: Chris McMahon is a software tester and former professional bass player. His
background in software testing is both deep and wide, having tested systems from mainframes to web
apps, from the deepest telecom layers and life-critical software to the frothiest eye candy. Chris
has been part of the greater public software testing community since about 2004, both writing about
the industry and contributing to open source projects like Watir, Selenium, and FreeBSD. His recent
work has been to start the process of prying software development from the cold, dead hands of
manufacturing and engineering into the warm light of artistic performance. A dedicated agile
telecommuter on distributed teams, Chris lives deep in the remote Four Corners area of the U.S.
Luckily, he has email: email@example.com.
This was first published in January 2010