In cartography and orienteering (the art of making maps, the horribly frustrating pastime of following maps), the rule of thumb is that the perfect map has a 1:1 scale. However, a 1:1 scale map simply isn't realistic: it's exactly as big as the area being covered! As the scale ratio grows, a map's accuracy drops.
In testing, especially performance testing, the more closely your test environment matches your production environment, the more accurate your benchmarking will be. However, this can become prohibitively expensive (many production systems cost hundreds of thousands of dollars). There are a few strategies available:
- Performance test on the production hardware at night: Roll out preproduction code during the lowest usage time, and run performance testing then. Not always practical, and it's a significant morale drain on the teams involved.
- Extrapolate: Some operations teams I have worked with have been extremely "formulaic" in their approach. They've assumed 50% of the resources will handle 50% of the load. Notice the huge error in thinking here: this assumes everything scales in a linear manner, which I've almost never seen in my career as a tester. However, it's still better than doing nothing!
- Combining test data with empirical data: If you're deploying to a load-balanced environment, but have no load balancing hardware, you can combine your test data (the scalability experienced on one server) with real-world observation (the scalability across two load-balanced servers). This is sketchy because it assumes a perfect load balance and ignores a whole host of other factors.
- Discover trends: Start at low loads and increase your load (throughput, request rate, etc.) incrementally. Pay close attention to response time as well as resource demand. As you slowly ramp up the load, you'll see the general trend: as you add users incrementally, does the response time increase linearly, in proportion to the load increase? Does it increase faster than the load? Does it increase exponentially faster? You should be able to follow the curve and arrive at a best guess as to the response time at high load. Note that this approach ignores the fact that, shortly before complete failure, a system begins to suffer exponential response-time increases. However, you can detect trends this way, and that's better than no data at all.
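The trend-discovery idea can be sketched in a few lines of Python. This is only one possible way to read the curve, not a standard method: it fits a straight line to the measurements on a log-log scale, so a slope near 1 suggests response time is growing in proportion to load, and a slope well above 1 suggests it is growing faster. The function names, the 10% tolerance, and the sample measurements are all assumptions made up for illustration.

```python
import math


def loglog_slope(loads, response_times):
    """Least-squares slope of log(response time) against log(load).

    A slope near 1 means response time grows roughly in proportion to load;
    a slope above 1 means it grows faster than the load does.
    """
    xs = [math.log(v) for v in loads]
    ys = [math.log(v) for v in response_times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def describe_trend(slope, tolerance=0.10):
    """Turn a slope into a rough verdict. The tolerance is an arbitrary choice."""
    if slope > 1 + tolerance:
        return "response time grows faster than load"
    if slope < 1 - tolerance:
        return "response time grows slower than load"
    return "response time grows roughly in proportion to load"


# Hypothetical measurements from an incremental ramp-up (users, seconds).
loads = [100, 200, 400, 800]
response_times = [0.2, 0.8, 3.2, 12.8]

slope = loglog_slope(loads, response_times)
print(round(slope, 2), "-", describe_trend(slope))
```

With these invented numbers the response time quadruples every time the load doubles, so the fitted slope comes out at about 2. As with the approach itself, a fit like this says nothing about the sharp degradation that shows up just before failure, so treat any extrapolation from it as a best guess.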
In my last organization, I found it kind of funny that our operations team (1) didn't know what MTTF (mean time to failure) testing was, and didn't trust the extrapolation that if we survived 500,000 transactions and averaged 100,000 transactions per day we could assume five days of uptime, but (2) would happily predict that a system which handles 500 concurrent users on one box will scale perfectly to 2,000 concurrent users on four boxes. In the end, though, many organizations take that approach. As a tester, your job is to (1) not believe it (or be very, very skeptical) and (2) do everything you can to prove it wrong. The best approach, obviously, is to have a perfect replica of your production environment in which to perform your testing. Short of that, you have to be creative and yet scientific about your testing.
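The two extrapolations in that story are simple arithmetic, and putting them side by side shows that both rest on an assumption. The sketch below uses the figures from the text; the `efficiency` parameter and its 0.8 value are purely hypothetical, added only to show how quickly the four-box prediction changes once scaling is less than perfect.

```python
def projected_uptime_days(transactions_survived, transactions_per_day):
    """MTTF-style extrapolation: days of uptime implied by a survived transaction count."""
    return transactions_survived / transactions_per_day


def projected_capacity(users_per_box, boxes, efficiency=1.0):
    """Concurrent users predicted for several boxes.

    efficiency=1.0 is the perfect-scaling assumption from the text.
    Any other value is a hypothetical stand-in for load-balancing and
    contention overhead, which has to be measured rather than guessed.
    """
    return users_per_box * boxes * efficiency


print(projected_uptime_days(500_000, 100_000))   # 5.0 days
print(projected_capacity(500, 4))                # 2000.0 users, perfect scaling
print(projected_capacity(500, 4, efficiency=0.8))  # 1600.0 users, made-up 80% efficiency
```

Neither number is a measurement. The point of testing is to find out what the real efficiency is, not to pick one.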
The final point to be made here is that you need to learn over time. Make your best attempt at testing, then when you go to production, monitor the actual live data. Learn how your assumptions in testing bear out in production. Then you can change your assumptions and, over time in a heuristic manner, your results will become more accurate.
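One rough way to make that feedback loop concrete is to record, for each release, what testing predicted and what production actually showed, then use the average ratio to adjust the next prediction. This is a sketch of the idea under my own assumptions, not an established technique; the function names and every number in it are invented for illustration.

```python
def correction_factor(predicted, observed):
    """Ratio of what production showed to what testing predicted."""
    return observed / predicted


def adjusted_prediction(new_prediction, history):
    """Scale a new test-based prediction by the average past correction factor.

    history is a list of (predicted, observed) pairs from earlier releases.
    """
    factors = [correction_factor(p, o) for p, o in history]
    average = sum(factors) / len(factors)
    return new_prediction * average


# Hypothetical history: predicted vs. observed response time in seconds.
history = [(1.0, 1.5), (2.0, 2.5)]

# Testing says 2.0 seconds for the next release; past releases ran slower than predicted.
print(adjusted_prediction(2.0, history))
```

A simple average treats every past release as equally relevant. If your environment or workload has changed a lot, older pairs may deserve less weight or none at all.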
This was first published in October 2008