I don't know what tool you're using, so I can't make any recommendations there, but I will say that it's been my experience that the problem with inaccurate reporting often isn't with the tools. Performance testing is some of the most difficult and challenging testing I've done. Rarely has it been the case that switching tools gave me "better" results.
Performance testing is so challenging because you need to understand the software you're testing, the technology being used, the physical hardware and network the software's deployed on, and the performance testing tool you're using. Perhaps the most difficult aspect of understanding each of these pieces is understanding how they are all configured.
Many tools have configuration settings for when you record your scripts and another set of configuration settings for when you play them back. This is normally where I start when I see what I believe are inaccurate reports. Sometimes a small setting change can make all the difference.
How you record or code your performance tests also makes a huge difference. Simply putting a timer in the wrong place, mislabeling a transaction, or inserting a think time in the wrong location can throw off your accuracy significantly. If your load model is unrealistic or unrepresentative or your test data is invalid, you're also likely to see results that won't look accurate.
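To illustrate how much timer placement matters, here is a minimal Python sketch. Everything in it is hypothetical: `transaction`, `fake_request`, and `think` are stand-ins I made up for whatever your tool calls its timers, requests, and think times, and the sleeps simply simulate a fast request and a longer user pause.

```python
import time
from contextlib import contextmanager


@contextmanager
def transaction(name, results):
    """Record the elapsed time of the enclosed block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results.setdefault(name, []).append(time.perf_counter() - start)


def fake_request(duration=0.02):
    """Stand-in for a real request (assumption: the server takes ~20 ms)."""
    time.sleep(duration)


def think(seconds=0.3):
    """Stand-in for user think time between steps."""
    time.sleep(seconds)


def misplaced_timer(results):
    # Think time sits INSIDE the timer, so it is counted as response time.
    with transaction("login", results):
        fake_request()
        think()


def well_placed_timer(results):
    # Only the request is timed; the pause happens after the timer stops.
    with transaction("login", results):
        fake_request()
    think()
```

Both functions do exactly the same work, but the first reports the "login" transaction as more than ten times slower than the second, purely because of where the timer ends.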
And that's all just focused on the tool. Network settings, web server settings, and application code can also affect your test results. Is your load balancer viewing your requests as a denial of service attack? Are your connections to your application server sticky? Do you need to geographically distribute your execution load to better reflect network latency? Factors like these could also account for what you're seeing.
I suggest starting with a couple of simple steps to help diagnose the issues. First, scale back your scripts to the simplest tests you could possibly have. Run those alone, at low loads, and tune and debug them until you feel you're getting accurate results. Work with all departments and roles (network operations, DBAs, architects, and so on) to understand what's happening in your load test environment when those tests are running. Look at every log file you can get your hands on. And explore the configuration settings you have available for each of the components in your architecture, including your test tool.
Second, if that doesn't work, get a different tool. Try JMeter, WebLOAD, OpenSTA, or some other free tool, or skip the tool entirely and simply write some code in the language of your choice that runs a simple test and logs the results. Then run a couple of head-to-head tests against your current tool. Again, use very simple tests and scenarios with low load. Remember, you're trying to debug and tune your tests. You're not load testing your application yet.
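If you go the write-it-yourself route, something along these lines is usually enough for a head-to-head comparison. This is only a sketch using the Python standard library; the function names, the default of two users with five requests each, and the CSV column layout are all my own choices, not anything your tool requires.

```python
import csv
import sys
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

FIELDS = ["user", "request", "status", "seconds", "error"]


def timed_get(url, timeout=10.0):
    """Issue one GET and return (status, elapsed_seconds, error_text)."""
    start = time.perf_counter()
    status, error = None, ""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()  # include the body download in the measurement
            status = resp.status
    except Exception as exc:  # log failures instead of hiding them
        error = repr(exc)
    return status, time.perf_counter() - start, error


def run_simple_test(url, users=2, requests_per_user=5, think_time=1.0,
                    fetch=timed_get):
    """Run a small, fixed load and return one result row per request."""
    def user_session(user_id):
        rows = []
        for i in range(requests_per_user):
            status, elapsed, error = fetch(url)
            rows.append({"user": user_id, "request": i, "status": status,
                         "seconds": round(elapsed, 4), "error": error})
            time.sleep(think_time)  # the pause is outside the timed region
        return rows

    with ThreadPoolExecutor(max_workers=users) as pool:
        sessions = pool.map(user_session, range(users))
        return [row for rows in sessions for row in rows]


def write_log(rows, out=sys.stdout):
    """Write the result rows as CSV so they can be compared with tool output."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

Calling `write_log(run_simple_test(url))` with a URL from your own test environment prints one CSV row per request. Failed requests are logged with the error text rather than dropped, which makes it easier to line these results up against what your tool reports for the same simple scenario.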
If you still can't get traction, call in a consultant for a day or two and see if they can help you isolate the issue. You might start with your tool vendor or with an independent consultant who has worked in a lot of different contexts. Whoever you bring in should have some experience with the particular tool you're using; background knowledge in your technology and/or industry will also help.
Once you've got some results you feel you can trust at low loads, slowly start ramping the load up and making the load test models more realistic. You need to find the balance of load, test/data realism, and script/model complexity for this to all work. At each step along the way, continue to validate your results. There are plenty of monitoring tools (many of them free) that can help you validate your load test tool results.
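The ramp-and-validate loop can be sketched roughly as follows. This is an illustration under my own assumptions: the 10% tolerance is arbitrary, comparing medians is just one reasonable choice, and the two input lists stand for response times (in seconds) from your load test tool and from an independent monitor covering the same run.

```python
from statistics import median


def ramp_steps(start_users, max_users, step):
    """Yield user counts for a gradual ramp, always ending at max_users."""
    users = start_users
    while users < max_users:
        yield users
        users += step
    yield max_users


def results_agree(tool_seconds, monitor_seconds, tolerance=0.10):
    """Compare median response times from the tool and from a monitor.

    Returns (agree, tool_median, monitor_median). Assumes both lists are
    non-empty and the monitor's median is greater than zero.
    """
    tool_mid = median(tool_seconds)
    monitor_mid = median(monitor_seconds)
    relative_gap = abs(tool_mid - monitor_mid) / monitor_mid
    return relative_gap <= tolerance, tool_mid, monitor_mid
```

The idea is to run the test at each user count from `ramp_steps`, check `results_agree` for that step, and stop ramping as soon as the two sources disagree, since that is the load level where your results stop being trustworthy.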
This was first published in September 2009