Is there a list of metrics that would constitute an excellent scorecard for "performance testing" projects in particular?
The right metrics depend on the project, but I can provide a few performance metrics examples that I have found valuable. I'm going to list them as performance monitoring metrics; to turn them into performance test metrics, you'll report the numbers as a set at different levels of load. Some of these can be pulled daily or continuously, while others require a human to enter a value in a spreadsheet.
When you store the data, don't just store the time, but also store the URL. If the URL contains account numbers or some such, you can remove those. That way, you can create a table for each generic URL or service that includes not just the average (mean), but also the median.
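A sketch of that bookkeeping, assuming samples arrive as (URL, seconds) pairs; the regex below is a hypothetical stand-in for whatever strips account numbers out of your URLs:

```python
import re
from collections import defaultdict
from statistics import mean, median

def generic_url(url):
    """Strip volatile path segments (e.g. account numbers) so
    requests group under one generic URL. This pattern is an
    assumption; adjust it to your own URL scheme."""
    return re.sub(r"/\d+", "/{id}", url)

def summarize(samples):
    """samples: iterable of (url, seconds) pairs.
    Returns {generic_url: (count, mean, median)}."""
    by_url = defaultdict(list)
    for url, seconds in samples:
        by_url[generic_url(url)].append(seconds)
    return {url: (len(ts), mean(ts), median(ts))
            for url, ts in by_url.items()}

stats = summarize([
    ("/account/12345/history", 0.4),
    ("/account/99999/history", 2.0),
    ("/account/12345/history", 0.6),
])
# All three samples collapse into one row for /account/{id}/history,
# where the mean (1.0s) and median (0.6s) tell different stories.
print(stats)
```

Note how the outlier pulls the mean well above the median; that gap is exactly why the table should carry both.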
You might also consider calculating the percentage of pages that are served in over a second, over five seconds, over ten seconds and so on. I would also add the number of occurrences in a given time period (last day, week, or since the last deploy to production).
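The threshold percentages fall out of the same samples; a minimal sketch, with the one-, five- and ten-second cutoffs above as defaults:

```python
def threshold_report(times, thresholds=(1, 5, 10)):
    """times: response times in seconds for some period.
    Returns, for each threshold, the percentage of requests
    that took longer than it."""
    total = len(times)
    return {t: 100.0 * sum(1 for s in times if s > t) / total
            for t in thresholds}

report = threshold_report([0.3, 0.8, 1.2, 6.0, 12.0])
# {1: 60.0, 5: 40.0, 10: 20.0} -- 60% over one second,
# 40% over five, 20% over ten
print(report)
```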
With that information you can build a sortable table; by sorting on the headers you can find the URLs that are slowest for a given population, see which of the most common URLs have performance problems, and so on. If the data is overwhelming, create several tables -- one for the most common URLs, one for the slowest URLs overall, one for the slowest outliers, etc.
Here's an example of one such table in the wild:
Performance metric examples
Time on server
Most servers can calculate the time from request to response. You can either write these to a database directly or write them to a log that will periodically need to be harvested and added to the database.
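If your server doesn't record this out of the box, the idea can be sketched as a timing wrapper around a request handler that appends to a log for later harvesting; the handler and the in-memory log here are hypothetical stand-ins:

```python
import time
from functools import wraps

REQUEST_LOG = []  # stand-in for a log file or database table

def timed(handler):
    """Record wall-clock time from request to response."""
    @wraps(handler)
    def wrapper(url, *args, **kwargs):
        start = time.monotonic()
        try:
            return handler(url, *args, **kwargs)
        finally:
            REQUEST_LOG.append((url, time.monotonic() - start))
    return wrapper

@timed
def handle(url):
    time.sleep(0.01)  # pretend to do real work
    return "200 OK"

handle("/account/123/history")
# REQUEST_LOG now holds ("/account/123/history", ~0.01)
```

Using `time.monotonic()` rather than `time.time()` keeps the measurement immune to clock adjustments mid-request.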
Full round trip
Have a computer, ideally outside the firewall, continuously run automated GUI checks with an open-source tool like Selenium or with commercial test automation software. As part of each check, write code to calculate the rendering time and store it in a database; you can pull and report that time just as you do time on server.
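Driving a real browser is beyond a short sketch, but the round-trip measurement itself is easy to illustrate with the standard library -- time a full HTTP fetch against a throwaway local server. Note that this captures download time only; a browser-driving tool such as Selenium is what adds rendering time on top:

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"<html><body>ok</body></html>")

    def log_message(self, *args):  # silence console noise
        pass

# Throwaway server on an ephemeral port, purely for the demo.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = "http://127.0.0.1:%d/" % server.server_port
start = time.monotonic()
body = urllib.request.urlopen(url).read()
round_trip = time.monotonic() - start  # full request-to-last-byte time
server.shutdown()
```

The `round_trip` value is what you would write to the database alongside the URL, exactly as with time on server.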
If you want your measurements to be a little more accurate, consider a browser plugin such as YSlow for Firefox. In addition to reporting accurate page download time, YSlow can report time for each file and graphic, and it can also analyze the way the webpage is displayed and suggest fixes to improve download speed.
Using a tool like YSlow, report the time to load key pages (see sidebar).
User pain scale
Ask real customers or a customer proxy to run through a few scenarios under simulated load. At various points, ask them to stop and express their frustration on the “pain scale” of 1 to 10. At the end, you can ask how likely they would be to use the application again or recommend it to a friend.
Internal system measures
These do not represent any end-user experience, but instead indicate how hard the system is working. Think about CPU and memory utilization. For disk, you can report access utilization and capacity utilization. Your network admin may be able to calculate network saturation. When any of these approach capacity, performance will suffer.
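Two of these measures can be sketched with only the Python standard library; real monitoring would lean on the OS's own tooling or a library such as psutil, and `os.getloadavg` is Unix-only:

```python
import os
import shutil

def disk_capacity_pct(path="/"):
    """Percentage of disk capacity in use at `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def load_per_cpu():
    """1-minute load average divided by CPU count -- a rough
    proxy for CPU saturation (Unix-only)."""
    one_min, _, _ = os.getloadavg()
    return one_min / (os.cpu_count() or 1)

print(f"disk: {disk_capacity_pct():.1f}% full, "
      f"load/CPU: {load_per_cpu():.2f}")
```

Sampled on a schedule and stored next to the response-time data, numbers like these show whether slow pages line up with a resource approaching capacity.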
Final thoughts on performance metrics
There is no "one right answer" for a performance dashboard; what works for one project and team might not work for another. My first question would be: Who is the customer? Engineers will want metrics to help tweak performance, managers will want to spot problems early and executives will just want to know that everything is OK. (Or, if not, who is working on it.)
The core issues with any dashboard are the audience, the problem to be solved, the effort to get the data and the data's value. Historically, I've found that the data that is most valuable to the customer is the hardest to gather: things like the pain scale are harder to come by but more directly useful. With automated measures, I have to be very careful that the data is actually capturing useful information.
There is a balance here between providing too much information and not enough. On one hand, too much information can make figuring out the status a bit like looking for a needle in a haystack, while too little runs the risk that you don't include something important and relevant.
When people ask for a dashboard, find out what they would do with the metrics -- what decisions those numbers would enable -- and look for questions that would support those decisions. That kind of thinking is called goal-question-metric (GQM), and it's how I like to decide which measures to use.
Do you have a question for Matt Heusser or any of our other experts? Let us know and we'll post your answer in a future response. Email the Editor@SearchSoftwareQuality.com or leave a question in the comments.
This was first published in April 2014