
An excellent scorecard means picking the right performance metric

Requirements expert Robin Goldsmith explains how and why a software performance metric must relate to the type of performance one considers important.

Is there a list of metrics that would constitute an excellent scorecard for performance testing projects, in particular?

Robin Goldsmith

This time of year, the term performance prompts images of Hollywood award shows. The show-biz analogy is relevant to testing because the acting awards that first come to mind are by no means entertainment's only form of measured performance. For example, entertainment performance also is measured in terms of a show's number of nominations, number of awards won, pre- and post-awards domestic and worldwide revenues and views by media type, box office ranking weeks, and production cost and completion time versus budget and schedule. Actors often are measured similarly; and while awards can increase ticket sales, great acting is rarely needed for commercial success.

A system or software project performance metric likewise should tie to the particular types of performance one considers important. Thus, business performance includes measures like a user's completion rate versus abandonment rate, which, in turn, needs to be traced to its causes. Too often, only a handful of simplistic infrastructure and technology causes are considered.

For example, response time is probably the most widely used performance metric. At its most basic, response time is the length of time from the moment the user clicks, taps or presses a button to the moment results are displayed on the screen. The most common response time measure is how long it takes a requested webpage to download. Users supposedly start abandoning webpage downloads after about eight seconds. Smartphone users expect their smaller screens to fill much quicker.

However, response time is not really that simple to measure. What constitutes a response? Must it be full display of all requested data, or is it sufficient just to start displaying some of the data? How about displaying an indicator that the request is in process?
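These distinctions can be made concrete by recording separate timestamps for each candidate definition of "response." The sketch below uses hypothetical names (they are illustrative, not from any standard tool) and the article's roughly eight-second abandonment figure:

```python
from dataclasses import dataclass

# Hypothetical measurement record -- field and method names are illustrative.
@dataclass
class ResponseTiming:
    request_sent: float    # the moment the user clicks, taps or presses
    first_byte: float      # something starts to display
    fully_rendered: float  # all requested data is on screen

    def time_to_first_display(self) -> float:
        """Time until the user sees anything at all."""
        return self.first_byte - self.request_sent

    def full_response_time(self) -> float:
        """Time until the complete result is displayed."""
        return self.fully_rendered - self.request_sent

# The article cites about eight seconds as when users start abandoning a page.
ABANDON_THRESHOLD_S = 8.0

timing = ResponseTiming(request_sent=0.0, first_byte=1.2, fully_rendered=6.5)
print(timing.time_to_first_display())  # 1.2
print(timing.full_response_time())     # 6.5
print(timing.full_response_time() < ABANDON_THRESHOLD_S)  # True
```

Which of the two numbers counts as "the" response time is exactly the definitional question raised above; a scorecard should state which one it reports.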

The performance of post-implementation help desk and maintenance support can make or break a system's success.

Additional factors often need to be accounted for; for instance, who the user is and whose website it is can make a big difference. I'll wait for pictures of my family to download, but not for pictures of your family. Casual interest tolerates less delay than strong interest in what is on the website. One will quickly abandon a slow website when a competitor offers something similar; when the slow site is the only source for something one really wants or needs, one sticks around longer.

Most performance testing is intended to demonstrate that performance is adequate under peak load. Relevant performance measures could include response time, throughput, cycle or transaction time from start to finish as well as resource consumption or damage. Whereas load testing typically measures performance for a short period of time, duration testing measures performance over an extended time period.
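One common way to state an "adequate under peak load" verdict is to summarize the response times captured across all simulated transactions, typically against a percentile rather than the mean, since a few slow outliers are exactly what drives abandonment. A minimal sketch, with the sample data invented for illustration:

```python
import statistics

def summarize_load_test(response_times_s: list[float]) -> dict:
    """Summarize response times captured during one load-test run.
    Reports the mean, the 95th percentile and the worst case."""
    ordered = sorted(response_times_s)
    p95_index = int(0.95 * (len(ordered) - 1))
    return {
        "count": len(ordered),
        "mean_s": statistics.mean(ordered),
        "p95_s": ordered[p95_index],
        "max_s": ordered[-1],
    }

# Invented sample: 20 simulated transactions observed under peak load.
samples = [0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.2, 0.9, 1.1, 4.9,
           0.8, 1.0, 1.4, 0.9, 1.2, 1.1, 0.8, 1.0, 7.5, 0.9]
summary = summarize_load_test(samples)
print(summary["count"])  # 20
print(summary["max_s"])  # 7.5
```

A duration test would apply the same summary repeatedly over hours or days, watching whether the percentiles creep upward as resources are consumed.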

Another critical variable is the nature of the load. Load usually refers to the number of simultaneous users or total users within a given time frame, but load also could refer to database, file or transmission size. It gets more complicated from there. The big determinant of performance testing effectiveness is the "operational profile" -- which means reflecting actual usage patterns. Factors that the profile must reflect include transaction mix and distribution characteristics, as well as device capabilities.

For instance, a transaction that enters and validates many data fields creates a much different load from one with lots of "think time," such as displaying an article that the user spends time reading. A major function of performance and load testing tools is creating the profiled load and capturing relevant performance measures, often from technical system internals "under the covers." Monitoring tools take the same measures repeatedly to detect performance degradation as it gradually occurs.
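Building the profiled load usually starts from a weighted transaction mix. The sketch below uses a hypothetical profile (the transaction names and weights are invented for illustration) to draw simulated transactions in realistic proportions:

```python
import random

# Hypothetical operational profile: each transaction type's share of real
# usage. The mix and weights are invented for illustration only.
PROFILE = {
    "browse_article": 0.60,  # mostly think time, light server load
    "search": 0.25,          # moderate server work
    "submit_form": 0.15,     # many fields entered and validated
}

def next_transaction(rng: random.Random) -> str:
    """Draw the next simulated transaction according to the profile."""
    names = list(PROFILE)
    weights = list(PROFILE.values())
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed so a run is reproducible
sample = [next_transaction(rng) for _ in range(10_000)]
print(sample.count("browse_article") / len(sample))  # roughly 0.60
```

Load tools essentially do this at scale, while also replaying each transaction's realistic data sizes, think times and device characteristics.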

Realize, too, that many additional nontechnical forms of performance are important to define, measure and test. While technical performance issues can affect usability, so can defects and other aspects that affect transaction times and user experience. Project managers are accustomed to measuring project performance in terms of meeting budget and schedule, yet few teams know to test the adequacy of the budget and schedule themselves. Additionally, the performance of post-implementation help desk and maintenance support can make or break a system's success. Even so, they seldom are tested. They can and should be.


Do you think your team's performance metrics are comprehensive enough? If not, what do you think is needed to get there? An excellent scorecard, or any metric, means picking metrics that are actionable.
I believe a latency metric is missing. Overall response time is latency plus server response time, where latency is the time required for the request to reach the server. High latency can indicate an environmental issue, as can a large difference between minimum and maximum response times. For more details on the main performance metrics, refer to the http://community.blazemeter.com/knowledgebase/articles/65153-the-load-reports guide
Thank you both for these excellent points, which actually are related. There are two important types of measures, those that tell you what result happened and those that help you understand the causes of why it happened. Latency is one such measure that helps you understand why the performance is what it is. Generally you need to accurately understand causes before meaningful action can be taken to improve the result.
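The two-term model discussed in this thread can be written out directly. A minimal sketch, assuming the simple decomposition overall response time = network latency + server processing time (a real breakdown has more components, such as DNS lookup, TLS setup and client rendering):

```python
def decompose_response_time(total_s: float, latency_s: float) -> dict:
    """Split a measured response time using the simple two-term model:
    total = network latency (request transit) + server processing time.
    This follows the decomposition described in the comment above."""
    server_s = total_s - latency_s
    return {
        "latency_s": latency_s,
        "server_s": server_s,
        "latency_share": latency_s / total_s,
    }

result = decompose_response_time(total_s=2.0, latency_s=1.5)
print(result["server_s"])       # 0.5
print(result["latency_share"])  # 0.75
```

A latency share this high points at the environment or the network path rather than the application, which is the kind of cause-level measure that makes the result actionable.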