How is software quality measured? It’s a tough question and one that has been debated throughout the 50-year history of software development. We talked to Capers Jones and Olivier Bonsignour, co-authors of the new book The Economics of Software Quality, to find out more about the metrics associated with software quality and hear about factors and techniques that their studies have found most beneficial to high software quality. This is part one of a three-part interview in which we explore many of the
SSQ: Your book starts by talking about the importance of software quality with quite a few statistics about defects and the high costs incurred when these defects occur. You also talk about the difficulty in defining software quality. Traditionally, QA organizations base a lot of their quality metrics on defects found. However, as you say, there are many attributes of quality, outside of “freedom from defects.” If you could give QA managers advice, what would you suggest would be the key metrics they should track for assessing quality?
Capers Jones/Olivier Bonsignour: For functional quality measuring defects by origin (requirements, design, code, user documents, and bad fixes) is a good start. Measuring defect removal efficiency (DRE) or the percentage of bugs found prior to release is expanding in use. Best in class companies approach 99% consistently. The average, unfortunately, is only about 85%.
As systems get larger, more complex and more distributed, it becomes important to measure Structural Quality in additional to Functional Quality. At a high level, Structural Quality attributes include Resiliency, Efficiency, Security and Maintainability of software. These quality attributes may not immediately result in defects, but they drive a great deal of unnecessary cost, they slow down enhancement and introduce systemic risk to IT-dependent enterprises.
Enterprises that build custom software for their businesses are becoming more adept at managing Structural Quality, but it’s still not a mature science. ISO has provided a high-level definition as part of the ISO 9126-3 and the subsequent ISO 25000:2005 but these norms cannot be directly applied to Structural Quality measurement. The Security domain is probably the most advanced one with the OWASP initiative. But there isn’t unfortunately a defined standard to measure the other Structural Quality characteristics. Hopefully initiatives such as the ones driven by the Consortium for IT Software Quality (CISQ) are going to pave the ground soon for an accepted definition of the key metrics to measure Structural Quality. Meanwhile my advice would be to use your common sense and first of all measure your adherence to known best practices. Thanks to the Internet, many of them have been widely discussed and exposed, and it’s quite easy nowadays to define a small set of rules applicable per type of application (the type of application being the combination of the technologies used, and the context of use of the application). For example, most IT applications are about managing data and a good portion of them are now relying on a RDBMS back-end. Every DBA on the planet knows that there are correct and incorrect ways to interact properly with a RDBMS. Yet there are still a lot of applications in production that do not interact properly. By tracking the adherence to a few rules related to the use of indexes, the structure of the SQL queries and the efficiency of the calls to the RDBMS, IT teams could avoid the most common pitfalls and enhance greatly the Structural Quality on the Performance axis.
SSQ: Table 1.5 in the first chapter of your book lists 121 software quality attributes and ranks them on a scale from +10 for extremely valuable attributes to -10 for attributes that have demonstrated extreme harm. How did you come up with these 121 attributes and how was their ranked value determined?
Jones/Bonsignour: The rankings come from observations in about 600 companies and 13,000 projects. Some of the more harmful attributes came from working as an expert witness in litigation where charges of poor quality were part of the case. The high-value methods were associated with projects in the top 10% of quality and productivity results.
SSQ: I notice that “Use of Agile methods” ranked a 9.00, “Use of hybrid methods” ranked a 9.00, but “use of waterfall methods” only ranked a 1.00. Why is this? Have there been studies to show that Agile (or hybrid) methods result in higher quality software than when the waterfall approach is used?
Jones/Bonsignour: The waterfall method has been troublesome for many years and correlates with high rates of creeping requirements and low levels of defect removal efficiency. Better methods include several flavors of Agile, the Rational Unified Process (RUP), and the Team Software Process (TSP). The term “hybrid” refers to the frequent customization of these methods and combining their best features.
To continue reading see:
This Q&A is based on The Economics of Software Quality (Jones/Bonsignour) Addison-Wesley Professional, and can be purchased by going to http://www.informit.com/title/0132582201
For a comprehensive resource on measuring quality, see Quality metrics: A guide to measuring software quality.