
How useful is code coverage?

In this first of a two-part series about code coverage, software consultant Mike Kelly explains code coverage and gives a specific example of how code coverage was calculated on a small program using the tool rcov for Ruby.

Most of the development teams I've worked with have used unit tests as a tool to provide feedback on the code they are working on. Unit tests give developers quick feedback on whether a particular piece of functionality might be done. They also help simplify the structure of the system, can help produce more testable code, and often provide the first line of defense against regressions. Just as unit tests provide feedback on the code, code coverage metrics provide feedback on those unit tests.

The central idea behind code coverage metrics is that you're trying to expose code that hasn't been tested yet. And while this is often used in conjunction with unit tests, I've also seen it used with more traditional GUI-level test automation and even manual testing. While code coverage doesn't tell you how good your testing is from a functional or data perspective - that is, it doesn't tell you if you wrote the "best" test cases for a particular piece of code - it's still useful in signaling where your tests might need more work.

Looking at code coverage for the first time

A few years ago, when I was first experimenting with test-driven development, I wrote the game Gobblet in Ruby. Gobblet is a strategy game; think of it as three-dimensional tic-tac-toe meets Connect Four. It was a fun project where I could experiment with a new development practice.

If I were explaining the code base to you, I might say things like:

  • It's a single Ruby script. It has a handful of classes: two classes that make up the player pieces, two classes that make up the board, and one for the game.
  • It's got around 460 lines of code. That gives you an idea of how big it is - it's small.
  • It currently has 16 unit tests with 245 assertions. That gives you some idea of the volume of tests I wrote. I like to think they were good tests, covering some edge conditions along with most happy-path scenarios.

It's small and simple and after over two years of playing it I've not encountered any major issues. And until today, I've never checked how much code coverage I was getting out of those tests. I was surprised when I ran my unit tests along with rcov (rcov is a code coverage tool for Ruby). Figure 1 below shows the results of my unit testing.

Figure 1: Code coverage example using rcov for Ruby.

Imagine my surprise when I saw 63% pop up in my browser. Where was the 90% coverage I had imagined I'd have? All my illusions of my well-tested code went right out the window. Clearly I hadn't done as much testing as I had thought I'd done. Or had I? Let's dig a little deeper...

With most tools (and rcov is no exception) you can drill into the detail behind the numbers, although the level of detail and complexity varies with the tool you're using. Rcov only reports which lines of code were executed, so its output is relatively simple to read. As you'll see in Figure 2 below, a line of code that was not executed is highlighted in red.

Figure 2: Lines of code not executed shown in red highlighting.
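
To make the line-level arithmetic concrete, here is a minimal sketch in Ruby of how a line-coverage percentage can be derived from per-line execution counts. It is not rcov's implementation. The method names line_coverage_percent and unexecuted_lines and the sample hits array are hypothetical, and the use of nil for lines with no executable code (blank lines, comments) is an assumption borrowed from the layout of Ruby's standard Coverage library.

```ruby
# Hypothetical per-line execution counts for a seven-line file.
# nil = not executable (blank line or comment), 0 = never run, n > 0 = run n times.
hits = [nil, 3, 3, 0, 0, nil, 1]

# Percentage of executable lines that ran at least once.
def line_coverage_percent(hits)
  executable = hits.compact
  return 100.0 if executable.empty?
  executed = executable.count { |n| n > 0 }
  (executed * 100.0 / executable.size).round(1)
end

# 1-based numbers of the lines a report would highlight in red.
def unexecuted_lines(hits)
  hits.each_index.select { |i| hits[i] == 0 }.map { |i| i + 1 }
end

puts line_coverage_percent(hits)     # 3 of the 5 executable lines ran
puts unexecuted_lines(hits).inspect  # lines that never ran
```

Note that a line counts as covered as soon as it runs once. The count says nothing about which values passed through it.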

These are the first lines of code in my file that weren't tested. And that's depressing because if I had been doing test-driven development well, they would have been. But, as I said, it was my first try. I believe I added some of the error checking after I had written the initial functionality. I (apparently) didn't write any tests for that added functionality.

If I continue to scan the results, the pattern of small blocks of error-checking code not getting tested repeats itself several times. That is, until I hit a large block of untested code (over 100 lines, or roughly 20% of the file) at the end of my program. It then hits me that I have no tests for any of the code that renders the screen. That's not something I could really write unit tests for.

So while I didn't have all the test coverage I had hoped for, it wasn't as bad as the numbers initially led me to believe. Even if I tested every line I could, I likely couldn't get this code coverage over 80% using just unit tests. Thus, in this simple example, we've illustrated one of the major issues in measuring code coverage.
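
The 80% ceiling follows from simple arithmetic on the figures quoted earlier. The sketch below uses the rounded numbers from the text (about 460 lines in total, a little over 100 of them screen-rendering code) and assumes every line counts toward the total. Rcov's own line counting may differ, so treat the result as an estimate; the names coverage_ceiling, total_lines and rendering_lines are illustrative.

```ruby
total_lines     = 460  # approximate size of the script, from the text
rendering_lines = 100  # screen-rendering code unit tests can't reach (the text says "over 100")

# Best case: every line except the unreachable code is executed by a unit test.
def coverage_ceiling(total, unreachable)
  ((total - unreachable) * 100.0 / total).round(1)
end

puts coverage_ceiling(total_lines, rendering_lines)  # about 78, under the 80% mark
```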

How useful is code coverage as a measure?

As we saw in the example above, one issue with code coverage is that sometimes you can't test all the code, and other times you may not want to. Some examples of things that few teams write explicit tests for include getters and setters and code that's automatically generated by an IDE. Other things might just be difficult to test, like code that renders a GUI, private and protected methods, code with real-time logic, or code that requires complicated mocks.

Another caution around coverage numbers is that the fact that something is covered by a test doesn't mean it's covered by a good test. Consider the classic illustration of this problem in Listing 1 below.

def div(a, b)
  # Integer division: raises ZeroDivisionError when b is 0
  a / b
end

Listing 1: A divide-by-zero scenario in Ruby.

In this example, you might have a test that uses the values 4 and 2 for the parameters a and b. Of course, if you run that test, you'll get complete code coverage of the div method and likely won't uncover any issues. If that's the case, you may have hit a coverage goal, but you will have missed the obvious divide-by-zero scenario. The same thing plays out in all sorts of scenarios: how you test matters as much as what you test.
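
A short Ruby sketch shows the gap. The div method is the one from Listing 1; the helper check is hypothetical and stands in for a unit-test assertion. The first call executes every line of div and passes, so a line-coverage tool would report the method as fully covered, yet the second input still raises an error.

```ruby
# The method from Listing 1.
def div(a, b)
  a / b
end

# Hypothetical stand-in for a unit-test assertion.
def check(description, expected)
  actual = yield
  puts "#{actual == expected ? 'PASS' : 'FAIL'}: #{description}"
rescue ZeroDivisionError => e
  puts "ERROR: #{description} (#{e.class})"
end

# Executes every line of div, so line coverage is already complete.
check("4 / 2", 2) { div(4, 2) }

# Adds no new line coverage, but exercises the case the first test missed.
check("4 / 0", nil) { div(4, 0) }
```

Both calls touch exactly the same lines, so the coverage report is identical before and after the second check is added.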

When I look at my Gobblet code from the example above, I can see several places where I should have written tests - places where I have logic that isn't tested anywhere else. But I also see places where I've tested the same five-line method eight or nine times, because I'm concerned about the different scenarios of data that might go through that method. For me, code coverage provides a heuristic that makes me look at the code I'm working with and ask, "If you don't have a test here yet, do you want one?"

In that way, code coverage is a useful metric, and you'll likely benefit from using it. But you'll want to exercise a little caution in how you interpret the results. You'll find some great articles in the references at the end of part two, Code coverage: Beyond the basics, which go into more detail on the dangers of code coverage.

For more on measuring quality, see Quality metrics: A guide to measuring software quality.
