Applying the scientific method to software testing

Software testing and scientific testing have commonalities. Learn why applying the scientific method to testing software applications is beneficial.

Christin Wiedemann holds a doctorate in astroparticle physics from the University of Stockholm in Sweden. She says that her deep roots in scientific exploration color her approach to everything else. After graduating from university, Wiedemann started working in the software industry as an application developer, but found the work was less interesting in the business sector than it had been in university.

When she was called in to help with QA on a colleague's project, Wiedemann had an "aha moment." Testing software is a lot like doing scientific experimentation. She's had a passion for ensuring software quality ever since. This week, at the Software Test Professionals Conference (STP Con) in Phoenix, Wiedemann will share her thoughts on using scientific principles to further software quality. Here's a preview of the tips she'll be giving.

How does the scientific method apply to software testing?

Christin Wiedemann: Software testing can always benefit from a more structured approach. The scientific method isn't really one set of methods, but a larger set of guiding principles. It's about knowledge. Scientists want to find out how the world works; software testers want to know how the software they're testing works. Those two missions have a lot in common.

The scientific method is based on observation and experimentation. Testing is the same thing. We set up tests that are very much like experiments, and then we run them and observe what happens. That's the same way scientists test their hypotheses. We run experiments, measure the results and analyze the data to figure out what's really happening.

What are some of the important concepts that transfer well from the science lab to the software testing lab?

Wiedemann: Experimental scientists realize we can't prove a hypothesis is absolutely true. We can only show where a hypothesis is wrong. Empirical falsifiability means that an idea can be proven wrong through experiments. Testing is very similar in that we can't prove the software is flawless; we can only find ways to make the app fail through testing.

If you ask a business manager how much to test the software, they'll probably tell you to test everything. Good testers let them know we can't test everything. It would take an infinite number of tests to get at every possible scenario. We can only look for conditions under which software fails.

If tests find no failures, we can have more confidence that the software is going to work, but we're still never completely sure. After many failed attempts to disprove a hypothesis, scientists build up confidence in that hypothesis. It gives their theories credibility. Software testers are really doing the same thing.
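
As an editorial illustration (not part of the interview), the asymmetry between disproving and proving can be sketched in a few lines of Python. The function `buggy_abs`, its hidden defect at -7, and the helper `try_to_falsify` are all hypothetical names invented for this sketch.

```python
def buggy_abs(x: int) -> int:
    """Hypothetical function under test: wrong for exactly one input."""
    if x == -7:
        return x  # the hidden defect
    return x if x >= 0 else -x


def try_to_falsify(func, inputs):
    """Return the first input that breaks the hypothesis
    'func never returns a negative number', or None if none is found."""
    for x in inputs:
        if func(x) < 0:
            return x
    return None


# A small sample that happens to miss the defect: no failure is found,
# which raises confidence but does not prove the function is correct.
sample = [-3, -1, 0, 2, 5]
print(try_to_falsify(buggy_abs, sample))          # None

# A wider sweep finds one counterexample, and one is enough to
# disprove the hypothesis.
print(try_to_falsify(buggy_abs, range(-10, 11)))  # -7
```

A passing run only says that these particular inputs did not expose a failure; a single failing input settles the question in the other direction.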

So how do software quality engineers improve the credibility of their tests and results?

Wiedemann: Peer review is an important part of a scientist's credibility, and of science in general. It's almost an official part of the scientific method. Peers can be very critical -- not in the sense of giving negative feedback, but in terms of questioning your answers.

The biggest part of critical thinking is questioning your premises. I like to say, 'It's about questioning answers, not about answering questions.' Peer reviews will question your assumptions and question the results you've found. Not to put them down, but to see if they can be made better -- and that should really be the goal for all of us.

This is definitely something testers can do more of. Having our work reviewed by our peers increases testers' credibility and the credibility of our tests. It's not about pointing out each other's mistakes. Different people bring different biases and perspectives, which increase the value and credibility of the testing. Evaluating information this way helps eliminate biases, distortions and assumptions. It's also a great way to learn from each other.

Can you explain the difference between induction and deduction?

Wiedemann: As humans, we have a lot of different thought processes. We use induction, deduction, abstraction and others constantly -- not just in testing, but in our everyday lives. It's important to understand precisely what we're doing when we think about software quality because we think this way all the time, but we're not always right.

Both induction and deduction are ways to come to conclusions based on prior knowledge. Deduction is when we make a specific conclusion from general knowledge. If I know ten out of the last ten mobile applications I've tested have a particular networking issue, then I might assume that the next mobile application I test will have the same networking issue. That's working from the premise that all mobile apps have this flaw, so my current mobile app must have it too. If that premise is true, then yes -- the current mobile app must have that flaw. The problem is that my premise might actually be false.

Induction is when we make a general conclusion from specific knowledge. So if I've tested a single mobile application and it happened to have a specific error, then I might reason that all mobile applications have this flaw. In this case I know for a fact that my premise is true -- I know this one mobile application is flawed because I tested it. However, my conclusion doesn't actually follow as a necessity of my premise. One defective mobile application doesn't mean that all mobile applications are defective. It's important for software testers to pay attention to the way we think, so we can improve our critical thinking.
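
As an editorial illustration (not part of the interview), the two reasoning steps can be sketched in Python. The app names and flaw flags are made-up data, and `induce_all_flawed` and `deduce_is_flawed` are hypothetical helpers written only for this sketch.

```python
# Hypothetical data: True means the app showed the networking flaw.
tested_apps = {"app_a": True, "app_b": True, "app_c": True}
untested_apps = {"app_d": False}


def induce_all_flawed(observations):
    """Induction: generalize from specific observations to a rule.
    The observations are true, but the rule does not necessarily follow."""
    return all(observations.values())


def deduce_is_flawed(rule_all_flawed):
    """Deduction: apply a general rule to one specific case.
    The step is valid, but the result is only as good as the rule."""
    return rule_all_flawed


rule = induce_all_flawed(tested_apps)   # every tested app was flawed
prediction = deduce_is_flawed(rule)     # so the next app "must" be flawed
actual = untested_apps["app_d"]         # but this one is not

print(rule, prediction, actual)         # True True False
```

The deductive step is sound in form, yet its prediction is wrong because the premise came from an inductive generalization that one counterexample overturns.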
