This is an excellent question. I'm willing to bet that many people struggle with this, or something very similar to it. I know I have. The short answer to this is you can't. Unless the algorithm is trivial, you can never be sure that you've covered all the possible meaningful tests.
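A rough calculation shows why exhaustive coverage is out of reach even for a small interface. The sketch below is illustrative only: the function signature (two 32-bit integer inputs) and the test rate (one billion tests per second) are assumptions chosen for the arithmetic, not figures from any real project.

```python
# Back-of-the-envelope: how long would exhaustive testing take?
# Assumptions (illustrative only): a function of two 32-bit integers,
# and a harness that runs one billion tests per second.

INPUT_BITS = 32
NUM_INPUTS = 2
TESTS_PER_SECOND = 1_000_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Every distinct pair of inputs is one test case.
total_cases = 2 ** (INPUT_BITS * NUM_INPUTS)
years = total_cases / TESTS_PER_SECOND / SECONDS_PER_YEAR

print(f"{total_cases:,} input combinations")
print(f"about {years:,.0f} years at {TESTS_PER_SECOND:,} tests per second")
```

Under these assumptions the run would take close to six centuries, and that is before considering paths, timing, or state.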
You're not only dealing with a problem of documentation and the organization of that documentation. From your question, it sounds like you're also dealing with the problem of complete test coverage -- which is an impossible problem. Cem Kaner has done a lot of work on this topic. You can see him talk about it here or read about it here. The following is a great summary from the TestingEducation.org Web site:
"It's impossible to fully test a program. There are too many inputs, too many combinations, too many paths, too many places where too many types of interrupts can impact the execution of the program, too many ways the program can be used, and too many interesting ways the program can fail. In the face of an infinitely large testing task, we have to treat with skepticism the statements that some people make that the testing project must always do this or always deliver that. In the face of an infinitely large task, everything is a tradeoff -- work spent on one task is work not allocated to another. The proper allocation of resources to tasks and deliverables has to be a function of the information objectives of the project at hand.
"Several metrics appear to check how close we are to completeness of testing, and thereby define testing completeness. Coverage measures are an example; so are defect arrival rate probability models. If 'complete testing' means that there are no remaining unknown bugs, then these approaches do not measure completeness of testing. In addition, there are predictable risks of using these metrics. Coverage measurement has value, but not as an indicator of how close we are to completion of testing."
Does that mean you assume that it's hopeless? What can you do? Give up? Well, it's not quite that bad. You can still be successful, just accept that you may need to rely on things not specified in a software requirements specification.
Collaborate with the people who know the algorithms
If it's possible to collaborate with those who know the algorithms, that's what you may need to do. Here are three experiences I've had where I've had to do this:
- Rating algorithms for insurance policies
- Interactive 3-D modeling algorithms for chemical compound analysis
- Algorithms for real-time monitoring and issue identification, prioritization, and alerting in a multi-state power grid
Now, I like to think I'm a smart guy and that I can figure out just about any problem. And that might be true, but all three of those experiences happened in the same year. It's just not an option for me, someone who specializes in software testing, to take the time to develop deep domain knowledge for applications like those where the subject matter and the math may not be intuitive.
On each of those projects I needed to work closely with subject matter experts. Sometimes I was an order taker, executing the tests they designed. Other times I was a facilitator, talking them through the problem to help them design different types of tests based on my concept of software risk.
Here are some questions to ask to possibly help start the dialogue around collaboration:
- Are there specific tests you would like me to run for regression purposes? (This question might tell you about their concept of risk. What are they worried about breaking?)
- Are there things that you don't have time to test that you wish you did? Are there specific things you're worried about with the implementation? (Another question focused on understanding risk.)
- Can you walk me through how you design your test cases? What documents do you look at? What data do you reference? (This can give you an idea of the inputs that they use to design their test cases. It can tell you what they use to inform their test case design so you can better inform yours.)
- How would you know if a calculation was correct? How do you check your answers? (This can give you an idea of what oracles they might use when they test.)
- At what level do you interact with the algorithm? Do you test one algorithm at a time or many algorithms at a time? Or do you know? (This can help you understand the level of complexity in their testing in terms of visibility into the specific algorithms. You can better understand if they are testing one factor at a time, or many factors at a time.)
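The difference between testing one factor at a time and many factors at a time, raised in the last question above, can be made concrete with a small sketch. The rating factors and their values below are hypothetical, invented only to show how the number of test cases grows under each strategy.

```python
from itertools import product

# Hypothetical rating factors; names and values are illustrative only.
FACTORS = {
    "age_band": ["18-25", "26-40", "41-65", "65+"],
    "vehicle_type": ["sedan", "truck", "motorcycle"],
    "region": ["urban", "suburban", "rural"],
}


def one_factor_at_a_time(factors):
    """Vary one factor from a baseline while holding the others fixed."""
    baseline = {name: values[0] for name, values in factors.items()}
    cases = [baseline]
    for name, values in factors.items():
        for value in values[1:]:
            cases.append({**baseline, name: value})
    return cases


def all_combinations(factors):
    """Every combination of factor values (the full cross product)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


ofat_cases = one_factor_at_a_time(FACTORS)
full_cases = all_combinations(FACTORS)
print(len(ofat_cases), "one-factor-at-a-time cases")
print(len(full_cases), "all-combination cases")
```

With these made-up factors, the one-factor-at-a-time set has 8 cases and the full cross product has 36. The smaller set is cheaper to run but cannot reveal failures that only appear when two factors interact, which is why it helps to know which strategy the subject matter expert is using.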
Look at different types of coverage (requirements, data, scenarios, code and so forth)
Another thing you can do is focus on different types of coverage. Using the input you get from the subject matter expert you're partnering with, you may be able to envision new and interesting types of coverage. From your question, it sounds like you look at requirements-based coverage in your testing, but I've found that to be a relatively weak form of coverage for complex applications.
Is anyone testing based on complex scenarios? Is anyone looking at code coverage? Is anyone looking at the data processed or at representative data from production? Is anyone thinking about timing and state issues and testing for those? Who's looking at non-functional requirements like performance and maintainability?
Sometimes, when I just can't find a way to add value to the testing that a subject matter expert might be doing, I'll focus on those areas of testing where I can add value. For me, that means developing several different dimensions of coverage and focusing on making those deeper and richer in the scope of what they test.
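As one example of a coverage dimension beyond requirements, the data coverage idea mentioned above can be sketched as a comparison between the kinds of values seen in production and the kinds of values the tests exercise. Everything here is an assumption made for illustration: the `policy_age_years` input, the class boundaries, and the sample values are all hypothetical.

```python
from collections import Counter


def classify(policy_age_years):
    """Map a value to a hypothetical equivalence class (boundaries are made up)."""
    if policy_age_years < 0:
        return "invalid"
    if policy_age_years < 1:
        return "new"
    if policy_age_years < 5:
        return "established"
    return "long_term"


ALL_CLASSES = {"invalid", "new", "established", "long_term"}


def coverage_report(production_values, test_values):
    """For each class, report its production frequency and whether any test hits it."""
    seen_in_production = Counter(classify(v) for v in production_values)
    covered_by_tests = {classify(v) for v in test_values}
    return {
        cls: {
            "production_count": seen_in_production[cls],
            "tested": cls in covered_by_tests,
        }
        for cls in sorted(ALL_CLASSES)
    }


production_sample = [0.5, 2, 3, 7, 12, 0.2, 4]  # made-up sample
test_inputs = [0.5, 3]                          # made-up existing tests
report = coverage_report(production_sample, test_inputs)
untested = [
    cls for cls, row in report.items()
    if row["production_count"] and not row["tested"]
]
print(untested)  # classes that occur in production but have no test
```

In this made-up sample the report flags `long_term` as a class that appears in production but is not exercised by any test, which is the kind of gap a requirements-only view can miss.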
Look for multiple oracles
Of course, the more types of coverage you have, the more you need oracles. And if the only person who knows if it "worked" or not is the subject matter expert, then you have your work cut out for you. For some products you can get your hands on a spreadsheet that a subject matter expert created. That might work. Other times you might be lucky enough to have a parallel application. That might work. And if you're really lucky, you'll have a manual (or stack of eight four-inch binders) that walks you through the algorithm so you could execute it by hand. If you have the patience, that might work.
Having multiple oracles allows you to support multiple types of coverage. It also better enables you to develop insights into the algorithms being tested. With strong oracles, over time it will be easier for you to design and develop your own tests without having to rely on someone else.
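A minimal sketch of checking one implementation against more than one oracle might look like the following. It assumes two of the oracle types described above: values taken from a subject matter expert's spreadsheet and a parallel application. All function names, formulas, and numbers are hypothetical stand-ins, not a real rating algorithm.

```python
import math


def premium_under_test(base_rate, risk_factor):
    """Stand-in for the implementation being tested (hypothetical)."""
    return round(base_rate * (1 + risk_factor), 2)


def premium_parallel(base_rate, risk_factor):
    """Stand-in for a parallel application used as a second oracle (hypothetical)."""
    return round(base_rate + base_rate * risk_factor, 2)


# Stand-in for rows exported from a subject matter expert's spreadsheet:
# (base_rate, risk_factor) -> expected premium. Values are made up.
SPREADSHEET_ORACLE = {
    (100.0, 0.10): 110.00,
    (250.0, 0.25): 312.50,
    (80.0, 0.00): 80.00,
}


def check_against_oracles(inputs, tolerance=0.01):
    """Return a list of (inputs, oracle_name, expected, actual) disagreements."""
    disagreements = []
    for base_rate, risk_factor in inputs:
        actual = premium_under_test(base_rate, risk_factor)
        # The parallel oracle answers for any input; the spreadsheet only
        # covers the rows the expert happened to record.
        oracles = {"parallel": premium_parallel(base_rate, risk_factor)}
        if (base_rate, risk_factor) in SPREADSHEET_ORACLE:
            oracles["spreadsheet"] = SPREADSHEET_ORACLE[(base_rate, risk_factor)]
        for name, expected in oracles.items():
            if not math.isclose(actual, expected, abs_tol=tolerance):
                disagreements.append(
                    ((base_rate, risk_factor), name, expected, actual)
                )
    return disagreements


cases = [(100.0, 0.10), (250.0, 0.25), (80.0, 0.00), (500.0, 0.05)]
print(check_against_oracles(cases))
```

Recording which oracle disagreed matters: a mismatch against the spreadsheet alone points at a different kind of problem than a mismatch against the parallel application, and the last case shows how a second oracle extends coverage to inputs the spreadsheet never recorded.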
I remember a story recently told to me by a former CAD developer. At his company they had two mathematicians in the back of the office working on incredibly complex calculus. When an equation was complete someone would walk it out to the developers, drop it on their desk, and they would implement it to the best of their ability. At times the developers would look at one another in confusion. All of them took calculus in college, but had no idea what these equations were doing.
How did they test? They would send it back to the mathematicians. Was that all the testing they did? Nope. How successful was their software? Very.
Sometimes we just don't have a choice. Many of the systems we develop are incredibly complex. That's one reason why testing is important. But different testers can add value in different ways. A business user or a subject matter expert may add value to a project in one way, and an expert tester may add value in another. Understanding the best way to leverage the expertise of both is often a difficult task.