
Four tips for effective software testing


Define expected software testing results independently


When you run a test, you enter inputs or conditions. (Conditions are a form of inputs that, in production, ordinarily are not explicitly entered, such as time of year. Part of running a test often involves additional actions to create the conditions.) The system under test acts on the inputs or conditions and produces actual results. Results include displayed textual or graphical data, signals, noises, control of devices, database content changes, transmissions, printing, changes of state, links, etc.

But actual results are only half the story for effective software testing. What makes the execution a test, rather than production, is that we get the actual results so we can determine whether the software is working correctly. To tell, we compare the actual results to expected software testing results, which are our definition of software testing correctness.

If I run a test and get actual results but have not defined expected software testing results, what do I tend to presume? Unless the actual results are somehow so outlandish that I can't help but realize they are wrong, such as when the system blows up, I'm almost certain to assume that the expected results are whatever I got for actual results, regardless of whether the actual results are correct.

When expected software testing results are not defined adequately, it is often impossible for the tester to ascertain accurately whether the actual results are right or wrong. Consider how many tests are defined in a manner similar to, "Try this function and see if it works properly." "Works properly" is a conclusion, but not a specific-enough expected result on which to base said conclusion. Yet testers often somewhat blindly take for granted that they can guess needed inputs or conditions and corresponding actual results.

For a test to be effective, we must define software testing correctness (expected software testing results) independently of actual results so the actual results do not unduly influence definition of the expected results. As a practical matter, we also need to define the expected results before obtaining the actual results, or our determination of the expected results probably will be influenced by the actual results. In addition, we need to document the expected results in a form that is not subject to subsequent conscious or unconscious manipulation.
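To make this concrete, here is a minimal sketch in Python. The function and the discount rule are hypothetical stand-ins, not anything from the article; the point is only the difference between expected results worked out from the specification before the test runs and "expected" results captured from whatever the code happens to produce.

```python
# Minimal sketch: "order_total" is a hypothetical system under test, and the
# assumed specification is "orders of $100 or more get a 10% discount."

def order_total(subtotal: float) -> float:
    """Stand-in implementation of the behavior being tested."""
    discount = 0.10 if subtotal >= 100 else 0.0
    return round(subtotal * (1 - discount), 2)

def test_expected_results_defined_from_spec():
    # Expected results written down from the requirement BEFORE the code
    # runs, so the actual results cannot influence them.
    assert order_total(99.00) == 99.00    # below threshold: no discount
    assert order_total(100.00) == 90.00   # at threshold: 10% off
    assert order_total(250.00) == 225.00  # above threshold: 10% off

def test_expected_results_copied_from_actuals():
    # Anti-pattern the article warns about: the "expected" value is just
    # whatever the code produced, so this check can never fail.
    expected = order_total(100.00)
    assert order_total(100.00) == expected
```

If the implementation mistakenly applied the discount only above the threshold, the first test would fail and expose it, while the second would still pass, because its expectation was manufactured from the actual result.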


Join the conversation


Are you guilty of defining your expected software testing results based on actual results?

Actually, I've seen this happen with performance tests. Because performance is often considered very late in a project, we run a test, see some results, maybe determine they aren't good enough, and then try to improve versus the 'benchmark'.

The problem, though, is that after spending all this time trying to define the expected results, what if the expectation changes drastically between the time you write your test case and the time you execute it against the new feature?

I think the author is correct that it can be easy to fall into the trap of assessing the result of a test as simply 'OK' without critically analyzing it. This was a trap I often fell into when I was a new tester. However, there are many results that may pass an 'expected result' definition yet fail other oracles.

For example, if a text alert has been defined in the requirements as taking the form "<Last Name>, <First Name> is an invalid user", we may appropriately define our expected result to match that requirement. If that is the actual result, are we done with the test? What if other similar alerts display the name in a first name first format?

I do not think simply creating an expected result prior to running a test is a sufficient replacement for critically analyzing the result against a wide variety of oracles that can assist in determining the appropriateness of the actual result.
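As a rough Python sketch of that situation (both alert-building functions here are hypothetical stand-ins): a check that encodes the expected result taken verbatim from the requirement passes, while a second, consistency-based oracle still flags the name-order mismatch.

```python
# Rough sketch: both alert functions are hypothetical stand-ins.

def invalid_user_alert(first: str, last: str) -> str:
    return f"{last}, {first} is an invalid user"             # last name first

def unknown_account_alert(first: str, last: str) -> str:
    return f"{first} {last} has no account on this system"   # first name first

def test_matches_documented_requirement():
    # Expected result taken verbatim from the requirement: this passes.
    assert invalid_user_alert("Ada", "Lovelace") == "Lovelace, Ada is an invalid user"

def test_name_order_consistent_across_alerts():
    # A different oracle: similar alerts should present the name the same way.
    # This fails, exposing the inconsistency the requirement-based check missed.
    name_in_invalid = invalid_user_alert("Ada", "Lovelace").split(" is ")[0]
    name_in_unknown = unknown_account_alert("Ada", "Lovelace").split(" has ")[0]
    assert name_in_invalid == name_in_unknown
```

Neither check replaces the other; each encodes a different expectation about the same actual result.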

This can be avoided by not making test steps and expected results overly prescriptive. For example, when I say:
1. Double-click on the text box labelled "user name".
2. The field should become editable.
3. Enter "username1".
4. Double-click on the text box labelled "Password".
...
These kinds of test cases should be avoided; the better option is "please enter the required credentials and log in."
The latter does not restrict the user to one specific interaction; they could single-click, or use the Tab key to reach the next field. With the rigid steps, testers might end up with something that resembles the expected results but is not what is actually desired.

Yep! I have to admit I am guilty of this. It's something I'm aware of, though, so that awareness helps me to take a step back and try to think more objectively.

Like Veretax, we also struggle with defining acceptable performance. A lot of the time we will simply decide to wait and see how it does. This probably leads to being more lenient with the actual results.

Thanks for all your comments.

@Veretax, please rethink yours and consider how quickly you would have jumped on it had I said such a thing. You're essentially suggesting spending typically great effort performing performance tests without taking the minimal effort needed to be able to tell whether performance is adequate. The main reason expectations seem to change is because they weren't identified adequately in the first place. If they do change, act accordingly.

@CarolBrands and @swarnaj, I think you're confusing your test case writing procedure with identifying expected results. Expected results are what you expect, whether or not it's written in an "expected results" field. An oracle is a way of defining expected results. The only questions are how valid and reliable they are.

@swarnaj, the examples you cite all produce actual results that you will compare to expected results, which will be problematic because they're not clear (or possibly even correct).

@Abuell, whether you recognize it or not, "waiting to see how it does" is your version of expected results. Such expectations are just not very reliable.

"When expected software testing results are not defined adequately, it is often impossible for the tester to ascertain accurately whether the actual results are right or wrong. Consider how many tests are defined in a manner similar to, "Try this function and see if it works properly." "Works properly" is a conclusion, but not a specific-enough expected result on which to base said conclusion. Yet testers often somewhat blindly take for granted that they can guess needed inputs or conditions and corresponding actual results."


Why take it blindly? Use oracles! See http://www.developsense.com/blog/2015/03/oracles-are-about-problems-not-correctness/

@agareev, thank you for both your comments. An "oracle" is merely the means by which one ascertains whether a result is acceptable or not. Essentially, the oracle is the way one determines expected results. The oracle does not determine what conditions you examine, only how you assess acceptability of the conditions you do examine. Merely referring to one's process as an oracle does not itself change the suitability of the process followed. An oracle process which produces unreliable assessments is not desirable and could be downright dangerous. One cannot help being influenced to think actual results must be right when the oracle is not applied until the actual results have been obtained.

"An 'oracle' is merely the means by which one ascertains whether a result is acceptable or not." I don't think that's quite correct. As @agareev's link points out, oracles are about problems, not correctness. All oracles are heuristic and, in that respect, they can tell us that a result is not acceptable, but they cannot tell us that a result is acceptable. That would be like saying that not detecting a failure proves that there are no failures.

mcorum, thank you for your post. Wikipedia says for Oracle (software testing): "In computing, software testers and software engineers can use an oracle as a mechanism for determining whether a test has passed or failed. The use of oracles involves comparing the output(s) of the system under test, for a given test-case input, to the output(s) that the oracle determines that product should have." That sure sounds to me like expected results, which could be very precise or include degrees of variation. Merely saying you use an oracle does not imbue correctness. It seems to me that the term "oracle" often is used in an effete or pretentious manner to make an essentially simple concept sound grandiose or above questioning. Little seems served beyond further attempted aggrandizement from trying to make it even more esoteric.

I'm not in the habit of using Wikipedia as an authoritative source. I suspect that the contributors of that page are unfamiliar with the work of Elaine Weyuker (On Testing Nontestable Programs, 1980) and Doug Hoffman, which shows the fallacies associated with the concept of a complete oracle and suggests instead that we have only heuristic oracles, which can tell us if something failed, but not whether it passed. There's nothing effete, pretentious, or grandiose about an oracle. It's just a mechanism by which we recognize a problem. That sounds pretty simple and unassuming to me.

Guilty...? I'm not sure that's quite the descriptor I'd use. Yes, I have looked at test results and gone back to adjust, correct and change. While my field rarely needs frequent or complex testing, we work hard to pass the test whether that's a formal test or client feedback. Then we return to our product and adjust. That would seem to be the whole point of testing in the first place. At least for us it is.

@RobinGoldsmith -

"Essentially, the oracle is the way one determines expected results."
Nope. An oracle (there are always multiple applicable oracles) is a way one evaluates actual results.

Furthermore, all oracles are heuristics.

This is a very simplistic view of testing, which relies heavily on checking. However, there are many possible things a tester could discover that won't fit into the neat 'expected'/'actual' style.

I think it's fair to treat tests like scientific experiments. You have a hypothesis (which is hopefully that the software will meet the requirements) and you define it with expected results. You run the experiment (which is your test) and you get an actual result as experimental data. From there you can confirm your hypothesis is correct or present a new hypothesis (which explains how a bug functions) to the developers.

Oh, in a perfect world, I'd always have my expected results all ready to go before beginning any testing!

Unfortunately, in my real world, testing can sometimes get really fuzzy. Sometimes, no one can tell me what an application is "supposed" to do. And there is no documentation. And whoever wrote it is long gone. That's when the best that we can do is talk with users, compare old behavior to any new behavior, and decide if the new behavior is acceptable. I consider myself an advocate for the users, so if a behavior is acceptable to them, then it is acceptable to me.

Whatever the tester discovers that is of value relates to an evaluation of it, which in turn means that the tester is comparing what they discovered to what they expected. Just because the tester is not conscious of their expectation does not change the fact that it exists; but it does dramatically reduce the likelihood it is appropriate. The more one guesses about what to test and how to tell whether the actual results are correct, the more likely both are going to be wrong. Similarly, even though it's often hard to recognize, expected results are especially likely to be wrong when they are influenced by actual results rather than derived independently of the actual results.

So I think what Robin's driving at is that the expected results don't have to be handed down from the software designers; they just have to be explicitly stated before you start testing. If you don't have the expected results already when you get the software to test, then your first step is to do the research Abuell is talking about. You look at older versions of the software, talk with users about what they need, and figure out how to define your expected results so you know, when you run the tests, whether the software is working right or not.
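One way to put that into practice is an approval-style workflow, sketched below in Python. The legacy_report and new_report functions are hypothetical stand-ins: capture the old version's behavior, review it with users, commit the approved file as the expected result, and only then compare the new version against it.

```python
import json
from pathlib import Path

# Sketch only: legacy_report and new_report are hypothetical stand-ins for
# the same output produced by the old and the new version of the system.

APPROVED = Path("approved_report.json")

def legacy_report() -> dict:
    return {"rows": 42, "total": 1234.56}   # placeholder old-version output

def new_report() -> dict:
    return {"rows": 42, "total": 1234.56}   # placeholder new-version output

def capture_expected_from_legacy() -> None:
    # Step 1, done once and reviewed with users before the new version is
    # tested: the committed file becomes the documented expected result.
    APPROVED.write_text(json.dumps(legacy_report(), indent=2, sort_keys=True))

def test_new_version_matches_approved_expected() -> None:
    # Step 2: the expected result comes from the reviewed file, defined
    # before and independently of the new version's actual output.
    expected = json.loads(APPROVED.read_text())
    assert new_report() == expected
```

The review step matters: the old behavior is only a starting point, and anything users agree should change gets edited into the approved file before testing the new version, not after seeing what it produces.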

Experimental testing seems too scattershot to provide any reasonable level of quality.

Like a car, a building or a machine, professional software is developed in 3 phases:
1) Define what the thing is supposed to do (gather requirements).
2) Define how the thing is supposed to work (develop the design).
3) Build the thing (write the code).

Like a car, a building or a machine, professional software is tested in 3 phases:
1) check that the thing works (unit testing)
2) check that the thing works the way it is supposed to (verification testing)
3) check that the thing does what it is supposed to do (validation testing).

There is a way to shortcut this a bit by realizing that, with software, unit and verification testing are conceptually identical. If you can prove the correctness of the source code (which after a few deterministic steps is what the end user receives), you know the application works. All that is left is checking to see that the application does the right work.
Computation Logic Verification can prove correctness of the source code, leaving only validation of the requirements.
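As a small illustration of that verification-versus-validation split, here is a Python sketch with a hypothetical price calculation. The runtime assertions only stand in for what a proof tool would establish for all inputs; the point is that code can verifiably match its design and still fail validation if the requirement said something slightly different.

```python
# Hypothetical example. Assumed design: "apply the discount rate to the
# line (unit price times quantity), then round the total to cents."

def line_total(unit_price: float, quantity: int, discount_rate: float) -> float:
    assert 0.0 <= discount_rate <= 1.0 and quantity >= 0   # design precondition
    total = round(unit_price * quantity * (1.0 - discount_rate), 2)
    assert total >= 0.0                                     # design postcondition
    return total

# Verification: the code does what the design says (spot-checked here;
# a verification tool would aim to prove it for every input).
assert line_total(10.00, 3, 0.10) == 27.00

# Validation: does the design satisfy the requirement? Suppose the actual
# requirement was "round each discounted unit price to cents BEFORE
# multiplying by quantity." The expected result derived from that
# requirement differs from what the correctly implemented design gives:
expected_from_requirement = round(round(0.333, 2) * 3, 2)    # 0.99
actual_from_verified_code = line_total(0.333, 3, 0.0)        # 1.00
print(expected_from_requirement, actual_from_verified_code)  # 0.99 1.0
```

In the article's terms, the verification check and the validation check each need their own independently defined expected results.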

So I'll push back on this.

Why is it that treating software testing like experimentation and observation is such a bad thing? Scientists often do this, not having any idea what the outcomes are. They may have a hoped-for outcome, but it really isn't a guarantee of success. I feel that there are many ways where things cannot be expected, and that by focusing too much on the expected we lose sight of other aspects of various features.

What happens if we find problems when we were just exploring and learning the software? What if we are actively testing something, and discover something interesting but aren't sure whether it was a problem or not and have to ask someone else?

Much of testing is about making observations and then later forming ideas around those (much like experiment design in science). If you have an expected result, it might be good to question where that expectation comes from.

I recommend this article to all testing practitioners who want to have a good laugh. And as an example of not learning history.


For education, better look at this: http://2009.stpcon.com/pdf/Hoffman_Why_Tests_Don't_Pass.pdf


"The SUT doesn’t really pass or fail
– “Pass” means we noticed only expected behaviors
– “Fail” means we noticed something that needs investigation"

@agareev, thank you for both your comments. I've known Doug Hoffman for many years, have the highest regard for him, and recently spent considerable time with him at the STP Conference where we both were speaking.

An "oracle" is merely the means by which one ascertains whether a result is acceptable or not. Essentially, the oracle is the way one determines expected results. The oracle does not determine what conditions you examine, only how you assess acceptability of the conditions you do examine. Nor does the oracle define how to respond when actual results do not match expected results. Further investigation is a generic response that seems reasonable. In many if not most instances, such further investigation is done by the developer fixing the detected bug. Doug's presentation dealt with generating numerous additional conditions to examine, but acceptability of results for each condition must be assessed based on its specifics.

Merely referring to one's process as an oracle does not itself change the suitability of the process followed. I think Doug would readily agree that an oracle process which produces unreliable assessments is not desirable and could be downright dangerous. One cannot help being influenced to think actual results must be right when the oracle is not applied until the actual results have been obtained.

@RobinGoldsmith -

I see a very elaborate response, but I hardly understand what point you're making.

All oracles are heuristics. The outcome will vary based on human skill, experience, and context. All testing methods and principles produce assessments that are reliable only to some extent. One doesn't need expected results at all in order to test and to find important problems.
