

Is artificial intelligence in software testing coming to you?

Is the machine software tester coming to make your life even more stressful? Expert Matt Heusser explains where AI and testing intersect, and reveals why you don't need to worry about it.

With classic test tooling, we tell the computer to follow a series of steps, and then check the results against some expectations defined upfront. But will there ever be a role for artificial intelligence in software testing, aka a machine software tester?


Imagine testing a mortgage calculator, not with a half dozen predefined examples, but with randomly selected valid data. That is, select a random interest rate from 0% to 5%, a random loan amount, a random loan term and so on. Then, you code another algorithm, called the oracle, which predicts what the answer should be. Run the software and compare its answer with the oracle's; if the two disagree, you may have found a problem, either in the software or in the oracle. That approach to test tooling is pretty well-defined; it is a simple example of what some call model-driven testing, which can expand to do things such as taking random walks through an application, providing random data at every step and predicting what the results should look like. Run those tests overnight and they can find some interesting bugs.
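To make the mortgage example concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `monthly_payment` stands in for the software under test, `oracle_payment` is a deliberately different implementation that simulates the loan month by month and searches for the payment that brings the balance to zero, and the loan ranges and one-cent tolerance are arbitrary choices.

```python
import random


def monthly_payment(principal, annual_rate_pct, years):
    """Hypothetical software under test: closed-form amortization formula."""
    n = years * 12
    r = annual_rate_pct / 100 / 12
    if r == 0:
        return principal / n
    return principal * r / (1 - (1 + r) ** -n)


def oracle_payment(principal, annual_rate_pct, years):
    """Oracle: find the payment by bisection on a simulated loan balance."""
    n = years * 12
    r = annual_rate_pct / 100 / 12

    def balance_after(payment):
        balance = principal
        for _ in range(n):
            balance = balance * (1 + r) - payment
        return balance

    # A payment of 0 never repays the loan; principal * (1 + r) repays it
    # in the first month, so the true payment lies between the two.
    low, high = 0.0, principal * (1 + r)
    for _ in range(60):
        mid = (low + high) / 2
        if balance_after(mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def run_random_tests(trials=100, seed=0, tolerance=0.01):
    """Compare the software with the oracle on randomly chosen valid loans."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        principal = rng.randint(10_000, 1_000_000)
        rate = round(rng.uniform(0.0, 5.0), 3)  # 0% to 5%, as in the text
        years = rng.choice([10, 15, 20, 30])
        actual = monthly_payment(principal, rate, years)
        expected = oracle_payment(principal, rate, years)
        if abs(actual - expected) > tolerance:
            mismatches.append((principal, rate, years, actual, expected))
    return mismatches
```

Calling `run_random_tests()` returns a list of the inputs on which the two implementations disagree. An empty list does not prove the calculator is right, only that the two independent implementations agree on the sampled inputs.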

It's tempting to call that artificial intelligence, but if you think about it, the computer isn't really learning. The application is following predefined rules. Terms like artificial intelligence and machine learning imply that the computer discovers the rules, or, perhaps, creates its own rules. With machine learning, the software can look at a thousand examples -- or a million -- and create its own oracle. Could this be like a machine software tester?

Here's a simple enough example: When you search for "software testing" on Google, you don't have a way to know if the algorithm is correct. You don't know, for example, if the pages at the top are the most relevant or have the most authority, how the software makes the tradeoff, and you certainly don't know how Google takes into account your location, search history and what you have clicked on in the past to improve your results. Yet, if the results were all about college exam preparation, or if a search for "The Beatles" returned "one of 25 results," you'd know that something was wrong. Your very life experience is an oracle of sorts.

With artificial intelligence -- in software testing or not -- we feed the computer massive data sets, along with some judgments about each piece of data, then let the computer try to figure out connections. Paul Graham, for example, once suggested a Bayesian filter for email, where humans first identify thousands of emails as spam or not spam, and feed that information into the computer. With each new document labeled as spam or not spam, an inductive algorithm tries to figure out what the spam has in common, and to predict whether incoming email is spam. Gmail takes a broadly similar approach, with the added benefit of a "report spam" button that provides more information for the filter. If one person presses the button accidentally, no harm done, but a copy-paste spammer -- even one that injects some randomness -- can be thwarted in minutes if the recipients use that spam button.
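The idea of learning from labeled examples can be sketched with a small naive Bayes classifier. This is a textbook variant with add-one smoothing, not Graham's exact formula and not a description of how any real mail service works; the class name, method names and labels are made up for the illustration, and the filter assumes it has seen at least one example of each label before it is asked for a prediction.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy word-count classifier trained on messages labeled 'spam' or 'ham'."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labeled message; a 'report spam' click would call this."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        """Estimate the probability that a message is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        # Words never seen in training carry no evidence, so skip them.
        words = [w for w in text.lower().split() if w in vocab]
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in words:
                # Add-one smoothing keeps a zero count from vetoing a label.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
            log_scores[label] = score
        # Turn the two log scores into a probability without overflow.
        top = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - top)
        ham = math.exp(log_scores["ham"] - top)
        return spam / (spam + ham)
```

Each additional call to `train` shifts the word counts, which is why many recipients flagging the same message can change the filter's verdict quickly.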

Now, sit back and think for a minute about the potential of artificial intelligence in software testing. You could train your application to recognize problems.

Training your application

Web crawlers and link checkers go through your entire website looking for 404 errors. Model-based software can recognize a crash -- a page with some text, such as "error in ./ Application" listed. Imagine training your software in a different way -- to find things that just don't look right. For example, tab order is complex. Typically, it moves left to right, then top to bottom, but there may be visual indicators, such as grouping boxes, which make the rules different. Imagine software that sat on top of your browser, watching your every move. When something is wrong, you type a keystroke combination and click the screen element that has a problem, which is reported back up to a database. Thousands of people are doing this, all at the same time. Eventually, the computer learns to recognize things that look odd. Once that's possible, we can combine machine learning about fields (for example, fields named "first_name" have this common set of valid inputs: John, Michelle, Sarah, Robert) with the model-driven techniques that take random walks through an application. Then we can have an army of machines with inductive expertise testing our software overnight.

If that seems like a bit of a dream, it probably is. The software doesn't exist yet.
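The learning half of that picture is speculative, but the random-walk half is ordinary model-driven testing and can be sketched today. In the Python sketch below, the `SITE` dictionary is a hypothetical stand-in for a real application (page name, page text, reachable pages), and the crash markers are invented examples; a real tool would drive a browser and read the rendered page instead.

```python
import random

# Hypothetical application model: page name -> (page text, reachable pages).
SITE = {
    "home": ("Welcome", ["search", "profile"]),
    "search": ("Results", ["home", "detail"]),
    "detail": ("Item details", ["search", "checkout"]),
    "profile": ("Your profile", ["home"]),
    "checkout": ("Server error in application", ["home"]),
}

# Invented examples of text that suggests a crash page.
ERROR_MARKERS = ("server error", "exception")


def looks_like_crash(text):
    """Predefined rule: flag a page whose text contains a known error marker."""
    lowered = text.lower()
    return any(marker in lowered for marker in ERROR_MARKERS)


def random_walk(site, start, steps, seed=None):
    """Follow random links and return the path taken to each crash page seen."""
    rng = random.Random(seed)
    page = start
    path = [page]
    failures = []
    for _ in range(steps):
        _, links = site[page]
        page = rng.choice(links)
        path.append(page)
        if looks_like_crash(site[page][0]):
            # Keep the whole path so a human can reproduce the failure.
            failures.append(list(path))
    return failures
```

Recording the full path matters as much as detecting the crash: an overnight run is only useful if someone can replay the steps in the morning.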

We don't have to wait for this ideal machine learning, though, to take the idea of artificial intelligence in software testing forward. Visual testing is a process where you record a test, then rerun that test on a new build. Each morning -- or whenever -- the testers can quickly move through differences, using a tool to verify them. They can mark each change as an error, something that needs to return to a previous state, or as a new feature, which becomes the new standard. Most visual test tools allow their users to train the software to ignore fields that change all the time -- automatically generated date fields -- or to only focus on things that shouldn't change.
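At its core, the comparison step of a visual test is a pixel diff with some regions masked out. The Python sketch below assumes screenshots are already available as equal-sized grids of pixel values, and it represents the "ignore" training as a list of rectangles; both the function name and the rectangle format are assumptions for illustration, not the interface of any particular visual testing tool.

```python
def diff_pixels(baseline, candidate, ignore_regions=()):
    """Return (row, col) positions that differ outside the ignored rectangles.

    Each ignore region is (top, left, bottom, right), with bottom and right
    exclusive, e.g. the box around an automatically generated date field.
    """
    same_shape = len(baseline) == len(candidate) and all(
        len(a) == len(b) for a, b in zip(baseline, candidate)
    )
    if not same_shape:
        raise ValueError("screenshots must have the same dimensions")

    def ignored(row, col):
        return any(
            top <= row < bottom and left <= col < right
            for top, left, bottom, right in ignore_regions
        )

    return [
        (r, c)
        for r, row in enumerate(baseline)
        for c, pixel in enumerate(row)
        if pixel != candidate[r][c] and not ignored(r, c)
    ]
```

An empty result means the new build matches the baseline everywhere the tester cares about. A non-empty result is what the tester reviews each morning, accepting it as the new standard or filing it as an error.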

All programming is essentially creating change, and these visual tools are offering change detection. That might seem redundant, telling the computer, "Yes, that change is what we expected," but it also provides a very fast way to review any visual changes -- not just the preplanned expected results so common in classic test tooling.

Between expected results in a traditional tool and visual inspection results, what's left for a tester to do?


First of all, don't worry too much. The newspaper was supposed to go away when the radio came -- and a hundred years later, my tiny town of 5,000 still has a functioning newspaper.

Second, don't worry too much. Even with a machine software tester, visual inspection tools still need a human to run them. The machine learning that will automatically find issues will find general website problems, such as crashes and text that bleeds into other text on a screen. The software won't have subject-matter expertise; it won't understand how multiline discounting works on a bill of materials, and it won't understand the non-Windows-standard user interface decisions your company has decided are standard. And, of course, artificial intelligence in software testing doesn't exist yet. Most of the successful machine learning projects in testing today look more like using a programming language, such as R, to analyze a set of errors in production logs and figure out what behavior is driving those errors.

So, yes, take a look at visual testing tools. Take a look at programming in R. See if visual inspection tools make sense to augment the work, to push faster or better.

The future isn't going to be decided by one choice -- human testing or machine software testers. It will be more about humans and machines.

Join the conversation


What keeps you up at night when you think about artificial intelligence and the role it may play in software testing?
The key motivation of the universe is happiness. AI will help humanity to move in that direction.
There’s really nothing about the application of AI to software testing that keeps me up. Even if there are advances that make machine learning or AI better suited for software testing, there will always be a need to have a set of human eyes that are attached to a human brain focused on the program.
What keeps me up is irresponsible enthusiasm.
Scientists have someone else (typically, government) funding their thirst for experiments, but they do it in a lab. Even with safety protocols, things sometimes go south.
But nowadays it seems like humanity is willing to experiment in its own crib. Oh, well. Maybe, after exterminating humans, Skynet will at least solve the global warming disaster.
What was called AI years ago is now common usage, e.g. databases, neural networks, Watson... Machines are good at many things. Humans are good at others. We will continue to blend and advance. Testers doing heavy scripted tests without thinking should change, as machines, models and robots are capable of doing those things, which I personally find boring. Real creativity likely won't be taken over by machines anytime soon.
1. It will write the same type of code automatically.
2. Software development and software testing will become more scriptless.
3. Manual testing will remain, because automated testing cannot give much more assurance.
4. Automation script developers will still be needed, because automation script development depends entirely on the application being present.
5. There will be many more product companies making the tools (development/automation).
6. If industrialization keeps going, joblessness will not be a problem, because areas for improvement and R&D will keep increasing.
Good article that does a nice job explaining the Oracle problem in the context of “AI.” I did get a laugh when I read the sentence, “You could train your application to recognize problems.” It made me think it’s hard enough to train people to recognize problems, let alone my application.
OK, let's give those machines a lesson of good old human Weinbergian testing.

"You could train your application to recognize problems."

Train. Attempt training. In some context. Given funds. Given time.
Your application. Special software (if you have it).
To recognize. Machines still totally lack awareness, so cognition is not a valid term. To detect. And it's not for sure, so - to attempt detecting.
Problems. Machines don't understand what a "problem" is. A mechanized definition of a problem is not a problem - it's a model. Patterns that may indicate the presence of some problem of a certain, pre-defined class.

Putting it all together.

Someone, in some context, with some constraints, may attempt training specialized software to attempt detecting some patterns that may indicate the presence of some problem of a certain, pre-defined class or classes.
Interesting thoughts. Totally agree, it's difficult enough to figure out how to guide and teach a person to test software. We've got quite a ways to go before we can successfully teach a computer to test. 
When I thought about developing AI for software testing, I decided to use static software metrics (e.g., code complexity) as attributes to train the AI with, and to apply Google's PageRank algorithm to the dependency graph from npm as a stand-in for quality (popularity = quality in open source). If it all worked, the same static metrics would be collected from any given project and the AI would predict its overall quality; enough to produce a badge to be displayed on a dashboard. My laptop isn't powerful enough to do the work in a reasonable time :(