

How do I tackle machine learning in software testing?

Machine learning is the next big thing, and software testers are just now being asked to tackle this new type of software. Expert Gerie Owen offers on-point advice.

Help! I've been assigned to test software that learns, and I don't know what that means.

Welcome to the world of machine learning in software testing. Machine-learning software takes past data and uses that data to understand and make decisions in a problem domain. It consists of a series of mathematical algorithms that adjust themselves based on their understanding of that data. It won't produce an exact answer, but it will usually produce one that is close enough to correct for its problem domain.
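To make "adjusting based on data" concrete, here is a minimal sketch of learning from past observations: fitting a single weight by gradient descent. Everything here (the data, the learning rate) is an illustrative assumption, not part of the original answer; the point is that the learned parameter converges toward the right value rather than being computed exactly.

```python
# Past observations of the relationship y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0  # model parameter, adjusted from data rather than programmed
for _ in range(200):
    for x, y in data:
        error = w * x - y
        w -= 0.01 * error * x  # nudge w to reduce the error on this sample

print(w)  # converges close to 2.0
```

The algorithm is never told the answer is 2; it arrives near it by repeatedly correcting itself against the data, which is the behavior the answer above describes.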

This type of software usually relies on a technology called neural networks, which, put simply, mimic the operation of the human brain. Other technologies exist, such as genetic algorithms and rule-based systems, but most deep-learning systems use neural networks.
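A neural network is built from many units like the one sketched below: a weighted sum of inputs passed through a nonlinear activation, loosely mimicking a brain cell that fires when its combined stimulus is strong enough. The specific weights and inputs here are made up for illustration.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum + sigmoid activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # squashes output into (0, 1)

out = neuron([0.5, 0.8], [1.2, -0.4], 0.1)
print(out)  # a value strictly between 0 and 1
```

A network wires thousands of these together in layers, and training adjusts the weights and biases.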

Machine learning in software testing requires an entirely different approach. You will rarely, if ever, get the same result twice from the same input. Testing these systems requires a deep understanding of the problem domain and the ability to quantify the results you need in that domain. Are your results "good enough"? You have to internalize that a bug is more than just an unexpected output.
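In practice, this means tests assert tolerances and statistical properties rather than exact values. The sketch below assumes a hypothetical nondeterministic `predict` stub standing in for the real model; the tolerances are the kind of "good enough" thresholds a tester would define from the problem domain.

```python
import random

def predict(x):
    # Hypothetical stand-in for the model under test: the same input
    # does not always produce the same output.
    return 2.0 * x + random.gauss(0, 0.05)

# Run the same input many times and judge the distribution of results,
# not any single value.
results = [predict(3.0) for _ in range(100)]
mean = sum(results) / len(results)

assert abs(mean - 6.0) < 0.1, "mean prediction drifted out of tolerance"
assert all(abs(r - 6.0) < 0.5 for r in results), "an outlier exceeded the domain tolerance"
```

Choosing those tolerance values is exactly the "quantify the results you need" work the answer describes.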

For machine learning in software testing, you should also have a high-level understanding of the learning architecture. You don't have to read the code, but you do have to be aware of the architecture of your network and how the algorithms interact with one another. You might have to tell the developers that they have to toss out their approach and start over again. Don't let the highly mathematical nature scare you. Machine learning in software testing is accessible to all testers with an open mind.

The machine-learning revolution is just starting; if you haven't encountered it by now, you likely will in the near future. With machine learning in software testing, you need to be comfortable with being able to measure and quantify your testing and objectively explain your confidence in the results.


Join the conversation



How do you approach machine learning in software testing in your organization?
This depends greatly on the type of data your organization is trying to adapt to. The main idea is to create fake data sets that should produce a predictable response from the machine-learning algorithm. For example, if the program is supposed to extract food-oriented marketing information, you would create fake usage profiles for people ranging from vegans to sugar junkies.
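The commenter's idea can be sketched as follows. The profile fields, the `make_profile` generator, and the `classify` rule are all hypothetical stand-ins for the system under test; the pattern is what matters: synthetic inputs built so the expected response is known in advance.

```python
import random

def make_profile(diet):
    """Generate a fake usage profile with a predictable expected label."""
    if diet == "vegan":
        return {"sugar_purchases": random.randint(0, 2),
                "produce_purchases": random.randint(20, 40)}
    else:  # "sugar junkie"
        return {"sugar_purchases": random.randint(20, 40),
                "produce_purchases": random.randint(0, 2)}

def classify(profile):
    # Hypothetical stand-in for the ML component being tested.
    return ("vegan" if profile["produce_purchases"] > profile["sugar_purchases"]
            else "sugar junkie")

# Feed many fake profiles in and check the response matches the
# behavior the fake data was designed to provoke.
for diet in ("vegan", "sugar junkie"):
    for _ in range(50):
        assert classify(make_profile(diet)) == diet
```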
Test the training data and algorithm more than the output
First of all, all programmers must know there is no such thing as machine learning. A more realistic definition is that a program can be made adaptive to different types of data.
So, for testing, there are three areas to concentrate on.

First: can the software recognize different data types? You have to look at the program, determine what types of data it is supposed to detect, and then create extreme-to-borderline data sets to see whether they are recognized.

Second: is the correct adaptation path activated when that data set is encountered?

Third: what did the programmers ignore or forget? You have to imagine what other data sets are possible and see what the software does or does not do. This is the most challenging part. The biggest problem with neural nets is that the point is speed with large data sets, so programmers often make decisions based on early data samples, ignoring the fact that later data could be entirely different.
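The three checks above can be sketched as tests against a hypothetical `recognize` function standing in for the system's data-type detection; the function, labels, and path table are illustrative assumptions.

```python
def recognize(values):
    """Stand-in for the system's data-type detection."""
    if all(isinstance(v, (int, float)) for v in values):
        return "numeric"
    if all(isinstance(v, str) for v in values):
        return "text"
    return "unknown"

# Check 1: extreme and borderline data sets are still recognized.
assert recognize([1e308, -1e308, 0]) == "numeric"
assert recognize([""]) == "text"

# Check 2: the matching adaptation path fires for each recognized type
# (here a simple dispatch table selects the path).
paths = {"numeric": "scale", "text": "tokenize", "unknown": "reject"}
assert paths[recognize([1, 2, 3])] == "scale"

# Check 3: imagine data sets the programmers ignored -- e.g. mixed
# types -- and confirm the software handles them deliberately.
assert recognize([1, "a"]) == "unknown"
assert paths[recognize([1, "a"])] == "reject"
```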
One way to look at it: could ML learn from current execution? Let the ML learn what data comes in and, for that input, what data goes out (initially, keep some mechanism to validate whether what the system learned is good or bad), and then apply that learned data when testing with test data.
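One reading of this suggestion is a learned test oracle: record validated input/output pairs during trusted runs, then compare later outputs against what was learnt. The sketch below uses a simple lookup table rather than a real ML model, and all names and the tolerance value are hypothetical.

```python
learned = {}  # input -> output recorded while behavior was validated

def record(inp, out):
    """Learning phase: store an input/output pair confirmed as good."""
    learned[inp] = out

def check(inp, out, tolerance=0.1):
    """Testing phase: judge a new output against learned behavior."""
    expected = learned.get(inp)
    if expected is None:
        return "unlearned input"  # cannot judge; flag for human review
    return "pass" if abs(out - expected) <= tolerance else "fail"

# Learning phase, with some external mechanism validating the outputs.
record(3.0, 6.0)

# Later testing phase: apply what was learnt to new results.
print(check(3.0, 6.02))  # prints "pass"
print(check(3.0, 7.5))   # prints "fail"
print(check(9.0, 18.0))  # prints "unlearned input"
```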