If you're getting serious about test tooling, especially tooling that drives the application through the user interface the way a customer would, you'll likely see your test time increase linearly with the number of tests. After a great start, the suite grows and grows. Eventually the tests take too long to run; by the time they finish, new changes have been made to the software.
Most of the attempts to add test tooling on top of an application simply fail, with automation delay as a common cause. By the time enough of the application is covered to significantly reduce test time, the company has spent a huge amount of time and money. Meanwhile, the automated behemoth is slow, brittle and hard to maintain. Automation delay can be a killer.
So what can be done? How can automated software testing be both effective and efficient? Here are some approaches to accelerate your test practices by squeezing out redundant actions and arranging tests so that they can run simultaneously.
1. Eliminate redundant operations
One way to look at automated checks is to split them into three parts: setup, actual test and teardown.
Often, the setup is as long as the test itself. We have to set up users and add data to be searched for before we can perform the search that is the actual test. In some cases, setup will take more time than the test. This is especially true if we have to set up through the user interface.
Let's consider a simple user search test. First, we put the users in three different accounts: the main account that we are logged into, an account we have permission on and an account we do not. For each account we create two users, one with a name that will match our search terms and one that will not. Finally, when the setup is done, we can perform the search to see the users we should see and not see the users we should not. To set up users, you will need to fill out a web form, check your email, follow a link, log in and complete the rest of the profile details. If you do that through the user interface, setup could take 10 to 20 times as long as the test.
All of that setup work doesn't really give us any more confidence, and it is unlikely to find bugs in the create profile or create account features. Those features will be tested elsewhere and do not need to be tested again. Tests can already be unpredictable for reasons that are not easy to pin down, and every extra setup step adds one more place for that to happen.
There are two common ways to get rid of these redundancy problems. The first is to have a test data set with all the users we might need in the future. These data sets can lead to setup problems of their own, as they need to change over time to support new features. The second is to have some other, faster way to create the accounts and users, such as the command line or an API.
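The second approach can be sketched as follows. This is a minimal illustration, not a real client: FakeUserApi, create_user and the account and user names are all hypothetical stand-ins for whatever API or command-line tool your application exposes. It builds the six users from the search example in one pass, without touching the user interface.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class FakeUserApi:
    """Hypothetical stand-in for a real user-creation API or command-line tool."""
    users: list = field(default_factory=list)

    def create_user(self, account: str, name: str) -> dict:
        # A real client would issue one HTTP call or shell command here,
        # skipping the web form, confirmation email and profile pages.
        user = {"account": account, "name": name}
        self.users.append(user)
        return user


ACCOUNTS = ["main", "permitted", "forbidden"]  # the three accounts in the example
NAMES = ["Matching Smith", "Other Jones"]      # one name matches the search, one does not


def set_up_search_data(api) -> list:
    """Create two users in each of the three accounts."""
    return [api.create_user(account, name) for account, name in product(ACCOUNTS, NAMES)]


def expected_search_results(users: list, term: str) -> list:
    """Users the search should return: matching names in accounts we can see."""
    visible = {"main", "permitted"}
    return [u for u in users if u["account"] in visible and term.lower() in u["name"].lower()]


api = FakeUserApi()
users = set_up_search_data(api)
print(len(users))
print(expected_search_results(users, "smith"))
```

The actual test then only has to run the search through the user interface and compare what it sees with the expected list.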
The most redundant operation of all might be login.
2. Tokenize login
For web and mobile applications, most or all of the interactive pieces happen behind a login. Having every test log in through the user interface takes time. You can speed up the tests by eliminating the tedious steps of going to the homepage, clicking log in, typing in your username and password, clicking submit and verifying the logged-in page.
There are plenty of technical ways to do this. If your application performs basic authentication over the web, you can put the test user and password into the URL. Another common way is to have the code that runs the tests generate a session ID, then put that session ID into the cookie on the client. Depending on your environment, there may be other options to skip login, such as adding a token or code to your HTTP headers.
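Here is a rough sketch of the session ID approach using only the Python standard library. It assumes the test harness can write directly to the application's session store, represented here by a plain dictionary; the session_store name, the cookie name session_id and the example URL are all hypothetical.

```python
import secrets
import urllib.request

# Hypothetical server-side session store that the test harness can write to.
# In a real system this might be a database table or a cache.
session_store = {}


def create_session(username: str) -> str:
    """Generate a session ID for a test user and register it directly."""
    session_id = secrets.token_hex(16)
    session_store[session_id] = username
    return session_id


def authenticated_request(url: str, session_id: str) -> urllib.request.Request:
    """Build a request that already carries the session cookie."""
    # The cookie name "session_id" is an assumption; use your application's name.
    return urllib.request.Request(url, headers={"Cookie": f"session_id={session_id}"})


session_id = create_session("test_user")
request = authenticated_request("https://app.example.test/search", session_id)
print(request.get_header("Cookie"))
```

No request is sent here; the point is only that the test starts out already logged in. With a browser-driving tool, the same ID would go into the browser's cookie store instead. In Selenium's Python bindings, for example, driver.add_cookie serves that purpose.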
Once you've cut setup and login, it’s time to change the nature of the tests themselves from end-to-end to domain-to-database.
3. Create D-to-D tests
Brian Marick, one of the authors of the Agile Manifesto, once told me that computers are really good at doing things incredibly quickly, as long as those things do not require thinking. Humans, on the other hand, are inconsistent and slow, but they can think.
One classic way to think about end-to-end tests, as a user journey, allows the human the variety of experience to recognize, act and respond. It takes advantage of how humans are inconsistent to increase coverage over time. Reproducing that in automated tooling is, as software architect Titus Fortner puts it, the "worst of both worlds." You lose the computer's ability to do many things at once, and you lose the human investigative ability.
An alternative is what Fortner calls domain-to-database, or D-to-D, tests. These D-to-D tests have all the redundancy eliminated; they drive directly to exactly the function under test. Then they exercise that function, look at the results and stop. A good D-to-D test might run for 30 seconds.
D-to-D tests are isolated. They contain everything they need to run and are designed to work while other things are happening on the system. This arrangement might require you to create unique users, accounts and orders so that another test can appear, perform a search and not see the old data the previous test created.
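One simple way to get unique users, accounts and orders is to build every name from a random identifier. The sketch below assumes nothing about your application; unique_name and make_test_context are hypothetical helper names.

```python
import uuid


def unique_name(prefix: str) -> str:
    """Return a name that no other test run will have used."""
    return f"{prefix}-{uuid.uuid4().hex[:12]}"


def make_test_context() -> dict:
    """Build the private account, user and order names for one test."""
    return {
        "account": unique_name("account"),
        "user": unique_name("user"),
        "order": unique_name("order"),
    }


first = make_test_context()
second = make_test_context()
print(first["user"], second["user"])
```

Because each test searches only for names it generated itself, data left behind by an earlier test cannot show up in its results.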
Once the redundancy is gone and the test is isolated and fast, it will be possible to split the tests and run a dozen of them at once.
4. Run tests in parallel
Teams can choose from many varieties of software testing, but, if time is a concern, you're going to want to select carefully.
Two hundred tests that each last one minute will run for over three hours. That assumes everything passes and the tests don't get stuck. Something might well go wrong. If the software waits to type into a text box that does not appear because of an error, then waits to click on buttons that do not appear, and so on, the time could easily be twice that.
Once the team has isolated D-to-D tests, it will be possible to run several tests at the same time on different computers, often on a grid or in the cloud. A 16-node cluster takes that three-hour runtime down to under 15 minutes. If test times double because the tests fail to abort on failure, runtime is still limited to half an hour.
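The arithmetic behind those figures is easy to check. The estimate below assumes the tests are spread evenly across the nodes and ignores any startup or scheduling overhead; wall_clock_minutes is a hypothetical helper name.

```python
import math


def wall_clock_minutes(tests: int, minutes_per_test: float, nodes: int) -> float:
    """Estimate run time when tests are spread evenly across nodes."""
    tests_on_busiest_node = math.ceil(tests / nodes)
    return tests_on_busiest_node * minutes_per_test


serial = wall_clock_minutes(200, 1, 1)     # 200 minutes, about 3 hours 20 minutes
parallel = wall_clock_minutes(200, 1, 16)  # 13 tests on the busiest node: 13 minutes
doubled = wall_clock_minutes(200, 2, 16)   # each test takes twice as long: 26 minutes
print(serial, parallel, doubled)
```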
The challenge with running tests in parallel is almost entirely one of structure and isolation. Tests that reuse the same user ID and run at the same time may cause leakage, where one test changes that user's data in the database while a second test is using it, producing different results that look like errors. Once the tests are isolated, putting them in parallel is mostly a matter of technical infrastructure.
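As a small-scale illustration, the sketch below runs 16 copies of an isolated test at once. Threads on one machine stand in for the nodes of a grid, a dictionary stands in for the shared database, and isolated_test is a hypothetical placeholder for a real test. Each test writes under its own unique user ID, so the tests cannot leak into one another.

```python
from concurrent.futures import ThreadPoolExecutor
import time
import uuid

shared_database = {}  # hypothetical stand-in for the database all tests share


def isolated_test(test_number: int) -> bool:
    """Create private data, pause while other tests run, then check the data."""
    user_id = f"user-{uuid.uuid4().hex}"     # unique per test, so no leakage
    shared_database[user_id] = test_number   # setup: write this test's own data
    time.sleep(0.05)                         # other tests run during this pause
    return shared_database[user_id] == test_number


def run_in_parallel(test, count: int, workers: int) -> list:
    """Run `count` copies of a test across `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(test, range(count)))


start = time.perf_counter()
results = run_in_parallel(isolated_test, count=16, workers=16)
elapsed = time.perf_counter() - start
print(all(results), f"{elapsed:.2f}s")
```

If every test instead wrote to one fixed user ID, the final check could see a value written by a different test, which is exactly the leakage described above.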
5. Use implicit waits and abort on failure
After a user clicks submit, you typically wait for a new form to arrive. The lazy way to wait is to hardcode some period of time to wait, such as sleep(30), which will sleep for 30 seconds. If the page is rendered in five seconds, we've just introduced 25 seconds of unnecessary wait. Multiply that extra 25 seconds by 10 waits per test, then by the number of tests, and you start to understand why test suites take so long to run.
Most modern test tools have the ability to use something like wait_for_element_present, an explicit wait that will end when the element appears. Other options would be to write overrides to the click or type commands, to wait for an element before clicking into it.
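If your tool lacks such a command, the idea is simple enough to sketch by hand. The wait_for helper below is a hypothetical, minimal version: it polls a condition and returns the moment the condition holds, and it raises an error only if the timeout expires. The page that renders after 0.3 seconds is simulated.

```python
import time


def wait_for(condition, timeout: float = 30.0, interval: float = 0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # stop waiting as soon as the element appears
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Hypothetical page that finishes rendering 0.3 seconds after the click.
rendered_at = time.monotonic() + 0.3


def element_present() -> bool:
    return time.monotonic() >= rendered_at


start = time.monotonic()
wait_for(element_present, timeout=30)
waited = time.monotonic() - start
print(f"waited {waited:.1f}s instead of a fixed 30s")
```

The 30-second timeout is still there as an upper bound, but the test only pays for it when something is actually wrong. Selenium users get the same behavior from WebDriverWait and its until method.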
The next problem is the failure case. If a button isn't there, the test software will then wait another 30 seconds to click on another element that is not there, then wait another 30 seconds to type into an element that is not there, and so on. Override the test framework to throw an exception, ending the test when a failure occurs, and you could save hours on a test run -- without meaningful loss of information.
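The sketch below shows the difference in miniature. The run_steps function, the ElementNotFound exception and the page contents are all hypothetical; a set of element names stands in for the page. With abort_on_failure left on, the first missing element ends the test instead of letting each later step wait out its own timeout.

```python
class ElementNotFound(Exception):
    """Raised when a step cannot find the element it needs."""


def run_steps(steps, page: set, abort_on_failure: bool = True) -> list:
    """Run (action, element) steps against `page`, a set of element names present."""
    log = []
    for action, element in steps:
        if element not in page:
            log.append(f"FAIL {action} {element}")
            if abort_on_failure:
                # In a real tool each later step would burn a full timeout;
                # raising here ends the test at the first missing element.
                raise ElementNotFound(f"{element} missing; log so far: {log}")
            continue
        log.append(f"ok {action} {element}")
    return log


steps = [("click", "submit"), ("click", "confirm"), ("type", "comment")]
page = {"submit"}  # hypothetical page where an error hid the later elements

try:
    run_steps(steps, page)
except ElementNotFound as error:
    print("aborted:", error)
```

The first failure already tells you what broke. The failures that would follow it are consequences of the same problem, which is why aborting loses little information.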
6. Test less
The fastest automated software test is one that never starts.
Most teams that start off on test tooling need to retest everything, all the time, because any given change could break any other piece of the software system. The solution proposed is something W. Edwards Deming called mass inspection. Mass inspection was the problem with American automobiles in the 1970s. They would be built badly, checked and then rebuilt. The result was that American cars cost too much and had too many problems. Then Toyota started preventing defects. The system of poka-yoke, which is Japanese for avoid mistakes, combines continuous improvement with building the same product over and over again. So, for example, you could color code which tab goes into which slot.
In software, our assembly line is not repeatable. Every build of software is essentially a new product, with new and emergent risks. Yet it is possible to decouple systems so that we can deploy just one tiny piece -- say a change to the way search results are returned. With that in place, we do not need to perform a mass inspection. It is also possible to change coding habits to make regression defects much rarer, or to add monitoring and the ability to roll back a change. Together, these practices let us run and maintain just a handful of the most important tests.
Putting It All Together
Most of the patterns above come together in one simple idea. Compose your GUI test tooling out of small, isolated pieces with vastly reduced redundancy. Essentially you chop up the application into slices that each execute the full stack. Then, run several at the same time.
Along the way, take a hard look at why you have so many tests. Do you need them all? See if you can change your engineering practices to reduce the requirement for automated tooling that runs through the full stack.