A common way to route information through embedded applications is with messages. Unfortunately, the timing of their comings and goings can create the most troublesome problems. As a tester looking to avoid awkward system crashes and freezes, you should investigate what happens when events pepper the system in unusual ways. While the tester’s tools on
Duration and messages
Messages arrive at your system to perform some function. Sometimes they may quickly set a state variable or dump a message to a log. More typically, extended time is needed before the message’s business is completed. This elapsed time we will call the duration of an event. The longer the duration, the more time for unexpected interactions.
The five-fold catalog of events
Event style 1: Separated. First one event, then time passes and finally the other event.
Most system designers assume this disjointed occurrence of events. The first starts, does its work and concludes. Some time passes. Then the second starts, does its work and ends. This is a happy day case.
Alas, this is not a target-rich environment for bug hunters, so we move on to consider the others.
Event style 2: Overlap. There is no time in between.
Logically, this case has two events overlapping. The second event begins before the first one ends and is still running when the first finishes; which of the two events starts first does not matter.
The above diagram is the classic race condition example. It is easy to imagine the damage this can do. Testing is harder since the winner may not always be the same event. If a few of these are bumbling about, then your test results may become unpredictable and irreproducible.
Event style 3: Engulfed
This is another prime source of bugs. Here, the first event begins, then the second begins, but it ends before the first one ends. If this sequence is not anticipated by the programmer, then another race condition takes over the system.
This culprit is especially nasty whenever there is strong structural coupling between the parts of the system governed by the events. Almost inevitably there will be unexpected, undesired consequences.
For completeness, two more logical possibilities exist. They are true “edge” cases that may not be worth the expense to test.
Event style 4: Tag-team. The second event begins immediately after the first ends.
Event style 5: Twins. The events start and end at the same time. Two variants of this style are when they start together but one ends before the other, and when they end together but one starts first.
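The five styles can be told apart mechanically once each event is reduced to a start time and an end time. What follows is a minimal Python sketch, not code from the systems described here. The `Event` type and the `classify` function are hypothetical names introduced for illustration. The sketch assumes every event has a start strictly before its end, compares timestamps for exact equality (real logs would probably need a tolerance for "the same time"), and folds the two Twins variants into the Twins label.

```python
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical record of one event: a name plus start and end times."""
    name: str
    start: float
    end: float


def classify(a: Event, b: Event) -> str:
    """Return which of the five catalog styles a pair of events falls into."""
    # Order the pair so that 'first' is the one that starts earlier.
    first, second = sorted((a, b), key=lambda e: (e.start, e.end))

    # Twins and its two variants: shared start, shared end, or both.
    if first.start == second.start or first.end == second.end:
        return "twins"
    # Separated: a gap of idle time lies between the two events.
    if first.end < second.start:
        return "separated"
    # Tag-team: the second begins at the instant the first ends.
    if first.end == second.start:
        return "tag-team"
    # From here on the second event starts while the first is still running.
    if second.end < first.end:
        return "engulfed"
    return "overlap"
```

For example, `classify(Event("record", 0, 5), Event("surf", 1, 3))` falls into the engulfed case, because the second event starts and ends while the first is still running.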
Now that all the logical cases are identified, how can this catalog help an embedded systems tester?
We test devices to play audio and video on embedded systems found in home networks. On any evening while parked in front of a television set, Mr. Modern Couch Potato might:
- Start recording a TV show and then watch it the next night. (Separated)
- Start a recording on one channel while changing to hear music on another channel. (Overlap)
- Start a recording and surf around among many other channels. (Engulfed)
- Listen passively as the game show host says, “We’ll be right back after this,” and then watch the commercials and rest of the show. (Tag-team)
- Record a show while watching it. (Twins)
These examples are black box tests. Most embedded systems cry out for black box testing. Unfortunately, the tester’s opportunities for intervening in an embedded system may be greatly reduced because the interfaces are not easily accessed. Instead, your approach to testing is often limited to firing user actions that will cause many system events. If you have a way to atomically create raw, isolated system events in your embedded system, you are blessed.
However, white box testing is not impossible in an input-challenged embedded system. Logs, specifically time-stamped logs, are your tools. While creating events to match each of the above event types may be too expensive, it may be easy to look for their natural occurrence. It has been said that the heart of automated testing is using computers to do what they are best at, not just as mechanical substitutes for human testers. Computers are really good at analyzing text (logs) and making simple comparisons (time-ordering events).
All the components in our systems must log messages. The overhead is surprisingly light. The log messages give a play-by-play of the messages moving through the embedded system. We run our black box tests but add an extra step. Tools are available that analyze the logs and look for surprising sequences of messages.
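As a rough illustration of that extra step, the sketch below turns time-stamped log lines into (name, start, end) intervals that can then be compared. The log format is an assumption made up for this example, one line per marker in the form `<seconds> <event-name> START|END`. The function name `parse_intervals` is likewise hypothetical. Real component logs will look different and need their own parsing.

```python
def parse_intervals(lines):
    """Pair START and END markers into (name, start, end) tuples.

    Assumes the hypothetical format '<seconds> <event-name> START|END'.
    Lines that do not fit the format are skipped rather than treated as errors.
    """
    open_starts = {}   # event name -> timestamp of its pending START
    intervals = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        stamp, name, marker = parts
        try:
            when = float(stamp)
        except ValueError:
            continue   # the first field was not a timestamp
        if marker == "START":
            open_starts[name] = when
        elif marker == "END" and name in open_starts:
            intervals.append((name, open_starts.pop(name), when))
    return intervals


log = [
    "0.0 record START",
    "1.5 tune START",
    "2.0 tune END",
    "9.0 record END",
]
intervals = parse_intervals(log)
```

Intervals are emitted in the order the events end, so here the `tune` interval comes before the `record` interval that engulfs it.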
Don’t give up because you cannot get your embedded system to do the simple unit testing you had hoped for. As you test the system, make a simple inventory of the occurrences of critical messages. Based on the time stamps, can you claim to have witnessed all the logical patterns? Based on appropriate code reviews, can you claim that only the right sequences will occur? Do you have the data, in the form of analyzed logs, to back up your claims? It’s a big job, but it’s the kind where computers, guided by the logical patterns, will excel.
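The inventory itself can be very small. The sketch below assumes that each pair of critical messages seen in the logs has already been labelled with one of the five styles. It counts how often each style was witnessed and lists the styles that never appeared. The names `STYLES`, `inventory`, and `never_witnessed` are hypothetical and chosen only for this illustration.

```python
from collections import Counter

# The five styles from the catalog, in the order they were introduced.
STYLES = ("separated", "overlap", "engulfed", "tag-team", "twins")


def inventory(observed_styles):
    """Count how many times each catalog style was witnessed."""
    counts = Counter(observed_styles)
    return {style: counts[style] for style in STYLES}


def never_witnessed(observed_styles):
    """List the catalog styles that did not show up at all."""
    return [style for style, n in inventory(observed_styles).items() if n == 0]


# Labels as they might come out of one evening's test run (made-up data).
seen = ["separated", "separated", "engulfed"]
report = inventory(seen)
gaps = never_witnessed(seen)
```

An empty `gaps` list only says that every pattern occurred at least once during the run. Whether the system handled each occurrence correctly still has to be judged from the test results.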
This was first published in February 2011