In the world of software development, we often look for patterns that will help us with both coding and testing applications. By analyzing the types of defects that are found in particular domain areas, we can create tools or tests that will catch those bugs. Jon Hagar, a consultant from Invision, has been doing research in preparation for an upcoming book, looking for patterns of defects found in embedded software. In this tip, we find out how embedded software differs from traditional applications and which areas cause the biggest problems in embedded software, according to Hagar’s findings.
How is embedded software different?
The first point of confusion is the question of what exactly constitutes embedded software. Hagar quotes from Wikipedia that embedded software is software "written for machines that are not, first and foremost computers." He adds that this software “depends on unique hardware to solve a specialized problem interacting with and/or controlling the ‘real world.’” Examples of embedded software would be the software that controls medical devices, planes, rockets, traffic control, routers, robots and cell phones. There is some debate whether or not software that runs on smart phones would be considered embedded, since smart phones could almost be considered mini-computers. However, in general, embedded software is dependent on the hardware on which it’s being run. Hagar notes these additional complications of embedded software:
- Software and hardware development cycles done in parallel, where aspects of the hardware may be unknown to the software development effort;
- Hardware problems which are often fixed with software late in the project;
- Limited or specialized user interfaces;
- Small amounts of dense complex functions often in the control theory or safety/hazard domains;
- Very tight real-time performance issues (often in millisecond or microsecond ranges); and
- Highly limited resources (memory, processor speed, bandwidth, etc.).
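To make the real-time item concrete, here is a minimal sketch in C of a deadline check against a free-running 32-bit tick counter. It is an illustration only, not something taken from Hagar's research: the function names and the idea of a tick "budget" are assumptions made for this example. The point it shows is that unsigned subtraction stays correct when the counter rolls over, whereas a naive `now > start + budget` comparison does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: ticks elapsed between two readings of a
 * free-running 32-bit counter. The result is reduced modulo 2^32,
 * so it is still correct if the counter rolled over once between
 * the two readings. */
uint32_t ticks_elapsed(uint32_t start, uint32_t now)
{
    return now - start;
}

/* Hypothetical helper: true if the work that began at `start`
 * has used more than `budget` ticks by the time `now` is read. */
bool deadline_missed(uint32_t start, uint32_t now, uint32_t budget)
{
    return ticks_elapsed(start, now) > budget;
}
```

A control loop would read the counter before and after each step and treat a `true` result from `deadline_missed` as a timing defect to log or handle.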
The embedded software error taxonomy
Hagar created a table in which he split categories of bugs found in embedded software. He calls this table “an embedded software error taxonomy.”
The table's columns list four domain areas: Aerospace, Med sys 1, General risks 2 and General media 3. I asked Hagar to clarify what each one covers, and he explained:
- Aerospace -- software in planes, spacecraft, military systems (controlled by FAA and DoD)
- Med sys 1 -- medical systems such as pacemakers, medical pumps, scanners, etc. (controlled by FDA)
- General risks 2 -- transportation and industrial systems (cars, trains, elevators, factory control systems, etc.)
- General media 3 -- Other things (smart devices, cell phones, home electronics)
His rows described categories of errors found in embedded software. Though there were 25 different categories, the high-level categories consisted of timing, computation, data, logic, integration, commands and user interface (UI) errors.
Observations and most common types of defects by domain
One striking observation from the table is that a whopping 43% of the errors found in the medical systems domain were caused by “Logic and/or control ordering.” This logic subcategory also accounted for 30% of the errors in the domain area labeled “general media 3” (smart devices, cell phones and home electronics), which was the highest percentage of bugs for that domain as well.
The errors in the aerospace domain were spread across many of the categories, with the highest percentage of bugs, 16%, in the software-hardware interfaces subcategory. That subcategory also accounted for 13% of the general risks 2 bugs. However, two other subcategories tied for the highest percentage of bugs in the general risks 2 area: Data-pointers and UI-User / Operator Interface bugs each showed 23%, indicating that these were the two biggest problem areas for transportation and industrial systems.
Are all of these really embedded software problems?
Some of the types of bugs do not seem related to embedded software development itself, but rather to other factors, such as the programming language. For example, Hagar writes, "One report indicated as many as 70% errors were associated with C pointers." I asked him whether this says more about the C programming language than about errors that are especially prevalent in embedded software.
Hagar answers, “It was an industry report on embedded C software, but you are right, there is a general problem here associated with language; but a pointer problem that impacts say something like a memory leak can be ‘worse’ in embedded because you often cannot just ‘reboot’ the system if you are mid-flight. Thus, many embedded systems take special action or testing in such an area. If it is lack of C skill, bad humans, language issues or other root causes, maybe it is not as important as the fact that the embedded industry as a whole sees these bugs and failures, many of which can be easily avoided or detected, but we don't.”
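One widely used way to take "special action" against leaks, sketched here in C as an illustration rather than as anything Hagar prescribes, is to replace general-purpose heap allocation with a fixed-size block pool whose usage can be counted. Every name below (`pool_t`, `pool_init`, `pool_alloc`, `pool_free`, `pool_in_use`, `POOL_BLOCKS`, `BLOCK_SIZE`) is hypothetical, and the sizes are arbitrary. The sketch treats blocks as raw bytes and does not handle alignment for wider types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_BLOCKS 4   /* hypothetical: number of blocks in the pool */
#define BLOCK_SIZE  32  /* hypothetical: bytes per block */

typedef struct {
    uint8_t storage[POOL_BLOCKS][BLOCK_SIZE]; /* statically sized, no heap */
    uint8_t in_use[POOL_BLOCKS];              /* 1 if the block is handed out */
    size_t  used;                             /* number of blocks handed out */
} pool_t;

void pool_init(pool_t *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
}

/* Returns a free block, or NULL when the pool is exhausted. */
void *pool_alloc(pool_t *p)
{
    assert(p != NULL);
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            p->used++;
            return p->storage[i];
        }
    }
    return NULL;
}

/* Returns 0 on success, -1 for a pointer that is outside the pool,
 * not on a block boundary, or not currently allocated (double free). */
int pool_free(pool_t *p, void *ptr)
{
    assert(p != NULL);
    uintptr_t base = (uintptr_t)&p->storage[0][0];
    uintptr_t addr = (uintptr_t)ptr;
    if (addr < base || addr >= base + sizeof p->storage) {
        return -1;
    }
    size_t offset = (size_t)(addr - base);
    if (offset % BLOCK_SIZE != 0) {
        return -1;
    }
    size_t i = offset / BLOCK_SIZE;
    if (!p->in_use[i]) {
        return -1;
    }
    p->in_use[i] = 0;
    p->used--;
    return 0;
}

/* Leak check: blocks still outstanding. */
size_t pool_in_use(const pool_t *p)
{
    assert(p != NULL);
    return p->used;
}
```

A test can call `pool_in_use` at the end of each operating cycle and fail if the count keeps growing, which turns a slow leak into a defect that is visible long before the system would need the "reboot" it cannot have.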
More to come
Hagar has collected a lot of data and done quite a bit of analysis on the types of bugs that are prevalent in embedded systems. He and James Whittaker are working on a book which will provide a more in-depth look, not only at the types of bugs that you’ll find in embedded systems, but more importantly, how to attack them.