Often we test in “clean” conditions, simulating resources and users using the application as intended. But how do we test for conditions which aren’t so clean? Shmuel Gershon will be speaking on "Fuzzing and Fault Modeling for Product Evaluation" at the upcoming STAREAST conference.
SearchSoftwareQuality (SSQ) contributor Matt Heusser sat down with Gershon for a preview of what attendees might learn from Gershon’s presentation.
SSQ: OK, you've got our attention. When you say "fuzzing and fault modeling" -- what exactly do you mean?
Shmuel Gershon: Matt, I can't deny the names are catchy.
When I say "fuzzing and fault modeling," I’m referring to a way of thinking, and to techniques that are unique in exercising tricky areas of a product.
Let me elaborate on that. During the development life cycle, everyone is working in ideal and clean conditions: none of the resources will fail us, we have enough memory, and all inputs are provided by us over known channels. With these techniques, we are asking ourselves, “Are these pristine conditions enough? Are we preparing the product for real life, or are we raising it in a bubble?”
So fault models help you control resources and the environment to simulate potential faults you identify. Fuzzing will help you create input of such varied and random forms that it may resemble data-noise or a malicious attack.
SSQ: Let's look at an example. Take an application: Windows-based, Web-based, you choose. What kinds of defects can these techniques find?
Gershon: A Web-based application would be a good example, because its inner workings are commonly known. One of the many interfaces of an electronic store would be a search-by-query Web service interface. Such an interface has layers of parsers (Web server, database, e-commerce structure...) parsing layers of protocols and requests (HTTP, SOAP, SQL...), and each of these parsers is built on dangerous assumptions about the input. A “fuzzer” can help you generate a multitude of random inputs, looking for input handling problems in all parsing layers.
SSQ: Ok, I think I understand the potential. Now do me a favor and tell me how you run the test. Do you write a computer program? What tools might you use? And how can someone pull results out?
Gershon: Let's keep the same specific example. The strength of fuzzing lies in easy generation of random data structures, so instead of looking at the data format and deciding for yourself how to manipulate it, you let software do that for you. A fuzzer could do things like sending randomly created POST vars, and also random packets that violate the legitimate structure of HTTP (for example). You can write your own program as you mention, you can pick a tool ready for your protocol, or you can build on top of well-known frameworks. Fuzzing tools either generate data from scratch or manipulate existing data. (This example uses a network packet, but the input could be a file or a memory buffer as well.)
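As an illustration of the data-generation side Gershon describes, here is a minimal sketch in Python. Everything in it is hypothetical (the field names, the probe strings, the payload shapes); a real fuzzer for your protocol would mix in knowledge of that protocol's actual structure:

```python
import random
import string

def fuzz_value(max_len=64):
    """Generate one random value: printable junk, overlong strings,
    control bytes, and a few classic parser-probing tokens."""
    choices = [
        lambda: "".join(random.choice(string.printable)
                        for _ in range(random.randint(0, max_len))),
        lambda: "A" * random.randint(1, 10_000),            # length probe
        lambda: "%s%n%x" * random.randint(1, 20),           # format-string probe
        lambda: "' OR '1'='1",                              # SQL-ish token
        lambda: "".join(chr(random.randint(0, 255))         # raw byte noise
                        for _ in range(max_len)),
    ]
    return random.choice(choices)()

def fuzz_post_vars(field_names):
    """Build one fuzzed set of POST variables for the given form fields."""
    return {name: fuzz_value() for name in field_names}

# Example: one fuzzed payload for a hypothetical search-by-query interface.
random.seed(1)  # fixed seed only to make the demonstration reproducible
payload = fuzz_post_vars(["query", "category", "page"])
for name, value in payload.items():
    print(name, "->", repr(value[:40]))
```

Each generated payload would then be submitted to the service over its normal channel; the generation and the delivery are deliberately separate concerns.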
How do you pull results out? Well, as with normal testing, there is no one pass-or-fail formula. But software surprised with such random inputs will behave in unexpected ways -- you want to be alert for big faults such as your application crashing, or for issues such as delays between responses, unusual logging behavior, and error messages.
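The signals Gershon lists (crashes, delays, error responses) can be checked mechanically while the fuzzer runs. The sketch below is an assumed oracle, not a specific tool from the talk; `send_request` is a hypothetical stand-in for whatever transport drives the application under test:

```python
import time

def check_response(send_request, payload, timeout=2.0):
    """Crude fuzzing oracle: classify one fuzzed request as ok,
    a crash, a suspicious delay, or a server-side error."""
    start = time.monotonic()
    try:
        status = send_request(payload)      # returns an HTTP-style status code
    except ConnectionError:
        return "possible crash"             # target stopped answering
    elapsed = time.monotonic() - start
    if elapsed > timeout:
        return "suspicious delay"
    if status >= 500:
        return "server error"
    return "ok"

# Demo with stub transports instead of a live service:
print(check_response(lambda p: 200, b"fuzzed-bytes"))  # -> ok
print(check_response(lambda p: 503, b"fuzzed-bytes"))  # -> server error
```

In practice you would also watch logs and resource usage out-of-band, since not every failure shows up in the response itself.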
SSQ: Now on to fault modeling. If we stay with the same application as before, how can we "fault model" it? When does that happen? What do we do differently as a result?
Gershon: Fault models are easier to understand. Our e-commerce example is a complex system, and the ways it can fail are multiple and complex as well. Fault modeling means mapping faults that can occur during operation, and creating a scenario to reach them. For example, what happens to the app when a request for memory is rejected? What happens when the disk is full? What if the network is slow or suffering high latency? Some tests are simpler, like removing a necessary DLL file, but some require tweaking or hacking the environment, like leaving no network port available for connection. Fault modeling is freeing your mind to understand that there's a lot happening under the application's hood, and believing that the environment can be under your control.
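One way to realize a fault model like "the disk is full" without actually filling a disk is to inject the fault in software. This Python sketch uses `unittest.mock` to make every `open()` call fail with ENOSPC; `save_report` is a made-up stand-in for the code under test:

```python
import errno
from unittest import mock

def save_report(path, data):
    """Hypothetical code under test: writes data, and is expected
    to degrade gracefully when the disk is full."""
    try:
        with open(path, "w") as f:
            f.write(data)
        return "saved"
    except OSError as e:
        if e.errno == errno.ENOSPC:
            return "disk-full handled"
        raise

# Fault model: "what happens to the app when the disk is full?"
disk_full = OSError(errno.ENOSPC, "No space left on device")
with mock.patch("builtins.open", side_effect=disk_full):
    result = save_report("report.txt", "hello")

print(result)  # -> disk-full handled
```

The same pattern covers rejected memory requests or failed socket calls: patch the resource-acquiring call to raise, then observe whether the application copes or collapses.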
SSQ: Do you think these techniques have broad applicability to software testing -- or not? If not, what niches do you believe they are suited for?
Gershon: If your software has complex protocols or input structures then there may be a lot to discover with fuzzing. If your software has critical dependence on resources, then looking at their fault models may give you important information.
Let's try to think of these techniques pushed to extremes. If we have an e-commerce operation the size of Amazon, with myriad channels, sources and protocols for input and data, then fuzzing can be a major source of learning about how input handling can disrupt our services. If our imaginary product is instead a secluded weather station recording weather and environmental data to disk, then using fault models to test the failure of any of those resources will help us learn how to make the system more robust. (It can take a long time until someone visits the station to discover problems.)
SSQ: I think I understand how fuzzing can discover security vulnerabilities; can you speak to its broader general application?
Gershon: Sure. If during your fuzzing you reach a bad error message, an incorrect response, or part of your software moves into a different state than the rest, you'll be witnessing a functional issue that will teach you about problems in the app.
For the sake of diversity, let's pick a different example, like a text editor. I show in my talk a fuzzed file that, when loaded in OpenOffice Writer, brings down the whole OpenOffice suite, including OpenOffice Impress. Arguably that can be interpreted as a security issue, but I can certainly see the functional problem in it.
SSQ: Where can someone go to learn more about this topic? Is there a specific website or tool you can recommend?
Gershon: I'm making my slides available, but they don’t include the demonstrations I show in my talk. The Web is packed with information about fuzzing: I like the original paper for its clarity, but it’s 20 years old, so I suggest complementing it with the first article in this publication, which not only explains fuzzing, but also gives examples of fuzzing with some tools.
My introduction to fault models was the book "How to Break Software," but the tools and apps on its CD are frustratingly out of date. So, for lack of ready-made tools, and since different contexts have different faults, we often build the tool we need for each situation.
SSQ: This is great stuff, Shmuel; thank you for your time. One last question: Where can our readers go to learn more about you and your work?
Gershon: Readers can find some of my articles and my email address on my blog at http://testing.gershon.info. I hang out on Twitter as @sgershon and can be found on Skype as 'sgershon'.
Shmuel Gershon learned about fuzzing and fault modeling while working for Intel Corporation as a software tester in Israel. Since then he has applied these concepts to traditional software, firmware, and embedded systems. A sort of "international tester of mystery," in 2010 Shmuel acted as the lead organizer of the Danish Alliance, a testers’ gathering modeled after the Rebel Alliance, the invitation-only software test fraternity, of which Gershon is also a member.