Symdrive: Testing Drivers Without Devices

SymDrive: Testing Drivers without Devices Matthew J. Renzelmann, Asim Kadav and Michael M. Swift Computer Sciences Department, University of Wisconsin–Madison mjr,kadav,swift @cs.wisc.edu { } Abstract modes. Device-driver development and testing is a complex and Static analysis tools such as Coverity [17] and Mi- error-prone undertaking. For example, testing error- crosoft’s Static Driver Verifier [31] can find many bugs handling code requires simulating faulty inputs from the quickly. However, these tools are tuned for fast, rela- device. A single driver may support dozens of devices, tively shallow analysis of large amounts of code and there- and a developer may not have access to any of them. Con- fore only approximate the behavior of some code features, sequently, many Linux driver patches include the com- such as pointers. Furthermore, they have difficulty with ment “compile tested only.” bugs that span multiple invocations of the driver. Hence, SymDrive is a system for testing Linux and FreeBSD static analysis misses large aspects of driver behavior. drivers without their devices present. The system uses We address these challenges using symbolic execution symbolic execution to remove the need for hardware, and to test device drivers. This approach executes driver code extends past tools with three new features. First, Sym- on all possible device inputs, allows driver code to execute Drive uses static-analysis and source-to-source transfor- without the device present, and provides more thorough mation to greatly reduce the effort of testing a new driver. coverage of driver code, including error handling code. 2 Second, SymDrive checkers are ordinary C code and ex- DDT [26] and S E[14, 15] previously applied symbolic ecute in the kernel, where they have full access to kernel execution to driver testing, but these systems require sub- and driver state. Finally, SymDrive provides an execution- stantial developer effort to test new classes of drivers and, tracing tool to identify how a patch changes I/O to the in many cases, even specific new drivers. device and to compare device-driver implementations. In This paper presents SymDrive, a system to test Linux applying SymDrive to 21 Linux drivers and 5 FreeBSD and FreeBSD drivers without devices. SymDrive uses drivers, we found 39 bugs. static analysis to identify key features of the driver code, such as entry-point functions and loops. With this analy- 1 Introduction sis, SymDrive produces an instrumented driver with call- Device drivers are critical to operating-system reliability, outs to test code that allows many drivers to be tested with yet are difficult to test and debug. They run in kernel no modifications. The remaining drivers require a few mode, which prohibits the use of many runtime program- annotations to assist symbolic execution at locations that analysis tools available for user-mode code, such as Val- SymDrive identifies. grind [34]. Their need for hardware can prevent testing We designed SymDrive for three purposes. First, a altogether: over two dozen driver Linux and FreeBSD driver developer can use SymDrive to test driver patches patches include the comment “compile tested only,” in- by thoroughly executing all branches affected by the code dicating that the developer was unable or unwilling to run changes. Second, a developer can use SymDrive as a de- the driver. Even with hardware, it is difficult to test error- bugging tool to compare the behavior of a functioning handling code that runs in response to a device error or driver against a non-functioning driver. Third, SymDrive malfunction. Thorough testing of failure-handling code can serve as a general-purpose bug-finding tool and per- is time consuming and requires exhaustive fault-injection form broad testing of many drivers with little developer tests with a range of faulty inputs. input. Complicating matters, a single driver may support SymDrive is built with the S2E system by Chipounov dozens of devices with different code paths. For exam- et al. [14, 15] , which can make any data within a vir- ple, one of the 18 supported medium access controllers tual machine symbolic and explore its effect. SymDrive in the E1000 network driver requires an additional EEP- makes device inputs to the driver symbolic, thereby elim- ROM read operation while configuring flow-control and inating the need for the device and allowing execution on link settings. Testing error handling in this driver requires the complete range of device inputs. In addition, S2E en- the specific device, and consideration of its specific failure ables SymDrive to further enhance code coverage by mak- USENIX Association 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’12) 279 ing other inputs to the driver symbolic, such as data from (iii) efficiency. First, SymDrive must be able to find bugs the applications and the kernel. When it detects a failure, that are hard to find using other mechanisms, such as nor- either through an invalid operation or an explicit check, mal testing or static analysis tools. Second, SymDrive SymDrive reports the failure location and inputs that trig- must require low developer effort to test a new driver and ger the failure. therefore support many device classes, buses, and operat- SymDrive extends S2E with three major components. ing systems. Finally, SymDrive must be fast enough to First, SymDrive uses SymGen, a static-analysis and code apply to every patch. transformation tool, to analyze and instrument driver code 2.1 Symbolic Execution before testing. SymGen automatically performs nearly all the tasks previous systems left for developers, such as SymDrive uses symbolic execution to execute device- identifying the driver/kernel interface, and also provides driver code without the device being present. Symbolic hints to S2E to speed testing. Consequently, little effort execution allows a program’s input to be replaced with a is needed to apply SymDrive to additional drivers, driver symbolic value, which represents all possible values the classes, or buses. As evidence, we have applied SymDrive data may have. A symbolic-execution engine runs the to eleven classes of drivers on five buses in two operating code and tracks which values are symbolic and which systems. have concrete (i.e., fully defined) values, such as initial- Second, SymDrive provides a test framework that al- ized variables. When the program compares a symbolic lows checkers that validate driver behavior to be writ- value, the engine forks execution into multiple paths, one ten as ordinary C code and execute in the kernel. These for each outcome of the comparison. It then executes each checkers have access to kernel state and the parameters path with the symbolic value constrained by the chosen and results of calls between the driver and the kernel. A outcome of the comparison. For example, the predicate checker can make pre- and post-condition assertions over x>5 forks execution by copying the running program. In one copy, the code executes the path where x 5 and the driver behavior, and raise an error if the driver misbe- ≤ haves. Using bugs and kernel programming requirements other executes the path where x>5. Subsequent com- culled from code, documentation, and mailing lists, we parisons can further constrain a value. In places where wrote 49 checkers comprising 564 lines of code to en- specific values are needed, such as printing a value, the force rules that maintainers commonly check during code engine can concretize data by producing a single value reviews, such as matched allocation/free calls across entry that satisfies all constraints over the data. points, no memory leaks, and proper use of kernel APIs. Symbolic execution detects bugs either through ille- Finally, SymDrive provides an execution-tracing mech- gal operations, such as dereferencing a null pointer, or anism for logging the path of driver execution, including through explicit assertions over behavior, and can show the instruction pointer and stack trace of every I/O op- the state of the executing path at the failure site. eration. These traces can be used to compare execution Symbolic execution with S2E. SymDrive is built on a across different driver revisions and implementations. For modified version of the S2E symbolic execution frame- example, a developer can debug where a buggy driver di- work. S2E executes a complete virtual machine as the verges in behavior from a previous working one. We have program under test. Thus, symbolic data can be used any- also used this facility to compare driver implementations where in the operating system, including drivers and ap- across operating systems. plications. S2E is a virtual machine monitor (VMM) that We demonstrate SymDrive’s value by applying it to 26 tracks the use of symbolic data within an executing virtual drivers, and find 39 bugs, including two security vulner- machine. The VMM tracks each executing path within the abilities. We also find two driver/device interface viola- VM, and schedules CPU time between paths. Each path tions when comparing Linux and FreeBSD drivers. To the is treated like a thread, and the scheduler selects which best of our knowledge, no symbolic execution tool has ex- path to execute and when to switch execution to a differ- amined as many drivers. In addition, SymDrive achieved ent path. over 80% code coverage in most drivers, and is largely S2E supports plug-ins, which are modules loaded into limited by the ability of user-mode tests to invoke driver the VMM that can be invoked to record information or to entry points. When we use SymDrive to execute code modify execution. SymDrive uses plugins to implement changed by driver patches, SymDrive achieves over 95% symbolic hardware, path scheduling, and code-coverage coverage on 12 patches in 3 drivers. monitoring. 2 Motivation 2.2 Why Symbolic Execution? The goal of our work is to improve driver quality through Symbolic execution is often used to achieve high cover- thorough testing and validation.

Load more