Device Driver Safety Through a Reference Validation Mechanism ∗

Device Driver Safety Through a Reference Validation Mechanism ∗ Dan Williams, Patrick Reynolds, Kevin Walsh, Emin Gun¨ Sirer, Fred B. Schneider Cornell University Abstract to isolate the operating system from accidental driver Device drivers typically execute in supervisor mode and faults, but these drivers retain sufficient I/O privileges thus must be fully trusted. This paper describes how to that they must still be trusted. move them out of the trusted computing base, by running This paper introduces a practical mechanism for exe- them without supervisor privileges and constraining their cuting device drivers in user space and without privilege. interactions with hardware devices. An implementation Specifically, device drivers are isolated using hardware of this approach in the Nexus operating system executes protection boundaries. Each device driver is given ac- drivers in user space, leveraging hardware isolation and cess only to the minimum resources and operations nec- checking their behavior against a safety specification. essary to support the devices it controls (least privilege), 1 These Nexus drivers have performance comparable to in- thereby shrinking the TCB. A system in which device kernel, trusted drivers, with a level of CPU overhead ac- drivers have minimal privileges is easier to audit and less ceptable for most applications. For example, the moni- susceptible to Trojans in third-party device drivers. tored driver for an Intel e1000 Ethernet card has through- Even in user space, device drivers execute hardware put comparable to a trusted driver for the same hardware I/O operations and handle interrupts. These operations under Linux. And a monitored driver for the Intel i810 can cause device behavior that compromises the integrity sound card provides continuous playback. Drivers for a or availability of the kernel or other programs. There- disk and a USB mouse have also been moved success- fore, our driver architecture introduces a global, trusted fully to operate in user space with safety specifications. reference validation mechanism (RVM) [3] that mediates all interaction between device drivers and devices. The RVM invokes a device-specific reference monitor to val- 1 Introduction idate interactions between a driver and its associated device, thereby ensuring the driver conforms to a device Device drivers constitute over half of the source code of safety specification (DSS), which defines allowed and, many operating system kernels, with a bug rate up to by extension, prohibited behaviors. seven times higher than other kernel code [10]. They The DSS is expressed in a domain-specific language are often written by outside developers, and they are less and defines a state machine that accepts permissible tran- rigorously examined and tested than the rest of the kernel sitions by a monitored device driver. We provide a com- code. Yet device drivers are part of the trusted computing piler to translate a DSS into a reference monitor that im- base (TCB) of every application, because the monolithic plements the state machine. Every operation by the de- architecture of mainstream operating systems forces device driver is vetted by the reference monitor, and oper- vice drivers to be executed inside the kernel, with high ations that would cause an illegal transition are blocked. privilege. Some microkernels and other research operat- The entire architecture is depicted in Figure 1. ing systems [2, 9, 21, 24] run device drivers in user space The RVM protects the integrity, confidentiality, and ∗Supported by NICECAP cooperative agreement FA8750-07- availability of the system, by preventing: 2-0037 administered by AFRL, AFOSR grant F49620-03-1-0156, Illegal reads and writes: Drivers cannot read or National Science Foundation Grants 0430161 and CCF-0424422 • (TRUST), ONR Grant N00014-01-1-0968, and Microsoft Corporation. modify memory they do not own. USENIX Association 8th USENIX Symposium on Operating Systems Design and Implementation 241 Unprivileged totype implementation demonstrates that this approach can defend against malicious drivers and that the perfor- Device drivers mance costs of this enhanced security are not prohibitive. User space 2 Device I/O Model Reference DSSes RVM Monitors Trusted Device drivers send commands to devices, check de- Compiler Kernel vice status using registers, receive notification of status Interrupts I/O changes through interrupts, and initiate bulk data transfers using direct memory access (DMA). How they do so Device constitutes a platform’s I/O model. Our work is targeted Device to the x86 architecture and PCI buses; what follows is a Figure 1: Safe user-space device driver architecture. brief overview of the I/O model on that platform. Similar features are found on other processors and buses. Modern buses implement device enumeration and Priority escalation: Drivers cannot escalate their • endpoint identification. Each device on a PCI bus is iden- scheduling priority. tified by a 16-bit vendor identifier and a 16-bit model Processor starvation: Drivers cannot hold the • number; the resulting 32-bit device identifier identifies CPU for more than a pre-specified number of time the device.2 Some devices with different model num- slices. bers may nonetheless be similar enough to share a single Device-specific attacks: Drivers cannot exhaust • driver and a single DSS. Device enumeration is a pro- device resources or cause physical damage to de- cess for identifying all devices attached to a bus; end- vices. point identification is the process of querying a device In addition, given a suitable DSS, an RVM can enforce for its type, capabilities, and resource requirements. site-specific policies to govern how devices are used. For Device enumeration and endpoint identification typi- example, administrators at confidentiality-sensitive or- cally occur at boot time. Interrupt lines and I/O regis- ganizations might wish to disallow the use of attached ters are assigned, according to device requests, to all de- microphones or cameras; or administrators of trusted vices discovered. Device identifiers govern which device networks might wish to disallow promiscuous (sniffing) drivers to load. Unrecognized devices, for which no DSS mode on network cards. is available, are ignored and are not available to drivers. One alternative to our approach for monitoring and Devices have registers, which are read and written by constraining device driver behavior is to use hardware drivers to get status, send commands, and transfer data. capable of blocking illegal operations. Hardware-based The registers comprise I/O ports (accessed using instruc- approaches, however, are necessarily limited to policies tions like inb and outb), memory-mapped I/O, and expressed in terms of hardware events and abstractions. PCI-configuration registers. Each register is identified An IOMMU [1, 4, 14, 23], for example, can limit the by a type and an address. Contiguous sets of registers ability of devices to perform DMA transfers to or from constitute a range, identified by type, base address, and physical addresses the associated drivers cannot read or limit (the number of addresses in the range). For all reg- write directly. IOMMUs, however, do not mediate as- ister types, accesses are parameterized by an address, a pects of driver and system safety that go beyond the size, and, for writes, a value of the given size. Write memory access interface [7]; for example, an IOMMU operations elicit no response; read operations produce a cannot prevent interrupt livelock, limit excessively long value of the given size as a response. Both operations interrupt processing, protect devices from physical harm can cause side effects on a device. by drivers, or enforce site-specific policies. As IOMMUs Devices that transfer large amounts of data typically become prevalent, our approach could leverage them as employ DMA rather than requiring a device driver to hardware accelerators for memory protection. transfer each word of data individually through device In sum, this paper shows how to augment common registers. Before initiating a DMA transfer, the device memory protection techniques with device-specific ref- driver typically sets a control register on the device to erence monitors to execute drivers with limited privilege point to a buffer in memory. Some devices can perform and in user space. The requisite infrastructure is small, DMA to or from multiple memory locations; in this case, easy to audit, and shared across all devices. Our pro- a control register might contain a pointer to a list, ring, 242 8th USENIX Symposium on Operating Systems Design and Implementation USENIX Association or tree structure with pointers to many buffers. Device performs low-level I/O operations, handles interrupts, or drivers using DMA transfers must first obtain from the initiates DMA transfers, and drivers for peripherals com- kernel one or more memory regions with known, fixed, municate with their devices through the controller driver. physical addresses. Hence, all communication is visible to a single reference Devices can be synchronous or asynchronous. Drivers monitor, which suffices to validate the operations of all must poll synchronous devices for completed operations drivers in the hierarchy. or changes in status. In contrast, when a driver submits Some devices, particularly high-performance network an operation to an asynchronous device, the driver yields cards and 3-D graphics cards, support loading and exe- the CPU until the device later signals its response (or cuting programs (e.g., for TCP offload or vertex shad- other status change) by interrupting the processor. When ing) on a processor on the device. Other devices may that interrupt occurs, the operating system invokes code support loading firmware, either ephemerally or perma- specified by the driver. In most cases, an interrupt must nently. Such programs and firmware change the way the be acknowledged by a driver, or the device will continue device behaves; thus, they must be trustworthy. Pro- to send the same interrupt. Interrupts can be prioritized grams and firmware are loaded through I/O operations relative to each other, but they generally occur with a or DMA, both of which can be monitored.

Load more