On the Construction of Reliable Device Drivers Leonid Ryzhyk
Ph.D. 2009

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation, and linguistic expression is acknowledged.’

Signed ..................................
Date ..................................

Abstract

This dissertation is dedicated to the problem of device driver reliability. Software defects in device drivers constitute the biggest source of failure in operating systems, causing significant damage through downtime and data loss. Previous research on driver reliability has concentrated on detecting and mitigating defects in existing drivers using static analysis or runtime isolation. In contrast, this dissertation presents an approach to reducing the number of defects through an improved device driver architecture and development process.

In analysing factors that contribute to driver complexity and induce errors, I show that a large proportion of errors are due to two key shortcomings in the device-driver architecture enforced by current operating systems: poorly-defined communication protocols between drivers and the operating system, which confuse developers and lead to protocol violations, and a multithreaded model of computation, which leads to numerous race conditions and deadlocks.
To address the first shortcoming, I propose to describe driver protocols using a formal, state-machine-based language, which avoids confusion and ambiguity and helps driver writers implement correct behaviour. The second issue is addressed by abandoning multithreading in drivers in favour of a more disciplined event-driven model of computation, which eliminates most concurrency-related faults. These improvements reduce the number of defects without radically changing the way drivers are developed.

In order to further reduce the impact of human error on driver reliability, I propose to automate the driver development process by synthesising the implementation of a driver from the combination of three formal specifications: a device-class specification that describes common properties of a class of similar devices, a device specification that describes a concrete representative of the class, and an operating system interface specification that describes the communication protocol between the driver and the operating system.

This approach allows those with the most appropriate skills and knowledge to develop specifications: device specifications are developed by device manufacturers, operating system specifications by the operating system designers. The device-class specification is the only one that requires understanding of both hardware and software-related issues. However, writing such a specification is a one-off task for each class of devices.

This approach also facilitates the reuse of specifications: a single operating-system specification can be combined with many device specifications to synthesise drivers for multiple devices. Likewise, since device specifications are independent of any operating system, drivers for different systems can be synthesised from a single device specification.
As a result, the likelihood of errors due to incorrect specifications is reduced because these specifications are shared by many drivers.

I demonstrate that the proposed techniques can be incorporated into existing operating systems without sacrificing performance or functionality by presenting their implementation in Linux. This implementation allows drivers developed using these techniques to coexist with conventional Linux drivers, providing a gradual migration path to more reliable drivers.

Acknowledgements

I am grateful to my supervisors Gernot Heiser and Ihor Kuz for their guidance throughout the project. Working with them, I enjoyed a lot of freedom in choosing the research direction and exploring various ideas by way of trial and error, while getting advice, feedback, and support when I needed them.

Throughout the project, several people have helped develop the ideas presented in this work. In particular, collaboration with Timothy Bourke on modelling device drivers in Esterel inspired the design of the Tingu driver protocol specification language. Rob van Glabbeek’s lectures on process calculus were a major influence on the design of the Termite driver synthesis tool. Franck Cassez’s introduction to the theory of two-player games was instrumental in developing the Termite synthesis algorithm.

I want to thank Peter Chubb and Etienne Le Sueur for their collaboration in implementing and evaluating the Dingo driver framework. I want to thank Balachandra Mirla for his input in the SD controller driver synthesis case study. I want to thank John Keys for his help in implementing the Termite compiler.

Finally, I want to thank all my colleagues at ERTOS whose knowledge, team spirit, and sense of humour have helped reduce the inevitable stress of being a graduate student.

Related Publications

[1] Leonid Ryzhyk, Peter Chubb, Ihor Kuz, Etienne Le Sueur, and Gernot Heiser. Automatic device driver synthesis with Termite.
In Proceedings of the 22nd ACM Symposium on Operating Systems Principles, Big Sky, MT, USA, October 2009.
[2] Leonid Ryzhyk, Peter Chubb, Ihor Kuz, and Gernot Heiser. Dingo: Taming device drivers. In Proceedings of the 4th EuroSys Conference, Nuremberg, Germany, April 2009.
[3] Leonid Ryzhyk, Ihor Kuz, and Gernot Heiser. Formalising device driver interfaces. In Proceedings of the 4th Workshop on Programming Languages and Operating Systems, Stevenson, WA, USA, October 2007.
[4] Leonid Ryzhyk, Timothy Bourke, and Ihor Kuz. Reliable device drivers require well-defined protocols. In Proceedings of the 3rd Workshop on Hot Topics in System Dependability, Edinburgh, UK, June 2007.

Contents

1 Introduction
  1.1 Improving OS support for device drivers with Dingo
  1.2 Automatic device driver synthesis with Termite
  1.3 Contributions
  1.4 Chapter outline
2 Background
  2.1 The history of device drivers
  2.2 I/O hardware organisation in modern computer systems
    2.2.1 Peripheral Component Interconnect bus
    2.2.2 Universal Serial Bus
  2.3 Device drivers in modern operating systems
  2.4 Generic OS services
    2.4.1 Memory management
    2.4.2 Timers
  2.5 Concurrency and synchronisation in device drivers
    2.5.1 The Mac OS X workloop architecture
    2.5.2 Windows Driver Foundation synchronisation scopes
  2.6 Hotplugging
  2.7 Power management
  2.8 A taxonomy of driver failures
3 Related work
  3.1 Hardware-based fault tolerance
    3.1.1 Drivers in capability-based computer systems
    3.1.2 User-level device drivers in microkernel-based systems
    3.1.3 Device driver isolation in monolithic OSs
    3.1.4 User-level device drivers in paravirtualised systems
  3.2 Software-based fault isolation
  3.3 Fault removal using static analysis
  3.4 Language-based fault prevention and fault tolerance
  3.5 Preventing and tolerating device protocol violations
  3.6 Fault prevention through automatic device driver synthesis
  3.7 Conclusions
4 Root-cause analysis of driver defects
  4.1 Methodology
  4.2 Example
  4.3 Defects caused by the complexity of device protocols
  4.4 Defects caused by the complexity of operating system (OS) protocols
  4.5 Concurrency defects
  4.6 Generic programming faults
  4.7 Limitations of the study
  4.8 Conclusions
5 Device driver architecture for improved reliability
  5.1 Overview of Dingo
  5.2 An event-based architecture for drivers
    5.2.1 C with events
    5.2.2 Implementation of C with events
    5.2.3 Dingo on Linux
    5.2.4 Selectively reintroducing multithreading
    5.2.5 Comparison with existing architectures
  5.3 Tingu: describing driver software protocols
    5.3.1 Introduction to Tingu by example
    5.3.2 Discussion
    5.3.3 Detecting protocol violations at runtime
    5.3.4 From protocols to implementation
  5.4 Evaluation
    5.4.1 Code complexity
    5.4.2 Reliability
    5.4.3 Performance
  5.5 Conclusions
6 Automatic device driver synthesis with Termite
  6.1 Motivation
  6.2 Overview of driver synthesis