Secure with Formal Methods

by

Cynthia Koren Levine Sturton

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy

in

Computer Science

in the

Graduate Division

of the

University of California, Berkeley

Committee in charge:

Professor David Wagner, Chair Associate Professor Sanjit A. Seshia Assistant Professor Brian Carver

Fall 2013 Secure Virtualization with Formal Methods

Copyright 2013 by Cynthia Koren Levine Sturton 1

Abstract

Secure Virtualization with Formal Methods

by

Cynthia Koren Levine Sturton

Doctor of Philosophy in Computer Science

University of California, Berkeley

Professor David Wagner, Chair

Virtualization software is increasingly a part of the infrastructure behind our online activ- ities. Companies and institutions that produce online content are taking advantage of the “infrastructure as a service” cloud computing model to obtain cheap and reliable computing power. Cloud providers are able to provide this service by letting multiple client operat- ing systems share a single physical machine, and they use virtualization technology to do that. The virtualization layer also provides isolation between guests, protecting each from unwanted access by the co-tenants. Beyond cloud computing, virtualization software has a variety of security-critical applications, including intrusion detection systems, malware analysis, and providing a secure execution environment in end-users’ personal machines. In this work, we investigate the verification of isolation properties for virtualization software. Large data structures, such as page tables and caches, are often used to keep track of emulated state and are central to providing correct isolation. We identify these large data structures as one of the biggest challenges in applying traditional formal methods to the verification of isolation properties in virtualization software. We present a new semi-automatic procedure, S2W , to tackle this challenge. Our approach uses a combination of abstraction and bounded model checking and allows for the verification of safety properties of large or unbounded arrays. The key new ideas are a set of heuristics for creating an abstract model and computing a bound on the reachability diameter of its state space. We evaluate this methodology using six case studies, including verification of the address translation logic in the Bochs x86 , and verification of security properties of several models. In all of our case studies, we show that our heuristics are effective: we are able to prove the safety property of interest in a reasonable amount of time (the longest verification takes 70 minutes to complete), and our abstraction-based model checking returns no spurious counter-examples. 2

One weakness of using model checking is that the verification result is only as good as the model; if the model does not accurately represent the system under consideration, properties proven true of the model may or may not be true of the system. We present a theoretical framework for describing how to validate a model against the corresponding source code, and an implementation of the framework using symbolic execution and satisfiability modulo theories (SMT) solving. We evaluate our procedure on a number of case studies, including the Bochs address translation logic, a component of the Berkeley Packet Filter, the TCAS suite, the FTP server from GNU Inetutils, and a component of the XMHF hypervisor. Our results show that even for small, well understood code bases, a hand-written model is likely to have errors. For example, in the model for the Bochs address translation logic – a small model of only 300 lines of code that was vigorously used and tested as part of our work on S2W – our model validation engine found seven errors, none of which affected the results of the earlier effort. i

To my husband, with love and gratitude. ii

Contents

Contents ii

List of Figures iii

List of Tables iv

1 Introduction1

2 Background6 2.1 Virtualization Software...... 6 2.2 Model Checking...... 8 2.3 Related Work...... 11

3 Verifying Large Data Structures using Small and Short Worlds 17 3.1 Running Example...... 18 3.2 Formal Description of the Problem ...... 19 3.3 Methodology ...... 22 3.4 Evaluation...... 28 3.5 Related Work...... 38 3.6 Conclusion...... 39

4 Model Validation 40 4.1 Running Example...... 42 4.2 Theoretical Formulation and Approach...... 43 4.3 Implementation...... 50 4.4 Evaluation: Data-Centric Validation ...... 55 4.5 Evaluation: Operation-Centric Validation...... 59 4.6 Related Work...... 66 4.7 Conclusion...... 67

5 Conclusion 68

Bibliography 69 iii

List of Figures

2.1 An overview of the model checking work flow...... 11

3.1 An illustration of memory with a simple cache...... 18 3.2 The UCLID expression syntax...... 19 3.3 An illustration of a page table walk...... 29 3.4 An illustration of memory with a CAM-based cache...... 33 3.5 An illustration of shadow page tables...... 35

4.1 An overview of the model validation work flow...... 42 4.2 Example code and corresponding model...... 43 4.3 The five steps in our model validation process...... 50 4.4 Example code with a dynamically determined loop bound...... 54 4.5 Simplified code from the BPF program...... 56 4.6 An illustration of the BPF program...... 56 4.7 Simplified code from the ftpd software...... 57 4.8 An illustration of the ftpd program...... 58 4.9 Simplified code from the XMHF software...... 59 iv

List of Tables

3.1 The model of the Bochs TLB...... 30 3.2 Next-state assignments for the shadow paging model...... 35

4.1 Bochs modeling bugs (6 of 7) ...... 62 4.2 Code and path coverage for model validation...... 65 4.3 Types of modeling bugs found...... 65 v

Acknowledgments

I thank my advisor, David Wagner, for his support and guidance throughout my graduate studies. His keen insights and deep understanding of all things security strengthened my research, and I learned a tremendous amount from working with him. I am also grateful to him for helping me to develop collaborations with folks outside of Berkeley’s computer science division. They have been integral to my growth and development as a researcher. Any success I have had is due in large part to those collaborations. I thank my committee members, Brian Carver and Sanjit Seshia, and my collaborators on the work appearing in this thesis: Rohit Sinha, Michael McCoyd, Sakshi Jain, Thurston Dang, and Petros Maniatis. I would especially like to thank Sanjit. Through collaborations with him on this and other work, I have learned to appreciate the power, and limitations, of using formal verification in a security context. His encouragement and advice throughout the years has made him a valuable mentor and greatly enriched my graduate school experience. Although our work together is not a part of this thesis, I would like to thank Sam King and Matthew Hicks for fun, and fruitful, collaborations over the years. I am particularly grateful to Sam for his interest in, and encouragement of my studies and future career. I am grateful to Colleen Lewis for her friendship. We have spent hours together laughing, stressing, and working, and it has all made for a wonderful graduate school experience. I am grateful to my family for their ongoing love and encouragement. And I am deeply grateful to my husband. He has supported me with love, kindness, generosity, and delicious food throughout the long and circuitous path that led to this point. 1

Chapter 1

Introduction

Virtualization software, such as CPU , monitors (VMM), or hyper- visors (HV), provides many practical benefits. It typically sits below the and adds a layer of indirection between the operating system and the hardware platform. This is useful for a variety of applications. It can be used to present an instruction set architecture different than that of the actual hardware platform, which is useful for the de- velopment of operating systems and applications for new platforms. It can be used to mul- tiplex hardware resources to allow multiple operating systems to co-exist on one platform. And it provides a vantage point below the operating system for more complete monitoring and analysis of the system, which is useful in the development and testing of new operating systems, and for malware analysis. Virtualization is useful; however, virtualization software is typically complex and executes with high privilege levels, making it especially vulnerable to attack. In this work, we seek to verify the correctness of security-critical components of virtualization software. We in- vestigate the use of formal verification techniques to prove properties about the security of system virtualization software in order to increase the overall security of the systems that rely on it. Virtualization software is well known for its role in cloud computing, and in particular for infrastructure-as-a-service (IaaS) style cloud computing. With IaaS, the cloud provider maintains a large data center with many servers and rents out compute time to its cus- tomers. A customer can execute their entire software stack for as long as necessary on one of the provider’s servers, and they pay only for the compute time they use. IaaS services allow customers to grow and shrink their infrastructure on demand. For example, an online shopping site can increase their capacity to handle high demand during peak shopping times without the overhead of purchasing, configuring, and maintaining new hardware. During off-peak hours, the customer can easily reduce their capacity and their costs. Customers of IaaS include well known online shopping sites, news media organizations, universities, and social media sites. IaaS style cloud computing is the fastest growing segment of the cloud CHAPTER 1. INTRODUCTION 2 computing market and is expected to reach $9 billion in 2013, up from $6.1 billion in 2012 [1]. Cloud providers are able to take advantage of economies of scale to provide cheap compute capacity by using virtualization software. and virtual machine monitors allow multiple operating systems to execute on a single hardware platform, and give each guest operating system the illusion that it is running alone on the hardware. Since the differ- ent guest operating systems may belong to different customers, possibly even adversarial customers, customers and cloud providers rely on virtualization software to maintain strict isolation between guest OSes. In addition to providing a multiplexed hardware platform, virtualization software enables other tools that make data centers efficient and practical for cloud providers. For example, live migration of running operating systems allow for efficient resource allocation [2], and verifiable accounting of resource use allows for efficient billing [3]. Beyond cloud computing, researchers have proposed using system virtualization software as a platform to increase the security of end-user machines [4,5]. 
The isolation properties provided by hypervisors and virtual machine monitors can be used to implement red–green configurations on client desktops [6–9]. A red-green system allows the user to maintain a separation between their trusted and untrusted activities. One guest OS, the “green” one, is locked-down, trusted, has access to private data and a secure network, but has little access to the public Internet. A second, “red” guest OS has full access to the Internet, but not the secure network or private data. Because the virtualization software sits beneath the OS layer in the system stack, it is well situated for OS introspection: it has a complete view of OS activity, and it protects itself from access by any code executing within the guest. A wide variety of malware detection and analysis systems based on virtual machine monitors, hypervisors, and emulators have been developed to take advantage of this property. Intrusion Detection Systems (IDS) usually have to choose between existing as a kernel module in the end system or sitting in the network, entirely outside the system. In the former case, the IDS has a complete view of the system, but it exists inside the system and so is vulnerable to attack from the very malware it aims to detect, rendering the IDS ineffective. In the latter case the IDS is inaccessible to the end system and therefore is protected from compromise, but it has an incomplete view of an end system’s activity. With virtualization software, the IDS can exist in the virtualization layer below the OS. In this layer it has a complete view of the system it aims to protect, but is kept isolated and unreachable by the potentially compromised OS [10,11]. This property makes virtualization software especially powerful for the detection of rootkits [12–14], which traditionally have controlled the lowest level of software, making them difficult to detect from within the infected OS. Its powerful vantage point also makes virtualization software useful for malware analysis [15– 23]. Malware can be allowed to run in the guest OS, while its behavior is inspected by the virtualization software. The virtualization software protects itself from any activity in the guest OS, preventing the malware from shutting down the analysis. Similar techniques have also been shown to be useful for auditing and debugging operating systems [24, 25]. In CHAPTER 1. INTRODUCTION 3 addition to detection and analysis, virtualization software has also been used to prevent compromise of the guest OS by preventing any unauthorized code from executing while the guest is in kernel mode [26–32]. Rather than focus on protecting the guest operating system from infection, a number of researchers have taken the view that it is better not to trust the OS at all, and tools based on hypervisors, emulators and virtual machine monitors have been developed to protect user- level code and data from a malicious OS [33–36]. There has even been work done on using nested virtualization to increase the security of the hypervisor or virtual machine monitor itself [37,38]. In all of these systems, the strength of the tool depends on the strong isolation and contain- ment properties provided by the virtualization software. Because it sits below the operating system, and because it tends to have a considerably smaller code base than a commodity op- erating system, some researchers have suggested that virtualization software makes an ideal platform for building secure systems [4]. 
However, virtualization software is not necessarily immune from vulnerabilities. Although smaller than an operating system, the implementa- tion of virtualization software can still be quite large. The popular hypervisor is roughly 150 KLOC [39], and the KVM kernel module is roughly 42 KLOC.1 The code is complex and previous research has found errors in popular virtualization software [40,41]. Vulnerabilities that allow a guest to access the host’s memory space have also been discovered [42–45]. The goal of this work is to prove the correctness of the isolation and containment properties that are fundamental to the security of so many tools and systems based on virtualization software. To do this, we focus on virtualization software’s management of memory resources. The memory management components are responsible for providing to the guest the correct virtual-to-physical memory translation, which is usually provided by hardware. They must also provide the correct mapping from a guest operating system’s view of physical memory to the machine’s physical memory. The memory management code typically includes sets of page tables and caches. Proving safety properties about these large data structures presents a challenge to formal verification techniques. In this work, we focus on the verification of existing systems, rather than the development of a new hypervisor or emulator, and we use model checking to perform the verification. In the first half of this thesis (Chapter3), we consider the verification of safety properties in systems with large arrays and data structures, such as address-translation tables and other caches. These large data structures make automated verification based on straightforward state-space exploration infeasible. We present S2W , a new abstraction-based model-checking methodology to facilitate automated verification of such systems. As a first step, inductive invariant checking is performed. If that fails, we compute an abstraction of the original system by precisely modeling only a subset of state variables. This subset of the state constitutes a “small world” hypothesis, and is extracted from the property. Finally, we verify the safety property on the abstract model using bounded model checking. We ensure

1Generated using David A. Wheeler’s ‘SLOCCount’ http://sourceforge.net/projects/sloccount/. CHAPTER 1. INTRODUCTION 4

the verification is sound by first computing a bound on the reachability diameter of the abstract model. For this computation, we developed a set of heuristics that we term the “short world” approach. We present several case studies, including verification of the address translation logic in the Bochs x86 emulator, and verification of security properties of several hypervisor models. Through our case studies we demonstrate that with our approach, model checking can be successfully applied to large table-like data structures; this removes one key barrier to automated verification of system virtualization software. The material in this chapter is based on joint work with R. Sinha, P. Maniatis, S. A. Seshia, and D. Wagner [46]. One limitation of this approach is that the verification is done on a model and not on the code directly. As a consequence, the result of verification is only as valid as the model; if the model does not accurately capture the behavior of the system, a property proven true of the model may or may not be true of the actual system. Therefore, it is essential to validate the model against the source code from which it is constructed. In the second half of this thesis (Chapter4), we present a framework for validating the manually-built model against the code. The framework consists of two components. The first, data-centric model validation, checks that, for data structures relevant to the property being verified, all operations that update these data structures are captured in the model. The second, operation-centric model validation, checks that each operation is correctly simulated by the model. Both components are based on a combination of symbolic execution and satisfiability modulo theories (SMT) solving. We demonstrate the application of our methods on several case studies, including the model of the address translation logic in the Bochs x86 emulator that we verify in Chapter3, the Berkeley Packet Filter, a TCAS benchmark suite, the FTP server from GNU Inetutils, and a component of the XMHF hypervisor. This demonstrates that it is possible to validate the model against the code and gain increased confidence that modeling errors have not affected our ability to find bugs in the code – making formal verification of system virtualization software all the more compelling. The material presented in this chapter is based on work done jointly with R. Sinha, T. H. Y. Dang, S. Jain, M. McCoyd, T. W. Yang, P. Maniatis, S. A. Seshia, and D. Wagner [47]. In summary, the thesis of this work is:

Abstraction-based model checking techniques enable verification of large, sym- metric data structures in software. In addition, the results of model checking can be strengthened through model validation techniques based on symbolic ex- ecution and SMT solving. These two results combine to provide an automated or semi-automated program verification technique suitable for proving security- critical isolation properties in virtualization systems.

We expect our results may have applications beyond the security of system virtualization software. Our work on S2W is applicable to any system with large, table-driven data struc- tures and gives us a new tool for dealing with the state space explosion problem in that CHAPTER 1. INTRODUCTION 5 setting. And, model validation is a fundamental issue for all users of formal verification; our techniques may be broadly applicable to many applications of formal methods. 6

Chapter 2

Background

2.1 Virtualization Software

Virtualization software is a thin layer that sits below an operating system. It presents a vir- tualized hardware interface to the operating system above, introducing a level of indirection for the hardware–software interface. This intermediate layer can be used for many purposes: It can multiplex hardware resources, allowing multiple operating systems to run on a single platform. It can present to the operating system an instruction set architecture (ISA) that is different from the actual hardware ISA, allowing an OS and software compiled for one platform to be run on a different platform. And, this intermediate layer can provide an iso- lated execution environment in which it mediates all accesses to physical system resources. CPU emulators, virtual machine monitors (VMM), and hypervisors (HV) are all examples of virtualization software, and in this work we use “virtualization software” to refer to any of these software systems.

Virtual Machine Monitors and Hypervisors

Popek and Goldberg first formalized the definition of a VMM in 1974 [48]. They define a virtual machine (VM) as “an efficient, isolated duplicate of the real machine,” and a VMM as the software that provides the VM environment. The VM is not any particular piece of software, rather it is the operating environment within which operating systems and applications may run. For example, the VM will include a virtual processor. To any software running in the VM, this processor will behave like a physical CPU. In reality, the virtual processor is a combination of the VMM software and the physical CPU hardware. Most instructions executed on the virtual processor are likely executed directly on the underlying physical CPU, but some are emulated by the VMM. The VMM can take advantage of the trapping mechanism provided by the physical CPU in order to intervene and emulate some CHAPTER 2. BACKGROUND 7 instructions when needed. All of this is done in a way that is transparent to software executing in the VM; the software executes on the virtual processor without being aware that it is virtual, and actually comprises both hardware and software. Popek and Goldberg prescribe three properties that a VMM must provide: equivalence, resource control, and efficiency. Software running in the VM must produce the same effect as it would if it ran directly on the hardware, and not in the virtualized environment. The VMM must have control over hardware resources: the VM should not be able to access resources that have not been allocated to it and the VMM should be able to regain control of any resource already allocated to a VM. Software running in the VM environment must run efficiently. More specifically, a majority of the instructions executed must run directly on hardware without the intervention of the VMM. This last requirement differentiates VMMs from CPU emulators. VMMs can be classified into two types [49]. A Type I VMM runs directly on the hardware platform without an underlying operating system. It has the highest level of privilege on the machine. Type II VMMs run on top of the host operating system, rather than on the bare metal. In both cases, the guest operating systems run in a layer above the VMM. Originally, “hypervisor” was another name for a Type I VMM. However, the lines between Type I and Type II VMMs are easily blurred and today the terms “hypervisor” and “VMM” are often used interchangeably, regardless of type. Some examples of well-known VMMs include VMware [50], Xen [51], KVM [52], and VirtualBox [53].

CPU Emulators

A CPU emulator is software that allows code compiled for one hardware platform to run on a different platform. For example, an application compiled for the PowerPC processor could be run on an x86 processor with the help of a CPU emulator. Emulators are not limited to use by applications; an entire operating system compiled for one architecture may be run on a different architecture with the help of a CPU emulator. At its heart, an emulator works by translating the instructions of the guest binary code from the target (emulated) architecture to that of the host architecture. One way to manage the translation is by interpreting each instruction as it comes up. In this case, all CPU state, flags and control registers are implemented as variables in the emulation software and each instruction is implemented as a software function. Bochs is an example of a CPU emulator that uses interpretation [54]. A second type of translation works by recompiling the target binary code to the host architecture; recompilation is usually done one basic block of code at a time. QEMU is an example of a CPU emulator that uses recompilation [55]. Virtual machine monitors, hypervisors, and CPU emulators all work slightly differently, but they have in common the need to manage the often complex virtualization of system re- sources. In this work, we focus on memory and how virtualization software manages the CHAPTER 2. BACKGROUND 8 mapping from a guest operating system’s view of memory to the machine’s physical mem- ory. The correct management of physical memory is key to providing the isolation between guest operating systems, or between the guest operating system and the host’s execution that the virtualization software is often trusted to do. Virtualizing memory typically in- volves managing both page tables for address translation and caches of previously translated addresses. For example, Bochs will allow the OS to set up page tables with up to four levels of indirection, and it uses a translation lookaside buffer (TLB) with 1024 entries, each 160 bits wide, to cache previously translated addresses. It is the verification of these large data structures that we focus on in this work.

2.2 Model Checking

Throughout this work we use model checking-based techniques for our program verification. Model checking is a mature area of research, and there exist many variants. However, all model checking techniques have in common an approach to verification based on an exploration of the program’s state space. A benefit of this approach is that if a property is disproven, the model checking engine can usually provide a counter-example showing the state, or series of states, that led to the failure. Model checking was first introduced in the early 1980’s [56–59] as a method for program verification that could be mostly automated. The original model checking algorithm is a graph-theoretic approach to program verification. The core idea is to represent the pro- gram to be verified as a directed graph, with nodes representing program states and edges representing transitions between states. Using a temporal logic such as Linear Temporal Logic [60] or Computational Tree Logic [56], properties about the program can be stated as properties about a path through the graph, or as properties about a sub-tree of the graph rooted at a particular node. Efficient graph exploration algorithms can then be used to prove or disprove the property. While these early model checking algorithms can efficiently explore the state graph of a system, problems still arise for systems with large state spaces. The state space of a program is roughly exponential in the number of state variables. In software systems especially, the size and number of state variables can be large in the presence of large data structures. This “state space explosion” tends to limit the usefulness of so-called explicit state model checking, and various techniques have been developed to overcome it. One optimization is symbolic model checking. Rather than reasoning about individual states and transitions, symbolic model checking operates over sets of states and transitions. The sets, represented by Boolean formulas, represent the state space of the program symbolically. One way to store the Boolean formulas is as a Binary Decision Diagram (BDD). A BDD is a directed, acyclic graph that can efficiently represent many states; it is essentially a finite state automaton that takes as input a system state, represented as a sequence of binary digits, and accepts only those states that are reachable in the system. This type of symbolic model checking CHAPTER 2. BACKGROUND 9

can handle programs with a couple hundred variables, and state spaces ranging from 1030 to 1090 [61], whereas an explicit state model checking algorithm is restricted to state spaces that are small enough to be fully enumerated. While BDD-based symbolic model checking can handle vastly larger designs than explicit state model checking, it still often can not scale to the sizes necessary for model checking software. Another symbolic model checking approach, bounded model checking [62], can handle much larger designs. In bounded model checking, the state space and property to verify are expressed as satisfiability problems and given to a SAT solver to either prove or disprove the property. SAT-based bounded model checking can scale to handle large designs. However, the drawback is that the state space of the program is explored only up to a certain depth. The behavior of the system past that depth is unknown. Another approach for managing the large state space of software systems is to introduce some abstraction into the model of the program. In an abstraction, multiple program states are elided into one, and information about the program is lost. In a sound abstraction, a property proven true of the abstract model is true of the original program. However, the reverse may not be true: in a sound abstraction a safety property may be disproven for the abstract model even though it is actually true of the original program. Abstractions can be introduced by omitting some details of the original program from the model. For example, state variables deemed irrelevant for the verification task at hand may be left out of the model, reducing the total number of program states. Abstractions may also be introduced for systems with multiple, symmetric variables, processes, or data structures. In these cases a single instance might be modeled precisely, while the state and transitions of all other instances are conservatively over-approximated. For example, a single entry in a large array may be modeled precisely, while all other entries are modeled abstractly. In Chapter3, we present a new technique for model checking systems with large data structures using a form of symmetry-based abstraction. Although originally developed for the verification of hardware designs and other finite state systems, model checking techniques have successfully been applied to verifying software systems [63, 64]. Software model checking tools typically comprise an input or modeling language, a language for formalizing the properties, and the model checking engine. A typical work flow for model checking is shown in Figure 2.1. Given a program S, the first step is to build a model M of the program using the input language of the model checking engine. At the same time the property to be proven, Φ, must be formalized using the specification language of the model checking engine. In our work, we focus on verifying safety properties, properties that state an invariant of the system. In this case, the model checking engine checks that Φ holds at all system states, or equivalently, that no bad state, in which Φ does not hold, is ever reachable. The model checking engine can output one of three results. The first is that the property is proven: the engine explored the entire state graph of the model and found no bad state. The second possible outcome is that the property does not hold. In this case, the model checking engine was able to find a reachable state in which Φ CHAPTER 2. 
BACKGROUND 10 was false, and typically, the model checking tool will return a counter-example. The third possible outcome is that the model checking engine was not able to prove the property holds, but neither was it able to find a bad state and corresponding counter-example. This result essentially means the model checking engine either timed out or ran out of memory before finding a counter-example or successfully proving the property. Model checking is an automated technique that allows us to prove properties about software systems relatively quickly and without requiring significant expertise in formal methods. However, it does have its drawbacks. One concern is that verification is done on a model of the program, not on the program itself. Typically, the model is built manually and it is always possible for human error to introduce discrepancies between the model and the program. Even when a model checking tool operates directly on source code, it will internally construct a model of the code, and the verification often relies on manual pruning of the code plus manually created environment models. If a model does not correctly represent its program, any subsequent verification results do not accurately reflect the correctness of the program. In Chapter4 we present a framework for validating a model against its program to address this weakness. A similar issue arises during code maintenance. Any updates to a verified software system will require updating and reverifying the model. Keeping the code and model synchronized throughout the lifetime of the system represents a serious engineering challenge. We do not address this issue directly, but rather point it out as evidence of the broad need for model validation research in general. Another consequence of manually built models is that the modeling effort involved in the verification of large system software, such as a hypervisor or CPU emulator, can be monumental. Techniques to automatically derive the model from the program would be useful and would mitigate that concern. In our work, we manage this limitation by focusing our verification efforts on one component of the system that is critical to security: the memory management subsystem. A second concern is that the formal property Φ may not be the right property. That is, it may not accurately reflect the high level system property that we wish to verify. This is a concern with any formal verification technique, including model checking. Typically, verification is done with some high-level English language property in mind. However, the verification tool, in this case, the model checking engine, requires the property be formalized. Correctly capturing the intended property in a mathematical formalism can be difficult. One way to protect against this type of error is to clearly define the high level properties of interest at the beginning of the verification effort and spend some time developing the corresponding mathematical statement. We do this in Chapter3, where we discuss the high level security goals for this verification effort and spend some time developing the related formal properties. CHAPTER 2. BACKGROUND 11

SM Extract Model ? M |= φ X φ

Model ? Checking Engine

Figure 2.1: Model checking work flow.

Given a system S, a model M is derived; M, along with the property to verify, φ, are given to the model checking engine. The model checking engine can output one of three results: the property is proven to hold for the model, the property is proven not to hold, or the property can not be proven to hold, but neither has a counter-example been found before the engine times out.

2.3 Related Work

Formal Verification of Systems Software

Verifying the security of virtualization software has its roots in a long history of work verifying the security of operating systems. Dating back to the late 1970’s, early verification efforts concentrated on proving properties related to process or data isolation [65]. Typically, the operating system was designed from the ground up, alongside the verification effort, and was based on a capability mechanism in order to make verification of isolation properties more tenable [66]. Formalizing the desired property was a large part of the research effort. The properties were then proven for an abstract, high-level specification of the system, and then as the specification was refined and details were added, each new, lower-level specification was formally shown to refine its predecessor. The last level of refinement was the implementation, either in a high-level language like C or in the executable machine language, and this implementation was proven to correctly implement the lowest-level specification. By transitivity, the property proven true of the abstract specification was shown to be true of the implementation. These early verification efforts were done using semi-automated, machine checked proofs. The Provably Secure Operating System (PSOS) [66], the UCLA Unix Security Kernel [67], and Kit [68,69] are all examples of these systems, known generally as “security kernels.” In 1981 Rushby introduced the concept of a separation kernel [70]. Imagined as a way to make the verification of secure operating systems easier, the basic idea of a separation kernel is to mimic logically the physical separation provided by a distributed system. In CHAPTER 2. BACKGROUND 12 a separation kernel, execution contexts are separated into partitions, data is kept isolated within its partition, and information flow between partitions is always mediated. Typically, two partitions can communicate only through a small set of communication channels, which are determined once at initialization and kept invariant thereafter. The behavior and data of one partition is kept entirely separate from, and can have no effect on, every other partition. Separation kernels make verification of secure operating systems easier by separating the pol- icy from its enforcement. A separation kernel is responsible for providing data isolation and mediation for all information flow between partitions. However, it is up to an initialization process (usually running above the kernel level) to determine how the partitions should be set up: how many partitions there should be, which processes should run in each partition, and which partitions should be allowed to communicate with each other. In other words, it is up to that initialization process to set up the policy that says what security proper- ties the system should provide. In this way, a separation kernel can be verified to show it properly enforces the partitioning, while the security policy of a particular system can be independently verified to show the system’s policy provides the desired security properties. Greve et al. formalized the separation policy that a general purpose separation kernel should provide [71]. More recently, separation kernels have been used in a variety of security- and safety-critical contexts. Baumann et al. demonstrate the verification of the memory manager in PikeOS, a separation kernel based on the L4 kernel [72]. They show that memory will never be misal- located between partitions, and threads can only access memory within their own partition. 
The verification is done using a combination of automatic and manual tools. They use the VCC verifier for concurrent C code to establish certain lemmas hold for the source code, then a manual proof combines the lemmas into a proof of the isolation properties. Separation kernels have been successfully used as the basis for systems requiring Common Criteria certification of level EAL6 or EAL7, the highest levels possible. Richards demon- strated the formalization and verification of a separation kernel in a real-time operating system environment used in avionics [73]. The verified separation property ensures fault- containment: a fault occurring in one partition will have no effect on the behavior of any other partition. Martin et al. report on the formal specification and development of a separa- tion kernel for use in a smartcard [74]. The kernel provides the basis for the key management system of an F22 fighter aircraft and for secure radios used by the navy. Heitmeyer et al. demonstrate the formalization of a data separation property, and its verification, for an em- bedded device [75,76]. The kernel provides temporal as well as spatial data separation, as a partition’s classification could change over time, and memory accessible at one classification level may not be accessible at a different classification level. Perhaps the most complete verification, to date, of a modern, general purpose microkernel is the seL4 project [77, 78]. The authors provide complete functional verification down to the source code level of the kernel, and their verification of information flow security properties make seL4 applicable as a separation kernel. The kernel’s functionality is expressive enough CHAPTER 2. BACKGROUND 13 to allow a complete paravirtualized kernel to run in one partition. However, this high assurance does come at a cost. Partitions are determined statically, and any desired communication between partitions must also be set up statically at initialization. Scheduling of partitions is done in a pre-determined, set round-robin fashion, with each partition getting a fixed time slice. As such, although an excellent example of verification of system software, seL4 is not suitable for use as a general purpose hypervisor. For example, data centers use hypervisors to try to eke out the most efficient allocation of resources to guest operating systems with varying workloads. In all these cases, the formalization of the separation property differs slightly to suit the operating context, but the main goal – verifying isolation between components – is similar to our efforts to verify isolation between guest operating systems in a virtualized environment. The above research differs from ours, though, in that all the above cases, the separation kernel and its verification were developed together, whereas we focus on the verification of legacy software systems.

Verification of Virtualization Software

In this section we discuss some of the previous research exploring the possibility of verifying virtualization software. Barthe et al. built a model of memory management in a paravirtualized hypervisor based on a simplified version of Xen [79]. They use the Coq proof assistant [80] to verify three properties of the model: 1) Isolation: A guest operating system can only read or write its own memory. 2) Non-interference: the behavior of one guest operating system is not influenced by the behavior of any other guest. 3) Liveness: a request made by a guest operating system to the hypervisor will eventually get a response. The model and proofs took around 20 KLOC in Coq’s input language. The Xenon project takes a step toward the verification of a general purpose hypervisor [81– 83]. Based on the Xen hypervisor, Xenon strips out some non-essential features to make verification easier. The authors developed a formal specification of a security policy that guarantees non-interference between domains. This specification served as a guiding doc- ument during the re-engineering effort. They also developed formal models of some parts of the hypervisor, such as the hypercall interface. Full verification of an information flow security policy in a commodity hypervisor not designed for verification is an ambitious goal, and development of the formal models and specifications are important first steps. However, the verification of the specified properties was not completed. The Hyper-V/VCC project represents perhaps the most ambitious hypervisor verification project to date [84]. The Hyper-V hypervisor is a large general-purpose hypervisor com- prising roughly 100 KLOC in C and 5 KLOC in assembly [85]. It was not designed for verification, rather the reverse: the verification tool, VCC, was specifically designed for use CHAPTER 2. BACKGROUND 14 on large software systems such as Hyper-V. VCC is a verifier for concurrent C code [86]. It requires code be annotated and relies on code contracts, such as function pre- and post- conditions to develop the verification formula that can then be sent to an SMT solver for proof of validity. Along the way, the authors of this project developed a “baby hypervisor,” the verification of which they used to guide the development of VCC [87]. For both the baby hypervisor and Hyper-V, the authors concentrated their verification efforts on memory virtualization, in particular that the virtual TLB presented to a guest correctly emulates the behavior of a physical TLB [88].

Designing Virtualization Software for Verification

Formal verification of large software systems often runs into difficulty handling the complex- ity in these systems. The research described in this section tackles that problem by building systems specifically designed to make formal verification feasible. The designs focus on mak- ing the code modular, with well-defined interfaces, and often with functionality limited in some way that will considerably reduce complexity, thereby making verification easier. Nova is an example of a hypervisor specifically designed for verification [89]. Similar to mi- crokernels, much of the functionality of the hypervisor is moved to the user-level, minimizing the amount of privileged code that must be verified. Also similar to many microkernels, the Nova hypervisor is built around capability-based protection domains. The verification goals are to demonstrate spatial and temporal isolation between guest operating systems. The authors of Nova concentrate on tackling two challenges present in any verification effort: correct modeling of the system to be verified and correct representation of the system’s en- vironment. They tackle the first problem by developing a compiler from a subset of C++ to a formal semantics suitable for use as input to the PVS interactive theorem prover. This en- ables direct verification of the source code without requiring a modeling step. This is similar in some respects to the Hyper-V verification effort, which also included a translation from annotated C to a logical formula that could be then fed into a theorem prover. To model the environment, the Nova authors developed a formal description of parts of the IA32 architec- ture, including models of three memory interfaces provided by the physical CPU: physical RAM, memory-mapped devices, and virtual memory [90, 91]. Although verification of the Nova hypervisor was not completed, its design, along with the environment models provide a strong contribution to the field of hypervisor design-for-verification [92]. Another, recent effort of hypervisor design for verification is the XMHF hypervisor frame- work [93]. The framework is designed to be an easily extended platform for building security- critical, hypervisor-based applications. The authors make three design choices to enable verification of the framework and any hypervisors built on top of it: 1) Common hypervisor functionality is built into the XMHF core, so that it can be verified once and then used by any hypervisor built on the framework. 2) The framework relies on new hardware support for virtualization [94], including nested page tables [95,96], DMA protection [97,98], and support CHAPTER 2. BACKGROUND 15 for dynamic root of trust [99]. This obviates the need to verify complex software implemen- tations of these systems. However, it does not remove the complexity; rather, it pushes the complexity down to the hardware level. 3) Hypervisors built on the framework are restricted to supporting a single guest. This allows the hypervisor to be sequential and single-threaded, and it allows the guest OS to directly control any peripheral hardware devices. Verification of the XMHF framework focuses on proving memory integrity, i.e., guaranteeing that no guest can modify any of the hypervisor’s code or memory. This is done using the CBMC model checker; however, some aspects of the code base, such as the logic for looping over the large page table structures, can not be handled by CBMC and for those portions, manual auditing is used to give confidence in the code’s correctness. 
Managing large data structures during verification is a well-known challenge, and in Chapter3 we present one solution for how to formally verify such structures.

Testing Virtualization Software

Another way to validate the correctness of virtualization software is by testing. With testing, the focus is on bug finding, rather than proving correctness. And, although the state space of these systems is by far too large to allow testing to be complete, several methods have been developed which have proven effective. Ormandy first used black box fuzz testing to look for security vulnerabilities in four virtualization software systems: Bochs, QEMU, VMware, and Xen [100]. He focused on the instruction decoding and handling mechanisms and used randomly generated inputs to find instructions or I/O activity that would cause the emulator to crash or exit abnormally. He found bugs in all four systems tested, and the bugs ranged from buffer and heap overflow errors to divide-by-zero errors. Ormandy showed that, for each system tested, an attacker could exploit the vulnerabilities found to reliably halt the virtualization process and, in some cases, take over the host process to run arbitrary code on the host. The latter is clearly dangerous, and the former can be used by the attacker to thwart malware analysis. Martignoni et al. used directed fuzzing combined with differential testing to find bugs in process emulators, system emulators, and virtual machines [41]. In all cases, testing focused on finding defects in the emulation software, i.e., test cases or instructions that resulted in the emulated state differing from what the true state would have been. One of the challenges in using fuzzing-based methods on emulation software is achieving high instruction coverage; a random sampling of byte-code patterns is unlikely to achieve full x86 instruction coverage. A second challenge is determining, for each test case generated, what the correct system state should be after executing the instruction. Martignoni et al. tackle the first problem using a combination of purely random byte-code patterns and random data fields with known opcode fields. The authors tackle the second problem using a physical CPU as the oracle that gives the correct post-test case state. The authors looked at the system emulators Bochs, QEMU, VMware, and VirtualBox and found defects in all four of the systems. Paleari et al. applied a similar technique to x86 disassemblers, and again found bugs in all systems tested [101]. CHAPTER 2. BACKGROUND 16

Disassembly code is similar to the instruction decode algorithms found in CPU emulators and hypervisors. A different approach to testing virtualization software was taken by Martignoni et al. with their technique dubbed “path lifting” [40]. The authors use symbolic execution to explore paths through one “high-fidelity” emulator and use the results to generate test cases for a second “low-fidelity” emulator, looking for places where the behavior between the two emulators diverges. This approach is more complete than fuzz-based testing, but is still primarily concerned with bug finding. Like all the testing based techniques discussed in this section, path lifting is effective at finding bugs, but is unable to prove the absence of bugs or prove the validity of safety properties. That is the essential trade-off made between testing and formal verification: testing allows more automation at the expense of less strong results. 17

Chapter 3

Verifying Large Data Structures using Small and Short Worlds

A particular challenge for the verification of CPU emulators, virtual machine monitors, and hypervisors is their use of large data structures. For example, logical-to-physical address translation requires data structures to store the CPU’s Translation Look-aside Buffer (TLB) and page tables. While these structures are finite-length for any given processor, they are usually too large to represent precisely for verification; often, they are abstracted to be of unbounded length. The data structures in the resulting model of the system are thus parametrized: the indices into those structures are parameters, taking values in a very large or even infinite domain (typically finite-precision bit-vectors or the integers). The techniques proposed for verifying such parametrized systems fall into two classes: those based on a small- model or cut-off theorem (e.g., [102–104]), or those based on abstraction (e.g., [105–107]). While existing approaches are elegant and effective for their respective problem domains, they fall short for the problems we consider: the small-model approaches usually restrict expressiveness, while abstraction-based approaches either focus on control properties (as opposed to equivalence/refinement) or handle only certain kinds of data structures. In both cases, some of the realistic case studies we consider cannot be handled. (We make a fuller comparison in Section 3.5.) In this chapter, we present a new semi-automatic methodology for verifying safety properties in systems with large data structures [46]. Our approach comprises three steps. First, we employ standard mathematical induction to verify the safety property, and if that succeeds, the process is complete. Second, if induction fails, we create an over-approximate abstraction of the system, the “small world,” in which unbounded data structures are parametrized and, in general, only a subset of the state is updated as per the original transition relation (e.g., only a few entries of the unbounded data structures); the rest of the state is updated with arbitrary values at each step. With this abstraction, the model is more amenable to state- space exploration. Third, we attempt to find a bound k on the reachability diameter of CHAPTER 3. VERIFYING LARGE DATA STRUCTURES USING SMALL AND SHORT WORLDS 18

Figure 3.1: Running example.

A read-only memory and a single-entry cache. The cache is updated on each read command.

the small world so that, if bounded model checking (BMC) for k steps succeeds in the small world, then the safety property must hold in the small world, and since that is an over-approximation of the original system model, then the safety property holds there as well. Heuristics are presented for finding k that are effective for the class of systems we consider. We term this BMC-based approach the “short world” method, since it relies on computing a “short” bound for BMC. Our overall approach, termed Small-Short-World (S2W ), is implemented on top of the UCLID system [108], which verifies abstract, term- level models using satisfiability modulo theories (SMT) solving. Note that the temporal safety verification problem for our class of systems is undecidable. As a result, S2W is a semi-decision procedure.

3.1 Running Example

We introduce here a running example: a simple read-only memory system with a single- entry cache. We prove an invariant about the value returned by a read command. We build a model in our modeling language and demonstrate the verification of the safety property using S2W . Our example is meant to be small, understandable, and illustrative, rather than “real-world.” Our example system (Figure 3.1) takes only one command, read, with a single parameter, the 32-bit address to be read; it returns a single-bit data value. At each read command, the cache is first checked. If the cache contains the data for the address requested, that value is returned. Otherwise, the value is read from memory. In either case, the cache is updated with the requested address and the returned data value. The update to cache is shown in the CHAPTER 3. VERIFYING LARGE DATA STRUCTURES USING SMALL AND SHORT WORLDS 19 above figure (we use “◦” to mean concatenation). We prove an invariant about the cache: if the cache holds a valid address, then the cached data value is equal to the value stored in memory at that address. In other words, we show that the cache is correct.

3.2 Formal Description of the Problem

Notation and Terminology

A system is modeled as a tuple S = (I, O, V, Init, A) where •I is a finite set of input variables; •O is a finite set of output variables; •V is a finite set of state variables; • Init is a set of initial states; and •A is a finite set of assignments to variables in V. Assignments define how state variables are updated, and thus define the transition relation of the system. Input and output variables are assumed combinational (stateless), without loss of generality. V is the only set of state-holding variables. Variables can be of two types: primitives, such as Boolean or bit-vector; and memories, which includes arrays, content-addressable memories (CAMs), and tables. An output variable is a function of V ∪ I. When representing a system without outputs, we will omit O from the representation. The set of initial states, Init, can either be viewed as a vector of symbolic terms representing any initial state, or as a Boolean-valued function of assignments to V, written Init(V). Figure 3.2 denotes the grammar for expressions in our modeling language. The language has three expression types: Boolean, bit-vector, and memory.

bE  ::= true | false | b | ¬bE | bE1 ∨ bE2 | bE1 ∧ bE2 | bvE1 = bvE2
      | bvrel(bvE1, ..., bvEk)  (k ≥ 1) | UP(bvE1, ..., bvEk)  (k ≥ 0)

bvE ::= c | v | ITE(bE, bvE1, bvE2) | bvop(bvE1, ..., bvEk)  (k ≥ 1)
      | mE(bvE1, ..., bvEl)  (l ≥ 1) | UF(bvE1, ..., bvEk)  (k ≥ 0)

mE  ::= A | M | λ(x1, ..., xk).bvE  (k ≥ 0)

Figure 3.2: Expression syntax.

c and v denote a bit-vector constant and variable, respectively, and b is a Boolean variable. bvop denotes any arithmetic/bitwise operator mapping bit-vectors to bit-vectors, while bvrel is a relational operator other than equality mapping bit-vectors to a Boolean value. UF and UP denote an uninterpreted function and predicate symbol, respectively. A and M denote constant and variable memories. x1, ..., xk denote parameters (typically indices into memories) that appear in bvE.

The simplest Boolean expressions (bE) are the constants true and false or Boolean variables b; more complicated expressions can be constructed using standard Boolean operators or using relational operators with bit-vector expressions. We also allow a Boolean expression to be an application of an uninterpreted predicate to bit-vector expressions. Bit-vector expressions (bvE) include bit-vector constants, variables, if-then-else expressions (ITE), and expressions constructed using standard bit-vector arithmetic and bitwise operations. Additionally, bit-vector expressions can be constructed as applications of uninterpreted functions returning bit-vector values and applications of memories to bit-vector arguments. Each bit-vector expression has an associated bitwidth. When a bit-vector expression is used as an index into a memory we may abstract the bitwidth to be unbounded, meaning that the memory is of arbitrary size. Finally, the primitive memory expressions (mE) can be (symbolic) constants or variables. More complex memory expressions can be modeled using the lambda notation introduced by Bryant et al. [108] for term-level modeling; this includes the standard write (store) primitive for modeling arrays, as well as more general operations such as parallel updates to arrays, operations on CAMs, queues, and other data structures.

A next-state assignment α denotes an assignment to a state variable and is a rule of the form next(x) := e, next(x) := {e1, e2, ..., en}, or next(x) := {∗}, where x is a signal in V, and e, e1, e2, ..., en are expressions that are a function of V ∪ I. The curly braces express non-deterministic choice. The wildcard "∗" is also an expression of non-deterministic choice; it is translated at each transition into a fresh symbolic constant of the appropriate type. The set of all next-state assignments defines the transition relation R of the system. Formally, R = ⋀_{α∈A} r(α), where r(next(x) := e) ≜ (x′ = e) and r(next(x) := {e1, e2, ..., en}) ≜ ⋁_{i=1}^{n} (x′ = ei), and x′ denotes the next-state version of variable x. We will sometimes write the transition relation as R(V, I, V′) to emphasize that it relates current-state variables V and next-state variables V′ based on the inputs I received.

Example 1 We formally describe our model from Section 3.1. Let ST = (I, O, V, Init, A) be the system, with

• I = {addr}. addr is the 32-bit address to read from memory.

• O = {out}. out is the value read from either memory or the cache.

• V = {mem, cache}. mem is constant and is modeled as an array of one-bit bit-vectors. It is represented by an uninterpreted function that maps a 32-bit address to a single bit. cache is a single 33-bit bit-vector; it holds the one-bit data value and the 32-bit address of that value.

• Init = (mem0, cache0). mem is initialized to hold arbitrary data values at each address. cache is initialized to hold an invalid address, 0x00000000, with an arbitrary data value.

• A. On each read command, cache is updated with the address read and the value returned by the read; mem remains constant.

Problem Definition

Consider a system S modeled as described in the preceding section. We similarly model the environment E that provides the inputs for S and consumes its outputs. The composition of S and E, written S‖E, is the model under verification, M. The form of the composition depends on the context; we use both synchronous and asynchronous compositions. We will represent the closed system M as a transition system (VM, InitM, RM), where the elements respectively denote state variables, initial states, and the transition relation. In all of our examples, the environment E is stateless, generating completely arbitrary inputs to S at each step; thus VM = V, InitM = Init, and RM = R. This chapter is concerned with the verification of temporal safety properties of the form G Φ, where G is the temporal operator "always" and Φ is a state invariant of the form

∀x1, ..., xk. φ(x1, ..., xk)    (3.1)

where φ is a Boolean expression following the syntax of bE. The parameters x1, ..., xk are bit-vector valued, but usually too large to exhaustively case split on; therefore, it is common practice to abstract their bitwidths to be unbounded, and perform verification for memories of arbitrary size.

Example 2 In our running example, we verify G Φ3.2, where

Φ3.2 ≜ ∀x. (addr = x) → ((cache.addr = addr ∧ cache.addr ≠ 0) → cache.data = mem[addr])    (3.2)

The problem tackled in this chapter, temporal safety verification for systems with large data structures, is formally defined as follows.

Definition 1 (Large Data Safety Verification) Given a model M formed as a composition of system S and its environment E, and a temporal safety property G Φ, determine whether or not M entails G Φ.

This problem is known to be undecidable in general, since a two-counter machine can be encoded in our formalism using applications of uninterpreted functions [109]. Hence, we can only devise a semi-decision procedure for the problem. In the next section, we describe such a procedure, based on abstraction.

3.3 Methodology

S2W is based on a combination of abstraction and bounded model checking (BMC). We tackle state-space explosion by abstracting away all but a small subset of the state of the system. We call this mostly abstracted system our "small world." The abstracted portion of the system can be considered as being updated with an arbitrary value ("∗") at each step of execution. All other parts of the system are modeled precisely. Thus, this abstraction is a form of localization abstraction [110], where the localization is to small, finite portions of large data structures. We check the safety property on the small world using BMC. To make BMC sound, we first find and prove the length of the diameter D of our small world to use as the bound — i.e., D is an integer such that every state reachable in D+1 steps is also reachable in D or fewer steps. Proving that a conjectured diameter D is correct is undecidable in our formalism [111]. The key to our approach is a set of heuristics that are effective in our chosen application domain of emulators, virtual machine monitors, and hypervisors. For our examples, the diameter of the mostly abstracted system is typically small; we therefore term this the "short world." If BMC runs for D steps and does not find a violation of the safety property in our small world, then the original model is safe. If BMC finds a counter-example, we cannot say whether the property holds for the original model: BMC can return a spurious counter-example. Choosing the small world well reduces the likelihood of finding spurious counter-examples.

To summarize, there are two crucial pieces to our approach: choosing the right small world and proving the length of the short world. We discuss both of these in more detail below. As an optimization, we prefix the above approach with an attempt to prove the safety property using one-step induction (on the original, non-abstract model, M). If that succeeds, there is no need to continue on to S2W's abstraction. This step can be generalized to perform k-step induction as needed.

For the presentation in this section, it is convenient to represent the system under verification S as a transition system (I, V, R, Init), where the elements of the tuple have the same meanings as in Section 3.2. The environment E sets the values of the input variables in I at each step; in all our case studies, the inputs from E are completely unconstrained. Verification (using induction or BMC) is performed on the composition of S and E.
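Schematically, the whole procedure can be summarized as the following driver (illustrative Python; the four callables are placeholders for the induction, abstraction, diameter, and BMC steps detailed below):

    def s2w(prove_induction, build_small_world, prove_short_world, bmc, M, prop):
        """Semi-decision procedure: 'valid' answers are sound; counter-examples
        from the abstract model may be spurious."""
        if prove_induction(M, prop):           # optimization: one-step induction on M
            return ('valid', None)
        small = build_small_world(M, prop)     # abstract all state outside U
        k = prove_short_world(small)           # bound the reachability diameter
        cex = bmc(small, prop, k)              # sound BMC up to that bound
        if cex is None:
            return ('valid', None)
        return ('unknown', cex)                # possibly spurious counter-example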

Induction

First, S2W attempts to prove the safety property using simple one-step induction on the non-abstract model M. We check the validity of the following two formulas, as per standard practice:

InitM(VM) → Φ(VM)    (3.3)

Φ(VM) ∧ RM(VM, V′M) → Φ(V′M)    (3.4)

If both checks pass, the verification is complete: we report "Property valid" and exit. If check (3.3) fails, the property is invalid; we report "Property invalid in initial state" and exit. If check (3.4) fails, we continue with S2W to find the small world.
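At the solver level, check (3.4) amounts to asking an SMT solver for a state satisfying Φ whose successor violates Φ. The sketch below encodes that query for the running example of Section 3.1 in the Z3 Python API (our encoding, not UCLID's; note that the input mentioned by the property and the input consumed by the transition are independent symbolic constants, mirroring the property's quantification over x):

    from z3 import *

    mem = Array('mem', BitVecSort(32), BitVecSort(1))
    caddr, cdata = BitVec('caddr', 32), BitVec('cdata', 1)
    x, addr, x1 = BitVecs('x addr x1', 32)  # property input, transition input, next input

    def phi(a, d, inp):   # instantiated property: a valid entry for inp agrees with mem
        return Implies(And(a == inp, a != 0), d == mem[inp])

    hit = And(caddr == addr, caddr != 0)
    caddr_n, cdata_n = addr, If(hit, cdata, mem[addr])   # cache after read(addr)

    s = Solver()
    s.add(phi(caddr, cdata, x))             # hypothesis of check (3.4)
    s.add(Not(phi(caddr_n, cdata_n, x1)))   # negated conclusion
    print(s.check())  # sat: the induction step fails, as the Example below explains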

Small World

The objective of this step is to identify a small portion of system state that we should model precisely during BMC. Everything else will be allowed to take on arbitrary values at each step of execution. It is important to note that the soundness of S2W does not depend on the choice we make for the small world; we could randomly select some portion of the state to model precisely, abstract everything else away, and if our three steps complete and verify the property, the property would be true of the original, non-abstracted system. However, choosing the small world wisely ensures that the short world is indeed short, which allows BMC to complete in a reasonable amount of time. A well-chosen small world also reduces the number of spurious counter-examples returned by the BMC step.

We present here a heuristic for choosing the small world when dealing with systems involving large or unbounded data structures. In our case studies, the heuristic found a small world whose short world was reasonable in length and for which no spurious counter-examples were returned by the BMC.

To select those state variables to model precisely, S2W starts with the property G Φ, where Φ is of the form ∀x1, x2, ..., xn. φ(x1, x2, ..., xn). If we prove Φ by instantiating the quantifier with a completely arbitrary, symbolic parameter vector (a1, a2, ..., an), that suffices to prove the original property. Thus, starting with the symbolic vector (a1, a2, ..., an), we compute a dependence set U for the instantiated property φ(a1, a2, ..., an). U is a set of expressions involving state variables and the parameters a1, ..., an such that fixing the values of the expressions in this set fixes the value of the instantiated property. For a variable M modeling a memory, these expressions typically involve indexing into M at a finite number of (symbolic) addresses. For a Boolean or bit-vector variable, either the variable is in U or not. Typically, this set of expressions is derived syntactically by traversing the expression graph of the formula φ represented in terms of state and input variables (after performing certain simplifications).
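The syntactic traversal itself is straightforward. The following Z3 Python sketch (our own helper, not part of UCLID) collects the memory-indexing subterms of a property, which seed the dependence set U:

    from z3 import *

    def array_reads(phi):
        """Syntactically collect array-read subterms (e.g., mem[a]) from a formula."""
        deps, seen = [], set()
        def walk(e):
            if e.get_id() in seen:
                return
            seen.add(e.get_id())
            if is_select(e):           # an application M[idx]
                deps.append(e)
            for c in e.children():
                walk(c)
        walk(phi)
        return deps

    mem = Array('mem', BitVecSort(32), BitVecSort(1))
    a = BitVec('a', 32)
    cad, cd = BitVec('cache_addr', 32), BitVec('cache_data', 1)
    prop = Implies(And(cad == a, cad != 0), cd == mem[a])
    print(array_reads(prop))   # [mem[a]]; together with cache this yields U = {mem[a], cache}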

Example 3 In our running example, recall that the property is G Φ3.2, where

Φ3.2 ≜ ∀x. (addr = x) → ((cache.addr = addr ∧ cache.addr ≠ 0) → cache.data = mem[addr])

Φ3.2 has the form ∀x. φ(x). Instantiating x with a, a fresh symbolic constant, we can drop the quantifier and get φ(a). By propagating the equality addr = a, we see that the value of φ(a) is determined by the expressions mem[a] and cache. Thus, we use U = {mem[a], cache} as our dependence set.

Once we have computed U, using the above heuristic or some other method, we can define our small world. Recall that S is represented as a symbolic transition system (I, V, R, Init). Let R̂ be a transition relation that differs from R by setting all state variables not in U to a non-deterministic value and leaving all others unchanged. Abusing notation slightly to use U wherever we use V, this means that R̂(U, I, U′) = R(U, I, U′), and R̂(W, I, W′) = true for W = V \ U. Similarly, Înit(U) = Init(U) and Înit(W) = true. Then the abstracted small world is Ŝ = (I, V, R̂, Înit). Ŝ is an overapproximate, abstract version of S that precisely tracks only the state in U, and allows all other variables to change arbitrarily at each step of execution. Thus, the composition of Ŝ and E is an overapproximate model M̂. It is important to note that if M was infinite-state, M̂ remains so. The next step is proving a short world for Ŝ and using BMC on M̂ to verify the property.

Short World

The objective of this phase is to determine a bound on the diameter D of the abstract model M̂. For this section, we will assume that E is stateless, as is the case for all of our case studies; the approach extends in a straightforward manner to the general case. Thus, the diameter of M̂ is the same as that of Ŝ. Suppose we believe the diameter to be D ≤ k. To verify this bound, we check the validity of the following logical formula:

∀V0, V1, ..., Vk+1, I1, I2, ..., Ik+1.
    ( Init(V0) ∧ ⋀_{i=0}^{k} R̂(Vi, Ii+1, Vi+1) ) →
    ∃V′0, V′1, ..., V′k, I′1, I′2, ..., I′k.
        ( Init(V′0) ∧ ⋀_{i=0}^{k−1} R̂(V′i, I′i+1, V′i+1) ∧ ⋁_{i=0}^{k} (Vk+1 = V′i) )    (3.5)

Since R̂ modifies state expressions outside U arbitrarily on each step, we can replace V everywhere in the above formula with U and obtain the actual convergence criterion that must be checked. Nevertheless, checking the convergence criterion is undecidable for the class of systems we are interested in, due to the presence of uninterpreted functions, memories, and parameters with unbounded bitwidth [111]. The quantified formula in (3.5) is also hard to solve in practice. Therefore, quantifier-instantiation heuristics must be devised to perform the convergence check. In this section, we present two such heuristics that have worked well for the range of case studies considered in this chapter.

The Sub-Sequence Heuristic

The first heuristic checks that, for any state reachable in k + 1 steps using k + 1 symbolic inputs to Ŝ, one can also reach that state using some sub-sequence of length ≤ k of those k + 1 symbolic inputs. We can express the sub-sequence heuristic as performing a particular instantiation of the existential quantifiers in criterion (3.5), and checking the validity of the formula that results:

∀U0, U1, ..., Uk+1, I1, I2, ..., Ik+1, U′0, U′1, ..., U′k.
    ( Init(U0) ∧ (U′0 = U0) ∧ ⋀_{i=0}^{k} R̂(Ui, Ii+1, Ui+1) ) →
    ⋁_{(I′1,...,I′k) ≺ (I1,...,Ik+1)} ( ⋀_{i=0}^{k−1} R̂(U′i, I′i+1, U′i+1) ∧ ⋁_{i=0}^{k} (Uk+1 = U′i) )    (3.6)

Here the symbol ≺ denotes that (I′1, ..., I′k) is a sub-sequence of (I1, ..., Ik+1). The intuition for the sub-sequence heuristic is that in many systems with large arrays and tables, locations in those tables are updated destructively based on the current input, meaning that past updates do not matter. The address translation logic in the emulators we have studied has this nature. Thus, for such systems, it is possible to drop from the input sequence inputs that have no effect on the (k + 1)-st step. Observe that the quantifier alternation in criterion (3.5) has been eliminated in the stronger criterion (3.6). Thus, we can simply perform a validity check of an SMT formula in the combination of theories required by our model. If the sub-sequence criterion (3.6) holds, then so does (3.5). However, it is possible that criterion (3.6) is too strong, even when a short diameter exists. This scenario necessitates an alternative semi-automatic approach, described next.

The Gadget Heuristic

The gadget heuristic is an approach to instantiating the existentially quantified variables in criterion (3.5) that is particularly useful for systems in which some state in U depends on the past history of state updates in a non-trivial manner. A gadget is a small sequence of state transitions manually constructed to generate some subset of all reachable system state. A universal gadget set is a set of such sequences that, in concert, can generate any reachable system state.¹ The length k of the longest gadget in the universal gadget set is then an upper bound on the diameter of the system.

In terms of the formula expressed as criterion (3.5), a gadget is a particular guess for a set of initial states V′0 (expressed symbolically) and a sequence I′1, I′2, ..., I′l (for l ≤ k) of symbolic input expressions to use. For a finite number of gadgets, the inner existential quantifier in criterion (3.5) can be replaced with a disjunction over all the formulas obtained by substituting the gadget expressions for (V′0, I′1, I′2, ..., I′l). If this instantiated formula is valid, then so is the original formula (3.5). We defer further discussion of gadget construction to the example at the end of this section, where we demonstrate its use on our running example.

Performing BMC

Once we have proven that k is an upper bound on the diameter of Ŝ, we run BMC on Ŝ for k steps. If φ(a1, ..., an) holds at each step of the simulation, then it follows that Ŝ satisfies G Φ. Because Ŝ is an overapproximation of all states reachable by S, it follows that S satisfies G Φ. If BMC fails, we return a "short" counter-example, no longer than k steps. If this is a valid counter-example, the property does not hold. If it is a spurious counter-example, we can return to step two of S2W and expand our set U to include more state variables and inputs. Such a strategy would be an instance of counter-example-guided abstraction refinement.
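Putting the pieces together for the running example, the following self-contained Z3 Python sketch performs the BMC run on the small world, havocking every memory entry other than mem[a] by introducing a fresh array at each step; it assumes the diameter bound k = 2 established for this example at the end of the section (names and encoding are ours):

    from z3 import *

    a   = BitVec('a', 32)     # fresh symbolic constant instantiating the quantifier
    m_a = BitVec('m_a', 1)    # the one tracked memory cell, mem[a] (memory is read-only)

    s = Solver()
    caddr = BitVecVal(0, 32)  # Init: cache holds the invalid address 0 ...
    cdata = BitVec('d0', 1)   # ... with arbitrary data

    K = 2                     # proven diameter of the small world
    results = []
    for t in range(K + 1):
        # Havoc: a fresh memory array per step, pinned to m_a only at index a.
        mem_t = Array('mem_%d' % t, BitVecSort(32), BitVecSort(1))
        s.add(mem_t[a] == m_a)
        # Can the instantiated property phi(a) be violated in the current state?
        s.push()
        s.add(And(caddr == a, caddr != 0, cdata != m_a))
        results.append(s.check())
        s.pop()
        # One read step with an arbitrary symbolic input.
        addr = BitVec('addr_%d' % t, 32)
        hit = And(caddr == addr, caddr != 0)
        caddr, cdata = addr, If(hit, cdata, mem_t[addr])

    print(results)            # expected: [unsat, unsat, unsat] -- safe within the diameter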

Restricted State Spaces

In some systems, we are interested in proving a safety property over a restricted state space, where the restriction can be captured by a predicate over state variables. The restriction predicate is often specified as an antecedent in the temporal safety property. Examples of such a restriction can be found in Section 3.4. In such cases, we note that it is enough

to compute a bound on the reachability diameter — the short world bound — under that restriction. It is also sufficient to perform model checking under the restriction.

¹Our gadgets are inspired by "state-generation gadgets," used for automated testing of CPU emulators from arbitrary but reachable initial states [40], and by the gadgets identified for return-oriented programming, used to produce a Turing-complete command set for malicious exploits [112].

Example

To illustrate the above approach, we apply it to our example. In step one we attempt to prove property G Φ3.2 by induction. For this, we perform the following two checks:

Init(mem, cache) → Φ3.2    (3.7)

Φ3.2(V) ∧ RT(V, V′) → Φ3.2(V′)    (3.8)

Check (3.7) passes, since the cache is initially empty. However, the induction step (check (3.8)) does not pass. Starting from a state in which Φ3.2 holds, it is possible to transition to a state in which Φ3.2 is violated. To see why this is so, consider the following state for the cache and two particular entries of mem:

mem[i] := a, mem[j] := b, cache.addr := i, cache.data := z

where z ≠ a, the last read was for address j, and the output was b. This state is not reachable in our model, but one-step induction does not take this into account. Note that Φ3.2 holds in this state: for every x ≠ j, the antecedent (addr = x) of the property is false and therefore the property is true; when x = j, the nested antecedent (cache.addr = addr ∧ cache.addr ≠ 0) is false and therefore the property is true. In this state, a read(i) command will hit in the cache and the output will be z, making the property evaluate to false in the next state.

Since simple induction failed for our toy example, we move to the next step, identifying the small world ŜT. As described in Section 3.3, we introduce a fresh symbolic constant a for x, removing the ∀x quantifier from the property. We then select U syntactically from the property to be the set of expressions U = {mem[a], cache}. In ŜT the variables in U are updated according to the original model (ST). All other state variables (all entries of mem other than mem[a]) are made fully abstract: they are allowed to update to non-deterministic values on every step. The same symbolic constant is used throughout the following short world checks.

The last step of our verification is to identify a short world and then run BMC on the abstract model for the length of the short world. We describe the gadget heuristic here; the sub-sequence heuristic would also work, although it finds a slightly looser bound on the length of the diameter. To build the gadgets we enumerate the possible end-state valuations for the system's state variables (cache, mem[a]) and, for each, determine how to get there from a possible starting state. Notice that we only need to consider mem[a] and not all of mem: in our small world ŜT, all entries of mem other than mem[a] receive new arbitrary values at the end of each step, so we know they can hold any possible value at every step of any trace. In theory there are 2³⁴ end-states: one for each possible value of cache.addr times the two possible values each of cache.data and mem[a]. However, for our property, we do not really care about the precise valuation of cache.addr; rather, we care about whether cache.addr = a and whether cache.addr = 0. So we can abstract away the details of cache.addr and consider the following 16 ending states:

{cache.addr = a, cache.addr ≠ a} × {cache.addr = 0, cache.addr ≠ 0} × {cache.data = 0, cache.data = 1} × {mem[a] = 0, mem[a] = 1}

Not all of the above 16 states are reachable, and in the end four gadgets are enough to reach all reachable states. Each gadget uses either one or two read commands. We build the gadgets with the appropriate values for addr, show that they form a universal gadget set, and conclude that the short world has length two. We then perform BMC and verify that the property holds.

3.4 Evaluation

We have evaluated S2W on six case studies and describe them here: the TLB of the Bochs x86 emulator, a set-associative cache, shadow paging in a hypervisor, hypervisor integrity for SecVisor [103], the Chinese Wall access-control policy in sHype [103], and separation in ShadowVisor [102]. We describe the first three in detail; the last three were verified using one-step induction and we describe them only briefly. The code for all of our models, along with their verification, is available online.² All experiments were performed using UCLID [113] with the Plingeling SAT solver [114] backend on a machine with 8 Intel Xeon cores and 4 GB RAM.

Bochs’ TLB

Bochs [54] is an open source x86 emulator written in C++ that can emulate a CPU, BIOS, and I/O peripherals. It can be used to host virtual machines, sandbox operating systems, and emulate the execution of system-level or user-level x86 code. The code base is large, and previous research has shown that manual analysis and testing, while useful, are not enough to guarantee the system's correctness [40, 41].

Bochs emulates virtual memory using paging, which includes logic to translate a virtual address (VPN) to a physical address (PPN). Figure 3.3 illustrates the steps of a page walk. The input virtual address vaddr is partitioned into 3 sets of bits (vaddr_dir, vaddr_table, vaddr_offset).

²http://uclid.eecs.berkeley.edu/s2w/


Figure 3.3: A page table walk.

On the left, we show a two-level page walk translating VPN to PPN addresses. The TLB caches VPN to PPN translations along with read/write/execute/global permission bits.

First, the vaddr_dir bits index a page directory entry (PDE) within the page directory region. The PDE contents, along with the vaddr_table bits, index into the page tables to retrieve a page table entry (PTE). The PTE contents identify a 4 KB physical page, and when concatenated with the 12-bit vaddr_offset, index a particular byte within this page. Since the above page walk includes two memory lookups, most x86 processors implement a TLB to cache VPN-to-PPN translations. The TLB also caches permission bits (r/w/x/g) checked during memory accesses. With this optimization, Bochs' address translation logic first checks its TLB for an entry describing the wanted VPN. If no such entry exists, Bochs performs a page walk to compute the corresponding PPN, and then stores that translation in its TLB for future access. We would like to prove that the optimized paging unit (with the TLB) is functionally equivalent to the original paging unit (without the TLB).
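The walk itself is mechanical; a simplified Python sketch is below (our rendering, not Bochs code: the 10/10/12 split of the virtual address follows non-PAE x86 paging, but entry formats are reduced to a present bit plus a frame address, and permission checks are omitted):

    PAGE_FAULT = object()   # sentinel for a failed walk

    def page_walk(vaddr, mem, cr3):
        vdir    = (vaddr >> 22) & 0x3ff    # bits 31..22 index the page directory
        vtable  = (vaddr >> 12) & 0x3ff    # bits 21..12 index the page table
        voffset =  vaddr        & 0xfff    # bits 11..0 select a byte in the 4 KB page

        pde = mem[(cr3 & ~0xfff) | (vdir << 2)]     # page directory entry (4 bytes each)
        if not (pde & 1):                           # present bit clear -> page fault
            return PAGE_FAULT
        pte = mem[(pde & ~0xfff) | (vtable << 2)]   # page table entry
        if not (pte & 1):
            return PAGE_FAULT
        return (pte & ~0xfff) | voffset             # PPN ◦ offset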

The Bochs TLB + page table system is modeled as a tuple SBochs = (I, O, V, Init, A) where

• I = {vaddr, data, pl, rwx, command}. vaddr is the virtual address to translate. data is used to update the page table memory. pl indicates the CPU's current privilege level (either user or supervisor mode). rwx indicates whether this memory access writes and/or executes this address.

• O = {paddr_TLB, pagefault_TLB, paddr_noTLB, pagefault_noTLB}. paddr_TLB is the result of address translation with the TLB. paddr_noTLB is the result of address translation without the TLB. pagefault_TLB indicates a page fault occurred during translation with the TLB; the fault is caused by insufficient permissions. pagefault_noTLB indicates a page fault occurred during translation without the TLB.

• V = {mem, TLB, legal}. mem is a 32-bit addressable memory containing both the page directory and page tables. TLB is an array (2¹⁰ entries in Bochs) of structs, where each struct is 160 bits wide and has five 32-bit fields, including vpn, ppn, and access bits (ab). legal is a Boolean variable denoting whether the system reached the current state via a legal sequence of transitions.

Command      Modifies   Guard
write_pte    mem        true
write_pde    mem        true
translate    TLB        ¬present ∨ ¬permission
set_cr3      TLB        true
invlpg       TLB        TLB[vaddr_table].vpn[31:12] = (vaddr_dir ◦ vaddr_table)
invlpg_all   TLB        TLB[vaddr_table].vpn[31:22] = vaddr_dir

Table 3.1: The allowable operations in our model of the Bochs TLB.

• Init = (mem0, TLB0, true), where TLB0[i].vpn := 0xffffffff for all i and mem0 is an uninterpreted function from 32-bit addresses to arbitrary 32-bit values. Initializing the TLB with the vpn field set to 0xffffffff in all entries marks it as empty. legal is initialized to true.

• A: V evolves via the operations write_pde, write_pte, invlpg, invlpg_all, set_cr3, and translate; the environment non-deterministically chooses one of these operations at each step. Table 3.1 describes each of these commands.

Each command is implemented in a distinct function within Bochs (src/cpu/paging.cc). Since Bochs executes on a single thread, we can safely model each function as an atomic operation, i.e., a single step in the state transition system. The commands write_pde and write_pte are used to update the page directory and page tables respectively, typically to modify access permissions or page mappings. translate performs address translation and assigns the result to variables in O. Furthermore, if a page walk was deemed necessary, then translate updates a TLB entry with the results of that page walk. The set_cr3 command is used to switch to a new page table, typically during a context switch. If global pages are enabled, then set_cr3 flushes all non-global entries in the TLB; otherwise, all TLB entries are flushed on a set_cr3 command. The x86 instruction invlpg flushes a specific TLB entry containing the translation for vaddr; invlpg is needed to invalidate the TLB entry following a write to the page table. invlpg_all atomically flushes all TLB entries that have vaddr_dir in their vpn (bits 31 to 22); invlpg_all is needed to invalidate a set of TLB entries following a write to the page directory.

We check equivalence of both the physical address and whether a page fault occurred. Since the x86 manual only guarantees cache coherency when the TLB is flushed properly, we only require equivalence on traces where each write_pde is followed by an invlpg_all and each write_pte is followed by an invlpg. This constraint is enforced by legal, which is true only if the sequence of operations abides by these constraints. Any state that satisfies legal is guaranteed to be reachable from the initial state via a legal sequence of state transitions.

The property that we check is:

Φ3.9 ≜ ∀v, p, r. (vaddr = v ∧ pl = p ∧ rwx = r) → legal →
    ((pagefault_TLB ⇔ pagefault_noTLB) ∧ (¬pagefault_noTLB → (paddr_noTLB = paddr_TLB)))    (3.9)

Induction

The one-step induction check consists of proving (3.10) and (3.11) using the UCLID verifier.

Init(VBochs) → Φ3.9(VBochs)    (3.10)

Φ3.9(VBochs) ∧ R(VBochs, V′Bochs) → Φ3.9(V′Bochs)    (3.11)

Our initial state satisfies Φ3.9 because the TLB is initialized to be empty, thereby forcing both optimized and unoptimized designs to undergo the two-level page walk. However, one-step induction (check (3.11)) fails because the back-end SMT engine cannot solve the formula, which has a quantifier alternation. Consequently, we proceed on to the small and short world steps.

Small World

We syntactically derive the dependence set UΦ3.9 by traversing the expression graph of Φ3.9. After introducing a fresh 32-bit symbolic constant v = (v_dir, v_table, v_offset), the dependence set is

UΦ3.9 = {legal, TLB[v_table], mem[cr3[31:12] ◦ v_dir], mem[mem[cr3[31:12] ◦ v_dir][31:12] ◦ v_table]}

The last three expressions represent the TLB entry, page directory entry, and page table entry pointed to by v, respectively. (Here cr3[31:12] refers to the upper 20 bits of the cr3 control register and is modeled as a symbolic constant.) Our abstract model ŜBochs precisely tracks only the variables in UΦ3.9; other state elements get updated with arbitrary values at each step.

Short World

We use the sub-sequence heuristic to find an upper bound on the diameter of ŜBochs; we find a bound of 9 steps. Finally, we perform bounded model checking for 9 steps, proving that all reachable states of ŜBochs‖EBochs satisfy Φ3.9. The sub-sequence check and BMC took roughly 45 minutes and 25 minutes, respectively.

Content Addressable Memory

While the TLB functions as a direct-mapped cache (each concrete logical address is associated with a single TLB entry), S2W also applies to systems with set-associative caches and content addressable memories (CAMs). A CAM stores associations between keys and data. The key is typically stored as part of the data and is used for comparison during lookups. Figure 3.4 shows a system containing a slow memory and a CAM-based cache. We would like to prove that a lookup in slow memory yields the same data as a lookup in the CAM-based cache, provided the data is present in the CAM.

We model the CAM's state using a variable cam that maps a CAM index to its contents, a 65-bit vector containing fields present, key, and data. The cam[i].present bit indicates whether the key cam[i].key and data cam[i].data are valid entries. If cam[i].present is true, cam[i].key and cam[i].data contain the 32-bit key and 32-bit data stored at CAM index i. Memory is modeled as a variable mem mapping a 32-bit address to a 32-bit vector; that is, mem[a] refers to the 32-bit data at address a. We also maintain a state variable map that maps an address a to a 32-bit CAM index. map is updated when data is added to or deleted from the CAM.

A read(addr) command checks the contents of the CAM at index map[addr]. If cam[map[addr]].key = addr and cam[map[addr]].present is true, then SCAM assigns cam[map[addr]].data to the output variable out_camdata and true to the output variable out_campresent; otherwise, SCAM assigns false to out_campresent. read also assigns mem[addr] to the output variable out_memdata. The insert(addr, data) command checks map for an existing mapping of address addr. If map[addr] ≠ ⊥, then SCAM updates the CAM at index map[addr] with data data and key addr.¹ Otherwise, we arbitrarily choose a new location arb[addr] at which to insert data and addr, and update map[addr] with this new location. The set(addr, data) command updates mem[addr] with data, possibly making the CAM contents stale. The reset(addr) command resynchronizes cam with mem at address addr, provided the CAM contains a valid entry for address addr. These commands are implemented as atomic operations, and they update map, mem, and cam in parallel.
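A compact Python sketch of these commands is below (our rendering of the model's semantics, not the UCLID model itself; dictionaries stand in for the unbounded functions, and next_free stands in for the arbitrary choice arb[addr]):

    BOTTOM = None   # the unmapped marker, which the model encodes as 0

    class CamSystem:
        def __init__(self):
            self.mem = {}          # addr -> data (slow memory)
            self.cam = {}          # index -> (present, key, data)
            self.map = {}          # addr -> CAM index
            self.next_free = 1     # stand-in for the arbitrary choice arb[addr]

        def insert(self, addr, data):
            idx = self.map.get(addr, BOTTOM)
            if idx is BOTTOM:                  # no mapping yet: pick a fresh slot
                idx, self.next_free = self.next_free, self.next_free + 1
                self.map[addr] = idx
            self.cam[idx] = (True, addr, data)

        def set(self, addr, data):
            self.mem[addr] = data              # may make the CAM entry stale

        def reset(self, addr):
            idx = self.map.get(addr, BOTTOM)   # resynchronize cam with mem
            if idx is not BOTTOM and self.cam.get(idx, (False,))[0]:
                self.cam[idx] = (True, addr, self.mem[addr])

        def read(self, addr):
            idx = self.map.get(addr, BOTTOM)
            entry = self.cam.get(idx)
            mem_data = self.mem.get(addr)
            if entry and entry[0] and entry[1] == addr:
                return entry[2], True, mem_data  # out_camdata, out_campresent, out_memdata
            return None, False, mem_data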

The CAM + memory system is modeled as a tuple SCAM = (I, O, V, Init, A) where

•I = {addr, data, command}.

• O = {out_camdata, out_campresent, out_memdata}.

• V = {cam, mem, map, legal}. These are all modeled as bit-vector functions. cam returns a 65-bit vector: 1-bit present, 32-bit data, 32-bit addr. Both mem and map return a 32-bit vector. legal is a Boolean variable denoting whether the system legally reached the current state.

¹Note that ⊥ in our model equals 0x00000000; it is an acceptable design choice not to cache that address.


Figure 3.4: Our model of a slow memory and its CAM-based cache.

• Init = (c0, m0, map0, true). map0[a] = ⊥ for all a, the present field of each CAM entry is initialized to false, memory is initialized to arbitrary values, and legal is initialized to true.

• A: The state evolves via the commands insert, delete, set, reset, and read. The environment non-deterministically chooses one of these commands at each step. Figure 3.4 defines the state transition relation for each command.

The safety property Φ3.12 checks that the CAM and memory have the same data for all keys present in the CAM. Note that the CAM only guarantees cache coherency if it is resynchronized with memory after each set. The state variable legal enforces this constraint: it is true if every set is followed by a reset.

Φ3.12 ≜ ∀a. (addr = a) → legal → (out_campresent → (out_camdata = out_memdata))    (3.12)

Since Φ3.12 is expressed over output variables, we check whether a state s satisfies Φ3.12 by performing a read operation on state s with a fresh symbolic constant for a.

Induction

We first try one-step induction on this system to prove Φ3.12. The initial state satisfies Φ3.12 because both the CAM and map are empty. However, the inductive check fails because of the quantifier alternation, similar to the TLB case study. Hence, we continue on to the small world step of our approach.

Small World

We syntactically derive the dependence set by traversing the expression graph of Φ3.12. We introduce a fresh 32-bit symbolic constant a for the address that we precisely track in map and mem. We precisely track legal, map[a], mem[a], and cam[map[a]].

UΦ3.12 = {legal, cam[map[a]], mem[a], map[a]}    (3.13)

Our abstract model ŜCAM precisely tracks updates to only these variables.

Short World

Using the sub-sequence heuristic, we find an upper bound on the reachability diameter of 5 steps. Finally, we perform bounded model checking for 5 steps, proving that all reachable states of ŜCAM‖ECAM satisfy Φ3.12. The sub-sequence check and BMC take about 15 seconds and 5 seconds, respectively.

Shadow Paging

For our third case study, we model a shadow page table system. A hypervisor may use shadow page tables to assure address space separation between the guest and host. The guest page tables can be updated arbitrarily by the guest operating system, while the shadow page tables are updated only by the hypervisor. The hardware uses the shadow page tables for address translation. It is the hypervisor’s responsibility to make sure the shadow page tables stay synchronized with the guest tables, while at the same time ensuring no translation will ever allow the guest to access memory outside its allocated sandbox. We model the synchronization process and verify that the physical address returned by translation never exceeds some constant limit, LIMIT. Our shadow paging model (Figure 3.5) is as follows. There are two page table structures: guest and host. Each is a two-level structure: a page directory table (PDT) and a page table (PT). We refer to the guest and shadow page tables as gPDT, gPT and sPDT, sPT, respectively. Entries in the PDT have three fields: present (p), page-size-extension (pse), and address (addr). Entries in each nested PT have two fields: present (p) and address (addr).

Let SSP = (I, V, Init, A) be the shadow paging model with

• I = {i, j, command}. i and j index into the PDT and PT, respectively. command is one of page-fault, inval-page, new-context, or adversary.

• V = {gPDT, gPT, sPDT, sPT, LIMIT}. gPDT, gPT, sPDT, and sPT are modeled as functions that map indices to bit-vectors. gPDT and sPDT return 34-bit vectors (1-bit p, 1-bit pse, 32-bit addr). gPT and sPT return 33-bit vectors (1-bit p, 32-bit addr). LIMIT is a constant 32-bit vector.

Figure 3.5: An illustration of the shadow page table model.

• Init = (sPDT0, sPT0, gPDT0, gPT0). sPDT and sPT are both initialized with the p bit cleared in all entries. gPDT and gPT are initialized to arbitrary values.

• A. The four commands update state in the following way: page-fault synchronizes the shadow tables with the guest tables; inval-page conditionally invalidates (zeros out) entries in sPDT and sPT; new-context unconditionally invalidates entries in sPDT; adversary writes to gPDT and gPT. The assignments to gPDT, gPT, sPDT, and sPT are summarized in Table 3.2.

Command           Modifies          Guard
page-fault(i, j)  sPDT[i]           gPDT[i].pse ∧ gPDT[i].p ∧ (gPDT[i].addr < LIMIT)
                  sPDT[i]           gPDT[i].pse ∧ ¬(gPDT[i].p ∧ (gPDT[i].addr < LIMIT))
                  sPDT[i], sPT[j]   ¬gPDT[i].pse ∧ gPDT[i].p ∧ (gPDT[i].addr < LIMIT) ∧ gPT[j].p ∧ (gPT[j].addr < LIMIT)
                  sPDT[i]           ¬gPDT[i].pse ∧ gPDT[i].p ∧ (gPDT[i].addr < LIMIT) ∧ ¬(gPT[j].p ∧ (gPT[j].addr < LIMIT)) ∧ ¬(sPDT[i].p ∧ ¬sPDT[i].pse)
                  sPDT[i]           ¬gPDT[i].pse ∧ ¬(gPDT[i].p ∧ (gPDT[i].addr < LIMIT))
                  sPT[j]            ¬gPDT[i].pse ∧ gPDT[i].p ∧ (gPDT[i].addr < LIMIT) ∧ ¬(gPT[j].p ∧ (gPT[j].addr < LIMIT)) ∧ (sPDT[i].p ∧ ¬sPDT[i].pse)
inval-page(i, j)  sPDT[i]           (sPDT[i].p ∧ ¬gPDT[i].p) ∨ (sPDT[i].p ∧ gPDT[i].p ∧ (sPDT[i].pse ∨ gPDT[i].pse))
                  sPT[j]            sPDT[i].p ∧ gPDT[i].p ∧ ¬gPDT[i].pse ∧ ¬sPDT[i].pse
new-context       sPDT              true
adversary         gPDT, gPT         true

Table 3.2: Next-state assignments for the shadow paging model.

Our model is based on the ShadowVisor model [102], but has been extended to introduce pointers. The safety properties we verify are similar to those of ShadowVisor. We verify that a translation using the shadow page tables will never return an address above a fixed limit.

Φ3.14 = ∀i. (sPDT[i].p ∧ sPDT[i].pse) → sPDT[i].addr < LIMIT (3.14)

Φ3.15 = ∀i, j. (sPDT[i].p ∧ ¬sPDT[i].pse ∧ sPT[j].p) → sPT[j].addr < LIMIT (3.15)
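For concreteness, these instantiated properties can be written down directly as SMT formulas. The sketch below uses the Z3 Python API with a single entry datatype shared by both table sorts, and fresh constants a_i, a_j instantiating the quantifiers — the same move S2W uses to pick the tracked entries (our own encoding; the dissertation's models are written in UCLID):

    from z3 import *

    Entry = Datatype('Entry')                 # a PDT/PT entry: present, pse, addr
    Entry.declare('mk', ('p', BoolSort()), ('pse', BoolSort()), ('addr', BitVecSort(32)))
    Entry = Entry.create()

    sPDT = Array('sPDT', BitVecSort(32), Entry)
    sPT  = Array('sPT',  BitVecSort(32), Entry)
    LIMIT = BitVec('LIMIT', 32)
    a_i, a_j = BitVecs('a_i a_j', 32)         # fresh symbolic indices

    phi_14 = Implies(And(Entry.p(sPDT[a_i]), Entry.pse(sPDT[a_i])),
                     ULT(Entry.addr(sPDT[a_i]), LIMIT))
    phi_15 = Implies(And(Entry.p(sPDT[a_i]), Not(Entry.pse(sPDT[a_i])),
                         Entry.p(sPT[a_j])),
                     ULT(Entry.addr(sPT[a_j]), LIMIT))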

Induction

The property G Φ3.14 is proven by induction. However, one-step induction fails to prove G Φ3.15. If sPT[j] is marked as present but has an address greater than LIMIT, Φ3.15 is still true as long as sPDT[i] is marked not present. From that state, it is possible to update sPDT[i] to present without updating sPT[j], so that in the next state the sPDT[i] and sPT[j] entries are both marked present and the address in the sPT[j] entry is greater than LIMIT, violating Φ3.15. Therefore, we move on to the small and short world steps for G Φ3.15.

Small World

The properties are concerned with a single entry i in sPDT and a single entry j in sPT. Therefore we use a fresh symbolic constant to choose an arbitrary entry from each. In particular, our dependence set is Uφ3.15 = {sPDT[a_i], sPT[a_j]}. In our abstract symbolic transition system, ŜSP, we track precisely only the state in Uφ3.15; a_i and a_j stay constant.

Short World

The sub-sequence short-world heuristic does not work for page tables, because page table entry updates can depend on previous writes to the entry or to other entries; it is not always possible to drop one step of a trace and achieve an equivalent final state. Instead, we use gadgets to find and prove the short world. We manually construct a universal gadget set to prove the length of the diameter of ŜSP. To build the gadgets we case split on the possible end-state valuations for the variables in

Uφ3.15, and for each, determine how to get there from a valid starting state. The model has only four commands, and it was usually obvious which commands were needed to reach a particular state. The gadgets must also specify the parameters (i, j) to the command and, in the case of the adversary command, the value that gets written to the guest tables. Figuring out the correct parameters to use for each command was more difficult. In this case, the parameters were always either i := a_i or i := a′_i (where a′_i ≠ a_i is arbitrarily chosen), and similarly for j. The adversary data that gets written to the guest tables was a combination of the addr field of the (symbolic) end-state valuation we were trying to achieve and particular values for the p and pse bits, which we chose according to the particular gadget we were building.

We needed a total of thirteen gadgets, each of four commands or fewer, to prove the short world has length four. We then ran BMC on ŜSP for four steps and verified the property held at each step. The verification of the short world took less than a minute; BMC took approximately five seconds.

Other Hypervisor Models

We applied our abstraction technique to three additional hypervisor models. All were verified using one-step induction, each within 5 seconds.

SecVisor

SecVisor [29, 103] is a small hypervisor that supports a single guest OS. It virtualizes the memory management unit by implementing shadow page tables and synchronizing them with the guest page tables. The model assumes an adversary can write arbitrary values to the guest page tables. SecVisor aims to execute only approved code in kernel mode; therefore, the page-table synchronization must prevent adversary-provided code from having execute permissions while in kernel mode. We verified the security property using one-step induction on a model given by Franklin et al. [103].

sHype

sHype [103] is an access control system used by the Xen [51] hypervisor. Based on the Chinese Wall policy, it establishes "conflict of interest" classes and guarantees each virtual machine will never access two pieces of data from the same conflict of interest class. We verified the security property using a model presented by Franklin et al. [103].

ShadowVisor

ShadowVisor [102] served as the starting point for our shadow page table model. It models the page tables of a simple hypervisor that assures address space separation between guest and host by maintaining separate guest and shadow page tables. Like our shadow page table model, ShadowVisor guarantees that if an address is marked as present, it will never exceed a certain fixed limit. We model ShadowVisor in our modeling language and use one-step induction to verify the property.

3.5 Related Work

Verification of infinite-state or parametrized systems has been well studied; here we present the most closely related work.

Franklin et al. [102, 103] present a small-model approach to verifying systems with parametrized data structures, specifically arrays. In essence, they present a formal language such that if the system can be modeled in their language, then a small-model theorem applies, stating that the unbounded arrays can be reduced to arrays with one designated element alone. Finite-state model checking can then be employed on the resulting system. While this approach is elegant, it makes a trade-off in expressiveness. Our modeling language is more expressive, allowing us to model the Bochs TLB, CAM, and shadow paging examples, which cannot be modeled in their language. And our approach is simply different: we compute an abstraction based on localization within the large data structures, and we use bounded model checking.

The use of inductive invariant checking is common in this problem domain. The method of invisible invariants [115, 116] is an example of an inductive verification technique applied to systems of N identical finite-state processes. The core idea in this method is to generalize from the reachable states of a small number of processes to a quantified inductive assertion of the form ∀i. φ(i), where the index i ranges over process IDs. Namjoshi [104] also discusses the so-called cutoff method, which is a small-model approach for such systems of parametrized processes, drawing connections between inductive methods, small-model approaches, and compositional reasoning. In general, these approaches are not easily applied to our examples, since our examples are not naturally decomposed into systems of N identical finite-state processes. Instead, in our problem domain, the number of interacting processes is usually finite and small, but the shared data structures are large and complicated.

Abstraction-based approaches have also been presented for infinite-state or parametrized systems similar to those studied in this work. Lahiri and Bryant [105] presented an approach for verifying universally quantified invariants on parametrized systems using predicate abstraction. While predicate abstraction can be quite effective for verifying control-related properties, especially when one can guess suitable predicates, it is not suitable for verifying equivalence of two code versions (such as in the Bochs TLB case study), which is a highly data-dependent property. McMillan [117] presents a semi-automatic approach to compositional reasoning using an abstraction similar to ours. Given a system with large arrays, he uses a form of localization abstraction, guided by case splitting performed by the user, to model only a few entries in the arrays precisely; all other entries are allowed to be updated with an arbitrary value ⊥. This abstraction is used to compute a finite-state abstract model on which reachability analysis is performed. In contrast, our method computes an abstraction based on index terms derived from the property, and uses a BMC-based approach to verify the system. Bjesse [106] describes an automatic approach for verifying sequential circuits with large memories, provided the memories are "remodelable" as formally defined by Bjesse. The work takes a small-world approach, transforming an initial netlist into another whose memories receive precise updates at only a small number of entries, and uses counter-example-guided abstraction refinement. Our work does not require the remodelable restriction. German [107] presents a novel approach for constructing sound and complete abstractions for similar systems. The approach is based on performing static analysis on the system model and, in contrast to Bjesse's work, can handle unbounded delays between the time an array is read and the time the read value propagates to the output. While the approach is completely automatic, it cannot handle certain data structures, such as CAMs, in which a read, in principle, requires scanning the entire array. Our approach, while not always automatic, does handle structures such as CAMs (see Section 3.4).

Efficient memory modeling is a technique used in bounded verification, whether by symbolic simulation (e.g., [118]) or bounded model checking (e.g., [119]). In contrast, our approach focuses on unbounded verification, based on abstraction and a sound application of BMC using heuristics to find the reachability diameter.

3.6 Conclusion

In this chapter, we presented S2W, a new approach to verifying systems with large or unbounded data structures that combines induction and abstraction-based model checking. We presented experimental results on several examples of emulators and hypervisors and demonstrated that this technique can be used to handle the large data structures present in virtualization software. In the next chapter we investigate how to validate the types of models necessary in this, or any, model checking-based approach.

Chapter 4

Model Validation

Modeling is the crucial first step in formal verification techniques based on model checking. A model may be constructed manually from source code or extracted automatically by a tool. In either case, the inherent weakness of this approach is that successful verification means the property has been proven true of the model, but not necessarily of the original system. This weakness is fairly clear for manually constructed models, but it is important to note that it also holds for tools operating directly on source code. Such software model checkers construct and internally maintain a model of the code, on which the analysis is actually performed. Moreover, in order to scale to large code bases, such tools often require human guidance in the form of code annotations or modifications. Thus, in these approaches too, one implicitly trusts that the model is correct: if verification of the model succeeds, one assumes the verification would hold for the original system as well.

In this work we present a framework for validating that assumption. We formalize the process of validating a model against the original source-code implementation of the system, to prove that a property proven true of the model is also true of the system. We concentrate solely on validating models that will be used for the verification of safety properties.

Our framework is targeted towards systems implemented in imperative languages such as C/C++. A special focus is on systems software that exports several "logical operations" that client programs can use (or misuse). Examples include a virtual machine monitor that provides memory isolation and other services to operating systems hosted on it, or an FTP server implementing several commands. Such systems are best understood as being logically concurrent, even if they are single-threaded. Each logical operation (or set of operations) has an associated collection of data that it operates upon. Therefore, the corresponding models are usually best specified as transition systems with several logical operations that can be interleaved in a specified manner to operate on their associated data structures based on client inputs. The complexity of the associated data structures often necessitates the use of abstraction using first-order logic with suitable background theories.

In order to be sound, in the general case, model validation must show that all states reachable in the original system are also reachable in the model. However, often a large system will have many states that are equivalent with respect to the property to be verified. When building a model, modelers often take advantage of this fact and do not bother to model irrelevant portions of the system. For example, the Bochs CPU emulator [54] is a large system comprising over 400 C/C++ files, in addition to several files in other languages, and requires more than 200 KLOC.¹ Building a model of the entire system would be a daunting, and possibly unnecessary, task. A property about the correctness of address translation, for example, might require only a small handful of the modules to be included in the model. Modules that manage peripheral devices or handle the instruction fetch and decode functions of the CPU might safely be left out of the model.
The first phase of model validation, therefore, is to show that in constructing the model, the modelers did the pruning correctly: there is no portion of the system that is relevant to the property under consideration but that was left out of the model. We do that by showing, for all variables relevant to the property, that there is no program code that writes the variable but that was pruned out during modeling.

The second phase of model validation is to show that the model is an overapproximate representation, with respect to the property under consideration, of the corresponding portion of the original system. We do that by showing that the model simulates this portion of the code. The simulation check is performed by checking that each logical operation in the code is overapproximated by a corresponding logical operation in the model. This check is formulated as checking the validity of a logical formula and is discharged using an SMT solver [120].

We propose the following work flow (see Figure 4.1):

1. System Pruning: Before the model is built, the modelers first prune out irrelevant code. Given a system S and a property Φ, the relevant portions of the code, FP, are identified. This is typically a manual step that requires some domain expertise.

2. Data-Centric Model Validation (DMV): In the first phase of model validation, the pruning done in the previous step is validated. If DMV finds that FP is missing some relevant code, the process returns to Step 1.

3. Model Construction: Given FP, modelers build a formal model M of the pruned system.

4. Operation-Centric Model Validation (OMV): In the second phase of model validation, M is validated as a safe abstraction of FP with respect to Φ. If OMV finds any discrepancies, the process returns to the previous step.

5. Verification: Once model validation is done, model checking-based verification can be completed.

In this work we focus on Steps 2 and 4 of the work flow; we assume the pruning and modeling are done manually, and that any one of a number of standard model checking techniques can be used in Step 5. In our theoretical framework, the DMV and OMV steps are sound. However, our current implementation does not share this property. We use

¹Calculated for Bochs 2.6.2 using SLOCCount (http://www.dwheeler.com/sloccount/).

Figure 4.1: Workflow for model checking-based verification.

Model validation comprises the DMV and OMV processes, which are shown shaded in the figure.

symbolic execution [121] and SMT solving [120] to implement the OMV and DMV steps. The former is used to explore all possible feasible paths through the code, and the latter is used to check the validity of queries generated in the DMV and OMV steps. Due to limitations of current symbolic execution tools, the DMV step will, in some cases, fail to catch portions of S that were omitted from FP. We rely on the domain knowledge and expertise of the modelers to identify the relevant portions of the system to model, and use the DMV step to help find any missed code. While our implementation does not guarantee that all relevant code will be identified during this step, it does give the modelers increased confidence that no relevant portions were missed. Our implementation currently uses KLEE [122] for symbolic execution, the UCLID [108] language for building our models, and the UCLID decision procedure for proving the SMT queries that check validation. The methodology can be made to work with other tools as well.

4.1 Running Example

We present here an example snippet of code and the corresponding model, which we will use in subsequent sections as we describe our algorithm. The code snippet, shown in Figure 4.2, is based on code taken from the Bochs CPU emulator [54]. It is a simple if-then-else statement that tests, for memory pages marked as supervisor-only, whether the CPU has a current privilege level (CPL) of 3. If it does, a page fault will be raised (the page-fault flag is set to 1), indicating a protection check has failed for this supervisor-only page. Otherwise, the page-fault flag is set to 0. After a successful check for a page fault (not shown), the access and dirty bits of the page table entry are updated. In this example curr_privilege_level and page_fault are both integer variables. The modeler asserts that the update_access_dirty function is irrelevant as far as the CPL

Code:

    if (curr_privilege_level == 3)
        page_fault = 1;
    else
        page_fault = 0;
    ...
    update_access_dirty(...);

Model:

    page_fault := (cpl[0] & cpl[1]);

Figure 4.2: Running example.

A snippet of code taken from the Bochs CPU emulator and its corresponding hand-written model. In the model, curr_privilege_level is abbreviated to cpl.

The corresponding model is also shown in Figure 4.2. In the model, if bits 0 and 1 of the variable storing the current privilege level are both set, the page-fault flag is set to 1; otherwise, the page-fault flag is set to 0. In the model, cpl is modeled as a 32-bit bit-vector, and page_fault is modeled as a 1-bit bit-vector.

4.2 Theoretical Formulation and Approach

Notation and Background

We introduce here some terminology and background material used in the rest of this chapter.

Program Throughout this chapter, we use “system” or S to refer to the original software system to be verified; Φ is the property to be proven or disproven during verification.

The system S is represented as a tuple S = (I_S, V_S, Init_S, C_S), where

• I_S is a finite set of input variables;
• V_S is a finite set of state variables;
• Init_S is a predicate characterizing the set of initial valuations to variables in I_S and V_S;
• C_S is the code for the program, written in an imperative programming language such as C or C++, and describing how the system variables are updated.

The input variables, I_S, are the read-only variables of the system. The state variables, V_S, are all other variables of the program, including any global variables and return variables.

We make no assumptions about C_S except the availability of a function σ that maps C_S to a transition relation between pairs of system states, given a notion of a step. In other words, σ gives relational semantics to S.

We use the term “annotated system” or A to mean a representation of S that highlights particular code fragments as being relevant to the verification task at hand. More formally, we define A as a tuple A = (I_A, V_A, Init_A, F_A), where

• I_A is a finite set of input variables: I_A = I_S;
• V_A is a finite set of state variables: V_A = V_S;
• Init_A is a predicate characterizing the set of initial valuations to variables in I_A and V_A;
• F_A = {f_1, f_2, ..., f_N, f_misc, f_orc} is a finite set of code fragments in C_S, where f_1, f_2, ..., f_N are code fragments that capture the relevant parts of C_S; f_misc, which is disjoint from f_1, f_2, ..., f_N, represents the code in C_S deemed irrelevant; and f_orc is the code that orchestrates how program execution interleaves between the code fragments f_1, f_2, ..., f_N, f_misc.

For the rest of this discussion, we assume the code fragments f_1, f_2, ..., f_N, f_misc to be terminating.² We do not make this assumption about the orchestration code f_orc since in general, for reactive systems like an operating system or an emulator, f_orc is designed to run forever. Often, the code fragment f_i (1 ≤ i ≤ N) is a single C/C++ function in C_S, but in general we only require that each code fragment have clearly specified entry and exit points. We also require that each of f_1, f_2, ..., f_N executes atomically.

The code f_orc dictates how the code fragments f_1, f_2, ..., f_N, f_misc are composed together. For example, f_orc might simply be the sequential composition of two code fragments. A more complicated orchestrator f_orc might iterate over a “while(1)” loop, repeatedly selecting a code fragment to execute based, for example, on a scheduling policy or on a sequence of external inputs. An example of the latter is a CPU emulator that repeatedly emulates instructions within a loop, where each instruction type has an associated function that emulates it. The form of the orchestrator depends heavily on the type of system under verification.

The function σ applies to f_1, f_2, ..., f_N, f_misc in exactly the same way as it applies to C_S: it characterizes the code fragments in terms of their underlying transition relation, given the notion of a step. In this work, since we assume that f_1, f_2, ..., f_N, f_misc are terminating, the notion of a step will be taken to be a single execution of a code fragment. Thus, σ(f_i) is the set of (pre, post) state pairs of f_i. We will often abbreviate σ(f_i) by δ_i.

From A, we can create a program P by dropping f_misc from the set F_A and treating any invocation of f_misc from f_orc as a no-op (stuttering step). We define P as a tuple analogously to A: P = (I_P, V_P, Init_P, F_P), where

• I_P is a finite set of input variables: I_P ⊆ I_A ∪ V_A;
• V_P is a finite set of state variables: V_P ⊆ V_A;
• Init_P is a predicate characterizing the set of initial valuations to variables in I_P and V_P;
• F_P = {f_1, f_2, ..., f_N, f_orc} is a finite set of code fragments. The f_i and f_orc are defined as in A.

² Note that it may be possible to lift the assumption about f_misc, the code deemed irrelevant to the verification task, provided that the technique for checking this irrelevance can handle non-terminating code. However, we retain this assumption in the present discussion.

The transition relation of the overall program P is determined from those of its code fragments by the form of composition, i.e., by f_orc; we will denote it as δ^P. As an example, consider the program in Figure 4.2. This program is modeled as the tuple P = (I_P, V_P, Init_P, F_P), where I_P = {curr_privilege_level}, V_P = {page_fault}, Init_P = true (allowing arbitrary values for page_fault and curr_privilege_level), and F_P = {f_1, f_orc}, where f_1 includes all the code except the function update_access_dirty and f_orc is the sequential composition f_1; f_misc. In the corresponding A, f_misc was defined to include only update_access_dirty.

Symbolic Execution One approach to computing the relational semantics of a code fragment f is to enumerate its paths, using symbolic execution [121] to compute path conditions, which can then be combined to form the transition relation σ(f).

A program is symbolically executed in the following way. Initially, the variables in I_P and V_P are each represented as a symbolic value. Each variable starts with a fresh, unconstrained symbolic value. As the program executes, operations are computed symbolically. At the first conditional branch, execution can continue down one of two paths. The symbolic execution engine forks execution and, for each path, creates a new path condition, which is a predicate P(I_P, V_P) on the input and state variables that is true along that path. At each subsequent conditional branch and execution fork, the path condition for the current path of execution is updated with the new predicate. If pc is the current path condition and a conditional branch creates the two predicates P, ¬P, then after forking execution, the two new path conditions will be pc_1 = pc ∧ P and pc_2 = pc ∧ ¬P.

After execution has completed, for every path explored, there is a path condition pc, which is a conjunction of predicates on the input and state variables. Restricting Init_P to values that satisfy the path condition and executing the program will always force execution down the same path. Along with each pc we can also take note of the final value of the state variables of the program, V'_P. These values may be concrete or symbolic or some combination of the two. Any subsequent execution of the program along the path described by pc will produce the output or next-state consistent with V'_P. Once the symbolic execution engine has explored all possible paths through the program fragment, we have a set R = {(pc, V'_P)} of pairs of path conditions and their corresponding output state. This set effectively describes the input–output relation of the program. Going back to the example in Figure 4.2, note that there are two possible paths through the code, with corresponding path conditions

pc_1 : curr_privilege_level = 3

pc_2 : curr_privilege_level ≠ 3.

The corresponding next-states for each path are:

V'_P1 : page_fault = 1
V'_P2 : page_fault = 0.

Model We now formalize the class of models considered in this work. Our models are constructed as transition systems where state variables have one of three possible types: Boolean, bit-vector, and memory. The last type can be used to model arrays or various data structures. The grammar for expressions in our modeling language is the same as described in Section 3.2 and shown in Figure 3.2.

A formal model is represented as a tuple M = (I_M, V_M, Init_M, O_M), where

• I_M is a finite set of input variables;
• V_M is a finite set of state variables;
• Init_M is a predicate characterizing the set of initial valuations to variables in I_M and V_M;

• O_M = {op_1, op_2, ..., op_N, main} is a finite set of operations that determine how the state variables evolve, where main is a specially designated “top-level” operation that determines how the remaining operations are composed together. O_M defines the transition relation δ^M of the model, as described below.

Each operation is a finite set of assignments to variables in V_M. Assignments define how state variables are updated in a single step of the transition system. As in Section 3.2, a next-state assignment α updates a state variable and is a rule with one of the following forms:

next(x) := e,
next(x) := {e_1, e_2, ..., e_n}, or
next(x) := ∗

where x is a signal in V_M; e, e_1, e_2, ..., e_n are expressions in the grammar of Figure 3.2 that are a function of V_M ∪ I_M; and “∗” is a wildcard that is translated at each transition into a fresh symbolic constant of the appropriate type. The curly braces express non-deterministic choice.

The transition relation corresponding to an operation op_i, denoted δ^M_i, is computed as δ^M_i = ⋀_{α ∈ op_i} r(α), where

    r(next(x) := e) = (x' = e);
    r(next(x) := {e_1, e_2, ..., e_n}) = ⋁_{i=1}^{n} (x' = e_i); and
    r(next(x) := ∗) = (x' = ∗);

and x' denotes the next-state version of variable x. The overall transition relation of the model M is determined by the form of composition expressed in main. For example, if main defines {op_1, op_2, ..., op_N} to be composed asynchronously, then the overall transition relation is δ^M = ⋁_{i=1}^{N} δ^M_i.

As an example, consider the model in Figure 4.2. It is expressed as the tuple (I_M, V_M, Init_M, O_M), where I_M = {cpl}, V_M = {page_fault}, Init_M is true (allowing arbitrary values for cpl and page_fault), and O_M has a single operation op, which in turn contains only the following next-state assignment:

next(page_fault) := ITE(cpl[0] = 1 ∧ cpl[1] = 1, 1, 0)

Environment The environment E that provides inputs to the program is similarly modeled as a transition system, where the input variables in M are the state variables of E. The final model that is to be verified is the composition of M and E, written M ∥ E. The form of the composition depends on the context; both synchronous and asynchronous compositions are possible. However, for this work, and in all of our examples, the environment E is stateless, generating completely arbitrary inputs to M at each step.

Transition Systems and Simulation Notice that the original software system S, the program P, and its model M are all transition systems with the same basic form. Once composed with an environment model E, the resulting transition system has the form T = (V, Init, δ), where V are the state variables, Init is the initial condition, and δ is the transition relation. A state s of T is a valuation to the variables in V.

Given a transition system T and a set of variables V_L ⊆ V, each state s of T can be labeled with a valuation to variables in V_L. We denote this labeling as L(s).

Given two transition systems T_1 and T_2 sharing a common labeling function L, T_1 is said to simulate T_2 if there exists a simulation relation H relating states in T_1 and T_2 with the following properties:

(i) ∀s_1, s_2. H(s_1, s_2) ⟹ L(s_1) = L(s_2)
(ii) ∀s_1, s_2, s'_2. [H(s_1, s_2) ∧ δ_2(s_2, s'_2)] ⟹ ∃s'_1. δ_1(s_1, s'_1) ∧ H(s'_1, s'_2)
(iii) ∀s_2. Init_2(s_2) ⟹ (∃s_1. Init_1(s_1) ∧ H(s_1, s_2))

We write “T_1 simulates T_2” as T_2 ⪯ T_1.

A slight variation is the notion of stuttering simulation. We say that T_1 stutter simulates T_2 if there exists a relation H_st between states in T_1 and T_2 where the second condition above is modified as follows:

(ii′) ∀s_1, s_2, s'_2. [H(s_1, s_2) ∧ δ_2(s_2, s'_2)] ⟹ [H(s_1, s'_2) ∨ ∃s'_1. δ_1(s_1, s'_1) ∧ H(s'_1, s'_2)].³

³ Note that this definition is a slight variant of the standard one, which allows any finite number of stuttering steps in T_1; we chose this for simplicity and because it fits our problem context.

We write “T_1 stutter simulates T_2” as T_2 ⪯_st T_1.

Problem Definition and Approach

Models are always constructed based on the properties to be verified. Therefore, a model must be validated with respect to a specific property Φ. In this work, we focus on the verification of temporal safety properties of the form Gφ, where G is the temporal operator “always” and φ is a state predicate, i.e., a Boolean expression over state variables with no temporal operators. The overall model validation problem definition is as follows:

Definition 2 (Software Model Validation) Consider the transition systems T_S formed by composing S and E, and T_M formed by composing M and E. Determine whether T_M satisfies Φ only if T_S satisfies Φ.

This work takes a particular approach towards solving this problem, which can be formalized using the notions of simulation and stuttering simulation. Specifically, we use the following result, a proof of which one may find in any standard book on model checking, such as the one by Clarke et al. [123].

Proposition 1 Given two transition systems T_1 and T_2, and a property Φ of the form Gφ: if, using a labeling function L based on the variables appearing in φ, T_2 ⪯ T_1, then T_2 satisfies Φ if T_1 satisfies Φ.

The above result applies also when one uses stuttering simulation instead of “plain” simulation.

Let V* ⊆ V_S ∪ I_S be a set of variables deemed to be relevant to proving or disproving that S satisfies Φ. For soundness, V* must contain all variables that influence the value of φ. However, this set of variables is difficult to determine exactly. Instead, we generate V* based on the “relevant” code fragments left after pruning. At a minimum, we include in V* the set V_φ of variables that syntactically appear in φ. Conservatively, the rest of V* could be computed syntactically from the cone of influence of V_φ on the entire code base [123]. However, this can lead to a highly overapproximate set V*. An alternative is to compute V* as the syntactic cone of influence of V_φ only on the code excluding f_misc. In this latter case, if we additionally verify that V* is not modified in f_misc, we can conclude that there is no code in f_misc that influences the value of φ. This is the approach we adopt in this work.


Label the states of T_S and T_M using valuations to the variables in V*. Then, the approach of this work can be outlined as follows:

1. Transform S to A by identifying code fragments F_P in S;

2. From A, obtain a new program P by dropping f_misc from the set F_A and treating any invocation of f_misc from f_orc as a no-op (stuttering step);

3. Check that S ⪯_st P using the labeling function based on V*;

4. Create a model M of P (manually or automatically), where each operation op_i has a 1-1 correspondence with a code fragment f_i in P; and

5. Validate M by checking that: (i) I_M = I_P; (ii) V_M ⊇ V_P; (iii) Init_M is equivalent to Init_P when projected on the common variables; and (iv) P ⪯ M using the labeling function based on V*.

The third step above is what we implement as data-centric model validation (DMV), and the fifth step is operation-centric model validation (OMV). Note that we allow M to have more variables than P and S, since models often have extra specification variables for use in the proof. If the two simulation checking steps pass, then we can conclude that M stutter simulates S and therefore satisfies Φ only if S satisfies Φ. Note that this holds only for safety properties, which is what we consider in this work; it is possible for a model that stutter simulates S to satisfy a liveness property even though S does not. The two simulation checking steps are the most important ones. We implement them as follows:

• Checking if S ⪯_st P: To do this, it is sufficient to verify that f_misc does not modify the values of variables in V*. There are several ways to do this; we take the approach of using an assertion checking tool based on symbolic execution, described further in Section 4.3.

• Checking if P ⪯ M: For this, we check that the transition relation δ^M_i of each op_i overapproximates that of the corresponding code fragment f_i, denoted δ^P_i. The transition relations are computed using symbolic execution, and the check δ^P_i ⟹ δ^M_i is discharged using SMT solving.

In theory, our methodology is sound, meaning that we never wrongly conclude that a model M satisfies Φ when the system S does not. However, in practice, our implementation has encountered significant practical limitations of symbolic execution tools, especially in the DMV step. Therefore, the implementation described in the following sections is used more as a bug-finding tool than as a verifier. Nevertheless, we demonstrate that it is useful in practice in weeding out various kinds of bugs in our models.

Figure 4.3: The five steps in our model validation process.

Example

Consider applying our problem definition to validate the model in Figure 4.2 against its corresponding program. Assume that we have already performed the DMV step and determined that update_access_dirty does not modify any relevant state variables. Our OMV methodology then produces the following two SMT queries, one for each pair (pc, V'_P) in the program's symbolic summary. The model is valid if and only if both queries are proven correct by the backend solver.

1. (page_fault' = ITE(cpl[0] = 1 ∧ cpl[1] = 1, 1, 0) ∧ cpl = 3) ⟹ (page_fault' = 1)

2. (page_fault' = ITE(cpl[0] = 1 ∧ cpl[1] = 1, 1, 0) ∧ cpl ≠ 3) ⟹ (page_fault' = 0)

The first query is proven valid by our SMT solver. However, the second query is invalid, indicating that our model does not correctly capture the behavior of the code. The returned counter-example is this: when cpl = 7, the two low-order bits are both set, and therefore page_fault is set. The model makes an implicit assumption that cpl will never be higher than 6. This seems reasonable, considering a physical CPU's CPL is never higher than 3. However, the C code enforces no such assumption, and so the model is incorrect.
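The discrepancy is easy to check concretely. The following stand-alone C snippet is ours, for illustration only – the model's bit test is transliterated from UCLID into C – and it evaluates both sides at the counter-example cpl = 7:

    #include <stdio.h>

    int main(void) {
        unsigned cpl = 7;
        int code_pf  = (cpl == 3);                    /* code: page fault iff CPL == 3 */
        int model_pf = (cpl & 1) && ((cpl >> 1) & 1); /* model: cpl[0] & cpl[1]        */
        printf("code: %d, model: %d\n", code_pf, model_pf); /* prints "code: 0, model: 1" */
        return 0;
    }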

4.3 Implementation

In Section 4.2, we outlined our formal approach to validating a model. In this section, we describe our practical implementation. Figure 4.3 depicts the data flow of the process.

Pruning the System

The pruning step of our implementation corresponds to steps 1 and 2 of the outline given in Section 4.2. In our implementation, the pruning stage is manual and performed by a domain expert. Given the system S and the property Φ, the expert decides which code fragments are relevant and creates A, with the set F_P identified. The expert then returns P, in which the code in f_misc has been replaced with no-ops.

DMV

Although pruning is performed by people with expert knowledge of S, the process is error prone. Some function or module in S that is relevant to Φ may get overlooked and be mislabeled as f_misc during the annotation of A. DMV takes as inputs S, Φ, and P and returns a Yes/No answer, indicating whether P and S are equivalent with respect to Φ. If the answer is No, the domain expert repeats the pruning stage until DMV can return Yes. This corresponds to step 3 of the theoretical approach given in Section 4.2. In that step we prove S ⪯_st P by checking that f_misc does not modify the values of variables in V*. In our implementation, we relax this requirement and check only a subset of V*. As a result, we cannot prove the simulation, only find instances where it fails.

We start by identifying the program variables V_φ that the property Φ directly depends on. This can be done syntactically: the variables in V_φ are those variables mentioned in φ. We wish to establish that any state change described in f_misc does not affect, directly or indirectly, the state described by V_φ; i.e., those pruned state changes are stuttering steps with respect to V_φ. To do this, we attempt to show that variables in V_φ do not change value within f_misc. We tackle this by keeping track of the last valuation of V_φ and ensuring that it does not change after any statement that appears in f_misc.

More specifically, we create a system S' that is identical to S, except for the following modifications. For each variable v ∈ V_φ, we create a new shadow variable v' in S'. After every (atomic) statement in S' that is also in P, we add a statement that assigns the value of v to v'. In other words, we attempt to keep the value of v' synchronized with the value of v across P statements in S'. In contrast, after every atomic statement in S' that is a statement in f_misc, we add an assertion that v' equals v for each v ∈ V_φ, which asserts that these statements do not write to any variable in V_φ, or equivalently, that every statement in f_misc is a stuttering step with respect to V_φ. Note that, since the v' variables do not appear in S and do not alter the control flow, they do not alter the behavior of S' with respect to the state of S. We then use KLEE to search for a path through S' where one of the inserted assertions fails. If KLEE is successful, we have found an f_misc statement that updates V_φ, so our DMV algorithm returns No. The path and inputs that cause the violation are passed back to the pruning stage, allowing the expert to refine the definition of P to incorporate the previously missed V_φ update.
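To make the instrumentation concrete, the sketch below shows the shape of S' for a single property variable. All names here (prop_var, relevant_stmt, pruned_stmt) are hypothetical placeholders rather than code from our case studies, and a plain C assert stands in for the assertion KLEE checks along each explored path:

    #include <assert.h>

    int prop_var;          /* v  : a variable in V_phi              */
    int shadow_prop_var;   /* v' : its shadow variable, added in S' */

    void relevant_stmt(void) {           /* a statement kept in P    */
        prop_var = 1;                    /* may legitimately write v */
        shadow_prop_var = prop_var;      /* re-synchronize v' with v */
    }

    void pruned_stmt(void) {             /* a statement in f_misc    */
        /* ... the pruned statement executes here ... */
        assert(prop_var == shadow_prop_var);  /* v must be unchanged, i.e.,
                                                 a stuttering step w.r.t. V_phi */
    }

If KLEE can drive execution to a failing assert, the pruned statement writes a property-relevant variable and the pruning must be refined.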

In practice, the extra statements in S' increase the size of the original system S by a factor proportional to |V_φ|, which complicates and prolongs the execution time of the dynamic analysis performed by KLEE or other symbolic-execution engines. What is more, many of the added statements are ineffectual: in most practical systems, V_φ will not be updated in every single statement, so checking even one assertion per statement can be overkill. To ease the burden on the symbolic execution engine, our prototype relaxes the strict definition of S' and aims to capture commonly missed V_φ updates, but at the expense of soundness. Specifically, our prototype adds v' := v statements into S' only after P statements that the domain expert knows to be updating v. Furthermore, our prototype adds v' = v assertions into S' only before P statements known to be updating v (contrast that with the original definition, which adds assertions after every f_misc statement). The rationale for this last relaxation is that updates to v' are good enforcement points for the equality between v' and v, and this requires fewer changes to the program. Finally, our prototype checks the v' = v assertion at all exit points from S', to capture any V_φ updates on execution suffixes that do not include a known P update.

Our prototype's relaxation is unsound. It may miss successive updates to a v between known updates to it; if a statement reads such a missed v update, affecting the result of P, our validation down the pipeline may be unsound as well. Since our prototype is a bug-finding tool inspired by our idealized methodology, this unsoundness is acceptable for our purposes. A more effective, sound relaxation would be to assert that v = v' before every read of v, but finding all reads of v is, in itself, another hard problem, with similarities to alias analysis. The use of dynamic slicing [124] may also be useful.

Modeling the Program

From the selected P, a model M is built using UCLID’s modeling language. We are agnostic as to how the modeling is done; it can be a manual or an automated process. This corresponds to step 4 of the outline given in Section 4.2.

Validating the Model

In the last step – step 5 of the outline in Section 4.2 – we validate that the model M correctly simulates the program P. We use KLEE for this step. KLEE is a symbolic interpreter for LLVM [125] programs. It performs a combination of symbolic and concrete execution, based on which inputs and state bits are made symbolic. We use KLEE to learn R = {(pc_P, V'_P)}, the set of path conditions and corresponding outputs describing the input–output relation of P. First, we set all global state V_P and input variables I_P to be unconstrained, symbolic values. When execution reaches a conditional branch point, KLEE forks and follows all feasible branches. As KLEE explores these program paths, it maintains a path condition for each path that is a function of V_P ∪ I_P.

When a path terminates via an exit statement or at the end of the program, we note the path condition pc_P for that path and the state of V_P at the point of termination. Once KLEE has explored all possible paths, we have R. Once we have computed R, we use UCLID's decision procedure to verify that for every path condition describing a partition of the input space in P, both M and P produce equivalent symbolic output. Our queries to the UCLID decision procedure take the form decide(pc_P ⟹ (V'_M = V'_P)), and we have a separate query for each (pc_P, V'_P) pair in R. If one of the queries fails, UCLID will return a counter-example with concrete values for I_P and V_P that satisfy the path condition, but for which V'_M ≠ V'_P.
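For intuition, each of these queries can be viewed as a differential check: search for an input on which the code and the model disagree. The sketch below is ours and is illustrative only – our actual pipeline discharges the queries with UCLID's decision procedure, and the model is UCLID code, here transliterated into a C function – using the running example of Figure 4.2:

    #include <klee/klee.h>
    #include <assert.h>

    /* the code's behavior (Figure 4.2, left) */
    int code_page_fault(unsigned cpl)  { return cpl == 3; }

    /* the model's behavior (Figure 4.2, right), transliterated into C */
    int model_page_fault(unsigned cpl) { return (cpl & 1) && ((cpl >> 1) & 1); }

    int main(void) {
        unsigned cpl;
        klee_make_symbolic(&cpl, sizeof(cpl), "cpl");
        /* KLEE reports an assertion failure on some path, e.g., at cpl = 7 */
        assert(code_page_fault(cpl) == model_page_fault(cpl));
        return 0;
    }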

Translating Path Conditions to SMT Queries

In order to create each query, we must first translate the pc_P generated by KLEE into a format suitable for UCLID's input language. KLEE's path conditions are written in KQuery,⁴ the input language of KLEE's backend constraint solver (Kleaver). The symbolic states of V_P are given in a similar syntax. We built a tool to translate path conditions and symbolic state from KLEE into SMT queries in UCLID's input language. Our tool requires the user to provide an input file that maps variable names used in P to variable names used in M. All variables in KQuery are represented as arrays of bits; these are easily translated to UCLID's bit-vector type. UCLID allows two additional types: Booleans and uninterpreted functions. Currently, we require UCLID models to not use Boolean types; bit-vectors of length 1 can be used instead. Seamlessly converting between the two is functionality that can be added to future versions of our tool. UCLID models are free to use uninterpreted functions, but we do not handle verification with respect to properties that include universal quantification over an uninterpreted function. For example, if f is an uninterpreted function in a UCLID model and i is a bit-vector variable in that model, our tool can validate a model with respect to the property φ(f(i)), but would not be able to validate a model with respect to the property φ(f).

Handling Loops

For programs that contain loops, the number of possible paths through the code can explode quickly. In cases where the loop bound can be determined statically, this is not a problem. However, when the loop bound is determined at run time, complete path coverage requires exploring all possible loop bounds, causing an exponential blow-up in the number of paths through the program. In some cases we can handle this using a divide & conquer approach. Figure 4.4 illustrates the idea. For a program P with a single loop, we can divide the source code of the program into three parts: the code from program entry to the loop, the loop code itself, and the code from loop exit to program exit.

⁴ http://klee.llvm.org/KQuery.html

1 int loop_prog(int bound) {
2     int retval = 0;                   // E2L, lines 2,3,4
3     if (bound <= 0)
4         return -1;
5     for (int i = 0; i < bound; i++)
6         retval++;                     // L2L, lines 5,6
7     return retval;                    // L2E
8 }

Figure 4.4: A program with a dynamically determined loop bound. Validation of this program's model is done in three parts.

In Figure 4.4, these sections are labeled E2L, L2L, and L2E, respectively. We then explore each section of code independently of the other two. At the start of each exploration we set the current constraints on I_P and V_P to reflect the invariants of the previous section(s). These invariants are determined manually, and care must be taken not to over-constrain the starting conditions of each section. At a minimum, the following weak invariants can safely be used: no constraints set for E2L, the loop guard as the constraints for L2L, and no constraints set for L2E. These constraints will form part of the path conditions created during subsequent exploration of the section. For example, in Figure 4.4, execution of E2L starts with no constraints on bound or retval and the starting path condition pc_E2L = True; execution of L2L is started with the path condition (and corresponding constraints) pc_L2L = (i ≥ 0) ∧ (i < bound); and execution of L2E is started with no constraints.

We divide the model into three parts as well. We create a separate model for each of the E2L, L2L, and L2E program fragments, and separately validate each model against its corresponding program fragment. Because each fragment is explored starting from an overapproximation of the reachable starting states, the explored paths through the fragments will constitute a superset of the reachable paths in the original program. If the model is shown to be consistent with each of these paths plus the corresponding output, a property proven true of the model will be true of the program. However, this method is not complete, and a property that is disproven for the model may in fact be true of the program.

In some cases, it may be possible to strengthen the invariant through manual inspection of the code. For example, in Figure 4.4, the execution of L2L can be started with the path condition pc_L2L = (i ≥ 0) ∧ (i < bound) ∧ (bound > 0) ∧ (retval ≥ 0), and execution of L2E can be started with the path condition pc_L2E = (i ≥ bound) ∧ (bound > 0) ∧ (retval ≥ 0). This will increase the completeness of model validation for some programs, but still does not guarantee it. Furthermore, our divide & conquer approach is suited to code with a single loop; it is not applicable to code with nested loops or recursion.
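As an illustration of how a fragment's exploration is seeded, the following harness – our own sketch, not part of any case study – sets up the L2L fragment of Figure 4.4 under KLEE. Every variable starts symbolic, constrained by the weak invariant pc_L2L; the single & inside klee_assume avoids the extra fork a short-circuiting && would introduce:

    #include <klee/klee.h>

    int main(void) {
        int bound, i, retval;
        klee_make_symbolic(&bound,  sizeof(bound),  "bound");
        klee_make_symbolic(&i,      sizeof(i),      "i");
        klee_make_symbolic(&retval, sizeof(retval), "retval");

        /* pc_L2L: the loop guard, the weak invariant for this fragment */
        klee_assume((i >= 0) & (i < bound));

        /* one iteration of the L2L fragment (lines 5,6 of Figure 4.4) */
        retval++;
        i++;
        return 0;
    }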

4.4 Evaluation: Data-Centric Validation

BPF

The Berkeley Packet Filter (BPF) is a kernel module that filters network packets, sending only the desired ones to the user. The user provides a filter program, written in the BPF pseudo-machine language. The BPF kernel module contains an interpreter that runs the filter program on each network packet. The property Φ asserts that the BPF interpreter, bpf_filter (Figure 4.6), has a monotonically increasing program counter; this is desirable, since it implies that all filter programs must eventually halt. We restricted our analysis to the kernel mode version of the code, since the user mode version is known to violate this property (e.g., for efficiently implementing IPv6 protocol header chain parsing), though we did not enable “zero-copy memory buffers,” which we considered out of scope.

One challenge is that it is not practical to explore all reachable states by running the BPF interpreter on all filter programs of n fully symbolic instructions. Therefore, we overapproximate its behavior: we run one iteration of the interpreter loop on a single, fully symbolic BPF instruction, starting from a fully symbolic state. This overapproximation may lead to spurious counter-examples involving states that are not ordinarily reachable. In particular, we made the accumulator, the index registers, the temporary variable k, and the memory fully symbolic. The inputs were fully symbolic, two-instruction BPF programs that had passed the bpf_validate function (Figure 4.5).⁵ We encountered no spurious counter-examples. We were able to efficiently explore all paths of bpf_filter using KLEE in less than 5 seconds and confirm that we had identified all writes to the program counter. Thus, our data-centric model validation was successful.
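A harness for this experiment can be set up along the following lines. This is a sketch under assumptions: the instruction layout follows the classic BSD struct bpf_insn, and the two prototypes are simplified stand-ins for the kernel functions under test:

    #include <klee/klee.h>

    struct bpf_insn {
        unsigned short code;
        unsigned char  jt, jf;
        unsigned int   k;
    };

    /* kernel code under test (prototypes simplified) */
    int bpf_validate(struct bpf_insn *filter, int len);
    unsigned int bpf_filter(struct bpf_insn *filter, unsigned char *pkt,
                            unsigned int wirelen, unsigned int buflen);

    int main(void) {
        struct bpf_insn prog[2];   /* fully symbolic two-instruction program */
        unsigned char pkt[64];     /* packet contents, also symbolic         */

        klee_make_symbolic(prog, sizeof(prog), "prog");
        klee_make_symbolic(pkt,  sizeof(pkt),  "pkt");

        if (!bpf_validate(prog, 2))   /* keep only programs the validator accepts */
            return 0;
        bpf_filter(prog, pkt, sizeof(pkt), sizeof(pkt));
        return 0;
    }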

ftpd

ftpd is GNU's File Transfer Protocol server, included in their Inetutils package (v1.8). The main loop reads input from the network client, parses the command, and runs the appropriate block of code to respond to the client. The called code may be inline or in a separate function. It would be interesting to model the blocks of code that affect whether the user is considered to be logged in. There are many FTP commands, but we would expect few to be relevant – for example, PASS (authentication password), but not PWD (print working directory) – so being able to safely prune functions would greatly reduce the amount of modeling. A simple syntactic approach of searching for “logged_in =” identifies only the PASS function (Figure 4.7), so we used our DMV approach to check for other writes to logged_in.

⁵ A valid BPF program must end with a return instruction, so our input approximates a fully symbolic “one-instruction” program.

int bpf_validate(filter, len) {
    for (i = 0; i < len; ++i) {
        switch (BPF_CLASS(filter[i].code)) {
        case BPF_ST:
            if (filter[i].k >= BPF_MEMWORDS)
                return 0;
            break;
        case BPF_DIV:
            // Check for division by 0.
            if (filter[i].k == 0)
                return 0;
            break;
        ...
        }
    }
    return BPF_CLASS(filter[len-1].code) == BPF_RET;
}

Figure 4.5: Simplified code from bpf_validate, which checks that the BPF program cannot write out-of-bounds, cannot divide by zero, and ends with a return.

switch (pc->code) {
case BPF_RET: ...
case BPF_LD:  ...
case BPF_ST:  ...
case BPF_JMP: ... pc += ...
...
}

Figure 4.6: The bpf_filter function interprets instructions one at a time. We used DMV to confirm that only the BPF_JMP family of instructions modifies the program counter.

pass (const char *passwd) {
    if (cred.logged_in || askpasswd == 0)
        return;
    askpasswd = 0;
    cred.logged_in = 1;
}

user (const char *name) {
    if (cred.logged_in)
        end_login (&cred);
    askpasswd = 1;
}

end_login (struct credentials *pcred) {
    memset (pcred, 0, sizeof (*pcred));
}

Figure 4.7: Simplified pseudo-code for pass and user functions in the ftpd software.

We modified the ftpd software so that its input was partly symbolic: it could choose to explore any of USER and PASS (with a prespecified, known-good combination), nine other commands that had no parameters (e.g., NOOP, PWD, QUIT), and nine commands with partly symbolic parameters (two alphanumeric characters). We excluded other commands because there were parsing issues with some parameters, they were not implemented by ftpd, or they modified the system (e.g., RMD), which is not well handled by KLEE. In some cases, we also modified functions (e.g., CWD) to remove system calls or side effects. We then added the shadow credentials data structure,⁶ copying cred to shadow_cred after the known write to logged_in, and placed “assert (cred.logged_in == shadow_cred.logged_in);” before the write and in the body of the parser loop. We started the analysis from a partly symbolic state, with the askpasswd global variable and all of the credentials data structure made symbolic (for each string, we stored two symbolic characters).

KLEE produced a test case where the assertion failed: when the client is already logged in and issues another USER command, the entire credentials data structure is zeroed out by end_login() (Figure 4.7). This is non-trivial to identify syntactically, because unlike the previous case, where cred.logged_in = 1;, this write syntactically involves neither cred, logged_in, nor =. After adding the update to shadow_cred.logged_in after the write in the user function, we verified with DMV that we had not missed any writes. In a separate experiment, we confirmed that DMV would also have been able to identify a missing write to cred.logged_in in the PASS function. Each experiment required less than 3 minutes of run-time.

⁶ It is tempting to place the shadow copy of logged_in inside the original credentials structure (i.e., cred.shadow_logged_in), but as will be observed, this would be unwise, as any writes to cred may also overwrite the shadow variable.

[Diagram: the ftpd command parser dispatches on the parsed command. USER calls user(), which may call end_login(), which in turn calls memset(pcred, ...); PASS calls pass(), which performs cred.logged_in = 1; STOR calls store(); QUIT calls dologout(); and so on for the remaining commands.]

Figure 4.8: ftpd interprets commands one at a time. Only the USER and PASS commands modify the logged_in variable. This is significantly more complex than the BPF example, with some cases resulting in nested function calls across multiple files.

Data-centric model validation has therefore been able to identify a missed write to cred.logged_in, and to provide some degree of confidence that the blocks of code that were explored, including those corresponding to FTP commands, the parser implementation, and some dependencies, do not write to this variable (Figure 4.8).

XMHF hypervisor

The eXtensible Modular Hypervisor Framework (XMHF) [93] is a hypervisor with a small trusted core of about 6000 lines of code that is amenable to formal verification. To demonstrate this, Vasudevan et al. [93] have proven a memory integrity property of XMHF. A hypervisor runs at a higher privilege level and supports the execution of guest software running at a lower privilege level. Whenever a guest executes a privileged instruction (e.g., to set the CR3 register), the CPU transfers control to the hypervisor, which uses an intercept handler to take the necessary action. The XMHF intercept handler is summarized in Figure 4.9. The intercept handlers are worth modeling to verify that their updates do not violate a security property. Consider the following property: the extended page tables must remain constant after setup. This property requires us to model all program paths in interceptHandler that might update relevant state such as the vcpu->vmx_vaddr_ept_pml4_table field, which shadows the address pointing to the base of the extended page tables.

void interceptHandler(VCPU *vcpu, struct registers *regs) {
    switch (vcpu->vmcs.vmexit_reason) {
    case CRX_ACCESS:    handle_crx_access(vcpu, regs);    break;
    case IO:            handle_ioport_access(vcpu, regs); break;
    case WRMSR:         handle_wrmsr(vcpu, regs);         break;
    case EPT_VIOLATION: handle_pagetable(vcpu, regs);     break;
    ...
    }
}

Figure 4.9: Simplified code for XMHF intercept handler.

Instead of modeling all intercept handlers, we would like to prune away handlers that provably do not update the vcpu->vmx_vaddr_ept_pml4_table field. Forcing the input data structures vcpu and regs to be symbolic, we use KLEE to symbolically explore paths in interceptHandler. DMV found no writes to the vcpu->vmx_vaddr_ept_pml4_table field in any of the handlers that the expert decided to prune away. This allows us to prune away the following handlers: set CR0 register, set CR4 register, read MSR register, write MSR register, read CPU id, and access IO ports. This experiment used KLEE to evaluate 300 lines of C code in 5 seconds. Note that the intercept handlers rely on inlined assembly instructions to perform certain operations, such as reading the MSR register. Since we do not model the x86 hardware, we assume that none of the inlined assembly operations impact the vcpu->vmx_vaddr_ept_pml4_table field. While we are confident of this assumption, we do recognize it as a potential weakness of this case study.
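The shape of this experiment follows the shadow-variable recipe from Section 4.3. A sketch of the harness – ours, with the VCPU type and field name taken from the text and everything else assumed – is:

    #include <klee/klee.h>
    #include <assert.h>
    #include "xmhf.h"   /* hypothetical header providing VCPU, struct registers */

    int main(void) {
        VCPU vcpu;
        struct registers regs;

        klee_make_symbolic(&vcpu, sizeof(vcpu), "vcpu");
        klee_make_symbolic(&regs, sizeof(regs), "regs");

        /* shadow the EPT base pointer before dispatching the intercept */
        unsigned long shadow_pml4 = vcpu.vmx_vaddr_ept_pml4_table;
        interceptHandler(&vcpu, &regs);

        /* fails on any explored path through a handler that writes the field */
        assert(vcpu.vmx_vaddr_ept_pml4_table == shadow_pml4);
        return 0;
    }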

4.5 Evaluation: Operation-Centric Validation

Bochs address translation function

Bochs [54] is the open-source x86 emulator whose address translation function we verified in Section 3.4. In this case study, we validate that the model we used in Chapter 3 correctly overapproximates Bochs' address translation function. The address translation function in Bochs is 98 lines of code, and the corresponding UCLID model is about 300 lines of code. The set of inputs contains three elements: the linear address to be translated, the read or write permissions requested of the address, and the emulated CPU's current privilege level. The Bochs linear-to-physical address translation requires either a two-step page table walk or a lookup in the translation lookaside buffer (TLB), a cache of previously translated addresses. We make symbolic the three inputs, the two page table entries, and the single TLB entry that the translation could access, plus all global variables. Because our model validation is motivated by our previous verification effort, we use the returned physical address as the output of the function, and our model validation checks whether the physical address returned by the model is always equivalent to the physical address returned by the code.

During symbolic execution, KLEE explored 219 paths through the code.⁷ KLEE achieves full code coverage of the function, as seen in Table 4.2. There is one branch not taken by KLEE; manual analysis reveals that taking this branch would not lead to any new state in the code. For each of the 219 paths explored by KLEE, we captured the path constraints describing the path and checked whether the symbolic state of the physical address at the end of execution of the C++ function is equivalent to the physical address produced by the model.

Model validation found seven discrepancies between the behavior of the source code and the behavior of the model, each traced to a model bug. We categorize these bugs by the likely cause of the modeling error; there was one typo, two implicit assumptions, and four logic errors. In the assumptions, the modeler made some implicit assumption about either an invariant of the system or a pre-condition to the function being modeled. One of these was the bug we used in our running example (Section 4.1), in which the modeler assumed the CPU privilege level was never greater than 3. While this is a reasonable assumption for x86, it causes the model and code to have divergent behaviors. In one logic error, an incorrect ordering of case statements in the model caused incorrect behavior. Table 4.3 summarizes the results of the model validation, while Table 4.1 shows the source and model code for six of the seven bugs. The seventh bug, not shown here, was an error in a large case statement.

⁷ Ignoring inline functions that gcov counts as branches.
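For reference, the symbolic setup described at the start of this case study can be sketched as follows. The harness is hypothetical: the translate_linear prototype is simplified, and the extern globals stand in for the page-table and TLB state of the pruned Bochs code it would be linked against:

    #include <klee/klee.h>

    typedef unsigned long bx_address;   /* stand-in for the Bochs address type */

    /* state of the pruned translation code (stand-ins) */
    extern unsigned long pde, pte;        /* the two page-table entries */
    extern unsigned long tlb_entry[4];    /* the one reachable TLB entry */

    bx_address translate_linear(bx_address laddr, unsigned curr_pl, unsigned rw);

    int main(void) {
        bx_address laddr;
        unsigned curr_pl, rw;

        klee_make_symbolic(&laddr,     sizeof(laddr),     "laddr");
        klee_make_symbolic(&curr_pl,   sizeof(curr_pl),   "curr_pl");
        klee_make_symbolic(&rw,        sizeof(rw),        "rw");
        klee_make_symbolic(&pde,       sizeof(pde),       "pde");
        klee_make_symbolic(&pte,       sizeof(pte),       "pte");
        klee_make_symbolic(&tlb_entry, sizeof(tlb_entry), "tlb_entry");

        translate_linear(laddr, curr_pl, rw);   /* KLEE enumerates all feasible paths */
        return 0;
    }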

Typos

Code:

    if (!(pte & 0x1)) {
        CPU->pagefault = 1;
        return 0;
    }
    ...
    if (!(pde & 0x1)) {
        CPU->pagefault = 1;
        return 0;
    }

Model:

    next[pagefault] := (priv_check = zero_bit) |
                       ((pde # [0:0]) = zero_bit) |
                       ((pde # [0:0]) = zero_bit);

    /* Model reads bit 0 of pde twice and never reads bit 0 of pte.
       The syntax 'pde # [0:0]' extracts bit 0 from the pde bitvector. */

False Assumptions

Code:

    unsigned pl = (curr_pl == 3);

Model:

    pl := ((curr_pl # [0:0]) && (curr_pl # [1:1]));

    /* Model assumes curr_pl is less than 7.
       The syntax '&&' denotes a bit-wise AND operation. */

Code:

    bx_bool isExecute = (rw == 2);

Model:

    isExecute := (rw # [1:1]) && !!(rw # [0:0]);

    /* Model assumes rw is less than 6.
       The syntax '!!(rw ...)' denotes bit-wise negation. */

Logic Errors

Code:

    combined_access = (pde & pte) & 0x06;
    priv_index = /* value of priv_index depends on combined_access */
    ...
    if (!priv_check[priv_index]) {
        CPU->pagefault = 1;
        return 0;
    }
    ...
    combined_access |= (pte & 0x100);

Model:

    combined_access := ( ((pde && pte) && 0x06)
                         || (pte && (0x100)) );
    ...
    priv_check := /* value of priv_check depends on combined_access */

    next[pagefault] := (priv_check = zero_bit) |
                       ((pde # [0:0]) = zero_bit) |
                       ((pte # [0:0]) = zero_bit);

    /* Model updates combined_access with all writes before checking
       for pagefaults. */

Code:

    if ( /* TLB hit and permissions OK */ )
        return paddress;
    else
        /* page walk */
    ...
    /* Check for pagefault conditions that occurred during page walk. */
    if (!(pde & 0x1)) {
        CPU->pagefault = 1;
        return 0;
    }
    if (!(pte & 0x1)) {
        CPU->pagefault = 1;
        return 0;
    }
    if (!priv_check[priv_index]) {
        CPU->pagefault = 1;
        return 0;
    }
    ...
    return paddress;

Model:

    init[paddress] := 0x0;
    next[paddress] :=
        case
            next[pagefault]          : 0x0;
            ~permission | ~present   : /* page walk */
            default                  : /* TLB entry */
        esac;

    /* The model checks whether a page walk would incur a pagefault
       without first checking if there is a TLB hit, which would make
       the page walk unnecessary. The corrected model puts the TLB check
       first, then the check for pagefaults, and finally, the default
       action is to return the address calculated by a page walk. */

Code:

    if (! (tlbEntry->accessBits &
           ((isExecute<<2) | (isWrite<<1) | pl)) ) {
        /* TLB entry has correct permissions. */
    }

Model:

    permission := (!! ((tlbentry(TLB_index) # [98:96]) &&
                       ((isExecute << 2) || (isWrite << 1) || pl))) != 000;

    /* Model uses a complicated double negative that evaluates
       incorrectly. */

Table 4.1: Bochs modeling bugs (6 of 7)

TCAS

Traffic Collision Avoidance Software (TCAS) [126, 127] is an aircraft collision avoidance system that is in use onboard all large aircraft today. Its use is complementary to air traffic control and is meant to reduce the occurrence of mid-air collisions. A TCAS system consists of an onboard transponder, which is used to detect and communicate with other aircraft, and the TCAS software, which calculates the corrective action required of the pilot in the case of an impending collision with another aircraft. The TCAS software is complex and safety-critical, and verification of its correctness is well suited to model checking. In this case study, we used a simplified and publicly available version of TCAS that the software engineering research community uses to evaluate the efficacy of various test case generation techniques [128]. Though only 133 lines of C code,⁸ the code includes 9 functions and has some complicated patterns of control flow.

From the TCAS code, we created 23 different models, each injected with a different fault developed by Hutchins et al. [129]. The 23 models were chosen as a subset of the 41 faults in Hutchins' test suite. We chose the subset to avoid duplication of similar faults. For example, if Hutchins' set contained two related but different faults, such as replacing x = 300 with x = 400 or with x = 500, we used only one of those faults. Each fault is meant to mimic the type of bug a programmer might accidentally introduce while building the model. We use these 23 different versions to evaluate the effectiveness of model validation at finding these programmer errors; we validate each model against the TCAS source code and check whether model validation was able to find the injected fault. To avoid leading our model validation technique to find the known bug, we had one author create all the models and a second author complete the model validation.

We made symbolic all 12 inputs to the TCAS program. During symbolic execution, KLEE explored 29 paths through the code. There was one line of code not covered by KLEE (see Table 4.2), and manual analysis revealed this line of code to be unreachable. It is an error state in which the suggested corrective action for the pilot is to both climb and descend. There were eight branches not taken by KLEE; these were either infeasible or redundant branches. Redundancy in branches is an artifact of how gcov counts branches. For example, the statement if (A && B) has two possible branches: taken and not taken. To cover the not-taken branch, it is sufficient to find a path in which either A or B is false. However, each logical conjunction or disjunction is counted as introducing a new branch in gcov, and if KLEE does not explore all possible combinations of A and B, gcov considers those unexplored combinations to be branches not taken, even though all actual branches through the code may have been explored.

Our model validation found faulty behavior in all 23 models. It is difficult to categorize these bugs since they did not arise naturally during the translation from code to model. However, they included: using a less-than-or-equal sign (≤) in place of a less-than sign (<), or vice versa; using a logical AND (&) in place of a logical OR (|), or vice versa; leaving an entire clause out of a series of conjunctions and disjunctions; and typographical errors such as setting a variable to the wrong value.
In addition to the 23 injected faults, model validation also revealed two model bugs that were introduced unintentionally during the modeling phase (see Table 4.3). That is, neither the modeler nor the validator was expecting these bugs to be in the model. The first bug was a typographical error: a variable that should have been set to 0x2e4 was instead set to 0x284. The second bug was a logic error. An int variable in the code was used as a flag: non-zero is true and zero is false. In the model, greater than zero was treated as true, while zero or less was treated as false.

⁸ Calculated using SLOCCount (http://www.dwheeler.com/sloccount/).

BPF

The Berkeley Packet Filter (BPF) uses an interpreted filter program to filter network packets. The interpreted pseudo-machine language includes instructions for load/store, ALU operations, branch, and return. A badly or maliciously written program could crash the BPF interpreter. Prior to running a filter, BPF runs a validator to ensure the filter is valid and satisfies three properties: 1) any jumps in the program move forward (no loops are allowed), 2) any jumps move to a legitimate place in the program (i.e., do not jump past the end of the program), and 3) the program always terminates with either an accept or a reject.

Validate-before-use is a common strategy for security-critical code. For example, software fault isolation may compile a program to obey strict control flow policies and then statically validate just the compiled program [130–133]. The benefit of this approach is that it requires trusting only the relatively small validator and not the large compiler. In this scenario, as with BPF, it is important to verify that the validator will catch any failing programs. In this case study, we show how one might verify that filter programs are marked valid if and only if they satisfy the three properties above. We build a model of the validator for use in a model checking-based proof. Before doing the model checking, though, we want to validate our model to make sure it accurately represents the validator function.

Program validators commonly loop over each program instruction, processing it before examining the next. The loop's bound typically cannot be determined statically, making full path exploration using symbolic execution challenging. Non-statically bounded loops are not commonly seen in hardware designs and so are not handled by the model validation techniques used for hardware designs. The bpf-validate case study demonstrates the validation of this software feature.

bpf-validate takes two inputs: the filter program pointer and the filter program length. It has two local variables: the current program instruction pointer and the loop iteration counter. Because the for-loop does not have a bound that can be determined statically, we use the divide & conquer technique for model validation outlined in Section 4.3. We transform the single bpf-validate function into three separate functions – E2L, L2L, and L2E – and we build a model with three separate modules, one for each new function. E2L sanity checks the length parameter. L2E checks that the filter program ends correctly with a return instruction. L2L does the bulk of the filter program validation. Each iteration of the loop checks a new instruction in the BPF program to make sure it is valid, does not perform illegal arithmetic such as division by zero or illegal memory accesses, and does not jump to an illegal point in the program. We also introduce a new return value, return_happened, to indicate whether the next function should execute. For example, if return_happened is 1 after execution of E2L in a particular run, the execution of L2L and L2E is unnecessary; the original composed code would not have gotten that far.

For all these functions, we create a program of length one (a single instruction). We make the contents of that program fully symbolic, and we make the length parameter symbolic as well. We constrain length to be less than or equal to the length of the program.
Omitting this constraint causes KLEE to find the potential memory error when the for-loop tries to access the next instruction in the program at an array index that is out of bounds. This is useful for checking the memory safety of the C program, but not relevant for our model validation. Because our technique assumes a bounded input, we bound the length of our BPF program to a single instruction. Coming up with a reasonable size for the bound will depend on the code being modeled; in this case the bpf-validate function only accesses a single instruction in each iteration of the loop, and saves no state from the processing of one instruction to the next, so a bound of one is reasonable.

At the start of L2L and L2E, we add constraints on the symbolic variables to match the program constraints at that point in the code. For example, if execution makes it to the start of the for-loop, neither of the preceding if statements was true (otherwise the program would have exited), and so we constrain the symbolic variables with the negation of the if conditions. Similarly, at the start of L2E, we constrain the index of the current instruction to be greater than or equal to the length variable.

Model validation found four bugs in the model, all in the for-loop. Two of the bugs involved an inverted logic statement. The third and fourth were errors in calculating the return values, return and return_happened. Both are computed using switch statements in the model, and in both cases there was a case missing.

Program        Lines of Code   Lines Executed   Branches   Branches Taken
Bochs                98               98             30           29
TCAS                 60               59             68           60
BPF-validate         71               69             28           24

Table 4.2: Code and path coverage achieved during symbolic exploration of the code by KLEE. The total lines of code given refers only to the pruned program, not the original system.

Program        Typo   Logic Error   Assumption about System Invariants   Total
Bochs            1         4                        2                      7
TCAS             1         1                        -                      2
BPF-validate     -         4                        -                      4

Table 4.3: Model bug types found during operation-centric model validation. For TCAS, only the bugs introduced unwittingly during the modeling phase are categorized.

4.6 Related Work

Techniques for validating a model against an implementation have been explored in the hardware community. In that setting, the goal is to show equivalence between designs written in a high-level language like C and register transfer level (RTL) implementations written in a hardware description language, such as Verilog or VHDL. The basic idea is similar to our operation-centric model validation: derive an expression, for both the C program and the RTL program, describing the input–output transition relation of the program, and use a SAT solver to show equivalence between the two expressions. Clarke and Kroening describe how to transform both a C function and a corresponding RTL implementation to a bit-vector expression of the transition relation and use a SAT solver to prove equivalence between the two [134]. Others have shown how to use symbolic simulation to derive an expression describing the set of input–output relations of a C function and show equivalence to the RTL [135–139]. In all these cases, it is assumed that the C function is written for eventual implementation in hardware and therefore some restrictions can be placed on the expressiveness of the C code. For example, the works of Clarke and Kroening [134] and Koelbl et al. [137–139] require that all loops and recursive function calls have bounds that can be determined statically, so that they can be unrolled before deriving the input–output transition relation. Feng [135] and Hu et al. [136] assume the software model and corresponding RTL implementation can be reduced to a combinational equivalence problem.

Model validation in the software community has been less widely studied. Heitmeyer et al. show that partitioning the code and concentrating verification efforts only on those portions of code that are relevant to the property can greatly reduce the verification effort [75, 76]. This is similar in spirit to our pre-processing phase, although our approach is automated and focused on bug-finding, whereas Heitmeyer et al. provide a proof that only code within the verified partition is relevant for the property at hand. The proof is built manually and relies on the source code being fully annotated with pre- and post-conditions for all functions. Their technique applies well to a smaller code base – they demonstrate their approach on an embedded device kernel that is approximately 3 KLOC of C and assembly combined – and might not scale well to software systems as large as Bochs or XMHF. Static analysis techniques such as alias analysis, Mod/Ref analysis, and program slicing (e.g., [124, 140, 141]) can be used to perform DMV and scale to large code bases, but at the cost of much lower precision, resulting in a high false alarm rate.

Caballero et al. [142] use a combination of concrete and symbolic exploration of a function at the binary level to derive a set of path predicates that describe the behavior of that function. They focus on security-sensitive functions, in particular filtering functions, and use the derived set of path predicates of one function and a manually written model of a second function to examine whether the two functions implement the same filter. Their set of path predicates is not complete, and rather than proving equivalence between the two versions, they are focused on finding violations of equivalence (i.e., bug finding). They use a decision procedure to find any instances that will be blocked by one filter but allowed through by the other.
Outside of model validation, using symbolic execution to compare two versions of the same software code block has applications in checking code optimizations, version tracking, regression testing, and code refactoring. In these settings, there tend to be significant similarities between the two versions being compared, which can be exploited during the validation process; such similarities cannot be relied upon in our model validation setting.

Ramos and Engler extend KLEE to run on two different blocks of C code, saving the symbolic state of each run for comparison [143]. This is similar in spirit to our operation-centric model validation technique, although we are comparing C code to a UCLID model rather than comparing two blocks of C code. Siegel et al. show how to use symbolic execution with the SPIN model checker to verify that a parallel implementation of a numeric computation is equivalent to the original sequential implementation [144, 145]. They impose a bound on the number of iterations taken by any loop and further reduce the complexity of the verification by accepting as equivalent only those symbolic expressions that can be made identical through simple symbolic algebra. In cases where more sophisticated theorem-proving techniques are required to prove equivalence, their tool may return "not equivalent."

Person et al. show how to use symbolic execution to characterize differences between two versions of the same software [146]. Their definition of functional equivalence is analogous to the equivalence we define for operation-centric model validation. They reduce the complexity of the symbolic exploration by using uninterpreted functions to describe blocks of code that are textually identical in the two versions being compared. Since we are not comparing two versions of the same code, but rather code and a newly generated model, this technique of overapproximating the software's behavior does not work for us. Instead, we use our divide & conquer approach for handling data-path complexity such as loops whose bound cannot be determined statically.
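
To make the uninterpreted-function idea concrete, the following toy sketch (again with Z3's Python bindings; the two "versions" are hypothetical, and this is not the tool of Person et al.) abstracts a block of code that is textually identical in both versions as an uninterpreted function f; equivalence must then hold for every possible behavior of f.

    from z3 import BitVec, BitVecSort, Function, Solver, sat

    # Abstract the shared, textually identical block as an
    # uninterpreted function f : BV32 -> BV32.
    f = Function('f', BitVecSort(32), BitVecSort(32))
    x = BitVec('x', 32)

    v1 = f(x) + f(x)   # version 1: invokes the shared block twice and adds
    v2 = 2 * f(x)      # version 2: refactored to a single invocation

    s = Solver()
    s.add(v1 != v2)    # search for an input and a behavior of f that differ

    if s.check() == sat:
        print("versions can differ:", s.model())
    else:
        print("equivalent for every interpretation of f")

Because f(x) + f(x) equals 2 * f(x) under any interpretation of f, the query is unsatisfiable and the versions are reported equivalent. When the two programs share no code, as in our setting, there is no block to abstract this way, which is why the overapproximation does not help us.
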

4.7 Conclusion

We have proposed a formal framework for validating a model of a software system against the code it was created from. Using an implementation based on symbolic execution and SMT solving, we have demonstrated the utility of our approach on several case studies. Among the many directions for future work, we believe the most important is to improve the implementation of the DMV check, potentially by combining scalable static analysis methods with selective assertion checking.

Chapter 5

Conclusion

Virtualization software provides isolation properties that make the virtualized environment well suited to a wide variety of security applications. In this work, we tackle the problem of verifying those isolation guarantees. We focus on memory isolation and show how a combination of abstraction and model checking can be used to verify the correctness of memory management modules. This is software that typically involves very large data structures and that has not been amenable to previous model checking-based techniques. One weakness of a model checking-based approach is that the verification applies to the model, not the original source code. We address this issue by presenting a framework and implementation for validating the model against the source code. We show that a property proven true of the validated model will also be true of the original source code.

Although motivated by virtualization software, our work has applications beyond CPU emulators and VMMs. Our small and short world verification technique is potentially applicable to any system involving large, symmetric data structures. And model validation is useful any time a model-based verification technique is used. The results of our case studies suggest that even models of small, well understood programs are likely to have some errors; we propose that adding model validation to the verification workflow is a necessity.

Software model validation is a relatively new area of study, and one that we believe will make model checking-based verification techniques more broadly applicable. For example, one deterrent to formally verifying software systems is the high overhead required for software maintenance. Any updates to a system invalidate the previous verification efforts. Model validation can help with this problem by detecting whether updates to the software also require updates to the model and, therefore, re-running the verification. In the case that updates to the model are required, a model validation tool could provide hints to the modeler about where in the model the updates are needed. We envision a future in which there exist a wide variety of model validation tools, just as today there exist many model checking tools.

Bibliography

[1] Gartner, “Gartner says worldwide public cloud services market to total $131 billion [press release],” February 2013, http://www.gartner.com/newsroom/id/2352816.

[2] C. Clark, K. Fraser, S. Hand, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield, "Live migration of virtual machines," in Proceedings of the 2nd Symposium on Networked Systems Design & Implementation. USENIX Association, 2005, pp. 273–286.

[3] C. Chen, P. Maniatis, A. Perrig, A. Vasudevan, and V. Sekar, "Towards verifiable resource accounting for outsourced computation," in Proceedings of the 9th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE). ACM, 2013, pp. 167–178.

[4] P. M. Chen and B. D. Noble, "When virtual is better than real [operating system relocation to virtual machines]," in Proceedings of the Eighth Workshop on Hot Topics in Operating Systems. IEEE, 2001, pp. 133–138.

[5] M. Zaharia, S. Katti, C. Grier, V. Paxson, S. Shenker, I. Stoica, and D. Song, "Hypervisors as a foothold for personal computer security: An agenda for the research community," University of California, Berkeley, Tech. Rep., 2012.

[6] P. England, B. Lampson, J. Manferdelli, and B. Willman, “A trusted open platform,” Computer, vol. 36, no. 7, pp. 55–62, 2003.

[7] P. England and J. Manferdelli, “Virtual machines for enterprise desktop security,” Information Security Technical Report, vol. 11, no. 4, pp. 193–202, 2006.

[8] T. Garfinkel, B. Pfaff, J. Chow, M. Rosenblum, and D. Boneh, “Terra: A virtual machine-based platform for trusted computing,” in ACM SIGOPS Operating Systems Review, vol. 37, no. 5. ACM, 2003, pp. 193–206.

[9] A. Vasudevan, B. Parno, N. Qu, V. D. Gligor, and A. Perrig, "Lockdown: Towards a safe and practical architecture for security applications on commodity platforms," in Trust and Trustworthy Computing. Springer, 2012, pp. 34–54.

[10] T. Garfinkel and M. Rosenblum, “A virtual machine introspection based architecture for intrusion detection,” in Proceedings of the 10th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, 2003, pp. 191–206.

[11] K. Kourai and S. Chiba, “HyperSpector: virtual distributed monitoring environments for secure intrusion detection,” in Proceedings of the 1st ACM/USENIX International Conference on Virtual Execution Environments (VEE). ACM, 2005, pp. 197–207.

[12] X. Jiang, X. Wang, and D. Xu, "Stealthy malware detection through VMM-based out-of-the-box semantic view reconstruction," in Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS). ACM, 2007, pp. 128–138.

[13] S. T. Jones, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau, “VMM-based hidden process detection and identification using Lycosid,” in Proceedings of the 2008 ACM International Conference on Virtual Execution Environments (VEE). ACM, 2008, pp. 91–100.

[14] L. Litty, H. A. Lagar-Cavilla, and D. Lie, “Hypervisor support for identifying covertly executing binaries.” in USENIX Security Symposium, 2008, pp. 243–258.

[15] K. Asrigo, L. Litty, and D. Lie, “Using VMM-based sensors to monitor honeypots,” in Proceedings of the 2nd ACM/USENIX International Conference on Virtual Execution Environments (VEE). ACM, 2006, pp. 13–23.

[16] J. Chow, T. Garfinkel, and P. M. Chen, “Decoupling dynamic program analysis from execution in virtual environments,” in Proceedings of the USENIX Annual Technical Conference, 2008, pp. 1–14.

[17] J. R. Crandall, G. Wassermann, D. A. S. Oliveira, Z. Su, S. F. Wu, and F. T. Chong, "Temporal search: Detecting hidden malware timebombs with virtual machines," in Operating Systems Review (OSR). ACM Press, 2006, pp. 25–36.

[18] A. Dinaburg, P. Royal, M. Sharif, and W. Lee, "Ether: malware analysis via hardware virtualization extensions," in Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS). ACM, 2008, pp. 51–62.

[19] G. W. Dunlap, S. T. King, S. Cinar, M. A. Basrai, and P. M. Chen, “ReVirt: Enabling intrusion analysis through virtual-machine logging and replay,” in Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI). ACM, 2002, pp. 211–224.

[20] A. Lanzi, M. I. Sharif, and W. Lee, "K-Tracer: A system for extracting kernel malware behavior," in Proceedings of the 16th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, 2009, pp. 1–16.

[21] D. Quist, L. Liebrock, and J. Neil, "Improving antivirus accuracy with hypervisor assisted analysis," Journal in Computer Virology, vol. 7, no. 2, pp. 121–131, 2011.

[22] R. Riley, X. Jiang, and D. Xu, "Multi-aspect profiling of kernel rootkit behavior," in Proceedings of the 4th ACM European Conference on Computer Systems (EuroSys). ACM, 2009, pp. 47–60.

[23] A. Vasudevan, N. Qu, and A. Perrig, "XTRec: Secure real-time execution trace recording on commodity platforms," in Proceedings of the 44th Hawaii International Conference on System Sciences (HICSS), Jan. 2011, pp. 1–10.

[24] S. T. King, G. W. Dunlap, and P. M. Chen, "Debugging operating systems with time-traveling virtual machines," in Proceedings of the USENIX Annual Technical Conference. USENIX, 2005, pp. 1–15.

[25] A. Fattori, R. Paleari, L. Martignoni, and M. Monga, "Dynamic and transparent analysis of commodity production systems," in Proceedings of the IEEE/ACM International Conference on Automated Software Engineering. IEEE, ACM, 2010, pp. 417–426.

[26] E. Bosman, A. Slowinska, and H. Bos, "Minemu: The world's fastest taint tracker," in Proceedings of the 14th International Symposium on Recent Advances in Intrusion Detection (RAID). Springer, 2011, pp. 1–20.

[27] B. D. Payne, M. Carbone, M. Sharif, and W. Lee, "Lares: An architecture for secure active monitoring using virtualization," in Proceedings of the 2008 IEEE Symposium on Security and Privacy (SP). IEEE, 2008, pp. 233–247.

[28] R. Riley, X. Jiang, and D. Xu, "Guest-transparent prevention of kernel rootkits with VMM-based memory shadowing," in Proceedings of the 11th International Symposium on Recent Advances in Intrusion Detection (RAID). Springer, 2008, pp. 1–20.

[29] A. Seshadri, M. Luk, N. Qu, and A. Perrig, "SecVisor: a tiny hypervisor to provide lifetime kernel code integrity for commodity OSes," in Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP). ACM, 2007, pp. 335–350.

[30] M. I. Sharif, W. Lee, W. Cui, and A. Lanzi, "Secure in-VM monitoring using hardware virtualization," in Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS). ACM, 2009, pp. 477–487.

[31] Z. Wang, X. Jiang, W. Cui, and P. Ning, "Countering kernel rootkits with lightweight hook protection," in Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS). ACM, 2009, pp. 545–554.

[32] X. Xiong, D. Tian, and P. Liu, "Practical protection of kernel integrity for commodity OS from untrusted extensions," in Proceedings of the 18th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, 2011.

[33] X. Chen, T. Garfinkel, E. C. Lewis, P. Subrahmanyam, C. A. Waldspurger, D. Boneh, J. Dwoskin, and D. R. Ports, "Overshadow: a virtualization-based approach to retrofitting protection in commodity operating systems," in Proceedings of the Thirteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). ACM, 2008, pp. 2–13.

[34] J. M. McCune, Y. Li, N. Qu, Z. Zhou, A. Datta, V. Gligor, and A. Perrig, “TrustVisor: Efficient TCB reduction and attestation,” in Proceedings of the 2010 IEEE Symposium on Security and Privacy (SP). IEEE, 2010, pp. 143–158.

[35] R. Ta-Min, L. Litty, and D. Lie, "Splitting interfaces: Making trust between applications and operating systems configurable," in Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI). USENIX, 2006, pp. 279–292.

[36] J. Yang and K. G. Shin, “Using hypervisor to provide data secrecy for user applications on a per-page basis,” in Proceedings of the Fourth ACM International Conference on Virtual Execution Environments (VEE). ACM, 2008, pp. 71–80.

[37] Z. Wang, C. Wu, M. Grace, and X. Jiang, “Isolating commodity hosted hypervisors with HyperLock,” in Proceedings of the 7th ACM European Conference on Computer Systems (EuroSys). ACM, 2012, pp. 127–140.

[38] F. Zhang, J. Chen, H. Chen, and B. Zang, "CloudVisor: Retrofitting protection of virtual machines in multi-tenant cloud with nested virtualization," in Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP). ACM, 2011, pp. 203–216.

[39] Xen Project, "Xen overview," http://wiki.xen.org/wiki/Xen_Overview, [Online; Accessed: 2013-07-20].

[40] L. Martignoni, S. McCamant, P. Poosankam, D. Song, and P. Maniatis, "Path-exploration lifting: Hi-Fi tests for Lo-Fi emulators," in Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). ACM, 2012, pp. 337–348.

[41] L. Martignoni, R. Paleari, G. Roglia, and D. Bruschi, "Testing CPU emulators," in Proceedings of the International Symposium on Software Testing and Analysis (ISSTA). ACM, 2009, pp. 261–272.

[42] National Vulnerability Database, "CVE-2007-4993," http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2007-4993, 2007, [Online; Accessed: 2013-07-20].

[43] K. Kortchinsky, "Cloudburst: Hacking 3D (and breaking out of VMware)," in Blackhat USA, 2009.

[44] National Vulnerability Database, "CVE-2008-0923," http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-0923, [Online; Accessed: 2013-07-20].

[45] VMware, "VMware security advisory vmsa-2009-0015," http://www.vmware.com/security/advisories/VMSA-2009-0015.html, 2009.

[46] R. Sinha, C. Sturton, P. Maniatis, S. A. Seshia, and D. Wagner, “Verification with small and short worlds,” in Proceedings of the IEEE Conference on Formal Methods in Computer-Aided Design (FMCAD). IEEE, 2012, pp. 68–77.

[47] C. Sturton, R. Sinha, T. H. Y. Dang, S. Jain, M. McCoyd, W. Y. Tan, P. Maniatis, S. A. Seshia, and D. Wagner, “Symbolic software model validation,” in Proceedings of the 11th ACM-IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE). ACM, IEEE, 2013.

[48] G. J. Popek and R. P. Goldberg, "Formal requirements for virtualizable third generation architectures," Communications of the ACM, vol. 17, no. 7, pp. 412–421, 1974.

[49] R. P. Goldberg, "Architectural principles for virtual computer systems," Ph.D. dissertation, Harvard University, 1972.

[50] VMware, http://www.vmware.com/.

[51] The Xen Hypervisor, http://www.xen.org/.

[52] Kernel-based Virtual Machine, http://www.linux-kvm.org/page/Main_Page.

[53] VirtualBox, https://www.virtualbox.org/.

[54] D. Mihocka and S. Shwartsman, "Virtualization without direct execution or jitting: Designing a portable virtual machine infrastructure," in The 1st Workshop on Architectural and Microarchitectural Support for Binary Translation, 2008.

[55] QEMU, http://wiki.qemu.org/Main_Page.

[56] E. M. Clarke and E. A. Emerson, "The design and synthesis of synchronization skeletons using temporal logic," in Workshop on Logics of Programs, ser. Springer-Verlag Lecture Notes in Computer Science, 1981, pp. 52–71.

[57] E. A. Emerson and E. M. Clarke, "Using branching time temporal logic to synthesize synchronization skeletons," Science of Computer Programming, vol. 2, no. 3, pp. 241–266, 1982.

[58] E. M. Clarke, E. A. Emerson, and A. P. Sistla, "Automatic verification of finite-state concurrent systems using temporal logic specifications," ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 8, no. 2, pp. 244–263, 1986.

[59] J.-P. Queille and J. Sifakis, "Specification and verification of concurrent systems in CESAR," in International Symposium on Programming, ser. Lecture Notes in Computer Science, vol. 137. Springer-Verlag, 1982, pp. 337–351.

[60] A. Pnueli, "The temporal logic of programs," in Proceedings of the 18th Annual Symposium on Foundations of Computer Science. IEEE, 1977, pp. 46–57.

[61] E. A. Emerson, “The beginning of model checking: A personal perspective,” in 25 Years of Model Checking. Springer, 2008, pp. 27–45.

[62] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, and Y. Zhu, "Bounded model checking," Advances in Computers, vol. 58, pp. 117–148, 2003.

[63] H. Chen, D. Dean, and D. Wagner, “Model checking one million lines of C code.” in Proceedings of the Annual Network and Distributed System Security Symposium (NDSS), 2004, pp. 171–185.

[64] T. Ball, V. Levin, and S. K. Rajamani, “A decade of software model checking with SLAM,” Communications of the ACM, vol. 54, no. 7, pp. 68–76, 2011.

[65] P. Neumann, R. S. Boyer, R. J. Feiertag, K. N. Levitt, and L. Robinson, A provably secure operating system: The system, its applications, and proofs. SRI International, 1980.

[66] R. J. Feiertag and P. G. Neumann, “The foundations of a provably secure operating system (PSOS),” in Proceedings of the National Computer Conference. AFIPS Press, 1979, pp. 329–334.

[67] B. J. Walker, R. A. Kemmerer, and G. J. Popek, “Specification and verification of the UCLA Unix security kernel,” Communications of the ACM, vol. 23, no. 2, pp. 118–131, 1980.

[68] W. R. Bevier, “Kit and the short stack,” Journal of Automated Reasoning, vol. 5, no. 4, pp. 519–530, 1989.

[69] ——, "Kit: A study in operating system verification," IEEE Transactions on Software Engineering, vol. 15, no. 11, pp. 1382–1396, 1989.

[70] J. M. Rushby, “Design and verification of secure systems,” in ACM SIGOPS Operating Systems Review, vol. 15, no. 5. ACM, 1981, pp. 12–21.

[71] D. Greve, M. Wilding, and W. M. Vanfleet, "A separation kernel formal security policy," in Proceedings of the Fourth International Workshop on the ACL2 Prover and Its Applications (ACL2), 2003.

[72] C. Baumann, T. Bormer, H. Blasum, and S. Tverdyshev, "Proving memory separation in a microkernel by code level verification," in Proceedings of the 14th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing Workshops (ISORCW). IEEE, 2011, pp. 25–32.

[73] R. J. Richards, "Modeling and security analysis of a commercial real-time operating system kernel," in Design and Verification of Microprocessor Systems for High-Assurance Applications. Springer, 2010, pp. 301–322.

[74] W. Martin, P. White, F. Taylor, and A. Goldberg, "Formal construction of the mathematically analyzed separation kernel," in Proceedings of the Fifteenth IEEE International Conference on Automated Software Engineering (ASE). IEEE, 2000, pp. 133–141.

[75] C. L. Heitmeyer, M. Archer, E. I. Leonard, and J. McLean, "Formal specification and verification of data separation in a separation kernel for an embedded system," in Proceedings of the 13th ACM Conference on Computer and Communications Security (CCS). ACM, 2006, pp. 346–355.

[76] C. L. Heitmeyer, M. M. Archer, E. I. Leonard, and J. D. McLean, "Applying formal methods to a certifiably secure software system," IEEE Transactions on Software Engineering, vol. 34, no. 1, pp. 82–98, 2008.

[77] G. Klein, K. Elphinstone, G. Heiser, J. Andronick, D. Cock, P. Derrin, D. Elkaduwe, K. Engelhardt, R. Kolanski, M. Norrish, T. Sewell, H. Tuch, and S. Winwood, "seL4: Formal verification of an OS kernel," in Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP). ACM, Oct. 2009, pp. 207–220.

[78] T. Murray, D. Matichuk, M. Brassil, P. Gammie, T. Bourke, S. Seefried, C. Lewis, X. Gao, and G. Klein, "seL4: from general purpose to a proof of information flow enforcement," in Proceedings of the 34th IEEE Symposium on Security and Privacy (SP), 2013.

[79] G. Barthe, G. Betarte, J. D. Campo, and C. Luna, "Formally verifying isolation and availability in an idealized model of virtualization," in FM 2011: Formal Methods. Springer, 2011, pp. 231–245.

[80] The Coq Proof Assistant, http://coq.inria.fr/.

[81] J. McDermott and L. Freitas, "A formal security policy for Xenon," in Proceedings of the 6th ACM Workshop on Formal Methods in Security Engineering. ACM, 2008, pp. 43–52.

[82] L. Freitas and J. McDermott, "Formal methods for security in the Xenon hypervisor," International Journal on Software Tools for Technology Transfer, vol. 13, no. 5, pp. 463–489, 2011.

[83] J. McDermott, J. Kirby, B. Montrose, T. Johnson, and M. Kang, “Re-engineering Xen internals for higher-assurance security,” Information Security Technical Report, vol. 13, no. 1, pp. 17–24, 2008.

[84] D. Leinenbach and T. Santen, “Verifying the Microsoft Hyper-V hypervisor with VCC,” in FM 2009: Formal Methods. Springer, 2009, pp. 806–809.

[85] Hyper-V, http://www.microsoft.com/en-us/server-cloud/hyper-v-server/default.aspx.

[86] E. Cohen, M. Dahlweid, M. Hillebrand, D. Leinenbach, M. Moskal, T. Santen, W. Schulte, and S. Tobies, “VCC: A practical system for verifying concurrent C,” in Theorem Proving in Higher Order Logics. Springer, 2009, pp. 23–42.

[87] E. Alkassar, M. A. Hillebrand, W. Paul, and E. Petrova, “Automated verification of a small hypervisor,” in Verified Software: Theories, Tools, Experiments. Springer, 2010, pp. 40–54.

[88] E. Alkassar, E. Cohen, M. A. Hillebrand, M. Kovalev, and W. J. Paul, “Verifying shadow page table algorithms,” in Proceedings of the IEEE Conference on Formal Methods in Computer-Aided Design (FMCAD). IEEE, 2010, pp. 267–270.

[89] U. Steinberg and B. Kauer, "NOVA: a microhypervisor-based secure virtualization architecture," in Proceedings of the 5th ACM European Conference on Computer Systems (EuroSys). ACM, 2010, pp. 209–222.

[90] H. Tews, M. Völp, and T. Weber, "Formal memory models for the verification of low-level operating-system code," Journal of Automated Reasoning, vol. 42, no. 2–4, pp. 189–227, 2009.

[91] H. Tews, T. Weber, and M. Völp, "A formal model of memory peculiarities for the verification of low-level operating-system code," in Proceedings of the 3rd International Workshop on Systems Software Verification (SSV). Elsevier, 2008, pp. 79–96.

[92] H. Tews, T. Weber, M. Völp, E. Poll, M. v. Eekelen, and P. v. Rossum, "Nova micro-hypervisor verification," Radboud University Nijmegen, Tech. Rep. ICIS–R08012, May 2008.

[93] A. Vasudevan, S. Chaki, L. Jia, J. M. McCune, J. Newsome, and A. Datta, “Design, implementation and verification of an extensible and modular hypervisor framework,” in Proceedings of the 2013 IEEE Symposium on Security and Privacy (SP). IEEE, 2013.

[94] AMD, "Secure virtual machine architecture reference manual," pp. 1–124, 2005, [Online; Accessed: 2013-06-28].

[95] AMD, "AMD-V nested paging, white paper," 2008, accessed June 28, 2013. [Online]. Available: http://sites.amd.com/us/business/it-solutions/virtualization/Pages/amd-v.aspx

[96] M. Gillespie, "Best practices for paravirtualization enhancements from Intel virtualization technology: EPT and VT-d," 2009, accessed June 28, 2013. [Online]. Available: http://software.intel.com/en-us/articles/best-practices-for-paravirtualization-enhancements-from-intel-virtualization-technology-ept-and-vt-d

[97] AMD, "AMD I/O virtualization technology (IOMMU) specification," 2009, accessed June 28, 2013. [Online]. Available: http://support.amd.com/us/Embedded_TechDocs/34434-IOMMU-Rev_1.26_2-11-09.pdf

[98] D. Abramson, J. Jackson, S. Muthrasanallur, G. Neiger, G. Regnier, R. Sankaran, I. Schoinas, R. Uhlig, B. Vembu, and J. Wiegert, “Intel virtualization technology for directed I/O,” Intel Technology Journal, vol. 10, no. 3, pp. 179–192, 2006.

[99] J. Greene, "Intel trusted execution technology: Hardware-based technology for enhancing server platform security, white paper," Intel, accessed June 28, 2013. [Online]. Available: http://www.intel.com/content/www/us/en/trusted-execution-technology/trusted-execution-technology-security-paper.html

[100] T. Ormandy, "An empirical study into the security exposure to hosts of hostile virtualized environments," in CanSecWest, 2007.

[101] R. Paleari, L. Martignoni, G. Roglia, and D. Bruschi, "N-version disassembly: Differential testing of x86 disassemblers," in Proceedings of the International Symposium on Software Testing and Analysis, 2010.

[102] J. Franklin, S. Chaki, A. Datta, J. M. McCune, and A. Vasudevan, "Parametric verification of address space separation," in Principles of Security and Trust. Springer, 2012, pp. 51–68.

[103] J. Franklin, S. Chaki, A. Datta, and A. Seshadri, “Scalable parametric verification of secure systems: How to verify reference monitors without worrying about data structure size,” in Proceedings of the 2010 IEEE Symposium on Security and Privacy (SP). IEEE, 2010, pp. 365–379.

[104] K. S. Namjoshi, “Symmetry and completeness in the analysis of parameterized systems,” in Verification, Model Checking, and Abstract Interpretation (VMCAI). Springer, 2007, pp. 299–313.

[105] S. K. Lahiri and R. E. Bryant, "Predicate abstraction with indexed predicates," ACM Transactions on Computational Logic (TOCL), vol. 9, no. 1, 2007.

[106] P. Bjesse, "Word-level sequential memory abstraction for model checking," in Proceedings of the IEEE Conference on Formal Methods in Computer-Aided Design (FMCAD), 2008, pp. 1–9.

[107] S. German, “A theory of abstraction for arrays,” in Proceedings of the IEEE Conference on Formal Methods in Computer-Aided Design (FMCAD), 2011, pp. 299–313.

[108] R. E. Bryant, S. K. Lahiri, and S. A. Seshia, “Modeling and verifying systems using a logic of counter arithmetic with lambda expressions and uninterpreted functions,” in Proceedings of the 14th International Conference on Computer Aided Verification (CAV). Springer-Verlag, 2002, pp. 78–92.

[109] A. J. Isles, R. Hojati, and R. K. Brayton, “Computing reachable control states of systems modeled with uninterpreted functions and infinite memory,” in Proceedings of the International Conference on Computer Aided Verification (CAV), ser. LNCS 1427, A. J. Hu and M. Y. Vardi, Eds. Springer-Verlag, June 1998, pp. 256–267.

[110] R. Kurshan, "Automata-theoretic verification of coordinating processes," in Proceedings of the 11th International Conference on Analysis and Optimization of Systems – Discrete Event Systems, G. Cohen and J.-P. Quadrat, Eds. Springer Berlin / Heidelberg, 1994, vol. 199, pp. 16–28.

[111] R. E. Bryant, S. K. Lahiri, and S. A. Seshia, "Convergence testing in term-level bounded model checking," in Proceedings of the 2003 Advanced Research Working Conference, Correct Hardware Design and Verification Methods (CHARME). Springer-Verlag, 2003, pp. 348–362.

[112] H. Shacham, “The Geometry of Innocent Flesh on the Bone: Return-Into-Libc Without Function Calls (on the x86),” in Proceedings of the ACM Conference on Computer and Communications Security (CCS). ACM, 2007, pp. 552–561.

[113] UCLID, http://uclid.eecs.berkeley.edu/wiki/index.php/Main_Page.

[114] Plingeling SAT Solver, http://fmv.jku.at/lingeling.

[115] T. Arons, A. Pnueli, S. Ruah, J. Xu, and L. D. Zuck, “Parameterized verification with automatically computed inductive assertions,” in Proceedings of the International Conference on Computer Aided Verification (CAV), 2001, pp. 221–234.

[116] K. L. McMillan and L. D. Zuck, “Invisible invariants and abstract interpretation,” in Proceedings of the 18th International Static Analysis Symposium, 2011, pp. 249–262.

[117] K. L. McMillan, "Verification of infinite state systems by compositional model checking," in Correct Hardware Design and Verification Methods (CHARME), 1999, pp. 219–234.

[118] M. N. Velev, R. E. Bryant, and A. Jain, "Efficient modeling of memory arrays in symbolic simulation," in Proceedings of the International Conference on Computer Aided Verification (CAV), 1997, pp. 388–399.

[119] M. K. Ganai, A. Gupta, and P. Ashar, "Efficient modeling of embedded memories in bounded model checking," in Proceedings of the International Conference on Computer Aided Verification (CAV), ser. Lecture Notes in Computer Science, vol. 3114, 2004, pp. 440–452.

[120] C. Barrett, R. Sebastiani, S. A. Seshia, and C. Tinelli, "Satisfiability modulo theories," in Handbook of Satisfiability, A. Biere, H. van Maaren, and T. Walsh, Eds. IOS Press, 2009, vol. 4, ch. 8.

[121] J. C. King, "Symbolic execution and program testing," Communications of the ACM, vol. 19, no. 7, pp. 385–394, 1976.

[122] C. Cadar, D. Dunbar, and D. Engler, "KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs," in Proceedings of the 8th Symposium on Operating Systems Design and Implementation (OSDI). USENIX Association, 2008, pp. 209–224.

[123] E. M. Clarke, O. Grumberg, and D. A. Peled, Model Checking. MIT Press, 2000.

[124] F. Tip, "A survey of program slicing techniques," Journal of Programming Languages, vol. 3, no. 3, pp. 121–189, 1995.

[125] LLVM, http://llvm.org/.

[126] W. H. Harman, "TCAS – a system for preventing midair collisions," The Lincoln Laboratory Journal, vol. 2, pp. 437–458, 1989.

[127] J. Kuchar and A. C. Drumm, "The traffic alert and collision avoidance system," Lincoln Laboratory Journal, vol. 16, no. 2, p. 277, 2007.

[128] T. Ostrand, "Tcas," Software-artifact Infrastructure Repository, accessed May 2013. [Online]. Available: http://sir.unl.edu/portal/bios/tcas.php

[129] M. Hutchins, H. Foster, T. Goradia, and T. Ostrand, "Experiments on the effectiveness of dataflow- and control flow-based test adequacy criteria," in Proceedings of the 16th International Conference on Software Engineering, 1994, pp. 191–200.

[130] R. Wahbe, S. Lucco, T. E. Anderson, and S. L. Graham, "Efficient software-based fault isolation," in Proceedings of the 14th ACM Symposium on Operating Systems Principles (SOSP). ACM, 1993, pp. 203–216.

[131] S. McCamant and G. Morrisett, "Evaluating SFI for a CISC architecture," in Proceedings of the 15th USENIX Security Symposium, 2006.

[132] B. Yee, D. Sehr, G. Dardyk, J. B. Chen, R. Muth, T. Ormandy, S. Okasaka, N. Narula, and N. Fullagar, “Native Client: a sandbox for portable, untrusted x86 native code,” Communications of the ACM, pp. 91–99, 2010.

[133] B. Yee, D. Sehr, G. Dardyk, B. Chen, R. Muth, T. Ormandy, S. Okasaka, N. Narula, and N. Fullagar, “Native Client: A sandbox for portable, untrusted x86 native code,” in Proceedings of the IEEE Symposium on Security and Privacy (SP), 2009.

[134] E. Clarke and D. Kroening, "Hardware verification using ANSI-C programs as a reference," in Proceedings of the 2003 Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE Computer Society Press, 2003, pp. 308–311.

[135] X. Feng and A. J. Hu, "Early cutpoint insertion for high-level software vs. RTL formal combinational equivalence verification," in Proceedings of the Design Automation Conference (DAC). ACM, 2006, pp. 1063–1068.

[136] A. J. Hu, "High-level vs. RTL combinational equivalence: An introduction," in Proceedings of the 25th IEEE International Conference on Computer Design. IEEE, 2007, pp. 274–279.

[137] A. Koelbl, R. Jacoby, H. Jain, and C. Pixley, “Solver technology for system-level to RTL equivalence checking,” in Proceedings of the 2009 Design, Automation and Test in Europe (DATE), 2009, pp. 196–201.

[138] A. Koelbl, J. R. Burch, and C. Pixley, “Memory modeling in ESL–RTL equivalence checking,” in Proceedings of the Design Automation Conference (DAC). IEEE, 2007, pp. 205–209.

[139] A. Koelbl and C. Pixley, "Constructing efficient formal models from high-level descriptions using symbolic simulation," International Journal of Parallel Programming, vol. 33, no. 6, pp. 645–666, 2005.

[140] M. Hind and A. Pioli, “Which pointer analysis should I use?” in ACM SIGSOFT Software Engineering Notes, vol. 25, no. 5. ACM, 2000, pp. 113–123.

[141] B. Hardekopf and C. Lin, “Flow-sensitive pointer analysis for millions of lines of code,” in Proceedings of the International Symposium on Code Generation and Optimization (CGO). IEEE, 2011, pp. 289–298.

[142] J. Caballero, S. McCamant, A. Barth, and D. Song, "Extracting models of security-sensitive operations using string-enhanced white-box exploration on binaries," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2009-36, Mar. 2009.

[143] D. A. Ramos and D. R. Engler, “Practical, low-effort equivalence verification of real code,” in Proceedings of the International Conference on Computer Aided Verification (CAV). Springer, 2011, pp. 669–685.

[144] S. F. Siegel, A. Mironova, G. S. Avrunin, and L. A. Clarke, “Using model checking with symbolic execution to verify parallel numerical programs,” in Proceedings of the 2006 International Symposium on Software Testing and Analysis (ISSTA). ACM, 2006, pp. 157–168.

[145] ——, “Combining symbolic execution with model checking to verify parallel numerical programs,” ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 17, no. 2, p. 10, 2008.

[146] S. Person, M. B. Dwyer, S. Elbaum, and C. S. Păsăreanu, "Differential symbolic execution," in Proceedings of the 16th ACM SIGSOFT International Symposium on the Foundations of Software Engineering. ACM, 2008, pp. 226–237.