A Practical, Lightweight, and Flexible Confinement Framework in Ebpf

A Practical, Lightweight, and Flexible Confinement Framework in eBPF by William P. Findlay A thesis submitted to the Faculty of Graduate and Postdoctoral Affairs in partial fulfillment of the requirements for the degree of Master of Computer Science August, 2021 Carleton University Ottawa, Ontario © 2021 William P. Findlay To my parents and grandparents, for believing in me even when I didn't believe in myself. i Abstract Confining operating system processes is essential for preserving least-privilege access to system resources, hardening the system against successful exploitation by malicious actors. Classically, confinement on Linux has been accomplished through a variety of disparate confinement primitives, each targeting a different aspect of process behaviour and each with its own set of policy semantics. This has led to difficulties in realizing practical confinement goals due to the complexities, inter-dependence relationships, and semantic gaps that arise from recombining existing confinement primitives in unintended ways. Linux containers are a particularly poignant example of this phenomenon, with existing container security policies often being overly complex and overly permissive in practice. To better isolate user processes and achieve practical confinement goals, we argue that novel confinement mechanisms are needed to bridge the semantic gap between security policy and enforcement. We hypothesize that a new Linux kernel technology, eBPF, enables the creation of precisely such a confinement mechanism. eBPF programs can be dynamically loaded into the kernel by a privileged process and are checked for safety be- fore they run in kernelspace. This approach affords an opportunity to create an adoptable, container-specific confinement mechanism without tying the kernel down to one specific implementation. Further, an eBPF-based confinement solution can be loaded and unloaded at runtime, without updating or even restarting the operating system kernel; this prop- ii erty enables rapid prototyping and debugging, similar in spirit to how we debug userspace applications in practice. In this thesis, we present the design and implementation of two novel confinement so- lutions based on eBPF, BPFBox and its successor, BPFContain. We discuss issues in the Linux confinement space that motivated the creation of BPFBox and BPFContain, discuss policy examples, and present the results of a performance evaluation and informal security analysis. Results from this research indicate that BPFBox and BPFContain incur modest overhead despite their increased flexibility over existing Linux security solu- tions. We also find that there may be significant opportunities to improve BPFBox and BPFContain and to introduce future security mechanisms based on eBPF. iii Acknowledgements I would first and foremost like to thank my thesis supervisors, Dr. Anil Somayaji and Dr. David Barrera for their constant support, sage advice, and invaluable feedback (particularly on early drafts of this document). I feel confident in saying that I would not have reached this point in my graduate school career if not for their dedication and encouragement. I am also grateful to my other committee members, Dr. Lianying (Viau) Zhao, Dr. Paula Branco, and Dr. Frank Dehne for taking the time to read and evaluate my work. I would also like to thank the professors and fellow members of the CCSL/CISL sister labs for their valuable feedback on early iterations of my work, and for providing a stimulating environment to learn, grow, and foster my passion for operating system security. I am indebted to the innumerable members of the BPF and Linux kernel development community, whose hard work and dedication to free and open-source software are reflected in the very foundations of this research. In particular, I would like to acknowledge Alexei Starovoitov and Daniel Borkmann for creating eBPF, Andrii Nakryiko for his work on libbpf and CO-RE, and K.P. Singh for his work on bringing LSM hook support to BPF. Many other members of the BPF community have proved invaluable sources of inspiration and guidance throughout my academic career. While they are too many to name here, I appreciate them all the same. Lastly, I would like to thank my friends and family for their continued and unwavering support throughout this endeavour (and for many more endeavours to come). iv Prior Publication A publication and pre-print have arisen as a direct result of the research in this thesis. While these works represent joint contributions of all authors, any sections reproduced in this thesis represent the sole work of the thesis author, with editorial and positioning contributions by co-authors. Each work is listed below. Chapter4 contains text and ideas from our paper \ BPFBox: Simple Precise Process Confinement in eBPF (Extended BPF)" [58], co-authored with Anil Somayaji and David Barrera, and published at the Cloud Computer Security Workshop (CCSW) 2020 as part of the ACM CCS conference. Chapter5 contains some text and ideas from our paper \ BPFContain: Fixing the Soft Underbelly of Container Security" [57], co-authored with David Barrera and Anil Somayaji. An early draft of this work is available on the arXiv pre-print database, although it differs substantially from the version presented in this thesis. v Contents Abstract ii Acknowledgements iv Prior Publicationv List of Figuresx List of Tables xi List of Code Listings xii 1. Introduction1 1.1. Research Questions............................4 1.2. Motivation.................................4 1.2.1. Contextualizing the Problem...................4 1.2.2. Why Design a New Confinement Framework?.........7 1.2.3. Why eBPF?............................8 1.3. Contributions............................... 10 1.4. Outline................................... 11 2. Background and Related Work 13 2.1. Confinement in Operating Systems................... 14 2.2. Classic Unix Process Security Model.................. 15 2.2.1. The Reference Monitor...................... 16 2.2.2. Virtual Memory and Memory Protection............ 18 2.2.3. Discretionary Access Control................... 20 vi Contents 2.3. Extensions to the Unix Security Model................. 29 2.3.1. POSIX Capabilities........................ 29 2.3.2. Mandatory Access Control.................... 31 2.3.3. System Call Filtering and Capabilities............. 37 2.3.4. Taint Tracking.......................... 44 2.4. Process-Level Virtualization....................... 45 2.5. Containers and Virtual Machines.................... 49 2.5.1. Container Security........................ 50 2.6. Extended BPF.............................. 59 2.6.1. Origins of BPF: Efficient Packet Filtering and Beyond.... 59 2.6.2. eBPF Programs.......................... 64 2.6.3. eBPF Maps............................ 70 2.6.4. Userspace Front Ends....................... 71 2.6.5. Comparing eBPF with Loadable Kernel Modules....... 73 3. The Confinement Problem 77 3.1. Rethinking the Virtualization Narrative................. 78 3.2. Fundamental Issues with Linux Confinement.............. 81 3.3. How Containers Apply Confinement Primitives............. 86 3.4. Design Goals............................... 89 3.5. Why Two Implementations?....................... 92 3.6. The BPFBox and BPFContain Threat Model........... 92 3.6.1. Differences Between BPFBox and BPFContain ....... 93 3.6.2. The Adversary and Attack Vectors............... 93 3.7. Summary................................. 94 4. BPFBox: A Prototype Process Confinement Mechanism 96 4.1. BPFBox Overview............................ 97 4.1.1. Policy Enforcement at a High Level............... 98 4.2. BPFBox Implementation........................ 100 4.2.1. Architectural Overview...................... 100 4.2.2. BPFBox Policy Enforcement.................. 102 4.2.3. Managing Process State..................... 106 4.2.4. Context-Aware Policy...................... 108 4.2.5. Collecting and Logging Audit Data............... 110 4.3. BPFBox Policy Language........................ 111 4.3.1. Filesystem Rules......................... 111 4.3.2. Network Rules.......................... 114 4.3.3. Signal Rules............................ 115 vii Contents 4.3.4. Ptrace Rules............................ 116 4.3.5. Allow, Taint, and Audit Decorators............... 116 4.3.6. Func and Kfunc Decorators................... 117 4.4. State of the BPFBox Implementation................. 118 4.5. Summary................................. 118 5. BPFContain: Extending BPFBox to Model Containers 120 5.1. BPFBox's Limitations and the Transition Toward BPFContain .. 121 5.1.1. Motivating BPFContain .................... 123 5.2. BPFContain Overview......................... 125 5.2.1. Policy Enforcement at a High Level............... 126 5.3. BPFContain Implementation..................... 129 5.3.1. Architectural Overview...................... 129 5.3.2. Policy Deserialization and Loading............... 131 5.3.3. Policy Enforcement........................ 134 5.3.4. Default Policy........................... 137 5.3.5. Managing Container State.................... 141 5.3.6. Collecting and Logging Audit Data............... 143 5.4. BPFContain Policy Language..................... 143 5.4.1. File and Filesystem Rules.................... 145 5.4.2. Device Rules........................... 147 5.4.3. Network Rules.......................... 148 5.4.4. IPC Rules............................. 149 5.4.5. Capability Rules......................... 150 5.5. Improvements Over BPFBox ...................... 151 5.5.1. Minimizing Runtime Dependencies..............

Load more