Deker: Decomposing Commodity Kernels for Verification
TWC: Small: Deker: Decomposing Commodity Kernels for Verification

Overview. Despite the many ways in which the use of computer systems has evolved over the last decades, the software engineering technology behind the very core of the systems stack, the operating system kernel, remains unchanged since early computer systems. Starting as a relatively simple software layer aimed at providing isolation and multiplexing of hardware, modern kernels combine dozens of complex subsystems and consist of tens of millions of lines of code. Today, OS kernels are part of many mission-critical systems. We trust these systems not only to run correctly in the face of thousands of development commits and massive re-engineering efforts, but also to withstand targeted security attacks. Unfortunately, modern kernels are still developed with legacy software engineering techniques: an unsafe programming language, low-level concurrency primitives, and virtually no testing or verification tools. Today these systems are faulty and vulnerable. Even worse, the complexity and size of modern kernels as they exist today will likely keep them beyond the reach of testing, static analysis, and software verification tools for years to come.

Proposed research. The PIs propose creating Deker, a framework for decomposing and verifying commodity operating system kernels. Deker turns a de-facto standard commodity operating system kernel into a collection of strongly isolated subsystems suitable for verification. Despite multiple decades of evolution and improvement in software verification tools, almost none of them have made their way into regular industry practice. Deker aims to amend this with a holistic approach unifying modular redesign of legacy components with customized verification techniques. While decomposing the kernel and providing complete isolation of subsystems, Deker remains practical: it retains source-level compatibility with the non-decomposed kernel, enables incremental adoption, and remains fast. As the main glue connecting the decomposition and verification efforts, a rigorous interface definition language (IDL) is proposed for specifying the protocols that govern decomposed subsystems. Explicit protocol specification enables easier verification and maintenance, while the accompanying IDL compiler facilitates automatic generation of the appropriate stubs, which are correct by construction, thereby justifying the manual programmer effort that goes into writing IDL descriptions.

Intellectual merit. The first contribution of this work is a set of techniques, principles, and tools enabling decomposition of a fully-featured operating system kernel in a practical manner. Deker develops patterns of decomposition as a set of recipes for decomposing legacy components. Deker relies on a powerful IDL to generate the glue code enabling transparent function invocation and object synchronization across share-nothing subsystems. Deker's IDL defines disciplines for synchronizing object hierarchies and invoking isolated subsystems. The second main contribution is a custom verification framework that builds on top of Deker's decomposed environment. The framework seamlessly integrates with the rest of Deker through IDL descriptions of subsystem interfaces, which are leveraged to extract the environment models needed for modular verification and the properties ensuring correct behavior of subsystems. The verification framework embeds tailored algorithms for efficient handling of decomposed subsystems.
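To make the role of the IDL concrete, the sketch below illustrates what an interface description for a small fragment of a device interface and the corresponding generated caller-side stub might look like. The IDL syntax, the type names, and the deker_* helpers are illustrative assumptions introduced for exposition only, not Deker's actual language or runtime interface.

/*
 * Hypothetical IDL fragment for an isolated NIC driver (illustrative syntax):
 *
 *   interface nic_ops {
 *     projection nic_device {            // fields the isolated driver may see
 *       unsigned int  mtu;
 *       unsigned char mac[6];
 *     }
 *     int open(projection nic_device *dev);
 *   }
 */
#include <string.h>

/* Kernel-side view of the device (simplified, hypothetical). */
struct nic_device {
    unsigned int  mtu;
    unsigned char mac[6];
    /* ... many more fields the isolated driver never sees ... */
};

/* Wire format for the projected fields (would be emitted by the IDL compiler). */
struct nic_device_msg {
    unsigned int  mtu;
    unsigned char mac[6];
};

/* Assumed low-level channel primitive provided by the Deker runtime. */
int deker_invoke(int op, void *msg, unsigned long len);
#define NIC_OP_OPEN 1

/* Generated caller-side stub: synchronizes only the projected fields and
 * crosses the share-nothing boundary with an explicit message. */
int nic_open_stub(struct nic_device *dev)
{
    struct nic_device_msg msg;

    msg.mtu = dev->mtu;
    memcpy(msg.mac, dev->mac, sizeof(msg.mac));

    return deker_invoke(NIC_OP_OPEN, &msg, sizeof(msg));
}

The point of the sketch is the division of labor it suggests: the programmer states which fields may cross the isolation boundary and under which protocol, while the compiler emits the marshaling and invocation code, which is correct by construction.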
Broader impacts. The PIs expect that the proposed work will provide a foundation for mitigating the vast economic damage caused by programming errors and security vulnerabilities in modern OS kernels. By decomposing and verifying an unmodified, commodity OS kernel, Deker builds a practical foundation for verifiable systems. Many kinds of software faults, security attacks, malware botnets, and related activities will be largely eliminated. Deker will be implemented as part of the de-facto research and industry standard Linux operating system, and will be open source, directly benefiting the broader community. Finally, and critically for advancing science, diversity of research ideas is only possible through diversity of their creators. This work will support students from groups traditionally underrepresented in the security and verification communities. The PIs expect a female MS student to be a lead research contributor on Deker's language mechanisms.

Contents

1 Introduction
1.1 Deker: Verification through Decomposition
2 Threat Model
3 Background and Related Work
3.1 Vulnerabilities in OS Kernels
3.2 Kernel Decomposition
3.3 Kernel Verification
4 Preliminary Work
5 Detailed Research Plan
5.1 Task 1: Getting Decomposed Subsystems Up and Running
5.2 Task 2: Running SMACK on Representative Subsystems
5.3 Task 3: Decomposition Patterns
5.4 Task 4: Language Support for Decomposition and Verification
5.5 Task 5: Support for Efficient Decomposed Environments
5.6 Task 6: Tailored Verification Algorithms
6 Team
7 Timeline and Management Plan
8 Broader Impacts of the Proposed Work
9 Results from Prior NSF Support

1 Introduction

An operating system (OS) kernel is the single most critical part of the systems stack. The OS kernel ensures isolation, security, and access control for multiple mistrusting workloads and users. In a modern system, an attacker is one kernel vulnerability away from gaining control over the entire machine. A successful kernel attack provides the ability to make the threat persistent in the face of reboots, conceal it from the user and anti-virus security tools, establish a platform for compromising local applications, collect sensitive financial information and user credentials, mount attacks on network hosts, and establish distributed, peer-to-peer command and control infrastructure.

Modern kernels are notoriously complex. Typical kernel code routinely employs manual management of low-level concurrency primitives, handles millions of object allocations and deallocations per second, implements numerous security and access control checks, and adheres to multiple conventions governing allocation, locking, and synchronization of kernel data structures in nearly every kernel function. Despite the rapid evolution of computer systems over the last four decades, modern OS kernels rely on software development technology that has not changed since early computer systems. Due to the rapid development rate (the de-facto industry standard Linux kernel features over 50 thousand commits a year) and a huge codebase (as of 2014, the latest version of the Linux kernel contains over 12 million lines of C/C++ and assembly code), bugs and vulnerabilities are routinely introduced into the kernel code. In 2014, the Common Vulnerabilities and Exposures database lists 129 Linux kernel vulnerabilities that allow for privilege escalation, denial-of-service, and other exploits. This number has been consistent across several years [21, 84].
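To illustrate the kind of conventions referred to above, the fragment below sketches a generic locking and reference-counting discipline of the sort that nearly every kernel object update must follow by hand. It is a simplified stand-in written against POSIX primitives rather than code from any particular Linux subsystem.

/* Generic illustration (not actual Linux code) of the implicit obligations a
 * single kernel object update must respect: take the right lock, check
 * liveness, adjust the reference count, and tear down in the right order. */
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct kobject_like {
    pthread_mutex_t lock;      /* protects refcount and dead */
    int             refcount;
    bool            dead;      /* set once teardown has begun */
};

/* Convention (enforced nowhere): callers may invoke object_get() only while
 * the object is known live, e.g., via an existing reference. Returns true if
 * the caller now holds a reference. */
bool object_get(struct kobject_like *obj)
{
    bool ok = false;

    pthread_mutex_lock(&obj->lock);
    if (!obj->dead) {
        obj->refcount++;
        ok = true;
    }
    pthread_mutex_unlock(&obj->lock);
    return ok;
}

void object_put(struct kobject_like *obj)
{
    bool last;

    pthread_mutex_lock(&obj->lock);
    last = (--obj->refcount == 0);
    if (last)
        obj->dead = true;      /* no new references may be taken */
    pthread_mutex_unlock(&obj->lock);

    if (last) {                /* last reference frees the object */
        pthread_mutex_destroy(&obj->lock);
        free(obj);
    }
}

None of these obligations (take the lock before touching the count, never resurrect a dying object, free only on the last put, call get only on a live object) is captured in the type system or checked by standard tooling; the discipline exists purely by convention, which is precisely what makes such code hard to test and verify.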
Being ubiquitous, modern OS kernels are primary targets for security attacks. They provide an industry-standard execution environment for nearly every consumer and enterprise device: home entertainment systems, routers, embedded devices, mobile, laptop, tablet, and desktop computers, enterprise workspaces, and data center infrastructure. Today, OS kernels are a de-facto part of many mission-critical systems ranging from embedded medical devices [99, 146] to industrial control systems [96, 117]. Attackers routinely employ sophisticated vulnerability discovery tools like black-box fuzzers [5, 54, 110] and vulnerability scanners [105, 108, 126]. Without support from verification tools, industry-standard OS kernels leave nearly every computer system on the planet vulnerable.

Despite numerous advances in software verification, static analysis, and testing tools, the software verification community has admittedly largely failed to address the needs of an average industry-grade OS kernel developer. Apart from static analysis tools that perform only very shallow code analysis for simple classes of bugs (e.g., Coverity [25]), it is telling that by and large none of the more precise and powerful verification tools have made inroads into OS industry practice; for example, to the best of our knowledge, none are regularly used in the Linux kernel development process. While the scalability of software verifiers has improved by orders of magnitude in the last decade (e.g., SAGE [54]), existing legacy monolithic kernels are still beyond their reach due to their great complexity in terms of size, number of components, elaborate interactions, and hardware dependencies. There have been approaches targeting particular subsystems in isolation (e.g., device drivers [6, 10, 67, 85, 112, 147]), but those require manually writing extensive environment specifications in a formalism typically understood only by verification experts. Such specifications are completely disjoint from the actual source code they model and are hard to maintain, which means they quickly fall out of sync as the code evolves and become largely obsolete. Multiple projects attempt to re-implement kernel functionality from scratch in a safer, verification-friendly language [22, 57, 64, 86, 149]. Although promising, these approaches are still far from being applicable in a realistic deployment. Modern kernels accumulate several decades of development effort that result in irreplaceable functionality: hundreds of device drivers, dozens of network protocols, block storage stacks,