Attack Surface Metrics and Automated Compile-Time OS Kernel Tailoring
Total Page:16
File Type:pdf, Size:1020Kb
Attack Surface Metrics and Automated Compile-Time OS Kernel Tailoring Anil Kurmus1, Reinhard Tartler2, Daniela Dorneanu1, Bernhard Heinloth2, Valentin Rothberg2, Andreas Ruprecht2, Wolfgang Schroder-Preikschat¨ 2, Daniel Lohmann2, and Rudiger¨ Kapitza3 1IBM Research - Zurich 2Friedrich-Alexander University Erlangen-Nurnberg¨ 3TU Braunschweig Abstract tem administrators who rely on a distribution-maintained kernel for the daily operation of their systems. On the The economy of mechanism security principle states that Linux distributor side, kernel maintainers can make only program design should be kept as small and simple as possi- very few assumptions on the kernel configuration for their ble. In practice, this principle is often disregarded to max- users: Without a specific use case, the only option is to en- imize user satisfaction, resulting in systems supporting a able every available configuration option to maximize the vast number of features by default, which in turn offers at- functionality. The ever-growing kernel code size, caused by tackers a large code base to exploit. The Linux kernel ex- the addition of new features, such as drivers and file sys- emplifies this problem: distributors include a large number tems, at an increasing pace, indicates that the Linux kernel of features, such as support for exotic filesystems and socket will be subject to ever more vulnerabilities. types, and attackers often take advantage of those. In addition, as a consequence of the development, test- A simple approach to produce a smaller kernel is to man- ing, and patching process of large software projects, the less ually configure a tailored Linux kernel. However, the more a functionality is used, the more likely it is to contain de- than 11;000 configuration options available in recent Linux fects. Indeed, developers mostly focus on fixing issues that versions make this a time-consuming and non-trivial task. are reported by their user base. As rarely used functionali- We design and implement an automated approach to pro- ties only account for reliability issues in a small portion of duce a kernel configuration that is adapted to a particular the user base, this process greatly improves the overall re- workload and hardware, and present an attack surface eval- liability of the software. However, malicious attackers can, uation framework for evaluating security improvements for and do, still target vulnerabilities in those less-often-used the different kernels obtained. Our results show that, for functionalities. A recent example from the Linux kernel is real-world server use cases, the attack surface reduction an arbitrary kernel memory read and write vulnerability in obtained by tailoring the kernel ranges from about 50% to the reliable datagram sockets (RDS) (CVE-2010-3904), a 85%. Therefore, kernel tailoring is an attractive approach rarely used socket type. to improve the security of the Linux kernel in practice. If the intended use of a system is known at kernel com- pilation time, an effective approach to reduce the kernel’s attack surface is to configure the kernel to not include un- 1 Introduction needed functionality. However, finding a suitable configu- ration requires extensive technical expertise about currently more than 11;000 Linux configuration options, and needs to The Linux kernel is a commonly attacked target. In 2011 be repeated at each kernel update. Therefore, maintaining alone, 148 Common Vulnerabilities and Exposures (CVE)1 such a custom-configured kernel entails considerable main- entries for Linux have been reported, and this number is ex- tenance and engineering costs. pected to grow every year. This is a serious problem for sys- Moreover, while it is widely accepted that making pro- 1http://cve.mitre.org/ grams “smaller” improves security, quantitatively measur- 1 ing security improvements remains a difficult and impor- comparison of the effects of these choices on our mea- tant problem [49]. Existing work on system security of- surements. ten measures improvements in terms of Trusted Comput- ing Base (TCB) reduction, which in practice often trans- • A tool that, given the kernel sources and run-time lates into a measurement of the total number of source lines traces characterizing a use case, produces a small ker- of code (SLOC) (e.g., [19, 34]). Although these metrics nel configuration, taking into account LKMs, which are sensible (as every line of code can have a vulnerabil- includes all kernel functionalities necessary for the ity) and easy to obtain, they can be imprecise. For instance, workload. on a given kernel configuration, a large part of the kernel • An evaluation of the attack surface reduction as well sources will not be compiled, many parts will only be com- as performance results in the case of a LAMP-based piled as kernel modules which might never be loaded, and server and a workstation providing access to files via some functions might simply not be within reach of an at- NFS. tacker. This paper presents metrics for quantifying the security The remainder of this paper is structured as follows: Sec- of an OS kernel and a tool-assisted approach to automat- tion 2 defines the notions of attack surface and attack sur- ically determine a kernel configuration that enables only face measurement, as well as a set of attack surface met- kernel functionalities that are actually necessary in a given rics that can be used in practice for our evaluation. Sec- scenario. Although it is easy to quantify the size of the re- tion 3 presents an overview of the tailoring process, and sulting kernel binaries, this is not convincing evidence that the implementation of the underlying automated kernel- the resulting kernel indeed presents less of an attack surface configuration-tailoring tool. Section 4 evaluates the attack to potential attackers. Hence, after defining what attack sur- surface reduction and performance of such an approach in face means, we quantify the security gains in two distinct two use cases and with several attack surface reduction met- security models in terms of attack surface reduction. The rics. These results are then discussed in Section 5. Section 6 first security model considers that the entire kernel can be presents related work. The paper concludes in Section 7. subject to attacks and is therefore a good reference for com- parison to previous work, whereas the second considers the 2 Security Metrics scenario of a restricted attacker, and is a good reference for evaluating the security improvements of configuration tai- In this section, we present two distinct security models, loring in the context of unprivileged local attackers. Our and, for each of them, security metrics (attack surface mea- measurements take into account the static call graph of the surements) which we use in Section 4 to evaluate and quan- kernel and the possible entry points of the attacker to pro- tify the security of a running Linux kernel. The dependence vide a more accurate comparison. between notions defined or used in this section are summa- Our automated kernel-tailoring approach builds on our rized in Figure 1. previous work [51], and extends it with multiple improve- ments, including loadable kernel module (LKM) support. 2.1 Preliminary definitions When compared to other hardening solutions, a notable ad- vantage of the kernel-configuration tailoring approach is Definition 1 (Call graph). A call graph is a directed graph that it makes no changes to the source code of the kernel: (F;C), where F ⊆ is the set of nodes and represents the therefore, it is impossible to introduce new defects into the N set of functions as declared in the source of a program, and kernel source. This approach uses run-time traces as input C ⊆ F × F the set of arcs, which represent all direct and for deducing a suitable kernel configuration, and we show indirect function calls. We denote the set of all call graphs it to work equally well in different use cases. We detail by G . the use our tool to tailor a “Linux, Apache, MySQL and PHP (LAMP) stack” kernel on server hardware, as well In practice, static source code analysis at compile time as a network file system (NFS) running on a workstation. (that takes all compile-time configuration options into ac- We obtain comparative measurements of the tailored ker- count) is used to obtain such a call graph. nels that show that configuration-tailoring incurs no over- head and no stability issues, while greatly reducing the at- Definition 2 (Entry and barrier functions). A security tack surface in both security models. model defines a set of entry functions E ⊆ F, which cor- The major contributions of this paper are: responds to the set of functions directly callable by an at- tacker, and a set of barrier functions X ⊆ F, which corre- • A definition of an attack surface and an attack-surface sponds to the set of functions that, even if reachable, would metric based on static call graphs and security models; prevent an attacker from progressing further into the call examples of metrics satisfying this definition, and a graph. Application Application Application Application Security Entry and barrier Attack surface model functions (privileged) (unprivileged) (privileged) (unprivileged) Attack surface measurement LKM LKM System call interface System call interface Program source Call graph: Attack surface (on-demand (on-demand and configuration functions and calls metric loadable) loadable) Core Kernel LKM LKM Core Kernel LKM LKM (driver) (driver) LKM Hardware interface LKM Hardware interface Figure 1. Dependencies between notions de- (other) (other) attacker fined in this section. Hardware Hardware attack surface partial a.s. running kernel E would typically be the interface of the program that is exposed to the attacker, whereas X would typically be Figure 2. On the left, the GENSEC model. On the set of functions that perform authorization for privileges the right, the ISOLSEC model.