Address Space Layout Permutation (ASLP): Towards Fine-Grained Randomization of Commodity Software
Total Page:16
File Type:pdf, Size:1020Kb
Address Space Layout Permutation (ASLP): Towards Fine-Grained Randomization of Commodity Software Chongkyung Kil,∗ Jinsuk Jun,∗ Christopher Bookholt,∗ Jun Xu,† Peng Ning∗ Department of Computer Science∗ Google, Inc.† North Carolina State University {ckil, jjun2, cgbookho, pning}@ncsu.edu [email protected] Abstract a memory corruption attack), an attacker attempts to alter program memory with the goal of causing that program to Address space randomization is an emerging and behave in a malicious way. The result of a successful at- promising method for stopping a broad range of memory tack ranges from system instability to execution of arbitrary corruption attacks. By randomly shifting critical memory code. A quick survey of US-CERT Cyber Security Alerts regions at process initialization time, address space ran- between mid-2005 and 2004 shows that at least 56% of the domization converts an otherwise successful malicious at- attacks have a memory corruption component [24]. tack into a benign process crash. However, existing ap- Memory corruption vulnerabilities are typically caused proaches either introduce insufficient randomness, or re- by the lack of input validation in the C programming lan- quire source code modification. While insufficient random- guage, with which the programmers are offered the free- ness allows successful brute-force attacks, as shown in re- dom to decide when and how to handle inputs. This flex- cent studies, the required source code modification prevents ibility often results in improved application performance. this effective method from being used for commodity soft- However, the number of vulnerabilities caused by failures ware, which is the major source of exploited vulnerabilities of input validation indicates that programming errors of this on the Internet. We propose Address Space Layout Permu- type are easy to make and difficult to fix. Ad Hoc methods tation (ASLP) that introduces high degree of randomness such as StackGuard [8] only target at specific types of at- (or high entropy) with minimal performance overhead. Es- tacks. Static code analyzers can be used to find such bugs at sential to ASLP is a novel binary rewriting tool that can compile time. Due to the inherent difficulties in deeply an- place the static code and data segments of a compiled exe- alyzing C code, these analyzers often make strong assump- cutable to a randomly specified location and performs fine- tions or simplification that lead to a significant number of grained permutation of procedure bodies in the code seg- false positives and false negatives. Methods such as CCured ment as well as static data objects in the data segment. We [18] offer a way to guarantee that a program is free from have also modified the Linux operating system kernel to per- memory corruption vulnerabilities. However, such meth- mute stack, heap, and memory mapped regions. Together, ods incur significant runtime overhead that hinders produc- ASLP completely permutes memory regions in an applica- tion deployment. In addition, most of the aforementioned tion. Our security and performance evaluation shows min- approaches require access to the source code. This is par- imal performance overhead with orders of magnitude im- ticularly a problem for commodity software, for which the provement in randomness (e.g., up to 29 bits of randomness source code is typically unavailable. on a 32-bit architecture). While bug detection and prevention techniques are pre- ferred, continued discoveries of memory corruption vulner- 1 Introduction abilities indicate alternatives must be sought. We believe Memory corruption vulnerability has been the most com- that mitigating this kind of attacks would give attackers sig- monly exploited one among the software vulnerabilities that nificantly fewer ways to exploit their targets, thereby re- allow an attacker to take control of computers. Examples of ducing the threat they pose. One promising method is ad- memory corruption attacks include buffer overflows [19], dress space randomization [26]. It has been observed that format string exploits [20], and double-free attacks [1]. In most attacks use absolute memory addresses during mem- an attack exploiting a memory corruption vulnerability (or, ory corruption attacks. Address space randomization ran- 1 domizes the layout of process memory, thereby making the tions of this paper are as follows: critical memory addresses unpredictable and breaking the • ASLP provides probabilistic protection an order of hard-coded address assumption. As a result, a memory cor- magnitude stronger than previous techniques. ruption attack will most likely cause a vulnerable program to crash, rather than allow the attacker to take control of the • ASLP randomizes regions throughout the entire user program. memory space, including static code and data seg- Several address space randomization techniques have ments. Program transformation is automatically done been proposed [2, 3, 22, 23, 26]. Among the existing by our binary rewriting tool without the requirement of approaches, PaX Address Space Layout Randomization source code modification. (ASLR) [23] and address obfuscation [3] are most visible. • The performance overhead is generally very low (less PaX ASLR randomly relocates the stack, heap, and shared than 1%). In comparison, existing techniques that are library regions with kernel support, but does not efficiently capable of randomly relocating static code and data randomize locations of code and static data segments. Ad- segments, in particular PIE and Address obfuscation, dress obfuscation expands the randomization to the static incur more than 10% overhead on average. code and data regions by modifying compiler but requires source code access and incurs 11% performance overhead The rest of this paper is organized as follows. Section 2 on average. Position-independent executables (PIE) [9] al- discusses related work. Section 3 describes the design and lows a program to run as shared object so the base ad- implementation of ASLP. Section 4 presents the evaluation dress of the code and data segment can be relocatable, of ASLP. Section 5 describes the limitations of current im- but it also incurs 14% performance degradation on aver- plementation, and Section 6 concludes the paper. age (shown in our performance evaluation presented later 2 Related Work in the paper). While insufficient randomness allows suc- cessful brute-force attacks, as shown in recent studies, the The seminal work on program randomization by Forrest required source code modification and performance degra- et al. illustrated the value of diversity in ecosystems and dation prevent these effective methods from being used for similar benefits for diverse computing environments [10]. commodity software, which is the major source of exploited In short, their case for diversity is that if a vulnerability vulnerabilities on the Internet. is found in one computer system, the same vulnerability is likely to be effective in all identical systems. By introduc- In this paper, we propose address space layout permu- ing diversity into a population, resistance to vulnerabilities tation (ASLP) to increase the programs’ randomness with is increased. Address randomization achieves diversity by minimal performance overhead. ASLP permutes all sec- making the virtual memory layout of every process unique tions (including code and static data) in the program ad- and unpredictable. Existing address space randomization dress space. This is done in two ways. First, we cre- approaches can be divided into two categories: user level ate a novel binary rewriting tool that randomly relocates and kernel level. User level and kernel level approaches static code and data segments, randomly re-orders functions achieve randomization with modification to the user space within code segment, and data objects within data segment. applications and the kernel, respectively. Both approaches Our rewriting tool operates directly on compiled program have their advantages and disadvantages, which should be executable, and does not require source code modification. carefully reviewed by the user before the application of the We only need the relocation information from the compile- techniques. time linker to perform the randomization rewriting. Such information is produced by all existing C compilers. Sec- User Level Randomization Address obfuscation [2, ond, to randomly permute stack, heap, and memory mapped 3] introduced a mechanism to not only randomly shift the regions, we modify the Linux kernel. Our kernel changes placement of the three critical regions, but also randomly conserve as much virtual address space as possible to in- re-order objects in static code and data segments. This ap- crease randomness. Our binary rewriting tool can be au- proach relies on a source code transformation tool [29] to tomatically and transparently invoked before a program is perform the randomization. It introduced special pointer launched, while our kernel level support runs without any variables to store actual locations of objects for static code additional change to the runtime system. To validate the and data randomization. In addition, they add randomly practicality and effectiveness of ASLP, we have evaluated sized pads to stack, heap, and shared library using the ini- ASLP using security and performance benchmarks. Our se- tialization code, the wrapper function, and the junk code curity benchmark result shows that ASLP can provide up respectively. to 29 bits of randomness on a 32-bit architecture,