Automated Exploitation of Fully Randomized Executables Austin
Total Page:16
File Type:pdf, Size:1020Kb
Automated Exploitation of Fully Randomized Executables by Austin James Gadient B.S. Computer Engineering United States Air Force Academy (2018) Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY February 2020 ○c Austin Gadient, 2019. All rights reserved. The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created. Author................................................................ Department of Electrical Engineering and Computer Science December 18, 2019 Certified by. Martin C. Rinard Professor of Computer Science and Electrical Engineering Thesis Supervisor Accepted by . Leslie A. Kolodziejski Professor of Computer Science and Electrical Engineering Chair, Department Committee on Graduate Students 2 Automated Exploitation of Fully Randomized Executables by Austin James Gadient Submitted to the Department of Electrical Engineering and Computer Science on December 18, 2019, in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering Abstract Despite significant research into their remediation, memory corruption vulnerabil- ities remain a significant issue today. Attacks created to exploit these vulnerabilities are remarkably brittle due to their dependence on the presence of specific security settings, use of certain compiler toolchains, and the target’s environment. I present Marten, a new end-to-end system for automatically discovering, exploiting, and com- bining information leakage and buffer overflow vulnerabilities to generate exploits that are robust against defenses, compilation decisions, and environment differences. Marten has generated four control flow hijacking exploits and three information leak- age exploits. One of the information leakage exploits is generated against a previously undiscovered zero-day vulnerability in Serveez-0.2.2. CVE-2019-16200 has been as- signed to this vulnerability. These results highlight Marten’s ability to generate robust exploits that bypass modern defenses to download and execute injected code selected by an attacker. Thesis Supervisor: Martin C. Rinard Title: Professor of Computer Science and Electrical Engineering 3 4 Acknowledgments Thank you to my advisor, Martin Rinard, for his expert guidance, mentorship, and leadership throughout the entirety of my research. His insight and recommendations helped me improve both the presentation and substance of this work immeasurably. I would also like to thank Jeff Perkins, Ricardo Barato, Eli Davis, and Baltazar Ortiz for their expert knowledge and work upon which my research is built. Isaac Newton once said, "If I have seen further, it is by standing upon the shoulders of gi- ants." These men are the giants and their efforts are the shoulders upon which I stand. Additionally, I thank my teachers, from pre-kindergarten to now, who simultane- ously drove and satiated my hunger for knowledge. Without their efforts, I would never have developed the discipline and skillset that has allowed me to succeed aca- demically at MIT. In particular, I would like to thank my mother for reading to me every night when I was young and homeschooling me for three years. I never had a teacher who cared more than her and I doubt I ever will. Finally, I give great thanks the my father Anthony, my sisters Katharyn and Megan, and my girlfriend Dominique for their support through every difficult stage in my life. I would not be where I am today without their unconditional love. 5 6 Contents 1 Introduction 15 1.1 Background . 15 1.1.1 Memory Corruption Arms Race . 15 1.2 Marten . 17 1.3 Scope . 17 1.4 Contributions . 18 1.5 Structure . 20 2 Overview 21 2.1 Defenses . 21 2.1.1 ASLR . 21 2.1.2 NX . 22 2.1.3 Full RELRO . 22 2.1.4 Fortify Source . 23 2.2 Threat Model . 23 2.3 Marten . 24 2.4 Instrumentation . 24 2.5 Vulnerability Discovery . 24 2.6 Information Leakage Exploit . 25 2.7 Exploit Generation . 25 3 Example 27 3.1 Dnsmasq . 27 7 3.1.1 Trace Generation and Analysis . 27 3.1.2 Vulnerability Discovery . 28 3.1.3 Information Leakage . 29 3.1.4 Vulnerability Discovery . 30 3.1.5 ROP Chain Generation . 30 3.2 Final Exploit . 32 4 Program Instrumentation 33 4.0.1 Logging Execution Traces . 33 4.0.2 Trace Analyzer . 34 5 Vulnerability Discovery 37 5.1 Vulnerability Discovery Algorithm . 37 5.2 Success Functions . 39 5.2.1 Overflow . 39 5.2.2 Information Leak . 39 5.3 Path Explosion and Symbolic Execution . 40 6 Information Leakage 43 6.1 Information Discovery Algorithm . 43 6.2 Information Leakage Verification . 45 6.3 Information Identification . 45 6.4 GDB Scripting . 46 7 Exploit Generation 49 7.1 ROP Chain Generation . 50 7.2 Stack Offset Calculation . 52 7.3 Exploit Finalization . 52 7.4 Chain Size . 53 7.5 Exploit Reliability . 54 8 8 Results 55 8.1 Dnsmasq-2.77 . 56 8.2 Nginx-1.4.0 and OpenSSL-1.0.1 . 56 8.3 Asterisk-11.1.2 and OpenSSL-1.0.1 . 59 8.4 Serveez-0.0.20 . 61 9 Related work 65 10 Conclusion 71 9 10 List of Figures 2-1 The overall workflow of Marten. The diagram is split between program instrumentation and trace generation (pink), analysis that generates different portions of the final exploit (blue), and the final exploit’s direct interaction with the remote process (green). Section numbers for each component are given in parenthesis. 22 3-3 Statistics of ROP chains generated for each target system’s 64-bit libc library . 31 6-1 Disassembly when setting breakpoint in outpacket.c at line 84 for Dnsmasq-2.77 . 46 8-2 CVE-2013-2028: Stack Buffer Overflow in Nginx . 58 8-3 CVE-2013-2685 Asterisk Buffer Overflow Vulnerability . 60 8-4 CVE-2019-16200 Serveez Information Leak Vulnerability . 61 8-5 Bugtraq ID 42560: Serveez Buffer Overflow Vulnerability . 62 11 12 List of Tables 8.1 Results for benchmark applications. Marten generates working exploits for seven vulnerabilities and chains them together to create exploits that bypass full randomization and other defenses. 55 13 14 Chapter 1 Introduction Memory corruption vulnerabilities have been a common target of control-flow hijacking attacks for decades [20, 31, 39, 21]. Attackers inject code into a running process, or reuse code that already exists, to subvert the system. Address Space Layout Randomization (ASLR) has proven to be an effective defense against these attacks [36]. While it is possible to apply ASLR to only some parts of the process, modern systems such as Ubuntu 18.04 apply full ASLR to all code and data in a process. This thesis addresses full ASLR. 1.1 Background This section provides brief summaries that describe the history of memory cor- ruption vulnerabilities. It gives useful context for the design decisions made in this thesis. 1.1.1 Memory Corruption Arms Race Memory corruption exploits have existed for decades. One notable paper on these attacks was introduced in 1996 with Aleph One’s influential work on stack buffer overflows [20]. This technical description made the art of program exploitation main- stream and provided a base from which offensive and defense techniques grew. The 15 paper by Aleph One focused on stack buffer overflows and their exploitation inthe absence of defenses. The goal of these attacks is to write past the end of a stack buffer and corrupt a return pointer which can be set to arbitrarily execute codewith the same permissions as the process in which the code runs. This type of attack was effective in targeting server programs and allowed attackers to inject code and subvert critical systems. The popularity of stack buffer overflow attacks motivated the development ofone important defense, non-executable memory (NX) [7]. NX takes advantage of the fact that some values in a process’s address space represent code, while some represent data. The code that needs to execute is typically mapped into its own location such as the .text section in executable and linkable format (ELF) binaries [16]. NX made it so that the code inside of a process could be executed but not written, and the data could be written but not executed. Thus, a simple xor operation was applied to executable memory and writable memory. This defense prevented attackers from simply injecting their own code, often referred to as shellcode because of its tendency to spawn a shell, into a process. In response to this defense, attackers developed the return-to-libc attack [40]. This attack worked by using code that was already present in the process to launch attacker selected commands. This method was realized by calling library functions such as execve with specially selected parameters. Another similar attack, return oriented programming (ROP), was also developed [32]. This attack used small sequences of instructions, often referred to as gadgets, that occurred before return instructions. It was shown that given a sufficiently large body of x86 code, a Turing complete setof operations could be performed using the gadgets [32]. To defend against code reuse attacks like return-to-libc and ROP, ASLR was de- veloped [36]. ASLR works by randomizing the location of code within a process’s address space. The central idea is that if an attacker cannot find the code they want to reuse, then their exploits will fail. The combination of ASLR and NX is extremely effective in preventing exploits, but they are not insurmountable. If an attacker can trigger an information leak to gain knowledge about the layout of a target process’s 16 address space.