AUTOMATIC DETECTION OF MEMORY CORRUPTION ATTACKS

Thesis submitted in partial fulfillment of the requirements for the degree of

Master of Science (by Research) in Computer Science

by

Pankaj Kohli
200607011
pankaj [email protected]

International Institute of Information Technology
Hyderabad, India

December 2008

INTERNATIONAL INSTITUTE OF INFORMATION TECHNOLOGY
Hyderabad, India

CERTIFICATE

It is certified that the work contained in this thesis, titled “Automatic Detection of Memory Corruption Attacks” by Pankaj Kohli, has been carried out under my supervision and is not submitted elsewhere for a degree.

Date                                                Adviser: Dr. B. Bruhadeshwar

Copyright © Pankaj Kohli, 2009
All Rights Reserved

To my parents, for their everlasting love and support.

Knowing is not enough; we must apply. Willing is not enough; we must do.
- Johann Wolfgang von Goethe

Acknowledgments

I would like to thank my adviser, Dr. Bruhadeshwar, for his support and guidance during the development of the ideas in this thesis, and for his helpful comments on the text. I thank Dr. Srinathan, who inspired me as an individual through his various teachings during my course work at IIIT Hyderabad. I am thankful to Dr. Venkaiah for his immense support in the lab. Many thanks to my friends Sai Sathyanarayan and Vinayak Kumar, who shared their valuable ideas with me during my research work. I am grateful to all the faculty of IIIT Hyderabad, whose encouragement towards research helped me contribute some good work to the research community.

Abstract

With the growth of the Internet, there has been a tremendous increase in security attacks against computer systems. Due to improper use of programming languages by programmers, software often exposes security vulnerabilities, such as buffer overflows and format string bugs. Such software vulnerabilities relating to memory safety are the most common vulnerabilities used by attackers to gain control over the execution of a program running on a computer system. By carefully crafting an exploit for these vulnerabilities, attackers can make a privileged program transfer execution flow to a malicious piece of code. Such memory corruption attacks are among the most powerful and common attacks against software applications. In recent years, memory corruption attacks have accounted for more than half of all reported CERT advisories. A large number of defensive techniques have been described in the literature that either attempt to eliminate specific vulnerabilities entirely or attempt to combat their exploitation.

The work presented in this thesis makes two significant contributions. Firstly, it presents FormatShield, a novel approach to defend against format string attacks. Secondly, it presents Coarse Grained Dynamic Taint Analysis, a generic technique that uses information flow tracking to defeat a broad range of memory corruption attacks. FormatShield automatically identifies call sites in a running process that are vulnerable to format string attacks. Using binary rewriting, the list of exploitable program contexts at these vulnerable call sites is dumped into the program binary. Attacks are detected when malicious input is found at such call sites. FormatShield can defend against all types of format string attacks, i.e. arbitrary memory read attempts and arbitrary memory write attempts, including non-control data attacks. Coarse grained dynamic taint analysis works by labeling data received from untrusted sources, such as the network, as unsafe or tainted. Data derived from such tainted data is itself marked as tainted. Attacks are detected when control branches to a location specified by the unsafe data. It does not require source code of the program to be protected and is capable of defending against a broad range of memory corruption attacks, including non-control data attacks. Also, our experiments show that it incurs modest performance overhead, making it suitable for use in production environments.

Contents

1 Introduction
  1.1 Problem Statement
  1.2 Goal of the thesis
  1.3 Terminology and Conventions
  1.4 Thesis Organization

2 Preliminaries
  2.1 Process Address Space
  2.2 Intel x86 Function Call Mechanism
  2.3 ELF File Format
    2.3.1 Linking View
    2.3.2 Loading View
    2.3.3 Dynamic Linking
  2.4 Memory Corruption Attacks
    2.4.1 Buffer Overflows
    2.4.2 Format String Attacks
    2.4.3 Integer Overflows
    2.4.4 Double Free Attacks
    2.4.5 Globbing Vulnerabilities
  2.5 Attack Targets
  2.6 Attack Variations
    2.6.1 Return into libc Attacks
    2.6.2 NOP Sled
    2.6.3 Jump to Register
    2.6.4 Code Spraying Attacks
    2.6.5 Non-Control Data Attacks

3 Related Work
  3.1 Security Policies and Code Reviews
  3.2 Language Approach
  3.3 Safe C Libraries
  3.4 Extensions
    3.4.1 Randomized Addresses
    3.4.2 Randomized Instruction Sets
  3.5 Compiler Modifications
  3.6 Runtime Detection of Attacks
    3.6.1 Executable Monitoring
    3.6.2 Software Fault Injection
  3.7 Information Flow Tracking
  3.8 Static Analysis
  3.9 Anomaly Detection

4 FormatShield
  4.1 Approach Description
  4.2 Implementation
    4.2.1 Identifying vulnerable call sites
    4.2.2 Binary Rewriting
    4.2.3 Implementation Issues
  4.3 Evaluation
    4.3.1 Effectiveness
    4.3.2 Performance Testing
  4.4 Discussion
    4.4.1 False Positives and False Negatives
    4.4.2 Limitations

5 Coarse Grained Dynamic Taint Analysis
  5.1 Coarse Grained Dynamic Taint Analysis
    5.1.1 Approach Overview
    5.1.2 Framework Overview
    5.1.3 Exploit Detection Policy
  5.2 Evaluation
    5.2.1 Effectiveness Evaluation
    5.2.2 Performance Evaluation
  5.3 Discussion

6 Conclusion and Future Work
  6.1 Conclusion
  6.2 Future Work

Bibliography

List of Figures

1.1 Breakdown of NIST National Vulnerability Database (NVD) of software security vulnerabilities (2006 and 2007-Q1/Q2)

2.1 Process Address Space
2.2 C code and the generated code showing the function prologue, epilogue and call
2.3 Linking and Loading view of an ELF executable
2.4 Calling an externally defined function using .plt and .got
2.5 A Buffer Overflow
2.6 A sample program vulnerable to stack overflow
2.7 Stack frame for the function func() (a.) before and (b.) after strcpy(). An attacker injects a large input that overwrites the stored return address on the stack, making it point to the injected code.
2.8 (a.) Allocated Chunk, and (b.) free’ed chunk
2.9 The unlink() macro and its equivalent code
2.10 An attacker constructs fake chunks by overflowing a chunk
2.11 Stack layout when printf() is called. (a.) On giving a legitimate input, the program prints HELLO. (b.) On giving a malicious input (“%x%x%x%x”), the program prints 44415441 (hex equivalent of DATA).
2.12 Different types of integer overflows - a width overflow, an arithmetic overflow and a signedness bug caused due to an arithmetic overflow
2.13 A return into libc attack
2.14 NOP Sled technique
2.15 Jump to register technique

4.1 Only the third call to output() is exploitable
4.2 ELF binary a. before rewriting b. after rewriting
4.3 Dynamic symbol table before rewriting the binary
4.4 Dynamic symbol table after rewriting the binary. A new dynamic symbol named fsprotect is added while rewriting the binary which points to the new section at address 0x08047114.
4.5 Sections before rewriting the binary
4.6 Sections after rewriting the binary. A new loadable read-only section named fsprotect is added which holds the context information. The .dynsym, .dynstr and .hash sections shown are extended copies of the original ones. The original .dynsym, .dynstr and .hash are still loaded at their original load addresses.

5.1 Approach Overview
5.2 Variables within a structure
5.3 Variables in a stack frame
5.4 Instrumentation for the instruction mov $0x0400,%ebx

List of Tables

4.1 Results of effectiveness evaluation
4.2 Micro benchmarks

5.1 Instructions instrumented for Taint Tracking
5.2 Attacks detected by different policies
5.3 Results of effectiveness evaluation
5.4 Comparison of overhead incurred by our approach with that of Xu et al.’s approach
5.5 Feature Comparison with previous approaches

Chapter 1

Introduction

1.1 Problem Statement

Software security has become an increasing necessity for guaranteeing, as much as possible, the correctness of computer systems. Unfortunately, software vulnerabilities are omnipresent. In the last few decades, many new vulnerabilities have been discovered, and old ones have been continuously exploited. Memory corruption in C and C++ programs has been known for decades and constitutes one of the oldest classes of software vulnerabilities. Attackers have been exploiting these vulnerabilities since the days of the Internet Worm of 1988 (also known as the Morris Worm). Despite decades of research on memory corruption attack countermeasures, these software vulnerabilities are still a real and concrete threat, as the recent breakdown of the NIST National Vulnerability Database (NVD) of software vulnerabilities depicted in Figure 1.1 shows.

Figure 1.1: Breakdown of NIST National Vulnerability Database (NVD) of software security vulnerabilities (2006 and 2007-Q1/Q2)

Since their first public exploitation, a number of different types of memory corruption vulnerabilities have been discovered and, unfortunately, successfully exploited by attackers. To date, buffer overflows are probably the most common memory corruption vulnerability. As the name suggests,

a buffer overflow vulnerability takes advantage of erroneous bounds checking on a buffer, or the lack of it. As a direct consequence, buffer overflows can be exploited to write past the end of a buffer with the intent to corrupt adjacent memory locations. This carefully crafted corruption generally permits an attacker to execute arbitrary code, or to perform actions that are just as dangerous. Memory corruption can be eliminated by using type-safe languages such as Java. Unfortunately, these languages do not provide enough low-level control over memory management and data representation. These features are mostly required for systems software, so it is unlikely that such type- and memory-safe languages will substitute for C or C++, at least in this context.

To date, a number of memory corruption attack countermeasures have been studied and proposed. For instance, to defeat stack-based buffer overflows, Cowan et al. proposed StackGuard, a compiler patch that inserts a canary value right before the return address to detect attempts to corrupt a return address using a buffer overflow. Unfortunately, not only has it been shown how to bypass StackGuard-based protections, but the offered protection is, most importantly, limited to a specific vulnerability: it has been designed to work only for stack-based buffer overflows. It has no effect, as is, on other kinds of buffer overflows or on other memory corruption attacks, such as format strings, and therefore provides a very limited degree of protection.

The countermeasure proposed by Cowan et al. is just one of several that have kept researchers busy providing different and progressively more complete solutions to memory corruption attacks over the past several years. Proposed solutions cover a broad range of Computer Science disciplines, from safe programming languages, anomaly detection, and information flow to techniques that modify the underlying compiler, system libraries, the operating system, or the hardware. Among these approaches, transformation techniques which aim to provide artificial diversity, or which are based on information-flow approaches, seem to be the most promising and effective against a broad class of memory corruption attacks.

Transformation techniques that aim to provide artificial diversity to a program rest on the observation that, generally speaking, memory corruption exploits leverage the monoculture of systems. Once a memory corruption vulnerability is discovered in an application running on a particular operating system (OS) and architecture, it is fairly easy for an attacker to take over the rest of the similar configurations (e.g., application, OS, architecture) which present the same vulnerability. This is possible because a process address space is always organized in the same way for a particular OS and architecture, and memory corruption exploits rely on corrupting particular memory locations (e.g., where a function return address is stored on the stack, or where a function's global offset table (GOT) entry is stored in the vulnerable process address space) with suitable values. For this reason, memory corruption exploits can be thwarted by approaches that introduce diversity into computer systems, such as address space randomization (ASR), instruction set randomization (ISR), and so on. Unfortunately, these approaches either cannot deal with all memory corruption attacks or, even when successful, provide only probabilistic protection.
On the other hand, information-flow (or taint analysis) based approaches focus their attention on how untrusted data is used and processed by a protected application (i.e., how information flows into the application), to check whether such data corrupts security-sensitive memory locations. More precisely, taint analysis is a technique which aims to detect whether untrusted data is incorrectly used in security-sensitive actions. To this end, taint analysis usually marks untrusted data coming from taint sources (e.g., data coming from the network, or from other input-related system calls) as tainted and, as data propagates through memory, propagates the taint information associated with the data itself. Any attempt to use security-sensitive data which has been marked as tainted at security-sensitive points, called sinks (e.g., particular system calls, or the moment when a function is about to return), is generally a manifestation of an attack. For instance, a memory corruption attack which aims to corrupt a code pointer can overwrite a function return address with the intent to

hijack the legal process execution flow. Naturally, the overwritten return address will be marked as tainted as a consequence of this attack attempt. The low-level implicit policy adopted by taint-based approaches does not allow tainted code pointers to be dereferenced, as function return addresses are security-sensitive data which should never become tainted; they are manipulated only by the application code, which is considered to be trusted. However, current taint analysis based approaches are not very practical, as they either require source code or non-trivial hardware extensions, or incur prohibitive run-time overheads. More importantly, they do not defend against corruption of data or data pointers, making them susceptible to privilege escalation.

Unfortunately, as protection mechanisms improve, so do the attacks. Recently, Chen et al. pointed out that it is no longer necessary to exploit a memory corruption vulnerability with the goal of hijacking a legal process' control flow in order to cause harm. Damage can also be caused by memory corruption attacks which do not target code pointers but instead aim to overwrite data and data pointers. Even if this seems a strict restriction, Chen et al. showed that these attacks can be as powerful as the classic ones (i.e., those which corrupt code pointers). It may be argued that such attacks are not very common nowadays, as their exploitation requires a better understanding of the application logic: as long as memory corruption attacks which corrupt code pointers remain effective, attackers have no motivation to make their exploits harder to code. Naturally, should existing research countermeasures become widely used, attackers will adapt their attacks accordingly, as confirmed by recent related research. Therefore, it is our belief that comprehensive memory corruption attack countermeasures should no longer ignore this class of memory corruption attacks.

1.2 Goal of the thesis

The work proposed in this thesis aims to provide comprehensive solutions to memory corruption attacks. To this end, we propose two novel techniques which provide protection from memory corruption attacks. While the first approach aims to defend against format string attacks, the second provides a generic defense against a broad range of memory corruption attacks. Both approaches provide a deterministic defense against attacks which corrupt code pointers, data pointers or arbitrary data. The underlying mechanisms of the two approaches differ, which suggests contexts where each strategy finds a better deployment than the other. In particular:

• In our first approach, we present FormatShield, a novel approach that uses binary rewriting to defend against format string attacks. FormatShield identifies call sites in a running process that are vulnerable to format string attacks. Using binary rewriting, the list of exploitable contexts associated with each vulnerable call site is dumped into the program binary, and is made available at run time. Format string attacks are detected when malicious input is found at a vulnerable call site with an exploitable context. FormatShield does not require source code or recompilation of the program to be protected and is capable of defending against all types of format string attacks.

• Our second approach takes advantage of taint-tracking and anomaly detection techniques. The proposed strategy first transforms a given program binary P into P′, which is P augmented with the variables' size information, using the debugging information present in the program binary. Then, by coupling taint analysis with security policies on these variables, our approach propagates taint on these variables and dynamically analyzes sinks, that is, relevant events (e.g., system calls or security-sensitive functions) of the transformed binary P′. Attacks are

detected when control is dictated by a tainted variable. To detect data or data pointer corruption, we check if any data has been marked as tainted in an illegitimate manner, such as by an overflow of a variable. Propagating taint at the level of variables instead of individual bytes of memory reduces the performance overhead considerably, making our approach practical for use in production environments.

The thesis ends by providing experimental results and a comparison to existing and similar techniques. Moreover, the performance and effectiveness of these approaches are discussed, as well as their weaknesses, limitations and possible improvements.

1.3 Terminology and Conventions

• Attack, Intrusion. This thesis talks about attacks and intrusions; in this context, both words mean an action by an attacker that aims to change the flow of control of a program, or to escalate privileges without authentication or sufficient credentials.

• Attacker. A person launching an attack.

• Vulnerability. A flaw or weakness in a system's or program's design, implementation, operation or management that could be exploited.

1.4 Thesis Organization

The thesis is organized as follows. Chapter 2 discusses the most important concepts about memory corruption attacks: what they are, what different kinds of memory corruption vulnerabilities exist, and how they can intuitively be exploited. Chapter 3 describes the related work done in the field of detection of memory corruption attacks. Chapter 4 introduces our first approach, FormatShield, followed by Coarse Grained Dynamic Taint Analysis in Chapter 5. The thesis ends by giving concluding remarks and future directions that can be taken to improve the proposed approaches in Chapter 6.

Chapter 2

Preliminaries

2.1 Process Address Space

A process is a program in execution. This means that the operating system has loaded the program executable in memory, has arranged for it to have access to command-line arguments and environment variables, and has started it running. A process can have several conceptually different areas of memory allocated to it:

Code Often referred to as the text, this is the area in which the executable instructions of the program reside. In operating systems such as Linux or Unix, multiple instances of the same program share their code if possible, i.e. only one copy of the instructions for a program resides in memory at any time. This is transparent to the running programs.

Data Statically allocated and global data that are initialized with non-zero values reside in the data region of the program. Each running instance of a program has its own data region. This region starts immediately after the program code in the address space.

BSS The BSS (historically, "Block Started by Symbol") region contains global or static data that are initialized to zero. Each running instance of a program has its own BSS region. BSS begins immediately after the data region in the process address space.

Heap The memory that is allocated at runtime is allocated from the region known as heap. This region is of variable size and grows as more memory is allocated. Each running copy of a program has its own heap. This region begins immediately after the BSS region in the address space.

Stack Local variables and function arguments are stored on the program stack. The stack is also used for storing some bookkeeping information, such as the return address representing the return of a

function to its caller. It is of variable size and grows as functions are called and space for their local variables is allocated. The stack is the only region in the process address space that grows downwards, i.e. from higher addresses towards lower addresses.
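These regions can be observed empirically. The following short C program is our illustration (not part of the original text): it prints the address of one object from each region, and on a typical 32-bit Linux system the addresses appear in roughly the order shown in Figure 2.1.

#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;   /* lives in the data region */
int uninitialized_global;      /* lives in the BSS region  */

int main(void)
{
    int local = 0;                           /* lives on the stack */
    int *dynamic = malloc(sizeof *dynamic);  /* lives on the heap  */

    printf("code : %p\n", (void *)main);
    printf("data : %p\n", (void *)&initialized_global);
    printf("bss  : %p\n", (void *)&uninitialized_global);
    printf("heap : %p\n", (void *)dynamic);
    printf("stack: %p\n", (void *)&local);

    free(dynamic);
    return 0;
}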

0xFFFFFFFF (4 GB)    +------------------------+
                     |         Kernel         |
0xC0000000 (3 GB)    +------------------------+
                     |   Stack (grows down)   |
                     |          ...           |
                     |      DSO 2, DSO 1      |
0x40000000 (1 GB)    +------------------------+
                     |          ...           |
                     |          Heap          |
                     |          BSS           |
                     |          Data          |
                     |          Code          |
0x08048000 (~128 MB) +------------------------+
                     |  Null Ptr Deref Zone   |
0x00000000 (0 MB)    +------------------------+

Figure 2.1: Process Address Space

Figure 2.1 shows the address space of a typical 32-bit process. The top 1 GB of the address space is used by the kernel and is not visible to the process. Stack begins immediately below the kernel and grows towards lower addresses. Dynamically Shared Objects (DSO) or shared libraries are loaded between the heap and stack regions.

2.2 Intel x86 Function Call Mechanism

As stated earlier, the stack is used for storing local variables and function arguments. The part of the stack that is used by the function being executed to hold its local variables is known as the stack frame. Prior to a function call, the stack pointer (the ESP register) points to the top of the stack and the base pointer or frame pointer (the EBP register) points to the base of the stack frame. Before calling a function, the caller pushes its arguments on the stack in reverse order. When the assembly call instruction is executed, the return address, i.e. the address of the instruction just after the call instruction, is pushed on the stack and execution jumps to the callee. The return address is pushed on the stack so that the caller function can resume its execution when the callee finishes. The callee must now set up its own stack frame before manipulating any data on the stack. This mechanism is known as the function prologue. The prologue begins by pushing the value of the frame pointer on the stack. The stack pointer is then copied to the frame pointer so that it points to the stored value of the frame pointer, which is the base of the new stack frame being created. Some space is then reserved on the stack for local variables by subtracting some offset from the stack pointer. Similarly, the callee function finishes by winding up the stack. This process is known as the function epilogue. The epilogue releases the space used by local variables by adding some

offset to the stack pointer and restores the value of the frame pointer. The callee then uses the ret instruction to return to its caller.

void testfunc(int a, int b, int c)
{
    int flag;
    char buffer[16];
}

int main(int argc, char **argv)
{
    testfunc(1, 2, 3);
    return 0;
}

# prologue for testfunc()
push %ebp
mov  %esp,%ebp
sub  $0x20,%esp

# epilogue for testfunc()
add  $0x20,%esp
pop  %ebp
ret

# call to testfunc() in main()
movl $0x3,0x8(%esp)
movl $0x2,0x4(%esp)
movl $0x1,(%esp)
call testfunc

Figure 2.2: C code and the generated code showing the function prologue, epilogue and call

Consider the code fragment given in figure 2.2. It defines a function testfunc() that takes three integer arguments and is called from main(). Before the call is executed, the arguments are pushed on the stack in reverse order, i.e. first 3, then 2 and finally 1. Execution of call then causes the return address to be pushed on the stack, and control branches to testfunc(). In testfunc(), the prologue first pushes the frame pointer on the stack, copies the stack pointer to the frame pointer and allocates space on the stack by subtracting some offset from the stack pointer. The epilogue, on the other hand, winds up the stack, restores the frame pointer and returns to main().

2.3 ELF File Format

In this section, we briefly describe the ELF1 file format for executable, object and library files. Understanding the ELF file format is necessary to understand how our approach works. The ELF file format defines a binary interface that is descriptive enough to allow linking of several object files as well as to form a process image during execution. The ELF file format recognizes the following three kinds of files:

• Relocatable files that can be linked with other relocatable files to form an executable or a shared library.

• Executable files that can be directly loaded into memory and executed.

• Shared libraries that can be linked with other relocatable files and shared libraries. This linking happens in two stages. First, the link editor, e.g., ld, processes the shared library with other relocatable files to create an executable. Secondly, the dynamic loader, e.g., ld.so, combines it with the executable to form the process image.

The ELF specification provides two parallel views of the object files: linking view and loading view. The linking view is needed to build the program while the loading view is required to form the process image. To support linking, the linking view divides the file into sections that contain the code, data, and the other useful information for linking such as symbol tables and relocation tables.

1 ELF, or Executable and Linking Format, is the typical executable file format for Unix/Linux based operating systems.

The execution view, on the other hand, describes how various portions of the file should be mapped into memory while forming the process image. It divides the file contents into segments that have permissions like read/write attached to them. Both of these views are present in the same file, and the ELF header provides a roadmap to describe them. The ELF header forms the first 52 bytes of the file and contains information such as the file type, the position of the section header table and the program header table, the beginning of the executable code, and a 16-byte magic number for quick identification of ELF files. We next describe the linking and loading views.
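As a concrete illustration of this 52-byte header, the routine below is a minimal sketch of ours (not part of the thesis implementation) that relies only on the standard <elf.h> definitions; it validates the magic number and prints the roadmap fields just mentioned.

#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Print a few fields of a 32-bit ELF header already read into memory. */
static int dump_ehdr(const unsigned char *buf, size_t len)
{
    Elf32_Ehdr ehdr;

    if (len < sizeof ehdr)
        return -1;
    memcpy(&ehdr, buf, sizeof ehdr);

    /* The magic number: 0x7f 'E' 'L' 'F' ... */
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;

    printf("file type       : %u\n", (unsigned)ehdr.e_type);   /* ET_EXEC, ET_DYN, ... */
    printf("entry point     : 0x%08x\n", (unsigned)ehdr.e_entry);
    printf("program headers : offset %u, %u entries\n",
           (unsigned)ehdr.e_phoff, (unsigned)ehdr.e_phnum);
    printf("section headers : offset %u, %u entries\n",
           (unsigned)ehdr.e_shoff, (unsigned)ehdr.e_shnum);
    return 0;
}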

2.3.1 Linking View

The linking view divides all the contents of the file, except the ELF header, the section header table and the program header table, into various sections. These sections should satisfy the following conditions:

• Every section must have exactly one section header in the section header table. The section header table may however contain an entry that does not correspond to any section.

• Every section occupies one contiguous block of memory, possibly of zero byte length.

• No two sections overlap.

• The sections together with the header tables and the ELF header may not cover all the space in the object file.

The section header table contains an entry for every section in the object file specifying several attributes. We describe some of these attributes here; a short sketch of walking the section header table follows the list.

• Name The name of the section. This is actually an offset in a string table that contains all the names.

• Flags The flags of a section describe if the contents of the section are writable, allocable (whether it will appear in the process image) or executable.

• Address If the section is allocable, this field determines the virtual address at which the section will be loaded.

• Offset This field gives the byte offset from the beginning of the file to the first byte in the section.
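The following sketch is our illustration; it assumes the entire 32-bit ELF file has been mapped into memory at base, walks the section header table and prints the attributes above, resolving each Name offset through the section name string table.

#include <elf.h>
#include <stdio.h>

/* base points to a complete 32-bit ELF file mapped into memory. */
static void list_sections(const unsigned char *base)
{
    const Elf32_Ehdr *ehdr = (const Elf32_Ehdr *)base;
    const Elf32_Shdr *shdr = (const Elf32_Shdr *)(base + ehdr->e_shoff);

    /* Section names live in a string table; e_shstrndx identifies the
       section holding it. The Name attribute is an offset into it. */
    const char *strtab = (const char *)(base + shdr[ehdr->e_shstrndx].sh_offset);

    for (int i = 0; i < ehdr->e_shnum; i++)
        printf("%-20s addr=0x%08x off=0x%06x flags=0x%x\n",
               strtab + shdr[i].sh_name, (unsigned)shdr[i].sh_addr,
               (unsigned)shdr[i].sh_offset, (unsigned)shdr[i].sh_flags);
}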

2.3.2 Loading View

All ELF files ultimately represent some code to be executed and some data to be used by that code. To execute an ELF file, the operating system first "loads" the program into memory. The loading view of the ELF file contains control information for the construction of this process image. At runtime, the process image is made up of segments of memory that hold the code, data, stack, etc. Each segment of the file corresponds to a segment in the virtual address space. A segment holds one or more sections. Sections with similar access permissions are grouped together to form a segment. For example, a typical executable file contains a .text section to hold code, a .data section to hold initialized global data and a .bss section to hold uninitialized global data. The .text section and the .data section are contained in two different segments. This is because different access permissions are required for text and data: while the data section must be writable, the text section need not be

writable. Because of their similar access permissions, the .data and .bss sections may be part of the same program segment. Information about how to form the segments in memory is contained in the program header table of the ELF file. The program header table is an array of program headers, each of which describes the attributes of one segment. We describe some of these attributes here; a sketch of scanning the program header table follows the list.

• Type This describes the kind of segment. For example, a value of PT_NULL specifies an unused entry in the program header table, a value of PT_LOAD specifies that the segment must be loaded into memory, PT_DYNAMIC specifies that the segment contains dynamic linking information, etc.

• Offset This gives the offset in the ELF file, of the first byte of the segment.

• Address This specifies the virtual address at which the segment will be loaded in memory.

• Flags This specifies the access permissions of the segment.
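A corresponding sketch of ours, under the same assumption that the file is mapped at base, scans the program header table and reports every loadable segment together with its flags.

#include <elf.h>
#include <stdio.h>

/* Print every loadable segment of a 32-bit ELF file mapped at base. */
static void list_load_segments(const unsigned char *base)
{
    const Elf32_Ehdr *ehdr = (const Elf32_Ehdr *)base;
    const Elf32_Phdr *phdr = (const Elf32_Phdr *)(base + ehdr->e_phoff);

    for (int i = 0; i < ehdr->e_phnum; i++) {
        if (phdr[i].p_type != PT_LOAD)
            continue;
        printf("segment %d: file offset 0x%06x -> vaddr 0x%08x, flags %c%c%c\n",
               i, (unsigned)phdr[i].p_offset, (unsigned)phdr[i].p_vaddr,
               (phdr[i].p_flags & PF_R) ? 'r' : '-',
               (phdr[i].p_flags & PF_W) ? 'w' : '-',
               (phdr[i].p_flags & PF_X) ? 'x' : '-');
    }
}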

Figure 2.3 shows the linking and loading view of a typical ELF file.

Linking View                          Loading View
------------                          ------------
ELF Header                            ELF Header
Program Header Table (optional)       Program Header Table
Section 1                             Segment 1
Section 2                             Segment 2
Section 3                             ...
...                                   Segment n
Section n
Section Header Table                  Section Header Table (optional)

Figure 2.3: Linking and Loading view of an ELF executable

Next, we describe some special sections and how they contribute to linking one ELF file with another. A section contains either executable code, or program data to be used by code, or control information required to provide the linking view. Code resides in the following three sections:

• .text It contains the main executable instructions.

• .init It contains the executable instructions that contribute to the initialization of the process image. When a program starts to run, the code in this section of the executable is executed before the control passes to the entry point of the program, i.e. main() for C programs. In addition, the code in .init sections of all the dynamically loadable libraries that are linked to the executable is also executed before passing control to main().

• .fini It contains the code to be executed when the program exits normally.

Data that is used by the code in the above sections normally resides in the following sections; a short example of where variables land follows the list.

• .data This section contains initialized global data.

• .bss This section contains uninitialized global data. This section does not occupy any space in the file; its contents are set to zero while forming the process image.
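A two-line example of our own making shows these rules in action; compiling it and inspecting the object with readelf -S -s confirms that the first variable occupies file space in .data while the second is only accounted for in .bss.

int counter_start = 100;   /* initialized global: placed in .data, occupies file space */
int counter_now;           /* zero-initialized global: placed in .bss, no file space   */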

A number of sections are used for linking of the program. Linking two object files basically involves resolving the symbols that are defined in one object file and used in another. The symbols may either be functions or just global data. To support linking, ELF files have a special symbol table that contains a list of all the symbols used or globally defined in the file. There are typically two symbol tables viz. .dynsym and .symtab. .dynsym is used exclusively by the dynamic linker while .symtab contains entries for static linking.

2.3.3 Dynamic Linking

For dynamic linking, the ELF linker primarily uses two processor-specific tables: the Global Offset Table (contained in the .got section) and the Procedure Linkage Table (contained in the .plt section).

• Global Offset Table (.got) ELF linkers support position independent code (PIC) through the .got in each shared library. The .got contains absolute addresses of all the static data referenced in the program. The address of the .got itself is normally kept in a register (ebx) and is computed as an offset relative to the code that references it.

• Procedure Linkage Table (.plt) Both the executables that use shared libraries and the shared libraries themselves have a .plt. Just as the .got redirects position-independent address calculations to absolute locations, the .plt redirects position-independent function calls to absolute locations.

Apart from these two tables, the linker also refers to .dynsym, which contains all of the file's imported and exported symbols, .dynstr, which contains name strings for the symbols, .hash, which contains the hash table that the runtime linker can use to look up symbols quickly, and .dynamic, which is a list of tagged values and pointers. In the .dynamic section, the important tag types are listed below; a sketch of walking the .dynamic array follows the list.

• DT_NEEDED This element holds the string table offset of a null-terminated string, giving the name of a needed library. The offset is an index into the table recorded in the DT_STRTAB entry.

• DT_HASH This element holds the address of the symbol hash table, which refers to the symbol table referenced by the DT_SYMTAB element.

• DT_STRTAB This element holds the address of the string table.

• DT_SYMTAB This element holds the address of the symbol table.
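On ELF systems the link editor exposes the .dynamic array to the program itself through the symbol _DYNAMIC, so these tags can be inspected at run time. The sketch below is ours, written for a 32-bit target; it walks the array and reports the tags just described.

#include <elf.h>
#include <stdio.h>

extern Elf32_Dyn _DYNAMIC[];   /* provided by the link editor on ELF systems */

/* Walk our own .dynamic section, which is terminated by a DT_NULL entry. */
int main(void)
{
    for (Elf32_Dyn *d = _DYNAMIC; d->d_tag != DT_NULL; d++) {
        switch (d->d_tag) {
        case DT_NEEDED:
            printf("DT_NEEDED: string table offset %u\n", (unsigned)d->d_un.d_val);
            break;
        case DT_STRTAB:
            printf("DT_STRTAB: string table at 0x%08x\n", (unsigned)d->d_un.d_ptr);
            break;
        case DT_SYMTAB:
            printf("DT_SYMTAB: symbol table at 0x%08x\n", (unsigned)d->d_un.d_ptr);
            break;
        case DT_HASH:
            printf("DT_HASH:   hash table at 0x%08x\n", (unsigned)d->d_un.d_ptr);
            break;
        }
    }
    return 0;
}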

Linux uses lazy linking to resolve function addresses, i.e. a function is resolved only when it is called. This is because resolving all functions at load time imposes a performance penalty: the dynamic linker would have to resolve functions that may never be called from the executable. This penalty becomes more evident in the case of large libraries like glibc that contain a huge number of functions. The idea is to invoke the dynamic linker to resolve a function reference

only when the function call is made. A function call thus transfers control to the code in .plt, which in turn does the following:

• Invoke the dynamic linker if this is the first call to the function. The dynamic linker resolves the function address and returns control to the function directly. In addition, it fixes up the address of the function to be used in later calls.

• If this is not the first call to the function, the address of the function is already available; call the function directly.

Dump of assembler code for function main:
    ...
    0x08048434:  mov   0x4(%ebp),%eax
    0x08048437:  mov   %eax,0x4(%esp)
    0x0804843b:  movl  $0x80485f1,(%esp)
    0x08048442:  call  0x80482b4 <printf@plt>
    ...

Disassembly of section .plt:

0x08048294 <__libc_start_main@plt-0x10>:          # first .plt entry
    0x08048294:  pushl 0x0804970c                 # push pointer to link map (second .got entry)
    0x0804829a:  jmp   *0x08049710                # jump to the dynamic linker (third .got entry),
                                                  # which looks up the symbol and fixes the .got

0x080482a4 <__libc_start_main@plt>:
    0x080482a4:  jmp   *0x08049714
    0x080482aa:  push  $0x0
    0x080482af:  jmp   0x08048294 <_init+0x18>

0x080482b4 <printf@plt>:
    0x080482b4:  jmp   *0x08049718                # .got entry for printf(): initially points back
    0x080482ba:  push  $0x8                       # to the next instruction; after the first call
    0x080482bf:  jmp   0x08048294 <_init+0x18>    # it points to printf() in libc

The .got (at 0x08049708) begins with a pointer to the .dynamic section, followed by the pointer to the link map and the pointer to the dynamic linker, and then one entry per function.

Figure 2.4: Calling an externally defined function using .plt and .got

Now, we describe how the .plt code invokes the dynamic linker or the function. Figure 2.4 shows a typical .plt entry for the function printf() and describes how the .plt and .got sections contribute to making a function call. The first three entries of the .got hold special values. The first entry points to the dynamic structure, i.e. the .dynamic section. This is because the dynamic linker needs to locate its own dynamic structure without having yet processed its relocation entries. The second .got entry points to the link map of the object at runtime. The link map of an ELF object is like a handle for that object, available to the dynamic linker. The link map contains information like the base address at which the object is loaded, references to the dynamic structure and the symbol and hash tables for symbol lookup, and pointers to the link maps of other objects that this object depends on. There is plenty of other interesting information in the link map, but we shall not delve any deeper into it here. This entry is initially 0 and is set by the dynamic loader before transferring control to the program. The third entry contains the address of the dynamic linker's fixup function (_dl_runtime_fixup), which is used to relocate data at runtime. This too is set by the dynamic linker before initially transferring control to the program. The .got entry corresponding to printf() initially points to the instruction push $0x8 in the .plt entry for printf(). When the .plt code gets executed for the first time, it pushes the offset of the relocation entry in .rel.plt

corresponding to printf() on the stack. Then it pushes a pointer to the link map and calls the dynamic linker. The dynamic linker unwinds the stack and processes the relocation request. The relocation offset points to the .got entry corresponding to printf(). The dynamic linker searches for printf() in the other objects listed in the link map and puts the correct address in the .got entry. Then, it passes control directly to the required function. As the .got entry is now fixed, further calls to printf() need not go through the dynamic linker.

2.4 Memory Corruption Attacks

Software programmers often make certain assumptions about the input that is expected from the user. Memory corruption vulnerabilities arise when external input does not meet such assumptions and is used without any validation. By providing a kind of input that the programmer did not expect, an attacker can cause the program to execute malicious code. The most commonly exploited memory corruption vulnerabilities include buffer overflows, integer overflows and format string bugs. Before discussing the tools and approaches available for detection of memory corruption attacks, it is necessary to understand these attacks. In this section, we briefly describe some of the most commonly exploited memory corruption attacks.

2.4.1 Buffer Overflows

A buffer overflow, or buffer overrun, is an anomalous condition where a process attempts to store data beyond the boundaries of a fixed-length buffer. The result is that the extra data overwrites adjacent memory locations. The overwritten data may include other buffers, variables and program flow data, and the overflow may result in erratic program behavior, a memory access exception, program termination (a crash), incorrect results or, especially if deliberately caused by a malicious user, a possible breach of system security. Buffer overflows can be triggered by inputs specifically designed to execute malicious code or to make the program operate in an unintended way. As such, buffer overflows cause many software vulnerabilities and form the basis of many exploits.

        A (8-byte string buffer)              B (two-byte integer)
    [ 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 ]         [ 0x00 | 0x03 ]

after strcpy(A, "excessive"):

    [ e | x | c | e | s | s | i | v ]         [ 'e'  | 0x00 ]

Figure 2.5: A Buffer Overflow

A buffer overflow occurs when data written to a buffer, due to insufficient bounds checking, corrupts data values in memory addresses adjacent to the allocated buffer. Most commonly this occurs when copying strings of characters from one buffer to another. In figure 2.5, a program has defined two data items which are adjacent in memory: an 8-byte-long string buffer, A, and a two-byte integer (a short int), B. Initially, A contains nothing but zero bytes, and B contains the number 48 (shown in the figure as 0x00 followed by 0x03). Characters are one byte wide. Now, the program attempts to store the character string “excessive” in the A buffer, followed by a zero byte to mark the end of the string. By not checking the length of the string, it overwrites the value

of B. Although the programmer did not intend to change B at all, B's value has now been replaced by a number formed from part of the character string. In this example, on a little-endian2 system that uses ASCII, "e" (ASCII value 101 or 0x65) followed by a zero byte would become the number 101. If B was the only other variable data item defined by the program, writing an even longer string that went past the end of B could cause an error such as a segmentation fault, terminating the process. A buffer overflow is classified as a stack overflow or heap overflow depending on the region of memory in which the overflow occurs. Also, the techniques to exploit a buffer overflow vulnerability vary per architecture, operating system and memory region. For example, exploitation on the heap is very different from exploitation on the call stack. In this section, we explain different types of buffer overflows and their modes of exploitation.
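Before looking at each kind in turn, the A/B example above can be reproduced in a few lines of C. This sketch is our addition: it deliberately triggers the overflow for demonstration (formally undefined behavior), uses a packed struct (a GCC extension) to force A and B to be adjacent, and initializes B to 3 for illustration. On a little-endian machine the second line of output is 101, exactly as described above.

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Force A and B to be adjacent in memory, as in Figure 2.5. */
    struct __attribute__((packed)) {
        char A[8];
        unsigned short B;
    } mem = { { 0 }, 3 };

    printf("B before: %u\n", mem.B);   /* prints 3 */
    strcpy(mem.A, "excessive");        /* 10 bytes ("excessive" + NUL) into an 8-byte field */
    printf("B after : %u\n", mem.B);   /* prints 101: 'e' (0x65) became B's low byte */
    return 0;
}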

Stack Overflows

A stack overflow refers to a buffer overflow in the stack region. Since the stack is typically used for storage of local variables, a stack overflow occurs when a process tries to write past the end of a local variable, usually an array. The excess bytes spill over and overwrite other local variables and the stored frame pointer and return address. By carefully crafting a malicious input, an attacker can overwrite the stored return address so that the new return address points to the injected code itself. This results in execution of the injected code when the function returns. If the vulnerable program is running with root privileges, the injected malicious code also executes with root privileges.

void func(char *str)
{
    char buffer[64];
    strcpy(buffer, str);
}

int main(int argc, char **argv)
{
    func(argv[1]);
    return 0;
}

Figure 2.6: A sample program vulnerable to stack overflow.

Consider the code fragment given in figure 2.6. Here the function func() copies unchecked user input into a 64-byte buffer and is therefore vulnerable to stack overflow. The stack frame of func() when it is called by main() is shown in figure 2.7. For an attacker to completely overwrite the stored return address, he must inject a 72-byte input that overwrites the 64-byte buffer, the stored frame pointer (4 bytes) and the stored return address (4 bytes). The crafted input usually contains the malicious code that the attacker wishes to execute, followed by the new return address. This input is copied by strcpy() to the local variable buffer, resulting in corruption of the stored return address on the stack. When the function returns, it jumps to the new stored return address, which points to the injected code, resulting in execution of malicious code.
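The 72-byte input just described has a fixed layout, sketched below. This is our illustration only: the buffer address is a hypothetical placeholder and NOP bytes (0x90) stand in for the injected code.

#include <string.h>

unsigned char input[72];
unsigned int  buffer_addr = 0xbffff6c0;  /* hypothetical address of 'buffer' on the stack */

void build_input(void)
{
    memset(input, 0x90, 64);              /* bytes 0-63:  where the injected code would sit  */
    memset(input + 64, 'A', 4);           /* bytes 64-67: overwrite the stored frame pointer */
    memcpy(input + 68, &buffer_addr, 4);  /* bytes 68-71: new return address -> buffer       */
}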

Shellcode. To accomplish anything, an attack needs to execute some code that does what the attacker wants. This code is usually provided by the attacker and injected through function arguments, input data or system variables. The most common piece of code used is shellcode.

2 Endianness defines the byte (and bit) ordering of a multibyte data item when it is stored in memory. In little-endian architectures such as Intel, the higher byte is stored at the higher memory address and the lower byte at the lower memory address. In big-endian architectures such as Motorola, the lower byte is stored at the higher memory address and the higher byte at the lower memory address.

(a.) Before strcpy():                  (b.) After strcpy():
    buffer                                 buffer containing malicious code
    stored frame pointer                   overwritten stored frame pointer
    stored return address                  overwritten stored return address
    *str                                   *str
    rest of the stack                      rest of the stack
(low memory addresses at the top; the write proceeds from low towards high addresses, while the stack grows downwards)

Figure 2.7: Stack frame for the function func() (a.) before and (b.) after strcpy(). An attacker injects a large input that overwrites the stored return address on the stack, making it point to the injected code.

Shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability. It is called shellcode because it typically starts a command shell from which the attacker can control the compromised machine. It is commonly written in machine code, but any piece of code that performs a similar task can be called shellcode.

Heap Overflows

The heap is used for dynamic allocation of memory at program runtime, and its allocation and deallocation are handled in the C library by functions such as malloc(), calloc(), realloc() and free(). The strength and popularity of heap overflow exploits come from the way specific memory allocation functions are implemented within the individual programming languages and underlying operating platforms. Many common implementations store control data in-line together with the actual allocated memory. This allows an attacker to potentially overflow specific sections of memory in such a way that this control data, when later used by a function such as free(), will allow the attacker to overwrite virtually any location in memory with the data he wants. In order to understand how this can be achieved, we first describe one of the most common implementations of heap-managing algorithms, used by the GNU C library in Linux and known as Doug Lea's malloc or dlmalloc.

GNU C Library Implementation. The GNU C library keeps the information about the memory slices the application requests in so-called chunks. Each chunk consists of boundary tags and the memory allocated to the user. Boundary tags are fields containing information about the size of this chunk and the previous chunk. Figure 2.8 (a.) shows the boundary tags of an allocated chunk. Here P is the pointer returned by malloc(). The prev_size element has a special function: if the chunk before the current one is unused (it was free'd), it contains the length of that chunk. Otherwise, if the chunk before the current one is in use, prev_size is part of its data, saving four bytes. The size field contains the length of the current block of memory, the data section. When malloc() is called, four is added to the size passed to it and afterwards the size is

padded up to the next double-word boundary. So a malloc(7) will become a malloc(16), and a malloc(20) will become a malloc(32). For malloc(0) it will be padded to malloc(8). Since this padding implies that the lower three bits are always zero and are not used for the real length, they are used to indicate special attributes of the chunk. The lowest bit, called PREV_INUSE, indicates whether the previous chunk is in use: it is set if the previous chunk is in use. To test whether the current chunk is in use, the next chunk's PREV_INUSE bit within its size value is checked. The second least significant bit is set if the memory area is mmap'ed. The third least significant bit is unused.
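The rounding rule and the flag bits can be restated compactly. The helpers below are our paraphrase of the behavior just described (dlmalloc's real request2size macro differs in detail); they reproduce the examples in the text: 7 -> 16, 20 -> 32, 0 -> 8.

#include <stddef.h>

#define PREV_INUSE 0x1                   /* previous chunk is in use          */
#define IS_MMAPPED 0x2                   /* chunk was obtained through mmap() */
#define SIZE_BITS  (PREV_INUSE | IS_MMAPPED)

/* Add 4 bytes of boundary-tag overhead, then pad up to the next
   double-word boundary: req2size(7) == 16, req2size(20) == 32,
   req2size(0) == 8. */
static size_t req2size(size_t req)
{
    return ((req + 4) | 7) + 1;
}

/* The real length of a chunk, with the flag bits masked off. */
static size_t chunksize(size_t size_field)
{
    return size_field & ~(size_t)SIZE_BITS;
}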

(a.) Allocated chunk:                   (b.) free'ed chunk:
    chunk P -> prev_size                    chunk P -> prev_size
               size                                    size
    P       -> user data                    P       -> fd (to next free chunk in bin)
                                                       bk (to previous free chunk in bin)
    chunk Q -> prev_size                    chunk Q -> prev_size
               size                                    size
               user data                               user data

Figure 2.8: (a.) Allocated Chunk, and (b.) free'ed chunk

Once we free() the chunk, using free(P), some checks take place and the memory is released. If its neighboring blocks are free too (checked using the PREV_INUSE flag of the next chunk), they will be merged to keep the number of reusable blocks low but their sizes as large as possible. The resulting chunk is then placed in a bin, which is a doubly linked list of free chunks of a certain size. If a merge is not possible, the next chunk is tagged with a cleared PREV_INUSE bit, and the chunk being free'ed changes a bit: two new values appear where the user data was previously stored. Those two values, called fd and bk (forward and backward), are pointers to the neighboring chunks in the bin. Figure 2.8 (b.) shows the modified tags of a chunk when it is free'ed. Every time a new free() is issued, the bins are checked, and possibly unconsolidated blocks are merged. The whole memory gets defragmented from time to time to release some space.

When free() needs to take a free chunk P off its list in a bin, it replaces the bk pointer of the chunk next to P in the list with the pointer to the chunk preceding P in this list. The fd pointer of the preceding chunk is replaced with the pointer to the chunk following P in the list. The free() function calls the unlink() macro for this purpose. The unlink() macro is important from an attacker's point of view. Rephrasing its functionality, it does the following to the chunk P: the address (or any data) contained in the back pointer of the chunk is written to the location stored in the forward pointer plus 12. Figure 2.9 illustrates this process. If an attacker is able to overwrite these two pointers and force the call to unlink(), he can overwrite any memory location with anything he wants.

#define unlink(P, BK, FD) {                              \
    BK = P->bk;                                          \
    FD = P->fd;                                          \
    FD->bk = BK;       /* i.e. *(P->fd + 12) = P->bk */  \
    BK->fd = FD;       /* i.e. *(P->bk + 8)  = P->fd */  \
}

Figure 2.9: The unlink() macro and its equivalent code

Exploitation. Now suppose a program under attack has allocated two adjacent chunks of memory, P and Q, and has a buffer overflow condition that allows an attacker to overflow the first of them (chunk P). Overflowing the first chunk leads to overwriting the following chunk (chunk Q). The attacker constructs the overflowing data in such a way that when free(P) is called, free() will decide that the chunk after P (not necessarily chunk Q) is free and will try to consolidate P with the next chunk. This is done by crafting two fake chunks, R and S, in Q's memory space. Fake chunk R is constructed in such a way that its size value tricks free() into believing that the next chunk is S. The fake chunk S is constructed in such a way that its PREV_INUSE bit is 0. This tricks free() into believing that chunk R is free, so it will try to consolidate chunks P and R. Now when free(P) is called, it will check if the next chunk is free. It will do so by looking into the boundary tag of the first fake chunk R. The size field from this tag will be used to find the next chunk, which is also constructed by the attacker. The next fake chunk S's PREV_INUSE bit is 0, so the function will decide that R is free and will call unlink(R). This results in the desired location being overwritten with the data specified by the attacker. Figure 2.10 shows this process.

Before overflow:                        After overflow:
    chunk P -> prev_size                    chunk P -> prev_size
               size of P                               size of P
    P       -> user data                    P       -> user data
    chunk Q -> prev_size                    fake chunk R -> prev_size
               size of Q                                    size of R
    Q       -> user data                    Q            -> fd
                                                            bk
                                            fake chunk S -> prev_size of R
                                                            size with PREV_INUSE=0
                                                            fd
                                                            bk

Figure 2.10: An attacker constructs fake chunks by overflowing a chunk

2.4.2 Format String Attacks

Format-string exploits are a relatively new class of exploit. Like buffer-overflow exploits, the ultimate goal of a format-string exploit is to overwrite data in order to control the execution flow of a privileged program. Format-string exploits also depend on programming mistakes that may not appear to have an obvious impact on security. Luckily for programmers, once the technique is known, it is fairly easy to spot format-string vulnerabilities and eliminate them. But first some background on format strings is needed.

Format Strings and printf(). Format strings are used by format functions, like printf(). These are functions that take a format string as the first argument, followed by a variable number of arguments that are dependent on the format string. Since the number of arguments can vary, at program run time there is no way for the format function definition to know the number and type of arguments passed to it. To overcome this, the format string specifies the number and type of the arguments that follow. The format string consists of a number of format specifiers, such as %s, %d, %c, etc., that specify how the corresponding argument is to be interpreted. For example, in printf("%d %s", a, b), the format string "%d %s" specifies that two arguments follow the format string argument, since there are two format specifiers %d and %s. It also specifies that the first argument a is to be interpreted as a signed integer and the second, b, as the address of a string. The format function simply evaluates the format string passed to it and performs a special action each time a format parameter is encountered. Each format parameter expects an additional variable to be passed, so if there are three format parameters in a format string, there should be three additional arguments to the function, in addition to the format-string argument.


Figure 2.11: Stack layout when printf() is called. (a.) On giving a legitimate input, the program prints HELLO. (b.) On giving a malicious input (“%x%x%x%x”), the program prints 44415441 (hex equivalent of DATA).

Exploitation. Format string vulnerabilities occur when programmers pass user-supplied input to a format function as the format string argument, i.e. using code constructs such as printf(argv[1]) instead of printf("%s", argv[1]). The format string argument is scanned by the format function for the presence of format specifiers, and for each format specifier an argument

is picked from the stack and processed. The way an argument is processed depends on the format specifier. To exploit a format string vulnerability, a malicious user can insert format specifiers in the input. For example, passing a string of %x or %s format specifiers would cause the vulnerable call above to dump the contents of the stack. Figure 2.11 illustrates this process. Direct parameter access, i.e. %N$d, allows an attacker to access the Nth argument on the stack without accessing the previous N-1 arguments. The value of N is chosen by the attacker such that the corresponding address is picked from a location on the stack whose contents the attacker controls, often the format string itself. This allows an attacker to read any memory location within the address space of the process. The most common form of the attack makes use of the %n format specifier, whose purpose is to write the number of bytes printed so far to the address of a specified integer. For example, printf("ABCD%n", &x) will write 4 to x. The number of bytes printed so far can easily be controlled by printing an integer with a large amount of padding, such as %656d. Using a %n format specifier, an attacker can overwrite the stored return address on the stack with the address of his own code, taking control of the program when the function returns.
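The contrast between the safe and the vulnerable call, and the effect of %n, can be condensed into a short demonstration of our own:

#include <stdio.h>

int main(int argc, char **argv)
{
    int x = 0;

    if (argc > 1) {
        printf("%s", argv[1]);  /* safe: the input is treated purely as data       */
        printf(argv[1]);        /* vulnerable: the input becomes the format string */
    }

    printf("ABCD%n", &x);       /* prints "ABCD" and stores 4, the byte count, in x */
    printf("\nx = %d\n", x);    /* prints: x = 4 */
    return 0;
}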

2.4.3 Integer Overflows

An integer, in the context of computing, is a variable capable of representing a whole number, with no fractional part. Integers are typically the same size as a pointer on the system they are compiled on (i.e. on a 32-bit system, such as i386, an integer is 32 bits long; on a 64-bit system, such as SPARC, an integer is 64 bits long). Since it is often necessary to store negative numbers, a mechanism is needed to represent them. This is accomplished by using the most significant bit (MSB) of a variable to determine the sign: if the MSB is set to 1, the variable is interpreted as negative; if it is set to 0, the variable is positive. Since an integer has a fixed size, there is a fixed maximum value it can store. When an attempt is made to store a value greater than this maximum value, it is known as an integer overflow. Integer overflows cannot be detected after they have happened, so there is no way for an application to tell if a result it has calculated previously is correct. This can be dangerous if the calculation has something to do with the size of a buffer or how far into an array to index. Most integer overflows are not exploitable because memory is not being directly overwritten, but sometimes they lead to other classes of bugs, frequently buffer overflows.

Since an integer overflow is the result of an attempt to store a value in a variable which is too small to hold it, the simplest example can be demonstrated by assigning the contents of a large variable to a smaller one. Such an overflow is known as a width overflow. If an attempt is made to store a value in an integer which is greater than the maximum value the integer can hold, the value will be truncated. If the stored value is the result of an arithmetic operation, any part of the program which later uses the result will run incorrectly, as the result of the arithmetic is incorrect. Such overflows are known as arithmetic overflows. Another class of bugs, called signedness bugs, occurs when an unsigned variable is interpreted as signed, or when a signed variable is interpreted as unsigned. This type of behavior is possible because, internally to the processor, there is no distinction between the way signed and unsigned variables are stored. Signedness bugs can also be caused by integer overflows; this can happen when an integer overflows so that it wraps around to a negative number. Figure 2.12 illustrates these bugs.

/* width overflow */
int a = 0x12345678;
char b;
b = a;

/* arithmetic overflow */
unsigned int a = 0x7fffffff;
unsigned int b = 0x7fffffff;
unsigned int c = a + b;

/* signedness bug caused by an arithmetic overflow */
unsigned int a = 0xffffffff;
a++;

Figure 2.12: Different types of integer overflows: a width overflow, an arithmetic overflow and a signedness bug caused by an arithmetic overflow
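To make the exploitable case concrete, the sketch below (hypothetical code, not from the thesis) shows a signedness bug becoming a buffer overflow: a negative length passes a signed bounds check but converts to a huge unsigned value when handed to memcpy():

#include <string.h>

#define BUFSIZE 64

/* len comes from untrusted input, e.g. a packet header field */
void copy_data(const char *src, int len)
{
    char buf[BUFSIZE];
    if (len < BUFSIZE)          /* signed comparison: len = -1 passes     */
        memcpy(buf, src, len);  /* len converts to size_t: a ~4 GB copy,  */
                                /* overflowing buf                        */
}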

2.4.4 Double Free Attacks

Another way to exploit the dlmalloc memory manager arises when a programmer mistakenly frees a pointer that has already been freed. For a double-free error, the ideal exploit conditions are as follows. A memory block A of size S is allocated. It is later freed with free(A), and forward or backward consolidation is applied, creating a larger block. Then a larger block B is allocated in this space; since dlmalloc prefers recently freed space for new allocations, the next call to malloc() with the proper size will reuse it. An attacker-supplied buffer is copied into B so that it creates an "unallocated" fake chunk in memory after or before the original chunk A. The same technique described earlier is used for constructing this chunk. The program then calls free(A) again, triggering backward or forward consolidation of memory with the fake chunk and resulting in a write to a location of the attacker's choice.
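Stripped of the heap grooming described above, the programming error itself looks like this minimal sketch:

#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *A = malloc(512);    /* block A of size S                     */
    free(A);                  /* first free: A may be consolidated     */

    char *B = malloc(1024);   /* B reuses the recently freed space     */
    memset(B, 'X', 1024);     /* attacker-controlled data would build  */
                              /* a fake chunk inside B here            */

    free(A);                  /* double free: consolidation now        */
                              /* operates on the attacker's fake chunk */
    return 0;
}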

2.4.5 Globbing Vulnerabilities

glob() implements filename pattern matching, following rules similar to those used by Unix shells. It is a pathname generator, which accepts an input pattern representing a set of filenames and returns a list of accessible pathnames matching that pattern. The input pattern is specified using special metacharacters, taken from the set * ? [ ] { } ~. For example, a pattern of '/e*' would match all directories and files in the root of the file system that begin with the character 'e'. The ability of a remote or local user to deliver input patterns to glob() implementations allows for two general types of security exposures.

1. glob() expansion vulnerabilities. A number of vulnerabilities result from a program assuming that the length of the user input is limited to the number of characters read in from an external source. This assumption is problematic because most such programs, e.g. FTP daemons, contain a parser rule for processing pathnames beginning with a tilde (~). The intended effect of this rule is to replace the tilde directory component with the referenced home directory. However, since this is performed by running the string through the glob() function, the vulnerable program will also expand any other wildcard characters present. This allows user input to exceed the number of characters read in from the external source, which can make otherwise benign unbounded string operations exploitable (see the sketch after this list).

2. glob() implementation vulnerabilities. Certain glob() implementations contain buffer overflows in their internal utility functions. These overflows are typically triggered by requesting a pattern that expands to a very large pathname, or by submitting a pattern that the attacker intends the vulnerable program to run through glob() several times.
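As a sketch of the expansion issue (hypothetical code; the buffer size is chosen for illustration), the program below reads a short pattern but receives arbitrarily long pathnames back from glob(); any unbounded copy of the results would overflow a buffer sized for the raw input:

#include <glob.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    glob_t g;
    char path[64];        /* sized for the short raw pattern...          */

    if (argc > 1 && glob(argv[1], 0, NULL, &g) == 0) {
        for (size_t i = 0; i < g.gl_pathc; i++) {
            /* ...but each expanded pathname may be far longer than the  */
            /* pattern; strcpy() here would be the exploitable operation */
            strncpy(path, g.gl_pathv[i], sizeof(path) - 1);
            path[sizeof(path) - 1] = '\0';
            printf("%s\n", path);
        }
        globfree(&g);
    }
    return 0;
}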

2.5 Attack Targets

In all memory corruption attacks, an attacker gains control of the execution of a process by overwriting a critical structure in memory. In this section, we describe the targets, i.e. the critical structures

that can be overwritten to exploit a memory corruption vulnerability.

Stored Return Address. Under normal circumstances, any function that is called must return to its caller. The call instruction pushes the return address, i.e. the address of the instruction following the call, onto the stack and jumps to the callee. Overwriting the stored return address on the stack causes execution to resume at a different address when the function returns. An attacker can thus overwrite the stored return address with the address of his own code, which will be executed when the function returns.

Stored Frame Pointer. The frame pointer, held in the EBP register, points to the base of the current stack frame. When a function is called, it saves the old frame pointer on the stack before making EBP point to the base of its own stack frame. By overwriting a stored frame pointer, an attacker can set up a fake stack frame, which will be used for referring to local variables and function arguments once the currently executing function returns. He can thus cause the program to execute with arbitrary data.

Function Pointers. A program may use function pointers that are dereferenced at run time. By overwriting a function pointer, an attacker can make execution jump to his own code the next time the function is invoked through that pointer.
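A sketch of the exposure (hypothetical layout; actual adjacency depends on the compiler): a function pointer stored after an overflowable buffer can be redirected before it is ever dereferenced:

#include <stdio.h>
#include <string.h>

void greet(void) { printf("hello\n"); }

struct handler {
    char name[32];
    void (*fn)(void);       /* sits after the buffer in this struct */
};

int main(int argc, char **argv)
{
    struct handler h = { "default", greet };
    if (argc > 1)
        strcpy(h.name, argv[1]);  /* overflow can overwrite h.fn      */
    h.fn();                       /* jumps wherever h.fn now points   */
    return 0;
}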

Longjmp Buffers. Programs may use functions that save stack context in a jump buffer for non- local jumps, such as setjmp() and sigsetjmp(). Such functions are used for dealing with errors and interrupts. Later, if the program encounters an error, it can use longjmp() or siglongjmp() to return to the saved context. An attacker can overwrite the jump buffer with his own data, which can allow him to control the execution when the program tries to restore the context.
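The targeted pattern looks like the following sketch; if an overflow can reach the jmp_buf between the setjmp() and the longjmp(), the restored context is attacker-controlled (the adjacency of the buffer and env here is hypothetical):

#include <setjmp.h>
#include <stdio.h>
#include <string.h>

static jmp_buf env;            /* saved stack context: a prime target */

void process(const char *input)
{
    char buf[64];
    strcpy(buf, input);        /* an overflow of a nearby buffer may  */
                               /* corrupt env before it is restored   */
    if (buf[0] == 'X')
        longjmp(env, 1);       /* restores the (possibly corrupted)   */
}                              /* context                             */

int main(int argc, char **argv)
{
    if (setjmp(env) != 0) {    /* error path taken after longjmp()    */
        printf("error path\n");
        return 1;
    }
    if (argc > 1)
        process(argv[1]);
    return 0;
}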

Global Offset Table (GOT) entries. The Global Offset Table (GOT) holds the addresses of library functions. For each library function that the program uses, there is an entry in the GOT that holds its address. Because these addresses are resolved at run time, and the dynamic linker writes the resolved addresses into the GOT so that they need not be resolved again, the GOT must be writable. An attacker can overwrite an entry in the GOT with the address of his own code, so that when the corresponding library function is called, execution jumps to the attacker's code. For example, overwriting the GOT entry for printf() causes execution to jump to the specified address when printf() is called.

.dtors entries. Constructors are functions that are called before main() starts executing; destructors are functions that are called when main() returns. Every program, irrespective of whether it defines any constructors or destructors, has a .ctors and a .dtors section, tables that hold the addresses of the defined constructors and destructors respectively. These sections are part of the data segment at run time and are therefore marked writable. By overwriting an entry in the .dtors table with the address of his own code, an attacker can cause the program to execute his code when main() returns.
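The mechanism being abused is ordinary toolchain behavior, as the legitimate example below shows; the attack consists of overwriting the table slot that the runtime dereferences after main() returns. (The destructor attribute is GCC-specific, and modern toolchains use .fini_array rather than .dtors.)

#include <stdio.h>

/* GCC records a pointer to cleanup() in the destructor table */
__attribute__((destructor))
static void cleanup(void)
{
    printf("called after main() returns\n");
}

int main(void)
{
    printf("main() body\n");
    return 0;   /* the runtime walks the destructor table next */
}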

2.6 Attack Variations

In real-world exploits, a variety of issues must be overcome for the exploit to operate reliably. Null bytes in addresses, variability in the location of shellcode, differences between different

environments and various counter-measures in operation make it harder for an attacker to exploit a vulnerability. To overcome these obstacles, attackers use several techniques to make their exploits work reliably. In this section, we discuss some of these techniques.

2.6.1 Return into libc Attacks

Most applications never need to execute anything on the stack, so an obvious defense against buffer overflow exploits is to make the stack non-executable. When this is done, shellcode existing anywhere on the stack is essentially useless. This type of defense stops the majority of exploits in the wild, and it is becoming more popular; the latest version of OpenBSD has a non-executable stack by default. Of course, there is a corresponding technique that can be used to exploit programs in an environment with a non-executable stack, known as returning into libc. Libc is the standard C library, containing basic functions such as printf() and exit(). These functions are shared, so any program that uses printf() directs execution into the appropriate location in libc. An exploit can do exactly the same thing and direct a program's execution into a chosen libc function. The functionality of the exploit is limited to what the functions in libc provide, a significant restriction compared to arbitrary shellcode; however, nothing is ever executed on the stack.

(Figure: stack layout before and after the overflow. The buffer grows toward the stored frame pointer and return address; after the overflow, the stored return address is replaced with the address of execl() in libc, followed by a fake return address and an argument pointing to the string /bin/sh.)

Figure 2.13: A return into libc attack

Figure 2.13 illustrates a return into libc attack. An attacker overflows a stack buffer, overwriting the stored return address with the address of the libc function execl(). The input is constructed as follows: the buffer contents, followed by the address of the library function (which overwrites the stored return address), followed by a fake return address, followed by the arguments to the libc function. The fake return address is required because the called function assumes there is a return address on the top of the stack, to be used to return to its caller. The arguments are still present on the stack, but no code is executed on the stack. The example shown here calls only a single function in libc. It is even possible to chain returns into several libc functions: the address of another libc function is used in place of the fake return address, so that the first libc call returns into the second one. However, we shall not delve any deeper into

this.
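A sketch of how the single-call payload of Figure 2.13 might be assembled (all addresses here are hypothetical placeholders; a real exploit must locate execl() and a "/bin/sh" string in the target's address space, and execl()'s argument handling is simplified):

#include <string.h>

/* hypothetical addresses, found e.g. with a debugger on the target */
#define EXECL_ADDR 0x40021000u   /* address of execl() in libc      */
#define FAKE_RET   0x41414141u   /* unused unless execl() returns   */
#define BINSH_ADDR 0x400fe000u   /* address of a "/bin/sh" string   */

/* dist_to_ret is the distance from the buffer start to the stored  */
/* return address; the caller must supply a sufficiently large p.   */
size_t build_payload(unsigned char *p, size_t dist_to_ret)
{
    unsigned int words[3] = { EXECL_ADDR, FAKE_RET, BINSH_ADDR };
    size_t off = 0;

    memset(p, 'A', dist_to_ret);            /* filler up to ret addr */
    off += dist_to_ret;
    memcpy(p + off, words, sizeof(words));  /* overwritten ret addr, */
    off += sizeof(words);                   /* fake ret, argument    */
    return off;
}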

2.6.2 NOP Sled

NOP is a single-byte instruction that does absolutely nothing. NOPs are sometimes used to waste computational cycles for timing purposes and are actually necessary in the SPARC processor architecture due to instruction pipelining. A NOP-sled is the oldest and most widely known technique for successfully exploiting a stack buffer overflow. It solves the problem of finding the exact address of the buffer by effectively increasing the size of the target area: much larger sections of the stack are corrupted with the NOP machine instruction. By creating a large array (or sled) of these NOP instructions and placing it before the shellcode, the attacker ensures that if execution returns to any address within the sled, the EIP will increment through the NOPs, one at a time, until it finally reaches the shellcode. As long as the return address is overwritten with any address found in the NOP sled, the EIP will slide down the sled to the shellcode, which will then execute properly. The attacker thus only has to guess the location of the comparatively large NOP-sled on the stack, rather than that of the small shellcode. Figure 2.14 illustrates the NOP-sled technique.

(Figure: payload layout: NOP sled, followed by the shellcode, followed by the repeated return address.)

Figure 2.14: NOP Sled technique
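A payload following the layout of Figure 2.14 can be assembled roughly as follows (a sketch: the sled size, shellcode and guessed address are placeholders; 0x90 is the i386 NOP opcode):

#include <string.h>

#define NOP          0x90          /* i386 no-operation opcode        */
#define SLED_SIZE    400           /* larger sled = larger target     */
#define GUESSED_ADDR 0xbffff500u   /* rough guess into the sled       */

size_t build_payload(unsigned char *p,
                     const unsigned char *shellcode, size_t sc_len,
                     size_t total_len)
{
    size_t off = 0;

    memset(p, NOP, SLED_SIZE);              /* 1. the NOP sled        */
    off += SLED_SIZE;
    memcpy(p + off, shellcode, sc_len);     /* 2. the shellcode       */
    off += sc_len;
    while (off + 4 <= total_len) {          /* 3. the guessed return  */
        unsigned int a = GUESSED_ADDR;      /*    address, repeated   */
        memcpy(p + off, &a, 4);
        off += 4;
    }
    return off;
}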

While this method greatly improves the chances that an attack will be successful, it is not without problems. Exploits using this technique must still rely on some amount of luck to guess an offset on the stack that falls within the NOP-sled region. An incorrect guess will usually crash the target program and could alert the system administrator to the attacker's activities. Another problem is that a NOP-sled large enough to be of any use requires a substantial amount of memory, which is a problem when the allocated size of the affected buffer is small and the current depth of the stack is shallow (i.e. there is not much space from the end of the current stack frame to the start of the stack). Despite its problems, the NOP-sled is often the only method that will work for a given platform, environment, or situation; as such it remains an important technique.

2.6.3 Jump to Register

The jump to register technique allows reliable exploitation of stack buffer overflows without the need for extra room for a NOP-sled and without having to guess stack offsets. The strategy is to overwrite the return pointer with something that causes the program to jump to a known pointer stored in a register, which in turn points to the controlled buffer and thus the shellcode. For example, if register EBP contains a pointer to the start of the buffer, then any jump or call taking that register as an operand can be used to gain control of the flow of execution. In practice, a program may not intentionally contain instructions to jump to a particular register. The traditional solution is to find an unintentional instance of a suitable opcode at a fixed location somewhere within the program memory. Figure 2.15 shows an example of such an unintentional instance of the i386 jmp esp instruction in a Win32 environment. The opcode for this instruction is FF E4. This two-byte sequence occurs at a one-byte offset from the start of the instruction call DbgPrint at address 0x7C941EEC, i.e. at address 0x7C941EED. If an attacker overwrites the program return address with 0x7C941EED, the program will first jump there, interpret the opcode FF E4 as the jmp esp instruction, and then jump to the top of the stack and execute the attacker's code. When this technique is

possible, the severity of the vulnerability increases considerably. This is because exploitation will work reliably enough to automate an attack with a virtual guarantee of success when it is run. For this reason, this is the technique most commonly used in Internet worms that exploit stack buffer overflow vulnerabilities.

bytes in memory:      0x7C941EEC: E8 FF E4 FE FF 56 ...
intended decoding:    0x7C941EEC: call DbgPrint
off-by-one decoding:  0x7C941EED: FF E4  =  jmp esp

Figure 2.15: Jump to register technique

2.6.4 Code Spraying Attacks

Many operating system distributions offer randomization techniques that introduce uncertainty in the addresses used by a process from one execution to the next. Such randomization makes it harder for an attacker to guess the correct address, which is the fundamental requirement for any kind of exploit, and thus offers a generic defense against memory corruption attacks. To defeat such randomization, attackers are increasingly using code spraying attacks. In many applications, such as web browsers, the amount of memory allocated can be controlled by the remote user. In such a case, a malicious user can request a huge amount of memory and replicate a NOP-sled followed by shellcode throughout this memory many times. The attacker then triggers the vulnerability, causing execution to jump into one of these NOP-sleds, thereby increasing the chances of exploitation considerably. For example, an in-the-wild exploit of the JView Profiler vulnerability uses a code spraying attack as follows. A JavaScript snippet in the malicious web page first prepares a basic memory block of 256K bytes containing a large NOP-sled and a particular shellcode. This block is then replicated to 750 other memory blocks. As a result, the shellcode (including the NOP-sled) is sprayed all over the allocated heap space of 187.5M bytes. The script then triggers the JView Profiler vulnerability (MS05-037), which results in the execution of the shellcode located somewhere in the allocated heap space. Note that randomization schemes can make the actual location of the injected shellcode (contained in the allocated heap space) hard to predict. However, the code spraying attack overcomes this challenge by populating the shellcode in a large memory space: as long as the overwritten code pointer (e.g., the return address) points somewhere inside this large memory space, the shellcode will eventually be executed. The probability that the overwritten code pointer points into one of the NOP-sleds is 750 × 256K / 2^32 ≈ 4.6%, corresponding to very low entropy (about 4 bits, taking into account that the Windows kernel occupies the upper half of the address space).

2.6.5 Non-Control Data Attacks

Almost all the exploits witnessed so far overwrite a control structure in memory, such as the stored return address on the stack, GOT entries, or function pointers. However, the same vulnerabilities can be used to overwrite program-specific, security-critical non-control data.

Non-control data attacks can corrupt a variety of application data, including user identity data, configuration data, user input data, and decision-making data. Such attacks have been studied in the past on several real-world applications, such as FTP, Telnet, SSH and HTTP servers. The success of these attacks and the variety of applications and target data suggest that potential attack patterns are diverse. Attackers currently focus on control-data attacks, but it is clear that when control-flow protection techniques shut them down, they have incentives to study and employ non-control-data attacks.
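A minimal sketch in the spirit of the attacks studied on such servers (hypothetical code): overflowing a username buffer flips an adjacent authentication flag, changing the program's decision-making without touching any code pointer:

#include <stdio.h>
#include <string.h>

struct session {
    char name[32];
    int  authenticated;       /* security-critical non-control data */
};

int main(int argc, char **argv)
{
    struct session s = { "", 0 };

    if (argc > 1)
        strcpy(s.name, argv[1]);   /* overflow can set               */
                                   /* s.authenticated to nonzero     */
    if (s.authenticated)
        printf("privileged operation permitted\n");
    else
        printf("access denied\n");
    return 0;
}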

Chapter 3

Related Work

Various methods for finding and preventing memory corruption attacks such as buffer overflows and format string attacks have been developed, including policy techniques, such as enforcing secure coding practices and mandating the use of safe languages, halting exploits via operating system extensions, statically analyzing source code to find potential vulnerabilities and detecting attacks at runtime. Each approach has its advantages; however, each also suffers from limitations. This chapter discusses some of the available approaches and highlights their strengths and weaknesses.

3.1 Security Policies and Code Reviews

Due to the large number of security vulnerabilities in software, many companies have devoted extensive resources to ensuring security. Considerable time and effort is expended on implementing and enforcing secure coding procedures, following a proper software development cycle and conducting code reviews. Microsoft has announced its renewed focus on security and claims to have performed a large-scale review of its Windows codebase as part of its Trustworthy Computing initiative. This code review involved up to 8,500 developers and lasted several months. However, bugs remain in Windows code, as evidenced by security patches for Windows Vista released after the security review. This anecdote demonstrates that it is very difficult to verify by inspection that a complex piece of software contains no faults; code reviews, no matter how thorough, will miss bugs. While policies and procedures encouraging secure coding and proper development practices aid in developing secure software, they cannot replace testing or guarantee the security of the resulting code. Some programs in the family of Unix operating systems require more privileges than the user that executes them. These are called privileged programs and are generally executed with the full privileges of the owner of the program. The lack of granularity in this model brings about a major security problem: a program that requires administrator privilege to execute one specific system call can execute any other call with the same privileges. This causes problems if an attacker is able to inject foreign code into the application and force the program to execute it using one of the memory corruption attacks. Systrace [50] introduces a way of eliminating the need to give a program full administrator privileges, instead allowing finer-grained privileges to be given to an application: which system calls can be executed and with which arguments. As defining a policy of which system calls a program may execute is complex, Systrace offers a training mode, allowing a user to run the program while Systrace learns which system calls are executed. This allows Systrace to generate a base policy that can later be refined. It is also possible for Systrace to generate policies interactively, where the user has to make a policy decision whenever an attempt to execute a system call that is not described by the current policy

is performed.

3.2 Language Approach

An alternative that guarantees the absence of buffer overflows is writing code in a safe language. Languages such as Java provide built-in bounds checking for arrays and buffers, and thus programs written in Java are immune to buffer overflows, barring errors in the JVM implementation. However, large legacy codebases have been written in C; furthermore, many developers are opposed to switching to a different language, commonly citing worse performance as the reason. Efforts have been made to create languages based on C that provide similar safety guarantees; CCured and Cyclone are good examples. CCured [45] works by executing a source-to-source transformation of C code and then compiling the resulting code with gcc. CCured performs a static analysis to determine pointer types (SAFE, SEQ or WILD). Instrumentation based on pointer type is inserted into the resulting source code whenever CCured cannot statically verify that the pointer will always be within the bounds of the buffer. This instrumentation changes pointer representation, so code compiled with CCured cannot interoperate easily with code compiled with gcc: wrapper functions are needed to add or remove metadata used by the instrumentation whenever an object passes between instrumented and uninstrumented code. Since the CCured compiler cannot handle large programs, the developer may be required to rewrite or annotate 2-3% of the code in order to enable CCured to perform the necessary transformations. At runtime, the resulting executable performs bounds checking and terminates the program with an error message if a buffer access violation is detected. Cyclone [36] is a dialect of C that has been developed to provide bounds checking for C-like programs. Like CCured, it uses a combination of static and dynamic analysis to determine pointer types and only instruments pointers that require it. In order to make proper bounds information available at runtime, Cyclone changes C's representation of pointers to include additional information, such as the bounds within which a pointer is legal. A major disadvantage of Cyclone is that existing C code needs to be ported to the Cyclone language before it can be compiled with the Cyclone compiler. In order to convert the code, a developer is required to rewrite about 10% of the code, inserting special Cyclone constructs. Furthermore, Cyclone differs enough from C that some C programmers may be reluctant to use it. The main disadvantage of such safe languages is that they tend to be slower than C and provide less control to the programmer. Another concern is the difficulty of translating the large amount of legacy C code still in use today into a safe language. A good example is sendmail, in which multiple buffer overflows have been found over the years; the effort to translate such programs into a safe language would be immense and the cost unacceptable. Some tools, such as CCured, aim for an automatic source-to-source transformation of C into a safe language; however, they generally require programmer intervention and fail to run successfully on large programs.

3.3 Safe C Libraries

A large class of overflows resulting in exploits is enabled by unsafe C library functions. Many functions in the C library are written to provide fast performance and relegate the responsibility of calculating appropriate buffer bounds to the programmer. Frequently, programmers miscalculate these bounds or do not realize that the library function can overflow the buffer. Several alternative libraries that claim to provide a safe implementation of functions in the C library have

been developed; some are discussed here. Libsafe [62] is a dynamically loaded library that intercepts calls to functions in glibc that are considered dangerous. These routines include strcpy(), strcat(), getwd(), gets(), scanf(), realpath(), sprintf() and others. The call to one of these routines is replaced with a wrapper function that attempts to compute the maximum safe size for each local buffer. The size of a buffer is computed as the distance from the start of the buffer to the saved frame pointer above it. If the size limitation allows it, the intercepted function is executed normally; otherwise, the program is terminated, thus turning a buffer overflow into a denial of service. Although Libsafe protects against a stack-smashing attack, it does not prevent overwriting of local variables, which can be used to change a code pointer indirectly. The method also fails if the executable has been compiled with the -fomit-frame-pointer option, as the frame pointer is then no longer saved on the stack; this option is frequently enabled during compilation to increase performance. Libverify [13] is an enhancement of Libsafe. During each function call, the return address is copied onto a special canary stack stored on the heap. When the function returns, the return address on the stack is compared to the address on the canary stack: if the two match, the program is allowed to proceed; otherwise, the program is terminated and an alert is issued. Libverify protects against a larger class of attacks than Libsafe; however, attacks using function pointers still remain undetected, and the lack of protection for the canary stack on the heap is a major limitation. TIED-LibsafePlus [10] is another enhancement of Libsafe. TIED (Type Information Extractor and Depositor) extracts the size information of all global and automatic buffers defined in the program from the debugging information produced by the compiler and inserts it back into the program binary as a data structure available at runtime. LibsafePlus is a dynamic library that provides wrapper functions for unsafe C library functions such as strcpy(). These wrapper functions check the source and target buffer sizes using the information made available by TIED and perform the requested operation only if it is safe to do so. For dynamically allocated buffers, the sizes and starting addresses are recorded at run time. LibFormat [53] works by intercepting calls to the printf family of functions and aborting any process if the format string is writable and contains the %n format specifier. This technique is quite effective in defending against real format string attacks, but in most cases writable format strings containing the %n format specifier are legal, and consequently it generates many false alarms.
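The flavor of Libsafe's check can be conveyed with a sketch (a simplification of its actual logic; __builtin_frame_address() is GCC-specific, and a real interposer would be loaded via LD_PRELOAD and walk the frame-pointer chain to locate the frame containing the destination buffer, whereas this sketch only checks the current frame):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

/* Sketch of a Libsafe-style bounded strcpy wrapper. */
char *safe_strcpy(char *dst, const char *src)
{
    uintptr_t fp  = (uintptr_t)__builtin_frame_address(0);
    uintptr_t d   = (uintptr_t)dst;
    size_t    len = strlen(src);

    /* refuse a copy that would cross the saved frame pointer */
    if (d < fp && d + len + 1 > fp) {
        fprintf(stderr, "overflow blocked\n");
        abort();              /* turn the attack into a denial of service */
    }
    return strcpy(dst, src);
}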

3.4 Operating System Extensions

Various methods for thwarting exploits based on buffer overflows using operating system extensions have been proposed and deployed. A commonly used tactic is inducing entropy in the process address space and the underlying instruction set. Several methods based on address space and instruction set randomization have been proposed in past years to defend against a wide range of memory corruption attacks. This section discusses some of these approaches.

3.4.1 Randomized Addresses

Most exploits expect the memory segments to always start at a specific address and attempt to overwrite the return address of a function, or some other interesting address, with an address that points into their own code. If the base address is generated randomly when the program is executed, it is harder for the exploit to direct the execution flow to its injected code. PaX [3] is a patch for the Linux kernel that implements least-privilege protections for memory pages. It offers prevention against unwarranted code execution via memory management access controls (NOEXEC) and address space layout randomization (ASLR). The NOEXEC component

of PaX aims to prevent the injection and execution of arbitrary code in an existing process memory space. The NOEXEC implementation consists of three features that ultimately apply access controls on mapped pages of memory. The first feature of NOEXEC applies executable semantics to memory pages. Executable semantics can be thought of as applying least-privilege concepts to the MMU. The application of these semantics to create non-executable pages on the IA-32 architecture can take two forms, based on the paging (PAGEEXEC) and segmentation logic (SEGMEXEC) of IA-32. Once the logic required to create non-executable pages has been merged into the kernel, the next step is to apply the new features. This is done by making the memory that holds the stack, heap, anonymous memory mappings and any section not specifically marked as executable in an ELF file non-executable by default. Finally, the functionality of mmap() and mprotect() is modified to prevent the conversion of the default memory states to an insecure state during execution (MPROTECT). PAGEEXEC is an implementation of non-executable pages derived from the paging logic of IA-32 processors. The IA-32 family of processors lacks native hardware support for marking pages of memory non-executable. However, the split Translation Lookaside Buffer (TLB) in Pentium and AMD K7+ CPUs can be leveraged to emulate non-executable page support. The purpose of the TLB is to provide a cache for virtual-to-physical address translation, which speeds up instruction and data fetching within the CPU. A split TLB has two separate translation buffers, one for instruction fetches (ITLB) and one for data fetches (DTLB). The ITLB/DTLB loading is the key to obtaining non-executable pages, as protected pages can be marked as either non-present or requiring supervisor-level access. In both cases, access to the pages generates a page fault, and the page fault handler can decide whether it was an instruction fetch or a data access. If it was an instruction fetch, there was an execution attempt in a non-executable page, and the process can be terminated accordingly. If the fault was triggered during data access, the pages can be changed temporarily to provide user-level access and then restored so that the fault handler fires again on future accesses. SEGMEXEC is an alternative implementation of non-executable pages derived from the segmentation logic of IA-32 processors. Linux runs in protected mode with paging enabled on IA-32 processors, which means that each address translation requires a two-step process: the logical address must first be converted to a linear address, from which the correct physical address may be determined. This is usually transparent to users of Linux, primarily because Linux creates identical segments for both code and data access that cover the range 0x00000000 - 0xffffffff, so no translation between logical and virtual memory addresses is required because they share the same value. PaX leverages the segmentation logic to create separate address ranges for the data (non-executable) and code segments. The 3 GB of userland memory space is divided in half, and each segment is assigned one of the halves. The data segment lies in the 0x00000000 - 0x5fffffff range and the code segment lies in the 0x60000000 - 0xbfffffff range.
Since the code and data segments are separated, accesses to the memory ranges can be monitored by the kernel and a page fault is generated if instruction fetches are initiated in the non-executable pages. Address Space Layout Randomization (ASLR) attempts to render exploits that depend on predetermined memory addresses useless by introducing a certain amount of randomness to the layout of the virtual memory space. By randomizing the locations of the stack, heap, loaded libraries and executable binaries, ASLR effectively reduces the probability that an exploit relying on hardcoded addresses within those segments will successfully redirect code execution to the supplied buffer. For the Intel x86 architecture, PaX ASLR provides 16, 16 and 24 bits of randomization for the executable, mapped and stack areas respectively. However, many successful derandomization attacks against PaX have been studied in the past [56, 28] that defeat it in a matter of minutes. Bhatkar et al describe Address Obfuscation [16, 17] techniques that make it harder for an attacker to exploit one of the vulnerabilities described earlier. As opposed to PaX, the imple-

mentation is not done at the kernel level but at the library level. They support the same kinds of randomization that PaX ASLR does, but extend the idea. First, the order of static variables, local variables in a stack frame and the shared library functions is randomized, making it harder for an attacker to overwrite adjacent variables. However, some objects cannot be moved, as the system expects them to be in a specific order; to make it harder for an attacker to overwrite these, random-sized gaps are introduced between objects. For implementation purposes, the authors have chosen to implement the randomization at link time. This has an important shortcoming: the randomization is static, i.e. once the program has been randomized it will, for every execution, always contain the same randomization. This makes it easier for an attacker to figure out the base address, the relative distances between variables and their order, either by examining the program binary or by exploiting a vulnerability that causes the victim program to print out parts of its memory. For this reason, the authors suggest a different implementation that performs randomization at the beginning of execution (i.e. different randomizations each time the program is run) and that supports rerandomization at run time. While not a kernel modification, Transparent Runtime Randomization (TRR) [65] can be seen as an operating system modification. It modifies the dynamic program loader in Linux, which loads programs into memory for execution. It randomizes many of the memory locations where an attacker would place his code or that an attacker would modify to execute code. The user stack set up by the kernel is randomized by allocating a new stack at a random location below the current one; the contents are copied to the new stack, pointers into it are adjusted accordingly, the stack pointer is set to point to the new stack, and the old stack is freed. The heap is randomized by randomly growing it. The locations of the shared libraries are randomized by allocating a random-sized region before loading the shared libraries into the program's address space. The Global Offset Table (GOT), which is used to locate the information needed to execute position-independent code (PIC) and is used by the PIC to access data, is normally loaded at a fixed position; to randomize the GOT, all instructions that access it directly must be rewritten to access the new position.

3.4.2 Randomized Instruction Sets

Another technique that can be used to prevent the injection of attacker-specified code is the use of randomized instruction sets. Instruction set randomization prevents an attacker from injecting any foreign code into the application by encrypting instructions on a per-process basis while they are in memory and decrypting them when they are needed for execution. Attackers are unable to guess the decryption key of the current process, so their injected instructions, once decrypted with the wrong key, decode to garbage. This prevents attackers from having the process execute their payload, and will in all likelihood crash the process due to an invalid instruction being executed. Both implementations described here incur a significant run-time performance penalty when unscrambling instructions because they are implemented in emulators, but it is entirely possible, and in most cases desirable, to implement them at the hardware level, thus reducing the impact on run-time performance. Barrantes et al provide a proof-of-concept implementation of randomized instructions using emulation. Their implementation, Randomized Instruction Set Emulation (RISE) [14], is built on an x86-to-x86 binary translator originally used to detect memory leaks. RISE scrambles the instructions at load time and unscrambles them before execution. By scrambling at load time, the binaries remain unmodified on disk, so the scrambled binary cannot be examined by a potential attacker. The approach by Kc et al [38] is also a proof-of-concept implementation using an emulator. It

mainly differs from RISE in that binaries are scrambled on disk instead of at load time. The key is extracted from the binary's header at load time and stored in the process control block (PCB). The key is then placed in a special register that can only be written with a special instruction (and cannot be read). Storing the encrypted binaries on disk has some major drawbacks, though: an attacker with local access could examine the encrypted binary and extract the key. More importantly, the current implementation can only work with statically linked binaries, because dynamically loaded libraries would have to be encrypted with the same key as the executable, which means that any program using those libraries would also have to use the same key. As a result, the entire system would probably end up using a single key for all programs.

3.5 Compiler Modifications

Some attack detection tools work by inserting appropriate checks or instrumentation at compile time. The instrumentation depends on the particular approach and the attacks the tool is trying to prevent. StackGuard [26] and ProPolice [30] take a similar approach to protecting an executable against a stack-smashing attack. During compilation, the prologue and epilogue of each function are modified so that a canary value is inserted between the stack frame, which contains the saved return IP and frame pointer, and the local variables. Upon return from a function, this value is checked to ensure that it has not changed; if it has, there must have been an overflow that could have damaged the return IP, and the program is terminated with an error. StackGuard provides more sophisticated options, including a terminator canary (consisting of four bytes: NULL, \r, \n, EOF). Such a canary provides better protection when the buffer is overflowed using a string library function, as these bytes serve as termination characters. By default, a random canary value is used. A major limitation of this approach is that only contiguous overflows that overwrite the canary on the stack are detected; many other indirect attacks can still change the value of the return IP. ProPolice mitigates this problem by rearranging local variables so that character stack buffers lie directly below the canary value. While this helps prevent some indirect stack attacks, it does not protect any heap buffers, and read overflows and underflows also remain undetected. PointGuard [25] attempts to protect pointers by encrypting them while they are in memory and decrypting them when they are loaded into registers, where they are safe from overwriting. While an attacker will still be able to modify pointers stored in memory, the pointers he provides will, when decrypted, point to different (possibly inaccessible) memory locations. Without the encryption key, the attacker is unable to construct pointers that decrypt to a predictable value. The encryption is extremely simple, so as not to impose a significant overhead on the execution time of the program: a random value is generated at program start-up and every pointer is XOR'ed with this value. This generally makes it impossible for attackers to bypass the protection, as the process will most likely crash when a pointer is decrypted wrongly and subsequently used. When the program is restarted, a new key is generated, making it very hard for attackers to guess the key. However, if attackers are able to cause the program to print out pointer information through some other vulnerability, they might be able to deduce the key and would then be able to encrypt their own pointers correctly. FormatGuard [24] offers countermeasures against format string attacks. These countermeasures are based on the observation that most format string attacks have more specifiers in the format string than arguments passed to the format function. FormatGuard tries to prevent format string attacks by counting the number of arguments that a format string expects and comparing this to

the number of arguments actually passed to the function. If more were expected than provided, it is likely that someone is attempting to exploit a format string vulnerability, and the program is terminated.

3.6 Runtime Detection of Attacks

A different approach to mitigating the problem of memory corruption vulnerabilities is to stop the attacks that exploit them at runtime. Such tools either use instrumentation inserted into the executable at compile time or monitor the binary executable itself to determine when an out-of-bounds access has occurred. By stopping the attack in its tracks, these tools convert the attack into a denial of service.

Fine-grained Bounds Checking

TinyCC and CRED both provide fine-grained bounds checking based on the referent-object approach developed by Jones and Kelly [37]. As buffers are created in the program, they are added to an object tree. For each pointer operation, the instrumentation first finds the object to which the pointer currently refers in the object tree. The operation is considered illegal if it references memory or results in a pointer outside said object. When an illegal operation is detected, the program is halted. TinyCC [15] is a small and fast C compiler that uses a re-implementation of the Jones and Kelly code; however, it is much more limited than gcc and provides no error messages upon an access violation: the program simply segfaults. CRED [54], on the other hand, is built upon the Jones and Kelly code, which is a patch to gcc. Thus, CRED is able to compile almost all programs that gcc can handle. In addition, CRED adheres to a less strict interpretation of the C standard and allows illegal pointers to be used in pointer arithmetic and conditionals, while making sure that they are not dereferenced. This treatment of out-of-bounds pointers significantly increases the number of real-world programs that can be compiled with this instrumentation, as many programs use out-of-bounds pointers in testing for termination conditions. An interesting approach to mitigating buffer overflows via fine-grained bounds checking has been developed by Martin Rinard as part of failure-oblivious computing. Instead of halting program execution when an out-of-bounds access has been detected, a program compiled with the failure-oblivious compiler will ignore the access and proceed as if nothing has happened. This approach prevents the damage to program state that occurs due to out-of-bounds writes and thus keeps the program from crashing or performing actions it is not intended to perform. The instrumentation likewise detects out-of-bounds reads and manufactures a value to return; for example, returning NULL will stop read overflows originating in string functions. The paper discusses several other return values that may drive the program back onto a correct execution path. This compiler is implemented as an extension to CRED and thus also uses the referent-object approach. Insure++ [49] is a commercial product that also provides fine-grained bounds checking capabilities. The product is closed-source, so little is known about its internal workings. According to its manual, Insure examines source code and inserts instrumentation to check for memory corruption, memory leaks, memory allocation errors and pointer errors, among other things. The resulting code is executed and errors are reported as they occur. The main limitation of all these tools is that an overflow within a structure is not recognized until the border of the structure has been reached. CRED is capable of detecting overflows within structures correctly, but only when structure members are accessed directly; if an access is made through an alias pointer, CRED will detect the overflow only when it reaches the border of the structure. These tools also cannot detect overflows within uninstrumented library functions. TinyCC seems to have some incorrect wrappers

for library functions and cannot handle large programs, which is a significant drawback. Another limitation of TinyCC is that no error messages are provided: the program simply terminates with a segmentation fault. Insure's major flaw is the performance slowdown it imposes on executables: some programs run up to 250 times slower.

3.6.1 Executable Monitoring

Another method of detecting and preventing buffer overflows at runtime is executable monitoring. Chaperon [49] is an executable monitor that is part of the Insure toolset from Parasoft, and Valgrind [46] is an open-source project aimed at helping developers find buffer overflows and prevent memory leaks. Tools such as Chaperon and Valgrind wrap the executable directly and monitor calls to malloc() and free(), thus building a map of the heap memory in use by the program. Buffer accesses can then be checked for validity using this memory map. There are several limitations to this approach. Since no functions are explicitly called to allocate and deallocate stack memory, these tools can monitor memory accesses on the stack only at a very coarse-grained level, and will not detect many attacks exploiting stack buffers. Another limitation is the large performance overhead imposed by these tools; since Valgrind simulates program execution on a virtual x86 processor, testing is 25-50 times slower than native execution. Program shepherding [39] is an approach that determines all entry and exit points of each function and ensures that only those entry and exit points are used during execution. Thus, an attacker cannot inject code and jump to it by changing a code pointer: this will be noticed as an unspecified exit point and the program will be halted. Like the operating system approaches discussed above, this method is transparent to the program running on the system. However, it is much more heavy-weight and incurs a higher performance overhead. In addition, attacks that do not rely on changing code pointers will still succeed, as they do not change the entry and exit points of a function. A similar approach is used by StackShield [9]. One of the security models supported by this tool is a Global Return Stack. The return IP for each function that is entered is stored on a separate stack, and the return pointer is compared with the stored value when the function call completes. If the values match, execution proceeds as intended; if they do not, the value on the return stack is used as the return address, guiding the program back to its intended path. This approach detects attacks that try to overwrite the return IP or provide a fake stack frame. However, it does not stop logic-based attacks or attacks based on overwriting a function pointer. Another security model offered by StackShield is the Return Change Check. Under this model, a single global variable is used to store the return IP of the currently executing function, and upon exit the return IP saved on the stack is compared to the one saved in this variable. If they do not match, the program is halted and an error message is displayed. This security model also tries to protect function pointers by making the assumption that all function pointers should point into the text segment of the program. If the program tries to follow a function pointer that points to some other memory segment, it is halted and an alert is raised. While this method does not prevent logic-based attacks from proceeding, it makes attacking the system more difficult. Another limitation of this approach is that read overflows are not detected.

3.6.2 Software Fault Injection

A popular approach for testing program robustness is a dynamic analysis technique focused on injecting faults into software execution. This technique creates anomalous situations by injecting faults at program interfaces. These faults may force a particular program module to stop responding or to start providing invalid or malformed data to other modules. By observing system behavior under

such anomalous conditions, certain inherent vulnerabilities can be discovered. Tools such as FIST [34] and Fuzz [44] have been used to perform such testing, with some promising results. The main drawback of software fault injection is that it requires considerable developer intervention and effort. The code must be manually prepared for testing, and the developer needs to analyze the output of each test case reporting a system crash to determine where the error occurred. Automating program execution with fault injection is also difficult, as the program is likely to crash or hang due to the inserted faults.

3.7 Information Flow Tracking

The idea behind information flow tracking, or taint analysis, is that in order for an attacker to change the execution of a program illegitimately, he must cause a value that is normally derived from a trusted source to instead be derived from his own input. For example, values such as jump addresses and format strings should usually be supplied by the code itself, not by external untrusted inputs. However, an attacker may attempt to overwrite these values with his own data. The technique works by labeling input data from untrusted sources, such as the network, as unsafe or tainted. Data derived from such tainted data is itself marked as tainted. An attack is detected when program control branches to a location specified by the unsafe data. Static taint analysis has been used to find bugs in C programs [57, 33, 68] and to find potentially sensitive data in Scrash [19]. Perl [5] does run-time taint checking to see whether data from untrusted sources is used in security-sensitive ways, such as in an argument to a system call. TaintBochs [22], Suh et al [60], Chen et al [66] and Minos [27] perform taint tracking at the hardware level. TaintBochs [22], built on top of the open-source IA-32 simulator Bochs, is a whole-system simulation tool for analyzing how sensitive data are handled in large programs. Suh et al [60] propose an information tracking approach that tracks spurious information flows and restricts their usage. Minos [27] is a micro-architecture that implements Biba's low watermark integrity policy on individual words to detect attacks at run time. Chen et al [66] defeat memory corruption attacks using architectural support that detects an attack when a tainted pointer is dereferenced; their approach can also defeat non-control data attacks. The main limitation of these systems is that they require specialized hardware. Without this custom hardware, hardware emulators can be used, but performance becomes unacceptable. Binary instrumentation for taint tracking was first used in TaintCheck [47]. While effective in attack detection, their approach slows programs down significantly (by about 37x). TaintTrace [21] achieved significantly faster taint tracking by using more efficient instrumentation based on DynamoRIO, combined with simple static analysis for eliminating redundant register saves and a shadow memory data structure that sped up metadata access. LIFT [51] achieved significant additional performance benefits by using better static analysis and faster instrumentation techniques. Although all of these approaches detect a wide range of memory corruption attacks, none of them defends against non-control data attacks, leaving them vulnerable to privilege escalation.
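The bookkeeping common to these dynamic taint trackers can be sketched as follows (a toy model with byte-granularity shadow memory and hypothetical hook functions; real systems instrument every instruction automatically):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MEM_SIZE 65536
static unsigned char shadow[MEM_SIZE];      /* one taint flag per byte */

/* called when data arrives from an untrusted source, e.g. the network */
void taint(size_t addr, size_t len)
{
    memset(shadow + addr, 1, len);
}

/* called on every copy or arithmetic derivation: dst := f(src) */
void propagate(size_t dst, size_t src, size_t len)
{
    memcpy(shadow + dst, shadow + src, len);
}

/* called before every indirect jump, call or return */
void check_indirect_jump(size_t target_slot)
{
    int i, tainted = 0;
    for (i = 0; i < 4; i++)                 /* 32-bit code pointer */
        tainted |= shadow[target_slot + i];
    if (tainted) {
        fprintf(stderr, "tainted control transfer: attack detected\n");
        abort();
    }
}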

3.8 Static Analysis

Static analysis is generally used during the implementation or audit phases of an application, although it can also be used by compilers to decide if a specific run time check is necessary or not. Static analyzers do not offer any protection by themselves but are able to point out what code might be vulnerable. They operate by examining the source code of a particular application.

They can be as simple as searching the code for specific known-vulnerable functions, such as strcpy() and gets(), or as complicated as building a complete model of the running application. Splint [41, 31] is a lightweight static analysis tool that analyzes program source code based on annotations describing assumptions made about specific objects (i.e. specific value, lifetime, etc.) or pre- and post-conditions that must be met by a specific function. Built-in annotations are provided for vulnerable functions; e.g. strcpy() is annotated /*@requires maxSet(s1) >= maxRead(s2)@*/, i.e. the capacity of s1 (the destination) must be at least the amount of data read from s2 (the source). With these annotations, Splint can be used with a minimum of effort by a programmer to check source code for commonly encountered implementation errors. Format string vulnerabilities are detected using taint analysis: all user input is considered tainted, and an error is reported if a tainted variable is used where an untainted one is expected. The definitions of the format functions are changed to expect an untainted format string as argument. Boon [63] is a static analysis based tool for detecting buffer overflows. It models strings in C as integer ranges: each string is represented by two integers, the number of bytes allocated for the string and the number of bytes currently in use. A constraint language based on these ranges is defined and the specific string operations are modeled in this language (i.e. the constraints that must hold are specified). The program is then parsed and a system of integer range constraints is generated for each statement in the input program. As each string is represented by two variables and all string operations are modeled with regard to these values, the safe state is ∀s : len(s) ≤ alloc(s). To simplify analysis, the authors chose flow-insensitive analysis (control flow is ignored when generating constraints); this creates problems for functions like strcat(), which may be executed in a loop and have a state that depends on the previous iteration. To work around this, any call to strcat() is flagged as a possible vulnerability, which inevitably generates some false positives. The technique also has limitations when analyzing pointer operations: it cannot handle pointer aliasing correctly, and it ignores function pointers, doubly indirected pointers and unions. After generation of the constraints, the constraint system is solved by finding a minimal bounding box that encloses all possible execution paths; the places where the constraints might not hold are reported as possible overflows. White-listing [52] is another approach that defends against format string attacks. It uses source code transformation to automatically insert checks against a whitelist of address ranges that are safe to write through %n, gained from static analysis. It requires the source code and recompilation of the program to be protected.

3.9 Anomaly Detection

Anomaly detection based techniques work by building a database of normal behavior for a process; deviation from this normal behavior is considered an anomaly and flagged as an intrusion. In many cases the execution of system calls is monitored, and if they do not correspond to a previously gathered pattern, an anomaly is recorded. Once a threshold of anomalies is reached, the anomaly can be reported and subsequent action can be taken (e.g. the program is terminated or the system call is denied). Forrest et al [32] define a method for anomaly detection based on short sequences of system calls. A sliding window of fixed size is moved across a trace of the system calls that the application performs during a training period, and the sequences of system calls observed in the window are recorded. If the program deviates from these sequences in subsequent runs, an anomaly is detected. The smaller the window, the more options an attacker has for forming valid sequences of system calls while still executing useful code. Some sequences that could never occur in the actual program would also not be recorded as

an anomaly when using the short-sequence technique. Programs that use many different system calls in many orders can also cause a problem: the set of allowable sequences may become so large that the attacker is not really restricted anymore. Sekar et al [55] suggest using finite state automata to model system call sequences. They build an automaton during the training period that models the sequences of system calls performed during program execution. The advantage of using an automaton is that control-flow structures like loops and branches are captured in the model, allowing the automaton to make more accurate decisions about the validity of system calls. The main problem when building such automata is deciding whether two specific system calls are the same or not, which can lead to inefficient automata. To detect whether two calls are the same, the automaton needs more information about the current program state than just the system call, so whenever a system call is added to the automaton, the location in the program from which it was made is also recorded. This allows the system to decide whether two calls to the same system call are actually the same call. Once the automaton has been built, it can be used for anomaly detection during subsequent runs of the program. At runtime, the execution of system calls is intercepted and the location from which each call was made is looked up. If a transition can be made from the current state to a new state in the automaton with the intercepted system call, the automaton moves to the new state; if no transition can be made, an anomaly is recorded and the automaton is resynchronized with the program using the program location. When an anomaly is recorded it is not immediately reported, as this might cause too many false positives; instead, an anomaly threshold is defined, and when it is reached, action can be taken. The authors implement this using weighted anomalies: some are more likely to have been caused by a malicious user and should therefore weigh more heavily on the anomaly index.

Chapter 4

FormatShield

In this chapter, we describe FormatShield, a comprehensive solution to format string attacks. FormatShield works by intercepting calls to vulnerable functions in libc and identifying call sites in a running process that are vulnerable to format string attacks. Using binary rewriting, the list of exploitable program contexts at the vulnerable call sites is dumped into the program binary, which is made available at runtime. Attacks are detected when malicious input is found at vulnerable call sites with an exploitable program context. FormatShield does not require source code or recompilation of the program to be protected and can therefore be used to protect legacy or proprietary programs for which source code may not be available. It does not depend on the presence of any specific format specifier, such as %n, in the format string, and can thus defend against both types of format string attacks, i.e. arbitrary memory read attempts and arbitrary memory write attempts. Nor does it rely on the target of the format specifiers: it protects against both attacks that target control information, such as the return address on the stack, and attacks that target program-specific, security-sensitive non-control information. Also, our experiments show that it incurs a nominal performance penalty of less than 4%.

4.1 Approach Description

FormatShield works by identifying call sites in a running process that are potentially vulnerable to format string attacks. A potentially vulnerable call site is identified when a format function is called with a probable legitimate user input as the format string argument. A probable legitimate user input can be identified by checking whether the format string is writable and free of format specifiers. The format string argument of a non-vulnerable call site, such as in printf("%s", argv[1]), would lie in a non-writable memory segment, while that of a vulnerable call site, such as in printf(argv[1]), would lie in a writable memory segment. The key idea is to augment the program binary with the program context information at the vulnerable call sites. A program context represents an execution path within a program and is specified by the set of return addresses on the stack. Since not all execution paths to a vulnerable call site may lead to an attack, FormatShield considers only those with an exploitable program context. For example, Figure 4.1 shows a vulnerable code fragment. Here the vulnerability lies in the function output(), which passes its argument as the format string to printf(). Although output() is called from three different call sites in main(), only one of the three, i.e. output(argv[1]), is exploitable. The contexts, i.e. the sets of return addresses on the stack, corresponding to the three calls will be different, and thus the context corresponding to output(argv[1]) can easily be differentiated from those of the other two calls to output().

void output(char *str)
{
    printf(str);
}

int main(int argc, char **argv)
{
    output("alpha");
    .....
    output("beta");
    .....
    output(argv[1]);
    .....
}

Figure 4.1: Only the third call to output() is exploitable

As the process executes, the exploitable program contexts are identified. If the vulnerable call site is subsequently called with an exploitable program context and format specifiers in the format string, a violation is raised. When the process exits, the entire list of exploitable program contexts is dumped into the binary as a new loadable read-only section, which is made available at runtime for subsequent runs of the program. If the section already exists, the list of exploitable program contexts is updated in the section. The program binary is thus updated with context information over use and becomes immune to format string attacks.

4.2 Implementation

FormatShield is implemented as a shared library that intercepts calls to the vulnerable functions in libc, preloaded using the LD_PRELOAD environment variable. This section explains the design and implementation of FormatShield. First we explain how FormatShield identifies vulnerable call sites in a running process. Then we describe the binary rewriting approach used to augment the binary with the context information.
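A minimal sketch of this interception mechanism is shown below; the writability and context checks are elided, and the wrapper simply forwards to the equivalent libc function as described later in this section:

#include <stdarg.h>
#include <stdio.h>

/* Because this library is preloaded via LD_PRELOAD, calls to
 * printf() resolve here before reaching libc. After its checks,
 * FormatShield forwards to the equivalent libc function, here
 * vprintf(), which resolves to the real libc implementation. */
int printf(const char *format, ...)
{
    /* ... writability and program-context checks would go here ... */
    va_list ap;
    va_start(ap, format);
    int n = vprintf(format, ap);
    va_end(ap);
    return n;
}

Compiled with gcc -shared -fPIC and loaded via LD_PRELOAD=./formatshield.so, such a wrapper sees every printf() call made by the target program before libc does.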

4.2.1 Identifying vulnerable call sites

During process startup, FormatShield checks if the new section (named fsprotect) is present in the binary of the process. This is done by resolving the symbol fsprotect. If present, the list of exploitable program contexts is loaded. During process execution, whenever control is transferred to a vulnerable function intercepted by FormatShield, it checks whether the format string is writable. This is done by looking at the process memory map in /proc/pid/maps. If the format string is non-writable, the corresponding equivalent function in libc (such as vprintf() for printf()) is called, since a non-writable format string cannot lead to an attack. However, if the format string is writable, FormatShield identifies the current context of the program and checks whether this context is in the list of exploitable program contexts. The current context of the program, i.e. the set of return addresses on the stack, is retrieved by following the chain of stored frame pointers on the stack. Instead of storing the entire set of return addresses, FormatShield computes a lightweight hash of the return addresses. If the current context is not present in the list of exploitable contexts, FormatShield checks whether the format string is free of format specifiers. If the format string does not contain any format specifiers, it is identified as legitimate user input, and the current context is added to the list of exploitable contexts. Otherwise, if the format string contains format specifiers, it is not added to the list. In either case, FormatShield calls the equivalent function in libc.
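The following sketch shows one way to compute such a lightweight context hash by walking the chain of saved frame pointers (it assumes the program is compiled with frame pointers; the use of FNV-1a here is illustrative, not FormatShield's actual hash):

#include <stdint.h>

/* Walk the chain of saved frame pointers and fold the return
 * addresses found beside them into a 32-bit FNV-1a hash. A real
 * implementation must also check that fp stays within the stack. */
uint32_t context_hash(void)
{
    uint32_t h = 2166136261u;                       /* FNV-1a basis */
    void **fp = (void **)__builtin_frame_address(0);

    while (fp != 0) {
        uintptr_t ret = (uintptr_t)fp[1];           /* saved return address */
        for (unsigned i = 0; i < sizeof ret; i++) {
            h ^= (uint32_t)((ret >> (8 * i)) & 0xff);
            h *= 16777619u;                         /* FNV-1a prime */
        }
        fp = (void **)fp[0];                        /* caller's frame pointer */
    }
    return h;
}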

[Figure 4.2: ELF binary (a) before and (b) after rewriting. Before rewriting, the binary contains the ELF header, program headers, .hash, .dynsym, .dynstr and .dynamic sections, followed by the section headers. After rewriting, the new fsprotect section and extended copies of .dynsym, .dynstr and .hash occupy space (a multiple of the page size) prepended below the original load address, while the old sections remain at their original addresses.]

Note that if the format string contains format specifiers, it could be the case that an exploitable context has not yet been identified by FormatShield and is being exploited by an attacker. However, FormatShield takes the safer step of not identifying this as an attack, since an exploitable context is highly likely to be identified before an attack, either over use or by exercising the program "synthetically". If the current context is in the list of exploitable contexts, FormatShield checks whether there are any format specifiers in the format string. If the format string does not contain any format specifiers, FormatShield calls the equivalent function in libc. Otherwise, FormatShield raises a violation. On detecting an attack, the victim process is killed, and a log is written to syslog. Note that the default action of terminating the process could be used by an attacker as a basis for a denial of service (DoS) attack against the victim process. However, silently returning from the vulnerable function without terminating the process may lead to an application specific error. For example, if the vulnerability lies in a call to vfprintf(), skipping the call may lead to no output being printed to the terminal if the string is being printed to stdout, which may not be fatal; if the string is being printed to a file, however, skipping the vfprintf() call may lead to a corrupted file. Terminating the victim process also creates "noise" from which a conventional host-based intrusion detection system can detect the intrusion attempt.

4.2.2 Binary Rewriting

FormatShield uses an approach (Figure 4.2) similar to that used by TIED [10] to insert context information in the program binary. FormatShield currently supports only ELF [23] executables. In the ELF binary format, the .dynsym section contains symbols needed for dynamic linking, the .dynstr section contains the corresponding symbol names, and the .hash section holds a hash lookup table to quickly resolve symbols. The .dynamic section holds the addresses of these three sections. The information to be inserted is a list of hashes of stored return addresses corresponding to exploitable contexts at different vulnerable call sites in the program. During process exit, the entire list is dumped into the executable as a new read-only loadable section. If the section is already present, the context information in it is updated. A typical ELF binary loads at virtual address 0x08048000. To add a new section (Figures 4.5 and 4.6), FormatShield extends the binary towards lower addresses, i.e. lower than 0x08048000. This ensures that the addresses of existing code and data do not change. To make the context information available at run time, a new

dynamic symbol (Figures 4.3 and 4.4) is added to the .dynsym section, and its address is set to that of the new section. Since this requires extending the .dynsym, .dynstr and .hash sections, which cannot be done without changing the addresses of other sections, FormatShield creates extended copies of these sections and changes their addresses in the .dynamic section. The address of the new section is chosen so that the sum of the sizes of the four new sections is a multiple of the page size.1 The space overhead is on the order of a few kilobytes (less than 10 KB for most binaries).

DYNAMIC SYMBOL TABLE:
00000000      DF *UND*    000000e7  __libc_start_main
00000000      DF *UND*    00000039  printf
080484a4 g    DO .rodata  00000004  _IO_stdin_used
00000000  w   D  *UND*    00000000  __gmon_start__

Figure 4.3: Dynamic symbol table before rewriting the binary

DYNAMIC SYMBOL TABLE:
00000000      DF *UND*      000000e7  __libc_start_main
00000000      DF *UND*      00000039  printf
080484a4 g    DO .rodata    00000004  _IO_stdin_used
00000000  w   D  *UND*      00000000  __gmon_start__
08047114 g    DO fsprotect  0000000a  fsprotect

Figure 4.4: Dynamic symbol table after rewriting the binary. A new dynamic symbol named fsprotect is added while rewriting the binary, which points to the new section at address 0x08047114.

....
[003] 0x08048148 a------ .hash    sz:00000040 link:04
[004] 0x08048170 a------ .dynsym  sz:00000080 link:05
[005] 0x080481C0 a------ .dynstr  sz:00000076 link:00
....

Figure 4.5: Sections before rewriting the binary

....
[003] 0x080480E8 a------ .hash      sz:00000044 link:04
[004] 0x08048030 a------ .dynsym    sz:00000096 link:05
[005] 0x08048090 a------ .dynstr    sz:00000088 link:00
....
[026] 0x08047114 a------ fsprotect  sz:00003868 link:00
[027] 0x08048170 a------            sz:00000080 link:00
[028] 0x080481C0 a------            sz:00000076 link:00
[029] 0x08048148 a------            sz:00000040 link:00

Figure 4.6: Sections after rewriting the binary. A new loadable read-only section named fsprotect is added which holds the context information. The .dynsym, .dynstr and .hash sections shown are extended copies of the original ones. The original .dynsym, .dynstr and .hash are still loaded at their original load addresses.

1 As per the ELF specification [23], loadable process segments must have congruent values of the load address, modulo the page size.

4.2.3 Implementation Issues

Address Space Randomization. Address Space Randomization (ASR) [4, 16, 17] involves randomizing the base addresses of various segments so as to make it difficult for an attacker to guess an address. Since the base addresses of code segments are randomized, the absolute memory locations associated with the set of return addresses will change from one execution of the program to the next. To compensate for this, we decompose each return address into a pair {name, offset}, where name identifies the executable or shared library, and offset is the relative distance from the base of that executable or shared library.
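On Linux, one way to realize this decomposition is via dladdr(), which maps an address to the object containing it and to that object's load base (a sketch under that assumption; the thesis does not prescribe a particular mechanism):

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdint.h>

/* Map a return address to an ASR-stable {name, offset} pair:
 * name identifies the executable or shared library, offset is
 * the distance from that object's load base. Link with -ldl. */
int decompose(void *ret_addr, const char **name, uintptr_t *offset)
{
    Dl_info info;
    if (dladdr(ret_addr, &info) == 0 || info.dli_fname == 0)
        return -1;                /* address not in any loaded object */
    *name   = info.dli_fname;
    *offset = (uintptr_t)ret_addr - (uintptr_t)info.dli_fbase;
    return 0;
}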

4.3 Evaluation

We conducted a series of experiments to evaluate the effectiveness and performance of FormatShield. All tests were run in single user mode on a Pentium-4 3.2 GHz machine with 512 MB RAM running Linux kernel 2.6.18. All programs were compiled with gcc 4.1.2 with default options and linked with glibc 2.3.6.

4.3.1 Effectiveness

We tested FormatShield on five programs with known format string vulnerabilities:

• wuftpd version 2.6.0 and earlier suffer from a format string vulnerability [61] in the “SITE EXEC” implementation. A remote user can gain a root shell by exploiting this vulnerability.

• tcpflow 0.2.0 suffers from a format string vulnerability [59], that can be exploited by injecting format specifiers in command line arguments. A local user can gain a root shell by exploiting this vulnerability.

• xlock 4.16 suffers from a format string vulnerability [18] when using command line argument -d, that can be used by a local user to gain root privileges.

• rpc.statd (nfs-utils versions 0.1.9.1 and earlier) suffers from a format string vulnerability [35], which allows a remote user to execute arbitrary code as root.

• splitvt version 1.6.5 and earlier suffer from a format string vulnerability when handling the command line argument -rcfile. A local user can gain a root shell2 by exploiting this vulnerability.

The above programs were trained "synthetically" with legitimate inputs before launching the attacks, so as to identify the vulnerable call sites and the corresponding exploitable contexts. FormatShield successfully detected all the above attacks and terminated the programs to prevent execution of malicious code. The results are presented in Table 4.1. To check the effectiveness of FormatShield on non-control data attacks, we modified the publicly available exploit for the wuftpd 2.6.0 format string vulnerability [61] to overwrite the cached copy of the user ID pw->pw_uid with 0, so as to disable the server's ability to drop privileges. Such an attack has been tested in [20]. FormatShield successfully detected the write attempt to the user ID field and terminated the child process.

2 The attack gives a root shell if the program is installed suid root, otherwise it gives a user shell.

Vulnerable program   CVE #           Results without FormatShield   Results with FormatShield
wuftpd               CVE-2000-0573   Root Shell                     Process Killed
tcpflow              CAN-2003-0671   Root Shell                     Process Killed
xlock                CVE-2000-0763   Root Shell                     Process Killed
rpc.statd            CVE-2000-0666   Root Shell                     Process Killed
splitvt              CAN-2001-0112   Root Shell                     Process Killed

Table 4.1: Results of effectiveness evaluation

4.3.2 Performance Testing

To test the performance overhead of FormatShield, we performed micro benchmarks to measure the overhead at function call level, and then macro benchmarks to measure the overhead at application level.

Micro benchmarks

To measure the overhead per function call, we ran a set of simple benchmarks consisting of a single loop containing a single sprintf call. A six-character writable string was used as the format string. With no format specifiers, FormatShield added an overhead of 12.2%. With two %d format specifiers, the overhead was 4.6%, while with two %n format specifiers it was 3.3%. We also tested vsprintf using the same loop; the overheads were 15.5%, 1.9% and 3.4% for no format specifiers, two %d format specifiers, and two %n format specifiers, respectively. These overheads are much lower than those of previous approaches. Table 4.2 compares the micro benchmarks of FormatShield with those of FormatGuard and White-Listing.
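A sketch of the micro-benchmark loop described above (the iteration count and buffer size are illustrative; the six-character format string shown carries two %d specifiers):

#include <stdio.h>
#include <time.h>

int main(void)
{
    char out[128];
    char fmt[] = "%d--%d";        /* six-character writable format string */
    clock_t start = clock();

    for (long i = 0; i < 1000000L; i++)
        sprintf(out, fmt, 42, 7); /* the intercepted call under test */

    printf("%.3f s\n", (double)(clock() - start) / CLOCKS_PER_SEC);
    return 0;
}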

Benchmark   Format Specifiers   FormatGuard     White-Listing   FormatShield
sprintf     nil                 7.5%            10.2%           12.2%
sprintf     2 %d                20.9%           28.6%           4.6%
sprintf     2 %n                38.1%           60.0%           3.3%
vsprintf    nil                 No protection   26.4%           15.5%
vsprintf    2 %d                No protection   39.8%           1.9%
vsprintf    2 %n                No protection   74.7%           3.4%

Table 4.2: Micro benchmarks

Macro benchmarks

To test the overhead at the application level, we used man2html, since it uses printf extensively to write HTML-formatted man pages to standard output. The test was to translate 4.6 MB of man pages and was performed multiple times. man2html took 0.468 seconds to convert the pages without FormatShield, and 0.484 seconds with FormatShield. Thus, FormatShield imposed a 3.42% run-time overhead.

4.4 Discussion

4.4.1 False Positives and False Negatives

Although we did not find any false positives or false negatives during the entire testing phase, we have identified a case which could lead to either. It occurs when a format string is dynamically constructed as the result of a conditional statement and then passed to a format function. A false positive can arise if one outcome of the condition creates a format string with format specifiers and the other outcome creates one without. Similarly, a false negative can arise when one outcome of the condition reads user input into the format string and the other outcome creates a format string with format specifiers. This is because the context at the call site remains the same in both cases. However, during the entire testing phase we did not encounter any such code constructs. There could also be a false negative when format specifiers are present at a vulnerable call site whose corresponding context has not yet been identified.
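The following hypothetical fragment illustrates the case: both branches reach the same printf() call site, so the program context is identical whether the format string is a constant containing a specifier or specifier-free user input.

#include <stdio.h>
#include <string.h>

/* Both branches reach the same printf() call site, so the program
 * context is the same whether fmt holds a constant containing a
 * specifier or specifier-free user input. */
void report(int verbose, const char *user_input)
{
    char fmt[64];

    if (verbose)
        strcpy(fmt, "count: %d\n");                 /* has a specifier */
    else {
        strncpy(fmt, user_input, sizeof(fmt) - 1);  /* user input      */
        fmt[sizeof(fmt) - 1] = '\0';
    }
    printf(fmt, 42);   /* same context for both outcomes */
}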

4.4.2 Limitations

FormatShield requires frame pointers to obtain the set of stored return addresses on the stack, which are available in most cases. However, it may not be able to protect programs compiled without frame pointers, such as those compiled with gcc's -fomit-frame-pointer flag. Also, FormatShield requires that exploitable contexts of the vulnerable call sites be identified before it can detect attacks. This may require the program to be trained, either by deploying it or by exercising it "synthetically". Another limitation of FormatShield is that it requires programs to be dynamically linked, since library call interposition works only with dynamically linked programs. However, this is not a serious restriction if we consider Xiao's study [64], according to which 99.78% of applications on the Unix platform are dynamically linked. Also, since FormatShield keeps updating the context information in the program binary until the binary becomes immune to format string attacks, it may interfere with some integrity checkers.

Chapter 5

Coarse Grained Dynamic Taint Analysis

Several approaches have been proposed in the past years to detect memory corruption attacks. Approaches such as StackGuard [26] and Libsafe [62] target specific attacks such as buffer overflows. Many of them even require source code and recompilation of the program to be protected. Other approaches such as Address Space Randomization (ASR) [4, 16, 17, 65] and Instruction Set Randomization (ISR) [14, 38] are generic, but are easily defeated by brute force attacks in a matter of minutes [56, 28, 58]. Several recent works [66, 42, 47, 60, 40, 51, 21, 22, 67, 27] demonstrated that information flow tracking, or taint analysis, is a promising and effective technique for detecting a wide range of security attacks that corrupt control data such as return addresses or function pointers, even zero-day attacks. The idea behind taint analysis is that in order for an attacker to change the execution of a program illegitimately, he must cause a value that is normally derived from a trusted source to instead be derived from his own input. For example, values such as jump addresses and format strings should usually be supplied by the code itself, not by external untrusted inputs. However, an attacker may attempt to overwrite these values with his own data. The technique works by labeling input data from untrusted sources such as the network as "unsafe" or tainted. Data derived from such tainted data is itself marked as tainted. An attack is detected when program control branches to a location specified by the unsafe data. Taint analysis is capable of detecting a wide range of memory corruption attacks. However, previous taint analysis based approaches do not defend against non-control data attacks, in which an attacker targets program-specific security-critical data, such as a variable storing user privileges, instead of control information. Also, the overheads incurred by previous approaches slow down program execution several times, making them unsuitable for use in production environments. In this chapter, we discuss Coarse Grained Dynamic Taint Analysis, a very low-overhead information flow tracking technique. We propagate taint on variables, rather than on individual words of memory. Our experiments with several exploits show that Coarse Grained Dynamic Taint Analysis effectively detects various types of memory corruption attacks, including non-control data attacks. Also, as it incurs modest performance overhead, it is practical for use in production environments.

5.1 Coarse Grained Dynamic Taint Analysis

This section presents an overview of our approach and the details of the binary instrumentation framework that is required to implement the taint analysis.

[Figure 5.1 (Approach Overview): the program binary, carrying type information deposited by a modified TIED, runs as a process under dynamic binary instrumentation (using Pin); object-level dynamic taint analysis tracks external input, aborting on an attack and executing normally otherwise.]

5.1.1 Approach Overview

Our approach is based on the observation that in a program, variables are the entities used by a programmer to accomplish any task, such as receiving user input, reading or writing files, arithmetic, etc. As a program executes, these variables might be copied among themselves (such as a=b), values of some variables might be derived from other variables (such as c=a+b), others might be received from user input (such as gets(str)), etc. An attacker may cause the value of one variable to illegitimately affect the value of another variable or control data by providing a kind of input that the programmer did not expect, such as an excessively large input, an out-of-bounds value, or an input with format specifiers. By tracking the source of each variable, we propagate taint on these variables. We extract variable addresses and their sizes from the debugging information. Using binary rewriting, this information is inserted in the program binary as a new loadable read-only section, available at run time. At run time, we associate a tag with each variable, which can have two values: 0 ("safe") or 1 ("unsafe" or "tainted"). The program is then dynamically instrumented to perform information flow tracking and to detect attacks that alter program control flow to unsafe data. All data (variables) received from untrusted sources, such as the network or command line arguments, is marked as tainted. As the program executes, the taint is propagated among variables. Any operation that results in a variable being written causes the tag of the destination variable to be affected by that of the source variable. Any attempt to branch to an address specified by a tainted variable is detected as an attack. The overview of our approach is shown in Figure 5.1.

Tainting Policy. Our tainting policy for variables is simple: if a particular variable's state is changed directly by a tainted variable, then we mark the current variable as tainted as well. This can be expressed as follows. If a variable Z is being modified by a function F that takes one or more tainted variables as parameters, then Z becomes tainted as well. The function F can be any function that directly affects Z. Formally, if Z is affected by F directly, i.e. in the same statement, then Z is tainted if one or more parameters of F are also tainted. Note that this policy excludes implicit flows, which often occur due to programmer mistakes, e.g. changing a variable based on a logical condition on a tainted value. Moreover, [67] has noted that tracing implicit flows is a hard problem.

Justification. Based on our tainting policy, we argue that our approach captures the semantics of the previous taint analysis approaches in [51, 47, 21] and hence is justified. Earlier approaches propagate taint on every byte of memory. In our approach, the taint tag is propagated on all bytes associated with a single variable. This implies that even if a single byte of this variable is modified by tainted data, the entire range of bytes associated with the variable becomes tainted.

struct s {
    char a;
    int b;
};

struct s foo[20];

Figure 5.2: Variables within a structure

In fact, our tainting process is a superset of the byte-level tainting process in [51, 47, 21], as an entire sequence of bytes is tainted even if a single byte is tainted. Furthermore, we propagate taint in all operations where the tainted variables are accessed, thereby ensuring that all data affected by tainted data is marked tainted as well. Thus, based on these arguments, our approach appropriately captures the semantics of previous approaches. Later, using experimental evaluation, we validate this justification thoroughly.
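A small example of the policy in action (variable names are illustrative; the resulting tags are shown as comments):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char buf[64];                           /* untainted until written      */

    if (fgets(buf, sizeof buf, stdin)) {    /* buf: tainted (untrusted src) */
        int a = atoi(buf);                  /* a: tainted, derived from buf */
        int b = 10;                         /* b: untainted (constant)      */
        int z = a + b;                      /* z: tainted, one source is    */
        printf("%d\n", z);
    }
    return 0;
}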

5.1.2 Framework Overview

Our framework consists of the following steps. First, our approach identifies variables in the application being checked. We perform static analysis during this phase to determine the instructions that refer to the variables at run time. Second, we describe the taint tag management and storage issues in detail. Third, we describe the dynamic instrumentation framework that we use to rewrite the application at run time, and show the different code instructions that are instrumented. Finally, we describe the exploit detection policy and the exploits detected using the instrumented code.

Identifying Variables. The first step in our approach is to extract the variables from the executable and make them available to our taint tracking framework at run time. Note that for each executable this is a one-time process. The extracted information is added to the executable and stored along with it. Our variable extraction uses an enhanced version of the TIED1 [10] framework which is suitable for our purpose. Since the program is compiled with debugging information, it contains variable location and size information that can be used at run time. TIED extracts the location and size information of all the buffers in the program and dumps it in the executable as a separate loadable read-only section. For global variables, the addresses are known at compile time, while for local variables only the offset from the frame pointer is known at compile time. We have modified TIED so that it dumps the location (virtual address for global variables, offset from the frame pointer for local variables) and size information of all the variables in the executable. The members of arrays, structures and unions are also explored to detect the variables that lie within them. Figure 5.2 demonstrates a typical case of variables within structures; TIED detects all 40 variables in this case. This location and size information is loaded at run time to identify variables in the virtual address space of the program. Figure 5.3 shows a code fragment and the corresponding variables that are created at run time. For the example code, previous approaches would read the input of up to 1024 bytes, mark the tags of all the bytes read as tainted, and then propagate the taint to the destination. In our approach, we mark the variable src as tainted and taint is propagated to the dest variable. Therefore, for the same code, our approach marks an entity as tainted only once and taint propagation is required just once, saving taint marking as well as propagation effort to a large extent.

Static Analysis. We statically analyze the program to identify instructions referring to these variables in memory at program run time. Global variables are identified by looking at the address referred to by the instruction. Local variables and function arguments are identified by looking at the offset from the frame pointer: a negative offset identifies a local variable and a positive offset identifies a function argument.

1Type Information Extractor and Depositor

void readdata(int fd)
{
    char dest[1024];
    char src[1024];

    read(fd, src, sizeof(src));
    strcpy(dest, src);
}

[Figure 5.3 (Variables in a stack frame) depicts the corresponding frame, from low to high addresses: src (1024 bytes, tainted) and dest (1024 bytes), followed by the saved frame pointer and return address; taint propagates from src to dest via the strcpy.]

Tag Management. We associate a one-bit tag with each variable and register, which can have only two values (0 for "safe" and 1 for "unsafe"). As a single-bit tag is maintained for a variable that may span several bytes, the space overhead is considerably lower than that of approaches using a single-bit tag per byte of memory. Instead of maintaining tags for the entire program address space, we maintain tags only for the writable parts of the program. This is because the non-writable parts of the address space cannot be affected by user input and are therefore assumed to be "safe". Hence, the tag for any variable belonging to a read-only part of the address space is assumed to be 0, i.e. "untainted". We store tags for variables in a separate memory region, called the tag space (or shadow memory), which is addressable via a one-to-one direct mapping between a variable and the tag bit in the program's virtual address space. Such a direct mapping makes it straightforward and fast (one memory access and a few arithmetic instructions) to get the tag value for a given variable. At program load time, our prototype uses the information dumped by TIED to identify variables in the process address space. For every variable, our approach maintains an (offset, bit) tuple, which is used to locate its tag bit. The offset is the difference between the tag byte that holds the tag bit for a variable and the beginning of the tag space, and bit holds the bit position within that byte. This tuple is determined using static analysis of the program during load time. At program load time, instructions that refer to a variable are located and instrumented so as to propagate taint at runtime. For local variables, for which memory is reserved on the stack only when the corresponding function is called, this step is deferred until the corresponding function is called.

Dynamic Binary Instrumentation Framework and Tracking. To enable taint tracking, we dynamically instrument the code when the program is executed. The instrumentation process supplements the code with additional instructions to enable variable tracking and taint propagation, and needs to be done every time the program is restarted. Our dynamic binary instrumentation approach is built on top of an existing dynamic binary instrumentation framework called Pin [43]. Pin provides efficient instrumentation by using a just-in-time (JIT) compiler to insert and optimize code. In addition to standard techniques for dynamic instrumentation systems, including code caching and trace linking, Pin implements register reallocation, liveness analysis, code inlining and instruction scheduling to optimize jitted code.
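A minimal sketch of the direct (offset, bit) tag lookup described above; the tag-space base address matches the illustrative value used in the discussion of Figure 5.4:

#include <stdint.h>

#define TAG_SPACE_BASE 0xA8000000u   /* illustrative tag-space base */

/* Read the one-bit tag of a variable from its (offset, bit) tuple. */
static inline int get_tag(uint32_t offset, unsigned bit)
{
    uint8_t *byte = (uint8_t *)(uintptr_t)(TAG_SPACE_BASE + offset);
    return (*byte >> bit) & 1;
}

/* Write the tag: 1 marks the variable tainted, 0 marks it safe. */
static inline void set_tag(uint32_t offset, unsigned bit, int tainted)
{
    uint8_t *byte = (uint8_t *)(uintptr_t)(TAG_SPACE_BASE + offset);
    if (tainted)
        *byte |= (uint8_t)(1u << bit);
    else
        *byte &= (uint8_t)~(1u << bit);
}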

movl $0x10, %eax
movl $0xfffffffe, %ecx
rol  $0x03, %ecx
andl %ecx, 0xa8000000(%eax)
mov  $0x0400, %ebx

Figure 5.4: Instrumentation for the instruction mov $0x0400,%ebx

A tag space of one bit is maintained for each variable that was identified during static analysis. At program initialization, all tags are cleared to zero. Data (variables) received from untrusted sources such as the network or standard input is marked as tainted. As the program executes, other variables may be tagged as tainted via information flow. A tainted variable may become untainted if its value is reassigned from some untainted variable. In accordance with our formal definition of the taint policy (cf. Section 5.1.1), we instrument all data movement and arithmetic instructions.

• For data movement instructions such as mov, lods, stos, push, pop, etc., the tag value of the source operand is propagated to that of the destination. For data movement instructions that move a constant value into a variable, we instrument the instruction to mark the variable as untainted at run time.

• For arithmetic instructions, such as add, sub, xor, etc, the tag value of the destination is the OR of the tag values of the source operands, since the value of the destination is affected by both the source operands.

• For instructions which do not involve any second operand (explicit or implicit), such as inc, dec, etc, the tag value does not change.

In Figure 5.4, we show the instrumentation code generated for the instruction mov $0x0400,%ebx. As this instruction moves data, it needs to be instrumented. Since a constant value 0x0400 is being copied to the ebx register, the register needs to be marked as untainted. To do that, a value (0x10 in this example) is copied to the register eax. This value is an offset from the beginning of the tag space (0xA8000000 in this example). Another value, 0xfffffffe, is copied to the ecx register. This value is then rotated left by 3 bits to create a mask, which is then ANDed with the byte at address 0xA8000000 + 0x10 = 0xA8000010, to mark the corresponding taint bit for the ebx register as untainted. The original instruction is then executed. Other instructions are instrumented using the rules described above. Note that these rules are not applicable for a few special instructions for which the result is untainted irrespective of whether the operand is tainted or not. For example, in the x86 architecture, instructions such as "xor %eax, %eax" or "sub %eax, %eax" clear the eax register irrespective of its previous value. In our approach, we identify such instructions and mark the corresponding register as untainted. We instrument calls to functions that allocate memory, such as malloc(), to identify the corresponding variable in the global shadow memory. Calls to functions that copy memory, such as memcpy() or strcpy(), are instrumented to propagate taint from the source variable to the destination variable. Calls to functions such as memset() are instrumented to mark the corresponding argument as untainted if the source byte lies in an untainted variable. For the mmap or mmap2 system calls, we check the return value to see if the allocation was successful; for each successful allocation, we identify variables in the global shadow memory. In Table 5.1, we show a (partial) categorized list of instructions that are instrumented.

Table 5.1: Instructions instrumented for taint tracking

Instruction type / system or library call                 Instrumentation
Data movement instructions, e.g. mov, lods, stos, etc.    tag[dest] = tag[source]
Arithmetic instructions, e.g. add, sub, xor, etc.         tag[dest] = tag[a] OR tag[b]
Instructions without any second operand, e.g. inc, dec    No change in tag value
Calls that copy memory, e.g. memcpy, strcpy, etc.         tag[dest] = tag[source]
Calls that set memory, e.g. memset                        tag[dest] = tag[source] if source is not constant, else tag[dest] = 0

5.1.3 Exploit Detection Policy

We use a configurable exploit detection policy to detect exploitation attempts. The exploit detection policy specifies the sources that are untrusted, for example the network, files, standard input, command line arguments, environment variables, etc. Any data (variable) received from an untrusted source is marked as tainted. The policy also defines the checks that are applied when the program runs. Based on our exploit detection policy, we instrument all control-altering instructions such as call, jmp, ret, etc. and check the branch address. A branch address belonging to a tainted variable indicates a control hijacking attack. To detect format string attacks, we instrument calls to functions in the printf, syslog, and warn/err families. A format string belonging to a tainted variable indicates a format string attack. The policy also checks whether any non-control variables are marked as tainted in an illegitimate manner. For example, a stack overflow attack that does not overwrite the stored return address but only overwrites the next variable on the stack is noted in the exploit detection policy. Table 5.2 lists the exploit detection policies required for detecting different types of attacks. The following discussion illustrates the attacks and the checks enforced by our exploit detection policy to detect them.

• Branch Addresses. The policy checks whether a tainted address is used as the target of a branch instruction such as jmp or call (see the sketch after this list). This defeats all attacks which overwrite return addresses, function pointers, global offset table (GOT) entries, etc. in order to redirect program control flow to injected code or to some library function such as execve(). It also defeats frame faking attacks, which overwrite the stored frame pointer in order to create a fake stack frame when the function returns.

• Format string arguments. The policy checks whether any format function such as fprintf() or syslog() is called with a tainted format string argument. Any such case is identified as an exploitation attempt, irrespective of the format specifiers used. Therefore, it can defeat not only attempts to overwrite arbitrary memory locations, but also arbitrary memory read attempts. Note that such a check can not only detect an exploitation attempt, but can also pinpoint the vulnerability even before it is exploited, because the format string argument will be tainted even when the input is legitimate.

• System call arguments. The policy checks whether any arguments used by a system call are overwritten by tainted data. In the Linux x86 architecture, a system call is invoked using software interrupt 0x80, with the system call number passed in the eax register. We instrument all int 0x80 instructions to check the value of the eax register at run time. For the execve system call, we check at run time whether its arguments are tainted. Such checks not only defeat code injection attacks, but also attacks in which an attacker overwrites data that is later used as an argument to a system call.

Table 5.2: Attacks detected by different policies

Policy                                               Attacks Detected
Tainted branch addresses                             Classical stack smashing attacks, common buffer overflows and frame faking attacks
Tainted format string arguments                      Format string attacks
Tainted system call arguments                        Any kind of code injection attack
Tainted control data, e.g. tainted boundary tags     Heap overflows and stack overflows
in heap or longjmp buffers, etc.
Tainted non-control data, e.g. a variable            All kinds of non-control data attacks
containing user privileges, etc.


• Other control data. The policy also checks whether certain other control data, such as boundary tags in the heap or longjmp buffers, are tainted. Such checks are inserted at the points where these control data are used; for example, for boundary tags in the heap we insert checks in calls to free(), and for longjmp buffers we insert checks in calls to longjmp() and siglongjmp().

• Non-control data. The policy checks whether any non-control data is marked as tainted in an illegitimate manner, such as by an overflow of a variable. Such checks are implemented by checking the bounds of the variable being written, as done by previous approaches such as LibSafe [62] or LibSafePlus [10].
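The sketch below, referenced from the first policy item, shows the check conceptually injected before indirect branches; tag_of() is a stand-in stub for the shadow-memory lookup, and in reality the check is inserted by the instrumentation rather than called from source:

#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the shadow-memory lookup described in Section 5.1.2. */
static int tag_of(const void *addr) { (void)addr; return 0; }

/* Check conceptually injected before indirect branches (jmp, call,
 * ret): if the variable the branch target was loaded from is
 * tainted, an attack is flagged and the victim process terminated. */
static void check_branch(const void *target_slot)
{
    if (tag_of(target_slot)) {
        fprintf(stderr, "tainted branch target: attack detected\n");
        exit(1);
    }
}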

In the next section, we evaluate the effectiveness of our taint analysis approach against these attacks. We validate our technique through extensive testing over synthetic and real-world exploits.

5.2 Evaluation

We have implemented a prototype of our approach on the Linux x86 architecture. We use TIED to dump variable size and location information into the program executable. Our prototype then uses Pin to instrument the code, adding instructions to perform information flow tracking on these variables and to detect attacks. The instrumentation is performed only once, at program load time; however, the instrumented code may run many times. Pin caches the generated instrumented code, so that it need not be instrumented again when required. We conducted a series of experiments to evaluate the effectiveness and performance of our approach. All tests were run in single user mode on a Pentium-4 3.2 GHz machine with 512 MB RAM running Linux kernel 2.6.18. All programs were compiled with gcc 4.1.2 and linked with glibc 2.3.6.

5.2.1 Effectiveness Evaluation

We tested the effectiveness of our approach using several synthetic and actual exploits. The exploits were selected from a wide range of attacks, including stack smashing, heap overflow, format string attack, double free and non-control data attacks. Our approach successfully detected all the exploitation attempts and terminated the victim program to prevent execution of malicious code. The results are presented in Table 5.3.

Synthetic Exploits. In this section, we evaluate our approach using synthetic exploits on stack overflows, heap overflows, and format string vulnerabilities. Our approach successfully detected all the attacks without any false negatives.

• Stack Smashing Attack. To test the effectiveness of our approach against the classical stack smashing attack, we wrote a program with a buffer overflow vulnerability. It uses strcpy() to copy a command line argument to a local buffer. We tried inserting a long command line argument in order to overflow the buffer and to overwrite the stored return address. Our approach successfully detected the tainted return address (which is treated like any other control data in our approach) when the function returned.

• Heap Overflow Attack. In a similar test, we verified our approach by detecting a heap overflow. We wrote a program with a heap overflow vulnerability. It allocates a buffer in heap, uses strcpy() to copy the first command line input into the buffer, and then frees the buffer. We injected a huge input in the first command line argument in order to overflow the buffer and overwrite the boundary tags. Our approach successfully detected the tainted boundary tags when the buffer was freed.

• Format String Attack. We used our approach on a custom written program with a format string vulnerability. The program uses printf() to display the first command line argument. The exploit for the program uses the format string attack to overwrite the global offset table (GOT) entry for exit() function. The exploitation attempt was successfully detected as the format string argument was found to be tainted. Note that for a format string attack, our approach can defeat all exploitation attempts of writing to and reading from arbitrary memory locations irrespective of the format specifier used.

• Buffer Overflow Attack. To test our approach against non-control data attacks, we tried overflowing a buffer in a structure in order to overwrite a character array stored next to it. The character array is used to hold the filename of the temporary file being written. The attack was detected by the fine grained policy as the character array was found to be illegitimately written by a tainted buffer.

Actual Exploits. We tested our approach on several exploits for publicly known vulnerabilities. All the attacks were successfully detected by our approach.

• Samba call_trans2open() Buffer Overflow. Samba version 2.2.8 and earlier suffer from a stack smashing vulnerability [6] in the call_trans2open() function. Successful exploitation of the vulnerability allows a remote attacker to gain a root shell on the machine running

a vulnerable version of samba. Our approach detected the tainted stored return address and defeated the exploitation attempt.

• BIND 8 Buffer Overflow. BIND version 8.2 and earlier suffer from a buffer overflow [48] in the nslookupComplain() routine, which allows a remote attacker to gain root access on the affected machine. Our approach correctly detected that the return address is tainted and defeated the attack.

• xlockmore Format String Vulnerability. xlock version 4.16 suffers from a format string vul- nerability [18] when using the command line argument -d, that can be used by a local user to gain root privileges. The exploit overwrites the stored return address with the address of the injected shellcode. Our approach successfully identified that the stored return address was tainted and defeated the exploitation attempt.

• PHP Session Decode() Double Free Memory Corruption Vulnerability. PHP version 4.4.5 and 4.5.6 suffer from a double free vulnerability [29], that can be used by a local user to execute arbitrary code in the context of the webserver or to cause denial of service conditions. The exploit overwrites the pointer to a destructor with a junk value to cause denial of service. Our approach successfully identified that the pointer to destructor was tainted and defeated the exploitation attempt.

• ghttpd Log() Buffer Overflow Vulnerability. A stack overflow vulnerability [2] exists in ghttpd version 1.4.3 and lower which allows a remote user to execute arbitrary code with the privileges of the web server. The overflow occurs in Log() when the argument to a GET request overruns a 200-byte stack buffer. In order to test our approach against non-control data attacks, we modified the publicly available exploit to overwrite the stored ESI register on the stack, which is later copied into a pointer to the URL requested by the client. By overwriting the stored ESI register, it is made to point to the URL /cgi-bin/../../../../bin/sh in order to execute a shell. Our approach correctly detected that the argument to execlp() was tainted and defeated the attack.

• wuftpd SITE EXEC Format String Vulnerability. wuftpd version 2.6.0 and earlier suffer from a format string vulnerability [61] in the SITE EXEC implementation that allows arbitrary code execution. We modified the publicly available exploit for this vulnerability to overwrite the cached copy of the user ID pw->pw_uid with 0, so as to disable the server's ability to drop privileges. The attack was detected as the format string argument was found to be tainted.

• SIDVault Simple Bind() Function Remote Buffer Overflow Vulnerability. SIDVault LDAP server version 2.0e and earlier suffer from a buffer overflow [7], that can be used by a remote user to gain root privileges. The exploit overwrites the stored return address with the address of the injected shellcode. Our approach successfully identified that the stored return address was tainted and defeated the exploitation attempt.

Note on SIDVault. We would like to mention that our approach was tested on the SIDVault LDAP server buffer overflow attack [7]; SIDVault is proprietary software available without source code. Whereas our approach is applicable to such proprietary software, approaches that rely on source code, such as Xu et al.'s approach [67], cannot defend against attacks in such cases.

Table 5.3: Results of effectiveness evaluation

Control Data Attacks
CVE#            Program    Attack Type            Overwrite Target         Description                                                        Detected
-               Synthetic  Stack Smashing         Stored return address    Stack smashing attack                                              X
-               Synthetic  Heap Overflow          Boundary tags            Heap overflow attack                                               X
-               Synthetic  Format String Attack   GOT entry                Format string attack                                               X
CVE-2003-0201   samba      Buffer Overflow        Stored return address    Buffer overflow in call_trans2open() function                      X
CVE-2001-0010   bind       Buffer Overflow        Stored return address    Buffer overflow in nslookupComplain() routine                      X
CVE-2000-0763   xlock      Format String Attack   Stored return address    Format string vulnerability when using -d command line switch     X
CVE-2007-1711   PHP        Double free attack     Pointer to destructor    Double free vulnerability using session_decode() function          X
CVE-2007-4566   SIDVault   Buffer Overflow        Stored return address    Buffer overflow in Simple_Bind() function                          X

Non-Control Data Attacks
-               Synthetic  Buffer Overflow        Array in a structure     Non-control data attack                                            X
CVE-2001-0820   ghttpd     Buffer Overflow        Stored ESI register      Non-control data attack                                            X
CVE-2000-0573   wuftpd     Format String Attack   Cached copy of user ID   Non-control data attack                                            X

5.2.2 Performance Evaluation

To test the performance overhead of our approach, we applied it to several CPU-intensive programs:

• bc - bc is an interactive algebraic language with arbitrary precision, with several extensions including multi-character variable names, and full Boolean expressions. The test was to calculate the factorial of 600.

• Enscript - Enscript converts ASCII files to PostScript, HTML, RTF or Pretty-Print and stores generated output to a file or sends it directly to the printer. The test was to convert a 5.5 MB text file to postscript.

Table 5.4: Comparison of overhead incurred by our approach with that of Xu et al.'s approach

Program          Test                                        Xu et al.'s approach [67]   Our approach
bc-1.06          Find factorial of 600                       61%                         42.84%
enscript-1.6.4   Convert a 5.5 MB text file into a PS file   58%                         28.63%
bison-1.35       Parse a Bison file for C++ grammar          78%                         32.02%
gzip-1.3.3       Compress a 12 MB file                       106%                        45.38%

Table 5.5: Feature comparison with previous approaches

Feature                              TaintCheck [47]   TaintTrace [21]   LIFT [51]   Chen et al.'s approach [66]   Xu et al.'s approach [67]   Our approach
Works without additional hardware2   X                 X                 X           -                             X                           X
Works without source code3           X                 X                 X           X                             -                           X
Detects non-control data attacks     -                 -                 -           X                             X                           X

• Bison - Bison is a general-purpose parser generator that converts a grammar description for an LALR(1) context-free grammar into a program to parse that grammar. The test was to parse a Bison file for a C++ grammar.

• gzip - Gzip is the standard file compression utility. The test was to compress a 12 MB file.

We performed all the tests 10000 times and took the averages of the results. We compare our results with those of Xu et al.'s approach, which has the lowest overhead among previous approaches. Our approach incurred an average overhead of 37.2%, compared to 76% for Xu et al.'s approach [67], roughly halving the overhead. We present the test results in Table 5.4, and in Table 5.5 we present a detailed feature comparison of our approach with previous approaches.

2 Requires debugging information
3 Can only detect non-control data attacks where a pointer is tainted
4 Performance results were not available

5.3 Discussion

In this section, in the context of our approach, we discuss the ramifications of different security threats and address concerns regarding protection of the shadow memory.

Ramifications of Varied Security Threats. In the present day scenario, attacks on web applications, such as XSS and SQL injection, do contribute the majority of attacks; however, their impact is more limited than that of memory corruption attacks. For example, for a server having both an XSS vulnerability and a buffer overflow vulnerability, exploitation of the XSS flaw may lead to the credit card details of one user being stolen, while exploitation of the buffer overflow may lead to the credit card details of all users being stolen in a single attempt. Furthermore, exploiting XSS gives an attacker only limited (user-level) access to the server, while exploiting the buffer overflow typically gives him superuser access and may even give him access to the corporate intranet. It is important to note that memory corruption attacks have been used by worms such as CodeRed and Slammer, which plagued the Internet in a matter of minutes. Defending against web application attacks is easier due to the availability of source code, whereas defending against memory corruption attacks is harder due to source code unavailability, performance considerations and other factors. Construction of exploits for memory corruption attacks requires high expertise and experience, whereas constructing exploits for XSS and SQL injection is fairly easy. This is illustrated by the fact that a single zero-day exploit for a Microsoft platform costs between 10,000 and 100,000 USD in the underground market, while several freeware tools are available for exploiting XSS and SQL injection vulnerabilities.

Protection of Shadow Memory. For protecting the shadow memory, we use an approach similar to that used in [51]. Our shadow memory mapping strategy prevents attackers from directly modifying shadow memory as long as the attacker's code is under the control of Pin. The reason is that when the attacker issues an instruction to access an address addr in the shadow memory, the instrumented code will access memory at addr+shadow_base, which is beyond the boundary of the shadow memory area; this causes an invalid memory access exception. Note that an attacker might try to access the shadow memory by using a large value of addr so that addr+shadow_base overflows the 32-bit integer used to hold the sum. To counter such attacks, we check whether the sum addr+shadow_base leads to an overflow condition.

Implicit Information Flows. Implicit information flow occurs when the values of certain variables are related by virtue of program logic, even though there are no assignments between them. For example, the following code snippet illustrates such a case.

if (x == 0) y = 0;
if (x == 1) y = 1;

The above code fragment is equivalent to the assignment y=x. Therefore, if x is tainted (untainted), y is tainted (untainted) as well, although there is no direct assignment from x to y. But since the value of y is assigned from a constant, it would be marked as untainted, irrespective of the tag of x. Our approach supports some basic implicit information flows, such as the use of translation tables, which are sometimes used for decoding via table lookups, as in y = table[x]. In such a case, the array index x determines the value of y. To handle such cases, our approach marks the result of an array access as tainted (untainted) whenever the index is tainted (untainted). Much more complex implicit flows could be supported using a notion of noninterference, which however is too conservative and leads to high false positives.

Limitations. Our approach is built on top of TIED. Although TIED does not require the source code of the programs, it requires them to be compiled with debugging information. A limitation of our approach is therefore that it does not work for stripped binaries, i.e. programs without debugging and other symbol table information. However, we believe this is not a major limitation, as many

operating system distributions, for example Windows and the Fedora family of operating systems, make debugging information available as separate packages [8, 1]. The requirement for debugging information may also ease reverse engineering in some cases. However, we argue that our approach requires only the information related to variables; the debugging information related to code can safely be removed from the program binaries. Even in the presence of debugging information, obfuscations can still be applied to the code in the binary to make it harder to reverse engineer. Moreover, some approaches [11, 12] have been proposed in the past that are able to discover variables in stripped binaries with high accuracy. We believe using a similar technique would make our approach independent of the debugging information present in the executables, making it work even for stripped binaries.

Chapter 6

Conclusion and Future Work

6.1 Conclusion

Memory errors in C and C++ programs have been known for decades and constitute one of the oldest classes of software vulnerabilities. Researchers have been working on memory error protection mechanisms for just as long; nonetheless, this kind of vulnerability is far from being completely defeated. This thesis presented two techniques to defeat memory corruption attacks. Recognizing the effectiveness of anomaly-based detection techniques and taint tracking, we proposed two approaches that deal effectively with memory corruption attacks. Our first approach, FormatShield, uses anomaly detection to defeat format string attacks. By identifying vulnerable call sites and the associated exploitable contexts, it detects malicious input at such vulnerable call sites with an exploitable context. It defends against all types of format string attacks, i.e. arbitrary memory read and arbitrary memory write attempts, and incurs minimal performance overhead. Coarse Grained Dynamic Taint Analysis, the second approach proposed by this thesis, uses taint tracking together with security policies to defeat a wide range of memory corruption attacks. The technique first uses the debugging information present in a program binary to dump the location and size information of variables back into the binary. Then, by propagating taint on these variables and enforcing the security policies, our approach is able to defeat a broad range of memory corruption attacks, including non-control data attacks.

6.2 Future Work

Although Coarse Grained Dynamic Taint Analysis defends against several types of memory corruption attacks, it requires programs to be compiled with debugging information. One direction for future work is to extract variable information effectively without the need for debugging information. We are also exploring ways of incorporating more application context information so as to detect other attacks on critical variables identified by the application developer. Some approaches [11, 12] proposed in the past are able to discover variables in stripped binaries with high accuracy. We believe that using a similar technique would make our approach independent of the debugging information present in the executables, making it work even for stripped binaries.

Bibliography

[1] Fedora 7 Debuginfo Packages. http://ftp.cs.pu.edu.tw/Linux/Fedora/updates/testing/7/i386/debug/.

[2] ghttpd Log() Function Buffer Overflow Vulnerability. http://www.securityfocus.com/bid/5960.

[3] PaX. http://pax.grsecurity.net.

[4] PaX Team. PaX Address Space Layout Randomization (ASLR). http://pax.grsecurity.net/docs/aslr.txt.

[5] Perl security manual page. http://www.perldoc.com.

[6] Samba ‘call trans2open’ remote buffer overflow vulnerability. http://www.securityfocus.com/bid/7294.

[7] SIDVault Simple Bind Function Multiple Remote Buffer Overflow Vulnerabilities. http://www.securityfocus.com/bid/25460.

[8] Windows Symbol Packages. http://www.microsoft.com/whdc/DevTools/Debugging/symbolpkg.mspx.

[9] Stack Shield - A “stack smashing” technique protection tool for Linux, 2006. http://www.angelfire.com/sk/stackshield/.

[10] K. Avijit, P. Gupta, and D. Gupta. TIED, Libsafeplus: Tools for Runtime Buffer Overflow Protection. In SSYM’04: Proceedings of the 13th conference on USENIX Security Symposium, Berkeley, CA, USA, 2004. USENIX Association.

[11] G. Balakrishnan and T. W. Reps. Analyzing Memory Accesses in x86 Executables. In CC, pages 5–23, 2004.

[12] G. Balakrishnan and T. W. Reps. DIVINE: DIscovering Variables IN Executables. In VMCAI, pages 1–28, 2007.

[13] A. Baratloo, N. Singh, and T. Tsai. Transparent Run-Time Defense Against Stack Smashing Attacks. In ATEC ’00: Proceedings of the annual conference on USENIX Annual Technical Conference, pages 21–21, Berkeley, CA, USA, 2000. USENIX Association.

[14] E. G. Barrantes, D. H. Ackley, T. S. Palmer, D. Stefanovic, and D. D. Zovi. Randomized Instruction Set Emulation to Disrupt Binary Code Injection Attacks. In CCS ’03: Proceedings of the 10th ACM conference on Computer and communications security, pages 281–289, New York, NY, USA, 2003. ACM.

[15] F. Bellard. Tcc: Tiny C compiler. October 2003. http://www.tinycc.org.

[16] S. Bhatkar, D. C. DuVarney, and R. Sekar. Address Obfuscation: An Efficient Approach to Combat a Broad Range of Memory Error Exploits. In SSYM’03: Proceedings of the 12th conference on USENIX Security Symposium, Berkeley, CA, USA, 2003. USENIX Association.

[17] S. Bhatkar, R. Sekar, and D. C. DuVarney. Efficient Techniques for Comprehensive Protection from Memory Error Exploits. In SSYM’05: Proceedings of the 14th conference on USENIX Security Symposium, Berkeley, CA, USA, 2005. USENIX Association.

[18] bind. xlockmore User Supplied Format String Vulnerability. http://www.securityfocus.com/bid/1585.

[19] P. Broadwell, M. Harren, and N. Sastry. Scrash: A System for Generating Secure Crash Information. In SSYM’03: Proceedings of the 12th conference on USENIX Security Symposium, Berkeley, CA, USA, 2003. USENIX Association.

[20] S. Chen, J. Xu, E. C. Sezer, P. Gauriar, and R. K. Iyer. Non-Control Data Attacks are Realistic Threats. In SSYM’05: Proceedings of the 14th conference on USENIX Security Symposium, Berkeley, CA, USA, 2005. USENIX Association.

[21] W. Cheng, Q. Zhao, B. Yu, and S. Hiroshige. Tainttrace: Efficient Flow Tracing with Dynamic Binary Rewriting. In ISCC ’06: Proceedings of the 11th IEEE Symposium on Computers and Communications, pages 749–754, Washington, DC, USA, 2006. IEEE Computer Society.

[22] J. Chow, B. Pfaff, T. Garfinkel, K. Christopher, and M. Rosenblum. Understanding Data Lifetime via Whole System Simulation. In SSYM’04: Proceedings of the 13th conference on USENIX Security Symposium, pages 22–22, Berkeley, CA, USA, 2004. USENIX Association.

[23] TIS Committee. Executable and Linking Format (ELF) Specification, version 1.2, 1995.

[24] C. Cowan, M. Barringer, S. Beattie, G. Kroah-Hartman, M. Frantzen, and J. Lokier. FormatGuard: Automatic Protection from Printf Format String Vulnerabilities. In SSYM’01: Proceedings of the 10th conference on USENIX Security Symposium, Berkeley, CA, USA, 2001. USENIX Association.

[25] C. Cowan, S. Beattie, J. Johansen, and P. Wagle. Pointguard: Protecting Pointers from Buffer Overflow Vulnerabilities. In SSYM’03: Proceedings of the 12th conference on USENIX Security Symposium, Berkeley, CA, USA, 2003. USENIX Association.

[26] C. Cowan, C. Pu, D. Maier, H. Hintony, J. Walpole, P. Bakke, S. Beattie, A. Grier, P. Wagle, and Q. Zhang. StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks. In SSYM’98: Proceedings of the 7th conference on USENIX Security Symposium, Berkeley, CA, USA, 1998. USENIX Association.

[27] J. R. Crandall and F. T. Chong. Minos: Control Data Attack Prevention Orthogonal to Memory Model. In MICRO 37: Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, pages 221–232, Washington, DC, USA, 2004. IEEE Computer Society.

[28] T. Durden. Bypassing PaX ASLR Protection. Phrack Magazine, June 2002. http://www.phrack.org/phrack/59/p59-0x09.

[29] S. Esser. PHP Session Decode Double Free Memory Corruption Vulnerability. http://www.securityfocus.com/bid/23121.

[30] H. Etoh and K. Yoda. Protecting from stack-smashing attacks. Technical report, IBM Research Division, Tokyo Research Laboratory, June 2000.

[31] D. Evans and D. Larochelle. Improving security using extensible lightweight static analysis. Software, IEEE, 19(1):42–51, Jan/Feb 2002.

[32] S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff. A Sense of Self for Unix Processes. In SP ’96: Proceedings of the 1996 IEEE Symposium on Security and Privacy, Washington, DC, USA, 1996. IEEE Computer Society.

[33] J. S. Foster, M. Fähndrich, and A. Aiken. A Theory of Type Qualifiers. In PLDI ’99: Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation, pages 192–203, New York, NY, USA, 1999. ACM.

[34] A. Ghosh, T. O’Connor, and G. McGraw. An automated approach for identifying potential vulnerabilities in software. In Proceedings of the 1998 IEEE Symposium on Security and Privacy, pages 104–114, May 1998.

[35] D. Jacobowitz. Multiple Linux Vendor rpc.statd Remote Format String Vulnerability. http://www.securityfocus.com/bid/1480.

[36] T. Jim, J. G. Morrisett, D. Grossman, M. W. Hicks, J. Cheney, and Y. Wang. Cyclone: A Safe Dialect of C. In ATEC ’02: Proceedings of the General Track of the annual conference on USENIX Annual Technical Conference, pages 275–288, Berkeley, CA, USA, 2002. USENIX Association.

[37] R. W. M. Jones and P. Kelly. Backwards-compatible bounds checking for arrays and pointers in C programs. In Third International Workshop on Automated Debugging, 1997.

[38] G. S. Kc, A. D. Keromytis, and V. Prevelakis. Countering Code-Injection Attacks with Instruction-Set Randomization. In CCS ’03: Proceedings of the 10th ACM conference on Computer and communications security, pages 272–280, New York, NY, USA, 2003. ACM.

[39] V. Kiriansky, D. Bruening, and S. P. Amarasinghe. Secure Execution via Program Shepherding. In Proceedings of the 11th USENIX Security Symposium, pages 191–206, Berkeley, CA, USA, 2002. USENIX Association.

[40] L. C. Lam and T.-c. Chiueh. A General Dynamic Information Flow Tracking Framework for Security Applications. In ACSAC ’06: Proceedings of the 22nd Annual Computer Security Applications Conference, pages 463–472, Washington, DC, USA, 2006. IEEE Computer Society.

[41] D. Larochelle and D. Evans. Statically detecting likely buffer overflow vulnerabilities. In SSYM’01: Proceedings of the 10th conference on USENIX Security Symposium, Berkeley, CA, USA, 2001. USENIX Association.

[42] Z. Lin, N. Xia, G. Li, B. Mao, and L. Xie. Transparent Run-Time Prevention of Format-String Attacks via Dynamic Taint and Flexible Validation. In ISC, pages 17–31, 2006.

[43] C.-K. Luk, R. S. Cohn, R. Muth, H. Patil, A. Klauser, P. G. Lowney, S. Wallace, V. J. Reddi, and K. M. Hazelwood. Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation. In PLDI, pages 190–200, 2005.

[44] B. P. Miller, L. Fredriksen, and B. So. An empirical study of the reliability of UNIX utilities. Communications of the ACM, 33(12):32–44, 1990.

[45] G. C. Necula, S. McPeak, and W. Weimer. Ccured: Type-Safe Retrofitting of Legacy Code. In POPL ’02: Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 128–139, New York, NY, USA, 2002. ACM.

[46] N. Nethercote and J. Seward. Valgrind: a framework for heavyweight dynamic binary instru- mentation. In PLDI ’07: Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation, pages 89–100, New York, NY, USA, 2007. ACM.

[47] J. Newsome and D. X. Song. Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software. In NDSS, 2005.

[48] A. Osborne and J. McDonald. ISC Bind 8 Transaction Signatures Buffer Overflow Vulnerability. http://www.securityfocus.com/bid/2302.

[49] Parasoft. Insure++: Automatic runtime error detection. 2004. http://www.parasoft.com.

[50] N. Provos. Improving Host Security with System Call Policies. In SSYM’03: Proceedings of the 12th conference on USENIX Security Symposium, Berkeley, CA, USA, 2003. USENIX Association.

[51] F. Qin, C. Wang, Z. Li, H.-s. Kim, Y. Zhou, and Y. Wu. LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks. In MICRO 39: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, pages 135–148, Washington, DC, USA, 2006. IEEE Computer Society.

[52] M. F. Ringenburg and D. Grossman. Preventing Format-String Attacks via Automatic and Efficient Dynamic Checking. In CCS ’05: Proceedings of the 12th ACM conference on Computer and communications security, pages 354–363, New York, NY, USA, 2005. ACM.

[53] T. Robbins. Libformat. http://www.wiretapped.net/~fyre/software/libformat.html.

[54] O. Ruwase and M. S. Lam. A practical dynamic buffer overflow detector. In Proceedings of the 11th Annual Network and Distributed System Security Symposium, pages 159–169, 2004.

[55] R. Sekar, M. Bendre, D. Dhurjati, and P. Bollineni. A Fast Automaton Based Method for Detecting Anomalous Program Behaviors. In SP ’01: Proceedings of the 2001 IEEE Symposium on Security and Privacy, Washington, DC, USA, 2001. IEEE Computer Society.

[56] H. Shacham, M. Page, B. Pfaff, E.-J. Goh, N. Modadugu, and D. Boneh. On the Effectiveness of Address-Space Randomization. In ACM Conference on Computer and Communications Security, pages 298–307, 2004.

[57] U. Shankar, K. Talwar, J. S. Foster, and D. Wagner. Detecting Format String Vulnerabilities with Type Qualifiers. In SSYM’01: Proceedings of the 10th conference on USENIX Security Symposium, Berkeley, CA, USA, 2001. USENIX Association.

[58] A. N. Sovarel, D. Evans, and N. Paul. Where’s the FEEB? The Effectiveness of Instruction Set Randomization. In SSYM’05: Proceedings of the 14th conference on USENIX Security Symposium, Berkeley, CA, USA, 2005. USENIX Association.

[59] @stake Inc. tcpflow 0.2.0 format string vulnerability. August 2003. http://www.securityfocus.com/advisories/5686.

[60] G. E. Suh, J. W. Lee, D. Zhang, and S. Devadas. Secure Program Execution via Dynamic Information Flow Tracking. In ASPLOS-XI: Proceedings of the 11th international conference on Architectural support for programming languages and operating systems, pages 85–96, New York, NY, USA, 2004. ACM.

[61] tf8. Wu-Ftpd Remote Format String Stack Overwrite Vulnerability. http://www.securityfocus.com/bid/1387.

[62] T. Tsai and N. Singh. Libsafe: Transparent System-Wide Protection against Buffer Overflow Attacks. 2002.

[63] D. Wagner, J. S. Foster, E. A. Brewer, and A. Aiken. A First Step towards Automated Detection of Buffer Overrun Vulnerabilities. In Network and Distributed System Security Symposium, pages 3–17, 2000.

[64] Z. Xiao. An Automated Approach to Software Reliability and Security. 2003. Invited Talk, Department of Computer Science, University of California at Berkeley.

[65] J. Xu, Z. Kalbarczyk, and R. K. Iyer. Transparent Runtime Randomization for Security. In SRDS, 2003.

[66] J. Xu and N. Nakka. Defeating Memory Corruption Attacks via Pointer Taintedness Detection. In DSN ’05: Proceedings of the 2005 International Conference on Dependable Systems and Networks, pages 378–387, Washington, DC, USA, 2005. IEEE Computer Society.

[67] W. Xu, S. Bhatkar, and R. Sekar. Taint-Enhanced Policy Enforcement: A Practical Approach to Defeat a Wide Range of Attacks. In USENIX-SS’06: Proceedings of the 15th conference on USENIX Security Symposium, Berkeley, CA, USA, 2006. USENIX Association.

[68] X. Zhang, A. Edwards, and T. Jaeger. Using CQUAL for Static Analysis of Authorization Hook Placement. In Proceedings of the 11th USENIX Security Symposium, pages 33–48, Berkeley, CA, USA, 2002. USENIX Association.
