Heaptherapy: an Efficient End-To-End Solution Against Heap Buffer

HeapTherapy: An Efficient End-to-end Solution against Heap Buffer Overflows Qiang Zeng*, Mingyi Zhao*, Peng Liu The Pennsylvania State University University Park, PA, 16802, USA Email: fquz105, muz127, [email protected] Abstract—For decades buffer overflows have been one of the The first category contains a variety of approaches, which most prevalent and dangerous software vulnerabilities. Although range from bounds checking [21], [38], [6]; to shadow memory many techniques have been proposed to address the problem, based checking [25], [39], [5], [12]; to control/data flow they mostly introduce a very high overhead while others assume monitoring [23], [4], [11]; to N-version/N-variant systems [16], the availability of a separate system to pinpoint attacks or provide [7], [30]. Even with many optimization methods they still lead detailed traces for defense generation, which is very slow in itself to a very high overhead, or require significant extra computing and requires considerable extra resources. We propose the first efficient solution against heap buffer overflows that integrates resources. For instance, AddressSanitizer [39] uses a much attack detection, exploit diagnosis, and defense generation in more efficient shadow mapping and a more compact shadow a single system, named HeapTherapy. It conducts on-the-fly encoding than Valgrind (Memcheck) [25] and TaintTrace [12], lightweight trace collection on the production system (instead but still incurs 73% slowdown and over 3X memory overhead. of a separate replaying system), and initiates automated diagnosis As another example, N-variant execution [16] requires dou- upon exploit detection to generate defenses in realtime. It can bling resource consumption and system maintenance, which is handle both over-write and over-read attacks, such as the recently costly. explosed Heartbleed attack. The system has no false positives, and keeps effective under polymorphic exploits. It is compatible with Instead of performing expensive full execution monitoring, mainstream hardware and operating systems, and does not rely approaches in the second category generate tailored defenses on specific allocation algorithms. We evaluated HeapTherapy on against learned vulnerabilities. Given a zero-day vulnerability, a variety of services (database, web, and ftp) and benchmarks (SPEC CPU2006); it incurs a very low average overhead in terms they trade the prevention of the first buffer overflow(s) for of both speed (6.2%) and memory (7.7%). subsequent low-cost protection. An example of the second category is patching. However, generating patches is a lengthy procedure. According to Symantec the average time for gener- I. INTRODUCTION ating a critical patch for enterprise applications is 28 days [43]. A few approaches that generate defenses quickly after exploit Programs written in C and C++ contain a large number detection have been proposed. One popular approach is to of buffer overflow bugs, which involve writes or reads going generate input filters to filter out suspicious input [22], [27], beyond buffer boundaries.1 In addition to erroneous execution, [50]. However, the false negative rate rises when dealing with buffer overflow bugs can lead to various security threats, input obfuscation, and false positives are a common issue for including data corruption, control-flow hijack, and informa- identifying innocuous requests as attacks. A type of meth- tion leakage. The recently published Heartbleed vulnerability, ods, such as Vulnerability-Specific Execution-based Filtering which has affected hundreds of thousands of servers, was (VSEF) [26] and Vigilante: [14], are based on the observation due to a heap buffer over-read bug that leads to information that, given a sample exploit, only a small portion of instructions leakage [17], [28]. are relevant to the exploit. A defense that monitors the relevant Although there are many tools dedicated to finding buffer part of the program execution is generated to block further overflow bugs through static analysis [47], it is very unlikely to exploits. It is more efficient than full execution monitoring eliminate all the bugs through static analysis. The reality is that and performs better when handling input obfuscation. in 2014 one third of newly exposed software vulnerabilities published by CERT were related to buffer overflows [45]. However, VSEF relies on a separate system to provide Therefore, runtime measures that protect program execution sample exploits for analysis and defense generation; how to against overflow attacks are important. Such measures can pinpoint malicious inputs efficiently is a challenging problem be roughly divided into three categories: (1) Full execution in itself, and such a system might be unavailable due to monitoring; (2) Approaches that learn from history to improve resource constraints. Second, a defense in VSEF takes effect themselves; and (3) Measures that greatly increase the diffi- by instrumenting instructions accessing a specific overflow culty of exploiting buffer overflows. Examples of the third cat- target. While instructions that access a target near a vulnerable egory include StackGuard [15], Data Execution Prevention [2], buffer on the stack, such as a return address, can be easily Address Space Layout Randomization [49], [8], and concurrent identified, it is very unlikely to determine the instructions heap scanning [52], [44]. that access adjacent regions of a vulnerable heap buffer, for a heap buffer can be allocated almost anywhere on the heap. *These two authors have contributed equally. A proper solution to heap buffer overflows is missing in that 1The term “overflow” in this paper refers to both over-writes and over-reads. work. Vigilante has the same issues. 1 As stack-based buffer overflows are better addressed nowa- current calling context ID against the one contained in any days, heap buffer overflows gain growing attention of at- defense. If they match, the new buffer is regarded vulnerable tackers [35], [52]. By noticing that full program execution and a guard page is attached after it. Subsequently, without monitoring usually incurs a high overhead, we target an tracking access instructions or the information flow, out-of- efficient heap buffer overflow countermeasure falling into the bounds buffer access due to continuous read or writes are second category, i.e., learning from history to protect against prevented by system protection automatically. While the guard attacks. It should meet the following goals simultaneously: page is an expensive enhancement [32], HeapTherapy applies (G1) All-in-one solution: trace collection and defense gen- it only to vulnerable heap buffers with a low overall overhead. eration should be built directly into the production system, so that it does not need a separate system and can respond We have implemented HeapTherapy, and evaluated it on a to any detection of attacks instantly. (G2) High efficiency: variety of services (database, web, and ftp) and benchmarks the overall overhead should be low. (G3) High accuracy: the (SPEC CPU2006). The throughput overheads incurred by generated defense should have few to no false positives and HeapTherapy on service programs are all less than 8% with negatives even with polymorphic attacks. (G4) Compatible zero false positives. A thorough evaluation on SPEC shows with mainstream hardware and runtime environment: it should that the speed overhead averages 6.2% and the memory over- not require special hardware and can work with existing head 7.7% when dealing with 10 synthesized vulnerabilities runtime environment; ideally, it does not depend on custom simultaneously. heap allocation algorithms. To our knowledge there is no such We made the following contributions: solution satisfying all the desirable requirements. • We propose an end-to-end solution that integrates In this paper, we present HeapTherapy, a highly efficient defense generation and overflow prevention in a single end-to-end solution against heap buffer overflows that meets system. The defense is generated automatically on the all the goals above. Unlike approaches applying costly bounds user side, so the user system can be enhanced instantly, checking or data/control flow tracking, HeapTherapy employs and the user does not need to maintain a separate inexpensive techniques to identify vulnerable heap buffers system for defense generation. swiftly and enhance them locally. HeapTherapy contains in- memory trace collection, online exploit detection and realtime • Compared to existing work, HeapTherapy incurs a defense generation as part of the defense system. HeapTherapy very low speed and memory overhead. identifies vulnerable heap buffers based on the intrinsic char- acteristics of an exploit, as opposed to filtering out malicious • The defense generated by HeapTherapy does not have inputs based on signatures, so it is effective under polymorphic false positives, and keeps effective under polymorphic attacks. Finally, it can be easily deployed and does not rely on attacks. specific allocation algorithms. • It is compatible with existing hardware, systems, and libraries. It does not require a custom heap allocation To detect over-read attacks HeapTherapy places inaccessi- 2 ble guard pages randomly throughout the heap space, so that, algorithm. Thus, its deployment is convenient. when repetitive attacks such as Heartbleed are launched to • While none of the techniques employed in HeapTher- collect jigsaw pieces on the heap, it is highly probable that a apy, such as canaries, guard pages, probabilistic detec-

Load more