A Platform for Secure Static Binary Instrumentation ∗

Mingwei Zhang, Rui Qiao, Niranjan Hasabnis, R. Sekar
Stony Brook University

Abstract

Program instrumentation techniques form the basis of many recent software security defenses, including defenses against common exploits and security policy enforcement. As compared to source-code instrumentation, binary instrumentation is easier to use and more broadly applicable due to the ready availability of binary code. Two key features needed for security instrumentations are (a) it should be applied to all application code, including code contained in various system and application libraries, and (b) it should be non-bypassable. So far, dynamic binary instrumentation (DBI) techniques have provided these features, whereas static binary instrumentation (SBI) techniques have lacked them. These features, combined with ease of use, have made DBI the de facto choice for security instrumentations. However, DBI techniques can incur high overheads in several common usage scenarios, such as application startups, system calls, and many real-world applications. We therefore develop a new platform for secure static binary instrumentation (PSI) that overcomes these drawbacks of DBI techniques, while retaining the security, robustness and ease-of-use features. We illustrate the versatility of PSI by developing several instrumentation applications: basic block counting, shadow stack defense against control-flow hijack and return-oriented programming attacks, and system call and library policy enforcement. While being competitive with the best DBI tools on the CPU-intensive SPEC 2006 benchmark, PSI provides an order of magnitude reduction in overheads on a collection of real-world applications.

∗ This work was supported in part by AFOSR grant FA9550-09-1-0539, NSF grant CNS-0831298, and ONR grant N000140710928.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
VEE ’2014, March 1–2, 2014, Salt Lake City, Utah, USA.
Copyright © 2014 ACM 978-1-4503-2764-0/14/03. $15.00.
http://dx.doi.org/10.1145/2576195.2576208

1. Introduction

Program instrumentation has played a central role in exploit detection/prevention, security policy enforcement, application monitoring and debugging. Such instrumentation may be performed either on source or binary code. Source code instrumentations can be more easily and extensively optimized by exploiting higher level information such as types. However, binary instrumentations are more widely applicable since users have ready access to binaries. Moreover, security instrumentations should be applied to all code, including all libraries, inline assembly code, and any code inserted by the compiler/linker. Here again, binary based techniques are advantageous.

Binary instrumentation can either be static or dynamic. Static binary instrumentation (SBI) is performed offline on binary files, whereas dynamic binary instrumentation (DBI) operates on code already loaded into main memory. DBI techniques disassemble and instrument each basic block just before its first execution. DBI has been the technique of choice for security instrumentation of COTS binaries. Previous works have used DBI for sandboxing [15, 17], taint-tracking [23, 28], defense from return-oriented programming (ROP) attacks [11, 12], and other techniques for hardening benign code [16, 26]. This is because DBI platforms provide several features that simplify instrumentation development, while ensuring their security:

• Non-bypassable instrumentation. DBI tools check every control transfer to ensure that the target is instrumented, and hence can block attempts to escape security checks enforced by the instrumentation, e.g., by returning/jumping to (i) data (code injection attacks), (ii) the middle of an instruction (typical ROP attacks), or (iii) the middle of added instrumentation or past its end.
• Completeness. DBI techniques instrument all executed code, regardless of whether it resides in executables or libraries. The alternative of omitting library instrumentation is unsatisfactory because the vast majority of code executed by today’s applications resides in libraries, and all of this code will remain unprotected.
• Ease of use. Instrumenting an application is as simple as prefixing its invocation with the name of a wrapper program. There is no need to explicitly instrument the application prior to its run, nor is there a need to know (and instrument) the libraries used by it. Moreover, DBI platforms such as Pin [19], DynamoRIO [8] and Valgrind [22] provide a convenient API that greatly simplifies the development of new instrumentations (also called tools).

Previous SBI techniques have lacked these features, and moreover, were targeted at specific problems such as control-flow integrity (CFI) [2] or software fault isolation (SFI) [32]. Consequently, they don’t address the issues in developing a general-purpose platform for binary instrumentation.

Main Results and Contributions

We present a general-purpose binary instrumentation platform PSI that addresses the shortcomings of previous SBI techniques. It combines the benefits of DBI with the advantages that are unique to SBI, including:

• the ability to trade off increased (offline) analysis/instrumentation time for faster runtime performance, and
• avoiding reliance on a potentially large and complex virtual environment at runtime.

Moreover, PSI provides low overheads across a wide range of benchmarks, while avoiding some pitfalls of DBI platforms such as high application startup times and high overheads for systems applications. Our key contributions are summarized below.

Secure static instrumentation. Two key features needed for security instrumentations are completeness (instrumentation should be applied to all code that can get executed) and non-bypassability (instrumented code should not be able to bypass or subvert the added instrumentation). Many previous SBI techniques, including Native Client [35], PittSFIeld [20], PEBIL [18] and many others [2, 6, 14, 24, 36, 37], are not complete for COTS binaries since they require additional information (such as symbol or relocation information) for correctly instrumenting binaries.

Among techniques applicable to stripped binaries, Reins [33] and SecondWrite [4, 13] don’t instrument all libraries. Moreover, several of these techniques, including Dyninst [10], SecondWrite and Binary stirring [34], do not prevent execution from escaping instrumentation since they do not check the validity of targets such as return addresses. Although our BinCFI [38] system performs such checks, it implements a single hard-coded instrumentation, and hence there is no general discussion or treatment of instrumentation non-bypassability.

A versatile, easy-to-use static instrumentation platform. Our platform provides an easy-to-use API with convenient abstractions for low-level instrumentation, including data structures to capture instructions, basic blocks, and control-flow graphs. We illustrate the [...]

[...] for determining library dependencies (e.g., ldd on Linux) cannot identify libraries that are loaded after an application begins execution. For a collection of popular real-world applications, more than 40% of library loads occurred after the commencement of application execution. To support seamless instrumentation of such libraries, we have developed an on-demand static instrumentation technique.

Good performance across a wide range of applications. We present a comparative performance evaluation of PSI against the two leading DBI tools, namely DynamoRIO and Pin. We summarize our results below:

• BB counting on SPEC 2006. PSI incurs an average overhead of 69% on SPEC 2006 for basic-block counting, compared with 53% for DynamoRIO and 97% for Pin.
• Shadow stack. PSI’s overhead is less than a quarter of that reported by ROPdefender (18% vs 74%).
• lmbench Microbenchmark. The average overhead of PSI is about 10 times smaller than DynamoRIO, and 200 times smaller than Pin.
• A collection of real-world applications. PSI’s overheads are about 7 to 13 times lower than DynamoRIO, and 60 times lower than Pin on several real-world applications, including compilation, software updates, etc.

Our PSI platform will be available for download from http://seclab.cs.stonybrook.edu/download.

Scope and Limitations. PSI is a general platform for the instrumentation of COTS applications. It is targeted at benign COTS binaries that do not employ obfuscation. If a binary employs obfuscation (as is common with malware), our disassembly technique can fail to detect all of its code, or perform incorrect disassembly. However, since control flows are checked at runtime, PSI will block all attempts to execute code that wasn’t disassembled and instrumented. Thus, binaries employing obfuscation may fail at runtime due to control-flow transfers being blocked, but they won’t be able to bypass PSI.

Similar to most SBI techniques, PSI does not support self-modifying code. In particular, any attempt to modify existing
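
To make the non-bypassability requirement concrete, the following toy model sketches a runtime check on indirect control transfers: a target is allowed only if it is the start of an instruction that was disassembled and instrumented. This is a minimal sketch under stated assumptions, not PSI's implementation, which this excerpt does not describe. The names `InstrumentedImage`, `build_image`, `check_transfer` and `ControlFlowViolation`, and all addresses, are hypothetical.

```python
from dataclasses import dataclass, field


class ControlFlowViolation(Exception):
    """Raised when a transfer would leave the instrumented code."""


@dataclass
class InstrumentedImage:
    """Toy model of one statically instrumented module (hypothetical)."""
    # Start addresses of the original instructions that were
    # disassembled and instrumented. Only these are legal targets.
    valid_targets: set = field(default_factory=set)


def build_image(instructions):
    """Build the target set from (start_address, length) pairs.

    Only instruction starts are recorded, so an address in the middle
    of an instruction is never a valid target.
    """
    return InstrumentedImage(valid_targets={start for start, _ in instructions})


def check_transfer(image, target):
    """Return the target if it is instrumented, otherwise block it."""
    if target not in image.valid_targets:
        raise ControlFlowViolation(f"blocked transfer to {target:#x}")
    return target


# Three instructions at 0x1000 (5 bytes), 0x1005 (2 bytes), 0x1007 (1 byte).
image = build_image([(0x1000, 5), (0x1005, 2), (0x1007, 1)])

check_transfer(image, 0x1005)  # start of an instrumented instruction

for bad_target in (0x1002, 0x2000):  # mid-instruction, and a data address
    try:
        check_transfer(image, bad_target)
    except ControlFlowViolation:
        pass  # the transfer is rejected instead of running uninstrumented code
```

In this model, a jump into data and a jump into the middle of an instruction fail the same lookup, which mirrors why code that was never disassembled would be blocked rather than executed without instrumentation.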
