ASIA CCS'17 ~ Strict Virtual Call Integrity Checking for C++ Binaries

Strict Virtual Call Integrity Checking for C++ Binaries Mohamed Elsabagh Dan Fleck Angelos Stavrou [email protected] dfl[email protected] [email protected] Department of Computer Science George Mason University Fairfax, VA 22030, USA ABSTRACT though modern systems are equipped with W⊕X and Data Modern operating systems are equipped with defenses that Execution Prevention (DEP), attackers can still achieve ar- render legacy code injection attacks inoperable. However, bitrary code execution by repurposing existing code from the attackers can bypass these defenses by crafting attacks that program memory, in what is known as code-reuse attacks. reuse existing code in a program's memory. One of the most This can range from reusing blocks of instructions, such common classes of attacks manipulates memory data used as Return Oriented Programming (ROP), to even reusing indirectly to execute code, such as function pointers. This is whole functions in a Function Reuse Attack (FRA). especially prevalent in C++ programs, since tables of func- The use of Control Flow Integrity (CFI) [1], which is a crit- tion pointers (vtables) are used by all major compilers to ical program security property, can assure the program does support polymorphism. In this paper, we propose VCI, a not execute unintended code. Unfortunately, constructing binary rewriting system that secures C++ binaries against a sound and complete CFI policy has proven to be a chal- vtable attacks. VCI works directly on stripped binary files. lenging task [9]. Enforcing CFI is especially hard due to It identifies and reconstructs various C++ semantics from indirect control flow transfer, such as indirect calls through the binary, and constructs a strict CFI policy by resolving function pointers. The problem becomes even harder if the and pairing virtual function calls (vcalls) with precise sets source code is not available. This makes binary-only solu- of target classes. The policy is enforced by instrumenting tions very desirable, since, in practice, the source code of checks into the binary at vcall sites. Experimental results many programs is not always available, and that includes on SPEC CPU2006 and Firefox show that VCI is signifi- many commercial products, 3rd party libraries, legacy soft- cantly more precise than state-of-the-art binary solutions. ware and firmware to name a few. Even if the source code is Testing against the ground truth from the source-based de- available, compiling in new protections is not always feasible fense GCC VTV, VCI achieved greater than 60% precision or desirable, for instance, due to the presence of legacy code in most cases, accounting for at least 48% to 99% additional and compiler dependencies. reduction in the attack surface compared to the state-of- Indirect calls are prevalent in OOP languages in order the-art binary defenses. VCI incurs a 7:79% average run- to enable polymorphism. Of particular interest to us is ++ time overhead which is comparable to the state-of-the-art. C , where all major compilers, including GCC, LLVM, and ++ In addition, we discuss how VCI defends against real-world MSVC, support C polymorphism via tables of function attacks, and how it impacts advanced vtable reuse attacks pointers. This is also the case for compilers of closely related # ++ such as COOP. languages, such as C and D. C supports class and function polymorphs by allowing derived classes to redefine base functions that are declared virtual. Each object of a class Keywords that (re)defines virtual functions stores a pointer (vptr) to Virtual table attacks; C++; Control flow integrity; Type-call a read-only table of pointers to virtual function definitions pairing; Static binary analysis (called vtable for short). To invoke a virtual function, the compiler generates code that indirectly executes the corresponding function in the object's vtable (see Section 2). We 1. INTRODUCTION refer to such code sites in the binary as virtual call (vcall) Presently, memory subversion remains an unsolved secu- sites. rity threat. By manipulating control data, such as func- In an unprotected binary, an attacker with control over tion pointers and return addresses, attackers can hijack the an object's memory or vtable can call any function within control flow of programs and execute arbitrary code. Even the program whenever the program uses the object's vtable to make a vcall. This is typically achieved by exploiting a memory access bug that enables overwriting the in Permission to make digital or hard copies of all or part of this work for personal or vptr classroom use is granted without fee provided that copies are not made or distributed an object's memory, in what is known as a \vtable attack". for profit or commercial advantage and that copies bear this notice and the full cita- Perhaps the most common class of enabler bugs in this cat- tion on the first page. Copyrights for components of this work owned by others than egory is the infamous use-after-free [2]. Here, a pointer to a ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or re- publish, to post on servers or to redistribute to lists, requires prior specific permission freed object is used in a later program statement (a dangling and/or a fee. Request permissions from [email protected]. pointer) to invoke one of the object's virtual functions. This ASIA CCS ’17, April 02-06, 2017, Abu Dhabi, United Arab Emirates dangling pointer can allow an attacker to execute arbitrary c 2017 ACM. ISBN 978-1-4503-4944-4/17/04. $15.00 code if she can control the contents of the object's freed DOI: http://dx.doi.org/10.1145/3052973.3052976 140 memory, e.g., using heap overflows or heap spraying [12]. its runtime overhead. We show that VCI can mitigate Such bugs are very prevalent in commodity desktop appli- real-world attacks, and incurs a comparable overhead cations, such as office suites and browsers, since they are to existing solutions. typically written in C++. Recent studies (e.g., [7, 22, 25]) The rest of the paper is organized as follows: Section 2 suggested use-after-free vulnerabilities account for at least provides an overview of relevant C++ primitives. In Sec- 69% of all vulnerabilities in browsers, about 50% of Win- tion 3 we define the threat model, discuss vtable attacks dows 7 exploits, and 21% of all vulnerabilities in all operat- and give an overview of our solution. Section 4 lays out the ing systems. details of VCI. We evaluate VCI in Section 5, and discuss In this paper, we present VCI, a static binary CFI system limitations and improvements in Section 6. We present re- that retrofits C++ binaries with defenses against vtable at- lated work in Section 7 and conclude in Section 8. In the tacks. VCI protects the binaries by enforcing a strict CFI Appendix, we provide additional technical details and dis- policy that limits the number of callable function from vcall cuss complementary policies. sites (see Section 3). VCI works on stripped binaries, without needing debug, symbol or type information. To deter- mine valid function targets we developed algorithms to re- 2. BACKGROUND construct several C++ semantics from binaries, namely: vta- Commodity applications, such as office suites and web bles, constructors, class layouts, class hierarchies, and vcalls browsers, are built with performance in mind. Given the (see Section 4). VCI exploits patterns in the assembly, and sophisticated functionalities they provide, it is standard to uses backward slicing and inter-procedural analysis to sym- use languages that provide sufficient levels of abstraction bolically trace the this pointer expressions of objects across with a minimal performance penalty. Therefore, low-level function boundaries. It builds a mapping between vcall sites object-oriented languages, such as C++, are typically the and their target class types. It then instruments the binary choice for their implementation. To enable polymorphism, by generating and injecting the integrity policy to enforce C++ uses virtual functions. A function is declared virtual the mapping at runtime. if its behavior (implementation) can be changed by derived We implemented a prototype of VCI in C++ on Linux, us- classes. The exact function body to be called is determined ing Dyninst [15] for binary parsing and rewriting. The pro- at runtime depending on the invoking object's class. totype consists of ∼3500 SLOC for the analysis in addition to a ∼500 SLOC dynamic library where the integrity pol- 2.1 Polymorphism and Virtual Tables icy procedures reside. Experimental results (see Section 5) All major C++ compilers, including GCC, Clang/LLVM, on the C++ SPEC CPU2006 benchmarks and Mozilla Fire- fox show that VCI significantly reduces the attack surface MSVC, Linux versions of HP and Intel compilers, use vta- compared to the state-of-the-art binary vtable defenses. For bles to dispatch virtual functions. A vtable is a reserved instance, in comparison with VTint [43] and vfGuard [30], read-only table in the binary that contains function point- VCI achieved at least 96% and 48% additional reduction ers to the definitions of virtual functions accessible through a polymorphic class. A polymorphic class is a class that in the number of allowable vcall targets, respectively. In 1 comparison to GCC VTV (source-based ground truth), VCI declares, defines or inherits virtual functions. Each virtual achieved the highest precision amongst other binary solu- function in a class has a corresponding offset in the class' tions, with 100% precision in some cases and greater than vtable which stores the address of the implementation body 60% precision for the majority of the test programs. Our of the function in the code section. Whenever an object of experiments show that VCI incurs a low runtime overhead some class type invokes a virtual function, the class' vtable (∼7:79%), and can defend against real-world exploits includ- is accessed, and the address at the corresponding function ing the recent COOP attacks [10, 34].

Load more