WatchdogLite: Hardware-Accelerated Compiler-Based Pointer Checking

Santosh Nagarakatte, Rutgers University
Milo M. K. Martin, University of Pennsylvania
Steve Zdancewic, University of Pennsylvania

Abstract

Lack of memory safety in C is the root cause of a multitude of serious bugs and security vulnerabilities. Numerous software-only and hardware-based schemes have been proposed to enforce memory safety. Among these approaches, pointer-based checking, which maintains per-pointer metadata in a disjoint metadata space, has been recognized as providing comprehensive memory safety. Software approaches for pointer-based checking have high performance overheads. In contrast, hardware approaches introduce a myriad of hardware structures and widgets to mitigate those performance overheads.

This paper proposes WatchdogLite, an ISA extension that provides hardware acceleration for a compiler implementation of pointer-based checking. This division of labor between the compiler and the hardware allows for hardware acceleration while using only preexisting architectural registers. By leveraging the compiler to identify pointers, perform check elimination, and insert the new instructions, this approach attains performance similar to prior hardware-intensive approaches without adding any hardware structures for tracking metadata.

Categories and Subject Descriptors C.1 [Computer Systems Organization]: Processor Architectures; D.2.5 [Software Engineering]: Testing and Debugging; D.3.4 [Programming Languages]: Processors

General Terms Languages, Performance, Security

Keywords memory safety, spatial safety, temporal safety, bounds checking, use-after-free checking

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CGO ’14, February 15-19, 2014, Orlando, FL, USA
Copyright © 2014 ACM 978-1-4503-2670-4/14/02. $15.00
http://dx.doi.org/10.1145/2544137.2544147

1. Introduction

C and C++ are the languages of choice for implementing infrastructure code and all kinds of low-level software. Such languages remain in common usage both for legacy reasons and because they provide low-level access to underlying hardware, explicit control over memory management, and high performance. However, a longstanding problem with code written in C/C++ is the lack of memory safety: accessing beyond the bounds (spatial safety violations) and accessing unallocated/deallocated memory locations (temporal safety violations). The lack of memory safety causes simple programming errors to become the root cause of a multitude of memory corruption bugs and security vulnerabilities [7, 37, 38].

Efficiently and comprehensively detecting and protecting against memory safety violations is unsurprisingly a well researched topic with numerous proposals over the years [1, 2, 4, 9–14, 27, 29, 31, 36]. These include both software-only tools [1, 2, 4, 8, 11, 13, 14, 27, 29, 31] and hardware instantiations [9, 10, 12, 25, 36]. Beyond academia, recent tools from industry (such as Google’s Address Sanitizer [40] in the LLVM compiler and Intel’s Pointer Checker compiler [15], patent application [35], and recently announced MPX ISA extensions [19]) illustrate the importance of detecting memory safety violations.

Prior proposals for detecting memory safety violations provide a wide spectrum of protection ranging from partial countermeasures to comprehensive memory safety. These proposals make tradeoffs along the dimensions of performance, protection, and compatibility with existing applications. Szekeres et al. [43] surveyed the entire space of memory safety vulnerabilities and enforcement mechanisms and identified pointer-based checking as the only approach to provide comprehensive and non-probabilistic detection of memory safety vulnerabilities.

Pointer-based checking [2, 10, 12, 15, 27–29, 29, 36, 47] gives every pointer a view of memory that it can legally access by maintaining per-pointer metadata. To retain memory layout compatibility, some proposals place this per-pointer metadata in a disjoint-metadata space [12, 15, 27, 28]. The per-pointer metadata is propagated on pointer operations. Conceptually, every pointer dereference is checked using its metadata.

Comprehensive memory safety requires detecting both spatial (bounds) violations and temporal (dangling pointer or use-after-free) violations. To detect spatial safety violations, base and bounds metadata is maintained with each pointer. Temporal violations may be detected using unique identifier based checking on memory accesses [2, 10, 25, 28, 36, 47] or by invalidating the bounds of all pointers to an object when deallocating the object [15, 16, 42] so that subsequent bounds checks will fail.

Pointer-based checking can be implemented in various parts of the system stack: via source code rewriting, the compiler, and/or in hardware. Recent compiler-based implementations have reduced the performance overhead for comprehensive memory safety to approximately 2× on average. These overheads are attained by instrumenting optimized code and using information available to the compiler. Unfortunately, this overhead is likely still too large for production use. As a consequence, researchers have proposed using hardware to accelerate pointer checking [10, 12, 16, 25, 35], but these hardware proposals (including Watchdog [25], our own prior proposal) introduce significant hardware complexity and require various hardware structures dedicated to recording metadata state. See Section 2 for a comparison of these strategies.

This paper proposes WatchdogLite, an ISA extension to accelerate pointer-based checking without adding any new hardware for maintaining metadata state. The proposed instructions accelerate the three key memory-safety checking operations: loading and storing metadata, bounds checking, and use-after-free checking. The instructions operate on the ISA’s preexisting architectural registers. The compiler explicitly inserts these instructions, uses preexisting static optimizations to eliminate many checks, and performs in-register metadata propagation by copy elimination and standard register allocation. Relying on the compiler to perform these tasks largely eliminates the need for various previously proposed dedicated hardware structures that track and cache metadata.

Experiments based on extensions to our SoftBound+CETS compiler instrumentation show that the performance overhead for enforcing comprehensive memory safety is reduced on average from 90% (without hardware acceleration) to 29% (with the new instructions). This overhead is similar to prior hardware schemes, which use extensive hardware structures to track and propagate metadata state, which indicates that the proposed ISA extension is a more pragmatic approach for hardware acceleration of memory safety enforcement than prior hardware-centric proposals [12, 25, 35].

(a) Pointer Arithmetic

    q = p + index;    // or &p[index]
    q_base = p_base;
    q_bound = p_bound;
    q_key = p_key;
    q_lock = p_lock;

(b) Pointer Load

    int **p, *q;
    ...
    scheck(p, p_base, p_bound);
    tcheck(p_key, p_lock);
    q = *p;
    q_base = lookup(p)->base;
    q_bound = lookup(p)->bound;
    q_key = lookup(p)->key;
    q_lock = lookup(p)->lock;

(c) Pointer Store

    int **p, *q;
    ...
    scheck(p, p_base, p_bound);
    tcheck(p_key, p_lock);
    *p = q;
    lookup(p)->base = q_base;
    lookup(p)->bound = q_bound;
    lookup(p)->key = q_key;
    lookup(p)->lock = q_lock;

(d) Memory Allocations

    p = malloc(size);
    p_key = next_key++;
    p_lock = allocate_lock();
    *(p_lock) = p_key;
    p_base = p;
    p_bound = p != 0 ? p+size: 0;

(e) Memory Deallocations

    free(p);
    *(p_lock) = INVALID;
    deallocate_lock(p_lock);

(f) Temporal Check

    tcheck(p_key, p_lock) {
      if (p_key != *(p_lock))
        raise exception();
    }

(g) Spatial Check

    scheck(p, p_base, p_bound, size) {
      if (p < p_base ||
          p + size >= p_bound)
        raise exception();
    }

Figure 1. (a) Pointer metadata propagation with pointer arithmetic, (b) metadata propagation through memory with metadata lookups on loads, (c) metadata lookups with pointer stores, (d) pointer metadata creation on memory allocations, (e) identifier metadata being invalidated on memory deallocations, (f) lock and key checking using identifier metadata, and (g) spatial check performed using bounds metadata.

Such vulnerabilities (including buffer overflows and use-after-free vulnerabilities) are still pervasive but they are not new [37, 43]. Informally, enforcing memory safety has two primary components: preventing spatial violations (out-of-bounds memory accesses and buffer overflows of all sorts) and preventing temporal safety violations (memory accesses to deallocated memory, a.k.a. dangling pointer or use-after-free violations).

Pointer-based metadata. In a pointer-based approach, metadata is maintained with each pointer, providing it a view of the memory that it can safely access according to the language specification. This representation permits the creation of out-of-bounds pointers and pointers to the internal elements of objects/structs and arrays (both of which are allowed in C/C++). Figure 1 illustrates the pointer-based metadata propagation and checking abstractly using pseudo C code notation. The metadata (base, bound, lock, and
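As a concrete illustration of the allocation, propagation, and checking operations in Figure 1, the following is a minimal software-only sketch in C. It is not the SoftBound+CETS or WatchdogLite implementation, and every name in it (meta_t, checked_malloc, checked_free, spatial_ok, temporal_ok, demo_violations) is hypothetical. The sketch assumes the bound is one past the last valid byte, so it flags an access when addr + size > bound, whereas Figure 1(g) as printed uses >=. It also never recycles lock locations, which sidesteps the lock allocator a real runtime would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define INVALID_KEY 0u  /* assumption: key 0 is never handed out */

/* Per-pointer metadata, kept beside the pointer rather than inside it. */
typedef struct {
    char     *base;   /* first valid byte of the object */
    char     *bound;  /* one past the last valid byte */
    uint64_t  key;    /* unique identifier of the allocation */
    uint64_t *lock;   /* location that holds the key while the object is live */
} meta_t;

static uint64_t next_key = 1;

/* Figure 1(d): create bounds and a fresh key/lock pair on allocation. */
static void *checked_malloc(size_t size, meta_t *m) {
    char *p = malloc(size);
    m->base  = p;
    m->bound = p ? p + size : NULL;
    m->key   = next_key++;
    m->lock  = malloc(sizeof *m->lock);
    assert(m->lock != NULL);
    *m->lock = m->key;
    return p;
}

/* Figure 1(e): invalidate the lock so that stale keys stop matching.
   The lock location is intentionally left allocated in this sketch. */
static void checked_free(void *p, meta_t *m) {
    free(p);
    *m->lock = INVALID_KEY;
}

/* Figure 1(g): the access [addr, addr + size) must lie in [base, bound). */
static int spatial_ok(uintptr_t addr, size_t size, const meta_t *m) {
    uintptr_t base = (uintptr_t)m->base, bound = (uintptr_t)m->bound;
    return addr >= base && addr + size <= bound;
}

/* Figure 1(f): the pointer's key must still match the value at its lock. */
static int temporal_ok(const meta_t *m) {
    return m->key == *m->lock;
}

/* Small scenario; returns how many violations the checks detect. */
static int demo_violations(void) {
    meta_t pm;
    int violations = 0;
    int *p = checked_malloc(4 * sizeof(int), &pm);
    assert(p != NULL);

    /* Figure 1(a): a derived pointer inherits the metadata unchanged. */
    meta_t qm = pm;
    uintptr_t q_in  = (uintptr_t)p + 3 * sizeof(int);  /* &p[3] */
    uintptr_t q_out = (uintptr_t)p + 4 * sizeof(int);  /* &p[4] */

    if (!spatial_ok(q_in, sizeof(int), &qm))  violations++;  /* in bounds */
    if (!spatial_ok(q_out, sizeof(int), &qm)) violations++;  /* overflow */
    if (!temporal_ok(&qm))                    violations++;  /* still live */

    checked_free(p, &pm);
    if (!temporal_ok(&qm))                    violations++;  /* dangling */

    free(pm.lock);  /* release the lock only after the last check */
    return violations;
}
```

Calling demo_violations() from a main function should report two violations: the out-of-bounds access at index 4 and the use of the derived pointer after the object is freed. Here the checks are ordinary C functions; the paper's proposal is to perform such checks with new instructions that operate on metadata held in preexisting architectural registers.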

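Panels (b) and (c) of Figure 1 rely on a lookup(p) operation that maps the address of a pointer stored in memory to metadata kept in a disjoint space. A minimal sketch of that idea in C follows, assuming a small fixed-size hash table with linear probing. The table layout, SLOTS, store_ptr, load_ptr, and demo_roundtrip are hypothetical, and real implementations may organize the metadata space differently. The spatial and temporal checks on the slot itself are omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Metadata for one pointer, stored apart from the pointer itself. */
typedef struct {
    uintptr_t base, bound;
    uint64_t  key;
    uint64_t *lock;
} meta_t;

#define SLOTS 1024u  /* hypothetical capacity; must be a power of two */

typedef struct {
    void  **addr;  /* address of the pointer slot this entry describes */
    meta_t  meta;
} entry_t;

static entry_t table[SLOTS];

/* lookup(p): find, or claim, the entry for the pointer slot at this address.
   Linear probing; returns NULL only when the table is full. */
static meta_t *lookup(void **slot) {
    size_t i = ((uintptr_t)slot / sizeof(void *)) & (SLOTS - 1);
    for (size_t n = 0; n < SLOTS; n++) {
        if (table[i].addr == slot || table[i].addr == NULL) {
            table[i].addr = slot;
            return &table[i].meta;
        }
        i = (i + 1) & (SLOTS - 1);
    }
    return NULL;
}

/* Figure 1(c): store the pointer, then record its metadata out of line. */
static void store_ptr(void **slot, void *q, meta_t q_meta) {
    meta_t *m = lookup(slot);
    assert(m != NULL);  /* sketch: no resizing when the table fills up */
    *slot = q;
    *m = q_meta;
}

/* Figure 1(b): load the pointer and fetch its metadata from the table. */
static void *load_ptr(void **slot, meta_t *q_meta) {
    meta_t *m = lookup(slot);
    assert(m != NULL);
    *q_meta = *m;
    return *slot;
}

/* Returns 1 if the pointer and its metadata survive a store/load round trip. */
static int demo_roundtrip(void) {
    int value = 7;
    void *cell = NULL;  /* ordinary pointer-sized slot; its layout is unchanged */
    uint64_t lock = 42;
    meta_t in  = { (uintptr_t)&value, (uintptr_t)(&value + 1), 42, &lock };
    meta_t out = { 0, 0, 0, NULL };

    store_ptr(&cell, &value, in);
    void *q = load_ptr(&cell, &out);

    return q == (void *)&value
        && out.base == in.base && out.bound == in.bound
        && out.key == in.key && out.lock == in.lock;
}
```

Because the slot holds only the raw pointer, code that is unaware of the metadata still sees the usual memory layout, which is the compatibility argument the text makes for a disjoint metadata space.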