Bryn Mawr College Scholarship, Research, and Creative Work at Bryn Mawr College Computer Science Faculty Research and Computer Science Scholarship
2013 Ironclad C++ A Library-Augmented Type-Safe Subset of C++ Christian DeLozier
Richard A. Eisenberg Bryn Mawr College
Santosh Nagarakatte
Peter-Michael Osera
Milo M.K. Martin
See next page for additional authors
Let us know how access to this document benefits ouy . Follow this and additional works at: http://repository.brynmawr.edu/compsci_pubs Part of the Programming Languages and Compilers Commons
Custom Citation Christian DeLozier, Richard Eisenberg, Santos Nagarakatte, Peter-Michael Osera, Milo M.K. Martin, Steve Zdancewic, "Ironclad C++: a library-augmented type-safe subset of C++," ACM SIGPLAN Notices - OOPSLA '13 48.10 (October 2013): 287-304.
This paper is posted at Scholarship, Research, and Creative Work at Bryn Mawr College. http://repository.brynmawr.edu/compsci_pubs/8
For more information, please contact [email protected]. Authors Christian DeLozier, Richard A. Eisenberg, Santosh Nagarakatte, Peter-Michael Osera, Milo M.K. Martin, and Steve Zdancewic
This article is available at Scholarship, Research, and Creative Work at Bryn Mawr College: http://repository.brynmawr.edu/ compsci_pubs/8 Ironclad C++ A Library-Augmented Type-Safe Subset of C++
Christian DeLozier Richard Eisenberg Santosh Nagarakatte† Peter-Michael Osera Milo M. K. Martin Steve Zdancewic Computer and Information Science Department, University of Pennsylvania †Computer Science Department, Rutgers University {delozier, eir, posera, milom, stevez}@cis.upenn.edu [email protected]
Abstract 1. Introduction The C++ programming language remains widely used, de- C and C++ are widely used programming languages for im- spite inheriting many unsafe features from C—features that plementing web browsers, native mobile applications, com- often lead to failures of type or memory safety that manifest pilers, databases, and other infrastructure software [33]. C as buffer overflows, use-after-free vulnerabilities, or abstrac- and C++ provide efficiency and low-level control, but these tion violations. Malicious attackers can exploit such viola- advantages come at the well-known cost of lack of memory tions to compromise application and system security. and type safety. This unsafety allows programming errors This paper introduces Ironclad C++, an approach to such as buffer overflows (accessing location beyond the ob- bringing the benefits of type and memory safety to C++. ject or array bounds), use-after-free errors (accessing mem- Ironclad C++ is, in essence, a library-augmented, type-safe ory locations that have been freed), and erroneous type casts subset of C++. All Ironclad C++ programs are valid C++ to cause arbitrary memory corruption and break program- programs that can be compiled using standard, off-the-shelf ming abstractions. More dangerously, malicious attackers C++ compilers. However, not all valid C++ programs are exploit such bugs to compromise system security [29]. valid Ironclad C++ programs: a syntactic source-code val- Recognizing this problem, many approaches have been idator statically prevents the use of unsafe C++ features. To proposed to prevent memory safety violations or enforce full enforce safety properties that are difficult to check statically, memory safety in C and C-like languages [2, 4, 9, 10, 12, Ironclad C++ applies dynamic checks via templated “smart 14, 17, 23–25, 31, 36]. Collectively, these prior works iden- pointer” classes. tified several key principles for bringing safety efficiently Using a semi-automatic refactoring tool, we have ported to C. However, one challenge in making C memory safe is nearly 50K lines of code to Ironclad C++. These benchmarks that C provides limited language support for creating type- incur a performance overhead of 12% on average, compared safe programming abstractions. In contrast, although C++ to the original unsafe C++ code. includes many unsafe features, C++ also provides advanced constructs that enable type-safe programming, such as tem- Categories and Subject Descriptors D.3.3 [Language plates and dynamically checked type-casts. Constructs and Features]: Dynamic storage management Ironclad C++ is our proposal to bring comprehensive General Terms Languages, Performance, Reliability, Se- memory and type safety to C++. Ironclad C++ is, in essence, curity a library-augmented [32] type-safe subset of C++. As such, an Ironclad C++ program is a valid C++ program (but not all Keywords C++, type safety, memory safety, local pointers C++ programs are valid Ironclad C++ programs). Dynamic checking is implemented in a “smart pointer” library, so no
Permission to make digital or hard copies of part or all of this work is granted without language extensions or compiler changes are required. Iron- fee provided that that copies bear this notice and the full citation on the first page. clad C++ uses a syntactic static validation tool to ensure the Copyrights for third-party components of this work must be honored. code conforms to the Ironclad C++ subset of C++, and the This work is licensed under the Creative Commons Attribution-NoDerivs 3.0 Unported code is then compiled using an unmodified off-the-shelf C++ License: http://creativecommons.org/licenses/by-nd/3.0/deed.en_US compiler (see Figure 1). This paper describes Ironclad C++ and reports on our ex-
OOPSLA ’13, October 29, 2013, Indianapolis, IN, USA. perience programming in this type-safe subset of C++, in- Copyright © 2013. Copyright is held by the owner/author(s). cluding: ACM 978-1-4503-2374-1/13/10. http://dx.doi.org/10.1145/2509136.2509550
1 Refactor Code No Differentiating array pointers from non-array point- ers. Several prior memory safety proposals have recognized Source Validator Is Valid? Code Yes the performance benefit of distinguishing between a pointer (Safe) to an array (which requires bounds information) versus a pointer to non-array (a.k.a. singleton) object (which does Ironclad C++ Compiler Executable not). Doing so automatically has typically relied on whole- Library (unmodified) program analysis at compile time or introduced language ex- Figure 1. Workflow for coding with Ironclad C++ tensions. For example, CCured [25] uses a whole-program type inference at compile time to distinguish between sin- • a C++ smart pointer library that efficiently and compre- gleton and array pointers, Cyclone [17] introduces different hensively enforces strong typing and bounds safety, type decorators, and some pool allocation approaches [9] create type-homogeneous pools of singleton objects. Sys- • a hybrid static–dynamic checking technique that prevents tems without such differentiation implicitly treat all pointers use-after-free errors for stack objects, without requiring as array pointers, adding unnecessary space and time over- costly heap allocation, heads for bounds checking. • an opt-in, source-level heap-precise garbage collector de- Ironclad C++ captures this differentiation between single- signed to reduce memory leaks due to conservative GC, ton and array pointers without language extension or whole- • tools for semi-automated refactoring of existing C and program analysis during each compilation by using the well- C++ code into the Ironclad subset of C++ and a validator known C++ technique of smart pointers [1, 2, 8]. Smart that enforces compliance with that subset, and pointers leverage C++’s template and operator overloading constructs, and they have previously been used to dynam- • experimental evaluation of the performance overheads of ically insert safety checks [2] or perform reference count- this approach to enforce memory safety. ing [8] on pointer operations. Ironclad C++ requires that all To evaluate practicality and performance, we refactored bare C++ pointer types be replaced with one from a suite of several C/C++ programs—over 50K lines in total—into smart pointers, some of which include bounds information Ironclad C++. We performed this translation with a semi- and thus support pointer arithmetic and indexing (for array automatic refactoring tool we created to assist in this pointers) and some that avoid the bounds checking overhead conversion, and we report our experiences converting these (for singleton pointers). The distinction between singleton programs. We also converted multiple test suites designed and array smart pointer types allows their overloaded oper- for evaluating memory safety bug detectors, which con- ators to perform the minimum dynamic checking necessary firmed that Ironclad C++ does successfully detect memory to detect bounds violations based on the type of the pointer. access violations. Using performance-oriented benchmarks, Enforcing strong typing. C’s use of void* and we measured an average performance overhead of 12% for unchecked type casts results in either pessimistic typing enforcing complete type and memory safety. assumptions, which can significantly increase the overhead of dynamic checking [23, 27], and/or the failure to detect 2. Overview and Approach all memory safety violations. Disallowing unsafe casts in Type-safe languages such as Java and C# enforce type C reduces checking overhead, but doing so has typically safety through a combination of static and dynamic enforce- required augmenting the C language in some way. For ment mechanisms such as static typing rules, dynamically example, Cyclone found it necessary to support generics, checked type casts, null dereference checking, runtime CCured adds RTTI (run-time type information) pointers, and bounds checking, and safe dynamic memory management. both support structural subtyping. However, C++ already In such languages, violations of type safety are prevented by provides alternatives to C’s unsafe constructs (for example, either rejecting the program at compile time (type errors) or templates and class hierarchies). Yet, to facilitate adoption, raising an exception during execution. The aim of Ironclad C++ inherited many of C’s unsafe constructs. Ironclad C++ C++ is to use this same set of familiar static/dynamic takes a different approach and explicitly enforces strong enforcement mechanisms to bring type-safety to C++ while typing by disallowing legacy type-unsafe constructs and retaining much of the syntax, performance, and efficiency requiring that all pointer type-cast operations are either of C++. To provide context on how these enforcement known to be safe statically or checked dynamically (by mechanisms apply to C/C++, this section overviews some building upon C++’s existing dynamic_cast construct). key principles of efficient type safety for C, surveys their Heap-safety through conservative garbage collection. implementations in prior work, and describes how these Ironclad C++’s smart pointers provide strong typing and principles can be brought to C++ by leveraging existing bounds safety, but they do not prevent use-after-free er- language features. rors. To avoid the overhead of reference counting [12] or use-after-free checking [2, 24, 36], Ironclad C++ facilitates
2 the use of conservative garbage collection [7] by targeting Type Capabilities Safety Checks a key challenge of using conservative GC to enforce ptr Dereference Null Check safety: conservative collection can lead to non-deterministic aptr Dereference Bounds Check memory leaks (due to non-pointer data that “looks” like Index Bounds Check a pointer) [5, 30]. To reduce such memory leaks due to Arithmetic No Check† conservative garbage collection, Ironclad C++ supports lptr Dereference Null Check heap-precise garbage collection. Inspired by prior work Receive address-of object Lifetime Check on mostly-copying and more-precise conservative garbage Receive this pointer Lifetime Check collection [3, 11, 15, 30], Ironclad’s collector treats the laptr Dereference Bounds Check roots conservatively but supports the precise identification Receive address-of object Lifetime Check of pointers in heap objects by employing mark() class Receive this pointer Lifetime Check methods to precisely identify pointer fields and pointer Hold static-sized array Lifetime Check containing members. Index Bounds Check Facilitating stack-allocation safety. Garbage collection Arithmetic No Check† does not prevent dangling pointers to stack objects. In recog- †: Arbitrary pointer arithmetic is permitted because a nition of this problem, CCured prevents use-after-free errors bounds check occurs before an aptr is dereferenced. to stack objects by selectively converting escaping stack- allocated objects into heap-allocated objects (a.k.a. heapi- Table 1. This table describes the capabilities and safety fication), which unfortunately introduces significant perfor- checks performed by each pointer type. As more letters are mance overheads in some programs [25]. To prevent use- added to the type, the capabilities increase, but the overhead after-free errors for stack allocated memory without the per- also increases due to additional required checks. formance penalties of heapification, this paper introduces hybrid static-dynamic checking of stack pointer lifetimes. scribes safety without considering arrays or memory deallo- The hybrid checking in Ironclad C++ avoids the perfor- cation (Section 3.1), it then describes support for references mance overheads of heapification with simple static check- (Section 3.2), arrays, and pointer arithmetic (Section 3.3). ing and limited dynamic checking. Further sections discuss memory deallocation safety for the Validating conformance statically. Statically validating heap (Section 4) and stack (Section 5). that code conforms to the rules of Ironclad C++ is paramount for ensuring safety. Without some form of static validation, 3.1 Strong Static Typing with ptr
3 Example: Strong Static Typing float radius(Shape* shape) { float radius(ptr
Supporting type casts safely. By disallowing raw compared to C because only POD (plain old data) types can pointers, Ironclad C++ also implicitly disallows both void* be used in unions. pointers and unsafe pointer-to-pointer casts. To support safe Rule (Unions). Unions are disallowed in Ironclad C++. pointer-to-pointer casts, Ironclad C++ provides a cast to a ptr
4 are allowed in Ironclad C++. References will be further dis- 3.4 Pointer Initialization cussed in Section 5.3. If pointers were allowed to be uninitialized, a pointer could 3.3 Bounds Checking with aptr
5 optimized template specializations for standard data types, class Circle : such as char. // Inherits from Shape & IroncladPreciseGC The
6 discarding the address if it has already been marked. To provided by delete between the garage collector, which re- perform the “marking” operation, the collector invokes the claims the object’s memory, and the destruct() method mark() method on the object, which precisely identifies all (provided by ptr and aptr), which explicitly calls the ob- pointers in the allocation and pushes them onto the mark ject’s destructor. This approach avoids the problem of calling stack. If the type of the allocation does not inherit from the destructor asynchronously, but it has the disadvantage that IroncladPreciseGC class, the collector falls back to us- Ironclad C++ does not prevent a program from accessing an ing the conservative marking strategy. Like the conservative object after its destructor has been called. Such an access— collector, once the allocation has been processed, the alloca- although certainly a bug in the program—does not violate tion is recorded as having been marked. This process repeats type safety because the garbage collector ensures the under- until the mark stack is empty, and the collector frees all allo- lying memory is still valid (i.e., has not be reclaimed) and cations that have not been marked. thus remains properly typed. 4.3 Requirements for Precise Marking 5. Stack Deallocation Safety via lptr For an object to be precisely marked, all of the objects that Although garbage collection prevents all dangling pointers it directly contains (i.e., not pointers to objects) and all of to objects on the heap, it does not protect against dangling its base classes must define mark() methods because the pointers to stack-allocated objects. One way to prevent such heap-precise collector must be aware of all pointers in the errors is to forgo some of the efficiency benefits of stack allo- physical layout of the marked object. A class’s mark() is cation by limiting the use of stack allocation to non-escaping responsible for adding all pointers contained in the object objects only (a.k.a. heapification). To avoid the performance to the mark stack, including all pointer that are data member penalties of heapification, Ironclad C++ provides additional fields, in object fields, or in an inherited-from base class. The templated smart pointers that, cooperatively with the static mark() method accomplishes this task by calling mark() on code validator and C++ type checker, uses dynamic lifetime all smart pointers data members, on all object data member checking to prevent use-after-free errors for stack allocations fields, and on all inherited-from base classes. The mark() while avoiding heapification in almost all cases. method of each smart pointer class pushes the address onto the mark stack. The mark() methods of the object’s data 5.1 Heapification members will in turn subsequently call mark() methods on A common approach for preventing use-after-free errors for their fields, resulting in all pointers contained by the alloca- stack allocations in a garbage collected system is simply to tion being added to the mark stack. Primitive non-pointer restrict stack allocations by employing heapification [25], types need not be marked. Figure 3 provides an example which is the process of heap-allocating an object that was mark() method for a simple class hierarchy. Calling the originally allocated on the stack. Heapification enforces mark() method on class that does not actually contain point- deallocation safety by conservatively disallowing any object ers is unlikely to hurt performance because such mark() whose address might escape the function from being allo- methods will be empty and defined in the class’s header file, cated on the stack. For example, without inter-procedural allowing them to be optimized away by the compiler’s in- analysis, heapification requires heap allocation of any lining pass. The heap-precise collector uses template spe- object whose address is passed into a function. This is a cialized mark() methods for arrays of primitive data types. particular challenge for C++ code, because object methods Primitive types do not contain pointers, so these methods do and constructors are implicitly passed the address of the nothing and thus avoid the overhead of scanning an array object (the this pointer), thus disallowing stack allocation when it clearly does not contain pointers. of almost all objects unless inter-procedural analysis could In Ironclad C++, the mark() methods are supplied prove otherwise. Unfortunately, heapification results in by the programmer and part of the program’s permanent significant performance degradations in some cases (see source code. To ensure safety, the mark() methods are Section 9.5). checked for conformance with the above rules by the 5.2 Dynamic Lifetime Checking Ironclad C++ validator (Section 7), unlike prior systems that utilize programmer-supplied marking [3, 30]. To assist Ironclad C++ reduces the need for heapification by focusing the programmer, the Ironclad C++ refactoring tool can on preventing the two causes of dangling pointers to stack al- automatically generate mark() methods in many cases. locations: stack addresses that escape to the heap or globals and stack addresses that escape to longer-living stack frames. 4.4 Object Destruction To prevent dangling pointers to the stack, Ironclad C++ In C++, calling delete on a heap-allocated object will de- identifies, tracks, and prevents the unsafe escape of stack struct the object and reclaim the memory in which the object addresses using two additional templated pointer classes, had been allocated. In Ironclad C++, the garbage collector lptr
7 focused on purely static enforcement through sophisticated each assignment into or out of the local pointer will not cre- program analyses. ate a dangling pointer. To prevent stack addresses from escaping to the heap or Overall, Ironclad C++ enforces the following invariant to globals (example shown in Figure 4(a)), Ironclad C++ com- ensure deallocation safety with regard to stack allocations. bines static restrictions on where stack addresses may exist Invariant (Pointer lifetime). The lifetime of a pointer may in memory with a dynamic check on assignment between lo- not exceed the lifetime of the value that it points to. cal pointers and non-local pointers (ptr and aptr). Ironclad C++ does not place any restrictions on where ptr and aptr To more concretely explain the dynamic checks applied can be allocated; ptr and aptr can both exist on the heap or by local pointers, we give a case-by-case analysis of the non- as globals. Therefore, it is unsafe for a ptr or aptr to hold trivial assignments between lptr and ptr. the address of a stack allocation. In C++, the address of a stack allocation can be initially Case: Assign from ptr
8 (A) Stack Address Escapes to Global Scope int* global = NULL; ptr
void f() void f() { { g(); g(); cout << *global; // dangling pointer dereference cout << *global; // Prevented by check in g() } }
void g() void g() { { int y = 1; int y = 1; int* q = &y; lptr
void g(int** ptr_to_p) void g(lptr
9 which is used as a return value. The ref
10 Types τ :: = ptrhτi | lptrhτi | C Expression variables x,y Surface exprs e :: = x | null | e.x | e .f (e ) n 1 2 Locations ` :: = x @y1 ..ym | newC() | &e | ∗e Pointer values pv :: = ` | bad ptr Internal exprs e :: = ` | {s; returne} | error Values v :: = ptr(pv) | lptr(pv) | C Statements s :: = e1 = e2 | s1;s2 Store Σ :: = · | Σ[` 7→ v] | skip | error Store typing Ψ :: = · | Ψ[` : τ] i Class decls cls :: = structC {τi xi; meths}; Class context ∆ :: = {cls1 ..clsm} i Methods meth :: = τ1 f (τ2 x){τi xi ;s;returne} Programs prog :: = ∆;void main() {s} Figure 6. Core Ironclad Syntax
expressions statements
∆;n ∆;n typing Ψ `exp e : τ Ψ `stmt sok EVAL STMT ASSIGN PTR PTR (` ) = ptr(pv ) (` ) = ptr(pv ) evaluation (Σ,e) −→n (Σ0,e0) (Σ,s) −→n (Σ0,s0) Σ 1 1 Σ 2 2 exp ∆ ∆ 0 stmt Σ = Σ[`1 7→ ptr(pv2)] n 0 Figure 7. Typing judgments and evaluation relations (Σ,`1 = `2) −→ (Σ ,skip) stmt ∆ to reflect the new stack frame: When assigning a (non-null) lptr to a ptr, we verify that the EVAL EXP BODY CONG1 lptr does indeed point to the heap by checking that the store index of the location referred to by the lptr is 0. (Σ,s) −→n+1(Σ0,s0) stmt ∆ n 0 0 (Σ,{s; returne}) −→∆(Σ ,{s ; returne}) exp EVAL STMT ASSIGN PTR LPTR 0 Finally, when a frame expression returns, the frame expres- Σ(`1) = ptr(pv1) Σ(`2) = lptr(x @π) 0 0 sion is replaced with the location of the return value. Σ = Σ[`1 7→ ptr(x @π)] n 0 (Σ,`1 = `2) −→ (Σ ,skip) EVAL EXP BODY RET stmt ∆ xn freshforΣ n Σ2 = copy store(Σ,`,x ) When the lptr does not point to the heap, we raise an error. 0 Σ = ΣΣ2\(n + 1) (Σ,{skip; return`}) −→n (Σ0,xn@) EVAL STMT ASSIGN PTR LPTR ERR exp ∆ 0 Σ(`1) = ptr(pv1) n 6= 0 The premises copy the return value ` into a fresh base loca- n0 Σ(`2) = lptr(x @π2) n tion x in the caller’s frame (taking care to copy additional n (Σ,`1 = `2) −→ (Σ,error) locations if the returned value is an object) and pop the stack. stmt ∆ The result of the method call then becomes that fresh loca- tion. Note that no dynamic check (say, to make sure that the Finally, when assigning (non-null) lptrs, the dynamic check value stored in xn is valid after the function returns) is needed ensures that the lptr being assigned to out-lives the location here because the type system enforces that the returned value it receives by comparing the appropriate store indices. cannot be a lptr in this rule: EVAL STMT ASSIGN LPTR LPTR TYPE EXP BODY Σ(x n1 @π ) = lptr(pv ) 0 0 1 1 1 ∀τ .τ 6= lptrhτ i n2 Σ(`2) = lptr(x2 @π2) ∆;n+1 ∆;n+1 0 n n Ψ `stmt sok Ψ `exp e : τ Σ = Σ[x1 1 @π1 7→ lptr(x2 2 @π2)] n2 ≤ n1 n ∆; n1 n 0 Ψ `exp {s; returne} : τ (Σ,x1 @π1 = `2) −→ (Σ ,skip) stmt ∆
6.3 Pointer Semantics If the lptr does not out-live its new location, we raise an error With respect to the pointer lifetime invariant, the most inter- like the ptr-lptr case above. esting rules concern the assignment of pointers as well as the In addition to standard typing rules for statements and constraints we place on their values in the store. The simplest expressions, Core Ironclad enforces a consistency judgment case is when we assign between two ptr values where we over the store through the store typing. Two of these binding simply overwrite the left-hand pointer with the right-hand consistency rules for pointers capture the pointer lifetime pointer in the store. invariant.
11 The first rule concerns ptrs and requires that the location 2. If Ψ `∆;n e : τ and Ψ ` Σok then e is `, error, or (Σ,e) −→ exp st exp pointed to by the ptr is on the heap (at index 0). n 0 0 ∆(Σ ,e ). CONS BINDING PTR In Core Ironclad, well-formed terms take a step or step 0 Σ(xn@π) = ptr(x0n @π0) n0 = 0 to error. error terms arise only when an attempted pointer 0 Ψ(x0n @π0) = τ assignment breaks the pointer lifetime invariant. This corre- sponds to the dynamic checks outlined in Section 6.3. Ψ;Σ ` xn@π : ptrhτiok st1 We prove preservation at stack height |s|+n (respectively The second rule concerns lptrs and requires that lptrs only |e| + n) where n is the height of the stack up to statement s exist at base locations without paths (not embedded within (respectively e) and |s| (respectively |e|) is the number of an object) and that the location pointed to by a particular lptr additional stack frames embedded in the subject of the eval- 0 is in the same stack frame or a lower one. uation. The extra condition Ψ ⊆n Ψ says that new store typ- 0 CONS BINDING LPTR ing Ψ has all the bindings of the old store typing Ψ up to stack height n. The complete details of the proofs of these n0 Σ(xn@) = lptr(x0 @π0) n0 ≤ n theorems can be found in our companion technical report. 0 Ψ(x0n @π0) = τ n Ψ;Σ `st1 x @: lptrhτiok 7. Validator for Ironclad C++ 6.4 The Pointer Lifetime Invariant Although many of the rules of Ironclad C++ are simple Invariant (Pointer lifetime). For all bindings of the form enough for a well-intentioned programmer to follow un- n n [x1 1 @π1 7→ ptr(pv)] and [x1 1 @π1 7→ lptr(pv)] in Σ, if pv = aided, ensuring safety requires the use of a static syntactic n x2 2 @π2 (i.e., is non-null) then n2 ≤ n1. validation tool to certify the code’s conformance to the Iron- clad C++ subset. The role of the validator is to ensure that We can now present our theorem that states that the any violations of memory/type safety will be caught by ei- pointer lifetime invariant is preserved by execution. We ther: (1) the standard C++ static type checker or (2) the dy- make use of the summary judgment wf (∆,Ψ,Σ,k), which namic type cast and bounds checks performed by the smart asserts that the class and method context ∆, the store typing pointers. For example, the validator ensures that no raw Ψ, and the store Σ are all consistent with one another and pointer or union types remain. For heap-precise garbage col- contain only bindings at or below stack height k. These lection, the validator ensures that user-defined mark() meth- checks also include the pointer lifetime invariant. Therefore, ods correctly identify all pointers, pointer-containing mem- wf (∆,Ψ,Σ,k) implies that the pointer lifetime invariant bers, and inherited base classes of each precisely marked holds for Σ. class. To prevent dangling pointers to stack objects via dy- Theorem (Pointer lifetime invariant is preserved). If namic lifetime checking, the validator checks that all uses ∆;n n 0 0 wf (∆,Ψ,Σ,k), Ψ `stmt sok, and (Σ,s) −→ (Σ ,s ), where of address-of, the this pointer, and conversions from stack stmt ∆ arrays to pointers immediately enter an or . The k is the maximum stack height used within the statement s, lptr laptr syntactic validator examines the initializer lists for each class then the pointer lifetime invariant holds in the store Σ0. constructor to ensure that pointer class members are safely This theorem is a straightforward corollary of a standard initialized. The validator ensures that references are not used type preservation lemma. That is, preservation tells us that as class members. Finally, the validator ensures that expres- the wf judgment is preserved by evaluation. sions returned by reference match one of the following con- Lemma 1 (Preservation). structs: dereference of an aptr or ptr, a reference parame- ∆;n n 0 0 ter, the dereference of the this pointer, or a class member. 1. If wf (∆,Ψ,Σ,|s|+n), Ψ `stmt sok, and (Σ,s) −→ (Σ ,s ), stmt ∆ Once the code passes the static syntactic validator, the then there exists Ψ0 such that wf (∆,Ψ0,Σ0,|s| + n), C++ type-checker then statically enforces the remaining ∆;n 0 0 Ψ `stmt s ok, and Ψ ⊆n Ψ . type safety properties. For example, unsafe casts and void* ∆;n n 0 0 will not type-check in validated code because Ironclad C++ 2. If wf (∆,Ψ,Σ,|e|+n), Ψ `exp e : τ, and (Σ,e) −→∆(Σ ,e ), exp smart pointers explicitly do not support them. Similarly, 0 0 0 0 ∆;n then there exists Ψ such that wf (∆,Ψ ,Σ ,|e|+n), Ψ `exp array indexing and pointer arithmetic operators are not 0 0 e : τ, and Ψ ⊆n Ψ . defined on ptr
12 void f(void*); 8.2 Step 2: Increasing Type-safety ⇓ (Manual) After refactoring the code to compile with a C++ compiler and use C++ allocation and deallocation functions, we per- template
13 Manual Code Changes Automated Code Changes Benchmark Lang Class Ptrs Refs LoC C++ Alloc Pre Post Ptr Types SysFunc Alloc Casts blackscholes C 1 20% N/A 405 0 5 4 2 5 5 4 0 bzip2 C 2 47% N/A 5731 16 41 28 14 224 21 30 30 lbm C 1 57% N/A 904 1 3 20 6 49 7 2 4 sjeng C 2 46% N/A 10544 45 24 164 158 310 82 30 0 astar C++ 25 7% 35% 4280 0 60 2 7 72 15 76 2 canneal C++ 3 27% 29% 2817 0 0 2 9 72 3 2 1 fluidanimate C++ 4 7% 7% 2785 0 1 0 2 85 7 44 1 leveldb C++ 66 49% 24% 16188 0 0 160 149 1028 69 195 33 namd C++ 15 46% 7% 3886 0 0 0 44 265 11 69 10 streamcluster C++ 5 37% 0% 1767 0 25 2 2 63 17 9 21 swaptions C++ 1 39% 0% 1095 0 11 1 12 63 3 0 9 Table 2. Characterization of the evaluated programs. From left to right, benchmark name, source language, number of class- es/structs, % pointer declarations, % reference declarations, lines of code, manually refactored lines of code, and automatically refactored lines of code. refactoring is intended to be performed only once, the num- eter to type aptr
14 start and end indices of the reduction on both input arrays, contains a single pointer field. Unlike ptr, the template spe- and then runs the computation at unchecked speeds. With cialization of atomic for aptr requires the use of a lock or this optimization, the streamcluster benchmark executes transaction to atomically update the three fields of the aptr. with no measurable overhead compared to the original. Using these atomic pointers, any program that was data- race free prior to using Ironclad C++ remains data-race free. swaptions Swaptions spends most of its runtime per- forming operations on vectors and matrices of floating External Libraries Ideally, library source-code would also pointer numbers. For matrix structures, swaptions uses an be refactored to Ironclad C++, but we acknowledge that it array-of-array-pointers to approximate a two-dimensional may not always be feasible. In such cases, Ironclad C++ pro- array (i.e., aptr
15 1.5 Translated Bounds Bounds + Stack Bounds + Stack + GC Bounds + Stack + Heap-Precise GC
1.0
0.5
0.0 astar blackscholes bzip2 canneal fluidanimate lbm ldb-fseq ldb-rreverse ldb-rhot namd sjeng stcluster swaptions average Normalized Runtime Figure 9. Normalized runtimes for refactored Ironclad C++ code with (adding checking from left to right): no checking, bounds checked arrays, safe stack allocations, safe heap allocations, and heap-precise garbage collection.
2.0 Translated Bounds Bounds + Stack Bounds + Stack + GC Bounds + Stack + Heap-Precise GC 1.5 1.0 0.5
0.0 astar blackscholes bzip2 canneal fluidanimate lbm ldb-fseq ldb-rreverse ldb-rhot namd sjeng stcluster swaptions average Normalized Mem. Usage Figure 10. Normalized memory usage for Ironclad C++ code with (from left to right) bounds checking metadata, bounds and stack checking metadata, both metadata with conservative GC, and both metadata with heap-precise GC. ers. To reduce overhead, the smart pointer implementation 9.3 Overheads of Garbage Collection uses Clang’s always_inline function attribute to ensure Figure 9 shows that the runtime overhead of garbage collec- the compiler inlines the dereference and indexing operators. tion is negligible in our benchmarks. Although perhaps sur- We used Valgrind’s Massif tool to measure memory usage prising, our benchmarks are not typical of programs used in overheads. garbage collection studies, which are generally selected for We used the Boehm-Demers-Weiser conservative their frequent allocation behavior. For example, several of garbage collector both as the baseline garbage collector and our benchmarks allocate memory only during initialization as the basis for the heap-precise garbage collector [7]. To im- and do not deallocate memory—resulting in extremely rare plement the heap-precise collector, we extended the marking collection invocation (less than once per second for many implementation of the Boehm-Demers-Weiser collector to benchmarks). The benchmark that collects most frequently call the mark() method of allocations initialized for precise (six hundred times per second), swaptions, incurs an addi- collection rather than pushing the allocation’s address onto tional 23% performance penalty due to garbage collection. the conservative collector’s mark stack. Any addresses The garbage collector increases memory usage by 14% passed back to the collector from mark() are added to on average and up to 85% for leveldb-fillseq (Fig- the mark stack. Each potential address passes through the ure 10) when compared to explicit deallocation with the existing checks employed by the Boehm-Demers-Weiser same underlying memory allocator. One caveat is that the collector for duplicate marking and blacklisting. allocator underlying the conservative collector uses more 9.2 Overall Performance space on average than Clang’s default memory allocator (by 29%) even when operating in explicit memory deallocation The overall performance overhead for bringing type and mode; if this overhead is included, the total memory over- memory safety to the refactored programs is just 12% on head of GC rises to 43% (not shown on Figure 10). average. Figure 9 shows these results, and it also includes re- sults for multiple configurations to show the impact of each aspect of Ironclad C++. The left-most bar in each group 9.4 Benefits of Heap-Precise Collection (“Translated”) shows the normalized execution time of the Figure 10 also shows the impact of Ironclad’s heap-precise fully refactored, strongly typed benchmarks when dynamic extension to the garbage collector. In most cases, the bound checking, dynamic lifetime checking, and the garbage heap-precise collector provides no appreciable reduc- collector are all disabled. These results indicate there is tion of memory usage, but in two cases — astar and negligible overhead from replacing raw pointers with smart leveldb-fillseq — the pure-conservative collector pointers. The second bar from the left in each group of Fig- suffers due to imprecise identification of heap pointers. ure 9 (“Bounds”) shows bounds checking is by far the most When applying heap-precise collection, these programs’ significant contributor to the overall performance overhead. memory usage is reduced by 28% and 66%, respectively,
16 Heapify (No GC) Dynamic Lifetime (No GC) Dynamic Lifetime (With GC) Unsafe 1.0 0.8 0.6 0.4 0.2
0.0 astar blackscholes bzip2 canneal fluidanimate lbm ldb-fseq ldb-rreverse ldb-rhot namd sjeng stcluster swaptions geom Normalized Runtime Figure 11. Runtime normalized to using heapification with GC. Bars from left to right are heapification with unsafe deallo- cation, dynamic lifetime checking (safe stack allocations), dynamic lifetime checking with GC (safe allocations), and unsafe deallocation. compared to the unmodified conservative collector. The 10. Conclusion memory overheads of bzip2 and namd do not improve Ironclad C++ brings type safety to C++ at a runtime over- under heap-precise collection due to stack-allocated arrays head of 12%. We demonstrated the feasibility of refactor- (which are not tagged) with elements that are misidentified ing C and C++ code to Ironclad C++ with the help of a as pointers. The overall average memory overhead of GC semi-automatic refactoring tool. With heap-precise garbage vs. explicit memory allocation drops from 14% to 9% with collection, Ironclad C++ provides an optional interface for the addition of heap-precise collection. precisely identifying heap pointers, which was shown to de- We observe a 2% performance penalty for heap-precise crease average memory usage of garbage collection. We in- garbage collection. Upon investigation, we found this vestigated both heapification and dynamic lifetime check- penalty is not due to slower tracing time using mark() ing for enforcing stack deallocation safety and found that methods during garbage collection; the penalty is a result dynamic lifetime checking offered flexibility, memory con- of changes in data layout introduced by our implementation trol, and limited source code modifications as compared to of heap-precise collection (which wraps each allocation heapification. Overall, our experiences and experimental re- with a templated wrapper class with a virtual table pointer sults indicate that Ironclad C++ has the potential to be an for calling the correct mark() method). We confirmed this effective, low-overhead, and pragmatic approach for bring- hypothesis by including the same header when using the ing comprehensive memory safety to C++. completely conservative collector, which yielded the same runtime overheads. Acknowledgments 9.5 Benefits of Dynamic Lifetime Checking We would like to thank Emery Berger, Mathias Payer, and Dynamic lifetime checking incurs less than 1% overhead John Regehr for their comments and suggestions about this over the baseline of providing no safety checking for stack work. This research was funded in part by the U.S. Gov- deallocation. We originally observed overheads of 17% in ernment by ONR award N000141110596 and NSF grants bzip2, but these overheads were due to the creation of an CNS-1116682 and CCF-1065166. The views and conclu- laptr temporary in a tight loop. The optimization described sions contained in this document are those of the authors and in Section 8.5 eliminated the overhead from dynamic life- should not be interpreted as representing the official policies, time checking in bzip2. either expressed or implied, of the U.S. Government. In Figure 11, we compare dynamic lifetime checking to the other notable alternative for stack allocation safety: References heapification. On average, heapification is 2× slower than [1] A. Alexandrescu. Modern C++ Design: Generic Program- dynamic lifetime checking. We observed two situations ming and Design Patterns Applied. Addison-Wesley, Boston, in which heapification lead to increased performance MA, 2001. overheads: a stack-allocated array escaping to a function [2] T. M. Austin, S. E. Breach, and G. S. Sohi. Efficient Detection (occurs in sjeng) and calling a method on a stack-allocated of All Pointer and Array Access Errors. In Proceedings of object (occurs in leveldb). As a result, dynamic lifetime the SIGPLAN 1994 Conference on Programming Language checking is faster than heapification by 27.5× (sjeng), 6.2× Design and Implementation, June 1994. (ldb-fseq), 9.1× (ldb-rreverse), and 6.4× (ldb-rhot). [3] J. Bartlett. Mostly-Copying Garbage Collection Picks Up In almost all cases, dynamic lifetime checking avoids the Generations and C++. Technical report, DEC, 1989. use of heapification for enforcing stack deallocation safety. [4] E. D. Berger and B. G. Zorn. DieHard: Probabilistic Memory The runtime performance of dynamic lifetime checking is Safety for Unsafe Languages. In Proceedings of the SIGPLAN nearly identical to the performance of code with no safety 2006 Conference on Programming Language Design and Im- checking for stack deallocation. plementation, pages 158–168, June 2006.
17 [5] H.-J. Boehm. Space Efficient Conservative Garbage Collec- [21] D. Lomet. Making Pointers Safe in System Programming tion. In Proceedings of the SIGPLAN 1993 Conference on Languages. IEEE Transactions on Software Engineering, Programming Language Design and Implementation, pages pages 87 – 96, Jan. 1985. 197–206, June 1993. [22] S. Lu, Z. Li, F. Qin, L. Tan, P. Zhou, and Y. Zhou. Bug- [6] H.-J. Boehm and M. Spertus. Garbage collection in the next bench: Benchmarks for Evaluating Bug Detection tools. In C++ standard. In Proceedings of the 2009 International Sym- In PLDI Workshop on the Evaluation of Software Defect De- posium on Memory Management, pages 30–38, June 2009. tection Tools, June 2005. [7] H.-J. Boehm and M. Weiser. Garbage Collection in an Unco- [23] S. Nagarakatte, J. Zhao, M. M. K. Martin, and S. Zdancewic. operative Environment. Software — Practice & Experience, SoftBound: Highly Compatible and Complete Spatial Mem- 18(9):807–820, Sept. 1988. ory Safety for C. In Proceedings of the SIGPLAN 2009 Con- [8] D. Colvin, G. and Adler, D. Smart Pointers - Boost 1.48.0. ference on Programming Language Design and Implementa- Boost C++ Libraries, Jan. 2012. www.boost.org/docs/ tion, June 2009. libs/1_48_0/libs/smart_ptr/smart_ptr.htm. [24] S. Nagarakatte, J. Zhao, M. M. K. Martin, and S. Zdancewic. [9] D. Dhurjati and V. Adve. Backwards-Compatible Array CETS: Compiler Enforced Temporal Safety for C. In Pro- Bounds Checking for C with Very Low Overhead. In Proceed- ceedings of the 2010 International Symposium on Memory ings of the 28th International Conference on Software Engi- Management, June 2010. neering (ICSE), pages 162–171, 2006. [25] G. C. Necula, J. Condit, M. Harren, S. McPeak, and [10] D. Dhurjati, S. Kowshik, V. Adve, and C. Lattner. Memory W. Weimer. CCured: Type-Safe Retrofitting of Legacy Soft- Safety Without Runtime Checks or Garbage Collection. In ware. ACM Transactions on Programming Languages and Proceedings of the 2003 ACM SIGPLAN Conference on Lan- Systems, 27(3), May 2005. guage, Compiler, and Tool for Embedded Systems (LCTES), [26] NIST Juliet Test Suite for C/C++. NIST, 2010. pages 69–80, 2003. http://samate.nist.gov/SRD/testCases/suites/ [11] D. Edelson and I. Pohl. A Copying Collector for C++. In Juliet-2010-12.c.cpp.zip. Proceedings of The 18th ACM SIGPLAN/SIGACT Symposium [27] Y. Oiwa. Implementation of the Memory-safe Full ANSI- on Principles of Programming Languages (POPL), pages 51– C Compiler. In Proceedings of the SIGPLAN 2009 Confer- 58, Jan. 1991. ence on Programming Language Design and Implementation, [12] D. Gay, R. Ennals, and E. Brewer. Safe Manual Memory Man- pages 259–269, June 2009. agement. In Proceedings of the 2007 International Sympo- [28] P.-M. Osera, R. Eisenberg, C. DeLozier, S. Nagarakatte, sium on Memory Management, Oct. 2007. M. M. K. Martin, and S. Zdancewic. Core Ironclad. Technical [13] D. Grossman, G. Morrisett, T. Jim, M. Hicks, Y. Wang, and Report MS-CIS-13-06, University of Pennsylvania, 2013. J. Cheney. Region-Based Memory Management in Cyclone. [29] J. Pincus and B. Baker. Beyond Stack Smashing: Recent Ad- In Proceedings of the SIGPLAN 2002 Conference on Pro- vances in Exploiting Buffer Overruns. IEEE Security & Pri- gramming Language Design and Implementation, June 2002. vacy, 2(4):20–27, 2004. [14] R. Hastings and B. Joyce. Purify: Fast Detection of Mem- [30] J. Rafkind, A. Wick, M. Flatt, and J. Regehr. Precise Garbage ory Leaks and Access Errors. In Proc. of the Winter Usenix Collection for C. In Proceedings of the 2009 International Conference, 1992. Symposium on Memory Management, June 2009. [15] M. Hirzel and A. Diwan. On the type accuracy of garbage col- [31] M. S. Simpson and R. K. Barua. MemSafe: Ensuring the lection. In Proceedings of the 2000 International Symposium Spatial and Temporal Memory Safety of C at Runtime. In on Memory Management, pages 1–11, Oct. 2004. IEEE International Workshop on Source Code Analysis and [16] International Standard ISO/IEC 14882:2011. Programming Manipulation, pages 199–208, 2010. Languages – C++. International Organization for Standards, [32] B. Stroustrup. A Rationale for Semantically Enhanced Library 2011. Languages. In Library-Centric Software Design, page 44, [17] T. Jim, G. Morrisett, D. Grossman, M. Hicks, J. Cheney, and 2005. Y. Wang. Cyclone: A Safe Dialect of C. In Proceedings of the [33] B. Stroustrup. Software Development for Infrastructure. Com- 2002 USENIX Annual Technical Conference, June 2002. puter, 45:47–58, Jan. 2012. [18] J. Jonathan G. Rossie and D. P. Friedman. An Algebraic Se- [34] E. Unger. Severe memory problems on 32-bit Linux, mantics of Subobjects. In Proceedings of the 17th SIGPLAN April 2012. https://groups.google.com/d/topic/ Conference on Object-Oriented Programming, Systems, Lan- golang-nuts/qxlxu5RZAI0/discussion. guages and Application (OOPSLA), Nov. 2002. [35] J. Wilander and M. Kamkar. A Comparison of Publicly Avail- [19] R. Jones and R. Lins. Garbage Collection: Algorithms for Au- able Tools for Dynamic Buffer Overflow Prevention. In Pro- tomatic Dynamic Memory Management. John Wiley & Sons, ceedings of the Network and Distributed Systems Security 1996. Symposium, 2003. [20] C. Lattner and V. Adve. LLVM: A Compilation Framework [36] W. Xu, D. C. DuVarney, and R. Sekar. An Efficient and for Lifelong Program Analysis & Transformation. In Proceed- Backwards-Compatible Transformation to Ensure Memory ings of the International Symposium on Code Generation and Safety of C Programs. In Proceedings of the 12th ACM SIG- Optimization, page 75, 2004. SOFT International Symposium on Foundations of Software Engineering (FSE), pages 117–126, 2004.
18