Backwards-Compatible Array Bounds Checking for C with Very Low Overhead∗
Total Page:16
File Type:pdf, Size:1020Kb
Backwards-Compatible Array Bounds Checking for C with Very Low Overhead∗ Dinakar Dhurjati and Vikram Adve Department of Computer Science University of Illinois at Urbana-Champaign 201 N. Goodwin Ave., Urbana, IL 61801 {dhurjati,vadve}@cs.uiuc.edu ABSTRACT work on detecting array bounds violations or buffer over- The problem of enforcing correct usage of array and pointer runs, because the best existing solutions to date are either references in C and C++ programs remains unsolved. The far too expensive for use in deployed production code or approach proposed by Jones and Kelly (extended by Ruwase raise serious practical difficulties for use in real-world devel- and Lam) is the only one we know of that does not require opment situations. significant manual changes to programs, but it has extremely The fundamental difficulty of bounds checking in C and high overheads of 5x-6x and 11x–12x in the two versions. In C++ is the need to track, at run-time, the intended tar- this paper, we describe a collection of techniques that dra- get object of each pointer value (called the intended referent matically reduce the overhead of this approach, by exploit- by Jones and Kelly [10]). Unlike safe languages like Java, ing a fine-grain partitioning of memory called Automatic pointer arithmetic in C and C++ allows a pointer to be com- Pool Allocation. Together, these techniques bring the aver- puted into the middle of an array or string object and used age overhead checks down to only 12% for a set of bench- later to further index into the object. Because such interme- marks (but 69% for one case). We show that the memory diate pointers can be saved into arbitrary data structures in partitioning is key to bringing down this overhead. We also memory and passed via function calls, checking the later in- show that our technique successfully detects all buffer over- dexing operations requires tracking the intended referent of run violations in a test suite modeling reported violations in the pointer through in-memory data structures and function some important real-world programs. calls. The compiler must transform the program to perform this tracking, and this has proved a very difficult problem. More specifically, there are three broad classes of solu- Categories and Subject Descriptors tions: D.3 [Software]: Programming Languages; D.2.5 [Software]: Software Engineering—Testing and Debugging • Use an expanded pointer representation (“fat point- ers”) to record information about the intended referent with each pointer: This approach allows efficient look- General Terms up of the pointer but the non-standard pointer rep- Reliability, Security, Languages resentation is incompatible with external, unchecked code, e.g. precompiled libraries. The difficulties of Keywords solving this problem in existing legacy code makes this approach largely impractical by itself. The challenges compilers, array bounds checking, programming languages, involved are described in more detail in Section 6. region management, automatic pool allocation. • Store the metadata separately from the pointer but use 1. INTRODUCTION a map (e.g., a hash table) from pointers to metadata: This reduces but does not eliminate the compatibility This paper addresses the problem of enforcing correct us- problems of fat pointers, because checked pointers pos- age of array and pointer references in C and C++ programs. sibly modified by an external library must have their This remains an unsolved problem despite a long history of metadata updated at a library call. Furthermore, this ∗ This work is supported in part by the NSF Embedded Sys- adds a potentially high cost for searching the maps for tems program (award CCR-02-09202), the NSF Next Gen- the referent on loads and stores through pointers. eration Software Program (award CNS 04-06351), and an NSF CAREER award (EIA-0093426). • Store only the address ranges of live objects and en- Permission to make digital or hard copies of all or part of this work for sure that intermediate pointer arithmetic never crosses personal or classroom use is granted without fee provided that copies are out of the original object into another valid object [10]: not made or distributed for profit or commercial advantage and that copies This approach, attributed to Jones and Kelly, stores bear this notice and the full citation on the first page. To copy otherwise, to the address ranges in a global table (organized as a republish, to post on servers or to redistribute to lists, requires prior specific splay tree) and looks up the table (or the splay tree) permission and/or a fee. ICSE’06, May 20–28, 2006, Shanghai, China. for the intended referent before every pointer arith- Copyright 2006 ACM 1-59593-085-X/06/0005 ...$5.00. metic operation. This eliminates the incompatibilities caused by associating metadata with pointers them- Automatic Pool Allocation uses a pointer analysis to cre- selves, but current solutions based on this approach ate fine-grain, often short-lived, logical partitions (“pools”) have even higher overhead than the previous two ap- of memory objects [13]. By maintaining a separate splay proaches. Jones and Kelly [10] report overheads of tree for each pool, we greatly reduce the typical size of 5x-6x for most programs. Ruwase and Lam [17] ex- the trees at each query, and hence the expected cost of tend the Jones and Kelly approach to support a larger the tree lookup. Furthermore, unlike some arbitrary parti- class of C programs, but report slowdowns of a factor tioning of memory objects, the properties of pool allocation of 11x–12x if enforcing bounds for all objects, and of provide three additional benefits. First, the target pool for 1.6x–2x for several significant programs even if only en- each pointer variable or pointer field is unique and known forcing bounds for strings. These overheads are far too at compile-time, and therefore does not have to be found high for use in “production code” (i.e., finished code (tracked or searched for) at run-time. Second, because pool deployed to end-users), which is important if bounds allocation often creates type-homogeneous pools, it is pos- checks are to be used as a security mechanism (not sible at run-time to check whether a particular allocation just for debugging). For brevity, we refer to these two is a single element and avoid entering those objects in the approaches as JK and JKRL in this paper. search trees. Finally, we believe that segregating objects by data structure has a tendency to separate frequently Note that compile-time checking of array bounds viola- searched data from other data, making search trees more tions via static analysis is not sufficient by itself because it efficient (we have not evaluated this hypothesis but it would is usually only successful at proving correctness of a frac- be interesting to do so). tion (usually small) of array and pointer references [2, 6, 7, We evaluate the net overhead of our approach for a col- 8, 19]. Therefore, such static checking techniques are pri- lection of benchmarks and three operating system daemons. marily useful to reduce the number of run-time checks. Our technique works “out-of-the-box” for all these pro- An acceptable solution for production code would be one grams, with no manual changes. We find that the average that has no compatibility problems (like the Jones-Kelly ap- overhead is only about 12% across the benchmarks (and neg- proach and its extension), but has overhead low enough for ligible for the daemons), although it is 69% in one case. We production use. A state-of-the-art static checking algorithm also used the Zitser’s [23] suite of programs modeling buffer can and should be used to reduce the overhead but we view overrun violations reported in several widely used programs that as reducing overhead by some constant fraction, for — 4 in sendmail, 3in wu-ftpd, and 4 in bind — and found any of the run-time techniques. The discussion above shows that our technique successfully detects all these violations. that none of the three current run-time checking approaches Overall, we believe we have achieved the twin goals that are come close to providing such an acceptable solution, with or needed for practical use of array bounds checking in produc- without static checking. tion runs, even for legacy applications: overhead typically In this paper, we describe a method that dramatically low enough for production use, and a fully automatic tech- reduces the run-time overhead of Jones and Kelly’s “refer- nique requiring no manual changes. ent table” method with the Ruwase-Lam extension, to the The next section provides a brief summary of Automatic point that we believe it can be used in production code (and Pool Allocation and the pointer analysis on which it is static checking and other static optimizations could reduce based. Section 3 briefly describes the Jones-Kelly algorithm the overhead even further). We propose two key improve- with the Ruwase-Lam extension, and then describes how we ments to the approach: maintain and query the referent object maps on a per-pool 1. We exploit a compile-time transformation called Au- basis. It also describes three optimizations to reduce the tomatic Pool Allocation to greatly reduce the cost of number or cost of referent object queries. Section 5 describes the referent lookups by partitioning the global splay our experimental evaluation and results. Section 6 compares tree into many small trees, while ensuring that the our work with previous work on array bounds checking, and tree to search is known at compile-time. The transfor- Section 7 concludes with a summary and a brief discussion mation also safely eliminates many scalar objects from of possible future work.