Escape Analysis on Lists∗
Total Page:16
File Type:pdf, Size:1020Kb
Escape Analysis on Lists∗ Young Gil Parkyand Benjamin Goldberg Department of Computer Science Courant Institute of Mathematical Sciences New York Universityz Abstract overhead, and other optimizations that are possible when the lifetimes of objects can be computed stati- Higher order functional programs constantly allocate cally. objects dynamically. These objects are typically cons Our approach is to define a high-level non-standard cells, closures, and records and are generally allocated semantics that, in many ways, is similar to the stan- in the heap and reclaimed later by some garbage col- dard semantics and captures the escape behavior caused lection process. This paper describes a compile time by the constructs in a functional language. The ad- analysis, called escape analysis, for determining the vantage of our analysis lies in its conceptual simplicity lifetime of dynamically created objects in higher or- and portability (i.e. no assumption is made about an der functional programs, and describes optimizations underlying abstract machine). that can be performed, based on the analysis, to im- prove storage allocation and reclamation of such ob- jects. In particular, our analysis can be applied to 1 Introduction programs manipulating lists, in which case optimiza- tions can be performed to allow whole cons cells in Higher order functional programs constantly allocate spines of lists to be either reclaimed at once or reused objects dynamically. These objects are typically cons without incurring any garbage collection overhead. In cells, closures, and records and are generally allocated a previous paper on escape analysis [10], we had left in the heap and reclaimed later by some garbage col- open the problem of performing escape analysis on lection process. Garbage collection overhead can be lists. reduced by performing compile time analyses that al- Escape analysis simply determines when the ar- low the following optimizations to be performed: gument (or some part of the argument) to a function • Stack allocation: Objects that would otherwise call is returned by that call. This simple piece of in- be allocated in the heap and then reclaimed us- formation turns out to be sufficiently powerful to al- ing garbage collection are allocated in an activa- low stack allocation of objects, compile-time garbage tion record on the stack and automatically (and collection, reduction of run-time storage reclamation cheaply) reclaimed when the activation record ∗This research was funded in part by the National Sci- is popped off the stack. ence Foundation (#CCR-8909634) and by DARPA/ONR (#N00014-90-1110). • In-place Reuse: When objects (such as cons y Present address: School of Computer Science, University cells) are no longer needed, they can be reused of Windsor, 401 Sunset Ave, Windsor, Ontario, Canada N9B 3P4, Email: [email protected] directly by the program without invoking the zAuthors' address: 251 Mercer Street, New York, N.Y. garbage collector. A typical example of this is 10012, Email: [email protected], [email protected]. when a function constructs a new list by de- structively modifying the cons cells that were contained in an argument to that function. • Block Allocation/Reclamation: A number of ob- jects (such as the cons cells of a list) are allo- cated together in a contiguous block of memory. Eventually, the whole block is put on the free 0 list, rather than the individual objects. This al- lows reclamation of larger segments of memory, The first point reflects our inability to determine pre- cisely, without actually running the program, which (top) LH 1stspine jH) - - individual cells of a list might escape. To form a terminating compile-time escape analysis, one must 2ndspine ) ? - ? - choose an approximation of program behavior. The 3rdspine ) ?- ?- ?- ?- second point reflects our belief that the spines are a . good choice of approximation, since the cells of each . ?. ? ?. ? ?. ? ?. ? spine of a list tend to be treated identically. Many functions, such as append, reduce, map, length, etc, operate on all cells of a spine. Many other functions Figure 1: Spines of a List have the form: f L = if predicate(car L) then .... and reduces run-time overhead by avoiding the else f (cdr L) traversal of the individual objects (in a mark- sweep collection, for instance). or In this paper, we present a compile time analysis f L x = if x = n then ... that provides sufficient information to perform all of else f (cdr L) (arith-op x) these optimizations on higher order functional pro- grams. The analysis is called escape analysis. In a In general, it is impossible to determine at compile previous paper [10] we described an escape analysis time when the recursion will bottom out. One simply for non-list objects, such as closures, and left open has to assume that all cells in the spine of the list L the problem of performing the analysis in the pres- will be visited. ence of lists. This paper solves that problem. Escape Consider, for example, the following program: analysis answers a simple question: Given a function let map f l = if (l=nil) then nil application, does a parameter (or some part of a pa- else cons (f (car l)) rameter) get returned in the result of the application? (map f (cdr l)) ; If so, the parameter is said to escape from the appli- pair x = [car x, car (cdr x)] ; cation. If the escaping parameter is a list, we would in map pair [[1,2],[3,4],[5,6]] particularly like to know which spines of the parame- ter escape. The spines of a list are a way of describing An escape analysis would allow us to determine the substructures of a list and are defined as follows: following properties of the program at compile time: Definition 1 (Spines of a list) Given a list L and 1. The top spine of pair's parameter does not es- some i ≥ 1, the top ith spine of L is defined as the cape from pair, only some elements do. set of cons cells accessible by a sequence of operations consisting of car and cdr where the number of occur- 2. The top spine of map's parameter l does not rences of car is (i − 1). Similarly, given a list L with escape from map, and the elements of l escape d spines and some j ≥ 1, the bottom jth spine of L from map to the extent they escape from the is defined as the top (d − j + 1) spine of L. unknown function f. In Figure 1 we illustrate what we mean by the 3. In the call (map pair [[1,2],[3,4],[5,6]]), spines of a list. Naturally, an empty list (nil) and the top two spines of the second argument of any non-list objects have no (or zero) spine. map do not escape. We have chosen to analyze the escape properties This means that (at least) the following optimizations of lists in terms of their spines for two reasons: could be performed: • It is an approximation to the run-time behavior • Stack Allocation: The spine of the list [[1,2],[3,4],[5,6]] that allows a compile-time analysis. and the spine of each element of the list could • It reflects the programming style commonly used be allocated in the activation record for map. for strongly typed languages, such as ML, in Thus, when map returns, the cons cells of those which lists are homogeneous (all elements have spines would disappear. It seems strange to al- the same type) and functions (such as append, locate a list in a stack, but there is really no map, etc.) often operate over complete spines of reason not to (as long as we keep in mind the lists. safety considerations outlined in [6]). • In-place Reuse: If the parameter l of map is an first-order analysis for LISP that constructs graphs unshared (sharing information can also be ob- representing possible list structures and analyzes the tained from escape information), we can recycle graphs for possible storage optimizations. Our analy- the cells of the spine of l within map to be used sis, in contrast, concentrates on typed languages such in the spine of the result (since l's spine does as ML and benefits from a type system that restricts not escape). The call to cons inside map can the ways that lists can be created (for example, one reuse the first cell in the spine of l for the re- cannot say f = cons(x,x)) and the sharing that can sult. Pair can reuse the spine of x in its result. occur within a list. Orbit, an optimizing compiler for Scheme, uses a simple first-order escape analysis to • Block Allocation/Reclamation: If neither of the stack allocate closures [15]. Other analyses for opti- above optimizations are used, then the top two mizing storage allocation were proposed in [14, 5]. spines of the list [[1,2],[3,4],[5,6]] can be Deutsch [8] presents a lifetime and sharing analy- allocated in some block in the heap. When sis for higher order languages. While the goals of the map finishes, that whole block can be placed on paper seem to be similar to ours, the approach is very the free list, freeing all the cons cells without different. The analysis consisted of defining a low- traversing the list. level operational model for a higher order functional This paper particularly describes an escape anal- language, translating a program into a sequence of ysis on lists in higher-order, polymorphically-typed operations in this model, and then performing an functional programs, and describes optimizations and analysis to determine the lifetimes of dynamically cre- other analysis (sharing analysis) that can be performed ated objects.