Implementation of the Typed Call-By-Value Lambda-Calculus
Total Page:16
File Type:pdf, Size:1020Kb
Implementation of the Typed Call-by-Value λ-calculus using a Stack of Regions Mads Tofte, University of Copenhagen∗ Jean-Pierre Talpin, European Computer-Industry Research Center † Abstract from the source program. Heap-based schemes are less eager to reuse memory. Generational garbage collec- We present a translation scheme for the polymorphi- tion collects young objects when the allocation space cally typed call-by-value λ-calculus. All runtime val- is used up. Hayes[11] discusses how to reclaim large, ues, including function closures, are put into regions. old objects. The store consists of a stack of regions. Region in- Garbage collection can be very fast. Indeed, there ference and effect inference are used to infer where is a much quoted argument that the amortized cost of regions can be allocated and de-allocated. Recursive copying garbage collection tends to zero, as memory functions are handled using a limited form of polymor- tends to infinity[2, page 206]. Novice functional pro- phic recursion. The translation is proved correct with grammers often report that on their machines, mem- respect to a store semantics, which models a region- ory is a constant, not a variable, and that this constant based run-time system. Experimental results suggest has to be uncomfortably large for their programs to that regions tend to be small, that region allocation is run well. The practical ambition of our work is to frequent and that overall memory demands are usually reduce the required size of this constant significantly. modest, even without garbage collection. We shall present measurements that indicate that our translation scheme holds some promise in this respect. In this paper, we propose a translation scheme for 1 Introduction Milner’s call-by-value λ-calculus with recursive func- tions and polymorphic let[22,7]. The key features of The stack allocation scheme for block-structured our scheme are: languages[9] often gives economical use of memory re- 1. It determines lexically scoped lifetimes for all run- sources. Part of the reason for this is that the stack time values, including function closures, base val- discipline is eager to reuse dead memory locations (i.e. ues and records; locations, whose contents is of no consequence to the rest of the computation). Every point of allocation is 2. It is provably safe; matched by a point of de-allocation and these points 3. It is able to distinguish the lifetimes of different can easily be identified in the source program. invocations of the same recursive function; In heap-based storage management schemes[4,19, 18], allocation is separate from de-allocation, the latter This last feature is essential for obtaining good mem- being handled by garbage collection. This separation ory usage (see Section 5). is useful when the lifetime of values is not apparent Our model of the runtime system involves a stack of regions, see Figure 1. We do not expect always to ∗Postal address: Department of Computer Science (DIKU), be able to determine the size of a region when we allo- University of Copenhagen, Universitetsparken 1, DK-2100 Copenhagen Ø, Denmark; email: [email protected]. cate it. Part of the reason for this is that we consider †Work done while at Ecole des Mines de Paris. Cur- recursive datatypes, such as lists, a must; the size of rent address: European Computer-Industry Research Center a region which is supposed to hold the spine of a list, (ECRC GmbH), Arabella Straße 17, D-81925 Mu¨nchen; email: say, cannot in general be determined when the region [email protected] Copyright 1994 ACM. Appeared in the Proceedings of is allocated. Therefore, not all regions can be allo- the 21st Annual ACM SIGPLAN-SIGACT Symposium on cated on a hardware stack, although regions of known Principles of Programming Languages, January 1994, pp. size can. 188–201. Permission to copy without fee all or part of Our allocation scheme differs from the classical this material is granted provided that the copies are not stack allocation scheme in that it admits functions as made or distributed for direct commercial advantage, the first-class values and is intended to work for recursive ACM copyright notice and the title of the publication and datatypes. (So far, the only recursive datatype we its date appear, and notice is given that copying is by have dealt with is lists.) permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. 188 in the number of evaluation steps by an arbitrarily large factor[10, page 618]. Georgeff also presents an implementation scheme which does not involve trans- lation, although this relies on not using call-by-value reduction, when actual parameters are functions. We translate every well-typed source language ex- pression, e, into a target language expression, e!, which is identical with e, except for certain region annota- tions. The evaluation of e! corresponds, step for step, to the evaluation of e. Two forms of annotations are . e1 at ρ r0 r1 r2 r3 letregion ρ in e2 end The first form is used whenever e is an expression Figure 1: The store is a stack of regions; a region is a 1 which directly produces a value. (Constant expres- box in the picture. sions, λ-abstractions and tuple expressions fall into this category.) The ρ is a region variable; it indicates Ruggieri and Murtagh[28] propose a stack of regions that the value of e1 is to be put in the region bound in conjunction with a traditional heap. Each region is to ρ. associated with an activation record (this is not neces- The second form introduces a region variable ρ with sarily the case in our scheme). They use a combination local scope e2. At runtime, first an unused region, r, of interprocedural and intraprocedural data-flow anal- is allocated and bound to ρ. Then e2 is evaluated ysis to find suitable regions to put values in. We use a (probably using r). Finally, r is de-allocated. The type-inference based analysis. They consider updates, letregion expression is the only way of introducing which we do not. However, we deal with polymor- and eliminating regions. Hence regions are allocated phism and higher-order functions, which they do not. and de-allocated in a stack-like manner. The device we use for grouping values according Inoue et al.[15] present an interesting technique for to regions is unification of region variables, using es- compile-time analysis of runtime garbage cells in lists. sentially the idea of Baker[3], namely that two value- Their method inserts pairs of HOLD and RECLAIMη producing expressions e and e should be given the instructions in the target language. HOLD holds on 1 2 same “at ρ” annotation, if and only if type check- to a pointer, p say, to the root cell of its argument ing, directly or indirectly, unifies the type of e and and RECLAIMη collects those cells that are reachable 1 e . Baker does not prove safety, however, nor does he from p and fit the path description η. HOLD and RE- 2 deal with polymorphism. CLAIM pairs are nested, so the HOLD pointers can be held in a stack, not entirely unlike our stack of regions. To obtain good separation of lifetimes, we introduce In our scheme, however, the unit of collection is one explicit region polymorphism, by which we mean that entire region, i.e., there is no traversal of values in con- regions can be given as arguments to functions at run- nection with region collection. The path descriptions time. For example, the successor function succ = of Inoue et al. make it possible to distinguish between λx.x + 1 is compiled into the individual members of a list. This is not possible Λ[ρ,ρ!].λx.letregion ρ!! in our scheme, as we treat all the elements of the same in (x + (1 at ρ!!)) at ρ! list as equal. Inoue et al. report a 100% reclamation end rate for garbage list cells produced by Quicksort[15, page 575]. We obtain a 100% reclamation rate (but for 1 word) for all garbage produced by Quicksort, which has the type scheme without garbage collection (see Section 5). Hudak[13] describes a reference counting scheme for get(ρ),put(ρ!) ρ, ρ!.(int, ρ) { } (int, ρ!) a first-order call-by-value functional language. Refer- ∀ −−−−−−−−−−−−−→ ence counting may give more precise use information, meaning that, for any ρ and ρ!, the function accepts an than our scheme, as we only distinguish between “no integer at ρ and produces an integer at ρ! (performing use” and “perhaps some use.” a get operation on region ρ and a put operation on Georgeff[10] describes an implementation scheme region ρ! in the process). Now succ will put its result for typed lambda expressions in so-called simple form in different regions, depending on the context: together with a transformation of expressions into sim- ple form. The transformation can result in an increase succ[ρ , ρ ](5 at ρ ) succ[ρ , ρ ](x) ··· 12 9 12 ··· 1 4 189 Moreover, we make the special provision that a recur- Dom(f) and Rng(f), respectively. When f and g sive function, f, can call itself with region arguments are finite maps, f + g is the finite map whose do- which are different from its formal region parameters main is Dom(f) Dom(g) and whose value is g(x), if and which may well be local to the body of the recur- x Dom(g), an∪d f(x) otherwise.