arXiv:1201.0023v1 [cs.PL] 29 Dec 2011 surrounding func(int,int) iue1i h hpllnug.The language. Chapel the in 1 Figure b problems classic safety the type of poses allocation stack with language predictable. sho and construct language small relatively each for langua overhead performance-oriented run-time [Appl for the C which meant Objective is of and design blocks 2011] The the [ISO 2011]. C++ to alternative of lan saf efficient expressions a programming more lambda provide Chapel new also the the could to but in ternative 2011], use al. et co for simp [Chamberlain not intended for guage strives does is that that design and way performance This a or in fu safety type allocation first-class promise stack integrating with for languages design into a describes paper This Introduction 1. storage t function-local a efficiency. combines with over that safety, trol design for simple system, a effect pro present the and we to paper regions this an exposing In by safety mer. simplicity both up programm provide give lose but that ciency, but ones safety funarg and regain efficiency, upward which predictable classic systems the are There reference: lem. they functions variables about worry the must ing one safety, preserve efficiency safety, To between plicity. tension fu a first-class is mixing allocation in C+ stack difficulty and in The expressions C. lambda Objective to in relevant blocks langu also Chapel is the problem on work this our can but from system comes emulsion problem effect nice this a and in form type terest features a these helps that that show detergent naturally the we don’t paper this functions In first-class gether. and allocation Stack Abstract naoyosfnto hc eest parameters to refers which function anonymous an h alto call the aibe ntecl tc,tevariables the stack, call an the parameters on allocates variables Chapel Because twice. argument its invokes and int when Cprgtntc ilapa eeoc pern’oto i option ’preprint’ once here appear will notice [Copyright h tagtowr nerto ffis-ls ucin i functions first-class of integration straightforward The stertr ye h xml hndfie h function the defines then example The type. return the is inc2 inc2 scle;tecl to call the called; is compose compose padfnr problem funarg upward h oain rvosyalctdto allocated previously locations the , h first the ; ucin Both function. oobtain to int eeyG ik ihe .Vtue,adJnta .Turner D. Jonathan and Vitousek, M. Michael Siek, G. Jeremy steipttp n h second the and type input the is inc2 compose f ucinta increments that function a , Mss17] lutae in illustrated 1970], [Moses and compose f and a opee.During completed. has g removed.] s r ucin ftype of functions are g fet o Funargs for Effects r olne live longer no are ucinreturns function f nvriyo ooaoa Boulder at Colorado of University f and and [email protected] n sim- and , o con- for , u in- Our . g i to- mix nctions nctions local d ecause outliv- g effi- d l be uld gram- e in ges fthe of and + licity. Inc. e prob- t a nto ral- er may age, inc ype m- er- be a - 1 ucin h rgamrcudeett oytevle of values the copy to elect could the programmer in example, the For i function, function. values a of with In copying associated the [Apple storage enable C tra Objective solution: of simple blocks a the offer and 2011] 2011] c [ISO involve C++ of that sions cases above the useful as programmi many such functions, higher-order disallow still many they support idioms, [2002] al. et Grossman thi that evidence provides wor Cyclone case. The on the safety. [2002] type al. et and Grossman regions efficiency explicit predictable with both language) provide langua intermediate programming Henglein an a 1999, to that suggest opposed al. [2004] et al. Crary et Tofte 1995, 2001]. up allocat al. on within et restriction [Aiken occur LIFO system deallocation memor that the relaxed references effect of work dangling absence Later and funargs. the the fas type as guarantee LIFO such a to errors, a uses [1992] in Jouvelot Calculus and Talpin Region allocated/deallocated The are ion. regions and regions nemdaelnug,wihw ee oa h einCalcu Region the as ty explicit to an a refer with designed we lus, 1997] which [1994, language, Talpin intermediate and Tofte are. ingredients programming costs. higher-order run-time enables over control that programmer the but gives safety, thereby statically, type caught are ing funargs upward which in sign rvsefiinyfrmn rgas u tl osntdeliv not does still but programs, many predictable P programmer which and for 1996], Goldberg Feeley efficiency 1978, and proves Serrano [Steele 1994, stack Talpin and the Tofte 1990, on allocated functi safely variab which for be determine compilers to re analyses Advanced static is employ 1994]. languages allocation Rozas garbage and stack [Miller use whereas fast garba and performan challenge designing predictable heap research However, provide on-going the memory. that on algorithms the them collection reclaim allocate va to local they or collection stack; parameters the allocate on not do functions first-class safety. type to counterexample a have we oti auso ye htaedfeetfrom different are that types of values contain rcicx n) n eunx+1 } 1; + inc); x compose(inc, return inc2(0); = { inc2 int var int): inc(x: proc rccmoef func(int,int), compose(f: proc hl h yeadefc ytm fTfee l 20]and [2004] al. et Tofte of systems effect and type the While hl uhadsg sntpeeti h ieaue h key the literature, the in present not is design a such While de- language a seek we language, programming Chapel the For oaodteuwr uagpolm otlnugswith languages most problem, funarg upward the avoid To } eunfnxit{vryfx;rtr () }; g(y); return y=f(x); var fun(x:int){ return iue1. Figure xml fa padfnr nChapel. in funarg upward an of Example :fn(n,n):fn(n,n){ func(int,int) func(int,int)): g: region fcec Tfee l 2004]. al. et [Tofte efficiency compose btato:vle r loae into allocated are values abstraction: ucin h abaexpres- lambda The function. func(int,int) compose ei an is ce e may les o and ion achiev- t ex- nto 2018/3/1 riables would urried e(as ge f liably ward tal. et by k onal ala `a and ped and so , is s im- ark ng ge h- er c. y - g into storage associated with the anonymous function, thereby variables x,y keeping them alive for the lifetime of the function. One might be integers n ∈ Z tempted to make this copying behavior the only semantics, but in operators op ::= + | − | . . . many situations the copy is too expensive [J¨arvi et al. 2007]. (Sup- effects ϕ ::= x,...,x pose the copied object is an array.) In the C++ design, the program- types T ::= int | func(T,T, [ϕ]) |
2 2018/3/1 To see how this works, consider the following step of execution, 1 var x = 1; where addx ≡ fun(z:int)[x]{return x + z;}. 2 var addx = fun(z: int)[x]{ return x + z; }; 3 var twice = fun(f: func(int,int,[x]), y: int)[x] { let g=addx in fun(y:int)[x]{var t=g(y); return g(t);} 4 var t = f(y); return f(t); −→ fun(y:int)[x]{var t=addx (y); return addx (t);} 5 }; 6 var b = twice(addx, 3); The let expression has caused the addx function to be copied into 7 return b; the body of the function. Of course, a production quality compiler would not use substi- Figure 3. Example with first-class functions and stack allocation. tution (which implies run-time code generation) but instead would use a closure representation of functions. A closure consists of a function pointer paired with an array of the values from the let- variables that are live in the current scope. The tail call bound and fix-bound variables that occur free in the function body. return e (e ); For readers familiar with C++, a closure is just a “function object” 1 2 or “functor” in which data members are used to store copies of the is a function call in which nothing else is left to be done in the variables. current function after the call e1(e2) returns. To reduce the notational overhead of adding let expressions, Allow Downward Funargs Several variations on the example one can add syntactic sugar to function expressions for declaring in Figure 3 serve to demonstrate the interplay between first-class that a variable be copied, as in C++ [ISO 2011]. The following functions, stack allocation, and effect annotations. The example shows f listed after the normal parameters of the function on line 4 defines a twice function, with a function parameter f that it calls to indicate that f is to be copied into function-local storage. twice, first on the parameter y and then on the result of the first 3 var twice = fun(f: func(int,int,[x])) { call. The example also defines the addx function, which adds its 4 return fun(y:int; f)[x] { parameter z and the global variable x. The example then invokes the 5 var t = f(y); return f(t); twice function with addx as a parameter; so the addx argument is 6 }; an example of a downward funarg. 7 }; Downward funargs are benign with respect to memory safety and are allowed by our type system. In this case, the addx function Effect Polymorphism As described so far, the type and effect reads from x, so x is recorded in the type of addx. The call to twice system is too restrictive because function types are annotated with with addx is allowed by the type system because the type of addx, specific variables. For example, the below twice function has a including its effect, matches the type of parameter f. The call to f function parameter f that is annotated with the effect [x]. Thus, inside twice is allowed because twice has also declared x as its calling twice with addx is fine but calling twice with addy is not effect. Then looking back to the call to twice, it is allowed because because the effect of addy is to read y, not x. the lifetime of variable x encompasses the call to twice. However, the above twice function is surprisingly specific. It var x = 1; may only be called with a function that reads from x. We discuss var y = 2; shortly how to make twice more general. var twice = fun(f: func(int,int,[x]), y: int)[x]{ var t = f(y); return f(t); Disallow Upward Funargs Next consider the twice function }; written in curried form, that is, taking one argument at a time as var addx = fun(z:int)[x]{ return x + z; }; shown below. The function created on line 4 is an example of an var addy = fun(z:int)[y]{ return y + z; }; upward funarg that is disallowed by our type system. It reads from var b = twice(addx, 3); /* OK */ three variables, one that is local to the function (y) and two from var c = twice(addy, 3); /* Error */ surrounding scopes (directly from f and indirectly from x). The return b + c; return of this function is disallowed because the variable f is going out of scope, so this upward funarg contains a dangling reference. The solution to this problem is to parameterize functions with respect to their effects [Tofte and Talpin 1994, Pierce 2004]. In 3 var twice = fun(f: func(int,int,[x])) { F2C, the support for this comes from the effect abstraction fun(f: func(int,int,[p]), y: int)[p] { 3 var twice = fun(f: func(int,int,[x])) { var t = f(y); return f(t); 4 return (let g=f }; 5 in fun(y:int)[x] { ... 6 var t = g(y); return g(t); var b = twice 3 2018/3/1 Of course, we would like to parameterize with respect to ar- var y = g(2); bitrary numbers of variables, so that twice could be used with return y; functions with no effect or an effect with greater than one variable. See Grossman et al. [2002] for the generalization of effect poly- Inside g, the tail call return f(x) has the effect {x}. A naive type morphism that addresses this need. From a syntactic standpoint, checking algorithm would reject this tail call because the parameter explicit effect application, as in twice var f = (let y = y in fix f: <~z> func(T1,T2, [ϕ]). Now the tail call return f1(x2) has the effect x1, which is be- <~z> fun(x:T1)[ϕ]{ s1 }); nign because x2, and not x1, is popped prior to the call. (Another s2 solution is to use the de Bruijn [1972] representation for variables.) (The portions of the syntax related to ~z, x, y, and ϕ are optional.) Summary of the Design The F2C calculus combines first-class functions with stack allocation, enabling the full range of higher- Tail Calls and Effects To support space-efficient tail calls, F2C order programming while ensuring type safety and minimizing pops the current procedure call frame prior to performing the tail programmer-visible complexity through a regionless type and ef- call. To make this safe, our type system does not allow tail calls to fect system. Further, the design gives the programmer control over functions that read from variables on the current frame. We depart efficiency by providing by-reference and by-copy options for cap- from the running example to show a classic example [Appel 1992], turing the free variables of a function. listed below, that was problematic for region inference. Ideally, this example uses O(n) space; the approach of Tofte and Talpin [1994] used O(n2) but the later approach of Aiken et al. [1995] reduced 3. Direct Semantics and Type Safety 2 the space to O(n). Indeed, the below example uses O(n) in F C This section presents the dynamic and static semantics of F2C and 2 because the two calls in tail position are executed as tail calls. then a proof of type safety. proc s(i:int): int list { 3.1 Dynamic Semantics if (iszero(i)) { 2 return nil; We give a substitution-based, single-step semantics for F C that } else { takes inspiration from the semantics of Helsen and Thiemann var p = dec(i); var sp = s(p); [2000] and Crary et al. [1999]. However, to make stack allocation return cons(0, sp); explicit, the semantics operates on states of the form } hs, κ, σ, ni } proc f(n:int, x: int list):int [s]{ where s is the body of the currently executing function, κ is the var z = length(x); control stack, σ is the value stack, and n is the number of stack proc g(; f, n):int [s] { values that are associated with the currently executing function. On var s100 = s(100); return f(dec(n), s100); a real machine, the control and value stacks are intertwined; our } formalization is more clear with them separated. We write ǫ for the if (iszero(n)) return 0; empty stack and v · σ for pushing v onto the front of the stack σ. else return g(); (So we represent stacks as cons-style lists.) We write |σ| for the } length of stack σ and concatenation of stacks is juxtaposition, so var r = f(100,nil); the concatenation of σ1 and σ2 is written σ1σ2. return r; In our semantics, variables are mapped to stack locations (rep- resented with natural numbers). We index starting from the back of In F2C, the programmer knows for sure that every tail call does the stack so that existing locations are not disturbed by growing the not use extra stack space. Also, the programmer is notified with an stack at the front. Given the notation σi for the normal front-to-back error message if a tail call could lead to a dangling reference. indexing, we use the following notation for indexing back-to-front. Take Care with Variables Particular care must be taken with σ{i} = σ|σ|−1−i the implementation of variables in a language with a type and As usual in a substitution-based semantics, the syntax of the effect system such as F2C. The straightforward approach leads to language must be slightly expanded for use by the dynamic seman- incorrect conclusions during type checking. Consider the following tics, which we show in Figure 4 with the differences highlighted in program with two variables with the name x. gray. The most important difference is that expressions and effects var x = 1; include stack locations. Stack locations have the form ℓT , com- proc f(a: int):int [x]{ return a + x; } bining the address ℓ with type T . This type annotation is ignored proc g(x: int):int [x]{ return f(x); } by the dynamic semantics (see Section 4); it is merely a technical device used in the proof of type safety. The need for these type 2 For the purposes on this example, F2C is extended with support for if annotations propagates to needing type annotations in variable ini- statements and lists in the obvious way. tialization and for function return types. These type annotations are 4 2018/3/1 Definition 1. The dynamic semantics of F2C is specified by the stack index ℓ ∈ N partial function eval defined by the following. stack location l ::= x | ℓT eval(s)= observe(JeK ) iff hs, ǫ, ǫ, 0i −→∗ hreturn e;, ǫ, σ, ni effects ϕ ::= l,...,l σ types T ::= ⊤ | int | func(T,T, [ϕ]) | abstractions f ::= fun(x:T ): T [ϕ]{s} | . . . observe(fun(x:T1):T2[ϕ]{s})= fun statements s ::= var x :T =e; s | . . . observe( values v ::= n | fun(x:T ):T [ϕ]{s} | Jlet x=e1 in e2Kσ = J[x:=Je1Kσ]e2Kσ of a stack-allocated variable x to type T . We write Γ,x:copy T for extending the type environment with variables that are bound by Jfix x:T.fK = J[x:=fix x:T.f]fK σ σ the let or fix expressions. The copy annotation helps the type J 5 2018/3/1 ς −→ ς h(var x:T =e; s), κ, σ, ni −→h[x:=|σ|T ]s, κ, JeKσ · σ, n + 1i (INIT) h(var x=e1(e2); s1), κ, σ, ni −→h[y:=|σ|T1 ]s2, (x:T2,s1, n) · κ, Je2Kσ · σ, 1i if Je1Kσ = fun(y:T1):T2[ϕ]{s2} (CALL) hreturn e1(e2);, κ, σ, ni −→h[y:=(|σ|−n)T1 ]s, κ, Je2Kσ · drop(n, σ), 1i if Je1Kσ = fun(y:T1):T2[ϕ]{s} (TAILCALL) hreturn e;, (x:T,s,n2) · κ, σ, n1i −→h[x:=(|σ|−n1)T ]s, κ, JeKσ · drop(n1,σ), n2 + 1i (RETURN) Figure 6. The single-step reduction relation. The rules for tail call and return must both prevent the escape of upward funargs with dangling references. The rules accomplish this by requiring that the free variables in the returned value’s type not include any local variables. The free variables of a type, written fv(T ), is defined as follows. Γ ⊢ ϕ fv(int)= ∅ Γ ⊢ ϕ iff ∀x ∈ ϕ. ∃T. Γ(x)= T fv(func(T1,T2, [ϕ])) = fv(T1) ∪ ϕ ∪ fv(T2) Γ ⊢ T fv( 3.3 Substitution vs. Environments Γ ⊢⊤ Γ ⊢ int Γ ⊢ func(T1,T2, [ϕ]) Γ ⊢ 6 2018/3/1 ⊢ Γ only deal with closed expression (Γ = ∅) because, in a well-typed program, the variables are substituted away before we come to eval- ⊢ Γ Γ ⊢ T uate an expression. The process of evaluation replaces stack loca- ⊢ ∅ ⊢ Γ,x:T tions with their associated values, so the resulting value is well- typed in an empty effect context. There may, however, still be ef- ⊢ σ : Σ fects in the type T (if the value is a function and the body of the ∅; ∅⊢ v : T ⊢ σ : Σ function contains stack locations). The effects in T dictate how the ⊢ ǫ : ǫ ⊢ v · σ : T · Σ resulting value can be used. To ensure that the stack locations cor- respond to values of the appropriate type, this lemma includes the Σ ⊢ ϕ Σ ⊢ ϕ ≡∀ℓT ∈ ϕ. Σ{ℓ} = T premises ⊢ σ : Σ and Σ ⊢ ϕ. Σ ⊢ κ : T ⇒ T Lemma 2 (Evaluation Safety). If ∅ ⊢ ϕ, ∅; ϕ ⊢ e : T , ⊢ σ : Σ, and Σ ⊢ ϕ, then ∅; ∅⊢ JeKσ : T . Σ ⊢ ϕ1 drop(n, Σ) ⊢ ϕ1 − ϕ2 ∅,x:T1; ϕ1 ∪ {x}; ϕ2 ∪ {x}⊢ s : T2 Looking at the various cases for evaluation in Figure 5, there ∅⊢ T1 drop(n, Σ) ⊢ κ : T2 ⇒ T3 are several that require knowing that a subexpression evaluates to a particular form of value. For example, to evaluate an effect Σ ⊢ ǫ : T ⇒ T Σ ⊢ (x:T1,s,n) · κ : T1 ⇒ T3 application e<ℓT >, the expression e must evaluate to an effect ′ ⊢ ς : T application, which has the form 3.4.1 Expression Evaluation Safety 1. If ∅; ϕ1 ⊢ e : T , then Γ; ϕ1 ⊢ e : T . 2. If ∅; ϕ ; ϕ ⊢ s : T , then Γ; ϕ ; ϕ ⊢ s : T . All of the reduction rules (Figure 6) require evaluating an expres- 1 2 1 2 sion, which means we need a notion of type safety for expressions. The expression e being substituted also goes from a context This is captured in the Evaluation Safety Lemma, stated below. We with no effects to a context with possibly many effects. Thus, we 7 2018/3/1 also need weakening for effects. The following Effect Weakening ⌊x⌋ = x Lemma does not need to be proved simultaneously for statements because the typing of the body of a function does not depend on the ⌊n⌋ = n current effect. ⌊op(e1 ...en)⌋ = op(⌊e1⌋ . . . ⌊en⌋) ⌊let x=e in e ⌋ = let x=⌊e ⌋ in ⌊e ⌋ Lemma 6 (Effect Weakening). If Γ; ϕ ⊢ e : T , ϕ ⊆ ϕ′, and 1 2 1 2 ′ ′ ⌊fix x:T. f⌋ = fix x. ⌊f⌋ Γ ⊢ ϕ , then Γ; ϕ ⊢ e : T . ⌊e 1. If Γ1,x:T1, Γ2 ⊢ T2, then Γ1, [Γ2] ⊢ [T2]. accomplished via a proof based on logical relations, but we leave 2. If Γ1,x:T1, Γ2; ϕ1 ⊢ e : T2, then Γ1, [Γ2]; [ϕ1] ⊢ [e] : [T2]. that for future work. 3. If Γ ,x:T , Γ ; ϕ ; ϕ ⊢ s : T , 1 1 2 1 2 2 Concretely, because effect abstractions do not depend on their then Γ , [Γ ]; [ϕ ]; [ϕ ] ⊢ [s] : [T ]. 1 2 1 2 2 arguments, it is unnecessary to perform effect substitutions at run- Of course, to handle the case of evaluating a primitive operator, time. In the direct semantics of F2C, evaluation of effect applica- we require that all the primitive operators be type safe. tions e<ℓT > involves substitution of the stack location ℓT into the subterms of . However, since effect abstraction variables may be Constraint 1 (Type safe primitive operators). If typeof (op) = e substituted by concrete variables of any type, effect variables may (T ,...,T ,T ) and ∅; ∅ ⊢ v : T for i ∈ {1,...,n}, then 1 n r i i only be used in effect applications and function effect annotations, δ(op, (v ,...,v )) = v′ and ∅; ∅⊢ v′ : T for some v′. 1 n r which do not interact with the rest of the program at runtime — No other lemmas are required to prove Evaluation Safety. in other words, F2C programs are parametric with regard to effect polymorphism. 3.4.2 Step Safety We remove both the type and effect annotations by an erasure The proof of Step Safety requires a few more key lemmas. Lemma 8 function, defined in Figure 9. We even erase effect abstractions and is a consequence of how the type system is designed to prevent effect applications. The syntactic restriction in F2C that the body upward funargs, that is, the return of functions with dangling refer- of an effect abstraction is an abstraction (a value) means that there ences to local variables. is no need to delay the evaluation of the body (because the body is already a value). Lemma 8 (Locals not in Return Type). Suppose Γ ⊢ ϕ1 and We may evaluate erased programs with similar rules to those . If , then . 2 Γ ⊢ ϕ2 Γ; ϕ1; ϕ2 ⊢ s : T fv(T ) ∩ ϕ2 = ∅ which evaluate F C and we refer to the resulting evaluation func- 2 The substitution of stack addresses into the return type of a function tion as eval erased . The similarity of these rules to those for F C doesn’t change the return type (which would break type preserva- leads us to the following: tion), because of the above lemma together with the following. Proposition 1 (Preservation of semantics under erasure). If ∅; ∅; ∅⊢ Lemma 9 (Substitution of Non-free Variables). If x∈ / fv(T ), then s : T , then eval(s)= eval erased (⌊s⌋). ′ [x:=ℓT ]T = T . We have performed rigorous testing of this property and found A function may have fewer effects than are present in the calling no violations. It should be straightforward to prove this preserva- context, so we need the following weakening lemma. tion of semantics via a simulation argument. Lemma 10 (Satisfaction Weakening). If Σ ⊢ ϕ1 and ϕ2 ⊆ ϕ1, then Σ ⊢ ϕ2. 5. Translation to the Region Calculus 2 Finally, in tail calls and when returning from a function, we pop In this section we relate F C to a region calculus, in particular, a the values corresponding to parameters and local variables from calculus similar to Henglein’s ETL and RTL calculi [Pierce 2004]. the stack, but the remaining stack is still well typed. The goal of this section is not to provide an aid to implementation, but to aid the reader in understanding the relation between our Lemma 11 (Drop Stack). If ⊢ σ : Σ, then ⊢ drop(n, σ) : work and prior work on regions. Thus, we choose a relatively drop(n, Σ). prototypical region calculus at the cost of losing efficient tail calls. The proof of Step Safety (Lemma 1) then proceeds by case analysis (There are other region-based calculi that would enable efficient tail calls [Aiken et al. 1995, Crary et al. 1999].) on the statement component of the state ς1. The syntax for the region calculus is defined in Figure 10. The type system is standard [Pierce 2004]. We use the variable 4. Erasure-based Semantics r to range over expressions, to easily distinguish region calculus An important question from both the semantic and implementa- expressions from F2C expressions. The form new ρ.r allocates an tion perspective is whether the execution of an effect abstraction empty region and binds it to the name ρ for use in r. The form depends in any way on the effects with which it is instantiated. We r at ρ allocates a location in region ρ, evaluates expression r and answer this question in the negative by demonstrating that F2C can stores its value into that location, then returns the location. The be implemented using an erasure-based approach. A more direct ar- form r ! ρ evaluates r to a location and dereferences the location, gument for this property, known as relational parametricity, can be that is, extracts the value stored at that location of region ρ. In 8 2018/3/1 region variables ρ JT KR JintKR = int effects ϕ ::= {ρ,...,ρ} JϕKR ϕ Jfunc(T1,T2, [ϕ])KR = JT1KR → JT2KR types T ::= int | T → T |∀ρ.T | T at ρ expressions r ::= x | n | op(r,...,r) J Jop(e1,...,en)KR = op(Jr1KR,..., JrnKR) addition to the forms in Figure 10, we use the following syntactic Jfun(x:T1):T2[ϕ]{s}KR = λy. new ρ. let x=y at ρ in JsKR(x:=ρ) sugar: let x=r1 in r2 , (λx.r2) r1. where y,ρ are fresh Figure 11 shows our translation from F2C to the region calcu- J x ! ρ if R(x)= ρ T 6= ⊤ Γ ∼R Γr JxKR = (x otherwise ∅∼R ∅ (Γ,x:T ) ∼R (Γr,y:JT KR,x:JT KR at R(x)) 2 The translation from F C to this region calculus is type preserv- T 6= ⊤ Γ ∼R Γr ing, that is, it maps well-typed programs to well-typed programs. (Γ,x:T ) ∼ (Γ ,x:JT K at R(x)) The proof is in the accompanying technical report. R r R Theorem 2 (Translation to Regions Preserves Types). Suppose Γ ∼R Γr Γ ∼R Γr Γ ⊢ R, Γ ⊢ ϕ1, and Γ ∼R Γr. (Γ,x:⊤) ∼R Γr (Γ,x:copy T ) ∼R (Γr,x:JT KR) 1. If Γ; ϕ1 ⊢ e : T , then Γr; Jϕ1KR ⊢ JeKR : JT KR. JϕKR JϕKR = {R(x) | x ∈ ϕ} 2. If Γ; ϕ1; ϕ2 ⊢ s : T , then Γr; Jϕ1KR ⊢ JsKR : JT KR. Γ ⊢ R We already proved type safety in Section 3.4 with respect to the ∀x. (∃T. Γ(x)= T ) iff (∃ρ. R(x)= ρ) direct semantics of F2C. With Theorem 2 in hand, we could write an alternative proof that relies on the type safety of the region Figure 11. Translation of F2C to a region calculus. calculus. The translation to regions is also be semantics preserving. That is, evaluating a program s under the direct semantics yields the of a regionless type and effect system and programmer controlled same result as translating the program and then evaluating under the function-local storage. dynamic semantics of the region calculus [Pierce 2004], for which Efficient but unsafe The lambda expressions of C++ [ISO 2011, we use the name eval R. We have tested this property on numerous programs and have found no violations. The proof of this should be J¨arvi et al. 2007] give programmers control over whether to capture 2 straightforward using a simulation argument. references to variables or to make copies. The design of F C offers the addition of static checking to catch upward funargs with Proposition 2 (Semantics Preserving). If ∅; ∅; ∅ ⊢ s : T , then dangling references. eval(s)= eval R(JsK∅). Safe but less efficient The blocks in Objective C [Apple Inc. 2011] give programmers a choice between copying the value of a 6. Related Work variable into function-local storage (the default) or storing the vari- Here we place the design of F2C in relation to other points in the able on the heap (called block storage) with automatic memory design space. To the best of our knowledge, F2C provides a unique management. Unfortunately, neither of these options is ideal for the balance of safety, efficiency, and simplicity through its combination most common use case for first-class functions: downward funargs. 9 2018/3/1 In such situations, storing the variable on the stack and capturing a J.-D. Choi, M. Gupta, M. Serrano, V. C. Sreedhar, and S. Midkiff. Escape reference to it is both safe and the most efficient. analysis for java. In OOPSLA ’99: Proceedings of the 14th ACM SIG- PLAN conference on Object-oriented programming, systems, languages, Safe but efficiency is less predictable Compilers for garbage- and applications, pages 1–19, New York, NY, USA, 1999. ACM Press. collected languages, such as Java, Standard ML, and Scheme, K. Crary, D. Walker, and G. Morrisett. Typed memory management in a optimize memory allocation by performing static analyses (such calculus of capabilities. In Proceedings of the 26th ACM SIGPLAN- as escape analysis [Goldberg and Park 1990, Choi et al. 1999], SIGACT symposium on Principles of programming languages, POPL region inference [Tofte and Talpin 1994], and storage use analy- ’99, pages 262–275, New York, NY, USA, 1999. ACM. doi: http: sis [Serrano and Feeley 1996]) to decide when objects can use a //doi.acm.org/10.1145/292540.292564. FIFO memory management scheme. However, from the program- N. de Bruijn. Lambda calculus notation with nameless dummies, a tool for mer standpoint, whether the static analyses succeeds on a given automatic formula manipulation, with application to the Church-Rosser program is unpredictable, which in turn means that the time and theorem. Indagationes Mathematicae (Proceedings), 75(5):381–392, space efficiency of the program is unpredictable [Tofte et al. 2004]. 1972. doi: 10.1016/1385-7258(72)90034-0. The design of F2C offers an alternative in which programmers C. Flanagan, A. Sabry, B. F. Duba, and M. Felleisen. The essence of compil- can mandate the stack allocation of objects, thereby achieving pre- ing with continuations. In PLDI ’93: Proceedings of the ACM SIGPLAN dictable efficiency. 1993 conference on Programming language design and implementation, pages 237–247, New York, NY, USA, 1993. ACM Press. Safe and efficient, but complex Several programming languages M. Fluet and G. Morrisett. Monadic regions. J. Funct. Program., 16(4-5): employ type and effect systems to provide safety and efficiency, but 485–545, 2006. at the cost of exposing regions to the programmer [Grossman et al. B. Goldberg and Y. G. Park. Higher order escape analysis. In ESOP, 1990. 2002, Boyapati et al. 2003]. (Several intermediate representations D. Grossman, G. Morrisett, T. Jim, M. Hicks, Y. Wang, and J. Cheney. also use explicit regions, but because they are intermediate rep- Region-based memory management in cyclone. In PLDI ’02: Proceed- resentations, regions are not necissarily exposed to the program- ings of the ACM SIGPLAN 2002 Conference on Programming language mer [Crary et al. 1999, Tofte and Talpin 1994].) Instead of regions, design and implementation, pages 282–293, New York, NY, USA, 2002. the F2C design relies on traditional stack allocation where param- ACM Press. eters and local variables are implicitly allocated on the stack. The S. Helsen and P. Thiemann. Syntactic type soundness for the region 2 effects of F C are sets of variables instead of sets of region names. calculus. In 4th International Workshop on Higher Order Operational Techniques in Semantics (HOOTS 2000), volume 41 of ENTCS, pages 7. Conclusion 1–19. Elsevier, 2000. F. Henglein, H. Makholm, and H. Niss. A direct approach to control-flow This paper presents a design for mixing first-class functions and sensitive region-based memory management. In Proceedings of the 3rd stack allocation that ensures type safety, enables the full range of ACM SIGPLAN international conference on Principles and practice of higher-order programming, and gives the programmer control over declarative programming, PPDP ’01, pages 175–186, New York, NY, efficiency while minimizing programmer-visible complexity. The USA, 2001. ACM. doi: http://doi.acm.org/10.1145/773184.773203. design uses a type and effect system based on variables instead of ISO. Working draft, standard for programming language C++. Technical region names. The system allows downward funargs and disallows Report N3242, ISO, February 2011. upward funargs with dangling references. To facilitate safe upward J. J¨arvi, J. Freeman, and L. Crowl. Lambda expressions and closures for funargs, the design gives programmers the choice of copying val- c++ (revision 1). Technical Report N2329, ISO/IEC JTC 1 SC22 WG21, ues into function-local storage. The paper formalizes the design in June 2007. 2 the F C calculus, defining its syntax, type system, and dynamic J. S. Miller and G. J. Rozas. Garbage collection is fast, but a stack is faster. semantics, and gives a proof of type safety. This paper also demon- AI Memos AIM-1462, MIT Artificial Intelligence Lab, March 1994. 2 strates that F C is parametric with respect to effects so effect ab- J. Moses. The function of FUNCTION in LISP or why the FUNARG prob- stractions can be implemented via erasure. Finally, the paper relates lem should be called the environment problem. SIGSAM Bull., pages 2 F C to a region calculus through a type-directed translation. 13–27, July 1970. doi: http://doi.acm.org/10.1145/1093410.1093411. B. C. Pierce. Types and Programming Languages. MIT Press, 2002. References B. C. Pierce, editor. Advanced Topics in Types and Programming Lan- A. Aiken, M. F¨ahndrich, and R. Levien. Better static memory management: guages. The MIT press, 2004. improving region-based analysis of higher-order languages. In Proceed- G. D. Plotkin. Call-by-name, call-by-value and the lambda-calculus. Theo- ings of the ACM SIGPLAN 1995 conference on Programming language retical Computer Science, 1(2):125–159, December 1975. design and implementation, PLDI ’95, pages 174–185, New York, NY, A. Sabry. MinML: Syntax, static semantics, dynamic semantics, and type USA, 1995. ACM. doi: http://doi.acm.org/10.1145/207110.207137. safety. Course notes for b522, February 2002. A. W. Appel. Compiling with continuations. Cambridge University Press, M. Serrano and M. Feeley. Storage use analysis and its applications. In New York, NY, USA, 1992. ISBN 0-521-41695-7. Proceedings of the first ACM SIGPLAN international conference on Apple Inc. Blocks programming topics. Technical report, Apple Inc., Functional programming, ICFP ’96, pages 50–61, New York, NY, USA, Cupertino, CA, March 2011. 1996. ACM. doi: http://doi.acm.org/10.1145/232627.232635. C. Boyapati, A. Salcianu, W. Beebee, Jr., and M. Rinard. Ownership G. L. Steele. Rabbit: A compiler for Scheme. Technical report, Cambridge, types for safe region-based memory management in real-time java. In MA, USA, 1978. Proceedings of the ACM SIGPLAN 2003 conference on Programming J.-P. Talpin and P. Jouvelot. The type and effect discipline. In Logic in language design and implementation, PLDI ’03, pages 324–337, New Computer Science, 1992. LICS ’92., Proceedings of the Seventh Annual York, NY, USA, 2003. ACM. doi: http://doi.acm.org/10.1145/781131. IEEE Symposium on, pages 162 –173, jun 1992. doi: 10.1109/LICS. 781168. 1992.185530. L. Cardelli. Handbook of Computer Science and Engineering, chapter Type M. Tofte and J.-P. Talpin. Implementation of the typed call-by-value Systems. CRC Press, 1997. lambda-calculus using a stack of regions. In POPL ’94: Proceedings B. Chamberlain, S. Deitz, S. Hoffswell, J. Plevyak, H. Zima, and R. Dia- of the 21st ACM SIGPLAN-SIGACT symposium on Principles of pro- conescu. Chapel Specification. Cray Inc, 0.82 edition, October 2011. gramming languages, pages 188–201. ACM Press, 1994. 10 2018/3/1 M. Tofte and J.-P. Talpin. Region-based memory management. Inf. Com- put., 132(2):109–176, 1997. M. Tofte, L. Birkedal, M. Elsman, and N. Hallenberg. A retrospective on region-based memory management. Higher-Order and Symbolic Computation Journal, 17:245–265, 2004. 11 2018/3/1