
Extensible Programming with First-Class Cases

Matthias Blume, Umut A. Acar, Wonseok Chae
Toyota Technological Institute at Chicago
{blume,umut,wchae}@tti-.org

Abstract

We present language mechanisms for polymorphic, extensible records and their exact dual, polymorphic sums with extensible first-class cases. These features make it possible to easily extend existing code with new cases. In fact, such extensions do not require any changes to code that adheres to a particular programming style. Using that style, individual extensions can be written independently and later be composed to form larger components. These language mechanisms provide a solution to the expression problem.

We study the proposed mechanisms in the context of an implicitly typed, purely functional language PolyR. We give a type system for the language and provide rules for a 2-phase transformation: first into an explicitly typed λ-calculus with record polymorphism, and finally to efficient index-passing code. The first phase eliminates sums and cases by taking advantage of the duality with records.

We implement a version of PolyR extended with imperative features and pattern matching; we call this language MLPolyR. Programs in MLPolyR require no type annotations: the implementation employs a reconstruction algorithm to infer all types. The compiler generates machine code (currently for PowerPC) and optimizes the representation of sums by eliminating closures generated by the dual construction.

Categories and Subject Descriptors D.3.3 [Programming Languages]: Language Constructs and Features: Polymorphism

General Terms Design, Languages

Keywords duality, first-class cases, records, sums

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ICFP’06 September 16–21, 2006, Portland, Oregon, USA.
Copyright © 2006 ACM 1-59593-309-3/06/0009...$5.00.

1. Introduction

In this paper we study language mechanisms for polymorphic extensible records and their duals: polymorphic sums with a mechanism for adding new cases to existing code handling such sums. We present a type system based on a straightforward application of row polymorphism [26] and incorporate it into a dialect of ML called MLPolyR. Our compiler for MLPolyR provides efficient type reconstruction of principal types by using a variant of the well-known algorithm W [19]. The key technical insight to fully general type inference is to keep separate type constructors for sums and cases during type inference. Taking advantage of duality, sums and cases can be eliminated later by translation into an explicitly typed intermediate language. This translation is formalized as part of our type system.

As in any dual construction, the introduction form of the primal corresponds to the elimination form of the dual. Thus, elimination forms of sums (e.g., match or case) correspond to introduction forms of records. In particular, record extension (an introduction form) corresponds to extension of cases (an elimination form). This duality motivates making cases first-class values as opposed to mere syntactic form.¹ With cases being first-class and extensible, one can use the usual mechanisms of functional abstraction in a style of programming that facilitates composable extensions. We show examples for this later in the introduction and in Section 2.

Our polymorphic sums provide a notion of type refinement similar to data sorts of Freeman and Pfenning [9] and give rise to a simple programming pattern facilitating composable extensions. Composable extensions can be used as a principled approach to solving the well-known expression problem described by Wadler [30]. There have been many attempts at solving the expression problem, most of them in an object-oriented context [28, 22, 3, 16, 27, 6, 8, 33, 18, 4, 29, 34]. Garrigue shows an approach based on polymorphic variants which is similar to our proposal, but somewhat less general [11].

Composable record extension and its dual

To understand the underlying mechanism, it is instructive to first look at an example. In MLPolyR we write

    { a = 1, ... = r }

to create a new record which extends record r with a new field a. Since records are first-class values, we can abstract over the record being extended and obtain a function add_a that extends any argument record (as long as it does not already contain a) with a field a. Such a function can be thought of as the "difference" between its result and its argument:

    fun add_a r = { a = 1, ... = r }

Here the difference consists of a field labeled a of type int and value 1. The type of function add_a is inferred as:²

    val add_a: { β } → { a: int, β }

We can write similar functions add_b and add_c of types { β } → { b: bool, β } and { β } → { c: string, β } which add fields b and c respectively:

    fun add_b r = { b = true, ... = r }
    fun add_c r = { c = "hello", ... = r }

We can then "add up" record differences represented by add_a, add_b, add_c by composing these functions:

    fun add_ab r = add_a (add_b r)
    fun add_bc r = add_b (add_c r)

The inferred types are:

    val add_ab: { β } → { a: int, b: bool, β }
    val add_bc: { β } → { b: bool, c: string, β }

¹ In Standard ML [20], the corresponding syntactic form is known as match.
² We omit the universal quantifier and the kind of the row variable β.
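The idea of composing record "differences" can be imitated in any language with first-class functions. The following Python sketch is only an illustration and is not MLPolyR: records are modeled as dictionaries, and the helpers `extend` and `compose` are names assumed here for the example. Where MLPolyR rejects a duplicate label statically through the kind of the row variable, the sketch can only check for it at run time.

```python
def extend(label, value):
    """Build a record 'difference': a function that adds one field.

    Hypothetical helper for illustration; MLPolyR would express this
    as `fun add_a r = { a = 1, ... = r }`.
    """
    def diff(record):
        # Run-time stand-in for the static "row lacks label" constraint.
        if label in record:
            raise KeyError(f"label {label!r} already present")
        return {label: value, **record}
    return diff


def compose(f, g):
    """Compose two differences into a larger difference."""
    return lambda record: f(g(record))


add_a = extend("a", 1)
add_b = extend("b", True)
add_c = extend("c", "hello")

add_ab = compose(add_a, add_b)
add_bc = compose(add_b, add_c)

# Apply the differences to the empty record to obtain actual records.
ab = add_ab({})
bc = add_bc({})
```

Because each difference is an ordinary function, any subset of differences can be stacked in any order, as long as no label is added twice.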
Finally, we can create actual records by "adding" differences to the empty record:

    val a = add_a {}
    val ab = add_ab {}
    val bc = add_bc {}

When translated to the dual, extensibility of records becomes extensibility of code. Here is a function representing the difference between two code fragments, one of which can handle case ‘A while the other, represented by the argument c, cannot:

    fun add_A c = cases ‘A () => print "A"
                  default: c

Note that function add_A corresponds to add_a of the dual. The type inferred for add_A is:

    val add_A: (<β> ↪ ()) → (<‘A of (), β> ↪ ())

Here a type ⟨ρ⟩ ↪ τ denotes the type of first-class cases where ⟨ρ⟩ is the sum type that is being handled and τ is the result. One can think of ↪ as an alternative function arrow whose elimination form will be discussed below. Examples for functions add_B and add_C (corresponding to add_b and add_c in the dual) are:

    fun add_B c = cases ‘B () => print "B"
                  default: c
    fun add_C c = cases ‘C () => print "C"
                  default: c

As in the dual, we can now compose difference functions to obtain larger differences:

    fun add_AB c = add_A (add_B c)
    fun add_BC c = add_B (add_C c)

By applying a difference to the empty case nocases we obtain case values:

    val case_A = add_A nocases
    val case_AB = add_AB nocases
    val case_BC = add_BC nocases

These values can be used in a match form. The match construct is the elimination form for the case arrow ↪. The following expression will cause "B" to be printed:

    match ‘B () with case_BC

The previous examples demonstrate how functional record extension in the primal corresponds to code extension in the dual. This forms the basis for our proposed solution to the expression problem. In the non-recursive case, code extension is as straightforward as shown above. Section 2 discusses extensible CPS conversion as a realistic example where a recursive type is extended in an interesting way.

2. Case study: CPS conversion

Let us consider a recursive sum type whose values represent untyped λ-terms. We use a Scheme-like calculus [21] with multi-parameter functions.³ Variables ‘Var x are represented by small integers x, applications ‘App(e,~e) by an operator e and a list of operands⁴ ~e, and λ-abstractions ‘Lam(~x,e) by a list of bound variables ~x and a body e. We also include constants ‘Con c.

Terms whose outermost constructor is ‘Con, ‘Var or ‘Lam are called syntactic values. A commonly considered refinement of the term type restricts ‘App to only take such syntactic values. This is known as the CPS-restriction, since assuming a particular evaluation strategy (e.g., call-by-value, left-to-right) every unrestricted term can be converted into an "equivalent" CPS-term by making use of Continuation-Passing Style.

    fun kv2kb kv = λv.‘App(kv,[v])
    fun kb2kv kb = withfresh(λxr.‘Lam([xr],kb(‘Var xr)))
    fun cvt_app(e,~e,kv) = let
          fun lc([],kb) = kb([])
            | lc(e::~e,kb) = pc(e,~e,λ(v,~v).kb(v::~v))
          and pc(e,~e,kb) = cvt(e,λv.lc(~e,λ~v.kb(v,~v)))
        in pc(e,~e,λ(v,~v).‘App(v,kv::~v)) end
    and cvt_lam(~x,e) = withfresh (λxk.
          ‘Lam(xk::~x,cvt(e,kv2kb(‘Var xk))))
    and cvt(e,kb) = match e with
        cases ‘Con i ⇒ kb(‘Con i)
            | ‘Var x ⇒ kb(‘Var x)
            | ‘Lam(~x,e) ⇒ kb(cvt_lam(~x,e))
            | ‘App(e,~e) ⇒ cvt_app(e,~e,kb2kv(kb))
    fun convert e = cvt_lam([],e)

Figure 1. A simple CPS-converter. When converting ‘App, the continuation builder kb must be turned into a syntactic value kv that can be passed as an additional argument; this kv is a new ‘Lam with a single parameter xr and a body obtained by applying kb to xr (see function kb2kv). Conversely, to convert a ‘Lam, an extra formal argument xk representing the continuation is added. The corresponding kb simply constructs an ‘App of xk to whatever argument v is given to kb (see function kv2kb). For clarity, the code for ‘App and ‘Lam has been separated out into functions cvt_app and cvt_lam; cvt_app uses helper functions lc and pc to recursively convert the operator e and all operands ~e.

Following Appel [1], the conversion can be performed in essentially linear time using an approach that could be called "continuation-builder passing style" where the converter function cvt (see Figure 1) receives the expression e to be converted and kb, the continuation builder, which represents the context in which e appeared within the original expression. Once the converter comes up with a syntactic value v for the result of e, kb is invoked on this v in order to produce the CPS expression representing e's original context.

2.1 Variants, polymorphism, recursion, and refinement

As explained in the introduction, our language MLPolyR has polymorphic sum types in the style of OCaml.⁵ The type system is based on Rémy-style row polymorphism, handles equi-recursive types, and can infer principal types for all language constructs. For function convert in Figure 1, the compiler calculates the following type:

    val convert:
      ∀α.∀ξ : {‘App}, ζ : {‘Con,‘Lam,‘Var}.
      (ε as <‘App of (ε,[ε]), ‘Con of α,
             ‘Lam of ([int], ε), ‘Var of int>) →
      (ν as <‘Con of α,
             ‘Lam of ([int], <‘App of (ν,[ν]), ξ>),
             ‘Var of int, ζ>)

Here ε is a recursive sum type, indicated by keyword as and a type row enclosed in < ... >; ε can clearly be recognized as the type of lambda expressions.⁶ Similarly, type ν is the type of syntactic values. Function convert is polymorphic in α, the type of values carried by ‘Con. Finally, ξ and ζ are row type variables constrained to a particular kind. The kind is a set of labels that must be absent in any instantiation.

³ As we will see below, our CPS-converter requires this feature to accommodate continuation arguments.
⁴ We use the vector-arrow notation as in ~a to indicate variables or sub-terms that are lists.
⁵ In fact, even our syntax for constructors is inspired by OCaml's choice.
⁶ We use Haskell notation [t] for MLPolyR's built-in list type.

    fun cvt_app(cvt,e,~e,kv) = ... as before ...
    fun cvt_lam(cvt,~x,e) = ... as before ...
    fun cvt_c(cvt,kb) =
        cases ... same cases as before ...
    fun mkConvert (c,e) =
        let fun cvt(e,kb) = match e with c(cvt,kb)
        in cvt_lam(cvt,[],e) end
    fun convert e = mkConvert(cvt_c,e)

Figure 2. Preparing for extensibility. Explicitly open-code recursion and separate the cases from the scrutinee in the match-construct.

    fun Let(x,e1,e2) = ‘App(‘Lam([x],e2),[e1])
    fun cvti_c(cvt,kb) =
        cases ‘If(ec,et,ee) ⇒ withfresh(λxk.
                Let(xk,kb2kv kb,cvt(ec,λvc.
                    let val kb' = kv2kb(‘Var xk)
                    in ‘If(vc,cvt(et,kb'),cvt(ee,kb')) end)))
        default: cvt_c(cvt,kb)
    fun converti e = mkConvert(cvti_c,e)

Figure 3. Extending the CPS converter to handle ‘If.

Like in Standard ML, the type printer in our compiler does not ever print universal quantifiers for ordinary

type variables, and it suppresses them for row variables whenever the kind can be inferred from how the variable is used. In this case, since ξ appears in a row with ‘App and ζ appears in a row with ‘Con, ‘Lam, and ‘Var, kind information can be left implicit. In fact, even the identity of the variables is irrelevant since there is only one occurrence of each. Therefore, the actual type expression printed by the MLPolyR compiler is the following:

    val convert:
      (ε as <‘App of (ε,[ε]), ‘Con of α,
             ‘Lam of ([int], ε), ‘Var of int>) →
      (ν as <‘Con of α,
             ‘Lam of ([int], <‘App of (ν,[ν]), ...>),
             ‘Var of int, ...>)

A remarkable fact about this inferred type is its precision. Even though we had in mind only one restriction, namely that ‘App cannot appear directly inside another ‘App, the inference engine noticed that the converter enforces another invariant: all bodies of ‘Lam are instances of ‘App. Nevertheless, this extra precision is not harmful. The result type is polymorphic and can be instantiated to the one we may have had in mind:

    (ν as <‘Con of α,
           ‘Lam of ([int], (ε as <‘App of (ν,[ν]), ‘Con of α,
                                  ‘Lam of ([int], ε),
                                  ‘Var of int>)),
           ‘Var of int>)

In other words, although any occurrence of ‘Lam in the output will in fact have been applied to a value constructed with ‘App, one can send this output into a context that is prepared to also handle other cases for ‘Lam. In fact, the output type is flexible enough to be instantiated to the original unrestricted expression type. This means that we could, for example, compose the convert function with itself:

    fun convert_twice e = convert (convert e)

Doing so may be of limited use, but more importantly, existing utility routines such as pretty-printers, evaluators, and so forth that work on unrestricted expressions can also be composed with function convert.

Conversely, if the code for convert contained a bug that can cause its output to violate the intended invariant, then this fact would also be visible in the type. Composition with code that expects the invariant will then fail to type-check.

2.2 Preparing for extensibility

As we have seen, row polymorphism represents a form of type refinement for the output of function convert. The input type, however, is rigid. This means that we cannot apply convert to a value that potentially contains constructors other than ‘Con, ‘Var, ‘Lam, and ‘App. Clearly, the type system is doing the right thing here, since the code itself is in no way prepared to handle anything but those four cases.

To make the code extensible, we need a way of adding new cases, i.e., new language constructs that are handled. Since these new constructs should be allowed to appear anywhere within a given input expression, we also must open up the recursion.

As explained earlier, in MLPolyR, the cases of a match-expression handling a sum type ⟨ρ⟩ and returning a value of type τ are represented by first-class values of type ⟨ρ⟩ ↪ τ, and these values are extensible in the same sense in which records can be extended with new fields. Thus, to prepare the code for future extensions we separate the cases from the scrutinee and parameterize them by closing over their free variables. By letting one of these free variables be the recursive instance of function cvt itself we straightforwardly achieve open recursion. Finally, we put the mechanism that closes the recursion into its own reusable routine mkConvert and then use it to recover the original function convert by applying mkConvert to cvt_c (see Figure 2).

2.3 Extending the input language

With this preparation in place, it is now very easy to extend the converter to handle new language constructs. For example, the code in Figure 3 introduces a conditional ‘If which can appear in the input, and, if it does, will also appear in the output. The key construct that makes this work is the cases form with a default: clause. Here, a single new case (‘If) is handled, and the default explicitly refers to the original set of four cases represented by cvt_c. A new converter converti, now handling five cases including ‘If, is obtained by closing the recursion using the same function mkConvert as before.

Another example for an extension is the addition of ‘LetCC to the input language. ‘LetCC is a binding construct which introduces a variable that, within its scope, refers to an "escape procedure" representing the current continuation.⁷ In CPS-converted code, the current continuation is always directly available as a value, meaning that ‘LetCC can be supported without need for a new language construct in the output language. As a result, the extension shown in Figure 4 extends the input language only.

    fun cvt_lcc(cvt,xc,e,xk) =
        withfresh(λxd.withfresh(λxr.
            Let(xc,‘Lam([xd,xr],‘App(‘Var xk,[‘Var xr])),
                cvt(e,kv2kb(‘Var xk)))))
    fun cvtc_c(cvt,kb) =
        cases ‘LetCC(xc,e) ⇒
                withfresh(λxk.Let(xk,kb2kv(kb),
                    cvt_lcc(cvt,xc,e,xk)))
        default: cvt_c(cvt,kb)
    fun convertc e = mkConvert(cvtc_c,e)

Figure 4. Extending the CPS converter to handle ‘LetCC.

⁷ ‘LetCC(x,e) is the same as Scheme's (call/cc (lambda (x) e)) [21].

2.4 Linearly composable extensions

One major weakness of the two extensions (‘If and ‘LetCC) shown so far is that they are not orthogonal since each of them explicitly extends the original converter rather than another, potentially already extended version. But with the mechanisms shown, this deficiency can be overcome quite easily by parameterizing the extension over what is being extended. The resulting pattern is shown in Figure 5. Notice how the two extensions have become differences in the sense explained in the introduction, so they are now composable. We can think of them as layers of functionality that can be "stacked."

    fun cvti_c other_c (cvt,kb) =
        cases ‘If(ec,et,ee) ⇒ ... as before ...
        default: other_c(cvt,kb)
    fun cvtc_c other_c (cvt,kb) =
        cases ‘LetCC(xc,e) ⇒ ... as before ...
        default: other_c(cvt,kb)
    fun converti e = mkConvert(cvti_c cvt_c,e)
    fun convertc e = mkConvert(cvtc_c cvt_c,e)
    fun convertci e = mkConvert(cvtc_c(cvti_c cvt_c),e)

Figure 5. Extensions parameterized by what is being extended.

One can take the idea of extension composition to the extreme by using a programming style (or "pattern") where every case is written individually as a single layer in the above sense. Given k such layers, one can easily generate a converter for any of the 2^k possible input languages simply by stacking the corresponding subset of layers. To support this idea, MLPolyR provides the syntactic form nocases of type ⟨⟩ ↪ α for any α. This form represents "no functionality" and can be used as the "base" upon which to stack.

3. The PolyR Language

This section describes an idealization of MLPolyR, called PolyR, with polymorphic, extensible sums, records, and first-class cases. To compile PolyR, we first translate it into a version of System F, called FR (Section 3.2). FR has support for records but not for sums. For brevity, we give the static semantics for PolyR and the translation to FR together (Section 3.3). Section 3.5 describes the translation from FR into an untyped λ-calculus.

3.1 Abstract Syntax

Figure 6 shows the abstract syntax for the language PolyR. The meta-variable x and its variants range over variables. Meta-variable l and variants range over an unspecified set of labels. Meta-variables α and β (and variants) range over type and row-type variables respectively. Variables, labels, type variables, and row-type variables are mutually disjoint sets.

    τ ::= α | int | τ1 → τ2 | {ρ} | ⟨ρ⟩ | α as ⟨ρ⟩ | ⟨ρ⟩ ↪ τ
    ρ ::= β | · | l : τ, ρ
    κ ::= {l1, ..., lk}
    σ ::= ∀(α1, ..., αm). ∀(β1 : κ1, ..., βn : κn).τ
    v ::= n | fun f x = e | l v | { li = vi }_{i=1..k} | { li xi ⇒ ei }_{i=1..k}
    e ::= n | fun f x = e | x | l e | e1 e2 | let x = e1 in e2 |
          { li = ei }_{i=1..k} | e1 ⊗ {l = e2} | e ⊘ l |
          { li xi ⇒ ei }_{i=1..k} | e1 ⊕ { l x ⇒ e2 } | e ⊖ l | e.l |
          match e1 with e2

Figure 6. The abstract syntax of PolyR.

The types of the language are separated into (ordinary) types (denoted by τ and variants), row-types (denoted by ρ and variants), and type schemas (denoted by σ and variants). The types consist of type variables α, the base type int, function types, record types ({ρ}), sum types (⟨ρ⟩), recursive sum types (α as ⟨ρ⟩), and case types (⟨ρ⟩ ↪ τ). We denote the set of free type variables of a type τ by FTV(τ) and that of a typing context Γ by FTV(Γ). Row-types consist of row(-type) variables β and possibly empty sequences of label and type matchings. The set of free row type variables of a type τ is denoted by FRV(τ). For that of a typing context Γ we write FRV(Γ). The recursive sum α as ⟨ρ⟩ specifies a sum type where the type variable α can recursively occur in the definition of ρ.

Type schemas rely on kinds, denoted by κ and variants, defined as sets of labels. Kinds are associated with row variables and specify the labels that a row variable must not contain. Type schemas, denoted by σ (and variants), are defined as

    σ ::= ∀(α1, ..., αm). ∀(β1::κ1, ..., βn::κn).τ

The quantifiers bind occurrences of type variables {α1, ..., αm} and of row variables {β1, ..., βn} that are free in τ. The kinds of the row variables are given by {κ1, ..., κn}.

Expressions consist of values, functions, variables, data constructors (l e), applications, let bindings, record expressions, case expressions, and cases. Record expressions consist of record constructors {l1 = e1, ..., lk = ek} (which we will often abbreviate as { li = ei }_{i=1..k}), record extensions e1 ⊗ {l = e2}, record subtractions e ⊘ l, and record selections e.l. Case expressions are symmetric to records and consist of case constructors {l1 x1 ⇒ e1, ..., lk xk ⇒ ek} (abbreviated as { li xi ⇒ ei }_{i=1..k}), case extensions e1 ⊕ { l x ⇒ e2 }, and case subtractions e ⊖ l. A match expression match e1 with e2 matches e1 against the expression e2, whose value must be a case. Values consist of numbers, named functions, records where each field is a value, and cases.

3.2 System F

PolyR expressions can be translated into expressions of a variant of System F with records and named functions. We call this language FR. Figure 7 shows the syntax of the FR language. (For the rest of the paper we will use the terms "System F" and FR interchangeably.) The language can be derived from PolyR by excluding sum types, case types, and operations on sum types and cases, and by adding type abstraction and type application. To distinguish PolyR types and expressions from FR types and expressions, the FR meta-variables for expressions and types are written with a bar, e.g., ē, v̄, τ̄.

    τ̄ ::= α | int | τ̄1 → τ̄2 | {ρ̄} | α as τ̄ |
          ∀(α1, ..., αm). ∀(β1::κ1, ..., βn::κn).τ̄
    ρ̄ ::= β | β ⇝ τ̄ | · | l : τ̄, ρ̄
    κ ::= {l1, ..., lk}
    v̄ ::= n | fun f x : τ̄ = ē | { li = v̄i }_{i=1..k} |
          Λ(α1, ..., αm).Λ(β1::κ1, ..., βn::κn).ē
    ē ::= x | n | fun f x : τ̄ = ē | let x : τ̄ = ē1 in ē2 |
          Λ(α1, ..., αm).Λ(β1::κ1, ..., βn::κn).ē |
          ē1 ē2 | ē[τ̄1, ..., τ̄m][ρ̄1, ..., ρ̄n] |
          { li = ēi }_{i=1..k} | ē.l | ē1 ⊗ {l = ē2} | ē ⊘ l

Figure 7. The abstract syntax of FR.

The types of FR consist of type variables, the int type, function types, record types, recursive types, and polymorphic types. Record types are defined in terms of row types denoted by ρ̄ (and variants) that consist of sequences of labeled types that can either end with an empty row ·, a row variable β, or a row arrow β ⇝ τ̄. The key difference between the row types of the PolyR language and the FR language is the inclusion of the row arrow β ⇝ τ̄. Row arrows are critical to represent sums and cases in terms of records.

The expressions of the language consist of variables, numbers, type abstractions (Λ(α1, ..., αm).Λ(β1::κ1, ..., βn::κn).ē), type applications (ē[τ̄1, ..., τ̄m][ρ̄1, ..., ρ̄n]), functions, applications, let expressions, and record expressions. The values consist of numbers, functions, type abstractions, and records where each field is a value.

Throughout the paper, we omit empty bindings for type- and row variables in type abstractions and empty type- and row type arguments in type applications. For example, we may write ∀β : κ.τ̄ or ē[ρ̄] when no type variables are quantified.

3.3 Static Semantics and Translation to System F

We present the static semantics of PolyR and simultaneously show the translation of PolyR to System F.

Figure 8 shows the translation for row arrows and the translation of types of the PolyR language to those of FR. Row-types are translated either directly, written as ρ ▷ ρ̄, or in the context of a type τ̄, written as ρ; τ̄ ▷ ρ̄. The translation ρ ▷ ρ̄ translates ρ pointwise by translating each field. The judgment ρ; τ̄2 ▷ ρ̄ relates each field l : τ of ρ to a field l : τ̄ → τ̄2 of ρ̄ where τ̄ is obtained by translating τ. If ρ is a row-type variable β, then the result is β ⇝ τ̄2.

\[
\frac{}{\beta \rhd \beta}
\qquad
\frac{}{\cdot \rhd \cdot}
\qquad
\frac{\tau \rhd \bar\tau \quad \rho \rhd \bar\rho}{l : \tau, \rho \;\rhd\; l : \bar\tau, \bar\rho}
\]
\[
\frac{}{\beta; \bar\tau \rhd \beta \rightsquigarrow \bar\tau}
\qquad
\frac{}{\cdot\,; \bar\tau \rhd \cdot}
\qquad
\frac{\tau_1 \rhd \bar\tau_1 \quad \rho; \bar\tau_2 \rhd \bar\rho}{(l : \tau_1, \rho); \bar\tau_2 \;\rhd\; l : \bar\tau_1 \to \bar\tau_2, \bar\rho}
\]
\[
\frac{}{\alpha \rhd \alpha}
\qquad
\frac{}{\mathsf{int} \rhd \mathsf{int}}
\qquad
\frac{\tau_1 \rhd \bar\tau_1 \quad \tau_2 \rhd \bar\tau_2}{\tau_1 \to \tau_2 \;\rhd\; \bar\tau_1 \to \bar\tau_2}
\]
\[
\frac{\tau \approx \tau' \quad \tau' \rhd \bar\tau'}{\tau \rhd \bar\tau'}
\qquad
\frac{\rho \rhd \bar\rho}{\{\rho\} \rhd \{\bar\rho\}}
\qquad
\frac{\rho; \alpha \rhd \bar\rho}{\langle\rho\rangle \;\rhd\; \forall\alpha.\,\{\bar\rho\} \to \alpha}
\]
\[
\frac{\langle\rho\rangle \rhd \bar\tau}{\alpha\ \mathsf{as}\ \langle\rho\rangle \;\rhd\; \alpha\ \mathsf{as}\ \bar\tau}
\qquad
\frac{\tau \rhd \bar\tau \quad \rho; \bar\tau \rhd \bar\rho}{\langle\rho\rangle \hookrightarrow \tau \;\rhd\; \{\bar\rho\}}
\]

Figure 8. The translation of rows (top) and types (bottom) of PolyR to FR.

The types of PolyR are translated into System F by translating sums and cases into records (Figure 8). This makes the rules that handle sums and cases particularly interesting. Sum types are translated into record types where each field is a function from a member of the sum type to a universally quantified type variable α. More precisely, the sum type ⟨ρ⟩ is translated by first translating the row type ρ into ρ̄ under a type variable α and then generalizing the function type {ρ̄} → α. For example, the sum type ⟨l1 : int, l2 : int → int⟩ is translated into the type ∀α. {l1 : int → α, l2 : (int → int) → α} → α. As expressed by the ≈ relation, we ignore the order of labels in rows (Appendix B).

Typing rules for the PolyR language (Figure 10) are non-deterministic. Care must be taken to not introduce ill-formed types when "guessing" the types of functions, i.e., when creating an instance of a polymorphic type, and when constructing bigger row-types from existing row-types. Figure 9 defines the notion of well-formed types and row-types. The definitions rely on a lacks relation between rows and sets of labels. We say that a row ρ lacks a set of labels κ under the kinding context ∆, denoted ∆ ⊢ ρ \ κ, if ρ does not contain any of the labels from κ; if ρ contains a row-type variable β, then the kind of β (recorded in ∆) must be a superset of κ. We say that a row-type ρ is well formed under ∆, denoted ∆ ⊢ ρ ok, if ρ consists of distinct labels and lacks the labels specified by the kinding environment. We say that a type τ is well-formed under some kinding context ∆, denoted ∆ ⊢ τ ok, if all row-(sub)types of τ are well formed under ∆.

\[
\frac{}{\Delta \vdash \cdot \setminus \kappa}
\qquad
\frac{\kappa \subseteq \Delta(\beta)}{\Delta \vdash \beta \setminus \kappa}
\qquad
\frac{\Delta \vdash \rho \setminus \kappa \quad l \notin \kappa}{\Delta \vdash (l : \tau, \rho) \setminus \kappa}
\]
\[
\frac{}{\Delta \vdash \alpha\ \mathsf{ok}}
\qquad
\frac{}{\Delta \vdash \mathsf{int}\ \mathsf{ok}}
\qquad
\frac{\Delta \vdash \tau_1\ \mathsf{ok} \quad \Delta \vdash \tau_2\ \mathsf{ok}}{\Delta \vdash \tau_1 \to \tau_2\ \mathsf{ok}}
\]
\[
\frac{\Delta \vdash \rho\ \mathsf{ok}}{\Delta \vdash \{\rho\}\ \mathsf{ok}}
\qquad
\frac{\Delta \vdash \rho\ \mathsf{ok}}{\Delta \vdash \langle\rho\rangle\ \mathsf{ok}}
\qquad
\frac{\Delta \vdash \rho\ \mathsf{ok}}{\Delta \vdash \alpha\ \mathsf{as}\ \langle\rho\rangle\ \mathsf{ok}}
\qquad
\frac{\Delta \vdash \rho\ \mathsf{ok} \quad \Delta \vdash \tau\ \mathsf{ok}}{\Delta \vdash \langle\rho\rangle \hookrightarrow \tau\ \mathsf{ok}}
\]
\[
\frac{}{\Delta \vdash \beta\ \mathsf{ok}}
\qquad
\frac{}{\Delta \vdash \cdot\ \mathsf{ok}}
\qquad
\frac{\Delta \vdash \tau\ \mathsf{ok} \quad \Delta \vdash \rho \setminus \{l\} \quad \Delta \vdash \rho\ \mathsf{ok}}{\Delta \vdash l : \tau, \rho\ \mathsf{ok}}
\]

Figure 9. The lacks relations, and well-formed types and row-types (from top to bottom in that order).

Figure 10 shows the typing rules for PolyR and their translation to System F. The judgments take place under a kinding context ∆ and the typing context Γ. The kinding context maps row variables to kinds; the kind of a row variable is the set of labels that the variable is known not to contain. The typing context maps (ordinary) variables to type schemas. The judgments take the form ∆; Γ ⊢ e : τ ▷ ē : τ̄ and state that, under the kinding context ∆ and the typing context Γ, the PolyR expression e has type τ and translates to the FR expression ē with type τ̄. The following lemma states that the translation preserves the types of terms with respect to the translation. The proof of this lemma is omitted here.

Lemma 1
If ∆; Γ ⊢ e : τ ▷ ē : τ̄, then τ ▷ τ̄.

\[
\frac{\begin{array}{c}
\Gamma(x) = \forall\alpha_1 \ldots \alpha_m.\forall\beta_1 :: \kappa_1 \ldots \beta_n :: \kappa_n.\tau' \qquad \forall i.\ \Delta \vdash \tau_i\ \mathsf{ok} \qquad \forall j.\ (\Delta \vdash \rho_j\ \mathsf{ok}) \wedge (\Delta \vdash \rho_j \setminus \kappa_j) \\
\tau = \tau'[\tau_i/\alpha_i, \rho_j/\beta_j]_{i=1\ldots m,\ j=1\ldots n} \qquad \tau_i \rhd \bar\tau_i \qquad \rho_j \rhd \bar\rho_j \qquad \tau \rhd \bar\tau
\end{array}}
{\Delta; \Gamma \vdash x : \tau \;\rhd\; x[\bar\tau_1, \ldots, \bar\tau_m][\bar\rho_1, \ldots, \bar\rho_n] : \bar\tau}\ (\text{var})
\]
\[
\frac{\Delta; \Gamma \vdash e : \tau \rhd \bar e : \bar\tau \quad \tau \approx \tau'}{\Delta; \Gamma \vdash e : \tau' \rhd \bar e : \bar\tau}\ (\text{reorder})
\qquad
\frac{}{\Delta; \Gamma \vdash n : \mathsf{int} \rhd n : \mathsf{int}}\ (\text{int})
\]
\[
\frac{\Delta; \Gamma, f : \tau_2 \to \tau, x : \tau_2 \vdash e : \tau \rhd \bar e : \bar\tau \quad \Delta \vdash \tau_2\ \mathsf{ok} \quad \tau_2 \rhd \bar\tau_2}
{\Delta; \Gamma \vdash \mathsf{fun}\ f\ x = e : \tau_2 \to \tau \;\rhd\; \mathsf{fun}\ f\ x = \bar e : \bar\tau_2 \to \bar\tau}\ (\text{fun})
\]
\[
\frac{\Delta; \Gamma \vdash e : \tau \rhd \bar e : \bar\tau \quad \Delta \vdash (l : \tau, \rho)\ \mathsf{ok} \quad (l : \tau, \rho); \alpha \rhd \bar\rho}
{\Delta; \Gamma \vdash l\ e : \langle l : \tau, \rho\rangle \;\rhd\; (\mathsf{let}\ x_v : \bar\tau = \bar e\ \mathsf{in}\ \Lambda\alpha.\mathsf{fun}\ x_r = x_r.l\ x_v) : \forall\alpha.\,\{\bar\rho\} \to \alpha}\ (\text{data const.})
\]
\[
\frac{\Delta; \Gamma \vdash e_1 : \tau_2 \to \tau \rhd \bar e_1 : \bar\tau_2 \to \bar\tau \quad \Delta; \Gamma \vdash e_2 : \tau_2 \rhd \bar e_2 : \bar\tau_2}
{\Delta; \Gamma \vdash e_1\ e_2 : \tau \rhd \bar e_1\ \bar e_2 : \bar\tau}\ (\text{app})
\]
\[
\frac{\begin{array}{c}
\vec\alpha = \alpha_1, \ldots, \alpha_m = \mathrm{FTV}(\tau_1) \setminus \mathrm{FTV}(\Gamma) \qquad \beta_1, \ldots, \beta_n = \mathrm{FRV}(\tau_1) \setminus \mathrm{FRV}(\Gamma) \qquad \overrightarrow{\beta :: \kappa} = \beta_1 :: \kappa_1, \ldots, \beta_n :: \kappa_n \\
\Delta, \overrightarrow{\beta :: \kappa}; \Gamma \vdash e_1 : \tau_1 \rhd \bar e_1 : \bar\tau_1 \qquad \Delta; \Gamma, x : \forall\vec\alpha.\forall\overrightarrow{\beta :: \kappa}.\tau_1 \vdash e_2 : \tau_2 \rhd \bar e_2 : \bar\tau_2 \qquad e_1 \text{ is a syntactic value}
\end{array}}
{\Delta; \Gamma \vdash \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : \tau_2 \;\rhd\; \mathsf{let}\ x : \forall\vec\alpha.\forall\overrightarrow{\beta :: \kappa}.\bar\tau_1 = \Lambda\vec\alpha.\Lambda\overrightarrow{\beta :: \kappa}.\bar e_1\ \mathsf{in}\ \bar e_2 : \bar\tau_2}\ (\text{let/val})
\]
\[
\frac{\Delta; \Gamma \vdash e_1 : \tau_1 \rhd \bar e_1 : \bar\tau_1 \quad \Delta; \Gamma, x : \tau_1 \vdash e_2 : \tau_2 \rhd \bar e_2 : \bar\tau_2 \quad e_1 \text{ is not a syntactic value}}
{\Delta; \Gamma \vdash \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : \tau_2 \;\rhd\; \mathsf{let}\ x : \bar\tau_1 = \bar e_1\ \mathsf{in}\ \bar e_2 : \bar\tau_2}\ (\text{let/non-val})
\]
\[
\frac{\Delta; \Gamma \vdash e : \langle\rho[\alpha\ \mathsf{as}\ \langle\rho\rangle/\alpha]\rangle \rhd \bar e : \bar\tau[\alpha\ \mathsf{as}\ \bar\tau/\alpha]}
{\Delta; \Gamma \vdash e : \alpha\ \mathsf{as}\ \langle\rho\rangle \rhd \bar e : \alpha\ \mathsf{as}\ \bar\tau}\ (\text{roll})
\qquad
\frac{\Delta; \Gamma \vdash e : \alpha\ \mathsf{as}\ \langle\rho\rangle \rhd \bar e : \alpha\ \mathsf{as}\ \bar\tau}
{\Delta; \Gamma \vdash e : \langle\rho[\alpha\ \mathsf{as}\ \langle\rho\rangle/\alpha]\rangle \rhd \bar e : \bar\tau[\alpha\ \mathsf{as}\ \bar\tau/\alpha]}\ (\text{unroll})
\]
\[
\frac{\forall i.\ \Delta; \Gamma \vdash e_i : \tau_i \rhd \bar e_i : \bar\tau_i \quad \Delta \vdash l_1, \ldots, l_k\ \mathsf{ok}}
{\Delta; \Gamma \vdash \{ l_i = e_i \}_{i=1}^{k} : \{ l_i : \tau_i \}_{i=1}^{k} \;\rhd\; \{ l_i = \bar e_i \}_{i=1}^{k} : \{ l_i : \bar\tau_i \}_{i=1}^{k}}\ (\text{r})
\]
\[
\frac{\forall i.\ (\Delta; \Gamma \vdash \tau_i\ \mathsf{ok}) \wedge (\Delta; \Gamma, x_i : \tau_i \vdash e_i : \tau \rhd \bar e_i : \bar\tau) \quad \Delta \vdash l_1, \ldots, l_k\ \mathsf{ok} \quad \forall i.\ (\tau_i \rhd \bar\tau_i)}
{\Delta; \Gamma \vdash \{ l_i\ x_i \Rightarrow e_i \}_{i=1}^{k} : \langle l_i : \tau_i \rangle_{i=1}^{k} \hookrightarrow \tau \;\rhd\; \{ l_i = \mathsf{fun}\ x_i : \bar\tau_i = \bar e_i \}_{i=1}^{k} : \{ l_i : \bar\tau_i \to \bar\tau \}_{i=1}^{k}}\ (\text{c})
\]
\[
\frac{\Delta; \Gamma \vdash e_1 : \{\rho\} \rhd \bar e_1 : \{\bar\rho\} \quad \Delta \vdash \rho \setminus \{l\} \quad \Delta; \Gamma \vdash e_2 : \tau_2 \rhd \bar e_2 : \bar\tau_2}
{\Delta; \Gamma \vdash e_1 \otimes \{l = e_2\} : \{l : \tau_2, \rho\} \;\rhd\; \bar e_1 \otimes \{l = \bar e_2\} : \{l : \bar\tau_2, \bar\rho\}}\ (\text{r/ext})
\]
\[
\frac{\Delta; \Gamma \vdash e_1 : \langle\rho\rangle \hookrightarrow \tau \rhd \bar e_1 : \{\bar\rho\} \quad \Delta \vdash \rho \setminus \{l\} \quad \Delta \vdash \tau_1\ \mathsf{ok} \quad \tau_1 \rhd \bar\tau_1 \quad \Delta; \Gamma, x : \tau_1 \vdash e_2 : \tau \rhd \bar e_2 : \bar\tau}
{\Delta; \Gamma \vdash e_1 \oplus \{ l\ x \Rightarrow e_2 \} : \langle l : \tau_1, \rho\rangle \hookrightarrow \tau \;\rhd\; \bar e_1 \otimes \{l = \mathsf{fun}\ x : \bar\tau_1 = \bar e_2\} : \{l : \bar\tau_1 \to \bar\tau, \bar\rho\}}\ (\text{c/ext})
\]
\[
\frac{\Delta; \Gamma \vdash e : \{l : \tau, \rho\} \rhd \bar e : \{l : \bar\tau, \bar\rho\}}
{\Delta; \Gamma \vdash e \oslash l : \{\rho\} \rhd \bar e \oslash l : \{\bar\rho\}}\ (\text{r/sub})
\qquad
\frac{\Delta; \Gamma \vdash e : \langle l : \tau_1, \rho\rangle \hookrightarrow \tau \rhd \bar e : \{l : \bar\tau_1 \to \bar\tau, \bar\rho\}}
{\Delta; \Gamma \vdash e \ominus l : \langle\rho\rangle \hookrightarrow \tau \rhd \bar e \oslash l : \{\bar\rho\}}\ (\text{c/sub})
\]
\[
\frac{\Delta; \Gamma \vdash e : \{l : \tau, \rho\} \rhd \bar e : \{l : \bar\tau, \bar\rho\}}
{\Delta; \Gamma \vdash e.l : \tau \rhd \bar e.l : \bar\tau}\ (\text{select})
\qquad
\frac{\Delta; \Gamma \vdash e_1 : \langle\rho\rangle \rhd \bar e_1 : \forall\alpha.(\{\bar\rho_\alpha\} \to \alpha) \quad \Delta; \Gamma \vdash e_2 : \langle\rho\rangle \hookrightarrow \tau \rhd \bar e_2 : \{\bar\rho_\tau\}}
{\Delta; \Gamma \vdash \mathsf{match}\ e_1\ \mathsf{with}\ e_2 : \tau \;\rhd\; \bar e_1[\bar\tau]\ \bar e_2 : \bar\tau}\ (\text{match})
\]

Figure 10. The static semantics and translation for basic terms (top), and records and cases.

The most interesting judgments are those that introduce and eliminate polymorphism (the let/val and the var judgments), and those that operate on sums, records, and cases.

The PolyR language supports ML-style polymorphism (let polymorphism). How a let expression is type-checked depends on whether the expression whose value is being bound is a syntactic value or not. If the expression is of the form let x = v1 in e2, then the type of the value τ̄1 is generalized over all free type variables and row-type variables; the generalization requires constructing a kind for each row-type variable (let/val judgment). If the expression is of the form let x = e1 in e2, where e1 is not a value, then the type of e1 is not generalized (let/non-val judgment). There are two motivations behind differentiating between syntactic values and non-values: 1) it ensures that the transformation to System F preserves non-termination semantics of the program, and 2) it makes it easier to extend the language with side effects (e.g., references).

When used, a variable with a polymorphic type is instantiated to a non-polymorphic type by selecting types and row types for its polymorphic variables (var rule). An instantiation is translated into System F as a type application.

The bottom box in Figure 10 shows the typing rules for records (left) and cases (right). The judgments are arranged to bring out the symmetry between these rules.

A record constructor is assigned the record type that maps the labels to the types of the corresponding fields as long as the labels are distinct. The type of a record extension e1 ⊗ {l = e2} is a record type that extends the type {ρ} of the record e1 with the label l under the condition that l is not included in the record type. The type of a record subtraction ē ⊘ l is a record where label l is excluded under the condition that ē contains l. The type of a record selection is the type of the field l being selected, under the condition that the record expression contains the field with label l. Since the FR language includes the record expressions included in PolyR, all record expressions are translated into the FR language directly.

A case constructor is assigned a case type that identifies the result type τ of the bodies (ei's) and maps each label li to its domain type τi. A case is translated into a record of functions, one for each label li, whose argument type is equal to the domain type τ̄i of li and whose body is the body of the case ei. A case extension extends the type of a case with a new branch. A case extension is translated to a record extension.
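The dual encoding described above (cases become records of functions, data constructors become functions that select and apply a handler, and match becomes application) can be imitated in an untyped setting. The following Python sketch is an illustration under stated assumptions, not the paper's FR translation: dictionaries stand in for records, and the names `inject`, `extend_case`, and `match` are hypothetical helpers. The static guarantees of PolyR (no missing and no duplicate cases) are replaced here by run-time `KeyError`s.

```python
def inject(label, payload):
    """Data constructor `l e`: a sum value is a function that selects the
    handler for its own label from a record of handlers and applies it."""
    return lambda handlers: handlers[label](payload)


# The empty case: a record with no handlers.
nocases = {}


def extend_case(cases, label, handler):
    """Case extension, modeled as record extension."""
    if label in cases:
        raise KeyError(f"case {label!r} already handled")
    return {**cases, label: handler}


def match(value, cases):
    """`match e1 with e2`: apply the translated sum value to the record."""
    return value(cases)


# Difference functions in the style of add_A, add_B, add_C.
def add_A(c):
    return extend_case(c, "A", lambda _: "A")


def add_B(c):
    return extend_case(c, "B", lambda _: "B")


def add_C(c):
    return extend_case(c, "C", lambda _: "C")


case_BC = add_B(add_C(nocases))
result = match(inject("B", None), case_BC)
```

Matching a value tagged "A" against `case_BC` fails at run time in this sketch; in PolyR the corresponding program would be rejected by the type checker instead.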
A case subtraction record type that extends the type {ρ} of the record e1 with the takes out the specified branch from a case type and translates it t ::= n | x | t1 + t2 | t1 − t2 | len(t) | (val) n v ⇓ v fun f x = t | t1 t2 | h si ii=1 | t.t | let x = t in t 1 2 t1 ⇓ n1 t2 ⇓ n2 t1 ⇓ n1 t2 ⇓ n2 s ::= t | (t, t, t) (plus) (minus) t1 + t2 ⇓ n1 + n2 t1 − t2 ⇓ n1 − n2 v ::= n | fun x t1 = t2 | 0 t1 ⇓ fun f x = t1 t2 ⇓ v2 0 0 Figure 11. The abstract syntax for LRec. t1[fun f x = t1/f1, v2/x] ⇓ v (app) t1 t2 ⇓ v

t1 ⇓ v1 t2[v1/x] ⇓ v t ⇓ hv0, . . . , vn−1i into a record subtraction. A match expression match e1with e2 (let) (length) let x = t in t ⇓ v len(t) ⇓ n is well typed if the domain type of e2 is the same as the type e1. 1 2 Since data constructors are transformed into functions that select t1 ⇓ hv0, . . . , vi, . . . , vn−1i t2 ⇓ i 0 ≤ i < n (select) the appropriate function from their argument and apply their value t .t ⇓ v to that function, a match expression is compiled into a function 1 2 i application. Since translated sum expressions have polymorphic s1 ⇓s v1,0, . . . , v1,k1−1 . . . sn ⇓s vn,0, . . . , vn,kn−1 () type, this requires instantiating the function type first. We note that hs1, . . . , sni ⇓ hv1,0, . . . , v1,k1−1, . . . , vn,0, . . . , vn,kn−1i the symmetry to a selection is indirect (through the translation of t ⇓ v data constructors). (slice/singleton) t ⇓s v

3.4 Dynamic Semantics t1 ⇓ hv0, . . . , vi, . . . , vj , . . . , vn−1i t2 ⇓ i t3 ⇓ j 0 ≤ i ≤ j ≤ n The dynamic semantics of PolyR is mostly standard. The full (slice/sequence) semantics is given in Appendix A. Although the PolyR language (t1, t2, t3) ⇓s vi, . . . , vj−1 is purely functional, the dynamic semantics is written with the same implicit threading of state in mind that is also used by the Definition of Standard ML [20]. This removes all non-determinism Figure 12. The dynamic semantics for LRec. by enforcing an evaluation order. The primary motivation for this is to enable a precise specification of the transformation of PolyR into an untyped λ-calculus (Section 3.5) without altering the execution sisting of label and term pairs. More precisely, for a row variable order. A secondary motivation is to ensure that the transformation β, ∆(β) = {(l1, t1),..., (lk, tk)}, where ti is the term that will would be consistent with imperative features, if the languages are aid in computing the index for li in a record. We write ∆(β)(l) for extended with them. the index (term) of l for β, i.e., if (l, t) ∈ ∆(β), then ∆(β)(l) = t. Given ∆, the kind of a row variable β, denoted κ(∆, β) can be 3.5 Translation to Untyped λ-Calculus recovered by projecting out the labels. More precisely κ(∆, β) is defined as κ(∆, β) = {l | (l, t) ∈ ∆(β)}. We describe the translation of System F expressions (Section 3.2) The translation of numbers, variables, functions, applications, into an untyped language, called LRec. The LRec language extends and let expressions are straightforward. A record is translated into the untyped λ-calculus with (n-ary) and named functions; a tuple of slices, each of which is obtained by translating the Figure 11 shows the abstract syntax for LRec. The terms of the label expressions. The slices are sorted based on the corresponding language, denoted by t (and variants), consist of numbers n, vari- labels. 
Since sorting can re-arrange the ordering of the fields, the ables x, the operations plus and minus, len(t) for determining the transformation first evaluates the fields in their original order by number of fields in a tuple t, named functions, function applica- binding them to variables and then constructs the tuple using those tion, and introduction and eliminations forms for tuples. The in- n variables. troduction form for tuples, h s i , specifies a sequence of slices i i=1 A record selection is translated by computing the index for the from which the tuple is being constructed. The elimination form for label being projected based on the type of the record. To compute tuples is selection (projection), written t1.t2, that projects out the indices for record labels, the translation relies on two operations. field with index t2 from the tuple t1. The terms include a let expres- Given a set of labels κ and a label l, define the position of l in κ, sion (as syntactic sugar for application). A slice, denoted by s (and denoted pos(l, κ), as the number of labels of l that are less than l in 0 0 variants), is either a term, or a triple of terms (t1, t2, t3), where t1 the total order defined on labels. Formally, pos(l, κ) = |{l | l ∈ 0 yields a record while t2 and t3 must evaluate to numbers. A slice κ ∧ l
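The tuple representation described above can be illustrated with a small Python sketch. It is not the paper's translation: the names `Single`, `Range`, `build_tuple`, `make_record`, `select`, and `extend` are hypothetical, labels are assumed to be strings ordered by Python's built-in string comparison, and using `pos` as the insertion index for record extension is an assumption made for the illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple, Union


@dataclass(frozen=True)
class Single:
    """Slice form `t`: contributes exactly one value."""
    value: object


@dataclass(frozen=True)
class Range:
    """Slice form `(t1, t2, t3)`: contributes source[start], ..., source[stop - 1]."""
    source: tuple
    start: int
    stop: int


Slice = Union[Single, Range]


def eval_slice(s: Slice) -> tuple:
    """Evaluate one slice to the sequence of values it contributes."""
    if isinstance(s, Single):
        return (s.value,)
    # Side condition of the (slice/sequence) rule: 0 <= i <= j <= n.
    if not 0 <= s.start <= s.stop <= len(s.source):
        raise IndexError("slice bounds out of range")
    return s.source[s.start:s.stop]


def build_tuple(slices: Iterable[Slice]) -> tuple:
    """Tuple introduction: concatenate the values of all slices, in order."""
    result: tuple = ()
    for s in slices:
        result += eval_slice(s)
    return result


def pos(label: str, kappa: Iterable[str]) -> int:
    """pos(l, kappa): the number of labels in kappa strictly less than l."""
    return sum(1 for other in kappa if other < label)


def make_record(fields: List[Tuple[str, object]]) -> tuple:
    """Field values arrive already evaluated, in source order; they are stored sorted by label."""
    return build_tuple(Single(v) for _, v in sorted(fields, key=lambda f: f[0]))


def select(record: tuple, labels: Iterable[str], label: str) -> object:
    """Record selection: the index of a field is its label's position among the record's labels."""
    return record[pos(label, labels)]


def extend(record: tuple, labels: Iterable[str], label: str, value: object) -> tuple:
    """Record extension (assumed scheme): splice the new value in at the new label's position."""
    i = pos(label, labels)
    return build_tuple([Range(record, 0, i), Single(value), Range(record, i, len(record))])


point = make_record([("y", 2), ("x", 1)])      # stored in label order: (1, 2)
print(select(point, ["x", "y"], "y"))          # prints 2
print(extend(point, ["x", "y"], "w", 0))       # prints (0, 1, 2)
```

Because fields are stored in label order, selection needs only an integer index, and extension can be expressed entirely with slices of the old tuple.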

∀i, j.i < j ⇒ l#(i)

10 The same argument does not work for Haskell, because due to type classes Haskell's polymorphism is not parametric.

11 The cost of the heap limit check is amortized over multiple allocations within a basic block.

[...] can be compiled very efficiently, using an index-passing transformation based on a kinded type system for records [25]. He also points out the duality between records and sums and suggests that the same index-passing techniques can be adopted to implement polymorphic sums. Rémy gives a more general type system capable of expressing linearly extensible polymorphic records. Rémy's calculus employs row polymorphism and has an efficient type reconstruction algorithm that infers principal types [26]. Jones and Peyton Jones describe an implementation of extensible records based on the same ideas for Haskell [15].

Gaster and Jones attempt a direct encoding of the dual construction for sum types within Haskell's type system [12]. The encoding requires type system features absent from most languages, in particular higher-order polymorphism and a [...] which roughly corresponds to the row arrow in our System F. Type inference in such a system seems difficult, and, indeed, Gaster and Jones report that they had to impose an ad-hoc restriction to obtain most general unifiers. Their restriction is to disallow empty rows, meaning that they could not type our nocases construct.

Garrigue implements a version of polymorphic sum types in OCaml. His approach does not take advantage of the duality between sums and records but instead provides a form of extensibility based on so-called variant dispatching [10, 11]. As Zenger and Odersky point out [33], variant dispatching requires writing additional functions to forward control to existing code. This is a consequence of the fact that in Garrigue's system, extensions need to know what they are extending. As a result, extensions cannot be composed directly.

It should be noted that a suitably modified typing rule for a match expression with a default case could actually be used to give Garrigue's implementation the same power of extensibility that we provide in MLPolyR. Consider the following example:

    fun g y = ...
    fun f x = match x with
                ‘A () ⇒ print "A"
              | y ⇒ g y

Here the types of x and y should be related sums that share a common row, the only difference being the presence of the ‘A constructor in x's type and its absence in y's type. The typing rule for this could be:

\[
\frac{\Gamma \vdash e_1 : \langle l : \tau_l, \rho \rangle
      \qquad \Gamma, x : \tau_l \vdash e_2 : \tau
      \qquad \Gamma, y : \langle \rho \rangle \vdash e_3 : \tau}
     {\Gamma \vdash \mathbf{match}\ e_1\ \mathbf{with}\ l\ x \Rightarrow e_2 \mid y \Rightarrow e_3 : \tau}
\]

This approach does not require the alternative function arrow ↪ for cases but uses the ordinary function arrow in its place. The main advantage of having the case arrow ↪ in the type system and statically distinguishing cases from other functions lies in the fact that this makes it very easy to use different runtime representations for the two. In particular, we can represent MLPolyR cases as records of functions. These records represent jump tables. In Garrigue's implementation, however, case analysis for polymorphic variants proceeds by direct comparisons of constructor names (significantly sped up via hashing). Thus, his implementation technique essentially corresponds to our semantics of PolyR. In this setting, extending functions by extra cases can be implemented by simple chaining of conditionals.

6. Conclusions

We have presented MLPolyR, a language with row polymorphism for both records and sums. MLPolyR explicitly exposes the duality between sums and products by providing a type of first-class cases. Values of case type (like records, their dual counterpart) can be functionally extended to handle larger sums.

This setup rests firmly on the well-understood theory of row types. It allows for efficient type reconstruction and should yield few surprises for programmers. On the other hand, we find that it enables a very flexible style of programming where the treatment of individual variants of a sum can be coded separately and combined later with minimal notational overhead. In combination with explicitly coded open recursion (which requires equi-recursive types), it provides an elegant approach to solving the expression problem, i.e., the problem of adding new constructors to a datatype and being able to re-use existing code. Since we are striving for simplicity, we consciously left out features such as subtyping or inference of intersections.

We implemented our language using a technique based on Ohori's index-passing scheme for polymorphic records and the exploitation of the sum-product duality for first-class cases. In this paper we explain this technique in terms of a 2-stage translation, first into an explicitly-typed polymorphic λ-calculus (System F) where sums and cases are eliminated using duality with records, then into an untyped λ-calculus LRec where records are represented as vectors with numeric indices.

7. Acknowledgments

We would like to thank Atsushi Ohori for helpful discussions. Jacques Garrigue as well as the anonymous reviewers provided valuable feedback.

References

[1] A. W. Appel. Compiling with continuations. Cambridge University Press, New York, NY, USA, 1992.
[2] A. W. Appel and D. B. MacQueen. Standard ML of New Jersey. In M. Wirsing, editor, 3rd International Symp. on Prog. Lang. Implementation and Logic Programming, pages 1–13, New York, Aug. 1991. Springer-Verlag.
[3] F. Bourdoncle and S. Merz. Type-checking higher-order polymorphic multi-methods. In Conference Record of POPL '97: The 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 302–315, Paris, France, 15–17 1997.
[4] K. Bruce. Some challenging typing issues in object-oriented languages. In Electronic Notes in Theoretical Computer Science, volume 82(8), 2003.
[5] L. Cardelli and J. C. Mitchell. Operations on records. In C. A. Gunter and J. C. Mitchell, editors, Theoretical Aspects of Object-Oriented Programming: Types, Semantics, and Language Design, pages 295–350. The MIT Press, Cambridge, MA, 1994.
[6] R. B. Findler and M. Flatt. Modular object-oriented programming with units and mixins. In ICFP '99: Proceedings of the fourth ACM SIGPLAN international conference on Functional programming, volume 34(1) of SIGPLAN, pages 94–104, New York, NY, June 1999. ACM.
[7] C. Flanagan, A. Sabry, B. F. Duba, and M. Felleisen. The Essence of Compiling with Continuations. In 1993 Conference on Programming Language Design and Implementation, pages 21–25, June 1993.
[8] M. Flatt. Programming Languages for Reusable Software Components. PhD thesis, Department of Computer Science, Rice University, 1999.
[9] T. Freeman and F. Pfenning. Refinement types for ML. In PLDI '91: Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation, pages 268–277, New York, NY, USA, 1991. ACM Press.
[10] J. Garrigue. Programming with polymorphic variants. In ACM SIGPLAN Workshop on ML, 1998.
[11] J. Garrigue. Code reuse through polymorphic variants. In Workshop on Foundations of Software Engineering, Nov. 2000.
[12] B. R. Gaster and M. P. Jones. A polymorphic type system for extensible records and variants. Technical Report NOTTCS-TR-96-3, Department of Computer Science, University of Nottingham, 1996.
[13] K. Gowani and M. Blume. Writing a garbage collector for the MLPolyR compiler, July 2005. Final report on independent study project in the CSPP program.
[14] M. P. Jones. Coherence for qualified types. Technical Report YALEU/DCS/RR-989, Yale University, New Haven, Connecticut, USA, 1993.
[15] M. P. Jones and S. P. Jones. Lightweight extensible records for Haskell, 1999.
[16] S. Krishnamurthi, M. Felleisen, and D. P. Friedman. Synthesizing Object-Oriented and Functional Design to Promote Re-Use. In European Conference on Object-Oriented Programming, 1998.
[17] X. Leroy. [caml-list] cyclic types. http://caml.inria.fr/pub/ml-archives/caml-list/, Jan. 2005.
[18] T. Millstein, C. Bleckner, and C. Chambers. Modular typechecking for hierarchically extensible datatypes and functions. In ICFP '02: Proceedings of the seventh ACM SIGPLAN international conference on Functional programming, pages 110–122, New York, NY, USA, 2002. ACM Press.
[19] R. Milner. A theory of type polymorphism in programming. Journal of Computer and System Sciences, 13(3):348–375, 1978.
[20] R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of Standard ML (Revised). MIT Press, Cambridge, MA, 1997.
[21] I. N. I. Adams, D. H. Bartley, G. Brooks, R. K. Dybvig, D. P. Friedman, R. Halstead, C. Hanson, C. T. Haynes, E. Kohlbecker, D. Oxley, K. M. Pitman, G. J. Rozas, J. G. L. Steele, G. J. Sussman, M. Wand, and H. Abelson. Revised^5 report on the algorithmic language Scheme. SIGPLAN Not., 33(9):26–76, 1998.
[22] M. Odersky and P. Wadler. Pizza into Java: Translating theory into practice. In Proceedings of the 24th ACM Symposium on Principles of Programming Languages (POPL'97), Paris, France, pages 146–159. ACM Press, New York (NY), USA, 1997.
[23] A. Ohori. A simple semantics for ML polymorphism. In FPCA, pages 281–292, 1989.
[24] A. Ohori. A compilation method for ML-style polymorphic record calculi. In POPL '92: Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 154–165, New York, NY, USA, 1992. ACM Press.
[25] A. Ohori. A polymorphic record calculus and its compilation. ACM Trans. Program. Lang. Syst., 17(6):844–895, 1995.
[26] D. Rémy. Type inference for records in a natural extension of ML. Research Report 1431, Institut National de Recherche en Informatique et Automatisme, Rocquencourt, BP 105, 78 153 Le Chesnay Cedex, France, May 1991.
[27] D. Rémy and J. Vouillon. Objective ML: An effective object-oriented extension to ML. Theory And Practice of Objects Systems, 4(1):27–50, 1998.
[28] J. Reppy and J. Riecke. Simple objects for Standard ML. In Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design and Implementation, pages 171–180, Philadelphia, Pennsylvania, 21–24 1996.
[29] M. Torgersen. The expression problem revisited: Four new solutions using generics. In M. Odersky, editor, ECOOP 2004 - Object-Oriented Programming, 18th European Conference, Oslo, Norway, Proceedings, volume 3086 of LNCS, pages 123–143, New York, NY, July 2004. Springer-Verlag.
[30] P. Wadler. The expression problem. Email to the Java Genericity mailing list, Dec. 1998.
[31] M. Wand. Complete type inference for simple objects. In Proceedings of the IEEE Symposium on Logic in Computer Science, Ithaca, NY, June 1987.
[32] M. Wand. Type inference for record concatenation and multiple inheritance. Information and Computation, 93(1):1–15, 1991.
[33] M. Zenger and M. Odersky. Extensible algebraic datatypes with defaults. In ICFP '01: Proceedings of the sixth ACM SIGPLAN international conference on Functional programming, pages 241–252, New York, NY, USA, 2001. ACM Press.
[34] M. Zenger and M. Odersky. Independently extensible solutions to the expression problem. In The 12th International Workshop on Foundations of Object-Oriented Languages (FOOL 12), Long Beach, California, 2005. ACM.

A. Dynamic Semantics for PolyR

Figure 14 shows the dynamic semantics for the PolyR language.

\[
\frac{}{v \Downarrow v}\ \text{(val)}
\qquad
\frac{e_1 \Downarrow \mathbf{fun}\ f\ x = e_1' \quad e_2 \Downarrow v_2 \quad e_1'[\mathbf{fun}\ f\ x = e_1'/f,\ v_2/x] \Downarrow v}
     {e_1\ e_2 \Downarrow v}\ \text{(app)}
\]

\[
\frac{e \Downarrow v}{l\ e \Downarrow l\ v}\ \text{(data const.)}
\qquad
\frac{e_1 \Downarrow v_1 \quad e_2[v_1/x] \Downarrow v}
     {\mathbf{let}\ x = e_1\ \mathbf{in}\ e_2 \Downarrow v}\ \text{(let)}
\]

\[
\frac{e_1 \Downarrow v_1 \quad \cdots \quad e_k \Downarrow v_k}
     {\{ l_i = e_i \}_{i=1}^{k} \Downarrow \{ l_i = v_i \}_{i=1}^{k}}\ \text{(r)}
\qquad
\frac{e_1 \Downarrow \{ l_i = v_i' \}_{i=1}^{k} \quad e_2 \Downarrow v_2}
     {e_1 \otimes \{ l = e_2 \} \Downarrow \{ l_1 = v_1', \ldots, l_k = v_k', l = v_2 \}}\ \text{(r/ext)}
\]

\[
\frac{e \Downarrow \{ l_1 = v_1, \ldots, l_i = v_i, \ldots, l_k = v_k \}}
     {e \oslash l_i \Downarrow \{ l_1 = v_1, \ldots, l_{i-1} = v_{i-1}, l_{i+1} = v_{i+1}, \ldots, l_k = v_k \}}\ \text{(r/sub)}
\]

\[
\frac{e \Downarrow \{ l_1 = v_1, \ldots, l_i = v_i, \ldots, l_k = v_k \}}
     {e.l_i \Downarrow v_i}\ \text{(select)}
\]

\[
\frac{e_1 \Downarrow \{ l_i\ x_i \Rightarrow e_i' \}_{i=1}^{k}}
     {e_1 \oplus \{ l\ x \Rightarrow e_2 \} \Downarrow \{ l_1\ x_1 \Rightarrow e_1', \ldots, l_k\ x_k \Rightarrow e_k', l\ x \Rightarrow e_2 \}}\ \text{(c/ext)}
\]

\[
\frac{e \Downarrow \{ l_1\ x_1 \Rightarrow e_1', \ldots, l_i\ x_i \Rightarrow e_i', \ldots, l_k\ x_k \Rightarrow e_k' \}}
     {e \ominus l_i \Downarrow \{ l_1\ x_1 \Rightarrow e_1', \ldots, l_{i-1}\ x_{i-1} \Rightarrow e_{i-1}', l_{i+1}\ x_{i+1} \Rightarrow e_{i+1}', \ldots, l_k\ x_k \Rightarrow e_k' \}}\ \text{(c/sub)}
\]

\[
\frac{e_1 \Downarrow l_i\ v \quad e_2 \Downarrow \{ l_1\ x_1 \Rightarrow e_1', \ldots, l_i\ x_i \Rightarrow e_i', \ldots, l_k\ x_k \Rightarrow e_k' \} \quad e_i'[v/x_i] \Downarrow v'}
     {\mathbf{match}\ e_1\ \mathbf{with}\ e_2 \Downarrow v'}\ \text{(match)}
\]

Figure 14. The dynamic semantics for PolyR.

B. Reordering rules

Figure 15 shows the reordering judgment ≈, which relates two types that are considered equal up to permutation of their fields.

\[
\frac{}{\alpha \approx \alpha}
\qquad
\frac{}{\mathrm{int} \approx \mathrm{int}}
\qquad
\frac{\tau_1 \approx \tau_1' \quad \tau_2 \approx \tau_2'}{\tau_1 \to \tau_2 \approx \tau_1' \to \tau_2'}
\qquad
\frac{\rho \approx \rho'}{\{\rho\} \approx \{\rho'\}}
\]

\[
\frac{\rho \approx \rho'}{\langle\rho\rangle \approx \langle\rho'\rangle}
\qquad
\frac{\rho \approx \rho'}{\alpha\ \mathbf{as}\ \langle\rho\rangle \approx \alpha\ \mathbf{as}\ \langle\rho'\rangle}
\qquad
\frac{\rho \approx \rho' \quad \tau \approx \tau'}{\langle\rho\rangle \hookrightarrow \tau \approx \langle\rho'\rangle \hookrightarrow \tau'}
\]

\[
\frac{}{\beta \approx \beta}
\qquad
\frac{}{\cdot \approx \cdot}
\qquad
\frac{}{l : \tau, \beta \approx l : \tau, \beta}
\qquad
\frac{}{l : \tau, \cdot \approx l : \tau, \cdot}
\]

\[
\frac{\#\ \text{is a permutation of}\ 1, \ldots, k}
     {l_1 : \tau_1, \ldots, l_k : \tau_k, \rho \approx l_{\#(1)} : \tau_{\#(1)}, \ldots, l_{\#(k)} : \tau_{\#(k)}, \rho}
\]

Figure 15. The reordering judgment ≈.
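The representation of cases as records of functions, with data constructors as functions that pick their handler and a match compiled to a function application, can be sketched in Python. This is an untyped illustration only: the names `inject`, `match`, `extend_case`, `subtract_case`, and `nocases` (borrowed from the construct mentioned in the text) are hypothetical, a dictionary stands in for a record, and the errors raised here at run time are ones the type system described in the paper would rule out statically.

```python
from typing import Callable, Dict

# A case value is modeled as a record (dict) from labels to handler functions.
Cases = Dict[str, Callable[[object], object]]

# The empty case: handles no constructors at all.
nocases: Cases = {}


def inject(label: str, value: object) -> Callable[[Cases], object]:
    """Data constructor `l v`: a function that selects field l from a record
    of handlers and applies that handler to the payload v."""
    return lambda handlers: handlers[label](value)


def match(scrutinee: Callable[[Cases], object], cases: Cases) -> object:
    """`match e1 with e2` becomes a plain function application."""
    return scrutinee(cases)


def extend_case(cases: Cases, label: str, handler: Callable[[object], object]) -> Cases:
    """Case extension is record extension; the label must not be present yet."""
    if label in cases:
        raise KeyError(f"label {label!r} already handled")
    return {**cases, label: handler}


def subtract_case(cases: Cases, label: str) -> Cases:
    """Case subtraction is record subtraction; the label must be present."""
    if label not in cases:
        raise KeyError(f"label {label!r} not handled")
    return {k: h for k, h in cases.items() if k != label}


# A base case value handling two constructors, built by functional extension.
area = extend_case(nocases, "Square", lambda side: side * side)
area = extend_case(area, "Rect", lambda wh: wh[0] * wh[1])

# An independent extension adds a new constructor without touching `area`.
area_with_cube_face = extend_case(area, "CubeFace", lambda side: side * side)

print(match(inject("Square", 4), area))                    # prints 16
print(match(inject("Rect", (2, 5)), area))                 # prints 10
print(match(inject("CubeFace", 3), area_with_cube_face))   # prints 9
```

Dispatch is a single record lookup followed by a call, which is the jump-table behavior the text contrasts with chaining conditionals over constructor names.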