<<

1ML – Core and Modules United (F-ing First-class Modules)

Andreas Rossberg Google, Germany [email protected]

Abstract ference [18, 3], and an advanced module system based on concepts ML is two languages in one: there is the core, with types and ex- from dependent [17]. Although both have contributed pressions, and there are modules, with signatures, structures and to the success of ML, they exist in almost entirely distinct parts functors. Modules form a separate, higher-order functional lan- of the language. In particular, the convenience of type inference is guage on top of the core. There are both practical and technical available only in ML’s so-called core language, whereas the mod- reasons for this stratification; yet, it creates substantial duplication ule language has more expressive types, but for the price of being in syntax and semantics, and it reduces expressiveness. For exam- painfully verbose. Modules form a separate language layered on ple, selecting a module cannot be made a dynamic decision. Lan- top of the core. Effectively, ML is two languages in one. guage extensions allowing modules to be packaged up as first-class This stratification makes sense from a historical perspective. values have been proposed and implemented in different variations. Modules were introduced for programming-in-the-large, when the However, they remedy expressiveness only to some extent, are syn- core language already existed. The machinery that tactically cumbersome, and do not alleviate redundancy. was the central innovation of the original module design was alien We propose a redesign of ML in which modules are truly first- to the core language, and could not have been integrated easily. class values, and core and module layer are unified into one lan- However, we have since discovered that dependent types are not guage. In this “1ML”, functions, functors, and even type construc- actually necessary to explain modules. In particular, Russo [26, 28] tors are one and the same construct; likewise, no distinction is made demonstrated that module types can be readily expressed using between structures, records, or tuples. Or viewed the other way only System-F-style quantification. The F-ing modules approach round, everything is just (“a mode of use of”) modules. Yet, 1ML later showed that the entire ML module system can in fact be does not require dependent types, and its type structure is express- understood as a form of syntactic sugar over System Fω [25]. Meanwhile, the second-class nature of modules has increasingly ible in terms of plain System Fω, in a minor variation of our F-ing modules approach. We introduce both an explicitly typed version been perceived as a practical limitation. The standard example of 1ML, and an extension with Damas/Milner-style implicit quan- being that it is not possible to select modules at runtime: tification. Type inference for this language is not complete, but, we module Table = if size > threshold then HashMap else TreeMap argue, not substantially worse than for Standard ML. An alternative view is that 1ML is a user-friendly surface syntax A definition like this, where the choice of an implementation is for System Fω that allows combining term and type abstraction in dependent on dynamics, is entirely natural in object-oriented lan- a more compositional manner than the bare calculus. guages. Yet, it is not expressible with ordinary ML modules. What a shame! Categories and Subject Descriptors D.3.1 [Programming Lan- guages]: Formal Definitions and Theory; D.3.3 [Programming 1.1 Packaged Modules Languages]: Language Constructs and Features—Modules; F.3.3 [Logics and Meanings of Programs]: Studies of Program Constructs— It comes to no surprise, then, that various proposals have been Type structure made (and implemented) that enrich ML modules with the ability to package them up as first-class values [27, 22, 6, 25, 7]. Such General Terms Languages, Design, Theory packaged modules address the most imminent needs, but they are Keywords ML modules, first-class modules, type systems, ab- not to be confused with truly first-class modules. They require stract data types, existential types, System F, elaboration explicit injection into and projection from first-class core values, accompanied by heavy annotations. For example, in OCaml 4 the 1. Introduction above example would have to be written as follows: The ML family of languages is defined by two splendid innova- module Table = (val (if size > threshold then (module HashMap : MAP) tions: with Damas/Milner-style type in- else (module TreeMap : MAP))) : MAP) which, arguably, is neither natural nor pretty. Packaged modules have limited expressiveness as well. In particular, type sharing with a packaged module is only possible via a detour through core-level polymorphism, such as in: f : (module S with type t = ’a) → (module S with type t = ’a) → ’a (where t is an abstract type in S). In contrast, with proper modules, the same sharing could be expressed as [Copyright notice will appear here once ’preprint’ option is removed.] f : (X : S) → (S with type t = X.t) → X.t

1 2015/6/18 Because core-level polymorphism is first-order, this approach tool for solving type constraints. And while inference algorithms cannot express type sharing between type constructors – a com- for exist, they have much less satisfactory properties than plaint that has come up several times on the OCaml mailing list; our beloved Hindley/Milner sweet spot. for example, if one were to abstract over a monad: Worse, module types do not even form a lattice under subtyping: map : (module MONAD with type ’a t = ?) → (’a → ’b) → ? → ? f1 : {type t a; x : t int} → int f2 : {type t a; x : int} → int There is nothing that can be put in place of the ?’s to complete this g = if condition then f else f function signature. The programmer is forced to either use weaker 1 2 types (if possible at all), or drop the use of packaged modules and There are at least two possible types for g: lift the function (and potentially a lot of downstream code) to the g : {type t a = int; x : int} → int functor level – which not only is very inconvenient, it also severely g : {type t a = a; x : int} → int restricts the possible computational behaviour of such code. One could imagine addressing this particular limitation by introducing Neither is more specific than the other, so no least upper bound higher-kinded polymorphism into the ML core. But with such an exists. Consequently, annotations are necessary to regain principal extension type inference would require higher-order unification and types for constructs like conditionals, in order to restore any hope hence become undecidable – unless accompanied by significant for compositional type checking, let alone inference. restrictions that are likely to defeat this example (or others). 1.3 F-ing Modules 1.2 First-Class Modules In our work on F-ing modules with Russo & Dreyer [25] we have Can we overcome this situation and make modules more equal demonstrated that ML modules can be expressed and encoded citizens of the language? The answer from the literature has been: entirely in vanilla System F (or Fω, depending on the concrete core no, because first-class modules make type-checking undecidable language and the desired semantics for functors). Effectively, the F- and type inference infeasible. ing semantics defines a type-directed desugaring of module syntax The most relevant work is Harper & Lillibridge’s calculus into System F types and terms, and inversely, interprets a stylised of translucent sums [9] (a precursor of later work on singleton subset of System F types as module signatures. types [31]). It can be viewed as an idealised functional language The core language that we assume in that paper is System F that allows types as components of (dependent) records, so that (respectively, Fω) itself, leading to the seemingly paradoxical situ- they can express modules. In the type of such a record, individual ation that the core language appears to have more expressive types type members can occur as either transparent or opaque (hence, than the module language. That makes sense when considering that translucent), which is the defining feature of ML module typing. the module translation rules manipulate the sublanguage of module Harper & Lillibrige prove that type-checking this language is types in ways that would not generalise to arbitrary System F types. undecidable. Their result applies to any language that has (a) con- In particular, the rules implicitly introduce and eliminate universal travariant functions, (b) both transparent and opaque types, and (c) and existential quantifiers, which is key to making modules a us- allows opaque types to be subtyped with arbitrary transparent types. able means of abstraction. But the process is guided by, and only The latter feature usually manifests in a subtyping rule like meaningful for, module syntax; likewise, the built-in subtyping re-

{D1[τ/t]} ≤ {D2[τ/t]} lation is only “complete” for the specific occurrences of quantifiers FORGET {type t=τ;D } ≤ {type t;D } in module types. 1 2 Nevertheless, the observation that modules are just sugar for which is, in some variation, at the heart of every definition of signa- certain kinds of constructs that the core language can already ex- ture matching. In the premise the concrete type τ is substituted for press (even if less concisely), raises the question: what necessitates the abstract t. Obviously, this rule is not inductive. The substitution modules to be second-class in that system? can arbitrarily grow the types, and thus potentially require infinite derivations. A concrete example triggering non-termination is the 1.4 1ML following, adapted from Harper & Lillibridge’s paper [9]: The answer to that question is: very little! And the present paper is type T = {type A; f : A → ()} motivated by exploring that answer. type U = {type A; f : (T where type A = A) → ()} In essence, the F-ing modules semantics reveals that the syntac- type V = T where type A = U tic stratification between ML core and module language is merely g (X : V) = X : U (* V ≤ U ? *) a rather coarse means to enforce predicativity for module types: Checking V ≤ U would match type A with type A=U, substitut- it prevents abstract types themselves from being instantiated with ing U for A accordingly, and then requires checking that the types binders for abstract types. But this heavy syntactic restriction can of f are in a subtyping relation – which contravariantly requires be replaced by a more surgical semantic restriction! It is enough checking that (T where type A = A)[U/A] ≤ A[U/A], but that to employ a simple universe distinction between small and large is the same as the V ≤ U we wanted to check in the first place. types (reminiscent of Harper & Mitchell’s XML [10]), and limit In fewer words, signature matching is no longer decidable when the equivalent of the FORGET rule shown earlier to only allow small module types can be abstracted over, which is the case if module types for subsitutition, which serves to exclude problematic quan- types are simply collapsed into ordinary types. It also arises if tifiers. “abstract signatures” are added to the language, as in OCaml, where That would settle decidability, but what about type inference? the same divergent example can be constructed on the module type Well, we can use the same distinction! A quick inspection of the level alone. subtyping rules in the F-ing modules semantics reveals that they, Some may consider decidability a rather theoretical concern. almost, degenerate to type equivalence when applied to small types However, there also is the – quite practical – issue that the introduc- — the only exception being width subtyping on structures. If we tion of signature matching into the core language makes ML-style are willing to accept that inference is not going to be complete type inference impossible. Obviously, Milner’s algorithm W [18] is for records (which it already isn’t in Standard ML), then a simple far too weak to handle dependent types. Moreover, modules intro- restriction to inferring only small types is sufficient to make type duce subtyping, which breaks unification as the basic algorithmic inference work almost as usual.

2 2015/6/18 In this spirit, this paper presents 1ML, an ML-dialect in which pression type T reifies the type T as a value.1 Such an expression modules are truly first-class values. The name is both short for “1st- has type type, and thereby can be abstracted over. For example, class module language” and a pun on the fact that it unifies core and id = fun (a : type) ⇒ fun (x : a) ⇒ x modules of ML into one language. Our contributions are as follows: defines a polymorphic identity function, similar to how it would • We present a decidable for a language of first-class be written in dependent type theories. Note in particular that a modules that subsumes conventional second-class ML modules. is a term variable, but it is used as a type in the annotation for • We give an elaboration of this language into plain System Fω. x. This is enabled by the “path” form E in the syntax of types, • We show how Damas/Milner-style type inference can be inte- which expresses the (implicit) projection of a type from a term, grated into such a language; it is incomplete, but only in ways provided this term has type type. Consequently, all variables are that are already present in existing ML implementations. term variables in 1ML, there is no separate notion of type variable. More interestingly, a function can return types, too. Consider • We develop the basis for a practical design of an ML-like language in which the distinction between core and modules pair = fun (a : type) ⇒ fun (b : type) ⇒ type {fst : a; snd : b} has been eliminated. which takes a type and returns a type, and effectively defines a . Applied to a reified type it yields a reified type. Again, We see several benefits with this redesign: it produces a lan- the implicit projection from “paths” enables using this as a type: guage that is more expressive and concise, and at the same time, more minimal and uniform. “Modules” become a natural means to second = fun (a : type) ⇒ fun (b : type) ⇒ fun (p : pair a b) ⇒ p.snd express all forms of (first-class) polymorphism, and can be freely intermixed with “computational” code and data. Type inference In this example, the whole of “pair a b” is a term of type type. integrates in a rather seamless manner, reducing the need for ex- Figure 1 also defines a of syntactic sugar to make function plicit annotations to large types, module or not. Every program- and type definitions look more like in traditional ML. For example, ming concept is derived from a small of orthogonal constructs, the previous functions could equivalently be written as over which general and uniform syntactic sugar can be defined. id a (x : a) = x type pair a b = {fst : a; snd : b} 2. 1ML with Explicit Types second a b (p : pair a b) = p.snd To separate concerns a little, we will start out by introducing It may seem surprising that we can just reify types as first-class 1MLex, a sublanguage of 1ML proper that is explicitly typed and values. But reified types (or “atomic type modules”) have been does not support any type inference. Its kernel syntax is given in common in module calculi for a long time [16, 6, 24, 25]. We are Figure 1. Let us take a little tour of 1MLex by way of examples. merely making them available in the source language directly. For the most part, this is just a notational simplification over what first- Functional Core A major part of 1MLex consists of fairly con- class modules already offer: instead of having to define a spurious ventional functional language constructs. On the expression level, module T = {type t = int} : {type t} and then refer to T.t, we as a representative for a base type, we have Booleans; in examples allow injecting types into modules (i.e., values) anonymously, with- that follow, we will often assume the presence of an integer type out wrapping them into a structure; thus t = (type int) : type, and respective constructs as well. Then there are records, which which can be referred to as just t. consist of a sequence of bindings. And of course, it wouldn’t be a functional language without functions. Translucency The type type allows classifying types abstractly: In a first approximation, these forms are reflected on the type given a value of type type, nothing is known about what type level as one would expect, except that for functions we allow it is. But for modular programming it is essential that types can two forms of arrows, distinguishing pure function types (⇒) from selectively be specified transparently, which enables expressing the impure ones (→) (discussed later). vital concept of type sharing [12]. Like in the F-ing modules paper [25], most elimination forms in As a simple example, consider these type aliases: the kernel syntax only allow variables as subexpressions. However, type size = int the general expression forms are all definable as straightforward type pair a b = {fst : a; snd : b} syntactic sugar, as shown in the lower half of Figure 1. For example, According to the idea of translucency, the variables defined by these (fun (n : int) ⇒ n + n) 3 definitions can be classified in one of two ways. Either opaquely: desugars into size : type pair : (a : type) ⇒ (b : type) ⇒ type let f = fun (n : int) ⇒ n + n; x = 3 in f x Or transparently: and further into size : (= type int) {f = fun (n : int) ⇒ n + n; x = 3; body = f x}.body pair : (a : type) ⇒ (b : type) ⇒ (= type {fst : a; snd : b}) This works because records actually behave like ML structures, The latter use a variant of singleton types [31, 6] to reveal the such that every bound identifier is in scope for later bindings – definitions: a type of the form “=E” is inhabited only by values which enables encoding let-expressions. that are “structurally equivalent” to E, in particular, with respect to Also, notably, if-expressions require a type annotation in 1MLex. parts of type type. It allows the type system to infer, for example, As we will see, the type language subsumes module types, and as that the application pair size size is equivalent to the (reified) type discussed in Section 1.2 there wouldn’t generally be a unique least upper bound otherwise. However, in Section 4 we show that this 1 Ideally, “type T ” should be written just “T ”, like in dependently typed annotation can usually be omitted in full 1ML. systems. However, that would create various syntactic ambiguities, e.g. for phrases like “{}”, which could only be avoided by moving to a more Reified Types The core feature that makes 1MLex able to express artificial syntax for types themselves. Nevertheless, we at least allow writing modules is the ability to embed types in a first-class manner: the ex- “ET ” for the application “E (type T )” if T unambiguously is a type.

3 2015/6/18 (identifiers) X ⇒ (types) T ::= E | bool | {D} | (X:T ) → T | type | =E | T where (.X:T ) (declarations) D ::= X : T | include T | D;D |  (expressions) E ::= X | true | false | if X then E else E:T | {B} | E.X | fun (X:T ) ⇒E | XX | type T | X:>T (bindings) B ::= X=E | include E | B;B | 

(types) (expressions) let B in T := {B;X= type T }.X let B in E := {B;X=E}.X ⇒ ⇒ T1 → T2 := (X:T1) → T2 if E1 then E2 else E3:T := let X=E1 in if X then E2 else E3:T T where (.X P =E) := T where (.X: P ⇒ (=E)) E1 E2 := let X1=E1; X2=E2 in X1 X2 T where (type .X P =T 0) := T where (.X: P ⇒ (= type T 0)) ET := E (type T ) (if T unambiguous) E : T := (fun (X:T ) ⇒ X) E (declarations) E :> T := let X=E in X :> T local B in D := include (let B in {D}) fun P ⇒ E := fun P ⇒ E X P :T := X : P ⇒ T X P =E := X : P ⇒ (= E) (bindings) type X P := X : P ⇒ type local B in B0 := include (let B in {B0}) type X P =T := X : P ⇒ (= type T ) X P : T 0 :> T 00=E := X = fun P ⇒ E : T 0 :> T 00 where: (parameter) P ::= (X:T ) with abbreviation X := (X: type) type X P =T := X = fun P ⇒ type T (Identifiers X only occurring on the right-hand side are considered fresh)

Figure 1. 1MLex syntax and syntactic abbreviations

{fst : int; snd : int}. A type =E is a subtype of the type of E add a (k : key) (v : a) (m : map a) = itself, and consequently, transparent classifications define subtypes fun (x : key) ⇒ if Key.eq x k then some a v else m x : opt a of opaque ones, which is the crux of ML signature matching. } Translucent types usually occur as part of module type declara- The record type EQ amounts to a module signature, since it con- tions, where 1ML can abbreviate the above to the more familiar tains an abstract type component t. It is referred to in the type of eq, type size type size = int or, respectively, which shows that record types are seemingly “dependent”: like for type pair a b type pair a b = {fst : a; snd : b} terms, earlier components are in scope for later components – the i.e., as in ML, transparent declarations look just like definitions. key insight of the F-ing approach is that this dependency is benign, Singletons can be formed over arbitrary values. This gives the however, and can be translated away, as we will see in Section 3. ability to express module sharing and aliases. In the basic seman- Similarly, MAP defines a signature with abstract key and map tics described in this paper, this is effectively a shorthand for shar- types. Note how type parameters on the left-hand side conve- ing all types contained in the module (including those defined in- niently and uniformly generalise to value declarations, avoid- side transparent functors, see below). We leave the extension to full ing the need for brittle implicit scoping rules like in conven- value equivalence (including primitive types like Booleans), as in tional ML: as shown in Figure 1, “empty a : map a” abbreviates our F-ing semantics for applicative functors [25], to future work. “empty : (a : type) ⇒ map a”, in a generalisation of the syntax for type specifications introduced earlier, where “type t a” desug- Functors Returning to the 1ML grammar, the remaining con- ars into “t a : type” and then “t : (a : type) ⇒ type”. structs of the language are typical for ML modules, although they The Map function is a functor: it takes a value of type EQ, are perhaps a bit more general than what is usually seen. Let us i.e., a module. From that it constructs a naive implementation of explain them using an example that demonstrates that our language maps. “X:>T ” is the usual sealing operator that opaquely ascribes can readily express “real” modules as well. Here is the (unavoid- a type (i.e., signature) to a value (a.k.a. module). The type refine- able, it seems) functor that defines a simple map ADT: ment syntax “T where (type .X=T )” should be familiar from type EQ = ML, but here it actually is derived from a more general construct: { “T where (.X:U)” refines T ’s subcomponent at path .X to type type t; U, which can be any subtype of what’s declared by T . That form eq : t → t → bool }; subsumes module sharing as well as other forms of refinement. type MAP = Applicative vs. Generative In this paper, we stick to a relatively { simple semantics for functor-like functions, in which Map is gener- type key; ative [28, 4, 25]. That is, like in Standard ML, each application will type map a; yield a fresh map ADT, because sealing occurs inside the functor: empty a : map a; add a : key → a → map a → map a; M1 = Map IntEq; lookup a : key → map a → opt a M2 = Map IntEq; }; m = M1.add int 7 M2.empty (* ill-typed: M1.map 6= M2.map *) Map (Key : EQ) :> MAP where (type .key = Key.t) = { But as we saw earlier, type constructors like pair or map are type key = Key.t; essentially functors, too! Sealing the body of the Map functor type map a = key → opt a; hence implies higher-order sealing of the nested map “functor”, as empty a = fun (k : key) ⇒ none a; if performing map :> type ⇒ type. It is vital that the resulting lookup a (k : key) (m : map a) = m k; functor has applicative semantics [15, 25], so that

4 2015/6/18 type map a = M1.map a; literature (or in C++ land). The function entries is parameterised type t = map int; over a corresponding module C – an (explicit) instance type u = map int if you want. Its depends directly on C’s definition of the t u associated types. Such a dependency can be expressed in ML on yields = , as one would expect from a proper type constructor. 2 We hence need applicative functors as well. To keep things sim- the module level, but not at the core level. ple, we restrict ourselves to the simplest possible semantics in this Moving to higher kinds, things become even more interesting: paper, in which we distinguish between pure (⇒, i.e. applicative) type MONAD (m : type ⇒ type) = and impure (→, i.e. generative) function types, but sealing is al- { ways impure (or strong [6]). That is, sealing inside a functor always return a : a → m a; makes it generative. The only way to produce an bind a b : m a → (a → m b) → m b is by sealing a (fully transparent) functor as a whole, with applica- }; tive functor type, as for the map type constructor above. Given: map a b (m : type ⇒ type) (M : MONAD m) (f : a → b) (mx : m a) = M.bind a b mx (fun (x : a) ⇒ M.return b (f x)) (* : m b *) F = (fun (a : type) ⇒ type {x : a}) :> type ⇒ type G = (fun (a : type) ⇒ type {x : a}) :> type → type Here, MONAD is again akin to a type class, but over a type H = fun (a : type) ⇒ (type {x : a} :> type) constructor. As explained in Section 1.1, this of polymorphism J = G :> type ⇒ type (* ill-typed! *) cannot be expressed even in MLs with packaged modules. F is an applicative functor, such that F int = F int. G and H on the Computed Modules Just for completeness, we should mention other hand are generative functors; the former because it is sealed that the motivating example from Section 1 can of course be written with impure , the latter because sealing occurs inside (almost) as is in 1MLex: its body. Consequently, G int or H int are impure expressions and invalid as type paths (though it is fine to bind their result to a name, Table = if size > threshold then HashMap else TreeMap : MAP e.g., “type w = G int”, and use the constant w as a type). Lastly, The only minor nuisance is the need to annotate the type of the J is ill-typed, because applicative functor types are subtypes of conditional. As explained earlier, the annotation is necessary in generative ones, but not the other way round. general to achieve unique types, but can usually be inferred once This semantics for applicative functors (which is very similar to we add inference to the mix (Section 4). the applicative functors of Shao [30]) is somewhat limited, but just enough to encode sealing over type constructors and hence recover Predicativity What is the restriction we employ to maintain de- the ability to express type definitions as in conventional ML. An cidability? It is simple: during subtyping (a.k.a. signature match- extension of 1ML to applicative functors with pure sealing a` la F- ing) the type type can only be matched by small types, which ing modules [25] is given in the Technical Appendix [23]. are those that do not themselves contain the type type; or in other The purity distinction would naturally extend to other relevant words, monomorphic types. Small types thus exclude first-class ab- effects, such as state. For example, the assignment operator := stract types, actual functors (functions taking type parameters), and would need to be typed as impure (because there is no sound type constructors (which are just functors). For example, all of the elaboration for it otherwise), while other operators, such as +, following define large types: could be pure. However, we do not explore that space further here, type T1 = type; type T4 = (x : {}) → type; and conservatively treat all “core-like” functions as impure for now. type T2 = {type u}; type T5 = (a : type) ⇒ {}; type T = {type u = T }; type T = {type u a = bool}; Higher Polymorphism So far, we have only shown how 1ML 3 2 6 recovers constructs well-known from ML. As a first example of None of these are expressible as type expressions in conventional something that cannot directly be expressed in conventional ML, ML, and vice versa, all ML type expressions materialise as small consider first-class polymorphic arguments: types in 1ML, so nothing is lost in comparison. The restriction on subtyping affects annotations, parameterisa- f (id : (a : type) ⇒ a → a) = {x = id int 5; y = id bool true} tion over types, and the formation of abstract types. For example, Similarly, existential types are directly expressible: for all of the above Ti, all of the following definitions are ill-typed: type SHAPE = {type t; area : t → float; v : t} type U = pair Ti Ti; (* error *) volume (height : int) (x : SHAPE) = height * x.area (x.v) A = (type Ti): type; (* error *) B = {type u = Ti} :> {type u}; (* error *) SHAPE can either be read as a module signature or an existential C = if b then Ti else int : type (* error *) type, both are indistinguishable. The function volume is agnostic about the actual type of the shape it is given. Notably, the case A with T1 literally implies type type 6 : type It turns out that the previous examples can still be expressed (although type type itself is a well-formed expression!). The main with packaged modules (Section 1.1). But now consider: challenge with first-class modules is preventing such a type:type situation, and the separation into a small universe (denoted by type COLL c = type) and a large one (for which no syntax exists) achieves that. { A transparent type is small as long as it reveals a small type: type key; 0 type val; type T1 = (= type int); 0 empty : c; type T2 = {type u = int} add : c → key → val → c; lookup : c → key → opt val; would not cause an error when inserted into the above definitions. keys : c → list key 2 }; In OCaml 4, this example can be approximated with heavy fibration: module type COLL = sig type coll type key type val ... end entries c (C : COLL c) (xs : c) : list (C.key × C.val) = ... let entries (type c) (type k) (type v) COLL amounts to a parameterised signature, and is akin to a (module C : COLL with type coll = c and type key = k and type value = v) Haskell-style type class [34]. It contains two abstract type spec- (xs : c) : (k * v) list = ... ifications, which are known as associated types in the type class

5 2015/6/18 Recursion The 1ML syntax we give in Figure 1 omits a couple ex (kinds) κ ::= Ω | κ → κ of constructs that one can rightfully expect from any serious ML contender: in particular, there is no form of recursion, neither for (types) τ ::= α | τ → τ | {l:τ} | ∀α:κ.τ | ∃α:κ.τ | terms nor for types. It turns out that those are largely orthogonal to λα:κ.τ | τ τ the overall design of 1ML, so we only sketch them here. (terms) e, f ::= x | λx:τ.e | e e | {l=e} | e.l | λα:κ.e | e τ | ML-style recursive functions can be added simply by throwing pack hτ, eiτ | unpack hα, xi=e in e in a primitive polymorphic fixpoint operator (environ’s) Γ ::= · | Γ, α:κ | Γ, x:τ fix a b : (a → b) → (a → b) Figure 2. Syntax of Fω plus perhaps some suitable syntactic sugar: rec X Y (Z:T ): U=E := (abstracted) Ξ ::= ∃α.Σ X = fun Y ⇒ fix TU (fun(X:(Z:T ) → T 0) ⇒ fun(Z:T ) ⇒ E) (large) Σ ::= π | bool | [= Ξ] | {l:Σ} | ∀α.Σ →ι Ξ Given an appropriate fixpoint operator, this generalises to mutually (small) σ ::= π | bool | [= σ] | {l:σ} | σ →I σ recursive functions in the usual ways. Note how the need to specify (paths) π ::= α | π σ the result type b (respectively, U) prevents using the operator to (purity) ι ::= P | I construct transparent recursive types, because U has no way of referring to the result of the fixpoint. Moreover, fix yields an impure Desugarings into Fω: function, so even an attempt to define an abstract type recursively, (types) (terms) rec (a : type): type = type {head : a; tail : stream a} [= τ] := {typ : τ → {}} [τ] := {typ = λx:τ.{}} won’t type-check, because stream wouldn’t be an applicative func- τ1 →l τ2 := τ1 → {l : τ2} λlx:τ.e := λx:τ.{l : e} tor, and so the term stream a on the right-hand side is not a valid type — fortunately, because there would be no way to translate such Notation: ι ≤ ι ι ∨ ι := ι ι(Σ) = P ≤ ∨ := ∨ := ι(∃αα.Σ) = a definition into System Fω with a conventional fixpoint operator. P IP I I P I I Recursive (data)types have to be added separately. One ap- τ.l := τ τ[.l=τ2] := τ2 (l = ) 0 0 0 proach, that has been used by Harper & Stone’s type-theoretic ac- {l:τ, ...}.l := τ.l {l:τ, ...}[.l=τ2] := {l:τ[.l =τ2], ...} (l = l.l ) count of Standard ML [13], is to interpret a recursive datatype like Figure 3. Semantic Types datatype t = A | B of T as a module defining a primitive ADT with the signature

{type t; A : t; B : T ⇒ t; expose a : ({} → a) ⇒ (T → a) ⇒ t → a} System Fω, the higher-order polymorphic λ-calculus [1], extended with simple record types (Figure 2). The semantics is completely where expose is a case-operator accessed by standard; we omit it here and reuse the formulation from [25]. The compilation. We refer to [13] for more details on this approach. only point of note is that it allows term (but not type) variables in There is one caveat, though: datatypes expressed as ADTs require the environment Γ to be shadowed without α-renaming, which is sealing. With the simple system presented in this paper, they hence convenient for translating bindings. could not be defined inside applicative functors. However, this We write Γ ` e : τ for the F typing judgement, and let e ,→ e0 limitation is removed by the aforementioned generalisation to pure ω denote (one-step) reduction. Then System F is well-known to sealing described in the Technical Appendix [23]. ω enjoy the standard soundness properties: Impredicativity Reloaded Predicativity is a severe restriction. THEOREM 3.1 (Preservation). Can we enable impredicative type abstraction without breaking de- 0 0 cidability? Yes we can. One possibility is the usual trick of piggy- If · ` e : τ and e ,→ e , then · ` e : τ. backing datatypes: we can allow their data constructors to have THEOREM 3.2 (Progress). large parameters. Because datatypes are nominal in ML, impred- 0 0 icativity is “hidden away” and does not interfere with subtyping. If · ` e : τ and e is not a value, then e ,→ e for some e . Structural impredicative types are also possible, as long as large types are injected into the small universe explicitly, by way of To establish soundness of 1ML it suffices to ensure that elaboration a special type, say, “wrap T ”. The gist of this approach is that always produces well-typed Fω terms (Section 3.3). subtyping does not extend to such wrapped types. It is an easy We assume obvious encodings of let-expressions and n-ary extension, the Technical Appendix [23] gives the details. universal and existential types in Fω. To ease notation we often drop type annotations from let, pack, and unpack where clear from 3. Type System and Elaboration context. We will also omit kind annotations on type variables, and where necessary, use the notation κα to refer to the kind implicitly So much for leisure, now for work. The general recipe for 1MLex associated with α. is simple: take the semantics from F-ing modules [25], collapse the levels of modules and core, and impose the predicativity restriction Semantic Types Elaboration translates 1MLex types directly into needed to maintain decidability. This requires surprisingly few “equivalent” System Fω types. The shape of these semantic types changes to the whole system. Unfortunately, space does not permit is given by the grammar in Figure 3. explaining all of the F-ing semantics in detail, so we encourage the The main magic of the elaboration is that it inserts appro- reader to refer to [25] (mostly Section 4) for background, and will priate quantifiers to bind abstract types. Following Mitchell & focus primarily on the differences and novelties in what follows. Plotkin [20], abstract types are represented by existentials: an ab- stracted type Ξ = ∃α.Σ quantifies over all the abstract types (i.e., 3.1 Internal Language components of type type) from the underlying concretised type System Fω The semantics is defined by elaborating 1MLex types Σ, by naming them α. Inside Σ they can hence be represented as and terms into types and terms of (call-by-value, impredicative) transparent types, equal to those α’s.

6 2015/6/18 A sketch of the mapping between syntactic types T and seman- 3.2 Elaboration tic types Ξ is as follows: The complete elaboration rules for 1MLex are collected in Figure 4. There is one judgement for each syntactic class, plus an auxiliary T ∃α.Σ (= type T ) [= ∃α .Σ ] judgement for subtyping. If you are merely interested in typing 1 1 1 1ML then you can ignore the greyed out parts “ e” in the rules type ∃α.[= α] {X :T ;X :T } ∃α α .{X :Σ ,X :Σ } – they are concerned with the translation of terms, and are only 1 1 2 2 1 2 1 1 2 2 relevant to define the operational semantics of the language. (X:T1) → T2 ∀α1.Σ1 →I ∃α2.Σ2 (X:T1) ⇒ T2 ∃α2.∀α1.Σ1 →P Σ2 Types and Declarations The main job of the elaboration rules for A.t αA.t types is to name all abstract type components with type variables, F(M) αF( ) σM collect them, and bind them hoisted to an outermost existential (or universal, in the case of functions) quantifier. The rules are mostly Here, we assume that each constituent type T on the left-hand side i identical to [25], except that type is a free-standing construct in- is recursively mapped to a corresponding ∃α .Σ appearing on the i i stead of being tied to the syntax of bindings, and 1ML’s “where” right-hand side. construct requires a slightly more general rule. Walking through these in turn, (transparent) reified types are Rule TSING corresponds to rule S-LIKE in [25] and handles represented as [= Ξ], which is expressed in System F using a ”singleton” types. It simply infers the (unique) type Σ of the ex- simple coding trick [25] – cf. the desugaring of [= τ] and [τ] pression E. Note that this type is not allowed to have existential given in Figure 3, assuming a reserved label “typ”. Because all quantifiers, i.e., E may not introduce local abstract types. All types type constructors are represented as functors, we have no need for [= Ξ] occurring in Σ thus are transparent. As explained below, we reified types of higher kind (as was the case in [25]). dropped the side condition for Σ to be explicit in this rule. With all abstract types being named, they always appear as transparent types [= α] as well, albeit quantified as necessary. Expressions and Bindings The elaboration of expressions closely Records, no surprise, map to records. We assume an implicit in- follows the rules from the first part of [25], but adds the tracking of jection from 1ML identifiers X into both Fω variables x and labels purity as in Section 7 of that paper. However, to keep the current l, so we can conveniently treat any X as a variable or label. The paper simple, we left out the ability to perform pure sealing, or to abstract type names from all record components (here, the α1 from create pure functions around it. That avoids some of the notational T1 and the α2 from T2) are collectively hoisted outside the record; contortions necessary for the applicative functor semantics from within, the components all have concretised types, respectively. In [25]. An extension of 1MLex with pure sealing can be found in the particular, this makes α1 scope over Σ2, thereby allowing possible Technical Appendix [23]. dependencies of T2 on (abstract types from) T1 without requiring The only other non-editorial changes over [25] are that “type actual dependent types. T ” is now handled as a first-class value, no longer tied to bindings, Function types map to polymorphic functions in Fω. Being in and that Booleans have been added as representatives of the core. negative position, the existential quantifier for the abstract types The rules collect all abstract types generated by an expression α1 from the parameter type Σ1 turns into a universal quantifier, (e.g. by sealing or by functor application) into an existential pack- scoping over the whole type, and allowing the result type Σ2 to age. This requires repeated unpacking and repacking of existentials refer to the parameter types. Like for records, this hoisting avoids created by constituent expressions. Moreover, the sequencing rule the need for dependent types. Functions are also annotated by a BSEQ combines two (n-ary) existentials into one. simple effect ι, which distinguishes impure (→) from pure (⇒) It is an invariant of the expression elaboration judgement that function types, and thus, generative from applicative functors. ι = I if Ξ is not a concrete type Σ – i.e., abstract type “generation” Pure function types encode applicative semantics for the ab- is impure. Without this invariant, rule EFUN might form an invalid stract types they return by having their existential quantifiers α2 function type that is marked pure but yet has an inner existential “lifted” over their parameters. To capture potential dependencies, quantifier (i.e., is “generative”). To maintain the invariant, both the α2 are skolemised over α1 [2, 28, 25]. That is, the kinds of α2 sealing (rule ESEAL) and conditionals (rule EIF) have to be deemed are of the form κα1 → κ for pure functors, which is where higher impure if they generate abstract types – enforced by the notation kinds come into play. We impose the syntactic invariant that a pure ι(Ξ) defined in Figure 3. In that sense, our notion of purity actually function type never has an existential quantifier right of the arrow. corresponds to the stronger property of valuability in the parlance Abstract types are denoted by their type variables – e.g. some of Dreyer [4], which also implies phase separation, i.e., the ability αA.t introduced for A.t – but may generally take the form of a se- to separate static type information from dynamic computation, key mantic path π if they have parameters. Parameters are (only) in- to avoiding the need for dependent types. troduced through pure function abstraction and the aforementioned kind lifting that goes along with it. An abstract type that is the re- Subtyping The subtyping judgement is defined on semantic sult of an application of a pure function (applicative functor, or type types. It generates a coercion function f as computational evidence constructor) F to a value (module) M becomes the application of a of the subtyping relation. The domain of that function always is the higher-kinded type variable representing the constructor to the con- left-hand type Ξ0; to avoid clutter, we omit its explicit annotation crete types σM from the argument, corresponding to the abstract from the λ-terms produced by the rules. The rules mostly follow types α1 in F’s parameter. Because we enforce predicativity, these the structure from [25], merely adding a straightforward rule for argument types have to be small. For example, the type constructor abstract type paths π, which now may occur as “module types”. map (Section 2) has semantic type ∀α.[= α] →P [= αmap(α)], However, we make one structural change: instead of guess- and the application map int translates to αmap(int). ing the substitution for the right-hand side’s abstract types non- The latter forms can appear in arbitrary combination: for in- deterministically in a separate rule (rule U-MATCH in [25]), the stance, an abstract type projected from a functor application, current formulation looks them up algorithmically as it goes, using G(M).t, would map to αG( ).t σM accordingly. the new rule SFORGET to match an individual abstract type. The Figure 3 also defines the subgrammar of small types, which can- reason for this change is merely a technical one: it eliminates the not have quantifiers in them. Moreover, small functions are required need for any significant meta-theory about decidability, which was to be impure, which will simplify type inference (Section 5). somewhat non-trivial before, at least with applicative functors.

7 2015/6/18 Types Γ ` T Ξ Γ ` E :P [= Ξ] e κα = Ω Γ ` D Ξ TPATH TTYPE TBOOL TSTR Γ ` E Ξ Γ ` type ∃α.[= α] Γ ` bool bool Γ ` {D} Ξ Γ ` T ∃α .Σ Γ ` T1 ∃α1.Σ1 1 1 1 Γ, α1,X:Σ1 ` T2 ∃α2.Σ2 κ 0 = κα → κα Γ, α1,X:Σ1 ` T2 ∃α2.Σ2 α2 1 2 TFUN TPFUN Γ ` (X:T ) → T ∀α . Σ → ∃α .Σ 0 0 1 2 1 1 I 2 2 Γ ` (X:T1) ⇒ T2 ∃α2.∀α1. Σ1 →P Σ2[α2 α1/α2] Γ ` T1 ∃α1.Σ1 α1 = α11 ] α12 Γ ` E :P Σ e Γ ` T2 ∃α2.Σ2 Γ, α11, α2 ` Σ2 ≤α12 Σ1.X δ; f TSING TWHERE Γ ` (= E) Σ Γ ` T1 where (.X:T2) ∃α11α2.δΣ1[.X=Σ2]

Declarations Γ ` D Ξ Γ ` T ∃α.Σ Γ ` T ∃α.{X:Σ} DVAR DINCL Γ ` X:T ∃α.{X:Σ} Γ ` include T ∃α.{X:Σ}

Γ ` D1 ∃α1.{X1:Σ1} Γ, α1, X1:Σ1 ` D2 ∃α2.{X2:Σ2} X1 ∩ X2 = ∅ DSEQ DEMPTY Γ ` D1;D2 ∃α1α2.{X1:Σ1, X2:Σ2} Γ `  {}

Expressions Γ ` E : Ξ e Γ(X) = Σ Γ ` T Ξ ι EVAR ETYPE ETRUE Γ ` X :P Σ X Γ ` type T :P [= Ξ] [Ξ] Γ ` true :P bool true

Γ ` X :P bool e Γ ` E1 :ι1 Ξ1 e1 Γ ` Ξ1 ≤ Ξ f1 Γ ` T ΞΓ ` E2 :ι2 Ξ2 e2 Γ ` Ξ2 ≤ Ξ f2 EFALSE EIF Γ ` false :P bool false Γ ` if X then E1 else E2 : T :ι1∨ι2∨ι(Ξ) Ξ if e then f1 e1 else f2 e2 0 0 0 0 Γ ` B :ι Ξ e Γ ` E :ι ∃α.{X :Σ } eX :Σ ∈ X :Σ ESTR EDOT Γ ` {B} :ι Ξ e Γ ` E.X :ι ∃α.Σ unpack hα, yi = e in pack hα, y.Xi

Γ ` X1 :P ∀α. Σ1 →ι Ξ e1 Γ ` T ∃α.ΣΓ, α, X:Σ ` E :ι Ξ e Γ ` X2 :P Σ2 e2 Γ ` Σ2 ≤α Σ1 δ; f EFUN EAPP Γ ` fun (X:T ) ⇒E :P ∀α. Σ →ι Ξ λα.λιX:Σ.e Γ ` X1 X2 :ι δΞ (e1 (δα)(f e2)).ι

Γ ` X :P Σ1 e Γ ` T ∃α.Σ2 Γ ` Σ1 ≤α Σ2 δ; f ESEAL Γ ` X:>T :ι(∃α.Σ2) ∃α.Σ2 pack hδα, f ei Bindings Γ ` B :ι Ξ e Γ ` E :ι ∃α.Σ e Γ ` E :ι ∃α.{X:Σ} e BVAR BINCL Γ ` X=E :ι ∃α.{X:Σ} unpack hα, xi = e in pack hα, {X=x}i Γ ` include E :ι ∃α.{X:Σ} e 0 Γ ` B1 :ι1 ∃α1.{X1:Σ1} e1 X1 = X1 − X2 0 0 Γ, α1, X1:Σ1 ` B2 :ι2 ∃α2.{X2:Σ2} e2 X1:Σ1 ⊆ X1:Σ1 BSEQ BEMPTY 0 0 Γ `  :P {} {} Γ ` B1;B2 :ι1∨ι2 ∃α1α2.{X1:Σ1, X2:Σ2} unpack hα1, y1i = e1 in let X1 = y1.X1 in unpack hα2, y2i = e2 in 0 0 pack hα1α2, {X1 = y1.X1, X2 = y2.X2}i

0 0 0 Subtyping Γ ` Ξ ≤ Ξ f := Γ ` Ξ ≤ Ξ id; f Γ ` Ξ ≤π Ξ δ; f

SPATH SBOOL Γ ` π ≤ π λx.x Γ ` bool ≤ bool λx.x Γ ` Ξ0 ≤ Ξ f Γ ` Ξ ≤ Ξ0 f 0 π = α α0 TYPE FORGET 0 S 0 S Γ ` [= Ξ ] ≤ [= Ξ] λx.[Ξ] Γ ` [= σ] ≤π [= π] [λα .σ/α]; λx.x 0 Γ ` Σ1 ≤π1 Σ1 δ1; f1 0 0 Γ ` {l :Σ } ≤π2 {l: δ1Σ} δ2; f2 δ2Σ1 = Σ1 SEMPTY SSTR 0 0 0 0 Γ ` {l:Σ } ≤ {} λx.{} Γ ` {l1:Σ1, l :Σ } ≤π1π2 {l1:Σ1, l:Σ} δ1δ2; λx.{l1=f1(x.l1), l=(f2 x).l} 0 0 Γ, α ` Σ ≤α0 Σ δ1; f1 ι ≤ ι 0 0 0 0 Γ, α ` δ1Ξ ≤πα Ξ δ2; f2 δ2Σ = Σ Γ, α ` Σ ≤α Σ δ; f α α 6=  0 0 0 SFUN 0 0 0 SABS Γ ` (∀α .Σ →ι0 Ξ ) ≤π (∀α.Σ →ι Ξ) Γ ` ∃α .Σ ≤ ∃α.Σ λx. unpack hα , yi = x in pack hδα, f yi 0 0 δ2; λx. λα. λιy:Σ. f2 ((x (δ1α )(f1 y)).ι )

Figure 4. Elaboration of 1MLex

8 2015/6/18 To this end, the judgement is indexed by a vector π of abstract 4. Full 1ML paths that correspond to the abstract types from the right-hand Ξ. A language without type inference is not worth naming ML. Be- The counterparts of those types have to be looked up in the left- 0 cause that is so, Figure 5 shows the minimal extension to 1MLex hand Ξ , which happens one at a time in rule SFORGET. And necessary to recover ML-style implicit polymorphism. Syntacti- that’s where the predicativity restriction materialises: the rule only cally, there are merely two new forms of type expression. allows a small type on the left. Lookup produces a substitution First, “ ” stands for a type that is to be inferred from context. δ whose domain corresponds to the root variables of the abstract The crucial restriction here is that this can only be a small type. This paths π. Normally, each of π is just a plain abstract type variable fits nicely with the notion of a monotype in core ML, and prevents (which occur free in Ξ in this judgement). But in the formation the need to infer polymorphic types in an analogous manner. rule TPFUN for pure function types, lifting produces more complex On top of this new piece of kernel syntax we allow a type paths. So when subtyping goes inside a pure functor in rule SFUN, annotation “: ” on a function parameter or conditional to be the same abstract paths with skolem parameters have to be formed omitted, thereby recovering the implicitly typed expression syntax for lookup, so that rule SFORGET can match them accordingly. familiar from ML. (At the same time we drop the 1MLex sugar The move to deterministic subtyping allows us to drop the aux- interpreting an unannotated parameter as a type; we only keep that iliary notion of explicit types, which was present in [25] to ensure interpretation in type declarations or bindings.) that non-deterministic lookup can be made deterministic. There Second, there is a new type of implicit function, distinguished is one side effect from dropping the “explicitness” side condition by a leading tick ’ (a choice that will become clear in a moment). from rule TSING, though: subtyping is no longer reflexive. There This corresponds to an ML-style polymorphic type. The parameter are now “monster” types that cannot be matched, not even by them- has to be of type type, whose being small fits nicely with the fact selves. For example, take {} →I ∃α.α, which is created by that ML can only abstract monotypes, and no type constructors. For (= (fun (x : {}) ⇒ ({type t = int; v = 0} :> {type t; v : t}).v)) obvious reasons, an implicit function has to be pure. We write the and is not a subtype of itself (it only contains a use of the abstract semantic type of implicit functions with an arrow →A, in order to type α, no “binding” of the form [= α]; consequently, when recur- reuse notational convention. It is distinct from →ι, however, and 0 0 we do not consider A an actual effect; i.e., A is not included in ι. sively matching ∃α .α ≤ ∃α.α, rule SFORGET is never invoked to introduce the necessary substitution [α0/α] of α by (the renamed As the name would suggest, there are no explicit introduction version of) itself). However, this does not break anything else, so or elimination forms for implicit functions. Instead, they are intro- we make that simplification anyway – if desired, explicitness could duced and eliminated implicitly. The respective typing rules (EGEN easily be revived. and EINST) match common formulations of ML-style polymor- phism [3]. Any pure expression can have its type generalised, which 3.3 Meta-Theory is more liberal than ML’s value restriction [35] (recall that purity also implies that no abstract types are produced). It is relatively straightforward to verify that elaboration is correct: Subtyping allows the implicit elimination of implicit functions PROPOSITION 3.3 (Correctness of 1MLex Elaboration). as well, via instantiation on the left, or skolemisation on the right Let Γ be a well-formed Fω environment. (rules SIMPLL and SIMPLR). This closely corresponds to ML’s 1. If Γ ` T/D Ξ, then Γ ` Ξ:Ω. signature matching rules, which allow any value to be matched by a value of more polymorphic type. However, this behaviour can now 2. If Γ ` E/B :ι Ξ e, then Γ ` e :Ξ, and if ι=P then Ξ=Σ. 0 0 be intermixed with proper “module” types. In particular, that means 3. If Γ ` Ξ ≤αα0 Ξ δ; f and Γ ` Ξ :Ω and Γ, α ` Ξ:Ω, 0 that we allow looking up types from an implicit function, similar to then dom(δ) = α and Γ ` δ :Γ, α and Γ ` f :Ξ → δΞ. other pure functions. For example, the following subtyping holds, Together with the standard soundness result for Fω we can tell by implicitly instantiating the parameter a with int: that 1MLex is sound, i.e., a well-typed 1MLex program will either ’(a : type) ⇒ {type t = a; f : a → t} ≤ {type t; f : int → int} diverge or terminate with a value of the right type: With these few extensions, the Map functor from Section 2 can THEOREM 3.4 (Soundness of 1MLex). If · ` E :Ξ e, then either e ↑ or e ,→∗ v such that · ` v :Ξ and v is a value. now be written in 1ML very much like in traditional ML: type MAP = More interestingly, the 1MLex type system is also decidable: { THEOREM 3.5 (Decidablity of 1MLex Elaboration). type key; All 1MLex elaboration judgements are decidable. type map a; empty ’a : map a; This is immediate for all but the subtyping judgement, since they lookup ’a : key → map a → opt a; are syntax-directed and inductive, with no complicated side con- add ’a : key → a → map a → map a ditions. The rules can be read directly as an inductive algorithm. }; (In the case of where, it seems necessary to find a partitioning Map (Key : EQ) :> MAP where (type .key = Key.t) = α1 = α11 ] α12, but it is not hard to see that the subtyping premise { can only possibly succeed when picking α12 = fv(Σ1) ∩ α1.) type key = Key.t; The only tricky judgement is subtyping. Although it is syntax- type map a = key → opt a; directed as well, the rules are not actually inductive: some of their empty = fun x ⇒ none; premises apply a substitution δ to the inspected types. Alas, that is lookup x m = m x; add x y m = fun z ⇒ if Key.eq z x then some y else m z exactly what can cause undecidability (see Section 1.2). } The restriction to substituting small types saves the day. We can define a weight metric over semantic types such that a quantified The MAP signature here uses one last bit of syntactic sugar defined type variable has more weight than any possible substitution of in Figure 5, which is to allow implicit parameters on the left-hand that variable with a small type. We can then show that the overall side of declarations, like we already do for explicit parameters (cf. weight of types involved decreases in all subtyping rules. For space Figure 1), The tick becomes a pun on ML’s type variable syntax, reasons, the details appear in the Technical Appendix [23]. but without relying on brittle implicit scoping rules.

9 2015/6/18 Syntax (expressions) if E1 then E2 else E3 := if E1 then E2 else E3: fun X ⇒ E := fun (X: ) ⇒ E (types) T ::= ... | | ’(X:type) ⇒ T (types) ’X ⇒ T := ’(X:type) ⇒ T (declarations) X ’Y :T := X : ’(Y : type) ⇒ T Semantic Types

(large signatures) Σ ::= ... | ∀α.{} →A Σ

Types Γ ` T Ξ Γ ` σ :Ω Γ, α, X:[= α] ` T Σ κα = Ω TINFER TIMPL Γ ` σ Γ ` ’(X:type) ⇒ T ∀α.{} →A Σ Expressions Γ ` E :ι Ξ e 0 Γ, α ` E :P Σ e κα = Ω Γ ` E :ι ∃α.∀α .{} →A Σ e Γ, α ` σ : κα0 EGEN 0 EINST Γ ` E :P ∀α.{} →A Σ λα.λAx:{}.e Γ ` E :ι ∃α.Σ[σ/α ] unpack hα, xi = e in pack hα, (x σ {}).Ai 0 Subtyping Γ ` Ξ ≤π Ξ δ; f 0 0 0 Γ ` σ : κα0 Γ ` Σ [σ/α ] ≤π Σ δ; f Γ, α ` Σ ≤π Σ δ; f fv(δπ) 6∩ α 0 0 SIMPLL 0 SIMPLR Γ ` ∀α .{} →A Σ ≤π Σ δ; λx. f ((x σ {}).A) Γ ` Σ ≤π ∀α.{} →A Σ δ; λx. λα.λAy:{}.f x

Figure 5. Extension to Full 1ML

Space reasons forbid more extensive examples, but it should First, it may be necessary to infer a small type from a large one be clear from the rules that there is nothing preventing the use of via subtyping. For example, we might encounter the inequation implicit functions as first-class values, given sufficient annotations for their (large) types. For example: ∀α.[= α] →P [= α] ≤ υ

(fun (id : ’a ⇒ a → a) ⇒ {x = id 3; y = id true})(fun x ⇒ x) which can be solved just fine with υ = [= σ] →I [= σ] for The type of the argument expression is generalised implicitly and any σ; through contravariance, similar situations can arise with an matches the implicitly polymorphic parameter via subtyping. inference variable on the left. Because of this, it is not enough to just consider the cases υ ≤ σ or σ ≤ υ for resolving υ. Instead, when the subtyping algorithm hits υ ≤ Σ or Σ ≤ υ (rules ISRESL 5. Type Inference and ISRESR, where Σ may or may not be small) it invokes the With the additions from Figure 5 we have turned the deterministic auxiliary Resolution judgement Γ `θ υ ≈ Σ, which only resolves typing and elaboration judgements of 1MLex non-deterministic. υ so far as to match the shape of Σ and inserts fresh inference They have to guess types (in rules TINFER,EINST,SIMPLL) and variables for its subcomponents. After that, subtyping “tries again”. quantifiers (in rule EGEN). Moreover, we have decide when to Second, an inference variable υ can be introduced in the scope apply rules EGEN and EINST. Clearly, an algorithm is needed. of abstract types (i.e., regular type variables). In general, it would Fortunately, what’s going on is not fundamentally different from be incorrect to resolve υ to a type containing type variables that are core ML. Where core ML would require type equivalence (and type not in scope for all occurrences of υ in a derivation. To prevent that, inference would use unification), the 1ML rules require subtyping. each υ is associated with a set ∆υ of type variables that are known That may seem scary at first, but a closer inspection of the to be in scope for υ everywhere. The set is verified when resolving subtyping rules reveals that, when applied to small types, subtyping υ (see rule IRPATH in particular). The set also is propagated to 0 almost degenerates to type equivalence! The only exception is any other υ the original υ is unified with, by intersecting ∆υ0 00 width subtyping on records. The 1ML type system only promises with ∆υ – or more precisely, by introducing a new variable υ 0 to infer small types, so we are not far away from conventional ML. with the intersected ∆υ00 , and replacing both υ and υ with it (see That is, we can still formulate an algorithm based on inference e.g. rule IRINFER); that way, we can treat ∆υ as a globally fixed variables (which we write υ) holding place for small types. set for each υ, and do not need to maintain those sets separately. Inference variables also have to be updated when type variables go 5.1 Algorithm out of scope. That is achieved by employing the following notation Figure 6 shows the essence of this algorithm, formulated via in- in rules locally extending Γ with type variables (we write undet(Ξ) ference rules. The basic idea is to modify the declarative typing to denote the free inference variables of Ξ): 0 0 0 0 00 rules such that wherever they have to guess a (small) type, we sim- Γ; Γ θ`θ0 J := Γ, Γ θ`θ00 J ∧ θ = [υ /υ] ◦ θ ply introduce a (free) inference variable. Furthermore, the rules are where υ = undet(θ00J ) augmented with outputting a substitution θ for resolved inference 0 υ fresh with ∆υ0 = ∆υ ∩ dom(Γ) variables: all judgements have the form Γ `θ J , which, roughly, implies the respective declarative judgement υ, θΓ ` θJ , where υ The net effect is that all local α’s from Γ0 are removed from all binds the unresolved inference variables that still appear free in θΓ ∆-sets of inference variable remaining after executing Γ, Γ0 ` J . or θJ . Notation is simplified by abbreviations of the form We omit θ in this notation when it is the identity. θ 0 00 Implicit functions work mostly like in ML. Like with let- Γ θ`θ0 J := θΓ `θ00 J ∧ θ = θ ◦ θ polymorphism, generalisation is deferred to the point where an θ where J is meant to apply θ to J ’s “inputs”. It’s used to thread and expression is bound – in this case, in rule IBPVAR. This works de- compose substitutions through multiple premises (e.g. rule IEIF). spite 1ML’s first-class polymorphism, thanks to the desugaring into There are two main complications, both due to the fact that, a kernel syntax requiring named variables in most places (Figure 1). unlike in old ML, small types can be intermixed with large ones. Consider the example from the previous section:

10 2015/6/18 Types Γ `θ T Ξ ! Γ `θ E :P [= Ξ] υ fresh ∆υ = dom(Γ) Γ `θ E :Σ ITPATH ITINFER ITTYPE ITSING Γ `θ E Ξ Γ `[] υ Γ `[] type ∃α.[= α] Γ `θ (= E) Σ

Γ `θ1 T1 ∃α1.Σ1 Γ; α1,X:Σ1 θ `θ T2 ∃α2.Σ2 κα0 = κα → κα 1 2 2 1 2 Γ; α, X:[= α] `θ T Σ κα = Ω ITPFUN ITIMPL 0 0 Γ `θ ’(X:type) ⇒ T ∀α.{} →A Σ Γ `θ2 (X:T1) ⇒ T2 ∃α2.∀α1. Σ1 →P Σ2[α2 α1/α2]

Expressions ! Γ `θ E :ι Ξ Γ `θ0 X :P bool Γ θ0`θ1 E1 :ι1 Ξ1 Γ θ3`θ4 Ξ1 ≤ Ξ ! 0 0 Γ(X) = Σ Γ θ2`θ3 T ΞΓ θ1`θ2 E2 :ι2 Ξ2 Γ θ4`θ5 Ξ2 ≤ Ξ Γ `θ E :ι ∃α.{X:Σ, X :Σ } IEVAR IEIF IEDOT Γ `[] X :P Σ Γ `θ5 if X then E1 else E2 : T :ι1∨ι2∨ι(Ξ) Ξ Γ `θ E.X :ι ∃α.Σ ! Γ `θ1 X1 :P ∀α. Σ1 →ι Ξ Γ ` X : Σ Γ ` Σ ≤ Σ δ Γ `θ1 T ∃α.Σ Γ; α, X:Σ θ1`θ2 E :ι Ξ θ1 θ2 2 P 2 θ2 θ3 2 α 1 IEFUN IEAPP Γ `θ2 fun (X:T ) ⇒E :P ∀α. Σ →ι Ξ Γ `θ3 X1 X2 :ι δΞ

Bindings Γ `θ B :ι Ξ Γ `θ E :I ∃α.Σ Γ `θ E :P Σ υ = undet(θΣ) − undet(θΓ) κα = Ω IBVAR IBPVAR Γ `θ X=E :I ∃α.{X :Σ} Γ `θ X=E :P {X : ∀α.{} →A Σ[α/υ]}

0 Subtyping ! ! 0 0 Γ `θ Ξ ≤π Ξ δ Γ `θ υ ≈ ΣΓ θ`θ0 υ ≤ Σ Γ `θ υ ≈ Σ Γ θ`θ0 Σ ≤ υ REFL RESL RESR IS IS 0 IS Γ `[] υ ≤ υ Γ `θ0 υ ≤ Σ Γ `θ0 Σ ≤ υ 0 0 0 Γ, α `θ1 Σ ≤α Σ δ1 ι ≤ ι Γ; α ` δ Ξ0 ≤ Ξ δ θ δ Σ = θ Σ 0 0 θ1 θ2 1 πα 2 2 2 2 Γ; α `θ Σ ≤α Σ δ α α 6=  0 0 0 ISFUN 0 0 ISABS 0 Γ `θ2 (∀α .Σ →ι Ξ ) ≤π (∀α.Σ →ι Ξ) δ2 Γ `θ ∃α .Σ ≤ ∃α.Σ 0 0 0 υ fresh ∆υ = dom(Γ) Γ `θ Σ [υ/α ] ≤π Σ δ Γ; α `θ Σ ≤π Σ δ; f α 6∩ fv(θδ) 0 0 ISIMPLL 0 ISIMPLR Γ `θ ∀α .{} →A Σ ≤π Σ δ Γ `θ Σ ≤π ∀α.{} →A Σ δ ! Resolution Γ `θ υ ≈ Σ := υ∈ / undet(Σ) ∧ Γ `θ υ ≈ Σ Γ `θ υ ≈ Σ

00 0 υ fresh ∆υ00 = ∆υ ∩ ∆υ0 α ∈ ∆υ υ fresh ∆υ0 = ∆υ INFER PATH 0 IR IR Γ `[υ00/υ,υ00/υ0] υ ≈ υ Γ `[α υ0/υ] υ ≈ α σ 0 υ fresh ∆υ0 = ∆υ υ1, υ2 fresh ∆υ = ∆υ = ∆υ IRBOOL IRTYPE 1 2 IRFUN 0 Γ `[bool/υ] υ ≈ bool Γ `[[=υ ]/υ] υ ≈ [= Ξ] Γ `[(υ1→Iυ2)/υ] υ ≈ ∀α.Σ →ι Ξ

! 0 0 0 Instantiation Γ `θ E :ι Ξ := Γ `θ E :ι Ξ ∧ Γ θ`θ0 Ξ  Ξ Γ `θ Ξ  Ξ

0 0 Γ; α `θ υ ≈ Σ υ fresh ∆υ = dom(Γ, α)Γ `θ ∃α.Σ[υ/α ]  ∃α.Σ REFL RES IMPL IN IN 0 0 IN Γ `θ Ξ  Ξ Γ `θ ∃α.υ  ∃α.Σ Γ `θ ∃α.∀α .{} →A Σ  ∃α.Σ

Figure 6. Type Inference for 1ML (Excerpt)

(fun (id : ’a ⇒ a → a) ⇒ {x = id 3; y = id true})(fun x ⇒ x) 5.2 Incompleteness There are a couple of sources of incompleteness in this algorithm: Desugaring rewrites this application into an expression that has an explicit binding for the argument (fun x ⇒ x). The same observa- Width subtyping Subtyping like υ ≤ {l:σ} does not determine tions applies to other relevant forms. Hence, generalising bindings the shape of the record type υ: the set of labels can still vary. in the kernel syntax is still enough. Consequently, the Resolution judgement has no rule for structures – Similarly, instantiation is deferred to rules corresponding to instead a structure type must be determined by the previous context. elimination forms (e.g. IEIF,IEDOT,IEAPP, but also ITPATH). This is, in fact, similar to Standard ML [19], where record types There, the auxiliary Instantiation judgement is invoked (as part cannot be inferred either, and require type annotation. However, ! of the notation Γ `θ J .). This does not only instantiate implicit SML implementations typically ensure that type inference is still functions (possibly under existential binders), it also may resolve order-independent, i.e., the information may be supplied after the inference variables to create a type whose shape matches the shape point of use. They do so by employing a simple form of row that is expected by the invoking rule. inference. A similar approach would be possible for 1ML, but Instantiation can also happen implicitly as part of subtyping subtyping would still make more programs fail to type-check. For (rule ISIMPLL), which covers the case where a polymorphic value the sake of presentation, we decided to err on the side of simplicity. is matched against a monomorphic (or other polymorphic) parame- The real solution of course would be to incorporate not just row ter. For example, ∀α1α2.{} →A α1 →I α2 ≤ ∀β.{} →A β →I β inference but row polymorphism [21], so that width subtyping on will be checked by first applying ISIMPLR, turning the right type structures can be recast as universal and existential quantification. monomorphic, and then instantiating the left with ISIMPLL, so that We leave investigating such an extension for future work (though the check is down to υ1 →I υ2 ≤ β →I β, unifying easily. we note that include would still represent a challenge).

11 2015/6/18 Type Scoping Tracking of the sets ∆υ is conservative: after leav- THEOREM 5.2 (Termination of 1ML Inference). ing the scope of a type variable α, we exclude any solution for All 1ML type inference judgements terminate. υ that would still involve α, even if υ only appears inside a type binder for α. Consider, for example [5]: We have to defer the details to the Technical Appendix [23]. G (x : int) = {M = {type t = int; v = x} :> {type t; v : t}; f = id id}; 6. Related Work C = G 3; x = C.f (C.M.v); Packaged Modules The first concrete proposal for extending ML with packaged modules was by Russo [27], and is implemented in and assume id : ’(a : type) ⇒ a → a. Because id is impure, the Moscow ML. Later work on type systems for modules routinely definition of f is impure, and its type cannot be generalised; more- included them [6, 4, 24, 25], and variations have been implemented over, G is impure too. The algorithm will infer G’s type as in other ML dialects, such as Alice ML [22] and OCaml [7]. int → ∃β.{M : {t : [= β], v : β}, f : υ →I υ} To avoid soundness issues in the combination with applicative functors, Russo’s original proposal conservatively allowed unpack- β∈ / ∆ β with υ (because goes out of scope the moment we bind it ing a module only local to core-level expressions, but this restric- with a local quantifier), and then generalises to tion has been lifted in later systems, restricting only the occurrence G : ∀α.{} →A int → ∃β.{M : {t : [= β], v : β}, f : α →I α} of unpacking inside applicative functors. But its too late, the solution υ = β, which would make x well- First-Class Modules The first to unify ML’s stratified type sys- typed, is already precluded. When typing C, instantiating α with β tem into one language was Harper & Mitchell’s XML calculus [10]. is not possible either, because β can only come into scope again It is a dependent type theory modeling modules as terms of Martin- after having applied an argument for α already. Lof-style¨ Σ and Π types, closely following MacQueen’s original Although not well-known, this very problem is already present ideas [17]. The system enforces predicativity through the introduc- in good old ML, as Dreyer & Blume point out [5]: existing type in- tion of two universes U1 and U2, which correspond directly to our ference implementations are incomplete, because combinations of notion of small and large type, and both systems allow both U1 : U2 functors and the value restriction (like above) do not have principal and U1 ⊆ U2. XML lacks any account of either sealing or translu- types. Interestingly, a variation of the solution suggested by Dreyer cency, which makes it fall short as a foundation for modern ML. & Blume (implicitly generalising the types of functors) is implied That gap was closed by Harper & Lillibridge’s calculus of by the 1ML typing rules: since functors are just functions, their translucent sums [9, 16], which also was a dependently typed lan- types can already be generalised. However, generalisation happens guage of first-class modules. Its main novelty were records with outside the abstraction, which is more rigid than what they propose both opaque and transparent type components, directly modeling (but which is not expressible in System Fω). Consequently, 1ML ML structures. However, unlike XML, the calculus is impredica- can type some examples from their paper, but not all. tive, which renders it undecidable. Translucent sums where later superseded by the notion of sin- Purity Annotations Due to effect subtyping, a function type as gleton types [31]; they formed the foundation of Dreyer et al.’s type an upper bound does not determine the purity of a smaller type. theory for higher-order modules [6]. However, to avoid undecid- Technically, that does not affect completeness, because we defined ability, this system went back to second-class modules. small types to only include impure functions: the resolution rule One concern in dependently typed theories is phase separation: IRFUN can always pick I. But arguably, that is cheating a little to enable compile-time checking without requiring core-level com- by side-stepping the issue. In particular, it makes an extension of putation, such theories must be sufficiently restricted. For example, the notion of (im)purity to other effects, as suggested in Section 2, Harper et al. [11] investigate phase separation for the XML calcu- somewhat inconvenient, because pure function types could not be lus. The beauty of the F-ing approach is that it enjoys phase sepa- inferred in parameter positions. ration by construction, since it does not use dependent types. Again, the solution would be more polymorphism, in this case a simple form of effect polymorphism [32]. That will be future work. Applicative Functors Leroy proposed applicative semantics for functors [15], as implemented in OCaml. Russo later combined Despite these limitiations, we found 1ML inference quite usable. both generative and applicative functors in one language [28] and In practice, MLs have long given up on complete type inference: implemented them in Moscow ML; others followed [30, 6, 4, 25]. various limitations exist in both SML and OCaml (and the extended A system like Leroy’s, where all functors are applicative, would language family including Haskell), necessitating type annotations be incompatible with first-class modules, because the application or declarations. In our limited experience with a prototype, 1ML is in type paths like F(A).t needs to be phase-separable to enable not substantially worse, at least not when used in the same manner type checking, but not all functions are. Russo’s system has similar as traditional ML. In fact, we conjecture that any SML program problems, because it allows converting generative functors into not using features omitted from 1ML – but including both modules applicative ones. Like Dreyer [4] or F-ing modules [25], 1ML and Damas/Milner polymorphism – can be directly transliterated hence combines applicative (pure) and generative (impure) functors into 1ML without adding type annotations. such that applicative semantics is only allowed for functors whose body is both pure and separable. In F-ing modules, applicativity is 5.3 Metatheory even inferred from purity, and sealing itself not considered impure; If the inference algorithm isn’t complete, then at least it is sound. the Technical Appendix [23] shows a similar extension to 1ML. That is, we can show the following result: In the version of 1ML shown in the main paper, an applicative functor can only be created by sealing a fully transparent functor THEOREM 5.1 (Correctness of 1ML Inference). with pure function type, very much like in Shao’s system [30]. Let υ, Γ be a well-formed Fω environment. 0 Type Inference There has been little work that has considered 1. If Γ `θ T/D Ξ, then υ , θΓ ` T/D θΞ. 0 type inference for modules. Russo examined the interplay between 2. If Γ `θ E/B :ι Ξ e, then υ , θΓ ` E/B :ι θΞ θe. core-level inference and modules [28], elegantly dealing with vari- 0 0 3. If Γ `θ Ξ ≤π Ξ δ;f and υ, Γ ` Ξ :Ω and υ, Γ, α ` Ξ:Ω, able scoping via unification under a mixed prefix. Dreyer & Blume 0 0 then υ , θΓ ` θΞ ≤π θΞ θδ; θf. investigated how functors interfere with the value restriction [5].

12 2015/6/18 At the same time, there have been ambitious extensions of ML- [7] J. Garrigue and A. Frisch. First-class modules and composable signa- style type inference with higher-rank or impredicative types [8, tures in Objective Caml 3.12. In ML, 2010. 14, 33, 29]. Unlike those systems, 1ML never tries to infer a [8] J. Garrigue and D. Remy.´ Semi-explicit first-class polymorphism for polymorphic type annotation: all guessed types are monomorphic ML. Information and Computation, 155(1-2), 1999. and polymorphic parameters require annotation. [9] R. Harper and M. Lillibridge. A type-theoretic approach to higher- On the other hand, 1ML allows bundling types and terms to- order modules with sharing. In POPL, 1994. gether into structures. While it is necessary to explicitly annotate terms that contain types, associated type quantifiers (both univer- [10] R. Harper and J. C. Mitchell. On the type structure of Standard ML. In ACM TOPLAS, volume 15(2), 1993. sal and existential) and their actual introduction and elimination are implicit and effectively inferred as part of the elaboration process. [11] R. Harper, J. C. Mitchell, and E. Moggi. Higher-order modules and the phase distinction. In POPL, 1990. 7. Future Work [12] R. Harper and B. Pierce. Design considerations for ML-style module systems. In B. C. Pierce, editor, Advanced Topics in Types and Pro- 1ML, as shown here, is but a first step. There are many possible gramming Languages, chapter 8, pages 293–346. MIT Press, 2005. improvements and extensions. [13] R. Harper and C. Stone. A type-theoretic interpretation of Standard Implementation We have implemented a simple prototype inter- ML. In Proof, Language, and Interaction: Essays in Honor of Robin preter for 1ML (mpi-sws.org/˜rossberg/1ml/), but it would be Milner. MIT Press, 2000. great to gather more experience with a “real” implementation. [14] D. Le Botlan and D. Remy.´ MLF: Raising ML to the power of System Applicative Functors We would like to extend 1ML’s rather basic F. In ICFP, 2003. notion of applicative functor with pure sealing a` la F-ing modules [15] X. Leroy. Applicative functors and fully transparent higher-order (see the Technical Appendix [23]), but more importantly, make it modules. In POPL, 1995. properly abstraction-safe by tracking value identities [25]. [16] M. Lillibridge. Translucent Sums: A Foundation for Higher-Order Implicits The domain of implicit functions in 1ML is limited to Module Systems. PhD thesis, CMU, 1997. type type. Allowing richer types would be a natural extension, and [17] D. MacQueen. Using dependent types to express modular structure. might provide functionality like Haskell-style type classes [34]. In POPL, 1986. Type Inference Despite the ability to express first-class and [18] R. Milner. A theory of type polymorphism in programming lan- higher-order polymorphism, inference in 1ML is rather simple. guages. JCSS, 17:348–375, 1978. Perhaps it is possible to combine 1ML elaboration with some of [19] R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of the more advanced approaches to inference described in literature. Standard ML (Revised). MIT Press, 1997. More Polymorphism Replacing more of subtyping with poly- [20] J. C. Mitchell and G. D. Plotkin. Abstract types have existential type. morphism might lead to better inference: row polymorphism [21] ACM TOPLAS, 10(3):470–502, July 1988. could express width subtyping, and simple effect polymorphism [32] [21]D.R emy.´ Records and variants as a natural extension of ML. In would allow more extensive use of pure function types. POPL, 1989. Recursive Modules In [24] we gave a fully general design for [22] A. Rossberg. The Missing Link – Dynamic components for ML. In recursive modules, elaborating into an extension of System F. It ICFP, 2006. would be interesting (but complicated) to redo it 1ML-style, in [23] A. Rossberg. 1ML – Core and modules as one (Technical Appendix), order to achieve a more uniform treatment of recursion for 1ML. 2015. mpi-sws.org/˜rossberg/1ml/. Dependent Types Finally, 1ML goes to length to push the bound- [24] A. Rossberg and D. Dreyer. Mixin’ up the ML module system. ACM aries of non-dependent typing. It’s a legitimate question to ask, TOPLAS, 35(1), 2013. what for? Why not go fully dependent? Well, even then sealing ne- [25] A. Rossberg, C. Russo, and D. Dreyer. F-ing modules. JFP, cessitates some equivalent of weak sums (existential types). Incor- 24(5):529–607, 2014. porating them, along with the quantifier pushing of our elaboration, [26] C. Russo. Non-dependent types for Standard ML modules. In PPDP, into a dependent type system might pose an interesting challenge. 1999. Acknowledgements [27] C. Russo. First-class structures for Standard ML. Nordic Journal of Computing, 7(4):348–374, 2000. I thank Scott Kilpatrick, Claudio Russo, Gabriel Scherer, and the anonymous reviewers for their careful and helpful comments. [28] C. Russo. Types for Modules. ENTCS, 60, 2003. [29] C. Russo and D. Vytiniotis. QML: Explicit first-class polymorphism References for ML. In ML, 2009. [1] H. Barendregt. Lambda calculi with types. In S. Abramsky, D. Gab- [30] Z. Shao. Transparent modules with fully syntactic signatures. In bay, and T. Maibaum, editors, Handbook of Logic in Computer Sci- ICFP, 1999. ence, vol. 2, chapter 2, pages 117–309. Oxford University Press, 1992. [31] C. A. Stone and R. Harper. Extensional equivalence and singleton [2] S. K. Biswas. Higher-order functors with transparent signatures. In types. ACM TOCL, 7(4):676–722, 2006. POPL, 1995. [32] J.-P. Talpin and P. Jouvelot. Polymorphic type, region and effect [3] L. Damas and R. Milner. Principal type-schemes for functional inference. JFP, 2(3):245271, 1992. programs. In POPL, 1982. [33] D. Vytiniotis, S. Weirich, and S. Peyton Jones. FPH: First-class [4] D. Dreyer. Understanding and Evolving the ML Module System. PhD polymorphism for Haskell. In ICFP, 2008. thesis, CMU, 2005. [34] P. Wadler and S. Blott. How to make ad-hoc polymorphism less ad [5] D. Dreyer and M. Blume. Principal type schemes for modular pro- hoc. In POPL, 1989. grams. In ESOP, 2007. [35] A. Wright.Simple imperative polymorphism.LASC,8:343–356, 1995. [6] D. Dreyer, K. Crary, and R. Harper. A type system for higher-order modules. In POPL, 2003.

13 2015/6/18 A. System Fω Type Equivalence τ ≡ τ 0 A.1 Syntax τ 0 ≡ τ τ ≡ τ 0 τ 0 ≡ τ 00 Figure 2 gave the syntax of the variant of impredicative System Fω τ ≡ τ τ ≡ τ 0 τ ≡ τ 00 that we use as the target of our elaboration translation. It includes 0 0 0 record types (where we assume that labels are always disjoint), but τ1 ≡ τ1 τ2 ≡ τ2 τ ≡ τ 0 0 0 is otherwise completely standard. τ1 → τ2 ≡ τ1 → τ2 {l:τ} ≡ {l:τ } The syntactic sugar for n-ary pack’s and unpack’s, that intro- 0 0 duce/eliminate existential types ∃α.τ quantifying over several type τ ≡ τ τ ≡ τ variables at once, is defined as follows: ∀α:κ.τ ≡ ∀α:κ.τ 0 ∃α:κ.τ ≡ ∃α:κ.τ 0 0 0 0 ∃.τ := τ τ ≡ τ τ1 ≡ τ1 τ2 ≡ τ2 0 0 0 0 ∃α.τ := ∃α1.∃α .τ λα:κ.τ ≡ λα:κ.τ τ1 τ2 ≡ τ1 τ2

pack h, ei∃.τ0 := e α∈ / fv(τ) pack hτ, ei∃α.τ0 := pack hτ1, 0 (λα:κ.τ1) τ2 ≡ τ1[τ2/α] (λα:κ.τ α) ≡ τ 0 pack hτ , ei∃α .τ0[τ1/α1]i∃α.τ0 unpack h, x:τi = e1 in e2 := let x:τ = e1 in e2 A.3 Dynamic Semantics unpack hα, x:τi = e1 in e2 := unpack hα1, x1i = e1 in 0 We assume a standard left-to-right call-by-value evaluation order: unpack hα , x:τi = x1 in e2 let x:τ = e in e := (λx:τ.e ) e 1 2 2 1 Reduction e ,→ e0 (where τ = τ τ 0 and α = α α0) 1 1 (λx:τ.e) v ,→ e[v/x] Likewise, n-ary forms of other constructs (e.g. universals, lambdas, {l1=v1, l=v, l2=v2}.l ,→ v or applications) are defined in all instances in the obvious way. (λα:κ.e) τ ,→ e[τ/α] unpack hα, xi = pack hτ, viτ 0 in e ,→ e[τ/α][v/x] A.2 Static Semantics e ,→ e0 The only point of note about the static semantics is that, unlike 0 in most presentations, typing environments Γ permit shadowing of C[e] ,→ C[e ] bindings for value variables x (but not for type variables α). We where: write Γ(x) to refer to the type in the right-most binding of x in Γ. (values) v ::= λx:τ.e | {l=v} | λα:κ.e | pack hτ, viτ (contexts) C ::= [] | C e | v C | {l1=v, l=C, l2=e} | C.l | Environments Γ ` 2 C τ | pack hτ, Ciτ | unpack hα, xi=C in e Γ ` 2 α∈ / dom(Γ) Γ ` τ :Ω A.4 Properties · ` 2 Γ, α:κ ` 2 Γ, x:τ ` 2 The calculus as defined enjoys the standard soundness properties:

Types Γ ` τ : κ THEOREM A.1 (Preservation). If · ` e : τ and e ,→ e0, then · ` e0 : τ. Γ ` τ1 :ΩΓ ` τ2 :Ω Γ ` τ :Ω Γ ` τ1 → τ2 :Ω Γ ` {l:τ} :Ω THEOREM A.2 (Progress). If · ` e : τ and e 6= v for any v, then e ,→ e0 for some e0. Γ ` 2 Γ, α:κ ` τ :Ω Γ, α:κ ` τ :Ω Γ ` α : Γ(α) Γ ` ∀α:κ.τ :Ω Γ ` ∃α:κ.τ :Ω The proofs are entirely standard.

0 0 0 Γ, α:κ ` τ : κ Γ ` τ1 : κ → κ Γ ` τ2 : κ A.5 Substitution 0 Γ ` λα:κ.τ : κ → κ Γ ` τ1 τ2 : κ Our semantics makes fair use of parallel type substitutions. We write them as [τ/α], assuming that both vectors have the same arity, Terms Γ ` e : τ and use δ to range over them. The following definitions and lemmas Γ ` 2 Γ ` e : τ 0 τ 0 ≡ τ Γ ` τ :Ω are relevant: Γ ` x : Γ(x) Γ ` e : τ DEFINITION A.3 (Typing of Type Substitutions). Let δ = [τ/α]. 0 0 0 0 We write Γ ` δ :Γ if and only if Γ, x:τ ` e : τ Γ ` e1 : τ → τ Γ ` e2 : τ 0 0 1. Γ ` 2, Γ ` λx:τ.e : τ → τ Γ ` e1 e2 : τ 2. α ⊆ dom(Γ), Γ ` e : τ Γ ` e : {l:τ, l0:τ 0} 3. for all α ∈ dom(Γ), Γ0 ` δα : Γ(α), Γ ` {l=e} : {l:τ} Γ ` e.l : τ 4. for all x ∈ dom(Γ), Γ0 ` x : δ(Γ(x)).

0 Γ, α:κ ` e : τ Γ ` e : ∀α:κ.τ 0 Γ ` τ : κ LEMMA A.4 (Type Substitutions). Let Γ ` δ :Γ. Then: Γ ` λα:κ.e : ∀α:κ.τ Γ ` e τ : τ 0[τ/α] 1. If Γ ` τ : κ, then Γ0 ` δα : κ. 0 Γ ` τ : κ Γ ` e : τ 0[τ/α]Γ ` ∃α:κ.τ 0 :Ω 2. If Γ ` e : τ, then Γ ` δe : δτ. 0 Γ ` pack hτ, ei∃α:κ.τ 0 : ∃α:κ.τ

0 0 Γ ` e1 : ∃α:κ.τ Γ, α:κ, x:τ ` e2 : τ Γ ` τ :Ω Γ ` unpack hα, xi=e1 in e2 : τ

14 2015/6/18 B. Decidability of Subtyping Syntax In order to prove that subtyping terminates, we define the following (types) T ::= ... | wrap T weight functions over semantic types: (expressions) E ::= ... | wrap X:T | unwrap X:T S[[bool]] = 1 wrap E : T := let X=E in wrap X : T S[[α]] = 1 unwrap E : T := let X=E in unwrap X : T S[[π σ]] = S[[π]]+ S[[σ]]+ 1 S[[λα.σ]] = S[[σ]]+ 1 Semantic Types S[[[= Ξ]]] = S[[Ξ]]+ 1 (large) Σ ::= ... | [Ξ] S[[{}]] = 1 (small) σ ::= ... | [Ξ] S[[{l0:Σ0, l:Σ}]] = S[[Σ0]]+ S[[{l:Σ}]]+ 1 Desugarings into Fω: S[[∀α.Σ →ι Ξ]] = S[[Σ]]+ S[[Ξ]]+ 1 S[[∃α.Σ]] = S[[Σ]] (types) (terms) Q[[bool]] = 0 [Ξ] := {val :Ξ} [e] := {val = e} Q[[α]] = 0 Q[[π σ]] = 0 Types Γ ` T Ξ Γ ` T Ξ Q[[λα.σ]] = 0 TWRAP Γ ` wrap T [Ξ] Q[[[= Ξ]]] = 2 · Q[[Ξ]] Q[[{}]] = 0 Expressions Γ ` E :ι Ξ e Q[[{l0:Σ0, l:Σ}]] = Q[[Σ0]]+ Q[[{l:Σ}]] Γ ` X :P Σ e Γ ` T [Ξ] Γ ` Σ ≤ Ξ f Q[[∀α.Σ →ι Ξ]] = |α| + Q[[Σ]]+ Q[[Ξ]] EWRAP Q[[∃α.Σ]] = |α| + Q[[Σ]] Γ ` wrap X:T :P [Ξ] [f e]

W [[Ξ]] = hQ[[Ξ]],S[[Ξ]]i 0 0 Γ ` X :P [Ξ ] e Γ ` T [Ξ] Γ ` Ξ ≤ Ξ f βη EUNW The definitions assume -normal form; in the case of paths and Γ ` unwrap X:T :ι(Ξ) Ξ f (e.val) structures, they are inductive over the vector of arguments and components, respectively. Subtyping Γ ` Ξ0 ≤ Ξ δ; f We apply addition point-wise to weight pairs W [[]], and impose a lexicographic ordering on them: SWRAP Γ ` [Ξ] ≤ [Ξ] λx:[Ξ].x hq, si < hq0, s0i :⇔ q < q0 ∨ (q = q0 ∧ s < s0) Figure 7. Extension with Impredicativity Small types don’t contain quantifiers, so Q[[σ]] = 0. We extend the notion of small type to (higher-order) type constructors and substitutions: a type constructor is small if it is of the form λα.σ; a typing, then, does not apply to wrapped types (rule SWRAP). This substitution δ is small if δ(α) is small for all α. Then, abbreviating way, any infinite recursion is avoided; the weight function for the |dom(δ)| to |δ|: termination proof (Appendix B) can trivially be extended to the ad- ditional form of type as S[[[Ξ]]] = 1 and Q[[[Ξ]]] = 0. LEMMA B.1 (Weight Reduction under Small Substitution). Let δ be a small substitution. The syntax we chose here is reminiscent of packaged modules (Section 1.1), but the construct has far more specialised use cases in 1. Q[[δΞ]] = Q[[Ξ]]. 1ML. In particular, wrapping is never needed if one merely wants 2. W [[δΞ]] < h|1, 0i + W [[Ξ]]. to abstract over a value of large type (like is the case in conventional 3. W [[δΞ]] ≤ h|δ|, 0i + W [[Ξ]]. ML with packaged modules). It is only needed in the rare case where one wants to type-abstract over a large type. Note how this lemma crucially relies on δ being small. The proper- As an example, consider a Church encoding of the option type: ties are essential in proving the case of rule SFUN for the termina- tion lemma: type OPT = { LEMMA B.2 (Termination of Algorithmic Subtyping). type opt a Let Γ be a well-formed context, Γ ` Ξ0 :Ω and Γ, α ` Ξ:Ω and none a : opt a π = α α0. Then Γ ` Ξ0 ≤ Ξ δ; f terminates. some a : a → opt a π caseopt a b : opt a → b → (a → b) → b The proof is by case analysis on the algorithm. In each rule, the } 0 weight W [[Ξ ]] + W [[Ξ]] + h|π|, 0i gets strictly smaller for each Opt :> OPT = premise: either its Q component shrinks, because a quantifier is { peeled, or its S, because the structure is otherwise simplified. In type opt a = wrap (b : type) ⇒ b → (a → b) → b particular, Q shrinks in all cases where a non-empty substitution δ none a = wrap (fun b (n : b) (s : a → b) ⇒ n) : opt a is applied in a premise. some a (x : a) = wrap (fun b (n : b) (s : a → b) ⇒ s x) : opt a caseopt a b (o : opt a) = (unwrap o : opt a) b } C. Impredicativity The implementation of opt is a large type, so it has to be wrapped

Figure 7 shows shows an extension of 1MLex with (structural) im- in order to “make it small” and allow matching the abstract decla- predicative types. The trick is that a large type has to be injected ration in the signature. into the universe of small types explicitly, marked by the form Impredicativity in this manner is also easily integrated with type wrap T (this is rather similar to the bracketted polytypes in Gar- inference. The respective rules can be seen in Appendix D. rigue & Remy’s´ semi-explicit first-class polymorphism [8]). Sub-

15 2015/6/18 D. Type Inference Rules Expressions Γ `θ E :ι Ξ e Abbreviations Γ(X) = Σ θ 0 00 IEVAR Γ θ`θ0 J := θΓ `θ00 J ∧ θ = θ ◦ θ Γ ` X : Σ X 0 0 0 0 00 [] P Γ; Γ θ`θ0 J := Γ, Γ θ`θ00 J ∧ θ = [υ /υ] ◦ θ 00 where υ = undet(θ J ) Γ `θ T Ξ 0 IETYPE υ fresh with ∆υ0 = ∆υ ∩ dom(Γ) Γ `θ type T :P [= Ξ] [Ξ]

Types Γ `θ T Ξ Γ `[] true :P bool true Γ `[] false :P bool false ! IETRUE IEFALSE Γ `θ E :P [= Ξ] e ITPATH Γ `θ E Ξ ! Γ `θ0 X :P bool eX Γ θ2`θ3 T Ξ υ fresh ∆υ = dom(Γ) Γ θ0`θ1 E1 :ι1 Ξ1 e1 Γ θ3`θ4 Ξ1 ≤ Ξ f1 ITINFER Γ θ1`θ2 E2 :ι2 Ξ2 e2 Γ θ4`θ5 Ξ2 ≤ Ξ f2 Γ `[] υ IEIF Γ `θ5 if X then E1 else E2 : T :ι1∨ι2∨ι(Ξ) Ξ ITTYPE if eX then f1 e1 else f2 e2 Γ `[] type ∃α.[= α] Γ `θ B :ι Ξ e IESTR ITBOOL Γ `θ {B} :ι Ξ e Γ `[] bool bool ! 0 0 Γ `θ D Ξ Γ `θ E :ι ∃α.{X:Σ, X :Σ } e ITSTR IEDOT Γ `θ {D} Ξ Γ `θ E.X :ι ∃α.Σ unpack hα, yi = e in pack hα, y.Xi Γ `θ1 T1 ∃α1.Σ1 Γ; α1,X:Σ1 θ1`θ2 T2 ∃α2.Σ2 Γ `θ1 T ∃α.Σ Γ; α, X:Σ θ1`θ2 E :ι Ξ e ITFUN IEFUN Γ `θ2 (X:T1) → T2 ∀α1. Σ1 →I ∃α2.Σ2 Γ `θ2 fun (X:T ) ⇒E :P ∀α. Σ →ι Ξ λα.λX:Σ.e

! Γ `θ T1 ∃α1.Σ1 1 Γ `θ1 X1 :P ∀α. Σ1 →ι Ξ e1 0 Γ; α1,X:Σ1 θ1`θ2 T2 ∃α2.Σ2 κα = κα1 → κα2 Γ θ1`θ2 X2 :P Σ2 e2 Γ θ2`θ3 Σ2 ≤α Σ1 δ; f 2 IEAPP 0 0 Γ `θ X1 X2 :ι δΞ (e1 (δα)(f e2)).ι Γ `θ2 (X:T1) ⇒ T2 ∃α2.∀α1. Σ1 →P Σ2[α2 α1/α2] 3 ITPFUN

Γ `θ1 X :P Σ e Γ θ1`θ2 T [Ξ] Γ θ2`θ3 Σ ≤ Ξ f Γ; α, X:[= α] `θ T Σ κα = Ω ITIMPL Γ `θ3 wrap X:T :P [Ξ] [f e] IEWRAP Γ `θ ’(X:type) ⇒ T ∀α.{} →A Σ Γ `! X : [Ξ0] e Γ ` T [Ξ] Γ ` Ξ0 ≤ Ξ f Γ `θ T Ξ θ1 P θ1 θ2 θ2 θ3 ITWRAP Γ ` unwrap X:T : Ξ (f e).val UNW Γ `θ wrap T [Ξ] θ3 P IE

Γ `θ E :P Σ e ITSING Bindings Γ `θ B :ι Ξ e Γ `θ (= E) Σ Γ `θ E :I ∃α.Σ e IBVAR Γ `θ1 T1 ∃α1.Σ1 α1 = α11 ] α12 Γ `θ X=E :I ∃α.{X :Σ} Γ θ1`θ2 T2 ∃α2.Σ2 Γ, α11, α2 θ2`θ Σ2 ≤α12 Σ1.X δ; f unpack hα, xi = e in pack hα, {X=x}i 0 Γ `θ T1 where (.X:T2) ∃α11α2.δΣ1[.X=Σ2] ITWHERE Γ `θ E :P Σ e υ = undet(θΣ) − undet(θΓ) κα = Ω Γ `θ X=E :P {X : ∀α.{} →A Σ[α/υ]} {X=λα.λAx:{}.e} IBPVAR Declarations Γ `θ D Ξ ! Γ `θ E :ι ∃α.{X:Σ} e Γ `θ T ∃α.Σ IBINCL IDVAR Γ ` include E : ∃α.{X:Σ} e Γ `θ X:T ∃α.{X:Σ} θ ι 0 Γ `θ T ∃α.{X:Σ} Γ `θ1 B1 :ι1 ∃α1.{X1:Σ1} e1 X1 = X1 − X2 IDINCL 0 Γ; α , X :Σ ` B : ∃α .{X :Σ } e X :Σ0 ⊆ X :Σ Γ `θ include T ∃α.{X:Σ} 1 1 1 θ1 θ2 2 ι2 2 2 2 2 1 1 1 1 0 0 Γ `θ2 B1;B2 :ι1∨ι2 ∃α1α2.{X1:Σ1, X2:Σ2} Γ ` D ∃α .{X :Σ } θ1 1 1 1 1 unpack hα1, y1i = e1 in let X1 = y1.X1 in Γ; α1, X1:Σ1 θ1`θ2 D2 ∃α2.{X2:Σ2} X1 ∩ X2 = ∅ unpack hα2, y2i = e2 in pack hα α , {X0 = y .X0 , X = y .X }i Γ `θ2 D1;D2 ∃α1α2.{X1:Σ1, X2:Σ2} 1 2 1 1 1 2 2 2 IDSEQ IBSEQ

IDEMPTY IEEMPTY Γ `[]  {} Γ `[]  :P {} {}

16 2015/6/18 0 0 Subtyping Γ `θ Ξ ≤π Ξ δ; f Unification Γ `θ Ξ = Ξ IUREFL 0 0 Γ ` υ = υ Γ `θ Ξ ≤ Ξ f := Γ `θ Ξ ≤ Ξ id; f [] 0 Γ `θ Ξ = Ξ ISREFL SYMM 0 IU Γ `[] υ ≤ υ λx.x Γ `θ Ξ = Ξ ! Γ `θ υ ≈ ΣΓ θ`θ0 υ ≤ Σ 0 0 ISRESL υ = undet(σ) υ∈ / υ ∆υ ⊇ fv(σ) Γ `θ0 υ ≤ Σ λx.x 00 υ fresh ∆υ00 = ∆υ0 ∩ ∆υ ! 0 0 0 IUBIND Γ `θ υ ≈ Σ Γ θ`θ Σ ≤ υ 00 0 RESR Γ `[υ /υ ]◦[σ/υ] υ = σ 0 IS Γ `θ0 Σ ≤ υ λx.x 0 0 Γ `θ Ξ = Ξ Γ `θ π = π IUTYPE PATH 0 0 IS Γ `θ0 [= Ξ] = [= Ξ ] Γ `θ π ≤ π λx.x 0 Γ `θ Ξ = Ξ FORGET WRAP IS 0 IU Γ `[] [= σ] ≤α0 α [= α0 α] [λα.σ/α0]; λx.x Γ `θ [Ξ] = [Ξ ] 0 0 0 0 0 Γ `θ σ = σ Γ `θ Ξ ≤ Ξ f Γ θ`θ Ξ ≤ Ξ f PATH ISTYPE 0 IU 0 Γ `θ α σ = α σ Γ `θ0 [= Ξ ] ≤ [= Ξ] []; λx.[Ξ] 0 0 Γ `θ Ξ = Ξ Γ `θ Σ = Σ WRAP IUSTR 0 IS 0 Γ `θ [Ξ ] ≤ [Ξ] λx.x Γ `θ {l:Σ} = {l:Σ } 0 0 ISEMPTY Γ; α `θ1 Σ = Σ Γ; α θ1`θ2 Ξ = Ξ 0 IUFUN Γ `[] {l:Σ } ≤ {} λx. {} 0 0 Γ `θ2 ∀α.Σ →ι Ξ = ∀α.Σ →ι Ξ 0 Γ `θ1 Σ1 ≤π1 Σ1 δ1; f1 0 Γ; α `θ Σ = Σ Γ ` {l0:Σ0} ≤ δ {l:Σ} δ ; f δ Σ = Σ ABS θ1 θ2 π2 1 2 2 2 1 1 0 IU ISSTR Γ `θ ∃α.Σ = ∃α.Σ 0 0 0 Γ `θ2 {l1:Σ1, l :Σ } ≤π1π2 {l1:Σ1, l:Σ} 0 δ1δ2; λx.{l1=f1(x.l1), l=(f2 x).l} Γ; α `θ Σ = Σ IMPL 0 IU Γ `θ ∀α.{} →A Σ = ∀α.{} →A Σ 0 0 0 Γ, α `θ1 Σ ≤α Σ δ1; f1 ι ≤ ι Γ; α ` δ Ξ0 ≤ Ξ δ ; f θ δ Σ = θ Σ 0 θ1 θ2 1 πα 2 2 2 2 2 Instantiation Γ `θ Ξ  Ξ e[ ] 0 0 0 ISFUN 0 Γ `θ2 (∀α .Σ →ι Ξ ) ≤π (∀α.Σ →ι Ξ) 0 0 ! 0 0 δ2; λx. λα. λιy:Σ. f2 ((x (δ1α )(f1 y)).ι ) Γ `θ E :ι Ξ e [e] := Γ `θ E :ι Ξ e ∧ 0 0 Γ θ`θ0 Ξ  Ξ e [ ] 0 0 Γ; α `θ Σ ≤α Σ δ; f α α 6=  0 0 ISABS Γ `θ ∃α .Σ ≤ ∃α.Σ INREFL 0 Γ `[] Ξ  Ξ [ ] λx. unpack hα , yi = x in pack hδα, f yi Γ; α `θ υ ≈ Σ 0 0 INRES υ fresh ∆υ = dom(Γ) Γ `θ Σ [υ/α ] ≤π Σ δ; f Γ `θ ∃α.υ  ∃α.Σ [ ] 0 0 ISIMPLL Γ `θ ∀α .{} →A Σ ≤π Σ δ; λx. f ((x υ {}).A) υ fresh ∆υ = dom(Γ, α) 0 0 0 Γ; α `θ Σ ≤π Σ δ; f α 6∩ fv(θδ) Γ `θ ∃α.Σ [υ/α ]  ∃α.Σ e[ ] IMPLR 0 IS 0 0 INIMPL Γ `θ Σ ≤π ∀α.{} →A Σ δ; λx. λα.λAy:{}.f x Γ `θ ∃α.∀α .{} →A Σ  ∃α.Σ e[unpack hα, xi = [ ] in pack hα, x υ {}).A)i] Resolution Γ `θ υ ≈ Σ ! Γ `θ υ ≈ Σ := υ∈ / undet(Σ) ∧ Γ `θ υ ≈ Σ

00 υ fresh ∆υ00 = ∆υ ∩ ∆υ0 INFER 0 IR Γ `[υ00/υ,υ00/υ0] υ ≈ υ 0 α ∈ ∆υ υ fresh ∆υ0 = ∆υ IRPATH Γ `[α υ0/υ] υ ≈ α σ

IRBOOL Γ `[bool/υ] υ ≈ bool 0 υ fresh ∆υ0 = ∆υ IRTYPE Γ `[[=υ0]/υ] υ ≈ [= Ξ] 0 υ fresh ∆υ0 = ∆υ IRWRAP Γ `[[υ0]/υ] υ ≈ [Ξ]

υ1, υ2 fresh ∆υ = ∆υ = ∆υ 1 2 IRFUN Γ `[(υ1→Iυ2)/υ] υ ≈ ∀α.Σ →ι Ξ

17 2015/6/18 E. Correctness of Type Inference E.2 Termination

E.1 Soundness As for the 1MLex typing rules, termination is obvious for most the In order to formulate and prove correctness of the type inference 1ML inference judgements: the main ones are inductive on the syn- rules (Theorem 5.1) properly, we need to refine the statement a tactic structure of the language; for the auxiliary Unification judge- little bit. In particular, instead of just bundling together υ, Γ as an ment, termination can be proved in the usual manner, Instantiation environment, we want to construct an environment that respects the is inductive on the right-hand side type, and Resolution is not even scoping constraints on the free inference variables υ. recursive. Again, it only remains subtyping as the problem child. To that end, we define a judgement υ ` Γ Γ0 as follows: We can prove its termination by extending the prove from Ap- pendix B. First, we trivially extend the weight function to handle 0 υ ∈ υ υ − υ ` Γ Γ ∆υ ⊆ dom(Γ) inference variables:  ` · · υ ` Γ Γ0, υ S[[υ]] = 1 Q[[υ]] = 0 0 0 0 0 υ ` Γ Γ α∈ / dom(Γ ) υ ` Γ Γ Γ ` Σ:Ω The Weight Reduction lemma still holds for δ-substitutions on type υ ` Γ, α Γ0, α υ ` Γ,X:Σ Γ0,X:Σ variables, but we can formulate it analogously for θ-substitutions This assumes that dom(Γ) 6∩ υ initially. Γ0 is an extension of Γ on inference variables (which are always small): υ that includes bindings for , treated as ordinary type variables, and LEMMA E.3 (Weight Reduction under Small Resolution). placed such that they adhere to the scoping constraints encoded in ∆. Note that this is a relation: there may be many possible Γ0 for a 1. Q[[θΞ]] = Q[[Ξ]]. given pair of Γ and υ (although they are all equivalent in the sense 2. W [[θΞ]] < h1, 0i + W [[Ξ]]. that they all admit the same set of Fω typing judgments). 3. W [[θΞ]] ≤ h|θ|, 0i + W [[Ξ]]. With that, for any Fω or 1ML typing judgement J , we define The following key lemma postulates that a subtyping derivation 0 0 0 υ;Γ ` J :⇔ ∃Γ , υ ` Γ Γ ∧ Γ ` J either resolves more inference variables than it introduces, or the Furthermore, define amount of excess variables (which will come from rule ISIMPLL) is bounded by the number of quantifiers in the types involved. 0 0 00 0 0 00 υ ;Γ ` θ : υ;Γ:⇔ υ ` Γ Γ ∧ υ ;Γ ` θ :Γ ∧ θ small ∧ dom(θ) 6∩ υ0 ∧ LEMMA E.4 (Resolution Progress in Subtyping Inference). 00 Let Γ `θ J an inference derivation and Y the set of fresh inference ∀υ ∈ υ, ∀υ ∈ undet(θυ), ∆ 00 ⊆ ∆υ υ variables generated by it. Well-typed substitutions can be composed: 1. If J is Ξ0 = Ξ, then either |Y | = |θ| = 0 or |Y | < |θ|. LEMMA E.1 (Composition of Substitutions). 0 0 0 00 00 0 0 0 00 2. If J is Ξ ≤π Ξ δ; f, then either |Y | = |θ| = 0 or |Y | < |θ| If υ ;Γ ` θ : υ;Γ and υ ;Γ ` θ : υ ;Γ and dom(θ) 6∩ υ , 0 00 00 0 or |Y | − |θ| ≤ Q[[Ξ ]]+ Q[[Ξ]]. then υ ;Γ ` θ ◦ θ : υ;Γ. Furthermore, if Ξ 6= Ξ0 and Ξ = υ or Ξ0 = υ, then υ ∈ The correctness theorem in its full beauty is stated as follows: dom(θ).

THEOREM E.2 (Soundness of 1ML Inference). With this, the termination proof can proceed similar to before, 0 Let dom(Γ) 6∩ υ and υ ` Γ Γ . Let J range over the 1ML except that we have to take θ into account. inference judgements. LEMMA E.5 (Termination of Subtyping Inference). 0 0 0 0 0 0 1. Γ is well-formed and a permutation of υ, Γ. Let υ ` Γ Γ and Γ ` Ξ :Ω and Γ , α ` Ξ:Ω and π = α α . θ 0 2. If θΓ `θ0 J , then θΓ `θ0 θJ . Then Γ `θ Ξ ≤π Ξ δ; f terminates. 0 0 3. If Γ `θ J , then υ ; θΓ ` θ : υ;Γ with υ − υ fresh. The proof is again by case analysis on the algorithm, mostly as 0 4. If Γ θ`θ0 J and υ ; θΓ ` θ : υ;Γ, before, but this time using the weight W [[Ξ0]] + W [[Ξ]] + h|π| + 00 0 0 00 0 then υ ; θ Γ ` θ : υ;Γ with υ − υ − υ fresh. |υ0|, 0i, with υ0 = undet(Ξ0) ∪ undet(Ξ). In each rule, this 0 5. If Γ = Γ1, Γ2 and Γ1;Γ2 θ`θ0 J and υ ; θΓ ` θ : υ;Γ, weight gets smaller for all premises. The only exceptions are the 00 0 0 00 0 then υ ; θ Γ ` θ : υ;Γ with υ − υ − υ fresh and ∆υ00 ⊆ rules ISRESL and ISRESR, which invoke the Resolution judgment dom(Γ1). that may introduce additional temporary inference variables. That 0 6. If Γ `θ T/D Ξ, then υ ; θΓ ` T/D θΞ. locally increases the weight of the judgement, but we can show by 0 7. If Γ `θ E/B :Ξ e, then υ ; θΓ ` E/B : θΞ θe. case analysis of the Σ in these rules that the temporary variables 0 0 are resolved immediately and weight will decrease again in the 8. If Γ `θ Ξ ≤π Ξ δ; f, and υ;Γ ` Ξ:Ω and υ;Γ ` Ξ :Ω, 0 0 recursive invocation. For example, consider rule ISRESL for the then υ ; θΓ ` θΞ ≤π θΞ θδ; θf. 0 0 0 case Σ = ∀α.Σ1 →ι Ξ2 as one representative case: 9. If Γ `θ Ξ  Ξ e [ ], and υ;Γ ` Ξ:Ω and υ;Γ ` Ξ :Ω 0 0 0 and υ;Γ ` E :Ξ e, then υ ; θΓ ` E : θΞ θe [θe]. 0 0 0 • the weight is w = h0, 1i + W [[Σ]] + h|υ |i = W [[Σ1]] + 10. If Γ `θ Ξ = Ξ , and υ;Γ ` Ξ:Ω and υ;Γ ` Ξ :Ω, W [[Ξ ]]+ h|α| + |υ0|, 2i then θΞ = θΞ0. 2 • by inverting the first premise of ISRESL, (1) υ∈ / undet(Σ ) ∪ 11. If Γ `θ υ ≈ Σ, then θυ has the same outer shape as Σ. 1 undet(Ξ2) and (2) Γ `θ υ ≈ Σ The proof of the first part is by induction on the derivation of 0 • the only rule that applies for (2) is IRFUN υ ` Γ Γ , for the others by simultaneous induction on the (first) derivation. We omit the gory details. • hence, θ = [(υ1 →I υ2)/υ] for fresh υ1, υ2 • because of (1), θΣ = Σ

• hence, the second premise of ISRESL is invoked as θΓ `θ00 (υ1 →I υ2) ≤ (∀α.Σ1 →ι Ξ2)

• because of (1), Σ 6= υ1 →I υ2, so only rule ISFUN applies

18 2015/6/18 • its first premise is invoked as Γ, α ` Σ1 ≤ υ1, producing θ1 (3)

• because of (1), υ1, υ2 ∈/ undet(Σ1) = undet(θΣ1) 0 • so υ1 = undet(Σ1) ∪ undet(υ1) ⊆ (υ − υ) ∪ υ1 0 • that is, |υ1| ≤ |υ |

• the weight for the first premise of ISFUN is w1 = W [[Σ1]] + 0 h0, 1i + h|υ1|, 0i ≤ W [[Σ1]]+ h|υ |, 1i

• w1 < w, so the first premise terminates (with empty δ1)

• let Y1 be the set of fresh inference variables generated by it

• the second premise of ISFUN is invoked as θ1Γ, α ` υ2 ≤ θ1Ξ2 (after expanding the judgement abbreviation), yielding θ2

• by Lemma E.4, υ1 ∈ dom(θ1), and thus, υ1 ∈/ undet(θ1Ξ2) 0 • so υ2 = undet(υ2) ∪ undet(θ1Ξ2) ⊆ υ2 ∪ (υ − υ) ∪ Y1 − dom(θ1) 0 • that is, |υ2| ≤ |υ | + |Y1| − |θ1|

• the weight for the second premise of ISFUN is w2 ≤ h0, 1i + 0 W [[θ1Ξ2]]+ h|υ2|, 0i ≤ W [[θ1Ξ2]]+ h|υ | + |Y1| − |θ1|, 1i • split by Lemma E.4 applied to (3): 0 if |Y1| = |θ1| = 0, then w2 ≤ W [[Ξ2]]+ h|υ |, 1i < w

if |Y1| < |θ1|, then with Weight Reduction, w2 < h1, 0i + 0 0 W [[Ξ2]]+ h|υ | + |Y1| − |θ1|, 1i ≤ W [[Ξ2]]+ h|υ |, 1i < w

if |Y1| − |θ1| < Q[[Σ1]]+ Q[[υ1]], then with Weight Reduc- 0 tion, w2 < h1, 0i + W [[Ξ2]] + h|υ | + |Y1| − |θ1|, 1i ≤ 0 0 W [[Ξ2]] + h|υ |, 1i + Q[[Σ1]] < W [[Ξ2]] + h|υ |, 1i + W [[Σ1]] < w

• in all cases, w2 < w, so the second premise of ISFUN termi- nates as well

19 2015/6/18 Abbreviations for Environment Abstraction (kinds) (·) → κ := κ (types) λ(·).τ := τ (terms) λ(·).e := e (Γ, α) → κ := Γ → κα → κ λ(Γ, α).τ := λΓ.λα.τ λ(Γ, α).e := λΓ.λα.e 0 ΓI := · (Γ, x:τ) → κ := Γ → κ λ(Γ, x:τ ).τ := λΓ.τ λ(Γ, x:τ).e := λΓ.λPx:τ.e ΓP := Γ (types) ∀(·).τ := τ τ (·) := τ e (·) := e ∀(Γ, α).τ := ∀Γ.∀α.τ τ (Γ, α) := τ Γ α e (Γ, α) := e Γ α 0 0 0 ∀(Γ, x:τ ).τ := ∀Γ.τ →P τ τ (Γ, x:τ ) := τ Γ e (Γ, x:τ) := (e Γ x).P

Types Γ ` T Ξ Γ ` E :P ∃α.[= Ξ] e α 6∩ fv(Ξ) Γ ` E :P ∃α.Σ e α 6∩ fv(Σ) TPATH TSING Γ ` E Ξ Γ ` (= E) Σ

Expressions Γ ` E :ι Ξ e Γ(X) = Σ Γ ` T Ξ EVAR ETYPE ETRUE Γ ` X :P Σ λΓ.X Γ ` type T :P [= Ξ] λΓ.[Ξ] Γ ` true :P bool λΓ.true

Γ ` X :P bool e Γ ` E1 :ι1 Ξ1 e1 Γ ` Ξ1 ≤ Ξ f1 Γ ` T ΞΓ ` E2 :ι2 Ξ2 e2 Γ ` Ξ2 ≤ Ξ f2 EFALSE EIF Γ ` false :P bool λΓ.false Γ ` if X then E1 else E2 : T :ι1∨ι2∨ι(Ξ) Ξ ι1∨ι2∨ι(Ξ) ι1 ι2 λΓ . if e Γ then f1 (e1 Γ ) else f2 (e2 Γ ) 0 0 0 0 Γ ` B :ι Ξ e Γ ` E :ι ∃α.{X :Σ } eX :Σ ∈ X :Σ STR DOT E ι ι E Γ ` {B} :ι Ξ e Γ ` E.X :ι ∃α.Σ unpack hα, yi = e in pack hα,λ Γ . (y Γ ).Xi Γ ` T ∃α.ΣΓ, α, X:Σ ` E :I Ξ e Γ ` T ∃α.ΣΓ, α, X:Σ ` E :P ∃α2.Σ2 e EFUN EPFUN Γ ` fun (X:T ) ⇒E :P ∀α. Σ →I Ξ λΓ.λα.λιX:Σ.e Γ ` fun (X:T ) ⇒E :P ∃α2.∀α. Σ →P Σ2 e

Γ ` X1 :P ∀α. Σ1 →ι Ξ e1 Γ ` X2 :P Σ2 e2 Γ ` Σ2 ≤α Σ1 δ; f APP ι E Γ ` X1 X2 :ι δΞ λΓ . (e1 Γ( δα)(f (e2 Γ))).ι

Γ ` X :P Σ1 e Γ ` T ∃α.Σ2 Γ ` Σ1 ≤α Σ2 δ; f κα0 = Γ → κα ESEAL 0 0 Γ ` X:>T :P ∃α .Σ2[α Γ/α] pack hλΓ.δα,λ Γ.f (e Γ)i 0 0 Γ ` X :P Σ e Γ ` T [Ξ] Γ ` Σ ≤ Ξ f Γ ` X :P [Ξ ] e Γ ` T [Ξ] Γ ` Ξ ≤ Ξ f EPACK EUNP ι(Ξ) Γ ` pack X:T :P [Ξ] λΓ.[f (e Γ)] Γ ` unpack X:T :ι(Ξ) Ξ λΓ .f ((e Γ).val) 0 0 Γ, α ` E :P ∃α.Σ e κα0 = Ω Γ ` E :ι ∃α.∀α .Σ e Γ, α ` σ : κα0 GEN INST 0 E 0 ι ι E Γ ` E :P ∃α.∀α .Σ e Γ ` E :ι ∃α.Σ[σ/α ] unpack hα, xi = e in pack hα,λ Γ .x Γ σi

Bindings Γ ` B :ι Ξ e Γ ` E :ι ∃α.Σ e Γ ` E :ι ∃α.{X:Σ} e ι ι BVAR BINCL Γ ` X=E :ι ∃α.{X:Σ} unpack hα, xi = e in pack hα,λ Γ .{X=x Γ }i Γ ` include E :ι ∃α.{X:Σ} e 0 Γ ` B1 :ι1 ∃α1.{X1:Σ1} e1 X1 = X1 − X2 0 0 Γ, α1, X1:Σ1 ` B2 :ι2 ∃α2.{X2:Σ2} e2 X1:Σ1 ⊆ X1:Σ1 BSEQ BEMPTY 0 0 Γ `  : {} λΓ.{} Γ ` B1;B2 :ι1∨ι2 ∃α1α2.{X1:Σ1, X2:Σ2} P unpack hα1, y1i = e1 in ι ∨ι ι unpack hα2, y2i = (let X1 = λΓ 1 2 . (y1 Γ 1 ).X1 in e2) in ι1∨ι2 ι pack hα1α2,λ Γ . let X1 = (y1 Γ 1 ).X1 in

ι2 0 0 let X2 = (y2 (Γ, α1, X1:Σ1) ).X2 in {X1 = X1, X2 = X2}i

Figure 8. Elaboration of 1ML with Pure Sealing (rules not shown are unchanged)

F. 1ML with Pure Sealing ing inside a function, the abstract types are already skolemised over Figure 8 shows how to modify the 1ML elaboration rules from that function’s parameters. Rule EPFUN can thus lift the existential Figures 4, 7, and 5 to support a semantics where sealing is a pure quantifier for the local abstract types α2 from the body. operation. The actual changes are highlighted. The translation invariant for expression and bindings changes The new rules are mostly straight out of [25], Section 7. Regard- so that Γ ` E/B :ι ∃α.Σ e implies Γ ` e : ∃α.Σ if ι = I ing typing, the main change over plain 1ML is the modification to (as before), but · ` e : ∃α.∀Γ.Σ if ι = P (see [25] for an extensive explanation of this approach). Consequently, the gist of the changes rule ESEAL for sealing and the addition of rule FPFUN for pure is in the term translation. which now has to abstract over Γ for all functions. ESEAL now skolemises the abstract variables over the whole environment Γ. This is so that for any use of sealing occur- pure expressions, and systematically apply and reabstract Γ to get

20 2015/6/18 to the actual value where necessary – for pure expressions, the ∀Γ becomes part of the “abstraction monad”. This is possible because sealing is the only pure operation that can create quantifiers in the modified system: as motivated in Sec- tion 3, all other expressions introducing quantifiers (conditionals in rule EIF, application in rule RAPP, and also unwrapping in rule RUNW) continue to be treated as impure if quantifiers are present. The reason for this is that they are not phase-separable, and no translation to System Fω could be given that would allow them to be treated as pure. The new rules render (only) sealing pure, ex- actly by making it separable, which allows maintaining the crucial invariant that “purity” implies phase separation. Rules TPATH and TSING now have to allow for the possibility of pure expressions creating quantifiers through sealing. That is fine, as long as the abstract names are purely local and do not escape from E. Similarly, quantifiers may occur in rule EGEN, where they just remain, in the same spirit as for ordinary pure functions. The only practical complication of this system over what is discussed in [25], Section 7, is the interaction of sealing with implicit generalisation: the type derived by rule ESEAL depends on Γ, which includes all variables α0 that are possibly introduced by uses of rule EGEN in scope. However, both the use of the latter rule, and the sequence of variables it introduces are non-deterministic. In practice, a type inference algorithm will only determine both after type-checking an actual expression. If that expression uses sealing, the inference algorithm would hence need to be able to defer the exact choice of Γ in rule ESEAL. This is certainly doable, but we leave the formal refinement of this idea to future work. The rules from Figure 8 define a semantics for sealing that is not abstraction-safe when used locally inside a function – see [25], Section 8, for a discussion of the issue, and an approach for solving it in conventional ML. Unfortunately, the solution is not trivial to integrate with type inference of the form required for 1ML, because it necessitates existential quantifiers to occur in many additional types, including ones we’d like to consider small. We leave solving this to future work as well.

21 2015/6/18 kind Ω, the resulting expression will have a type of the form [= σ], (kinds) so that it can be used as a 1ML type; otherwise, it will be a function [[Ω]] = type returning such a value. Finally, F terms are translated into 1ML [[κ → κ ]] = [[κ ]] ⇒ [[κ ]] ω 1 2 1 2 expressions, as is to be expected. (types) All translated terms become 1ML expressions with a small type, [[α]] = Xα and consequently, ground types translate to small types as well. [[τ1 → τ2]] = type ([[τ1]] → [[τ2]]) This is necessary to encompass impredicativity in Fω. To achieve [[τ1 × τ2]] = type ([[τ1]], [[τ2]]) that for quantified types, the translation wraps each of them, using [[∀α:κ.τ]] = type (wrap ((Xα : [[κ]]) ⇒ [[τ]])) the extension from Section C – if we were to translate only a [[∃α:κ.τ]] = type (wrap (Xα = [[κ]], [[τ]])) predicative version of Fω, then the uses of wrap in the figure [[λα:κ.τ]] = fun (Xα : [[κ]]) ⇒ [[τ]] could be dropped (and the use in existential formation would be [[τ1 τ2]] = [[τ1]][[τ2]] replaced by sealing). Unfortunately wrapped expressions require (expressions) a type annotation; we assume that the types Te and T1 used in the figure are determined by context (as usual, this could be made more [[x]] = Xx precise via a type-directed translation, but we chose to gloss over [[λx:τ.e]] = fun (Xx : [[τ]]) ⇒ [[e]] this detail for the sake of simple exposition). [[he1, e2i]] = ([[e1]], [[e2]]) To state correctness of the embedding, we also need to define [[λα:κ.e]] = wrap (fun (Xα : [[κ]]) ⇒ [[e]]) an embedding of Fω environments. It modifies variable bindings in :(Xα : [[κ]]) ⇒ Te such a way that they match the requirements of the 1ML typing [[hτ1, e2iτ ]] = wrap ([[τ1]], [[e2]]): [[τ]] rules: [[e1 e2]] = [[e1]][[e2]] [[e1 τ2]] = (unwrap [[e1]] : T1) [[τ2]] [[ · ]] = · [[e.i]] = [[e]].i [[Γ, x:τ]] = [[Γ]],Xx:σ if [[Γ]] ` [[τ]] : [= σ] [[unpack hα, xi=e1 in e2]] = let (Xα, Xx) = unwrap [[e1]] : T1 [[Γ, α:κ]] = [[Γ]], α:κ, Xα:Σ if [[Γ]] ` [[κ]] ∃α:κ.Σ in [[e2]] With this, we can show that the embedding is sound, i.e., pro- where: duces well-formed 1ML programs from well-formed Fω terms:

(X = T1, T2) := { 1 : T1; 2 : let X = 1 in T2} THEOREM G.1 (Soundness of Embedding). Let Γ ` 2. (E1, E2) := { 1 = E1; 2 = E2} E.i := E. i 1. [[Γ]] ` 2. let (X1, X2) = E1 in E2 := let X = E1; 2. · ` [[κ]] ∃α:κ.Σ with · ` ∃α.Σ ≤ ∃α.Σ. X1 = X. 1; X2 = X. 2 in E2 3. If Γ ` τ : κ, then [[Γ]] ` [[τ]] : Σ and [[Γ]] ` [[κ]] Ξ and [[Γ]] ` Σ ≤ Ξ (the latter implies that if κ = Ω then Σ = [= σ]).

Figure 9. Embedding of Fω in 1ML 4. If Γ ` e : τ, then [[Γ]] ` [[e]] : σ and [[Γ]] ` [[τ]] : [= σ]. The proof is by simultaneous induction on (1) the derivations of Γ ` 2, (2) the structure of κ, and the derivations of (3) Γ ` τ : κ G. Embedding Fω and (4) Γ ` e : τ, respectively.

The elaboration of 1ML embeds the language into System Fω, For a fully pedantic proof of correctness we would also need to showing that it is no more expressive than that calculus. We can also show that the embedding is adequate, i.e., the produced 1ML pro- show that it is no less expressive, by doing the inverse: embedding grams are computationally equivalent to the original Fω terms. This Fω into 1ML. should be possible through a suitable logical relation, but would Doing so is fairly straightforward: Figure 9 gives the canonical be involved, requiring simultaneous reasoning about the embed- translation function [[ ]] for Fω kinds, types, and terms. ding just defined and the 1ML elaboration back into Fω. Given the Fω kinds are translated to suitable 1ML types. Fω types are limited relevance of this result, and its informal “obviousness”, it translated to 1ML expressions; in the case that the type has ground seems acceptable to punt.

22 2015/6/18