System F with Type Equality Coercions Including Post-Publication Appendix

January 19, 2011

Martin Sulzmann
School of Computing, National University of Singapore
[email protected]

Manuel M. T. Chakravarty
Computer Science & Engineering, University of New South Wales
[email protected]

Simon Peyton Jones, Kevin Donnelly
Microsoft Research Ltd, Cambridge, England
simonpj, [email protected]

Abridged version appears in The Third ACM SIGPLAN Workshop on Types in Language Design and Implementation (TLDI'07), January 16, 2007, Nice, France, ACM Press.

Abstract

We introduce System FC, which extends System F with support for non-syntactic type equality. There are two main extensions: (i) explicit witnesses for type equalities, and (ii) open, non-parametric type functions, given meaning by top-level equality axioms. Unlike System F, FC is expressive enough to serve as a target for several different source-language features, including Haskell's newtype, generalised algebraic data types, associated types, functional dependencies, and perhaps more besides.

NOTE: this version has a substantial Appendix, written subsequent to the publication of the paper, giving a simplified version of System FC. This version is much closer to the one used in GHC.

Categories and Subject Descriptors D.3.1 [Programming Languages]: Formal Definitions and Theory - Semantics; F.3.3 [Logics and Meanings of Programs]: Studies of Program Constructs - Type structure

General Terms Languages, Theory

Keywords Typed intermediate language, advanced type features

1. Introduction

The polymorphic lambda calculus, System F, is popular as a highly-expressive typed intermediate language in compilers for functional languages. However, language designers have begun to experiment with a variety of type systems that are difficult or impossible to translate into System F, such as functional dependencies [21], generalised algebraic data types (GADTs) [44, 31], and associated types [6, 5]. For example, when we added GADTs to GHC, we extended GHC's intermediate language with GADTs as well, even though GADTs are arguably an over-sophisticated addition to a typed intermediate language. But when it came to associated types, even with this richer intermediate language, the translation became extremely clumsy or in places impossible.

In this paper we resolve this problem by presenting System FC(X), a super-set of F that is both more foundational and more powerful than adding ad hoc extensions to System F such as GADTs or associated types. FC(X) uses explicit type-equality coercions as witnesses to justify explicit type-cast operations. Like types, coercions are erased before running the program, so they are guaranteed to have no run-time cost.

This single mechanism allows a very direct encoding of associated types and GADTs, and allows us to deal with some exotic functional-dependency programs that GHC currently rejects on the grounds that they have no System-F translation (§2). Our specific contributions are these:

• We give a formal description of System FC, our new intermediate language, including its type system, operational semantics, soundness result, and erasure properties (§3). There are two distinct extensions. The first, explicit equality witnesses, gives a system equivalent in power to System F + GADTs (§3.2); the second introduces non-parametric type functions, and adds substantial new power, well beyond System F + GADTs (§3.3).

• A distinctive property of FC's type functions is that they are open (§3.4). Here we use "open" in the same sense that Haskell type classes are open: just as a newly defined type can be made an instance of an existing class, so in FC we can extend an existing type function with a case for the new type. This property is crucial to the translation of associated types.

• The system is very general, and its soundness requires that the axioms stated as part of the program text are consistent (§3.5). That is why we call the system FC(X): the "X" indicates that it is parametrised over a decision procedure for checking consistency, rather than baking in a particular decision procedure. (We often omit the "(X)" for brevity.) Conditions identified in earlier work on GADTs, associated types, and functional dependencies, already define such decision procedures.

• A major goal is that FC should be a practical compiler intermediate language. We have paid particular attention to ensuring that FC programs are robust to program transformation (§3.8).

• It must obviously be possible to translate the source language into the intermediate language; but it is also highly desirable that it be straightforward. We demonstrate that FC has this property, by sketching a type-preserving translation of two source language idioms, namely GADTs (Section 4) and associated types (Section 5). The latter, and the corresponding translation for functional dependencies, are more general than all previous type-preserving translations for these features.

System FC has no new foundational content: rather, it is an intriguing and practically-useful application of techniques that have been well studied in the type-theory community. Several other calculi exist that might in principle be used for our purpose, but they generally do not handle open type functions, are less robust to transformation, and are significantly more complicated. We defer a comparison with related work until §6.

To substantiate our claim that FC is practical, we have implemented it in GHC, a state-of-the-art compiler for Haskell, including both GADTs and associated (data) types. This is not just a prototype; FC now is GHC's intermediate language.

FC does not strive to do everything; rather we hope that it strikes an elegant balance between expressiveness and complexity. While our motivating examples were GADTs and associated types, we believe that FC may have much wider application as a typed target for sophisticated HOT (higher-order typed) source languages.

2. The key ideas

No compiler uses pure System F as an intermediate language, because some source-language constructs can only be desugared into pure System F by very heavy encodings. A good example is the algebraic data types of Haskell or ML, which are made more complicated in Haskell because algebraic data types can capture existential type variables. To avoid heavy encoding, most compilers invariably extend System F by adding algebraic data types, data constructors, and case expressions. We will use FA to describe System F extended in this way, where the data constructors are allowed to have existential components [24], type variables can be of higher kind, and type constructor applications can be partial.

Over the last few years, source languages (notably Haskell) have started to explore language features that embody non-syntactic or definitional type equality. These features include functional dependencies [16], generalised algebraic data types (GADTs) [44, 37], and associated types [6, 5]. All three are difficult or impossible to translate into System F, and yet the alternative of simply extending System F by adding functional dependencies, GADTs, and associated types, seems wildly unattractive. Where would one stop?

In the rest of this section we informally present System FC, an extension of System F that resolves the dilemma. We show how it can serve as a target for each of the three examples. The formal details are presented in §3. Throughout we use typewriter font for source-code, and italics for FC.

2.1 GADTs

Consider the following simple type-safe evaluator, often used as the poster child of GADTs, written in the GADT extension of Haskell supported by GHC:

    data Exp a where
      Zero :: Exp Int
      Succ :: Exp Int -> Exp Int
      Pair :: Exp b -> Exp c -> Exp (b, c)

    eval :: Exp a -> a
    eval Zero = 0

[...] representing GADTs by ordinary algebraic data types encapsulating such type equality coercions.

Specifically, we translate the GADT Exp to an ordinary algebraic data type, where each variant is parametrised by a coercion:

    data Exp : ⋆ → ⋆ where
      Zero : ∀a. (a ∼ Int) ⇒ Exp a
      Succ : ∀a. (a ∼ Int) ⇒ Exp Int → Exp a
      Pair : ∀abc. (a ∼ (b, c)) ⇒ Exp b → Exp c → Exp a

So far, this is quite standard; indeed, several authors present GADTs in the source language using a syntax involving explicit equality constraints, similar to that above [44, 10]. However, for us the equality constraints are extra type arguments to the constructor, which must be given when the constructor is applied, and which are brought into scope by pattern matching. The "⇒" is syntactic sugar, and we sloppily omitted the kind of the quantified type variables, so the type of Zero is really this:

    Zero : ∀a:⋆. ∀(co:a ∼ Int). Exp a

Here a ranges over types, of kind ⋆, while co ranges over coercions, of kind a ∼ Int. An important property of our approach is that coercions are types, and hence, equalities τ1 ∼ τ2 are kinds. An equality kind τ1 ∼ τ2 categorises all coercion types that witness the interchangeability of the two types τ1 and τ2. So, our slogan is propositions as kinds, and proofs as (coercion) types.

Coercion types may be formed from a set of elementary coercions that correspond to the rules of equational logic; for example, Int : (Int ∼ Int) is an instance of the reflexivity of equality and sym co : (Int ∼ a), with co : (a ∼ Int), is an instance of symmetry. A call of the constructor Zero must be given a type (to instantiate a) and a coercion (to instantiate co), thus for example:

    Zero Int Int : Exp Int

As indicated above, regular types like Int, when interpreted as coercions, witness reflexivity.

Just like value arguments, the coercions passed to a constructor when it is built are made available again by pattern matching. Here, then, is the code of eval in FC:

    eval = Λa:⋆. λe:Exp a.
      case e of
        Zero (co:a ∼ Int) →
          0 ▸ sym co
        Succ (co:a ∼ Int) (e′:Exp Int) →
          (eval Int e′ + 1) ▸ sym co
        Pair (b:⋆) (c:⋆) (co:a ∼ (b, c)) (e1:Exp b) (e2:Exp c) →
          (eval b e1, eval c e2) ▸ sym co

The form Λa:⋆.e abstracts over types, as usual.
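The coercion-passing translation of eval can be approximated in source Haskell, as a minimal sketch rather than the paper's FC code. It assumes the standard `Data.Type.Equality` module from base, using `a :~: b` to stand in for a coercion of kind a ∼ b, `sym` for symmetry, and `castWith` for the cast operator ▸. Unlike FC coercions, which are erased, the `Refl` witnesses in this sketch are ordinary run-time values.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE TypeOperators #-}
module Main where

import Data.Type.Equality ((:~:)(Refl), castWith, sym)

-- An ordinary algebraic data type: every constructor carries an explicit
-- equality witness instead of refining its result type (assumption: this
-- mirrors the paper's translated Exp, with witnesses as value arguments).
data Exp a
  = Zero (a :~: Int)
  | Succ (a :~: Int) (Exp Int)
  | forall b c. Pair (a :~: (b, c)) (Exp b) (Exp c)

-- Each branch computes at the concrete type and then casts the result
-- back to 'a' using the symmetric witness, like "e ▸ sym co" in FC.
eval :: Exp a -> a
eval (Zero co)       = castWith (sym co) 0
eval (Succ co e)     = castWith (sym co) (eval e + 1)
eval (Pair co e1 e2) = castWith (sym co) (eval e1, eval e2)

main :: IO ()
main =
  -- Refl plays the role of a reflexivity coercion such as Int : (Int ∼ Int).
  print (eval (Pair Refl (Succ Refl (Zero Refl)) (Zero Refl)))  -- prints (1,0)
```

Each constructor application supplies `Refl` where the FC term `Zero Int Int` supplies a reflexivity coercion, and pattern matching in `eval` brings the witness `co` back into scope, as in the FC version.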
