<<

Type variables in patterns

Richard A. Eisenberg Joachim Breitner Simon Peyton Jones Bryn Mawr College University of Pennsylvania Microsoft Research Bryn Mawr, PA, USA Philadelphia, PA, USA Cambridge, UK [email protected] [email protected] [email protected]

Abstract need a way to name such types. The obvious way to address For many years, GHC has implemented an extension to this challenge is by providing language support for lexically Haskell that allows type variables to be bound in type sig- scoped type variables. GHC has long supported scoped type natures and patterns, and to scope over terms. This exten- variables: the ScopedTypeVariables extension is very popular, sion was never properly specified. We rectify that oversight and 29% of Haskell packages on Hackage use it. But it has here. With the formal specification in hand, the otherwise- never been formally specified! Moreover, as we shall see, it labyrinthine path toward a design for binding type vari- is in any case inadequate to the task. In this paper we fix ables in patterns becomes blindingly clear. We thus extend both problems, making the following contributions: ScopedTypeVariables to bind type variables explicitly, obvi- ating the Proxy workaround to the dustbin of history. In the days of Haskell98, scoped type variables were • 1 Introduction seldom crucial. Through a series of examples we show Haskell allows the programmer to write a type signature for a that, as Haskell’s has grown more sophis- definition or expression, both as machine-checked documen- ticated, the need for scoped type variables has become tation, and to resolve ambiguity (Section 2.1). For example, acute (Section 2), while at the same time GHC’s exist- ing support for them has become more visibly inade- prefix :: a a a → [[ ]] → [[ ]] quate (Section 3). prefixx yss = map xcons yss To fix these inadequacies, we describe visible type ap- • where xcons ys = x : ys plication in patterns, a natural extension to GHC’s ex- Sadly, it is not always possible to write such a type signature. isting visible type applications from terms to patterns For example, to give a signature for xcons we might try: (Section 4). We give the first formal specification of scoped type prefix :: a a a • → [[ ]] → [[ ]] variables for Haskell, formalizing the folklore, and pro- prefixx yss = map xcons yss viding a firm foundation for both design and imple- where xcons :: a a [ ] → [ ] mentation (Section 5). xcons ys = x : ys As part of this specification, we offer a new and simpler • But Haskell98’s scoping rules specify that the a in the signa- typing judgment for GADT (Sec- ture for xcons is locally quantified, thus: xcons :: a. a tion 5.3), which treats uniformly the universal and [ ] → a . That is not what we want! We want the a in∀ the sig- existential variables of a data constructor. [ ] nature for xcons to mean “the universally quantified type variable for prefix”, and Haskell98 provides no way to do that. The inability to supply a type signature for xcons might arXiv:1806.03476v1 [cs.PL] 9 Jun 2018 seem merely inconvenient, but it is just the tip of an iceberg. 2 Motivation and background Haskell uses to infer types, and that is a won- 2.1 The need for type annotations derful thing. However, insisting on complete type inference— One of the magical properties of ML-family languages, in- that is, the ability to infer types for any well-typed program cluding Haskell, is that type inference allows us to write with no help from the programmer—places serious limits many programs with no type annotations whatsoever. In on the expressiveness of the type system. GHC’s version of practice, however, Haskell programs contain many user- Haskell has gone far beyond these limits, and fundamentally written type signatures, for two main reasons. relies on programmer-supplied annotations to guide type First, the type of a function can be extremely helpful as doc- inference. As some examples, see the work of Peyton Jones umentation, with the advantage that it is machine-checked et al. [2007], Vytiniotis et al. [2011], or Eisenberg et al. [2016]. documentation. Almost all programmers regard it as good So the challenge we address is this: it should be possible practice to provide a signature for every top-level function. for the programmer to write an explicit type signature for any Indeed, GHC has a warning flag, -Wmissing-signatures, sub-term of the program. To do so, some type signatures must which enforces this convention. refer to a type that is already in the static environment, so we 2 Richard A. Eisenberg, Joachim Breitner, and Simon Peyton Jones

show :: Showa a String a really is Int. GHC can use this fact during type check- ⇒ → read :: Reada String a ing the right-hand side of a function, like this: ⇒ → ++ :: a a a matchG :: Ga a ( ) [ ] → [ ] → [ ] → concat :: a a = [[ ]] → [ ] matchG MkInt 5 matchG MkFun = +10 ( ) Figure 1. Types of standard functions Again, however, matchG will only type-check if it is given a signature; see Schrijvers et al. [2009] for details. Second, as Haskell’s type system becomes increasingly Ambiguous types. Consider • expressive, complete type inference becomes intractable, and Fa the type system necessarily relies on programmer-supplied type family ambig :: Typeablea Fa Int type annotations. Here are some examples: ⇒ → Type-class ambiguity is present even in Haskell 98. test :: Char Int • → Consider1: testx = ambigx normalize :: String String In test GHC must decide at what type to call ambig; → normalizes = show reads that is, what type should instantiate the a in ambig’s ( ) type. Any choice a = τ must ensure that F τ Char This function parses a string to a value of some type, ∼ but, because F might not be injective, that does not tell and then turns that value back into a string. But noth- us what a should be. A type signature is not enough to ing in the code specifies that type, so the programmer resolve this case; we need a different form of type an- must disambiguate. One way to do so is to provide notation, namely visible type application (Section 2.3). a type signature that specifies the result type of the reads call, thus: There is a general pattern here: as the type system becomes more expressive, the type inference needs more guidance. normalizes = show reads :: Int ( ) Moreover, that guidance is extremely informative to the Polymorphic recursion. In ML, recursive calls to a func- programmer, as well as to the compiler. • tion must be at monomorphic type, but Haskell has always supported polymorphic recursion, provided 2.2 Support for scoped type variables the function has a type signature. For example: Given the increasing importance of type annotations, a good principle is this: it should be possible to specify, via a type data Ta = Leafa Node T a T a | ( [ ]) ( [ ]) signature, the type of any sub-expression or any let-binding. leaves :: Ta a → [ ] Alas, as shown in the introduction, Haskell98 supports only leaves Leafx = x ( ) [ ] closed type signatures, so there are useful type signatures leaves Node t1 t2 = concat leaves t1 ++ leaves t2 that we simply cannot write. ( ) ( ) Higher-rank types [Peyton Jones et al. 2007]. Consider The key deficiency in Haskell98 is that it provides noway • to bring type variables into scope. GHC has recognized this f :: a. a a Char , Bool ( [ ] → [ ]) → ([ ] [ ]) need for many years, and the ScopedTypeVariables extension fg =∀ g "Hello", g True, False ( [ ]) offers two ways to bring a type variable into scope: Here the type of g is polymorphic, so it can be applied Binding in a declaration signature. Since 2004 GHC • to lists of different type. The type signature is essential allows you to write to specify the type of the argument g; without it, f will prefix :: a. a a a be rejected. ∀ → [[ ]] → [[ ]] Generalized algebraic data types [Schrijvers et al. 2009]. prefixx yss = map xcons yss • where xcons :: a a The popular GADTs extension to GHC allows pattern [ ] → [ ] matching to refine the type information available in xcons ys = x : ys the right hand side of an equation. Here is an example: The explicit “ ” brings a into scope in the rest of the data Ga where type signature∀ (of course), but it also brings a into MkInt :: G Int scope in the body of the named function, prefix in this MkFun :: G Int Int case. The rule is a bit strange, because the definition ( → ) of prefix is not syntactically “under” the , and indeed When we learn that a value g :: Ga is actually the the signature can be written far away from∀ the actual constructor MkInt, then we simultaneously learn that binding. But in practice the rule works well, and we 1 Figure 1 gives the types of standard functions, such as read and show. take it as-is for the purposes of this paper. Type variables in patterns 3

Pattern signatures: binding a type variable in a pattern. 3.1 The binding structure of a pattern signature • For even longer, since 1998, GHC has allowed you to A pattern signature may bind a type variable, but it may also write this: mention a type variable that is already in scope. For example, we may write prefix x :: b yss = map xcons yss ( ) prefix x :: a yss = map xcons yss where xcons :: b b ( ) [ ] → [ ] where xcons ys :: a = x : ys xcons ys = x : ys ( [ ]) Here, the pattern signature x :: a binds a (as well as x), but Here, we bind the type variable b in the pattern x :: b , ( ) ( ) the pattern signature ys :: a simply mentions a (which is and this binding scopes over the body of the binding. ( [ ]) already in scope), as well as binding ys. The rule is this: a use We describe pattern signatures in much more detail in of a type variable p in a pattern signature is an occurrence Section 3. of p if p is already in scope; but binds p if p is not already in scope. 2.3 Visible type application It is entirely possible to have many different type vari- The TypeApplications extension provides a relatively new ables in scope, all of which are aliases for the same type. For form of type annotation: explicit type applications, first de- example: scribed by Eisenberg et al. [2016]. The idea is that an ar- prefix :: a. a a a gument of the form @ty specifies a type argument. This → [[ ]] → [[ ]] prefix x∀:: b yss :: = map xcons yss can often be used more elegantly than a type signature. For ( )( [[ ]]) where xcons ys :: d = x : ys example, a hypothetical unit-test for the function isJust :: ( [ ]) Maybea Bool, Here a, b, c, and d are all in scope in the body of xcons, and → are all aliases for the same type. testIsJust1 = isJust Just 2018 :: Int == True ( ( )) The current implementation of ScopedTypeVariables al- testIsJust2 = isJust Nothing :: Maybe Int == False ( ) lows such lexically-scoped type variables to stand only for other , and not for arbitrary , a point we can equivalently be written more elegantly using explicit type variables types return to in Section 3.5. type annotations testIsJust1 = isJust Just @Int 2018 == True 3.2 Pattern signatures are useful ( ) testIsJust2 = isJust Nothing @Int == False Pattern signatures have merit even if there are no type vari- ( ) ables around. Consider this Haskell program: Visible type application solves the awkward case of ambig main = do x readLn in Section 2.1: we can disambiguate the call with a type ← if nullx then putStrLn "Empty" argument. For example: else putStrLn "Not empty" type family Fa The types of the program are ambiguous: Clearly, x is some type instance F Bool = Char type that has a Read instance, and because it is passed to 2 ambig :: Typeablea Fa Int null, is a list , but the compiler needs to know the precise ⇒ → type, and rejects the program. test :: Char Int → To fix this in Haskell98, the programmer has two options: testx = ambig @Boolx They can wrap the call to readLn in a type annotation: • Here we specify that ambig should be called at Bool, and main = do x readLn :: IO Int ← ( [ ]) that is enough to type-check the program. if nullx then putStrLn "Empty!" It is natural to wonder whether we can extend visible type else printx application to patterns, just as we extended type signatures to patterns. Doing so is the main language extension suggested but this is infelicitous because there is no question that in this paper: Section 4. readLn is in the IO, and with larger types this can get very verbose. They can wrap an occurrence of x in a type annotation: 3 Pattern signatures and their • main = do x readLn shortcomings ← if null x :: Int then putStrLn "Empty" We see above that ScopedTypeVariables enables the user to ( [ ]) bind type variables in patterns, by providing a pattern signa- else printx ture, that is, a type signature in a pattern. We explore pattern 2Or, with a recent version of the standard library, it is something with a signatures and their shortcomings, in this section. Foldable instance. 4 Richard A. Eisenberg, Joachim Breitner, and Simon Peyton Jones

but again this is unsatisfying, because it feels too late. (The Showa constraint is brought into scope by the pattern Both variants are essentially work-arounds for the natural match.) way of specifying the type of x, namely at its binding site: However, existentials can never escape, forbidding the following function: main = do x :: Int readLn ( [ ]) ← jailbreak MkTicker val = val if nullx then putStrLn "Empty" ( ) else printx What should the type of jailbreak be? There is no answer to this question (jailbreak :: Ticker a is clearly too polymor- which is precisely what the PatternSignatures language ex- → tension provides3—the ability to write a type annotation in phic), and so GHC rejects this definition, correctly stating type variable ’a’ would escape its scope a pattern. that . Users, especially beginners, who have to track down a We naturally wish to name the existential type variable confusing type error in their code, can now exhaustively sometimes. For example, suppose we wanted to give newVal type annotate not just their terms, but also their patterns, (in the where clause of tick) a type signature. Saying newVal:: until they have cornered the bug. a would hardly do, because there is not yet a connection between the name a and the type unpacked from MkTicker. 3.3 Pattern signatures are essential We have to do this: Pattern signatures become more crucial when we consider tick MkTicker val :: b upd toInt limit = ... ( ( ) ) existential types. The ExistentialQuantification extension al- where newVal :: b lows users to bind existential variables in their data con- newVal = upd val structors. These are type variables whose values are “stored” by a constructor (but not really, because types are erased) The val :: b in the pattern binds the type variable b, so we and made available during pattern matching. Here are two can refer back to it later. examples: 3.4 Pattern signatures are clumsy data Ticker where Pattern signatures can be clumsy to use, when the type vari- MkTicker :: a. a a a a Int Ticker able is buried deep inside an ornate type. Here is a contrived ∀ → ( → ) → ( → ) → data Showable where example: MkShowable :: a. Showa a Showable Elab ∀ ⇒ → data where A Ticker contains an object of some type (but we do not know MkElab :: Showa Maybe Tree a, Int Elab ⇒ [ ( ( ))] → what type), along with an update function of type a a getE :: Elab Int → → and a way to convert an a into an Int. The Showable type getE MkElab xs :: Maybe Tree a, Int = ...a ... packs a value of an arbitrary type that has a Show instance ( ( [ ( ( ))])) along with its Show dictionary. Here are some functions that To bring a into scope in f ’s right-hand side, we have to repeat operate on these types: the MkElab’s elaborate argument type. More seriously, it may be impossible, rather than merely -- Updates a ticker, returning whether or not the ticker clumsy, to bind the variable we need. Consider the following -- has reached a limit GADT: tick :: Ticker Int Ticker, Bool → → ( ) data GMa where tick MkTicker val upd toInt limit ( ) MkMaybe :: GM Maybeb = newTicker, toInt newVal ⩾ limit ( ) ( ) matchGM :: a GMa Bool where newVal = upd val → → newTicker = MkTicker newVal upd toInt matchGMx MkMaybe = isJustx showAll :: Showable String This definition works just fine: GHC learns that x’s type [ ] → showAll = "" is Maybeb (for some existential b) and the call to isJust is [] showAll MkShowablex : ss = showx ++ showAll ss well typed. But what if we want to bind b in this definition? ( ) Annoyingly, MkMaybe has no argument to which we can We see that the tick function can unpack the existential in the apply a pattern signature. Nor does it work to wrap a pattern Ticker value and operate on the value of type a without ever signature around the outside of the match, thus: knowing what a is. Similarly, the showAll function works matchGM :: a GMa Bool with data of type a knowing only that a has a Show instance. → → matchGMx MkMaybe :: GM Maybeb = isJust @bx 3Modern GHC actually folds PatternSignatures into ScopedTypeVariables, ( ( )) giving both extensions the same meaning. However, it is expositionally This definition is rejected. The problem is that the type anno- cleaner to separate the two, as we do throughout this paper. tation on the MkMaybe pattern is checked before the pattern Type variables in patterns 5 itself is matched against4. Before matching the MkMaybe, type family Fa where we do not yet know that a is really Maybeb . Nor can we F Int = Bool put the type annotation on x, as that, too, occurs before the F Char = Double MkMaybe pattern has been matched. A possible solution is F Float = Double this monstrosity: Naturally, we can use a type family to define the type of an matchGM :: a GMa Bool argument to an existential data constructor: → → matchGMx gm @MkMaybe = case gm of data TF where :: GM Maybeb isJust @bx MkTF :: a. Typeablea Fa TF ( ( )) → ∀ ⇒ → The MkTF constructor stores a value of type Fa ; it also This is grotesque. We must do better, and we will in Section 4. ( ) stores a dictionary for a Typeablea constraint [Peyton Jones 3.5 Pattern signatures resist refactoring et al. 2016]—that is, we can use a runtime type test to discover In Section 3.1, we explained that a scoped type variable may the choice for a. We would thus like to write the following refer only to another type variable. This means that the function: definition toDouble :: TF Double → toDouble MkTFx -- We expect x :: Fa prefix :: a a a ( ) → [[ ]] → [[ ]] Just HRefl isType @Int = if x then 1.0 else 1.0 prefix x :: b yss = map xcons yss | ← − ( ) Just HRefl isType @Char = x xcons ys = x : ys | ← where Just HRefl isType @Float = x | ← otherwise = 0.0 is accepted, because b stands for the type variable in the type | of prefix. But suppose, for example, we specialize the type where signature of prefix without changing its definition: isType :: ty. Typeable ty Maybe a : : ty ⇒ ( ≈ ) isType =∀eqTypeRep typeRep @a typeRep @ty prefix :: Int Int Int ( )( ) → [[ ]] → [[ ]] prefix x :: b yss = map xcons yss The specifics of this function are not important here (see ( ) [Peyton Jones et al. 2016]). For our present purposes, the cru- where xcons ys = x : ys cial point is this: the existentially-bound type a is mentioned Now this definition is rejected with the error message “Couldn’t in both the definition of isType and its type signature—but match expected type b with actual type Int”, because b would there is no way to bring a into scope. We might try using a have to stand for Int. pattern signature at the binding of x, thus: Since the design of ScopedTypeVariables, GHC has evolved, toDouble MkTF x :: Fa = ... and with the advent of type equalities, the restriction itself ( ( )) but that does not quite work. The problem is that F is not in- becomes confusing. Should GHC accept the following defi- jective. The a in that pattern type annotation need not be the nition? same one packed into the existential type variable by MkTF, prefix :: a Int Int a a and GHC rightly considers such an a to be ambiguous5. ∼ ⇒ → [[ ]] → [[ ]] prefix x :: b yss = map xcons yss The only workaround available today is to change MkTF ( ) where xcons ys = x : ys to take a proxy argument: data Proxya = Proxy -- in GHC’s Data.Proxy Is b an alias for a (legal) or for Int (illegal)? Since a and Int are data TF where equal, the question does not really make sense. We therefore MkTF :: a. Typeablea Proxya Fa TF propose to simply drop this restriction (Section 4.3). ∀ ⇒ → → toDouble MkTF :: Proxya x = ... ( ( ) ) 3.6 Pattern signatures are inadequate The Proxy type stores no runtime information (at runtime, We end our growing list of infelicities with a case in which it is isomorphic to ), but the type Proxya carries the all- () there is no way whatsoever to bind the type variable, short important type variable a. All datatypes are injective, so we of changing the definition. can use this proxy argument to bind the type variable a in a Type families [Chakravarty et al. 2005; Eisenberg et al. way that we could not do previously. 2014] allow users to write type-level functions and encode As with many other examples, this is once again unsatisfy- type-level computation. For example, we might write this: ing: it is a shame that we have to modify the data constructor declaration just to deal with type variable binding.

4This ordering arises because any type variable in a pattern signature is 5If F were an injective type family, we could label it as such to fix the bound within the pattern. problem [Stolarek et al. 2015]. But here we assume that it is not. 6 Richard A. Eisenberg, Joachim Breitner, and Simon Peyton Jones

3.7 Conclusion Term variables x, y, z, f , g, h ∋ In this section we have seen that pattern signatures allow Internal type vars a, b us to bring into scope the existentially-bound type variables ∋ User type vars c of a data constructor, but that doing so can be clumsy, and ∋ Data constructors K occasionally impossible. We need something better. ∋ Atoms ν F K x | 4 Visible type application in patterns Expressions e F ν λx.e e1 e2 case e of p e Consider again Elab from Section 3.4: | | | { → } Patterns p F x K p | data Elab where Polytypes σ F a. τ MkElab :: Showa Maybe Tree a, Int Elab ∀ ⇒ [ ( ( ))] → Monotypes τ,υ F tv Tτ ... and suppose we want to build a value of type Elab containing | | Type variables tv F a c an empty list. We cannot write just MkElab because that | [] Type env Γ F ϵ Γ,ν : σ is ambiguous: we must fix the type at which MkElab is called | so that the compiler can pick the right Show dictionary. We Substitutions θ F tv τ can use a type signature, but it is clumsy, just as the pattern [ 7→ ] Types of data cons Γ0 = K: a. σ T a signature was clumsy in Section 3.4: ∀ → MkElab :: Maybe Tree Bool, Int ([ ] [ ( ( ))]) Figure 2. The initial grammar It is much nicer to use visible type application and write MkElab @Bool . So it is natural to ask whether we could [] do the same in patterns, like this: as an alternative to getE :: Elab Int main = do Just x :: Int readMaybe ‘fmap‘ getLine → ( ( )) ← getE MkElab @a xs = ...a ... putStrLn $ "Input was " ++ showx ( ) Here, we bind a directly, as a type-argument pattern all by Visible type application in patterns considers the type of itself, rather than indirectly via a pattern signature. data constructor, exactly as written by the user. For example We call this visible type application in patterns, a dual of visible type application in the same way that a pattern sig- data Gab where natures are a dual of type signatures. This section describes G1 :: b. Char G Intb ∀ → visible type application in patterns informally, while the next G2 :: pqb . p q b G p, q b ∀ → → → ( ) formalizes it. G3 :: pqab . a p, q p q b Gab 6 ∀ ( ∼ ( )) ⇒ → → → This feature was first requested more than two years ago. f :: Ga Bool Int → Furthermore, binding type variables like this is useful for f G1 @Booly = ordy more than just disambiguation, as we will shortly see. ( ) f G2 @p @qxyz = 0 ( ) f G3 @p @q @a @Boolxyz = 1 4.1 Examples ( ) Visible type application in patterns immediately fixes the In this definition other problems of pattern signatures identified above. For G1 has one type argument. • example, in the GADT example of Section 3.4 we can write G2 has three type arguments, but we have chosen to • matchGM :: a GMa Bool match only the first two. → → G3 is morally identical to G2, because of the equality, matchGMx MkMaybe @b = isJust @bx • ( ) but it is written with four type arguments, and visible and for the type-family example of Section 3.6 we write type application in patterns follows that specification. toDouble MkTF @ax = ... ( ) 4.3 Type aliases 4.2 Universal and existential variables In Section 3.5 we have seen that GHC currently restricts type Visible type applications in patterns can be used for all the variables to refer to type variables, but that this does not type arguments of a data constructor, whether existential or have to be the case. Similar questions arise in our function f universal. As an example of the latter we may write above. Could we write this for G3? main = do Just @Intx readMaybe ‘fmap‘ getLine f G3 @p @q @ p, q @bxyz = 1 ( ) ← ( ( ) ) "Input was " ++ putStrLn showx Instead of a we have written p, q , which is equal to a. And ( ) 6https://ghc.haskell.org/trac/ghc/ticket/11350 instead of Bool we have written b, thereby binding b to Bool. Type variables in patterns 7

Γ e : τ Expression typing ⊢ Γ e : υ c = ftv υ ⊢ ( ) Γ, x : τ1 e : τ2 Γ e1 : τ1 τ2 Γ e2 : τ2 Γ, x : c.υ e2 : τ ⊢ Abs ⊢ → ⊢ App ∀ ⊢ Let Γ λx.e : τ τ Γ e e : τ Γ let x :: υ = e in e : τ ⊢ 1 → 2 ⊢ 1 2 2 ⊢ 2 i ν : tv.υ Γ Γ e : υ Γ p pi : υ Γ′ Γ′ ei : τ ( ∀ ) ∈ VarCon ⊢ ⊢ ⇒ i i ⊢ Case Γ ν : tv τ υ Γ case e of p e i : τ ⊢ [ 7→ ] ⊢ { i → i }

Γ p : σ Γ Pattern typing ⊢p ⇒ ′ k i k i i K: ai . σk T ai ΓΓ p∗ pk : ai τi σk Γ′ PatVar ( ∀ → ) ∈ ⊢ [ 7→ ] ⇒ PatCon98 Γ p x : σ Γ, x : σ Γ K p k :Tτ i Γ ⊢ ⇒ ⊢p k i ⇒ ′

Γ p : σ i Γ Pattern sequence typing ⊢p∗ i i ⇒ ′ i 1..n Γi 1 p pi : σi Γi ∈ − ⊢ ⇒ i 1..n PatSeq Γ p : σ ∈ Γ 0 ⊢p∗ i i ⇒ n Figure 3. Typing of Haskell98 patterns

Given the ubiquity of equalities, it no longer seems to in Fig. 3. In the typing rules, we use a convention where an make sense to restrict what a scoped type variable can stand over-bar indicates a list, optionally with a superscript index for, so we propose simply to drop the restriction. Doing so to indicate the iterator. Iterators are additionally annotated simplifies the specification and the implementation of both with length bounds, where appropriate. pattern signatures and visible type application in patterns. The grammar includes separate metavariables for internal A GHC proposal by one of the authors [Breitner 2018] is type variables a and user type variables c. The former are underway. Relaxing the requirement also allows the user to type variables as propagated by the compiler, while the latter use type variables as “local type synonyms” that stand for are type variables the user has written. It is as if internal type possibly long types: variables a are spelled with characters unavailable in source Haskell. This distinction becomes important in Section 5.5. processMap :: Map Int Maybe Complex Type ... ( ( )) → The language also includes only annotated let-bindings; no processMap m :: Map key value = ... ( ) let-generalization here. (The “generalization” you might spot in Rule Let is simply quantifying over the variables the user 5 Formal Specification has lexically written in the type signature.) This keeps our treatment simple and avoids the challenges of type infer- We give the first formal specification of a number of exten- ence. Allowing full let-generalization and un-annotated lets sions to Haskell related to pattern matching and the scoping changes none of the conclusions presented here. of type variables: annotations in patterns, scoped type vari- The judgment Γ e : τ indicates that the term e has type τ ables, and type application syntax in patterns. This section ⊢ in the context Γ, where Γ is a list of term variables and their builds up these specification step by step, starting with a (possibly polymorphic) types. Data constructors are globally specification of the language without these features. fixed in an initial context Γ ; it is assumed that any context Our specification does not cover let-bindings and declara- 0 Γ contains the global Γ binding data constructors. tion type signatures. We focus instead on pattern signatures, 0 The type-checking of possibly nested patterns, as they which is where our new contribution lies. The scoping of occur in a case statement, is offloaded to the judgment Γ p : forall-bound type variables from declaration type signatures ⊢p σ Γ , which checks that p is a pattern for a value of type σ would be straightforward to add. ⇒ ′ and possibly binds new term variables, which are added to Γ and returned in the extended environment Γ . The auxiliary 5.1 The baseline ′ judgment Γ p : σ i Γ straightforwardly threads the We begin with a reduced model of Haskell98 terms, which ⊢p∗ i i ⇒ ′ environment through a list of such pattern typings. knows nothing yet about scoped type variables nor type These rules should be unsurprising, but provide a baseline equalities (i.e., no GADTs). We also removed con- from which to build. straints. The syntax is given in Fig. 2, and the typing rules are 8 Richard A. Eisenberg, Joachim Breitner, and Simon Peyton Jones

Updates to grammar: T in the return type are no longer confined to be a, the quanti- Constraints Q F ϵ Q Q τ τ ... | 1 ∧ 2 | 1 ∼ 2 | fied type variables; instead they can be arbitrary monotypes. Polytypes σ F a. Q τ In addition, a constructor can include a constraint Q. ∀ ⇒ Type env Γ F ϵ Γ,ν : σ Γ, a : Γ, Q When a data constructor is used in an expression, then the | | ∗ | type equalities must be satisfied in the current environment, Types of data cons Γ0 = K: a. Q σ Tτ as expressed by the new premise of Rule VarConQ in Fig. 4. ∀ ⇒ → Constraint entailment Γ ⊩ Q We see also that the type equalities in the environment can Γ e : τ Expression typing be used for implicit coercion, as expressed in the Rule Eq. ⊢ When pattern-matching a data constructor, Rule PatCon ν : a. Q υ Γ Γ e : τ1 ( ∀ ⇒ ) ∈ ⊢ brings the type variables a into scope, by extending Γ. We Γ ⊩ a τ Q Γ ⊩ τ1 τ2 [ 7→ ] VarConQ ∼ Eq require that these bound variables are fresh with respect to Γ ν : a τ υ Γ e : τ ⊢ [ 7→ ] ⊢ 2 other variables in scope, a requirement we can satisfy by i α-renaming if necessary. We also add the type equalities that Γ p : υ Γ Γ e : τ ⊢p i ⇒ i′ i′ ⊢ i we have learned to the environment—that is, the equivalence Γ e : υ ftv τ dom Γ j j ⊢ ( ) ⊆ ( ) between the υj from the data constructor’s type and the τj i CaseTv Γ case e of pi ei : τ from the pattern type. ⊢ { → } Finally, we update Rule CaseTv to prevent skolem escape Γ, a : e : c a υ and Rule LetTv to track the internal variables brought into ∗ ⊢ [ 7→ ] c = ftv υ a # dom Γ scope. Note that these variables are internal only—the user ( ) ( ) Γ, x : c.υ e2 : τ cannot write them in a program. ∀ ⊢ LetTv At this point, our type system is comparable in expres- Γ let x :: υ = e in e2 : τ ⊢ siveness to the specification given by Vytiniotis .et al [2011]. A notable difference is that we explicitly handle nested pat- Γ p : σ Γ′ Pattern typing ⊢p ⇒ terns. This is important, as in the presence of GADTs, the i j K: a. Q σi Tυj Γ a # dom Γ precise formulation of how nested pattern are type-checked ( ∀ ⇒ j → ) ∈i ( ) Γ, a : ,υj τj , Q p∗ pi : σi Γ′ matters. For example, consider: ∗ ∼ ⊢ ⇒ PatCon Γ K p i :Tτ j Γ ⊢p i j ⇒ ′ data Ga where G1 :: G Bool Figure 4. Adding support for GADTs G2 :: Ga f :: Ga , a, Ga Bool ( ) → 5.2 Support for GADTs f G1, True, = False ( ) Now we extend this language with support for GADTs, with f , True, G1 = False ( ) their existential type variables and equality constraints. See Here the first equation for f is fine, but the second is not, Fig. 4. The term syntax is unchanged, but polytypes now can because the pattern True cannot match against an argument mention constraints, which can either be empty (and elided of type a until after the constructor G1 has been matched— from this text), an equality between two monotypes, or a and matching in Haskell is left-to-right. conjunction of constraints. We leave the possibility open for additional constraints, as indicated by the ellipsis. 5.3 Treating universals and existentals uniformly The environment Γ is extended with two new forms. First, A technical contribution of this paper is that Rule PatCon is we track the scope of type variables by adding a : to ∗ simpler and more uniform than the one usually given [e.g. Γ7. Second, we add constraints Q to Γ, to indicate which constraints (bound by a GADT pattern match) are in scope. by Vytiniotis et al. 2011], in that it does not distinguish the universal and existential type variables of the data constructor. Conversely, constraints are proved by the Γ ⊩ Q entailment relation. As type inference and entailment is not the subject Instead, all the type variables are freshly bound, with the equalities υ τ j linking them to the context. In particular, of this paper, we leave this relation abstract. The concrete j ∼ j instantiation of this judgment by, e.g., that of Vytiniotis et al. these equalities take the place of the substitution written in [2011] would be appropriate in an implementation of this the previous Rule Con98. type system. However, there is a worry: pattern-matching involving Support for GADTs can be seen in the new form of data GADTs lacks principal types, and hence usually requires constructor types, listed in Fig. 4. Note that the arguments to a type signature (see Section 2.1). If we treat vanilla, non- GADT Haskell98 data types in the same way as GADTs, do 7Haskell supports higher kinds, but we elide that here for simplicity, and we lose type inference for ordinary Haskell98 definitions? assume that all type variables have . Specifically, Vytiniotis et. al [2011, Section 5.6.1] describe ∗ Type variables in patterns 9

Patterns p ... p :: σ σ = c. Q υ ftv σ dom Γ F ∀ ⇒ ( ) ⊆ ( ) | Γ, c : , Q e : υ Γ, x : σ e2 : τ ∗ ⊢ ⊢ LetForall ftv σ ′ = Γ ⊩ σ σ ′ Γ p p : σ ′ Γ′ Γ let x :: σ = e in e : τ ( ) ∅ ≤ ⊢ ⇒ PatSig ⊢ 2 Γ p p :: σ ′ : σ Γ′ c = ftv σ dom Γ Γ = Γ, c : , c b ⊢ ( ) ⇒ ( ′) \ ( ) ′ ∗ ∼ Γ′ ⊩ σ σ ′ Γ′ p p : σ ′ Γ′′ Figure 5. Syntax and typing rule for pattern signatures ≤ ⊢ ⇒ PatSigTv Γ p :: σ : σ Γ ⊢p ( ′) ⇒ ′′ how assumed local constraints can interfere with type in- Figure 6. Typing with scoped type variables ference, essentially by making certain unification variables “untouchable” (that is, unavailable for unification). That sec- tion also describes how to make more unification variables relationship is backwards from the usual expected/actual touchable in the non-GADT case, when the constraints entail relationship in typing because patterns are in a negative position. The subtleties of of polytype subtyping are well ex- no equalities. But our typing rule introduces equalities even 9 in the non-GADT case, so this mechanism fails for us. plored in the literature and need not derail our exploration Let us investigate Rule PatCon specialized to the case of here. However, note that Rule PatSig checks pattern p with an ordinary, non-GADT constructor, which binds no context the annotated type σ ′, not the more general σ—after all, the σ Q and does not constrain its result type arguments: user has asked us to use ′. j i j K: aj . σi T aj Γ a # dom Γ 5.5 Scoped type variables ( ∀ j → j ) ∈ i ( ) Γ, aj : , aj τj p∗ pi : σi Γ′ Next, we add support for open pattern signatures that bring ∗ ∼ ⊢ ⇒ PatCon98’ i j type variables into scope. We need not change any syntax. Γ p K pi :Tτj Γ′ ⊢ ⇒ Only the typing rule for annotated patterns changes: we We see that all of its assumed equality constraints take the replace PatSig with PatSigTv and add LetForall in Fig. 6. form a τ , where a is freshly bound. We can view such j ∼ j j The LetForall rule allows programmers to bring vari- equalities not as true assumed equalities (which lead to the ables c into scope when an explicit is mentioned in the type inference problems for GADTs), but instead as a form source. Note that this rule does not do∀ any implicit lexical of local -binding: the context simply gives us the defi- let generalization: echoing GHC’s behavior, if the user writes a nition of these type variables. In this interpretation, it is , all new variables to be used in the type signature must be critical that the type variable in the equality assumption is bound∀ explicitly. freshly bound—that is, we are not referring to a type variable In Rule PatSigTv, the last two premises are identical to from a larger scope. Viewing the equalities in Γ as -like, ′ let those of its predecessor PatSig. The first premise extracts it is sensible to extend the ad-hoc extension of Vytiniotis the type variables c that are free in the user-written type et al. [2011] to include such forms. Indeed, doing so is an signature σ , but not already in scope in Γ. The “not already independently-useful improvement to type inference, and ′ in scope” part reflects the discussion of Section 3.1. GHC has already adopted it, in response to a request8 from But what if Γ contains a binding, introduced by rule Pat- one of this paper’s authors. Thus, despite the addition of Con, for a type variable that just happens to have the same equalities in Rule PatCon, we do not have a negative effect name as one of the c in a user-written signature? After all, on type inference. the names of the type variables in PatCon are arbitrary inter- 5.4 Closed pattern signatures nal names; they just need to be fresh. Our solution is simple: we take advantage of the difference between internal type Our next step is to formalize PatternSignatures, which allows variables and external ones. The user cannot accidentally the user to annotate patterns with type signatures, but for capture an internal variable. now we will only handle pattern signatures. We simply closed The second (top-right) premise of Rule PatSigTv is the add one new typing rule PatSig, shown in Fig. 5. Note that most unusual. It brings the variables c into scope, but then the user is allowed to give the pattern a more specific type, also assumes that each variable c equals some internal vari- as in this example, which requires RankNTypes: able b. (Nothing stops multiple elements of b from being f :: a. a a Int ( → ) → the same internal variable.) Strikingly, the b are mentioned f x ∀:: Int Int = x 42 ( → ) nowhere else in the rule. The existence of the b in this The typing rule expresses this through the premise Γ σ premise essentially says that the c are merely a renaming of ⊩ ≤ σ ′, an appeal to the subtype relation on polytypes. This sub- existing in-scope internal variables. In practice, the b are cho- type relationship checks that the expected type of the pattern sen in order to make the subtyping relationship Γ σ σ ′ ⊩ ≤ ′ σ is more general than the annotated type σ ′. Note that this 9GHC’s current implementation of subtyping is described by Eisenberg 8https://ghc.haskell.org/trac/ghc/ticket/15009 et al. [2016], who also cite other relevant publications on the subject. 10 Richard A. Eisenberg, Joachim Breitner, and Simon Peyton Jones

Patterns p F x K @τ p p :: σ This version supports type applications only in constructor | | patterns; we study pattern synonyms in Appendix A. The syntax and new typing rule are shown in Fig. 7. Rule Pat- j k i ConTyApp looks scary, but it just integrates the concepts K: aj . Q σk Tυi Γ ( ∀ j⇒ → ) ∈ seen in Rule PatSigTv into Rule PatCon. We have kept all c l = ftv τ dom Γ a j # dom Γ l ( j′ ) \ ( ) j ( ) the iteration indices to help the reader match up which lists j l l i Γ′ = Γ, a : , c : , c b ,υ τ , Q are expected to have the same size. j ∗ l ∗ l ∼ l i ∼ i j Let us look at each premise separately: Γ′ ⊩ τj′ aj ∼ k l Γ′ ∗ p : σ Γ′′ Once again, the type variables c are those that occur ⊢p k k ⇒ • l j PatConTyApp in the explicit type patterns but are not yet in scope. Γ K @τ p k :Tτ i Γ ⊢p j′ k i ⇒ ′′ These are treated like type variables in a pattern sig- nature: they are brought into scope here, each as a Figure 7. Typing of type applications in patterns short-hand for some internal type variable bl. The environment Γ is extended to Γ′ and contains • j hold; GHC checks this subtyping relationship, unifying the c now the (internal) type variables aj , the user-written l with internal variables b as necessary. Because the subtyping scoped type variables cl , the type equations equat- relationship is checked with respect to a context that con- ing each cl to its internal type variable bl, the GADT equalities υ τ i, and the constraint Q captured by K. tains the c b equalities, the b do not need to be explicitly i ∼ i ∼ The type patterns are checked against the types they mentioned again in the rule. For example, consider • match against. In contrast to pattern signatures, we ExIntegral Integral data where -- packs an value use type equality here ( ), not the subtyping relation ∼ MkEx :: a. Integrala a ExIntegral ( ): no types involved can be polytypes, and so the ∀ ⇒ → ≤ getInt :: ExIntegral Integer subtyping relation degenerates to type equality. → getInt MkEx x :: c = toInteger @cx ( ( )) As written here, the rule requires a type application for The pattern match on MkEx brings an internal existential j each type variable (note that the @τj′ use the same indices variable a into scope, via the PatCon rule. Recall that the user j as the quantified type variables aj in K’s type). However, cannot type the name of such a variable. Instead, the user we can weaken this requirement simply by dropping some annotates the pattern x with the user-written type variable c. τj′s from both the conclusion and the relevant premises. This annotation triggers Rule PatSigTv, which must find an Just as in Rule SigPatTv: internal variable b such that a : , c : , c b a c. The ∗ ∗ ∼ ⊩ ≤ l answer is that we must choose b to be equal to the variable The bl are mentioned nowhere else in the rule; instead, • a, and the rule succeeds. We have thus renamed the internal they are fixed such that the equality constraints for variable a to become the user-written variable c and can j the τj′ are entailed by Γ′. successfully use c in the pattern’s right-hand side. The rule requires that each user-written type variable • Contrast that behavior with this (failing) example: stands for an internal variable, but this choice could l l notAVar :: Int Int readily be changed by writing υ instead of b in the → l′ l notAVar x :: c = x definition of Γ′. ( ) Here, we are trying to bind a user-written type variable c to 5.7 Conclusion Int. GHC rejects this function, saying that c does not match with Int. In terms of Rule SigPatTv, there exists no b such Through the incremental building of rules, we can see pre- that c : , c b Int c holds. cisely how the new feature of explicit binding sites for type ∗ ∼ ⊩ ≤ There is a free design choice embodied in Rule SigPatTv: variables fits into the existing typing framework. We have the rule asserts that each c must be a renaming of a type also explored two further extensions: variable. Instead, we could replace c b with c τ , allowing Allowing type application in patterns headed by pat- ∼ ∼ • each type variable to rename a type. Nothing else in the tern synonym [Pickering et al. 2016]. Our framework system would have to change. Indeed, understanding this extends well in this new context, offering no surprises very fact is one of the primary motivators for writing this (Appendix A). specification in the first place. Incorporating explicit binding sites for type variables • in the patterns of a λ-expression. This is slightly subtler 5.6 Type applications in patterns (though the end result adds only one, simple typing Having nailed down the status quo, it is now easy to specify rule), but is relegated to the appendix because it re- what it should mean to use type applications in patterns. quires reasoning about bidirectional type checking. Type variables in patterns 11

Bringing all the necessary context into scope would that deconstructing a datatype resembles closely the take us too far afield here (Appendix B). syntax of constructing one. We thus prefer not to differentiate universals and existen- 6 Alternative approaches tials in this way. 6.1 Universals vs. existentials 6.2 The type-lambda approach Type theorists are wont to separate quantified type vari- A plausible alternative approach to adding scoped type vari- ables in data constructors into two camps: universals and ables is to take a hint from System F, the explicitly-typed existentials. Here is a contrived but simple example: polymorphic lambda calculus [Girard 1990]. In System F, a data UnivExa where type lambda, written “Λ”, binds a type variable, just as a term MkUE :: ab . a b UnivExa lambda, written “λ”, binds a term variable. For example: ∀ → → matchUE :: a. UnivExa ... id : α. α α → ∀ → matchUE MkUE∀ xy = ... id = Λα. λx : α. x ( ) A term Λα.e has type α.τ , for some type τ , just as a term In the constructor MkUE, the variable a is universal (it is λx.e has type τ τ .∀ Hence, a very natural idea is to bind a fixed by the return type UnivExa ) while b is existential (it is 1 → 2 source-language type variable with a source-language type not fixed by the result type). When we match on MkUE in lambda. This “the type-lambda approach” is the one adopted matchUE, we might want to bind b, as it is first brought into by SML 97 [Milner et al. 1997]. In SML one can write: scope by the match. However, we never need to match a, as it is already in scope from matchUE’s type signature. fun 'a prefix (x : 'a) yss = An alternative design for type applications in patterns is let fun xcons (ys : 'a list) = x :: ys in to allow matching only existentials in pattern matches, thus: map xcons yss Here, “’a” following the keyword fun is the binding site of matchUE :: a. UnivExa .... an (optional) type parameter of prefix; it scopes over the ∀ → matchUE MkUE @bxy = ... patterns of the definition and its right hand side. ( ) Just as Haskell has implicit quantification in type signa- Indeed, this forms the main payload of the original GHC pro- tures, SML allows the programmer to introduce implicit type posal for binding type variables [Suarez 2017]. This design is lambdas. This definition is elaborated into the previous one: attractive because the bindings would be concise: only those variables that need to be bound would be available. However, fun prefix (x : 'a) yss = there are two distinct drawbacks: let fun xcons (ys : 'a list) = x :: ys in map xcons yss Given the complexity of Haskell, it can be hard to spec- • The language definition gives somewhat intricate rules to ify the difference between universals and existentials: explain how to place the implicit lambdas. For example: Clearly, a is universal in the constructor MkUE above. But what if its type were MkUE :: ab . a b fun f x = ....(fun (y:’a) => y).... → → UnivEx Ida , where Id is a type synonym?∀ An injec- Where is the type lambda that binds the type variable ’a? ( ) tive type family? If we add a b to the constraints In SML one cannot answer that question without knowing ∼ of MkUE, then b is also fixed by the result type—does both what the “....” is, and the context for the definition fun f. that make it a universal? Roughly speaking, the type lambda for an implicitly-scoped The question of whether the value of a type variable type variable ’a is placed on the innermost function defi- is fixed by the return type depends on how smart the nition that encloses all the free occurrences of ’a. The rule compiler is, and any specification would have to draw [Milner et al. 1997] is only one informal, albeit carefully an arbitrary line. In the end, this would leave our users worded, paragraph; the formal typing rules assume that a just very confused. pre-processing pass has inserted an explicit binding for every When using a data constructor in an expression, the type variable that is implicitly bound by the above rule. • caller is free to instantiate both universals and exis- The type-lambda approach explicitly connects lexical scop- tentials. Indeed, universals and existentials are utterly ing and quantification. In contrast, our approach decouples indistinguishable in expressions. That means that one the two, by treating a lexically scoped type variable merely might write MkUE @Int @Bool 5 True in an expres- as an alias for a type (or type variable). sion. If we could match against only existentials in patterns, though, we would have to write a pattern Acknowledgments MkUE @bxy , remembering to skip the universal a. This material is based upon work supported by the Na- This would both be confusing to users and weaken the tional Science Foundation under Grant No. 1319880, Grant ergonomics of pattern matching, whose chief virtue is No. 1521539, and Grant No. 1704041. 12 Richard A. Eisenberg, Joachim Breitner, and Simon Peyton Jones

References Joachim Breitner. 2018. Allow ScopedTypeVariables to refer to types. GHC proposal. (2018). https://github.com/ghc-proposals/ghc-proposals/pull/ 128 Joachim Breitner, Richard A. Eisenberg, Simon L. Peyton Jones, and Stephanie Weirich. 2016. Safe zero-cost coercions for Haskell. Jour- nal of Functional Programming 26 (2016), e15. https://doi.org/10.1017/ S0956796816000150 Manuel M. T. Chakravarty, Gabriele Keller, Simon L. Peyton Jones, and Simon Marlow. 2005. Associated types with class. In POPL. ACM, 1–13. https://doi.org/10.1145/1040305.1040306 Richard A. Eisenberg, Dimitrios Vytiniotis, Simon L. Peyton Jones, and Stephanie Weirich. 2014. Closed type families with overlapping equa- tions. In POPL. ACM, 671–684. https://doi.org/10.1145/2535838.2535856 Richard A. Eisenberg, Stephanie Weirich, and Hamidhasan G. Ahmed. 2016. Visible Type Application. In ESOP (LNCS), Vol. 9632. Springer, 229–254. https://doi.org/10.1007/978-3-662-49498-1_10 Extended version at https://cs.brynmawr.edu/~rae/papers/2016/ type-app/visible-type-app-extended.pdf. J-Y Girard. 1990. The System F of variable types: fifteen years later. In Logical Foundations of Functional Programming, G Huet (Ed.). Addison-Wesley. Robin Milner, Mads Tofte, Robert Harper, and David MacQueen. 1997. The Definition of Standard ML (Revised). MIT Press, Cambridge, Mas- sachusetts. xii+114 pages. Simon L. Peyton Jones, Dimitrios Vytiniotis, Stephanie Weirich, and Mark Shields. 2007. Practical type inference for arbitrary-rank types. Journal of Functional Programming 17, 1 (2007), 1–82. https://doi.org/10.1017/ S0956796806006034 Simon L. Peyton Jones, Stephanie Weirich, Richard A. Eisenberg, and Dim- itrios Vytiniotis. 2016. A Reflection on Types. In A List of Successes That Can Change the World - Essays Dedicated to Philip Wadler on the Occasion of His 60th Birthday (LNCS), Vol. 9600. Springer, 292–317. https://doi.org/10.1007/978-3-319-30936-1_16 Matthew Pickering, Gergo Érdi, Simon L. Peyton Jones, and Richard A. Eisenberg. 2016. Pattern synonyms. In Haskell Symposium. ACM, 80–91. https://doi.org/10.1145/2976002.2976013 Tom Schrijvers, Simon L. Peyton Jones, Martin Sulzmann, and Dimitrios Vytiniotis. 2009. Complete and decidable type inference for GADTs. In ICFP. ACM, 341–352. https://doi.org/10.1145/1596550.1596599 Jan Stolarek, Simon L. Peyton Jones, and Richard A. Eisenberg. 2015. In- jective type families for Haskell. In Haskell Symposium. ACM, 118–128. https://doi.org/10.1145/2804302.2804314 Emmanuel Suarez. 2017. Binding existential type variables. GHC proposal. (2017). https://github.com/ghc-proposals/ghc-proposals/pull/96 Dimitrios Vytiniotis, Simon L. Peyton Jones, Tom Schrijvers, and Martin Sulzmann. 2011. OutsideIn(X) Modular type inference with local as- sumptions. Journal of Functional Programming 21, 4-5 (2011), 333–412. https://doi.org/10.1017/S0956796811000098 Type variables in patterns 13

Pattern synonyms P constraint listed is the required one. (When only one con- ∋ straint is written, we understand that it is the required one, Expressions e F ... P | not the provided one.) Patterns p F ... P @τ p Because of the two contexts, our old Rule VarConQ is | insufficient to deal with pattern synonyms used in an expres- Pattern synonym types Γ1 = P: a. Qr b. Qp σ υ ∀ ⇒ ∀ ⇒ → sion. We thus introduce the straightforward Rule Syn. Corresponding to the two contexts, a pattern synonym i Γ e : τ Expression typing has two sets of type variables, the universals ai and the ⊢ j P: a. Qr b. Qp υ Γ existentials bj . The universals are those that are mentioned ( ∀ ⇒ ∀ ⇒ ) ∈ θ = a τ b τ ′ in the result type υ; they can be fixed by comparing υ with τ , [ 7→ ][ 7→ ] 10 Γ ⊩ θ Qr Qp the known type of the pattern . In contrast, the existentials ( ∧ ) Syn Γ P: θ υ are simply brought into scope by the pattern; they match no ⊢ ( ) other known type (except as entailed by the constraints Qp ). Let us work premise-by-premise through Rule PatSyn. Γ p : σ Γ Pattern typing ⊢p ⇒ ′ i j k P: ai . Qr bj . Qp σk υ Γ We first look up the pattern synonym type in theen- ( ∀ ⇒ ∀ i ⇒ → ) ∈ • Γ = Γ, i, υ i Γ vironment. ′ ai : ai i′ ai # dom i ∗ ∼ ( ) We then bring the universals ai into scope, allowing Γ′ ⊩ υ τ Qr • ( ∼ )i ∧ j j each universal variable a to equal some other type υ . l = τ , τ Γ # Γ i i′ cl ftv i′ j′′ dom bj dom We can think of these υ as the instantiations for the ( j ) \ ( ) l ( ) i′ Γ = Γ , , l, ,Q ′′ ′ bj : cl : cl bl′ p variables ai. i∗ ∗ ∼ Γ τ Under these assumptions (that is, after instantiating ′′ ⊩ i′ ai • ∼ j i universals with arbitrary monotypes υ′ ), we must Γ′′ ⊩ τj′′ bj i ∼ k ensure that the expected type of the pattern τ matches Γ′′ p∗ pk : θ σk Γ′′′ ⊢ ( ) ⇒ PatSyn the declared result type of the pattern synonym υ. We i j k also must ensure that the required constraintQ indeed Γ p P @τ ′ @τ ′′ pk : τ Γ′′′ r ⊢ i j ⇒ holds. Then, we collect the newly scoped type variables c l. Figure 8. Support for pattern synonyms • l We extend the context yet again, bringing the exis- • j tentials bj into scope, introducing the user-written l A Pattern synonyms scoped type variables cl , asserting that each cl is actu- ally a renaming of an internal variable b , and assum- Pattern synonym [Pickering et al. 2016] types differentiate l′ ing the provided constraint Q . (As before, we could universal type variables from existential type variables, and p reasonably change the b to a υ to allow scoped type thus add superficial complexity to our rules. Happily, this l′ l′′ variables to stand for types instead of variables.) complexity really is just superficial—the underlying mecha- We now check that the type patterns corresponding nism we have laid out applies well in this new scenario. See • i i Fig. 8. to universals τi′ indeed match the universals ai . We also check that the type patterns corresponding to A pattern synonym type has two contexts: the Qr required • j j constraint and the Qp provided constraint. Using a pattern existentials τj′′ match those existentials bj . synonym in a pattern requires that the required constraint Finally, we use this extended context to check the pat- • k already be provable; in contrast, the provided constraint may tern arguments pk . be assumed in the pattern’s right-hand side. The provided constraint is the analogue of the constraint in a data con- Though there are lots of details here, the only new piece structor’s type, while the required constraint has no data is the check for the required constraint Qr . This new check constructor analogue. An example can help to demonstrate. fits nicely within our framework. If we define pattern Three = 3 and then write f :: T ... → f Three = ... we must be able to prove both that NumT and EqT hold before the pattern match can type-check. Accordingly, the 10There are some subtleties around the difference between universals and pattern Three gets type a. Eqa , Numa a, where the existentials. Pickering et al. [2016, Section 6.3] explore these challenges. ∀ ( ) ⇒ 14 Richard A. Eisenberg, Joachim Breitner, and Simon Peyton Jones

B Binding type variables in function Expressions e F ... λ@c.e definitions | B.1 Motivation and examples Γ σ s∗b e Checking against specified polytypes It is relatively easy to extend the ideas in our paper to al- ⊢ ⇐ Γ, c : ∗ e a c σ low type variable binding in function definitions. We can ∗ ⊢sb ⇐ [ 7→ ] SB_DTyAbs Γ ∗ λ@c.e a. σ motivate this idea using an early example, repeated here: ⊢sb ⇐ ∀ prefix :: a. a a a → [[ ]] → [[ ]] Figure 9. Binding type variables in λ-expressions prefixx yss∀ = map xcons yss where xcons :: a a [ ] → [ ] By writing a type signature for fmap (using the InstanceSigs xcons ys = x : ys extension), we can bring a and b into scope to use as visible The type signature includes a , which brings the type vari- type arguments to the coerced fmap. As with the examples able a into scope in the body of∀ prefix. We find this syntax in our motivation section, this workaround is infelicitous, design awkward however: a traditional reading of the syn- requiring enabling yet another extension and writing out an tax of tells us that a is in scope in a a a otherwise-redundant type, just to bind type variables. → [[ ]] → [[ ]] and nowhere∀ else. Indeed, we should be able to α-vary the With the ability to explicitly bind type variables, this in- name a in that type signature (and only that type signature) stance is simplified to become without any external effect. Yet, it is not so, with a’s scope instance Functor MyList where extending beyond that type signature. fmap @a @b = coerce fmap @ @a @b Instead, we find it would be more natural (if, admittedly, ( [] ) a bit more verbose) to write RankNTypes Further motivation arises in the context of prefix @ax yss = ... higher-rank types [Peyton Jones et al. 2007]. Suppose we have explicitly binding the type variable a in a pattern. hr :: a. Fa Fa ... (∀ → ) → GeneralizedNewtypeDeriving This issue is not always where F is a non-injective type family. At a call site of hr, so simple, however. Consider GHC’s implementation of the ...hr λx ...... , there is no way to bind the type variable ( → ) extension GeneralizedNewtypeDeriving, which uses coerce [Bre- a, which is used only as the argument to non-injective type itner et al. 2016] to translate the implementation of a class family. Currently, the only workaround here is to add a Proxy method from one type to another. Here is an example: argument to hr’s argument type, allowing the client of hr to bind the type variable in a pattern annotation on the proxy newtype MyLista = MkMyList a [ ] argument. deriving Functor Instead, our design allows users to bind the type variable How should we write our derived Functor instance? A first explicitly, with ...hr λ@ax ...... ( → ) attempt might look like this: B.2 Specification Functor MyList instance where The ability to bind type variables makes sense only when fmap = coerce fmap @ ( [ ]) we know the type of the function we are defining; this type This, unfortunately, fails to compile. The problem is that tells us how the type variable relates to the other parameters the use of coerce is ambiguous. GHC can determine that the and the result of the function. Note that the analogue of this argument to coerce has type a0 b0 a0 b0 requirement is always true for constructor patterns, where ( → ) → [ ] → [ ] (the lists in there come from the use of @ ) and that the we can look up the type of the constructor in the environ- [] result must have type a b MyLista MyListb . ment. We thus must proceed by considering a bidirectional ( → ) → → However, GHC has no way to know that the undetermined a0 approach to type-checking, where explicitly track whether and b0 in coerce’s argument’s type should be the expected we are inferring a type or checking one. Specifically, we will a and b in the result type. Perhaps GHC could guess this build on one author’s previous work [Eisenberg et al. 2016], correspondence, but GHC refuses to make guesses. which sets the stage for our new extension here. Readers The way to get this to work is to bring a and b into scope, may want to consult Fig. 8 of that previous work in order to like this: understand our new contribution. Our addition to the previous judgment is in Fig. 9. The ⊢s∗b instance Functor MyList where judgment checks an expression against a polytype con- ⊢s∗b fmap :: ab . a b MyLista MyListb taining outermost binders. The original judgment comprises ∀ ( → ) → → fmap = coerce fmap @ @a @b only one rule, which skolemizes all these variables into fresh ( [] ) Type variables in patterns 15 type constants. With our addition, it is understood that the however, SB_DeepSkol applies, which will skolemizes a2, original rule, termed SB_DeepSkol, fires only when the new removing an opportunity to bind it. Accordingly, while the one does not apply. Our new rule simply brings c into scope example above is accepted by our addition, this one would as an alias for the internal variable a from the polytype we be rejected: are checking against. λ@c1x @c2y ... :: a1. a1 a2. a2 ... Incredibly, that’s it! No other changes are necessary. One ( → ) ∀ → ∀ → of the claims in the previous work is that the design therein is The difference here is that we try to bind c2 as an alias to flexible and extensible. Our success here is further evidence a2. This fails because a2 has been skolemized before we can of that claim. apply SB_DTyAbs. There are a few subtleties of this approach, of course: One could easily wonder: why do deep skolemization here? This is answered in the last paragraph of Section 6.1 of Type variables vs. types The grammar extension and Rule the extended version of Eisenberg et al. [2016]—essentially, SB_DTyAbs both require that the λ-expression binds a vari- doing shallow skolemization would work but would make able, never a full type. This contrasts with our approach for our ideas harder to implement. Currently, we have no non- binding type variables in constructor patterns. The reason for contrived examples where this extra work is necessary, and the difference here is that, while we could mimic the earlier so we leave the deep skolemization rule as it is. approach here, we would find that the type pattern would always just be a variable. In the case of λ-expressions, there is no analogue of the universal variables of data constructor types. Furthermore, any context Q in the polytype being checked against will come into scope after the bound type variable, so there is no possibility of such a context entailing an equality between the bound variable and some other type. For similar reasons, there is no incentive to use an equality assumption in place of substitution in Rule SB_DTyAbs. We thus keep the rule simple by straightforwardly requiring the binding of only variables, and not full types. Comparison against declarative specification Since we are working in the context of Eisenberg et al. [2016], we should consider what the implications are for the compar- ison of our extension of System SB against the declarative System B of that work.11 Happily, all we need to do to keep System B in sync with System SB is add an analogous rule to the “checking” judgment of System B: the fact that the judg- ment works only with specified polytypes (never generalized ones, in the terminology of the previous work) means that we need not worry about the effects of implicit generalization here. Deep skolemization Unfortunately, the new system is not quite as expressive as we would like. Rule SB_DeepSkol performs deep skolemization [Peyton Jones et al. 2007, Sec- tion 4.6], meaning that any quantified type variables—even those to the right of arrows—are skolemized. Once these are skolemized, they are unavailable in later applications of SB_DTyAbs. Let us examine an example: λ@c1xy ... :: a1. a1 a2. a2 ... ( → ) ∀ → ∀ → In this example, we are checking a λ-expression binding type variables against a polytype a1. a1 a2. a2 .... First, → → SB_DTyAbs applies, binding∀c1 to be an∀ alias for a1. Then,

11The previous work builds up its type systems by contrasting syntax- directed type systems against declarative specifications. The current paper focuses only on the syntax-directed approach.