Retrofitting Linear Types in Haskell
Total Page:16
File Type:pdf, Size:1020Kb
1 Retrofiing Linear Types In submission JEAN-PHILIPPE BERNARDY, Gothenburg University MATHIEU BOESPFLUG, Tweag I/O RYAN R. NEWTON, Indiana University SIMON PEYTON JONES, Microso Research ARNAUD SPIWACK, Tweag I/O and Irif, Universite´ Paris Diderot Linear and ane type systems have a long and storied history, but not a clear path forward to integrate with existing languages such as Ocaml or Haskell. In this paper, we introduce and study a linear type system designed with two crucial properties in mind: backwards-compatibility and code reuse across linear and non-linear users of a library. Only then can the benets of linear types permeate conventional functional programming. Rather than bifurcate data types into linear and non-linear counterparts, we instead aach linearity to binders. Linear function types can receive inputs from linearly-bound values, but can also operate over unrestricted, regular values. Linear types are an enabling tool for safe and resource ecient systems programming. We explore the power of linear types with a case study of a large, in-memory data structures that must serve responses with low latency. CCS Concepts: •So ware and its engineering ! Language features; Functional languages; Formal language denitions; Additional Key Words and Phrases: Haskell, laziness, linear logic, Linear types, systems programming ACM Reference format: Jean-Philippe Bernardy, Mathieu Boespug, Ryan R. Newton, Simon Peyton Jones, and Arnaud Spiwack. 2017. Retroing Linear Types. PACM Progr. Lang. 1, 1, Article 1 (January 2017), 25 pages. DOI: 0000001.0000001 1 INTRODUCTION Can we safely use Haskell to implement a low-latency server that caches a large dataset in-memory? Today, the answer is “no”[12], because pauses incurred by garbage collection (gc), observed in the order of 50ms or more, are unacceptable. Pauses are in general proportional to the size of the heap. Even if the gc is incremental, the pauses are unpredicable, dicult to control by the programmer and induce furthermore for this particular use case a tax on overall throughput. e problem is: the gc is not resource ecient, meaning that resources aren’t always freed as soon as they could be. Programmers can allocate such large, long-lived data structures in manually-managed o-heap memory, accessing it through ffi calls. Unfortunately this common technique poses safety risks: space leaks (by failure to deallocate at all), as well use-aer-free or double-free errors just like programming in plain C. e programmer has then bought resource eciency but at the steep price of giving up on resource safety. If the type system was resource-aware, a programmer wouldn’t have to choose. It is well known that type systems can be useful for controlling resource usage, not just ensuring correctness. Ane types [43], linear types [23, 47], permission types [48] and capabilities [3, 10] enable resource safe as well resource ecient handling of scarce resources such as sockets and le handles. All these approaches have been extensively studied, yet these ideas have had relatively lile eect on programming practice. Few full-scale is work has received funding from the European Commission through the SAGE project (grant agreement no. 671500). © 2017 ACM. is is the author’s version of the work. It is posted here for your personal use. Not for redistribution. e denitive Version of Record was published in PACM Progr. Lang., hp://dx.doi.org/0000001.0000001. PACM Progr. Lang., Vol. 1, No. 1, Article 1. Publication date: January 2017. 1:2 • J.-P. Bernardy, M. Boespflug, R. Newton, S. Peyton Jones, and A. Spiwack languages are designed from the start with such features. Rust is the major exception [31], and in Rust we see one of the aendant complications: advanced resource-tracking features puts a burden on new users, who need to learn how to satisfy the “borrow checker”. We present in this paper the rst type system that we believe has a good chance of inuencing existing functional programs and libraries to become more resource ecient. Whereas in Rust new users and casual programmers have to buy the whole enchilada, we seek to devise a language that is resource safe always and resource ecient sometimes, only when and where the programmer chooses to accept that the extra burden of proof is worth the beer performance. Whereas other languages make resource eciency opt-out, we seek to make it opt-in. We do this by retroing a backward-compatible extension to a Haskell-like language. Existing functions continue to work, although they may now have more rened types that enable resource eciency; existing data structures continue to work, and can additionally track resource usage. Programmers who do not need resource eciency should not be tripped up by it. ey need not even know about our type system extension. We make the following specic contributions: • We present a design for Hask-LL, which oers linear typing in Haskell in a fully backward-compatible way (Section2). In particular, existing Haskell data types can contain linear values as well as non- linear ones; and linear functions can be applied to non-linear values. Most previous designs force an inconveniently sharp distinction between linear and non-linear code (Section6). Interestingly, our design is fully compatible with laziness, which has typically been challenging for linear systems because of the unpredictable evaluation order of laziness [47]. q • We formalise Hask-LL as λ!, a linearly-typed extension of the λ-calculus with data types (Section3). We provide its type system, highlighting how it is compatible with existing Haskell features, including some q popular extensions. e type system of λ! has a number of unusual features, which together support backward compatibility with Haskell: linearity appears only in bindings and function arrows, rather than pervasively in all types; we support linearity polymorphism (Section 2.3); and the typing rule for case is novel (Section 3.3). No individual aspect is entirely new (Section 6.7), but collectively they add up to an unintrusive system that can be used in practice and scales up to a full programming language implementation with a large type system. q • We provide a dynamic semantics for λ!, combining laziness with explicit deallocation of linear data (Section4). We prove type safety, of course. But we also prove that the type system guarantees the key memory-management properties that we seek: that every linear value is eventually deallocated by the programmer, and is never referenced aer it is deallocated. • Type inference, which takes us from the implicitly-typed source language, Hask-LL, to the explicitly-typed q intermediate language λ!, is not covered this paper. However, we have implemented a proof of concept, by modifying the leading Haskell compiler ghc to infer linear types. e prototype is freely available1, and provides strong evidence that it is not hard to extend ghc’s full type inference algorithm (despite its complexity) with linear types. Our work is directly motivated by the needs of large-scale low-latency applications in industrial practice. In Section5 we show how Hask-LL meets those needs. e literature is dense with related work, which we dicuss in Section6. 2 A TASTE OF HASK-LL We begin with an overview of Hask-LL, our proposed extension of Haskell with linear types. All the claims made in this section are substantiated later on. First, along with the usual arrow type A ! B, we propose an additional 1URL suppressed for blind review, but available on request PACM Progr. Lang., Vol. 1, No. 1, Article 1. Publication date: January 2017. Retrofiing Linear Types • 1:3 arrow type, standing for linear functions, wrien A ( B. In the body of a linear function, the type system tracks that there is exactly one copy of the parameter to consume. f :: A ( B g :: A ! B f x = f- x has multiplicity 1 here -g g y = f- y has multiplicity ω here -g We say that the multiplicity of x is 1 in the body of f ; and that of y is ω in g. Similarly, we say that unrestricted (non-linear) parameters have multiplicity ω (usable any number of times, including zero). We call a function linear if it has type A ( B and unrestricted if it has type A ! B. e linear arrow type A ( B guarantees that any function with that type will consume its argument exactly once. However, the type places no requirement on the caller of these functions. e laer is free to pass either a linear or non-linear value to the function. For example, consider these denitions of a function g: g1; g2; g3 :: (a ( a ! r) ! a ( a ! r g1 k x y = k x y -- Valid g2 k x y = k y x -- Invalid: fails x’s multiplicity guarantee g3 k x y = k x (k y y) -- Valid: y can be passed to linear k As in g2, a linear variable x cannot be passed to the non-linear function (k y). But the other way round is ne: g3 illustrates that the non-linear variable y can be passed to the linear function k. Linearity is a strong contract for the implementation of a function, not its caller. 2.1 Calling contexts and promotion A call to a linear function consumes its argument once only if the call itself is consumed once. For example, consider these denitions of the same function g: f :: a ( a g4; g5; g6 :: (a ( a ! r) ! a ( a ! r g4 k x y = k x (f y) -- Valid: y can be passed to linear f g5 k x y = k (f x) y -- Valid: k consumes f x’s result once g6 k x y = k y (f x) -- Invalid: fails x’s multiplicity guarantee In g5, the linear x can be passed to f because the result of (f x) is consumed linearly by k.