A Right-To-Left Type System for Value Recursion

Alban Reynaud (ENS Lyon, France)
Gabriel Scherer (INRIA, France)
Jeremy Yallop (University of Cambridge, UK)

1 Introduction

In OCaml, recursive functions are defined using the let rec operator, as in the following definition of factorial:

    let rec fac x = if x = 0 then 1
                    else x * (fac (x - 1))

Besides functions, let rec can define recursive values, such as an infinite list ones where every element is 1:

    let rec ones = 1 :: ones

Note that this "infinite" list is actually cyclic, and consists of a single cons-cell referencing itself.

However, not all recursive definitions can be computed. The following definition is justly rejected by the compiler:

    let rec x = 1 + x

Here x is used in its own definition. Computing 1 + x requires x to have a known value: this definition contains a vicious circle, and any evaluation strategy would fail.

Functional languages deal with recursive values in various ways. Standard ML simply rejects all recursive definitions except function values. At the other extreme, Haskell accepts all well-typed recursive definitions, including those that lead to infinite computation. In OCaml, safe cyclic-value definitions are accepted, and they are occasionally useful.

For example, consider an interpreter for a programming language with datatypes for ASTs and for values:

    type ast = Fun of var * expr | ...
    type value = Closure of env * var * expr | ...

The eval function builds values from environments and asts:

    let rec eval env = function
      | ...
      | Fun (x, t) -> Closure(env, x, t)

Now consider adding an ast constructor FunRec of var * var * expr for recursive functions: FunRec ("f", "x", t) represents the recursive function let rec f x = t in f. Our OCaml interpreter can use value recursion to build a closure for these recursive functions, without changing the type of the Closure constructor: the recursive closure simply adds itself to the closure environment ((var * value) list).

    let rec eval env = function
      | ...
      | Fun (x, t) -> Closure(env, x, t)
      | FunRec (f, x, t) ->
        let rec cl = Closure((f,cl)::env, x, t) in cl

Our new check and its implementation. Until recently, the static check used by OCaml to reject vicious definitions relied on a syntactic analysis, performed on an untyped intermediate language. While we believe that the check as originally defined was correct, it proved fragile and hard to extend to the interaction of new language features with recursive definitions. Over the years, bugs were found where the check was unduly lenient. In conjunction with OCaml's efficient recursive definition compilation scheme [Hirschowitz et al. 2009], this leniency led to segmentation faults.

Seeking to address these problems, we designed and implemented a new check for recursive definition safety based on a novel static analysis, formulated as a simple type system (which we have proved sound with respect to an existing operational semantics [Nordlander et al. 2008]), and implemented as part of OCaml's type-checking phase. Our check was merged into the OCaml distribution in August 2018.

Moving the check from the middle end to the type checker restores the desirable property that compilation of well-typed programs does not go wrong. This property is convenient for tools that reuse OCaml's type-checker without performing compilation, such as MetaOCaml [Kiselyov 2014] (which type-checks quoted code) and Merlin [Bour et al. 2018] (which type-checks code during editing). Furthermore, some aspects of the check have delicate interactions with types, and so cannot be performed on an untyped IR (§4).

Our analysis. We looked at reusing existing inference systems, but they do not appear to suit our analysis: they have a finer-grained handling of functions and functors than we need, but coarser-grained handling of cyclic data, and most do not propose effective inference algorithms. In return for a coarser analysis, our system is noticeably simpler; furthermore, it scales cleanly to the full OCaml language.

A key aspect of our approach is the idea of right-to-left (type to environment) algorithmic interpretation, which reduces complexity compared to a presentation designed for a left-to-right reading. It is novel in this space and could inspire other designers of inference rules.

2 Static and dynamic semantics

Syntax. Figure 1 introduces a minimal subset of ML with the interesting ingredients of OCaml's recursive value definitions: a multi-ary let rec binding let rec (x_i = t_i)^i in u, functions (λ-abstractions) λx. t and applications t u, datatype constructors K (t_1, t_2, ...), and shallow pattern-matching match t with (K_i (x_{i,j})^j → u_i)^i.

Other ML constructs (non-recursive let, tuples, conditionals, etc.) can be desugared into this core. In fact, the full inference rules for OCaml (and our check) exactly correspond to the rules (and check) derived from this desugaring.

Since ML's types are largely orthogonal to our analysis, we present the check using an untyped fragment. (In the full OCaml language, there are some interactions with types, in particular with GADTs; see §4.) Although we ignore types, we do assume that terms are well-scoped: in let rec (x_i = v_i)^i in u, the (x_i)^i are in scope of u but also of all the v_i.

Access modes. For each recursive binding x = e, our analysis assigns an access mode m representing the way that x is accessed during evaluation of e.

Figure 2 defines the modes, their order structure, and the mode composition operations. The modes are as follows:

  Ignore: an expression is entirely unused during the evaluation of the program. This is the mode of a variable in an expression in which it does not occur.
  Delay: a context can be evaluated (to Weak Normal Form) without evaluating its argument. λx. □ is a delay context.
  Guard: the context returns the value as a member of a data structure (e.g. a variant or record). K (□) is a guard context. The value can safely be defined mutually-recursively with its context, as in let rec x = K (x).
  Return: the context returns its value without further inspection. This value cannot be defined mutually-recursively with its context, to avoid self-loops: in let rec x = x and let rec x = let y = x in y, the last occurrence of x is in Return context.
  Deref: the context inspects and uses the value in arbitrary ways. Such a value must be fully defined at the point of usage; it cannot be defined mutually-recursively with its context.

Finally, Ignore is the absorbing element of mode composition (m[Ignore] = Ignore = Ignore[m]), Return is an identity (Return[m] = m = m[Return]), and composition is idempotent (m[m] = m).

A right-to-left inference system. Figure 3 gives a representative sample of the inference rules for a judgment of form Γ ⊢ t : m for term t, access mode m and environment Γ that maps term variables to access modes. Modes classify terms and variables, playing the role of types in usual type systems. The example judgment x : Deref, y : Delay ⊢ (x + 1, lazy y) : Guard can be read either

  left-to-right: If x can safely be used in Deref mode, and y in Delay mode, then (x + 1, lazy y) can safely be used at Guard.
  right-to-left: If a context accesses the term (x + 1, lazy y) under

Figure 1. Core language syntax

    Terms    ∋ t, u ::= x, y, z
                      | let rec b in u
                      | λx. t  |  t u
                      | K (t_i)^i  |  match t with h
    Bindings ∋ b    ::= (x_i = t_i)^i
    Handlers ∋ h    ::= (p_i → t_i)^i
    Patterns ∋ p, q ::= K (x_i)^i

Figure 2. Access modes and operations

    Modes: Ignore ≺ Delay ≺ Guard ≺ Return ≺ Deref

    Mode composition m[m'] (columns: m, rows: m'):

    m' \ m    Ignore   Delay    Guard    Return   Deref
    Ignore    Ignore   Ignore   Ignore   Ignore   Ignore
    Delay     Ignore   Delay    Delay    Delay    Deref
    Guard     Ignore   Delay    Guard    Guard    Deref
    Return    Ignore   Delay    Guard    Return   Deref
    Deref     Ignore   Delay    Deref    Deref    Deref

Figure 3. Mode inference rules (abridged)

\[
\frac{\Gamma \vdash t : m \qquad m \succ m'}
     {\Gamma \vdash t : m'}
\qquad
\frac{\Gamma, x : m_x \vdash t : m[\mathsf{Delay}]}
     {\Gamma \vdash \lambda x.\, t : m}
\qquad
\frac{\Gamma_t \vdash t : m[\mathsf{Deref}] \qquad \Gamma_u \vdash u : m[\mathsf{Deref}]}
     {\Gamma_t + \Gamma_u \vdash t\; u : m}
\]

\[
\frac{\left(\Gamma_i, (x_j : m_{i,j})^{j \in I} \vdash t_i : \mathsf{Return}\right)^{i \in I}
      \qquad (m_{i,j} \preceq \mathsf{Guard})^{i,j}
      \qquad \left(\Gamma'_i = \Gamma_i + \sum_j m_{i,j}[\Gamma'_j]\right)^{i}}
     {(x_i : \Gamma'_i)^{i \in I} \vdash \mathsf{rec}\; (x_i = t_i)^{i \in I}}
\]

\[
\frac{(x_i : \Gamma_i)^{i} \vdash \mathsf{rec}\; b
      \qquad (m'_i)^{i} \stackrel{\mathrm{def}}{=} (\max(m_i, \mathsf{Guard}))^{i}
      \qquad \Gamma_u, (x_i : m_i)^{i} \vdash u : m}
     {\sum_i m'_i[\Gamma_i] + \Gamma_u \vdash \mathsf{let\ rec}\; b\; \mathsf{in}\; u : m}
\]
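
The mode order and composition table of Figure 2 can be transcribed directly into OCaml. The sketch below is illustrative only: the names mode, all_modes, rank, max_mode and compose are assumptions of this example, not identifiers taken from the paper or from the OCaml compiler sources. It encodes compose m m' as the table entry m[m'] and checks the three algebraic laws stated in the text (Ignore is absorbing, Return is an identity, composition is idempotent).

```ocaml
(* Access modes, listed in the order given by Figure 2:
   Ignore < Delay < Guard < Return < Deref. *)
type mode = Ignore | Delay | Guard | Return | Deref

let all_modes = [Ignore; Delay; Guard; Return; Deref]

(* Position of a mode in the total order (hypothetical helper). *)
let rank = function
  | Ignore -> 0
  | Delay -> 1
  | Guard -> 2
  | Return -> 3
  | Deref -> 4

(* Larger of two modes, as in max(m_i, Guard) in the let rec rule
   (hypothetical helper). *)
let max_mode a b = if rank a >= rank b then a else b

(* [compose m m'] is the table entry m[m'] of Figure 2,
   with m read from the column and m' from the row. *)
let compose m m' =
  match m, m' with
  | Ignore, _ | _, Ignore -> Ignore  (* Ignore is absorbing *)
  | Deref, _ -> Deref
  | Delay, _ -> Delay
  | Guard, Return -> Guard
  | Guard, m' -> m'
  | Return, m' -> m'                 (* Return is an identity *)

(* Check the laws stated in the text for every mode. *)
let () =
  List.iter
    (fun m ->
      assert (compose m Ignore = Ignore && compose Ignore m = Ignore);
      assert (compose Return m = m && compose m Return = m);
      assert (compose m m = m))
    all_modes
```

Writing the table as a pattern match rather than a lookup array keeps each case readable and lets the compiler check exhaustiveness; the final loop only confirms that this transcription agrees with the laws quoted above, and says nothing about the soundness of the analysis itself.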
