Occurrence Typing Modulo Theories T

ifac t t r Comp * * let A nt e e * A t s W i E * s e n l C l o I D C o * D * c u L e m s E u P e e n R t v e o d Occurrence Typing Modulo Theories t * y * s E a a l d u e a t Andrew M. Kent David Kempe Sam Tobin-Hochstadt Indiana University Bloomington, IN, USA &M/KF2Mi-/F2KT2-bKi?'!BM/BMX2/m Abstract 1. Introduction We present a new type system combining occurrence typ- Applying a static type discipline to an existing code base ing—a technique previously used to type check programs in written in a dynamically-typed language such as JavaScript, dynamically-typed languages such as Racket, Clojure, and Python, or Racket requires a type system tailored to the id- JavaScript—with dependent refinement types. We demon- ioms of the language. When building gradually–typed sys- strate that the addition of refinement types allows the inte- tems, designers have focused their attention on type systems gration of arbitrary solver-backed reasoning about logical with relatively simple goals, e.g. ruling out dynamic type er- propositions from external theories. By building on occur- rors such as including a string in an arithmetic computation. rence typing, we can add our enriched type system as a natu- These systems—ranging from widely-adopted industrial ef- ral extension of Typed Racket, reusing its core while increas- forts such as TypeScript [6], Hack [16], and Flow [15] to ing its expressiveness. The result is a well-tested type system more academic systems such as Typed Racket [25], Typed with a conservative, decidable core in which types may de- Clojure [2], Reticulated Python [29], and Gradualtalk [1]— pend on a small but extensible set of program terms. have been successful in this narrow goal. In addition to describing our design, we present the fol- However, advanced type systems can express more pow- lowing: a formal model and proof of correctness; a strategy erful properties and check more significant invariants than for integrating new theories, with specific examples includ- merely the absence of dynamic type errors. Refinement and ing linear arithmetic and bitvectors; and an evaluation in the dependent types, as well as sophisticated encodings in the context of the full Typed Racket implementation. Specifi- type systems of languages such as Haskell and ML [11, 30], cally, we take safe vector operations as a case study, exam- allow programmers to capture more precise correctness crite- ining all vector accesses in a 56,000 line corpus of Typed ria for their programs such as those for balanced binary trees, Racket programs. Our system is able to prove that 50% of sized vectors, and much more. these are safe with no new annotations, and with a few an- In this paper, we combine these two strands of research, notations and modifications we capture more than 70%. producing a system we dub Refinement Typed Racket, or RTR. RTR follows in the tradition of Dependent ML [32] and Categories and Subject Descriptors F.3.1 [Specifying and Liquid Haskell [27] by supporting dependent and refinement Verifying and Reasoning about Programs] types over a limited but extensible expression language. Ex- perience with these languages has already demonstrated that arXiv:1511.07033v2 [cs.PL] 4 Oct 2016 General Terms Languages, Design, Verification expressive and rich program properties can be captured by a fully-decidable type system. Keywords Refinement types, occurrence typing, Racket Furthermore, by building on the existing framework of occurrence typing, refinement types prove to be a natural addition to the Typed Racket implementation, formal model, and soundness results. Occurrence typing is designed to reason about dynamic type tests and control flow in existing Permission to make digital or hard copies of part or all of this work for personal or Permissionclassroom use to makeis granted digital without or hard fee copies provided ofall that or copies part of are this not work made for or personaldistributed or Racket programs, using propositions about the types of terms classroomfor profit or use commercial is granted advantage without fee and provided that copies that bear copies this are notice not madeand the or full distributed citation and simple rules of logical inference. Extending this logic foron profitthe first or commercial page. Copyrights advantage for andcomponents that copies of bear this this work notice owned and theby fullothers citation than onACM the firstmust page. be honored. Copyrights Abstracting for components with credit of this is workpermitted. owned To by copy others otherwise, than ACM to to encompass refinements of types as well as propositions mustrepublish, be honored. to post Abstracting on servers, with or to credit redistribute is permitted. to lists, To copy contact otherwise, the Owner/Author. or republish, drawn from solver-backed theories produces an expressive toRequest post on permissions servers or tofrom redistribute [email protected] to lists, requires or Publications prior specific Dept., permission ACM, and/orInc., fax a fee. Request permissions from [email protected]. +1 (212) 869-0481. Copyright held by Owner/Author. Publication Rights Licensed to system which scales to real programs. In this paper, we show CopyrightACM. is held by the owner/author(s). Publication rights licensed to ACM. examples drawn from the theory of linear inequalities and the PLDI’16PLDI ’16,, JuneJune 13–17, 13 - 17, 2016,2016, Santa Barbara, Barbara, CA, CA, USA USA ACM.Copyright 978-1-4503-4261-2/16/06...$15.00 © ACM 978-1-4503-4261-2/16/06…$15.00 theory of bitvectors. http://dx.doi.org/10.1145/2908080.2908091DOI: http://dx.doi.org/10.1145/2908080.2908091 296 (: max : [x : Int] [y : Int] The remainder of this paper is structured as follows: sec- -> [z : Int #:where (∧ (≥ z x) (≥ z y))]) tion 2 reviews the basics of occurrence typing and introduces (define (max x y) (if (> x y) x y)) dependency and refinement types via examples; section 3 formally presents the details of our type system, as well as Figure 1. max with refinement types soundness results for the calculus; section 4 describes the challenges of scaling the calculus to our implementation in Typed Racket; section 5 contains the results of our empirical Figure 1 presents a simple demonstration of integrating evaluation of the effectiveness of using refinement types for refinement types with linear arithmetic. The max function vector bounds verification; section 6 discusses related work; takes two integers and returns the larger, but instead of de- and section 7 concludes. scribing it as a simple binary operator on values of type Int, as the current Typed Racket implementation specifies, we 2. Beyond Occurrence Typing give a more precise type describing the behavior of max. The syntax for function types in RTR allows for ex- Occurrence typing—an approach whereby different occur- plicit dependencies between the domain and range by giv- rences of the same identifier may be type checked at different ing names to arguments which are in scope in any log- types throughout a program—is the strategy Typed Racket ical refinements. For the range in this example we use uses to type check idiomatic Racket code [25, 26]. To illus- [z : Int #:where (∧ (≥ z x) (≥ z y))] as syn- trate, consider a Typed Racket function which accepts either tactic sugar for a logical refinement on the base type Int: an integer or a list of bits as input and returns the least sig- (Refine [z : Int] (∧ (≥ z x) (≥ z y))). Note nificant bit: that the max function definition does not require any changes (: least-significant-bit : to accommodate the stronger type, nor do clients of max (U Int (Listof Bit)) -> Bit) need to care that the type provides more guarantees than (define (least-significant-bit n) before; the conditional in the body of max enables the use (if (int? n) of the refinement type in the result, as in most refinement (if (even? n) 0 1) type systems. Typed Racket’s pre-existing ability to reason (last n))) about conditionals means that abstraction and combination Type checking the function body begins with the assump- of conditional tests properly works in RTR without requiring tion that n is of type (U Int (Listof Bit)) (an ad hoc anything more from solvers for various theories. union containing Int and (Listof Bit)). Typed Racket With these features we enable programmers to enforce then verifies the test-expression(int? n) is well typed and new invariants in existing Typed Racket code and thus eval- checks the remaining branches of the program with the fol- uate the effectiveness of our type checker on real-world lowing insights: programs. As evidence, we have implemented our system in Typed Racket, including support for linear arithmetic 1. In the then branch we know (int? n) produced a non- and provably-safe vector access, and automatically analyzed #f value, thus n must have type Int. The type system can three large libraries totalling more than 56,000 lines of code. then verify (if (even? n) 0 1) is well-typed at Bit. We determined that approximately 50% of the vector ac- 2. In the else branch we know (int? n) produced #f, im- cesses are provably safe with no code changes. We then ex- plying n is not of type Int. This fact, combined with our amined one of the libraries in detail, finding that of the 75% previous knowledge of the type of n, tells us n must have that were not automatically proved safe, an additional 47% type (Listof Bit). Thus (last n) is also well typed could be verified by adding type annotations and minor code and of type Bit.

Load more