Learning Refinement Types

He Zhu (Purdue University, USA), Aditya V. Nori (Microsoft Research, UK), Suresh Jagannathan (Purdue University, USA)

Abstract

We propose the integration of a random test generation system (capable of discovering program bugs) and a refinement type system (capable of expressing and verifying program invariants) for higher-order functional programs, using a novel lightweight learning algorithm as an effective intermediary between the two. Our approach is based on the well-understood intuition that useful, but difficult to infer, program properties can often be observed from concrete program states generated by tests; these properties act as likely invariants, which, if used to refine simple types, can have their validity checked by a refinement type checker. We describe an implementation of our technique for a variety of benchmarks written in ML, and demonstrate its effectiveness in inferring and proving useful invariants for programs that express complex higher-order control and dataflow.

Categories and Subject Descriptors D.2.4 [Software Engineering]: Software/Program Verification - Correctness proofs, Formal methods; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs

Keywords Refinement Types, Testing, Higher-Order Verification, Learning

1. Introduction

Refinement types and random testing are two seemingly disparate approaches to building high-assurance software for higher-order functional programs. Refinement types allow the static verification of critical program properties (e.g., safe array access). In a refinement type system, a base type such as int is refined into a refinement base type, written {int | e}, where e (a type refinement) is a Boolean-valued expression constraining the value of the term defined by the type. For example, {int | ν > 0} defines the type of positive integers, where the special variable ν denotes the value of the term. Refinement types naturally generalize to function types. A refinement function type, written x : Px → P, constrains the argument x by the refinement type Px and produces a result whose type is specified by P. Refinement type systems such as DML [41] and LIQUIDTYPES [28] have demonstrated their utility in validating useful specifications of higher-order functional programs.

To illustrate, consider the simple program shown in Fig. 1.

    let max x y z m = m (m x y) z
    let f x y = if x >= y then x else y
    let main x y z =
      let result = max x y z f in
      assert (f x result = result)

Figure 1. A simple higher-order program.

Intuitive program invariants for max and f can be expressed in terms of the following refinement types:

    max :: (x : int → y : int → z : int →
            m : (m0 : int → m1 : int → {int | ν ≥ m0 ∧ ν ≥ m1})
            → {int | ν ≥ x ∧ ν ≥ y ∧ ν ≥ z})
    f   :: (x : int → y : int → {int | ν ≥ x ∧ ν ≥ y})

The types specify that both max and f produce an integer that is no less than the value of their parameters. However, these types are not sufficient to prove the assertion in main; doing so requires specifying more precise invariants (we show how to find sufficient invariants for this program in Section 2).

On the other hand, random testing, exemplified by systems like QUICKCHECK [6], can be used to define useful underapproximations, and has proven to be effective at discovering bugs. However, it is generally challenging to prove the validity of program assertions, as in the program shown above, simply by randomly executing a bounded number of tests.

Tests (which prove the existence, and provide conjectures on the absence, of bugs) and types (which prove the absence, and conjecture the presence, of bugs) are two complementary techniques for understanding program behavior. Both have well-understood limitations and strengths. It is therefore natural to ask how we might define synergistic techniques that exploit the benefits of both.

Approach. We present a strategy for automatically constructing refinement types for higher-order program verification. The input to our approach is a higher-order program P together with P's safety property ψ (e.g., annotated as program assertions). We identify P with a set of sampled program states. "Good" samples are collected from test runs; these are reachable states from concrete executions that do not lead to a runtime assertion failure invalidating ψ. "Bad" samples are states generated from symbolic executions which would produce an assertion failure, and hence should be unreachable; they are synthesized by a backward symbolic execution, structured to traverse error paths not explored by good runs. The goal is to learn likely invariants ϕ of P from these samples. If refinement types encoded from ϕ are admitted by a refinement type checker and ensure the property ψ, then P is correct with respect to ψ. Otherwise, ϕ is assumed to describe (or fit) an insufficient set of "Good" and "Bad" samples. We use ϕ as a counterexample to drive the generation of more "Good" and "Bad" samples. This counterexample-guided abstraction refinement (CEGAR) process [7] repeats until type checking succeeds, or a bug is discovered in test runs.

This is the author's version of the work, posted for personal use. The definitive version was published in ICFP'15, August 31 - September 2, 2015, Vancouver, BC, Canada. ACM 978-1-4503-3669-7/15/08. http://dx.doi.org/10.1145/2784731.2784766

There are two algorithmic challenges associated with our proof strategy: (1) how do we sample good and bad program states in the presence of complex higher-order control and dataflow? (2) how do we ensure that the refinement types deduced from observing the sampled states can generally capture both the conditions (a) sufficient to capture unseen good states and (b) necessary to reject unseen bad ones? The essential idea of our solution to (1) is to encode the unknown functions of a higher-order function (e.g., function m in Fig. 1) as uninterpreted functions, hiding higher-order features from the sampling phase. Our solution to (2) is based on learning techniques that abstract properties classifying the good states and bad states derived from the CEGAR process, without overfitting the inferred refinement types to just the samples.

Implementation. We have implemented a prototype of our framework built on top of the ML type system. The prototype can take a higher-order program over base types and polymorphic recursive data structures as input, and automatically verify whether the program satisfies programmer-supplied safety properties. We have evaluated our implementation on a set of challenging higher-order benchmarks. Our experiments suggest that the proposed technique is lightweight compared to a pure static higher-order model checker (e.g., MOCHI [17]) in producing expressive refinement types for programs with complex higher-order control and data flow. Our prototype can infer invariants comprising arbitrary Boolean combinations for recursive functions in a number of real-world data structure programs, useful for verifying non-trivial data structure specifications. Existing approaches (e.g., LIQUIDTYPES [28]) can verify these programs only if such invariants are manually supplied, which can be challenging for programmers.

Contributions. Our paper makes the following contributions:

• A CEGAR-based learning framework that combines testing with type checking, using tests to exercise error-free paths and symbolic execution to capture error paths, to automatically infer expressive refinement types for program verification.

• An integration of a novel learning algorithm that effectively bridges the gap between the information gleaned from samples and the desired refinement types.

• A description of an implementation, along with experimental results over a range of complex higher-order programs, that validates the utility of our ideas.

2. Overview

This section describes the framework of our approach, outlined in Fig. 2. Our technique takes a (higher-order) program P and its safety property ψ as input. To bootstrap the inference process, the initial program invariant ϕ is assumed to be true. A Deducer (a) uses backward symbolic execution, starting from program states that violate ψ, to supply bad sample states at all function entries and exits, i.e., those that reflect error states of P sufficient to trigger a failure of ψ. A Runner (b) runs P using randomly generated tests, and samples good states at all function entries and exits. These good and bad states are fed to a Learner (c) that builds classifiers from which likely invariants ϕ (for functions) are generated. Finally, a Verifier (d) encodes the likely invariants into refinement types and checks whether they are sufficient to prove the provided property. If the inferred types fail type checking, the failed likely invariants ϕ are transferred from the Verifier to the Deducer, which then generates new sample states based on the cause of the failure. Our technique thus provides an automated CEGAR approach to lightweight refinement for higher-order program verification.

Figure 2. The Main algorithm: Runner (P, iv, ψ) and Deducer (P, ϕ, ψ) feed good samples VG and bad samples VB to Learner (VG, VB); likely invariants ϕ flow to Verifier (P, ϕ, ψ), and failed invariants flow back to the Deducer.

In the following, we consider the functional arguments and return values of higher-order functions to be unknown functions. All other functions are known functions.

Example. To illustrate, the program shown in the left column of Fig. 3 makes heavy use of unknown functions (e.g., the functional argument a of init is an unknown function). In the function main, the value for a supplied by f is a closure over n; when applied to a value i, it checks that i is non-negative but less than n, and returns 0. The function init iteratively initializes the closure a; in the i-th iteration, the call to update produces a new closure that is closed over a and yields 1 when supplied with i. Our system considers program safety properties annotated in the form of assertions. The assertion in main specifies that the result of init should be a function which returns 1 when supplied with an integer between [0, n).

Verifying this program is challenging because a proof system must account for unknown functions. The program also exhibits complex dataflow (e.g., init can create an arbitrary number of closures via the partial application of update); thus, any useful invariant of init must be inductive. We wish to infer a useful refinement type for init, consistent with the assertions in main and f, without knowing a priori the concrete functions that may be bound to a (note that a is dynamically updated in each recursive iteration of init).

Hypothesis domain. Assume that f is a higher-order function and that Θ(f) includes all the argument (or parameter) and return variables bound in the scope of f. For each variable x ∈ Θ(f), if x represents an unknown function, we define Ω(x) = [x0; x1; ···; xr], in which the sub-indexed variables are the arguments (x0 denotes the first argument of x) and xr denotes the return of x. Otherwise, if x is a base-typed variable, Ω(x) = [x]. We further define Ω(f) = ∪_{x ∈ Θ(f)} Ω(x). We consider refinement types of f with type refinements constructed from the variables in Ω(f). For example, Ω(init) includes the variables i, n, a0, ar, where a0 and ar denote the parameter and return of a.

Assume {y1, ···, ym} are the numeric variables bound in Ω(f). In this paper, following LIQUIDTYPES [28], to ensure decidable refinement type checking, we restrict type refinements to the decidable logic of linear arithmetic. Formally, our system learns type refinements (invariants for a function f) which are arbitrary Boolean combinations of predicates of the form of Equation 1:

    c1 y1 + ··· + cm ym + d ≤ 0        (1)

where {c1, ···, cm} are integer coefficients and d is an integer constant. Our hypothesis domain is parameterized by Equation 1. To deliver a practical algorithm, we define C = {−1, 0, 1} and D as the set of the constants and their negations that appear in the program text of f, and require that all the coefficients ci ∈ C and the constant d ∈ D.
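To make the hypothesis domain concrete, the atomic predicates of Equation 1 can be enumerated mechanically. The following OCaml sketch is our own illustration (not the paper's implementation): it generates every coefficient vector over C = {−1, 0, 1} and pairs it with constants drawn from D.

```ocaml
(* Enumerate atomic predicates  c1*y1 + ... + cm*ym + d <= 0  with
   each ci drawn from C = {-1; 0; 1} and d drawn from a set D of
   program constants and their negations (Equation 1). *)
let coeffs = [ -1; 0; 1 ]

(* All length-m coefficient vectors over [coeffs]. *)
let rec coeff_vectors m =
  if m = 0 then [ [] ]
  else
    List.concat_map
      (fun c -> List.map (fun v -> c :: v) (coeff_vectors (m - 1)))
      coeffs

(* A predicate (cs, d) holds on a state (one value per yi) iff the
   linear combination is at most zero. *)
let holds (cs, d) state =
  List.fold_left2 (fun acc c v -> acc + (c * v)) d cs state <= 0

(* The full space of atomic predicates for m variables and constants ds. *)
let atomic_predicates m ds =
  List.concat_map (fun cs -> List.map (fun d -> (cs, d)) ds)
    (coeff_vectors m)
```

For example, with m = 2 variables and three candidate constants this yields 3^2 · 3 = 27 atomic predicates; the learner later selects a small subset that separates the good samples from the bad ones.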

Figure 3. The running example: functions f, main, and init. [The code listing is truncated in this copy; it begins "let f n i = ...", "let main n j = ...", "let rec init i n a = (assert (0 ≤ i ∧ i ...".]

Table 1. Classifying good (G) and bad (B) samples to construct an invariant (precondition) for init: (a) samples; (b) relate samples to predicates; (c) select predicates; (d) truth table.

[The tabular data of Table 1 is garbled in this copy. Its rows are sampled states over (n, i, a0, ar), their valuations on the predicates Π0–Π6, the selected predicates Π3 and Π6, and the resulting truth table over Π3 and Π6 with G/B labels.]

Table 1(d) gives the resulting truth table. The table rejects all (Boolean) bad samples in Table 1(c) and accepts all the other possible samples, including the good samples in Table 1(c). Note that we generalize good states: the truth table accepts more good states than were sampled. We apply standard logic minimization techniques [20] to the truth table to generate the Boolean structure of the likely invariant. We obtain ¬Π3 ∨ Π6, which in turn represents the following likely invariant:

    ¬(a0 < i) ∨ ar = 1

During the course of sampling the unknown function a, our system captures that certain calls to a may result in an assertion violation (rooted in the assertion in f). Consider a call to a that supplies an integer argument less than 0 or no less than n. These calls, omitted in the table, provide useful constraints on a's inputs, which are also used by the Learner. Indeed, comparing such calls to calls that do not lead to an assertion violation allows the Learner to deduce the invariant ψ0 ≡ a0 ≥ 0 ∧ a0 < n, which refines a's argument. We treat ψ0 and ψ1 as likely invariants for the precondition of init. A similar strategy can be applied to learn the post-condition of init.

Bad sample generation. Assume that the Learner is provided with only the first bad sample in Table 1(a). The good and bad samples are then separable with a simple predicate Π2 ≡ ar < n. This predicate is not sufficiently strong, since it fails to specify the input of a. To strengthen such an invariant, we ask for a new bad sample from the SMT solver for the condition:

    ϕprebad ∧ (ar < n)

which was captured as the second bad sample in Table 1(a). The new bad sample invalidates the failed invariant.

Good sample generation. We exemplify our CEGAR loop in sampling good states using the program of Fig. 1. To bootstrap, we may run the program with arguments 1, 2 and 3, and infer the following types:

    max :: (x : int → y : int → z : int →
            m : (···) → {int | ν > x ∧ ν ≥ y})

The refinement type of max is unnecessarily strong in specifying that the return value must be strictly greater than x. To weaken such a type, we seek a sample in which the return value of max equals x. To this end, we forward the failed invariant to the Deducer, which symbolically executes the negation of the postcondition of max (ν > x ∧ ν ≥ y) back to main using our symbolic analysis. A solution to the derived symbolic condition (from an SMT solver) constitutes a new test input, e.g., a call to main with arguments 3, 2 and 1. With a new set of good samples, the program then typechecks with the desired refinement types:

    max :: (x : int → y : int → z : int →
            m : (m0 : int → m1 : int → {int | ν ≥ m0 ∧ ν ≥ m1})
            → {int | ν ≥ x ∧ ν ≥ y ∧ ν ≥ z})
    f   :: (x : int → y : int → {int | ν ≥ x ∧ ν ≥ y ∧
            ((x ≤ y ∧ ν ≤ y) ∨ (x > y ∧ ν > y))})

The refinement type for f reflects the results of both the first and the second test. The proposition in the first disjunct, x ≤ y ∧ ν ≤ y, captures the behavior of the call to f from max in the first test, with argument x less than y; the second disjunct, x > y ∧ ν > y, captures the effect of the call to f in the second test, in which x is greater than y.

Verifier. We encode likely program invariants into refinement types in the obvious way. For example, the following refinement type is automatically synthesized for init:

    init :: (i : int → n : int →
             a : (a0 : {int | ν ≥ 0 ∧ ν < n} → {int | ¬(a0 < i) ∨ ν = 1})
             → ({int | ν ≥ 0 ∧ ν < n} → {int | ν = 1}))

This type reflects a useful specification: it states that the argument a to init is a function that expects an argument from 0 to n, and produces a 1 only if the argument is less than i; the result of init is a function that, given an input between 0 and n, produces 1. Extending [28], we have implemented a refinement type checking algorithm, which confirms the validity of the above type; the type is also sufficient to prove the assertions in the program.

CEGAR. Likely invariants are not guaranteed to generalize if they are inferred from an insufficient set of samples. We call likely invariants failed invariants if they fail to prove the specification. They are considered counterexamples, witnessing why the specification is refuted. Notice, however, that these could be spurious counterexamples. We develop a CEGAR loop that tries to refute a counterexample by sampling more states. If the counterexample is spurious, new samples prevent the occurrence of the failed invariants in subsequent iterations.

Data Structures. Our approach naturally generalizes to richer (recursive) data structures. Important attributes of data structures can often be encoded into measures (data-sorts), which are functions from a recursive structure to a base-typed value (e.g., the height of a tree). Our approach verifies data structures by generating samples ranging over their measures. In this way, we can prove many data structure invariants (e.g., that a red-black tree is balanced).

Consider the example in Fig. 4. Function iteri is a higher-order list indexed-iterator that takes as arguments a starting index i, a list xs, and a function f. It invokes f on each element of xs and the index corresponding to the element's position in the list. Function mask invokes iteri if the lengths of a Boolean array a and list xs match. Function g masks the j-th element of the array with the j-th element of the list.

Our technique considers len, the length of the list xs, as an interesting measure. Suppose that we wish to verify that the array reads and writes in g are safe. For function iteri, based on our sampling strategy, we sample the unknown function f by calling it with inputs from [min (i, len xs), ..., max (i, len xs)] in the instrumented code. Since f binds to g, defined inside of mask, our system captures that some calls to f result in an (array bound) exception, when the first argument to f is less than 0 or no less

than i + len xs. Separating such calls from calls that do not raise the exception, our tool infers the following refinement type:

    iteri :: (i : {int | ν ≥ 0} → xs : 'a list →
              f : (f0 : {int | ν ≥ 0 ∧ ν < i + len xs} → 'a → ()) → ())

This refinement type is the key to proving that all array accesses in function mask (and g) are safe.

    let rec iteri i xs f =
      match xs with
      | [] → ()
      | x :: xs → (f i x; iteri (i+1) xs f)

    let mask a xs =
      let g j y = a[j] ← a[j] && y in
      if Array.length a = len xs then iteri 0 xs g

Figure 4. A simple data structure example.

3. Language

Syntax. For exposition purposes, we formalize our ideas in the context of an idealized language: a call-by-value variant of the λ-calculus, shown in Fig. 5, with support for the refinement types defined in Sec. 1. We cover recursive data structures in Sec. 7. Typically, x and y are bound to variables; f is bound to a function symbol. A refinement expression is either a refinement variable (κ) that represents an initially unknown type refinement, or a concrete Boolean expression (e). Instantiation of refinement variables to concrete predicates takes place through the type refinement algorithm described in Sec. 6.

Note that the let rec binding (in our examples) is syntactic sugar for the fix operator: let rec f x̃ = e in e' is converted from let f = fix (fun f → λx̃.e) in e'. Here, x̃ abbreviates a (possibly empty) sequence of arguments {x0, ···, xn}. The length of x̃ is called the arity of f.

Our language is A-normalized. For example, in function applications f ỹ, we ensure every function and its arguments are associated with a program variable. When the length of ỹ is smaller than the arity of f, f ỹ is a partial application. For any expression of the form let f = λx̃.e in e', we say that the function f is known in the expression e'. Functional arguments and return values of higher-order functions are unknown (e.g., in let g = f v in e', if the symbol g is used as a function in e', it is an unknown function in e'; similarly, in λx.e', if x is used as a function in e', x is an unknown function in e'). The statement "assert p" is standard. Programs with an assertion failure immediately terminate.

Semantics. We reuse the refinement type system defined in [28]; the type checking rules are given in [44].

    B ∈ Base ::= int | bool
    P ∈ RefinementType ::= {B | κ} | {B | e} | {x : P → P}
    x, y, ν, f ∈ Var    c ∈ Constant ::= 0, ..., true, false
    v ∈ Value ::= c | x | y | fix (fun f → λx̃.e) | λx̃.e
    op ∈ Operator ::= {+, −, ≥, ≤, ¬, ···}
    e ::= v | op (v0, ···, vn) | assert v | if v then e1 else e2 |
          let x = e in e' | f ỹ

Figure 5. Syntax.

4. Higher-Order Program Sampling

In this section, we sketch how our system combines information gleaned from tests and (backward) symbolic analysis to prepare a set of program samples for higher-order programs.

Sampled Program States. In our approach, sampled program states, ranged over by the metavariable σ, map variables to values in the case of base types, and map unknown functions to a set of input/output records known to hold for the unknown function from the tests. For example, if x is a base type variable, we might have σ(x) = 5. If f is a unary unknown function that was tested with the arguments 0, 1 and 2 (such as a in Table 1(a)), we might for instance have σ(f) = {(f0 : 0, fr : 1), (f0 : 1, fr : 0), (f0 : 2, fr : 0)}, where we use f0 to index the first argument of f and fr to denote its return variable. The value of fr is obtained by applying the function f to the value of f0. Importantly, fr is assigned a special value "err" if an assertion violation is triggered in a call to f with the arguments recorded in f0.

WP Generation. "Bad" program states are captured by pre- and post-bad conditions of known functions sufficient to lead to an assertion violation. To this end, we implement a backward symbolic analysis, wp, analogous to weakest precondition generation; the analysis simply pushes the negation of assertions backwards, substituting terms for values in a bad condition δ based on the structure of the term e. As is typical for weakest precondition generation, wp ensures that the execution of e, from a state satisfying wp(i, e, δ), terminates in a state satisfying δ. To ensure termination, recursive functions are unrolled a fixed number of times, defined by the parameter i. The definition of wp is given as follows:

    wp(i, e, δ) =
      let δ = match e with
        | if v then e1 else e2 →
            (v ∧ wp(i, e1, δ)) ∨ (¬v ∧ wp(i, e2, δ))
        | let x = e1 in e2 → wp(i, e1, [ν/x] wp(i, e2, δ))
        | v → [e/ν] δ
        | op (v0, ···, vn) → [e/ν] δ
        | assert p → ¬p ∨ δ
        | f ỹ → (match f with
            | unknown fun or partial application → [(f ỹ)/ν] δ
            | known fun (when let f = λx̃.e) → [ỹ/x̃] wp(i, e, δ)
            | known fun (when let f = fix (fun f → λx̃.e)) →
                if i > 0 then [ỹ/x̃] wp(i − 1, e, δ) else false)
      in
      if there exists f ỹ in δ and f is a known fun
      then wp(i, f ỹ, [ν/(f ỹ)] δ) else δ

Our wp function is standard, extended to deal with unknown function calls. The concepts of known and unknown functions are defined in Sec. 3. Our idea is to encode unknown functions as uninterpreted functions, reflected in the f ỹ case for an application expression when f is an unknown function with a list of arguments ỹ, or f ỹ is a partial application. As a result, we can generate constraints over the input/output behaviors of unknown functions for higher-order functions (e.g., δ5 in Fig. 3). The symbolic analysis for the actual function represented by the unknown function is deferred until it becomes known (e.g., δ3 in Fig. 3), reflected in the last two lines of the definition: if there exists a known function f that has been substituted for an unknown function in δ (e.g., at a call-site), and f ỹ ∈ δ where ỹ is a list of arguments, we perform wp(i, f ỹ, [ν/(f ỹ)]δ).

In the f ỹ case, if f is bound to a known recursive function, since we restrict the number of times a recursive function is unrolled, when i = 0 we simply return false to avoid considering further unrolling of f; otherwise, the bad-condition δ is directly
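The core cases of wp can be rendered directly over a small expression datatype. The following OCaml sketch is a simplification of our own (no unknown functions, no unrolling bound, and bad conditions kept purely syntactic), not the paper's implementation.

```ocaml
(* Bad conditions delta as a small formula datatype; "nu" plays the
   role of the special value variable. *)
type formula =
  | FVar of string
  | FNot of formula
  | FAnd of formula * formula
  | FOr of formula * formula

(* A fragment of the language of Fig. 5: values, conditionals,
   let-bindings, and assertions. *)
type expr =
  | Value of string
  | If of string * expr * expr
  | Let of string * expr * expr
  | Assert of string

(* [subst x v phi] replaces variable x by v in phi, i.e. [v/x]phi. *)
let rec subst x v phi =
  match phi with
  | FVar y -> if y = x then FVar v else phi
  | FNot p -> FNot (subst x v p)
  | FAnd (p, q) -> FAnd (subst x v p, subst x v q)
  | FOr (p, q) -> FOr (subst x v p, subst x v q)

(* Backward propagation of a bad condition delta, following the
   value / if / let / assert cases of wp. *)
let rec wp e delta =
  match e with
  | Value v -> subst "nu" v delta                  (* [v/nu] delta *)
  | If (v, e1, e2) ->
      FOr (FAnd (FVar v, wp e1 delta),
           FAnd (FNot (FVar v), wp e2 delta))
  | Let (x, e1, e2) ->
      wp e1 (subst x "nu" (wp e2 delta))           (* [nu/x] ... *)
  | Assert p -> FOr (FNot (FVar p), delta)         (* not p \/ delta *)
```

For instance, propagating a condition over the value of an expression through `Value "a"` renames the value variable, while `Assert "p"` disjoins the negated assertion, mirroring the corresponding cases above.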

pushed back to the definition of f in order to drive the sampling for f. In the latter case, the value of i is accordingly decremented.

During wp, the symbolic conditions collected at the entry and exit points of each function are treated as the pre- and post-bad conditions of the function (e.g., δprebad and δpostbad in Fig. 3).

Such invariants (even for unknown functions) should be observable from test runs. By summarizing the properties that hold in all such runs, we can construct likely invariants. In addition, the use of bad program states, which are either solutions of bad-conditions queried from an SMT solver (VB) or collected from the "err" case during sampling of an unknown function (VB'), enables a demand-driven inference technique. With a set of good (VG) and bad (VB ∪ VB') program states, our method exploits a learning algorithm L(VG, VB) (resp. L(VG, VB')) to produce a likely invariant that separates VG from VB (resp. VB'). We lift these invariants to a refinement type system and check their validity through a refinement type checking technique (Sec. 6).

Program Sampling. Our approach instruments the original program at the entry and exit points of a function to collect values for each function parameter and return, together with variables in its lexical scope (for closures). The instrumentation for base type variables is trivial. To sample an unknown function, we adopt two conservative strategies.

1. A side-effect of wp's definition is that it provides hints on how unknown functions are eventually used, because the arguments

    let app x (f : int → (int → int) → int) g = f x g
    let f x k = k x
    let check x y = (assert (x = y); x)
    let main a b = app (a * b) f (check (a * b))

Figure 6. Generating samples for g, bound to parameter k in f, may trigger assertion violations in check.

5. Learning Algorithm

We describe the design and implementation of our learning algorithm L(VG, VB) in this section. Suppose we are given a set of good program states VG and a set of bad program states VB, where both VG and VB contain states which map variables to values. We simplify the sampled states by abstracting away an unknown function f: each sampled state σ in VG and VB records only the values of its parameters f0, ··· and return fr. We base our analyses on a set of atomic predefined predicates Π = {Πi}0≤i
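The shape of a sampled state σ described in Sec. 4 (base-typed bindings, plus observed input/output records for unknown functions, with a distinguished err outcome) can be sketched as a small datatype; all names here are illustrative.

```ocaml
(* A sampled program state sigma: base-typed variables map to values;
   an unknown function maps to the set of (argument, result) records
   observed for it in tests. [Err] marks a call that triggered an
   assertion violation. *)
type outcome = Val of int | Err

type binding =
  | Base of int                        (* e.g. sigma(x) = 5 *)
  | Records of (int * outcome) list    (* observed (f0, fr) pairs *)

(* The state sigma(x) = 5, sigma(f) = {(0,1); (1,0); (2,0)} from the
   discussion in Sec. 4. *)
let sigma : (string * binding) list =
  [ ("x", Base 5);
    ("f", Records [ (0, Val 1); (1, Val 0); (2, Val 0) ]) ]

(* Look up the observed return of an unknown function on a given
   argument, if it was sampled. *)
let observed_return st f arg =
  match List.assoc_opt f st with
  | Some (Records rs) -> List.assoc_opt arg rs
  | _ -> None
```

The learner never calls the unknown function itself; it only consults such records, which is what "abstracting away the unknown function" amounts to here.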

If Πi should be selected for separation in the classifier, seli is assigned 1; otherwise, it is assigned 0. The encoding constrains the selection variables so that every pair of a good sample g and a bad sample b is distinguished by some selected predicate:

    ϕ1 :  ⋀_{g ∈ VG, b ∈ VB}  ⋁_{0 ≤ i}  ( g(Πi) ≠ b(Πi) ∧ seli = 1 )

Algorithm 1: L (VG, VB)
    1  let (ϕ1, ϕ2) = encode (VG, VB) in
    2  let k := 1 in
    3  if sat (ϕ1 ∧ ϕ2) ≠ UNSAT then
    4    while not (sat (ϕ1 ∧ ϕ2 ∧ (Σi seli = k))) do
    5      k := k + 1
    6    McCluskey (smt_model (ϕ1 ∧ ϕ2 ∧ (Σi seli = k)))
    7  else abort "Invariant not in hypothesis domain"

6. Verification Procedure

To yield refinement types, we extend standard types with invariants, automatically synthesized from samples, as type refinements. The invariants inferred for a function f are assigned to the unknown refinement variables (κ) in the refinement function type of f. Other unknown refinement variables, associated with local expressions inside function definitions, are still undefined. To solve this problem, we have implemented an algorithm that extracts path-sensitive verification conditions from refinement typing rules, extending the inference algorithm in [28]. It therefore does not need to explicitly infer the refinement types for local expressions. It can also verify programmer-supplied program assertions using synthesized likely invariants. We present the full algorithm in [44].

Notably, our approach can properly account for unknown functions whose order is greater than one, that is, unknown functions which may themselves take functional arguments. Recall the sample
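The predicate-selection core of Algorithm 1 can be approximated without an SMT solver. The brute-force OCaml sketch below (our simplified stand-in, not the paper's implementation) searches for a smallest subset of predicate indices on which every good sample differs from every bad sample, increasing k exactly as Algorithm 1 does.

```ocaml
(* Samples are boolean vectors of predicate valuations. A subset of
   predicate indices "separates" the samples if every good/bad pair
   differs on some selected predicate. *)
let separates subset goods bads =
  List.for_all
    (fun g ->
      List.for_all
        (fun b ->
          List.exists (fun i -> List.nth g i <> List.nth b i) subset)
        bads)
    goods

(* All k-element subsets of a list of indices. *)
let rec subsets_of_size k xs =
  if k = 0 then [ [] ]
  else
    match xs with
    | [] -> []
    | x :: rest ->
        List.map (fun s -> x :: s) (subsets_of_size (k - 1) rest)
        @ subsets_of_size k rest

(* Smallest separating subset over n predicates, mirroring the
   increasing-k loop of Algorithm 1 (brute force instead of SMT). *)
let select_predicates n goods bads =
  let idxs = List.init n (fun i -> i) in
  let rec go k =
    if k > n then None
    else
      match
        List.find_opt (fun s -> separates s goods bads)
          (subsets_of_size k idxs)
      with
      | Some s -> Some s
      | None -> go (k + 1)
  in
  go 1
```

On the samples of Table 1, this step plays the role of picking out Π3 and Π6, whose truth table is then minimized into the Boolean structure ¬Π3 ∨ Π6.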

Figure 8. CEGAR loop: invariant as classifier. (a) over-approximate invariant; (b) under-approximate invariant; (c) desired invariant.

Algorithm 2: Main (e, ψ)
  Input: e is a program; ψ is its safety property
  Output: verification result
   1  let (e′, iv) = (annotate e ψ, randominputs e) in
   2  let (VG, VB) := (run (instrument e′) iv, ∅) in
   3  let i := 2 in
   4  while true do
   5      let ϕ = learn (VG, VB) in
   6      if verify e′ ϕ then
   7          return “Verified”
   8      else (VG, VB) := Refine (i, e, ψ, ϕ, VG, VB)

Algorithm 3: Refine (i, e, ψ, ϕ, VG, VB)
  Input: (e, ψ) are as in Algorithm 2; ϕ are failed invariants; i is the number of times a recursive function is unrolled in wp; VG and VB are the old good and bad samples
  Output: good or bad samples (VG, VB) that refine ϕ
   1  let e′ = annotate e ψ in
   2  let _ = wp (i, e′, false) in
   3  let bad_cond = bad conditions of functions from the wp call in
   4  if sat (bad_cond ∧ ϕ) then
   5      (VG, (deduce (bad_cond ∧ ϕ)) ∪ VB)
   6  else
   7      let test_cond = wp (i, annotate e ϕ, false) in
   8      if sat (test_cond) then
   9          let iv = deduce test_cond in
  10          ((run (instrument e′) iv) ∪ VG, VB)
  11      else Refine (i + 1, e, ψ, ϕ, VG, VB)

… for the learning algorithm, refining the failed invariants (line 8). Notably, our backward symbolic analysis (wp) requires bounding the number of times recursive functions are unrolled. This is achieved by passing the bound parameter i to Refine. Initially, i is set to 2 (line 3 of Algorithm 2).

The Refine algorithm (see Algorithm 3) guides sample generation to refine a failed likely invariant. The first step of Refine is the invocation of the wp procedure over the given higher-order program annotated with the property ψ (lines 1 and 2); this step yields pre- and post-bad conditions for each known function sufficient to trigger a failure of some assertion (line 3). A failed invariant may be too over-approximate (failing to incorporate needed sufficient conditions) or too under-approximate (failing to account for important necessary conditions). This is intuitively depicted in Fig. 8(a), where the classifier (as invariant) only separates the observed good and bad samples but fails to generalize to unseen states.

To account for the case that it is too over-approximate, we first try to sample new bad states (line 4). The idea is reflected in Fig. 8(b): the new bad samples should help the learning algorithm strengthen the invariants it considers. For each known function, we simply conjoin the failed likely pre- and post-invariants with the pre- and post-bad conditions derived earlier from the wp procedure. Bad states (VB) are (SMT) solutions of such conditions (line 5). Note that bad_cond and ϕ are sets of bad conditions and failed invariants for each known function in the program; operators like ∧ and ∪ in Algorithm 3 are overloaded in the obvious way. If no new bad states can be sampled, we account for the case that the failed invariants are too under-approximate (line 6).

Our idea of sampling more good states is reflected in Fig. 8(c): the new good states should help the learning algorithm weaken the invariants it considers. To this end, we annotate the failed pre- and post-invariants as assertions at the entry and exit of the function bodies of the known functions for which such invariants are inferred. (Function annotate substitutes the variables representing an unknown function's argument and return in a failed invariant with the actual argument and return, encoded into uninterpreted form in the corresponding function's pre- and post-bad conditions. For example, a_0 and a_r in Table 1(a) are replaced with j and a_j in a failed invariant for the init function; consider δ5 in Fig. 3.) Note that these invariants only represent an under-approximate set of good states. To direct tests to program states that have not been seen before, the wp procedure executes the negation of these annotated assertions back to the program's main entry to yield a symbolic condition (line 7). Function deduce generates a new test case for the main entry (lines 8 and 9) from the (SMT) solutions of the symbolic condition. The new good states obtained from running the generated test inputs are guaranteed to refine the failed invariant (line 10).

In function Refine, we only consider unrolling recursive functions a fixed i times. As stated, if this is not sufficient, we increase the value of i and iterate the refinement strategy (line 11). However, in our experience (see Sec. 8), unrolling the definition of a recursive function twice usually suffices, based on the observation that the invariant of a recursive function can be observed from a shallow execution. In particular, i is unlikely to be greater than the maximum integer constant used in the if-conditions of the program.

Algorithm Output. (a) In the testing phase (Runner), the Main algorithm terminates with test inputs witnessing bugs in function run when the tests expose assertion failures in the original program. (b) In the sampling phase (Deducer), since our technique is incomplete in general, if a program has expressions that cannot be encoded into a decidable logic for SMT solving, Refine may be unable to infer necessary new samples because the sat function (lines 4 and 8 of Algorithm 3) aborts with an undecidable result. (c) In the learning phase (Learner), it terminates with “Invariant not in hypothesis domain” in line 7 of Algorithm 1 when no invariant can be found in the search space (which is parameterized by Equation 1 in Sec. 2). (d) In the verifying phase (Verifier), it returns “Verified” in line 5 of Algorithm 2 when specifications are successfully proved.
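The shape of this loop can be sketched directly in OCaml. In the sketch below, the names learn, verify, and refine are hypothetical stand-ins for the paper's learner, the refinement type checker, and the Refine procedure; refine either returns strictly more samples or signals (with None) that the unrolling bound i must grow:

```ocaml
(* A minimal sketch of the CEGAR loop of Algorithm 2 (illustrative only):
   [learn] builds a candidate invariant (a classifier) from good/bad samples,
   [verify] stands in for the refinement type checker, and [refine] plays the
   role of Algorithm 3, returning refined samples or None when the unrolling
   bound [i] must be increased. *)
let rec cegar_loop ~learn ~verify ~refine i (good, bad) =
  let phi = learn good bad in            (* likely invariant from samples *)
  if verify phi then `Verified           (* type checker accepts phi      *)
  else
    match refine i phi (good, bad) with
    | Some samples' -> cegar_loop ~learn ~verify ~refine i samples'
    | None -> cegar_loop ~learn ~verify ~refine (i + 1) (good, bad)
```

On a toy instantiation where samples are integers and the learner guesses a threshold classifier, the loop converges once the sampled bad states pin down the decision boundary.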

6.2 Soundness and Convergence

Our algorithm is sound since we rely on a sound refinement type system [28] for proving safety properties (proved in [44]) or a test input for witnessing bugs.

For a program e with some safety property ψ, a desired invariant of e should accept all possible (unseen) good states and reject all (unseen) bad states (according to [44], the desired invariant found in our system is an inductive invariant, which can hence be encoded into the refinement type system of [28] for verification). Recall that our hypothesis domain is the arbitrary Boolean combination of predicates, parameterized by Equation 1 in Sec. 2. We claim that the CEGAR loop in Algorithm 2 converges: it eventually learns the desired invariant ϕ, provided one exists that is expressible as a hypothesis in the hypothesis domain.

Theorem 1. [Convergence] Algorithm 2 converges. (All proofs can be found in [44].)

To derive a proof, assume Refine (line 8 of Algorithm 2) does not take a desired invariant as input; otherwise Algorithm 2 has already converged. Refine can iteratively increase i, the number of times recursive functions are unrolled in e, to generate a new pair of good/bad samples that refine ϕ. Otherwise, if such a value of i does not exist, ϕ has already classified all the unseen good/bad samples. Hence, in each CEGAR iteration, by construction, a new sample provides a witness of why a failed invariant should be refuted. According to Lemma 1, our learning algorithm produces a consistent hypothesis that separates all good samples from bad samples. As a result, the CEGAR loop does not repeat failed hypotheses. Our technique essentially enumerates the hypothesis domain. Finally, the hypothesis domain is finite since the coefficients and constants of atomic predicates are accordingly bounded (see Sec. 5); the CEGAR based sampling-learning-checking loop in Algorithm 2 therefore converges in a finite number of iterations.

6.3 Algorithm Features

In Algorithm 2, the refinement type system and the test system cooperate on invariant inference. The refinement type system benefits from tests because it can extract invariants from test outcomes. Conversely, if previous tests do not expose an error in a buggy program, failed invariants serve as abstractions of sampled good states. By directing tests towards the negation of these abstractions, Algorithm 3 guides test generation towards hitherto unexplored states.

Second, it is well known that intersection types [37] are necessary for verification when an unknown function is used more than once in different contexts [17]. Instead of inferring intersection types directly as in [17], we recover their precision by inferring type refinements (via learning) containing disjunctions (as demonstrated by the example in Fig. 3).

7. Recursive Data Structures

As stated in Sec. 2, we extend our framework to verify data structure programs with specifications that can be encoded into type refinements using measures [15, 40]. For example, a measure len, representing list length, is defined in Fig. 9 for lists.

let rec len l =
  match l with
  | x :: xs -> len xs + 1
  | [] -> 0

let reverse zs =
  let rec aux xs ys =
    match xs with
    | [] -> ys
    | x :: xs -> aux xs (x :: ys) in
  let r = aux zs [] in
  (assert (len r = len zs); r)

Figure 9. Samples of data structures can be classified by measures.

We first extend the syntax of our language to support recursive data structures:

  e ::= · · · | ⟨e⟩ | C⟨e⟩ | match e with |_i C_i⟨x_i⟩ → e_i
  M ::= (m, ⟨C_i⟨x_i⟩ → ι_i⟩)        ι ::= m | c | x | ι ⊕ ι

The first line illustrates the syntax for tuple constructors, data type constructors, where C represents a constructor (e.g. list ::), and pattern matching. M is a map from a measure m to its definition. To ensure decidability, like [15], we restrict measures to the class of first-order functions over simple expressions (ι), so that they are syntactically guaranteed to terminate. The typing rules for the extended syntax are adapted from [15] and are available as part of the supplementary material. To support this extension, we also need to extend our wp definition, as shown in Fig. 10.

wp(e, φ) = case e of
  | C_i⟨e⟩ when (m, ⟨C_i⟨x_i⟩ → ι_i⟩) ∈ M  ⇒  [ι_i⟨e⟩/(m ν)] φ
  | match e0 with |_i C_i⟨x_i⟩ → e_i when (m, ⟨C_i⟨x_i⟩ → ι_i⟩) ∈ M  ⇒
        ⋁_i { ∃⟨x_i′⟩. [⟨x_i′⟩/⟨x_i⟩]((m e0) = ι_i⟨x_i⟩ ∧ wp(e_i, φ)) }

Figure 10. wp rule for recursive data structures.

The basic idea is that when a recursive structure is encountered, its measure definitions are accordingly unrolled: (1) for a structure constructor C_i⟨e⟩, we derive the appropriate pre-condition by substituting the concrete measure definition ι_i⟨e⟩ for the measure application m ν in the post-condition; this is exemplified in Fig. 11, where bad-condition δ2 is obtained from δ1 by replacing len (x :: ys) with len ys + 1, based on the definition of measure len; (2) for a match expression, the pre-condition is derived from a disjunction constructed by recursively calling wp over all of its case expressions, which are also extended with the guard predicate capturing the measure relation between e0 and ⟨x_i⟩. All the ⟨x_i⟩ need to be existentially quantified and skolemized when fed to an SMT solver to check satisfiability. The bad condition δ3 in Fig. 11 is such an example.

With the extended definition, sampling recursive data structures is fairly straightforward. To collect “good” states, in the instrumentation phase, for each recursive structure serving as a function parameter or return value in some data structure function, we simply call its measure functions and record the measure outputs in the sample state. To collect “bad” states, we invoke an SMT solver on the bad-conditions for each data structure function to find satisfiability solutions. The solver can generate values for measures because it interprets a measure function in bad-conditions as uninterpreted.

Consider how we might infer a precondition for function aux in Fig. 9. Note that aux is defined inside reverse and is a closure which can refer to variable zs in its lexical scope. A good sample presents the values of len(xs), len(ys) and len(zs), trivially available from testing. A bad sample captures a bad relation among len(xs), len(ys) and len(zs) that is sufficient to invalidate the assertion in the reverse function, solvable from δprebad in Fig. 11. With these samples, our approach infers the following refinement type for aux, which is critical to prove the assertion:

  xs:list → ys:{list | len xs + len ν = len zs} → {list | len ν = len zs}

If function aux were not defined inside of function reverse, where zs is not in the scope of aux, our technique would infer a different type for aux: xs:list → ys:list → {list | len xs + len ys = len ν}.

When there is a need for sampling more good states in the Refinement algorithm (Algorithm 3), generating additional test inputs for data structures from the wp-condition reduces to Korat [3], a constraint-based test generation mechanism.
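The measure-unrolling step at a constructor can be checked concretely. The sketch below is our own illustration (not part of the paper's tool): it confirms that replacing len (x :: ys) by len ys + 1, the substitution that derives bad-condition δ2 from δ1, is meaning-preserving, because it is exactly the cons case of the measure len.

```ocaml
(* Illustration of measure unrolling at a constructor: delta1 mentions the
   measure [len] applied to the cons cell x :: ys; delta2 is delta1 after
   unrolling [len] there, i.e. replacing [len (x :: ys)] by [len ys + 1]. *)
let rec len l = match l with
  | _ :: xs -> len xs + 1
  | [] -> 0

let delta1 xs x ys zs = len xs = 0 && len (x :: ys) <> len zs
let delta2 xs _x ys zs = len xs = 0 && len ys + 1 <> len zs
```

Since the two conditions agree on every concrete state, the symbolic substitution loses no information while eliminating the measure application over the constructed value.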
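The "good"-state collection described above can be pictured as a hand-written instrumentation of Fig. 9 (our own sketch, not the tool's generated code): at each call of aux we record the measure triple (len xs, len ys, len zs), and every recorded state satisfies the likely invariant len xs + len ys = len zs.

```ocaml
(* A sketch of instrumenting [aux] from Fig. 9 to collect "good" samples:
   each call records the measures of its arguments and of the captured
   variable [zs]; the likely invariant len xs + len ys = len zs can then be
   checked against every recorded state. *)
let rec len l = match l with _ :: xs -> len xs + 1 | [] -> 0

let samples : (int * int * int) list ref = ref []

let reverse zs =
  let rec aux xs ys =
    samples := (len xs, len ys, len zs) :: !samples;  (* record a good state *)
    match xs with
    | [] -> ys
    | x :: xs -> aux xs (x :: ys)
  in
  let r = aux zs [] in
  assert (len r = len zs);
  r
```

Running reverse on a few inputs populates samples with states such as (3, 0, 3), (2, 1, 3), (1, 2, 3) and (0, 3, 3), all consistent with the inferred precondition of aux.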
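The likely invariant of Fig. 11 is just a classifier over measure triples (len xs, len ys, len zs): it must accept the good samples and reject the bad ones. On the sample values recoverable from the figure (our reading of it), the hypothesis len xs + len ys = len zs separates the two sets exactly:

```ocaml
(* The likely invariant of Fig. 11 viewed as a classifier over recorded
   measure triples (lxs, lys, lzs). The sample values are our reading of
   the figure's G and B rows. *)
let likely_invariant (lxs, lys, lzs) = lxs + lys = lzs

let good = [ (2, 1, 3); (1, 2, 3); (0, 0, 0) ]   (* G samples *)
let bad  = [ (1, 0, 2); (1, 0, 0) ]              (* B samples *)
```

A consistent classifier of this form is exactly what the learner must produce before the type checker attempts verification.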
Alternatively, the failed invariants can be considered incorrect specifications. We can directly generate inputs to the program by causing it to violate the specifications, following [24, 29]. Notably, the former approach is complete if the underlying SMT solver can always find a model for any satisfiable formula. As an optimization for efficiency, we bootstrap the verification procedure with random testing to generate a random sequence of method calls (e.g. insert and remove) up to a small length s in the Main algorithm (line 1 of Algorithm 2). In our experience (Sec. 8), setting s to 300 allows the system to converge for all the container structures we consider without requiring extra good samples; this result supports a large case study [31] showing that test coverage of random testing for container structures is as good as that of systematic testing.

let rec aux xs ys =
  match xs with
  | [] -> ys
  | x :: xs ->
      let ys = x :: ys in
      aux xs ys

  δpostbad : len ν ≠ len zs
  δ1 : len xs = 0 ∧ len ys ≠ len zs
  δ2 : len xs = 0 ∧ len ys + 1 ≠ len zs
  δ3 : ∃xs′. len xs = 1 + len xs′ ∧ [xs′/xs]δ2
  δ4 : len xs = 0 ∧ len ys ≠ len zs
  δprebad : δ3 ∨ δ4

  Samples (lxs abbreviates len xs, etc.):
    G: (lxs, lys, lzs) = (2, 1, 3), (1, 2, 3), (0, 0, 0)
    B: (lxs, lys, lzs) = (1, 0, 2), (1, 0, 0)
  Likely invariant: lxs + lys = lzs

Figure 11. Classifying good (G) and bad (B) samples to construct an invariant (precondition) for aux.

8. Experimental Results

We have implemented our approach in a prototype verifier (https://www.cs.purdue.edu/homes/zhu103/msolve/). Our tool is based on the OCaml compiler. We use Yices [42] as our SMT solver. To test the utility of our ideas, we consider a suite of around 100 benchmarks from the related work. Our experimental results were collected on a laptop running an Intel Core 2 Duo CPU with 4GB memory. Our experiments are set up in three phases. In the first, we demonstrate the efficiency of our learning based invariant generation algorithm (Sec. 5) by comparing it with existing learning based approaches, using non-trivial first-order loop programs. In this phase, we only compare first-order programs because the sampling strategies used in the other learning based approaches do not work in higher-order cases. In the second and third phases, we compare with MOCHI and LIQUIDTYPES, two state-of-the-art verification tools for higher-order programs.

8.1 Learning Benchmarks

We collected challenging loop programs found in the invariant learning framework ICE [10]. We list in Fig. 12 the programs that took more than 1s to verify in their tool. We additionally compare our approach to CPA, a static verification tool [2], and three related learning based verification tools that are also based on the idea of inferring invariants as classifiers of good/bad sample program states: ICE [10], SC [34] and MC2 [30]. Our tool outperforms ICE because it completely abstracts the inference of the Boolean structure of likely invariants, while ICE requires fixing a Boolean template prior to learning; it outperforms SC because it guides sample generation via the CEGAR loop; and it outperforms MC2 due to its attempt to find minimal invariants from the samples for generalization.

  Loops   N   L     T     CPA    ICE    SC    MC2
  cgr2    2   0.2s  0.3s  1.7s   6.9s   2.7s  17.3s
  ex23    3   0.3s  0.4s  16.7s  17.4s  4.7s  0.1s
  sum1    5   0.6s  0.8s  1.5s   1.8s   2.6s  29.1s
  sum4    2   0.1s  0.1s  3.2s   2.6s   ×     ×
  tcs     2   0.1s  0.1s  1.7s   1.4s   0.5s  ×
  trex3   2   0.1s  0.3s  ×      2.2s   ×     ×
  prog4   3   0.3s  0.5s  1.6s   ×      ×     0.1s
  svd     2   0.5s  1.0s  19.1s  ×      5.9s  ×

Figure 12. Evaluation using loop programs: N and T are the number of CEGAR iterations and the total time of our tool (L is the time spent in learning). × means an adequate invariant was not found.

8.2 MOCHI Higher-Order Programs

To gauge the effectiveness of our prototype with respect to existing automated higher-order verification tools, we consider benchmarks encoded with complex higher-order control flow, reported from MOCHI [17], including many higher-order list manipulating routines such as fold, forall, mem and mapfilter.

We gathered the MOCHI results on an Intel Xeon 5570 CPU with 6 GB memory, running an up-to-date MOCHI implementation, a machine notably faster than the environment for our system. A CEGAR loop in MOCHI performs dependent type inference [37, 38] on spurious whole program counterexamples, from which suitable predicates for refining the abstract model are discovered based on interpolation [21]. However, existing limitations of interpolating theorem provers may confound MOCHI. For example, it fails to prove the assertion given in the program in Fig. 9.

  Program        N   L     T     I  DI  MOCHI
  ainit          4   1.9s  2.3s  5  4   5.7s
  amax           4   0.6s  0.9s  5  2   2.4s
  accpr          3   0.8s  1.1s  7  0   3.9s
  fold_fun_list  3   0.2s  0.6s  5  0   3.7s
  mapfilter      5   0.7s  1.2s  3  2   18.5s
  risers         3   0.1s  0.3s  4  2   2.4s
  zip            3   0.1s  0.2s  1  0   2.4s
  zipunzip       3   0.1s  0.2s  1  0   1.7s

Figure 13. Evaluation using MOCHI benchmarks: N and T are the number of CEGAR iterations and the total time of our tool (L is the time spent in learning); I is the number of discovered type refinements, among which DI shows the number of disjunctive type refinements inferred. Column MOCHI shows verification time using MOCHI.

Fig. 13 only lists results for which MOCHI requires more than 1 second; our tool also takes less than 1s for the rest of the MOCHI benchmarks. Performance improvements range from 2x to 18x. We typically infer smaller and hence more readable types than MOCHI. In the case of mapfilter, where the performance differential is greatest, MOCHI spends 6.1s to find a huge dependent type in its CEGAR loop. This results in an additional 10.7s spent on model checking. In contrast, our approach tries to learn a simple classifier from easily-generated samples to permit generalization.

8.3 Recursive Functional Data Structure Programs

We further evaluate our approach on benchmarks that manipulate data structures. List is a library that contains standard list routines such as append, length, merge, sort, reverse and zip. Sieve implements Eratosthenes' sieve procedure. Treelist is a data structure that links a number of trees into a list. Brauntree is a variant of balanced binary trees. They are described in [41]. Ralist is a random-access list library. Avltree and Redblack are implementations of two balanced trees, AVL trees and red-black trees. Bdd is a binary decision diagram library. Vec is an OCaml extensible array library. These benchmarks are used for evaluation in [28]. Fifo is a queue structure maintained by two lists, adapted from the SML library [35]. Set/Map is the implementation of finite maps taken from the OCaml library [26].

We check the following properties: Len1, the various procedures appropriately change the length of lists; Len2, the vector access index is nonnegative and properly bounded by the vector length; Bal, the trees are recursively balanced (the definition of balance in different tree implementations varies); Sz or Ht, the functions coordinate to change the number of elements contained and the height of trees; Clr, the tree satisfies the red-black color invariant; VOrder, the BDD maintains the variable order property.

  Program     LOC  An  LIQTYAN  T     Property
  List        62   6   12       2s    Len1
  Sieve       15   1   2        1s    Len1
  Treelist    24   1   2        1s    Sz
  Fifo        46   1   5        2s    Len1
  Ralist      102  2   6        2s    Len1, Bal
  Avl tree    75   3   9        20s   Bal, Sz, Ht
  Bdd         110  5   14       13s   VOrder
  Braun tree  39   2   3        1s    Bal, Sz
  Set/Map     100  3   10       14s   Bal, Ht
  Redblack    150  3   9        27s   Bal, Ht, Clr
  Vec         310  15  39       110s  Bal, Len2, Ht

Figure 14. Evaluation using data structure benchmarks: LOC is the number of lines in the program, An is the number of required annotations (for instrumenting data structure specifications), and T is the total time taken by our system. LIQTYAN is the number of annotations required in the LIQUIDTYPES system.

The results are summarized in Fig. 14. The number of annotations used in our system is reflected in column An. These annotations are simply the properties in Fig. 14. Our experiments show that we eliminate the burden of annotating a predefined set of likely invariants used to prove these properties, as required in LIQUIDTYPES, because we infer such invariants automatically.

For example, in the Vec library, an extensible array is represented by a balanced tree with a balance factor of at most 2. To prove the correctness of its recursive balancing routine, recbal(l, r), which aims to merge two balanced trees (l and r) of arbitrarily different heights into a single balanced tree, our tool infers a complex invariant (equivalent to a 4-DNF formula) describing the result of recbal. Without that invariant, the refinement type checker would end up rejecting the correct implementation. In contrast, such a complicated invariant is required to be manually provided in LIQUIDTYPES. Or, at least, the programmer has to provide the shape of the desired invariant (the tool then considers all likely invariants of the presumed shape). The annotation burden of recbal in LIQUIDTYPES is listed below, in which v refers to the result of recbal and ht is a measure definition that returns the height of a tree structure.

  1. Bal(v) (A : vec) : ht v {≤, ≥} ht A {−, +} [1, 2, 3]
  2. Bal(v) : ht v {≥, ≤} (ht l ≥ ht r ? ht l : ht r) {−, +} [0, 1, 2]
  3. Bal(v) : ht v ≥ (ht l ≤ ht r + 2 ∧ ht l ≥ ht r − 2 ?
                        (ht l ≥ ht r ? ht l : ht r) + 1 : 0)
  4. Bal(v) : ht v ≥ (ht l ≥ ht r ? ht l : ht r) +
                      (ht l ≤ ht r + 2 ∧ ht l ≥ ht r − 2 ? 1 : [0, −1])

The four annotations are already complex because the desired invariant of recbal must contain disjunctive clauses. Without suitable expertise, providing such annotations could be challenging. In comparison, our tool automatically generates a Boolean combination of the necessary atomic predicates parameterized from the hypothesis domain (parameterized from Equation 1). It learns invariants from sampling the program and closes the gap between the programmer's intuition and the inference mechanisms performed by formal verification tools.

Fig. 14 does not show the time taken by LIQUIDTYPES because it crucially depends on the relevance of user-provided invariants.

Limitations. There are a few limitations to our current implementation. First, we rely on an incomplete type system [28]. In particular, our type system is not as complete as [39], which automatically adds ghost variables into programs to remedy incompleteness in the refinement type system. Second, our tool fails if our hypothesis domain is not sufficiently expressive to compute a classifier for an invariant. As part of future work, we plan to consider ways to gradually increase the expressivity of the hypothesis domain by parameterizing Equation 1. Third, we do not currently allow data structure measures to be defined as mappings from datatypes to sets (e.g. a measure that defines all the elements of a list), preventing us from inferring properties like list-sortedness, which requires reasoning about the relation between the head element and all elements in its tail. We leave such extensions for future work.

9. Related Work and Conclusions

There has been much work exploring the incorporation of refinement types into programming languages. DML [41] proposed a sound type-checking system to validate programmer-specified refinement types. LIQUIDTYPES [28] alleviates the burden of annotating full refinement types; it instead blends type inference with predicate abstraction [11], and infers refinement types from conjunctions of programmer-annotated Boolean predicates over program variables, following the Houdini approach [9].

There have also been substantial advances in the development of dependent type systems that enable the expression and verification of rich safety and security properties, such as Ynot [22], F* [36], and GADTs and type classes [18, 19], albeit without support for invariant inference. The use of directed tests to drive the inference process additionally distinguishes our approach from these efforts.

Higher-order model checkers, such as MOCHI [17], compute predicate abstractions on the fly as a white-box analysis, encoding higher-order programs into recursion schemes [16]. Recent work in higher-order model checking [27] has demonstrated how to scale recursion schemes to several thousand rules. We consider the verification problem from a different angle, applying a black-box analysis to infer likely invariants from sampled states. In a direction opposite to higher-order model checking, HMC [14] translates type constraints from a type derivation tree into a first-order program for verification. However, 1) the size of the constraints might be exponential in that of the original program; and 2) the translated program loses the structure of the original, thus making it difficult to provide an actual counterexample for debugging. Popeye [43] suggests how to find invariants from counterexamples on the original higher-order source, but its expressiveness is limited to conjunctive invariants whose predicates are extracted from the program text.

Refinement types can also be used to direct testing, as demonstrated in [29]. A relatively complete approach for counterexample search is proposed in [24], where contracts and code are leveraged to guide program execution in order to synthesize test inputs that satisfy pre-conditions and fail post-conditions. In comparison, our technique can only find first-order test inputs for whole programs. However, existing testing tools cannot be used to guarantee full correctness of a general program.

Dynamic analyses can in general improve static analyses. The ACL2 [4] system presents a synergistic integration of testing with interactive theorem proving, which uses random testing to automatically generate counterexamples to refine theorems. We are in part inspired by YOGI [12], which combines testing and first-order model checking. YOGI uses testing to refute spurious counterexamples and to find where to refine an imprecise program abstraction. We retrieve likely invariants directly from tests to aid automatic higher-order verification.

There has been much interest in learning program invariants from sampled program states. Daikon [8] uses conjunctive learning to find likely program invariants with respect to user-provided templates, with sample states recorded along test runs. A variety of learning algorithms have been leveraged to find loop invariants, using both good and bad sample states: some are based on simple equation or template solving [10, 25, 33]; others are based on off-the-shelf machine learning algorithms [30, 32, 34]. However, none of these efforts attempt to sample and synthesize complex invariants in the presence of recursive higher-order functions.

Conclusion. We have presented a new CEGAR based framework that integrates testing with a refinement type system to automatically infer and verify specifications of higher-order functional programs, using a lightweight learning algorithm as an effective intermediary. Our experiments demonstrate that this integration is efficient. In future work, we plan to integrate our idea into more expressive type systems. The work of [5] shows that a refinement type system can verify properties of higher-order dynamic languages like JavaScript; however, it does not give an inference algorithm. It would be particularly useful to adapt the learning based inference techniques shown here to the type system for dynamic languages.

Acknowledgement

We thank Gustavo Petri, Gowtham Kaki, and the anonymous reviewers for their comments and suggestions in improving the paper. This work was supported in part by the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] A. Albarghouthi and K. L. McMillan. Beautiful interpolants. In CAV, 2013.
[2] D. Beyer and M. E. Keremoglu. CPAchecker: A tool for configurable software verification. In CAV, 2011.
[3] C. Boyapati, S. Khurshid, and D. Marinov. Korat: Automated testing based on Java predicates. In ISSTA, 2002.
[4] H. R. Chamarthi, P. C. Dillinger, M. Kaufmann, and P. Manolios. Integrating testing and interactive theorem proving. In ACL2, 2011.
[5] R. Chugh, P. M. Rondon, and R. Jhala. Nested refinements: A logic for duck typing. In POPL, 2012.
[6] K. Claessen and J. Hughes. QuickCheck: A lightweight tool for random testing of Haskell programs. In ICFP, 2000.
[7] E. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith. Counterexample-guided abstraction refinement for symbolic model checking. J. ACM, 50(5):752–794, 2003.
[8] M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao. The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program., 69(1-3):35–45, 2007.
[9] C. Flanagan and K. R. M. Leino. Houdini, an annotation assistant for ESC/Java. In FME, 2001.
[10] P. Garg, C. Löding, P. Madhusudan, and D. Neider. ICE: A robust framework for learning invariants. In CAV, 2014.
[11] S. Graf and H. Saïdi. Construction of abstract state graphs with PVS. In CAV, 1997.
[12] B. S. Gulavani, T. A. Henzinger, Y. Kannan, A. V. Nori, and S. K. Rajamani. Synergy: A new algorithm for property checking. In FSE, 2006.
[13] T. A. Henzinger, R. Jhala, R. Majumdar, and K. L. McMillan. Abstractions from proofs. In POPL, 2004.
[14] R. Jhala, R. Majumdar, and A. Rybalchenko. HMC: Verifying functional programs using abstract interpreters. In CAV, 2011.
[15] M. Kawaguchi, P. Rondon, and R. Jhala. Type-based data structure verification. In PLDI, 2009.
[16] N. Kobayashi. Types and higher-order recursion schemes for verification of higher-order programs. In POPL, 2009.
[17] N. Kobayashi, R. Sato, and H. Unno. Predicate abstraction and CEGAR for higher-order model checking. In PLDI, 2011.
[18] S. Lindley and C. McBride. Hasochism: The pleasure and pain of dependently typed Haskell programming. In Haskell, 2013.
[19] C. McBride. Faking it: Simulating dependent types in Haskell. J. Funct. Program., 12(5):375–392, 2002.
[20] E. J. McCluskey. Minimization of Boolean functions. Bell System Technical Journal, 35(6):1417–1444, 1956.
[21] K. L. McMillan. An interpolating theorem prover. Theor. Comput. Sci., 345(1):101–121, 2005.
[22] A. Nanevski, G. Morrisett, A. Shinnar, P. Govereau, and L. Birkedal. Ynot: Dependent types for imperative programs. In ICFP, 2008.
[23] C. G. Nelson. Techniques for program verification. Technical report, XEROX Research Center, 1981.
[24] P. C. Nguyen and D. Van Horn. Relatively complete counterexamples for higher-order programs. In PLDI, 2015.
[25] T. Nguyen, D. Kapur, W. Weimer, and S. Forrest. Using dynamic analysis to discover polynomial and array invariants. In ICSE, 2012.
[26] OCaml Library. http://caml.inria.fr/pub/docs/.
[27] S. J. Ramsay, R. P. Neatherway, and C.-H. L. Ong. A type-directed abstraction refinement approach to higher-order model checking. In POPL, 2014.
[28] P. M. Rondon, M. Kawaguci, and R. Jhala. Liquid types. In PLDI, 2008.
[29] E. L. Seidel, N. Vazou, and R. Jhala. Type targeted testing. In ESOP, 2015.
[30] R. Sharma and A. Aiken. From invariant checking to invariant inference using randomized search. In CAV, 2014.
[31] R. Sharma, M. Gligoric, A. Arcuri, G. Fraser, and D. Marinov. Testing container classes: Random or systematic? In FASE, 2011.
[32] R. Sharma, A. V. Nori, and A. Aiken. Interpolants as classifiers. In CAV, 2012.
[33] R. Sharma, S. Gupta, B. Hariharan, A. Aiken, P. Liang, and A. V. Nori. A data driven approach for algebraic loop invariants. In ESOP, 2013.
[34] R. Sharma, S. Gupta, B. Hariharan, A. Aiken, and A. V. Nori. Verification as learning geometric concepts. In SAS, 2013.
[35] SML Library. http://www.smlnj.org/doc/smlnj-lib/.
[36] N. Swamy, J. Weinberger, C. Schlesinger, J. Chen, and B. Livshits. Verifying higher-order programs with the Dijkstra monad. In PLDI, 2013.
[37] T. Terauchi. Dependent types from counterexamples. In POPL, 2010.
[38] H. Unno and N. Kobayashi. Dependent type inference with interpolants. In PPDP, 2009.
[39] H. Unno, T. Terauchi, and N. Kobayashi. Automating relatively complete verification of higher-order functional programs. In POPL, 2013.
[40] N. Vazou, P. M. Rondon, and R. Jhala. Abstract refinement types. In ESOP, 2013.
[41] H. Xi and F. Pfenning. Dependent types in practical programming. In POPL, 1999.
[42] Yices SMT solver. http://yices.csl.sri.com/.
[43] H. Zhu and S. Jagannathan. Compositional and lightweight dependent type inference for ML. In VMCAI, 2013.
[44] H. Zhu, A. V. Nori, and S. Jagannathan. Learning refinement types. Technical report, Purdue University, 2015. https://www.cs.purdue.edu/homes/zhu103/msolve/tech.pdf.
