Natural Proofs for Structure, Data, and Separation

XiaokangQiu PranavGarg AndreiS¸tef˘anescu P.Madhusudan University of Illinois at Urbana-Champaign, USA {qiu2, garg11, stefane1, madhu}@illinois.edu

Abstract ing verification conditions has become very powerful. The latter We propose natural proofs for reasoning with programs that ma- is often executed by automated logical decision procedures nipulate data-structures against specifications that describe the supported by SMT solvers, which have emerged as robust and pow- structure of the heap, the data stored within it, and separation and erful engines to automatically find proofs. Several techniques and framing of sub-structures. Natural proofs are a subclass of proofs tools have been developed [2, 16, 20] and there have been several that are amenable to completely automated reasoning, that pro- success stories of large software verification projects using this ap- vide sound but incomplete procedures, and that capture common proach (the OS project [42], the hypervisor veri- reasoning tactics in program verification. We develop a dialect of fication project using VCC [16], and a recent verified-for-security + separation logic over heaps, called Dryad, with recursive defini- OS browser for mobile applications [29], to name a few). tions that avoids explicit quantification. We develop ways to reason Verification conditions do not, however, always fall into decid- with heaplets using classical logic over the theory of sets, and de- able theories. In particular, the verification of properties of the dy- velop natural proofs for reasoning using proof tactics involving namically modified heap is a big challenge for logical methods. The disciplined unfoldings and formula abstractions. Natural proofs dynamically manipulated heap poses several challenges, as typical are encoded into decidable theories of first-order logic so as to be correctness properties of heaps require complex combinations of discharged using SMT solvers. structure (e.g., p points to a tree structure, or to a doubly-linked We also implement the technique and show that a large class of list, or to an almost balanced tree, with respect to certain pointer- more than 100 correct programs that manipulate data-structures are fields), data (the integers stored in data-fields of the tree respect the amenable to full functional correctness using the proposed natural binary search tree property, or the data stored in a tree is a max- proof method. These programs are drawn from a variety of sources heap), and separation (the procedure modifies one list and not the including standard data-structures, the Schorr-Waite algorithm for other and leaves the two lists disjoint at exit, etc.). garbage collection, a large number of low-level routines from the The fact that the dynamic heap contains an unbounded number Glib library and OpenBSD library, the Linux kernel, and routines of locations means that expressing the above properties requires from a secure verified OS-browser project. Our work is the first that quantification in some form, which immediately precludes the use we know of that can handle such a wide range of full functional ver- of most SMT decidable theories (there are only a few of them ification properties of heaps automatically, given pre/post and loop known that can handle quantification; e.g., the array property frag- trand invariant annotations. We believe that this work paves the way for ment [12] and the S logic [25, 26]). Consequently, express- deductive verification technology to be used by programmers who ing such properties naturally and succinctly in a logical formalism do not (and need not) understand the internals of the underlying has been challenging, and reasoning with them automatically even logic solvers, significantly increasing their applicability in building more so. reliable systems. For instance, in the Boogie line of tools (including VCC) of writing specifications using first-order logic and employing SMT Categories and Subject Descriptors F.3.1 [Logics and Meanings solvers to validate verification conditions, the specification of in- of Programs]: Specifying and Verifying and Reasoning about Pro- variants of even simple methods like singly-linked-list insert is te- grams: Mechanical verification; D.2.4 [Software Engineering]: dious. In such code1, second-order properties (reachability, acyclic- Software/Program Verification: Assertion checkers ity, separation, etc.) are smuggled in using carefully chosen ghost variables; for example, acyclicity of a list is encoded by assign- Keywords heap analysis; data structures; natural proofs; separa- ing a ghost number (idx) to each node in the list, with the property tion logic; SMT solvers that the numbers associated with adjacent nodes strictly increase going down the list. These ghost variables require careful manip- 1. Introduction ulation when the structures are updated; for example, inserting a In recent years, the automated deductive verification paradigm for node may require updating the ghost numbers for other nodes in software verification that combines user written modular contracts the list, in order to maintain the acyclicity property. Once such a and loop invariants with automated theorem proving of the result- ghost-encoding of the specification is formulated, the validation of verification conditions, which typically have quantifiers, are dealt with using sound heuristics (a wide variety of them including e- matching, model-based quantifier instantiation, etc. are available), Permission to make digital or hard copies of all or part of this work for personal or but are still often not enough and have to be augmented by instan- classroom use is granted without fee provided that copies are not made or distributed tiation triggers from the verification engineer to help the proof go for profit or commercial advantage and that copies bear this notice and the full citation through. on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. PLDI’13, June 16–19, 2013, Seattle, WA, USA. 1 http://vcc.codeplex.com/SourceControl/changeset/view/dcaa4d0ee8c2#vcc Copyright c 2013 ACM 978-1-4503-2014-6/13/06. . . $15.00 /Docs/Tutorial/c/7.2.list.c In recent years, separation logic, especially in combination how a particular sub-part of the heap changes due to a procedure, with recursive definitions, has emerged as a much more succinct implicitly leaving the rest of the heap and its properties unchanged and natural logic to express properties about structure and separa- across a call to this procedure. Separation logic is a particular tion [32, 36]. However, the validation of verification conditions re- framework for strict specifications, where formulas are implicitly sulting from separation logic invariants are also complex, and has defined on strictly defined heaplets, and where heaplets can be eluded automatic reasoning and exploitation of SMT solvers (even combined using a spatial conjunction operator denoted by ∗. The more so than tools such as Boogie that use classical logic). Again, frame rule in separation logic captures the main advantage of strict help from the user in proving the verification conditions are cur- specifications: if the Hoare-triple {P}C{Q} holds for some program rently necessary— the tools Verifast [20] and Bedrock [15], for C, then {P ∗ }C{Q ∗ R} also holds (with side-conditions that the instance, admit separation logic specifications but require the user modified variables in C are disjoint from the free variables in R). to write low-level lemmas and proof tactics to guide the verifica- Consider, for example, expressing that the location x is the root tion. For example, in verifying an in-place reversal of a linked list2, of a tree. This is a second-order property and formulations of it Bedrock would require several lemmas and a hint package be sup- in classical logic using set or path quantification are quite com- plied at the level of the code in order for the proof to go through. plex and not easily amenable to automated verification. We prefer The work in this paper is motivated by the opinion that en- inductive definitions of structural properties without any explicit tirely decidable logics are too restrictive, in general, to support quantification. The separation logic syntax with recursive defini- the verification of complex specifications of functional correctness tions and heaplet semantics allows simple quantifier-free formulas for heap manipulating programs, and the other extreme of user- to express structural restrictions; for example. tree-ness can be ex- supplied proof tactics and lemmas is too tedious, requiring of the pressed simply as: user too much knowledge of the underlying proof systems/decision procedures. Our aim is to build completely automatic, sound, but tree(x)::(x = nil ∧ emp) ∨ (x 7−→ (l, r) ∗ tree(l) ∗ tree(r)) incomplete proof techniques that can solve a large class of proper- We first define a new logic, Dryad, that permits no explicit ties involving complex data-structures. quantification, but permits powerful recursive definitions to define The natural proof methodology: integers, sets/multisets of integers, and sets of locations, using least Our proof methodology of natural proofs was first proposed in a pa- fixed-points. The logic Dryad furthermore has a heaplet semantics per by Madhusudan et al on natural proofs for tree data-structures and allows the spatial conjunction operator ∗. However, a key last year at POPL [27]. Natural proofs exploit a fixed set of proof design feature of Dryad is that the heaplet for recursive formulas is tactics, keeping the expressiveness of powerful logics, retaining the essentially determined by the syntax as opposed to the semantics. automated nature of proving validity, but giving up on complete- In classical separation logic, a formula of the form α ∗ β says that ness (i.e., giving up decidability, retaining soundness). The idea of the heaplet can be partitioned into any two disjoint heaplets, one natural proofs [27] is to identify a subclass of proofs N such that satisfying α and the other β.InDryad, the heaplet for a (complex) (a) a large class of valid verification conditions of real-world pro- formula is determined and hence if there is a way to partition the grams have a proof in N, and (b) searching for a proof in N is heaplet, there is precisely one way to do so. We have found that decidable. In fact, we would even like the search for a proof in N most uses of separation logic to express properties can be written to be efficiently decidable, possibly utilizing the automatic logic quite succinctly and easily using Dryad (in fact, it is easier to solvers (SMT solvers) that exist today. Natural proofs are hence a write such deterministic-heap specifications). The key advantage is fixed set of proof tactics whose application is itself expressible in that this eliminates implicit existential quantification the separation a decidable logic. The natural proofs developed in [27] were too operator provides. In a verification condition that combines the pre- restrictive, however, handling only single trees, with no scope for condition in the negative and the post-condition in the positive, handling multiple or more complex data-structures and their sepa- the classical semantics for separation logic invariably introduces ration (see section on Related Work for more details). universal quantification in the satisfiability query for the negation of The aim of this paper is to provide natural proofs for general the verification condition, which in turn is extremely hard to handle. properties of structure, data, and separation. Our contributions are: In Dryad, the semantics of a recursive definition r(x) (such as (a) we propose Dryad, a dialect of separation logic for heaps, with tree above), requires that the heaplet be determined and defined as no explicit (classical) quantification but with recursive definitions, the set of all locations reachable from the node x through a set of to express second-order properties; (b) show that Dryad is both pointer-fields f1,..., fk without passing through a set of locations powerful in terms of expressiveness, and that the strongest-post of (given by a set of location terms t1,... tn). While our logical mech- Dryad specifications with respect to bounded code segments can anisms can be extended beyond this notion (in deterministic ways), be formulated in Dryad, (c) show how Dryad has been designed so we have found that this captures most common properties required that it can be systematically converted to classical logic using the in proving data-structure manipulating programs correct. theory of sets, allowing us to connect the more natural and succinct specifications to more verbose but classical logic, and (d) develop Translating Dryad to classical logic with recursion: a natural proof mechanism for classical logics with recursion and The second key step in our paradigm is a technique to bridge the sets that implement sound but incomplete reductions to decidable gap from separation logic to classical logic in order to utilize ef- theories that can be handled by an SMT solver. ficient decision procedures supported by SMT solvers. We show that heaplet semantics and separation logic constructs of Dryad Dryad: A separation logic with determined heaplets can be effectively translated to classical logic where heaplets are The primary design principle behind separation logic is the decision modeled as sets of locations. We show that Dryad formulas can be to express strict specifications— logical formulas must naturally translated into classical logic with free set variables that capture the refer to heaplets (subparts of the heap), and, by default, the smallest heaplets corresponding to the strict semantics. This translation does heaplets over which the formula needs to refer to. This is in contrast not, of course, a decidable theory yet, as recursive definitions to classical logics (such as FOL) which implicitly refer to the entire are still present (the recursion-free formulas are in a decidable the- heap globally. Strict specifications permit elegant ways to capture ory). The carefully designed Dryad logic with determined heaplet semantics ensures that there is no quantification in the resulting 2 http://plv.csail.mit.edu/bedrock/Tutorial.html formula in classical logic. The heaplets of recursively defined prop- erties, which are defined using the set of all reachable nodes, are and simple programs that manipulate two lists or two trees cannot translated to recursively defined sets of locations. be reasoned with. Also, structures such as doubly-linked lists, trees ryad with parent pointers, etc. are out of scope of this work. Technically, Natural proofs for D : in our present work, we can handle user-defined structures express- ryad Finally, we develop a natural proof methodology for D by ible in separation logic, multiple structures and their separation, showing a natural proof mechanism for the equivalent formulas programs with while-loops, etc., because of our logical treatment in classical logic. The basic proof tactic that we follow is not just of separation logic using classical logic. dependent on the formula embodying the verification condition, but There is a rich literature on analysis of heaps in software. We also on the precise footprint touched by the program segment being omit discussing literature on general interactive theorem provers verified. We unfold recursive definitions precisely across footprints, (like Isabelle [31]) that require considerable manual guidance. translating them to the frontier of the footprint, and then use a form We also omit a lot of work on analyzing shape properties of the of formula abstraction that treats recursive formulas on frontier heap [6, 13, 18, 28, 41], as they do not handle complex functional nodes as uninterpreted functions. The resulting formula falls in a properties. logic over sets and integers, which is then decided using the theory There are several proof systems and assistants for separation of uninterpreted functions and arrays using SMT solvers. The key logic [32, 36] that incorporate proof heuristics and are incom- feature is that heaplets and separation logic constructs, which get plete. However, [3] gives a small decidable fragment of separation translated to recursively defined sets of locations, are unfolded logic on lists which has been further extended in [11] to include along with other user-defined recursive definitions and formula- a restricted form of arithmetic. Symbolic execution with separa- abstracted using this uniform natural proof strategy, tion logic has been used in [4, 5, 8] to prove structural specifica- While our proof strategy is roughly as above, there are many tions for various list and tree programs. These tools come hard- technical details that are complex. For example, the heaplets de- wired with a collection of axioms and their symbolic execution en- fined by pre/post conditions intrinsically specify the modified lo- gines check the entailment between two formulas modulo these ax- cations of the heap, which have to be considered when processing ioms. Verifast [20], on the other hand, chooses flexibility of writ- procedure calls in order to ensure which recursively defined metrics ing richer specifications over complete automation, but requires the on locations continue to hold after a procedure call. Also, the final user to provide inductive lemmas and proof tactics to aid verifica- decidable theories that we compile our conditions down to does tion. Similarly, Bedrock [15] is a Coq library that aims at mostly require a bit of quantification, but it turns out to be in the array automated (but not completely automated) procedures that requires property fragment which admits automatic decision procedures. some proof tactics to be given by the user to prove verification Implementation and Evaluation: conditions. The idea of using regions (sets of locations) for de- Our proof mechanisms are essentially a class of decidable proof scribing heaps in our work also extends to describing frames for tactics that result in sound but incomplete validation procedures. To function calls, and the use for the latter is similar to implicit dy- show that this class of natural proofs is effective in practice, we pro- namic frames [38] in the literature. The crucial difference in our vide a prototype implementation of our technique, which handles framework is that the implicit dynamic frames are syntactically de- a low-level programming language with pre-conditions and post- termined, and amenable to quantifier-free reasoning. A work that conditions written in Dryad. We show, using a large class of correct comes very close to ours is a paper by Chin et al. [14], where the au- programs manipulating lists, trees, cyclic lists, and doubly linked thors allow user-defined recursive predicates (similar to ours) and lists as well as multiple data-structures of these kinds, that the nat- build a terminating procedure that reduces the verification condi- ural proof mechanism succeeds in proving the verifications con- tion to standard logical theories. However, their procedure does not ditions automatically. These programs are drawn from a range of search for a proof in a well-defined simple and decidable class, un- sources, from textbook data-structure routines (binary search trees, like our natural proof mechanism; in fact, the resulting formulas are red-black trees, etc.) to routines from Glib low-level C-routines quantified and incompatible with decidable logics handled by SMT used in GTK+/Gnome to implement file-systems, from the Schorr- solvers. Waite garbage collection algorithm, to several programs from a re- In all of the above cited work and other manual and semi- cent secure framework developed for mobile applications [29]. Our automatic approaches to verification of heap-manipulating pro- work is by far the only one that we know of that can handle such a grams like [37], inductive definitions of algebraic data-types is ex- large class of programs, completely automatically. Our experience tremely common for capturing second-order data-structure proper- has been that the user-provided contracts and invariants are easily ties. Most of these approaches use proof tactics which unroll induc- expressible in Dryad, and the automatic natural proof mechanisms tive definitions and do extensive unification to try to match terms work extremely fast. In fact, contrary to our own expectations, we to find simple proofs. Our notion of natural proofs is very much in- also found that the tool is useful in debugging: in several cases, spired by such kinds of semi-automatic and manual heap reasoning when the annotations supplied were incorrect, the model provided that we have seen in the literature. by the SMT solver for the natural proof was useful in detecting There is also a variety of verification tools based on classi- errors and correcting the invariants/program. cal logics and SMT solvers. [23] and VCC [16] compile to Boogie [2] and generate VCs that are passed to SMT solvers. This approach requires significant ghost annotations, and annota- 2. Related Work tions that explicitly express and manipulate frames. The Jahob sys- tem [43, 44] is one of the first attempts at full functional verifica- The natural proof methodology was introduced in [27] (see also [39]), tion of linked data structures, which integrates a variety of theo- but was exclusively built for tree data-structures. In particular, this rem provers, including SMT solvers, and makes the process mostly work could only handle recursive programs, i.e., no while-loops, automated. However, complex specifications combining structure, and even for tree data-structures, imposed a large number of restric- data and separation usually require more complex provers such tions on pre/post conditions for methods— the input to a procedure as Mona [21], or even interactive theorem provers such as Is- had to be only a single tree, the method can only return a sin- abelle [31] in the worst case. The success of the proof search also gle tree, and even then must havoc the input tree given to it. The relies on users’ manual guidance. lack of handling of multiple structures means that even simple pro- grams like mergesort (that merges two lists), cannot be handled, void ∆ heapify(loc x) { assume x.left , nil For a location x, the recursive definition keys −→(x) returns the if 0 pf (x.left = nil) assume x.right , nil set of keys at the nodes of the tree rooted at x: if x is nil and the s := x.right; 0 lx := x.left0 heaplet is empty, then the empty-set; otherwise, the union of the else if (x.right = nil) rx := x.right0 key stored at x and the keys stored in the left and right subtrees of s := x.left; assume lx.key < rx.key ∆ else { 0 0 x. Similarly, the recursive definition mheap −→(x) states that x points s := x.right0 pf lx := x.left assume s , nil to a max-heap if: x is nil and the heaplet is empty; or x and the rx := x.right; assume s.key > x.key heaplets of the left and right subtrees of x are mutually disjoint (x if (lx.key < rx.key) 0 0 t := s.key0 points to a tree) and the key at x is greater than or equal to the keys s := x.right; s.key := x.key else 1 0 of the left and right subtrees of x. x.key2 := t The method heapify in Figure 1 is at the heart of the procedure s := x.left; heapify(s) } for deleting the root from a max-heap (removing the node with the ∆ def if mheap −→(x) = maximum priority). If the max-heap property is violated at a node x (s =/= nil) pf if (s.key > x.key) { x = nil ∧ emp while satisfied by its descendants, then heapify restores the max- t := s.key; key,left,right heap property at x. It does so by recursively descending into the ∨ x 7−→ (k, l, r) s.key := x.key; ∆ ∆ tree, swapping the key of the root with the key at its left or right x.key := t; ∗ (mheap −→(l) ∧{k} ≥ keys −→(l)) pf pf child, whichever is greater. The precondition ϕpre binds the free heapify(s); ∆ ∆ ∗ (mheap −→(r) ∧{k} ≥ keys −→(r)) variable K to the set of keys of x. The postcondition states that } pf pf ∆ def  after the procedure call, x satisfies the max-heap property and the } keys −→(x) = pf set of keys of x is unchanged (same as K). key,left,right x = nil ∧emp : ∅ ; ϕpre ≡ x 7−→ (k, l, r) key,left,right One of the main aspects of our approach is to reduce reasoning ∆ ∆ 7−→ ∗ ∗ mheap −→(l) ∗ mheap −→(r) x (k, l, r) true : about heaplet semantics and separation logic constructs to reason- pf pf ∆ ∆ ∆  keys −→(l) ∪{k} ∪ keys −→(r) ; ing about sets of locations. We use set operations like union, in- ∧ keys −→(x) = K pf pf pf ∆ ∆ default : ∅ tersection and membership to describe separation constraints on a ϕpost ≡ mheap −→(x) ∧ keys −→(x) = K pf pf  heaplet satisfying a formula. This translation from Dryad formu- las, like those in Figure 1, to formulas in classical logic with re- Figure 1. Motivating example: Heapify cursive predicates and functions is formally presented in Section 5. Intuitively, we associate a set of locations to each (spatial) atomic formula, which is the domain of the heaplet satisfying that formula. The idea of unfolding recursive definitions and formula abstrac- Dryad requires that this heaplet is syntactically determined for each tion also features in the work by Suter et al. [39, 40], where a pro- formula. For example, the heaplet associated to the formula x 7→ ... cedure for algebraic data-types is presented. However, this work ∆ is the singleton {x}; for recursive definitions like mheap −→(x) and focuses on soundness and completeness, and is not terminating for pf ∆ several complex data structures like red-black trees. Moreover, the keys −→(x), the domain of the heaplet is reach{left,right}(x), which in- pf work limits itself to functional program correctness; in our opinion, tuitively is the set of locations reachable from x using the pointer functional programs are very similar to algebraic inductive specifi- fields left and right, and can be defined recursively. cations, leading to much simpler proof procedures. As shown in Figure 1, ϕpre is a conjunction of two formulas. There is also a rich literature on completely automatic deci- If Gpre is the domain of the heaplet associated to ϕpre, then the sion procedures for restricted heap logics, some of which com- first conjunct requires Gpre to be the disjoint union of the sets {x}, bine structure-logic and arbitrary data-logic. These logics are usu- reach{left,right}(left(x)) and reach{left,right}(right(x)). The second con- ally FOLs with restricted quantifiers, and usually are decided us- junct requires Gpre = reach{left,right}(x). From these heaplet con- isbq ing SMT solvers. The logics L [22] and CSL [9, 10] offer such straints, we can translate ϕpre to the following formula in classical reasoning with restricted reachability predicates and quantification; logic over the global heap: see also the logics in [1, 7, 30, 33–35]. Strand is a relatively ex- pressive logic that can handle some data-structure properties (like Gpre = {x}∪ reach{left,right}(left(x)) ∪ reach{left,right}(right(x)) BSTs) and admits decidable fragments [25, 26], but is again not ∧ x < reach{left,right}(left(x)) ∧ x < reach{left,right}(right(x)) expressive enough for more complex properties of inductive data- ∧ reach{left,right}(left(x)) ∩ reach{left,right}(right(x)) = ∅∧ x , nil structures. None of these logics can express the class of VCs for ∧ mheap(left(x)) ∧ mheap(right(x)) full functional verification explored in this paper. ∧ Gpre = reach{left,right}(x) ∧ keys(x) = K

Similarly, we translate ϕpost to

3. Motivating Example Gpost = reach{left,right}(x) ∧ mheap(x) ∧ keys(x) = K In this section we give intuition into our verification approach Note that the recursive definitions mheap and keys without the through a motivating example. Recall that a max-heap is a binary “∆” superscript are in the classical logic (without the heaplet con- tree such that for each node n the key stored at n is greater than straint). Hence the recursive predicate mheap satisfies or equal to the keys stored at each of its children. Heaps are often mheap(x) ↔ x=nil∧reach{left,right}(x) = ∅ used to implement priority queues. In Figure 1, in the lower right ∨ x,nil∧x key0(x) ∧ t = key0(s) −→ ∧ key1 = key0{s ← key0(x)}∧ key2 = key1{x ← t} is parameterized by a set of pointer fields pf and a set of program ∆ variables ~v, denoted as f −→ . The subscripts are used in defining the Once we have expressed the verification condition in classical pf ,~v logic with recursive definitions over the global heap, we prove semantics of recursive functions in Section 4.2. We usually simply it using the natural proof methodology. We unfold the recursive use f ∆ when the subscripts are not relevant in the context. Similarly, ∆ ∆ definitions mheap(x), keys(x) and reach{ }(x) for x, lx and rx recursive predicates are denoted as p −→ or simply p . The recursive left,right pf ,~v (the footprint), thus evaluating them in terms of their values on the functions are defined using the syntax: frontier. The call to heapify preserves the recursive definitions on f f f f f locations reachable from lx, and modifies those on rx according to ϕ1 (x,~v, ~s): t1 (x, ~s); ... ; ϕk (x,~v, ~s): tk (x, ~s) ; default : tk+1(x, ~s) the pre/post condition. Finally, we abstract the recursive definitions f f  on the frontier with uninterpreted functions. We decide the resulted where ϕu (x,~v, ~s)/tu (x, ~s) is a formula/term in our logic with ~s im- plicitly existentially quantified. The recursively defined predicates formula (which is in a decidable logic) using an SMT solver. p Section 6 describes this process in detail. are defined using the syntax: ϕ (x,~v, ~s), which is a formula in our logic with ~s implicitly existentially quantified. The recursive func- ryad tion syntax above expresses a case-split, with the function evaluat- 4. TheLogicD ing to the first term whose guard evaluates to true. The restrictions In this section we present our logic Dryadsep; this redefines the on the recursive definitions are: logic Dryad [27] on arbitrary data-structures (not just trees), using • heaplet semantics and separation logic primitives; the logic hence is Subtraction, set-difference, and negation are disallowed; a quantifier-free heaplet logic augmented with recursively defined • Every variable in ~s should appear in the right hand side of a predicates/functions. However, for brevity, we will refer to the points-to relation binding it to x exactly once. new logic we propose as Dryad, and refer to the logic in [27] as For examples of recursive functions and predicates, see the defini- Dryadtree. ∆ ∆ tions keys −→(x) and mheap −→(x) in Figure 1, respectively. The set of pf pf 4.1 Syntax program variables ~v parameterizing the definitions is empty in both Let us fix a finite set of pointer-fields PF and a finite set of data- these definitions and the set of implicitly existentially-quantified fields DF. A record consists of a set of pointer-fields from PF and variables ~s is {k, l, r}. a set of data-fields from DF. Our logic also presumes that locations refer to entire records rather than particular fields, and that address 4.2 Semantics arithmetic is precluded. We will use the term locations hence to Our logic is interpreted on models that are program states: refer to these records. We assume that every field is defined at every location, i.e., all memory records have the same layout (to Definition 4.1. A program state is a tuple C = (R, s, h) where simplify the presentation); our logic can easily be extended with • R ⊆ Loc \{nil} is a finite set of locations; record types. • s : Vars → Int ∪ Loc is a store mapping program variables to Let Bool = {true, false} stand for the set of Boolean values, locations or integers (of appropriate type); Int stand for the set of integers and Loc stand for the universe of • h : R × (PF ∪ DF) → Int ∪ Loc is a heaplet mapping non- S locations. For any set A, let (A) denote the set of all finite subsets nil locations and each pointer-field/data-field to values of the MS of A, and let (A) denote the set of all finite multisets with appropriate type.  elements in A. The Dryad logic allows expressing quantifier-free first-order Note that the set of locations is, in general, larger than the state properties over heaps/heaplets augmented with recursively defined R and hence R defines a subset of heap locations. The store maps notions for a location to express second-order properties, denoted variables to locations (not necessarily in R), but the heaplet h gives as a function r : Loc → D. The codomain D can be IntL, S(Loc), interpretations for pointer and data-fields only for elements in R. S(Int), MS(Int)L or Bool, where IntL and MS(Int)L extend Int Given a heaplet h, for every pointer field pf, we denote the and MS(Int) to lattice domains, respectively, in order to give least projection of h on R × (PF\{pf}∪ DF) as h ∤ pf; similarly, for every fixed-point semantics (explained later in this section). Typical ex- data-field df, we denote the projection of h on R × (PF ∪ DF \{df}) amples of these recursive definitions include the definitions of the as h ∤ df. Also, for every subset S ⊆ R, we denote the projection of height of a tree or the height of black-nodes in the tree rooted at a h on S × (PF ∪ DF) as h | S . node (recursively defined integers), the set of nodes reachable from A term/formula with free variables F is interpreted by interpret- a location following certain pointer fields (recursively defined sets ing the free variables in F using the map s from variables to values. of locations), the set/multiset of keys stored at a particular data-field The semantics of Dryad is similar to that of classical Separation under nodes reachable from a location (recursively defined set/mul- Logic (SL). In particular, a term/formula without recursive defini- tiset of integers), and the property that the tree rooted at a node is tions is interpreted exactly in the same way in Dryad and SL. Hence a binary search tree or a balanced tree or just a tree (recursively we first give the semantics of the non-recursive part, followed by defined predicates). the semantics of recursive definitions. ∆ ∆ ∆ ∆ ∆ i : Loc → IntL sl : Loc → S(Loc) si : Loc → S(Int) msi : Loc → MS(Int)L p : Loc → Bool j ∈ IntL Variables L ∈ S(Loc) Variables S ∈ S(Int) Variables MS ∈ MS(Int)L Variables q ∈ Bool Variables x ∈ Loc Variables c : IntL Constant pf ∈ PF df ∈ DF Loc Term: lt ::= x | nil ∆ IntL Term: it ::= c | j | i −→ (lt) | it + it | it − it pf ,~v ∆ S(Loc) Term: slt ::= ∅l | L | {lt} | sl −→ (lt) | slt ∪ slt | slt ∩ slt | slt \ slt pf ,~v ∆ S(Int) Term: sit ::= ∅s | S | {it} | si (lt) | sit ∪ sit | sit ∩ sit | sit \ sit pf~ ,~v ∆ MS(Int)L Term: msit ::= ∅m | MS |{it}m | msi (lt) | msit ∪ msit | msit ∩ msit | msit\msit pf~ ,~v −→ −→ ∆ pf ,df Positive Formula: ϕ ::= true | false | q | p −→ (lt) | emp | lt 7−→ (~lt, ~it) | lt = lt | lt , lt | it ≤ it | it < it | sit ≤ sit | sit < sit | msit ≤ msit | msit < msit pf ,~v | slt ⊆ slt | slt * slt | sit ⊆ sit | sit * sit | msit ⊑ msit | msit @ msit | lt ∈ slt | lt < slt | it ∈ sit | it < sit | it ∈ msit | it < msit | ϕ ∧ ϕ | ϕ ∨ ϕ | ϕ ∗ ϕ Formula: ψ ::= ϕ | ψ ∧ ψ | ψ ∨ ψ |¬ψ

∆ def f f f f f Recursive function : f −→ (x) = ϕ (x,~v, ~s) : t (x, ~s) ; ... ; ϕ (x,~v, ~s) : t (x, ~s) ; default : t + (x, ~s) pf ,~v 1 1 k k k 1 ∆ def p  Recursive predicate : p −→ (x) = ϕ (x,~v, ~s) pf ,~v

Figure 2. Syntax of Dryad Before defining the semantics of formulas, we define the pure ~lt and it~, respectively. Formally, the semantics of this formula is property for terms/formulas. Intuitively, a term/formula is pure if it given as: / −→ −→ is independent of the heap. Syntactically, a term formula is pure if pf ,df it does not contain emp, 7−→ or any recursive definition. Note that (R, s, h) |= lt 7−→ (~lt, ~it) iff R = {~ltR,s,h} and in SL all terms are pure, but in Dryad, a term can be impure if it h(~ltR,s,h , pfi) = ~ltiR,s,h for corresponding pfi and lti, contains a recursive function f ∆. h(~ltR,s,h , dfi) = ~itiR,s,h for corresponding dfi and iti. Semantics of terms Note that, as in separation logic, the above has a strict semantics— Each T -term evaluates to either a normal value of type T , or to the heaplet must be a singleton set and cannot be a larger set. ∼ ′ undef, which is only used in interpreting recursive functions (will For binary relations t t between integers, sets, and multisets, including equality, the pure property plays an important role. Re- be explained later). As a special value, undef will be propagated member that in SL all terms are pure. To be consistent with SL, if throughout the formula: if a formula ϕ contains a sub-term that both t and t′ are pure, it is interpreted in the normal way. Otherwise, evaluates to undef, then ϕ will evaluate to false if it appears t ∼ t′ is only defined on the minimum heaplet required by t and t′, positively, and will evaluate to true otherwise. Intuitively, undef more concretely the union of the heaplet associated with t and t′. cannot help in making the formula true over a model. ′ ′ ′ (R, s, h) |= t ∼ t iff t or t is pure and ~tC ∼ ~t C The Loc terms are evaluated as follows: ′ or t and t are impure and there exist R1, R2 s.t. R = R1 ∪ R2 ~xC = s(x) ′ ′ ~nilC = nil and ~tC|R1 , undef, ~t C|R2 , undef and ~tC|R1 ∼ ~t C|R2

′ where ∼ is interpreted in the natural way. For any binary operator op, t op t is evaluated as follows: The semantics of the disjoint conjunction operator ∗ is defined as follows. The formula ϕ0 ∗ ϕ1 asserts that the heap can be split ′ ′ ~tC op ~t C if t or t is pure into two disjoint parts in which ϕ0 and ϕ1 hold respectively: ′ ~tC|R op ~t C|R else if there exist R1, R2 such that ′  1 2 (R, s, h) |= ϕ0 ∗ ϕ1 iff there exist R0, R1 s.t. R0 ∩ R1 = ∅ and ~t op t C =  R = R1 ∪ R2, ~tC|R1 , undef  ′  and ~t  , undef R0 ∪ R1 = R and (R0, s, h | R0) |= ϕ0 and (R1, s, h | R1) |= ϕ1  C|R2  undef otherwise  where op is interpreted in the natural way. Boolean combinations are defined in the standard way:

For singletons, {it} will evaluate to ∅ if it evaluates to −∞ or ∞: (R, s, h) |= ϕ0 ∧ ϕ1 iff (R, s, h) |= ϕ0 and (R, s, h) |= ϕ1 (R, s, h) |= ϕ0 ∨ ϕ1 iff (R, s, h) |= ϕ0 or (R, s, h) |= ϕ1 undef if ~itC = undef (R, s, h) |= ¬ϕ iff (R, s, h) 6|= ϕ ~{it}C =  ∅ if ~itC = −∞ or ∞   {~itC} otherwise  Semantics of recursive definitions  {it}m and {lt} evaluate similarly. The main semantical difference between Dryad and SL is on recur- Semantics of formulas sive definitions. We would like to deterministically delineate the The formula true is always interpreted to be true: heap domain for any recursive definition, so that the heap domain required by any Dryad formula can be syntactically determined. R, s, h |= true ( ) ∆ −→ Given a recursive definition rec −→ , the subscripts pf and ~v play a pf ,~v The formula emp asserts that the heap is empty: role in delineating the heap domain. Intuitively, the heap domain for ∆ rec −→ (l) is the set of locations reachable from l using pointer-fields (R, s, h) |= emp iff R = ∅ pf ,~v −→ −→ −→ pf ,df in pf , but without going through the locations ~v. In other words, we The formula lt 7−→ (~lt, it~) asserts that the heap contains exactly −→ −→ want to take the set of locations that lie in between l and~v. Precisely, one record consisting of fields pf and df , at address lt, with values this set is determined by a location l and a program state (R, s, h). We denote it as reachset −→ (l, (R, s, h)). Formally it is the smallest on the heaplet that includes the underlying tree nodes of the max- pf ,~v set of locations L satisfying the following two conditions: heap only, as the heaplet for a recursive definition is the set of all reachable nodes according to the two pointers. 1. l is in the set L if l is not in ~v and l , nil; To clarify the difference between Dryad and SL, consider now 2. for each c in L, with c ∈ R, and for each pointer pf, if h(c, pf) is this recursive definition: ∆ def l,r ∆ ∆ not in ~v and is not nil, then h(c, pf) is also in L. p{l,r}(x) = (x = nil ∧ emp) ∨ (x 7−→ y, z) ∗ p{l,r}(y) ∨ p{l,r}(z) h  i Note that even though the reach set is defined with respect to Now consider a global heap that has a tree rooted at x with pointer the edges in the heaplet, we can determine whether R includes all fields l and r. The above recursive formula, in separation logic, will nodes reachable from l without going through ~v in the global heap be true on any heaplet that contains the nodes of a path in this tree by checking whether R = reachset −→ (l, (R, s, h)). from x to nil. However, in Dryad, we require that the heaplet must pf ,~v ∆ satisfy the heap constraints of the formula and also be the precise For each recursive definition rec −→ , we usually simply denote pf ,~v set of locations reachable from x using the pointer fields l and r. rec reachset −→ as reachset , as the subscripts are implicitly known. pf ,~v Consequently, if the tree pointed to by x has more than one path, Now, given a program state C = (R, s, h) and a recursive the Dryad formula will be false for any heaplet. function/predicate rec∆, the semantics on a location l depends The above example shows the advantage of Dryad; when on whether the heap domain R is exactly the required reach set heaplets are determined, we can avoid quantification. We have not reachsetrec(l, (R, s, h)). If this is not true, we simply interpret it as found natural examples where an undetermined heaplet semantics undef or false. helps in specifying properties of heaps. If the heap domain matches the reach set (i.e., R = reachset −→ (l, ryad pf ,~v D can express structures beyond trees. The main restriction (R, s, h))), the semantics is defined in the natural way (using least we do impose is that we allow only unary recursive definitions, as fixed-points). The colon operator in the syntax of recursive function this allows us to find simpler natural proofs since there is only one ∆ f −→ translates into a nested if-then-else (ITE) operator. Formally, way to unfold the definition across a footprint. However, Dryad can pf ,~v express structures like cyclic lists and doubly-linked lists. ∆ f f f f f ∆ ~ f −→ (l)C = ITE ϕ , ~t C|R , ITE(ϕ , ~t C|R ,... ~t C|R ... ) A cyclic-list is captured as (v 7→ y) ∗ lseg (y). Here, v is pf ,~v 1 1 1 2 2 2 k+1 k+1 next,v  a program variable which denotes the head of the cyclic-list and f ∆ where R1 ... Rk+1 ⊆ R such that ~ti C|Ri , undef. In order to give lsegnext,v(y) captures the list segment from y back to the head v, least fixed-point semantics for recursive definitions in the logic, we where the subscripts next and v indicate that the heaplet of the list extend the primitive data-types to lattice domains. Bool with the segment is the locations that can be reached using the field next, but order false ⊑ true forms a complete lattice, and S(Loc) and without going through v: S def next (Int) ordered by inclusion, with join as union and meet as in- lseg∆ (y) = (y = v ∧ emp) ∨ (y 7→ z) ∗ lseg∆ (z) tersection, form complete lattices. Integers and multisets are ex- next,v next,v tended to lattices. Let (Int , ≤) denote the complete lattice, where  L Another interesting example is a doubly-linked list. We define a IntL = Int∪{−∞, ∞}, and where the ordering is ≤, join is max, meet doubly-linked list as the following unary predicate: is min. Also, MS(Int)L, ⊑ denote the complete lattice constructed ∆ next from MS(Int), where MS(Int)L = MS(Int) ∪ {⊤}, and ⊑ extends dllnext(x) = (x = nil ∧ emp) ∨ (x 7→ nil) ∨ next prev ∆ the inclusion relation with S ⊑ ⊤ for any M ∈ MS(Int). It is easy x 7→ y ∗ (y 7→ x ∗ true) ∧ dllnext(y) to see that (IntL, ⊑) and (MS(Int)L, ⊑) are complete lattices.  Formally, let Def consists of definitions of integer functions I, The first two disjuncts in the definition cover the base case when set-of-locations functions SL, set-of-integers functions SI, multiset- x is nil or the location next to x is nil; otherwise, let y be the location of-integers functions MSI and predicates P. Since these definitions next to x, then the prev pointer at y points to x and location y is could rely on each other, we evaluate them altogether as a function recursively defined as a doubly-linked list. vector −→ −→ −→ −−−→ −→ r∆ = (i∆ ,sl∆,si∆,msi∆,p∆) 5. Translation to a Logic over the Global Heap We take the cartesian product lattice of the individual lattices and We now show one of the main contributions of this paper— a trans- take the least fixed-point of r∆ to obtain the semantics for each lation from Dryad logic to classical logic with recursive predicates ∆ ∆ definition. Let selectrec(lfp(r )), for each recursive definition rec , and functions, but over the global heap. The formulation of separa- denote the selection of the coordinate for rec∆ in lfp(r∆). tion logic primitives in the global heap allows us to express com- Now we can formally define the semantics of recursive defini- plex structural properties, like disjointness of heaplets and tree- tions. For any configuration C, the semantics of a recursive function ness, using recursive definitions over sets of locations, which are ∆ f is defined as: defined locally, and are amenable to unfolding across the footprint ∆ f ∆ select f lfp(r ) (~ltC ) if R = reachset (~ltC ,C) and hence amenable to natural proofs. ~ f (lt)C = ∆ ∆ ( undef otherwise For example, consider the formula mheap (x) ∗ mheap (y),  ∆ ∆ where mheap is defined in Section 3. Since the heaplets for and the semantics of a recursive predicate p is defined as ∆ ∆ mheap (x) and mheap (y) are precise, it can get translated to an ∆ p ∆ selectp lfp(r ) (~ltC ) if R = reachset (~ltC ,C) equivalent formula with a free set variable G that denotes the global ~p (lt)C = ( false  otherwise heap over which the formula is evaluated: mheap mheap Remark: Note that we disallow negative operations (subtraction, mheap(x) ∧ mheap(y) ∧ (reach (x) ∩ reach (y) = ∅) set-difference and negation) in defining recursive definitions. This ∧ (reachmheap(x) ∪ reachmheap(y) = G) ∆ syntactical restriction guarantees that each iteration of r is - mheap ∆ where mheap and reach are corresponding recursive defini- tonic. By Knaster-Tarski theorem, r admits a least fixed-point. tions in classical logic, which will be defined later in this section. Note that we use italics and remove the ∆ superscript to show the Examples difference from their counterpart in Dryad. The Dryad logic was already used in Section 3 to define max- We assume the Dryad formula to be translated is in disjunctive heaps. Note that the definition of a max-heap is precisely defined normal form, i.e., ∨ operators should be above all ∗ and ∧ operators. Construct Domain-exact Scope T(var / const, G) ≡ var / const var/const false ∅ T({t} / {t}m, G) ≡ {t} / {t}m ′ ′ {t}/{t}m dom-ext(t) scope(t) T(t op t , G) ≡ T(t, G) op T(t , G) t op t′ dom-ext(t) ∨ dom-ext(t′) scope(t) ∪ scope(t′) T( f ∆(lt), G) ≡ ITE reach f (lt) = G, f (lt), undef ∆ f f (lt) true reachset (lt) T(true / false, G) ≡ true / false  true/false false ∅ T(emp, G) ≡ G = ∅ emp true ∅ pf~ ,df~ T(lt 7−→ (~lt, it~), G) ≡ G = {lt} ∧ pf T(lt, G) = T(lt , G) pf~ ,df~ pfi i i lt 7−→ (~lt, ~it) true {lt} ∧ df T(lt, G) = T(it , G) V dfi i  i ∆ p ∆ p p (lt) true reachset (lt) T(p (lt), G) ≡ p(lt) ∧ G = reachV (lt)  t ∼ t′ dom-ext(t) ∨ dom-ext(t′) scope(t) ∪ scope(t′) t ∼ t′ if t ∼ t′ is not domain-exact ′ ′ ′ T(t ∼ t′, G) ≡ ϕ ∧ ϕ dom-ext(ϕ) ∨ dom-ext(ϕ ) scope(ϕ) ∪ scope(ϕ ) ′ ′ ′ ′ ′  ∼ ∧ = ∼ ϕ ∗ ϕ dom-ext(ϕ) ∧ dom-ext(ϕ ) scope(ϕ) ∪ scope(ϕ )  t t G scope(t t ) otherwise T(ϕ ∧ ϕ′, G) ≡ T(ϕ,G) ∧ T(ϕ′,G)  T(ϕ ∨ ϕ′, G) ≡ T(ϕ,G) ∨ T(ϕ′,G) Figure 3. Domain-exact property and Scope function. Both are de- T(¬ϕ, G) ≡ ¬T(ϕ,G) fined only for terms and formulas without disjunction and negation. ′ ′ A formula is assumed in its disjunctive normal form. T ϕ, scope(ϕ) ∧ T ϕ , scope(ϕ ) ∧ scope(ϕ) ∪ scope(ϕ′) = G   ′  This is not a real restriction as one can always push the disjunction  ∧ scope(ϕ) ∩ scope(ϕ ) = ∅  ′ out. This normal form ensures that for all occurrences of the separa-  if both ϕ and ϕ are domain-exact  tion operator in a formula, there exists a unique way of splitting the  ′  T ϕ, scope(ϕ) ∧ T ϕ , G \ scope(ϕ) heap so as to satisfy the ∗ separated sub-formulas. Also, it ensures  ∧ scope(ϕ) ⊆ G if only ϕ is domain-exact ′    that this unique heap-split can be determined syntactically from the T(ϕ ∗ ϕ , G) ≡   T ϕ′, ϕ′ ∧ T ϕ, G \ ϕ′  scope( ) scope( ) structure of those sub-formulas.  ∧ scope(ϕ′) ⊆ G if only ϕ′ is domain-exact    In our translation, we model the heaplets associated with a for-   T ϕ, scope(ϕ) ∧ T ϕ′, scope(ϕ′) mula or a term as a set of locations and all operations on these   ∧ scope(ϕ) ∪ scope(ϕ′) ⊆ G heaplets are modeled as set operations like set union, set intersec-   ′   ∧ scope(ϕ) ∩ scope(ϕ ) = ∅ tions, etc. over set-of-location variables. For example the separating  ′  if neither ϕ nor ϕ is domain-exact conjunction P∗Q is translated to the following set constraint: the in-   tersection of the sets associated with the heaplets in the formulas P Figure 4. Translating Dryad terms/formulas to classical logic and Q is empty. Given a formula ϕ in Dryad and its associated heap domain modeled by a set variable G, we define an inductive trans- lation T into a classical logic formula T(ϕ, G) in the quantifier- heaplet is restricted to reachrec. Formally, suppose rec∆ is a recur- free theory of finite sets, integers and uninterpreted functions. The sive definition w.r.t. pointer fields pf~ and stopping locations ~v, then translated formula is not interpreted on a heaplet, but interpreted on reachrec is recursively defined as the least fixed-point of a global heap (i.e., with the heap domain Loc). def The translation uses an auxiliary domain-exact property and reachrec(x) = ITE x = nil ∨ x ∈ ~v, ∅, {x} ∪ reachrec(pf(x))   an auxiliary scope function. The domain-exact property indicates pf[∈pf~  whether a term evaluates to a well-defined value or a positive def formula evaluates to true on a fixed heap domain or not. This is For each recursive predicate p∆ defined as p∆(x) = ϕp(x,~v, ~s), we different from the property pure; a pure formula or term is not define domain-exact but the reverse implication is not true, in general. For def p(x) = T ϕp(x, ~v, ~s), reachp(x) example, the formula (lt 7−→ it) ∗ true is not domain-exact but is also not pure. The scope function maps a term to the minimum Similarly, for each recursive function f ∆ defined as heap domain required to evaluate it to a normal value, and maps a def ∆ = f f f f f positive formula to the minimum heap domain required to evaluate f (x) ϕ1 (x,~v, ~s) : t1 (x, ~s) ; ... ϕk (x,~v, ~s) : tk (x, ~s) ; default : tk+1(x, ~s) it to true. The domain-exact property and the scope function are we define  defined inductively in Figure 3. We describe the logic translation in detail in Figure 4. The ITE def f f f −∆ f (x) = ITE T ϕ1 (x,~v, ~s), reach (x) , t1 (x, ~s) expression used in the translation is short for ”if-then-else”. It is just  T ϕ f x,~v, ~s , reach f x , t f −∆ x, ~s a conditional expression defined as follows: ITE(φ, t1, t2) evaluates ITE 2 ( ) ( ) 2 ( )  to t1 if φ is true, otherwise evaluates to t2. f −∆ ..., tk+1 (x, ~s) ... In general, our translation restricts an impure term/formula to  be evaluated only on the syntactically determined heap domain f −∆ f where ti (x, ~s) is just the classical logic counterpart of ti (x, ~s), according to the semantics of Dryad. In particular, when evaluating f ∆ when interpreted in a heap domain within reach (x). Formally it is a recursive formula or predicate p , we ensure that the heaplet is short for p ∗ ′ precisely the reach set reach (lt). For a formula ϕ ϕ , translation f f f ′ f to classical logic depends on whether the sub-formulas ϕ and ϕ ITE scope(ti (x, ~s)) ⊆ reach (x), T ti (x, ~s), scope(ti (x, ~s)) , undef are domain-exact or not. If a sub-formula is domain-exact then it is    ∆ ryad evaluated on its scope. If it is not domain-exact, then it is evaluated Now for each set of recursive definitions Def in D , we can on the rest of the heaplet. translate it to a set of recursive definitions Def in classical logic. Recursive definitions in Dryad are also translated to recursive ∆ Theorem 5.1. Let ϕ be a Dryad formula w.r.t. a set of recursive definitions in classical logic. Translating a recursive definition rec ∆ uses the corresponding definitions rec and reachrec, both of which definitions Def . For every program state C with heap domain Loc, are defined recursively in classical logic. The set reachrec repre- and for every interpretation of variables I including a valuation sents the domain of the required heaplet for evaluating rec∆, and for set-variable G, (C, I) |= T(ϕ, G) w.r.t. Def if and only if ∆ ∆  the ∆-eliminated definition rec captures the value of rec when the (C |G, I\{G}) |= ϕ w.r.t. Def . P :− P ; P | stmt Hence we simply use v to denote sn(v). Moreover, every recursive stmt :− u := v | u := nil | u := v.pf | u.pf := v predicate/function is also indexed by i. For example, pi is the recur- ∆ p | j := u.df | u.df := j | j := aexpr sive predicate such that pi(l) is true iff Ci |= T(p (l), reachset (l)). | u := new | free u | assume bexpr Now for every formula ϕ and every index i, we can give the index | u := f (~v,~z) | j := g(~v,~z) i to all the pointer fields, data fields and recursive definitions. We aexpr :− int | j | aexpr + aexpr | aexpr − aexpr denote the indexed formula as ϕ[i]. − = | = nil | ≤ bexpr : u v u aexpr aexpr ψ | ¬bexpr | bexpr ∨ bexpr We algorithmically derive the verification condition VC corre- sponding to it in classical logic with recursive definitions on the Figure 5. Syntax of programs global heap (the algorithm is quite involved, and is presented in Appendix A in the supplemental material ). 6. Natural Proofs for ryad D Theorem 6.1. Given a Hoare-triple {ψpre} P {ψpost}, assume that In this section we show how Dryad can be used in reasoning about each procedure call in P satisfies its associated pre- and post- the correctness of imperative heap-manipulating programs, in terms conditions. Then the triple is valid if the formula ψVC derived above of verifying Hoare-triples where the pre- and post-conditions are is valid. Moreover, when P contains no procedure calls, the triple expressed in Dryad. We first introduce a simple programming lan- is valid iff ψVC is valid. guage and the corresponding Hoare-triples, generate a classical  logic formula as the verification condition, utilizing in part the Proof. Presented in Appendix B in the supplemental material. translation defined in Section 5. Then we present the natural proof 6.2 Unfolding Across the Footprint framework which consists of two steps. In the first step, we utilize the idea of unfolding across the footprint to strengthen the verifica- The verification condition obtained above is a quantifier-free for- tion condition. In the second step, we prove the validity of the VC mula involving recursive definitions and the reachable sets of the p soundly using the technique of formula abstraction. form reach (x), which are also defined recursively. While these re- cursive definitions can be unfolded ad infinitum, we exploit a proof 6.1 Programs and Hoare-triples tactic called unfolding across the footprint. Intuitively, the footprint We consider straight-line program segments that do destructive is the set of locations explored by the program explicitly (not in- pointer-updates, data-updates and procedure calls. Parameterized cluding procedure calls). More precisely, a location is in the foot- by a set of pointer fields PF and a set of data-fields DF, the syntax print if it is dereferenced explicitly in the program. The idea is to of the programs is defined in Figure 5, where pf ∈ PF, f ∈ DF, u unfold the recursive definitions over the footprint of the program, and v are program variables of type location, j and z are program so that recursive definitions on the footprint nodes are related, as variables of type integer, int is an integer constant. To simplify the precisely as possible, to the recursive definitions on frontier nodes. ff presentation, we assume all program variables are local and are This will enable e ective use of the formula abstraction mecha- either pre-assigned or assigned once in the program. nism, as when recursive definitions on frontier nodes are made un- We allow two kinds of recursive procedures, one returning a interpreted, the unfolding formulas ensure tight conditions that the location f (~v,~z) and one returning an integer g(~v,~z). Each proce- frontier nodes have to satisfy. ff dure/program is annotated with its pre- and post-conditions in Furthermore, to enable e ective frame reasoning, it is also nec- essary to strengthen the verification condition with a set of in- Dryad. The pre-condition is denoted as a formula ψpre(~v,~z,~c), where ~v and ~z are variables as the formal parameters/program vari- stances of the frame rule. More concretely, we need to capture the ables, ~c is a set of implicitly existentially quantified complimentary fact that a recursive definition (or a field) on a location is unchanged variables (e.g., variable K in the pre-condition ϕ in Figure 1). during a segment or procedure call of the program, if the reachable pre locations (or only the location itself) are not affected by the segment The post-condition is denoted as a formula ψpost(ret,~v,~z,~c), where ret is the variable representing the returned value, of corresponding or procedure call. type, ~v and ~z are program variables, ~c is a set of complimentary We incorporate the above facts formally into the verification condition. Let us introduce a macro function fp that identifies the variables that have appeared in the pre-condition ψpre. Given a straight-line program with its pre- and post-conditions location variables that are in (or aliased to something in) the foot- {ψ } P {ψ }, we define its partial correctness without considering print. The footprint of P, FP, is the set of dereferenced variables pre post in P (we call a location variable dereferenced if it appears on the memory errors3: P is partially correct iff for every normal execution left-hand side of a dereferencing operator “.” in P). Then fp(u) ≡ (memory-error free) of P, which transits state C to state C′, if ′ v∈FP(u = v). C |= ψpre, then C |= ψpost. W Now we state the unfoldings and framings using a formula Given a Hoare-triple {ψpre} P {ψpost} as defined above, a set of UnfoldAndFrame. Assume there are m procedure calls in P, then recursive definitions and a set of annotated procedure declarations P can be divided into m + 1 basic segments (subprograms without are presented here. Assume that P consists of n statements, then procedure calls): S 0 ; g1 ; S 1 ; ... ; gm ; S m where S d is the consider a normal execution E, which can be represented as a se- (d + 1)-th basic segment and gd is the d-th procedure call. Then quence of program states (C0,..., Cn), where each Ci = (Ri, si, hi) represents the program state after executing the first i statements. UnfoldAndFrame ≡  The verification condition is just a formula interpreted on a state se- ^rec 0≤^d≤m u∈LVars^∪{nil} quence (C ,..., C ). Let pf : Loc → Loc be the function mapping rec 0 n i fp(u) ∨ u=nil ⇒ Unfoldd (u) ∧ FieldUnchangedd(u) ∧ every location l to its pf pointer, i.e., pfi(l) = hi(l, pf) for every loca-       tion l. Similarly, df : Loc → Int is defined such that df (l) = h (l, df) rec i i i ¬fp(u) ∨ u=nil ⇒ RecUnchangedd (u) for every l. Recall that every program variable is either pre-assigned      The formula enumerates every recursive definition rec and every or assigned once in the program, each si is an expantion of the pre- index d, and for each location u that is either pointed to by a lo- vious one, and sn is the store with all program variables defined. cation variable or is nil, the formula checks if u is in the footprint, 3 We exclude memory errors in order to simplify the presentation. Memory and then unfolds it or frames it accordingly. If u is in the foot- errors can be handled using a similar VC generation for assertions that print, then we unfold rec for the timestamps before and after S d rec negate the conditions for memory errors to occur. (represented by the formula Unfoldd (u) ); moreover, all fields of Time (s) u are unchanged if it is not affected during calling gd (represented Data-structure Routine / Routine by the formula FieldUnchangedd(u) ). If u is not in the footprint, rec find rec, insert front, i.e., in the frontier, then rec and its corresponding reach set reach Singly- insert back rec, delete all rec, < 1s are unchanged after executing S , if S does not modify any loca- Linked List d d copy rec, append rec, reverse iter tion in reachrec; they are also unchanged if reachrec is not affected by calling g . These frame assertions are represented by the for- find rec, insert rec, merge rec, d delete all rec, insert sort rec, < 1s ec nchangedrec Sorted List mula R U d (u). All the sub-formulas mentioned above reverse iter, find last iter are formulated in Appendix C in the supplemental material. insert iter 1.4 Now we can strengthen the verification condition by incorporat- quick sort iter 64.8 ing the derived formula above: insert front, insert back rec, Doubly- ′ delete all rec, append rec, < 1s ψVC ≡ ψVC ∧ UnfoldAndFrame Linked List mid insert, mid delete, meld Since the incorporated formula is implied by the verification insert front, insert back rec, ′ Cyclic List < 1s condition, we can reduce the validity of ψVC to the validity of ψVC. delete front, delete back rec

Theorem 6.2. Given a Hoare-triple {ψpre} P {ψpost}, its verification Max-Heap heapify rec 8.8 condition ψ is valid if and only if ψ′ is valid.  find rec, find iter, insert rec, VC VC < 1s delete rec, remove root rec 6.3 Formula Abstraction insert iter 72.4 BST While checking the validity of the strengthened verification condi- find leftmost iter 4.7 remove root iter 65.6 ′ ffi tion ψVC is still undecidable, as we argued before, it is often su - delete iter 225.2 cient to prove it by assuming that the recursive definitions are arbi- find rec, delete rec < 1s trary, or uninterpreted. Moreover, the uninterpreted formula falls in Treap insert rec 12.7 the array property fragment [12], whose satisfiability is decidable remove root rec 9.5 and is supported by modern SMT solvers such as Z3 [17]. This tac- balance, leftmost rec < 1s tic roughly corresponds to applying unification in proof systems. insert rec 4.1 ′ AVL-Tree To prove ψVC, we first replace each recursive predicate recd with delete rec 13.9 an uninterpreted predicaterec ˆ d, and replacing the corresponding insert rec 73.9 rec ˆ rec reach-set function reachd with an uninterpreted function reachd . insert left fix rec 8.1 abs Let the result formula be ψVC. This conversion, called formula insert right fix rec 5.1 abs ′ RB-Tree delete rec 12.1 abstraction, is sound: if ψVC is valid, so is ψVC. When a proof for abs ′ delete left fix rec 7.6 ψVC is found, we call it a natural proof for ψVC (and also for ψVC). The formula abstraction step is the only step that introduces in- delete right fix rec 5.5 leftmost rec < 1s completeness in our framework, but helps us transform the verifi- cation condition to a decidable theory. Formula abstraction (com- Binomial find min rec 1.1 merge rec bined with unfolding recursive definitions across the footprint) dis- Heap 152.7 Schorr-Waite covers recursive proofs where the recursion is structural recursion marking iter < 1s on the definitions of the data-structures. The use of these tactics (for trees) inorder tree to list rec 2.4 comes from the observation that such programs often have such Tree inorder tree to list iter 42.7 recursive proofs (see [39] also for use of formula abstractions). Traversals preorder rec, postorder rec Our goal now is to check the satisfiability of ¬ψabs in a decidable < 1s VC inorder rec 3.76 theory. The resulting formula can be expressed using the theory of maps (to model sets) and corresponding map operations to model Figure 6. Results of verifying data-structure algorithms. (more set operations. Formulas of the kind S 1 ≤ S 2, where S 1 and S 2 details at http://www.cs.uiuc.edu/˜madhu/dryad/sl/ ) are sets/multi-sets of integers, are the only ones that introduce quantification, but they can be translated to formulas in the array iterative programs (as opposed to recursive ones), and we can fix property fragment, which is decidable [12]. We hence obtain a them for each class of data-structures. We also allow the use of the formula ψAPF in the array property fragment combined with the separating implication, −∗, from separation logic while specifying theory of uninterpreted functions, maps, and arithmetic (details are these axioms. User-defined axioms are instantiated, using the nat- in Appendix D in the supplementary material). ural proof philosophy, on precisely the footprint nodes uniformly, and get translated to quantifier-free formulas. Theorem 6.3. Given a Hoare-triple {ψpre} P {ψpost}, if the derived array formula ψAPF is satisfiable, then the Hoare-triple is valid.  7. Experimental Evaluation User-provided axioms: We have implemented a prototype of the natural proof methodol- While natural proofs are often effective in finding recursive proofs ogy for Dryad presented in this paper. The prototype verifier takes that unfold recursive definitions and do unification, they are not as input a set of user-defined recursive definitions, a set of pro- geared towards finding relationships between various recursive def- cedure declarations with contracts, and a set of straight-line pro- initions themselves. We hence require the user to provide certain grams (or basic blocks) annotated with a pre-condition and a post- obvious relationships between the different recursive definitions as condition specifying a set of partial correctness properties includ- axioms. For example, lseg(x,y) ∗ list(y) ⇒ list(x) is such an ax- ing structural, data and separation requirements. Both the contracts iom saying that a list segment concatenated with a list yields a and pre-/post-conditions are written in Dryad. For each basic block, list. Note that these axioms are not program-dependent, and hence the verifier automatically generates the abstracted formula ψAPF as are not program-specific tactics that the user provides. These ax- described in Section 6, and passes ψAPF to Z3 [17], a state-of-the- ioms are necessary typically to relate partial data-structure proper- art SMT solver, to check the satisfiability in the decidable theory of ties (like list segments) to complete ones (like lists), especially in array property fragment. The front-end of our verifier is based on Time (s) ANTLR and our tool is around 4000 lines of C# code. Using the Example Routine verifier, we successfully proved the partial correctness of 59 rou- / Routine tines over a large class of programs involving heap data structures free, prepend, concat, like sorted lists, doubly-linked lists, cyclic lists and trees. Addi- insert before, remove all, remove link, delete link, < 1s tionally, we pit our natural proofs methodology against real-world copy, reverse, nth, programs and successfully verified, in total, 47 routines from dif- nth data, find, position, glib/gslist.c ferent projects including the list and queue implementations in the index, last, length Singly Glib open source library, the OpenBSD library, the Linux kernel append 4.9 Linked-List and the memory regions and the page cache implementations from insert at pos 11.4 LOC: 1.1K two different operating systems. Experimental details are available remove 3.1 at http://www.cs.uiuc.edu/˜madhu/dryad/sl/ . insert sorted list 16.6 We conducted the experiments on a machine with a dual-core, merge sorted lists 6.1 2.4GHz CPU and 6GB RAM. The first part of our experimental merge sort 3.0 results is tabulated in Figure 6. In general, for every routine, we glib/glist.c free, prepend, reverse, Doubly nth, nth data, position, checked the properties formalizing the complete verification of the < 1s find, index, last, routines— capturing the precise structure of the resulting heap- Linked-List LOC: 0.3K length structure, the precise change to the data stored in the nodes and simpleq init, the precise heaplet modified by the respective routines. < 1s simpleq remove after ffi rec iter OpenBSD/queue.h For every routine, the su x or indicates if the routine simpleq insert head 1.6 Queue was implemented recursively or iteratively using while loops. The simpleq insert tail 3.6 LOC: 0.1K names for most of the routines are self-descriptive. Routines like simpleq insert after 18.3 find, insert, delete, append, etc. are the natural implementa- simpleq remove head 2.1 tions of the corresponding operations. The routine ExpressOS/cachePage.c lookup prev 2.4 delete all for singly-linked lists, sorted lists and doubly-linked LOC: 0.1K add cachepage 6.4 lists recursively deletes all occurrences of a particular key in the ExpressOS/ memory region init < 1s input list. The max-heap routine heapify accepts an almost max- memoryRegion.c create user space region 3.6 heap in which the heap property is violated only at the root, both LOC: 0.1K split memory region 5.8 of whose children are max-heaps, and recursively descends the tree find vma, remove vma, linux/mmap.c < 1s to restore the max-heap property. The routine remove root for bi- remove vma list LOC: 0.1K nary search trees and treaps is an auxiliary routine which is called insert vm struct 11.6 in delete. Similarly, the routines leftmost for AVL-trees and RB-trees and delete fix and insert fix for RB-trees are also Figure 7. Results of verifying open-source libraries. (more details auxiliary routines. at http://www.cs.uiuc.edu/˜madhu/dryad/sl/ ) Schorr-Waite is a well-known graph marking algorithm which marks all the reachable nodes of the graph using very little addi- In an OS kernel, a process address space consists of a set of tional space. The algorithm achieves this by manipulating the point- intervals of linear addresses represented as a memory region. In ers in the graph such that the stack of nodes along the path from the the ExpressOS implementation, a memory region is implemented root is encoded in the graph itself. The Schorr-Waite algorithm is as a sorted doubly-linked list where each node of the list with a used in garbage collectors and it is traditionally considered as a start and an end address represents an interval included in the ad- challenging problem for verification [19]. The routine marking is dress space. We also verified some key components of the Linux an implementation of Schorr-Waite for trees [24] and we check the implementation of a memory region, present in the file mmap.c. In property that the resulting output tree is well-marked. Linux, a memory region is represented as a red-black tree where The routines inorder tree to list construct a list consisting each node, again, represents an address interval. We proved meth- of the keys of the input tree, which is traversed inorder. The itera- ods which find, remove and insert a vma struct (vma is short for tive version of this algorithm achieves this by maintaining a work- virtual memory address) into a memory region. list/stack of sub-trees which remain to be processed at any given It also worth mentioning that in the process of experiments, time. The routines inorder, preorder and postorder number we did make some unintentional mistakes, in writing both the ba- the nodes of an input tree according to the inorder, preorder and sic blocks and the annotations. For example, forgetting to free the ∧ ∗ postorder traversal algorithm, respectively. deleted node, or using instead of in the specification between Figure 7 shows the results of applying natural proofs to the two disjoint heaplets, were common mistakes. In these cases, Z3 verification of various other real world programs and libraries. provided counter-examples to the verification condition that cap- Glib is the low-level C library that forms the basis of the GTK+ tured the essence of the bugs, and turned out to be very helpful for toolkit and the GNOME desktop environment, apart from other us to debug the specification. These debugging hints are usually not open source projects. Using our prototype verifier, we efficiently available in other incomplete proof systems. verified Glib implementation of various routines for manipulating Our experiments show that the natural proof methodology set ffi singly-linked and doubly-linked lists. We also verified the queue forth in this paper is successful in e ciently proving full-functional library which forms part of the OpenBSD . correctness of a large variety of algorithms. Most of the VCs gen- ExpressOS is an operating-system/browser implementation erated for the above examples were discharged by Z3 in a few sec- which provides security guarantees to the user via formal verifica- onds. To the best of our knowledge, this is the first automatic mech- tion [29]. The module cachePage maintains a cache of the recently anism that can prove such a wide variety of algorithms correct, han- used disc pages. The cache is implemented as a priority queue dling such complex properties of structure, data and separation. based on a sorted list. We prove that the methods add cachepage and lookup prev (both called whenever a disc page is accessed) Acknowledgments maintain the sortedness property of the cache page. This work is partially funded by NSF CAREER award #0747041, NSF CCF #1018182, and NSA contract H98230-10-C-0294. References [22] S. Lahiri and S. Qadeer. Back to the future: revisiting precise program POPL’08 [1] I. Balaban, A. Pnueli, and L. D. Zuck. Shape analysis of single-parent verification using SMT solvers. In , pages 171–182. ACM, heaps. In VMCAI’07, volume 4349 of LNCS, pages 91–105. Springer, 2008. 2007. [23] K. R. M. Leino. Dafny: An automatic program verifier for functional correctness. In LPAR-16, volume 6355 of LNCS, pages 348–370. [2] M. Barnett, B.-Y. E. Chang, R. DeLine, B. Jacobs, and K. R. M. Leino. Springer, 2010. Boogie: A modular reusable verifier for object-oriented programs. In FMCO’05, volume 4111 of LNCS, pages 364–387. Springer, 2005. [24] A. Loginov, T. W. Reps, and M. Sagiv. Automated verification of the Deutsch-Schorr-Waite tree-traversal algorithm. In SAS’06, volume [3] J. Berdine, C. Calcagno, and P. W. O’Hearn. A decidable fragment 4134 of LNCS, pages 261–279. Springer, 2006. of separation logic. In FSTTCS’04, volume 3328 of LNCS, pages 97– 109. Springer, 2004. [25] P. Madhusudan and X. Qiu. Efficient decision procedures for heaps using STRAND. In SAS’11, volume 6887 of LNCS, pages 43–59. [4] J. Berdine, C. Calcagno, and P. W. O’Hearn. Symbolic execution with Springer, 2011. separation logic. In APLAS’05, volume 3780 of LNCS, pages 52–68. [26] P. Madhusudan, G. Parlato, and X. Qiu. Decidable logics combining Springer, 2005. heap structures and data. In POPL’11, pages 611–622. ACM, 2011. [5] J. Berdine, C. Calcagno, and P. W. O’Hearn. Smallfoot: Modular [27] P. Madhusudan, X. Qiu, and A. Stefanescu. Recursive proofs for automatic assertion checking with separation logic. In FMCO’05, inductive tree data-structures. In POPL’12, pages 123–136. ACM, volume 4111 of LNCS, pages 115–137. Springer, 2005. 2012. [6] J. Berdine, C. Calcagno, B. Cook, D. Distefano, P. W. O’Hearn, [28] S. Magill, M.-H. Tsai, P. Lee, and Y.-K. Tsay. THOR: A tool for T. Wies, and H. Yang. Shape analysis for composite data structures. reasoning about shape and arithmetic. In CAV’08, volume 5123 of In CAV’07, volume 4590 of LNCS, pages 178–192. Springer, 2007. LNCS, pages 428–432. Springer, 2008. [7] N. Bjørner and J. Hendrix. Linear functional fixed-points. In CAV’09, [29] H. Mai, E. Pek, H. Xue, S. T. King, and P. Madhusudan. Verifying volume 5643 of LNCS, pages 124–139. Springer, 2009. security invariants in ExpressOS. In ASPLOS’13, pages 293–304. [8] M. Botinˇcan, M. Parkinson, and W. Schulte. Separation logic verifica- ACM, 2013. tion of C programs with an SMT solver. ENTCS, 254:5 – 23, 2009. [30] G. Nelson. Verifying reachability invariants of linked structures. In POPL’83, pages 38–47. ACM, 1983. [9] A. Bouajjani, C. Dr˘agoi, C. Enea, and M. Sighireanu. A logic- based framework for reasoning about composite data structures. In [31] T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL - A Proof CONCUR’09, volume 5710 of LNCS, pages 178–195. Springer, 2009. Assistant for Higher-Order Logic, volume 2283 of Lecture Notes in Computer Science. Springer, 2002. [10] A. Bouajjani, C. Dr˘agoi, C. Enea, and M. Sighireanu. On inter- procedural analysis of programs with lists and data. In PLDI’11, pages [32] P. W. O’Hearn, J. C. Reynolds, and H. Yang. Local reasoning about 578–589. ACM, 2011. programs that alter data structures. In CSL’01, volume 2142 of LNCS, pages 1–19. Springer, 2001. [11] M. Bozga, R. Iosif, and S. Perarnau. Quantitative separation logic and programs with lists. In IJCAR’08, volume 5195 of LNCS, pages [33] Z. Rakamari´c, J. D. Bingham, and A. J. Hu. An inference-rule-based decision procedure for verification of heap-manipulating programs 34–49. Springer, 2008. with mutable data and cyclic data structures. In VMCAI’07, volume [12] A. R. Bradley, Z. Manna, and H. B. Sipma. What’s decidable about ar- 4349 of LNCS, pages 106–121. Springer, 2007. rays? In VMCAI’06, volume 3855 of LNCS, pages 427–442. Springer, [34] Z. Rakamari´c, R. Bruttomesso, A. J. Hu, and A. Cimatti. Verifying 2006. heap-manipulating programs in an SMT framework. In ATVA’07, [13] B.-Y. E. Chang and X. Rival. Relational inductive shape analysis. In volume 4762 of LNCS, pages 237–252. Springer, 2007. POPL’08, pages 247–260. ACM, 2008. [35] S. Ranise and C. Zarba. A theory of singly-linked lists and its ex- [14] W.-N. Chin, C. David, H. H. Nguyen, and S. Qin. Automated veri- tensible decision procedure. In SEFM’06, pages 206–215. IEEE-CS, fication of shape, size and bag properties via user-defined predicates 2006. in separation logic. Science of , 77(9):1006 – [36] J. Reynolds. Separation logic: a logic for shared mutable data struc- 1036, 2012. tures. In LICS’02, pages 55–74. IEEE-CS, 2002. [15] A. Chlipala. Mostly-automated verification of low-level programs in [37] G. Rosu, C. Ellison, and W. Schulte. Matching logic: An alternative computational separation logic. In PLDI’11, pages 234–245. ACM, to Hoare/Floyd logic. In AMAST’10, volume 6486 of LNCS, pages 2011. 142–162. Springer, 2010. [16] E. Cohen, M. Dahlweid, M. A. Hillebrand, D. Leinenbach, M. Moskal, [38] J. Smans, B. Jacobs, and F. Piessens. Implicit dynamic frames. ACM T. Santen, W. Schulte, and S. Tobies. VCC: A practical system for Trans. Program. Lang. Syst., 34(1):2:1–2:58, 2012. verifying concurrent C. In TPHOLs’09, volume 5674 of LNCS, pages [39] P. Suter, M. Dotta, and V. Kuncak. Decision procedures for algebraic 23–42. Springer, 2009. data types with abstractions. In POPL’10, pages 199–210. ACM, [17] L. M. de Moura and N. Bjørner. Z3: An efficient SMT solver. In 2010. TACAS’08, volume 4963 of LNCS, pages 337–340. Springer, 2008. [40] P. Suter, A. S. K¨oksal, and V. Kuncak. Satisfiability modulo recur- [18] D. Distefano, P. W. O’Hearn, and H. Yang. A local shape analysis sive programs. In SAS’11, volume 6887 of LNCS, pages 298–315. based on separation logic. In TACAS’06, volume 3920 of LNCS, pages Springer, 2011. 287–302. Springer, 2006. [41] H. Yang, O. Lee, J. Berdine, C. Calcagno, B. Cook, D. Distefano, and P. W. O’Hearn. Scalable shape analysis for systems code. In CAV’08, [19] T. Hubert and C. March´e. A case study of C source code verification: volume 5123 of LNCS, pages 385–398. Springer, 2008. the Schorr-Waite algorithm. In SEFM’05, pages 190–199. IEEE-CS, 2005. [42] J. Yang and C. Hawblitzel. Safe to the last instruction: automated verification of a type-safe operating system. In PLDI’10, pages 99– [20] B. Jacobs, J. Smans, P. Philippaerts, F. Vogels, W. Penninckx, and 110. ACM, 2010. F. Piessens. Verifast: A powerful, sound, predictable, fast verifier for C and Java. In NFM’11, volume 6617 of LNCS, pages 41–55. Springer, [43] K. Zee, V. Kuncak, and M. C. Rinard. Full functional verification of 2011. linked data structures. In PLDI’08, pages 349–361. ACM, 2008. [21] N. Klarlund and A. Møller. MONA. BRICS, Department of Com- [44] K. Zee, V. Kuncak, and M. C. Rinard. An integrated proof language PLDI’09 puter Science, Aarhus University, January 2001. Available from for imperative programs. In , pages 338–351. ACM, 2009. http://www.brics.dk/mona/.