Natural Proofs for Structure, Data, and Separation

Natural Proofs for Structure, Data, and Separation XiaokangQiu PranavGarg AndreiStefanescu P.Madhusudan University of Illinois at Urbana-Champaign, USA {qiu2, garg11, stefane1, madhu}@illinois.edu Abstract fields), data (the integers stored in data-fields of the tree respect the We propose natural proofs for reasoning with programs that ma- binary search tree property, or the data stored in a tree is a max- nipulate data-structures against complex specifications— specifica- heap), and separation (the procedure modifies one list and not the tions that describe the structure of the heap, the data stored within other and leaves the two lists disjoint at exit, etc.). The fact that the it, and separation and framing of sub-structures. Natural proofs are dynamic heap contains an unbounded number of locations means a subclass of proofs that are amenable to completely automated that expressing the above properties requires quantification in some reasoning, that provide sound but incomplete procedures, and that form, which immediately precludes the use of most SMT decidable capture common reasoning tactics in program verification. We de- theories (there are only a few of them known that can handle quan- trand velop a dialect of separation logic over heaps, called Dryad, with tification; e.g., the array property fragment [12] and the S recursive definitions that avoids explicit quantification. We develop logic [24, 25]). Consequently, expressing such properties naturally ways to reason with heaplets using classical logic over the theory and succinctly in a logical formalism has been challenging, and of sets, and develop natural proofs for reasoning using proof tactics reasoning with them automatically even more so. involving disciplined unfoldings and formula abstractions. Natural For instance, in the Boogie line of tools (including VCC) of proofs are encoded into decidable theories of first-order logic so as writing specifications using first-order logic and employing SMT to be discharged using SMT solvers. solvers to validate verification conditions, the specification of invariants of even simple methods like singly-linked-list insert is te- We also implement the technique and show that a large class of 1 more than 100 correct programs that manipulate data-structures are dious. In such code , second-order properties (reachability, acyclic- amenable to full functional correctness using the proposed natural ity, separation, etc.) are smuggled in using carefully chosen ghost proof method. These programs are drawn from a variety of sources variables; for example, acyclicity of a list is encoded by assign- including standard data-structures, the Schorr-Waite algorithm for ing a ghost number (idx) to each node in the list, with the property garbage collection, a large number of low-level C routines from the that the numbers associated with adjacent nodes strictly increase Glib library, the OpenBSD library and the Linux kernel, and rou- going down the list. These ghost variables require careful manip- tines from a secure verified OS-browser project. Our work is the ulation when the structures are updated; for example, inserting a first that we know of that can handle such a wide range of full func- node may require updating the ghost numbers for other nodes in tional verification properties of heaps automatically, given pre/post the list, in order to maintain the acyclicity property. Once such a and loop invariant annotations. We believe that this work paves the ghost-encoding of the specification is formulated, the validation of way for the deductive verification technology to be used by pro- verification conditions, which typically have quantifiers, are dealt grammers who do not (and need not) understand the internals of with using sound heuristics (a wide variety of them including e- the underlying logic solvers, significantly increasing their applica- matching, model-based quantifier instantiation, etc. are available), bility in building reliable systems. but are still often not enough and have to be augmented by instantiation triggers from the verification engineer to help the proof go through. 1. Introduction In recent years, separation logic, especially in combination In recent years, the automated deductive verification paradigm for with recursive definitions, has emerged as a much more succinct software verification that combines user written modular contracts and natural logic to express properties about structure and separa- and loop invariants with automated theorem proving of the result- tion [31, 35]. However, the validation of verification conditions re- ing verification conditions has become very powerful. The latter sulting from separation logic invariants are also complex, and has process is often executed by automated logical decision procedures eluded automatic reasoning and exploitation of SMT solvers (even supported by SMT solvers, which have emerged as robust and pow- more so than tools such as Boogie that use classical logic). Again, erful engines to automatically find proofs. Several techniques and help from the user in proving the verification conditions are cur- erifast edrock tools have been developed [2, 16, 20] and there have been several rently necessary— the tools V [20] and B [15], for success stories of large software verification projects using this ap- instance, admit separation logic specifications but require the user to write low-level lemmas and proof tactics to guide the verifica- proach (the Verve OS project[41], the Microsoft hypervisor veri- 2 fication project using VCC [16], and a recent verified-for-security tion. For example, in verifying an in-place reversal of a linked list , edrock OS+browser for mobile applications [28] to name a few). B would require several lemmas and a hint package be sup- Verification conditions do not, however, always fall into decid- plied at the level of the code in order for the proof to go through. able theories. In particular, the verification of properties of the dy- The work in this paper is motivated by the opinion that en- namically modified heap is a big challenge for logical methods. The tirely decidable logics are too restrictive, in general, to support dynamically manipulated heap poses several challenges, as typical correctness properties of heaps require complex combinations of 1 http://vcc.codeplex.com/SourceControl/changeset/view/dcaa4d0ee8c2#vcc structure (e.g., p points to a tree structure, or to a doubly-linked /Docs/Tutorial/c/7.2.list.c list, or to an almost balanced tree, with respect to certain pointer- 2 http://plv.csail.mit.edu/bedrock/Tutorial.html 1 2012/11/14 the verification of complex specifications of functional correctness to express structural restrictions; for example. tree-ness can be ex- for heap manipulating programs, and the other extreme of user- pressed simply as: supplied proof tactics and lemmas is too tedious, requiring of the user too much knowledge of the underlying proof systems/decision tree(x)::(x = nil ∧ emp) ∨ (x 7−→ (l, r) ∗ tree(l) ∗ tree(r)) procedures. Our aim is to build completely automatic, sound, but We first define a new logic, Dryad, that permits no explicit incomplete proof techniques that can solve a large class of proper- quantification, but permits powerful recursive definitions in order ties involving complex data-structures. to define integers, sets/multisets of integers, and sets of locations, The natural proof methodology: using least fixed-points. The logic Dryad furthermore has a heaplet Our proof method, called the natural proof method, is inspired by semantics and allows the spatial conjunction operator ∗. However, work on natural proofs proposed by Madhusudan et al. [26]; natural a key design feature of Dryad is that the heaplet for recursive proofs exploit a fixed set of proof tactics, keeping the expressive- formulas is essentially determined by the syntax as opposed to ness of powerful logics, retaining the automated nature of proving the semantics. In classical separation logic, a formula of the form validity, but giving up on completeness (i.e., gives up decidability, α ∗ β says that the heaplet can be partitioned into any two disjoint retaining soundness). The idea of natural proofs [26] is to identify heaplets, one satisfying α and the other β. In Dryad, the heaplet a subclass of proofs N such that (a) a large class of valid verifi- for a (complex) formula is determined and hence if there is a way cation conditions of real-world programs have a proof in N, and to partition the heaplet, there is precisely one way to do so. We (b) searching for a proof in N is decidable. In fact, we would even have found that most uses of separation logic to express properties like the search for a proof in N to be efficiently decidable, possibly can be written quite succinctly and easily using Dryad (in fact, it utilizing the automatic logic solvers (SMT solvers) that exist today. is easier to write such deterministic-heap specifications). The key Natural proofs are hence a fixed set of proof tactics whose appli- advantage is that this eliminates implicit existential quantification cation is itself expressible in a decidable logic. The natural proofs the separation operator provides. In a verification condition that developed in [26] were too restrictive, however, handling only sin- combines the pre-condition in the negative and the post-condition gle trees, with no scope for handling multiple or more complex in the positive, the classical semantics for ∗ invariably introduces data-structures and their separation (see Related Work for more de- universal

Load more