Verified Programs with Binders Martin Clochard, Claude Marché, Andrei Paskevich

Total Page:16

File Type:pdf, Size:1020Kb

Load more

Verified Programs with Binders Martin Clochard, Claude Marché, Andrei Paskevich To cite this version: Martin Clochard, Claude Marché, Andrei Paskevich. Verified Programs with Binders. Programming Languages meets Program Verification, Jan 2014, San Diego, United States. hal-00913431 HAL Id: hal-00913431 https://hal.inria.fr/hal-00913431 Submitted on 3 Dec 2013 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Verified Programs with Binders ∗ Martin Clochard1,2,3 Claude Marche´2,3 Andrei Paskevich3,2 1ENS Paris, 2Inria Saclay – Ile-de-France,ˆ 3LRI (CNRS & Universite´ Paris-Sud), France Abstract for programming datatypes with binders in a systematic way, so as Programs that treat datatypes with binders, such as theorem provers to avoid classical traps such as variable capture (see [24, 25] for or higher-order compilers, are regularly used for mission-critical an overview). On the other hand, there are also several approaches purposes, and must be both reliable and performant. Formally prov- proposed for reasoning about datatypes with binders (e.g. de Bruijn ing such programs using as much automation as possible is highly indices, locally nameless, nominal or nested representations), typ- desirable. In this paper, we propose a generic approach to handle ically using highly expressive logical frameworks such as those datatypes with binders both in the program and its specification implemented in interactive proof assistants, as exemplified by the in a way that facilitates automated reasoning about such datatypes POPLmark challenge [3]. However, there has been significantly and also leads to a reasonably efficient code. Our method is im- less work considering simultaneously the issues for developing and plemented in the Why3 environment for program verification. We formally proving programs with binders. validate it on the examples of a lambda-interpreter with several re- Our aim is thus to bridge a gap between two seemingly opposite duction strategies and a simple tableaux-based theorem prover. objectives: reasoning easily about datatypes with binders (hence the representation should be simple) and implementing them with Categories and Subject Descriptors F.3.1 [Theory of Computa- a reasonably efficient structure (hence the representation should be tion]: Logics and Meanings of Programs—Specifying and Verify- clever). Instead of looking for a single representation of binders ing and Reasoning about Programs that would be fit for both tasks, we introduce two different repre- Keywords Formal Verification, Binders, Verified Symbolic Com- sentations: one on the logic side to perform reasoning, one on the putations, Automated Theorem Proving implementation side to perform efficient computations. The imple- mentation representation is interpreted to the logic one, assigning to every program object a logical model, which is used in program 1. Introduction specifications. The idea is that almost all logical reasoning is done This work is about developing programs involving datatypes with using the model, without losing efficiency on the program side. binders, in a safe manner. Such datatypes appear when one wants Our method is implemented using the Why3 environment for to represent symbolic expressions, such as logic formulas, alge- program verification [12] that generates proof obligations dis- braic expressions, or abstract syntax trees of programs. As binders charged by external automated and interactive provers. This en- are the natural way to model quantifiers and anonymous function vironment is presented in Section 2. Why3 allows extraction of expressions, they are widely used to formalize logic and program- programs to OCaml code, thereby allowing efficient execution of ming languages. the proved code. In order to deal with datatypes with binders in a In this context, we propose an approach aiming at both produc- generic way, we designed and implemented a general scheme of ing reasonably efficient programs manipulating binders, and for- definitions of such types, together with a procedure for translat- mally verifying their correctness, using as much automation as pos- ing them into Why3 code defining types, operations, lemmas about sible. Typical examples of such programs include theorem provers, them, and also hints for proving those lemmas. This method is de- higher-order language interpreters and compilers, and computer al- tailed in Section 4. gebra systems. Some of these tools are used to produce mission- Our approach is illustrated on several examples: the terms critical code, requiring a high level of trust. Formally proving such of lambda-calculus, the formulas of first-order logic, the terms programs is thus desirable, and handling datatypes with binders is of system F<: (from the POPLmark challenge). For each ex- a major challenge in this task. ample, all properties and programs that are automatically gener- Dealing with binders is an active research area since a long time. ated are formally proved correct with quite limited human assis- On the one hand, there are several generic approaches and tools tance after generation. We experimented with the generated repre- sentations on two case studies: a verified interpreter for lambda- ∗ Work partly supported by the Bware project of the French national re- calculus using several reduction strategies (Section 5), and certi- search organization (ANR-12-INSE-0010, http://bware.lri.fr/) fied sound tableaux-based theorem prover (Section 6). The source files for our developments are available at http://www.lri.fr/ Permission to make digital or hard copies of all or part of this work for personal or ~clochard/. The obtained results validate the fact that the imple- classroom use is granted without fee provided that copies are not made or distributed mentation of datatypes with binders, as generated by our tool, is for profit or commercial advantage and that copies bear this notice and the full citation competitive with hand-written ones. Notice that in order to handle on the first page. Copyrights for components of this work owned by others than the these examples, our approach has to treat the problem of substitu- author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission tion under binders, making it possible to implement, for example, and/or a fee. Request permissions from [email protected]. an innermost reduction strategy or a skolemization procedure. PLPV ’14, January 21, 2014, San Diego, CA, USA. Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM 978-1-4503-2567-7/14/01. $15.00. http://dx.doi.org/10.1145/2541568.2541571 2. Preliminaries representing every variable by an identifier, causes a lot of trou- 2.1 The Why3 Environment ble when dealing with operations like substitution, as undesirable captures can arise during a naive substitution. Getting it right typi- Why3 is a platform for deductive program verification, providing a cally involves reasoning modulo α-equivalence by renaming bound rich language for specification and programming, called WhyML. variables into fresh ones, which is inefficient and error-prone. Also, It relies on external provers, both automated and interactive, in reasoning modulo α-equivalence requires to state and prove a great order to discharge the auxiliary lemmas and verification conditions. number of invariance results. WhyML is used as an intermediate language for verification of C, Considering a datatype with binders, five operations stand out Java or Ada programs, and is also intended to be comfortable as a as basic blocks, for which there is a general scheme independent of primary programming language. the actual datatype. The specification component of WhyML [6], used to write • program annotations and background theories, is an extension of Construction, in order to build the values of the datatype. It first-order logic. It features ML-style polymorphic types, algebraic includes the important case of variable binding. datatypes, inductive and co-inductive predicates, recursive defini- • Decomposition, usually via pattern-matching. As the inverse of tions over algebraic types. Constructions like pattern matching, let- construction, the case of variable unbinding must be considered. binding and conditionals can be used directly inside formulas and • Equality test—usually modulo α-equivalence. terms. A type, function, or predicate can be either defined or de- clared abstract and axiomatized. The specification part of the lan- • Substitution, in a way that avoids capture. guage can serve as a common format for theorem proving prob- • Testing whether a variable occurs free in a term, or collecting lems, suitable for multiple provers. The Why3 tool generates proof the set of free variables. obligations from lemmas and goals, then dispatches them to mul- tiple provers, including Alt-Ergo, CVC4, CVC3, Z3, E, SPASS, So, from the logic side, a good representation of such a datatype Vampire, Coq, and PVS. As most of the provers do not support is one where those operations are easy to reason about. On the some of the language features, typically pattern matching, polymor- implementation side, efficiency comes first. phic types, or recursion, Why3 applies a series of encoding trans- Let us consider several possible representations of binding, us- formations to eliminate unsupported constructions before dispatch- ing the pure lambda-calculus and the term λx.(λy.xya)xa as an ing a proof obligation. Other transformations can also be imposed example. by the user in order to simplify the proof search. Named representation The programming part of WhyML [12] is a dialect of ML with a number of restrictions to make automated proving viable.
Recommended publications
  • A Mechanised Proof of G\" Odel's Incompleteness Theorems Using Nominal Isabelle

    A Mechanised Proof of G\" Odel's Incompleteness Theorems Using Nominal Isabelle

    A Mechanised Proof of G¨odel’s Incompleteness Theorems using Nominal Isabelle Lawrence C. Paulson Abstract An Isabelle/HOL formalisation of G¨odel’s two incompleteness theorems is presented. The work follows Swierczkowski’s´ detailed proof of the theorems using hereditarily finite (HF) set theory [32]. Avoiding the usual arithmetical encodings of syntax eliminates the necessity to formalise elementary number theory within an embedded logical calculus. The Isabelle formalisation uses two separate treatments of variable binding: the nominal package [34] is shown to scale to a development of this complexity, while de Bruijn indices [3] turn out to be ideal for coding syntax. Critical details of the Isabelle proof are described, in particular gaps and errors found in the literature. 1 Introduction This paper describes mechanised proofs of G¨odel’s incompleteness theorems [8], including the first mechanised proof of the second incompleteness theorem. Very informally, these results can be stated as follows: Theorem 1 (First Incompleteness Theorem) If L is a consistent theory capable of formalising a sufficient amount of elementary mathematics, then there is a sentence δ such that neither δ nor ¬δ is a theorem of L, and moreover, δ is true.1 Theorem 2 (Second Incompleteness Theorem) If L is as above and Con(L) is a sentence stating that L is consistent, then Con(L) is not a theorem of L. Both of these will be presented formally below. Let us start to examine what these theorems actually assert. They concern a consistent formal system, say L, based on first-order logic with some additional axioms: G¨odel chose Peano arith- metic (PA) [7], but hereditarily finite (HF) set theory is an alternative [32], used here.
  • Verified Programs with Binders Martin Clochard, Claude Marché, Andrei Paskevich

    Verified Programs with Binders Martin Clochard, Claude Marché, Andrei Paskevich

    Verified Programs with Binders Martin Clochard, Claude Marché, Andrei Paskevich To cite this version: Martin Clochard, Claude Marché, Andrei Paskevich. Verified Programs with Binders. Programming Languages meets Program Verification, Jan 2014, San Diego, United States. hal-00913431 HAL Id: hal-00913431 https://hal.inria.fr/hal-00913431 Submitted on 3 Dec 2013 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Verified Programs with Binders ∗ Martin Clochard1,2,3 Claude Marche´2,3 Andrei Paskevich3,2 1ENS Paris, 2Inria Saclay – Ile-de-France,ˆ 3LRI (CNRS & Universite´ Paris-Sud), France Abstract for programming datatypes with binders in a systematic way, so as Programs that treat datatypes with binders, such as theorem provers to avoid classical traps such as variable capture (see [24, 25] for or higher-order compilers, are regularly used for mission-critical an overview). On the other hand, there are also several approaches purposes, and must be both reliable and performant. Formally prov- proposed for reasoning about datatypes with binders (e.g. de Bruijn ing such programs using as much automation as possible is highly indices, locally nameless, nominal or nested representations), typ- desirable.
  • Canonical Hybridlf: Extending Hybrid with Dependent Types

    Canonical Hybridlf: Extending Hybrid with Dependent Types

    LSFA 2015 Canonical HybridLF: Extending Hybrid with Dependent Types Roy L. Crole ([email protected]) & Amy Furniss ([email protected]) Dept of Computer Science, University of Leicester, University Road, Leicester, LE1 7RH, U.K. Abstract We introduce Canonical HybridLF (CHLF), a metalogic for proving properties of deductive systems, im- plemented in Isabelle HOL. CHLF is closely related to two other metalogics. The first is the Edinburgh Logical Framework (LF) by Harper, Honsell and Plotkin. The second is the Hybrid system developed by Ambler, Crole and Momigliano which provides a Higher-Order Abstract Syntax (HOAS) based on un-typed lambda calculus. Historically there are two problems with HOAS: its incompatibility with inductive types and the presence of exotic terms. Hybrid provides a partial solution to these problems whereby HOAS functions that include bound variables in the metalogic are automatically converted to a machine-friendly de Bruijn representation hidden from the user. The key innovation of CHLF is the replacement of the un-typed lambda calculus with a dependently-typed lambda calculus in the style of LF. CHLF allows signatures containing constants representing the judgements and syntax of an object logic, together with proofs of metatheorems about its judgements, to be entered using a HOAS interface. Proofs that metatheorems defined in the signature are valid are created using the M2 metalogic of Schurmann and Pfenning. We make a number of advances over existing versions of Hybrid: we now have the utility of dependent types; the unitary bound variable capability of Hybrid is now potentially finitary; a type system performs the role of Hybrid well-formedness predicates; and the old method of indicating errors using special elements of core datatypes is replaced with a more streamlined one that uses the Isabelle option type.
  • The Representational Adequacy of Hybrid

    The Representational Adequacy of Hybrid

    Math. Struct. in Comp. Science: page 1 of 62. c Cambridge University Press 2011 doi:10.1017/S0960129511000041 The representational adequacy of Hybrid R. L. CROLE Department of Computer Science, University of Leicester, University Road, Leicester, LE1 7RH, United Kingdom Email: [email protected] Received 11 August 2010; revised 14 December 2010 The Hybrid system (Ambler et al. 2002b), implemented within Isabelle/HOL, allows object logics to be represented using higher order abstract syntax (HOAS), and reasoned about using tactical theorem proving in general, and principles of (co)induction in particular. The form of HOAS provided by Hybrid is essentially a lambda calculus with constants. Of fundamental interest is the form of the lambda abstractions provided by Hybrid.The user has the convenience of writing lambda abstractions using names for the binding variables. However, each abstraction is actually a definition of a de Bruijn expression, and Hybrid can unwind the user’s abstractions (written with names) to machine friendly de Bruijn expressions (without names). In this sense the formal system contains a hybrid of named and nameless bound variable notation. In this paper, we present a formal theory in a logical framework, which can be viewed as a model of core Hybrid, and state and prove that the model is representationally adequate for HOAS. In particular, it is the canonical translation function from λ-expressions to Hybrid that witnesses adequacy. We also prove two results that characterise how Hybrid represents certain classes of λ-expression. We provide the first detailed proof to be published that proper locally nameless de Bruijn expressions and α-equivalence classes of λ-expressions are in bijective correspondence.