Craig Interpolation and Proof Manipulation
Total Page:16
File Type:pdf, Size:1020Kb
Craig Interpolation and Proof Manipulation Theory and Applications to Model Checking Doctoral Dissertation submitted to the Faculty of Informatics of the Università della Svizzera Italiana in partial fulfillment of the requirements for the degree of Doctor of Philosophy presented by Simone Fulvio Rollini under the supervision of Prof. Natasha Sharygina February 2014 Dissertation Committee Prof. Rolf Krause Università della Svizzera Italiana, Switzerland Prof. Evanthia Papadopoulou Università della Svizzera Italiana, Switzerland Prof. Roberto Sebastiani Università di Trento, Italy Prof. Ofer Strichman Technion, Israel Dissertation accepted on 06 February 2014 Prof. Natasha Sharygina Prof. Igor Pivkin Prof. Stefan Wolf i I certify that except where due acknowledgement has been given, the work pre- sented in this thesis is that of the author alone; the work has not been submitted previously, in whole or in part, to qualify for any other academic award; and the content of the thesis is the result of work which has been carried out since the offi- cial commencement date of the approved research program. ii Abstract Model checking is one of the most appreciated methods for automated formal veri- fication of software and hardware systems. The main challenge in model checking, i.e. scalability to complex systems of extremely large size, has been successfully ad- dressed by means of symbolic techniques, which rely on an efficient representation and manipulation of the systems based on first order logic. Symbolic model checking has been considerably enhanced with the introduction in the last years of Craig interpolation as a means of overapproximation. Interpolants can be efficiently computed from proofs of unsatisfiability based on their structure. A number of algorithms are available to generate different interpolants from the same proof; additional interpolants can be obtained by turning a proof into another proof of unsatisfiability of the same formula using transformation techniques. Interpolants are thus not unique, and it is fundamental to understand which inter- polants have the highest quality, resulting in an optimal verification performance; it is an open problem to determine what features determine the quality of interpolants, and how they are connected with the effectiveness in verification. The goal of this thesis is to identify aspects that make interpolants good, and to develop new techniques that guide the generation of interpolants in order to improve the performance of model checking approaches. We contribute to the state-of-the-art by providing a characterization of quality in terms of semantic and syntactic features, such as logical strength, size, presence of quantifiers. We present a family of algorithms that generalize existing techniques and are able to produce interpolants of different structure and strength, with and without quantifiers, from the same proof. These algorithms are examined in the context of various model checking applications, where collections of interdependent interpolants satisfying particular properties are needed; we formally analyze the re- lationships among the properties and derive necessary and sufficient conditions on the applicability of the algorithms. We introduce a framework for proof manipulation; it includes a method to over- come limitations in existing interpolation algorithms for first order theories, as well as a set of compression algorithms, which, reducing the size of proofs, consequently iii iv reduce the size of interpolants generated from them. Finally, we provide experimental evidence that size and logical strength have a significant impact on verification: small interpolants improve the verification per- formance, while stronger or weaker interpolants are beneficial in different model checking applications. Interpolants are generated from proofs, as produced by logic solvers. As a sec- ondary line of research, we developed and successfully tested a hybrid propositional satisfiability solver built on a model-based stochastic technique known as cross- entropy method. Contents Contents iv 1 Introduction 1 1.1 Symbolic Model Checking . .3 1.2 SAT and SMT Solving . .4 1.2.1 Stochastic Techniques for Satisfiability . .5 1.3 Interpolation-Based Model Checking . .6 1.3.1 Interpolants Quality: Strength and Structure . .7 1.3.2 Interpolation Properties in Model Checking . .8 1.4 Resolution Proof Manipulation . 10 1.4.1 Proof Compression . 10 1.4.2 Enabling Interpolation in SMT . 11 1.5 Thesis Outline . 12 2 Hybrid Propositional Satisfiability 15 2.1 Satisfiability and SAT Solving . 16 2.1.1 DPLL and SLS . 16 2.2 A Cross-Entropy Based Approach to Satisfiability . 21 2.2.1 Cross-Entropy for Optimization . 22 2.2.2 The CROiSSANT Approach . 23 2.2.3 Experimental Results . 30 2.2.4 Related Work . 32 2.2.5 Summary and Future Developments . 33 3 Craig Interpolation in Model Checking 35 3.1 Interpolation in Symbolic Model Checking . 36 3.1.1 Interpolation, BMC and IC3 . 37 3.2 Interpolants Quality . 40 3.3 Generation of Interpolants . 41 v vi Contents 3.3.1 Craig Interpolation . 41 3.3.2 Interpolation Systems . 44 3.3.3 Interpolation in Arbitrary First Order Theories . 48 3.3.4 Interpolation in SAT . 51 3.3.5 Interpolation in SMT . 56 3.4 Theory Labeled Interpolation Systems . 58 3.4.1 Interpolant Strength in SMT . 61 3.4.2 Interpolation in Difference Logic . 63 3.4.3 Summary and Future Developments . 65 3.5 A Parametric Interpolation Framework for First Order Theories . 66 3.5.1 A Parametric Interpolation Framework . 67 3.5.2 Interpolation in First Order Systems . 71 3.5.3 Interpolation in the Hyper-Resolution System . 76 3.5.4 Related Work . 83 3.5.5 Summary and Future Developments . 83 3.6 Impact of Interpolant Strength and Structure . 85 3.6.1 PeRIPLO . 86 3.6.2 Function Summaries in Bounded Model Checking . 87 3.6.3 Experimental Evaluation . 89 3.6.4 Summary and Future Developments . 92 4 Interpolation Properties in Model Checking 95 4.1 Interpolation Systems . 97 4.1.1 Collectives . 97 4.2 Collectives of Interpolation Systems . 99 4.2.1 Collectives of Single Systems . 99 4.2.2 Collectives of Families of Systems . 101 4.3 Collectives of Labeled Interpolation Systems . 107 4.3.1 Collectives of Families of LISs . 108 4.3.2 Collectives of Single LISs . 124 4.4 Collectives of Theory Labeled Interpolation Systems . 126 4.4.1 Collectives of Families and of Single -LISs . 127 4.5 Summary and Future Developments . .T . 130 5 Proof Manipulation 133 5.1 Resolution Proofs . 134 5.2 The Local Transformation Framework . 136 5.2.1 Extension to Resolution Proof DAGs . 137 5.2.2 Soundness of the Local Transformation Framework . 139 vii Contents 5.2.3 A Transformation Meta-Algorithm . 143 5.3 Proof Compression . 144 5.3.1 Proof Redundancies . 146 5.3.2 Proof Regularity . 146 5.3.3 Proof Compactness . 154 5.3.4 Experiments on SMT Benchmarks . 160 5.3.5 Experiments on SAT Benchmarks . 163 5.4 Proof Transformation for Interpolation . 166 5.4.1 Pivot Reordering Algorithms . 167 5.4.2 SMT Solving and AB-Mixed Predicates . 169 5.4.3 Experiments on SMT Benchmarks . 173 5.4.4 Pivot Reordering for Propositional Interpolation . 174 5.5 Heuristics for the Proof Transformation Algorithms . 176 5.6 Related Work . 178 5.7 Summary and Future Developments . 180 6 Conclusions 183 Bibliography 187 viii Contents Chapter 1 Introduction In the last three decades there has been a considerable investment in the develop- ment of techniques and tools aimed at making the formal verification of hardware and software fully automated. Among various approaches, model checking is highly appreciated for its exhaustiveness and degree of automation: once a model (of hard- ware or software) and a specification of a property (desired behavior) are given, a tool can be run to explore all possible behaviors of the model, determining whether the property holds. The main difficulty in model checking is that for complex systems the size of the model often becomes too large to be handled in a reasonable amount of time. A remarkable improvement can be obtained by means of symbolic techniques: the model is encoded together with the negation of the property in a logic formula, satisfiable if and only if the specification is not met by the model. If this is the case, a logic solver can produce a satisfying assignment, representing a violating behavior; otherwise, a proof of unsatisfiability can be built as a witness of the validity of the property in the model. Symbolic model checking has been considerably enhanced with the introduction of Craig interpolation, as a means to perform a guided overapproximation of sets of states of the model. Interpolants can be efficiently computed from proofs of unsatisfiability; various algorithms are available to generate different interpolants from the same proof. Moreover, a given proof can be transformed into another proof of unsatisfiability of the same formula by means of proof manipulation techniques, consequently allowing additional interpolants. Interpolants are thus not unique, and it is fundamental to understand which in- terpolants result in an optimal verification performance: it is an open problem to determine what features determine the quality of interpolants, and how they are connected with the effectiveness in the verification. 1 2 This thesis focuses on the notion of quality of interpolants: the goal is to identify aspects that make interpolants good, and to develop new techniques that guide the generation of interpolants in order to improve the performance of model checking approaches. We contribute to the state-of-the-art in a number of ways. First, we provide a characterization of quality from both a semantic and a syntactic point of view, taking into account features as logical strength, size of formulae, and presence of quantifiers. Second, we present a family of algorithms that generalize existing techniques and are able to produce interpolants of different structure and strength, with and without quantifiers, from the same proof; they apply to arbitrary inference systems, in the context of both propositional logic and first order theories.