Mechanizing Confluence

dissertation

by

Julian Nagele submitted to the Faculty of Mathematics, Computer Science and Physics of the University of Innsbruck

in partial fulfillment of the requirements for the degree of Doctor of Philosophy

advisor: Univ.-Prof. Dr. Aart Middeldorp

Innsbruck, October 2017

Mechanizing Confluence: Automated and Certified Analysis of First- and Higher-Order Rewrite Systems

dissertation

Julian Nagele [email protected]

October 2017

advisor: Univ.-Prof. Dr. Aart Middeldorp

Eidesstattliche Erklärung (Statutory Declaration)

I hereby declare in lieu of oath, by my own signature, that I have written the present thesis independently and have used no sources or aids other than those indicated. All passages taken verbatim or in content from the cited sources are marked as such. The present thesis has not previously been submitted in the same or a similar form as a Magister/Master/Diploma thesis or dissertation.

Date    Signature

Abstract

This thesis is devoted to the mechanized confluence analysis of rewrite systems. Rewrite systems consist of directed equations and computation is performed by successively replacing instances of left-hand sides of equations by the corresponding instance of the right-hand side. Confluence is a fundamental property of rewrite systems, which ensures that different computation paths produce the same result. Since rewriting is Turing-complete, confluence is undecidable in general. Nevertheless, techniques have been developed that can be used to determine confluence for many rewrite systems and several automatic confluence provers are under active development. Our goal is to improve three aspects of automatic confluence analysis, namely (a) reliability, (b) power, and (c) versatility. The importance of these aspects is witnessed by the annual international confluence competition, where the leading automated tools analyze confluence of rewrite systems. To improve the reliability of automatic confluence analysis, we formalize confluence criteria for rewriting in the proof assistant Isabelle/HOL, resulting in a verified, executable checker for confluence proofs. To enhance the power of confluence tools, we present a remarkably simple technique, based on the addition and removal of redundant equations, that strengthens existing techniques. To make automatic confluence analysis more versatile, we develop a higher-order confluence tool, making automatic confluence analysis applicable to systems with functional and bound variables.

Acknowledgments

At the beginning of this thesis I want to express my gratitude to the many people who have influenced my work and made it an exciting and, on the whole, enjoyable journey. First of all, I would like to thank my advisor Aart Middeldorp. His advice, support, and encouragement were vital. I sincerely appreciate his guidance in both research and other aspects of life, such as traveling Japan.

I am grateful to all members of the Computational Logic group, both past and present, for taking me in and creating such a pleasant environment. Thank you for everything you have taught me. I explicitly mention Bertram Felgenhauer, for countless insights and occasional Albernheiten, Cezary Kaliszyk for motivating enthusiasm, Vincent van Oostrom, for collaboration and chitchat on etymology, Christian Sternagel, for energizing discussions and Isabelle and TEX support, Thomas Sternagel, for being my PhD and coffee companion, René Thiemann, for introducing me to interactive theorem proving in the first place, and Sarah Winkler for internship encouragement and advice. Special thanks go to Harald Zankl, for taking me under his wing and supporting me through some of the most important steps of the way.

I had the pleasure of collaborating with many amazing people. I am indebted to them all, in particular my fellow confluence competitors: the problem submitters, the tool authors, and especially my CoCo co-organizers, Takahito Aoto, Nao Hirokawa, and Naoki Nishida. I would also like to say thank you to Nuno Lopes, who was my mentor during my internship at Microsoft Research, for offering me a glimpse into the fascinating world of compilers and challenging me to apply my knowledge in an unfamiliar domain.

Finally, I wish to thank my friends and family, especially my parents Barbara and Joachim, for their constant support and unwavering belief that has enabled me to pursue this work, and of course Maria, for all the experiences and things we share. This work has been supported by the Austrian Science Fund through the FWF projects P22467 and P27528.


Contents

1 Introduction
  1.1 Rewriting and Confluence
  1.2 Formalization and Certification
  1.3 Overview

2 Rewriting
  2.1 Preliminary Notions
  2.2 Abstract Rewrite Systems
  2.3 Term Rewriting
  2.4 Lambda Calculus
  2.5 Higher-Order Rewriting

3 Closing Critical Pairs
  3.1 Critical Pair Lemma
  3.2 Strongly Closed Critical Pairs
  3.3 Parallel Closed Critical Pairs
  3.4 Almost Parallel Closed Critical Pairs
  3.5 Critical Pair Closing Systems
  3.6 Certificates
  3.7 Summary

4 Rule Labeling
  4.1 Preliminaries
  4.2 Formalized Confluence Results
    4.2.1 Local Peaks
    4.2.2 Local Decreasingness
  4.3 Checkable Confluence Proofs
    4.3.1 Linear Term Rewrite Systems
    4.3.2 Left-linear Term Rewrite Systems
    4.3.3 Certificates
  4.4 Assessment
  4.5 Summary

5 Redundant Rules
  5.1 Theory
  5.2 Formalization and Certification
  5.3 Heuristics
  5.4 Summary

6 Confluence of the Lambda Calculus
  6.1 Nominal Lambda Terms
  6.2 The Z Property
  6.3 The Triangle Property
  6.4 Assessment

7 Confluence of Higher-Order Rewriting
  7.1 Higher-Order Critical Pairs
  7.2 Termination
  7.3 Orthogonality
  7.4 Modularity
  7.5 Redundant Rules
  7.6 Summary

8 CSI and CSIˆho
  8.1 Usage
  8.2 Implementation Details
  8.3 Experimental Results
    8.3.1 Term Rewrite Systems
    8.3.2 Higher-Order Rewrite Systems

9 Conclusion
  9.1 Related Work
  9.2 Future Work

Bibliography

Index

Chapter 1 Introduction

Science is knowledge which we understand so well that we can teach it to a computer. Donald E. Knuth

In safety-critical systems, like medical devices and spaceflight, ensuring correctness of computer programs is imperative. Notorious failures, like the malfunctioning Therac-25 medical accelerator, the Pentium floating point division bug, or the crash of the Ariane 5 rocket, show that modern applications have long reached a level of complexity where human judgment and testing are not sufficient to guarantee faultless operation. Instead formal methods can be used to avoid subtle errors and provide a high degree of reliability. Formal methods apply mathematics and logic to software; formal verification employs these methods for proving correctness of programs. We argue that in computer science formality is not optional. Formal methods merely make that explicit, after all, programming languages are formal languages and every program is a formula. Recent years have seen tremendous success in formally verified software, such as the seL4 microkernel and the CompCert C compiler. Alas, formal verification is still very labor-intensive. It is our belief that in order for formal verification to be widely adopted, a high degree of automation, where large parts of the verification process can be performed at the push of a button, is indispensable. Consequently, we need tools that check properties of critical parts of programs automatically. However, these tools are programs themselves, which makes them vulnerable to the very same problems. That is, they will contain errors. To ensure correctness of the verification process itself, a common approach is to use so-called computational proof assistants, which are computer programs designed for checking mathematical inferences in a highly trustworthy fashion.


1.1 Rewriting and Confluence

Considering the myriad of existing programming languages and their complexity, in order to facilitate mathematical reasoning and to ensure that results are applicable to all concrete implementations, we would like to perform correctness proofs on a more abstract level, using a mathematical model of computation. Rewriting is such a model of computation. It is the process of transforming objects in a step-wise fashion, replacing equals by equals. Typically the objects are expressions in some formal language, and the transformations are given by a set of equations. Applied in a directed way, they describe rewrite steps, performed by replacing instances of left-hand sides of equations by the corresponding instance of the right-hand side. If no further replacement is possible, we have reached the result of our computation. Consider the equations

0 + y = y
s(x) + y = s(x + y)
0 × y = 0
s(x) × y = y + (x × y)

Applying them from left to right, they are rewrite rules for computing addition and multiplication on natural numbers represented as Peano numbers, i.e., using a zero constant and a successor function. For instance we can compute 1 × 2 as follows:

s(0) × s(s(0)) = s(s(0)) + (0 × s(s(0)))
               = s(s(0)) + 0
               = s(s(0) + 0)
               = s(s(0 + 0))
               = s(s(0))

In each step we instantiated an equation by replacing the variables x and y, and after finding the left-hand side of that equation in a subexpression, we replaced it by the corresponding right-hand side. Specifying programs by such sets of directed equations, called rewrite systems, is the essence of the functional programming paradigm. Indeed rewriting as sketched above is Turing-complete and consequently, all basic properties like termination, i.e., the question whether all computations finish in a finite amount of time, are undecidable.
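The same equations can be read directly as a functional program. The following Haskell sketch is an illustration we add here (it is not part of the thesis): the four rewrite rules become defining equations, and evaluation replays the computation of 1 × 2 above.

data Nat = Zero | Succ Nat deriving Show

add :: Nat -> Nat -> Nat
add Zero     y = y                -- 0 + y = y
add (Succ x) y = Succ (add x y)   -- s(x) + y = s(x + y)

mul :: Nat -> Nat -> Nat
mul Zero     _ = Zero             -- 0 × y = 0
mul (Succ x) y = add y (mul x y)  -- s(x) × y = y + (x × y)

-- mul (Succ Zero) (Succ (Succ Zero)) evaluates to Succ (Succ Zero),
-- mirroring the rewrite sequence in the text.

Here pattern matching plays the role of finding an instance of a left-hand side, and the evaluation order of the language fixes one particular rewriting strategy.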


Returning to our example equations for arithmetic we find that, given some expression, there are often multiple possibilities to apply an equation, yielding different results. For instance, in the second step of the derivation above, we replaced the subexpression 0 × s(s(0)) by 0 using the second equation, but we could also have rewritten the whole expression using the third equation, which would have resulted in s(s(0) + 0 × s(s(0))). So how can we be sure that our equations, given the same input, will always yield the same result? Often steps even overlap, that is, they act on some common part of the expression that is rewritten. A set of rules where such diverging computation paths can always be joined back together is called confluent.¹

Confluence is a fundamental notion that appears in several areas of computer science. Besides guaranteeing partial correctness, i.e., uniqueness of results, when implementing functions by means of rewrite rules, confluence was used by Church and Rosser to establish consistency of the λ-calculus and later by Knuth and Bendix as the basis for the method of completion for deciding equational theories [54]. Generalizing confluence to multiple sets of equations yields the related notion of commutation, which can be used to study correctness of program transformations [47, 110].

Using rewriting as a model for functional programming, or in applications like theorem proving and program transformations, requires the possibility to not only specify functions, but to make them the objects of the rewriting process itself. Consider for example higher-order functions in functional programming languages, as in the following Haskell program that applies a function to each element of a list.

map f [] = []
map f (h:t) = f h : map f t

Inspecting the second equation, one finds that f occurs as a variable on the left-hand side, but is also applied to h as a function on the right-hand side. Similarly, bound variables are a common feature in many programming languages and logics. For example the following equation might be used to express a quantifier equivalence in predicate logic:

¬∀x.P (x) = ∃x.¬P (x)

Also in mathematics many equations contain bound variables, for instance when integrals or derivatives are present. This explains the need for a theory

of higher-order rewriting, where the equations may contain bound variables and functional variables.

¹ From Latin confluere: “to flow together”.

1.2 Formalization and Certification

In recent years there has been tremendous progress in establishing confluence or non-confluence of rewrite systems automatically, with a number of tools under active development. For first-order rewriting the main tools are ACP [8], CoLL-Saigawa [51, 96], and our own tool, CSI [66, 114]. Also for confluence of higher-order rewriting tool support has emerged, witnessed by ACPH and CSIˆho, which are extensions of the first-order provers ACP and CSI, and the new tool SOL [37]. Because confluence is undecidable, there are three different answers these fully automatic tools can give: YES if confluence could be established, NO if non-confluence was proved, and MAYBE (or a timeout) if no definitive answer could be obtained. The achievements in confluence research have enabled the confluence competition (CoCo) [4] where automated tools try to establish or refute confluence.² The problems the tools try to solve are from the confluence problems database (Cops, version 764 at the time of writing).³

As the power of the confluence provers grew, so did their complexity, and the proofs they produce are often complicated and large. So if we are to use them in a formal verification context, should we trust their output? Of course not, and indeed there have been occasions where confluence tools delivered wrong proofs and answers. To remedy this dilemma two approaches suggest themselves. An obvious solution would be to formally verify that the confluence tool is correct. But this approach clearly does not scale well. Since tools use different algorithms and are written in different languages, each would need its own correctness proof. Even worse, after every change in a tool we would have to adapt the corresponding proof, which severely limits the potential for optimization.

The other solution is to check the output of the tools using a trustable, independent certifier. A certifier is a different kind of automated tool that reads proof certificates and either accepts them as correct or rejects them as erroneous, thereby increasing credibility of the proofs found by the confluence tools. To ensure correctness of the certifier itself, the predominant solution is

² http://coco.nue.riec.tohoku.ac.jp/
³ http://cops.uibk.ac.at


[Figure 1: Certification of confluence proofs. The diagram shows a confluence tool implementing theorems and proofs from the literature, checking a rewrite system and producing a CPF proof; on the formalization side the theorems are formalized in Isabelle/HOL as part of IsaFoR, from which the certifier CeTA is obtained via code generation and a Haskell compiler; CeTA then accepts or rejects the CPF proof.]

to use proof assistants like Coq [14] and Isabelle/HOL [74] to first formalize the underlying theory in the proof assistant, and then use this formalization to obtain verified, executable check functions for inspecting the certificates. The tool CeTA [106] is such a certifier for termination, confluence and complexity proofs for rewrite systems. Other certifiers exist for termination proofs, notably Rainbow [16] and CiME3 [17]. Proof certificates are specified in the certification problem format (CPF) [99]. The main structure of a certificate in CPF consists of an input, in our case the rewrite system whose confluence should be certified, and a proof. Given a certificate in CPF, CeTA will either answer CERTIFIED or return a detailed error message why the proof was REJECTED. Its correctness is formally proved as part of IsaFoR, the Isabelle Formalization of Rewriting. IsaFoR contains executable “check”-functions for each formalized proof technique together with formal proofs that whenever such a check succeeds, the technique is indeed applied correctly. Isabelle's code-generation facility [35], which ensures partial correctness [36], is used to obtain a trusted Haskell program from these check functions: the certifier CeTA. The big picture of mechanized confluence analysis is shown in Figure 1: Confluence tools implement results from the literature and use them to automatically check confluence of rewrite systems, producing proofs in CPF. The theorems and

proofs about the implemented confluence results are formalized in Isabelle/HOL, resulting in the formal library IsaFoR, from which the certifier CeTA is generated. CeTA can then inspect the proofs in CPF.

1.3 Overview

We contribute to three aspects of mechanized confluence analysis: reliability, power, and versatility. To improve the reliability of confluence tools, we formalize confluence criteria using Isabelle/HOL and integrate them into IsaFoR and CeTA. To enhance their power, we present a technique, based on the addition and removal of redundant rewrite rules, that strengthens existing criteria. Finally, to make confluence analysis more versatile we develop the higher-order confluence tool CSIˆho.

The remainder of the thesis is organized as follows. In Chapter 2 we introduce the basics about first- and higher-order rewriting and confluence that we will use later on. The next two chapters present the main confluence criteria that we formalize on top of IsaFoR. Chapter 3 is devoted to confluence criteria that are based on resolving overlapping rewrite steps in a restricted fashion, while Chapter 4 describes our formalization of the rule labeling, which shows confluence provided rewrite steps can be labeled in a certain way, using labels that are equipped with a well-founded order. In Chapter 5 we describe a simple, but surprisingly useful technique, based on adding and removing rewrite rules that can be simulated by other rules, which often makes other confluence criteria more powerful. Next we look at higher-order rewriting, where we will first discuss confluence of the λ-calculus in Chapter 6, before describing the theory behind our higher-order confluence prover CSIˆho in Chapter 7. Both CSI and CSIˆho are described in detail in Chapter 8, where we also provide an extensive experimental evaluation.

A large part of this work is devoted to formalizing confluence criteria in Isabelle/HOL. For presentation in this thesis we use different levels of abstraction. Code listings display parts of the formalization directly generated from the formal Isabelle text (using syntax translations for pretty printing), while in the main text we prefer to use standard mathematical notation and descriptions. Whenever this leads to noteworthy discrepancies they will be pointed out along the way. We hope that this mix of standard rewriting notation and formalization snippets will appeal to readers from both backgrounds.

6 1.3 Overview

We provide the Isabelle/HOL theory files for the formalizations from the subsequent chapters as part of IsaFoR. To be able to browse these files one needs a working Isabelle installation as well as the Archive of Formal Proofs, which is available at http://www.isa-afp.org/

The IsaFoR/CeTA version that corresponds to the formalizations described in this thesis is 2.31. It is available from http://cl-informatik.uibk.ac.at/isafor

For further compilation instructions, we refer to the README file in the IsaFoR sources.


Chapter 2 Rewriting

Well, when all is said and done, the only thing computers can do for us is to manipulate symbols and produce results of such manipulations. Edsger W. Dijkstra (EWD 1036)

In this chapter we introduce several rewriting formalisms. We will consider first-order term rewrite systems, the lambda calculus, and higher-order rewrite systems, which can be understood as a combination of the former two. To deal with the common notions of all three in a uniform manner, we first introduce abstract rewrite systems. They are abstract in the sense that all the potential structure of the objects that are subject to rewriting is abstracted away. While we do strive to keep this thesis self-contained, the material is quite dense, and so familiarity with the basics of rewriting [11, 105] and (typed) lambda calculi [12, 13] will be helpful.

2.1 Preliminary Notions

This section gives an overview of general terminology and fixes common notation that will be used later on. We denote the set of natural numbers by N and the set of positive natural numbers by N+. Given sets A1,...,An (for some n ∈ N) by A1 × · · · × An we denote the cartesian product:

A1 × · · · × An = {(a1, . . . , an) | ai ∈ Ai for all 1 ⩽ i ⩽ n}

An n-ary relation R over A1, . . . , An is a subset of A1 × · · · × An. If n = 2 then R is called a binary relation, and if R ⊆ A × A then we say that R is a binary relation over A. For a binary relation R its inverse is defined by


R⁻¹ = {(b, a) | (a, b) ∈ R}.¹ The identity relation id_A on A is {(a, a) | a ∈ A}. Given two binary relations R ⊆ A × B and S ⊆ B × C we write R · S for their composition:

R · S = {(a, c) | (a, b) ∈ R and (b, c) ∈ S for some b ∈ B}

Let R be a binary relation over A. For n ∈ ℕ we denote by Rⁿ the n-fold composition of R, i.e., R⁰ = id_A and Rⁿ⁺¹ = R · Rⁿ. The reflexive closure of R is R⁼ = R ∪ id_A, the transitive closure is R⁺ = ⋃_{n⩾1} Rⁿ, the reflexive transitive closure is R∗ = R⁺ ∪ id_A, and the symmetric closure of R is R ∪ R⁻¹. The multiset extension of a binary relation >, denoted >mul, is defined as X ⊎ Y >mul X ⊎ Z if Y ≠ ∅ and for all z ∈ Z there is a y ∈ Y such that y > z, where ⊎ denotes the sum of multisets.
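To make these closure operations concrete, here is a small Haskell sketch that we add for illustration (finite relations represented as lists of pairs; all names are ours):

import Data.List (nub)

type Rel a = [(a, a)]

-- composition R · S
comp :: Eq a => Rel a -> Rel a -> Rel a
comp r s = nub [ (a, c) | (a, b) <- r, (b', c) <- s, b == b' ]

-- inverse R⁻¹
inv :: Rel a -> Rel a
inv r = [ (b, a) | (a, b) <- r ]

-- reflexive transitive closure R∗ over a given finite carrier set
rtc :: Eq a => [a] -> Rel a -> Rel a
rtc carrier r = go (nub ([ (a, a) | a <- carrier ] ++ r))
  where go s = let s' = nub (s ++ comp s r)
               in if length s' == length s then s else go s'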

2.2 Abstract Rewrite Systems

To model computations in an abstract way we consider a set of objects together with a binary relation that describes transformations on these objects.

Definition 2.1. An abstract rewrite system (ARS) is a pair A = (A, R) consisting of a set A and a binary relation R on A.

In IsaFoR, an ARS is just a binary relation where the domain is left implicit in the type. When no confusion will arise we will sometimes also do this in textual descriptions and identify an ARS with its binary relation. To indicate that rewriting is a directed transformation process the relation R is usually written as → and instead of (a, b) ∈ → we write a → b and call a → b a rewrite step. Taking the transitive reflexive closure of → yields rewrite sequences. A finite rewrite sequence is a non-empty sequence (a1, . . . , an) of elements in A such that ai → ai+1 for all 1 ⩽ i < n. Given objects a, b ∈ A we say that a rewrites to b if a →∗ b and call b a reduct of a or reachable from a. If there is a rewrite step from a to some b we say that a is reducible. Objects that are not reducible are called normal forms and we write NF(A) or NF(→) for the set of all normal forms of A. If a has the normal form b, i.e., if a →∗ b and b ∈ NF(→) we write a →! b. Objects a and b are joinable, written a ↓ b,

¹ When convenient we mirror the notation of a relation to denote its inverse; for instance we write ⩽ instead of ⩾⁻¹.


[Figure 2: An abstract rewrite system, shown as the directed graph c ← a ⇄ b → d.]

if they have a common reduct, i.e., if there is an object c with a →∗ c ∗← b. They are meetable if they have a common ancestor, i.e., meetability is defined as ↑ = ∗← · →∗. We write ↔ for the symmetric closure of →. A conversion between objects a and b is a sequence (c1, . . . , cn) such that c1 = a, cn = b, and ci ↔ ci+1 for all 1 ⩽ i < n. We then write a ↔∗ b and say that a and b are convertible. An infinite rewrite sequence is an infinite sequence of objects (ai)i∈ℕ such that ai → ai+1 for all i ∈ ℕ.

Example 2.2. Consider the ARS A = (A, →) with A = {a, b, c, d} and → = {(a, b), (b, a), (a, c), (b, d)}, depicted as directed graph in Figure 2. In this ARS we have for example a →! d, a →∗ a, and b ↓ c. The objects c and d are convertible normal forms.
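Continuing the sketch above, these notions are directly executable for finite ARSs; the following illustration (again ours) checks them for the ARS of Example 2.2:

step :: Eq a => Rel a -> a -> [a]
step r a = [ b | (a', b) <- r, a' == a ]

-- all reducts of a, i.e., the objects reachable by →∗
reducts :: Eq a => Rel a -> a -> [a]
reducts r a = go [a]
  where go xs = let xs' = nub (xs ++ [ b | x <- xs, b <- step r x ])
                in if length xs' == length xs then xs else go xs'

normalForm :: Eq a => Rel a -> a -> Bool
normalForm r a = null (step r a)

joinable :: Eq a => Rel a -> a -> a -> Bool
joinable r a b = any (`elem` reducts r b) (reducts r a)

ex22 :: Rel Char
ex22 = [('a','b'), ('b','a'), ('a','c'), ('b','d')]

-- joinable ex22 'b' 'c' yields True, while joinable ex22 'c' 'd' yields
-- False: c and d are distinct normal forms, as stated in the example.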

These basic definitions are enough to discuss many of the commonly studied global properties of ARSs. When an ARS has a property we also often say that the underlying relation has that property, leaving the set of objects implicit.

Definition 2.3. An ARS is terminating if it admits no infinite rewrite sequences.

The ARS from Example 2.2 is not terminating, because the sequence a → b → a → b → · · · constitutes an infinite rewrite sequence. We now turn to confluence and various related properties.

Definition 2.4. An ARS A = (A, →) is confluent, if whenever t ∗← s →∗ u then there is an object v such that t →∗ v ∗← u, or more succinctly, → is confluent if ∗← · →∗ ⊆ →∗ · ∗←

Confluence is commonly drawn as the diagram shown in Figure 3(a), where solid lines indicate universal quantification and dashed lines indicate existential quantification. In light of this depiction a situation s ∗← · →∗ t is called a peak. Such a peak is called joinable if its endpoints s and t are joinable, i.e., if there is a so-called valley s →∗ · ∗← t.


[Figure 3: Confluence and various related properties: (a) confluence, (b) Church-Rosser, (c) local confluence, (d) diamond, (e) strong confluence, (f) semi-confluence. Solid lines indicate universal and dashed lines existential quantification.]

When it was originally studied confluence was stated in the following different but equivalent formulation, named after Alonzo Church and J. Barkley Rosser, who first showed the property for the λ-calculus.

Definition 2.5. An ARS A = (A, →) has the Church-Rosser property (CR) if ↔∗ ⊆ →∗ · ∗←

The Church-Rosser property is depicted in Figure 3(b). That it is equivalent to confluence is not hard to show.

Lemma 2.6. An ARS is confluent if and only if it has the Church-Rosser property.

Proof. That CR implies confluence is trivial, since every peak obviously is a conversion. For the other direction assume s ↔ⁿ u and proceed by induction on n. If n = 0 then s = u and obviously s ↓ u. Otherwise write s ↔ t ↔ⁿ⁻¹ u. From the induction hypothesis we obtain v with t →∗ v ∗← u. If s → t this

immediately yields s →∗ v ∗← u. If s ← t then we have the peak s ← t →∗ v and by confluence s ↓ v, which with u →∗ v yields the desired s ↓ u.

Because we usually do not deal with finite ARSs, there will in general be infinitely many peaks to check when we try to establish confluence. Since we are interested in automatic confluence checking we will need to break them down into some finite criteria. A first idea, on the abstract level, might be to localize the test, i.e., to only consider peaks consisting of single steps, so-called local peaks.

Definition 2.7. An ARS A = (A, →) is locally confluent if

← · → ⊆ →∗ · ∗←

The diagram for local confluence, also known as the weak Church-Rosser property, is drawn in Figure 3(c). Unfortunately joinability of all local peaks is not sufficient for confluence in general.

Example 2.8. The ARS from Example 2.2 is locally confluent: its two non- trivial local peaks c ← a → b and a ← b → d are both joinable. It is however not confluent, since it admits the peak d ∗← a →∗ c, but c and d are not joinable.

When trying to impose restrictions on an ARS to still get a relationship between local confluence and confluence one realizes that the problem goes away for terminating systems. This is captured in the following famous lemma, due to Newman.

Lemma 2.9 (Newman [69]). A terminating ARS is confluent if it is locally confluent.

Proof. Assume t ∗← s →∗ u. We perform well-founded induction on s with respect to →, which is well-founded since the ARS in question is terminating, to show t ↓ u. If s = t or s = u then trivially t ↓ u, so assume there is at least one step on both sides: t ∗← t′ ← s → u′ →∗ u. From local confluence we obtain t′ →∗ v ∗← u′ for some v. This yields the new peak t ∗← t′ →∗ v. Since s → t′ we can apply the induction hypothesis and obtain a w with t →∗ w ∗← v. We are left with the peak w ∗← v ∗← u′ →∗ u, which, since s → u′, we again close using the induction hypothesis.


So for terminating systems we can reduce showing confluence to local confluence.² However many systems of interest will not be terminating and so we need a different approach. A common one is to restrict attention to local peaks by also restricting the joining sequences, employing one of the following two properties.

Definition 2.10. An ARS A = (A, →) has the diamond property if

← · → ⊆ → · ←

It is strongly confluent if ← · → ⊆ →= · ∗←

The diamond property and strong confluence are depicted in Figures 3(d) and 3(e) respectively. Clearly a relation is confluent if its reflexive transitive closure has the diamond property and any relation that has the diamond property is strongly confluent. Beware of symmetry in the diagram for strong confluence. If there is a peak t ← s → u, and consequently a valley t →= · ∗← u, then there is also the symmetric peak u ← s → t and hence there must be a valley u →= · ∗← t, making strong confluence only slightly weaker than requiring ← · → ⊆ →= · =←. The next lemma justifies the name strong confluence.

Lemma 2.11. Any strongly confluent ARS is confluent.

We defer the proof of this statement for a bit and will instead carry it out in the more general setting of commutation. Commutation is the natural generalization of confluence to two relations.

Definition 2.12. Two relations →1 and →2 commute if

∗₁← · →∗₂ ⊆ →∗₂ · ∗₁←

They locally commute if

₁← · →₂ ⊆ →∗₂ · ∗₁←

They strongly commute if

₁← · →₂ ⊆ →=₂ · ∗₁←

² In fact when moving to term rewriting we will see that local confluence and consequently also confluence are even decidable for terminating systems.


Obviously confluence of a relation is equivalent to it commuting with itself. Like for confluence we will want to restrict attention to local peaks by showing strong commutation. Note that strong commutation is not symmetric, i.e., strong commutation of →1 and →2 does not imply strong commutation of →2 and →1.

Example 2.13. Consider the two ARSs →1 and →2 defined by the following steps:

a →₁ b    c →₁ d    d →₁ b    a →₂ c

Then →₁ and →₂ strongly commute because the peak b ₁← a →₂ c is joinable as b ₁← d ₁← c. However, for the symmetric peak c ₂← a →₁ b there is no valley of the shape →=₁ · ∗₂←, since b ∈ NF(→₂) and the only step from c with →₁ is c →₁ d. Hence →₂ and →₁ do not strongly commute.

To show that strong commutation implies commutation we proceed in two steps: we first partially localize the diagram.

Definition 2.14. Two relations →1 and →2 semi-commute if

∗₁← · →₂ ⊆ →∗₂ · ∗₁←

Instantiating →1 and →2 to the same relation yields the related notion of semi-confluence, shown in Figure 3(f). Semi-commutation turns out to be equivalent to commutation.

Lemma 2.15. Two binary relations →1 and →2 semi-commute if and only if they commute.

Proof. That commutation implies semi-commutation is immediate from the definitions. For the other direction assume t ∗₁← s →∗₂ u. Then there is an n such that s →ⁿ₂ u. We show t →∗₂ · ∗₁← u by induction on n. If n = 0 then s = u and we are done. Otherwise write t ∗₁← s →₂ u′ →ⁿ⁻¹₂ u. From semi-commutation we obtain a v with t →∗₂ v ∗₁← u′. This leaves us with the peak v ∗₁← u′ →ⁿ⁻¹₂ u, which, by the induction hypothesis, can be joined as v →∗₂ · ∗₁← u. Together with t →∗₂ v this yields the desired t →∗₂ · ∗₁← u.


Lemma 2.16. Let →1 and →2 be binary relations. If →1 and →2 strongly commute then they commute.

Proof. We show semi-commutation of →₁ and →₂. Assume t ∗₁← s →₂ u. Then there is a natural number n with t ⁿ₁← s →₂ u. We show t →∗₂ · ∗₁← u by induction on n. If n = 0 then s = t and we are done. Otherwise we have t ⁿ⁻¹₁← t′ ₁← s →₂ u. Strong commutation yields a v with t′ →=₂ v ∗₁← u. We consider two cases. If t′ = v then u →∗₁ t′ →ⁿ⁻¹₁ t. Otherwise t′ →₂ v and we close the peak t ⁿ⁻¹₁← t′ →₂ v by the induction hypothesis as t →∗₂ · ∗₁← v, using u →∗₁ v.

To conclude commutation from strong commutation it is often useful to not use the relations in question directly but auxiliary relations that lie between one-step and many-step rewriting. The following lemma shows how to achieve this.

Lemma 2.17. Let →₁, →₂, →₁′, and →₂′ be binary relations. If →₁ ⊆ →₁′ ⊆ →∗₁ and →₂ ⊆ →₂′ ⊆ →∗₂ and →₁′ and →₂′ commute then →₁ and →₂ commute.

Proof. Straightforward induction proofs show →∗₁ = →∗₁′ and →∗₂ = →∗₂′. Then commutation of →₁ and →₂ follows from commutation of →₁′ and →₂′.

Instantiating commutation to confluence yields a corresponding useful corol- lary.

Corollary 2.18. Let →₁ and →₂ be binary relations. If →₁ ⊆ →₂ ⊆ →∗₁ and →₂ is confluent then →₁ is confluent.

The following important connection between confluence and commutation allows one to obtain confluence of the union of two confluent relations, provided they commute.

Lemma 2.19 (Hindley [38]). Let →1 and →2 be confluent relations. If →1 and →2 commute then →1 ∪ →2 is confluent.

Proof. By splitting the rewrite sequences in a peak u ∗₁∪₂← s →∗₁∪₂ t and repeatedly filling the diagrams using the confluence and commutation assumptions:


[Tiling diagram: the peak is decomposed into its →₁- and →₂-segments and closed square by square using the commutation and confluence diagrams.]

We now turn our attention to concrete instantiations of abstract rewriting. The first one we consider results from using first-order terms as objects and a rewrite relation generated from a set of rules.

2.3 Term Rewriting

A signature F is a set of function symbols. Each function symbol f ∈ F comes with an associated arity ar(f) ∈ N. Function symbols with arity 0 are called constants. With V we denote an infinite set of variables disjoint from F.

Definition 2.20. The set of terms T (F, V) over a signature F and a set of variables V, is defined inductively by

• if x ∈ V then x ∈ T (F, V), and

• if f ∈ F with ar(f) = n and ti ∈ T (F, V) for 1 ⩽ i ⩽ n then f(t1, . . . , tn) ∈ T (F, V).

We write Var(t) for the set of variables occurring in the term t and denote the number of occurrences of the variable x in term t by |t|x. In IsaFoR variables, function symbols, and terms are represented as types instead of sets. The datatype for terms is given in Listing 1, showing how terms are built over function symbols of type α and variables of type β. The main difference to the textbook definition is that there is no arity restriction on function applications. Consequently the name of a function symbol alone is not enough to identify it uniquely, since the same name could be used with

datatype (α, β) term = Var β | Fun α (α, β) term list

Listing 1: First-order terms in IsaFoR.

different arities. Thus, whenever the signature is essential IsaFoR uses pairs (f, n) of function symbols and arities. We often need to address specific subterms of a term. To this end we use positions, strings that encode the path from the root of the term to the subterm in question.

Definition 2.21. Positions are strings of positive natural numbers, i.e., elements of ℕ₊∗. We denote the root position, i.e., the empty sequence, by ϵ. A position q is above a position p, written q ⩽ p, if qq′ = p for some position q′, in which case p\q is defined to be q′. Furthermore q is strictly above p, written as q < p, if q ⩽ p and q ≠ p. If q is above p we also say that p is below q. Finally, positions q and p are parallel, written as q ∥ p, if neither q ⩽ p nor p < q.

Definition 2.22. The set of positions in a term t ∈ T (F, V) is defined as

Pos(t) = {ϵ} if t is a variable
Pos(t) = {ϵ} ∪ {iq | 1 ⩽ i ⩽ n and q ∈ Pos(ti)} if t = f(t1, . . . , tn)

The subterm of t at position p ∈ Pos(t) is defined as

t|p = t if p = ϵ
t|p = ti|q if p = iq and t = f(t1, . . . , tn)

The set of function symbol positions of t is PosF (t) = {p ∈ Pos(t) | t|p ∈/ V} and the set of variable positions of t is PosV (t) = Pos(t) \ PosF (t). The size of t is defined as the size of Pos(t) and denoted by |t|.
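For the term datatype of Listing 1 these definitions translate directly into executable functions. The following Haskell sketch is our own illustration (with a copy of the datatype; the code IsaFoR generates looks different):

data Term f v = Var v | Fun f [Term f v] deriving (Eq, Show)

type Pos = [Int]

-- Pos(t) of Definition 2.22
positions :: Term f v -> [Pos]
positions (Var _)    = [[]]
positions (Fun _ ts) = [] : [ i : q | (i, t) <- zip [1 ..] ts, q <- positions t ]

-- the subterm t|p; only defined for p ∈ Pos(t)
subtermAt :: Term f v -> Pos -> Term f v
subtermAt t          []      = t
subtermAt (Fun _ ts) (i : q) = subtermAt (ts !! (i - 1)) q
subtermAt (Var _)    _       = error "position not in term"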

A central operation on terms is replacing subterms. To this end we use contexts, terms containing holes that can be filled by other terms.

Definition 2.23. Let □ be a constant symbol with □ ∈/ F, called hole. A context is a term in T (F ∪ {□}, V) with exactly one occurrence of □. A multihole context is a term that may contain an arbitrary number of holes.

datatype (α, β) ctxt = Hole | More α (α, β) term list (α, β) ctxt (α, β) term list
datatype (α, β) mctxt = MVar β | MHole | MFun α (α, β) mctxt list

Listing 2: Contexts and multihole contexts in IsaFoR.

In IsaFoR contexts and multihole contexts are not defined as special terms but as separate datatypes, as shown in Listing 2. This facilitates proofs by induction and directly enforces the restriction that contexts contain exactly one hole. On the other hand, special conversion functions, for example between multihole contexts without holes and terms, are now required, but in a formalization environment the cleaner separation is worth the effort.

Definition 2.24. For a term t and position p ∈ Pos(t) the context obtained by replacing t|p with the hole, t[]p, is defined by

t[]p = □ if p = ϵ
t[]p = f(t1, . . . , ti[]q, . . . , tn) if t = f(t1, . . . , tn) and p = iq

Definition 2.25. For a context C and term t by C[t] we denote the term obtained by filling the hole in C with t:

C[t] = t if C = □
C[t] = f(t1, . . . , C′[t], . . . , tn) if C = f(t1, . . . , C′, . . . , tn)

Moreover t[s]p denotes the term obtained by replacing the subterm of t at position p by s, i.e., t[s]p = (t[]p)[s].
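Continuing the sketch above, the replacement t[s]p of Definition 2.25 becomes:

-- t[s]_p: replace the subterm of t at position p by s
replaceAt :: Term f v -> Pos -> Term f v -> Term f v
replaceAt _          []      s = s
replaceAt (Fun f ts) (i : q) s =
  Fun f (take (i - 1) ts ++ [replaceAt (ts !! (i - 1)) q s] ++ drop i ts)
replaceAt (Var _)    _       _ = error "position not in term"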

If C[s] = t for some context C then s is called a subterm of t and we write s ⊴ t. If additionally C ≠ □ then s is a proper subterm of t, which is denoted by s ◁ t. Filling the holes in a multihole context C with terms t1, . . . , tn is more involved, because we have to partition the list of terms according to the distribution of holes in C.

Definition 2.26. For a multihole context C containing n holes and a list of


terms t1, . . . , tn we define C[t1, . . . , tn] by the following equations:

□[t] = t
x[] = x
f(C1, . . . , Cc)[t1, . . . , tn] = f(C1[t̄1], . . . , Cc[t̄c])

where t̄1, . . . , t̄c is a partitioning of t1, . . . , tn such that the length of t̄i matches the number of holes in Ci for all 1 ⩽ i ⩽ c.

Dealing with this definition and multihole contexts in IsaFoR requires some non-trivial overhead. One needs to take care of, for instance, partitioning, flattening, or shortening lists of terms. While cumbersome and time-consuming, the reasoning involved usually does not yield new insights and so we will not elaborate on it in the text. We now turn to instantiating variables in terms.

Definition 2.27. A substitution is a mapping σ from V to T (F, V). We write tσ for the result of applying σ to the term t:

tσ = σ(x) if t = x
tσ = f(t1σ, . . . , tnσ) if t = f(t1, . . . , tn)

For terms s and t, we call s an instance of t and say that t matches s if there is a substitution σ such that s = tσ. A substitution µ is a unifier of s and t if sµ = tµ, in which case s and t are called unifiable. A most general unifier µ is a unifier such that for any other unifier σ there is a substitution ρ with σ = µρ.
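In the sketch started above, substitutions can be represented as functions from variables to terms, and Definition 2.27 becomes a one-line recursion:

-- tσ of Definition 2.27, with σ given as a total function on variables
applySubst :: (v -> Term f v) -> Term f v -> Term f v
applySubst sigma (Var x)    = sigma x
applySubst sigma (Fun f ts) = Fun f (map (applySubst sigma) ts)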

Finite substitutions, i.e., mappings with x ̸= σ(x) for finitely many x ∈ V, are often written in the form {x1 7→ t1, . . . , xn 7→ tn}. When σ = {x 7→ s} we also write t [x := s] for tσ. Relations that are preserved under contexts and under substitutions are of special interest.

Definition 2.28. A binary relation R on terms is closed under contexts if s R t implies C[s] R C[t] for all terms s and t and all contexts C. It is closed under substitutions if for all substitutions σ we have sσ R tσ whenever s R t. A rewrite relation is a binary relation on terms that is closed under both contexts and substitutions.


(ℓ, r) ∈ R ⟹ (ℓ, r) ∈ rstep R
(s, t) ∈ rstep R ⟹ (s · σ, t · σ) ∈ rstep R
(s, t) ∈ rstep R ⟹ (C⟨s⟩, C⟨t⟩) ∈ rstep R

Listing 3: The definition of rewriting in IsaFoR. rstep_r_c_s r C σ = {(s, t). s = C⟨fst r · σ⟩ ∧ t = C⟨snd r · σ⟩} rstep_r_p_s R r p σ = {(s, t). p ∈ poss s ∧ r ∈ R ∧ s |_ p = fst r · σ ∧ t = (s []_ p)⟨snd r · σ⟩}

Listing 4: Alternative characterizations of rewriting in IsaFoR.

We are now ready to define term rewriting.

Definition 2.29. A rewrite rule is a pair of terms (ℓ, r), written ℓ → r. A term rewrite system (TRS) is a set of rewrite rules. For a TRS R we define →R to be the smallest rewrite relation that contains R and associate with it the ARS (T (F, V), →R).

Put differently, a term s rewrites to a term t using the TRS R if we can find a rule ℓ → r in R, a context C and a substitution σ such that s = C[ℓσ] and t = C[rσ]. Equivalently, we have s →R t if s|p = ℓσ and t|p = rσ, for a rule ℓ → r in R, a position p, and a substitution σ. When the rewrite step takes place at the root, i.e., p = ϵ, we write s →ϵR t. Similar to ARSs we will often identify a TRS R with its rewrite relation →R. Note that we do not impose the common variable conditions, i.e., the restriction that ℓ is not a variable and all variables in r are contained in ℓ. In IsaFoR →R is defined as an inductive set by the rules shown in Listing 3, where rstep R is IsaFoR's notation for →R. Alternative characterizations, based on finding a redex in a term, are also available, see Listing 4. Many confluence results depend on variables occurring with restricted multiplicity in rewrite rules.

Definition 2.30. A term t is linear if every variable occurs at most once in it. A rewrite rule ℓ → r is left-linear if ℓ is linear, right-linear if r is linear, and linear if it is both left- and right-linear. A TRS is (left-, right-)linear if all its rules are (left-, right-)linear. A rewrite rule ℓ → r is duplicating if |ℓ|x < |r|x for some x ∈ V. By Rd and Rnd, we denote the duplicating and non-duplicating rules of a TRS R, respectively.
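Putting the previous sketches together, a single rewrite step can be implemented by matching a left-hand side against the subterm at a given position. This is again our own illustration, not the IsaFoR definition:

-- extend a partial substitution, given as an association list, so that
-- the first argument (a left-hand side) matches the second (a subterm)
matchTerm :: (Eq f, Eq v) => Term f v -> Term f v
          -> [(v, Term f v)] -> Maybe [(v, Term f v)]
matchTerm (Var x) s su = case lookup x su of
  Nothing -> Just ((x, s) : su)
  Just t  -> if t == s then Just su else Nothing
matchTerm (Fun f ls) (Fun g ss) su
  | f == g && length ls == length ss = go (zip ls ss) su
  where go []             su' = Just su'
        go ((l, s) : lss) su' = matchTerm l s su' >>= go lss
matchTerm _ _ _ = Nothing

-- all results of rewriting t at position p (p ∈ Pos(t)) with some rule
rewriteAt :: (Eq f, Eq v) => [(Term f v, Term f v)] -> Term f v -> Pos -> [Term f v]
rewriteAt trs t p =
  [ replaceAt t p (applySubst sigma r)
  | (l, r) <- trs
  , Just su <- [matchTerm l (subtermAt t p) []]
  , let sigma x = maybe (Var x) id (lookup x su) ]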


We will sometimes consider rewriting with a TRS R relative to some other TRS S, which intuitively means that arbitrarily many S-steps are allowed between R-steps.

Definition 2.31. A relative TRS R/S is a pair of TRSs R and S with the induced rewrite relation →R/S = →∗S · →R · →∗S.

Sometimes we identify a TRS R with the relative TRS R/∅ and vice versa, which is justified by →R/∅ = →R. The extra structure of terms as objects allows us to define rewrite relations that lie between one-step and many-step rewriting, which will be useful in connection with Lemma 2.17. We will consider two such relations; the first allows contracting multiple parallel redexes in one go.

Definition 2.32. For a TRS R, the parallel rewrite relation −→∥ R is defined inductively by

• x −→∥ R x if x is a variable,

• ℓσ −→∥ R rσ if ℓ → r ∈ R, and

• f(s1, . . . , sn) −→∥ R f(t1, . . . , tn) if f is a function symbol of arity n and si −→∥ R ti for all 1 ⩽ i ⩽ n.

The following properties of parallel rewriting are well-known and follow by straightforward induction proofs.

Lemma 2.33. The following properties of −→∥ hold:

• →R ⊆ −→∥ R ⊆ →∗R,

• s −→∥ R s for all terms s,

• if xσ −→∥ R xτ for all x ∈ Var(s) then sσ −→∥ R sτ.

When not only allowing parallel redexes to be contracted simultaneously, but also nested ones, we get what are commonly known as multisteps or development steps. For a binary relation R on terms we write σ R σ′ if σ(x) R σ′(x) for all variables x.

Definition 2.34. For a TRS R the multistep rewrite relation −→○ R is defined inductively by

22 2.3 Term Rewriting

• x −→○ R x if x is a variable,

• ℓσ −→○ R rσ′ if ℓ → r ∈ R and σ, σ′ are substitutions with σ −→○ R σ′, and

• f(s1, . . . , sn) −→○ R f(t1, . . . , tn) if f is a function symbol of arity n and si −→○ R ti for 1 ⩽ i ⩽ n.

Most of the confluence and commutation results in this work are based on (left-)linearity and restricted joinability of critical pairs. Critical pairs arise from situations where two redexes overlap with each other. The definition we use is slightly non-standard in two regards. First, we consider critical pairs for two rewrite systems, to use them in a commutation setting. Second, we do not exclude root overlaps of a rule with (a variant of) itself, as is commonly done. This allows us to dispense with the variable condition that all variables in the right-hand side of a rule must also occur on the left. Moreover, if a TRS does satisfy the condition then all extra critical pairs that would normally be excluded are trivial.

Definition 2.35. A critical overlap (ℓ1 → r1, C, ℓ2 → r2)µ of two TRSs R1 and R2 consists of variants ℓ1 → r1 and ℓ2 → r2 of rewrite rules in R1 and R2 without common variables, a context C such that ℓ2 = C[ℓ′] with ℓ′ ∈/ V, and a most general unifier µ of ℓ1 and ℓ′. From a critical overlap (ℓ1 → r1, C, ℓ2 → r2)µ we obtain a critical peak Cµ[r1µ] R1← Cµ[ℓ1µ] →R2 r2µ and the corresponding critical pair Cµ[r1µ] R1←⋊→R2 r2µ. If C = □, the corresponding critical pair is called an overlay and written as r1µ R1←⋉⋊→R2 r2µ, otherwise it is called an inner critical pair, and denoted using R1←·⋊→R2. When considering the critical pairs of a TRS R with itself we drop the subscripts and write ←⋊→ instead of R←⋊→R. We call a critical pair joinable if the peak from which it originates is joinable.

The next example shows that if the variable condition is not satisfied, critical pairs that arise from overlapping a rule with itself at the root are essential.

Example 2.36. Consider the linear rewrite system R consisting of the single rule a → y. Because of the peak x ← a → y, R is not confluent and indeed x ←⋊→ y is a non-joinable critical pair according to our definition.
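Computing the critical pairs of Definition 2.35 requires most general unifiers. A naive sketch of syntactic first-order unification, building on the Term sketch above (nothing like the efficient algorithms used by actual tools), looks as follows:

occurs :: Eq v => v -> Term f v -> Bool
occurs x (Var y)    = x == y
occurs x (Fun _ ts) = any (occurs x) ts

-- turn an association list into a substitution function
mkSubst :: Eq v => [(v, Term f v)] -> (v -> Term f v)
mkSubst su y = maybe (Var y) id (lookup y su)

-- a most general unifier of a list of equations, if one exists
unify :: (Eq f, Eq v) => [(Term f v, Term f v)] -> Maybe [(v, Term f v)]
unify [] = Just []
unify ((s, t) : eqs)
  | s == t = unify eqs
unify ((Var x, t) : eqs)
  | occurs x t = Nothing
  | otherwise  =
      do let bind y = if y == x then t else Var y
         su <- unify [ (applySubst bind a, applySubst bind b) | (a, b) <- eqs ]
         Just ((x, applySubst (mkSubst su) t) : su)
unify ((s, Var y) : eqs) = unify ((Var y, s) : eqs)
unify ((Fun f ss, Fun g ts) : eqs)
  | f == g && length ss == length ts = unify (zip ss ts ++ eqs)
unify _ = Nothing

An actual critical pair computation additionally has to rename the two rules apart and to consider all non-variable positions of one left-hand side, as in the definition above; we omit this bookkeeping here.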


2.4 Lambda Calculus

We now turn to higher-order systems, i.e., systems with bound variables and functions as first-class citizens. The prototypical higher-order system is the lambda calculus, which we consider in its simply typed incarnation. In the next section it will then serve as meta-language for general higher-order rewriting. For a (formalized) treatment of the untyped lambda calculus see Chapter 6; for a general introduction to the lambda calculus we refer to [12].

Definition 2.37. Given a set of base types B, the set of simple types over B, denoted TB, is defined inductively as follows:

• every base type ι ∈ B is a simple type,

• if σ and τ are simple types, then σ → τ is a simple type.

A type³ with at least one occurrence of → is called functional. Following standard conventions, → is right-associative and outermost parentheses are omitted. We denote types by σ, τ, ρ, . . . and base types by ι, κ. Let V be a set of typed variables, containing countably many variables of each type. When the type of a variable can be inferred from the context or is of no interest we often omit it and write x instead of x : σ. Simply typed lambda terms are built from variables, λ-abstraction and application.

Definition 2.38. The set of typed lambda terms, Λ→, is defined inductively by the following rules:

x : σ ∈ V ⟹ x : σ ∈ Λ→
u : τ → σ ∈ Λ→ and t : τ ∈ Λ→ ⟹ u t : σ ∈ Λ→
x : τ ∈ V and t : ρ ∈ Λ→ ⟹ λx. t : τ → ρ ∈ Λ→

When s : σ ∈ Λ→ we say that s has type σ. The abstraction λx. s binds the variable x in s. A variable that is not bound is free. We collect all free variables of s in fv(s) and all bound variables in bv(s). A term is linear if no free variable occurs in it more than once. We adopt the usual conventions for notation: application is left-associative and abstraction is right-associative. We write λx1 x2 . . . xn. s instead of λx1. λx2. . . . λxn. s.

³ Since we do not consider any other kind of types, we just say type instead of simple type from now on.


Term equality is modulo α, that is, renaming of bound variables. Hence we tacitly assume bound variables to be fresh whenever necessary (the variable convention). If we want to state α-conversion explicitly we write s =α t, for instance λx. x =α λy. y. Substitutions are like in the first-order setting, except they now take types and bound variables into account.

Definition 2.39. A substitution is a type-preserving mapping from variables to terms. The capture-avoiding application of a substitution σ to a term t is denoted by tσ.

Capture-avoiding means that we use α whenever necessary. For example (λx. y)[y := x] = (λz. x), since λx. y =α λz. y. See Definition 6.1 for a formal definition of capture-avoiding substitution.
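Implementations often avoid explicit α-renaming by using de Bruijn indices, where bound variables are replaced by numbers pointing to their binders. The following standard sketch is our own illustration of this (the formalization in Chapter 6 uses nominal terms instead); it implements capture-avoiding substitution and β-contraction of a top-level redex:

data Tm = V Int | Lm Tm | Ap Tm Tm deriving (Eq, Show)

-- shift free indices (those ⩾ the cutoff c) by d
shift :: Int -> Int -> Tm -> Tm
shift d c (V k) | k < c     = V k
                | otherwise = V (k + d)
shift d c (Lm t)   = Lm (shift d (c + 1) t)
shift d c (Ap t u) = Ap (shift d c t) (shift d c u)

-- substitute s for index j; capture is impossible, only index bookkeeping
subst :: Int -> Tm -> Tm -> Tm
subst j s (V k) | k == j    = s
                | otherwise = V k
subst j s (Lm t)   = Lm (subst (j + 1) (shift 1 0 s) t)
subst j s (Ap t u) = Ap (subst j s t) (subst j s u)

-- β-contract the redex (λ. t) u
betaTop :: Tm -> Tm -> Tm
betaTop t u = shift (-1) 0 (subst 0 (shift 1 0 u) t)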

Contexts are also adapted to types.

Definition 2.40. We assume special symbols □σ for each type σ, called holes. A context is a term C containing exactly one occurrence of a hole □σ. The replacement of □σ in a context C by a term s of type σ is denoted by C[s].

Note that here s may contain variables bound in C. For instance, for C = λx. □ι and x : ι we have C[x] = λx. x. Lambda terms are rewritten using β-reduction and η-expansion.

Definition 2.41. The β-rule is defined by the following schema:

(λx. s) t → s [x := t]

Here the expression (λx. s) t is called a β-redex. The β-step relation, denoted by s →β t, is the smallest relation that contains the β-rule and is closed under contexts.

It is well-known that β-reduction is terminating and confluent in the simply typed λ-calculus [13]. We denote the unique β-normal form of a term s by s ↓β. Additionally we will use η-expansion. Since, in the presence of β-reduction, η-expansion is not terminating in general, we use a restricted version, which guarantees termination.

Definition 2.42. Let C be a context. The η-expansion relation is defined by

C[s] →η C[λx. s x] if s is of functional type σ → τ, x : σ is a fresh variable, and no β-redex is created.


Avoiding creation of β-redexes ensures termination of →β ∪ →η. Note that no β-redex is created if s is neither an abstraction nor the left-hand side of an application. That is, the two situations s t →η (λx. s x) t →β s t and λy. s[y] →η λx. (λy. s[y]) x →β λx. s[x] =α λy. s[y] are avoided. Using this η-expansion every term s has a unique η-long form, which we denote by s ↑η. The unique η-long β-normal form of s is denoted by s ↕βη. One can think of η-expansion as making the arguments of a term with functional type explicit by using extensionality. In particular for a term s of type σ1 → · · · → σn → τ we have s ↑η = λx1 . . . xn. s x1 ··· xn, which when applied to arguments t1, . . . , tn, β-reduces to the expected s t1 ··· tn. The ARS associated with the simply typed lambda calculus is (Λ→, →β ∪ →η).

Based on the simply typed λ-calculus we now introduce higher-order rewriting, where terms additionally contain typed function symbols, which will get their meaning by higher-order rewrite rules.

2.5 Higher-Order Rewriting

We are now ready to introduce higher-order rewriting. To extend first-order term rewriting with a binding mechanism and functional variables we use a metalanguage and express all binding mechanisms via this metalanguage. Following Nipkow [64, 70] we use the simply-typed lambda calculus with constants as metalanguage. Following van Oostrom and van Raamsdonk [77, 87] we refer to this metalanguage as the substitution calculus. Another possibility would be to use the untyped lambda-calculus using complete developments, which yields the closely related formalism of Combinatory Reduction Systems [53].

Definition 2.43. A signature is a set of typed constants F. The set of pre-terms is defined inductively by extending the rules for typed λ-terms from Definition 2.38 with

f : σ ∈ F ⟹ f : σ

A term is a pre-term in η-long β-normal form, i.e., every pre-term s corresponds to a unique term s ↕βη.⁴

⁴ An alternative, equivalent definition is to define terms as equivalence classes modulo βη, choosing the normalized pre-term as representative.


To ease readability terms are often written in algebraic notation. That is, a nested application s t1 ··· tn is written as s(t1, . . . , tn). We will make use of this convention throughout the remainder of this thesis.

Definition 2.44. Let s be a term. Then s is of the form λx1 . . . xn. a(s1, . . . , sm) with a ∈ V ∪ F. The top of s is defined as tp(s) = a.

Subterms and positions are defined as follows for higher-order terms.

Definition 2.45. Let s be a term. The set of subterms of s is defined as

Sub(s) = {s} ∪ Sub(t) if s = λx. t
Sub(s) = {s} ∪ Sub(s1) ∪ · · · ∪ Sub(sn) if s = a(s1, . . . , sn)

We write s ⊵ t for t ∈ Sub(s) and s ▷ t for s ⊵ t and s ≠ t. The set of positions of s is defined by

Pos(s) = {ϵ} ∪ {1p | p ∈ Pos(t)} if s = λx. t
Pos(s) = {ϵ} ∪ {ip | 1 ⩽ i ⩽ n and p ∈ Pos(si)} if s = a(s1, . . . , sn)

We again denote the subterm of s at position p ∈ Pos(s) by s|p, i.e.,

s|p = s if p = ϵ
s|p = t|q if p = 1q and s = λx. t
s|p = si|q if p = iq and s = a(s1, . . . , sn)

Definition 2.46. A rewrite rule over F and V is a pair of terms ℓ → r in T (F, V) such that ℓ and r have the same base type and all free variables of r occur as free variables in ℓ and tp(ℓ) is not a variable. A higher-order rewrite system (HRS) consists of a signature F and a set R of rules.

Definition 2.47. For a rule ℓ → r ∈ R, a substitution σ and a context C, the rewrite relation →R is defined by

C[ℓσ ↕βη] →R C[rσ ↕βη]

When this definition was conceived, it was unknown whether higher-order matching and thus this rewrite relation are decidable. Thus attention is commonly restricted to so-called patterns.


Definition 2.48. A term s is called a pattern if for all subterms of s of the form x(t1, . . . , tn) with x ∈ fv(s) and n > 0, all ti are the η-long forms of different bound variables. An HRS R is a pattern rewrite system (PRS) if for all ℓ → r ∈ R, ℓ is a pattern.

Example 2.49. Consider a single base type o, a signature consisting of c : o and f : o → o and variables x : o, y : o, z : o → o, v : o → o → o and w : (o → o) → o. Then λx. f(x), x, λz. w(λx. z(x)) and λx y. v(x, y) are patterns, while z(c), λx. v(x, x), λx y. v(y, c) and λx. w(v(x)) are not.

The main result about patterns is due to Miller [65]. It states that unification, and hence also matching, is decidable for patterns and that if they are unifiable, a most general unifier can be computed. Qian showed that patterns can be unified in linear time [86]. Although Stirling recently showed [100] that higher-order matching is indeed decidable, this does not mean that PRSs are no longer of interest. First, higher-order matching, although decidable, has non-elementary complexity [97, 111]. Moreover, since even second-order unification is undecidable [32], as soon as unification is involved one really needs a restriction, for example to compute critical pairs.

There is another problem concerning confluence with non-pattern systems. For PRSs the restriction fv(ℓ) ⊇ fv(r) guarantees that rewriting does not introduce new variables. This fails for general HRSs. For instance for an HRS R consisting of the single rule f(F (x)) → f(x) we have f(y) →R f(x) using the substitution {F 7→ λz. y}. The map function, applying a function to all elements of a list, can be modeled as a PRS.

Example 2.50 (Cop #430). The signature F consists of

0 : nat
s : nat → nat
nil : natlist
cons : nat → natlist → natlist
map : (nat → nat) → natlist → natlist

The two rewrite rules in R are:

map(λx. F (x), nil) → nil
map(λx. F (x), cons(h, t)) → cons(F (h), map(λx. F (x), t))


Here the types of the variables are F : nat → nat, h : nat, t : natlist and x : nat. Also note that, since terms are in η-long β-normal form, F gets η-expanded to λx. F (x). As an example computation in this PRS consider mapping the identity function over the singleton list containing s(0):

map(λx. x, cons(s(0), nil))
  → cons((λx. x) s(0), map(λx. x, nil)) ↕βη = cons(s(0), map(λx. x, nil))
  → cons(s(0), nil ↕βη) = cons(s(0), nil)

Note that in the first step the term does match the left-hand side of the second rule because (λy. (λx. x) y) →β (λy. y) =α (λx. x) and so

map(λx. F (x), cons(h, t))σ ↕βη = map(λx. x, cons(s(0), nil)) with σ = {F 7→ λx. x, h 7→ s(0), t 7→ nil}.


Chapter 3 Closing Critical Pairs

I have had my results for a long time, but I do not yet know how I am to arrive at them. Carl F. Gauss

In this chapter we present the formalization of three classic confluence results for first-order term rewrite systems. We have formalized proofs showing that (a) linear strongly closed systems, (b) left-linear parallel closed systems, and (c) left-linear almost parallel closed systems are confluent. The former two results are due to Huet [45] and presented in full detail in the textbook of Baader and Nipkow [11, Lemma 6.3 and Section 6.4]. The third result is due to Toyama [107]. In the second part of the chapter we are concerned with a more recent idea due to Oyamaguchi and Hirokawa [82] that can be understood as a combination of Huet's results and Newman's Lemma: one can weaken the joining conditions for the critical pairs if additionally termination is required for the subsystem of rules that are used to perform the joining steps.

Several of the criteria in this chapter are extended to commutation. To this end we first revisit the critical pair lemma for commutation in Section 3.1. In Section 3.2 we show that linear strongly closed rewrite systems are confluent. Linearity is an important limitation, but the result does have its uses [31]. Section 3.3 is devoted to the formalization of the result of Huet that a left-linear rewrite system is confluent if its critical pairs are parallel closed. In Section 3.4 we consider Toyama's generalization of the previous result. Apart from a weaker joinability requirement on overlays, the result is again extended to commutation. Section 3.5 is devoted to critical pair closing systems. In Section 3.6 we explain what is needed for the automatic certification of confluence proofs that employ the formalized techniques. In the final section we conclude with an outlook on future work, in particular the challenges that need to be overcome when

extending the results from parallel closed rewrite systems to development closed higher-order rewrite systems [78]. The main IsaFoR theories relevant in this chapter are Strongly_Closed.thy, for the result on strongly closed rewrite systems, Parallel_Closed.thy for results on parallel closed systems (where we make heavy use of multihole contexts, cf. Multihole_Context.thy), Critical_Pair_Closing_Systems.thy for critical-pair-closing systems, and Critical_Pair_Closure_Impl.thy for the executable check functions.

3.1 Critical Pair Lemma

The famous critical pair lemma states that in order to check local confluence it suffices to check joinability of all critical pairs.

Lemma 3.1 (Knuth and Bendix [54], Huet [45]). A TRS is locally confluent if and only if all its critical pairs are joinable.

Together with Newman's Lemma this yields decidability of confluence for finite terminating TRSs: for every critical pair s ←⋊→ t compute terms s′ and t′ with s →! s′ and t →! t′, which is possible since the TRS is terminating. If s′ = t′ for all critical pairs, then the TRS is locally confluent by the critical pair lemma and thus also confluent by Newman's Lemma. If one finds a critical pair with s′ ≠ t′ then the TRS is not confluent, since s′ ∗← · →∗ t′ but s′ and t′ are not joinable, because they are distinct normal forms. (A small code sketch of this decision procedure is given after Example 3.2.)

For commutation the situation is different, however—without further restrictions the critical pair lemma does not hold.

Example 3.2. Consider the two one-rule TRSs R and S:

R : f(x, x) → x S : a → b

They do not admit any critical pairs, but do not satisfy local commutation, as witnessed by the non-commuting peak

a R← f(a, a) →S f(b, a)

The problem is easy to see. Applying the second rule to one of the arguments of f(a, a) destroys the possibility of applying the first rule. This effect also happens in the confluence setting, but there one can do additional steps to re-balance

the term, which, when dealing with commutation, is not possible. To recover the result we need to restrict attention to left-linear TRSs.1
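As an aside, the decision procedure for finite terminating TRSs described at the start of this section can be rendered in a few lines of Haskell. This is a purely illustrative sketch, not part of the formalization: the helpers criticalPairs (which needs a unification algorithm) and normalize (repeated rewriting to a normal form, terminating by assumption) are left abstract.

data Term = Var String | Fun String [Term] deriving (Eq, Show)
type Rule = (Term, Term)
type TRS  = [Rule]

criticalPairs :: TRS -> TRS -> [(Term, Term)]
criticalPairs = undefined  -- assumed: all critical pairs, computed via unification

normalize :: TRS -> Term -> Term
normalize = undefined      -- assumed: some normal form (exists by termination)

-- confluence check for finite terminating TRSs (Lemma 3.1 plus Newman's Lemma):
-- the TRS is confluent iff the normal forms agree for every critical pair
confluentTerminating :: TRS -> Bool
confluentTerminating trs =
  and [ normalize trs s == normalize trs t | (s, t) <- criticalPairs trs trs ]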

Lemma 3.3. Left-linear TRSs R and S locally commute if their critical pairs are joinable:

(R←⋊→S) ∪ (R←⋉→S) ⊆ →∗S · ∗R←

Proof. Consider a peak t R← s →S u. Then there are positions p1, p2 ∈ Pos(s), substitutions σ1, σ2 and rules ℓ1 → r1 ∈ R and ℓ2 → r2 ∈ S with s|p1 = ℓ1σ1, s|p2 = ℓ2σ2 and t = s[r1σ1]p1, u = s[r2σ2]p2. We show the existence of a term v with t →∗S v and u →∗R v by analyzing the positions p1 and p2. If they are parallel then t →S t[r2σ2]p2 = u[r1σ1]p1 R← u. If they are not parallel then one is above the other. Without loss of generality assume p1 ⩽ p2, then there is a position q with p2 = p1q. If q ∈ PosF(ℓ1) then ℓ1|qσ1 = ℓ2σ2 and thus there is a critical pair r1µ R←⋉→S ℓ1µ[r2µ]q, which is joinable by assumption, and by closure under contexts and substitutions also t and u are joinable. If q ∉ PosF(ℓ1) the step in S happens in the substitution. That is, there is a variable x and positions q1, q2 with q = q1q2, q1 ∈ Pos(ℓ1), ℓ1|q1 = x. Suppose |r1|x = n, then t →ⁿS v R← u for v = s[r1τ]p1 with τ(y) = σ1(y) for all y ∈ V \ {x} and τ(x) = σ1(x)[r2σ2]q2. Note that u = s[ℓ1τ]p1 due to linearity of ℓ1, see also the proof of Lemma 3.6 below.

Analyzing a local peak in this way—the two redexes are either parallel, give rise to a critical pair, or one is in the substitution of the other—is a common theme. We will use it again, e.g., in the proof of strong closedness and in Section 4.2.1, where more details about the joining sequences are needed for use in the context of decreasing diagrams.

3.2 Strongly Closed Critical Pairs

The next criterion we consider is due to Huet [45] and is based on the observation that in a linear rewrite system it suffices to have strong-confluence-like joins for all critical pairs in order to guarantee strong confluence of the rewrite system.

1The problems with non-left-linearity run even deeper: commutation is not preserved under signature extension for non-left-linear TRSs [39]—but for left-linear ones it is.


Definition 3.4. A TRS R is strongly closed if every critical pair s ←⋊→ t of R satisfies both s →= · ∗← t and s →∗ · =← t.

The following folklore lemma tells us that in a linear term applying a substitution can be done by replacing the one subterm where the variable occurs and applying the remainder of the substitution.

Lemma 3.5. Let t be a linear term and let p ∈ PosV (t) be a position with t|p = x. Then for substitutions σ and τ with σ(y) = τ(y) for all y ∈ Var(t) different from x we have tτ = tσ[τ(x)]p.
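For concreteness, the following Haskell fragment (reusing the Term type from the earlier sketch) implements substitution application and replacement at a position, with 1-indexed positions as in the rest of the thesis; the final comment checks Lemma 3.5 on a small linear term.

type Pos   = [Int]
type Subst = [(String, Term)]

applySubst :: Subst -> Term -> Term
applySubst sigma (Var x)    = maybe (Var x) id (lookup x sigma)
applySubst sigma (Fun f ts) = Fun f (map (applySubst sigma) ts)

-- replace the subterm at position p (1-indexed) by u
replaceAt :: Term -> Pos -> Term -> Term
replaceAt _ [] u = u
replaceAt (Fun f ts) (i : p) u =
  Fun f [ if j == i then replaceAt tj p u else tj | (j, tj) <- zip [1 ..] ts ]
replaceAt t _ _ = t  -- position not in term; a sketch, so no error handling

-- Lemma 3.5 for t = f(x, y), p = [1], sigma = {x -> a, y -> b}, tau = {x -> c, y -> b}:
--   applySubst tau t == replaceAt (applySubst sigma t) [1] (Fun "c" [])
-- both sides evaluate to Fun "f" [Fun "c" [], Fun "b" []]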

The proof that linear strongly closed systems are strongly confluent is very similar to that of the critical pair lemma, analyzing the relative positions of the rewrite steps in a peak. The next lemma, which appears implicitly in Huet's proof of Corollary 3.7, takes care of the case where one position is above the other.

Lemma 3.6. Let R be a linear, strongly closed TRS and assume s →R t with rule ℓ1 → r1 and substitution σ1 at position p1 and let s →R u with rule ℓ2 → r2 and substitution σ2 at position p2 with p1 ⩽ p2. Then there are terms v and w with t →∗R v =R← u and t →=R w ∗R← u.

Proof. By assumption we have s|p1 = ℓ1σ1, s|p2 = ℓ2σ2, t = s[r1σ1]p1 , and u = s[r2σ2]p2 . Since p1 ⩽ p2 there is a position q with p2 = p1q and (ℓ1σ1)|q = ℓ2σ2.

So we can write u as u = s[(ℓ1σ1)[r2σ2]q]p1. We now distinguish whether the step from s to u overlaps with the one from s to t or takes place in the substitution, i.e., we perform a case analysis on q ∈ PosF(ℓ1). If that is the case then ℓ1|qσ1 = ℓ2σ2 and thus there is a critical pair ℓ1µ[r2µ]q ←⋊→ r1µ. But then by assumption we obtain terms v and w with r1µ →∗R v =R← ℓ1µ[r2µ]q and r1µ →=R w ∗R← ℓ1µ[r2µ]q. Closure under contexts and substitutions yields the desired result. If q ∉ PosF(ℓ1) we obtain positions q1, q2 and a variable x with q = q1q2, q1 ∈ Pos(ℓ1), ℓ1|q1 = x, and (xσ1)|q2 = ℓ2σ2. Define the substitution τ as

τ(y) = (xσ1)[r2σ2]q2 if y = x, and τ(y) = yσ1 otherwise

Since ℓ1 is linear by assumption we have ℓ1τ = (ℓ1σ1)[(xσ1)[r2σ2]q2]q1 using Lemma 3.5 and hence also ℓ1τ = (ℓ1σ1)[r2σ2]q. But then u = s[ℓ1τ]p1 and thus u →R s[r1τ]p1. We conclude by showing t →=R s[r1τ]p1. If x ∉ Var(r1) then r1τ = r1σ1 and thus t = s[r1τ]p1. Otherwise we obtain a position q′ ∈ Pos(r1) with r1|q′ = x. Then, again using linearity and Lemma 3.5 we have r1τ = (r1σ1)[(xσ1)[r2σ2]q2]q′ and hence r1τ = (r1σ1)[r2σ2]q′q2. Since also r1σ1 = (r1σ1)[ℓ2σ2]q′q2 we have r1σ1 →R r1τ and by closure under context also t →R s[r1τ]p1.

Now the main result of this section follows easily.

Corollary 3.7 (Huet [45]). If a TRS R is linear and strongly closed then →R is strongly confluent.

Proof. Assume s →R t and s →R u. Then there are positions p1, p2 ∈ Pos(s), substitutions σ1, σ2 and rules ℓ1 → r1, ℓ2 → r2 in R with s|p1 = ℓ1σ1, s|p2 = ℓ2σ2 and t = s[r1σ1]p1, u = s[r2σ2]p2. We show existence of a term v with t →∗ v and u →= v by analyzing the positions p1 and p2. If they are parallel then t → t[r2σ2]p2 = u[r1σ1]p1 ← u. If they are not parallel then one is above the other and we conclude by Lemma 3.6.

Then by Lemma 2.11 R is also confluent.

Example 3.8 (Cop #103). Consider the TRS R consisting of the two rewrite rules

f(f(x, y), z) → f(x, f(y, z))
f(x, y) → f(y, x)

Note that rules of this shape are of special interest—they state that f is associative and commutative (AC). There are four non-trivial critical pairs:

f(f(x, f(y, z)), v) ←⋊→ f(f(x, y), f(z, v))
f(x, f(y, z)) ←⋊→ f(z, f(x, y))
f(z, f(x, y)) ←⋊→ f(x, f(y, z))
f(f(y, x), z) ←⋊→ f(x, f(y, z))

Because of the rewrite sequences

f(f(x, f(y, z)), v) → f(x, f(f(y, z), v)) → f(x, f(y, f(z, v)))
f(f(x, y), f(z, v)) → f(x, f(y, f(z, v)))
f(f(x, y), f(z, v)) → f(f(z, v), f(x, y)) → f(f(v, z), f(x, y)) → f(v, f(z, f(x, y))) → f(f(z, f(x, y)), v) → f(f(f(x, y), z), v) → f(f(x, f(y, z)), v)
f(z, f(x, y)) → f(f(x, y), z) → f(x, f(y, z))
f(x, f(y, z)) → f(f(y, z), x) → f(y, f(z, x)) → f(f(z, x), y) → f(z, f(x, y))
f(f(y, x), z) → f(f(x, y), z) → f(x, f(y, z))
f(x, f(y, z)) → f(f(y, z), x) → f(f(z, y), x) → f(z, f(y, x)) → f(f(y, x), z)

R is strongly closed. Since it is also linear, we obtain confluence.

The next example shows how to apply the criterion to a TRS that does not fulfill the variable conditions.

Example 3.9 (Cop #394). Consider the linear TRS R consisting of the following three rules:

a → f(x) f(x) → b x → f(g(x))

There are four critical pairs modulo symmetry:

f(y) ←⋊→ f(x)    f(g(a)) ←⋊→ f(x)
b ←⋊→ b    f(g(f(x))) ←⋊→ b

Using the second rule it is easy to see that all of them are strongly closed. Hence R is confluent.

In the next section we consider a criterion that drops the requirement that R be right-linear.
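Definition 3.4 also suggests a simple, depth-bounded (and hence incomplete) automatic check. The sketch below assumes a function successors computing all one-step rewrites of a term and reuses criticalPairs from the earlier fragment.

successors :: TRS -> Term -> [Term]
successors = undefined  -- assumed: all one-step rewrites of a term

-- all terms reachable in at most k steps
reachable :: Int -> TRS -> Term -> [Term]
reachable 0 _   t = [t]
reachable k trs t = t : concatMap (reachable (k - 1) trs) (successors trs t)

-- Definition 3.4, with both joins searched up to depth k
stronglyClosed :: Int -> TRS -> Bool
stronglyClosed k trs = all ok (criticalPairs trs trs)
  where
    ok (s, t)    = halfJoin s t && halfJoin t s
    -- is there a u with s ->= u and t ->* u?
    halfJoin s t = any (`elem` reachable k trs t) (s : successors trs s)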

3.3 Parallel Closed Critical Pairs

The criterion from the previous section requires the TRS to be linear and while left-linearity is a common restriction, right-linearity is a rather unnatural one. Thus we turn our attention to criteria for left-linear systems that change the restriction on the joinability of critical pairs. The crucial observation is that in a non-right-linear system executing the upper step in a variable overlap can duplicate the redex below. Thus to join such a situation multiple steps might be necessary, all of which take place at parallel positions. Consequently we consider parallel rewriting. The following definition describes the new joinability condition.

Definition 3.10. A TRS R is parallel closed if every critical pair s ←⋊→ t of R satisfies s −→∥ R t.


Together with left-linearity this guarantees the diamond property of the parallel rewrite relation.

Theorem 3.11 (Huet [45]). If a TRS R is left-linear and parallel closed then −→∥ R has the diamond property.

The proof of this theorem is much more involved than the one for strongly closed systems. The first observation is that we will now have to consider a peak of parallel steps, in order to show the diamond property of −→∥. In case the two parallel steps are orthogonal to each other, they simply commute by the well-known Parallel Moves Lemma. However, if they do interfere the assumption of the theorem only allows us to close a single critical pair to reduce the amount of interference. Thus we will have to use some form of induction on how much the patterns of the two parallel steps overlap. Figure 4 shows the setting for the overlapping case. The horizontal parallel step, described by the horizontally striped redexes, and the vertical step, described by the vertically striped redexes, overlap. Hence there is a critical pair, say the one obtained from overlapping the leftmost vertical redex with the leftmost horizontal redex. Then, by assumption there is a closing parallel step, which, since it takes place inside the critical pair, can be combined with the remaining horizontal redexes to obtain a new peak with less overlap, which can be closed by the induction hypothesis.

Figure 4: Overview of the proof of Theorem 3.11.

When making this formal we identified two crucial choices. First the representation of the parallel rewrite relation and second the way to measure the amount of overlap between two parallel steps with the same source. Huet in his original proof heavily uses positions. That is, a parallel step is defined as multiple single steps that happen at parallel positions and for measuring overlap he takes the sum of the sizes of the subterms that are affected by both steps. More precisely, writing −→∥P for a parallel step that takes place at positions in a set P, for a peak t P1←−∥ s −→∥P2 u he uses the measure

∑q∈Q |s|q|    where Q = {p1 ∈ P1 | p2 ⩽ p1 for some p2 ∈ P2} ∪ {p2 ∈ P2 | p1 ⩽ p2 for some p1 ∈ P1}

This formulation is also adopted in the textbook by Baader and Nipkow [11]. Consequently, when starting the present formalization, we also adopted this definition. However, the bookkeeping required by working with sets of positions as well as formally reasoning about this measure in Isabelle became so convoluted that it very much obscured the ingenuity and elegance of Huet's original idea while at the same time defeating our formalization efforts. Hence in the end we had to adopt a different approach.

Toyama [107], in the proof of his extension of Huet's result, does not use positions at all and instead relies on (multihole) contexts, which means a parallel step is then described by a context and a list of root steps that happen in the holes. To measure overlap he collects those redexes that are subterms of some redex in the other step, i.e., decorating the parallel rewrite relation with the redexes contracted in the step, for a peak t (t1,…,tn)←−∥ s −→∥(u1,…,um) u Toyama's measure is

∑r∈S |r|    where S = {ui | ui ⊴ tj for some tj} ∪ {tj | tj ⊴ ui for some ui}

However, this measure turns out to be problematic as shown in the following example.

Example 3.12 (Cop #503). Consider the TRS consisting of the following five rewrite rules:

f(a, a, b, b) → f(c, c, c, c)
a → b    a → c    b → a    b → c


Then we have the peak f(b, b, a, a) (a,a,b,b)←−∥ f(a, a, b, b) −→∥(f(a,a,b,b)) f(c, c, c, c). The measure of this peak according to the definition above is 2, since S = {a, b} ∪ ∅. Now after splitting off one of the four critical steps—it does not matter which one—and closing the corresponding critical pair, we arrive at the new peak

f(b, b, a, a) (a,b,b)←−∥ f(b, a, b, b) −→∥(b,a,b,b) f(c, c, c, c)

Its measure is still 2, since S = {a, b} ∪ {a, b}. Note that using multisets instead of sets does not help, since then the measure of the initial peak is 4 (S = {a, a, b, b}) and of the new peak, after closing the critical pair, it is 7, since S = {a, b, b} ⊎ {b, a, b, b} (and even if we take into account that three of the redexes are counted twice we still get 4). The problem is that in the new peak the redex at position 1 of the closing step is counted again, because b is a subterm of one of the redexes of the other step. Hence it is crucial to only count redexes at overlapping positions.

To remedy this situation we will collect all overlapping redexes of a peak in a multiset. These multisets will then be compared by ▷mul, the multiset extension of the proper superterm relation. We start by characterizing parallel rewrite steps using multihole contexts.

Definition 3.13. We write s −→∥R(C, a1, …, an) t if s = C[a1, …, an] and t = C[b1, …, bn] for some b1, …, bn with ai →ϵR bi for all 1 ⩽ i ⩽ n.
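The following Haskell sketch (Term as before) mirrors multihole contexts and the decomposition s = C[a1, …, an] underlying Definition 3.13; partitionHoles corresponds to the partition_holes function of the formalization, and ground subterms are embedded as hole-free contexts.

data MCtxt = MHole | MVar String | MFun String [MCtxt] deriving (Eq, Show)

holes :: MCtxt -> Int
holes MHole       = 1
holes (MVar _)    = 0
holes (MFun _ cs) = sum (map holes cs)

-- split a flat list of terms into the chunks filling each argument context
partitionHoles :: [MCtxt] -> [Term] -> [[Term]]
partitionHoles []       _  = []
partitionHoles (c : cs) ts = take n ts : partitionHoles cs (drop n ts)
  where n = holes c

-- C[a1, ..., an]
fill :: MCtxt -> [Term] -> Term
fill MHole       [t] = t
fill (MVar x)    []  = Var x
fill (MFun f cs) ts  = Fun f (zipWith fill cs (partitionHoles cs ts))
fill _ _             = error "number of terms does not match number of holes"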

To save space we sometimes abbreviate the list of terms a1, …, an by a and denote the step by s −→∥R(C, a) t, leaving the length of the list of redexes implicit. The following expected correspondence is easily shown by induction.


Lemma 3.14. We have s −→∥R t if and only if s −→∥R(C, s) t for some C and s.

Now we can formally measure the overlap between two parallel rewrite steps by collecting those redexes that are below some redex in the other step.

Definition 3.15. The overlap between two co-initial parallel rewrite steps is defined by the following equations

▲((□, a)←−∥ s −→∥(□, b)) = {s}
▲((C, a1, …, ac)←−∥ s −→∥(□, b)) = {a1, …, ac}
▲((□, a)←−∥ s −→∥(D, b1, …, bd)) = {b1, …, bd}
▲((f(C1, …, Cn), a)←−∥ f(s1, …, sn) −→∥(f(D1, …, Dn), b)) = ⋃ᵢ₌₁ⁿ ▲((Ci, ai)←−∥ si −→∥(Di, bi))

where a1, …, an = a and b1, …, bn = b are partitions of a and b such that the length of ai and bi matches the number of holes in Ci and Di, for all 1 ⩽ i ⩽ n.

Given functionality to do the partitioning, this definition is straightforward to write down in Isabelle; the formal text is shown in Listing 5. Here mset converts a list to a multiset and # is used in Isabelle/HOL for denoting (operations on) multisets, e.g. {#} denotes the empty multiset. We also use slightly different pattern matching compared to Definition 3.15: the case where both contexts are the hole is covered by the first equation of Listing 5, while the cases for variables were added to make the function total.

Example 3.16. Applying this definition to the two peaks from Example 3.12 yields

▲((f(□,□,□,□), a,a,b,b)←−∥ f(a, a, b, b) −→∥(□, f(a,a,b,b))) = {a, a, b, b}
▲((f(b,□,□,□), a,b,b)←−∥ f(b, a, b, b) −→∥(f(□,□,□,□), b,a,b,b)) = {a, b, b}

and {a, a, b, b} ▷mul {a, b, b} as desired. Note that our definition of ▲ is in fact an over-approximation of the actual overlap between the steps. That is because we do not split redexes into the left-hand side of the applied rule and a substitution but take the redex as a whole. The following example illustrates the effect.


▲ (MHole, [l]) (C, ls) = mset ls
▲ (C, ls) (MHole, [l]) = mset ls
▲ (MFun f Cs, ls) (MFun g Ds, ls′) =
  ⋃# {# ▲ (Cs ! i, partition_holes ls Cs ! i)
        (Ds ! i, partition_holes ls′ Ds ! i). i ∈# mset [0..<length Cs] #}

Listing 5: The overlap ▲ of two parallel steps in Isabelle.
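For readers less familiar with Isabelle syntax, the same function reads as follows in Haskell (continuing the multihole-context sketch above, with lists standing in for multisets):

-- a parallel step as in Definition 3.13: its context and contracted redexes
type PStep = (MCtxt, [Term])

-- overlap of a co-initial peak (Definition 3.15); for co-initial peaks the
-- two contexts agree on function symbols down to the holes
overlap :: PStep -> PStep -> [Term]
overlap (MHole, _)      (_, bs)         = bs
overlap (_, as)         (MHole, _)      = as
overlap (MVar _, _)     _               = []
overlap _               (MVar _, _)     = []
overlap (MFun _ cs, as) (MFun _ ds, bs) =
  concat (zipWith overlap (zip cs (partitionHoles cs as))
                          (zip ds (partitionHoles ds bs)))

-- for the two peaks of Example 3.12 this yields [a, a, b, b] and [a, b, b],
-- in accordance with Example 3.16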

Example 3.17. Consider the rewrite system consisting of the two rules f(x) → x and a → b and the peak a ← f(a) → f(b). We have

▲((□, f(a))←−∥ f(a) −→∥(f(□), a)) = {a}

although the two steps do not overlap—the step to the right takes place completely in the substitution of the one to the left (in fact the rewrite system in question is orthogonal). However, since we are dealing with parallel rewriting, no problems arise from this over-approximation. This changes when extending the results to development steps; see Section 3.7 for further discussion. The following properties of ▲ turned out to be crucial in our proof of Theorem 3.11.

Lemma 3.18. For a peak (C, a)←−∥ s −→∥(D, b) the following properties of ▲ hold.

• If s = f(s1, …, sn) with C = f(C1, …, Cn) and D = f(D1, …, Dn) then ▲((Ci, ai)←−∥ si −→∥(Di, bi)) ⊆ ▲((C, a)←−∥ s −→∥(D, b)) for all 1 ⩽ i ⩽ n.

• The overlap is bounded by a, i.e., we have {a} ▷=mul ▲((C, a)←−∥ s −→∥(D, b)).

• The overlap is symmetric, i.e., ▲((C, a)←−∥ s −→∥(D, b)) = ▲((D, b)←−∥ s −→∥(C, a)).

There is one more high-level difference between the formalization and the paper proof. In the original proof one needs to combine the closing step for the critical pair with the remainder of the original step in order to obtain a new peak, to which the induction hypothesis can then be applied. This reasoning can be avoided by using an additional induction on the source of the peak. Then the case where neither of the two parallel steps is a root step (and thus a single step) can be discharged by the induction hypothesis of that induction. The following technical lemma tells us that a parallel rewrite step starting from sσ is either inside s, i.e., we can split off a critical pair, or we can do the step completely inside σ.

Lemma 3.19. Let s be a linear term. If sσ −→∥R(C, s1, …, sn) t then either t = sτ for some substitution τ such that xσ −→∥ xτ for all x ∈ Var(s), or there exist a context D, a non-variable term s′, a rule ℓ → r ∈ R, a substitution τ, and a multihole context C′ such that s = D[s′], s′σ = ℓτ, Dσ[rτ] = C′[s1, …, si−1, si+1, …, sn] and t = C′[t1, …, ti−1, ti+1, …, tn] for some 1 ⩽ i ⩽ n.

Proof. By induction on s. If s = x for a variable x set τ = {x ↦ t}. Otherwise s = f(s′1, …, s′n); proceed by case analysis on C. If C = □ then sσ = ℓτ and t = rτ for some ℓ → r ∈ R; set s′ = s and D = □. Otherwise we obtain the desired result for all s′1, …, s′n by the induction hypothesis. If one of the steps in the arguments contains a critical pair so does the whole step. Otherwise, since s is linear, the parallel steps in the arguments can be combined to a step on the whole term.

We are now ready to prove the main result of this section. To ease presentation, the following proof does use the condition that the left-hand sides of rewrite rules are not variables. By employing additional technical case analyses this restriction can be easily dropped. We refer to the formalization for details.

Proof of Theorem 3.11. Assume t (C, a)←−∥ s −→∥(D, b) u. We show t −→∥ v ←−∥ u for some term v by well-founded induction on the overlap between the two parallel steps using the order ▷mul and continue by induction on s with respect to ▷. If s = x for some variable x then t = u = x. So let s = f(s1, …, sn). We distinguish four cases.


(a) If C = f(C1, …, Cn) and D = f(D1, …, Dn) then t = f(t1, …, tn) and u = f(u1, …, un) and we obtain partitions a1, …, an = a and b1, …, bn = b of a and b with

ti (Ci, ai)←−∥ si −→∥(Di, bi) ui

for all 1 ⩽ i ⩽ n. Then, since we have ▲((Ci, ai)←−∥ si −→∥(Di, bi)) ⊆ ▲((C, a)←−∥ s −→∥(D, b)) by Lemma 3.18 and thus also ▲((C, a)←−∥ s −→∥(D, b)) ▷=mul ▲((Ci, ai)←−∥ si −→∥(Di, bi)), we can apply the inner induction hypothesis and obtain terms vi with ti −→∥ vi ←−∥ ui for all 1 ⩽ i ⩽ n and thus we have t −→∥ f(v1, …, vn) ←−∥ u.

(b) If C = D = □ then both steps are root steps and thus single rewrite steps and we can write t = r1σ1 ϵ← ℓ1σ1 = s = ℓ2σ2 →ϵ r2σ2 = u. Hence, since ℓ1σ1 = ℓ2σ2, there is a critical pair r′1µ ←⋉⋊→ r′2µ for variable disjoint variants ℓ′1 → r′1, ℓ′2 → r′2 of ℓ1 → r1, ℓ2 → r2 with µ a most general unifier of ℓ′1 and ℓ′2. Then by assumption r′1µ −→∥ r′2µ and by closure under substitutions also t = r1σ1 −→∥ r2σ2 = u.

(c) If C = f(C1, …, Cn) and D = □ then the step to the right is a single root step and we write

t = f(t1, …, tn) (C, a)←−∥ s = ℓσ →ϵ rσ = u

Since ℓ is linear by assumption, we can apply Lemma 3.19 and either obtain τ with t = ℓτ and xσ −→∥ xτ for all x ∈ Var(ℓ) or a critical pair.

• In the first case define δ(x) = τ(x) if x ∈ Var(ℓ) and δ(x) = σ(x) otherwise. We have t = ℓτ = ℓδ by definition of δ and hence t −→∥ rδ by a single root step. Moreover we have u = rσ −→∥ rδ since xσ −→∥ xδ for all variables x ∈ Var(r). This holds because either x ∈ Var(ℓ) and then xσ −→∥ xτ = xδ, or x ∉ Var(ℓ) and then xσ = xδ.


• In the second case Lemma 3.19 yields a context E, a non-variable term ℓ′′, a rule ℓ′ → r′ ∈ R, a substitution τ, and a multihole context C′ such that Eσ[r′τ] = C′[a1, …, ai−1, ai+1, …, ac], t = C′[a′1, …, a′i−1, a′i+1, …, a′c], ℓ = E[ℓ′′], and ℓ′′σ = ℓ′τ for some 1 ⩽ i ⩽ c. Since ℓ′′σ = ℓ′τ there is a critical pair Eµ[r′µ] ←⋊→ rµ and by assumption Eµ[r′µ] −→∥ rµ and thus also Eσ[r′τ] −→∥ rσ. That is, we obtain a new peak

t (C′, a′)←−∥ Eσ[r′τ] −→∥ rσ

with a′ = a1, …, ai−1, ai+1, …, ac. Since

▲((C, a)←−∥ s −→∥(□, ℓσ)) = {a1, …, ac} ▷mul {a1, …, ai−1, ai+1, …, ac} ▷=mul ▲((C′, a′)←−∥ Eσ[r′τ] −→∥)

by Lemma 3.18, we can apply the induction hypothesis and obtain v with t −→∥ v ←−∥ rσ = u.

(d) The final case, where D = f(D1,...,Dn) and C = □, is completely symmetric.

Finally, by Lemma 2.11 and Lemma 2.33 we obtain confluence of →R.

Example 3.20 (Cop #504). Consider the TRS R consisting of the following three rewrite rules:

x + y → y + x
(x + y) ∗ z → (x ∗ z) + (y ∗ z)
(y + x) ∗ z → (x ∗ z) + (y ∗ z)

Since the four critical pairs of R

(y + x) ∗ z ←⋊→ (x ∗ z) + (y ∗ z)
(y ∗ z) + (x ∗ z) ←⋊→ (x ∗ z) + (y ∗ z)
(x + y) ∗ z ←⋊→ (x ∗ z) + (y ∗ z)
(x ∗ z) + (y ∗ z) ←⋊→ (y ∗ z) + (x ∗ z)

are parallel closed, R is confluent.


3.4 Almost Parallel Closed Critical Pairs

In this section we consider two extensions of Huet's result due to Toyama [107]. The first one allows us to weaken the joining condition for some critical pairs. When carefully examining the proof of Theorem 3.11 one realizes that in the case where both steps of the peak are single root steps, i.e., the case where C = D = □, the induction hypothesis does not need to be applied, since closing the critical pair immediately closes the whole peak. This suggests that the joining condition can be weakened for overlays. A first idea could be to take ←⋉⋊→ ⊆ −→∥ · ←−∥ since then we would still have the diamond property in the overlay case. However, Toyama realized that one can do even better by weakening the diamond property to strong confluence. The following definition captures the new conditions.

Definition 3.21. A TRS R is almost parallel closed if s −→∥ · ∗← t for all overlays s ←⋉⋊→ t and s −→∥ t for all inner critical pairs s ←·⋊→ t.

Using exactly the same proof structure as before we could now prove strong confluence of −→∥ for left-linear almost parallel closed systems. However, considering Toyama's second extension of Theorem 3.11, we will prove the theorem in the more general setting of commutation.

Theorem 3.22 (Toyama [107]). Let R1 and R2 be left-linear TRSs. If we have s −→∥2 · ∗1← t for all critical pairs s 1←⋊→2 t and additionally s −→∥1 t for all inner critical pairs s 2←·⋊→1 t, then −→∥1 and −→∥2 strongly commute.

Proof Adaptations. We only highlight the differences to the proof of Theorem 3.11 and refer to the formalization for the full proof details. Assume

t (C, a)1←−∥ s −→∥2(D, b) u

We show t −→∥2 v ∗1← u for some term v. We apply the same inductions and case analyses as before. The cases C = f(C1, …, Cn), D = f(D1, …, Dn) and C = D = □ require no noteworthy adaptation. The main difference is that now the cases D = f(D1, …, Dn), C = □ and C = f(C1, …, Cn), D = □ become asymmetric for the critical pair case—the corresponding diagrams are shown in Figure 5.

Figure 5: Asymmetry in the proof of Theorem 3.22.

First, suppose C = f(C1, …, Cn) and D = □, write

t = f(t1, …, tn) (C, a)1←−∥ s = ℓσ →ϵ2 rσ = u

and assume there is a critical pair according to Lemma 3.19. That is, we obtain Eµ[r′µ] 1←⋊→2 rµ with Eσ[r′τ] −→∥1 t and by assumption we obtain a v such that Eσ[r′τ] −→∥2 v ∗1← rσ. Then using the same reasoning as before, for the new peak

t (C′, a′)1←−∥ Eσ[r′τ] −→∥2 v

we have ▲((C, a)1←−∥ s −→∥2(□, ℓσ)) ▷mul ▲((C′, a′)1←−∥ Eσ[r′τ] −→∥2) and can apply the induction hypothesis to obtain a v′ with t −→∥2 v′ ∗1← v, which combined with u = rσ →∗1 v concludes this case. In the second case, i.e., when D = f(D1, …, Dn) and C = □, observe that the critical pair we obtain is an inner critical pair between R2 and R1, since D ≠ □. Thus, after applying the assumption for critical pairs 2←·⋊→1, the proof is the same as for Theorem 3.11.

Instantiating R1 and R2 with the same TRS R yields the corresponding result for confluence.

Corollary 3.23 (Toyama [107]). If the TRS R is left-linear and almost parallel closed then −→∥R is strongly confluent.

Proof. Immediate from the definition of almost parallel closed, Theorem 3.22, and the fact that s ←⋊→ t if and only if s ←⋉⋊→ t or s ←·⋊→ t.

Example 3.24. Recall the rewrite system from Example 3.12. One easily verifies that all its critical pairs are almost parallel closed, and hence it is confluent.


3.5 Critical Pair Closing Systems

The key idea in this section, originally due to Oyamaguchi and Hirokawa [82], is to consider the subsystem of rewrite rules employed in the joining sequences of critical pairs. Existence of such a subsystem is a necessary condition for commutation of left-linear TRSs. It can be strengthened to sufficient conditions by imposing restrictions like linearity or termination. The resulting criteria generalize parallel closedness and strong closedness.

Definition 3.25. Let R and S be TRSs. A subsystem C of R ∪ S is critical-pair-closing if

(R←⋊→S) ∪ (R←⋉→S) ⊆ →∗S ∩ C · ∗R ∩ C←

Existence of a critical-pair-closing system generalizes the critical pair lemma.

Theorem 3.26. Left-linear TRSs that admit a critical-pair-closing system locally commute.

Proof. Immediate from Lemma 3.3 and the definition of critical-pair-closing system.

Additionally requiring right-linearity allows for a result based on strong commutation.

Theorem 3.27. Linear TRSs R and S commute if the inclusion

(R ∩ C←⋊→S ∩ C) ∪ (R ∩ C←⋉→S ∩ C) ⊆ →=S ∩ C · ∗R ∩ C←

holds for a critical-pair-closing system C for R and S.

Proof. The proof works by analyzing peaks between R, S, R ∩ C, and S ∩ C, eventually showing that →= · →∗ has the diamond property. We have

(a) ∗R ∩ C← · →∗S ∩ C ⊆ →∗S ∩ C · ∗R ∩ C←

(b) R← · →S ⊆ →=S · →∗S ∩ C · ∗R ∩ C← · =R←

(c) R ∩ C← · →S ⊆ →=S · →∗S ∩ C · ∗R ∩ C←

(d) S ∩ C← · →R ⊆ →=R · →∗R ∩ C · ∗S ∩ C←

(e) ∗R ∩ C← · →S ⊆ →=S · →∗S ∩ C · ∗R ∩ C←

(f) ∗S ∩ C← · →R ⊆ →=R · →∗R ∩ C · ∗S ∩ C←

(g) ∗R ∩ C← · =R← · →=S · →∗S ∩ C ⊆ →=S · →∗S ∩ C · ∗R ∩ C← · =R←

For item (a) we show that R ∩ C and S ∩ C strongly commute. This follows from the usual peak analysis, using right-linearity to avoid duplication of redexes in case one step takes place in the substitution of the other, like in the proof of Lemma 3.6, and the assumption on critical pairs between R ∩ C and S ∩ C in case of a critical overlap.

For item (b) the same peak analysis yields R← · →S ⊆ →=S · =R← in case the steps happen at parallel positions or one is in the substitution of the other and, since C is critical-pair-closing, R← · →S ⊆ →∗S ∩ C · ∗R ∩ C← in case of a critical overlap. Items (c) and (d) are straightforward variations of item (b), where the joining steps from the right can be merged into one sequence, since the step to the left is restricted to C.

Items (e) and (f) follow from items (c) and (d) respectively, using an induction on the length of the rewrite sequence to the left and item (a), as shown in Figure 6(a). Note that if the closing →=S-step, obtained from item (c), is empty, applying the induction hypothesis is omitted. Item (g) can then be shown from items (b), (e), (f), and (a) by closing the peak as depicted in Figure 6(b). Again, if any of the →=-steps are empty the respective parts of the diagram disappear. Finally we conclude commutation using Lemma 2.17 and the two inclusions →R ⊆ →=R · →∗R ∩ C ⊆ →∗R and →S ⊆ →=S · →∗S ∩ C ⊆ →∗S.

Figure 6: Proof of Theorem 3.27. (a) Proof of Theorem 3.27(e). (b) Proof of Theorem 3.27(g).

The corresponding corollary for confluence generalizes Huet's result on strongly closed TRSs.

Corollary 3.28. A linear TRS R is confluent if it admits a strongly closed critical-pair-closing system C.

Example 3.29 (Cop #760). Consider the linear TRS R consisting of the following three rules:

f(x) → f(f(x))    g(x) → f(x)    f(x) → g(x)

There are two critical pairs

f(f(x)) ←⋊→ g(x) g(x) ←⋊→ f(f(x))



Since g(x) → f(x) → f(f(x)) by the first two rules, they constitute a critical-pair-closing system. Moreover they do not admit critical pairs and hence R is confluent by Corollary 3.28. Note that R is not strongly closed, because that would require a join g(x) →= · ∗← f(f(x)). To see that a join of this shape does not exist, first observe that the only terms reachable from g(x) by →= are of size 2, namely f(x) and g(x) itself. However |s| > 2 for all s with f(f(x)) →∗ s, since |ℓ| ⩽ |r| for all rules ℓ → r in R.

Next we turn to a variation of parallel closedness that requires C to be terminating.

Theorem 3.30. Left-linear TRSs R and S commute if the inclusions

R←·⋊→S ⊆ −→∥S ∩ C · ∗R ∩ C← and R←⋉·→S ⊆ →∗S ∩ C · R ∩ C←−∥

hold for some terminating critical-pair-closing system C for R and S.

Note that, since C is required to be a critical-pair-closing system, the theorem demands R←⋉⋊→S ⊆ →∗S ∩ C · ∗R ∩ C← for overlays. Before proving it we illustrate its power on an example.


Example 3.31 (Cop #62). Consider the left-linear TRS R from [83]:

1: x − 0 → x
2: 0 − x → 0
3: s(x) − s(y) → x − y
4: x < 0 → false
5: 0 < s(y) → true
6: s(x) < s(y) → x < y
7: if(true, x, y) → x
8: if(false, x, y) → y
9: gcd(x, 0) → x
10: gcd(0, x) → x
11: gcd(x, y) → gcd(y, mod(x, y))
12: mod(x, 0) → x
13: mod(0, y) → 0
14: mod(s(x), s(y)) → if(x < y, s(x), mod(x − y, s(y)))

Let C = {9, 10, 12, 13}. The system C is critical-pair-closing for R and obviously terminating. Since R admits no inner critical pairs it is confluent by Theorem 3.30.
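A bounded search for a critical-pair-closing system can be sketched as follows; this is illustrative only, reusing criticalPairs and reachable from the earlier fragments and specializing Definition 3.25 to the confluence case R = S.

-- is C critical-pair-closing for R (Definition 3.25 with R = S)? every
-- critical pair of R must join using only steps of the subsystem C
criticalPairClosing :: Int -> TRS -> TRS -> Bool
criticalPairClosing k r c = all joinableInC (criticalPairs r r)
  where
    joinableInC (s, t) = any (`elem` reachable k c t) (reachable k c s)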

To prove Theorem 3.30 we show the diamond property of →∗ · −→∥ .

Lemma 3.32. Let R and S be left-linear TRSs satisfying the assumptions of Theorem 3.30. Then we have

(a) ∗R ∩ C← · →∗S ∩ C ⊆ →∗S ∩ C · ∗R ∩ C←

(b) R←−∥ · −→∥S ⊆ →∗S ∩ C · −→∥S · R←−∥ · ∗R ∩ C←

(c) R ∩ C←−∥ · −→∥S ⊆ →∗S ∩ C · −→∥S · ∗R ∩ C←

(d) S ∩ C←−∥ · −→∥R ⊆ →∗R ∩ C · −→∥R · ∗S ∩ C←

(e) ∗R ∩ C← · −→∥S ⊆ →∗S ∩ C · −→∥S · ∗R ∩ C←

(f) ∗S ∩ C← · −→∥R ⊆ →∗R ∩ C · −→∥R · ∗S ∩ C←

(g) R←−∥ · ∗R ∩ C← · →∗S ∩ C · −→∥S ⊆ →∗S ∩ C · −→∥S · R←−∥ · ∗R ∩ C←

Proof Sketch. We only give the outline of the proof structure and refer to the formalization for full details. Item (a) follows from local commutation of R ∩ C and S ∩ C, termination of C, and a commutation version of Newman's Lemma. Items (b), (c), and (d) are proved by an induction on the amount of overlap between the steps in the peak, similar to the proof of Theorem 3.22.


Figure 7: Proof of Lemma 3.32. (a) Proof of Lemma 3.32(e). (b) Proof of Lemma 3.32(g).

Items (e) and (f) follow from items (c) and (d) using well-founded induction on the source of the peak with respect to →+C, see Figure 7(a). The proof for item (g) also proceeds by well-founded induction on the source of the peak with respect to →+C, applying item (b) if both C-sequences are empty and items (a), (e), and (f) if they are not, as depicted in Figure 7(b).

Now the main theorem follows easily.

Proof of Theorem 3.30. From Lemma 3.32(g) and the two obvious inclusions →R ⊆ →∗R ∩ C · −→∥R ⊆ →∗R and →S ⊆ →∗S ∩ C · −→∥S ⊆ →∗S we obtain the commutation of R and S from Lemma 2.17.

3.6 Certificates

To facilitate checking of confluence proofs generated by automatic tools based on Corollary 3.7, Corollary 3.23, and Theorem 3.30, we extended the CPF to represent such proofs. Since CeTA has to compute the critical pairs anyway in order to check their closing conditions, we do not require them in the certificate. We just require the claim that the system is strongly or almost parallel closed, together with a bound on the length of the rewrite sequences to join the critical pairs, which is necessary to ensure termination of the certifier. In case of Theorem 3.30 the certificate contains the critical-pair-closing system C together with a termination proof.2 Certificates for commutation are not yet supported, since currently no tool produces them, and CPF does not contain a specification for commutation proofs.

2CeTA contains support for a wide range of termination criteria that is easy to reuse.
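The shape of such a certifier-side check can be illustrated as follows; the names are hypothetical, the actual check functions are those generated from Critical_Pair_Closure_Impl.thy, and stronglyClosed is the bounded search sketched in Section 3.2.

-- the certificate carries only the claim plus a bound n on the length of
-- the joining sequences; the bound n guarantees termination of the check
checkStronglyClosed :: Int -> TRS -> Either String ()
checkStronglyClosed n trs
  | stronglyClosed n trs = Right ()
  | otherwise            = Left "a critical pair is not strongly closed within the bound"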

3.7 Summary

In this chapter we presented the formalization of three classical criteria for confluence and commutation of (left-)linear rewrite systems. Building on top of IsaFoR we formalized proofs that linear strongly closed systems, and left-linear (almost) parallel closed systems are confluent (commute). The major difference to the original paper proof is our definition of the overlap between two parallel steps that are represented via multihole contexts. Finally we extended the criteria using critical-pair-closing systems. As expected, IsaFoR provided invaluable support, e.g. through its theories on critical pairs and multihole contexts; on the other hand, it was also extended with new facts about lists, multisets, multihole contexts, etc. Concerning future work, another important extension of the results of Huet and Toyama due to van Oostrom [78] is using multisteps −→○ which allow nested non-overlapping redexes. This extension not only strengthens Huet's criterion in the first-order world but also makes it applicable to higher-order rewriting, where using parallel steps fails due to β-reduction.

Definition 3.33. A TRS R is development closed if all critical pairs s ←⋊→ t of R satisfy s −→○R t. It is almost development closed if s −→○ · ∗← t for all overlays s ←⋉⋊→ t and s −→○ t for all inner critical pairs s ←·⋊→ t.

Theorem 3.34 (van Oostrom [78]). Every left-linear and almost development closed TRS is confluent.

A similar extension is possible for the results on critical-pair-closing systems. However, although the paper proofs superficially look very similar, and do employ similar ideas, obtaining a formalized proof will require serious effort. In fact neither our representation of (parallel) rewrite steps, nor our definition of ▲, nor the idea of using an induction on the source of the peak to avoid reasoning about combining steps, carry over. To make precise the concepts that are hard to formalize in a proof assistant, e.g. measuring the amount of overlap between two multisteps or the descendants of a multistep, Hirokawa and Middeldorp [43] suggested to use proof terms to obtain a rigorous proof (and at the same time extended the result to commutation). This is a step forward but more is needed to obtain a formalized proof, also for the extension to higher-order systems. In particular, we anticipate the extensive use of sets of positions (in [43]) to be problematic without alternative notions. We plan to employ residual theory [105, Section 8.7] and to develop a notion of overlap for multisteps similar to Definition 3.15 to close the gap.


Chapter 4 Rule Labeling

In mathematics you don’t understand things. You just get used to them. John von Neumann

The rule labeling heuristic aims to establish confluence of (left-)linear term rewrite systems via decreasing diagrams. In this chapter we present a formalization of a confluence criterion based on the interplay of relative termination and the rule labeling in Isabelle. Moreover, we report on the integration of this result into CeTA, facilitating the checking of confluence certificates based on decreasing diagrams. Decreasing diagrams [76] provide a complete characterization of confluence for abstract rewrite systems whose convertibility classes are countable. This confluence criterion states that a labeled abstract rewrite system is confluent if

each of its local peaks can be joined decreasingly (see Figure 8(a), where <α indicates that any label below α may be used, and > is a well-founded order on the labels; we will introduce these notions formally in Section 4.1). As a criterion for abstract rewrite systems, decreasing diagrams can be applied to first- and higher-order rewriting, including term rewriting and the λ-calculus. Van Oostrom [79] presented the rule labeling heuristic, which labels rewrite steps by their rules and yields a confluence criterion for linear term rewrite systems by checking decreasingness of the critical peaks. The rule labeling has successfully been implemented by Aoto [2] and Hirokawa and Middeldorp [42]. We consider a recent powerful confluence result for term rewrite systems, exploiting relative termination and decreasing diagrams, that makes the rule labeling applicable to left-linear systems. Our aim was to formalize [115, Corollary 16] for a specific labeling, culminating in Theorem 4.6. Actually the formal proof of Theorem 4.6 is obtained by instantiating the more general result of Theorem 4.27 appropriately. Hence, confluence proofs according to

these theorems are now checkable by CeTA. We achieved this via the following steps:

• Build on local decreasingness for abstract rewrite systems [23, 28, 113].

• Perform a detailed analysis of how local peaks can be joined for linear and left-linear term rewrite systems (Section 4.2.1).

• Use the detailed analysis of local peaks to formalize the notion of a labeling and a confluence result for term rewrite systems parametrized by a labeling (Section 4.2.2). In this way it is ensured that the formalization is reusable for other labelings stemming from [115].

• Instantiate the result from Section 4.2.2 to obtain concrete confluence results. We demonstrate how this instantiation can be done, culminating in a formal proof of Theorem 4.27 (Section 4.3).

• Finally we made our formalization executable to check proof certificates: we suitably extended CPF to represent proofs according to Theorem 4.27 and implemented dedicated check functions in our formalization, enabling CeTA to inspect, i.e., certify, such confluence proofs (Section 4.3.3). The remainder of this chapter is organized as follows. Additional preliminaries are introduced in the next section. Based on results for abstract rewriting and the notion of a labeling, Section 4.2 formulates conditions that limit the attention to critical peaks rather than arbitrary local peaks in the case of first-order term rewriting. Section 4.3 instantiates these results with concrete labelings to obtain corollaries that ensure confluence.

4.1 Preliminaries

The decreasing diagrams technique is based on labeling each rewrite step such that all local peaks can be joined decreasingly. In order to assign such labels to steps we will need to decorate them with more information. If ℓ → r ∈ R, p is a position, and σ is a substitution we call the triple π = ⟨p, ℓ → r, σ⟩ a redex pattern, and write pπ, ℓπ, rπ, σπ for its position, left-hand side, right-hand side, and substitution, respectively. We write →π (or →pπ,ℓπ→rπ,σπ ) for a rewrite step at position pπ using the rule ℓπ → rπ and the substitution σπ. A redex pattern π matches a term t if t|pπ = ℓπσπ, in which case t|pπ is called the


redex. Let π1 and π2 be redex patterns that match a common term. They are called parallel, written π1 ∥ π2, if pπ1 ∥ pπ2. If P = {π1, π2, …, πn} is a set of pairwise parallel redex patterns matching a term t, we denote by t →∥P t′ the parallel rewrite step from t to t′ by P, i.e., t →π1 · →π2 ⋯ →πn t′.

On the abstract rewriting level we label steps by presenting the rewrite relation as a family of labeled steps. Let I be an index set. We write {→α}α∈I to denote the ARS → where → is the union of →α for all α ∈ I. Let {→α}α∈I be an ARS and let > and ⩾ be relations on I. The two relations > and ⩾ are called compatible if ⩾ · > · ⩾ ⊆ >. Given a relation ≽ we write →≼α1⋯αn for the union of all →β with αi ≽ β for some 1 ⩽ i ⩽ n. Similarly, ≼S is the set of all β such that α ≽ β for some α ∈ S. We call α and β extended locally decreasing (for > and ⩾) if

α← · →β ⊆ ↔∗<α · →=⩽β · ↔∗<αβ · =⩽α← · ↔∗<β

If there exist a well-founded order > and a ⩾, such that > and ⩾ are compatible, and α and β are extended locally decreasing for all α, β ∈ I, then the ARS {→α}α∈I is extended locally decreasing (for > and ⩾). We call an ARS locally decreasing (for >) if it is extended locally decreasing for > and =, where the latter is the identity relation. In the sequel, we often refer to extended locally decreasing as well as to locally decreasing just by decreasing, whenever the context clarifies which concept is meant or the exact meaning is irrelevant. In the literature the above condition is referred to as the conversion version of decreasing diagrams, Figure 8(b). In its valley version, Figure 8(a), the condition reads as follows:

α← · →β ⊆ →∗<α · →=⩽β · →∗<αβ · ∗<αβ← · =⩽α← · ∗<β←

In the sequel we always refer to the conversion version of decreasing diagrams, except when stated otherwise explicitly. The main result on decreasing diagrams is due to van Oostrom.

Theorem 4.1 (van Oostrom [79]). Locally decreasing ARSs are confluent.

In the remainder of this section we are concerned with the formalization of the following variation of the result from [41, Theorem 2]:

Lemma 4.2. Every extended locally decreasing ARS is confluent.

It is interesting to note that in [42], the journal version of [41], extended local decreasingness is avoided by employing the predecessor labeling. Originally the authors used the source labeling, wherein a step s → t is labeled by s, and ⩾ = →∗ for the weak order. Consequently, any rewrite step s′ → t′ with s →∗ s′ has a label s′ with s ⩾ s′. In the predecessor labeling, a rewrite step can be labeled by an arbitrary predecessor, i.e., we have s →u t if u →∗ s.


Figure 8: Locally decreasing diagrams. (a) Valley version. (b) Conversion version.

Now, if s →u t and s →∗ s′ → t′, we may label the step from s′ to t′ by u, the same label as s →u t. In this way, the need for the weak comparison ⩾, and hence extended local decreasingness, is avoided. As the proof of Lemma 4.2 demonstrates, this idea works in general, i.e., extended local decreasingness can be traded for assigning more than one label to each rewrite step. In the context of automation, however, assigning just one (computable) label to every rewrite step is very attractive, as confluence tools must show critical peaks decreasing and confluence certifiers must check the related proof certificates. In the rest of this section, we describe our formalized proof of Lemma 4.2. Given an ARS that is extended locally decreasing for > and ⩾, the proof in [41] constructs a single order ≻ on sets of labels and establishes local decreasingness of the ARS for >. We do the same, but use a simplified proof for the following key lemma, which allows us to use > directly.

Lemma 4.3. Every extended locally decreasing ARS is locally decreasing.

The idea is to set ⇒α = →⩽α and show that ⇒ is locally decreasing.1 This establishes the claim because ⋃α∈I ⇒α = ⋃α∈I →⩽α = ⋃α∈I →α, and therefore ⇒ = {⇒α}α∈I and → = {→α}α∈I are the same ARS, labeled in two different ways. The next example demonstrates some peculiarities of this approach.

1Compared to [41], we use ⩽α instead of Cα = (⩽α) \ (<α).

Example 4.4. Consider the ARS {→α}α∈{1, 1.5, 2} where →1 = {(a, b), (c, d)}, →1.5 = {(b, d)}, and →2 = {(a, c)}. This ARS is extended locally decreasing for




the orders > = {(x, y) | x, y ∈ Q⩾0 such that x − y ⩾ 1} and ⩾Q, as depicted in Figure 9(a). We have

⇒2 = {(a, b), (c, d), (a, c), (b, d)}
⇒1.5 = {(b, d), (a, b), (c, d)}
⇒1 = {(a, b), (c, d)}

where e.g. →1.5 ⊆ ⇒2 since 2 ⩾Q 1.5. Consequently, ⇒ = ⇒1 ∪ ⇒1.5 ∪ ⇒2. To establish local decreasingness of the related ARS {⇒α}α∈{1,1.5,2} the peak b 1⇐ a ⇒2 c (emerging from b 1← a →2 c) must be considered, which can be closed in a locally decreasing fashion via b ⇒2 d 1⇐ c (based on b →1.5 d 1← c), as in Figure 9(b). Note that b ⇒1.5 d 1⇐ c would not establish decreasingness for >. However, the construction also admits the peak b 1.5⇐ a ⇒2 c, for which there is no peak b 1.5← a →2 c in the original ARS, as it does not contain the step b 1.5← a. Still, this peak can be closed locally decreasing for >, see Figure 9(c), based on the steps in the diagram of Figure 9(a).

Figure 9: (Extended) locally decreasing peaks. (a) Extended locally decreasing. (b) Locally decreasing. (c) Locally decreasing.
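The construction ⇒α = →⩽α is directly executable. A small Haskell sketch for Example 4.4, with labels represented as Doubles and ⩾Q as numeric comparison:

import Data.List (nub)

-- the labeled ARS of Example 4.4
steps :: [(Double, (Char, Char))]
steps = [(1, ('a', 'b')), (1, ('c', 'd')), (1.5, ('b', 'd')), (2, ('a', 'c'))]

-- the construction from Lemma 4.3: x ==>_alpha y iff x ->_beta y for some beta <= alpha
darr :: Double -> [(Char, Char)]
darr alpha = nub [ st | (beta, st) <- steps, beta <= alpha ]

-- darr 2   == [('a','b'),('c','d'),('b','d'),('a','c')]
-- darr 1.5 == [('a','b'),('c','d'),('b','d')]
-- darr 1   == [('a','b'),('c','d')]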

Proof of Lemma 4.3. We assume the ARS {→α}α∈I is extended locally decreasing for > and ⩾ and establish local decreasingness of the ARS {⇒α}α∈I for > by showing

α⇐ · ⇒β ⊆ ⇔∗<α · ⇒=β · ⇔∗<αβ · =α⇐ · ⇔∗<β    (4.1)

for α, β ∈ I. Assume that x α⇐ · ⇒β y. By the definition of ⇒, there are α′ ⩽ α and β′ ⩽ β such that x α′← · →β′ y. Because → is extended locally decreasing, this implies that

x ↔∗<α′ · →=⩽β′ · ↔∗<α′β′ · =⩽α′← · ↔∗<β′ y

Consider a label γ ∈ <α′, i.e., γ < α′. By compatibility, this implies γ < α, i.e., γ ∈ <α. Consequently, ↔<α′ ⊆ ⇔<α. Similarly, if γ ∈ ⩽β′ then γ ∈ ⩽β, because ⩾ is transitive, and hence →=⩽β′ ⊆ ⇒=β. Continuing in this fashion we obtain

x ⇔∗<α · ⇒=β · ⇔∗<αβ · =α⇐ · ⇔∗<β y

This establishes (4.1). Consequently, ⇒ is locally decreasing.

Proof of Lemma 4.2. Now confluence of extended locally decreasing ARSs immediately follows from Lemma 4.3 and Theorem 4.1.

4.2 Formalized Confluence Results

This section builds upon the result for ARSs from the previous section to prepare for confluence criteria for TRSs. First we recall the rule labeling [79], which maps each rewrite step to a natural number based on the rewrite rule applied in the step.

Definition 4.5. Let i: R → N be an index mapping, which associates to every rewrite rule a natural number. The function li(s →π t) = i(ℓπ → rπ) is called rule labeling. Labels due to the rule labeling are compared by >N and ⩾N.

Our aim is to formalize the following theorem, which is one of the main results of [115].

Theorem 4.6 (Valley version of decreasing diagrams). A left-linear TRS is confluent if its duplicating rules terminate relative to its other rules and all its critical peaks are decreasing for the rule labeling.

Before preparing for its proof, we illustrate the technique on an example.

Example 4.7 (Cop #20). Consider the TRS R from [34, Example 2]. It consists of the following five rewrite rules, where we indicate the index mapping i for the rule labeling in the subscripts:

hd(x : y) →0 x    inc(x : y) →0 s(x) : inc(y)    nats →0 0 : inc(nats)
tl(x : y) →0 y    inc(tl(nats)) →1 tl(inc(nats))


The rules give rise to one critical peak that is labeled by li as

inc(tl(0 : inc(nats))) 0← inc(tl(nats)) →1 tl(inc(nats))

This peak can be closed using the valley

inc(tl(0 : inc(nats))) →0 inc(inc(nats))

0← tl(s(0): inc(inc(nats)))

0← tl(inc(0 : inc(nats)))

0← tl(inc(nats))

Clearly all labels in the valley are dominated by label 1 in the peak. Hence the only critical peak of R is decreasing for the given rule labeling. Since R does not contain duplicating rules, the relative termination condition vacuously holds, and consequently R is confluent by Theorem 4.6.
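The label comparison performed here is mechanical, and one side of a valley-version check can be sketched in a few lines of Haskell. This is illustrative only: alpha and beta are the labels of the two peak steps, ls the labels along one joining sequence, and the weak order ⩾N is built in via <=.

-- one side of a decreasing valley: the labels must factor as
--   ->*_{<alpha} . ->=_{<=beta} . ->*_{<alpha or <beta}
decreasingSide :: Int -> Int -> [Int] -> Bool
decreasingSide alpha beta ls =
  case dropWhile (< alpha) ls of
    []         -> True
    (m : rest) -> m <= beta && all (\l -> l < alpha || l < beta) rest

-- Example 4.7: the peak carries labels 0 and 1; the t-side of the valley has
-- labels [0] and the u-side [0, 0, 0] (called with the peak labels swapped):
--   decreasingSide 0 1 [0] && decreasingSide 1 0 [0, 0, 0] == True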

To support other confluence results, in the formalization we did not follow the easiest path, i.e., suit the definitions and lemmas directly to Theorem 4.6. Rather, we adopted the approach from [115], where all results are established via labeling functions satisfying some abstract properties. Apart from avoiding a monolithic proof, this has the advantage that similar proofs need not be repeated for different labeling functions. Instead it suffices to establish that the concrete labeling functions satisfy some abstract conditions. In this approach decreasingness is established in three steps. The first step comprises joinability results for local peaks (Section 4.2.1). The second step (Section 4.2.2) formulates abstract conditions with the help of labeling functions that admit a finite characterization of decreasingness of local peaks. Finally, based on the previous two steps, the third step (Section 4.3) then obtains confluence results by instantiating the abstract labeling functions with concrete ones, e.g. the rule labeling. So only the third step needs to be adapted when formalizing new labeling functions, as steps one and two are unaffected. The content of this section is part of the theory Decreasing_Diagrams2.thy in IsaFoR.

4.2.1 Local Peaks

As IsaFoR already supported Knuth-Bendix' criterion [98], it contained results for joinability of local peaks and the critical pair lemma. However, large parts of the existing formalization could not be reused directly as the established


results lacked information required for ensuring decreasingness. For instance, to obtain decreasingness for the rule labeling in case of a variable peak, the rewrite rules employed in the joining sequences are crucial, but the existing formalization only states that such a local peak is joinable.

Figure 10: Three kinds of local peaks. (a) Parallel peak. (b) Function peak. (c) Variable peak.

On the other hand, the existing notion of critical pairs from IsaFoR could be reused as the foundation for critical peaks. Since the computation of critical pairs requires a formalized unification algorithm, extending IsaFoR admitted focusing on the tasks related to decreasingness. Local peaks can be characterized based on the positions of the diverging rewrite steps. Either the positions are parallel, called a parallel peak, or one position is above the other. In the latter situation we further distinguish whether the lower position is at a function position, called a function peak, or at/below a variable position of the other rule's left-hand side, called a variable peak. More precisely, for a local peak

t = s[r1σ1]p ← s[ℓ1σ1]p = s = s[ℓ2σ2]q → s[r2σ2]q = u    (4.2)

there are three possibilities (modulo symmetry):

(a) p ∥ q (parallel peak),

(b) q ⩽ p and p\q ∈ PosF (ℓ2) (function peak),

(c) q ⩽ p and p\q ∉ PosF (ℓ2) (variable peak).
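The position analysis itself is elementary; a Haskell sketch with 1-indexed positions (Term and Pos as in the Chapter 3 fragments):

import Data.List (isPrefixOf)

-- p and q are parallel iff neither is a prefix of the other
parallelPos :: Pos -> Pos -> Bool
parallelPos p q = not (p `isPrefixOf` q) && not (q `isPrefixOf` p)

-- function positions of a term: positions of non-variable subterms
posF :: Term -> [Pos]
posF (Var _)    = []
posF (Fun _ ts) = [] : [ i : p | (i, t) <- zip [1 ..] ts, p <- posF t ]

-- with q <= p (q a prefix of p), p\q decides between cases (b) and (c)
isFunctionPeak :: Term -> Pos -> Pos -> Bool
isFunctionPeak l2 q p = drop (length q) p `elem` posF l2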


local_peaks R = {((s, r1, p1, σ1, True, t), (s, r2, p2, σ2, True, u)) |
    s t u r1 r2 p1 p2 σ1 σ2.
    (s, t) ∈ rstep_r_p_s R r1 p1 σ1 ∧ (s, u) ∈ rstep_r_p_s R r2 p2 σ2}

parallel_peak R p = (p ∈ local_peaks R ∧
  (let ((s, r1, p1, σ1, b, t), (s, r2, p2, σ2, d, u)) = p in p1 ⊥ p2))

function_peak R p = (p ∈ local_peaks R ∧
  (let ((s, r1, p1, σ1, b, t), (s, r2, p2, σ2, d, u)) = p in
    ∃r. p1 <#> r = p2 ∧ r ∈ poss (fst r1) ∧ is_Fun (fst r1 | r)))

variable_peak R p = (p ∈ local_peaks R ∧
  (let ((s, r1, p1, σ1, b, t), (s, r2, p2, σ2, d, u)) = p in
    ∃r. p1 <#> r = p2 ∧ ¬ (r ∈ poss (fst r1) ∧ is_Fun (fst r1 | r))))

Listing 6: Characterization of local peaks.

For the situation of a left-linear TRS these cases are visualized in Figure 10. It is easy to characterize parallel, function, and variable peaks in Isabelle, see Listing 6, but it requires tedious notation. For instance the local_peaks of a TRS R are represented as the set of all pairs of (s, r1, p1, σ1, True, t) and (s, r2, p2, σ2, True, u), with rewrite steps s →p1,r1,σ1 t and s →p2,r2,σ2 u. The rewrite steps are formalized via the existing IsaFoR notion of rstep_r_p_s, which represents a step s →p,ℓ→r,σ R t as (s, t) ∈ rstep_r_p_s R (ℓ, r) p σ. Here True and False are used to recall the direction of a step, i.e., s → t and t ← s, respectively. This distinction is important for steps that are part of conversions, because conversions may mix forward and backward steps. In the definitions of function_peak and variable_peak the symbol <#> is used, which is the IsaFoR operation for concatenating positions. As the definition of function and variable peaks is asymmetric the five cases of local peaks can be reduced to the above three by mirroring those peaks. Then local peaks can be characterized as in Listing 7. Next we elaborate on the three cases.


p ∈ local_peaks R =⇒
  parallel_peak R p ∨ variable_peak R p ∨ variable_peak R (snd p, fst p) ∨
  function_peak R p ∨ function_peak R (snd p, fst p)

Listing 7: Cases of local peaks.

Case 1: Parallel Peaks Figure 10(a) shows the shape of a local peak where the steps take place at parallel positions. For a peak t π1← s →π2 u with π1 ∥ π2 we established that t →π2 v π1← u, where opposing steps apply the same rule/substitution at the same position. The proof is straightforward and based on a decomposition of the terms into a context and the redex.

Case 2: Function Peaks In general joining function peaks may involve rules not present in the divergence, as indicated by the question mark in Figure 10(b). To reduce the burden of joining infinitely many function peaks to joining the, in case of a finite TRS finitely many, critical peaks, we formalized that every function peak is an instance of a critical peak.

Lemma 4.8. Let t p,ℓ1→r1,σ1← s →q,ℓ2→r2,σ2 u with qq′ = p and q′ ∈ PosF (ℓ2). Then there are a critical peak ℓ2µ[r1µ]q′ ← ℓ2µ → r2µ, a context C, and a substitution τ where s = C[ℓ2µτ], t = C[(ℓ2µ[r1µ]q′)τ], and u = C[r2µτ].

This fact is folklore, see e.g. [105, Lemma 2.7.12]. We remark that it was already present (multiple times) in IsaFoR, but concealed in larger proofs, e.g. the formalization of orthogonality [67], and never stated explicitly. As we do not enforce that the variables of a rewrite rule's right-hand side are contained in its left-hand side, such rules are also included in the critical peak computation.

Case 3: Variable Peaks Variable overlaps, Figure 10(c), can again be joined by the rules involved in the diverging step.² We only consider the case that ℓ2 → r2 is left-linear, as our main result assumes left-linearity. More precisely, if q′ is the unique position in PosV(ℓ2) such that qq′ ⩽ p, x = ℓ2|q′, and |r2|x = n

then we have t →ℓ2→r2 v, which is similar to the case for parallel peaks, as the redex ℓ2σ becomes ℓ2τ but is not destroyed, and u →ⁿ v using only ℓ1 → r1. To obtain

² This includes rules having a variable as left-hand side.

this result we reason via parallel rewriting. The notions of parallel rewriting described so far do not keep track of, for example, the applied rules. Thus we use a different formulation of parallel steps, which records the information (position, rewrite rule, substitution) of each rewrite step, i.e., the rewrite relation is decorated with the contracted redex patterns.

Definition 4.9. For a TRS R the parallel rewrite relation →∥ is defined by the following inference rules.

x −→∥∅ x

ℓ → r ∈ R =⇒ ℓσ −→∥{⟨ϵ,ℓ→r,σ⟩} rσ

s1 −→∥P1 t1 · · · sn −→∥Pn tn =⇒ f(s1, . . . , sn) −→∥(1P1) ∪ ··· ∪ (nPn) f(t1, . . . , tn)

Here for a set of redex patterns P = {π1, . . . , πm} by iP we denote the set {iπ1, . . . , iπm} with iπ = ⟨ip, ℓ → r, σ⟩ for π = ⟨p, ℓ → r, σ⟩.
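The inference rules translate directly into a functional program. The following Haskell sketch (entirely our own simplification, not the IsaFoR definition) enumerates all annotated parallel steps of a first-order term, collecting the contracted redex patterns and shifting their positions by i in the congruence rule, exactly as iP does above.

import Control.Monad (foldM)

data Term = Var String | Fun String [Term] deriving (Eq, Show)
type Subst = [(String, Term)]
type Rule = (Term, Term)
type Pattern = ([Int], Rule, Subst)   -- ⟨position, rule, substitution⟩

-- Syntactic matching: match pat t returns a substitution sg with pat·sg = t.
match :: Term -> Term -> Maybe Subst
match pat t0 = go pat t0 []
  where
    go (Var x) u sg = case lookup x sg of
      Nothing -> Just ((x, u) : sg)
      Just u' -> if u == u' then Just sg else Nothing
    go (Fun f ps) (Fun g us) sg
      | f == g && length ps == length us =
          foldM (\s (p, u) -> go p u s) sg (zip ps us)
    go _ _ _ = Nothing

applySubst :: Subst -> Term -> Term
applySubst sg (Var x) = maybe (Var x) id (lookup x sg)
applySubst sg (Fun f ts) = Fun f (map (applySubst sg) ts)

-- All parallel steps from t, each with its set of redex patterns: contract a
-- redex at the root, or leave a variable unchanged, or rewrite the arguments
-- of a function application in parallel, shifting their patterns by i.
parSteps :: [Rule] -> Term -> [(Term, [Pattern])]
parSteps rules t = rootSteps ++ belowSteps
  where
    rootSteps = [ (applySubst sg r, [([], (l, r), sg)])
                | (l, r) <- rules, Just sg <- [match l t] ]
    belowSteps = case t of
      Var _ -> [(t, [])]
      Fun f ts ->
        [ (Fun f (map fst args),
           concat [ shift i ps | (i, (_, ps)) <- zip [1 ..] args ])
        | args <- mapM (parSteps rules) ts ]
    shift i = map (\(p, rl, sg) -> (i : p, rl, sg))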

To use this parallel rewrite relation for closing variable peaks we formalized the following auxiliary facts.³

Lemma 4.10. The following properties of the parallel rewrite relation hold:

(a) For all terms s we have s −→∥∅ s.

(b) If s −→∥∅ t then s = t.

(c) If s −→∥P t and q ∈ Pos(u) then u[s]q −→∥qP u[t]q.

(d) We have s →π t if and only if s −→∥{π} t.

(e) If σ(x) →π τ(x) and σ(y) = τ(y) for all y ∈ V with y ≠ x then tσ −→∥P tτ with ℓπ′ → rπ′ = ℓπ → rπ for all π′ ∈ P.

(f) If s −→∥{π} ∪ P t then there is a term u with s −→∥{π} u −→∥P t.

(g) If s −→∥{π1,...,πn} t then s →π1 · · · →πn t.

³ In a typical paper proof these facts would be considered self-evident. However, Isabelle is not so easily convinced, so a proof is required. Rather than being an interesting result, Lemma 4.10 thus serves to illustrate the level of detail that formalizations often require.


(s, []) ∈ conv R

(s, t) ∈ rstep_r_p_s R r p σ ∧ (t, ts) ∈ conv R =⇒ (s, (s, r, p, σ, True, t) # ts) ∈ conv R

(s, t) ∈ rstep_r_p_s R r p σ ∧ (s, ts) ∈ conv R =⇒ (t, (s, r, p, σ, False, t) # ts) ∈ conv R

Listing 8: Conversions.

In principle the statements of Lemma 4.10 follow from the definitions using straightforward induction proofs, building upon each other in the order they are listed. However, the additional bookkeeping, required to correctly propagate the information attached to the rewrite relation, makes them considerably more involved than for the previous, agnostic notion of parallel rewriting. Now for reasoning about variable peaks as in Figure 10(c) we decompose u = s[r2σ]q and v = s[r2τ]q where σ(y) = τ(y) for all y ∈ V \ {x} and σ(x) →p\qq′,ℓ1→r1 τ(x). From the latter by item (e) we obtain r2σ −→∥P r2τ, where all redex patterns in P use ℓ1 → r1. Then by item (c) we get s[r2σ]q −→∥qP s[r2τ]q and finally s[r2σ]q →ⁿ s[r2τ]q using ℓ1 → r1, where n = |qP| = |P|, by item (g).

4.2.2 Local Decreasingness

This section presents a confluence result, see Corollary 4.16, based on decreasingness of the critical peaks. Abstract conditions, via the key notion of a labeling, will ensure that parallel peaks and variable peaks are decreasing. Furthermore, these conditions imply that decreasingness of the critical peaks implies decreasingness of the function peaks. For establishing (extended) local decreasingness, a label must be attached to rewrite steps. To facilitate the computation of labels, in the formalization conversions provide information about the intermediate terms, applied rules, etc. In Listing 8 the definition of conversions as an inductive set is provided. The first case states that the empty conversion starting from a term s is a conversion. The second case states that if s →R t, using rule r at position p with substitution σ, and there is a conversion starting from t, where the list ts collects the details on the conversion, then

there also is a conversion starting from s and the details for this conversion are collected in the list where the tuple (s, r, p, σ, True, t) is added to the list ts. While the first two cases of Listing 8 amount to an inductive definition of rewrite sequences, together with the third case (which considers a step t ← s and a conversion starting from s), conversions are obtained. Here True and False are used to recall the direction of a step, as in Listing 6. Furthermore, labels are computed by a labeling function, which has local information about the rewrite step it is expected to label, such as source and target term, applied rewrite rule, position, and substitution. For reasons of readability we continue using mathematical notation (e.g., →∗ for sequences and ↔∗ for conversions) with all information implicit, but remark that the formalization works on sequences and conversions with explicit information as in Listing 8.

Definition 4.11. A labeling is a function l from rewrite steps to a set of labels such that for all contexts C and substitutions σ the following properties are satisfied:

• It is compatible with the strict order on labels: if l(s →π1 t) > l(u →π2 v) then l(C[sσ] →C[π1σ] C[tσ]) > l(C[uσ] →C[π2σ] C[vσ]).

• It is compatible with the weak order on labels: if l(s →π1 t) ⩾ l(u →π2 v) then l(C[sσ] →C[π1σ] C[tσ]) ⩾ l(C[uσ] →C[π2σ] C[vσ]).

• It is compatible with directions: l(s →π t) = l(t π← s).

Here C[πσ] denotes ⟨qp, ℓ → r, τσ⟩ for π = ⟨p, ℓ → r, τ⟩ and C|q = □. Although the co-domain of a labeling is a set of labels, a labeling according to Definition 4.11 maps a single rewrite step to a single label, which is different from how (some) ARSs are labeled in Section 4.1. The rule labeling is a labeling in our sense. In the presence of a labeling, conversions can be labeled at any time. This avoids lifting many notions (such as rewrite steps, local peaks, rewrite sequences, etc.) and results from rewriting to labeled rewriting. In the next definition a labeling is applied to conversions via the equations

• l(t ↔0 t) = ∅,

• l(s →π t ↔∗ u) = {l(s →π t)} ∪ l(t ↔∗ u), and


• l(s π← t ↔∗ u) = {l(s π← t)} ∪ l(t ↔∗ u).
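These equations have a direct executable reading. The following Haskell sketch (our simplification; Step and its fields are hypothetical, modeled after Listing 8) collects the label set of a conversion, ignoring the direction flag in accordance with the third condition of Definition 4.11.

import qualified Data.Set as Set

-- A conversion is a list of steps, each carrying its information (source,
-- rule, position, substitution, target) and a direction flag as in Listing 8.
data Step info = Step info Bool   -- True: s → t, False: t ← s

-- l(t ↔0 t) = ∅; every further step contributes the label of its step info.
labelConv :: Ord lab => (info -> lab) -> [Step info] -> Set.Set lab
labelConv l steps = Set.fromList [ l i | Step i _ <- steps ]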

Definition 4.12. A local peak t π1← s →π2 u is extended locally decreasing (for a labeling l and orders > and ⩾) if there is a local diagram, i.e., t ↔∗ t′ →= t′′ ↔∗ u′′ =← u′ ↔∗ u, such that its labels are extended locally decreasing, i.e.,

• l(t ↔∗ t′) ⊆ < l(t π1← s),

• l(t′ →= t′′) ⊆ ⩽ l(s →π2 u),

• l(t′′ ↔∗ u′′) ⊆ < l(t π1← s →π2 u),

• l(u′′ =← u′) ⊆ ⩽ l(t π1← s), and

• l(u′ ↔∗ u) ⊆ < l(s →π2 u).

Following [113], we separate (local) diagrams (where rewriting is involved) from decreasingness (where only the labels are involved). In contrast to the valley version of decreasing diagrams, as employed in [68], where a sequence →∗ can be decomposed into →∗<α · →=⩽β · →∗<αβ based on the sequence of labels, this is no longer possible for the conversion version (as there is no guarantee that the step →=⩽β is oriented correctly). Hence we made the decomposition explicit for the conversion version. The corresponding predicate in IsaFoR is given in Listing 9, where extended local decreasingness (eldc) of a local peak p is expressed via the existence of conversions cl1, cl2, cr1, cr2 and possibly empty steps sl and sr that close the divergence caused by the local peak p in the shape of a local diagram⁴ (ldc_trs). Here get_target gets the target of a rewrite step and first and last get the first and last element of a conversion or rewrite sequence, respectively. Moreover the labels of the underlying conversions are required to be extended locally decreasing (eld_conv, ELD_1). Here ds (≼) S is notation for the down-set ≼S, i.e., the set of labels below some element of S, and r is the pair of relations (>, ⩾). Then a function peak is extended locally decreasing if the critical peaks are.

Lemma 4.13. Let l be a labeling and let all critical peaks of a TRS R be extended locally decreasing for l. Then every function peak of R is extended locally decreasing for l.

⁴ The conversions cl2 and cr2 could be merged into a single conversion. However, in proofs the symmetric version permits reasoning about one side and obtaining the other side for free.

eldc R l r p = (∃ cl1 sl cl2 cr1 sr cr2.
  ldc_trs R p cl1 sl cl2 cr1 sr cr2 ∧ eld_conv l r p cl1 sl cl2 cr1 sr cr2)

ldc_trs R p cl1 sl cl2 cr1 sr cr2 = (p ∈ local_peaks R ∧
  cl1 ∈ conv R ∧ sl ∈ seq R ∧ cl2 ∈ conv R ∧
  cr1 ∈ conv R ∧ sr ∈ seq R ∧ cr2 ∈ conv R ∧
  get_target (fst p) = first cl1 ∧ last cl1 = first sl ∧ last sl = first cl2 ∧
  get_target (snd p) = first cr1 ∧ last cr1 = first sr ∧ last sr = first cr2 ∧
  last cl2 = last cr2)

eld_conv l r p cl1 sl cl2 cr1 sr cr2 =
  (ELD_1 r (l (fst p)) (l (snd p)) (map l (snd cl1)) (map l (snd sl)) (map l (snd cl2)) ∧
   ELD_1 r (l (snd p)) (l (fst p)) (map l (snd cr1)) (map l (snd sr)) (map l (snd cr2)))

ELD_1 r β α σ1 σ2 σ3 = (set σ1 ⊆ ds (fst r) {β} ∧
  length σ2 ≤ 1 ∧ set σ2 ⊆ ds (snd r) {α} ∧
  set σ3 ⊆ ds (fst r) {α, β})

Listing 9: Extended local decreasingness.
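The predicate ELD_1 is simple enough to transcribe directly. The following Haskell sketch (our names; the orders are passed as explicit predicates) checks the three label conditions for one side of a local diagram.

-- beta is the label of the step on the peak's own side, alpha the label of
-- the opposite step; sigma1, sigma2, sigma3 are the label sequences of the
-- closing conversion, the optional single step, and the final conversion.
eld1 :: (lab -> lab -> Bool)    -- strict order >
     -> (lab -> lab -> Bool)    -- weak order ⩾
     -> lab -> lab -> [lab] -> [lab] -> [lab] -> Bool
eld1 gt ge beta alpha sigma1 sigma2 sigma3 =
  all (gt beta) sigma1
  && length sigma2 <= 1 && all (ge alpha) sigma2
  && all (\l -> gt alpha l || gt beta l) sigma3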

Proof. As every function peak is an instance of a critical peak (see Lemma 4.8), the result follows from l being a labeling (Definition 4.11).

The notion of compatibility (between a TRS and a labeling) admits a finite characterization of extended local decreasingness.

Definition 4.14. Let l be a labeling. We call l compatible with a TRS R if all parallel peaks and all variable peaks of R are extended locally decreasing for l.

The key lemma establishes that if l is compatible with a TRS, then all local peaks are extended locally decreasing.

Lemma 4.15. Let l be a labeling that is compatible with a TRS R. If the critical peaks of R are extended locally decreasing for l, then all local peaks of R are extended locally decreasing for l.


Proof. The cases of variable and parallel peaks are taken care of by compatibility. The case of function peaks follows from the assumption in connection with Lemma 4.13. The symmetric cases for function and variable peaks can be resolved by mirroring the local diagrams.

Representing a TRS R over the signature F and variables V as the ARS over objects T(F, V) and relations →α = {(s, t) | s →π t and l(s →π t) = α}, for some labeling l, Lemma 4.2 immediately applies to TRSs. To this end, extended local decreasingness formulated via explicit rewrite sequences with labeling functions is mapped to extended local decreasingness on families of abstract rewrite relations. Finally, we are ready to formalize a confluence result for first-order term rewriting.

Corollary 4.16. Let l be a labeling compatible with a TRS R, and let > and ⩾ be a well-founded order and a compatible preorder, respectively. If the critical peaks of R are extended locally decreasing for l and > and ⩾ then R is confluent.

Concrete confluence criteria are then obtained as instances of the above result by instantiating l. For example, Theorem 4.6 is obtained by showing that its relative termination assumption in combination with the rule labeling yields a compatible labeling for left-linear TRSs.

4.3 Checkable Confluence Proofs

In this section we instantiate Corollary 4.16 to obtain concrete confluence results, first for linear TRSs in Section 4.3.1, and then for left-linear TRSs in Section 4.3.2. Afterwards we discuss the design of the certificates, checkable by CeTA, in Section 4.3.3. The formalization corresponding to the first two subsections is part of Decreasing_Diagrams2.thy in IsaFoR, whereas the executable check-functions from Section 4.3.3 can be found in the theory Rule_Labeling_Impl.thy.

4.3.1 Linear Term Rewrite Systems

The rule labeling admits a confluence criterion for linear TRSs based on the formalization established so far.


Figure 11: Decreasingness of critical peaks in Example 4.18.

Lemma 4.17. Let R be a linear TRS.

(a) The rule labeling is a labeling.

(b) Parallel peaks are extended locally decreasing for the rule labeling.

(c) Variable peaks of R are extended locally decreasing for the rule labeling.

(d) The rule labeling is compatible with R.

(e) If all critical peaks of R are extended locally decreasing for the rule labeling, then R is confluent.

Proof. Item (a) follows from Definitions 4.11 and 4.5. For items (b) and (c) we employ the analysis of parallel and variable peaks from Section 4.2.1. Item (d) is then a consequence of items (b) and (c). Finally, item (e) amounts to an application of Corollary 4.16.

We demonstrate the rule labeling on a simple example.

Example 4.18 (Cop #761). Consider the TRS consisting of the following rules, where subscripts indicate labels for the rule labeling:

a →1 b    a →1 d    b →0 a    c →0 a    c →0 b

The critical peaks are decreasing for the rule labeling as depicted in Figure 11.


4.3.2 Left-linear Term Rewrite Systems

That a locally confluent terminating left-linear TRS is confluent can be established in the flavor of Lemma 4.17. The restriction to left-linearity arises from the lack of considering non-left-linear variable peaks in Section 4.2.1. As the analysis of such a peak would not give further insights we pursue another aim in this section, i.e., the mechanized proof of Theorem 4.6.

It is well known that the rule labeling li is in general not compatible with left-linear TRSs [42]. Thus, to obtain extended local decreasingness for variable peaks, in Theorem 4.6 the additional relative termination assumption is exploited. To this end, we use the source labeling. Note that for the rule labeling alone extended local decreasingness directly implies local decreasingness, as ⩾ on ℕ is the reflexive closure of > on ℕ.

Definition 4.19. The source labeling maps a rewrite step to its source, i.e., lsrc(s →π t) = s. Labels due to the source labeling are compared using the orders →+Rd/Rnd and →∗R.

The relative termination assumption of Theorem 4.6 makes all variable peaks of a left-linear TRS extended locally decreasing for the source labeling. These variable peaks might not be extended locally decreasing for the rule labeling, as the step u −→∥{ℓ1→r1} v in Figure 10(c) yields u →ⁿ v for n possibly larger than one. Hence we introduce weaker versions of decreasingness and compatibility.

Definition 4.20. A diagram of the shape t α← s →β u, where the step s →β u applies the rule ℓ2 → r2, closed by t →⩽β v and u →ⁿ⩽α v, is called weakly extended locally decreasing if n ⩽ 1 whenever r2 is linear, see Figure 12. We call a labeling l weakly compatible with a TRS R if parallel and variable peaks are weakly extended locally decreasing for l.

Following [115], the aim is to establish that the lexicographic combination of a compatible labeling with a weakly compatible labeling (e.g., lsrc × li) is compatible with a left-linear TRS. While weak extended local decreasingness could also be defined in the spirit of extended local decreasingness (with a more complicated join sequence or involving conversions), the chosen formulation suffices to establish the result, eases the definition, and simplifies proofs. As this notion is only used to reason about parallel and variable peaks, which need not be processed by automatic tools, the increased generality would not benefit automation. Based on the peak analysis of Section 4.2.1, the following results are formalized (such properties must be proved for each labeling function):


Figure 12: Weakly extended locally decreasing diagram.

Lemma 4.21. Let R be a left-linear TRS.

(a) Parallel peaks are weakly extended locally decreasing for the rule labeling.

(b) Variable peaks of R are weakly extended locally decreasing for the rule labeling.

(c) The rule labeling is weakly compatible with R.

Proof. Items (a) and (b) are established by labeling the rewrite steps in the corresponding diagrams based on the characterizations of parallel and variable peaks of Figures 10(a) and 10(c), respectively. Item (c) follows from (a) and (b) together with Lemma 4.17(a).

Similar results are formalized for the source labeling.

Lemma 4.22. Let R be a left-linear TRS whose duplicating rules terminate relative to the other rules.

(a) The source labeling is a labeling.

(b) Parallel peaks are extended locally decreasing for the source labeling.

(c) Variable peaks of R are extended locally decreasing for the source labeling.

(d) The source labeling is compatible with R.

Proof. Closure of rewriting under contexts and substitutions immediately yields item (a). Items (b) and (c) are established along the lines of the proof of Lemma 4.21. Finally, item (d) follows from items (b) and (c) in combination with Definition 4.14.

73 Chapter 4 Rule Labeling

Using this lemma, we proved the following results for the lexicographic combination of the source labeling with another labeling.

Lemma 4.23. Let R be a left-linear TRS whose duplicating rules terminate relative to the other rules and l be a labeling weakly compatible with R. Then lsrc × l is a labeling compatible with R.
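A minimal sketch of the underlying lexicographic comparison (our formulation, with the orders passed explicitly; the formalization additionally proves that well-foundedness and compatibility are preserved):

-- Pair labels (a, b), e.g. lsrc × li: strictly smaller if the first component
-- decreases strictly, or stays weakly smaller while the second decreases.
lexGt :: (a -> a -> Bool)    -- strict order on first labels
      -> (a -> a -> Bool)    -- compatible weak order on first labels
      -> (b -> b -> Bool)    -- strict order on second labels
      -> (a, b) -> (a, b) -> Bool
lexGt gt1 ge1 gt2 (x1, y1) (x2, y2) = gt1 x1 x2 || (ge1 x1 x2 && gt2 y1 y2)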

For reasons of readability we have left the orders > and ⩾ that are required for (weak) compatibility implicit and just mention that the lexicographic extension as detailed in [115] preserves the required properties. Finally, we prove Theorem 4.6.

Proof of Theorem 4.6. From Lemma 4.21(c) in combination with Lemma 4.23 we obtain that lsrc × li is a labeling compatible with a left-linear TRS, provided the relative termination assumption is satisfied. By assumption, the critical peaks are extended locally decreasing for the rule labeling li. As along a rewrite sequence labels with respect to lsrc never increase, the critical peaks are extended locally decreasing for lsrc × li. We conclude the proof by an application of Corollary 4.16.

Actually a stronger result than Theorem 4.6 has been established, as lsrc × li might show more critical peaks decreasing than li alone, as in the following example.

Example 4.24 (Cop #763). Consider the TRS R consisting of the single rule

f(f(x)) → f(g(f(x), f(x)))

That R is terminating can easily be shown by modern termination provers (for instance, applying the dependency pair transformation and a tcap-based dependency graph approximation suffices). Also note that Rd = R, Rnd = ∅, and →+Rd/Rnd = →+R. There is one critical peak:

f(f(g(f(x), f(x)))) ← f(f(f(x))) → f(g(f(f(x)), f(f(x))))

Apart from reordering the steps from the right, the only valley for this peak is:

f(f(g(f(x), f(x)))) → f(g(f(g(f(x), f(x))), f(g(f(x), f(x))))) ← f(g(f(g(f(x), f(x))), f(f(x)))) ← f(g(f(f(x)), f(f(x))))


Since there is only one rule in R all steps get the same label when applying the rule labeling. Consequently, labeling this diagram decreasingly using the rule labeling cannot succeed, because the sequence from the right consists of two steps. However, using the source labeling, and hence also lsrc × li, the diagram is clearly decreasing, because all terms in the valley are reachable from f(f(f(x))) using →+Rd/Rnd.

Regarding a variant of Theorem 4.6 based on the conversion version of decreasing diagrams the statement "As along a rewrite sequence labels with respect to lsrc never increase, . . . " in the above proof is no longer appropriate, as rewrite sequences are replaced by conversions. Consequently, in contrast to the valley version, the source labeling might destroy decreasingness, as shown by the next example.

Example 4.25 (Example 4.18 continued). Although the critical diagram in Figure 11(a) is decreasing for li it is not decreasing for lsrc × li as the label of the step a →(a,1) b is not larger than the label of the step b (c,0)← c. The problem is that a →∗ c does not hold.

To forbid the situation highlighted in Example 4.25, the property that the labels of the conversions are smaller or equal to the source of the local peak must be ensured.

Definition 4.26. A diagram of the shape t1 ← s → tn, t1 ↔∗ t2 ↔∗ · · · ↔∗ tn has the fan property if s →∗ ti for 1 ⩽ i ⩽ n.

The fan property is sketched in Figure 13, where the solid arcs indicate the diagram and the dashed arcs the additional conditions.⁵ The fan property is related to, but slightly different from source decreasingness [44], which demands that, using the source labeling, every peak ← s → is connected by a conversion in which all labels are smaller than s. Our formalization covers the following result, which is new in theory and IsaFoR:

Theorem 4.27 (Conversion version of decreasing diagrams). A left-linear TRS is confluent if its duplicating rules terminate relative to its other rules and all its critical peaks have local diagrams that both have the fan property and are decreasing for the rule labeling.

⁵ Note that s →∗ ti implies s →∗ ti−1 and s →∗ ti+1 if ti−1 ← ti → ti+1, and so the indicated reductions suffice.


Figure 13: The fan property.

Proof. That lsrc × li is a labeling is obtained as in the proof of Theorem 4.6. The fan property ensures that a critical peak that is decreasing for the rule labeling li is also decreasing for lsrc × li. We conclude by an application of Corollary 4.16.

The following example demonstrates that the fan property is necessary for Theorem 4.27 to be correct. Note that the TRS in Example 4.25 is confluent and can hence only motivate the fan property but cannot show its indispensability.

Example 4.28 (Cop #762). Consider the TRS R consisting of the following rules, where subscripts indicate labels for the rule labeling:

a →2 b    f(a, b) →1 f(a, a)    f(b, a) →1 f(a, a)    f(a, a) →1 c    g(x) →0 f(x, x)

Then Rd/Rnd is easily seen to be terminating, for example using the lexi- cographic path order with a quasi-precedence that equates a and b and has g > f > c. There are four critical peaks, all of which are decreasing with

respect to the given rule labeling:

f(a, b) 2← f(a, a) →1 c            f(a, b) →1 f(a, a) →1 c

f(b, a) 2← f(a, a) →1 c            f(b, a) →1 f(a, a) →1 c

f(b, b) 2← f(a, b) →1 f(a, a)      f(b, b) 0← g(b) 2← g(a) →0 f(a, a)

f(b, b) 2← f(b, a) →1 f(a, a)      f(b, b) 0← g(b) 2← g(a) →0 f(a, a)

Nevertheless, R is not confluent: f(b, b) ← f(a, b) ← f(a, a) → c is a conversion between two distinct normal forms. Note that the local diagrams for the final two critical peaks violate the fan property: g(a) is neither reachable from f(a, b) nor from f(b, a).

Finally, we remark that in the formalization Theorem 4.6 is obtained by instantiating Theorem 4.27. To this end, we formalized that the fan property holds vacuously whenever the local diagram is a valley. The direct proof of Theorem 4.6 on page 74 does not have a correspondence in the formalization but it already conveys the proof idea of the more complex proof of Theorem 4.27.

4.3.3 Certificates

Next we discuss the design of the certificates for confluence proofs via Theorem 4.27, i.e., how they are represented in CPF, and the executable checker to verify them. A minimal certificate could just claim that the considered rewrite system can be shown decreasing via the rule labeling. However, this is undecidable, even for locally confluent systems [42]. Hence the certificate contains the following entries: the TRS R, the index function i, (candidates for) the joining conversions for each critical peak, an upper bound on the number of rewrite steps required to check the fan property, and, in case R is not right-linear, a relative termination proof for Rd/Rnd. The labels in the joining conversions are not required in the certificate, since CeTA has to check, i.e., compute them anyway. The same holds for the critical peaks. Note that the (complex) reasoning required for parallel and variable peaks does not pollute the certificates. The outline of a certificate for a confluence proof according to Theorem 4.27 is shown in Figure 14.⁶

⁶ In Figure 14 some boilerplate nodes and details are omitted; a full certificate in CPF for Example 4.18 is available at http://cl-informatik.uibk.ac.at/experiments/2016/rule_labeling/rule_labeling_conv.proof.xml.


certificationProblem
  input
    trs
      rule
        lhs
        rhs
      ...
  proof
    decreasingDiagrams
      relativeTerminationProof
      ruleLabelingConv
        ruleLabelingFunction
          ruleLabelingFunctionEntry
            rule
            label
          ...
        convertibleCriticalPeaks
          convertibleCriticalPeak
            source
            conversionLeft
              conversion
              rewriteSequence
              conversion
            conversionRight
              conversion
              rewriteSequence
              conversion
          ...

Figure 14: Structure of a Rule Labeling Certificate in CPF.

Besides elements for structuring the certificate (decreasingDiagrams, ruleLabelingConv), the additions to CPF for representing proofs via Theorem 4.27 are the elements to represent the rule labeling (ruleLabelingFunction) and the joining conversions (convertibleCriticalPeaks). Conversions and rewrite sequences themselves were already representable in CPF (conversion, rewriteSequence) and easy

to reuse. To check such a certificate CeTA performs the following steps:

(a) Parse the certificate. To parse the CPF elements that were newly intro- duced we extended CeTA’s parser accordingly.

(b) Check the proof for relative termination. Luckily, CeTA already supports a wide range of relative termination techniques, so that here we just needed to make use of existing machinery.

(c) Compute all critical peaks of the rewrite system specified in the certificate.

(d) For each computed critical peak, find and check a decreasing joining conversion given in the certificate. To check decreasingness (see Listing 9) we require the decomposition of the joining conversions to be explicit in the certificate. As the confluence tools that generate certificates might use different renamings than CeTA when computing critical pairs, the conversions given in the certificate are subject to a variable renaming. Thus, after computing the critical peaks, CeTA has to consider the joining conversions modulo renaming of variables. Then checking that they form a local diagram and that the labels are extended locally decreasing is straightforward.

(e) Check the fan property. The certificate contains an upper bound on the number of rewrite steps required to reach the terms in the conversions from the source of the peak. This ensures termination of CeTA when checking the existence of suitable rewrite sequences.

CeTA also supports checking decreasingness using the valley version of decreasing diagrams, i.e., certifying applications of Theorem 4.6. In that case splitting the joining sequences in the certificate is not required: for every critical peak just two rewrite sequences need to be provided. CeTA can automatically find a split if one exists: given two natural numbers α and β and a sequence σ of natural numbers, is there a split σ = σ1σ2σ3 such that σ1 ⊆ <α, σ2 ⊆ ⩽β with length of σ2 at most one, and σ3 ⊆ <αβ? The checker employs a simple, greedy approach: we pick the maximal prefix of σ with labels smaller than α as σ1. If the next label is less than or equal to β we take it as σ2 and otherwise we take the empty sequence for σ2. Finally, the remainder of the sequence is σ3. A straightforward case analysis shows that this approach is complete, i.e., if it fails then no such split exists.


4.4 Assessment

First we detail why Theorem 4.6 is an adequate candidate for formalization and certification. On the one hand, regarding the aspect of automation, it is easily implementable as the relative termination requirement can be outsourced to external relative termination provers and the rule labeling heuristic has already been implemented successfully [2,41]. Furthermore, it is a powerful criterion as demonstrated by the experimental evaluation in Section 8.3. On the other hand, regarding the aspect of formalization, it is challenging because it involves the combination of different labeling functions (in the sense of [115]). Hence, in our formalization Theorem 4.6 is not established directly, but obtained as a corollary of more general results. In particular Lemma 4.23 is based on a more general result, which allows different labeling functions to be combined lexicographically. This paves the way for reusing the formalization described here when tackling the remaining criteria in [115], which are based on more flexible combinations of labeling functions, and use labelings besides the source labeling, Lemma 4.22, and the rule labeling, Lemma 4.17. For example, labels can also be defined based on the path to a rewrite step, or the redex that is being contracted. In order to certify the corresponding proofs, we will also have to extend the CPF format with encodings of those labelings.

The required characterization of closing local peaks, see Figure 10, provides full information about the rewrite steps involved in the joining sequences. As this characterization is the basis for many confluence criteria (not necessarily relying on decreasing diagrams) this result aids future certification efforts. We anticipate that the key result for closing variable peaks in the left-linear case, see Section 4.2.1, does not rely on the annotated version of parallel rewriting, but as [115] also supports labelings based on parallel rewriting, the developed machinery should be useful for targeting further confluence results from [115]. Needless to say, parallel rewriting is handy on its own. The formalization described in this chapter covers a significant amount of the results presented in [115]. As explained, additional concepts, e.g. the annotated version of parallel rewriting, were formalized with the remaining criteria in mind. However, for some results that are not covered yet, e.g. persistence [5,25], we anticipate that even formalizing the preliminaries requires significant effort.

Next we discuss the usefulness of existing formalizations for this work. The existing machinery of IsaFoR provided invaluable support. We regard our efforts to establish an annotated version of parallel rewriting not as a shortcoming of


IsaFoR, but as a useful extension to it. On the contrary, we could employ many results from IsaFoR without further ado, e.g., completeness of the unification algorithm to compute critical peaks, plain rewriting to connect parallel steps with single steps, and the support for relative termination. That Lemma 4.8 occurred several times in IsaFoR can be traced to textbook proofs, e.g. [11], where this result is not made explicit either. Instead it is established in the scope of a larger proof of the critical pair lemma. Still, in later proofs the result is used as if it had been established explicitly. In IsaFoR these proofs had been duplicated, but as formalization papers typically come with code refactoring, these deficiencies have been fixed. Ultimately our aim in the formalization was to follow paper proofs as closely as possible. The benefit of this choice is that shortcomings in existing proofs can be identified and eradicated. As our formalization covers and combines results from various sources, the notions used in the papers had to be connected. As already mentioned, while different notations are typically identified in paper proofs, in the formalization this step has to be made explicit. To avoid this drawback in the future our recommendation is to strive for more standard notation, also in paper proofs.

Finally, differences to [115] are addressed. The concepts of an L-labeling and an LL-labeling from [115] have been generalized to the notion of a labeling compatible with a TRS, while weak-LL-labelings are represented via weakly compatible labelings here.⁷ This admits the formulation of the abstract conditions such that a labeling ensures confluence (see Corollary 4.16) independent of the TRS being (left-)linear. Furthermore we present a generalization of Theorem 4.6 to the conversion version of decreasing diagrams, namely Theorem 4.27.

4.5 Summary

In this chapter we presented the formalization of a result establishing confluence of left-linear term rewrite systems based on relative termination and the rule labeling. While our formalization admits stronger results in order to prepare for further results from [115], we targeted Theorem 4.6, whose statement,

⁷ The definitions of L-labelings and LL-labelings spell out the shape of the standard joining valley for a variable peak for linear and left-linear TRSs, respectively, and then impose restrictions on the occurring labels that ensure compatibility.

in contrast to its proof, does not require the complex interplay of relative termination and the rule labeling. Hence this criterion is easily implementable for automated confluence tools, admitting the use of external termination provers. Our formalization subsumes the original criterion for the rule labeling, see Lemma 4.17(e), which is applicable to linear systems only. Dealing with non-right-linear systems required an analysis of non-right-linear variable peaks, and of the interplay with the relative termination condition. Furthermore, whereas plain rule labeling can be proved correct by decreasing diagrams, the involvement of the source labeling means that extended decreasing diagrams are required. Hence the proof of Theorem 4.6 is significantly more involved than the one of Lemma 4.17(e).

Despite the fact that any confluence proof by the conversion version of decreasing diagrams can be completed into a confluence proof by the valley version using the same labels [79, Theorem 3], conversions can be significantly shorter than valleys [79, Example 8]. Regarding the conversion version of decreasing diagrams, in automated tools the main obstacle is finding suitable conversions. Even though simple heuristics [42, Section 4] have been proposed to limit the explosion in the search space when considering conversions, most automated confluence provers still favor the valley version. While those heuristics suffice for the rule labeling, Theorem 4.27 shows that in a more complex setting conversions must satisfy additional restrictions, which make the search for suitable conversions even more challenging.

Chapter 5 Redundant Rules

Not that I have anything much against redundancy. But I said that already. Larry Wall ([email protected])

In this chapter we present a remarkably simple technique based on the removal and addition of redundant rewrite rules, i.e., rules that can be simulated by other rules, when (dis)proving confluence of term rewrite systems. We demonstrate how automatic confluence provers benefit from the addition as well asthe removal of redundant rules. Due to their simplicity, our transformations were easy to formalize in a proof assistant and are thus amenable to certification. Experimental results in Section 8.3 show the surprising gain in power.

Example 5.1 (Cop #442). Consider the TRS R consisting of the two rewrite rules

f(f(x)) → x f(x) → f(f(x))

The two non-trivial critical pairs

f(f(f(x))) ←⋊→ x    x ←⋊→ f(f(f(x)))

are obviously joinable

f(f(f(x))) →R f(x) →R f(f(x)) →R x

but not by a multistep, see Definition 2.34. Consequently, the result of van Oostrom [78] on development closed critical pairs does not apply. After adding the rewrite rule f(x) → x to R, we obtain four new critical pairs

f(x) ←⋊→ x x ←⋊→ f(x) f(f(x)) ←⋊→ x x ←⋊→ f(f(x))


The new rule ensures that fⁿ(x) −→○ x for all n ⩾ 0 and thus confluence of the extension follows from Theorem 3.34. First we establish that fⁿ(x) −→○ x by induction on n. The claim is trivially true for n = 0. Given fⁿ⁻¹(x) −→○ x, we can take substitutions σ and σ′ that map x to fⁿ⁻¹(x) and x, respectively, and obtain f(x)σ −→○ xσ′, i.e., fⁿ(x) −→○ x. For each of the critical pairs s ←⋊→ t, we have either s −→○ t or s ←⋉⋊→ t and s ←−○ t, which implies s −→○ · ∗← t. Therefore the TRS is almost development closed and thus confluent. Since the new rule can be simulated by the original rules, f(x) →R f(f(x)) →R x, also R is confluent.

Confluence of the TRS in Example 5.1 is hard to show directly: ACP 0.51, CoLL-Saigawa 1.1, and CSI 1.1 all fail without the technique of this chapter, but all three can prove confluence of the extended TRS. Below we explain how such extensions can be found automatically. The next example shows that also proving non-confluence may become easier after adding rules.

Example 5.2 (Cop # 216). Consider the TRS R [22] consisting of the eight rewrite rules

f(g(a), g(y)) → b    f(x, y) → f(x, g(y))    g(x) → x    a → g(a)
f(h(x), h(a)) → c    f(x, y) → f(h(x), y)    h(x) → x    a → h(a)

All critical pairs are deeply joinable¹ but R is not confluent. Two of the critical pairs are

b ←⋊→ f(h(g(a)), g(x)) c ←⋊→ f(h(x), g(h(a)))

After adding them as rules

f(h(g(a)), g(x)) → b    f(h(x), g(h(a))) → c

new critical pairs are obtained, one of which is

b ←⋊→ c

¹ A critical pair s ←⋊→ t is deeply joinable if u ↓ v for any two reducts u of s and v of t. The example defeats any non-confluence check based on proving non-joinability of peaks starting from critical peaks.


Since b and c are different normal forms, the extension is obviously non-confluent. Since the new rules can be simulated by the original rules, also R is non-confluent. Of the three tools mentioned, ACP shows non-confluence by first deriving the rule g(a) → a, which can also be simulated by existing rules, giving rise to a critical pair that extends to a non-joinable peak:

b ← f(g(a), g(a)) → f(g(a), a) →∗ c

CoLL-Saigawa and CSI (without the techniques from this chapter) fail.

The remainder of the chapter is structured as follows. In the next section we describe the theory underlying the addition and removal of rules, and Section 5.2 is devoted to its integration into CeTA. In Section 5.3 we sketch several heuristics for finding redundant rules implemented in CSI.

5.1 Theory

In this section we present the easy theory behind the use of redundant rules for proving confluence. For adding such rules we use the following folklore result.

Lemma 5.3. If ℓ →∗R r for every rule ℓ → r from S then →∗R = →∗R∪S.

Proof. The inclusion →∗R ⊆ →∗R∪S is obvious. For the reverse direction it suffices to show that s →∗R t whenever s →S t. The latter ensures the existence of a position p in s, a rewrite rule ℓ → r in S, and a substitution σ such that s|p = ℓσ and t = s[rσ]p. We obtain ℓ →∗R r from the assumption of the lemma. Closure of →∗R under contexts and substitutions yields the desired s →∗R t.

Corollary 5.4. If ℓ →∗R r for every rule ℓ → r from S then R is confluent if and only if R ∪ S is confluent.

Proof. We obtain →∗R = →∗R∪S from the preceding lemma. Hence we also have ↓R = ↓R∪S and ↑R = ↑R∪S. Therefore

↑R ⊆ ↓R ⇐⇒ ↑R∪S ⊆ ↓R∪S

Definition 5.5. A rule ℓ → r ∈ R is redundant if ℓ →∗R\{ℓ→r} r.


By Corollary 5.4, if ℓ → r ∈ R is redundant, then R is confluent if and only if R \ {ℓ → r} is confluent. In other words, removing a redundant rule does not affect confluence of a TRS. For removing rules while reflecting² confluence (or adding rules while reflecting non-confluence) it suffices that the left- and right-hand sides are convertible with respect to the remaining rules.

Lemma 5.6. If ℓ ↔∗R r for every rule ℓ → r from S then ↔∗R∪S = ↔∗R.

Proof. The inclusion ↔∗R ⊆ ↔∗R∪S is obvious. For the reverse direction it suffices to show that s ↔∗R t whenever s →S t. The latter ensures the existence of a position p in s, a rewrite rule ℓ → r in S, and a substitution σ such that s|p = ℓσ and t = s[rσ]p. We obtain ℓ ↔∗R r from the assumption of the lemma. Closure of ↔∗R under contexts and substitutions yields the desired s ↔∗R t.

Corollary 5.7. If R is confluent and ℓ ↔∗R r for every rule ℓ → r from S then R ∪ S is confluent.

Proof. From the preceding lemma and the confluence of R we obtain

↔∗R∪S = ↔∗R ⊆ ↓R ⊆ ↓R∪S

Hence R ∪ S is confluent.

Example 5.8 (Cop #20). Reconsider the TRS from Example 4.7. Its five rules are

hd(x : y) → x    inc(x : y) → s(x) : inc(y)    nats → 0 : inc(nats)
tl(x : y) → y    inc(tl(nats)) → tl(inc(nats))

We have shown confluence of this system using the rule labeling, i.e., decreasing diagrams. However, simply removing the last rule would make confluence obvious, since the remaining four rules constitute an orthogonal TRS. And indeed, because of the following joining sequences, the last rule is superfluous and can be dropped:

inc(tl(nats)) → inc(tl(0 : inc(nats))) → inc(inc(nats))
tl(inc(nats)) → tl(inc(0 : inc(nats))) → tl(s(0) : inc(inc(nats))) → inc(inc(nats))

² We are interested in transformations that reflect (rather than preserve) confluence, because our goal is automation, and it is natural to work from the conclusion for finding proofs.


Some other examples from [34] can be dealt with in a similar fashion: in [34, Example 1] the first rule is joinable using the other rules, and the remaining system is orthogonal. The same argument (with a different joining conversion) applies to [34, Example 5]. Corollary 5.7 can also be beneficial when dealing with non-left-linear systems, as demonstrated by the following example.

Example 5.9 (Cop #233). Consider the TRS from [103] consisting of the four rewrite rules

f(x, x) → f(g(x), g(x))    f(x, y) → f(h(x), h(y))    g(x) → p(x)    h(x) → p(x)

Because of the conversion

f(x, x) → f(h(x), h(x)) → f(p(x), h(x)) → f(p(x), p(x)) ← f(g(x), p(x)) ← f(g(x), g(x))

we can remove the first rule. Since the resulting TRS is orthogonal and the removed rule is convertible using the other rules, also the original TRS is confluent.

It can also be beneficial to both add and remove rules. In particular adding a redundant rule can help with removing other, problematic rules, as shown in the following example.

Example 5.10 (Cop #412). Consider the TRS consisting of the three rewrite rules

f(x, y) → f(g(x), g(x))    f(x, x) → a    g(x) → x

After adding the rule f(x, y) → a, which is justified by the rewrite sequence f(x, y) → f(g(x), g(x)) → a, we can remove the first two original rules, due to the following conversions:

f(x, y) → a ← f(g(x), g(x))    f(x, x) → a

The resulting TRS is orthogonal and hence confluent. Since the added rule can be simulated by the original rules, and the removed rules are convertible using the new rule, also the original TRS is confluent.


While adding (or removing) rules using Corollary 5.4 is always safe in the sense that we cannot lose confluence, it is easy to see that the reverse direction of Corollary 5.7 does not hold in general. That is, removing convertible rules can make a confluent TRS non-confluent, as for example witnessed by the two TRSs R = {a → b, a → c} and S = {b → a}. Clearly R is not confluent, S ∪ R is confluent, and b ↔∗R a. We give one more example, showing that the removal of redundant rules can considerably speed up finding a confluence proof.

Example 5.11 (Cop #424). Consider the TRS consisting of the following two rules:

f(x) → g(x, f(x))    f(f(f(f(x)))) → f(f(f(g(x, f(x)))))

This TRS is confluent by the simultaneous critical pair criterion of Okui [75].³ Alas, there are 58 simultaneous critical pairs and indeed ACP 0.51, which implements Okui's criterion, does not terminate in ten minutes. While 58 looks small, the simultaneous critical pairs become quite big. For example, with t = g(f³(g(x, f(x))), f⁴(g(x, f(x)))), one of the simultaneous critical pairs is

f³(g(f(g(f(t), f(f(t))), f(f(g(f(t), f(f(t)))))))) ←−○⋊→ f⁵(g(f³(x), f⁴(x)))

and testing joinability using development steps is very expensive. In general, if one takes the rules f(x) → g(x, f(x)) and fⁿ(f(x)) → fⁿ(g(x, f(x))), then the number and size of the simultaneous critical pairs will grow exponentially in n. However, Corollary 5.7 is applicable (the second rule can be simulated by the first rule in one step) and showing confluence of the first rule is trivial.

5.2 Formalization and Certification

The redundant rules technique is particularly well suited for certification for the following reasons. First, since the theory we use is elementary, formalizing it in a proof assistant is entirely straightforward. Moreover the generated

³ Note that the given TRS is feebly orthogonal [80]. The key observation here is that any simultaneous critical pair arises from a peak between a development step and a plain rewrite step. By the orthogonalization procedure from [80], we can obtain an equivalent peak of two orthogonal development steps, which is joinable by two development steps, thus satisfying Okui's criterion.

proofs, while simple in nature, can become very large, which makes checking them infeasible by hand, but easy for a machine. Finally, as demonstrated in Section 8.3, the existing certifiable confluence techniques heavily benefit from our transformations. To add support for our transformations to CeTA we formalized the results from Section 5.1 in Isabelle and integrated them into IsaFoR. The theory Redundant_Rules.thy contains the theoretical results, whose formalization, directly following the paper proof, requires a mere 100 lines of Isabelle, stressing the simplicity of the transformations. We extended CPF for representing proofs using addition and removal of redundant rules and implemented dedicated check functions in the theory Redundant_Rules_Impl.thy, enabling CeTA to certify such (non-)confluence proofs. A certificate for (non-)confluence of a TRS R by an application of the redundant rules transformation consists of three parts:

• the modified TRS R′,

• a certificate for the (non-)confluence of R′, and

• a justification for redundancy of the added and removed rules. Here for the rules that were added, i.e., all ℓ → r in S = R′ \ R, we simply require a bound on the length of the derivations showing ℓ →∗R r. This bound is necessary to ensure termination of the check function. For the deleted rules in a non-confluence certificate, i.e., all ℓ → r in S = R \ R′, the same bound is used for ℓ →∗R′ r. For a confluence proof one can either give explicit conversions ℓ ↔∗R′ r or rely on the bound again, which then has to ensure ℓ ↓R′ r.

Implementing check functions for such a certificate is then straightforward. We simply compute R′ \ R and R \ R′ and use the given bound and conversions to ensure redundancy. Whereas for certification we only need to check that the modified rules really are redundant, the question of how to automatically find suitable rules for addition and deletion is more intricate. In the next section we discuss and evaluate our implementation of several possible approaches in the confluence prover CSI, see also Chapter 8. A bounded reachability check of this kind is sketched below.
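As an illustration, a bounded reachability test could look as follows in Haskell (a sketch under the assumption of some one-step rewrite function step; it is not the Isabelle check function):

import qualified Data.Set as Set

-- Accept a rule l -> r as redundant if r is reachable from l in at most
-- `bound` steps; the bound guarantees that the breadth-first search, and
-- hence the checker, terminates.
reachableWithin :: Ord t => (t -> [t]) -> Int -> t -> t -> Bool
reachableWithin step bound s t = go bound (Set.singleton s)
  where
    go k frontier
      | t `Set.member` frontier = True
      | k <= 0 = False
      | otherwise =
          go (k - 1) (Set.fromList (concatMap step (Set.toList frontier)))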


5.3 Heuristics

CSI features a powerful strategy language, see Section 8.2, which allows combining confluence techniques in a modular and flexible manner, making it easy to test different strategies that exploit redundant rules. We use the following four heuristics to add and remove rules.

Joining Sequences (js) Our first strategy is to add (minimal) joining sequences of critical pairs as rules, i.e., in Corollary 5.4 we choose

S ⊆ {s → u, t → u | s ←⋊→ t with s →∗R u and t →∗R u}

The underlying idea here is that critical peaks become joinable in a single step, which is advantageous for other confluence criteria, for example rule labeling. This heuristic solves e.g. Example 5.2.

Rewriting Right-Hand Sides (rhs) The second strategy for obtaining redundant rules to add is to rewrite right-hand sides of rules, i.e., in Corollary 5.4 set

S = {ℓ → t | ℓ → r ∈ R and r →R t}

Again the motivation is to produce shorter joining sequences for critical pairs, facilitating the use of other confluence criteria. For instance, in Example 5.1 this heuristic derives the desired rule; a sketch of the construction follows.
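Abstracting over the rewrite engine, the heuristic is a one-liner; the Haskell sketch below assumes a hypothetical helper step that returns all one-step R-reducts of a term:

-- Candidate redundant rules for Corollary 5.4: every right-hand side is
-- rewritten one step, keeping the original left-hand side.
rhsCandidates :: (term -> [term]) -> [(term, term)] -> [(term, term)]
rhsCandidates step rules = [ (l, t) | (l, r) <- rules, t <- step r ]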

Forward Closures (fc) The final strategy for adding rules is based on instantiating the right-hand side before rewriting it.

Definition 5.12. Given two variable disjoint rewrite rules ℓ1 → r1 and ℓ2 → r2 of a TRS R, a function position p in r1, and an mgu σ of r1|p and ℓ2, the rewrite rule ℓ1σ → r1σ[r2σ]p is a forward closure of R. We write FC(R) for the extension of R with all its forward closures.

We use this process in both directions, i.e., we also extend the left-hand sides of rules—more precisely in Corollary 5.4 we use

S = FC(R) ∪ FC(R⁻¹)⁻¹


Example 5.13 (Cop #47). Consider the TRS R due to Klop [52] consisting of the three rules

f(x, x) → a    g(x) → f(x, g(x))    c → g(c)

Because of the rewrite sequence

c → g(c) → f(c, g(c)) → f(g(c), g(c)) → a

we also have c → g(c) →∗ g(a) and since a and g(a) are not joinable, which can be shown using tree automata techniques [27], R is not confluent. However, finding this conversion is non-trivial and indeed ACP 0.51, CoLL-Saigawa 1.1, and CSI without redundant rules fail. Using forward closures to derive redundant rules we find

c → f(c, g(c)) ∈ FC(R)        c → a ∈ FC³(R)
c → f(g(c), g(c)) ∈ FC²(R)    c → g(a) ∈ FC⁴(R)

and hence a ←⋊→ g(a) ∈ CP(FC⁴(R)). To see why including FC(R⁻¹)⁻¹ is beneficial, consider the following example.

Example 5.14 (Cop #46). The TRS consisting of the rules

f(x, x) → a    f(x, g(x)) → b    c → g(c)

due to Huet [45] is not confluent because

a ← f(c, c) → f(c, g(c)) → b

is a peak connecting two distinct normal forms. There is an overlap between f(c, c) → b ∈ FC(R⁻¹)⁻¹ and f(x, x) → a ∈ R, resulting in the critical pair a ←⋊→ b. Note that considering FC(R) alone does not yield any progress in this example.

Deleting Rules (del) For removing rules we search for rules whose left- and right-hand sides are joinable, i.e., in Corollary 5.7 set

S = {ℓ → r | ℓ ↓R r}


This decision is motivated by simplicity of implementation and the fact that for confluent TRSs, joinability and convertibility coincide. Removing rules can benefit confluence proofs by eliminating critical pairs. Since our strategy here is a simple greedy one that removes as many rules as possible, we also lose confluence in some cases. In the case of adding rules we also discard rules that can be simulated by other rules in a single step. Without this refinement, the gain in power would become smaller, and even disappear for CSI’s full strategy. We also implemented and tested three other strategies, which did not yield any additional proofs:

• Inspired by Example 5.1 we tried to add rules specifically for making rewrite systems (almost) development closed. That is, we used

S = {s → t | s ←⋊→ t with s →∗R t and not s −→○R t}

in Corollary 5.4. All examples gained by this strategy can also be handled by (js) or (rhs).

• To help with systems containing AC-like rules we tried to add inverted reversible rules, by setting

S = {r → ℓ | ℓ → r ∈ R with r →∗R ℓ}

in Corollary 5.4. Again we gained no additional proofs compared to (js) and (rhs).

• When removing rules we also tried to search for conversions that are not valleys, by using rules in the reverse direction when searching for a join. More precisely, we tried

S = {ℓ → r | ℓ ↓R∪R⁻¹ r}

in Corollary 5.7, but this variation only lost examples compared to (del).

5.4 Summary

In this chapter we demonstrated how a very simple technique, namely adding and removing redundant rules, can boost the power of automated confluence

provers. It is easy to implement and we believe that confluence tools other than CSI could also benefit from such transformations, not only increasing their power, but also simplifying the generated proofs. Moreover the technique is well-suited for certification, resulting in more trustworthy proofs. Interestingly we observed that most of the systems gained by (del), see also Section 8.3, become orthogonal by removing redundant rules. This might be due to the fact that when designing example TRSs for new techniques, one often works by systematically making existing criteria non-applicable, and removing rules can undo this effort.


Chapter 6 Confluence of the Lambda Calculus

A man provided with paper, pencil, and rubber, and subject to strict discipline, is in effect a universal machine. Alan Turing

In the previous chapters we were concerned with certifiable confluence analysis for first-order rewriting. Now we turn our attention to higher-order formalisms. Before discussing automatable techniques for proving confluence of higher-order rewrite systems in the next chapter, we stay in the realm of formalization and discuss a formalized confluence proof for the quintessential higher-order rewrite system: the λ-calculus. Anticipating future work, the presented formalization could serve as the basis for formalizing results for higher-order rewriting. In this chapter we give a short proof of the Church-Rosser property for the λ-calculus enjoying two distinguishing features: firstly, it employs the Z-property, resulting in a short and elegant proof; and secondly, it is formalized in the nominal higher-order logic available for Isabelle/HOL. Dehornoy proved confluence for the rule of self-distributivity xyz → xz(yz)¹ by means of a novel method [19], the idea being to give a map • that is monotone with respect to →∗ and that yields for each object an upper bound on all objects reachable from it in a single step. Later, this method was extracted and dubbed the Z-property [20], and applied to prove confluence of various rewrite systems, in particular the λ-calculus. Here we present our Isabelle/HOL formalization of part of the above mentioned work, in particular that the λ-calculus with β-reduction is confluent since it enjoys the Z-property and that the latter property is equivalent to an abstract version of Takahashi's confluence method [104]. We achieve a rigorous

¹ Confluence of this single-rule term rewrite system is hard: presently no tool can prove it automatically.


nominal_datatype term =
    Var name
  | App term term
  | Abs x::name t::term  binds x in t

Listing 10: Nominal λ-terms.

treatment of λ-terms modulo α-equivalence by employing Nominal Isabelle [109], a nominal higher-order logic based on Isabelle/HOL. The formalization is available from the Archive of Formal Proofs [26].

6.1 Nominal Lambda Terms

In our formalization λ-terms are represented by the nominal data type shown in Listing 10, where the annotation “binds x in t” indicates that the equality of such abstraction terms is up to renaming of x in t, i.e., terms are modulo α using an automatic quotient construction. For the sake of readability we will use standard notation in the remainder of this chapter: we drop the Var constructor and write x instead of Var x, denote application by juxtaposition, i.e., write s t instead of App s t, and use λx. t instead of Abs x t. Instead of using the variable convention, when defining functions on λ-terms, we use Nominal Isabelle’s freshness constraints. A freshness constraint is written x ♯ t and states that x does not occur in t, or equivalently, that x is fresh for t. In principle it is always possible to rename variables in terms, or any finitely supported structure, away from a given finite collection of variables. In order to relieve the user of doing so by hand, Nominal Isabelle provides infrastructure that takes care of appropriate renaming. For instance, the following rule for performing case analysis on nominal λ-terms is generated:

∀c t P. (∀x. t = x =⇒ P t) ∧ (∀s u. t = s u =⇒ P t) ∧ (∀x s. x ♯ c ∧ t = λx. s =⇒ P t) =⇒ P t

Here c is a freshness context that can be instantiated to any structure that contains finitely many bound variables. Freshness constraints can then be

y [x := s] = (if x = y then s else y)
(t u)[x := s] = t [x := s] u [x := s]
y ♯ (x, s) =⇒ (λy. t)[x := s] = λy. t [x := s]

Listing 11: Capture-avoiding substitution in Nominal Isabelle.

used for defining nominal functions, giving rise to strong induction principles, similar to the rule above, with which one can reason like on paper, simply stating that bound and free variables are to be distinct, by instantiating c. For instance, consider the definition of capture-avoiding substitution.

Definition 6.1. For terms s and t and a variable x, the capture-avoiding substitution of s for x in t is denoted by t [x := s] and defined as follows:

t [x := s] = s                          if t = x
             y                          if t = y and y ≠ x
             u [x := s] v [x := s]      if t = u v
             λy. u [x := s]             if t = λy. u and y ≠ x and y ♯ s
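For comparison, the same definition can be written for raw named λ-terms, where the freshness side condition must be implemented by explicit renaming. The following Haskell sketch is entirely our own code (Nominal Isabelle instead obtains this by choosing a suitable representative of the α-equivalence class):

data Term = Var String | App Term Term | Abs String Term deriving (Eq, Show)

free :: Term -> [String]
free (Var x) = [x]
free (App t u) = free t ++ free u
free (Abs x t) = filter (/= x) (free t)

-- A name not occurring in the given list.
fresh :: [String] -> String
fresh avoid = head [ v | n <- [0 :: Int ..], let v = "x" ++ show n, v `notElem` avoid ]

subst :: String -> Term -> Term -> Term      -- subst x s t computes t [x := s]
subst x s (Var y) = if x == y then s else Var y
subst x s (App t u) = App (subst x s t) (subst x s u)
subst x s (Abs y t)
  | y == x = Abs y t                         -- x is shadowed below the binder
  | y `elem` free s =                        -- y not fresh for s: rename first
      let y' = fresh (x : y : free s ++ free t)
      in Abs y' (subst x s (subst y (Var y') t))
  | otherwise = Abs y (subst x s t)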

The formal definition as a nominal function in Isabelle is shown in Listing 11, where the last equation is only applicable when y is fresh for x and s, i.e., does not occur in s and is different from x. Note that on nominal terms these equations define a total function, because the freshness constraint can always be fulfilled by choosing an appropriate representative. However, nominal functions do not come for free. After stating the defining equations, one is faced with proof obligations that ensure pattern-completeness, termination, equivariance, and well-definedness. With the help of some home-brewed Eisbach [63] methods we were able to handle those obligations automatically. We now illustrate the strong induction principle for nominal λ-terms on the Substitution Lemma [12, Lemma 2.1.16].

Lemma 6.2. For all variables x and y and all terms s, t, and u we have

x ♯ (y, u) =⇒ t [x := s][y := u] = t [y := u][x := s [y := u]]

Proof. In principle the proof proceeds by induction on t. However, for the case of λ-abstractions we additionally want the bound variable to be fresh for s, u, x, and y. With Nominal Isabelle it is enough to indicate that the variables of those terms should be avoided in order to obtain appropriately renamed bound variables. We will not mention this fact again in future proofs.


• In the base case t = z for some variable z. If z = x then we have t [x := s][y := u] = s [y := u] and t [y := u][x := s [y := u]] = s [y := u], since then z ≠ y and thus z [y := u] = z. Otherwise z ≠ x. Now if z = y, then t [x := s][y := u] = u and t [y := u][x := s [y := u]] = u, since x ♯ u. If z ≠ y then both ends of the equation reduce to z and we are done.

• In case of an application, we conclude by definition and using the induction hypothesis twice.

• Now for the interesting case. Let t = λz. v such that z ♯ (s, u, x, y). Then

(λz. v)[x := s][y := u] = λz. v [x := s][y := u]
                        = λz. v [y := u][x := s [y := u]]
                        = (λz. v)[y := u][x := s [y := u]]

The first equality holds by z ♯ (s, u, x, y), while the second follows from the induction hypothesis. For the last step we need z ♯ (s [y := u], u, x, y), where z ♯ s [y := u] follows from z ♯ (s, u, y) by a straightforward induction on s.

Using substitution we can define β-reduction in the expected way.

Definition 6.3. We define a β-step inductively by the compatible closure [12, Definition 3.1.4] of the β-rule in a nominal version:

x ♯ t =⇒ (λx. s) t →β s [x := t]
s →β t =⇒ s u →β t u
s →β t =⇒ u s →β u t
s →β t =⇒ λx. s →β λx. t

The last three rules taken together yield closure under contexts, while the first rule employs a freshness constraint in order to define a root β-step with the help of capture-avoiding substitution. If we dropped the freshness constraint, the resulting induction principle would not be strong enough with respect to avoiding capture of bound variables. The following standard "congruence properties" will be used freely in the remainder.

Lemma 6.4. Let x be a variable and s, s′, t, and t′ be terms.


[Figure 15: The Z and triangle properties. (a) Z. (b) Triangle.]

• If s →∗β s′ and t →∗β t′ then s t →∗β s′ t′.

• If s →∗β s′ then λx. s →∗β λx. s′.

• If s →∗β s′ and t →∗β t′ then t [x := s] →∗β t′ [x := s′].

They are proved along the lines of their textbook proofs [12, Lemma 3.1.6 and Proposition 3.1.16], the first two by induction on the length of the rewrite sequences and the last one by nominal induction on t followed by a nested nominal induction on the definition of β-steps, using Lemma 6.2. Moreover we will make use of the easily proved fact that β-reduction is coherent with abstraction, i.e., steps in an abstraction must happen below that abstraction.

Lemma 6.5. Let x be a variable and s and t be terms. If λx. s →∗β t then there is a term u with t = λx. u and s →∗β u.

6.2 The Z Property

We now present the Z-property for abstract rewriting, show that it implies confluence, and then instantiate it for the case of nominal λ-terms modulo α equipped with β-reduction.

Definition 6.6. A binary relation → on a set A has the Z-property if there is a map • : A → A such that a → b implies b →∗ a• and a• →∗ b•.

The Z-property is illustrated in Figure 15(a), which also explains the name. If a relation → has the Z-property then it is monotone, i.e., a →∗ b implies a• →∗ b•, which is straightforward to show by induction on the length of the rewrite sequence from a to b.

Lemma 6.7. A relation that has the Z-property is confluent.

Proof. We show semi-confluence. Assume a →∗ c and a → d. We show d ↓ c by case analysis on the rewrite sequence from a to c. If it is empty there is nothing to show. Otherwise there is a b with a →∗ b and b → c. Then by monotonicity we have a• →∗ b•. From a → d we have d →∗ a• using the Z-property, so in total d →∗ b•. By applying the Z-property to b → c we also get c →∗ b• and consequently d ↓ c as desired.
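The proof is constructive: the common reduct of c and d is b•, where b is the predecessor of c on the given rewrite sequence. The following hypothetical Haskell lines (the names joinPoint, bullet, and path are ours, not part of the formalization) extract this witness:

-- Witness extraction from the proof of Lemma 6.7 (a sketch): given the map
-- bullet, a rewrite sequence a = a0 -> ... -> an = c as a non-empty list,
-- and a one-step reduct d of a, return a common reduct of c and d.
joinPoint :: (t -> t) -> [t] -> t -> t
joinPoint bullet path d = case path of
  [_] -> d                          -- empty rewrite sequence: c = a joins at d
  _   -> bullet (last (init path))  -- otherwise join at b• where b -> c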

There are two natural choices for functions on λ-terms that yield the Z-property for →β, namely the full-development function and the full-superdevelopment function. The former maps a term to the result of contracting all residuals of redexes in it [12, Definition 13.2.7] and the latter also contracts upward-created redexes [87, Section 2.4]. Here we opt for the latter, which requires less case analysis. To define superdevelopments we use an auxiliary function, namely a variant of App with built-in β-reduction at the root.

Definition 6.8. The function ·β is defined by the following equations:

x ♯ u =⇒ (λx. s′) ·β u = s′ [x := u]
x ·β u = x u
(s t) ·β u = s t u

Case analysis on the first argument shows that this function satisfies the following congruence-like property.

Lemma 6.9. For all terms s, s′, t, and t′, if s →∗β s′ and t →∗β t′ then s ·β t →∗β s′ ·β t′.

Definition 6.10. The full-superdevelopment function • on λ-terms is defined recursively as follows:

x• = x
(λx. t)• = λx. t•
(s t)• = s• ·β t•

Note the use of ·β to contract created redexes in the third equation. The following example illustrates the effect.


Example 6.11. Consider the rewrite sequence

(λx. x)(λy. y) z →β (λy. y) z →β z

Since applying • to the starting term yields

((λx. x)(λy. y) z)• = ((λx. x)(λy. y))• ·β z•
                    = ((λx. x)• ·β (λy. y)•) ·β z•
                    = ((λx. x) ·β (λy. y)) ·β z
                    = (λy. y) ·β z
                    = z

it constitutes a full-superdevelopment. Note that it is not a development because the redex (λy. y) z is not present in the starting term of the reduction, but created by the first step.²
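Being structural recursions, ·β and • are directly executable. The following hypothetical Haskell sketch, reusing Term and subst from the substitution sketch in Section 6.1, reproduces the computation of Example 6.11:

-- Definitions 6.8 and 6.10 made executable (a sketch; Term and subst as in
-- the substitution sketch of Section 6.1).
appBeta :: Term -> Term -> Term            -- s ·β u
appBeta (Abs x s) u = subst s x u          -- built-in β-step at the root
appBeta s         u = App s u              -- covers x ·β u and (s t) ·β u

bullet :: Term -> Term                     -- full superdevelopment t•
bullet (Var x)   = Var x
bullet (Abs x t) = Abs x (bullet t)
bullet (App s t) = appBeta (bullet s) (bullet t)

-- Example 6.11: the term (λx. x) (λy. y) z
example :: Term
example = App (App (Abs "x" (Var "x")) (Abs "y" (Var "y"))) (Var "z")
-- evaluating bullet example yields Var "z", as computed above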

Below, we freely use the fact that s• t• →=β (s t)•, which is shown by considering whether or not s• is an abstraction. The structure of the proof that →β has the Z-property proceeds via two auxiliary results: the first expresses that each term self-expands to its full-superdevelopment, and the second that applying • to the right-hand side of the β-rule, i.e., to the result of a substitution, "does more" than applying the map to its components first. Both are proved by structural induction.

Lemma 6.12. For all terms t we have t →∗β t•.

Proof. By induction on t.

• If t = x then t• = x and trivially x →∗β x.

• If t = λx. s then t• = λx. s•. Since s →∗β s• by the induction hypothesis we also have λx. s →∗β λx. s•.

• If t = t1 t2 then we have t1 t2 →∗β t1• t2• by the induction hypothesis. If t1• is not an abstraction then t• = t1• ·β t2• = t1• t2• and we are done.

² Like developments, superdevelopments are finite [87]. In particular the rewrite sequence Ω →β Ω →β Ω for Ω = (λx. x x)(λx. x x) is not a superdevelopment, because the redex in the second step was not upward-created.


• Otherwise t1• = λx. s for some variable x and term s and we conclude by

t = t1 t2 →∗β (λx. s) t2• →β s [x := t2•] = (λx. s) ·β t2• = t•

Lemma 6.13. We have t• [x := s•] →∗β (t [x := s])• for all terms s, t and all variables x.

Proof. We perform induction on t. The cases t = x and t = λy. t′ are straightforward. If t = t1 t2 we continue by case analysis on t1•.

• If t1• = λy. u, then λy. u [x := s•] = t1• [x := s•] →∗β (t1 [x := s])• by the induction hypothesis. Then, using Lemma 6.5, we obtain a term v such that (t1 [x := s])• = λy. v and u [x := s•] →∗β v. We then have

(t1 t2)• [x := s•] = u [y := t2•][x := s•]
                   = u [x := s•][y := t2• [x := s•]]
                   →∗β v [y := t2• [x := s•]]
                   →∗β v [y := (t2 [x := s])•]

using Lemma 6.2 in the second, u [x := s•] →∗β v in the third, and the induction hypothesis for t2 in the final step. Since we also have ((t1 t2)[x := s])• = (t1 [x := s] t2 [x := s])• = v [y := (t2 [x := s])•] we conclude this case.

• If t1• is not an abstraction, then from the induction hypothesis we have

(t1 t2)• [x := s•] = t1• [x := s•] t2• [x := s•]
                   →∗β (t1 [x := s])• (t2 [x := s])•
                   →=β ((t1 t2)[x := s])•

Lemma 6.14. The full-superdevelopment function • yields the Z-property for →β, i.e., if s →β t then t →∗β s• and s• →∗β t• for all terms s and t.


Proof. Assume s →β t. We continue by induction on the derivation of →β.

• If s →β t is a root step then s = (λx. s′) t′ and t = s′ [x := t′] for some s′ and t′. Consequently s• = s′• [x := t′•] and thus t →∗β s• using Lemma 6.12 twice, and s• →∗β t• by Lemma 6.13.

• In case the step happens below an abstraction write s = λx. s′, t = λx. t′, and s′ →β t′. The induction hypothesis yields t′ →∗β s′• →∗β t′• and hence λx. t′ →∗β λx. s′• = (λx. s′)• and λx. s′• →∗β λx. t′• = (λx. t′)•.

• If the step happens in the left argument of an application then s = s′ u and t = t′ u. From the induction hypothesis and Lemma 6.12 we have t′ u →∗β s′• u• →=β (s′ u)•. That (s′ u)• →∗β (t′ u)• also follows directly from the induction hypothesis.

• The case where the step happens in the right argument of an application is symmetric.

6.3 The Triangle Property

The Z-property is also closely related to the triangle property. In [104] confluence for the β-rule of the λ-calculus was proved by a method whose novel idea was to define a function • that maps a given λ-term t to the result of a full development, i.e., of contracting all its redexes, and to show that any, not necessarily full, development from t can be extended by another development to t•. This property is known as the triangle property.

Definition 6.15. A binary relation → on A has the triangle property for a map • : A → A and a binary relation −→○ on A if → ⊆ −→○ ⊆ →∗ and a −→○ b implies b −→○ a•.

Sometimes the additional condition a −→○ a• is imposed which completes the triangle as shown in Figure 15(b) and explains the name.

Lemma 6.16. A relation → has the Z-property for • if and only if it has the triangle property for • and some relation −→○ .

Proof. First assume that → has the triangle property for • and −→○. To show that → has the Z-property assume a → b. Then by assumption we also have a −→○ b and hence b −→○ a• and a• −→○ b•, by applying the triangle property twice, which together with −→○ ⊆ →∗ yields the Z-property.

Now assume → has the Z-property for •. We define the •-development relation by a −→○ b if a →∗ b and b →∗ a•. Then → ⊆ −→○ ⊆ →∗ follows from the definition of −→○ and the Z-property. The triangle property itself directly follows from the definition of −→○ and monotonicity of •.

The relation −→○ defined in the above proof is interesting in its own right. It allows for a syntax-free definition of development relative to any given •-function: a reduces to b by a •-development if b is between a and a•. Note that the classic (super)development relation in the λ-calculus is a restriction of this syntax-free −→○, obtained by taking the full-(super)development map as •. That s −→○ t whenever s reduces to t by a (super)development is easy to see. However, the reverse does not hold.

Example 6.17. Let I = λx. x and s = (λy. I) ((λx. x x) I). We have

s →∗β (λy. I) I →∗β I = s•

and hence s −→○ (λy. I) I, but there is no (super)development from s to (λy. I) I.

6.4 Assessment

Three well-known methods in the literature for showing confluence of the λ-calculus are:

(a) Complete developments have the diamond property, due to Tait and Martin-Löf [12, Section 3.2].

(b) Complete developments have the triangle property with respect to the full-development function, due to Takahashi [104].

(c) Full-developments have the Z-property, due to Dehornoy and van Oost- rom [20].

Our proof varies on this picture along yet another dimension, replacing developments (due to Church and Rosser, cf. [12, Definition 11.2.11]) by superdevelopments (due to Aczel, cf. [87, Section 2.7]). Where full-developments give a "tight" upper bound on the single-step reducts of a given term, full-superdevelopments do not.

Formalizing confluence of the λ-calculus has a long history, for which we refer to Section 9.1. Here we compare our formalization in Isabelle to two others: Nipkow's formalization in Isabelle/HOL [73], as currently distributed with Isabelle, and Urban and Arnaud's formalization in Nominal Isabelle.³ There are two major differences between the present proof and Nipkow's formalization. First, Nipkow uses de Bruijn indices to represent λ-terms. This considerably increases the size of the formal theories: almost 200 lines of the roughly 550 line development are devoted to setting up terms and the required manipulations on indices. Our development is 300 lines (60 of which are used for our ad hoc Eisbach methods). Of course the complexity of α-equivalence is delegated to Nominal Isabelle in our proof. The second difference is the actual technique used to show confluence: Nipkow proceeds by establishing the diamond property for complete developments. Urban and Arnaud establish the triangle property for multisteps with respect to the full-development function. The use of an auxiliary rewrite relation results in a 100 line increase compared to our formalization, which only needs the • function.

³ https://nms.kcl.ac.uk/christian.urban/Nominal/download.html


Chapter 7 Confluence of Higher-Order Rewriting

Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function. John Carmack

We now turn to confluence of higher-order (pattern) rewrite systems (PRSs) as introduced in Section 2.5. Intuitively, due to their restricted left-hand sides, PRSs behave mostly like TRSs as far as confluence is concerned. However, the presence of bound and functional variables leads to several subtle differences and effects. For instance, the possibility of nesting variables in right-hand sides of rules breaks the desired properties of parallel reduction. In the remainder of this chapter we describe the confluence criteria implemented in our tool CSIˆho. Starting from critical pairs and a higher-order critical pair lemma due to Nipkow [70] we first show how confluence of terminating PRSs is decidable, like in the first-order setting. Here the main difficulties are dealing with bound variables in the definition of critical pairs and computing them via higher-order unification. For possibly non-terminating systems we discuss two more classical criteria, orthogonality and development closed critical pairs. Finally we investigate modularity of confluence for PRSs and a higher-order version of the redundant rules technique from Chapter 5.

7.1 Higher-Order Critical Pairs

To formulate confluence criteria for PRSs we first need a suitable notion of critical pair. To define critical pairs in the presence of λ we need to deal with bound variables becoming free when taking a subterm of a rule in the critical pair computation. To rebind those variables and at the same time rename rules apart, the following auxiliary definition is used.


Definition 7.1. An xk-lifter of a term t away from a set of variables V is a substitution σ = {F ↦ ρ(F)(xk) | F ∈ fv(t)}, where ρ is an injective mapping on variables with ρ(F) ≠ F and ρ(F) ∉ V, and ρ(F) : τ1 → · · · → τk → τ for x1 : τ1, . . . , xk : τk and F : τ, for all F ∈ fv(t).

For example, consider the function symbol h : (o → o) → o → o and the variables F : o → o, G : o, J : o, x : o, and y : o. Then the substitution σ = {F ↦ H(y), G ↦ I(y)} is a y-lifter of h(λx. F(x), G) away from {F, J} for variables H : o → o → o and I : o → o, with ρ(F) = H and ρ(G) = I. We collect the bound variables along a position p in a term t in bv(t, p), i.e., bv(t, ϵ) = ∅, bv(a(t1, . . . , tn), ip) = bv(ti, p), and bv(λx. t, 1p) = {x} ∪ bv(t, p).
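As a small illustration, the collection bv(t, p) is directly executable; the following Haskell lines are a hypothetical sketch on a simplified untyped term representation:

-- bv(t, p) on a simplified term representation (hypothetical sketch);
-- positions are lists of argument indices, index 1 addresses the body
-- of an abstraction.
data Term = Var String | Fn String [Term] | Abs String Term

type Pos = [Int]

bv :: Term -> Pos -> [String]
bv _         []      = []                    -- bv(t, ϵ) = ∅
bv (Fn _ ts) (i : p) = bv (ts !! (i - 1)) p  -- bv(a(t1, ..., tn), ip) = bv(ti, p)
bv (Abs x t) (1 : p) = x : bv t p            -- bv(λx. t, 1p) = {x} ∪ bv(t, p)
bv _         _       = error "position not in term"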

Definition 7.2. Let ℓ1 → r1 and ℓ2 → r2 be rewrite rules in a PRS R with fv(ℓ1) ∩ bv(ℓ1) = ∅ and let p be a position in Pos(ℓ1) such that tp(ℓ1|p) ∉ fv(ℓ1). Let {xk} = bv(ℓ1, p) and let σ be an xk-lifter of ℓ2 away from fv(ℓ1). If λxk. (ℓ1|p) and λxk. (ℓ2σ) are unifiable with mgu µ, then ℓ1 and ℓ2 overlap and give rise to the critical pair ℓ1µ[r2σµ]p ←⋊→ r1µ.

Example 7.3. Consider the signature consisting of the function symbols f : o → o → o and g : (o → o) → o, the variables F : o, G : o → o, and x : o, and the PRS consisting of the two rewrite rules

g(λx. f(F, x)) → F
f(g(λx. G(x)), F) → F

Then for p = 11 we have g(λx. f(F, x))|p = f(F, x) and bv(g(λx. f(F, x)), p) = {x}. The substitution σ = {F ↦ H(x)} is an x-lifter of f(g(λx. G(x)), F) away from {F}. Moreover λx. f(F, x) and λx. (f(g(λx. G(x)), F))σ = λx. f(g(λx. G(x)), H(x)) are unifiable with unifier µ = {F ↦ g(λx. G(x)), H ↦ λy. y}. Thus we obtain the critical pair g(λx. Fσµ) = g(λx. x) ←⋊→ g(λx. G(x)).

The next lemma states that critical pairs capture peaks as desired.

Lemma 7.4. Let R be a PRS and let s ←⋊→ t be a critical pair of R. Then there is a term u with s R← u →R t.

Proof. From the definition of critical pair we obtain rewrite rules ℓ1 → r1 and ℓ2 → r2, a position p, and substitutions σ and µ with t = r1µ and s = ℓ1µ[r2σµ]p. Define u = ℓ1µ. Then u → r1µ = t is trivial. We also have λxk. ((ℓ1|p)µ) = (λxk. (ℓ1|p))µ = (λxk. (ℓ2σ))µ = λxk. ((ℓ2σ)µ) and thus (ℓ1|p)µ = ℓ2σµ. Moreover, since ℓ1|p is a pattern and tp(ℓ1|p) ∉ fv(ℓ1), we also have ℓ1µ|p = ℓ2σµ. So in total u = ℓ1µ = ℓ1µ[ℓ2σµ]p → ℓ1µ[r2σµ]p = s.


Using this notion of critical pair we can state a higher-order version of the critical pair lemma.

Lemma 7.5 (Nipkow [70]). A PRS R is locally confluent if and only if s ↓R t for all critical pairs s ←⋊→ t of R.

Proof Sketch. The proof structure is essentially the same as in the first-order setting: in a local peak analyze the positions of the two rewrite steps. If they are overlapping there is a critical pair, which is joinable by assumption. If they are not overlapping then they are either parallel or one step takes place in the substitution of the other. In both cases a common reduct can be constructed, see [64] for a detailed account.

Example 7.6. Consider the PRS R for the map function from Example 2.50. Since R does not give rise to any critical pairs it is locally confluent.

Example 7.7 (Cop #426). The untyped lambda calculus with β- and η-reduction can be encoded as a PRS as follows:

abs : (term → term) → term
app : term → term → term

app(abs(λx. M(x)), N) → M(N)
abs(λx. app(N, x)) → N

There are two critical pairs:

abs(λx. M(x)) ←⋊→ abs(λx. M(x))        app(M, N) ←⋊→ app(M, N)

Since they are trivially joinable we conclude local confluence.

Combining the critical pair lemma with Newman’s Lemma yields a confluence criterion for PRSs.

Corollary 7.8. A terminating PRS R is confluent if and only if s ↓R t for all critical pairs s ←⋊→ t of R.

To make use of this criterion we need a repertoire of termination criteria. The ones supported by CSIˆho are sketched in the next section.


7.2 Termination

CSIˆho uses a basic higher-order recursive path order [88] and static dependency pairs with dependency graph decomposition and the subterm criterion [59]. Alternatively, one can also use an external termination tool like WANDA [55] as an oracle. This section gives a brief account of these three techniques. The main ingredient of a recursive path order is a well-founded order on the function symbols, which is then lifted to a well-founded rewrite order ≻ on the set of terms. Hence a rewrite system is terminating if ℓ ≻ r for all rules ℓ → r. Defining such orders for HRSs is problematic since terms are modulo β. The issue is that a relation that is closed under contexts cannot be well-founded. To see this, suppose ≻ is closed under contexts and assume b ≻ c. Then we have (λx. a) · b ≻ (λx. a) · c using closure under contexts, but also ((λx. a) · b)↕ηβ = a = ((λx. a) · c)↕ηβ and hence a ≻ a. One solution is to define an order on pre-terms in η-long form, and then use ≻ · →∗β to show termination. This approach is taken in [88] and used in CSIˆho. We briefly show the definition and sketch its use. First note that an η-long pre-term is of the form λx1 . . . xn. s0(s1, . . . , sn) where s0(s1, . . . , sn) is of base type and si is in η-long form for all 1 ⩽ i ⩽ n. Tailoring an order to the structure of pre-terms yields the following definition.

Definition 7.9. Assume a well-founded order >F on function symbols and a partitioning of F into two disjoint sets Fmul and Flex. The higher-order recursive path order (HORPO) is defined as follows: we have s ≻ t for pre-terms s : σ and t : τ in η-long form if σ = τ after collapsing all base types and one of the following clauses holds:

(H1) s = f(s1, . . . , sn) and si ⪰ t for some i ∈ {1, . . . , n}

(H2) s = f(s1, . . . , sn), t = g(t1, . . . , tm), s ≻≻ {t1, . . . , tm}, and f >F g

(H3Lex) s = f(s1, . . . , sn), t = f(t1, . . . , tn), f ∈ Flex, s ≻≻ {t1, . . . , tn}, and there is an i ∈ {1, . . . , n} with si ≻ ti and sj = tj for all 1 ⩽ j < i

(H3Mul) s = f(s1, . . . , sn), t = f(t1, . . . , tn), f ∈ Fmul, and {s1, . . . , sn} ≻mul {t1, . . . , tn}

(H4) s = f(s1, . . . , sn), t = t0(t1, . . . , tm), and s ≻≻ {t0, . . . , tm}

(H5) s = x(s1, . . . , sn), t = x(t1, . . . , tn), si ≻ ti for some i ∈ {1, . . . , n}, and si ⪰ ti for all i ∈ {1, . . . , n}

(H6) s = s0(s1, . . . , sn), t = t0(t1, . . . , tn), si ≻ ti for some i ∈ {0, . . . , n}, and si ⪰ ti for all i ∈ {0, . . . , n}

(H7) s = λx. s1, t = λx. t1, and s1 ≻ t1

Here ⪰ denotes the reflexive closure of ≻ and the relation ≻≻ is defined as follows: s = f(s1, . . . , sn) ≻≻ {t1, . . . , tm} if for all i ∈ {1, . . . , m} either s ≻ ti or sj ⪰ ti for some j ∈ {1, . . . , n}. Termination can then be shown using the following result.

Theorem 7.10 (van Raamsdonk [88]). A PRS R is terminating if for all ℓ → r ∈ R there is a pre-term r′ such that ℓ ≻ r′ →∗β r.

Example 7.11. We continue Example 7.6. The PRS for the map function can be shown terminating using HORPO: we have map(λx. F(x), nil) ≻ nil by (H1) for the first rule. For the second rule we use an intermediate pre-term, showing map(λx. F(x), cons(h, t)) ≻ cons((λx. F(x)) h, map(λx. F(x), t)). This suffices because then we can conclude by cons((λx. F(x)) h, map(λx. F(x), t)) →∗β cons(F(h), map(λx. F(x), t)). Using (H2) yields map >F cons and we are left to prove map(λx. F(x), cons(h, t)) ≻≻ {(λx. F(x)) h, map(λx. F(x), t)}. We get map(λx. F(x), cons(h, t)) ≻ (λx. F(x)) h by (H4) and (H1). To obtain map(λx. F(x), cons(h, t)) ≻ map(λx. F(x), t) we assume map ∈ Fmul, apply (H3Mul),¹ and are left to show {λx. F(x), cons(h, t)} ≻mul {λx. F(x), t}, which we get using (H1). In total we obtain confluence by Corollary 7.8.

¹ Choosing map ∈ Flex and applying (H3Lex) would also work.
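To convey the flavor of such path orders, the following hypothetical Haskell sketch implements the untyped first-order core behind clauses (H1)–(H3Lex); the type comparison, the multiset clause, and the higher-order clauses (H4)–(H7) are omitted.

-- First-order lexicographic path order (a sketch of the untyped core of
-- clauses (H1)-(H3Lex); not CSIˆho's implementation).
data Term = V String | F String [Term] deriving (Eq, Show)

vars :: Term -> [String]
vars (V x)    = [x]
vars (F _ ts) = concatMap vars ts

-- lpo prec s t checks s ≻ t for a precedence prec on function symbols
lpo :: (String -> String -> Bool) -> Term -> Term -> Bool
lpo _    (V _)      _     = False                        -- variables are minimal
lpo _    s          (V x) = x `elem` vars s              -- s ≻ x iff x occurs in s
lpo prec s@(F f ss) t@(F g ts)
  | any (\si -> si == t || lpo prec si t) ss     = True  -- subterm, cf. (H1)
  | prec f g && all (lpo prec s) ts              = True  -- precedence, cf. (H2)
  | f == g && all (lpo prec s) ts && lexGt ss ts = True  -- lexicographic, cf. (H3Lex)
  | otherwise                                    = False
  where
    lexGt (a : as) (b : bs) | a == b    = lexGt as bs
                            | otherwise = lpo prec a b
    lexGt _ _ = False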

The intuition behind the dependency pair approach, originally due to Arts and Giesl [10] for first-order TRSs, is to identify those parts of right-hand sides of rules that may give rise to a non-terminating rewrite sequence: one may think of it as analyzing the recursive calls. It is however not obvious how the dependency pair approach can be brought into a higher-order setting. The variants that emerged can be roughly split into the dynamic approach [56, 94], which admits dependency pairs headed by functional variables, but avoids bound variables becoming free, and the static approach [59, 92, 102], which only considers defined symbols, but where bound variables might become free. CSIˆho implements the static approach. Below we illustrate the technique on an example and refer to the literature for definitions and proofs. Static dependency pairs are defined just like in the first-order case: the signature is extended by marked versions of all function symbols, and left-hand sides of rules together with those subterms of the corresponding right-hand sides that are headed by defined symbols are collected as dependency pairs.

Example 7.12. Consider the following PRS R modeling fold right. The signature consists of

nil : natlist
cons : nat → natlist → natlist
foldr : (nat → nat → nat) → nat → natlist → nat

We have two rules in R

foldr(λx y. F(x, y), z, nil) → z
foldr(λx y. F(x, y), z, cons(h, t)) → F(h, foldr(λx y. F(x, y), z, t))

which give rise to one dependency pair:

foldr#(λx y. F (x, y), z, cons(h, t)) → foldr#(λx y. F (x, y), z, t)
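In the first-order case the computation of dependency pairs is a short traversal of right-hand sides; the following hypothetical Haskell sketch, reusing the Term type of the path-order sketch above, shows the idea (the static higher-order version additionally has to handle binders and the plain function-passing restriction discussed below).

-- First-order dependency pairs (a sketch; Term as in the path-order sketch).
type Rule = (Term, Term)

root :: Term -> Maybe String
root (F f _) = Just f
root _       = Nothing

defined :: [Rule] -> [String]        -- roots of left-hand sides
defined trs = [ f | (l, _) <- trs, Just f <- [root l] ]

subterms :: Term -> [Term]
subterms t@(F _ ts) = t : concatMap subterms ts
subterms t          = [t]

mark :: Term -> Term                 -- f becomes the marked symbol f#
mark (F f ts) = F (f ++ "#") ts
mark t        = t

-- one dependency pair per subterm of a right-hand side with defined root
dependencyPairs :: [Rule] -> [Rule]
dependencyPairs trs =
  [ (mark l, mark u)
  | (l, r) <- trs, u <- subterms r
  , maybe False (`elem` defined trs) (root u) ]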

Since the static dependency pair method does not give rise to dependency pairs headed by functional variables, it is only applicable to systems where such variables occur in a harmless fashion. This class of systems is called plain function-passing and was first introduced for simply typed term rewriting systems [60] and later extended to HRSs [59]. Intuitively, plain function-passing means that free higher-order variables on the left-hand side of a rule are directly passed to the right-hand side. The PRS in Example 7.12 is plain function-passing; we refer to the literature for details. Dependency pairs are commonly organized in the so-called dependency graph, which captures the idea that, since a finite system has only finitely many dependency pairs, at least one of them has to occur infinitely often in any infinite dependency chain. Attention can then be restricted to the strongly connected components of this graph, which can be solved in an iterative fashion.

Here one can use an order like HORPO, possibly enhanced by techniques like argument filterings and usable rules [102]. CSIˆho additionally supports the subterm criterion, originally due to Hirokawa and Middeldorp [40]. Roughly speaking, the subterm criterion permits one to delete dependency pairs where, after projecting dependency pair symbols to one of their arguments, the right-hand side becomes a subterm of the left-hand side.

Example 7.13. Continuing Example 7.12, we find that by projecting foldr# to its third argument we obtain cons(h, t) ▷ t. Hence the only dependency pair can be removed by the subterm criterion and R is terminating. Since it does not admit any critical pairs, it is confluent by Corollary 7.8.

The termination criteria described thus far are part of CSIˆho mainly so that it can be used as a standalone tool. They are subsumed by the techniques implemented in modern, dedicated higher-order termination tools like WANDA [55]. A transformation from PRSs to algebraic functional systems with meta-variables (AFSMs), the flavor of higher-order rewriting used in WANDA, is implemented in CSIˆho to make WANDA usable as an external oracle. AFSMs are simply typed, use both application as in the λ-calculus and function construction as in first-order rewriting, and have meta-variables for matching, which are variables that take arguments and come with a fixed arity. AFSMs were designed as an encompassing formalism that captures concepts from many other higher-order formalisms, see [55] for details.

Definition 7.14. For any simple type σ = σ1 → · · · → σn → ι, define a type declaration σ′ = [σ1 × · · · × σn] → ι. Given a PRS R over a signature F, let F′ = {f′ : σ′ | f : σ ∈ F} and choose for every variable x : σ that occurs free in some rule of R a corresponding meta-variable Zx : σ′. For a set of variables V define the function φV, which transforms PRS terms into AFSM meta-terms, as follows:

φV(λx. s) = λx. φV(s)
φV(f s1 · · · sn) = f′(φV(s1), . . . , φV(sn))     if f ∈ F
φV(x s1 · · · sn) = x (φV(s1), . . . , φV(sn))     if x is a variable not in V
φV(x y1↑η · · · yn↑η) = Zx(y1, . . . , yn)         if x ∈ V and the yi are variables not in V
φV(x s1 · · · sn) = Zx(φV(s1), . . . , φV(sn))     if x ∈ V, otherwise

Define R′ = {φfv(ℓ)(ℓ) → φfv(ℓ)(r) | ℓ → r ∈ R}.


This definition can be used to show termination of a PRS via the following theorem.

Theorem 7.15 (Kop [55]). A PRS R over signature F is terminating if the AFSM R′ is terminating over signature F ′.

Note that although AFSMs do allow for an application operator, transforming an application with variable left-hand side into a meta-variable of appropriate arity is essential.

Example 7.16 (Cop #764). Consider the PRS R consisting of the following four rules:

f(λx y. F x) → b        b → f(λx y. a)
f(λx y. a) → d          b → c

This system is locally confluent because its four critical pairs

b ←⋊→ d        f(λx y. a) ←⋊→ c        d ←⋊→ b        c ←⋊→ f(λx y. a)

are joinable. Note that there is a step f(λx y. a) →R b using the first rule, because (f(λx y. F x){F ↦ λz. a})↓β = f(λx y. a). This also directly yields non-termination, because the first two rules form a loop. Indeed the system is not confluent because of the non-joinable peak

d ← f(λx y. a) → b → c

Using the translation φ from Definition 7.14 non-termination is reflected: we have φ{F}(f(λx y. F x)) = f′(λx y. F′(x)) for a unary meta-variable F′, and f′(λx y. F′(x)) does match f′(λx y. a′) in an AFSM. Keeping the application, as in f′(λx y. F′′ x) for a nullary meta-variable F′′, it would not match, and one might mistakenly conclude termination and confluence.

7.3 Orthogonality

We now turn to non-terminating systems. Again classic criteria from the first-order setting apply, after some complications.


Definition 7.17. A left-linear PRS is orthogonal if it admits no critical pairs. It is weakly orthogonal if s = t for all its critical pairs s ←⋊→ t.

The PRS for map from Example 2.50 is orthogonal; the untyped lambda calculus as represented in Example 7.7 is weakly orthogonal. Compared to the first-order setting, the additional complexity is due to the possibility that residuals of a redex might be nested. Hence using parallel rewriting as we did in Chapter 3 does not work, i.e., −→∥ does not have the diamond property for orthogonal HRSs.

Example 7.18. Consider the HRS defined by the two rules

f(λx. Z(x)) → Z(Z(a))
g(x) → h(x)

Assuming a relation −→∥ for HRSs that contracts a set of parallel redexes in one go, we have f(λx. g(x)) −→∥ g(g(a)) and f(λx. g(x)) −→∥ f(λx. h(x)), using the first and second rule respectively. Since the only terms reachable from g(g(a)) using −→∥ are h(g(a)) and g(h(a)), and the only term reachable from f(λx. h(x)) is h(h(a)), the diamond property does not hold.

The obvious solution is to use multisteps, which also allow contracting nested redexes in a single step. Standard proofs then proceed by showing the diamond property or the triangle property of −→○.

Corollary 7.19 (Nipkow [72]). Orthogonal PRSs are confluent.

As in the first-order setting orthogonality can be relaxed to allow trivial critical pairs. Here we just state the result and refer to the literature for the (standard) proof.

Theorem 7.20 (van Oostrom and van Raamsdonk [81]). Weakly orthogonal PRSs are confluent.

This result was further extended by van Oostrom to allow for non-trivial critical pairs that are connected by a multistep, i.e., development closed PRSs.

Theorem 7.21 (van Oostrom [78]). A left-linear PRS R is confluent if s −→○ t for all critical pairs s ←⋊→ t of R.


Example 7.22 (Cop #458). Consider the PRS for the untyped lambda calculus from Example 7.7. If we add an additional bottom symbol of the type bot : term and the two rules

app(bot, M) → bot
abs(λx. bot) → bot

we obtain two new critical pairs:

abs(λx. bot) ←⋊→ bot        app(bot, M) ←⋊→ bot

Since both of them are development closed we conclude confluence of the extended system by Theorem 7.21.

7.4 Modularity

As a divide-and-conquer technique CSIˆho implements modularity, i.e., decomposing a PRS into parts with disjoint signatures, for left-linear PRSs. Note that the restriction to left-linear systems is essential: unlike in the first-order setting, confluence is not modular in general. The following example illustrates the problem.

Example 7.23 (Cop #462). Consider the PRS R from [9] consisting of the three rules

f(x, x) → a        f(x, g(x)) → b        µ(λx. Z(x)) → Z(µ(λx. Z(x)))

The first two rules and the third rule on their own are confluent, e.g. by Corollary 7.8 and Theorem 7.20 respectively. However, because of the peak

a ← f(µ(λx. g(x)), µ(λx. g(x))) → f(µ(λx. g(x)), g(µ(λx. g(x)))) → b

R is not confluent. Note that R does not have critical pairs, making it non-trivial to find this peak.

For left-linear systems modularity does hold. One can obtain this result as a consequence of a generalization of Theorem 7.20 to two PRSs. We call two PRSs R and S weakly orthogonal with respect to each other if (R←⋊→S) ∪ (R←⋉→S) ⊆ =, where we use the obvious extension of critical pairs to two PRSs, simply taking one rule from each system in Definition 7.2.


Theorem 7.24 (van Oostrom and van Raamsdonk [81]). Two left-linear PRSs commute if their rules are weakly orthogonal with respect to each other.

In combination with Lemma 2.19 this result immediately yields modularity for left-linear PRSs.

Corollary 7.25. Let R1 and R2 be confluent left-linear PRSs over two disjoint signatures. Then R1 ∪ R2 is confluent.

7.5 Redundant Rules

The technique described in Chapter 5 directly carries over to the higher-order setting. Since rewriting is closed under contexts and substitutions, the proofs need no modification.

Example 7.26. Consider the PRS from Example 7.23. After adding the redundant rule f(µ(λx. g(x)), µ(λx. g(x))) → b there is a critical pair a ←⋊→ b and non-confluence is obvious.

To find new rules like the one above we again use forward closures, applying rules in both directions. In the above example unifying Z(µ(λx. Z(x))) with g(x) and applying the reversed third rule to the left-hand side of the second rule yields the desired new rule. Note that since right-hand sides need not be patterns, we now deal with general higher-order unification. Otherwise implementing transformations based on redundant rules for PRSs is straightforward. One just needs to take care to only add rules that do not violate the pattern restriction. The following example illustrates removal of redundant rules.

Example 7.27 (Cop #465). Consider the following encoding of lambda calculus with Regnier's σ-reduction [89]:

app(abs(λx. T(x)), S) → T(S)
app(abs(λy. abs(λx. M(y, x))), S) → abs(λx. app(abs(λy. M(y, x)), S))
app(app(abs(λx. T(x)), S), U) → app(abs(λx. app(T(x), U)), S)

Since the left- and right-hand sides of the second and third rule are convertible using the first rule, they can be removed, and confluence of the first rule alone can be established by Theorem 7.20.
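To make the removal transformation concrete, the following hypothetical Haskell sketch, again reusing the Term and Rule types from the sketches above, approximates the first-order check: a rule is a removal candidate if its two sides are joinable within a bounded number of steps using the remaining rules. The criterion of Chapter 5 allows full convertibility, of which bounded joinability is a sound under-approximation, and for PRSs one additionally has to respect the pattern restriction mentioned above.

-- Bounded redundancy check for first-order rules (a sketch; Term and Rule
-- as in the earlier sketches of this chapter).
import Data.List (nub)

type Subst = [(String, Term)]

match :: Term -> Term -> Maybe Subst
match pat t0 = go pat t0 []
  where
    go (V x) t s = case lookup x s of
      Nothing           -> Just ((x, t) : s)
      Just t' | t' == t -> Just s
      _                 -> Nothing
    go (F f ps) (F g ts) s
      | f == g && length ps == length ts = goList ps ts s
    go _ _ _ = Nothing
    goList []       []       s = Just s
    goList (p : ps) (t : ts) s = go p t s >>= goList ps ts
    goList _        _        _ = Nothing

apply :: Subst -> Term -> Term
apply s (V x)    = maybe (V x) id (lookup x s)
apply s (F f ts) = F f (map (apply s) ts)

rewrites :: [Rule] -> Term -> [Term]      -- all one-step reducts
rewrites trs t = roots ++ inner t
  where
    roots = [ apply su r | (l, r) <- trs, Just su <- [match l t] ]
    inner (F f ts) = [ F f (take i ts ++ u : drop (i + 1) ts)
                     | (i, ti) <- zip [0 ..] ts, u <- rewrites trs ti ]
    inner _        = []

reach :: Int -> [Rule] -> Term -> [Term]  -- reachable in at most n steps
reach n trs t = iterate grow [t] !! n
  where grow ts = nub (ts ++ concatMap (rewrites trs) ts)

redundantDel :: Int -> [Rule] -> Rule -> Bool
redundantDel n trs rule@(l, r) =
  let rest = filter (/= rule) trs
  in  any (`elem` reach n rest r) (reach n rest l)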


7.6 Summary

In this chapter we presented generalizations of confluence criteria from term rewriting, where all functions are first-order, to rewrite systems over typed λ-terms. After defining critical pairs, where one needs to take care of bound variables, we gave a critical pair lemma and discussed orthogonality and development closed critical pairs. Here the main difference to term rewriting is that using parallel rewriting does not work, due to the nesting of redexes. As a divide-and-conquer technique we considered modularity, which, unlike for first-order systems, is restricted to left-linear PRSs. Finally we showed how to adapt the technique of redundant rules to the higher-order setting. The results we discussed are the basis for the higher-order confluence prover CSIˆho, which we describe in the next chapter.

Chapter 8 CSI and CSIˆho

Talk is cheap. Show me the code. Linus Torvalds

The tool CSI [66,114] is a strong automated confluence prover for first-order term rewrite systems, which has been in development since 2010 and participates in the annual confluence competition [4]. It is built on the same framework as the termination prover TTT2 [58], which is also developed in Innsbruck. Consequently when proving (relative) termination is required for a confluence criterion, a wide range of techniques is immediately available in CSI. In this chapter we explain how to obtain and use CSI and its sibling CSIˆho, an extension of CSI for proving confluence of higher-order rewrite systems. More precisely, CSIˆho automatically checks confluence of pattern rewrite systems. This restriction is imposed to obtain decidability of unification and thus makes it possible to compute critical pairs. To this end CSIˆho implements a version of Nipkow’s functional algorithm for higher-order pattern unification [71]. We also report on some distinguishing features and implementation details of the CSI family and assess its power in an experimental evaluation. Many of the techniques implemented in CSI produce proofs in CPF that can be independently verified by CeTA.

8.1 Usage

CSI is free software, licensed under the GNU LGPL. The source code, precompiled binaries, and the web-interface shown in Figure 16 are available from

http://cl-informatik.uibk.ac.at/software/csi


Figure 16: The web-interface of CSI and CSIˆho.

CSIˆho can be obtained from

http://cl-informatik.uibk.ac.at/software/csi/ho

and accessed through the same web-interface. CSI reads TRSs in the classic TPDB format, the format that is also used in the Cops database. In this format the TRS from Example 3.8 for instance reads as follows:

(VAR x y z)
(RULES
  f(f(x, y), z) -> f(x, f(y, z))
  f(x, y) -> f(y, x)
)

After specifying which symbols are to be treated as variables (all other symbols are considered to be function symbols) in a VAR declaration, the actual rules of the TRS are given using the keyword RULES.

The format for specifying PRSs for consumption by CSIˆho follows the same design, adding type declarations for variables and function symbols. The PRS for the untyped lambda calculus from Example 7.7 is given by

(FUN
  abs : (term -> term) -> term
  app : term -> term -> term
)
(VAR
  x : term
  N : term
  M : term -> term
)
(RULES
  app(abs(\x. M x), N) -> M(N),
  abs(\x. app(M, x)) -> M
)

Note that terms can be given in both applicative and algebraic notation, as in M x and M(N). Since application is denoted by juxtaposition, a delimiter , is now required between the rules to ensure the grammar is unambiguous. The format itself does not require systems to be PRSs, i.e., also general HRSs can be specified. A detailed description of the Cops formats is available at

http://coco.nue.riec.tohoku.ac.jp/problems/#format

Given a rewrite system in one of these formats, using CSI or CSIˆho via their web-interface is easy. After entering the system, select the tool to run and the property to be shown, and press submit. Besides the current versions of CSI (1.1) and CSIˆho, a configuration of CSI using only confluence criteria that can be checked by CeTA (✓CSI) and earlier releases are available. By default CSI shows (non-)confluence (CR). Since version 1.0 also unique normal form properties are supported, namely UNR (every term has at most one normal form) and UNC (every two convertible normal forms are equal). We refer to the literature [24, 66] for details about these properties and the corresponding implementation in CSI. For more sophisticated use cases the full power of CSI is available through its command line interface, which offers many settings and configurations. The basic invocation is


$ csi [options] [timeout]

For a detailed explanation of all options, in particular the proof search strategy, one can ask CSI for help by calling it with the argument --help. Below we describe some of the design choices that make CSI powerful, flexible, and easy to extend.

8.2 Implementation Details

Since its first release one of CSI's defining features has been its strategy language, which enables the combination of techniques in a flexible manner and facilitates the integration of new criteria. The basic building blocks of this language are processors that turn an input problem into a list of output problems that need to be solved. From these processors complex strategies are built using strategy combinators. Some important combinators are: sequential composition of strategies ;, alternative composition | (which executes its second argument if the first fails), parallel execution ||, and iteration *. A postfix n* executes a strategy at most n times, while [n] executes its argument for at most n seconds. Finally, ? applies a strategy optionally (i.e., only if it makes progress), and ! ensures that its argument only succeeds if confluence could be (dis)proved.
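The semantics of these combinators can be pictured with a small executable model; the following Haskell lines are a hypothetical sketch (CSI itself is built on the TTT2 framework and has a considerably richer processor interface):

-- A toy model of the strategy language (hypothetical sketch). A processor
-- maps a problem to Nothing (failure) or a list of remaining subproblems.
type Strategy p = p -> Maybe [p]

seqS :: Strategy p -> Strategy p -> Strategy p   -- ";"
seqS s t p = s p >>= fmap concat . mapM t

altS :: Strategy p -> Strategy p -> Strategy p   -- "|"
altS s t p = maybe (t p) Just (s p)

optS :: Strategy p -> Strategy p                 -- "?"
optS s = altS s (\p -> Just [p])

iterS :: Int -> Strategy p -> Strategy p         -- "n*"
iterS n s | n <= 0    = \p -> Just [p]
          | otherwise = optS (s `seqS` iterS (n - 1) s)

-- parallel execution "||" and the timeout "[n]" require concurrency and
-- are omitted from this model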

To illustrate its power we compare the strategy used in CSI 0.1 with the one from CSI 1.1. The original strategy was

(KB || NOTCR || (((CLOSED || DD) | add)2*)! || sorted -order)*

where sorted -order applies order-sorted decomposition and the methods written in capitals are abbreviations for sub-strategies: KB applies Knuth-Bendix' criterion, CLOSED tests whether the critical pairs of a TRS are strongly or development closed, DD implements decreasing diagrams, and NOTCR tries to establish non-confluence. The current strategy is

(if trs then (sorted -order*; (((GROUND || KB || AC || KH || AT || SIMPLE || CPCS2 || (REDUNDANT_DEL?; (CLOSED || DD || SIMPLE || KB || AC || GROUND{nono}))3*! || ((CLOSED || DD) | REDUNDANT_RHS)3*! || ((CLOSED || DD) | REDUNDANT_JS)3*! || fail)[30] | CPCS[5]2*)2* || (NOTCR | REDUNDANT_FC)3*!) ) else fail)

which illustrates how to integrate new techniques independently or in combination with others, for instance the REDUNDANT_X strategies, which are the different heuristics for finding redundant rules described in Section 5.3. The other criteria presented in this thesis are implemented in KB for Knuth-Bendix' criterion, CLOSED for strongly and almost parallel closed systems, CPCS(2) for the techniques based on critical-pair-closing systems, and DD for the rule labeling. Other additions are a decision procedure for ground systems [21] (GROUND), criteria by Klein and Hirokawa [51] (KH) and by Aoto and Toyama [6] (AT), simple to test syntactic criteria by Sakai, Oyamaguchi, and Ogawa [93], and Toyama and Oyamaguchi [108] (SIMPLE), and a version of the AC critical pair lemma based on extended rules [84] (AC). The full strategy configuration grew from 76 to 233 lines since the initial release. We remark that the price one pays for flexibility is the possibility of subtle errors. Note for instance the obscure-looking modifier {nono} applied to GROUND in the fourth line. This modifier ensures that the strategy it is applied to only succeeds if it shows confluence (and fails if it would have shown non-confluence). The modifier is necessary here, because this invocation of GROUND appears below REDUNDANT_DEL, which is the strategy that removes rules if their left- and right-hand sides are joinable using the other rules, see Section 5.3. Since that transformation only reflects confluence, a potential non-confluence proof by the decision procedure implemented in GROUND needs to be discarded at that point. Based on the same framework, CSIˆho inherits the strategy language and is configured the same way. Its strategy for proving confluence of PRSs is

(if prs then ((if left-linear -ho then homodular else fail)?; ( (NOTCR | REDUNDANT_FC)3*! || (REDUNDANT_DEL?;(CLOSED || KB_W))3*! || (CLOSED | REDUNDANT_RHS)3*! || (CLOSED | REDUNDANT_JS)3*! )) else fail)

This strategy first checks whether the input is a pattern system, and if not gives up straight away. Otherwise, if the system is left-linear, it is decomposed based on modularity as described in Section 7.4. The critical pair lemma together with the termination criteria described in Section 7.2, including a call to the external tool WANDA, is implemented in KB_W. As in CSI's configuration, NOTCR proves non-confluence, CLOSED tests for orthogonality and development closed

critical pairs, and REDUNDANT_X are the heuristics for adding and removing redundant rules.

8.3 Experimental Results

In this section we compare the confluence criteria described in the previous chapters, CSI’s certified and full strategies, and the contestants of CoCo, via an experimental evaluation. The experiments were carried out on a 64 bit GNU/Linux machine with an Intel® Core™ i7-5930K CPU, consisting of six cores clocked at 3.50 GHz with hyper-threading, and 32 GiB of memory.

8.3.1 Term Rewrite Systems

For the first-order setting we considered all 437 TRSs in the Cops database, version 764. For presenting the results we proceed in a top-down manner and first compare all participants of the first-order track of CoCo. Afterwards we take a closer look at CSI, in particular the certifiable criteria from Chapters 3 and 4, the gain in power due to the redundant rules technique from Chapter 5, and the gap between CSI's certified and full strategies. The results for the contestants of CoCo 2017 are shown in Table 1. Besides CSI 1.1, ACP 0.51¹ and CoLL-Saigawa 1.1² competed. A check-mark ✓ indicates that the tool's certified strategy was used, i.e., the tool produces a proof in CPF, which is certified using CeTA. Figure 17 shows the examples solved by the provers in relation to each other. Here we omit the certifiable configurations: the examples solved by ✓CSI are a subset of those solved by CSI, while, surprisingly, ✓ACP proves non-confluence of two systems (Cops #654 and #680) for which ACP answers MAYBE. For the 31 systems that CSI cannot handle, the main weakness is the lack of special support for non-left-linear rules. Here, for instance, criteria based on quasi-linearity [7] implemented in ACP are missing from CSI's repertoire. Of those 31 systems, 16 are currently out of reach for all automatic confluence tools, like self-distributivity or extensions of combinatory logic. While all non-confluence proofs produced by CSI are certifiable, there is still a gap in confluence analysis. The main missing techniques are a criterion to deal with AC rules, e.g. one based on the critical pair lemma or the one by Aoto and Toyama [6], advanced decomposition techniques based on layer systems [25], and techniques for dealing with non-left-linear systems, in particular the criteria by Klein and Hirokawa [51] and by Sakai, Oyamaguchi, and Ogawa [93].

¹ http://www.nue.riec.tohoku.ac.jp/tools/acp/
² http://www.jaist.ac.jp/project/saigawa/


          ACP   ✓ACP   CoLL-Saigawa   CSI   ✓CSI
yes       225     53            142   244    148
no        155    124             99   162    162
maybe      57    260            196    31    127

Table 1: Comparison of CoCo 2017 participants.

[Figure 17: Overlap between solved examples for CoCo 2017 participants (Venn diagram of CSI, ACP, and CoLL-Saigawa).]

We now take a closer look at CSI's certifiable confluence criteria in comparison to each other. Table 2 shows the results of running CSI with different strategies that apply the direct methods described in Chapters 3 and 4. The first three columns show the results of applying just Knuth-Bendix' criterion, Corollary 3.7, and Corollary 3.23. The number of confluence proofs obtained by using critical-pair-closing systems as in Section 3.5 is shown in the fourth column. Applying the rule labeling, i.e., Lemma 4.17(e) for linear TRSs and Theorem 4.6 for left-linear TRSs, yields the results in the fifth and sixth columns. The last two columns show CSI's non-confluence methods and the combination of all listed criteria. A more detailed story for our main contributions on certification is told in Figure 18, which shows the systems solved by Corollary 3.7 (SC), Corollary 3.23 (PC), critical-pair-closing systems (CPCS), and Theorem 4.6 (RL) in relation to each other.


         KB   SC   PC   CPCS   RL-L   RL-LL   NOTCR     Σ
yes      40   57   39     62     78      83       0   106
no        0    0    0      0      0       0     154   154
maybe   397  380  398    375    359     354     283   177

Table 2: Experimental results for certifiable criteria in CSI.

[Figure 18: Overlap between solved Cops for four confluence criteria (Venn diagram of SC, PC, CPCS, and RL).]

We omit Lemma 4.17(e), since it is subsumed by Theorem 4.6. Overall, the rule labeling based on decreasing diagrams turned out to be stronger than the other, simpler, mostly syntactic criteria. On our test-bed Theorem 4.6 can establish confluence of as many systems as the other three methods combined (83 systems), and by itself shows confluence of more than 75 % of all systems that are certifiably confluent by CSI's direct methods (83 of 106). However, this power comes at a cost: the certificates for confluence proofs by rule labeling are usually larger and much harder to check for a human, making automatic, reliable certification a necessity. The largest certificate (for Cop #60) has 769 KiB and lists 182 candidate joins for showing the 34 critical peaks decreasing. CeTA checks this certificate in 0.3 s. We remark that no confluence tool besides CSI has solved Cop #60 so far, stressing the importance of a certified proof.


         KB   SC   PC   CPCS   RL-L   RL-LL   NOTCR   Σ (✓CSI)
yes      73   91   84     86     99     106       0        148
no        0    0    0      0      0       0     162        162
maybe   364  346  353    351    338     331     275        127

Table 3: Experimental results for certifiable criteria in combination with the redundant rules technique.

         CSInrr   CSIjs   CSIrhs   CSIfc   CSIdel   CSI
yes         228     227      229     225      238   244
no          154     157      155     162      154   162
maybe        55      53       53      50       45    31

         ✓CSInrr   ✓CSIjs   ✓CSIrhs   ✓CSIfc   ✓CSIdel   ✓CSI
yes          106      124       116      112       113    148
no           154      157       155      162       154    162
maybe        177      156       166      163       170    127

Table 4: Experimental results for different redundant rules heuristics.

That Knuth-Bendix’ criterion is the weakest of the listed techniques is easily explained by the composition of Cops. It simply contains few terminating systems. Since showing confluence is easy after termination has been established, and in order to avoid skewing the database towards termination, they are often deemed of little interest for Cops. The, possibly surprising, power of strong closedness can be similarly explained. Of the 437 TRSs 286 are linear, while only 53 are left-linear but not right-linear. Finally we evaluate the redundant rules technique from Chapter 5. The impact of redundant rules on each of the direct methods is shown in Table 3. Comparing these numbers to the results from Table 2 reveals that all evaluated criteria benefit heavily, for instance the power of parallel closedness morethan doubles. To assess the power of the different heuristics described in Section 5.3, we consider each of them in combination with CSI’s full and certified strategies. The results are shown in Table 4. For the full strategy, adding joining sequences of critical pairs (js) or rewriting

right-hand sides (rhs) shows limited effect: (js) gains 3 non-confluence proofs and 2 confluence proofs while also losing 3; (rhs) gains 5 systems for confluence, while losing 4, and results in 1 additional non-confluence proof. As expected, forward closures (fc) turn out strongest for non-confluence, gaining 8 proofs, but weak for confluence, losing 4 proofs while adding only 1. Removing rules (del) is the most effective technique overall and gains 12 systems while losing 2 others. With all heuristics combined, 24 new systems can be shown (non-)confluent. For the certifiable strategy the results are the same for non-confluence, but the picture for confluence is a bit different. Here, (rhs) gains 10 proofs, (js) gains 18 systems, (fc) gains 6, and (del) gains 21 proofs while losing 14. Remarkably, in 15 of these 21 proofs the TRS becomes orthogonal after removing redundant rules, which emphasizes that our transformations can considerably simplify confluence proofs. In total, 50 new systems are shown (non-)confluent. Note that not only does the redundant rules technique significantly increase the number of certifiable confluence proofs, the resulting certificates can also be virtually impossible to check by hand, due to their size. The largest proof produced in our experiments starts from a TRS with five rules, Cop #141, and extends it with 50 redundant rules before showing confluence using a rule labeling; the certificate in CPF has 14 MiB.

8.3.2 Higher-Order Rewrite Systems

For experiments in the higher-order setting we again used Cops version 764, which contains 79 PRSs. We again start by comparing the participants of CoCo 2017. Other than CSIˆho 0.3, the contestants were SOL 1c_2017³ and ACPH 0.02, a higher-order extension of ACP. The results are shown in Table 5. Note that ACPH reports some false negatives (Cops #728, #729, and #730). This has been known since CoCo 2017 and the three systems are removed from ACPH's results and listed separately, in the row labeled "erroneous". On the current test-bed CSIˆho solves all systems where any of the tools produces a proof. Solving the remaining 11 systems will require serious effort; they contain e.g. lambda calculus with surjective pairing and self-distributivity of explicit substitution.

³ https://www.cs.gunma-u.ac.jp/~hamana/


            ACPH   CSIˆho   SOL
yes           48       55    54
no            10       13    11
maybe         18       11    14
erroneous      3        0     0

Table 5: Comparison of CoCo 2017 participants for PRSs.

The most powerful of CSIˆho's criteria is development closedness, which on its own shows confluence of 43 systems, 37 of which are weakly orthogonal. The critical pair lemma for terminating systems yields confluence proofs for 29 systems. For 3 of those proofs using WANDA as external termination prover is essential. Redundant rules show limited effect on the PRSs in Cops. Removing redundant rules is used in 3 of the proofs produced by CSIˆho, while adding redundant rules yields one additional proof of non-confluence, namely for the system from Example 7.26. We conclude by remarking that the possibility of subtle errors, like misconfigurations in the proof search strategy, the complexity of the generated proofs, e.g. for the rule labeling, and YES-NO conflicts between tools illustrate that certification is imperative in order to guarantee reliable automatic confluence analysis.


Chapter 9 Conclusion

So the two of them decide, after many a But and If, to kneel with their theories before Science. Christian Morgenstern (Alle Galgenlieder)

In the preceding chapters, we have given an overview of our most prominent contributions to IsaFoR/CeTA and CSI/CSIˆho. After discussing preliminaries on first- and higher-order rewriting in Chapter 2, we described our formalization of classic confluence criteria based on restricted joinability of critical pairs and a more modern approach based on critical-pair-closing systems in Chapter 3. We generalized all results in Chapter 3 to the setting of commutation. Together with our formalization of the rule labeling from Chapter 4 this covers all direct confluence criteria that are currently available in IsaFoR. Chapter 5 was devoted to the simple, yet widely applicable and surprisingly powerful technique of redundant rules, which not only further increased the power of CeTA, but also of the full strategy of CSI. We then switched to higher-order rewriting and gave a short, simple, formal proof of confluence of the λ-calculus in Chapter 6, before describing the theory behind our higher-order confluence prover CSIˆho in Chapter 7. Both CSI and CSIˆho were discussed in detail in Chapter 8, where we also gave an extensive experimental evaluation.

9.1 Related Work

While we covered CeTA's full repertoire for confluence in this thesis, it is also a powerful certifier for non-confluence proofs. CeTA can check that, given derivations s →∗ t1 and s →∗ t2, the terms t1 and t2 cannot be joined. Here the supported justifications are:


• testing that t1 and t2 are distinct normal forms,

• testing that tcap(t1σ) and tcap(t2σ) are not unifiable [114],

• usable rules, discrimination pairs, argument filters, and interpretations [3], and

• reachability analysis using tree automata [27].

Formalizing confluence criteria has a long history in the λ-calculus. The first mechanically verified proof of the Church-Rosser property of β-reduction was done using the Boyer-Moore theorem prover [95]. Pfenning's formalization in Elf [85] was later used to formalize conservativity of extensional lambda calculus with surjective pairing [101]. Huet [46] proved a stronger variant of the parallel moves lemma in Coq. Nipkow [73] used Isabelle/HOL to prove the Church-Rosser property of β, η, and βη. For β-reduction the standard Tait/Martin-Löf proof as well as Takahashi's proof [104] were formalized. Newman's lemma and the critical pair lemma (for first-order rewrite systems) have been formalized using ACL2 [91]. An alternative proof of the latter in PVS, following the structure of Huet's proof, is presented in [30]. PVS is also used in a formalization of the lemmas of Newman and Yokouchi [29]. Knuth and Bendix' criterion has also been formalized in Coq [18] and Isabelle/HOL [98].

Our work on redundant rules draws a lot of inspiration from existing literature. One starting point is [80], where van Oostrom introduces the notion of feeble orthogonality. A TRS is feebly orthogonal if the critical peaks arising from its non-redundant∗ rules are trivial or contain a trivial step (that rewrites a term to itself); a rule is redundant∗ if it can be simulated by another rule in a single step. Clearly our notion of redundancy generalizes redundancy∗. The most important prior work concerning redundant rules is [6]. In this paper, Aoto and Toyama describe an automated confluence criterion, implemented in ACP, based on decomposing TRSs into a reversible part P and a terminating part S. In order to improve the applicability of their criterion, they introduce a procedure based on two inference rules: replace, which turns ⟨S ∪ {ℓ → r}, P⟩ into ⟨S ∪ {ℓ → r′}, P⟩ provided r ↔∗P r′, and add, which turns ⟨S, P⟩ into ⟨S ∪ {ℓ → r}, P⟩ provided ℓ ↔∗P · →∗S r.

The key is that because P is reversible, ↔∗P and →∗P coincide, and therefore confluence of S ∪ P is not affected by applying these inference rules.

This very same idea underlies Lemma 5.3, which establishes reduction equivalence, and thus Corollary 5.4. Note that no rule removal is performed in [6].

There is a second connection between our work and [6] that seems noteworthy. Given a reversible P, every rule from P−1 can be simulated by a sequence of P-steps. Therefore, confluence of S ∪ P and of S ∪ P ∪ P−1 coincide by Corollary 5.4. Using this observation, one could decompose the confluence criteria of [6] into two steps: one that replaces P by P ∪ P−1, and an underlying confluence criterion that does not make use of reversibility, but instead demands that P is symmetric, i.e., P−1 ⊆ P.

The idea of showing confluence by removing rules whose sides are convertible has already been used in the literature, e.g. in [42, Example 11], which is a variation of Example 4.7. Other works of interest are [33, 116], where Gramlich and Zantema apply an idea similar to Corollary 5.4 to termination: if some additional requirements are met, then termination of R ∪ {ℓ → r} is equivalent to termination of R ∪ {ℓ → r′}, where r →R r′ by a non-erasing rule. This holds for non-overlapping TRSs [33, Theorem 4], or when the rule used in the r →R r′ step is locally confluent by itself, left-linear, and moreover does not overlap with any rules from R ∪ {ℓ → r} except itself [116, Theorem 4].
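The simplest instance of the redundant rules technique admits a direct implementation. The following Haskell sketch (our illustration; the function names and the depth bound k are ours, and CSI’s actual implementation is more sophisticated) removes a rule ℓ → r when ℓ rewrites to r using only the remaining rules, which leaves the reflexive transitive closure of the rewrite relation, and hence confluence, unchanged.

    import Control.Monad (foldM)

    data Term = Var String | Fun String [Term] deriving (Eq, Show)
    type Rule = (Term, Term)

    -- match pattern p against term t, yielding a substitution if possible
    match :: Term -> Term -> Maybe [(String, Term)]
    match p0 t0 = go p0 t0 []
      where
        go (Var x) u s = case lookup x s of
          Nothing           -> Just ((x, u) : s)
          Just u' | u' == u -> Just s
          _                 -> Nothing
        go (Fun f ps) (Fun g us) s
          | f == g && length ps == length us =
              foldM (\acc (p, u) -> go p u acc) s (zip ps us)
        go _ _ _ = Nothing

    apply :: [(String, Term)] -> Term -> Term
    apply s (Var x)    = maybe (Var x) id (lookup x s)
    apply s (Fun f ts) = Fun f (map (apply s) ts)

    -- all one-step rewrites of a term, at the root or below
    rewrites :: [Rule] -> Term -> [Term]
    rewrites rs t = atRoot ++ below t
      where
        atRoot = [ apply s r | (l, r) <- rs, Just s <- [match l t] ]
        below (Fun f ts) =
          [ Fun f (take i ts ++ t' : drop (i + 1) ts)
          | (i, ti) <- zip [0 ..] ts, t' <- rewrites rs ti ]
        below _ = []

    -- does u rewrite to v in at most k steps?
    reaches :: Int -> [Rule] -> Term -> Term -> Bool
    reaches k rs u v = go k [u]
      where
        go _ ws | v `elem` ws = True
        go 0 _                = False
        go n ws               = go (n - 1) (concatMap (rewrites rs) ws)

    -- a rule may be removed if its left-hand side reaches its right-hand
    -- side using only the remaining rules
    removable :: Int -> [Rule] -> Rule -> Bool
    removable k trs rule@(l, r) =
      reaches k [ru | ru <- trs, ru /= rule] l r

For the TRS {a → b, b → c, a → c} the rule a → c is removable, via a → b → c.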

While for first-order term rewriting the basic concepts are well agreed on, higher-order rewriting comes in many different flavors. The first definition of a higher-order rewriting formalism, Aczel’s contraction schemes, dates back to 1978 [1]. His definition extends first-order rewriting with binders and meta-variables, which allows higher-order functions to be defined. Extending this idea, Klop defined combinatory reduction systems in 1980 in his thesis [52]. The first formalisms to use types were introduced in 1991, namely the higher-order rewrite systems of Nipkow [70] described in this thesis and the algebraic functional systems of Jouannaud and Okada [48]. Other formalisms introduced in the 1990s include the expression reduction systems of Khasidashvili [50], Wolfram’s higher-order term rewriting systems [112], interaction systems by Laneve [61], combinatory reduction systems with eXtensions by Rose [90], abstract data type systems by Jouannaud and Okada [49], and inductive data type systems by Blanqui [15]. Due to the many different formalisms, the study of higher-order rewriting is a heterogeneous field. The termination competition for instance uses algebraic functional systems as its format of choice, and consequently most work on termination happened for that formalism. The only formalization effort for higher-order rewriting we are aware of is a formalization of HORPO for algebraic functional systems in Coq [57].

An interesting recent development was sparked by an extension of the class of higher-order patterns for which unification is still decidable [62]. This result is the basis of the new higher-order tool SOL [37], which can analyze confluence of a larger class of systems than CSIˆho. Moreover, this extension allows for a higher-order completion procedure. For PRSs, while technically possible, completion almost always fails immediately, since right-hand sides of rules are usually not patterns, making it impossible to orient newly deduced equations as rewrite rules.
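To illustrate the pattern restriction with a standard example: in the PRS rule map F (cons x xs) → cons (F x) (map F xs), the left-hand side is a pattern, but in the right-hand side the free variable F is applied to the free variable x rather than to distinct bound variables, so the right-hand side is not a pattern. Consequently, equations deduced from such right-hand sides cannot be oriented into rules whose left-hand sides are patterns, which is what makes completion fail.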

9.2 Future Work

While we could increase the number of certifiable confluence proofs significantly over the last few years, further efforts are needed in order to reach the full power of automatic confluence tools. In particular, criteria for dealing with systems containing AC rules are needed to close the gap. To strengthen confluence tools further, automating the Z-property would benefit both first- and higher-order provers, for instance for dealing with rules like self-distributivity.

Recently there has been growing interest in automating properties related to confluence, as witnessed by the categories on unique normalization in CoCo 2017. More work will be needed to obtain provers for these properties that are as powerful as state-of-the-art confluence tools. An easy first step could be to use the redundant rules technique: extending a TRS R with a rule ℓ → r for which ℓ →+R r holds affects neither the set of normal forms nor reachability.

Finally, formalization is needed for higher-order rewriting, not just for certification, but also to leverage the synergy between rewriting techniques and proof assistants. Proof assistants often use rewriting for equational reasoning; Isabelle/HOL, for instance, uses conditional higher-order rewriting. Here the choice of which theorems to use as rewrite rules is usually left to the user, which means that there are no guarantees about confluence or termination of these systems.
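As a small concrete instance (our example): for R = {a → b, b → c} we may add the rule a → c, since a →+R c; the extended system has exactly the same normal forms and the same reachability relation, so properties like unique normalization coincide for the two systems.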

Bibliography

[1] P. Aczel. A general Church-Rosser theorem. Unpublished manuscript, University of Manchester, 1978.

[2] T. Aoto. Automated confluence proof by decreasing diagrams based on rule-labelling. In Proc. 21st International Conference on Rewriting Techniques and Applications, volume 6 of Leibniz International Proceedings in Informatics, pages 7–16, 2010. doi: 10.4230/LIPIcs.RTA.2010.7.

[3] T. Aoto. Disproving confluence of term rewriting systems by interpretation and ordering. In Proc. 9th International Workshop on Frontiers of Combining Systems, volume 8152 of Lecture Notes in Artificial Intelligence, pages 311–326, 2013. doi: 10.1007/978-3-642-40885-4_22.

[4] T. Aoto, N. Hirokawa, J. Nagele, N. Nishida, and H. Zankl. Confluence Competition 2015. In Proc. 25th International Conference on Automated Deduction, volume 9195 of Lecture Notes in Artificial Intelligence, pages 101–104, 2015. doi: 10.1007/978-3-319-21401-6_5.

[5] T. Aoto and Y. Toyama. Persistency of confluence. Journal of Universal Computer Science, 3(11):1134–1147, 1997. doi: 10.3217/jucs-003-11-1134.

[6] T. Aoto and Y. Toyama. A reduction-preserving completion for proving confluence of non-terminating term rewriting systems. Logical Methods in Computer Science, 8(1:31):1–29, 2012. doi: 10.2168/LMCS-8(1:31)2012.

[7] T. Aoto, Y. Toyama, and K. Uchida. Proving confluence of term rewriting systems via persistency and decreasing diagrams. In Proc. Joint 25th International Conference on Rewriting Techniques and Applications and 12th International Conference on Typed Lambda Calculi and Applications, volume 8560 of Lecture Notes in Computer Science (Advanced Research in Computing and Software Science), pages 46–60, 2014. doi: 10.1007/978-3-319-08918-8_4.


[8] T. Aoto, J. Yoshida, and Y. Toyama. Proving confluence of term rewriting systems automatically. In Proc. 20th International Conference on Rewriting Techniques and Applications, volume 5595 of Lecture Notes in Computer Science, pages 93–102, 2009. doi: 10.1007/978-3-642-02348-4_7.

[9] C. Appel, V. van Oostrom, and J. G. Simonsen. Higher-order (non-)modularity. In Proc. 21st International Conference on Rewriting Techniques and Applications, volume 6 of Leibniz International Proceedings in Informatics, pages 17–32, 2010. doi: 10.4230/LIPIcs.RTA.2010.17.

[10] T. Arts and J. Giesl. Termination of term rewriting using dependency pairs. Theoretical Computer Science, 236(1-2):133–178, 2000. doi: 10.1016/S0304-3975(99)00207-8.

[11] F. Baader and T. Nipkow. Term Rewriting and All That. Cambridge University Press, 1998. doi: 10.1017/CBO9781139172752.

[12] H. P. Barendregt. The Lambda Calculus, Its Syntax and Semantics. North-Holland, 2nd edition, 1984.

[13] H. P. Barendregt. Lambda calculi with types. In S. Abramsky, D. M. Gabbay, and T. S. E. Maibaum, editors, Handbook of Logic in Computer Science, volume 2, pages 117–309. Oxford University Press, 1992.

[14] Y. Bertot and P. Castéran. Interactive Theorem Proving and Program Development; Coq’Art: The Calculus of Inductive Constructions. Texts in Theoretical Computer Science. Springer, 2004.

[15] F. Blanqui. Termination and confluence of higher-order rewrite systems. In Proc. 11th International Conference on Rewriting Techniques and Applications, volume 1833 of Lecture Notes in Computer Science, pages 47–61, 2000. doi: 10.1007/10721975_4.

[16] F. Blanqui and A. Koprowski. CoLoR, a Coq library on well-founded rewrite relations and its application to the automated verification of termination certificates. Mathematical Structures in Computer Science, 21(4):827–859, 2011. doi: 10.1017/S0960129511000120.


[17] E. Contejean, P. Courtieu, J. Forest, O. Pons, and X. Urbain. Automated certified proofs with CiME3. In Proc. 22nd International Conference on Rewriting Techniques and Applications, volume 10 of Leibniz International Proceedings in Informatics, pages 21–30, 2011. doi: 10.4230/LIPIcs.RTA. 2011.21.

[18] E. Contejean, P. Courtieu, J. Forest, O. Pons, and X. Urbain. Automated certified proofs with CiME3. In Proc. 22nd International Conference on Rewriting Techniques and Applications, volume 10 of Leibniz International Proceedings in Informatics, pages 21–30, 2011. doi: 10.4230/LIPIcs.RTA. 2011.21.

[19] P. Dehornoy. Braids and Self-Distributivity, volume 192 of Progress in Mathematics. Springer, 2000. doi: 10.1007/978-3-0348-8442-6.

[20] P. Dehornoy and V. van Oostrom. Z, proving confluence by monotonic single-step upperbound functions. Presentation at International Conference Logical Models of Reasoning and Computation, August 2008.

[21] B. Felgenhauer. Deciding confluence of ground term rewrite systems in cubic time. In Proc. 23rd International Conference on Rewriting Techniques and Applications, volume 15 of Leibniz International Proceedings in Informatics, pages 165–175, 2012. doi: 10.4230/LIPIcs.RTA.2012.165.

[22] B. Felgenhauer. A proof order for decreasing diagrams. In Proc. 1st International Workshop on Confluence, pages 7–14, 2012.

[23] B. Felgenhauer. Decreasing diagrams II. Archive of Formal Proofs, Aug. 2015. Formal proof development, http://www.isa-afp.org/entries/Decreasing-Diagrams-II.shtml.

[24] B. Felgenhauer. Efficiently deciding uniqueness of normal forms and unique normalization for ground TRSs. In Proc. 5th International Workshop on Confluence, pages 16–20, 2016. Available from http://cl-informatik.uibk.ac.at/iwc/2016.php.

[25] B. Felgenhauer, A. Middeldorp, H. Zankl, and V. van Oostrom. Layer systems for proving confluence. ACM Transactions on Computational Logic, 16(2:14):1–32, 2015. doi: 10.1145/2710017.


[26] B. Felgenhauer, J. Nagele, V. van Oostrom, and C. Sternagel. The Z property. Archive of Formal Proofs, June 2016. Formal proof development, https://www.isa-afp.org/entries/Rewriting_Z.shtml.

[27] B. Felgenhauer and R. Thiemann. Reachability, confluence, and termination analysis with state-compatible automata. Information and Computation, 253(3):467–483, 2017. doi: 10.1016/j.ic.2016.06.011.

[28] B. Felgenhauer and V. van Oostrom. Proof orders for decreasing diagrams. In Proc. 24th International Conference on Rewriting Techniques and Applications, volume 21 of Leibniz International Proceedings in Informatics, pages 174–189, 2013. doi: 10.4230/LIPIcs.RTA.2013.174.

[29] A. Galdino and M. Ayala-Rincón. A formalization of Newman’s and Yokouchi’s lemmas in a higher-order language. Journal of Formalized Reasoning, 1(1):39–50, 2008. doi: 10.6092/issn.1972-5787/1347.

[30] A. Galdino and M. Ayala-Rincón. A formalization of the Knuth-Bendix(-Huet) critical pair theorem. Journal of Automated Reasoning, 45(3):301–325, 2010. doi: 10.1007/s10817-010-9165-2.

[31] A. Geser, A. Middeldorp, E. Ohlebusch, and H. Zantema. Relative undecidability in term rewriting: II. The confluence hierarchy. Information and Computation, 178(1):132–148, 2002. doi: 10.1006/inco.2002.3150.

[32] W. D. Goldfarb. The undecidability of the second-order unification problem. Theoretical Computer Science, 13(2):225–230, 1981. doi: 10.1016/0304-3975(81)90040-2.

[33] B. Gramlich. Simplifying termination proofs for rewrite systems by preprocessing. In Proc. 2nd ACM SIGPLAN International Conference on Principles and Practice of Declarative Programming, pages 139–150, 2000. doi: 10.1145/351268.351286.

[34] B. Gramlich and S. Lucas. Generalizing Newman’s lemma for left-linear rewrite systems. In Proc. 17th International Conference on Rewriting Techniques and Applications, volume 4098 of Lecture Notes in Computer Science, pages 66–80, 2006. doi: 10.1007/11805618_6.


[35] F. Haftmann. Code Generation from Specifications in Higher Order Logic. Dissertation, Technische Universität München, München, 2009. urn:nbn:de:bvb:91-diss-20091208-886023-1-1.

[36] F. Haftmann and T. Nipkow. Code generation via higher-order rewrite systems. In Proc. 10th International Symposium on Functional and Logic Programming, volume 6009 of Lecture Notes in Computer Science, pages 103–117, 2010. doi: 10.1007/978-3-642-12251-4_9.

[37] M. Hamana. How to prove your calculus is decidable: Practical applications of second-order algebraic theories and computation. Proc. of the ACM on Programming Languages, 1(ICFP):22:1–22:28, 2017. doi: 10.1145/3110266.

[38] J. Hindley. The Church-Rosser Property and a Result in Combinatory Logic. PhD thesis, University of Newcastle-upon-Tyne, 1964.

[39] N. Hirokawa. Commutation and signature extensions. In Proc. 4th International Workshop on Confluence, pages 23–27, 2015.

[40] N. Hirokawa and A. Middeldorp. Tyrolean termination tool: Techniques and features. Information and Computation, 205(4):474–511, 2007. doi: 10.1016/j.ic.2006.08.010.

[41] N. Hirokawa and A. Middeldorp. Decreasing diagrams and relative termination. In Proc. 5th International Joint Conference on Automated Reasoning, volume 6173 of Lecture Notes in Artificial Intelligence, pages 487–501, 2010. doi: 10.1007/978-3-642-14203-1_41.

[42] N. Hirokawa and A. Middeldorp. Decreasing diagrams and relative termination. Journal of Automated Reasoning, 47(4):481–501, 2011. doi: 10.1007/s10817-011-9238-x.

[43] N. Hirokawa and A. Middeldorp. Commutation via relative termination. In Proc. 2nd International Workshop on Confluence, pages 29–33, 2013.

[44] N. Hirokawa, A. Middeldorp, S. Winkler, and C. Sternagel. Infinite runs in abstract completion. In Proc. 2nd International Conference on Formal Structures for Computation and Deduction, volume 84 of Leibniz International Proceedings in Informatics, pages 19:1–19:16, 2017. doi: 10.4230/LIPIcs.FSCD.2017.19.


[45] G. Huet. Confluent reductions: Abstract properties and applications to term rewriting systems. Journal of the ACM, 27(4):797–821, 1980. doi: 10.1145/322217.322230.

[46] G. Huet. Residual theory in λ-calculus: A formal development. Journal of Functional Programming, 4(3):371–394, 1994. doi: 10.1017/S0956796800001106.

[47] G. Huet and B. Lang. Proving and applying program transformations expressed with second-order patterns. Acta Informatica, 11(1):31–55, 1978. doi: 10.1007/BF00264598.

[48] J.-P. Jouannaud and M. Okada. A computation model for executable higher-order algebraic specification languages. In Proc. 6th IEEE Symposium on Logic in Computer Science, pages 350–361, 1991. doi: 10.1109/LICS.1991.151659.

[49] J.-P. Jouannaud and M. Okada. Abstract data type systems. Theoretical Computer Science, 173(2):349–391, 1997. doi: 10.1016/S0304-3975(96)00161-2.

[50] Z. Khasidashvili. Expression reduction systems. Technical Report 36, Vekua Institute of Applied Mathematics of Tbilisi State University, 1990.

[51] D. Klein and N. Hirokawa. Confluence of non-left-linear TRSs via relative termination. In Proc. 18th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning, volume 7180 of Lecture Notes in Computer Science (Advanced Research in Computing and Software Science), pages 258–273, 2012. doi: 10.1007/978-3-642-28717-6_21.

[52] J. Klop. Combinatory Reduction Systems. PhD thesis, Utrecht University, 1980.

[53] J. Klop, V. van Oostrom, and F. van Raamsdonk. Combinatory reduction systems: Introduction and survey. Theoretical Computer Science, 121(1-2):279–308, 1993. doi: 10.1016/0304-3975(93)90091-7.

[54] D. Knuth and P. Bendix. Simple word problems in universal algebras. In J. Leech, editor, Computational Problems in Abstract Algebra, pages 263–297. Pergamon Press, 1970. doi: 10.1016/B978-0-08-012975-4.50028-X.


[55] C. Kop. Higher Order Termination. PhD thesis, Vrije Universiteit Amsterdam, 2012.

[56] C. Kop and F. van Raamsdonk. Dynamic dependency pairs for algebraic functional systems. Logical Methods in Computer Science, 8(2:10):1–51, 2012. doi: 10.2168/LMCS-8(2:10)2012.

[57] A. Koprowski. Termination of Rewriting and Its Certification. PhD thesis, Eindhoven University of Technology, 2008.

[58] M. Korp, C. Sternagel, H. Zankl, and A. Middeldorp. Tyrolean Termination Tool 2. In Proc. 20th International Conference on Rewriting Techniques and Applications, volume 5595 of Lecture Notes in Computer Science, pages 295–304, 2009. doi: 10.1007/978-3-642-02348-4_21.

[59] K. Kusakari, Y. Isogai, M. Sakai, and F. Blanqui. Static dependency pair method based on strong computability for higher-order rewrite systems. IEICE Transactions on Information and Systems, E92-D(10):2007–2015, 2009. doi: 10.1587/transinf.E92.D.2007.

[60] K. Kusakari and M. Sakai. Enhancing dependency pair method using strong computability in simply-typed term rewriting. Applicable Algebra in Engineering, Communication and Computing, 18(5):407–431, 2007. doi: 10.1007/s00200-007-0046-9.

[61] C. Laneve. Optimality and Concurrency in Interaction Systems. PhD thesis, Università di Pisa, 1993.

[62] T. Libal and D. Miller. Functions-as-constructors higher-order unification. In Proc. 1st International Conference on Formal Structures for Computation and Deduction, volume 52 of Leibniz International Proceedings in Informatics, pages 26:1–26:17, 2016. doi: 10.4230/LIPIcs.FSCD.2016.26.

[63] D. Matichuk, T. Murray, and M. Wenzel. Eisbach: A proof method language for Isabelle. Journal of Automated Reasoning, 56(3):261–282, 2016. doi: 10.1007/s10817-015-9360-2.

[64] R. Mayr and T. Nipkow. Higher-order rewrite systems and their confluence. Theoretical Computer Science, 192(1):3–29, 1998. doi: 10.1016/S0304-3975(97)00143-6.


[65] D. Miller. A logic programming language with lambda-abstraction, function variables, and simple unification. Journal of Logic and Computation, 1(4):497–536, 1991. doi: 10.1093/logcom/1.4.497.

[66] J. Nagele, B. Felgenhauer, and A. Middeldorp. CSI: New evidence – A progress report. In Proc. 26th International Conference on Automated Deduction, volume 10395 of Lecture Notes in Artificial Intelligence, pages 385–397, 2017. doi: 10.1007/978-3-319-63046-5_24.

[67] J. Nagele and R. Thiemann. Certification of confluence proofs using CeTA. In Proc. 3rd International Workshop on Confluence, pages 19–23, 2014.

[68] J. Nagele and H. Zankl. Certified rule labeling. In Proc. 26th International Conference on Rewriting Techniques and Applications, volume 36 of Leibniz International Proceedings in Informatics, pages 269–284, 2015. doi: 10.4230/LIPIcs.RTA.2015.269.

[69] M. Newman. On theories with a combinatorial definition of equivalence. Annals of Mathematics, 43(2):223–243, 1942.

[70] T. Nipkow. Higher-order critical pairs. In Proc. 6th IEEE Symposium on Logic in Computer Science, pages 342–349, 1991. doi: 10.1109/LICS.1991.151658.

[71] T. Nipkow. Functional unification of higher-order patterns. In Proc. 8th IEEE Symposium on Logic in Computer Science, pages 64–74, 1993. doi: 10.1109/LICS.1993.287599.

[72] T. Nipkow. Orthogonal higher-order rewrite systems are confluent. In Proc. 1st International Conference on Typed Lambda Calculi and Applications, volume 664 of Lecture Notes in Computer Science, pages 306–317, 1993. doi: 10.1007/BFb0037114.

[73] T. Nipkow. More Church-Rosser proofs. Journal of Automated Reasoning, 26(1):51–66, 2001. doi: 10.1023/A:1006496715975.

[74] T. Nipkow, L. Paulson, and M. Wenzel. Isabelle/HOL – A Proof Assistant for Higher-Order Logic, volume 2283 of Lecture Notes in Computer Science. Springer, 2002. doi: 10.1007/3-540-45949-9.


[75] S. Okui. Simultaneous critical pairs and Church-Rosser property. In Proc. 9th International Conference on Rewriting Techniques and Applications, volume 1379 of Lecture Notes in Computer Science, pages 2–16, 1998. doi: 10.1007/BFb0052357.

[76] V. van Oostrom. Confluence by decreasing diagrams. Theoretical Computer Science, 126(2):259–280, 1994. doi: 10.1016/0304-3975(92)00023-K.

[77] V. van Oostrom. Confluence for Abstract and Higher-Order Rewriting. PhD thesis, Vrije Universiteit Amsterdam, 1994.

[78] V. van Oostrom. Developing developments. Theoretical Computer Science, 175(1):159–181, 1997. doi: 10.1016/S0304-3975(96)00173-9.

[79] V. van Oostrom. Confluence by decreasing diagrams – converted. In Proc. 19th International Conference on Rewriting Techniques and Applications, volume 5117 of Lecture Notes in Computer Science, pages 306–320, 2008. doi: 10.1007/978-3-540-70590-1_21.

[80] V. van Oostrom. Feebly not weakly. In Proc. 7th International Workshop on Higher-Order Rewriting, Vienna Summer of Logic flash drive, 2014.

[81] V. van Oostrom and F. van Raamsdonk. Weak orthogonality implies confluence: The higher order case. In Proc. 3rd International Symposium on Logical Foundations of Computer Science, volume 813 of Lecture Notes in Computer Science, pages 379–392, 1994. doi: 10.1007/3-540-58140-5_35.

[82] M. Oyamaguchi and N. Hirokawa. Confluence and critical-pair-closing systems. In Proc. 3rd International Workshop on Confluence, pages 29–33, 2014.

[83] M. Oyamaguchi and Y. Ohta. On the Church-Rosser property of left-linear term rewriting systems. IEICE Transactions on Information and Systems, E86-D(1):131–135, 2003. http://search.ieice.org/bin/summary.php?id=e86-d_1_131.

[84] G. E. Peterson and M. E. Stickel. Complete sets of reductions for some equational theories. Journal of the ACM, 28(2):233–264, 1981. doi: 10.1145/322248.322251.


[85] F. Pfenning. A proof of the Church-Rosser theorem and its representation in a logical framework. Technical Report CMU-CS-92-186, School of Computer Science, Carnegie Mellon University, 1992.

[86] Z. Qian. Unification of higher-order patterns in linear time and space. Journal of Logic and Computation, 6(3):315–341, 1996. doi: 10.1093/logcom/6.3.315.

[87] F. van Raamsdonk. Confluence and Normalisation for Higher-Order Rewriting. PhD thesis, Vrije Universiteit Amsterdam, 1996.

[88] F. van Raamsdonk. On termination of higher-order rewriting. In Proc. 12th International Conference on Rewriting Techniques and Applications, volume 2051 of Lecture Notes in Computer Science, pages 261–275, 2001. doi: 10.1007/3-540-45127-7_20.

[89] L. Regnier. Une équivalence sur les lambda-termes. Theoretical Computer Science, 126(2):281–292, 1994. doi: 10.1016/0304-3975(94)90012-4.

[90] K. H. Rose. Operational Reduction Models for Functional Programming Languages. PhD thesis, University of Copenhagen, 1996.

[91] J.-L. Ruiz-Reina, J.-A. Alonso, M.-J. Hidalgo, and F.-J. Martín-Mateos. Formal proofs about rewriting using ACL2. Annals of Mathematics and Artificial Intelligence, 36(3):239–262, 2002. doi: 10.1023/A:1016003314081.

[92] M. Sakai and K. Kusakari. On dependency pair method for proving termination of higher-order rewrite systems. IEICE Transactions on Information and Systems, E88-D(3):583–593, 2005. doi: 10.1093/ietisy/e88-d.3.583.

[93] M. Sakai, M. Oyamaguchi, and M. Ogawa. Non-E-overlapping, weakly shallow, and non-collapsing TRSs are confluent. In Proc. 25th International Conference on Automated Deduction, volume 9195 of Lecture Notes in Artificial Intelligence, pages 111–126, 2015. doi: 10.1007/978-3-319-21401-6_7.

[94] M. Sakai, Y. Watanabe, and T. Sakabe. An extension of the dependency pair method for proving termination of higher-order rewrite systems. IEICE Transactions on Information and Systems, E84-D(8):1025–1032, 2001.

[95] N. Shankar. A mechanical proof of the Church-Rosser theorem. Journal of the ACM, 35(3):475–522, 1988. doi: 10.1145/44483.44484.

[96] K. Shintani and N. Hirokawa. CoLL: A confluence tool for left-linear term rewrite systems. In Proc. 25th International Conference on Automated Deduction, volume 9195 of Lecture Notes in Artificial Intelligence, pages 127–136, 2015. doi: 10.1007/978-3-319-21401-6_8.

[97] R. Statman. The typed lambda-calculus is not elementary recursive. Theoretical Computer Science, 9(1):73–81, 1979. doi: 10.1016/0304-3975(79)90007-0.

[98] C. Sternagel and R. Thiemann. Formalizing Knuth-Bendix orders and Knuth-Bendix completion. In Proc. 24th International Conference on Rewriting Techniques and Applications, volume 21 of Leibniz International Proceedings in Informatics, pages 287–302, 2013. doi: 10.4230/LIPIcs.RTA.2013.287.

[99] C. Sternagel and R. Thiemann. The certification problem format. In Proc. 11th International Workshop on User Interfaces for Theorem Provers, volume 167 of Electronic Proceedings in Theoretical Computer Science, pages 61–72, 2014. doi: 10.4204/EPTCS.167.8.

[100] C. Stirling. Decidability of higher-order matching. Logical Methods in Computer Science, 5(3:2):1–52, 2009. doi: 10.2168/LMCS-5(3:2)2009.

[101] K. Støvring. Extending the extensional lambda calculus with surjective pairing is conservative. Logical Methods in Computer Science, 2(2:1):1–14, 2006. doi: 10.2168/LMCS-2(2:1)2006.

[102] S. Suzuki, K. Kusakari, and F. Blanqui. Argument filterings and usable rules in higher-order rewrite systems. IPSJ Online Transactions, 4(2):1–12, 2011. doi: 10.2197/ipsjtrans.4.114.

[103] T. Suzuki, T. Aoto, and Y. Toyama. Confluence proofs of term rewriting systems based on persistency. Computer Software, 30(3):148–162, 2013. In Japanese. doi: 10.11309/jssst.30.3_148.


[104] M. Takahashi. Parallel reductions in λ-calculus. Information and Computation, 118(1):120–127, 1995. doi: 10.1006/inco.1995.1057.

[105] Terese. Term Rewriting Systems, volume 55 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 2003.

[106] R. Thiemann and C. Sternagel. Certification of termination proofs using CeTA. In Proc. 22nd International Conference on Theorem Proving in Higher Order Logics, volume 5674 of Lecture Notes in Computer Science, pages 452–468, 2009. doi: 10.1007/978-3-642-03359-9_31.

[107] Y. Toyama. Commutativity of term rewriting systems. In K. Fuchi and L. Kott, editors, Programming of Future Generation Computers II, pages 393–407. North-Holland, 1988.

[108] Y. Toyama and M. Oyamaguchi. Church-Rosser property and unique normal form property of non-duplicating term rewriting systems. In Proc. 4th International Workshop on Conditional and Typed Rewriting Systems, pages 316–331, 1995. doi: 10.1007/3-540-60381-6_19.

[109] C. Urban and C. Kaliszyk. General bindings and alpha-equivalence in Nominal Isabelle. Logical Methods in Computer Science, 8(2:14):1–35, 2012. doi: 10.2168/LMCS-8(2:14)2012.

[110] J. B. Wells, D. Plump, and F. Kamareddine. Diagrams for meaning preservation. In Proc. 14th International Conference on Rewriting Techniques and Applications, volume 2706 of Lecture Notes in Computer Science, pages 88–106, 2003. doi: 10.1007/3-540-44881-0_8.

[111] T. Wierzbicki. Complexity of the higher order matching. In Proc. 16th International Conference on Automated Deduction, volume 1632 of Lecture Notes in Artificial Intelligence, pages 82–96, 1999. doi: 10.1007/3-540-48660-7_6.

[112] D. A. Wolfram. Rewriting, and equational unification: the higher-order cases. In Proc. 4th International Conference on Rewriting Techniques and Applications, volume 488 of Lecture Notes in Computer Science, pages 25–36, 1991. doi: 10.1007/3-540-53904-2_83.


[113] H. Zankl. Confluence by decreasing diagrams – formalized. In Proc. 24th International Conference on Rewriting Techniques and Applications, volume 21 of Leibniz International Proceedings in Informatics, pages 352–367, 2013. doi: 10.4230/LIPIcs.RTA.2013.352.

[114] H. Zankl, B. Felgenhauer, and A. Middeldorp. CSI – A confluence tool. In Proc. 23rd International Conference on Automated Deduction, volume 6803 of Lecture Notes in Artificial Intelligence, pages 499–505, 2011. doi: 10.1007/978-3-642-22438-6_38.

[115] H. Zankl, B. Felgenhauer, and A. Middeldorp. Labelings for decreasing diagrams. Journal of Automated Reasoning, 54(2):101–133, 2015. doi: 10.1007/s10817-014-9316-y.

[116] H. Zantema. Reducing right-hand sides for termination. In Processes, Terms and Cycles: Steps on the Road to Infinity, Essays Dedicated to Jan Willem Klop, on the Occasion of his 60th Birthday, volume 3838 of Lecture Notes in Computer Science, pages 173–197, 2005. doi: 10.1007/11601548_12.


Index

→, 10
←, 10
↔, 11
↔∗, 11
→!, 10
↓, 10
R · S, 10
R∗, 10
R+, 10
R=, 10
Rn, 10
R−1, 10
→R, 21, 27
→ϵ, 21
−→∥, 22, 39, 65
−→○, 22
←⋊→, 23, 108
←·⋊→, 23
←⋉⋊→, 23
→β, 25, 98
→η, 25
↓β, 25
↑η, 26
↕ηβ, 26
=α, 25
×, 9
>mul, 10
ϵ, 18
<, 18
⩽, 18
∥, 18
\, 18
□, 18, 25
◁, 19, 27
⊴, 19, 27
▲, 40
•, 100
·β, 100
≻, 110
C[t1, . . . , tn], 20
C[t], 19, 25
t[x := s], 20, 97
t[]p, 19
t[s]p, 19
tσ, 20, 25
t|p, 18, 27
|t|, 18
|t|x, 17
x ♯ t, 96
B, 24
bv(t), 24
bv(t, p), 108
F, 17, 26
FC(R), 90
fv(t), 24
id, 10


Λ→, 24
lsrc, 72
li, 60
N, 9
N+, 9
NF, 10
Pos(t), 18, 27
PosF(t), 18
PosV(t), 18
Sub(t), 27
T, 17
TB, 24
tp(t), 27
V, 17, 24
abstract rewrite system, 10
AC, 35
almost development closed, 52
almost parallel closed, 45
arity, 17
ARS, 10
base type, 24
β-rule, 25, 98
β-step, 25, 98
binary relation, 9
bound variable, 24
capture-avoiding substitution, 97
cartesian product, 9
CeTA, 5
Church-Rosser, 12
  weak, 13
closed under contexts, 20
closed under substitutions, 20
commutation, 14
  local, 14
  semi-, 15
  strong, 14
compatible, 57, 69
composition, 10
confluence, 11
  local, 13
  semi-, 15
  strong, 14
constant, 17
context, 18, 25
  multihole, 18
conversion, 11
convertible, 11
CPF, 5
CR, 12
critical overlap, 23
critical pair, 23, 108
  inner, 23
  joinable, 23
critical peak, 23
critical-pair-closing, 47
development closed, 52
development step, 22
diamond property, 14
duplicating, 21
η-expansion, 25
extended locally decreasing, 57, 68
fan property, 75
forward closure, 90
free variable, 24
freshness constraint, 96
full-superdevelopment function, 100
function peak, 62
function symbol positions, 18

higher-order recursive path order, 110
higher-order rewrite system, 27
hole, 18, 25
HORPO, 110
HRS, 27
inner critical pair, 23
instance, 20
inverse, 9
IsaFoR, 5
joinable, 10, 11, 23
labeling, 67
  compatible, 69
lambda term
  typed, 24
left-linear, 21
xk-lifter, 108
linear, 21, 24
local commutation, 14
local confluence, 13
local peak, 13
locally decreasing, 57
match, 20
meetable, 11
most general unifier, 20
multihole context, 18
multiset extension, 10
multistep rewrite relation, 22
natural numbers, 9
normal form, 10
orthogonal, 115
overlap, 40, 108
overlay, 23
parallel closed, 36
parallel peak, 62
parallel rewrite relation, 22, 65
pattern, 28
pattern rewrite system, 28
peak, 11
  critical, 23
  function, 62
  joinable, 11
  local, 13
  parallel, 62
  variable, 62
position, 18
  above, 18
  below, 18
  function symbol, 18
  parallel, 18
  root, 18
  variable, 18
pre-term, 26
proper subterm, 19
PRS, 28
reachable, 10
redex pattern, 56
  match, 56
  parallel, 57
reducible, 10
reduct, 10
redundant, 85
reflexive closure, 10
reflexive transitive closure, 10
relation, 9
  binary, 9
  compatible, 57


  composition, 10
  identity, 10
  inverse, 9
relative TRS, 22
rewrite, 10
rewrite relation, 20
rewrite rule, 21, 27
rewrite sequence
  finite, 10
  infinite, 11
rewrite step, 10
rewrite system
  abstract, 10
  higher-order, 27
  pattern, 28
  relative, 22
  term, 21
right-linear, 21
root position, 18
rule labeling, 60
semi-commutation, 15
semi-confluence, 15
signature, 17, 26
simple type, 24
size, 18
source labeling, 72
strong commutation, 14
strong confluence, 14
strongly closed, 34
substitution, 20, 25
  capture-avoiding, 25, 97
subterm, 19, 27
  proper, 19
symmetric closure, 10
term, 17, 26
term rewrite system, 21
termination, 11
top, 27
transitive closure, 10
triangle property, 103
TRS, 21
type, 24
typed lambda term, 24
unifiable, 20
unifier, 20
  most general, 20
valley, 11
variable, 17
  bound, 24
  free, 24
  typed, 24
variable peak, 62
variable positions, 18
weakly compatible, 72
weakly extended locally decreasing, 72
weakly orthogonal, 115
Z-property, 99
