Performance Engineering of Proof-Based Software Systems at Scale

by

Jason S. Gross

B.S., Massachusetts Institute of Technology (2013)
S.M., Massachusetts Institute of Technology (2015)

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

February 2021

© Massachusetts Institute of Technology 2021. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, January 27, 2021

Certified by: Adam Chlipala, Associate Professor of Electrical Engineering and Computer Science, Thesis Supervisor

Accepted by: Leslie A. Kolodziejski, Professor of Electrical Engineering and Computer Science, Chair, Department Committee on Graduate Students

Performance Engineering of Proof-Based Software Systems at Scale

by

Jason S. Gross

Submitted to the Department of Electrical Engineering and Computer Science on January 27, 2021, in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Abstract

Formal verification is increasingly valuable as our world comes to rely more on software for critical infrastructure. A significant and understudied cost of developing mechanized proofs, especially at scale, is the computer performance of proof generation. This dissertation aims to be a partial guide to identifying and resolving performance bottlenecks in dependently typed tactic-driven proof assistants like Coq.

We present a survey of the landscape of performance issues in Coq, with micro- and macro-benchmarks. We describe various metrics that allow prediction of performance, such as term size, goal size, and number of binders, and note the occasional surprising lack of a bottleneck for some factors, such as total proof term size. To our knowledge such a roadmap to performance bottlenecks is a new contribution of this dissertation.

The central new technical contribution presented by this dissertation is a reflective framework for partial evaluation and rewriting, already used to compile a code generator for field-arithmetic cryptographic primitives which generates code currently used in Google Chrome. We believe this prototype is the first scalably performant realization of an approach for code specialization which does not require adding to the trusted code base. Our extensible engine, which combines the traditional concepts of tailored term reduction and automatic rewriting from hint databases with on-the-fly generation of inductive codes for constants, is also of interest to replace these ingredients in proof assistants’ proof checkers and tactic engines. Additionally, we use the development of this framework itself as a case study for the various performance issues that can arise when designing large proof libraries. We also present a novel method of simple and fast reification, developed and published during this PhD.

Finally, we present additional lessons drawn from the case studies of a category-theory library, a proof-producing parser generator, and cryptographic code generation.

Thesis Supervisor: Adam Chlipala Title: Associate Professor of Electrical Engineering and Computer Science

Dedicated to future users and developers of proof assistants.

Dedicated also to my mom, for her perpetual support and nurturing throughout my life.

Acknowledgments

I’m extremely grateful to my advisor Adam Chlipala for his patience, guidance, encouragement, advice, and wisdom, during the writing of this dissertation, and through my research career. I don’t know what it’s like to have any other PhD advisor, but I can’t imagine having a PhD advisor who would have been better for my mental health than Adam. I want to thank my coworkers, with special thanks to Andres Erbsen for many engaging conversations and rich and productive collaborations. I want to thank Rajashree Agrawal for her help in finding a much better story for my work than I had ever had, and for helping me find how to present this story in both my defense and this dissertation.

I want to thank the Coq development team—without whom I would not have a proof assistant to use—for their patience and responsiveness to my many, many bug reports, feature requests, and questions. Special thanks to Pierre-Marie Pédrot for doing the heavy lifting of tracking down performance issues inside Coq and explaining them to me, and fixing many of them. I’d also like to thank Matthieu Sozeau for adding support for universe polymorphism and primitive projections to Coq, and responding to all of my bug reports on the functionality and performance of these features from my work on the HoTT Category Theory library during the development of these features.

I’m grateful to the rest of my thesis committee—Professor Saman Amarasinghe and Professor Nickolai Zeldovich—for their support and direction during the editing of this dissertation and my defense.

Moving on to more specific acknowledgments, thanks to Andres Erbsen for pointing out to me some of the particular performance bottlenecks in Coq that I made use of in this thesis, including those of subsubsection Sharing in Section 2.6.1 and those of subsections Name Resolution, Capture-Avoiding Substitution, Quadratic Creation of Substitutions for Existential Variables, and Quadratic Substitution in Function Application in Subsection 2.6.3. Thanks to András Kovács for a brief exchange with me and Andres in which it became clear that we could frame many of the performance issues we were encountering as needing to “get the basics right.”

Thanks to Hugo Herbelin for sharing the trick with type of to propagate universe constraints¹ as well as useful conversations on Coq’s bug tracker that allowed us to track down performance issues.² Thanks to Jonathan Leivent for sharing the trick of annotating identifiers with : Type to avoid needing to adjust universes³. Thanks to Pierre-Marie Pédrot for conversations on Coq’s Gitter and his help in tracking down performance bottlenecks in earlier versions of our reification scripts and in Coq’s tactics. Thanks to Beta Ziliani for his help in using Mtac2, as well as his invaluable

¹ https://github.com/coq/coq/issues/5996#issuecomment-338405694
² https://github.com/coq/coq/issues/6252
³ https://github.com/coq/coq/issues/5996#issuecomment-670955273

guidance in figuring out how to use canonical structures to reify to PHOAS. Thanks to John Wiegley for feedback on “Reification by Parametricity: Fast Setup for Proof by Reflection, in Two Lines of Ltac” [GEC18], which is included in slightly-modified form distributed between Chapter 6 and various sections of Chapter 3.

Thanks to Karl Palmskog for pointing me at Lamport and Paulson [LP99] and Paulson [Pau18].⁴

Thanks to Karl Palmskog, Talia Ringer, Ilya Sergey, and Zachary Tatlock for making the high-quality LaTeX bibliography file for “QED at Large: A Survey of Engineering of Formally Verified Software” [Rin+20] available on GitHub⁵ and pointing me at it; it’s been quite useful in polishing the bibliography of this document.

I want to thank all of the people not explicitly named here who have contributed, formally or informally, to these efforts.

A significant fraction of the text of this dissertation is taken from papers I’ve co-authored during my PhD, sometimes with major edits, other times with only minor edits to conform to the flow of the dissertation.

In particular:

Chapter 4 is largely taken from a draft paper co-authored with Andres Erbsen and Adam Chlipala.

Sections 6.1 and 3.2 are based on the introduction to “Reification by Parametricity: Fast Setup for Proof by Reflection, in Two Lines of Ltac” [GEC18]. Chapter 6 is largely taken from [GEC18], with some new text for this dissertation.

Chapter 7 is based largely on “Experience Implementing a Performant Category- Theory Library in Coq” [GCS14], and I’d like to thank Benedikt Ahrens, Daniel R. Grayson, Robert Harper, Bas Spitters, and Edward Z. Yang for feedback on this paper. Sections 7.3, 7.5, 7.4.1 and 8.2.3 are largely taken from [GCS14]. Some of the text in Subsections 8.2.1 and 8.3.1 also comes from [GCS14].

For those interested in history, our method of reification by parametricity presented in Chapter 6 was inspired by the evm_compute tactic [MCB14]. We first made use of pattern to allow vm_compute to replace cbv-with-an-explicit-blacklist when we discovered cbv was too slow and the blacklist too hard to maintain. We then noticed that in the sequence of doing abstraction; vm_compute; application; β-reduction; reification, we could move β-reduction to the end of the sequence if we fused reification

⁴ @palmskog on Gitter: https://gitter.im/coq/coq?at=5e5ec0ae4eefc06dcf31943f
⁵ https://github.com/proofengineering/proofengineering-bib/blob/master/proof-engineering.bib

with application, and thus reification by parametricity was born.

Finally, I would like to take this opportunity to acknowledge the people in my life who have helped me succeed on this long journey. Thanks to my mom, for taking every opportunity to enrich my life and setting me on this path, for encouraging me from my youth and always supporting me in all that I do. Thanks also to my sister Rachel, my dad, and the rest of my family for always being kind and supportive. Thank you to my several friends, and especially to Allison Strandberg, with whom my friendship through the years has been invaluable and fulfilling. Finally, I will be eternally grateful to Rajashree for her faith in me and for everything she’s done to help me excel. While the technical work in proof assistants has always been a delight, writing papers has remained a struggle, and the process of completing my PhD with a thesis and a defense would have been a great deal more stressful without Rajashree.

This work was supported in part by a Google Research Award and National Science Foundation grants CCF-1253229, CCF-1512611, and CCF-1521584. Work on the category-theory library, one of the case studies presented in Chapter 7 and presented more fully in [GCS14], was supported in part by the MIT bigdata@CSAIL initiative, NSF grant CCF-1253229, ONR grant N000141310260, and AFOSR grant FA9550-14-1-0031. This material is also based upon work supported by the National Science Foundation Graduate Research Fellowship. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Contents

I Introduction

1 Background
  1.1 Opportunity
  1.2 Our Work
    1.2.1 What Are Proof Assistants?
  1.3 Basic Design Choices
    1.3.1 Dependent Types: What? Why? How?
    1.3.2 The de Bruijn Criterion
  1.4 Look Ahead: Layout and Contributions of the Thesis

2 The Performance Landscape in Type-Theoretic Proof Assistants
  2.1 Introduction
  2.2 Exponential Domain
  2.3 Motivating the Performance Map
  2.4 Performance Engineering in Proof Assistants Is Hard
  2.5 Fixing Performance Bottlenecks in the Proof Assistant Itself Is Also Hard
  2.6 The Four Axes of the Landscape
    2.6.1 The Size of the Type
    2.6.2 The Size of the Term
    2.6.3 The Number of Binders
    2.6.4 The Number of Nested Abstraction Barriers
  2.7 Conclusion of This Chapter

II Program Transformation and Rewriting

3 Reflective Program Transformation
  3.1 Introduction
    3.1.1 Proof-Script Primer
    3.1.2 Reflective-Automation Primer
    3.1.3 Reflective-Syntax Primer
    3.1.4 Performance of Proving Reflective Well-Formedness of PHOAS
  3.2 Reification
  3.3 What’s Next?

4 A Framework for Building Verified Partial Evaluators
  4.1 Introduction
    4.1.1 A Motivating Example
    4.1.2 Concerns of Trusted-Code-Base Size
    4.1.3 Our Solution
  4.2 Trust, Reduction, and Rewriting
    4.2.1 Our Approach in Nine Steps
  4.3 The Structure of a Rewriter
    4.3.1 Pattern-Matching Compilation and Evaluation
    4.3.2 Adding Higher-Order Features
  4.4 Scaling Challenges
    4.4.1 Variable Environments Will Be Large
    4.4.2 Subterm Sharing Is Crucial
    4.4.3 Rules Need Side Conditions
    4.4.4 Side Conditions Need Abstract Interpretation
    4.4.5 Limitations and Preprocessing
  4.5 Evaluation
    4.5.1 Microbenchmarks
    4.5.2 Macrobenchmark: Fiat Cryptography
    4.5.3 Experience vs. Lean and setoid_rewrite
  4.6 Related Work
  4.7 Future Work

5 Engineering Challenges in the Rewriter
  5.1 Prereduction
    5.1.1 What Does This Reduction Consist Of?
    5.1.2 CPS
    5.1.3 Type Codes
    5.1.4 How Do We Know What We Can Unfold?
  5.2 NbE vs. Pattern-Matching Compilation: Mismatched Expression APIs and Leaky Abstraction Barriers
    5.2.1 Pattern-Matching Evaluation on Type-Indexed Terms
    5.2.2 Untyped Syntax in NbE
    5.2.3 Mixing Typed and Untyped Syntax
    5.2.4 Pattern-Matching Compilation Made for Intrinsically Typed Syntax
  5.3 Patterns with Type Variables – The Three Kinds of Identifiers
  5.4 Preevaluation Revisited
    5.4.1 How Do We Know What We Can Unfold?
    5.4.2 Revealing “Enough” Structure
  5.5 Monads: Missing Abstraction Barriers at the Type Level
  5.6 Rewriting Again in the Output of a Rewrite Rule
  5.7 Delayed Rewriting in Variable Nodes
    5.7.1 Relating Expressions and Values
    5.7.2 Which Equivalence Relation?

  5.8 What’s the Ground Truth: Patterns or Expressions?
  5.9 What’s the Takeaway?

6 Reification by Parametricity
  6.1 Introduction
  6.2 Reification by Parametricity
    6.2.1 Case-By-Case Walkthrough
    6.2.2 Commuting Abstraction and Reduction
    6.2.3 Implementation in ℒtac
    6.2.4 Advantages and Disadvantages
  6.3 Performance Comparison
    6.3.1 Without Binders
    6.3.2 With Binders
  6.4 Future Work, Concluding Remarks

III API Design

7 Abstraction
  7.1 Introduction
  7.2 When and How To Use Dependent Types Painlessly
  7.3 A Brief Introduction to Our Category-Theory Library
  7.4 A Sampling of Abstraction Barriers
    7.4.1 Abstraction in Limits and Colimits
    7.4.2 Nested Σ Types
  7.5 Internalizing Duality Arguments in …
    7.5.1 Duality Design Patterns
    7.5.2 Moving Forward: Computation Rules for Pattern Matching

IV Conclusion

8 A Retrospective on Performance Improvements
  8.1 Concrete Performance Advancements in Coq
    8.1.1 Removing Pervasive Evar Normalization
    8.1.2 Delaying the Externalization of Application Arguments
    8.1.3 The ℒtac Profiler
    8.1.4 Compilation to Native Code
    8.1.5 Primitive Integers and Arrays
    8.1.6 Primitive Projections for Record Types
    8.1.7 Fast Typing of Application Nodes
  8.2 Performance-Enhancing Advancements in the Type Theory of Coq
    8.2.1 Universe Polymorphism
    8.2.2 Judgmental η for Record Types
    8.2.3 SProp: The Definitionally Proof-Irrelevant Universe

  8.3 Performance-Enhancing Advancements in Type Theory at Large
    8.3.1 Higher Inductive Types: Setoids for Free
    8.3.2 Univalence and Isomorphism Transport
    8.3.3 Cubical Type Theory

9 Concluding Remarks

Bibliography

A Appendices for Chapter 2, The Performance Landscape in Type-Theoretic Proof Assistants
  A.1 Full Example of Nested-Abstraction-Barrier Performance Issues
    A.1.1 Example in the Category of Sets

B Appendices for Chapter 4, A Framework for Building Verified Partial Evaluators
  B.1 Additional Benchmarking Plots
    B.1.1 Rewriting Without Binders
    B.1.2 Additional Information on the Fiat Cryptography Benchmark
  B.2 Additional Information on Microbenchmarks
    B.2.1 UnderLetsPlus0
    B.2.2 Plus0Tree
    B.2.3 LiftLetsMap
    B.2.4 SieveOfEratosthenes
  B.3 Reading the Code
    B.3.1 Code from Section 4.1, Introduction
    B.3.2 Code from Section 4.2, Trust, Reduction, and Rewriting
    B.3.3 Code from Section 4.3, The Structure of a Rewriter
    B.3.4 Code from Section 4.4, Scaling Challenges
    B.3.5 Code from Section 4.5, Evaluation

C Notes on the Benchmarking Setup
  C.1 Plots in Chapter 1, Background
  C.2 Plots in Chapter 2, The Performance Landscape in Type-Theoretic Proof Assistants
  C.3 Plots in Chapter 4, A Framework for Building Verified Partial Evaluators

Part I

Introduction

Testing shows the presence, not the absence of bugs.

— Edsger Wybe Dijkstra, 1969 [BR70]

If you blindly optimize without profiling, you will likely waste your time on the 99% of code that isn’t actually a performance bottleneck and miss the 1% that is.

— Charles E. Leiserson [Lei20]

Chapter 1

Background

1.1 Opportunity

In critical software systems, like the implementations of cryptography supporting the internet, there are opposing pressures to innovate and to let things be as they are. Innovation can help create more performant systems with higher security. But, if the new code has any bugs at all, it could leave the system vulnerable to attacks costing billions of dollars.

Attackers have financial incentive to find any bugs and exploit them, so guaranteeing a complete lack of bugs is essential. Testing, which is the de facto standard for finding bugs, is both expensive and does not guarantee a lack of bugs. For example, some bugs in cryptographic code only occur in as few as 19 out of 2255 cases [Lan14]. If we aim to catch such a bug using continuous random testing in a “modest” twenty years, then we would need over a thousand times as many computers as there are atoms in the solar system! This is not an accident. If computers become fast enough to complete this testing in reasonable time, then attackers can use the faster computers to get past the current level of cryptographic protection even if there are no bugs in the code. As a result, however fast computers get, ensuring security will require scaling up the size of the mathematical problem proportionally, and testing will continue to be inadequate at finding all bugs. So, in critical software systems, implementing new, innovative algorithms is a slow and risky process.
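As a rough back-of-the-envelope check (our own arithmetic, not from the cited source, and the per-machine test rate below is an assumption): hitting a failure that occurs in 19 out of 2²⁵⁵ inputs requires on the order of

\[
  \frac{2^{255}}{19} \approx 3 \times 10^{75}\ \text{random trials}, \qquad
  20\ \text{years} \approx 6.3 \times 10^{8}\ \text{s},
\]

or roughly \(5 \times 10^{66}\) trials per second. Even granting each machine an optimistic \(10^{6}\) cryptographic-operation trials per second, that is about \(5 \times 10^{60}\) machines, versus roughly \(10^{57}\) atoms in the solar system.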

An appealing solution to this problem is to prove critical software correct. We do this by specifying in formal mathematics the intended behavior of the software and showing a correspondence between our math and the code. Ideally, the specification is relatively simple and easier to trust than the thousands of lines of code. Once we show a correspondence between our math and any new piece of code, we can confidently deploy the software in the world. This is known as verification.

While proofs of algorithms tend to be done with pen and paper (consider the ubiquitous proofs that various sorting algorithms are correct found in introductory algorithms classes), writing proofs of actual code is much harder. Proofs of code correctness tend to be filled with tedious case analysis and only sparse mathematical insights, and attempts to create and check these proofs by hand are subject to the same issues of human fallibility as writing the code in the first place. To avoid the problems from human fallibility, we use proof-checking programs; to cope with tediousness, we use proof-checking programs that can also help us write the proof itself. Such programs are called proof assistants; they assist users in writing code which, when run, generates a proof that can be checked by the proof checker.

But now we are back where we started, with proof-checking programs as our critical software, and the question of how they could possibly give us the confidence to deploy new software in the world.¹ In order to trust proof-checking programs, we want them to be general enough that possible bugs in the proof-checking program are unlikely to line up with mistakes that we make in any individual proof (cf. Subsection 1.3.1). Such programs which are general enough to check the statements and proofs of arbitrary mathematical theorems are called foundational tools. We also want proof-checking programs to be small so that we have a hope of verifying them by hand (cf. Subsection 1.3.2).

Proof-checking programs have had many successes, in both software verification and traditional mathematics proof formalization. Examples abound, from compilers [Ler09] to microkernels [Kle+09] to web servers [Chl15] to cryptography code [Erb+19], from the Four-Color Theorem [Gon08] to the Odd-Order Theorem [Gon+13a] to Homotopy Type Theory [Uni13].

However, in almost all examples of software verification successes, there is an enormous overhead in the lines of proof written over the lines of code being verified. The proofs are so long and arduous to write that it typically requires multiple PhDs’ worth of work to verify one piece of software—see Figure 1-1 for some examples.

In order to utilize verification in the innovation pipeline, we need verification to provide fast feedback about the correctness of code. If we have to write each proof by hand, this is not feasible. One proposal is to automate proof generation so that we no longer need to replicate proof-writing effort for iterations of code with the same (or similar) mathematical specification. In this manner, we can decrease the marginal overhead of manual proof writing, by reusing proofs on several variations and optimizations of an algorithm, and deploy new code in the world with confidence.

¹ We cannot solve this problem with programs to check our proof-checking programs, under pain of infinite regress, Gödelian paradoxes [Raa20], and invisible untrustworthiness [Tho84]. However, the mathematically inclined reader might be interested to note that adding one more layer of meta does in fact help, and there are projects underway to verify proof checkers [AR14; Soz+19; Ana+17].

Figure 1-1: Overhead of lines of code of verification over lines of code being verified for some successful projects in microkernels and operating systems (seL4, CertiKOS) and compilers (CompCert). Note the log-scaling. (The plot compares lines of code being verified, roughly 10⁴–10⁶, against lines of verification.)

1.2 Our Work

In the quest to enable automated proof generation, the author encountered several performance bottlenecks in proof checking across projects. We draw particularly from the project in generation of verified low-level cryptographic code in Fiat Cryptography [Erb+19], with auxiliary case studies in category theory [GCS14] and parsers [Gro15a]. Unlike many other performance domains, the time it takes to check proofs as we scale the size of the input is almost always superlinear—quadratic at best, commonly cubic or exponential, occasionally even worse. Empirically, this might look like a proof script that checks in tens of seconds on the smallest of toy examples; takes about a minute on the smallest real-world example, which might be twice the size of the toy example; takes twenty hours to generate and check the proof of an example only twice that size; and, on a (perhaps not-quite-realistic) example twice the size of the twenty-hour example, the script might not finish within a year, or even a thousand years—see Section 2.2 for more details, which we preview here in Figure 1-2. In just three doublings of input size, we might go from tens of seconds to thousands of years. Moreover, in proof assistants, this is not an isolated experience. This sort of performance behavior is common across projects.

Figure 1-2: Example of superlinear performance scaling (compile time in seconds vs. input size 푛 for synthesis, with an exponential fit of roughly 3.54 × 10⁻⁴ · e^(2.39푛)).

While performance—both the time it takes to compile code and the time it takes to run the generated code—has long been an active field of study [KV17; GBE07; Myt+09], to our knowledge there is no existing body of work systematically investigating the performance of proof assistants, nor even any work primarily focused on the problem of proof-assistant performance. We distinguish the question of interactive-proof-assistant performance from that of performance of fully automated reasoning tools such as SAT and SMT solvers, on which there has been a great deal of research [Bou94]. As we discuss in Subsection 1.2.1, interactive proof assistants utilize human creativity to handle a greater breadth and depth of problems than fully automated tools, which succeed only on more restricted domains.

This dissertation argues that the problem of compile-time performance or proof-checking performance is both nontrivial and significantly different from the problem of performance of typical programs. While many papers mention performance, obliquely or otherwise, and some are even driven by performance concerns of a particular algorithm or part of the system [Gon08, p. 1382; Bou94; GM05; Bra20; Ben89; Pie90; CPG17; PCG18; GL02; Nog02], we have not found any that investigate the performance problems that arise asymptotically when proof assistants are used to verify programs at large scale.

We present in Part II a research prototype of a tool and methodology for achieving acceptable performance at scale in the domain of term transformation and rewriting. We present in Part III design principles to avoid the performance bottlenecks we encountered and insights about the proof assistants that these performance bottlenecks reveal.

We argue that for proof assistants to scale to industrial uses, we must get the basics of asymptotic performance of proof checking right. Through work in this domain, we hope to utilize verification in the software innovation pipeline.

1.2.1 What Are Proof Assistants?

Before diving into the details of performance bottlenecks and solutions, we review the history of formal verification and proof assistants to bring the reader up to speed on the context of our work and investigation.

While we intend to cover a wide swath of the history and development in this subsection, more detailed descriptions can be found in the literature [Rin+20; Geu09; HUW14; HP03; Dar19; Dav01; MR05; Kam02; Moo19; MW13; Gor00; PNW19; Pfe02; Con+86, Related Work]. Ringer et al. [Rin+20, ch. 4] have a particularly clear presentation which was invaluable in assembling this section.

Formal verification can be traced back to the early 1950s [Dar19]. The first formally verified proof, in some sense, achieved in 1954, was of the theorem that the sum of two even numbers is even [Dav01]. The “proof” was an implementation in a vacuum-tube computer of the algorithm of Presburger [Pre29], which could decide, for any first-order formula of natural number arithmetic, whether the formula represented a true theorem or a false one; by implementing this algorithm and verifying that it returns “true” on a formula such as ∀푎 푏 푥 푦, ∃푧, 푎 = 푥 + 푥 → 푏 = 푦 + 푦 → 푎 + 푏 = 푧 + 푧, the machine can be said to prove that this formula is true.
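To give a modern flavor of such a decision procedure, here is a minimal Coq sketch of our own (it uses the lia tactic for linear integer arithmetic, a descendant of these ideas rather than Presburger’s original algorithm, and it supplies the existential witness by hand):

    Require Import Lia.

    (* The statement from the 1954 experiment, phrased over Coq's natural numbers. *)
    Goal forall a b x y : nat, exists z, a = x + x -> b = y + y -> a + b = z + z.
    Proof.
      intros a b x y.
      exists (x + y).      (* the witness; lia then discharges the remaining arithmetic *)
      intros Ha Hb; lia.
    Qed.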

While complete decision procedures exist for arithmetic, propositional logic (the fragment of logic without quantifiers, i.e., consisting only of →, ∧, ∨, ¬, and ↔), and elementary geometry, there is no complete decision procedure for first-order logic, which allows predicates, as well as universal and existential quantification over objects [Dav01]. In fact, first-order logic is sufficiently expressive to encode the halting problem [Chu36; Tur37]. The problem gets worse in higher-order logic, where we can encode systems that reason about themselves, such as Peano arithmetic, and Gödel’s incompleteness theorem proves that there must be some statements which are neither provably true nor provably false. In fact, we cannot even decide which statements are undecidable [Mak11]!

This incompleteness, however, does not sink the project of automated proof search. Consider, for example, the very simple program that merely lists out all possible proofs in a given logical system, halting only when it has found either a proof or a disproof of a given statement. While this procedure will run forever on statements which are neither provably true nor provably false, it will in fact be able to output proofs for all provable statements. This procedure, however, is uselessly slow.

More efficient procedures for proof search exist. Early systems such as the Stanford Pascal Verifier [Luc+79] and Stanford Resolution Prover were based on what is now known as Robinson’s resolution rule [Rob65], which, when coupled with syntactic unification, resulted in tolerable performance on sufficiently simple problems [Dav01; Dar19]. A particularly clear description of the resolution method can be found in Shankar [Sha94, pp. 17–18]. In the 1960s, all 400-or-so theorems of Whitehead and Russell’s Principia Mathematica were automatically proven by the same program [Dav01, p. 9]. However, as the author himself notes, this was only feasible because all of the theorems could be expressed in a way where all universal quantifiers came first, followed by all existential quantifiers, followed by a formula without any quantifiers.

In the early 1970s, Boyer and Moore began work on theorem provers which could work with higher-order principles such as mathematical induction [MW13, p. 6]. This work resulted in a family of theorem provers, collectively known as the Boyer–Moore theorem provers, which includes the first seriously successful automated theorem provers [MW13, p. 8; Dar19]. They developed the Edinburgh Pure LISP Theorem Prover, Thm, and later Nqthm [MW13, p. 8; Boy07; Wik20c], the last of which came to be known as the Boyer–Moore theorem prover. Nqthm has been used to formalize and verify Gödel’s first incompleteness theorem in 1986 [Sha94; Moo19, p. 29], to verify the implementation of an assembler and linker [Moo07] as well as a number of FORTRAN programs, and to formally prove the invertibility of RSA encryption, the undecidability of the halting problem, and Gauss’ law of quadratic reciprocity [Moo19, pp. 28–29]. Nqthm later evolved into ACL2 [Moo19; KM20a; KM20b], which has been used, among other things, to verify a Motorola digital signal processor, the floating-point arithmetic unit in AMD chips, and some x86 machine code programs [Moo19, p. 2].

In 1967, at around the same time that Robinson published his resolution principle, N. G. de Bruijn developed the Automath system [Kam02; Bru94; Bru70; Wik20b]. Unlike the Boyer–Moore theorem provers, Automath checked the validity of sequences of human-generated proof steps and hence was more of a proof checker or proof assistant than an automated theorem prover [Rin+20]. Automath is notable for being the first system to represent both theorems and proofs in the same formal system, reducing the problem of proof checking to that of type checking [Rin+20] by exploiting what came to be known as the Curry–Howard correspondence [Kam02]; we will discuss this more in Subsection 1.3.1. The legacy of Automath also includes de Bruijn indices, a method for encoding function arguments which we describe in Section 3.1.3; dependent types, which we explain in Subsection 1.3.1; and the de Bruijn principle—stating that proof checkers should be as small and as simple as possible—which we discuss in Subsection 1.3.2 [Rin+20; Kam02]. We are deferring the explanation of these important concepts for the time being because, unlike the methods of theorem proving described above, these methods are at the heart of Coq, the primary theorem prover used in this thesis, as well as proof assistants like it. One notable accomplishment in the Automath system was the translation and checking of the entirety of Edmund Landau’s Foundations of Analysis in the early 1970s [Kam02].

Almost at the same time as Boyer and Moore were working on their theorem provers in Edinburgh, Scotland, Milner developed the LCF theorem prover at Stanford in 1972 [Gor00, p. 1]. Written as an interactive proof checker based on Dana Scott’s 1969 logic for computable functions (which LCF abbreviates), LCF was designed to allow users to interactively reason about functional programs [Gor00, p. 1]. In 1973, Milner moved to Edinburgh and designed Edinburgh LCF, the successor to Stanford LCF. This new version of LCF was designed to work around two deficiencies of its predecessor: theorem proving was limited by available memory for storing proof objects, and the fixed set of functions for building proofs could not be easily extended. The first of these was solved by what is now called “the LCF approach”: by representing proofs with an abstract thm type, whose API only permitted valid rules of inference, proofs did not have to be carried around in memory [Gor00, pp. 1–2; Har01]. In order to support abstract data types, Milner et al. invented the language ML (short for “Meta Language”) [Gor00, p. 2], the precursor to Caml and later OCaml. The second issue—ease of extensibility—was also addressed by the design of ML [Gor00, p. 2]. By combining an abstract, opaque, trusted API for building terms with a functional programming language, users were granted the ability to combine the basic proof steps into “tactics”. Tactics were functions that took in a goal, that is, a formula to be proven, and returned a list of remaining subgoals, together with a function that would take in proofs of those subgoals and turn them into a proof of the overall theorem. An example: a tactic for proving conjunctions might, upon being asked to prove 퐴 ∧ 퐵, return the two-element list of subgoals [퐴, 퐵] together with a function that, when given a proof of 퐴 and a proof of 퐵 (i.e., when given two thm objects, the first of which proves 퐴 and the second of which proves 퐵), combines them with a primitive conjunction rule to produce a proof object justifying 퐴 ∧ 퐵.
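The descendant of this design is visible directly in Coq’s tactic language: the built-in split tactic plays exactly the role of the conjunction tactic just described, turning one conjunction goal into two subgoals whose proofs are recombined behind the scenes. A small sketch of our own:

    Goal 1 + 1 = 2 /\ True.
    Proof.
      split.           (* the goal A /\ B becomes two subgoals: A and B *)
      - reflexivity.   (* proof of 1 + 1 = 2 *)
      - exact I.       (* proof of True *)
    Qed.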

In the mid 1980s, Coq [Coq20], the proof assistant which we focus most on in this dissertation, was born from an integration of features and ideas from a number of the proof assistants we’ve discussed in this subsection. Notably, it was based on the Calculus of Constructions (CoC), a synthesis of Martin-Löf’s type theory [Mar75; Mar82] with dependent types and polymorphism, which grew out of Dana Scott’s logic of computable functions [Sco93] together with de Bruijn’s work on Automath [HP03]. In the late 1980s, some problems were found with the way datatypes were encoded using functions, which led to the introduction of inductive types and an extension of CoC called the Calculus of Inductive Constructions (CIC) [HP03]. Huet and Paulin-Mohring [HP03] contains an illuminating and comprehensive discussion of how the threads of Coq’s development arose in tandem with the history discussed in the preceding paragraphs, and we highly recommend this read for those interested in Coq’s history.

Major accomplishments of verification in Coq include the fully verified optimizing C compiler CompCert [Ler09], the proof of the Four Color Theorem [Gon08], and the complete formalization of the Odd Order Theorem, also known as the Feit–Thompson Theorem [Gon+13a]. This last development was the result of about six years of work formalizing a proof that every finite group of odd order is solvable; the original proof, published in the early 1960s, is about 225 pages long.

We now briefly mention a number of other proof assistants, calling out some particularly significant accomplishments of verification. Undoubtedly we will miss some proof assistants and accomplishments, for which we refer the reader to the rich existing literature, some of which is cited in the first paragraph of this subsection, as well as scattered among other papers which describe a variety of proof assistants [Wie09].

Inspired by Automath, the Mizar [Har96a; Rud92; MR05] proof checker was designed to assist mathematicians in preparing mathematical papers [Rud92]. The Mizar Mathematical Library already had 55 thousand formally verified lemmas in 2009 and was at the time (and might still be) the largest library of formal mathematics [Wie09]. LCF [Gor00; GMW79; Gor+78] spawned a number of other closely related proof assistants, such as HOL [Bar00; Gor00], Isabelle/HOL [PNW19; Wen02; NPW02; Pau94], HOL4 [SN08], and HOL Light [Har96c]. Among other accomplishments, a complete OS microkernel, seL4, was fully verified in Isabelle/HOL by 2009 [Kle+09]. In 2014, a complete proof of the Kepler conjecture on optimal packing of spheres was formalized in a combination of Isabelle and HOL Light [Hal06; Hal+14]. The language CakeML includes a self-bootstrapping optimizing compiler which is fully verified in HOL [Kum+14]. The Nqthm Boyer–Moore theorem prover eventually evolved into ACL2 [KM20b; KM20a]. Other proof assistants include LF [Pfe91; HHP93; Pfe02], Twelf [PS99], Matita [Asp+07; Asp+11], PVS [Sha96; ORS92], LEGO [Pol94], and NuPRL [Con+86].

1.3 Basic Design Choices

Although the design space of proof assistants is quite large, as we’ve touched on in Subsection 1.2.1, there are only two main design decisions which we want to assume for the investigations of this thesis. The first is the use of dependent type theory as a basis for formal proofs, as is done in Automath [Wik20b; Bru70; Bru94], Coq [Coq20], Agda [Nor09], Idris [Bra13], Lean [Mou+15], NuPRL [Con+86], Matita [Asp+11], and others, rather than on some other logic, as is done in LCF [Gor00; GMW79; Gor+78], Isabelle/HOL [PNW19; Wen02; NPW02; Pau94], HOL4 [SN08], HOL Light [Har96c], LF [Pfe91; HHP93], and Twelf [PS99], among others. The second is the de Bruijn criterion, mandating independent checking of proofs by a small trusted kernel [BW05]. We have found that many of the performance bottlenecks are fundamentally a result of one or the other of these two design decisions. Readers are advised to consult Ringer et al. [Rin+20, ch. 4] for a more thorough mapping of the design axes.

In this section, we will explain these two design choices in detail; by the end of this section, the reader should understand what each design choice entails and, we hope, why these are reasonable choices to make.

1.3.1 Dependent Types: What? Why? How?

There are, broadly, three schools of thought on what a proof is. Geuvers [Geu09] describes two roles that a proof plays:

(i) A proof convinces the reader that the statement is correct.
(ii) A proof explains why the statement is correct.

A third conception of proof is that a proof is itself a mathematical object or construction which corresponds to the content of a particular theorem [Bau13]. This third conception dates back to the school of intuitionism of Brouwer in the early 1900s and of constructive mathematics of Bishop in the 1960s; see Constable et al. [Con+86, Related Works] for a tracing of the history from Brouwer to Martin-Löf, whose type theory is at the heart of Coq and similar proof assistants.

This third conception of proof admits formal frameworks where proof and computation are unified as the same activity. As we’ll see shortly, this allows for drastically smaller proofs.

The foundation of unifying computation and proving is, in some sense, the Curry–Howard–de Bruijn correspondence, more commonly known as the Curry–Howard correspondence or the Curry–Howard isomorphism. This correspondence establishes the relationship between types and propositions, between proofs and computational objects.

The reader may be familiar with types from programming languages such as C/C++, Java, and Python, all of which have types for strings, integers, and lists, among others. A type denotes a particular collection of objects, called its members, inhabitants, or terms. For example, 0 is a term of type int, "abc" is a term of type string, and true and false are terms of type bool. Types define how terms can be built and how they can be used. New natural numbers, for example, can be built only as zero or as the successor of another natural number; these two ways of building natural numbers are called the type’s constructors. Similarly, the only ways to get a new Boolean are by giving either true or false; these are the two constructors of the type bool. Note that there are other ways to get a Boolean, such as by calling a function that returns Booleans, or by having been given a Boolean as a function argument. The constructors define the only Booleans that exist at the end of the day, after all computation has been run. This uniqueness is formally encoded by the eliminator of a type, which describes how to use it. The eliminator on bool is the if statement; to use a Boolean, one must say what to do if it is true and what to do if it is false. Some eliminators encode recursion: to use a natural number, one must say what to do if it is zero and also one must say what to do if it is a successor. In the case where the given number is the successor of 푛, however, one is allowed to call the function recursively on 푛. For example, we might define the factorial function as

    fact m = case m of
      zero   -> succ zero
      succ n -> m * fact n
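In Coq, the same function can be written either with structural recursion (Fixpoint) or directly in terms of the eliminator nat_rect that Coq generates for the type nat; a minimal sketch of our own (the names fact and fact' are ours):

    (* Structural recursion, as most developments would write it. *)
    Fixpoint fact (m : nat) : nat :=
      match m with
      | O => 1
      | S n => m * fact n
      end.

    (* The same function written with the eliminator: nat_rect takes the motive,
       the zero case, and the successor case. *)
    Definition fact' : nat -> nat :=
      nat_rect (fun _ => nat) 1 (fun n rec => S n * rec).

    Example fact_5 : fact 5 = 120 := eq_refl.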

Eliminators in programming correspond to induction and case analysis in mathematics. To prove a property of all natural numbers, one must prove it of zero, and also prove that if it holds for any number 푛, then it holds for the successor of 푛. Here we see the first glimmer of the Curry–Howard isomorphism, which identifies each type with the set of terms of that type, identifies each proposition with the set of proofs of that proposition, and thereby identifies terms with proofs.

Table 1.1 shows the correspondence between programs and proofs. We have already seen how recursion lines up with induction in the case of natural numbers; let us look now at how some of the other proof rules correspond.

To prove a conjunction 퐴 ∧ 퐵, one must prove 퐴 and also prove 퐵; if one has a proof of the conjunction 퐴 ∧ 퐵, one may assume both 퐴 and 퐵 have been proven. This lines up exactly with the type of pairs: to inhabit the type of pairs 퐴 × 퐵, one must

computation                      set theory                    logic
type                             set                           proposition
term / program                   element of a set              proof
eliminator / recursion                                         case analysis / induction
type of pairs                    Cartesian product (×)         conjunction (∧)
sum type (+)                     disjoint union (⊔)            disjunction (∨)
function type                    set of functions              implication (→)
unit type                        singleton set                 trivial truth
empty type                       empty set (∅)                 falsehood
dependent function type (Π)                                    universal quantification (∀)
dependent pair type (Σ)                                        existential quantification (∃)

Table 1.1: The Curry–Howard Correspondence

give an object of type 퐴 paired with an object of type 퐵; given an object of the pair type 퐴 × 퐵, one can project out the components of types 퐴 and 퐵.

To prove the implication 퐴 → 퐵, one must prove 퐵 under the assumption that 퐴 holds, i.e., that a proof of 퐴 has been given. The rule of modus ponens describes how to use a proof of 퐴 → 퐵: if also a proof of 퐴 is given, then 퐵 may be concluded. These correspond exactly to the construction and application of functions in programming languages: to define a function of type 퐴 → 퐵, the programmer gets an argument of type 퐴 and must return a value of type 퐵. To use a function of type 퐴 → 퐵, the programmer must apply the function to an argument of type 퐴; this is also known as calling the function.

Here we begin to see how type checking and proof checking can be seen as the same task. The process of type checking a program consists of ensuring that every variable is given a type, that every expression assigned to a variable has the type of that variable, that every argument to a function has the correct type, etc. If we write the Boolean negation function which sends true to false and false to true by case analysis (i.e., by an if statement), the type checker will reject our program if we try to apply it to, say, an argument of type string such as "foo". Similarly, if we try to use modus ponens to combine a proof that 푥 = 1 → 2푥 = 2 with a proof that 푥 = 2 to obtain a proof that 2푥 = 2, the proof checker should complain that 푥 = 1 and 푥 = 2 are not the same type.
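A concrete Coq illustration of our own (the Fail command asserts that the following step is rejected by the type checker):

    Goal forall (x : nat), (x = 1 -> 2 * x = 2) -> x = 2 -> True.
    Proof.
      intros x H Hx2.
      (* H expects a proof of x = 1; Hx2 proves x = 2, so the application is ill-typed. *)
      Fail pose proof (H Hx2).
      exact I.
    Qed.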

While the correspondence of the unit type to tautologies is relatively trivial, the correspondence of the empty type to falsehood encodes nontrivial principles. By encoding falsehood as the empty type, the principle of explosion—that from a contradiction, everything follows—can be encoded as case analysis on the empty type.

The last two rows of Table 1.1 are especially interesting cases which we will now cover.

26 Some programming languages allow functions to return values whose types depend on the values of the functions’ arguments. In these languages, the types of arguments are generally also allowed to depend on the values of previous arguments. Such languages are said to support dependent types. For example, we might have a function that takes in a Boolean and returns a string if the Boolean is true but an integer if the Boolean is false. More interestingly, we might have a function that takes in two Booleans and additionally takes in a third argument which is of type unit whenever the two Booleans are either both true or both false but is of type empty when they are not equal. This third argument serves as a kind of proof that the first two arguments are equal. By checking that the third argument is well-typed, that is, that the single inhabitant of the unit type is passed only when in fact the first two arguments are equal, the type checker is in fact doing proof checking. While compilers of languages like C++, which supports dependent types via templates, can be made to do rudimentary proof checking in this way, proof assistants such as Coq are built around such dependently typed proof checking.
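For instance, the first function just described can be written directly in Coq; a minimal sketch of our own (the name string_or_nat is made up for illustration):

    Require Import String.
    Open Scope string_scope.

    (* The return type depends on the value of the Boolean argument. *)
    Definition string_or_nat (b : bool) : if b then string else nat :=
      match b return if b then string else nat with
      | true => "a string result"
      | false => 0
      end.

    Check (string_or_nat true : string).
    Check (string_or_nat false : nat).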

The last two lines of Table 1.1 can now be understood.

A dependent function type is just one whose return value depends on its arguments. For example, we may write the nondependently typed function type

bool → bool → unit → unit

which takes in three arguments of types bool, bool, and unit and returns a value of type unit. Note that we write this function in curried style, with → associating to the right (i.e., 퐴 → 퐵 → 퐶 is 퐴 → (퐵 → 퐶)), where a function takes in one argument at a time and returns a function awaiting the next argument. This function is not very interesting, since it can only return the single element of type unit.

However, if we define 퐸(푏1, 푏2) to be the type if b1 then (if b2 then unit else empty) else (if b2 then empty else unit), i.e., the type which is unit when both are true or both are false and is empty otherwise, then we may write the dependent type

(푏1 ∶ bool) → (푏2 ∶ bool) → 퐸(푏1, 푏2) → 퐸(푏2, 푏1)

Alternate notations include

Π_(푏1 ∶ bool) Π_(푏2 ∶ bool) 퐸(푏1, 푏2) → 퐸(푏2, 푏1)

and

∀(푏1 ∶ bool)(푏2 ∶ bool), 퐸(푏1, 푏2) → 퐸(푏2, 푏1).

A function of this type witnesses a proof that equality of Booleans is symmetric.
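This type can be written down and inhabited directly in Coq; a minimal sketch of our own (the names E and E_sym are ours):

    (* E b1 b2 is the unit type exactly when b1 and b2 agree, and the empty type otherwise. *)
    Definition E (b1 b2 : bool) : Set :=
      if b1 then (if b2 then unit else Empty_set)
            else (if b2 then Empty_set else unit).

    (* A term of the dependent function type above: a witness that this encoding of
       Boolean equality is symmetric.  Case analysis on b1 and b2 makes each instance
       of E compute to either unit or Empty_set, where the identity function suffices. *)
    Lemma E_sym : forall b1 b2 : bool, E b1 b2 -> E b2 b1.
    Proof. intros [|] [|] e; exact e. Qed.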

Similarly, dependent pair types witness existentially quantified proofs. Suppose we have a type 푇(푛) which encodes the statement “푛 is prime and even”. To prove ∃푛, 푇(푛), we must provide an explicit 푛 together with a proof that it satisfies 푇. This is exactly what a dependent pair is: Σ푛 푇(푛) is the type consisting of pairs of a number 푛 together with a proof that that particular 푛 satisfies 푇.
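In Coq, such a dependent pair can be written with the sig type (notation { n : nat | T n }); a minimal sketch of our own, where in place of “prime and even” we use the simpler stand-in property “even and at least 2” to keep the example within the standard library:

    Require Import PeanoNat.

    Definition T (n : nat) : Prop := Nat.Even n /\ 2 <= n.

    (* The dependent pair packages an explicit witness (here 4) together with a proof
       that this particular witness satisfies T. *)
    Lemma witness : { n : nat | T n }.
    Proof.
      exists 4. split.
      - exists 2; reflexivity.   (* 4 = 2 * 2 *)
      - repeat constructor.      (* 2 <= 4 *)
    Qed.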

As we mentioned above, one feature of basing a proof assistant on dependent type theory is that computation can be done at the type level, without leaving a trace in the proof term. Many proofs require intermediate arguments based solely on the computation of functions. For example, a proof in number theory or cryptography might depend on the fact that a particularly large number, raised to some large power, is congruent to 1 modulo some prime. As argued by Stampoulis [Sta13], if we are required to record all intermediate computation steps in the proof term, they can become prohibitively large. The Poincaré principle asserts that such arguments should not need to be recorded in formal proofs but should instead be automatically verified by appeal to computation [BG01, p. 1167]. The ability to appeal to computation without blowing up the size of the proof term is quite important for so-called reflective (or reflexive) methods of proof, described in great detail in Chapter 3.
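As a toy illustration of our own: the following proof records no intermediate arithmetic steps at all; the checker simply reruns the computation.

    Require Import ZArith.
    Open Scope Z_scope.

    (* No arithmetic steps appear in the proof object; the equation is checked by
       computing both sides (here with Coq's virtual machine). *)
    Goal (2 ^ 20) mod 7 = 4.
    Proof. vm_compute; reflexivity. Qed.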

Readers interested in a more comprehensive explanation of dependent type theory are advised to consult Chapter 1 (Type theory) and Appendix A (Formal type theory) of Univalent Foundations Program [Uni13]. Readers interested in perspectives on how dependent types may be disadvantageous are invited to consult literature such as Lamport and Paulson [LP99] and Paulson [Pau18].

1.3.2 The de Bruijn Criterion

A Mathematical Assistant satisfying the possibility of independent checking by a small program is said to satisfy the de Bruijn criterion.

— Henk Barendregt [BW05]

As described in the beginning of this chapter, the purpose of proving our software correct is that we want to be able to trust that it has no bugs. Having a proof checker reduces the problem of software correctness to the problem of the correctness of the specification, together with the correctness of the proof checker. If the proof checker is complicated and impenetrable, it might be quite reasonable not to trust it.

Proof assistants satisfying the de Bruijn criterion are, in general, more easily trustable than those which violate it. The ability to check proofs with a small program, divorced from any heuristic programs and search procedures which generate the proof, allows trust in the proof to be reduced to trust in that small program. Sufficiently small and well-written programs can more easily be inspected and verified.

The proof assistant Coq, which is the primary proof assistant we consider in this dissertation, is a decent example of satisfying the de Bruijn criterion. There is a large untrusted codebase which includes the proof scripting language ℒtac, used for generating proofs and doing type inference. There’s a much smaller kernel which checks the proofs, and Coq is even shipped with a separate checker program, coqchk, for checking proof objects saved to disk. Moreover, in the past year, a checker for Coq’s proof objects has been implemented in Coq itself and formally verified with respect to the type theory underlying Coq [Soz+19].

Note that the LCF approach to theorem proving, where proofs have an abstract type and type safety of the tactics guarantees validity of the proof object, forms a sort-of complementary approach to trust.

1.4 Look Ahead: Layout and Contributions of the Thesis

In the remainder of Part I, we will finish laying out the landscape of performance bottlenecks we encountered in dependently typed proof assistants; Chapter 2 (The Performance Landscape in Type-Theoretic Proof Assistants) gives a more in-depth investigation into what makes performance optimization in dependent type theory hard, different, and unique, followed by describing major axes of superlinear performance bottlenecks in Section 2.6 (The Four Axes of the Landscape).

Part II (Program Transformation and Rewriting) is devoted, in some sense, to performance bottlenecks that arise from the de Bruijn criterion of Subsection 1.3.2. We investigate one particular method for avoiding these performance bottlenecks. We introduce this method, variously called proof by reflection or reflective automation, in Chapter 3 (Reflective Program Transformation), with a special emphasis on a particularly common use case—transformation of syntax trees. Chapter 4 (A Framework for Building Verified Partial Evaluators) describes our original contribution of a framework for leveraging reflection to perform rewriting and program transformation at scale, driven by our need to synthesize efficient, proven-correct, low-level cryptographic primitives [Erb+19]. Where Chapter 4 addresses the performance challenges of verified or proof-producing program transformation, Chapter 5 (Engineering Challenges in the Rewriter) is a deep-dive into the performance challenges of engineering the tool itself and serves as a sort-of microcosm of the performance bottlenecks previously discussed and the solutions we’ve proposed to them. Unlike the other chapters of this dissertation, Chapter 5 at times assumes a great deal of familiarity with the details of the Coq proof assistant. Finally, Chapter 6 (Reification by Parametricity) presents a way to efficiently, elegantly, and easily perform reification, the first step of proof by reflection, which is often a bottleneck in its own right. We discovered—or invented—this trick in the course of working on our library for synthesis of low-level cryptographic primitives [Erb+19; GEC18].

Part III (API Design) is devoted, by and large, to the performance bottlenecks that arise from the use of dependent types as the basis of a proof assistant as introduced in Subsection 1.3.1; in Chapter 7 (Abstraction), we discuss lessons on engineering libraries at scale drawn from our case study in formalizing category theory and augmented by our other experience. Many of the lessons presented here are generalizations of examples described in Chapter 5 (Engineering Challenges in the Rewriter). The category-theory library formalized as part of this doctoral work, available at HoTT Library Authors [HoT20], is described briefly in this chapter; a more thorough description can be found in the paper we published on our experience formalizing this library [GCS14].

Part IV (Conclusion) is in some sense the mirror image of Part I: Where Chapter 2 is a broad look at what is currently lacking and where performance bottlenecks arise, Chapter 8 (A Retrospective on Performance Improvements) takes a historical perspective on what advancements have already been made in the performance of proof assistants and Coq in particular. Finally, while the present chapter which we are now concluding has looked back on the present state and history of formal verification, Chapter 9 (Concluding Remarks) looks forward to what we believe are the most important next steps in the perhaps-nascent field of proof-assistant performance at scale.

Chapter 2

The Performance Landscape in Type-Theoretic Proof Assistants

2.1 Introduction

As alluded to in Section 1.1, when writing nonautomated proofs to verify code in a proof assistant, the number of lines of proof we need to write scales with the number of lines of code being verified, typically resulting in a 10× to 100× overhead. In this strategy, proof generation and proof checking times are reasonable, often scaling linearly with the number of lines of proof. Automating the generation of proofs resolves the issue of overhead of proof-writing time. Automation significantly decreases the marginal cost of proving theorems about new code that is similar enough to code already verified. However, it introduces massive nonlinear overhead in the time it takes for the computer to generate and check the proof.

The main contribution of this thesis, presented in Chapter 4, is a tool that solves this problem of unacceptable overhead in proof-generation and proof-checking for the Fiat Cryptography project [Erb+19] and which we believe is broadly applicable to other domains.

In building this work, the author developed a deep understanding of the performance bottlenecks faced and addressed. This chapter lays out the groundwork for understanding these performance bottlenecks, where they come from, and how our solution addressed the relevant performance issues. We describe what we have seen of the landscape of performance issues in proof assistants and provide a map for navigation. Our hope is that readers will be able to apply our map to performance bottlenecks they encounter in dependently typed tactic-driven proof assistants like Coq.

2.2 Exponential Domain

We sketch out the main differences between performance issues we’ve encountered in dependently typed proof assistants and performance issues in other languages. Some of these differences are showcased through a palette of real performance issues that have arisen in Coq.

The widespread common sense in performance engineering is that good performance optimization happens in a particular order: there is no use micro-optimizing code when implementing an algorithm with unacceptable performance characteristics; imagine trying to optimize the pseudorandom number generator used in bogosort [GHR07], for example.¹ Similarly, there is no use trying to find or create a better algorithm if the problem being solved is more complicated than it needs to be; consider, for example, the difference between ray tracers and physics simulators. Ray tracers determine what objects can be seen from a given point by drawing lines from the viewpoint to the objects and seeing if they pass through any other objects “in front of” them. Alternatively, it is possible to provide a source of light waves and simulate the physical interaction of light with the various objects, to determine what images remain when the light arrives at a particular point. There’s no use trying to find an efficient algorithm for simulating quantum electrodynamics, though, if all that is needed is to answer “which parts of which objects need to be drawn on the screen?”

One essential ingredient for this division of concerns—between specifying the problem, picking an efficient algorithm, and optimizing the implementation of the algorithm—is knowledge of what a typical set of inputs looks like and what the worst case looks like. When sorting a list, we know that the length of the list and the initial ordering matter; for sorting algorithms that work for sorting lists with any type of elements, it generally doesn’t matter, though, whether we’re sorting a list of integers or colors or names. Furthermore, randomized datasets tend to be reasonably representative for list ordering, though we may also care about some special cases, such as already-sorted lists, nearly sorted lists, and lists in reverse-sorted order. We can say that sorting is always possible in 풪(푛 log 푛) time, and that’s a pretty good starting point.

In Coq, and other dependently typed proof assistants, this ingredient is missing. The domain is much larger: in theory, we want to be able to check any proof anyone might write. Furthermore, in dependently typed proof assistants, the worst-case behavior is effectively unbounded, because any provably terminating computation can be run at typechecking time.

In fact, this issue already arises for compilers of mainstream programming languages. The C++ language, for example, has constexpr constructions that allow running arbitrary computation at compile time, and it’s well-known that C++ templates can

1Bogosort, whose name is a portmanteau of the words bogus and sort [Ray03], sorts a list by randomly permuting the list over and over until it is sorted.

incur a large compile-time performance overhead. However, we claim that, in most languages, even as programs scale, these performance issues are the exception rather than the rule. Most code written in C or C++ does not hit unbounded compile-time performance bottlenecks. Generally, for code that compiles in a reasonable amount of time, as the codebase size is scaled up, compile time will creep up linearly.

In Coq, however, the scaling story is very different. Compile time scales superlinearly with example size. Frequently, users will cobble together code that works to prove a toy version of some theorem or to verify a toy version of some program. By virtue of the fact that humans are impatient, the code will execute in reasonable time, perhaps a couple seconds, on the toy version. The user will then apply the same proof technique on a slightly larger example, and the proof-checking time will often be pretty similar. After scaling the example a bit more, the proof-checking time will be noticeably slow—maybe it now takes a couple of minutes. Suddenly, though, scaling the example just a tiny bit more will result in the compiler not finishing even if we let it run for a day or more. This is what working in an exponential performance domain is like.

Figure 2-1: Synthesizing Subtraction. Synthesis time (s) vs. # limbs, shown on linear and log scales; the fitted curve is 3.54 ⋅ 10⁻⁴ ⋅ 푒^(2.39푛).

To put numbers on this, let us consider an example from Fiat Cryptography which involved generating C code to do arithmetic on very large numbers. The code generation was parameterized on the number of machine words needed to represent a single big integer. Our smallest toy example used two machine words; our largest example used 17. The smallest toy example took about 14 seconds. Based on the compile-time performance of about a hundred examples, we expect the largest example would have taken over four thousand millennia! See Figure 2-1. (Our primary nontoy test example used four machine words and took just under a minute; the biggest realistic example we were targeting was twice that size, at eight machine words, and took about 20 hours.)

2.3 Motivating the Performance Map

In most performance domains, solutions to performance bottlenecks are generally either hyper specialized to the code being optimized or the domain of the algorithm, or else they are so general as to be applicable to all performance engineering. For example, solving a performance issue might involve caching the result of a particular computation; caching is a very general solution, while the particular computation being run is hyper specialized. Once factored like this, there is generally no remaining insight to be had about the particular performance bottleneck encountered. In our experience with proof assistants, most performance bottlenecks are far from the code being written and the domain being investigated and are yet also far from general performance engineering.

The example above is an instance of a performance bottleneck which is neither specific to the domain (in our case, cryptographic code generation) nor general enough to apply to performance engineering outside of proof assistants.

Where is the bottleneck? Maybe, one might ask, were we generating unreasonable amounts of code? Each example using 푛 machine words generated 3푛 lines of code. How can exponential performance result from linear code?

Our method involved two steps: first generate the code, then check that the generated code matches with what comes out of the verified code generator. This may seem a bit silly, but it is actually somewhat common; in a theorem that says “any code that comes out of this code generator satisfies this property”, we need a proof that the code we feed into the theorem actually came out of the specified code generator, and the easiest way to prove this is, roughly, to tell the proof assistant to just check that fact for you. (It’s possible to be more careful and not do the work twice, but this often makes the code a bit harder to read and understand and is oftentimes pointless; premature optimization is the root of all evil, as they say.) Furthermore, because we often don’t want to fully compute results when checking that two expressions are equal—just imagine having to compute the factorial of 1000 just to check that 1000! is equal to itself—the default method for checking that the code came out of the code generator is different from the method we used to compute the code in the first place.
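To make the shape of this pattern concrete, here is a minimal self-contained Coq sketch; the names generator and generated and the list-based “spec” are invented for illustration and are unrelated to the actual Fiat Cryptography development. The final theorem is discharged by asking Coq to re-check, by computation, that the committed output really did come out of the generator.

Require Import Coq.Lists.List.

(* A toy verified "code generator" and its correctness theorem. *)
Definition generator (n : nat) : list nat := List.repeat 0 n.

Theorem generator_ok : forall n l, l = generator n -> length l = n.
Proof. intros n l ->. unfold generator. apply List.repeat_length. Qed.

(* The "generated code" that actually gets committed and used downstream. *)
Definition generated : list nat := Eval vm_compute in generator 3.

(* To apply [generator_ok], we must prove [generated = generator 3]; here
   that proof is just a conversion check, i.e., the generation work is
   effectively redone when the proof is checked. *)
Theorem generated_ok : length generated = 3.
Proof. apply (generator_ok 3). vm_compute. reflexivity. Qed.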

It turns out that the actual code generation took less than 0.002% of the total time on the largest examples we tested (just 14 seconds out of about 211 hours). The rest of the time was spent checking that the generated code in fact matched what comes out of the verified code generator.

2.4 Performance Engineering in Proof Assistants Is Hard

The fix to the example is itself quite simple, being only 21 characters long.2 However, tracking down this solution was quite involved, requiring the following pieces:

1. A good profiling tool for proof scripts (see Subsection 8.1.3). This is a standard component of a performance engineer’s toolkit, but when I started my PhD, there was no adequate profiling infrastructure for Coq. (A brief usage sketch of such a profiler appears just after this list.) While such a tool is essential for performance engineering in all domains, what’s unusual about dependently typed proof assistants, I claim, is that essentially every codebase that needs to scale runs into performance issues, and furthermore these issues are frequently total blockers for development because so many of them are exponential in nature.

2. Understanding the details of how Coq works under-the-hood. Conversion, the ability to check if two types or terms are the same, is one of the core components of any dependently typed proof assistant. Understanding the details of how conversion works is generally not something users of a proof assistant want to worry about; it’s like asking C programmers to keep in mind the size of gcc’s maximum nesting level for #include’d files3 when writing basic programs. It’s certainly something that advanced users need to be aware of, but it’s not something that comes up frequently.

3. Being able to run the proof assistant in your head. When I looked at the conversion problem, I knew immediately what the most likely cause of the performance issue was, but this is because I’ve managed to internalize most of how Coq runs in my head. This might seem reasonable at a glance; one expects to have to understand the system being optimized in order to optimize it. However, the knowledge required here is hard-won and not easily accessible. While I’ve managed to learn the details of what Coq is doing—including performance characteristics—basically without having to read the source code at all, the relevant performance characteristics are not documented anywhere and are not even easily interpretable from the source code of Coq. This is akin to, say, being able to learn how gcc represents various bits of C code, what transformations it does in what order, and what performance characteristics these transformations have, just from using gcc to compile C code and reading the error messages it gives you. These are details that should not need to be exposed to the user, but because dependent type theory is so complicated—complicated enough that it’s generally assumed that users will get line-by-line interactive feedback from the compiler while developing—the numerous design decisions and seemingly reasonable defaults and heuristics lead to subtle performance issues.

2Strategy 1 [Let_In]. for those who are curious.
3It’s 200, for those who are curious [Fre17].

Note, furthermore, that this performance issue is essentially about the algorithm used to implement conversion and is not even sensible when talking about only the spec of what it means for two terms to be convertible. Furthermore, note that the requirement of being able to run the typechecker in one’s head is essentially the statement that the entire implementation is part of the specification.4

4. Knowing how to tweak the built-in defaults for parts of the system which most users expect to be able to treat as black-boxes.
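As a point of reference for the first item, Coq now ships an ℒtac profiler; a minimal usage sketch, on a throwaway goal, looks like the following.

(* Enable the ℒtac profiler, run some proof script, and display the results. *)
Set Ltac Profiling.

Goal True /\ True /\ True.
Proof. repeat constructor. Qed.

Show Ltac Profile.
(* Prints a table of tactic invocations with their total and self times. *)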

Note that even after this fix, the performance is still exponential! However, the performance is good enough that we deemed it not currently worth digging into the profile to understand the remaining bottlenecks. See Figure 2-2.

Figure 2-2: Timing of synthesizing subtraction after fixing the bottleneck. Synthesis time (s) vs. # limbs, on linear and log scales; the fitted curve is 9.27 ⋅ 푒^(0.19푛).

2.5 Fixing Performance Bottlenecks in the Proof Assistant Itself Is Also Hard

In many domains, the performance challenges have been studied and understood, resulting in useful decompositions of the problem into subtasks that can be optimized independently. It’s rarely the case that disparate parts of a codebase must be simultaneously optimized to see any performance improvement at all.

We have not found any such study of performance challenges in proof assistants. It seems to us that there are many disparate parts of any proof assistant satisfying the de Bruijn criterion which are deeply coupled and which cannot be performance-optimized independently. There are many seemingly reasonable implementation choices that

4Thanks to Andres Erbsen for pointing this out to me.

can be made for the kernel—the trusted proof checker—which make performance-optimizing the proof engine, which generates the proof, next to impossible. Worse, if performance optimization is done incrementally, to avoid needless premature optimization, then it can be the case that performance-optimizing the kernel has effectively no visible impact; the most efficient proof-engine design for the slower kernel might be inefficient in ways that prevent optimizations in the kernel from showing up in actual use cases, because simple proof-engine implementations tend to avoid the performance bottlenecks of the kernel while simultaneously shadowing them with bottlenecks with similar performance characteristics.

2.6 The Four Axes of the Landscape

We’ve now seen what superlinear scaling in dependently typed proof assistants looks like. We’ve covered general arguments for why proof assistants might have such scaling and what we believe broadly underpins the challenges of performance engineering in and on proof assistants.

The rest of this chapter is devoted to mapping out the landscape of performance bottlenecks we’ve encountered in a way that we hope will illuminate structure in the performance bottlenecks which are neither specific to the domain of the proof being checked nor general to all performance engineering. We present a map of performance bottlenecks comprising four axes. These axes are by no means exhaustive, but, in our experience, most interesting performance bottlenecks scale as a superlinear factor of one or more of these axes.

2.6.1 The Size of the Type

We start with one of the simplest axes.

Suppose we want to prove a conjunction of 푛 propositions, say, True∧True∧⋯∧True. For such a simple theorem, we want the size of the proof, and the time and memory complexity of checking it, to be linear in 푛.

Recall from Subsection 1.3.2 that we want a separation between the small trusted part of the proof assistant and the larger untrusted part. The untrusted part generates certificates, which in dependently typed proof assistants are called terms, which the trusted part, the kernel, checks.

The obvious certificate to prove a conjunction 퐴 ∧ 퐵 is to hold a certificate 푎 proving 퐴 and a certificate 푏 proving 퐵. In Coq, this certificate is called conj and it takes four parameters: 퐴, 퐵, 푎 ∶ 퐴, and 푏 ∶ 퐵. Perhaps the reader can already spot the problem.
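A small illustrative session makes the problem concrete; the comment records the certificate that Coq displays, though the exact printed form may vary between versions.

Goal True /\ True /\ True.
Proof.
  repeat constructor.
  (* [Show Proof] displays a term like (conj I (conj I I)); with implicit
     arguments shown, this is @conj True (True /\ True) I (@conj True True I I),
     so the remaining conjunction is repeated in full at every [conj] node. *)
  Show Proof.
Qed.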

Figure 2-3: Proving a conjunction of 푛 Trues. (a) Timing of repeat constructor (fit ≈ 1.61 ⋅ 10⁻⁶푛² − 9.59 ⋅ 10⁻⁶푛 + 2.39 ⋅ 10⁻³); (b) timing of manually building and typechecking a certificate using Ltac2 (typecheck fit ≈ 8.49 ⋅ 10⁻⁷푛² + 1.24 ⋅ 10⁻⁵푛 − 6.94 ⋅ 10⁻⁴; build fit ≈ 3.54 ⋅ 10⁻⁵푛 − 4.65 ⋅ 10⁻⁴).

To prove a conjunction of 푛 propositions, we end up repeating the type 푛 times in the certificate, resulting in a term that is quadratic in the size of the type. We see in Figure 2-3a the time it takes to do this in Coq’s tactic mode via repeat constructor. If we are careful to construct the certificate manually without duplicating work, we see that it takes linear time for Coq to build the certificate and quadratic time for Coq to check the certificate; see Figure 2-3b.

Note that for small and even medium-sized examples, it’s pretty reasonable to do duplicative work. It’s only when we reach very large examples that we start hitting nonlinear behavior.

There are two obvious solutions for this problem:

1. We can drop the type parameters from the conj certificates.

2. We can implement some sort of sharing, where common subterms of the type only exist once in the representation.

Dropping Type Parameters: Nominal vs. Structural Typing

The first option requires that the proof assistant implement structural typing rather than nominal typing [Pie02, 19.3 Nominal and Structural Type Systems]. Note that it doesn’t actually require structural typing; we can do it with nominal typing if we enforce everywhere that we can only compare terms that are known to be the same type, because not having structural typing results in having a single kernel term with multiple nonunifiable types.

Morally, the reason for this is that if we have an inductive record type whose fields do not constrain the parameters of the inductive type family, then we need to consider different instantiations of the same inductive type family to be convertible. That is, if we have a phantom record such as

Record Phantom (A : Type) := phantom {}.

and our implementation does not include A as an argument to phantom, then we must consider phantom to be both of type Phantom nat and Phantom bool, even though nat and bool are not the same. I have requested this feature in Coq issue #5293. Note, however, that sometimes it is important for such phantom types to be considered distinct when doing type-level programming.

Sharing

The alternative to eliminating the duplicative arguments is to ensure that the duplication is at-most constant sized. There are two ways to do this: either the user can explicitly share subterms so that the size of the term is in fact linear in the size of the goal, or the proof assistant can ensure maximal sharing of subterms.

There are two ways for the user to share subterms: using let binders and using function abstraction. For example, rather than writing

@conj True (and True (and True True)) I
  (@conj True (and True True) I
     (@conj True True I I))

and having roughly 푛² occurrences5 of True when we are trying to prove a conjunction of 푛 Trues, the user can instead write

let T0 : Prop := True in
let v0 : T0 := I in
let T1 : Prop := and True T0 in
let v1 : T1 := @conj True T0 I v0 in
let T2 : Prop := and True T1 in
let v2 : T2 := @conj True T1 I v1 in
@conj True T2 I v2

which has only 푛 occurrences of True. Alternatively, the user can write

(휆 (T0 : Prop) (v0 : T0),
   (휆 (T1 : Prop) (v1 : T1),
      (휆 (T2 : Prop) (v2 : T2),
         @conj True T2 I v2)
        (and True T1) (@conj True T1 I v1))
     (and True T0) (@conj True T0 I v0))
  True I

5The exact count is 푛(푛 + 1)/2 − 1.

Figure 2-4: Timing of manually building and typechecking a certificate to prove a conjunction of 푛 Trues using Ltac2; (a) using let binders, (b) using abstraction and application. In both cases building the certificate is linear in 푛 while typechecking it is quadratic in 푛.

Unfortunately, both of these incur quadratic typechecking cost, even though the size of the term is linear. See Figure 2-4.

Recall that the typing rules for 휆 and let are as follows:

Γ, 푥 ∶ 퐴 ⊢ 푓 ∶ 퐵
――――――――――――――――――――――――――
Γ ⊢ (휆(푥 ∶ 퐴), 푓) ∶ ∀푥 ∶ 퐴, 퐵

Γ ⊢ 푓 ∶ ∀푥 ∶ 퐴, 퐵        Γ ⊢ 푎 ∶ 퐴
――――――――――――――――――――――――――
Γ ⊢ 푓(푎) ∶ 퐵[푎/푥]

Γ ⊢ 푎 ∶ 퐴        Γ, 푥 ∶ 퐴 ≔ 푎 ⊢ 푓 ∶ 퐵
――――――――――――――――――――――――――
Γ ⊢ (let 푥 ∶ 퐴 ≔ 푎 in 푓) ∶ 퐵[푎/푥]

Let us consider the inferred types for the intermediate terms when typechecking the let expression:

• We infer the type and True T2 for the expression
    @conj True T2 I v2

• We perform the no-op substitution of v2 into that type to type the expression
    let v2 : T2 := @conj True T1 I v1 in
    @conj True T2 I v2

• We substitute T2 := and True T1 into this type to get the type and True (and True T1) for the expression
    let T2 : Prop := and True T1 in
    let v2 : T2 := @conj True T1 I v1 in
    @conj True T2 I v2

• We perform the no-op substitution of v1 into this type to get the type for the expression
    let v1 : T1 := @conj True T0 I v0 in
    let T2 : Prop := and True T1 in
    let v2 : T2 := @conj True T1 I v1 in
    @conj True T2 I v2

• We substitute T1 := and True T0 into this type to get the type and True (and True (and True T0)) for the expression
    let T1 : Prop := and True T0 in
    let v1 : T1 := @conj True T0 I v0 in
    let T2 : Prop := and True T1 in
    let v2 : T2 := @conj True T1 I v1 in
    @conj True T2 I v2

• We perform the no-op substitution of v0 into this type to get the type for the expression
    let v0 : T0 := I in
    let T1 : Prop := and True T0 in
    let v1 : T1 := @conj True T0 I v0 in
    let T2 : Prop := and True T1 in
    let v2 : T2 := @conj True T1 I v1 in
    @conj True T2 I v2

• Finally, we substitute T0 := True into this type to get the type and True (and True (and True True)) for the expression
    let T0 : Prop := True in
    let v0 : T0 := I in
    let T1 : Prop := and True T0 in
    let v1 : T1 := @conj True T0 I v0 in
    let T2 : Prop := and True T1 in
    let v2 : T2 := @conj True T1 I v1 in
    @conj True T2 I v2

Note that we have performed linearly many substitutions into linearly sized types, so unless substitution is constant-time in the size of the term into which we’re substituting, we incur quadratic overhead here. The story for function abstraction is similar.

We again have two choices to fix this: either we can change the typechecking rules (which work just fine for small-to-medium-sized terms), or we can adjust typechecking to deal with some sort of pending substitution data, so that we only do substitution once.
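One way to picture the second option is to make the pending substitutions part of the term representation itself, in the style of explicit-substitution calculi. The following is a minimal sketch under that assumption; it is illustrative only and is not Coq’s actual kernel data type.

(* Terms use de Bruijn indices; [Clos] attaches a suspended substitution
   (one term per free variable) that is only pushed inward on demand, so a
   chain of [let]s need not trigger one full traversal per binder. *)
Inductive term : Type :=
| Var (n : nat)
| App (f x : term)
| Lam (body : term)
| LetIn (def body : term)
| Clos (pending : list term) (body : term).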

The proof assistant can also try to heuristically share subterms for us. Many proof assistants do some version of this, called hash consing.

However, hash consing loses a lot of its benefit if terms are not maximally shared (and they almost never are), and it can lead to very unpredictable performance when transformations unexpectedly cause a loss of sharing. Furthermore, it’s an open problem how to efficiently persist full hash consing to disk in a way that allows for diamond dependencies.

2.6.2 The Size of the Term

Recall that Coq (and dependently typed proof assistants in general) has terms which serve as both programs and proofs. The essential function of a proof checker is to verify that a given term has a given type. We obviously cannot type-check a term in better than linear time in the size of the representation of the term.

Recall that we cannot place any hard bounds on the complexity of typechecking a term, as terms as simple as @eq_refl bool true proving that the Boolean true is equal to itself can also be typechecked as proofs of arbitrarily complex decision procedures returning success. For example, suppose the function f TM n takes as arguments a description of a Turing machine TM and a number of steps n and outputs false unless TM halts within n steps, in which case it instead outputs true. Then for any concrete number n and any concrete description of a Turing machine TM which does in fact halt within n steps, the term @eq_refl bool true can be typechecked as a proof of f TM n = true because f TM n computes to true.
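A minimal self-contained instance of this phenomenon, using Nat.even as a stand-in for the Turing-machine-simulating function f (the name halts_within is invented for illustration), is the following: the certificate is just eq_refl, but checking it forces the kernel to run the decision procedure.

Definition halts_within (n : nat) : bool := Nat.even n.  (* stand-in for [f TM n] *)

(* Accepted because [halts_within 1000] computes to [true] during conversion: *)
Check (eq_refl true : halts_within 1000 = true).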

We might reasonably hope that typechecking problems which require no interesting computation can be completed in time linear in the size of the term and its type.

However, some seemingly reasonable decisions can result in typechecking taking quadratic time in the size of the term, as we saw in Section 2.6.1.

Even worse, typechecking can easily take time unbounded in the size of the term when the typechecker chooses the wrong constants to unfold, even when very little work ought to be done.

Consider the problem of typechecking @eq_refl nat (fact 100) : @id nat (fact 100) = fact 100, where fact is the factorial function on natural numbers and id is the polymorphic identity function. If the typechecker either decides to unfold id before unfolding fact, or if it performs a breadth-first search, then we get speedy performance. However, if the typechecker instead unfolds id last, then we end up computing the normal form of 100!, which takes a long time and a lot of memory. See Figure 2-5.

Note that it is by no means obvious that the typechecker can meaningfully do anything about this. Breadth-first search is significantly more complicated than depth-first, is harder to write good heuristics for, can incur enormous space overheads, and can be massively slower in cases where there are many options and the standard heuristics for depth-first unfolding in conversion-checking are sufficient. Furthermore, the more heuristics there are to tune conversion-checking, the more “magic” the algorithm seems, and the harder it is to debug when the performance is inadequate.

Figure 2-5: Timing of typechecking @eq_refl nat (fact n) : @id nat (fact n) = fact n (time in seconds vs. 푛; the fitted curve is 1.89 ⋅ 10⁻⁶ ⋅ 푛!).

As described in Section 2.2, in Fiat Cryptography, we got exponential slowdown due to this issue, with an estimated overhead of over four thousand millennia of extra typechecking time in the worst examples we were trying to handle.

2.6.3 The Number of Binders

This is a particular subcase of the above sections that we call out explicitly. Often there will be some operation (for example, substitution, lifting, context creation) that needs to happen every time there is a binder and which, when done naïvely, is linear in the size of the term or the size of the context. As a result, naïve implementations will often incur quadratic—or worse—overhead in the number of binders.

Similarly, if there is any operation that is even linear rather than constant in the number of binders in the context, then any user operation in proof mode which must be done, say, for each hypothesis will incur an overall quadratic-or-worse performance penalty.

The claim of this subsection is not that any particular application is inherently constrained by a performance bottleneck in the number of binders, but instead that it’s very, very easy to end up with quadratic-or-worse performance in the number of

binders, and hence that this forms a meaningful cluster for performance bottlenecks in practice.

I will attempt to demonstrate this point with a palette of actual historical performance issues in Coq—some of which persist to this day—where the relevant axis was “number of binders.” None of these performance issues are insurmountable, but all of them are either a result of seemingly reasonable decisions, have subtle interplay with seemingly disparate parts of the system, or else are to this day still mysterious despite the work of developers to investigate them.

Name Resolution

One key component of interactive proof assistants is figuring out which constant is referred to by a given name. It may be tempting to keep the context in an array or a list. However, if looking up which constant or variable is referred to by a name is 풪(푛), then internalizing a term with 푛 typed binders is going to be 풪(푛²), because we need to do name lookups for each binder. See Coq bug #9582; note that Coq 8.10 and later do not show this superlinear behavior due to Coq PR #9586, and hence our plots in this section use Coq 8.9.1.

See Figure 2-6 for the timing of name resolution in Coq as a function of how many binders are in the context. In particular, this plot measures the time it takes to resolve the name I a thousand times in a context with a given number of binders.6 See Figure 2-7 for the effect on internalizing a lambda with 푛 arguments.

Figure 2-6: Timing of internalizing a name 1000 times under 푛 binders (time in seconds vs. # binders; fit ≈ 3.74 ⋅ 10⁻⁵푛 + 1.03 ⋅ 10⁻²).

Figure 2-7: Timing of internalizing a function with 푛 differently named arguments (parsing and internalization; fit ≈ 1.05 ⋅ 10⁻⁸푛² − 4.29 ⋅ 10⁻⁶푛 + 1.21 ⋅ 10⁻³).

6This is done by measuring the time it takes to execute do 1000 (let v := uconstr:(I) in idtac).

Capture-Avoiding Substitution

If the user is presented with a proof-engine interface where all context variables are named, then in general the proof engine must implement capture-avoiding substitution. For example, if the user wants to operate inside the hole in (휆 x, let y := x in 휆 x, _), then the user needs to be able to talk about the body of y, which is not the same as the innermost x. However, if the 훼-renaming is even just linear in the existing context, then creating a new hole under 푛 binders will take 풪(푛²) time in the worst case, as we may have to do 푛 renamings, each of which takes time 풪(푛). See Coq bug #9582, perhaps also Coq bug #8245 and Coq bug #8237 and Coq bug #8231.

This might be the cause of the difference in Figure 2-8b between having different names (which do not need to be renamed) and having either no name (requiring name generation) or having all binders with the same name (requiring renaming in evar substitutions).

Quadratic Creation of Substitutions for Existential Variables

Recall that when we separate the trusted kernel from the untrusted proof engine, we want to be able to represent not-yet-finished terms in the proof engine. The standard way to do this is to enrich the type of terms with an “existential variable” node, which stands for a term which will be filled later. Such an existential variable, or evar, typically exists in a particular context. That is, when filling an evar, some hypotheses are accessible while others are not.

Sometimes, reduction results in changing the context in which an evar exists. For example, if we want to 훽-reduce (휆 x, ?e1) (S y), then the result is the evar ?e1 with S y substituted for x.

There are a number of ways to represent substitution, and the choices are entangled with the choices of term representation.

Note that most substitutions are either identity or lifting substitutions.

One popular representation is the locally nameless representation [Cha12; Ler07], which we discuss more in Section 3.1.3. However, if we use a locally nameless term representation, then finding a compact representation for identity and lifting substitutions is quite tricky. If the substitution representation takes 풪(푛) time to create in a context of size 푛, then having a 휆 with 푛 arguments whose types are not known takes 풪(푛²) time, because we end up creating identity substitutions for 푛 holes, with linear-sized contexts.

Note that fully nameless, i.e. de Bruijn, term representations do not suffer from this issue.

See Coq bug #8237 and Coq PR #11896 for a mitigation of some (but not all) issues.

Figure 2-8: Performance Benchmarks for Substitution. (a) Timing of generating 1000 evars in a context of size 푛 (fit ≈ 7.51 ⋅ 10⁻⁴푛 − 4.23 ⋅ 10⁻²); (b) timing of generating a 휆 with 푛 binders of unknown/evar type, all of which have either no name, the same name, or different names (all three curves are quadratic in 푛).

See also Figure 2-8a and Figure 2-8b.

Quadratic Substitution in Function Application

Consider the case of typechecking a nondependent function applied to 푛 arguments. If substitution is performed eagerly, following directly the rules of the type theory, then typechecking is quadratic. This is because the type of the function is 풪(푛), and doing substitution 푛 times on a term of size 풪(푛) is quadratic.

If the term representation contains 푛-ary application nodes, it’s possible to resolve this performance bottleneck by delaying the substitutions. If only unary application nodes exist, it’s much harder to solve.

Note that this is important, for example, in attempts to avoid the problem of quadratically sized certificates by making an 푛-ary conjunction constructor which is parameterized on a list of the conjuncts. Such a function could then be applied to the 푛 proofs of the conjuncts.

Figure 2-9: Timing of typechecking a function applied to 푛 arguments (time in seconds vs. 푛; fit ≈ 6.79 ⋅ 10⁻⁹푛² − 1.34 ⋅ 10⁻⁶푛 + 3.44 ⋅ 10⁻³).

We’ve reported these issues in Coq in Coq bug #8232 and Coq bug #12118, and a partial solution has been merged in Coq PR #8255.

See Figure 2-9 for timing details on a microbenchmark of this bottleneck, where we use Ltac2 to build an application of a function to 푛 arguments of type unit. Ltac2 is a relatively recent successor to the ℒtac tactic language, which allows more low-level operations that provide more fine-grained control over what the proof assistant is actually doing.

Quadratic Normalization by Evaluation

Normalization by evaluation (NbE) is a nifty way to implement reduction where function abstraction in the object language is represented by function abstraction in the metalanguage. We discuss the details of how to implement NbE in Subsection 4.3.2. Coq uses NbE to implement two of its reduction machines (lazy and cbv).

The details of implementing NbE depend on the term representation used. If a fancy term encoding like PHOAS, which we explain in Section 3.1.3, is used, then it’s not hard to implement a good NbE algorithm. However, such fancy term representations incur unpredictable and hard-to-deal-with performance costs. Most languages do not do any reduction on thunks until they are called with arguments, which means that forcing early reduction of a PHOAS-like term representation requires round-tripping through another term representation, which can be costly on large terms if there is not much to reduce. On the other hand, other term representations need to implement either capture-avoiding substitution (for named representations) or index lifting (for de Bruijn and locally nameless representations).

Figure 2-10: Timing of running cbv and lazy reduction on interpreting a PHOAS expression as a function of the number of binders (time in seconds vs. # binders; both fits ≈ 1.13 ⋅ 10⁻⁷푛² + 1.81 ⋅ 10⁻⁵푛 − 4.09 ⋅ 10⁻²).

The sort-of obvious way to implement this transformation is to write a function that takes a term and a binder and either renames the binder for capture-avoiding substitution or else lifts the indices of the term. The problem with this implementation is that if we call it every time we move a term under a binder, then moving a term under 푛 binders traverses the term 푛 times. If the term size is also proportional to 푛, then the result is quadratic blowup in the number of binders.
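For concreteness, here is the naïve version for a tiny de Bruijn term language (a sketch, not Coq’s implementation): every call to lift walks the whole term, so pushing a term under 푛 binders costs 푛 full traversals.

Inductive tm : Type :=
| TVar (i : nat)
| TApp (f x : tm)
| TLam (body : tm).

(* Shift all variables at or above cutoff [k] up by one. *)
Fixpoint lift (k : nat) (t : tm) : tm :=
  match t with
  | TVar i => if Nat.leb k i then TVar (S i) else TVar i
  | TApp f x => TApp (lift k f) (lift k x)
  | TLam body => TLam (lift (S k) body)
  end.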

See Coq bug #11151 for an occurrence of this performance issue in the wild in Coq. See also Figure 2-10.

Quadratic Closure Compilation

It’s important to be able to perform reduction of terms in an optimized way. When doing optimized reduction in an imperative language, we need to represent closures—abstraction nodes—in some way. Often this involves associating to each closure both some information about or code implementing the body of the function, as well as the values of all of the free variables of that closure [SA00]. In order to have efficient lookup, we need to know the memory location storing the value of any given variable statically at closure-compilation time. The standard way of doing this is to allocate an array of values for each closure. If variables are represented with de Bruijn indices, for example, it’s then a very easy array lookup to get the value of any variable. Note that this allocation is linear in the number of free variables of a term. If we have many nested binders and use all of them underneath all the binders, then every abstraction node has as many free variables as there are total binders, and hence we get quadratic overhead.

See Coq bug #11151 and Coq bug #11964 and OCaml bug #7826 for an occurrence of this issue in the wild. Note that this issue rarely shows up in hand-written code, only in generated code, so developers of compilers such as ocamlc and gcc might be uninterested in optimizing this case. However, it’s quite essential when doing metaprogramming involving large generated terms. It’s especially essential if we want to chain together reflective automation passes that operate on different input languages and therefore require denotation and reification between the passes. In such cases, unless our encoding language uses named or de Bruijn variable encoding, there’s no way to avoid large numbers of nested binders at compilation time while preserving code sharing. Hence if we’re trying to reuse the work of existing compilers to bootstrap good performance of reduction (as is the case for the native compiler in Coq), we have trouble with cases such as this one.

See also Figure 2-11a and Figure 2-11b.

2.6.4 The Number of Nested Abstraction Barriers

This axis is the most theoretical of the axes. An abstraction barrier is an interface for making use of code, definitions, and theorems. For example, we might define nonnegative integers using a binary representation and present the interface of zero, successor, and the standard induction principle, along with an equational theory for how induction behaves on zero and successor. We might use lists and nonnegative integers to implement a hash-set datatype for storing sets of hashable values and present the hash-set with methods for empty, add, remove, membership-testing, and some sort of fold. Each of these is an abstraction barrier.

Figure 2-11: Timing of running reduction on interpreting a PHOAS expression as a function of the number of binders. (a) vm_compute (fit ≈ 2.44 ⋅ 10⁻⁹푛² + 5.79 ⋅ 10⁻⁵푛 − 2.39 ⋅ 10⁻²); (b) native_compute (fit ≈ 9.48 ⋅ 10⁻⁶푛² + 7.76 ⋅ 10⁻³푛 − 1.53).

There are three primary ways that nested abstraction barriers can lead to performance bottlenecks: one involving conversion missteps and two involving exponential blow-up in the size of types.

Conversion Troubles

If abstraction barriers are not perfectly opaque—that is, if the typechecker ever has to unfold the definitions making up the API in order to typecheck a term—then every additional abstraction barrier provides another opportunity for the typechecker to pick the wrong constant to unfold first. In some typecheckers, such as Coq, it’s possible to provide hints to the typechecker to inform it which constants to unfold when. In such a system, it’s possible to carefully craft conversion hints so that abstraction barriers are always unfolded in the right order. Alternatively, it might be possible to carefully craft a system which picks the right order of unfolding by using a dependency analysis.
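In Coq, such hints are given with the Strategy, Opaque, and Transparent commands. The constants below are placeholders, and the right levels depend on the development; this is only a sketch of the mechanism.

Definition low_level (n : nat) : nat := n + 0.
Definition high_level (n : nat) : nat := low_level n.

(* When conversion must choose which head constant to expand, the one with
   the lower level is expanded first. *)
Strategy 10 [low_level].    (* keep [low_level] folded longer  *)
Strategy -10 [high_level].  (* unfold [high_level] earlier     *)

Opaque low_level.           (* treat as (nearly) never unfoldable *)
Transparent low_level.      (* restore the default                *)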

However, most users don’t bother to set up hints like this, and dependency analysis isn’t sufficient to determine which abstraction barrier is “higher up” when there are many parts of it, only some of which are mentioned in any given part of the next abstraction barrier. The reason users don’t set up hints like this is that usually it’s

not necessary. There’s often minimal overhead, and things just work, even when the wrong path is picked—until the number of abstraction barriers or the size of the underlying term gets large enough. Then we get noticeable exponential blowup and our development no longer terminates in reasonable time. Furthermore, it’s hard to know which part of conversion is incurring exponential blowup, and thus one has to basically get all of the conversion hints right, simultaneously, without any feedback, to see any performance improvement.

Type-Size Blowup: Abstraction Barrier Mismatch

When abstraction barriers are leaky or misaligned, there’s a cost that accumulates in the size of the types of theorems. Consider, for example, the two different ways of using tuples: (1) we can use the projections fst and snd; or (2) we can use the eliminator pair_rect : ∀ A B (P : A × B → Type), (∀ a b, P (a, b)) → ∀ x, P x. The first gets us access to one element of the tuple at a time, while the second has us using all elements of the tuple simultaneously.

Suppose now there is one API defined in terms of fst and snd and another API defined in terms of pair_rect. To make these APIs interoperate, we need to convert explicitly from one representation to another. Furthermore, every theorem about the composition of these APIs needs to include the interoperation in talking about how they relate.

If such API mismatches are nested, or if this code-size blowup interacts with conversion missteps, then the performance issues compound.

Let us consider things a bit more generally.

Structure and Interpretation of Computer Programs defines abstraction as naming and manipulating compound elements as units [SSA96, p. 6]. An abstraction barrier is a collection of definitions and theorems about those definitions that together provide an interface for such a compound element. For example, we might define an interface for sorting a list, together with a proof that sorting any list results in a sorted list. Or we might define an interface for key-value maps (perhaps implemented as association lists, or hash-maps, or binary search trees, or in some other way).

Piercing an abstraction barrier is the act of manipulating the compound element by its components, rather than through the interface. For example, suppose we have implemented key-value maps as association lists, representing the map as a list of key-value pairs, and provided some interface. Any function which, for example, asks for the first element of the association list has pierced the abstraction barrier of our interface.
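As a purely illustrative Coq rendering of this example (the interface and names below are invented, not from any particular library), the second client pierces the barrier by matching on the representation directly:

Require Import Coq.Lists.List. Import ListNotations.

Definition map_t := list (nat * nat).
Definition empty : map_t := [].
Definition insert (k v : nat) (m : map_t) : map_t := (k, v) :: m.
Definition lookup (k : nat) (m : map_t) : option nat :=
  option_map snd (List.find (fun kv => Nat.eqb (fst kv) k) m).

(* Stays behind the interface: *)
Definition uses_interface (m : map_t) : option nat := lookup 0 (insert 0 1 m).

(* Pierces the interface by inspecting the underlying list: *)
Definition pierces_interface (m : map_t) : option (nat * nat) :=
  match m with
  | [] => None
  | kv :: _ => Some kv
  end.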

We might say that an abstraction barrier is leaky if we ever need to pierce it, or perhaps

if our program does in fact pierce the abstraction barrier, even if the piercing is needless. (Which definition we choose is not of great significance for this dissertation.)

In proof assistants like Coq, using unfold, simpl, or cbn can often indicate a leaky abstraction barrier, where in order to prove a property we unfold the interface we are given to see how it is implemented. This is all well and good when we are in the process of defining the abstraction barrier—unfolding the definition of sorting a list, for example, to prove that sorting the list gives back a list with all the same elements—but can be problematic when used more pervasively.

Let us look at an example from a category-theory library we implemented in Coq [GCS14], which we introduce in Section 7.3. Category theory generalizes functions and product types, and the example we present here is a category-theoretic version of the isomorphism between functions of type 퐶1 × 퐶2 → 퐷, which take a pair of elements 푐1 ∈ 퐶1 and 푐2 ∈ 퐶2 and return an element of 퐷, and functions of type 퐶1 → (퐶2 → 퐷) which take a single argument 푐1 ∈ 퐶1 and return a function from 퐶2 to 퐷. We write this isomorphism as

(퐶1 × 퐶2 → 퐷) ≅ (퐶1 → (퐶2 → 퐷))

In computer science, this is known as (un)currying. The abstractions used in formalizing this example are as follows:

• A category 풞 is a collection of objects and composable arrows (called morphisms) between those objects, subject to some algebraic laws. The class of objects is generally denoted Ob풞, and the class of morphisms between 푥, 푦 ∈ Ob풞 is generally denoted Hom풞(푥, 푦). Categories are a sort of generalization of sets or types.

• The product category 풞 × 풟 generalizes the Cartesian product of sets.

• An isomorphism between objects 푥 and 푦 in a category 풞, written 푥 ≅ 푦, is a pair of morphisms from 푥 to 푦 and from 푦 to 푥 such that the composition in either direction is the identity morphism.

• A functor is an arrow between categories, mapping objects to objects and morphisms to morphisms, subject to some algebraic laws. The action of a functor 퐹 on an object 푥 is often denoted 퐹 (푥). As the action of 퐹 on a morphism 푚 is often also denoted 퐹 (푚), we will use 퐹₀ to denote the action on objects and 퐹₁ to denote the action on morphisms when it might otherwise be unclear.

• A natural transformation is an arrow between functors 퐹 and 퐺 consisting of a way of mapping from the on-object-action of 퐹 to the on-object-action of 퐺, satisfying some algebraic laws.

• A category of functors 풞 → 풟 is the category whose objects are functors from 풞 to 풟 and whose morphisms are natural transformations. This category generalizes the notion of function types or of sets of functions.

• The category of categories, generally denoted Cat, is a category whose objects are themselves categories and whose morphisms are functors. Much like the set of all sets or the type of all types, the categories in Cat are subject to size restrictions discussed further in Section 8.2.1.

Although we eventually go into a bit more of the detail of these definitions throughout Section 8.2.1, we advise the interested reader to consult the rich existing literature on category theory, including for example Awodey [Awo] and Mac Lane [Mac]. These are by no means required reading, though; most of this dissertation is unrelated to category theory, and we have aimed to make even the parts related to category theory relatively accessible to readers with no category-theoretic background.

There are only seven components of the isomorphism (풞1 ×풞2 → 풟) ≅ (풞1 → (풞2 → 풟)) which are not proofs of algebraic laws. Their definition, spelled out in Figure 2-12 and given in Gallina Coq code (with suitable notations) in Figure 2-13, is relatively trivial.

Typechecking the code that defines these components, however, takes nearly two seconds! This is more than 200× slower than it needs to be. The essential structure needed to define these components can be defined without any of the categorical indirection, taking 풞1, 풞2, and 풟 to be types and taking morphisms in these “categories” to be proofs of equality between morphism sources and targets. Defining the structure of the seven components in this way nets us a 200× speedup!7 We attribute the overhead of the categorical definition to the large types generated and the nontrivial conversion problems which require unfolding various definitions, i.e., piercing various abstraction barriers.

While two seconds is long, there is an even more serious issue that arises when attempting to prove the algebraic laws. The types here are already a bit long: The goal that going from (퐶1 × 퐶2 → 퐷) to (퐶1 → (퐶2 → 퐷)) and back again is the identity is only about 24 lines after 훽 reduction (when Set Printing All is on, there are about 3 300 words).

However, if we pierce the abstraction barrier of functor composition, the goal blows up to about 254 lines (about 18 000 words with Set Printing All)! This blow-up is due to the fact that the opaque proofs that functor composition is functorial take the entirety of the functors being composed as arguments. Hence unfolding the composition of two functors duplicates those functors many times over. If we must compose more than two functors, we get even more blow-up.

7See Appendix A.1.1 for the code used to make this timing measurement.

To define currying, going from (풞1 × 풞2 → 풟) to (풞1 → (풞2 → 풟)):

1. Each functor 퐹 ∶ 풞1 × 풞2 → 풟 gets mapped to a functor which takes in an object 푐1 ∈ Ob풞1 and returns a functor which takes in an object 푐2 ∈ Ob풞2 and returns the object 퐹((푐1, 푐2)) ∈ Ob풟.

2. The action of the returned functor on morphisms in 풞2 is to first lift this morphism from 풞2 to 풞1 × 풞2 by pairing with the identity morphism on 푐1, and then to return the image of this morphism under 퐹.

3. The action of the outer functor on morphisms 푚1 ∈ Hom풞1 is to return the natural transformation which, for each object 푐2 ∈ Ob풞2, first pairs the morphism 푚1 with the identity on 푐2 and then returns the image of this morphism in 풞1 × 풞2 under 퐹.

4. Each natural transformation 푇 ∈ Hom풞1×풞2→풟 gets mapped to the natural transformation in 풞1 → (풞2 → 풟) which, after binding 푐1 and 푐2, returns the morphism in 풟 given by the action of 푇 on (푐1, 푐2).

To define uncurrying, going from (풞1 → (풞2 → 풟)) to (풞1 × 풞2 → 풟):

5. Each functor 퐹 ∶ 풞1 → (풞2 → 풟) gets mapped to the functor which takes in an object (푐1, 푐2) ∈ Ob풞1×풞2 and returns (퐹(푐1))(푐2).

6. The action of this functor on morphisms (푚1, 푚2) ∈ Hom풞1×풞2 is to compose 퐹(푚1) applied to a suitable object of 풞2 with 퐹 applied to a suitable object of 풞1 and then applied to 푚2.

7. Each natural transformation 푇 ∈ Hom풞1→(풞2→풟) gets mapped to the natural transformation which maps each object (푐1, 푐2) ∈ Ob풞1×풞2 to the morphism (푇(푐1))(푐2) in Hom풟.

While this is a mouthful, there is no insight in any of these definitions; for each component, there is exactly one choice that can be made which has the correct type.

Figure 2-12: The interesting components of (풞1 × 풞2 → 풟) ≅ (풞1 → (풞2 → 풟)).

(** [(C1 × C2 → D) ≅ (C1 → (C2 → D))] *)
(** We denote functors by pairs of maps on objects ([휆o]) and morphisms ([휆m]),
    and natural transformations as a single map ([휆t]) *)
Time Program Definition curry_iso (C1 C2 D : Category)
  : (C1 * C2 -> D) ≅ (C1 -> (C2 -> D)) :>>> Cat
  := {| fwd := 휆o F, 휆o c1, 휆o c2, F₀ (c1, c2);
                         휆m m, F₁ (identity c1, m);
                    휆m m1, 휆t c2, F₁ (m1, identity c2);
               휆m T, 휆t c1, 휆t c2, T (c1, c2);
        bwd := 휆o F, 휆o '(c1, c2), (F₀ c1)₀ c2;
                    휆m '(m1, m2), (F₁ m1) _ ∘ (F₀ _)₁ m2;
               휆m T, 휆t '(c1, c2), (T c1) c2 |}.
(* Finished transaction in 1.958 secs (1.958u,0.s) (successful) *)

Figure 2-13: The interesting components of (풞1 × 풞2 → 풟) ≅ (풞1 → (풞2 → 풟)), in Coq. The surrounding definitions and notations required for this example to typecheck are given in Appendix A.1.

Piercing this barrier also shows up in proof-checking time. If we first decompose the goal into the separate equalities we wish to prove and only then unfold the abstraction barrier (thereby side-stepping the issue of passing large arguments to opaque proofs), it takes less than a tenth of a second to prove each of the two algebraic laws of the isomorphism. However, if we instead unfold the definitions first and then decompose the goal into separate goals, it takes about 5× longer to check the proof.

Readers interested in the full compiling code for this example can refer to Appendix A.1.

Type Size Blowup: Packed vs. Unpacked Records

When designing APIs, especially of mathematical objects, one of the biggest choices is whether to pack the records or whether to pass arguments in as fields. That is, when defining a monoid, for example, there are five ways to go about specifying it:

1. (packed) A monoid consists of a type 퐴, a binary operation ⋅ ∶ 퐴 → 퐴 → 퐴, an identity element 푒, a proof that 푒 is a left and right identity 푒 ⋅ 푎 = 푎 ⋅ 푒 = 푎 for all 푎, and a proof of associativity that (푎 ⋅ 푏) ⋅ 푐 = 푎 ⋅ (푏 ⋅ 푐).

2. A monoid on a carrier type 퐴 consists of a binary operation ⋅ ∶ 퐴 → 퐴 → 퐴, an identity element 푒, a proof that 푒 is a left and right identity, and a proof of associativity.

54 3. A monoid on a carrier type 퐴 under the binary operation ⋅ ∶ 퐴 → 퐴 → 퐴 consists of an identity element 푒, a proof that 푒 is a left and right identity, and a proof of associativity.

4. (mostly unpacked) A monoid on a carrier type 퐴 under the binary operation ⋅ ∶ 퐴 → 퐴 → 퐴 with identity element 푒 consists of a proof that 푒 is a left and right identity and a proof of associativity. Note that MathClasses [KSW; SW11; SW10] uses this strategy, as discussed in Garillot et al. [Gar+09b].

5. (fully unpacked) A monoid on a carrier type 퐴 under the binary operation ⋅ ∶ 퐴 → 퐴 → 퐴 with identity element 푒 using a proof 푝 that 푒 is a left and right identity and a proof 푞 of associativity consists of an element of the one-element unit type.

If we go with anything but the fully packed design, then we incur exponential overhead as we go up abstraction layers, as follows. A monoid homomorphism from a monoid 퐴 to a monoid 퐵 consists of a function between the carrier types and proofs that this function respects composition and identity. If we use an unpacked definition of monoid with 푛 type parameters, then a similar definition of a monoid homomorphism involves at least 2푛 + 2 type parameters. In higher category theory, it’s common to talk about morphisms between morphisms, and every additional layer here doubles the number of type arguments, and this can quickly lead to very large terms, resulting in major performance bottlenecks. Note that the number of type parameters determines the constant factor out front of the exponential growth in the number of layers of mathematical constructions.
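The contrast, and the source of the 2푛 + 2 count, can be sketched as follows; these record definitions are hypothetical illustrations rather than the definitions used in any of the libraries discussed here.

(* Packed: the carrier, the operation, and the identity are all fields. *)
Record monoid := {
  carrier : Type;
  op : carrier -> carrier -> carrier;
  id_elt : carrier;
  op_assoc : forall a b c, op (op a b) c = op a (op b c);
  id_left : forall a, op id_elt a = a;
  id_right : forall a, op a id_elt = a
}.

(* Mostly unpacked: the data are parameters, and only the laws are fields. *)
Record monoid_laws (A : Type) (opA : A -> A -> A) (eA : A) := {
  opA_assoc : forall a b c, opA (opA a b) c = opA a (opA b c);
  eA_left : forall a, opA eA a = a;
  eA_right : forall a, opA a eA = a
}.

(* A homomorphism between packed monoids needs only two parameters ... *)
Record hom (M N : monoid) := {
  hmap : carrier M -> carrier N;
  hmap_op : forall x y, hmap (op M x y) = op N (hmap x) (hmap y);
  hmap_id : hmap (id_elt M) = id_elt N
}.

(* ... while between unpacked monoids it must re-quantify over the n = 3
   pieces of data on each side plus the two law records: 2n + 2 parameters. *)
Record hom' (A : Type) (opA : A -> A -> A) (eA : A) (MA : monoid_laws A opA eA)
            (B : Type) (opB : B -> B -> B) (eB : B) (MB : monoid_laws B opB eB) := {
  hmap' : A -> B;
  hmap'_op : forall x y, hmap' (opA x y) = opB (hmap' x) (hmap' y);
  hmap'_id : hmap' eA = eB
}.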

How much is this overhead concretely? When developing a category-theory library [GCS14], described in more detail in Section 7.3, we sped up overall compilation time by approximately a factor of two, from around 16 minutes to around 8 minutes, by changing one of the two parameters to a field in the definition of a category.8

2.7 Conclusion of This Chapter

We hope the reader now has a sense of the landscape of superlinear performance bottlenecks we’ve seen in dependently typed proof assistants. In the chapters to come, we invite the reader to keep in the back of their mind the four axes we’ve laid out as scaling factors in most performance bottlenecks we’ve encountered—the size of the type, the size of the term, the number of binders, and the number of nested abstraction barriers.

8See commit 209231a of JasonGross/catdb on GitHub for details.

Part II

Program Transformation and Rewriting

Chapter 3

Reflective Program Transformation

3.1 Introduction

Proof by reflection [Bou97] is an established method for employing verified proof procedures within larger proofs [MCB14; Mal+13; Mal17; GMT16]. There are a number of benefits to using verified functional programs written in the proof assistant’s logic, instead of tactic scripts. We can often prove that procedures always terminate without attempting fallacious proof steps, and perhaps we can even prove that a procedure gives logically complete answers, for instance telling us definitively whether a proposition is true or false. In contrast, tactic-based procedures may encounter runtime errors or loop forever. As a consequence, if we want to keep the trusted codebase small, as discussed in Subsection 1.3.2, these tactic procedures must output proof terms, justifying their decisions, and these terms can grow large, making for slower proving and requiring transmission of large proof terms to be checked slowly by others. A verified procedure need not generate a certificate for each invocation.

3.1.1 Proof-Script Primer

Basic Coq proofs are often written as lists of steps such as induction on some structure, rewrite using a known equivalence, or unfold of a definition. As mentioned in Section 1.1, proofs can very quickly become long and tedious, both to write and to read, and hence Coq provides ℒtac, a scripting language for proofs, which we first mentioned in Subsection 1.3.2. As theorems and proofs grow in complexity, users frequently run into performance and maintainability issues with ℒtac, some of which we’ve seen in Chapter 2. Consider the case where we want to prove that a large algebraic expression, involving many let … in … expressions, is even:

Inductive is_even : nat → Prop := | even_O : is_even O

57 | even_SS : forall x, is_even x → is_even (S (S x)).

Goal is_even (let x := 100 * 100 * 100 * 100 in let y := x * x * x * x in y * y * y * y).

Coq stack-overflows if we try to reduce this goal. As a workaround, we might write a lemma that talks about evenness of let … in …, plus one about evenness of multiplication, and we might then write a tactic that composes such lemmas.
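Such workaround lemmas might look like the following sketch; the statements are invented for illustration, and the multiplication lemma is left unproved here.

Lemma is_even_mul_l : forall x y, is_even x -> is_even (x * y).
Admitted.  (* statement only; the proof is omitted in this sketch *)

Lemma is_even_let : forall (x : nat) (f : nat -> nat),
  is_even (f x) -> is_even (let y := x in f y).
Proof. intros x f H; exact H. Qed.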

Even on smaller terms, though, proof size can quickly become an issue. If we give a naïve proof that 7000 is even, the proof term will contain all of the even numbers between 0 and 7000, giving a proof-term-size blow-up at least quadratic in size (recalling that natural numbers are represented in unary; the challenges remain for more efficient base encodings). Clever readers will notice that Coq could share subterms in the proof, recovering a term that is linear in the size of the goal. However, such sharing would have to be preserved very carefully, to prevent size blow-up from unexpected loss of sharing, and today’s Coq version does not do that sharing. Even if it did, tactics that rely on assumptions about Coq’s sharing strategy become harder to debug, rather than easier.

3.1.2 Reflective-Automation Primer

Enter reflective automation, which simultaneously solves both the problem of performance and the problem of debuggability. Proof terms, in a sense, are traces of a proof script. They provide Coq’s kernel with a term that it can check to verify that no illegal steps were taken. Listing every step results in large traces.

The idea of reflective automation is that, if we can get a formal encoding of our goal, plus an algorithm to check the property we care about, then we can do much better than storing the entire trace of the program. We can prove that our checker is correct once and for all, removing the need to trace its steps.

Fixpoint check_is_even (n : nat) : bool
  := match n with
     | 0 => true
     | 1 => false
     | S (S n) => check_is_even n
     end.

Figure 3-1: Evenness Checking

A simple evenness checker can just operate on the unary encoding of natural numbers (Figure 3-1). We can use its correctness theorem to prove goals much more quickly:

Theorem soundness : forall n, check_is_even n = true → is_even n.

Goal is_even 2000.

Time repeat (apply even_SS || apply even_O). (* 1.8 s *)
Undo.
Time apply soundness; vm_compute; reflexivity. (* 0.004 s *)

The tactic vm_compute tells Coq to use its virtual machine for reduction, to compute the value of check_is_even 2000, after which reflexivity proves that true = true. Note how much faster this method is. In fact, even the asymptotic complexity is better; this new algorithm is linear rather than quadratic in n.

However, even this procedure takes a bit over three minutes to prove the goal is_even (10 * 10 * 10 * 10 * 10 * 10 * 10 * 10 * 10). To do better, we need a formal representation of terms or expressions.

3.1.3 Reflective-Syntax Primer

Sometimes, to achieve faster proofs, we must be able to tell, for example, whether we got a term by multiplication or by addition, and not merely whether its normal form is 0 or a successor.1

A reflective automation procedure generally has two steps. The first step is to reify the goal into some abstract syntactic representation, which we call the term language or an expression language. The second step is to run the algorithm on the reified syntax.

What should our expression language include? At a bare minimum, we must have multiplication nodes, and we must have nat literals. If we encode S and O separately, a decision that will become important later in Section 6.2, we get the inductive type of Figure 3-2.

Inductive expr :=
| NatO : expr
| NatS (x : expr) : expr
| NatMul (x y : expr) : expr.

Figure 3-2: Simple Expressions

Before diving into methods of reification, let us write the evenness checker.

Fixpoint check_is_even_expr (t : expr) : bool
  := match t with
     | NatO => true
     | NatS x => negb (check_is_even_expr x)
     | NatMul x y => orb (check_is_even_expr x) (check_is_even_expr y)
     end.

Before we can state the soundness theorem (whenever this checker returns true, the represented number is even), we must write the function that tells us what number our expression represents, called denotation or interpretation:

1Sometimes this distinction is necessary for generating a proof at all, as is the case in nsatz and romega; there is no way to prove that addition is commutative if you cannot identify what numbers you were adding in the first place.


Fixpoint denote (t : expr) : nat
  := match t with
     | NatO => O
     | NatS x => S (denote x)
     | NatMul x y => denote x * denote y
     end.

Theorem check_is_even_expr_sound (e : expr) : check_is_even_expr e = true → is_even (denote e).

Given a tactic Reify to produce a reified term from a nat, we can time the execution of check_is_even_expr in Coq’s VM. It is instant on the last example.
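To make the pipeline concrete on a tiny goal, here is a hand-reified instance of the kind of proof such a Reify tactic would let us script; the eq_refl is accepted because check_is_even_expr reduces to true on this expression, and denote of the expression is convertible with 2 * 3:

Goal is_even (2 * 3).
Proof.
  exact (check_is_even_expr_sound
           (NatMul (NatS (NatS NatO)) (NatS (NatS (NatS NatO))))
           eq_refl).
Qed.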

Before we proceed to reification, we will introduce one more complexity. If we want to support our initial example with let … in … efficiently, we must also have let expressions. Our current procedure that inlines let expressions takes 19 seconds, for example, on let x0 := 10 * 10 in let x1 := x0 * x0 in … let x24 := x23 * x23 in x24. The choices of representation of binders, which are essential to encoding let expressions, include higher-order abstract syntax (HOAS) [PE88], parametric higher-order abstract syntax (PHOAS) [Chl08] which is also known as weak HOAS [CS13], de Bruijn indices [Bru72], nominal representations [Pit03], locally nameless representations [Cha12; Ler07], named representations, and nested abstract syntax [HM12; BP99]. A survey of a number of options for binding can be found in [Ayd+08].

Although we will eventually choose the PHOAS representation for the tools presented in Chapters 4 and 6, we will also briefly survey some of the options for encoding binders, with an eye towards performance implications.

PHOAS

The PHOAS representation [Chl08; CS13] is particularly convenient. In PHOAS, expression binders are represented by binders in Gallina, the functional language of Coq, and the expression language is parameterized over the type of the binder. Let us define a constant and notation for let expressions as definitions (a common choice in real Coq developments, to block Coq’s default behavior of inlining let binders silently; the same choice will also turn out to be useful for reification later). We thus have:

Inductive expr {var : Type} :=
| NatO : expr
| NatS : expr → expr
| NatMul : expr → expr → expr
| Var : var → expr
| LetIn : expr → (var → expr) → expr.
Notation "'elet' x := v 'in' f" := (LetIn v (fun x => f)) (x ident, at level 200).

Definition Let_In {A B} (v : A) (f : A → B) := let x := v in f x.
Notation "'dlet' x := v 'in' f" := (Let_In v (fun x => f)) (x ident, at level 200).

Conventionally, syntax trees are parametric over the value of the var parameter, which we may instantiate in various ways to allow variable nodes to hold various kinds of information, and we might define a type for these parametric syntax trees:

Definition Expr := ∀ var, @expr var.
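For instance, the parametric syntax tree that reification might produce for dlet x := 2 * 2 in x * x can be written directly (a sketch; the helper two is our own abbreviation):

Definition two {var : Type} : @expr var := NatS (NatS NatO).
Definition square_with_sharing : Expr
  := fun var => LetIn (NatMul two two) (fun x => NatMul (Var x) (Var x)).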

Note, importantly, that check_is_even_expr and denote will take exprs with different instantiations of the var parameters, as seen in Figure 3-3. This is necessary so that we can store the information about whether or not a particular let-bound expression is even (or what its denotation is) in the variable node itself. However, this means that we cannot reuse the same expression as arguments to both functions to formulate the soundness condition. Instead, we must introduce a notion of relatedness of expressions with different instantiations of the var parameter.

Fixpoint denote (t : @expr nat) : nat
  := match t with
     | NatO => O
     | NatS x => S (denote x)
     | NatMul x y => denote x * denote y
     | Var v => v
     | LetIn v f => dlet x := denote v in denote (f x)
     end.

Fixpoint check_is_even_expr (t : @expr bool) : bool
  := match t with
     | NatO => true
     | NatS x => negb (check_is_even_expr x)
     | NatMul x y => orb (check_is_even_expr x) (check_is_even_expr y)
     | Var v_even => v_even
     | LetIn v f => let v_even := check_is_even_expr v in
                    check_is_even_expr (f v_even)
     end.

Figure 3-3: Two definitions using two different instantiations of the PHOAS var parameter.

Inductive related {var1 var2 : Type} : list (var1 * var2) → @expr var1 → @expr var2 → Prop :=
| RelatedNatO {Γ} : related Γ NatO NatO
| RelatedNatS {Γ e1 e2}
  : related Γ e1 e2 → related Γ (NatS e1) (NatS e2)
| RelatedNatMul {Γ x1 x2 y1 y2}
  : related Γ x1 x2 → related Γ y1 y2 → related Γ (NatMul x1 y1) (NatMul x2 y2)
| RelatedVar {Γ v1 v2}
  : (v1, v2) ∈ Γ → related Γ (Var v1) (Var v2)
| RelatedLetIn {Γ e1 e2 f1 f2}
  : related Γ e1 e2
    → (∀ v1 v2, related ((v1, v2) :: Γ) (f1 v1) (f2 v2))
    → related Γ (LetIn e1 f1) (LetIn e2 f2).

Figure 3-4: A PHOAS relatedness predicate

A PHOAS relatedness predicate has one constructor for each constructor of expr, essentially encoding that the two expressions have the same structure. For the Var case, we defer to membership in a list of “related” variables, which we extend each time we go underneath a binder. See Figure 3-4 for such an inductive predicate.

We require that all instantiations give related ASTs (in the empty context), whence we call the parametric AST well-formed:

Definition Wf (e : Expr) := ∀ var1 var2, related [] (e var1) (e var2).

We could then prove a modified form of our soundness theorem:

Theorem check_is_even_expr_sound (e : Expr) (H : Wf e) : check_is_even_expr (e bool) = true → is_even (denote (e nat)).

To complete the picture, we would need a tactic Reify which took in a term of type nat and gave back a term of type forall var, @expr var, plus a tactic prove_wf which solved a goal of the form Wf e by repeated application of constructors. Given these, we could solve an evenness goal by writing2

2Note that for the refine to be fast, we must issue something like Strategy -10 [denote] to tell Coq to unfold denote before Let_In. Alternatively, we may issue something like Strategy 10 [Let_In] to tell Coq to unfold Let_In only after unfolding any constant with no Strategy declaration. This invocation may look familiar to those readers who read the footnotes in Section 2.2 (Exponential Domain), as in fact this is the issue at the root cause of the exponential performance blowup which resulted in numbers like “over 4 000 millennia” in an earlier version of Fiat Cryptography.

match goal with
| [ |- is_even ?v ]
  => let e := Reify v in
     refine (check_is_even_expr_sound e _ _);
     [ prove_wf | vm_compute; reflexivity ]
end.

Multiple Types

One important point, not yet mentioned, is that sometimes we want our reflective language to handle multiple types of terms. For example, we might want to enrich our language of expressions with lists. Since expressions like “take the successor of this list” don’t make sense, the natural choice is to index the inductive over codes for types.

We might write:

Inductive type := Nat | List (_ : type).

Inductive expr {var : type → Type} : type → Type :=
| NatO : expr Nat
| NatS : expr Nat → expr Nat
| NatMul : expr Nat → expr Nat → expr Nat
| Var {t} : var t → expr t
| LetIn {t1 t2} : expr t1 → (var t1 → expr t2) → expr t2
| Nil {t} : expr (List t)
| Cons {t} : expr t → expr (List t) → expr (List t)
| Length {t} : expr (List t) → expr Nat.

We would then have to adjust the definitions of the other functions accordingly. The type signatures of these functions might become

Fixpoint denote_type (t : type) : Type
  := match t with
     | Nat => nat
     | List t => list (denote_type t)
     end.

Fixpoint even_data_of_type (t : type) : Type
  := match t with
     | Nat => bool (* is the nat even or not? *)
     | List t => list (even_data_of_type t)
     end.

Fixpoint denote {t} (e : @expr denote_type t) : denote_type t.

Fixpoint check_is_even_expr {t} (e : @expr even_data_of_type t) : even_data_of_type t.

Inductive related {var1 var2 : type → Type} : list { t : type & var1 t * var2 t} → ∀ {t}, @expr var1 t → @expr var2 t → Prop.

Definition Expr (t : type) := ∀ var, @expr var t.

Definition Wf {t} (e : Expr t) := ∀ var1 var2, related [] (e var1) (e var2).

See Chlipala [Chl08] for a fuller treatment.

de Bruijn Indices

The idea behind de Bruijn indices is that variables are encoded by numbers which count up starting from the nearest enclosing binder. We might write

Inductive expr :=
| NatO : expr
| NatS : expr → expr
| NatMul : expr → expr → expr
| Var : nat → expr
| LetIn : expr → expr → expr.

Fixpoint denote (default : nat) (Γ : list nat) (t : expr) : nat
  := match t with
     | NatO => O
     | NatS x => S (denote default Γ x)
     | NatMul x y => denote default Γ x * denote default Γ y
     | Var idx => nth_default default Γ idx
     | LetIn v f => dlet x := denote default Γ v in
                    denote default (x :: Γ) f
     end.

If we wanted a more efficient representation, we could choose better data structures for the context Γ and variable indices than linked lists and unary-encoded natural numbers. One particularly convenient choice, in Coq, would be using the efficient PositiveMap.t data structure which encodes a finite map of binary-encoded positives to any type.

One unfortunate result is that the natural denotation function is no longer total. Here we have chosen to give a denotation function which returns a default element when a variable reference is too large, but we could instead choose to return an option nat. In general, however, returning an optional result significantly complicates the denotation function when binders are involved, because the types A → option B and option (A → B) are not isomorphic. On the other hand, requiring a default denotation prevents syntax trees from being able to represent possibly empty types.
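For the let-only language above, one might sketch the option-returning variant as follows (names our own); even here the LetIn case must force the let-bound value before extending the context, and once the language has first-class function binders (whose denotations are themselves functions), the mismatch between A → option B and option (A → B) forces more invasive changes.

Fixpoint denote_opt (Γ : list nat) (t : expr) : option nat
  := match t with
     | NatO => Some O
     | NatS x => option_map S (denote_opt Γ x)
     | NatMul x y => match denote_opt Γ x, denote_opt Γ y with
                     | Some a, Some b => Some (a * b)
                     | _, _ => None
                     end
     | Var idx => nth_error Γ idx
     | LetIn v f => match denote_opt Γ v with
                    | Some x => denote_opt (x :: Γ) f
                    | None => None
                    end
     end.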

This causes further problems when dealing with an AST type which can represent terms of multiple types. In that case, we might annotate each variable node with a type code, mandate decidable equality of type codes, and then during denotation, we’d check the type of the variable node against the type of the corresponding variable in the context.

Nested Abstract Syntax

If we want a variant of de Bruijn indices which guarantees well-typed syntax trees, we can use nested abstract syntax [HM12; BP99]. On monotyped ASTs, this looks like encoding the size of the context in the type of the expressions. For example, we could use option types [HM12]:

Notation "^ V" := (option V). Inductive expr : Type → Type := | NatO {V} : expr V | NatS {V} : expr V → expr V | NatMul {V} : expr V → expr V → expr V | Var {V} : V → expr V | LetIn {V} : expr V → expr (^V) → expr V.

This may seem a bit strange to those accustomed to encodings of terms in proof assistants, but it generalizes to a quite familiar intrinsic encoding of dependent type theory using types, contexts, and terms [Ben+12]. Namely, when the expressions are multityped, we end up with something like

Inductive context :=
| emp : context
| push : type → context → context.

Inductive var : context → type → Type :=
| Var0 {t Γ} : var (push t Γ) t
| VarS {t t' Γ} : var Γ t → var (push t' Γ) t.

Inductive expr : context → type → Type :=
| NatO {Γ} : expr Γ Nat
| NatS {Γ} : expr Γ Nat → expr Γ Nat
| NatMul {Γ} : expr Γ Nat → expr Γ Nat → expr Γ Nat
| Var {t Γ} : var Γ t → expr Γ t
| LetIn {Γ t1 t2} : expr Γ t1 → expr (push t1 Γ) t2 → expr Γ t2.

Note that this generalizes nicely to codes for dependent types if the proof assistant supports induction-induction.

Although this representation enjoys both decidable equality of binders (like de Bruijn indices) and well-typed-by-construction syntax (like PHOAS), it’s unfortunately unfit for coding algorithms that need to scale without massive assistance from the proof assistant. In particular, the naïve encoding of this inductive datatype incurs a quadratic overhead in representing terms involving binders, because each node stores the entire context. It is possible in theory to avoid this blowup by dropping the indices of the inductive type from the runtime representation [BMM03]. One way to simulate this in Coq would be to put context in Prop and then extract the code to OCaml, which erases the Props. Alternatively, if Coq is extended with support for dropping irrelevant subterms [Gil+19] from the term representation, then this speedup could be accomplished even inside Coq.

Nominal

Nominal representations [Pit03] use names rather than indices for binders. These representations have the benefit of being more human-readable but require reasoning about freshness of names and capture-avoiding substitution. Additionally, if the representation of names is not sufficiently compact, the overhead of storing names at every binder node can become significant.

Locally Nameless

We mention the locally nameless representation [Cha12; Ler07] because it is the term representation used by Coq itself. This representation uses de Bruijn indices for locally-bound variables and names for variables which are not bound in the current term.

Much like nominal representations, locally nameless representations also incur the overhead of generating and storing names. Naïve algorithms for generating fresh names, such as the algorithm used in Coq, can easily incur overhead that’s linear in the size of the context. Generating n fresh names then incurs Θ(n²) overhead. Additionally, using a locally nameless representation requires that evar substitutions be named. See also Section 4.5.1.

3.1.4 Performance of Proving Reflective Well-Formedness of PHOAS

We saw in Section 3.1.3 that in order to prove the soundness theorem, we needed a way to relate two PHOASTs (parametric higher-order abstract syntax trees), which generalized to a notion of well-formedness for the Expr type.

Unfortunately, the proof that two exprs are related is quadratic in the size of the expression, for much the same reason that proving conjunctions in Subsection 2.6.1 resulted in a proof term which was quadratic in the number of conjuncts. We present two ways to encode linearly sized proofs of well-formedness in PHOAS.

Iterating Reflection

The first method of encoding linearly sized proofs of related is itself a good study in how using proof by reflection can compress proof terms. Rather than constructing the inductive related proof, we can instead write a fixpoint:

Fixpoint is_related {var1 var2 : Type} (Γ : list (var1 * var2))
         (e1 : @expr var1) (e2 : @expr var2) : Prop
  := match e1, e2 with
     | NatO, NatO => True
     | NatS e1, NatS e2 => is_related Γ e1 e2
     | NatMul x1 y1, NatMul x2 y2 => is_related Γ x1 x2 /\ is_related Γ y1 y2
     | Var v1, Var v2 => List.In (v1, v2) Γ
     | LetIn e1 f1, LetIn e2 f2
       => is_related Γ e1 e2
          /\ ∀ v1 v2, is_related ((v1, v2) :: Γ) (f1 v1) (f2 v2)
     | _, _ => False
     end.

This unfortunately isn’t quite linear in the size of the syntax tree, though it is significantly smaller. One way to achieve even more compact proof terms is to pick a more optimized representation for list membership and to convert the proposition to be an eliminator.3 This consists of replacing A ∧ B with ∀P, (A → B → P) → P, and similar.

Fixpoint is_related_elim {var1 var2 : Type} (Γ : list (var1 * var2))
         (e1 : @expr var1) (e2 : @expr var2) : Prop
  := match e1, e2 with
     | NatO, NatO => True
     | NatS e1, NatS e2 => is_related_elim Γ e1 e2
     | NatMul x1 y1, NatMul x2 y2
       => forall P : Prop,
            (is_related_elim Γ x1 x2 → is_related_elim Γ y1 y2 → P) → P
     | Var v1, Var v2
       => forall P : Prop,
            (forall n, List.nth_error Γ (N.to_nat n) = Some (v1, v2) → P) → P
     | LetIn e1 f1, LetIn e2 f2
       => forall P : Prop,
            (is_related_elim Γ e1 e2
             → (forall v1 v2, is_related_elim ((v1, v2) :: Γ) (f1 v1) (f2 v2))
             → P)
            → P
     | _, _ => False
     end.

3The size of the proof term will still have an extra logarithmic factor in the size of the syntax tree due to representing variable indices in binary. Moreover, the size of the proof term will still be quadratic due to the fact that functions store the types of their binders. However, this representation allows proof terms that are significantly faster to construct in Coq’s proof engine for reasons that are not entirely clear to us.


We can now prove is_related_elim Γ e1 e2 → is_related Γ e1 e2.

Note that making use of the fixpoint is significantly more inconvenient than making use of the inductive; the proof of check_is_even_expr_sound, for example, proceeds most naturally by induction on the relatedness hypothesis. We could instead induct on one of the ASTs and destruct the other one, but this becomes quite hairy when the ASTs are indexed over their types.

Via de Bruijn

An alternative, ultimately superior, method of constructing compact proofs of relatedness involves a translation to a de Bruijn representation. Although producing well-formedness proofs automatically using a verified-as-well-formed translator from de Bruijn was present already in early PHOAS papers [Chl10], we believe the trick of round-tripping through a de Bruijn representation is new. Additionally, there are a number of considerations that are important for achieving adequate performance which we believe are not explained elsewhere in the literature, which we discuss at the end of this subsubsection.

We can define a Boolean predicate on de Bruijn syntax representing well-formedness.

Fixpoint is_closed_under (max_idx : nat) (e : expr) : bool
  := match e with
     | NatO => true
     | NatS e => is_closed_under max_idx e
     | NatMul x y => is_closed_under max_idx x && is_closed_under max_idx y
     | Var n => n <? max_idx
     | LetIn v f => is_closed_under max_idx v && is_closed_under (S max_idx) f
     end.
Definition is_closed := is_closed_under 0.

Note that this check generalizes quite nicely to expressions indexed over their types—so long as type codes have decidable equality—where we can pass around a list (or more efficient map structure) of types for each variable and just check that the types are equal.

Now we can prove that whenever a de Bruijn expr is closed, any two PHOAS exprs created from that AST will be related in the empty context. Therefore, if the PHOAS expr we start off with is the result of converting some de Bruijn expr to PHOAS, we can easily prove that it’s well-formed simply by running vm_compute on the is_closed procedure. How might we get such a de Bruijn expr? The easiest way is to write a converter from PHOAS to de Bruijn.
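As a sketch of one possible shape for those converters (using the untyped PHOAS expr of Subsection 3.1.3, writing dexpr for a copy of the de Bruijn type to avoid the name clash, and converting between de Bruijn levels and indices with truncating subtraction; the names and details here are ours, and the real development may differ):

Inductive dexpr :=
| DNatO : dexpr
| DNatS : dexpr → dexpr
| DNatMul : dexpr → dexpr → dexpr
| DVar : nat → dexpr
| DLetIn : dexpr → dexpr → dexpr.

(* PHOAS → de Bruijn: instantiate var := nat, recording the binding level of each variable. *)
Fixpoint to_deBruijn (depth : nat) (e : @expr nat) : dexpr
  := match e with
     | NatO => DNatO
     | NatS x => DNatS (to_deBruijn depth x)
     | NatMul x y => DNatMul (to_deBruijn depth x) (to_deBruijn depth y)
     | Var level => DVar (depth - S level)
     | LetIn v f => DLetIn (to_deBruijn depth v) (to_deBruijn (S depth) (f depth))
     end.
Definition PHOAS_to_deBruijn (e : Expr) : dexpr := to_deBruijn 0 (e nat).

(* de Bruijn → PHOAS: thread a context of already-built PHOAS variable nodes. *)
Fixpoint of_deBruijn {var} (Γ : list (@expr var)) (e : dexpr) : @expr var
  := match e with
     | DNatO => NatO
     | DNatS x => NatS (of_deBruijn Γ x)
     | DNatMul x y => NatMul (of_deBruijn Γ x) (of_deBruijn Γ y)
     | DVar n => nth_default NatO Γ n
     | DLetIn v f => LetIn (of_deBruijn Γ v) (fun x => of_deBruijn (Var x :: Γ) f)
     end.
Definition deBruijn_to_PHOAS (e : dexpr) : Expr := fun var => of_deBruijn (var:=var) nil e.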

Hence we can prove the theorem ∀ e, is_closed (PHOAS_to_deBruijn e) = true ∧ e = deBruijn_to_PHOAS (PHOAS_to_deBruijn e) → Wf e. The hypothesis of this theorem is quite easy to check; we simply run vm_compute and then instantiate it with the proof term conj (eq_refl true) (eq_refl e), which is linear in the size of e.

Note that, unlike the initial term representation of Chlipala [Chl10], we cannot have a closed-by-construction de Bruijn representation if we want linear asymptotics. If we index each node over the size of the context—or, worse, the list of types of variables in the context—then the term representation incurs quadratic overhead in the size of the context.

3.2 Reification

The one part of proof by reflection that we’ve neglected up to this point is reification. There are many ways of performing reification; in Chapter 6, we discuss 19 different ways of implementing reification, using 6 different metaprogramming facilities in the Coq ecosystem: ℒtac, Ltac2, Mtac2 [Gon+13b; Kai+18], type classes [SO08], canonical structures [GMT16], and reification-specific OCaml plugins (quote [Coq17b]4, template-coq [Ana+18], ours). Figure 3-5 displays the simplest case: an ℒtac script to reify a tree of function applications and constants. Unfortunately, all methods we surveyed become drastically more complicated or slower (and usually both) when adapted to reify terms with variable bindings such as let-in or λ nodes.

4Note that this plugin was removed in Coq 8.10 [Dén18], and so our plots no longer include this plugin.

Ltac f v term := (* reify var term *)
  lazymatch term with
  | O => constr:(@NatO v)
  | S ?x => let X := f v x in
            constr:(@NatS v X)
  | ?x * ?y => let X := f v x in
               let Y := f v y in
               constr:(@NatMul v X Y)
  end.

Figure 3-5: Reification Without Binders in ℒtac
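For instance, assuming the var-parameterized expr of Subsection 3.1.3 is the one in scope, this script can be invoked as follows (a usage sketch; the name reified and the comment stating its value are illustrative):

Goal is_even (2 * 2).
Proof.
  let e := f nat constr:(2 * 2) in pose (reified := e).
  (* reified := NatMul (NatS (NatS NatO)) (NatS (NatS NatO)) : @expr nat *)
Abort.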

We have made detailed walkthroughs and source code of these implementations available5 in hope that they will be useful for others considering implementing reification using one of these metaprogramming mechanisms, instructive as nontrivial examples of multiple metaprogramming facilities, or helpful as a case study in Coq performance engineering. However, we do not recommend reading these out of general interest: most of the complexity in the described implementations strikes us as needless, with significant aspects of the design being driven by surprising behaviors, misfeatures, bugs, and performance bottlenecks of the underlying machinery as opposed to the task of reification.

There are a couple of complications that arise when reifying binders, which broadly fall into two categories. One category is the metaprogramming language’s treatment of binders. In ℒtac, for example, the body of a function is not a well-typed term, because the variable binder refers to a nonexistent name; getting the name to actually refer to something, so that we can inspect the term, is responsible for a great deal of the complexity in reification code in ℒtac. The other category is any mismatch between the representation of binders in the metaprogramming language and the representation of binders in the reified syntax. If the metaprogramming language represents variables as de Bruijn indices, and we are reifying to a de Bruijn representation, then we can reuse the indices. If the metaprogramming language represents variables as names, and we are reifying to a named representation, then we can reuse the names. If the representations mismatch, then we need to do extra work to align the representations, such as keeping some sort of finite map structure from binders in the metalanguage to binders in the AST.

3.3 What’s Next?

Having introduced and explained proof by reflection and reflective automation, we can now introduce the main contribution of this thesis. In the next chapter we’ll present the reflective framework we developed for achieving adequate performance in the Fiat Cryptography project.

5https://github.com/mit-plv/reification-by-parametricity

Chapter 4

A Framework for Building Verified Partial Evaluators

4.1 Introduction

In this chapter, we present an approach to verified partial evaluation in proof assistants, which requires no changes to proof checkers. To make the relevance concrete, we use the example of Fiat Cryptography [Erb+19], a Coq library that generates code for big-integer modular arithmetic at the heart of elliptic-curve-cryptography algorithms. This domain-specific compiler has been adopted, for instance, in the Chrome Web browser, such that about half of all HTTPS connections from browsers are now initiated using code generated (with proof) by Fiat Cryptography. However, Fiat Cryptography was only used successfully to build C code for the two most widely used curves (P-256 and Curve25519). Our original method of partial evaluation timed out trying to compile code for the third most widely used curve (P-384). Additionally, to achieve acceptable reduction performance, the library code had to be written manually in continuation-passing style. We will demonstrate a new Coq library that corrects both weaknesses, while maintaining the generality afforded by allowing rewrite rules to be mixed with partial evaluation.

4.1.1 A Motivating Example

We are interested in partial-evaluation examples that mix higher-order functions, inductive datatypes, and arithmetic simplification. For instance, consider the following Coq code.

Definition prefixSums (ls : list nat) : list nat :=
  let ls' := combine ls (seq 0 (length ls)) in
  let ls'' := map (λ p, fst p * snd p) ls' in
  let '(_, ls''') := fold_left (λ '(acc, ls''') n,
                                  let acc' := acc + n in
                                  (acc', acc' :: ls''')) ls'' (0, []) in
  ls'''.

This function first computes list ls' that pairs each element of input list ls with its position, so, for instance, list [푎; 푏; 푐] becomes [(푎, 0); (푏, 1); (푐, 2)]. Then we map over the list of pairs, multiplying the components at each position. Finally, we traverse that list, building up a list of all prefix sums.

We would like to specialize this function to particular list lengths. That is, we know in advance how many list elements we will pass in, but we do not know the values of those elements. For a given length, we can construct a schematic list with one free variable per element. For example, to specialize to length four, we can apply the function to list [a; b; c; d], and we expect this output: let acc := b + c * 2 in let acc' := acc + d * 3 in [acc'; acc; b; 0]

Notice how subterm sharing via lets is important. As list length grows, we avoid quadratic blowup in term size through sharing. Also notice how we simplified the first two multiplications with 푎 ⋅ 0 = 0 and 푏 ⋅ 1 = 푏 (each of which requires explicit proof in Coq), using other arithmetic identities to avoid introducing new variables for the first two prefix sums of ls'', as they are themselves constants or variables, after simplification.
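To spell out the example, on input [a; b; c; d] the definitions above compute

ls'   = [(a, 0); (b, 1); (c, 2); (d, 3)]
ls''  = [a * 0; b * 1; c * 2; d * 3], which simplifies to [0; b; c * 2; d * 3]
accumulators of the fold: 0 + 0 = 0,  0 + b = b,  b + c * 2,  (b + c * 2) + d * 3
ls''' = [(b + c * 2) + d * 3; b + c * 2; b; 0]

Only the last two accumulators are worth naming with lets, which is exactly the specialized output shown above.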

To set up our partial evaluator, we prove the algebraic laws that it should use for simplification, starting with basic arithmetic identities.

Lemma zero_plus : ∀ n, 0 + n = n.
Lemma times_zero : ∀ n, n * 0 = 0.
Lemma plus_zero : ∀ n, n + 0 = n.
Lemma times_one : ∀ n, n * 1 = n.

Next, we prove a law for each list-related function, connecting it to the primitive-recursion combinator for some inductive type (natural numbers or lists, as appropriate). We use a special apostrophe marker to indicate a quantified variable that may only match with compile-time constants. We also use a further marker ident.eagerly to ask the reducer to simplify a case of primitive recursion by complete traversal of the designated argument’s constructor tree.

Lemma eval_map A B (f : A -> B) l
  : map f l = ident.eagerly list_rect _ _ [] (λ x _ l', f x :: l') l.
Lemma eval_fold_left A B (f : A -> B -> A) l a
  : fold_left f l a = ident.eagerly list_rect _ _ (λ a, a) (λ x _ r a, r (f a x)) l a.
Lemma eval_combine A B (la : list A) (lb : list B)
  : combine la lb
    = list_rect _ (λ _, []) (λ x _ r lb, list_case (λ _, _) [] (λ y ys, (x,y)::r ys) lb) la lb.
Lemma eval_length A (ls : list A)
  : length ls = list_rect _ 0 (λ _ _ n, S n) ls.

With all the lemmas available, we can package them up into a rewriter, which triggers generation of a specialized rewrite procedure and its soundness proof. Our Coq plugin introduces a new command Make for building rewriters:

Make rewriter := Rewriter For
  (zero_plus, plus_zero, times_zero, times_one,
   eval_map, eval_fold_left, do_again eval_length, do_again eval_combine,
   eval_rect nat, eval_rect list, eval_rect prod)
  (with delta) (with extra idents (seq)).

Most inputs to Rewriter For list quantified equalities to use for left-to-right rewriting. However, we also use options do_again, to request that some rules trigger an extra bottom-up pass after being used for rewriting; eval_rect, to queue up eager evaluation of a call to a primitive-recursion combinator on a known recursive argument; with delta, to request evaluation of all monomorphic operations on concrete inputs; and with extra idents, to inform the engine of further permitted identifiers that do not appear directly in any of the rewrite rules.

Our plugin also provides new tactics like Rewrite_rhs_for, which applies a rewriter to the right-hand side of an equality goal. That last tactic is just what we need to synthesize a specialized prefixSums for list length four, along with a proof of its equivalence to the original function.

Definition prefixSums4
  : { f : nat → nat → nat → nat → list nat
    | ∀ a b c d, f a b c d = prefixSums [a; b; c; d] }
  := ltac:(eexists; Rewrite_rhs_for rewriter; reflexivity).

4.1.2 Concerns of Trusted-Code-Base Size

Crafting a reduction strategy is challenging enough in a standalone tool. A large part of the difficulty in a proof assistant is reducing in a way that leaves a proof trail that can be checked efficiently by a small kernel. Most proof assistants present user-friendly surface tactic languages that generate proof traces in terms of more-elementary tactic steps. The trusted proof checker only needs to know about the elementary steps, and there is pressure to be sure that these steps are indeed elementary, not requiring excessive amounts of kernel code. However, hardcoding a new reduction strategy

in the kernel can bring dramatic performance improvements. Generating thousands of lines of code with partial evaluation would be intractable if we were outputting sequences of primitive rewrite steps justifying every little term manipulation, so we must take advantage of the time-honored feature of type-theoretic proof assistants that reductions included in the definitional equality need not be requested explicitly. We discuss the performance issues in more detail in Performance Bottlenecks of Proof-Producing Rewriting in Section 4.5.1.

Which kernel-level reductions does Coq support today? Currently, the trusted code base knows about four different kinds of reduction: left-to-right conversion, right-to-left conversion, a virtual machine (VM) written in C based on the OCaml compiler, and a compiler to native code. Furthermore, the first two are parameterized on an arbitrary user-specified ordering of which constants to unfold when, in addition to internal heuristics about what to do when the user has not specified an unfolding order for given constants. Recently, native support for 63-bit integers [DG18] and IEEE 754-2008 binary64 floats [MBR19] has been added to the VM and native machines. A recent pull request proposes adding support for native arrays [Dén20b].

To summarize, there has been quite a lot of “complexity creep” in the Coq trusted base, to support efficient reduction, and yet realistic partial evaluation has still been rather challenging. Even the additional three reduction mechanisms outside Coq’s kernel (cbn, simpl, cbv) are not at first glance sufficient for verified partial evaluation.

4.1.3 Our Solution

Aehlig, Haftmann, and Nipkow [AHN08] presented a very relevant solution to a related problem, using normalization by evaluation (NbE) [BS91] to bootstrap reduction of open terms on top of full reduction, as built into a proof assistant. However, it was simultaneously true that they expanded the proof-assistant trusted code base in ways specific to their technique, and that they did not report any experiments actually using the tool for partial evaluation (just traditional full reduction), potentially hiding performance-scaling challenges or other practical issues. We have adapted their approach in a new Coq library embodying the first partial-evaluation approach to satisfy the following criteria.

• It integrates with a general-purpose, foundational proof assistant, without growing the trusted base.

• For a wide variety of initial functional programs, it provides fast partial evaluation with reasonable memory use.

• It allows reduction that mixes rules of the definitional equality with equalities proven explicitly as theorems.

74 • It preserves sharing of common subterms.

• It also allows extraction of standalone partial evaluators.

Our contributions include answers to a number of challenges that arise in scaling NbE-based partial evaluation in a proof assistant. First, we rework the approach of Aehlig, Haftmann, and Nipkow [AHN08] to function without extending a proof assistant’s trusted code base, which, among other challenges, requires us to prove termination of reduction and encode pattern matching explicitly (leading us to adopt the performance-tuned approach of Maranget [Mar08]).

Second, using partial evaluation to generate residual terms thousands of lines long raises new scaling challenges:

• Output terms may contain so many nested variable binders that we expect it to be performance-prohibitive to perform bookkeeping operations on first-order-encoded terms (e.g., with de Bruijn indices, as is done in ℛtac by Malecha and Bengtson [MB16]). For instance, while the reported performance experiments of Aehlig, Haftmann, and Nipkow [AHN08] generate only closed terms with no binders, Fiat Cryptography may generate a single routine (e.g., multiplication for curve P-384) with nearly a thousand nested binders.

• Naïve representation of terms without proper sharing of common subterms can lead to fatal term-size blow-up. Fiat Cryptography’s arithmetic routines rely on significant sharing of this kind.

• Unconditional rewrite rules are in general insufficient, and we need rules with side conditions. For instance, in Fiat Cryptography, some rules for simplifying modular arithmetic depend on proofs that operations in subterms do not overflow.

• However, it is also not reasonable to expect a general engine to discharge all side conditions on the spot. We need integration with abstract interpretation that can analyze whole programs to support reduction.

Briefly, our respective solutions to these problems are the parametric higher-order abstract syntax (PHOAS) [Chl08] term encoding, a let-lifting transformation threaded throughout reduction, extension of rewrite rules with executable Boolean side conditions, and a design pattern that uses decorator function calls to include analysis results in a program.

Finally, we carry out the first large-scale performance-scaling evaluation of partial evaluation in a proof assistant, covering all elliptic curves from the published Fiat Cryptography experiments, along with microbenchmarks.

This chapter proceeds through explanations of the trust stories behind our approach and earlier ones (Section 4.2), the core structure of our engine (Section 4.3), the additional scaling challenges we faced (Section 4.4), performance experiments (Section 4.5), and related work (Section 4.6) and conclusions. Our implementation is available on GitHub at https://github.com/mit-plv/rewriter.1

4.2 Trust, Reduction, and Rewriting

Since much of the narrative behind our design process depends on trade-offs between performance and trustworthiness, we start by reviewing the general situation in proof assistants.

Across a variety of proof assistants, simplification of functional programs is a workhorse operation. Proof assistants like Coq that are based on type theory typically build in definitional equality relations, identifying terms up to reductions like β-reduction and unfolding of named identifiers. What looks like a single “obvious” step in an on-paper equational proof may require many of these reductions, so it is handy to have built-in support for checking a claimed reduction. Figure 4-1a diagrams how such steps work in a system like Coq, where the system implementation is divided between a trusted kernel, for checking proof terms in a minimal language, and additional untrusted support, like a tactic engine evaluating a language of higher-level proof steps, in the process generating proof terms out of simpler building blocks. It is standard to include a primitive proof step that validates any reduction compatible with the definitional equality, as the latter is decidable. The figure shows a tactic that simplifies a goal using that facility.
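As a minimal illustration of such a kernel-checked step (a sketch; any convertible restatement of the goal would do equally well):

Goal (fun x => x + 0) 2 = 2 + 0.
Proof.
  change (2 + 0 = 2 + 0). (* accepted because the two goals are definitionally equal *)
  reflexivity.
Qed.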

In proof goals containing free variables, executing subterms can get stuck before reaching normal forms. However, we can often achieve further simplification by using equational rules that we prove explicitly, rather than just relying on the rules built into the definitional equality and its decidable equivalence checker. Coq’s autorewrite tactic, as diagrammed in Figure 4-1b, is a good example: it takes in a database of quantified equalities and applies them repeatedly to rewrite in a goal. It is important that Coq’s kernel does not trust the autorewrite tactic. Instead, the tactic must output a proof term that, in some sense, is the moral equivalent of a line-by-line equational proof. It can be challenging to keep these proof terms small enough, as naïve rewrite-by-rewrite versions repeatedly copy large parts of proof goals, justifying a rewrite like C[e₁] = C[e₂] for some context C given a proof of e₁ = e₂, with the full value of C replicated in the proof term for that single rewrite. Overcoming these challenges while retaining decidability of proof checking is tricky, since we may use autorewrite with rule sets that do not always lead to terminating reduction. Coq includes more experimental alternatives like rewrite_strat, which use bottom-up construction of multi-rewrite proofs, with sharing of common contexts.
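For a concrete instance of this context copying, rewriting with the plus_zero lemma of Subsection 4.1.1 in the goal f (x + 0) = f x produces, schematically (the exact eliminator Coq's rewrite uses may differ in minor ways), the proof term

eq_ind_r (fun n => f n = f x) eq_refl (plus_zero x)

in which the entire rewriting context fun n => f n = f x is copied into the term; with many rewrites in a large goal, the total proof-term size grows roughly as the product of the number of rewrites and the goal size.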

1The version described in this dissertation is available under the tag v0.0.1.

[Figure 4-1: Different approaches to reduction and rewriting. (a) Reduction via the definitional equality; (b) Rewriting via proved rules; (c) Approach of Aehlig, Haftmann, and Nipkow [AHN08]; (d) Our approach.]

Still, as Section 4.5 will show, these methods that generate substantial proof terms are at significant performance disadvantages. We also experimented with the corresponding tactics in the Lean proof assistant, with similarly disappointing results (Subsection 4.5.3).

Now we summarize how Aehlig, Haftmann, and Nipkow [AHN08] provide flexible and fast interleaving of standard λ-calculus reduction and use of proved equalities (the next section will go into more detail). Figure 4-1c demonstrates a workflow based on a deep embedding of a core ML-like language. That is, within the logic of the proof assistant (Isabelle/HOL, in their case), a type of syntax trees for ML programs is defined, with an associated operational semantics. The basic strategy is, for a particular set of rewrite rules and a particular term to simplify, to generate a (deeply embedded) ML program that, if it terminates, produces a syntax tree for the simplified term. Their tactic uses reification to create ML versions of rule sets and terms. They also wrote a reduction function in ML and proved it sound once and for all, against the ML operational semantics. Combining that proof with proofs generated by reification, we conclude that an application of the reduction function to the reified rules and term is indeed an ML term that generates correct answers. The tactic then “throws the ML term over the wall,” using a general code-generation framework for Isabelle/HOL [HN07]. Trusted code compiles the ML code into the concrete syntax of a mainstream ML language, Standard ML in their case, and compiles it with an

off-the-shelf compiler. The output of that compiled program is then passed back over to the tactic, in terms of an axiomatic assertion that the ML semantics really yields that answer.

As Aehlig, Haftmann, and Nipkow [AHN08] argue, their use of external compilation and evaluation of ML code adds no real complexity on top of that required by the proof assistant – after all, the proof assistant itself must be compiled and executed somehow. However, the perceived increase of trusted code base is not spurious: it is one thing to trust that the toolchain and execution environment used by the proof assistant and the partial evaluator are well-behaved, and another to rely on two descriptions of ML (one deeply embedded in the proof assistant and another implied by the compiler) to agree on every detail of the semantics. Furthermore, there still is new trusted code to translate from the deeply embedded ML subset into the concrete syntax of the full-scale ML language. The vast majority of proof-assistant developments today rely on no such embeddings with associated mechanized semantics, so need we really add one to a proof-checking kernel to support efficient partial evaluation?

Our answer, diagrammed in Figure 4-1d, shows a different way. We still reify terms and rules into a deeply embedded language. However, the reduction engine is implemented directly in the logic, rather than as a deeply embedded syntax tree of an ML program. As a result, the kernel’s own reduction engine is prepared to execute our reduction engine for us – using an operation that would be included in a type-theoretic proof assistant in any case, with no special support for a language deep embedding. We also stage the process for performance reasons. First, the Make command creates a rewriter out of a list of rewrite rules, by specializing a generic partial-evaluation engine, which has a generic proof that applies to any set of proved rewrite rules. We perform partial evaluation on the specialized partial evaluator, using Coq’s normal reduction mechanisms, under the theory that we can afford to pay performance costs at this stage because we only need to create new rewriters relatively infrequently. Then individual rewritings involve reifying terms, asking the kernel to execute the specialized evaluator on them, and simplifying an application of an interpretation function to the result (this last step must be done using Coq’s normal reduction, and it is the bottleneck for outputs with enormous numbers of nested binders as discussed in Section 4.5.1).

We would like to emphasize that, while we prototyped our implementation in Coq in particular, the trade-off space that we navigate seems fundamental, so that it should be the case both that our approach can be adapted to other proof assistants and that this case study may inform proof-assistant design. The general game here is to stock the trusted proof-checking kernel with as few primitive rules as we can get away with, while still providing enough flexibility and performance. Every proof assistant we are aware of has a small functional language at its core, and we argue that it is quite natural to include a primitive for efficient full reduction of programs. Our empirical result is that such a primitive can form the basis for bootstrapping other kinds of efficient reduction, perhaps suggesting that a future Coq version could fruitfully shrink its

kernel by eliminating other built-in reduction strategies.

4.2.1 Our Approach in Nine Steps

Here is a bit more detail on the steps that go into applying our Coq plugin, many of which we expand on in the following sections. In order to build a precomputed rewriter with the Make command, the following actions are performed:

1. The given lemma statements are scraped for which named functions and types the rewriter package will support.

2. Inductive types enumerating all available primitive types and functions are emitted.

3. Tactics generate all of the necessary definitions and prove all of the necessary lemmas for dealing with this particular set of inductive codes. Definitions include operations like Boolean equality on type codes and lemmas like “all representable primitive types have decidable equality.”

4. The statements of rewrite rules are reified and soundness and syntactic-well-formedness lemmas are proven about each of them. Each instance of the former involves wrapping the user-provided proof with the right adapter to apply to the reified version.

5. The definitions needed to perform reification and rewriting and the lemmas needed to prove correctness are assembled into a single package that can be passed by name to the rewriting tactic.

When we want to rewrite with a rewriter package in a goal, the following steps are performed:

1. We rearrange the goal into a single logical formula: all free-variable quantification in the proof context is replaced by changing the equality goal into an equality between two functions (taking the free variables as inputs).

2. We reify the side of the goal we want to simplify, using the inductive codes in the specified package. That side of the goal is then replaced with a call to the denotation function on the reified version.

3. We use a theorem stating that rewriting preserves denotations of well-formed terms to replace the denotation subterm with the denotation of the rewriter applied to the same reified term. We use Coq’s built-in full reduction (vm_compute) to reduce the application of the rewriter to the reified term.

4. Finally, we run cbv (a standard call-by-value reducer) to simplify away the invocation of the denotation function on the concrete syntax tree from rewriting.

4.3 The Structure of a Rewriter

We now simultaneously review the approach of Aehlig, Haftmann, and Nipkow [AHN08] and introduce some notable differences in our own approach, noting similarities to the reflective rewriter of Malecha and Bengtson [MB16] where applicable.

First, let us describe the language of terms we support rewriting in. Note that, while we support rewriting in full-scale Coq proofs, where the metalanguage is dependently typed, the object language of our rewriter is nearly simply typed, with limited support for calling polymorphic functions. However, we still support identifiers whose definitions use dependent types, since our reducer does not need to look into definitions.

푒 ∶∶= App 푒1 푒2 ∣ Let 푣 ≔ 푒1 In 푒2 ∣ Abs (휆푣. 푒) ∣ Var 푣 ∣ Ident 푖

The Ident case is for identifiers, which are described by an enumeration specific to a use of our library. For example, the identifiers might be codes for +, ⋅, and literal constants. We write ⟦e⟧ for a standard denotational semantics.

4.3.1 Pattern-Matching Compilation and Evaluation

Aehlig, Haftmann, and Nipkow [AHN08] feed a specific set of user-provided rewrite rules to their engine by generating code for an ML function, which takes in deeply embedded term syntax (actually doubly deeply embedded, within the syntax of the deeply embedded ML!) and uses ML pattern matching to decide which rule to apply at the top level. Thus, they delegate efficient implementation of pattern matching to the underlying ML implementation. As we instead build our rewriter in Coq’s logic, we have no such option to defer to ML. Indeed, Coq’s logic only includes primitive pattern-matching constructs to match one constructor at a time.

We could follow a naïve strategy of repeatedly matching each subterm against a pattern for every rewrite rule, as in the rewriter of Malecha and Bengtson [MB16], but in that case we do a lot of duplicate work when rewrite rules use overlapping function symbols. Instead, we adopted the approach of Maranget [Mar08], who describes compilation of pattern matches in OCaml to decision trees that eliminate needless repeated work (for example, decomposing an expression into x + y + z only once even if two different rules match on that pattern). We have not yet implemented any of the optimizations described therein for finding minimal decision trees.

There are three steps to turn a set of rewrite rules into a functional program that takes in an expression and reduces according to the rules. The first step is pattern-matching compilation: we must compile the left-hand sides of the rewrite rules to a decision tree that describes how and in what order to decompose the expression, as well as describing which rewrite rules to try at which steps of decomposition. Because the decision tree is merely a decomposition hint, we require no proofs about it to ensure soundness of our rewriter. The second step is decision-tree evaluation, during which we decompose the expression as per the decision tree, selecting which rewrite rules to attempt. The only correctness lemma needed for this stage is that any result it returns is equivalent to picking some rewrite rule and rewriting with it. The third and final step is to actually rewrite with the chosen rule. Here the correctness condition is that we must not change the semantics of the expression. Said another way, any rewrite-rule replacement expression must match the semantics of the rewrite-rule pattern.

While pattern matching begins with comparing one pattern against one expression, Maranget’s approach works with intermediate goals that check multiple patterns against multiple expressions. A decision tree describes how to match a vector (or list) of patterns against a vector of expressions. It is built from these constructors:

• TryLeaf k onfailure: Try the 푘th rewrite rule; if it fails, keep going with onfailure.

• Failure: Abort; nothing left to try.

• Switch icases app_case default: With the first element of the vector, match on its kind; if it is an identifier matching something in icases, which is a list of pairs of identifiers and decision trees, remove the first element of the vector and run that decision tree; if it is an application and app_case is not None, try the app_case decision tree, replacing the first element of each vector with the two elements of the function and the argument it is applied to; otherwise, do not modify the vectors and use the default decision tree.

• Swap i cont: Swap the first element of the vector with the ith element (0-indexed) and keep going with cont. (A Coq sketch of these constructors as an inductive type follows below.)
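The constructors above can be transcribed into an inductive type roughly as follows (a sketch with our own naming, parameterized over an assumed type of identifier codes; the framework's actual definition differs in details):

Inductive decision_tree (ident : Type) : Type :=
| TryLeaf (k : nat) (onfailure : decision_tree ident)
| Failure
| Switch (icases : list (ident * decision_tree ident))
         (app_case : option (decision_tree ident))
         (default : decision_tree ident)
| Swap (i : nat) (cont : decision_tree ident).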

Consider the encoding of two simple example rewrite rules, where we follow Coq’s ℒtac language in prefacing pattern variables with question marks.

?n + 0 → n
fstℤ,ℤ(?x, ?y) → x

We embed them in an AST type for patterns, which largely follows our ASTs for expressions.

0. App (App (Ident +) Wildcard) (Ident (Literal 0))

1. App (Ident fst) (App (App (Ident pair) Wildcard) Wildcard)

The decision tree produced is

[Decision-tree diagram omitted: successive Switch nodes decompose App nodes and branch on the identifiers + (continuing through Swap 0↔1 and a match on Literal 0 to TryLeaf 0) and fst (continuing through matches on App and pair to TryLeaf 1).] In the diagram, every nonswap node implicitly has a “default” case arrow to Failure, and circles represent Switch nodes.

We implement, in Coq’s logic, an evaluator for these trees against terms. Note that we use Coq’s normal partial evaluation to turn our general decision-tree evaluator into a specialized matcher to get reasonable efficiency. Although this partial evaluation of our partial evaluator is subject to the same performance challenges we highlighted in the introduction, it only has to be done once for each set of rewrite rules, and we are targeting cases where the time of per-goal reduction dominates this time of meta-compilation.

For our running example of two rules, specializing gives us this match expression.

match e with
| App f y =>
  match f with
  | Ident fst =>
    match y with
    | App (App (Ident pair) x) y => x
    | _ => e
    end
  | App (Ident +) x =>
    match y with
    | Ident (Literal 0) => x
    | _ => e
    end
  | _ => e
  end
| _ => e
end.

4.3.2 Adding Higher-Order Features

Fast rewriting at the top level of a term is the key ingredient for supporting customized algebraic simplification. However, not only do we want to rewrite throughout the structure of a term, but we also want to integrate with simplification of higher-order terms, in a way where we can prove to Coq that our syntax-simplification function always terminates. Normalization by evaluation (NbE) [BS91] is an elegant technique for adding the latter aspect, in a way where we avoid needing to implement our own λ-term reducer or prove it terminating.

To orient expectations: we would like to enable the following reduction

(λf x y. f x y) (+) z 0  ⇝  z

82 using the rewrite rule

?푛 + 0 → 푛

Aehlig, Haftmann, and Nipkow [AHN08] also use NbE, and we begin by reviewing its most classic variant, for performing full 훽-reduction in a simply typed term in a guaranteed-terminating way. The simply typed 휆-calculus syntax we use is:

t ::= t → t | b
e ::= λv. e | e e | v | c

with v for variables, c for constants, and b for base types.

We can now define normalization by evaluation. First, we choose a “semantic” representation for each syntactic type, which serves as the result type of an intermediate interpreter.

NbE_t(t₁ → t₂) ≔ NbE_t(t₁) → NbE_t(t₂)
NbE_t(b) ≔ expr(b)

Function types are handled as in a simple denotational semantics, while base types receive the perhaps-counterintuitive treatment that the result of “executing” one is a syntactic expression of the same type. We write expr(푏) for the metalanguage type of object-language syntax trees of type 푏, relying on a type family expr.

Now the core of NbE, shown in Figure 4-2, is a pair of dual functions reify and reflect, for converting back and forth between syntax and semantics of the object language, defined by primitive recursion on type syntax. We split out analysis of term syntax in a separate function reduce, defined by primitive recursion on term syntax, when usually this functionality would be mixed in with reflect. The reason for this choice will become clear when we extend NbE to handle our full problem domain.

We write 푣 for object-language variables and 푥 for metalanguage (Coq) variables, and we overload 휆 notation using the metavariable kind to signal whether we are building a host 휆 or a 휆 syntax tree for the embedded language. The crucial first clause for reduce replaces object-language variable 푣 with fresh metalanguage variable 푥, and then we are somehow tracking that all free variables in an argument to reduce must have been replaced with metalanguage variables by the time we reach them. We reveal in Subsection 4.4.1 the encoding decisions that make all the above legitimate, but first let us see how to integrate use of the rewriting operation from the previous section. To fuse NbE with rewriting, we only modify the constant case of reduce. First, we bind our specialized decision-tree engine under the name rewrite-head. Recall that this function only tries to apply rewrite rules at the top level of its input.

In the constant case, we still reflect the constant, but underneath the binders

reify_t : NbE_t(t) → expr(t)
reify_{t₁→t₂}(f) ≔ λv. reify_{t₂}(f(reflect_{t₁}(v)))
reify_b(f) ≔ f

reflect_t : expr(t) → NbE_t(t)
reflect_{t₁→t₂}(e) ≔ λx. reflect_{t₂}(e(reify_{t₁}(x)))
reflect_b(e) ≔ e

reduce : expr(t) → NbE_t(t)
reduce(λv. e) ≔ λx. reduce([x/v]e)
reduce(e₁ e₂) ≔ (reduce(e₁)) (reduce(e₂))
reduce(x) ≔ x
reduce(c) ≔ reflect(c)

NbE : expr(t) → expr(t)
NbE(e) ≔ reify(reduce(e))

Figure 4-2: Implementation of normalization by evaluation

introduced by full η-expansion, we perform one instance of rewriting. In other words, we change this one function-definition clause:

reflect_b(e) ≔ rewrite-head(e)

It is important to note that a constant of function type will be 휂-expanded only once for each syntactic occurrence in the starting term, though the expanded function is effectively a thunk, waiting to perform rewriting again each time it is called. From first principles, it is not clear why such a strategy terminates on all possible input terms, though we work up to convincing Coq of that fact.

The details so far are essentially the same as in the approach of Aehlig, Haftmann, and Nipkow [AHN08]. Recall that their rewriter was implemented in a deeply embedded ML, while ours is implemented in Coq’s logic, which enforces termination of all functions. Aehlig et al. did not prove termination, and termination indeed does not hold in general for their rewriter, which works with untyped terms, not to mention the possibility of rule-specific ML functions that diverge themselves. In contrast, we need to convince Coq up-front that our interleaved λ-term normalization and algebraic simplification always terminate. Additionally, we need to prove that our rewriter preserves denotations of terms, which can easily devolve into tedious binder bookkeeping, depending on encoding.

The next section introduces the techniques we use to avoid explicit termination proof or binder bookkeeping, in the context of a more general analysis of scaling challenges.

4.4 Scaling Challenges

Aehlig, Haftmann, and Nipkow [AHN08] only evaluated their implementation against closed programs. What happens when we try to apply the approach to partial-evaluation problems that should generate thousands of lines of low-level code?

4.4.1 Variable Environments Will Be Large

We should think carefully about representation of ASTs, since many primitive operations on variables will run in the course of a single partial evaluation. For instance, Aehlig, Haftmann, and Nipkow [AHN08] reported a significant performance improvement changing variable nodes from using strings to using de Bruijn indices [Bru72]. However, de Bruijn indices and other first-order representations remain painful to work with. We often need to fix up indices in a term being substituted in a new context. Even looking up a variable in an environment tends to incur linear time overhead, thanks to traversal of a list. Perhaps we can do better with some kind of balanced-tree data structure, but there is a fundamental performance gap versus the arrays that can be used in imperative implementations. Unfortunately, it is difficult to integrate arrays soundly in a logic. Also, even ignoring performance overheads, tedious binder bookkeeping complicates proofs.

Our strategy is to use a variable encoding that pushes all first-order bookkeeping off on Coq's kernel, which is itself performance-tuned with some crucial pieces of imperative code. Parametric higher-order abstract syntax (PHOAS) [Chl08], which we introduced and described in Section 3.1.3, is a dependently typed encoding of syntax where binders are managed by the enclosing metalanguage. It allows for relatively easy implementation and proof for NbE, so we adopted it for our framework.

Here is the actual inductive definition of term syntax for our object language, PHOAS-style. The characteristic oddity is that the core syntax type expr is parameterized on a dependent type family for representing variables. However, the final representation type Expr uses first-class polymorphism over choices of variable type, bootstrapping on the metalanguage's parametricity to ensure that a syntax tree is agnostic to variable type.

Inductive type := arrow (s d : type) | base (b : base_type). Infix "→" := arrow.

Inductive expr (var : type -> Type) : type -> Type :=
| Var {t} (v : var t) : expr var t
| Abs {s d} (f : var s -> expr var d) : expr var (s → d)
| App {s d} (f : expr var (s → d)) (x : expr var s) : expr var d
| Ident {t} (c : ident t) : expr var t.

Definition Expr (t : type) : Type := forall var, expr var t.

A good example of encoding adequacy is assigning a simple denotational semantics. First, a simple recursive function assigns meanings to types.

Fixpoint denoteT (t : type) : Type :=
  match t with
  | arrow s d => denoteT s -> denoteT d
  | base b => denote_base_type b
  end.

Next we see the convenience of being able to use an expression by choosing how it should represent variables. Specifically, it is natural to choose the type-denotation function itself as the variable representation. Especially note how this choice makes rigorous the convention we followed in the prior section (e.g., in the suspicious function-abstraction clause of function reduce), where a recursive function enforces that values have always been substituted for variables early enough.

Fixpoint denoteE {t} (e : expr denoteT t) : denoteT t :=
  match e with
  | Var v => v
  | Abs f => 휆 x, denoteE (f x)
  | App f x => (denoteE f) (denoteE x)
  | Ident c => denoteI c
  end.

Definition DenoteE {t} (E : Expr t) : denoteT t := denoteE (E denoteT).

It is now easy to follow the same script in making our rewriting-enabled NbE fully formal. Note especially the first clause of reduce, where we avoid variable substitution precisely because we have chosen to represent variables with normalized semantic values. The subtlety there is that base-type semantic values are themselves expression syntax trees, which depend on a nested choice of variable representation, which we retain as a parameter throughout these recursive functions. The final definition 휆-quantifies over that choice.

Fixpoint nbeT var (t : type) : Type :=
  match t with
  | arrow s d => nbeT var s -> nbeT var d
  | base b => expr var b
  end.

Fixpoint reify {var t} : nbeT var t -> expr var t :=
  match t with
  | arrow s d => 휆 f, Abs (휆 x, reify (f (reflect (Var x))))
  | base b => 휆 e, e
  end
with reflect {var t} : expr var t -> nbeT var t :=
  match t with
  | arrow s d => 휆 e, 휆 x, reflect (App e (reify x))
  | base b => rewrite_head
  end.

Fixpoint reduce {var t} (e : expr (nbeT var) t) : nbeT var t :=
  match e with
  | Abs e => 휆 x, reduce (e (Var x))
  | App e1 e2 => (reduce e1) (reduce e2)
  | Var x => x
  | Ident c => reflect (Ident c)
  end.

Definition Rewrite {t} (E : Expr t) : Expr t :=
  휆 var, reify (reduce (E (nbeT var))).

One subtlety hidden above in implicit arguments is in the final clause of reduce, where the two applications of the Ident constructor use different variable representations. With all those details hashed out, we can prove a pleasingly simple correctness theorem, with a lemma for each main definition, with inductive structure mirroring recursive structure of the definition, also appealing to correctness of last section's pattern-compilation operations.

∀푡, 퐸 ∶ Expr 푡. ⟦Rewrite(퐸)⟧ = ⟦퐸⟧

Even before getting to the correctness theorem, we needed to convince Coq that the function terminates. While for Aehlig, Haftmann, and Nipkow [AHN08], a termination proof would have been a whole separate enterprise, it turns out that PHOAS and NbE line up so well that Coq accepts the above code with no additional termination proof. As a result, the Coq kernel is ready to run our Rewrite procedure during checking.

To understand how we now apply the soundness theorem in a tactic, it is important to note how the Coq kernel builds in reduction strategies. These strategies have, to an extent, been tuned to work well to show equivalence between a simple denotational-

semantics application and the semantic value it produces. In contrast, it is rather difficult to code up one reduction strategy that works well for all partial-evaluation tasks. Therefore, we should restrict ourselves to (1) running full reduction in the style of functional-language interpreters and (2) running normal reduction on “known-good” goals like correctness of evaluation of a denotational semantics on a concrete input.

Operationally, then, we apply our tactic in a goal containing a term 푒 that we want to partially evaluate. In standard proof-by-reflection style, we reify 푒 into some 퐸 where ⟦퐸⟧ = 푒, replacing 푒 accordingly, asking Coq's kernel to validate the equivalence via standard reduction. Now we use the Rewrite correctness theorem to replace ⟦퐸⟧ with ⟦Rewrite(퐸)⟧. Next we may ask the Coq kernel to simplify Rewrite(퐸) by full reduction via compilation to native code, since we carefully designed Rewrite(퐸) and its dependencies to produce closed syntax trees, so that reduction will not get stuck pattern-matching on free variables. Finally, where 퐸′ is the result of that reduction, we simplify ⟦퐸′⟧ with standard reduction, producing a normal-looking Coq term.

4.4.2 Subterm Sharing Is Crucial

For some large-scale partial-evaluation problems, it is important to represent output programs with sharing of common subterms. Redundantly inlining shared subterms can lead to exponential increase in space requirements. Consider the Fiat Cryptography [Erb+19] example of generating a 64-bit implementation of field arithmetic for the P-256 elliptic curve. The library has been converted manually to continuation-passing style, allowing proper generation of let binders, whose variables are often mentioned multiple times. We ran their code generator (actually just a subset of its functionality, but optimized by us a bit further, as explained in Subsection 4.5.2) on the P-256 example and found it took about 15 seconds to finish. Then we modified reduction to inline let binders instead of preserving them, at which point the reduction job terminated with an out-of-memory error, on a machine with 64 GB of RAM. (The successful run uses under 2 GB.)

We see a tension here between performance and niceness of library implementation. In our original implementation of Fiat Cryptography, we found it necessary to CPS-convert our code to coax Coq into adequate reduction performance. Then all of our correctness theorems were complicated by reasoning about continuations. It feels like a slippery slope on the path to implementing a domain-specific compiler, rather than taking advantage of the pleasing simplicity of partial evaluation on natural functional programs. Our reduction engine takes shared-subterm preservation seriously while applying to libraries in direct style.

Our approach is let-lifting: we lift lets to top level, so that applications of functions

to lets are available for rewriting. For example, we can perform the rewriting

map (휆푥. 푦 + 푥) (let 푧 ∶= 푒 in [0; 1; 2; 푧; 푧 + 1])
    ⤳ let 푧 ∶= 푒 in [푦; 푦 + 1; 푦 + 2; 푦 + 푧; 푦 + (푧 + 1)]

using the rules

map ?푓 [] → []
map ?푓 (?푥 ∶∶ ?푥푠) → 푓 푥 ∶∶ map 푓 푥푠
?푛 + 0 → 푛
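For concreteness, rules like these are stated to the framework as ordinary Coq lemmas that get reified. Here is a hedged sketch of how the three rules above might be written as such lemmas; the lemma names are ours, and we state the arithmetic rule over Z on the assumption that the benchmark's number type is Z.

Require Import Coq.Lists.List Coq.ZArith.ZArith.
Import ListNotations. Local Open Scope Z_scope.

(* Sketch: the three rules from the example, as ordinary provable equations. *)
Lemma rule_map_nil {A B} (f : A -> B) : map f [] = [].
Proof. reflexivity. Qed.

Lemma rule_map_cons {A B} (f : A -> B) (x : A) (xs : list A)
  : map f (x :: xs) = f x :: map f xs.
Proof. reflexivity. Qed.

Lemma rule_add_0_r (n : Z) : n + 0 = n.
Proof. apply Z.add_0_r. Qed.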

Our approach is to define a telescope-style type family called UnderLets:

Inductive UnderLets {var} (T : Type) :=
| Base (v : T)
| UnderLet {A} (e : @expr var A) (f : var A -> UnderLets T).

A value of type UnderLets T is a series of let binders (where each expression e may mention earlier-bound variables) ending in a value of type T. It is easy to build various “smart constructors” working with this type, for instance to construct a function application by lifting the lets of both function and argument to a common top level.
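As a hedged illustration (not the framework's actual definitions), here is one way such smart constructors might look, assuming the expr and UnderLets definitions above; the names splice and App_UnderLets are ours, and we first make the constructors' parameters implicit so the sketch is self-contained.

(* Sketch only: make the constructor parameters implicit, as assumed by the
   code in the rest of this chapter. *)
Arguments Base {var T} v.
Arguments UnderLet {var T A} e f.
Arguments App {var s d} f x.

(* Run a continuation underneath a telescope of lets, keeping the lets. *)
Fixpoint splice {var T T'} (x : @UnderLets var T)
         (k : T -> @UnderLets var T') {struct x} : @UnderLets var T' :=
  match x with
  | Base v => k v
  | UnderLet e f => UnderLet e (fun v => splice (f v) k)
  end.

(* Build an application node after lifting the lets of both sides. *)
Definition App_UnderLets {var s d}
           (f : @UnderLets var (@expr var (s → d)))
           (x : @UnderLets var (@expr var s))
  : @UnderLets var (@expr var d) :=
  splice f (fun f' => splice x (fun x' => Base (App f' x'))).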

Such constructors are used to implement an NbE strategy that outputs UnderLets telescopes. Recall that the NbE type interpretation mapped base types to expression syntax trees. We now parameterize that type interpretation by a Boolean declaring whether we want to introduce telescopes.

Fixpoint nbeT' {var} (with_lets : bool) (t : type) :=
  match t with
  | base t => if with_lets then @UnderLets var (@expr var t) else @expr var t
  | arrow s d => nbeT' false s -> nbeT' true d
  end.

Definition nbeT := nbeT' false.
Definition nbeT_with_lets := nbeT' true.

There are cases where naïve preservation of let binders blocks later rewrites from triggering and leads to suboptimal performance, so we include some heuristics. For instance, when the expression being bound is a constant, we always inline. When the expression being bound is a series of list “cons” operations, we introduce a name for each individual list element, since such a list might be traversed multiple times in different ways.

4.4.3 Rules Need Side Conditions

Many useful algebraic simplifications require side conditions. One simple case is supporting nonlinear patterns, where a pattern variable appears multiple times. We can encode nonlinearity on top of linear patterns via side conditions.

?푛1 + ?푚 − ?푛2 → 푚 if 푛1 = 푛2

The trouble is how to support predictable solving of side conditions during partial evaluation, where we may be rewriting in open terms. We decided to sidestep this problem by allowing side conditions only as executable Boolean functions, to be applied only to variables that are confirmed as compile-time constants, unlike Malecha and Bengtson [MB16] who support general unification variables. We added a variant of pattern variable that only matches constants. Semantically, this variable style has no additional meaning, and in fact we implement it as a special identity function that should be called in the right places within Coq lemma statements. Rather, use of this identity function triggers the right behavior in our tactic code that reifies lemma statements. We introduce a notation where a prefixed apostrophe signals a call to the “constants only” function.
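As a sketch with hypothetical names (the framework's actual marker and notation differ in detail), the marker can be nothing more than an identity function wrapped around the side-condition variables of a lemma:

Require Import ZArith Lia. Local Open Scope Z_scope.

(* Sketch: identity function whose only role is to tell reification that the
   wrapped pattern variable must be a compile-time constant. *)
Definition constant_only {A} (x : A) : A := x.

(* The nonlinear rule from above, encoded with a decidable side condition. *)
Lemma rule_sub_add_cancel (n1 m n2 : Z)
  : n1 = n2 -> constant_only n1 + m - constant_only n2 = m.
Proof. unfold constant_only; intros; subst; lia. Qed.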

Our reification inspects the hypotheses of lemma statements, using type classes to find decidable realizations of the predicates that are used, synthesizing one Boolean expression of our deeply embedded term language, standing for a decision procedure for the hypotheses. The Make command fails if any such expression contains pattern variables not marked as constants. Therefore, matching of rules can safely run side conditions, knowing that Coq's full-reduction engine can determine their truth efficiently.

4.4.4 Side Conditions Need Abstract Interpretation

With our limitation that side conditions are decided by executable Boolean procedures, we cannot yet handle directly some of the rewrites needed for realistic partial evaluation. For instance, Fiat Cryptography reduces high-level functional programs to low-level code that only uses integer types available on the target hardware. The starting library code works with arbitrary-precision integers, while the generated low-level code should be careful to avoid unintended integer overflow. As a result, the setup may be too naïve for our running example rule ?푛 + 0 → 푛. When we get to reducing fixed-precision-integer terms, we must be legalistic:

add_with_carry64(?푛, 0) → (0, 푛) if 0 ≤ 푛 < 2⁶⁴

We developed a design pattern to handle this kind of rule.

First, we introduce a family of functions clip푙,푢, each of which forces its integer argument to respect lower bound 푙 and upper bound 푢. Partial evaluation is proved with respect to unknown realizations of these functions, only requiring that clip푙,푢(푛) = 푛 when 푙 ≤ 푛 < 푢. Now, before we begin partial evaluation, we can run a verified abstract interpreter to find conservative bounds for each program variable. When bounds 푙 and 푢 are found for variable 푥, it is sound to replace 푥 with clip푙,푢(푥). Therefore, at the end of this phase, we assume all variable occurrences have been rewritten in this manner to record their proved bounds.
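A minimal sketch of what one concrete realization of such a family could look like, for illustration only; the framework relies on nothing beyond the stated law.

Require Import ZArith Lia. Local Open Scope Z_scope.

(* Sketch: clip l u n forces n into [l, u); the only fact partial evaluation
   needs is that clip l u n = n whenever l <= n < u. *)
Definition clip (l u n : Z) : Z :=
  if andb (l <=? n) (n <? u) then n else l.

Lemma clip_id (l u n : Z) : l <= n < u -> clip l u n = n.
Proof.
  unfold clip; intros [Hl Hu].
  destruct (Z.leb_spec l n), (Z.ltb_spec n u); simpl; lia.
Qed.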

Second, we proceed with our example rule refactored:

add_with_carry64(clip'?푙,'?푢(?푛), 0) → (0, clip푙,푢(푛)) if 푢 < 2⁶⁴

If the abstract interpreter did its job, then all lower and upper bounds are constants, and we can execute side conditions straightforwardly during pattern matching.

4.4.5 Limitations and Preprocessing

We now note some details of the rewriting framework that were previously glossed over, which are useful for using the code or implementing something similar, but which do not add fundamental capabilities to the approach. Although the rewriting framework does not support dependently typed constants, we can automatically preprocess uses of eliminators like nat_rect and list_rect into nondependent versions. The tactic that does this preprocessing is extensible via ℒtac's reassignment feature. Since pattern-matching compilation mixed with NbE requires knowing how many arguments a constant can be applied to, internally we must use a version of the recursion principle whose type arguments do not contain arrows; current preprocessing can handle recursion principles with either no arrows or one arrow in the motive.

Recall from Subsection 4.1.1 that eval_rect is a definition provided by our framework for eagerly evaluating recursion associated with certain types. It functions by triggering typeclass resolution for the lemmas reducing the recursion principle associated to the given type. We provide instances for nat, prod, list, option, and bool. Users may add more instances if they desire.

Recall again from Subsection 4.1.1 that we use ident.eagerly to ask the reducer to simplify a case of primitive recursion by complete traversal of the designated argument's constructor tree. Our current version only allows a limited, hard-coded set of eliminators with ident.eagerly (nat_rect on return types with either zero or one arrows, list_rect on return types with either zero or one arrows, and List.nth_default), but nothing in principle prevents automatic generation of the necessary code.

Note that Let_In is the constant we use for writing let ⋯ in ⋯ expressions that do not reduce under 휁 (Coq’s reduction rule for let-inlining). Throughout most of

this chapter, anywhere that let ⋯ in ⋯ appears, we have actually used Let_In in the code. It would alternatively be possible to extend the reification preprocessor to automatically convert let ⋯ in ⋯ to Let_In, but this strategy may cause problems when checking conversion between the interpretation of the reified term and the prereified term, as Coq's conversion does not allow fine-tuning of when to inline or unfold lets.
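For reference, a common minimal rendering of such a constant (ours, for illustration; the framework's actual definition is dependently typed and comes with notation):

(* Sketch: because Let_In is an ordinary definition rather than a primitive
   let, 휁-reduction does not inline it; it only disappears when Let_In itself
   is explicitly unfolded. *)
Definition Let_In {A B} (x : A) (f : A -> B) : B := f x.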

4.5 Evaluation

Our implementation, available on GitHub at https://github.com/mit-plv/rewriter with a roadmap in Appendix B.3, includes a mix of Coq code for the proved core of rewriting, tactic code for setting up proper use of that core, and OCaml plugin code for the manipulations beyond the current capabilities of the tactic language. We report here on experiments to isolate performance benefits for rewriting under binders and reducing higher-order structure.

4.5.1 Microbenchmarks

We start with microbenchmarks focusing attention on particular aspects of reduction and rewriting, with Appendix B.2 going into more detail.

Rewriting Without Binders

Consider the code defined by the expression tree푛,푚(푣) in Figure 4-3. We want to remove all of the +0s. There are Θ(푚 ⋅ 2ⁿ) such rewriting locations. We can start from this expression directly, in which case reification alone takes as much time as Coq's rewrite. As the reification method was not especially optimized, and there exist fast reification methods [GEC18], we instead start from a call to a recursive function that generates such an expression.

iter푚(푣) = 푣 + 0 + 0 + ⋯ + 0    (푚 additions of 0)
tree0,푚(푣) = iter푚(푣 + 푣)
tree푛+1,푚(푣) = iter푚(tree푛,푚(푣) + tree푛,푚(푣))

Figure 4-3: Expressions computing initial code
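A sketch of such a generator, under the assumption that the benchmark's number type is Z (the function names are ours):

Require Import ZArith. Local Open Scope Z_scope.

(* Sketch of generators for iter_m and tree_{n,m} from Figure 4-3. *)
Fixpoint iter (m : nat) (v : Z) : Z :=
  match m with
  | O => v
  | S m' => iter m' v + 0   (* appends one more "+ 0" *)
  end.

Fixpoint tree (n m : nat) (v : Z) : Z :=
  match n with
  | O => iter m (v + v)
  | S n' => iter m (tree n' m v + tree n' m v)
  end.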

Figure 4-4a on the facing page shows the results for 푛 = 3 as we scale 푚. The comparison points are Coq's rewrite!, setoid_rewrite, and rewrite_strat. The first two perform one rewrite at a time, taking minimal advantage of commonalities across them and thus generating quite large, redundant proof terms. The third makes top-down or bottom-up passes with combined generation of proof terms. For our own approach, we list both the total time and the time taken for core execution of a verified rewrite engine, without counting reification (converting goals to ASTs) or its inverse (interpreting results back to normal-looking goals).

92 60 6

40 4 time (s) time (s) 2 20

0 0 0 5,000 10,000 15,000 0 1,000 2,000 3,000 4,000 5,000 # of rewrite locations # of let binders rewrite strat bottomup rewrite strat bottomup setoid rewrite rewrite strat topdown rewrite strat topdown setoid rewrite rewrite! Our approach including reification, cbv, etc. Our approach including reification, cbv, etc. Our approach (only rewriting) Our approach (only rewriting) (a) No binders (b) Nested binders

25 101 20

15

0 10 10 time (s)

5

0 10−1 0 5,000 10,000 15,000 20,000

slowdown factor over old approach 88 182 276 370 464 558 푛 ⋅ 푚 2 2 2 2 2 2 rewrite strat bottomup prime rewrite strat topdown Our approach w/ Coq’s VM repeat setoid rewrite Old approach (handwritten-CPS+VM) Our approach including reification, cbv, etc. Our approach w/ extracted OCaml Our approach (only rewriting) cps+vm compute (c) Binders and recursive functions (d) Fiat Cryptography Figure 4-4: Timing of different partial-evaluation implementations

The comparison here is very favorable for our approach so long as 푚 > 2. The competing tactics spike upward toward timeouts at just around a thousand rewrite locations, while our engine is still under two seconds for examples with tens of thousands of rewrite locations. When 푚 < 2, Coq's rewrite! tactic does a little bit better than our engine, corresponding roughly to the overhead incurred by our term representation (which, for example, stores the types at every application node) when most of the term is in fact unchanged by rewriting. See Appendix B.1.1 for more detailed plots.

Rewriting Under Binders

Consider now the code in Figure 4-5, which is a version of the code above where redundant expressions are shared via let bindings.

let 푣1 ∶= 푣0 + 푣0 + 0 in
⋮
let 푣푛 ∶= 푣푛−1 + 푣푛−1 + 0 in
푣푛 + 푣푛 + 0

Figure 4-5: Initial code

Figure 4-4b on the previous page shows the results. The comparison here is again very favorable for our approach. The competing tactics spike upward toward timeouts at just a few hundred generated binders, while our engine is only taking about 10 seconds for examples with 5,000 nested binders.

Performance Bottlenecks of Proof-Producing Rewriting

Although we have made our comparison against the built-in tactics setoid_rewrite and rewrite_strat, by analyzing the performance in detail, we can argue that these performance bottlenecks are likely to hold for any proof assistant designed like Coq. Detailed debugging reveals five performance bottlenecks in the existing rewriting tactics. (This section goes into detail that readers not interested in proof-assistant minutiae may want to skip, turning ahead to Binders and Recursive Functions on page 97.)

Bad performance scaling in sizes of existential-variable contexts

We found that even when there are no occurrences fully matching the rule, setoid_rewrite can still be cubic in the number of binders (or, more accurately, quadratic in the number of binders with an additional multiplicative linear factor of the number of head-symbol matches). Rewriting without any successful matches takes nearly as much time as setoid_rewrite in this microbenchmark; by the time we are looking at goals with 400 binders, the difference is less than 5%.

We posit that this overhead comes from setoid_rewrite looking for head-symbol matches and then creating evars (existential variables) to instantiate the arguments of the lemmas for each head-symbol-match location; hence even if there are no matches

of the rule as a whole, there may still be head-symbol matches. Since Coq uses a locally nameless representation [Ayd+08] for its terms, evar contexts are necessarily represented as named contexts. Representing a substitution between named contexts takes linear space, even when the substitution is trivial, and hence each evar incurs overhead linear in the number of binders above it. Furthermore, fresh-name generation in Coq is quadratic in the size of the context, and since evar-context creation uses fresh-name generation, the additional multiplicative factor likely comes from fresh-name generation. (Note, though, that this pattern suggests that the true performance is quartic rather than merely cubic. However, doing a linear regression on a log-log plot of the data suggests that the performance is genuinely cubic rather than quartic.) See Coq issue #12524 for more details.

Note that this overhead is inherent to the use of a locally nameless term representation. To fix it, Coq would likely have to represent identity evar contexts using a compact representation, which is only naturally available for de Bruijn representations. Any rewriting system that uses unification variables with a locally nameless (or named) context will incur at least quadratic overhead on this benchmark.

Note that rewrite_strat uses exactly the same rewriting engine as setoid_rewrite, just with a different strategy. We found that setoid_rewrite and rewrite_strat have identical performance when there are no matches and generate identical proof terms when there are matches. Hence we can conclude that the difference in performance between rewrite_strat and setoid_rewrite is entirely due to an increased number of failed rewrite attempts.

Proof-term size

Setting aside the performance bottleneck in constructing the matches in the first place, we can ask the question: how much cost is associated to the proof terms? One way to ask this question in Coq is to see how long it takes to run Qed. While Qed time is asymptotically better, it is still quadratic in the number of binders. This outcome is unsurprising, because the proof-term size is quadratic in the number of binders. On this microbenchmark, we found that Qed time hits one second at about 250 binders, and using the best-fit quadratic line suggests that it would hit 10 seconds at about 800 binders and 100 seconds at about 2 500 binders. While this may be reasonable for the microbenchmarks, which only contain as many rewrite occurrences as there are binders, it would become unwieldy to try to build and typecheck such a proof with a rule for every primitive reduction step, which would be required if we want to avoid manually CPS-converting the code in Fiat Cryptography.

The quadratic factor in the proof term comes because we repeat subterms of the goal linearly in the number of rewrites. For example, if we want to rewrite f (f x) into g (g x) by the equation ∀ x, f x = g x, then we will first rewrite f x into g x, and then rewrite f (g x) into g (g x). Note that g x occurs three times (and will continue to occur in every subsequent step).

Poor subterm sharing

How easy is it to share subterms and create a linearly sized proof? While it is relatively straightforward to share subterms using let binders when the rewrite locations are not under any binders, it is not at all obvious how to share subterms when the terms occur under different binders. Hence any rewriting algorithm that does not find a way to share subterms across different contexts will incur a quadratic factor in proof-building and proof-checking time, and we expect this factor will be significant enough to make applications to projects as large as Fiat Crypto infeasible.

Overhead from the let typing rule

Suppose we had a proof-producing rewriting algorithm that shared subterms even under binders. Would it be enough? It turns out that even when the proof size is linear in the number of binders, the cost to typecheck it in Coq is still quadratic! The reason is that when checking that f : T in a context x := v, to check that let x := v in f has type T (assuming that x does not occur in T), Coq will substitute v for x in T. So if a proof term has 푛 let binders (e.g., used for sharing subterms), Coq will perform 푛 substitutions on the type of the proof term, even if none of the let binders are used. If the number of let binders is linear in the size of the type, there is quadratic overhead in proof-checking time, even when the proof-term size is linear.

We performed a microbenchmark on a rewriting goal with no binders (because there is an obvious algorithm for sharing subterms in that case) and found that the proof-checking time reached about one second at about 2 000 binders and reached 10 seconds at about 7 000 binders. While these results might seem good enough for Fiat Cryptography, we expect that there are hundreds of thousands of primitive reduction/rewriting steps even when there are only a few hundred binders in the output term, and we would need let binders for each of them. Furthermore, we expect that getting such an algorithm correct would be quite tricky.

Fixing this quadratic bottleneck would, as far as we can tell, require deep changes in how Coq is implemented; it would either require reworking all of Coq to operate on some efficient representation of delayed substitutions paired with unsubstituted terms, or else it would require changing the typing rules of the type theory itself to remove this substitution from the typing rule for let. Note that there is a similar issue that crops up for function application and abstraction.

Inherent advantages of reflection

Finally, even if this quadratic bottleneck were fixed, Aehlig, Haftmann, and Nipkow [AHN08] reported a 10×–100× speed-up over the simp tactic in Isabelle, which performs all of the intermediate rewriting steps via the kernel API. Their results suggest that even if all of the superlinear bottlenecks were fixed—no small undertaking—rewriting and partial evaluation via reflection might still be orders of magnitude faster than any proof-term-generating tactic.

Binders and Recursive Functions

The next experiment uses the code in Figure 4-6. Note that the let ⋯ in ⋯ binding blocks further reduction of map_dbl when we iterate it 푚 times in make, and so we need to take care to preserve sharing when reducing here.

map_dbl(ℓ) ≔
    []                                     if ℓ = []
    let 푦 ∶= ℎ + ℎ in 푦 ∶∶ map_dbl(푡)      if ℓ = ℎ ∶∶ 푡

make(푛, 푚, 푣) ≔
    [푣, …, 푣]    (푛 copies)                if 푚 = 0
    map_dbl(make(푛, 푚 − 1, 푣))             if 푚 > 0

example푛,푚 ≔ ∀푣, make(푛, 푚, 푣) = []

Figure 4-6: Initial code for binders and recursive functions

Figure 4-4c on page 93 compares performance between our approach, repeat setoid_rewrite, and two variants of rewrite_strat. Additionally, we consider another option, which was adopted by Fiat Cryptography at a larger scale: rewrite our functions to improve reduction behavior. Specifically, both functions are rewritten in continuation-passing style, which makes them harder to read and reason about but allows standard VM-based reduction to achieve good performance. The figure shows that rewrite_strat variants are essentially unusable for this example, with setoid_rewrite performing only marginally better, while our approach applied to the original, more readable definitions loses ground steadily to VM-based reduction on CPS'd code. On the largest terms (푛 ⋅ 푚 > 20,000), the gap is 6s vs. 0.1s of compilation time, which should often be acceptable in return for simplified coding and proofs, plus the ability to mix proved rewrite rules with built-in reductions. Note that about 99% of the difference between the full time of our method and just the rewriting is spent in the final cbv at the end, used to denote our output term from reified syntax. We blame this performance on the unfortunate fact that reduction in Coq is quadratic in the number of nested binders present; see Coq bug #11151. See Appendix B.2.3 for more on this microbenchmark.
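A hedged Coq rendering of Figure 4-6, writing the let with a 휁-opaque Let_In constant as in Subsection 4.4.5 (repeated here so the sketch is self-contained) so that ordinary reduction cannot simply inline the binding:

Require Import List ZArith. Import ListNotations. Local Open Scope Z_scope.

(* Sketch; the actual benchmark definitions may differ in detail. *)
Definition Let_In {A B} (x : A) (f : A -> B) : B := f x.  (* 휁-opaque let *)

Fixpoint map_dbl (l : list Z) : list Z :=
  match l with
  | [] => []
  | h :: t => Let_In (h + h) (fun y => y :: map_dbl t)
  end.

Fixpoint make (n m : nat) (v : Z) : list Z :=
  match m with
  | O => repeat v n
  | S m' => map_dbl (make n m' v)
  end.

Definition example (n m : nat) : Prop := forall v, make n m v = [].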

Full Reduction

The final experiment involves full reduction in computing the Sieve of Eratosthenes, taking inspiration for the benchmark choice from Aehlig, Haftmann, and Nipkow [AHN08]. We find in Figure 4-7 that we are slower than vm_compute, native_compute, and cbv, but faster than lazy, and of course much faster than simpl and cbn, which are quite slow.

[Figure 4-7: Full evaluation, Sieve of Eratosthenes — time (s) vs. upper bound, comparing simpl, cbn, lazy, our approach (including reification, cbv, etc.), our approach (only rewriting), cbv, vm_compute, and native_compute.]

4.5.2 Macrobenchmark: Fiat Cryptography

Finally, we consider an experiment (described in more detail in Appendix B.1.2) replicating the generation of performance-competitive finite-field-arithmetic code for all popular elliptic curves by Erbsen et al. [Erb+19]. In all cases, we generate essentially the same code as we did at the time of publishing that paper, so we only measure performance of the code-generation process. We stage partial evaluation with three different reduction engines (i.e., three Make invocations), respectively applying 85, 56, and 44 rewrite rules (with only 2 rules shared across engines), taking total time of about 5 minutes to generate all three engines. These engines support 95 distinct function symbols.

Figure 4-4d on page 93 graphs running time of three different partial-evaluation methods for Fiat Cryptography, as the prime modulus of arithmetic scales up. Times are normalized to the performance of the original method, which relied entirely on standard Coq reduction. Actually, in the course of running this experiment, we found a way to improve the old approach for a fairer comparison. It had relied on Coq's configurable cbv tactic to perform reduction with selected rules of the definitional equality, which we had applied to blacklist identifiers that should be left for compile-time execution. By instead hiding those identifiers behind opaque module-signature ascription, we were able to run Coq's more-optimized virtual-machine-based reducer.

As the figure shows, our approach running partial evaluation inside Coq’s kernel begins with about a 10× performance disadvantage vs. the original method. With log scale on both axes, we see that this disadvantage narrows to become nearly negligible for the largest primes, of around 500 bits. (We used the same set of prime moduli as in the experiments run by Erbsen et al. [Erb+19], which were chosen based on searching the archives of an elliptic-curves mailing list for all prime numbers.) It makes sense that execution inside Coq leaves our new approach at a disadvantage, as we are essentially running an interpreter (our normalizer) within an interpreter (Coq’s kernel), while the old approach ran just the latter directly. Also recall that the old approach required rewriting Fiat Cryptography’s library of arithmetic functions in continuation-passing style, enduring this complexity in library correctness proofs, while our new approach applies to a direct-style library. Finally, the old approach included a custom reflection-based arithmetic simplifier for term syntax, run after traditional reduction, whereas now we are able to apply a generic engine that combines

both, without requiring more than proving traditional rewrite rules.

The figure also confirms a clear performance advantage of running reduction in code extracted to OCaml, which is possible because our plugin produces verified code in Coq's functional language. By the time we reach middle-of-the-pack prime size around 300 bits, the extracted version is running about 10× as quickly as the baseline.

4.5.3 Experience vs. Lean and setoid_rewrite

Although all of our toy examples work with setoid_rewrite or rewrite_strat (until the terms get too big), even the smallest of examples in Fiat Cryptography fell over using these tactics.

When attempting to use setoid_rewrite for partial evaluation and rewriting on unsaturated Solinas with 1 limb on small primes (such as 2⁶¹ − 1), we were able to get setoid_rewrite to finish after about 100 seconds. Trying to synthesize code for two limbs on slightly larger primes (such as 2¹⁰⁷ − 1, which needs two limbs on a 64-bit machine) took about 10 minutes; three limbs took just under 3.5 hours, and four limbs failed to synthesize with an out-of-memory error after using over 60 GB of RAM. The widely used primes tend to have around five to ten limbs. See Coq bug #13576 for more details and for updates.

The rewrite_strat tactic, which does not require duplicating the entire goal at each rewriting step, fared a bit better. Small primes with 1 limb took about 90 seconds, but further performance tuning of the typeclass instances dropped this time down to 11 seconds. The bugs in rewrite_strat made finding the right magic invocation quite painful, nonetheless; the invocation we settled on involved sixteen consecutive calls to rewrite_strat with varying arguments and strategies. Two limbs took about 90 seconds, three limbs took a bit under 10 minutes, and four limbs took about 70 minutes and about 17 GB of RAM. Extrapolating out the exponential asymptotics of the fastest-growing subcall to rewrite_strat indicates that 5 limbs would take 11–12 hours, 6 limbs would take 10–11 days, 7 limbs would take 31–32 weeks, 8 limbs would take 13–14 years, 9 limbs would take 2–3 centuries, 10 limbs would take 6–7 millennia, 15 limbs would take 2–3 times the age of the universe, and 17 limbs, the largest example we might find at present in the real world, would take over 1000× the age of the universe! See Coq bug #13708 for more details and updates.

The experiments using setoid_rewrite and rewrite_strat can be found at https://github.com/coq-community/coq-performance-tests/blob/v1.0.1/src/fiat_crypto_via_setoid_rewrite_standalone.v.

We also tried Lean, in the hopes that rewriting in Lean, specifically optimized for performance, would be up to the challenge. Although Lean performed about 30% better than Coq's setoid_rewrite on the 1-limb example, taking a bit under a

minute, it did not complete on the two-limb example even after four hours (after which we stopped trying), and a five-limb example was still going after 40 hours.

Our experiments with running rewrite in Lean on the Fiat Cryptography code can be found on the lean branch of the Fiat Cryptography repository at https://github.com/mit-plv/fiat-crypto/tree/lean/fiat-crypto-lean. We used Lean version 3.4.2, commit cbd2b6686ddb, Release. Run make in fiat-crypto-lean to run the one-limb example; change open ex to open ex2 to try the two-limb example, or to open ex5 to try the five-limb example.

4.6 Related Work

We have already discussed the work of Aehlig, Haftmann, and Nipkow [AHN08], which introduced the basic structure that our engine shares, but which required a substantially larger trusted code base, did not tackle certain challenges in scaling to large partial-evaluation problems, and did not report any performance experiments in partial evaluation.

We have also mentioned ℛtac [MB16], which implements an experimental reflective version of rewrite strat supporting arbitrary setoid relations, unification variables, and arbitrary semidecidable side conditions solvable by other reflective tactics, using de Bruijn indexing to manage binders. We were unfortunately unable to get the rewriter to work with Coq 8.10 and were also not able to determine from the paper how to repurpose the rewriter to handle our benchmarks.

Our implementation builds on fast full reduction in Coq's kernel, via a virtual machine [GL02] or compilation to native code [BDG11]. Especially the latter is similar in adopting an NbE style for full reduction, simplifying even under 휆s, on top of a more traditional implementation of OCaml that never executes preemptively under 휆s. Neither approach unifies support for rewriting with proved rules, and partial evaluation only applies in very limited cases, where functions that should not be evaluated at compile time must have properly opaque definitions that the evaluator will not consult. Neither implementation involved a machine-checked proof suitable to bootstrap on top of reduction support in a kernel providing simpler reduction.

A variety of forms of pragmatic partial evaluation have been demonstrated, with Lightweight Modular Staging [RO10] in Scala as one of the best-known current examples. A kind of type-based overloading for staging annotations is used to smooth the rough edges in writing code that manipulates syntax trees. The LMS-Verify system [AR17] can be used for formal verification of generated code after-the-fact. Typically LMS-Verify has been used with relatively shallow properties (though potentially applied to larger and more sophisticated code bases than we tackle), not scaling to the kinds of functional-correctness properties that concern us here, justifying

investment in verified partial evaluators.

4.7 Future Work

There are a number of natural extensions to our engine. For instance, we do not yet allow pattern variables marked as “constants only” to apply to container datatypes; we limit the mixing of higher-order and polymorphic types, as well as limiting use of first-class polymorphism; we do not support rewriting with equalities of non-fully-applied functions; we only support decidable predicates as rule side conditions, and the predicates may only mention pattern variables restricted to matching constants; we have hardcoded support for a small set of container types and their eliminators; we support rewriting with equality and no other relations (e.g., subset inclusion); and we require decidable equality for all types mentioned in rules. It may be helpful to design an engine that lifts some or all of these limitations, building on the basic structure that we present here.

Chapter 5

Engineering Challenges in the Rewriter

[P]remature optimization is the root of all evil

— Donald E. Knuth [Knu74a, p. 671]

Chapter 4 discussed in detail our framework for building verified partial evaluators, going into the context, motivation, and techniques used to put the framework together. However, there was a great deal of engineering effort that went into building this tool which we glossed over. Much of the engineering effort was mundane, and we elide the details entirely. However, we believe some of the engineering effort serves as a good case study for the difficulties of building proof-based systems at scale. This chapter is about exposing the details relevant to understanding how the bottlenecks and principles identified elsewhere in this dissertation played out in designing and implementing this tool. Note that many of the examples and descriptions in this chapter are highly technical, and we expect the discussion will only be of interest to the motivated reader, familiar with Coq, who wants to see more concrete nontoy examples of the bottlenecks and principles we've been describing; other readers are encouraged to skip this chapter.

While the core rewriting engine of the framework is about 1 300 lines of code, and early simplified versions of the core engine were only about 150 lines of code,¹ the correctness proofs take nearly another 8 000 lines of code! As such, this tool, developed

¹See https://github.com/JasonGross/fiat-crypto/blob/3b3e926e/src/Experiments/RewriteRulesSimpleNat.v for the file src/Experiments/RewriteRulesSimpleNat.v from the branch experiments-small-rewrite-rule-compilation on JasonGross/fiat-crypto on GitHub.

to solve performance scaling issues in verified syntax transformation, itself serves as a good case study of some of the bottlenecks that arise when scaling proof-based engineering projects.

Our discussion in this section is organized by the conceptual structure of the normalization and pattern-matching-compilation engine; we hope that organizing the discussion in this way will make the examples more understandable, motivated, and incremental. We note, however, that many of the challenges fall into the same broad categories that are identified elsewhere in this dissertation: issues arising from the power and (mis)use of dependent types, as introduced in Subsection 1.3.1 (Dependent Types: What? Why? How?); and issues arising from API mismatches, as described in Chapter 7 (Abstraction).

5.1 Prereduction

The two biggest underlying causes of engineering challenges are expression–API mismatch, which we'll discuss in Section 5.2 (NbE vs. Pattern-Matching Compilation: Mismatched Expression APIs and Leaky Abstraction Barriers), and our desire to reduce away known computations in the rewriting engine once and for all when compiling rewriting rules, rather than again and again every time we perform a rewrite. In practice, performing this early reduction nets us an approximately 2× speed-up. We'll now discuss this early reduction and what goes into making it work.

5.1.1 What Does This Reduction Consist Of?

Recall from Subsection 4.3.1 that the core of our rewriting engine consists of three steps:

1. The first step is pattern-matching compilation: we must compile the left-hand sides of the rewrite rules to a decision tree that describes how and in what order to decompose the expression, as well as describing which rewrite rules to try at which steps of decomposition.

2. The second step is decision-tree evaluation, during which we decompose the expression as per the decision tree, selecting which rewrite rules to attempt.

3. The third and final step is to actually rewrite with the chosen rule.

The first step is performed once and for all; it depends only on the rewrite rules and not on the expression we are rewriting in. The second and third steps do, in fact, depend on the expression being rewritten, and it is in these steps that we seek to eliminate needless work early.

The key insight, which allows us to perform this precompilation at all, is that most of the decisions we seek to eliminate depend only on the head identifier of any application.² We thus augment the reduce(푐) constant case of Figure 4-2 in Subsection 4.3.2 by first 휂-expanding the identifier, before proceeding to 휂-expand the identifier application and perform rewriting with rewrite-head once we have an 휂-long form.

²In order to make this simplification, we need to restrict the rewrite rules we support a little bit. In particular, we only support rewrite rules operating on 휂-long applications of concrete identifiers to arguments. This means that we cannot support identifiers with variable arrow structure (e.g., a variadic curry function), nor do we support rewriting expressions like List.map f to List.map g—we only support rewriting List.map f xs to List.map g ys.

Now that we know what the reduction consists of, we can discuss what goes into making the reduction possible and the engineering challenges that arise.

5.1.2 CPS

Due to the pervasive use of Gallina match statements on terms which are not known during this compilation phase, we need to write essentially all of the decision-tree-evaluation code in continuation-passing style. This causes a moderate amount of proof-engineer overhead, distributed over the entire rewriter.

The way that CPS permits reduction under blocked match statements is essentially the same as the way it permits reduction of functions in the presence of unreduced let binders in Subsection 4.4.2 (Subterm Sharing Is Crucial). Consider the expression

option_map List.length (option_map (휆 x. List.repeat x 5) y)

where option_map : (A → B) → option A → option B maps a function over an option, and List.repeat x n creates a list consisting of n copies of x. If we fully reduce this term, we get the Gallina term

match match y with
      | Some x => Some [x; x; x; x; x]
      | None => None
      end with
| Some x => Some ((fix Ffix (x0 : list _) : nat :=
                     match x0 with
                     | [] => 0
                     | _ :: x2 => S (Ffix x2)
                     end) x)
| None => None
end


Consider now a CPS’d version of option_map:

Definition option_map_cps {A B} (f : A → B) (x : option A)
  : ∀ {T}, (option B → T) → T :=
  휆 T cont.
    match x with
    | Some x => cont (Some (f x))
    | None => cont None
    end.

Then we could write the somewhat more confusing term

option_map_cps (휆 x. List.repeat x 5) y (option_map List.length)

whence reduction gives us

match y with
| Some _ => Some 5
| None => None
end

So we see that rewriting terms in continuation-passing style allows reduction to pro- ceed without getting blocked on unknown terms.

Note that if we wanted to pass this list length into a further continuation, we’d need to instead write a term like

휆 cont.
  option_map_cps (휆 x. List.repeat x 5) y
    (휆 ls. option_map_cps List.length ls cont)

which reduces to

휆 cont.
  match y with
  | Some _ => cont (Some 5)
  | None => cont None
  end
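Part of the proof-engineering overhead mentioned at the start of this subsection is relating each CPS'd helper back to its direct-style counterpart. For option_map_cps, the usual correspondence lemma (statement ours, assuming the definition above) is a one-liner:

(* Sketch: the CPS'd helper agrees with the direct-style original. *)
Lemma option_map_cps_correct {A B T} (f : A -> B) (x : option A)
      (cont : option B -> T)
  : option_map_cps f x cont = cont (option_map f x).
Proof. destruct x; reflexivity. Qed.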

5.1.3 Type Codes

The pattern-matching-compilation algorithm of Aehlig, Haftmann, and Nipkow [AHN08] does not deal with types. In general, unification of types is somewhat more complicated than unification of terms, because types are used as indices in terms whereas nothing gets indexed over the terms. We have two options here:

1. We can treat terms and types as independent and untyped, simply collecting a map of unification variables to types, checking nonlinear occurrences (such as the types in @fst ?A ?B (@pair ?A ?B ?x ?y)) for equality, and running a typechecking pass afterwards to reconstruct well-typedness. In this case, we would consider the rewriting to have failed if the replacement is not well-typed.

2. We can perform matching on types first, taking care to preserve typing informa- tion, and then perform matching on terms afterwards, taking care to preserve typing information.

The obvious trade-off between these options is that the former option requires doing more work at runtime, because we end up doing needless comparisons that we could know in advance will always turn out a particular way. Importantly, note that Coq’s reduction will not be able to reduce away these runtime comparisons; reduction alone is not enough to deduce that a Boolean equality function defined by recursion will return true when passed identical arguments, if the arguments are not also concrete terms.

Following standard practice in dependently typed languages, we chose the second option. We now believe that this was a mistake, as it’s fiendishly hard to deconstruct the expressions in a way that preserves enough typing information to completely avoid the need to compare type codes for equality and cast across proofs. For example, to preserve typing information when matching for @fst ?A ?B (@pair ?A ?B ?x ?y), we would have to end up with the following match statement. Note that the reader is not expected to understand this statement, and the author was only able to construct it with some help from Coq’s typechecker.

| App f v =>
  let f := match f in expr t return option (ident t) with
           | Ident idc => Some idc
           | _ => None
           end in
  match f with
  | Some maybe_fst =>
    match v in expr s return ident (s -> _) -> _ with
    | App f y =>
      match f in expr _s
            return match _s with arrow b _ => expr b | _ => unit end
                   -> match _s with arrow _ ab => ident (ab -> _) | _ => unit end
                   -> _
      with
      | App f x =>
        let f := match f in expr t return option (ident t) with
                 | Ident idc => Some idc
                 | _ => None
                 end in
        match f with
        | Some maybe_pair =>
          match maybe_pair in ident t
                return match t with arrow a _ => expr a | _ => unit end
                       -> match t with arrow a (arrow b _) => expr b | _ => unit end
                       -> match t with arrow a (arrow b ab) => ident (ab -> _) | _ => unit end
                       -> _
          with
          | @pair a b =>
            fun (x : expr a) (y : expr b) (maybe_fst : ident _) =>
              let is_fst := match maybe_fst with fst => true | _ => false end in
              if is_fst
              then … (* now we can finally do something with a, b, x, and y *)
              else …
          | _ => …
          end x
        | None => …
        end
      | _ => …
      end y
    | _ => …
    end maybe_fst
  | None => …
  end

This is quite the mouthful.

Furthermore, there are two additional complications. First, this sort of match expression must be generated automatically. Since pattern-matching evaluation happens on lists of expressions, we'd need to know exactly what each match reveals about the types of all other expressions in the list. Additionally, in order to allow reduction to happen where it should, we need to make sure to match the head identifier first, without

convoying it across matches on unknown variables. Note that in the code above, we did not follow this requirement, as it would complicate the return clauses even more (presuming we wanted to propagate typing information as we'd have to in the general case rather than cutting corners). The convoy pattern, for those unfamiliar with it, is explained in detail in Chapter 8 (“More Dependent Types”) of Certified Programming with Dependent Types [Chl13].

Second, trying to prove anything about functions written like this is an enormous pain. Because of the intricate dependencies in typing information involved in the convoy pattern, Coq’s destruct tactic is useless. The dependent destruction tactic is sometimes able to handle such goals, but even when it can, it often introduces a dependency on the axiom JMeq_eq, which is equivalent to assuming uniqueness of identity proofs (UIP), that all proofs of equality are equal—note that this contradicts, for example, the popular univalence axiom of homotopy type theory [Uni13]. In order to prove anything about such functions without assuming UIP, the proof effectively needs to replicate the complicated return clauses of the function definition. However, since they are not to be replicated exactly but merely be generated from the same insights, such proof terms often have to be written almost entirely by hand. These proofs are furthermore quite hard to maintain, as even small changes in the structure of the function often require intricate changes in the proof script.

Due to a lack of foresight and an unfortunate reluctance to take the design back to the drawing board after we already had working code, we ended up mixing these two approaches, getting, not quite the worst of both worlds, but definitely a significant fraction of the pain of both worlds: We must deal with both the pain of indexing our term unification information over our type unification information, and we must still insert typecasts in places where we have lost the information that the types will line up.

5.1.4 How Do We Know What We Can Unfold?

Coq's built-in reduction is somewhat limited, especially when we want it to have reasonable performance. This is, after all, a large part of the problem this tool is intended to solve.

In practice, we make use of three reduction passes; that we cannot interleave them is a limitation of the built-in reduction:

1. First, we unfold everything except for a specific list of constants; these constants are the ones that contain computations on information not fully known at preevaluation time.

2. Next, we unfold all instances of a particular set of constants; these constants are the ones that we make sure to only use when we know that inlining them won't incur extra overhead.

3. Finally, we use cbn to simplify a small set of constants in only the locations where these constants are applied to constructors. (A toy illustration of the three passes follows below.)
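Here is a self-contained toy illustrating the three staged passes, with placeholder constants of our own; the framework's actual constant lists are of course much larger.

(* Sketch only: keep and inline_me are stand-ins for framework constants. *)
Definition keep (n : nat) := n + 1.       (* to be left folded by pass 1 *)
Definition inline_me (n : nat) := n * 2.  (* to be unfolded by pass 1 *)

Goal keep (inline_me 3) = 7.
Proof.
  cbv -[keep].    (* pass 1: unfold everything except keep, giving  keep 6 = 7 *)
  cbv [keep].     (* pass 2: unfold the whitelisted constant, giving  6 + 1 = 7 *)
  cbn [Nat.add].  (* pass 3: simplify add only where applied to constructors *)
  reflexivity.
Qed.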

Ideally, we'd either be able to do the entire simplification in the third step, or we'd be able to avoid the third step entirely. Unfortunately, Coq's reduction is not fast enough to do the former, and the latter requires a significant amount of effort. In particular, the strategy that we'd need to follow is to have two versions of every function which sometimes computes on known data and sometimes computes on unknown data, and we'd need to track in all locations which data is known and which data is unknown.

We already track known and unknown data to some extent (see, for example, the known argument to the rIdent constructor discussed below). Additionally, we have two versions of a couple of functions, such as the bind function of the option monad, where we decide which to use based on, e.g., whether or not the option value that we’re binding will definitely be known at prereduction time.

Note that tracking this sort of information is nontrivial, as there’s no help from the typechecker.

We’ll come back to this in Subsection 5.4.1.

5.2 NbE vs. Pattern-Matching Compilation: Mismatched Expression APIs and Leaky Abstraction Barriers

We introduced normalization by evaluation (NbE) [BS91] in Subsection 4.1.3 and expanded on it in Subsection 4.3.2 as a way to support higher-order reduction of 휆-terms. The termination argument for NbE proceeds by recursion on the type of the term we’re reducing. In particular, the most natural way to define these functions in a proof assistant is to proceed by structural recursion on the type of the term being reduced. This feature suggests that using intrinsically-typed syntax is more natural for NbE, and we saw in Section 3.1.3 that denotation functions are also simpler on syntax that is well-typed by construction.

However, the pattern-matching-compilation algorithm of Maranget [Mar08] inherently operates on untyped syntax. We thus have four options:

(1) use intrinsically well-typed syntax everywhere, paying the cost in the pattern-matching compilation and evaluation algorithm;

110 (2) use untyped syntax in both NbE and rewriting, paying the associated costs in NbE, denotation, and in our proofs;

(3) use intrinsically well-typed syntax in most passes and untyped syntax for pattern-matching compilation;

(4) invent a pattern-matching compilation algorithm that is well-suited to type-indexed syntax.

We ultimately chose option (3). I was not clever enough to follow through on option (4), and while options (1) and (2) are both interesting, option (3) seemed to follow the well-established convention of using whichever datatype is best-suited to the task at hand. As we’ll shortly see, all of these options come with significant costs, and (3) is not as obviously a good choice as it might seem at first glance.

5.2.1 Pattern-Matching Evaluation on Type-Indexed Terms

While the cost of performing pattern-matching compilation on type-indexed terms is noticeable, it's relatively insignificant compared to the cost of evaluating decision trees directly on type-indexed terms. In particular, pattern-matching compilation effectively throws away the type information whenever it encounters it; whether we do this early or late does not matter much, and we only perform this compilation once for any given set of rewrite rules.

By contrast, evaluation of the decision tree needs to produce term ASTs that are used in rewriting, and hence we need to preserve type information in the input. Recall from Subsection 4.3.1 that decision-tree evaluation operates on lists of terms. Here already we hit our first snag: if we want to operate on well-typed terms, we must index our lists over a list of types. This is not so bad, but recall also from Subsection 4.3.1 that decision trees contain four constructors:

• TryLeaf k onfailure: Try the 푘th rewrite rule; if it fails, keep going with onfailure.

• Failure: Abort; nothing left to try.

• Switch icases app_case default: With the first element of the vector, match on its kind. If it is an identifier matching something in icases, which is a list of pairs of identifiers and decision trees, remove the first element of the vector and run that decision tree; if it is an application and app_case is not None, try the app_case decision tree, replacing the first element of each vector with two elements, the function and the argument it is applied to; otherwise, do not modify the vectors and use the default decision tree.

• Swap i cont: Swap the first element of the vector with the 푖th element (0-indexed) and keep going with cont.
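A minimal Coq sketch of this datatype (with raw_ident as a hypothetical stand-in for the untyped identifier codes of Section 5.3; the real development's constructors carry somewhat richer data) is:

Parameter raw_ident : Type.

Inductive decision_tree : Type :=
| TryLeaf (k : nat) (onfailure : decision_tree)
| Failure
| Switch (icases : list (raw_ident * decision_tree))
         (app_case : option decision_tree)
         (default : decision_tree)
| Swap (i : nat) (cont : decision_tree).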

The first two constructors are not very interesting, as far as overhead goes, but the third and fourth constructors mandate quite involved adaptations for operating on well-typed terms.

Note that the type of eval_decision_tree would be something like
∀ {T : Type} (d : decision_tree) (ts : list type) (es : exprlist ts) (K : ℕ → exprlist ts → option T), option T
where the ℕ argument to the continuation indicates which rewrite rule to invoke. We use continuation-passing style here to achieve adequate reduction behavior inside Coq, for the same reason discussed in Section 4.1, just one metalevel up.
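For concreteness, here is a hedged, self-contained sketch of such type-indexed expression vectors, together with the dependently typed hd and tl used below; the Parameter commands are stand-ins for the framework's actual type and expr, and the names are chosen to match the prose rather than the real development.

Parameter type : Type.
Parameter expr : type -> Type.

Inductive exprlist : list type -> Type :=
| enil : exprlist nil
| econs {t ts} : expr t -> exprlist ts -> exprlist (t :: ts).

(* head and tail must themselves be dependently typed, using the usual
   return-annotation trick to rule out the nil case *)
Definition hd {t ts} (es : exprlist (t :: ts)) : expr t
  := match es in exprlist l
           return match l with
                  | nil => unit
                  | t' :: _ => expr t'
                  end
     with
     | enil => tt
     | econs e _ => e
     end.

Definition tl {t ts} (es : exprlist (t :: ts)) : exprlist ts
  := match es in exprlist l
           return match l with
                  | nil => unit
                  | _ :: ts' => exprlist ts'
                  end
     with
     | enil => tt
     | econs _ es' => es'
     end.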

We cover the Swap case first, because it is simpler. To perform a Swap, we must exchange two elements of the type-indexed list. Hence we need both to swap the elements of the list of types and then to have a separate, dependently typed swap function for the vector of expressions. Moreover, since we need to undo the swapping inside the continuation, we must have a separate unswap function on expression vectors which goes from a swapped type list to the original one. We could instead elide the swap node, but then we could no longer use matching, hd, and tl to operate on the expressions and would instead need special operations to do surgery in the middle of the list, in a way that preserves type indexing.

To perform a Switch, we must break apart the first element of our type-indexed list, determining whether it is an application, an identifier, or something else. Note that even with dependent types, we cannot avoid needing a failure case for when the type-indexed list is empty, even though such a case should never occur because good decision trees will never have a Switch node after consuming the entire vector of expressions. This failure case cannot be avoided because there is no type-level relation between the expression vector and the decision tree. This mismatch—the need to include failure cases that one might expect to be eliminated by dependent typing information—is a sign that the amount of dependency in the types is wrong. It may be too little, whence the developer should see if there is a way to incorporate the lack of error into the typing information (which in this case would require indexing the type of the decision tree over the length of the vector3). It may alternatively be too much, and the developer might be well-served by removing more dependency from

3 Note that choosing to index the decision tree over the length of the vector severely complicates our ability to avoid separate swap and unswap functions by indexing into the middle of the vector. We'd need to use some sort of finite type to ensure the indices are not too large, and we'd need to be very careful to write dependently typed middle-of-the-vector surgery operations which are judgmentally invertible on the effects that they have on the length of the vector. Here we see an example of how dependent types introduce coupling between seemingly unrelated design decisions, which is a large part of why abstraction barriers are so essential, as we'll discuss in Section 7.2 (When and How To Use Dependent Types Painlessly).

the types and letting more things fall into the error case.

After breaking apart the first element, we must convoy the continuation across the match statement so that we can pass an expression vector of the correct type to the continuation K. In code, this branch might look something like

…
| Switch icases app_case default =>
  match es in exprlist ts return (exprlist ts → option T) → option T with
  | [] => 휆 _, None
  | e :: es =>
    match e in expr t return (exprlist (t :: ts) → option T) → option T with
    | App s d f x =>
      휆 K, let K' : exprlist ((s → d) :: s :: ts)
               (* new continuation to pass on recursively *)
             := 휆 es', K (App (hd es') (hd (tl es')) :: tl (tl es'))
           in … (* do something with app_case *)
    | Ident t idc =>
      휆 K, let K' : exprlist ts
               (* new continuation to pass on recursively *)
             := 휆 es', K (Ident idc :: es')
           in … (* do something with icases *)
    | _ => 휆 K, … (* do something with default *)
    end
  end K
…

Note that hd and tl must be type-indexed, and we cannot simply match on es' in the App case; there is no way to preserve the connection between the types of the first two elements of es' inside such a match statement.

This may not look too bad, but it gets worse. Since the match on e will not be known until we are actually doing the rewriting on a concrete expression, and the continuation is convoyed across this match, there is no way to evaluate the continuation during compilation of rewrite rules. Since we cannot evaluate the continuation early, we have to be very careful not to duplicate it across all of the decision-tree evaluation cases, as we might otherwise incur a runtime factor superlinear in the number of rewrite rules. As noted in Section 5.1, our early reduction nets us a 2× speedup in runtime of rewriting and is therefore relatively important to be able to do.

Here we see something interesting, which does not appear to be as much of a concern in other programming languages: the representation of our data forces our hand about how much efficiency can be gained from precomputation, even when the representation choices are relatively minor.

5.2.2 Untyped Syntax in NbE

There is no good way around the fact that NbE requires typing information to argue termination. Since NbE will be called on subterms of the overall term, even if we use syntax that is not guaranteed to be type-correct, we must still store the type information in the nodes of the AST.

Furthermore, as we saw in Section 3.1.3 (de Bruijn Indices), converting from untyped syntax to intrinsically typed syntax, as well as writing a denotation function, requires either that all types be nonempty or that we carry around a proof of well-typedness to use during recursion. As discussed in Chapter 7 and specifically in Section 7.2 (When and How To Use Dependent Types Painlessly), needing to mix proofs with programs is often a big warning flag, unless the mixing can be hidden behind a well-designed API. However, if we are going to hide the syntax behind an API that guarantees well-typedness, it seems like we might as well just use intrinsically well-typed syntax, which naturally inhabits that API. Furthermore, unlike in many cases where the API is best treated as opaque everywhere, here the API mixing proofs and programs needs to have adequate behavior under reduction and ought to have good behavior even under partial reduction. This severely complicates the task of building a good abstraction barrier, as we not only need to ensure that the abstraction barrier does not need to be broken in the course of term-building and typechecking, but we must also ensure that it can be broken in a principled way via reduction without introducing significant overhead.

5.2.3 Mixing Typed and Untyped Syntax

The third option is to use whichever datatype is most naturally suited to each pass and to convert between them as necessary. This is the option that we ultimately chose, and, we believe, the one that would be the most natural choice for engineers and developers coming from nondependently typed languages.

There are a number of considerations that arose when fleshing out this design and a number of engineering pain points that we encountered. The theme running through all of these, as we will revisit in Chapter 7, is that imperfectly opaque abstraction barriers cause headaches in a nonlocal manner.

We got lucky, in some sense, that the rewriting pass always has a well-typed default option: do no rewriting. Hence we do not need to worry about carrying around proofs of well-typedness, and this avoids some of the biggest issues described in Subsection 5.2.2 (Untyped Syntax in NbE).

The biggest constraint driving our design decisions is that we need conversion between the two representations to be 풪(1); if we need to walk the entire syntax tree to convert between typed and untyped representations at every rewriting location, we’ll incur quadratic overhead in the size of the term being rewritten. We can actually relax this constraint a little bit: by designing the untyped representation to be completely evaluated away during the compilation of rewrite rules, we can allow conversion from the untyped syntax to the typed syntax to walk any part of the term that already needed to be revealed for rewriting, giving us amortized constant time rather than truly constant time. As such, we need to be able to embed well-typed syntax directly into the nontype-indexed representation at cost 풪(1).

As the entire purpose of the untyped syntax is to (a) allow us to perform matching on the AST to determine which rewrite rule to use, and furthermore (b) allow us to reuse the decomposition work so as to avoid needing to decompose the term multiple times, we need an inductive type which can embed PHOAS expressions and has separate nodes for the structure that we need, namely application and identifiers:

Inductive rawexpr : Type :=
| rIdent (known : bool) {t} (idc : ident t) {t'} (alt : expr t')
| rApp (f x : rawexpr) {t} (alt : expr t)
| rExpr {t} (e : expr t)
| rValue {t} (e : NbEt t).

There are three perhaps-unexpected things to note about this inductive type, which we will discuss in later subsections:

1. The constructor rValue holds an NbE value of the type NbEt introduced in Subsection 4.3.2. We will discuss this in Section 5.7 (Delayed Rewriting in Variable Nodes).

2. The constructors rIdent and rExpr hold “alternate” PHOAS expressions. We will discuss this in Subsection 5.4.2 (Revealing “Enough” Structure).

3. The constructor rIdent has an extra Boolean known. We will discuss this in Section 5.4.1 (The known argument).

With this inductive type in hand, it’s easy to see how rExpr allows us 풪(1) embedding of intrinsically typed exprs into untyped rawexprs.

While it's likely that sufficiently good abstraction barriers around this datatype would allow us to use it with relative ease, we did not succeed in designing good enough abstraction barriers. The bright side of this failure is that we now have a number of examples for this dissertation of ways in which inadequate abstraction barriers cause overhead, both in the intricacy of definitions and theorems and in the size of their statements and proofs, as well as, to a lesser extent, in the running time of proof generation.

We will discuss the many issues that arise from leaks in this abstraction barrier in the upcoming subsections.

5.2.4 Pattern-Matching Compilation Made for Intrinsically Typed Syntax

The cost of this fourth option is the cleverness required to come up with a version of the pattern-matching compilation which, rather than being hindered by types in its syntax, instead puts them to good use. Lacking this cleverness, we were unable to pay the requisite cost and hence have not much to say in this section.

5.3 Patterns with Type Variables – The Three Kinds of Identifiers

We have one final bit of infrastructure to explain and motivate before we have enough of the structure sketched out to give all of the rest of the engineering challenges: representing the identifiers. Recall from Subsection 4.2.1 (Our Approach in Nine Steps) that we automatically emit an inductive type describing all available primitive functions.

When deciding how to represent identifiers, there are roughly three options we have to choose from:

1. We could use an untyped representation of identifiers, such as Coq strings (as in Anand et al. [Ana+18], for example) or integers indexing into some finite map.

2. We could index the expression type over a finite map of valid identifiers and use dependent typing to ensure that we only have well-typed identifiers.

3. We could have a fixed set of valid identifiers, using types to ensure that we have only valid expressions.

The first option results in expressions that are not always well-typed. As discussed in Chapter 7 and seen in the preceding sections, having leaky abstraction barriers is often worse than having none at all, and we expect that having partially well-typed expressions would be no exception.

The second option is probably the way to go if we want truly extensible identifier sets. There are two issues. First, this adds a linear overhead in the number of identifiers—or, more precisely, in the total size of the types of the identifiers—because every AST node will store a copy of the entire finite map. Second, because our expression syntax is simply typed, polymorphic identifiers pose a problem. To support identifiers like fst and snd, which have types ∀ A B, A * B → A and ∀ A B, A * B → B respectively, we must either replicate the identifiers with all of the ways they might be applied, or else we must add support in our language for dependent types or for explicit type polymorphism.

Instead, we chose to go with the third option, which we believe is the simplest. The inductive type of identifiers is indexed over the type of the identifier, and type polymorphism is expressed via metalevel arguments to the constructor. So, for example, the identifier code for fst takes two type-code arguments A and B and has type ident (A * B → A). Hence all fully applied identifier codes have simple types (such as A * B → A), and our inductive type still supports polymorphic constants. An additional benefit of this approach is that unification of identifiers is just pattern matching in Gallina, and hence we can rely on the pattern-matching-compilation schemes of Coq's fast reduction machines, or the OCaml compiler itself, to further speed up our rewriting.
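A hedged, self-contained sketch of this representation, with simplified type codes and hypothetical constructor names rather than the framework's actual definitions, might look like the following; note how identifier unification is ordinary Gallina pattern matching.

Inductive type : Type :=
| Nat
| Prod (A B : type)
| Arrow (s d : type).

Inductive ident : type -> Type :=
| ident_O : ident Nat
| ident_S : ident (Arrow Nat Nat)
| ident_pair (A B : type) : ident (Arrow A (Arrow B (Prod A B)))
| ident_fst (A B : type) : ident (Arrow (Prod A B) A)
| ident_snd (A B : type) : ident (Arrow (Prod A B) B).

(* unifying against a particular identifier is ordinary Gallina pattern
   matching, which Coq's reduction machines compile efficiently *)
Definition is_fst {t} (idc : ident t) : bool
  := match idc with
     | ident_fst _ _ => true
     | _ => false
     end.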

Aside: Why Use Pattern-Matching Compilation At All?

Given the fact that, after prereduction, there is no trace of the decision tree remaining, one might ask why we use pattern-matching compilation at all, rather than just leaving it to the pattern-matching compiler of Coq or OCaml to be performant. We have three answers to this question.

The first, perhaps most honest answer is that it is a historical accident; we prematurely optimized this part of the rewriting engine when writing it.

The second answer is that pattern-matching compilation is a good abstraction barrier for factoring out the work of revealing enough structure from the work of unifying a pattern with an expression. Said another way, even though we reduce away the decision tree and its evaluation, there is basically no wasted work; removing pattern-matching compilation while preserving all the benefits would effectively just be inlining all of the functions, and there would be no dead code revealed by this inlining.

The third and final answer is that it allows us to easily prune useless work. The pattern-matching-compilation algorithm naturally prunes away patterns that can be known to not work, given the structure that we've revealed. By contrast, if we just record what information we've already revealed as we're performing pattern unification, it's quite tricky to avoid decomposition which can be known to be useless based on only the structure that's been revealed already.

Consider, for example, rewriting with two rules whose left-hand sides are 푥 + (푦 + 1) and (푎 + 푏) + (푐 ∗ 2). When revealing structure for the first rewrite rule, the engine will first decompose the (unknown) expression into the application of the + identifier to two arguments, and then decompose the second argument into the application of the + identifier to two arguments, and then finally decompose the second inner argument into a literal identifier to check if it is the literal 1. If the decomposition succeeds, but the literal is not 1 (or if the second inner argument is not a literal at all), then rewriting will fall back to the second rewrite rule. If we are doing structure decomposition in the naïve way, we will then decompose the outer first argument (bound to 푥 in the first rewrite rule) into the application of the identifier + to two arguments. We will then attempt to decompose the second outer argument into the application of the identifier ∗ to two arguments. Since there is no way an identifier can be both + and ∗, this decomposition will fail. However, we could have avoided doing the work of decomposing 푥 into 푎 + 푏 by realizing that the second rewrite rule is incompatible with the first; this is exactly what pattern-matching compilation and decision-tree evaluation do.

Pattern Matching For Rewriting

We now arrive at the question of how to do pattern matching for rewriting with identifiers. We want to be able to support type variables, for example to rewrite @fst ?A ?B (@pair ?A ?B ?x ?y) to x. While it would arguably be more elegant to treat term and type variables identically, doing this would require a language supporting dependent types, and we are not aware of any extension of PHOAS to dependent types. Extensions of HOAS to dependent types are known [McB10], but the obvious modifications of such syntax that in the simply typed case turn HOAS into PHOAS result in infinite self-referential types in the dependently typed case.

As such, insofar as we are using intrinsically well-typed syntax at all, we need to treat type variables separately from term variables. We need three different sorts of identifiers:

• identifiers whose types contain no type variables, for use in external-facing expressions and the denotation function,
• identifiers whose types are permitted to contain type variables, for use in patterns, and
• identifiers with no type information, for use in pattern-matching compilation.

The first two are relatively self-explanatory. The third of these is required because pattern-matching compilation proceeds in an untyped way; there's no obvious place to keep the typing information associated to identifiers in the decision tree, which must be computed before we do any unification, type variables or otherwise.
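Complementing the typed-identifier sketch above, a hedged, self-contained sketch of the other two flavors for the same constant fst (again with hypothetical names, not the framework's actual definitions) might be:

(* pattern types: types that may mention pattern type variables *)
Inductive ptype : Type :=
| pVar (n : nat)
| pNat
| pProd (A B : ptype)
| pArrow (s d : ptype).

(* identifiers whose types may contain type variables, used in patterns *)
Inductive pattern_ident : ptype -> Type :=
| pident_fst (A B : ptype) : pattern_ident (pArrow (pProd A B) A).

(* identifiers with no type information at all, used in decision trees *)
Inductive raw_ident : Type :=
| raw_fst.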

We could, in theory, use a single inductive type of type codes for all three of these. We could parameterize the inductive of type codes over the set of free type variables (or even just over a Boolean declaring whether or not type variables are allowed) and conventionally use the type code for unit in all type-code arguments when building decision trees.

This sort of reuse, however, is likely to introduce more problems than it solves.

The identifier codes used in pattern-matching compilation must be untyped, to match the decision we made for expressions in Section 5.2. Having them conventionally be typed pattern codes instantiated with unit types is, in some sense, just more opportunity to mess up and try to inspect the types when we really shouldn't. There is a clear abstraction barrier here, of having these identifier codes not carry types, and we might as well take advantage of that and codify the abstraction barrier in our code.

The question of type variables is more nuanced. If we are only tracking whether or not a type is allowed to have type variables, then we might as well use two different inductive types; there is not much benefit to indexing the type codes over a Boolean rather than having two copies of the inductive, for there's not much that can be done generically over whether or not type variables are allowed. Note also that we must track at least this much information, for identifiers in expressions passed to the denotation function must not have uninstantiated type variables, and identifiers in patterns must be permitted to have uninstantiated type variables.

However, there is some potential benefit to indexing over the set of uninstantiated type variables. This might allow us to write type signatures for functions that guarantee some invariants, possibly allowing for easier proofs. However, it’s not clear to us where this would actually be useful; most functions already care only about whether or not we permit type variables at all. Our current code in fact performs a poor approximation of this strategy in some places: we index over the entire pattern where indexing over the free variables of the pattern would suffice.

This unneeded indexing enormously complicates the code and theorems and is yet another example of how poorly designed abstraction barriers incur outsized overhead. Rewrite-rule replacements are expressed as dependently typed towers indexed first over the type variables of a pattern and then again over the term variables. This design is a historical artifact, from when we expected to be writing rewrite rule ASTs by hand rather than reifying them from Gallina and found the curried towers more convenient to write. This design, however, is absolutely a mistake, especially given the concession we make in Subsection 5.1.3 (Type Codes) to not track enough typing information to avoid all typechecking.

While indexing over only the set of permitted type variables would simplify proofs significantly, we'd benefit even more by indexing only over whether or not we permit type variables at all. None of our proofs are made simpler by tracking the set of permitted type variables rather than just whether or not that set is empty.

5.4 Preevaluation Revisited

Having built up enough infrastructure to give a bit more in the way of code examples, we now return to the engineering challenges posed by reducing early, first investigated in Section 5.1.

5.4.1 How Do We Know What We Can Unfold?

We can now revisit Subsection 5.1.4 in a bit more detail.

The known argument

We noted in Subsection 5.2.3 the known argument of the rIdent constructor of rawexpr. This argument is used to track what sorts of operations can be unfolded early. In particular, if a given identifier has no type arguments (for example, the identifier coding for addition on ℤs), and we have already matched against it, then when performing further matches to unify with other patterns, we can directly match it against pattern identifiers. By contrast, if the identifier has not yet been matched against, or if it has unknown type arguments, we cannot guarantee that matches will reduce. Tracking this information adds a not-insignificant amount of nuance and intricacy to the code.

Consider the following two cases, where we will make use of both true and false for the known argument.

First, let us consider the simpler case of looking for examples where known will be false. As a toy example, suppose we are rewriting with the rule @List.map A B f (x::xs) = f x :: List.map f xs and the rule @List.map (option A) (option B) (option_map f) (List.map (@Some A) xs) = @List.map A (option B) (fun x => Some (f x)) xs. When decomposing structure for the first rewrite rule, we will match on the head identifier to see if it is List.map. Supposing that the final argument is not a cons cell, we will fall back to the second rewrite rule. While we know that the first identifier is a List.map, we do not know its type arguments. Therefore, when we want to try to substitute with the second rewrite rule, we must match on the type structure of the first type argument to List.map to see if it is an option, and, if so, extract the underlying type to put into unification data. However, this decomposition will block on the type arguments to List.map, so we don't want to unfold it fully during early reduction. Note that the first rewrite rule is not really necessary in this example; the essential point is that we don't want to be unfolding complicated recursive matches on the type structure that are not going to reduce.4

There are two cases where we want to reduce the match on an identifier. One of them is when the identifier is known from the initial 휂-expansion of identifiers discussed in Subsection 5.1.1 (note that this is distinct from the 휂-expansion of identifier applications), and the identifier has no type arguments.5 The other case is when we have tested an identifier against a pattern identifier, and it has no type arguments. In this case, when we eventually get around to collecting unification data for this identifier, we know that we can reduce away the check on this identifier. Whether or not the overhead is worth it in this second case is unclear; the design of this part of the rewriting engine suffers from the lack of a unified picture about what, exactly, is worth reducing, and what is not.

Gratuitous Dependent Types: How much do we actually want to unfold?

When computing the replacement of a given expression, how much do we want to unfold? Here we encounter a case of premature optimization being the root of, if not evil, at least headaches. The simplest path to take here would be to have unification output a map of type-variable indices to types and a map of expression-variable indices to expressions of unknown types. We could then have a function, not to be unfolded early, which substitutes the expressions into some untyped representation of terms and then performs a typechecking pass to convert back to a well-typed expression.

Instead, we decided to reduce as much as we possibly could. Following the common practice of eager students looking to use dependent types, we defined a dependently typed data structure indexed over the pattern type which holds the mapping of each pattern type variable to a corresponding type. While this mapping cannot be fully computed at rewrite-rule-compilation time—we may not know enough type structure in the rawexpr—we can reduce effectively all of the lookups by turning them into matches on this mapping which can be reduced. This, unfortunately, complicates our proofs significantly while likely not providing any measurable speedup, serving only as yet another example of the overhead induced by needless dependency at the type level.

4 In the current codebase, removing the first rewrite rule would, unfortunately, result in unfolding of the matching on the type structure, due to an oversight in how we compute the known argument. See the next footnote for more details.

5 In our current implementation we don't actually check that the identifier has no type arguments in this case. This is an oversight, and the correct design would be able to distinguish between "this identifier is known and it has no type arguments", "this identifier is known but it has unknown type arguments", and "this identifier is completely unknown". Failure to distinguish these cases does not seem to cause too much trouble, because the way the code is structured luckily ensures that we only match on the type arguments once, and because everything is CPS'd, this matching does not block further reduction.

5.4.2 Revealing “Enough” Structure

We noted in Subsection 5.2.3 that the constructors rIdent and rExpr hold “alternate” PHOAS expressions. We now discuss the reason for this.

Consider the example where we have two rewrite rules: that (푥 + 푦) + 1 = 푥 + (푦 + 1) and that 푥 + 0 = 푥. If we have the expression (푎 + 푏) + 0, we would first try to match this against (푥 + 푦) + 1. If we didn’t store the expression 푎 + 푏 as a PHOAS expression and had it only as a rawexpr, then we’d have to retypecheck it, inserting casts as necessary, in order to get a PHOAS expression to return from unification of 푎 + 푏 with 푥 in 푥 + 0.

Instead of incurring this overhead, we store the undecomposed PHOAS expression in the rawexpr, allowing us to reuse it when no more decomposition is needed. This does, however, complicate proofs: we need to talk about matching the revealed and unrevealed structure, sometimes just on the type level, and other times on both the term level and the type level.

5.5 Monads: Missing Abstraction Barriers at the Type Level

We introduce in Subsection 4.4.2 the UnderLets monad for let lifting, which we inline into the definition of the NbEt value type. We use two other monads in the rewriting engine: the option monad, to encode possible failure of rewrite-rule side conditions and substitutions, and the CPS monad discussed in Subsection 5.1.2.

Although we introduce a bit of ad-hoc support for monadic binds, we do not fully commit to a monadic abstraction barrier in our code. This lack of principle incurs overhead when we have to deal with mismatched monads in different functions, especially when we haven't ordered the monadic applications in a principled way.

The simplest example of this overhead is in our mixing of the option and CPS monads in eval_decision_tree. The type of eval_decision_tree is ∀ {T : Type} (es : list rawexpr) (d : decision_tree) (K : ℕ → list rawexpr → option T), option T. Recall that the function of eval_decision_tree is to reveal structure on the list of expressions es by evaluating the decision tree d, calling K to perform rewriting with a given rewrite rule (referred to by index) whenever it hits a leaf node, and continuing on when K fails with None. What is the correctness condition for eval_decision_tree?

We need two correctness conditions. One of them is that, if eval_decision_tree succeeds at all, it is equivalent to calling K on some index with some list of expressions which is appropriately equivalent to es. (See Subsection 5.7.1 for discussion of what, exactly, “equivalent” means in this case.) This is the interpretation correctness condition.

The other correctness condition is significantly more subtle and corresponds to the property that the rewriter must map related PHOAS expressions to related PHOAS expressions. This one is a monster. We present the code before explaining it to show just how much of a mouthful it is.

Lemma wf_eval_decision_tree {T1 T2} G d
  : ∀ (P : option T1 → option T2 → Prop)
      (HPNone : P None None)
      (ctx1 : list (@rawexpr var1))
      (ctx2 : list (@rawexpr var2))
      (ctxe : list { t : type & @expr var1 t * @expr var2 t }%type)
      (Hctx1 : length ctx1 = length ctxe)
      (Hctx2 : length ctx2 = length ctxe)
      (Hwf : ∀ t re1 e1 re2 e2,
          List.In ((re1, re2), existT _ t (e1, e2))
                  (List.combine (List.combine ctx1 ctx2) ctxe)
          → @wf_rawexpr G t re1 e1 re2 e2)
      cont1 cont2
      (Hcont : ∀ n ls1 ls2,
          length ls1 = length ctxe
          → length ls2 = length ctxe
          → (forall t re1 e1 re2 e2,
                List.In ((re1, re2), existT _ t (e1, e2))
                        (List.combine (List.combine ls1 ls2) ctxe)
                → @wf_rawexpr G t re1 e1 re2 e2)
          → (cont1 n ls1 = None ↔ cont2 n ls2 = None)
            ∧ P (cont1 n ls1) (cont2 n ls2)),
    P (@eval_decision_tree var1 T1 ctx1 d cont1)
      (@eval_decision_tree var2 T2 ctx2 d cont2).

This is one particular way to express the following meaning: Suppose that we have two calls to eval_decision_tree with different PHOAS var types, different return types T1 and T2, different continuations cont1 and cont2, different untyped expression lists ctx1 and ctx2, and the same decision tree. Suppose further that we have two lists of PHOAS expressions and a relation relating elements of T1 to elements of T2. Let us assume the following properties of the expression lists and the continuations: The two lists of untyped rawexprs (ctx1 and ctx2) match with each other and with the two lists of typed expressions, and all of the types line up. The two continuations, when fed identical indices and fed lists of rawexprs which match with the given lists of typed expressions, either both succeed with related outputs or both fail. Then we can conclude that the calls to eval_decision_tree either both succeed with related outputs or both fail. Note, importantly, that we connect the lists of rawexprs fed to the continuations with the lists of rawexprs fed to eval_decision_tree only via the lists of typed expressions.

Why do we need such complication here? The eval_decision_tree function makes no guarantee about how much of the expression it reveals, but we must capture the fact that related PHOAS inputs result in the same amount of revealing, however much revealing that is. We do, however, also guarantee that the revealed expressions are both related to each other as well as to the original expressions, modulo the amount of revealing. Finally, the continuations that we use assume that enough structure is revealed and hence are not guaranteed to be independent of the level of revealing.

There are a couple of ways that this correctness condition might be simplified, all of which essentially amount to better enforcement of abstraction barriers.

The function that rewrites with a particular rule relies on the invariant that the function eval_decision_tree reveals enough structure. This breaks the abstraction barrier that rewriting with a particular rule is only supposed to care about the expression structure. If we enforced this abstraction barrier, we'd no longer need to talk about whether or not two rawexprs had the same level of revealed structure, which would vastly simplify the definition wf_rawexpr (discussed more in the upcoming Subsection 5.7.2). Furthermore, we could potentially remove the lists of typed expressions, mandating only that the lists of rawexprs be related to each other.

Finally, we could split apart the behavior of the continuation from the behavior of eval_decision_tree. Since the behavior of the continuations could be assumed to not depend on the amount of revealed structure, we could prove that invoking eval_decision_tree on any such “good” continuation returned a result equal to invoking the continuation on the same list of rawexprs, rather than merely one equivalent to it modulo the amount of revealing. This would bypass the need for this lemma entirely, allowing us to merely strengthen the previous lemma used for interpretation correctness.

So here we see that a minor leak in an abstraction barrier (allowing the behavior of rewriting to depend on how much structure has been revealed) can vastly complicate correctness proofs, even forcing us to break other abstraction barriers by inlining the behavior of various monads.

5.6 Rewriting Again in the Output of a Rewrite Rule

We now come to the feature of the rewriter that took the most time and effort to deal with in our proofs and theorem statements: allowing some rules to be designated as subject to a second bottom-up rewriting pass in their output. This feature is important for allowing users to express one operation (for example, List.flat_map) in terms of other operations (for example, list_rect) which are themselves subject to reduction.
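As a hedged, self-contained illustration of the kind of rule in question (stated here simply as a Gallina equation, not in the framework's actual rewrite-rule format, with a hypothetical lemma name), List.flat_map can be expressed through list_rect, whose unfolding is itself subject to further rewriting:

Require Import Coq.Lists.List.

Lemma flat_map_via_list_rect {A B} (f : A -> list B) (xs : list A)
  : List.flat_map f xs
    = @list_rect A (fun _ => list B)
                 nil
                 (fun x _ rec => f x ++ rec)
                 xs.
Proof.
  induction xs as [|x xs IHxs]; simpl; [reflexivity | now rewrite IHxs].
Qed.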

The technical challenge, here, is that the PHOAS var type of the input of normalization by evaluation is not the same as the var type of the output. Hence the rewrite-rule replacement phase of rules marked for subsequent rewriting passes must change the var type when they do replacement. This can be done, roughly, by wrapping arguments passed in to the replacement rule in an extra layer of Var nodes.

However, this incurs severe cost in phrasing and proving the correctness condition of the rewriter. While most of the nitty-gritty details are beyond the scope even of this chapter, we will look at one particular implication of supporting this feature in Subsection 5.7.2 (Which Equivalence Relation?).

5.7 Delayed Rewriting in Variable Nodes

We saw in Subsection 5.2.3 that the rawexpr inductive has separate constructors for PHOAS expressions and for NbEt values. The reason for this distinction lies at the heart of fusing normalization by evaluation and pattern-matching compilation.

Consider rewriting in the expression List.map (휆 x. y + x) [0; 1] with the rules x + 0 = x, and List.map f [x ; … ; y] = [f x ; … ; f y]. We want to get out the list [y; y + 1] and not [y + 0; y + 1]. In the bottom-up approach, we first perform rewriting on the arguments to List.map before applying rewriting to List.map itself. Although it would seem that no rewrite rule applies to either argument, in fact what happens is that (휆 x. y + x) becomes an NbEt thunk which is waiting for the structure of x before deciding whether or not rewriting applies. Hence when doing decision-tree evaluation, it's important to keep this thunk waiting, rather than forcing it early with a generic variable node. The rValue constructor allows us to do this. The rExpr constructor, by contrast, holds expressions which we are allowed to do further matching on.

How does the use of these different constructors show up? Recall from Figure 4-2 in Subsection 4.3.2 that we put constants into 휂-long application form by calling reflect at the base case of reduce(푐). When performing this 휂-expansion, we build up a rawexpr. When we encounter an argument with an arrow type, we drop it directly into an rValue constructor, marking it as not subject to structure revealing. When we encounter an argument whose type is not an arrow, we can guarantee that there is no thunked rewriting, and so we can put the value into an rExpr constructor, marking it as subject to structure decomposition.

One might ask: since we distinguish the creation of rExpr and rValue on the basis of the argument's type, could we not just use the same constructor for both? The reason we cannot do this is that when revealing structure, we may decompose an expression in an rExpr node into an application of an expression to another expression. In this case, the first of these will have an arrow type, and both must be placed into the rExpr constructor and be marked as subject to further decomposition. Hence we cannot distinguish these cases just on the basis of the type, and we do in fact need two constructors.

5.7.1 Relating Expressions and Values

First, some background context: When writing PHOAS compiler passes, there are in general two correctness conditions that must be proven about them. The first is a soundness theorem. In Figure 3.1.3, we called this theorem check_is_even_expr_sound. For compiler passes that produce syntax trees, this theorem will relate the denotation of the input AST to the denotation of the output AST and might hence alternatively be called a semantics-preservation theorem, or an interpretation-correctness theorem. The second theorem, only applicable to compiler passes that produce ASTs (unlike our evenness checker from Subsection 3.1.2), is a syntactic well-formedness theorem. It will say that if the input AST is well-formed, then the output AST will also be well-formed. As seen in Figure 3.1.3, the definition of well-formed for PHOAS relates two expressions with different var arguments. Hence most PHOAS well-formedness theorems are proven by showing that a given compiler pass preserves relatedness between PHOASTs with different var arguments.

The fact that NbE values contain thunked rewriting creates a great deal of subtlety in relating rawexprs. As the only correctness conditions on the rewriter are that it preserves denotational semantics of expressions and that it maps related expressions to related expressions, these are the only facts that hold about the NbEt values in rValue. Since native PHOAS expressions do not permit such thunked values, we can only relate NbEt values to the interpretations of such expressions. Even this is not straightforward, as we must use an extensional equivalence relation, saying that an NbEt value of arrow type is equivalent to an interpreted function only when equivalence between the NbEt value argument and the interpreted function argument implies equivalence of their outputs.

5.7.2 Which Equivalence Relation?

Generalizing the challenge from Subsection 5.7.1, it turns out that describing how to relate two (or more!) objects was one of the most challenging parts of the proof effort. All told, we needed approximately two dozen ways of relating various objects.

We begin with the equivalence relations hinted at in previous sections.

wf_rawexpr

In Section 5.5, we introduced without definition the four-place wf_rawexpr relation. This relation, a beefed-up version of the PHOAS definition of related in Figure 3.1.3, takes in two rawexprs and two PHOAS expressions (of the same type) and is parameterized over a list of pairs of allowed and related variables, much like the definition of related. It requires that both rawexprs have the same amount of revealed structure (important only because we broke the abstraction barrier of revealed structure only mattering as an optimization); that the unrevealed structure, the “alternate” expression of the rApp and rIdent nodes, match exactly with the given expressions; and that the structure that is revealed matches as well with the given expressions. The only nontrivial case in this definition is what to say about when NbEt values match expressions. We say that an NbEt value is equivalent only to the result of calling NbE's reify function on that value. That this definition suffices is highly nonobvious; we refer the reader to our Coq proofs, performed without any axioms, as our justification of sufficiency. That each NbEt value must match at least the result of calling NbE's reify function on that value is a result of how we handle unrevealed forms when building up the arguments to an 휂-long identifier application as discussed briefly in Subsection 5.1.1 (What Does This Reduction Consist Of?). Namely, when forming applications of rawexprs to NbEt values during 휂-expansion, we say that the “unrevealed” structure of an NbEt value v is reify v.

interp_maybe_do_again

In Section 5.6, we discussed a small subset of the implications of supporting rewriting again in the output of a rewrite rule. The most easily describable intricacy and overhead caused by this feature shows up in the definition of what it means for a rewrite rule to preserve denotational semantics. At the user level, this is quite obvious: the left-hand side of the rewrite rule (prior to reification6) must equal the right-hand side. However, there are two subtleties to expressing the correctness condition for intermediate representations of the rewrite rule. We will discuss one of them here and the other in Section 5.8 (What's the Ground Truth: Patterns or Expressions?).

At some point in the rewriting process, the rewrite rule must be expressed in terms of a PHOAS expression whose var type is either the output var type—if this rule is

6 Note that this reification is a tactic procedure reifying Gallina to PHOAS, not the reify function of normalization by evaluation discussed elsewhere in this chapter.

not subject to more rewriting—or else is the NbEt value type—if the rule is subject to more rewriting. Hence we must be able to relate an object of this type to the denotational interpretation that we are hoping to preserve. There are two subtleties here. The first is that we cannot simply “interpret” the NbEt values stored in Var nodes; we must use the extensional relation described above in Section 5.7 (Delayed Rewriting in Variable Nodes), saying that an NbEt value of arrow type is equivalent to an interpreted function only when equivalence between the NbEt value argument and the interpreted function argument implies equivalence of their outputs.

Second, we cannot simply interpret the expression which surrounds the Var node, and we must instead ensure that the “interpretation” of 휆s in the AST is extensional over all appropriately related NbEt values they might be passed. Note that it’s not even obvious how to materialize the function they must be extensionally related to. When trying to prove that the application of (휆 f x. v1 (f x)) to NbEt values v2 and v3 is appropriately related to some interpreted function 푔, how do we materialize the interpreted functions equivalent to (휆 f x. v1 (f x)) and v2 which when combined via application give 푔? The answer is that we cannot, at least if we are looking for the application to be equal to 푔. If we require that the application only be related to 푔 (where “related” in this case means extensionally or pointwise equal), then we can materialize such functions by reading them off our inductive hypotheses. While we initially depended on the axiom of functional extensionality, after sinking dozens of hours into understanding the details of these relatedness functions, we were eventually able to extract the insight that two interpreted functions are extensionally equal if and only if there exists an expression to which both functions are related. See commits 4d7999e and e9b3505 in the mit-plv/rewriter repository on GitHub for more details on the exact changes required to implement this insight and remove the dependence on the axiom of functional extensionality.

Related Miscellanea

While delving into the details of all two-dozen ways of relating objects is beyond the scope of this dissertation, we mention a couple of other nonobvious design questions that we found challenging to answer.

Recall from Subsection 4.3.2 that NbEt values are Gallina functions on arrow types; dropping the subtleties of the UnderLets monad, we had

NbE푡(푡1 → 푡2) ≔ NbE푡(푡1) → NbE푡(푡2)

NbE푡(푏) ≔ expr(푏)
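In Coq, this recalled definition might be sketched self-containedly as follows, with Parameter stand-ins (base_type, expr) for the framework's actual definitions and with the UnderLets monad elided as in the equations above:

Parameter base_type : Type.
Inductive type : Type := Base (b : base_type) | Arrow (s d : type).
Parameter expr : type -> Type.

(* NbE values: Gallina functions at arrow types, syntax at base types *)
Fixpoint NbEt (t : type) : Type
  := match t with
     | Base b => expr (Base b)
     | Arrow s d => NbEt s -> NbEt d
     end.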

The PHOAS relatedness condition of Figure 3.1.3 (PHOAS) is parameterized over a list of pairs of permitted related variables. Design Question: What is the relation between the permitted related variables lists of the terms of types NbE푡(푡1), NbE푡(푡2), and NbE푡(푡1 → 푡2)? Spoiler: The list for NbE푡(푡1) is unconstrained and is prepended to the list for NbE푡(푡1 → 푡2) (which is given) to get the list for NbE푡(푡2). That is, we write

related NbE푡1→푡2(Γ, 푓1, 푓2) ≔ ∀ Γ′ 푣1 푣2, related NbE푡1(Γ′, 푣1, 푣2) → related NbE푡2(Γ′ ++ Γ, 푓1(푣1), 푓2(푣2))

related NbE푏(Γ, 푒1, 푒2) ≔ related(Γ, 푒1, 푒2)

Some correctness lemmas do not need full-blown relatedness conditions. For example, in some places, we do not need that a rawexpr is fully consistent with its alternate expression structure, only that the types match and that the top-level structure of each alternate PHOAS expression matches the node of the rawexpr. Design Question: Is it better to minimize the number of relations and fold these “self-matching” or “goodness” properties into the definitions of relatedness, which are then used everywhere; or is it better to have separate definitions for goodness and relatedness and have correctness conditions which more tightly pin down the behavior of the corresponding functions? (Non-Spoiler: We don’t have an answer to this one.)

5.8 What's the Ground Truth: Patterns or Expressions?

We mentioned in Subsection 5.7.2 (Which Equivalence Relation?) that there were two subtleties to expressing the interpretation-correctness condition for intermediate representations of rewrite rules, and we proceeded to discuss only one of them. We discuss the other one here.

We must answer the question, in proving our rewriter correct: What denotational semantics do we use for a rewrite rule?

In our current framework, we talk about rewrite rules in terms of patterns, which are special ASTs which contain extra pattern variables in both the types and the terms, and in terms of a replacement function, which takes in unification data and returns either failure or else a PHOAST with the data plugged in. While this design is sort-of a historical accident of originally intending to write rewrite rules by hand, there is also a genuine question of how to relate patterns to replacement functions. While we could, in theory, in a better-designed rewriter, indirect through the expressions that each of these came from, the functions turning expressions into patterns and replacement rules are likely to be quite complicated, especially with the support for rewriting again described in Section 5.6 (Rewriting Again in the Output of a Rewrite Rule).

The way we currently relate these is that we write an interpretation function for patterns, parameterized over unification data, and relate this to the interpretation of the replacement function applied to unification data, suitably restricted to just the type variables of the pattern in question to make various dependent types line up. Note that this restriction of the unification data would likely be unnecessary if we stripped out all of the dependent types that we don't actually need; c.f. Subsection 5.1.3 (Type Codes). This interpretation function is itself also severely complicated by the use of dependent types in talking about unification data.

5.9 What’s the Takeaway?

This chapter has been a brief survey of the engineering challenges we encountered in designing and implementing a framework for building verified partial evaluators with rewriting. We hope that this deep dive into the details of our framework has fleshed out some of the design principles and challenges we’ve discussed in previous sections.

If the reader wishes to take only one thing from this chapter, we invite it to be a sense and understanding of just how important good abstraction barriers and API design are to engineering at scale in verified and dependently typed settings, which we will come back to in Chapter 7.

Chapter 6

Reification by Parametricity

Fast Setup for Proof by Reflection, in Two Lines of ℒtac

6.1 Introduction

We introduced reification in Section 3.2 as the starting point for proof by reflection. Reification consists of translating a “native” term of the logic into an explicit abstract syntax tree, which we may then feed to verified procedures or any other functional programs in the logic. As mentioned in Figure 4.5.1, the method of reification used in our framework for reflective partial evaluation and rewriting presented in Chapter 4 was not especially optimized and can be a bottleneck for large terms, especially those with many binders. Popular methods turn out to be surprisingly slow, often to the point where, counterintuitively, the majority of proof-execution time is spent in reification – unless the proof engineer invests in writing a plugin directly in the proof assistant's metalanguage (e.g., OCaml for Coq).

In this chapter, we present a new strategy discovered by Andres Erbsen and me during my doctoral work, originally presented and published as [GEC18], showing that reification can be both simpler and faster than with standard methods. Perhaps surprisingly, we demonstrate how to reify terms almost entirely through reduction in the logic, with a small amount of tactic code for setup and no ML programming. We have already summarized our survey of prior approaches to reification in Section 3.2, providing high-quality implementations and documentation for them, serving a tutorial function independent of our new contributions. We will begin in Section 6.2 with an explanation of our alternative technique. We benchmark our approach against 18 competitors in Section 6.3.

6.2 Reification by Parametricity

We propose factoring reification into two passes, both of which essentially have robust, built-in implementations in Coq: abstraction or generalization, and substitution or specialization.

The key insight to this factoring is that the shape of a reified term is essentially the same as the shape of the term that we start with. We can make precise the way these shapes are the same by abstracting over the parts that are different, obtaining a function that can be specialized to give either the original term or the reified term. That is, we have the commutative triangle in Figure 6-1.

[Figure 6-1: Abstraction and Reification. A commutative triangle relating the term, the abstracted term, and the reified syntax: generalize maps the term to the abstracted term; specialize maps the abstracted term either back to the term or to the reified syntax; reify and denote relate the term and the reified syntax directly.]

6.2.1 Case-By-Case Walkthrough

Function Applications and Constants.

Consider the example of reifying 2 × 2. In this case, the term is 2 × 2 or (mul (S (S O)) (S (S O))).

To reify, we first generalize or abstract the term 2 × 2 over the successor function S, the zero constructor O, the multiplication function mul, and the type ℕ of natural numbers. We get a function taking one type argument and three value arguments:

Λ푁. 휆(Mul ∶ 푁 → 푁 → 푁) (O ∶ 푁) (S ∶ 푁 → 푁). Mul (S (S O)) (S (S O))

We can now specialize this term in one of two ways: we may substitute ℕ, mul, O, and S, to get back the term we started with; or we may substitute expr, NatMul, NatO, and NatS to get the reified syntax tree

NatMul (NatS (NatS NatO)) (NatS (NatS NatO))

This simple two-step process is the core of our algorithm for reification: abstract over all identifiers (and key parts of their types) and specialize to syntax-tree constructors for these identifiers.
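This two-step process can be demonstrated as a hedged, self-contained Coq sketch; the expr constructors and the name abstracted follow the prose but are not the framework's actual definitions.

Inductive expr : Type :=
| NatO : expr
| NatS (e : expr)
| NatMul (e1 e2 : expr).

(* the result of abstracting 2 × 2 over ℕ, mul, O, and S *)
Definition abstracted
  := fun (N : Type) (Mul : N -> N -> N) (O : N) (S : N -> N)
     => Mul (S (S O)) (S (S O)).

(* specializing to the original ingredients gives back the original term ... *)
Example specialize_denote : abstracted nat Nat.mul 0 S = 2 * 2
  := eq_refl.

(* ... while specializing to the syntax constructors gives the reified AST *)
Example specialize_reify
  : abstracted expr NatMul NatO NatS
    = NatMul (NatS (NatS NatO)) (NatS (NatS NatO))
  := eq_refl.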

Wrapped Primitives: let Binders, Eliminators, Quantifiers.

The above procedure can be applied to a term that contains let binders to get a PHOAS tree that represents the original term, but doing so would not capture sharing. The result would contain native let bindings of subexpressions, not PHOAS let expressions. Call-by-value evaluation of any procedure applied to the reification result would first substitute the let-bound subexpressions – leading to potentially exponential blowup and, in practice, memory exhaustion.

The abstraction mechanisms in all proof assistants (that we know about) only allow abstracting over terms, not language primitives. However, primitives can often be wrapped in explicit definitions, which we can abstract over. For example, we already used a wrapper for let binders, and terms that use it can be reified by abstracting over that definition. If we start with the expression

dlet 푎 ∶= 1 in 푎 × 푎

and abstract over (@Let_In ℕ ℕ), S, O, mul, and ℕ, we get a function of one type argument and four value arguments:

Λ푁. 휆 (Mul ∶ 푁 → 푁 → 푁). 휆(O ∶ 푁). 휆(S ∶ 푁 → 푁). 휆(LetIn ∶ 푁 → (푁 → 푁) → 푁). LetIn (S O) (휆푎. Mul 푎 푎)

We may once again specialize this term to obtain either our original term or the reified syntax. Note that to obtain reified PHOAS, we must include a Var node in the LetIn expression; we substitute (휆푥 푓. LetIn 푥 (휆푣. 푓 (Var 푣))) for LetIn to obtain the PHOAS tree

LetIn (NatS NatO) (휆푣. NatMul (Var 푣) (Var 푣))
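A hedged, self-contained Coq sketch of this example follows, with a simplified nondependent Let_In, PHOAS constructors named after the prose, and the hypothetical name abstracted for the generalized term; it is not the framework's actual code.

(* the wrapper for let binders: dlet x := v in b stands for Let_In v (fun x => b) *)
Definition Let_In {A B} (x : A) (f : A -> B) : B := let y := x in f y.

(* a PHOAS syntax with a let node *)
Inductive expr (var : Type) : Type :=
| Var (v : var)
| NatO
| NatS (e : expr var)
| NatMul (e1 e2 : expr var)
| LetIn (e : expr var) (f : var -> expr var).
Arguments Var {var} _.
Arguments NatO {var}.
Arguments NatS {var} _.
Arguments NatMul {var} _ _.
Arguments LetIn {var} _ _.

(* the abstraction of [dlet a := 1 in a * a] over Let_In, mul, O, S, and ℕ *)
Definition abstracted
  := fun (N : Type) (Mul : N -> N -> N) (O : N) (S : N -> N)
         (LetIn : N -> (N -> N) -> N)
     => LetIn (S O) (fun a => Mul a a).

(* specializing to the original ingredients recovers the original term ... *)
Example specialize_denote
  : abstracted nat Nat.mul 0 S (@Let_In nat nat) = Let_In 1 (fun a => a * a)
  := eq_refl.

(* ... while substituting (fun x f => LetIn x (fun v => f (Var v))) for the
   let wrapper produces the PHOAS tree with Var nodes inserted at the binder *)
Example specialize_reify {var}
  : abstracted (expr var) NatMul NatO NatS
               (fun x f => LetIn x (fun v => f (Var v)))
    = LetIn (NatS NatO) (fun v => NatMul (Var v) (Var v))
  := eq_refl.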

Wrapping a metalanguage primitive in a definition in the code to be reified is in general sufficient for reification by parametricity. Pattern matching and recursion cannot be abstracted over directly, but if the same code is expressed using eliminators, these can be handled like other functions. Similarly, even though ∀/Π cannot be abstracted over, proof automation that itself introduces universal quantifiers before reification can easily wrap them in a marker definition (_forall T P := forall (x:T), P x) that can be. Existential quantifiers are not primitive in Coq and can be reified directly.

Lambdas.

While it would be sufficient to require that, in code to be reified, we write all lambdas with a named wrapper function, that would significantly clutter the code. We can do better by making use of the fact that a PHOAS object-language lambda (Abs node) consists of a metalanguage lambda that binds a value of type var, which can be used in expressions through constructor Var ∶ var → expr. Naïve reification by parametricity would turn a lambda of type 푁 → 푁 into a lambda of type expr → expr. A reification procedure that explicitly recurses over the metalanguage syntax could just precompose this recursive-call result with Var to get the desired object-language encoding of the lambda, but handling lambdas specially does not fit in the framework of abstraction and specialization.

First, let us handle the common case of lambdas that appear as arguments to higher-order functions. One easy approach: while the parametricity-based framework does not allow for special-casing lambdas, it is up to us to choose how to handle functions that we expect will take lambdas as arguments. We may replace each higher-order function with a metalanguage lambda that wraps the higher-order arguments in object-language lambdas, inserting Var nodes as appropriate. Code calling the function sum_upto 푛 푓 ∶= 푓(0)+푓(1)+⋯+푓(푛) can be reified by abstracting over relevant definitions and substituting (휆푛 푓. SumUpTo 푛 (Abs (휆푣. 푓 (Var 푣)))) for sum_upto. Note that the expression plugged in for sum_upto differs from the one plugged in for Let_In only in the use of a deeply embedded abstraction node. If we wanted to reify LetIn as just another higher-order function (as opposed to a distinguished wrapper for a primitive), the code would look identical to that for sum_upto.

It would be convenient if abstracting and substituting for functions that take higher-order arguments were enough to reify lambdas, but here is a counterexample. Starting with 휆 푥 푦. 푥 × ((휆 푧. 푧 × 푧) 푦), abstraction gives

Λ푁. 휆(Mul ∶ 푁 → 푁 → 푁). 휆(푥 푦 ∶ 푁). Mul 푥 ((휆(푧 ∶ 푁). Mul 푧 푧) 푦),

and specialization and reduction give

휆 (푥 푦 ∶ expr). NatMul 푥 (NatMul 푦 푦).

The result is not even a PHOAS expression. We claim a desirable reified form is

Abs(휆 푥. Abs(휆 푦. NatMul (Var 푥) (NatMul (Var 푦) (Var 푦))))

Admittedly, even our improved form is not quite precise: 휆 푧. 푧 × 푧 has been lost. However, as almost all standard Coq tactics silently reduce applications of lambdas, working under the assumption that functions not wrapped in definitions will be arbitrarily evaluated during scripting is already the norm. Accepting that limitation, it remains to consider possible occurrences of metalanguage lambdas in normal forms of outputs of reification as described so far. As lambdas in expr nodes that take metalanguage functions as arguments (LetIn, Abs) are handled by the rules for these nodes, the remaining lambdas must be exactly at the head of the expression. Manipulating these is outside of the power of abstraction and specialization; we recommend

postprocessing using a simple recursive tactic script.

6.2.2 Commuting Abstraction and Reduction

Sometimes, the term we want to reify is the result of reducing another term. For example, we might have a function that reduces to a term with a variable number of let binders.1 We might have an inductive type that counts the number of let … in … nodes we want in our output.

Inductive count := none | one_more (how_many : count).

It is important that this type be syntactically distinct from ℕ for reasons we will see shortly.

We can then define a recursive function big such that (big 1 푛) constructs an unreduced term with the number of nested let binders given by 푛:

Fixpoint big (x:nat) (n:count) : nat
  := match n with
     | none => x
     | one_more n' => dlet x' := x * x in
                      big x' n'
     end.

[Figure 6-2: Abstraction, Reification, Reduction. The commutative diagram of Figure 6-1 (reduced term, reified syntax, and abstracted term, related by reify/denote and generalize/specialize) extended with a node for the unreduced term and a reduce arrow.]

Our commutative diagram in Figure 6-1 now has an additional node, becoming Figure 6-2. Since generalization and specialization are proportional in speed to the size of the term being handled, we can gain a significant performance boost by performing generalization before reduction. To explain why, we split apart the commutative diagram a bit more; in reduction, there is a 훿 or unfolding step, followed by a 훽휄 step that reduces applications of 휆s and evaluates recursive calls. In specialization, there is an application step, where the 휆 is applied to arguments, and a 훽-reduction step, where the arguments are substituted. To obtain reified syntax, we may perform generalization after 훿-reduction (before 훽휄-reduction), and we are not required to perform the final 훽-reduction step of specialization to get a well-typed term. It is important that unfolding big results in exposing the body for generalization, which we accomplish in Coq by exposing the anonymous recursive function; in other languages, the result may be a primitive eliminator applied to the body of the

1More realistically, we might have a function that represents big numbers using multiple words of a user-specified width. In this case, we may want to specialize the procedure to a couple of different bitwidths, then reify the resulting partially reduced terms.

fixpoint. Either way, our commutative diagram thus becomes

[Commutative diagram: the unreduced term 훿-reduces to a small partially reduced term, which 훽휄-reduces to the reduced term and thence to reified syntax; alternatively, abstracting the small partially reduced term gives the abstracted unreduced term, application gives unreduced reified syntax, and 훽휄-reduction gives the same reified syntax.]

Let us step through this alternative path of reduction using the example of the unreduced term big 1 100, where we take 100 to mean the term one_more (one_more (⋯ (one_more none) ⋯)) with one hundred applications of one_more.

Our first step is to unfold big, rendered as the arrow labeled 훿 in the diagram. In Coq, the result is an anonymous fixpoint; here we will write it using the recursor count_rec of type ∀푇 . 푇 → (count → 푇 → 푇 ) → count → 푇. Performing 훿-reduction, that is, unfolding big, gives us the small partially reduced term

(휆(푥 ∶ ℕ). 휆(푛 ∶ count). count_rec (ℕ → ℕ) (휆푥. 푥) (휆푛′. 휆big푛′. 휆푥. dlet 푥′ ∶= 푥 × 푥 in big푛′ 푥′)) 1 100

We call this term small, because performing 훽휄 reduction gives us a much larger reduced term:

dlet 푥1 ∶= 1 × 1 in ⋯ dlet 푥100 ∶= 푥99 × 푥99 in 푥100

Abstracting the small partially reduced term over (@Let_In ℕ ℕ), S, O, mul, and ℕ gives us the abstracted unreduced term

Λ푁. 휆(Mul ∶ 푁 → 푁 → 푁) (O ∶ 푁) (S ∶ 푁 → 푁) (LetIn ∶ 푁 → (푁 → 푁) → 푁). (휆(푥 ∶ 푁). 휆(푛 ∶ count). count_rec (푁 → 푁) (휆푥. 푥) (휆푛′. 휆big푛′. 휆푥. LetIn (Mul 푥 푥) (휆푥′. big푛′ 푥′))) (S O) 100

Note that it is essential here that count is not syntactically the same as ℕ; if they were the same, the abstraction would be ill-typed, as we have not abstracted over count_rec. More generally, it is essential that there is a clear separation between

types that we reify and types that we do not, and we must reify all operations on the types that we reify.

We can now apply this term to expr, NatMul, NatS, NatO, and, finally, to the term (휆푣 푓. LetIn 푣 (휆푥. 푓 (Var 푥))). We get an unreduced reified syntax tree of type expr. If we now perform 훽휄 reduction, we get our fully reduced reified term.

We take a moment to emphasize that this technique is not possible with any other method of reification. We could just as well have not specialized the function to the count of 100, yielding a function of type count → expr, despite the fact that our reflective language knows nothing about count!

This technique is especially useful for terms that will not reduce without concrete parameters but which should be reified for many different parameters. Running reduction once is slightly faster than running OCaml reification once, and it is more than twice as fast as running reduction followed by OCaml reification. For sufficiently large terms and sufficiently many parameter values, this performance beats even OCaml reification.2

6.2.3 Implementation in ℒtac

ExampleMoreParametricity.v in the associated codebase, available at https://github.com/mit-plv/reification-by-parametricity, mirrors the development of reification by parametricity in Subsection 6.2.1.

Unfortunately, Coq does not have a tactic that performs abstraction.3 However, the pattern tactic suffices; it performs abstraction followed by application, making it a sort of one-sided inverse to 훽-reduction. By chaining pattern with an ℒtac-match statement to peel off the application, we can get the abstracted function.

Ltac Reify x :=
  match (eval pattern nat, Nat.mul, S, O, (@Let_In nat nat) in x) with
  | ?rx _ _ _ _ _
    => constr:(fun var => rx (@expr var) NatMul NatS NatO
                             (fun v f => LetIn v (fun x => f (Var x))))
  end.
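As a small usage sketch (the enclosing goal and the posed name are hypothetical, and we assume the dlet notation and the expr constructors from the surrounding development are in scope), applying this tactic to the running example of Subsection 6.2.1 produces, up to 훽-reduction, the PHOAS tree shown there:

Goal True.
Proof.
  let r := Reify (dlet a := 1 in a * a) in
  pose (reified_example := r).
  (* reified_example reduces to
     fun var => LetIn (NatS NatO) (fun v => NatMul (Var v) (Var v)) *)
Abort.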

Note that if @expr var lives in Type rather than Set, we must pattern over (nat : Type) rather than nat. In older versions of Coq, an additional step involving retyping the term with the ℒtac primitive type of is needed; we refer the reader to

2We discovered this method in the process of needing to reify implementations of cryptographic primitives [Erb+19] for a couple hundred different choices of numeric parameters (e.g., prime modulus of arithmetic). A couple hundred is enough to beat the overhead.
3The generalize tactic returns ∀ rather than 휆, and it only works on types.

Parametricity.v in the code supplement.

The error messages returned by the pattern tactic can be rather opaque at times; in ExampleParametricityErrorMessages.v, we provide a procedure for decoding the error messages.

Open Terms. At some level it is natural to ask about generalizing our method to reify open terms (i.e., with free variables), but we think such phrasing is a red herring. Any lemma statement about a procedure that acts on a representation of open terms would need to talk about how these terms would be closed. For example, solvers for algebraic goals without quantifiers treat free variables as implicitly universally quantified. The encodings are invariably ad-hoc: the free variables might be assigned unique numbers during reification, and the lemma statement would be quantified over a sufficiently long list that these numbers will be used to index into. Instead, we recommend directly reifying the natural encoding of the goal as interpreted by the solver, e.g. by adding new explicit quantifiers. Here is a hypothetical goal and a tactic script for this strategy:

(a b : nat) (H : 0 < b) |- ∃ q r, a = q × b + r ∧ r < b

repeat match goal with
       | n : nat |- ?P => match eval pattern n in P with
                          | ?P' _ => revert n; change (_forall nat P')
                          end
       | H : ?A |- ?B => revert H; change (impl A B)
       | |- ?G => (* ∀ a b, 0 < b -> ∃ q r, a = q × b + r ∧ r < b *)
                  let rG := Reify G in
                  refine (nonlinear_integer_solver_sound rG _ _);
                  [ prove_wf | vm_compute; reflexivity ]
       end.

Briefly, this script replaced the context variables a and b with universal quantifiers in the conclusion, and it replaced the premise H with an implication in the conclusion. The syntax-tree datatype used in this example can be found in the Coq source file ExampleMoreParametricity.v.

6.2.4 Advantages and Disadvantages

This method is faster than all but Ltac2 and OCaml reification, and commuting reduction and abstraction makes this method faster even than the low-level Ltac2

reification in many cases. Additionally, this method is much more concise than nearly every other method we have examined, and it is very simple to implement.

We will emphasize here that this strategy shines when the initial term is small, the partially computed terms are big (and there are many of them), and the operations to evaluate are mostly well-separated by types (e.g., evaluate all of the count operations and none of the nat ones).

This strategy is not directly applicable for reification of match (rather than eliminators) or let … in … (rather than a definition that unfolds to let … in …), forall (rather than a definition that unfolds to forall), or when reification should not be modulo 훽휄휁-reduction.

6.3 Performance Comparison

We have done a performance comparison of the various methods of reification to the PHOAS language @expr var from Section 3.1.3 in Coq 8.12.2. A typical reification routine will obtain the term to be reified from the goal, reify it, run transitivity (denote reified_term) (possibly after normalizing the reified term), and solve the side condition with something like lazy [denote]; reflexivity. Our testing on a few samples indicated that using change rather than transitivity; lazy [denote]; reflexivity can be around 3X slower; note that we do not test the time of Defined.
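Here is a hedged sketch of that routine, reusing the Reify tactic of Subsection 6.2.3; the concrete goal is hypothetical, and depending on the exact type of denote one may need denote (r _) rather than denote r:

Goal (dlet a := 1 in a * a) = 1.
Proof.
  match goal with
  | [ |- ?lhs = _ ]
    => let r := Reify lhs in
       transitivity (denote r);
       [ lazy [denote]; reflexivity | ]  (* the reification side condition *)
  end.
  (* remaining goal: denote r = 1, to be handled by the reflective procedure *)
Abort.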

There are two interesting metrics to consider: (1) how long does it take to reify the term? and (2) how long does it take to get a normalized reified term, i.e., how long does it take both to reify the term and normalize the reified term? We have chosen to consider (1), because it provides the most fine-grained analysis of the actual reification method.

6.3.1 Without Binders

We look at terms of the form 1 * 1 * 1 * … where multiplication is associated to create a balanced binary tree. We say that the size of the term is the number of 1s. We refer the reader to the attached code for the exact test cases and the code of each reification method being tested.
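The exact generator lives in the attached code; as a hypothetical sketch, power-of-two instances of such terms can be produced by unfolding a recursive definition while leaving multiplication folded:

Fixpoint balanced_ones (h : nat) : nat :=
  match h with
  | O => 1
  | S h' => balanced_ones h' * balanced_ones h'
  end.

(* [cbv [balanced_ones]] unfolds the recursion but not Nat.mul, yielding a
   balanced tree of multiplications with 2^h ones at the leaves. *)
Eval cbv [balanced_ones] in balanced_ones 3.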

We found that the performance of all methods is linear in term size.

Sorted from slowest to fastest, most of the labels in Figure 6-3 should be self-explanatory and are found in similarly named .v files in the associated code; we call out a few potentially confusing ones:

[Figure 6-3: Performance of Reification without Binders. Reification time (s) plotted against term size 푛 (no binders) for each method, with the legend sorted from slowest (TypeClasses) to fastest (OCaml).]

• The “Parsing” benchmark is “reification by copy-paste”: a script generates a .v file with notation for an already-reified term; we benchmark the amount of time it takes to parse and typecheck that term. The “ParsingElaborated” benchmark is similar, but instead of giving notation for an already-reified term, we give the complete syntax tree, including arguments normally left implicit. Note that these benchmarks cut off at around 5000 rather than at around 20 000, because on large terms, Coq crashes with a stack overflow in parsing.

• We have four variants starting with “CanonicalStructures” here. The Flat variants reify to @expr nat rather than to forall var, @expr var and benefit from fewer function binders and application nodes. The HOAS variants do not include a case for let … in … nodes, while the PHOAS variants do. Unlike most other reification methods, there is a significant cost associated with handling more sorts of identifiers in canonical structures.

We note that on this benchmark our method is slightly faster than template-coq, which reifies to de Bruijn indices, and slightly slower than the quote plugin in the standard library4 and the OCaml plugin we wrote by hand.

6.3.2 With Binders

We look at terms of the form dlet a1 := 1 * 1 in dlet a2 := a1 * a1 in … dlet a푛 := a푛−1 * a푛−1 in a푛, where 푛 is the size of the term. The first graph shown here includes all of the reification variants at linear scale, while the second zooms in on the highest-performance variants at log-log scale.

4This plugin no longer appears in this graph because it was removed in Coq 8.10 [Dén18], though it appears in the graph in Gross, Erbsen, and Chlipala [GEC18].

[Figure 6-4: Performance of Reification with Binders. Two plots of reification time (s) against term size 푛 (with binders): the first shows all reification variants at linear scale; the second, at log-log scale, zooms in on the highest-performance variants together with reference lines including “identity lazy” and “lazy Denote”.]

In addition to reification benchmarks, the graph in Figure 6-4 includes as a reference (1) the time it takes to run lazy reduction on a reified term already in normal form (“identity lazy”) and (2) the time it takes to check that the reified term matches the original native term (“lazy Denote”). The former is just barely faster than OCaml reification; the latter often takes longer than reification itself. The line for the template-coq plugin cuts off at around 10 000 rather than around 20 000 because at that point template-coq starts crashing with stack overflows.


6.4 Future Work, Concluding Remarks

We identify one remaining open question with this method that has the potential of removing the next largest bottleneck in reification: using reduction to show that the reified term is correct.

Recall our reification procedure and the associated diagram from Subsection 6.2.2. We perform 훿 on an unreduced term to obtain a small, partially reduced term; we then perform abstraction to get an abstracted, unreduced term, followed by application to get unreduced reified syntax. These steps are all fast. Finally, we perform 훽휄-reduction to get reduced, reified syntax and perform 훽휄훿 reduction to get back a reduced form of our original term. These steps are slow, but we must do them if we are to have verified reflective automation.

[Figure 6-5: Completing the commutative triangle. The diagram relating the unreduced term, the small partially reduced term, the unreduced abstracted term, and the unreduced reified syntax, with a conjectured connection (marked ???) between the small partially reduced term and the unreduced reified syntax.]

It would be nice if we could prove this equality without ever reducing our term. That is, it would be nice if we could have the diagram in Figure 6-5.

The question, then, is how to connect the small partially reduced term with denote applied to the unreduced reified syntax. That is, letting 퐹 denote the unreduced abstracted term, how can we prove, without reducing 퐹, that

퐹 ℕ Mul O S (@Let_In ℕ ℕ) = denote (퐹 expr NatMul NatO NatS LetIn)

We hypothesize that a form of internalized parametricity would suffice for proving this lemma. In particular, we could specialize the type argument of 퐹 with ℕ × expr. Then we would need a proof that for any function 퐹 of type

∀(푇 ∶ Type), (푇 → 푇 → 푇 ) → 푇 → (푇 → 푇 ) → (푇 → (푇 → 푇 ) → 푇 ) → 푇

and any types 퐴 and 퐵, and any terms 푓퐴 ∶ 퐴 → 퐴 → 퐴, 푓퐵 ∶ 퐵 → 퐵 → 퐵, 푎 ∶ 퐴, 푏 ∶ 퐵, 푔퐴 ∶ 퐴 → 퐴, 푔퐵 ∶ 퐵 → 퐵, ℎ퐴 ∶ 퐴 → (퐴 → 퐴) → 퐴, and ℎ퐵 ∶ 퐵 → (퐵 → 퐵) → 퐵, using 푓 × 푔 to denote lifting a pair of functions to a function over pairs:

fst (퐹 (퐴 × 퐵) (푓퐴 × 푓퐵) (푎, 푏) (푔퐴 × 푔퐵) (ℎ퐴 × ℎ퐵)) = 퐹 퐴 푓퐴 푎 푔퐴 ℎ퐴 ∧

snd (퐹 (퐴 × 퐵) (푓퐴 × 푓퐵) (푎, 푏) (푔퐴 × 푔퐵) (ℎ퐴 × ℎ퐵)) = 퐹 퐵 푓퐵 푏 푔퐵 ℎ퐵

This theorem is a sort of parametricity theorem.

Despite this remaining open question, we hope that our performance results make a

strong case for our method of reification; it is fast, concise, and robust.

Part III

API Design

Chapter 7

Abstraction

7.1 Introduction

In Chapters 1 and 2, we discussed two different fundamental sources of performance bottlenecks in proof assistants: the power that comes from having dependent types, in Subsection 1.3.1; and the de Bruijn criterion of having a small trusted kernel, in Subsection 1.3.2. In this chapter, we will dive further into the performance issues arising from the first of these design decisions, expanding on Subsection 2.6.4 (The Number of Nested Abstraction Barriers) and proposing some general guidelines for handling these performance bottlenecks.

This chapter is primarily geared toward the users of proof assistants and especially toward proof-assistant library developers.

We saw in The Number of Nested Abstraction Barriers three different ways that design choices for abstraction barriers can impact performance: We saw in Type-Size Blowup: Abstraction Barrier Mismatch that API mismatch results in type-size blowup; we saw a particularly striking example of this in Section 5.5 (Monads: Missing Abstraction Barriers at the Type Level) where an API mismatch resulted in vastly more complicated theorem statements. We saw in Conversion Troubles that imperfectly opaque abstraction barriers result in slowdown due to needless calls to the conversion checker. We saw in Type Size Blowup: Packed vs. Unpacked Records how the choice of whether to use packed or unpacked records impacts performance.

In this chapter, we will focus primarily on the first of these three ways that design choices for abstraction barriers can impact performance; while it might seem like a simple question of good design, it turns out that good API design in dependently typed programming languages is significantly harder than in nondependently typed programming languages. We will additionally weave in ways that abstraction

barriers have helped us develop our tools and libraries, though this use of abstraction barriers (sometimes called data abstraction) is already well-known in software engineering [SSA96]. Mitigating the second source of performance bottlenecks, imperfectly opaque abstraction barriers, on the other hand, is actually just a question of meticulous tracking of how abstraction barriers are defined and used and designing them so that all unfolding is explicit. However, we will present an exception to the rule of opaque abstraction barriers in Section 7.5 in which deliberate breaking of all abstraction barriers in a careful way can result in performance gains of up to a factor of two: Section 7.5 presents one of our favorite design patterns for categorical constructions—a way of coaxing Coq's definitional equality into implementing proof by duality, one of the most widely known ideas in category theory. Finally, the question of whether to use packed or unpacked records is actually a genuine trade-off in both design space and performance, as far as I can tell; the nonperformance design considerations have been discussed before in Garillot et al. [Gar+09b], while the performance implications are relatively straightforward. As far as I'm aware, there's not really a good way to get the best of all worlds.

Much of this chapter will draw on examples and experience from a category-theory library we implemented in Coq [GCS14], which we introduce in Section 7.3.

The only prior work we've been able to find where abstraction barriers are mentioned in connection with proof-development performance is Gu et al. [Gu+15]. Though this paper suggests that good abstraction barriers resulted in simpler invariants which alleviated proof burden on the developer, we suspect that their use of abstraction barriers also dodged the proof-generation and proof-checking performance bottlenecks of large types, large terms, large goals, and excessive unfolding that plague developments with leaky abstraction barriers.

7.2 When and How To Use Dependent Types Painlessly

Though abstraction barriers have been studied in the context of nondependently typed languages [SSA96], we’re not aware of any systematic investigation of abstraction in dependently typed languages. Hence we provide in this section some rules of thumb that we’ve learned for developing good abstraction in dependently typed languages. Following these guidelines, in our experience, tends both to alleviate proof burden by decoupling and simplifying theorem statements and also to improve the performance of individual proofs, sometimes by an order of magnitude or more, by avoiding the superlinear scaling laid out in Section 2.6 (The Four Axes of the Landscape).

The extremes of using dependent types are relatively easy:

• Total separation between proofs and programs, so that programs are nondependently typed, works relatively well.

• Preexisting mathematics, where objects are fully bundled with proofs and never need to be separated from them, also works relatively well.

We present a rule of thumb for being in the middle: it incurs enormous overhead—both proof-authoring time and often proof-checking time—to recombine proofs and programs after separating them; if this separation and recombination is being done to define an opaque transformation that acts on proof-carrying code, that is okay, but if the abstraction barrier cannot be constructed, enormous overhead results.

For example, if we have length-indexed lists and want to index into them with elements of a finite type, things are fine until we need to divorce the index from its proof of finiteness. If, for example, we want to index into the concatenation of two lists with an index into the first of the lists, then we will likely run into trouble, because we are trying to consider the index separately from its proof of finitude, but we have to recombine them to do the indexing.
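Here is a minimal sketch of that situation using the standard library's length-indexed vectors and finite index types (the helper name is ours); the explicit Fin.L coercion is exactly the recombination of the index with a bound that we are describing:

Require Import Coq.Vectors.Vector Coq.Vectors.Fin.

(* Index into the concatenation of two vectors with an index into the first:
   the index [i : Fin.t n] must be explicitly weakened to [Fin.t (n + m)]. *)
Definition nth_of_append {A n m}
           (v : Vector.t A n) (w : Vector.t A m) (i : Fin.t n) : A
  := Vector.nth (Vector.append v w) (Fin.L m i).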

We saw in footnote 3 in Subsection 5.2.1 (Pattern-Matching Evaluation on Type-Indexed Terms) how dependent types cause coupling between otherwise-unrelated design decisions. This is due to the fact that every single operation needs to declare how it interacts with all of the various indices and must effectively include a proof for each of these interactions. In the example of Subsection 5.2.1, we considered indexing the list of terms over a list of types and indexing both this list of types and the decision tree over a natural-number length. In this case, the choice of whether we operate in the middle of the list or at the front of the list is severely complicated by the length index. If we need to insert an unknown number of elements into the middle of a length-indexed list, the length of the resulting list is not judgmentally the sum of the lengths, because addition is not judgmentally commutative.

7.3 A Brief Introduction to Our Category-Theory Library

Category theory [Mac] is a popular all-encompassing mathematical formalism that casts familiar mathematical ideas from many domains in terms of a few unifying concepts. A category can be described as a directed graph plus algebraic laws stating equivalences between paths through the graph. Because of this spartan philosophical grounding, category theory is sometimes referred to in good humor as “formal abstract nonsense.” Certainly the popular perception of category theory is quite far from pragmatic issues of implementation. Our implementation of category theory ran squarely into issues of design and efficient implementation of type theories, proof assistants, and developments within them.

One might presume that it is a routine exercise to transliterate categorical concepts

from the whiteboard to Coq. Most category theorists would probably be surprised to learn that standard constructions “run too slowly”, but in our experience that is exactly the result of experimenting with naïve first Coq implementations of categorical constructs. It is important to tune the library design to minimize the cost of manipulating terms and proving interesting theorems.

Category theory, said to be “notoriously hard to formalize” [Har96b], provides a good stress test of any proof assistant, highlighting problems in usability and efficiency.

Formalizing the connection between universal morphisms and adjunctions provides a typical example of our experience with performance. A universal morphism is a construct in category theory generalizing extrema from calculus. An adjunction is a weakened notion of equivalence. In the process of rewriting our library to be compatible with homotopy type theory, we discovered that cleaning up this construction conceptually resulted in a significant slow-down, because our first attempted rewrite resulted in a leaky abstraction barrier and, most importantly, large goals (Subsection 8.2.3). Plugging the holes there reduced goal sizes by two orders of magnitude1, which led to a factor of ten speedup in that file (from 39s to 3s) but incurred a factor of three slow-down in the file where we defined the abstraction barriers (from 7s to 21s).2 Working around slow projections of Σ types (Subsection 7.4.2) and being more careful about code reuse each gave us back half of that lost time.3

Although preexisting formalizations of category theory in proof assistants abound [Meg; AKS13; OKe04; Pee+; Saï; Sim; SW10; KKR06; Gro14; Ahrb; Ahra; CM98; Cha; Ish; Pou; Soza; Niq10; Pot; Ahr10; Web02; Cap; HS00; AP90; Kat10; KSW; AKS; Moh95; Spi11; CW01; Acz93; Wil05; Miq01; Dyc85; Wil12; Har96b; Age95; Nuo13], we chose to implement our library [HoT20] from scratch. Beginning from scratch allowed me to familiarize myself with both category theory and Coq, without simultaneously having to familiarize myself with a large preexisting code base.

7.4 A Sampling of Abstraction Barriers

We acknowledge that the concept of performance issues arising from choices of abstraction barriers may seem a bit counterintuitive. After all, abstraction barriers generally live in the mind of the developer, in some sense, and it seems a bit insane to say that performance of the code depends on the mental state of the programmer.

Therefore, we will describe a sampling of abstraction barriers and the design choices that went into them, drawn from real examples, as well as the performance issues

1The word count of the larger of the two relevant goals went from 7,312 to 191. 2See commit eb00990 in HoTT/HoTT on GitHub for more details. 3See commits c1e7ae3, 93a1258, bab2b34, and 3b0932f in HoTT/HoTT on GitHub for more details.

that arose from these choices. We will also discuss ways that various design choices increased or reduced the effort required of us to define objects and prove theorems.

7.4.1 Abstraction in Limits and Colimits

In many projects, choosing the right abstraction barriers is essential to reducing mistakes, improving maintainability and readability of code, and cutting down on time wasted by programmers trying to hold too many things in their heads at once. This project was no exception; we developed an allergic reaction to constructions with more than four or so arguments, after making one too many mistakes in defining limits and colimits. Limits are a generalization, to arbitrary categories, of subsets of Cartesian products. Colimits are a generalization, to arbitrary categories, of disjoint unions modulo equivalence relations.

Our original flattened definition of limits involved a single definition with 14 nested binders for types and algebraic properties. After a particularly frustrating experience hunting down a mistake in one of these components, we decided to factor the definition into a larger number of simpler definitions, including familiar categorical constructs like terminal objects and comma categories. This refactoring paid off even further when some months later we discovered the universal morphism definition of adjoint functors [Wik20a; nCa12a]. With a little more abstraction, we were able to reuse the same decomposition to prove the equivalence between universal morphisms and adjoint functors, with minimal effort.

Perhaps less typical of programming experience, we found that picking the right abstraction barriers could drastically reduce compile time by keeping details out of sight in large goal formulas. In the instance discussed in the introduction, we got a factor of ten speed-up by plugging holes in a leaky abstraction barrier!4

7.4.2 Nested Σ Types

In Coq, there are two ways to represent a data structure with one constructor and many fields: as a single inductive type with one constructor (records) or as a nested Σ type. For instance, consider a record type with two type fields 퐴 and 퐵 and a function 푓 from 퐴 to 퐵. A logically equivalent encoding would be Σ퐴. Σ퐵. 퐴 → 퐵. There are two important differences between these encodings in Coq.

The first is that while a theorem statement may abstract over all possible Σ types, it may not abstract over all record types, which somehow have a less first-class status. Such a limitation is inconvenient and leads to code duplication.

4See commit eb00990 in HoTT/HoTT on GitHub for the exact change.

The far-more-pressing problem, overriding the previous point, is that nested Σ types have horrendous performance and are sometimes a few orders of magnitude slower. The culprit is projections from nested Σ types, which, when unfolded (as they must be, to do computation), each take almost the entirety of the nested Σ type as an argument and so grow in size very quickly.

Let’s consider a toy example to see the asymptotic performance. To construct a nested Σ type with three fields of type unit, we can write the type:

{ _ : unit & { _ : unit & unit }}

If we want to project out the final field, we must write projT2 (projT2 x) which, when implicit arguments are included, expands to

@projT2 unit (휆 _ : unit, unit) (@projT2 unit (휆 _ : unit, { _ : unit & unit }) x)

This term grows quadratically in the number of projections because the type of the 푛th field is repeated approximately 2푛 times. This is even more of a problem when we need to destruct x to prove something about the projections, as we need to destruct it as many times as there are fields, which adds another factor of 푛 to the performance cost of building the proof from scratch; in Coq, this cost is either avoided due to sharing or else is hidden by a quadratic factor with a much larger constant coefficient. Note that this is a sort-of dual to the problem of Subsection 2.6.1; there, we encountered quadratic overhead in applying the constructors (which is also a problem here), whereas right now we are discussing quadratic overhead in applying the eliminators. See Figure 7-1 for the performance details.

We can avoid much of the cost of building the projection term by using primitive projections (see Subsection 8.1.6 for more explanation of this feature). Note that this feature is a sort-of dual to the proposed feature of dropping constructor parameters described in Section 2.6.1. This does drastically reduce the overhead of building the projection term but only cuts in half the constant factor in destructing the variable so as to prove something about the projection. See Figure 7-1b for performance details.
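As a minimal sketch of what enabling the feature looks like (the record here is merely a local copy of sigT with hypothetical names):

Module PrimitiveSigma.
  Set Primitive Projections.
  (* With the flag set, applications of projT2' are stored in a compact form
     that does not repeat the types of the earlier fields. *)
  Record sigT' {A} (P : A -> Type) := existT' { projT1' : A ; projT2' : P projT1' }.
End PrimitiveSigma.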

There are two solutions to this issue:

1. use built-in record types

2. carefully define intermediate abstraction barriers to avoid the quadratic overhead

Both of these essentially solve the issue of quadratic overhead in projecting out the fields. This is the benefit of good abstraction barriers.
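As a sketch of the first solution (the names here are hypothetical), a flat built-in record keeps every projection small:

Record T4 := Build_T4 { field0 : unit ; field1 : unit ; field2 : unit ; field3 : unit }.

(* Each projection applies directly to the record value, so projecting out the
   last field does not mention the types of the earlier fields. *)
Check (fun x : T4 => field3 x).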

[Figure 7-1 ((a) Normal Record; (b) With Primitive Projections): There are two ways we look at the performance of building a term like projT1 (projT2 ... (projT2 x)) with 푛 projT2s: we can define a recursive function that computes this term and then use cbv to reduce away the recursion and time how long this takes; or we can build the term using Ltac2 and then typecheck it. These plots display both of these methods and in addition display the time it takes to run destruct to break 푥 into its component fields, as a lower bound for how long it takes to prove anything about a nested Σ type with 푛 fields. The second graph displays the timing with primitive projections turned on. Note that the 푥-axis is 10× larger on that plot.]

In Coq 8.11, destruct is unfortunately still quadratic due to issues with name generation, but the constant factor is much smaller; see Figure 7-2 and Coq bug #12271.

We now come to the question: how much do we pay for using this abstraction barrier? That is, how much is the one-time cost of defining the abstraction barrier? Obviously, we can just make definitions for each of the projections and for the eliminator and pay the cubic (or perhaps even quartic; see the leading term in Figure 7-1) overhead once. There's an interesting question, though, of whether we can avoid this overhead altogether.

As seen in Figure 7-2, using records partially avoids the overhead. Defining the record type, though, still incurs a quadratic factor due to hash consing the projections; see Coq bug #12270.

If our proof assistant does not support records out-of-the-box, or we want to avoid using them for whatever reason5, we can instead define intermediate abstraction barriers by hand. Here is what code that almost works looks like for four fields:

Local Set Implicit Arguments.
Record sigT {A} (P : A -> Type) := existT { projT1 : A ; projT2 : P projT1 }.
Definition sigT_eta {A P} (x : @sigT A P)
  : x = existT P (projT1 x) (projT2 x).
Proof. destruct x; reflexivity. Defined.
Definition _T0 := unit.
Definition _T1 := @sigT unit (fun _ : unit => _T0).
Definition _T2 := @sigT unit (fun _ : unit => _T1).
Definition _T3 := @sigT unit (fun _ : unit => _T2).
Definition T := _T3.
Definition Build_T0 (x0 : unit) : _T0 := x0.
Definition Build_T1 (x0 : unit) (rest : _T0) : _T1
  := @existT unit (fun _ : unit => _T0) x0 rest.
Definition Build_T2 (x0 : unit) (rest : _T1) : _T2
  := @existT unit (fun _ : unit => _T1) x0 rest.
Definition Build_T3 (x0 : unit) (rest : _T2) : _T3
  := @existT unit (fun _ : unit => _T2) x0 rest.
Definition Build_T (x0 : unit) (x1 : unit) (x2 : unit) (x3 : unit) : T
  := Build_T3 x0 (Build_T2 x1 (Build_T1 x2 (Build_T0 x3))).

Definition _T0_proj (x : _T0) : unit := x.
Definition _T1_proj1 (x : _T1) : unit := projT1 x.
Definition _T1_proj2 (x : _T1) : _T0 := projT2 x.
Definition _T2_proj1 (x : _T2) : unit := projT1 x.
Definition _T2_proj2 (x : _T2) : _T1 := projT2 x.
Definition _T3_proj1 (x : _T3) : unit := projT1 x.
Definition _T3_proj2 (x : _T3) : _T2 := projT2 x.

5Note that the UniMath library [Voe15; VAG+20; Gra18] does this.

[Figure 7-2: Timing of running a Record command to define a record with 푛 fields and the time to destruct such a record, with and without primitive projections; all four curves are fit well by quadratics in 푛. Note that building the goal involving projecting out the last field takes less than 0.001s for all numbers of fields that we tested. (Presumably for large enough numbers of fields, we'd start getting a logarithmic overhead from parsing the name of the final field, which, when represented as 푥 followed by the field number in base 10, does grow in size as log10 푛.) Note that the nonmonotonic timing is reproducible and seems to be due to hitting garbage collection; see Coq issue #12270 for more details.]

Definition proj_T_1 (x : T) : unit := _T3_proj1 x.
Definition proj_T_1_rest (x : T) : _T2 := _T3_proj2 x.
Definition proj_T_2 (x : T) : unit := _T2_proj1 (proj_T_1_rest x).
Definition proj_T_2_rest (x : T) : _T1 := _T2_proj2 (proj_T_1_rest x).
Definition proj_T_3 (x : T) : unit := _T1_proj1 (proj_T_2_rest x).
Definition proj_T_3_rest (x : T) : _T0 := _T1_proj2 (proj_T_2_rest x).
Definition proj_T_4 (x : T) : unit := _T0_proj (proj_T_3_rest x).

Definition _T0_eta (x : _T0) : x = Build_T0 (_T0_proj x)
  := @eq_refl _T0 x.
Definition _T1_eta (x : _T1) : x = Build_T1 (_T1_proj1 x) (_T1_proj2 x)
  := @sigT_eta unit (fun _ : unit => _T0) x.
Definition _T2_eta (x : _T2) : x = Build_T2 (_T2_proj1 x) (_T2_proj2 x)
  := @sigT_eta unit (fun _ : unit => _T1) x.
Definition _T3_eta (x : _T3) : x = Build_T3 (_T3_proj1 x) (_T3_proj2 x)
  := @sigT_eta unit (fun _ : unit => _T2) x.

Definition T_eta (x : T)
  : x = Build_T (proj_T_1 x) (proj_T_2 x) (proj_T_3 x) (proj_T_4 x)
  := let lhs3 := x in
     let lhs2 := _T3_proj2 lhs3 in
     let lhs1 := _T2_proj2 lhs2 in
     let lhs0 := _T1_proj2 lhs1 in
     let final := _T0_proj lhs0 in
     let rhs0 := Build_T0 final in
     let rhs1 := Build_T1 (_T1_proj1 lhs1) rhs0 in
     let rhs2 := Build_T2 (_T2_proj1 lhs2) rhs1 in
     let rhs3 := Build_T3 (_T3_proj1 lhs3) rhs2 in
     (((@eq_trans _T3)
         lhs3 (Build_T3 (_T3_proj1 lhs3) lhs2) rhs3
         (_T3_eta lhs3)
         ((@f_equal _T2 _T3 (Build_T3 (_T3_proj1 lhs3)))
            lhs2 rhs2
            ((@eq_trans _T2)
               lhs2 (Build_T2 (_T2_proj1 lhs2) lhs1) rhs2
               (_T2_eta lhs2)
               ((@f_equal _T1 _T2 (Build_T2 (_T2_proj1 lhs2)))
                  lhs1 rhs1
                  ((@eq_trans _T1)
                     lhs1 (Build_T1 (_T1_proj1 lhs1) lhs0) rhs1
                     (_T1_eta lhs1)
                     ((@f_equal _T0 _T1 (Build_T1 (_T1_proj1 lhs1)))
                        lhs0 rhs0
                        (_T0_eta lhs0)))))))
      : x = Build_T (proj_T_1 x) (proj_T_2 x) (proj_T_3 x) (proj_T_4 x)).

Import EqNotations.
Definition T_rect (P : T -> Type)
           (f : forall (x0 : unit) (x1 : unit) (x2 : unit) (x3 : unit),
               P (Build_T x0 x1 x2 x3))
           (x : T)
  : P x
  := rew <- [P] T_eta x in
      f (proj_T_1 x) (proj_T_2 x) (proj_T_3 x) (proj_T_4 x).

It only almost works because, although the overall size of the terms, even accounting for implicits, is linear in the number of fields, we still incur a quadratic number of unfoldings in the final cast node in the proof of T_eta. Note that this cast node is only present to make explicit the conversion problem that must happen; removing it does not break anything, but then the quadratic cost is hidden in nontrivial substitutions of the let binders into the types. It might be possible to avoid this quadratic factor by being even more careful, but I was unable to find a way to do it.6 Worse, though, due to the issue with nested let binders described in Section 2.6.1, we would still incur a quadratic typechecking cost.

We can, however, avoid this cost by turning on primitive projections via Set Primitive Projections at the top of this block of code: this enables judgmental 휂-conversion for primitive records, whence we can prove T_eta with the proof term @eq_refl T x.

At least, so says the theoretical analysis. Our best stab at implementing this still resulted in at least quadratic asymptotic performance, if not worse.

7.5 Internalizing Duality Arguments in Type Theory

In general, we tried to design our library so that trivial proofs on paper remain trivial when formalized. One of Coq’s main tools to make proofs trivial is the definitional

6Note that even reflective automation (see Chapter 3) is not sufficient to solve this issue. Essentially, the bottleneck is that at the bottom of the chain of let binders in the 휂 proof, we have two different types for the 휂 principle. One of them uses the globally defined projections out of T, while the other uses the projections of x defined in the local context. We need to convert between these two types in linear time. Converting between two differently defined projections takes time linear in the number of under-the-hood projections, i.e., linear in the number of fields. Doing this once for each projection thus takes quadratic time. Using a reflective representation of nested Σ types, and thus being able to prove the 휂 principle once and for all in constant time, would not help here, because it takes quadratic time to convert between the type of the 휂 principle in reflective-land and the type that we want. One thing that might help would be to have a version of conversion checking that was both memoized and could perform in-place reduction; see Coq issue #12269.

equality, where some facts follow by computational reduction of terms. We came up with some small tweaks to core definitions that allow a common family of proofs by duality to follow by computation.

This is an exception to the rule of opaque abstraction barriers. Here, deliberate breaking of all abstraction barriers in a careful way can result in performance gains of up to a factor of two!

Proof by duality is a common idea in higher mathematics: sometimes, it is productive to flip the directions of all the arrows. For example, if some fact about least upper bounds is provable, chances are that the same kind of fact about greatest lower bounds will also be provable in roughly the same way, by replacing “greater than”s with “less than”s and vice versa.

Concretely, there is a dualizing operation on categories that inverts the directions of the morphisms:

Notation "C ᵒᵖ" := ({| Ob := Ob C; Hom x y := Hom C y x; ... |}).

Dualization can be used, roughly, for example, to turn a proof that Cartesian product is an associative operation into a proof that disjoint union is an associative operation; products are dual to disjoint unions.

One of the simplest examples of duality in category theory is initial and terminal objects. In a category 풞, an initial object 0 is one that has a unique morphism 0 → 푥 to every object 푥 in 풞; a terminal object 1 is one that has a unique morphism 푥 → 1 from every object 푥 in 풞. Initial objects in 풞 are terminal objects in 풞op. The initial object of any category is unique up to isomorphism; for any two initial objects 0 and 0′, there is an isomorphism 0 ≅ 0′. By flipping all of the arrows around, we can prove, by duality, that the terminal object is unique up to isomorphism. More precisely, from a proof that an initial object of 풞op is unique up to isomorphism, we get that any two terminal objects 1′ and 1 in 풞, which are initial in 풞op, are isomorphic in 풞op. Since an isomorphism 푥 ≅ 푦 in 풞op is an isomorphism 푦 ≅ 푥 in 풞, we get that 1 and 1′ are isomorphic in 풞.

It is generally straightforward to see that there is an isomorphism between a theorem and its dual, and the technique of dualization is well-known to category theorists, among others. We discovered that, by being careful about how we defined constructions, we could make theorems be judgmentally equal to their duals! That is, when we prove a theorem

initial_ob_unique : ∀ C (x y : Ob C), is_initial_ob x → is_initial_ob y → x ≅ y,

we can define another theorem terminal_ob_unique : ∀ C (x y : Ob C), is_terminal_ob x → is_terminal_ob y → x ≅ y as

terminal_ob_unique C x y H H' := initial_ob_unique Cᵒᵖ y x H' H.

Interestingly, we found that in proofs with sufficiently complicated types, it can take a few seconds or more for Coq to accept such a definition; we are not sure whether this is due to peculiarities of the reduction strategy of our version of Coq, or speed dependency on the size of the normal form of the type (rather than on the size of the unnormalized type), or something else entirely.

In contrast to the simplicity of witnessing the isomorphism, it takes a significant amount of care in defining concepts, often to get around deficiencies of Coq, to achieve judgmental duality. Even now, we were unable to achieve this ideal for some theorems. For example, category theorists typically identify the functor category 풞op → 풟op (whose objects are functors 풞op → 풟op and whose morphisms are natural transformations) with (풞 → 풟)op (whose objects are functors 풞 → 풟 and whose morphisms are flipped natural transformations). These categories are canonically isomorphic (by the dualizing natural transformations), and, with the univalence axiom [Uni13], they are equal as categories! However, to make these categories definitionally equal, we need to define functors as a structural record type (see Section 2.6.1) rather than a nominal one.

7.5.1 Duality Design Patterns

One of the simplest theorems about duality is that it is involutive; we have that (풞op)op = 풞. In order to internalize proof by duality via judgmental equality, we sometimes need this equality to be judgmental. Although it is impossible in general in Coq 8.4 (see dodging judgmental 휂 on records below), the latest version of Coq available when we were creating this library, we want at least to have it be true for any explicit category (that is, any category specified by giving its objects, morphisms, etc., rather than referred to via a local variable).

Removing Symmetry

Taking the dual of a category, one constructs a proof that 푓 ∘ (푔 ∘ ℎ) = (푓 ∘ 푔) ∘ ℎ from a proof that (푓 ∘ 푔) ∘ ℎ = 푓 ∘ (푔 ∘ ℎ). The standard approach is to apply symmetry. However, because applying symmetry twice results in a judgmentally different proof, we decided instead to extend the definition of Category to require both a proof of

푓 ∘ (푔 ∘ ℎ) = (푓 ∘ 푔) ∘ ℎ and a proof of (푓 ∘ 푔) ∘ ℎ = 푓 ∘ (푔 ∘ ℎ). Then our dualizing operation simply swaps the proofs. We added a convenience constructor for categories that asks only for one of the proofs and applies symmetry to get the other one. Because we formalized 0-truncated category theory, where the type of morphisms is required to have unique identity proofs, asking for this other proof does not result in any coherence issues.
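A hedged sketch of what such a symmetrized definition might look like (the field names are hypothetical and the identity laws are elided):

Record PreCategory := {
  Ob : Type;
  Hom : Ob -> Ob -> Type;
  identity : forall x, Hom x x;
  compose : forall s d d', Hom d d' -> Hom s d -> Hom s d';
  (* Both orientations of associativity, so that dualization can swap the two
     proofs instead of applying symmetry. *)
  associativity : forall x1 x2 x3 x4
                         (m1 : Hom x1 x2) (m2 : Hom x2 x3) (m3 : Hom x3 x4),
      compose _ _ _ (compose _ _ _ m3 m2) m1
      = compose _ _ _ m3 (compose _ _ _ m2 m1);
  associativity_sym : forall x1 x2 x3 x4
                             (m1 : Hom x1 x2) (m2 : Hom x2 x3) (m3 : Hom x3 x4),
      compose _ _ _ m3 (compose _ _ _ m2 m1)
      = compose _ _ _ (compose _ _ _ m3 m2) m1
  (* ... left and right identity laws, plus a proof that the identity composed
     with itself is the identity ... *)
}.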

Dualizing the Terminal Category

To make everything work out nicely, we needed the terminal category, which is the category with one object and only the identity morphism, to be the dual of itself. We originally had the terminal category as a special case of the discrete category on 푛 objects. Given a type 푇 with uniqueness of identity proofs, the discrete category on 푇 has as objects inhabitants of 푇 and has as morphisms from 푥 to 푦 proofs that 푥 = 푦. These categories are not judgmentally equal to their duals, because the type 푥 = 푦 is not judgmentally the same as the type 푦 = 푥. As a result, we instead used the indiscrete category, which has unit as its type of morphisms.

Which Side Does the Identity Go On?

The last tricky obstacle we encountered was that when defining a functor out of the terminal category, it is necessary to pick whether to use the right identity law or the left identity law to prove that the functor preserves composition; both will prove that the identity composed with itself is the identity. The problem is that dualizing the functor leads to a road block where either concrete choice turns out to be “wrong,” because the dual of the functor out of the terminal category will not be judgmentally equal to another instance of itself. To fix this problem, we further extended the definition of category to require a proof that the identity composed with itself is the identity.

Dodging Judgmental 휂 on Records

The last problem we ran into was the fact that sometimes, we really, really wanted judgmental 휂 on records. The 휂 rule for records says any application of the record constructor to all the projections of an object yields exactly that object; e.g. for pairs, 푥 ≡ (푥1, 푥2) (where 푥1 and 푥2 are the first and second projections, respectively). For categories, the 휂 rule says that given a category 풞, for a “new” category defined by saying that its objects are the objects of 풞, its morphisms are the morphisms of 풞, …, the “new” category is judgmentally equal to 풞.

In particular, we wanted to show that any functor out of the terminal category is the opposite of some other functor; namely, any 퐹 ∶ 1 → 풞 should be equal to (퐹 op)op ∶ 1 → (풞op)op. However, without the judgmental 휂 rule for records, a local

variable 풞 cannot be judgmentally equal to (풞op)op, which reduces to an application of the constructor for a category, unless the 휂 rule is built into the proof assistant. To get around the problem, we made two variants of dual functors: given 퐹 ∶ 풞 → 풟, we have 퐹 op ∶ 풞op → 풟op, and given 퐹 ∶ 풞op → 풟op, we have 퐹 op′ ∶ 풞 → 풟. There are two other flavors of dual functors, corresponding to the other two pairings of op with domain and codomain, but we have been glad to avoid defining them so far. As it was, we ended up having four variants of dual natural transformation and are very glad that we did not need sixteen. When Coq 8.5 was released, we no longer needed to pull this trick, as we could simply enable the 휂 rule for records judgmentally.

7.5.2 Moving Forward: Computation Rules for Pattern Matching

While we were able to work around most of the issues that we had in internalizing proof by duality, the experience would have been far nicer if we had more 휂 rules. The 휂 rule for records is explained above. The 휂 rule for equality says that the identity function is judgmentally equal to the function 푓 ∶ ∀푥 푦, 푥 = 푦 → 푥 = 푦 defined by pattern matching on the first proof of equality; this rule is necessary to have any hope that applying symmetry twice is judgmentally the identity transformation.

Homotopy type theory provides a framework that systematizes reasoning about proofs of equality, turning a seemingly impossible task into a manageable one. However, there is still a significant burden associated with reasoning about equalities, because so few of the rules are judgmental.

We have spent some time attempting to divine the appropriate computation rules for pattern-matching constructs, in the hopes of making reasoning with proofs of equality more pleasant.7

7See Coq issue #3179 and Coq issue #3119.

Part IV

Conclusion

Chapter 8

A Retrospective on Performance Improvements

Throughout this dissertation, we’ve looked at the problem of performance in proof assistants, especially those based on dependent type theory, with Coq as our primary tool under investigation. Part I aimed to convince the reader that this problem is interesting, important, challenging, and understudied, as it differs in nontrivial ways from performance bottlenecks in nondependently typed languages. Part II took a deep dive into a particular set of performance bottlenecks and presented a tool and, we hope, exposed the underlying design methodology, which allows eliminating asymptotic bottlenecks in one important part of proof-assistant systems. Part III zoomed back out to discuss design principles to avoid performance pitfalls.

In this chapter, we will look instead at the successes of the past decade1, ways in which performance has improved in major ways. Section 8.1 will discuss specific improvements in the implementation of Coq which resulted in performance gains, paying special attention to the underlying bottlenecks being addressed. Those without special interest in the low-level details of proof-assistant implementation may want to skip to Section 8.2, which will discuss changes to the underlying type theory of Coq which make possible drastic performance improvements. While we will again have our eye on Coq in Section 8.2, we will broaden our perspective in Section 8.3 to discuss new discoveries of the past decade or so in dependent type theory which enable performance improvements but have not yet made their way into Coq.

1Actually, the time span we’re considering is the course of the author’s experience with Coq, which is a bit less than a decade.

8.1 Concrete Performance Advancements in Coq

In this section, we dive into the minutiae: concrete changes to Coq that have measurably increased performance.

8.1.1 Removing Pervasive Evar Normalization

Back when I started using Coq, in version 8.4, almost every single tactic was at least linear in performance in the size of the goal. This included tactics like “add a new hypothesis of type True to the context” (pose proof I) and tactics like “give me the type of the most recently added hypothesis” (match goal with H : ?T |- _ => T end). The reason for this was pervasive evar normalization.

Let us review some details of the way Coq handles proof scripts. In Coq the current state of a proof is represented by a partial proof term, where not-yet-given subterms are existential variables, or evars, which may show up as goals. For example, when proving the goal True ∧ True, after running split, the proof term would be conj ?Goal1 ?Goal2, where ?Goal1 and ?Goal2 are evars. There are two subtleties:

1. Evars may be under binders. Coq uses a locally nameless representation of terms (c.f. Section 3.1.3), where terms use de Bruijn indices to refer to variables bound in the term but use names to refer to variables bound elsewhere. Thus terms generated in the context of a proof goal refer to all context variables by name and evars too refer to all variables by name. Hence each evar carries with it a named context, which causes a great deal of trouble as described in Section 2.6.3 (Quadratic Creation of Substitutions for Existential Variables).

2. Coq supports backtracking, so we must remember the history of partial proof terms. In particular, we cannot simply mutate partial proof terms to instantiate the evars, and copying the entire partial proof term just to update a small part of it would also incur a great deal of overhead. Instead, Coq never mutates the terms and instead simply keeps a map of which evars have been instantiated with which terms, called the evar map.
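Returning to the True ∧ True example, a minimal illustration of this partial-proof-term view (the exact evar names printed depend on the Coq version):

Goal True /\ True.
Proof.
  split.
  Show Proof.  (* prints something like (conj ?Goal ?Goal0) *)
Abort.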

There is an issue with the straightforward implementation of evars and evar maps. When walking terms, care must be taken with the evar case, to check whether or not the evar has in fact been instantiated. Subtle bugs in unification and other areas of Coq resulted from some functions being incorrectly sensitive to whether or not a term had been built via evar instantiation or given directly.2 The fast-and-easy solution used in older versions of Coq was to simply evar-normalize the goal before walking it. That is, every tactic that had to walk the goal for any reason whatsoever would create a copy of the type of the goal—and sometimes the proof context as

2See the discussion at Pédrot [Péd17b] for more details.

well—replacing all instantiated evars with their instantiations. Needless to say, this was very expensive when the size of the goal was large.

As of Coq 8.7, most tactics no longer perform useless evar normalization and instead walk terms using a dedicated API which does on-the-fly normalization as necessary [Péd17b]. This brought speedups of over 10% to some developments and improved asymptotic performance of some tactic scripts and interactive proof development.

8.1.2 Delaying the Externalization of Application Arguments

Coq has many representations of terms. There is constr_expr, the AST produced by Coq's parser. Internalization turns constr_expr into the untyped glob_constr representation of terms by performing name resolution, bound-variable checks, notation desugaring, and implicit-argument insertion [Coqb]. Type inference fills in the holes in untyped glob_constrs to turn them into typed constrs, possibly with remaining existential variables [Coqe]. In order to display proof goals, this process must be reversed. The internal representation of constr must be “detyped” into glob_constrs, which involves primarily just turning de Bruijn indices into names [Coqc]. Finally, implicit arguments must be erased and notations must be resugared when externalizing glob_constrs into constr_exprs, which can be printed relatively straightforwardly [Coqa; Coqd].

In old versions, Coq would externalize the entire goal, including subterms that were never printed due to being hidden by notations and implicit arguments. Starting in version 8.5pl2, lazy externalization of function arguments was implemented [Péd16b]. This resulted in massive speed-ups to interactive development involving large goals whose biggest subterms were mostly hidden.

Changes like this one can be game-changers for interactive proof development. The kind of development that can happen when it takes a tenth of a second to see the goal after executing a tactic is vastly different from the kind of development that can happen when it takes a full second or two. In the former case, the proof engine can almost feel like an extension of the coder’s mind, responding to thoughts about strategies to try almost as fast as they can be typed. In the latter case, development is significantly more clunky and involves much more friction.

In the same vein, bugs such as #3691 and #4819, where Coq crawled the entire evar map in -emacs mode (used for ProofGeneral/Emacs) looking at all instantiated evars, resulted in interactive proof times of up to half a second for every goal display, even when the goal was small and there was nothing in the context. Fixed in Coq 8.6, these bugs, too, got in the way of seamless proof development.

8.1.3 The ℒtac Profiler

If you blindly optimize without profiling, you will likely waste your time on the 99% of code that isn’t actually a performance bottleneck and miss the 1% that is.

— Charles E. Leiserson3 [Lei20]

In old versions of Coq, there was no good way to profile tactic execution. Users could wrap some invocations in time to see how long a given tactic took or could regularly print some output to see where execution hung. Both of these are very low-tech methods of performance debugging and work well enough for small tactics. For debugging hundreds or thousands of lines of ℒtac code, though, these methods are insufficient.

A genuine profiler for ℒtac was developed in 2015 and integrated into Coq itself in version 8.6 [TG15].
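As a minimal usage sketch (the exact layout of the timing table varies by Coq version), the ℒtac profiler can be driven from a proof script as follows:

    Set Ltac Profiling.        (* start recording tactic timings *)

    Goal forall n m : nat, n + m = m + n.
    Proof.
      intros; induction n; simpl; auto.
    Abort.

    Show Ltac Profile.         (* print time spent per tactic call stack *)
    Reset Ltac Profile.        (* clear the accumulated data *)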

For those interested in amusing quirks of implementation details, the profiler itself was relatively easy to implement. If I recall correctly, Tobias Tebbi, after hearing of my ℒtac performance woes, mentioned to me the profiler he implemented over the course of a couple of days. Since ℒtac already records backtraces for error reporting, it was a relatively simple matter to hook into the stack-trace recorder and track how much time was spent in each call stack. With some help from the Coq development team, I was able to adapt the patch to the new tactic engine of Coq ≥ 8.5 and shepherded it into Coq’s codebase.

8.1.4 Compilation to Native Code

Starting in version 8.5, Coq allows users to compile their functional Gallina programs to native code and fully reduce them to determine their output [BDG11; Dén13a]. In some cases, the native compiler is almost 10× faster4 than the optimized bytecode-based call-by-value virtual machine described in Grégoire and Leroy [GL02].

3Although this quote comes from the class I took at MIT, 6.172 — Performance Engineering of Software Systems, the inspiration for the quote is an extended version of Donald Knuth’s “premature optimization is the root of all evil” quote: Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. — Donald E. Knuth [Knu74b, p. 268]

4https://github.com/coq/coq/pull/12405#issuecomment-633612308

The native compiler shines most at optimizing algorithmic and computational bottlenecks. For example, computing the number of primes less than 푛 via the Sieve of Eratosthenes is about 2× to 5× faster in the native compiler than in the VM. By contrast, when the input term is very large compared to the amount of computation, the compilation time can dwarf the running time, eating up any gains that the native compiler has over the VM. This can be seen by comparing the times it takes to get the head of the explicit list of all unary-encoded natural numbers less than, say, 3000, on which the native compiler (1.7s) is about 5% slower than the VM (1.6s), which itself is about 2× slower than the built-in call-by-value reduction machine (0.79s), which requires no translation. Furthermore, when the output is large, both the VM and the native compiler suffer from inefficiencies in the readback code.
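The trade-off can be explored with the Time vernacular. The snippet below is an illustration of the commands only; absolute timings depend on the machine and the workload, and native_compute requires a Coq installation built with the native compiler (it falls back to the VM otherwise).

    Require Import Coq.NArith.NArith.
    Open Scope N_scope.

    (* A computation-heavy, input-light example: 2000 multiplications on
       growing binary naturals.  Here the VM and native compiler should do
       well; for input-heavy, computation-light goals the picture reverses. *)
    Time Eval cbv            in N.pow 3 2000.  (* built-in call-by-value machine *)
    Time Eval vm_compute     in N.pow 3 2000.  (* bytecode virtual machine       *)
    Time Eval native_compute in N.pow 3 2000.  (* compiled to native code        *)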

8.1.5 Primitive Integers and Arrays

Primitive 31-bit integer-arithmetic operations were added to Coq in 2007 [Spi07; Arm+10]. Although most of Coq merely used an inductive representation of 31-bit integers, the VM included code for compiling these constants to native machine integers.5 After hitting memory limits in storing the inductive representations in proofs involving proof traces from SMT solvers, work was started to allow the use of primitive datatypes that would be stored efficiently in proof terms [Dén13b].

Some of this work has since been merged into Coq, including IEEE 754-2008 binary64 floating-point numbers merged in Coq 8.11 [MBR19], 63-bit integers merged in Coq 8.10 [DG18], and persistent arrays [CF07] merged into Coq 8.13 [Dén20b]. Work enabling primitive recursion over these native datatypes is still underway [Dén20a], and the actual use of these primitive datatypes to reap the performance benefits is still to come as of the writing of this dissertation.
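As a hedged sketch of what these primitives look like to a user: the module names below are those of the Coq 8.13 era (Int63 and int63_scope were later renamed to Uint63 and uint63_scope), and implicit-argument conventions have shifted between versions, so minor adjustments may be needed.

    Require Import Int63 PArray.

    (* A persistent array of primitive 63-bit integers: length 3, default 0. *)
    Definition a  : array int := @PArray.make int 3%int63 0%int63.
    Definition a' : array int := @PArray.set  int a 1%int63 42%int63.

    (* Primitive operations reduce efficiently in the VM and native compiler. *)
    Eval vm_compute in Int63.add (@PArray.get int a' 1%int63) 1%int63.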

8.1.6 Primitive Projections for Record Types

Since version 8.5, Coq has had the ability to define record types with projections whose arguments are not stored in the term representation [Soz14]. This allows asymptotic speedups, as discussed in Subsection 7.4.2 (Nested Σ Types).

Note that this is a specific instance of a more general theory of implicit arguments [Miq01; BB08], and there has been other work on how to eliminate useless arguments from term representations [BMM03].
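As a hedged illustration of the feature (the record and its names are ours, not from any particular library):

    Set Primitive Projections.

    Record pprod (A B : Type) := ppair { pfst : A; psnd : B }.
    Arguments pfst {A B} _.
    Arguments psnd {A B} _.

    (* [pfst p] elaborates to a compact primitive-projection node; unlike an
       ordinary application such as [fst (A:=A) (B:=B) p], the parameters
       [A] and [B] are not stored in the term representation. *)
    Definition swap {A B} (p : pprod A B) : pprod B A :=
      ppair B A (psnd p) (pfst p).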

5The integer arithmetic is 31-bit rather than 32-bit because OCaml reserves the lowest bit to tag whether a value is a pointer to a heap-allocated block or an immediate integer.

8.1.7 Fast Typing of Application Nodes

In Section 2.6.3 (Quadratic Substitution in Function Application), we discussed how the typing rule for function application resulted in quadratic performance behavior when there was in fact only linear work that needed to be done. As of Coq 8.10, when typechecking applications in the kernel, substitution is delayed so as to achieve linear performance [Péd18]. Unfortunately, the pretyping and type-inference algorithm is still quadratic, due to the type-theory rules used for type inference.
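For reference, the rule in question is the standard dependent application rule (a textbook-notation sketch, not Coq syntax):

\[
  \frac{\Gamma \vdash f : \Pi x{:}A.\,B \qquad \Gamma \vdash a : A}
       {\Gamma \vdash f\,a : B[a/x]}
\]

Applied naively to an 푛-argument application, this rule performs a substitution into the remaining codomain at every application node; roughly, 푛 traversals of types whose size can itself grow with 푛, which is where the quadratic behavior comes from and what delaying the substitutions avoids.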

8.2 Performance-Enhancing Advancements in the Type Theory of Coq

While some of the above performance enhancements touch the trusted kernel of Coq, they do not fundamentally change the type theory. Some performance enhancements require significant changes to the type theory. In this section we will review a couple of particularly important changes of this kind.

8.2.1 Universe Polymorphism

Recall that the main case study of Chapter 7 was our implementation of a category-theory library. Recall also from Type Size Blowup: Packed vs. Unpacked Records how the choice of whether to use packed or unpacked records impacts performance; while unpacked records are more friendly for developing algebraic hierarchies, packed records achieve significantly better performance when large towers of dependent concepts (such as categories, functors between categories, and natural transformations between functors) are formalized.

This section addresses a particular feature which allows a library-wide 2× speed-up when using fully packed records. How is such a large performance gain achievable? Without this feature, called universe polymorphism, encoding some mathematical objects requires duplicating the entire library! Removing this duplication of code halves the compile time.

What Are Universes?

Universes are type theory’s answer to Russell’s paradox [ID16]. Russell’s paradox, a famous paradox discovered in 1901, proceeds as follows. A set is an unordered collection of distinct objects. Since each set is an object, we may consider the set of all sets. Does this set contain itself? It must, for by definition it contains all sets.

So we see by example that some sets contain themselves, while others (such as the empty set with no objects) do not. Let us consider now the set consisting of exactly the sets that do not contain themselves. Does this set contain itself? If it does not, then it fails to live up to its description as the set of all sets that do not contain themselves. However, if it does contain itself, then it also fails to live up to its description as a set consisting only of sets that do not contain themselves. Paradox!

The resolution to this paradox is to forbid sets from containing themselves. The collection of all sets is too big to be a set, so let’s call it (and collections of its size) a proper class. We can nest this construction, as type theory does: we have Type₀, the Type₁ of all small types, and we have Type₁, the Type₂ of all Type₁s, etc. These subscripts are called universe levels, and the subscripted Types are sometimes called universes.

Most constructions in Coq work just fine if we simply place them in a single, high-enough universe. In fact, the entire standard library in Coq effectively uses only three universes. Most of the standard library in fact only needs one universe. We need a second universe for the few constructions that talk about equality between types, and a third for the encoding of a variant of Russell’s paradox in Coq.

However, one universe is not sufficient for category theory, even if we don’t need to talk about equality of types or prove that Type : Type is inconsistent.

The reason is that category theory, much like set theory, talks about itself.

Complications from Categories of Categories

In standard mathematical practice, a category 풞 can be defined [Awo] to consist of the following data (a Coq transcription as a packed record is sketched after the list):

• a class Ob풞 of objects

• for all objects 푎, 푏 ∈ Ob풞, a class Hom풞(푎, 푏) of morphisms from 푎 to 푏

• for each object 푥 ∈ Ob풞, an identity morphism 1푥 ∈ Hom풞(푥, 푥)

• for each triple of objects 푎, 푏, 푐 ∈ Ob풞, a composition function ∘ ∶ Hom풞(푏, 푐) × Hom풞(푎, 푏) → Hom풞(푎, 푐)

satisfying the following axioms:

• associativity: for composable morphisms 푓, 푔, ℎ, we have 푓 ∘ (푔 ∘ ℎ) = (푓 ∘ 푔) ∘ ℎ.

• identity: for any morphism 푓 ∈ Hom풞(푎, 푏), we have 1푏 ∘ 푓 = 푓 = 푓 ∘ 1푎
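In Coq, this definition might be transcribed as a packed record roughly as follows; this is an illustrative sketch with field names of our choosing, not the definitions of the library from Chapter 7.

    Record Category := {
      Ob    : Type;
      Hom   : Ob -> Ob -> Type;
      id    : forall x, Hom x x;
      comp  : forall a b c, Hom b c -> Hom a b -> Hom a c;
      assoc : forall a b c d (f : Hom a b) (g : Hom b c) (h : Hom c d),
                comp a c d h (comp a b c g f) = comp a b d (comp b c d h g) f;
      id_left  : forall a b (f : Hom a b), comp a b b (id b) f = f;
      id_right : forall a b (f : Hom a b), comp a a b f (id a) = f
    }.

Because the Ob and Hom fields hold Types, a Category value must itself live in a universe strictly larger than the universes of its objects and morphisms, which is where the hierarchy issues discussed next come from.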

Some complications arise in applying this definition of categories to the full range of common constructs in category theory. One particularly prominent example formalizes the structure of a collection of categories, showing that this collection itself may be considered as a category.

The morphisms in such a category are functors, maps between categories consisting of a function on objects, a function on hom-types, and proofs that these functions respect composition and identity [Mac; Awo; Uni13].

The naïve concept of a “category of all categories”, which includes even itself, leads into mathematical inconsistencies which manifest as universe-inconsistency errors in Coq, much as with the set of all sets discussed above.

The standard resolution, as with sets, is to introduce a hierarchy of categories, where, for instance, most intuitive constructions are considered small categories, and then we also have large categories, one of which is the category of small categories. Both definitions wind up with literally the same text in Coq, giving:

Definition SmallCat : LargeCategory :=
  {| Ob := SmallCategory;
     Hom C D := SmallFunctor C D;
     ⋮ |}.

It seems a shame to copy-and-paste this definition (and those of Category, Functor, etc.) 푛 times to define an 푛-level hierarchy.

Universe polymorphism is a feature that allows definitions to be quantified over their universes. While Coq 8.4 supports a restricted flavor of universe polymorphism that allows the universe of a definition to vary as a function of the universes of its arguments, Coq 8.5 and later [Soz14] support an established kind of more general universe polymorphism [HP91], previously implemented only in NuPRL [Con+86]. In these versions of Coq, any definitions declared polymorphic are parametric over their universes.
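A minimal sketch of the difference, using a toy “pointed type” record rather than categories (the names are ours):

    Polymorphic Record pointed := { carrier : Type; point : carrier }.

    Definition pnat : pointed := {| carrier := nat; point := 0 |}.

    (* Because [pointed] is polymorphic, it can be instantiated at its own
       (higher) universe; a monomorphic copy would instead hit a universe
       inconsistency here, forcing a duplicated "large" version by hand. *)
    Definition ppointed : pointed := {| carrier := pointed; point := pnat |}.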

While judicious use of universe polymorphism can reduce code duplication, careless use can lead to tens of thousands of universe variables which then become a performance bottleneck in their own right.6

6See, for example, the commit message of a445bc3 in the HoTT/HoTT library on GitHub, where moving from Coq 8.5훽2 to 8.5훽3 incurred a 4× slowdown in the file hit/V.v, entirely due to performance regressions in universe handling, which were later fixed. This slowdown is likely the one tracked as Coq bug #4537. See also commit d499ef6 in the HoTT/HoTT library on GitHub, where reducing the number of polymorphic universes in some constants used by rewrite resulted in an overall 2× speedup, with speedups reaching 10× in some rewrite-heavy files.

8.2.2 Judgmental 휂 for Record Types

The same commit that introduced universe polymorphism in Coq 8.5 also introduced judgmental 휂 conversion for records with primitive projections [Soz14]. We have already discussed the advantages of primitive projections in Subsection 8.1.6, and we have talked a bit about judgmental 휂 in Section 7.5.1 (Dodging Judgmental 휂 on Records) and Section 7.5 (Internalizing Duality Arguments in Type Theory).

The 휂 conversion rule for records says that every term 푥 of record type 푇 is convertible with the constructor of 푇 applied to the projections of 푇 applied to 푥. For example, if x has type A*B, then the 휂 rule equates x with (fst x, snd x).
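A minimal check of this rule in Coq, using an illustrative primitive record of our own rather than the standard pair (for which 휂 is not judgmental):

    Set Primitive Projections.

    Record prod' (A B : Type) := pair' { fst' : A; snd' : B }.
    Arguments pair' {A B} _ _.
    Arguments fst' {A B} _.
    Arguments snd' {A B} _.

    (* η holds judgmentally: [p] and [pair' (fst' p) (snd' p)] are
       convertible, so the bare [eq_refl] is accepted by the type checker. *)
    Check fun (A B : Type) (p : prod' A B) =>
            eq_refl : p = pair' (fst' p) (snd' p).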

As discussed in Section 7.5, having records with judgmental 휂 conversion allows deduplicating code that would otherwise have to be duplicated.

8.2.3 SProp: The Definitionally Proof-Irrelevant Universe

Coq is slow at dealing with large terms. For goals around 175,000 words long7, we have found that simple tactics like apply f_equal take around 1 second to execute, which makes interactive theorem proving very frustrating.8 Even more frustrating is the fact that the largest contribution to this size is often arguments to irrelevant functions, i.e., functions that are provably equal to all other functions of the same type. (These are proofs related to algebraic laws like associativity, carried inside many constructions.)

Opacification helps by preventing the type checker from unfolding some definitions, but it is not enough: the type checker still has to deal with all of the large arguments to the opaque function. Hash-consing might fix the problem completely.

Alternatively, it would be nice if, given a proof that all of the inhabitants of a type were equal, we could forget about terms of that type, so that their sizes would not impose any penalties on term manipulation. One solution might be irrelevant fields, like those of Agda, or those implemented via the Implicit CiC [BB08; Miq01]. While there is as yet no way to erase these arguments, Coq versions 8.10 and later have the ability to define types as judgmentally irrelevant, paving the way for more aggressive erasure [Gil18; Gil+19].
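A minimal sketch of the feature: in Coq 8.10–8.11 SProp may additionally need to be enabled with Set Allow StrictProp, and it is available by default from 8.12 onward.

    (* An SProp with a single constructor: all of its proofs are
       definitionally equal. *)
    Inductive sUnit : SProp := stt.

    (* Because any two proofs p, q of an SProp are convertible, transporting
       a value along them is accepted by the kernel with no rewriting at all. *)
    Definition irrelevant_transport
      (P : SProp) (Q : P -> Type) (p q : P) (x : Q p) : Q q := x.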

Coq actually had an implementation of full universe polymorphism between versions 8.3 and 8.4, implemented in commit d98dfbc and reverted mere minutes later in commit 60bc3cb. In-person discussion, either with Matthieu himself or with Bob Harper, revealed that Matthieu abandoned this initial attempt after finding that universe polymorphism was too slow, and it was only by implementing the algorithm of Harper and Pollack [HP91] that universe polymorphism with typical ambiguity [Shu12; Spe66; HP91], where users need not write universe variables explicitly, was able to be implemented in a way that was sufficiently performant.

7When we had objects as arguments rather than fields in our category-theory library (see Section 2.6.4), we encountered goals of about 219,633 words when constructing pointwise Kan extensions.

8See also Coq bug #3280.

8.3 Performance-Enhancing Advancements in Type Theory at Large

We come now to discoveries and inventions of the past decade or so which have not yet made it into Coq but which show great promise for significant performance improvements.

8.3.1 Higher Inductive Types: Setoids for Free

Recall again that the main case study of Chapter 7 was our implementation of a category-theory library.

Equality

Equality, which has recently become a very hot topic in type theory [Uni13] and higher category theory [Lei], provides another example of a design decision where most usage is independent of the exact implementation details. Although the question of what it means for objects or morphisms to be equal does not come up much in classical 1-category theory, it is more important when formalizing category theory in a proof assistant, for reasons seemingly unrelated to its importance in higher category theory. We consider some possible notions of equality.

Setoids

A setoid [Bis67] is a carrier type equipped with an equivalence relation; a map of setoids is a function between the carrier types and a proof that the function respects the equivalence relations of its domain and codomain. Many authors [Pee+; KSW; Meg; HS00; Ahrb; Ahr10; Ish; Pot; Soza; CM98; Wil12] choose to use a setoid of morphisms, which allows for the definition of the category of set(oid)s, as well as the category of (small) categories, without assuming functional extensionality, and allows for the definition of categories where the objects are quotient types. However, there is significant overhead associated with using setoids everywhere, which can lead to slower compile times. Every type that we talk about needs to come with a relation and a proof that this relation is an equivalence relation. Every function that we use needs to come with a proof that it sends equivalent elements to equivalent elements. Even worse, if we need an equivalence relation on the universe of “types with equivalence relations”, we need to provide a transport function between equivalent types that respects the equivalence relations of those types.
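The bookkeeping cost is visible even in a minimal sketch of the pattern (illustrative definitions of ours, not those of the cited libraries):

    Require Import Coq.Classes.RelationClasses.

    (* Every carrier comes bundled with an equivalence relation... *)
    Record Setoid := {
      scarrier  : Type;
      sequiv    : scarrier -> scarrier -> Prop;
      sequiv_ok : Equivalence sequiv
    }.

    (* ...and every map must carry a proof that it respects the relations. *)
    Record SetoidMap (X Y : Setoid) := {
      smap : scarrier X -> scarrier Y;
      smap_proper : forall x x', sequiv X x x' -> sequiv Y (smap x) (smap x')
    }.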

Propositional Equality

An alternative to setoids is propositional equality, which carries none of the overhead of setoids but does not allow an easy formulation of quotient types and requires assuming functional extensionality to construct the category of sets.

Intensional type theories like Coq’s have a built-in notion of equality, often called definitional equality or judgmental equality, and denoted 푥 ≡ 푦. This notion of equality is built into the type theory as a judgment rather than as a proposition, and therefore cannot be explicitly reasoned about inside of that type theory; it is the equality that holds between 훽훿휄휁휂-convertible terms.

Coq’s standard library defines what is called propositional equality on top of judgmental equality, denoted 푥 = 푦. One is allowed to conclude that propositional equality holds between any judgmentally equal terms.
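Concretely, the constructor eq_refl is accepted at any equation whose two sides are judgmentally equal:

    (* Both sides reduce (are convertible) to 4, so the proof is just eq_refl. *)
    Check (eq_refl : 2 + 2 = 4).

    (* By contrast, [n + 0 = n] for a variable [n] is not a judgmental
       equality (addition recurses on its first argument), so eq_refl alone
       is rejected and a propositional proof, e.g. by induction, is needed. *)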

Using propositional equality rather than setoids is convenient because there is already significant machinery made for reasoning about propositional equalities, and there is much less overhead. However, we ran into significant trouble when attempting to prove that the category of sets has all colimits, which amounts to proving that it is closed under disjoint unions and quotienting; quotient types cannot be encoded without assuming a number of other axioms.

Higher Inductive Types

The recent emergence of higher inductive types allows the best of both worlds. The idea of higher inductive types [Uni13] is to allow inductive types to be equipped with extra proofs of equality between constructors. They originated as a way to allow homotopy type theorists to construct types with nontrivial higher paths. A very simple example is the interval type, from which functional extensionality can be proven [Shu]. The interval type consists of two inhabitants zero : Interval and one : Interval, and a proof seg : zero = one. In a hypothetical type theory with higher inductive types, the type checker does the work of carrying around an equivalence relation on each type for us and forbids users from constructing functions that do not respect the equivalence relation of any input type. For example, we can, hypothetically, prove functional extensionality as follows:

Definition f_equal {A B x y} (f : A → B) : x = y → f x = f y.

Definition functional_extensionality {A B} (f g : A → B)
  : (∀ x, f x = g x) → f = g
  := 휆 (H : ∀ x, f x = g x) ⇒
       f_equal (휆 (i : Interval) (x : A) ⇒
                  match i with
                  | zero ⇒ f x
                  | one ⇒ g x
                  | seg ⇒ H x
                  end) seg.

Had we neglected to include the branch for seg, the type checker should complain about an incomplete match; the function 휆 i : Interval ⇒ match i with zero ⇒ true | one ⇒ false end of type Interval → bool should not typecheck for this reason.

The key insight is that most types do not need special equivalence relations, and, moreover, if we are not explicitly dealing with a type with a special equivalence relation, then it is impossible (by parametricity) to fail to respect the equivalence relation. Said another way, the only way to construct a function that might fail to respect the equivalence relation would be by some eliminator like pattern matching, so all we have to do is guarantee that direct invocations of the eliminator result in functions that respect the equivalence relation.

As with the choice involved in defining categories, using propositional equality with higher inductive types rather than setoids derives many of its benefits from not having to deal with all of the overhead of custom equivalence relations in constructions that do not need them. In this case, we avoid the overhead by making the type checker or the metatheory deal with the parts we usually do not care about. Most of our definitions do not need custom equivalence relations, so the overhead of using setoids would be very large for very little gain.

8.3.2 Univalence and Isomorphism Transport

When considering higher inductive types, the question “when are two types equivalent?” arises naturally. The standard answer in the past has been “when they are syntactically equal”. The result of this is that two inductive types that are defined in the same way, but with different names, will not be equal. Voevodsky’s univalence principle gives a different answer: two types are equal when they are isomorphic. This principle, encoded formally as the univalence axiom, allows reasoning about isomorphic types as easily as if they were equal.

Tabareau et al. built a framework on top of the insights of univalence, combined with parametricity [Rey83; Wad89], for automatically porting definitions and theorems to equivalent types [TTS18; TTS19].

What is the application to performance? As we saw, for example, in Section 5.2 (NbE vs. Pattern-Matching Compilation: Mismatched Expression APIs and Leaky Abstraction Barriers), the choice of representation of a datatype can have drastic consequences on how easy it is to encode algorithms and write correctness proofs. These design choices can also be intricately entwined with both the compile-time and run-time performance characteristics of the code.

One central message of both Chapter 5 and Chapter 7 is that picking the right API really matters when writing code with dependent types. The promise of univalence, still in its infancy, is that we could pick the right API for each algorithmic chunk, prove the APIs isomorphic, and use some version of univalence to compose the APIs and reason about the algorithms as easily as if we had used the same interface everywhere.

8.3.3 Cubical Type Theory

One important detail we elided in the previous subsections is the question of computation. Higher inductive types and univalence are much less useful if they are opaque to the type checker. The proof of function extensionality, for example, relies on the elimination rule for the interval having a judgmental computation rule.9

Higher inductive types whose eliminators compute on the point constructors can be hacked into dependently typed proof assistants by adding inconsistent axioms and then hiding them behind opaque APIs so that inconsistency cannot be proven [Lic11; Ber13]. This is unsatisfactory, however, on two counts:

1. The eliminators do not compute on path constructors. For example, the interval eliminator would compute on zero and one but not on seg.

2. Adding these axioms compromises the trust story.

Cubical type theory is the solution to both of these problems, for both higher inductive types and univalence [Coh+18]. Unlike most other type theories, computation in cubical type theory is implemented by appealing to the category-theoretic model, and the insights that allow such computation are slowly making their way into more mainstream dependently typed proof assistants [VMA19].

9We leave it as a fun exercise for the advanced reader to figure out why the Church encoding of the interval, where Interval := ∀ P (zero : P) (one : P) (seg : zero = one), P, does not yield a proof of functional extensionality.

Chapter 9

Concluding Remarks

We spent Part I mapping out the landscape of the problems of performance we encountered in dependently typed proof assistants. In Part II and Part III, we laid out more-or-less systematic principles and tools for avoiding these performance bottlenecks. In the last chapter, Chapter 8, we looked back on the concrete performance improvements in Coq over time.

We look now to the future.

The clever reader might have noticed something that we swept under the rug in Parts II and III. In Section 1.3 we laid out two basic design choices—dependent types and the de Bruijn criterion—which are responsible for much of the power and much of the trust we can have in a proof assistant like Coq. We then spent the next chapters of this dissertation investigating the performance bottlenecks that can perhaps be said to result from these choices and how to ameliorate these performance issues.

If the strategies we laid out in Parts II and III for how to use dependent types and untrusted tactics in a performant way are to be summed up in one word, that word is: “don’t!” To avoid the performance issues resulting from tactics being untrusted, the source of much of the trust in proof assistants like Coq, we suggest in Part II that users effectively throw away the entire tactic engine and instead code tactics reflectively. To avoid the performance issues incurred by unpredictable computation at the type level, the source of much of the power of dependent type theory, we broadly suggest in Part III to avoid using the computation at all (except in the rare cases where the entire proof can be moved into computation at the type level, such as proof by duality (Section 7.5) and proof by reflection (Chapter 3)).

This is a sorry state of affairs: we are effectively advising users to avoid using most of the power and infrastructure of the proof assistant.

We admit that we are not sure what an effective resolution to the performance issue of computation at the type level would look like. While Chapter 7 lays out in Section 7.2 (When and How To Use Dependent Types Painlessly) principles for how and when to use dependent types that allow us to recover much of the power of dependent types without running into issues of slow conversion, even at scale, this is nowhere near a complete roadmap for actually using partial computation at the type level.

On the question of using tactics, however, we do know what a resolution would look like, and hence we conclude this dissertation with such a call for future research.

As far as we can tell, no one has yet laid out a theory of the necessary basic building blocks of a usable tactic engine for proofs. Such a theory should include:

• a list of basic operations,
• with necessary asymptotic performance,
• justification that these building blocks are sufficient for constructing all the proof automation users might want to construct, and
• justification that the asymptotic performance does not incur needless overhead above and beyond the underlying algorithm of proof construction.

What is needless overhead, though? How can we say what the performance of the “underlying algorithm” is?

A first stab might be thus: we want a proof engine which, for any heuristic algorithm 퐴 that can sometimes determine the truth of a theorem statement (and will otherwise answer “I don’t know”) in time 풪(푓(푛)), where 푛 is some parameter controlling the size of the problem, we can construct a proof script which generates proofs of these theorem statements in time not worse than 풪(푓(푛)), or perhaps in time that is not much worse than 풪(푓(푛)).

This criterion, however, is both useless and impossible to meet.

Useless: In a dependently typed proof assistant, if we can prove that 퐴 is sound, i.e., that when it says “yes” the theorem is in fact true, then we can simply use reflection to create a proof by appeal to computation. This is not useful when what we are trying to do is describe how to identify a proof engine which gives adequate building blocks aside from appeal to computation.

Impossible to meet: Moreover, even if we could modify this criterion into a useful one, perhaps by requiring that it be possible to construct such a proof script without any appeal to computation, meeting the criterion would still be impossible.

Taking inspiration from Garrabrant et al. [Gar+16, pp. 24–25], we ask the reader to consider a program prg(푥) which searches for proofs of absurdity (i.e., False) in Coq which have length less than 2^푥 characters and which can be checked by Coq’s kernel in less than 2^푥 CPU cycles. If such a proof of absurdity is found, the program outputs true. If no such proof is found under the given computational limits, the program outputs false. Assuming that Coq is, in fact, consistent, then we can recognize true theorems of the form prg(푥) = false for all 푥 in time 풪(log 푥). (The running time is logarithmic, rather than linear or constant, because representing the number 푥 in any place-value system, such as decimal or binary, requires 풪(log 푥) space.) At the same time, by Gödel’s incompleteness theorem, there is no hope of proving ∀푥, prg(푥) = false, and hence we cannot prove this simple 풪(log 푥)-time theorem recognizer correct. We almost certainly will be stuck running the program, which will take time Ω(2^푥), which is certainly not an acceptable overhead over 풪(log 푥).

We do not believe that all hope is lost, though! Gödelian incompleteness did not prove to be a fatal obstacle to verification and automation of proofs, as we saw in Section 1.1, and we hope that it proves to be surmountable here as well.

We can take a second stab at specifying what it might mean to avoid needless overhead: Suppose we are given some algorithm 퐴 which can sometimes determine the truth of a theorem statement (and will otherwise answer “I don’t know”) in time 풪(푓(푛)), and suppose we are given a proof that 퐴 is sound, i.e., a proof that whenever 퐴 claims a theorem statement is true, that statement is in fact true. Then we would like a proof engine which permits the construction of proofs, without any appeal to computation, of theorems that 퐴 claims are true in time 풪(푓(푛)), or perhaps time that is not much worse than 풪(푓(푛)). Said another way, we want a proof engine for which reflective proof scripts can be turned into nonreflective proof scripts without incurring overhead, or at least without incurring too much overhead.

Is such a proof engine possible? Is such a proof engine sufficient? Is this criterion necessary? Or is there perhaps a better criterion? We leave all of these questions for future work in this field, noting that there may be some inspiration to be drawn from the extant research on the overhead of using a functional language over an imperative one [Cam10; BG92; Ben96; BJD97; Oka96; Oka98; Pip97]. This body of work shows that we can always turn an imperative program into a strict functional program with at most 풪(log 푛) overhead, and often we get no overhead at all.1

We hope the reader leaves this dissertation with an improved understanding of the performance landscape of engineering of proof-based software systems and perhaps goes on to contribute new insight to this nascent field themselves.

1Note that if we are targeting a lazy functional language rather than a strict one, it may in fact always be possible to achieve a transformation without any overhead [Cam10].

Bibliography

[Acz93] Peter Aczel. “Galois: a theory development project”. In: (1993). url: http://www.cs.man.ac.uk/~petera/galois.ps.gz. [Age95] Sten Agerholm. “Experiments in Formalizing Basic Category Theory in Higher Order Logic and Set Theory”. In: Draft manuscript (Dec. 1995). url: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10. 1.1.22.8437&rep=rep1&type=pdf. [AGN95] Andrea Asperti, Cecilia Giovannetti, and Andrea Naletto. The Bologna Optimal Higher-order Machine. Technical report. University of Bologna, Mar. 1995. doi: 10.1017/s0956796800001994. url: https://pdfs. semanticscholar.org/3517/03af066fd2e65ad64c63108672d960b9d8fb. pdf. [AHN08] Klaus Aehlig, Florian Haftmann, and Tobias Nipkow. “A Compiled Im- plementation of Normalization by Evaluation”. In: Proc. TPHOLs. 2008. doi: 10.1007/978-3-540-71067-7_8. [Ahra] Benedikt Ahrens. benediktahrens/Foundations typesystems. url: https: //github.com/benediktahrens/Foundations/tree/typesystems. [Ahrb] Benedikt Ahrens. Coinductives. url: https://github.com/benediktahrens/ coinductives. [Ahr10] Benedikt Ahrens. Categorical semantics of programming languages (in COQ). 2010. url: http://math.unice.fr/~ahrens/edsfa/ahrens_ edsfa.pdf. [AKS] Benedikt Ahrens, Chris Kapulkin, and Michael Shulman. benediktahren- s/rezk completion. url: https://github.com/benediktahrens/rezk_ completion. [AKS13] Benedikt Ahrens, Chris Kapulkin, and Michael Shulman. “Univalent categories and the Rezk completion”. In: ArXiv e-prints (Mar. 2013). doi: 10.1017/s0960129514000486. arXiv: 1303.0584 [math.CT]. [AL94] Andrea Asperti and Cosimo Laneve. “Interaction Systems I: The theory of optimal reductions”. In: Mathematical Structures in Computer Science 4.4 (1994), pages 457–504. doi: 10.1017/ s0960129500000566. url: https://hal.inria.fr/docs/00/07/69/88/PDF/RR-1748.pdf.

179 [AM06] Thorsten Altenkirch and Conor McBride. “Towards observational type theory”. In: Manuscript, available online (2006). url: http : / / www . strictlypositive.org/ott.pdf. [AMS07] Thorsten Altenkirch, Conor McBride, and Wouter Swierstra. “Observa- tional equality, now!” In: Proceedings of the 2007 workshop on Program- ming languages meets program verification. ACM. 2007, pages 57–68. doi: 10.1145/1292597.1292608. url: http://www.strictlypositive. org/obseqnow.pdf. [Ana+17] Abhishek Anand, Andrew W. Appel, Greg Morrisett, Matthew Weaver, Matthieu Sozeau, Olivier Savary Belanger, Randy Pollack, and Zoe Paraskevopoulou. “CertiCoq: A verified compiler for Coq”. In: CoqPL. 2017. url: http: //www.cs.princeton.edu/~appel/papers/certicoq-coqpl.pdf. [Ana+18] Abhishek Anand, Simon Boulier, Cyril Cohen, Matthieu Sozeau, and Nicolas Tabareau. “Towards Certified Meta-Programming with Typed Template-Coq”. In: Proc. ITP. 2018. doi: 10.1007/978-3-319-94821- 8_2. [AP90] James A. Altucher and Prakash Panangaden. “A mechanically assisted constructive proof in category theory”. In: 10th International Conference on Automated Deduction. Springer. 1990, pages 500–513. doi: 10.1007/ 3-540-52885-7_110. [AR14] Abhishek Anand and Vincent Rahli. “Towards a Formally Verified Proof Assistant”. In: Interactive Theorem Proving. Edited by Gerwin Klein and Ruben Gamboa. Cham: Springer International Publishing, 2014, pages 27–44. isbn: 978-3-319-08970-6. url: http://www.nuprl.org/ html/Nuprl2Coq/verification.pdf. [AR17] Nada Amin and Tiark Rompf. “LMS-Verify: Abstraction without Regret for Verified Systems Programming”. In: Proc. POPL. 2017. doi: 10. 1145/3093333.3009867. [Arm+10] Michaël Armand, Benjamin Grégoire, Arnaud Spiwack, and Laurent Théry. “Extending Coq with Imperative Features and Its Application to SAT Verification”. In: Interactive Theorem Proving. Edited by Matt Kaufmann and Lawrence C. Paulson. Berlin, Heidelberg: Springer, 2010, pages 83–98. isbn: 978-3-642-14052-5. doi: 10 . 1007 / 978 - 3 - 642 - 14052 - 5 _ 8. url: https : / / hal . inria . fr / inria - 00502496v2 / document. [Asp+07] A. Asperti, C. Sacerdoti Coen, E. Tassi, and S. Zacchiroli. “User In- teraction with the Matita Proof Assistant”. In: Journal of Automated Reasoning 39.2 (2007), pages 109–139.

180 [Asp+11] Andrea Asperti, Wilmer Ricciotti, Claudio Sacerdoti Coen, and Enrico Tassi. “The Matita Interactive Theorem Prover”. In: Automated De- duction – CADE-23. Edited by Nikolaj Bjørner and Viorica Sofronie- Stokkermans. Berlin, Heidelberg: Springer, 2011, pages 64–69. isbn: 978- 3-642-22438-6. doi: 10.1007/978-3-642-22438-6_7. [Asp95] Andrea Asperti. “훿휊!휖=1 Optimizing optimal 휆-calculus implementa- tions”. In: International Conference on Rewriting Techniques and Appli- cations. Edited by Jieh Hsiang. Springer. Berlin, Heidelberg: Springer, 1995, pages 102–116. isbn: 978-3-540-49223-8. [Awo] Steve Awodey. Category theory. Second Edition. Oxford University Press. doi: 10.1093/acprof:oso/9780198568612.001.0001. [Ayd+05] Brian E. Aydemir, Aaron Bohannon, Matthew Fairbairn, J. Nathan Foster, Benjamin C. Pierce, Peter Sewell, Dimitrios Vytiniotis, Geof- frey Washburn, Stephanie Weirich, and Steve Zdancewic. “Mechanized Metatheory for the Masses: The POPLMark Challenge”. In: Theorem Proving in Higher Order Logics. Edited by Joe Hurd and Tom Melham. Berlin, Heidelberg: Springer, 2005, pages 50–65. isbn: 978-3-540-31820- 0. doi: 10.1007/11541868_4. [Ayd+08] Brian Aydemir, Arthur Charguéraud, Benjamin C. Pierce, Randy Pol- lack, and Stephanie Weirich. “Engineering Formal Metatheory”. In: Pro- ceedings of the 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. POPL ’08. San Francisco, Cali- fornia, USA: Association for Computing Machinery, 2008, pages 3–15. isbn: 9781595936899. doi: 10.1145/1328438.1328443. url: https: //www.cis.upenn.edu/~sweirich/papers/popl08-binders.pdf. [Bar00] Bruno Barras. “Programming and Computing in HOL”. In: Theorem Proving in Higher Order Logics. Edited by Mark Aagaard and John Harrison. Berlin, Heidelberg: Springer, 2000, pages 17–37. isbn: 978-3- 540-44659-0. doi: 10.1007/3-540-44659-1_2. [Bau13] Andrej Bauer. Is rigour just a ritual that most mathematicians wish to get rid of if they could? Version 2013-05-08. May 8, 2013. eprint: https: //mathoverflow.net/q/130125. url: https://mathoverflow.net/ q/130125. [BB08] Bruno Barras and Bruno Bernardo. “The implicit calculus of construc- tions as a programming language with dependent types”. In: FoSSaCS. 2008. doi: 10.1007/978-3-540-78499-9_26. url: http://hal. archives - ouvertes . fr / docs / 00 / 43 / 26 / 58 / PDF / icc _ barras _ bernardo-tpr07.pdf. [BDG11] Mathieu Boespflug, Maxime Dénès, and Benjamin Grégoire. “Full Re- duction at Full Throttle”. In: Proc. CPP. 2011. doi: 10.1007/978-3- 642-25379-9_26.

181 [Ben+12] Nick Benton, Chung-Kil Hur, Andrew J. Kennedy, and Conor McBride. “Strongly Typed Term Representations in Coq”. In: Journal of Au- tomated Reasoning 49.2 (2012), pages 141–159. issn: 1573-0670. doi: 10 . 1007 / s10817 - 011 - 9219 - 0. url: https : / / sf . snu . ac . kr / publications/typedsyntaxfull.pdf. [Ben89] Dan Benanav. “Recognizing Unnecessary Inference”. In: Proceedings of the 11th International Joint Conference on - Vol- ume 1. IJCAI’89. Detroit, Michigan: Morgan Kaufmann Publishers Inc., 1989, pages 366–371. [Ben96] Amir M. Ben-Amram. “Notes on Pippenger’s Comparison of Pure and Impure LISP”. DIKU, University of Copenhagen, Denmark, 1996. url: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.55. 3024. [Ber13] Yves Bertot. Private Inductive Types: Proposing a language extension. Apr. 2013. url: http://coq.inria.fr/files/coq5_submission_3. pdf. [BG01] Henk P. Barendregt and Herman Geuvers. “Proof-Assistants using De- pendent Type Systems”. In: Handbook of Automated Reasoning. Edited by Alan Robinson and Andrei Voronkov. NLD: Elsevier Science Pub- lishers B. V., 2001, pages 1149–1238. isbn: 0444508120. doi: 10.1016/ b978-044450813-3/50020-5. [BG05] Bruno Barras and Benjamin Grégoire. “On the Role of Type Decorations in the Calculus of Inductive Constructions”. In: Computer Science Logic. Edited by Luke Ong. Berlin, Heidelberg: Springer, 2005, pages 151–166. isbn: 978-3-540-31897-2. doi: 10.1007/11538363_12. [BG92] Amir M. Ben-Amram and Zvi Galil. “On Pointers versus Addresses”. In: Journal of the ACM 39.3 (July 1992), pages 617–648. doi: 10 . 1145/146637.146666. url: https://www2.mta.ac.il/~amirben/ downloadable/jacm.ps.gz. [BH18] Sara Baase and Timothy Henry. A Gift of Fire: Social, Legal, and Ethical Issues for Computing Technology. 5th edition. Pearson, 2018. isbn: 9780134615271. url: https://books.google.com/books?id= izaqAQAACAAJ. [Bis67] Errett Bishop. Foundations of Constructive Analysis. McGraw-Hill series in higher mathematics. McGraw-Hill, 1967. doi: 10 . 2307 / 2314383. url: http://books.google.com/books?id=o2mmAAAAIAAJ. [BJD97] Richard Bird, Geraint Jones, and Oege De Moor. “More Haste, Less Speed: Lazy versus Eager Evaluation”. In: Journal of Functional Pro- gramming 7.5 (Sept. 1997), pages 541–547. issn: 0956-7968. doi: 10. 1017/S0956796897002827.

182 [BMM03] Edwin Brady, Conor McBride, and James McKinna. “Inductive Families Need Not Store Their Indices”. In: International Workshop on Types for Proofs and Programs. Springer. 2003, pages 115–129. doi: 10.1007/978- 3-540-24849-1_8. url: https://eb.host.cs.st-andrews.ac.uk/ writings/types2003.pdf. [Bon+17] Barry Bond, Chris Hawblitzel, Manos Kapritsos, Rustan Leino, Jay Lorch, Bryan Parno, Ashay Rane, Srinath Setty, and Laure Thompson. “Vale: Verifying High-Performance Cryptographic Assembly Code”. In: Proc. USENIX Security. 2017. url: http://www.cs.cornell.edu/ ~laurejt/papers/vale-2017.pdf. [Bou94] Richard Boulton. Efficiency in a Fully-Expansive Theorem Prover. Com- puter Laboratory Cambridge: Technical report. University of Cambridge, Computer Laboratory, Nov. 1994. url: https://books.google.com/ books?id=7DAkAQAAIAAJ. [Bou97] Samuel Boutin. “Using reflection to build efficient and certified decision procedures”. In: Theoretical Aspects of Computer Software. Edited by Martín Abadi and Takayasu Ito. Berlin, Heidelberg: Springer Berlin Hei- delberg, 1997, pages 515–529. isbn: 978-3-540-69530-1. doi: 10.1007/ bfb0014565. [Boy07] Robert S. Boyer. Nqthm, the Boyer-Moore prover. May 13, 2007. url: https://www.cs.utexas.edu/users/boyer/ftp/nqthm/. [BP99] Richard S. Bird and Ross Paterson. “de Bruijn notation as a nested datatype”. In: Journal of Functional Programming 9.1 (1999), pages 77– 91. doi: 10.1017/S0956796899003366. url: http://www.cs.ox.ac. uk/people/richard.bird/online/BirdPaterson99DeBruijn.pdf. [BR70] Software Engineering Techniques. Report of a conference sponsored by the NATO Science Committee, Garmisch, Germany, 7th–11th October 1968. Rome, Italy, Apr. 1970. url: http://homepages.cs.ncl.ac.uk/ brian.randell/NATO/nato1969.PDF. [Bra13] Edwin Brady. “Idris, a General Purpose Dependently Typed Program- ming Language: Design and Implementation”. In: Journal of Functional Programming 23.5 (2013), pages 552–593. doi: 10.1017/S095679681300018X. url: https://eb.host.cs.st-andrews.ac.uk/drafts/impldtp.pdf. [Bra20] Edwin Brady. Why is Idris 2 so much faster than Idris 1? May 20, 2020. url: https://www.type-driven.org.uk/edwinb/why-is-idris-2- so-much-faster-than-idris-1.html. [Bru70] N. G. de Bruijn. “The mathematical language AUTOMATH, its usage, and some of its extensions”. In: Symposium on Automatic Demonstra- tion. Edited by M. Laudet, D. Lacombe, L. Nolin, and M. Schützen- berger. Berlin, Heidelberg: Springer, 1970, pages 29–61. isbn: 978-3- 540-36262-3. doi: 10.1007/bfb0060623.

183 [Bru72] Nicolaas Govert de Bruijn. “Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem”. In: Indagationes Mathematicae (Proceed- ings). Volume 75. 5. Elsevier. 1972, pages 381–392. doi: 10.1016/1385- 7258(72)90034-0. url: http://www.sciencedirect.com/science/ article/pii/1385725872900340. [Bru94] N. G. de Bruijn. “A Survey of the Project Automath”. In: Selected Papers on Automath. Edited by R.P. Nederpelt, J.H. Geuvers, and R.C. de Vri- jer. Volume 133. Studies in Logic and the Foundations of Mathematics. Elsevier, 1994, pages 141–161. doi: 10.1016/S0049-237X(08)70203- 9. url: http://www.sciencedirect.com/science/article/pii/ S0049237X08702039. [BS91] U. Berger and H. Schwichtenberg. “An inverse of the evaluation func- tional for typed 휆–calculus”. In: [1991] Proceedings Sixth Annual IEEE Symposium on Logic in Computer Science. July 1991, pages 203–211. doi: 10.1109/LICS.1991.151645. url: http://www.mathematik. uni-muenchen.de/~schwicht/papers/lics91/paper.pdf. [BW05] Henk Barendregt and Freek Wiedijk. “The challenge of computer math- ematics”. In: Philosophical transactions. Series A, Mathematical, physi- cal, and engineering sciences 363.1835 (Oct. 12, 2005), pages 2351–2375. issn: 1364-503X. doi: 10.1098/rsta.2005.1650. [Cam10] Brian Campbell. Efficiency of purely functional programming. May 23, 2010. eprint: https://stackoverflow.com/a/1990580. url: https: //stackoverflow.com/a/1990580. [Cap] Paolo Capriotti. pcapriotti/agda-categories. url: https://github.com/ pcapriotti/agda-categories/. [CF07] Sylvain Conchon and Jean-Christophe Filliâtre. “A Persistent Union- Find Data Structure”. In: Proceedings of the 2007 Workshop on Work- shop on ML. ML ’07. Freiburg, Germany: Association for Computing Machinery, 2007, pages 37–46. isbn: 9781595936769. doi: 10 . 1145 / 1292535.1292541. url: https://www.lri.fr/~filliatr/puf/. [CH88] Thierry Coquand and Gérard Huet. “The Calculus of Constructions”. In: Information and Computation 76.2 (1988), pages 95–120. issn: 0890- 5401. doi: 10.1016/0890-5401(88)90005-3. url: https://hal. inria.fr/inria-00076024/document. [Cha] James Chapman. jmchapman/restriction-categories. url: https://github. com/jmchapman/restriction-categories. [Cha12] Arthur Charguéraud. “The Locally Nameless Representation”. English. In: Journal of Automated Reasoning 49.3 (Oct. 2012), pages 363–408. doi: 10.1007/s10817-011-9225-2. url: https://www.chargueraud. org/research/2009/ln/main.pdf.

184 [Chl08] Adam Chlipala. “Parametric Higher-Order Abstract Syntax for Mecha- nized Semantics”. In: ICFP’08: Proceedings of the 13th ACM SIGPLAN International Conference on Functional Programming. Victoria, British Columbia, Canada, Sept. 2008. doi: 10.1145/1411204.1411226. url: http://adam.chlipala.net/papers/PhoasICFP08/. [Chl10] Adam Chlipala. “A Verified Compiler for an Impure Functional Lan- guage”. In: POPL’10: Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. Madrid, Spain, Jan. 2010. url: http://adam.chlipala.net/papers/ImpurePOPL10/. [Chl13] Adam Chlipala. Certified Programming with Dependent Types: A Prag- matic Introduction to the Coq Proof Assistant. MIT Press, Dec. 2013. isbn: 9780262026659. doi: 10.7551/mitpress/9153.001.0001. url: http://adam.chlipala.net/cpdt/. [Chl15] Adam Chlipala. “From Network Interface to Multithreaded Web Appli- cations: A Case Study in Modular Program Verification”. In: Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Princi- ples of Programming Languages. POPL ’15. Mumbai, India: Association for Computing Machinery, 2015, pages 609–622. isbn: 9781450333009. doi: 10.1145/2676726.2677003. url: http://adam.chlipala.net/ papers/BedrockPOPL15/. [Chu36] Alonzo Church. “An Unsolvable Problem of Elementary Number The- ory”. In: American Journal of Mathematics 58.2 (1936), pages 345–363. issn: 00029327, 10806377. doi: 10.2307/2371045. url: http://www. jstor.org/stable/2371045. [CM98] Alexandra Carvalho and Paulo Mateus. Category Theory in Coq. Tech- nical report. 1049-001 Lisboa, Portugal, 1998. url: http://sqig.math. ist.utl.pt/pub/CarvalhoA/98-C-DiplomaThesis/maintext.ps. [Coh+18] Cyril Cohen, Thierry Coquand, Simon Huber, and Anders Mörtberg. “Cubical Type Theory: A Constructive Interpretation of the Univalence Axiom”. In: 21st International Conference on Types for Proofs and Pro- grams (TYPES 2015). Edited by Tarmo Uustalu. Volume 69. Leibniz International Proceedings in Informatics (LIPIcs). Dagstuhl, Germany: Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2018, 5:1–5:34. isbn: 978-3-95977-030-9. doi: 10.4230/LIPIcs.TYPES.2015.5. arXiv: 1611. 02108v1 [cs.LO]. [Con+86] R. L. Constable, S. F. Allen, H. M. Bromley, W. R. Cleaveland, J. F. Cremer, R. W. Harper, D. J. Howe, T. B. Knoblock, N. P. Mendler, P. Panangaden, J. T. Sasaki, and S. F. Smith. Implementing Mathematics with the Nuprl Proof Development System. Prentice Hall, Dec. 1986. isbn: 9780134518329. url: http://www.nuprl.org/book/. [Coqa] Coq Developers. Constrextern (coq.Constrextern). url: https://coq. github.io/doc/V8.12.0/api/coq/Constrextern/index.html.

185 [Coqb] Coq Developers. Constrintern (coq.Constrintern). url: https://coq. github.io/doc/V8.12.0/api/coq/Constrintern/index.html. [Coqc] Coq Developers. Detyping (coq.Detyping). url: https://coq.github. io/doc/V8.12.0/api/coq/Detyping/index.html. [Coqd] Coq Developers. Ppconstr (coq.Ppconstr). url: https://coq.github. io/doc/V8.12.0/api/coq/Ppconstr/index.html. [Coqe] Coq Developers. Pretyping (coq.Pretyping). url: https://coq.github. io/doc/V8.12.0/api/coq/Pretyping/index.html. [Coq17a] Coq Development Team. “The Coq Proof Assistant Reference Manual”. In: 8.7.1. INRIA, 2017. Chapter 2.1.1 Extensions of Gallina, Record Types (Primitive Projections). url: https://coq.inria.fr/distrib/ V8.7.1/refman/gallina-ext.html#sec65. [Coq17b] Coq Development Team. “The Coq Proof Assistant Reference Man- ual”. In: 8.7.1. INRIA, 2017. Chapter 10.3 Detailed examples of tactics (quote). url: https://coq.inria.fr/distrib/V8.7.1/refman/ tactic-examples.html#quote-examples. [Coq20] The Coq Development Team. The Coq Proof Assistant. Version 8.12.0. INRIA. July 2020. doi: 10.5281/zenodo.1003420. url: https://coq. inria.fr/. [CP88] Thierry Coquand and Christine Paulin. “Inductively Defined Types”. In: International Conference on Computer Logic. Springer. 1988, pages 50– 66. doi: 10.1007/3-540-52335-9_47. [CPG17] Ahmet Celik, Karl Palmskog, and Milos Gligoric. “iCoǫ: Regression Proof Selection for Large-Scale Verification Projects”. In: 2017 32nd IEEE/ACM International Conference on Automated Software Engineer- ing (ASE). IEEE. 2017, pages 171–182. doi: 10 . 1109 / ase . 2017 . 8115630. url: https://users.ece.utexas.edu/~gligoric/papers/ CelikETAL17iCoq.pdf. [CS13] Alberto Ciaffaglione and Ivan Scagnetto. “A weak HOAS approach to the POPLmark Challenge”. In: EPTCS 113, 2013, pp. 109-124 (Mar. 29, 2013). doi: 10.4204/EPTCS.113.11. arXiv: 1303.7332v1 [cs.LO]. [CW01] Mario Jose Cáccamo and Glynn Winskel. “A Higher-Order Calculus for Categories”. English. In: Theorem Proving in Higher Order Logics. Edited by Richard J. Boulton and Paul B. Jackson. Volume 2152. Lec- ture Notes in Computer Science. Springer Berlin Heidelberg, June 2001, pages 136–153. isbn: 978-3-540-42525-0. doi: 10.1007/3-540-44755- 5_11. url: ftp://ftp.daimi.au.dk/BRICS/Reports/RS/01/27/ BRICS-RS-01-27.pdf. [Dar19] Ashish Darbari. A Brief History of Formal Verification. EEWeb. Mar. 8, 2019. url: https://www.eeweb.com/profile/adarbari/articles/a- brief-history-of-formal-verification.

186 [Dav01] Martin Davis. “The Early History of Automated Deduction. Dedicated to the memory of Hao Wang”. In: Handbook of Automated Reasoning. Elsevier, 2001, pages 3–15. url: http://cs.nyu.edu/cs/faculty/ davism/early.ps. [Del+15] Ben Delaware, Clément Pit-Claudel, Jason Gross, and Adam Chlipala. “Fiat: Deductive Synthesis of Abstract Data Types in a Proof Assistant”. In: Proceedings of the 42nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’15). Jan. 2015. doi: 10. 1145 / 2676726 . 2677006. url: https : / / people . csail . mit . edu / jgross/personal-website/papers/2015-adt-synthesis.pdf. [Dén13a] Maxime Dénès. New implementation of the conversion test, using nor- malization by evaluation to native OCaml code ⋅ coq/coq@6b908b5. Jan. 22, 2013. url: https://github.com/coq/coq/commit/6b908b5185a55a27a82c2b0fce4713812adde156. [Dén13b] Maxime Dénès. “Towards primitive data types for COQ. 63-bits integers and persistent arrays”. In: The Coq Workshop 2013. Apr. 6, 2013. url: https://coq.inria.fr/files/coq5_submission_2.pdf. [Dén18] Maxime Dénès. Remove quote plugin by maximedenes ⋅ Pull Request #7894 ⋅ coq/coq. Sept. 14, 2018. url: https://github.com/coq/coq/ pull/7894. [Dén20a] Maxime Dénès. Comment by maximedenes on Primitive integers by maximedenes ⋅ Pull Request #6914 ⋅ coq/coq. June 5, 2020. url: https: //github.com/coq/coq/pull/11604%5C#issuecomment-639566223. [Dén20b] Maxime Dénès. Primitive persistent arrays by maximedenes ⋅ Pull Re- quest #11604 ⋅ coq/coq. Feb. 14, 2020. url: https://github.com/coq/ coq/pull/11604. [Des17] Jeff Desjardins. How Many Millions of Lines of Code Does It Take? Visual Capitalist. Feb. 8, 2017. url: https://www.visualcapitalist. com/millions-lines-of-code/ (visited on 11/08/2020). [DG18] Maxime Dénès and Benjamin Grégoire. Primitive integers by maxime- denes ⋅ Pull Request #6914 ⋅ coq/coq. Mar. 5, 2018. url: https:// github.com/coq/coq/pull/6914. [Dow97] Mark Dowson. “The Ariane 5 Software Failure”. In: ACM SIGSOFT Software Engineering Notes 22.2 (Mar. 1997), page 84. issn: 0163-5948. doi: 10.1145/251880.251992. [DP60] Martin Davis and Hilary Putnam. “A Computing Procedure for Quan- tification Theory”. In: Journal of the ACM 7.3 (July 1960), pages 201– 215. issn: 0004-5411. doi: 10.1145/321033.321034. [Dyc85] Roy Dyckhoff. Category Theory as an Extension of Martin-Löf Type Theory. Technical report. 1985. url: http://rd.host.cs.st-andrews. ac.uk/publications/CTMLTT.pdf.

187 [Ebn+17] Gabriel Ebner, Sebastian Ullrich, Jared Roesch, Jeremy Avigad, and Leonardo de Moura. “A Metaprogramming Framework for Formal Veri- fication”. In: Proceedings of the ACM on Programming Languages 1.ICFP (2017), pages 1–29. doi: 10.1145/3110278. url: https://leanprover. github.io/papers/tactic.pdf. [Erb+19] Andres Erbsen, Jade Philipoom, Jason Gross, Robert Sloan, and Adam Chlipala. “Simple High-Level Code For Cryptographic Arithmetic – With Proofs, Without Compromises”. In: IEEE Security & Privacy. San Fran- cisco, CA, USA, May 2019. doi: 10.1109/sp.2019.00005. url: http: //adam.chlipala.net/papers/FiatCryptoSP19/. [Fai+16] Alexander Faithfull, Jesper Bengtson, Enrico Tassi, and Carst Tankink. “Coqoon. An IDE for Interactive Proof Development in Coq”. In: Tools and Algorithms for the Construction and Analysis of Systems. Edited by Marsha Chechik and Jean-François Raskin. Berlin, Heidelberg: Springer, 2016, pages 316–331. isbn: 978-3-662-49674-9. doi: 10.1007/s10009- 017-0457-2. url: https://hal.inria.fr/hal-01242295/document. [Fre17] Foundation. The C Preprocessor: Implementation limits. 2017. url: https://gcc.gnu.org/onlinedocs/gcc-7.5.0/cpp/ Implementation-limits.html. [GAL92] Georges Gonthier, Martín Abadi, and Jean-Jacques Lévy. The geometry of optimal lambda reduction. 1992. doi: 10.1145/143165.143172. [Gar+09a] François Garillot, Georges Gonthier, Assia Mahboubi, and Laurence Rideau. “Packaging Mathematical Structures”. In: Theorem Proving in Higher Order Logics. Springer Berlin Heidelberg, 2009. doi: 10.1007/ 978-3-642-03359-9_23. url: http://hal.inria.fr/docs/00/36/ 84/03/PDF/main.pdf. [Gar+09b] François Garillot, Georges Gonthier, Assia Mahboubi, and Laurence Rideau. “Packaging Mathematical Structures”. In: Theorem Proving in Higher Order Logics. Edited by Stefan Berghofer, Tobias Nipkow, Chris- tian Urban, and Makarius Wenzel. Volume 5674. Lecture Notes in Com- puter Science. Berlin, Heidelberg: Springer, 2009, pages 327–342. isbn: 978-3-642-03359-9. doi: 10.1007/978-3-642-03359-9_23. url: https: //hal.inria.fr/inria-00368403. [Gar+16] Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, Nate Soares, and Jessica Taylor. Logical Induction. Sept. 12, 2016. arXiv: 1609.03543 [cs.AI]. [Gaw14] Nicole Gawron. Infamous Software Bugs: FDIV Bug. Aug. 23, 2014. url: https://www.olenick.com/blog/articles/infamous-software- bugs-fdiv-bug.

188 [GBE07] Andy Georges, Dries Buytaert, and Lieven Eeckhout. “Statistically Rig- orous Java Performance Evaluation”. In: Proceedings of the 22nd An- nual ACM SIGPLAN Conference on Object-Oriented Programming Sys- tems, Languages and Applications. OOPSLA ’07. Montreal, Quebec, Canada: Association for Computing Machinery, 2007, pages 57–76. isbn: 9781595937865. doi: 10.1145/1297027.1297033. [GCS14] Jason Gross, Adam Chlipala, and David I. Spivak. “Experience Im- plementing a Performant Category-Theory Library in Coq”. In: Pro- ceedings of the 5th International Conference on Interactive Theorem Proving (ITP’14). July 2014. doi: 10 . 1007 / 978 - 3 - 319 - 08970 - 6_18. eprint: 1401.7694. url: https://people.csail.mit.edu/ jgross/personal-website/papers/category-coq-experience-itp- submission-final.pdf. [GEC18] Jason Gross, Andres Erbsen, and Adam Chlipala. “Reification by Para- metricity: Fast Setup for Proof by Reflection, in Two Lines of Ltac”. In: Proceedings of the 9th International Conference on Interactive Theorem Proving (ITP’18). July 2018. doi: 10.1007/978-3-319-94821-8_17. url: https://people.csail.mit.edu/jgross/personal-website/ papers/2018-reification-by-parametricity-itp-camera-ready. pdf. [Geu09] Herman Geuvers. “Proof assistants: History, ideas and future”. In: Sād- hanā 34.1 (2009), pages 3–25. issn: 0973-7677. doi: 10.1007/s12046- 009-0001-5. url: https://www.ias.ac.in/article/fulltext/ sadh/034/01/0003-0025. [GHR07] Hermann Gruber, Markus Holzer, and Oliver Ruepp. “Sorting the Slow Way: An Analysis of Perversely Awful Randomized Sorting Algorithms”. In: Fun with Algorithms. Edited by Pierluigi Crescenzi, Giuseppe Prencipe, and Geppino Pucci. Berlin, Heidelberg: Springer, 2007, pages 183–197. isbn: 978-3-540-72914-3. doi: 10.1007/978-3-540-72914-3_17. url: http://www.hermann-gruber.com/pdf/fun07-final.pdf. [Gil+19] Gaëtan Gilbert, Jesper Cockx, Matthieu Sozeau, and Nicolas Tabareau. “Definitional Proof-Irrelevance without K”. In: Proceedings of the ACM on Programming Languages 3.POPL (Jan. 2019). doi: 10.1145/3290316. [Gil18] Gaëtan Gilbert. SProp: the definitionally proof irrelevant universe by SkySkimmer ⋅ Pull Request #8817 ⋅ coq/coq. Oct. 25, 2018. url: https: //github.com/coq/coq/pull/8817. [GL02] Benjamin Grégoire and Xavier Leroy. “A compiled implementation of strong reduction”. In: ICFP 2002: International Conference on Func- tional Programming. ACM, 2002, pages 235–246. doi: 10.1145/581478. 581501.

189 [GM05] Benjamin Grégoire and Assia Mahboubi. “Proving Equalities in a Com- mutative Ring Done Right in Coq”. In: Theorem Proving in Higher Or- der Logics. Edited by Joe Hurd and Tom Melham. Berlin, Heidelberg: Springer, 2005, pages 98–113. isbn: 978-3-540-31820-0. doi: 10.1007/ 11541868_7. [GMT16] Georges Gonthier, Assia Mahboubi, and Enrico Tassi. A Small Scale Re- flection Extension for the Coq system. Technical report. Inria Saclay Ile de France, Nov. 2016. url: https://hal.inria.fr/inria-00258384/. [GMW79] M. J. Gordon, R. Milner, and C. P. Wadsworth. “Edinburgh LCF: A Mechanized Logic of Computation”. In: Springer-Verlag Berlin 10 (1979), pages 11–25. [Gon+11] Georges Gonthier, Beta Ziliani, Aleksandar Nanevski, and Derek Dreyer. “How to Make Ad Hoc Proof Automation Less Ad Hoc”. In: Proceed- ings of the 16th ACM SIGPLAN International Conference on Func- tional Programming. ICFP ’11. Tokyo, Japan: Association for Comput- ing Machinery, 2011, pages 163–175. isbn: 9781450308656. doi: 10 . 1145/2034773.2034798. url: http://www.mpi-sws.org/~beta/ lessadhoc/lessadhoc.pdf. [Gon+13a] Georges Gonthier, Andrea Asperti, Jeremy Avigad, Yves Bertot, Cyril Cohen, François Garillot, Stéphane Le Roux, Assia Mahboubi, Russell O’Connor, Sidi Ould Biha, et al. “A Machine-Checked Proof of the Odd Order Theorem”. In: International Conference on Interactive Theorem Proving. Springer. 2013, pages 163–179. doi: 10.1007/978-3-642- 39634-2_14. eprint: hal-00816699. url: https://hal.inria.fr/ file/index/docid/816699/filename/main.pdf. [Gon+13b] Georges Gonthier, Beta Ziliani, Aleksandar Nanevski, and Derek Dreyer. “How to Make Ad Hoc Proof Automation Less Ad Hoc”. In: Journal of Functional Programming 23.4 (2013), pages 357–401. doi: 10.1017/ S0956796813000051. url: https://people.mpi-sws.org/~beta/ lessadhoc/lessadhoc-extended.pdf. [Gon08] Georges Gonthier. “Formal Proof–The Four-Color Theorem”. In: Notices of the AMS 55.11 (2008), pages 1382–1393. url: https://www.ams. org/notices/200811/tx081101382p.pdf. [Gor+78] M. Gordon, R. Milner, L. Morris, M. Newey, and C. Wadsworth. “A Metalanguage for Interactive Proof in LCF”. In: Proceedings of the 5th ACM SIGACT-SIGPLAN symposium on Principles of programming lan- guages. ACM. 1978, pages 119–130. [Gor00] Michael John Caldwell Gordon. “From LCF to HOL: A Short History”. In: Proof, Language, and Interaction: Essays in Honour of Robin Mil- ner. Cambridge, MA, USA: MIT Press, 2000, pages 169–185. isbn: 0262161885. url: https : / / www . cl . cam . ac . uk / archive / mjcg / papers/HolHistory.pdf.

190 [Gor15] Michael John Caldwell Gordon. “Tactics for mechanized reasoning: a commentary on Milner (1984) ‘The use of machines to assist in rigorous proof”’. In: Philosophical Transactions of the Royal Society A: Mathe- matical, Physical and Engineering Sciences 373 (2039 Apr. 13, 2015). issn: 1471-2962. doi: 10.1098/rsta.2014.0234. [Gra18] Daniel R. Grayson. “An introduction to univalent foundations for mathe- maticians”. In: Bulletin of the American Mathematical Society (55 Mar. 5, 2018), pages 427–450. issn: 1088-9485. doi: 10.1090/bull/1616. [Gro14] Jason Gross. Formalizations of category theory in proof assistants. Math- Overflow. Jan. 19, 2014. url: http://mathoverflow.net/questions/ 152497/formalizations-of-category-theory-in-proof-assistants. [Gro15a] Jason Gross. “An Extensible Framework for Synthesizing Efficient, Ver- ified Parsers”. Master’s thesis. Massachusetts Institute of Technology, Sept. 2015. url: https://people.csail.mit.edu/jgross/personal- website/papers/2015-jgross-thesis.pdf. [Gro15b] Jason Gross. Coq Bug Minimizer. Presented at The First International Workshop on Coq for PL (CoqPL’15). Jan. 2015. url: https://people. csail.mit.edu/jgross/personal-website/papers/2015-coq-bug- minimizer.pdf. [Gu+15] Ronghui Gu, Jérémie Koenig, Tahina Ramananandro, Zhong Shao, Xiong- nan (Newman) Wu, Shu-Chun Weng, Haozhong Zhang, and Yu Guo. “Deep Specifications and Certified Abstraction Layers”. In: Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. POPL ’15. Mumbai, India: Association for Computing Machinery, 2015, pages 595–608. isbn: 9781450333009. doi: 10.1145/2676726.2676975. url: https://flint.cs.yale.edu/ flint/publications/dscal.pdf. [Güç18] Osman Gazi Güçlütürk. The DAO Hack Explained: Unfortunate Take- off of Smart Contracts. Medium. Aug. 1, 2018. url: https://medium. com/@ogucluturk/the-dao-hack-explained-unfortunate-take- off-of-smart-contracts-2bd8c8db3562. [Hal+14] Thomas Hales, Alexey Solovyevand, Hoang Le Truong, and the Fly- speck Team. flyspeck - AnnouncingCompletion.wiki. Aug. 10, 2014. url: https://code.google.com/archive/p/flyspeck/wikis/AnnouncingCompletion. wiki. [Hal06] Thomas C. Hales. “Introduction to the Flyspeck Project”. In: Mathe- matics, Algorithms, Proofs. Edited by Thierry Coquand, Henri Lom- bardi, and Marie-Françoise Roy. Dagstuhl Seminar Proceedings 05021. Dagstuhl, Germany: Internationales Begegnungs- und Forschungszen- trum für Informatik (IBFI), Schloss Dagstuhl, Germany, 2006. url: http://drops.dagstuhl.de/opus/volltexte/2006/432.

191 [Hal95] Tom R. Halfhill. “The Truth Behind the Pentium Bug”. In: BYTE (Mar. 1995). url: https://web.archive.org/web/20060209005434/http: //www.byte.com/art/9503/sec13/art1.htm. [Har01] John Harrison. The LCF Approach to Theorem Proving. Intel Corpo- ration. Sept. 12, 2001. url: https://www.cl.cam.ac.uk/~jrh13/ slides/manchester-12sep01/slides.pdf. [Har96a] John Harrison. “A Mizar Mode for HOL”. In: Theorem Proving in Higher Order Logics: 9th International Conference, TPHOLs’96. Edited by Joakim von Wright, Jim Grundy, and John Harrison. Volume 1125. Lec- ture Notes in Computer Science. Turku, Finland: Springer-Verlag, Aug. 1996, pages 203–220. url: https://www.cl.cam.ac.uk/~jrh13/ papers/mizar.html. [Har96b] John Harrison. Formalized mathematics. TUCS technical report. Turku Centre for Computer Science, 1996. isbn: 9789516508132. url: http: //citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.47. 8842&rep=rep1&type=pdf. [Har96c] John Harrison. “HOL Light: A Tutorial Introduction”. In: Lecture Notes in Computer Science (1996), pages 265–269. [HHP93] Robert Harper, Furio Honsell, and Gordon Plotkin. “A Framework for Defining Logics”. In: JACM (1993). [HM12] André Hirschowitz and Marco Maggesi. “Nested Abstract Syntax in Coq”. In: Journal of Automated Reasoning 49.3 (Oct. 1, 2012), pages 409– 426. issn: 1573-0670. doi: 10.1007/s10817-010-9207-9. url: https: //math.unice.fr/~ah/div/fsubwf.pdf. [HN07] Florian Haftmann and Tobias Nipkow. “A Code Generator Framework for Isabelle/HOL”. In: Proc. TPHOLs. 2007. [HoT20] HoTT Library Authors. HoTT/HoTT Categories. Feb. 23, 2020. url: https://github.com/HoTT/HoTT/tree/V8.12/theories/Categories. [HP03] Gérard Huet and Christine Paulin-Mohring. “Forward”. In: Yves Bertot and Pierre Castéran. Interactive Theorem Proving and Program De- velopment. Coq’Art: The Calculus of Inductive Constructions. Texts in Theoretical Computer Science. An EATCS Series. Springer Berlin Hei- delberg, Nov. 2003, pages IX–XVI. isbn: 9783662079645. url: https: //books.google.com/books?id=FeklBQAAQBAJ. [HP91] Robert Harper and Robert Pollack. “Type checking with universes”. In: Theoretical Computer Science 89.1 (1991), pages 107–136. issn: 0304- 3975. doi: 10 . 1016 / 0304 - 3975(90 ) 90108 - T. url: http : / / www . sciencedirect.com/science/article/pii/030439759090108T.

192 [HS00] Gérard Huet and Amokrane Saïbi. “Constructive category theory”. In: Proof, language, and interaction. MIT Press. 2000, pages 239–275. url: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.39. 4193. [HS98] Martin Hofmann and Thomas Streicher. “The groupoid interpretation of type theory”. In: Twenty-five years of constructive type theory (Venice, 1995) 36 (Aug. 1998), pages 83–111. url: http://www.tcs.ifi.lmu. de/mitarbeiter/martin-hofmann/pdfs/agroupoidinterpretationoftypetheroy. pdf. [HUW14] John Harrison, Josef Urban, and Freek Wiedijk. “History of Interactive Theorem Proving”. In: Computational Logic. 2014, pages 135–214. doi: 10.1016/b978-0-444-51624-4.50004-6. url: https://www.cl.cam. ac.uk/~jrh13/papers/joerg.pdf. [ID16] Andrew David Irvine and Harry Deutsch. “Russell’s Paradox”. In: The Stanford Encyclopedia of Philosophy. Edited by Edward N. Zalta. Win- ter 2016. Metaphysics Research Lab, Stanford University, 2016. url: https://plato.stanford.edu/archives/win2016/entries/russell- paradox/. [Ish] Hiromi Ishii. konn/category-agda. url: https://github.com/konn/ category-agda. [JGS93] N. D. Jones, C. K. Gomard, and P. Sestoft. Partial Evaluation and Automatic Program Generation. Prentice Hall International, June 1993. isbn: 0-13-020249-5. [JM97] Jean-Marc Jézéquel and Bertrand Meyer. “Design by Contract: The Lessons of Ariane”. In: Computer 30.1 (Jan. 1997). Edited by Bertrand Meyer, pages 129–130. issn: 1558-0814. doi: 10.1109/2.562936. url: http://se.ethz.ch/~meyer/publications/computer/ariane.pdf. [Kai+18] Jan-Oliver Kaiser, Beta Ziliani, Robbert Krebbers, Yann Régis-Gianas, and Derek Dreyer. “Mtac2: Typed Tactics for Backward Reasoning in Coq”. In: Proc. ACM Program. Lang. 2.ICFP (July 2018). doi: 10 . 1145/3236773. [Kam02] Fairouz Kamareddine. Thirty Five years of Automath. Apr. 2002. url: http://www.cedar-forest.org/forest/events/automath2002/. [Kat10] Alexander Katovsky. “Category Theory”. In: Archive of Formal Proofs (June 2010). http://afp.sf.net/entries/Category2.shtml, Formal proof development. issn: 2150-914x. [KKL] Vladimir Komendantsky, Alexander Konovalov, and Steve Linton. Con- necting Coq theorem prover to GAP. SCIEnce/CICM’10; University of St Andrews, UK. url: http : / / www . symcomp . org / sciencehome - view/images/e/e9/CICM_2010_Komendantsky.pdf.

193 [KKR06] Dexter Kozen, Christoph Kreitz, and Eva Richter. “Automating Proofs in Category Theory”. In: Automated Reasoning. Springer, 2006, pages 392– 407. doi: 10.1007/11814771_34. url: http://www.cs.uni-potsdam. de/ti/kreitz/PDF/06ijcar-categories.pdf. [Kle+09] Gerwin Klein, Kevin Elphinstone, Gernot Heiser, June Andronick, David Cock, Philip Derrin, Dhammika Elkaduwe, Kai Engelhardt, Rafal Kolan- ski, Michael Norrish, Thomas Sewell, Harvey Tuch, and Simon Win- wood. “seL4: Formal Verification of an OS Kernel”. In: Proc. SOSP. SOSP ’09. Big Sky, Montana, USA: ACM, 2009, pages 207–220. isbn: 9781605587523. doi: 10.1145/1629575.1629596. [Kle11] Dirk Kleeblatt. “On a Strongly Normalizing STG Machine. With an Ap- plication to Dependent Type Checking”. PhD thesis. Technischen Uni- versität Berlin, 2011. url: https://depositonce.tu-berlin.de/ bitstream/11303/3095/1/Dokument_9.pdf. [KM20a] Matt Kaufmann and J. Strother Moore. ACL2 User’s Manual — Interesting- applications. 2020. url: https://www.cs.utexas.edu/users/moore/ acl2/v8-3/combined-manual/index.html?topic=ACL2____INTERESTING- APPLICATIONS. [KM20b] Matt Kaufmann and J. Strother Moore. ACL2 Version 8.3. Apr. 14, 2020. url: https://www.cs.utexas.edu/users/moore/acl2/. [Knu74a] Donald E. Knuth. “Computer Programming as an Art”. In: Communi- cations of the ACM 17.12 (Dec. 1974), pages 667–673. issn: 0001-0782. doi: 10.1145/361604.361612. url: http://www.paulgraham.com/ knuth.html. [Knu74b] Donald E. Knuth. “Structured Programming with go to Statements”. In: ACM Computing Surveys 6.4 (Dec. 1974), pages 261–301. issn: 0360- 0300. doi: 10.1145/356635.356640. [Kova] András Kovács. Non-deterministic normalization-by-evaluation in Olle Fredriksson’s flavor. url: https://gist.github.com/AndrasKovacs/ a0e0938113b193d6b9c1c0620d853784. [Kovb] András Kovács. Normalization Bench. url: https : / / github . com / AndrasKovacs/normalization-bench. [Kovc] András Kovács. smalltt: Demo for high-performance type theory elabo- ration. url: https://github.com/AndrasKovacs/smalltt. [Kov18] András Kovács. Sharing-Preserving Elaboration with Precisely Scoped Metavariables. Agda Implementors’ Meeting XXVI, Jan. 2018. url: https://github.com/AndrasKovacs/elaboration-zoo/blob/0c7f8a676c/ AIMprez/AIMprez.pdf.

194 [Kov19a] András Kovács. Fast Elaboration for Dependent Type Theories. EUTypes WG Meeting, Krakow, Feb. 24, 2019. url: https : / / github . com / AndrasKovacs/smalltt/blob/fb56723b098cb1a95e8a5f3f9f5fce30bbcc67da/ krakow-pres.pdf. [Kov19b] András Kovács. Online reference book for *implementing* concepts in type theory. Dec. 10, 2019. url: https://math.stackexchange.com/ a/3468022/22982. [KSW] Robbert Krebbers, Bas Spitters, and Eelis van der Weegen. Math Classes. [Kum+14] Ramana Kumar, Magnus O. Myreen, Michael Norrish, and Scott Owens. “CakeML: A Verified Implementation of ML”. In: Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Program- ming Languages. POPL ’14. San Diego, California, USA: Association for Computing Machinery, 2014, pages 179–191. isbn: 9781450325448. doi: 10.1145/2535838.2535841. [KV17] Vishal Kasliwal and Andrey Vladimirov. A Performance-Based Com- parison of C/C++ Compilers. Colfax International. Nov. 19, 2017. url: https://colfaxresearch.com/compiler-comparison/. [Lam89] John Lamping. “An algorithm for optimal lambda calculus reduction”. In: Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages. 1989, pages 16–30. doi: 10.1145/ 96709.96711. [Lan14] Adam Langley. Correct bounds in 32-bit code. ⋅ agl/curve25519-donna@2647eeb. June 15, 2014. url: https://github.com/agl/curve25519-donna/ commit/2647eeba59fb628914c79ce691df794a8edc799f. [Lee+96] Jonathan P. Leech, Larry Klaes, Matthew Wiener, and Yoshiro Yamada. Space FAQ 08/13 - Planetary Probe History. Sept. 17, 1996. url: http: //www.faqs.org/faqs/space/probe/. [Lei] Tom Leinster. Higher Operads, Higher Categories. Cambridge Univ. Press. doi: 10.1017/cbo9780511525896. arXiv: math/0305049. [Lei20] Charles E. Leiserson. Re: Quoting you in my PhD Thesis? personal correspondence. Aug. 28, 2020. [Ler07] Xavier Leroy. A locally nameless solution to the POPLmark challenge. Research Report RR-6098. INRIA, 2007, page 54. url: https://hal. inria.fr/inria-00123945. [Ler09] Xavier Leroy. “A Formally Verified Compiler Back-end”. In: Journal of Automated Reasoning 43.4 (Dec. 2009), pages 363–446. issn: 0168-7433. doi: 10.1007/s10817-009-9155-4. url: http://gallium.inria.fr/ ~xleroy/publi/compcert-backend.pdf.

195 [Lév80] Jean-Jacques Lévy. “Optimal reductions in the lambda-calculus”. In: To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism. Edited by J. P. Seldin and J. R. Hindley. Academic Press, 1980. url: http://pauillac.inria.fr/~levy/pubs/80curry.pdf. [Lic11] Dan Licata. Running Circles Around (In) Your Proof Assistant; or, Quo- tients that Compute. Apr. 23, 2011. url: https://homotopytypetheory. org / 2011 / 04 / 23 / running - circles - around - in - your - proof - assistant/. [Llo99] Robin Lloyd. “Metric mishap caused loss of NASA orbiter”. In: CNN (Sept. 30, 1999). url: http://www.cnn.com/TECH/space/9909/30/ mars.metric.02/index.html. [LP99] Leslie Lamport and Lawrence C. Paulson. “Should Your Specification Language Be Typed?” In: ACM Transactions on Programming Lan- guages and Systems 21.3 (May 1999), pages 502–526. issn: 0164-0925. doi: 10.1145/319301.319317. url: https://lamport.azurewebsites. net/pubs/lamport-types.pdf. [LT93] Nancy G. Leveson and Clark S. Turner. “An Investigation of the Therac- 25 Accidents”. In: Computer 26.7 (July 1993), pages 18–41. issn: 1558- 0814. doi: 10.1109/MC.1993.274940. url: https://web.archive. org/web/20041128024227/http://www.cs.umd.edu/class/spring2003/ cmsc838p/Misc/therac.pdf. [Luc+79] David C. Luckham, Steven M. German, F. W. V. Henke, Richard A. Karp, and P. W. Milne. Stanford Pascal Verifier User Manual. Techni- cal report. Stanford University of California Department of Computer Science, 1979. url: http://i.stanford.edu/pub/cstr/reports/cs/ tr/79/731/CS-TR-79-731.pdf. [Mac] Saunders Mac Lane. Categories for the working mathematician. doi: 10.1007/978-1-4612-9839-7. url: http://books.google.com/ books?id=MXboNPdTv7QC. [Mak11] Henning Makholm. Are there statements that are undecidable but not provably undecidable. Sept. 17, 2011. eprint: https://math.stackexchange. com/q/65302. url: https://math.stackexchange.com/q/65302 (vis- ited on 06/12/2020). [Mal+13] Gregory Malecha, Adam Chlipala, Thomas Braibant, Patrick Hulin, and Edward Z. Yang. “MirrorShard: Proof by Computational Reflection with Verified Hints”. In: CoRR abs/1305.6543 (2013). arXiv: 1305.6543. [Mal14] Gregory Michael Malecha. “Extensible Proof Engineering in Intensional Type Theory”. PhD thesis. Harvard University, Nov. 2014. url: http: //gmalecha.github.io/publication/2015/02/01/extensible- proof-engineering-in-intensional-type-theory.html.

196 [Mal17] Gregory Malecha. Speeding Up Proofs with Computational Reflection. June 5, 2017. url: https : / / gmalecha . github . io / reflections / 2017/speeding-up-proofs-with-computational-reflection. [Mal18] Gregory Malecha. To Be Typed and Untyped. Feb. 20, 2018. url: https: / / gmalecha . github . io / reflections / 2018 / to - be - typed - or - untyped. [Mar08] Luc Maranget. “Compiling Pattern Matching to Good Decision Trees”. In: Proceedings of the 2008 ACM SIGPLAN workshop on ML. ACM. 2008, pages 35–46. doi: 10.1145/1411304.1411311. url: http:// moscova.inria.fr/~maranget/papers/ml05e-maranget.pdf. [Mar18] Erik Martin-Dorel. Implementing primitive floats (binary64 floating- point numbers) - Issue #8276 - coq/coq. Aug. 2018. url: https:// github.com/coq/coq/issues/8276. [Mar75] Per Martin-Löf. “An Intuitionistic Theory of Types: Predicative Part”. In: Logic Colloquium ’73. Edited by H.E. Rose and J.C. Shepherdson. Volume 80. Studies in Logic and the Foundations of Mathematics. El- sevier, 1975, pages 73–118. doi: 10 . 1016 / S0049 - 237X(08 ) 71945 - 1. url: http://www.sciencedirect.com/science/article/pii/ S0049237X08719451. [Mar82] Per Martin-Löf. “Constructive Mathematics and Computer Program- ming”. In: Logic, Methodology and Philosophy of Science VI. Edited by L. Jonathan Cohen, Jerzy Łoś, Helmut Pfeiffer, and Klaus-Peter Podewski. Volume 104. Studies in Logic and the Foundations of Math- ematics. Elsevier, 1982, pages 153–175. doi: 10.1016/S0049-237X(09) 70189-2. url: http://www.sciencedirect.com/science/article/ pii/S0049237X09701892. [MB16] Gregory Malecha and Jesper Bengtson. “Extensible and Efficient Au- tomation Through Reflective Tactics”. In: Programming Languages and Systems. Edited by Peter Thiemann. Berlin, Heidelberg: Springer, 2016, pages 532–559. isbn: 978-3-662-49498-1. doi: 10.1007/978-3-662- 49498-1_21. [MBR19] Erik Martin-Dorel, Guillaume Bertholon, and Pierre Roux. Add prim- itive floats (binary64 floating-point numbers) by erikmd ⋅ Pull Request #9867 ⋅ coq/coq. Mar. 29, 2019. url: https://github.com/coq/coq/ pull/9867. [McB10] Conor McBride. “Outrageous but Meaningful Coincidences: Dependent Type-Safe Syntax and Evaluation”. In: Proceedings of the 6th ACM SIG- PLAN Workshop on Generic Programming. WGP ’10. Baltimore, Mary- land, USA: Association for Computing Machinery, 2010, pages 1–12. isbn: 9781450302517. doi: 10.1145/1863495.1863497. url: https:// personal.cis.strath.ac.uk/conor.mcbride/pub/DepRep/DepRep. pdf.

197 [MCB14] Gregory Malecha, Adam Chlipala, and Thomas Braibant. “Composi- tional Computational Reflection”. In: ITP’14: Proceedings of the 5th International Conference on Interactive Theorem Proving. 2014. doi: 10.1007/978-3-319-08970-6_24. url: http://adam.chlipala.net/ papers/MirrorShardITP14/. [Meg] Adam Megacz. Category Theory Library for Coq. Coq. url: http:// www.cs.berkeley.edu/~megacz/coq-categories/. [Miq01] Alexandre Miquel. “The Implicit Calculus of Constructions. Extending Pure Type Systems with an Intersection Type Binder and Subtyping”. In: Proceedings of the 5th International Conference on Typed Lambda Calculi and Applications. Volume 2044. TLCA’01. Springer. Kraków, Poland: Springer-Verlag, 2001, pages 344–359. isbn: 3540419608. url: http://www.pps.univ-paris-diderot.fr/~miquel/publis/tlca01. pdf. [Moh95] Takahisa Mohri. “On formalization of category theory”. Master’s thesis. University of Tokyo, 1995. doi: 10.1007/bfb0028395. url: http:// aleteya.cs.buap.mx/~jlavalle/papers/categorias/ST.ps. [Moo07] J. S. Moore. Piton: A Mechanically Verified Assembly-Level Language. Automated Reasoning Series. Springer Netherlands, 2007. isbn: 9780585336541. url: https://books.google.com/books?id=Y09c047gV10C. [Moo19] J. Strother Moore. “Milestones from The Pure Lisp Theorem Prover to ACL2”. In: Formal Aspects of Computing 31.6 (2019), pages 699–732. issn: 1433-299X. doi: 10.1007/s00165-019-00490-3. [Mou+15] Leonardo de Moura, Soonho Kong, Jeremy Avigad, Floris Van Doorn, and Jakob von Raumer. “The Lean Theorem Prover (System Descrip- tion)”. In: Automated Deduction - CADE-25. Edited by Amy P. Felty and Aart Middeldorp. Cham: Springer International Publishing, 2015, pages 378–388. isbn: 978-3-319-21401-6. doi: 10.1007/978-3-319- 21401-6_26. url: https://leanprover.github.io/papers/system. pdf. [MR05] Roman Matuszewski and Piotr Rudnicki. “Mizar: the first 30 years”. In: Mechanized Mathematics and Its Applications 4.1 (Mar. 2005). url: http://mizar.org/people/romat/MatRud2005.pdf. [MS84] Per Martin-Löf and Giovanni Sambin. Intuitionistic Type Theory. Vol- ume 17. Bibliopolis, 1984. url: http://www.cs.cmu.edu/afs/cs/Web/ People/crary/819-f09/Martin-Lof80.pdf. [MW13] J. Strother Moore and Claus-Peter Wirth. “Automation of Mathematical Induction as part of the History of Logic”. In: IfCoLog Journal of Logics and their Applications, Vol. 4, number 5, pp. 1505-1634 (2017) (Sept. 24, 2013). arXiv: 1309.6226v5 [cs.AI].

198 [Myt+09] Todd Mytkowicz, Amer Diwan, Matthias Hauswirth, and Peter F. Sweeney. “Producing Wrong Data Without Doing Anything Obviously Wrong!” In: Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems. ASPLOS XIV. Washington, DC, USA: Association for Computing Machinery, 2009, pages 265–276. isbn: 9781605584065. doi: 10.1145/1508244. 1508275. url: https : / / users . cs . northwestern . edu / ~robby / courses/322-2013-spring/mytkowicz-wrong-data.pdf. [nCa12a] nCatLab Authors. adjoint functor: in terms of universal arrows / uni- versal factorization through unit and counit. nCatLab. Nov. 2012. url: http://ncatlab.org/nlab/show/adjoint+functor#UniversalArrows. [nCa12b] nCatLab Authors. subobject classifier. nLab. Sept. 2012. url: http : //ncatlab.org/nlab/show/subobject+classifier. [Nic11] Thomas R. Nicely. Pentium FDIV flaw FAQ. Aug. 19, 2011. url: https: //web.archive.org/web/20190618044444/http://www.trnicely. net/pentbug/pentbug.html. [Niq10] Milad Niqui. Coalgebras, bisimulation and lambda-coiteration. Jan. 2010. url: http : / / coq . inria . fr / pylons / pylons / contribs / view / Coalgebras/v8.4. [NL98] George C. Necula and Peter Lee. “Efficient Representation and Val- idation of Proofs”. In: Proceedings of the 13th Annual IEEE Sympo- sium on Logic in Computer Science. LICS ’98. USA: IEEE Computer Society, June 1998, page 93. isbn: 0818685069. doi: 10.1109/lics. 1998.705646. url: https://people.eecs.berkeley.edu/~necula/ Papers/lfi_lics98.ps. [Nog02] Aleksey Yuryevich Nogin. Theory and Implementation of an Efficient Tactic-Based Logical Framework. Cornell University, 2002. url: http: //www.nuprl.org/documents/Nogin/thesis-nogin.pdf. [Nor09] Ulf Norell. “Dependently Typed Programming in Agda”. In: Advanced Functional Programming: 6th International School, AFP 2008, Hei- jen, The Netherlands, May 2008, Revised Lectures. Edited by Pieter Koopman, Rinus Plasmeijer, and Doaitse Swierstra. Berlin, Heidelberg: Springer, 2009, pages 230–266. isbn: 978-3-642-04652-0. doi: 10.1007/ 978-3-642-04652-0_5. [Nor11] Ulf Norell. Agda performance improvements. Aug. 2011. url: https: //lists.chalmers.se/pipermail/agda/2011/003266.html. [NPW02] Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Isabelle/HOL — A Proof Assistant for Higher-Order Logic. Volume 2283. LNCS. Springer-Verlag, 2002. [Nuo13] Li Nuo. Second-Year Annual Report. July 2013. url: http://www.cs. nott.ac.uk/~nzl/Home_Page/Homepage_files/AR2-8Jul2013.pdf.

199 [Oka96] Chris Okasaki. “Purely Functional Data Structures”. PhD thesis. Carnegie Mellon University, Sept. 1996. url: https://www.cs.cmu.edu/~rwh/ theses/okasaki.pdf. [Oka98] Chris Okasaki. Purely Functional Data Structures. Cambridge Univer- sity Press, 1998. isbn: 9781139811019. doi: 10.1017/cbo9780511530104. url: https://books.google.com/books?id=IV8hAwAAQBAJ. [OKe04] Greg O’Keefe. “Towards a readable formalisation of category theory”. In: Electronic Notes in Theoretical Computer Science 91 (2004), pages 212– 228. doi: 10.1016/j.entcs.2003.12.014. url: http://users.cecs. anu.edu.au/~okeefe/work/fcat4cats04.pdf. [ORS92] Sam Owre, John M. Rushby, and Natarajan Shankar. “PVS: A Pro- totype Verification System”. In: Proceedings of the 11th International Conference on Automated Deduction: Automated Deduction. CADE-11. Berlin, Heidelberg: Springer-Verlag, 1992, pages 748–752. isbn: 3540556028. doi: 10.1007/3-540-55602-8_217. [Pau18] Lawrence C. Paulson. “Formalising Mathematics In Simple Type The- ory”. In: Computing Research Repository abs/1804.07860 (Apr. 20, 2018). doi: 10.1007/978-3-030-15655-8_20. arXiv: 1804.07860v1 [cs.LO]. [Pau94] L. C. Paulson. Isabelle: A Generic Theorem Prover. Volume 828. Springer, 1994. [PCG18] Karl Palmskog, Ahmet Celik, and Milos Gligoric. “piCoǫ: Parallel re- gression proving for large-scale verification projects”. In: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2018, pages 344–355. doi: 10.1145/3213846.3213877. url: http://users.ece.utexas.edu/~gligoric/papers/PalmskogETAL18piCoq. pdf. [PE88] Frank Pfenning and Conal Elliot. “Higher-order abstract syntax”. In: Proc. PLDI. Atlanta, Georgia, , 1988, pages 199–208. doi: 10.1145/53990.54010. url: https://www.cs.cmu.edu/~fp/papers/ pldi88.pdf. [Péd16a] Pierre-Marie Pédrot. CoqHoTT-minute: Ticking like a Clockwork: the New Coq Tactics. Feb. 14, 2016. url: http://coqhott.gforge.inria. fr/blog/coq-tactic-engine/. [Péd16b] Pierre-Marie Pédrot. Fix bug #4777: Printing time is impacted by large terms that don’t print ⋅ coq/coq@87f9a31. June 7, 2016. url: https:// github.com/coq/coq/commit/87f9a317b020554abef358aec614dad1fdc0bd98. [Péd17a] Pierre-Marie Pédrot. Fast rel lookup by ppedrot ⋅ Pull Request #6506 ⋅ coq/coq. Dec. 2017. url: https://github.com/coq/coq/pull/6506. [Péd17b] Pierre-Marie Pédrot. Introducing evar-insensitive constrs by ppedrot ⋅ Pull Request #379 ⋅ coq/coq. Apr. 10, 2017. url: https://github. com/coq/coq/pull/379.

200 [Péd18] Pierre-Marie Pédrot. Fast typing of application nodes by ppedrot ⋅ Pull Request #8255 ⋅ coq/coq. Aug. 15, 2018. url: https://github.com/ coq/coq/pull/8255. [Pee+] Daniel Peebles, James Deikun, Andrea Vezzosi, and James Cook. cop- umpkin/categories. url: https://github.com/copumpkin/categories. [Pfe02] Frank Pfenning. “Logical Frameworks—A Brief Introduction”. In: Proof and System-Reliability. Edited by Helmut Schwichtenberg and Ralf Stein- brüggen. Dordrecht: Springer Netherlands, 2002, pages 137–166. isbn: 978-94-010-0413-8. doi: 10.1007/978-94-010-0413-8_5. url: https: //www.cs.cmu.edu/~fp/papers/mdorf01.pdf. [Pfe91] F. Pfenning. “ in the LF Logical Framework”. In: Logical Frameworks (1991). [Pie] B. Pierce. A taste of category theory for computer scientists. Technical report. url: http://repository.cmu.edu/cgi/viewcontent.cgi? article=2846&context=compsci. [Pie02] Benjamin C. Pierce. Types and Programming Languages. The MIT Press. MIT Press, 2002. isbn: 9780262162098. url: https://www.cis.upenn. edu/~bcpierce/tapl/. [Pie90] William Pierce. “Toward Mechanical Methods for Streamlining Proofs”. In: 10th International Conference on Automated Deduction. Edited by Mark E. Stickel. Berlin, Heidelberg: Springer, 1990, pages 351–365. isbn: 978-3-540-47171-4. [Pip97] Nicholas Pippenger. “Pure versus Impure Lisp”. In: ACM Transactions on Programming Languages and Systems 19.2 (Mar. 1997), pages 223– 238. issn: 0164-0925. doi: 10.1145/244795.244798. [Pit03] Andrew M. Pitts. “Nominal logic, a first order theory of names and binding”. In: Information and Computation 186.2 (2003). Theoretical Aspects of Computer Software (TACS 2001), pages 165–193. issn: 0890- 5401. doi: 10.1016/S0890-5401(03)00138-X. url: https://www. sciencedirect.com/science/article/pii/S089054010300138X. [PNW19] Lawrence C. Paulson, Tobias Nipkow, and Makarius Wenzel. “From LCF to Isabelle/HOL”. In: (July 5, 2019). doi: 10.1007/s00165-019-00492- 1. arXiv: 1907.02836v2 [cs.LO]. [Pol94] Robert Pollack. “The Theory of LEGO. A Proof Checker for the Ex- tended Calculus of Constructions”. PhD thesis. University of Edinburgh, 1994. [Pot] Loïc Pottier. Algebra. url: http://coq.inria.fr/pylons/pylons/ contribs/view/Algebra/v8.4. [Pou] Nicolas Pouillard. crypto-agda/crypto-agda. url: https : / / github . com/crypto-agda/crypto-agda/tree/master/FunUniverse.

201 [Pre29] Mojżesz Presburger. “Über die Vollständigkeit eines gewissen Systems der Arithmetik ganzer Zahlen”. German. In: Comptes Rendus du I con- grès de Mathématiciens des Pays Slaves (1929). see Stansifer [Sta84] for an English translation, pages 92–101. [PS99] Frank Pfenning and Carsten Schürmann. “System Description: Twelf – A Meta-Logical Framework for Deductive Systems”. In: Automated Deduc- tion — CADE-16: 16th International Conference on Automated Deduc- tion Trento, Italy, July 7–10, 1999 Proceedings. CADE-16. Berlin, Hei- delberg: Springer-Verlag, 1999, pages 202–206. isbn: 3540662227. doi: 10.1007/3-540-48660-7_14. [Raa20] Panu Raatikainen. “Gödel’s Incompleteness Theorems”. In: The Stanford Encyclopedia of Philosophy. Edited by Edward N. Zalta. Fall 2020. Meta- physics Research Lab, Stanford University, 2020. url: https://plato. stanford.edu/archives/fall2020/entries/goedel-incompleteness/. [Ray03] “bogo-sort”. In: The Jargon File 4.4.7. Edited by Eric Raymond. Dec. 29, 2003. url: http://www.catb.org/jargon/html/B/bogo-sort.html. [Rey83] John C. Reynolds. “Types, Abstraction and Parametric Polymorphism”. In: Information Processing 83, Proceedings of the IFIP 9th World Com- puter Congres. 1983, pages 513–523. [Rin+20] Talia Ringer, Karl Palmskog, Ilya Sergey, Milos Gligoric, and Zachary Tatlock. “QED at Large: A Survey of Engineering of Formally Verified Software”. In: Foundations and Trends in Programming Languages, Vol. 5, No. 2-3 (Sept. 2019), pp. 102-281 (Mar. 13, 2020). doi: 10.1561/ 2500000045. arXiv: 2003.06458 [cs.LO]. [RO10] Tiark Rompf and Martin Odersky. “Lightweight modular staging: A pragmatic approach to runtime code generation and compiled DSLs”. In: Proceedings of the Ninth International Conference on Generative Programming and Component Engineering, GPCE 2010 (2010). doi: 10.1145/2184319.2184345. url: https://infoscience.epfl.ch/ record/150347/files/gpce63-rompf.pdf. [Rob65] J. A. Robinson. “A Machine-Oriented Logic Based on the Resolution Principle”. In: Journal of the ACM 12.1 (Jan. 1965), pages 23–41. issn: 0004-5411. doi: 10.1145/321250.321253. [Rud92] Piotr Rudnicki. “An Overview of the MIZAR Project”. In: University of Technology, Bastad. June 30, 1992, pages 311–332. url: http://mizar. org/project/MizarOverview.pdf. [SA00] Zhong Shao and Andrew W. Appel. “Efficient and Safe-for-Space Clo- sure Conversion”. In: ACM Transactions on Programming Languages and Systems 22.1 (Jan. 2000), pages 129–161. issn: 0164-0925. doi: 10.1145/345099.345125. url: https://flint.cs.yale.edu/shao/ papers/escc.html.

202 [Saï] Amokrane Saïbi. Constructive Category Theory. url: http : / / coq . inria.fr/pylons/pylons/contribs/view/ConCaT/v8.4. [Sco93] Dana S. Scott. “A Type-Theoretical Alternative to ISWIM, CUCH, OWHY”. In: Theoretical Computer Science 121.1&2 (1993), pages 411– 440. doi: 10.1016/0304-3975(93)90095-B. url: https://www.cs. cmu.edu/~kw/scans/scott93tcs.pdf. [Sha94] N. Shankar. Metamathematics, Machines and Gödel’s Proof. Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 1994. isbn: 9780511569883. doi: 10.1017/CBO9780511569883. [Sha96] N. Shankar. “PVS: Combining Specification, Proof Checking, and Model Checking”. In: Formal Methods in Computer-Aided Design. Edited by Mandayam Srivas and Albert Camilleri. Berlin, Heidelberg: Springer, 1996, pages 257–264. isbn: 978-3-540-49567-3. [SHM20] Daniel Selsam, Simon Hudon, and Leonardo de Moura. “Sealing Pointer- Based Optimizations Behind Pure Functions”. In: (Mar. 3, 2020). arXiv: 2003.01685v1 [cs.PL]. [Shu] Michael Shulman. An Interval Type Implies Function Extensionality. url: http://homotopytypetheory.org/2011/04/04. [Shu12] Mike Shulman. Universe Polymorphism and Typical Ambiguity. Dec. 9, 2012. url: https://golem.ph.utexas.edu/category/2012/12/ universe_polymorphism_and_typi.html. [Sim] Carlos Simpson. CatsInZFC. url: http://coq.inria.fr/pylons/ pylons/contribs/view/CatsInZFC/v8.4. [Sli10] Konrad Slind. Trusted Extensions of Interactive Theorem Provers: Work- shop Summary. Cambridge, England, Aug. 2010. url: http://www. cs.utexas.edu/users/kaufmann/itp-trusted-extensions-aug- 2010/summary/summary.pdf. [SLM98] Zhong Shao, Christopher League, and Stefan Monnier. “Implementing Typed Intermediate Languages”. In: Proceedings of the Third ACM SIG- PLAN International Conference on Functional Programming. ICFP ’98. Baltimore, Maryland, USA: Association for Computing Machinery, 1998, pages 313–323. isbn: 1581130244. doi: 10.1145/289423.289460. [SN08] Konrad Slind and Michael Norrish. “A Brief Overview of HOL4”. In: Theorem Proving in Higher Order Logics. Edited by Otmane Ait Mo- hamed, César Muñoz, and Sofiène Tahar. Berlin, Heidelberg: Springer, 2008, pages 28–32. isbn: 978-3-540-71067-7.

203 [SO08] Matthieu Sozeau and Nicolas Oury. “First-Class Type Classes”. In: The- orem Proving in Higher Order Logics. Edited by Otmane Ait Mohamed, César Muñoz, and Sofiène Tahar. Berlin, Heidelberg: Springer, 2008, pages 278–293. isbn: 978-3-540-71067-7. doi: 10.1007/978-3-540- 71067 - 7 _ 23. url: https : / / www . irif . fr / ~sozeau / research / publications/First-Class_Type_Classes.pdf. [Soza] Matthieu Sozeau. Cat. url: http://mattam.org/repos/coq/cat/. [Sozb] Matthieu Sozeau. mattam82/coq polyproj. url: https://github.com/ mattam82/coq/tree/polyproj. [Soz+] Matthieu Sozeau, Hugo Herbelin, Pierre Letouzey, Jean-Christophe Fil- liâtre, Matthieu Sozeau, anonymous, Pierre-Marie Pédrot, Bruno Bar- ras, Jean-Marc Notin, Pierre Boutillier, Enrico Tassi, Stéphane Glondu, Arnaud Spiwack, Claudio Sacerdoti Coen, Christine Paulin, Olivier Desmet- tre, Yves Bertot, Julien Forest, David Delahaye, Pierre Corbineau, Julien Narboux, Matthias Puech, Benjamin Monate, Elie Soubiran, Pierre Courtieu, Vincent Gross, Judicaël Courant, Lionel Elie Mamane, Clément Re- nard, Evgeny Makarov, Claude Marché, Guillaume Melquiond, Micaela Mayero, Yann Régis-Gianas, Benjamin Grégoire, Vincent Siles, Frédéric Besson, Laurent Théry, Florent Kirchner, Maxime Dénès, Xavier Clerc, Loïc Pottier, Russel O’Connor, Assia Mahboubi, Benjamin Werner, xclerc, Huang Guan-Shieng, Jason Gross, Tom Hutchinson, Cezary Kaliszyk, Pierre, Daniel De Rauglaudre, Alexandre Miquel, Damien Doligez, Gre- gory Malecha, Stephane Glondu, and Andrej Bauer. HoTT/coq. url: https://github.com/HoTT/coq. [Soz+19] Matthieu Sozeau, Simon Boulier, Yannick Forster, Nicolas Tabareau, and Théo Winterhalter. “Coq Coq Correct! Verification of Type Check- ing and Erasure for Coq, in Coq”. In: Proceedings of the ACM on Pro- gramming Languages 4.POPL (POPL Dec. 2019). doi: 10.1145/3371076. [Soz14] Matthieu Sozeau. This commit adds full universe polymorphism and fast projections to Coq. ⋅ coq/coq@a404360. May 6, 2014. url: https:// github.com/coq/coq/commit/a4043608f704f026de7eb5167a109ca48e00c221. [Spe66] Ernst Specker. “Typical Ambiguity”. In: Logic, Methodology and Philos- ophy of Science. Edited by Ernest Nagel, Patrick Suppes, and Alfred Tarski. Volume 44. Studies in Logic and the Foundations of Mathemat- ics. Elsevier, 1966, pages 116–124. doi: 10.1007/978-3-0348-9259- 9_17. url: http://www.sciencedirect.com/science/article/pii/ S0049237X09705762. [Spi07] Arnaud Spiwack. Processor integers + Print assumption (see coqdev mailing list for the details). ⋅ coq/coq@2dbe106. May 11, 2007. url: https://github.com/coq/coq/commit/2dbe106c09b606.

204 [Spi11] Arnaud Spiwack. “Verified Computing in Homological Algebra”. PhD thesis. École Polytechnique, 2011. url: http://assert-false.net/ arnaud/papers/thesis.spiwack.pdf. [SSA96] Gerald Jay Sussman, Julie Sussman, and Harold Abelson. Structure and Interpretation of Computer Programs. English. 2nd edition. MIT Press, 1996. url: http://mitpress.mit.edu/sicp/. [Sta13] Antonios Michael Stampoulis. “VeriML: A dependently-typed, user-extensible and language-centric approach to proof assistants”. PhD thesis. Yale University, 2013. url: https://astampoulis.github.io/veriml/ dissertation.pdf. [Sta84] Ryan Stansifer. Presburger’s Article on Integer Airthmetic: Remarks and Translation. Technical report TR84-639. Cornell University, Computer Science Department, Sept. 1984. url: https://cs.fit.edu/~ryan/ papers/presburger.pdf. [Stu05] Aaron Stump. “POPLmark 1a with Named Bound Variables”. In: (Dec. 30, 2005). url: https://citeseerx.ist.psu.edu/viewdoc/download? doi=10.1.1.521.5740&rep=rep1&type=pdf. [SUM20] Daniel Selsam, Sebastian Ullrich, and Leonardo de Moura. “Tabled Type- class Resolution”. In: Computing Research Repository abs/2001.04301 (Jan. 13, 2020). arXiv: 2001.04301v2 [cs.PL]. [SW10] Bas Spitters and Eelis van der Weegen. “Developing the algebraic hierar- chy with type classes in Coq”. In: Interactive Theorem Proving. Springer, 2010. doi: 10.1007/978-3-642-14052-5_35. url: http://www.eelis. net/research/math-classes/mathclasses-diamond.pdf. [SW11] Bas Spitters and Eelis van der Weegen. “Type Classes for Mathematics in Type Theory”. In: (Feb. 7, 2011). doi: 10.1017/s0960129511000119. arXiv: 1102.1323 [cs.LO]. [TG15] Tobias Tebbi and Jason Gross. A Profiler for Ltac. Presented at The First International Workshop on Coq for PL (CoqPL’15). Jan. 2015. url: https://people.csail.mit.edu/jgross/personal-website/ papers/2015-ltac-profiler.pdf. [Tho84] Ken Thompson. “Reflections on Trusting Trust”. In: Communications of the ACM 27.8 (Aug. 1984), pages 761–763. issn: 0001-0782. doi: 10.1145/358198.358210. [TTS18] Nicolas Tabareau, Éric Tanter, and Matthieu Sozeau. “Equivalences for Free: Univalent Parametricity for Effective Transport”. In: Proc. ACM Program. Lang. 2 (July 2018). doi: 10.1145/3236787. [TTS19] Nicolas Tabareau, Éric Tanter, and Matthieu Sozeau. “The Marriage of Univalence and Parametricity”. In: ArXiv (2019). arXiv: 1909.05027 [cs.PL].

205 [Tur37] Alan M. Turing. “On Computable Numbers, with an Application to the Entscheidungsproblem”. In: Proceedings of the London Mathematical Society s2-42.1 (Jan. 1937), pages 230–265. issn: 0024-6115. doi: 10. 1112/plms/s2-42.1.230. eprint: https://academic.oup.com/plms/ article-pdf/s2-42/1/230/4317544/s2-42-1-230.pdf. [Uni13] The Univalent Foundations Program. Homotopy Type Theory: Univa- lent Foundations of Mathematics. Aug. 3, 2013. arXiv: 1308.0729v1 [math.LO]. url: http://homotopytypetheory.org/book/. [VAG+20] Vladimir Voevodsky, Benedikt Ahrens, Daniel Grayson, et al. UniMath — a computer-checked library of univalent mathematics. available at https://github.com/UniMath/UniMath. 2020. url: https://github. com/UniMath/UniMath. [VMA19] Andrea Vezzosi, Anders Mörtberg, and Andreas Abel. “Cubical Agda: A Dependently Typed Programming Language with Univalence and Higher Inductive Types”. In: Proc. ACM Program. Lang. 3 (July 2019). doi: 10.1145/3341691. [Voe15] Vladimir Voevodsky. “An experimental library of formalized Mathemat- ics based on the univalent foundations”. In: Mathematical Structures in Computer Science 25.5 (2015), pages 1278–1294. doi: 10 . 1017 / S0960129514000577. [Wad89] Philip Wadler. “Theorems for free!” In: Proceedings of the fourth inter- national conference on Functional programming languages and computer architecture. June 1989, pages 347–359. [Web02] Tjark Weber. “Program Transformations in Nuprl”. Master’s thesis. Laramie, WY: University of Wyoming, Aug. 2002. url: http://user.it.uu.se/ ~tjawe125/publications/weber02program.html. [Wen02] Markus M. Wenzel. “Isabelle/Isar — a versatile environment for human- readable formal proof documents”. Dissertation. München: Technische Universität München, 2002. url: https : / / mediatum . ub . tum . de / 601724. [Wie09] Freek Wiedijk. “Formalizing Arrow’s theorem”. In: Sādhanā 34.1 (2009), pages 193–220. issn: 0973-7677. doi: 10.1007/s12046-009-0005-1. url: http://www.cs.ru.nl/~freek/pubs/arrow.pdf. [Wik] Wikipedia contributors. Therac-25. url: https://en.wikipedia.org/ wiki/Therac-25. [Wik20a] Wikipedia contributors. Adjoint functors: Formal definitions: Definition via universal morphisms. Wikipedia, the free encyclopedia. Sept. 24, 2020. url: https : / / en . wikipedia . org / w / index . php ? title = Adjoint_functors&oldid=980061872#Definition_via_universal_ morphisms.

206 [Wik20b] Wikipedia contributors. Automath — Wikipedia, The Free Encyclopedia. [Online; accessed 26-August-2020]. 2020. url: https://en.wikipedia. org/w/index.php?title=Automath&oldid=968530682. [Wik20c] Wikipedia contributors. Nqthm — Wikipedia, The Free Encyclopedia. [Online; accessed 2020-08-25]. Wikipedia, the free encyclopedia. Aug. 25, 2020. url: https : / / en . wikipedia . org / w / index . php ? title = Nqthm&oldid=956139282. [Wil05] Olov Wilander. An E-bicategory of E-categories exemplifying a type- theoretic approach to bicategories. Technical report. University of Up- psala, 2005. url: https : / / citeseerx . ist . psu . edu / viewdoc / download?doi=10.1.1.63.5498&rep=rep1&type=pdf. [Wil12] Olov Wilander. “Constructing a small category of setoids”. In: Math- ematical Structures in Computer Science 22.1 (2012), pages 103–121. url: http://www.diva-portal.org/smash/get/diva2:399799/ FULLTEXT01.pdf. [ZC09] M. Zhivich and R. K. Cunningham. “The Real Cost of Software Errors”. In: IEEE Security & Privacy Magazine 7.2 (2009). Edited by Sean W. Smith, pages 87–90. doi: 10.1109/msp.2009.56. url: http://hdl. handle.net/1721.1/74607. [Zil+15] Beta Ziliani, Derek Dreyer, Neelakantan R. Krishnaswami, Aleksandar Nanevski, and Viktor Vafeiadis. “Mtac: A Monad for Typed Tactic Pro- gramming in Coq”. In: Journal of Functional Programming 25 (2015). url: http://plv.mpi-sws.org/mtac/journal-draft.pdf. [ZS17] Beta Ziliani and Matthieu Sozeau. “A Comprehensible Guide to a New Unifier for CIC Including Universe Polymorphism and Overloading”. In: Journal of Functional Programming 27 (2017). url: https://people. mpi-sws.org/~beta/papers/unicoq-journal.pdf.

Appendix A

Appendices for Chapter 2, The Performance Landscape in Type-Theoretic Proof Assistants

A.1 Full Example of Nested-Abstraction-Barrier Performance Issues

In Section 2.6.4, we discussed an example where unfolding nested abstraction barriers caused performance issues. Here we include the complete code for that example. (This code is also available in the file fragments/CategoryExponentialLaws.v on GitHub in the JasonGross/doctoral-thesis repository.)
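Before diving into the full development, the following toy sketch illustrates the general shape of the problem. It is not part of the original example: the definitions barrier1, barrier2, and barrier3 are hypothetical names introduced here purely for illustration. Each definitional wrapper is cheap to state, but a reduction tactic such as cbv must unfold every nested wrapper before it can compute with the underlying term, so the cost of normalizing a goal tracks the depth and size of the abstraction barriers rather than the size of the goal statement.

(* Toy sketch only; not from the original development. *)
Definition barrier1 (n : nat) : nat := n + n.
Definition barrier2 (n : nat) : nat := barrier1 (barrier1 n).
Definition barrier3 (n : nat) : nat := barrier2 (barrier2 n).

Goal barrier3 1 = 16.
Proof.
  (* [cbv] has to unfold each nested barrier before any arithmetic happens;
     with deeper nesting and heavier definition bodies, this unfolding is
     where the time goes. *)
  cbv [barrier3 barrier2 barrier1]; reflexivity.
Qed.

The category-theoretic development below exhibits the same phenomenon, with functor categories, product categories, and the currying isomorphism playing the role of the nested barriers.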

Require Import Coq.Program.Basics. Require Import Coq.Program.Tactics.

Set Primitive Projections. Set Implicit Arguments. Set Universe Polymorphism. Set Printing Width 50.

Obligation Tactic := cbv beta; trivial.

Record prod (A B:Type): Type := pair { fst : A ; snd : B }. Infix "*" := prod : type_scope. Add Printing Let prod. Notation "( x , y , .. , z )" := (pair .. (pair x y) .. z) : core_scope. Arguments pair {A B} _ _. Arguments fst {A B} _.

Arguments snd {A B} _.

Reserved Notation "g '∘' f" (at level 40, left associativity). Reserved Notation "F '0 ' x" (at level 10, no associativity, format "'[' F '0 ' ']' x"). Reserved Notation "F '1 ' m" (at level 10, no associativity, format "'[' F '1 ' ']' m"). Reserved Infix "≅" (at level 70, no associativity). Reserved Notation "x ≅ y :>>> T" (at level 70, no associativity).

Record Category := { object :> Type; morphism : object -> object -> Type;

identity : forall x, morphism x x; compose : forall s d d', morphism d d' -> morphism s d -> morphism s d' where "f '∘' g" := (compose f g);

associativity : forall x1 x2 x3 x4 (m1 : morphism x1 x2) (m2 : morphism x2 x3) (m3 : morphism x3 x4), (m3 ∘ m2) ∘ m1 = m3 ∘ (m2 ∘ m1);

left_identity : forall a b (f : morphism a b), identity b ∘ f = f; right_identity : forall a b (f : morphism a b), f ∘ identity a = f;

}.

Declare Scope category_scope. Declare Scope object_scope. Declare Scope morphism_scope. Bind Scope category_scope with Category. Bind Scope object_scope with object. Bind Scope morphism_scope with morphism. Delimit Scope morphism_scope with morphism. Delimit Scope category_scope with category. Delimit Scope object_scope with object.

Arguments identity {_} _. Arguments compose {_ _ _ _} _ _.

Infix "∘" := compose : morphism_scope. Notation "1" := (identity _) : morphism_scope. Local Open Scope morphism_scope.

Record isomorphic {C : Category} (s d : C) := { fwd : morphism C s d ; bwd : morphism C d s ; iso1 : fwd ∘ bwd = 1 ; iso2 : bwd ∘ fwd = 1 }.

Notation "s ≅ d :>>> C" := (@isomorphic C s d) : morphism_scope. Infix "≅" := isomorphic : morphism_scope.

Declare Scope functor_scope. Delimit Scope functor_scope with functor.

Local Open Scope morphism_scope.

Record Functor (C D : Category) := { object_of :> C -> D; morphism_of : forall s d, morphism C s d -> morphism D (object_of s) (object_of d); composition_of : forall s d d' (m1 : morphism C s d) (m2: morphism C d d'), morphism_of _ _ (m2 ∘ m1) = (morphism_of _ _ m2) ∘ (morphism_of _ _ m1); identity_of : forall x, morphism_of _ _ (identity x) = identity (object_of x) }.

Arguments object_of {C D} _. Arguments morphism_of {C D} _ {s d}.

Bind Scope functor_scope with Functor.

Notation "F '0 ' x" := (object_of F x) : object_scope. Notation "F '1 ' m" := (morphism_of F m) : morphism_scope.

Declare Scope natural_transformation_scope. Delimit Scope natural_transformation_scope with natural_transformation.

Module Functor. Program Definition identity (C : Category) : Functor C C := {| object_of x := x ; morphism_of s d m := m |}.

Program Definition compose (s d d' : Category) (F1 : Functor d d') (F2 : Functor s d) : Functor s d' := {| object_of x := F1 (F2 x) ; morphism_of s d m := F1 1 (F2 1 m) |}. Next Obligation. Admitted. Next Obligation. Admitted. End Functor.

Infix "∘" := Functor.compose : functor_scope. Notation "1" := (Functor.identity _) : functor_scope.

Local Open Scope morphism_scope. Local Open Scope natural_transformation_scope.

Record NaturalTransformation {C D : Category} (F G : Functor C D) := { components_of :> forall c, morphism D (F c) (G c); commutes : forall s d (m : morphism C s d), components_of d ∘ F 1 m = G 1 m ∘ components_of s }.

Bind Scope natural_transformation_scope with NaturalTransformation.

Module NaturalTransformation. Program Definition identity {C D : Category} (F : Functor C D) : NaturalTransformation F F := {| components_of x := 1 |}. Next Obligation. Admitted.

Program Definition compose {C D : Category} (s d d' : Functor C D) (T1 : NaturalTransformation d d') (T2 : NaturalTransformation s d) : NaturalTransformation s d' := {| components_of x := T1 x ∘ T2 x |}. Next Obligation. Admitted. End NaturalTransformation.

Infix "∘" := NaturalTransformation.compose : natural_transformation_scope. Notation "1" := (NaturalTransformation.identity _)

: natural_transformation_scope.

Program Definition functor_category (C D : Category) : Category := {| object := Functor C D ; morphism := @NaturalTransformation C D ; identity x := 1 ; compose s d d' m1 m2 := m1 ∘ m2 |}%natural_transformation. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted.

Notation "C -> D" := (functor_category C D) : category_scope.

Program Definition prod_category (C D : Category) : Category := {| object := C * D ; morphism s d := morphism C (fst s) (fst d) * morphism D (snd s) (snd d) ; identity x := (1, 1) ; compose s d d' m1 m2 := (fst m1 ∘ fst m2, snd m1 ∘ snd m2) |}%type%morphism. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted.

Infix "*" := prod_category : category_scope.

Program Definition Cat : Category := {| object := Category ; morphism := Functor ; compose s d d' m1 m2 := m1 ∘ m2 ; identity x := 1 |}%functor. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted.

Local Open Scope functor_scope. Local Open Scope natural_transformation_scope. Local Open Scope object_scope. Local Open Scope morphism_scope. Local Open Scope category_scope.

Arguments Build_Functor _ _ & . Arguments Build_isomorphic _ _ _ & .

Arguments Build_NaturalTransformation _ _ _ _ & . Arguments pair _ _ & . Canonical Structure default_eta {A B} (v : A * B) : A * B := (fst v, snd v). Canonical Structure pair' {A B} (a : A) (b : B) : A * B := pair a b.

Declare Scope functor_object_scope. Declare Scope functor_morphism_scope. Declare Scope natural_transformation_components_scope. Arguments Build_Functor (C D)%category_scope & _%functor_object_scope _%functor_morphism_scope (_ _)%function_scope. Arguments Build_NaturalTransformation [C D]%category_scope (F G)%functor_scope & _%natural_transformation_components_scope _%function_scope.

Notation "x : A ↦o f" := (fun x : A%category => f) (at level 70) : functor_object_scope. Notation "x ↦o f" := (fun x => f) (at level 70) : functor_object_scope. Notation "' x ↦o f" := (fun '(x%category) => f) (x strict pattern, at level 70) : functor_object_scope. Notation "m @ s --> d ↦m f" := (fun s d m => f) (at level 70) : functor_morphism_scope. Notation "' m @ s --> d ↦m f" := (fun s d 'm => f) (at level 70, m strict pattern) : functor_morphism_scope. Notation "m : A ↦m f" := (fun s d (m : A%category) => f) (at level 70) : functor_morphism_scope. Notation "m ↦m f" := (fun s d m => f) (at level 70) : functor_morphism_scope. Notation "' m ↦m f" := (fun s d '(m%category) => f) (m strict pattern, at level 70) : functor_morphism_scope. Notation "x : A ↦t f" := (fun x : A%category => f) (at level 70) : natural_transformation_components_scope. Notation "' x ↦t f" := (fun '(x%category) => f) (x strict pattern, at level 70) : natural_transformation_components_scope. Notation "x ↦t f" := (fun x => f) (at level 70) : natural_transformation_components_scope.

Notation "⟨ fo ; mo ⟩" := (@Build_Functor _ _ fo mo _ _) (only parsing) : functor_scope.

Notation "⟨ f ⟩" := (@Build_NaturalTransformation _ _ _ _ f _) (only parsing) : natural_transformation_scope.

Notation "'𝜆' ⟨ fo ; mo ⟩" := (@Build_Functor _ _ fo mo _ _) (only parsing) : functor_scope. Notation "'𝜆' ⟨ f ⟩" := (@Build_NaturalTransformation _ _ _ _ f _) (only parsing) : natural_transformation_scope.

Notation "'𝜆o ' x1 .. xn , fo ; '𝜆m ' m1 .. mn , mo" := (@Build_Functor _ _ (fun x1 => .. (fun xn => fo) .. ) (fun s d => (fun m1 => .. (fun mn => mo) .. )) _ _) (only parsing, x1 binder, xn binder, m1 binder, mn binder, at level 70) : functor_scope. Notation "'𝜆t' x1 .. xn , f" := (@Build_NaturalTransformation _ _ _ _ (fun x1 => .. (fun xn => f) .. ) _) (only parsing, x1 binder, xn binder, at level 70) : natural_transformation_scope.

(** [(C1 × C2 → D) ≅ (C1 → (C2 → D))] *) (** We denote functors by pairs of maps on objects ([↦o ]) and morphisms ([↦m ]), and natural transformations as a single map ([↦t]) *) Time Program Definition curry_iso1 (C1 C2 D : Category) : (C1 *C2 -> D) ≅ (C1 -> (C2 -> D)) :>>> Cat := {| fwd := ⟨ F ↦o ⟨ c1 ↦o ⟨ c2 ↦o F 0 (c1 , c2 ) ; m ↦m F 1 (identity c1 , m) ⟩ ; m1 ↦m ⟨ c2 ↦t F 1 (m1 , identity c2 ) ⟩⟩ ;T ↦m ⟨ c1 ↦t ⟨ c2 ↦t T (c1 , c2 ) ⟩⟩⟩; bwd := ⟨ F ↦o ⟨ '(c1 , c2 ) ↦o (F 0 c1 )0 c2 ; '(m1 , m2 ) ↦m (F 1 m1 )_ ∘ (F 0 _)1 m2 ⟩ ;T ↦m ⟨ '(c1 , c2 ) ↦t (T c1 ) c2 ⟩⟩ |}.

(** [(C1 × C2 → D) ≅ (C1 → (C2 → D))] *) (** We denote functors by pairs of maps ([𝜆]) on objects ([↦o ]) and

morphisms ([↦m ]), and natural transformations as a single map ([𝜆 ⟨ ... ↦t ... ⟩]) *) Time Program Definition curry_iso2 (C1 C2 D : Category) : (C1 *C2 -> D) ≅ (C1 -> (C2 -> D)) :>>> Cat := {| fwd := 𝜆 ⟨ F ↦o 𝜆 ⟨ c1 ↦o 𝜆 ⟨ c2 ↦o F 0 (c1 , c2 ) ; m ↦m F 1 (identity c1 , m) ⟩ ; m1 ↦m 𝜆 ⟨ c2 ↦t F 1 (m1 , identity c2 ) ⟩⟩ ;T ↦m 𝜆 ⟨ c1 ↦t 𝜆 ⟨ c2 ↦t T (c1 , c2 ) ⟩⟩⟩; bwd := 𝜆 ⟨ F ↦o 𝜆 ⟨ '(c1 , c2 ) ↦o (F 0 c1 )0 c2 ; '(m1 , m2 ) ↦m (F 1 m1 )_ ∘ (F 0 _)1 m2 ⟩ ;T ↦m 𝜆 ⟨ '(c1 , c2 ) ↦t (T c1 ) c2 ⟩⟩ |}.

(** [(C1 × C2 → D) ≅ (C1 → (C2 → D))] *) (** We denote functors by pairs of maps on objects ([𝜆o ]) and morphisms ([𝜆m ]), and natural transformations as a single map ([𝜆t]) *) Time Program Definition curry_iso3 (C1 C2 D : Category) : (C1 *C2 -> D) ≅ (C1 -> (C2 -> D)) :>>> Cat := {| fwd := 𝜆o F, 𝜆o c1 , 𝜆o c2 ,F 0 (c1 , c2 ) ; 𝜆m m , F 1 (identity c1 , m) ; 𝜆m m1 , 𝜆t c2 ,F 1 (m1 , identity c2 ) ; 𝜆m T, 𝜆t c1 , 𝜆t c2 , T (c1 , c2 ); bwd := 𝜆o F, 𝜆o '(c1 , c2 ), (F 0 c1 )0 c2 ; 𝜆m '(m1 , m2 ), (F 1 m1 )_ ∘ (F 0 _)1 m2 ; 𝜆m T, 𝜆t '(c1 , c2 ), (T c1 ) c2 |}.

(** [(C1 × C2 → D) ≅ (C1 → (C2 → D))] *) (** We provide the action of functors on objects ([object_of]) and on morphisms ([morphism_of]), and we provide the action of natural transformations on object ([components_of] *) Time Program Definition curry_iso (C1 C2 D : Category) : (C1 *C2 -> D) ≅ (C1 -> (C2 -> D)) :>>> Cat := {| fwd := {| object_of F := {| object_of c1 := {| object_of c2 := F 0 (c1 , c2 ); morphism_of _ _ m := F 1 (identity c1 , m) |}; morphism_of _ _ m1 := {| components_of c2 := F 1 (m1 , identity c2 ) |} |}; morphism_of _ _ T := {| components_of c1

216 := {| components_of c2 := T (c1 , c2 ) |} |} |}; bwd := {| object_of F := {| object_of '(c1 , c2 ) := (F 0 c1 )0 c2 ; morphism_of '(s1 , s2 ) '(d1 , d2 ) '(m1 , m2 ) := (F 1 m1 ) d2 ∘ (F 0 s1 )1 m2 |}; morphism_of s d T := {| components_of '(c1 , c2 ) := (T c1 ) c2 |} |}; |}. (* Finished transaction in 1.958 secs (1.958u,0.s) (successful) *) Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. Admitted. Next Obligation. (** 1 subgoal (ID 1464)

======forall C1 C2 D : Category, {| object_of := F ↦o {| object_of := pat ↦o (F0 (fst pat))0 (snd pat); morphism_of := pat1 @ pat --> pat0 ↦m (F1 (fst pat1)) (snd pat0) ∘ (F0 (fst pat))1 (snd pat1);

217 composition_of := curry_iso_obligation_7 F; identity_of := curry_iso_obligation_8 F |}; morphism_of := T @ s --> d ↦m {| components_of := pat ↦t T (fst pat) (snd pat); commutes := curry_iso_obligation_9 T |}; composition_of := curry_iso_obligation_11 (D:=D); identity_of := curry_iso_obligation_10 (D:=D) |} ∘ {| object_of := F ↦o {| object_of := c1 ↦o {| object_of := c2 ↦o F0 (c1 , c2 ); morphism_of := m @ s --> d ↦m F1 (1, m); composition_of := curry_iso_obligation_1 F c1 ; identity_of := curry_iso_obligation_2 F c1 |}; morphism_of := m1 @ s --> d ↦m {| components_of := c2 ↦t F1 (m1 , 1); commutes := curry_iso_obligation_3 F s d m1 |}; composition_of := curry_iso_obligation_5 F; identity_of := curry_iso_obligation_4 F |}; morphism_of := T @ s --> d ↦m {| components_of := c1 ↦t {| components_of := c2 ↦t T (c1 , c2 );

218 commutes := curry_iso_obligation_6 T c1 |}; commutes := curry_iso_obligation_12 T |}; composition_of := curry_iso_obligation_14 (D:=D); identity_of := curry_iso_obligation_13 (D:=D) |} = 1 *) (** About 48 lines *) cbv [compose Cat Functor.compose NaturalTransformation.compose]. (** 1 subgoal (ID 1469)

======forall C1 C2 D : Category, {| object_of := x ↦o {| object_of := F ↦o {| object_of := pat ↦o (F0 (fst pat))0 (snd pat); morphism_of := pat1 @ pat --> pat0 ↦m (F1 (fst pat1)) (snd pat0) ∘ (F0 (fst pat))1 (snd pat1); composition_of := curry_iso_obligation_7 F; identity_of := curry_iso_obligation_8 F |}; morphism_of := T @ s --> d ↦m {| components_of := pat ↦t T (fst pat) (snd pat); commutes := curry_iso_obligation_9 T |}; composition_of := curry_iso_obligation_11

219 (D:=D); identity_of := curry_iso_obligation_10 (D:=D) |}0 ({| object_of := F ↦o {| object_of := c1 ↦o {| object_of := c2 ↦o F0 (c1 , c2 ); morphism_of := m @ s --> d ↦m F1 (1, m); composition_of := curry_iso_obligation_1 F c1 ; identity_of := curry_iso_obligation_2 F c1 |}; morphism_of := m1 @ s --> d ↦m {| components_of := c2 ↦t F1 (m1 , 1); commutes := curry_iso_obligation_3 F s d m1 |}; composition_of := curry_iso_obligation_5 F; identity_of := curry_iso_obligation_4 F |}; morphism_of := T @ s --> d ↦m {| components_of := c1 ↦t {| components_of := c2 ↦t T (c1 , c2 ); commutes := curry_iso_obligation_6 T c1 |}; commutes := curry_iso_obligation_12 T |}; composition_of := curry_iso_obligation_14 (D:=D); identity_of := curry_iso_obligation_13 (D:=D) |}0 x); morphism_of := m @ s --> d ↦m {|

220 object_of := F ↦o {| object_of := pat ↦o (F0 (fst pat))0 (snd pat); morphism_of := pat1 @ pat --> pat0 ↦m (F1 (fst pat1)) (snd pat0) ∘ (F0 (fst pat))1 (snd pat1); composition_of := curry_iso_obligation_7 F; identity_of := curry_iso_obligation_8 F |}; morphism_of := T @ s0 --> d0 ↦m {| components_of := pat ↦t T (fst pat) (snd pat); commutes := curry_iso_obligation_9 T |}; composition_of := curry_iso_obligation_11 (D:=D); identity_of := curry_iso_obligation_10 (D:=D) |}1 ({| object_of := F ↦o {| object_of := c1 ↦o {| object_of := c2 ↦o F0 (c1 , c2 ); morphism_of := m0 @ s0 --> d0 ↦m F1 (1, m0); composition_of := curry_iso_obligation_1 F c1 ; identity_of := curry_iso_obligation_2 F c1 |}; morphism_of := m1 @ s0 --> d0

221 ↦m {| components_of := c2 ↦t F1 (m1 , 1); commutes := curry_iso_obligation_3 F s0 d0 m1 |}; composition_of := curry_iso_obligation_5 F; identity_of := curry_iso_obligation_4 F |}; morphism_of := T @ s0 --> d0 ↦m {| components_of := c1 ↦t {| components_of := c2 ↦t T (c1 , c2 ); commutes := curry_iso_obligation_6 T c1 |}; commutes := curry_iso_obligation_12 T |}; composition_of := curry_iso_obligation_14 (D:=D); identity_of := curry_iso_obligation_13 (D:=D) |}1 m); composition_of := Functor.compose_obligation_1 {| object_of := F ↦o {| object_of := pat ↦o (F0 (fst pat))0 (snd pat); morphism_of := pat1 @ pat --> pat0 ↦m (F1 (fst pat1)) (snd pat0) ∘ (F0 (fst pat))1 (snd pat1); composition_of := curry_iso_obligation_7 F; identity_of := curry_iso_obligation_8 F |}; morphism_of := T @ s --> d ↦m {| components_of := pat

222 ↦t T (fst pat) (snd pat); commutes := curry_iso_obligation_9 T |}; composition_of := curry_iso_obligation_11 (D:=D); identity_of := curry_iso_obligation_10 (D:=D) |} {| object_of := F ↦o {| object_of := c1 ↦o {| object_of := c2 ↦o F0 (c1 , c2 ); morphism_of := m @ s --> d ↦m F1 (1, m); composition_of := curry_iso_obligation_1 F c1 ; identity_of := curry_iso_obligation_2 F c1 |}; morphism_of := m1 @ s --> d ↦m {| components_of := c2 ↦t F1 (m1 , 1); commutes := curry_iso_obligation_3 F s d m1 |}; composition_of := curry_iso_obligation_5 F; identity_of := curry_iso_obligation_4 F |}; morphism_of := T @ s --> d ↦m {| components_of := c1 ↦t {| components_of := c2 ↦t T (c1 , c2 ); commutes := curry_iso_obligation_6 T c1 |}; commutes := curry_iso_obligation_12 T |}; composition_of := curry_iso_obligation_14 (D:=D);

223 identity_of := curry_iso_obligation_13 (D:=D) |}; identity_of := Functor.compose_obligation_2 {| object_of := F ↦o {| object_of := pat ↦o (F0 (fst pat))0 (snd pat); morphism_of := pat1 @ pat --> pat0 ↦m (F1 (fst pat1)) (snd pat0) ∘ (F0 (fst pat))1 (snd pat1); composition_of := curry_iso_obligation_7 F; identity_of := curry_iso_obligation_8 F |}; morphism_of := T @ s --> d ↦m {| components_of := pat ↦t T (fst pat) (snd pat); commutes := curry_iso_obligation_9 T |}; composition_of := curry_iso_obligation_11 (D:=D); identity_of := curry_iso_obligation_10 (D:=D) |} {| object_of := F ↦o {| object_of := c1 ↦o {| object_of := c2 ↦o F0 (c1 , c2 ); morphism_of := m @ s --> d ↦m F1 (1, m); composition_of := curry_iso_obligation_1

224 F c1 ; identity_of := curry_iso_obligation_2 F c1 |}; morphism_of := m1 @ s --> d ↦m {| components_of := c2 ↦t F1 (m1 , 1); commutes := curry_iso_obligation_3 F s d m1 |}; composition_of := curry_iso_obligation_5 F; identity_of := curry_iso_obligation_4 F |}; morphism_of := T @ s --> d ↦m {| components_of := c1 ↦t {| components_of := c2 ↦t T (c1 , c2 ); commutes := curry_iso_obligation_6 T c1 |}; commutes := curry_iso_obligation_12 T |}; composition_of := curry_iso_obligation_14 (D:=D); identity_of := curry_iso_obligation_13 (D:=D) |} |} = 1 *) (** About 254 lines *) cbn [object_of morphism_of components_of]. (** 1 subgoal (ID 1471)

======forall C1 C2 D : Category, {| object_of := x ↦o {| object_of := pat ↦o x0 (fst pat, snd pat);

225 morphism_of := pat1 @ pat --> pat0 ↦m x1 (fst pat1, 1) ∘ x1 (1, snd pat1); composition_of := curry_iso_obligation_7 {| object_of := c1 ↦o {| object_of := c2 ↦o x0 (c1 , c2 ); morphism_of := m @ s --> d ↦m x1 (1, m); composition_of := curry_iso_obligation_1 x c1 ; identity_of := curry_iso_obligation_2 x c1 |}; morphism_of := m1 @ s --> d ↦m {| components_of := c2 ↦t x1 (m1 , 1); commutes := curry_iso_obligation_3 x s d m1 |}; composition_of := curry_iso_obligation_5 x; identity_of := curry_iso_obligation_4 x |}; identity_of := curry_iso_obligation_8 {| object_of := c1 ↦o {| object_of := c2 ↦o x0 (c1 , c2 ); morphism_of := m @ s --> d ↦m x1 (1, m); composition_of := curry_iso_obligation_1 x c1 ; identity_of := curry_iso_obligation_2 x c1 |};

226 morphism_of := m1 @ s --> d ↦m {| components_of := c2 ↦t x1 (m1 , 1); commutes := curry_iso_obligation_3 x s d m1 |}; composition_of := curry_iso_obligation_5 x; identity_of := curry_iso_obligation_4 x |} |}; morphism_of := m @ s --> d ↦m {| components_of := pat ↦t m ( fst pat, snd pat); commutes := curry_iso_obligation_9 {| components_of := c1 ↦t {| components_of := c2 ↦t m (c1 , c2 ); commutes := curry_iso_obligation_6 m c1 |}; commutes := curry_iso_obligation_12 m |} |}; composition_of := Functor.compose_obligation_1 {| object_of := F ↦o {| object_of := pat ↦o (F0 (fst pat))0 (snd pat); morphism_of := pat1 @ pat --> pat0 ↦m (F1 (fst pat1)) (snd pat0) ∘ (F0 (fst pat))1 (snd pat1); composition_of := curry_iso_obligation_7 F; identity_of := curry_iso_obligation_8

227 F |}; morphism_of := T @ s --> d ↦m {| components_of := pat ↦t T (fst pat) (snd pat); commutes := curry_iso_obligation_9 T |}; composition_of := curry_iso_obligation_11 (D:=D); identity_of := curry_iso_obligation_10 (D:=D) |} {| object_of := F ↦o {| object_of := c1 ↦o {| object_of := c2 ↦o F0 (c1 , c2 ); morphism_of := m @ s --> d ↦m F1 (1, m); composition_of := curry_iso_obligation_1 F c1 ; identity_of := curry_iso_obligation_2 F c1 |}; morphism_of := m1 @ s --> d ↦m {| components_of := c2 ↦t F1 (m1 , 1); commutes := curry_iso_obligation_3 F s d m1 |}; composition_of := curry_iso_obligation_5 F; identity_of := curry_iso_obligation_4 F |}; morphism_of := T @ s --> d ↦m {| components_of := c1 ↦t {| components_of := c2 ↦t T (c1 , c2 ); commutes := curry_iso_obligation_6 T c1 |};

228 commutes := curry_iso_obligation_12 T |}; composition_of := curry_iso_obligation_14 (D:=D); identity_of := curry_iso_obligation_13 (D:=D) |}; identity_of := Functor.compose_obligation_2 {| object_of := F ↦o {| object_of := pat ↦o (F0 (fst pat))0 (snd pat); morphism_of := pat1 @ pat --> pat0 ↦m (F1 (fst pat1)) (snd pat0) ∘ (F0 (fst pat))1 (snd pat1); composition_of := curry_iso_obligation_7 F; identity_of := curry_iso_obligation_8 F |}; morphism_of := T @ s --> d ↦m {| components_of := pat ↦t T (fst pat) (snd pat); commutes := curry_iso_obligation_9 T |}; composition_of := curry_iso_obligation_11 (D:=D); identity_of := curry_iso_obligation_10 (D:=D) |} {| object_of := F ↦o {| object_of := c1 ↦o {| object_of := c2 ↦o F0 (c1 , c2 );

229 morphism_of := m @ s --> d ↦m F1 (1, m); composition_of := curry_iso_obligation_1 F c1 ; identity_of := curry_iso_obligation_2 F c1 |}; morphism_of := m1 @ s --> d ↦m {| components_of := c2 ↦t F1 (m1 , 1); commutes := curry_iso_obligation_3 F s d m1 |}; composition_of := curry_iso_obligation_5 F; identity_of := curry_iso_obligation_4 F |}; morphism_of := T @ s --> d ↦m {| components_of := c1 ↦t {| components_of := c2 ↦t T (c1 , c2 ); commutes := curry_iso_obligation_6 T c1 |}; commutes := curry_iso_obligation_12 T |}; composition_of := curry_iso_obligation_14 (D:=D); identity_of := curry_iso_obligation_13 (D:=D) |} |} = 1 *) (** About 200 lines *) Abort.

Import EqNotations.
Axiom to_arrow1_eq
  : forall C1 C2 D (F G : Functor C1 (C2 -> D))
           (Hoo : forall c1 c2, F c1 c2 = G c1 c2)
           (Hom : forall c1 s d (m : morphism _ s d),
               (rew [fun s => morphism D s _] (Hoo c1 s) in
                rew [morphism D _] (Hoo c1 d) in (F c1)₁ m)
               = (G c1)₁ m)
           (Hm : forall s d (m : morphism _ s d) c2,
               (rew [fun s => morphism D s _] Hoo s c2 in
                rew Hoo d c2 in (F₁ m) c2)
               = (G₁ m) c2),
    F = G.
Axiom to_arrow2_eq
  : forall C1 C2 C3 D (F G : Functor C1 (C2 -> (C3 -> D)))
           (Hooo : forall c1 c2 c3, F c1 c2 c3 = G c1 c2 c3)
           (Hoom : forall c1 c2 s d (m : morphism _ s d),
               (rew [fun s => morphism D s _] (Hooo c1 c2 s) in
                rew [morphism D _] (Hooo c1 c2 d) in (F c1 c2)₁ m)
               = (G c1 c2)₁ m)
           (Hom : forall c1 s d (m : morphism _ s d) c2,
               (rew [fun s => morphism D s _] Hooo c1 s c2 in
                rew Hooo c1 d c2 in ((F c1)₁ m) c2)
               = ((G c1)₁ m) c2)
           (Hm : forall s d (m : morphism _ s d) c2 c3,
               (rew [fun s => morphism D s _] Hooo s c2 c3 in
                rew Hooo d c2 c3 in (F₁ m) c2 c3)
               = ((G₁ m) c2 c3)),
    F = G.

Local Ltac unfold_stuff := intros; cbv [Cat compose prod_category Functor.compose NaturalTransformation.compose]; cbn [object_of morphism_of components_of].

Local Ltac fin_t := repeat first [ progress intros | reflexivity | progress cbn | rewrite left_identity | rewrite right_identity | rewrite identity_of | rewrite <- composition_of ].

Next Obligation.
Proof.
  Time solve [ intros; unshelve eapply to_arrow1_eq; unfold_stuff; fin_t ].
  (* Finished transaction in 0.061 secs (0.061u,0.s) (successful) *)
  Undo.

  Time solve [ intros; unfold_stuff; unshelve eapply to_arrow1_eq; fin_t ].
  (* Finished transaction in 0.176 secs (0.176u,0.s) (successful) *)
Qed.

Next Obligation.
Proof.
  Time solve [ intros; unshelve eapply to_arrow2_eq; unfold_stuff; fin_t ].
  (* Finished transaction in 0.085 secs (0.085u,0.s) (successful) *)
  Undo.
  Time solve [ intros; unfold_stuff; unshelve eapply to_arrow2_eq; fin_t ].
  (* Finished transaction in 0.485 secs (0.475u,0.007s) (successful) *)
Qed.

A.1.1 Example in the Category of Sets

We include here the code for the components defined in the category of sets.²

Time Definition curry_iso_components_set {C1 C2 D : Set}
  := ((fun (F : C1 * C2 -> D) => (fun c1 c2 => F (c1, c2)) : C1 -> C2 -> D),
      (fun (F : C1 * C2 -> D)
       => (fun c1 c2s c2d (m2 : c2s = c2d)
           => f_equal F (f_equal2 pair (eq_refl c1) m2))),
      (fun (F : C1 * C2 -> D)
       => (fun c1s c1d (m1 : c1s = c1d) c2
           => f_equal F (f_equal2 pair m1 (eq_refl c2)))),
      (fun F G (T : forall x : C1 * C2, F x = G x :> D) => (fun c1 c2 => T (c1, c2))),
      (fun (F : C1 -> C2 -> D) => (fun '(c1, c2) => F c1 c2) : C1 * C2 -> D),
      (fun (F : C1 -> C2 -> D)
       => (fun s d (m : s = d :> C1 * C2)
           => eq_trans (f_equal (F _) (f_equal (@snd _ _) m))
                       (f_equal (fun F => F _) (f_equal F (f_equal (@fst _ _) m)))
              : F (fst s) (snd s) = F (fst d) (snd d))),
      (fun F G (T : forall (c1 : C1) (c2 : C2), F c1 c2 = G c1 c2 :> D)
       => (fun '(c1, c2) => T c1 c2)
          : forall '((c1, c2) : C1 * C2), F c1 c2 = G c1 c2)).
(* Finished transaction in 0.009 secs (0.009u,0.s) (successful) *)

² This code is also available in the file fragments/CategoryExponentialLawsSet.v on GitHub in the JasonGross/doctoral-thesis repository.

Appendix B

Appendices for Chapter 4, A Framework for Building Verified Partial Evaluators

B.1 Additional Benchmarking Plots

B.1.1 Rewriting Without Binders

The code in Figure 4-3 in Section 4.5.1 is parameterized on both 푛, the height of the tree, and 푚, the number of rewriting occurrences per node. The plot in Figure 4-4a displays only the case of 푛 = 3. The plots in Figure B-1 display how performance scales as a function of 푛 for fixed 푚, and the plots in Figure B-2 display how performance scales as a function of 푚 for fixed 푛. Note the logarithmic scaling on the time axis in the plots in Figure B-1, as term size is proportional to 푚 ⋅ 2ⁿ.

We can see from these graphs and the ones in Figure B-2 that (a) we incur constant overhead over most of the other methods, which dominates on small examples; (b) when the term is quite large and there are few opportunities for rewriting relative to the term size (i.e., 푚 ≤ 2), we are worse than rewrite !Z.add_0_r but still better than the other methods; and (c) when there are many opportunities for rewriting relative to the term size (푚 > 2), we thoroughly dominate the other methods.

[Figure B-1: Timing of different partial-evaluation implementations for code with no binders, for fixed 푚; panels (a) 푚 = 1, (b) 푚 = 2, and (c) 푚 = 3 plot time (s) against 푛 for rewrite_strat bottomup, setoid_rewrite, rewrite_strat topdown, rewrite!, our approach including reification, cbv, etc., and our approach (only rewriting). Note that we have a logarithmic time scale, because term size is proportional to 2ⁿ.]

[Figure B-2: Timing of different partial-evaluation implementations for code with no binders for fixed 푛 (1, 2, 3, and then we jump to 9); each panel plots time (s) against 푚 for the same six implementations as in Figure B-1.]

B.1.2 Additional Information on the Fiat Cryptography Benchmark

It may also be useful to see performance results with absolute times, rather than normalized execution ratios vs. the original Fiat Cryptography implementation. Furthermore, the benchmarks fit into four quite different groupings: elements of the cross product of two algorithms (unsaturated Solinas and word-by-word Montgomery) and bitwidths of target architectures (32-bit or 64-bit). Here we provide absolute-time graphs by grouping in Figure B-3.

[Figure B-3: Timing of different partial-evaluation implementations for Fiat Cryptography vs. prime modulus (from 2⁸⁸ to 2⁵⁵⁸); panels show (a) unsaturated Solinas x32, (b) unsaturated Solinas x64, (c) word-by-word Montgomery x32, and (d) word-by-word Montgomery x64, each plotting time (s) on a logarithmic scale for our approach w/ Coq's VM, the old approach (handwritten-CPS+VM), and our approach w/ extracted OCaml.]

B.2 Additional Information on Microbenchmarks

We performed all benchmarks on a 3.5 GHz Intel Haswell running Linux and Coq 8.10.0. We name the subsections here with the names that show up in the code, which is available at the v0.0.1 tag of the mit-plv/rewriter repository on GitHub and the v0.0.5 tag of the mit-plv/fiat-crypto repository on GitHub.

B.2.1 UnderLetsPlus0

We provide more detail on the "nested binders" microbenchmark of Section 4.5.1 displayed in Figure 4-4b.

Recall that we are removing all of the + 0s from

let v₁ := v₀ + v₀ + 0 in
⋮
let vₙ := vₙ₋₁ + vₙ₋₁ + 0 in
vₙ + vₙ + 0

The code used to define this microbenchmark is

Definition make_lets_def (n : nat) (v acc : Z)
  := @nat_rect (fun _ => Z * Z -> Z)
               (fun '(v, acc) => acc + acc + v)
               (fun _ rec '(v, acc) => dlet acc := acc + acc + v in rec (v, acc))
               n
               (v, acc).
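For concreteness, here is a hedged worked example (not part of the benchmark harness) of the term this recursion produces for 푛 = 2; the example name and the binder names acc1, acc2 are ours, and we assume the dlet notation (Let_In) from the framework is in scope.

Example make_lets_def_2 (v acc : Z)
  : make_lets_def 2%nat v acc
    = (dlet acc1 := acc + acc + v in
       dlet acc2 := acc1 + acc1 + v in
       acc2 + acc2 + v).
Proof. reflexivity. Qed.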

We note some details of the rewriting framework that were glossed over in the main body of Chapter 4, which are useful for using the code: Although the rewriting framework does not support dependently typed constants, we can automatically preprocess uses of eliminators like nat_rect and list_rect into nondependent versions. The tactic that does this preprocessing is extensible via ℒtac's reassignment feature. Since pattern-matching compilation mixed with NbE requires knowing how many arguments a constant can be applied to, we must internally use a version of the recursion principle whose type arguments do not contain arrows; current preprocessing can handle recursion principles with either no arrows or one arrow in the motive. Even though we will eventually plug in 0 for 푣, we jump through some extra hoops to ensure that our rewriter cannot cheat by rewriting away the + 0 before reducing the recursion on 푛.
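To illustrate the kind of nondependent recursion principle that this preprocessing targets, here is a hedged sketch; the framework's actual preprocessed eliminators (e.g., those in its Thunked module) may be shaped differently.

(* A nat_rect variant whose motive is a constant Type rather than a function
   of the nat being eliminated, so its type contains no dependent arrows. *)
Definition nat_rect_nodep {P : Type} (O_case : P) (S_case : nat -> P -> P)
  : nat -> P
  := fix rect (n : nat) : P
     := match n with
        | O => O_case
        | S n' => S_case n' (rect n')
        end.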

We can reduce this expression in three ways.

Our Rewriter

One lemma is required for rewriting with our rewriter:

Lemma Z.add_0_r : forall z, z + 0 = z.

Creating the rewriter takes about 12 seconds on the machine we used for running the performance experiments:

Make myrew := Rewriter For (Z.add_0_r, eval_rect nat, eval_rect prod).

Recall from Subsection 4.1.1 that eval_rect is a definition provided by our framework for eagerly evaluating recursion associated with certain types. It functions by triggering typeclass resolution for the lemmas reducing the recursion principle associated to the given type. We provide instances for nat, prod, list, option, and bool. Users may add more instances if they desire.
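The lemmas that such instances provide are essentially the computation rules of the recursor; here is a hedged sketch of what they look like for nat (the library's actual statements and instance packaging may differ, and the lemma names below are ours).

Lemma nat_rect_eval_O (P : Type) (O_case : P) (S_case : nat -> P -> P)
  : nat_rect (fun _ => P) O_case S_case O = O_case.
Proof. reflexivity. Qed.

Lemma nat_rect_eval_S (P : Type) (O_case : P) (S_case : nat -> P -> P) (n : nat)
  : nat_rect (fun _ => P) O_case S_case (S n)
    = S_case n (nat_rect (fun _ => P) O_case S_case n).
Proof. reflexivity. Qed.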

setoid_rewrite and rewrite_strat

To give as many advantages as we can to the preexisting work on rewriting, we prereduce the recursion on nats using cbv before performing setoid_rewrite. (Note that setoid_rewrite cannot itself perform reduction without generating large proof terms, and rewrite_strat is not currently capable of sequencing reduction with rewriting internally due to bugs such as Coq bug #10923.) Rewriting itself is easy; we may use any of repeat setoid_rewrite Z.add_0_r, rewrite_strat topdown Z.add_0_r, or rewrite_strat bottomup Z.add_0_r.
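As a hedged, self-contained illustration of these invocations (on a toy goal rather than the generated benchmark term):

Require Import Coq.ZArith.ZArith Coq.Setoids.Setoid.
Local Open Scope Z_scope.

Goal forall x y : Z, (x + 0) + ((y + 0) + 0) = x + y.
Proof.
  intros.
  repeat setoid_rewrite Z.add_0_r.
  (* Alternatively: rewrite_strat topdown Z.add_0_r. *)
  (* Or:            rewrite_strat bottomup Z.add_0_r. *)
  reflexivity.
Qed.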

B.2.2 Plus0Tree

This is a version of Appendix B.2.1 without any let binders, discussed in Section 4.5.1 but not displayed in Figure 4-4.

We use two definitions for this microbenchmark:

Definition iter (m : nat) (acc v : Z)
  := @nat_rect (fun _ => Z -> Z)
               (fun acc => acc)
               (fun _ rec acc => rec (acc + v))
               m
               acc.
Definition make_tree (n m : nat) (v acc : Z)
  := Eval cbv [iter] in
      @nat_rect (fun _ => Z * Z -> Z)
                (fun '(v, acc) => iter m (acc + acc) v)
                (fun _ rec '(v, acc) => iter m (rec (v, acc) + rec (v, acc)) v)
                n
                (v, acc).
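For concreteness, a hedged worked example (the example name is ours, and it is not part of the benchmark harness) of the term this produces in the smallest nontrivial case:

Example make_tree_1_1 (v acc : Z)
  : make_tree 1%nat 1%nat v acc
    = (acc + acc + v) + (acc + acc + v) + v.
Proof. reflexivity. Qed.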

B.2.3 LiftLetsMap

We now discuss in more detail the "binders and recursive functions" example from Section 4.5.1.

The expression we want to get out at the end looks like:

let v_{1,1} := v + v in
⋮
let v_{1,n} := v + v in
let v_{2,1} := v_{1,1} + v_{1,1} in
⋮
let v_{2,n} := v_{1,n} + v_{1,n} in
⋮
[v_{m,1}, …, v_{m,n}]

Recall that we make this example with the code

Definition map_dbl (ls : list Z)
  := list_rect _ [] (휆 x xs rec, dlet y := x + x in y :: rec) ls.
Definition make (n : nat) (m : nat) (v : Z)
  := nat_rect _ (List.repeat v n) (휆 _ rec, map_dbl rec) m.

We can perform this rewriting in four ways; see Figure 4-4c.

Our Rewriter

One lemma is required for rewriting with our rewriter:

Lemma eval_repeat A x n : @List.repeat A x ('n) = ident.eagerly nat_rect _ [] (휆 k repeat_k, x :: repeat_k) ('n).

Recall that the apostrophe marker (') is explained in Subsection 4.1.1. Recall again from Subsection 4.1.1 that we use ident.eagerly to ask the reducer to simplify a case of primitive recursion by complete traversal of the designated argument's constructor tree. Our current version only allows a limited, hard-coded set of eliminators with ident.eagerly (nat_rect on return types with either zero or one arrows, list_rect on return types with either zero or one arrows, and List.nth_default), but nothing in principle prevents automatic generation of the necessary code.

We construct our rewriter with

Make myrew := Rewriter For (eval_repeat, eval_rect list, eval_rect nat) (with extra idents (Z.add)).

On the machine we used for running all our performance experiments, this command takes about 13 seconds to run. Note that all identifiers which appear in any goal to be rewritten must either appear in the type of one of the rewrite rules or in the tuple passed to with extra idents.

Rewriting is now relatively simple: simply invoke the tactic Rewrite_for myrew. We support rewriting on only the left-hand side or on only the right-hand side using either the tactic Rewrite_lhs_for myrew or the tactic Rewrite_rhs_for myrew, respectively.
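A hedged usage sketch follows; the benchmark harness sets up its goals differently, and the exact goal shape these tactics expect may differ from what is shown here.

Goal forall v, exists e, make 3%nat 3%nat v = e.
Proof.
  intro v; eexists.
  Rewrite_lhs_for myrew.
Abort.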

rewrite_strat

To reduce adequately using rewrite_strat, we need the following two lemmas:

Lemma lift_let_list_rect T A P N C (v : A) fls
  : @list_rect T P N C (Let_In v fls) = Let_In v (fun v => @list_rect T P N C (fls v)).
Lemma lift_let_cons T A x (v : A) f
  : @cons T x (Let_In v f) = Let_In v (fun v => @cons T x (f v)).

To rewrite, we start with cbv [example make map_dbl] to expose the underlying term to rewriting. One would hope that one could just add these two hints to a database db and then write rewrite_strat (repeat (eval cbn [list_rect]; try bottomup hints db)), but unfortunately this does not work due to a number of bugs in Coq: #10934, #10923, #4175, #10955, and the potential to hit #10972. Instead, we must put the two lemmas in separate databases and then write the code repeat (cbn [list_rect]; (rewrite_strat (try repeat bottomup hints db1)); (rewrite_strat (try repeat bottomup hints db2))). Note that the rewriting with lift_let_cons can be done either top-down or bottom-up, but rewrite_strat breaks if the rewriting with lift_let_list_rect is done top-down.

CPS and the VM

If we want to use Coq's built-in VM reduction without our rewriter, to achieve the prior state-of-the-art performance, we can do so on this example, because it only involves partial reduction and not equational rewriting. However, we must (a) module-opacify the constants which are not to be unfolded and (b) rewrite all of our code in CPS.

Then we are looking at

map_dbl_cps(ℓ, k) ≔
    k([])                                                   if ℓ = []
    let y ≔ h +ax h in map_dbl_cps(t, 휆 ys, k(y ∶∶ ys))       if ℓ = h ∶∶ t

make_cps(n, m, v, k) ≔
    k([v, …, v])  (with n copies of v)                       if m = 0
    make_cps(n, m − 1, v, 휆 ℓ, map_dbl_cps(ℓ, k))              if m > 0

example_cps_{n,m} ≔ ∀ v, make_cps(n, m, v, 휆 x. x) = []
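A hedged Coq rendering of these CPS'd definitions (a sketch only; the benchmark's actual code additionally module-opacifies the addition so that vm_compute cannot unfold it, which is not reflected here):

Require Import Coq.ZArith.ZArith Coq.Lists.List.
Local Open Scope Z_scope.

Fixpoint map_dbl_cps {T} (ls : list Z) (k : list Z -> T) : T
  := match ls with
     | nil => k nil
     | cons x xs => let y := x + x in map_dbl_cps xs (fun ys => k (cons y ys))
     end.

Fixpoint make_cps {T} (n m : nat) (v : Z) (k : list Z -> T) : T
  := match m with
     | O => k (List.repeat v n)
     | S m' => make_cps n m' v (fun ls => map_dbl_cps ls k)
     end.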

Then we can just run vm_compute. Note that this strategy, while quite fast, results in a stack overflow when 푛 ⋅ 푚 is larger than approximately 2.5 ⋅ 10⁴. This is unsurprising, as we are generating quite large terms. Our framework can handle terms of this size but stack-overflows on only slightly larger terms.

Takeaway

From this example, we conclude that rewrite_strat is unsuitable for computations involving large terms with many binders, especially in cases where reduction and rewriting need to be interwoven, and that the many bugs in rewrite_strat result in confusing gymnastics required for success. The prior state of the art (writing code in CPS), suitably tweaked by using module opacity to allow vm_compute, remains the best performer here, though the cost of rewriting everything in CPS may be prohibitive. Our method soundly beats rewrite_strat. We are additionally bottlenecked on cbv, which is used to unfold the goal post-rewriting and costs about a minute on the largest of terms; see Coq bug #11151 for a discussion on what is wrong with Coq's reduction here.

B.2.4 SieveOfEratosthenes

We define the sieve using PositiveSet.t and list Z:

Definition sieve' (fuel : nat) (max : Z) :=
  List.rev
    (fst
       (@nat_rect
          (휆 _, list Z (* primes *) * PositiveSet.t (* composites *) * positive (* np (next_prime) *)
                -> list Z (* primes *) * PositiveSet.t (* composites *))
          (휆 '(primes, composites, next_prime), (primes, composites))
          (휆 _ rec '(primes, composites, np),
             rec (if (PositiveSet.mem np composites || (Z.pos np >? max))%bool%Z
                  then (primes, composites, Pos.succ np)
                  else (Z.pos np :: primes,
                        List.fold_right
                          PositiveSet.add composites
                          (List.map (휆 n, Pos.mul (Pos.of_nat (S n)) np)
                                    (List.seq 0 (Z.to_nat (max / Z.pos np)))),
                        Pos.succ np)))
          fuel
          (nil, PositiveSet.empty, 2%positive))).

Definition sieve (n : Z) := Eval cbv [sieve'] in sieve' (Z.to_nat n) n.
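As a hedged usage check (our own example, not part of the benchmark; it assumes Z notations as in the surrounding development), the sieve can be run directly with Coq's VM:

Example sieve_30
  : sieve 30%Z = (2 :: 3 :: 5 :: 7 :: 11 :: 13 :: 17 :: 19 :: 23 :: 29 :: nil)%Z.
Proof. vm_compute; reflexivity. Qed.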

We need four lemmas and an additional instance to create the rewriter:

Lemma eval_fold_right A B f x ls : @List.fold_right A B f x ls = ident.eagerly list_rect _ _ x (휆 l ls fold_right_ls, f l fold_right_ls) ls.

Lemma eval_app A xs ys
  : xs ++ ys
    = ident.eagerly list_rect A _ ys (휆 x xs app_xs_ys, x :: app_xs_ys) xs.

Lemma eval_map A B f ls : @List.map A B f ls = ident.eagerly list_rect _ _ [] (휆 l ls map_ls, f l :: map_ls) ls.

Lemma eval_rev A xs : @List.rev A xs = (@list_rect _ (fun _ => _)) [] (휆 x xs rev_xs, rev_xs ++ [x])%list xs.

Scheme Equality for PositiveSet.tree.

Definition PositiveSet_t_beq : PositiveSet.t -> PositiveSet.t -> bool := tree_beq.

Global Instance PositiveSet_reflect_eqb : reflect_rel (@eq PositiveSet.t) PositiveSet_t_beq := reflect_of_brel internal_tree_dec_bl internal_tree_dec_lb.

We then create the rewriter with

Make myrew := Rewriter For (eval_rect nat, eval_rect prod, eval_fold_right, eval_map, do_again eval_rev, eval_rect bool, @fst_pair, eval_rect list, eval_app) (with extra idents (Z.eqb, orb, Z.gtb, PositiveSet.elements, @fst, @snd, PositiveSet.mem, Pos.succ, PositiveSet.add, List.fold_right, List.map, List.seq, Pos.mul, S, Pos.of_nat, Z.to_nat, Z.div, Z.pos, O, PositiveSet.empty)) (with delta).

To get cbn and simpl to unfold our term fully, we emit

Global Arguments Pos.to_nat !_ / .

B.3 Reading the Code

As mentioned in Appendix B.2, the code described and used in Chapter 4 is available at the v0.0.1 tag of the mit-plv/rewriter repository on GitHub and the v0.0.5 tag of the mit-plv/fiat-crypto repository on GitHub. Both repositories build with Coq 8.9, 8.10, 8.11, and 8.12, and they require that whichever OCaml was used to build Coq be installed on the system to permit building plugins. (If Coq was installed via opam, then the correct version of OCaml will automatically be available.) Both code bases can be built by running make in the top-level directory.

The performance data for both repositories are included at the top level as .txt and .csv files on different branches. The rewriter repository has performance data available on the branch PhD-Dissertation-2021-perf-data, and the fiat-crypto repository has performance data available on the branch PhD-Dissertation-2021-perf-data.

The performance data for the microbenchmarks can be rebuilt using make perf-SuperFast perf-Fast perf-Medium followed by make perf-csv to get the .txt and .csv files. The microbenchmarks should run in about 24 hours when run with -j5 on a 3.5 GHz machine. There also exist targets perf-Slow and perf-VerySlow, but these take significantly longer.

The performance data for the macrobenchmark can be rebuilt from the Fiat Cryptography repository by running make perf -k. We ran this with PERF_MAX_TIME=3600 to allow each benchmark to run for up to an hour; the default is 10 minutes per benchmark. Expect the benchmarks to take over a week of time with an hour timeout and five cores. Some tests are expected to fail, making -k a necessary flag. Again, the perf-csv target will aggregate the logs and turn them into .txt and .csv files.

The entry point for the rewriter is the Coq source file rewriter/src/Rewriter/Util/plugins/RewriterBuild.v.

The rewrite rules used in Fiat Cryptography are defined in fiat-crypto/src/Rewriter/Rules.v and proven in fiat-crypto/src/Rewriter/RulesProofs.v. Note that the Fiat Cryptography copy uses COQPATH for dependency management and .dir-locals.el to set COQPATH in Emacs/PG; you must accept the setting when opening a file in the directory for interactive compilation to work. Thus interactive editing either requires ProofGeneral or manual setting of COQPATH. The correct value of COQPATH can be found by running make printenv.

We will now go through Chapter 4 and describe where to find each reference in the code base.

B.3.1 Code from Section 4.1, Introduction

Code from Subsection 4.1.1, A Motivating Example

The prefixSums example appears in the Coq source file rewriter/src/Rewriter/Rewriter/Examples/PrefixSums.v. Note that we use dlet rather than let in binding acc' so that we can preserve the let binder even under 휄 reduction, which much of Coq's infrastructure performs eagerly. Because we do not depend on the axiom of functional extensionality, we also in practice require Proper instances for each higher-order identifier saying that each constant respects function extensionality. Although we glossed over this detail in the body of Chapter 4, we also prove

Global Instance: forall A B, Proper ((eq ==> eq ==> eq) ==> eq ==> eq ==> eq) (@fold_left A B).
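For intuition, here is a hedged, self-contained example of this kind of obligation for a different higher-order identifier, List.map; this is not the library's instance, and the name map_Proper_eq is ours.

Require Import Coq.Lists.List Coq.Classes.Morphisms.

Global Instance map_Proper_eq {A B}
  : Proper ((eq ==> eq) ==> eq ==> eq) (@List.map A B).
Proof.
  cbv [Proper respectful]; intros f g Hfg ls ls' ?; subst ls'.
  induction ls as [|x xs IH]; cbn [List.map]; [ reflexivity | ].
  f_equal; [ apply Hfg; reflexivity | exact IH ].
Qed.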

The Make command is exposed in rewriter/src/Rewriter/Util/plugins/RewriterBuild.v and defined in rewriter/src/Rewriter/Util/plugins/rewriter_build_plugin.mlg. Note that one must run make to create this latter file; it is copied over from a version-specific file at the beginning of the build.

The do_again, eval_rect, and ident.eagerly constants are defined at the bottom of module RewriteRuleNotations in rewriter/src/Rewriter/Language/Pre.v.

Code from Subsection 4.1.2, Concerns of Trusted-Code-Base Size

There is no code mentioned in this section.

Code from Subsection 4.1.3, Our Solution

We claimed that our solution meets five criteria. We briefly justify each criterion with a sentence or a pointer to code:

• We claimed that we did not grow the trusted base. In any example file (of which a couple can be found in rewriter/src/Rewriter/Rewriter/Examples/ ), the Make command creates a rewriter package. Running Print Assumptions on this new constant (often named rewriter or myrew) should demonstrate a lack of axioms. Print Assumptions may also be run on the proof that results from using the rewriter.

• We claimed fast partial evaluation with reasonable memory use; we assume that the performance graphs stand on their own to support this claim. Note that memory usage can be observed by making the benchmarks while passing TIMED=1 to make.

• We claimed to allow reduction that mixes rules of the definitional equality with equalities proven explicitly as theorems; the "rules of the definitional equality" are, for example, 훽 reduction, and we assert that it should be self-evident that our rewriter supports this.

• We claimed common-subterm sharing preservation. This is implemented by supporting the use of the dlet notation which is defined in rewriter/src/Rewriter/Util/LetIn.v via the Let_In constant. We will come back to the infrastructure that supports this.

• We claimed extraction of standalone partial evaluators. The extraction is performed in the Coq source files perf_unsaturated_solinas.v and perf_word_by_word_montgomery.v, and the Coq files saturated_solinas.v, unsaturated_solinas.v, and word_by_word_montgomery.v, all in the directory fiat-crypto/src/ExtractionOCaml/. The OCaml code can be extracted and built using the target make standalone- (or make perf-standalone for the perf binaries). There may be some issues with building these binaries on Windows, as some versions of ocamlopt on Windows seem not to support outputting binaries without the .exe extension.

The P-384 curve is mentioned. This is the curve with modulus 2³⁸⁴ − 2¹²⁸ − 2⁹⁶ + 2³² − 1; its benchmarks can be found in fiat-crypto/src/Rewriter/PerfTesting/Specific/generated/p2384m2128m296p232m1*word_by_word_montgomery*. The output .log files are included in the tarball; the .v and .sh files are automatically generated in the course of running make perf -k.

We mention integration with abstract interpretation; the abstract-interpretation pass is implemented in fiat-crypto/src/AbstractInterpretation/.

B.3.2 Code from Section 4.2, Trust, Reduction, and Rewriting

The individual rewritings mentioned are implemented via the Rewrite_* tactics exported at the top of rewriter/src/Rewriter/Util/plugins/RewriterBuild.v. These tactics bottom out in tactics defined at the bottom of rewriter/src/Rewriter/Rewriter/AllTactics.v.

Code from Subsection 4.2.1, Our Approach in Nine Steps

We match the nine steps with functions from the source code:

1. The given lemma statements are scraped for which named functions and types the rewriter package will support. This is performed by rewriter_scrape_data in the file rewriter/src/Rewriter/Util/plugins/rewriter_build.ml, which invokes the ℒtac tactic named make_scrape_data in a submodule in the source file rewriter/src/Rewriter/Language/IdentifiersBasicGenerate.v on a goal headed by the constant we provide as Pre.ScrapedData.t_with_args in rewriter/src/Rewriter/Language/PreCommon.v.

2. Inductive types enumerating all available primitive types and functions are emitted. This step is performed by rewriter_emit_inductives in the file rewriter/src/Rewriter/Util/plugins/rewriter_build.ml, invoking tactics, such as make_base_elim in the Coq source file rewriter/src/Rewriter/Language/IdentifiersBasicGenerate.v, on goals headed by constants from rewriter/src/Rewriter/Language/IdentifiersBasicLibrary.v (including, for example, the constant base_elim_with_args), to turn scraped data into eliminators for the inductives. The actual emitting of inductives is performed by code in the file rewriter/src/Rewriter/Util/plugins/inductive_from_elim.ml.

3. Tactics generate all of the necessary definitions and prove all of the necessary lemmas for dealing with this particular set of inductive codes. This step is performed by the tactic make_rewriter_of_scraped_and_ind in the source file rewriter/src/Rewriter/Util/plugins/rewriter_build.ml, which invokes the tactic make_rewriter_all defined in the file rewriter/src/Rewriter/Rewriter/AllTactics.v on a goal headed by the constant VerifiedRewriter_with_ind_args defined in rewriter/src/Rewriter/Rewriter/ProofsCommon.v. The definitions emitted can be found by looking at the tactic Build_Rewriter in rewriter/src/Rewriter/Rewriter/AllTactics.v, the ℒtac tactics build_package in rewriter/src/Rewriter/Language/IdentifiersBasicGenerate.v and also in rewriter/src/Rewriter/Language/IdentifiersGenerate.v (there is a different tactic named build_package in each of these files), and the ℒtac tactic prove_package_proofs_via, which can be found in rewriter/src/Rewriter/Language/IdentifiersGenerateProofs.v.

4. The statements of rewrite rules are reified, and soundness and syntactic-well-formedness lemmas are proven about each of them. This is done as part of the previous step, when the tactic make_rewriter_all transitively calls Build_Rewriter from rewriter/src/Rewriter/Rewriter/AllTactics.v. Reification is handled by the tactic Build_RewriterT in rewriter/src/Rewriter/Rewriter/Reify.v, while soundness and the syntactic-well-formedness proofs are handled by the tactics prove_interp_good and prove_good respectively, both in rewriter/src/Rewriter/Rewriter/ProofsCommonTactics.v.

5. The definitions needed to perform reification and rewriting and the lemmas needed to prove correctness are assembled into a single package that can be passed by name to the rewriting tactic. This step is also performed by the tactic make_rewriter_of_scraped_and_ind in the source file rewriter/src/Rewriter/Util/plugins/rewriter_build.ml.

When we want to rewrite with a rewriter package in a goal, the following steps are performed, with code in the following places:

1. We rearrange the goal into a closed logical formula: all free-variable quantification in the proof context is replaced by changing the equality goal into an equality between two functions (taking the free variables as inputs). Note that it is not actually an equality between two functions but rather an equiv between two functions, where equiv is a custom relation we define, indexed over type codes, that is equality up to function extensionality. This step is performed by the tactic generalize_hyps_for_rewriting in rewriter/src/Rewriter/Rewriter/AllTactics.v.

2. We reify the side of the goal we want to simplify, using the inductive codes in the specified package. That side of the goal is then replaced with a call to the denotation function on the reified version. This step is performed by the tactic do_reify_rhs_with in rewriter/src/Rewriter/Rewriter/AllTactics.v.

3. We use a theorem stating that rewriting preserves denotations of well-formed terms to replace the denotation subterm with the denotation of the rewriter applied to the same reified term. We use Coq's built-in full reduction (vm_compute) to reduce the application of the rewriter to the reified term. This step is performed by the tactic do_rewrite_with in rewriter/src/Rewriter/Rewriter/AllTactics.v.

4. Finally, we run cbv (a standard call-by-value reducer) to simplify away the invocation of the denotation function on the concrete syntax tree from rewriting. This step is performed by the tactic do_final_cbv in rewriter/src/Rewriter/Rewriter/AllTactics.v.

These steps are put together in the tactic Rewrite_for_gen in rewriter/src/Rewriter/Rewriter/AllTactics.v.

Our Approach in More Than Nine Steps

As the nine steps of Subsection 4.2.1 do not exactly match the code, we describe here a more accurate version of what is going on. For ease of readability, we do not clutter this description with references to the code, instead allowing the reader to match up the steps here with the more coarse-grained ones in Subsection 4.2.1 or Section B.3.2.

In order to allow easy invocation of our rewriter, a great deal of code (about 6500 lines) needed to be written. Some of this code is about reifying rewrite rules into a form that the rewriter can work with. Other code is about proving that the reified rewrite rules preserve interpretation and are well-formed. We wrote some plugin code to automatically generate the inductive type of base-type codes and identifier codes, as well as the two variants of the identifier-code inductive used internally in the rewriter. One interesting bit of code that resulted was a plugin that can emit an inductive declaration given the Church encoding (or eliminator) of the inductive type to be defined. We wrote a great deal of tactic code to prove basic properties about these inductive types, from the fact that one can unify two identifier codes and extract constraints on their type variables from this unification, to the fact that type codes have decidable equality. Additional plugin code was written to invoke the tactics that construct these definitions and prove these properties, so that we could generate an entire rewriter from a single command, rather than having the user separately invoke multiple commands in sequence.

In order to build the precomputed rewriter, the following actions are performed:

1. The terms and types to be supported by the rewriter package are scraped from the given lemmas.

2. An inductive type of codes for the types is emitted, and then three different versions of inductive codes for the identifiers are emitted (one with type arguments, one with type arguments supporting pattern type variables, and one without any type arguments, to be used internally in pattern-matching compilation).

3. Tactics generate all of the necessary definitions and prove all of the necessary lemmas for dealing with this particular set of inductive codes. Definitions cover categories like "Boolean equality on type codes" and "how to extract the pattern type variables from a given identifier code," and lemma categories include "type codes have decidable equality" and "the types being coded for have decidable equality" and "the identifiers all respect function extensionality."

4. The rewrite rules are reified, and we prove interpretation-correctness and well-formedness lemmas about each of them.

5. The definitions needed to perform reification and rewriting and the lemmas needed to prove correctness are assembled into a single package that can be passed by name to the rewriting tactic.

6. The denotation functions for type and identifier codes are marked for early expansion in the kernel via the Strategy command; this is necessary for conversion at Qed-time to perform reasonably on enormous goals.

When we want to rewrite with a rewriter package in a goal, the following steps are performed:

1. We use etransitivity to allow rewriting separately on the left- and right-hand sides of an equality. Note that we do not currently support rewriting in nonequality goals, but this is easily worked around using let v := open_constr:(_) in replace _ with v and then rewriting in the second goal.

2. We revert all hypotheses mentioned in the goal and change the form of the goal from a universally quantified statement about equality into a statement that two functions are extensionally equal. Note that this step will fail if any hypotheses are functions not known to respect function extensionality via typeclass search.

3. We reify the side of the goal that is not an existential variable using the inductive codes in the specified package; the resulting goal equates the denotation of the newly reified term with the original evar.

4. We use a lemma stating that rewriting preserves denotations of well-formed terms to replace the goal with the rewriter applied to our reified term. We use vm_compute to prove the well-formedness side condition reflectively. We use vm_compute again to reduce the application of the rewriter to the reified term.

5. Finally, we run cbv to unfold the denotation function, and we instantiate the evar with the resulting rewritten term.

B.3.3 Code from Section 4.3, The Structure of a Rewriter

The expression language 푒 corresponds to the inductive expr type defined in module Compilers.expr in rewriter/src/Rewriter/Language/Language.v.

Code from Subsection 4.3.1, Pattern-Matching Compilation and Evaluation

The pattern-matching compilation step is done by the tactic CompileRewrites in rewriter/src/Rewriter/Rewriter/Rewriter.v, which just invokes the Gallina definition named compile_rewrites with ever-increasing amounts of fuel until it succeeds. (It should never fail for reasons other than insufficient fuel, unless there is a bug in the code.) The workhorse function here is compile_rewrites_step.

The decision-tree-evaluation step is done by the definition eval_rewrite_rules, also in the file rewriter/src/Rewriter/Rewriter/Rewriter.v. The correctness lemmas are the theorem eval_rewrite_rules_correct in the file rewriter/src/Rewriter/Rewriter/InterpProofs.v and the theorem wf_eval_rewrite_rules in rewriter/src/Rewriter/Rewriter/Wf.v. Note that the second of these lemmas, not mentioned in Chapter 4, is effectively saying that for two related syntax trees, eval_rewrite_rules picks the same rewrite rule for both. (We actually prove a slightly weaker lemma, which is a bit harder to state in English.)

The third step of rewriting with a given rule is performed by the rewrite_with_rule definition in rewriter/src/Rewriter/Rewriter/Rewriter.v. The correctness proof goes by the name interp_rewrite_with_rule in the file rewriter/src/Rewriter/Rewriter/InterpProofs.v. Note that the well-formedness-preservation proof for this definition is inlined into the proof of the lemma wf_eval_rewrite_rules mentioned above.

The inductive description of decision trees is decision_tree in rewriter/src/Rewriter/Rewriter/Rewriter.v.

The pattern language is defined as the inductive pattern in rewriter/src/Rewriter/Rewriter/Rewriter.v. Note that we have a Raw version and a typed version; the pattern-matching compilation and decision-tree evaluation of Aehlig, Haftmann, and Nipkow [AHN08] is an algorithm on untyped patterns and untyped terms. We found that trying to maintain typing constraints led to headaches with dependent types. Therefore, when doing the actual decision-tree evaluation, we wrap all of our expressions in the dynamically typed rawexpr type and all of our patterns in the dynamically typed Raw.pattern type. We also emit separate inductives of identifier codes for each of the expr, pattern, and Raw.pattern type families.
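For intuition, here is a hedged and highly simplified sketch of the shape such an untyped pattern type takes; the actual Raw.pattern in the code carries more information, and the names below are ours.

Inductive raw_pattern (ident : Type) : Type :=
| Wildcard : raw_pattern ident
| Ident (idc : ident) : raw_pattern ident
| App (f x : raw_pattern ident) : raw_pattern ident.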

We partially evaluate the partial evaluator defined by eval_rewrite_rules in the ℒtac tactic make_rewrite_head in rewriter/src/Rewriter/Rewriter/Reify.v.

Code from Subsection 4.3.2, Adding Higher-Order Features

The type NbEₜ mentioned in Subsection 4.3.2 is not actually used in the code; the version we have is described in Subsection 4.4.2 as the definition value' in rewriter/src/Rewriter/Rewriter/Rewriter.v.

The functions reify and reflect are defined in rewriter/src/Rewriter/Rewriter/ Rewriter.v and share names with the functions in Chapter 4. The function reduce is named rewrite_bottomup in the code, and the closest match to NbE is rewrite.

B.3.4 Code from Section 4.4, Scaling Challenges

Code from Subsection 4.4.1, Variable Environments Will Be Large

The inductives type, base_type (actually the inductive type base.type.type in the linked code), and expr, as well as the definition Expr, are all defined in rewriter/src/Rewriter/Language/Language.v. The definition denoteT is type.interp (the fixpoint interp in the module type) in rewriter/src/Rewriter/Language/Language.v. The definition denoteE is expr.interp, and DenoteE is the fixpoint expr.Interp.

As mentioned above, nbeT does not actually exist as stated but is close to value' in rewriter/src/Rewriter/Rewriter/Rewriter.v. The functions reify and reflect are defined in rewriter/src/Rewriter/Rewriter/Rewriter.v and share names with the functions in Chapter 4. The actual code is somewhat more complicated than the version presented in Chapter 4, due to needing to deal with converting well-typed-by-construction expressions to dynamically typed expressions for use in decision-tree evaluation and also due to the need to support early partial evaluation against a concrete decision tree. Thus the version of reflect that actually invokes rewriting at base types is a separate definition assemble_identifier_rewriters, while reify invokes a version of reflect (named reflect) that does not call rewriting. The function named reduce is what we call rewrite_bottomup in the code; the name Rewrite is shared between Chapter 4 and the code. Note that we eventually instantiate the argument rewrite_head of rewrite_bottomup with a partially evaluated version of the definition named assemble_identifier_rewriters. Note also that we use fuel to support do_again, and this is used in the definition repeat_rewrite that calls rewrite_bottomup.

The correctness proofs are InterpRewrite in the Coq source file rewriter/src/Rewriter/Rewriter/InterpProofs.v and Wf_Rewrite in rewriter/src/Rewriter/Rewriter/Wf.v.

Packages containing rewriters and their correctness theorems are in the record VerifiedRewriter in rewriter/src/Rewriter/Rewriter/ProofsCommon.v; a package of this type is then passed to the tactic Rewrite_for_gen from rewriter/src/Rewriter/Rewriter/AllTactics.v to perform the actual rewriting. The correspondence of the code to the various steps in rewriting is described in the second list of Section B.3.2.

Code from Subsection 4.4.2, Subterm Sharing Is Crucial

To run the P-256 example in Fiat Cryptography, after building the library, run the code

Require Import Crypto.Rewriter.PerfTesting.Core. Require Import Crypto.Util.Option.

Import WordByWordMontgomery. Import Core.RuntimeDefinitions.

Definition p : params := Eval compute in invert_Some (of_string "2^256-2^224+2^192+2^96-1" 64).

Goal True.
  (* Successful run: *)
  Time let v := (eval cbv -[Let_In
                            runtime_nth_default runtime_add runtime_sub
                            runtime_mul runtime_opp runtime_div runtime_modulo
                            RT_Z.add_get_carry_full RT_Z.add_with_get_carry_full
                            RT_Z.mul_split]
                   in (GallinaDefOf p)) in
       idtac.
  (* Unsuccessful OOM run: *)
  Time let v := (eval cbv -[(*Let_In*)
                            runtime_nth_default runtime_add runtime_sub
                            runtime_mul runtime_opp runtime_div runtime_modulo
                            RT_Z.add_get_carry_full RT_Z.add_with_get_carry_full
                            RT_Z.mul_split]
                   in (GallinaDefOf p)) in
       idtac.
Abort.

The UnderLets monad is defined in the file rewriter/src/Rewriter/Language/UnderLets.v.

The definitions nbeT', nbeT, and nbeT_with_lets are in rewriter/src/Rewriter/Rewriter/Rewriter.v and are named value', value, and value_with_lets, respectively.

Code from Subsection 4.4.3, Rules Need Side Conditions

The "variant of pattern variable that only matches constants" is actually special support for the reification of ident.literal (defined in the module RewriteRuleNotations in rewriter/src/Rewriter/Language/Pre.v) threaded throughout the rewriter. The apostrophe notation ' is also introduced in the module RewriteRuleNotations in rewriter/src/Rewriter/Language/Pre.v. The support for side conditions is handled by permitting rewrite-rule-replacement expressions to return option expr instead of expr, allowing the function expr_to_pattern_and_replacement in the file rewriter/src/Rewriter/Rewriter/Reify.v to fold the side conditions into a choice of whether to return Some or None.
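A hedged sketch of the idea (not the framework's actual encoding, and the name below is ours): a replacement that computes its side condition and signals failure with None, so that the rewriter can decline to fire the rule.

Require Import Coq.ZArith.ZArith.

(* Example rule shape: rewrite a shift x << n to x * 2^n only when 0 <= n;
   returning None means "do not rewrite". *)
Definition shiftl_to_mul_sketch (x n : Z) : option Z
  := if (0 <=? n)%Z then Some (x * 2 ^ n)%Z else None.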

Code from Subsection 4.4.4, Side Conditions Need Abstract Interpretation

The abstract-interpretation pass is defined in the source folder fiat-crypto/src/AbstractInterpretation/, and the rewrite rules handling abstract-interpretation results are the Gallina definitions arith_with_casts_rewrite_rulesT, as well as strip_literal_casts_rewrite_rulesT, fancy_with_casts_rewrite_rulesT, and finally mul_split_rewrite_rulesT, all defined in fiat-crypto/src/Rewriter/Rules.v.

The clip function is the definition ident.cast in fiat-crypto/src/Language/PreExtra.v.

Code from Subsection 4.4.5, Limitations and Preprocessing

The ℒtac hooks for extending the preprocessing of eliminators are the ℒtac tactics reify_preprocess_extra and reify_ident_preprocess_extra in a submodule of rewriter/src/Rewriter/Language/PreCommon.v. These hooks are called by reify_preprocess and reify_ident_preprocess in a submodule of rewriter/src/Rewriter/Language/Language.v. Some recursion lemmas for use with these tactics are defined in the Thunked module in fiat-crypto/src/Language/PreExtra.v. These tactics are overridden in the Coq source file fiat-crypto/src/Language/IdentifierParameters.v.

The typeclass associated to eval_rect (c.f. Section B.3.1) is rules_proofs_for_eager_type defined in rewriter/src/Rewriter/Language/Pre.v. The instances we provide by default are defined in a submodule of src/Rewriter/Language/PreLemmas.v.

The hard-coding of the eliminators for use with ident.eagerly (c.f. Section B.3.1) is done in the tactics reify_ident_preprocess and rewrite_interp_eager in rewriter/src/Rewriter/Language/Language.v, in the inductive type restricted_ident and the typeclass BuildEagerIdentT in rewriter/src/Rewriter/Language/Language.v, and in the ℒtac tactic handle_reified_rewrite_rules_interp defined in the file rewriter/src/Rewriter/Rewriter/ProofsCommonTactics.v.

The Let_In constant is defined in rewriter/src/Rewriter/Util/LetIn.v.

B.3.5 Code from Section 4.5, Evaluation

Code from Subsection 4.5.1, Microbenchmarks

This code is found in the files in rewriter/src/Rewriter/Rewriter/Examples/. We ran the microbenchmarks using the code in rewriter/src/Rewriter/Rewriter/Examples/PerfTesting/Harness.v together with some Makefile cleverness.

The code from Section 4.5.1, Rewriting Without Binders can be found in Plus0Tree.v.

The code from Section 4.5.1, Rewriting Under Binders can be found in UnderLetsPlus0.v.

The code used for the performance investigation described in Section 4.5.1, Performance Bottlenecks of Proof-Producing Rewriting, is not part of the framework we are presenting.

The code from Section 4.5.1, Binders and Recursive Functions can be found in LiftLetsMap.v.

The code from Section 4.5.1, Full Reduction can be found in SieveOfEratosthenes.v.

Code from Subsection 4.5.2, Macrobenchmark: Fiat Cryptography

The rewrite rules are defined in fiat-crypto/src/Rewriter/Rules.v and proven in the file fiat-crypto/src/Rewriter/RulesProofs.v. They are turned into rewriters in the various files in fiat-crypto/src/Rewriter/Passes/. The shared inductives and definitions are defined in the Coq source files fiat-crypto/src/Language/IdentifiersBasicGENERATED.v, fiat-crypto/src/Language/IdentifiersGENERATED.v, and fiat-crypto/src/Language/IdentifiersGENERATEDProofs.v. Note that we invoke the subtactics of the Make command manually to increase parallelism in the build and to allow a shared language across multiple rewriter packages.

Appendix C

Notes on the Benchmarking Setup

We performed most benchmarks in this dissertation on a 3.5 GHz Intel Haswell running Linux and Coq 8.12.2 with OCaml 4.06.1. We describe in this appendix exceptions to this benchmarking setup.

Excepting the benchmarks in Chapters 4 and 6 and Appendix B, all of the benchmarks can be found in the GitHub repository https://github.com/JasonGross/doctoral-thesis in the folder performance-experiments and the folder performance-experiments-8-9. Most benchmarks in this dissertation, excepting those that depend on external plugins or very large codebases, have been ported to https://github.com/coq-community/coq-performance-tests, where we expect they will continue to be updated to work with the latest version of Coq.

C.1 Plots in Chapter 1, Background

We collected data for Figure 1-2 with Coq 8.8.2. Due to various changes in notation printing, the code-printing pipeline for this old version of Fiat Cryptography does not work correctly with Coq versions ≥ 8.9.

C.2 Plots in Chapter 2, The Performance Landscape in Type-Theoretic Proof Assistants

Figure 2-1 is the same as Figure 1-2 which was discussed in Appendix C.1. As in Figure 2-1 and for the same reasons, we collected data for Figure 2-2 with Coq 8.8.2.

We collected data for Figures 2-6 and 2-7 with Coq 8.9.1 because Coq 8.10 and later do not show the relevant superlinear behavior due to Coq PR #9586.

C.3 Plots in Chapter 4, A Framework for Building Verified Partial Evaluators

All plots in Chapter 4 and its appendix (Appendix B) were constructed using measurements from Coq 8.10.0. Gathering the data for these plots takes over a week, and we were loath to repeatedly rerun the benchmarks using newer versions of Coq as they came out.
