Don't Mind the Formalization

DON’T MIND THE FORMALIZATION GAP: THE DESIGN AND USAGE OF HS-TO-COQ

Antal Spector-Zabusky

A DISSERTATION in Computer and Information Science

Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

2021

Supervisor of Dissertation
Stephanie Weirich, Professor of Computer and Information Science

Graduate Group Chairperson
Mayur Naik, Professor of Computer and Information Science

Dissertation Committee
Steve Zdancewic, Professor of Computer and Information Science, Chair
Benjamin Pierce, Professor of Computer and Information Science
Mayur Naik, Professor of Computer and Information Science
Andrew Appel, Professor of Computer Science, Princeton

DON’T MIND THE FORMALIZATION GAP: THE DESIGN AND USAGE OF HS-TO-COQ
COPYRIGHT 2021 Antal Spector-Zabusky

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-sa/4.0/

Acknowledgments

First and foremost, I want to extend my deepest thanks to my advisor, Stephanie Weirich. I still remember being an uncertain PhD student coming into her office to change projects and hearing her say, “I have a great project for you”. I was nervous it wouldn’t be a good fit, but from the moment the next words were out of her mouth I knew it was perfect. She understood me from the get-go, both as a researcher and as a person, and the lessons I took from her mentorship in research, programming, theorem proving, and writing made me the programming language theorist I am today. Stephanie’s keen mind, her insightful problem-solving skills, her persistence, her eagerness to get hands on with code, and her patience made her the best advisor I could have asked for. And her passion for board games was crucially important icing on the cake!
Without you, Stephanie, this document – and everything it stands for – wouldn’t be here.

In a similar vein, I want to extend my thanks to Benjamin Pierce, with whom I worked when I first arrived at Penn. Benjamin knew how to get me involved with research, bringing me into collaborations both inside and outside Penn and ensuring my entry into the programming language research community; he also gave me my first graduate-level lessons in writing good research papers and giving high-quality talks, skills which I have taken to heart and have kept cultivating. Benjamin also helped me through my struggle to find a thesis project that suited me, and was dedicated to helping me find that project no matter who was running it. Thank you, Benjamin, for getting me started, and for knowing when to let me fly.

I also want to thank the rest of my committee, for reading this dissertation and more. Thank you to Steve Zdancewic, who has been there since the beginning; beyond this thesis, he’s helped me with many a presentation and sparred with me in many a board game. Thank you to Andrew Appel, whose presence and advice as part of DeepSpec helped bring a different perspective to my research goals. And thank you to Mayur Naik for stepping into the DeepSpec world to understand my work.

Beyond my committee, I have been supported by more people than I can name. Thank you to my Penn PL cohort, who entered the PhD with me: Leonidas Lampropoulos, Jennifer Paykin, and Robert Rand. It was a sincere pleasure to navigate the PhD together with them, from classes to Quizzo to research and finally (finally!) to graduating. I’m glad to call them my colleagues, but I’m even more grateful to call them my friends. And thank you as well to Kenny Foner, Alex Burka, Sonia Roberts, and my other Penn friends for their academic, institutional, and personal support.

In the same vein, my thanks go out to all my other collaborators, because research is not an island.
Thank you to the hs-to-coq team, without whom the work in this dissertation wouldn’t have been possible: Stephanie, Yao Li, Joachim Breitner, Christine Rizkallah, and John Wiegley. Going back further, thank you to my earliest collaborators, on testing and micro-policies: Benjamin, Leo, Arthur Azevedo de Amorim, Cătălin Hrițcu, John Hughes, Dimitrios Vytiniotis, Nick Giannarakis, and Andrew Tolmach. And of course, my thanks to everyone in Penn PL Club not just for their academic support, but also for making Fridays (and other days) better.

Even before Penn, my thanks to the Williams CS department for nurturing me and my love of computer science. In particular, my thanks to my undergraduate advisor Steve Freund, for teaching me about programming language theory, introducing me to PL research, and for helping me navigate my path forward; and to Jeannie Albrecht, for giving me the opportunity to do my very first computer science research project.

I also want to thank GET-UP for supporting all the graduate students at Penn. I’m proud of what we built in solidarity with each other, and although we didn’t win, I believe we can change that next time.

On a less academic note, my thanks to the communities in Philadelphia who supported me throughout this process. Thank you to my Penn Quizzo team, for making sure I was out of the house on Monday nights; to Penn Gamers, for transmuting a shared love of board games into friendships; and to the Thursday night contra dance, for being a broad community that embraced me and kept me moving.

I would not have made it to the finish line without the love and support of all my friends. My wholehearted thanks to the group chat: Colin Killick, Molly Olguín, Mattie Mitchell, Annie Moriondo, Jackie Pineda-Andrews, and Ian Pineda-Andrews (or Agni, Casimir, Ilaina, Max, Screech, and all of Punchworld – Zoltán). Thank you for being there literally 24/7.
I appreciate their being a neverending fount of camaraderie, serious ideas, silly jokes, and sincere emotional support. My thanks to my Williams undergraduate thesis crew: Matt Hosek, Tori Borish, Katie Kumamoto, and Dan Kohane. I’m glad to have had them along for this journey twice – and we’re all finally allowed to sleep now! My thanks to Aaron Bauer, Jake Levinson, and Nick Arnosti, who introduced me to board games. This would be enough of a reason for thanks, but I also appreciate their lasting friendship and its propensity for deep conversation (preferably over a board game or four). And my thanks to Sasha Ehrhardt for being my friend since we were 5 years old, from tigers and gibbons to computers and libraries with a whole lot more in between.

My thanks to Caron Bove for driving me between LACS and IHS back in high school so that I could attend my math and science classes – I told you then that I’d acknowledge you now, and I’m delighted to finally be able to do so.

I want to recognize my great-uncle Clifford Spector, whom I wish I’d met. It amazes me that his work in computability theory nestled so closely next to my chosen field without me realizing it, and I’m tickled that he ended up my academic great-great-great-uncle, our familial and professional lineages off by only two generations.

And last, but never least, my neverending thanks and gratitude to my family. Your love, support, and jokes (good and bad) mean more to me than I will ever be able to say. My thanks to my mom and dad, Stacia Zabusky and Donald Spector, for not merely believing in me, but for reifying that belief into actions that build me up. And my thanks to my brother, Elias Spector-Zabusky, for understanding me more deeply than anybody else in the world. I love you all.
ABSTRACT

DON’T MIND THE FORMALIZATION GAP: THE DESIGN AND USAGE OF HS-TO-COQ

Antal Spector-Zabusky
Stephanie Weirich

Using proof assistants to perform formal, mechanical software verification is a powerful technique for producing correct software. However, the verification is time-consuming and limited to software written in the language of the proof assistant. As an approach to mitigating this tradeoff, this dissertation presents hs-to-coq, a tool for translating programs written in the Haskell programming language into the Coq proof assistant, along with its applications and a general methodology for using it to verify programs. By introducing edit files containing programmatic descriptions of code transformations, we provide the ability to flexibly adapt our verification goals to exist anywhere on the spectrum between “increased confidence” and “full functional correctness”.

Contents

Title
Copyright
Acknowledgments
Abstract
Contents
List of Figures

Chapter 1. Introduction
1.1. How to work with hs-to-coq
1.2. The edit language and the mechanized formalization gap
1.3. DeepSpec
1.4. Contributions
1.5. Outline

Chapter 2. An Introductory Example: Bags
2.1. Bags in GHC
2.2. Translating Bag and its operations
2.3. Edits for Bags
2.4. Specifying the behavior of Bags
2.5. From program to theorem

Chapter 3. hs-to-coq: Design and Usage
3.1. How we’ve used hs-to-coq
3.2. Desiderata
3.3. Test suite
3.4. Mechanized formalization gaps
3.5. Infix operators
3.6. Notation for literals
3.7. Transforming code automatically
3.8. Partiality
3.9. Recursion

Chapter 4. The Edit Language
4.1.
