Automatic Reasoning Techniques for Non-Serializable
AUTOMATIC REASONING TECHNIQUES FOR NON-SERIALIZABLE DATA-INTENSIVE APPLICATIONS

A Dissertation Submitted to the Faculty of Purdue University by Gowtham Kaki

In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

August 2019
Purdue University
West Lafayette, Indiana

THE PURDUE UNIVERSITY GRADUATE SCHOOL STATEMENT OF DISSERTATION APPROVAL

Dr. Suresh Jagannathan, Chair
Department of Computer Science

Dr. Xiangyu Zhang
Department of Computer Science

Dr. Tiark Rompf
Department of Computer Science

Dr. Pedro Fonseca
Department of Computer Science

Approved by:
Dr. Voicu S Popescu
Head of the Department Graduate Program

To Pavani, my source of strength and wisdom.

In the pursuit of unity, elegance, and happiness ...

ACKNOWLEDGMENTS

I will be forever grateful to my advisor Suresh Jagannathan, who took a chance on me seven years ago when I had nothing to show except for my enthusiasm for functional programming. Right from the beginning, Suresh has been receptive of my ideas, regardless of how vague and ill-conceived they were, and would patiently work with me over multiple iterations of reifying them into research papers. I have immensely benefited from his constant encouragement to think bold and different, to question fundamental assumptions, and to fearlessly pursue unconventional approaches to their logical conclusion. Working with him over the past seven years has taught me how to think, changed my perception towards research and life in general, and helped me grow as a scientist and a human being.

I am thankful to KC Sivaramakrishnan, my long-time collaborator, mentor, and friend, for his constant encouragement and timely feedback. It is KC who introduced me to the fascinating world of distributed data stores, weak consistency and isolation: the focus of my current work, and also the topic of this thesis.
I have hugely benefited from the numerous discussions we had over the years on these topics, several of them resulting in new ideas that are now published. As my senior in the research group, KC has set a fine example through the high standards of his work ethic for me to emulate.

During my internships at Microsoft Research India, I have had the good fortune of collaborating with Ganesan Ramalingam, who, through his own example, reinforced in me such virtues as clarity of thought, intellectual humility, and sense of craftsmanship in doing research. I have also enjoyed my collaborations with several other immensely talented people, including Kartik Nagar, Mahsa Nazafzadeh, Kiarash Rahmani, Kapil Earanky, and Samodya Abeysiriwardane, each of whom taught me a new perspective in problem solving.

I am fortunate to have been part of a vibrant research community in Programming Languages at Purdue. Gustavo Petri, He Zhu, KC Sivaramakrishnan, and Xuankang Lin helped me bring the Purdue PL (PurPL) reading group into existence in Spring 2014. PurPL has since matured into an umbrella organization for all PL groups at Purdue, thanks to the meticulous work of Tiark Rompf, Roopsha Samantha, Ben Delaware, Milind Kulkarni, and their students. I have personally benefited from the sense of community and belongingness that PurPL fosters among the PL graduate students at Purdue.

I am thankful for the friendship and encouragement that I received from several of my friends at Purdue, including Abhilash Jindal, GV Srikanth, Priya Murria, Vandith Pamuru, Sivakami Suresh, Habiba Farrukh, Devendra Verma, Raghu Vamsi, Jithin Joseph, Ravi Pramod, Suvidha Kancherla, and Praseem Banzal. They have made my long stay at Purdue enjoyable, and gave me memories that are worth cherishing.

I am incredibly lucky to have made lasting friendships during my undergraduate years at BITS, Pilani, which continued to support and sustain me through my Ph.D.
Over the last seven years, it is to these friends that I have often turned during the tough times when I felt an acute need for solace and acceptance. For their unadulterated friendship and unconditional acceptance, I am forever indebted to my friends Bharat and Swathi, Karthik and Vandana, Kartik and Mounica, Kashyap and Nikitha, Krishna and Subbu, Sandeep and Sowjanya, Sasidhar and Prathyusha, Uttam and Sreeja, and Mahesh Lagadapati. They constitute my family beyond my kin.

Lastly, but most importantly, I would like to thank my parents for reposing immense faith in my abilities and values, and offering me complete freedom to pursue my own interests without once expressing any misgiving. My greatest sense of gratitude however is towards my best friend and wife Pavani, without whose unconditional love, tireless support, and constant encouragement none of the work described in this thesis would have been possible; if I have achieved anything through this thesis, it is as much hers as it is mine.
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABSTRACT
1 INTRODUCTION
  1.1 Contributions
    1.1.1 Compositional Reasoning and Inference for Weak Isolation
    1.1.2 Bounded Verification under Weak Consistency
    1.1.3 Principled Derivation of Mergeable Replicated Data Types
  1.2 Roadmap
2 COMPOSITIONAL REASONING AND INFERENCE FOR WEAK ISOLATION
  2.1 Motivation
  2.2 T: Syntax and Semantics
    2.2.1 Isolation Specifications
  2.3 The Reasoning Framework
    2.3.1 The Rely-Guarantee Judgment
    2.3.2 Semantics and Soundness
  2.4 Inference
    2.4.1 Soundness of Inference
    2.4.2 From S to the First-Order Logic
    2.4.3 Decidability
  2.5 ACIDifier Implementation
    2.5.1 Pragmatics
  2.6 Evaluation
  2.7 Related Work
3 BOUNDED VERIFICATION UNDER WEAK CONSISTENCY
  3.1 Replicated State Anomalies: The Motivation for Verification
  3.2 The Q9 Programming Framework
    3.2.1 Explicit Effect Representation
  3.3 System Model
  3.4 The Q9 Verification Engine
    3.4.1 Core Calculus
    3.4.2 Abstract Relations
    3.4.3 Symbolic Execution
    3.4.4 Automated Repair
  3.5 Transactions
  3.6 Implementation and Evaluation
    3.6.1 Verification Experiments
    3.6.2 Validation Experiments
  3.7 Related Work
4 DERIVATION OF MERGEABLE REPLICATED DATA TYPES
  4.1 Motivation
  4.2 Abstracting Data Structures as Relations
  4.3 Deriving Relational Merge Specifications
    4.3.1 Compositionality
    4.3.2 Type Specifications for Characteristic Relations
    4.3.3 Derivation Rules
  4.4 Deriving Merge Functions
    4.4.1 Concretizing Orders
  4.5 Implementation
    4.5.1 Quark store
  4.6 Evaluation
    4.6.1 Data Structure Benchmarks
    4.6.2 Application Benchmarks
  4.7 Related Work
5 CONCLUDING REMARKS AND FUTURE WORK
REFERENCES
VITA

LIST OF TABLES

2.1 The discovered isolation levels for TPC-C transactions
3.1 Consistency Models
3.2 A sample of the anomalies found and fixes discovered by Q9
3.3 Verification Statistics
4.1 Characteristic relations for various data types
4.2 A description of data structure benchmarks used in Quark evaluation.
4.3 Quark application Benchmarks

LIST OF FIGURES

2.1 TPC-C new order transaction
2.2 Database schema of TPC-C's order management system. The naming convention indicates primary keys and foreign keys. For example, ol_id is the primary key column of the order line table, whereas ol_o_id is a foreign key that refers to the o_id column of the order table.
2.3 An RC execution involving two instances (T1 and T2) of the new order transaction depicted in Fig. 2.1. Both instances read the d_id District record concurrently, because neither transaction is committed when the reads are executed. The subsequent operations are effectively sequentialized, since T2 commits before T1. Nonetheless, both transactions read the same value for d_next_o_id, resulting in them adding Order records with the same ids, which in turn triggers a violation of TPC-C's consistency condition.
2.4 Foreach loop from Fig. 2.1
2.5 T: Syntax