© 2013 Xiaokang Qiu

AUTOMATIC TECHNIQUES FOR PROVING CORRECTNESS OF HEAP-MANIPULATING PROGRAMS

BY

XIAOKANG QIU

DISSERTATION

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate College of the University of Illinois at Urbana-Champaign, 2013

Urbana, Illinois

Doctoral Committee:

Associate Professor Madhusudan Parthasarathy, Chair
Zisman Family Professor Rajeev Alur, University of Pennsylvania
Associate Professor Grigore Roșu
Associate Professor Mahesh Viswanathan

ABSTRACT

Reliability is critical for system software, such as OS kernels, mobile browsers, embedded systems and cloud systems. The correctness of these programs, especially for security, is highly desirable, as they should provide a trustworthy platform for higher-level applications and the end-users. Unfortunately, due to its inherent complexity, the verification of such software is typically manual/semi-automatic, tedious, and painful. Automating the reasoning behind these verification tasks and decreasing the dependence on manual help is one of the greatest challenges in software verification.

This dissertation presents two logic-based automatic software verification systems, namely Strand and Dryad, that help in the task of verifying heap-manipulating programs, one of the most complex aspects of modern software that eludes automatic verification. Strand is a logic that combines an expressive heap-logic with an arbitrary data-logic and admits several powerful decidable fragments. The general decision procedures can be used not only in proving programs correct but also in software analysis and testing. Dryad is a family of logics, including Dryadtree, a first-order logic for trees, and Dryadsep, a dialect of Separation Logic. Both logics are amenable to automated reasoning using the natural proof strategy, a radically new approach to software verification. Dryad and the natural proof techniques are so far the most efficient logic-based approach that can verify the full correctness of a wide variety of challenging programs, including a large number of programs from various open-source libraries. They hold the promise of becoming the next generation of automatic verification techniques.

To Ping.

ACKNOWLEDGMENTS

The six-year scholarly pursuit of a Ph.D. in Illinois has been the greatest challenge of my life. It is my privilege to thank all the people who have influenced this endeavor.

I am deeply indebted to my advisor, Madhusudan Parthasarathy, who has been a consistent source of research guidance and financial support. As a venerable mentor, he has taught me virtually all I know about being a good researcher. I greatly admire his incredible enthusiasm, deep thinking, broad knowledge, and sense of responsibility.

I thank the other members of my doctoral committee, Rajeev Alur, Grigore Roșu and Mahesh Viswanathan. They have all eagerly provided me with advice on both my research and my career. My research collaborator Gennaro Parlato has been an “older brother” of mine, and taught me important and practical lessons on conducting research, writing papers, and surviving the Ph.D. I also thank my fellow students, including Pranav Garg, Edgar Pek, Shambwaditya Saha, Francesco Sorrentino and Andrei Ștefănescu. They have been a source of great conversation and feedback on my research.

I am grateful to the great staff of the Department of Computer Science. Elaine Wilson handled my travel arrangements and reimbursements with extraordinary efficiency. Shirley Finke and Jennifer Dittmar made sure I got paid every semester. Rhonda McElroy, Mary Beth Kelly and Kara MacGregor helped me get visas, write and sign important supporting letters, and meet the coursework requirements on time.

I owe much thanks to my parents, Jianxiang Qiu and Minguang Shen. They have unceasingly supported and encouraged me in coming to America and becoming an academic. I am also very thankful to Elder Wei-Laung Hu and the brothers and sisters of the Cornerstone Fellowship, for their prayer, sharing and encouragement in my spiritual journey in the Midwest.

Finally, I especially thank my wife, Ping. I would not have finished the degree without her love, support and patience. She has been with me through thick and thin. For all those times, this dissertation is lovingly dedicated to her.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

LIST OF ABBREVIATIONS

CHAPTER 1 INTRODUCTION
1.1 Summary of Contributions
1.2 Organization

CHAPTER 2 PRELIMINARIES
2.1 Satisfiability Modulo Theories and Z3
2.2 Monadic Second-Order Theory and Mona
2.3 Separation Logic

CHAPTER 3 STRAND LOGIC
3.1 Overview
3.2 Motivating Examples and Logic Design
3.3 Recursively Defined Data-Structures
3.4 The Logic
3.5 Program Verification Using Strand

CHAPTER 4 DECISION PROCEDURES
4.1 Overview
4.2 Satisfiability-Preserving Embeddings
4.3 A Semantically Defined Fragment
4.4 A Syntactically Defined Fragment
4.5 Experimental Evaluation
4.6 Related Work

CHAPTER 5 NATURAL PROOFS
5.1 The Dryadtree Logic
5.2 Deriving the Verification Condition
5.3 A Decidable Fragment of Dryadtree
5.4 Formula Abstraction

5.5 Experiments
5.6 Related Work

CHAPTER 6 COMBINING SEPARATION AND RECURSION
6.1 Logic Design
6.2 Syntax
6.3 Semantics
6.4 Examples
6.5 Translating to a Logic over the Global Heap

CHAPTER 7 NATURAL PROOFS FOR STRUCTURE, DATA, AND SEPARATION
7.1 Programs and Hoare-triples
7.2 Generating the Verification Condition
7.3 Unfolding Across the Footprint
7.4 Formula Abstraction
7.5 Case Study
7.6 Experimental Evaluation
7.7 Related Work
7.8 Annotation Synthesis

CHAPTER 8 CONCLUSIONS
8.1 Conclusions
8.2 A Look Ahead

REFERENCES

LIST OF TABLES

4.1 Results of program verification using Strand

5.1 Results of program verification using Dryadtree
6.1 Domain-exact property and Scope function

7.1 Results of verifying data-structure algorithms
7.2 Results of verifying open-source libraries

LIST OF FIGURES

3.1 A list with head and tail
3.2 l′ inherits data values from l
3.3 A binary tree example represented in Rbt
3.4 Definition of the tailorX function
3.5 Syntax of Strand
3.6 Predicates defined for non-updating statements
3.7 Predicates defined for updating statements
3.8 A syntactic transformation for conditions

4.1 Definition of the interpret function
4.2 Definition of the tailor function
4.3 Syntax of Strand_dec^syn
4.4 A valid subset X that falsifies β

5.1 Syntax of Dryadtree
5.2 Recursive definitions for red-black trees
5.3 AVL-find routine
5.4 Pre/post conditions and recursive definition for AVL-find
5.5 Expanding the symbolic heap and generating the formulas
5.6 Syntax of local formulas ϕp(x)
5.7 Syntax of Dryadtree^dec
5.8 Inductive definition of map(ϕ)

6.1 Syntax of Dryadsep
6.2 The pure predicate for terms/formulas
6.3 Translation of Dryadsep terms
6.4 Translation of Dryadsep formulas
7.1 Syntax of programs
7.2 Case study: Heapify

LIST OF ABBREVIATIONS

APF Array Property Fragment

BST Binary Search Tree

FOL First-Order Logic

ITE If-Then-Else

LCA Least Common Ancestor

MSO Monadic Second-Order

MSOL Monadic Second-Order Logic

RDDS Recursively Defined Data-Structure

SL Separation Logic

SMT Satisfiability Modulo Theories

SPE Satisfiability-Preserving Embedding

VC Verification Condition

WS1S Weak monadic Second-order theory of 1 Successor

WS2S Weak monadic Second-order theory of 2 Successors

CHAPTER 1

INTRODUCTION

As software systems have become indispensable in our daily lives, their reliability has grown into one of the most pressing concerns, especially when they are deployed for critical applications and services. This includes complex embedded software in avionics, vehicles and medical equipment, system software such as operating systems and browsers, as well as today’s emerging cloud software.

Numerous approaches have been proposed to build reliability-critical software that satisfies its complex correctness requirements. A focus of intense research in this area has been program verification using theorem provers, utilizing manually provided proof annotations, such as pre- and post-conditions for functions and loop invariants. Automatic theory solvers (e.g. SMT solvers) that handle a variety of quantifier-free theories, including arithmetic, uninterpreted functions, Boolean logic, etc., serve as effective tools that automatically discharge the validity checking of many verification conditions [18].

A key area that has eluded the above paradigm of specification and verification is heap analysis: the verification of programs that dynamically allocate memory and manipulate it using pointers, maintaining structural invariants (e.g. “the nodes form a tree”), aliasing invariants, and invariants on the data stored in the locations (e.g. “the keys of a list are sorted”). Heap-manipulating programs are pervasive in lower-level computer systems: garbage collectors, OS kernels, device drivers, mobile browsers, etc. The functional correctness of these programs is highly desirable, as they should provide a trustworthy platform for higher-level applications. Unfortunately, due to their inherent complexity, the verification of these programs is typically manual/semi-automatic, tedious and painful. It usually eludes all existing automatic techniques and tools, and poses one of the greatest challenges in software verification.

Dynamically allocated heaps are difficult to reason about for several reasons. First, writing the proof annotations themselves is hard, as an annotation needs to capture several intricate properties of an unbounded heap, often requiring quantification and reachability predicates, and needs to specify aliasing as well as structural properties of the heap. Also, experience with manual verification has shown that pre- and post-conditions get unduly complex, including large formulas that say how the frame of the heap that is not touched by the program remains the same across a function. Expressing such properties naturally and succinctly in a logical formalism has been challenging, and reasoning with them automatically even more so.

To this end, this dissertation aims at building automatic software verification systems, with a focus on heap-manipulating programs. An underlying theme of this work is that for logical reasoning to succeed, two important directions must be pursued:

1. Consider more expressive logics to allow the programmer to easily spec- ify complex data structures; and

2. Develop decision procedures that can reason efficiently about these so- phisticated logics.

The two directions cannot be pursued in isolation. On the one hand, when the logic becomes more powerful, e.g., separation logic with inductive algebraic definitions, the analysis is usually manual or semi-automatic, the latter being typically sound, incomplete, and non-terminating, proceeding by heuristically searching for proofs using a proof system and unrolling recursive definitions arbitrarily. Typically, such tools can find simple proofs if they exist, but are unpredictable and cannot robustly produce counterexamples. On the other hand, when verification becomes completely automatic, e.g., Lisbq [46] and CSL [15], the logics are often heavily constrained in expressivity. They are often not sufficiently expressive to state complex properties of the heap (e.g. the balancedness of an AVL tree, or that the set of keys stored in a heap does not change across a program). The main goal of this work is to develop new program logics and methodologies that strike a balance between expressiveness and verifiability in the area of verifying heap-manipulating programs. In particular, we present two logics, one called Strand, and the other called Dryad. Strand is

a logic that combines an expressive heap-logic with an arbitrary data-logic and admits several powerful decidable fragments. It is one of the most powerful decidable logics for complex properties combining heap structures and data. Dryad is a family of logics that support recursive definitions and our novel proof strategy called Natural Proofs. Tools built on our logical machinery have successfully verified the full correctness of a wide variety of challenging programs, including a large number of programs from various open-source libraries. We explain our contributions in detail as follows.

1.1 Summary of Contributions

The main contributions of this dissertation are highlighted as follows:

1. The Strand logic, which expresses constraints involving heap structures and the data they contain; this logic allows quantification in a limited form and is capable of expressing complex properties of common data structures.

2. Two decidable fragments of Strand, one semantically defined (Strand_dec^sem) and one syntactically defined (Strand_dec^syn); experiments show that complex verification conditions generated from heap-manipulating programs can be handled efficiently using the decision procedures for Strand.

3. A novel proof strategy that we call Natural Proofs. Natural proofs are a subclass of proofs that are amenable to completely automated reasoning; the strategy is a systematic methodology that provides sound but incomplete procedures, and that captures common reasoning tactics in program verification.

4. Two variants of the Dryad logic that are amenable to natural proofs: Dryadtree extends first-order logic to support recursive definitions for trees; Dryadsep is a dialect of Separation Logic that disallows explicit quantification but permits powerful recursive definitions. The salient feature of Dryadsep is that it admits a quantifier-free, deterministic translation to a classical logic with free set variables, which can be handled using modern SMT solvers.

5. An application of the Natural Proofs strategy to verify a wide variety of open-source programs from the real world; these programs include low-level C routines from Glib, OpenBST, the Linux kernel and the ExpressOS project.

6. A preliminary annotation synthesizer based on natural proofs; the natural proof strategy is encoded into ghost annotations that tend to help semi-automatic verifiers (such as VCC) find a proof automatically. The reduced annotation burden can significantly benefit programmers without specialist proving knowledge.

The Strand logic and its two decidable fragments were first defined in [50]. A dedicated, more efficient decision procedure was obtained in [51]. Our proof methodology of natural proofs was first proposed in [52], in the context of the logic Dryadtree, which handles only tree data-structures. In [64], aiming at providing natural proofs for general properties of structure, data, and separation, the Dryadsep logic was proposed as a dialect of separation logic. In that work, we developed natural proofs for Dryadsep and reason about heaplets using classical logic over the theory of sets.

1.2 Organization

The rest of the dissertation is organized into chapters as follows:

Chapter 2 presents a technical background for the rest of the dissertation.

Chapter 3 defines the Strand logic and shows how to derive verification conditions from Strand-annotated programs.

Chapter 4 gives decidability proofs for the two decidable fragments, and experimentally evaluates the effectiveness of the decision procedures.

Chapter 5 illustrates the procedures for reasoning with heap-manipulating programs using natural proofs, particularly based on the Dryadtree logic.

Chapter 6 presents the Dryadsep logic and shows its conversion to a classical logic using the theory of sets.

Chapter 7 develops natural proofs for Dryadsep, and applies the strategy to the verification of various real-world programs and libraries.

Chapter 8 presents the conclusions and looks ahead to the future research directions.

CHAPTER 2

PRELIMINARIES

This chapter discusses technical preliminaries required for the rest of the dissertation.

2.1 Satisfiability Modulo Theories and Z3

A fundamental component of analysis techniques for complex programs is logical reasoning. The advent of efficient SMT (satisfiability modulo theories) solvers has significantly advanced the techniques for the analysis of programs.

Though it is well known that satisfiability of First-Order Logic (FOL) is undecidable, when the models are constrained by background theories, satisfiability can become decidable. Over the past decades, efficient decision procedures have emerged for many logical theories and fragments (e.g. integers, arrays, the theory of uninterpreted functions, etc.) that are typically useful in computer science [18]. Reasoning with these theories has become standard practice in program verification.

Moreover, by using techniques that combine theories, larger decidable theories can be obtained. The Nelson-Oppen framework [60] and the Shostak approach [70] allow generic combinations of quantifier-free theories, and have been used in efficient implementations that combine theories using a SAT solver which queries decision procedures of the background theories.

SMT solvers advance several analysis techniques. They are useful in test-input generation, where the solver is asked whether there exists an input to a program that will drive it along a particular path; see for example [37]. SMT solvers are also useful in static analysis based on abstract interpretation, where the solver is asked to compute precise abstract transitions, i.e. asked whether there is a concretization of an abstract state a that transitions

to a concretization of another abstract state a′ (for example, see Slam [4] for predicate abstraction and Tvla [48, 80] for shape analysis). Solvers are also useful in classical deductive verification, where Hoare-triples stating pre-conditions and post-conditions can be transformed into verification conditions whose validity is checked by the solver; for example, Boogie [5] and ESC/Java [35] use SMT solvers to prove verification conditions.

Thanks to technical breakthroughs and competitions [61, 7] in recent years, many efficient, off-the-shelf SMT solvers [6, 25, 32, 28] are available, especially for verification. In this dissertation, we use Z3 [28] as the back-end theorem prover. Z3 is a high-performance, state-of-the-art SMT solver developed at Microsoft Research, and has been used in several program analysis, verification and test-case generation projects at Microsoft. It integrates an efficient SAT solving core with decision procedures for component theories such as bit-vectors, arithmetic, arrays, partial orders and tuples. Z3 also adopts several approaches to handle quantifiers, including model-based quantifier instantiation [36] and E-matching [29]. However, in general, theorem proving with quantified formulas is no longer a decision procedure.
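As a concrete taste of the solver interface, the following small hand-written SMT-LIB 2 script (an illustration, not from the dissertation) poses a quantifier-free linear integer arithmetic query of the kind solvers such as Z3 discharge:

```smt2
(set-logic QF_LIA)            ; quantifier-free linear integer arithmetic
(declare-const x Int)
(declare-const y Int)
(assert (<= x y))             ; x is at most y
(assert (= y (+ x 2)))        ; y is exactly x + 2
(check-sat)                   ; the solver answers: sat
(get-model)                   ; e.g. x = 0, y = 2
```

Replacing the second assertion with `(assert (< y x))` makes the constraints contradictory, and the solver answers `unsat` instead.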

2.2 Monadic Second-Order Theory and Mona

Another set of well-known decidability results stems from Monadic Second-Order Logic (MSOL), and is related to regular languages and automata theory. Second-order logic extends first-order logic with quantifiers over relations, and MSOL is the restriction of second-order logic in which all relational variables are unary. MSOL is a powerful logic, capable of expressing complex properties such as “the graph is 4-colorable” or “the graph is connected”. In the 1960s, Büchi and Elgot established the famous connection between MSOL and finite automata, which can be stated as:

Theorem 2.2.1 ([20, 33]). A language of finite words is recognizable by a finite automaton iff it is definable in MSO.¹

The intuition behind the translation from automata to MSO is that the computations of the automaton can be captured by sets. The desired formula needs to say that “there is an encoding of a sequence of states that forms an accepting computation”. Conversely, from an MSO formula, an automaton can be constructed by structural induction on the formula. The same idea also applies to finite tree automata:

¹ The signature consists of unary relations and the successor relation.

Theorem 2.2.2 ([74, 31]). A set of finite trees is recognizable by a finite tree automaton iff it is MSO-definable.

As a variation of MSO over finite words, WS1S (Weak monadic Second-order theory of 1 Successor) is the set of formulas true in the structure (N, S)² under the restriction that all set quantifiers range over finite sets. Since a finite set of natural numbers can always be encoded as a finite string over {0, 1}, the decidability of WS1S immediately follows from Theorem 2.2.1. WS2S (Weak monadic Second-order theory of 2 Successors) is a generalization of WS1S which is interpreted over the infinite binary tree, and its decidability is similarly obtained from Theorem 2.2.2.

Corollary 2.2.3. The satisfiability of WS1S/WS2S is decidable.
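To make the word encoding concrete, here is a small stdlib-only Python sketch (an illustration, not part of the dissertation) that encodes a finite set of naturals as a 0/1 word and runs a two-state DFA recognizing the MSO-definable property “the set has an even number of elements”:

```python
def encode(xs, length):
    # Encode a finite set of naturals as a 0/1 word, position i = membership of i.
    return [1 if i in xs else 0 for i in range(length)]

def run_dfa(word):
    # Two-state DFA: the state tracks the parity of 1-bits seen so far;
    # the accepting state is parity 0 (an even number of elements).
    state = 0
    for bit in word:
        state ^= bit
    return state == 0

assert run_dfa(encode({0, 3}, 5))         # {0, 3} has 2 elements: accepted
assert not run_dfa(encode({1, 2, 4}, 6))  # {1, 2, 4} has 3 elements: rejected
```

Padding the word with trailing 0s does not change acceptance, mirroring the fact that the encoded set, not the word length, determines the truth of the WS1S property.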

Though the complexity of deciding WS1S/WS2S is non-elementary [56], efficient implementations such as Mona are available. Mona encodes WS1S and WS2S formulas as minimal DFAs (Deterministic Finite Automata) and GTAs (Guided Tree Automata), which are represented by shared, multi-terminal BDDs (Binary Decision Diagrams). These techniques make Mona practically tractable and useful. Its most famous application has been the Pale program verifier [58].
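For a flavor of the tool's input, here is a small WS1S formula written in (approximate) Mona concrete syntax; treat this as a sketch recalled from the Mona manual, as details vary across versions:

```
ws1s;
var2 X;
# every element of X has its successor in X;
# over finite (weak) sets this forces X to be empty
all1 p: (p in X => p + 1 in X);
```

Mona compiles such a formula into a DFA and reports its satisfying assignments; here the only satisfying finite set is X = ∅, since a nonempty finite set would have to contain the successor of its maximum element.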

2.3 Separation Logic

In recent years, Separation Logic (SL), especially in combination with recursive definitions, has emerged as a succinct and natural logic to express properties about structure and separation [68, 63]. The primary design principle behind separation logic is the decision to express strict specifications: logical formulas must naturally refer to heaplets (subparts of the heap), and, by default, to the smallest heaplet to which the formula needs to refer. This is in contrast to classical logics (such as

² S(x, y) is true iff y = x + 1.

First-Order Logic), which implicitly refer to the entire heap globally. Strict specifications permit elegant ways to capture how a particular subpart of the heap changes due to a procedure, implicitly leaving the rest of the heap and its properties unchanged across a call to this procedure. Separation logic is a particular framework for strict specifications, where formulas are implicitly defined on strictly defined heaplets, and where heaplets can be combined using a novel spatial conjunction operator denoted by ∗. Separation logic is interpreted over program states consisting of a store and a heaplet. A store s is a function mapping variables to values, which can be data values or memory addresses; a heaplet h is a partial function mapping memory addresses to values. Assertions in SL can be constructed from the following constructs:

emp asserts that the heaplet is empty, i.e., (s, h) |= emp if Dom(h) = ∅;

t ↦ t′ relates an address t and a value t′, asserting that t maps to t′ and that the heaplet is defined only on the location t; formally, (s, h) |= t ↦ t′ if h(t) = t′ and Dom(h) = {t};

P ∗ Q asserts that the heaplet can be split into two disjoint parts such that one satisfies P and the other satisfies Q, i.e., (s, h) |= P ∗ Q if there exist h1, h2 such that h1 ⊥ h2, h = h1 ∪ h2, (s, h1) |= P and (s, h2) |= Q;

P −∗ Q asserts that extending the heaplet with a disjoint part that satisfies P results in a heaplet that satisfies Q, i.e., (s, h) |= P −∗ Q if for any h′ such that h ⊥ h′ and (s, h′) |= P , we have (s, h ∪ h′) |= Q.
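The connectives above can be prototyped directly. The following stdlib-only Python sketch (an illustration, not the dissertation's tooling) models a heaplet as a finite dict and decides emp, ↦, and ∗ by enumerating heaplet splits:

```python
from itertools import combinations

def emp(h):
    # (s, h) |= emp  iff  Dom(h) is empty
    return len(h) == 0

def points_to(t, v):
    # (s, h) |= t |-> v  iff  the heaplet is exactly {t: v}
    return lambda h: h == {t: v}

def sep(p, q):
    # (s, h) |= P * Q  iff  h splits into disjoint h1, h2 covering all of h,
    # with (s, h1) |= P and (s, h2) |= Q
    def check(h):
        keys = list(h)
        for r in range(len(keys) + 1):
            for left in combinations(keys, r):
                h1 = {k: h[k] for k in left}
                h2 = {k: h[k] for k in keys if k not in left}
                if p(h1) and q(h2):
                    return True
        return False
    return check

# A two-cell heaplet: 1 |-> 2 * 2 |-> 0 (using 0 to stand for nil)
h = {1: 2, 2: 0}
assert sep(points_to(1, 2), points_to(2, 0))(h)
assert not sep(points_to(1, 2), emp)(h)  # * must account for the whole heaplet
```

The failing second assertion illustrates the strictness of the semantics: the heaplet must be exhausted by the split, so a lone points-to formula is false on any larger heaplet.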

In addition, standard Boolean connectives are also allowed. Hoare-triples in SL also have a tight semantics. Given a program C and two SL assertions P and Q, the Hoare-triple {P} C {Q} (as a partial specification) asserts that if the initial state satisfies P and the program does not go wrong and terminates, then the final state will satisfy Q. Notice that P and Q describe only a portion of the heap, and the program can only access the locations asserted in the precondition and the locations that are newly allocated. The frame rule in SL captures the main advantage of strict specifications:

    {P} C {Q}     Mod(C) ∩ fv(R) = ∅
    ─────────────────────────────────
          {P ∗ R} C {Q ∗ R}

It says that if the Hoare-triple {P} C {Q} holds for some program C, then {P ∗ R} C {Q ∗ R} also holds (with the side-condition that the variables modified by C are disjoint from the free variables of R). The frame rule elegantly enables local reasoning, which allows one to derive a global fact about a program from a local fact. Local reasoning is an important component of reasoning with heap-manipulating programs, but in classical logic the frame rule is not straightforward, as it is difficult to ensure that the conjoined property does not involve the portion of the program state that might be modified by the program. Yang [76] showed that, in general, SL is not even recursively enumerable. However, quantifiers are the main source of the undecidability, i.e., SL becomes decidable when quantifiers are prohibited [21]. A small decidable fragment of SL is given in [9] for lists with both points-to and reachability relations. This fragment has been further extended in [17] to include a restricted form of arithmetic.
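As a standard illustration from the SL literature (not specific to this dissertation), acyclic singly-linked lists are described by a recursive predicate built from emp, ↦ and ∗:

```latex
\mathit{list}(x) \;\triangleq\; (x = \mathit{nil} \wedge \mathsf{emp})
  \;\vee\; (\exists y.\; x \mapsto y \,*\, \mathit{list}(y))
```

The ∗ between x ↦ y and list(y) forces each cell into a disjoint part of the heaplet, so the definition automatically rules out cycles; it is exactly this interplay of recursion and separation that the Dryad logics of later chapters are designed to reason about automatically.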

CHAPTER 3

STRAND LOGIC

Several approaches to program analysis, like deductive verification, generating tests using constraint solving, abstraction, etc., have greatly benefited from the engineering of efficient SMT solvers, which currently provide automated decision procedures for a variety of quantifier-free theories, including integers, bit-vectors, arrays, uninterpreted functions, as well as combinations of these theories using the Nelson-Oppen method [60]. One of the most important kinds of reasoning in program verification that has evaded tractable decidable fragments is reasoning with dynamic heaps and the data contained in them.

Reasoning with heaps and data seems to call for decidable combinations of logics on graphs that model the heap structure (with heap nodes modeled as vertices, and field pointers as edges) with a logic on the data contained in them (like the quantifier-free theory of integers already supported by current SMT solvers). The primary challenge in building such decidable combinations stems from the unbounded number of nodes in the heap structures. This mandates the need for universal quantification in any reasonable logic in order to be able to refer to all the elements of the heap (e.g. to say a list is sorted, we need some form of universal quantification). However, the presence of quantifiers immediately annuls the use of Nelson-Oppen combinations, and requires a new theory for combining unbounded graph theories with data.

There have been a few recent breakthroughs in combining heap structures and data. For instance, Havoc [46] supports a logic that ensures decidability using a highly restrictive syntax, and CSL [15] extends the Havoc logic mechanism to handle constraints on sizes of structures. However, both these logics have awkward syntax that involves the domain being partially ordered with respect to sorts, and the logics are heavily curtailed so that the decision procedure can move down the sorted structures

hierarchically and hence terminate. Moreover, these logics cannot express even simple properties on trees of unbounded depth, like the property that a tree is a binary search tree. More importantly, the technique for deciding the logic is encoded in the syntax, which in turn narrowly aims for a fast reduction to the underlying data-logic, making it hard to extend or generalize.

3.1 Overview

In this chapter, we propose a new fundamental technique for deciding theories that combine heap structures and data, for fragments of a logic called Strand. The technique is based on defining a notion of satisfiability-preserving embeddings between heap structures, and extracting the minimal models with respect to these embeddings to synthesize a data-logic formula, which can then be decided by an SMT solver for the data-theory.

We define a new logic called Strand (for “STRucture ANd Data”) that combines a powerful heap-logic with an arbitrary data-logic. Strand formulas are interpreted over a class of data-structures R, and are of the form ∃x̄∀ȳ ϕ(x̄, ȳ), where ϕ is a formula that combines a complete monadic second-order logic over the heap-structure (and can have additional quantification) with a data-logic that can constrain the data-fields of the nodes referred to by x̄ and ȳ.

The heap-logic in Strand is derived from the rich logic tradition of designing decidable monadic second-order logics over graphs, and is extremely expressive in defining structural shapes and invariants. Strand formulas are interpreted over a recursively defined class of data-structures R, which is defined using a regular set of skeleton trees with MSO-defined edge-relations (pointer-relations) between them. This way of recursively defining data-structures is not new, and was pioneered by the Pale system [58], which reasons with purely structural properties of heaps defined in a similar manner. In fact, the notion of graph types [43] is a convenient and simple way to define data-structure types and invariants, and is easily expressible in our scheme. Data-structures defined over skeleton trees have enough expressive power to state most data-structure invariants of recursively defined data-structures, including nested lists, threaded trees, cyclic and doubly-linked lists, and separate or loosely connected combinations of these structures.

Moreover, they present a class of graphs that have a decidable MSO theory, as MSO on these graphs can be interpreted using MSO over trees, which is decidable. In fact, graphs defined this way form one of the largest classes of graphs that have a decidable MSO theory.

We show in this chapter that the Strand logic is well-suited to reasoning with programs. In particular, assume we are given a (straight-line) program P, a pre-condition on the data-structure expressed as a set of recursive structures R, and a pre-condition and a post-condition expressed in a sub-fragment of Strand that allows Boolean combinations of the existential and universal fragments. We show that checking the invalidity of the associated Hoare-triple reduces to the satisfiability problem of Strand over a new class of recursive structures RP. This facilitates using Strand to express a variety of problems, including the applications of test-input generation, finding abstract transitions, and deductive verification mentioned above.

Note that despite its relative expressiveness in allowing quantification over nodes, Strand formulas cannot express certain constraints, such as those that constrain the length of a list of nodes (e.g., to express that the number of black nodes on all paths in a red-black tree is the same), nor can they express the multi-set of data-values stored in a data-structure (e.g., to express that one list’s data contents are the same as another’s). We hope that future work will extend the results in this dissertation to handle such constraints.

Organization: We first present the intuitions behind the logic design of Strand in Section 3.2. We define recursively defined data-structures (RDDSs) in Section 3.3, then introduce the Strand logic with examples in Section 3.4. In Section 3.5, we present the VC-generation process with respect to Strand.

3.2 Motivating Examples and Logic Design

The goal of this section is to present an overview of the issues involved in finding decidable logics that combine heap structure and data. This sets the stage for designing the logic Strand so that it is potentially decidable, and motivates the choices in our logic design using simple examples on lists.

Let us consider lists in this section, where each node u has a data-field d(u) that can hold a value (say an integer), and with two variables head and tail pointing to the first and last nodes of the list, respectively (see Figure 3.1). Consider first-order logic, where we are allowed to quantify over the nodes of the list, and further, for any node x, allowed to refer to the data-field of x using the term d(x). Let x → y denote that y is the successor of x in the list, and let x →∗ y denote that x is the same as y or precedes y in the list.


Figure 3.1: A list with head and tail

Consider a formula expressing the sortedness of lists:

Example 3.2.1 (Sorted list).

ϕ1 : d(head) = c1 ∧ d(tail) = c2 ∧ ∀y1∀y2((y1 →∗ y2) ⇒ d(y1) ≤ d(y2))

The above says that the list must be sorted and that the head of the list must have value c1 and the tail must have value c2. Note that the formula is satisfiable iff c1 ≤ c2, in which case it is actually satisfied by a list containing just two elements, pointed to by head and tail, with values c1 and c2, respectively. In fact, the property that the formula is satisfiable by a two-element list has nothing really to do with the data-constraints involved in the above formula. Assume that we have no idea what the data-constraints mean, and hence view the above formula abstractly, replacing all the data-constraints with uninterpreted predicates p1, p2, ... to get the formula:

ϕ̂1 : p1(d(head)) ∧ p2(d(tail)) ∧ ∀y1∀y2((y1 →∗ y2) ⇒ p3(d(y1), d(y2)))

Now, we do not know whether the formula is satisfiable (for example, p1 may be unsatisfiable). But we still do know that two-element lists are always sufficient. In other words, if there is a list that satisfies the above formula, then there is a two-element list that satisfies it. The argument is simple: take any list l that satisfies the formula, and form a new list l′ that has only the head and tail of l, with an edge from head to tail, and with data values inherited from l (see Figure 3.2). It is easy to see that l′ satisfies the formula as well, since whenever two nodes are related by →∗ in the list l′, the corresponding elements in l are similarly related: the constraints p1(head) and p2(tail) are obviously satisfied in l′. Moreover, for any valuation of y1 and y2 in l′ where y1 →∗ y2 holds in l′, y1 →∗ y2 also holds in l, and hence the constraint p3(y1, y2) is satisfied.
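The contraction argument can be replayed concretely. The following Python sketch is ours (the names `satisfies` and `contract`, and the particular predicates, are illustrative, not part of any Strand implementation): a list model is a sequence of data values, and contracting it to its head and tail preserves satisfaction of the abstracted formula with uninterpreted predicates p1, p2, p3, for any choice of those predicates.

```python
def satisfies(lst, p1, p2, p3):
    """Check p1(d(head)) ∧ p2(d(tail)) ∧ ∀y1,y2. (y1 ->* y2) ⇒ p3(d(y1), d(y2))."""
    if not lst:
        return False
    if not (p1(lst[0]) and p2(lst[-1])):
        return False
    # y1 ->* y2 ranges over all pairs with y1 at or before y2 in the list.
    return all(p3(lst[i], lst[j]) for i in range(len(lst)) for j in range(i, len(lst)))

def contract(lst):
    """Keep only head and tail, inheriting their data values from lst."""
    return [lst[0], lst[-1]] if len(lst) >= 2 else list(lst)

# One possible interpretation: sortedness with endpoint values 3 and 9.
p1 = lambda d: d == 3
p2 = lambda d: d == 9
p3 = lambda d1, d2: d1 <= d2

big = [3, 4, 4, 7, 9]
assert satisfies(big, p1, p2, p3)
# The contracted two-element list still satisfies the formula,
# because every ->* pair of the small list is a ->* pair of the big one.
assert satisfies(contract(big), p1, p2, p3)
```

The preservation holds regardless of what p1, p2, p3 mean, which is exactly the point of the argument above.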


Figure 3.2: l′ inherits data values from l

In other words, given any arbitrarily large list satisfying the above formula, we can always find a two-element list satisfying the formula, independent of what the data-predicates may mean. This property, of course, does not hold on all formulas, as we see in the example below:

Example 3.2.2 (A list counting from c1 to c2).

ϕ2 : d(head) = c1 ∧ d(tail) = c2 ∧ ∀y1∀y2((y1 → y2) ⇒ d(y2) = d(y1) + 1)

The above says that the values in the list increase by one as we go one element down the list, and that the head and tail of the list have values c1 and c2, respectively. This formula is satisfiable iff c1 < c2. However, there is no bound on the size of the minimal model that is independent of the data-constraints. For example, if c1 = 1 and c2 = 10^6, then the smallest list that satisfies the formula has a million nodes. In other words, the data-constraints place arbitrary lower bounds on the size of the minimal structure that satisfies the formula.

Intuitively, the formula ϕ2 refers to successive elements in the list, and hence a large model that satisfies the formula is not necessarily contractible to a smaller model. The formula ϕ1 in the sortedness example (Example 3.2.1) refers to pairs of elements related by reachability, which allows contracting large models to small ones. The above examples illustrate the design principle behind the decidable fragment of Strand: examine the structural constraints in a formula ϕ, and enumerate a finite set of structures such that the formula is satisfiable iff one of these structures can be populated with values to satisfy the formula. This strategy necessarily fails for the above formula ϕ2, as there is no class of finite structures that adequately captures all models of the formula independent of the data-constraints. The sortedness formula ϕ1 in the first example can be efficiently checked using the decision procedures for Strand, while ϕ2 cannot. We further consider the relationship between quantifiers and decidability. Consider the following example:

Example 3.2.3 (A counting but not sorted list). Consider the formula:

ϕ3 : d(head) = c1 ∧ d(tail) = c2 ∧ ∀y1((y1 ≠ tail) ⇒ ∃y2(d(y2) = d(y1) + 1))

This formula says that for any node n except the tail, there is some node n′ that has the value d(n) + 1. Notice that the formula is satisfiable if c1

the formula) can be arbitrary, and in fact we allow Strand formulas to even have set quantifications over nodes.

3.3 Recursively Defined Data-Structures

In this section, we define recursively defined data-structures using a formalism that defines the nodes and edges using MSO formulas over a regular set of trees. Intuitively, a set of data-structures is defined by taking a regular class of trees that acts as a skeleton over which the data-structure will be defined. The precise set of nodes of the tree that corresponds to the nodes of the data-structure, and the edges between these nodes (which model pointer fields), will be captured using MSO formulas over these trees. We call such classes of data-structures recursively defined data-structures (RDDSs). RDDSs are a very powerful mechanism for defining invariants of data-structures. The notion of graph types [43] is a very similar notion, where again data-structure invariants are defined using a tree-backbone, but where edges are defined using regular path expressions. Graph types can be modeled directly in our framework; in fact, our formalism is more powerful. The framework of RDDSs is also interesting because they define classes of graphs that have a decidable MSO theory. In other words, given a class C of recursively defined data-structures, the satisfiability problem for MSO formulas over C (i.e., the problem of checking, given ϕ, whether there is some structure R ∈ C that satisfies ϕ) is decidable. The decision procedure works by interpreting the MSO formula on the tree-backbone of the structures. In fact, our framework can capture all graphs definable using edge-replacement grammars, which are one of the most powerful classes of graphs known to have a decidable MSO theory [34].

Remark: We model heap structures as labeled directed graphs: the nodes of the graph correspond to heap locations, and an edge from n to n′ labeled f represents the fact that the pointer field f of node n points to n′. The nodes in addition have labels associated with them; labels are used to signify special nodes (like NIL nodes) as well as to denote the program’s pointer variables that point to them.

3.3.1 Formal Definition

For any k ∈ N, let [k] denote the set {1, ..., k}. A k-ary tree is a set V ⊆ [k]∗, where V is non-empty and prefix-closed. We call u.i the i-th child of u, for every u, u.i ∈ V, where u ∈ [k]∗ and i ∈ [k]. Let us fix a countable set of first-order variables FV (denoted by s, t, etc.), a countable set of set-variables SV (denoted by S, T, etc.), and a countable set of Boolean-variables BV (denoted by p, q, etc.). The syntax of monadic second-order (MSO) [75] formulas on k-ary trees is defined as:

δ ::= p | succi(s, t) | s = t | s ∈ S | δ ∨ δ | ¬δ | ∃s.δ | ∃S.δ | ∃p.δ

where i ∈ [k]. The atomic formula succi(s, t) holds iff t is the i-th child of s. Other logical symbols are interpreted in the traditional way.
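The encoding of k-ary trees as non-empty, prefix-closed sets of sequences over [k] can be checked mechanically. The Python sketch below is ours, for illustration only; it represents tree nodes as tuples of child indices.

```python
def is_kary_tree(V, k):
    """V: a set of tuples over {1..k}. Check non-emptiness and prefix-closure."""
    if not V:
        return False
    for u in V:
        if any(i < 1 or i > k for i in u):
            return False
        # Every node's immediate prefix (its parent) must be present;
        # applied to all nodes, this gives full prefix-closure.
        if u and u[:-1] not in V:
            return False
    return True

def child(u, i):
    """The i-th child u.i of node u."""
    return u + (i,)

# Root (), its first child, and that child's second child form a 2-ary tree.
V = {(), (1,), (1, 2)}
assert is_kary_tree(V, 2)
assert not is_kary_tree({(1,)}, 2)  # missing the root (), not prefix-closed
```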

Definition 3.3.1 (Recursively Defined Data-Structures). A class of recursively defined data-structures (RDDS) over a graph signature Σ = (Lv, Le) (where Lv and Le are finite sets of labels) is specified by a tuple R = (ψTr, ψU, {αa}a∈Lv, {βb}b∈Le), where ψTr is an MSO sentence, ψU is a unary predicate defined in MSO, and each αa (βb) is a monadic (binary) predicate defined using MSO, respectively, where all MSO formulas are over k-ary trees, for some k ∈ N.

Let R = (ψTr, ψU, {αa}a∈Lv, {βb}b∈Le) be an RDDS and T be a k-ary Σ-labeled tree that satisfies ψTr. Then T = (V, {Ei}i∈[k]) defines a graph Graph(T) = (N, E, µ, ν, Lv, Le) as follows:

• N = {s ∈ V | ψU(s) holds in T}

• E = {(s, s′) | ψU(s) and ψU(s′) hold, and βb(s, s′) holds in T for some b ∈ Le}

• µ(s) = {a ∈ Lv | ψU(s) holds and αa(s) holds in T}

• ν(s, s′) = {b ∈ Le | ψU(s) and ψU(s′) hold and βb(s, s′) holds in T}.

In the above, N denotes the nodes of the graph, E the set of edges, µ the labels on nodes, and ν the labels on edges. The class of graphs defined by R is the set Graph(R)= {Graph(T ) | T |= ψTr}. These graphs are interpreted as heap structures.
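The construction of Graph(T) from a backbone tree can be sketched directly, with the MSO-definable predicates supplied as ordinary Boolean functions. This is an illustrative encoding of ours, not the dissertation's implementation.

```python
def graph_of(T, psi_U, alphas, betas):
    """T: iterable of tree nodes; psi_U: node -> bool;
    alphas: dict label -> (node -> bool);
    betas: dict label -> ((node, node) -> bool).
    Returns (N, E, mu, nu) as in Definition 3.3.1."""
    N = [s for s in T if psi_U(s)]
    E = [(s, t) for s in N for t in N if any(b(s, t) for b in betas.values())]
    mu = {s: {a for a, al in alphas.items() if al(s)} for s in N}
    nu = {(s, t): {b for b, be in betas.items() if be(s, t)} for (s, t) in E}
    return N, E, mu, nu

# A 3-node unary backbone (tuples over [1]) viewed as a singly linked list:
T = [(), (1,), (1, 1)]
psi_U = lambda s: True
alphas = {"head": lambda s: s == (), "tail": lambda s: s == (1, 1)}
betas = {"n": lambda s, t: t == s + (1,)}  # the n-edge is the first-child relation

N, E, mu, nu = graph_of(T, psi_U, alphas, betas)
assert E == [((), (1,)), ((1,), (1, 1))]
assert mu[()] == {"head"} and mu[(1, 1)] == {"tail"}
```

In the real framework the predicates are MSO formulas, decided over the tree; here they are simply evaluated.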

We give two examples of modeling heap structures as recursively defined data-structures below.

Example 3.3.2 (Binary trees). Binary trees are common data-structures, in which two pointer fields, l and r, point to the left and right children, respectively. If a node does not have a left (right) child, then the l (r) field points to the unique NIL node in the heap. Moreover, there is a node rt which is the root of the tree. Binary trees can be modeled as a recursively defined data-structure. For example, we can model the unique NIL node as the root of the backbone tree, and model the actual nodes of the binary tree as the left subtree of the root (i.e., the tree under the left child of the root models the tree rooted at rt). The right subtree of the root is empty. Binary trees can be modeled as

Rbt = (ψTr, ψU, {αrt, αNIL}, {βl, βr}) where

ψTr ≡ ∃y1. root(y1) ∧ ¬∃y2. succr(y1, y2)
ψU(x) ≡ true
αrt(x) ≡ ∃y. root(y) ∧ succl(y, x)
αNIL(x) ≡ root(x)
βl(x1, x2) ≡ ∃y. root(y) ∧ leftsubtree(y, x1) ∧ (succl(x1, x2) ∨ (root(x2) ∧ ¬∃z. succl(x1, z)))
βr(x1, x2) ≡ ∃y. root(y) ∧ leftsubtree(y, x1) ∧ (succr(x1, x2) ∨ (root(x2) ∧ ¬∃z. succr(x1, z)))

where the predicate root(x) indicates whether x is the root of the backbone tree, and the relation leftsubtree(y, x) (rightsubtree(y, x)) indicates whether x belongs to the subtree of the left (right) child of y. They can all be defined easily in MSO. As an example, Figure 3.3a shows a binary tree represented in Rbt.

Example 3.3.3 (Two lists). Consider modeling two disjoint lists, starting from h1 and h2, in the heap. They share the same pointer field n and the same NIL node. This structure can be modeled as a recursively defined data-structure as follows. Intuitively, the two lists can be encoded as a binary tree. The tree simply consists of two lists: one list starts at the left child of the root and the second at the right child of the root, and the n-pointer is modeled as the left-child relation. The root is labeled NIL. ψTr simply says there are no nodes other than these. Then the left child and the right child


(a) T : a binary tree (b) S: a valid subset of T (shaded nodes)

Figure 3.3: A binary tree example represented in Rbt

of the root can be labeled h1 and h2, respectively. The above predicates and relations can all be described in MSO:

ψTr ≡ ∃y1 y2. root(y1) ∧ succr(y1, y2) ∧ (∀y3. leftreach(y1, y3) ∨ leftreach(y2, y3))
ψU(x) ≡ true
αhead1(x) ≡ ∃y. root(y) ∧ succl(y, x)
αhead2(x) ≡ ∃y. root(y) ∧ succr(y, x)
αtail1(x) ≡ ∃y. root(y) ∧ leftreach(y, x) ∧ ¬∃z. succl(x, z)
αtail2(x) ≡ ∃y. root(y) ∧ ¬leftreach(y, x) ∧ ¬∃z. succl(x, z)
αNIL(x) ≡ root(x)
βn(x1, x2) ≡ (succl(x1, x2) ∧ ¬root(x1)) ∨ (root(x2) ∧ ¬∃z. succl(x1, z))

where the predicate root(x) and the relation leftreach(y, x) are defined in MSO: root(x) is defined in the same way as in Example 3.3.2, and leftreach(y, x) indicates reachability via a left-only path from y to x.

3.3.2 Submodels

We need to define the notion of submodels of a model. The definition of a submodel will depend on the particular RDDS we are working with, since we want to exploit the tree-representation of the models, which in turn will play a crucial role in deciding fragments of Strand, as it will allow us to check satisfiability-preserving embeddings. In fact, we will define the submodel relation between trees that satisfy ψTr. We first define valid subsets of a tree with respect to a recursively defined data-structure; this will then be used to define submodels.

Definition 3.3.4 (Valid subsets). Let R = (ψTr, ψU, {αa}a∈Lv, {βb}b∈Le), let T = (V, λ) be a Σ-labeled tree that satisfies ψTr, and let S ⊆ V. Then we say S is a valid subset of V if the following hold:

• S is non-empty and least-common-ancestor closed (i.e., for any s, s′ ∈ S, the least common ancestor of s and s′ w.r.t. T also belongs to S);

• The subtree defined by S, denoted Subtree(T, S), is the tree with nodes S, where the i-th child of a node u ∈ S is the (unique) node u′ ∈ S closest to u that is in the subtree rooted at the i-th child of u. (This is uniquely defined since S is least-common-ancestor (LCA) closed.) We require that Subtree(T, S) also satisfy ψTr;

• for every s ∈ S, if ψU (s) holds in Subtree(T,S), then ψU (s) holds in T as well;

• for every s ∈ S, for every a ∈ Lv, αa(s) holds in Subtree(T,S) iff αa(s) holds in T .

Figure 3.3b shows a valid subset S of the binary tree representation T in Example 3.3.2. Now we can define a submodel based on a valid subset:

Definition 3.3.5 (Submodels). A tree T ′ =(V ′,λ′) is said to be a submodel of T =(V,λ) if there is a valid subset S of V such that T ′ is isomorphic to Subtree(T,S).

Note that while the unary predicates (αa) are preserved in the submodel, the edge-relations (βb) may be interpreted very differently from the larger model. Intuitively, T′ = (V′, λ′) is a submodel of T = (V, λ) if the vertices of T′ can be embedded in T, preserving the tree-structure. The nodes of Graph(T′) are a subset of the nodes of Graph(T) (because of the last condition in the definition of a submodel), and, given a valid subset S, there is in fact an injective mapping from the nodes of Graph(T′) to the nodes of Graph(T).
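The subtree construction of Definition 3.3.4 is easy to prototype when nodes are encoded as tuples over [k]. The sketch below is ours: it checks LCA-closure and computes the edges of Subtree(T, S) by picking, for each u and i, the closest member of S inside the subtree rooted at u.i.

```python
def lca(u, v):
    """Least common ancestor of two tree nodes given as tuples."""
    p = ()
    for a, b in zip(u, v):
        if a != b:
            break
        p = p + (a,)
    return p

def is_lca_closed(S):
    return all(lca(u, v) in S for u in S for v in S)

def subtree_edges(S, k):
    """Edges of Subtree(T, S): u's i-th child is the closest s in S below u.i."""
    edges = {}
    for u in S:
        for i in range(1, k + 1):
            below = [s for s in S if s[:len(u) + 1] == u + (i,)]
            if below:
                # Closest = shallowest; unique because S is LCA-closed.
                edges[(u, i)] = min(below, key=len)
    return edges

S = {(), (1, 1), (2,)}          # drops node (1,); lca((1,1), (2,)) = () is in S
assert is_lca_closed(S)
e = subtree_edges(S, 2)
assert e[((), 1)] == (1, 1)     # (1,1) becomes the first child of the root
assert e[((), 2)] == (2,)
```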

tailorX(succi(s, t)) = ∃s′. (Ei(s, s′) ∧ s′ ≤ t ∧ ∀t′. ((t′ ∈ X ∧ s′ ≤ t′) ⇒ t ≤ t′)), for every i ∈ [k]
tailorX(s = t) = (s = t)
tailorX(s ∈ W) = s ∈ W
tailorX(δ1 ∨ δ2) = tailorX(δ1) ∨ tailorX(δ2)
tailorX(¬δ) = ¬tailorX(δ)
tailorX(∃s.δ) = ∃s. s ∈ X ∧ tailorX(δ)
tailorX(∃W.δ) = ∃W. W ⊆ X ∧ tailorX(δ)

Figure 3.4: Definition of the tailorX function

3.3.3 Interpreting Formulas on Submodels

We define a transformation tailorX from an MSO formula on trees to another MSO formula (with a free set variable X) on trees, such that for any MSO sentence δ on k-ary trees, for any tree T = (V, λ) and any valid subset X ⊆ V, Subtree(T, X) satisfies δ iff T satisfies tailorX(δ). The transformation is given in Figure 3.4, where we let x ≤ y mean that y is a descendant of x in the tree. The crucial transformations are the edge-formulas, which are interpreted as the edges of the subtree defined by X. Now, following the definition of valid subsets, we define a predicate ValidSubset(X) using MSO, where X is a free set variable, such that ValidSubset(X) holds in a tree T = (V, λ) iff X is a valid subset of V (below, lca(x, y, z) stands for an MSO formula saying that z is the LCA of x and y in the tree).

ValidSubset(X) ≡ (∀s, t, u. (s ∈ X ∧ t ∈ X ∧ lca(s, t, u)) ⇒ u ∈ X)
∧ tailorX(ψTr)
∧ (∀s. (s ∈ X ∧ tailorX(ψU(s))) ⇒ ψU(s))
∧ ⋀a∈Lv (∀s. s ∈ X ⇒ (tailorX(αa(s)) ⇔ αa(s)))

3.3.4 Elasticity

Elastic relations are relations of the recursively defined data-structure with the property that a pair of nodes satisfies the relation in a tree iff it also satisfies the relation in any valid subtree. Formally,

Definition 3.3.6 (Elastic relations). Let R = (ψTr, ψU, {αa}a∈Lv, {βb}b∈Le), and let b ∈ Le be an edge label. Then the relation Eb (defined by βb) is elastic if for every tree T = (V, λ) satisfying ψTr, for every valid subset S of V, and for every pair of nodes u, v in the model M′ = Graph(Subtree(T, S)), Eb(u, v) holds in M′ iff Eb(u, v) holds in Graph(T).

For example, let R be the class of binary trees. The left-descendant relation, relating a node with any of the nodes in the tree subtended from its left child, is elastic: for any binary tree T and any valid subset S containing nodes x and y, if y is in the left branch of x in T, then y is also in the left branch of x in the subtree defined by S, and vice versa. However, the left-child relation is non-elastic. Consider a binary tree T in which y is in the left branch of x but is not the left child of x; then S = {x, y} is a valid subset, and y is the left child of x in Subtree(T, S).
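This example can be replayed on a concrete tree. In the sketch below (our illustrative encoding, with child 1 playing the role of the left child), the left-descendant relation holds of x and y both in T and after contracting to S = {x, y}, while the left-child relation appears only in the subtree, witnessing non-elasticity.

```python
def left_child_in_subtree(S, u, v):
    """v is u's left child in Subtree(T, S): the closest S-node below u.1."""
    below = [s for s in S if s[:len(u) + 1] == u + (1,)]
    return bool(below) and v == min(below, key=len)

def left_descendant(u, v):
    """v lies in the subtree rooted at u's left child."""
    return v[:len(u) + 1] == u + (1,)

T = {(), (1,), (1, 1)}
x, y = (), (1, 1)
S = {x, y}   # a valid subset: LCA-closed, since lca(x, y) = x is in S

# Left-descendant holds in T, and y is still below x's left child in
# Subtree(T, S) -- the relation is consistent in both models (elastic).
assert left_descendant(x, y)

# Left-child: y is NOT the left child of x in T, yet it IS the left child
# of x in Subtree(T, S) -- so the left-child relation is not elastic.
assert y != x + (1,)
assert left_child_in_subtree(S, x, y)
```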

It turns out that elasticity of Eb can also be expressed by the following MSO formula

ψTr ⇒ ∀X ∀u ∀v. [(ValidSubset(X) ∧ u ∈ X ∧ v ∈ X ∧ tailorX(ψU(u)) ∧ tailorX(ψU(v))) ⇒ (βb(u, v) ⇔ tailorX(βb(u, v)))]

Eb is elastic iff the above formula is valid over all trees. Hence, we can decide whether a relation is elastic or not by checking the validity of the above formula over k-ary Σ-labeled trees.

3.4 The Logic

We now introduce the Strand (“STRucture ANd Data”) logic. Strand is a two-sorted logic parameterized by a first-order theory D of sort Data and an RDDS R = (ψTr, ψU, {αa}a∈Lv, {βb}b∈Le); the syntax of Strand is presented in Figure 3.5. Strand is defined over the two-sorted signature Γ(D, R) = Sig(D) ∪ Sig(R) ∪ DF, where DF is a set of functions of sort Loc → Data. Strand formulas are of the form ∃~x∀~y.ϕ(~x, ~y), where ~x and ~y are ∃DVar and ∀DVar, respectively, of sort Loc (we also refer to both as DVar), and ϕ is an MSO formula with atomic formulas of the form Qa(v), Eb(v, v′), or γ(e1, ..., en), where γ(e1, ..., en) is an atomic D-formula in which the data carried by Loc-variables can be referred to as df(x) or df(y). Note that additional variables are allowed in ϕ(~x, ~y), both first-order and second-order, but γ(e1, ..., en) is only allowed to refer to ~x and ~y.

Formula ψ ::= ∃x.ψ | ω
∀Formula ω ::= ∀y.ω | ϕ
QFFormula ϕ ::= γ(e1, ..., en) | Qa(v) | Eb(v, v′) | ¬ϕ | ϕ1 ∧ ϕ2 | ϕ1 ∨ ϕ2 | ∃z.ϕ | ∀z.ϕ | ∃S.ϕ | ∀S.ϕ
Expression e ::= df(x) | df(y) | c | g(e1, ..., en)

∃DVar x ∈ Loc    ∀DVar y ∈ Loc    GVar z ∈ Loc    Var v ::= x | y | z
SVar S ∈ 2^Loc    Constant c ∈ Sig(D)    Function g ∈ Sig(D)
D-Relation γ ∈ Sig(D)    L-Relation b ∈ Sig(L)    Data Field df ∈ DF

Figure 3.5: Syntax of Strand

The Strand logic is interpreted on a structure M = (GR, MDF). GR is a graph in Graph(R) with N as the underlying set of nodes, and MDF is a set of functions of the form MDF : N → Data. The semantics of Strand formulas is the natural extension of the semantics of R and D.

We will refer to GR as a graph-model. A data-extension of GR is a Strand model (GR, MDF). Then every Strand model M can be canonically regarded as an L-model. More precisely,

Definition 3.4.1. Given a Strand model M = (GR, MDF), the structural-reduct of M is GR. We also denote it as M^str.

For the reverse direction, an L-model can be populated with data values to form a Strand model:

Definition 3.4.2. Given a Strand model M and an L-model N, M is a data-extension of N if N is the structural-reduct of M.

By an abuse of language, in the rest of the dissertation, we sometimes say a Strand model satisfies an MSO L-formula if its structural-reduct does so, and say an L-model satisfies a Strand formula if some data-extension of it does so.

3.4.1 Examples

We now show various examples to illustrate the expressiveness of Strand. In the following examples, we assume that DF contains a single data field d.

Example 3.4.3 (Sorted list). We first revisit the motivating Example 3.2.1 presented in Section 3.2. In the formula ϕ1, y1 and y2 must be in DVar since their data fields are referred to. Thus ϕ1 can be rewritten in Strand as

ψsorted ≡ ∀y1∀y2. (d(head) = c1 ∧ d(tail) = c2 ∧ ((y1 →∗ y2) ⇒ d(y1) ≤ d(y2)))

Example 3.4.4 (Min-heap). A min-heap is a heap, representable as a binary tree, that satisfies the min-heap property, which can be stated in Strand as

ψminheap ≡ ∀y1∀y2. ((y1 →∗ y2) ⇒ d(y1) ≤ d(y2))

Both y1 and y2 are universally quantified data-variables in DVar.

Example 3.4.5 (Binary search tree). In Strand, a binary search tree (BST) is described as a binary tree with an additional key field for each node. The keys in a BST are always stored in such a way as to satisfy the binary-search-tree property:

• The left subtree of a node contains only nodes with keys less than the node’s key.

• The right subtree of a node contains only nodes with keys greater than the node’s key.

Let binary trees be defined as in Example 3.3.2 and let the key field be accessed by the function d; then this property can be expressed in Strand as follows:

ψbst ≡ ∀y1, y2. (leftsubtree(y1, y2) ⇒ d(y2) < d(y1)) ∧ (rightsubtree(y1, y2) ⇒ d(y2) > d(y1))

where

leftsubtree(y1, y2) ≡ ∃z. (l(y1, z) ∧ z →∗ y2)
rightsubtree(y1, y2) ≡ ∃z. (r(y1, z) ∧ z →∗ y2)
x →∗ y ≡ ∃S. x ∈ S ∧ y ∈ S ∧ ∀z. z ∈ S → (z = x ∨ ∃u. (u ∈ S ∧ (l(u, z) ∨ r(u, z))))

Note that ψbst has an existentially quantified variable z in GVar after the universal quantification of y1, y2. However, as z is a structural quantification (whose data-field cannot be referred to), this formula is in Strand.
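The binary-search-tree property stated above can be checked directly on a finite tree. In this Python sketch (ours, for illustration; nodes are tuples over [2] with 1 = left and 2 = right, and the dictionary d plays the role of the key field), leftsubtree/rightsubtree membership reduces to a prefix test.

```python
def subtree_nodes(d, u, i):
    """All nodes below u's i-th child, i.e. z ->* y2 from that child."""
    return [v for v in d if v[:len(u) + 1] == u + (i,)]

def is_bst(d):
    """Every key in a node's left subtree is smaller, right subtree larger."""
    return all(
        all(d[v] < d[u] for v in subtree_nodes(d, u, 1)) and
        all(d[v] > d[u] for v in subtree_nodes(d, u, 2))
        for u in d
    )

#        5
#       / \
#      3   8
d = {(): 5, (1,): 3, (2,): 8}
assert is_bst(d)
assert not is_bst({(): 5, (1,): 7})  # left child larger than the root
```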

Remark: The ∀∃ alternation is unavoidable for expressing the BST property, but is excluded by many decidable logic fragments. However, Strand is expressive enough to allow it, because Strand allows arbitrary quantification combinations within the ∃∀ quantification of DVar’s, as long as the data fields of these additionally quantified variables are not referred to.

Example 3.4.6 (Two disjoint lists). In separation logic [68], a novel binary operator ∗, or separating conjunction, is defined to assert that the heap can be split into two disjoint parts on which its two arguments hold, respectively. Such an operator is useful in reasoning with frame conditions in program verification. Thanks to the powerful expressiveness of MSO, the separating conjunction is also expressible in Strand. For example, (head1 →∗ tail1) ∗ (head2 →∗ tail2) states, in separation logic, that there are two disjoint lists such that one list is from head1 to tail1, and the other is from head2 to tail2. Let the RDDS for two lists be as defined in Example 3.3.3; then this formula can be written in Strand as:

ψ2lists ≡ ∃S1, S2. (disjoint(S1, S2) ∧ head1 ∈ S1 ∧ tail1 ∈ S1 ∧ head2 ∈ S2 ∧ tail2 ∈ S2 ∧ head1 →∗ tail1 ∧ head2 →∗ tail2) ∧ (∀z. (head1 →∗ z ∧ z →∗ tail1) ⇒ z ∈ S1) ∧ (∀z. (head2 →∗ z ∧ z →∗ tail2) ⇒ z ∈ S2)

where

disjoint(S1,S2) ≡ ¬∃z.(z ∈ S1 ∧ z ∈ S2)

Note that no data fields are referred to in ψ2lists, i.e., it is merely an MSO formula that describes the heap structure. This example shows that Strand inherits the full power of MSO on the structure side.

Remark: In Strand, every formula can be brought into negation normal form by using De Morgan’s laws to push negations inward and eliminating double negations. The resulting formula remains in Strand.

3.5 Program Verification Using Strand

Strand can be used to reason about the correctness of programs, in terms of verifying Hoare-triples where the pre- and post-conditions express both the structure of the heap and the data contained in it. The pre- and post-conditions that we allow are Strand formulas that consist of Boolean combinations of formulas with pure existential or pure universal quantification over the data-variables (i.e., Boolean combinations of formulas of the form ∃~x.ϕ and ∀~y.ϕ); let us call this fragment Strand∃,∀. Given a straight-line program P that performs destructive pointer-updates and data-updates, we model a Hoare-triple as a tuple (R, Pre, P, Post), where the pre-condition is given by the data-structure constraint R together with the Strand∃,∀ formula Pre, and the post-condition is given by the Strand∃,∀ formula Post (note that structural constraints on the data-structure for the post-condition are also expressed in Post, using MSO).

In this section, we show that, given such a Hoare-triple, checking whether the Hoare-triple is invalid can be reduced to a satisfiability problem for a Strand formula over a class of recursively defined data-structures RP. This then allows us to use Strand∃,∀ to verify programs (where, of course, loop-invariants are given by the programmer, which breaks down verification of a program into verification of straight-line code). Intuitively, this reduction augments the structures in R with extra nodes that could be created during the execution of P, and models the trail the program takes by logically defining the configuration of the program at each time instant. Over this trail, we then express that the pre-condition holds and the post-condition fails to hold. We also construct formulas that check whether there is any memory access violation during the run of P (e.g., freeing locations twice, dereferencing a null pointer, etc.).

3.5.1 Syntax of Programs

Let us define the syntax of a basic language manipulating heaps and data; more complex constructs can be defined by combining these statements appropriately. Let Var be a countable set of pointer variables, F be a countable set of structural pointer fields, and data be a data field. A condition is defined as follows (for technical reasons, negations are pushed all the way in):

ψ ∈ Cond ::= γ(q1.data, ..., qk.data) | ¬γ(q1.data, ..., qk.data)
| p == q | p ≠ q | p == nil | p ≠ nil | ψ1 ∧ ψ2 | ψ1 ∨ ψ2

where p, q, q1, ..., qk ∈ Var, and γ is a predicate over data values. The set of statements Stmt defined over Var, F, and data is defined as follows:

s ∈ Stmt ::= p := new | free(p) | assume(ψ) | p := nil | p := q | p.f := q | p := q.f | p.data := h(q1.data, ..., qk.data)

where p, q, q1, ..., qk ∈ Var, f ∈ F, h is a function over data, and ψ is a condition. A program P over Var, F, and data is a non-empty finite sequence of statements s1; s2; ...; sm, with si ∈ Stmt.

Henceforth, whenever we refer to statements, programs, conditions, graphs, recursively defined data-structures, and Strand formulas, we assume they are defined over the same sets Var, F, and data.

3.5.2 Semantics of Programs

The semantics of the statements in Stmt, and hence the semantics of programs, is as follows. A statement of a program P operates on graphs G of the following restricted form: G = (V, E, µ, ν, Lv, Le), where Lv = Var ∪ {xnil} and Le = F, with the constraints that:

• Every pointer variable p ∈ Var labels exactly one node;

• Every edge is labeled by a unique symbol of F ;

• For each node v ∈ V , v has exactly one outgoing edge labeled by f, for every f ∈ F ;

• G has a special node, called nil, which is labeled by xnil.

Node nil models an undefined area of memory. In the following whenever a node v ∈ V is labeled by p (that is, p ∈ µ(v)), we say that p points to v.

Let R be a recursively defined data-structure, let Pre and Post be two Strand∃,∀ formulas, and let P ::= s1; s2; ...; sm be a program. The configuration of the program at any point is given by a heap modeled as a graph, where the nodes of the graph are assigned data values. For a program with m statements, let us fix the configurations to be G0, ..., Gm.

The execution of a statement s ∈ Stmt on a graph G = (V, E, µ, ν, Lv, Le) has the effect of transforming G into another graph G′ = (V′, E′, µ′, ν′, Lv, Le). Given a statement s ∈ Stmt and a graph G, we define a relation G ֒→s G′ that captures the graph transformation according to the semantics of s, as follows:

[p := new]: G′ is obtained from G by adding a new node, which is now pointed to by the variable p. All the edges outgoing from the new node point to nil.

[free(p)]: If p points to a node v of G which is not nil, then G′ is obtained from G by removing v, where all the edges and pointers that were pointing to v now point to nil. Conversely, if p points to nil, then the program terminates with an error, and there is no graph G′ with G ֒→s G′.

[p := nil]: G′ is the same as G except that p now points to nil. This statement never leads to an error, for any graph G.

[p := q ]: G′ is the same as G with the exception that p points to the same node as q. The statement p := q never goes wrong.

[p.f := q]: If p does not point to nil, G′ is obtained from G by modifying only the f-edge outgoing from the node pointed to by p, which now points to the same node as q (which can also be the nil node). Otherwise, p points to nil and the execution ends with an error (there is no G′ with G ֒→s G′).

[p := q.f]: If q does not point to nil, G′ is structurally the same as G except that the variable p now points to the node pointed to by the f-labeled edge outgoing from the node pointed to by q. Instead, if q points to nil, the execution of the statement ends with an error (there is no G′ with G ֒→s G′).

[assume(ψ)]: If all pointer variables in the condition ψ point to a node which is not nil, then (1) G′ is the same as G provided that ψ holds on G, and (2) the program gets blocked in case ψ does not hold on G. Otherwise, some pointer variable of ψ points to nil, and the execution of the statement terminates with an error.

[p.data := h(q1.data, ..., qk.data)]: If p and q1, ..., qk all point to nodes other than nil in G, then G′ is the same as G except that the data-value associated with the node pointed to by p is updated with the new value h(q1.data, ..., qk.data), where qj.data is the data value associated with the node pointed to by qj, for every j ∈ [k]. Otherwise, if p or one of the qj pointers points to nil, the execution of the statement terminates with an error, and there is no G′ with G ֒→s G′.
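The graph semantics above can be prototyped with a small interpreter. The sketch below is our illustrative encoding, not the dissertation's formalization: a heap is a variable map plus one edge map per field, nil is a distinguished node, and dereferencing nil raises an error. It covers p := new, p := q, p.f := q, and p := q.f.

```python
class HeapError(Exception):
    """Raised on nil dereference, mirroring the error cases above."""
    pass

class Heap:
    def __init__(self, fields):
        self.fresh = 0
        self.nil = self.alloc()              # the distinguished nil node
        self.var = {}                        # pointer variable -> node
        self.edge = {f: {} for f in fields}  # field f -> {node: node}

    def alloc(self):
        self.fresh += 1
        return self.fresh

    def new(self, p):                        # p := new
        n = self.alloc()
        for f in self.edge:
            self.edge[f][n] = self.nil       # fresh fields all point to nil
        self.var[p] = n

    def assign(self, p, q):                  # p := q
        self.var[p] = self.var[q]

    def store(self, p, f, q):                # p.f := q
        if self.var[p] == self.nil:
            raise HeapError("nil dereference")
        self.edge[f][self.var[p]] = self.var[q]

    def load(self, p, q, f):                 # p := q.f
        if self.var[q] == self.nil:
            raise HeapError("nil dereference")
        self.var[p] = self.edge[f][self.var[q]]

h = Heap(fields=["n"])
h.new("p"); h.new("q")
h.store("p", "n", "q")                       # p.n := q
h.load("r", "p", "n")                        # r := p.n
assert h.var["r"] == h.var["q"]
```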

Let P = s1; s2; ...; sm be a program and G0 be a graph. Furthermore, let Pi = s1; s2; ...; si be the program obtained by taking only the first i statements of P. We define the relation G0 ֒→Pi Gi if there exists a sequence of graphs G1, G2, ..., Gi−1 such that Gj−1 ֒→sj Gj, for every j ∈ [i]. In other words, the relation ֒→Pi captures the effect/result (Gi) that the execution of Pi has on the initial graph G0. Henceforth, for every graph G, we denote by Gi^P the graph such that G ֒→Pi Gi^P, and by G^P the graph such that G ֒→Pm G^P, which corresponds to the graph obtained at the end of the execution of P starting with the graph G.

3.5.3 The Problem

Let R be a recursively defined data-structure, Pre, Post be two Strand formulas, and P ::= s1; s2; ...; sm be a program. The problem we want to address is that of checking whether there exists a graph G ∈ Graph(R) on which Pre holds, such that either (1) G^P exists and Post does not hold on it, or (2) there exists an index i such that Gi^P exists and the execution of si+1 gives an error. We show that this problem is reducible to the satisfiability problem of a Strand formula over a new recursively defined data-structure RP, defined from R and P, called the trail.

3.5.4 The Trail

The idea is to capture the entire computation starting from a particular data-structure using a single data-structure. The main intuition is that if we run P over a graph G0 ∈ Graph(R), then a new class of recursively defined data-structures RP will define a graph Gtrail which encodes G0 as well as all the graphs Gi, for every i ∈ [m]. Gtrail has the nodes of G0 plus m other fresh nodes (these nodes will be used to model newly created nodes that P creates, as well as to hold new data-values of variables that are assigned to in P). Each of these new nodes is pointed to by a distinguished pointer variable newi. Initially, these additional nodes are all inactive in G0. We build an MSO-defined unary predicate activei that captures, at each step i, the precise set of active nodes in the heap. To capture the pointer variables at each step of the execution, we define a new unary predicate pi, for each p ∈ Var and i ∈ [0, m]. Similarly, we create MSO-defined binary predicates fi, for each f ∈ F and i ∈ [0, m], to capture the structural pointer fields at step i. The heap Gi at step i is hence the graph consisting of all the nodes x of Gtrail such that activei(x) holds true, where the pointers and edges of Gi are defined by the pi and fi predicates, respectively.

Formally, fix a recursively defined data-structure R = (ψ_Tr, ψ_U, {α_p}_{p∈Var}, {β_f}_{f∈F}), with a monadic predicate α_xnil, which evaluates to a unique NIL node in the data-structure. Then its trail with respect to the program P is defined as R_P = (ψ'_Tr, ψ'_U, {α'_p}_{p∈Var'}, {β'_f}_{f∈F'}) where:

• ψ'_Tr is designed to hold on all trees in which the first subtree of the root satisfies ψ_Tr and the second child of the root has a chain of m − 1 nodes, where each of them is the second child of its parent.

• ψ'_U holds true on the root, on all the second-child descendants of the root, and on all first-child descendants on which ψ_U holds true.

• Var' = {new_i | i ∈ [m]} ∪ {p_i | p ∈ Var, i ∈ [0, m]}, and
  (1) α'_{new_1} holds only on the root, and α'_{new_{i+1}} holds true only on the i-th node of the chain rooted at the second child of the root, for every i ∈ [m − 1];
  (2) for every p ∈ Var and i ∈ [m], α'_{p_0} = α_p and α'_{p_i} is defined as in Figure 3.6 and Figure 3.7.

• F' = {f_i | f ∈ F, i ∈ [0, m]}, and for every f ∈ F and i ∈ [m], β'_{f_0} = β_f and β'_{f_i} is defined as in Figure 3.6 and Figure 3.7.

In Figure 3.6, the MSO formulas α'_{p_i} and β'_{f_i} are derived from the semantics of the non-updating statements. In Figure 3.7, similar formulas are derived for the updating statements. All the formulas are derived in the natural way, except for p.data := h(q1.data, ..., qk.data). Although the semantics of this statement does not involve any structural modification of the graph (it changes only the data value associated with p), we represent this operation by making a new version of the node pointed to by p, in order to represent explicitly the change of the data value corresponding to that node. We deactivate the node pointed to by p_{i−1} and activate the dormant node pointed to by new_i. All the edges in the graph and the pointer variables are rearranged to reflect this exchange of nodes.
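The node-versioning trick for data-assignments can be sketched concretely on a toy mutable heap (the dictionary representation and helper below are hypothetical; the dissertation performs the same exchange purely with MSO predicates):

```python
def version_data_node(heap, p, new_node, new_data):
    """Deactivate the node p points to, activate the dormant node new_node,
    and rewire all pointers and edges so new_node takes the old node's place."""
    old = heap["vars"][p]
    heap["active"].discard(old)
    heap["active"].add(new_node)
    # pointer variables that targeted the old node now target its new version
    for v, tgt in heap["vars"].items():
        if tgt == old:
            heap["vars"][v] = new_node
    # incoming field edges are redirected; outgoing edges are inherited
    for (src, f), tgt in list(heap["fields"].items()):
        if tgt == old:
            heap["fields"][(src, f)] = new_node
        if src == old:
            del heap["fields"][(src, f)]
            heap["fields"][(new_node, f)] = new_node if tgt == old else tgt
    heap["data"][new_node] = new_data
    return heap

h = {"vars": {"p": 1, "q": 2}, "fields": {(2, "next"): 1},
     "active": {1, 2}, "data": {1: 5, 2: 7}}
version_data_node(h, "p", 99, 6)
assert h["vars"]["p"] == 99 and h["fields"][(2, "next")] == 99
assert 1 not in h["active"] and 99 in h["active"] and h["data"][99] == 6
```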

[p := nil]:
    α'_{p_i}(x) = α'_{xnil_{i−1}}(x),   α'_{z_i}(x) = α'_{z_{i−1}}(x), ∀z ∈ Var \ {p}
    β'_{f_i}(x, y) = β'_{f_{i−1}}(x, y), ∀f ∈ F
    active_i(x) = active_{i−1}(x),   error_i = false

[p := q]:
    α'_{p_i}(x) = α'_{q_{i−1}}(x),   α'_{z_i}(x) = α'_{z_{i−1}}(x), ∀z ∈ Var \ {p}
    β'_{f_i}(x, y) = β'_{f_{i−1}}(x, y), ∀f ∈ F
    active_i(x) = active_{i−1}(x),   error_i = false

[p := q.f]:
    α'_{p_i}(x) = ∃ex. (α'_{q_{i−1}}(ex) ∧ β'_{f_{i−1}}(ex, x))
    α'_{z_i}(x) = α'_{z_{i−1}}(x), ∀z ∈ Var \ {p}
    β'_{f_i}(x, y) = β'_{f_{i−1}}(x, y)
    β'_{g_i}(x, y) = β'_{g_{i−1}}(x, y), ∀g ∈ F \ {f}
    active_i(x) = active_{i−1}(x),   error_i = ∃x. (α'_{q_{i−1}}(x) ∧ α'_{xnil_{i−1}}(x))

[assume(ψ)]:
    α'_{q_i}(x) = α'_{q_{i−1}}(x), ∀q ∈ Var
    β'_{f_i}(x, y) = β'_{f_{i−1}}(x, y), ∀f ∈ F
    active_i(x) = active_{i−1}(x),   error_i = ⋁_{p ∈ Var(ψ)} ∃x. (α'_{p_{i−1}}(x) ∧ α'_{xnil_{i−1}}(x))
    where Var(ψ) is the set of all variables occurring in ψ.

Figure 3.6: Predicates defined for non-updating statements.

In Figure 3.6 and Figure 3.7, we also define two more MSO formulas, active_i and error_i, which are not part of the trail: the first models the active nodes at step i, and the second expresses when an error occurs due to the dereferencing of a variable pointing to xnil.

[p := new]:
    α'_{p_i}(x) = α'_{new_i}(x),   α'_{q_i}(x) = α'_{q_{i−1}}(x), ∀q ∈ Var \ {p}
    β'_{f_i}(x, y) = β'_{f_{i−1}}(x, y), ∀f ∈ F
    active_i(x) = active_{i−1}(x) ∨ α'_{new_i}(x),   error_i = false

[free(p)]:
    α'_{z_i}(x) = (α'_{z_{i−1}}(x) ∧ (α'_{xnil_{i−1}}(x) ∨ ¬α'_{p_{i−1}}(x))) ∨ (α'_{xnil_{i−1}}(x) ∧ ∃ex. (α'_{z_{i−1}}(ex) ∧ α'_{p_{i−1}}(ex))), ∀z ∈ Var
    β'_{f_i}(x, y) = (β'_{f_{i−1}}(x, y) ∧ ¬α'_{p_{i−1}}(x)) ∨ (α'_{xnil_{i−1}}(y) ∧ ∃ex. (β'_{f_{i−1}}(x, ex) ∧ α'_{p_{i−1}}(ex))), ∀f ∈ F
    active_i(x) = active_{i−1}(x) ∧ ¬α'_{p_{i−1}}(x)
    error_i = ∃x. (α'_{p_{i−1}}(x) ∧ α'_{xnil_{i−1}}(x))

[p.f := q]:
    α'_{z_i}(x) = α'_{z_{i−1}}(x), ∀z ∈ Var
    β'_{f_i}(x, y) = (¬α'_{p_{i−1}}(x) ∧ β'_{f_{i−1}}(x, y)) ∨ (α'_{p_{i−1}}(x) ∧ α'_{q_{i−1}}(y))
    β'_{g_i}(x, y) = β'_{g_{i−1}}(x, y), ∀g ∈ F \ {f}
    active_i(x) = active_{i−1}(x),   error_i = ∃x. (α'_{p_{i−1}}(x) ∧ α'_{xnil_{i−1}}(x))

[p.data := h(q1.data, ..., qk.data)]:
    α'_{p_i}(x) = α'_{new_i}(x),   α'_{q_i}(x) = α'_{q_{i−1}}(x), ∀q ∈ Var \ {p}
    β'_{f_i}(x, y) = (β'_{f_{i−1}}(x, y) ∧ ¬α'_{p_{i−1}}(x)) ∨ (α'_{new_i}(y) ∧ ∃ex. (β'_{f_{i−1}}(x, ex) ∧ α'_{p_{i−1}}(ex))), ∀f ∈ F
    active_i(x) = (active_{i−1}(x) ∧ ¬α'_{p_{i−1}}(x)) ∨ α'_{new_i}(x)
    error_i = ⋁_{z ∈ {p, q1, ..., qk}} ∃x. (α'_{z_{i−1}}(x) ∧ α'_{xnil_{i−1}}(x))

Figure 3.7: Predicates defined for updating statements.

3.5.5 Handling Data Constraints

The trail R_P captures all the structural modifications made to the graph during the execution of P. However, data constraints entailed by assume statements and data-assignments cannot be expressed in the trail, as they are not expressible in MSO. We impose them in the Strand formula. We define a formula ϕ_i for each statement index i ∈ [m], where if s_i is not an assume or a data-assignment statement, then ϕ_i = true. Otherwise, there are two cases:

Handling assume-statements. If s_i is the statement assume(ψ), then ϕ_i is the Strand formula obtained by adapting the condition ψ to the i-th stage of the trail. We use the recursive translation ψ' = adapt_i^ψ, defined in Figure 3.8, to adapt the condition ψ to refer to the trail at step i − 1. The formula ψ' is not yet a Strand formula, since there may be existential quantifiers inside the formula that involve data variables. However, none of these existential quantifiers is in the scope of any Boolean negation, and hence all of them can be moved to the beginning of ψ' (after renaming variables, if necessary). The resulting formula ϕ_i is the Strand formula associated with s_i.

adapt_i^{p(x)} := α'_{p_{i−1}}(x)
adapt_i^{¬p(x)} := ¬α'_{p_{i−1}}(x)
adapt_i^{p.f(x)} := ∃ex. (β'_{f_{i−1}}(ex, x) ∧ adapt_i^{p}(ex))
adapt_i^{¬p.f(x)} := ∃ex. (β'_{f_{i−1}}(ex, x) ∧ adapt_i^{¬p}(ex))
adapt_i^{p==q} := ∃ex. (adapt_i^{p}(ex) ∧ adapt_i^{q}(ex))
adapt_i^{p≠q} := ∃ex. (adapt_i^{p}(ex) ∧ adapt_i^{¬q}(ex))
adapt_i^{p==nil} := ∃ex. (adapt_i^{p}(ex) ∧ α'_{xnil_{i−1}}(ex))
adapt_i^{p≠nil} := ∃ex. (adapt_i^{p}(ex) ∧ ¬α'_{xnil_{i−1}}(ex))
adapt_i^{ψ1∧ψ2} := adapt_i^{ψ1} ∧ adapt_i^{ψ2}
adapt_i^{ψ1∨ψ2} := adapt_i^{ψ1} ∨ adapt_i^{ψ2}
adapt_i^{r(q1.data,...,qk.data)} := ∃ex1, ..., exk. (⋀_{j∈[k]} adapt_i^{qj}(exj)) ∧ r(data(ex1), ..., data(exk))
adapt_i^{¬r(q1.data,...,qk.data)} := ∃ex1, ..., exk. (⋀_{j∈[k]} adapt_i^{qj}(exj)) ∧ ¬r(data(ex1), ..., data(exk))

Figure 3.8: A syntactic transformation for conditions.
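The adapt translation is a straightforward recursion over the condition's syntax. A sketch over a hypothetical tuple-based AST, covering only a few of the cases from Figure 3.8 (the tuple encoding is invented for illustration):

```python
def adapt(i, psi):
    """Rewrite condition psi to refer to the trail's step-(i-1) predicates."""
    kind = psi[0]
    if kind == "var":                 # p(x)  ->  alpha'_{p_{i-1}}(x)
        _, p, x = psi
        return ("alpha", p, i - 1, x)
    if kind in ("and", "or"):
        return (kind, adapt(i, psi[1]), adapt(i, psi[2]))
    if kind == "eq":                  # p == q  ->  exists ex. p(ex) & q(ex)
        _, p, q = psi
        return ("exists", "ex",
                ("and", adapt(i, ("var", p, "ex")),
                        adapt(i, ("var", q, "ex"))))
    raise ValueError("unknown condition: %r" % (kind,))

assert adapt(3, ("var", "p", "x")) == ("alpha", "p", 2, "x")
assert adapt(1, ("eq", "p", "q")) == \
    ("exists", "ex", ("and", ("alpha", "p", 0, "ex"),
                             ("alpha", "q", 0, "ex")))
```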

Handling data-assignments. The Strand formula ϕ_i for a data-assignment statement p.data := h(q1.data, ..., qk.data) is:

ϕ_i := ∃ex, ex1, ..., exk. α'_{p_i}(ex) ∧ (⋀_{j∈[k]} α'_{qj_{i−1}}(exj)) ∧ data(ex) = h(data(ex1), ..., data(exk))

which translates s_i into Strand, making sure that it refers to the heap at step i − 1.

3.5.6 Adapting Pre- and Post-Conditions to the Trail

The last ingredient that we need is to express the Strand^{∃,∀} formula Pre and the negation of Post on the trail R_P. More specifically, we need to adapt Pre to the trail for index 0, which corresponds to the original graph, i.e., the predicates p are replaced with p_0, for every p ∈ Var, and the edge predicates f with f_0, for every f ∈ F. Moreover, whenever we refer to a node in the graph, we need to be sure that the node is active, which can be done by using the predicate active_0(x), which holds true if x is in the first subtree of the root and ψ'_U(x) holds. A similar transformation is done for the formula ¬Post, where now we consider pointer variables, edge labels, and active nodes at the last step m. Let Pre_{R_P} (resp., Post_{R_P}) be the Strand formula corresponding to the adaptation of Pre (resp., Post).

3.5.7 Reduction to Satisfiability Problem on the Trail

It is easy to see that an error occurs during the execution of P on some graph defined through R that satisfies Pre iff the following Strand formula is satisfiable on the trail R_P:

Error = ⋁_{i∈[m]} (Pre_{R_P} ∧ ⋀_{j∈[i−1]} ϕ_j ∧ error_i)

Similarly, the Hoare-triple is not valid iff the following Strand formula is satisfiable on the trail:

ViolatePost = Pre_{R_P} ∧ (⋀_{i∈[m]} ϕ_i) ∧ ¬Post_{R_P}

Therefore, the main result of this section, which claims that the verification of programs defined over recursively defined data-structures can be reduced to the satisfiability of a Strand formula, is formally stated as follows.

Theorem 3.5.1. Let P be a program, R be an RDDS, and Pre, Post be two Strand^{∃,∀} formulas over Var, F, and data. Then, there is a graph G ∈ Graph(R) that satisfies Pre and where either P terminates with an error or the obtained graph G' does not satisfy Post iff the Strand formula Error ∨ ViolatePost is satisfiable on the trail R_P.

Proof. Let us fix P as a basic block consisting of m statements s1, ..., sm. Then the soundness of the VC-generation can be proved by showing:

1. Each memory-error-free run of P can be represented by a model of R_P satisfying ⋀_{i∈[m]} ϕ_i;

2. If a run encounters a memory error at the i-th statement, it can be represented by a model of R_P satisfying Pre_{R_P} ∧ ⋀_{j∈[i−1]} ϕ_j ∧ error_i;

3. The pre- and post-conditions of a successful run can be captured by Pre_{R_P} and Post_{R_P} over R_P, respectively.

Now we prove the three claims as follows.

Successful runs. For the i-th statement, the structural modification is captured by the constraints on α' and β' in Figures 3.6 and 3.7. For example, if the i-th statement is p.f := q, the variable store is not modified at all, so α'_{z_i}(x) = α'_{z_{i−1}}(x) for every z ∈ Var. For the heap, the fields other than f are also unchanged, so β'_{g_i}(x, y) = β'_{g_{i−1}}(x, y) for every g ∈ F \ {f}. For the f field, β'_{f_i}(x, y) holds in two cases: either x is not pointed to by p at the (i − 1)-th configuration and β'_{f_{i−1}}(x, y) holds, or x is pointed to by p at the (i − 1)-th configuration and y is pointed to by q at the (i − 1)-th configuration. These modifications are formally captured by the predicate definitions in Figure 3.7, and we leave the other cases to the reader to verify.

The extra structural and data modifications for the i-th statement are captured by the formula ϕ_i. There are only two non-trivial cases: the assume-statements and the data-assignments. The assume-statement assume(ψ) guarantees that ψ is satisfied at the (i − 1)-th configuration. In this case, ϕ_i is just the formula adapt_i^ψ defined in Figure 3.8, which inductively interprets ψ in the signature of R_P. In particular, for a constraint r(q1.data, ..., qk.data), ϕ_i guesses the nodes ex1, ..., exk that are pointed to by q1, ..., qk, respectively, and asserts that the relation r holds over the data fields of these nodes. As to the data-assignment p.data := h(q1.data, ..., qk.data), where h is an expression in the data-logic, the formula ϕ_i introduces ex1, ..., exk similarly, and the node ex for p. Here ex1, ..., exk represent where q1, ..., qk point to at the (i − 1)-th configuration, and ex represents where p points to at the i-th configuration. Moreover, the data value of ex is equal to h(data(ex1), ..., data(exk)).

Memory error at the j-th statement. If a memory error occurs at the j-th statement, we assume that the previous j − 1 statements are executed successfully. Then the memory error can be captured by the corresponding formula error_j defined in Figures 3.6 and 3.7. Intuitively, whenever a variable p is dereferenced or mutated, error_j asserts that ∃x. (α'_{p_{j−1}}(x) ∧ α'_{xnil_{j−1}}(x)). In particular, for a data-assignment of the form p.data := h(q1.data, ..., qk.data), in which k variables are dereferenced and one variable is modified, error_j consists of k + 1 disjuncts, one for each variable.

Pre- and post-conditions. Note that each model of the RDDS R_P encodes an execution of the basic block P, so every Strand formula about an intermediate i-th configuration can be easily interpreted on R_P: simply replace each α_p with α'_{p_i} and each β_f with β'_{f_i}. In particular, when interpreting Pre on the first configuration of R_P and interpreting Post on the last configuration of R_P, the obtained formulas are just Pre_{R_P} and Post_{R_P}, respectively.
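To make the reduction of Section 3.5.7 concrete, the two top-level formulas can be assembled mechanically from the per-statement constraints. A sketch, with formulas as plain strings and all names purely illustrative:

```python
def build_vcs(pre, neg_post, phis, errors):
    """Assemble Error and ViolatePost from Pre_{R_P}, the phi_i, and the error_i."""
    m = len(phis)
    # Error = OR over i of (Pre & phi_1 & ... & phi_{i-1} & error_i)
    error_vc = " | ".join(
        "(" + " & ".join([pre] + phis[: i - 1] + [errors[i - 1]]) + ")"
        for i in range(1, m + 1)
    )
    # ViolatePost = Pre & phi_1 & ... & phi_m & ~Post
    violate_post = " & ".join([pre] + phis + [neg_post])
    return error_vc, violate_post

err, vp = build_vcs("Pre", "~Post", ["phi1", "phi2"], ["err1", "err2"])
assert err == "(Pre & err1) | (Pre & phi1 & err2)"
assert vp == "Pre & phi1 & phi2 & ~Post"
```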

CHAPTER 4

DECISION PROCEDURES

Now that we have introduced recursively defined data-structures and the Strand logic, we turn our attention to checking satisfiability of Strand.

4.1 Overview

In general, the satisfiability of Strand is undecidable, even if its underlying data-logic D is decidable.

Theorem 4.1.1. The satisfiability of Strand is undecidable.

Proof. We show the undecidability via a reduction from the halting problem of 2-counter machines, which is a well-known undecidable problem [57]. Let D be linear integer arithmetic and R be the class of lists of even length. It is easy to model an execution of a 2-counter machine using a list with integers. Each configuration is represented by two adjacent nodes, which are labeled by the current instruction. The data fields of the two nodes hold the values of the two registers, respectively. The first two nodes form the initial configuration, the last two nodes form a halting configuration, and for any two consecutive configurations, the in-between instruction is executed correctly. Then a halting computation can be expressed by a Strand formula:

∃x1, x2. (head(x1) ∧ next(x1, x2) ∧ init conf(x1, x2)) ∧

∀z1, z2, z3, z4. ((odd(z1) ∧ next(z1, z2) ∧ next(z2, z3) ∧ next(z3, z4)) ⇒ one step exec(z1, z2, z3, z4)) ∧

∃y1, y2. (tail(y2) ∧ next(y1, y2) ∧ terminating(y1, y2))

Hence the halting problem of 2-counter machines reduces to the satisfiability of Strand. Notice that the satisfiability of the Strand logic is undecidable even though both underlying logics, the structural logic (MSO) and the data-logic D, are decidable.
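The shape of the encoding can be illustrated with a small executable check. The two-instruction machine and its semantics below are invented for illustration only; they are not the formal encoding of the proof:

```python
def one_step(instr, c1, c2):
    """Toy 2-counter semantics for two illustrative instructions."""
    if instr == "inc1":
        return c1 + 1, c2
    if instr == "dec2":
        return c1, c2 - 1
    return c1, c2

def encodes_run(pairs, program):
    """pairs[i] is configuration i, laid out as two adjacent list nodes whose
    data fields hold the two registers; check that consecutive configurations
    are related by one correct instruction step."""
    return all(one_step(program[i], *pairs[i]) == pairs[i + 1]
               for i in range(len(program)))

assert encodes_run([(0, 1), (1, 1), (1, 0)], ["inc1", "dec2"])
assert not encodes_run([(0, 1), (5, 1)], ["inc1"])
```

In the actual reduction, this "every consecutive pair of configurations is related by one correct step" condition is exactly what the universally quantified middle conjunct of the Strand formula asserts.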

4.1.1 Two Decidable Fragments

The primary contribution of this chapter is in identifying two decidable fragments of Strand: (1) a semantically defined fragment Strand_dec^sem, and (2) a syntactically defined fragment Strand_dec^syn. Both fragments admit automata-theoretic decision procedures.

The decision procedure for Strand_dec^sem is based on

a) abstracting the data-predicates in the formula with Boolean variables to obtain a formula purely on graphs;

b) extracting the set of minimal graphs with respect to a Satisfiability-Preserving Embedding (SPE) relation that is completely agnostic to the data-logic; this set is guaranteed to be finite for the two fragments; and

c) checking whether any of the minimal models admits a data-extension that satisfies the formula, using a data-logic solver.
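Steps (a)-(c) form a simple pipeline. A sketch with the three phases as pluggable components; all helpers are hypothetical stand-ins (`abstract` Boolean-abstracts the data predicates, `minimal_models` enumerates the SPE-minimal structures, `data_check` calls a data-logic solver):

```python
def decide_sat(psi, abstract, minimal_models, data_check):
    """Three-phase decision procedure sketch:
    (a) Boolean-abstract the data predicates into a pure graph formula,
    (b) enumerate the finitely many SPE-minimal graph models,
    (c) ask a data-logic solver whether some minimal model extends to data."""
    structural = abstract(psi)
    return any(data_check(psi, model) for model in minimal_models(structural))

# Trivial stand-ins just to exercise the control flow:
assert decide_sat("psi", lambda f: f,
                  lambda f: ["m1", "m2"],
                  lambda f, m: m == "m2") is True
assert decide_sat("psi", lambda f: f,
                  lambda f: [],
                  lambda f, m: True) is False
```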

As a syntactically defined fragment that is subsumed by Strand_dec^sem, Strand_dec^syn also admits the above decision procedure. However, we develop a new method to solve satisfiability for Strand_dec^syn using a notion called small models, which are not the precise set of minimal models but a slightly larger class of models. We also show that the decidable fragments can be used in the verification of pointer-manipulating programs. We implemented the decision procedures for both Strand_dec^sem and Strand_dec^syn, using Mona on tree-like data-structures for the graph logic and Z3 for quantifier-free arithmetic, and report experimental results in Hoare-style deductive verification of programs manipulating data-structures.

4.1.2 Some Intuitions

We here give some intuition behind the decision procedures presented in this chapter.

Satisfiability-preserving embeddings: The crucial notion behind both decidable fragments is that of Satisfiability-Preserving Embeddings (SPEs). Intuitively, for two heap structures (without data) S and S', S satisfiability-preservingly embeds in S' with respect to a Strand formula ψ if there is an embedding of the nodes of S in S' such that, no matter how the data-logic constraints are interpreted, if S' satisfies ψ, then so will the submodel S, by inheriting the data-values. We define the notion of satisfiability-preserving embeddings so that it is entirely structural in nature, and is definable using MSO on an underlying graph that simultaneously represents S, S', and the embedding of S in S'. If S satisfiability-preservingly embeds in S', then clearly, when checking for satisfiability, we can ignore S' if we check satisfiability for S. More generally, the satisfiability check can be done only for the minimal structures with respect to the partial-order (a well-founded order) defined by SPEs.

The semantic decidable fragment Strand_dec^sem is defined to be the class of all formulas for which the set of minimal structures with respect to satisfiability-preserving embeddings is finite, and for which the quantifier-free theory of the underlying data-logic is decidable. Though this fragment of Strand is semantically defined, we show that it is syntactically checkable. Given a Strand formula ψ, we show that we can build a regular finite representation of the set of all minimal models with respect to satisfiability-preserving embeddings, even if it is an infinite set, using automata theory. Then, checking whether the number of minimal models is finite is decidable. If the set of minimal models is finite, we show how to enumerate the models, and reduce the problem of checking whether they admit a data-extension that satisfies ψ to a formula in the quantifier-free fragment of the underlying data-logic, which can then be decided.

The Bernays-Schönfinkel-Ramsey class: Having motivated formulas with ∃*∀* quantification in Section 3.2, it is worthwhile to examine this fragment in classical first-order logic (over arbitrary infinite universes), which is known as the Bernays-Schönfinkel-Ramsey class, and is a classical decidable fragment of first-order logic [13].

Consider first a purely relational vocabulary (assume there are no functions and not even constants). Given a formula ∃~x∀~y.ϕ(~x, ~y), let M be a model that satisfies this formula, and let v be an interpretation for ~x such that M under v satisfies ∀~y.ϕ(~x, ~y). Then it is not hard to argue that the submodel obtained by picking only the elements used in the interpretation of ~x (i.e., v(~x)), and projecting each relation to this smaller set, satisfies the formula ∃~x∀~y.ϕ(~x, ~y) as well [13]. Hence, if the formula is satisfiable, a model of size at most k always exists, where k is the length of the vector of existentially quantified variables ~x. This bounded model property extends to the case when constants are present as well (the submodel should include all the constants), but fails when functions are present. Satisfiability hence reduces to propositional satisfiability; this class is also called the effectively propositional class, and SMT solvers for it exist.

When the signature contains a single function symbol, the class is still decidable [13]; since an arbitrary projection to a smaller universe does not give rise to a model (as functions have to be defined on every element), the argument is slightly more complex, and creates finite pseudo-models of bounded size. When two or more functions are present, satisfiability of the fragment becomes undecidable [13].

The decidable fragment of Strand is fashioned after a similar but more complex argument. Given a subset of nodes of a model, the subset itself may not form a valid graph/data-structure. We define a notion of submodels that allows us to extract proper subgraphs that contain certain nodes of the model. However, the relations (edges) in the submodel will not be the projection of edges in the larger model. Consequently, the submodel may not satisfy a formula, even though the larger model does. We define a notion called satisfiability-preserving embeddings that allows us to identify when a submodel S of T is such that, whenever T satisfies ψ under some interpretation of the data-logic, S can inherit values from T to satisfy ψ as well. This is considerably more complex and is the main technical contribution of this chapter. We then build decision procedures that check the minimal models according to this embedding relation.
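The relational-vocabulary argument above can be checked on a toy finite model (a hypothetical concrete representation; the classical result is over arbitrary universes):

```python
def project(relations, witnesses):
    """Restrict the universe to the existential witnesses and project every
    relation to the tuples that stay inside the small universe."""
    small = set(witnesses)
    small_rels = {name: {t for t in tuples if set(t) <= small}
                  for name, tuples in relations.items()}
    return small, small_rels

# exists x forall y. R(x, y): satisfied over universe {1, 2, 3} with witness x = 1.
rels = {"R": {(1, 1), (1, 2), (1, 3), (2, 1)}}
small, srels = project(rels, [1])
assert small == {1} and srels["R"] == {(1, 1)}
# forall y over the projected universe still holds for the witness x = 1:
assert all((1, y) in srels["R"] for y in small)
```

The universal quantifier only gets weaker when the universe shrinks, and the atoms over the surviving elements keep their truth values under projection; this is exactly why the witnesses alone form a model.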

Organization: We first formally define SPEs in Section 4.2, then present the two decidable fragments and their decision procedures in Sections 4.3 and 4.4, respectively. We also report on and evaluate an implementation of the above decision procedures, with experimental results, in Section 4.5. We conclude this chapter with related work in Section 4.6.

4.2 Satisfiability-Preserving Embeddings

Given a Strand formula ∃~x∀~y.ϕ(~x, ~y) over an RDDS R = (ψ_Tr, ψ_U, {α_a}_{a∈Lv}, {β_b}_{b∈Le}), we can transform it to an equisatisfiable formula ∀~x∀~y.ϕ'(~x, ~y) over a different RDDS R', where data-structures in R' are data-structures in R with new unary predicates that give a valuation for the variables in ~x.

R' can be formally defined as R' = (ψ'_Tr, ψ'_U, {α'_a}_{a∈L'_v}, {β'_b}_{b∈L'_e}), where:

• L'_v = Lv ∪ {a_i | x_i ∈ ~x}

• L'_e = Le

• ψ'_Tr ≡ ψ_Tr ∧ ⋀_{x_i∈~x} ∃z. (ψ_U(z) ∧ Q_{a_i}(z) ∧ ∀z'. (Q_{a_i}(z') ⇒ z' = z))

• ψ'_U ≡ ψ_U

• α'_a ≡ α_a for every a ∈ Lv

• α'_{a_i} ≡ true for every x_i ∈ ~x

• β'_b ≡ β_b for every b ∈ Le

Intuitively, we modify ψ_Tr to accept trees with extra labelings a_i that give (an arbitrary) singleton valuation for each x_i ∈ ~x that satisfies ψ_U, and introduce new unary predicates Val_i(x) = Q_{a_i}(x). Then we can define ϕ'(~x, ~y) to be

(⋀_i Val_i(x_i)) ⇒ ϕ(~x, ~y)

It is easy to see that there is a graph in Graph(R) that satisfies ∃~x∀~y.ϕ(~x, ~y) iff there is a graph in Graph(R') that satisfies ∀~x∀~y.ϕ'(~x, ~y). The latter is a Strand formula with no existential quantification over variables whose data is referred to by the formula. Let us refer to these formulas with no leading existential quantification on data-variables as universal Strand formulas; we will now outline techniques to solve the satisfiability problem for a certain class of universal Strand formulas. Let ψ = ∀~y. ϕ(~y) be a universal Strand formula.

4.2.1 Structural Abstraction

Before defining SPEs formally, we first define a notion called the structural abstraction of ψ. Let γ1, γ2, ..., γr be the atomic relational formulas of the data-logic in ϕ. Note that each of these relational formulas will be over the data fields of variables in ~y only (since the data-logic is restricted to working over the terms data(y), where y ∈ ~y).

Consider evaluating ψ over a particular model. After fixing a particular valuation of ~y, notice that the data-relations γi all get fixed, and evaluate to true or false. Moreover, once the values of the γi are fixed, the rest of the formula is purely structural in nature. Now, if ψ is to hold in the model, then no matter how we choose to evaluate ~y over the nodes of the model, the γi relations must evaluate to true or false in such a way that ϕ holds.

Since we want, in the first phase, to ignore the data-constraints entirely, we will abstract ψ into a purely structural formula by using Boolean variables b1, ..., br instead of the data-relations γ1, γ2, ..., γr. However, since these Boolean variables get determined only after the valuation of ~y gets determined, and since we are solving for satisfiability, we existentially quantify over these Boolean variables, and quantify them after the quantification of ~y. Formally,

Definition 4.2.1. Let ψ = ∀~y. ϕ(~y) be a universal Strand formula, and let the atomic relational formulas of the data-logic that occur in ϕ be γ1, γ2, ..., γr. Then its structural abstraction ψ̂ is defined as the pure MSO formula on graphs:

∀~y ∃b1, ..., br. ϕ'(~y, ~b)

where ϕ' is ϕ with every occurrence of γi replaced with bi.

Remark: The definition of structural abstractions can be strengthened in two ways. First, if γi and γj are of the same arity, over ~z and ~z' respectively, and uniformly replacing each z_l with z'_l in γi yields γj, then we can add the constraint ((~z = ~z') ⇒ (bi ⇔ bj)) to the inner formula ϕ'. Second, if a constraint γi involves only existentially quantified variables in ~x, then we can move the quantification of bi outside the universal quantification. These steps give a more accurate structural abstraction and, in practice, restrict the number of models created. We use these more precise abstractions in the experiments, but use the less precise abstractions in the theoretical narrative. The proofs in this section, however, smoothly extend to the more precise abstractions.

Example 4.2.2. Consider the sortedness formula ψ_sorted from Example 3.4.3. Its structural abstraction can be derived as

ψ̂_sorted ≡ ∀y1, y2 ∃b1, b2, b3. ((b1 ∧ b2 ∧ (y1 →* y2)) ⇒ b3)

Note that each Boolean variable bi replaces an atomic relational formula γi, which is some data-constraint on the data-fields of some of the quantified variables.
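The abstraction itself is a simple syntactic pass. A sketch over a hypothetical tuple-based formula AST (the node kinds are invented for illustration): each atomic data-relation is replaced by a fresh Boolean, while structural atoms pass through unchanged.

```python
import itertools

def abstract(phi, counter=None):
    """Replace each atomic data-relation gamma_i with a fresh Boolean b_i."""
    counter = counter if counter is not None else itertools.count(1)
    if phi[0] == "data":                    # atomic data-relation gamma
        return ("bool", next(counter))
    if phi[0] in ("and", "or", "not", "implies"):
        return (phi[0],) + tuple(abstract(a, counter) for a in phi[1:])
    return phi                              # structural atom: unchanged

phi = ("implies",
       ("and", ("data", "d(y1)<=d(y2)"), ("edge", "y1", "y2")),
       ("data", "d(y1)<d(y2)"))
assert abstract(phi) == ("implies",
                         ("and", ("bool", 1), ("edge", "y1", "y2")),
                         ("bool", 2))
```

The existential quantification over b1, ..., br in Definition 4.2.1 is implicit here: the sketch only produces the matrix ϕ'.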

The structural abstraction has not only lost the constraints, it has even lost the precise variables whose data-fields each constraint was over. Nevertheless, the abstraction is enough to define the notion of satisfiability-preserving embeddings below. The following proposition is obvious; it says that if a universal Strand formula ψ is satisfiable, then so is its structural abstraction ψ̂. The proposition holds because the values of the Boolean variables can be set in the structural abstraction precisely according to how the relational formulas γi evaluate in ψ:

Proposition 4.2.3. Let ψ = ∀~y.ϕ(~y) be a universal Strand formula, and ψ̂ be its structural abstraction. If ψ is satisfiable over an RDDS R, then the MSO formula on graphs (with no constraints on data) ψ̂ is also satisfiable over R.

Proof. Let G ∈ Graph(R) be a graph with a data-extension satisfying ψ = ∀~y.ϕ(~y); we show that G satisfies the structural abstraction ψ̂ = ∀~y ∃b1, ..., br. ϕ'(~y, ~b). Let the atomic data-logic subformulas appearing in ψ be γ1, ..., γr. Then, for each assignment of ~y, simply assign each bi the same value as γi. The abstracted ϕ'(~y, ~b) is then obviously satisfied.

4.2.2 Formal Definition

We are now ready to define satisfiability-preserving embeddings using structural abstractions. Given a model defined by a tree T = (V, λ) satisfying ψ_Tr, a valid subset S ⊆ V, and a universal Strand formula ψ, we would like to define when the submodel defined by S satisfiability-preservingly embeds in the model. The most crucial requirement for the definition is that if S satisfiability-preservingly embeds in T, and there is a data-extension of Graph(T) that satisfies ψ, then the nodes of the submodel defined by S, Graph(Subtree(T, S)), can inherit the data-values and also satisfy ψ. The notion of structural abstractions defined above allows us to define such a notion.

Intuitively, if a model satisfies ψ, then it satisfies ψ̂ too, as for every valuation of ~y, there is some way it satisfies the atomic data-relations, and using this we can pull out a valuation for the Boolean variables that satisfies ψ̂ (as in the proof of Proposition 4.2.3 above). Now, since the data-values in the submodel are inherited from the larger model, the atomic data-relations hold in the same way as they do in the larger model. However, the submodel may not satisfy ψ if the conditions on the truth and falsehood of these atomic relations demanded by ψ are not the same.

For instance, consider a list and a sublist of it. Consider a formula that demands that for any two successor elements y1, y2 in the list, the data-value of y2 is the data-value of y1 incremented by 1 (as in Example 3.2.2):

ψ ≡ ∀y1∀y2. ((y1 → y2) ⇒ (d(y2) = d(y1) + 1))

Now consider two nodes y1 and y2 that are successors in the sublist but not successors in the list. The list could hence satisfy the formula by setting the data-relation γ : d(y2) = d(y1) + 1 to false. Since the sublist inherits the data values, γ would be false in the sublist as well, but the sublist will not satisfy the formula ψ. We hence want to ensure that no matter how the larger model satisfies the formula using some valuation of the atomic data-relations, the submodel will be able to satisfy the formula using the same valuation of the atomic data-relations. This leads us to the following definition:

Definition 4.2.4. Let ψ = ∀~y. ϕ(~y) be a universal Strand formula, and let its structural abstraction be ψ̂ = ∀~y ∃~b. ϕ'(~y, ~b). Let T = (V, λ) be a tree that satisfies ψ_Tr, and let a submodel be defined by S ⊆ V. Then S is said to satisfiability-preservingly embed into T wrt ψ if for every possible valuation of ~y over the elements of S, and for every possible Boolean valuation of ~b, if ϕ'(~y, ~b) holds in the graph defined by T under this valuation, then the submodel defined by S, Graph(Subtree(T, S)), also satisfies ϕ'(~y, ~b) under the same valuation. Moreover, S strictly satisfiability-preservingly embeds into T if S satisfiability-preservingly embeds into T and S ≠ V.

The SPE relation can be seen as a partial order over trees (a tree T' satisfiability-preservingly embeds into T if there is a subset S of the nodes of T such that S satisfiability-preservingly embeds into T and Subtree(T, S) is isomorphic to T'); it is easy to see that this relation is reflexive, anti-symmetric and transitive. It is now not hard to see that if S satisfiability-preservingly embeds into T wrt ψ, and Graph(T) satisfies ψ, then Graph(Subtree(T, S)) also necessarily satisfies ψ, which is the main theorem we seek.

Theorem 4.2.5. Let ψ = ∀~y.ϕ(~y) be a universal Strand formula. Let T = (V, λ) be a tree that satisfies ψ_Tr, and S be a valid subset of V that satisfiability-preservingly embeds into T wrt ψ. Then, if there is a data-extension of Graph(T) that satisfies ψ, there exists a data-extension of Graph(Subtree(T, S)) that satisfies ψ.

Proof. The proof goes along the lines of the arguments given above. Consider a data-extension of Graph(T) that satisfies ψ. Each node u of Graph(Subtree(T, S)) corresponds to a unique node u' in Graph(T) (see above). Define the data-extension of Graph(Subtree(T, S)) by assigning as the data-value of each node u the data-value of the corresponding node u' in Graph(T). For any valuation of ~y over Graph(Subtree(T, S)), consider the corresponding valuation over T. Since the data-extension of Graph(T) satisfies ψ, by Proposition 4.2.3, Graph(T) must satisfy the formula ϕ with its atomic data-predicates replaced by Boolean variables. Since S satisfiability-preservingly embeds in T, the same valuation of the Boolean variables also satisfies, in the submodel, ϕ with its atomic data-predicates replaced by Boolean variables. Since the data-extension of Graph(Subtree(T, S)) is derived by inheriting the data-values from the data-extension of Graph(T), it follows that the data-values of ~y satisfy the same predicates γi as they did in the data-extension of Graph(T). Hence ϕ(~y) holds in the extension of Graph(Subtree(T, S)) as well. Since this is true for any valuation of ~y over S, the data-extension of Graph(Subtree(T, S)) satisfies ψ.

Notice that the above theorem crucially depends on the formula being universal over data-variables. For example, if the formula was of the form

∀y1∃y2.γ(y1, y2), then we would have no way of knowing which nodes are used for y2 in the data-extension of Graph(T) to satisfy the formula. Without knowing the precise meaning of the data-predicates, we would not be able to conclude that whenever a data-extension of Graph(T) is satisfiable, a data-extension of a strict submodel S is satisfiable (even over lists). The above notion of SPEs is the property that will be used to decide whether a formula falls into our decidable fragments.

4.3 A Semantically Defined Fragment

We are now ready to define Strand_dec^sem. This fragment is semantically defined (but syntactically checkable, as we show below), and intuitively contains all Strand formulas that have a finite number of minimal models with respect to the partial-order defined by satisfiability-preserving embeddings.

Formally, let ψ = ∀~y.ϕ(~y) be a universal Strand formula, and let T = (V, λ) be a tree that satisfies ψ_Tr. Then we say that T is a minimal model with respect to ψ if there is no strict valid subset S of V that satisfiability-preservingly embeds in T.

Definition 4.3.1. Let R be an RDDS. A universal formula ψ = ∀~y. ϕ(~y) is in Strand_dec^sem iff the number of minimal models with respect to R and ψ is finite. A Strand formula of the form ψ = ∃~x ∀~y. ϕ(~x, ~y) is in Strand_dec^sem iff the corresponding equisatisfiable universal formula ψ' over the set of data-structures R' (as defined in Section 4.2) is in Strand_dec^sem.

4.3.1 Checking Membership in Strand_dec^sem

We now show that we can effectively check whether a Strand formula belongs to the decidable fragment Strand_dec^sem. The idea, intuitively, is to express that a model is a minimal model with respect to satisfiability-preserving embeddings, and then check, using automata theory, that the number of minimal models is finite. Let ψ = ∀~y.ϕ(~y) be a universal Strand formula, and let its structural abstraction be ψ̂ = ∀~y ∃~b. ϕ′(~y,~b).

interpret(p) = p
interpret(Qa(s)) = ψU(s) ∧ αa(s), for every a ∈ Lv
interpret(Eb(s, t)) = ψU(s) ∧ ψU(t) ∧ βb(s, t), for every b ∈ Le
interpret(s = t) = (ψU(s) ∧ ψU(t) ∧ s = t)
interpret(s ∈ W) = ψU(s) ∧ s ∈ W
interpret(ϕ1 ∨ ϕ2) = interpret(ϕ1) ∨ interpret(ϕ2)
interpret(¬ϕ) = ¬(interpret(ϕ))
interpret(∃s.ϕ) = ∃s.(ψU(s) ∧ interpret(ϕ))
interpret(∃W.ϕ) = ∃W.((∀s.(s ∈ W ⇒ ψU(s))) ∧ interpret(ϕ))

Figure 4.1: Definition of the interpret function

We now show that we can define an MSO formula MinModelψ that holds on a tree T = (V,λ) iff T defines a minimal model with respect to satisfiability-preserving embeddings. Before we do that, we need some technical results and notation. Let R = (ψTr, ψU, {αa}a∈Lv, {βb}b∈Le).

We first show that any (pure) MSO formula δ on (Lv, Le)-labeled graphs can be interpreted on trees. Formally, we show that any (pure) MSO formula δ on (Lv, Le)-labeled graphs can be transformed syntactically into a (pure) MSO formula interpret(δ) on trees such that for any tree T = (V,λ), Graph(T) satisfies δ iff T satisfies interpret(δ). This is not hard to do, since the graph is defined using MSO formulas on the trees, and we can adapt these definitions to work over the tree instead. In fact, this is the reason why MSO on recursive data-structures is decidable: we can translate the formula to trees, and check satisfiability of the transformed formula over trees that satisfy ψTr. The transformation is given by the following function interpret; the predicates for edges, and the predicates that check vertex labels and edge labels, are transformed according to their definition, and all quantified variables are restricted to quantify over nodes that satisfy ψU. The interpret function is presented in Figure 4.1.
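As an illustration, the interpret transformation of Figure 4.1 amounts to a straightforward recursive rewrite over formula syntax trees. The following toy sketch (not the actual implementation; it uses a hypothetical tuple-based encoding and keeps ψU, αa and βb symbolic) mirrors the figure's rules:

```python
def interpret(f):
    """Rewrite a graph-MSO formula into a tree-MSO formula per Figure 4.1.
    Formulas are nested tuples; psiU, alpha_a, beta_b stay symbolic."""
    op = f[0]
    if op == 'prop':                       # atomic proposition p
        return f
    if op == 'Q':                          # Q_a(s)
        _, a, s = f
        return ('and', ('psiU', s), ('alpha', a, s))
    if op == 'E':                          # E_b(s, t)
        _, b, s, t = f
        return ('and', ('psiU', s), ('and', ('psiU', t), ('beta', b, s, t)))
    if op == 'eq':                         # s = t, guarded by psiU
        _, s, t = f
        return ('and', ('psiU', s), ('and', ('psiU', t), ('eq', s, t)))
    if op == 'in':                         # s in W, guarded by psiU
        _, s, W = f
        return ('and', ('psiU', s), ('in', s, W))
    if op in ('or', 'not'):
        return (op,) + tuple(interpret(g) for g in f[1:])
    if op == 'exists':                     # first-order quantifier, restricted to psiU
        _, s, body = f
        return ('exists', s, ('and', ('psiU', s), interpret(body)))
    if op == 'existsSet':                  # set quantifier: W contains psiU nodes only
        _, W, body = f
        return ('existsSet', W, ('and', ('allInU', W), interpret(body)))
    raise ValueError(f'unknown connective: {op}')

assert interpret(('Q', 'a', 's')) == ('and', ('psiU', 's'), ('alpha', 'a', 's'))
```

The induction of Proposition 4.3.2 below corresponds exactly to the case split in this function.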

Proposition 4.3.2. For any RDDS R and for any formula δ w.r.t. R, the (Lv, Le)-labeled graph Graph(T) satisfies δ iff the tree T satisfies interpret(δ).

Proof. Let us fix the (Lv, Le)-labeled tree T, and prove the proposition by induction on the structure of δ:

[δ ≡ p ] p and interpret(p) (which is just p) are obviously equisatisfiable.

[δ ≡ Qa(s) ] By definition, a node s is labeled Qa in Graph(T) if and only if s satisfies both ψU and αa in T.

[δ ≡ Eb(s, t) ] Similar to the above case.

[δ ≡ s = t ] s and t represent the same node in Graph(T) if and only if the node satisfies ψU in T, i.e., ψU(s) ∧ ψU(t) ∧ s = t holds.

[δ ≡ s ∈ W ] Similar to the above case.

[δ ≡ ϕ1 ∨ ϕ2 ] As an inductive hypothesis, assume the proposition holds for both ϕ1 and ϕ2. Then Graph(T) satisfies δ iff Graph(T) satisfies ϕ1 or ϕ2, which, by induction, holds iff T satisfies interpret(ϕ1) or interpret(ϕ2), i.e., iff T satisfies interpret(ϕ1) ∨ interpret(ϕ2).

[δ ≡ ¬ϕ ] Similar to the above case.

[δ ≡ ∃s.ϕ ] As an inductive hypothesis, assume the proposition holds for ϕ. Now Graph(T) satisfies δ iff there exists a node s in Graph(T) such that ϕ(s) is true; in other words, by induction, iff there exists a node s in T satisfying ψU such that interpret(ϕ(s)) is true. This is exactly the case in which T satisfies interpret(∃s.ϕ).

[δ ≡ ∃W.ϕ ] Similar to the above case.

Now, we give another transformation, which transforms an MSO formula δ on trees to a formula tailorX(δ) on trees, over a free set-variable X, such that for any tree T = (V,λ) and any valid subset S ⊆ V, Subtree(T,S) satisfies δ iff T satisfies tailorX(δ) when X is interpreted to be S. In other words, we can transform a formula that expresses a property of a subtree to a formula that expresses the same property on the subtree defined by the free variable X. The transformation is given by the function tailor presented in Figure 4.2; the crucial transformations are the edge-formulas, which have to be interpreted as the edges of the subtree defined by X. The above tailor transformation leads to the following proposition:

tailorX(p) = p
tailorX(succi(s, t)) = s ∈ X ∧ t ∈ X ∧ ∃s′.[succi(s, s′) ∧ s′ ≤ t ∧ ∀t′.((t′ ∈ X ∧ s′ ≤ t′) ⇒ t ≤ t′)], for every i ∈ [k]
tailorX(s = t) = (s ∈ X ∧ t ∈ X ∧ s = t)
tailorX(s ∈ W) = s ∈ W ∧ W ⊆ X
tailorX(ϕ1 ∨ ϕ2) = tailorX(ϕ1) ∨ tailorX(ϕ2)
tailorX(¬ϕ) = ¬(tailorX(ϕ))
tailorX(∃s.ϕ) = ∃s.(s ∈ X ∧ tailorX(ϕ))
tailorX(∃W.ϕ) = ∃W.(W ⊆ X ∧ tailorX(ϕ))

Figure 4.2: Definition of the tailor function

Proposition 4.3.3. For any MSO sentence δ on k-ary trees, for any tree T = (V,λ) and for any valid subset S ⊆ V , Subtree(T,S) satisfies δ iff T satisfies tailorX (δ) when X is interpreted to be S.

Proof. Let us fix the tree T , and the valid subset S, and prove the proposition by induction on the structure of δ:

[δ ≡ p ] Obvious as p and tailor(p) are identical.

[δ ≡ succi(s, t) ] If Subtree(T,S) satisfies succi(s, t), then both s and t are in S; moreover, t is the i-th successor of s in the subtree, so by definition, t is the closest descendant of s′, the i-th successor of s in T. All these properties are captured by the formula tailorX(succi(s, t)) in T. For the reverse direction, when T satisfies tailorX(succi(s, t)), both s and t are in S and t is the i-th successor of s in Subtree(T,S).

[δ ≡ s = t ] When both s and t are in X, s = t and tailorX(s = t) are equisatisfiable.

[δ ≡ s ∈ W ] Similar to the above case.

[δ ≡ ϕ1 ∨ ϕ2 ] As an inductive hypothesis, assume the proposition holds for both ϕ1 and ϕ2. Then Subtree(T,S) satisfies δ iff Subtree(T,S) satisfies ϕ1 or ϕ2, which, by induction, holds iff T satisfies tailorX(ϕ1) or tailorX(ϕ2), i.e., iff T satisfies tailorX(ϕ1) ∨ tailorX(ϕ2).

[δ ≡ ¬ϕ ] Similar to the above case.

[δ ≡ ∃s.ϕ ] Similar to the above case.

[δ ≡ ∃W.ϕ ] Similar to the above case.

Note that the above transformations can be combined. For any MSO formula δ on (Lv, Le)-labeled graphs, consider the formula tailorX(interpret(δ)). For any tree T = (V,λ) and for any valid subset S ⊆ V, Graph(Subtree(T,S)) satisfies δ iff T satisfies tailorX(interpret(δ)), where X is interpreted as S.
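The subtree-edge semantics that tailorX(succi) captures can be illustrated concretely. In the toy sketch below (hypothetical representation, not part of the Strand implementation), t is the i-th successor of s in Subtree(T,S) exactly when t is the closest S-descendant of s′, the i-th successor of s in T:

```python
def descendants(T, u):
    """All descendants of u in T, including u itself; T maps each node to
    its list of children (index i-1 is the i-th successor)."""
    out, stack = set(), [u]
    while stack:
        v = stack.pop()
        if v is None or v in out:
            continue
        out.add(v)
        stack.extend(T.get(v, []))
    return out

def subtree_succ(T, S, s, i):
    """The i-th successor of s in Subtree(T, S), or None if it does not
    exist. Assumes S is LCA-closed, so the closest S-descendant is unique."""
    kids = T.get(s, [])
    if s not in S or i > len(kids) or kids[i - 1] is None:
        return None
    sp = kids[i - 1]                      # s' = i-th successor of s in T
    cands = [t for t in S if t in descendants(T, sp)]
    # the closest S-descendant of s' is the candidate above all others
    for t in cands:
        if all(u in descendants(T, t) for u in cands):
            return t
    return None

# Example: a unary path r -> a -> b -> c with S = {r, b}.
T = {'r': ['a'], 'a': ['b'], 'b': ['c'], 'c': []}
assert subtree_succ(T, {'r', 'b'}, 'r', 1) == 'b'   # b is closest S-descendant of a
assert subtree_succ(T, {'r', 'b'}, 'b', 1) is None  # no S-node below c's side
```

The existential/universal structure of tailorX(succi) in Figure 4.2 states exactly the membership and closest-descendant conditions checked here.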

4.3.2 Expressing Minimal Models in MSO

Recall that we can express an MSO formula ValidSubModel(X), with a free set variable X, that holds in a tree T = (V,λ) iff X is interpreted as a valid submodel of T. This is easy: we express the property that X is LCA-closed, and also check that the subtree defined by X satisfies ψTr:

ValidSubModel(X) ≡ ∀s,t,u. ((s ∈ X ∧ t ∈ X ∧ lca(s,t,u)) ⇒ u ∈ X)
                 ∧ tailorX(ψTr)
                 ∧ ∀s. ((s ∈ X ∧ tailorX(ψU(s))) ⇒ ψU(s))

where lca(s,t,u) is an MSO formula that checks whether u is the LCA of s and t in the tree; this expresses the requirements in Definition 3.3.4.

Now we can also define the MSO formula MinModelψ on k-ary trees that captures minimal models. Let ψ̂ = ∀~y ∃~b. ϕ′(~y,~b) be the structural abstraction of ψ. Then

MinModelψ ≡ ¬∃X. ( ValidSubset(X) ∧ ∃s.(s ∈ X) ∧ ∃s.(s ∉ X)
            ∧ ∀~y ∀~p. ( (∧_{y∈~y} (y ∈ X ∧ ψU(y)) ∧ interpret(ϕ̂(~y, ~p)))
                         ⇒ tailorX(interpret(ϕ̂(~y, ~p))) ) )

The above formula, when interpreted on a tree T, says that there does not exist a set X that defines a non-empty valid strict subset of the nodes of T, which in turn defines a model Graph(Subtree(T,X)) satisfying the following: for every valuation of ~y over the nodes of Graph(Subtree(T,X)) and for every valuation of the Boolean variables ~p such that the structural abstraction of ϕ holds in Graph(T), the same valuation also makes the structural abstraction of ϕ hold in Graph(Subtree(T,X)).

Theorem 4.3.4. Given a sentence ψ = ∃~x∀~y. ϕ(~x, ~y), the membership problem for the fragment Strand_dec^sem is decidable.

Proof. Note that MinModelψ is a pure MSO formula on trees, and encodes the properties required of a minimal model with respect to satisfiability-preserving embeddings. Using the classical logic-automaton connection [13], we can transform the MSO formula MinModelψ ∧ ψTr ∧ ψ̂ into a tree automaton that accepts precisely those trees that define data-structures satisfying the structural abstraction and are minimal models. Since finiteness of the language accepted by a tree automaton is decidable, we can check whether there are only finitely many minimal models w.r.t. SPEs, and hence decide membership in the decidable fragment Strand_dec^sem.

4.3.3 Deciding Formulas in Strand_dec^sem

We now give the decision procedure for satisfiability of Strand_dec^sem formulas over an RDDS. First, we transform the satisfiability problem to that of satisfiability of universal formulas of the form ψ = ∀~y. ϕ(~y). Then, using the formula MinModelψ described above, and by transforming it to tree automata, we extract the set of all trees accepted by the tree automaton in order to get the tree-representation of all the minimal models. Note that this set of minimal models is finite, and the sentence is satisfiable iff it is satisfiable in some data-extension of one of these models. We can now write a quantifier-free formula over the data-logic that asserts that one of the minimal models has a data-extension satisfying ψ. This formula is a disjunction of m sub-formulas η1,...,ηm, where m is the number of minimal models. Each formula ηi expresses that there is a data-extension of the i-th minimal model that satisfies ψ. First, since a minimal model has only a finite number of nodes, we create one data-variable for each of these nodes, and associate them with the nodes of the model. It is now not hard to translate the formula ψ to this model using no quantification. The universal quantification over ~y translates to a conjunction of formulas over all possible valuations of ~y over the nodes of the fixed model. Existentially (universally) quantified variables are then "expanded" using disjunctions (conjunctions, respectively) of formulas for all possible valuations over the fixed model. The edge-relations between nodes in the model are interpreted on the tree using the MSO formulas in R, which are then expanded to conditions over the fixed set of nodes in the model. Finally, the data-constraints in the Strand formula are directly written as constraints in the data-logic. The resulting formula is a pure quantifier-free data-logic formula that is satisfiable if and only if ψ is satisfiable over R. This is then decided using the decision procedure for the data-logic.
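The quantifier expansion over a fixed finite model can be sketched as follows (a toy formula representation with hypothetical connective names; the real procedure emits constraints for the data-logic solver rather than tuples):

```python
def expand(formula, nodes, env=None):
    """Expand quantifiers of a toy formula AST over a finite set of nodes:
    'forall' becomes a finite conjunction, 'exists' a finite disjunction,
    and atoms are grounded by substituting the current valuation."""
    env = env or {}
    op = formula[0]
    if op == 'forall':
        _, y, body = formula
        return ('and',) + tuple(expand(body, nodes, {**env, y: n}) for n in nodes)
    if op == 'exists':
        _, x, body = formula
        return ('or',) + tuple(expand(body, nodes, {**env, x: n}) for n in nodes)
    if op == 'atom':
        _, pred, *args = formula
        return ('atom', pred) + tuple(env.get(a, a) for a in args)
    if op in ('and', 'or', 'not'):
        return (op,) + tuple(expand(g, nodes, env) for g in formula[1:])
    raise ValueError(f'unknown connective: {op}')

# forall y. le(data(y), data(y)) over a two-node model becomes a conjunction:
phi = ('forall', 'y', ('atom', 'le', 'y', 'y'))
assert expand(phi, ['n1', 'n2']) == \
    ('and', ('atom', 'le', 'n1', 'n1'), ('atom', 'le', 'n2', 'n2'))
```

The grounded atoms that result are precisely the quantifier-free data-constraints handed to the data-logic decision procedure.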

Theorem 4.3.5. Given a sentence ψ = ∃~x∀~y. ϕ(~x, ~y) over R in Strand_dec^sem, the problem of checking whether ψ is satisfiable reduces to the satisfiability of a quantifier-free formula in the data-logic. Since the quantifier-free data-logic is decidable, the satisfiability of Strand_dec^sem formulas is decidable.

Proof. Decidability follows from the decision procedure described above: every step of the procedure is effective.

4.4 A Syntactically Defined Fragment

The bottleneck in the decision procedure for Strand_dec^sem is the phase that computes the set of all minimal models. This is done using monadic second-order logic (MSO) on trees, and is achieved by a complex MSO formula that has quantifier alternations and also includes adaptations of the Strand formula twice within it. In experiments, this phase is clearly the bottleneck (for example, a verification condition for binary search trees takes about 30s, while the time spent by Z3 on the minimal models is less than 0.5s). We now define another decidable fragment Strand_dec^syn, which is also a fragment of Strand_dec^sem, using the notion of elastic relations. Hence Strand_dec^syn admits the same decision procedure as Strand_dec^sem. Given an RDDS R and a first-order theory D, the syntax of Strand_dec^syn is presented in Figure 4.3. We let Le^E denote those edge labels b such that Eb is elastic, and Le^NE denote the other, non-elastic labels. Intuitively, Strand_dec^syn formulas are of the kind ∃~x∀~y.ϕ(~x, ~y), where ϕ is quantifier-free and combines both data-constraints and structural constraints, with the important restriction that the atomic relations involving universally quantified variables be only elastic relations.

Formula ψ ::= ∃x.ψ | ω
∀Formula ω ::= ∀y.ω | ϕ
QFFormula ϕ ::= γ(e1,...,en) | Qa(v) | Eb(v, v′) | Eb′(x, x′) | ¬ϕ | ϕ1 ∧ ϕ2 | ϕ1 ∨ ϕ2
Expression e ::= df(x) | df(y) | c | g(e1,...,en)

∃DVar x ∈ Loc
∀DVar y ∈ Loc
Var v ::= x | y
Constant c ∈ Sig(D)
Function g ∈ Sig(D)
D-Relation γ ∈ Sig(D)
E-Relation b ∈ Le^E
NE-Relation b′ ∈ Le^NE
Predicate a ∈ Lv
DataField df ∈ DF

Figure 4.3: Syntax of Strand_dec^syn
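The key syntactic restriction of Figure 4.3 is easy to check mechanically. The following toy checker (hypothetical AST encoding, not the actual implementation) accepts a matrix ϕ(~x, ~y) only if every non-elastic edge atom mentions existentially quantified variables alone:

```python
def in_syntactic_fragment(matrix, exist_vars, elastic_labels):
    """Walk a toy AST of the quantifier-free matrix phi(x, y); return True
    iff every non-elastic edge atom E_b(v, v') has both arguments among the
    existentially quantified variables."""
    op = matrix[0]
    if op == 'E':                          # edge atom E_b(v, v')
        _, b, v1, v2 = matrix
        return b in elastic_labels or (v1 in exist_vars and v2 in exist_vars)
    if op in ('not', 'and', 'or'):
        return all(in_syntactic_fragment(g, exist_vars, elastic_labels)
                   for g in matrix[1:])
    return True                            # data atoms and predicates: unrestricted

# E.g. over lists with elastic reachability '->*' and non-elastic 'next':
phi_ok = ('and', ('E', '->*', 'x1', 'y1'), ('E', 'next', 'x1', 'x2'))
phi_bad = ('E', 'next', 'x1', 'y1')        # non-elastic edge touches a universal var
assert in_syntactic_fragment(phi_ok, {'x1', 'x2'}, {'->*'})
assert not in_syntactic_fragment(phi_bad, {'x1', 'x2'}, {'->*'})
```

This is exactly the grammar's separation between Eb(v, v′) with elastic b (any variables) and Eb′(x, x′) with non-elastic b′ (existential variables only).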

More importantly, we also present a new method to solve satisfiability for Strand_dec^syn using a notion called small models, which are not the precise set of minimal models but a slightly larger class of models. We show that the set of small models is always bounded, and also equisatisfiable (i.e., if any model satisfies the Strand_dec^syn formula, then some data-extension of a small model satisfies it). The salient feature of small models is that they can be expressed by a much simpler MSO formula that is completely independent of the Strand_dec^syn formula! The definition of small models depends only on the signature of the formula (in particular, the set of existentially quantified variables). Consequently, it does not mention any structural abstraction of the formula, and is much simpler to solve. This formulation of decidability is also a theoretical contribution, as it gives a much simpler alternative proof that the logic Strand_dec^syn is decidable.

4.4.1 Proof of Decidability

We here give another proof of decidability for Strand_dec^syn, which leads to a more efficient decision procedure. Recall that the decision procedure for Strand_dec^syn presented in Section 4.3 works as follows. Given a Strand_dec^syn formula ψ over a class of recursively defined data-structures R, we first construct a pure MSO formula MinModelψ on k-ary trees that captures the subset of trees that are minimal with respect to a satisfiability-preserving embedding relation. This ensures that if the formula ψ is satisfiable, then it is satisfiable by a data-extension of a minimal model (a minimal model is a model satisfying MinModelψ). Furthermore, this set of minimal models is guaranteed to be finite. The decision procedure then performs a simple analysis on the tree automaton accepting all minimal models to determine the maximum height h of all minimal trees, and queries the data-solver as to whether any tree of height bounded by h satisfies the Strand_dec^syn formula.

In the new decision procedure, we replace the notion of minimal models with a new notion called small models. Given a Strand_dec^syn formula ψ = ∃~x∀~y.ϕ(~x, ~y) over a class of recursively defined data-structures R = (ψTr, ψU, {αa}a∈Lv, {βb}b∈Le), the MSO formula SmallModel(~x) is defined on k-ary trees, with free variables ~x. Intuitively, SmallModel(~x) says that there does not exist a nontrivial valid subset X such that X contains ~x and, further, every non-elastic relation possibly appearing in ϕ(~x, ~y) holds in Graph(T) iff it holds in Graph(Subtree(T,X)). Since the evaluations of the other atomic formulas, including elastic relations and data-logic relations, are all preserved, we can prove that SmallModel(~x) is equisatisfiable with the structural constraints in ψ, but has only a finite number of models. The formula SmallModel(~x) is defined as follows:

SmallModel(~x) ≡ ψTr ∧ ∧_{x∈~x} ψU(x)
              ∧ ¬∃X. ( ValidSubset(X) ∧ ∧_{x∈~x} (x ∈ X)
                       ∧ ∃s.(s ∈ X) ∧ ∃s.(s ∉ X)
                       ∧ ∧_{b∈Le^NE, x,x′∈~x} ( βb(x, x′) ⇔ tailorX(βb(x, x′)) ) )

Note that the above formula does not depend on the Strand formula ψ at all, except for the set of existentially quantified variables ~x. Our proof strategy is now as follows. We show two technical results:

(a) For any ~x, SmallModel(~x) has only finitely many models (Theorem 4.4.1). This result is independent of the fact that we are dealing with Strand formulas.

(b) A Strand formula ψ with existentially quantified variables ~x has a model iff there is some data-extension of a model satisfying SmallModel(~x) that satisfies ψ (Theorem 4.4.2).

The above two results establish the correctness of the decision procedure. Given a Strand_dec^syn formula ψ with existential quantification over ~x, we can compute a tree automaton accepting the set of all small models (i.e., the models of SmallModel(~x)), compute the maximum height h of the small models, and then query the data-solver as to whether there is a model of height at most h with data that satisfies ψ. We prove the two technical results below.
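The overall procedure just described can be sketched as a small driver. In the sketch below the three phase functions are parameters with hypothetical names, since in the actual system the automaton phase is discharged by an MSO tool (Mona) and the data queries by an SMT solver (Z3):

```python
def decide(psi, small_model_height, trees_up_to_height, data_satisfiable):
    """Return True iff psi has a model, assuming:
    - small_model_height() gives the maximum height h of the small models
      (obtained from the tree automaton for SmallModel(x)),
    - trees_up_to_height(h) enumerates candidate backbone trees,
    - data_satisfiable(psi, T) asks the data-solver whether some
      data-extension of T satisfies psi."""
    h = small_model_height()
    return any(data_satisfiable(psi, T) for T in trees_up_to_height(h))

# Toy instantiation: "trees" are just their heights; psi asks for even height.
assert decide('even-height', lambda: 3, lambda h: range(h + 1),
              lambda psi, T: T % 2 == 0)
assert not decide('unsat', lambda: 3, lambda h: range(h + 1),
                  lambda psi, T: False)
```

Completeness of the driver rests exactly on the two technical results: (a) guarantees the height query terminates, and (b) guarantees no satisfiable formula is missed.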

Theorem 4.4.1. For any recursively defined data-structure R and any finite set of variables ~x, the number of models of SmallModel(~x) is finite.

Proof. Fix a recursively defined data-structure R and a finite set of variables ~x. It is sufficient to show that for any model T of SmallModel(~x), the size of T is bounded. We first split SmallModel(~x) into two parts: let ζ be the first two conjuncts, i.e., ψTr ∧ ∧_{x∈~x} ψU(x), and let η be the last conjunct.

[Figure 4.4: A valid subset X that falsifies β. (a) T with the valid subset X (shaded dark); (b) Subtree(T,X).]

Recall the classic logic-automata connection: for any MSO formula θ(~y, ~Y) with free first-order variables ~y and free set-variables ~Y, we can construct a tree automaton that accepts precisely those trees, with encodings of the valuations of ~y and ~Y as extra labels, that satisfy the formula θ [75].

Construct a deterministic (bottom-up) tree automaton Aζ that accepts precisely the models satisfying ζ(~x), using this classic logic-automata connection [75]. Also, for each non-elastic edge label b ∈ Le^NE and each pair of variables x, x′ ∈ ~x, let Ab,x,x′ be a deterministic (bottom-up) tree automaton that accepts the models of the formula βb(x, x′). Clearly T is accepted by Aζ, while each Ab,x,x′ either accepts or rejects T. Construct the product of the automaton Aζ and all the automata Ab,x,x′, with the acceptance condition derived solely from Aζ; call this automaton B; then B accepts T. Let r be the accepting run of B on T. We claim that r is loop-free (a run of a tree automaton is loop-free if, on any path of the tree, no two nodes are labeled by the same state). Assume not. Then there must be two distinct nodes p1, p2 such that p2 is in the subtree of p1, and both p1 and p2 are labeled by the same state q in r. Then we can pump down T by merging p1 and p2. The resulting tree is accepted by B as well. Consider the subset X of T that consists of the remaining nodes, as shown in Figure 4.4. It is not hard to see that X is a nontrivial valid subset of T. Also, for each b ∈ Le^NE and each x, x′ ∈ ~x, since the run of Ab,x,x′ ends up in the same state on reading the subtree corresponding to X, βb(x, x′) holds in T iff βb(x, x′) holds in Subtree(T,X). Thus X is a valid subset of T that acts to falsify η, which contradicts our assumption that T satisfies ζ ∧ η. Since r is loop-free, the height of T is bounded by the number of states in B.
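The pumping step in this proof can be made concrete on a toy deterministic bottom-up automaton (nested-tuple trees; this code is an illustration, not part of the Strand implementation):

```python
def run(delta, tree):
    """Deterministic bottom-up run; tree = (label, [children]),
    delta maps (label, tuple_of_child_states) -> state."""
    label, kids = tree
    return delta[(label, tuple(run(delta, k) for k in kids))]

def strict_descendant_with_state(delta, tree, q):
    """A strict descendant of tree's root whose run state is q, if any."""
    for k in tree[1]:
        if run(delta, k) == q:
            return k
        r = strict_descendant_with_state(delta, k, q)
        if r is not None:
            return r
    return None

def pump_down(delta, tree):
    """If some node's state recurs at a strict descendant (a 'loop' on a
    path), merge the two nodes by replacing the upper subtree with the
    lower one; the run state, and hence acceptance, is unchanged."""
    rep = strict_descendant_with_state(delta, tree, run(delta, tree))
    if rep is not None:
        return rep
    label, kids = tree
    for i, k in enumerate(kids):
        p = pump_down(delta, k)
        if p is not k:
            return (label, kids[:i] + [p] + kids[i + 1:])
    return tree

# A one-state automaton over unary 'a'-chains: every chain can be pumped,
# so the minimal accepted trees have height < number of states.
delta = {('a', ()): 'q', ('a', ('q',)): 'q'}
chain = ('a', [('a', [('a', [])])])          # height 2
smaller = pump_down(delta, chain)
assert run(delta, smaller) == run(delta, chain)
assert smaller == ('a', [('a', [])])         # strictly smaller
```

Iterating pump_down until no loop remains yields a loop-free run, which is exactly why the height of T is bounded by the number of states of B.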

We now show that the small models define an adequate set of models to check for satisfiability.

Theorem 4.4.2. Let R be a recursively defined data-structure and let ψ = ∃~x∀~y.ϕ(~x, ~y) be a Strand_dec^syn formula. If ψ is satisfiable, then there is a model M of ψ and a model T of SmallModel(~x) such that Graph(T) is isomorphic to the graph structure of M.

Proof. Let ψ be satisfiable and let M satisfy ψ. Then there is an assignment I of ~x over the nodes of M under which ∀~y.ϕ(~x, ~y) is satisfied. Let T be the backbone tree of the graph model of M, and further let us add an extra label over T to denote the assignment I to ~x. Without loss of generality, assume that T is a minimal tree, i.e., T has the least number of nodes among all models satisfying ψ. We claim that T satisfies SmallModel(~x) under the interpretation I. Assume not, i.e., T does not satisfy SmallModel(~x) under I. Since T under I satisfies ζ, it must not satisfy η. Hence there exists a strict valid subset of nodes, X, such that every non-elastic relation holds over every pair of variables in ~x in the same way in T as it does on the subtree defined by X. Let M′ be the model obtained by taking the graph of the subtree defined by X as the underlying graph, with the data at each node inherited from the corresponding node in M. We claim that M′ satisfies ψ as well; since the tree corresponding to M′ is a strict subtree of T, this contradicts our assumption on T.

We now show that the graph of the subtree defined by X has a data-extension that satisfies ψ. To satisfy ψ, we take the interpretation of each x ∈ ~x to be the node in M′ corresponding to I(x). Now consider any valuation of ~y. We will show that every atomic relation in ϕ holds in M in precisely the same way as it does in M′; this shows that ϕ holds in M iff ϕ holds in M′, and hence that ϕ holds in M′.

By definition, an atomic relation τ could be a unary predicate, an elastic binary relation, a non-elastic binary relation, or an atomic data-relation. If τ is a unary predicate Qa(v) (where v ∈ ~x ∪ ~y), then by the definition of submodels (and valid subsets), τ holds in M′ iff τ holds in M. If τ is an elastic relation Eb(v1, v2), then by the definition of elasticity, τ holds in M iff βb(v1, v2) holds in T, iff βb(v1, v2) holds in Subtree(T,X), iff τ(v1, v2) holds in M′. If τ is a non-elastic relation, it must be of the form Eb(x, x′) where x, x′ ∈ ~x. By the properties of X established above, it follows that Eb(x, x′) holds in M′ iff Eb(x, x′) holds in M. Finally, if τ is an atomic data-relation, then since the data-extension of M′ is inherited from M, the data-relation holds in M′ iff it holds in M. This contradiction shows that T satisfies SmallModel(~x), i.e., T is a small model.

By Theorem 4.4.1, there must be a bounding tree TB such that each model of SmallModel(~x) is a prefix of TB. Such a TB is not hard to obtain by forgetting the labels of the automaton B in the proof of Theorem 4.4.1. Then we can transform ψ to a quantifier-free data-logic formula ψTB by unrolling each universal (existential) quantification over the nodes of TB. By Theorem 4.4.2, ψ and ψTB are equisatisfiable.

4.4.2 Comparison with Strand_dec^sem

We now compare the new decision procedure, technically, with the previous decision procedure for Strand_dec^syn, which was also the decision procedure for the semantic decidable fragment Strand_dec^sem.

Recall the decision procedure for Strand_dec^sem. Given a Strand formula, we first eliminate the leading existential quantification by absorbing it into the signature, using new constants. Then, for the formula ∀~y. ϕ, we define a structural abstraction of ϕ, named ϕ̂, where all data-predicates are replaced uniformly by a set of Boolean variables ~p. A model that satisfies ϕ̂, for every valuation of ~y, using some valuation of ~p, is said to be a minimal model if it has no proper submodel that satisfies ϕ̂ under the same valuation of ~p, for every valuation of ~y. Intuitively, this ensures that if the model can be populated with data in some way so that ∀~y.ϕ is satisfied, then the submodel satisfies the formula ∀~y.ϕ as well, by inheriting the same data-values from the model. It turns out that for any Strand_dec^sem formula, the number of minimal models (with respect to the submodel relation) is finite. Moreover, we can capture the class of all minimal models using an MSO formula of the following form:

MinModel ≡ ¬∃X. ( ValidSubset(X) ∧ ∃s.(s ∈ X) ∧ ∃s.(s ∉ X)
           ∧ ∀~y ∀~p. ( (∧_{y∈~y} (y ∈ X ∧ ψU(y)) ∧ interpret(ϕ̂(~y, ~p)))
                        ⇒ tailorX(interpret(ϕ̂(~y, ~p))) ) )

The above formula intuitively says that a model is minimal if there is no valid non-trivial submodel such that, for all possible valuations of ~y in the submodel and all possible valuations of ~p, if the model satisfies the structural abstraction ϕ̂(~y, ~p), then so does the submodel. We can enumerate all the minimal models (all models satisfying the above formula) and, using a data-constraint solver, ask whether any of them can be extended with data to satisfy the Strand formula. Most importantly, if none of them can, we know that the formula is unsatisfiable (for if there were a model satisfying the formula, then one of the minimal models would be a submodel that could inherit the data values and satisfy the formula). Comparing the formula MinModel to the formula SmallModel, notice that the latter is considerably simpler, as it does not refer to the Strand formula (i.e., ϕ) at all! The SmallModel formula depends only on the set of existentially quantified variables and the non-elastic relations in the signature. In contrast, the formula above for MinModel uses adaptations of the Strand formula ϕ twice within it. In practice, this results in a very complex formula, as the verification conditions get long and involved, and this leads to poor performance by the MSO solver. In contrast, as we show in the next section, the new procedure results in much simpler formulas that are evaluated significantly faster in practice, using a tool like Mona.

4.4.3 Computing the bound more efficiently

The above comparison shows that the formula SmallModel(~x) relies only on the RDDS and the number of existential variables in the Strand_dec^syn formula to be solved, and hence is easier to solve. To make the structure-solving phase even more efficient, when an RDDS R is given, we can pre-compute the distance bound for R and for each non-elastic relation b, and then compute the combined bound on the sizes of the structural models analytically once the number of existentially quantified variables is given. We first define the distance bound:

Definition 4.4.3 (Distance bound). Let R = (ψTr, ψU, {αa}a∈Lv, {βb}b∈Le) be an RDDS. A distance bound of R is an integer k such that: for every tree T labeled by x1 and x2, if x2 is a descendant of x1 at distance d and d > k, then there exist two nodes y1, y2 on the path from x1 to x2 (excluding x1) such that replacing the subtree of y1 with the subtree of y2 forms a tree T′ that preserves the evaluation of ψTr, of every predicate αa(x1) and αa(x2), and of every relation βb(x1, x2).

Remark: The distance bound of R depends only on its predicates and non-elastic relations. Consider adding an elastic relation b to R: it is not hard to see that the distance bound is unaffected, since for whichever x1, x2-labeled tree T and whichever contracted tree T′, T′ is a valid subset of T and preserves the evaluation of βb(x1, x2) (by elasticity). When a distance bound is known, we can analytically compute the size bound of the structural model with respect to a given number of existentially quantified variables.

Proposition 4.4.4. Given an RDDS R with a distance bound B and a Strand_dec^syn formula ψ = ∃~x∀~y.ϕ(~x, ~y), ψ is satisfiable if and only if it is satisfied by a model with structural size not larger than B · (|~x| + 1).

Proof. We only need to prove the left-to-right direction. Consider a satisfying model with a large enough underlying tree. We label the tree with the satisfying assignment of ~x for ψ, and also label extra nodes to make the labeled set LCA-closed. It is not hard to prove that at most 2·|~x| − 1 nodes are labeled, and that every path from the root to a leaf contains at most |~x| labeled nodes. Now, for each path from the root to a leaf and for each segment between two labeled nodes, if the distance is greater than B, one can always contract the tree and obtain a smaller model. Repeating this procedure results in a tree model in which every two labeled nodes are at distance at most B; hence the height of the tree is at most B · (|~x| + 1). A model satisfying ψ can then be obtained by populating the small tree model with the same data values.

The distance bound of R can always be over-approximated using the tree automata that characterize ψTr, each predicate a, and each relation b.

Proposition 4.4.5. Let R = (ψTr, ψU, {αa}a∈Lv, {βb}b∈Le) be an RDDS. Let qt be the number of states of the tree automaton characterizing ψTr, and similarly, let qa and qb be the numbers of states of the tree automata characterizing each predicate a and each relation b, respectively. Then the following is a distance bound of R:

BR = qt · ∏_{a∈Lv} qa · ∏_{b∈Le^NE} qb

Proof. Consider a tree T labeled by x1 and x2, where x2 is a descendant of x1 at distance d greater than BR. Consider the run of the product of the tree automata characterizing ψTr, each αa and each non-elastic βb. Since d is greater than the number of states of the product automaton, there must be two nodes y1 and y2 on the path from x1 to x2 (excluding x1) that are labeled by the same state. Replacing the subtree of y1 with the subtree of y2 then results in another tree T′ that satisfies every formula if and only if the original T does. As explained above, elastic relations are always preserved in T′. Hence BR is a distance bound by definition.

Example 4.4.6. Consider the simple RDDS Rbt for binary trees: both ψTr and ψU are true; there is one unary predicate (root), two elastic relations (left-desc and right-desc), and two non-elastic relations (left-child and right-child), all defined in MSO. It is not hard to convert the MSO definition of root to a tree automaton comprising up to 3 states, and left-child and right-child to tree automata comprising up to 6 states each. Then by Proposition 4.4.5, we can analytically compute a distance bound Bbt = 3 · 6 · 6 = 108. Now, for an arbitrary Strand_dec^syn formula ψ defined over Rbt with k existential variables, by Proposition 4.4.4, the height of a satisfying tree model is bounded by 108 · (k + 1).
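The bound of Example 4.4.6 is a direct instance of Proposition 4.4.5 and can be recomputed mechanically (the state counts are those quoted in the example; ψTr is trivial, contributing a single state):

```python
from math import prod

def distance_bound(q_tr, predicate_states, nonelastic_states):
    """B_R = q_t * (product over predicates q_a) * (product over
    non-elastic relations q_b), per Proposition 4.4.5."""
    return q_tr * prod(predicate_states) * prod(nonelastic_states)

# Binary trees: trivial psi_Tr (1 state), the predicate 'root' (3 states),
# and non-elastic left-child / right-child (up to 6 states each).
B_bt = distance_bound(1, [3], [6, 6])
assert B_bt == 108

# Height bound of a satisfying model for a formula with k existential
# variables, per Proposition 4.4.4:
k = 2
assert B_bt * (k + 1) == 324
```

Since the bound depends only on the RDDS and k, it can be tabulated once per data-structure and reused across verification conditions.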

The above example shows that when an RDDS is fixed, the bound on the sizes of the structural models can be computed analytically, avoiding the structural phase altogether, without the use of a solver. While this bound is much larger than necessary, we can establish smaller bounds, for some data-structures, by analyzing the distance bound manually.

For instance, consider Example 4.4.6 again; we claim that the minimum distance bound is just 2. For every tree T, consider two labeled nodes x1 and x2, where x2 is a descendant of x1. When their distance is greater than 2, one can simply replace the subtree of x2's parent with the subtree of x2. The resulting tree T′ preserves the evaluation of every predicate and non-elastic relation: the root property is obviously not affected; if x2 is in the right (left) subtree of x1, the left-child (right-child) relation is not affected anyway, and the right-child (left-child) relation is preserved as well, because the distance between x1 and x2 is still at least 2. Therefore, by Proposition 4.4.4 again, a Strand_dec^syn formula is satisfiable iff it is satisfiable by a tree of size at most 2 · (k + 1), where k is the number of existentially quantified variables in the formula.

4.5 Experimental Evaluation

We demonstrate the effectiveness and practicality of the decision procedures for Strand by checking verification conditions generated in proving properties of several heap-manipulating programs. As shown in Section 3.5, given pre-conditions, post-conditions and loop-invariants, each linear block of statements of a program yields a Hoare-triple, which is manually translated into a Strand formula ψ over trees and integer arithmetic, as a verification condition. These examples include list-manipulating and tree-manipulating programs: searching a sorted list, inserting into a sorted list, in-place reversal of a sorted list, the bubble-sort algorithm, searching for a key in a binary search tree, inserting into a binary search tree, and performing a left- or right-rotate on a binary search tree. The programs sorted-list-search and sorted-list-insert search for and insert a node in a sorted singly-linked list, respectively, while sorted-list-insert-error is the insertion program with an intended error. The program sorted-list-reverse is a routine for in-place reversal of a sorted singly-linked list, which results in a reverse-sorted list, and bubblesort is the code for Bubble-sort of a list. The routines bst-search and bst-insert search for and insert a node in a binary search tree, respectively, while the programs left-rotate and right-rotate perform rotations (for balancing) in a binary search tree.
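For concreteness, here is a sketch of the kind of routine verified in these experiments (this exact code is not taken from the benchmark suite): insertion into a sorted singly-linked list, with the checked pre- and post-conditions noted as comments:

```python
class Node:
    """Singly-linked list node."""
    def __init__(self, key, next=None):
        self.key, self.next = key, next

def sorted_list_insert(head, k):
    # Pre: the list from head is sorted and does not contain k.
    if head is None or k < head.key:
        return Node(k, head)
    cur = head
    while cur.next is not None and cur.next.key < k:
        cur = cur.next
    cur.next = Node(k, cur.next)
    return head
    # Post: the list contains k and is still sorted.

def keys(head):
    """Collect keys front to back (used to state the post-condition)."""
    out = []
    while head is not None:
        out.append(head.key)
        head = head.next
    return out

assert keys(sorted_list_insert(Node(1, Node(3, Node(5))), 4)) == [1, 3, 4, 5]
assert keys(sorted_list_insert(None, 7)) == [7]
```

The verification conditions for such a routine cover the block before the loop, the loop body against its invariant, and the block from the loop exit to the post-condition, each translated to a Strand formula over lists and integer arithmetic.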

For all these examples, a set of partial correctness properties including both structural and data requirements is checked. For example, assuming a node with value k exists, we check if both sorted-list-search and bst-search return a node with value k. For sorted-list-insert, we assume that the inserted value does not exist, and check if the resulting list contains the inserted node and the sortedness property continues to hold. In the program bst-insert, assuming the tree does not contain the inserted node in the beginning, we check whether the final tree contains the inserted node and the binary-search-tree property continues to hold. In sorted-list-reverse, we check if the output list is a valid list that is reverse-sorted. The code for bubblesort is checked to see if it results in a sorted list. And the left-rotate and right-rotate codes are checked to see whether they maintain the binary-search-tree property. Note that each program requires checking several verification conditions (typically for the linear block from the beginning of the program to a loop, for the loop-body linear block, and for the block from the loop invariant to the end of the program).

4.5.1 Implementing Strand_dec^sem

We first implement the decision procedure for the semantically defined fragment Strand_dec^sem. Given a Strand formula, our procedure will first determine if it is in the semantic decidable fragment, and if not, will halt and report that satisfiability of the formula is not checkable. When given a formula in the syntactic fragment Strand_dec^syn, this procedure will always succeed, and the decision procedure will determine satisfiability of the formula. The decision procedure consists of a structural phase, where we determine whether the number of minimal models is finite, and if so, determine a bound on the size of the minimal models. This phase is effected by using Mona [42], a WS1S/WS2S solver over (strings and) trees. In the second data-constraint solving phase, the finite set of minimal models, if any, are examined by the data-solver Z3 [28] to check if they can be extended with data-values to satisfy the formula. Instead of building an automaton representing the minimal models and then checking it for finiteness, we check the finiteness formula MinModel_ψ

using WS2S, which is supported by Mona. WS2S is interpreted over infinite trees with set-quantification restricted to finite sets. By quantifying over a finite universe U, and transforming all quantifications to be interpreted over U, we can interpret MinModel_ψ over all finite trees. Let us denote this emulation as MinModel′_{U,ψ}. The finiteness condition can now be checked by asking if there exists a finite set Bound such that any minimal model for ψ is contained within the nodes of Bound:

∃Bound. ∀U. ∀Q_a (a ∈ Σ). MinModel′_{U,ψ} ⇒ (U ⊆ Bound)

This formula has no free variables, and hence either holds on the infinite tree or not, and can be checked by Mona. This formula evaluates to true iff the formula is in Strand_dec^sem. We also follow a slightly different procedure to synthesize the data-logic formula. Instead of extracting each minimal model and checking if there is a data-extension for it, we obtain a bound on the size of minimal models, and ask the data-solver to check for any model within that bound. This is often a much simpler formula to feed to the data-solver. In our current implementation, the Mona constraints are encoded manually, and once the bound is obtained, we write a program that outputs the Z3 constraints for the verification condition and the bound. The translation from Strand to Mona formulas and the translation from Strand formulas to Z3 formulas for any bound can be automated, and is a plan for the future. If the finiteness condition above evaluates to true, then we first check the satisfiability of the structural abstraction ψ̂ of ψ. If it is unsatisfiable, then by Proposition 4.2.3, ψ is also unsatisfiable. Otherwise, we ask Mona for the minimal Bound of minimal models, which will be a prefix-closed set of nodes of the tree. Given Bound, ψ is manually translated into a quantifier-free formula over integer arithmetic, which is fed to Z3.

4.5.2 Implementing Strand_dec^syn

Though the decision procedure for Strand_dec^sem in Section 4.3 is effective in verifying several heap-manipulating programs, the structural solving phase turns out to be the bottleneck for scaling up [50], because building an automaton representing the minimal models w.r.t. ψ (those models satisfying MinModel_ψ) is inefficient and depends on the size of ψ. Our new decision procedure for Strand_dec^syn also consists of a structural solving phase and a data solving phase, where the structure solving is done using SmallModel(~x), which is independent of ψ, using Mona. Then we obtain a bound for small models by computing the longest acyclic path in the state diagram of the automaton generated by Mona. By Theorem 4.4.1, such a bound always exists. However, since the formula SmallModel(~x) is easier to satisfy, we could obtain a potentially larger bound from it than from MinModel_ψ. The data solving phase, which is the same as in the old decision procedure, checks for any model of ψ within the bound.

In the structural solving phase using Mona, when a Strand_dec^syn formula ψ is given, we further optimize the formula SmallModel(~x) with respect to ψ for better performance, as follows. First, a sub-formula β_b(x, x′) ⇔ tailor_X(β_b(x, x′)) appears in the formula only if the atomic formula E_b(x, x′) appears in ψ. Moreover, if E_b(x, x′) appears only positively, we use β_b(x, x′) ⇒ tailor_X(β_b(x, x′)) instead; similarly, if E_b(x, x′) occurs only negatively, we use β_b(x, x′) ⇐ tailor_X(β_b(x, x′)) instead. This is clearly sound.

4.5.3 Experiments

For the heap-manipulating programs mentioned at the beginning of this section, all the generated verification conditions can be found at http://web.engr.illinois.edu/~qiu2/strand . It turns out that all of our verification conditions can be written entirely in the syntactic decidable fragment Strand_dec^syn. Hence they also fall in Strand_dec^sem, which is strictly larger than Strand_dec^syn. In Table 4.1, we present the evaluation of our decision procedures for both Strand_dec^sem and Strand_dec^syn on checking these verification conditions. The experiments were conducted on a 2.2GHz, 4GB machine running Windows 7. For the structural-constraint solving phase, we report the maximum BDD sizes used to represent automata, and the time taken by Mona to compute the minimal models or small models. For the data-constraint solving phase, we report the time spent by Z3 to check whether any structural model can be populated with data values to satisfy the verification condition. Note that if the formula checked by Mona is already unsatisfiable and there are no

Program                    Verification     Minimal Model comp.   Small Model comp.     Data constraint
                           condition        (Alg. in Sec. 4.3)    (Alg. in Sec. 4.4)    solving (Z3, QF-LIA)
                                            Max. BDD   Time (s)   Max. BDD   Time (s)   Sec. 4.3/4.4
                                            size                  size                  Time (s)

sorted-list-search         before-loop      10009      0.34       540        0.01       -
                           in-loop          17803      0.59       12291      0.14       -
                           after-loop       3787       0.18       540        0.01       -
sorted-list-insert         before-head      59020      1.66       242        0.01       0.02/0.02
                           before-loop      15286      0.38       595        0.01       -
                           in-loop          135904     4.46       3003       0.03       -
                           after-loop       475972     13.93      1250       0.01       0.02/0.03
sorted-list-insert-error   before-loop      14464      0.34       595        0.01       0.02/0.02
sorted-list-reverse        before-loop      2717       0.24       1155       0.01       -
                           in-loop          89342      2.79       12291      0.14       -
                           after-loop       3135       0.35       1155       0.01       -
bubblesort                 loop-if-if       179488     7.70       73771      1.31       -
                           loop-if-else     155480     6.83       34317      0.48       -
                           loop-else        95181      2.73       7017       0.07       0.02/0.04
bst-search                 before-loop      9023       5.03       1262       0.31       -
                           in-loop          26163      32.80      3594       2.43       0.02/0.11
                           after-loop       6066       3.27       1262       0.34       -
bst-insert                 before-loop      3485       1.34       1262       0.34       -
                           in-loop          17234      9.84       1908       1.38       -
                           after-loop       2336       1.76       1807       0.46       -
left-/right-rotate         bst-preserving   1086       1.59       1510       0.48       0.05/0.14
Total                                                  98.15                 7.99       0.15/0.36

Table 4.1: Results of program verification using Strand; more details at http://web.engr.illinois.edu/~qiu2/strand .

models, the data-constraint solving phase is skipped (these are denoted by "-" in the table for Z3). All the verification conditions were successfully proved unsatisfiable, except those for the sorted-list-insert-error program, in which we introduced an error by removing the initial code that checks whether the inserted value is greater than the value at the head of the list and inserts the element before the head. The Z3 phase failed and gave as a counter-example a two-element list, with the value at the head equal to 0 and the inserted value equal to −6. The bst-insert routine verifies without going to the Z3 phase, which means that its correctness is entirely independent of the data-solver (not even the transitivity of ≤ on integers is needed). The bst-search routine, however, does require help from the arithmetic solver.
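The reported failure can be reproduced with a small brute-force search standing in for the Z3 phase. The buggy routine below is our reconstruction of the kind of error in sorted-list-insert-error (it omits the before-head check), and the bounded search over small inputs is only a toy stand-in for the actual data-solving phase.

```python
class Node:
    def __init__(self, key, next=None):
        self.key, self.next = key, next

def buggy_insert(head, k):
    # error: the check whether k belongs before the head has been removed
    cur = head
    while cur.next is not None and cur.next.key < k:
        cur = cur.next
    cur.next = Node(k, cur.next)
    return head

def to_keys(head):
    out = []
    while head is not None:
        out.append(head.key)
        head = head.next
    return out

def is_sorted(xs):
    return all(a <= b for a, b in zip(xs, xs[1:]))

# Bounded search over small structural models and data values: look for an
# input on which the sortedness post-condition fails.
counterexample = None
for h in range(-8, 9):          # value at the head of a one-element list
    for k in range(-8, 9):      # value to insert (assumed not already present)
        if k == h:
            continue
        out = to_keys(buggy_insert(Node(h), k))
        if not is_sorted(out):
            counterexample = (h, k, out)
            break
    if counterexample:
        break
```

The search finds a two-element list whose head value exceeds the inserted value, matching the shape of the counter-example produced by Z3 (head 0, inserted value −6).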

The experimental results show that natural verification conditions tend to be expressible in the syntactic decidable fragment Strand_dec^syn. Moreover, the expressiveness of our logic allows us to write complex conditions involving structure and data that are nonetheless handled well by Mona and Z3. We believe that a full-fledged engineering of an SMT solver for Strand_dec^syn that answers queries involving heap structures and data is a promising future direction. Towards this end, an efficient non-automata-theoretic decision procedure that uses search techniques (like SAT) instead of representing the class of all models (like BDDs and automata, as Mona does) may yield more efficient decision procedures. The experiments also show that the decision procedure for Strand_dec^syn is considerably more efficient than the one for Strand_dec^sem on this set of examples.

4.6 Related Work

The work closest to ours is Pale [58], which is a logic on heap structures but not data, and uses MSO and Mona [42] to decide properties of heaps. Tasc [38] is similar but generalizes to reasoning about balancedness in the cases of AVL and red-black trees. First-order logics with axiomatizations of the reachability relation (which cannot be expressed in FOL) have been proposed: axioms capturing local properties [55], a logic on regular patterns that is decidable [79], among others. The logic in Havoc, called Lisbq [46], offers reasoning with generic heaps combined with an arbitrary data-logic. The logic has restricted reachability predicates and universal quantification, but is syntactically severely curtailed to obtain decidability. Though the logic is not very expressive, it is extremely efficient, as it uses no structural solver, but translates the structure-solving also to the (Boolean aspect of the) SMT solver. The logic Csl [15] is defined in a similar vein as Havoc, with similar sort-restrictions on the syntax, but generalizes to handle doubly-linked lists, and allows size constraints on structures. As far as we know, neither Havoc nor Csl can express the verification conditions for searching a binary search tree. The work reported in [12] gives a logic that extends an LTL-like syntax to define certain decidable logic fragments on heaps.

The inference rule system proposed in [65] for reasoning with restricted reachability does not support universal quantification and cannot express disjointness constraints, but has an SMT-solver-based implementation [66]. Restricted forms of reachability were first axiomatized in early work by Nelson [59]. Several mechanisms without quantification exist, including the work reported in [67, 3]. Kuncak's thesis describes automatic decision procedures that approximate higher-order logic using first-order logic, through approximate logics over sets and their cardinalities [44]. Finally, separation logic [68] is a convenient logic to express heap properties of programs, and a decidable fragment (without data) on lists is known [9]. However, not many extensions of separation logic support data constraints (see [53] for one that does).

CHAPTER 5

NATURAL PROOFS

The Strand verification framework shown in Chapters 3 and 4 is a representative example of the automated deductive verification paradigm for software verification that combines user-written modular contracts and loop invariants with automated theorem proving of the resulting verification conditions. The latter process is often executed by automated logical decision procedures supported by SMT solvers, which have emerged as robust and powerful engines to automatically find proofs. Besides Strand, several techniques and tools have been developed [26, 5, 41], and there have been several success stories of large software verification projects using this approach (the OS project [78], the Microsoft hypervisor verification project using VCC [26], and a recent verified-for-security OS+browser for mobile applications [54], to name a few). Verification conditions do not, however, always fall into decidable theories. In particular, the verification of properties of the dynamically modified heap, the focus of this dissertation, is a big challenge for logical methods. The dynamically manipulated heap poses several challenges, as typical correctness properties of heaps require complex combinations of structure (e.g., p points to a tree structure, or to a doubly-linked list, or to an almost-balanced tree, with respect to certain pointer-fields), data (the integers stored in data-fields of the tree respect the binary search tree property, or the data stored in a tree is a max-heap), and separation (the procedure modifies one list and not the other and leaves the two lists disjoint at exit, etc.).
The fact that the dynamic heap contains an unbounded number of locations means that expressing the above properties requires quantification in some form, which immediately precludes the use of most SMT decidable theories (only a few of them are known to handle quantification; e.g., the array property fragment [19] and the Strand logic presented in Chapters 3 and 4). Consequently, expressing such properties naturally and

succinctly in a logical formalism has been challenging, and reasoning with them automatically even more so. For instance, in the Boogie line of tools (including VCC), which write specifications using FOL and employ SMT solvers to validate verification conditions, the specification of invariants of even simple methods like singly-linked-list insert is tedious. In such code¹, second-order properties (reachability, acyclicity, separation, etc.) are smuggled in using carefully chosen ghost variables; for example, acyclicity of a list is encoded by assigning a ghost number (idx) to each node in the list, with the property that the numbers associated with adjacent nodes strictly increase going down the list. These ghost variables require careful manipulation when the structures are updated; for example, inserting a node may require updating the ghost numbers of other nodes in the list, in order to maintain the acyclicity property. Once such a ghost-encoding of the specification is formulated, the validation of verification conditions, which typically have quantifiers, is dealt with using sound heuristics (a wide variety of them, including e-matching, model-based quantifier instantiation, etc., are available), but these are still often not enough and have to be augmented by instantiation triggers from the verification engineer to help the proof go through. Due to the inherent complexity and difficulty of the dynamically allocated heap, most research on program logics for functional verification of heap-manipulating programs can be roughly divided into two classes:²

• Logics for manual/semi-automatic reasoning: The most popular of these are the class of separation logics [63, 68], but several others exist (see matching logic [69], for example). Complex structural properties of heaps are expressed using inductive algebraic definitions, the logic combines several other theories like arithmetic, etc., and uses a special separation operator (∗) to compositionally reason with a footprint and the frame. The analysis is either manual or semi-automatic, the latter being usually sound, incomplete, and non-terminating, and proceeds by heuristically searching for proofs using a proof system, unrolling

¹http://vcc.codeplex.com/SourceControl/changeset/view/dcaa4d0ee8c2#vcc/Docs/Tutorial/c/7.2.list.c
²We do not discuss abstraction-based approaches such as shape analysis here, as such approaches are geared towards less complex specifications, and often are completely automatic, not even requiring proof annotations such as loop invariants; see Section 5.6.

recursive definitions arbitrarily. Typically, such tools can find simple proofs if they exist, but are unpredictable, and cannot robustly produce counter-examples.

• Logics for completely automated reasoning: These logics stem from the SMT (Satisfiability Modulo Theories) and automata theory literature, where the goal is to develop fast, terminating, sound and complete decision procedures, but where the logics are often constrained heavily in expressivity in order to reach these goals. Examples include several logics that extend first-order logic with reachability, the logics Lisbq [46] and Csl [15], and the logic Strand_dec^syn (see Chapter 4) that combines tree theories with integer theories. The problem with these logics, in general, is that they are often not sufficiently expressive to state complex properties of the heap (e.g., the balancedness of an AVL tree, or that the set of keys stored in a heap does not change across a program).

We prefer an approach that combines the two methodologies above. In this chapter, we propose a novel proof strategy that we call natural proofs. Natural proofs are a subclass of proofs that are amenable to completely automated reasoning, that provide sound but incomplete procedures, and that capture common reasoning tactics in program verification. Intuitively, the strategy exploits a fixed set of proof tactics, keeping the expressiveness of powerful logics and retaining the automated nature of proving validity, but giving up on completeness (i.e., giving up decidability while retaining soundness). The idea of natural proofs is to identify a subclass of proofs N such that (a) a large class of valid verification conditions of real-world programs have a proof in N, and (b) searching for a proof in N is decidable. In fact, we would even like the search for a proof in N to be efficiently decidable, possibly utilizing the automatic logic solvers (SMT solvers) that exist today. Natural proofs are hence a fixed set of proof tactics whose application is itself expressible in a decidable logic. In particular, our approach:

a) identifies a class of simple and natural proofs for proving verification conditions for heap-based programs, founded on how people prove these conditions manually, and

b) builds terminating procedures that efficiently and thoroughly search this class of proofs.

This results in a sound, incomplete, but terminating procedure that finds natural proofs automatically and efficiently. Many correct programs have simple proofs of correctness, and a terminating procedure that searches for these simple proofs efficiently can be a very useful tool in program verification. Incompleteness is, of course, a necessary trade-off to keep the logics expressive while having a terminating procedure, and a terminating automatic procedure is useful as it does not need manual help. Furthermore, as we shall see in this dissertation, such decision algorithms are particularly desirable when they can be made to work very efficiently, especially using the fast-growing class of efficient SMT solvers for quantifier-free theories.

Remark: The idea of searching for only simple and natural proofs is not new; after all, type systems that prove properties of programs are essentially simple (and often scalable) proof mechanisms. The class of simple and natural proofs that we identify in this chapter is, however, quite different from those found by type systems.

When manually verifying a Hoare-triple that consists of code that manipulates heaps and where properties of heaps are expressed using inductive algebraic definitions, a very common tactic is to unfold the recursive definitions across the footprint, then abstract the recursive terms as uninterpreted terms (which we call formula abstraction) and, using unification, prove the verification condition valid. In order to illustrate the procedures for reasoning with heap-manipulating programs using natural proofs, we consider programs manipulating trees and develop a new recursive extension of first-order logic, called Dryadtree, that allows stating complex properties of heaps without recourse to explicit quantification. Dryadtree combines quantifier-free first-order logic with recursive definitions, and these recursive definitions, themselves expressed in

Dryadtree, can capture several interesting properties of trees, including their height, the multiset of keys stored in them, whether they correspond to a binary search tree (or an AVL tree), etc.
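A toy rendition of the unfold-and-abstract tactic may make it concrete: terms are nested tuples, one unfolding step replaces each recursive call by the body of its definition, and formula abstraction then replaces the remaining recursive calls by fresh uninterpreted constants. The term encoding and the keys* definition used here are our own illustrative choices, not the actual implementation.

```python
import itertools

# Terms are tuples; ('call', name, loc) is a recursive-function application.
DEFS = {
    # keys*(x) = ITE(x = nil, {}, {x.key} U keys*(x.l) U keys*(x.r))
    'keys': lambda x: ('ite', ('isnil', x),
                       ('empty',),
                       ('union', ('singleton', (x, 'key')),
                        ('union', ('call', 'keys', (x, 'l')),
                                  ('call', 'keys', (x, 'r'))))),
}

def unfold(term):
    """Replace each recursive call by one unfolding of its definition."""
    if isinstance(term, tuple):
        if term[0] == 'call':
            _, name, loc = term
            return DEFS[name](loc)
        return tuple(unfold(t) for t in term)
    return term

_fresh = itertools.count()

def abstract(term, table):
    """Formula abstraction: turn remaining recursive calls into
    uninterpreted constants, recording the correspondence in `table`."""
    if isinstance(term, tuple):
        if term[0] == 'call':
            if term not in table:
                table[term] = ('uvar', f'u{next(_fresh)}')
            return table[term]
        return tuple(abstract(t, table) for t in term)
    return term
```

Unfolding keys*(x) once and abstracting leaves two uninterpreted constants, standing for keys*(x.l) and keys*(x.r); the quantifier-free residue can then be handed to an SMT solver.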

Organization: We present the syntax and semantics of Dryadtree in Section 5.1 with formal definitions and examples. Section 5.2 shows the main technical contribution of this chapter: a precise VC-generation process for tree-manipulating programs annotated with the Dryadtree logic. To reason with the resulting VC, we identify a decidable fragment of Dryadtree in Section 5.3. Moreover, in Section 5.4, we apply the second proof tactic, the formula abstraction technique, to the Dryadtree logic, resulting in a sound but incomplete procedure. In Section 5.5, we experimentally evaluate the practicality of the natural proof strategy by verifying a set of tree-manipulating programs. Section 5.6 compares our approach with the rich literature on heap verification, covering both manual and automatic approaches.

5.1 The Dryadtree Logic

The recursive logic over trees, Dryadtree, is essentially a quantifier-free first-order logic over heaps augmented with recursive definitions of various types (e.g., integers, sets/multisets of integers, etc.) defined for locations that have a tree under them. While FOL gives the necessary power to talk precisely about locations that are near neighbors, the recursive definitions allow expressing properties that require quantifiers, including reachability, collecting the set/multiset of keys in a tree, and defining natural metrics, like the height of a tree, that are typically useful in defining properties of trees.

5.1.1 Syntax

Given a finite set of directions Dir, let us define Dir-trees as finite trees where every location either has |Dir| children or is the nil location, which has no children (we assume there is a single nil location). Binary trees have two directions: Dir = {l, r}.

The logic Dryadtree is parameterized by a finite set of directions Dir and also by a finite set of data-fields DF. Let us fix these sets. Let Bool = {true, false} stand for the set of Boolean values, Int stand for the set of integers and Loc stand for the universe of locations. For any set A, let S(A) denote the set of subsets of A, and let MS(A) denote the set of all multisets with elements in A.

75 Remark: In this chapter, to simplify the presentation, we assume that the data fields are of type Int, and only addition and subtraction are allowed so that the underlying data theory (Presburger arithmetic) is decidable. In general, the data fields could be of arbitrary type, as long as a decidable data theory, e.g., real arithmetic, exists.

The Dryadtree logic allows four kinds of recursively defined notions for a location that is the root of a Dir-tree:

• recursively defined integer functions (Loc → Int)

• recursively defined set-of-integers functions (Loc → S(Int))

• recursively defined multiset-of-integers functions (Loc → MS(Int))

• recursively defined Boolean predicates (Loc → Bool)

Let us fix countable, disjoint sets of names for such functions. We will refer to these recursive functions as recursively defined integers, recursively defined sets/multisets of integers, and recursively defined predicates, respectively. Typical examples include the height of a tree, or the height of black nodes in the tree rooted at a node (recursively defined integers), the set/multiset of keys stored at a particular data-field under a node (recursively defined sets/multisets of integers), and the property that the tree rooted at a node is a binary search tree or a balanced tree (recursively defined predicates).

A Dryadtree formula is a pair (Def, ϕ), where Def is a set of recursive definitions and ϕ is a formula. The syntax of the Dryadtree logic is given in Figure 5.1, where the syntax of the formulas is followed by the syntax for recursive definitions. We require that every recursive function/predicate used in the formula ϕ has a unique definition in Def. The figure does not define the syntax of the base and inductive formulas in recursive definitions

(e.g., i_base, i_ind, etc.); we give that in the text below. Location terms are formed using pointer fields from location variables, and include a special location called nil. Integer terms are obtained from integer constants, data-fields of locations, and recursively defined integers, and are combined using the basic arithmetic operations of addition and subtraction and conditionals (ITE stands for if-then-else terms that evaluate to the second

dir ∈ Dir   x ∈ Loc Vars   j ∈ Int Vars   q ∈ Boolean Vars
f ∈ DF   c : Int Constant   S ∈ S(Int) Vars   MS ∈ MS(Int) Vars
i∗ : Loc → Int   si∗ : Loc → S(Int)   msi∗ : Loc → MS(Int)   p∗ : Loc → {true, false}

Loc Term:     lt, lt1, lt2, ... ::= x | nil | lt.dir
Int Term:     it, it1, it2, ... ::= c | j | lt.f | i∗(lt) | it1 + it2 | it1 − it2 | ITE(ϕ, it1, it2)
S(Int) Term:  sit, sit1, sit2, ... ::= ∅ | S | {it} | si∗(lt) | sit1 ∪ sit2 | sit1 ∩ sit2 | sit1 \ sit2 | ITE(ϕ, sit1, sit2)
MS(Int) Term: msit, msit1, ... ::= ∅m | MS | {it}m | msi∗(lt) | msit1 ∪ msit2 | msit1 ∩ msit2 | msit1 \ msit2 | ITE(ϕ, msit1, msit2)

Formula: ϕ, ϕ1, ϕ2, ... ::= true | q | p∗(lt) | lt1 = lt2 | it1 ≤ it2 | sit1 ⊆ sit2 | msit1 ⊆ msit2 | sit1 ≤ sit2 | msit1 ≤ msit2 | it ∈ sit | it ∈ msit | ¬ϕ | ϕ1 ∨ ϕ2

Recursively-defined integer:              i∗(x) =def ITE(x = nil, i_base, i_ind)
Recursively-defined set-of-integers:      si∗(x) =def ITE(x = nil, si_base, si_ind)
Recursively-defined multiset-of-integers: msi∗(x) =def ITE(x = nil, msi_base, msi_ind)
Recursively-defined predicate:            p∗(x) =def ITE(x = nil, p_base, p_ind)
Figure 5.1: Syntax of Dryadtree

argument if the first argument evaluates to true, and to the third argument otherwise). Terms that evaluate to a set/multiset of integers are obtained from recursively defined sets/multisets of integers applied to a location term, and are combined using set/multiset operations as well as conditional choices. Formulas are obtained as Boolean combinations of Boolean variables, recursively defined predicates on a location term, and various relations

between set and multiset terms. The relations on sets and multisets include the subset relation as well as the relation ≤, which is interpreted as follows: for two sets (or multisets) of integers S1 and S2, S1 ≤ S2 holds whenever for every i ∈ S1 and j ∈ S2, i ≤ j. The recursively defined functions (or predicates) are defined using the syntax f∗(x) =def ITE(x = nil, f_base, f_ind), where f_base and f_ind are themselves terms (or formulas) that stand for what f evaluates to when x = nil (the base case) and when x ≠ nil (the inductive step), respectively. There are two restrictions on these terms/formulas:

• f_base has no free variables and hence evaluates to a fixed value (for integers, a fixed integer; for sets/multisets of integers, a fixed set; for Boolean predicates, true or false).

• f_ind has only x as a free variable. Furthermore, the location terms in it can only be x and x.pf (further dereferences are disallowed). Moreover, integer terms of the form x.pf.f are disallowed.

Intuitively, the above conditions demand that when x is nil, the function evaluates to a constant of the appropriate type, and when x ≠ nil, it evaluates to a function that is defined recursively using properties of the location x, which may include properties of the children of x, and these properties may in turn involve other recursively defined functions. We assume that the inductive definitions are not circular. Formally, let Def be a set of definitions and consider a recursive definition of a function f∗ in Def. Define the sequence ψ0, ψ1, ... as follows. Set ψ0 = f∗(x). Obtain ψi+1 by replacing every occurrence of g∗(x) in ψi by g_ind(x), where g is any recursively defined function in Def. We require that this sequence eventually stabilizes (i.e., there is a k such that ψk = ψk+1). Intuitively, we require that the definition of f∗(x) be rewritable into a formula that does not refer to a recursive definition on x (by getting rewritten to properties of its descendants). We require that every definition in Def have the above property.
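Under this reading, the stabilization requirement amounts to acyclicity of the graph of recursive calls made on the same variable x: calls on children of x are harmless, while a cycle among same-x calls means the rewriting sequence never stabilizes. A minimal sketch of such a check follows; the dictionary encoding of the dependencies is our illustrative choice.

```python
def stabilizes(same_x_calls):
    """Return True iff repeatedly rewriting g*(x) to g_ind(x) terminates,
    i.e. the graph of recursive calls on the same variable x is acyclic.
    `same_x_calls` maps each definition name to the set of definitions it
    calls on x itself (calls on x.l, x.r, ... are ignored)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {f: WHITE for f in same_x_calls}

    def dfs(f):
        color[f] = GRAY
        for g in same_x_calls[f]:
            if color[g] == GRAY:
                return False     # cycle: psi_0, psi_1, ... never stabilizes
            if color[g] == WHITE and not dfs(g):
                return False
        color[f] = BLACK
        return True

    return all(color[f] == BLACK or dfs(f) for f in same_x_calls)
```

For the red-black-tree definitions of the next section, every recursive call in an inductive body is on x.l or x.r, so the graph has no edges and the check passes; a pair of mutually recursive definitions on x itself is rejected.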

5.1.2 Examples

We illustrate the syntax of Dryadtree with two examples below. The BST data-structure can be handled in our Strand_dec^syn logic (see Example 3.4.5),

but in Dryadtree it is expressed in a recursive way, which is more intuitive. The RBT example is more challenging and beyond the expressiveness of Strand.

Example 5.1.1 (Binary search tree). Binary search trees are defined with two directions Dir = {l, r}, one data-field key, and the following two simple recursive definitions: a recursive definition of a set of integers that defines the set of keys in the subtree below a node, and a recursive Boolean predicate that identifies nodes that have binary search trees under them.

keys∗(u) =def ITE( u = nil, ∅, {u.key} ∪ keys∗(u.l) ∪ keys∗(u.r) )
bst∗(u) =def ITE( u = nil, true, bst∗(u.l) ∧ bst∗(u.r) ∧ keys∗(u.l) ≤ {u.key} ∧ {u.key} ≤ keys∗(u.r) )

Now, bst∗(x) says that x has a binary search tree under it. Note that though a binary search tree expressed in classical logic would require quantification over nodes, the above notation avoids this by using recursively defined sets of keys to gather the data under a node, and by using the S1 ≤ S2 operation, which implicitly has quantification, comparing all elements of S1 with all elements of S2. We can also express, in our logic, various properties of binary search trees, like

(bst∗(x) ∧ x.l ≠ nil ∧ x.key = 20) ⇒ x.l.key ≤ 20

and, using the natural proofs outlined in this chapter, check the validity of the above statement.
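These recursive definitions can be mirrored directly as executable Python (our rendering, with None playing the role of nil); this is useful for testing candidate properties on concrete trees, even though the verification itself works at the level of the logic.

```python
class Node:
    def __init__(self, key, l=None, r=None):
        self.key, self.l, self.r = key, l, r

def keys(u):
    """keys*(u): the set of keys in the subtree under u."""
    return set() if u is None else {u.key} | keys(u.l) | keys(u.r)

def set_leq(s1, s2):
    """S1 <= S2: every element of S1 is <= every element of S2."""
    return all(i <= j for i in s1 for j in s2)

def bst(u):
    """bst*(u): u has a binary search tree under it."""
    if u is None:
        return True
    return (bst(u.l) and bst(u.r)
            and set_leq(keys(u.l), {u.key})
            and set_leq({u.key}, keys(u.r)))
```

On any concrete tree satisfying the antecedent bst(x) with a non-nil left child and key 20, the consequent x.l.key ≤ 20 indeed holds, as in the validity example above.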

Example 5.1.2 (Red-black tree). Red-black trees are semi-balanced binary search trees with nodes colored red and black, with all the leaves colored black, satisfying the condition that the left and right children of a red node are black, and the condition that the number of black nodes on paths from the root to any leaf is the same. This ensures that the longest path from the root to a leaf is at most twice the shortest such path, making the tree roughly balanced. We have two directions Dir = {l, r}, and two data fields, key and color. We model the color of nodes using an integer data-field color, which can be 0

black∗(x) =def ITE( x = nil, true, x.color = 0 )
bh∗(x) =def ITE( x = nil, 1, ITE(x.color = 0, 1, 0) + ITE( bh∗(x.l) ≥ bh∗(x.r), bh∗(x.l), bh∗(x.r) ) )
keys∗(x) =def ITE( x = nil, ∅, {x.key} ∪ keys∗(x.l) ∪ keys∗(x.r) )
rbt∗(x) =def ITE( x = nil, true, rbt∗(x.l) ∧ rbt∗(x.r) ∧ keys∗(x.l) ≤ {x.key} ∧ {x.key} ≤ keys∗(x.r) ∧ (x.color = 1 → (black∗(x.l) ∧ black∗(x.r))) ∧ bh∗(x.l) = bh∗(x.r) )

Figure 5.2: Recursive definitions for red-black trees

(black) or 1 (red). We define four recursive functions/predicates: a predicate black∗(x) that checks whether the root of the tree under x is colored black (this is defined as a recursive predicate for technical reasons), the black height of a tree, bh∗(x), the multiset of keys stored in a tree, keys∗(x), and a recursive predicate that identifies red-black trees, rbt∗(x). Their formal definitions in

Dryadtree are presented in Figure 5.2. The black∗(x) recursive predicate asserts that the color of a nil node is black, and that the color of a non-nil node is stored in the field color. The bh∗(x) function definition says that the black height of a tree is 1 for a nil node (nil nodes are assumed to be black), and, otherwise, the maximum of the black heights of the left and right subtrees if the node x is red, and the maximum of the black heights of the left and right subtrees plus one if x is black. The keys∗(x) function says that the multiset of keys stored in a tree is ∅ for a nil node, and otherwise the union of the key stored in the node and the keys of the left and right subtrees. Finally, rbt∗(x) holds if: (1) the left and right subtrees are valid red black trees; (2) the keys of the left subtree are no greater than the key in the node, and the keys of the right subtree are no less than the key in the node; (3) if the node is red, both its children are black; and (4) the black heights of the left and the right subtrees are equal. We can also express, in our logic, various properties of red black trees, by including the above definitions in a formula like:

(rbt∗(t) ∧ ¬black∗(t) ∧ t.key = 20) → 10 ∈/ keys∗(t.r)

The above statement is valid because ¬black∗(t) implies that t is not nil, hence t.key is well defined; and since all the keys in the right subtree are no less than 20, it follows that 10 cannot be one of them. The validity of the above statement should be checkable using the procedures outlined in this chapter.
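To make the definitions concrete, here is a small executable transcription of Figure 5.2, a sketch under assumed names: a hypothetical Node class, with Python's None standing in for nil and a sorted list standing in for the multiset keys∗.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """Hypothetical concrete tree node; None stands in for nil."""
    key: int
    color: int                    # 0 = black, 1 = red
    l: Optional["Node"] = None
    r: Optional["Node"] = None

def black(x):                     # black*(x): nil counts as black
    return x is None or x.color == 0

def bh(x):                        # bh*(x): the black height
    if x is None:
        return 1
    return (1 if x.color == 0 else 0) + max(bh(x.l), bh(x.r))

def keys(x):                      # keys*(x): multiset of keys, as a sorted list
    if x is None:
        return []
    return sorted([x.key] + keys(x.l) + keys(x.r))

def rbt(x):                       # rbt*(x)
    if x is None:
        return True
    return (rbt(x.l) and rbt(x.r)
            and all(k <= x.key for k in keys(x.l))
            and all(x.key <= k for k in keys(x.r))
            and (x.color == 0 or (black(x.l) and black(x.r)))
            and bh(x.l) == bh(x.r))
```

On this representation the example formula reads: if rbt(t), t is red, and t.key == 20, then 10 is not in keys(t.r).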

5.1.3 Semantics

The Dryadtree logic is interpreted on (concrete) heaps. Let us fix a finite set of program variables PV. Concrete heaps are defined as follows (f : A ⇀ B denotes a partial function from A to B):

Definition 5.1.3. A concrete heap over a set of directions PF, a set of data-fields DF, and a set of program variables PV is a tuple

(N, nil, pf, df, pv)

where:

• N is a finite or infinite set of locations;

• nil ∈ N is a special location representing the null pointer;

• pf :(N \{nil}) × PF → N is a function defining the direction fields;

• df :(N \{nil}) × DF → Z is a function defining the data-fields;

• pv : PV ⇀ N ∪ Z is a partial function mapping program variables to locations or integers, depending on the type of the variable.

A concrete heap consists of a finite/infinite set of locations, with a pointer-field function pf that maps locations to locations for each direction pf ∈ PF, a data-field function df mapping locations to integers for each data-field f ∈ DF, along with a unique constant location representing nil that has no data-fields or pointer-fields from it. Moreover, the function pv is a partial function that maps program variables to locations and integers.
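For illustration, Definition 5.1.3 can be transcribed as a small Python structure; this is a sketch, and the integer-location encoding and field names are assumptions, not part of the definition.

```python
from dataclasses import dataclass, field

@dataclass
class ConcreteHeap:
    """A concrete heap (N, nil, pf, df, pv): locations are plain integers,
    with 0 playing the role of nil; pf and df are defined only on non-nil
    locations, and pv is a partial map from program variables."""
    locations: set                            # N
    nil: int = 0
    pf: dict = field(default_factory=dict)    # (loc, direction) -> loc
    df: dict = field(default_factory=dict)    # (loc, data_field) -> int
    pv: dict = field(default_factory=dict)    # program var -> loc or int

# a two-node tree over PF = {l, r}, DF = {key}: node 1 with left child 2
h = ConcreteHeap(locations={0, 1, 2})
h.pf.update({(1, "l"): 2, (1, "r"): 0, (2, "l"): 0, (2, "r"): 0})
h.df.update({(1, "key"): 20, (2, "key"): 10})
h.pv.update({"x": 1, "n": 7})                 # a location var and an integer var
```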

A Dryadtree formula with free variables F is interpreted by interpreting the program variables in F according to the function pv and the other variables being given an interpretation (hence, for validity, these other variables are universally quantified, and for satisfiability, they are existentially quantified).

Each term evaluates to either a normal value of the corresponding type, or to undef. A location term is evaluated by dereferencing pointers in the heap. If a dereference is undefined, the term evaluates to undef. The set of locations that are roots of PF-trees are special in that they are the only ones over which recursive definitions are properly defined. A term of the form i∗(lt), si∗(lt) or msi∗(lt) will evaluate to undef if lt evaluates to undef or is not a root of a tree in the heap; otherwise it will be evaluated inductively using its recursive definition. Other aspects of the logic are interpreted with the usual semantics of first-order logic, unless they contain some subterm evaluating to undef, in which case they also evaluate to undef. Each Dryad formula evaluates to either true or false. To evaluate a formula ϕ, we first convert ϕ to its negation normal form (NNF), and evaluate each atomic formula of the form p∗(lt). If lt is not undefined, p∗(lt) will be evaluated inductively using the recursive definition of p∗; if lt evaluates to undef, p∗(lt) will evaluate to false if p∗(lt) appears positively, and will evaluate to true otherwise. Intuitively, undefined recursive predicates cannot help in making the formula true over a model. Similarly, atomic formulas involving terms that evaluate to undef are set to false or true depending on whether the atomic formula occurs within an even or odd number of negations, respectively. All other relations between integers, sets, and multisets are interpreted in the natural way, and we skip defining their semantics. We assume that the Dryad formulas always include a recursively defined predicate tree that is defined as:

def tree∗(x) = ITE( x = nil, true, true )

Note that since recursively defined predicates can hold only on trees and since the above formula vacuously holds on any tree, tree∗(x) holds iff x is a root of a PF-tree.
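The polarity rule for undef can be sketched as a small evaluator over formulas in NNF; the tuple encoding below is an assumption for illustration, not the dissertation's implementation. Atoms are thunks that may return a third value UNDEF: an undefined atom counts as false when it occurs positively and as true under a negation.

```python
UNDEF = "undef"   # third value for atoms whose terms fail to evaluate

def eval_nnf(f):
    """Evaluate a formula in negation normal form.
    f is ('and', g, h), ('or', g, h), ('not', atom_fn) or ('atom', atom_fn),
    where atom_fn() returns True, False, or UNDEF."""
    tag = f[0]
    if tag == "and":
        return eval_nnf(f[1]) and eval_nnf(f[2])
    if tag == "or":
        return eval_nnf(f[1]) or eval_nnf(f[2])
    if tag == "not":                 # negated atom: UNDEF counts as True
        v = f[1]()
        return True if v is UNDEF else (not v)
    v = f[1]()                       # positive atom: UNDEF counts as False
    return False if v is UNDEF else v
```

This matches the intuition above: an undefined recursive predicate can never help make the formula true over a model.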

5.2 Deriving the Verification Condition

The main technical contribution of this chapter is to show how a Hoare-triple corresponding to a basic path in a recursive imperative program (we disallow while-loops and demand all recursion be through recursive function

calls) with proof annotations written in Dryad, can be expressed as a pair consisting of a finite footprint and a Dryad formula. The finite footprint is a symbolic heap that captures the heap explored by the basic block of the program precisely. The construction of this footprint and formula calls for a careful handling of the mutating footprint defined by a recursive imperative program, calls for a disciplined approach to unrolling recursion, and involves capturing aliasing and separation by exploiting the fact that the manipulated structures are trees. In particular, the procedure keeps track of locations in the footprint corresponding to trees and precisely computes the value of recursive terms on them. Furthermore, the verification condition is accurately described by unfolding the pre-condition so that it is expressed purely on the frontier of the footprint, so as to enable effective use of the formula abstraction mechanism. In order to be accurate, we place several key restrictions on the logical syntax of pre- and post-conditions expressed for functions.

We then consider the validity problem for the verification condition expressed as a footprint and a Dryadtree formula. We turn to abstraction schemes for Dryadtree, and show how to abstract Dryadtree formulas into quantifier-free theories of sets/multisets of integers; the latter can then be translated into formulas in the standard quantifier-free theory of integers with uninterpreted functions. The final formula's validity can be proved using standard SMT solvers, and its validity implies the validity of the Dryadtree formula.

5.2.1 Programs

We consider imperative programs manipulating heap structures and the data contained in the heap. In this chapter, we assume that programs do not contain while loops and all recursion is captured using recursive function calls. Consequently, proof annotations only involve pre- and post-conditions of functions, and there are no loop-invariants. The imperative programs we analyze will consist of integer operations, heap operations, conditionals and recursion. In order to verify programs with appropriate proof annotations, we need to verify linear blocks of code, called basic blocks, which do not have conditionals (conditionals are replaced

with assume statements). Basic blocks always start from the beginning of a function and either end at an assertion in the program (checking an intermediate assertion), or end at a function call to check whether the pre-condition to calling the function holds, or end at the end of the program in order to check whether the post-condition holds. Basic blocks can involve recursive and non-recursive function calls. We define basic blocks using the following grammar, parameterized by a set of directions PF and a set of data-fields DF:

bb ::= bb′; | bb′; return u; | bb′; return j;
bb′ ::= bb′; bb′ | u := v | u := nil | u := v.pf | u.pf := v | j := u.f | u.f := j
      | u := new | j := aexpr | assume (bexpr)
      | u := f(v, z1, ..., zn) | j := g(v, z1, ..., zn)
aexpr ::= j | aexpr + aexpr | aexpr − aexpr
bexpr ::= u = v | u = nil | aexpr ≤ aexpr | ¬bexpr | bexpr ∨ bexpr

Since we deal with tree data-structure manipulating programs, which often involve functions that take as input a tree and return a tree, we make certain crucial assumptions. One crucial restriction we assume for the technical exposition is that all functions take at most one location parameter as input (the rest have to be integers). Basic blocks hence have function calls of the form f(v, z1, ..., zn), where v is the only location parameter. This restriction greatly simplifies the proofs, as it is much easier to track one tree. We can relax this assumption, but when several trees are passed as parameters, our decision procedures will implicitly assume a precondition that the trees are all disjoint. This is crucial: our decision procedures cannot track trees that "collide"; they track only equal trees and disjoint trees. This turns out to be a natural property of most data-structure manipulating programs.

Restrictions

We consider the basic blocks described above annotated with Dryadtree formulas. However, we place stringent restrictions on the annotations (pre- and post-function) that we allow in our framework, guaranteeing that the program always manipulates appropriate trees. These restrictions are important for the technique in this chapter and are the price we pay for automation.

Recall that we allow only two kinds of functions, one returning a location f(v, z1, ..., zn) and one returning an integer g(v, z1, ..., zn) (v is a location parameter, z1, ..., zn are integer parameters). We require that v is the root of a PF-tree at the point when the function is called, and this is an implicit pre-condition of the called function. Each function is annotated with its pre- and post-conditions using annotating formulas. Annotating terms and formulas are Dryadtree terms and formulas that do not refer to any child or any data field, do not allow any equality between locations, and do not allow ite-expressions. We denote the pre-condition as a pre-annotating formula ψ(v, z1, ..., zn). The post-condition annotation is more complex, as it can talk about properties of the heap at the pre-state as well as the post-state. We allow combining terms and formulas obtained from the pre-heap and the post-heap to express the post-condition. Terms and formulas over the post-heap are obtained using Dryadtree annotating terms and formulas that are allowed to refer to a variable old v, which points to the location v pointed to in the pre-heap. These terms and formulas can also refer to the variable ret loc or ret int to refer to the location or integer being returned. Terms and formulas over the pre-heap are obtained using Dryadtree annotating terms and formulas referring to old v and the old zi's, except that all recursive definitions are renamed to have the prefix old . Then a post-annotating formula combines terms and formulas expressed over the pre-heap and the post-heap (using the standard operations).

For a function f(v, z1, ..., zn) that returns a location, we assume that the returned location always has a PF-tree under it (and this is implicitly assumed to be part of the post-condition). The post-condition for f is either of the form

havoc(old v) ∧ ψ(old v, old z1,..., old zn, ret loc) or of the form

tree(old v)∧tree(ret loc)∧old v#ret loc∧ψ(old v, old z1,..., old zn, ret loc) where ψ is a post-annotating formula. In the first kind, havoc(old v) means that the function guarantees nothing about the location pointed to in the pre-

state by the input parameter v (and nothing about the locations accessible from that location), and hence the caller of f cannot assume anything about the location it passed to f after the call returns. In that case, we restrict ψ from referring to r∗(old v), where r∗ is a recursive predicate/function on the post-heap. In the latter kind, old v#ret loc means that f, at the point of return, assures that the location passed as parameter v now points to a PF-tree and this tree is disjoint from the tree rooted at ret loc. In either case, the formula ψ can relate complex properties of the returned location and the input parameter, including recursive definitions on the old parameter and the new ones. For example, a post-condition of the form havoc(old v) ∧ keys∗(old v) = keys∗(ret loc) says that the keys under the returned location are precisely the same as the keys under the location passed to the function. For a function g returning an integer, the post-condition is of the form

tree(old v) ∧ ψ(old v, old z1,..., old zn, ret int) or of the form

havoc(old v) ∧ ψ(old v, old z1,..., old zn, ret int)

The former says that the location passed as input continues to point to a tree, while the latter says that no property is assured about the location passed as input (the same restriction on ψ applies). The above restrictions, namely that the input tree and the returned tree either point to completely disjoint trees, or that the input pointer (and the nodes accessible from it) is entirely havoc-ed and the returned node is some tree, are the only separation and aliasing properties that the post-condition can assert. Our logical mechanism is incapable, for example, of saying that the returned node is a reachable node from the location passed to the function. We have carefully chosen such restrictions in order to simplify tracking tree-ness and separation in the footprint. In practice, most data-structure algorithms fall into these categories (for example, an insert routine would havoc the input tree and return a new tree whose keys are related to the keys of the input tree, while a tree-copying program will return a tree disjoint from the input tree).

5.2.2 Describing the Verification Condition in Dryadtree

Given a set of recursive definitions, and a Hoare-triple (ϕpre, bb, ϕpost), where bb is a basic block, we now show how to systematically define the verification condition corresponding to it. Note that since we do not have while-loops, basic blocks always start at the beginning of a function and go either till the end of the function (spanning calls to other functions) or up to a function call (in order to check if the pre-condition for that call holds). In the former case, the post-condition is a post-condition annotation. In the latter case, we need another form: tree(y) ∧ ψ(x̄), where x̄ is a subset of the program variables. The pre-condition of the called function implicitly assumes that the input location is a tree (which is expressed using tree(y) above), and the pre-condition itself is adapted (after substituting formal parameters with the actual terms passed to the function) and written as the formula ψ. This verification condition is expressed as a combination of

a) quantifier-free formulas that define properties of the footprint the basic block uncovers on the heap, combined with

b) recursive formulas expressed only on the frontier of the footprint.

This verification condition is formed by unrolling recursive definitions appropriately as the basic block increases its footprint so that recursive properties are translated to properties of the frontier. This allows us to write the (strongest) post-condition of ϕpre on precisely the same nodes as ϕpost refers to, which then allows us to apply formula abstractions to prove the verification condition. Also, recursive calls to functions that process the data-structure recursively are naturally called on the frontier of the footprint, which allows us to summarize the call to the function on the frontier. The verification condition is derived in two steps, exploiting the two specific tactics for the natural proof strategy. In the first step, we inductively define a footprint structure, composed of a symbolic heap and a Dryadtree formula, which captures the state of the program that results when the basic block executes from a configuration satisfying the pre-condition. We then incorporate the post-condition and derive the verification condition.

Symbolic heap. A symbolic heap is defined as follows:

Definition 5.2.1. A symbolic heap over a set of directions PF, a set of data-fields DF, and a set of program variables PV is a tuple

(C,S,I,cnil, pf, df, pv) where:

• C is a finite set of concrete nodes;

• S is a finite set of symbolic tree nodes with C ∩ S = ∅;

• I is a set of integer variables;

• cnil ∈ C is a special concrete node representing nil;

• pf : (C \ {cnil}) × PF ⇀ C ∪ S is a partial function mapping every pair of a concrete node and a direction to a node (concrete or symbolic);

• df : (C \ {cnil}) × DF ⇀ I is a partial function mapping pairs of concrete nodes and data-fields to integer variables;

• pv : PV ⇀ C ∪ S ∪ I is a partial function mapping program variables to nodes or integer variables (location variables are mapped to C ∪ S and integer variables to I).

Intuitively, a symbolic heap (C, S, I, cnil, pf, df, pv) has two finite sets of nodes: concrete nodes C and symbolic tree nodes S, with the understanding that each s ∈ S stands for a node that may have an arbitrary PF-tree under it, and furthermore the separation constraint that for any two symbolic tree nodes s, s′ ∈ S, the trees under them do not intersect with each other, nor with the nodes in C. The tree under a symbolic node is not represented in the symbolic heap at all. One of the concrete nodes (cnil) represents the nil location. The function pf captures the pointer-field pf in the heap that is within the footprint, and maps the set of concrete nodes to concrete and symbolic nodes. The pointer fields of symbolic nodes are not modeled, as they are part of the tree below the node that is not represented in the footprint. The functions df and pv capture the data-fields (mapping to integer variables) and program variables restricted to the nodes in the symbolic heap.
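As a data structure, a symbolic heap is small; the sketch below (Python, with assumed string names for nodes and variables) records the two node sorts and the three partial maps as dictionaries. The instance shown is the initial footprint used later in this chapter: one symbolic node for the input tree and a concrete node for nil.

```python
from dataclasses import dataclass, field

@dataclass
class SymbolicHeap:
    """(C, S, I, c_nil, pf, df, pv): concrete nodes, symbolic tree nodes
    (each standing for a disjoint, unexplored PF-tree), integer variables,
    and the three partial maps represented as dictionaries."""
    C: set
    S: set
    I: set
    c_nil: str = "c_nil"
    pf: dict = field(default_factory=dict)   # (concrete node, direction) -> node
    df: dict = field(default_factory=dict)   # (concrete node, data field) -> int var
    pv: dict = field(default_factory=dict)   # program var -> node or int var

# initial footprint for a function f(u, j1): the input tree is one
# symbolic node n0, and the only concrete node is c_nil
sh0 = SymbolicHeap(C={"c_nil"}, S={"n0"}, I={"i1"},
                   pv={"u": "n0", "j1": "i1"})
```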

A symbolic heap hence represents a (typically infinite) set of concrete heaps, namely those in which it can be embedded. We define this formally using the notion of correspondence that captures when a concrete heap is represented by a symbolic heap.

Definition 5.2.2. Let SH =(C,S,I,cnil, pf, df, pv) be a symbolic heap and let CH =(N, nil, pf′, df′, pv′) be a concrete heap. Then CH is said to correspond to SH if there is a function h : C ∪ S → N such that the following conditions hold:

• h(cnil)= nil;

• for any n, n′ ∈ C, if n ≠ n′, then h(n) ≠ h(n′);

• for any two nodes n ∈ C \ {cnil}, n′ ∈ C ∪ S, and for any pf ∈ PF, if pf(n, pf) = n′, then pf′(h(n), pf) = h(n′);

• for any s ∈ S, h(s) is the root of a PF-tree in CH, and there is no concrete node c ∈ C \ {cnil} such that h(c) belongs to this tree;

• for any s, s′ ∈ S, s ≠ s′, the PF-trees rooted at h(s) and h(s′) (in CH) are disjoint except for the nil node;

• for any location variable v ∈ PV, if pv(v) is defined, then pv′(v) = h(pv(v)).

Intuitively, h above defines a restricted kind of homomorphism between the nodes of the symbolic heap SH and a portion of the concrete heap CH. Distinct concrete non-nil nodes are required to map to distinct locations in the concrete heap. Symbolic nodes are required to map to trees that are disjoint (save the nil location); they can map to the nil location as well. The trees rooted at locations corresponding to symbolic nodes must be disjoint from the locations corresponding to concrete nodes. Note that there is no requirement on the integer variables I and the map pv′ on integer variables, nor on the maps df and df′. Note also that for a concrete node n in the symbolic heap, the fields defined from n in the symbolic heap must occur in the concrete heap as well, from the corresponding location h(n); however, the fields not defined for n may or may not be defined on h(n). A footprint is a pair (SH; ϕ) where SH is a symbolic heap and ϕ is a

Dryadtree formula. The semantics of such a footprint is that it represents all concrete heaps that both correspond to SH and satisfy ϕ.

Tree-ness of nodes. The key property of a symbolic heap is that we can determine that certain nodes have PF-trees rooted under them (i.e., in any concrete heap corresponding to the symbolic heap, the corresponding locations will have a PF-tree under them).

For a symbolic heap SH =(C,S,I,cnil, pf, df, pv), let the set of graph nodes of SH be the smallest set of nodes V ⊆ C ∪ S such that:

• cnil ∈ V and S ⊆ V

• For any node n ∈ C, if for every pf ∈ PF, pf(n, pf) is defined and belongs to V , then n ∈ V .

Now define Graph(SH) to be the directed graph (V, E), where V is as above, and E is the set of edges (u, v) such that pf(u, pf) = v for some pf ∈ PF. Note that, by definition, there are no edges out of u if u ∈ S, as symbolic nodes do not have outgoing fields. We say that a node u in V is the root of a tree in Graph(SH) if the set of all nodes reachable from u forms a tree (in the usual graph-theoretic sense). The following claim follows, and is the crux of using the symbolic heap to determine tree-ness of nodes:

Lemma 5.2.3. Let SH be a symbolic heap and let CH be a corresponding concrete heap, defined by a function h. If a node u is the root of a tree in Graph(SH), then h(u) also subtends a tree in CH.

Proof. A proof gist is as follows. First, note that symbolic nodes and the node cnil are always roots of trees in Graph(SH), and the locations in the concrete heap corresponding to them subtend trees (in fact, disjoint trees save the nil location). Turning to concrete nodes, we need to argue that if c is a concrete node in Graph(SH), then h(c) is the root of a PF-tree in CH. This follows by induction on the height of the tree under c in Graph(SH), since each of the PF children of c in Graph(SH) must either be the cnil node, or a summary node, or a concrete node that is the root of a tree of smaller height. The corresponding locations in CH, by the induction hypothesis or by the above observations, have PF-trees suspended from them. In fact, by the definition of correspondence, these trees are all disjoint except for the nil location (since trees corresponding to summary nodes are all disjoint and disjoint from locations corresponding to concrete nodes, and since distinct concrete nodes in the symbolic heap denote distinct locations).

The location corresponding to a concrete node in Graph(SH) that does not have all PF-fields defined in SH may or may not have a PF-tree subtended from it; this is because the notion of correspondence allows the corresponding location to have more fields defined. In the sequel, when we use symbolic heaps for tracking footprints, such concrete nodes with partially defined PF-fields will occur only when processing function calls (where all information about a node may not be known).
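The graph-node fixpoint and the tree-root test of Lemma 5.2.3 can be sketched directly; the SimpleNamespace heap encoding below is an assumption for illustration.

```python
from types import SimpleNamespace

def graph_nodes(sh, PF):
    """Smallest V with c_nil in V, S a subset of V, and every concrete node
    all of whose PF-fields are defined and land in V; computed as a fixpoint."""
    V = {sh.c_nil} | set(sh.S)
    changed = True
    while changed:
        changed = False
        for n in sh.C - {sh.c_nil}:
            if n not in V and all(sh.pf.get((n, pf)) in V for pf in PF):
                V.add(n)
                changed = True
    return V

def is_tree_root(sh, PF, u):
    """u roots a tree in Graph(SH) iff no non-nil node is reached twice."""
    if u not in graph_nodes(sh, PF):
        return False
    seen = set()
    def dfs(n):
        if n == sh.c_nil:
            return True          # trees may share the nil node
        if n in seen:
            return False         # sharing or a cycle: not a tree
        seen.add(n)
        if n in sh.S:
            return True          # symbolic node: unexplored subtree, no edges
        return all(dfs(sh.pf[(n, pf)]) for pf in PF)
    return dfs(u)

PF = ("l", "r")
sh = SimpleNamespace(C={"c_nil", "c1"}, S={"s1", "s2"}, c_nil="c_nil",
                     pf={("c1", "l"): "s1", ("c1", "r"): "s2"})
```

Here c1 roots a tree, whereas a node whose two children alias the same symbolic node would not.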

Initial footprint. Let the pre-condition be ϕpre(u, j1,...,jm), where u is the only location program variable, there is a PF-tree rooted at u, and j1,...,jm are integer program variables. Then we define the initial symbolic heap:

(C0, S0, I0, cnil, pf0, df0, pv0) where C0 = {cnil}, S0 = {n0}, I0 = {i1, ..., im}, pf0 and df0 are empty functions (i.e., functions with an empty domain), and pv0 maps u to n0 and j1, ..., jm to i1, ..., im, respectively. The initial formula ϕ0 is obtained from

ϕpre(u, j1, ..., jm) by replacing u by n0 and j1, ..., jm by i1, ..., im, and by adding the conjunct p∗(cnil) ↔ pbase or f∗(cnil) = fbase for all recursive predicates and functions. Note that the formula is defined over the variables S0 ∪ I0.

Intuitively, we start at the beginning of the function with a single symbolic node that stands for the input parameter, which is a tree, and a concrete node that stands for nil. All integer parameters are assigned to distinct variables in I.

Expanding the footprint. A basic operation on a pair (SH; ϕ), consisting of a symbolic heap and a formula, is expansion. Let SH be

(C, S, I, cnil, pf, df, pv) and let n ∈ C ∪ S be a node. We define expand((SH; ϕ), n) = (SH′; ϕ′), where SH′ is the tuple (C′, S′, I′, cnil, pf′, df′, pv′) defined as follows: if n ∈ C (the node is already expanded), then do nothing by setting (SH′; ϕ′) to (SH; ϕ); otherwise:

• C′ = C ∪ {n}, where n is the node being expanded;

• S′ = S ⊎ {npf | pf ∈ PF} \ {n}, where each npf is a fresh node different from the nodes in C ∪ S;

• I′ = I ⊎ {if | f ∈ DF}, where each if is a fresh integer variable;

• pf′|(C\{cnil})×PF = pf, and pf′(n, pf) = npf for all pf ∈ PF;

• df′|(C\{cnil})×DF = df, and df′(n, f) = if for all f ∈ DF;

• pv′ = pv;

The formula ϕ′ is obtained from the formula ϕ as follows:

ϕ′ = ϕ[p̄n, f̄n / p̄∗(n), f̄∗(n)]
    ∧ ⋀_{p∗} ( pn ↔ p̂ind(n) ) ∧ ⋀_{f∗} ( fn = f̂ind(n) )
    ∧ n ≠ cnil ∧ ⋀_{n′ ∈ C′\{cnil}, pf ∈ PF} ( n′ ≠ npf )
    ∧ ⋀_{pf ∈ PF, s ∈ S} ( npf = s → npf = cnil )
    ∧ ⋀_{pf1, pf2 ∈ PF, pf1 ≠ pf2} ( npf1 = npf2 → npf1 = cnil )

where p̄n are fresh Boolean variables, f̄n are fresh term (integer, set, ...) variables, p∗(x) =def ITE(x = nil, pbase, pind(x)) ranges over all the recursive predicates, and f∗(x) =def ITE(x = nil, fbase, find(x)) ranges over all the recursive functions. Intuitively, the variables p̄n and f̄n capture the values of the predicates and functions for the node n in the current symbolic heap. This is possible because the values of the recursive predicates and functions for concrete non-nil nodes are determined by the values of the functions and predicates for symbolic nodes and the nil node. The formula p̂ind(n) is obtained from pind(n) by substituting every location term of the form n.pf with npf for every pf ∈ PF, and substituting every integer term of the form n.f with if for every f ∈ DF. The term f̂ind(n) is obtained by the same substitutions.
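The expansion step can be sketched as follows; the dictionary-based heap and the string constraints are illustrative assumptions, and the unrolled recursive-definition conjuncts (the p̂ind/f̂ind part) are elided.

```python
import itertools
from types import SimpleNamespace

_fresh = itertools.count()

def expand(sh, phi, n, PF, DF):
    """Expand symbolic node n: n moves from S to C, acquires a fresh symbolic
    child per direction and a fresh integer variable per data field; phi
    (a list of constraint strings) records the disequalities."""
    if n in sh.C:                          # already concrete: do nothing
        return sh, phi
    kids = {pf: "s%d" % next(_fresh) for pf in PF}
    ints = {f: "i%d" % next(_fresh) for f in DF}
    old_S = set(sh.S)
    sh.S = (sh.S - {n}) | set(kids.values())
    sh.C = sh.C | {n}
    sh.I = sh.I | set(ints.values())
    for pf, c in kids.items():
        sh.pf[(n, pf)] = c
    for f, iv in ints.items():
        sh.df[(n, f)] = iv
    phi = phi + ["%s != c_nil" % n]        # the expanded node is non-nil
    phi += ["%s = %s -> %s = c_nil" % (c, s, c)      # old symbolic nodes
            for c in kids.values() for s in old_S - {n}]
    phi += ["%s = %s -> %s = c_nil" % (c1, c2, c1)   # sibling children
            for c1 in kids.values() for c2 in kids.values() if c1 < c2]
    return sh, phi

sh = SimpleNamespace(C={"c_nil"}, S={"n0"}, I=set(), pf={}, df={}, pv={"u": "n0"})
sh, phi = expand(sh, [], "n0", ("l", "r"), ("key",))
```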

Evolving the footprint on basic blocks. Given a symbolic heap SH, a formula ϕ, and a basic block bb, we compute the symbolic execution of bb using the transformation function st((SH; ϕ), bb) = (SH′; ϕ′). The transformation function st is computed transitively; i.e., if bb is of the

form (stmt; bb′), where stmt is an atomic statement and bb′ is a basic block, then st((SH; ϕ), bb) = st(st((SH; ϕ), stmt), bb′).

Therefore, it is enough to define the transformation for the various atomic statements. Given SH = (C, S, I, cnil, pf, df, pv), ϕ, and an atomic statement stmt, we define st((SH; ϕ), stmt) as follows, by cases of stmt. Unless some assumption fails (in which case the transformation is undefined), we describe st((SH; ϕ), stmt) as (SH′; ϕ′). As per our convention, function updates are denoted in the form [arg ← new val]. For example, pv[u ← n] denotes the function that agrees with pv except that it maps u to n. Formula substitutions are denoted in the form [new/old]. For example, ϕ[df′/df] denotes the formula obtained from the formula ϕ by substituting every occurrence of df with df′. The following defines how the footprint evolves across all possible statements except function calls:

(a) stmt : u := v If pv(v) is undefined, the transformation is undefined; otherwise

SH′ = (C, S, I, cnil, pf, df, pv[u ← pv(v)])
ϕ′ ≡ ϕ

(b) stmt : u := nil

SH′ = (C, S, I, cnil, pf, df, pv[u ← cnil])
ϕ′ ≡ ϕ

(c) stmt : u := v.pf If pv(v) is undefined, or pv(v) ∈ C and pf(pv(v), pf) is undefined, the transformation is undefined. Otherwise we expand the symbolic heap:

((C′′, S′′, I′′, cnil, pf′′, df′′, pv′′); ϕ′′) = expand((SH; ϕ), pv(v))

Now pv′′(v) must be in C′′ \ {cnil}, and we set

SH′ = (C′′, S′′, I′′, cnil, pf′′, df′′, pv′′[u ← pf′′(pv′′(v), pf)])
ϕ′ ≡ ϕ′′

(d) stmt : j := v.f If pv(v) is undefined, or pv(v) ∈ C and df(pv(v), f) is undefined, the transformation is undefined. Otherwise we expand the symbolic heap:

((C′′, S′′, I′′, cnil, pf′′, df′′, pv′′); ϕ′′) = expand((SH; ϕ), pv(v))

Now pv′′(v) must be in C′′ \ {cnil}, and we set

SH′ = (C′′, S′′, I′′ ⊎ {i}, cnil, pf′′, df′′, pv′′[j ← i])
ϕ′ ≡ ϕ′′ ∧ i = df′′(pv′′(v), f)

(e) stmt : u.pf := v

If pv(u) or pv(v) is undefined, or pv(u) = cnil, the transformation is undefined. Otherwise we expand the symbolic heap:

((C′′, S′′, I′′, cnil, pf′′, df′′, pv′′); ϕ′′) = expand((SH; ϕ), pv(u))

Now pv′′(u) must be in C′′ \ {cnil}, and we set

SH′ = (C′′, S′′, I′′, cnil, pf′′[(pv′′(u), pf) ← pv′′(v)], df′′, pv′′)
ϕ′ ≡ ϕ′′

(f) stmt : u.f := j

If pv(u) or pv(j) is undefined, or pv(u) = cnil, the transformation is undefined. Otherwise we expand the symbolic heap:

((C′′, S′′, I′′, cnil, pf′′, df′′, pv′′); ϕ′′) = expand((SH; ϕ), pv(u))

Now pv′′(u) must be in C′′ \ {cnil}, and we set

SH′ = (C′′, S′′, I′′ ⊎ {i}, cnil, pf′′, df′′[(pv′′(u), f) ← i], pv′′)
ϕ′ ≡ ϕ′′ ∧ i = pv′′(j)

(g) stmt : u := new We assume that, for the new location, every pointer initially points to

nil and every data field initially evaluates to 0.

SH′ = (C ⊎ {n}, S, I ⊎ {if | f ∈ DF}, cnil, pf′, df′, pv[u ← n])
ϕ′ ≡ ϕ ∧ ⋀_{f ∈ DF} ( if = 0 ) ∧ ⋀_{n′ ∈ C ∪ S} ( n ≠ n′ )

where pf′ and df′ are defined as follows:

• pf′|(C\{cnil})×PF = pf, and pf′(n, pf) = cnil for all pf ∈ PF
• df′|(C\{cnil})×DF = df, and df′(n, f) = if for all f ∈ DF

(h) stmt : j := aexpr(k̄) If pv is undefined on any variable in k̄, then the transformation is undefined; otherwise

SH′ = (C, S, I ⊎ {i}, cnil, pf, df, pv[j ← i])
ϕ′ ≡ ϕ ∧ i = aexpr[pv(k̄)/k̄]

(i) stmt : assume bexpr(v̄, j̄) If pv is undefined on any variable in v̄ or in j̄, then the transformation is undefined; otherwise

SH′ = SH
ϕ′ ≡ ϕ ∧ bexpr[pv(v̄), pv(j̄)/v̄, j̄]

(j) stmt : return u If pv(u) is undefined, the transformation is undefined; otherwise

SH′ = (C, S, I, cnil, pf, df, pv[ret ← pv(u)])
ϕ′ ≡ ϕ

(k) stmt : return j If pv(j) is undefined, the transformation is undefined; otherwise

SH′ = (C, S, I ⊎ {i}, cnil, pf, df, pv[ret int ← i])
ϕ′ ≡ ϕ ∧ i = pv(j)
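A few of the cases above can be sketched executably; the tuple statement encoding and the string constraints are assumptions for illustration, and only the heap-free cases (a), (b) and (k) are shown.

```python
from types import SimpleNamespace

def st(sh, phi, stmt):
    """Transformer for atomic statements on a footprint (sh; phi); sh is a
    dictionary-based symbolic heap and phi a list of constraint strings.
    A KeyError stands in for an undefined transformation."""
    op = stmt[0]
    if op == "assign":                 # case (a): u := v
        _, u, v = stmt
        sh.pv[u] = sh.pv[v]            # raises if pv(v) is undefined
    elif op == "assign_nil":           # case (b): u := nil
        _, u = stmt
        sh.pv[u] = sh.c_nil
    elif op == "return_int":           # case (k): return j
        _, j = stmt
        i = "i_ret"                    # fresh integer variable
        sh.I = sh.I | {i}
        phi = phi + ["%s = %s" % (i, sh.pv[j])]
        sh.pv["ret_int"] = i
    return sh, phi

sh = SimpleNamespace(C={"c_nil"}, S={"n0"}, I={"i1"}, c_nil="c_nil",
                     pf={}, df={}, pv={"u": "n0", "j": "i1"})
sh, phi = st(sh, [], ("assign", "w", "u"))
sh, phi = st(sh, phi, ("assign_nil", "x"))
sh, phi = st(sh, phi, ("return_int", "j"))
```

Chaining the three calls mirrors the transitive definition of st over a basic block.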

We can show that for any atomic statement that is not a function call, the above computes the strongest post of the footprint:

Theorem 5.2.4. Let (SH; ϕ) be a footprint and let stmt be any statement that is not a function call. Let (SH′; ϕ′) be the footprint obtained from (SH; ϕ) across the statement stmt, as defined above. Let C denote the set of all concrete heaps that correspond to SH and satisfy ϕ, and let C′ be the set of all heaps that result from executing stmt from any concrete heap in C. Then C′ is the precise set of concrete heaps that correspond to SH′ and satisfy ϕ′.

Proof. The proof consists of case analysis for each type of statement:

[ u := v ] The variable assignment makes u point to where v points. Hence pv(u) is updated with pv(v). Since the heap is unmodified, the heap domain (C and S), the pointer fields (pf) and the data fields (df) all remain the same. The constraint ϕ′ is also unchanged.

[ u := nil ] The variable assignment makes u point to nil, so pv(u) is updated with cnil. As in the above case, the heap and the formula are completely unmodified.

[ u := v.pf ] The dereference of v requires expanding (SH; ϕ) to ((C′′, S′′, I′′, cnil, pf′′, df′′, pv′′); ϕ′′), which is sound (the proof is omitted). Moreover, the assignment makes u point to the pf field of v; formally, pv′′(u) is updated with the pf field of pv′′(v). Apart from this, the heap and the formula are unmodified from (SH; ϕ) to (SH′; ϕ′).

[ u.pf := v ] Similar to the above case, the symbolic heap and the formula (SH; ϕ) are first expanded to ((C′′, S′′, I′′, cnil, pf′′, df′′, pv′′); ϕ′′). When u points to a valid location in the expanded heap, the mutation updates its pf field: pf′′(pv′′(u), pf) is updated with pv′′(v). Moreover, the heap domain (C′′ and S′′) is unmodified. The other field functions and the formula also remain the same.

[ j := u.f ] The symbolic heap is similarly expanded to ((C′′, S′′, I′′, cnil, pf′′, df′′, pv′′); ϕ′′). The assignment makes j hold the value of the f field of u; formally, pv′′(j) is updated with a fresh integer variable i representing the f field of u. Hence, ϕ′ also extends ϕ′′ with the assertion i = df′′(pv′′(u), f).

[ u.f := j ] Similar to the u.pf := v case, but with a fresh integer variable i representing the current value of j: df″(pv″(u), f) is updated with i, and ϕ″ is extended with the assertion i = pv″(j).

[ u := new ] This statement makes u point to a freshly allocated location, say n. Since the new heap domain extends the old one by adding the node n, we have C′ = C ⊎ {n}. By default, each pointer field of n initially points to nil, and each data field initially stores 0. These initial values are formalized in the definitions of pf′ and df′, in which each i_f represents the initial value of the f field. The variable store pv(u) is updated with n. The remaining portion of the heap (C, S, pf′ and df′) is exactly the same as before. Moreover, ϕ′ extends ϕ to assert that each data field value i_f is 0, and that the new node n is different from all existing nodes in C ∪ S.

[ j := aexpr(k̄) ] The statement assigns the value of aexpr(k̄), which is expressible in our logic, to j. Hence pv(j) is updated with i, a fresh integer variable such that i = aexpr[pv(k̄)/k̄].

[ assume bexpr(v̄, j̄) ] The assumed condition bexpr, which can be expressed in our logic, must be true. So ϕ′ simply extends ϕ with a conjunct of bexpr, in which each variable is replaced with its value in pv. The heap is unmodified.

[ return u ] When u is defined in pv, the statement simply copies its value to the special variable ret. Formally, the only modification to SH′ is updating pv(ret) with pv(u). The formula ϕ′ is unmodified.

[ return j ] When j is defined in pv, the statement simply copies its value to the special variable ret_int. So pv(ret_int) in SH′ is updated with a fresh integer variable i, which is asserted to satisfy i = pv(j) in ϕ′.
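The non-call cases above are mechanical enough to sketch in code. The following Python model is purely illustrative (the representation and names are hypothetical, not the actual Dryad implementation): it shows the footprint transformer for three of the cases, u := v, u := nil, and u := new.

```python
# Illustrative Python model of the footprint transformer for three of the
# simple statements above (u := v, u := nil, u := new). The representation
# and names are hypothetical sketches, not the actual Dryad implementation.
import itertools

_fresh = itertools.count()

def new_symbolic_heap():
    # components (C, S, I, pf, df, pv); "cnil" is the distinguished nil node
    return {"C": {"cnil"}, "S": set(), "I": set(), "pf": {}, "df": {}, "pv": {}}

def assign_var(sh, phi, u, v):            # [ u := v ]
    sh["pv"][u] = sh["pv"][v]             # pv(u) := pv(v); heap and phi unchanged
    return sh, phi

def assign_nil(sh, phi, u):               # [ u := nil ]
    sh["pv"][u] = "cnil"                  # pv(u) := cnil
    return sh, phi

def assign_new(sh, phi, u, pointer_fields, data_fields):   # [ u := new ]
    n = f"n{next(_fresh)}"                # freshly allocated node, C' = C ⊎ {n}
    old_nodes = (sh["C"] | sh["S"]) - {n}
    sh["C"].add(n)
    for pf in pointer_fields:             # every pointer field starts at nil
        sh["pf"][(n, pf)] = "cnil"
    for f in data_fields:                 # every data field starts at 0
        i = f"i{next(_fresh)}"
        sh["df"][(n, f)] = i
        phi = phi + [f"{i} = 0"]
    phi = phi + [f"{n} != {m}" for m in old_nodes]  # n distinct from old nodes
    sh["pv"][u] = n
    return sh, phi
```

The formula ϕ is modeled as a plain list of constraint strings; a real implementation would build terms in the solver's representation instead.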

Handling function calls. Let us consider the statement u := f(v, j̄) on the pair (SH; ϕ). Let f(w, k̄) be the function prototype and ϕpost its post-condition. If pv(v) or any element of pv(j̄) is undefined, the transformation is undefined. We also assume that the check of the pre-condition for f is successful; in particular, pv(v) and all the nodes reachable from it are roots of trees. Recall that certain nodes of the symbolic heap can be determined to point to trees (as discussed earlier). For any node n ∈ C ∪ S, let us define

reach_nodes(SH, n) to be the subset of C ∪ S that is reachable from n in Graph(SH). Let

NC = (reach_nodes(SH, pv(v)) ∩ C) \ {cnil}

NS = reach_nodes(SH, pv(v)) ∩ S

Intuitively, NC and NS are the concrete non-nil nodes and the symbolic nodes affected by the call. Let n_ret be the node returned by f. Let N′ be the set of nodes generated by the call: N′ = {n_ret, pv(v)} if ϕpost does not havoc old_w, and N′ = {n_ret} otherwise. The resulting symbolic heap is (C′, S′, I′, cnil, pf′, df′, pv′), where:

• C′ = C \ NC

• S′ = (S \ NS) ∪ N′

• I′ = I

• pf′|_D = pf|_D, and pf′(n, pf) is undefined for all pairs (n, pf) ∈ ((C′ \ {cnil}) × PF) \ D, where D ⊆ (C′ \ {cnil}) × PF is the set of pairs (n′, pf′) such that pf(n′, pf′) ∈ C′ ∪ S′

• df′ = df|_{(C′ \ {cnil}) × DF}

• pv′ = pv[u ← n_ret]

Intuitively, the concrete and symbolic nodes affected by the call are removed from the footprint (and get quantified in the Dryadtree formula), with the possible exception of pv(v) (if ϕpost does not havoc old w, pv(v) becomes a symbolic node). The returned node is added to S. The pf and df functions are restricted to the new set of concrete nodes, and all the directions and program variables pointing to quantified nodes become undefined.

Let ψpost be the post-annotating formula in ϕpost. We define the following formulas:

ϕ1 ≡ ϕ[pre_call_r_n / r*(n)] ∧ ⋀_{n ∈ NC ∪ NS, r*} ( pre_call_r_n = r̂_ind(n) )

ϕ2 ≡ ψpost[pv(v)/old_w][pv(j̄)/old_k̄][n_ret/ret][pre_call_r_pv(v) / old_r*(pv(v))]

where n ranges over NC ∪ NS and r* ranges over all the recursive predicates and functions; the pre_call_r_n are fresh logical variables; r̂_ind(n) is obtained from r_ind(n) by replacing n.pf with pf(n, pf) and n.f with df(n, f) for all pf ∈ PF and f ∈ DF, and then by replacing r*(n′) with pre_call_r_n′ for all n′ ∈ NC ∪ NS; r*(n) is the vector of all the recursive predicates and functions on all n ∈ NC ∪ NS. Intuitively, in ϕ1 we add logical variables that capture the values of the recursive predicates and functions for the nodes affected by the call. In ϕ2 we replace the program variables in ψpost with the actual nodes and integer variables, and we replace the old versions of the predicates and functions on old_w with the variables capturing those values. The resulting formula is

ϕ′ ≡ ϕ1 ∧ ϕ2

The case of j := g(v, k¯) is similar.

Example 5.2.5 (Search in AVL trees). To illustrate how the above procedure expands the symbolic heap and generates formulas, we present it at work on the search routine of an AVL tree. Figure 5.3 shows the find routine, which searches an AVL tree t and returns true if a key v is found. The pre-condition

ϕpre, post-condition ϕpost, and user-defined recursive sets and predicates are shown in Figure 5.4. In Figure 5.5, we present graphically how the symbolic heap evolves for a particular execution path of the routine. At each point of the basic block, we also formally show the updated symbolic heap SH and the corresponding formula ϕ.

Incorporating the postcondition. Finally, after capturing the program state after executing the basic block bb by a pair (SH; ϕ), we incorporate the post-condition

ϕpost, which contains the annotating formula ψ, and generate a verification condition. We should verify that:

(1) the nodes required by ϕpost to be tree roots are indeed tree roots;

(2) for every pair of symbolic nodes n1 and n2, the nodes reachable from n1 and n2 are disjoint; and

(3) (SH; ϕvc) → ψvc, that is, the constraints on the current state imply the constraints required by ψvc.

int find(node t, int v) {
  if (t = NULL) return false;
  tv := t.value;
  if (v = tv) return true;
  else if (v < tv) {
    w := t.left;
    r := find(w, v);
  } else {
    w := t.right;
    r := find(w, v);
  }
  return r;
}

Figure 5.3: AVL-find routine
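For concreteness, the routine in Figure 5.3 can be transcribed directly into executable form. The sketch below is an illustrative Python rendering over dict-based trees (the field names mirror the figure; the encoding is ours, not part of the verified code).

```python
# Python transcription of the AVL-find routine of Figure 5.3, over trees
# encoded as dicts with fields "value", "left" and "right" (nil is None).
def find(t, v):
    if t is None:
        return False
    tv = t["value"]
    if v == tv:
        return True
    # descend left or right depending on the key comparison, as in the figure
    w = t["left"] if v < tv else t["right"]
    return find(w, v)
```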

ϕpre  ≡ avl*(t)
ϕpost ≡ avl*(t) ∧ keys*(t) = keys*(old_t) ∧ h*(t) = h*(old_t) ∧ (ret ≠ 0 ↔ v ∈ keys*(t))

avl*(x)     =def ite( x = nil, true, avl_ind(x) )
avl_ind(x)  =def avl*(x.left) ∧ avl*(x.right) ∧ x.height = h*(x) ∧
                 keys*(x.left) ≤ {x.value} ∧ {x.value} ≤ keys*(x.right) ∧
                 −1 ≤ h*(x.left) − h*(x.right) ∧ h*(x.left) − h*(x.right) ≤ 1
keys*(x)    =def ite( x = nil, ∅, keys_ind(x) )
keys_ind(x) =def keys*(x.left) ∪ {x.value} ∪ keys*(x.right)
h*(x)       =def ite( x = nil, 0, h_ind(x) )
h_ind(x)    =def 1 + max( h*(x.left), h*(x.right) )

Figure 5.4: Pre/post conditions and recursive definition for AVL-find

The first two are readily checkable. The last one asserts that any concrete heap that corresponds to the symbolic heap SH and satisfies ϕvc must also satisfy ψvc. Checking the validity of this claim is non-trivial (undecidable) and we examine procedures that can soundly establish this.

5.3 A Decidable Fragment of Dryadtree

Given verification conditions of the form (SH; ϕvc) → ψvc, where SH is a symbolic heap and ϕvc and ψvc are Dryadtree formulas, the validity problem

[Figure 5.5 tabulates, for each point along the execution path assume(t ≠ nil); tv := t.value; assume(tv ≠ v); assume(tv < v); w := t.left; r := find(w, v), three columns: a graphical representation of SH, the formal representation of SH (its components C, S, I, pf, df and pv), and the corresponding formula ϕ. The formula grows as the unfoldings of avl*, keys* and h* and, after the recursive call, the fresh pre_call variables are conjoined.]

Figure 5.5: Expanding the symbolic heap and generating the formulas

is in general undecidable. However, a decision procedure is desirable in many situations. For example, when a program does not satisfy its specifications, the decision procedure can disprove the program and confirm this with a counterexample, which helps programmers debug the code. In this section, we identify a decidable, yet practically useful, fragment called Dryadtree^dec. We show that if ϕvc and ψvc belong to Dryadtree^dec, the validity problem is decidable, by showing that it can be expressed in Stranddec^syn, an expressive logic that combines theories of trees with arithmetic and that admits efficient decision procedures (see Chapter 4).

5.3.1 Definition of Dryadtree^dec

Let us fix a set of directions Dir and a set of data fields DF. Dryadtree^dec restricts not only the syntax of Dryadtree, but also the recursive integers/sets/multisets/predicates that the user can define. We first describe the ways allowed in Dryadtree^dec to define recursions:

• Recursive integers are disallowed;

• For each data field f ∈ DF, a recursive set of integers fs* can be defined as

  fs*(x) = ite( x = nil, ∅, {x.f} ∪ ⋃_{pf ∈ PF} fs*(x.pf) )

• For each data field f ∈ DF, a recursive multiset of integers fms* can be defined as

  fms*(x) = ite( x = nil, ∅_m, {x.f}_m ∪ ⋃_{pf ∈ PF} fms*(x.pf) )

• Recursive predicates can be defined in the form of

  p*(x) = ite( x = nil, true, ϕ_p(x) ∧ ⋀_{pf ∈ PF} p*(x.pf) )

where ϕ_p(x) is a local formula with x as the only free variable. The syntax of local formulas is presented in Figure 5.6. Intuitively, p*(x) evaluates to true if and only if every node y in the subtree of x satisfies the local formula

ϕ_p(y), which can be determined by simply accessing the data fields of y and evaluating the recursive sets/multisets for the children of y. The exclusion of recursive integers prevents us from expressing heights and cardinalities (which are required by a considerable number of data structures). There are, however, interesting algorithms on inductive data structures, like binary heaps, binary search trees and treaps, whose verification can be expressed in Dryadtree^dec. For example, to describe treaps, let DF be {key, priority} and PF be {l, r}; then we can describe the recursive predicate treap* as follows:

treap*(x) = ite( x = nil, true, treap*(x.l) ∧ treap*(x.r)
                 ∧ keys*(x.l) ≤ {x.key} ≤ keys*(x.r)
                 ∧ {x.priority} ≤ priorities*(x.l)
                 ∧ {x.priority} ≤ priorities*(x.r) )

With a set of recursive predicates defined as above, the syntax of Dryadtree^dec is presented in Figure 5.7. Intuitively, Dryadtree^dec does not allow referring to any child or any data field of any location, i.e., terms of the form lt.pf or lt.f are disallowed. Difference operations and subset relations between sets/multisets are also disallowed. For example, for the insert routine for treaps, one can still express that the returned tree is still a treap; however, one cannot state that the set of keys has the expected property.
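The treap* definition above can be read as an executable recursive check. The following Python sketch computes keys*, priorities* and treap* literally as defined (the dict-based node encoding with fields key, priority, l, r is ours and purely illustrative, not Dryad syntax).

```python
# Executable reading of the treap* predicate above; illustrative encoding only.
def keys(x):
    return set() if x is None else keys(x["l"]) | {x["key"]} | keys(x["r"])

def priorities(x):
    return set() if x is None else (priorities(x["l"])
                                    | {x["priority"]} | priorities(x["r"]))

def leq(s1, s2):
    # set order: every element of s1 is <= every element of s2
    return all(a <= b for a in s1 for b in s2)

def treap(x):
    if x is None:
        return True
    return (treap(x["l"]) and treap(x["r"])
            and leq(keys(x["l"]), {x["key"]})        # keys*(x.l) <= {x.key}
            and leq({x["key"]}, keys(x["r"]))        # {x.key} <= keys*(x.r)
            and leq({x["priority"]}, priorities(x["l"]))
            and leq({x["priority"]}, priorities(x["r"])))
```

Note that leq is vacuously true on nil children (empty sets), matching the ite base case of the recursive definitions.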

ϕ    ::= ψ | sit1 ≤ sit2 | msit1 ≤ msit2 | it ∉ sit | it ∉ msit | ϕ1 ∧ ϕ2 | ϕ1 ∨ ϕ2
ψ    ::= true | it1 ≤ it2 | ¬ψ
it   ::= c | x.f | it1 + it2 | it1 − it2
sit  ::= ∅ | {it} | fs*(x.pf) | sit1 ∪ sit2 | sit1 ∩ sit2
msit ::= ∅_m | {it}_m | fms*(x.pf) | msit1 ∪ msit2 | msit1 ∩ msit2

where pf ∈ PF, f ∈ DF, and c is an integer constant.

Figure 5.6: Syntax of local formulas ϕp(x)

q ∈ Boolean Vars   x ∈ Loc Vars   j ∈ Int Vars   S ∈ S(Int) Vars   MS ∈ MS(Int) Vars   c : Int Constant

Int Term:     it, it1, it2, ...   ::= c | j | it1 + it2 | it1 − it2
S(Int) Term:  sit, sit1, sit2, ... ::= ∅ | S | {it} | fs*(x) | sit1 ∪ sit2 | sit1 ∩ sit2
MS(Int) Term: msit, msit1, ...     ::= ∅_m | MS | {it}_m | fms*(x) | msit1 ∪ msit2 | msit1 ∩ msit2
Formula:      ϕ, ϕ1, ϕ2, ...       ::= true | q | p*(x) | x1 = x2 | x = nil | it1 ≤ it2 | sit1 ≤ sit2 | msit1 ≤ msit2 | it ∈ sit | it ∈ msit | ¬ϕ | ϕ1 ∨ ϕ2

Figure 5.7: Syntax of Dryadtree^dec

5.3.2 Proof of Decidability

It is clear that (SH; ϕvc) → ψvc is valid if and only if (SH; ϕvc ∧ ¬ψvc) is unsatisfiable. We now prove that the satisfiability of (SH; ϕ) is decidable, where ϕ ∈ Dryadtree^dec. The idea is to characterize SH with variable assignments as a class of recursively-defined data structures R_SH, and to characterize ϕ as a Strand formula strand(ϕ); the satisfiability of (SH; ϕ) is then reduced to the satisfiability of strand(ϕ) over R_SH. We outline the translation as follows.

Let SH = (C, S, I, cnil, pf, df, pv). Consider a concrete heap CH corresponding to SH, which consists of two portions: the portion that SH is isomorphic to, including h(c) for every c ∈ C and the tree under s for every s ∈ S, and the portion not represented in SH. Preserving satisfiability, we can assume that there is a particular node in the second portion such that any pointer to/from the second portion is to/from this particular node. Let Vars be a finite but large enough set of variables of various sorts (Loc/Int/Set/MSet). CH with variable assignments for Vars can be encoded as a tree with data fields as follows. The root of the tree has three children:

• the subtree of the leftmost child (T1) models the portion of CH that SH is isomorphic to;

• the subtree of the middle child (T2) models the portion of CH that is not represented by SH;

• the subtree of the rightmost child (T3) models the non-location variable assignments for Vars.

T1 has two children: the subtree of the left/right child models concrete/symbolic nodes, respectively. The subtree of the left child consists of only the nodes in the leftmost path, which is of length |C|, such that each concrete node in C is modeled as a specific node in the path. In the subtree of the right child, the leftmost path is of length |S|. For each symbolic node in S, the corresponding tree in CH is modeled as the right subtree of a specific node in the path. T2 is of arbitrary shape, but each node in T2 should correspond to a node in CH that is not represented by SH.

The leftmost path of T3 is of length |Vars|, such that each non-location variable assignment is modeled as the right subtree of a specific node in the path. For example, let i ↦ 1 be an integer-variable assignment; then its corresponding subtree consists of a single node n with n.f = 1. Let S ↦ {1, 2, 3} be a set-variable assignment; then its corresponding subtree consists of three nodes, such that the set of integers in their f fields is {1, 2, 3}. Hence, the class of concrete heaps corresponding to SH can be represented as a class of recursively-defined data structures, called R_SH. We introduce an elastic relation →*; x →* y means that y is a descendant of x. Moreover, for each variable v ∈ Vars, we introduce a new predicate p_v, such that p_v(y) evaluates to true if and only if y is the node that models v. These predicates are also definable in R_SH. With the above setting, we now show that there is a mapping strand from Dryadtree^dec formulas to Stranddec^syn formulas over R_SH, such that if a concrete heap satisfies ϕ with a set of variable assignments, its corresponding model in R_SH satisfies strand(ϕ), and vice versa. Since efficient decision procedures for Stranddec^syn exist (see Chapter 4), the satisfiability of Dryadtree^dec is also decidable. We first split the appearances of integer terms in set/multiset terms by substituting every occurrence of it in {it} or {it}_m in ϕ with a newly introduced integer variable i_it, and adding a conjunct i_it = it. The resulting formula is denoted ϕ_split. Then, for each variable v that appears in ϕ, we introduce an existentially quantified location variable with the same name in the mapped Stranddec^syn formula, and add a conjunct p_v(v), which intuitively says that v does appear in the heap (in T1 or T2). Let the occurring

map(true)          = true
map(q)             = q.value = 0
map(p*(x))         = ∀w. ( ¬(x →* w) ∨ map(ϕ_p(w)) )
map(x1 = x2)       = x1 = x2
map(x = nil)       = x = cnil
map(it1 ≤ it2)     = it1[~i.f/~i] ≤ it2[~i.f/~i]
map(sit1 ≤ sit2)   = ∀w1, ..., w_{m+n}. ( w1.f ≤ w_{m+1}.f ∨ ¬wtns(sit1)(w1, ..., wm) ∨ ¬wtns(sit2)(w_{m+1}, ..., w_{m+n}) )
map(msit1 ≤ msit2) = ∀w1, ..., w_{m+n}. ( w1.f ≤ w_{m+1}.f ∨ ¬wtns(msit1)(w1, ..., wm) ∨ ¬wtns(msit2)(w_{m+1}, ..., w_{m+n}) )
map(it ∈ sit1)     = ∃w1, ..., wm. ( it[~i.f/~i] = w1.f ∧ wtns(sit1)(w1, ..., wm) )
map(it ∈ msit1)    = ∃w1, ..., wm. ( it[~i.f/~i] = w1.f ∧ wtns(msit1)(w1, ..., wm) )
map(¬ϕ)            = ¬map(ϕ)
map(ϕ1 ∧ ϕ2)       = map(ϕ1) ∧ map(ϕ2)
map(ϕ1 ∨ ϕ2)       = map(ϕ1) ∨ map(ϕ2)

wtns(∅)             = false
wtns(∅_m)           = false
wtns({i})           = w1 = i
wtns({i}_m)         = w1 = i
wtns(fs*(x))        = x →* w1
wtns(fms*(x))       = x →* w1
wtns(sit1 ∪ sit2)   = wtns(sit1)(w1, ..., wm) ∨ wtns(sit2)(w1, ..., wn)
wtns(msit1 ∪ msit2) = wtns(msit1)(w1, ..., wm) ∨ wtns(msit2)(w1, ..., wn)
wtns(sit1 ∩ sit2)   = wtns(sit1)(w1, ..., wm) ∧ wtns(sit2)(w_{m+1}, ..., w_{m+n}) ∧ w1.f = w_{m+1}.f
wtns(msit1 ∩ msit2) = wtns(msit1)(w1, ..., wm) ∧ wtns(msit2)(w_{m+1}, ..., w_{m+n}) ∧ w1.f = w_{m+1}.f

where m = width(sit1) or width(msit1), n = width(sit2) or width(msit2), and ϕ_p(x) is the local formula for p*.

width(∅)             = 0
width(∅_m)           = 0
width({i})           = 1
width({i}_m)         = 1
width(fs*(x))        = 1
width(fms*(x))       = 1
width(sit1 ∪ sit2)   = max( width(sit1), width(sit2) )
width(msit1 ∪ msit2) = max( width(msit1), width(msit2) )
width(sit1 ∩ sit2)   = width(sit1) + width(sit2)
width(msit1 ∩ msit2) = width(msit1) + width(msit2)

Figure 5.8: Inductive definition of map(ϕ)

variables be ~v; then

strand(ϕ) = ∃~v. ( map(ϕ_split) ∧ ⋀_{v ∈ ~v} p_v(v) )

where map(ϕ_split) is defined inductively as shown in Figure 5.8. For each set term sit or multiset term msit, there is an auxiliary formula wtns(sit)/wtns(msit) with free variables ~x, saying that ~x witnesses an element in sit/msit. We denote |~x| by width(sit)/width(msit), which can be computed inductively. For simplicity, we assume that f is the only data field.
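The width(·) computation is a straightforward structural recursion. As an illustration, here is a Python sketch over a hypothetical tuple encoding of set/multiset terms (the encoding is ours, not Strand's).

```python
# Sketch of the inductive width(·) computation on a tuple-encoded term:
# ("empty",), ("singleton", it), ("fs", x), ("union", t1, t2), ("inter", t1, t2).
def width(term):
    tag = term[0]
    if tag == "empty":
        return 0                                   # width(∅) = 0
    if tag in ("singleton", "fs"):
        return 1                                   # width({i}) = width(fs*(x)) = 1
    if tag == "union":
        return max(width(term[1]), width(term[2])) # max of the two widths
    if tag == "inter":
        return width(term[1]) + width(term[2])     # sum of the two widths
    raise ValueError(f"unknown term: {tag}")
```

Intersections sum the widths because the witnesses of both operands must be carried along, while unions reuse the same witness variables for either branch.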

Proposition 5.3.1. For any Dryadtree^dec formula ϕ, strand(ϕ) is a Stranddec^syn formula.

Proof. Note that ϕ_split is quantifier-free, and the translation of each atomic formula in ϕ_split introduces only universal or existential quantifiers; hence map(ϕ_split) has an ∃∀-prefix after conversion to prenex normal form. Similarly, for each v, p_v(v) also admits an ∃∀-prefix. Hence strand(ϕ) is a Strand formula. Moreover, the only structural relation introduced in the translation is the reachability relation →*, which is elastic. Then, by definition, strand(ϕ) falls in the fragment Stranddec^syn.

Note that for any recursive predicate p* with local formula ϕ_p, map(ϕ_p) is a universal Stranddec^syn formula. Then for any Dryadtree^dec formula ϕ, map(ϕ) is well defined: if ϕ is an atomic formula, by definition it is clear that map(ϕ) is an existential Stranddec^syn formula or a universal Stranddec^syn formula; if ϕ is a Boolean combination of atomic formulas, map(ϕ) is a Boolean combination of corresponding existential/universal Stranddec^syn formulas, and is still a Stranddec^syn formula.

Theorem 5.3.2. For any symbolic heap SH and any Dryadtree^dec formula ϕ, (SH; ϕ) is satisfiable if and only if the Stranddec^syn formula strand(ϕ) is satisfiable over R_SH.

Proof. The equisatisfiability is built inductively by investigating the semantics of the mapping formula for each atomic case. Since Dryadtree^dec formulas are quantifier-free and the translation preserves all Boolean operators, it suffices to discuss the translation of each kind of atomic Dryadtree^dec formula.

[ true ] Obvious.

[ x1 = x2 ] The translation is essentially the identity: x1 = x2 is satisfied over SH iff both x1 and x2 are real nodes in the model of R_SH, namely p_x1(x1) ∧ p_x2(x2), and x1 = x2.

[ x1 = nil ] Similar to the above case.

[ q ] The Boolean variable q in Dryadtree^dec is interpreted as a node with an integer value of either 0 (true) or 1 (false). Hence q is satisfied in Dryadtree^dec iff q.value = 0 is satisfied in Stranddec^syn.

[ p*(x) ] Since p*(x) can only be defined in a restricted form in Dryadtree^dec with respect to a local formula ϕ_p(x), intuitively p*(x) holds if and only if for every node y reachable from x, ϕ_p(y) holds. This semantics can be expressed using universal quantifiers in Stranddec^syn: ∀w. ( ¬(x →* w) ∨ map(ϕ_p(w)) ) (based on the hypothesis that map(ϕ_p(w)) is sound).

[ it1 ≤ it2 ] When both it1 and it2 are integer terms, the translation is similar to the above identity cases. However, each integer variable i is replaced with the data field of the same-name location variable i, namely i.f.

[ other formulas containing set/multiset terms ] When set/multiset terms are involved, there are basically two kinds of relations: ≤ and ∈. Note that ≤ can be encoded into ∈ with quantifiers. For example, sit1 ≤ sit2 can be intuitively translated to

∀w1, w2. ( w1 ≤ w2 ∨ w1 ∉ sit1 ∨ w2 ∉ sit2 )

Now, to encode ∈ into Strand, the auxiliary mapping wtns is used. Given an integer set term sit of width m, wtns(sit) is a formula with m free integer variables w1, ..., wm. Intuitively, wtns(sit) guesses the auxiliary variables w2, ..., wm and encodes the formula w1 ∈ sit. The soundness of such encodings can be proved by induction on the structure of sit. It is worth mentioning that in the definition of wtns(sit1 ∩ sit2), w1 and w_{m+1} are the witnesses of sit1 and sit2, respectively, and the conjunct w1.f = w_{m+1}.f guarantees that w1 witnesses both sit1 and sit2. wtns(msit1 ∩ msit2) is defined in a similar way.

The above reduction immediately implies the decidability of Dryadtree^dec.

Corollary 5.3.3. The satisfiability of Dryadtree^dec is decidable.

Proof. By Theorem 5.3.2, the satisfiability of Dryadtree^dec reduces to the satisfiability of Stranddec^syn, which is decidable by Theorem 4.3.4.

5.4 Formula Abstraction

Overall, Dryadtree^dec is the most powerful fragment of Dryadtree that we could find that embeds into a known decidable logic, like Stranddec^syn. However, it is not powerful enough for the heap verification questions that we would like to solve. This motivates the second proof tactic of the natural proof strategy: formula abstraction. We turn to abstraction schemes for unrestricted verification conditions, and show how to soundly reduce verification conditions to a validity problem in quantifier-free theories of sets/multisets of integers, which is decidable using state-of-the-art SMT solvers. Before describing how the formula abstraction technique works for Dryadtree, we roughly describe its idea and motivation below.

5.4.1 Motivation

When reasoning with formulas that have recursively defined terms, which can be unrolled forever, a key idea is to use formula abstraction, which makes the terms uninterpreted. Intuitively, the idea is to replace recursively defined predicates, sets, etc. by uninterpreted Boolean values, uninterpreted sets, etc. The idea of formula abstraction is extremely natural, and is utilized very often in manual proofs. For instance, let us consider a binary search tree (BST) search routine searching for a key k starting at the root node x. The verification condition of a path of this program would typically require checking:

(bst(x) ∧ k ∈ keys(x) ∧ k < x.key) ⇒ (k ∈ keys(x.left))

where bst() is a recursive predicate defined on trees that identifies binary search trees, and keys() is a recursively defined set that collects the multiset of keys under a node. Unrolling the definitions of keys() and bst() gives the following formula:

(bst(x.left) ∧ bst(x.right) ∧ keys(x.left) ≤ x.key ∧ keys(x.right) > x.key
  ∧ k ∈ (keys(x.left) ∪ keys(x.right) ∪ {x.key}) ∧ k < x.key) ⇒ (k ∈ keys(x.left))

Now, while the above formula is quite complex, involving recursive definitions that can be unrolled ad infinitum, we can prove its validity soundly by viewing bst() and keys() as uninterpreted functions that map locations to Booleans and sets, respectively. Doing this gives (modulo some renaming and modulo theory of equality):

( b1 ∧ b2 ∧ K1 ≤ xkey ∧ K2 > xkey ∧ k ∈ (K1 ∪ K2 ∪ {xkey}) ∧ k < xkey ) ⇒ ( k ∈ K1 )

Note that the above formula is a quantifier-free formula over integers and multisets of integers, and furthermore is valid (since k < xkey and K2 > xkey, k must be in K1). Validity of quantifier-free formulas over sets/multisets of integers with addition is decidable (they can be translated to quantifier-free formulas over integers and uninterpreted functions), and can be solved efficiently using SMT solvers. Consequently, we can prove that the verification condition is valid, completely automatically. Note that formula abstraction is sound but incomplete. This idea has been explored in the literature. For example, Suter et al. [72] have proposed abstraction schemes for algebraic data-types that soundly (but not necessarily completely) transform logical validity into much simpler decidable problems using formula abstractions, and developed mechanisms for proving functional programs correct.
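Since the abstracted verification condition is quantifier-free over Booleans, integers and sets, its validity can be spot-checked by brute force over a small finite domain. The Python sketch below does this for the BST formula above; this is only evidence over the chosen domain, not the actual decision procedure.

```python
# Brute-force check of the abstracted BST verification condition over a
# small finite universe; illustrative only, not the decision procedure.
from itertools import chain, combinations, product

U = range(4)  # small universe of integer values

def subsets(u):
    return chain.from_iterable(combinations(u, r) for r in range(len(u) + 1))

def vc_holds(b1, b2, K1, K2, xkey, k):
    ante = (b1 and b2
            and all(e <= xkey for e in K1)        # K1 <= xkey
            and all(e > xkey for e in K2)         # K2 >  xkey
            and k in set(K1) | set(K2) | {xkey}   # k in K1 ∪ K2 ∪ {xkey}
            and k < xkey)
    return (not ante) or (k in K1)                # antecedent implies k in K1

valid = all(vc_holds(b1, b2, K1, K2, xkey, k)
            for b1, b2 in product([True, False], repeat=2)
            for K1 in subsets(U) for K2 in subsets(U)
            for xkey in U for k in U)
```

Every interpretation over this domain satisfies the implication, matching the informal argument that k < xkey rules out both K2 and {xkey}.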

5.4.2 Formula Abstraction for Dryadtree

To prove a verification condition (SH; ϕ) → ψ using formula abstraction, we drop SH, replace recursive predicates on symbolic nodes by uninterpreted Boolean functions, replace recursive integer functions by uninterpreted functions that map nodes to integers, and replace recursive set/multiset functions by functions that map nodes to arbitrary sets and multisets. Let us denote the uninterpreted version using the same name without *; for example, avl* is replaced with avl. Notice that the constraints regarding the concrete and symbolic nodes in SH were already added to ϕ during the construction of the verification condition. The formula resulting via abstraction is a formula ϕabs → ψabs such that: (1) if ϕabs → ψabs is valid, then so is (SH; ϕ) → ψ (the converse is not necessarily true); (2) checking ϕabs → ψabs is decidable, and in fact can be reduced to QF_UFLIA, the quantifier-free theory of uninterpreted functions and linear arithmetic.

Proposition 5.4.1 (Soundness). Let SH be a symbolic heap, let ϕ, ψ be Dryadtree formulas, and let ϕabs and ψabs be the uninterpreted versions of ϕ and ψ, respectively. If ϕabs → ψabs is valid, then so is (SH; ϕ) → ψ.

Proof. We prove the claim by contradiction. Assume (SH; ϕ) → ψ is not valid; then there is a concrete heap CH that corresponds to SH and satisfies ϕ ∧ ¬ψ. We can then construct a model CH′ satisfying ϕabs ∧ ¬ψabs as well: CH′ consists of the same set of locations N that form CH, and each recursive definition r is simply interpreted in the same way that r* is interpreted in CH. The counterexample CH′ contradicts the validity of ϕabs → ψabs and concludes the proof.

We point out that while formula abstraction is sound, it is not complete. For example, consider two recursively defined functions on a binary tree: its height, height*(t), and its size (the number of nodes in the tree), size*(t). The two functions respect the relationship size*(t) < 2^{height*(t)} for any tree t. Hence, for a symbolic node t, the formula ¬(height*(t) = 2 ∧ size*(t) = 10) is valid. However, the abstracted formula is trivially invalid, as the height and the size are abstracted into unrelated uninterpreted functions.
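The height/size relationship that abstraction loses is easy to confirm concretely. The following Python sketch enumerates all binary tree shapes up to a bounded height (with height(nil) = 0, as for h* in Figure 5.4) and checks that size*(t) < 2^{height*(t)} always holds; in particular, no tree of height 2 can have 10 nodes.

```python
# Enumerate binary tree shapes and confirm size(t) < 2**height(t);
# a tree shape is None (nil) or a pair (left, right).
def all_trees(h):
    # all binary tree shapes of height at most h
    if h == 0:
        return [None]
    smaller = all_trees(h - 1)
    return smaller + [(l, r) for l in smaller for r in smaller]

def height(t):
    return 0 if t is None else 1 + max(height(t[0]), height(t[1]))

def size(t):
    return 0 if t is None else 1 + size(t[0]) + size(t[1])
```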

5.4.3 Decision Procedure for Ordered Sets and Multisets

The validity of the abstracted formula ϕabs → ψabs over the theory of uninterpreted functions, linear arithmetic, and sets and multisets of integers is decidable. The fact that the quantifier-free theory of ordered sets is decidable is well known. In fact, Kuncak et al. [45] showed that the quantifier-free theory of sets with cardinality constraints is NP-complete. Since we do not need cardinality constraints, we use a slightly simplified decision procedure that reduces formulas with sets/multisets using uninterpreted functions that capture the characteristic functions associated with these sets/multisets. In this section, we describe a simple decision procedure for quantifier-free ordered sets, and we show how to adapt it for quantifier-free ordered multisets. Formally, let (D, ≤) be a domain with a partial order relation. The syntax for formulas of ordered sets over the domain D is given below:

ϕ ::= true | ϕ1 ∧ ϕ2 | ϕ1 ∨ ϕ2 | ¬ϕ

| e1 = e2 | e1 ≤ e2 | e ∈ S | S1 ⊆ S2 | S1 ≤ S2

S ::= ∅ | {e} | S1 ∪ S2 | S1 ∩ S2 | S1 \ S2

where e, e1, e2 are constants in D. We interpret set membership and set inclusion in the usual way. The interpretation of the order on sets is given by

S1 ≤ S2 ⇔ (∀x1, x2 ∈ D) (x1 ∈ S1 ∧ x2 ∈ S2) ⇒ x1 ≤ x2

Note that the ordering on sets is not a partial order relation. In particular, for any sets S1 and S2, both S1 ≤ ∅ and ∅ ≤ S2 hold trivially, regardless of whether S1 ≤ S2. Set equality S1 = S2 is just syntactic sugar for S1 ⊆ S2 ∧ S2 ⊆ S1.
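This vacuous behaviour on the empty set is easy to see executably; a minimal Python sketch of the interpretation above:

```python
def set_leq(s1, s2):
    # S1 <= S2 iff every element of S1 is <= every element of S2
    return all(x1 <= x2 for x1 in s1 for x2 in s2)
```

Note that besides the vacuous empty-set cases, ≤ is not even reflexive on non-singleton sets: {1, 2} ≤ {1, 2} fails because 2 ≤ 1 does not hold.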

We associate to each set S its characteristic predicate χS, which is defined such that for any x in D, χS(x) holds iff x ∈ S. Then we have the following equivalences for set atomic formulas:

e ∈ S ⇔ χ_S(e)

S1 ⊆ S2 ⇔ (∀x ∈ D) χ_S1(x) → χ_S2(x)

S1 ≤ S2 ⇔ (∀x1, x2 ∈ D) χ_S1(x1) ∧ χ_S2(x2) → x1 ≤ x2

112 and for set terms:

(∀x ∈ D) χ_{S1∪S2}(x) ↔ χ_S1(x) ∨ χ_S2(x)

(∀x ∈ D) χ_{S1∩S2}(x) ↔ χ_S1(x) ∧ χ_S2(x)

(∀x ∈ D) χ_{S1\S2}(x) ↔ χ_S1(x) ∧ ¬χ_S2(x)

For each quantifier-free ordered-sets formula ϕ, we construct a quantifier-free formula ϕ_D such that the only atomic formulas in ϕ_D are of the form e1 = e2, e1 ≤ e2, or χ_S(e), where e, e1, e2 are constants in D. We begin by converting ϕ to its negation normal form, so that ϕ is constructed only with conjunction and disjunction from positive or negative atomic formulas. Then we express all the set terms and atomic formulas as described above, and we replace any negative occurrence of a universally quantified formula (∀X) ψ with the positive occurrence (∃X) ¬ψ. We call the resulting formula ϕ_∃∀D. So far we have a formula ϕ_∃∀D which is equivalent to ϕ and no longer uses the set atomic formulas, but has both existential and universal quantification.

Next, we proceed to eliminate all the existential quantification from ϕ_∃∀D. We replace each occurrence of a formula (∃x) ψ(x) with ψ(e), where e is a fresh constant. Similarly, we replace each occurrence of (∃x1, x2) ψ(x1, x2) with ψ(e1, e2), where e1 and e2 are fresh constants. Thus, we have witnesses for each existential formula. We call the resulting formula ϕ_∀D. It is easy to see that ϕ_∀D is satisfiable iff ϕ_∃∀D is satisfiable. Each model of ϕ_∀D is a model of ϕ_∃∀D, as the existential quantifiers can be instantiated with the freshly created constants. On the other hand, from a model of ϕ_∃∀D we can create a model of ϕ_∀D by interpreting the extra constants to be the elements used to instantiate the existential quantifiers.

Finally, we proceed to eliminate the universal quantification from ϕ_∀D. Let {e1, e2, ..., en} be the set of constants appearing in the formula. We construct the formula ϕ_D by replacing each occurrence of a formula (∀x) ψ(x) with the formula ⋀_i ψ(e_i), and each occurrence of (∀x1, x2) ψ(x1, x2) with ⋀_{i,j} ψ(e_i, e_j). It follows trivially that any model of ϕ_∀D is also a model of ϕ_D. Let M be a model of ϕ_D. We construct the model M′ from M by setting χ_S(x) to false for all the elements of D that are not the image of a constant (that are not equal to any of the e_i), and preserving everything else. It follows that M′ satisfies a formula (∀X) ψ iff M satisfies it, and we can conclude that from any model for ϕ_D we can construct a model for ϕ_∀D. Hence ϕ_D is satisfiable iff ϕ_∀D is satisfiable. Moreover, ϕ_D is satisfiable iff the original formula ϕ is satisfiable. That is, the theory of ordered sets over (D, ≤) is decidable iff (D, ≤) itself is decidable. Similarly, we can reduce the decidability of ordered multisets over (D, ≤) to the decidability of (D, ≤) combined with Presburger arithmetic. The main difference is that instead of a characteristic function χ_S, we associate with each multiset S a counting function ν_S such that for any x in D, ν_S(x) equals the multiplicity of x in S. Then we have the following equivalences for multiset atomic formulas:

e ∈ S ⇔ ν_S(e) > 0

S1 ⊆ S2 ⇔ (∀x ∈ D) ν_{S1}(x) ≤ ν_{S2}(x)

S1 ≤ S2 ⇔ (∀x1, x2 ∈ D) ν_{S1}(x1) > 0 ∧ ν_{S2}(x2) > 0 → x1 ≤ x2

and for multiset terms:

(∀x ∈ D) ν_{S1∪S2}(x) = ν_{S1}(x) + ν_{S2}(x)

(∀x ∈ D) ν_{S1∩S2}(x) = min(ν_{S1}(x), ν_{S2}(x))

(∀x ∈ D) ν_{S1\S2}(x) = max(ν_{S1}(x) − ν_{S2}(x), 0)
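These counting-function equations can be checked concretely with Python's `collections.Counter`, whose `+`, `&`, and `-` operators realize exactly the additive union, min-intersection, and truncated difference above. This is a toy illustration of the semantics, not part of the decision procedure; the helper names are ours.

```python
from collections import Counter

def nu(ms, x):
    """Counting function nu_S(x): the multiplicity of x in the multiset S."""
    return ms[x]

def member(x, ms):
    """x in S  iff  nu_S(x) > 0."""
    return nu(ms, x) > 0

def msubset(s1, s2):
    """S1 subset-of S2  iff  nu_S1(x) <= nu_S2(x) for all x."""
    return all(nu(s1, x) <= nu(s2, x) for x in s1)

s1, s2 = Counter([1, 2, 2]), Counter([2, 3])
# Counter's +, &, - realize the three term equations above:
assert (s1 + s2)[2] == nu(s1, 2) + nu(s2, 2)          # union: sum of counts
assert (s1 & s2)[2] == min(nu(s1, 2), nu(s2, 2))      # intersection: min
assert (s1 - s2)[2] == max(nu(s1, 2) - nu(s2, 2), 0)  # difference: truncated at 0
assert member(2, s1) and not member(4, s1)
assert msubset(Counter([2]), s1)
```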

For each ordered-multisets formula ϕ we construct the quantifier-free formula ϕD as we do for sets. The main difference is that for multisets, ϕD also contains Presburger arithmetic.

The size of the formula ϕ∀D is the same as the size of ϕ. The size of the formula ϕD is at most n² times bigger than the size of ϕ, where n is the number of constants in ϕ∀D. The number n of constants is the number of constants in ϕ plus the number of constants added during the existential-quantification-elimination phase. As each atomic formula contributes at most two fresh constants, n is at most linear in the size of ϕ, and the size of ϕD is at most cubic in the size of ϕ. However, in our experiments, as described in Section 5.5, the number of constants is small, in the range between 5 and 15, and the resulting formulas are handled efficiently by SMT solvers.
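The universal-elimination phase can be illustrated with a toy Python sketch (the helper name, the constants, and the example formula are ours): each (∀x1,...,xk) ψ is grounded into the finite conjunction over all tuples of constants, which is sound because a model of the grounded formula can be shrunk to one where χS is false off the constants.

```python
from itertools import product

def eliminate_forall(psi, constants, arity=1):
    # Ground (forall x1..xk) psi(x1..xk) into the finite conjunction over
    # all tuples of constants e1..en, as in the construction of phi_D.
    return all(psi(*tup) for tup in product(constants, repeat=arity))

# Toy instance: constants e1..e3 are interpreted as the integers below,
# and the set S is interpreted via its characteristic function chi_S.
E = [2, 5, 7]
chi_S = lambda x: x in {2, 5}

# psi(x1, x2): (x1 in S and x2 in S) -> (x1 <= x2 or x2 <= x1), valid over D
psi = lambda x1, x2: not (chi_S(x1) and chi_S(x2)) or (x1 <= x2 or x2 <= x1)

assert eliminate_forall(psi, E, arity=2)   # all 3^2 = 9 ground conjuncts hold
```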

Proposition 5.4.2. The validity of the abstracted formula ϕabs → ψabs is decidable.

Proof. The formula is quantifier-free and falls in the combined theory of uninterpreted functions, linear arithmetic, and the ordered-set theory described above. The decidability follows from the decision procedure described above.

5.5 Experiments

We demonstrate the effectiveness and practicality of the Dryadtree logic and the natural proof strategy developed in this chapter by verifying standard operations on several inductive data structures. Each routine was written using recursive functions and annotated with a pre-condition and a post-condition, specifying a set of partial correctness properties covering both structural and data requirements. For each basic block of each routine, we manually generated the verification condition (SH; ϕ) following the procedure described in Section 5.2. Then we examined the validity of ϕ using the procedure described in Section 5.4. We employ Z3 [28], a state-of-the-art SMT solver, to check the validity of the generated formula ϕD in the quantifier-free theory of linear integer arithmetic and uninterpreted functions (QF_UFLIA).

5.5.1 The Data-Structures, Routines, and Verified Properties

The set of benchmarks is an almost exhaustive list of algorithms on tree-based data-structures covered in a first undergraduate course on data-structures [27]. Lists are trees with a singleton direction set. Sorted lists can be recursively defined as either an empty list, or a head node followed by a sorted list all of whose keys are no less than the key of the head, and hence can be expressed in Dryad. The routines insert and delete respectively insert and delete a node with key k in a sorted list, in a recursive fashion. The routine insertion-sort takes a singly-linked list and sorts it by recursively sorting the tail of the list, and inserting the key of the head into the sorted tail by calling insert. We check that all these routines return a sorted list with the multiset of keys as expected. A binary heap is recursively defined as either an empty tree, or a binary tree such that the root holds the greatest key and both its left and right subtrees are binary heaps. The routine max-heapify is given a binary tree whose left and right subtrees are both binary heaps. If the binary-heap property is violated at the root, it swaps the root with its greater child, and then recursively

max-heapifies that subtree. We check that the routine returns a binary heap with the same keys as the input tree. The treap data-structure is a class of binary trees with two data fields per node: key and priority. We assume that all priorities are distinct and all keys are distinct. Treaps can also be recursively defined in Dryad: a treap is either an empty tree, or a binary tree whose left and right subtrees are both treaps, whose root obeys the binary-search-tree property on keys, and whose root obeys the min-heap order property on priorities. The remove-root routine deletes the root of the input treap, and joins the two subtrees descending from the left and right children of the deleted node into a single treap. If the left or right subtree of the node to be deleted is empty, the join operation is trivial; otherwise, the left or right child of the deleted node is selected as the new root, and the deletion proceeds recursively. The delete routine simply searches recursively for the node to be deleted, and deletes it by calling remove-root. The insert routine recursively inserts the new node into an appropriate subtree to maintain the binary-search-tree property, then performs rotations to restore the min-heap order property, if necessary. We check that all these routines return a treap with the set of keys and the set of priorities as expected. An AVL tree is a binary search tree that is balanced: for each node, the absolute difference between the heights of the left and right subtrees is at most 1. The recursive predicate avl∗(t) holds for either the empty tree, or a binary tree such that: (1) the key stored in the root is no less than any key stored in the left subtree, and no greater than any key stored in the right subtree; (2) the difference between the heights of the left and right subtrees is between -1 and 1; and (3) both the left and the right subtree satisfy the avl∗ predicate.
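For concreteness, the avl∗ predicate just described can be transcribed as a straightforward (and deliberately inefficient) recursive Python check; the tuple encoding of trees, where a tree is None or a (key, left, right) triple, is our own, purely for illustration.

```python
def height(t):
    """Height of a tree; a tree is None (empty) or a (key, left, right) triple."""
    return 0 if t is None else 1 + max(height(t[1]), height(t[2]))

def keys(t):
    """The set of keys stored in t."""
    return set() if t is None else {t[0]} | keys(t[1]) | keys(t[2])

def avl(t):
    """The avl* predicate: BST ordering at the root, balance within +/-1,
    and both subtrees recursively avl*."""
    if t is None:
        return True
    k, l, r = t
    return (all(k >= x for x in keys(l))
            and all(k <= x for x in keys(r))
            and abs(height(l) - height(r)) <= 1
            and avl(l) and avl(r))

t = (4, (2, (1, None, None), (3, None, None)), (6, (5, None, None), None))
assert avl(t)
assert not avl((1, (2, None, None), None))             # violates BST ordering
assert not avl((3, (2, (1, None, None), None), None))  # unbalanced: heights 2 and 0
```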
The main routines for AVL trees are insert and delete. The insert routine recursively inserts a key into an AVL tree (similar to a binary search tree), and as it returns from the recursion it checks balancedness and performs one or two rotations to restore balance. The delete routine recursively deletes a key from an AVL tree (again, similar to a binary search tree), and as it returns from the recursion ensures that the tree is indeed balanced. For both routines, we prove that they return an AVL tree, that the multiset of keys is as expected, and that the height increases by at most 1 (for insert), decreases by at most 1 (for delete), or stays the same.

Red-black trees are BSTs that are more loosely balanced than AVL trees, and were described in Section 5.1.2: the height of the left subtree is between half and twice the height of the right subtree. We consider the insert and delete routines. The insert routine recursively inserts a key into a red-black subtree, and colors the new node red, possibly violating the red-black tree property. As it returns from the recursion, it performs several rotations to fix the property. If the root of the whole tree is red, it is recolored black, and all the properties are restored. The delete routine recursively deletes a key from a red-black tree, possibly violating the red-black tree property. As it returns from the recursion, it again performs several rotations to fix it. In the end, it may decrease the black height of the whole tree, and all the properties are restored. For both routines, we prove that they return a red-black tree, that the multiset of keys is as expected, and that the black height increases by at most 1 (for insert), decreases by at most 1 (for delete), or does not change. The B-tree is a data structure that generalizes the binary search tree in that for each non-leaf node the number of children is one more than the number of keys stored in that node. The keys are stored in increasing order, and if the node is not a leaf, the key with index i is no less than any key stored in the child with index i and no more than any key stored in the child with index i+1. For each node except the root, the number of keys is in some range (typically between T − 1 and 2T − 1). A B-tree is balanced in that for each node, the heights of all the children are equal. To describe the B-tree in our logic, we need three mutually recursive predicates: one that describes a B-subtree, and two that describe a list of keys and children.
The predicate b-subtree∗(t) states that the number of keys stored in the node is in the required range, and that the list of keys and children satisfies either the key-child-list∗(l) predicate (if the node is not a leaf) or the key-list∗(l) predicate (if the node is a leaf). The predicate key-child-list∗(l) states that either the list has only one element and the child satisfies the b-subtree∗(t) predicate, or the list has at least two elements, the key in the head of the list is no less than all the keys stored in the child and no greater than the keys stored in the tail of the list, the child satisfies b-subtree∗(t), and the height of the child is equal to the height of the tail (the height of a list is defined as the maximum height of a child). The predicate key-list∗(l) is defined similarly. We consider the find and insert routines. The find routine iterates over the list of keys, and recurses into

the appropriate child, until it finds the key or arrives at a leaf. The insert procedure is more complex, as it assumes that the node it is inserting into is non-full, and prior to recursion it might need to split the child. For both routines, we check that the tree after the call is a B-tree, that the multiset of keys has the expected value, and that the height of the tree stays the same or increases by at most 1 (for insert). As an advanced data structure, the binomial heap is described by a set of mutually recursively defined predicates: binomial-tree∗, binomial-heap∗ and full-list∗. We represent a binomial heap as follows. Briefly, a binomial heap of order k consists of a binomial tree of order k and a binomial heap of order less than k. A binomial tree of order k is an ordered tree defined recursively: the root contains the minimum key, and its children compose a binomial heap of order k − 1 satisfying the full-list property. A full-list of order k consists of a tree of order k and a full-list of order k − 1. The left-child, right-sibling scheme represents each binomial tree within a binomial heap. Each node contains its key; pointers to its leftmost child and to the sibling immediately to its right; and its degree. The roots of a binomial heap form a singly-linked list (also connected by the sibling pointer). We access the binomial heap by a pointer to the first node on the root list. The find-minimum routine expects a nonempty binomial heap, and moves the tree containing the minimum key to the head of the list. It returns the original heap if it is a single tree. Otherwise, it calls find-minimum on its tail list, and appends the returned list to the head tree; then, if the keys of the roots of the first two trees are unordered, it swaps the two trees.
We check that find-minimum returns a binomial tree followed by a binomial heap, such that the root of the tree contains the minimum key, and the head of the binomial heap is either the first or the second root of the original heap. The merge routine merges two binomial heaps x and y into one. If one of the two heaps is empty, it simply returns the other one. Otherwise, if the heads of the two heaps are of the same order, it merges the two head trees into one, merges the two tail lists recursively, and returns the new tree followed by the new heap; if not, say, x.order > y.order, it merges x.sibling and y, and concatenates the head tree of x with the new heap in an appropriate way satisfying the binomial-heap property. We check that merge returns a binomial heap whose keys are the union of the keys of the two input binomial heaps, and whose order increases by at most 1. The delete-minimum routine is non-recursive. It simply

moves the minimum tree m to the head by calling find-minimum, and obtains two binomial heaps: the list of the siblings of m, and the list of the children of m. Finally, it merges the two heaps using merge. We check that delete-minimum returns a binomial heap with the multiset of keys as expected.

5.5.2 Experimental Results

Table 5.1 summarizes our experiments, showing the results of verifying 147 basic blocks across these algorithms. All the verified programs can be found at http://www.cs.illinois.edu/~qiu2/dryad. The experiments were conducted on a dual-core, 3.2GHz, 8GB machine, running Windows 7 and Z3 2.19. For each data structure, we report the number of recursively defined integers, sets, multisets and predicates. For each routine, we report the number of basic blocks, the maximum number of nodes in the footprint, the time taken by Z3 to determine the validity of all generated formulas, and the validity result proved by Z3. We are encouraged by the fact that all the verification conditions that were deterministically generated by the methodology set forth in this chapter were proved by Z3 efficiently; this proves all these algorithms correct. To the best of our knowledge, this is the first efficient terminating automatic mechanism that can prove such a wide variety of data-structure algorithms, written in an imperative style, correct (in particular, the binomial heaps and B-trees presented here have not previously been proven correct automatically).

The experimental results show that Dryadtree is a very expressive logic that allows us to express natural and recursive properties of several complex inductive tree data structures. Moreover, our sound procedure is able to prove many complex verification conditions.

5.6 Related Work

There is a rich literature on program logics for heaps; we discuss the work that is closest to ours. In particular, we omit the rich literature on general interactive theorem provers (like Coq [40]) as well as general software verification tools (like Boogie [5]) that are not particularly adapted for heap verification.

Data Structure   #Ints #Sets #MSets #Preds   Routine         #BB  Max #Nodes    Total     Avg. Time   VC proved
                                                                  in Footprint  Time (s)  per VC (s)  valid?
Sorted List      0     0     1      1        insert          4    3             0.24      0.06        Yes
                                             delete          3    3             0.17      0.06        Yes
                                             insertion-sort  3    4             0.11      0.04        Yes
Binary Heap      0     0     1      1        max-heapify     5    8             1.89      0.38        Yes
Treap            0     2     0      1        insert          7    6             4.06      0.58        Yes
                                             delete          6    4             0.81      0.14        Yes
                                             remove-root     7    8             2.96      0.42        Yes
AVL Tree         1     0     1      1        insert          11   8             1.45      0.13        Yes
                                             delete          18   7             2.13      0.19        Yes
Red-Black Tree   1     0     1      1        insert          19   8             1.93      0.11        Yes
                                             delete          24   7             3.22      0.14        Yes
B-Tree           2     0     1      2        insert          12   6             0.40      0.03        Yes
                                             find            8    3             0.12      0.02        Yes
Binomial Heap    1     0     1      3        delete-minimum  3    7             0.29      0.10        Yes
                                             find-minimum    4    6             1.81      0.45        Yes
                                             merge           13   7             17.38     1.37        Yes
Total                                                        147

Table 5.1: Results of program verification using Dryadtree; more details at http://web.engr.illinois.edu/~qiu2/dryad.

Separation logic [63, 68, 10] is one of the most popular logics for verification of heap structures. Many dialects of separation logic combine separation logic with inductively defined data-structures. While separation logic gives mechanisms to compositionally reason with the footprint and the frame it resides in, proof assistants for separation logic are often heuristic and incomplete [10], though a couple of small decidable fragments are known [53, 9]. A work that comes very close to ours is a paper by Chin et al. [23], where the authors allow user-defined recursive predicates (similar to ours) and build a terminating procedure that reduces the verification condition to standard logical theories. While their procedure is more general than ours (they can handle structures beyond trees), the resulting formulas are quantified, and result in less efficient procedures. Bedrock [24] is a Coq library that aims at mostly (but not completely) automated procedures, and requires some proof tactics to be given by the user to prove verification conditions. In manual and semi-automatic approaches to verification of heap-manipulating programs [68, 69, 10], inductive definitions of algebraic data-types are extremely common, and proof tactics unroll these inductive definitions, do extensive unification to try to match terms, and find simple proofs. Our work in this chapter is very much inspired by the kinds of manual heap reasoning that we have seen in the literature. The work by Zee et al.
[81, 82] is one of the first attempts at full functional verification of linked data structures; it includes the development of the Jahob system, which uses higher-order logics to specify correctness properties, and puts together several theorem provers, ranging from first-order provers and SMT solvers to interactive theorem provers, to prove properties of algorithms manipulating data-structures. While many proofs required manual guidance, this work showed that proofs can often be derived using simple tactics like unrolling of inductive definitions, unification, abstraction, and employing decision procedures for decidable theories. This work was also an inspiration for ours, but we chose to concentrate on deriving proofs using completely automatic and terminating procedures, where unification, unrolling, abstraction, and decidable theories are systematically exploited. Another work that is very close to ours is that of Suter et al. [72], where decision procedures for algebraic data-types are presented with abstraction as the key to obtaining proofs. However, this work focuses on sound and complete decision procedures, and is limited in its ability to prove several complex data

structures correct. Moreover, the work limits itself to functional program correctness; in our opinion, functional programs are very similar to algebraic inductive specifications, leading to much simpler proof procedures. There is a rich and growing literature on completely automatic, sound, complete, and terminating decision procedures for restricted heap logics. The logic Lisbq [46] offers such reasoning with restricted reachability predicates and quantification. While the logic has extremely efficient decision procedures, its expressivity in stating properties of inductive data-structures (even trees) is very limited. There are several other logics in this genre that are less expressive but decidable [15, 12, 65, 66, 59, 67, 2]. Strand, as shown earlier in this dissertation, is a recent logic that can handle some data-structure properties (at least binary search trees) and admits decidable fragments by combining decidable theories of trees with the theory of arithmetic, but is again extremely restricted in expressiveness. None of these logics can express the verification conditions for full functional verification of the data-structures explored in this chapter.

CHAPTER 6

COMBINING SEPARATION AND RECURSION

As mentioned in Chapter 1, it is well known that heap analysis and verification is notoriously difficult. In recent years, Separation Logic (SL), especially in combination with recursive definitions, has emerged as a succinct and natural logic to express properties about structure and separation [68, 63]. However, the validation of verification conditions resulting from separation logic invariants is also complex, and has eluded automatic reasoning and the exploitation of SMT solvers (even more so than tools such as Boogie that use classical logic). Again, help from the user in proving the verification conditions is currently necessary: the tools Verifast [41] and Bedrock [24], for instance, admit separation logic specifications but require the user to write lower-level lemmas and proof tactics to guide the verification. For example, in verifying an in-place reversal of a linked list¹, Bedrock requires several lemmas and a hint package to be supplied at the level of the code in order for the proof to go through. On the other hand, as proposed in Chapter 5, the natural proof strategy is a novel attempt to combine expressive logics and automated reasoning; exploiting natural proofs results in successful verification of a wide variety of tree-manipulating programs. The natural proofs developed in Chapter 5 were too restrictive, however, handling only single trees, with no scope for handling multiple or more complex data-structures and their separation. We believe the two techniques complement each other: the succinctness of SL in reasoning with framing and separation, and the efficient decidability of natural proofs. Hence, the aim of this chapter is to provide natural proofs for general properties of structure, data, and separation. Our contributions are:

¹ http://plv.csail.mit.edu/bedrock/Tutorial.html

a) we propose Dryadsep, a dialect of separation logic for heaps, with no explicit (classical) quantification but with recursive definitions, to express second-order properties;

b) we show that Dryadsep is powerful in terms of expressiveness, and that the strongest postcondition of Dryadsep specifications with respect to bounded code segments can be formulated in Dryadsep;

c) we show how Dryadsep has been designed so that it can be systematically converted to classical logic using the theory of sets, allowing us to connect the more natural and succinct specifications to more verbose but classical formulations.

Organization: In the rest of this chapter, we introduce the design principles of the new logic in Section 6.1. After that we present our logic Dryadsep, a quantifier-free heaplet logic augmented with recursively defined predicates/functions, in terms of its syntax and its disciplined semantics in Section 6.2 and Section 6.3, respectively. We also clarify the difference between SL and Dryadsep via examples in Section 6.4. Section 6.5 is dedicated to a formal translation from Dryadsep to a classical logic extended with set theory.

6.1 Logic Design

In this section, we elaborate the two key ingredients of our logic Dryadsep, and explain why they are amenable to our natural proof strategy set forth in Chapter 5.

6.1.1 Deterministic Scope

The primary design principle behind separation logic is the decision to express strict specifications: logical formulas must naturally refer to heaplets (subparts of the heap), and, by default, to the smallest heaplets to which the formula needs to refer. This is in contrast to classical logics (such as FOL), which implicitly refer to the entire heap globally. Strict specifications permit elegant ways to capture how a particular sub-part of the heap changes due to a procedure, implicitly leaving the rest of the heap and its

properties unchanged across a call to this procedure. Separation logic is a particular framework for strict specifications, where formulas are implicitly defined on strictly defined heaplets, and where heaplets can be combined using a spatial conjunction operator denoted by ∗. The frame rule in separation logic captures the main advantage of strict specifications: if the Hoare triple {P}C{Q} holds for some program C, then {P ∗ R}C{Q ∗ R} also holds (with the side-condition that the variables modified in C are disjoint from the free variables in R). While the above motivation for separation logic based on strict specifications is worthy in itself, separation logic syntax gives another distinct advantage to our goal of building natural proofs for generalized structural properties. Going from handling tree structures in Chapter 5 to more general structures expressed in logic, a primary concern is how the structural property of the heap is expressed. Consider, for example, expressing that the location x is the root of a tree. This is a second-order property, and formulations of it in classical logic using set or path quantification are quite complex and not easily amenable to automated verification. We prefer inductive definitions of structural properties without any explicit quantification. The separation logic syntax with recursive definitions and heaplet semantics allows simple quantifier-free formulas to express structural restrictions; for example, tree-ness can be expressed simply as:

tree(x) ::= (x = nil ∧ emp) ∨ (x ↦ (l, r) ∗ tree(l) ∗ tree(r))
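Under the heaplet semantics, tree(x) holds on exactly one heaplet. The following Python sketch (our own encoding, purely illustrative: a heap is a dict from locations to (l, r) records, with None playing the role of nil) computes that unique heaplet, or reports failure when x does not point to a tree:

```python
def tree_heaplet(x, heap, claimed=frozenset()):
    """Return the unique heaplet (set of locations) on which tree(x) holds,
    or None if no such heaplet exists.  `claimed` carries the locations
    already owned by sibling heaplets, enforcing the disjointness of *."""
    if x is None:                       # x = nil ∧ emp: the empty heaplet
        return set()
    if x not in heap or x in claimed:   # dangling pointer, or sharing/cycle
        return None
    l, r = heap[x]
    hl = tree_heaplet(l, heap, claimed | {x})
    if hl is None:
        return None
    hr = tree_heaplet(r, heap, claimed | {x} | frozenset(hl))
    if hr is None:
        return None
    return {x} | hl | hr                # x maps-to (l, r) * tree(l) * tree(r)

heap = {1: (2, 3), 2: (None, None), 3: (None, None)}
assert tree_heaplet(1, heap) == {1, 2, 3}
assert tree_heaplet(1, {1: (1, None)}) is None               # a cycle is not a tree
assert tree_heaplet(1, {1: (2, 2), 2: (None, None)}) is None # shared subtree rejected
```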

We first define a new logic, Dryadsep, that permits no explicit quantification, but permits powerful recursive definitions to define integers, sets/multisets of integers, and sets of locations, using least fixed-points. The logic Dryadsep furthermore has a heaplet semantics and allows the spatial conjunction operator ∗. However, a key design feature of Dryadsep is that the heaplet for recursive formulas is essentially determined by the syntax as opposed to the semantics. In classical SL, a formula of the form α ∗ β says that the heaplet can be partitioned into any two disjoint heaplets, one satisfying α and the other β. In Dryadsep, the heaplet for a (complex) formula is determined, and hence if there is a way to partition the heaplet, there is precisely one way to do so. We have found that most uses of separation logic to express properties can be written quite succinctly and easily using Dryadsep

(in fact, it is easier to write such deterministic-heap specifications). The key advantage is that this eliminates the implicit existential quantification the separation operator provides. In a verification condition that combines the pre-condition in the negative and the post-condition in the positive, the classical semantics for SL invariably introduces universal quantification in the satisfiability query for the negation of the verification condition, which in turn is extremely hard to handle.

In Dryadsep, the semantics of a recursive definition r(x) (such as tree above) requires that the heaplet be determined, defined as the set of all locations reachable from the node x through a set of pointer fields f1,...,fk without passing through a set of locations (given by a set of location terms t1,...,tn). While our logical mechanisms can be extended beyond this notion (in deterministic ways), we have found that this captures most common properties required in proving data-structure manipulating programs correct. The above formulation lends itself well to the natural proof methodology: it does not use quantification (the implicit quantification on l and r is special, since they are uniquely determined by x), and it is amenable to unfolding across a footprint, and hence to natural proofs, provided we can handle the separation logic semantics in a decidable theory.
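This determined heaplet can be sketched in a few lines of Python (the dict-based heap encoding and the function name are ours): the heaplet of r(x) is the set of locations reachable from x through the given pointer fields, stopping at the given locations.

```python
def reach(x, heap, fields, stop=frozenset()):
    """The determined heaplet of a recursive definition r(x): all locations
    reachable from x via the given pointer fields, without passing through
    the stop locations.  `heap` maps a location to a dict of its fields."""
    seen, work = set(), [x]
    while work:
        y = work.pop()
        if y is None or y in seen or y in stop or y not in heap:
            continue
        seen.add(y)
        work.extend(heap[y].get(f) for f in fields)
    return seen

heap = {1: {'next': 2}, 2: {'next': 3}, 3: {'next': None}}
assert reach(1, heap, ['next']) == {1, 2, 3}
assert reach(1, heap, ['next'], stop={3}) == {1, 2}   # up to, not through, 3
```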

6.1.2 Deterministic Translation

The second key step in our paradigm is a technique to bridge the gap from separation logic to classical logic in order to utilize efficient decision procedures supported by SMT solvers. We show that the heaplet semantics and separation logic constructs of Dryadsep can be effectively translated to classical logic where heaplets are modeled as sets of locations. We show that

Dryadsep formulas can be translated into classical logic with free set variables that capture the heaplets corresponding to the strict semantics. This translation does not, of course, yield a decidable theory yet, as recursive definitions are still present (the recursion-free formulas are in a decidable theory). The carefully designed Dryadsep logic with determined heaplet semantics ensures that there is no quantification in the resulting formula in classical logic. The heaplets of recursively defined properties, which are defined using the set of all reachable nodes, are translated to recursively defined sets of locations.

6.2 Syntax

Let us fix a finite set of pointer-fields PF and a finite set of data-fields DF. A record consists of a set of pointer-fields from PF and a set of data-fields from DF. Our logic presumes that locations refer to entire records rather than particular fields, and that address arithmetic is precluded; we will henceforth use the term locations to refer to these records. We assume that every field is defined at every location, i.e., all memory records have the same layout (to simplify the presentation); our logic can easily be extended with record types. Let Bool = {true, false} stand for the set of Boolean values, Int for the set of integers, and Loc for the universe of locations. For any set A, let S(A) denote the set of all finite subsets of A, and let MS(A) denote the set of all finite multisets with elements in A. The Dryad logic allows expressing quantifier-free first-order properties over heaps/heaplets, augmented with recursively defined notions for a location, denoted as a function r : Loc → D, to express second-order properties. The codomain D can be IntL, S(Loc), S(Int), MS(Int)L or Bool, where IntL and

MS(Int)L extend Int and MS(Int) to lattice domains, respectively, in order to give least fixed-point semantics (explained later in this section). Typical examples of these recursive definitions include the height of a tree or the height of black-nodes in the tree rooted at a node (recursively defined integers), the set of nodes reachable from a location following certain pointer fields (recursively defined sets of locations), the set/multiset of keys stored at a particular data-field under nodes reachable from a location (recursively defined sets/multisets of integers), and the property that the tree rooted at a node is a binary search tree or a balanced tree or just a tree (recursively defined predicates).

A Dryadsep formula ϕ is quantifier-free, but is parameterized by a set of recursive definitions Def∆. The syntax of the Dryad logic is given in Figure 6.1, where the syntax of formulas is followed by the syntax for recursive definitions. Most symbols in Dryadsep are common and self-explanatory. Note

that the inequality (< or ≤) between integer sets/multisets indicates that any integer in the left-hand side is less-than/no-greater-than any integer in the right-hand side. It is also noteworthy that the separating conjunction (∗) from separation logic is allowed, but only if it does not appear below any negation (¬).

[Figure 6.1 appears here: the full syntax of Dryadsep, with grammar productions for Loc, IntL, S(Loc), S(Int), and MS(Int)L terms; for positive formulas, including emp, points-to, the (in)equality and membership atoms, and the spatial conjunction ∗; for general formulas; and for recursive function and predicate definitions.]

Figure 6.1: Syntax of Dryadsep

We require that every recursive function/predicate used in

the formula ϕ has a unique definition in Def∆. Each recursive function is parameterized by a set of pointer fields ~pf and a set of program variables ~v, and is denoted f∆_{~pf,~v}. The subscripts are used in defining the semantics of recursive functions in Section 6.3; we usually write simply f∆ when the subscripts are not relevant in the context. Similarly, recursive predicates are denoted p∆_{~pf,~v}, or simply p∆. The recursive functions are defined using the syntax:

( ϕ^f_1(x, ~v, ~s) : t^f_1(x, ~s) ; ... ; ϕ^f_k(x, ~v, ~s) : t^f_k(x, ~s) ; default : t^f_{k+1}(x, ~s) )

where each ϕ^f_u(x, ~v, ~s) / t^f_u(x, ~s) is a formula/term in our logic with ~s implicitly existentially quantified. The recursively defined predicates are defined using the syntax ϕ^p(x, ~v, ~s), which is a formula in our logic with ~s implicitly existentially quantified. The recursive function syntax above expresses a case-split, with the function evaluating to the first term whose guard evaluates to true. The restrictions on the recursive definitions are:

• Subtraction, set-difference, and negation are disallowed;

• Every variable in ~s should appear exactly once on the right-hand side of a points-to relation binding it to x.

For examples of recursive functions and predicates, see the definitions keys^Δ_{~pf}(x) and mheap^Δ_{~pf}(x) in Figure 7.2, respectively. The set of program variables ~v parameterizing the definitions is empty in both these definitions, and the set of implicitly existentially-quantified variables ~s is {k, l, r}.

6.3 Semantics

Our logic is interpreted on models that are program states:

Definition 6.3.1. A program state is a tuple C =(R,s,h) where

• R ⊆ Loc \{nil} is a finite set of locations;

• s : Vars → Int ∪ Loc is a store mapping program variables to locations or integers (of appropriate type);

• h : R × (PF ∪ DF) → Int ∪ Loc is a heaplet mapping each non-nil location and pointer-field/data-field pair to a value of the appropriate type.

Note that the set of locations is, in general, larger than R, and hence R defines a subset of the heap locations. The store maps variables to locations (not necessarily in R), but the heaplet h gives interpretations for pointer- and data-fields only for elements of R. Given a heaplet h, for every pointer field pf, we denote the projection of h on R × ((PF \ {pf}) ∪ DF) as h ∤ pf; similarly, for every data-field df, we denote the projection of h on R × (PF ∪ (DF \ {df})) as h ∤ df. Also, for every subset S ⊆ R, we denote the projection of h on S × (PF ∪ DF) as h | S. A term/formula with free variables F is interpreted by interpreting the free variables in F using the map s from variables to values. The semantics of

Dryadsep is similar to that of classical separation logic (SL). In particular, a term/formula without recursive definitions is interpreted in exactly the same way in Dryadsep and SL. Hence we first give the semantics of the non-recursive part, followed by the semantics of recursive definitions. Before defining the semantics of formulas, we define the pure property for terms/formulas. Intuitively, a term/formula is pure if it is independent of the heap. Syntactically, a term/formula is pure if it does not contain emp, ↦, or any recursive definition. Note that in SL all terms are pure, but in Dryadsep a term can be impure if it contains a recursive function f^Δ. Given a term t, we define pure(t) inductively in Figure 6.2. The f* function refers to any recursive integer/set/multiset definition; the op and ∼ symbols stand for the appropriate operators. Note that formulas with recursive definitions are not heapless, nor is any formula with the separating conjunction ∗. We are now ready to define the semantics under a model C = (R, s, h).

6.3.1 Terms

Each T-term evaluates to either a normal value of type T or to undef, which is used only in interpreting recursive functions (explained later). As a special value, undef is propagated throughout the formula: if a formula ϕ contains a sub-term that evaluates to undef, then ϕ evaluates to false if it appears positively, and to true otherwise. Intuitively, undef cannot help in making the formula true over a model. The Loc terms are evaluated as follows:

pure(var) = true                        pure(c) = true
pure(f*(lt)) = false                    pure({t}) = pure(t)
pure(t op t′) = pure(t) ∧ pure(t′)      pure(true) = true
pure(p*(lt)) = false                    pure(emp) = false
pure(lt ↦_{~pf,~df}(~lt, ~it)) = false
pure(t ∼ t′) = pure(t) ∧ pure(t′)
pure(ϕ ∧ ϕ′) = pure(ϕ) ∧ pure(ϕ′)
pure(ϕ ∨ ϕ′) = pure(ϕ) ∧ pure(ϕ′)
pure(¬ϕ) = pure(ϕ)
pure(ϕ ∗ ϕ′) = false

Figure 6.2: The pure predicate for terms/formulas
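The pure check is a simple structural recursion over the syntax; the following sketch (a tuple-based AST encoding of our own, not part of the dissertation's implementation) mirrors Figure 6.2:

```python
# Hedged sketch: the pure predicate as a recursion over a tiny tuple-encoded
# AST. Tags like "recfun" and "sep" are our own encoding, not Dryad syntax.

def pure(node):
    tag = node[0]
    if tag in ("var", "const", "true", "false"):
        return True
    if tag in ("recfun", "recpred", "emp", "pointsto", "sep"):
        return False                     # heap-dependent constructs
    if tag == "neg":                     # pure(¬ϕ) = pure(ϕ)
        return pure(node[1])
    # singletons, binary operators, relations, ∧, ∨: pure iff all children are
    return all(pure(child) for child in node[1:])
```

For example, a conjunction containing a recursive function application is impure, since `pure` of any heap-dependent sub-term is false.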

⟦x⟧_C = s(x)

⟦nil⟧_C = nil

For any binary operator op, t op t′ is evaluated as follows:

⟦t op t′⟧_C =
    ⟦t⟧_C op ⟦t′⟧_C              if t or t′ is pure
    ⟦t⟧_{C|R1} op ⟦t′⟧_{C|R2}    else if there exist R1, R2 such that R = R1 ∪ R2,
                                 ⟦t⟧_{C|R1} ≠ undef and ⟦t′⟧_{C|R2} ≠ undef
    undef                        otherwise

where op is interpreted in the natural way. For singletons, {it} will evaluate to ∅ if it evaluates to −∞ or ∞:

⟦{it}⟧_C =
    undef       if ⟦it⟧_C = undef
    ∅           if ⟦it⟧_C = −∞ or ∞
    {⟦it⟧_C}    otherwise

Similarly, {it}m will evaluate to ∅m if it evaluates to −∞ or ∞:

⟦{it}_m⟧_C =
    undef         if ⟦it⟧_C = undef
    ∅_m           if ⟦it⟧_C = −∞ or ∞
    {⟦it⟧_C}_m    otherwise

and {lt} will evaluate as follows:

⟦{lt}⟧_C =
    undef       if ⟦lt⟧_C = undef
    {⟦lt⟧_C}    otherwise
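The undef-propagation rules above can be mirrored in a small evaluator; the sketch below uses sentinel values of our own choosing, not anything from the dissertation's tooling:

```python
# Hedged sketch of undef/∞ propagation in term evaluation. UNDEF, NEG_INF and
# POS_INF are hypothetical sentinels; sets model S(Int) terms.
UNDEF = object()
NEG_INF, POS_INF = float("-inf"), float("inf")

def eval_int_singleton(v):
    """Semantics of {it}: undef propagates; ±∞ collapses to the empty set."""
    if v is UNDEF:
        return UNDEF
    if v in (NEG_INF, POS_INF):
        return frozenset()
    return frozenset([v])

def eval_union(a, b):
    """Union on set terms when both arguments are pure, with undef propagation."""
    if a is UNDEF or b is UNDEF:
        return UNDEF
    return a | b
```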

6.3.2 Formulas

The formula true is always interpreted to be true:

(R,s,h) |= true

The formula emp asserts that the heap is empty:

(R,s,h) |= emp iff R = ∅

The formula lt ↦_{~pf,~df}(~lt, ~it) asserts that the heap contains exactly one record, at address lt, consisting of fields ~pf and ~df with values ~lt and ~it, respectively. Formally, the semantics of this formula is given as:

(R, s, h) |= lt ↦_{~pf,~df}(~lt, ~it)  iff  R = {⟦lt⟧_{R,s,h}} and
    h(⟦lt⟧_{R,s,h}, pf_i) = ⟦lt_i⟧_{R,s,h} for corresponding pf_i and lt_i,
    h(⟦lt⟧_{R,s,h}, df_i) = ⟦it_i⟧_{R,s,h} for corresponding df_i and it_i.

Note that, as in separation logic, the above has a strict semantics: the heaplet must be a singleton set and cannot be larger. For binary relations t ∼ t′ between integers, sets, and multisets, including equality, the pure property plays an important role. Remember that in SL all terms are pure. To be consistent with SL, if both t and t′ are pure, the relation is interpreted in the normal way. Otherwise, t ∼ t′ is defined only on the minimum heaplet required by t and t′, more concretely the union of the heaplets associated with t and t′.

(R, s, h) |= t ∼ t′  iff  t or t′ is pure and ⟦t⟧_C ∼ ⟦t′⟧_C,
    or t and t′ are impure and there exist R1, R2 s.t. R = R1 ∪ R2,
    ⟦t⟧_{C|R1} ≠ undef, ⟦t′⟧_{C|R2} ≠ undef, and ⟦t⟧_{C|R1} ∼ ⟦t′⟧_{C|R2}

where ∼ is interpreted in the natural way. The semantics of the separating conjunction operator ∗ is defined as follows.

The formula ϕ0 ∗ ϕ1 asserts that the heap can be split into two disjoint parts in which ϕ0 and ϕ1 hold respectively:

(R,s,h) |= ϕ0 ∗ ϕ1 iff there exist R0, R1 s.t. R0 ∩ R1 = ∅ and

R0 ∪ R1 = R and (R0, s, h|R0) |= ϕ0 and (R1, s, h|R1) |= ϕ1
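On small explicit models, this semantics of ∗ can be checked directly by enumerating all splits of the heap domain; the following sketch (names and encoding are ours, with the sub-formula checkers passed in as callbacks) does exactly that:

```python
from itertools import chain, combinations

# Hedged sketch: checking ϕ0 ∗ ϕ1 on a tiny explicit model by enumerating all
# disjoint splits R = R0 ∪ R1, as in the semantics above. `sat0`/`sat1` are
# hypothetical satisfaction checkers for the two sub-formulas.

def holds_sep(R, sat0, sat1):
    """Does ϕ0 ∗ ϕ1 hold on heap domain R (a set of locations)?"""
    R = frozenset(R)
    subsets = chain.from_iterable(combinations(R, k) for k in range(len(R) + 1))
    for R0 in map(frozenset, subsets):
        R1 = R - R0                 # the splits are disjoint by construction
        if sat0(R0) and sat1(R1):
            return True
    return False

# Example: emp holds only on the empty heaplet, so emp ∗ emp needs R = ∅.
is_emp = lambda Ri: len(Ri) == 0
```

Enumerating 2^|R| splits is exponential, which is precisely why Dryadsep's syntactically determined heaplets (Section 6.5) are attractive.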

Boolean combinations are defined in the standard way:

(R,s,h) |= ϕ0 ∧ ϕ1 iff (R,s,h) |= ϕ0 and (R,s,h) |= ϕ1

(R, s, h) |= ϕ0 ∨ ϕ1  iff  (R, s, h) |= ϕ0 or (R, s, h) |= ϕ1
(R, s, h) |= ¬ϕ       iff  (R, s, h) ⊭ ϕ

6.3.3 Recursive Definitions

The main semantic difference between Dryadsep and SL is in recursive definitions. We would like to deterministically delineate the heap domain for any recursive definition, so that the heap domain required by any Dryadsep formula can be syntactically determined. Given a recursive definition rec^Δ_{~pf,~v}, the subscripts ~pf and ~v play a role in delineating the heap domain. Intuitively, the heap domain for rec^Δ_{~pf,~v}(l) is the set of locations reachable from l using pointer-fields in ~pf, but without going through the locations ~v. In other words, we want to take the set of locations that lie between l and ~v. Precisely, this set is determined by a location l and a program state (R, s, h). We denote it as reachset_{~pf,~v}(l, (R, s, h)). Formally, it is the smallest set of locations L satisfying the following two conditions:

1. l is in the set L if l is not in ~v and l ≠ nil;

2. for each c in L, with c ∈ R, and for each pointer field pf, if h(c, pf) is not in ~v and is not nil, then h(c, pf) is also in L.
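The least set satisfying conditions 1 and 2 can be computed by a standard worklist traversal; a sketch, with our own encoding of heaplets as a map from (location, field) pairs to locations:

```python
# Hedged sketch of reachset_{pf,v}(l, (R, s, h)): the least set closed under
# conditions 1 and 2 above. The encoding (h as a dict, nil as None) is ours.

def reachset(l, R, h, pfs, stop, nil=None):
    """Locations lying between l and the stop set (the ~v locations)."""
    result, frontier = set(), [l]
    while frontier:
        c = frontier.pop()
        if c == nil or c in stop or c in result:
            continue
        result.add(c)                       # condition 1 (and closure under 2)
        if c in R:                          # condition 2 only follows edges of c ∈ R
            for pf in pfs:
                frontier.append(h.get((c, pf), nil))
    return result
```

On a three-cell list 1 → 2 → 3 → nil, `reachset(1, ...)` with an empty stop set yields {1, 2, 3}, while stopping at 3 yields {1, 2}.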

Remark: Even though the reach set is defined with respect to the edges in the heaplet, we can determine whether R includes all nodes reachable from l without going through ~v in the global heap by checking whether R = reachset_{~pf,~v}(l, (R, s, h)). For each recursive definition rec^Δ_{~pf,~v}, we usually simply denote reachset_{~pf,~v} as reachset^rec, as the subscripts are implicitly known.

We first give some intuition on the semantics of recursive definitions. Given a program state C = (R, s, h) and a recursive function/predicate rec^Δ, the semantics on a location l depends on whether the heap domain R is exactly the required reach set reachset^rec(l, (R, s, h)). If this is not true, we simply interpret it as undef or false. If the heap domain matches the reach set (i.e., R = reachset_{~pf,~v}(l, (R, s, h))), the semantics is defined in the natural way (using least fixed-points).

In order to give least fixed-point semantics for recursive definitions in the logic, we extend the primitive data types to lattice domains. Bool with the order false ⊑ true forms a complete lattice, and S(Loc) and S(Int), ordered by inclusion with join as union and meet as intersection, form complete lattices. Integers and multisets are extended to lattices. Let (Int_L, ≤) denote the complete lattice where Int_L = Int ∪ {−∞, ∞}, the ordering is ≤, join is max, and meet is min. Also, let (MS(Int)_L, ⊑) denote the complete lattice constructed from MS(Int), where MS(Int)_L = MS(Int) ∪ {⊤} and ⊑ extends the inclusion relation with M ⊑ ⊤ for any M ∈ MS(Int). It is easy to see that (Int_L, ≤) and (MS(Int)_L, ⊑) are complete lattices.

Formally, let Def consist of definitions of integer functions I, set-of-locations functions SL, set-of-integers functions SI, multiset-of-integers functions MSI, and predicates P. Since these definitions can rely on each other, we evaluate them altogether as a function vector

    r^Δ = (~i^Δ, ~sl^Δ, ~si^Δ, ~msi^Δ, ~p^Δ)

Let select_rec(r^Δ), for each recursive definition rec^Δ, denote the selection of the coordinate for rec^Δ in r^Δ. Then the semantics of each single function rec^Δ is just select_rec(r^Δ). Since (Int_L, ≤), (S(Loc), ⊆), (S(Int), ⊆), (MS(Int)_L, ⊑)

and (Bool, ⊑) are all complete lattices, for each domain Loc^k and each type C we can define a structure (Loc^k → C, ⊑), where ⊑ is defined as follows: for any two functions r, r′ : Loc^k → C, r ⊑ r′ if and only if for any (l_1, ..., l_k) ∈ Loc^k, r(l_1, ..., l_k) ⊑ r′(l_1, ..., l_k). Then the product of (Loc^k → C_rec, ⊑) for all rec^Δ also forms a complete lattice, where ⊑ is the product order of the component orders. Let each rec be a mapping from D_rec to C_rec; then

(∏_rec (D_rec → C_rec), ⊑) is just the Cartesian product of all the component complete lattices, and is still complete.

Proposition 6.3.2. (∏_rec (D_rec → C_rec), ⊑) is a complete lattice.

Proof. By the fact that the Cartesian product of complete lattices is still a complete lattice.

Now, to give the least fixed-point semantics to r^Δ, we first need to define an operator U_C over ∏_rec (D_rec → C_rec). Since the codomain of U_C is just a product of functions from D_rec to C_rec for each recursive definition rec^Δ, it suffices to define each component operator

U_rec : ∏_rec (D_rec → C_rec) → (D_rec → C_rec)

Then simply

    U_C(V) def= ∏_rec (U_rec(V))

where V ∈ ∏_rec (D_rec → C_rec). Let C = (R, s, h) be a computation state, and let r be a function vector in ∏_rec (D_rec → C_rec). To define U_rec(r), consider two cases: rec^Δ is a recursive function, or rec^Δ is a recursive predicate. Consider a recursively defined function

f^Δ(x, ~s) def= ( ϕ^f_1(x, ~v, ~s) : t^f_1(x, ~s) ; ... ; ϕ^f_k(x, ~v, ~s) : t^f_k(x, ~s) ; default : t^f_{k+1}(x, ~s) )

We first translate the colon-separated definition into a nested if-then-else (ITE) operator. Formally,

Def_f def= ITE( ϕ^f_1(x, ~v, ~s), t^f_1(x, ~s), ITE( ϕ^f_2(x, ~v, ~s), t^f_2(x, ~s), ... t^f_{k+1}(x, ~s) ... ) )

Secondly, we compute the scope D = reachset^f(x, ~s, h). Now we define

U_f(r)(x, ~v) = ⟦ Def_f(x, ~v, x(~s)) ⟧_{R∩D, x(~s), h|D}

where x(~s) replaces each s with the corresponding dereference of x.

The above evaluation could involve evaluating some ⟦f′(lt)⟧_{C′} or ⟦p′(lt)⟧_{C′}, where C′ = (R′, s′, h′) is such that R′ ⊆ R, s′ = s(~v), and h′ = h|R′. ⟦f′(lt)⟧_{C′} is evaluated as follows:

⟦f′(lt)⟧_{R′,s′,h′} =
    select_{f′}(r)(⟦lt⟧_{C′})    if R′ = reachset^{f′}(⟦lt⟧_{C′}, h′)
    undef                        otherwise

⟦p′(lt)⟧_{C′} is evaluated similarly:

⟦p′(lt)⟧_{R′,s′,h′} =
    select_{p′}(r)(⟦lt⟧_{C′})    if R′ = reachset^{p′}(⟦lt⟧_{C′}, h′)
    false                        otherwise

Similarly, consider a recursively defined predicate p^Δ(x, ~s) def= ϕ^p(x, ~v, ~s); we first compute the scope D = reachset^p(x, ~s, h). Then we define

U_p(r)(x, ~v) = ⟦ ϕ^p(x, ~v, x(~s)) ⟧_{R∩D, x(~s), h|D}

Theorem 6.3.3. For each computation state C, U_C admits a least fixed-point w.r.t. ⊑.

Proof. Note that we disallow negative operations (subtraction, set-difference, and negation) in recursive definitions. This syntactic restriction guarantees that for each recursive definition rec and for each r ∈ ∏_rec (D_rec → C_rec), the operator U_rec is monotonic w.r.t. ⊑. Since U_C is just the product of all the U_rec, U_C is monotonic as well. By the Knaster-Tarski theorem, U_C admits a least fixed-point.
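On a finite lattice, the least fixed-point guaranteed by Knaster-Tarski can be reached by Kleene iteration from the bottom element; the sketch below (the reachability example is our own illustration, not the dissertation's implementation) shows the pattern:

```python
# Hedged sketch: computing lfp(U) by Kleene iteration from bottom. This works
# for monotone U on a finite lattice; equality detects the fixed point.

def lfp(U, bottom):
    x = bottom
    while True:
        nxt = U(x)
        if nxt == x:
            return x
        x = nxt

# Example: the least set containing node 1 and closed under the edge relation,
# i.e. reachability from 1 — a set-lattice instance of the construction above.
edges = {1: {2}, 2: {3}, 3: set()}
U = lambda S: {1} | {m for n in S for m in edges[n]}
```

Here `lfp(U, set())` iterates ∅ → {1} → {1, 2} → {1, 2, 3} and stops.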

Now we can formally define the semantics of recursive definitions. For any configuration C, the semantics of a recursive function f ∆ is defined as:

⟦f^Δ(lt, ~st)⟧_C =
    select_f(lfp(U_C))(⟦lt⟧_C, ⟦~st⟧_C)    if R = reachset^f(⟦lt⟧_C, ⟦~st⟧_C, C)
    undef                                  otherwise

and the semantics of a recursive predicate p^Δ is defined as

⟦p^Δ(lt, ~st)⟧_C =
    select_p(lfp(U_C))(⟦lt⟧_C, ⟦~st⟧_C)    if R = reachset^p(⟦lt⟧_C, ⟦~st⟧_C, C)
    false                                  otherwise

 6.4 Examples

In this section, we give several examples to show how Dryadsep can characterize data-structures recursively, and its subtle differences from classical SL. A max-heap is a binary tree such that, for each node n, the key stored at n is greater than or equal to the keys stored at each of its children. Heaps are often used to implement priority queues. Below, we express the property that a location x points to a max-heap using the recursive definitions keys^Δ_{~pf}(x) and mheap^Δ_{~pf}(x), with ~pf ≡ {left, right}.

mheap^Δ_{~pf}(x) def=
    (x = nil ∧ emp)
  ∨ ( x ↦_{key,left,right}(k, l, r)
      ∗ (mheap^Δ_{~pf}(l) ∧ {k} ≥ keys^Δ_{~pf}(l))
      ∗ (mheap^Δ_{~pf}(r) ∧ {k} ≥ keys^Δ_{~pf}(r)) )

keys^Δ_{~pf}(x) def=
    (x = nil ∧ emp) : ∅ ;
    (x ↦_{key,left,right}(k, l, r) ∗ true) : keys^Δ_{~pf}(l) ∪ {k} ∪ keys^Δ_{~pf}(r) ;
    default : ∅

For a location x, the recursive definition keys^Δ_{~pf}(x) returns the set of keys at the nodes of the tree rooted at x: if x is nil and the heaplet is empty,

then the empty set; otherwise, the union of the key stored at x and the keys stored in the left and right subtrees of x. Similarly, the recursive definition mheap^Δ_{~pf}(x) states that x points to a max-heap if: x is nil and the heaplet is empty; or x and the heaplets of the left and right subtrees of x are mutually disjoint (x points to a tree) and the key at x is greater than or equal to the keys of the left and right subtrees of x. Note that the definition of a max-heap is defined precisely on the heaplet that includes only the underlying tree nodes of the max-heap, as the heaplet for a recursive definition is the set of all reachable nodes according to the two pointers.
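To make the two definitions concrete, the following sketch evaluates keys and mheap on an explicit binary tree, taking the heaplet to be exactly the nodes of the tree. It elides the disjointness (tree-ness) constraint that ∗ enforces, and the node encoding is our own:

```python
# Hedged sketch: keys(x) and mheap(x) on a concrete binary tree, with the
# heaplet implicitly the whole tree (its reach set). Nodes are dicts
# {"key": k, "left": l, "right": r}; None plays the role of nil.

def keys(x):
    if x is None:                   # x = nil ∧ emp  →  ∅
        return set()
    return {x["key"]} | keys(x["left"]) | keys(x["right"])

def mheap(x):
    if x is None:                   # x = nil ∧ emp  →  true
        return True
    below = keys(x["left"]) | keys(x["right"])
    return (all(x["key"] >= k for k in below)
            and mheap(x["left"]) and mheap(x["right"]))
```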

To clarify the difference between Dryadsep and SL, consider now this recursive definition:

p^Δ_{l,r}(x) def= (x = nil ∧ emp) ∨ [ (x ↦_{l,r}(y, z)) ∗ (p^Δ_{l,r}(y) ∨ p^Δ_{l,r}(z)) ]

Consider a global heap that has a tree rooted at x with pointer fields l and r. The above recursive formula, in separation logic, will be true on any heaplet that contains the nodes of a path in this tree from x to nil. However, in Dryadsep, we require that the heaplet must satisfy the heap constraints of the formula and also be the precise set of locations reachable from x using the pointer fields l and r. Consequently, if the tree pointed to by x has more than one path, the Dryadsep formula will be false for any heaplet.

The above example shows the advantage of Dryadsep: while nondetermined heaplets are inherent in SL, in Dryadsep the heap domain is precisely determined, and we can avoid quantification. This quantifier-free feature is very useful: when we check a verification condition along the lines of p*(x) ∧ ... ⇒ p*(t) (a precondition implying a postcondition), the corresponding satisfiability query is p*(x) ∧ ... ∧ ¬p*(t). The negated formula ¬p*(t) hence requires a universal quantification over paths in the tree under the usual separation logic semantics, which is very hard to handle automatically. In contrast, we have not found natural examples where an undetermined heaplet semantics helps in specifying properties of heaps. Consequently, in our semantics, the heap domain for recursive definitions is syntactically determined to be a fixed set, based only on its signature,

making Dryadsep amenable to being translated to quantifier-free classical logic.

Dryadsep can express structures beyond trees. The main restriction we do impose is that we allow only unary recursive definitions, as this allows us to find simpler natural proofs, since there is only one way to unfold a definition across a footprint. However, Dryadsep can express structures like cyclic lists and doubly-linked lists. A cyclic list is captured as (v ↦_{next} y) ∗ lseg^Δ_{next,v}(y). Here, v is a program variable denoting the head of the cyclic list, and lseg^Δ_{next,v}(y) captures the list segment from y back to the head v, where the subscripts next and v indicate that the heaplet of the list segment is the set of locations reachable using the field next, but without going through v:

lseg^Δ_{next,v}(y) def= (y = v ∧ emp) ∨ ( (y ↦_{next} z) ∗ lseg^Δ_{next,v}(z) )

Another interesting example is a doubly-linked list. We define a doubly-linked list as the following unary predicate:

dll^Δ_{next}(x) def= (x = nil ∧ emp) ∨ (x ↦_{next} nil) ∨
    ( (x ↦_{next} y) ∗ ((y ↦_{prev} x ∗ true) ∧ dll^Δ_{next}(y)) )

The first two disjuncts in the definition cover the base cases, when x is nil or the location next to x is nil; otherwise, let y be the location next to x; then the prev pointer at y points to x and location y is recursively a doubly-linked list.
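A direct check of this dll predicate on an explicit heap can be sketched as follows (the heap encoding is our own, and termination assumes an acyclic next chain, which the precise-heaplet semantics would enforce):

```python
# Hedged sketch: checking the dll predicate on an explicit heap mapping
# location -> {"next": ..., "prev": ...}; None plays the role of nil.

def dll(x, heap):
    if x is None:                       # x = nil ∧ emp
        return True
    y = heap[x]["next"]
    if y is None:                       # x ↦_next nil
        return True
    # the prev pointer at y must point back to x, and y must again head a dll
    return heap[y]["prev"] == x and dll(y, heap)
```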

6.5 Translating to a Logic over the Global Heap

We now show one of the main contributions of this chapter: a translation from Dryadsep to classical logic with recursive predicates and functions, but over the global heap. The formulation of separation logic primitives over the global heap allows us to express complex structural properties, like disjointness of heaplets and tree-ness, using recursive definitions over sets of locations, which are defined locally and are amenable to unfolding across the footprint, and hence to natural proofs.

For example, consider the formula mheap^Δ(x) ∗ mheap^Δ(y), where mheap^Δ is defined in Section 6.4. Since the heaplets for mheap^Δ(x) and mheap^Δ(y) are precise, it gets translated to an equivalent formula with a free set variable G that denotes the global heap over which the formula is evaluated:

mheap(x) ∧ mheap(y) ∧ (reach_mheap(x) ∩ reach_mheap(y) = ∅)
    ∧ (reach_mheap(x) ∪ reach_mheap(y) = G)

where mheap and reach_mheap are the corresponding recursive definitions in classical logic, defined later in this section. Note that we use italics and remove the Δ superscript to distinguish them from their counterparts in Dryadsep. Since these theories can be handled by present SMT solvers, this translation leads to a robust way of deciding separation logic formulas in an automatic and terminating manner by making calls to the underlying theory solvers.

6.5.1 Preprocessing

We assume the Dryadsep formula to be translated is in disjunctive normal form, i.e., ∨ operators are above all ∗ and ∧ operators. This is not a real restriction, as one can always push the disjunctions out. This normal form ensures that for all occurrences of the separation operator in a formula, there exists a unique way of splitting the heap so as to satisfy the ∗-separated sub-formulas. Also, it ensures that this unique heap-split can be determined syntactically from the structure of those sub-formulas. In our translation, we model the heaplets associated with a formula or a term as a set of locations, and all operations on these heaplets are modeled as set operations (union, intersection, etc.) over set-of-locations variables. For example, the separating conjunction P ∗ Q is translated to the following set constraint: the intersection of the sets associated with the heaplets of P and Q is empty. Given a formula ϕ in Dryadsep and its associated heap domain modeled by a set variable G, we define an inductive translation T into a classical logic formula T(ϕ, G) in the quantifier-free theory of finite sets, integers, and uninterpreted functions. The translated

Construct                 Domain-exact                    Scope
var/const                 false                           ∅
{t} / {t}_m               dom-ext(t)                      scope(t)
t op t′                   dom-ext(t) ∨ dom-ext(t′)        scope(t) ∪ scope(t′)
f^Δ(lt)                   true                            reachset^f(lt)
true/false                false                           ∅
emp                       true                            ∅
lt ↦_{~pf,~df}(~lt, ~it)  true                            {lt}
p^Δ(lt)                   true                            reachset^p(lt)
t ∼ t′                    dom-ext(t) ∨ dom-ext(t′)        scope(t) ∪ scope(t′)
ϕ ∧ ϕ′                    dom-ext(ϕ) ∨ dom-ext(ϕ′)        scope(ϕ) ∪ scope(ϕ′)
ϕ ∗ ϕ′                    dom-ext(ϕ) ∧ dom-ext(ϕ′)        scope(ϕ) ∪ scope(ϕ′)

Table 6.1: Domain-exact property and scope function

The translated formula is not interpreted on a heaplet, but on the global heap (i.e., with heap domain Loc). The translation uses an auxiliary domain-exact property and an auxiliary scope function. The domain-exact property indicates whether a term evaluates to a well-defined value, or a positive formula evaluates to true, only on a fixed heap domain. This is different from the property pure: a pure formula or term is not domain-exact, but the reverse implication does not hold in general. For example, the formula (lt ↦ it) ∗ true is not domain-exact but is also not pure. The scope function maps a term to the minimum heap domain required to evaluate it to a normal value, and maps a positive formula to the minimum heap domain required to evaluate it to true. The domain-exact property and the scope function are defined inductively in Table 6.1. Note that both are defined only for terms and formulas without disjunction and negation. A formula is assumed in its disjunctive normal form.
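The preprocessing step of pushing ∨ above ∧ and ∗ is a standard distribution; a sketch over a tuple-encoded AST of our own design (not the tool's actual representation):

```python
# Hedged sketch of the DNF preprocessing: distributing ∧ and ∗ over ∨ until
# all disjunctions are at the top. Atoms are any tuples with other tags.

def dnf(phi):
    tag = phi[0]
    if tag == "or":
        return ("or", dnf(phi[1]), dnf(phi[2]))
    if tag in ("and", "sep"):
        a, b = dnf(phi[1]), dnf(phi[2])
        # distribute the binary connective over any top-level disjunction
        if a[0] == "or":
            return ("or", dnf((tag, a[1], b)), dnf((tag, a[2], b)))
        if b[0] == "or":
            return ("or", dnf((tag, a, b[1])), dnf((tag, a, b[2])))
        return (tag, a, b)
    return phi                          # atoms are left untouched
```

For instance, (a ∨ b) ∗ c normalizes to (a ∗ c) ∨ (b ∗ c), after which each ∗ has a syntactically determined heap split.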

6.5.2 Translating Terms and Formulas

We describe the logic translation of Dryadsep terms and formulas in Figure 6.3 and Figure 6.4, respectively. The ITE expression used in the translation is short for "if-then-else": ITE(φ, t1, t2) evaluates to t1 if φ is true, and to t2 otherwise. In general, our translation restricts an impure term/formula to be evaluated only on the syntactically determined heap domain according to the semantics of Dryadsep.

T(var / const, G)              ≡ var / const
T({t} / {t}_m, G)              ≡ {t} / {t}_m
T(t op t′, G)                  ≡ T(t, G) op T(t′, G)
T(f^Δ(lt), G)                  ≡ ITE( reach^f(lt) = G, f(lt), undef )
T(true / false, G)             ≡ true / false
T(emp, G)                      ≡ G = ∅
T(lt ↦_{~pf,~df}(~lt, ~it), G) ≡ G = {lt} ∧ ⋀_{pf_i} pf_i(T(lt, G)) = T(lt_i, G)
                                          ∧ ⋀_{df_i} df_i(T(lt, G)) = T(it_i, G)
T(p^Δ(lt), G)                  ≡ p(lt) ∧ G = reach^p(lt)
T(t ∼ t′, G)                   ≡ t ∼ t′                        if t ∼ t′ is not domain-exact
                                 t ∼ t′ ∧ G = scope(t ∼ t′)    otherwise
T(ϕ ∧ ϕ′, G)                   ≡ T(ϕ, G) ∧ T(ϕ′, G)
T(ϕ ∨ ϕ′, G)                   ≡ T(ϕ, G) ∨ T(ϕ′, G)
T(¬ϕ, G)                       ≡ ¬T(ϕ, G)

Figure 6.3: Translation of Dryadsep terms

T(ϕ ∗ ϕ′, G) ≡
    T(ϕ, scope(ϕ)) ∧ T(ϕ′, scope(ϕ′))
      ∧ scope(ϕ) ∪ scope(ϕ′) = G ∧ scope(ϕ) ∩ scope(ϕ′) = ∅
                                        if both ϕ and ϕ′ are domain-exact
    T(ϕ, scope(ϕ)) ∧ T(ϕ′, G \ scope(ϕ)) ∧ scope(ϕ) ⊆ G
                                        if only ϕ is domain-exact
    T(ϕ′, scope(ϕ′)) ∧ T(ϕ, G \ scope(ϕ′)) ∧ scope(ϕ′) ⊆ G
                                        if only ϕ′ is domain-exact
    T(ϕ, scope(ϕ)) ∧ T(ϕ′, scope(ϕ′))
      ∧ scope(ϕ) ∪ scope(ϕ′) ⊆ G ∧ scope(ϕ) ∩ scope(ϕ′) = ∅
                                        if neither ϕ nor ϕ′ is domain-exact

Figure 6.4: Translation of Dryadsep formulas

In particular, when evaluating a recursive formula or predicate p^Δ, we ensure that the heaplet is precisely the reach set reach^p(lt). For the formula emp evaluated on heaplet G, the translation asserts that G is the empty set of locations. Similarly, for the atomic formula lt ↦_{~pf,~df}(~lt, ~it), the classical logic formula asserts that G is the singleton set consisting of the location lt. For the atomic formula t ∼ t′, the translation to classical logic depends on whether this formula is domain-exact or not. If the formula is not domain-exact, it can be evaluated on any heaplet; otherwise, the heaplet is restricted to the scope of t ∼ t′. If the formula ϕ ∨ ϕ′ is evaluated on a heaplet G, then either the sub-formula ϕ is true on the heaplet G or ϕ′ is true on G. The case when the formula is a conjunction ϕ ∧ ϕ′ is handled similarly. When the formula has negation as its top-level operator, ¬ϕ, the formula is true over a heaplet G if the positive sub-formula ϕ evaluates to false over the same heaplet G. For a formula ϕ ∗ ϕ′, the translation to classical logic depends on whether the sub-formulas ϕ and ϕ′ are domain-exact or not. If a sub-formula is domain-exact, then it is evaluated on its scope; if it is not domain-exact, then it is evaluated on the rest of the heaplet. Specifically, if neither ϕ nor ϕ′ is domain-exact, the translated classical logic formula requires the scopes of ϕ and ϕ′ to be disjoint from each other such that their disjoint union is contained in the heaplet G.
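A few of the translation clauses can be sketched operationally, modeling heaplets as Python sets; the helper names are hypothetical, and only three representative cases (emp, points-to, and the first case of the ∗ rule) are shown:

```python
# Hedged sketch: operational reading of three translation clauses, with
# heaplets as Python sets. Sub-translations and scopes are passed in, since
# the full inductive definition is beyond this sketch.

def translate_emp(G):
    """T(emp, G): G must be the empty set of locations."""
    return G == set()

def translate_pointsto(loc, G):
    """T(lt ↦ ..., G): G must be the singleton {lt} (field equations elided)."""
    return G == {loc}

def translate_sep(T0, scope0, T1, scope1, G):
    """T(ϕ0 ∗ ϕ1, G) when both conjuncts are domain-exact (first case above)."""
    return (T0(scope0) and T1(scope1)
            and scope0 | scope1 == G
            and scope0 & scope1 == set())
```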

6.5.3 Translating Recursive Definitions

Recursive definitions in Dryadsep are also translated to recursive definitions in classical logic. Translating a recursive definition rec^Δ uses the corresponding definitions rec and reach^rec, both of which are defined recursively in classical logic. The set reach^rec represents the domain of the heaplet required for evaluating rec^Δ, and the Δ-eliminated definition rec captures the value of rec^Δ when the heaplet is restricted to reach^rec. Formally, suppose rec^Δ is a recursive definition w.r.t. pointer fields ~pf and stopping locations ~v; then reach^rec is recursively defined as the least fixed-point of

reach^rec(x, ~v) def= ITE( x = nil ∨ x ∈ ~v, ∅, {x} ∪ ⋃_{pf ∈ ~pf} reach^rec(pf(x), ~v) )

For each recursive predicate p^Δ defined as p^Δ(x) def= ϕ^p(x, ~v, ~s), we define

p(x, ~v) def= T( ϕ^p(x, ~v, ~s), reach^p(x, ~v) )

Similarly, for each recursive function f^Δ defined as

f^Δ(x, ~v) def= ( ϕ^f_1(x, ~v, ~s) : t^f_1(x, ~s) ; ϕ^f_2(x, ~v, ~s) : t^f_2(x, ~s) ; ... ; default : t^f_{k+1}(x, ~s) )

we define

f(x, ~v) def= ITE( T(ϕ^f_1(x, ~v, ~s), reach^f(x)), t^{f−Δ}_1(x, ~s),
               ITE( T(ϕ^f_2(x, ~v, ~s), reach^f(x)), t^{f−Δ}_2(x, ~s),
                 ... , t^{f−Δ}_{k+1}(x, ~s) ... ))

where t^{f−Δ}_i(x, ~s) is just the classical logic counterpart of t^f_i(x, ~s), interpreted on a heap domain within reach^f(x). Formally, it is short for

ITE( scope(t^f_i(x, ~s)) ⊆ reach^f(x), T(t^f_i(x, ~s), scope(t^f_i(x, ~s))), undef )

Now, for each set of recursive definitions Def^Δ in Dryadsep, we can translate it to a set of recursive definitions Def in classical logic. The semantics of the definitions in Def is the least fixed-point of the combination of the equations in each definition, similar to the semantics of Def^Δ. The following lemma shows that each definition in Def precisely interprets the heaplet semantics (defined only on the reachable locations) of the corresponding definition in Def^Δ.

Lemma 6.5.1. Let Def^Δ be a set of recursive definitions in Dryadsep, and let Def be the set of translated recursive definitions in classical logic. Let p^Δ be an arbitrary recursive predicate and f^Δ an arbitrary recursive function in Def^Δ. Then for every program state (Loc, s, h) and for all locations x and ~v: (Loc, s, h) |= p(x, ~v) if and only if (R^p, s, h|_{R^p}) |= p^Δ(x, ~v), and ⟦f(x, ~v)⟧_{(Loc,s,h)} = ⟦f^Δ(x, ~v)⟧_{(R^f,s,h|_{R^f})}, where R^p = reach^p(x, ~v) and R^f = reach^f(x, ~v).

Proof. Since the definitions in both Def and Def^Δ have least fixed-point semantics, the proof intuitively consists of two steps: first show that the semantics at the bottoms of the two lattices are equivalent; then show that the equivalence is preserved in each iteration of applying the recursive operator. For the first step, notice that for each predicate p, p(x, ~v) and p^Δ(x, ~v) are both interpreted as false (no matter which heaplet p^Δ is interpreted on). Similarly, for each function f, both f(x, ~v) and f^Δ(x, ~v) (interpreted on R^f) are defined as the bottom of the corresponding lattice. Now consider an iteration that transforms the predicate p to p′ and transforms p^Δ to p′^Δ, and assume that p and p^Δ are equivalent. Then for all locations x and ~v, (Loc, s, h) |= p′(x, ~v) iff (Loc, s, h) |= T( ϕ^p(x, ~v, x(~s)), reach^p(x, ~v) ). Moreover, (R^p, s, h|_{R^p}) |= p′^Δ(x, ~v) iff (R^p, s, h|_{R^p}) |= ϕ^p(x, ~v, x(~s)). Hence it suffices to prove inductively that (Loc, s, h) |= T( ϕ^p(x, ~v, x(~s)), reach^p(x, ~v) ) iff (R^p, s, h|_{R^p}) |= ϕ^p(x, ~v, x(~s)).

Similarly, assuming that f and f^Δ are equivalent, the same iteration transforms them to f′ and f′^Δ, which should also be equivalent. Formally, let the transformed definitions be f′^Δ(x, ~v) = DryadDef^f and f′(x, ~v) = GlobalDef^f; then the goal is to show ⟦DryadDef^f(x, ~v)⟧_{(Loc,s,h)} = ⟦GlobalDef^f(x, ~v)⟧_{(R^f,s,h|_{R^f})}.

With the assumption that p and p^Δ (f and f^Δ) are equivalent, the two goals can be proved together by induction on the structure of ϕ^p (DryadDef^f). For each construct of Dryadsep, its heaplet semantics is equivalently translated to classical logic; we omit the details and leave the reader to verify. In particular, when DryadDef^f is of the form ϕ^f_1(x, ~v, ~s) : t^f_1(x, ~s) ; ... ; default : ..., by induction, (R^f, s, h|_{R^f}) |= ϕ^f_1(x, ~v, ~s) if and only if (Loc, s, h) |= T( ϕ^f_1(x, ~v, ~s), reach^f(x) ). When this is the case,

⟦DryadDef^f(x, ~v)⟧_{(Loc,s,h)}
    = ⟦T( ϕ^f_1(x, ~v, ~s), reach^f(x) )⟧_{(Loc,s,h)}
    = ⟦ϕ^f_1(x, ~v, ~s)⟧_{(R^f,s,h|_{R^f})}
    = ⟦GlobalDef^f(x, ~v)⟧_{(R^f,s,h|_{R^f})}

Otherwise, we can similarly proceed to prove the equivalence for the second case, and so on.

We have shown that the translation of the recursive definitions is sound; it is now straightforward to show the main result of this chapter: the translation of an arbitrary Dryadsep formula to classical logic is sound.

Theorem 6.5.2. Let ϕ be a Dryadsep formula w.r.t. a set of recursive definitions Def^Δ. For every program state (Loc, s, h) and for every heaplet G ⊆ Loc \ {nil}, (Loc, s, h) |= T(ϕ, G) w.r.t. Def if and only if (G, s, h|_G) |= ϕ w.r.t. Def^Δ.

Proof. The proof is by induction on the structure of ϕ. For the basic constructs, the proof is very similar to the inductive proof in the second step of the proof of Lemma 6.5.1. We only prove the recursive predicate case here. When ϕ is a recursive predicate p^Δ(lt), T(p^Δ(lt), G) = p(lt) ∧ G = reach^p(lt), so (Loc, s, h) |= T(p^Δ(lt), G) iff (Loc, s, h) |= p(lt) and (Loc, s, h) |= G = reach^p(lt). By Lemma 6.5.1 and the induction hypothesis, this condition is equivalent to (R^p, s, h|_{R^p}) |= p^Δ(lt), where R^p = reach^p(lt) = G. In other words, (G, s, h|_G) |= p^Δ(lt).

CHAPTER 7

NATURAL PROOFS FOR STRUCTURE, DATA, AND SEPARATION

The natural proof methodology was proposed in Chapter 5, but was built exclusively for tree data-structures. In particular, that work could only handle recursive programs (i.e., no while-loops), and even for tree data-structures it imposed a large number of restrictions on pre/post conditions for methods: the input to a procedure had to be a single tree, the method could only return a single tree, and even then had to havoc the input tree given to it. The lack of handling of multiple structures means that even simple programs like mergesort (which merges two lists) cannot be handled, and simple programs that manipulate two lists or two trees cannot be reasoned with. Also, structures such as doubly-linked lists, trees with parent pointers, etc. are out of the scope of that work.

On the other hand, in Chapter 6 we carefully designed Dryadsep, a dialect of separation logic that seems an ideal logic for handling real-world programs manipulating multiple or more complex data-structures and their separation.

Therefore, the goal of this chapter is to exploit Dryadsep in verifying practical programs with natural proofs. Technically, thanks to our treatment of separation logic within classical logic, we aim at handling user-defined structures expressible in separation logic, multiple structures and their separation, programs with while-loops, etc.

We develop natural proofs for Dryadsep by showing a natural proof mechanism for the equivalent formulas in classical logic, utilizing in part the translation defined in Section 6.5. We exploit the two tactics for natural proofs set forth in Chapter 5: in the first step, we utilize the idea of unfolding across the footprint to strengthen the verification condition; in the second step, we prove the validity of the VC soundly using the technique of formula abstraction. The resulting formula falls in a logic over sets and integers, which is then

decided using the theory of uninterpreted functions and arrays using SMT solvers. The basic proof tactic that we follow depends not just on the formula embodying the verification condition, but also on the precise footprint touched by the program segment being verified. The key feature is that heaplets and separation logic constructs, which get translated to recursively defined sets of locations, are unfolded along with other user-defined recursive definitions and formula-abstracted using this uniform natural proof strategy. While our proof strategy is roughly as above, there are many complex technical details. For example, the heaplets defined by pre/post conditions intrinsically specify the modified locations of the heap, which have to be considered when processing procedure calls in order to determine which recursively defined metrics on locations continue to hold after a procedure call. Also, the final decidable theories that we compile our conditions down to do require a bit of quantification, but it turns out to be in the array property fragment, which admits automatic decision procedures. Our proof mechanisms are essentially a class of decidable proof tactics that result in sound but incomplete validation procedures. To show that this class of natural proofs is effective in practice, we provide a prototype implementation of our technique, which handles a low-level programming language with pre-conditions and post-conditions written in Dryadsep. We show, using a large class of correct programs manipulating lists, trees, cyclic lists, and doubly linked lists as well as multiple data-structures of these kinds, that the natural proof mechanism succeeds in proving the verification conditions automatically. These programs are drawn from a range of sources, from textbook data-structure routines (binary search trees, red-black trees, etc.)
to routines from Glib low-level C-routines used in GTK+/Gnome to implement file-systems, from the Schorr-Waite garbage collection algorithm, to several programs from a recent secure framework developed for mobile applications [54]. Our work is by far the only one that we know of that can handle such a large class of programs completely automatically. Our experience has been that the user-provided contracts and invariants are easily expressible in Dryadsep, and the automatic natural proof mechanisms work extremely fast. In fact, contrary to our own expectations, we also found that the tool is useful in debugging: in several cases, when the annotations supplied were incorrect, the model provided by the SMT solver for the natural proof was useful in detecting errors and correcting the invariants/program.

P :− P ; P | stmt
stmt :− u := v | u := nil | u := v.pf | u.pf := v | j := u.df | u.df := j | j := aexpr | u := new | free u | assume bexpr | u := f(~v, ~z) | j := g(~v, ~z)
aexpr :− int | j | aexpr + aexpr | aexpr − aexpr
bexpr :− u = v | u = nil | aexpr ≤ aexpr | ¬bexpr | bexpr ∨ bexpr

Figure 7.1: Syntax of programs

Organization: Section 7.1 defines a simple programming language and the corresponding Hoare-triples with respect to Dryadsep. Section 7.2 shows that the VC for a Hoare-triple can be captured by a Dryadsep formula. Sections 7.3 and 7.4 formally present the natural proofs for Dryadsep in two steps: unfolding across the footprint and formula abstraction, respectively. After that, we give some intuition to the reader through a case study of verifying the heapify routine for max-heaps in Section 7.5. Section 7.6 evaluates natural proofs for Dryadsep through a wide variety of open-source programs, and Section 7.7 compares our work with other state-of-the-art techniques in verification.

7.1 Programs and Hoare-triples

We consider straight-line program segments that do destructive pointer-updates, data-updates and procedure calls. Parameterized by a set of pointer fields PF and a set of data-fields DF, the syntax of the programs is defined in Figure 7.1, where pf ∈ PF, df ∈ DF, u and v are program variables of type location, j and z are program variables of type integer, and int is an integer constant. To simplify the presentation, we assume all program variables are local and are either pre-assigned or assigned only once in the program. We allow two kinds of recursive procedures, one returning a location f(~v, ~z) and one returning an integer g(~v, ~z). Each procedure/program is annotated with its pre- and post-conditions in Dryad. The pre-condition is denoted as

a formula ψpre(~v, ~z, ~c), where ~v and ~z are variables serving as the formal parameters/program variables, and ~c is a set of implicitly existentially quantified complementary variables (e.g., variable K in the pre-condition ϕpre in Figure 7.2).

The post-condition is denoted as a formula ψpost(ret, ~v, ~z, ~c), where ret is the variable representing the returned value, of the corresponding type, ~v and ~z are program variables, and ~c is the set of complementary variables that appeared in the pre-condition ψpre. Now consider a Hoare-triple, i.e., a straight-line program with its pre- and post-conditions: {ψpre} P {ψpost}. We define its partial correctness without considering memory errors1: P is partially correct iff for every normal execution (memory-error free) of P, which transits state C to state C′, if C |= ψpre, then C′ |= ψpost.

We are given a Hoare-triple {ψpre} P {ψpost} as defined above, a set of recursive definitions, and a set of annotated procedure declarations. Assume that P consists of n statements, and consider a normal execution E, which can be represented as a sequence of program states (C0, ..., Cn), where each Ci = (Ri, si, hi) represents the program state after executing the first i statements. The verification condition is just a formula interpreted on a state sequence (C0, ..., Cn). Let pfi : Loc → Loc be the function mapping every location l to its pf pointer, i.e., pfi(l) = hi(l, pf) for every location l. Similarly, dfi : Loc → Int is defined such that dfi(l) = hi(l, df) for every l. Recall that every program variable is either pre-assigned or assigned only once in the program; hence each si is an extension of the previous one, and sn is the store with all program variables defined. We therefore simply use v to denote sn(v). Moreover, every recursive predicate/function is also indexed by i. For example, pi is the recursive predicate such that pi(l) is true iff Ci |= T(p∆(l), reachsetp(l)). Now for every formula ϕ and every index i, we can give the index i to all the pointer fields, data fields and recursive definitions. We denote the indexed formula as ϕ[i].

1We exclude memory errors in order to simplify the presentation. Memory errors can be handled using a similar VC generation for assertions that negate the conditions for memory errors to occur.

7.2 Generating the Verification Condition

We now algorithmically derive the verification condition ψVC corresponding to the Hoare-triple, in classical logic with recursive definitions on the global heap. Assume there are m procedure calls in P; then P can be divided into m + 1 basic segments (subprograms without procedure calls):

S0 ; g1 ; S1 ; ... ; gm ; Sm

where Sd is the (d + 1)-th basic segment and gd is the d-th procedure call.

For each d ∈ [m], let the d-th procedure call in P be the td-th statement

(we also extend the index d to −1, 0 and m + 1 such that t−1 = t0 = 0 and tm+1 = n + 1). Note that E requires that a portion of the state Ctd−1 satisfies the precondition of the call, and a portion of the state Ctd satisfies the postcondition of the call. We denote the two required portions Ctd−1|Calld and Ctd|Returnd, respectively, where Calld ⊆ Rtd−1 and Returnd ⊆ Rtd are two sets of records. Let LVars be the set of all location variables appearing in P. We call a location variable v dereferenced if v appears on the left-hand side of a dereferencing operator "." in P. We call a location variable v modified if v appears in a statement of the form v.pf := u or v.df := j in P. Then we can extract the set of dereferenced variables Deref and the set of modified variables Mod. Note that a modified variable is always dereferenced, i.e., Mod ⊆ Deref. For each basic segment Sd, let the dereferenced and modified variables within the segment be Dereftd and Modtd, respectively. For the d-th procedure call, let the pre- and post-condition associated with the procedure be ψdpre(~v, ~z, ~c) and ψdpost(ret, ~v, ~z, ~c), respectively. Since E is a normal execution, we have Ctd−1 |= T(ψdpre(~vd, ~zd, ~cd), Calld) and Ctd |= T(ψdpost(u, ~vd, ~zd, ~cd), Returnd) (assume the procedure call returns a location to u), where ~vd and ~zd are the actual parameters of the procedure call and ~cd are the complementary variables with fresh names. Now we are ready to define the verification condition corresponding to P. We first derive a formula expressing that E does not involve any null pointer dereference:

NoNullDereference ≡ ⋀v∈Deref (v ≠ nil)

For each i ∈ [n], we show the effect of the i-th statement on the generated verification condition as follows.

[ u := v ]

ϕi ≡ u = v ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

[ u := nil ]

ϕi ≡ u = nil ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

[ u := v.pf ]

ϕi ≡ v ∈ Ri−1 ∧ u = pfi−1(v) ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

[ u.pf := v ]

ϕi ≡ u ∈ Ri−1 ∧ pfi = pfi−1{v ← u} ∧ Ri = Ri−1 ∧ FieldsUnmod((PF \ {pf}) ∪ DF, i, i − 1)

[ j := u.df ]

ϕi ≡ u ∈ Ri−1 ∧ j = dfi−1(u) ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

[ u.df := j ]

ϕi ≡ u ∈ Ri−1 ∧ dfi = dfi−1{j ← u} ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ (DF \ {df}), i, i − 1)

[ j := aexpr ]

ϕi ≡ j = aexpr ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

[ u := new ]

ϕi ≡ newi ≠ nil ∧ u = newi ∧ newi ∉ Ri−1 ∧ Ri = Ri−1 ∪ {newi}
∧ ⋀pf∈PF (pfi = pfi−1{nil ← newi}) ∧ ⋀df∈DF (dfi = dfi−1{0 ← newi})

[ free u ]

ϕi ≡ u ∈ Ri−1 ∧ Ri = Ri−1 \{u} ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

[ assume bexpr ]

ϕi ≡ bexpr ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

[ u := f(~v, ~z)]

ϕi ≡ T(ψdpre(~v, ~z, ~cd), Calld)[i − 1] ∧ T(ψdpost(u, ~v, ~z, ~cd), Returnd)[i]
∧ (Ri−1 \ Calld) ∩ Returnd = ∅ ∧ Ri = (Ri−1 \ Calld) ∪ Returnd

where d is the index such that td = i

[ j := g(~v, ~z)]

ϕi ≡ T(ψdpre(~v, ~z, ~cd), Calld)[i − 1] ∧ T(ψdpost(j, ~v, ~z, ~cd), Returnd)[i]
∧ (Ri−1 \ Calld) ∩ Returnd = ∅ ∧ Ri = (Ri−1 \ Calld) ∪ Returnd

where d is the index such that td = i

where FieldsUnmod(F, i, j) is short for ⋀field∈F (fieldi = fieldj). As shown above, each statement's strongest post-condition is captured in the logic, and for procedure calls, the heaplet manipulated by the procedure is carefully taken into account to update the heap at the caller. The conjunction of these formulas captures the modification made in E:

Modification ≡ ⋀i∈[n] ϕi
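As an illustration (not the dissertation's implementation), the per-statement rules above can be sketched as a function that emits the conjunct ϕi for a given statement. The tuple encoding of statements, the fields left/right/key, and the helper names are our own hypothetical choices:

```python
# Minimal sketch of per-statement VC conjunct generation (Section 7.2).
# Statements are tuples; the timestamp i indexes fields and the heap domain R.
PF = ["left", "right"]   # pointer fields (hypothetical)
DF = ["key"]             # data fields (hypothetical)

def fields_unmod(fields, i, j):
    # FieldsUnmod(F, i, j): every field in F is the same at timestamps i and j
    return " ∧ ".join(f"{f}{i} = {f}{j}" for f in fields)

def vc_conjunct(stmt, i):
    """Return the conjunct phi_i for the i-th statement."""
    kind = stmt[0]
    keep_heap = f"R{i} = R{i-1}"
    if kind == "assign":            # u := v
        _, u, v = stmt
        return f"{u} = {v} ∧ {keep_heap} ∧ " + fields_unmod(PF + DF, i, i - 1)
    if kind == "load":              # u := v.pf  (v must be allocated)
        _, u, v, pf = stmt
        return (f"{v} ∈ R{i-1} ∧ {u} = {pf}{i-1}({v}) ∧ {keep_heap} ∧ "
                + fields_unmod(PF + DF, i, i - 1))
    if kind == "store":             # u.pf := v  (only the field pf changes)
        _, u, pf, v = stmt
        unmod = [f for f in PF if f != pf] + DF
        return (f"{u} ∈ R{i-1} ∧ {pf}{i} = {pf}{i-1}{{{v} ← {u}}} ∧ "
                f"{keep_heap} ∧ " + fields_unmod(unmod, i, i - 1))
    raise ValueError(kind)

print(vc_conjunct(("store", "x", "left", "y"), 2))
```

Note how the store case excludes the mutated field from FieldsUnmod, mirroring the rule for u.pf := v above.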

Finally, we can define two formulas to capture the pre- and post-conditions:

Pre ≡ T (ψpre,R0)[0]

Post ≡ T (ψpost,Rn)[n]

Now the validity of {ψpre} P {ψpost} can be captured by the following for- mula:

ψVC ≡ (Pre ∧ NoNullDereference ∧ Modification) → Post

Theorem 7.2.1. Given a Hoare-triple {ψpre} P {ψpost}, assume that each procedure call in P satisfies its associated pre- and post-conditions. Then the triple is valid if the formula ψVC derived above is valid. Moreover, when P contains no procedure calls, the triple is valid iff ψVC is valid.

Proof. We prove the soundness by contradiction. Assume the Hoare-triple {ψpre} P {ψpost} is not valid. Assume P consists of n statements; then there is an execution E, which can be represented as a state sequence (C0, ..., Cn) where each Ci = (Ri, si, hi), such that (C0, R0) satisfies ψpre[0], (Cn, Rn) does not satisfy ψpost[n], and the whole execution is memory-error free. Then by the definitions of Pre, Post and NoNullDereference, and the definition of ψVC, we have E |= Pre ∧ NoNullDereference ∧ ¬Post; hence it suffices to show that E |= Modification, in which case E falsifies ψVC. This contradiction will conclude the proof.

Since Modification ≡ ⋀i∈[n] ϕi, we just need to prove E |= ϕi for each i ∈ [n], by case analysis on the type of the i-th statement in P:

[ u := v ]

ϕi ≡ u = v ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

The variable assignment makes u point to where v points to; hence u = v. Since the heap is unmodified from Ci−1 to Ci, the heap domain remains the same (Ri = Ri−1), and all the field functions remain the same (FieldsUnmod(PF ∪ DF, i, i − 1)).

[ u := nil ]

ϕi ≡ u = nil ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

The variable assignment makes u point to nil, so u = nil. Similar to the above case, the heap is unmodified from Ci−1 to Ci.

[ u := v.pf ]

ϕi ≡ v ∈ Ri−1 ∧ u = pfi−1(v) ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

The dereferencing on v implies that v points to a valid location at timestamp i − 1, i.e., v ∈ Ri−1. Moreover, the assignment makes u point to the pf field of v at timestamp i − 1, formally u = pfi−1(v). Similar to the above cases, the heap is unmodified from Ci−1 to Ci.

[ u.pf := v ]

ϕi ≡ u ∈ Ri−1 ∧ pfi = pfi−1{v ← u} ∧ Ri = Ri−1 ∧ FieldsUnmod((PF \ {pf}) ∪ DF, i, i − 1)

Similar to the above case, u points to a valid location at timestamp i − 1 (u ∈ Ri−1). The mutation makes the pf field at timestamp i an update of that at timestamp i − 1: pfi = pfi−1{v ← u}. Moreover, the heap domain is unmodified, so Ri = Ri−1. The other field functions also remain the same, which is captured by FieldsUnmod((PF \ {pf}) ∪ DF, i, i − 1).

[ j := u.df ]

ϕi ≡ u ∈ Ri−1 ∧ j = dfi−1(u) ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

Similar to the u := v.pf case.

[ u.df := j ]

ϕi ≡ u ∈ Ri−1 ∧ dfi = dfi−1{j ← u} ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ (DF \ {df}), i, i − 1)

Similar to the u.pf := v case.

[ j := aexpr ]

ϕi ≡ j = aexpr ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

The statement assigns the value of aexpr, which is expressible in our logic, to j. Hence j = aexpr. The rest is similar to other variable assignment cases.

[ u := new ]

ϕi ≡ newi ≠ nil ∧ u = newi ∧ newi ∉ Ri−1 ∧ Ri = Ri−1 ∪ {newi}
∧ ⋀pf∈PF (pfi = pfi−1{nil ← newi}) ∧ ⋀df∈DF (dfi = dfi−1{0 ← newi})

This statement makes u point to a freshly allocated location, namely newi in E. So it is clear that newi ≠ nil ∧ u = newi. Since the heap domain at timestamp i is an extension of that at timestamp i − 1 obtained by adding newi, we know that newi ∉ Ri−1 ∧ Ri = Ri−1 ∪ {newi}. By default, each pointer field of newi initially points to nil and each data field initially stores 0; the remaining portion of the heap is exactly the same as in Ci−1. Hence

⋀pf∈PF (pfi = pfi−1{nil ← newi}) ∧ ⋀df∈DF (dfi = dfi−1{0 ← newi})

[ free u ]

ϕi ≡ u ∈ Ri−1 ∧ Ri = Ri−1 \{u} ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

This statement removes the location pointed to by u from the heap. So the old heap contains this location, and the new heap can be obtained by subtracting it from the old heap: u ∈ Ri−1 ∧ Ri = Ri−1 \ {u}. Since the domain shrinks, the field functions can simply remain unchanged.

[ assume bexpr ]

ϕi ≡ bexpr ∧ Ri = Ri−1 ∧ FieldsUnmod(PF ∪ DF, i, i − 1)

The assumed condition bexpr, which can be expressed in our logic, must be true. The heap is simply unmodified.

[ u := f(~v, ~z)]

ϕi ≡ T(ψdpre(~v, ~z, ~cd), Calld)[i − 1] ∧ T(ψdpost(u, ~v, ~z, ~cd), Returnd)[i]
∧ (Ri−1 \ Calld) ∩ Returnd = ∅ ∧ Ri = (Ri−1 \ Calld) ∪ Returnd

where d is the index such that td = i

As assumed, this call is the d-th procedure call in P; Ci−1 satisfies the associated precondition on the heaplet defined by Calld, and Ci satisfies the associated postcondition on the heaplet defined by Returnd. Then formally,

T(ψdpre(~v, ~z, ~cd), Calld)[i − 1] ∧ T(ψdpost(u, ~v, ~z, ~cd), Returnd)[i]

is satisfied by E. Due to the framing property of the separation semantics, the portion of Ci−1 that is not required by ψdpre remains unchanged, and is disjoint from Returnd (since the returned location is assigned to u, the variable ret can be replaced with u). This property can be expressed as

(Ri−1 \ Calld) ∩ Returnd = ∅ ∧ Ri = (Ri−1 \ Calld) ∪ Returnd

[ j := g(~v, ~z)]

ϕi is defined in the same way as in the above case, except that u is replaced with j. The proof is also similar to the case of [ u := f(~v, ~z) ].

7.3 Unfolding Across the Footprint

The verification condition obtained above is a quantifier-free formula involving recursive definitions and reachable sets of the form reachp(x), which are also defined recursively. While these recursive definitions can be unfolded ad infinitum, we exploit a proof tactic called unfolding across the footprint. Intuitively, the footprint is the set of locations explored by the program explicitly (not including procedure calls). More precisely, a location is in the footprint if it is dereferenced explicitly in the program. The idea is to unfold the recursive definitions over the footprint of the program, so that recursive definitions on the footprint nodes are related, as precisely as possible, to the recursive definitions on frontier nodes. This will enable effective use of the formula abstraction mechanism: when recursive definitions on frontier nodes are made uninterpreted, the unfolding formulas ensure tight conditions that the frontier nodes have to satisfy. Furthermore, to enable effective frame reasoning, it is also necessary to strengthen the verification condition with a set of instances of the frame rule. More concretely, we need to capture the fact that a recursive definition (or a field) on a location is unchanged during a segment or procedure call of the program, if the reachable locations (or only the location itself) are not affected by the segment or procedure call. We incorporate the above facts formally into the verification condition. Let us introduce a macro function fp that identifies the location variables that are in (or aliased to something in) the footprint. The footprint of P, FP, is the set of dereferenced variables in P (recall that a location variable is dereferenced if it appears on the left-hand side of a dereferencing operator "." in P). Then fp(u) ≡ ⋁v∈FP (u = v). Now we state the unfoldings and framings using a formula; we denote it as UnfoldAndFrame.
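To make the footprint concrete, here is a small sketch (with our own hypothetical statement encoding, not the tool's) of extracting Deref, Mod, and the fp(u) disjunction from a program:

```python
# Sketch of footprint extraction (Section 7.3): Deref collects variables
# dereferenced with ".", Mod collects variables whose fields are mutated.

def footprint(stmts):
    deref, mod = set(), set()
    for s in stmts:
        if s[0] in ("load", "store"):        # u := v.pf  /  u.pf := v
            # the variable on the left of "." is the dereferenced one
            v = s[2] if s[0] == "load" else s[1]
            deref.add(v)
            if s[0] == "store":
                mod.add(v)                   # a store also modifies it
    assert mod <= deref                      # Mod ⊆ Deref always holds
    return deref, mod

def fp(u, deref):
    """fp(u) as a disjunction over the footprint variables."""
    return " ∨ ".join(f"{u} = {v}" for v in sorted(deref))

prog = [("load", "lx", "x", "left"), ("store", "s", "key", "t")]
deref, mod = footprint(prog)
print(fp("u", deref))
```

Any location variable not in Deref is a frontier variable, for which only the frame assertions below apply.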
Assume there are m procedure calls in P , then P can be divided into m + 1 basic segments (subprograms without procedure calls): S0 ; g1 ; S1 ; ... ; gm ; Sm where Sd is the (d + 1)-th basic segment and gd is the d-th procedure call. Then

UnfoldAndFrame ≡ ⋀rec ⋀0≤d≤m ⋀u∈LVars∪{nil} (
((fp(u) ∨ u = nil) ⇒ Unfoldrec_d(u) ∧ FieldUnchangedd(u))
∧ ((¬fp(u) ∨ u = nil) ⇒ RecUnchangedrec_d(u)) )

The formula enumerates every recursive definition rec and every index d, and for each location u that is either pointed to by a location variable or is nil, it checks whether u is in the footprint, and then unfolds or frames it accordingly. If u is in the footprint, then we unfold rec at the timestamps before and after Sd (represented by the formula Unfoldrec_d(u)); moreover, all fields of u are unchanged if u is not affected during the call to gd (represented by the formula FieldUnchangedd(u)). If u is not in the footprint, i.e., it is in the frontier, then rec and its corresponding reach set reachrec are unchanged after executing Sd, provided Sd does not modify any location in reachrec; they are also unchanged if reachrec is not affected by calling gd. These frame assertions are represented by the formula RecUnchangedrec_d(u). Now we formulate the subformulas mentioned above. Let u be a location variable in LVars ∪ {nil} and let i be a timestamp such that 1 ≤ i ≤ n. For each recursive definition rec∆ whose ∆-eliminated version is defined as rec(x) =def defrec(x, ~t, ~v) and whose reach set is defined as reachrec(x) =def reachdefrec(x), we can derive a formula UnfoldAtrec(i, u) for unfolding both rec∆ and its corresponding reach set on u at timestamp i, provided that u is allocated at the current timestamp (u ∈ Ri). Note that in defrec(x, ~t, ~v), x will be renamed as u, and ~t will not be renamed as its components are program variables, but the variables in ~v are existentially quantified and should be replaced with fresh names. Due to the restrictions on the recursive definitions, every v is unique and can be determined by dereferencing u on the corresponding pointer field, say pfrec,v. Hence we can replace each v in ~v distinctly with a fresh variable u_rec_v_i. Letting the renamed formula be defrec(u, ~t, ~vfresh), we can derive

UnfoldAtrec(i, u) ≡ reachrec_i(u) = reachdefrec_i(u) ∧
(u ∈ Ri → ((reci(u) ↔ defrec_i(u, ~t, ~vfresh)) ∧ ⋀v∈~v (pfrec,v_i(u) = u_rec_v_i)))

Now the footprint unfolding just unfolds u at the beginning and end of each program segment (for the d-th segment, at timestamps td and td+1 − 1, respectively):

Unfoldrec_d(u) ≡ UnfoldAtrec(td, u) ∧ UnfoldAtrec(td+1 − 1, u)

In UnfoldAndFrame, we simply take the conjunction of all these unfoldings, for each dereferenced variable, for the timestamps at the beginning and the end of the program, as well as before/after each procedure call. As a special location, nil is also unfolded with respect to each recursive definition and each timestamp.

The formula FieldUnchangedd(u) states that, across the d-th procedure call, if the location u is not nil, then for each field pf (or df), pftd−1(u) and pftd(u) are the same, provided u itself is not affected during the call:

FieldUnchangedd(u) ≡ (u ≠ nil ∧ u ∉ Calld) →
(⋀pf∈PF (pftd−1(u) = pftd(u)) ∧ ⋀df∈DF (dftd−1(u) = dftd(u)))

Finally, to define RecUnchangedrec_d(u), we first define a formula expressing that a recursive definition and its corresponding reach set on a location are unchanged between two timestamps:

UnchangedBetweenrec(u, i, i′) ≡ reci(u) = reci′(u) ∧ reachrec_i(u) = reachrec_i′(u)

For each non-footprint location variable u and for each recursive predicate rec∆, the formula RecUnchangedrec_d just captures the fact that rec(u) and reachrec(u) are unchanged in two cases:

1. in the d-th segment of the program (between timestamps td and td+1 − 1), they are unchanged if the reach set is not modified; or

2. in the d-th procedure call (between timestamps td − 1 and td), they are unchanged if the reach set is not affected during the call.

Moreover, it also incorporates the fact that the reach set on u contains u itself. Formally,

RecUnchangedrec_d(u) ≡
((reachrec_td(u) ∩ Modtd = ∅) → UnchangedBetweenrec(u, td, td+1 − 1))
∧ ((reachrec_td(u) ∩ Calld = ∅) → UnchangedBetweenrec(u, td − 1, td))
∧ (u ≠ nil → u ∈ reachrec_td(u) ∩ reachrec_td+1−1(u))

For each variable u that is not in the footprint, and for the special location nil, we apply RecUnchangedrec_d(u). Now we can strengthen the verification condition by incorporating the derived formula above:

ψ′VC ≡ ψVC ∧ UnfoldAndFrame

Since the incorporated formula is implied by the verification condition, we can reduce the validity of ψVC to the validity of ψ′VC.

Theorem 7.3.1. Given a Hoare-triple {ψpre} P {ψpost}, its verification condition ψVC is valid if and only if ψ′VC is valid.

Proof. Since ψ′VC strengthens ψVC with a set of conjuncts represented as UnfoldAndFrame, it suffices to show that UnfoldAndFrame is a tautology for an arbitrary execution of the program. In other words, for every recursive definition rec, for every index d, and for every location u, we need to show:

• (fp(u) ∨ u = nil) ⇒ Unfoldrec_d(u) ∧ FieldUnchangedd(u)

• (¬fp(u) ∨ u = nil) ⇒ RecUnchangedrec_d(u)

For the first assertion, if u is dereferenced or is nil, then the unfolding formula Unfoldrec_d(u) is true, as every recd and its corresponding reachsetrec_d are fixpoints; and the formula FieldUnchangedd(u) is true, as no field is modified if u is not affected during the call. For the second assertion, if u is not dereferenced (including the nil case), then the formula RecUnchangedrec_d(u) is true, as it just enumerates the possible cases in which recd and reachsetrec_d are unchanged.

7.4 Formula Abstraction

While checking the validity of the strengthened verification condition ψ′VC is still undecidable, as we argued before, it is often sufficient to prove it by assuming that the recursive definitions are arbitrary, or uninterpreted. Moreover, the uninterpreted formula falls in the array property fragment [19], whose satisfiability is decidable and is supported by modern SMT solvers such as Z3 [28]. This tactic roughly corresponds to applying unification in proof systems. To prove ψ′VC, we first replace each recursive predicate recd with an uninterpreted predicate recˆd, and the corresponding reach-set function reachrec_d with an uninterpreted function reachˆrec_d. Let the resulting formula be ψabsVC. This conversion, called formula abstraction, is sound:

Proposition 7.4.1. If ψabsVC is valid, so is ψ′VC.

Proof. We prove by contradiction. Assume that ψabsVC is valid but ψ′VC is not; then there is a model M falsifying ψ′VC. We can construct a similar model M′ that consists of the same elements as M. For each recursive definition recd, M′ interprets the uninterpreted recˆd and reachˆrec_d using the interpretations of recd and reachrec_d in M. Then M′ falsifies ψabsVC, which contradicts the assumption.

When a proof for ψabsVC is found, we call it a natural proof for ψ′VC (and also for ψVC).
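The abstraction step itself is purely syntactic. A toy sketch (using a hypothetical textual representation of formulas, with a leading underscore marking the fresh uninterpreted symbols) illustrates the renaming:

```python
# Sketch of formula abstraction: occurrences of the timestamped recursive
# definitions rec_i(...) and reach-set functions reachrec_i(...) are renamed
# to fresh uninterpreted symbols, so an SMT solver treats them as arbitrary.
import re

def abstract(vc, rec_names):
    for r in rec_names:
        # e.g. mheap2(x) -> _mheap2(x), reachmheap2(x) -> _reachmheap2(x)
        vc = re.sub(rf"\b(reach)?{r}(\d+)\(", rf"_\g<1>{r}\g<2>(", vc)
    return vc

phi = "mheap2(x) ∧ reachmheap2(x) = G"
print(abstract(phi, ["mheap"]))
```

The abstracted symbols are then given to the solver with no defining axioms, which is exactly why the resulting check is sound but incomplete.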

Remark: Obviously the natural proof scheme is incomplete, since the formula abstraction throws away the semantics of the recursive definitions and treats them as arbitrary. When the natural proof fails, there exists some spurious counterexample that satisfies ψ′VC but not ψabsVC, as the counterexample interprets some recursive definition incorrectly. The incorrect interpretation may fall into one of the two following cases. In the first case, the interpretation is not a fixed point of the propagation function with respect to the recursive definitions. In this case, the interpretation must be incorrect on some non-footprint locations, as we have precisely unfolded the recursive definitions on the footprint. Imprecision on non-footprint locations is reasonably expected, and precision there is not guaranteed by the natural proof scheme. The second case is more subtle: the interpretation is a non-least fixed point. In this case, the interpretation may be incorrect even if all locations are in the footprint. While we do not have any specific mechanism to avoid this situation, in practice the user may exclude this case by giving recursive definitions such that the fixed point is unique. First, it is noteworthy that we can define a recursive set-of-locations as an overapproximation of the reach set; moreover, using this recursive function we can precisely describe whether a location is within a cycle or not. Secondly, typical recursive definitions recursively reduce the definition on a location to that of its neighbors, until nil is reached. For these definitions, we can modify them so that the definition unfolds recursively on a location l only if l is not in a cycle (which is expressible in Dryad), and a default value is given otherwise. It is not hard to prove inductively that the fixed point is unique for the modified definitions.

7.4.1 Deciding the Abstracted Formula

The formula abstraction step is the only step that introduces incompleteness in our framework, but it helps us transform the verification condition into a decidable theory. Formula abstraction (combined with unfolding recursive definitions across the footprint) discovers recursive proofs where the recursion is structural recursion on the definitions of the data-structures. The use of these tactics comes from the observation that such programs often have such recursive proofs (see [72] also for uses of formula abstraction). Our goal now is to check the satisfiability of ¬ψabsVC in a decidable theory. Note that ¬ψabsVC is mostly expressible in the quantifier-free theory of arrays,

maps, uninterpreted functions, and integers: Loc can be viewed as an uninterpreted sort; each pointer field pf can be viewed as an array with both indices and elements of sort Loc; each data field df can be viewed as an array with indices of sort Loc and elements of sort Int; each integer set (or multiset) variable S can be viewed as an array with indices of sort Int and elements of sort Bool (or Int). Moreover, each array update operation of the form array{elem ← key} can be viewed as a read-over-write operation in the array property fragment, and each set-operation (union, intersection, etc.) can be viewed as a mapping function applying a Boolean operation (∧, ∨, etc.) to the range of arrays. The only constructs in ¬ψabsVC that escape the quantifier-free formulation are the < and ≤ relations between integer sets/multisets; but these can be encoded in the Array Property Fragment (APF), which is decidable [19]. We explain the encoding as follows.

For each atomic formula of the form S1 < S2, if S1 and S2 are sets of integers, we can replace the formula with a universally quantified formula as follows:

∀i1, i2. i1 ≤ i2 → (¬S2[i1] ∨ ¬S1[i2])

Similarly, if S1 and S2 are integer multisets, we can replace the formula with

∀i1, i2. i1 ≤ i2 → (S2[i1] = 0 ∨ S1[i2] = 0)

∀i. (S1[i] → i ≤ k) ∧ (S2[i] → k ≤ i)

where k is an additional existential integer variable, serving as the pivot for splitting S1 and S2. Similarly, when S1 and S2 are integer multisets, the formula is translated to

∀i. (S1[i] > 0 → i ≤ k) ∧ (S2[i] > 0 → k ≤ i)

the violation of the inequality. For instance, S1 ≮ S2 can be expressed as k1 ∈ S1 ∧ k2 ∈ S2 ∧ k2 ≤ k1. We hence obtain a formula ψAPF in the array property fragment combined with the theory of uninterpreted functions, maps, and arithmetic.
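The set encodings above can be sanity-checked over a finite domain. The following sketch (an illustration only, with a bounded domain standing in for the quantified integers) compares the quantified encoding of S1 < S2 with its intended meaning:

```python
# Finite-domain check of the APF encoding of S1 < S2 over integer sets:
# ∀ i1, i2. i1 ≤ i2 → (¬S2[i1] ∨ ¬S1[i2]).

def lt_encoding(s1, s2, domain):
    # the quantified encoding, with quantifiers replaced by finite iteration
    return all(not (i1 in s2 and i2 in s1)
               for i1 in domain for i2 in domain if i1 <= i2)

def lt_direct(s1, s2):
    # intended meaning: every element of S1 lies strictly below every element of S2
    return all(a < b for a in s1 for b in s2)

dom = range(10)
for s1, s2 in [({1, 2}, {5, 7}), ({1, 5}, {5, 7}), (set(), {3})]:
    assert lt_encoding(s1, s2, dom) == lt_direct(s1, s2)
```

The second test pair shows that a shared element falsifies both sides, confirming that the encoding captures the strict order.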

Theorem 7.4.2. Given a Hoare-triple {ψpre} P {ψpost}, if the derived array property formula ψAPF is unsatisfiable, then the Hoare-triple is valid.

Proof. By Theorems 7.2.1 and 7.3.1, the Hoare-triple is valid if ψ′VC is valid, so it suffices to show the validity of ψ′VC. Moreover, by Proposition 7.4.1, the formula abstraction is sound, i.e., if ψabsVC is valid, so is ψ′VC. Hence, it suffices to show that ψabsVC is valid, or equivalently that its negation is unsatisfiable. Now the only difference between ¬ψabsVC and ψAPF is the encoding of the order relations between integer sets/multisets into APF, which is precise. The reader can easily verify that ¬ψabsVC and ψAPF are equisatisfiable, which concludes the proof.

7.4.2 User-Provided Axioms

While natural proofs are often effective in finding recursive proofs that unfold recursive definitions and perform unification, they are not geared towards finding relationships between the various recursive definitions themselves. We hence require the user to provide certain obvious relationships between the different recursive definitions as axioms. For example, lseg(x, y) ∗ list(y) ⇒ list(x) is such an axiom, saying that a list segment concatenated with a list yields a list. Note that these axioms are not program-dependent, and hence are not program-specific tactics that the user provides. These axioms are typically necessary to relate partial data-structure properties (like list segments) to complete ones (like lists), especially in iterative programs (as opposed to recursive ones), and we can fix them for each class of data-structures. We also allow the use of the separating implication, −∗, from separation logic while specifying these axioms. User-defined axioms are instantiated, following the natural proof philosophy, precisely on the footprint nodes uniformly, and get translated to quantifier-free formulas.
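To give a feel for such an axiom, the following sketch (a hypothetical Python model of a next-pointer heap as a dictionary, unrelated to the prover's input format) checks lseg(x, y) ∗ list(y) ⇒ list(x) on concrete heaps:

```python
# Concrete illustration of the axiom lseg(x, y) ∗ list(y) ⇒ list(x)
# on a singly-linked heap modeled as a dict from locations to successors.

def lseg(heap, x, y):
    """True iff following next-pointers from x reaches y without looping."""
    seen = set()
    while x != y:
        if x is None or x in seen:   # fell off the heap or entered a cycle
            return False
        seen.add(x)
        x = heap[x]
    return True

def is_list(heap, x):
    # a (nil-terminated) list is just a list segment ending at nil
    return lseg(heap, x, None)

heap = {"a": "b", "b": "c", "c": None}
# lseg(a, c) and list(c) hold on this heap, and indeed list(a) holds too
assert lseg(heap, "a", "c") and is_list(heap, "c") and is_list(heap, "a")
```

The assertion at the end is exactly an instance of the axiom; the prover uses such axioms symbolically rather than by execution.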

void heapify(loc x) {
  if (x.left = nil)
    s := x.right;
  else if (x.right = nil)
    s := x.left;
  else {
    lx := x.left;
    rx := x.right;
    if (lx.key < rx.key)
      s := x.right;
    else
      s := x.left;
  }
  if (s ≠ nil)
    if (s.key > x.key) {
      t := s.key;
      s.key := x.key;
      x.key := t;
      heapify(s);
    }
}

A basic path through heapify (subscripts denote timestamps):

assume x.left0 ≠ nil
assume x.right0 ≠ nil
lx := x.left0
rx := x.right0
assume lx.key0 < rx.key0
s := x.right0
assume s ≠ nil
assume s.key0 > x.key0
t := s.key0
s.key1 := x.key0
x.key2 := t
heapify(s)

Recursive definitions:

mheap_pf^∆(x)  def=  (x = nil ∧ emp)
  ∨ ( x ↦^{key,left,right} (k, l, r)
      ∗ (mheap_pf^∆(l) ∧ {k} ≥ keys_pf^∆(l))
      ∗ (mheap_pf^∆(r) ∧ {k} ≥ keys_pf^∆(r)) )

keys_pf^∆(x)  def=
  ( x = nil ∧ emp                         : ∅ ;
    x ↦^{key,left,right} (k, l, r) ∗ true : keys_pf^∆(l) ∪ {k} ∪ keys_pf^∆(r) ;
    default                               : ∅ )

Pre-/post-conditions:

ϕpre  ≡  x ↦^{key,left,right} (k, l, r) ∗ mheap_pf^∆(l) ∗ mheap_pf^∆(r)
         ∧ keys_pf^∆(x) = K

ϕpost ≡  mheap_pf^∆(x) ∧ keys_pf^∆(x) = K

Figure 7.2: Case study: Heapify

7.5 Case Study

In this section we give intuition into our verification approach through a case study. We consider the max-heap data-structure. Recall that a max-heap is a binary tree such that for each node n the key stored at n is greater than or equal to the keys stored at each of its children. We have defined max-heap in Dryadsep in Section 6.4. In Figure 7.2, in the lower right corner, we give the recursive definitions for keys_pf^∆(x) and mheap_pf^∆(x) again. The method heapify in Figure 7.2 is at the heart of the procedure for deleting the root from a max-heap (removing the node with the maximum priority). If the max-heap property is violated at a node x while satisfied by

its descendants, then heapify restores the max-heap property at x. It does so by recursively descending into the tree, swapping the key of the root with the key at its left or right child, whichever is greater.
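The pseudocode of Figure 7.2 can be rendered as ordinary executable C. The sketch below (the struct and all names are illustrative, not the verified artifact) also includes a checker for the max-heap property so that the effect of heapify can be observed:

```c
#include <assert.h>
#include <stddef.h>

typedef struct Tree { int key; struct Tree *left, *right; } Tree;

/* heapify: x's subtrees are max-heaps, but x.key may be too small.
   Swap x's key with the larger child's key and recurse, as in Figure 7.2. */
static void heapify(Tree *x) {
    Tree *s;
    if (x->left == NULL)       s = x->right;
    else if (x->right == NULL) s = x->left;
    else s = (x->left->key < x->right->key) ? x->right : x->left;
    if (s != NULL && s->key > x->key) {
        int t = s->key;
        s->key = x->key;
        x->key = t;
        heapify(s);
    }
}

/* mheap: every node's key dominates its children's keys. */
static int mheap(const Tree *x) {
    if (x == NULL) return 1;
    if (x->left != NULL && x->left->key > x->key)   return 0;
    if (x->right != NULL && x->right->key > x->key) return 0;
    return mheap(x->left) && mheap(x->right);
}

int heapify_demo(void) {
    Tree l = {7, NULL, NULL}, r = {9, NULL, NULL};
    Tree root = {3, &l, &r};   /* heap property violated only at the root */
    heapify(&root);
    return mheap(&root) && root.key == 9;
}
```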

The figure also presents the pre and post conditions of the method, ϕpre and ϕpost respectively. The precondition ϕpre binds the free variable K to the set of keys of x. The postcondition states that after the procedure call, x satisfies the max-heap property and the set of keys of x is unchanged (same as K).

7.5.1 Translation

One of the main aspects of our approach is to reduce reasoning about heaplet semantics and separation logic constructs to reasoning about sets of locations. We use set operations like union, intersection and membership to describe separation constraints on a heaplet satisfying a formula. This translation from Dryadsep formulas, like those in Figure 7.2, to formulas in classical logic with recursive predicates and functions has been formally presented in Chapter 6. Intuitively, we associate a set of locations to each (spatial) atomic formula, which is the domain of the heaplet satisfying that formula.

Dryadsep requires that this heaplet is syntactically determined for each formula. In this example, the heaplet associated with the formula x ↦ ... is the singleton {x}; for recursive definitions like mheap_pf^∆(x) and keys_pf^∆(x), the domain of the heaplet is reach_{left,right}(x), which intuitively is the set of locations reachable from x using the pointer fields left and right, and can be defined recursively.
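The reach-set view of heaplets can be made concrete with a small executable sketch. The code below (illustrative names; acyclic heaps of at most MAXN nodes assumed) collects reach_{left,right}(x) and checks the disjoint-union constraint that the translation imposes on a tree root:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Loc { struct Loc *left, *right; } Loc;

#define MAXN 64  /* assumed bound on heap size for this sketch */

/* Collect reach_{left,right}(x) into out[]; returns the new count. */
static int collect(const Loc *x, const Loc *out[], int n) {
    if (x == NULL) return n;
    for (int i = 0; i < n; i++)
        if (out[i] == x) return n;  /* already collected */
    out[n++] = x;
    n = collect(x->left, out, n);
    return collect(x->right, out, n);
}

static bool member(const Loc *x, const Loc *set[], int n) {
    for (int i = 0; i < n; i++)
        if (set[i] == x) return true;
    return false;
}

/* Heaplet constraint of the precondition: reach(x) is the disjoint union
   {x} + reach(left(x)) + reach(right(x)) -- true exactly when x roots a tree. */
int heaplet_demo(void) {
    Loc l = {NULL, NULL}, r = {NULL, NULL}, x = {&l, &r};
    const Loc *gl[MAXN], *gr[MAXN], *gx[MAXN];
    int nl = collect(x.left, gl, 0);
    int nr = collect(x.right, gr, 0);
    int nx = collect(&x, gx, 0);
    bool disjoint = !member(&x, gl, nl) && !member(&x, gr, nr);
    for (int i = 0; i < nl; i++)
        disjoint = disjoint && !member(gl[i], gr, nr);
    return disjoint && (nx == 1 + nl + nr);
}
```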

As shown in Figure 7.2, ϕpre is a conjunction of two formulas. If Gpre is the domain of the heaplet associated with ϕpre, then the first conjunct requires Gpre to be the disjoint union of the sets {x}, reach{left,right}(left(x)) and reach{left,right}(right(x)), that is,

Gpre = {x} ∪ reach{left,right}(left(x)) ∪ reach{left,right}(right(x))

The second conjunct requires Gpre = reach{left,right}(x). From these heaplet constraints, we can translate ϕpre to the following formula in classical logic

over the global heap:

Gpre = {x} ∪ reach{left,right}(left(x)) ∪ reach{left,right}(right(x))

∧ x ∉ reach{left,right}(left(x)) ∧ x ∉ reach{left,right}(right(x))

∧ reach{left,right}(left(x)) ∩ reach{left,right}(right(x)) = ∅
∧ x ≠ nil ∧ mheap(left(x)) ∧ mheap(right(x))

∧ Gpre = reach{left,right}(x) ∧ keys(x) = K

Similarly, we translate ϕpost to

Gpost = reach{left,right}(x) ∧ mheap(x) ∧ keys(x)= K

Note that the recursive definitions mheap and keys without the "∆" superscript are in the classical logic (without the heaplet constraint). Hence the recursive predicate mheap satisfies

mheap(x) ↔ ( x = nil ∧ reach{left,right}(x) = ∅ )

∨ ( x ≠ nil ∧ x ∉ reach{left,right}(left(x)) ∧ x ∉ reach{left,right}(right(x))

∧ reach{left,right}(left(x)) ∩ reach{left,right}(right(x)) = ∅

∧ reach{left,right}(x) = {x} ∪ reach{left,right}(left(x)) ∪ reach{left,right}(right(x))
∧ mheap(left(x)) ∧ {key(x)} ≥ keys(left(x))
∧ mheap(right(x)) ∧ {key(x)} ≥ keys(right(x)) )

Notice that in the translation of mheap_pf^∆, we express all heaplet constraints directly on the reach set, since in Dryadsep the domain of the heaplet associated with a recursive predicate is syntactically determined to be the reach set. The recursive function keys is defined recursively in the same fashion and is skipped in this section. All these conditions involving heaplets, reach-sets, scoping, modified-sets, etc., can be formulated in logic using set theory; Section 6.5 describes this in detail. Finally, the verification condition is written using a logic over the global heap, but referring only to the footprint, and the recursive definitions are formula-abstracted, resulting in a formula in a decidable theory, whose proof is then attempted.

7.5.2 VC Generation

To illustrate how natural proofs work in the heapify example, let us take a closer look at one particular path. The right side of Figure 7.2 presents a basic path from method heapify, corresponding to the case when both children of x are non-nil and the key of the right child is greater than the keys of the left child and the root. The subscript of a pointer/data field denotes the timestamp. A key insight is that any basic path touches a finite number of locations and may call some recursive procedures. We refer to the touched locations as the footprint, and to the adjacent locations which are not part of the footprint as the frontier. For this example, let subscripts 0, 1, 2 indicate the recursive definitions at the start of the path, just before the call to heapify, and after the call returns, respectively. Then the footprint is { x, lx, rx } (s is known to be equal to rx) and the frontier is { left0(lx), right0(lx), left0(rx), right0(rx) }. We capture the effect of the path until the call to heapify by

left0(x) ≠ nil ∧ right0(x) ≠ nil ∧ lx = left0(x) ∧ rx = right0(x)

∧ key0(lx) < key0(rx) ∧ s = right0(x) ∧ s ≠ nil

∧ key0(s) > key0(x) ∧ t = key0(s)

∧ key1 = key0{s ← key0(x)} ∧ key2 = key1{x ← t}

To get the verification condition for the entire path, we conjoin the precondition ϕpre, the effect of the path before the function call, the effect of the function call, and the formulas that compute the recursive definitions for the footprint in terms of the values at the frontier locations, and check whether the conjunction implies the postcondition ϕpost. This procedure has been formally described in Section 7.2.

7.5.3 Natural Proofs

Once we have expressed the verification condition in classical logic with re- cursive definitions over the global heap, we prove it using the natural proof methodology. A key issue is tracking the recursive predicates and functions across a basic path. We need to evaluate these recursive definitions at the beginning and the end of the path, and also before and after every procedure

call to incorporate the effect of the procedure call. We "compute" these by expressing the definitions on nodes within the footprint using their recursive definitions. Furthermore, at frontier locations, we know that if the corresponding reach set from such a location has not changed due to the basic path working on the footprint (i.e., the reach set from the location does not involve footprint nodes), then the recursive definitions on the frontier do not change. Similarly, if a procedure is called and its pre-condition defines a heaplet that is disjoint from the reach-set of a location, then we can retain the value of a recursive definition for that location across the call to the procedure; otherwise it will have to be updated conservatively, taking into account the pre/post condition of the called procedure. Technically, we unfold the recursive definitions mheap(x), keys(x) and reach{left,right}(x) for x, lx and rx (the footprint), thus evaluating them in terms of their values on left0(lx), right0(lx), left0(rx) and right0(rx) (the frontier). The call to heapify preserves the recursive definitions on locations reachable from lx, and modifies those on rx according to the pre/post condition. If mheap_i, keys_i and reach_i are the recursive definitions at the i-th instance, then we can compute their values for the locations in the footprint by instantiating their definitions given above, in terms of the values of the recursive definitions at the frontier. For frontier locations, we notice that the basic path up to the call to heapify only modifies the fields of locations x and s. If neither x nor s is reachable from a frontier location l, then the value of the recursive definitions at l remains unchanged across the path. Then for any frontier location l

{x, s} ∩ reach0(l)= ∅

⇒ mheap1(l)= mheap0(l) ∧ keys1(l)= keys0(l) ∧ reach1(l)= reach0(l)

In our example, the left-hand side of the implication is true for all frontier locations because ϕpre states that x points to a tree.

To handle the call to heapify, let G1 and G2 be the domains of the heaplets of ϕpre and ϕpost when instantiated with the actual call argument s. Also, let key3, left1 and right1 be the data and pointer fields after the call. If a footprint location l is not in the domain of the precondition, i.e., G1, then it is not affected by the function call. Also, it is not present in the domain of

the postcondition G2:

l ∉ G1

⇒ l ∉ G2 ∧ key3(l) = key2(l) ∧ left1(l) = left0(l) ∧ right1(l) = right0(l)

Similarly, if the set of reachable nodes of a frontier location l does not inter- sect with the domain of the call, then the values of its recursive definitions are unchanged across the call:

G1 ∩ reach0(l) = ∅

⇒ mheap2(l)= mheap1(l) ∧ keys2(l)= keys1(l) ∧ reach2(l)= reach1(l)
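The frame conditions above are easy to observe on a concrete heap. The following sketch (illustrative types; keys summarized by sum and count rather than a genuine multiset) mutates only a footprint node and confirms that the recursive definitions at a disjoint frontier location are unchanged:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct T { int key; struct T *left, *right; } T;

/* keys(x) as a multiset summary: the sum and count of keys are enough to
   detect a change here (a simplification of the keys_pf definition). */
static void keys_summary(const T *x, int *sum, int *cnt) {
    if (x == NULL) return;
    *sum += x->key;
    (*cnt)++;
    keys_summary(x->left, sum, cnt);
    keys_summary(x->right, sum, cnt);
}

static bool reaches(const T *from, const T *target) {
    if (from == NULL) return false;
    if (from == target) return true;
    return reaches(from->left, target) || reaches(from->right, target);
}

/* Frame-rule sketch: if the mutated footprint {x} is not reachable from l,
   the recursive definitions at l are unchanged by the mutation. */
int frame_demo(void) {
    T fl = {5, NULL, NULL};            /* frontier location l        */
    T x  = {1, &fl, NULL};             /* footprint node             */
    if (reaches(&fl, &x)) return 0;    /* premise: x not in reach(l) */
    int s0 = 0, c0 = 0;
    keys_summary(&fl, &s0, &c0);
    x.key = 42;                        /* basic path mutates only x  */
    int s1 = 0, c1 = 0;
    keys_summary(&fl, &s1, &c1);
    return s0 == s1 && c0 == c1;       /* keys(l) is preserved       */
}
```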

Finally, we abstract the recursive definitions on the frontier with uninterpreted functions, and encode the relations between integer sets into APF:

left0(x) ≠ nil ∧ right0(x) ≠ nil ∧ lx = left0(x) ∧ rx = right0(x)

∧ ( ∀i1, i2. i1 ≤ i2 → (¬key0(rx)[i1] ∨ ¬key0(lx)[i2]) ) ∧ s = right0(x) ∧ s ≠ nil

∧ ( ∀i1, i2. i1 ≤ i2 → (¬key0(s)[i1] ∨ ¬key0(x)[i2]) ) ∧ t = key0(s)

∧ key1 = key0{s ← key0(x)} ∧ key2 = key1{x ← t}

We decide the resulting formula (which is in a decidable logic) using an SMT solver. This process has been described in detail in Section 7.4.

7.6 Experimental Evaluation

We have implemented a prototype of the natural proof methodology for

Dryadsep presented in this chapter. The prototype verifier takes as input a set of user-defined recursive definitions, a set of procedure declarations with contracts, and a set of straight-line programs (or basic blocks) annotated with a pre-condition and a post-condition specifying a set of partial correctness properties including structural, data and separation requirements. Both the contracts and pre-/post-conditions are written in Dryadsep. For each basic block, the verifier automatically generates the abstracted formula ψAPF as described in Sections 7.3 and 7.4, and passes ψAPF to Z3 [28], a state-of-the-art SMT solver, to check the satisfiability in the decidable theory of the array

property fragment. The front-end of our verifier is based on ANTLR and our tool is around 4000 lines of C# code.

7.6.1 Standard Routines

Using the verifier, we successfully proved the partial correctness of 59 routines over a large class of programs involving heap data-structures including sorted lists, doubly-linked lists, cyclic lists and trees. Experimental details are available at http://web.engr.illinois.edu/~qiu2/dryad . We conducted the experiments on a machine with a dual-core, 2.4GHz CPU and 6GB RAM. The first part of our experimental results is tabulated in Table 7.1. In general, for every routine, we checked the properties formalizing the complete verification of the routine: capturing the precise structure of the resulting heap-structure, the precise change to the data stored in the nodes, and the precise heaplet modified by the respective routine. For every routine, the suffix rec or iter indicates whether the routine was implemented recursively or iteratively using while loops. The names for most of the routines are self-descriptive. Routines like find, insert, delete, append, etc. are the natural implementations of the corresponding data-structure operations. The routine delete all for singly-linked lists, sorted lists and doubly-linked lists recursively deletes all occurrences of a particular key in the input list. The max-heap routine heapify accepts an almost max-heap in which the heap property is violated only at the root, both of whose children are max-heaps, and recursively descends the tree to restore the max-heap property. The routine remove root for binary search trees and treaps is an auxiliary routine which is called in delete. Similarly, the routines leftmost for AVL-trees and RB-trees and delete fix and insert fix for RB-trees are also auxiliary routines. Schorr-Waite is a well-known graph marking algorithm which marks all the reachable nodes of the graph using very little additional space.
The algorithm achieves this by manipulating the pointers in the graph such that the stack of nodes along the path from the root is encoded in the graph itself. The Schorr-Waite algorithm is used in garbage collectors, and it is traditionally considered a challenging problem for verification [39]. The

Data-structure       Routine                                            Time (s)
--------------------------------------------------------------------------------
Singly-Linked List   find rec, insert front, insert back rec,
                     delete all rec, copy rec, append rec,
                     reverse iter                                       < 1s
Sorted List          find rec, insert rec, merge rec,
                     delete all rec, insert sort rec,
                     reverse iter, find last iter                       < 1s
                     insert iter                                        1.4
                     quick sort iter                                    64.8
Doubly-Linked List   insert front, insert back rec, delete all rec,
                     append rec, mid insert, mid delete, meld           < 1s
Cyclic List          insert front, insert back rec,
                     delete front, delete back rec                      < 1s
Max-Heap             heapify rec                                        8.8
BST                  find rec, find iter, insert rec,
                     delete rec, remove root rec                        < 1s
                     insert iter                                        72.4
                     find leftmost iter                                 4.7
                     remove root iter                                   65.6
                     delete iter                                        225.2
Treap                find rec, delete rec                               < 1s
                     insert rec                                         12.7
                     remove root rec                                    9.5
AVL-Tree             balance, leftmost rec                              < 1s
                     insert rec                                         4.1
                     delete rec                                         13.9
RB-Tree              insert rec                                         73.9
                     insert left fix rec                                8.1
                     insert right fix rec                               5.1
                     delete rec                                         12.1
                     delete left fix rec                                7.6
                     delete right fix rec                               5.5
                     leftmost rec                                       < 1s
Binomial Heap        find min rec                                       1.1
                     merge rec                                          152.7
Schorr-Waite         marking iter                                       < 1s
(for trees)
Tree Traversals      inorder tree to list rec                           2.4
                     inorder tree to list iter                          42.7
                     preorder rec, postorder rec                        < 1s
                     inorder rec                                        3.76

Table 7.1: Results of verifying data-structure algorithms; more details at http://web.engr.illinois.edu/~qiu2/dryad .

routine marking is an implementation of Schorr-Waite for trees [49], and we check the property that the resulting output tree is well-marked. The routines inorder tree to list construct a list consisting of the keys of the input tree, which is traversed inorder. The iterative version of this algorithm achieves this by maintaining a worklist/stack of the sub-trees which remain to be processed at any given time. The routines inorder, preorder and postorder number the nodes of an input tree according to the inorder, preorder and postorder traversal algorithms, respectively.
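As an illustration of the traversal routines, a recursive inorder tree-to-list can be sketched as follows (the names and the array-based output are illustrative, not the verified implementation):

```c
#include <assert.h>
#include <stddef.h>

typedef struct BT { int key; struct BT *left, *right; } BT;

/* Append the keys of t, in inorder, into out[]; returns the new length.
   A recursive rendering of the inorder tree-to-list routine described above. */
static int tree_to_list(const BT *t, int out[], int n) {
    if (t == NULL) return n;
    n = tree_to_list(t->left, out, n);   /* left subtree first */
    out[n++] = t->key;                   /* then the root      */
    return tree_to_list(t->right, out, n); /* then the right   */
}

int inorder_demo(void) {
    BT a = {1, NULL, NULL}, c = {3, NULL, NULL}, b = {2, &a, &c};
    int out[8];
    int n = tree_to_list(&b, out, 0);
    return n == 3 && out[0] == 1 && out[1] == 2 && out[2] == 3;
}
```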

7.6.2 Open-Source Libraries

Additionally, we pit our natural proofs methodology against real-world programs and successfully verified, in total, 47 routines from different projects, including the list and queue implementations in the Glib open-source library, the OpenBSD library, the Linux kernel, and the memory-region and page-cache implementations from two different operating systems. Table 7.2 shows the results of applying natural proofs to the verification of various other real-world programs and libraries. Glib is the low-level C library that forms the basis of the GTK+ toolkit and the GNOME desktop environment, apart from other open-source projects. Using our prototype verifier, we efficiently verified the Glib implementations of various routines for manipulating singly-linked and doubly-linked lists. We also verified the queue library which forms part of the OpenBSD library. ExpressOS is an operating-system/browser implementation which provides security guarantees to the user via formal verification [54]. The module cachePage maintains a cache of the recently used disc pages. The cache is implemented as a priority queue based on a sorted list. We prove that the methods add cachepage and lookup prev (both called whenever a disc page is accessed) maintain the sortedness property of the cache-page list. In an OS kernel, a process address space consists of a set of intervals of linear addresses, represented as a memory region. In the ExpressOS implementation, a memory region is implemented as a sorted doubly-linked list, where each node of the list, with a start and an end address, represents an interval included in the address space. We also verified some key components of the Linux implementation of a memory region, present in the file mmap.c.

Example                     Routine                                    Time (s)
--------------------------------------------------------------------------------
glib/gslist.c               free, prepend, concat, insert before,
Singly Linked-List          remove all, remove link, delete link,
LOC: 1.1K                   copy, reverse, nth, nth data, find,
                            position, index, last, length              < 1s
                            append                                     4.9
                            insert at pos                              11.4
                            remove                                     3.1
                            insert sorted list                         16.6
                            merge sorted lists                         6.1
                            merge sort                                 3.0
glib/glist.c                free, prepend, reverse, nth, nth data,
Doubly Linked-List          position, find, index, last, length        < 1s
LOC: 0.3K
OpenBSD/queue.h             simpleq init, simpleq remove after         < 1s
Queue                       simpleq insert head                        1.6
LOC: 0.1K                   simpleq insert tail                        3.6
                            simpleq insert after                       18.3
                            simpleq remove head                        2.1
ExpressOS/cachePage.c       lookup prev                                2.4
LOC: 0.1K                   add cachepage                              6.4
ExpressOS/memoryRegion.c    memory region init                         < 1s
LOC: 0.1K                   create user space region                   3.6
                            split memory region                        5.8
linux/mmap.c                find vma, remove vma, remove vma list      < 1s
LOC: 0.1K                   insert vm struct                           11.6

Table 7.2: Results of verifying open-source libraries; more details at http://web.engr.illinois.edu/~qiu2/dryad .

In Linux, a memory region is represented as a red-black tree where each node, again, represents an address interval. We proved methods which find, remove and insert a vma struct (vma is short for virtual memory address) into a memory region.

7.6.3 Evaluation

Note that our natural proof mechanism presented in Chapter 5 can only prove tree properties, and is further restricted by extremely stringent conditions on pre- and post-conditions (for example, two trees cannot be given as parameters to a procedure, a procedure cannot modify the input node and return another tree, etc.). However, on the tree data-structure examples, the algorithm in Chapter 5 performs better, as it has been honed to work only for trees, where graph algorithms on footprints check the tree property as opposed to the logical mechanism dealing with it. The technique in this chapter is more general, and can handle arbitrary separation properties expressed in Dryadsep.

The results for Dryadtree do suggest that for certain structural properties, moving their check to a simpler graph algorithm outside of logic may be more efficient in practice. Another big difference is that Dryadtree can only handle functional recursion and does not support while-loops. In contrast,

Dryadsep supports both recursive and iterative programs. In fact, this chapter extends the natural proof methodology to partial heap-structures like list segments or partial trees, which is essential for the verification of many while-loop programs. It is important to note that most of the routines in Table 7.2 are implemented iteratively. It is also worth mentioning that in the course of the experiments, we did make some unintentional mistakes, both in writing the basic blocks and in the annotations. For example, forgetting to free the deleted node, or using ∧ instead of ∗ in the specification between two disjoint heaplets, were common mistakes. In these cases, Z3 provided counter-examples to the verification condition that captured the essence of the bugs, which turned out to be very helpful for debugging the specification. These debugging hints are usually not available in other incomplete proof systems.

Our experiments show that the natural proof methodology set forth in this chapter is successful in efficiently proving full functional correctness of a large variety of algorithms. Most of the VCs generated for the above examples were discharged by Z3 in a few seconds. To the best of our knowledge, this is the first automatic mechanism that can prove such a wide variety of algorithms correct, handling such complex properties of structure, data and separation.

7.7 Related Work

There is a rich literature on analysis of heaps in software. We omit discussing literature on general interactive theorem provers (like Isabelle [62]) that require considerable manual guidance. We also omit a lot of work on analyzing shape properties of the heap [53, 77, 22, 30, 8], as they do not handle complex functional properties. There are several proof systems and assistants for separation logic [68, 63] that incorporate proof heuristics and are incomplete. However, [9] gives a small decidable fragment of separation logic on lists which has been further extended in [17] to include a restricted form of arithmetic. Symbolic execution with separation logic has been used in [11, 10, 14] to prove structural specifications for various list and tree programs. These tools come hardwired with a collection of axioms, and their symbolic execution engines check the entailment between two formulas modulo these axioms. VeriFast [41], on the other hand, chooses flexibility of writing richer specifications over complete automation, but requires the user to provide inductive lemmas and proof tactics to aid verification. Similarly, Bedrock [24] is a Coq library that aims at mostly automated (but not completely automated) procedures, and requires some proof tactics to be given by the user to prove verification conditions. The idea of using regions (sets of locations) for describing heaps in our work also extends to describing frames for function calls, and the use for the latter is similar to implicit dynamic frames [71] in the literature. The crucial difference in our framework is that the implicit dynamic frames are syntactically determined, and amenable to quantifier-free reasoning. A work that comes very close to ours is a paper by Chin et al. [23], where the authors allow user-defined recursive predicates (similar to ours) and build a terminating procedure that reduces the verification condition to standard

logical theories. However, their procedure does not search for a proof in a well-defined simple and decidable class, unlike our natural proof mechanism; in fact, the resulting formulas are quantified and incompatible with the decidable logics handled by SMT solvers. In all of the above cited work, and in other manual and semi-automatic approaches to verification of heap-manipulating programs like [69], inductive definitions of algebraic data-types are extremely common for capturing second-order data-structure properties. Most of these approaches use proof tactics which unroll inductive definitions and do extensive unification to try to match terms to find simple proofs. Our notion of natural proofs is very much inspired by such kinds of semi-automatic and manual heap reasoning that we have seen in the literature. There is also a variety of verification tools based on classical logics and SMT solvers. [47] and VCC [26] compile to Boogie [5] and generate VCs that are passed to SMT solvers. This approach requires significant ghost annotations, and annotations that explicitly express and manipulate frames. The Jahob system [81, 82] is one of the first attempts at full functional verification of linked data structures; it integrates a variety of theorem provers, including SMT solvers, and makes the process mostly automated. However, complex specifications combining structure, data and separation usually require more complex provers such as Mona [42], or even interactive theorem provers such as Isabelle [62] in the worst case. The success of the proof search also relies on the user's manual guidance. The idea of unfolding recursive definitions and formula abstraction also features in the work by Suter et al. [72, 73], where a procedure for algebraic data-types is presented. However, this work focuses on soundness and completeness, and does not terminate for several complex data structures like red-black trees.
Moreover, the work limits itself to functional program correctness; in our opinion, functional programs are very similar to algebraic inductive specifications, leading to much simpler proof procedures. There is also a rich literature on completely automatic decision procedures for restricted heap logics, some of which combine structure-logic and arbitrary data-logic. These logics are usually FOLs with restricted quantifiers, and are usually decided using SMT solvers. The logics Lisbq [46] and CSL [15, 16] offer such reasoning with restricted reachability predicates and quantification; see also the logics in [12, 65, 66, 59, 67, 3]. Strand

presented in this dissertation is a relatively expressive logic that can handle some data-structure properties (like BSTs) and admits decidable fragments, but is again not expressive enough for more complex properties of inductive data-structures. None of these logics can express the class of VCs for full functional verification explored in this chapter.

7.8 Annotation Synthesis

In this chapter, we have presented how to build an automatic verifier which encodes the natural proof strategy for Dryadsep into logical formulas that can be handled by SMT solvers. However, we believe natural proofs, as a novel and fundamental methodology, should not only lead to dedicated, stand-alone verifiers, but also benefit existing verification frameworks and tools. For example, in recent years, a number of tools have been developed with the aim of making the verification process mostly automated. These tools vary in the logics they support (e.g., classical logic [26, 47] or SL [41]) and in their underlying proving engine (e.g., SMT solvers [26, 47] or Mona/Isabelle [81, 82]), but they all rely on user-provided lemmas and/or ghost code. The programmer may provide hints or proof heuristics through these additional annotations. This task can be very challenging and tedious for ordinary programmers, and makes the verification process much less automatic. The following motivating example illustrates why.

7.8.1 A Motivating Example

Let us consider a BST example verified by VeriFast²:

#include "stdlib.h"
#include "assert.h"

struct tree {
  int value;
  struct tree *left;

²http://people.cs.kuleuven.be/~bart.jacobs/verifast/examples/sorted_bintree.c.html

  struct tree *right;
};

predicate tree(struct tree *t, bintree b)
  requires switch(b){
    case empty: return t==0;
    case tree(a,bl,br): return t->value |-> ?v &*& t->left |-> ?l
      &*& t->right |-> ?r &*& malloc_block_tree(t)
      &*& tree(l,bl) &*& tree(r,br) &*& t!=0 &*& a==v;
  } &*& inorder(b)==true;

inductive bintree = | empty | tree(int, bintree, bintree);

fixpoint bool t_contains(bintree b, int v) { ... }
fixpoint bintree tree_add(bintree b, int x) { ... }
fixpoint int max(bintree b) { ... }
fixpoint int min(bintree b) { ... }
fixpoint bool inorder(bintree b) { ... }

lemma void max_conj_add(bintree l, int v, int x)
  requires x < v &*& (max(l) < v || l==empty) &*& inorder(l)==true;
  ensures max(tree_add(l,x)) < v &*& inorder(l)==true;
{
  switch(l){
    case empty:
    case tree(a,b,c): if(x < a){ max_conj_add(b,a,x); }
                      if(a < x){ max_conj_add(c,v,x); }
  }
}

lemma void min_conj_add(bintree r, int v, int x)
  requires v < x &*& (v < min(r) || r==empty) &*& inorder(r)==true;
  ensures v < min(tree_add(r,x)) &*& inorder(r)==true;
{
  switch(r){
    case empty:
    case tree(a,b,c): if(a < x){ min_conj_add(c,a,x); }
                      if(x < a){ min_conj_add(b,v,x); }
  }
}

lemma void tree_add_inorder(bintree b, int x)
  requires inorder(b)==true &*& t_contains(b,x)==false;
  ensures inorder(tree_add(b,x))==true
    &*& t_contains(tree_add(b,x),x)==true;
{
  switch (b) {
    case empty:
    case tree(v,l,r):
      if(x < v){
        max_conj_add(l,v,x);
        tree_add_inorder(l,x);
      }
      if(v < x){
        min_conj_add(r,v,x);
        tree_add_inorder(r,x);
      }
  }
}

void add(struct tree *t, int x)
  //@ requires tree(t,?b) &*& b!=empty &*& false==t_contains(b,x) &*& inorder(b)==true;
  //@ ensures tree(t,tree_add(b,x)) &*& inorder(tree_add(b,x))==true;
{
  //@ open tree(t,b);
  int v=t->value;
  struct tree *l=t->left;
  //@ open tree(l,?bl);
  //@ close tree(l,bl);
  struct tree *r=t->right;
  //@ open tree(r,?br);
  //@ close tree(r,br);
  if(x < v){
    if(l!=0){
      add(l,x);
      //@ tree_add_inorder(b,x);
      //@ close tree(t,tree(v,tree_add(bl,x),br));
    }else{
      struct tree *temp=init_tree(x);
      t->left=temp;
      //@ open tree(l,bl);
      //@ close tree(t,tree(v,tree(x,empty,empty),br));
      //@ tree_add_inorder(b,x);
    }
  }else{
    if(v < x){
      if(r!=0){
        add(r,x);
        //@ tree_add_inorder(b,x);
        //@ close tree(t,tree(v,bl,tree_add(br,x)));
      }else{
        struct tree *temp=init_tree(x);
        t->right=temp;
        //@ open tree(r,br);
        //@ close tree(t,tree(v,bl,tree(x,empty,empty)));
      }
    }
  }
}

The BST consists of structured nodes of type tree. The add function inserts a new node with value x into a tree whose root is pointed to by t. The requires and ensures clauses specify the partial correctness of add: the expected inputs are a non-empty tree pointed to by t and an integer x, where the keys in the tree form an inorder bintree structure b that does not contain x; when the function returns, the heap satisfies tree(t,tree_add(b,x)), i.e., the keys stored under t still form an inorder bintree structure. The above specification relies on the definition of a recursive predicate tree(t,b) and an inductive data type bintree. To verify this particular function, the programmer instruments 14 ghost annotations in 3 categories: the open clause, the close clause, and the tree_add_inorder lemma. Intuitively, open and close manually unfold or fold, respectively, the recursive definitions with respect to a particular node. The tree_add_inorder lemma asserts the relation between b and tree_add(b,x). This lemma is described as a function and proved by verifying the lemma function. Furthermore, proving tree_add_inorder in turn relies on two additional lemmas: max_conj_add and min_conj_add. In addition, these lemmas use 5 pure functions which are recursively defined: t_contains, tree_add, max, min and inorder. Their definitions are omitted here. The VeriFast tool successfully verifies this program against such complex specifications, in a completely automatic fashion. Nevertheless, the price to

pay is all the ghost annotations, lemmas and functions, which together are literally at least three times longer than the minimal portion that implements the functionality. Writing these extra annotations also far exceeds the capability of an ordinary programmer without specific training. Therefore, our goal is to extricate programmers from the extra burden of developing their own proof heuristics and encoding them into daunting ghost annotations/lemmas. In particular, we hope that natural proofs, as a pre-defined tactic that is practically useful for many correct programs, can be automatically encoded as a set of ghost annotations, leaving only the necessary coding task to the programmer: the program itself as well as pre-/post-conditions.
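For intuition, the elided pure fixpoints behave like ordinary side-effect-free recursive functions. The sketch below gives plausible executable counterparts of t_contains and the inorder (BST ordering) property; these are illustrative renderings, not VeriFast's actual omitted definitions:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct BTree { int value; struct BTree *left, *right; } BTree;

/* Membership test over the whole tree, mirroring the fixpoint t_contains. */
static bool t_contains(const BTree *b, int v) {
    if (b == NULL) return false;
    return b->value == v || t_contains(b->left, v) || t_contains(b->right, v);
}

/* BST ordering check with open bounds, mirroring the role of inorder. */
static bool inorder_ok(const BTree *b, int lo, int hi) {
    if (b == NULL) return true;
    return lo < b->value && b->value < hi
        && inorder_ok(b->left, lo, b->value)
        && inorder_ok(b->right, b->value, hi);
}

int fixpoint_demo(void) {
    BTree l = {1, NULL, NULL}, r = {8, NULL, NULL}, root = {4, &l, &r};
    return t_contains(&root, 8) && !t_contains(&root, 5)
        && inorder_ok(&root, INT_MIN, INT_MAX);
}
```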

7.8.2 Recursive Definitions for VCC

In this section, we consider Dryadsep, and implement natural proofs for C programs by automatically encoding natural proof tactics into carefully crafted ghost-code annotations on a C program, using classical logical annotations that ensure the verification conditions fall into decidable logics. These automatic annotations then help the tool VCC to carry out an automatic proof of the C program, completely freeing the engineer from guiding proofs. We implement our annotation synthesis as an extension of VCC, and the preliminary results show that a class of challenging heap-manipulating programs can be automatically verified. We now show how the "VCC + Natural Proofs" mechanism works through a simple example: reversing a sorted list. First, the programmer writes the program as well as the Dryadsep annotations:

#include "sll.dryad"

Node * reverse_sorted(Node * l)
  _(requires _dryad_srtl^(l))
  _(ensures _dryad_rsrtl^(\result))
  _(ensures _dryad_keys^(\result) == \old(_dryad_keys^(l)))
{
  Node * r = NULL;
  while(l != NULL)
    _(invariant _dryad_srtl^(l) * _dryad_rsrtl^(r))
    _(invariant \old(_dryad_keys^(l)) ==
                \intset_union(_dryad_keys^(l), _dryad_keys^(r)))
  {
    Node * t = l->next;
    l->next = r;
    r = l;
    l = t;
  }
  return r;
}

The annotations above are written in Dryadsep and consist of just the necessary parts: pre-/post-conditions and loop invariants. The _dryad_srtl^ and _dryad_rsrtl^ are recursive predicates for sorted lists and reverse-sorted lists, respectively, and _dryad_keys^ is the recursive function that computes the set of keys stored in the list. All these definitions, along with their corresponding reach-set definitions, are given in a separate header file sll.dryad.

As a preprocessing step, our transformer translates the Dryadsep definitions and annotations into classical logical formulas in the form recognized by standard VCC, following the procedure described in Section 6.5. First, the transformer generates a header file sll.h, which mimics every recursive definition in sll.dryad using a set of ghost pure functions. For example, given the predicate _dryad_srtl^, which is recursively defined as

define pred dryad_srtl^(x):
  ( ((x l= nil) & emp)
  | ((x |-> loc next: nxt; int key: ky) *
     (dryad_srtl^(nxt) & (ky lt-set dryad_keys^(nxt)))) ) ;

the sll.h file declares both _dryad_srtl and its corresponding reach set _dryad_N, and defines how they should be unfolded:

_(ghost _:pure \bool _dryad_srtl(struct node * head)
  _(reads \universe()) ;)

_(ghost _:pure \objset _dryad_N(struct node * head)
  _(reads \universe()) ;)

_(pure \bool _dryad_unfold_srtl(struct node * x)
  _(reads \universe())
  _(ensures \result == (_dryad_srtl(x) ==
    ( (x == NULL && _dryad_N(x) == {})
    || ( x \in \universe()
      && _dryad_srtl(x->next)
      && \intset_lt_set(x->key, _dryad_keys(x->next))
      && {x} \union _dryad_N(x->next) == _dryad_N(x) ) )) )
;)

_(pure \bool _dryad_unfold_N(struct node * hd)
  _(reads \universe())
  _(ensures \result ==
    ( (hd == NULL && _dryad_N(hd) == {})
    || (hd != NULL && _dryad_N(hd) == ({hd} \union _dryad_N(hd->next))) ))
;)

We omit the similar definitions for _dryad_rsrtl, _dryad_unfold_rsrtl, _dryad_keys and _dryad_unfold_keys.

Secondly, the transformer replaces each Dryadsep-style specification with a VCC-style specification. For instance, the loop invariant

_(invariant _dryad_srtl^(l) * _dryad_rsrtl^(r))

gets translated to the following ones:

_(invariant _dryad_srtl(l))
_(invariant _dryad_rsrtl(r))
_(invariant \disjoint(_dryad_N(l), _dryad_N(r)))

7.8.3 Natural Proofs for VCC

After the above translation, instead of generating the VC and applying the natural proofs in steps as set forth earlier in this chapter, our annotation synthesizer simply instruments assumptions at the beginning and the end of each basic block. These assumptions encode the known facts about the current program state: unfoldings and framings. They are essentially the same as the formula UnfoldAndFrame formulated in Section 7.4, but written in the VCC-specification format. To this end, we also define several macros in sll.h to make the instrumentation succinct and intuitive. In this particular example, these macros are defined as:

_(ghost _:pure \bool _dryad_unfoldAll(\object o)
  _(reads \universe())
  _(ensures _dryad_unfold_N(o))
  _(ensures _dryad_unfold_keys(o))
  _(ensures _dryad_unfold_srtl(o))
  _(ensures _dryad_unfold_rsrtl(o))
;)

_(logic \bool _dryad_recUnchanged(struct node * x, struct node * y,
                                  \state enter, \state exit) =
     ((!(x \in \at(enter, _dryad_N(y)))) ==>
        \at(enter, _dryad_N(y)) == \at(exit, _dryad_N(y)))
  && ((!(x \in \at(enter, _dryad_N(y)))) ==>
        \at(enter, _dryad_keys(y)) == \at(exit, _dryad_keys(y)))
  && ((!(x \in \at(enter, _dryad_N(y)))) ==>
        \at(enter, _dryad_srtl(y)) == \at(exit, _dryad_srtl(y)))
  && ((!(x \in \at(enter, _dryad_N(y)))) ==>
        \at(enter, _dryad_rsrtl(y)) == \at(exit, _dryad_rsrtl(y)))
  && (\disjoint(\at(enter, _dryad_N(x)), \at(enter, _dryad_N(y))) ==>
        \at(enter, _dryad_N(x)) == \at(exit, _dryad_N(x)))
  && (\disjoint(\at(enter, _dryad_N(x)), \at(enter, _dryad_N(y))) ==>
        \at(enter, _dryad_keys(x)) == \at(exit, _dryad_keys(x)))
  && (\disjoint(\at(enter, _dryad_N(x)), \at(enter, _dryad_N(y))) ==>
        \at(enter, _dryad_srtl(x)) == \at(exit, _dryad_srtl(x)))
  && (\disjoint(\at(enter, _dryad_N(x)), \at(enter, _dryad_N(y))) ==>
        \at(enter, _dryad_rsrtl(x)) == \at(exit, _dryad_rsrtl(x)))
;)

Let c be the current program state. Intuitively, _dryad_unfoldAll unfolds each recursive definition (including reach sets) at c, and _dryad_recUnchanged mimics the formula RecUnchanged^rec_c for each recursive definition rec. When sll.h is included, the instrumented program is generated as below:

#include "sll.h"

struct node* reverse_sorted(struct node* l)
  requires _dryad_srtl(l);
  ensures _dryad_rsrtl(result);
  ensures unchecked==(_dryad_keys(result), old(@prestate, _dryad_keys(l)));
{
  // --- Dryad annotated function ---
  _math \objset _dryad_G0;
  _math \objset _dryad_G1;
  @spec(@=(_dryad_G0, _dryad_N(l)))
  @spec(@=(_dryad_G1, _dryad_G0))
  struct node* l;
  {
    assume _dryad_unfoldAll((checked \object)l);
    struct node* r;
    assume mutable_list(l);
    @=(r, (checked struct node*)(checked void*)0)
    @while(@loop_contract(
        assert _dryad_srtl(l); ,
        assert _dryad_rsrtl(r); ,
        assert \disjoint(_dryad_N(l), _dryad_N(r)); ,
        assert @ite(unchecked!=((checked void*)l, (checked void*)0),
                    \intset_ge(*((l->key)), _dryad_keys(r)), true); ,
        assert unchecked==(old(@prestate, _dryad_keys(l)),
                           \intset_union(_dryad_keys(l), _dryad_keys(r))); ,
        assert mutable_list(l); ),
      unchecked!=((checked void*)l, (checked void*)0),
      {
        assume _dryad_unfoldAll((checked \object)r);
        assume _dryad_unfoldAll((checked \object)l);
        struct node* t;
        assume unfoldMutable(l);
        {
          assume _dryad_unfoldAll((checked \object)l);
          @=(t, *((l->next)))
          assume _dryad_unfoldAll((checked \object)l);
        }
        {
          _math \state _dryad_S0;
          @spec(@=(_dryad_S0, @_vcc_current_state))
          @=(*((l->next)), (checked struct node*)r)
          assume _dryad_unfoldAll((checked \object)l);
          _math \state _dryad_S1;
          @spec(@=(_dryad_S1, @_vcc_current_state))
          assume _dryad_fieldUnchanged(l, t, _dryad_S0, _dryad_S1);
          assume _dryad_fieldUnchanged(l, r, _dryad_S0, _dryad_S1);
        }
        { @=(r, (checked struct node*)l) }
        { @=(l, (checked struct node*)t) }
      } )
    return r;
    skip;
  }
}

These automatically generated annotations help VCC find a natural proof. When the unfoldings and framings are explicitly stated, the built-in engine of VCC automatically finishes the rest of the verification in a few seconds. This example shows the salient feature of the annotation synthesis from natural proofs: removing most of the extra burden of writing proof annotations. Besides the program itself, the programmer just needs to write the minimal, unavoidable specification: pre-/post-conditions and loop invariants. In many cases, the synthesized annotations are already enough to carry out an automatic proof. Otherwise, if the verifier does not finish the proof immediately, what is left to the programmer is usually only a few hints or lemmas that are specific to the program, so that the programmer can focus on the most creative part of the verification. In either case, we believe our annotation synthesizer paves the way for tractable analysis of separation logic for C programs manipulating heaps using the idea of natural proofs, and makes deductive verification more approachable to ordinary programmers.

CHAPTER 8

CONCLUSIONS

This chapter first presents the conclusions of this dissertation, followed by a look ahead at future research directions.

8.1 Conclusions

For several decades, automated reasoning for program verification has been an intense research topic. Most people today are aware of a fact: there is no one-size-fits-all solution to the problem of buggy programs. Computer systems are used in so many different contexts that each different technique could potentially be useful. Few programmers are willing to write extra annotations for less-critical software; however, for software systems whose reliability is highly critical, it may be justifiable to do so if highly automated techniques and tools are available to help the user. This dissertation focuses on heap-manipulating programs and spans several important aspects of heap analysis and verification: data-structures, decidability, theorem proving, separation, and recursion. A key theme in this work is to develop new program logics and methodologies that strike a nice balance between expressiveness and verifiability in the area of verifying heap-manipulating programs. We have defined two logical frameworks, one called Strand (in Chapters 3 and 4), and the other called Natural Proofs (in Chapters 5, 6 and 7). Strand is so far one of the most powerful decidable logical frameworks for complex properties combining heap structures and data. The natural proof scheme is an efficient terminating proof methodology that can automatically verify the full correctness of a wide variety of challenging programs, including a large number of programs from the GTK library, the OpenBSD library and the Linux kernel.

The two approaches complement each other: Strand sticks to decidability, and can be used not only in proving programs correct but also in software analysis and testing; natural proofs aim at expressiveness, and could alleviate the programmer's burden, making the proof technology more feasible for ordinary programmers. We believe that this work paves the way for deductive verification technology to be used by programmers who do not (and need not) understand the internals of the underlying logic solvers, significantly increasing its applicability in building reliable systems. By combining these different approaches, we envision the emergence of innovative techniques that can help the user build reliable software in a natural and efficient way, and hold promise of hatching the next-generation automatic verification techniques.

8.2 A Look Ahead

To make software development easier, more reliable, and more productive, several promising research directions can be explored in the future. One major direction for future research is software repair and debugging. An interesting observation is that, when the natural proof strategy succeeds, the proof is usually very similar to the program itself. We believe that when the natural proof strategy discovers errors in a portion of a high-assurance program, and the bug is due to some particular statements in the program, there should be a method that automatically suggests some way to repair it. In particular, the counterexample from the failed proof may suggest some revisions of the program, which may carry a natural proof. We envision that such a tool would drastically reduce the programmer's burden in developing reliable software. Another important direction ahead is to express and synthesize loop invariants. Real-world programmers tend to write programs with loops rather than recursion. Imperative programs with while-loops usually have very complex loop invariants that are challenging even to express. For example, the loop invariant for traversing a tree talks about a partial-tree structure that consists of the nodes before the current node in preorder. To express such an invariant, one usually has to use some higher-order logic on graphs. Two tasks are involved in this direction: first, developing a logical framework that

is amenable to succinctly expressing invariants; then, developing automatic invariant synthesis techniques based on such a logical framework.

REFERENCES

[1] Computer Aided Verification, 21st International Conference, CAV 2009, Grenoble, France, June 26 - July 2, 2009. Proceedings (2009), vol. 5643 of LNCS, Springer.

[2] Balaban, I., Pnueli, A., and Zuck, L. D. Shape analysis by predicate abstraction. In VMCAI'05 (2005), vol. 3385 of LNCS, Springer, pp. 164–180.

[3] Balaban, I., Pnueli, A., and Zuck, L. D. Shape analysis of single-parent heaps. In VMCAI'07 (2007), vol. 4349 of LNCS, Springer, pp. 91–105.

[4] Ball, T., Majumdar, R., Millstein, T., and Rajamani, S. K. Automatic predicate abstraction of C programs. In PLDI’01 (2001), ACM, pp. 203–213.

[5] Barnett, M., Chang, B.-Y. E., DeLine, R., Jacobs, B., and Leino, K. R. M. Boogie: A modular reusable verifier for object-oriented programs. In FMCO'05 (2005), vol. 4111 of LNCS, Springer, pp. 364–387.

[6] Barrett, C., Conway, C. L., Deters, M., Hadarean, L., Jovanović, D., King, T., Reynolds, A., and Tinelli, C. CVC4. In CAV (2011), vol. 6806 of LNCS, Springer, pp. 171–177.

[7] Barrett, C., Deters, M., de Moura, L. M., Oliveras, A., and Stump, A. 6 years of SMT-COMP. J. Autom. Reasoning 50, 3 (2013), 243–277.

[8] Berdine, J., Calcagno, C., Cook, B., Distefano, D., O'Hearn, P. W., Wies, T., and Yang, H. Shape analysis for composite data structures. In CAV'07 (2007), vol. 4590 of LNCS, Springer, pp. 178–192.

[9] Berdine, J., Calcagno, C., and O’Hearn, P. W. A decidable fragment of separation logic. In FSTTCS’04 (2004), vol. 3328 of LNCS, Springer, pp. 97–109.

[10] Berdine, J., Calcagno, C., and O'Hearn, P. W. Smallfoot: Modular automatic assertion checking with separation logic. In FMCO'05 (2005), vol. 4111 of LNCS, Springer, pp. 115–137.

[11] Berdine, J., Calcagno, C., and O'Hearn, P. W. Symbolic execution with separation logic. In APLAS'05 (2005), vol. 3780 of LNCS, Springer, pp. 52–68.

[12] Bjørner, N., and Hendrix, J. Linear functional fixed-points. In CAV’09 [1], pp. 124–139.

[13] Börger, E., Grädel, E., and Gurevich, Y. The Classical Decision Problem. Springer, 2001.

[14] Botinčan, M., Parkinson, M., and Schulte, W. Separation logic verification of C programs with an SMT solver. ENTCS 254 (2009), 5–23.

[15] Bouajjani, A., Drăgoi, C., Enea, C., and Sighireanu, M. A logic-based framework for reasoning about composite data structures. In CONCUR'09 (2009), vol. 5710 of LNCS, Springer, pp. 178–195.

[16] Bouajjani, A., Drăgoi, C., Enea, C., and Sighireanu, M. On inter-procedural analysis of programs with lists and data. In PLDI'11 (2011), ACM, pp. 578–589.

[17] Bozga, M., Iosif, R., and Perarnau, S. Quantitative separation logic and programs with lists. In IJCAR’08 (2008), vol. 5195 of LNCS, Springer, pp. 34–49.

[18] Bradley, A. R., and Manna, Z. The Calculus of Computation. Springer, 2007.

[19] Bradley, A. R., Manna, Z., and Sipma, H. B. What’s decidable about arrays? In VMCAI’06 (2006), vol. 3855 of LNCS, Springer, pp. 427–442.

[20] Büchi, J. R. Weak second-order arithmetic and finite automata. Z. Math. Logik Grundl. Math. 6 (1960), 66–92.

[21] Calcagno, C., Yang, H., and O’Hearn, P. W. Computability and complexity results for a spatial assertion language for data structures. In FSTTCS (2001), vol. 2245 of LNCS, Springer, pp. 108–119.

[22] Chang, B.-Y. E., and Rival, X. Relational inductive shape analysis. In POPL’08 (2008), ACM, pp. 247–260.

[23] Chin, W.-N., David, C., Nguyen, H. H., and Qin, S. Automated verification of shape, size and bag properties via user-defined predicates in separation logic. Science of Computer Programming 77, 9 (2012), 1006–1036.

[24] Chlipala, A. Mostly-automated verification of low-level programs in computational separation logic. In PLDI’11 (2011), ACM, pp. 234–245.

[25] Cimatti, A., Griggio, A., Schaafsma, B. J., and Sebastiani, R. The MathSAT5 SMT solver. In TACAS (2013), vol. 7795 of LNCS, Springer, pp. 93–107.

[26] Cohen, E., Dahlweid, M., Hillebrand, M. A., Leinenbach, D., Moskal, M., Santen, T., Schulte, W., and Tobies, S. VCC: A practical system for verifying concurrent C. In TPHOLs’09 (2009), vol. 5674 of LNCS, Springer, pp. 23–42.

[27] Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. Introduction to Algorithms, third ed. MIT Press, 2009.

[28] de Moura, L. M., and Bjørner, N. Z3: An efficient SMT solver. In TACAS’08 (2008), vol. 4963 of LNCS, Springer, pp. 337–340.

[29] Detlefs, D., Nelson, G., and Saxe, J. B. Simplify: a theorem prover for program checking. J. ACM 52, 3 (2005), 365–473.

[30] Distefano, D., O’Hearn, P. W., and Yang, H. A local shape analysis based on separation logic. In TACAS’06 (2006), vol. 3920 of LNCS, Springer, pp. 287–302.

[31] Doner, J. Tree acceptors and some of their applications. Journal of Computer and System Sciences 4, 5 (1970), 406 – 451.

[32] Dutertre, B., and de Moura, L. The Yices SMT solver. Tech. rep., SRI International, 2006.

[33] Elgot, C. C. Decision problems of finite automata design and related arithmetics. Trans. AMS 98 (1961), 21–52.

[34] Engelfriet, J. Context-free graph grammars. In Handbook of Formal Languages, vol. 3. Springer, 1997, pp. 125–214.

[35] Flanagan, C., Leino, K. R. M., Lillibridge, M., Nelson, G., Saxe, J. B., and Stata, R. Extended static checking for Java. In PLDI’02 (2002), ACM, pp. 234–245.

[36] Ge, Y., and de Moura, L. M. Complete instantiation for quantified formulas in satisfiability modulo theories. In CAV [1], pp. 306–320.

[37] Godefroid, P., Klarlund, N., and Sen, K. DART: directed automated random testing. In PLDI'05 (2005), ACM, pp. 213–223.

[38] Habermehl, P., Iosif, R., and Vojnar, T. Automata-based verification of programs with tree updates. Acta Informatica 47, 1 (2010), 1–31.

[39] Hubert, T., and Marché, C. A case study of C source code verification: the Schorr-Waite algorithm. In SEFM'05 (2005), IEEE-CS, pp. 190–199.

[40] INRIA. The Coq Proof Assistant Reference Manual. Available at http://coq.inria.fr/.

[41] Jacobs, B., Smans, J., Philippaerts, P., Vogels, F., Penninckx, W., and Piessens, F. VeriFast: A powerful, sound, predictable, fast verifier for C and Java. In NFM'11 (2011), vol. 6617 of LNCS, Springer, pp. 41–55.

[42] Klarlund, N., and Møller, A. MONA. BRICS, Department of Computer Science, Aarhus University, January 2001. Available from http://www.brics.dk/mona/.

[43] Klarlund, N., and Schwartzbach, M. I. Graph types. In POPL’93 (1993), ACM, pp. 196–205.

[44] Kuncak, V. Modular Data Structure Verification. PhD thesis, Massachusetts Institute of Technology, 2007.

[45] Kuncak, V., Piskac, R., and Suter, P. Ordered sets in the calculus of data structures. In CSL’10 (2010), vol. 6247 of LNCS, Springer, pp. 34–48.

[46] Lahiri, S., and Qadeer, S. Back to the future: revisiting precise program verification using SMT solvers. In POPL’08 (2008), ACM, pp. 171–182.

[47] Leino, K. R. M. Dafny: An automatic program verifier for functional correctness. In LPAR-16 (2010), vol. 6355 of LNCS, Springer, pp. 348–370.

[48] Lev-Ami, T., and Sagiv, S. TVLA: A system for implementing static analyses. In SAS'00 (2000), vol. 1824 of LNCS, Springer, pp. 280–301.

[49] Loginov, A., Reps, T. W., and Sagiv, M. Automated verification of the Deutsch-Schorr-Waite tree-traversal algorithm. In SAS’06 (2006), vol. 4134 of LNCS, Springer, pp. 261–279.

[50] Madhusudan, P., Parlato, G., and Qiu, X. Decidable logics combining heap structures and data. In POPL'11 (2011), ACM, pp. 611–622.

[51] Madhusudan, P., and Qiu, X. Efficient decision procedures for heaps using STRAND. In SAS’11 (2011), vol. 6887 of LNCS, Springer, pp. 43–59.

[52] Madhusudan, P., Qiu, X., and Ştefănescu, A. Recursive proofs for inductive tree data-structures. In POPL'12 (2012), ACM, pp. 123–136.

[53] Magill, S., Tsai, M.-H., Lee, P., and Tsay, Y.-K. THOR: A tool for reasoning about shape and arithmetic. In CAV’08 (2008), vol. 5123 of LNCS, Springer, pp. 428–432.

[54] Mai, H., Pek, E., Xue, H., King, S. T., and Madhusudan, P. Verifying security invariants in ExpressOS. In ASPLOS’13 (2013), ACM, pp. 293–304.

[55] McPeak, S., and Necula, G. C. Data structure specifications via local equality axioms. In CAV’05 (2005), vol. 3576 of LNCS, Springer, pp. 476–490.

[56] Meyer, A. R. Weak monadic second order theory of successor is not elementary-recursive. In Logic Colloquium (1975), vol. 453 of Lecture Notes in Mathematics, Springer, pp. 132–154.

[57] Minsky, M. L. Computation: finite and infinite machines. Prentice-Hall, Inc., 1967.

[58] Møller, A., and Schwartzbach, M. I. The pointer assertion logic engine. In PLDI’01 (2001), ACM, pp. 221–231.

[59] Nelson, G. Verifying reachability invariants of linked structures. In POPL’83 (1983), ACM, pp. 38–47.

[60] Nelson, G., and Oppen, D. C. Simplification by cooperating decision procedures. ACM Trans. Program. Lang. Syst. 1, 2 (1979), 245–257.

[61] Nieuwenhuis, R., Oliveras, A., and Tinelli, C. Solving SAT and SAT Modulo Theories: From an abstract Davis–Putnam–Logemann–Loveland procedure to DPLL(T). J. ACM 53, 6 (2006), 937–977.

[62] Nipkow, T., Paulson, L. C., and Wenzel, M. Isabelle/HOL - A Proof Assistant for Higher-Order Logic, vol. 2283 of LNCS. Springer, 2002.

[63] O'Hearn, P. W., Reynolds, J. C., and Yang, H. Local reasoning about programs that alter data structures. In CSL'01 (2001), vol. 2142 of LNCS, Springer, pp. 1–19.

[64] Qiu, X., Garg, P., Ştefănescu, A., and Madhusudan, P. Natural proofs for structure, data, and separation. In PLDI'13 (2013), ACM, pp. 231–242.

[65] Rakamarić, Z., Bingham, J. D., and Hu, A. J. An inference-rule-based decision procedure for verification of heap-manipulating programs with mutable data and cyclic data structures. In VMCAI'07 (2007), vol. 4349 of LNCS, Springer, pp. 106–121.

[66] Rakamarić, Z., Bruttomesso, R., Hu, A. J., and Cimatti, A. Verifying heap-manipulating programs in an SMT framework. In ATVA'07 (2007), vol. 4762 of LNCS, Springer, pp. 237–252.

[67] Ranise, S., and Zarba, C. A theory of singly-linked lists and its extensible decision procedure. In SEFM'06 (2006), IEEE-CS, pp. 206–215.

[68] Reynolds, J. Separation logic: a logic for shared mutable data struc- tures. In LICS’02 (2002), IEEE-CS, pp. 55–74.

[69] Roşu, G., Ellison, C., and Schulte, W. Matching logic: An alternative to Hoare/Floyd logic. In AMAST'10 (2010), vol. 6486 of LNCS, Springer, pp. 142–162.

[70] Shostak, R. E. Deciding combinations of theories. J. ACM 31, 1 (1984), 1–12.

[71] Smans, J., Jacobs, B., and Piessens, F. Implicit dynamic frames. ACM Trans. Program. Lang. Syst. 34, 1 (2012), 2:1–2:58.

[72] Suter, P., Dotta, M., and Kuncak, V. Decision procedures for algebraic data types with abstractions. In POPL’10 (2010), ACM, pp. 199–210.

[73] Suter, P., Köksal, A. S., and Kuncak, V. Satisfiability modulo recursive programs. In SAS'11 (2011), vol. 6887 of LNCS, Springer, pp. 298–315.

[74] Thatcher, J. W., and Wright, J. B. Generalized finite automata theory with an application to a decision problem of second-order logic. Mathematical Systems Theory 2, 1 (1968), 57–81.

[75] Thomas, W. Languages, automata, and logic. In Handbook of Formal Languages. Springer, 1997, pp. 389–456.

[76] Yang, H. Local Reasoning for Stateful Programs. PhD thesis, University of Illinois at Urbana-Champaign, 2001.

[77] Yang, H., Lee, O., Berdine, J., Calcagno, C., Cook, B., Distefano, D., and O'Hearn, P. W. Scalable shape analysis for systems code. In CAV'08 (2008), vol. 5123 of LNCS, Springer, pp. 385–398.

[78] Yang, J., and Hawblitzel, C. Safe to the last instruction: automated verification of a type-safe operating system. In PLDI'10 (2010), ACM, pp. 99–110.

[79] Yorsh, G., Rabinovich, A. M., Sagiv, M., Meyer, A., and Bouajjani, A. A logic of reachable patterns in linked data-structures. In FoSSaCS’06 (2006), vol. 3921 of LNCS, Springer, pp. 94–110.

[80] Yorsh, G., Reps, T. W., and Sagiv, M. Symbolically computing most-precise abstract operations for shape analysis. In TACAS'04 (2004), vol. 2988 of LNCS, Springer, pp. 530–545.

[81] Zee, K., Kuncak, V., and Rinard, M. C. Full functional verifica- tion of linked data structures. In PLDI’08 (2008), ACM, pp. 349–361.

[82] Zee, K., Kuncak, V., and Rinard, M. C. An integrated proof language for imperative programs. In PLDI'09 (2009), ACM, pp. 338–351.
