DISS. ETH NO. 22795

Narrowing the gap between verification and systematic testing

A thesis submitted to attain the degree of

Doctor of sciences of ETH Zurich (Dr. sc. ETH Zurich)

Presented by

Maria Christakis
Dipl. Electrical and Computer Engineering, National Technical University of Athens, Greece

Born on October 9, 1986
Citizen of Greece

Accepted on the recommendation of

Prof. Peter Müller, examiner
Prof. Cristian Cadar, co-examiner
Dr. Patrice Godefroid, co-examiner
Prof. Thomas Gross, co-examiner

2015

To my parents, Thekla and Yannis, in deepest love and gratitude.

Abstract

This dissertation focuses on narrowing the gap between verification and systematic testing, in two directions: (1) by complementing verification with systematic testing, and (2) by pushing systematic testing toward reaching verification.

In the first direction, we explore how to effectively combine unsound static analysis with systematic testing, so as to guide test generation toward properties that have not been checked (soundly) by a static analyzer. Our combination significantly reduces the test effort while checking more unverified properties. In the process of testing properties that are typically not soundly verified by static analyzers, we identify important limitations in existing test generation tools, with respect to considering sufficient oracles and all factors that affect the outcome of these oracles. Specifically, testing tools that use object invariants as filters on input data do not thoroughly check whether these invariants are actually maintained by the unit under test. Moreover, existing test generation tools ignore the potential interaction of static state with a unit under test. We address these issues by presenting novel techniques based on static analysis and dynamic symbolic execution. Our techniques detected a significant number of errors that were previously missed by existing tools. We also investigate in which ways it is beneficial to complement sound, interactive verification with systematic testing for the properties that have not been verified yet.

In the second direction, we push systematic testing toward verification, in particular, toward proving memory safety of the ANI Windows image parser. This is achieved by concentrating on the main limitations of systematic dynamic test generation, namely, imperfect symbolic execution and path explosion, in the context of this parser. Based on insights from attempting to reach verification of the ANI parser using only systematic testing, we then define IC-Cut, a new compositional search strategy for automatically and dynamically discovering simple function interfaces, where large programs can be effectively decomposed. IC-Cut preserves code coverage and increases bug finding in significantly less exploration time compared to the current search strategy of the dynamic symbolic execution tool SAGE. An additional novelty of IC-Cut is that it can identify which decomposed program units are exhaustively tested and, thus, dynamically verified.

Résumé

Cette thèse s’efforce de combler le fossé qui existe aujourd’hui entre vérification et test systématique en explorant deux axes : d’une part, en complétant la vérification par des tests systématiques, et d’autre part, en repoussant les limites des tests systématiques pour obtenir des garanties proches de celles obtenues par la vérification.

Dans un premier temps, nous étudions l’interaction des analyses statiques non-sûres avec le test systématique ; nous montrons comment orienter la génération de tests de manière à couvrir des propriétés qui n’ont pas été vérifiées (de manière sûre) par l’analyseur statique. Notre technique combinant les deux approches réduit de manière significative les efforts consacrés au test, tout en vérifiant davantage de propriétés. Lorsqu’il s’agit de tester les propriétés qui ne sont généralement pas vérifiées de manière correcte par les analyseurs statiques, nous mettons en exergue d’importantes limitations des outils de génération de tests préexistants ; en particulier, nous soulignons l’importance du choix d’un oracle correct, ainsi que l’importance de prendre en compte tous les facteurs susceptibles d’influencer les résultats de l’oracle. Plus précisément, les outils de test qui se basent sur des invariants d’objets pour filtrer leurs entrées ne vérifient généralement pas que ces invariants sont bel et bien maintenus dans le module présentement soumis aux tests. De plus, les outils actuels de génération de tests ne prennent pas en compte l’interaction entre l’état global du programme et le comportement du module testé. Nous proposons une réponse à ces problèmes, et présentons des techniques novatrices basées sur l’analyse statique et l’exécution symbolique dynamique. Nos techniques ont réussi à identifier un nombre tout à fait significatif d’erreurs qui n’avaient jusqu’alors jamais été décelées par les outils existants. Nous explorons également différents contextes où il peut s’avérer utile de compléter une phase de vérification interactive sûre par du test systématique visant les propriétés qui n’ont pas encore été vérifiées.

Dans un second temps, nous rapprochons le test systématique de la vérification, plus particulièrement, en essayant de prouver l’absence d’erreurs mémoires dans l’analyseur syntaxique d’images Windows ANI. Ce résultat est obtenu en se focalisant sur les limitations fondamentales de la génération systématique de tests dynamiques, à savoir, dans le cas de cet analyseur syntaxique, une exécution symbolique imparfaite et une explosion combinatoire. Ce que nous avons appris en vérifiant l’analyseur syntaxique ANI, et ce uniquement à l’aide de test systématique, a été mis à profit dans le projet IC-Cut : IC-Cut est une nouvelle procédure de recherche pour le test systématique, qui opère de manière compositionnelle, et permet de découvrir les interfaces simples des fonctions existantes. Il est ainsi possible de décomposer de larges programmes. Comparé à la stratégie de recherche de l’outil SAGE (exécution symbolique dynamique), IC-Cut garde la même surface de code couvert, tout en consacrant beaucoup moins de temps à l’exploration. Un apport supplémentaire d’IC-Cut est qu’il permet d’identifier lesquels des modules ont été testés de manière exhaustive, c’est-à-dire, vérifiés de manière dynamique.

Acknowledgments

I would like to acknowledge:

Peter Müller, my supervisor, who generously shared his time and expertise with me, challenged me to do better, let me be stubborn, and stuck with me through all the twists and turns. Peter, thank you for your sage advice, infinite patience, excellent guidance, good humor, and the life-changing Skype call on Tuesday, May 10, 2011. I hope that our friendship and collaboration continue to grow even stronger in time.

Patrice Godefroid, my internship mentor, whose undivided commitment to research I admire. Patrice, I become better just watching you attack a problem. Thank you for your unlimited energy, sharp feedback, immense help, earnest attention, and the enviable opportunity to work with you.

The co-examiners of my dissertation, Cristian Cadar, Patrice Godefroid, and Thomas Gross, who took the time out of their busy schedules to be on my defense committee and provided many insightful comments and suggestions on this dissertation.

Valentin Wüstholz, my friend and collaborator, who not only is the inventor of free calls, but also knows how to pick wine. Valentin, thank you for your tolerance and moral support, restaurant suggestions, write effects, abstractions, visitors, and debuggers.

Marlies Weissert, who makes the office a magical place. Marlies, thank you for your tremendous assistance, lovely smile, and invaluable fashion tips.

Malte Schwerhoff (or Malticz), my friend and end-of-week coffee buddy. Dimitar Asenov (or professional Mitko), my favorite IT coordinator. Lucas Brutschy (or Lucky Luke), my after-lunch coffee buddy. Milos Novacek, for not hesitating to disclose the secrets of the heap. Uri Juhasz, for subsidization.

The members of our group, Cédric Favre, Pietro Ferrara, Yannis Kassios, and Alex Summers. I am grateful for our years together and your reassurance when times were tough.

My students, Patrick Emmisberger, Patrick Spettel, Timon Gehr, and David Rohr. Thank you for your fresh ideas and for turning them into code. Patrick Emmisberger, thank you for introducing me to the delicious world of cordon bleu.

Denise Spicher, for her keen help, attentive answers to all my questions, and always-pleasant mood.

Everyone at Research Redmond. Thank you for numerous discussions, your active encouragement, and warmest welcome. Jonathan Protzenko, thank you for the elegant translation of the abstract of this dissertation into French.

My parents, Thekla and Yannis, to whom I am forever grateful. You did everything more than right, and I would be nowhere without you.

Thank You.

Contents

1 Introduction 1
  1.1 Systematic dynamic test generation 2
  1.2 Combining verification and systematic testing 5
  1.3 Pushing systematic testing toward verification 12

I Complementing verification with systematic testing 15

2 Guiding test generation to unverified program executions 17
  2.1 Approach 19
    2.1.1 Verification annotations 19
    2.1.2 Guiding dynamic symbolic execution 21
  2.2 Condition inference 24
    2.2.1 Abstraction 24
    2.2.2 May-unverified conditions 27
    2.2.3 Must-unverified conditions 28
    2.2.4 Combined instrumentation 31
  2.3 Experimental evaluation 31
    2.3.1 Implementation 31
    2.3.2 Experiments 32
  2.4 Related work 40
  2.5 Summary and remarks 42

3 Synthesizing tests to detect object invariant violations 43
  3.1 Violating object invariants 46
  3.2 Approach 48
  3.3 Candidate operations 49
  3.4 Synthesis templates 51
    3.4.1 Direct field updates 52
    3.4.2 Subclassing 53
    3.4.3 Multi-object invariants 55
  3.5 Implementation 60
    3.5.1 Runtime checks 60
    3.5.2 Static analysis 60
    3.5.3 Heuristics and optimizations 62
    3.5.4 Object construction 62
  3.6 Experimental evaluation 63
    3.6.1 Testing programmer-provided object invariants 63
    3.6.2 Fixing violated object invariants 67
    3.6.3 Testing inferred object invariants 68
  3.7 Testing auxiliary methods 71
  3.8 Related work 74
  3.9 Summary and remarks 75

4 Dynamic test generation with static fields and initializers 77
  4.1 Static fields as input 79
  4.2 Initialization with precise semantics 81
    4.2.1 Controlling initialization 82
    4.2.2 Dynamic symbolic execution 84
  4.3 Initialization with before-field-init semantics 88
  4.4 Experimental evaluation 92
  4.5 Related work 94
  4.6 Summary and remarks 95

5 Delfy: Dynamic test generation for Dafny 97
  5.1 Dafny 98
  5.2 Dynamic test generation for Dafny 98
    5.2.1 Handling input-dependent loops 100
  5.3 Complementing verification with testing 104
    5.3.1 Assigning confidence to verification errors 107
  5.4 Complementing testing with verification 108
  5.5 Incorporating Delfy into the Dafny IDE 109
  5.6 Comparing Delfy to BVD 112
  5.7 Related work 114
  5.8 Summary and remarks 115

II Pushing systematic testing toward verification 117

6 Toward proving memory safety of the ANI image parser 119
  6.1 Compositional symbolic execution 121
  6.2 Proving memory safety 123
    6.2.1 Defining memory safety 123
    6.2.2 Proving attacker memory safety compositionally 124
    6.2.3 Verification with SAGE and MicroX 126
  6.3 The ANI Windows parser 127
  6.4 Approach and verification results 129
    6.4.1 Stage 1: Bottom-up strategy 131
    6.4.2 Stage 2: Input-dependent loops 132
    6.4.3 Stage 3: Top-down strategy 135
  6.5 Memory-safety bugs 138
  6.6 Challenges 141
  6.7 Related work 141
  6.8 Summary and remarks 144

7 IC-Cut: A search strategy for dynamic test generation 147
  7.1 The IC-Cut search strategy 148
    7.1.1 Algorithm 148
    7.1.2 Function summaries 151
    7.1.3 Input-dependent loops 155
    7.1.4 Correctness 156
    7.1.5 Limitation: Search redundancies 157
  7.2 Experimental evaluation 157
  7.3 Related work 162
  7.4 Summary and remarks 164

8 Conclusion and future work 165

References 169

A Methods under test 181

Algorithms

1.1 Systematic dynamic test generation 3
3.1 Synthesis of parameterized unit tests 49
3.2 Synthesis of parameterized unit tests from the direct-field-update template 53
3.3 Synthesis of parameterized unit tests from the subclassing template 54
3.4 Synthesis of parameterized unit tests from the leaking and capturing templates 59
4.1 Dynamic symbolic execution for exploring the interactions of a unit under test with static state 84
7.1 The IC-Cut search algorithm 149

Figures

1.1 Sound verification and systematic testing approximate the set of possible program executions 5
1.2 Existing work in the combination of unsound verification and systematic testing 6
1.3 Our technique for the combination of unsound verification and systematic testing 7
1.4 Tool architecture for complementing static verification with systematic testing 8
1.5 Assessing whether it is realistic to push systematic testing to cover all possible program executions 13

2.1 A C# example illustrating partial verification results 18
2.2 The instrumented version of method Deposit from Fig. 2.1 20
2.3 The abstraction of method Deposit from Fig. 2.2 26
2.4 The abstraction of a variant of method Deposit from Fig. 2.2 30
2.5 The tests generated by each configuration 34
2.6 Change in total number of tests generated for each of the 101 methods by configuration MAY in comparison to PV 35
2.7 Testing time for each configuration 36
2.8 The exploration bounds reached by each configuration 38

3.1 A C# example on invariant violations 47
3.2 A C# example on selecting candidate operations 50
3.3 Parameterized unit tests of length one synthesized with the exhaustive enumeration 51
3.4 Parameterized unit test of length two synthesized with the exhaustive enumeration 52
3.5 The direct-field-update template 52
3.6 Parameterized unit test synthesized with the direct-field-update template 53
3.7 The subclassing template 54
3.8 Parameterized unit test synthesized with the subclassing template 55
3.9 The leaking template 56
3.10 Parameterized unit test synthesized with the leaking template 56
3.11 The capturing templates 57
3.12 Parameterized unit test synthesized with the capturing template 58
3.13 Complex parameterized unit test synthesized by recursively applying our synthesis technique 59
3.14 Synthesized parameterized unit test that reveals an invariant violation in LoveStudio 66
3.15 An example of the call-back scenario 73
3.16 The call-back template 73

4.1 A C# method accessing static state 80
4.2 A C# example illustrating the treatment of static initializers with precise semantics 86
4.3 A C# example illustrating the non-determinism introduced by static initializers with before-field-init semantics 89

5.1 A Dafny example on how Delfy handles input-dependent loops 100
5.2 An alternative implementation of method Mul from Fig. 5.1 illustrating how Delfy complements the verifier 105
5.3 A smart tag allowing the user to invoke Delfy on a method under test, and a verification error emitted by the verifier 109
5.4 Delfy displays the generated tests and highlights the proven assertions in green 110
5.5 Delfy shows coverage information for a unit under test 111
5.6 Verification counterexamples emitted by the verifier for a loop-invariant violation 111
5.7 Delfy generates tests from the verification counterexamples of Fig. 5.6 112
5.8 Delfy enables debugging of a verification error by running the failing tests, generated for the error, in the .NET debugger 112
5.9 A debugging session of a failing test, during the progress of which a variable is being watched 113

6.1 The ANI file format 128
6.2 The high-level call graph of the ANI parser 128
6.3 The call graph of the 47 user32.dll functions implementing the ANI parser core 130
6.4 The input-dependent loops in function CreateAniIcon 133
6.5 The number of paths and their exploration time in function LoadCursorIconFromFileMap versus the input file size 138

7.1 The instructions of the ANI parser that are covered by each configuration 159
7.2 The time it takes for each configuration to stop exploring the ANI parser 160
7.3 The number of unique exceptions that are detected by each configuration 160
7.4 How many functions are explored by the winner-configuration E for varying values of the maximum runtime 162

Tables

3.1 Brief description of the applications in which our technique detected invariant violations 64
3.2 Summary of results for testing programmer-provided object invariants 65
3.3 Summary of results for testing inferred object invariants 70
3.4 Categorization of inferred and rejected invariant clauses based on the properties they specify 71

4.1 Summary of our experiments 93

6.1 The number of paths in the two input-dependent loops of function CreateAniIcon changes when fixing the loop bounds to different values 134
6.2 All the input-dependent loop bounds that were fixed during the attempt to verify the ANI parser 134

7.1 All configurations used in our experiments 159
7.2 Performance of the winner-configuration E for varying values of the maximum runtime 161

Chapter 1

Introduction

On May 1, 2015, the Federal Aviation Administration published a new airworthiness directive [60] for all The Boeing Company Model 787 airplanes, which requires a repetitive maintenance task for electrical power deactivation. The directive was prompted by a software error, detected during laboratory testing, that causes the electricity generators of a Boeing 787 to shut down every 248 days. This issue, which can arise even in flight, is the consequence of an arithmetic overflow that occurs when incrementing an integer counter over 248 days of continuous power. Although it is expected that such issues should occur, especially since it is standard procedure to reboot flight-control software systems very frequently, it is surprising that this error was discovered so late in the development process.

Software systems are ubiquitous in modern, everyday life. Their robustness and reliability are, therefore, vital in our society and constitute a primary goal of the computer-science community. Formal verification and automated, systematic testing are two fundamental research areas of computer science, aiming to ensure software correctness and identify issues like the aforementioned overflow as early as possible.

Verification has been studied for approximately five decades and is increasingly applied in industrial software development to detect errors. Verification techniques allow developers to optionally define a set of desired properties that a program is meant to satisfy, and then prove that the program is actually correct, while complying with these properties. So far, verification tools have been so effective in detecting errors in real-world programs that they are increasingly and routinely used in many software development organizations. In fact, there is a wide variety of such tools, targeting mainstream programming languages and ranging from relatively simple heuristic tools, over abstract interpreters and software model checkers, to verifiers based on automatic theorem proving.

Over the last ten years, there has been revived interest in systematic testing, and in particular, in testing techniques that rely on symbolic execution [87], introduced more than three decades ago [24, 23, 68, 22].

Recent significant advances in constraint satisfiability and the scalability of simultaneous concrete and symbolic executions have brought systematic dynamic test generation to the spotlight, especially due to its ability to achieve high code coverage and detect errors deep in large and complex programs. As a result, dynamic test generation is having a major impact on many research areas of computer science, for instance, on software engineering, security, computer systems, debugging and repair, networks, education, and others.

In light of the above background and observations, this dissertation focuses on narrowing the existing gap between verification and systematic testing, in two directions. In the first direction, we complement verification with systematic testing, to maximize software quality while reducing the test effort. In particular, we precisely define the correctness guarantees that verifiers provide, such that they can be effectively compensated for by dynamic test generation. At the same time, we enhance systematic testing techniques with better oracles, and enable these techniques to consider factors that affect the outcome of such oracles but were previously ignored. This research direction enables the detection of more software errors, earlier in the development process, and with fewer resources. In the second direction, we explore how far systematic testing can be pushed toward reaching verification of real applications. Specifically, we assess to what extent the idea of reaching verification with systematic testing is realistic, in the scope of a particular application domain. This research direction sheds light on the potential of dynamic test generation in ensuring software correctness.

Outline. This chapter is organized as follows. In Sect. 1.1, we give an overview of the basic principles of systematic dynamic test generation. Sects. 1.2 and 1.3 introduce and motivate the two research directions of this dissertation, namely, complementing verification with systematic testing, and pushing systematic testing toward verification.

1.1 Systematic dynamic test generation

We consider a sequential deterministic program P, which is composed of a set of functions (or methods) and takes as input an input vector, that is, multiple input values. The determinism of the program guarantees that running P with the same input vector leads to the same program execution.

We can systematically explore the state space of program P using systematic dynamic test generation [70, 20], also called concolic testing [128]. Systematic dynamic test generation performs dynamic symbolic execution, which consists of repeatedly running a program both concretely and symbolically. The goal is to collect symbolic constraints on inputs, from predicates in branch statements along the execution, and then to infer variants of the previous inputs, using a constraint solver, in order to steer the next execution of the program toward an alternative program path. In this dissertation, we use the terms “dynamic test generation” and “dynamic symbolic execution” interchangeably.

Algorithm 1.1: Systematic dynamic test generation.

 1 function Explore(seq prefix)
 2   inputs ← Solve(prefix)
 3   if solution is available then
 4     path ← ExecuteConcrete(inputs)
 5     pathConstraint ← ExecuteSymbolic(path)
 6     extension ← pathConstraint[|prefix|...]
 7     foreach non-empty prefix p of extension do
 8       if p = p′ ◦ [Branch(c)] for some p′, c then
 9         Explore(prefix ◦ p′ ◦ [Branch(¬c)])

Symbolic execution means executing a program with symbolic rather than concrete values. A symbolic variable is, therefore, associated with each value in the input vector, and every constraint is on such symbolic variables. Assignment statements are represented as functions of their (symbolic) arguments, while conditional statements are expressed as constraints on symbolic values. Side-by-side concrete and symbolic executions are performed using a concrete store M and a symbolic store S, which are mappings from memory addresses (where program variables are stored) to concrete and symbolic values, respectively. For a program path w, a path constraint (or path condition) φw is a logic formula that characterizes the input values for which the program executes along w. Each symbolic variable appearing in φw is, thus, a program input. Each constraint is expressed in some theory T (a theory is a set of logic formulas) decided by a constraint solver, i.e., an automated theorem prover that can return a satisfying assignment for all variables appearing in constraints it proves satisfiable. This assignment drives execution along the desired path w.

Alg. 1.1 [115] is a general algorithm for systematic dynamic test generation. Function Explore explores all program paths whose path constraints start with a given sequence of constraints, which we call prefix. Consequently, testing of a program starts by a call to Explore([]). On line 2, we obtain concrete values for the program inputs by solving the constraints in the prefix. For an empty prefix, the inputs are generated randomly. On lines 4 and 5, we run the program with these inputs both concretely and symbolically to generate the path constraint for this execution. On line 6, we obtain the extension of the generated path constraint, that is, the path constraint without the given prefix. For each constraint in the extension that indicates a branching condition Branch(c), as opposed to an assumed condition (explained below), we call function Explore recursively to exercise the alternative program path for this branch (lines 7–9). The order in which we negate the branching conditions of the extension is imposed by the exploration strategy, for instance, by a depth- or breadth-first strategy.

When assumed conditions, of the form Assume(c), appear in a path constraint, they denote invariants of the corresponding execution, for instance, due to a program precondition. If an execution reaches a program point that normally contributes such an assumed condition to the path constraint but ¬c holds at that point, the execution is aborted and Assume(c) is added to the path constraint for subsequent explorations. This condition c is never negated.

Note that, on line 2 of Alg. 1.1, the constraint solver might return “unknown”, which is a potential outcome for solvers. In this case, we simplify the prefix by replacing a symbolic variable with its concrete value from the parent run of function Explore, and call the solver again. This simplification can be encoded with an assumed condition stating that the symbolic variable is equal to the corresponding concrete value, but might result in missing certain program paths.

Whitebox fuzzing is an application of systematic dynamic test generation for detecting security vulnerabilities. In particular, whitebox file fuzzing explores programs that take as input a file, all bytes of which constitute the input vector of the program. SAGE [73] is the first whitebox file fuzzing tool for security testing, which implements systematic dynamic test generation and performs dynamic symbolic execution at the x86 binary level. It is optimized to scale to very large execution traces (billions of x86 instructions) and programs (like Excel) [16]. Note that, throughout this dissertation, we always use the term “dynamic test generation” or “dynamic symbolic execution” to denote the exploration of Alg. 1.1 and its applications.

All program paths in program P can be enumerated by a search algorithm, like Alg. 1.1, that explores all possible branches at conditional statements. The paths w for which φw is satisfiable are feasible, and are the only ones that can be executed by the actual program provided the solutions to φw characterize exactly the inputs that drive the program through w. Assuming that the constraint solver used to check the satisfiability of all formulas φw is sound and complete, this use of symbolic execution for programs with finitely many paths amounts to program verification. Note that, in this dissertation, we use the terms “soundness” and “completeness” to refer to the absence of false negatives and false positives, respectively.

Obviously, testing and symbolically executing all feasible program paths is not possible for large programs. Indeed, the number of feasible paths can be exponential in the program size, or even infinite in the presence of loops with an unbounded number of iterations. In practice, this path explosion is alleviated using heuristics to maximize code coverage as quickly as possible and find bugs faster in a partial search. For instance, SAGE uses a generational-search strategy [73], where all constraints in a path constraint are negated one by one (using the constraint solver [47]) to maximize the number of new tests generated per symbolic execution. This strategy is combined with simple heuristics that guide the exploration toward least covered parts of the search space and prune the search space using flip count limits and constraint subsumption. Other related industrial-strength tools like Pex [135] use similar techniques.
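To make lines 7–9 of Alg. 1.1 concrete, the following C# sketch enumerates the child prefixes derived from the path constraint of a single run, which is also the core step of a generational search: every branching condition beyond the current prefix is negated once, while assumed conditions are left untouched. This is an illustrative toy under simplifying assumptions, not code from SAGE or Pex; the names Condition, ChildPrefixes, and PrefixGeneration are ours, predicates are plain strings instead of solver formulas, and constraint solving and program re-execution (lines 2, 4, and 5 of Alg. 1.1) are omitted.

using System;
using System.Collections.Generic;
using System.Linq;

enum Kind { Branch, Assume }

sealed class Condition
{
    public readonly Kind Kind;
    public readonly string Predicate;

    public Condition(Kind kind, string predicate) { Kind = kind; Predicate = predicate; }

    public Condition Negated() { return new Condition(Kind, "!(" + Predicate + ")"); }
}

static class PrefixGeneration
{
    // Child prefixes derived from one path constraint (Alg. 1.1, lines 7-9): keep the
    // conditions before each branching condition and negate that condition; assumed
    // conditions are never negated.
    public static IEnumerable<List<Condition>> ChildPrefixes(
        IList<Condition> pathConstraint, int prefixLength)
    {
        for (int i = prefixLength; i < pathConstraint.Count; i++)
        {
            if (pathConstraint[i].Kind != Kind.Branch) continue;
            var child = pathConstraint.Take(i).ToList();
            child.Add(pathConstraint[i].Negated());
            yield return child;
        }
    }

    static void Main()
    {
        // Hypothetical path constraint of one concrete run; real tools build solver
        // formulas over the symbolic inputs rather than strings.
        var pc = new List<Condition>
        {
            new Condition(Kind.Branch, "0 < amount && amount <= 50000"),
            new Condition(Kind.Assume, "noOverflowAdd(balance, amount)"),
            new Condition(Kind.Branch, "balance > 10000")
        };
        foreach (var prefix in ChildPrefixes(pc, 0))
            Console.WriteLine(string.Join(" && ", prefix.Select(c => c.Predicate)));
    }
}

Solving each printed prefix with a constraint solver would yield the inputs for the next recursive call to Explore, exactly as on line 9 of Alg. 1.1.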

Figure 1.1: Sound verification over-approximates the set of all possible program executions, whereas systematic testing typically under-approximates this set.

1.2 Combining verification and systematic testing

It is established that sound verification over-approximates the set of possible program executions, as shown in Fig. 1.1, in order to prove the absence of errors in a program. On the other hand, systematic testing typically under-approximates the set of possible program executions with the purpose of proving the existence of errors in the program [75].

In practice, modern software projects use a variety of techniques to detect program errors, such as testing, code reviews, and static program analysis [82], none of which check all possible executions of a program. They often leave entire paths unverified (for instance, when a test suite does not achieve full path coverage), fail to verify certain properties (such as complex assertions), or verify some paths under assumptions (such as the absence of arithmetic overflow) that might not hold on all executions of the path. Making such assumptions is necessary in code reviews to reduce the complexity of the task; it is also customary in static program analysis to improve the precision, performance, and modularity of the analysis [32], and because some program features elude static checking [106]. That is, most static analyses sacrifice soundness in favor of other important qualities.

Although static analyzers effectively detect software errors, they cannot replace or significantly reduce the test effort. Many practical analyzers for mainstream programming languages make a number of compromises to increase automation, reduce the annotation overhead for the programmer, reduce the number of false positives, and speed up the analysis. These compromises include not checking certain properties (e.g., termination), making implicit assumptions (e.g., arithmetic operations never overflow), and unsoundness (e.g., only considering a fixed number of loop iterations). Despite these limitations, static analyzers find real errors in real code. However, as a result of these limitations, it is not clear what guarantees a static analysis actually provides about program correctness. It is also not clear how to use systematic testing to check exactly those properties that are not soundly verified by a static analysis. Consequently, software engineers need to test their programs as if no static analysis were applied, which is inefficient, for one, because it requires large test suites.

Until now, various approaches have combined verification and testing [43, 44, 65], but mainly to determine whether a static verification error is spurious (i.e., whether a warning emitted by a verification tool is a false positive). However, these approaches do not take into account that unsound static analyses might generate false negatives and do not address the limitations of verifiers described above. In other words, testing aims to target only those program executions for which a static verification error has been emitted, thus ignoring executions that have not been checked by a static analyzer due to its unsoundness, as shown in Fig. 1.2.


Figure 1.2: In existing work, systematic testing targets only those program executions for which a static verification error has been emitted (shaded area), thus ignoring the executions that unsound verification has missed.

To address this problem, we have developed a technique for combining verification and systematic testing, which guides the latter not only toward those program executions for which a verification error has been emitted, but also toward those executions that unsound verification has missed. The program executions that systematic testing aims to cover with our technique are depicted by the shaded areas in Fig. 1.3.


Figure 1.3: In our work, systematic testing targets those program executions for which a verification error has been emitted as well as those that unsound verification has missed (shaded areas).

In particular, we have proposed a tool architecture that (1) combines multiple, complementary static analyzers that check different properties and make different assumptions, and (2) complements static analysis with systematic dynamic test generation to cover those properties that have not been checked statically [30]. A key originality of this architecture is that it makes explicit which properties have been checked statically and under which assumptions. Therefore, the correctness guarantees provided by static analyzers are documented precisely, and can guide test generation toward those properties that are not verified yet, leading to smaller and more effective test suites. The three contributions made by our tool architecture are:

1. It makes deliberate compromises of static analyzers explicit, by marking every program assertion as either fully verified, verified under certain assumptions, or not verified at all.

2. It automatically generates test cases from the results of static analyzers, providing the user with a choice on how much effort to devote to static analysis and how much to testing. For example, a user might run an automatic verifier without devoting any effort to making the verification succeed (for instance, without providing auxiliary specifications, such as loop invariants). The verifier may prove some properties correct, and our architecture enables the effective testing of all others. Alternatively, a user might try to verify properties about critical components of a program and leave any remaining properties (e.g., about library components) for testing. Consequently, the degree of static analysis is configurable and may range from zero to complete.

3. It directs the static analysis and test case generation to the properties that have not been (soundly) checked. This leads to more targeted static analysis and testing, in particular, smaller and more effective test suites.


Figure 1.4: Tool architecture for complementing static verification with systematic testing. Tools are depicted by boxes and programs by document symbols.

Our tool architecture, which is presented in Fig. 1.4, takes a program containing code, specifications, and all properties that need to be checked for the program, such as division-by-zero errors and null dereferences. For each check in the given program, our architecture records whether it has been soundly verified and under which assumptions. Fig. 1.4 shows that this tool architecture consists of two stages, where stage 1 is static verification and stage 2 is systematic testing.

The static analysis (or verification) stage allows the user to run an arbitrary number (possibly zero) of static analyzers. Each analyzer reads the program, which might also contain results of prior static analysis attempts, and tries to verify any properties that have not already been proven by upstream tools. As previously described, each assertion is marked to be either fully (that is, soundly) verified, verified under certain assumptions, or not verified (that is, not attempted or failed to verify). An analyzer then attempts to prove the assertions that have not been fully verified by upstream tools. Each analyzer records its results in the program, which serves as input to the next downstream tool.

When the static analysis stage is completed, the output program is either entirely (and soundly) verified or there still exist unproven checks. The intermediate versions of the program precisely track which properties have been fully verified and which remain to be validated. This allows developers to stop the static verification cycle at any time, which is important in practice, where the effort that a developer can devote to static analysis is limited. Any remaining assertions may then be covered by the subsequent testing stage.

In this second stage, we apply dynamic symbolic execution to automatically generate test cases from the program code, specifications, and the results of static analysis. In particular, the properties that remain to be checked as well as the assumptions made by static analyzers occur, in the form of runtime checks, in an instrumented version of the program. The resulting instrumented program can then be fed to one or more test generation tools. Our instrumentation causes the symbolic execution of these tools to generate the constraints and test data that exercise exactly the properties that have not been statically verified (in a sound way), thus reducing the size of the generated test suites. However, not all specifications can be efficiently checked at runtime (for example, object invariants), and programs may interact with their environment in various ways (for example, through static state). Therefore, in this stage of the architecture, we also enforce strategies for significantly improving the results of systematic testing, in terms of effectively exercising more properties and detecting more errors.

By developing this architecture, we have investigated the following scientific topics.

How to design an annotation language that supports verification and systematic testing

Several annotation languages have been developed since design by contract [109] was first proposed. To name a few, Eiffel [108], the Java Modeling Language (JML) [93], Spec# [12], and Code Contracts [58] for .NET have been designed to enable both static analysis and runtime checking of their annotations. However, these languages cannot support the tool integration of Fig. 1.4. For instance, Eiffel and Code Contracts are not expressive enough for verification purposes, whereas JML and Spec# are tied to a particular verification methodology. Moreover, annotation languages based on separation logic [85, 126] have restricted support for runtime checking [119].

For our purposes, we have designed an annotation language for making deliberate compromises of static analyzers explicit [30] (see Ch. 2). The main virtues of our annotations are that they are (1) simple and easy to support by a wide range of static and dynamic tools [33], (2) expressive, as we have demonstrated by encoding the typical compromises made by deductive verifiers [30] and the .NET abstract interpreter Clousot [59, 32], and (3) well suited for test case generation [30, 33].

What the limitations of mainstream verifiers are and how to make these limitations explicit

Many mainstream static analyzers make compromises in order to increase automation, improve performance, or reduce both the number of false positives and the annotation overhead. For example, HAVOC [5] uses write effect specifications without checking them, Spec# [12] ignores arithmetic overflow and does not consider exceptional control flow, ESC/Java [62] unrolls loops a fixed number of times, the .NET abstract interpreter Clousot [59] uses an unsound heap abstraction, KeY [13] does not have sound support for multi-object invariants, Krakatoa [61] does not handle class invariants and class initialization soundly, and Frama-C [39] uses plug-ins for various analyses with possibly conflicting assumptions, to mention a few. As long as the compromises of such tools are made explicit, their users should immediately benefit from our tool architecture, which allows these tools to collaborate and be effectively complemented by automatic test case generation.

We have used our annotations to encode typical compromises made by deductive verifiers [30]. We have also encoded most soundness compromises in Clousot, a widely-used, commercial static analyzer. We measured the impact of the unsound assumptions in Clousot on several open-source projects, which constituted the first systematic effort to document and evaluate the sources of unsoundness in an analyzer [32].

How to combine verification and systematic testing to maximize code quality and minimize the test effort

As mentioned above, various existing approaches combine unsound verification and testing [43, 44, 65], primarily to determine whether a static verification error is spurious. Confirming whether a failing verification attempt refers to a real error is also possible in our tool architecture, as shown in Fig. 1.3. The runtime-check instrumentation phase of the architecture introduces assertions for each property that has not been statically verified (which includes the case of a failing verification attempt). The testing stage then uses these assertions to direct test case generation toward the unproven properties. Eventually, the testing tools might generate either a series of successful test cases that will boost the users’ confidence about the correctness of their programs or concrete counterexamples that reproduce an error.

To explore how to most effectively combine unsound static analysis and systematic testing with our annotations, we built our tool chain for .NET using the static analyzer Clousot and the dynamic symbolic execution tool Pex. In this setting, we investigated how to best exploit these annotations for guiding test generation toward properties that have not been previously checked soundly by a static analysis.

In the next chapter, we present a technique for reducing redundancies with static analysis when complementing partial verification results (expressed using our annotations) by automatic test case generation [33]. Our main contribution is a code instrumentation that causes dynamic symbolic execution to abort tests that lead to verified executions, to prune parts of the search space, and to prioritize tests that lead to unverified executions. This instrumentation is based on an efficient static inference, which propagates information that characterizes unverified executions higher up in the control flow, where it may prune the search space more effectively. Compared to directly running Pex on the annotated programs without our instrumentation, our technique produces smaller test suites (by up to 19.2%), covers more unverified executions (by up to 7.1%), and reduces testing time (by up to 52.4%).

We also find it beneficial to complement sound, interactive verification with systematic testing for the properties that have not been verified yet. In Ch. 5, we present Delfy, a dynamic test generation tool for complementing Dafny, a sound and interactive verifier. In that chapter, we explore how to test interesting language and specification constructs of the Dafny programming language, and discuss how testing can save the effort of attempting to verify an incorrect program, debugging spurious verification errors, and writing redundant specifications.

How to generate tests for program properties that are difficult to verify and lie beyond the capabilities of systematic testing

In the second stage of our tool architecture, we attempt to test those properties that have not been soundly verified by upstream static analyzers. Since our ultimate goal is to automatically provide evidence on whether a program is correct, this stage of the architecture must be able to efficiently generate test oracles (in the form of runtime checks) for any unverified, rich properties, and test inputs for thoroughly evaluating these oracles. We investigate how to achieve an attractive trade-off between performance and coverage of test oracles, by using simple static analyses to reduce both the number of oracles that need to be checked as well as the number of test inputs that affect these oracles.

In Ch. 3, we describe and address a limitation of existing testing tools in generating strong enough oracles for a certain rich specification. In particular, automatic test case generation techniques rely on a description of the input data that the unit under test is intended to handle. For heap data structures, such a description is typically expressed as some form of object invariant [109, 101]. If a program may create structures that violate the invariant, the test data generated using the invariant systematically ignores possible inputs and, thus, potentially misses bugs. We address this limitation with a technique that detects violations of object invariants [31]. We describe three scenarios in which traditional invariant checking (as implemented in existing test generation tools) may miss such violations. Based on a set of predefined templates that capture these scenarios, we synthesize parameterized unit tests [136] that are likely to violate invariants, and use dynamic symbolic execution to generate inputs to the synthesized tests. We have implemented our technique as an extension to the dynamic symbolic execution tool Pex [135] and applied it on open-source applications, both for invariants manually written by programmers and invariants automatically inferred by Daikon [57]. In both cases, we detected a significant number of invariant violations.

In Ch. 4, we focus on a second limitation of existing testing tools in generating suitable inputs to a unit under test. Although static state is common in object-oriented programs, automatic test case generators do not take into account the potential interference of static state with a unit under test and may, thus, miss subtle errors. In particular, existing test case generators do not treat static fields as input to the unit under test, and do not control the execution of static initializers. We address these issues by proposing a novel technique in automatic test case generation [27], based on static analysis and dynamic symbolic execution. We have applied this technique on a suite of open-source applications and found errors that go undetected by existing test case generators. Our experiments show that this problem is relevant in real code, indicate which kinds of errors existing tools miss, and demonstrate the effectiveness of our technique.
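As a hedged illustration of the static-state issue just described (the code below is hypothetical and not taken from Ch. 4 or its benchmark suite; Config, Limit, Account, and Withdraw are invented names), consider a method whose outcome depends on a static field. A test case generator that neither treats Config.Limit as an input nor accounts for code that may have modified it before the call cannot exercise the branch that the field guards.

// Hypothetical example: a static field acts as a hidden input to the unit under test.
static class Config
{
    public static int Limit = 1000;  // mutable global state, set by static initialization
}

class Account
{
    private int balance = 500;

    public bool Withdraw(int amount)
    {
        // If earlier code lowered Config.Limit (say, to 100), Withdraw(200) is rejected
        // even though the balance suffices; with Limit fixed to its initial value, a test
        // generator never observes this behavior.
        if (amount > balance || amount > Config.Limit)
            return false;
        balance -= amount;
        return true;
    }
}

Treating Config.Limit as a symbolic input, in the spirit of Ch. 4, makes both outcomes of the second disjunct reachable during exploration.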

1.3 Pushing systematic testing toward verification

Systematic dynamic test generation has been implemented in many popular tools over the last decade, such as EXE [21], jCUTE [127], Pex [135], KLEE [19], BitBlaze [130], and Apollo [3], to name a few. Although effective in detecting bugs, these testing tools have never been pushed toward program verification of a large and complex application, i.e., toward proving that the application is free of certain classes of errors. In this second part of the dissertation, we assess to what extent reaching verification with systematic testing is feasible in practice, in the scope of a particular application domain, namely, that of binary image parsers. Specifically, in the scope of this application domain, we assess whether Fig. 1.5 is realistic, that is, whether it is realistic to push systematic testing to cover all possible program executions.


Figure 1.5: Assessing whether it is realistic to push systematic testing to cover all possible program executions.

In Ch. 6, we report how we extended systematic dynamic test generation toward program verification of the ANI Windows image parser, written in low-level C [29]. To achieve this, we applied only three core techniques, namely (1) symbolic execution at the x86 binary level, (2) exhaustive program path enumeration and testing, and (3) user-guided program decomposition and summarization. We used SAGE and a very recent tool, named MicroX [67], for executing code fragments in isolation with a custom virtual machine designed for testing purposes. As a result of this work, we are able to prove, for the first time, that a complex Windows image parser is memory safe, i.e., free of any buffer-overflow security vulnerabilities, modulo the soundness of our tools and several additional assumptions, regarding bounding input-dependent loops, fixing a few buffer-overflow bugs, and excluding some code parts that are not memory safe by design. In the process, we also discovered and fixed several limitations in our tools.

In Ch. 6, we limit path explosion in the parser with user-guided program decomposition and summarization. In particular, we decompose the program at only a few function interfaces, which are very simple so that the logic encoding of the summaries remains tractable. Based on these insights, in Ch. 7, we define IC-Cut, short for “Interface-Complexity-based Cut”, a compositional search strategy for systematically testing large programs [28]. IC-Cut dynamically detects function interfaces that are simple enough to be cost-effective for summarization. IC-Cut then hierarchically decomposes the program into units defined by such functions and their sub-functions in the call graph. These units are tested independently, their test results are recorded as low-complexity function summaries, and the summaries are reused when testing higher-level functions in the call graph, thus limiting overall path explosion. When the decomposed units are tested exhaustively, they constitute verified components of the program. IC-Cut is run dynamically and on-the-fly during the search, typically refining cuts as the search advances. We have implemented this algorithm as a new search strategy in SAGE, and present detailed experimental results obtained when testing the ANI Windows image parser. Our results show that IC-Cut alleviates path explosion while preserving or even increasing code coverage and bug finding, compared to the current generational-search strategy used in SAGE.

Part I

Complementing verification with systematic testing


Chapter 2

Guiding dynamic symbolic execution toward unverified program executions

In this chapter, we explore how to effectively combine verification with systematic testing to maximize code quality while minimizing the test effort. Dynamic symbolic execution (DSE) [70, 20] systematically explores a large number of program executions and, thus, effectively detects errors missed by other techniques, such as code reviews and unsound static program analysis. However, simply applying DSE in addition to other techniques leads to redundancy, when executions covered by DSE have already been verified using other techniques. In this case, the available testing time is wasted on executions that are known to be correct rather than on exploring previously-unverified executions. This redundancy is especially problematic when DSE is used to complement static analyzers, because static techniques can check a large fraction of all possible program executions and, thus, many or even most of the executions covered by DSE are already verified.

Method Deposit in Fig. 2.1 illustrates this problem. A reviewer or static analyzer that checks the implementation under the assumption that the addition on line 5 does not overflow might miss violations of the assertion on line 10. Applying DSE to the method tries to explore six different paths through the method (there are three paths through the conditionals, each combined with two possible outcomes for the assertion), in addition to all the paths through the called methods ReviewDeposit and SuggestInvestment. Assuming that these two methods are correct, only one of all these paths reveals an error, namely, the path that is taken when amount is between zero and 50,000, and balance is large enough for the addition on line 5 to overflow. All other generated test cases are redundant because they lead to executions that have already been verified. In particular, if the called methods have complex control flow, DSE might not detect the error because it reaches a timeout before generating the only relevant test case.
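The overflowing execution described above is easy to reproduce concretely. The following self-contained snippet (with illustrative values that are not part of Fig. 2.1) shows the wrap-around in unchecked C# arithmetic that violates the postcondition balance >= old(balance) once the no-overflow assumption fails.

using System;

class OverflowDemo
{
    static void Main()
    {
        int balance = int.MaxValue - 10;  // assumed field value before the deposit
        int amount = 100;                 // satisfies 0 < amount <= 50000
        int oldBalance = balance;

        balance = unchecked(balance + amount);  // corresponds to line 5 of Fig. 2.1

        // The assertion of line 10 does not hold: the sum wrapped around to a negative value.
        Console.WriteLine("balance >= old(balance): " + (balance >= oldBalance));
        Console.WriteLine("balance = " + balance);
    }
}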

To reduce this redundancy, existing work [43, 65, 26] integrates static analyses and DSE; it uses the verification results of a static analysis to prune verified executions from testing.

 1 void Deposit(int amount) {
 2   if (amount <= 0 || amount > 50000) {
 3     ReviewDeposit(amount);
 4   } else {
 5     balance = balance + amount;
 6     if (balance > 10000) {
 7       SuggestInvestment();
 8     }
 9   }
10   assert balance >= old(balance);
11 }

Figure 2.1: A C# example illustrating partial verification results. Techniques that assume that the addition on line 5 does not overflow might miss violations of the assertion on line 10. We use the assertion to make the intended behavior explicit; the old-keyword indicates that an expression is evaluated in the pre-state of the method. balance is an integer field declared in the enclosing class. We assume methods ReviewDeposit and SuggestInvestment to be correct.

However, existing combinations of static analysis and test case generation do not support analyses that make unsound assumptions. They either require the static analysis to be sound and are, thus, of limited use for most practical analyses, or they ignore the unsoundness of the static analysis and may, therefore, prune executions during DSE that contain errors. In particular, they would miss the error in the example from Fig. 2.1 because the unsound static analysis reports no errors.

In this chapter, we present a novel technique to complement partial verification by automatic test case generation. In contrast to existing work, our technique supports the important case that the verification results are obtained by an unsound (manual or automatic) static code analysis. We use program annotations to make explicit which assertions in a program have already been verified, and under which assumptions. These annotations can be generated automatically by a static analysis [32] or inserted manually, for instance, during a code review. The main technical contribution of this chapter is a code instrumentation of the unit under test that:

− detects redundant test cases early during their execution and aborts them,

− reduces the search space for DSE by pruning paths that have been previously verified, and

− prioritizes test cases that cover unverified executions.

This instrumentation is based on an efficient static inference, which propagates information that characterizes unverified executions higher up in the control flow, where it may prune the search space more effectively. It does not require a specific DSE algorithm and, thus, can be used with a wide range of existing tools.

Our technique works for modular and whole-program verification, and can be used to generate unit or system tests. For concreteness, we present it for modular verification and unit testing. In particular, we have implemented our approach for Microsoft’s .NET static checker Clousot [59], a modular static analysis, and the DSE tool Pex [135], a test case generator for unit tests. Our experiments demonstrate that, compared to classical DSE, our approach produces smaller test suites, explores more unverified executions, and reduces testing time.

Outline. This chapter is organized as follows. We give an overview of our approach in Sect. 2.1. Sect. 2.2 explains how we infer the code instrumentation from partial verification results. Our experimental results are presented in Sect. 2.3. We discuss related work in Sect. 2.4.

2.1 Approach

In this section, we present an annotation language for expressing partial verification results, and then illustrate how we use these annotations to guide DSE toward unverified executions. The details of the approach are explained in the next section.

2.1.1 Verification annotations

In order to encode partial verification results, we introduce two kinds of annotations: An assumed-statement of the form assumed P as a expresses that an analysis assumed property P to hold at this point in the code without checking it. The assumption identifier a uniquely identifies this statement. In order to record verification results, we use assertions of the form assert P verified A, which express that property P has been verified under condition A. The premise A is a boolean condition over assumption identifiers, each of which is introduced in an assumed-statement. Specifically, it is the conjunction of the identifiers for the assumptions used to verify P, or false if P was not verified. When several verification results are combined (for instance, from a static analysis and a code review), A is the disjunction of the assumptions made during each individual verification. We record verification results for all assertions in the code, including implicit assertions such as a receiver being non-null or an index being within the bounds of an array.

We assume here that a static analyzer records the assumptions it made during the analysis, which assertions it verified, and under which assumptions.

 1 void Deposit(int amount) {
 2   var a = true;
 3   if (amount <= 0 || 50000 < amount) {
 4     assume !a;
 5     ReviewDeposit(amount);
 6   } else {
 7     assumed noOverflowAdd(balance, amount) as a;
 8     a = a && noOverflowAdd(balance, amount);
 9     assume !a;
10     balance = balance + amount;
11     if (10000 < balance) {
12       SuggestInvestment();
13     }
14   }
15   assume !a || balance >= old(balance);
16   assert balance >= old(balance) verified a;
17 }

Figure 2.2: The instrumented version of method Deposit from Fig. 2.1. The dark boxes show the annotations generated by the static analyzer. The assumed-statement makes explicit that the analyzer assumed that the addition on line 10 does not overflow. The verified-annotation on the assertion on line 16 expresses that the assertion was verified under this (unsound) assumption. The two annotations are connected via the assumption identifier a, which uniquely identifies the assumed-statement. The light boxes show the instrumentation that we infer from the annotations and that prunes redundant tests.

We equipped Microsoft's .NET static analyzer Clousot [59] with this functionality [32]. Among other unsound assumptions, Clousot ignores arithmetic overflow and, thus, misses the potential violation of the assertion on line 10 of Fig. 2.1. (Clousot is modular, that is, it reasons about a method call using the method's pre- and postcondition; we assume here that the postconditions of ReviewDeposit and SuggestInvestment state that balance is not decreased.) This partial verification result is expressed by the annotations in the dark boxes of Fig. 2.2 (the light boxes are discussed below). The assumed-statement makes explicit that the addition on line 10 was assumed not to overflow (the predicate noOverflowAdd can be encoded as the equality of an integer and a long-integer addition); the verified-annotation on the assertion on line 16 expresses that the assertion was verified under this (unsound) assumption.

The meaning of verification annotations can be defined in terms of assignments and assume-statements, which makes the annotations easy to support by a wide range of static and dynamic tools. For each assumption identifier, we declare a boolean variable, which is initialized to true. For modular analyses, assumption identifiers are local variables initialized at the beginning of the enclosing method (line 2 in Fig. 2.2), whereas for whole-program analyses, assumption identifiers are global variables, for instance, initialized at the beginning of a main method.

A statement assumed P as a is encoded as

  a = a && P;

as illustrated on line 8. That is, variable a accumulates the assumed properties for each execution of the assumed-statement. Since assumptions typically depend on the current execution state, this encoding ensures that an assumption is evaluated in the state in which it is made rather than the state in which it is used. An assertion assert P verified A is encoded as

  assume A ⇒ P;
  assert P;

as illustrated on line 15. The assume-statement expresses that, if condition A holds, then the asserted property P holds as well, which reflects that P was verified under the premise A. Consequently, an assertion is unverified if A is false, the assertion is fully verified if A is true, and otherwise, the assertion is partially verified.
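The encoding can be made concrete in plain C#. The following sketch (our own illustration) shows one possible rendering of the instrumented Deposit method of Fig. 2.2; the VerificationRuntime helpers are hypothetical stand-ins for the assume- and assert-statements (they are not part of Pex or Code Contracts), and the overflow check follows the integer/long-integer encoding of noOverflowAdd mentioned above.

    // Hedged sketch: a plain-C# rendering of the annotated and instrumented Deposit
    // method from Fig. 2.2. VerificationRuntime is a hypothetical helper class.
    using System;

    public static class VerificationRuntime
    {
        public class AssumptionViolation : Exception { }    // aborts the current test

        public static void Assume(bool condition)
        {
            if (!condition) throw new AssumptionViolation();
        }

        public static void Assert(bool condition)
        {
            if (!condition) throw new Exception("assertion violated");
        }
    }

    public class Account
    {
        private int balance;

        public void Deposit(int amount)
        {
            int oldBalance = balance;        // snapshot for old(balance)
            bool a = true;                   // assumption identifier, initialized to true

            if (amount <= 0 || 50000 < amount)
            {
                ReviewDeposit(amount);
            }
            else
            {
                // assumed noOverflowAdd(balance, amount) as a
                a = a && ((long)balance + amount == balance + amount);
                balance = balance + amount;
                if (10000 < balance)
                {
                    SuggestInvestment();
                }
            }

            // assert balance >= old(balance) verified a
            VerificationRuntime.Assume(!a || balance >= oldBalance);   // assume A ==> P
            VerificationRuntime.Assert(balance >= oldBalance);         // assert P
        }

        private void ReviewDeposit(int amount) { /* assumed correct */ }
        private void SuggestInvestment() { /* assumed correct */ }
    }

Note that the assume-statement derived from the verified-annotation never fails at runtime, since it soundly expresses the prior verification result; its role is to constrain the inputs that DSE generates, as discussed in the next subsection.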

2.1.2 Guiding dynamic symbolic execution

To reduce redundancies with prior analyses of the unit under test, DSE should generate tests that check each assertion assert P verified A for the case that the premise A does not hold, because P has been verified to hold otherwise. DSE can be guided by adding constraints to path conditions, which will then be satisfied by the generated test inputs. The assume-statement in the encoding of an assertion contributes such a constraint, reflecting that only inputs that violate the premise A may reveal a violation of the asserted property P. However, these assume-statements do not effectively guide DSE, as we explain next.

Assume-statements affect DSE in two ways. First, when the execution of a test case encounters an assume-statement whose condition is false, the execution is aborted. Second, when an execution encounters an assume-statement, its condition is added to the symbolic path condition, ensuring that subsequent test cases that share the prefix of the execution path up to the assume-statement will satisfy the condition. (In other words, assume-statements contribute to the path constraint the assumed conditions, of the form Assume(c), that we discussed in Sect. 1.1.) Therefore, the effect of an assume-statement is larger the earlier it occurs in the control flow, because early assumptions may abort test cases earlier and share the prefix with more executions.

However, the assume-statements we introduce for assertions do not effectively guide DSE toward unverified executions. Our example (Fig. 2.2) illustrates this problem. First, the conditions of these assume-statements always hold because they soundly express prior partial verification results. That is, they cannot be used to abort test cases; their condition is always true. Nonetheless, since these assume-statements express under which assumptions a subsequent assertion has been verified, they encode which executions are unverified in the context of the entire unit under test. For example, in Fig. 2.2, the assume-statement on line 15 encodes that executions of method Deposit for which !a holds are unverified. Even if we used this information to abort all test cases that reach line 15 and cover verified executions, we would save almost no execution time. Second, even if we added this information to the path constraint when line 15 is executed, it would only influence the test cases that share the prefix up to that point, therefore not including the majority of tests that share only a part of this prefix and target the branches of the conditional statements and called methods. Consequently, in most cases when DSE determines the next test case, it will most likely choose inputs that do not overflow, that is, test an execution that has already been verified, since the information about unverified executions is not yet in the path constraint. For these reasons, running DSE on the example from Fig. 2.2 (without lines 4 and 9, which we discuss below) generates the same test cases as if there were no prior verification results.

To address this problem, we propagate constraints that characterize unverified executions higher up in the control flow, where they can be used to effectively prune redundant test cases and to prioritize non-redundant test cases, that is, tests that cover unverified executions.
A test is redundant if the premise of each assertion in its execution holds; in this case, all assertions have been verified. In order to detect redundant tests early, we compute, for each program point, a sufficient condition for every execution from this program point onward to be verified. If this condition holds, we can abort the execution. We achieve this behavior by instrumenting the unit under test with assume-statements for the negation of the condition, that is, we assume a necessary condition for the existence of at least one unverified execution from the assume-statement onward. When the assumption evaluates to false during the execution of a test, it aborts the test and introduces a constraint, which implies that at least one premise must be violated, for all other test cases with the same prefix.

The example in Fig. 2.2 has an assertion with premise a at the very end. Consider the program points on lines 4 and 9. At both points, a is a sufficient condition for the rest of the execution of Deposit to be verified.

Since we are interested in test cases that lead to unverified executions, we instrument both program points by assuming the negation, that is, !a. With this instrumentation, any test case that enters the outer then-branch is aborted since a is always true at this point, which, in particular, prunes the entire exploration of method ReviewDeposit. Similarly, any test case that does not lead to an overflow on line 10 is aborted on line 9, which prunes the entire exploration of method SuggestInvestment. So, out of all the test cases generated by DSE for the un-instrumented Deposit method, only the one that reveals the error remains; all others are either aborted early or pruned.

Since the goal of the instrumentation described so far is to abort or prune redundant test cases, it has to be conservative. Any execution that may be unverified cannot be eliminated without potentially missing bugs; hence, we call this instrumentation may-unverified instrumentation. If an execution path contains several assertions, which is common because of the implicit assertions for dereferencing, array access, etc., this instrumentation retains any execution in which the premise of at least one of these assertions does not hold.

Intuitively, test cases that violate the premise of more than one assertion have a higher chance to detect an assertion violation. To prioritize such test cases, we devise a second instrumentation, called must-unverified instrumentation: We compute, for each program point, a sufficient condition for every execution from this program point onward to be definitely unverified. If the condition holds, then every execution from the program point onward contains at least one assertion, and the premises of all assertions in the execution are false. When the must-unverified condition is violated, it does not necessarily mean that the subsequent execution is verified and, thus, we cannot abort the test case. Therefore, we instrument the program not by assuming the must-unverified condition, but instead, with a dedicated tryfirst-statement. This statement interrupts the execution of the test case and instructs DSE to generate new inputs that satisfy the must-unverified condition, that is, inputs that have a higher chance to detect an assertion violation. The interrupted test case is re-generated later, after the executions that satisfy the must-unverified condition have been explored. This exploration strategy prioritizes test cases that violate all premises over those that violate only some.

Suppose that the Deposit method in Fig. 2.2 contained another assertion at the very end that has not been verified, that is, whose premise is false. In this case, the may-unverified instrumentation yields true for all prior program points, since every execution is unverified, and it neither aborts nor prunes any test cases. In contrast, the must-unverified instrumentation infers !a on line 9. The corresponding tryfirst-statement (not shown in Fig. 2.2) gives priority to executions that lead to an overflow on line 10. However, it does not prune the others, since they might detect a violation of the unverified second assertion at the end of the method.

The may-unverified and must-unverified instrumentations have complementary strengths.
While the former effectively aborts or prunes redundant tests, the latter prioritizes those tests among the non-redundant ones that are more likely to detect an assertion violation. Therefore, our experiments show the best results for the combination of both.

2.2 Condition inference

Our may- and must-unverified conditions reflect whether the premises of assertions further down in the control flow hold. In that sense, they resemble weakest preconditions [49]: a may-unverified condition is the negation of the weakest condition that implies that all premises further down hold; a must-unverified condition is the weakest condition that implies that all premises do not hold. However, existing techniques for precisely computing weakest preconditions have shortcomings that make them unsuitable in our context. For instance, weakest precondition calculi [95] require loop invariants, abstract interpretation [40] may require expensive fixed-point computations in sophisticated abstract domains, predicate abstraction [77, 6] may require numerous invocations of a theorem prover for deriving boolean programs [7] from the original program, and symbolic execution [87] struggles with path explosion, for instance, in the presence of input-dependent loops.

In this section, we present two efficient instrumentations that over-approximate the may-unverified and must-unverified conditions of a unit under test. For this purpose, we syntactically compute a non-deterministic abstraction of the unit under test. This abstraction is sound, that is, each execution of the concrete program is included in the set of executions of the abstract program. Therefore, a condition that guarantees that all premises hold (or are violated) in the abstract program provides the same guarantee for the concrete program. The may-unverified and must-unverified conditions for the abstract program can be computed efficiently using abstract interpretation, and can then be used to instrument the concrete program.

2.2.1 Abstraction

We abstract a concrete program to a boolean program [7], where all boolean variables are assumption identifiers. In the abstract program, all expressions that do not include assumption identifiers are replaced by non-deterministically chosen values, which, in particular, replaces conditional control flow by non-determinism. Moreover, the abstraction removes assertions that have been fully verified, that is, where the premise is the literal true or includes true as a disjunct.

We present the abstraction for a simple concrete programming language with the following statements: assumed-statements, assertions, method calls, conditionals, loops, and assignments. Besides conditional statements and loops with non-deterministic guards, the abstract language provides the following statements:

− initialization of assumption identifiers: var a := true,

− updates to assumption identifiers: a := a && *, where the * denotes a non-deterministic (boolean) value,

− assertions: assert * verified A, where A ≢ true, and

− method calls: call M̃, where M̃ is a fully-qualified method name and the receiver and arguments have been abstracted away.

Note that we desugar assumed-statements into initializations and updates of assumption identifiers, which allows us to treat modular and whole-program analyses uniformly, even though they require a different encoding of assumed-statements (Sect. 2.1.1).

To abstract a program, we recursively apply the following transformations to its statements. These transformations can be considered as an application of predicate abstraction [77], which uses the assumption identifiers as predicates and does not rely on a theorem prover to derive the boolean program:

− an assumption assumed P as a is rewritten to the assumption identifier initialization var a := true (at the appropriate program point, as discussed above) and an update a := a && *,

− an assertion assert P verified A is rewritten to the assertion assert * verified A, if A is not trivially true, and omitted otherwise,

− a conditional statement if (b) S0 else S1 is transformed into if (*) S0′ else S1′, where S0′ and S1′ are the results of recursively rewriting the statements S0 and S1, respectively,

− a loop while (b) S is transformed into while (*) S′, where S′ is the result of recursively rewriting statement S,

− a method call r.M(...) is rewritten to call M̃, where M̃ is the fully-qualified name of M, and

− assignments are omitted.

Fig. 2.3 shows the abstraction of method Deposit from Fig. 2.2. The gray boxes (light and dark) show the inferred may-unverified conditions, as we explain in the next subsection.

 1 method Deposit() {
 2   {true}
 3   var a := true;
 4   {true}
 5   if (*) {
 6     {!a}
 7     call Account.ReviewDeposit;
 8     {!a}
 9   } else {
10     {true}
11     a := a && *;
12     {!a}
13     if (*) {
14       {!a}
15       call Account.SuggestInvestment;
16       {!a}
17     }
18     {!a}
19   }
20   {!a}
21   assert * verified a;
22   {false}
23 }

Figure 2.3: The abstraction of method Deposit from Fig. 2.2. The gray boxes (light and dark) show the inferred may-unverified conditions. The conditions that are used for the may-unverified instrumentation are shown in dark gray boxes.

Soundness

The abstraction described above is sound, that is, each execution of the concrete program is included in the set of executions of the corresponding abstract program. The abstraction preserves the control structure of each method, but makes the control flow non-deterministic, which enlarges the set of possible executions. All other occurrences of expressions (in assumed-statements, assertions, and calls) are replaced by non-deterministic values of the appropriate type, which also enlarges the set of possible executions. Once all occurrences of variables have been replaced by non-deterministic values, assignments do not affect program execution and can, thus, be omitted.
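To relate the transformation rules above to code, the following sketch outlines how the abstraction could be implemented as a recursive rewrite over a small statement AST. The AST types are hypothetical and merely mirror the simple language of this section; expressions are kept as opaque strings, and the check for fully verified assertions is simplified to the literal true.

    // Hedged sketch: the syntactic abstraction of Sect. 2.2.1 as a recursive rewrite.
    // The statement AST below is made up for illustration; it is not part of our tool chain.
    using System.Collections.Generic;

    public abstract class Stmt { }
    public class Assumed : Stmt { public string P; public string Id; }          // assumed P as a
    public class AssertVerified : Stmt { public string P; public string A; }    // assert P verified A
    public class If : Stmt { public string Cond; public Stmt Then; public Stmt Else; }
    public class While : Stmt { public string Cond; public Stmt Body; }
    public class Call : Stmt { public string Method; }                          // fully-qualified name
    public class Assign : Stmt { public string Target; public string Value; }
    public class Seq : Stmt { public List<Stmt> Stmts = new List<Stmt>(); }
    public class UpdateId : Stmt { public string Id; }                          // a := a && *
    public class Skip : Stmt { }

    public static class Abstraction
    {
        public static Stmt Rewrite(Stmt s)
        {
            switch (s)
            {
                case Assumed asm:
                    // assumed P as a  ==>  a := a && *  (the initialization var a := true is
                    // emitted separately, at the beginning of the method or of the program)
                    return new UpdateId { Id = asm.Id };
                case AssertVerified asrt:
                    // drop fully verified assertions (here simplified to the literal true),
                    // otherwise keep the assertion with a non-deterministic body
                    return asrt.A == "true" ? (Stmt)new Skip()
                                            : new AssertVerified { P = "*", A = asrt.A };
                case If iff:
                    return new If
                    {
                        Cond = "*",                                              // non-deterministic guard
                        Then = Rewrite(iff.Then),
                        Else = iff.Else == null ? null : Rewrite(iff.Else)
                    };
                case While loop:
                    return new While { Cond = "*", Body = Rewrite(loop.Body) }; // non-deterministic guard
                case Call call:
                    return new Call { Method = call.Method };                   // receiver/arguments dropped
                case Assign ignored:
                    return new Skip();                                          // assignments are omitted
                case Seq seq:
                    var result = new Seq();
                    foreach (var st in seq.Stmts) result.Stmts.Add(Rewrite(st));
                    return result;
                default:
                    return new Skip();
            }
        }
    }

Because every rule only widens the set of behaviors (non-deterministic guards and values) or drops statements that no longer influence execution, the soundness argument above carries over directly to such an implementation.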

2.2.2 May-unverified conditions

A may-unverified condition expresses that some execution from the current program point onward may be unverified. We compute this condition for each program point in two steps. First, we compute the weakest condition at the corresponding program point in the abstract program that implies that all executions are verified. Since the set of executions of the abstract program subsumes the set of concrete executions, this condition also implies that all concrete executions are verified (although for the concrete execution, the computed condition is not necessarily the weakest such condition). Second, we negate the computed condition to obtain a may-unverified condition.

Inference

To compute the weakest condition that implies that all executions from a program point onward are verified, we define a predicate transformer WP on abstract programs. If WP(S, R) holds in a state, then the premise of each assertion in each execution of statement S from that state holds and, if the execution terminates, R holds in the final state. For a modular analysis such as Clousot, calls are encoded via their pre- and postcondition [96] and, thus, do not occur in the abstract program. Defining an inter-procedural WP is, of course, also possible. Thus, we define WP as:

− WP(assert * verified A, R) ≡ A ∧ R,

− WP(a := true, R) ≡ R[a := true], denoting the substitution of a by true in R, and

− WP(a := a && *, R) ≡ R ∧ R[a := false].

The semantics of sequential composition, conditionals, and loops is standard [49]. In our implementation, we use backward abstract interpretation to compute the weakest precondition for each program point in terms of a set of cubes (that is, conjunctions of assumption identifiers or their negations). In the presence of loops or recursion, we use a fixed-point computation.

For every program point of the abstract program, the may-unverified condition is the negation of the weakest precondition at that program point:

May(S) ≡ ¬WP(S, true),

where S denotes the code fragment after the program point. The gray boxes in Fig. 2.3 show the may-unverified conditions at each program point (assuming ReviewDeposit and SuggestInvestment have no preconditions). In the example, the may-unverified inference propagates meaningful information only up until the non-deterministic update is reached, which corresponds to the assumed-statement. Specifically, on line 10, we infer true because the abstraction loses the information that would be needed to compute a stronger may-unverified condition. So, in return for an efficient condition inference, we miss some opportunities for aborting and pruning redundant tests.
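As a worked example (our own derivation, following directly from the WP rules above), the conditions in Fig. 2.3 are obtained by propagating WP backward from the assertion; the line numbers refer to Fig. 2.3:

\begin{align*}
\text{line 20:} \quad & WP(\texttt{assert * verified a},\ true) \equiv a \land true \equiv a\\
\text{lines 12 to 18:} \quad & WP(\texttt{if (*) \{ call Account.SuggestInvestment \}},\ a) \equiv a \land a \equiv a\\
\text{line 10:} \quad & WP(\texttt{a := a \&\& *},\ a) \equiv a \land a[a := false] \equiv a \land false \equiv false\\
\text{lines 6, 8:} \quad & WP(\texttt{call Account.ReviewDeposit},\ a) \equiv a\\
\text{line 4:} \quad & WP(\texttt{if (*) \ldots else \ldots},\ a) \equiv a \land false \equiv false\\
\text{line 2:} \quad & WP(\texttt{var a := true},\ false) \equiv false[a := true] \equiv false
\end{align*}

Negating these weakest preconditions according to May(S) ≡ ¬WP(S, true) yields exactly the gray boxes of Fig. 2.3: !a at lines 6, 8, and 12 to 20, true at lines 2, 4, and 10, and false after the assertion on line 22, where nothing remains to be verified.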

Instrumentation

Since each execution of the concrete program corresponds to an execution of the abstract program, we can instrument the concrete program by adding an assume C statement at each program point, where C is the may-unverified condition at the corresponding program point in the abstract program. As we explained in Sect. 2.1.2, these statements abort redundant test cases, and contribute constraints that guide DSE toward unverified executions. To avoid redundant constraints that would slow down DSE, we omit assume-statements when the may-unverified condition is trivially true or not different from the condition at the previous program point, as well as the assume false statement at the end of the unit under test. Therefore, out of all the conditions inferred for the example in Fig. 2.3, we use only the ones on lines 6 and 12 to instrument the program, which leads to the assumptions on lines 4 and 9 of Fig. 2.2 and guides DSE as described in Sect. 2.1.2.

2.2.3 Must-unverified conditions

A must-unverified condition expresses that (1) each execution from the program point onward contains at least one assertion, and (2) on each execution, the premise of each assertion evaluates to false.

Inference

We infer the two properties that are entailed by a must-unverified condition separately, via two predicate transformers Mustassert and Mustall. If Mustassert(S, R) holds in a state, then each execution of statement S from that state encounters at least one assertion, or terminates in a state in which R holds. If Mustall(S, R) holds in a state, then the premise of each assertion in each execution of statement S from that state does not hold and, if S terminates, R holds. Both transformers yield the weakest condition that has these properties. Consequently, we obtain the weakest must-unverified condition for an abstract statement S as follows:

Must(S) ≡ Mustassert(S, false) ∧ Mustall(S, true)

Mustassert and Mustall are defined analogously to WP (see Sect. 2.2.2), except for the treatment of assertions:

Mustassert(assert * verified A, R) ≡ true

Mustall(assert * verified A, R) ≡ ¬A ∧ R

The definition for Mustassert expresses that, at a program point before an assertion, property (1) holds, that is, the remaining execution (from that point on) contains at least one assertion. The definition for Mustall expresses that the premise A must evaluate to false, and that R must hold to ensure that the premises of subsequent assertions do not hold either.

Fig. 2.4 shows the abstraction of a variant of Deposit from Fig. 2.2 that contains an additional unverified assertion at the end of the method (see Sect. 2.1.2). The gray boxes show the inferred must-unverified conditions, as we explain next. Compared to the may-unverified conditions, the must-unverified conditions are stronger, that is, information is usually propagated further up in the control flow. Whereas the unverified assertion at the end of this example causes the may-unverified conditions to be trivially true, the must-unverified inference obtains conditions that can be used to prioritize test cases.
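A similar backward computation (again our own worked derivation from the definitions above) explains the conditions in Fig. 2.4, whose line numbers are used below; ε denotes the empty remainder of the method:

\begin{align*}
\text{line 24:} \quad & Must_{assert}(\epsilon,\ false) \equiv false, \ \text{hence} \ Must \equiv false\\
\text{line 22:} \quad & Must_{assert} \equiv true, \quad Must_{all} \equiv \lnot false \land true \equiv true, \quad Must \equiv true\\
\text{line 20:} \quad & Must_{assert} \equiv true, \quad Must_{all} \equiv \lnot a \land true \equiv \lnot a, \quad Must \equiv \lnot a\\
\text{line 10:} \quad & Must_{all}(\texttt{a := a \&\& *},\ \lnot a) \equiv \lnot a \land (\lnot a)[a := false] \equiv \lnot a\\
\text{line 2:} \quad & Must_{all}(\texttt{var a := true},\ \lnot a) \equiv (\lnot a)[a := true] \equiv false, \quad Must \equiv false
\end{align*}

Since the trailing unverified assertion guarantees that every remaining execution contains an assertion, Mustassert is true at all earlier program points, and the must-unverified condition reduces to !a; at line 2, before a is introduced and set to true, this condition is unsatisfiable, so it collapses to false.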

Instrumentation

To prioritize tests that satisfy their must-unverified conditions, we instrument the concrete program with tryfirst C statements, where C is the must-unverified condition at the corresponding program point in the abstract program. This statement causes DSE to prefer test inputs that satisfy condition C. More specifically, when a tryfirst C statement is executed for the first time, it adds C to the path condition to force DSE to generate inputs that satisfy condition C. Note, however, that unlike the constraints added by assume-statements, this constraint may be dropped by the DSE to also explore executions where the condition is violated. If during this first execution of the statement condition C is violated, then the test case is interrupted and will be re-generated later, when condition C can no longer be satisfied. So, the tryfirst-statement influences the order in which test cases are generated, but never aborts or prunes tests. Nevertheless, the order is important, because DSE is typically applied until certain limits (for instance, on the overall testing time or the number of test cases) are reached. Therefore, exploring non-redundant test cases early increases effectiveness.

Pex supports primitives for expressing tryfirst C statements easily, as instrumentation. Alternatively, other tools may encode them by placing additional branches into the code and customizing the search strategy to prefer the branch where C holds.

To avoid wasting time on interrupting tests that will be re-generated later, our implementation enforces an upper bound on the number of interrupts that are allowed per unit under test. When this upper bound is exceeded, all remaining tryfirst-statements have no effect.

As illustrated by lines 4, 6, 8, and 10 in Fig. 2.4, the must-unverified condition at some program points evaluates to false for all executions.

 1 method Deposit() {
 2   {false}
 3   var a := true;
 4   {!a}
 5   if (*) {
 6     {!a}
 7     call Account.ReviewDeposit;
 8     {!a}
 9   } else {
10     {!a}
11     a := a && *;
12     {!a}
13     if (*) {
14       {!a}
15       call Account.SuggestInvestment;
16       {!a}
17     }
18     {!a}
19   }
20   {!a}
21   assert * verified a;
22   {true}
23   assert * verified false;
24   {false}
25 }

Figure 2.4: The abstraction of a variant of method Deposit from Fig. 2.2 that contains an additional unverified assertion at the end of the method (see Sect. 2.1.2). The gray boxes show the inferred must-unverified conditions. The conditions that are used for the must-unverified instrumentation are shown in dark gray boxes.

Instrumenting these points would lead to useless interruption and re-generation of tests. To detect such cases, we apply constant propagation, and do not instrument program points for which the must-unverified conditions are trivially true or false. Moreover, we omit the instrumentation for conditions that are not different from the condition at the previous program point. Therefore, out of all the conditions inferred for the example in Fig. 2.4, we use only the ones on lines 12 and 20 to instrument the program, which prioritize tests that lead to an arithmetic overflow on line 10, as discussed in Sect. 2.1.2.
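As a concrete illustration of the branch-based encoding mentioned above, the following sketch shows one way a tool without a dedicated tryfirst primitive might realize tryfirst C. The TestGuidance helper and its exception type are hypothetical; only the general idea, namely branching on the (symbolic) condition, interrupting a test at most a bounded number of times, and letting the search strategy prefer the branch where C holds, is taken from the text. The bound of four interrupts matches the setting used later in Sect. 2.3.2.

    // Hedged sketch: encoding "tryfirst C" with an ordinary branch on C. Branching on
    // the symbolic condition adds C to the path condition on the preferred branch; the
    // search strategy is assumed to be configured to explore that branch first.
    using System;

    public class TestInterruptedException : Exception { }   // "re-generate this test later"

    public static class TestGuidance
    {
        private static int interrupts;
        private const int MaxInterrupts = 4;                 // upper bound per unit under test

        public static void TryFirst(bool mustUnverifiedCondition)
        {
            if (mustUnverifiedCondition)
            {
                // Preferred case: the path condition now contains the must-unverified
                // condition, so inputs generated from this prefix tend to satisfy it.
                return;
            }
            if (interrupts < MaxInterrupts)
            {
                interrupts++;
                // Interrupt the current test; DSE re-generates it after the executions
                // that satisfy the condition have been explored.
                throw new TestInterruptedException();
            }
            // Bound exceeded: remaining tryfirst-statements have no effect.
        }
    }

For the variant of Fig. 2.4, the concrete method would then contain a call such as TestGuidance.TryFirst(!a) at the program points corresponding to lines 12 and 20. In the combined instrumentation described in the next subsection, a tryfirst-statement at a given program point precedes the assume-statement inserted there by the may-unverified instrumentation, if any.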

2.2.4 Combined instrumentation

As we explained in Sect. 2.1.2, the may-unverified instrumentation aborts and prunes redundant tests, while the must-unverified instrumentation prioritizes tests that are more likely to detect an assertion violation. One can, therefore, combine both instrumentations such that DSE (1) attempts to first explore program executions that must be unverified, and (2) falls back on executions that may be unverified when the former is no longer feasible. The combined instrumentation includes the assume-statements from the may-unverified instrumentation as well as the tryfirst-statements from the must-unverified instrumentation. The tryfirst-statement comes first. Whenever we can determine that the must-unverified and may-unverified conditions at a particular program point are equivalent, we omit the tryfirst-statement, because any interrupted and re-generated test case would be aborted by the subsequent assume-statement anyway.

2.3 Experimental evaluation

In this section, we give an overview of our implementation and present our experimental results. In particular, we show that, compared to dynamic symbolic execution alone, our technique produces smaller test suites, covers more unverified executions, and reduces testing time. We also show which of our instrumentations (may-unverified, must-unverified, or their combination) is the most effective.

2.3.1 Implementation

We have implemented our technique for the .NET static analyzer Clousot [59] and the dynamic symbolic execution tool Pex [135]. Our tool chain consists of four subsequent stages:

1. static analysis and verification-annotation instrumentation,

2. may-unverified and must-unverified instrumentation,

3. runtime checking, and

4. dynamic symbolic execution.

The first stage runs Clousot on a given .NET program, which contains code and optionally specifications expressed in Code Contracts [58], and instruments the sources of unsoundness and verification results of the analyzer, using our verification annotations. For this purpose, we have implemented a wrapper around Clousot, which we call Inspector-Clousot, that uses the debug output emitted during the static analysis to instrument the program

(at the binary level). Note that Clousot performs a modular analysis and, thus, the verification annotations are local to the containing methods.

We have elicited a complete list of Clousot's unsound assumptions [32] during two years of work, by studying publications, extensively testing the tool, and having numerous discussions with its designers. We have demonstrated how most of these assumptions can be expressed with our verification annotations. In fact, we have shown that our verification annotations can describe unsound assumptions about sequential programs, expressible by contract languages [58, 93], and typically made by abstract interpreters and deductive verifiers [32, 30]. (An extension to concurrent programs, more advanced properties such as temporal properties, and the unsound assumptions made by model checkers, such as bounding the number of heap objects, are future work.)

The second stage of the tool chain adds the may-unverified instrumentation, the must-unverified instrumentation, or their combination to the annotated program.

In the third stage, we run the existing Code Contracts binary rewriter to transform any Code Contracts specifications into runtime checks. We then run a second rewriter, which we call Runtime-Checking-Rewriter, that transforms all the assumed-statements and assertions of the annotated program into assignments and assumptions, as described in Sect. 2.1.1.

In the final stage, we run Pex on the instrumented code.

2.3.2 Experiments

In the rest of this section, we describe the setup for the evaluation of our technique, and present experiments that evaluate its benefits.

Setup

For our experiments, we used 101 methods (written in C#) from nine open-source projects and from solutions to 13 programming tasks on the Rosetta Code repository. A complete list of the methods used in our evaluation can be found in Appx. A. We selected only methods for which Pex can automatically produce more than one test case (that is, Pex does not require user-provided factories) and at least one successful test case (that is, Pex generates inputs that pass any input validation performed by the method).

In Clousot, we enabled all checks, set the warning level to the maximum, and disabled all inference options. In Pex, we set the maximum number of branches, conditions, and execution tree nodes to 100,000, and the maximum number of concrete runs to 30.

In our experiments, we allowed up to four test interrupts per method under test when these are caused by tryfirst-statements (see Sect. 2.2.3). We experimented with different such bounds (one, two, four, and eight) on 25 methods from the suite of 101 methods. We found that, for an upper bound of four for the number of allowed interrupts per method, dynamic symbolic execution strikes a good balance between testing time and the number of detected bugs. We used a machine with a quad-core CPU (Intel Core i7-4770, 3.4 GHz) and 16 GB of RAM for these experiments.

Performance of static analysis and instrumentation

On average, Clousot analyzes each method from our suite in 1.9 seconds. The may-unverified and must-unverified instrumentations are very efficient. On average, they need 22 milliseconds per method, when combined.

Configurations

To evaluate our technique, we use the following configurations:

− UV: unverified code. Stages 1 and 2 of the tool chain are not run.

− PV: partially-verified code. Stage 2 of the tool chain is not run.

− MAY: partially-verified code, instrumented with may-unverified conditions. All stages of the tool chain are run. Stage 2 adds only the may-unverified instrumentation.

− MUST: partially-verified code, instrumented with must-unverified conditions. All stages of the tool chain are run. Stage 2 adds only the must-unverified instrumentation.

− MAY × MUST: partially-verified code, instrumented with both the may-unverified and must-unverified conditions. All stages of the tool chain are run. Stage 2 adds the combined may-unverified and must-unverified instrumentation.

Fig. 2.5 shows the tests that each configuration generated for the 101 methods, categorized as non-redundant and failing, non-redundant and successful, or redundant tests. A failing test terminates abnormally, whereas a successful one terminates normally. However, tests that terminate on exceptions that are explicitly thrown by the method under test, for instance, for parameter validation, are not considered failing. To determine the redundant tests generated by configurations UV, PV, and MUST, we ran all tests generated by these configurations against the 101 methods, after having instrumented the methods with the may-unverified conditions. We then counted how many of these tests were aborted. Note that the figure does not include tests that are interrupted because a condition in a tryfirst-statement is violated (since these tests are re-generated, and counted, later).


Figure 2.5: The tests generated by each configuration, categorized as non-redundant and failing, non-redundant and successful, or redundant tests. MAY × MUST generates 16.1% fewer tests, but 7.1% more non-redundant tests than PV, including five additional failing tests.

The results of dynamic symbolic execution alone, of UV, do not significantly differ from those of PV in terms of the total number of tests and the number of non-redundant tests generated for the 101 methods. This confirms that the instrumentation from stage 1 alone, without the may-unverified and must-unverified instrumentation, does not reduce the test effort significantly for partially-verified methods, as we explained in Sect. 2.1.2. For the following experiments, we use configuration PV as the baseline to highlight the benefits of the may-unverified and must-unverified inference.

Smaller test suites

The may-unverified instrumentation causes DSE to abort tests leading to verified executions. By aborting these tests, our technique prunes the verified parts of the search space that would be explored only if these tests were not aborted. As a result, DSE generates smaller test suites. Fig. 2.5 shows that, in total, MAY generates 19.2% fewer tests and MAY × MUST generates 16.1% fewer tests than PV. The differences in the total number of tests for configurations without the may-unverified instrumentation are minor.

Fig. 2.6 compares the total number of generated tests (including aborted tests) by PV and MAY per method.


Figure 2.6: Change in the total number of tests generated for each of the 101 methods by configuration MAY in comparison to PV (in percentage). Negative values indicate that MAY produces fewer tests.

For many methods, MAY produces fewer tests, as shown by the negative values. However, for some methods, MAY generates more tests than PV. This happens when pruning verified parts of the search space guides DSE toward executions that are easier to cover within the exploration bounds of Pex (for instance, maximum number of branches or constraint solver timeouts).

More unverified executions

Even though configurations MAY and MAY × MUST generate smaller test suites in comparison to PV, they do not generate fewer non-redundant tests, as shown in Fig. 2.5. In other words, they generate at least as many non-redundant tests as PV, thus covering at least as many unverified executions.

The must-unverified instrumentation causes DSE to prioritize inputs that lead to unverified executions. In comparison to the may-unverified conditions, the must-unverified conditions are stronger, and their instrumentation is usually added further up in the control flow. As a result, MUST and MAY × MUST can guide dynamic symbolic execution to cover unverified executions earlier, and may allow it to generate more non-redundant tests within the exploration bounds. As shown in Fig. 2.5, configuration MUST generates 6.3% more non-redundant tests than PV and 5.6% more than MAY (MAY × MUST produces 7.1% and 6.5% more non-redundant tests, respectively).


Figure 2.7: Testing time for each configuration. We only considered methods for which all configurations generated the same number of non-redundant tests. MAY × MUST is 52.4% faster than PV.

By generating more such tests, we increase the chances of producing more failing tests. In fact, MUST generates 4.8% more failing tests than PV and 4.1% more than MAY (MAY × MUST produces 3.4% and 2.7% more failing tests, respectively). MUST typically generates more non-redundant tests for methods in which Clousot detects errors, that is, for methods with unverified assertions. In such methods, the may-unverified instrumentation is added only after the unverified assertions in the control flow (if the conditions are non-trivial), thus failing to guide dynamic symbolic execution toward unverified executions early on, as discussed in Sect. 2.1.2.

Shorter testing time

We now compare the testing time of the different configurations. For this experiment, we considered only methods for which all configurations generated the same number of non-redundant tests. This is to ensure a fair comparison; for these methods, all configurations achieved the same coverage of unverified executions. This experiment involved 72 out of the 101 methods, and the time it took for each configuration to test these methods is shown in Fig. 2.7. As expected, pruning verified parts of the search space with the may-unverified instrumentation is very effective. In particular, configuration MAY is 51.7% faster and configuration MAY × MUST is 52.4% faster than PV.

Note that Fig. 2.7 does not include the time of the static analysis for two reasons. First, Clousot is just one way of obtaining verification results. Second, the goal of our work is to efficiently complement existing verification results with test case generation; we assume that the static analysis is run anyway to achieve a more thorough scrutiny of the code. Recall that the overhead of the instrumentation is negligible. The differences in performance between configurations without the may-unverified instrumentation are less pronounced.

Even though MAY is overall much faster than PV, there were methods for which the testing time for MAY was longer in comparison to PV. This is the case when constraint solving becomes more difficult due to the inferred conditions. In particular, it might take longer for the constraint solver to prove that an inferred condition at a certain program point is infeasible.

Fewer exploration bounds reached

During its exploration, dynamic symbolic execution may reach bounds that prevent it from covering certain, possibly failing, execution paths. There are four kinds of bounds that were reached during our experiments:

− max-branches: maximum number of branches that may be taken along a single execution path;

− max-stack: maximum size of the stack, in number of active call frames, at any time during a single execution path;

− max-runs: maximum number of runs that will be tried during an exploration (each run uses different inputs, but some runs are not added to the test suite if they do not increase coverage);

− max-solver-time: maximum number of seconds that the constraint solver has to find inputs that will cause an execution path to be taken.

Fig. 2.8 shows the exploration bounds in Pex that were reached by each configuration when testing all 101 methods. MAY, MUST, and MAY × MUST reach the max-solver-time bound more often than PV. This is because our instrumentation introduces additional conjuncts in the path conditions, occasionally making constraint solving harder. Nevertheless, configurations MAY and MAY × MUST overall reach significantly fewer bounds than PV (for instance, the max-stack bound is never reached) by pruning verified parts of the search space. This helps in alleviating an inherent limitation of dynamic symbolic execution by building on results from tools that do not suffer from the same limitation.

Winner configuration

As shown in Fig. 2.5, configuration MAY × MUST generates the second smallest test suite, containing the largest number of non-redundant tests and the smallest number of redundant tests. This is achieved in the shortest amount of testing time for methods with the same coverage of unverified executions across all configurations (Fig. 2.7), and by reaching the smallest number of exploration bounds (Fig. 2.8).


Figure 2.8: The exploration bounds reached by each configuration, grouped into max-branches, max-stack, max-runs, and max-solver-time. MAY and MAY × MUST overall reach fewer bounds than PV.

Therefore, configuration MAY × MUST effectively combines the benefits of both the may-unverified and must-unverified instrumentation, to prune parts of the search space that lead only to verified executions as well as to identify and prefer test inputs that lead to unverified executions as soon as possible.

Note that, in practice, these benefits should be independent of the exploration strategy in the underlying dynamic symbolic execution. For methods whose exploration does not reach any bounds, the order in which the tests are generated is obviously not relevant. For the remaining methods, we do not expect an exploration strategy to significantly affect how often our instrumentation is hit because Clousot makes unsound assumptions for various expressions and statements and, thus, assumed-statements are spread across the method body. We have confirmed this expectation by running the MAY × MUST configuration with different exploration strategies on 20 out of the 101 methods for which exploration bounds were reached. The differences between all strategies (breadth-first, random search, and Pex's default search strategy) were negligible.

Soundness bugs in Clousot

During our experiments, we realized that our verification annotations can also be used to systematically test for soundness issues in static analyzers [45]. This is achieved as follows. Given a piece of code, imagine that configuration UV generates a number of failing tests. Now, we instrument the code with the known unsound assumptions made by a static analyzer and its verification results (stage 1 of the tool chain). We detect a soundness issue if, when running the failing tests against the instrumented code, at least one failing test runs through an assertion assert P verified A, where A does not imply P. A soundness issue could either be caused by accidental unsoundness (that is, bugs in the implementation of the analyzer) or by bugs in Inspector-Clousot (for instance, missing a source of deliberate unsoundness).

In this way, we found the following three bugs in the implementation of Clousot: (1) unsigned integral types are not always treated correctly, (2) the size of each dimension in a multi-dimensional array is not checked to be non-negative upon construction of the array, and (3) arithmetic overflow is ignored in modulo operations (for instance, MinValue % -1). We reported these three bugs to the main developer of Clousot, Francesco Logozzo, who confirmed all of them. The latter two bugs have already been fixed in the latest version of the tool².
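As an illustration of the third bug, consider the following hedged sketch (our own example, not taken from the evaluation suite). At runtime, x % y overflows when x is int.MinValue and y is -1, because the corresponding division does not fit into an int; since Clousot ignored overflow in modulo operations, it reported no warning for such code.

    // Hedged sketch: a method on which the modulo soundness bug manifests.
    // On the .NET runtimes we are aware of, int.MinValue % -1 throws a
    // System.OverflowException, because int.MinValue / -1 is not representable as an int.
    using System;

    public static class ModuloExample
    {
        public static int Remainder(int x, int y)
        {
            if (y == 0) throw new ArgumentException("y must be non-zero");
            return x % y;                                   // fails for x = int.MinValue, y = -1
        }

        public static void Main()
        {
            Console.WriteLine(Remainder(7, 3));             // prints 1
            Console.WriteLine(Remainder(int.MinValue, -1)); // arithmetic overflow at runtime
        }
    }

A failing test with these inputs, generated for the un-instrumented code (configuration UV), runs through the corresponding implicit assertion even though the analyzer marked it as verified, which is exactly the discrepancy that the check described above detects.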

Threats to validity

We identified the following threats to the validity of our experiments:

− Sample size: For our experiments, we used 101 methods from nine C# projects and from solutions to 13 programming tasks.

− Static analyzer: For our experiments, we used a modular (as opposed to whole-program) static analyzer, namely, Clousot. Moreover, our experimental results depend on the deliberate sources of unsoundness and verification results of this particular analyzer. Note that there are a few sources of unsoundness in Clousot that our tool chain does not capture [32], for instance, about reflection or unmanaged code.

− Soundly-analyzed methods: 23 of the 101 methods contain no assumed-statements. In case Clousot reports no warning, these methods are fully verified and, thus, our may-unverified instrumentation prunes the entire search space. Other code bases could have a smaller fraction of fully-verified methods, leading to less effective pruning.

− Failing tests: The failing tests generated by each configuration do not necessarily reveal bugs in the containing methods. This is inherent to unit testing, since methods are tested in isolation rather than in the context of the entire program. However, 50 out of the 101 methods validate their parameters (and for ten methods no parameter validation was necessary), which suggests that programmers did intend to prevent failures in these methods.

² https://github.com/Microsoft/CodeContracts (revs: 803e34e72061b305c1cde37a886c682129f1ddeb and 1a3c0fce9f8c761c3c9bb8346291969ed4285cf6)

2.4 Related work

Many static analyzers that target mainstream programming languages deliberately make unsound assumptions in order to increase automation, improve performance, and reduce the number of false positives and the annotation overhead for the programmer [106]. Examples of such analyzers are HAVOC [5], Spec# [12], and ESC/Java [62]. Our technique can effectively complement these analyzers by dynamic symbolic execution.

Integration of static analysis and test case generation

Various approaches combine static analysis and automatic test case generation, to determine whether an error reported by the static analysis is spurious and to reduce the search space for the test case generator. For example, Check ’n’ Crash [43] is an automated defect detection tool that integrates the unsound ESC/Java static checker with the JCrasher [42] testing tool. Check ’n’ Crash was later integrated with Daikon [57] in the DSD-Crasher tool [44]. Similarly, DyTa [65] integrates Clousot with Pex. Like our work, all of these approaches use results from the static analysis to guide test case generation toward the errors reported by the static analysis and to prune parts of the search space during testing. However, in contrast to our work, they ignore the unsoundness of the static analysis: each assertion for which the static analysis does not report an error is considered soundly verified, even if the analysis makes unsound assumptions. Consequently, these approaches may prune unverified executions, whereas our technique retains all executions that are not fully verified and, therefore, may reveal errors missed by the unsound static analysis.

The SANTE tool [26] uses a sound value analysis (in combination with program slicing) to prune those execution paths that do not lead to unverified assertions. In contrast, our work supports the common case that a static analysis is unsound.

Several analyses combine over- and under-approximations of the set of program executions. Counterexample-guided abstraction refinement (CEGAR) [34] exploits the abstract counterexample trace of a failing proof attempt to suggest a concrete trace that might reveal a real error. If, however, the abstract trace refers to a spurious error, the abstraction is refined in such a way that subsequent verification attempts will not reproduce the infeasible abstract trace. YOGI [75, 121] switches between static analysis and DSE both to prove properties and find bugs, without reporting false positives. Specifically, YOGI uses two different abstract domains, one (not-)may abstraction for proving a property and one must abstraction for disproving a property. The two abstractions are used simultaneously, communicate with each other, and refine each other for either finding a proof or a bug.

To obtain an over-approximation of the set of program executions, these approaches rely on a sound analysis. In contrast, our work supports the common case that a static analysis is unsound, that is, neither over- nor under-approximates the set of program executions (in other words, the analysis may have both false positives and false negatives). Soundly-verified executions and executions for which the analysis reports an error are handled similarly to work based on over-approximations: we prune soundly-verified executions during test case generation, and use an under-approximation (testing) to find bugs and identify spurious errors among the executions for which the analysis reports an error. The novelty of our work is that we also handle executions that are verified unsoundly, that is, under unsound assumptions. Our annotations make these assumptions explicit (in other words, they express which executions one would have to add to the set of analyzed executions for it to become a sound over-approximation). These executions are then targeted by an under-approximation.
A recent approach [46] starts by running a conditional model checker [14] on a program, and then tests those parts of the state space that were not covered by the model checker (for instance, due to timeouts). More specifically, the model checker produces an output condition, which captures the safe states and is used to produce a residual program that can be subsequently tested. Unlike an instrumented program in our technique, the residual program can be structurally very different from the original program. As a result, its construction can take a significant amount of time, as the authors point out. Furthermore, this approach can characterize assertions only as either fully verified or unverified on a given execution path. It is not clear how to apply this approach in a setting with static analysis tools that are not fully sound [106, 32] without reducing its effectiveness.

Dynamic symbolic execution

Testing and symbolically executing all feasible program paths is not possible in practice. The number of feasible paths can be exponential in the program size, or even infinite in the presence of input-dependent loops.

Existing testing tools based on dynamic symbolic execution alleviate path explosion using search strategies and heuristics, which guide the search toward interesting parts of the program while pruning the search space. These strategies typically optimize properties such as “deeper paths” (in depth-first search), “less-traveled paths” [105], “number of new instructions covered” (in breadth-first search), or “paths specified by the programmer” [129]. For instance, SAGE [73] uses a generational-search strategy with simple heuristics, such as flip count limits and constraint subsumption. Other industrial-strength tools, like Pex, also use similar techniques. As we explained in Sect. 2.3.2, the benefits of our approach are independent of the exploration strategy in the underlying dynamic symbolic execution. Our technique resembles a search strategy in that it optimizes unverified executions, prunes verified executions, and is guided by verification annotations, instead of properties like the above.

Compositional symbolic execution [66, 2] has been shown to alleviate path explosion. Dynamic state merging [89] and veritesting [4] alleviate path explosion by merging sub-program searches, while RWset [15] prunes searches by dynamically computing variable liveness. By guiding dynamic symbolic execution toward unverified program executions, our technique also alleviates path explosion. In particular, the may-unverified instrumentation causes dynamic symbolic execution to abort tests that lead to verified executions. When aborting these tests, our technique prunes the parts of the search space that would be discovered only if these tests were not aborted. Moreover, since our technique does not require a particular DSE algorithm, it can be combined with any of the above approaches by running them on a program that contains our instrumentation.

2.5 Summary and remarks

We have presented a technique for complementing partial verification results by automatic test case generation. Our technique causes dynamic symbolic execution to abort tests that lead to verified executions, consequently pruning parts of the search space, and to prioritize tests that lead to unverified executions. It is applicable to any program with verification annotations, either generated automatically by a (possibly unsound) static analysis or inserted manually, for instance, during a code review. Our work suggests a novel way to combine static analysis and testing in order to maximize software quality, and investigates to what extent unsound static analysis reduces the test effort.

Chapter 3

Synthesizing parameterized unit tests to detect object invariant violations

In Ch. 2, we showed how to guide automatic test case generation toward unverified program executions. By focusing the test effort on unverified parts of the search space, we realized that, similarly to many static analyzers, existing techniques for systematic testing are also unsound, that is, they also produce false negatives. This is the case not only when these techniques fail to exercise certain program executions, but also when the correctness of the exercised executions is evaluated against weak test oracles. In other words, even when a testing tool achieves full code coverage and generates no failing tests, errors might still be missed. In this chapter, we address the problem of weak test oracles with respect to object invariants, as we explain next.

Automatic testing techniques, such as random testing or symbolic execution, rely on a description of the input data that the unit under test (UUT) is intended to handle. Such a description acts as a filter for undesirable input data. It is usually expressed as code in the test driver or as a method precondition that specifies the valid arguments for the method under test. For instance, when testing a method for sorting an array, a test data generator might populate input arrays randomly, and the filter might suppress arrays containing null.

Filtering input data with code in test drivers or with preconditions poses two difficulties when the inputs are heap data structures. First, the same filters must typically be duplicated across many test cases, because many operations on such data structures rely on the same properties of the data structure, for instance, on a field being non-null or on a list being sorted. Second, filters for recursive heap data structures, such as the sortedness of a list, must be expressed with loops or recursion, which are difficult to handle in constraint-based testing approaches such as symbolic execution [74].

To address these difficulties, some test data generators use predicates that express which instances of a data structure are considered valid and should, thus, be used for unit testing. In an object-oriented setting, these predicates are often called class or object invariants [109, 101]. For example,

For example, in the case of a sorted linked list, the invariant simply states that for each list node, the value of the next node is greater than or equal to its own. Object invariants avoid the shortcomings of other filtering techniques: they are stated only once for each class rather than in each test case or method precondition, and by constraining each instance of the class, they can express properties of recursive structures without introducing loops or recursion.

Invariants may be provided by the programmer in the form of contracts, like in the random testing tool AutoTest [110] for Eiffel and in the dynamic symbolic execution tool Pex [135] for .NET, or by the tester, like in the Korat [17] tool for Java, which exhaustively enumerates data structures that satisfy a given predicate up to a bound. Invariants may also be inferred by tools like the Daikon invariant detector [57], which is used by the symbolic execution tool Symbolic Java PathFinder [125] for obtaining input constraints on a UUT.

Using object invariants to generate test data requires the invariants to accurately describe the data structures a program may create. When an invariant is too weak, i.e., admits more data structures than the program may create, the test case generator may produce undesirable inputs, which are however easily detected when inspecting failing tests. A more severe problem occurs when an invariant is too strong, i.e., admits only a subset of the data structures the program might actually create. The test case generator may then not produce desirable inputs since they are filtered out due to the overly strong invariant. Consequently, the UUT is executed with a restricted set of inputs, which potentially fail to exercise certain execution paths and may miss bugs.

Too strong invariants occur, for instance, when programmers specify invariants they intend to maintain but fail to do so due to a bug, when they fail to capture all intended program behaviors in the invariant, or when invariants are inferred from program runs that do not exercise all relevant behaviors. Therefore, it is essential that invariants are not only used to filter test inputs, but are also checked as part of test oracles, just like method preconditions are checked at call sites. However, in contrast to method preconditions or other filters that are implemented in test drivers, checking object invariants is very difficult, as shown by work on program verification [51, 114, 116, 9, 10]. In particular, it is generally not sufficient to check at the end of each method that the invariant of its receiver is maintained. This traditional approach [109], which is for instance implemented in Pex and AutoTest, may miss invariant violations when programs use common idioms such as direct field updates, inheritance, or aggregate structures (see Sect. 3.1).

We address this issue with a technique for detecting previously missed invariant violations by synthesizing parameterized unit tests (PUTs) [136] that are likely to create broken objects, i.e., class instances that do not satisfy their invariants. The synthesis is based on templates that capture all situations in which traditional invariant checking is insufficient. We use dynamic symbolic execution [70, 20] to find inputs to the synthesized PUTs that actually violate an invariant.
Whenever our approach detects an invariant violation, the programmer has to inspect the situation to decide which of the following three cases applies: (1) The object invariant is stronger than intended. In this case, one should weaken the invariant. (2) The invariant expresses the intended properties, but the program does not maintain it. This case constitutes a bug that should be fixed. (3) The invariant expresses the intended properties and can, in principle, be violated by clients of the class, but the entire program does not exhibit such violations. For instance, the class might provide a setter that violates an invariant when called with a negative argument, but the program does not contain such a call. In such cases, one should nevertheless adapt the implementation of the class to make the invariant robust against violations, for future program changes during maintenance and for other clients of the class during code reuse.

The contributions of this chapter are as follows:

− It identifies an important limitation of current test case generation approaches in the treatment of object invariants. In particular, existing approaches that use invariants as filters on input data do not sufficiently check them, if at all.

− It presents a technique that detects invariant violations by synthesizing PUTs based on templates and exploring them via DSE.

− It demonstrates the effectiveness of this technique by implementing it as an extension to Pex and using it on a suite of open-source C# applications that contain invariants provided manually by programmers.

− It investigates how to fix detected invariant violations, that is, by evaluating whether these violations are caused by errors in the code, errors in the invariants, or insufficient robustness of the invariants.

− It applies our technique to invariants that were automatically inferred by Daikon. Our experiments show that the vast majority of these invariants can be violated by our technique and, thus, should be checked before using them to filter test inputs.

− It shows how to use our technique to construct test inputs for auxiliary methods that are intended to be called when the invariant does not hold.

Outline. This chapter is organized as follows. Sect. 3.1 illustrates the situations in which the traditional checks for object invariants are insufficient. Sect. 3.2 gives an overview of our approach. Sect. 3.3 explains how we select the operations to be applied in a synthesized test, and Sect. 3.4 describes the templates used for the synthesis. We discuss our implementation in Sect. 3.5, present the experimental evaluation in Sect. 3.6, and describe how to use our approach to test auxiliary methods in Sect. 3.7. We review related work in Sect. 3.8.

3.1 Violating object invariants

We present three scenarios in which the traditional approach of checking at the end of each method whether it maintains the invariant of its receiver is insufficient. These scenarios have been identified by work on formal verification and, together with a fourth scenario (call-backs, which are not relevant here, as explained in Sect. 3.7), have been shown to cover all cases in which traditional invariant checking does not suffice [51]. We assume that invariants are specified explicitly in the code as contracts. However, our technique applies equally to predicates that are provided as separate input to the test case generator, or invariants that have been inferred from program runs (see Sect. 3.6.3).

We illustrate the scenarios using the C# example in Fig. 3.1. A Person holds an Account and has a salary. An Account has a balance. Person's invariant (lines 5–6) requires that account is non-null and the sum of the account's balance and the person's salary is positive. A SavingsAccount is a special Account whose balance is non-negative (line 26). In each of the following scenarios, we consider an object p of class Person that holds an Account a.

Direct field updates

In most object-oriented languages, such as C++, C#, and Java, a method may update not only fields of its receiver, but of any object as long as the fields are accessible. For instance, method Spend2 (which is an alternative implementation of Spend1) subtracts amount from the account's balance through a direct field update, instead of calling method Withdraw. Such direct field updates are common among objects of the same class, say, nodes of a list, or of closely connected classes, say, a collection and its iterator.

If a is a SavingsAccount, method Spend2 might violate a's invariant by setting balance to a negative value. A check of the receiver's invariant at the end of method Spend2 (here, Person object p) does not reveal this violation. In order to detect violations through direct field updates, one would have to check the invariants of all objects whose fields are assigned to directly. However, these objects are not statically known (for instance, when the direct field update occurs within a loop), which makes it difficult to impose such checks.

 1 public class Person {
 2   Account account;
 3   public int salary;
 4
 5   invariant account != null;
 6   invariant 0 < account.balance + salary;
 7
 8   public void Spend1(int amount) {
 9     account.Withdraw(amount);
10   }
11
12   public void Spend2(int amount) {
13     account.balance = account.balance - amount;
14   }
15 }
16
17 public class Account {
18   public int balance;
19
20   public void Withdraw(int amount) {
21     balance = balance - amount;
22   }
23 }
24
25 public class SavingsAccount : Account {
26   invariant 0 <= balance;
27 }

Figure 3.1: A C# example on invariant violations. We declare invariants using a special invariant-keyword.
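To make the direct-field-update scenario concrete, the following self-contained sketch mirrors Fig. 3.1: the invariants are encoded here as ordinary boolean methods, and the Person constructor taking an account is assumed purely for illustration (it is not part of the original example).

// Minimal, self-contained sketch of the direct-field-update scenario (cf. Fig. 3.1).
// The invariants are encoded as boolean methods; the Person(Account) constructor
// is an assumption made for this illustration.
using System;

class Account {
    public int balance;
    public virtual bool Invariant() { return true; }
    public void Withdraw(int amount) { balance = balance - amount; }
}

class SavingsAccount : Account {
    public override bool Invariant() { return 0 <= balance; }
}

class Person {
    Account account;
    public int salary;
    public Person(Account a) { account = a; }
    public bool Invariant() { return account != null && 0 < account.balance + salary; }
    public void Spend2(int amount) { account.balance = account.balance - amount; }
}

static class Demo {
    static void Main() {
        var a = new SavingsAccount();           // balance == 0, invariant holds
        var p = new Person(a) { salary = 100 };

        p.Spend2(50);                           // direct update sets a.balance to -50

        // The traditional check only asserts the receiver's invariant after Spend2:
        Console.WriteLine(p.Invariant());       // True: -50 + 100 > 0
        Console.WriteLine(a.Invariant());       // False: the SavingsAccount invariant is broken
    }
}

Running this sketch reports no failure for the receiver p, even though a's invariant no longer holds, which is exactly the gap the traditional check leaves open.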

Subclassing

Subclasses may restrict the possible values of a field inherited from a superclass, i.e., they may strengthen the invariant for this field, as shown by class SavingsAccount. Methods declared in the superclass are typically tested with instances of the superclass as their receiver; in such cases, only the weaker superclass invariant is being checked. When such methods are inherited by the subclass and called on subclass instances, they may violate the stronger subclass invariant. In our example, in case a is a SavingsAccount, calling the inherited method Withdraw on a might set balance to a negative value and violate the invariant of the subclass.

To detect such violations, one would have to re-test every inherited method whenever a new subclass is declared. Moreover, subclassing makes the invariant checks for direct field updates even more difficult because one would have to consider all subclasses for the objects whose fields are updated. For instance, when introducing SavingsAccount, testing Withdraw on a subclass instance is not sufficient; one has to also re-test method Spend2 to detect the invariant violation described in the previous scenario.

Multi-object invariants

Most data structures are implemented as aggregations of several objects. For such aggregate structures, it is common that an object invariant constrains and relates the states of several objects. In our example, the invariant of class Person relates the state of a Person object to the state of its Account. For such multi-object invariants, modifying the state of one object might break the invariant of another. For instance, when Account a executes method Withdraw, it might reduce the balance by an amount such that it violates the invariant of Person p.

To detect such violations, one would have to check the invariants of all objects that potentially reference a, e.g., the invariants of Person objects sharing the account, of collections storing the account, etc. These objects are not statically known and cannot even be approximated without inspecting the entire program, which defeats the purpose of unit testing.

These scenarios demonstrate that the traditional way of checking object invariants may miss violations in common situations, and that the checks cannot be strengthened in any practical way. Therefore, simply including all necessary invariant checks in the test oracle is not feasible; other techniques are required to detect invariant violations.

3.2 Approach

For a given UUT, we synthesize client code in the form of PUTs to detect invariant violations. The synthesis is based on a set of four fixed templates that capture all three scenarios of Sect. 3.1. Each template consists of a sequence of candidate operations, i.e., updates of public fields and calls to public methods. These operations are applied to the object whose invariant is under test or, in the case of aggregate structures, its sub-objects. (Since our approach synthesizes client code, it uses public candidate operations. To also synthesize code of possible subclasses, one would analogously include protected fields and methods.) The candidate operations are selected from the UUT based on whether they potentially lead to a violation of the object invariant. Depending on the template, additional restrictions are imposed on the candidate operations, e.g., that they are inherited from a superclass. By instantiating the templates with candidate operations, the synthesized PUTs become snippets of client code that potentially violate the invariant. Alg. 3.1 shows the general strategy for the PUT synthesis.

Algorithm 3.1: Synthesis of parameterized unit tests.

1 function Synthesize(class, inv, len)
2   candOps ← ComputeCandOps(class, inv)
3   puts ← GenFieldCombs(candOps, len)
4   puts ← puts + GenMultiCombs(candOps, len, inv)
5   puts ← puts + GenSubCombs(candOps, len)
6   puts ← puts + GenAllCombs(candOps, len)
7   return AddSpecs(puts)

Function Synthesize takes the class of the object to which candidate operations should be applied (class), the object invariant under test (inv), and the desired length of the PUTs to be synthesized (len). The last argument prevents a combinatorial explosion by bounding the number of operations in each synthesized PUT. Synthesize returns a list of PUTs. Each PUT consists of a sequence of candidate operations and additional specifications, such as invariant checks, which are inserted by AddSpecs and explained in Sect. 3.3. The algorithm first determines the set of candidate operations (candOps) of class that could potentially violate the object invariant inv. It then synthesizes the PUTs for each of the three scenarios using the corresponding templates. We discuss the selection of candidate operations as well as these templates in detail in the next sections.

We complement the templates, which cover specific scenarios for violating invariants, by an exhaustive enumeration of combinations (of length len) of candidate operations. In the algorithm, these combinations are computed by function GenAllCombs. As we will see in Sect. 3.4, this exhaustive exploration is useful for multi-object invariants, where the actual violation may happen by calling a method on a sub-object of an aggregate structure.

For a given UUT, we apply function Synthesize for each class in the unit and the invariant it declares or inherits. We perform this application repeatedly for increasing values of len. All operations in the resulting PUTs have arguments that are either parameters of the enclosing PUT or results of preceding operations; all such combinations are tried exhaustively, which, in particular, includes aliasing among the arguments. This makes the PUTs sufficiently general to capture the scenarios of the previous section, i.e., to detect invariant violations caused by these scenarios. We employ dynamic symbolic execution (DSE) [70, 20] to supply the arguments to the PUTs.

3.3 Candidate operations

To synthesize client code that violates object invariants, we select candidate operations from the public fields and methods of the UUT. To reduce the number of synthesized PUTs, we restrict the operations to those that might violate a given invariant.

public class C {
  public int x;
  int y;

  invariant x == 42;

  public void SetX() { x = y; }

  public void SetY(int v) { y = v; }
}

Figure 3.2: A C# example on selecting candidate operations.

Such operations are determined by intersecting the read effect of the invariant with the write effect of the operation. The read effect of an invariant is the set of fields read in the invariant. If the invariant contains calls to side-effect free methods, the fields (transitively) read by these methods are also in its read effect. The write effect of a method is the set of fields updated during an execution of the method, including updates performed through method calls. The write effect of a field update is the field itself. Note that the effects are sets of (fully-qualified) field names, not concrete instance fields of objects. This allows us to use a simple, whole-program static analysis that conservatively approximates read and write effects without requiring alias information (see Sect. 3.5).

To illustrate these concepts, consider the example in Fig. 3.2. The read effect of the invariant is {C.x}, indicating that only the value of C's field x determines whether the invariant holds. The write effect of an update to the public field x and of method SetX is {C.x}, while method SetY has write effect {C.y}. By intersecting these read and write effects, we determine that updates of x and calls to SetX must be included in the candidate operations. With these operations, the exhaustive enumeration of sequences of length one (function GenAllCombs in Alg. 3.1) produces the two PUTs in Fig. 3.3. As shown in the figure, each synthesized test expects as argument a non-null object o whose invariant holds, applies the synthesized sequence of candidate operations to o, and then asserts that o's invariant still holds. We encode the invariant via a side-effect free, boolean method Invariant, and use assume-statements to introduce constraints for the symbolic execution. The assume- and assert-statements are inserted into the PUTs by function AddSpecs of Alg. 3.1. The input object o is constructed using operations from the UUT, for instance, a suitable constructor.

void PUT_0(C o, int v) {
  assume o != null && o.Invariant();
  o.x = v;
  assert o.Invariant();
}

void PUT_1(C o) {
  assume o != null && o.Invariant();
  o.SetX();
  assert o.Invariant();
}

Figure 3.3: Parameterized unit tests of length one generated when exhaustively enumerating sequences of candidate operations for the example in Fig. 3.2.

As explained above, the arguments of candidate operations (like the value v for the assignment to o.x in the first test PUT_0 of Fig. 3.3) are either parameters of the PUT and supplied later via DSE, or results of preceding operations.

Whether a method call violates an invariant may not only depend on its arguments, but also on the state in which it is called. For instance, a call to SetX violates the invariant only if y has a value different from 42. Therefore, tests that apply more than one candidate operation must take into account the possible interactions between operations. Consequently, for each candidate operation opd that might directly violate a given object invariant, we compute its read effect and include in the set of candidate operations each operation opi whose write effect overlaps with this read effect and might, therefore, indirectly violate the invariant. To prune the search space, we record that opi should be executed before opd. This process iterates until a fixed point is reached.

In our example, method SetX has read effect {C.y}. As a result, SetY is used in the PUTs as a candidate operation that should be called before SetX. Therefore, the exhaustive enumeration of sequences of length two includes PUT_2 of Fig. 3.4. Note that, by assuming o's invariant before the call to SetX, we suppress execution paths that have already been tested in a shorter PUT, i.e., paths that violate o's invariant before reaching the final operation.
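The selection and fixed-point closure just described can be sketched as follows. This is a simplified illustration over string-based effect sets: the Operation type, the Compute function, and the example values are assumptions rather than the tool's actual data structures, and the before/after ordering recorded between indirect and direct candidates is omitted.

using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative sketch of candidate-operation selection by effect intersection (Sect. 3.3).
class Operation {
    public string Name;
    public HashSet<string> ReadEffect;   // fields (transitively) read by the operation
    public HashSet<string> WriteEffect;  // fields (transitively) written by the operation
}

static class CandidateOps {
    // Selects operations whose write effect overlaps the invariant's read effect, then
    // closes the set under operations that write what a candidate reads (fixed point).
    public static HashSet<Operation> Compute(HashSet<string> invariantReadEffect,
                                             IList<Operation> allOps) {
        var candidates = new HashSet<Operation>();
        var relevantReads = new HashSet<string>(invariantReadEffect);
        bool changed = true;
        while (changed) {
            changed = false;
            foreach (var op in allOps) {
                if (!candidates.Contains(op) && op.WriteEffect.Overlaps(relevantReads)) {
                    candidates.Add(op);
                    relevantReads.UnionWith(op.ReadEffect); // fields read by the new candidate
                    changed = true;                         // may be set up by other operations
                }
            }
        }
        return candidates;
    }

    static void Main() {
        // The example of Fig. 3.2: the invariant reads C.x; SetX writes C.x and reads C.y;
        // SetY writes C.y; an update of the public field x writes C.x.
        var ops = new List<Operation> {
            new Operation { Name = "update C.x", ReadEffect = new HashSet<string>(),
                            WriteEffect = new HashSet<string> { "C.x" } },
            new Operation { Name = "SetX", ReadEffect = new HashSet<string> { "C.y" },
                            WriteEffect = new HashSet<string> { "C.x" } },
            new Operation { Name = "SetY", ReadEffect = new HashSet<string>(),
                            WriteEffect = new HashSet<string> { "C.y" } }
        };
        var cands = Compute(new HashSet<string> { "C.x" }, ops);
        // Prints all three operations; SetY is included only because SetX reads C.y.
        Console.WriteLine(string.Join(", ", cands.Select(o => o.Name)));
    }
}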

3.4 Synthesis templates

We now present the templates that capture the three scenarios of Sect. 3.1. Besides other arguments, each template expects an object r to which candidate operations are applied, and an object o whose invariant is under test. When the templates are used to synthesize an entire test, these two objects coincide and we include only one of them in the PUT. The templates are also used to synthesize portions of larger PUTs, and then r and o may refer to different objects.

void PUT_2(C o, int v) {
  assume o != null && o.Invariant();
  o.SetY(v);
  assume o.Invariant();
  o.SetX();
  assert o.Invariant();
}

Figure 3.4: Parameterized unit test of length two generated when exhaustively enumerating sequences of candidate operations for the example in Fig. 3.2.

3.4.1 Direct field updates

The direct-field-update template tries to violate the invariant of an object o by assigning to a field of r (or to an element of r when r is an array). The template has the form shown in Fig. 3.5. It applies a sequence of operations (Op_0 to Op_M) to r to create a state in which the subsequent update of r.f may violate o's invariant. For instance, if the invariant relates f to private fields of the same object, these operations may be method calls that update these private fields. The operations Op_0 to Op_M are selected from the set of candidate operations, and may include a method call or field update more than once. Their arguments as well as the right-hand side v of the last field update are either parameters of the template (a0 to aN) or results of preceding operations; all such combinations are tried exhaustively.

The synthesis of PUTs from this template is performed by function GenFieldCombs in Alg. 3.2, which is invoked from Alg. 3.1.

void DFU(r, o, a0..aN) {
  assume r != null;
  assume o != null && o.Invariant();
  Op_0(r, ...);
  ...
  Op_M(r, ...);
  assume o.Invariant();
  r.f = v;
  assert o.Invariant();
}

Figure 3.5: The direct-field-update template.

Algorithm 3.2: Synthesis of parameterized unit tests from the direct-field-update template.

1 function GenFieldCombs(candOps, len)
2   puts ← []
3   combs ← GenAllCombs(candOps, len − 1)
4   fieldOps ← FieldOps(candOps)
5   foreach comb in combs
6     foreach fieldOp in fieldOps
7       puts ← puts + [comb + [fieldOp]]
8   return puts

Line 3 generates all possible sequences of length len − 1 from the set of candidate operations. Line 4 selects the set fieldOps of all field updates from the set of candidate operations, candOps. Lines 5–7 append each field update fieldOp to each of the sequences of operations computed earlier.

Consider an invocation of the synthesis with this template, where the object to which operations are applied and the object whose invariant is being tested are the same instance of Person from Fig. 3.1. The synthesized PUTs of length two include PUT_DFU in Fig. 3.6. Symbolically executing this PUT produces input data that causes the assertion of the invariant to fail; for instance, a Person object with salary 100 and whose Account has balance 100 for o, the value 150 for a, and any value less than or equal to 50 for s.

3.4.2 Subclassing

The template for the subclassing scenario (shown in Fig. 3.7) aims at breaking the invariant of an object by invoking inherited operations. It exhaustively applies a number of operations to an object of the subclass, including any operations inherited from a superclass, and requires that the last operation is an inherited one (i.e., an update of an inherited field or a call to an inherited method) to reflect the subclassing scenario described in Sect. 3.1.

void PUT_DFU(Person o, int a, int s) {
  assume o != null && o.Invariant();
  o.Spend1(a);
  assume o.Invariant();
  o.salary = s;
  assert o.Invariant();
}

Figure 3.6: Parameterized unit test synthesized with the direct-field-update template.

void S(r, o, a0..aN) {
  assume r != null;
  assume o != null && o.Invariant();
  Op_0(r, ...);
  ...
  Op_M(r, ...);
  assume o.Invariant();
  Op_super(r, ...);
  assert o.Invariant();
}

Figure 3.7: The subclassing template.

Like in the template for direct field updates, the first M + 1 operations (Op_0 to Op_M) construct a state in which the final inherited operation may violate o's invariant, since this final operation was designed to maintain the weaker invariant of a superclass. This template is useful only when a subclass strengthens the invariant of a superclass with respect to any inherited fields. We identify such subclasses using a simple syntactic check: if the read effect of the invariant declared in the subclass includes inherited fields, we conservatively assume that the invariant is strengthened with respect to those.

The synthesis of PUTs based on this template is performed by function GenSubCombs, which is invoked from Alg. 3.1. GenSubCombs (shown in Alg. 3.3) is analogous to GenFieldCombs (Alg. 3.2) except that, on line 4, it selects the candidate operations that are inherited from a superclass.

Consider the synthesis with this template, where objects r and o are the same instance of SavingsAccount (Fig. 3.1). This class strengthens the invariant of its superclass Account for the inherited field balance. The synthesized PUTs of length one include PUT_S in Fig. 3.8.

Algorithm 3.3: Synthesis of parameterized unit tests from the subclassing template.

1 function GenSubCombs(candOps, len)
2   puts ← []
3   combs ← GenAllCombs(candOps, len − 1)
4   inheritedOps ← InheritedOps(candOps)
5   foreach comb in combs
6     foreach inheritedOp in inheritedOps
7       puts ← puts + [comb + [inheritedOp]]
8   return puts

void PUT_S(SavingsAccount o, int a) {
  assume o != null && o.Invariant();
  o.Withdraw(a);
  assert o.Invariant();
}

Figure 3.8: Parameterized unit test synthesized with the subclassing template.

The symbolic execution of PUT_S reveals an invariant violation; for instance, for a SavingsAccount object with a balance of zero for o and any positive value for a.

3.4.3 Multi-object invariants

Multi-object invariants describe properties of aggregate structures. The invariant of such a structure may be violated by modifying its sub-objects. For instance, one might be able to violate a Person's invariant by reducing the balance of its account. Such violations are possible when sub-objects of the aggregate structure are not properly encapsulated [116] so that clients are able to obtain references to them: when a client obtains a direct reference to the Account sub-object, it can bypass the Person object and modify the account in ways that violate the Person's invariant.

To reflect this observation, we define two templates that allow clients to obtain references to sub-objects of aggregate structures. One template uses leaking, i.e., it passes a sub-object from the aggregate structure to its client. The other one uses capturing, i.e., it passes an object from the client to the aggregate structure and stores it there as a sub-object. Leaking and capturing are the only ways in which clients may obtain a reference to a sub-object of an aggregate structure.

Leaking

An operation is said to leak an object l if the following three conditions hold: (1) the operation takes as an argument (or receiver) an object o that (directly or transitively) references l, (2) the operation returns the reference to l or assigns it to shared state, and (3) a field of l is dereferenced in o's invariant. We use a static analysis to approximate the operations that might leak a sub-object (see Sect. 3.5). These operations include reading public fields with reference types.

For example, assume that class Person from Fig. 3.1 provides a public getter GetAccount for field account:

public Account GetAccount() { return account; }

void L(r, o, a0..aN) {
  assume r != null;
  assume o != null && o.Invariant();
  Op_0(r, ...);
  ...
  Op_M(r, ...);
  var l = Op_leaking(r, ...);
  ... // operations on leaked object
  assert o.Invariant();
}

Figure 3.9: The leaking template.

This method leaks the account sub-object of its receiver since (1) its receiver directly references the account, (2) it returns the account, and (3) account is dereferenced in the invariant of Person. Consequently, this getter enables clients to obtain a reference to the account sub-object and violate Person's invariant, for instance, by invoking method Withdraw on the account.

In the template for leaking (Fig. 3.9), we first apply a number of operations to create a state in which a sub-object l may be leaked via the Op_leaking operation. Once the object has been leaked, we try to violate o's invariant by applying operations to the leaked object l (indicated by the ellipsis with the comment in Fig. 3.9). To obtain a sequence of operations on l, we apply function Synthesize (Alg. 3.1) recursively with the class of l and the invariant of o. This recursive call selects candidate operations on l that may break o's invariant, e.g., by updating a public field of l, or via complex combinations of scenarios, such as repeated leaking. Note that this template attempts to violate o's invariant; whether l's invariant holds is an orthogonal issue that is addressed separately when testing that invariant.

Based on this template, we obtain the PUT in Fig. 3.10 for objects of class Person from Fig. 3.1. In this test, method GetAccount leaks the Person's account object.

void PUT_L(Person o, int a) {
  assume o != null && o.Invariant();
  var l = o.GetAccount();
  // exhaustive enumeration
  assume l != null && o.Invariant();
  l.Withdraw(a);
  assert o.Invariant();
}

Figure 3.10: Parameterized unit test synthesized with the leaking template.

The recursive application of function Synthesize determines method Withdraw as a candidate operation because its write effect includes balance, which is also in the read effect of Person's invariant. Withdraw is selected by the exhaustive enumeration (function GenAllCombs of Alg. 3.1) and would not be selected by any of the other templates. Symbolically executing this PUT produces input data that causes the assertion of the invariant to fail; for instance, a Person object with salary 100 and whose Account has balance 100 for o, and a value of at least 200 for a.

Capturing

An operation is said to capture an object c if: (1) the operation takes as arguments two objects o and c (o or c could also be the receiver), (2) the operation stores a reference to c in a location reachable from o, and (3) the field in which c is stored is dereferenced in o's invariant. Updating a field f of a reference type is considered capturing if f is dereferenced in o's invariant.

The template for capturing (shown in Fig. 3.11) is analogous to leaking. In particular, it also uses a recursive application of function Synthesize to determine the operations to be applied to the captured object. In the common case that the capturing operation is a constructor of object o, the template is adjusted as shown in the figure (Cctor). This adjustment ensures that o is actually created with a constructor that captures c, instead of a constructor selected by the symbolic execution. Note that before the capturing operation, our implementation also allows a number of operations on c with the goal of bringing it to a state such that, for instance, the precondition of Op_capturing is satisfied or the capturing execution path is taken. We omit such operations here to simplify the presentation.

void C(r, o, c, a0..aN) {
  assume r != null;
  assume o != null && o.Invariant();
  Op_0(r, ...);
  ...
  Op_M(r, ...);
  Op_capturing(r, c, ...);
  ... // operations on captured 'c'
  assert o.Invariant();
}

void Cctor(c, a0..aN) {
  assume c != null;
  Op_0(c, ...);
  ...
  Op_M(c, ...);
  var o = new ctor(c, ...);
  ... // operations on captured 'c'
  assert o.Invariant();
}

Figure 3.11: The capturing templates.

void PUT_Cctor(Account c, int b) {
  assume c != null;
  var o = new Person(c);
  // direct field update
  assume c != null && o.Invariant();
  c.balance = b;
  assert o.Invariant();
}

Figure 3.12: Parameterized unit test synthesized with the capturing template.

As an example, assume that class Person of Fig. 3.1 declares a constructor that captures an already existing account a:

public Person(Account a) { account = a; }

Then, the Cctor template produces the PUT shown in Fig. 3.12. The symbolic execution of PUT_Cctor reveals an invariant violation; for instance, for an Account object with balance 100 for c and any non-positive value for b.

Recursively applying our technique on leaked or captured objects allows for covering even more complex cases. For example, assume that there is a class CreditCardAccount that inherits from Account, and that class Account has a field of type CreditCardAccount and a public setter for this field. In addition, assume that the invariant of class Person requires that the account for the credit card always has a non-negative balance. In this case, our technique synthesizes tests such as the one in Fig. 3.13, in which a getter leaks o's account l, account l is updated to capture a credit card account a, and the balance of the credit card account is directly set to an amount b. If b has a negative value, o's invariant breaks.

Synthesis

The synthesis of PUTs based on the leaking and capturing templates is performed by the GenMultiCombs function in Alg. 3.4. On line 3, a new set of candidate operations, multiOps, is created by selecting from candOps the operations that leak or capture objects according to the above criteria. Since the synthesis for these templates includes a recursive application of Synthesize, we must split the overall length of the PUT between the operations occurring before the leaking or capturing operation and the operations on the leaked or captured object occurring after.

void PUT(Person o, Account a, int b) {
  // leaking
  assume o != null && o.Invariant();
  var l = o.GetAccount();
  // capturing
  assume l != null && o.Invariant();
  l.SetCreditCardAccount(a);
  // direct field update
  assume a != null && o.Invariant();
  a.balance = b;
  assert o.Invariant();
}

Figure 3.13: Complex parameterized unit test synthesized by recursively applying our technique.

To explore all possible splits, we generate all combinations of candidate operations of length up to len − 2 to be applied before the leaking or capturing operation (lines 4–5). These operations create a state in which the next operation can leak or capture an object. After invoking any such leaking or capturing operation, we recursively apply function Synthesize of Alg. 3.1 by taking into account the class of the leaked or captured object, class, and the original object invariant under test, inv (lines 6–8). Therefore, suffixes is a list of sequences of operations to be applied to the leaked or captured object. On lines 9–12, we combine the synthesized sub-sequences of lengths i, 1, and len − 1 − i.

Algorithm 3.4: Synthesis of parameterized unit tests from the leaking and capturing templates.

1 function GenMultiCombs(candOps, len, inv)
2   puts ← []
3   multiOps ← MultiOps(candOps)
4   for i = 0 to len − 2 do
5     prefixes ← GenAllCombs(candOps, i)
6     foreach multiOp in multiOps
7       class ← GetClass(multiOp)
8       suffixes ← Synthesize(class, inv, len − 1 − i)
9       foreach prefix in prefixes
10        foreach suffix in suffixes
11          put ← prefix + [multiOp] + suffix
12          puts ← puts + [put]
13  return puts

3.5 Implementation

We have implemented our technique as an extension to Pex [135].

3.5.1 Runtime checks

Pex understands specifications, such as object invariants, written in Code Contracts [58], as long as these specifications are transformed into runtime checks by the Code Contracts binary rewriter. For our purposes, we implemented an additional rewriter, which adds a public method Invariant to each class that declares or inherits an invariant. This method takes no arguments, and returns whether its receiver satisfies the invariant declared in the enclosing class as well as the invariants in its superclasses. The Invariant methods are called when our technique asserts or assumes the invariant under test in the synthesized PUTs (see Sect. 3.3). Since the synthesized PUTs include all necessary invariant checks, our rewriter turns off any runtime checks for object invariants inserted by the Code Contracts rewriter.

The runtime checks generated for Code Contracts may abort the execution of a synthesized PUT, for instance, when the precondition of one of the operations does not hold. Our implementation permits such cases, but does not report them to the user. As an optimization, one could insert an assumption before each call that its precondition holds, in order to steer the symbolic execution toward inputs that do not abort the test. We do add such assumptions to express that the receiver of the operation is non-null and array accesses are within the bounds.

3.5.2 Static analysis

Our implementation builds a static call graph for the entire UUT, which includes information about dynamically-bound calls. The call graph is used to compute the read and write effects of all methods in the UUT with a conservative, inter-procedural, control-flow insensitive static analysis on the .NET bytecode. Our effect analysis treats the entire array content as one field and does not distinguish different array elements. The effects determine the candidate operations that might, directly or indirectly, lead to an invariant violation (see Sect. 3.3).

Our effect analysis is extended to also approximate the sets of leaking and capturing operations. When trying to violate the invariant of an object o with the leaking template, our analysis considers each candidate operation in the UUT and each potential sub-object l of o, and checks whether the operation leaks l according to the conditions for leaking from Sect. 3.4.3. For efficiency, we coarsely approximate these conditions by working on the level of types rather than objects.

This allows us to use our effect analysis to determine the types of the potential sub-objects of o, that is, of the objects l that may satisfy condition 3 from Sect. 3.4.3. If the read effect of o's object invariant contains a field C.f, we consider each instance of class C as a potential sub-object of o. For instance, the read effect of Person's invariant (from Fig. 3.1) is {Person.account, Account.balance, Person.salary} and, therefore, each Person or Account object is considered to be a potential sub-object.

According to condition 1, a leaking operation takes o as an argument or receiver. Again, we identify these operations based on types, that is, based on the signatures of the operations and o's type. For instance, if o is of type Person, all methods of this class, such as Spend1, Spend2, and GetAccount, are potentially leaking operations.

It remains to check which of these operations satisfy condition 2, that is, return l or assign it to shared state. In the former case, we conservatively assume that the method may return any object that is compatible with the method's return type and with the type of at least one field in the method's read effect. For instance, method GetAccount has return type Account and read effect {Person.account} and, thus, may return any Account object. If the type of potentially returned objects is compatible with the types of o's sub-objects, the method is considered to be leaking. In our example, the types of sub-objects are Person or Account and, thus, we treat GetAccount as a leaking operation. The latter case is analogous, but, instead of the return type, uses the types of those public fields of the operation's receiver that are in the write effect of the operation. This approximation may miss certain cases of leaking, for instance, when the leaked object is stored in fields of objects other than the receiver, but it ensures that clients of the operation can access the leaked object because they have a reference to the receiver and may, thus, read its public fields.

Once the leaking operations and the way in which they leak a sub-object are identified, our synthesis generates a PUT for each possible way of accessing the leaked object. For instance, it uses GetAccount's return value to access the leaked object, and applies operations to it with the purpose of breaking Person's invariant (see Fig. 3.10).
We identify capturing operations similarly to leaking operations, but use the write effects of operations, instead of the read effects. The captured object is then accessed via the operation’s parameters.
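A rough sketch of the type-based approximation of the leaking check (only the "returns l" case of condition 2) is given below. The Method type, the MayLeak function, and the string-based encoding of types are illustrative assumptions, not the tool's actual implementation; subtyping, assignments to shared state, and transitive references are deliberately ignored.

using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, type-level approximation of conditions 1-3 for leaking (Sect. 3.4.3, 3.5.2).
class Method {
    public string DeclaringType;        // type of the receiver
    public string ReturnType;
    public HashSet<string> ReadEffect;  // fully-qualified field names, e.g. "Person.account"
}

static class LeakAnalysis {
    public static bool MayLeak(Method m,
                               string invariantHolderType,
                               HashSet<string> invariantReadEffect,
                               Dictionary<string, string> fieldTypes) {
        // Condition 1 (by type): the operation takes an object of the invariant holder's type.
        if (m.DeclaringType != invariantHolderType) return false;

        // Potential sub-object types: declared types of the fields read by the invariant.
        var subObjectTypes = new HashSet<string>(invariantReadEffect.Select(f => fieldTypes[f]));

        // Condition 2 (by type): the method may return an object of the same type as a
        // field in its read effect.
        bool mayReturnReadField = m.ReadEffect.Any(f => fieldTypes[f] == m.ReturnType);

        // Condition 3 (by type): that type is among the potential sub-object types.
        return mayReturnReadField && subObjectTypes.Contains(m.ReturnType);
    }

    static void Main() {
        var fieldTypes = new Dictionary<string, string> {
            { "Person.account", "Account" },
            { "Account.balance", "int" },
            { "Person.salary", "int" }
        };
        var getAccount = new Method {
            DeclaringType = "Person",
            ReturnType = "Account",
            ReadEffect = new HashSet<string> { "Person.account" }
        };
        var invReads = new HashSet<string> { "Person.account", "Account.balance", "Person.salary" };
        // GetAccount from Fig. 3.1 is classified as potentially leaking.
        Console.WriteLine(MayLeak(getAccount, "Person", invReads, fieldTypes));
    }
}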

Impact of approximations

The design of our static analysis favors efficiency over precision. In particular, we avoid a precise but costly heap analysis, and instead, obtain rather coarse effect information. As a consequence, our technique might generate irrelevant PUTs and miss invariant violations, as we discuss next.

Our technique may select irrelevant candidate operations due to imprecision in our effect analysis. For instance, it cannot distinguish whether a method modifies an existing or initializes a newly-allocated object. Therefore, we might generate PUTs that try to violate an invariant using a method that is effectively side-effect free. Another source of imprecision is the syntactic check that determines whether a subclass strengthens the invariant of a superclass with respect to any inherited fields (Sect. 3.4.2). In both cases, our technique could generate irrelevant PUTs, which may have a negative effect on its performance but does not cause it to miss invariant violations.

On the other hand, invariant violations may be missed when, for instance, our static analysis fails to consider leaking operations that transitively reference the leaked object. Our technique may also miss invariant violations when it fails to consider a method override in an assembly that is not accessible to the static analysis. As a result, the generated PUTs might not explore all possible sequences of operations.

3.5.3 Heuristics and optimizations

To detect invariant violations more efficiently, we carefully chose the order in which the synthesis (Alg. 3.1) applies the templates of Sect. 3.4 and exhaustive enumeration. The templates for direct field updates and multi-object invariants have proven to most effectively detect invariant violations and are, thus, explored first. The exhaustive enumeration comes last as it produces the largest number of PUTs and requires the most effort in DSE.

On a more technical level, a leaking operation applied to an object o might declare as its return type a superclass of the leaked reference l. In such cases, our implementation automatically down-casts l to the subclass declared in o's implementation, before applying any operations to l. Moreover, special care had to be taken for the handling of arrays and generics. For arrays, we defined a set of valid operations, such as retrieving their length and storing a value at an index within their bounds, which could then be used in the synthesized code. In order to handle generic types and methods, our implementation attempts to instantiate them by choosing suitable types from the UUT, if any.

To reduce the number of functionally-equivalent PUTs, we implemented a conservative pruning technique based on operation commutativity. We consider two operations commutative, i.e., independent of their order of execution, if and only if each operation's write effect is disjoint from the other operation's read and write effects. Given a PUT, any other PUTs are pruned that differ from the former only in the order of commutative operations.
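The commutativity criterion can be written down directly; in the sketch below, Op and its effect sets are illustrative placeholders for the tool's internal representation of candidate operations.

using System.Collections.Generic;

// Sketch of the commutativity check behind the pruning of functionally-equivalent PUTs.
class Op {
    public HashSet<string> ReadEffect = new HashSet<string>();
    public HashSet<string> WriteEffect = new HashSet<string>();
}

static class Pruning {
    // Two operations commute, i.e., are independent of their order of execution, iff each
    // operation's write effect is disjoint from the other operation's read and write effects.
    public static bool Commutative(Op a, Op b) {
        return !a.WriteEffect.Overlaps(b.ReadEffect)
            && !a.WriteEffect.Overlaps(b.WriteEffect)
            && !b.WriteEffect.Overlaps(a.ReadEffect);
    }
}

Given a synthesized PUT, any other PUT that differs from it only in the order of operations for which Commutative holds can then be discarded.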

3.5.4 Object construction

Each synthesized PUT takes as input an object whose invariant holds and that the PUT tries to violate. When generating inputs for the PUT, this object must satisfy additional constraints in order to execute certain paths. Constructing such objects may require several steps, for instance, creating an empty list and then invoking a method three times to insert three elements. Such constructions lie beyond the automatic capabilities of Pex. Our templates effectively alleviate this issue. If Pex can create any instance whose invariant holds (for instance, an empty list via a suitable constructor), then the sequence of operations in the PUT can be used to bring this instance into a state satisfying all necessary constraints, before applying the final operation to violate its invariant.

3.6 Experimental evaluation

We evaluated our technique for testing object invariants in three ways. First, we applied our technique to several open-source projects that contain object invariants provided manually by programmers, to evaluate whether it detects invariant violations in practice. Second, for one of these projects, we manually fixed all detected invariant violations, to evaluate the causes of these violations. Third, we applied our technique to invariants inferred automatically by Daikon, to evaluate the correctness and robustness of these invariants.

3.6.1 Testing programmer-provided object invariants

We have evaluated the effectiveness of our technique using ten C# applications, which were selected from applications on Bitbucket, CodePlex, and GitHub containing invariants specified with Code Contracts. This section focuses on our experiments with the nine applications for which invariant violations were detected. A brief description of these applications is shown in Tab. 3.1.

Tab. 3.2 summarizes the results of our experiments. The second and third columns show the total number of classes and the number of classes with invariants for each application, respectively. We have tested the robustness of all invariants in these applications. Note that the total number of classes refers only to the classes that were included in the evaluation and not to all classes of each application. We have left out only classes that were defined in dynamic-link libraries (DLLs) containing no object invariants. The two rightmost columns of Tab. 3.2 show the unique and total numbers of invariant violations detected with our technique. The unique number of violations refers to the number of invariants that were violated at least once. The total number of violations refers to the number of distinct PUTs that led to invariant violations, and may include violations of the same object invariant multiple times.

Application    Description
Boogie         Intermediate verification engine^1
ClueBuddy      GUI application for a board game^2
Dafny          Programming language and verifier^3
Draugen        Web application for fishermen^4
GoalsTracker   Various web applications^5
Griffin        .NET and jQuery libraries^6
LoveStudio     IDE for the LÖVE framework^7
Encore         ‘World of Warcraft’ emulator^8
YAML           YAML library^9

Table 3.1: Brief description of the applications in which our technique detected invariant violations.

When running Pex with our technique, we imposed an upper bound of three on the number of operations per PUT, and an upper bound of 300 on the number of synthesized PUTs per object invariant. All unique invariant violations were already detected with two operations per PUT; increasing this bound to four for a number of projects did not uncover any previously undetected invariant violations. On average, 14.7 PUTs were synthesized per second.

We then applied Pex to generate input data for the synthesized PUTs, forcing Pex to use only public operations of the UUT (to guarantee that all inputs are constructible in practice). We counted the number of unique invariant violations and of distinct PUTs that led to invariant violations. We imposed a timeout of three minutes for the DSE in Pex to generate inputs for and run the synthesized PUTs, and we report the number of invariant violations that were detected within this time limit. Within this limit, the first invariant violation was detected within 4–47 seconds (12.8 seconds on average) for all object invariants in all applications.

The total violations found by our technique may be classified into the following categories, based on the template that was instantiated: 60 due to direct field updates, 41 due to leaking, and 25 due to capturing. The remaining six violations were detected by the exhaustive enumeration. Out of these six invariant violations, five are also detected by Pex without our technique, i.e., with the traditional approach of checking the invariant of the receiver at the end of a method.

1 http://boogie.codeplex.com, rev: f2ffe18efee7
2 https://github.com/AArnott/ClueBuddy, rev: c1b64ae97c01fec249b2212018f589c2d8119b59
3 http://dafny.codeplex.com, rev: f2ffe18efee7
4 https://github.com/eriksen/Draugen, rev: dfc84bd4dcf232d3cfa6550d737e8382ce7641cb
5 https://code.google.com/p/goalstracker, rev: 556
6 https://github.com/jgauffin/griffin, rev: 54ab75d200b516b2a8bd0a1b7cfe1b66f45da6ea
7 https://bitbucket.org/kevinclancy/love-studio, rev: 7da77fa
8 https://github.com/Trinity-Encore/Encore, rev: 0538bd611dc1bc81da15c4b10a65ac9d608dafc2
9 http://yaml.codeplex.com, rev: 96133

Application    Classes   Classes with   Invariant violations
                         invariants     unique   total
Boogie             355          144         21      64
ClueBuddy           44            4          1       2
Dafny              310          113         15      53
Draugen             36            5          3       3
GoalsTracker        63            5          1       1
Griffin             31            3          1       1
LoveStudio          66            7          2       2
Encore             186           30          1       4
YAML                76            6          1       2
Total             1167          317         46     132

Table 3.2: Summary of results for testing programmer-provided object invariants. The second and third columns show the total number of classes and the number of classes with invariants for each application. The two rightmost columns show the unique and total numbers of invariant violations detected with our technique.

The last violation requires a sequence of two method calls and was detected only by our technique. This is because Pex could not generate appropriate input data to the second method such that the invariant check at the end of the method failed. This violation illustrates that the exhaustive enumeration indeed alleviates a known limitation of automatic object creation in Pex [139], as pointed out in the previous section.

The object invariants that were violated at least once can be classified into the following categories: 27 invariants were violated at least once due to direct field updates, 24 due to leaking, 17 due to capturing, and five due to the exhaustive enumeration. Note that in these applications we found no subclasses that strengthen the invariant of a superclass with respect to any inherited fields. This is why no invariant violations were detected with the subclassing template.

An example of an invariant violation detected by our technique in LoveStudio is shown in Fig. 3.14. A StackPanel object has a LuaStackFrame array, and its invariant holds if all array elements are non-null. In the PUT, method SetFrames captures a0 depending on the value of a1. The last operation of the test assigns a LuaStackFrame object to the array at a valid index a2. In case a3 is null, o's invariant is violated.

The invariant violations detected by our technique indicate overly strong invariants in the sense that they may be violated by possible clients of a UUT. These clients are not necessarily present in the given program and, thus, the violations do not necessarily reveal bugs.

void PUT(StackPanel o, LuaStackFrame[] a0, bool a1, int a2, LuaStackFrame a3) {
  assume o != null && o.Invariant();
  o.SetFrames(a0, a1);
  assume a0 != null && o.Invariant();
  assume 0 <= a2 && a2 < a0.Length;
  a0[a2] = a3;
  assert o.Invariant();
}

Figure 3.14: Synthesized parameterized unit test that reveals an invariant violation in LoveStudio.

This behavior is to be expected for unit testing, where each unit is tested independently of the rest of the program. Nevertheless, the detected violations do indicate robustness issues that might lead to bugs during maintenance or when classes are reused as libraries.

We have manually inspected all detected invariant violations. Violations detected with the direct-field-update template reveal design flaws; in the inspected code, these violations could be fixed by making fields private and providing setters that maintain the invariants. Violations due to leaking or capturing could be fixed either by cloning the leaked or captured objects, or by using immutable types in the interfaces of the classes whose invariants are under test. The largest number of invariant violations found with the leaking and capturing templates was detected in the Boogie and Dafny applications, which declare several multi-object invariants in their code. Both Boogie and Dafny were originally written in Spec# [12], whose verification methodology restricts modifications of leaked or captured objects using object ownership [99]. Note, however, that only a small fraction of Boogie was actually verified with Spec#, and both Boogie and Dafny have changed substantially since being ported to C#. We discussed the detected invariant violations in Boogie and Dafny with the lead developer, Rustan Leino. All of them indicate robustness issues that should be addressed by either weakening the invariants or changing the design of the code, as we discuss in the next subsection.

We have also examined the object invariants that were not violated. In most cases, these invariants expressed very simple properties, such as non-nullness or ranges of integer values, which were also specified as constructor preconditions or referred to read-only fields that were properly initialized in the constructors. Another common case involved automatically-generated setters for C# properties to which Code Contracts add appropriate preconditions that ensure the object invariant is not violated by the setters.

3.6.2 Fixing violated object invariants

As we have discussed at the beginning of this chapter, a test that violates an invariant could indicate an error in the code, an overly strong invariant, or an invariant that is not robust against future program changes and other clients. To gain more insight into how a developer would use feedback from our tool, we inspected and fixed all violated invariants for one of the applications in Tab. 3.1. We chose Boogie (revision 31e2170a4d1b, which is a more recent revision than the one used in Sect. 3.6.1) because: (1) it contains a large number of classes with invariants and has been actively developed for many years, (2) our tool was able to violate a significant number of these invariants, and (3) there are several widely-used client applications that use Boogie as a library (e.g., the Chalice [102], Dafny [98], Spec# [12], and Viper [86] verifiers), which makes it especially important that Boogie's invariants are robust against these clients.

We asked a student to propose changes to Boogie based on concrete test cases that led to violated invariants and were produced by running Pex on PUTs generated by our tool. The student had not used our tool before and, more importantly, was not familiar with the Boogie code base. All proposed changes were reviewed by independent Boogie developers, to confirm that they were sensible and correctly reflected the intended behavior (e.g., by not restricting the API in undesirable ways). The changes were accepted and then merged into the Boogie repository.

We classified the proposed changes into three different categories:

1. Invariants: ones that weakened or removed overly strong invariants;

2. Encapsulation: ones that mainly affected the visibility (incl. qualifiers such as readonly) of a class or class member to encapsulate the state an invariant depends on, for instance, by making a public field private and possibly introducing corresponding getters and setters that preserve the invariant;

3. Code: ones that changed the code in more involved ways, for instance, by copying a returned object to avoid leaking.

The above categories are ordered roughly according to how much work seems to be typically involved in implementing such changes and how much they may affect clients. For instance, invariant changes are usually relatively straightforward and do not affect clients, while code changes can be complex and may require changes to the API.

The student provided changes for 32 classes involving 57 invariant clauses (that is, conjuncts) that our tool was able to violate. Many of those invariant clauses expressed relatively simple properties, such as non-nullness of reference fields or numeric ranges for integer fields. Since many of the involved fields were public, the direct-field-update template turned out to be very effective in breaking them, and 34 of those invariant clauses were made robust by strengthening encapsulation. Somewhat surprisingly, all other 23 changes were code changes. That is, none of the violated clauses was considered to be overly strong, likely because Boogie is a mature application that has used such invariants from the start to document its design. Many of the code changes involved invariant clauses that expressed non-nullness of elements in a collection (e.g., arrays, lists, or dictionaries), and were detected using the leaking and capturing templates. The changes made use of different, well-known techniques for preserving such invariants (e.g., copying an object to avoid leaking, or exposing it as an immutable object). There was one invariant that was violated using the subclassing template; in this case, the invariant clause was moved to the superclass that declared the constrained field, because it also applied to instances of the superclass.

Our experiment is encouraging because it suggests that the detected invariant violations can be fixed with a rather low effort, and we were pleased that the Boogie developers considered these fixes to improve the design of the tool, by making it more robust against future changes and additional clients.

3.6.3 Testing inferred object invariants

Some tools, such as Symbolic Java PathFinder [125], use dynamic invariant inference to obtain input constraints on a UUT. To evaluate the suitability of our technique for testing the robustness of such invariants, we used the Daikon invariant detector [57] and its .NET front end Celeriac (https://code.google.com/p/daikon-dot-net-front-end) to infer object invariants for one of the applications from Tab. 3.1. Given an execution trace, Daikon infers invariants by observing the values that the program computed during its execution. Daikon then reports properties that are true over the observed executions and expressible in the built-in set of invariant templates of the tool. Celeriac dynamically instruments .NET programs to produce a Daikon-compatible execution trace.

For this experiment, we chose ClueBuddy (https://github.com/AArnott/ClueBuddy, revision c1b64ae97c01fec249b2212018f589c2d8119b59), a GUI application for the 'Clue' board game, because it has a comprehensive test suite that produces traces of a suitable size. To produce the execution trace that we then passed to Daikon, we ran the existing test suite of ClueBuddy, which contains both unit and system tests. Based on this execution trace, Daikon inferred 288 potential object-invariant clauses (conjuncts) for 13 of the 44 classes. Only two of these invariant clauses corresponded to ones that were also provided by the developers. We subsequently removed the following clauses:


− clauses about the runtime types of objects, such as this.GetType() == typeof(ClueBuddy.Weapon), which are better expressed through the type system;

− clauses (directly or indirectly) relating a private field with the return value of a getter for this field, e.g., this.place.Name == this.Place.Name, because such constraints should rather be expressed as postconditions of getters;

− equivalent occurrences of a clause, such as the property this.player.Name != null of the field player and the property this.Player.Name != null of the getter Player;

− clauses reported as either semantically incorrect or always true by the C# compiler, e.g., this.Rules != null, where this.Rules is a struct.

After this manual filtering, there were 105 object-invariant clauses for 13 classes. We tested these invariant clauses with our technique. We imposed an upper bound of three on the number of operations per PUT, and an upper bound of 1,000 on the number of synthesized PUTs per object invariant. When applying Pex to generate input data for the synthesized PUTs, we imposed a timeout of 10,000 seconds and an upper bound of 100,000 symbolic branches for the DSE.

Tab. 3.3 summarizes the results. The second column shows the number of object-invariant clauses (per class) that were inferred by Daikon and not filtered by us. Unlike for the programmer-provided invariants from Sect. 3.6.1, we found that a significant number of invariants threw exceptions during their evaluation (e.g., a null-dereference in a clause 0 <= a.Length). We, therefore, decided to distinguish between invariant clauses that were violated and ones that were ill formed, i.e., for which we found cases where their evaluation would throw an exception. The third and fourth columns of the table show the number of clauses that were found to be violated or ill formed with our technique.

Class            Inferred   Violated   Ill-formed   Rejected
                 clauses    clauses    clauses      clauses
CannotDisprove       9          4          7         9 (100%)
Card                 3          2          1         3 (100%)
Clue                 4          3          3         4 (100%)
Disproved           10          5          8        10 (100%)
Game                25         19          1        20 (80%)
GameVariety         17         14         11        17 (100%)
Node                 6          4          0         4 (67%)
Place                5          4          2         5 (100%)
Player               6          4          1         5 (83%)
SpyCard              6          5          4         6 (100%)
Suspect              3          2          1         3 (100%)
Suspicion            7          3          0         3 (43%)
Weapon               4          3          2         4 (100%)
Total              105         72         41        93 (89%)

Table 3.3: Summary of results for testing inferred object invariants. The second column shows the number of inferred invariant clauses per class. The third and fourth columns show the number of clauses that were found to be violated or ill formed with our technique. The last column shows the total number of clauses that were found not to be robust (because they were violated or ill formed).

Note that the sets of violated and ill-formed clauses are not necessarily disjoint for a given class. For instance, the invariant clause 0 <= a.Length about an array a might be classified both as ill formed, when not preceded by the clause a != null, and as violated when it evaluates to false without throwing an exception. The last column of Tab. 3.3 shows the total number of clauses that were found not to be robust (because they were violated or ill formed). In total, we rejected 93/105 (or 89%) of the inferred invariant clauses. This high percentage indicates the effectiveness of our technique in refuting invariants inferred from program runs that possibly do not exercise all relevant paths.

To provide an overview of the kinds of invariant clauses that were inferred, Tab. 3.4 further categorizes the inferred and rejected clauses based on the properties they specify, i.e., nullness of an expression (third column), comparison of boolean expressions (fourth column), comparison of numerical expressions (fifth column), and all others (sixth column), e.g., clauses involving calls to side-effect-free methods. The four rightmost columns of the table show, for each category of clauses, the number of rejected clauses over the number of inferred clauses. Note that most inferred invariant clauses (44/105 or 42%) express comparisons with null and are relatively more robust than all other kinds of clauses.

Class            Inferred   Null          Boolean       Numeric       Other
                 clauses    comparisons   comparisons   comparisons
CannotDisprove       9        8/8           0/0           1/1          0/0
Card                 3        0/0           3/3           0/0          0/0
Clue                 4        3/3           0/0           1/1          0/0
Disproved           10        9/9           0/0           1/1          0/0
Game                25        2/3           8/10          6/8          4/4
GameVariety         17        3/3           3/3           6/6          5/5
Node                 6        2/4           1/1           1/1          0/0
Place                5        0/0           3/3           0/0          2/2
Player               6        1/2           3/3           1/1          0/0
SpyCard              6        5/5           0/0           1/1          0/0
Suspect              3        0/0           3/3           0/0          0/0
Suspicion            7        3/7           0/0           0/0          0/0
Weapon               4        0/0           3/3           0/0          1/1
Total              105       36/44         27/29         18/20        12/12

Table 3.4: Categorization of inferred and rejected invariant clauses based on the properties they specify. The four rightmost columns of the table show, for each category of clauses, the number of rejected clauses over the number of inferred clauses.

As shown in the table, Daikon inferred a significant number of invariant clauses of each category, and our technique is effective in refuting clauses of all categories. Since there were no public fields, the direct-field-update template was not used in detecting any of the 72 violated invariant clauses. However, all other templates were used. This demonstrates that our approach is more effective in detecting invariant violations than the traditional approach, which would find only a subset of the violations detected by the exhaustive enumeration. Leaking was used 26 times, capturing 19 times, subclassing five times, and the exhaustive enumeration was used 51 times, five out of which required a sequence length of two. Note that several clauses were violated using multiple templates. Out of the ones that were detected by a single template, leaking was used eight times, capturing once, and the exhaustive enumeration 42 times. All but one of the invariant clauses that were violated using the capturing template were also violated using the leaking template. Manual inspection showed that many of the captured objects are assigned to public properties, which may also leak said objects.

Our experiment suggests that many dynamically-inferred invariants are not robust. It is beneficial to apply our technique to reject such invariants or to flag them for manual inspection before using them to filter test inputs.

3.7 Testing auxiliary methods

Auxiliary methods are methods that are intended to operate correctly even if they are called on an object whose invariant does not hold. For instance, a balanced-tree data structure might call a rebalance method in a state in which the tree is not balanced in order to re-establish the invariant.

Even if auxiliary methods do not require invariants to hold, they are typically not intended to operate on arbitrary inputs. For instance, rebalance would be designed to operate on an unbalanced tree, but not on an arbitrary graph. Therefore, the test inputs to auxiliary methods should neither be constrained by the object invariant (which would miss relevant executions) nor be arbitrary (which would include inputs the method has not been designed for). Instead, the inputs should include objects that could actually be created by operations of the data structure.

When attempting to violate object invariants, our technique creates exactly such objects. So in addition to reporting the invariant violations to the user, one can use the generated objects as inputs for testing auxiliary methods. That is, our technique can complement any existing approach for generating complex input objects, to achieve higher coverage and detect more bugs in auxiliary methods.

This is useful when it is possible to synthesize client code that violates invariants. However, many auxiliary methods (such as rebalance above) are invoked in intermediate states, in which invariants are temporarily broken. That is, rebalance is supposed to work on unbalanced trees even if there is no way for a client to construct such a tree. This scenario is a special case of a call-back (or re-entrant call): a method breaks the invariant of its receiver and then calls another method that, either directly or indirectly, calls back into the receiver when its invariant does not hold. The example in Fig. 3.15 illustrates this scenario. Method L breaks the invariant of its receiver and then, via a call to M, calls the auxiliary method N.

To generate test inputs for auxiliary method N, we have specified a designated call-back template, which is shown in Fig. 3.16. The call-back template applies a number of operations to object r, restricting the last of these operations to be a call to a method Op_callback, which breaks o's invariant and then starts a call chain as described above. For the above example, our technique would instantiate the call-back template such that method L is called last in a synthesized PUT. As a result, N will be tested with a receiver whose invariant does not hold (because b is true), but not with arbitrary inputs (because i is non-negative).

The algorithm for synthesizing PUTs from the call-back template is analogous to GenFieldCombs (Alg. 3.2) and GenSubCombs (Alg. 3.3) except that, on line 4, it selects Op_callback methods. These can be over-approximated by simply inspecting the static call graph of the UUT, which is built by our implementation and includes information about dynamically-bound calls.

An interesting side effect of the call-back template is that it improves error reporting for call-back scenarios. Assume that method L in Fig. 3.15 did not re-establish the invariant by setting b to false.

class C {
  private int i;
  private bool b;

  invariant 0 <= i && !b;

  public void L(D d) {
    b = true;
    d.M(this);
    b = false;
  }

  public void N() { ... }
}

class D {
  public void M(C c) {
    ...
    c.N();
    ...
  }
}

Figure 3.15: An example of the call-back scenario.

Traditional invariant checking, as implemented in Pex, is sufficient to detect such problems while testing L. It attempts to generate inputs for method L that violate the assertions in L and all methods it calls, in particular, the check of the receiver's invariant at the end of N. A drawback of this approach is that traditional invariant checking blames method N for the invariant violation although, arguably, L should not have started the call chain while its receiver's invariant was violated.

void CB(r, o, a0..aN) {
  assume r != null;
  assume o != null && o.Invariant();
  Op_0(r, ...);
  ...
  Op_M(r, ...);
  assume o.Invariant();
  Op_callback(r, ...);
  assert o.Invariant();
}

Figure 3.16: The call-back template.

Our call-back template blames L for any failure of the assertion of the invariant at the end of the PUT.
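As an illustration, the template of Fig. 3.16 might be instantiated as follows for the classes of Fig. 3.15, using the same assume/assert notation. Choosing r and o to denote the same object and L as the Op_callback operation is an assumption about what the input generation and synthesis would pick, not output of the tool.

void CB_L(C r, C o, D d) {
  assume r != null;
  assume o != null && o.Invariant();
  assume r == o;            // aliasing under which L breaks o's invariant
  r.L(d);                   // Op_callback: sets b, calls d.M(this), which calls back into N
  assert o.Invariant();     // a failure here is blamed on L, not on N
}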

3.8 Related work

Our approach to testing object invariants is inspired by static verification techniques [51]. Poetzsch-Heffter [123] pointed out that the traditional way of checking the invariant of the receiver at the end of each method is insufficient. The checks he proposed are sufficient for sound verification, but not suitable as unit test oracles since they make heavy use of universal quantification. Some modular verification techniques for object invariants [99, 116] handle the challenges mentioned in Sect. 3.1, but require annotation overhead that does not seem acceptable for testing.

There are several test case generators for object-oriented programs that rely on invariants but miss the violations presented here. AutoTest [110], a random testing tool for Eiffel, follows the traditional approach of checking the invariant of the receiver at the end of each method. Pex [135] follows the same approach, but asserts the invariant of the receiver only at the end of public methods. Korat [17] and Symbolic Java PathFinder [125] do not check object invariants of the UUT at all; they use invariants only to filter test inputs. Given an invariant, Korat generates all non-isomorphic object structures, up to a small bound, for which the invariant holds. These structures are subsequently passed as input to the UUT. Symbolic Java PathFinder, built on top of the Java PathFinder model checking tool [138], obtains input constraints through user-provided or inferred specifications. All such tools may miss bugs when object invariants are violated and would, thus, benefit from our technique.

Besides testing tools, there are also static techniques to check invariants. Clousot [59] follows the traditional approach of checking the invariant of the receiver at the end of each method [32] and, thus, misses violations due to the scenarios presented here. Spec# [99] is a program verifier that handles all of these scenarios by using an elaborate methodology for enforcing invariants. However, this methodology requires a significant number of user-provided annotations, and cannot handle certain correct programs since they do not conform to the methodology of the tool.

Invariants can also be validated manually by asking programmers to review user-provided or inferred invariants. For instance, Polikarpova et al. [124] compared contracts (including invariants) that were provided by programmers with ones that were inferred using Daikon. Their results show that roughly a third of the inferred contracts were considered to be incorrect or irrelevant, while our approach rejected 89% of the inferred invariant clauses for the application in Sect. 3.6.3. This might suggest that an automatic technique is more thorough in detecting invariants that are not robust, although the difference could also be attributed to including different benchmarks in the evaluation.

Work on synthesizing method call sequences to generate complex input data is complementary to ours. In fact, such approaches could be applied in place of the object construction mechanism in Pex to generate input objects for our PUTs. In certain cases, this might reduce the length of the synthesized tests since fewer candidate operations may be required to generate the same objects.
These approaches include a combination of bounded exhaustive search and symbolic execution [140], feedback-directed random testing [122], a combination of feedback-directed random testing with concolic testing [64, 50], evolutionary testing [137], an integration of evolutionary and concolic testing [84], and source code mining [133]. Moreover, Palus [143] combines dynamic inference, static analysis, and guided random test generation to automatically create legal and behaviorally-diverse method call sequences. In contrast to existing work, our technique synthesizes code that specifically targets violations of object invariants. This allows for a significantly smaller search space restricted to three known scenarios in which invariant violations may occur.

The work on method call synthesis that is most closely related to ours is Seeker [134], an extension to Pex that combines static and dynamic analyses to construct input objects for a UUT. More specifically, Seeker attempts to cover branches that are missed by Pex. For this purpose, it statically identifies which branches, called pre-targets, should be covered first so that a missed branch is also covered. Seeker then explores the pre-targets dynamically, and eliminates those that do not actually lead to the missed branch. As a result, a feedback loop is created between the static and dynamic analyses. Even though this approach does not rely on object invariants, the negation of an invariant could be regarded as a branch to be covered. However, some of the scenarios of Sect. 3.1 are not captured by Seeker's static analysis. For example, a multi-object invariant violation involves leaking or capturing parts of an object's representation, and might not necessarily involve a sequence of missed branches. The same holds for subclassing. In addition, Seeker cannot generate input objects by calling certain methods multiple times, which may be crucial in detecting invariant violations.

3.9 Summary and remarks

We have presented a technique for detecting object invariant violations by synthesizing PUTs. Given one or more classes under test, our technique uses a set of templates to synthesize snippets of client code. We then symbolically execute the synthesized code to generate inputs that might lead to invariant violations. As a result, our technique can reveal critical defects in the UUT, which go undetected by existing testing tools. We have demonstrated the effectiveness of our implementation by testing both programmer-provided and inferred invariants. Our experiments found a large number of invariants that can be violated.

The PUTs synthesized by our technique represent possible clients of the UUT, although not necessarily clients that are present in a given program. Therefore, the detected invariant violations might not actually manifest themselves in the program, which is to be expected for unit testing. Nevertheless, invariant violations indicate that the public interface of the UUT is not sufficiently robust. We evaluated the causes of invariant violations and fixed them for the Boogie project.

The main motivation for our work is the use of invariants to generate test inputs. By developing this technique for sufficiently testing object invariants as part of the oracle, we bring systematic testing one step closer to verification. When invariants are not soundly checked by a static analysis, our technique provides a principled alternative for subsequently evaluating their correctness. However, our technique also has other applications. For instance, one might test invariants before attempting to verify them statically. Possible directions for future work include improving the precision of the static effect analysis in order to reduce the number of generated PUTs, and generalizing our technique to handle static class invariants [100], which constrain static fields.

Chapter 4

Dynamic test generation with static fields and initializers

In Ch. 3, we showed how to generate test oracles for checking a certain class of rich specifications, namely, object invariants, which might not have been previously verified by a static analysis. In this chapter, we investigate how to thoroughly check existing oracles in a program under test by efficiently generating appropriate input data to the program. We specifically focus on program inputs that have so far not been taken into account by existing testing tools, namely, static fields and the potentially non-deterministic execution of static initializers.

In object-oriented programming, data stored in static fields is common and potentially shared across the entire program. In case developers choose to initialize a static field to a value different from the default value of its declared type, they typically write initialization code. The initialization code is executed by the runtime environment at some time prior to the first use of the static field. The time at which the initialization code is executed depends on the programming language and may be chosen non-deterministically, which makes the semantics of the initialization code non-trivial, even to experienced developers.

In C#, initialization code has the form of a static initializer, which may be inline or explicit. The C# code below shows the difference: field f0 is initialized with an inline static initializer, and field f1 with an explicit static initializer.

class C {
  // inline
  static int f0 = 19;
  static int f1;

  // explicit
  static C() {
    f1 = 23;
  }
}


If any static initializer exists, inline or explicit, the C# compiler always generates an explicit initializer. This compiler-generated explicit initializer first initializes the static fields of the class that are assigned their initial value with inline initializers, and then incorporates the code of the original explicit initializer (if any) written by the developer, as shown below for class C.

// compiler-generated
static C() {
  f0 = 19;
  f1 = 23;
}

However, the semantics of the compiler-generated static initializer depends on whether the developer has indeed written an explicit initializer. If this is the case, the compiler-generated initializer has precise semantics: the body of the initializer is executed (triggered) exactly on the first access to any (non-inherited) member of the class (that is, static field, static method, or instance constructor). Otherwise, the compiler-generated initializer has before-field-init semantics: the body of the initializer is executed no later than the first access to any (non-inherited) static field of the class [53]. This means that the initializer could be triggered by the runtime environment at any point prior to the first static-field access.

In Java, static (initialization) blocks are the equivalent of explicit static initializers with precise semantics in C# [76]. In C++, static initialization occurs before the program entry point in the order in which the static fields are defined in a single translation unit. However, when linking multiple translation units, the order of initialization between the translation units is undefined [52].

Even though static state is common in object-oriented programs and the semantics of static initializers is non-trivial, automatic test case generators do not take into account the potential interference of static state with a unit under test. They may, thus, miss subtle errors. In particular, existing test case generators do not solve the following issues:

1. Static fields as input: When a class is initialized before the execution of the unit under test, the values of its static fields are part of the state and should, thus, be treated as inputs to the unit under test. Existing tools fail to do that and may miss bugs when the unit under test depends on the values stored in static fields (for instance, to determine control flow or evaluate assertions).

2. Initialization and uninitialization: Existing testing tools do not control whether static initializers are executed before or during the execution of the unit under test. The point at which the initializer is executed may affect the test outcome since it may affect the values of static fields

and any other variables assigned to by the static initializer. Ignoring this issue may cause bugs to be missed. A related issue is that existing tools do not undo the effect of a static initializer between different executions of the unit under test such that the order of executing tests may affect their outcomes [142].

3. Eager initialization: For static initializers with before-field-init semantics, a testing tool should not only control whether the initializer is run before or during test execution; in the latter case, it also needs to explore all possible program points at which initialization of a class may be triggered (non-deterministically).

4. Initialization dependencies: The previous issues are further complicated by the fact that the order of executing static initializers may affect the resulting state due to their side effects. Therefore, a testing tool needs to consider all relevant execution orders so as not to miss bugs.

We address these issues by designing and implementing a novel technique in automatic test case generation based on dynamic symbolic execution [70, 20] and static analysis. Our technique treats static fields as input to the unit under test and systematically controls the execution of static initializers. The dynamic symbolic execution collects constraints describing the static-field inputs that will cause the unit under test to take a particular branch in the execution or violate an assertion. It also explores the different program points at which a static initializer might be triggered. The static analysis improves performance by pruning program points at which the execution of a static initializer does not lead to any new behaviors of the unit under test.

We have implemented [55] our technique as an extension to the testing tool Pex [135] for .NET. We have applied it on a suite of open-source applications and found errors that go undetected by existing test case generators. Our results show that this problem is relevant in real code, indicate which kinds of errors existing techniques miss, and demonstrate the effectiveness of our technique.

Outline. This chapter is organized as follows. Sect. 4.1 explains how we explore static input state where all relevant classes are initialized. Sects. 4.2 and 4.3 show how we handle static initializers with precise and before-field-init semantics, respectively. Sect. 4.4 demonstrates the effectiveness of our technique by applying it on a suite of open-source applications, and Sect. 4.5 reviews related work.

4.1 Static fields as input

In this section, we address the issue of treating static fields of initialized classes as input to the unit under test. The case that a class is not yet initialized is discussed in the next two sections.

 1 public class C {
 2   public static int F;
 3
 4   static C() {
 5     F = 0;
 6   }
 7
 8   public static void M() {
 9     F++;
10     if (F == 2) abort;
11   }
12 }

Figure 4.1: A C# method accessing static state. For all branches to be covered, dynamic symbolic execution must treat static field F as an input to method M and collect constraints on its value.

The example in Fig. 4.1 illustrates the issue. Existing automatic test case generators do not treat static field F of class C as input to method M. In particular, testing tools based on dynamic symbolic execution generate only one unit test for method M since there are no branches on a method parameter or receiver of M. Given that the body of method M contains a branch on static field F (line 10), these tools achieve low code coverage of M and potentially miss bugs, like the abort-statement in this example.

Dynamic symbolic execution

To address this issue, we treat static fields as inputs to the method under test and assign to them symbolic variables. This causes the dynamic symbolic execution to collect constraints on the static fields, and use these constraints to generate inputs that force the execution to explore all branches in the code.

Treating all static fields of a program as inputs is not practical. It is also not modular and defeats the purpose of unit testing. Therefore, we determine at runtime which static fields are read during the execution of a unit test and treat only those as inputs to the unit under test.

We implement this approach in a function DSE(UUT, IC), which performs dynamic symbolic execution of the unit under test UUT. IC is the set of classes that have been initialized before the execution of the unit under test. For all other classes, initialization may be triggered during the execution of the generated unit tests. The DSE function treats the static fields of all classes in the IC set as symbolic inputs. It returns the set TC of classes whose initialization is triggered before or during the execution of the generated unit tests, that is, TC is a superset of IC. The static fields of the classes in TC include all static fields that are read by the unit tests. We call the DSE function repeatedly to ensure that the static fields of all of these classes are treated as inputs to the unit under test. The precise algorithm for this exploration as well as more details of the DSE function are described in the next section.

Consider the dynamic symbolic execution DSE(M, {}) of method M from Fig. 4.1. This dynamic symbolic execution generates one unit test that calls method M. The execution of this unit test triggers the initialization of class C due to the access to static field F (line 9). Therefore, function DSE returns the singleton set {C}. As a result, our exploration algorithm will call DSE(M, {C}).

This second dynamic symbolic execution treats static field F as a symbolic input to method M and collects constraints on its value. For instance, assuming that the first unit test of the second dynamic symbolic execution executes M in a state where F is zero, the conditional statement on line 10 introduces the symbolic constraint ¬(F + 1 = 2), where F is the symbolic variable associated with F. The dynamic symbolic execution subsequently negates and solves the symbolic constraints on M's inputs. Consequently, a second unit test is generated that first assigns the value one to field F and then calls M. The second unit test now reaches the abort-statement and reveals the bug. We will see in the next section that, even though the second call to DSE is the one that explores the unit under test for different values of static field F, the first call to DSE is also important; besides determining which static fields should be treated symbolically, it is also crucial in handling uninitialized classes.
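For illustration, the two unit tests generated by the second exploration DSE(M, {C}) might look roughly as follows; the [TestMethod] attribute and the explicit assignments to C.F are a sketch of the set-up code such a tool emits, not Pex's exact output.

[TestMethod]
public void M_Test1() {
  C.F = 0;   // input on the first path, satisfying ¬(F + 1 == 2)
  C.M();
}

[TestMethod]
public void M_Test2() {
  C.F = 1;   // input obtained by negating the constraint; reaches the abort-statement
  C.M();
}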

Discussion of correctness guarantees

As usual with the automatic generation of unit tests, the generated static-field inputs might not occur in any actual execution of the entire program; to avoid false positives, developers may write specifications (preconditions or invariants) that further constrain the possible values of these inputs.

Determining at runtime which static fields to treat as inputs depends on the soundness of the underlying dynamic symbolic execution engine of the DSE function. For example, when the dynamic symbolic execution engine cannot generate inputs that satisfy a particular constraint, or when all paths in a unit under test cannot be explored in a reasonable amount of time, we might fail to consider as inputs all static fields read by the unit under test.

4.2 Initialization with precise semantics

In the previous section, we addressed the issue of treating static fields of initialized classes as input to the unit under test. In this section, we explain how our technique (1) controls the execution of static initializers and (2) explores executions that trigger static initializers. Here, we consider only static initializers with precise semantics; initializers with before-field-init semantics are discussed in the next section.

4.2.1 Controlling initialization

In order to explore the interaction between a unit under test and static initializers, we must be able to control for each unit test which classes are initialized before its execution and which ones are not. This could be achieved by restarting the runtime environment (virtual machine) before each execution of a unit test and then triggering the initialization of certain classes. To avoid the high performance overhead of this naïve approach, we instrument the unit under test such that the execution simulates the effects of triggering an initializer and restarting the runtime environment.

Initialization

We insert calls to the dynamic symbolic execution engine at all points in the entire program where a static initializer could be triggered according to its semantics. For static initializers with precise semantics, we insert instrumentation calls to the dynamic symbolic execution engine on the first access to any (non-inherited) member of their class. Where to insert these instrumentation calls is determined using the inter-procedural control-flow graph of the unit under test. This means that we might insert an instrumentation call at a point in the code where, along certain execution paths, the corresponding class has already been initialized. Note that each .NET bytecode instruction triggers at most one static initializer; therefore, there is at most one instrumentation call at each program point.

For an exploration DSE(UUT, IC), the instrumentation calls in UUT have the following effect. If the instrumentation call is made for a class C that is in the IC set, then C has already been initialized before executing UUT and, thus, the instrumentation call has no effect. Otherwise, if this is the first instrumentation call for C in the execution of this unit test, then we use reflection to explicitly invoke C's static initializer. That is, we execute the static initializer no matter if the runtime environment has initialized C during the execution of a previous unit test or not. Moreover, we add class C to the TC set of classes returned by function DSE. If the same unit test has already initialized C during its execution, the instrumentation call has no effect.

In method M from Fig. 4.1, we add instrumentation calls for class C before the two accesses to static field F, that is, between lines 8 and 9 and between lines 9 and 10. (Our implementation omits the second instrumentation call in this example, but this is not always possible for methods with more interesting control flow.) Consider again the exploration DSE(M, {}).

During the execution of the generated unit test, the instrumentation call at the first access to static field F calls C's static initializer such that the unit test continues with F = 0. The instrumentation call for the second access to F has no effect since this unit test has already initialized class C. DSE returns the set {C}, as described above.

Note that an explicit call to a static initializer is itself an access to a class member and, thus, causes the runtime environment to trigger another call to the same initializer. To prevent the initializer from executing twice (and thereby duplicating its side effects), we instrument each static initializer such that its body is skipped on the first call, as shown below.

static C() {
  if (/* this is the first call */) return;
  // body of original static initializer
}

This instrumentation decouples the execution of a unit test from the initialization behavior of the runtime environment. Static initializers triggered by the runtime environment have no effect and, thus, do not actually initialize the classes, whereas our explicit calls to static initializers initialize the classes even in cases where the runtime environment considers them to already be initialized.
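For illustration, the instrumented version of method M from Fig. 4.1 might look roughly as follows. DseEngine.EnsureInitialized is a hypothetical name for the entry point of the dynamic symbolic execution engine described above, and, as noted, our implementation omits the second call in this particular example.

public static void M() {
  DseEngine.EnsureInitialized(typeof(C));   // no effect if C is in IC or already initialized by this test
  F++;
  DseEngine.EnsureInitialized(typeof(C));   // no effect here: C was initialized at the first access
  if (F == 2) abort;
}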

Uninitialization

To avoid the overhead of restarting the runtime environment after each execution of a unit test, we simulate the effect of a restart through code instrumentation. Since our technique does not depend on the behavior of the runtime environment to control class initialization, we do not have to actually uninitialize classes. It is sufficient to reset the static fields of all classes initialized by the unit under test to the default values of their declared types after each execution of a unit test. Therefore, the next execution of the static initializer during the execution of the next unit test behaves as if it ran on an uninitialized class.

Existing automatic test case generators (with the exception of the random testing tool JCrasher [42] for Java) do not reset static fields to their initial values between test runs. For code like in Fig. 4.1, Pex emits a warning that the unit under test might not leave the dynamic symbolic execution engine in a clean state. Therefore, the deterministic re-execution of the generated unit tests is not guaranteed. In fact, the Pex documentation suggests that the tester should mock all interactions of the unit under test with static state. However, this requires the tester to be aware of these interactions and renders Pex significantly less automatic.
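The resetting of static fields described above could be sketched as follows; this is a hypothetical reflection-based helper, not Pex's implementation, and read-only (initOnly) static fields may need special handling depending on the runtime.

using System;
using System.Reflection;

static class StaticState {
  // Reset all static fields declared by 'cls' to the default values of their declared types.
  public static void Reset(Type cls) {
    const BindingFlags flags = BindingFlags.Static | BindingFlags.Public |
                               BindingFlags.NonPublic | BindingFlags.DeclaredOnly;
    foreach (FieldInfo f in cls.GetFields(flags)) {
      if (f.IsLiteral) continue;   // compile-time constants cannot (and need not) be reset
      object defaultValue = f.FieldType.IsValueType
          ? Activator.CreateInstance(f.FieldType)   // default(T) for value types
          : null;                                   // null for reference types
      f.SetValue(null, defaultValue);
    }
  }
}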

4.2.2 Dynamic symbolic execution

The core idea of our exploration is as follows. Assume that we knew the set classes of all classes whose initialization may be triggered by executing the unit under test UUT. For each subset IC ⊆ classes, we perform dynamic symbolic execution of UUT such that the classes in IC are initialized before executing UUT and their static fields are symbolic inputs. The classes in classes \ IC are not initialized (that is, their initializers may be triggered when executing a unit test). We can then explore all possible initialization behaviors of UUT by testing it for each possible partition of classes into initialized and uninitialized classes.

Algorithm

Alg. 4.1 is a dynamic symbolic execution algorithm that implements this core idea, but also needs to handle the fact that the set of relevant classes is not known upfront, but determined during the execution. Function Explore takes as argument a unit under test UUT, which has been instrumented as described above. Local variable classes is the set of relevant classes determined so far, while local variable explored is the set of sets of classes that have been treated as initialized in the exploration so far; that is, explored keeps track of the partitions that have been explored. As long as there is a partition that has not been explored (that is, a subset IC of classes that is not in explored), the algorithm picks any such subset and calls the dynamic symbolic execution function DSE, where classes in IC are initialized and their static fields are treated symbolically. If this function detects any classes that are initialized during the dynamic symbolic execution, they are added to classes. The Explore function terminates when all possible subsets of the relevant classes have been explored.

Algorithm 4.1: Dynamic symbolic execution for exploring the interactions of a unit under test with static state.

1 function Explore(UUT)
2   classes ← {}
3   explored ← {}
4   while ∃ IC ⊆ classes · IC ∉ explored do
5     IC ← choose({IC | IC ⊆ classes ∧ IC ∉ explored})
6     TC ← DSE(UUT, IC)
7     classes ← classes ∪ TC
8     explored ← explored ∪ {IC}
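A minimal C# sketch of Alg. 4.1 is shown below. Representing classes by Type, identifying the unit under test by a string, passing the DSE function as a delegate, and enumerating subsets via a bitmask are assumptions made for illustration only.

using System;
using System.Collections.Generic;
using System.Linq;

static class Explorer {
  public static void Explore(string uut, Func<string, ISet<Type>, ISet<Type>> dse) {
    var classes = new List<Type>();           // relevant classes determined so far
    var explored = new HashSet<string>();     // canonical keys of the subsets IC explored so far

    while (true) {
      // Pick any subset IC of 'classes' that has not been explored yet.
      ISet<Type> ic = Subsets(classes).FirstOrDefault(s => !explored.Contains(Key(s)));
      if (ic == null) break;                  // all subsets of the relevant classes are explored
      ISet<Type> tc = dse(uut, ic);           // classes initialized before or during the unit tests
      foreach (Type c in tc)
        if (!classes.Contains(c)) classes.Add(c);
      explored.Add(Key(ic));
    }
  }

  static IEnumerable<ISet<Type>> Subsets(List<Type> classes) {
    for (long mask = 0; mask < (1L << classes.Count); mask++)
      yield return new HashSet<Type>(classes.Where((c, i) => (mask & (1L << i)) != 0));
  }

  static string Key(ISet<Type> s) =>
    string.Join(",", s.Select(t => t.FullName).OrderBy(n => n));
}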

Initialization dependencies

Alg. 4.1 enumerates all combinations of initialized and uninitialized classes in the input state of the method under test, that is, all possible partitions of classes into IC and classes \ IC. This includes combinations that cannot occur in any actual execution. If the static initializer of a class E triggers the static initializer of a class D, then there is no input state in which E is initialized, but D is not. To avoid such situations and, thus, false positives during testing, we trigger the static initializers of all classes in IC before invoking the method under test. In the above example, this ensures that both E and D are initialized in the input state of the method under test, and D's initializer is not triggered during the execution of the method.

Since the outcome of running several static initializers may depend on the order in which they are triggered, we explore all orders among dependent static initializers. We determine dependent initializers with a static analysis that approximates their read and write effects. This analysis is described in more detail in the following section.

Triggering the static initializers of the classes in IC happens at the beginning of the set-up code that precedes the invocation of the method under test in every generated unit test. This set-up code is also responsible for creating the inputs for the method under test, for instance, for allocating objects that will be passed as method arguments. Therefore, the set-up code may itself trigger static initializers, for instance, when a constructor reads a static field. To handle the dependencies between set-up code and initialization, we treat set-up code as a regular part of the unit test (like the method under test itself), that is, apply the same instrumentation and explore all possible execution paths during dynamic symbolic execution.

Handling dependencies between static initializers is particularly useful in C++, where static initialization happens before the program entry point. When linking multiple translation units, the order of initialization between the translation units is undefined. By exploring all orders of execution of dependent initializers, developers can determine dependencies that may crash a program before its entry point is even reached.

Example

The example in Fig. 4.2 illustrates our approach. The assertion on line 13 fails only if N is executed in a state in which class D is initialized (such that the if-statement may be executed), the static field G is negative (such that the if-statement will be executed and E's initialization will be triggered), and class E is not initialized (such that its static initializer will affect the value of G). We will now explain how Alg. 4.1 reveals such subtle bugs. In the first iteration, IC is the empty set, that is, no class is considered to be initialized.

 1 public class D {
 2   public static int G;
 3
 4   static D() {
 5     G = 0;
 6   }
 7
 8   public static void N() {
 9     if (G < 0) {
10       E.H++;
11       G = -G;
12     }
13     assert 0 <= G;
14   }
15 }
16
17 public class E {
18   public static int H;
19
20   static E() {
21     H = 0;
22     D.G = 1;
23   }
24 }

Figure 4.2: A C# example illustrating the treatment of static initializers with precise semantics. We use the assert-keyword to denote Code Contracts [58] assertions. The assertion on line 13 fails only if N is called in a state where D is initialized, but E is not.

Therefore, when the DSE function executes method N, class D is initialized right before line 9. Consequently, static field G is zero, the if-statement is skipped, and the assertion holds. DSE returns the set {D}.

In the second iteration, IC will be {D}, that is, the static initializer of class D is triggered by the set-up code, and static field G is treated symbolically. Since there are no constraints on the value of G yet, the dynamic symbolic execution executes method N with an arbitrary value for G, say, zero. (Note that when there are no constraints on a static-field input, our implementation does not assign an arbitrary value to the static field, but retains the value assigned to the field by the static initializer of its class.) This unit test passes and produces the constraint G < 0 for the next unit test, where G is the symbolic variable associated with G. For any such value of G, the unit test will now enter the if-statement and initialize class E before the access to E's static field H. This initialization assigns one to G such that the subsequent negation makes the assertion fail, and the bug is detected.

The call to the DSE function in the second iteration returns {D, E}. The two remaining iterations of Alg. 4.1 cover the cases in which IC is {E} or {D, E}. The former case illustrates how we handle initialization dependencies. The static initializer of class E accesses static field G of class D. Therefore, when E’s initializer is called by the set-up code of the generated unit test, D’s initializer is also triggered. (Recall that the set-up code and all static initializers are instrumented like the method under test). This avoids executing N in the impossible situation where E is initialized, but D is not. The rest of this iteration is analogous to the first iteration, that is, class D gets initialized (this time while executing the set-up code), the if-statement is skipped, and the assertion holds. Finally, for IC = {D, E}, all relevant classes are initialized. The dynamic symbolic execution will choose negative and non-negative values for G. The assertion holds in either case.

Discussion of correctness guarantees

Similarly to the previous section, our approach might generate unit tests for executions that do not occur in the entire program. For instance, in the context of the entire program, the static initialization of a particular class could always be triggered before the execution of a unit under test, and never during its execution. To avoid false positives, developers could write a new kind of specification (e.g., in the form of preconditions) that expresses which classes are required to be initialized before the execution of a unit under test.

Determining at runtime the set of all classes whose initialization may be triggered by executing the unit under test depends on the soundness of the underlying dynamic symbolic execution engine of the DSE function, as discussed in the previous section for treating static fields as inputs.

Applications

Alg. 4.1 can be implemented in any testing tool based on dynamic symbolic execution. We have implemented it in Pex, whose existing dynamic symbolic execution engine is invoked by our DSE function. Alg. 4.1 could also be implemented in jCUTE [127] for testing how static fields and static blocks in Java interact with a unit under test. Moreover, this algorithm can be adjusted to perform all dynamic tasks statically for testing tools based on static symbolic execution. For instance, Symbolic Java PathFinder [125] could then be extended to take static state into account.

By treating static fields symbolically, our technique gives meaning to specifications that refer to static fields, like assertions or preconditions. For example, an assertion about the value of a static field is now treated as a branch by the symbolic execution. One could also support preconditions that express which classes are required to be initialized before the execution of a method.

As part of the integration with unit testing frameworks, like NUnit for .NET and JUnit for Java, many automatic test case generators support defining set-up methods for a unit under test. Such methods allow testers to initialize and reset static fields manually. Since set-up methods might express preconditions on static fields (in the form of code), we extended our technique not to override the functionality of these methods. That is, when a set-up method assigns to a static field of a class C, we do not trigger the initialization of class C and do not treat its static fields symbolically. We do, however, reset the values of all static fields in class C after each execution of a unit test such that the next execution of the set-up method starts in a fresh state.

This technique could also be used in existing unit testing frameworks for detecting whether a set-up method allows for any static fields to retain their values between runs of the unit under test. This is achieved by detecting which static fields are modified in the unit under test but have not been manually set up. If such fields exist, an appropriate warning could be emitted by the framework.
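As an illustration, an NUnit-style set-up method that expresses such a precondition in code might look as follows; it reuses class C from Fig. 4.1, and the method name is only a sketch.

[SetUp]
public void InitializeStaticState() {
  // C.F is assigned manually here, so our technique does not treat it symbolically,
  // but it still resets C's static fields after each test execution.
  C.F = 1;
}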

4.3 Initialization with before-field-init semantics

The technique presented in the previous section handles static initializers with precise semantics. Static initializers with before-field-init semantics, which may be triggered at any point before the first access to a static field of the class, impose two additional challenges. First, they introduce non-determinism because the static initializer of any given class may be triggered at various points in the unit under test. Second, in addition to the classes that have to be initialized in order to execute the unit under test, the runtime environment could in principle choose to trigger any other static initializer with before-field-init semantics, even initializers of classes that are completely unrelated to the unit under test. In this section, we describe how we solve these challenges. Our solution uses a static analysis to determine the program points at which the execution of a static initializer with before-field-init semantics may affect the behavior of the unit under test. Then, we use a modified dynamic symbolic execution function to explore each of these possibilities.

As the running example of this section, consider method P in Fig. 4.3. The static initializer of class D has before-field-init semantics and must be executed before the access to field D.Fd on line 10. If the initializer runs on line 5 or 9, then the assertion on line 8 succeeds. If, however, the initializer runs on line 7, the assertion fails because the value of field C.Fc has been incremented (line 15) and is no longer equal to two.

 1 public static class C {
 2   static int Fc = 0;
 3
 4   public static void P() {
 5     // static initializer of 'D'
 6     Fc = 2;
 7     // static initializer of 'D'
 8     assert Fc == 2;
 9     // static initializer of 'D'
10     if (D.Fd == 3)
11       Fc = E.Fe;
12   }
13
14   static class D {
15     public static int Fd = C.Fc++;
16   }
17
18   static class E {
19     public static int Fe = 11;
20   }
21 }

Figure 4.3: A C# example illustrating the non-determinism introduced by static initializers with before-field-init semantics. The assertion on line 8 fails if D's static initializer is triggered on line 7.

This bug indicates that the unit under test is affected by the non-deterministic behavior of a static initializer with before-field-init semantics. Such errors are particularly difficult to detect with standard unit testing since they might not manifest themselves reproducibly.

Critical points

A static initializer with before-field-init semantics may be triggered at any point before the first access to a static field of its class. To reduce the non-determinism that needs to be explored during testing, we use a static analysis to determine the critical points in a unit under test, that is, those program points where triggering a static initializer might actually affect the execution of the unit under test. All other program points can be ignored during testing because no new behavior of the unit under test will be exercised.

A critical point is a pair consisting of a program point i and a class C. It indicates that there is an instance or static field f that is accessed both by the instruction at program point i and the static initializer of class C such that the instruction, or the static initializer, or both modify the field. In other words, a critical point indicates that the overall effect of executing the static initializer of C and the instruction at i depends on the order in which the execution takes place. Moreover, a pair (i, C) is a critical point only if program point i is not dominated in the control-flow graph by an access to a static field of C, that is, if it is possible to reach program point i without first initializing C.

In the example of Fig. 4.3, there are five critical points: (6, C), (6, D), (8, D), (10, D), and (11, E), where we denote program points by line numbers. Note that even though the static initializer of class E could be triggered anywhere before line 11, there is only one critical point for E because the behavior of method P is the same for all these possibilities.

We determine the critical points in a method under test in two steps. First, we use a simple static analysis to compute, for each program point i, the set of classes with before-field-init initializers that might get triggered at point i. This set is denoted by prospectiveClasses(i). In principle, it includes all classes with before-field-init initializers in the entire program, except those that are definitely triggered earlier, that is, before i. Since it is not feasible to consider all of them during testing, we focus on those classes whose static fields are accessed by the method under test. This is not a restriction in practice: even though the Common Language Infrastructure standard [53] allows more initializers to be triggered, the Common Language Runtime implementation, version 4.0, triggers the initialization of exactly the classes whose static fields are accessed by the method. Therefore, in Fig. 4.3, prospectiveClasses(8) is the set {D, E}.

Second, we use a static analysis to determine for each program point i and class C in prospectiveClasses(i) whether (i, C) is a critical point. For this purpose, the static analysis approximates the read and write effects of the instruction at program point i and of the static initializers of all classes in prospectiveClasses(i). Recall from the previous chapter that the read effect of a statement is the set of fields read by the statement or by any method the statement calls directly or transitively. Analogously, the write effect of a statement is the set of fields written by the statement or by any method the statement calls directly or transitively.
The pair (i, C) is a critical point if (1) i's read effect contains a (static or instance) field f that is included in the write effect of C's static initializer, or (2) i's write effect contains a (static or instance) field f that is included in the read or write effect of C's static initializer. For instance, for line 8 of our example, (8, D) is a critical point because the statement on line 8 reads field Fc, which is written by the static initializer of class D, and D is in prospectiveClasses(8). However, even though class E is in prospectiveClasses(8), (8, E) is not a critical point because the effects of the statement on line 8 and of E's static initializer are disjoint.

As in the previous chapter, read and write effects are sets of fully-qualified field names, which allows us to approximate them without requiring alias information. Our static effect analysis is inter-procedural. It explores the portion of the whole program it can access (in particular, the entire assembly of the method under test) to compute a call graph that includes information about dynamically-bound calls.

A critical point (i, C) indicates that the dynamic symbolic execution should trigger the initialization of class C right before program point i. However, C's static initializer might lead to more critical points, because its effects may overlap with the effects of other static initializers and because it may trigger the initialization of additional classes, which, thus, must be added to prospectiveClasses. To handle this interaction, we iterate over all options for critical points and, for each choice, inline the static initializer and recursively invoke our static analysis.
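The effect-based check just described can be summarized by the following hedged sketch, which assumes that read and write effects are represented as sets of fully-qualified field names; the dominance condition on program point i is checked separately.

using System.Collections.Generic;

static class CriticalPoints {
  // Returns true if the pair (i, C) is a critical point, given the effects of the
  // instruction at program point i and of C's static initializer.
  public static bool IsCriticalPoint(ISet<string> instrRead, ISet<string> instrWrite,
                                     ISet<string> initRead, ISet<string> initWrite) {
    return instrRead.Overlaps(initWrite)     // (1) instruction reads a field the initializer writes
        || instrWrite.Overlaps(initRead)     // (2) instruction writes a field the initializer reads ...
        || instrWrite.Overlaps(initWrite);   //     ... or also writes
  }
}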

Dynamic symbolic execution

We instrument the unit under test to include a marker for each critical point (i, C). We enhance the DSE function called from Alg. 4.1 to trigger the initialization of class C when the execution hits such a marker. If there are several markers for one class, the DSE function explores all paths of the unit under test for each possible option. Conceptually, one can think of adding an integer parameter nC to the unit under test and interpreting the n-th marker for class C as a conditional statement if (nC == n) { initC }, where initC calls the static initializer of class C if it has not been called earlier during the execution of the unit test. Dynamic symbolic execution will then explore all options for the initialization of a class C by choosing different values for the input nC.

Since (8, D) is a critical point in our example, DSE will trigger the initialization of class D right before line 8 during the symbolic execution of method P. As a result, the assertion violation is detected.
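For illustration, the conceptual instrumentation of method P from Fig. 4.3 with the markers for class D might look roughly as follows, using the assert notation of that figure. The parameter nD and the helper EnsureInitialized are illustrative assumptions; in the implementation, the choice point is controlled by the DSE engine rather than an explicit parameter, and the markers for classes C and E are handled analogously.

public static void P(int nD) {                // nD selects where D's initializer is triggered
  if (nD == 0) EnsureInitialized(typeof(D));  // critical point (6, D)
  Fc = 2;
  if (nD == 1) EnsureInitialized(typeof(D));  // critical point (8, D): leads to the assertion violation
  assert Fc == 2;
  if (nD == 2) EnsureInitialized(typeof(D));  // critical point (10, D)
  if (D.Fd == 3)                              // the field access itself initializes D if not done earlier
    Fc = E.Fe;                                // critical point (11, E) for class E
}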

Discussion of correctness guarantees

Our technique for handling static initializers with before-field-init semantics may yield irrelevant critical points (for instance, when an instruction and a static initializer both have an instance field f in their effects, but at runtime, access f of different objects) and, thus, produce redundant unit tests. Our technique may also miss critical points (for instance, when it fails to consider a method override in an assembly that is not accessible to the static analysis) and, thus, the generated unit tests might not explore all possible behaviors.

As a final remark, note that, even though errors detected with this technique are possible according to the Common Language Infrastructure standard, they may manifest themselves very rarely in practice. This might also be the case for certain classes of concurrency errors, but does not degrade the value of techniques for their detection.

4.4 Experimental evaluation

We have evaluated the effectiveness of our technique on 30 open-source applications written in C#. These applications were arbitrarily selected from applications on Bitbucket, CodePlex, and GitHub. Our suite of applications contains a total of 423,166 methods, 47,515 (11%) of which directly access static fields. All classes of these applications define a total of 155,632 fields (instance and static), 28,470 (18%) of which are static fields; 14,705 of the static fields (that is, 9% of all fields) are static read-only fields. There is a total of 1,992 static initializers, 1,725 (87%) of which have precise semantics, and 267 (13%) of which have before-field-init semantics.

To determine which of the 47,515 methods that directly access static fields are most likely to have bugs, we implemented a lightweight scoring mechanism. This mechanism statically computes a score for each method and ranks all methods by their score. The score for each method is based on vulnerability and accessibility scores. The vulnerability score of a method indicates whether the method directly accesses static fields and how likely it is to fail at runtime because of a static field, for instance, due to failing assertions, or division-by-zero and arithmetic-overflow exceptions involving static fields. This score is computed based on nesting levels of expressions and how close a static field is to an operation that might throw an exception. The accessibility score of a method indicates how accessible the method and the accessed static fields are from potential clients of the application. In particular, this score indicates the level of accessibility from the public interface of the application, and suggests whether a potential bug in the method is likely to be reproducible by clients of the application. The final score for each method is the product of its vulnerability and accessibility scores.

To compare the number of errors detected with and without our technique, we ran Pex with and without our implementation on all methods with a non-zero score. There were 454 methods with a non-zero score in the 30 applications. Tab. 4.1 summarizes the results of our experiments on the applications in which bugs were detected. The first column of the table shows the name of each application. The second column shows the total number of methods with a non-zero score for each application. The two rightmost columns of the table show the number of errors that our technique detected in these methods. These errors do not include errors already detected by Pex without our technique; they are all caused by interactions of the methods under test with static state.

Application    Number of methods    Number of errors
                                    init    init&inputs
Boggle1        60                   -       24
Boogie2        21                   -       6
Ncqrs3         38                   1       1
NRefactory4    37                   -       9
Scrabble5      64                   -       2
Total          220                  1       42

Table 4.1: Summary of our experiments. The first column shows the name of each application. The second column shows the total number of tested methods from each application. The two rightmost columns show the number of errors detected without and with treating static fields as inputs to the unit under test, respectively.

1 http://boggle.codeplex.com, rev: 20226
2 http://boogie.codeplex.com, rev: e80b2b9ac4aa
3 http://github.com/ncqrs/ncqrs, rev: 0102a001c2112a74cab906a4bc924838d7a2a965
4 http://github.com/icsharpcode/NRefactory, rev: ae42ed27e0343391f7f30c1ab250d729fda9f431
5 http://wpfscrabble.codeplex.com, rev: 20226

More specifically, column "init" shows the number of errors detected by simply triggering static initializers at different points in the code. These errors are, thus, caused by calling static initializers (with both semantics) before or during the execution of the unit tests without treating static fields as inputs. Column "init&inputs" shows the number of errors detected by our technique, that is, by treating static fields symbolically and systematically controlling the execution of static initializers. As shown in the last column of the table, our technique detected 42 bugs that are not found by Pex. Related work suggests that existing test case generators would not find these bugs either (see Sect. 4.5).

A failed unit test does not necessarily mean that the application actually contains code that exhibits the detected bug; this uncertainty is inherent to unit testing since methods are tested in isolation rather than in the context of the entire application. However, all of the detected bugs may surface during maintenance or code reuse. In particular, for 25 of the 42 detected bugs, both the buggy method and the accessed static fields are public. Therefore, when the applications are used as libraries, client code can easily exhibit these bugs.

We have also manually inspected static initializers from all 30 applications and distilled their three most frequent usage patterns. Static initializers are typically used for:

1. Initializing static fields of the same class to constants or simple computations; these initializers are often inline initializers, that is, have before-field-init semantics. However, since they neither read static fields of other classes nor have side effects besides assigning to the static fields of their class, the non-determinism of the before-field-init semantics does not affect program execution. (The sketch after this list illustrates this pattern and the next one.)



2. Implementing the singleton pattern in a lazy way; these initializers typically have precise semantics.

3. Initializing public static fields that are mutable; these fields are often meant to satisfy invariants such as non-nullness. However, since the fields are public, these invariants can easily be violated by client code or during maintenance. This pattern is especially susceptible to static-field updates after the initialization, a scenario that we cover by treating static fields as inputs of the unit under test.
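The following C# sketch illustrates the first two patterns; the classes are hypothetical examples rather than code from the evaluated applications.

// Pattern 1: only inline initializers, so the class has before-field-init
// semantics; the initializer neither reads foreign static state nor has
// other side effects.
class Defaults
{
    public static readonly int Capacity = 64;
    public static readonly string Greeting = "hello";
}

// Pattern 2: a lazily initialized singleton; the explicit static
// constructor gives the class precise initialization semantics.
class Registry
{
    private static readonly Registry instance;

    static Registry()
    {
        instance = new Registry();
    }

    private Registry() { }

    public static Registry Instance
    {
        get { return instance; }
    }
}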

In none of these common usage patterns do initializers typically have side effects besides assigning to static fields of their class. This might explain why we did not find more bugs that are caused by static initialization alone (column “init” in Tab. 4.1); it is largely irrelevant when such initializers are triggered. An interesting example of the third pattern was found in application Boggle, which uses the Caliburn.Micro library. This library includes a public static field LogManager.GetLog, which is initialized by LogManager’s static initializer to a non-null value. GetLog is read by several other static initial- izers, for instance, the static initializer of class Coroutine, which assigns the value of GetLog to a static field Log. If client code of the Caliburn.Micro library assigned null to the public GetLog field before the initialization of class Coroutine is triggered, the application might crash; Coroutine will then initialize Log with the null value, which causes a null-pointer exception when Coroutine’s BeginExecute method dereferences Log. Our technique reveals this issue when testing BeginExecute; it explores the possibility that LogManager is initialized before BeginExecute is called whereas Coroutine is not, and it treats GetLog as an input to BeginExecute such that the dynamic symbolic execution will choose null as a possible value. Note that this issue is indeed an initialization problem. Since Coroutine.Log is not public, a client could not cause this behavior by assigning null directly to Log.
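The essence of the Boggle example can be sketched as follows. The types are heavily simplified with respect to the actual Caliburn.Micro library (where GetLog is a delegate-typed field), but they exhibit the same initialization hazard.

interface ILog
{
    void Info(string message);
}

class SilentLog : ILog
{
    public void Info(string message) { }
}

static class LogManager
{
    // Public and mutable: client code may overwrite it, e.g., with null.
    public static ILog GetLog = new SilentLog();
}

static class Coroutine
{
    // Reads LogManager.GetLog when Coroutine's initializer is triggered.
    static readonly ILog Log = LogManager.GetLog;

    public static void BeginExecute()
    {
        // Null dereference if GetLog was null when Coroutine was initialized.
        Log.Info("executing coroutine");
    }
}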

4.5 Related work

Most existing automatic test case generation tools ignore the potential interactions of a unit under test with static state. These tools range from random testing (like JCrasher [42] for Java), over feedback-directed random testing (like Randoop [122] for Java), to symbolic execution (like Symbolic

Java PathFinder [125]) and dynamic symbolic execution (like Pex for .NET or jCUTE [127] for Java). To the best of our knowledge, existing testing tools such as the above do not take into account the interference of static state with a unit under test, with the exception of JCrasher. JCrasher ensures that each test runs on a “clean slate”; it resets all static state initialized by any previous test runs either by using a different class loader to load each test, or by rewriting the program under test at load time to allow re-initialization of static state. Nevertheless, JCrasher does not address the four issues described at the beginning of this chapter. Unit testing frameworks, like NUnit for .NET and JUnit for Java, require the tester to manage static state manually in set-up methods in order to ensure the clean execution of the unit tests. Therefore, the tester must be aware of all interactions of the unit under test with static state. As a result, these frameworks become significantly less automatic for unit tests that interact with static state. Static analyzers for object-oriented languages, such as Clousot [59] for .NET and ESC/Java [62] for Java, do not reason about static initialization. An extension of Spec# [100] supports static verification in the presence of static initializers, but requires significant annotation overhead. We are, therefore, not aware of any tool that automatically takes static state into account and detects the kinds of errors described in this chapter.

4.6 Summary and remarks

To automatically check the potential interactions of static state with a unit under test and thoroughly evaluate its oracles, we have proposed a novel technique in automatic test case generation based on static analysis and dynamic symbolic execution. Our technique treats static fields as input to the unit under test and systematically controls the execution of static initializers. We have implemented this technique as an extension to Pex and used it to detect errors in open-source applications. Our results shed light on which kinds of errors are currently missed by existing automatic test case generators, demonstrate the effectiveness of our technique, and bring us a step closer to our goal of automatically providing evidence on whether a program is correct.

As future work, one could prune redundant explorations more aggressively; this is promising since our evaluation suggests that many static initializers have very small read and write effects and, thus, very limited interactions with the unit under test.

Chapter 5

Delfy: Dynamic test generation for Dafny

In Ch.2, we presented the benefits of effectively combining an unsound static analysis with systematic testing. Here, we discuss how beneficial it is to integrate a dynamic test generation tool into a sound and interactive verifier. More specifically, we present Delfy, a dynamic test generation tool for the Dafny programming language. In addition to handling advanced constructs of the language, Delfy is designed both to complement and be complemented by the Dafny verifier. When complementing the verifier, Delfy can reduce the effort of debug- ging spurious verification errors. In particular, take an assertion that Dafny fails to verify. If the assertion is indeed failing and Delfy terminates in a sound search, then a concrete counterexample is generated. If, however, Delfy’s exploration is sound and no counterexample is generated, then the verification error for that assertion is definitely spurious. When being complemented by the verifier, Delfy can reduce the effort of trying to verify an incorrect program. Users of verifiers typically start by writing the most general specifications first, for instance, a method post- condition. They subsequently attempt to verify these general specifications by providing more information, such as loop invariants or assertions in the method body, which serve as auxiliary specifications for making a proof go through. In such common situations, Delfy can provide early feedback as to whether the original, general specifications hold in the first place. More- over, in case Delfy terminates in a sound search without violating these specifications, they have been proven correct. Consequently, all auxiliary specifications are no longer necessary in order to prove, say, a method post- condition, thus reducing the annotation overhead for the user. Not only does this increase user productivity, but also speeds up the verification time by reducing the number of proof obligations for the verifier. Outline. This chapter is organized as follows. In Sect. 5.1, we briefly present Dafny, and in Sect. 5.2, we give an overview of Delfy and explain how it handles interesting language and specification constructs of Dafny. Sects. 5.3 and 5.4 discuss the benefits of integrating a dynamic test genera-

tion tool into a verifier, and Sect. 5.5 describes the usage environment of this integration. We compare Delfy to a static verification debugger in Sect. 5.6, and we discuss related work in Sect. 5.7.

5.1 Dafny

Dafny [98] is a programming language with built-in specification constructs to support sound static verification. The Dafny program verifier is powered by the Boogie verifier [11] and the Z3 constraint solver [47], and targets functional correctness of programs.

Dafny is an imperative, sequential, class-based programming language that builds in specification constructs, including pre- and postconditions, frame specifications (that is, read and write sets), loop invariants, and termination metrics. To allow more expressive specifications, the language also offers updatable ghost variables, non-deterministic statements, uninterpreted functions, and types like sets and sequences. The Dafny compiler produces C# code but omits all specifications and ghost state, which are used only during verification.

The Dafny verifier is preferably run in an integrated development environment (IDE) [103, 104], which extends Microsoft Visual Studio. The verifier runs in the background of the IDE while the programmer is editing the program. Consequently, as soon as the verifier produces errors, the programmer may respond by changing the program accordingly to achieve a proof of functional correctness.

5.2 Dynamic test generation for Dafny

Delfy implements [56, 132] systematic dynamic test generation as shown in Alg. 1.1 of Sect. 1.1, with the only difference that the concrete and symbolic executions of a unit under test happen simultaneously. In particular, we have created a new compiler from Dafny directly to .NET bytecode, which is much more efficient than the existing Dafny-to-C# compiler. The new compiler inserts call-backs to Delfy in the compiled code. These call-backs pass to Delfy the Dafny code that should be executed symbolically, that is, the symbolic execution happens on the Dafny level. In other words, given a Dafny unit under test, Delfy compiles the code into .NET bytecode and runs the compiled unit under test. The symbolic execution in Delfy runs whenever a call-back occurs in the code, and all constraints are solved with Z3. Delfy implements various exploration strategies, namely, depth-first, breadth-first, and generational-search strategies. As we mentioned in Sect. 5.1, the existing Dafny compiler omits all specifications, which are used only during verification. For this reason, we extended the support of our new compiler to translate Dafny specifications 5.2. Dynamic test generation for Dafny 99 into Code Contracts [58], including loop invariants, termination metrics, pre- and postconditions, assumptions, assertions, and old-expressions. (Old- expressions can be used in postconditions, and stand for expressions evalu- ated on entry to a method.) The Code Contracts library methods are called on the bytecode level. Delfy checks frame specifications, that is, read or write sets, by deter- mining whether every object, which is read or modified in a function1 or method body and is not newly allocated, is in the specified read or write set. Fresh-expressions, which are allowed in method postconditions, say that all non-null objects specified in the expression must be allocated in the method body. Delfy checks fresh-expressions by determining whether the objects actually allocated in the method body include those specified in the fresh-expression of the postcondition. Dafny has support for non-deterministic assignments, non-deterministic if-statements, and non-deterministic while-statements. A * in Dafny de- notes the non-deterministic value, as in the assignment var n := *. For each non-deterministic value, the symbolic execution in Delfy introduces a fresh symbolic variable, that is, a new input to the unit under test. Conse- quently, the symbolic execution collects constraints on such variables, and the constraint solver generates inputs for them, such that execution is guided toward all those unexplored program paths that are guarded by the non- determinism. To be more expressive, Dafny also supports uninterpreted functions and assign-such-that-statements. Delfy handles uninterpreted functions again by introducing a fresh symbolic variable for their return value, which is however constrained by an assumed condition (see Sect. 1.1), of the form Assume(c), saying that the return value must satisfy the function specifications. Recall that, when an assumed condition is added to the path constraint, its condi- tion c is never negated. Assign-such-that-statements assign a value to a variable such that a con- dition holds. For example, the statement var n :| 0 <= n <= 7 as- signs a value to variable n such that 0 <= n <= 7. 
Delfy executes this statement by introducing a fresh symbolic variable for the assigned variable, and an assumed condition saying that the condition of the assign-such-that statement must hold for its value. When the assumed condition is not satisfiable, execution is aborted, otherwise all paths guarded (directly or indirectly) by the assigned variable can be explored. To express constraints on sets, Delfy uses the extended array theory in Z3 [48], which allows us to axiomatize set theory with boolean algebra. The only operation on sets that cannot be axiomatized with the array theory is set cardinality. We axiomatize sequences using a pair that contains the

sequence length and a function mapping indexes to elements. By using universal and existential quantification, we express all operations on sequences.

1 Dafny functions are mathematical functions, whose body, if any, must consist of exactly one expression.
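As a rough illustration of the translation of Dafny specifications into Code Contracts mentioned above, a compiled method with a precondition, a postcondition over an old-expression, and a body assertion could contain contract checks of the following shape. The Counter class and the exact calls are illustrative, not the code that Delfy's compiler actually emits.

using System.Diagnostics.Contracts;

class Counter
{
    int value;

    // Roughly corresponds to a Dafny method of the form:
    //   method Inc(d: int)
    //     requires 0 <= d;
    //     ensures value == old(value) + d;
    public void Inc(int d)
    {
        Contract.Requires(0 <= d);
        Contract.Ensures(value == Contract.OldValue(value) + d);

        var previous = value;
        value = value + d;
        Contract.Assert(previous <= value);  // compiled from a Dafny assert
    }
}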

5.2.1 Handling input-dependent loops

For the systematic dynamic test generation to terminate in a reasonable amount of time, existing tools typically impose a bound on the number of iterations of input-dependent loops. We define an input-dependent loop as a loop whose number of iterations depends on an input of the unit under test. Such loops can cause an explosion in the number of constraints to be solved and in the number of program paths to be explored [74]. However, arbitrarily bounding their number of iterations might leave certain program paths unexplored and, thus, bugs may be missed. As an example, consider the Dafny code in Fig. 5.1. Method Mul com- putes the product of its inputs a, b, and contains a loop whose number of iterations indirectly depends on a. If we test this method with an initial input value a = 7 for a, where a is the symbolic variable associated with a, the loop condition alone triggers the generation of the following constraints (0 < a) ∧ (0 < a − 1) ∧ ... ∧ ¬(0 < a − 7). Note that a, and not aa, appears in this (partial) path constraint because the symbolic execution in dynamic test generation tracks only direct data dependencies on the inputs of the unit under test. (This is because the solution to a path constraint should immediately correspond to an assignment to inputs of the unit under test that is likely to steer execution

method Mul(a: int, b: int)
  requires 0 <= a && 0 <= b;
{
  var aa := a;
  var sum := 0;
  while (0 < aa)
    invariant 0 <= aa <= a;
    invariant sum == (a - aa) * b;
  {
    sum := sum + b;
    aa := aa - 1;
  }
  assert sum == a * b;
}

Figure 5.1: A Dafny example illustrating how Delfy handles input-dependent loops. Method Mul computes the product of its inputs a, b, and contains a loop whose number of iterations indirectly depends on a.

along an unexplored program path.) Now consider that each of the above constraints is later negated to generate a new test case, and that Dafny supports mathematical (unbounded) integers. As a result, an infinite number of test cases could be generated to exercise the loop in method Mul. Moreover, in case an input-tainted branch condition occurs after the loop, all loop iterations must be explored for each of the two possible values of the condition.

Summarizing input-dependent loops These issues can be significantly alleviated when the programmer provides a loop invariant. In particular, an invariant of an input-dependent loop may serve as a summary for the loop, in addition to being checked as part of the test oracle. Note that, in this chapter, we abuse the term “summary” to express that reasoning about many loop iterations happens in one shot, although we do not refer to a logic formula of loop pre- and postconditions, as is typically the case in compositional symbolic execution [66,2]. Our approach consists in the following steps. We check the loop invariant as part of the test oracle in all test cases for which the number of loop iterations does not exceed a user-specified bound. In case it is feasible to generate a test case for which the number of loop iterations exceeds the bound, we no longer check the loop invariant as part of the test oracle. Instead, we assume it accurately summarizes the loop body, as is typically done in modular verification [63, 81]. For example, assume that the user-specified bound on the number of loop iterations is set to two. To test method Mul of Fig. 5.1, Delfy generates inputs, that is, values for parameters a and b of Mul, such that the following cases are exercised: (1) the loop is not entered (say, for a := 0, b := 3), (2) the loop is entered and the loop bound is not exceeded (for a := 1, b := 3), and (3) the loop is entered but the loop bound is exceeded (for a := 3, b := 3). In all of these cases, the loop invariant is checked (with Code Contracts assertions) both before executing the while-statement, and at the end of the loop body (for the first and second iterations). When the bound on the number of loop iterations is exceeded, Delfy stops the execution of the current test case and warns about unsound cov- erage of the unit under test. This means that any code occurring after the loop remains completely unexercised when the loop iterates more than the specified number of times. When this is the case, like for method Mul, we address the issue by running the test case that exceeded the loop bound again (for Mul, a := 3, b := 3), but this time the loop invariant is treated as a summary and is assumed to hold. More specifically, the loop is executed only concretely, and not sym- bolically, until it terminates. At that point, all loop targets (that is, all variables that are potentially written to in the loop body) are havocked: we 102 Chapter 5. Delfy: Dynamic test generation for Dafny associate a fresh symbolic variable with each loop target, which, in case of an input-tainted loop target, is equivalent to forgetting all symbolic constraints collected for this loop target before the loop. In our example, variable aa is a loop target, since it is modified in the loop. As a result, after the loop terminates, we havoc aa and effectively forget that its value is non- negative, which follows from the precondition and the first assignment in method Mul. Note that sum is also a loop target, and that loop targets are computed using a lightweight static analysis, as we explain below. It is necessary to havoc all loop targets, not only the input-tainted ones. We explain the reason through an example. Imagine that a local variable, initialized to zero, counts the number of iterations of an input-dependent loop. This variable is, of course, a loop target, but it is not input-tainted. 
Assume that, after the loop, there is a conditional statement on a value of this variable, both branches of which are feasible. If we have not assigned a fresh symbolic variable to the counter after the loop, we will not exercise the branch of the conditional statement that is not satisfied by the concrete value of the counter for this particular execution. In fact, we will not even consider the unexercised branch as a feasible program path for subsequent explorations. After havocking all loop targets, we add two assumed conditions to the path constraint. First, we assume the negation of the loop condition. In our example, we assume ¬(0 < aa0), where aa0 is the fresh symbolic variable associated with variable aa. All explorations that contain this negated loop condition in their path constraint exercise code deeper in the execution tree (that is, after the loop). Second, we assume the loop invariants as if they were a summary con- straining the possible values of the loop targets. In our example, we assume 0 ≤ aa0 ≤ a ∧ sum0 = (a − aa0) ∗ b, where a, b are the symbolic variables originally associated with parameters a, b, and sum0 is the fresh symbolic variable associated with sum. Note that, after the loop, all symbolic con- straints on input-tainted loop targets refer to them by their fresh symbolic variables. Consequently, the symbolic execution now also tracks indirect data de- pendencies on the inputs of the unit under test, through the fresh symbolic variables. These can be considered as additional inputs to the unit under test that are, however, not taken into account when generating new test cases, that is, any values assigned to the fresh symbolic variables by the solver do not correspond to actual inputs of the unit under test. Concretely, constraints on these variables are used to determine which program paths are infeasible. For example, after having assumed the negated loop condition and the loop invariants in method Mul, Delfy determines that a program path violating the assertion is infeasible. Note that, when the loop targets include object references, we havoc all variables (in the concrete and symbolic stores) of the same type as these ref- 5.2. Dynamic test generation for Dafny 103 erences, to handle aliasing conservatively (just like in verification [97]). This is done to avoid falsely considering an alias of a loop target as unmodified by the loop, and potentially missing bugs along paths guarded by this alias. Dafny does not support subtyping, which makes determining these addi- tional variables an easy task. However, as any source of over-approximation, this can result in the generation of false positives. Our approach for summarizing input-dependent loops enables exercising code of the unit under test occurring after such loops regardless of their number of iterations. This comes at the cost of potentially generating test cases with diverging or insane path constraints, as we explain next. If the user-provided loop invariant is too weak, that is, it does not pre- cisely constrain all loop targets, the symbolic execution will try to explore execution paths that are infeasible in practice. For example, if we drop the second loop invariant in method Mul, the symbolic execution will try to explore the execution path that makes the assertion fail, which it would otherwise know to be infeasible due to the assumed conditions. Since sum0 is unconstrained, the solver will generate values that satisfy the negation of the asserted condition, that is, ¬(sum0 = a ∗ b). 
For this new test case how- ever, the execution will not follow the expected program path that leads to an assertion violation (since the assertion holds). This situation in which a new test case does not follow the program path it was generated to exercise is called a divergence. Now imagine that the loop invariants in method Mul are too strong, for instance, the first loop invariant becomes 0 < aa <= a. In this case, when assuming the negated loop condition ¬(0 < aa0) and the first loop invariant 0 < aa0 ≤ a after the loop terminates, the path constraint for the current execution evaluates to false. In other words, even though the concrete execution is actually exercising a program path, the path constraint indicates that this path is infeasible. We call such path constraints insane. Another consequence of this approach is that the body of an input- dependent loop might not be thoroughly exercised when the number of loop iterations exceeds the user-specified bound. In particular, since beyond the bound we execute the loop only concretely, and not symbolically, program paths and bugs might be missed. Despite these limitations, our approach prevents the exploration from getting stuck in a loop body by targeting code occurring after the loop instead.
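The summarization step can be pictured as the following C# sketch. SymbolicStore, PathConstraint, and SymExpr are hypothetical types standing in for Delfy's internal representations, so this is a conceptual outline rather than the actual implementation.

using System;
using System.Collections.Generic;

static class LoopSummarization
{
    // Invoked once the concrete execution of an input-dependent loop has
    // terminated and the iteration bound was exceeded.
    static void SummarizeLoop(
        SymbolicStore store,
        PathConstraint pc,
        IEnumerable<string> loopTargets,          // computed by a static analysis
        Func<SymbolicStore, SymExpr> loopGuard,
        Func<SymbolicStore, SymExpr> invariant)
    {
        // Havoc every loop target: bind it to a fresh symbolic variable,
        // forgetting the constraints collected before the loop.
        foreach (var target in loopTargets)
        {
            store.Bind(target, SymExpr.Fresh(target));
        }

        // Execution continues after the loop, so assume the negated guard...
        pc.Assume(SymExpr.Not(loopGuard(store)));

        // ...and assume the user-provided invariant as a summary of the
        // loop's effect on the havocked targets.
        pc.Assume(invariant(store));
    }
}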

Checking loop invariants Delfy addresses the issue of unsound coverage of input-dependent loop bodies (when the number of loop iterations exceeds the pre-defined bound), by implementing a mode for thoroughly checking loop invariants. To thoroughly check whether an invariant is maintained by an input- 104 Chapter 5. Delfy: Dynamic test generation for Dafny dependent loop, Delfy again uses a verification technique [63, 81], similar to the one described above. Concretely, on entry to an input-dependent loop, Delfy havocs all loop targets (and their potential aliases). Delfy then assumes the loop condition and the loop invariant, using assumed conditions as before. We determine whether the invariant is indeed maintained by the loop by checking whether its assertion at the end of the loop body holds along all execution paths within the loop. In other words, the invariant is maintained if and only if Delfy exercises all execution paths within the loop body in a sound search, and the invariant is never violated. Note that this technique is also susceptible to false positives, divergence, and insane path constraints, depending on the over-approximation of the loop targets and the strength of the given loop invariant. However, when Delfy proves that a loop invariant is maintained by a loop, the overhead of auxiliary specifications that the programmer writes within a loop body in order to prove the invariant (with the verifier) can be reduced. Moreover, Delfy can reduce the effort of trying to verify an incorrect invariant, by generating tests that violate the invariant. Also note that an invariant on loop entry is correct if and only if Delfy exercises all execution paths leading to the loop in a sound search, and the invariant is never violated.

5.3 Complementing verification with testing

The Dafny verifier is sound. Delfy can, therefore, complement the verifier by targeting only those properties in a unit under test that have not been veri- fied. In other words, all verified properties have been soundly proven correct and need not be tested. The remaining properties have not been verified, either because there is a bug in the unit under test, or the programmer has not provided all necessary specifications, or the verifier “runs out of steam”, that is, the specifications are too advanced for the verifier to reach a proof within a certain time limit. As an example, consider the alternative implementation of method Mul, shown in Fig. 5.2. In this implementation, we first check whether parameters a and b are equal. If this is the case, we call method Square with a as an argument. Otherwise, we use the previous implementation of Mul, described in Sect. 5.2.1. Without providing a postcondition to method Square (as shown in a comment in the code), or calling the mathematical function Square ’ instead of Square in method Mul, the verifier cannot prove the first assertion in the code; the programmer has not provided all necessary specifications. Note that postconditions in functions, like in Square ’, are typically not needed, since the function bodies give their full definition. To avoid testing correct code and specifications, Delfy exercises only those paths of the unit under test that reach an unverified property. (Un- 5.3. Complementing verification with testing 105

method Mul(a: int, b: int)
  requires 0 <= a && 0 <= b;
{
  var sum := 0;
  if (a == b) {
    sum := Square(a);
    assert sum == a * b;
  } else {
    var aa := a;
    while (0 < aa)
      invariant 0 <= aa <= a;
      invariant sum == (a - aa) * b;
    {
      sum := sum + b;
      aa := aa - 1;
    }
    assert sum == a * b;
  }
}

method Square(n: int) returns (sq: int)
  // ensures sq == n * n;
{
  sq := n * n;
}

function method Square'(n: int): int
{
  n * n
}

Figure 5.2: An alternative implementation of method Mul from Fig. 5.1 illustrating how Delfy complements the verifier. verified properties are marked by Dafny with the annotation language we presented in Ch.2.) All other paths must be correct. Consequently, Delfy in combination with the verifier generates a smaller number of tests in com- parison to Delfy alone, when the number of unverified program paths (that is, paths leading to at least one unverified property) is strictly less than the total number of paths in the unit under test, as we also showed in the experimental evaluation of Ch.2. For example, for the implementation of Mul of Fig. 5.2, Delfy need only test the path along which parameters a and b are equal, since this is the only path that reaches an unverified assertion. All other paths are proven correct by the verifier and need not be tested. 106 Chapter 5. Delfy: Dynamic test generation for Dafny

As in Ch.2, along an unverified path, all verified properties are added to the path constraint as assumptions, to avoid exploring their negation (see Alg. 1.1 of Sect. 1.1). Given that these properties are correct, the solver would most likely determine that the path-constraint prefix leading to an input-tainted, verified property, in conjunction with the negation of the property, corresponds to an infeasible program path. However, by treating all verified properties along an unverified path as assumptions, we prevent such redundant calls to the solver. Verified paths, that is, paths that reach no unverified properties, need not be tested. For this reason, we apply static symbolic execution to generate all path-constraint prefixes leading to an unverified property. We then pass this set of prefixes to Delfy, which explores only those program paths whose path constraints start with a prefix from the set. To prevent the static symbolic execution from exploring the entire unit under test in search for paths that reach unverified properties, we apply the following optimization. We first build the control-flow graph of the unit under test in a top-down manner. As soon as we have reached all the unverified (by the verifier) properties, we stop. As a result, we now have a possibly partial control-flow graph, which contains at least all unverified program paths and potentially a number of additional verified paths. For the example of Fig. 5.2, we build a control-flow graph containing only the if-branch and the first assertion in the code. As a second step, we traverse the control-flow graph in a bottom-up manner, starting from the nodes that correspond to the unverified properties, for instance, the first assertion in Mul. While traversing the graph, we precisely determine for which program paths we need the static symbolic execution to generate a path-constraint prefix. For the example of Fig. 5.2, we instruct the static symbolic execution to generate a prefix only for the path along which the condition of the if-statement holds. As a result, Delfy generates a test case exercising the successful branch of the first assertion in method Mul. The dynamic symbolic execution in Delfy then determines that the failing branch of the assertion is infeasible. We, therefore, fully exercise method Mul by generating only one test case and avoiding the input-dependent loop in the else-branch. We intend for the static symbolic execution to be a very fast intermediate step, which runs after the verifier and before the dynamic test generation tool. We, consequently, avoid making the analysis very precise at the cost of performance. In particular, when there is an input-dependent loop along an unverified path, we do not generate a path-constraint prefix for each number of loop iterations after which an unverified property is reached. We, instead, generate a prefix only until the entry of the input-dependent loop. Delfy, subsequently, explores the loop as described in Sect. 5.2.1, and dynamically filters out any test cases that do not exercise the unverified property. In comparison to the code instrumentation of Ch.2 for pruning paths 5.3. Complementing verification with testing 107 that have been statically verified, this approach, based on static symbolic execution, is more effective for (input-dependent) loop-free code. 
It reduces the search space for dynamic symbolic execution even more, as it is path precise and does not operate on an abstraction of the unit under test, which over-approximates the set of possible executions of the unit under test. For code with input-dependent loops, the instrumentation of Ch.2 could be combined with this approach for even more precise results.

When a verification attempt does not go through the Dafny verifier, the Dafny IDE indicates with a red dot the return path along which the error is reported. By clicking on a red dot, the Dafny IDE displays more information about the corresponding error. We have integrated Delfy with the Dafny IDE in the following way. If a programmer selects a red dot in a unit under test and runs Delfy, the static symbolic execution generates only those path-constraint prefixes leading to the corresponding unverified property. Delfy then generates test cases exercising only this property, regardless of whether there are other unverified properties in the unit under test.

Delfy can also use the verification counterexample that corresponds to an error emitted by the verifier, that is, to a red dot, in order to generate a failing test case. In particular, Delfy is able to generate concrete inputs to a unit under test from the counterexample that the verifier computes. However, it is possible that these concrete inputs do not lead to an assertion failure at runtime. This can happen when the verifier havocs variables that affect the values of the inputs to the unit under test before the failing assertion is reached in the control flow. As a result, the initial values of the inputs (that is, at the beginning of the unit under test) from the verification counterexample might actually not provide any indication on how to cause the assertion failure. In such situations, when the test case that is generated based on the verification counterexample does not fail, Delfy falls back to the approach described above, which is based on static symbolic execution.

5.3.1 Assigning confidence to verification errors

As described above, Delfy generates only one successful test case for the alternative implementation of method Mul (Fig. 5.2). However, a programmer cannot be certain whether Delfy achieves sound coverage of the program paths reaching the unverified property. It is, therefore, unclear when the programmer should be confident about the correctness of the code. In practice, the programmer would still need to verify the code with Dafny, which is inefficient and does not reduce the verification effort.

To address this issue, we propose an assignment of confidence to each unverified property, based on the number of program paths that reach the particular property. More specifically, the programmer may be confident that a property, which has not been verified by Dafny, is correct only when Delfy exercises all program paths reaching the property, without generating any failing test cases or reporting any sources of unsoundness in the exploration. In other words, the confidence of correctness for an unverified property is computed over the number of all program paths that exercise the property, denoted as |paths|. Moreover, a generated test case exercising one of these program paths increases the confidence by 1/|paths|, if and only if the property does not fail during execution of the test, and no unsoundness of Delfy is reported at any point during execution of the test.

Note that if an unsoundness is reported for a test case after the unverified property has been executed, the confidence of correctness for the property is not increased. This is because the unverified property could occur in the body of an input-dependent loop, or a recursive function or method. When this is the case, the test might not explore all executions of the property. We, therefore, cannot claim that the property holds along this program path.

Assigning confidence to errors emitted by the verifier can significantly reduce the time a programmer needs to spend on determining whether the code is indeed correct, when it does not go through the verifier. In addition, the verification experience is improved, since the programmer can focus on proving more interesting or complex properties, and leave the remaining mundane ones, which are assigned a high confidence score, to testing.
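The bookkeeping behind this confidence score is simple; the following C# sketch (with a hypothetical TestResult type) shows the computation described above.

using System.Collections.Generic;

class TestResult
{
    public bool PropertyFailed;        // did the unverified property fail?
    public bool UnsoundnessReported;   // was any unsoundness reported?
}

static class ConfidenceScore
{
    // totalPaths is |paths|, the number of program paths that exercise the
    // unverified property; each sound, passing test adds 1/|paths|.
    public static double Compute(int totalPaths, IEnumerable<TestResult> tests)
    {
        double confidence = 0.0;
        foreach (var test in tests)
        {
            if (!test.PropertyFailed && !test.UnsoundnessReported)
            {
                confidence += 1.0 / totalPaths;
            }
        }
        return confidence;  // 1.0 means the property is dynamically verified
    }
}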

5.4 Complementing testing with verification

When Delfy assigns full confidence to a property, the property has been soundly proven correct by dynamic test generation. Consequently, such a property may now be assumed by the verifier (using the annotations of Ch.2), which simplifies the verification task. Therefore, complementing Delfy with the verifier can also be beneficial, for one, because the verification time of Dafny is likely to be reduced. More- over, the overhead for the programmer of providing auxiliary specifications to prove a particular property might be alleviated—systematic testing could achieve sound coverage of the property instead. We also expect this combi- nation to decrease the number of programmer attempts to verify an incorrect piece of code, by providing early feedback through failing tests. From our experience with Delfy, we have observed that it can achieve additional verification results in comparison to the verifier alone in the fol- lowing cases: (1) when Dafny’s modular reasoning prevents it from reaching a proof (as in Fig. 5.2), (2) when path constraints for a unit under test are easier for the solver to reason about than a large verification condition, and (3) when an input-dependent loop can be exhaustively tested within the user-specified bound on the number of loop iterations. There might also be cases when the encoding of Delfy for sets and sequences, which is different than the corresponding axiomatization in Dafny, is simpler for the solver, but we have not yet come across such a situation. 5.5. Incorporating Delfy into the Dafny IDE 109

Figure 5.3: A smart tag allowing the user to invoke Delfy on a method under test, and a verification error emitted by the verifier (denoted by the dot in the assertion).

5.5 Incorporating Delfy into the Dafny IDE

We now present how we have incorporated Delfy into the Dafny IDE. The IDE runs both the verifier and Delfy in the background and in alternation. Given a program, the IDE starts by running Dafny or Delfy, in an attempt to prove as many properties as possible. Fig. 5.3 shows the error emitted by the verifier (denoted by the dot) for the first assertion in method Mul from Fig. 5.2. The subsequent tool, in this case Delfy, can then use the previous verification results to prove even more properties. Delfy is run automatically by the IDE in alternation with Dafny, or through the smart tag, shown in Fig. 5.3, by the user. Fig. 5.4 shows how the test cases generated by Delfy are displayed. When the automatic alternation of tools is no longer effective in achieving better results, any remaining, unverified properties are prioritized to the user for their verification. The main characteristics of this IDE integration are the following:

− Color coding of assertions. To give users a sense of where they should focus their manual efforts, the IDE uses colors for assertions. A green color for a particular assertion shows that the assertion has been au- tomatically proven correct, either by Dafny or Delfy. In Fig. 5.4, the first assertion in method Mul, which is proven by Delfy and not by Dafny, is highlighted with a green color. It is also marked with a verified-attribute; similarly to the notation of Ch.2, this attribute de- notes that the assertion has been verified under the premise true, that is, under no unsound assumptions. On the other hand, a red color for the assertion denotes that it does not hold, in other words, Dafny has emitted a verification error, and Delfy has generated a failing test case for this assertion. An orange color indicates that the assertion requires the attention of the user for 110 Chapter 5. Delfy: Dynamic test generation for Dafny

Figure 5.4: Delfy displays the generated tests and highlights the proven assertions in green.

its verification; Dafny has emitted a verification error, and Delfy has not generated any failing test, but its search was unsound.

− Caching of results. The Dafny IDE already caches verification results along with computed dependencies of what is being verified [103, 104]. Before starting a new verification task, the system first consults the cache, to determine whether the targeted verification results exist in the cache and have not been invalidated by a change in the program. We have extended this functionality of the IDE to only re-launch an exploration of a unit under test with Delfy when the code has been modified since the last exploration. The timestamp of the last explo- 5.5. Incorporating Delfy into the Dafny IDE 111

Figure 5.5: Delfy shows coverage information for a unit under test.

ration is shown above the generated test case in Fig. 5.4. In the future, we intend to enhance this feature with information about which execution paths are affected by a program change, enabling Delfy to exercise only these affected paths and not re-generate test cases for the remaining paths. As a result, the interaction between the user and the IDE will become much more effective and efficient, when Delfy is run on the unit under test. This will allow for more productive and continuous processing of programs by the user.

− Coverage information. For large units under test, we find it difficult to determine which program path is exercised by a particular test case. This is why we highlight the covered code of the unit under test, when the user selects a generated test case on mouse click. The user can also choose to see the coverage that is achieved for the unit under test by all the generated tests, as shown in Fig. 5.5.

− Generating concrete counterexamples. As we described in Sect. 5.3, Delfy allows the user to select a property that has not been verified by

Figure 5.6: Verification counterexamples emitted by the verifier for a loop-invariant violation.

Figure 5.7: Delfy generates two failing tests from the verification counterexamples of Fig. 5.6.

Dafny, and generate test cases that cover only this selected property (optionally by using the verification counterexample). Fig. 5.6 shows the verification counterexamples generated by Dafny for a very simple method with an overly weak precondition. As a consequence of this weak precondition, the loop invariant cannot be verified. Fig. 5.7 shows the failing tests that Delfy generates from these verification counterexamples. Note that the user can select to inspect all generated tests, or categorize them based on their outcome.

− Debugging of test cases. Delfy also makes it possible to debug the generated test cases with the .NET debugger, such that users can step through their execution and watch the values of variables of their choice. Fig. 5.8 shows a smart tag that allows the user to debug a verification error by running the failing test cases, generated for the particular error, in the .NET debugger. The options in the smart tag of Fig. 5.8 correspond to the failing tests shown in Fig. 5.7. Fig. 5.9 shows a debugging session for the second failing test of Fig. 5.7, during the progress of which we are watching the value of variable i.

5.6 Comparing Delfy to BVD

In this section, we compare Delfy to BVD, which refers to the Boogie Verification Debugger [92] and is already deeply integrated into the Dafny IDE.

Figure 5.8: Delfy enables debugging of a verification error by running the failing tests, generated for the particular error, in the .NET debugger.

Figure 5.9: A debugging session of a failing test, during the progress of which a variable is being watched.

Our goal is to determine in which cases a verification error is easier to understand using a dynamic test generation tool over a static verification debugger, and vice versa. To compare Delfy to BVD, we used both tools in an effort to understand all errors emitted by the Dafny verifier for all programs of the Dafny test suite, with and without seeding bugs. Here are our observations:

− Heap data structures: When a unit under test manipulates heap data structures, we found that verification errors are easier to understand with BVD. The reason is that Delfy provides only the input to the unit under test for which an error occurs, while BVD shows all step-by-step modifications to the data structures. However, Delfy makes it possible to debug the generated test cases with the .NET debugger and watch the value of any variable (as in Figs. 5.8 and 5.9), in which case Delfy becomes more usable.

− Function or method calls: When a unit under test contains function or method calls, we noticed that errors are easier to understand using Delfy. Since BVD is a modular tool, it is difficult to determine which particular call to a function or method in the unit under test causes an error to occur in the callee. With Delfy, this is much easier as the path constraint and coverage information of each generated test case indicates the executed program path.

− Loops: When the unit under test contains input-dependent loops whose number of iterations does not exceed the user-specified bound, we found Delfy to be more useful than BVD in determining the cause of an error. This is because Delfy provides the input to the unit under test for which the error occurs. For errors occurring after the loop, BVD shows the state only on loop entry, which in some cases does not even correspond to the state that will cause the error after the loop (due to havocking).

− Sets and sequences: We noticed that, when Delfy is able to generate concrete sets or sequences, so is BVD. The reason is that both tools use similar axiomatizations for sets and sequences and the same underlying constraint solver.

In general, we found Delfy and BVD to be complementary. Delfy pro- vides concrete inputs to the unit under test and shows the followed program path for each test case; however, its exploration of the unit under test might not be sound. BVD is a static, modular tool that shows the state of all relevant parts of the heap at almost all program points.

5.7 Related work

We have discussed various integrations of verification with systematic testing in Sect. 2.4 of Ch.2. Complementing verification with systematic testing is not a new idea. In contrast to Check ’n’ Crash [43], DSD-Crasher [44], and DyTa [65], which integrate unsound static checkers with dynamic test generation, our tool integration relies on a sound verifier. Consequently, these techniques might miss bugs by pruning execution paths that are unsoundly considered verified. Our work is more closely related to YOGI [121] and SANTE [26]. More specifically, given a property, YOGI searches both for a test that violates the property and an abstraction that proves the property correct. Similarly to CEGAR [34], YOGI uses error traces from the abstraction to guide the test generation, but also unsatisfiable path constraints from the test gener- ation to refine the abstraction. In comparison to YOGI, our combination of Dafny and Delfy is less tight, as the verification methodology of Dafny is not adjusted based on constraints from failed test generation attempts of Delfy. In fact, Dafny can only take advantage of prior verification results achieved by Delfy. Exactly like SANTE, our test generation also prunes execution paths that do not lead to unverified assertions. Complementing systematic testing with verification, however, is a new idea. With the exception of the second part of this dissertation, we are not aware of any test generation approach to have been pushed toward reaching verification results about functional correctness properties, like the 5.8. Summary and remarks 115 ones presented here. Since testing tools could not claim such verification results until now, it did not seem sensible to complement them by verifiers. Loop summarization in dynamic test generation has been studied be- fore [74]. SAGE [73] uses simple loop-guard pattern-matching rules to, dy- namically and on-the-fly, infer partial loop invariants for a certain class of input-dependent loops. Summaries for these loops are then derived from the inferred invariants, without any static analysis, theorem proving, or user- provided specifications. In comparison, Delfy can summarize any input- dependent loop as long as the user has written an invariant. However, the quality of the summary depends on the strength of the invariant, whereas dynamic loop summarization has been proven sound and complete when certain restrictions are met.

5.8 Summary and remarks

We have presented the integration of a sound verifier with a dynamic test generation tool and discussed its benefits. These include shorter testing and verification times, reduced annotation overhead, facilitated debugging of spurious verification errors, and increased programmer productivity. To make these benefits easily accessible to the users, we have designed an integrated development environment around this tool integration.

In the future, we plan on making Delfy available, through this IDE, to undergraduate students of formal methods courses. We also plan on making the tool available to the contributors of the Ironclad project [79], which is currently the largest project using Dafny, developed by the systems and security groups at Microsoft Research. Our goal is to carefully and empirically assess the usability of the Delfy tool and the aforementioned benefits of its combination with Dafny.

Part II

Pushing systematic testing toward verification


Chapter 6

Toward proving memory safety of the ANI Windows image parser using compositional exhaustive testing

In dynamic test generation, a path constraint is a symbolic generalization of a set of concrete executions, which represents an equivalence class of input vectors that drive the program execution along a particular path. In other words, the search space of program inputs is partitioned into equivalence classes, each of which exercises a different program path and potentially exhibits a new program behavior. Dynamic test generation amounts to program verification when the following three conditions hold: (1) there are finitely many program paths and, thus, equivalence classes, (2) all feasible paths are exercised, and (3) the constraint solver is sound and complete, such that path feasibility is decided correctly and an input vector is generated for each equivalence class. However, in the presence of loops whose number of iterations depends on a program input, there can be an explosion of paths to be exercised. In practice, dynamic test generation tools bound the number of iterations of input-dependent loops, consequently ignoring some equivalence classes of inputs and potentially missing bugs. Although bugs may be missed, dynamic test generation has proven par- ticularly successful in detecting security vulnerabilities in hundreds of pro- grams processing structured files, such as image processors, media players, file decoders, and document parsers [16]. Despite this success, systematic testing has never been pushed toward verification of such a program. In this chapter, we assess to what extent the idea of reaching verification with systematic testing is realistic. Specifically, we report how we used and en- hanced systematic dynamic test generation to get closer to proving memory safety of the ANI Windows image parser. The ANI parser is responsible for reading files in a structured graphics file format and processing their contents in order to display “ANImated” cursors and icons. Such animated icons are ubiquitous in practice (like the spinning

Such animated icons are ubiquitous in practice (like the spinning ring or hourglass on Windows), and their domain of use ranges from web pages and blogs, instant messaging and e-mails, to presentations and videos. In addition, there are many applications for creating, editing, and converting these icons to and from different file formats, such as GIF or CUR.

The ANI parser consists of thousands of lines of low-level C code spread across hundreds of functions (referring to C functions, not mathematical ones). We chose the ANI parser for this work as it is one of the smallest image parsers embedded in Windows. The implementation of the parser is within the scope of dynamic symbolic execution since it is neither concurrent nor subject to real-time constraints. Despite this, there are still significant challenges in proving memory safety of the ANI parser, including reasoning about memory dereferences and exception-handling code. Our choice was also motivated by the fact that in 2007 a critical out-of-band security patch was released for code in this parser (MS07-017), costing Microsoft and its users millions of dollars. This vulnerability was similar to an earlier one reported in 2005 (MS05-002), meaning that many details of the ANI parser have already been made public over the years [131, 83]. This parser is included in all distributions of Windows, i.e., it is used on more than a billion PCs, and has been tested for years. Given the ubiquity of animated icons, our goal was to determine whether the ANI parser is now free of security-critical buffer overflows.

We show in this chapter how systematic dynamic test generation can be applied and extended toward program verification in the context of the ANI parser. To achieve this, we alleviate the two main limitations of dynamic test generation, namely, imperfect symbolic execution and path explosion. For the former, we extended the tool SAGE [73] to improve its symbolic execution engine so that it could handle all the x86 instructions along all the explored code paths of that specific ANI parser. To deal with path explosion, we used a combination of function inlining, restricting the bounds of input-dependent loops, and function summarization. We also used a new tool, named MicroX [67], for executing code fragments in isolation using a custom virtual machine designed for testing purposes. We emphasize that the focus of our work is restricted to proving the absence of attacker-controllable memory-safety violations (as precisely defined in Sect. 6.2).

At a high level, the main contributions of this chapter are:

− We report on the first application of dynamic test generation to verify as many executions as possible of a real, complex, security-critical, entire program. Our work sheds light on the shrinking gap between systematic testing and verification in a model-checking style.

− To our knowledge, this is the first attempt to prove that an operating-system (Windows or other) image parser is free of security-critical buffer overflows, modulo the soundness of our tools and several additional assumptions.

− We are also not aware of any past attempts at program verification without using any static program analysis. All the techniques and tools used in this work are exclusively dynamic; verification is thus sought by executing the parser itself, including complicated x86 code patterns for structured exception handling and stack-guard protection, which most static analysis tools cannot handle (see Sect. 6.7 for more details).

Outline. This chapter is organized as follows. In Sect. 6.1, we recall basic principles of compositional symbolic execution. In Sect. 6.2, we precisely define memory safety, show how to verify it compositionally, and discuss how we used and extended SAGE and MicroX for verification purposes. Sect. 6.3 presents an overview of the ANI Windows image parser. In Sect. 6.4, we present our approach and verification results in detail. During the course of this work, we discovered several memory-safety violations in the ANI parser code and came across a number of unexpected challenges, which are discussed in Sects. 6.5 and 6.6, respectively. We review related work in Sect. 6.7.

6.1 Compositional symbolic execution

Systematically testing and symbolically executing all feasible program paths does not scale to large programs. Indeed, the number of feasible paths can be exponential in the program size, or even infinite in the presence of loops with an unbounded number of iterations. This path explosion can be alleviated by performing symbolic execution compositionally [66, 2].

In compositional symbolic execution, a summary φ_f for a function (or any program sub-computation) f is defined as a logic formula over constraints expressed in theory T. Summary φ_f can be generated by symbolically executing each path of function f, then generating an input precondition and output postcondition for each path, and bundling together all path summaries in a disjunction. More precisely, φ_f is defined as a disjunction of formulas φ_{w_f} of the form

φ_{w_f} = pre_{w_f} ∧ post_{w_f}

where w_f denotes an intra-procedural path in f, pre_{w_f} is a conjunction of constraints on the inputs of f, and post_{w_f} a conjunction of constraints on the outputs of f. An input to a function f is any value that can be read by f, while an output of f is any value written by f. Therefore, φ_{w_f} can be computed automatically when symbolically executing the intra-procedural path w_f: pre_{w_f} is the path constraint along path w_f but expressed in terms of the function inputs, while post_{w_f} is a conjunction of constraints, each of the form v′ = S(v), where v′ is a fresh symbolic variable created for each program variable v modified during the execution of w_f (including the return value), and where S(v) denotes the symbolic value associated with v in the program state reached at the end of w_f. At the end of the execution of w_f, the symbolic store is updated so that each such value S(v) is replaced by v′. When symbolic execution continues after the function returns, such symbolic values v′ are treated as inputs to the calling context. Summaries can be re-used across different calling contexts. For instance, given the function is_positive below,

    int is_positive(int x) {
      if (0 < x)
        return 1;
      return 0;
    }

a summary φ_f for this function can be

φ_f = (0 < x ∧ ret = 1) ∨ (x ≤ 0 ∧ ret = 0)

where ret denotes the value returned by the function. Symbolic variables are associated with function inputs (like x in the example) and function outputs (like ret in the example) in addition to whole-program inputs.

In order to generate a new test to cover a new branch b in some function, all the previously known summaries can be used to generate a formula φ_P symbolically representing all the paths discovered so far during the search. By construction [66], symbolic variables corresponding to function inputs and outputs are all bound in φ_P, and the remaining free variables correspond exclusively to whole-program inputs (since only those can be controlled for test generation). For instance, for the program P below,

    #define N 100

    void P(int s[N]) {  // N inputs
      int i, cnt = 0;
      for (i = 0; i < N; i++)
        cnt = cnt + is_positive(s[i]);
      if (cnt == 3)     // (*)
        abort();
    }

a formula φ_P to generate a test covering the then-branch (*), given the above summary φ_f for function is_positive, can be

(ret_0 + ret_1 + ... + ret_{N−1} = 3) ∧ ⋀_{0 ≤ i < N} ((0 < s[i] ∧ ret_i = 1) ∨ (s[i] ≤ 0 ∧ ret_i = 0))
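For instance, for N = 100, any input s in which exactly three entries are positive (say, s[0] = s[1] = s[2] = 1 and all other entries 0) satisfies this formula and drives P into the then-branch (*).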

6.2 Proving memory safety

In this section, we define memory safety, describe how to verify it compositionally, and explain how we used and extended SAGE and MicroX to get closer to verification.

6.2.1 Defining memory safety

To prove memory safety during systematic dynamic test generation, all memory accesses need to be checked for possible violations. Whenever a memory address a stored in a program variable v (i.e., a = M(v)) is accessed during execution, the concrete value a of the address is first checked “passively” to make sure it points to a valid memory region mr_a, that is, it does not point to an unallocated memory region (as done in standard tools like Purify [78], Valgrind [118], and AppVerifier [113]); then, if this address a was obtained by computing an expression e that depends on an input (i.e., e = S(v)), the symbolic expression e is also checked “actively” by injecting a new bounds-checking constraint of the form

0 ≤ (e − mr_a.base) < mr_a.size

in the path constraint to make sure other input values cannot trigger a buffer under- or overflow at this point of the program execution [21, 72]. How to keep track of the base address mr_a.base and size mr_a.size of each valid memory region mr_a during the program execution is discussed in work on precise symbolic pointer reasoning [54].
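The “passive” concrete check amounts to a simple range test against the memory region known to be valid at this address; the “active” check injects the analogous constraint on the symbolic expression e into the path constraint. The following is a minimal, purely illustrative sketch of the concrete test; the type and helper names are hypothetical and not taken from SAGE or MicroX.

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical representation of a valid memory region mr = [base, base + size). */
    typedef struct {
      uintptr_t base;
      size_t    size;
    } mem_region;

    /* Concrete ("passive") check: does address a fall inside region mr? */
    static int within_region(uintptr_t a, const mem_region *mr) {
      return a >= mr->base && (a - mr->base) < mr->size;
    }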

As an example, consider the following function:

    void buggy(int x) {
      char buf[10];
      buf[x] = 1;
    }

If this function is run with x = 1 as input, the concrete execution is memory safe as the memory access buf[1] is in bounds. In order to force systematic dynamic test generation to discover that this program is not memory safe, it is mandatory to inject the constraint 0 ≤ x < 10 in the current path constraint when the statement buf[x] = 1 is executed. This constraint is later negated and solved, leading to other input values for x, such as -1 or 10, with which the function will be re-tested and caught violating memory safety.

A program execution w is called attacker memory safe [69] if every memory access during w in program P, which is extended with bound checks for all memory accesses, is either within bounds, i.e., memory safe, or input independent, i.e., its address has no input-dependent symbolic value, and, hence, is not controllable by an attacker through the untrusted input interface. A program is called attacker memory safe if all its executions are attacker memory safe.

Thus, the notion of attacker memory safety is weaker than traditional memory safety: a memory-safe program execution is always attacker memory safe, while the converse does not necessarily hold. For instance, an attacker-memory-safe program might perform a flawless and sound validation of all its untrusted inputs, but might still crash. As an example, consider the following function:

    void buggy() {
      char *buf = malloc(10);
      buf[0] = 1;
    }

This function is attacker memory safe since it has no (untrusted) inputs, but is memory unsafe since the trusted system call to malloc might fail, resulting in accessing the null address.

Security testing for memory safety is primarily aimed at checking attacker memory safety since buffer overflows that cannot be controlled by the attacker are not security critical. In the rest of this chapter, we focus on attacker memory safety, but we will often refer to it simply as memory safety, for convenience.

6.2.2 Proving attacker memory safety compositionally

In order to prove memory safety compositionally, bounds-checking constraints need to be recorded inside summaries and evaluated for each calling context. Consider the following function bar:

    void bar(char *buf, int x) {
      if ((0 <= x) && (x < 10))
        buf[x] = 1;
    }

If we analyze bar in isolation without knowing the size of the input buffer buf, we cannot determine whether the buffer access buf[x] is memory safe. When we summarize function bar, we include in the precondition of the function that bar accesses the address buf + x when the following condition holds: (0 ≤ x) ∧ (x < 10). A summary for this function executed with, say, x = 3 and a non-null buffer can then be:

(0 ≤ x) ∧ (x < 10) ∧ buf ≠ NULL ∧ (0 ≤ x < mr_buf.size) ∧ (buf[x] = 1)

Later, when analyzing higher-level functions calling bar, these bounds-checking constraints can be checked because the buffer bounds will then be known. For instance, consider the following function foo that calls bar:

    void foo(int x) {
      char *buf = malloc(5);
      bar(buf, x);
    }

If, during execution, foo calls bar with its parameter x = 3, the precondition of the above path summary for bar is satisfied. The bounds-checking constraint can be simplified with mr_buf.size = 5 in this calling context, and negated to obtain the new path constraint,

(0 ≤ x) ∧ (x < 10) ∧ ¬(0 ≤ x < 5)

which after simplification is:

(0 ≤ x) ∧ (x < 10) ∧ ((x < 0) ∨ (5 ≤ x))

(Note that the constraint buf ≠ NULL in the original summary for bar is now trivially true and is, therefore, omitted here.) The above path constraint is satisfiable with, say, x = 7, and running foo and bar with this new input value will then detect a memory-safety violation in bar.

To sum up, the procedure we use for proving memory safety compositionally is as follows. We record bounds-checking constraints in the preconditions of intra-procedural path-constraint summaries. Whenever a path summary is used in a specific calling context, we check whether its precondition contains any bounds-checking constraint. If so, we check whether the size of the memory region appearing in the bounds-checking constraint is known. If this is the case, we generate a new alternate path constraint defined as the conjunction of the current path constraint and the negation of the bounds-checking constraint, where the size of the memory region is replaced by the current size. We then attempt to solve this alternate path constraint with the constraint solver, which generates a new test if the constraint is satisfiable.

For real C functions, the logic representations of their pre- and postconditions can quickly become very complex and large. We show later in this chapter that, by using summarization sparingly and at well-behaved function interfaces, these representations remain tractable. We have implemented in SAGE the compositional procedure for proving memory safety described in this section.
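SAGE’s constraint solver is the Z3 theorem prover (see Sect. 6.2.3). Purely as an illustration of the procedure above, and not as SAGE’s actual implementation, the alternate path constraint from the foo/bar example, (0 ≤ x) ∧ (x < 10) ∧ ¬(0 ≤ x < 5), could be handed to Z3 through its public C API roughly as follows; a satisfying assignment such as x = 7 then becomes the new test input.

    #include <stdio.h>
    #include <z3.h>

    int main(void) {
      Z3_config cfg = Z3_mk_config();
      Z3_context ctx = Z3_mk_context(cfg);
      Z3_del_config(cfg);

      /* Integer variable x and the constants appearing in the constraints. */
      Z3_sort int_sort = Z3_mk_int_sort(ctx);
      Z3_ast x    = Z3_mk_const(ctx, Z3_mk_string_symbol(ctx, "x"), int_sort);
      Z3_ast zero = Z3_mk_int(ctx, 0, int_sort);
      Z3_ast five = Z3_mk_int(ctx, 5, int_sort);
      Z3_ast ten  = Z3_mk_int(ctx, 10, int_sort);

      /* Path constraint of bar's summary: (0 <= x) and (x < 10). */
      Z3_ast pc_args[2] = { Z3_mk_le(ctx, zero, x), Z3_mk_lt(ctx, x, ten) };
      Z3_ast pc = Z3_mk_and(ctx, 2, pc_args);

      /* Bounds check instantiated with mr_buf.size = 5: 0 <= x < 5, then negated. */
      Z3_ast in_args[2] = { Z3_mk_le(ctx, zero, x), Z3_mk_lt(ctx, x, five) };
      Z3_ast in_bounds = Z3_mk_and(ctx, 2, in_args);

      Z3_ast alt_args[2] = { pc, Z3_mk_not(ctx, in_bounds) };
      Z3_ast alternate_pc = Z3_mk_and(ctx, 2, alt_args);

      Z3_solver s = Z3_mk_solver(ctx);
      Z3_solver_inc_ref(ctx, s);
      Z3_solver_assert(ctx, s, alternate_pc);

      if (Z3_solver_check(ctx, s) == Z3_L_TRUE) {
        /* The model assigns x a value in [5, 10), e.g., x = 7. */
        Z3_model m = Z3_solver_get_model(ctx, s);
        Z3_model_inc_ref(ctx, m);
        printf("new test input:\n%s\n", Z3_model_to_string(ctx, m));
        Z3_model_dec_ref(ctx, m);
      }

      Z3_solver_dec_ref(ctx, s);
      Z3_del_context(ctx);
      return 0;
    }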

6.2.3 Verification with SAGE and MicroX

This work was carried out using extensions of two existing tools: SAGE [73] and MicroX [67]. Recall from Sect. 1.1 that SAGE implements systematic dynamic test generation and performs dynamic symbolic execution at the x86 binary level, for detecting security vulnerabilities. In order to use SAGE to verify as many executions as possible, we turned on maximum precision for symbolic execution: all runtime checkers (for buffer over- and underflows, division by zero, etc.) were turned on as well as precise symbolic pointer reasoning [54], any x86 instruction unhandled by symbolic execution was reported, every path constraint was checked to be satisfiable before negating constraints, we checked that our constraint solver, the Z3 automated theorem prover [47], never timed out on any constraint, and we also checked the absence of any divergence, which occurs whenever a new test generated by SAGE does not follow the expected program path. When all these options are turned on and all the above checks are satisfied, symbolic execution of an individual path has perfect precision: path constraint generation and solving is sound and complete (see Sect. 1.1). Moreover, we turned off all the unsound state-space pruning techniques and heuristics implemented in SAGE to restrict path explosion, such as bounding the number of constraints generated for each program branch, and constraint subsumption, which eliminates constraints logically implied by other constraints injected at the same branch (most likely due to successive iterations of an input-dependent loop) using a cheap syntactic check [73]. How we dealt with path explosion is discussed in Sects. 6.4.2 and 6.4.3.

MicroX is a newer tool for executing code fragments in isolation, without user-provided test drivers or input data, using a custom virtual machine (VM) designed for testing purposes. Given any user-specified code location in an x86 binary, the MicroX VM starts executing the code at that location, intercepts all memory operations before they occur, allocates memory on-the-fly in order to perform these read- and write-memory operations, and provides input values according to a customizable memory policy, which defines what read-memory accesses should be treated as inputs. By default, an input is defined as any value read from an uninitialized function argument, or through a dereference of a previous input (recursively) that is used as an address. This memory policy is typically adequate for testing C functions. (Note that, under the default memory policy, values read from uninitialized global variables are not considered inputs.) No test driver or harness is required: MicroX discovers automatically and dynamically the input/output signature of the code being run. Input values are provided as needed along the execution and can be generated in various ways, e.g., randomly or using some other test generation tool, like SAGE. When used with SAGE, the very first test inputs are generated randomly; then, SAGE symbolically executes the code path taken by the given execution, generates a path constraint for that (concrete) execution, and solves new alternate path constraints that, when satisfiable, generate new input values guiding future executions along new program paths. As we describe in Sect. 6.4, we used MicroX in conjunction with SAGE with the purpose of proving memory safety of individual ANI functions in isolation.
Memory safety of a function is proven for any calling context (soundly and completely) by MicroX and SAGE if all possible function input values are considered, symbolic execution of every function path is sound and complete, all function paths can be enumerated and tested in a finite (and small enough) amount of time, and all the checks defined above are satisfied for all executions. Instead of manually writing a test driver that explicitly identifies all input parameters (and their types) for each function, MicroX provided this functionality automatically [67].

During this work, many functions were not verified at first for various reasons: we discovered and fixed several x86 instructions unhandled by SAGE’s symbolic execution engine, we also fixed several root causes of divergence (by providing custom summaries for non-deterministic-looking functions, like malloc and memcpy, whose execution paths depend on memory alignment), and we fixed a few imprecision bugs in SAGE’s code. These SAGE limitations were much more easily identified when verifying small functions in isolation with MicroX, rather than during whole-application testing. After lifting these limitations, we were able to verify that many individual ANI functions are memory safe (see Sect. 6.4.1). The remaining functions could not be verified mostly because of path explosion due to input-dependent loops (see Sect. 6.4.2) or due to too many paths in functions lower in the call graph (see Sect. 6.4.3).

6.3 The ANI Windows parser

The general format of an ANI file is shown in Fig. 6.1. It is based on the generic Resource Interchange File Format (RIFF) for storing various types of data in tagged chunks, such as video (AVI) or digital audio (WAV). RIFF has a hierarchical structure in which each chunk might contain data or a list of other chunks.

    RIFF ACON
      [ LIST INFO
          IART  ICOP  INAM ]
      anih
      [ seq ]
      [ rate ]
      LIST fram
          icon ...

Figure 6.1: The ANI file format (partial description).

Animated icons contain the following information (a sketch of a generic chunk header follows this list):

− a RIFF chunk, whose header has the identifier ACON, specifies the type of the file,

− an optional LIST chunk, whose header has the identifier INFO, contains information about the file, such as the name of the artist,

− an anih header chunk contains information about the animation, including the number of frames, i.e., Windows icons, and the number of steps, i.e., the total number of times the frames are displayed,

− an optional seq chunk defines the order in which the frames are displayed,

− an optional rate chunk determines the display rate for each frame in the sequence, and

− a LIST chunk, whose header has the identifier fram, contains a list of icons.

This file format already provides an indication of the size and complexity of the ANI parser.
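For concreteness, the layout of a generic RIFF chunk header, as described above, can be sketched as follows. This is an illustration only, assuming the standard RIFF conventions (a four-character tag followed by a 32-bit little-endian size); the type and field names are hypothetical and are not the ANI parser’s actual declarations.

    #include <stdint.h>

    /* Generic RIFF chunk header (illustrative sketch). */
    typedef struct {
      char     ck_id[4];   /* chunk identifier, e.g., "RIFF", "LIST", "anih", "rate" */
      uint32_t ck_size;    /* size in bytes of the chunk data that follows           */
      /* ck_size bytes of data follow; for RIFF and LIST chunks, the data starts
         with a four-character form type such as "ACON", "INFO", or "fram",
         followed by a list of sub-chunks. */
    } riff_chunk_header;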

The high-level call graph of the parser code is shown in Fig. 6.2. The main component of the architecture, Reading and validating file, reads and validates each chunk of an ANI file. If an extracted chunk is a LIST fram, the Reading icon guts component is invoked to read and validate the first icon in the list. In case the icon is valid, it is converted to a physical bitmap object by the Bitmap conversion component. Once this process has been repeated for all the icons in the list, the animated icon is created from their combination (ANI creation component).

    [Call-graph diagram showing the five architectural components:
      1. Reading and validating file
      2. Chunk extraction
      3. Reading icon guts
      4. Bitmap conversion
      5. ANI creation]

Figure 6.2: The high-level call graph of the ANI parser.

The ANI parser is written mostly in C, while the remaining code is written in x86 assembly. The implementation involves at least 350 functions defined in five Windows DLLs. The parsing (that is, testing in branch statements) of input bytes from an ANI file takes place in at least 110 functions defined in two DLLs, namely, in user32.dll, which is responsible for 80% of the parsing code, and in gdi32.dll, which is responsible for the remaining 20%¹. user32.dll creates and manages the user interface, such as windows, mouse events, and menus. Many functions defined in user32.dll call into gdi32.dll, which is the graphics device interface associated with drawing and handling two-dimensional objects as well as managing fonts. There are 47 functions in user32.dll that implement functionality of the parser. These functions alone compile to approximately 3,050 x86 instructions.

6.4 Approach and verification results

To try to prove memory safety of the ANI Windows image parser, we targeted the 47 functions that are defined in user32.dll and are responsible for 80% of the parsing code (see Sect. 6.3). The remaining 20% refers to at least 63 gdi32.dll functions that are called (directly or indirectly) by the 47 user32.dll functions. In addition to these user32.dll and gdi32.dll functions, the parser also exercises code in at least 240 other functions (for a total of at least 350 functions). However, none of these other functions (directly or indirectly) parses any input bytes from an ANI file, and they are therefore by definition attacker memory safe. For the purpose of this work, the gdi32.dll and all these other functions can be viewed as inlined to the user32.dll functions, which are the top-level functions of the parser. Verifying the 47 user32.dll functions while inlining all remaining sub-functions is, thus, equivalent to proving attacker memory safety of the entire ANI parser. The call graph of the 47 user32.dll functions is shown in Fig. 6.3. The functions are grouped depending on the architectural component of the parser (see Fig. 6.2) to which they belong. Note that there is no recursion in this call graph. In this section, we describe how we attempted to prove memory safety of the ANI parser using compositional exhaustive testing. Our results were obtained with 32-bit Windows 7 and are presented in three stages.

¹These percentages were obtained by comparing the number of constraints on symbolic values that were generated by SAGE for each of the two DLLs.

    [Call-graph diagram of the 47 user32.dll functions (among them LoadCursorIconFromFileMap, LoadAniIcon, ReadIconFromFileMap, ReadTag, ReadChunk, ReadFilePtr, SkipChunk, ValidateAnih, CreateAniIcon, _SetCursorIconData, ReadIconGuts, RtlGetIdFromDirectory, GetBestImage, MatchImage, ConvertDIBIcon, ConvertDIBBitmap, CopyDibHdr, ConvertPNGToDIBIcon, and StringCchPrintfW), grouped into the five architectural components of Fig. 6.2. Each function is annotated with its number of execution paths (e.g., MatchImage: 576, CreateScreenBitmap: 130) or with a + when its paths cannot be exhaustively enumerated within twelve hours.]

Figure 6.3: The call graph of the 47 user32.dll functions implementing the ANI parser core. Functions are grouped based on the architectural component of the parser to which they belong. The different shades and lines of the boxes denote the strategy we used to prove memory safety of each function. The lighter shade and dotted lines indicate functions verified with the bottom-up strategy (Stage 1), the medium shade and single solid lines, functions unsoundly verified by restricting the bounds of input-dependent loops (Stage 2), and the darker shade and double solid lines, functions (soundly or unsoundly) verified with the top-down strategy (Stage 3). Functions are annotated with the number of their execution paths. A + indicates that a function contains too many execution paths to be exhaustively enumerated within twelve hours without using any techniques for controlling path explosion.

6.4.1 Stage 1: Bottom-up strategy

For attempting to verify the ANI parser, we started with a bottom-up strategy with respect to the call graph of Fig. 6.3. We wanted to know how many functions of a real code base can be proven memory safe for any calling context by simply using exhaustive path enumeration. Our setup for this verification strategy consisted in trying to verify each user32.dll function (one at a time) starting from the bottom of the call graph. If all execution paths of the function were explored in a reasonable amount of time, i.e., less than twelve hours, and no bugs or other unsoundness-check violations were ever detected (see Sect. 6.2.3), we marked the function as memory safe. To our surprise, 34 of the 47 functions shown in Fig. 6.3 could already be proven memory safe this way, and are shown with the lighter shade and dotted lines in the figure.

Inlining

Function StringCchPrintfW of the Bitmap conversion component writes formatted data to a specified string, which is stored in a destination buffer. It takes as input arguments the destination buffer that receives the formatted string, the size of the destination buffer, the format string, and the arguments that are inserted in the format string. Exploring all execution paths of function StringCchPrintfW that may be passed a destination buffer of any length and a format string with any number of format specifiers does not complete in twelve hours, and is actually very complex. To deal with this function, we just inlined it to each of its callers.

Inlining a function means replacing the call sites of the function with the function body. In our context, inlining a function means that the function being inlined is no longer treated as an isolated unit that we attempt to verify for any (all) calling contexts. Instead, it is included in the unit defined by its caller function(s) and verified only for the specific calling context(s) defined in the caller function(s). For instance, function LoadICSLibrary, which takes no input arguments, calls function StringCchPrintfW. By inlining StringCchPrintfW to LoadICSLibrary, we can exercise the single execution path in LoadICSLibrary and prove attacker memory safety of both functions.
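The following minimal sketch illustrates the effect of inlining as used here; the functions are hypothetical and not taken from user32.dll. In isolation, callee would have to be explored for arbitrary arguments, whereas after inlining its body is explored only under the concrete calling context of caller.

    /* Hypothetical callee: many paths for arbitrary n and s. */
    static int callee(int n, const char *s) {
      int count = 0;
      for (int i = 0; i < n; i++)
        if (s[i] == '%')
          count++;
      return count;
    }

    /* Hypothetical caller with a single, fixed calling context. */
    static int caller(void) {
      return callee(3, "a%b");
    }

    /* After inlining callee into caller, the unit under test becomes: */
    static int caller_inlined(void) {
      const char *s = "a%b";
      int count = 0;
      for (int i = 0; i < 3; i++)   /* the loop bound is now concrete */
        if (s[i] == '%')
          count++;
      return count;
    }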

Verification results

With the simple bottom-up strategy of this section, we were already able to prove attacker memory safety of 34 user32.dll functions out of 47, or 72% of the top-level functions of the ANI Windows parser. So far, we had to inline only one function, namely, StringCchPrintfW to LoadICSLibrary of the Bitmap conversion component. The gdi32.dll functions (not shown in Fig. 6.3), which are called by the 47 user32.dll functions of Fig. 6.3, were also inlined (recursively) in those user32.dll functions. The boxes with the lighter shade and dotted lines of Fig. 6.3 represent the 34 functions that were verified with the bottom-up strategy. All these functions, except for those that were inlined, were verified in isolation for any calling context. This implies that all bounds for all loops (if any) in all these functions either do not depend on function inputs, or are small enough to be exhaustively explored within twelve hours. Recall that accesses to function input buffers are not yet proven memory safe at this stage of the verification process since input buffer sizes are still unknown (see Sect. 6.2.2).

6.4.2 Stage 2: Input-dependent loops

For the remaining 13 user32.dll functions of the ANI parser, path explosion is too severe, and exhaustive path enumeration does not terminate in twelve hours. Therefore, during the second stage of the process, we decided to identify and restrict the bounds of input-dependent loops that might have been preventing us from verifying functions higher in the call graph of the parser in Stage 1. This is where we gave up soundness. In our context, an input-dependent loop is a loop whose number of iterations depends on bytes read from an ANI file, i.e., whole-program inputs. In contrast, when the number of iterations of a loop inside a function depends on function inputs that are not whole-program inputs, path explosion due to that loop can be eliminated by inlining the function to its caller(s).

Restricting input-dependent loop bounds

In order to control path explosion due to input-dependent loops, we manually fixed the bounds, i.e., the number of iterations, of those loops by assigning a concrete value to the program variable(s) containing the input bound(s). We extended MicroX for the user to easily fix the value of arbitrary x86 registers or memory addresses. Of course, fixing an input value to a specific concrete value is like specifying an input precondition, and the verification of memory safety becomes restricted to calling contexts satisfying that precondition.

As an example, consider function CreateAniIcon of the ANI creation component. Function CreateAniIcon calls functions NtUserCallOneParam and NtUserDestroyCursor, which have one execution path each, as well as function SetCursorIconData, which has two execution paths, as shown in Fig. 6.3. Despite the very small number of paths in its callees, function CreateAniIcon contains too many paths to be explored in twelve hours, which is indicated by the + in Fig. 6.3. This path explosion is due to two input-dependent loops inside that function, shown in Fig. 6.4.

    for (i = 0; i < frames; i++)
      frameArrT[i] = frameArr[i];

    for (i = 0; i < steps; i++) {
      if (rateArr == NULL)
        rateArrT[i] = rate;
      else
        rateArrT[i] = rateArr[i];
      if (stepArr == NULL)
        stepArrT[i] = i;
      else
        stepArrT[i] = stepArr[i];
    }

Figure 6.4: The input-dependent loops in function CreateAniIcon (code fragment). Variables frames, steps, rateArr, and stepArr are inputs to CreateAniIcon. All arrays are allocated such that their length is greater than the corresponding loop bound, and therefore, there are no buffer overflows in this code.

The loop bounds frames, which refers to the number of frames in an animated cursor, and steps, which refers to the number of steps, are both inputs to CreateAniIcon, and so are the values of variables rateArr and stepArr. Since frames and steps are of type int (four bytes), each loop may iterate up to 2^32 times, which leads to the exploration of 2^32 possible execution paths and is intractable in practice. Consequently, to control path explosion and unsoundly verify this function, we fixed the values of frames and steps. For any fixed value of frames, the first loop of Fig. 6.4 has only one execution path, while for any fixed value of steps, the second loop always has four execution paths due to the tests on the other inputs rateArr and stepArr. Thus, by fixing these loop bounds to any value from one to 2^32, the number of execution paths in the loops of Fig. 6.4 is always four. Tab. 6.1 summarizes how the number of paths in the loops of CreateAniIcon changes when fixing frames and steps to different values. As Tab. 6.1 shows, we can soundly prove memory safety of function CreateAniIcon for any fixed number of frames and steps in an animated cursor.
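To see where the constant count of four comes from: the conditions rateArr == NULL and stepArr == NULL do not change across iterations, so once steps is fixed to any value greater than zero, the second loop contributes exactly 2 × 2 = 4 paths (one per combination of the two tests), while the first loop, whose body is branch free, contributes a single path.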

Verification results

During this stage of the process, we unsoundly proved memory safety of only one additional user32.dll function of the ANI parser, namely, of CreateAniIcon. The box in Fig. 6.3 with the medium shade and single solid line represents CreateAniIcon that was unsoundly verified in Stage 2. Tab. 6.2 presents a complete list of the input-dependent loop bounds that we fixed during this entire verification exercise on the parser.

    Input values             Number of paths
    frames       steps
    0            0           1
    1            0           1
    any fixed    0           1
    0            1           4
    0            any fixed   4
    any fixed    any fixed   4

Table 6.1: The number of paths in the loops of function CreateAniIcon (shown in Fig. 6.4) changes when fixing the input-dependent loop bounds frames and steps to different values.

As described above, to (unsoundly) verify memory safety of function CreateAniIcon of the ANI creation component (component 5 of Fig. 6.3), we had to fix two input-dependent loops using two whole-program input parameters (namely, frames and steps). In the remainder of this work (see Sect. 6.4.3), we also had to fix two other whole-program input parameters to control a few other input-dependent loops. First, in the Reading icon guts component (component 3 of Fig. 6.3), there are three other input-dependent loops, located in functions ReadIconGuts and GetBestImage. The number of iterations of all these loops depends on the number of images contained in each icon, which corresponds to two bytes per frame of an ANI file. (A single icon may consist of multiple images of different sizes and color depths.) To limit path explosion due to these three loops, we had to fix the number of images per icon of the animated cursor to a maximum of one. Second, in the Reading and validating file component (component 1 of Fig. 6.3), there are two input-dependent loops, located in functions LoadCursorIconFromFileMap and LoadAniIcon, whose number of iterations depends on the size of the input file, which we had to restrict to a maximum of 110 bytes. In summary, it is perhaps surprising that the number of input-dependent loop bounds in the entire parser is limited to a handful of input parameters read from an ANI file, for a total of around ten bytes (plus the input file size), as shown in Tab. 6.2.

    Type of loop bound             Component   Maximum loop bound
    Frames (4 bytes)               5           2^32
    Steps (4 bytes)                5           2^32
    Images/frame (2 bytes/frame)   3           1
    File size                      1           110

Table 6.2: All the input-dependent loop bounds fixed during our attempt to verify the ANI parser. For each loop bound, the table shows the corresponding number of bytes in an ANI input file, the component of the parser containing loops with this bound (numbered as in Fig. 6.3), and the maximum value of the bound that we could soundly verify in twelve hours.

6.4.3 Stage 3: Top-down strategy

For the remaining twelve user32.dll functions still to be explored in the higher-level part of the call graph of Fig. 6.3, path explosion was still too severe even after using inlining and fixing input-dependent loop bounds. Therefore, after having enforced a bound on the number of execution paths in the parser in the previous stage, we adopted a different, top-down strategy using sub-function summaries to get closer to proving memory safety, in a compositional way (see Sects. 6.1 and 6.2). This strategy was now possible as we had bounded the size of the summaries to be generated by previously fixing the number of iterations of input-dependent loops. Consequently, verification of functions with this strategy is only sound if their number of paths has not been bounded by restricting the number of iterations of an input-dependent loop lower in the call graph; otherwise it is unsound.

Summarization

As we explained earlier, summarizing sub-functions can alleviate path explosion in these sub-functions at the expense of computing re-usable logic summaries that capture function pre- and postconditions, expressed in terms of function inputs and outputs, respectively. For this trade-off to be attractive, it is therefore best to summarize sub-functions (1) that contain many execution paths, and (2) whose input/output interfaces with respect to higher-level functions are not too complex, so that the logic encoding of their summaries remains tractable. Moreover, to prove memory safety of a sub-function with respect to its input buffers, all bounds-checking constraints inside that sub-function must be included in the precondition of its summary (see Sect. 6.2.2).

Verification results

To verify as many executions as possible of the remaining user32.dll functions, we manually devised the following summarization strategy, based on the previous data about the numbers of paths in explored sub-functions (i.e., the numbers of paths in the boxes of Fig. 6.3), and by examining the input/output interfaces of the remaining functions. Specifically, we attempted to verify one by one the top-level function of each remaining component of the parser, namely, ReadIconGuts of the Reading icon guts component, ConvertDIBIcon of Bitmap conversion, and LoadCursorIconFromFileMap of Reading and validating file (since the Chunk extraction and ANI creation components had already been explored during the previous stages).

Verification of ReadIconGuts. (Reading icon guts component) We fixed the bounds of the input-dependent loops of this component to a single loop iteration (see Tab. 6.2), as discussed in Sect. 6.4.2, and summarized function MatchImage. This function only returns an integer (a “score”) that does not influence the control-flow execution of its caller GetBestImage for one loop iteration, so its visible postcondition post_f is very simple. Moreover, MatchImage takes only one buffer as input; therefore, the precondition of its summary includes only bounds-checking constraints for that buffer. In its caller GetBestImage, the size of this buffer is always constant and equal to the size of a structure, so MatchImage is attacker memory safe. Overall, when restricting the bounds of the input-dependent loops in the Reading icon guts component, summarizing MatchImage, and inlining all the other functions below it in the call graph, ReadIconGuts contains 468 execution paths that are explored by our tools in 21m 53s.

Verification of ConvertDIBIcon. (Bitmap conversion component) We soundly verified this function after summarizing sub-function CopyDibHdr, whose summarization is also tractable in practice. After summarization, ConvertDIBIcon contains 28 execution paths exercised in 1m 58s. Note that, in the Bitmap conversion component, there are no input-dependent loops and, consequently, this verification is sound; although sub-function ConvertPNGToDIBIcon has loops whose numbers of iterations depend on this function’s inputs and therefore could not be verified in isolation, inlining it to its caller ConvertDIBIcon eliminated this source of path explosion, and it was then proven to be attacker memory safe.

Verification of LoadCursorIconFromFileMap. (Reading and validating file component) This is the very top-level function of the parser and the final piece of the puzzle. Since this final step targets the verification of the entire parser, it clearly requires the use of summarization to alleviate path explosion.

Fortunately, and perhaps surprisingly, after closely examining the implementation of the ANI parser’s components (see Fig. 6.2), we realized that it is common for their output to be a single “success” or “failure” value. In case “failure” is returned, the higher-level component typically terminates.

In case “success” is returned, the parsing proceeds but without reading any other sub-component outputs and with reading other higher-level inputs (such as other bytes that follow in the input file), i.e., completely independently of the specific path taken in the sub-component being summarized. Therefore, the visible postcondition of function summaries with such interfaces is very simple: a “success” or “failure” value. This is the case for the top-level functions of the lower-level components Reading icon guts, Bitmap conversion, and ANI creation. This was not the case for the Chunk extraction component, which mainly consists of auxiliary functions, but does not significantly contribute to path explosion and was not summarized.

More specifically, for the exploration of LoadCursorIconFromFileMap, we used three summaries for the following top-level functions of architectural components of the parser:

− ReadIconGuts, which returns a pointer to a structure that is checked for nullness in its callers. Then, caller LoadCursorIconFromFileMap returns null when this pointer is null. In caller ReadIconFromFileMap, in case the pointer is non-null, it is passed as argument to function ConvertDIBIcon, which has already been verified for any calling context, as described above.

− ConvertDIBIcon, which also returns a pointer to a structure that is checked for nullness in the callers of the function. This pointer is returned by function ReadIconFromFileMap, and in case it is non-null, LoadAniIcon stores the pointer in an array that is subsequently passed as argument to CreateAniIcon, which has already been verified for a restricted calling context (due to the input-dependent loops) during the second stage of the process.

− CreateAniIcon, which also returns a pointer to a structure. If this pointer is null, the parser fails and caller LoadAniIcon emits an error:

    if (frames != 0)
      ani = CreateAniIcon(...);
    if (ani == NULL)
      EMIT_ERROR("Invalid icon");

Otherwise, the pointer is returned by LoadAniIcon and subsequently by the top-level function of the parser.

Function LoadCursorIconFromFileMap also has an input-dependent loop whose number of iterations depends on the size of the input file being read and containing the ANI file to be parsed. By summarizing the top-level function of each of the above three lower-level components and fixing the file size, we were able to unsoundly prove memory safety of the parser up to a file size of 110 bytes in less than twelve hours. Fig. 6.5 shows the number of execution paths in the parser as well as the time it takes to explore these paths when summarizing components Reading icon guts, Bitmap conversion, and ANI creation and controlling the file size.
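Schematically, and assuming for illustration that the only output visible to the caller is the returned pointer, each of these three summaries has the shape

φ_f = (pre_fail ∧ ret = NULL) ∨ (pre_success ∧ ret ≠ NULL)

which keeps the logic encoding small no matter how many paths the summarized component contains internally. (This is a simplified sketch of the interface, not the exact formula generated by SAGE.)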

    [Plot (log scale): number of execution paths and exploration time in seconds for the top-level function of the parser, plotted against the number of input bytes (10 to 110); both grow rapidly with the file size.]

Figure 6.5: The number of execution paths in the top-level function LoadCursorIconFromFileMap of the ANI parser and the time (in seconds) it takes to exercise these paths versus the number of input bytes, when summarizing components Reading icon guts, Bitmap conversion, and ANI creation.

With this top-down strategy and the careful use of function summarization, we were able to (soundly or unsoundly) prove memory safety of the remaining twelve user32.dll functions of the parser. Note that by inlining functions that were previously verified in isolation, we also proved that accesses to input buffers of these functions are memory safe. The boxes in Fig. 6.3 with the darker shade and double solid lines represent the remaining user32.dll functions that were explored during this stage of the process.

6.5 Memory-safety bugs

In reality, the verification attempt of the ANI Windows parser was slightly more complicated than presented in the previous section because the ANI parser is actually not memory safe! Specifically, we found three types of memory-safety violations during the course of this work:

− real bugs (fixed in the latest version of Windows),

− harmless bugs (off-by-one non-exploitable buffer overflows),

− code parts not memory safe by design.

We discuss each of these memory-safety violations in this section. Details are omitted for proprietary reasons. The results presented in Sect. 6.4 were actually obtained after fixing or ignoring these bugs, as explained below.

Real bugs

We found several buffer overflows all related to the same root cause. Function ReadIconGuts of the Reading icon guts component allocates memory for storing a single icon extracted from the input file and returns a pointer to this memory. The allocated memory is then cast to a structure, whose fields are read for accessing sub-parts of the icon, such as its header. However, the size of an icon, and therefore the size of the allocated memory, depends on the (untrusted) declared size of the images that make up the icon. These sizes are declared in the ANI file and might not correspond to the actual image sizes. Consequently, if the declared size of the images is too small, then the size of the allocated memory is too small, and there are buffer overflows when accessing the fields of the structure located beyond the allocated memory for the icon. These buffer overflows have been fixed in the latest version of Windows, but are believed to be hard to exploit and, hence, not security critical.
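A minimal sketch of this bug pattern is shown below. It is purely illustrative: the structure, field names, and function are hypothetical and do not reproduce the actual ReadIconGuts code.

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical icon layout: a header followed by image data. */
    typedef struct {
      uint32_t width;
      uint32_t height;
      uint32_t bit_count;
      uint8_t  data[];      /* image data follows the header */
    } icon_t;

    static icon_t *read_icon(const uint8_t *file, uint32_t declared_size) {
      /* The allocation size is taken from the (untrusted) declared size. */
      icon_t *icon = malloc(declared_size);
      if (icon == NULL)
        return NULL;
      memcpy(icon, file, declared_size);
      /* If declared_size is smaller than the header, this field read
         overflows the allocation, even though it "looks" like an
         ordinary header access. */
      if (icon->bit_count > 32) {
        free(icon);
        return NULL;
      }
      return icon;
    }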

Harmless bugs

We also found several harmless buffer overflows related to the bugs described above. For instance, function ConvertPNGToDIBIcon of the Bitmap conversion component converts an icon in PNG format to DIB (Device Independent Bitmap), and also takes as argument a pointer to the above structure for the icon. To determine whether an icon is in PNG format, ConvertPNGToDIBIcon checks whether the icon contains the eight-byte PNG signature. However, the allocated memory for the icon may be smaller than eight bytes, in which case there can be a buffer overflow. Still, on Windows, every memory allocation (call to malloc) always results in the allocation of a reserved memory block of at least eight bytes. So, technically, accessing any buffer buf of size less than eight up to buf + 7 bytes is not a buffer overflow according to the Windows runtime environment; such buffer overflows are harmless to both reliability and security.

Code parts not memory safe by design

Finally, we found memory-safety violations that were expected and caught as runtime exceptions using try-except-statements. For instance, CopyDibHdr of the Bitmap conversion component copies and converts an icon header to a common header format. The size of the memory that is allocated in CopyDibHdr for copying the icon header depends on color information defined in the header itself. This color information is read from the input file, and is therefore untrusted. Specifically, it can make the parser allocate a huge amount of memory, which is often referred to as a memory spike. Later, the actual header content is copied into this memory. To check whether the declared size matches the actual size, CopyDibHdr uses a try-statement to probe the icon header in chunks of 4K bytes, i.e., the minimum page size, and ensure that the memory is readable and properly initialized, as shown below:

    try {
      DWORD offset;
      for (offset = 0; offset < alloc; offset += 0x1000)
        *(volatile BYTE *)((LPBYTE)hdr + offset);
      *(volatile BYTE *)((LPBYTE)hdr + alloc - 1);
    } except (W32ExceptionHandler(FALSE, RIP_WARNING)) {
      return NULL;
    }

In the code above, variable alloc is the untrusted declared size, while variable hdr is a pointer to a buffer whose size is the real header size. While probing the icon header inside the try-statement, the parser may access unallocated memory beyond the bounds of the header, which is a memory-safety violation. However, this violation is expected to be caught in the except-statement, which returns null and aborts parsing in higher-level functions.
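As a concrete example of the probing arithmetic: if alloc is 0x2800 bytes, the loop dereferences offsets 0x0, 0x1000, and 0x2000, and the final dereference touches offset 0x27FF, so at least one byte on every 4K page of the allegedly allocated region is accessed before the header content is trusted.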

False alarms

Since some of the above buffer overflows were detected when testing functions in isolation, for any calling context, we then determined whether these buffer overflows are real bugs, that is, whether they also manifest themselves in the context of the entire parser.

To achieve this, we identified the position of the (untrusted) input bytes, which were to blame for the bugs, in a well-formed input file. We then changed the values of these bytes in the well-formed input file to the “buggy” values we had previously found, and ran the entire ANI parser on the modified file. We could then witness that these buffer overflows were still triggered, hence proving that there was no input validation on the modified input bytes anywhere else in the ANI parser, and that these buffer overflows were reproducible and not false alarms. Note that we found no false positives during this work: all the buffer overflows we detected were indeed due to accesses to unallocated memory. In a similar way, we were able to demonstrate that code surrounding the try-except-statement shown above could be tricked in two different ways into allocating 1MB and 1.5GB of memory, respectively, in function ReadIconGuts; in both cases though, the memory is freed before the function returns, so memory spikes are not observable.

The above buffer overflows manifested themselves as cases of divergence from the program paths expected by SAGE and MicroX. In these cases, the cause of the divergence was either a bug in our tools (that we immediately fixed) or in the ANI parser. Moreover, exceptions thrown by the code of the parser were detected when the exception handling mechanism of the operating system was triggered. If a function under test throws an unhandled exception, a potential bug has been detected depending on whether the exception is handled higher in the call graph of the parser. In such situations, we simply checked whether all calls to this function were wrapped in appropriate try-except-statements, which was always the case.

Validity of verification results

The results of Sect. 6.4 were obtained after fixing or ignoring the memory-safety bugs discussed in this section. Those results are therefore sound only with respect to these additional assumptions. For example, in the try-except-statement above, variable alloc is input dependent and the for-loop is an input-dependent loop (as defined in Sect. 6.4.2) causing severe path explosion. To avoid path explosion due to such memory-unsafe code patterns, we restricted the values that variables like alloc can take.

6.6 Challenges

While obtaining the results described in Sect. 6.4, we came across a number of unexpected challenges. For instance, exception handling can vary even between different versions of the same operating system, and our results for code that throws exceptions were interpreted differently on 32- and 64-bit Windows 7. Another example involves cases of path divergence caused by the Fault Tolerant Heap (FTH) sub-system of Windows 7. FTH is responsible for monitoring application crashes, and may autonomously apply mitigations to prevent future crashes. This caused divergence when FTH unexpectedly interfered with certain test runs but not with others. We solved this problem by entirely disabling FTH on the system. Win32 structured exception handling created further complications as it requires the execution of function prologue code that modifies the stack. These modifications can confuse MicroX when determining the inputs of a function that are passed through the stack, and special care had to be taken.

6.7 Related work

Traditional interactive program verification, using static program analysis, verification-condition generation, and theorem proving, provides a broader framework for proving more complex properties of a larger class of programs, but at the expense of more work from the user. For instance, the VCC [36] project verified the functional correctness, including memory safety and race freedom, of the Microsoft Hyper-V hypervisor [94], a piece of concurrent software (100K lines of C, 5K lines of assembly) that runs between x64 hardware and guest operating systems, and provides isolated execution environments, called partitions. This verification effort required more than 13.5K lines of source-code annotations for specifying contracts, loop invariants, and ghost state in about 350 functions, by more than ten people and over several years.

As another impressive example, the seL4 project [88] designed and verified the C code of a microkernel, using the interactive theorem prover Isabelle/HOL [120], and requiring about 200K lines of Isabelle scripts and 20 years of research in developing and automating the proofs.

Moreover, Typed Assembly Language [112] (TAL) and the Boogie program verifier [11] were used to prove type and memory safety of the Verve operating system [141], which consists of a low-level “Nucleus”, written in x86 assembly, and a higher-level kernel, written in safe C#. The exported functions of the Nucleus code (a total of 20 functions implemented in approximately 1.5K lines of x86 assembly) were verified and manually annotated with pre- and postconditions, loop invariants, and external function stubs for a total of 1,185 lines of annotations in about nine months of work.

In contrast, our verification attempt required only three months of work, no program annotations, no static program analysis, and no external function stubs, although our scope was more focused (attacker memory safety only), our application domain was different (sequential image parser versus concurrent or reactive operating-system code), and we required key manual steps, like user-guided program decomposition and summarization. Furthermore, contrary to the above verification projects, we gave up soundness by fixing a few input-dependent loop bounds, as discussed in Sect. 6.4. Note that our purely dynamic techniques and x86-based tools can handle complicated ANI x86 code patterns, such as stack-modifying, compiler-injected code for structured exception handling (SEH prologue and epilogue code for try-except-statements), and stack-guard protection, which most static analysis tools cannot handle. For example, the abstract interpreter ASTRÉE [41] does not support dynamic memory allocation.

Despite this limitation, statically proving memory safety of a program is possible. However, proving attacker memory safety, even more so compositionally, is novel: we prove that an attacker cannot trigger buffer overflows, but ignore other buffer overflows (for instance, due to the failure of trusted system calls). This requires a whole-program taint analysis to focus on what the attacker can control, performed using symbolic execution and the top-down strategy of Sect. 6.4.3.
In contrast, other approaches, like verification-condition generation, bounded model checking, abstract interpretation, or traditional static analysis, lack this global taint view and treat all program statements alike, without prioritizing the analysis toward parts on the attack surface, which hampers scalability and relevance to security. Moreover, static analysis involves reading the (source or binary) code of a program and analyzing it, e.g., by generating a logic formula representing the program or its transition relation, which is then unfolded. In contrast, dynamic symbolic execution does not know what the entire program is.

Static-analysis-based software model checkers, for instance, SLAM [8], BLAST [80], and YOGI [121], can automatically prove control-oriented API properties of specific classes of programs (specifically, device drivers). These tools rely on (predicate) abstraction in order to scale, and are not engineered to reason precisely about pointers, memory alignment, and aliasing. They were not designed and cannot be used as-is for proving (attacker) memory safety of an application as large and complex as the ANI Windows parser. In addition, some model checkers of this category, such as SLAM, are unsound [90].

SAT- and SMT-based bounded model checkers, like CBMC [35] and Corral [91], are another class of static analysis tools for automatic program verification. The program’s logic representation generated by such model checkers is similar to verification-condition generation, and captures both data and control dependencies on all program variables, which is comparable to eagerly summarizing (as in Sect. 6.1) every program block and function. Even excluding all loops, such a monolithic whole-program logic encoding would quickly become too large and complex, and consequently not scale to the entire ANI parser. Corral, however, would perform better than other tools in this category by using stratified inlining, that is, by over-approximating the summaries it computes and then refining them based on counterexample-guided abstraction refinement (CEGAR) [34]. Similarly to our approach, bounded model checking is typically unsound.

As shown in Sect. 6.4, systematic dynamic test generation also does not scale to the entire ANI parser without the selective use of function summarization and compromising soundness by fixing a few input-dependent loop bounds. These crucial steps were performed manually in our work. Algorithms and heuristics for automatic program summarization have been proposed before [66, 2, 89] as well as other closely related techniques [15, 107] and heuristics [73], which can be viewed as approximations of sub-program summarization. However, none of this prior work on automatic summarization has ever been applied toward verifying an application as large and complex as the parser considered here.

We do not know which parts of the ANI code are in the subset of C for which tools like CCured [117] or Prefix [18] are sound, or how many memory-safety checks could be removed in those parts with such a sound static analysis. However, we do know that Prefix was run on this code for years, yet bugs remained, which is precisely why dynamic symbolic execution is performed later [16].

In a different context, dynamic test generation has been applied to check and potentially prove equivalence of floating-point and Single Instruction Multiple Data (SIMD) code [37], up to a certain input size and with the use of phi node folding to statically merge paths. A similar approach was later designed to crosscheck OpenCL and C/C++ programs, up to a certain input size and number of threads [38].

6.8 Summary and remarks

For the first time, we attempted to prove attacker memory safety of an entire operating-system image parser, in only three months of work, using compositional exhaustive testing, i.e., no static analysis whatsoever. These results required a high level of automation in our tools and verification process, although key steps were performed manually, like fixing input-dependent loop bounds, guiding the summarization strategy, and avoiding memory-safety violations. Also, the scope of our work was only to prove attacker memory safety, not general memory safety or functional correctness, and the ANI parser is a purely sequential program. Finally, the verification guarantees provided by our work are valid only with respect to some important assumptions we had to make, mostly regarding input-dependent loop bounds. Overall, after this work, we are now confident that the presence of any remaining security-critical (i.e., attacker-controllable) buffer overflows in the ANI Windows parser is unlikely, but this conclusion is subject to the assumptions we made.

Here are some interesting findings that we did not expect at the beginning of this project:

− Many ANI functions are loop free and were easy to verify (Sect. 6.4.1).

− All the input-dependent loops in the entire ANI parser are controlled by the values of about ten bytes only in any ANI file plus the file size (Sect. 6.4.2).

− The remaining path explosion can be controlled by using only five function summaries with very simple interfaces (Sect. 6.4.3).

We expect that these findings are also representative of other image parsers, given that the ANI parser is organized in very common architectural components for this application domain, such as the Chunk extraction and Bitmap conversion components. Moreover, since the general format of an ANI file is based on RIFF, used for storing various types of data such as video or digital audio, we anticipate that our findings also generalize to certain media players.

In hindsight, there are several things we would now do differently. Mostly, the verification results obtained for the lower-level functions with the bottom-up strategy were often stronger than necessary for verifying the higher-level functions. Some of that work could have been avoided, although this stage was useful to familiarize ourselves with the code base, input-dependent loops, etc., and provided early, encouraging verification results.

Our work suggests future directions for automating the steps that were done manually, in particular, decomposing the program at cost-effective interfaces, and dealing with few, but critical, input-dependent loops. In the next chapter, we investigate how to automatically perform program decomposition and summarization at simple interfaces, without any user input. The path explosion caused by input-dependent loops could be addressed by providing loop invariants, which can be used to generate summaries for such loops. These invariants may be provided manually, as in Ch. 5, or automatically, for instance, with an abstract interpretation tool. In fact, an abstract interpreter could be applied to prove any remaining properties that have not been soundly verified by the dynamic test generation, such as those that have been verified under the assumption of a fixed number of iterations of an input-dependent loop. This is the opposite direction for combining verification and systematic testing from the one we explored in Ch. 2.

In Ch. 8, we outline what we have learned from each of the two parts of this dissertation, and discuss what we now expect to happen in practice.

Chapter 7

IC-Cut: A compositional search strategy for dynamic test generation

In Ch. 6, we use SAGE [73] to evaluate to what extent systematic dynamic test generation can be pushed toward program verification of the ANI Windows image parser [29]. In that chapter, we limit path explosion in the parser with user-guided program decomposition and summarization [66, 2]. In particular, we manually identify functions for summarization whose input/output interfaces with respect to higher-level functions in the call graph are not too complex, so that the logic encoding of their summaries remains simple. Indeed, we find that it is common for functions to return a single “success” or “failure” value. If “failure” is returned, the higher-level function typically terminates. If “success” is returned, parsing proceeds with new chunks of the input, that is, completely independently of the specific path taken in the function being summarized. We, therefore, decompose the program at very few interfaces, of functions that parse independent chunks of the input and return a single “success” or “failure” value.

Based on these previous insights, we now define a new compositional search strategy for automatically and dynamically discovering simple function interfaces, where large programs can be effectively decomposed. IC-Cut, short for “Interface-Complexity-based Cut”, tests the decomposed program units independently, records their test results as low-complexity function summaries (that is, summaries with simple logic encoding), and reuses these summaries when testing higher-level functions in the call graph, thus limiting overall path explosion. IC-Cut runs on-the-fly during the search to incrementally refine interface cuts as the search advances and increase its precision. In short, IC-Cut is inspired by compositional reasoning, but is only a search strategy, based on heuristics, for decomposing the program into independent units that process different chunks of the input. We, therefore, do not perform compositional verification in this work, except when certain particular restrictions are met (see Sects. 7.1.4 and 7.2).

The main contributions of this chapter are:


− We present a principled alternative to ad-hoc state-of-the-art search heuristics for alleviating path explosion.

− As our experiments show, IC-Cut preserves or even increases code coverage and bug finding in significantly less time, compared to the current generational-search strategy of SAGE.

− IC-Cut can identify which decomposed program units are exhaustively tested and, thus, dynamically verified.

Outline. This chapter is organized as follows. Sect. 7.1 explains the IC-Cut search strategy in detail. In Sect. 7.2, we present our experimental results obtained when testing the ANI Windows image parser. We review related work in Sect. 7.3.

7.1 The IC-Cut search strategy

In this section, we present the IC-Cut search algorithm, precisely define the low-complexity function summaries of IC-Cut, and discuss its correctness guarantees and limitations.

7.1.1 Algorithm

Alg. 7.1 presents the IC-Cut search strategy. IC-Cut consists of three phases, which are overlapping: learning, decomposition, and matching.

Learning

The learning phase of IC-Cut runs the program under test on a set of seed inputs. The goal is to discover as much of the call graph of the program as possible. As a result, the larger the set of seed inputs, the more detailed is the global view that IC-Cut has of the program, and the fewer new functions are discovered in the next phase. Note that by the term “learning”, we simply refer to discovering the call graph of the program, that is, we do not extrapolate based on a given training set as in classical machine-learning techniques.

On line 2 of Alg. 7.1, function CreateCallgraph returns the call graph of the program that is learned, dynamically and incrementally, by running the program on the seed inputs. Each node in the call graph represents a function of the program, and contains the function name and one seed input that steers execution of the program through this function. Each edge (f, g) in the call graph denotes that function f calls function g. Note that we assume no recursion. Handling recursion is conceptually possible [66]. In practice, it is not required for the application domain of binary image parsers. Recursion in such parsers is very rare due to obvious performance, scalability, and reliability reasons, which is why we do not address it in this work.
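As a rough sketch of the information each call-graph node carries (a hypothetical illustration with made-up field names, not the actual SAGE or IC-Cut data structures), such a node could be represented in C as follows:

struct cg_node {
  const char      *function_name;  /* the function this node represents            */
  const char      *seed_input;     /* one seed input that steers execution of the
                                      program through this function                */
  struct cg_node **callees;        /* an edge (f, g) is stored by listing g among
                                      f's callees                                   */
  unsigned         callee_count;
};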

Algorithm 7.1: The IC-Cut search algorithm.

 1 function IC-Cut(p, seeds)
 2   cg ← CreateCallgraph(p, seeds)
 3   summaries ← {}
 4   Explore(cg, p, summaries)
 5
 6 function Explore(cg, p, summaries)
 7   workQueue ← GetLeaves(cg)
 8   while IsNotEmpty(workQueue) do
 9     f ← Peek(workQueue)
10     cg′, summaries ← Process(f, p, summaries)
11     if cg == cg′ then
12       workQueue ← Dequeue(workQueue)
13       predecessors ← GetPredecessors(f, cg)
14       workQueue ← Enqueue(predecessors, workQueue)
15     else
16       newFunctions ← GetNewFunctions(cg, cg′)
17       workQueue ← AddFirst(newFunctions, workQueue)
18       cg ← cg′
19
20 function Process(f, p, summaries)
21   seed ← GetSeed(f)
22   interface, cg′ ← DSE(f, p, seed, summaries)
23   if IsSummarizable(interface) then
24     summary ← GenerateSummary(interface)
25     summaries ← PutSummary(f, summary, summaries)
26   return cg′, summaries

Decomposition

During the decomposition phase, IC-Cut tests one function at a time, that is, it explores each function using dynamic symbolic execution. IC-Cut starts testing functions at the bottom of the learned call graph, and potentially records the function test results as a low-complexity summary (that is, a summary with a simple logic encoding, as defined in Sect. 7.1.2). This is done in function Explore of Alg. 7.1, which is called on line 4 and takes as arguments the call graph cg, the program under test p, and an empty map from call-graph nodes to function summaries summaries.

In particular, IC-Cut selects a function from the bottom of the call graph that has not been previously tested. This is shown on line 7 of Alg. 7.1, in function Explore, where we create a workQueue of the call-graph leaf nodes, and on line 9, where a function f is selected from the front of the workQueue. The selected function is then tested independently (in function Process) to determine whether its interface is simple enough to be cost-effective for summarization.

To test the selected function, IC-Cut chooses an appropriate seed input, which in the previous phase has been found to steer execution of the program through this function (line 21 of Alg. 7.1). Subsequently, on line 22, IC-Cut tests the program starting with this seed input, using dynamic symbolic execution. However, while testing the program, not all symbolic constraints that IC-Cut collects may be negated; we call the constraints that may be negated open, and all others closed. (In other words, an open constraint is added to the path constraint as a branching condition Branch(c), whereas a closed constraint as an assumed condition Assume(c), as these conditions are defined in Sect. 1.1.) Specifically, the constraints that are collected until execution encounters the first call to the selected function are closed. Once the function is called, the constraints that are collected until the function returns are open. As soon as the function returns, symbolic execution terminates. This means that IC-Cut tests only the selected function and for a single calling context of the program. Note that the function is tested using a generational search.

While testing the selected function, IC-Cut dynamically determines the complexity of its interface, as defined in Sect. 7.1.2. If the function interface is simple enough to be cost-effective for summarization (line 23 of Alg. 7.1), the test results of the function are recorded as a summary. On line 24, we generate the function summary, and on line 25, we add it to the summaries map. Note that function Process describes our algorithm in a simplified way. If a function interface is found to be suitable for summarization, IC-Cut actually records the summary while testing the function. If this is not the case, IC-Cut aborts testing of this function. How summaries are generated is precisely documented in Sect. 7.1.2.

It is possible that new functions are discovered during testing of the selected function, i.e., functions that do not appear in the call graph of the learning phase. When this happens, IC-Cut updates the call graph. Of course, these new functions are placed lower in the call graph than the currently-tested function, which is their (direct or indirect) caller. IC-Cut then selects a function to test from the bottom of the updated call graph. This is shown on lines 11–18 of Alg. 7.1.
If no new functions are discovered during testing of the selected function (line 11), we remove this function from the workQueue, and add its predecessors in the call graph at the end of the workQueue (lines 12–14). When IC-Cut explores these predecessors, their callees will have already been tested. If, however, new functions are discovered (lines 15–16), we add these functions at the front of the workQueue (line 17), and update the call graph (line 18). Note that when new functions are discovered, IC-Cut aborts exploration of the currently-tested function; this is why this function is not removed from the workQueue on line 17.
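To illustrate the open and closed constraints described above, consider the following sketch (a hypothetical example, not code from the ANI parser), where check_size is the function selected for testing, both tag and size are derived from the whole-program input, the seed input reaches the call to check_size, and the magic value 0x4E41 is made up:

int check_size(unsigned size) {
  if (size > 4096)          /* collected while the selected function executes:    */
    return 0;               /* open constraints, added as Branch(c); negating     */
  return 1;                 /* them generates new tests for check_size            */
}

void parse(unsigned tag, unsigned size) {
  if (tag != 0x4E41)        /* collected before the first call to check_size:     */
    return;                 /* a closed constraint, added as Assume(tag == 0x4E41) */
                            /* and never negated                                   */
  if (!check_size(size))    /* symbolic execution terminates as soon as            */
    return;                 /* check_size returns                                  */
  /* ... */
}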

The above process highlights the importance of the set of seed inputs in the learning phase: the better the set of seed inputs is in call-graph coverage, the less time is spent on switches between the decomposition and learning phases of IC-Cut.

Matching

In general, summaries can be reused by callers to skip symbolic execution of a summarized callee and, hence, alleviate path explosion caused by inlining the callee, i.e., by re-exploring all callee paths. The matching phase decides whether a recorded summary may be reused when testing higher-level functions in the call graph. This is why function DSE of Alg. 7.1 (line 22) takes the summaries map as argument. On the whole, DSE explores (using dynamic symbolic execution) one function at a time, records its interface, and reuses previously-computed summaries.

In our context, while testing a higher-level function in the decomposition phase, the exploration might come across a call to a function for which a summary has already been computed. Note, however, that this summary has been computed for a particular calling context. Therefore, the matching phase determines whether the encountered calling context of the function matches (precisely defined in Sect. 7.1.2) the old calling context for which the summary has been computed. If this is the case, it is guaranteed that all execution paths of the function for the encountered calling context are described by the recorded summary. Consequently, the summary may be reused, since no execution paths of the function will be missed. If, on the other hand, the calling contexts do not match, the called function is tested as part of the higher-level function (that is, it is inlined to the higher-level function) as if no summary had been recorded, to avoid missing execution paths or generating false alarms. In other words, IC-Cut allows, arbitrarily and for simplicity, that a function is summarized only for a single calling context, and thus, summary reuse must be calling-context specific.

7.1.2 Function summaries

Before describing which constraints on interface complexity a function must satisfy to be summarized, we first precisely define function inputs and outputs. (These definitions differ from the ones presented in Ch. 6.)

Function inputs and outputs

− An input i_f of function f is any value that is read and tested by f. In other words, the value of i_f is not only read in f, but also affects which execution path of the function is taken at runtime.

− An input i_f of f is symbolic if it is a function of any whole-program inputs; otherwise, i_f is concrete.

− A candidate output co_f of function f is any value that is written by f.

− An output o_f of function f is any candidate output of f that is tested later in the program.

Consider program P below, which expects two non-negative inputs a and b:

int is_less(int x, int y) {
  if (x < y)
    return 1;
  return 0;
}

void P(int a, int b) {
  if (is_less(a, 0) || is_less(b, 0))
    abort();
  /* ... */
}

For both calling contexts of function is_less in program P, is_less has one symbolic input (that is, a or b), one concrete input (that is, 0), and one output (which is 0 or 1 and tested by the if-statement in P).

Generating summaries

Recall from the previous chapter that, in compositional symbolic execution [66, 2], a summary φ_f for a function f may be computed by symbolically executing all paths of function f, generating an input precondition and output postcondition for each path, and gathering all of these path summaries in a disjunction.

Precisely, φ_f is defined as a disjunction of formulas φ_wf of the form

φ_wf = pre_wf ∧ post_wf

where wf denotes an intra-procedural path in f, pre_wf is a conjunction of constraints on values read by f, and post_wf a conjunction of constraints on values written by f. For instance, a summary φ_f for function is_less is

φ_f = (x < y ∧ ret = 1) ∨ (x ≥ y ∧ ret = 0)

where ret denotes the value returned by the function. This summary may be reused across different calling contexts of is_less. In practice, however, these disjunctions of conjunctions of constraints can become very large and complex, thus making summaries expensive to compute.

For this reason, IC-Cut generates only low-complexity function summaries for specific calling contexts. For a given calling context, a function f is summarized by IC-Cut only if the following two conditions are satisfied:

− All symbolic inputs of f are unconstrained, that is, they are completely independent of the execution path taken in the program until function f is called. In particular, the symbolic inputs of f do not appear in any of the closed constraints collected before the call to f . Therefore, the input precondition of f must be true.

− Function f has at most one output o_f.

Note that the latter condition is based on our insights from Ch. 6, also described at the beginning of this chapter. In principle, any number of outputs of f could be allowed for summarization. However, as we discuss in Sect. 7.1.4, the summaries computed by IC-Cut can be unsound under certain circumstances, and the larger the number of outputs, the more execution paths might be missed by IC-Cut (see Thm. 2). If the above conditions are not satisfied, function f is inlined to its calling contexts (that is, not summarized). As an example, consider again program P. For the first calling context of function is_less in P (that is, is_less(a, 0)), the symbolic input of is_less is unconstrained, and the function has exactly one output. As a result, is_less is summarized by IC-Cut for this first calling context, as described in Sect. 7.1.1.

As a consequence of these conditions, the summaries considered in this work have a single precondition on all symbolic inputs, which is true, and a single precondition on all concrete inputs, which is of the form

⋀_{0 ≤ j < n} i_j = c_j

where the i_j are the n concrete inputs of f and the c_j their concrete values in the calling context for which the summary is computed.
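For instance (a worked instantiation of this definition using the example above), the summary that IC-Cut records for the first calling context is_less(a, 0) has the precondition true on the symbolic input x and the precondition y = 0 on the single concrete input y; its only output is the return value ret.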

Reusing summaries

While testing a higher-level function in the decomposition phase of IC-Cut, the exploration might come across a call to a function for which a summary has already been generated. Then, the matching phase determines if this summary may be reused by checking whether the new calling context of the function matches, i.e., is equally or more specific than, the old calling context for which the summary has been recorded (see Sect. 7.1.1).

− The new calling context is as specific as the old calling context only if (1) the function inputs that are symbolic and unconstrained in the old calling context are also symbolic and unconstrained in the new calling context, and (2) all other function inputs are concrete and have the same values across both calling contexts, except in the case of non-null pointers whose concrete values may differ since dynamic memory allocation is non-deterministic (see Sect. 7.1.4 for more details).

− The new calling context is more specific than the old calling context only if (1) the function inputs that are concrete in the old calling context are also concrete in the new calling context and have the same values (except in the case of non-null pointers), and (2) one or more function inputs that are symbolic and unconstrained in the old calling context are either symbolic and constrained in the new calling context or they are concrete.
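For instance (a hypothetical illustration of these definitions), if a summary had been recorded for a calling context in which both inputs x and y of is_less were symbolic and unconstrained, then the calling context is_less(a, 0), where y is now concrete, would be more specific than that old context; a context in which a were symbolic but constrained by a preceding test would likewise count as more specific.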

Recall that, in our previous example about program P, IC-Cut records a summary for the first calling context of function is_less in P. This summary is also reused in the second calling context of is_less in P (that is, is_less(b, 0)), which is as specific as the first.

After having described when a recorded summary may be reused, we now explain how this is done. When the matching phase of IC-Cut determines that a function summary matches a calling context of the function, the following two steps are performed:

1. The function is executed only concretely, and not symbolically, until it returns.

2. The function candidate outputs are associated with fresh symbolic variables.

Step (1) is performed because all execution paths of the function have already been explored when testing this function independently for an equally or more general calling context. Step (2) is used to determine whether the function has at most one output, as follows. When testing a function f for a given calling context, we can determine all values that are written by f, which we call candidate outputs. Yet, we do not know whether these candidate outputs are tested later in the program, which would make them outputs of f. Therefore, when reusing a summary of f, we associate fresh symbolic variables with all of its candidate outputs. We expect that at most one of these candidate outputs is ever tested later in the program. If this condition is not satisfied, the summary of f is invalidated. In this case, the higher-level function that reused the summary of f is tested again, but this time, f is inlined to its calling contexts instead of summarized.

When reusing the summary of function is_less in program P, we associate a symbolic variable with the function's only candidate output, its return value. This symbolic variable is tested by function P, in the condition of the if-statement, thus characterizing the return value of is_less as a function output. Given that the summary of is_less is computed to be very simple and, when reusing it, the function is executed only concretely, IC-Cut misses some of those execution paths in P that are guarded by the value of the function output (that is, either the then- or the else-branch of the if-statement in P). Overall, IC-Cut explores both paths in is_less when computing its summary, and a single path in P (instead of four) when reusing the summary. This is because function is_less is executed only concretely for both calling contexts.

Note that when the outputs of the summarized functions do not affect which path is taken later in the program, IC-Cut does not miss any execution paths. For instance, consider the following program Q:

int *Q(int a, int b) {
  static int buf[2];  /* static, so the array remains valid after Q returns it */
  buf[0] = is_less(a, 0);
  buf[1] = is_less(b, 0);
  return buf;
}

In this case, IC-Cut again explores two paths in is_less when computing its summary, and a single path in Q (instead of four) when reusing the summary, but no paths or bugs are missed.

7.1.3 Input-dependent loops

We use constraint subsumption [73] to automatically detect and control input-dependent loops. Subsumption keeps track of the constraints generated from a given branch instruction. When a new constraint c is generated, SAGE uses a fast syntactic check to determine whether c implies or is implied by a previous constraint, generated from the same instruction during execution, most likely due to successive iterations of an input-dependent loop. If this is the case, the weaker (implied) constraint is removed from the path constraint.

In combination with subsumption, which eliminates the weaker constraints generated from the same branch, we can also use constraint skipping, which never negates the remaining stronger constraints injected at this branch. When constraint subsumption and skipping are both turned on, an input-dependent loop is concretized, that is, it is explored only for a fixed number of iterations.
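As a small illustration (a hypothetical sketch, not code from the ANI parser), consider the following input-dependent loop, where n is derived from the whole-program input:

unsigned skip_padding(unsigned n) {
  unsigned i = 0;
  while (i < n) {   /* the iteration with i == k injects the constraint k < n;   */
    i++;            /* since k < n implies (k-1) < n, subsumption removes the    */
  }                 /* weaker constraints from earlier iterations and keeps only */
  return i;         /* the strongest one; with skipping also on, that remaining  */
}                   /* constraint is never negated, so the loop is explored only */
                    /* for the number of iterations taken on the concrete seed   */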

7.1.4 Correctness

We now discuss the correctness guarantees of the IC-Cut search strategy. The following theorems hold assuming symbolic execution has perfect precision, i.e., that constraint generation and solving are sound and complete for all program instructions. We define an abort-statement in a program as any statement that triggers a program error.

Theorem 1. (Completeness) Consider a program P. If IC-Cut reaches an abort, then there is some input to P that leads to an abort.

Proof sketch. The proof is immediate by the completeness of dynamic symbolic execution [70, 66]. In particular, it is required that the summaries of IC-Cut are not over-approximated, but since these summaries are computed using dynamic symbolic execution, this is guaranteed.

Theorem 2. (Soundness) Consider a program P. If IC-Cut terminates without reaching an abort, no constraints are subsumed or skipped, and the functions whose summaries are reused have no outputs and no concrete, non-null pointers as inputs, then there is no input to P that leads to an abort.

Proof sketch. The proof rests on the assumption that any potential source of unsoundness in the IC-Cut summarization strategy is conservatively detected. There are exactly two sources of unsoundness: (1) constraint subsumption and skipping for automatically detecting and controlling input-dependent loops, and (2) reusing summaries of functions that have a single output and concrete, non-null pointers as inputs.

Constraint subsumption and skipping remove or ignore non-redundant constraints from the path constraint to detect and control successive iterations of input-dependent loops. By removing or ignoring such constraints, these techniques omit paths of the program and are, thus, unsound.

When reusing the summary of a function with a single output, certain execution paths of the program might become infeasible due to the value of its output. As a result, IC-Cut might explore fewer execution paths than are feasible in practice. On the other hand, summaries of functions with no outputs are completely independent of the execution paths taken in the program. Therefore, when such summaries are reused, no paths are missed. Note that by restricting the function outputs to at most one, we set an upper bound on the number of execution paths that can be missed, that is, in comparison to reusing summaries of functions with more than one output.

When reusing the summary of a function that has concrete, non-null pointers as inputs, execution paths that are guarded by tests on the values of these pointers might be missed, for instance, when two such pointers are compared for aliasing. This is because we ignore whether the values of such inputs actually match the calling context where the summary is reused, to deal with the non-determinism of dynamic memory allocation.

The program units for which the exploration of IC-Cut is sound and does not lead to an abort are dynamically verified.
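To make the pointer-related case concrete (a hypothetical sketch, not code from the ANI parser), suppose a summarized function branches on whether its two concrete, non-null pointer inputs alias:

int same_buffer(const int *p, const int *q) {
  if (p == q)     /* this test is guarded by the concrete pointer values */
    return 1;
  return 0;
}

/* If same_buffer is summarized for a calling context in which p and q do not
 * alias, and the summary is later reused in a context where the caller passes
 * the same buffer twice, the paths guarded by p == q may be missed: matching
 * deliberately ignores the concrete values of non-null pointer inputs to cope
 * with the non-determinism of dynamic memory allocation. */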

7.1.5 Limitation: Search redundancies

It is worth emphasizing that IC-Cut may perform redundant sub-searches in two cases: (1) partial call graph, and (2) late summary mismatch, as detailed below. However, as our evaluation shows (Sect. 7.2), these limitations seem outweighed by the benefits of IC-Cut in practice.

Partial call graph

This refers to discovering functions during the decomposition phase of IC-Cut that do not appear in the call graph built in the learning phase. Whenever new functions are discovered, testing is aborted in order to update the call graph, and all test results of the function being tested are lost.

Late summary mismatch

Consider a scenario in which function foo calls function bar. At time t, bar is summarized because it is call-stack deeper than foo and the interface constraint on bar's inputs is satisfied. At time t + i, foo is explored while reusing the summary for bar, and bar's candidate outputs are associated with symbolic variables. At time t + i + j, while still exploring foo, the interface constraint on bar's outputs is violated, and thus, the summary of bar is invalidated. Consequently, testing of foo is aborted and restarted, this time by inlining bar to its calling context in foo.
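This scenario can be made concrete with a small sketch (hypothetical code, reusing the placeholder names foo and bar from the description above):

int g;                          /* a candidate output of bar (written by bar)      */

int bar(int x) {                /* assume bar was summarized at time t for a       */
  g = 2 * x;                    /* context with an unconstrained symbolic input x  */
  return x < 0;
}

void foo(int x) {
  int failed = bar(x);          /* time t + i: bar's summary is reused; fresh      */
                                /* symbolic variables are associated with the      */
                                /* candidate outputs 'failed' and 'g'              */
  if (failed)                   /* testing one candidate output is still fine      */
    return;
  if (g > 8) {                  /* time t + i + j: a second candidate output is    */
    /* ... */                   /* tested, so bar's summary is invalidated and     */
  }                             /* foo is re-explored with bar inlined             */
}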

7.2 Experimental evaluation

In this section, we present detailed experimental results obtained when testing the ANI Windows image parser, which is available on every version of Windows.

As described in the previous chapter, this parser processes structured graphics files to display “ANImated” cursors and icons, like the spinning ring or hourglass on Windows. The ANI parser is written mostly in C, while the remaining code is written in x86 assembly. It is a large benchmark consisting of thousands of lines of code spread across hundreds of functions. The implementation involves at least 350 functions defined in five Windows DLLs. The parsing of input bytes from an ANI file takes place in at least 110 functions defined in two DLLs, namely, in user32.dll, which is responsible for 80% of the parsing code, and in gdi32.dll, which is responsible for the remaining 20% [29] (see Ch. 6).

Our results show that IC-Cut alleviates path explosion in this parser while preserving or even increasing code coverage and bug finding, compared to the current generational-search strategy used in SAGE. Note that by “generational-search strategy used in SAGE”, we mean a monolithic search in the state space of the entire program.

For our experiments, we used five different configurations of IC-Cut, which we compared to the generational-search strategy that is implemented in SAGE. All configurations are shown in Tab. 7.1. For each configuration, the first column of the table shows its identifier and whether it uses IC-Cut. Note that configurations A–E use IC-Cut, while F uses the generational-search strategy of SAGE. The second column shows the maximum runtime for each configuration: configurations A–E allow for a maximum of three hours to explore each function of the parser (since the exploration is per function), while F allows for a total of 48 hours to explore the entire parser (since the exploration is whole program). The four rightmost columns of the table indicate whether the following options are turned on:

− Summarization at maximum runtime: Records a summary for the currently-tested function when the maximum runtime is exceeded if no conditions on the function's interface complexity have been violated;

− Constraint subsumption: Eliminates weaker constraints implied by stronger constraints generated from the same branch instruction (using a fast syntactic check), most likely due to successive iterations of an input-dependent loop (see Sect. 7.1.3);

− Constraint skipping: Does not negate stronger constraints that imply weaker constraints generated from the same branch instruction (see Sect. 7.1.3);

− Flip count limit: Establishes the maximum number of times that a constraint generated from a particular program instruction may be negated [73].

Note that F is the configuration of SAGE that is currently used in production.

ID   IC-Cut   Maximum       Summarization at   Constraint     Constraint   Flip count
              runtime       maximum runtime    subsumption    skipping     limit
A    X        3h/function                      X              X
B    X        3h/function                      X              X            X
C    X        3h/function   X                  X              X
D    X        3h/function                      X
E    X        3h/function   X                  X
F             48h                              X                           X

Table 7.1: All configurations used in our experiments; we used five different configurations of IC-Cut (A–E), which we compared to the generational-search strategy of SAGE (F).

Fig. 7.1 shows the instructions of the ANI parser that are covered by each configuration. We partition the covered instructions into those that are found in user32.dll and gdi32.dll (projected coverage), and those that are found in the other three DLLs (remaining coverage). Note that the instructions in user32.dll and gdi32.dll are responsible for parsing untrusted bytes and are, therefore, critical for bug finding. As shown in Fig. 7.1, configuration E, for which options “summarization at maximum runtime” and “constraint subsumption” are turned on, achieves the highest projected coverage. Configuration D, for which only “constraint subsumption” is turned on, achieves a slightly lower coverage. This suggests that summarizing when the maximum runtime is exceeded helps in guiding the search toward new program instructions; in particular, it avoids repeatedly exploring the code of the summarized functions. In contrast, configurations A–C, for which “constraint skipping” is turned on, achieve the lowest projected coverage. This indicates that testing input-dependent loops for more than just a single number of iterations is critical in achieving higher code coverage.

[Figure 7.1 is a bar chart of the covered instructions (projected and remaining coverage) per configuration A–F.]
Figure 7.1: The instructions of the ANI parser that are covered by each configuration. The projected instruction coverage is critical for bug finding.

Fig. 7.2 shows the time (in minutes) it takes for each configuration to stop exploring the ANI parser. Note that configuration B stops in the smallest amount of time (approximately 15 hours); this is because too many constraints are pruned due to options “constraint subsumption”, “constraint skipping”, and “flip count limit”, which are turned on. Configuration D achieves almost the same projected coverage as F (Fig. 7.1) in much less time, indicating that ad-hoc heuristics such as flip count limits are no longer necessary with IC-Cut. Configuration E, which achieves the highest projected coverage, stops exploring the parser in the second smallest amount of time, that is, in approximately 21.5 hours, roughly 55% faster than the generational-search strategy used in production (configuration F).

[Figure 7.2 is a bar chart of the exploration time (in minutes) per configuration A–F.]
Figure 7.2: The time it takes for each configuration to stop exploring the ANI parser.

In this amount of time, configuration E also detects the largest number of unique first-chance exceptions in the ANI parser. This is shown in Fig. 7.3, which presents how many unique exceptions are detected by each configuration. A first-chance exception is an exception (similar to an assertion violation) thrown at runtime (by the operating system) during program execution, but caught by the program using a C/C++ try/catch-mechanism (see [29]). Note that the nine exceptions found by configuration E are a superset of all other exceptions detected by the remaining configurations. In summary, configuration E detects more unique exceptions than all other configurations combined. Compared to configuration F (generational search), E finds more exceptions (Fig. 7.3) and achieves the same projected instruction coverage (Fig. 7.1) in less than half the time (Fig. 7.2). E is the most effective configuration against path explosion.

[Figure 7.3 is a bar chart of the number of unique first-chance exceptions per configuration A–F.]
Figure 7.3: The number of unique exceptions that are detected by each configuration.

Maximum     Coverage               Total time     First-chance exceptions
runtime     projected   remaining  (in minutes)   unique   duplicate
1 minute    5,421       36,250     23             0        0
90 minutes  7,896       37,183     683            8        7
3 hours     7,894       37,146     1292           9        10

Table 7.2: Performance of the winner-configuration E when the maximum runtime per function of the parser is one minute, 90 minutes, and three hours, respectively. Performance is measured in terms of covered instructions, total exploration time of the parser, and detected first-chance exceptions.

Tab. 7.2 shows how the winner-configuration E performs when the maximum runtime per function of the parser is one minute, 90 minutes, and three hours, respectively. Performance is measured in terms of covered instructions, total exploration time of the parser, and detected first-chance exceptions. As shown in the table, IC-Cut performs better than configuration F even for a maximum runtime of 90 minutes per function: there is a noticeable improvement in projected code coverage and bug finding, which is achieved in approximately eleven hours (roughly 76% faster than configuration F). This is a strong indication of how much the summarization strategy of IC-Cut can alleviate path explosion.

Fig. 7.4 shows the number of functions that are explored by the winner-configuration E when the maximum runtime per function of the parser is one minute, 90 minutes, and three hours, respectively. This figure shows only functions for which SAGE generated symbolic constraints. The functions are grouped as follows: exhaustively tested and summarized, summarized despite constraint subsumption or an exceeded runtime, not summarized because of multiple outputs or constrained symbolic inputs. The functions in the first group constitute verified program components (according to Thm. 2), highlighting a key originality of IC-Cut, namely, that it can dynamically verify sub-parts of a program during testing.

[Figure 7.4 is a bar chart of the number of explored functions per maximum runtime (1 minute, 90 minutes, 3 hours), grouped as: summarized (verified), summarized (constraint subsumption), summarized (maximum runtime exceeded), not summarized (more than one output), and not summarized (constrained symbolic inputs).]
Figure 7.4: How many functions are explored by the winner-configuration E when the maximum runtime per function of the parser is one minute, 90 minutes, and three hours, respectively. Only functions for which SAGE generated symbolic constraints are shown.

As expected, the larger the maximum runtime, the more functions are discovered, the fewer functions are summarized at maximum runtime, and the more functions are verified. Interestingly, the functions that are not summarizable because of multiple outputs or constrained symbolic inputs are identified immediately, even for a maximum runtime of one minute per function.

We also used IC-Cut to test other image parsers, namely, GIF and JPEG. Unfortunately, our prototype implementation could not handle the size of these larger parsers. However, preliminary experiments showed that our restrictions for summarization on function interfaces apply to both GIF and JPEG. For instance, when running on GIF with a timeout of three hours per function, 16 out of 140 functions (with symbolic constraints) were summarized. When running on JPEG with the same timeout, 27 out of 204 functions (with symbolic constraints) were summarized.

7.3 Related work

Automatic program decomposition for effective systematic dynamic test generation [25] is not a new idea. Moreover, compositional symbolic execution [66, 2] has already been shown to alleviate path explosion. However, when, where, and how compositionality is most effective in practice is still an open problem.

Algorithms for automatic program summarization have been proposed before [66, 2, 75]. SMART [66] tests all program functions in isolation, encodes their test results as summaries expressed using input preconditions and output postconditions, and then reuses these summaries when testing higher-level functions. Demand-driven compositional symbolic execution [2] generates partial summaries that describe only a subset of all paths in a function and can be expanded lazily. SMASH [75] computes both may and must information compositionally using both may and must summaries. IC-Cut is inspired by this compositional reasoning and summarization although it does not generate full-fledged function summaries. Instead, IC-Cut records a single precondition on all concrete function inputs without disjunctions or postconditions. In contrast to SMART, IC-Cut generates summaries only for functions with low interface complexity. Similarly to demand-driven compositional symbolic execution, our summaries are partial in that they describe a single calling context. Furthermore, when testing a function in isolation, the closed symbolic constraints that IC-Cut collects before the first call to the function are similar to the lazily-expanded dangling nodes in the demand-driven approach.

Other closely related techniques [89, 4, 15, 107] can be considered as approximations of sub-program summarization. Dynamic state merging and veritesting [89, 4] merge sub-program searches, and RWset [15] prunes searches by dynamically computing variable liveness. Information partitions [107] are used to identify “non-interfering” input chunks such that symbolically solving for each chunk while keeping all other chunks fixed to concrete values finds the same bugs as symbolically solving for the entire input. Similarly to these techniques, our work also approximates sub-program summarization. Moreover, IC-Cut is closely related to reducing test inputs using information partitions. Both techniques exploit independence between different parts of the program input. However, IC-Cut does not require that the input is initially partitioned, and avoids the overhead of dynamically computing data and control dependencies between input chunks.

Overall, our algorithm does not require any static analysis and uses very simple summaries, which are nevertheless sufficient to significantly alleviate path explosion. As a result, it is easy to implement on top of existing dynamic test generation tools. Our purely dynamic technique can also handle complicated ANI code patterns (see Ch. 6), such as stack-modifying, compiler-injected code for structured exception handling, and stack-guard protection, which most static analyses cannot handle. Furthermore, a static over-approximation of the call graph might result in testing more functions than necessary and for more calling contexts. With an over-approximation of function interfaces, we would summarize fewer functions, given the restrictions we impose on function inputs and outputs, thus fighting path explosion less effectively.
In addition to our low-complexity function summaries, SAGE implements other specialized forms of summaries, which deal with floating-point computations [69], handle input-dependent loops [74], and can be statically validated against code changes [71].

7.4 Summary and remarks

We have presented a new search strategy inspired by compositional reasoning at simple function interfaces. However, we do not perform compositional verification in this work, except when certain particular restrictions are met, as detailed in Thm. 2 (see also Sect. 7.2).

IC-Cut uses heuristics about interface complexity to discover, dynamically and incrementally, independent program units that process different chunks of the input vector. Our search strategy is complete for bug finding, while limiting path explosion in a more principled and effective manner than in the current implementation of SAGE, with its simple, yet clever, search heuristics. Indeed, compared to the generational-search strategy of SAGE, our experiments show that IC-Cut preserves code coverage and increases bug finding in significantly less exploration time.

IC-Cut generates low-complexity summaries for a single calling context of functions with unconstrained symbolic inputs and at most one output. The previous chapter on proving memory safety of the ANI Windows image parser [29] shows that such simple interfaces exist in real, complex parsers, which is why we chose the above definition. However, our definition could be relaxed to allow for more than one calling context or function output, although our experiments show that this definition is already sufficient for large improvements. We leave this for future work. We also leave for future work determining how suitable such a definition is for application domains other than that of binary image parsers.

Chapter 8

Conclusion and future work

In this dissertation, we have explored different ways of narrowing the gap between static verification and systematic testing. In particular, we have presented how to complement static analyzers or verifiers with systematic testing, and how far we can push systematic testing toward reaching verification.

Summary

To minimize the test effort while maximizing code quality, we explore how to reduce redundancies of systematic testing with prior verification. We achieve this by exploiting the partial verification results both of an unsound static analyzer and a sound verifier, to guide dynamic test generation toward program properties and executions that have not already been checked statically. This approach yields several benefits, including smaller and more targeted test suites, higher overall code coverage, shorter testing time, and higher programmer productivity.

By complementing verification with systematic testing, we identify program properties that are both difficult to check statically and lie beyond the capabilities of existing test generation tools. These properties involve not sufficiently checking object invariants as part of the oracle, although invariants are used to filter valid input data, and not taking into account the potential interference of static state with a program. We have addressed these issues by developing two novel techniques in dynamic test generation that check such properties. We, therefore, enhance systematic testing with better means of checking program correctness and, thus, of supporting verification tools.

To assess how far we can push systematic testing toward verification without using any static analysis whatsoever, we try to address the main limitation of dynamic test generation, namely, path explosion. In the context of proving memory safety of a complex binary image parser, we discover that many functions are easy to verify, using inlining, manual program decomposition, and summarization at very few, yet simple, function interfaces.

To control the remaining path explosion, however, we sacrifice soundness of our approach by bounding the number of iterations of input-dependent loops. Nonetheless, the insights from this verification exercise have proven significant in defining a new compositional search strategy in dynamic test generation, for automatically and dynamically decomposing large programs to fight path explosion. Indeed, this strategy outperforms the state-of-the-art generational search and its heuristics.

Significance of results

In the first part of this dissertation, the starting point is a static analysis. We identify and encode any verification gaps in this analysis (due to its deliberate unsoundness), generate runtime checks for these gaps, and apply automatic test generation to compensate for the gaps. In contrast, in the second part of the dissertation, we push systematic testing toward verification, but without any static analysis. Specifically, we identify the verification gaps of dynamic test generation in the application domain of binary image parsers. Therefore, this dissertation brings forward the strengths and weaknesses of both static analysis and dynamic test generation in achieving sound verification. For instance, the static analyzer Clousot does not reason soundly about arithmetic overflow, object invariants, or static initialization, which can be compensated for by dynamic test generation, as shown in Chs. 2, 3, and 4. On the other hand, dynamic test generation struggles with path explosion caused by input-dependent loops. This limitation could be addressed by providing loop invariants, either manually as in Ch. 5 through the Dafny verifier, or automatically as in Ch. 2 through the abstract interpreter Clousot.

Based on our techniques and findings, we anticipate a tighter integration of verification and systematic testing in practice. In particular, we expect the development of wizards that identify on which parts of a program a static analyzer can yield better or worse verification results in comparison to a test generation tool, apply the tools accordingly, and precisely combine their correctness guarantees. As a result, tool designers and users will know exactly which program properties remain to be validated and, therefore, the direction toward which they should focus their manual efforts. This will facilitate progress in tool development, such as the rise of novel verification techniques or search strategies for test generation, to address advanced tool limitations and smoothen the integration of existing tools, for instance, in terms of efficiency or precision.

We also expect software engineers and project managers to develop programming guidelines on how much effort should be spent on verification, depending on software quality, criticality, and budget constraints. Such guidelines will facilitate a more effective integration and use of static analyzers in industrial projects.

Our approach for complementing verification with systematic testing generalizes to a large class of assumptions made by static analyzers [30, 32]. It can, therefore, guide users of unsound analyzers in using them fruitfully, for instance, in deciding how to combine them with other analyzers or dynamic test generation tools. Moreover, it can assist designers of such analyzers in finding good trade-offs that are motivated empirically. In other words, we expect designers to perform experiments that measure the impact of their design compromises, in terms of whether these compromises can be compensated for by systematic testing. Such experiments could also derive language designs that mitigate the unsoundness of a static analyzer.

By pushing dynamic symbolic execution toward proving the absence of a certain class of errors in a particular application domain, we make the first step in concretely identifying and assessing the current limitations of systematic testing in reaching verification. We anticipate the development of static techniques that will address these limitations and automate further the verification process. We also expect dynamic test generation to be applied in even more challenging application domains and for proving the absence of more classes of errors. This will most likely reveal an increase in the gap between verification and systematic testing as presented in the second part of this dissertation, and unravel many more research problems.

Future work

As future work, we plan to improve the effectiveness of collaborative verification and systematic testing. In particular, we could precisely determine which compromises made by a static analyzer actually affect the verification of a certain assertion in the program. We currently assume that any assumption the analyzer makes before the assertion in the control flow is necessary for its verification. However, by computing the smallest set of assumptions that indeed “pollute” the sound verification of the assertion, we also reduce the number of unverified executions, through the assertion, that need to be tested. As a result, test generation would become even more targeted: test suites would become smaller, testing times shorter, and redundancies with prior static analysis would be brought down to a minimum.

Moreover, we could infer compromises made by static analyzers using dynamic test generation. As an example, imagine that an assertion in the program is proven correct by an unsound static analyzer. When checking this assertion during testing, a failing test case is generated. Of course, the failing test demonstrates that the analyzer is unsound. At this point, the path constraint for the test case could contain, or at least indicate, an unsound assumption made by the analyzer. If, under this assumption, the assertion failure is no longer feasible, then all sources of unsoundness in the analyzer have been determined, with respect to the verification of this particular assertion.

We believe that collaborative verification and testing would also be beneficial beyond sequential programs, properties that can be expressed by contract languages, and the typical compromises made by abstract interpreters and deductive verifiers. For instance, we could exploit the datarace-free guarantee [1] for concurrent programs, which states that any execution of a datarace-free program under a relaxed memory model is equivalent to some sequentially consistent execution of that program. More specifically, we could, for instance, verify concurrent C programs using the VCC verifier [36], which assumes a sequentially consistent memory model. Datarace freedom could subsequently be checked with systematic testing, to show program correctness for weak memory models. In the same spirit, our approach could be applied to unsound type systems, as in the TypeScript [111] programming language.

Furthermore, the static and dynamic information computed during collaborative verification and testing could be used to determine a global level of correctness achieved for a given program, or to perform program repairs in code or specifications.

Having verification in mind, we also plan to address the current imperfections of dynamic symbolic execution, including the generation of complex input data and the handling of floating-point numbers. The more such limitations we alleviate, the further dynamic test generation is pushed toward more advanced verification.

References

[1] S. V. Adve and M. D. Hill. Weak ordering—A new definition. In ISCA, pages 2–14. ACM, 1990.

[2] S. Anand, P. Godefroid, and N. Tillmann. Demand-driven compositional symbolic execution. In TACAS, volume 4963 of LNCS, pages 367–381. Springer, 2008.

[3] S. Artzi, A. Kiezun, J. Dolby, F. Tip, D. Dig, A. M. Paradkar, and M. D. Ernst. Finding bugs in web applications using dynamic test generation and explicit-state model checking. TSE, 36:474–494, 2010.

[4] T. Avgerinos, A. Rebert, S. K. Cha, and D. Brumley. Enhancing symbolic execution with veritesting. In ICSE, pages 1083–1094. ACM, 2014.

[5] T. Ball, B. Hackett, S. K. Lahiri, S. Qadeer, and J. Vanegue. Towards scalable modular checking of user-defined properties. In VSTTE, volume 6217 of LNCS, pages 1–24. Springer, 2010.

[6] T. Ball, R. Majumdar, T. D. Millstein, and S. K. Rajamani. Automatic predicate abstraction of C programs. In PLDI, pages 203–213. ACM, 2001.

[7] T. Ball and S. K. Rajamani. Boolean programs: A model and process for software analysis. Technical Report MSR–TR–2000–14, Microsoft Research, 2000.

[8] T. Ball and S. K. Rajamani. The SLAM toolkit. In CAV, volume 2102 of LNCS, pages 260–264. Springer, 2001.

[9] S. Balzer and T. R. Gross. Modular reasoning about invariants over shared state with interposed data members. In PLPV, pages 49–56. ACM, 2010.

[10] S. Balzer and T. R. Gross. Verifying multi-object invariants with relationships. In ECOOP, volume 6813 of LNCS, pages 359–383. Springer, 2011.


[11] M. Barnett, B.-Y. E. Chang, R. DeLine, B. Jacobs, and K. R. M. Leino. Boogie: A modular reusable verifier for object-oriented programs. In FMCO, volume 4111 of LNCS, pages 364–387. Springer, 2005.

[12] M. Barnett, M. Fähndrich, K. R. M. Leino, P. Müller, W. Schulte, and H. Venter. Specification and verification: The Spec# experience. CACM, 54:81–91, 2011.

[13] B. Beckert, R. Hähnle, and P. H. Schmitt, editors. Verification of Object-Oriented Software: The KeY Approach, volume 4334 of LNCS. Springer, 2007.

[14] D. Beyer, T. A. Henzinger, M. E. Keremoglu, and P. Wendler. Conditional model checking: A technique to pass information between verifiers. In FSE, pages 57–67. ACM, 2012.

[15] P. Boonstoppel, C. Cadar, and D. R. Engler. RWset: Attacking path explosion in constraint-based test generation. In TACAS, volume 4963 of LNCS, pages 351–366. Springer, 2008.

[16] E. Bounimova, P. Godefroid, and D. A. Molnar. Billions and billions of constraints: Whitebox fuzz testing in production. In ICSE, pages 122–131. IEEE Computer Society/ACM, 2013.

[17] C. Boyapati, S. Khurshid, and D. Marinov. Korat: Automated testing based on Java predicates. In ISSTA, pages 123–133. ACM, 2002.

[18] W. R. Bush, J. D. Pincus, and D. J. Sielaff. A static analyzer for finding dynamic programming errors. SPE, 30:775–802, 2000.

[19] C. Cadar, D. Dunbar, and D. R. Engler. KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, pages 209–224. USENIX, 2008.

[20] C. Cadar and D. R. Engler. Execution generated test cases: How to make systems code crash itself. In SPIN, volume 3639 of LNCS, pages 2–23. Springer, 2005.

[21] C. Cadar, V. Ganesh, P. M. Pawlowski, D. L. Dill, and D. R. Engler. EXE: Automatically generating inputs of death. In CCS, pages 322–335. ACM, 2006.

[22] C. Cadar, V. Ganesh, R. Sasnauskas, and K. Sen. Symbolic execution and constraint solving (Dagstuhl seminar 14442). Dagstuhl Reports, 4:98–114, 2015.

[23] C. Cadar, P. Godefroid, S. Khurshid, C. S. Păsăreanu, K. Sen, N. Tillmann, and W. Visser. Symbolic execution for software testing in practice: Preliminary assessment. In ICSE, pages 1066–1071. ACM, 2011.

[24] C. Cadar and K. Sen. Symbolic execution for software testing: Three decades later. CACM, 56:82–90, 2013.

[25] A. Chakrabarti and P. Godefroid. Software partitioning for effective automated unit testing. In EMSOFT, pages 262–271. ACM, 2006.

[26] O. Chebaro, N. Kosmatov, A. Giorgetti, and J. Julliand. The SANTE tool: Value analysis, program slicing and test generation for C program debugging. In TAP, volume 6706 of LNCS, pages 78–83. Springer, 2011.

[27] M. Christakis, P. Emmisberger, and P. Müller. Dynamic test generation with static fields and initializers. In RV, volume 8734 of LNCS, pages 269–284. Springer, 2014.

[28] M. Christakis and P. Godefroid. IC-Cut: A compositional search strategy for dynamic test generation. In SPIN, volume 9232 of LNCS, pages 1–19. Springer, 2015.

[29] M. Christakis and P. Godefroid. Proving memory safety of the ANI Windows image parser using compositional exhaustive testing. In VMCAI, volume 8931 of LNCS, pages 370–389. Springer, 2015.

[30] M. Christakis, P. Müller, and V. Wüstholz. Collaborative verification and testing with explicit assumptions. In FM, volume 7436 of LNCS, pages 132–146. Springer, 2012.

[31] M. Christakis, P. Müller, and V. Wüstholz. Synthesizing parameterized unit tests to detect object invariant violations. In SEFM, volume 8702 of LNCS, pages 65–80. Springer, 2014.

[32] M. Christakis, P. Müller, and V. Wüstholz. An experimental evaluation of deliberate unsoundness in a static program analyzer. In VMCAI, volume 8931 of LNCS, pages 333–351. Springer, 2015.

[33] M. Christakis, P. Müller, and V. Wüstholz. Guiding dynamic symbolic execution toward unverified program executions. Technical report, ETH Zurich, 2015.

[34] E. M. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith. Counterexample-guided abstraction refinement. In CAV, volume 1855 of LNCS, pages 154–169. Springer, 2000.

[35] E. M. Clarke, D. Kroening, and K. Yorav. Behavioral consistency of C and Verilog programs using bounded model checking. In DAC, pages 368–371. ACM, 2003.

[36] E. Cohen, M. Dahlweid, M. Hillebrand, D. Leinenbach, M. Moskal, T. Santen, W. Schulte, and S. Tobies. VCC: A practical system for verifying concurrent C. In TPHOLs, volume 5674 of LNCS, pages 23–42. Springer, 2009.

[37] P. Collingbourne, C. Cadar, and P. H. J. Kelly. Symbolic crosschecking of floating-point and SIMD code. In EuroSys, pages 315–328. ACM, 2011.

[38] P. Collingbourne, C. Cadar, and P. H. J. Kelly. Symbolic testing of OpenCL code. In HVC, volume 7261 of LNCS, pages 203–218. Springer, 2011.

[39] L. Correnson, P. Cuoq, F. Kirchner, V. Prevosto, A. Puccetti, J. Signoles, and B. Yakobowski. Frama-C User Manual, 2011. http://frama-c.com//support.html.

[40] P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL, pages 238–252. ACM, 1977.

[41] P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Miné, D. Monniaux, and X. Rival. The ASTRÉE analyzer. In ESOP, volume 3444 of LNCS, pages 21–30. Springer, 2005.

[42] C. Csallner and Y. Smaragdakis. JCrasher: An automatic robustness tester for Java. SPE, 34:1025–1050, 2004.

[43] C. Csallner and Y. Smaragdakis. Check ’n’ Crash: Combining static checking and testing. In ICSE, pages 422–431. ACM, 2005.

[44] C. Csallner, Y. Smaragdakis, and T. Xie. DSD-Crasher: A hybrid analysis tool for bug finding. TOSEM, 17:1–37, 2008.

[45] P. Cuoq, B. Monate, A. Pacalet, V. Prevosto, J. Regehr, B. Yakobowski, and X. Yang. Testing static analyzers with randomly generated programs. In NFM, volume 7226 of LNCS, pages 120–125. Springer, 2012.

[46] M. Czech, M.-C. Jakobs, and H. Wehrheim. Just test what you cannot verify! In FASE, volume 9033 of LNCS, pages 100–114. Springer, 2015.

[47] L. de Moura and N. Bjørner. Z3: An efficient SMT solver. In TACAS, volume 4963 of LNCS, pages 337–340. Springer, 2008.

[48] L. de Moura and N. Bjørner. Generalized, efficient array decision procedures. In FMCAD, pages 45–52. IEEE Computer Society, 2009.

[49] E. W. Dijkstra. Guarded commands, nondeterminacy and formal derivation of programs. CACM, 18:453–457, 1975.

[50] M. Dimjašević and Z. Rakamarić. JPF-Doop: Combining concolic and random testing for Java. In Java Pathfinder Workshop, 2013. Extended abstract.

[51] S. Drossopoulou, A. Francalanza, P. Müller, and A. J. Summers. A unified framework for verification techniques for object invariants. In ECOOP, volume 5142 of LNCS, pages 412–437. Springer, 2008.

[52] S. Du Toit. Working Draft, Standard for Programming Language C++, 2013. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3691.pdf.

[53] ECMA. ECMA-335: Common Language Infrastructure (CLI). ECMA, 2012.

[54] B. Elkarablieh, P. Godefroid, and M. Y. Levin. Precise pointer reasoning for dynamic test generation. In ISSTA, pages 129–140. ACM, 2009.

[55] P. Emmisberger. Bachelor’s thesis “Dynamic Test Generation with Static Fields and Initializers”, 2013. Department of Computer Science, ETH Zurich, Switzerland.

[56] P. Emmisberger. Research in computer science “Integrating Dynamic Test Generation with Sound Verification”, 2015. Department of Computer Science, ETH Zurich, Switzerland.

[57] M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao. The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program., 69:35–45, 2007.

[58] M. Fähndrich, M. Barnett, and F. Logozzo. Embedded contract languages. In SAC, pages 2103–2110. ACM, 2010.

[59] M. Fähndrich and F. Logozzo. Static contract checking with abstract interpretation. In FoVeOOS, volume 6528 of LNCS, pages 10–30. Springer, 2010.

[60] Federal Aviation Administration, Department of Transportation. Airworthiness directives; The Boeing company airplanes. Federal Register, 80:24789–24791, 2015.

[61] J. C. Filliâtre and C. Marché. The Why/Krakatoa/Caduceus platform for deductive program verification. In CAV, volume 4590 of LNCS, pages 173–177. Springer, 2007.

[62] C. Flanagan, K. R. M. Leino, M. Lillibridge, G. Nelson, J. B. Saxe, and R. Stata. Extended static checking for Java. In PLDI, pages 234–245. ACM, 2002.

[63] R. W. Floyd. Assigning meanings to programs. In Mathematical Aspects of Computer Science, volume 19 of Proceedings of Symposia in Applied Mathematics, pages 19–32. American Mathematical Society, 1967.

[64] P. Garg, F. Ivančić, G. Balakrishnan, N. Maeda, and A. Gupta. Feedback-directed unit test generation for C/C++ using concolic execution. In ICSE, pages 132–141. IEEE Computer Society/ACM, 2013.

[65] X. Ge, K. Taneja, T. Xie, and N. Tillmann. DyTa: Dynamic symbolic execution guided with static verification results. In ICSE, pages 992–994. ACM, 2011.

[66] P. Godefroid. Compositional dynamic test generation. In POPL, pages 47–54. ACM, 2007.

[67] P. Godefroid. Micro execution. In ICSE, pages 539–549. ACM, 2014.

[68] P. Godefroid, J. de Halleux, A. V. Nori, S. K. Rajamani, W. Schulte, N. Tillmann, and M. Y. Levin. Automating software testing using program analysis. IEEE Software, 25:30–37, 2008.

[69] P. Godefroid and J. Kinder. Proving memory safety of floating-point computations by combining static and dynamic program analysis. In ISSTA, pages 1–12. ACM, 2010.

[70] P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In PLDI, pages 213–223. ACM, 2005.

[71] P. Godefroid, S. K. Lahiri, and C. Rubio-González. Statically validating must summaries for incremental compositional dynamic test generation. In SAS, volume 6887 of LNCS, pages 112–128. Springer, 2011.

[72] P. Godefroid, M. Y. Levin, and D. A. Molnar. Active property checking. In EMSOFT, pages 207–216. ACM, 2008.

[73] P. Godefroid, M. Y. Levin, and D. A. Molnar. Automated whitebox fuzz testing. In NDSS, pages 151–166. The Internet Society, 2008.

[74] P. Godefroid and D. Luchaup. Automatic partial loop summarization in dynamic test generation. In ISSTA, pages 23–33. ACM, 2011.

[75] P. Godefroid, A. V. Nori, S. K. Rajamani, and S. Tetali. Compositional may-must program analysis: Unleashing the power of alternation. In POPL, pages 43–56. ACM, 2010.

[76] J. Gosling, B. Joy, G. Steele, G. Bracha, and A. Buckley. The Java Language Specification. Oracle, Java SE 7 edition, 2012.

[77] S. Graf and H. Saïdi. Construction of abstract state graphs with PVS. In CAV, volume 1254 of LNCS, pages 72–83. Springer, 1997.

[78] R. Hastings and B. Joyce. Purify: Fast detection of memory leaks and access errors. In USENIX, pages 125–138. USENIX, 1992.

[79] C. Hawblitzel, J. Howell, J. R. Lorch, A. Narayan, B. Parno, D. Zhang, and B. Zill. Ironclad apps: End-to-end security via automated full-system verification. In OSDI, pages 165–181. USENIX, 2014.

[80] T. A. Henzinger, R. Jhala, R. Majumdar, and G. Sutre. Lazy abstraction. In POPL, pages 58–70. ACM, 2002.

[81] C. A. R. Hoare. An axiomatic basis for computer programming. CACM, 12:576–580, 1969.

[82] G. J. Holzmann. Mars code. CACM, 57:64–73, 2014.

[83] M. Howard. Lessons learned from the animated cursor security bug, 2007. http://blogs.msdn.com/b/sdl/archive/2007/04/26/lessons-learned-from-the-animated-cursor-security-bug.aspx.

[84] K. Inkumsah and T. Xie. Evacon: A framework for integrating evolutionary and concolic testing for object-oriented programs. In ASE, pages 425–428. ACM, 2007.

[85] S. S. Ishtiaq and P. W. O’Hearn. BI as an assertion language for mutable data structures. In POPL, pages 14–26. ACM, 2001.

[86] U. Juhasz, I. T. Kassios, P. Müller, M. Novacek, M. Schwerhoff, and A. J. Summers. Viper: A verification infrastructure for permission-based reasoning. Technical report, ETH Zurich, 2014.

[87] J. C. King. Symbolic execution and program testing. CACM, 19(7):385–394, 1976.

[88] G. Klein, K. Elphinstone, G. Heiser, J. Andronick, D. Cock, P. Derrin, D. Elkaduwe, K. Engelhardt, R. Kolanski, M. Norrish, T. Sewell, H. Tuch, and S. Winwood. seL4: Formal verification of an OS kernel. In SOSP, pages 207–220. ACM, 2009.

[89] V. Kuznetsov, J. Kinder, S. Bucur, and G. Candea. Efficient state merging in symbolic execution. In PLDI, pages 193–204. ACM, 2012.

[90] A. Lal and S. Qadeer. Powering the static driver verifier using Corral. In FSE, pages 202–212. ACM, 2014.

[91] A. Lal, S. Qadeer, and S. K. Lahiri. A solver for reachability modulo theories. In CAV, volume 7358 of LNCS, pages 427–443. Springer, 2012.

[92] C. Le Goues, K. R. M. Leino, and M. Moskal. The Boogie verification debugger. In SEFM, volume 7041 of LNCS, pages 407–414. Springer, 2011.

[93] G. T. Leavens, E. Poll, C. Clifton, Y. Cheon, C. Ruby, D. Cok, P. Müller, J. Kiniry, P. Chalin, D. M. Zimmerman, and W. Dietl. JML Reference Manual, 2011. http://www.jmlspecs.org/.

[94] D. Leinenbach and T. Santen. Verifying the Microsoft Hyper-V hypervisor with VCC. In FM, volume 5850 of LNCS, pages 806–809. Springer, 2009.

[95] K. R. M. Leino. Efficient weakest preconditions. IPL, 93:281–288, 2005.

[96] K. R. M. Leino. Specification and verification of object-oriented software. Lecture notes of Marktoberdorf International Summer School, 2008.

[97] K. R. M. Leino. This is Boogie 2, 2008.

[98] K. R. M. Leino. Dafny: An automatic program verifier for functional correctness. In LPAR, volume 6355 of LNCS, pages 348–370. Springer, 2010.

[99] K. R. M. Leino and P. Müller. Object invariants in dynamic contexts. In ECOOP, volume 3086 of LNCS, pages 491–515. Springer, 2004.

[100] K. R. M. Leino and P. Müller. Modular verification of static class invariants. In FM, volume 3582 of LNCS, pages 26–42. Springer, 2005.

[101] K. R. M. Leino and P. Müller. Using the Spec# language, methodology, and tools to write bug-free programs. In Advanced Lectures on Software Engineering, volume 6029 of LNCS, pages 91–139. Springer, 2010.

[102] K. R. M. Leino, P. Müller, and J. Smans. Verification of concurrent programs with Chalice. In FOSAD, volume 5705 of LNCS, pages 195–222. Springer, 2009.

[103] K. R. M. Leino and V. Wüstholz. The Dafny integrated development environment. In Formal-IDE, volume 149 of Electronic Proceedings in Theoretical Computer Science, pages 3–15. Open Publishing Association, 2014.

[104] K. R. M. Leino and V. Wüstholz. Fine-grained caching of verification results. In CAV, volume 9206 of LNCS, pages 380–397. Springer, 2015.

[105] Y. Li, Z. Su, L. Wang, and X. Li. Steering symbolic execution to less traveled paths. In OOPSLA, pages 19–32. ACM, 2013.

[106] B. Livshits, M. Sridharan, Y. Smaragdakis, O. Lhoták, J. N. Amaral, B.-Y. E. Chang, S. Z. Guyer, U. P. Khedker, A. Møller, and D. Vardoulakis. In defense of soundiness: A manifesto. CACM, 58:44–46, 2015.

[107] R. Majumdar and R.-G. Xu. Reducing test inputs using information partitions. In CAV, volume 5643 of LNCS, pages 555–569. Springer, 2009.

[108] B. Meyer. Eiffel: The Language. Prentice-Hall, 1992.

[109] B. Meyer. Object-Oriented Software Construction, 2nd Edition. Prentice-Hall, 1997.

[110] B. Meyer, A. Fiva, I. Ciupa, A. Leitner, Y. Wei, and E. Stapf. Programs that test themselves. IEEE Computer, 42(9):46–55, 2009.

[111] Microsoft Corporation. TypeScript Language Specification, Version 1.4, 2014.

[112] J. G. Morrisett, D. Walker, K. Crary, and N. Glew. From system F to typed assembly language. In POPL, pages 85–97. ACM, 1998.

[113] MSDN. Application verifier. https://msdn.microsoft.com/en-us/library/windows/hardware/ff538115%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396.

[114] P. Müller. Modular Specification and Verification of Object-Oriented Programs, volume 2262 of LNCS. Springer, 2002.

[115] P. Müller. Course on “Software Architecture and Engineering”, spring semester, 2012. Department of Computer Science, ETH Zurich, Switzerland.

[116] P. Müller, A. Poetzsch-Heffter, and G. T. Leavens. Modular invariants for layered object structures. Sci. Comput. Program., 62:253–286, 2006.

[117] G. C. Necula, S. McPeak, and W. Weimer. CCured: Type-safe retrofitting of legacy code. In POPL, pages 128–139. ACM, 2002.

[118] N. Nethercote and J. Seward. Valgrind: A framework for heavyweight dynamic binary instrumentation. In PLDI, pages 89–100. ACM, 2007.

[119] H. H. Nguyen, V. Kuncak, and W.-N. Chin. Runtime checking for separation logic. In VMCAI, volume 4905 of LNCS, pages 203–217. Springer, 2008.

[120] T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL—A Proof Assistant for Higher-Order Logic, volume 2283 of LNCS. Springer, 2002.

[121] A. V. Nori, S. K. Rajamani, S. Tetali, and A. V. Thakur. The YOGI project: Software property checking via static analysis and testing. In TACAS, volume 5505 of LNCS, pages 178–181. Springer, 2009.

[122] C. Pacheco, S. K. Lahiri, M. D. Ernst, and T. Ball. Feedback-directed random test generation. In ICSE, pages 75–84. IEEE Computer Society, 2007.

[123] A. Poetzsch-Heffter. Specification and verification of object-oriented programs. Habilitation thesis, Technical University of Munich, 1997.

[124] N. Polikarpova, I. Ciupa, and B. Meyer. A comparative study of programmer-written and automatically inferred contracts. In ISSTA, pages 93–104. ACM, 2009.

[125] C. S. Păsăreanu, P. C. Mehlitz, D. H. Bushnell, K. Gundy-Burlet, M. Lowry, S. Person, and M. Pape. Combining unit-level symbolic execution and system-level concrete execution for testing NASA software. In ISSTA, pages 15–26. ACM, 2008.

[126] J. C. Reynolds. Separation logic: A logic for shared mutable data structures. In LICS, pages 55–74. IEEE Computer Society, 2002.

[127] K. Sen and G. Agha. CUTE and jCUTE: Concolic unit testing and explicit path model-checking tools. In CAV, volume 4144 of LNCS, pages 419–423. Springer, 2006.

[128] K. Sen, D. Marinov, and G. Agha. CUTE: A concolic unit testing engine for C. In ESEC, pages 263–272. ACM, 2005.

[129] K. Sen, H. Tanno, X. Zhang, and T. Hoshino. GuideSE: Annotations for guiding concolic testing. In AST, pages 23–27. IEEE Computer Society, 2015.

[130] D. X. Song, D. Brumley, H. Yin, J. Caballero, I. Jager, M. G. Kang, Z. Liang, J. Newsome, P. Poosankam, and P. Saxena. BitBlaze: A new approach to computer security via binary analysis. In ICISS, volume 5352 of LNCS, pages 1–25. Springer, 2008.

[131] A. Sotirov. Windows animated cursor stack overflow vulnerability, 2007. http://www.offensive-security.com/os101/ani.htm.

[132] P. Spettel. Master’s thesis “Delfy: Dynamic Test Generation for Dafny”, 2013. Department of Computer Science, ETH Zurich, Switzerland.

[133] S. Thummalapenta, T. Xie, N. Tillmann, J. de Halleux, and W. Schulte. MSeqGen: Object-oriented unit-test generation via mining source code. In ESEC/SIGSOFT FSE, pages 193–202. ACM, 2009.

[134] S. Thummalapenta, T. Xie, N. Tillmann, J. de Halleux, and Z. Su. Synthesizing method sequences for high-coverage testing. In OOPSLA, pages 189–206. ACM, 2011.

[135] N. Tillmann and J. de Halleux. Pex—White box test generation for .NET. In TAP, volume 4966 of LNCS, pages 134–153. Springer, 2008.

[136] N. Tillmann and W. Schulte. Parameterized unit tests. In ESEC/SIGSOFT FSE, pages 119–128. ACM, 2005.

[137] P. Tonella. Evolutionary testing of classes. In ISSTA, pages 119–128. ACM, 2004.

[138] W. Visser, C. S. Păsăreanu, and S. Khurshid. Test input generation with Java PathFinder. In ISSTA, pages 97–107. ACM, 2004.

[139] X. Xiao, T. Xie, N. Tillmann, and J. de Halleux. Covana: Precise identification of problems in Pex. In ICSE, pages 1004–1006. ACM, 2011.

[140] T. Xie, D. Marinov, W. Schulte, and D. Notkin. Symstra: A framework for generating object-oriented unit tests using symbolic execution. In TACAS, volume 3440 of LNCS, pages 365–381. Springer, 2005.

[141] J. Yang and C. Hawblitzel. Safe to the last instruction: Automated verification of a type-safe operating system. In PLDI, pages 99–110. ACM, 2010.

[142] S. Zhang, D. Jalali, J. Wuttke, K. Muslu, W. Lam, M. D. Ernst, and D. Notkin. Empirically revisiting the test independence assumption. In ISSTA, pages 385–396. ACM, 2014.

[143] S. Zhang, D. Saff, Y. Bu, and M. D. Ernst. Combined static and dynamic automated test generation. In ISSTA, pages 353–363. ACM, 2011.

Appendix A

Methods under test

For the experiments of Chapter 2, we used 101 methods (written in C#) from solutions to 13 programming tasks on the Rosetta Code repository and from nine open-source projects. The complete list of methods used in that evaluation is given below:

− Rosetta Code repository (tasks downloaded at 10:00 on October 23, 2014)

1. http://rosettacode.org/wiki/Ackermann_function#C.23
   – Program.Ackermann
2. http://rosettacode.org/wiki/Binary_search#C.23 (recursive version)
   – Program.binarySearch
3. http://rosettacode.org/wiki/Bulls_and_cows#C.23
   – Program.KnuthShuffle
   – Program.game
4. http://rosettacode.org/wiki/Ethiopian_multiplication#C.23
   – EthiopianMultiplication Task.EM Linq
   – EthiopianMultiplication Task.EM Loop
   – EthiopianMultiplication Task.EM Recursion
5. http://rosettacode.org/wiki/Fibonacci_n-step_number_sequences#C.23
   – Program.GetFibLikeSequence
   – Program.GetLucasNumbers
   – Program.GetNacciSeed
   – Program.GetNnacciNumbers
6. http://rosettacode.org/wiki/Forest_fire#C.23
   – Program.InitializeForestFire
   – Program.IsNeighbor
   – Program.StepForestFire
7. http://rosettacode.org/wiki/Greatest_common_divisor#C.23
   – Program.gcd
8. http://rosettacode.org/wiki/N-queens_problem#C.23
   – Program.Allowed
9. http://rosettacode.org/wiki/Number_reversal_game#C.23 (version for C# 1.0)
   – NumberReversalGame.RandomPermutation
   – NumberReversalGame.check
10. http://rosettacode.org/wiki/Primality_by_trial_division#C.23
    – Program.isPrime
11. http://rosettacode.org/wiki/Pythagorean_triples#C.23
    – Program.Count New Triangle
12. http://rosettacode.org/wiki/Reduced_row_echelon_form#C.23
    – Program.rref
13. http://rosettacode.org/wiki/Rock-paper-scissors
    – RPSGame.GetWinner

− Open-source projects

1. Autodiff (http://autodiff.codeplex.com, rev: d8799882919c)
   – TermBuilder.Constant
   – TermBuilder.Exp
   – TermBuilder.Log
   – TermBuilder.Power
2. Battleships (http://github.com/guylr/Console-Battleships-CSHARP, rev: 9986775f36)
   – Sea.ChangeDirection
   – Sea.PlaceShipInSea
   – Sea.SelectBlocks
   – Sea.UnSelectBlocks
3. BBCode (http://bbcode.codeplex.com, rev: 80132)
   – BBCode.EscapeText
   – BBCode.ToHtml
   – BBCode.UnescapeText
   – BBCodeParser.ParseAttributeValue
   – BBCodeParser.ParseChar
   – BBCodeParser.ParseName
   – BBCodeParser.ParseWhitespace
   – SequenceNode.AcceptVisitor
   – SequenceNode.SetSubNodes
   – SyntaxTreeNodeCollection.InsertItem
   – SyntaxTreeVisitor.GetModifiedSubNodes
   – SyntaxTreeVisitor.Visit(SequenceNode)
   – SyntaxTreeVisitor.Visit(SyntaxTreeNode)
   – SyntaxTreeVisitor.Visit(TagNode)
   – TagNode.ReplaceAttribute
4. BCrypt (http://bcrypt.codeplex.com, rev: d05159e21ce0)
   – BCrypt.Char64
   – BCrypt.DecodeBase64
   – BCrypt.EncodeBase64
   – BCrypt.GenerateSalt
   – Bcrypt.StreamToWord
5. Boggle (http://boggle.codeplex.com, rev: 20226)
   – BasicWordList.ContainsWord
   – BasicWordList.ContainsWordStartingWith
   – BasicWordList.IsWordWithinBounds
   – BasicWordList.Load
   – Extensions.AsString
   – Extensions.LettersAt
   – Extensions.Validate
   – Game.ScoreLetterCount
   – Tray.FilledWith
   – Trie.Add
   – Trie.Contains
   – Trie.ContainsPrefix
   – Trie.LastNodeOf
6. Bowling (http://github.com/ardwalker/bowling-for-csharp, rev: eb706e4fc8)
   – BowlingGame.Roll
   – Strike.AddBonus
7. DSA (http://dsa.codeplex.com, rev: 96133)
   – Numbers.Factorial
   – Numbers.Fibonacci
   – Numbers.GetHexSymbol
   – Numbers.GreatestCommonDenominator
   – Numbers.IsPrime
   – Numbers.MaxValue
   – Numbers.Power
   – Numbers.ToBinary
   – Numbers.ToHex
   – Numbers.ToOctal
   – Searching.ProbabilitySearch
   – Searching.SequentialSearch
   – Sets.Permutations
   – Sorting.BubbleSort
   – Sorting.Exchange
   – Sorting.MedianLeft
   – Sorting.MergeSort
   – Sorting.QuickSort
   – Sorting.ShellSort
   – Strings.Any
   – Strings.IsPalindrome
   – Strings.RepeatedWordCount
   – Strings.ReverseWords
   – Strings.Strip
   – Strings.WordCount
8. Scrabble (http://wpfscrabble.codeplex.com, rev: 20226)
   – Board.TileAt
   – Board.TileExistsAt
   – BoardLocation.IsLocationWithinBounds
   – Extensions.AvailableLocationsIn
   – Extensions.Shuffle
9. Sudoku (http://github.com/sakowiczm/Sudoku-Solver-CSharp, rev: 3966be25f9)
   – Cell.CompareTo
   – Extensions.CellsToString
   – SudokuSolver.CheckVertically
   – SudokuSolver.GetBlock
   – SudokuSolver.GetValues
   – SudokuSolver.IsPossible