Mining and Checking Object Behavior

Valentin Dallmeier

Dissertation zur Erlangung des Grades des Doktors der Ingenieurwissenschaften
der Naturwissenschaftlich-Technischen Fakultäten der Universität des Saarlandes

Saarbrücken, 2010

Day of Defense
Dean: Prof. Holger Hermanns
Head of the Examination Board: Prof. Dr. Reinhard Wilhelm
Members of the Examination Board: Prof. Dr. Andreas Zeller, Prof. Dr. Sebastian Hack, Dr. Gordon Fraser

Summary

This thesis introduces a novel approach to modeling the behavior of programs at runtime. We leverage the structure of object-oriented programs to derive models that describe the behavior of individual objects. Our approach mines object behavior models, finite state automata where states correspond to different states of an object, and transitions are caused by method invocations. Such models capture the effects of method invocations on an object’s state. To our knowledge, our approach is the first to combine the control-flow with information about the values of variables. Our ADABU tool is able to mine object behavior models from the executions of large interactive JAVA programs. To investigate the usefulness of our technique, we study two different applications of object behavior models:

Mining Specifications Many existing verification techniques are difficult to apply because in practice the necessary specifications are missing. We use ADABU to automatically mine specifications from the execution of test suites. To enrich these specifications, our TAUTOKO tool systematically generates test cases that exercise previously uncovered behavior. Our results show that, when fed into a typestate verifier, such enriched specifications are able to detect more bugs than the original versions.

Generating Fixes We present PACHIKA, a tool to automatically generate possible fixes for failing program runs. Our approach uses object behavior models to compare passing and failing runs. Differences in the models both point to anomalies and suggest possible ways to fix the anomaly. In a controlled experiment, PACHIKA was able to synthesize fixes for real bugs mined from the history of two open-source projects.

Zusammenfassung

Diese Arbeit stellt einen neuen Ansatz zur Modellierung des Verhaltens eines Programmes zur Laufzeit vor. Wir nutzen die Struktur objektorientierter Programme aus, um Modelle zu erzeugen, die das Verhalten einzelner Objekte beschreiben. Unser Ansatz generiert Objektverhaltensmodelle, endliche Automaten, deren Zustände unterschiedlichen Zuständen des Objektes entsprechen. Zustandsübergänge im Automaten werden durch Methodenaufrufe ausgelöst. Diese Modelle erfassen die Auswirkungen von Methodenaufrufen auf den Zustand eines Objektes. Nach unserem Kenntnisstand ist unser Ansatz der erste, der Informationen über den Kontrollfluss eines Programms mit den Werten von Variablen kombiniert. Unser ADABU Prototyp ist in der Lage, Objektverhaltensmodelle von Ausführungen großer JAVA Programme zu lernen. Um die Anwendbarkeit unseres Ansatzes in der Praxis zu untersuchen, haben wir zwei unterschiedliche Anwendungen von Objektverhaltensmodellen untersucht:

Lernen von Spezifikationen Viele Ansätze zur Programmverifikation sind in der Praxis schwierig zu verwenden, da die notwendigen Spezifikationen fehlen. Wir verwenden ADABU, um Spezifikationen von der Ausführung automatischer Tests zu lernen. Um die Spezifikationen zu vervollständigen, generiert der TAUTOKO Prototyp systematisch Tests, die gezielt neues Verhalten abtesten. Unsere Ergebnisse zeigen, dass derart vervollständigte Spezifikationen für ein spezielles Verifikationsverfahren namens ”Typestate Verification” wesentlich mehr Fehler finden als die ursprünglichen Spezifikationen.

Automatische Programmkorrektur Wir stellen PACHIKA vor, ein Werkzeug, das automatisch mögliche Programmkorrekturen für fehlerhafte Programmläufe vorschlägt. Unser Ansatz verwendet Objektverhaltensmodelle, um das Verhalten von normalen und fehlerhaften Läufen zu vergleichen. Unterschiede in den Modellen weisen auf Anomalien hin und zeigen mögliche Korrekturen auf. In einem kontrollierten Experiment war PACHIKA in der Lage, Korrekturen für echte Fehler aus der Versionsgeschichte zweier quelloffener Programme zu generieren.

Acknowledgments

First and foremost, I thank my adviser Andreas Zeller for his patience and support over the years. A big thank you also goes to Sebastian Hack for being my second examiner and for sharing his wisdom on static analysis. Throughout the course of my PhD, I have had the pleasure to collaborate with excellent researchers. A big thank you to Christian Lindig who taught me a lot in the early years of my PhD. Thank you to Andrzej Wasylkowski for fruitful discussions and sharing his knowledge. I also thank Thomas Zimmermann for his help with developing IBUGS, and Bertrand Meyer for joint work on PACHIKA. A special thank you goes to Laura Dietz for her patience explaining the secrets of machine learning. Thank you to Tobias Scheffer for supporting our work on debugging with machine learning. I also thank Nikolai Knopp for a great bachelor’s thesis, and Christoph Mallon for support with JFIRM. A big thank you goes to my colleagues at the chair for Engineering. I really enjoyed the open atmosphere, fruitful retreats at Dagstuhl, and last but not least the conversations in the coffee breaks. I am indebted to Kim Herzig, Sascha Just, Christian Holler and Sebastian Hafner for keeping the infrastructure at the chair up and running. A special thank you goes to my friend and colleague Martin Burger, my office mate through most of my PhD time. I also thank my parents and my sisters for supporting me. Finally, I thank my wife Jasmin for her patience and love.

Contents

1 Introduction 1 1.1 About this Thesis ...... 3 1.2 Terminology ...... 4 1.3 Publications ...... 6

2 Classifying Bugs 7 2.1 Source Data ...... 7 2.2 Classification ...... 8 2.3 Conclusions ...... 10

3 State of the Art 11 3.1 Dynamic Program Behavior ...... 12 3.2 Program Spectra ...... 16 3.3 Call-Sequence Sets ...... 17 3.4 Finite State Automata ...... 20 3.4.1 Learning Finite State Automata ...... 20 3.4.2 Software Process Models ...... 22 3.4.3 Extended Finite State Machines ...... 23 3.4.4 Object Usage Specifications ...... 25 3.4.5 Markov Chains ...... 27 3.4.6 Summary ...... 28 3.5 Invariants ...... 29 3.6 Conclusions ...... 30

4 Object Behavior Models 33 4.1 Identifiers ...... 36 4.2 Inspectors ...... 36


4.3 Value Access Paths ...... 37 4.4 Object States ...... 38 4.5 Object Behavior Models ...... 39 4.6 Model Depth ...... 42 4.7 State Abstraction ...... 43 4.8 Conclusions ...... 45

5 Mining Object Behavior Models 47 5.1 Tracing ...... 48 5.1.1 Data Collection ...... 48 5.1.2 Architecture ...... 49 5.1.3 Principles ...... 51 5.1.4 Traced Data ...... 54 5.1.5 Object Identifiers ...... 54 5.1.6 Tracing Inspector Values ...... 55 5.1.7 Multithreading ...... 55 5.1.8 Runtime Evaluation ...... 55 5.2 Model Mining ...... 57 5.2.1 Dynamic Heap Model ...... 57 5.2.2 Model Generation ...... 58 5.2.3 Runtime Optimizations ...... 60 5.3 Dynamic Side-Effect Analysis ...... 60 5.3.1 Pure Methods ...... 62 5.3.2 Analysis ...... 62 5.3.3 Tracing ...... 63 5.3.4 Algorithm ...... 63 5.3.5 Multiple Program Runs ...... 65 5.3.6 Soundness ...... 65 5.3.7 Evaluation ...... 65 5.3.8 Related Work ...... 70 5.4 Conclusions ...... 71

6 Mining Bug Benchmarks 73 6.1 Motivation ...... 74 6.2 Related Work ...... 75 6.2.1 Existing Benchmark Suites ...... 75 6.2.2 Defect Localization Tools ...... 76 6.2.3 Bug Classification ...... 77 6.3 Bug Extraction from History ...... 78


6.3.1 Prerequisites ...... 78 6.3.2 Fix Identification ...... 79 6.3.3 Extraction ...... 80 6.3.4 Test Execution ...... 81 6.3.5 Associated Tests ...... 82 6.3.6 Meta Information ...... 83 6.3.7 Repository ...... 86 6.4 Subjects ...... 87 6.4.1 Characteristics ...... 87 6.4.2 Locality ...... 88 6.4.3 Size ...... 88 6.4.4 Syntactical Properties ...... 91 6.5 Minimizing Fixes with Delta Debugging ...... 92 6.5.1 Delta Debugging ...... 93 6.5.2 Minimizing Fixes ...... 94 6.6 Biased Data Sets ...... 103 6.6.1 Bug Features ...... 103 6.6.2 Results ...... 104 6.7 Conclusions ...... 105

7 Mining Models for Typestate Verification 107 7.1 Typestate Analysis ...... 109 7.2 Mining Typestate Automata ...... 110 7.3 Enriching Typestate Automata ...... 111 7.4 Experimental Evaluation ...... 114 7.4.1 Subjects ...... 116 7.4.2 Quantitative Evaluation ...... 117 7.4.3 Qualitative Evaluation ...... 118 7.4.4 Usefulness ...... 120 7.4.5 Threats to Validity ...... 124 7.5 Related Work ...... 126 7.5.1 Test Case Generation ...... 126 7.5.2 Typestate Verification ...... 127 7.5.3 Specification Mining ...... 128 7.6 Conclusions ...... 128


8 Generating Fixes from Object Behavior Anomalies 131 8.1 Mining Models ...... 134 8.1.1 Mining Preconditions ...... 135 8.2 Detecting Violations ...... 137 8.3 Generating Fixes ...... 139 8.3.1 Inserting Calls ...... 139 8.3.2 Deleting Calls ...... 140 8.4 Choosing the Best Fix ...... 140 8.5 Experimental Evaluation ...... 141 8.5.1 Subjects ...... 141 8.5.2 Experimental Setup ...... 141 8.5.3 Running the Experiments ...... 142 8.5.4 Performance ...... 143 8.5.5 Results ...... 144 8.5.6 Discussion ...... 150 8.5.7 Threats to Validity ...... 151 8.6 Applicability ...... 152 8.7 Related Work ...... 154 8.7.1 Locating Bugs ...... 154 8.7.2 Repairing Programs ...... 154 8.7.3 Leveraging Specifications ...... 155 8.7.4 Repairing State ...... 155 8.7.5 Mining Specifications ...... 155 8.7.6 Generating Tests ...... 156 8.8 Conclusions ...... 156

9 Conclusions and Future Work 157 9.1 iBugs ...... 159 9.2 Tautoko ...... 159 9.3 Pachika ...... 160

A Additional Figures and Tables 161

B Trace File Format Description 169 B.1 Concepts ...... 169 B.1.1 Serialization ...... 170 B.1.2 Object Identifiers ...... 170 B.1.3 Method Identifiers ...... 170 B.1.4 Field Identifiers ...... 170


B.1.5 Thread Identifiers ...... 171 B.1.6 Allocation Site Identifiers ...... 171 B.1.7 Invocation Site Identifiers ...... 171 B.2 Events ...... 171 B.2.1 Identifier Events ...... 171 B.2.2 Method Call Events ...... 174 B.2.3 Parameter Events ...... 175 B.2.4 Return Events ...... 175 B.2.5 Field Access Events ...... 176 B.2.6 Array Access Events ...... 177 B.2.7 Inspector Events ...... 178 B.2.8 List of Event Identifiers ...... 178

Bibliography 181


List of Figures

1.1 The first computer bug...... 5

2.1 Syntactical properties of bugs in ...... 9

3.1 A generic execution model...... 12 3.2 Different program spectra for gcd...... 18 3.3 Call-sequence set abstraction...... 18 3.4 Over-generalizing automata...... 21 3.5 Automaton generated by the k-tail algorithm...... 23 3.6 Example of an extended finite state machine...... 24 3.7 An object usage specification for Iterator...... 25 3.8 Branch-based Markov model...... 28

4.1 An object behavior model for the Vector class...... 34 4.2 Uml schema for a car management application...... 38 4.3 An object behavior model for the IMAPProtocol class...... 40 4.4 An object behavior model for Vector...... 41 4.5 A model for PersistenceManager...... 43

5.1 Overview of Adabu...... 48 5.2 Architecture of the tracing framework...... 51 5.3 Example instrumentation for a field write operation...... 52 5.4 Stack manipulation step by step...... 53 5.5 An example of a dynamic heap...... 58 5.6 Examples of pure methods...... 61 5.7 Purity results for AspectJ...... 66

6.1 Linking bug reports and changes...... 80


6.2 Pre- and post-fix versions for a bug...... 81 6.3 Building and testing pre-fix and post-fix versions...... 82 6.4 Comparison of pre-fix and post-fix versions...... 85 6.5 Fingerprints for Bug 87376...... 87 6.6 Histogram for fixes in AspectJ and Rhino...... 89 6.7 Sizes of fixes for AspectJ and Rhino...... 90 6.8 Example fix for bug 203402...... 101

7.1 Overview of Tautoko...... 108 7.2 Typestate for SMTPProtocol...... 110 7.3 Initial model of the SMTPProtocol class...... 112 7.4 Enriched model of the SMTPProtocol class...... 113 7.5 Evaluation scheme for Tautoko...... 121 7.6 A screenshot of the Eclipse integration for JFTA ...... 129

8.1 Combined model for the BaseIOAcceptor class...... 132 8.2 Overview of how Pachika works...... 133 8.3 A deep model for PersistenceManager...... 136 8.4 Fix generated by Pachika for bug 173602...... 144 8.5 Fix generated by Pachika for bug 121616...... 148 8.6 Fix generated by Pachika for bug 51322...... 149 8.7 Fix generated by Pachika for bug 60015...... 150

A.1 Step-by-step guide to iBUGS (1/2)...... 162 A.2 Step-by-step guide to iBUGS (2/2)...... 163 A.3 Examples for different bugs with fingerprints (1/3)...... 164 A.4 Examples for different bugs with fingerprints (2/3)...... 165 A.5 Examples for different bugs with fingerprints (3/3)...... 166 A.6 Repository entry for bug 69459...... 167

List of Tables

2.1 Manual classification of bugs...... 8

3.1 Tracing results for the Spec benchmarks...... 15 3.2 A subset of invariant types provided by Daikon...... 30

4.1 Invariant templates used by abstraction function...... 45

5.1 Runtime overhead of Adabu...... 56 5.2 Runtime overhead of JPure...... 69

6.1 Overview of evaluation methods for bug localization tools...... 77 6.2 Bug candidates analyzed for AspectJ...... 79 6.3 Token types in Apfel...... 85 6.4 Statistics of the development history of AspectJ and Rhino...... 88 6.5 Number of bugs per fingerprint in AspectJ and Rhino...... 91 6.6 Minimization results for Rhino...... 97 6.7 Minimization results for AspectJ (1/2)...... 98 6.8 Minimization results for AspectJ (2/2)...... 99 6.9 Subjects used in the evaluation of bug bias...... 104

7.1 Subjects of the Tautoko case study...... 114 7.2 Quantitative results for Tautoko...... 117 7.3 Statistics of manually specified models...... 118 7.4 Sources of test runs for Tautoko...... 122 7.5 Evaluation of true positives for Tautoko...... 123 7.6 Evaluation of false positives for Tautoko...... 125

8.1 Subjects used in the evaluation of Pachika...... 142


8.2 Runtime overhead of Pachika...... 143 8.3 Evaluation results for crashing bugs in AspectJ...... 145 8.4 Evaluation results for non-crashing bugs in AspectJ...... 146 8.5 Evaluation results for bugs in Rhino...... 147 8.6 Overview of classes with preconditions...... 153

B.1 Event identifiers processed by Adabu...... 179

Chapter 1

Introduction

According to a study in 2002 by the National Institute of Standards and Technology (NIST), software errors cost the economy of the United States ”an estimated $59.5 billion annually” [79], which amounts to 0.6 percent of the gross national product of the United States in 2002. Obviously, software errors (also referred to as bugs) and their effects are a serious problem for the economy. Almost eight years have passed since the NIST study was published. Have things improved since then?

In the last few years, we have witnessed several bugs in widely-used systems. In January 2009, Microsoft’s Zune player stopped working due to a bug in the date calculation library. Millions of users were unable to use their Zune for a few days. Not being able to listen to music admittedly is no life-threatening problem. However, one year later, millions of clients of easycash (a large electronic cash provider) were unable to use their cards for a few weeks. Again, there was a bug in the date calculation library. In that case, the effects of the bug were more severe. Customers were unable to shop, and thousands of ATMs had to be re-programmed, causing costs of millions of Euros.

Obviously, bugs are still a problem and the effects are getting worse as more people rely on software-based systems. Hence, the need for solutions that help avoid bugs or limit their effects is strong. Researchers have long since recognized this problem and are developing approaches to solve it. A broad range of techniques tries to prevent bugs from occurring: The EROSE tool [123] leverages historical information to remind a developer of missing changes. New programming languages such as JAVA make whole classes of errors obsolete, and new programming paradigms such as Extreme Programming introduce new ways to develop software. However, as the Zune and easycash examples show, software still contains bugs and hence we also need techniques that help us debug a program.

Researchers have proposed a number of approaches that try to automatically localize bugs. For example, the TARANTULA tool ranks statements according to execution profiles from passing and failing runs. A statement that is executed often in the failing run, but seldom or never in passing runs, is likely to be the cause of the problem and is therefore ranked at the top. TARANTULA is an example of a class of techniques that are based on comparing the behavior of programs across different runs. Many of these approaches compare the behavior of a failing run that exhibits a problem to one or more passing runs where the program behaves correctly. By analyzing differences in the behavior, these approaches identify statements that are likely to be the cause of the failure. The key feature of such behavior-based approaches is the way program behavior is modeled. Existing techniques range from simple statistical models [52, 57, 92, 29] to sophisticated techniques that model program behavior as dynamic invariants [39].

In this thesis, we present a novel approach to modeling program behavior. In contrast to existing approaches, our technique is specifically targeted at object-oriented languages. In the object-oriented world, code and variables that are concerned with implementing an entity are grouped together in objects. Our approach mines so-called object behavior models that describe the behavior of individual objects at runtime. Such models come in the form of finite state automata, where states correspond to different states of the object, and transitions occur due to method invocations. To learn these models, we observe example executions of the program as provided by the program’s test suite. When learned from correct examples, an object behavior model describes the correct usage of a class. In this thesis, we investigate two different approaches that use object behavior models as a specification of correct usage, and to compare the behavior of objects across different program runs:

Preventing Bugs We learn behavior models by tracing the regression test suite of a library and use existing static analysis techniques to find incorrect API usage already when code is developed. Our approach is implemented as an ECLIPSE plugin that highlights incorrect usage whenever the developer changes code.

Fixing Bugs We present an approach that generates fixes for failing runs. Our technique compares object behavior models from passing and failing runs, and generates fix candidates based on differences in the models. Evaluated on a set of realistic projects, our tool was able to generate fixes for a number of real-life bugs.

To judge the effectiveness of these techniques, we would like to evaluate their performance on a set of real-life bugs. Unfortunately, existing bug repositories such as the Software-Artifact Infrastructure Repository (SIR) [37] provide mostly small subjects with artificially seeded bugs. An evaluation based on subjects from this repository would be flawed, because the results can hardly be generalized to realistic programs. To avoid these problems, we have developed a new approach to mining bug benchmarks from a project’s development history. The result of our efforts is a publicly available repository called IBUGS that contains over 300 bugs mined from the history of two large open-source projects.

1.1 About this Thesis

The most important contribution of this thesis is an approach to mine models of object behavior for programs with realistic size and an evaluation of two approaches that use these models to solve real-world problems. A lot of effort went into making our tools scale to real-world programs and to empirically validate the usefulness of our techniques. All tools and data sets developed in the course of our work are publicly available. This will hopefully encourage other researchers to further explore the approaches presented in this thesis and to find new applications for object behavior models. The remainder of this thesis is structured as follows:

• In Chapter 2, we pave the road for the work presented in subsequent chapters by analyzing structural properties of a large number of fixes. Our findings show that many bug fixes are small and touch only few lines in a single method.

• In Chapter 3, we present the state of the art in software execution models. We provide a classification of approaches based on the types of information used, and discuss advantages and disadvantages of each approach.

• In Chapter 4, we introduce object behavior models as a novel way of representing program behavior. We give a formal definition and present advantages and disadvantages of our technique.

• In Chapter 5, we present ADABU, a tool that mines object behavior models from the execution of JAVA programs. We highlight important technical aspects of the tool, and present lessons learned for implementing dynamic program analysis techniques in JAVA.

• In Chapter 6, we present the IBUGS approach, which mines real bugs from version archives and bug databases. Based on our technique, we have created a repository of programs with real bugs. This repository is publicly available and can be used to compare the performance of approaches that are concerned with bugs and their effects.

• In Chapter 7, we use object behavior models as input for typestate verification, a static analysis that detects misuses of classes. Since most test suites do not trigger exceptions, our tool called TAUTOKO mutates the test suite in order to enrich the initial specification. Evaluated on a set of seeded bugs, a typestate verifier that uses enriched specifications finds significantly more bugs than when using the initial specification.

• In Chapter 8, we present PACHIKA, a tool that generates fixes for bugs based on differences in object behavior models mined from passing and failing runs.

The thesis concludes with a summary and ideas for future work in Chapter 9. The remainder of this chapter defines common terms used throughout the thesis, and lists publications related to this thesis.

1.2 Terminology

A widely accepted model for software failures was developed by Zeller [119]. When working with a program, we sometimes observe incorrect behavior: The program crashes or produces incorrect output. In this case, we say that the program fails or that we can observe a failure. An example of such a failure that occurred often in the early days of the Windows operating system is the famous blue screen: All of a sudden, Windows displayed a white error message on a blue background and completely stopped working. A program run that exhibits no failure is referred to as a passing run. Accordingly, a run that shows a failure is called a failing run. Zeller identifies four different stages of how a failure comes to be:

• At the beginning, the programmer introduces a defect (or bug) in the program. A defect is a piece of code that causes an infection of the program state (see below). In other words, a defect is the piece of code that is responsible for the failure.

• When the defective code is executed, it creates an infection of the program state. After the defect is executed, the program state differs from what the programmer had in mind.

• As the execution continues, the infection is further propagated through the program state. Functions that take already infected program state as input may cause the infection to spread through the program state.

Figure 1.1: The first computer bug. In 1945, scientists working at the Mark II computer at Harvard University found a dead moth in one of the components. This soon became known as the “first actual case of bug being found”.

• Finally, the infection causes the failure. For example, the infection may violate conditions that are necessary for the invocation of a function. Such violations may result in null pointer dereferences, or in failed assertions. At this point, the problem becomes visible to the user.

As soon as the developer observes a failure, he has to debug the program. Debugging requires analyzing the infection chain, finding the root of the infection (the defect), and removing it such that the failure no longer occurs and the program behaves correctly.

1.3 Publications

This thesis builds on the following papers (in chronological order):

• Valentin Dallmeier, Christian Lindig, Andrzej Wasylkowski, Andreas Zeller. Mining object behavior with ADABU. In WODA ’06: Proceedings of the 2006 International Workshop on Dynamic Systems Analysis, pages 17–24, New York, NY, USA, 2006. ACM.

• Valentin Dallmeier and Thomas Zimmermann. Extraction of bug localization benchmarks from history. In ASE ’07: Proceedings of the twenty-second IEEE/ACM International Conference on Automated Software Engineering, pages 433–436, New York, NY, USA, 2007. ACM.

• Valentin Dallmeier, Andreas Zeller, and Bertrand Meyer. Generating fixes from object behavior anomalies. ASE ’09: Proceedings of the twenty-fourth IEEE/ACM International Conference on Automated Software Engineering, pages 550–554, 2009.

• Valentin Dallmeier, Nikolai Knopp, Christoph Mallon, Sebastian Hack, and Andreas Zeller. Generating test cases for specification mining. In ISSTA ’10: Proceedings of the 19th International Symposium on Software Testing and Analysis, pages 85–96, New York, NY, USA, 2010. ACM.

Chapter 2

Classifying Bugs

In this thesis, we are concerned with finding new approaches related to bugs in programs. Before developing our approach, we first perform a syntactical analysis of a large number of real bugs and attempt to find a classification that covers as many bugs as possible. Such a classification would provide us with insights on the structure of bugs, and would also allow us to direct our work towards large bug classes.

In this chapter, we present an analysis of the bugs recorded for the ECLIPSE project¹, which provides an open-source development environment for JAVA. We first try to automatically classify bugs based on a comparison of the program right before and right after a bug was fixed. After that, we manually verify the results of the automatic classification. Our results show that even when manually analyzing each bug, it is difficult to provide a precise classification for the majority of bugs.

In the remainder of this chapter, we present details of the data used in the study (Section 2.1), present the results of our experiment (Section 2.2), and discuss our findings (Section 2.3). Parts of this study were carried out by Markus Thiele in his bachelor’s thesis [102].

¹ http://www.eclipse.org

2.1 Source Data

To perform our experiment, we use the APFEL [122] tool to extract data from the version archives and bug databases of ECLIPSE. APFEL represents source code as sets of tokens extracted from the abstract syntax tree. By comparing token sets of versions right before and after a bug was fixed, we are able to characterize the changes that comprise a bug fix.

For our study, we have used APFEL to process all files and bug reports of ECLIPSE filed before the end of May 2006. APFEL links bug reports to code changes by analyzing commit messages for keywords and bug identifiers. We leverage this linking to identify versions and analyze token sets of changes that fix bugs. Altogether, the database contains 24300 bugs that can be associated with code changes.

Figure 2.1 shows the size distribution of all changes in terms of the number of affected files, classes, methods and lines. For all features, the distribution is roughly exponential, indicating that the typical bug is small and affects only few lines of code that are usually located in a single method. The size of the fixes ranges from a single line to huge changes affecting several thousand methods. Since classifying such complex changes is difficult, we chose to include only fixes that touch at most six lines in a single method. Altogether, we found 2478 such fixes.
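As a purely illustrative sketch of the idea behind this comparison (the actual token extraction and linking is performed by APFEL and described in [122]; the class and method names below are our own and not part of APFEL), a fix can be characterized by the signed difference of the token multisets of the pre-fix and post-fix revisions:

    import java.util.*;

    // Illustration only: characterizing a fix by diffing token multisets of the
    // pre-fix and post-fix revision. The real extraction is done by APFEL [122].
    class FixCharacterizer {

        // Tokens added or removed by the fix, with signed counts (positive = added).
        static Map<String, Integer> tokenDelta(List<String> preFixTokens,
                                               List<String> postFixTokens) {
            Map<String, Integer> delta = new HashMap<>();
            for (String token : postFixTokens) delta.merge(token, 1, Integer::sum);
            for (String token : preFixTokens)  delta.merge(token, -1, Integer::sum);
            delta.values().removeIf(count -> count == 0); // keep only actual changes
            return delta;
        }
    }

Under this reading, a delta that, for instance, adds an "if" token together with a null-comparison token would be a candidate for the "missing null check" class discussed in the next section.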

2.2 Classification

Once the dataset was fixed, we collected a catalog of potential bug classes. To that end, we analyzed the types of bugs found by existing bug finding tools such as FINDBUGS [54]. The resulting catalog contains seven classes of bugs, which are intentionally kept abstract to capture larger classes of bugs and to facilitate the analysis. Each class maps to a certain set of tokens as extracted by APFEL, which allowed for an easy classification.

Figure 2.1: Syntactical properties of bugs reported for the ECLIPSE development environment. For all size properties, the distribution is exponential, indicating that the typical fix is small. (The figure shows frequency histograms of the revision count, file count, class count, method count, removed line count, and added line count per fix.)

Unfortunately, automatic classification based on APFEL turned out to be too imprecise. In many cases, fixes were incorrectly classified due to changes unrelated to fixing the bug. To overcome this problem, we complemented the automatic classification by manually verifying all classified fixes. The results of the manual verification are summarized in Table 2.1, which shows the bug classes and the percentage of bugs in each class.

    Bug Class                          Percentage
    Missing or Faulty Null Check            19.09
    Faulty Boolean Expression                8.50
    Faulty Arithmetic Expression             1.38
    Faulty Comparison                        1.21
    Lacking Exception Handling               1.08
    Lacking Thread Synchronization           0.16
    Lacking Initialization                   0.45
    Unclassified                            68.13

Table 2.1: Results of the manual classification of bugs. The largest class of bugs is concerned with null pointer dereferences. Two thirds of the bugs cannot be classified.

The table shows that about two thirds of all classifications are wrong or just coincidental. Coincidental classifications are mostly due to the nature of the data generated by APFEL, which sometimes provides a too coarse abstraction over the data in the syntax tree. Among those bugs that were classified, we were able to identify two classes that occur relatively often in practice. Missing or wrong null checks account for almost 20% of all bugs. Missing checks are often related to variables whose value is obtained by calling another method, indicating that developers often make wrong assumptions about the postconditions of other methods. Another large class of bugs is due to errors in boolean and arithmetic expressions. One reason why these expressions are often wrong is that they are usually more complex than, for example, simple method calls, and therefore provide more potential for errors.

2.3 Conclusions

There are two important conclusions we can draw from the above study. First, the typical bug fix is small and affects only a small number of methods. We suspect that this is at least partly due to the fact that we are dealing with post-release bugs, where the program usually works and only seldom shows erroneous behavior. Thus, a bug often requires only a small change, since the functionality is already implemented and to a large part tested. This gives rise to the hope that it may actually be feasible to synthesize fixes, at least for a small class of bugs. In Chapter 8, we investigate such an approach and show that automatically generating fixes is actually possible.

Our second conclusion is based on our failure to classify over two thirds of the bugs in the study. Obviously, despite the small size, most bug fixes are too complex to be expressed in terms of static information such as tokens from an abstract syntax tree. To be able to capture more complex bugs, we need more information, and possibly also a better formalism to represent the information. One way to obtain more information about a program is to execute it and gather information while the program runs. As our experiments in the next chapter will show, observing an execution yields gigabytes of data, and the challenge for any approach to modeling program behavior is therefore to find a good abstraction over all this data that captures the essence of the program run.

Chapter 3

State of the Art

Once a developer has finished implementing a program, he needs to test the correctness of the implementation. To test a program, it is usually executed several times with different inputs. If the program crashes or the observed output is incorrect, the developer knows that the program contains a bug. To solve this problem, the developer needs to debug the program, that is, he needs to locate and fix the bug in the source code.

One way to debug a program is to execute it and investigate each step of the execution. Inside his head, the developer creates a mental model of the program and its behavior. By comparing the model against a specification of the intended behavior, the developer is able to find locations where the observed behavior first deviates from the intended behavior. Assuming that the model is accurate and the specification correct, these locations contain bugs. Depending on the complexity of the program, locating a bug may take days or even weeks.

This thesis investigates the use of software execution models to help the programmer in various debugging-related activities. A software execution model captures the behavior of a program in a model that can be processed automatically. Thus we can hopefully reduce the manual effort that is related to debugging. There are a number of existing approaches that mine varying forms of software execution models. In general, the biggest challenge when developing a model is to find the right level of abstraction to use.

In this chapter, we investigate the state of the art in software execution models. To approach the problem, we first introduce a notion of what constitutes dynamic program behavior (Section 3.1). Starting from a generic execution model, we discuss what kinds of information are available at runtime, and present an experiment that investigates how much data is accumulated when observing all aspects of an execution. Observations from this experiment form the starting point for a survey of existing software execution models presented in the remainder of the chapter.

Figure 3.1: A generic execution model. The program consists of individual instructions which alter memory and update the program counter. The environment encompasses all external resources.

3.1 Dynamic Program Behavior

What information can we observe when executing a program? To answer this question, we first need to devise a model for the execution of a program. As there are many different programming languages and platforms upon which programs are run, such a model has to be very abstract to be valid for a broad range of configurations. Figure 3.1 shows a generic execution model that consists of four different parts:

Code This is the actual code of the program that is to be run. It consists of a list of instructions (machine code, virtual machine instructions) that are executed individually. Each instruction is identified by its address.

Program Counter The program counter (pc) is a special variable that stores the address of the next instruction to execute.

Memory At runtime, data is stored in memory, which is organized in individual cells that hold the values of variables or constants. In general, an instruction can read or write the contents of individual cells.

Environment The environment summarizes all external resources that are accessible during the execution. Examples of external resources are files and network connections.

After initialization, execution of a program proceeds as follows: First, the instruction referenced by the pc is loaded. After that, the instruction is executed, possibly altering memory cells. Finally, the pc is updated with the next instruction to execute. At first glance, it may seem unnecessary to distinguish the program counter from the rest of the memory. However, in contrast to all other cells, the pc has a predefined semantics that is vital for the execution of the program. Given the above model, what information can we record for the individual parts?

Code For this thesis, we assume that the code does not change over time.¹ Hence, the set of instructions is already known at program start and we do not need to record it.

Program Counter The program counter is the part that changes most often in the course of the execution. Its values over time capture the order in which instructions are executed. This information, which we refer to as the control-flow of the execution, is a vital part of the program’s runtime behavior. To capture control-flow, we need to record the pc after the execution of every instruction.

Memory When the program executes, it uses the memory to store information. Individual cells may hold interim results of an algorithm or values that affect the control-flow of the program. The state of the memory over time is therefore also an essential part of the program’s runtime behavior. For the remainder of this thesis, we will use the term memory state to refer to the state of all memory cells. When tracing memory state, we record all changes to memory cells.

Environment Capturing the environment, although theoretically possible, is difficult in practice. For example, if the program uses a network link to communicate with a server, there is no way to access the state of the other machine. Thus, for this thesis, we do not trace the program’s environment. However, we do not expect this to be much of a problem, as those parts of the environment that are important for the behavior of the program will eventually be stored in memory and thus become part of the memory state.
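The following sketch is only meant to make this execution model concrete; the instruction interface, the flat memory array, and the string-based trace are made-up simplifications and not part of ADABU (whose tracer is described in Chapter 5). It runs the fetch-execute loop described above and records a control-flow event after every instruction and a memory event for every cell write:

    import java.util.*;

    // Purely illustrative sketch of the generic execution model in Figure 3.1 with
    // tracing hooks. Instruction, the memory array, and the trace format are made up.
    class TracedMachine {

        interface Instruction {
            // Executes one step: may read or write memory cells and returns the next pc.
            int execute(int pc, int[] memory, List<String> trace);
        }

        // Example instruction: store a constant into a memory cell, then fall through.
        static Instruction store(int cell, int value) {
            return (pc, memory, trace) -> {
                memory[cell] = value;
                trace.add("mem[" + cell + "] = " + value);   // memory state event
                return pc + 1;
            };
        }

        // The fetch-execute loop: load the instruction referenced by the pc, execute it,
        // update the pc, and record the control-flow event.
        static List<String> run(Instruction[] code, int[] memory) {
            List<String> trace = new ArrayList<>();
            int pc = 0;
            while (pc >= 0 && pc < code.length) {
                pc = code[pc].execute(pc, memory, trace);
                trace.add("pc = " + pc);                      // control-flow event
            }
            return trace;
        }

        public static void main(String[] args) {
            Instruction[] program = { store(0, 42), store(1, 7) };
            System.out.println(run(program, new int[2]));
            // prints [mem[0] = 42, pc = 1, mem[1] = 7, pc = 2]
        }
    }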

For this thesis, we consider dynamic program behavior to consist of control-flow and memory state information as defined above. Depending on the application, the amount to which each type is recorded can vary. For example, an application may only be interested in specific parts of the memory, and thus records only changes to those parts.

How can we record dynamic behavior in a way that is flexible enough to support the needs of different applications? In this regard, we follow existing approaches and record a trace of events that are interesting for the application. For example, an event may be the execution of an instruction that changes the value of a memory cell. We refer to such traces as execution traces:

¹ Many applications (e.g. RHINO [78], an interpreter for JAVASCRIPT) written in modern languages use dynamically generated code. For such applications it might be interesting to also trace changes to the code.

Definition 1 (Execution Trace) An execution trace s is a sequence s = ⟨(t_1, d_1), (t_2, d_2), ..., (t_n, d_n)⟩ of trace entries (t_i, d_i), where d_i denotes the data that is recorded for the entry, and t_i denotes the time stamp of the entry (t_{i−1} < t_i < t_{i+1}).

This definition is intentionally very generic, since existing approaches require different kinds of data. Depending on the application, a single trace may contain entries with both control-flow and memory state information. For example, the program spectra approach (Section 3.2) traces only control-flow information, whereas the DAIKON tool (Section 3.5) traces a mixture of control-flow and memory state information.

Even short program runs execute millions of instructions and update memory millions of times. Is it actually feasible to trace all this information, and if so, how much data is accumulated? To answer these questions, we have used the tracer of the ADABU tool (see Chapter 5) to trace all control-flow and memory state information available when executing a JAVA program. As a subject for our investigation, we have used a subset of the programs in the SPEC JAVA virtual machine benchmark suite, which contains a set of programs used to test the performance of virtual machines. To collect the data, we ran the tracer with different configurations and analyzed the sizes of the generated traces. Table 3.1 summarizes the results of the experiment: The first column lists the benchmark name, the second column gives the execution time of the original program run measured as CPU seconds. The remaining two columns list the sizes of the trace files for tracing only control-flow (column three), as well as tracing both memory state and control-flow (column four).

                            Runtime    Control-Flow    Memory + Control-Flow
    Benchmark              (Seconds)   (Gigabytes)     (Gigabytes)
    compress                  3.2         23.1             75.2
    crypto.rsa                3.6         16.2             35.8
    crypto.signverify         2.6          4.3             11.2
    scimark.fft               1.1          7.9             23.2
    scimark.lu                1.9         13.2             55.2
    scimark.monte_carlo       5.0         41.1            146.1
    scimark.sor               2.0          4.5             35.3
    scimark.sparse            2.6         18.2             51.6

Table 3.1: Trace sizes and execution times for the compress, crypto, and scimark benchmarks in SPECJVM2008.

Trace Size On average, a program run generates over 19 gigabytes of trace data per second. This is a very large number, especially when compared to the relatively small sizes of the programs and their input data (the whole benchmark with input data has less than 100 megabytes). The huge amount of data makes it difficult to capture and analyze the whole execution.

Control-Flow vs. Memory State Tracing only the control-flow reduces the amount of data produced to roughly one third. At first glance, this seems surprising, since there is only one program counter but there can easily be hundreds of variables. This is due to the fact that the trace only records changes of values. Changing the value of a variable requires executing an instruction, which in turn causes a change to the program counter. Thus, for every change to a variable we also record one change to the program counter.

To summarize, tracing control-flow and memory state for the whole execution is possible; however, it generates gigabytes of data. The sheer size of the trace file makes it difficult to handle the data, and comparing runs based on such large amounts of data is even more difficult.

In the remainder of this chapter, we present an overview of existing software execution models that apply different abstraction techniques to reduce the amount of information: Program spectra (Section 3.2) and call-sequence sets (Section 3.3) count how often certain features of the control-flow such as executed statements or sequences of method calls can be observed. Section 3.4 presents a number of approaches that encode dynamic behavior using finite state automata. The presented approaches range from grammar inference techniques (Section 3.4.2) to probabilistic models (Section 3.4.5). We conclude our survey with DAIKON (Section 3.5), one of the few approaches that focus on the values of variables rather than on control-flow.

3.2 Program Spectra

Early approaches to capture dynamic program behavior focused on finding performance bottlenecks in applications. To that end, these approaches record execution times for code blocks on different levels of granularity. The first approach that went beyond simple timing analysis is the work by Reps et al. [87], which gave rise to a series of other publications [52, 57, 92] that use the same underlying idea. Reps et al. introduced program spectra, a statistical approach that records how often a certain characteristic is observed in a run.

Definition 2 (Counting Spectrum) A counting spectrum of a run consists of a mapping of features to the number of times the feature was observed in the run.

Most spectra use control-flow information as features. For example, a counting path spectrum counts the number of times each loop-free path through the control-flow graph was executed in a run. In their work, Reps et al. use counting path spectra to identify paths in the program along which the control-flow diverges. The goal of this work was to automatically identify parts of the code that were affected by date calculations, and hence might be affected by the y2k problem. To achieve this goal, Reps et al. compare counting path spectra of runs where the supplied inputs are the same except for the dates. In an evaluation, the approach reliably identified a large portion of the relevant code sections.

Besides counting spectra, another type of program spectra used in many applications is called binary spectra.

Definition 3 (Binary Spectrum) A binary spectrum is a counting spectrum that only distinguishes counts equal to and larger than zero.

A popular binary spectrum is statement coverage, which records the set of statements that were executed at least once in a run. Besides control-flow paths and statements, there are a number of other features used for spectra:

Execution-trace spectra record the sequence of statements traversed in a program run. The main conceptual difference to path spectra is that this type also reflects how often loops in the program are executed. Execution-trace spectra are usually much larger than path spectra.

Branch spectra record the set of conditional branches that are traversed in a run. The spectrum is either binary (that is, a branch was hit or not) or includes counts for each time a branch was hit. Compared to path spectra, this spectrum does not record whole paths but instead only counts how often branches in the control-flow graph are visited.

Data-dependence spectra count how often a definition-use pair was traversed. A definition-use pair consists of a defining statement (usually an assignment to a variable) and a use statement (read access to the previously stored value from the same variable). This spectrum captures how the execution follows data dependencies in the program.

Figure 3.2 shows an example that illustrates the aforementioned types of spectra. On the left side, the figure shows the code of a method to calculate the greatest common divisor of two integers. The columns to the right show the binary statement, branch and definition-use-pair spectra obtained when invoking gcd with (6,3).

The dominant application for program spectra is bug localization. A spectra-based bug localization tool requires a failing and one or more passing runs to analyze differences in failing and passing spectra. The hypothesis is that these differences point to locations in the source code that are likely to contain the bug. Harrold and colleagues [52] were the first to investigate this hypothesis for path and branch spectra. Their work was later refined in the TARANTULA [57] tool, which uses statement spectra to find potentially buggy statements. Most recently, Santelices and colleagues [92] have used a combination of statement, branch and definition-use pair spectra to further improve the accuracy of fault localization using spectra.

To summarize, spectra are a concise way of representing a program run. By using different features, spectra capture different aspects of the execution. Existing spectra-based approaches mostly use control-flow features, as the range of values is usually limited (for example, the number of different statements in a program usually is much smaller than the range of different values a single numerical variable can have).
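To make Definitions 2 and 3 concrete, the sketch below computes a counting statement spectrum from a trace of executed statement ids, derives the binary spectrum from it, and ranks statements with the suspiciousness metric commonly associated with TARANTULA [57]. The class and method names are our own; this is an illustration only, not the implementation used in this thesis:

    import java.util.*;

    // Illustration only: counting and binary statement spectra (Definitions 2 and 3)
    // plus a TARANTULA-style suspiciousness score for spectrum-based bug localization.
    class StatementSpectra {

        // Counting spectrum: how often each statement id occurs in the execution trace.
        static Map<Integer, Integer> countingSpectrum(List<Integer> executedStatements) {
            Map<Integer, Integer> counts = new HashMap<>();
            for (int statement : executedStatements)
                counts.merge(statement, 1, Integer::sum);
            return counts;
        }

        // Binary spectrum (statement coverage): the statements executed at least once.
        static Set<Integer> binarySpectrum(Map<Integer, Integer> countingSpectrum) {
            return countingSpectrum.keySet();
        }

        // Suspiciousness: 1.0 means the statement is covered only by failing runs.
        static double suspiciousness(int failingRunsCovering, int totalFailing,
                                     int passingRunsCovering, int totalPassing) {
            double fail = totalFailing == 0 ? 0.0 : (double) failingRunsCovering / totalFailing;
            double pass = totalPassing == 0 ? 0.0 : (double) passingRunsCovering / totalPassing;
            return fail + pass == 0.0 ? 0.0 : fail / (fail + pass);
        }
    }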

3.3 Call-Sequence Sets

One drawback of many types of spectra is that they disregard the temporal ordering of events in the execution trace. As a consequence, such spectra cannot detect bugs that solely affect the order of execution. In previous work [29], we have devised an execution model called call-sequence sets that is specifically targeted at capturing the order of events in a run. In short, the approach works by observing the sequence of method calls issued by each object in isolation. The resulting per-object traces are abstracted to call-sequence sets by sliding a window of length k over the trace. The whole process is illustrated in Figure 3.3.

Definition 4 (Call-Sequence Set Trace) A trace for capturing call-sequence sets contains one entry for every method invocation observed. A trace entry (t_n, d_n) contains data d_n = (m_n, o_n), where m_n is an identifier for the method and o_n is an identifier for the object the method was invoked on.

    int gcd(int a, int b) {
        int r;
        do {
            r = a % b;
            if (r > 0) {
                a = b;
                b = r;
            }
        } while (r > 0);
        return b;
    }

Figure 3.2: Statement, branch and definition-use spectra for a run of the gcd method with input (6,3). (The figure lists the gcd code shown above with line numbers and marks, for each spectrum type — statements, branches, and def-use pairs — which features are covered by the run.)

Figure 3.3: Traces are abstracted into call-sequence sets by sliding a window of fixed length over the trace. The call-sequence set contains all distinct observed window contents. (The figure shows a trace of calls to an InputStream object — mark, read, read, skip, read, read, skip, read — the sequences obtained by sliding the window, and the resulting sequence set.)


Using object identifiers, the single call-sequence set trace is demultiplexed into one trace per object. We group trace entries based on the caller of the invoked method², thus capturing the behavior of objects. Unfortunately, the resulting per-object traces are still very large and difficult to compare. To reduce the amount of data to be processed, we abstract the per-object traces into call-sequence sets as follows:

Definition 5 (Call-Sequence Set) For a given trace s of method calls and a window size k, the set of call-sequences of length k, C(s,k), is the set of k-long substrings of s: C(s,k) = {w | w is a substring of s ∧ |w| = k}.

Example 1 (Call-Sequence Set) For a trace s = ⟨a, b, c, a, b, c, d, c⟩ and a window size k = 2 slid over s, the call-sequence set C(s,2) is

C(s,2) = {⟨a,b⟩, ⟨b,c⟩, ⟨c,a⟩, ⟨c,d⟩, ⟨d,c⟩}

Obviously, abstracting an execution trace into a call-sequence set entails a loss of information. The amount of temporal information retained in the call-sequence set is controlled by the window size k. For larger values of k, the resulting set is larger and contains more information.

Example 2 (Equivalence of Call-Sequence Sets) For k = 2, the set of call-sequences C(s,2) for trace s from the previous example is equal to the set C(t,2) for t = ⟨a, b, c, d, c, a⟩. For k ≥ 3, the call-sequence sets are different.
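The abstraction of Definition 5 is straightforward to implement. The sketch below (illustrative only, with names of our own choosing, not the thesis's tooling) slides a window of length k over a per-object call trace, collects the distinct window contents, and reproduces Example 1 in its main method:

    import java.util.*;

    // Sketch of Definition 5: the distinct windows of length k in a per-object call trace.
    class CallSequenceSets {

        static Set<List<String>> sequenceSet(List<String> trace, int k) {
            Set<List<String>> windows = new LinkedHashSet<>();
            for (int i = 0; i + k <= trace.size(); i++)
                windows.add(new ArrayList<>(trace.subList(i, i + k)));
            return windows;
        }

        public static void main(String[] args) {
            // Example 1: C(s,2) = {<a,b>, <b,c>, <c,a>, <c,d>, <d,c>}
            List<String> s = Arrays.asList("a", "b", "c", "a", "b", "c", "d", "c");
            System.out.println(sequenceSet(s, 2));
            // prints [[a, b], [b, c], [c, a], [c, d], [d, c]]
        }
    }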

To evaluate the usefulness of the model, we have used outgoing call-sequence sets to localize bugs. The idea of the approach is to use call-sequence sets extracted from passing runs to capture the normal behavior of objects. Differences in call-sequence sets from failing runs may then point to the cause of the bug. Our hypothesis was that the more a class's behavior deviates in the failing run, the more likely it is to contain the bug. In a controlled experiment with seeded bugs, our approach was able to localize bugs better than a comparable spectra-based approach.

2In the paper [29], this is referred to as outgoing method calls. 20 CHAPTER 3. STATE OF THE ART to relate sequences of features to locations in the source. In our work on bug localiza- tion [29], we had to resort to proposing all call locations in an abnormal sequence as potential bug locations. To summarize, program spectra and call-sequence sets are fairly simple statistical execution models that mostly use control-flow information. Both approaches rely on counting to reduce the amount of data, thereby loosing parts or all temporal information from the execution trace. In the next section, we will present a class of approaches that use a more sophisticated formalism to represent dynamic program behavior.

3.4 Finite State Automata

A popular way to represent dynamic behavior is to learn a finite state automaton (FSA) from the execution trace. A finite state automaton is a directed graph with nodes representing different states of the system (i.e. the program state), and edges representing transitions between states. FSA permit having loops, that is, (possibly endless) paths through the automaton that contain nodes several times. Loops provide a compact means to represent repetitive behavior. This makes FSA a good formalism for encoding program behavior, as execution information often is highly repetitive. In the remainder of this section, we investigate the problem of learning an FSA from a trace and present existing automata-based software execution models.

3.4.1 Learning Finite State Automata

We start with the definition of a finite state automaton:

Definition 6 A finite state automaton (FSA) is a 5-tuple (Q, Σ, δ, q_0, F) where Q is the set of states, Σ is the set of input symbols, δ is the transition function, q_0 is the start state and F is the set of accepting states. The transition function takes as input a state q ∈ Q and an input symbol σ ∈ Σ and outputs the (possibly empty) set of states reachable from q with σ.

FSA are also well-known in language theory for their correspondence with the class of regular languages. Every FSA defines a regular language that is accepted by the automaton.

Definition 7 We say that an FSA f accepts input sequence t iff the state q after processing t is in F.
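As an illustration of Definitions 6 and 7 (the class below is our own sketch, not part of the thesis's tool chain), a nondeterministic FSA can be represented as a nested transition map; the main method builds the upper automaton of Figure 3.4, which accepts exactly the sequence A B B C D:

    import java.util.*;

    // Minimal sketch of a nondeterministic FSA following Definitions 6 and 7.
    class FSA {
        final Set<String> states = new HashSet<>();
        final Set<String> accepting = new HashSet<>();                    // F
        final Map<String, Map<String, Set<String>>> delta = new HashMap<>(); // transition function
        String start;                                                     // q_0

        void addTransition(String from, String symbol, String to) {
            states.add(from);
            states.add(to);
            delta.computeIfAbsent(from, s -> new HashMap<>())
                 .computeIfAbsent(symbol, sym -> new HashSet<>())
                 .add(to);
        }

        // Definition 7: accept t iff some state reachable after processing t is in F.
        boolean accepts(List<String> t) {
            Set<String> current = new HashSet<>(Collections.singleton(start));
            for (String symbol : t) {
                Set<String> next = new HashSet<>();
                for (String state : current)
                    next.addAll(delta.getOrDefault(state, Collections.emptyMap())
                                     .getOrDefault(symbol, Collections.emptySet()));
                current = next;
            }
            current.retainAll(accepting);
            return !current.isEmpty();
        }

        public static void main(String[] args) {
            // The upper automaton of Figure 3.4: a chain q0 ... q5 accepting only "ABBCD".
            FSA f = new FSA();
            f.start = "q0";
            f.accepting.add("q5");
            String[] symbols = { "A", "B", "B", "C", "D" };
            for (int i = 0; i < symbols.length; i++)
                f.addTransition("q" + i, symbols[i], "q" + (i + 1));
            System.out.println(f.accepts(Arrays.asList("A", "B", "B", "C", "D"))); // true
            System.out.println(f.accepts(Arrays.asList("A", "B", "C", "D")));      // false
        }
    }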

Figure 3.4: Two automata that accept input sequence “ABBCD”. The upper automaton accepts only this sequence, whereas the lower automaton accepts all non-empty sequences that contain only letters A, B, C or D. (The upper automaton is a chain of states q0 to q5; the lower automaton has two states whose transitions are labeled A, B, C, D.)

How can we transform an execution trace into an FSA? First, we formulate the FSA learning problem as follows: Given a set I of input traces, create an FSA that accepts all traces in I. Formulated this way, the FSA learning problem corresponds to the problem of learning a language based on a set of examples in the language. Fortunately, the problem of learning regular languages is well understood and has been thoroughly investigated. The seminal work on learnability by Mark Gold [48] proves that it is not possible to learn a regular language solely from a finite set of examples. According to the proof by Gold, the main problem is that the learner does not know when it over-generalizes. To avoid over-generalization, the learner would require an informant, i.e. an oracle that can be queried if a given sequence is part of the language or not. However, in our scenario, there is no such informant and we therefore have to rely on positive examples only. As a consequence, it is not possible to learn a minimal automaton that only accepts all possible behaviors of a program solely from a set of traces.

For example, assume we have an alphabet Σ = {A, B, C, D} and an example trace t = ⟨A, B, B, C, D⟩. Two automata that both accept t as input are depicted in Figure 3.4. The upper automaton accepts only t as input, whereas the lower automaton accepts all non-empty sequences of characters in Σ. While the first automaton is a precise representation of t, it will not recognize traces that are similar to t. Thus, this automaton is useless to find traces similar to t. On the other hand, the second automaton is too general, as it also accepts sequences that are totally unrelated to t.

In practice, the level of precision depends on the application. Many applications are interested in finding an automaton that provides a compact representation of the input traces, but also permits slight deviations in order to identify similar behavior.

3.4.2 Software Process Models

Cook and Wolf [23] investigate the problem of discovering a software process model from a trace of process events. Events include any activity in the process, such as the creation of a document or the design of a component. As Cook and Wolf point out, such processes are often not stated explicitly, but rather manifest themselves in the trace of activities. A tool that makes the process explicit by extracting a model from the activity trace makes it possible to look for errors in the process and to compare it to other process models. Although software process models are slightly different from software execution models, the underlying learning problem is the same.

Cook and Wolf were the first to investigate approaches that derive finite state automata from traces. In their work, the authors present three different methods for extracting automata, ranging from purely statistical to algorithmic approaches. We focus on the algorithmic approach, as it is also used by several other FSA-based execution models [1, 70]. The k-tail approach devised by Cook and Wolf3 assumes that a state is defined by the set of different future behaviors that can occur after it. The future of a prefix is captured by the strings of up to k tokens that may follow it in the trace. The parameter k controls the amount of temporal information maintained in the automaton: for larger values of k, the automaton contains more temporal information. The k-tail algorithm is defined as follows4:

Definition 8 (K-Tail) Let I be the set of sample strings, and let Σ be the alphabet of tokens that make up the strings in I. Let P be the set of all prefixes of strings in I, including the full strings in I; thus every p ∈ P is a valid prefix for some subset of the strings in I. Let p × t denote the string consisting of a prefix p followed by a tail t. Finally, let Tk be the set of all strings composed from Σ of length k or less. An equivalence class E is a set of prefixes such that

∀(p, p′) ∈ E, ∀t ∈ Tk : p × t ∈ P ↔ p′ × t ∈ P

Every prefix string is part of at least one equivalence class. The equivalence classes are mapped to states in the resulting automaton. For a given prefix p mapped to state Ei and a token σ ∈ Σ, the resulting automaton has a transition to all states to which the extended prefix p × σ belongs. An example of an automaton produced by the k-tail algorithm is depicted in Figure 3.5. Cook and Wolf made several improvements to the original algorithm that yield better results in the presence of noise.

3 This work is based on earlier work by Biermann and Feldman [11]. We use the description given by Cook and Wolf.
4 This description is taken from [23].


Figure 3.5: Input sequence and automaton generated by the improved k-tail algorithm by Cook and Wolf [23] with k = 2.

In an evaluation, the authors found that the k-tail method generally produces good models in the sense that the inferred automaton reflects actual patterns of the process (e.g. sequencing, iteration), and that there are no invalid states or transitions.
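The following sketch is a direct, illustrative JAVA reimplementation of the construction from Definition 8 (it is not Cook and Wolf's tool): traces are lists of tokens, prefixes with identical k-tails are merged into one state, and the improvements for noisy traces mentioned above are omitted.

import java.util.*;

// Merge prefixes with identical k-tails into states (Definition 8).
class KTail {

    // P: all prefixes of the sample strings, including the full strings.
    static Set<List<String>> prefixes(Collection<List<String>> traces) {
        Set<List<String>> result = new HashSet<>();
        for (List<String> trace : traces)
            for (int i = 0; i <= trace.size(); i++)
                result.add(new ArrayList<>(trace.subList(0, i)));
        return result;
    }

    // The k-tail of prefix p: all continuations t with |t| <= k such that p.t is in P.
    static Set<List<String>> kTail(List<String> p, Set<List<String>> P, int k) {
        Set<List<String>> tails = new HashSet<>();
        for (List<String> q : P)
            if (q.size() >= p.size() && q.size() <= p.size() + k
                    && q.subList(0, p.size()).equals(p))
                tails.add(new ArrayList<>(q.subList(p.size(), q.size())));
        return tails;
    }

    // Returns the transition relation: state -> (token -> successor states).
    static Map<Integer, Map<String, Set<Integer>>> build(Collection<List<String>> traces, int k) {
        Set<List<String>> P = prefixes(traces);

        // Prefixes with the same k-tail form one equivalence class, i.e. one state.
        Map<Set<List<String>>, Integer> classIds = new HashMap<>();
        Map<List<String>, Integer> stateOf = new HashMap<>();
        for (List<String> p : P) {
            Set<List<String>> tail = kTail(p, P, k);
            Integer id = classIds.get(tail);
            if (id == null) {
                id = classIds.size();
                classIds.put(tail, id);
            }
            stateOf.put(p, id);
        }

        // For every prefix p and token sigma with p.sigma in P, add a transition.
        Map<Integer, Map<String, Set<Integer>>> delta = new HashMap<>();
        for (List<String> q : P) {
            if (q.isEmpty()) continue;
            List<String> p = q.subList(0, q.size() - 1);
            String sigma = q.get(q.size() - 1);
            delta.computeIfAbsent(stateOf.get(p), s -> new HashMap<>())
                 .computeIfAbsent(sigma, s -> new HashSet<>())
                 .add(stateOf.get(q));
        }
        return delta;
    }
}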

3.4.3 Extended Finite State Machines

Lorenzoli et al. [70] present an approach that extends the finite state automata presented in the previous section with information about parameter values. To that end, Lorenzoli et al. use DAIKON (see Section 3.5) to derive abstract predicates over the parameter values of different method invocations. The resulting automata, called extended finite state machines, capture the interplay between parameters and temporal patterns. An extended finite state machine is defined as follows (see [70]):

Definition 9 (Extended Finite State Machine (EFSM)) For a given set of methods5 M, a set of parameter identifiers U, and a set of global values V, an extended finite state machine is a tuple (S,T,s0,sF) where S is a set of anonymous states, T is the set of transitions, s0 ∈ S is the start state and sF ⊆ S is the set of accepting states. A transition t ∈ T is a tuple (s,s′,m,P) where s,s′ ∈ S are the source and destination states of the transition, m ∈ M is the method that was invoked, and P is a set of predicates over the values of parameters and global variables associated with m.

An example EFSM is depicted in Figure 3.6. As the figure shows, the main difference to the approach by Cook and Wolf (Figure 3.5) is that transitions are labeled with method invocations and contain predicates over parameter values.

5 In this thesis, we use the word method for both procedures and functions.


Figure 3.6: An example extended finite state machine. Predicates over parameters and global values are separated by a horizontal line (parameters above and global values below the line).

In the example, predicates over parameters are separated from predicates over global variables by a horizontal line. Lorenzoli et al. propose an extension to the k-tail algorithm called gk-tail that mines EFSMs in three steps:

Merge Input Equivalent Traces In the first step, the algorithm combines traces with similar method call sequences. The intuition behind this step is that such traces often represent the same behavioral pattern.

Generate Predicates After that, the approach abstracts the values of parameters and global variables into predicates using the invariant detection engine of DAIKON (see Section 3.5). In the example in Figure 3.6, DAIKON might infer that the port variable always has a value larger than zero. The derived predicates summarize the preconditions necessary to invoke the method.

Merge Equivalent States The final step uses the k-tail algorithm presented in the previous section to merge states based on their k future states. The precision of this step is again controlled by the value of k.

Lorenzoli et al. have used their algorithm to mine EFSMs for several open source projects. To evaluate the performance of their tool, they investigated in how many cases the EFSMs were able to capture interactions that cannot be captured by finite state automata or invariants in isolation. The results of a preliminary evaluation with five open source projects show that EFSMs capture new properties for four out of five projects.
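A transition of such an EFSM maps naturally onto a small value class. The sketch below is only illustrative: states are simplified to integers, predicates to strings, and the example predicates in the usage comment are hypothetical rather than taken from Lorenzoli et al.'s implementation.

import java.util.Set;

// An EFSM transition (s, s', m, P) from Definition 9, with states as integers,
// the method as a string, and predicates as DAIKON-style textual constraints.
class EfsmTransition {
    final int source;             // s
    final int target;             // s'
    final String method;          // m
    final Set<String> predicates; // P, e.g. {"port > 0", "socket == null"}

    EfsmTransition(int source, int target, String method, Set<String> predicates) {
        this.source = source;
        this.target = target;
        this.method = method;
        this.predicates = predicates;
    }
}

// A hypothetical transition in the spirit of Figure 3.6:
// new EfsmTransition(0, 1, "open()", Set.of("port > 0", "socket == null"));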


Figure 3.7: An object usage specification for the JAVA Iterator and Iterable classes. The edge labels denote the number of objects for which the usage specification contains the edge.

3.4.4 Object Usage Specifications

The approaches discussed in the previous sections all use automata with anonymous states. States in these approaches carry no semantics that would allow telling them apart. This lack of semantics is the reason why these approaches have to use sophisticated learning algorithms to deduce automata from execution traces. In this section, we present an automata-based approach that uses states labeled with the names of methods. The advantage of having identifiable states is that creating the automaton is significantly easier.

The approach by Pradel et al. [83] mines object usage specifications from an execution trace consisting solely of start and end of method events. In contrast to the approaches presented in the previous sections, object usage specifications are specifically targeted at object-oriented languages. The approach is centered on the concept of object usage specifications, which describe legal ways to use one or more objects together. An example usage specification for the JAVA Iterator and Iterable classes is depicted in Figure 3.7. In the first step, the Iterator is created by invoking the Iterable's iterator() method. Afterwards, the iterator is traversed by alternating calls to hasNext() and next(). The transition labels indicate how often one method was invoked after another method. Transitions that end in state End indicate the last method that was invoked on the object before the garbage collector removed the object.

To mine such models, the approach needs to identify objects that collaborate. This is challenging, since execution traces typically contain thousands of object creations.

To solve this problem, Pradel et al. only investigate possible collaborations for objects used within the same method invocation. The key assumption behind this idea is that methods implement coherent functionality, and therefore objects used within the same invocation are likely to be related. This idea greatly reduces the number of possible collaborations to investigate, and therefore makes the approach feasible in practice. Pradel et al. define an object collaboration as follows:

Definition 10 (Object collaboration) Let a method call be a pair (o,s) of the receiver object o and the called method's signature6 s. A collaboration is an ordered sequence

S = ⟨(o1,s1), ..., (on,sn)⟩

of calls issued within the execution of a method (o_outer, s_outer). The objects

O = {o | ∃(o,s) ∈ S}

are said to collaborate.

Once the approach has identified possible collaborations, it proceeds by grouping related examples based on the set of methods in each collaboration. At this stage, the number of collaborations to investigate is still very high. Pradel et al. therefore discard collaborations that involve more than ten objects, and afterwards rank the remaining collaborations by the number of different code locations. The intuition behind the ranking is that a pattern is more likely to be valid if it is used in different places across the program.

In the final step, each collaboration is transformed into a finite state automaton as follows: For each method si in the collaboration, a new state labeled si is added to the FSA. For each pair of consecutive method calls si, si+1, a transition from state si to state si+1 is created, or, if the transition already exists, its counter is incremented.

In comparison to the approaches presented earlier, the step that creates the FSA is much simpler for two reasons: First, object usage specifications use labeled states, which makes state identification much easier. Second, Pradel et al. use only consecutive method calls for transitions, which limits the degree of ordering information available in object usage specifications. If we provided the k-tail algorithm from Section 3.4.2 with the same input data, it would be able to detect more complex temporal relationships between method calls. In contrast to extended finite state machines (Section 3.4.3), object usage specifications only use control-flow information.

6 In Java, a method signature encodes the parameter types of a method.
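The construction just described can be sketched in a few lines of JAVA. The class below is illustrative (the names are ours, not Pradel et al.'s); it records, for one collaboration, how often each method directly follows another, including the START and End pseudo-states of Figure 3.7.

import java.util.*;

// States are labeled with method signatures; transition counters record how often
// one method was invoked directly after another (cf. Figure 3.7).
class UsageSpecification {
    static final String START = "START", END = "End";
    // source state -> (target state -> number of observations)
    private final Map<String, Map<String, Integer>> transitions = new HashMap<>();

    void addCollaboration(List<String> calls) {
        if (calls.isEmpty()) return;
        count(START, calls.get(0));
        for (int i = 0; i + 1 < calls.size(); i++)
            count(calls.get(i), calls.get(i + 1));
        count(calls.get(calls.size() - 1), END); // last call before garbage collection
    }

    private void count(String from, String to) {
        transitions.computeIfAbsent(from, s -> new HashMap<>())
                   .merge(to, 1, Integer::sum);
    }
}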

3.4.5 Markov Chains

Several automata-based approaches [15, 109, 56, 23] use Markov chains to augment automata with probabilistic information. In these models, transitions are labeled with the probability that the transition occurs. With this information it is possible to calculate the probability of a path through the automaton, or to cut off edges that have a very high or (depending on the application) a very low probability of occurring. The theoretical basis of these calculations is the Markov property, named after the Russian mathematician Andrey Markov. Intuitively, a Markov chain is a sequence of states where all information about past states is contained in the present state:

Definition 11 (Markov Chain, Markov Property) A sequence of random variables X1, X2, X3, ..., Xj has the Markov property if for all i ∈ {2, ..., j}:

ψ(Xi = x | X1 = x1, X2 = x2, ..., Xi−1 = xi−1) = ψ(Xi = x | Xi−1 = xi−1)

where ψ is the probability that an event occurs.

Markov chains can be visualized as directed graphs, with nodes representing different states of the system and edges labeled with transition probabilities. In the context of software execution models, states usually correspond to different memory states of the program. Markov models are often used to model systems with unknown properties whenever it can be assumed that these properties do not depend on historical information beyond the current state description. A simple example of a Markov chain is a random walk on the number line: The states of this system correspond to the entries on the number line (1, 2, 3, ...). At each state, the system advances to one of the neighboring numbers with equal probability. In this system, the transition probabilities depend only on the current state (the position on the number line) and not on the way the state was reached. The advantage of using Markov chains is that the theory makes it possible to calculate the statistical properties of the system given only the current state. Thanks to this property, Markov models are used in many different areas of research.

One example of a software execution model that uses Markov models is the approach by Bowring and colleagues [15]. The goal of this approach is to automatically label program runs as passing or failing. To that end, Bowring and colleagues learn Markov models based on branch profiles of a program execution. A branch profile captures how often each branch evaluated to true and false. The branch profile yields the probabilities for the transitions, while the structure of the model is derived from the control-flow graph. An example Markov model for the gcd method introduced earlier is depicted in Figure 3.8.

public static int gcd(int a, int b) {
    int r;
    do {
        r = a % b;
        if (r > 0) {
            a = b;
            b = r;
        }
    } while (r > 0);
    return b;
}

Figure 3.8: Example of a branch-based Markov model. Edge labels indicate the transition probabilities for each edge.

To distinguish passing from failing runs, Bowring and colleagues train separate classifiers on a priori labeled training runs using a technique called hierarchical clustering. Clustering arranges similar models into clusters that represent the same behavior. To determine the distance between two models, the approach uses transition probabilities to cut off transitions below a certain threshold. Finally, models in the same cluster are merged by accumulating the probabilities of shared edges. The output of the machine learner is a classifier that can be used to label unknown runs as passing or failing. The labeling step exploits the Markov property to calculate the probability of a new execution under each model in the classifier. If a model learned from passing runs yields the highest probability, the new execution is labeled as passing.

The authors evaluate the technique with a medium-sized application that contains real faults. They report precision values of up to 0.97 for classifying over 13000 runs with a training set of up to 350 runs. The technique thus works very well in practice; note, however, that this does not prove that the Markov property actually holds for the branch-based model.
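The following sketch illustrates, under simplifying assumptions, how the Markov property is used for labeling: the probability of an observed state sequence under a model is the product of its transition probabilities, and a run receives the label of the model that assigns the highest probability. The class and method names are illustrative and do not reproduce Bowring et al.'s implementation, which additionally involves clustering and merging of models.

import java.util.*;

// A Markov model over named states with a class label such as "passing" or "failing".
class MarkovModel {
    final String label;
    // transition probabilities: state -> (successor state -> probability)
    final Map<String, Map<String, Double>> probability = new HashMap<>();

    MarkovModel(String label) { this.label = label; }

    // Probability of a state sequence; transitions never observed get probability zero.
    double probabilityOf(List<String> states) {
        double p = 1.0;
        for (int i = 0; i + 1 < states.size(); i++)
            p *= probability.getOrDefault(states.get(i), Map.of())
                            .getOrDefault(states.get(i + 1), 0.0);
        return p;
    }

    // Label an unknown run with the class of the most likely model.
    static String classify(List<String> run, List<MarkovModel> classifiers) {
        return classifiers.stream()
                .max(Comparator.comparingDouble((MarkovModel m) -> m.probabilityOf(run)))
                .map(m -> m.label)
                .orElse("unknown");
    }
}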

3.4.6 Summary

To summarize, many software execution models use finite state automata as they offer a compact way to represent recurrent program behavior. They can be grouped into approaches with labeled states and approaches with unlabeled states. Learning automata with unlabeled states is equivalent to learning a regular language from a set of positive examples. Theoretical bounds prevent learning a minimal automaton that accepts exactly the whole language, which is why learning models with unlabeled states is a trade-off between over-generalization and over-specialization. Some approaches interpret transition counts as probabilities and assume an underlying Markov model to calculate the probability of an event sequence given a certain model.

3.5 Invariants

The majority of software execution models use control-flow information only. However, control-flow is only a small part of the information available at runtime (see Section 3.1). Why is it that so many approaches do not take memory state into account? We believe that this is due to two reasons: First, there is a large number of variables that could be observed. Tracing all of them, as shown in Section 3.1, creates gigabytes of data and is therefore infeasible in practice. Second, in contrast to variables related to control-flow (e.g. statement or instruction counters), the range of values for arbitrary variables is much larger. Also, there is usually no predefined semantics for individual values. Together, these two problems make it difficult to produce concise models based on the values of variables.

The first approach that tries to alleviate these problems is the DAIKON tool by Michael Ernst and colleagues [39]. DAIKON automatically infers dynamic program invariants from executions. A program invariant is a property of one or more variables that holds at a program point, i.e. at a specific point during the execution of a program. For example, one invariant might be that at the start of method foo, variable x always has a value larger than zero. Invariants may also express relationships between variables, such as "the value of x always equals the value of y". Internally, DAIKON traces the values of all interesting variables7. For each variable, the tool tries to match the observed values against a predefined set of invariant templates. For example, one invariant template states that all values of a variable are larger than zero. A subset of the invariant types supported by DAIKON can be found in Table 3.2. This part, called the invariant deduction engine, is by far the most complex part of DAIKON. It uses sophisticated techniques to reduce the number of invariants examined and to find the most general invariant for each variable. Together, the set of invariants DAIKON deduces from a program run forms a software execution model based solely on the memory state at different points of the execution.

7 The set of interesting variables is specified by the user. By default, DAIKON includes all variables.

Invariant    Example           Description
OneOf        x one of {1,2}    The variable only takes a few distinct values.
LowerBound   x ≥ 0             The variable does not fall below a certain lower bound.
Equality     x == y            Two variables always hold the same value. DAIKON computes sets of
                               equal variables so that equality can also be detected for larger
                               groups of variables.
NonZero      x ≠ 0             The variable never takes the value zero.

Table 3.2: A subset of invariant types provided by DAIKON. In total, DAIKON supports over one hundred different invariants.

DAIKON alleviates the problem of large variable value ranges by mapping values to a set of predefined invariant types. Combining different types of invariants and different variables allows for a flexible categorization, which makes DAIKON a valuable tool for extracting properties of what a program execution achieves. However, in contrast to other software execution models, the set of invariants does not capture the temporal ordering of events. Thus, errors related to the sequencing of events cannot be captured by this model.
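The idea behind such invariant templates can be illustrated with a small sketch: a template is reported for a variable only if it holds for every observed value. This is a deliberately simplified illustration; DAIKON's actual engine supports over one hundred templates and sophisticated machinery to suppress redundant invariants.

import java.util.ArrayList;
import java.util.List;

// Matches the observed values of one numeric variable against a few templates
// in the spirit of Table 3.2.
class InvariantTemplates {
    static List<String> match(String variable, List<Double> observed) {
        List<String> invariants = new ArrayList<>();
        if (observed.stream().allMatch(v -> v != 0.0))
            invariants.add(variable + " != 0");              // NonZero
        if (observed.stream().allMatch(v -> v >= 0.0))
            invariants.add(variable + " >= 0");              // LowerBound
        List<Double> distinct = observed.stream().distinct().sorted().toList();
        if (distinct.size() <= 3)
            invariants.add(variable + " one of " + distinct); // OneOf
        return invariants;
    }
}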

3.6 Conclusions

In this chapter, we have provided an overview of existing software execution models. In an initial experiment, we saw that, even for short runs, observing all available dynamic information produces gigabytes of data. To be useful in practice, a software execution model needs to reduce the amount of data. This can be done either by restricting the model to a subset of all information (e.g. control-flow only), or by using abstraction.

Both program spectra and call sequences use a simple counting approach to reduce the amount of data. As a consequence, these models do not contain information about the order of events, which is a vital part of dynamic behavior. A large group of approaches improves on this problem by using finite state automata to represent dynamic behavior. For automata with unlabeled states, it is necessary to use sophisticated learning algorithms in order to derive the automaton from an execution trace.

All of the aforementioned approaches put an emphasis on control-flow information, since there the range of values is limited. In contrast, the DAIKON tool uses a set of templates to deduce invariants that characterize the range of values of a variable. Such invariants provide a good abstraction over the values of different variables. However, a drawback of this approach is that invariants do not maintain the temporal ordering of events.

Overall, we found it difficult to provide a classification of software execution models. For most approaches, there is no theoretical foundation. Instead, many models are simply based on intuition. Most papers that introduce a new software execution model validate the approach by showing that the model yields good results for an application.

To summarize, the state of the art in software execution models is a set of approaches ranging from simple statistical representations to complex invariant mining. All approaches focus either on control-flow or on memory state, but there is no model that combines these two aspects of dynamic program behavior.

Chapter 4

Object Behavior Models

The previous chapter presented a survey of existing software execution models. In this chapter, we introduce a new type of software execution model. In contrast to most existing approaches, which can be applied to arbitrary programming languages, our approach is specifically targeted at object-oriented programming languages. In the object-oriented world, code and variables that are concerned with implementing an entity are grouped together in so-called objects. Objects communicate with each other by invoking methods.

We leverage concepts from the object-oriented paradigm to devise a new type of software execution model called object behavior models. An object behavior model describes the behavior of an object at runtime. Such models are finite state automata where states correspond to different states of the object, and transitions occur due to method invocations on the object. To characterize different states, we use the values of fields and the return values of so-called inspector methods. An inspector reveals information about an object's state to the outside world (see below). Figure 4.1 shows an object behavior model for the JAVA Vector class1. The model consists of three different states:

”start” is the initial state of the object right after it was created (and before the constructor was invoked).

”isEmpty():true” represents an empty Vector, i.e. the list does not contain elements.

1 Vector is an array-based implementation of a list. It is part of the Java standard libraries.



Figure 4.1: An object behavior model for the Vector class. The model relates method calls to changes in the state of the Vector.

”isEmpty():false” correspondingly means that the Vector contains at least one element.

After adding an element to an initially empty list, the Vector is no longer empty. In the model, this is reflected by a transition from state isEmpty():true to state isEmpty():false labeled with add(). The remaining methods of Vector such as remove() or clear() change the Vector as we would expect from their names. Except for one transition, all methods invoked on the object terminate normally. The exception occurs if we invoke remove() on an empty Vector. In that case, the implementation of Vector raises an exception since it is not possible to remove an element from an empty list. In the model, this is represented by a self-loop in state isEmpty():true labeled with !remove(). The bang (!) prefix is a convention used to mark method invocations that raised exceptions. The object behavior model for Vector captures both the behavior and the usage of instances of Vector in a comprehensive way:

Temporal ordering The model maintains the temporal ordering of method invocations. For example, the first method that has to be invoked is the constructor ⟨init⟩(), which is reflected as a single transition from start to isEmpty():true.

Effects of method invocations The model captures the effects of a method invocation on the state of an object. For example, the first call to add() changes the value of isEmpty() from true to false. In addition, the model also shows that after a call to add(), the Vector is never empty. Such post-conditions represent important information that can be helpful in detecting bugs.

Exceptional behavior Finally, the model explicitly represents exceptional behavior as specially marked transitions (!remove() in state isEmpty():true). This makes it possible to distinguish normal and exceptional calls to the same method, and the different effects of such invocations.

Existing software execution models [23, 1, 83, 15, 109, 70, 56] also represent temporal information using finite state automata. However, ours is the first approach that explicitly combines memory state and control-flow information. This is an important advantage, as it relates method invocations to changes to an object's state. This information is vital for understanding how a class works. Another advantage is that our approach does not need inference algorithms such as the k-tail method [11] to extract the automaton from the execution trace. Instead, we observe the values of fields and inspectors at the beginning and the end of each method invocation. Thus, we can transform the execution trace into a sequence of states and transitions (see Chapter 5). Deducing a finite state machine from this information is unambiguous and straightforward.

In the example in Figure 4.1, the state of a Vector consists solely of the return value of isEmpty(). However, Vector also contains other inspector methods and fields. For example, one of these fields, called size, holds the number of entries present in the Vector. If we were to include size in the state representation, the resulting model would contain at least one state for each different size of the Vector. For a medium-sized Vector that holds up to 1000 elements, the resulting model would consist of more than 1000 states. Clearly, such a model would not be helpful for understanding how instances of Vector should be used.

For the example in Figure 4.1, we manually chose to use only isEmpty() to represent the state of a Vector. An alternative way that produces the same model is to use the size field, but to distinguish only positive values, negative values, and zero. This way, we abstract the concrete state of Vector to an abstract state. This abstraction is what makes the model concise and meaningful. For Vector, our abstraction was influenced by a priori knowledge about the way Vectors work. However, if we want to extract object behavior models for classes with unknown behavior, such a priori knowledge is not available. To mine models for these classes, we need a general abstraction method that reduces the size of the models but retains important properties. Finding a good abstraction function is the most important problem we need to solve in order to fully automatically mine meaningful and concise object behavior models.

In the remainder of this chapter, we give a definition of object behavior models and discuss our approach for state abstraction. Xie et al. [113] concurrently developed a similar approach that also uses fields to characterize object states. However, their work does not investigate abstraction and hence the resulting models are very large.

4.1 Identifiers

We start with a definition of field and method identifiers. Since we initially developed our approach for JAVA, parts of the following definitions are JAVA specific. However, the underlying concepts are generic and it should be easy to adjust them to any other object-oriented language. We define method and field identifiers as follows:

Definition 12 (Method Identifier) A method identifier mid identifies a method. It is defined as a tuple mid = (c,n,s) where c is the fully qualified class name, n is the method name, and s is the method signature. c and n are defined according to the JAVA virtual machine specification [68].

Definition 13 (Field Identifier) A field identifier fid identifies a field. It is defined as a tuple fid = (c, f ) where c is the fully qualified class name, and f is the name of the field.

Example 3 (Field and Method Identifier) In the Vector example in Figure 4.1, the identifier for method isEmpty() is (java/util/Vector, isEmpty, ()Z)2. The identifier for field size is (java/util/Vector, size).
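Written out as JAVA value classes, the two identifier types look as follows. This is an illustrative sketch; ADABU's actual classes may be organized differently, and equals() and hashCode() (needed when identifiers serve as map keys) are omitted for brevity.

// Method identifier (c, n, s) from Definition 12.
final class MethodIdentifier {
    final String className;  // fully qualified class name, e.g. "java/util/Vector"
    final String methodName; // e.g. "isEmpty"
    final String signature;  // JVM signature, e.g. "()Z"

    MethodIdentifier(String className, String methodName, String signature) {
        this.className = className;
        this.methodName = methodName;
        this.signature = signature;
    }
}

// Field identifier (c, f) from Definition 13.
final class FieldIdentifier {
    final String className; // e.g. "java/util/Vector"
    final String fieldName; // e.g. "size"

    FieldIdentifier(String className, String fieldName) {
        this.className = className;
        this.fieldName = fieldName;
    }
}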

4.2 Inspectors

In JAVA, an instance of a class c consists of all accessible methods and fields defined in c or any of its super types3. Methods invoked on an instance of c usually operate on c’s fields. Hence, these fields are an essential part of the object’s state, which is why we also use them to represent an object’s state in a behavior model. However, fields are not the only way to access an object’s state. Many classes also provide state information to clients using so-called inspector methods.

Definition 14 (Inspector) An inspector is a side-effect free method that takes no pa- rameters and has a return value other than void.

2 Java encodes primitive boolean values using the letter "Z".
3 In Java, subclasses cannot access fields or methods that are declared private.

Example 4 (Inspector) Method isEmpty() in the Vector example in Figure 4.1 is an inspector method.

In a number of cases, inspectors simply return the values of fields4. If the field is also part of the state representation, including the inspector does not add new information. However, in some cases an inspector returns information that is not present in fields. For example, the isEmpty() method is an abstraction over the size field, namely whether size == 0. This abstraction is important for clients of Vector, and hence the developer decided to provide an inspector method for it. For object behavior models, we can leverage inspectors to access such abstractions.
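Candidate inspectors can be collected via reflection, as the following sketch illustrates. Note that reflection can only check the syntactic part of Definition 14 (no parameters, non-void return type); whether a candidate is really side-effect free requires a separate analysis. The names used here are illustrative.

import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Collects inspector *candidates*: parameterless, non-void instance methods.
class Inspectors {
    static List<Method> candidates(Class<?> type) {
        List<Method> result = new ArrayList<>();
        for (Method m : type.getMethods()) {
            boolean noParameters = m.getParameterCount() == 0;
            boolean returnsValue = m.getReturnType() != void.class;
            boolean instanceMethod = !Modifier.isStatic(m.getModifiers());
            if (noParameters && returnsValue && instanceMethod)
                result.add(m);
        }
        return result;
    }
}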

4.3 Value Access Paths

Now that we have decided which entities to include in the model, the next question is how to represent the value of a field or an inspector. For values of primitive type such as integers, doubles or booleans, we can resort to the string representations provided by the JAVA virtual machine. However, for object types (that is, fields that hold object references or methods that return objects) we have two different options: First, we could also use string representations as provided by the toString() methods. Unfortunately, as some classes do not provide unique object descriptions in toString(), this option cannot be used. The second option is to recursively include the state of referenced objects. Thus, a Vector's state would also include the state of all objects stored within. In that case, including recursive state is not really helpful, as the state of objects in the list is not relevant for understanding how Vector works. However, if the contained object is an essential part of the containing object (for example, if there is a part-of relationship between the objects), including recursive state provides valuable information. Hence, we decided to provide the possibility to include recursive state in object behavior models. To be able to identify fields of recursively included objects, we introduce the concept of value access paths.

Definition 15 (Value access path (VAP)) For a given set of field identifiers F and a set of method identifiers M, a value access path p of length n is a sequence

p = ⟨e1, e2, ..., en⟩, ∀j, 1 ≤ j ≤ n : ej ∈ F ∨ ej ∈ M

of field and method identifiers that describes the path to obtain a value.

4 Such methods are commonly referred to as getter methods.

Figure 4.2: UML schema for a hypothetical car management application. A car consists of one engine and several wheels; the Car class offers an inspector getEngine(), and the Engine class has a boolean field isRunning.

Example 5 (Value Access Path) Figure 4.2 shows the UML diagram of a hypothetical application that models cars. If we recursively extract the state for an instance of Car, the value access path for the isRunning field of the engine retrieved via inspector getEngine() would be

⟨(Car, getEngine, ()LEngine;), (Engine, isRunning)⟩

A value access path allows for arbitrary combinations of inspectors and fields. It uniquely describes the steps to obtain the value at the end of the path. Value access paths are relative to an object and always start with a reference to this.5
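The following sketch shows how a value access path could be evaluated against a concrete object using reflection. It is illustrative only: path elements are encoded as plain strings (with a trailing "()" marking inspector calls), superclass fields are not searched, and error handling is reduced to a minimum.

import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.List;

// Walks a value access path (Definition 15) starting from a root object.
class ValueAccessPathEvaluator {
    static Object evaluate(Object root, List<String> path) throws Exception {
        Object current = root;
        for (String element : path) {
            if (current == null)
                return null; // a null reference means the path cannot be followed further
            Class<?> type = current.getClass();
            if (element.endsWith("()")) { // inspector invocation, e.g. "getEngine()"
                Method inspector = type.getMethod(element.substring(0, element.length() - 2));
                current = inspector.invoke(current);
            } else {                       // field access, e.g. "isRunning"
                Field field = type.getDeclaredField(element);
                field.setAccessible(true);
                current = field.get(current);
            }
        }
        return current;
    }
}

// For the hypothetical classes of Figure 4.2:
// Object running = ValueAccessPathEvaluator.evaluate(car, List.of("getEngine()", "isRunning"));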

4.4 Object States

A state in an object behavior model is a collection of value access paths together with their values. In the remainder of this chapter, we will use the letter D to refer to the set of values a field or inspector may take (in other words, the domain of a field or inspector). It consists of string representations of all possible values of primitive types in JAVA. An object state is a function that maps value access paths to values:

5 In Java, instance methods may access the associated object via the this reference. In other programming languages, this is also called self or me.

Definition 16 (Object State) For a given set of field identifiers F and method identifiers M, let P be the set of value access paths over F and M. An object state is a function P → D that maps value access paths in P to values in D.

Example 6 (Object State) For the Vector example, the object state after the constructor call is

{(⟨(java/util/Vector, isEmpty, ()Z)⟩, true)}

Figure 4.3 shows a more complex example describing an instance of IMAPProtocol, a class that implements communication with a mail server. Here, the object state right after the constructor call is

{ (⟨(org/columba/ristretto/imap/IMAPProtocol, selectedMailbox)⟩, null),
  (⟨(org/columba/ristretto/imap/IMAPProtocol, socket)⟩, null),
  (⟨(org/columba/ristretto/imap/IMAPProtocol, state)⟩, NOT_CONNECTED) }

It is possible to mix value access paths of different lengths in one object state. This is necessary since object references may be null, and thus we may be unable to extract all steps of a path.

4.5 Object Behavior Models

We are now ready to give a definition of object behavior models.

Definition 17 (Object Behavior Model) For a given set of method identifiers M and field identifiers F, an object behavior model is a tuple (S, s0, T) where S is a set of object states with value access paths over M and F, s0 ∉ S is the starting state of the model, and T is the set of transitions. A transition is a tuple (ss, st, I) where ss ∈ S ∪ {s0} is the source state, st ∈ S is the target state, and I ∈ P(M) is the set of methods that label the transition.

Example 7 (Object Behavior Model) A textual representation of the model visualized in Figure 4.1 is available in Figure 4.4.


Figure 4.3: An object behavior model for the IMAPProtocol class. This class implements communication with an IMAP server and supports authentication and encryption.

s0 = start

S = { {(⟨(java/util/Vector, isEmpty, ()Z)⟩, false)},
      {(⟨(java/util/Vector, isEmpty, ()Z)⟩, true)} }

T = { ( start,
        {(⟨(java/util/Vector, isEmpty, ()Z)⟩, true)},
        {(java/util/Vector, <init>, ()V)} ),

      ( {(⟨(java/util/Vector, isEmpty, ()Z)⟩, true)},
        {(⟨(java/util/Vector, isEmpty, ()Z)⟩, false)},
        {(java/util/Vector, add, (Ljava/lang/Object;)V)} ),

      ( {(⟨(java/util/Vector, isEmpty, ()Z)⟩, true)},
        {(⟨(java/util/Vector, isEmpty, ()Z)⟩, true)},
        {(java/util/Vector, !remove, (Ljava/lang/Object;)Z)} ),

      ( {(⟨(java/util/Vector, isEmpty, ()Z)⟩, false)},
        {(⟨(java/util/Vector, isEmpty, ()Z)⟩, false)},
        {(java/util/Vector, add, (Ljava/lang/Object;)V),
         (java/util/Vector, remove, (Ljava/lang/Object;)Z)} ),

      ( {(⟨(java/util/Vector, isEmpty, ()Z)⟩, false)},
        {(⟨(java/util/Vector, isEmpty, ()Z)⟩, true)},
        {(java/util/Vector, remove, (Ljava/lang/Object;)Z),
         (java/util/Vector, removeAll, ()V),
         (java/util/Vector, clear, ()V)} ) }

Figure 4.4: A textual representation of the object behavior model visualized in Figure 4.1.
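Because each observed method invocation comes with the abstracted object state before and after the call, building the model of Definition 17 is indeed unambiguous, as the following simplified sketch illustrates. States and methods are reduced to strings here, whereas ADABU works with the identifier and state types defined above; the class name is our own illustration.

import java.util.*;

// Accumulates an object behavior model from observed (state before, method, state after) events.
class BehaviorModel {
    static final String START = "start";
    final Set<String> states = new HashSet<>();
    // (source state, target state) -> methods labeling this transition
    final Map<List<String>, Set<String>> transitions = new HashMap<>();

    void methodInvoked(String stateBefore, String method, String stateAfter) {
        states.add(stateAfter);
        if (!stateBefore.equals(START))
            states.add(stateBefore);
        transitions.computeIfAbsent(List.of(stateBefore, stateAfter), k -> new HashSet<>())
                   .add(method);
    }
}

// For the Vector example of Figure 4.1:
//   model.methodInvoked("start", "<init>()", "isEmpty():true");
//   model.methodInvoked("isEmpty():true", "add()", "isEmpty():false");
//   model.methodInvoked("isEmpty():true", "!remove()", "isEmpty():true");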

4.6 Model Depth

As stated in the previous section, the state representation of an object o may also encompass fields of objects referenced by o. If state extraction is fully recursive, it may happen that the state of a single object encompasses large parts or even the whole state of the program. In most cases, such a model would be far too large to provide meaningful information. To prevent such cases, we limit the depth to which recursive state is included and introduce a parameter called model depth.

Definition 18 (Model Depth) For an object behavior model m of object o, model depth depth(m) denotes the maximum length of value access paths in m. The minimum model depth of 1 includes only fields and inspector methods of o.

Example 8 (Model Depth) The examples in Figures 4.1 and 4.3 were extracted with model depth 1. Figure 4.5 shows a model of depth 2 for the PersistenceManager class of the APACHE project (see below).

By changing the model depth, we can adjust the level of detail in a model. For larger depths, the resulting models contain more details. Which depth to choose strongly depends on the application and the examined class. We therefore use a default depth of one, but allow the user to specify a different depth value for each model.

One important aspect when mining a model for an object o is that, for a depth larger than one, the set of methods used as transition labels may contain methods that are not part of o. This is best explained with an example: Figure 4.5 shows a model of depth two for the PersistenceManager class that belongs to the APACHE JDO project. This project provides a framework to automatically store and load objects in relational databases. In this framework, the PersistenceManager is responsible for managing how objects are stored in the database. Internally, PersistenceManager holds a reference to a Transaction object that synchronizes access to the database.

For a model depth of two, the state of PersistenceManager also encompasses fields of its Transaction object. Whenever a method changes the state of the transaction, this change also affects the state of the PersistenceManager. In the example, the model contains a transition for method tx.begin(), which is invoked on the Transaction object.6 Thus, the model captures the interplay between the PersistenceManager and the referenced Transaction, which is essentially a protocol that involves two objects of different types.

6 To be able to relate such transitions to the fields that hold the reference, we annotate them with the value access path of the object on which the method was invoked (in this case, tx for the Transaction object).


Figure 4.5: An object behavior model of depth two for class PersistenceManager of the APACHE JDO project. The manager uses a Transaction object to synchronize access to the database.

The extent to which such protocols are captured depends on the model depth, which can be controlled by the user.
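The following reflection-based sketch illustrates depth-limited state extraction: fields of the object itself yield paths of length one, and reference-typed fields are followed recursively until the model depth is exhausted. Inspector methods, abstraction, and arrays are omitted; the code is an illustration, not ADABU's extraction logic.

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Extracts (value access path, value) pairs up to the given model depth.
class StateExtractor {
    static Map<String, Object> extract(Object o, String prefix, int depth) {
        Map<String, Object> state = new LinkedHashMap<>();
        if (o == null || depth == 0)
            return state;
        for (Field f : o.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()))
                continue;
            f.setAccessible(true);
            try {
                Object value = f.get(o);
                String path = prefix.isEmpty() ? f.getName() : prefix + "." + f.getName();
                state.put(path, value); // the field value (or reference) itself
                boolean isReference = value != null
                        && !f.getType().isPrimitive() && !(value instanceof String);
                if (isReference)
                    state.putAll(extract(value, path, depth - 1)); // recursive state
            } catch (IllegalAccessException ignored) { }
        }
        return state;
    }
}

// extract(persistenceManager, "", 2) would include both the manager's own fields and the
// fields of its Transaction object (cf. Figure 4.5).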

4.7 State Abstraction

When extracting the state of an object, we store the values of fields or inspectors as strings7. Unfortunately, using concrete numerical values has a strong impact on the size of the generated models. For example, if a Vector model contains the concrete value of the size field, the resulting model will have a different state for each value of size. If the Vector grows to larger sizes, the model will become unmanageable and therefore useless. This problem gets even worse if a state contains several numerical variables. To cope with this problem, we use a state abstraction function that maps concrete states to abstract states.

Definition 19 (Abstraction Function) An abstraction function fabs : (P → D) → (P → D′) abstracts the values of a state (P → D) (cf. Definition 16) to a new state (P → D′). Usually, |D′| ≪ |D|, that is, the abstract state has a much smaller range.

To abstract a model, the abstraction function is applied to each state. As a result, some of the states become equivalent and are merged, thus resulting in a smaller model. Obviously, this process entails a loss of information. A good abstraction function makes models small but retains all important information. Finding a good abstraction function is a problem in itself. Initial experiments with different functions showed that it is not possible to find a function that produces optimal models for all classes.

7 For object references, we use a string representation of the object identifier (Section 5.1.5).

Thus, our goal was to find a function that performs well for most classes. Essentially, our problem boils down to finding a pattern behind a possibly large set of different values for a variable. Fortunately, there is existing work that solves the very same problem. The DAIKON tool (explained in Section 3.5) by Ernst et al. [39] mines dynamic invariants from the execution of programs. Internally, DAIKON traces all different values of a variable and tries to find abstract properties using a set of predefined invariant templates. Invariants generated by DAIKON have been used in many different approaches8, which gives rise to the belief that they can also be useful for object behavior models.

Essentially, our abstraction function (called DAIKON-INIT) uses a subset of the invariant types implemented in DAIKON. Table 4.1 shows which types of invariants we use: Boolean values remain unchanged, numerical values are categorized as less than, equal to, or larger than zero, and object references are either null or not. In some cases this classification is too coarse: Many classes that are internally organized as finite state automata (typically classes that implement protocols or streams) maintain state information using an integer field. These classes are of special interest, as the corresponding models usually are a good characterization of the implementation. Unfortunately, DAIKON-INIT per se treats all numerical values the same and abstracts them as less than, equal to, or larger than zero. We therefore developed a new abstraction function called DAIKON-ABS that improves DAIKON-INIT in two ways:

Limited value range fields If an integer field is only ever assigned constant values, it is very likely that these values are important and the field should therefore not be abstracted. To identify such fields, we run a conservative static analysis that finds most of them. The list of fields is then passed to DAIKON-ABS, which excludes them from abstraction. In the IMAPProtocol example (Figure 4.3), the state variable is a limited value range field and is therefore not abstracted.

Enumerations Since version 1.5, JAVA supports type-safe enumerations. Internally, each enumeration constant is an instance of a subclass of the Enum class. In a sense, enumeration fields are object fields with a limited value range and therefore should also not be abstracted. Our static analysis recognizes enumerations and excludes them from abstraction.

8 For a summary, see http://groups.csail.mit.edu/pag/daikon/pubs/.

Type        Values                 Example (concrete)   Example (abstract)
Objects     x = null, x ≠ null     bar = #4711          bar ≠ null
Numerical   x < 0, x = 0, x > 0    foo = 5.33           foo > 0
Boolean     x, ¬x                  foobar = false       foobar = false

Table 4.1: Abstractions used by DAIKON-INIT. Object references are either null or non-null, numerical values are less than, equal to, or larger than zero and boolean values remain unchanged.
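The following sketch illustrates the DAIKON-INIT abstraction of Table 4.1 as a simple JAVA function. The handling of limited value range fields is only hinted at via a flag, and the code is illustrative rather than ADABU's implementation.

// Maps a concrete field or inspector value to its abstract representation.
class StateAbstraction {
    static String abstractValue(Object value, boolean limitedValueRange) {
        if (limitedValueRange)
            return String.valueOf(value);      // keep concrete constants, e.g. protocol states
        if (value == null)
            return "null";
        if (value instanceof Boolean)
            return value.toString();           // boolean values remain unchanged
        if (value instanceof Number) {
            double d = ((Number) value).doubleValue();
            return d < 0 ? "< 0" : (d == 0 ? "= 0" : "> 0");
        }
        return "!= null";                      // any other (non-null) object reference
    }
}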

4.8 Conclusions

In this chapter, we have presented a definition of object behavior models. Our models are finite state automata where states represent different states of an object and method invocations cause transitions between states. In contrast to existing software execution models, our models include both memory state and control-flow information. As a consequence, object behavior models express both the temporal ordering of events and the effects of executing methods. This makes our approach a versatile model for capturing the runtime behavior of objects.

To be flexible, object behavior models allow using both the values of fields and the return values of inspectors to represent an object's state. Thus, we can derive models that provide different views of an object: A model based on inspectors solely uses information intentionally passed to clients in inspector methods. Such models provide an external view of the object and its behavior. A model based on fields, on the other hand, offers an internal view of the object. Different applications may prefer one type over the other.

In object-oriented designs, it often is the case that one object forms an essential part of another object (part-of relationship). To be able to capture models of such objects, we allow the state of referenced objects to be included recursively. The depth of the recursion is limited by the model depth parameter, which is specified by the user. This parameter makes it possible to extract models at different levels of granularity, depending on the concrete application.

The crucial point about our models is the use of abstraction to reduce the number of states. An abstraction function is always a trade-off between the size of the model and the loss of information. The default abstraction uses the same categories of values as existing approaches [39, 67], augmented with information about constants and enumerations extracted from the source code.

Chapter 5

Mining Object Behavior Models

Mining object behavior models is a dynamic analysis that requires a lot of runtime information. As a consequence, collecting the required information and calculating the models is expensive both in terms of computation time and memory. We have implemented a tool called ADABU1 that mines object behavior models from the execution of JAVA programs. We chose JAVA since it is a modern object-oriented language and because there is a broad range of open-source JAVA projects. This chapter describes important concepts that make mining models feasible in practice and also presents lessons learned along the way.

In essence, ADABU consists of two parts (see Figure 5.1): The tracer instruments the byte code of JAVA programs such that all information relevant for mining models is written to a trace file. The model miner processes the trace file to replay the execution and mine object behavior models. Separating model mining from tracing has several benefits: First, it allows a program run to be analyzed multiple times. This is especially important for multithreaded programs with non-deterministic behavior. Second, in order to mine models for languages other than JAVA, we only need to implement a new tracer for the target language and can reuse the model miner2.

The following two sections describe the individual parts of ADABU. In Section 5.1, we present an overview of the tracer and explain several design decisions which provide valuable insights for other approaches that use similar techniques. In Section 5.2, we present the design of the model miner and highlight several aspects that are vital for its performance.

1 ADABU is the recursive acronym “ADABU Detects All Bad Usages.” It is also the Swahili word for “good behavior”.
2 Chapter B in the appendix contains a detailed description of the trace file format.



Figure 5.1: Overview of ADABU. The tracer (b) takes as input a program (a) and outputs a modified version (c). When executed, the modified version generates a trace file (d), which is fed into the model miner (e) to mine object behavior models (f).

5.1 Tracing

5.1.1 Data Collection

The purpose of a tracer is to collect information about a program run in a trace file. The first step when implementing a tracer is therefore to find a way to collect the required information. In general, there are three different techniques, each with its own advantages and disadvantages:

Debugging Interfaces Programs written in languages such as JAVA or C# are executed within a virtual machine. Some virtual machines provide a debugging interface (e.g. the JVMTI interface for JAVA [58]) that can be used to access the state of the program. In principle, these interfaces can also be used to implement a tracer. However, there are two problems that make it difficult to apply this approach in practice. First, a new release of a virtual machine often also changes the debugging interface. As a consequence, a tracer implemented for the old version may not be compatible with the new interface. The second problem is specific to JAVA: Besides the standard implementation provided by SUN,

there are a number of third-party virtual machine implementations. These implementations often do not support the debugging interface, and consequently the tracer cannot be used with them. Overall, relying on debugging interfaces limits the applicability of the tracer and is therefore problematic.

Source Code Instrumentation An alternative approach is to implement the tracer by adding additional statements to the source code of the program. The advantage of source code instrumentation is that it is independent of the target execution environment. Also, adding additional statements at the source level is only moderately complicated, which facilitates implementing the tracer. The downside of this approach is that source code instrumentation, as the name suggests, requires access to the source code of the program. Since almost all projects make use of external libraries, it may be difficult, if not infeasible, to obtain the source code for all parts of a program. Hence, source code instrumentation is also difficult to apply in practice.

Binary Instrumentation The third option is to instrument binaries rather than source code. For languages such as JAVA or C#, this means instrumenting intermediate representations (called byte code for JAVA and intermediate language for .NET). The main advantage of this approach is that it does not require the source code and is therefore easier to apply in practice. However, since intermediate languages are usually at a lower abstraction level than source code, binary instrumentation is more complex and thus more error-prone than instrumenting source code.

Overall, binary instrumentation is the approach that has the fewest prerequisites in terms of the target platform and the required artifacts. We therefore chose this approach as the basis for our tracer. To cope with the increased complexity of instrumenting binaries, we have developed a number of best practices that will be presented in the subsequent sections.

5.1.2 Architecture

There are several frameworks for accessing and manipulating JAVA byte code. Most frameworks (e.g. BCEL [26], ASM [82]) provide direct access to byte code instructions. Other frameworks such as JAVASSIST [20] allow modifications to the byte code to be specified in a specialized language. These modifications are then compiled and woven into the byte code. The advantage of having such a modification language is that the user does not need to deal with raw byte code. However, byte code weaving is complex and therefore these tools are rather fragile, which sometimes leads to incorrectly compiled code that is difficult to debug. Another problem is that the compiler generates many superfluous instructions, which often causes the modified code to exceed internal limits of the virtual machine such as the maximum number of instructions per method.

The aforementioned problems can be avoided by using a framework that provides direct access to the byte code. One such framework is ASM [82], which is used by many projects that analyze or manipulate byte code. ASM is designed according to the visitor pattern [45]. Inside ASM, visitors are organized as a chain (Figure 5.2) where each visitor processes the modifications generated by the previous visitor. This pattern makes it possible to separate independent parts of the instrumentation into individual classes, which can easily be added to or removed from the visitor chain. For example, if a user is not interested in tracing array accesses, the corresponding visitor is removed from the chain without affecting other parts of the instrumentation. The flexibility of this design is the main reason we chose ASM as the technical basis for our tracer.

The next step after choosing the instrumentation framework is to decide when to apply the instrumentation. In JAVA, there are two options: Pre-execution instrumentation, as the name suggests, instruments classes prior to the execution of the program. On-the-fly instrumentation uses the JAVA agent feature of the virtual machine to intercept all attempts to load classes and modifies the byte code before the class is resolved. Both approaches are feasible solutions. However, on-the-fly instrumentation has the advantage that it can instrument all classes that are loaded. This is important for applications that generate new classes at runtime. To be able to analyze such applications, we have decided to implement ADABU using on-the-fly instrumentation based on the JAVA agent mechanism (Figure 5.2). Unfortunately, this entails other technical problems:

Preloaded Classes Classes such as List or InputStream are loaded before the JAVA agent is activated. Hence, these classes will never be instrumented and cannot be observed. To cope with this problem, JAVA offers a feature to explicitly trigger re-transformation of already loaded classes. Unfortunately, re-transformation forbids changing the interface of a class or adding new fields, which further complicates instrumentation (see below).

Multiple Instrumentation A considerable number of projects use their own strategies for class loading. Under certain circumstances it may happen that a class is loaded twice. In that case, instrumented code may be added more than once, which confuses the tracer and may result in incorrect byte code. To alleviate this problem, ADABU uses a marker interface to designate instrumented classes and only instruments classes that do not implement this interface.


Figure 5.2: Architecture of the tracing framework. The tracer intercepts all requests to the class loader, transforms the original byte code with the agent pipeline and uses the system class loader to resolve the modified byte code.
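The following skeleton illustrates this setup with the java.lang.instrument API and the current ASM API. It is a simplified sketch, not ADABU's code: the package filter, class names, and the empty visitor placeholder are our own illustration.

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.Opcodes;

// On-the-fly instrumentation: a JAVA agent intercepts class loading and rewrites
// the byte code with an ASM visitor chain before the class is resolved.
public class TracingAgent {

    public static void premain(String args, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain domain, byte[] bytes) {
                // Never instrument the tracing infrastructure itself.
                if (className == null || className.startsWith("java/lang/adabu/"))
                    return null; // null means: leave the class unchanged

                ClassReader reader = new ClassReader(bytes);
                ClassWriter writer = new ClassWriter(reader, ClassWriter.COMPUTE_MAXS);
                // One visitor per concern (field accesses, method entries/exits, ...)
                // would be chained in front of the writer; the chain is left empty here.
                ClassVisitor chain = new ClassVisitor(Opcodes.ASM9, writer) { };
                reader.accept(chain, 0);
                return writer.toByteArray();
            }
        }, /* canRetransform */ true);

        // Preloaded classes (e.g. java.util.List) can be re-transformed explicitly:
        // inst.retransformClasses(java.util.List.class);
    }
}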

5.1.3 Principles

Tracing information for model mining requires a fairly complex instrumentation. This complexity is mostly due to the variety of information (see Section 5.1.4) that needs to be collected. Instrumentation is further complicated by technical requirements such as the independent treatment of different types of information and the restrictions for re-transformed classes (see Section 5.1.2). These constraints, together with the general difficulty of debugging instrumented byte code, make implementing a model mining tracer a challenge. To cope with the complexity, the implementation of our tracer follows two important principles:

Minimize Injected Code The first principle is to use instrumentation only to collect data. More complex operations such as serialization and synchronization are handled in a special Tracer class, which is compiled JAVA code. As an example, Figure 5.3 shows byte code that writes data to a double field (the original field write), together with the instrumentation added by ADABU. The first six added instructions collect the target object of the field write and the new value for the field. This data is then passed to the fieldWritten method of the Tracer class3, which creates a FIELDWRITE event and serializes it in a thread-safe way.

3 Tracer is a singleton [45]. To avoid having multiple instances, the class must be part of package java.lang.

    ALOAD 1                                                          (original)
    DCONST_1                                                         (original)
    DUP2_X1
    POP2
    DUP_X2
    DUP_X2
    POP
    DUP2_X1
    LDC 0
    ICONST_0
    INVOKESTATIC java/lang/adabu/Tracer.fieldWritten (Ljava/lang/Object;DIZ)V
    PUTFIELD TracingExample.foo : D                                  (original)

Figure 5.3: Byte code instructions added by ADABU to trace write access to a double field. The instructions marked (original) belong to the original field write; all other instructions are added by ADABU.

thread-safe way. Thus, the complex parts of tracing are handled by compiled JAVA code, which is much easier to debug.
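The descriptor (Ljava/lang/Object;DIZ)V in Figure 5.3 fixes the signature of the injected call; everything else in the following sketch, in particular the meaning of the int and boolean parameters and the event class, is an assumption made only to illustrate the principle of keeping the complex logic in compiled JAVA code.

    public final class Tracer {

        /** Minimal event container; the real trace format is given in Appendix B. */
        static final class FieldWriteEvent {
            final Object target;       // in ADABU this would become an object identifier
            final double newValue;
            final int fieldId;
            final boolean isStatic;

            FieldWriteEvent(Object target, double newValue, int fieldId, boolean isStatic) {
                this.target = target;
                this.newValue = newValue;
                this.fieldId = fieldId;
                this.isStatic = isStatic;
            }
        }

        private Tracer() { }  // only static entry points are used in this sketch

        // Signature matches the injected INVOKESTATIC with descriptor
        // (Ljava/lang/Object;DIZ)V; parameter meanings are assumed.
        public static void fieldWritten(Object target, double newValue,
                                        int fieldId, boolean isStatic) {
            serialize(new FieldWriteEvent(target, newValue, fieldId, isStatic));
        }

        // Synchronization and serialization live in plain JAVA code, which keeps
        // the injected byte code short and easy to debug.
        private static synchronized void serialize(FieldWriteEvent event) {
            // ... append the event to the trace file ...
        }
    }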

No Local Variables Many tracing tasks require data to be stored temporarily. In JAVA, temporary information can either be put on the operand stack or stored in artificially created local variables. Using local variables for this purpose is much more complex, as the tracer needs to take care of the scope of these variables. Also, stack operations are faster than accesses to local variables. For these reasons, the tracer avoids local variables wherever possible and instead stores data on the stack. The downside of this approach is that it requires more byte code instructions and sometimes complicates instrumentation. Figure 5.4 illustrates the effect of each instruction added to trace a field write for a value consisting of two words (Figure 5.3). Despite this slight increase in complexity, using the stack is the best way to store data temporarily.

Together, these two principles reduce the complexity of the instrumentation and make the implementation more robust. We have successfully used ADABU to trace the execution of large programs with more than 50000 lines of code.

[Figure 5.4 diagram: the stack contents (objectref, word 1, word 2) after each of the six instructions dup2_x1, pop2, dup_x2, dup_x2, pop, dup2_x1.]

Figure 5.4: Duplicating the information of a field write operation (see Figure 5.3) requires a total of six byte code instructions. These instructions essentially duplicate the first three words on the stack.

5.1.4 Traced Data

The tracer produces one trace file per execution. This file consists of a stream of events, where each event captures an operation that is of interest to the tracer. Examples of events are the end of a method execution, the instantiation of a new object, or read access to a field. The trace entry for every event contains all information associated with the event. For example, a method-end event specifies the name of the method that has ended and the receiver object (for non-static methods). A complete specification of the trace format and the traced data is available in the appendix on page 169.

Most other dynamic analysis tools use a different approach to trace data. For example, the DAIKON tool [39] traces the values of variables and fields for every method invocation. Thus, even if a field is only changed once, a trace file generated by DAIKON stores the field value for every method invocation. In contrast, the event-based tracing approach of ADABU traces only one event for the write access to the field. For extended executions, ADABU therefore requires much less space to trace the same information as DAIKON.
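The exact encoding is specified in Appendix B; purely as an illustration of the event-based format, a trace entry can be thought of as a small record like the following (the event kinds are taken from this chapter, the field layout is an assumption):

    // Event kinds mentioned in this chapter; the full list is given in Table B.1.
    enum EventKind {
        METHODSTART, METHODEND, FIELDWRITE, FIELDREAD, OBJECTCREATED, ARRAYCREATED, ARRAYWRITE
    }

    // One entry of the event stream: what happened, in which thread, on which
    // receiver object, plus event-specific data such as a method name or a value.
    final class TraceEvent {
        final EventKind kind;
        final long threadId;
        final int receiverId;
        final String payload;

        TraceEvent(EventKind kind, long threadId, int receiverId, String payload) {
            this.kind = kind;
            this.threadId = threadId;
            this.receiverId = receiverId;
            this.payload = payload;
        }
    }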

5.1.5 Object Identifiers

Almost all events traced by ADABU specify a target object. For example, a read event for a non-static field specifies the owner object of the field. Unfortunately, JAVA does not provide access to the identifiers used in the virtual machine, so we have to take care of object identification ourselves. As it turns out, obtaining unique object identifiers in JAVA is a difficult problem.

A first idea is to add an instantiation counter to each object. To implement this approach, we need to add a new field to every object, modify the constructors such that the identifier field is initialized, and provide a getter method for the instantiation counter. Unfortunately, re-transformation of classes (see Section 5.1.2) forbids adding fields, which is why this approach cannot be applied to all classes.

Another idea is to identify objects using hash codes as generated by the standard hashCode() method provided by Object. The default hashCode() implementation uses the address of the object in memory, which would be sufficient for our purposes. Unfortunately, subclasses may override the default implementation and provide hash codes that are not unique throughout the program run. To solve this problem, ADABU uses the special method System.identityHashCode(Object) instead of the standard hashCode() method. This method generates a hash code that is independent of any user-defined hashCode() implementation, and ADABU uses it to identify objects.

One catch with this solution is that the generated hash code is also based on the location in memory. Since the virtual machine may reuse addresses of deleted objects, it is possible (though highly unlikely) that the same hash code is used for two different objects. To detect such collisions, it would be necessary to maintain a list of all garbage-collected objects. However, since such collisions are rather unlikely, ADABU currently does not implement this feature.
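A minimal sketch of the identifier scheme, assuming that ignoring the (unlikely) address-reuse collisions is acceptable, as ADABU currently does:

    // Object identification via System.identityHashCode(), which bypasses any
    // user-defined hashCode() override. Collisions caused by address reuse are
    // not detected, matching the behavior described above.
    final class ObjectIds {
        private ObjectIds() { }

        static int idOf(Object o) {
            return System.identityHashCode(o);
        }
    }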

5.1.6 Tracing Inspector Values

As explained in Section 4.2, the tracer should also trace the values of inspectors as state information. To implement this, the tracer processes an XML file that specifies all inspector methods that are to be used. When instrumenting a class with at least one inspector, the tracer injects additional calls to all inspector methods at the beginning and the end of every method that is not itself an inspector. The result of each call is then written to the trace file using the appropriate event code (see Section B.2.7 on page 178).

5.1.7 Multithreading

Since most modern programs make use of multithreading (for example, every JAVA program with a graphical user interface is automatically multithreaded), the tracer must also be able to handle multithreaded programs. This requires two changes in the tracer: First, the information about reserved object identifiers (see Section 5.1.5) must be thread-local, since several constructors may be active concurrently. Second, each event needs to specify an identifier for the thread that triggered it. When analyzing the trace file, the model miner uses this information to group events triggered by the same thread.
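As a sketch of the first change, reserved identifiers could be kept in a thread-local stack so that concurrently running constructors in different threads do not interfere; the structure below is an assumption, not ADABU's actual implementation:

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Identifiers reserved while an object is under construction, kept per thread.
    final class ReservedIds {
        private static final ThreadLocal<Deque<Integer>> PENDING =
                ThreadLocal.withInitial(ArrayDeque::new);

        static void reserve(int objectId) {
            PENDING.get().push(objectId);
        }

        static int release() {
            return PENDING.get().pop();
        }
    }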

5.1.8 Runtime Evaluation

Tracing induces a substantial runtime overhead, most of which is caused by I/O operations when writing the trace data to disk. Table 5.1 presents the results of an experiment to quantify the runtime impact of the tracer. We used ADABU to trace the execution of several subjects from the SPECJVM2008 benchmark suite, as well as the subjects in the IBUGS repository (see Chapter 6). As explained in the previous chapter, ADABU supports specifying the set of classes to instrument. If the subject class is known before a run, a user can restrict tracing to this class, thus limiting the tracing overhead. To measure the effect of limiting the set of traced classes, we ran ADABU in two different configurations: tracing all classes and tracing only one class. For ASPECTJ and RHINO, we randomly chose ten different classes, ran the tracer for each class, and recorded execution times and trace file sizes.

                             Tracing all classes         Tracing one class
                 Runtime     Overhead   Trace File       Overhead   Trace File
    Subject      (Seconds)   (Factor)   (Gigabytes)      (Factor)   (Gigabytes)
    compress        3.2         320        75.2             170        40.3
    crypto.rsa      3.6         178        35.8              23         2.2
    scimark.fft     1.1         418        23.2             416        23.2
    AspectJ         0.8          10         0.2             1.3         0.1
    Rhino           1.1           5         0.0             2.3       < 0.1

Table 5.1: Trace sizes and execution overhead for the compress, crypto, and scimark benchmarks of SPECJVM2008 (upper part) and the subjects in IBUGS (lower part).

The results of our experiments are summarized in Table 5.1. Column Runtime lists the runtime of the unmodified program. Column Overhead gives the factor by which the execution time increases when running the instrumented program. Column Trace File lists the size of the generated trace file in gigabytes. The Overhead and Trace File columns under Tracing all classes give values for instrumenting every class, whereas the columns under Tracing one class list median values from tracing ten different randomly chosen classes4.

When using ADABU to trace all classes, we observe slowdowns of up to a factor of 418 (scimark.fft). In general, the overhead for the benchmark programs is much higher than for the two IBUGS subjects. This is at least partly due to the SPECJVM2008 programs being specifically chosen to measure the performance of a virtual machine. The IBUGS subjects, on the other hand, also perform I/O operations and are less CPU-intensive. This difference indicates that, when used to trace interactive programs, ADABU incurs much less overhead than when tracing CPU-intensive programs.

When tracing only one class, ADABU incurs far less runtime overhead. For ASPECTJ, the slowdown is 30%, which is feasible in practice. If we know the set of interesting classes beforehand, we can configure ADABU such that the slowdown incurred by tracing is acceptable. However, if we are interested in models with a depth larger than one, it may be difficult to choose the set of interesting classes such that all relevant objects are traced. To alleviate this problem, we can use static program analysis to conservatively approximate the set of interesting classes and configure ADABU such that only those classes are traced.

4 For the crypto.rsa benchmark there were fewer than ten classes. We traced all of them and also provide median values.

5.2 Model Mining

After the tracer has finished recording the execution, the model miner processes the generated trace file to mine object behavior models. The input to the model miner consists of the following:

• The name of the trace file that is to be processed.

• A set of identifiers that denotes all objects for which the miner should generate models. Alternatively, the user can also specify a regular expression describing the class names of objects. In the remainder of this chapter, we refer to the set of objects for which models are mined as the set of interesting objects.

• An integer parameter that specifies the level of depth (see Section 4.6) at which models are mined.

Supplied with this input, the model miner processes the trace file to replay the program run, builds and maintains a representation of the program state, and mines behavior models by processing events associated with interesting objects.

5.2.1 Dynamic Heap Model

Most of the processing time in ADABU is spent on maintaining a heap model that represents the program state at runtime. In essence, the heap model consists of a set of state objects. Each state object represents one object that was created during the program run. The state object S_o for an object o is identified by the object identifier of o (see Section 5.1.5). It consists of a mapping from names to value representations. To identify fields, we use field identifiers as introduced in Section 4.1. Similarly, we use method identifiers to identify inspector methods (see Section 4.4). For values of primitive type, the mapping stores a string representation of the value. For non-primitive types (objects or arrays5), S_o stores the identifier of the object referenced by the field. To allow fast access based on object identifiers, ADABU uses a hash map from object identifiers to the corresponding state objects. Figure 5.5 shows an example heap model based on the refactored design example in Fowler's book [42].

To build and maintain the heap model, ADABU processes the following trace events (see Table B.1 on page 179):

Object/Array Creation For each OBJECTCREATED or ARRAYCREATED event, ADABU creates a new state object with initially empty field information.

5 In the remainder of this section, we omit the discussion of arrays as they are treated almost identically to objects.

[Figure 5.5 diagram, shown as an object listing:
    Customer #4711: name = "Foo", rentals = #4712
    Vector #4712: size = 2, elements = #4713, #4714
    Rental #4713: daysRented = 2, movie = #4715
    Rental #4714: daysRented = 9, movie = #4716
    Movie #4715: title = "Avatar"
    Movie #4716: title = "300"]

Figure 5.5: An example configuration of the dynamic heap. The Customer object uses a Vector to store references to Rentals, which in turn reference the rented Movie.

Update events Whenever a FIELDWRITE* or ARRAYWRITE* event is processed, ADABU updates the state of the corresponding object with the new value.
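The following sketch summarizes the heap model's data structures as described above (class and method names are assumptions; the real implementation uses a specialized integer-keyed map, see Section 5.2.3):

    import java.util.HashMap;
    import java.util.Map;

    // One StateObject per object created during the run, addressed by its identifier.
    final class StateObject {
        // field or inspector identifier -> string representation of a primitive
        // value, or the identifier of the referenced object for reference types
        final Map<String, String> values = new HashMap<>();
    }

    final class DynamicHeap {
        private final Map<Integer, StateObject> states = new HashMap<>();

        // OBJECTCREATED / ARRAYCREATED: start with empty field information.
        void objectCreated(int objectId) {
            states.put(objectId, new StateObject());
        }

        // FIELDWRITE* / ARRAYWRITE*: record the new value for the target object.
        void fieldWritten(int objectId, String fieldId, String newValue) {
            states.computeIfAbsent(objectId, id -> new StateObject())
                  .values.put(fieldId, newValue);
        }

        StateObject stateOf(int objectId) {
            return states.get(objectId);
        }
    }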

Naturally, maintaining a heap model for program runs with many objects requires a considerable amount of memory. One problem is that, in contrast to the virtual machine, the model miner does not know when it may delete the state of an object. As a consequence, the heap model retains states for objects long after they were deleted in the run, which makes the memory problem even worse. To mitigate this problem, ADABU uses a simple analysis to identify objects that do not survive the lifetime of a method invocation. The basic idea of the analysis is as follows: at the end of every method call m, ADABU deletes the states of objects that do not escape the execution of m. We did not measure exactly how much memory is saved by this modification. However, with the modification turned on, ADABU was able to process traces that could not be analyzed before.

5.2.2 Model Generation

The implementation of model mining uses the dynamic heap model presented in the previous section. With an up-to-date dynamic heap model, implementing model mining is straightforward6. The most important trace events processed are METHODSTART

6 The presentation in this section omits the discussion of inspectors as they are treated in the same way as fields.

and METHODEND events. For every METHODSTART event, the model miner creates the following data structures and pushes them onto a thread-local stack:

FIELDSWRITTEN The set of fields written during the invocation. Each entry consists of an identifier for the field, the identifier of the object, and the new value of the field.

OBJECTSCREATED The set of objects created during the invocation. An entry in this set consists of the identifier of the new object. This set is used to identify temporary objects as explained in the previous section.

OLDVALUES For fields that were changed during the invocation, this set stores the original values. Entries in this set have the same structure as the elements of FIELDSWRITTEN.

For every METHODEND event of a method m invoked on an object o, ADABU pops the sets created at method start from the thread-local stack. If o is in the set of interesting objects, ADABU determines the set changed of objects that were altered by the invocation of m as follows:

• The set of changed objects changed is initialized to the set of object identifiers for which an entry exists in FIELDSWRITTEN.

• After that, ADABU removes all objects from changed whose value did not change. This is achieved by comparing the old value as stored in OLDVALUES to the new value.

• In the next step, ADABU determines the set changed′ of objects that reference objects in changed. It then merges all objects in changed′ into changed. This step is executed depth times (see Section 4.6) to find all objects for which the changed fields may be part of a state in the model.

Example 9 (Changed Objects) We use the heap state presented in Figure 5.5 as an example. Let us assume that the title of movie #4715 was changed from Avatar to Avatar Directors Cut. For a model depth of two, the set of changed objects is calculated as follows: In the first step, changed is initialized with object #4715, as this object was changed directly. The first iteration of the analysis adds object #4713 as it references #4715. The second iteration adds Vector #4712 for the same reason. In the next step, no new objects are added to the set and the iteration stops with

changed = {#4715, #4713, #4712}
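A minimal sketch of this propagation step, with assumed helper names; it only covers the depth-bounded expansion of changed, not the preceding filtering against OLDVALUES:

    import java.util.HashSet;
    import java.util.Set;

    final class ChangePropagation {

        // Assumed helper: yields the objects that hold a reference to objectId;
        // ADABU answers this query from its hash maps (Section 5.2.3).
        interface ReverseIndex {
            Set<Integer> referrersOf(int objectId);
        }

        // Start with the directly changed objects and, depth times, add every
        // object that references one of the objects collected so far.
        static Set<Integer> changedObjects(Set<Integer> directlyChanged,
                                           ReverseIndex index, int depth) {
            Set<Integer> changed = new HashSet<>(directlyChanged);
            for (int i = 0; i < depth; i++) {
                Set<Integer> referrers = new HashSet<>();
                for (int id : changed) {
                    referrers.addAll(index.referrersOf(id));
                }
                changed.addAll(referrers);
            }
            return changed;
        }
    }

For the heap of Figure 5.5 and depth two, starting from {#4715} the first iteration adds #4713 and the second adds #4712, reproducing the result of Example 9.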

Once the set of changed objects is calculated, ADABU updates the models of all interesting objects in changed. For a model n of an object p that is currently in state s, ADABU first creates a new state s′ that reflects all changes in FIELDSWRITTEN visible to p. It then adds a transition from s to s′ labeled with m to the model. Model mining finishes as soon as the end of the trace file is reached. At this point, ADABU invokes a user-specified abstraction function (Section 4.7) to generate the final models. Finally, the tool emits the mined models in standard formats such as DOT [5] or GRAPHML [16].

5.2.3 Runtime Optimizations

Maintaining the dynamic heap model requires ADABU to process large amounts of data. To make the tool scale to real-world programs, the model miner uses the following concepts:

Hashing ADABU makes extensive use of hash maps to be able to quickly determine information such as which objects are referenced by an object, and which objects refer to a particular object.

Integer Hashsets While implementing ADABU, it became obvious that the standard hash map implementation of JAVA uses too much memory. We therefore use an optimized version that uses primitive integers as keys, which significantly reduced the amount of memory used by ADABU.

Object Reuse Running a profiler on early versions of ADABU revealed that large parts of the computing time are spent on creating objects (for example, hash maps). To avoid this overhead, ADABU reuses objects to a large extent. Instead of creating new hash maps, ADABU maintains a pool of hash maps to which instances are returned when they are no longer needed. New instances of objects are created only if the pool runs empty.
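A sketch of the reuse scheme described under Object Reuse; the structure is assumed, but it captures the idea of borrowing maps from a pool and allocating only when the pool is empty:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;

    final class MapPool<K, V> {
        private final Deque<HashMap<K, V>> pool = new ArrayDeque<>();

        HashMap<K, V> borrow() {
            HashMap<K, V> map = pool.poll();
            return map != null ? map : new HashMap<K, V>();  // allocate only if the pool is empty
        }

        void giveBack(HashMap<K, V> map) {
            map.clear();      // reset before returning the instance to the pool
            pool.push(map);
        }
    }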

Overall, our experiences with ADABU are positive. We have used the tool to mine models from the execution of large interactive programs. To allow other researchers to benefit from our work, we have made ADABU available for download (see Chapter 9).

5.3 Dynamic Side-Effect Analysis

As explained in Section 5.1.6, ADABU needs to inject additional method calls in order to mine models based on inspector values. A potential problem exists if the injected

    public void m(Object o) {
        Vector v = n(new Vector());
        v.add(2);
    }

    public Vector n(Vector v) {
        v.add(1);
        return new Vector();
    }

Figure 5.6: Method n is impure because it modifies Vector v. Method m is pure because all modified objects are created during the invocation of m.

calls have side effects, i.e. they change the state such that the control flow and/or the output is altered. Models mined from such an execution are useless, as the side effects may affect the outcome of the run and hence the mined models may be incorrect. To be able to safely call an inspector method, we have to be sure that it is free of side effects. Traditionally, methods that are free of side effects are referred to as pure methods, whereas methods with side effects are called impure. Classifying methods as pure or impure is called purity analysis.

In some cases it is easy to establish whether a method is pure. In the example in Figure 5.6 we immediately see that method n is impure because it modifies its argument v. Method m, however, is pure, because it only modifies objects that are not visible externally. For complex methods, though, manual analysis is tedious and error-prone.

Since manual classification is difficult, researchers have developed a number of automatic approaches to purity analysis. Most tools [76, 90, 101] use static program analysis techniques to identify pure methods. Unfortunately, these analyses do not scale well, which makes the analysis of real-world programs difficult. The best static approach for JAVA by Sălcianu et al. [101] can only analyze projects that require JAVA 1.1, which prohibits analysis of most modern programs.

As static analysis does not handle the problem very well, we propose to use dynamic program analysis instead. In this section, we present a novel approach to purity analysis based on the observation of side-effects at runtime. Our implementation, called JPURE, uses ADABU to trace a program run and analyzes the traces to classify all executed methods as pure or impure. If necessary, the classification can be further refined with data from other runs. To measure the degree of underapproximation, we evaluate the precision of our dynamic analysis by comparing its results to the state-of-the-art static purity analysis. Our experience with using the output of JPURE in ADABU shows that despite the underapproximation, a dynamic purity analysis produces reliable results. The remainder of this section presents the definition of purity used by JPURE (Section 5.3.1), describes the algorithm used to identify pure methods (Sections 5.3.2, 5.3.3, and 5.3.4), and evaluates the soundness of our analysis (Sections 5.3.6 and 5.3.7).

5.3.1 Pure Methods

As a first step, we need to define when a method is pure. Previous approaches have used varying definitions of purity with different degrees of flexibility. For example, the analysis of Catano et al. [19] uses a definition that forbids a pure method to make any modifications to the heap. Many methods create and modify temporary objects that do not survive the execution of the method. Changes to these objects cause the analysis of Catano et al. to mark the method as impure, although the changes never escape the method. Hence, this definition is too strict to be useful in practice. For our approach, we define purity as follows:

Definition 20 A method m is pure if it never modifies an object that existed prior to its invocation.

This definition allows a pure method m to create and modify temporary objects and invoke impure methods as long as the side-effects of these invocations only alter objects created during the invocation of m. In the example in Figure 5.6, method m is classified as pure despite a call to method n, which is classified as impure.

5.3.2 Analysis The idea behind our approach is as follows: For each invocation of a method m, we record identifiers of all objects created during that invocation. This information is used to calculate a set of allocated objects for each method invocation. In addition, we cap- ture all field write operations and trace the identifier of the object that was changed by this operation (static field writes are recorded as changing a special object that rep- resents all static variables). This information is used to calculate the set of modified objects for each method invocation. At the end of each invocation, we compare these two sets. Whenever the invocation of a method modifies an object which was not created during this execution, we know that the invocation changed externally visible state. Our analysis classifies a method m as pure if it was executed at least once and no execution of m modifies externally visible state. If at least one execution of m modifies externally visible state, m is classified as impure. The output of our analysis consists of a set of pure and a set of impure methods. 5.3. DYNAMIC SIDE-EFFECT ANALYSIS 63

We have implemented our analysis for JAVA programs in a tool called JPURE. The tool works in two phases: In the first phase, one or more executions of the program to be analyzed are monitored (see Section 5.3.3) and trace data is written to a file. In the second phase (explained in Section 5.3.4), the trace files are analyzed and the tool classifies each executed method as pure or impure.

5.3.3 Tracing

JPURE relies on the tracing framework (Section 5.1) of ADABU to collect the required information. Due to its architecture, ADABU can easily be configured to trace only those events required by JPURE. The traced events include all method start and end, field write, and object creation events (see Table B.1 on page 179).

5.3.4 Algorithm

Algorithm 1 summarizes the key parts of the analysis. For each method invocation we maintain a set of objects created during the invocation (newObjects) and a set of objects modified during the invocation (modifiedObjects). At the end of each invocation we check for objects that were modified but not created during that invocation (lines 14–15). If at least one such object is found, the method is marked as impure. Otherwise, the method is marked as pure unless it was marked as impure in previous invocations. The last step at the end of each invocation (lines 21 and 22) propagates the sets of new and modified objects into the corresponding sets of the calling method.

For a given invocation of a method m, Algorithm 1 computes method purity according to the criterion described in Section 5.3.1. Side-effects due to field writes in m itself are detected because we trace all field writes and instantiations of objects. Side-effects that occur in methods called by m are detected because the analysis propagates the sets of new and modified objects from all methods invoked by m. The algorithm also allows m to call an impure method n (cf. Figure 5.6), as long as all externally visible side-effects of n modify only objects created during the invocation of m.

Our algorithm also handles field writes to transitively reachable objects correctly. For example, suppose that a method m changes field o.x.f where o was not created by m (and thus the side-effect is externally visible), and x currently points to object l. If l was not created by m, l is not in the set of objects created by m and thus the algorithm correctly classifies m as impure. If l was created by m, there must have been a field write that set o.x = l before. Our algorithm classifies this field write as an externally visible side-effect, since o was not created by m.

Algorithm 1 Compute purity information
Input: Trace file f
Output: List of impure and pure methods

     1: procedure PURITY(File f)
     2:   /* initialize data structures */
     3:   for all events e ∈ f do
     4:     if e == METHOD START then
     5:       newObjects.push(new Set());
     6:       modifiedObjects.push(new Set());
     7:     else if e == OBJECT NEW then
     8:       newObjects.peek().add(objectId);
     9:     else if e == FIELD WRITE then
    10:       modifiedObjects.peek().add(objectId);
    11:     else if e == METHOD END then
    12:       Set mObjects = modifiedObjects.pop();
    13:       Set nObjects = newObjects.pop();
    14:       Set escapes = mObjects.minus(nObjects);
    15:       if escapes.isEmpty() then
    16:         /* mark method as pure unless it */
    17:         /* was marked impure before */
    18:       else
    19:         /* mark method as impure */
    20:       end if
    21:       modifiedObjects.peek().addAll(mObjects);
    22:       newObjects.peek().addAll(nObjects);
    23:     end if
    24:   end for
    25:   emit results;
    26: end procedure

5.3.5 Multiple Program Runs

JPURE allows a user to provide more than one run as input to the purity analysis. We use this feature in the evaluation in Section 5.3.7 to analyze how much the soundness and completeness of our analysis benefit from additional data. If multiple runs are available, our tool analyzes them successively and updates the classification as follows:

• If a method m was not analyzed before, we add m to the set of classified methods.

• If m was analyzed before and the new classification is pure, the classification of m remains unchanged.

• If m was analyzed before and the new classification is impure, the classification is set to impure.
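These rules boil down to "impure wins"; a minimal sketch of the merge, with assumed names:

    import java.util.HashMap;
    import java.util.Map;

    final class PurityMerge {
        enum Purity { PURE, IMPURE }

        private final Map<String, Purity> classification = new HashMap<>();

        // Once a method has been observed to be impure in any run it stays impure;
        // otherwise the classification from the new run is simply recorded.
        void update(String method, Purity observedInRun) {
            if (observedInRun == Purity.IMPURE) {
                classification.put(method, Purity.IMPURE);
            } else {
                classification.putIfAbsent(method, Purity.PURE);
            }
        }

        Map<String, Purity> result() {
            return classification;
        }
    }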

5.3.6 Soundness

Deciding whether a method has side-effects is undecidable in general. Thus, no analysis can be both complete (classifies all methods) and sound (all classifications are correct). Since our analysis is a dynamic technique, completeness depends on the amount of code covered by the execution. Purity analysis is a binary classification, so we distinguish two different types of soundness: An analysis is p-sound if all methods classified as pure are indeed pure. Conversely, an analysis is i-sound if all methods classified as impure are indeed impure.

Our analysis is i-sound, because it classifies only those methods as impure that were experimentally shown to have side-effects. However, our technique is potentially not p-sound, since a method may contain code blocks with side-effects that are not executed. The evaluation in Section 5.3.7 measures the degree to which our analysis is not p-sound. Which type of soundness is required depends on the application that uses the results.

5.3.7 Evaluation

This section presents the evaluation of our tool JPURE. Our analysis uses the same purity definition as Sălcianu et al. [101], who have implemented their approach in a tool called PURITYKIT. We use the output of this tool as the ground truth for the purity analysis. Our experiments investigate the following questions:

Soundness How unsound is JPURE, i.e. how many entities are classified incorrectly?

Multiple Runs How do completeness and soundness improve if more than one run of a program is used?

[Figure 5.7 consists of two plots over the number of test runs (0 to 600). The upper plot (y-axis: number of methods, 0 to 7000) shows the number of methods analyzed by PURITYKIT, the number analyzed by JPURE, and the number classified as pure by each tool. The lower plot (y-axis: 0.3 to 0.7) shows the percentage of methods correctly classified as impure.]

Figure 5.7: Purity results for ASPECTJ: Coverage increases from 29% at the beginning to 65% with all runs (upper graph). The percentage of methods correctly classified as impure increases as more runs are included (lower graph).

Runtime Overhead and Analysis Time How big is the runtime overhead imposed by collecting trace information and how long does it take to process the trace files?

Measuring Soundness To measure the soundness of the purity analysis, we calculate precision values for the correct classification into pure and impure methods (similar to [3]):

    imp-prec = ii / (ii + ip)        pure-prec = pp / (pp + pi)

Here, ii is the number of methods correctly classified as impure, ip is the number of impure methods incorrectly classified as pure, pp is the number of methods correctly classified as pure, and pi is the number of pure methods incorrectly classified as impure. imp-prec measures the percentage of methods correctly classified as impure, and pure-prec measures the percentage of methods correctly classified as pure. Ideally, both precision values should be 1.0.

For all experiments with the purity analysis, pi was always zero and thus pure-prec = 1. This is because all entities classified as impure by our analysis were experimentally shown to be impure (cf. Section 5.3.1). On the other hand, our analysis misclassifies impure methods as pure if side-effects are statically possible but never occur at runtime. In the remainder of the discussion, we therefore only use imp-prec to measure soundness.

Evaluation Subjects The following subjects were used in our experiments:

JOLDEN [13] is a suite of ten computationally intensive benchmarks which we included because Sălcianu et al. also study them [101]. We did not include the power benchmark in our evaluation since our tool ran out of memory when performing the analysis: power creates a large number of temporary objects, and our tool tracks data for all of them, even after they are deleted, which eventually exhausts main memory. One way to improve on this would be to trace the deletion of objects.

ASPECTJ [31] is a large program with approximately 75 kLOC which we used as the main subject to evaluate the soundness and completeness of our tool. We study version #29934 of ASPECTJ as provided by the IBUGS [31] repository and analyzed all 630 test runs available in the regression test suite that comes with the program. In order for PURITYKIT to be able to fully analyze ASPECTJ, we changed one line in the main class to directly instantiate a type rather than using reflection to do so.

ECLIPSE [38] is a large IDE for JAVA programs which we included to show how our tool performs on large interactive programs. We collected data from a run in which we created a new class, typed in the body of a main method, compiled and executed the program, and closed the workbench. Section 5.3.7 describes the runtime impact of our tool on the execution time.

The following sections discuss the results for the purity analysis and the runtime overhead of our method.

Results

We used the JOLDEN programs and ASPECTJ as subjects for our evaluation. The precision values for JOLDEN were slightly better than those for ASPECTJ, but coverage was much higher. We focus on ASPECTJ because it is a far more realistic subject than the JOLDEN programs. As mentioned in Section 5.3.5, our tool can analyze data from more than one run. We take advantage of this feature to perform a cumulative analysis of all tests in the test suite of ASPECTJ.

The results for the purity analysis are depicted in Figure 5.7. The upper graph compares the number of methods analyzed by PURITYKIT and JPURE. When run on ASPECTJ, PURITYKIT classifies a total of 6464 methods. The coverage values for JPURE range from 29% (one run) to 65% (all runs), meaning that JPURE classifies roughly two thirds of the methods classified by PURITYKIT. The precision evaluation uses only methods classified by both tools.

The lower graph in Figure 5.7 shows the effect of including more runs on the precision of the analysis. With data from one run, JPURE achieves a precision of 0.61. If all runs are included in the analysis, precision is 0.65, which means that our analysis incorrectly classifies 35% of the methods statically shown to be impure as pure. We investigated a sample of these methods. In many cases, the statements that are responsible for the side effects are either part of an if-else construct or belong to a method that is the possible target of a dynamically bound call. We also investigated how many times the classification of a method changes from pure to impure when adding data from another run. We found 171 methods where this was the case, which is a small number compared to the total number of classified methods (4231).

To summarize, 35% of the methods classified as impure by the static analysis show no side-effects at runtime. Using more than one run has a positive effect on the precision of our analysis, but the increase is not very strong (0.61 vs. 0.65).

    Program      Overhead   Analysis Time
                 (factor)   (seconds)
    bh              39.1       832.22
    em3d            10.2        33.12
    bisort         625.2       412.92
    health          60.2       155.29
    mst             28.8       602.12
    perimeter       41.1       533.92
    treeadd          8.1       389.72
    tsp              3.2       158.53
    voronoi         93.1        22.25

    ASPECTJ          4.3        86.00
    ECLIPSE          5.2       472.91
    COLUMBA          3.6       182.17

Table 5.2: Runtime overhead and analysis time for JOLDEN benchmark programs (upper part) and interactive programs (lower part).

Runtime Overhead and Analysis Time

Table 5.2 lists the results of our overhead evaluation. All experiments were conducted on an AMD64 machine with 2.1 GHz and 2 GB of RAM. To capture the runtime overhead, we divided the execution time of a traced run (as measured by the UNIX time tool) by the execution time of an unmodified run. The resulting values are given in column Overhead. Column Analysis Time lists the number of seconds it took to run the purity analysis. The values for ASPECTJ are mean values over all 630 runs of the test suite.

Runtime overhead for the JOLDEN benchmarks is generally much higher than for the other subjects, with the exception of tsp. We attribute this to the fact that the benchmarks are purely memory-based and do not perform I/O operations. The overhead for the non-benchmark programs ranges from 260% to 420%. Analysis times range from under a minute (voronoi, em3d) up to roughly 14 minutes (bh).

To summarize, our tool induces a considerable amount of overhead for the purely memory-based benchmark programs, whereas the overhead for programs that perform I/O is acceptable.

Threats to Validity Our evaluation only includes results for 10 programs, 9 of which are small benchmark programs. Therefore, we cannot claim that our results generalize to arbitrary programs. The main reason why we had to limit the scope of our study is that it is difficult to find subjects that can be analyzed by PURITYKIT.

Another threat to validity is the use of PURITYKIT as ground truth. If the results of PURITYKIT were at least partly invalid, this would affect the correctness of our precision results. However, we manually verified a sample of the results and found no misclassifications. Besides that, [3] reports very high precision values for the mutability analysis of PURITYKIT, which gives us reason to believe that the purity analysis achieves similar precision.

5.3.8 Related Work

Side-effect analysis originated more than 30 years ago in the area of compiler construction, where it is used for optimization. Early work by Banning [7] identified the problem of side-effect analysis. Cooper et al. [24] present a flow-insensitive side-effect analysis that is linear in the size of the call multigraph. These approaches were among the first to identify the core problems and the need for interprocedural analysis. Later language features such as object orientation further complicated the analysis. More recent techniques focus on the analysis of JAVA programs. Milanova et al. [76] use context-sensitive points-to information to analyze each method invocation in the context of the object the method is invoked on.

Rountev [90] proposes a static side-effect analysis that can be parametrized with different types of class analysis. The paper compares results achieved with RTA to those of a context-sensitive pointer analysis and finds roughly equivalent precision. The purity criterion used is roughly equivalent to ours. However, the technique has difficulties with programs that use reflection, which holds for many modern programs such as ECLIPSE [38].

Xu et al. [116] describe a dynamic purity analysis. Their work explores several purity criteria ranging from strong to weak, whose definitions are strongly influenced by their proposed application, memoization. In contrast to their work, our analysis uses a well-established purity criterion. They also did not evaluate the precision of their approach.

Sălcianu et al. [101] have implemented a static purity and parameter mutability analysis in a tool called PURITYKIT. The approach is based on a combined pointer and escape analysis and is the first to allow pure methods to modify objects created during an invocation. The main weakness of their tool is that it cannot analyze programs which require JDK 1.2 or above, while our tool has successfully analyzed modern programs with graphical user interfaces such as ECLIPSE [38].

5.4 Conclusions

We have presented key features of ADABU, a tool that mines object behavior models from JAVA programs. The tracing part of ADABU uses low-level code instrumentation to trace data for model mining because it is the easiest method to apply in practice. To cope with the complexity of a low-level target language, we divide the instrumentation into small parts that are organized as a chain of independent visitors. By limiting modifications to the stack wherever possible, we avoid hitting boundaries of the virtual machine. When tracing all classes of a program, the slowdown caused by tracing can exceed a factor of 400. If static analysis can limit the set of interesting classes, this overhead can be significantly reduced.

The model miner of ADABU builds and maintains a dynamic heap model that represents the program state. For every method invoked on an interesting object, the tool records the set of fields changed during the invocation. The heap model is then used to determine the set of objects for which the changes would be visible in the behavior model. If an interesting object is affected, its behavior model is updated with a new transition.

This chapter also introduced JPURE, a dynamic purity analysis that classifies a method as pure if it never modifies an object that existed prior to its invocation. The tool builds on the tracer of ADABU to collect the required information. To measure the precision of JPURE, we analyzed 630 runs of ASPECTJ and compared the classification of JPURE to the output of PURITYKIT, a static purity analysis tool. In our experiments, 35% of the methods classified as impure by PURITYKIT never showed side-effects at runtime, suggesting that there is a large gap between static and dynamic purity. We have used the classification of JPURE as input to the ADABU model miner. Using this input, ADABU has successfully mined inspector-based models from large interactive programs such as ECLIPSE and ASPECTJ. This shows that although the results of JPURE are unsound, the tool is still useful in practice.

In terms of lessons learned, the most important result from over three years of analyzing JAVA programs is that keeping the instrumentation part as simple as possible is vital to the success of any instrumentation-based dynamic analysis. In hindsight, using ASM as the basis of our work was a good choice. Its design forces us to split the whole instrumentation into several independent parts, which facilitates the implementation of the tracer.

Chapter 6

Mining Bug Benchmarks

The previous chapters presented the state of the art in software execution models and introduced object behavior models as a new type of model. In this chapter, we are interested in finding realistic subjects for the evaluation of our fix generation tool PACHIKA presented in Chapter 8.

In general, the scientific process to acquire new knowledge consists of two steps: First, a researcher devises a hypothesis. Second, this hypothesis is evaluated, and the outcome is to either accept or reject the hypothesis. An evaluation can be made by proving the hypothesis or by performing an experimental investigation. Since many problems related to bugs are undecidable in general, we will use experiments to validate hypotheses with different subjects and bugs. To avoid researcher bias, those subjects and bugs should not be developed by ourselves, but rather stem from external sources.

To yield reliable results, an evaluation should use realistic subjects and bugs. Unfortunately, existing repositories of bugs such as the Software-Artifact Infrastructure Repository (SIR) [37] provide mostly small subjects with artificially seeded bugs. An evaluation based on subjects from this repository would be flawed, because the results can hardly be generalized to realistic programs.

To avoid this problem, we have developed an approach to mine bug benchmarks from projects whose development history is recorded in a version archive. This chapter introduces our approach, called IBUGS, presents the existing subjects in the repository, and analyzes the quality of the data. Parts of this chapter were published at the Automated Software Engineering Conference 2007 [31].


6.1 Motivation

A significant percentage of empirical studies focusing on testing concerns regression testing and are simulation studies. Many of them are based on (very) small programs and artificially seeded faults. [...] Furthermore, in most cases, the releases are not real software releases with real changes, and therefore, besides a handful of studies, it is unclear what kind of external validity can be expected from these results. – Lionel Briand, Keynote ESEM 07 [17]

In the recent past, researchers have proposed a number of tools for automatic bug localization [117, 51, 66, 22, 67, 69, 121, 29]. Given a program and a description of the failure, a bug localization tool pinpoints a set of statements most likely to contain the bug that caused the failure. Although all approaches try to solve the same problem, many papers use different data sets to evaluate the accuracy of their approach, which makes it difficult for researchers to compare new approaches with existing techniques.

The Software-Artifact Infrastructure Repository (SIR) [37] aims at providing a set of subject programs with known bugs that can be used as benchmarks for bug detection tools. Subjects from the SIR have already been used in a number of evaluations [117, 51, 66, 22, 67, 69, 121, 29]. Despite the success of the repository, there are two issues with the current set of subjects. First, most of the programs are rather small and contain only few known bugs. Second, for the majority of subjects the bugs were artificially seeded into the code. In general, it is unclear whether results for subjects in the SIR transfer to real projects with real bugs.

One reason why the SIR contains only few subjects with real bugs is that collecting such data manually is a tedious task. To alleviate this problem, we have developed a technique that automatically extracts benchmarks with real bugs from a project's history as available in software repositories and bug databases. The approach searches the log messages of code changes for references to bugs in the bug database. For example, a log message "Fixed bug 45298" indicates that the change contains a fix for bug 45298. We provide a faulty version for each bug by extracting a snapshot of the program right before the fix was committed. For each version, we try to build the project and execute the test suite. Syntactic analysis of the fixes allows us to categorize bugs and to identify tests that are associated with bugs.

We have applied our approach to two projects: ASPECTJ is a compiler for an aspect-oriented extension of JAVA with more than 5 years of history; RHINO is a JAVASCRIPT interpreter used by MOZILLA. We assembled the data from both projects in a repository called IBUGS and made it publicly available for other researchers.

Since our approach uses heuristics and relies on partially incomplete data such as comments, it may happen that we incorrectly classify changes or parts of changes as

fixes. Such misclassifications may spoil the results of an evaluation based on the bugs in our repository. To evaluate the extent to which such misclassifications occur, we use DELTA DEBUGGING [118] to minimize the fixes for all bugs with executable test cases. Our evaluation shows that identifying fixes via commit messages causes a significant amount of noise.

In the remainder of this chapter, we discuss related work (Section 6.2), explain our approach and practical experiences (Section 6.3), and present characteristics of the subjects in the IBUGS repository (Section 6.4). Section 6.5 presents the results of our minimization experiment. Section 6.6 discusses results by other researchers who compared the ASPECTJ subject to other bug repositories in terms of possible bias. Section 6.7 concludes the chapter.

6.2 Related Work

We discuss the properties of existing benchmark suites, present a selection of bug localization approaches published in the recent past together with the subjects used for their evaluation (see also Table 6.1), and summarize related work on bug categorization.

6.2.1 Existing Benchmark Suites

PEST. The National Institute of Standards and Technology provides a small suite of programs for evaluating software testing tools and techniques (PEST). The current version contains two artificial C programs, each with fewer than 20 seeded bugs. In contrast to the PEST suite, we aim at providing a set of real programs with bugs that actually occurred in those programs.

BugBench. Lu et al. [71] describe a benchmark suite with 17 C programs ranging from 2000 up to one million lines of code. The paper describes 19 bugs the authors localized in those projects, with more than two thirds being memory-related bugs that can never occur in modern languages like JAVA or C#. We could not further investigate the benchmark since we could not find a released version.

Software-Artifact Infrastructure Repository (SIR). The publicly available Software-Artifact Infrastructure Repository to date provides 6 JAVA and 13 C programs, including the well-known Siemens test suite [89, 86]. Each program comes in several different versions together with a set of known bugs and a test suite. Subjects from the repository have already been used in a number of evaluations. A drawback of the current subjects in the repository is that the average project is only eleven kLOC in size, while most real projects are much larger. Another problem is that almost all subjects only have artificially seeded bugs, which often represent only a small portion of the bugs that occur in real projects. Using our technique to mine bugs from source code repositories, we can provide subjects for the SIR with a large number of realistic bugs.

Marmoset. The group around Bill Pugh collected bugs made by students during programming projects. Their MARMOSET project contains several hundred projects including test cases [97]. However, most student projects are small and not always representative of industrial development processes. In contrast to MARMOSET, our IBUGS project focuses on large open-source projects with industry-like development processes.

6.2.2 Defect Localization Tools

Yang et al. [117] dynamically infer temporal properties (API rules) for method invocations from a set of training runs. The approach handles imperfect traces by allowing for a certain number of violations to a candidate rule. Violations of the rules in testing runs may point to bugs. Hangal et al. [51] try to automatically deduce likely invariants from a set of passing runs. Invariants are used to flag deviating behavior right before the program crashes in a failing run. Li and Zhou [66] mine programming rules from a program's code. Violations of these rules are flagged as possible bug locations. The previously described approaches provide ad hoc evaluations with subjects that are sometimes not available to the public (like the Windows kernel). Most of them also report only the bugs they were able to detect, but omit information about bugs they missed. This makes it difficult for researchers to reproduce the work of others and to assess the performance of their own approaches.

Several researchers improved on the lack of reproducibility by additionally testing their bug localization tools on publicly available benchmarks such as Gregg Rothermel's SIR. Cleve and Zeller [22] establish cause-effect chains for failures by applying DELTA DEBUGGING several times during a program run. Suspected bug locations are pinpointed whenever the variable relevant for the failure changes. Liblit et al. [67] propose a statistical approach that collects information about predicate evaluation from a large number of runs. Predicates that correlate with failure of the program are likely to be relevant for a bug. The SOBER tool by Liu et al. [69] calculates evaluation patterns for predicates from program executions. If a predicate has deviating evaluation patterns in passing and failing runs, it is considered bug-relevant. Zhang et al. [121] automatically identify one or more predicates crucial for a failure. The suspected bug location is the dynamic slice of the crucial predicate(s). The AMPLE tool by Dallmeier et al. [29] captures the behavior of objects as call-sequence sets. Classes are ranked according to the degree of deviation between passing and failing runs.

    Approach             Language   Evaluation Type       Subjects
    SOBER                C          Benchmark + Ad hoc    Siemens Test Suite, BC
    AMPLE                Java       Benchmark + Ad hoc    JAVA subject from Siemens Test Suite, four bugs in ASPECTJ
    Liblit05             C          Benchmark + Ad hoc    Siemens Test Suite
    Cause Transitions    C          Benchmark             Siemens Test Suite
    Predicate Switching  C          Benchmark             Siemens Test Suite
    Perracotta           Java, C    Ad hoc                JBOSS Transaction Module, Windows Kernel
    PR-Miner             C          Ad hoc                Large C projects (Linux Kernel)
    Diduce               Java       Ad hoc                Java SSE, MailManage, Joeq

Table 6.1: Overview of evaluation methods for bug localization tools. Several evaluations use only ad hoc examples to evaluate their approach. Most papers that use a benchmark rely on subjects provided by the SIR repository [37].

6.2.3 Bug Classification

Several researchers have investigated the phenomenon of bugs in the past. Ko and Myers proposed a methodology that describes the causes of software errors in terms of chains of cognitive breakdowns [62]. In their paper, they also summarize other studies that classify bugs. Defect classification has also been addressed by several other researchers: Williams and Hollingsworth manually inspected the bugs of the Apache web server and found that logic errors and missing checks for null pointers and return values were the most dominant bug categories [110, 111]. Xie and Engler demonstrated that many redundancies in source code are indicators of bugs [115]. Since such redundancies are easily caught by static analysis, this led to the advent of static bug finding tools such as FINDBUGS [54], JLINT [61], and PMD [55] (for a comparison we refer to Rutar et al. [91]). Typically, such tools take rules and search for their violations. Recently, automatic bug classification techniques based on natural language processing have emerged: Anvik et al. used such techniques to assign bugs to developers [2], and Li et al. investigated whether the characteristics of bugs have changed in modern software [65].

6.3 Bug Extraction from History

Our goal is to exploit the history of a project to build a repository of realistic bugs that can be used to benchmark bug localization tools. We classify each bug by the characteristics of its fix, for example its size and the syntactical elements that were changed. For each bug, we provide a compilable version with and without the bug as well as a means to run tests on the program. The following steps are necessary to prepare a subject for the IBUGS repository. The sequence in which the steps are performed can vary, but some steps have to be performed before others (versions need to be extracted before they can be built):

1. Recognize fixes and bugs.
2. Extract versions from history.
3. Build and run tests.
4. Recognize tests associated with bugs.
5. Annotate bugs with meta information (size, syntactical properties).
6. Assemble the IBUGS repository.

We first discuss the prerequisites for our approach and then present each step in detail. The numbers of bug candidates that we analyzed at the various stages are summarized in Table 6.2.

ASPECTJ: To illustrate how our approach works in practice, we describe our experiences with preparing the ASPECTJ compiler project as a subject. Paragraphs that are concerned with ASPECTJ are marked in the same fashion as this paragraph.

6.3.1 Prerequisites

In order to be suitable for the IBUGS repository, a project needs to meet the following prerequisites:

Source repository (required). The project must provide access to a system like CVS or SVN where the project history is stored. We use the repository to identify changes that fix a bug.

Bug tracker (optional). The availability of a bug tracking system like Bugzilla or Jira helps eliminate false positives in the detection of changes that fix a bug.

Test infrastructure (optional). If the project has a test infrastructure, we can use it to provide runs of the program. If there is no test suite available, the subject can still be used to evaluate static bug detection tools.

    Candidates                                               Number
      retrieved from CVS and BUGZILLA                           489
      after removing false positives                            485
      that change source code                                   418
      for which pre-fix and post-fix versions compile           406
      for which test suites compile                             369

    ASPECTJ dataset
      bugs                                                      369
      bugs with associated test cases                           223

Table 6.2: Breakdown of the analyzed bug candidates for ASPECTJ. Many snapshots extracted from the version archive cannot be compiled.

Our experience with open-source projects shows that virtually all successful projects meet these requirements. Organizations like the APACHE and ECLIPSE foundations use a standard infrastructure with source repositories and bug trackers for all of their projects.

ASPECTJ: The project builds a compiler that extends the JAVA language with aspect-oriented features. It provides access to a CVS repository with over five years of history and a bug tracking system with more than 1000 entries. With over 75000 lines of code excluding test code, it is among the larger open-source projects.

6.3.2 Fix Identification

The first step in the IBUGS approach is to identify changes that correct bugs, in particular, bugs that were reported to bug databases such as Bugzilla. Typically, developers annotate every change with a message to describe the reason for that change. As sketched in Figure 6.1, we automatically search these messages for references to bug reports such as “Fixed 42233” or “bug #23444”.1 Basically, every number is a potential reference to a bug report; however, such references have a low trust level at first. We increase the trust level when the message contains keywords such as “fixed” or “bug” or matches patterns like “# and a number”. Since changes may span across several files, we combine all changes made by the same author, with the same message and the same time stamp (with a fuzziness of 200 seconds) into a transaction [125]. Finally, every change with a reference to a bug report is assumed to be a fix and serves as

1The format of references to bug reports is project specific. It depends especially on the bug tracking system that is used.


Figure 6.1: Bug reports are linked to changes in the version archives by analyzing commit messages. Keywords include bug identifiers and words like “Fix” or “bug”. a candidate for our bug dataset. Our approach for mapping code changes to bug reports is described in detail by Śliwerski et al. [95] and is similar to the approaches used by Fischer et al. [41] and by Čubranić et al. [25]. ASPECTJ: We were able to identify 890 transactions that fixed a bug. We removed all bugs that took more than one change to be fixed, since we cannot be sure which change was really necessary to fix the bug. For similar reasons we did not consider changes that fix more than one bug. Altogether we found 489 bugs that were fixed only once in a transaction that fixed only one bug. A manual investigation of log messages revealed that four of them were actually false positives (the number in the log message accidentally matched a bug id) and had to be removed.
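To make the heuristic concrete, the following sketch shows how a commit message could be scanned for candidate bug references and how the trust level is raised by keywords and “#”-patterns. The patterns and trust values are illustrative assumptions rather than the exact rules used by IBUGS; in particular, the candidate ids would still have to be validated against the bug database.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: scan a commit message for potential bug references.
// The concrete patterns are project specific (see footnote 1).
class FixIdentifier {

    private static final Pattern NUMBER   = Pattern.compile("\\b(\\d{4,6})\\b");
    private static final Pattern KEYWORD  =
            Pattern.compile("\\b(fix(e[sd])?|bugs?)\\b", Pattern.CASE_INSENSITIVE);
    private static final Pattern HASH_REF = Pattern.compile("#\\s*\\d+");

    /** Maps every candidate bug id found in the message to a simple trust level. */
    static Map<Integer, Integer> bugReferences(String commitMessage) {
        Map<Integer, Integer> candidates = new LinkedHashMap<>();
        boolean hasKeyword = KEYWORD.matcher(commitMessage).find();
        boolean hasHashRef = HASH_REF.matcher(commitMessage).find();
        Matcher numbers = NUMBER.matcher(commitMessage);
        while (numbers.find()) {
            int trust = 1;                       // every number is a potential reference
            if (hasKeyword) trust++;             // message contains "fixed", "bug", ...
            if (hasHashRef) trust++;             // message matches "# and a number"
            candidates.put(Integer.parseInt(numbers.group(1)), trust);
        }
        return candidates;
    }
}

For example, a message such as “Fixed 42233” would yield the candidate id 42233 with an increased trust level, while an arbitrary number without keywords would remain a low-trust candidate.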

6.3.3 Extraction

For each bug we extract two versions of the program (see Figure 6.2): The pre-fix version represents the state of the program right before the bug was fixed, while the post-fix version also includes the fix. We then compare these two versions and remove all fixes that do not change the program code. This is necessary because some fixes do not affect the functionality of the program (for example, a misspelled dialog title in a resource file). ASPECTJ: Altogether we found 67 bugs that did not change the source of the program and removed them from the IBUGS repository.


Figure 6.2: For each bug in the repository, we extract the pre-fix version right before the fix was committed. The post-fix version of a bug in addition contains the fix for the bug.

6.3.4 Test Execution

In the next step we prepare the pre- and post-fix versions of all bugs for execution (see also Figure 6.3). First we try to build each version. If the build process goes beyond a simple compile, most projects provide a build file. We identify the build file by examining the project and use it to run a build. Depending on the project, this may already include building and running a unit test suite. If this is not the case, we manually trigger the test suite and collect information about which tests were run and the outcome (pass or fail) of each test. After this step we remove all versions that fail to build. ASPECTJ: The project provides a build file with separate targets for building and running the program and its test suite. We first tried to build the program and found twelve versions that had compiler errors. We removed those versions and tried to build the test suite for all remaining versions. Building and running the test suite for all versions required a lot more effort than building the project. This is due to some inconsistencies in the test system and the fact that the test process changed several times over the history of the project. This caused (amongst others) the following problems:

• For some versions the tests cannot be built without having all modules in an Eclipse workspace.


Figure 6.3: For each bug, we run the test suite on the pre-fix and post-fix versions and compare the output to identify failing tests.

• In some cases the program built fine but the tests had compiler errors.

• The names of build targets and output files changed several times.

We analyzed the changes in the test system over time to fix as many problems as possible. For 37 bugs we could not build the test suite and therefore removed them. The remaining 369 bugs were included in the IBUGS repository.

6.3.5 Associated Tests Many dynamic bug localization tools [29, 22, 121] require a run that reproduces the failure and a passing run. While the project’s test suite provides us with passing runs, it almost never contains failing runs for a previously unknown bug. This is because otherwise the bug would have been caught already by running the test suite, and we assume that developers run the tests before releasing the project. To solve this problem we analyze the fixes for each bug and look for new tests that are committed together with a fix. The fact that a test is committed together with a fix is a strong indication that the test is related to the bug. Not all of these tests actually fail when executed, since sometimes developers commit more than one test to check interesting cases that were discovered when fixing the bug. Bugs for which we cannot

find an associated test are not removed from the IBUGS repository, as they may still be useful for static bug localization tools. The method to identify tests committed with fixes depends on the type of tests that are used in the project. However, only a small number of testing frameworks is used in practice, so techniques for the most popular ones cover a large share of projects. ASPECTJ: The project uses two different types of tests. Unit tests are implemented using the JUNIT [44] framework, a popular testing framework for JAVA. Integration tests for the compiler (referred to as harness tests) are described in XML files. Our approach for identifying new JUNIT tests is straightforward: We examine all classes that were changed during the transaction that fixed the bug. A new test is found if a new subclass of TestCase was committed or a new test was added to an existing TestCase. New harness tests are found by analyzing the differences in the test description files. Altogether we found 223 bugs for which the fixing change added or touched at least one test case.

6.3.6 Meta Information Some bugs may not meet the assumptions and prerequisites of a specific bug localization tool. For instance, a tool may pinpoint exactly one code location. In this case, bugs that span across several files would never be recognized completely by the tool and should be treated separately in the evaluation. In order to provide an efficient selection mechanism for bugs, we annotate them with meta information about the size and syntactical properties of the fixes. When computing meta information, we ignore changes to test files and classes, since they are not part of the actual correction. For each bug, we list the following size properties of the corresponding fix:

• files-churned: the number of program files changed
• java-files-churned: the number of JAVA files changed
• classes-churned: the number of classes changed
• methods-churned: the number of methods changed

For computing the size of a fix in terms of lines, we parse the so-called hunks returned by the GNU diff command. A hunk corresponds to a region changed between two versions. If the region is present in both versions, the hunk is called a modification, otherwise it is an addition (the region is only present in the post-fix version) or a deletion (the region is only present in the pre-fix version). We use the line ranges of a region to compute the size of a hunk. Since for modification hunks the size may differ between the pre-fix and post-fix region, we take the maximum in this case. In order to get the actual size of a fix, lines-churned, we aggregate the sizes of the hunks; we additionally break down the size into additions, deletions, and modifications.

• hunks: the number of hunks in a fix.
• lines-added: the total number of lines added.
• lines-deleted: the total number of lines deleted.
• lines-modified: the total number of lines modified.
• lines-churned: the total number of lines changed, i.e., the sum of lines-added, lines-deleted, and lines-modified.
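As an illustration of the hunk-based computation described above, the following sketch parses the change commands that GNU diff produces in its default output format (such as “3a4,5” for an addition, “8,10d7” for a deletion, and “12,14c12,16” for a modification) and aggregates the size metrics listed above. It is a simplified stand-in under these assumptions, not the actual IBUGS implementation.

import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: compute lines-added, lines-deleted, lines-modified, and lines-churned
// from the change commands in GNU diff's default output format.
class FixSizeMetrics {

    private static final Pattern HUNK =
            Pattern.compile("^(\\d+)(?:,(\\d+))?([acd])(\\d+)(?:,(\\d+))?$");

    int hunks, linesAdded, linesDeleted, linesModified;

    void parse(List<String> diffOutput) {
        for (String line : diffOutput) {
            Matcher m = HUNK.matcher(line);
            if (!m.matches()) continue;                        // skip the text lines of each hunk
            hunks++;
            int preSize  = rangeSize(m.group(1), m.group(2));  // region in the pre-fix version
            int postSize = rangeSize(m.group(4), m.group(5));  // region in the post-fix version
            switch (m.group(3).charAt(0)) {
                case 'a' -> linesAdded    += postSize;                    // addition
                case 'd' -> linesDeleted  += preSize;                     // deletion
                case 'c' -> linesModified += Math.max(preSize, postSize); // modification: take the maximum
            }
        }
    }

    /** lines-churned is the sum of added, deleted, and modified lines. */
    int linesChurned() {
        return linesAdded + linesDeleted + linesModified;
    }

    private static int rangeSize(String from, String to) {
        int start = Integer.parseInt(from);
        int end = (to == null) ? start : Integer.parseInt(to);
        return end - start + 1;
    }
}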

From the bug report we extract the priority and severity of a bug and include them as properties in our dataset. The priority of a bug describes its importance and typically ranges from P1 (most important) to P5 (least important). In contrast, the severity describes the impact and is one of the following: blocker, critical, major, minor, trivial, or enhancement. A severe bug may have a low priority when only few users are affected by it. However, in most cases bugs with high severity also have a high priority. In addition to the above properties, we annotate bugs that produce exceptions with tags. We obtain this information by parsing the short description of a bug for keywords: null pointer exceptions typically are indicated by the keywords “NPE” or “Null”, while other exceptions are indicated by “Exception”. In addition to size properties, we provide syntactic properties of changes. This supports the retrieval of bugs that were fixed in a certain way, say by changing a (single) method call or expression. In order to express how a fix changed the program, we use the APFEL tool [122]. APFEL builds the abstract syntax trees of the pre-fix and post-fix version, flattens the trees into token sets and computes the difference between these sets (see Figure 6.4).2 APFEL supports different types of tokens for method calls, expressions, keywords, operators, exception handling, and variable usage. The type of the token is encoded in a single capital letter (see Table 6.3). We use the differences computed by APFEL to create two fingerprints of a change at different levels of detail: The concise fingerprint summarizes the most essential syntactic changes such as method calls, expressions, keywords, and exception handling. In contrast, the full fingerprint additionally records changes in variable names and contains more detailed information about the affected tokens.

2Note that APFEL is insensitive to the order of tokens because it relies on sets. This means that certain types of changes are missed such as swapping two lines.


Figure 6.4: APFEL compares pre-fix and post-fix versions by converting the abstract syntax tree into a token set.

Token type           Description
Z–expression         Expressions that are used in casts, conditional statements, exception handling, loops, and variable declarations.
K–keyword            Keywords such as for, if, else, new, etc.
M–method-name        Method calls.
H–exception-name     Catch blocks for exceptions.
V–variable-name      Names of variables.
T–variable-type      Types of variables.
Y–literal            Literals such as numbers or strings.
O–operator-name      Operators such as +, −, &&, etc.

Table 6.3: APFEL distinguishes eight different types of tokens.

• The concise fingerprint shows whether a bug (more precisely, its fix) is related to keywords (presence of the “K” character), method calls (“M”), exception handling (“H”), or expressions (“Z”). In contrast to the full fingerprint, the concise fingerprint omits variable usage, operators, and literals, i.e., it is a subsequence of “HKMZ”.
• The full fingerprint additionally shows variable usage (“V” and “T”), operators (“O”), and literals (“Y”). Furthermore, it specializes keywords (null, true, false, etc.), expressions (if, while, for, cast, etc.) and operators (+, −, &&, etc.).

Figure 6.5 shows an example for a fix of a bug that caused a null pointer exception (NPE). The differences computed by APFEL show that a new if statement was inserted: several keywords (if, null, else, and return) and the operator != were added exactly once; APFEL additionally reports the new usage of the variable declaration, its type MethodDeclaration, and the condition of the if-statement. For the concise fingerprint, we omit the variable, literal, and operator tokens and the names of the other tokens. This results in the fingerprint “KZ”, telling us that keyword(s) and expression(s) were changed. In contrast, the full fingerprint contains all tokens, but omits names, except for keywords and operators. In the example of Figure 6.5 it is “K-else K-if K-null K-return O-!= T V Z-if”. We included fingerprints in our dataset to support researchers when retrieving a set of bugs that match certain syntactic properties. Say, a researcher is interested in bugs that are related to null pointer checks. In order to come up with a set of initial candidates, she can query for bugs containing “K-null” in their fingerprint. ASPECTJ: Fingerprints for all bugs in the IBUGS repository are provided in the description file repository.xml.
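The following sketch illustrates how the two fingerprints can be derived from an APFEL token difference such as the one in Figure 6.5. The token strings follow the naming of Table 6.3; the map of changed tokens is assumed to be produced by APFEL, and the code is a simplified illustration rather than the actual implementation.

import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Sketch: derive concise and full fingerprints from an APFEL token difference.
class Fingerprints {

    /** Concise fingerprint: the token types H, K, M, Z that occur in the difference,
     *  e.g. "KZ" for the fix in Figure 6.5. */
    static String concise(Map<String, Integer> tokenDiff) {
        SortedSet<Character> types = new TreeSet<>();
        for (String token : tokenDiff.keySet()) {
            char type = token.charAt(0);               // the first letter encodes the token type
            if ("HKMZ".indexOf(type) >= 0) {
                types.add(type);
            }
        }
        StringBuilder fingerprint = new StringBuilder();
        types.forEach(fingerprint::append);
        return fingerprint.toString();
    }

    /** Full fingerprint: all token types; names are kept for keywords and operators,
     *  and the expression kind is kept for Z tokens (e.g. "Z-if"). */
    static String full(Map<String, Integer> tokenDiff) {
        SortedSet<String> entries = new TreeSet<>();
        for (String token : tokenDiff.keySet()) {
            String[] parts = token.split("-");
            switch (parts[0].charAt(0)) {
                case 'K', 'O' -> entries.add(token);                              // keep keyword/operator names
                case 'Z'      -> entries.add(parts.length > 1 ? "Z-" + parts[1] : "Z");
                default       -> entries.add(token.substring(0, 1));              // type letter only
            }
        }
        return String.join(" ", entries);
    }
}

Applied to the token changes of Figure 6.5, this yields “KZ” and “K-else K-if K-null K-return O-!= T V Z-if”, matching the fingerprints shown there.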

6.3.7 Repository

The IBUGS repository may contain several hundreds of versions of a program. For a typical project, a checkout from the source repository can comprise 50 MB or more of data. This yields a size of several gigabytes for the IBUGS repository, which makes distribution difficult. We therefore create a new Subversion repository that stores the code for all versions. This greatly reduces the amount of space required to store the versions for the fixes included in the IBUGS repository. Meta information about the fixes in the IBUGS repository is stored in an XML file. For each bug we give information about the test suite, a pointer to the tests that were committed with the fix (if any), and the diffs for all files that were changed in the fix. ASPECTJ: Snapshots of the project are approximately 60 MB in size. Although we have more than 700 versions (two for each bug) in the IBUGS repository, the resulting file size is only 260 MB.

 1      TypeX onType = rp.onType;
 2      if (onType == null){
 3 -      Member member = EclipseFactory.
 4 -        makeResolvedMember(
 5 -          declaration.binding);
 6 -      onType = member.getDeclaringType();
 7 +      if (declaration.binding != null){
 8 +        Member member = EclipseFactory.
 9 +          makeResolvedMember(
10            declaration.binding);
11 +        onType = member.getDeclaringType();
12 +      } else {
13 +        return null;
14 +      }
15      }
16      ResolvedMember[] members = onType.
17        getDeclaredPointcuts(world);

Token changes computed by APFEL: K-else (+1), K-if (+1), K-null (+1), K-return (+1), O-!= (+1), T-MethodDeclaration (+1), V-declaration (+1), Z-if-declaration.binding!=null (+1)

Concise fingerprint: KZ

Full fingerprint: K-else K-if K-null K-return O-!= T V Z-if

Figure 6.5: Fingerprints for Bug 87376 “NPE when unresolved type of a bound var in a pointcut expression (EclipseFactory.java:224)”.

Figure A.6 on page 167 shows an excerpt of the description file repository.xml.

6.4 Subjects

We have applied our approach to two different projects: ASPECTJ, a compiler for an aspect-oriented extension of JAVA, and RHINO, an interpreter for JAVASCRIPT written in JAVA. This section presents characteristics of both subjects and compares them in terms of the structure and the size of fixes.

6.4.1 Characteristics

The ASPECTJ compiler [59] consists of 75 kLOC and its test suite contains more than 1000 test cases. RHINO [78] is an interpreter for JAVASCRIPT that was incubated in 1999 by the MOZILLA foundation. With 49 kLOC, RHINO is the smaller of the two subjects. However, its test suite is slightly larger (1248 tests compared to 1178). Table 6.4 provides statistics about the development history of both projects. Both ASPECTJ and RHINO are non-interactive programs that are controlled by the command line. This makes it easy to write test cases, which accounts for the large number of tests in both projects. In general, the ASPECTJ developers put a stronger emphasis on meaningful commit messages.

                                              ASPECTJ       RHINO
Size of code in latest revision (kLOC)             75          49
Number of commits to CVS repository              7947        2138
Number of tests in latest revision               1178        1248
Number of developers                               13          17

Number of bugs in IBUGS repository                369          32
Number of bugs with associated tests              223          29
Size of repository (MB)                           260           7
First bug report in IBUGS repository       2002-07-03  2001-05-03
Last bug report in IBUGS repository        2006-10-20  2003-12-07

Table 6.4: Statistics of the development history of ASPECTJ and RHINO. For ASPECTJ, IBUGS is able to link more bugs to transactions than for RHINO. As a result, our approach links more bugs to changes in the source code, which is why the final number of bugs in ASPECTJ is much higher than for RHINO. For ASPECTJ, we identified 369 bug reports that changed program code. For 223 of these bugs, we also found associated test cases. RHINO has 32 bugs, out of which 29 offer an associated test case.

6.4.2 Locality Figure 6.6 compares the locality of fixes for both projects. The histograms show the number of bugs that were fixed in one, two, three, four, five, or more than five Java files (left bars), classes (middle bars), and methods (right bars), respectively. In both projects, the majority of bugs were corrected in exactly one method. This suggests that most bugs are local, spanning across only a few methods. This is in line with our analysis of the ECLIPSE bug database presented in Chapter 2.

6.4.3 Size Figure 6.7 shows the distribution of churned lines of code.3 There are many small fixes for ASPECTJ: 44.4% of all fixes churned ten lines or less; almost ten percent of all fixes are one-line fixes, i.e., churned exactly one line. Only a few fixes deleted code (about one third); most fixes modified existing code (e.g., wrong expressions) or added

3The number of churned lines is the number of added lines plus the number of deleted lines plus the number of changed lines.


Figure 6.6: Histogram for touched files, classes and methods for fixes in ASPECTJ (top) and RHINO (bottom). The majority of fixes in both projects touches only one method.


Figure 6.7: Cumulative distribution of deleted, modified, added and churned lines in ASPECTJ (top) and RHINO (bottom). Most fixes churn only few lines.

new code (e.g., null pointer checks). Overall, the percentages of small fixes that we observed are consistent with the ones observed by Purushothaman and Perry [84].

For the bugs in RHINO, the fixes are generally larger: Only 20% of the bugs are fixed with changes to at most five lines, compared to almost 28% in ASPECTJ. The largest fix in RHINO touches over 2000 lines, compared to only 750 in ASPECTJ. One conclusion we could draw from these figures is that fixes in RHINO are more complex than fixes in ASPECTJ. However, it may also be that in RHINO changes classified as fixes are not minimal, i.e., they also include changes not necessary to fix the bug at hand. Section 6.5 examines this question in detail.

                Fixes in ASPECTJ      Fixes in RHINO
Fingerprint       Small      All        Small      All    Examples in Figure A.3
empty                 6       33            1        2    Bug 132130
HK                    0        2            0        0
HKM                   1        4            0        0
HKMZ                  0       32            0        3
K                     5       10            0        1    Bug 151182
KM                    7       24            0        0    Bug 43194
KMZ                  13      192            7       18    Bug 67774
KZ                   20       31            1        3    Bug 123695
M                    12       18            2        2    Bug 80916
MZ                   10       16            1        2    Bug 42539
Z                     5        7            1        1    Bug 69011, 161217
Total                79      369

Table 6.5: Number of bugs per fingerprint in ASPECTJ and RHINO. Simple fingerprints are most dominant for small fixes.

6.4.4 Syntactical Properties

In Table 6.5 we show the distribution of concise fingerprints for small fixes (i.e., five lines or less churned within one method) and all fixes of both data sets. The most dominant fingerprint is “KMZ”, indicating that most fixes are of complex nature. Several fixes change only literals and variable names and therefore have an empty fingerprint. Exception handling (fingerprints with substring “H”) is exclusive to larger fixes, likely because adding the skeleton of try/catch already takes four lines. Simple fingerprints are most dominant for small fixes: Twelve fixes changed only method calls (“M”), five fixes changed only keywords (“K”), and five fixes changed only expressions. The fingerprint “KZ” typically points to the addition of null pointer checks, which consist of a keyword (either if or null) and an expression that checks for null. We inspected all small fixes and observed mainly three categories: (1) fixes that change expressions, mostly checks for null pointers (presence of “Z” in the fingerprint), (2) fixes that add or change method calls (presence of “M” and absence of “Z”), and (3) other fixes (for instance an empty fingerprint). For examples of fingerprints and characteristic fixes, we refer to Figure A.3 on page 164.

6.5 Minimizing Fixes with Delta Debugging

Due to the heuristic used to link changes to bug reports in IBUGS, it may happen that we incorrectly classify changes as part of a fix:

Misleading Log Messages When parsing log messages, we look for keywords such as “fixed” or “bug”, as well as numerical identifiers that are used in the bug database. Since a comprehensive semantic analysis is not possible, we may classify a log message as describing a fix although this is not the case. One example of such a log message would be “This is a failed attempt to fix bug #4711”.

Oversized Change Sets In theory, commits should be task-based, meaning that a single commit should only contain changes related to a single task. In practice, however, we encounter many commits that contain changes from several tasks. Thus, a commit correctly classified as fix may contain changes not required to fix the bug.

Incorrectly classified changes jeopardize the validity of IBUGS as a benchmark. If the repository contains too many irrelevant changes, the value of an evaluation based on IBUGS would be questionable. To provide a meaningful benchmark, we need to know what percentage of the changes in IBUGS are irrelevant. One way to identify such irrelevant changes is to perform a manual investigation of the source code. This approach is problematic as it involves a lot of human effort and does not scale well. Additionally, some bugs require expert knowledge of the system under study to decide if a change is really necessary. Thus, manual investigation is infeasible for a meaningful evaluation. Our idea to approach this problem is to use a minimization algorithm called DELTA DEBUGGING [120]. To minimize fixes, we assume that a change that constitutes a fix can be split into several parts (for example lines or hunks). DELTA DEBUGGING uses repeated evaluations of the failing test case to identify relevant parts of the fix. To this end, the algorithm systematically applies subsets of the fix and tests whether each subset is sufficient to fix the bug. In the remainder of this section, we introduce the minimization problem and describe DELTA DEBUGGING (Section 6.5.1), and present the results of minimizing a subset of the bugs in ASPECTJ and RHINO (Section 6.5.2).

6.5.1 Delta Debugging We suspect that for some bugs, not all changes of the committed fix are actually relevant to fix the failing test. Hence, we are interested in finding the minimal set of changes that are necessary to fix the failing test. This is an instance of the so-called minimization problem. To define the minimization problem, we first give basic definitions following the terminology introduced by Zeller et al. [120]:

Definition 21 (Circumstances, Outcome, Test Function) Let C be the finite set of circumstances that are to be minimized. P(C) is the power set of C that consists of all subsets of C. A test function test: P(C) → {✔, ✘, ?} maps an element of the power set to one of the possible outcomes pass (✔), fail (✘) and unresolved (?). A set F ∈ P(C) for which the test function evaluates to ✘ is called a failure-causing set of circumstances with respect to the test function.

The minimization problem is then defined as finding the minimal set Fmin ∈ P(C) such that test(Fmin) = ✘ and, for all G ∈ P(C), |G| < |Fmin| → test(G) ≠ ✘. Fmin is of interest since it denotes the minimal set of circumstances that together cause the test function to fail. Unfortunately, finding a minimal set of circumstances is a hard problem that requires testing all elements of P(C). Thus, an algorithm that computes a minimal set has exponential runtime4, and is therefore not applicable to problems as they occur in practice (the largest fix investigated in our experiments touches 490 lines). To circumvent this problem, Zeller and Hildebrandt [120] propose to relax the minimization criterion and to find k-minimal subsets instead of calculating a minimal subset. The k-minimization problem is to find a set T such that

T ⊂ C
test(T) = ✘
∀ T′ ⊂ T, 1 ≤ |T′| ≤ k : test(T − T′) ≠ ✘

For a given k, k-minimality of a set T means that removing k or fewer elements from T will cause the outcome of the test function to change to ✔ or ?. However, there may exist T′′ ⊂ C with |T′′| > k such that test(T − T′′) = ✘. Zeller and Hildebrandt [120] propose an algorithm called DELTA DEBUGGING that calculates 1-minimal subsets with good average performance. The basic idea of the algorithm is that relevant circumstances are typically not distributed randomly, but rather

form contiguous blocks. The algorithm exploits this fact by considering large contiguous blocks of circumstances in each iteration. DELTA DEBUGGING is best described by the following definition (taken from Zeller’s paper [120]):

4For a proof see [80].

Definition 22 (DELTA DEBUGGING) C denotes a set of circumstances for which the test function yields fail, test(C) = ✘. P(C, k) denotes any partitioning of C into k equal parts. The DELTA DEBUGGING algorithm ddmin(C) = ddmin(C, 2) is defined as

ddmin(C, k) =
   ddmin(Cj, 2)                    if test(Cj) = ✘ for some Cj ∈ P(C, k)
   ddmin(C − Cj, max(k − 1, 2))    if test(C − Cj) = ✘ for some Cj ∈ P(C, k)
   ddmin(C, min(|C|, 2k))          if k < |C|
   C                               otherwise

DELTA DEBUGGING works by continually partitioning the current set of circumstances and testing each subset. For each partition Cj that fails the test function, DELTA DEBUGGING continues to examine this subset (case 1). If no partition fails, it will consider the complement of each partition (case 2). Again, if no set fails the test function, DELTA DEBUGGING will increase the granularity of the partitioning (case 3). The algorithm terminates if no further elements can be removed from the set (case 4). For a given set C and a test function, DELTA DEBUGGING computes a 1-minimal subset of C5. The worst-case runtime is O(n²) where n = |C|. However, according to Neuhaus [80], DELTA DEBUGGING executes much faster in practice. It has been successfully applied in different areas of computer science including software engineering [120, 22, 100] and security [81].
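To make the recursion of Definition 22 concrete, the following sketch implements ddmin over a list of circumstances with a caller-supplied test function. It is a minimal illustration that assumes distinct circumstances and a deterministic test function; it is not the DDCHANGE implementation used later in this section.

import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Minimal sketch of the ddmin algorithm from Definition 22. The test function is
// supplied by the caller and returns FAIL for a failure-causing set of circumstances.
class DeltaDebugger<E> {

    enum Outcome { PASS, FAIL, UNRESOLVED }

    private final Function<List<E>, Outcome> test;

    DeltaDebugger(Function<List<E>, Outcome> test) {
        this.test = test;
    }

    /** Returns a 1-minimal failure-causing subset of the given circumstances. */
    List<E> ddmin(List<E> circumstances) {
        return ddmin(circumstances, 2);
    }

    private List<E> ddmin(List<E> c, int k) {
        if (c.size() < 2) {
            return c;                                      // cannot be reduced any further
        }
        List<List<E>> partitions = partition(c, k);

        // Case 1: some subset fails on its own -> continue with that subset.
        for (List<E> subset : partitions) {
            if (test.apply(subset) == Outcome.FAIL) {
                return ddmin(subset, 2);
            }
        }
        // Case 2: some complement fails -> continue with the complement.
        for (List<E> subset : partitions) {
            List<E> complement = new ArrayList<>(c);
            complement.removeAll(subset);
            if (test.apply(complement) == Outcome.FAIL) {
                return ddmin(complement, Math.max(k - 1, 2));
            }
        }
        // Case 3: increase the granularity of the partitioning.
        if (k < c.size()) {
            return ddmin(c, Math.min(c.size(), 2 * k));
        }
        // Case 4: no further reduction possible -> c is 1-minimal.
        return c;
    }

    private List<List<E>> partition(List<E> c, int k) {
        List<List<E>> parts = new ArrayList<>();
        for (int i = 0; i < k; i++) {
            int from = i * c.size() / k;
            int to = (i + 1) * c.size() / k;
            if (from < to) {
                parts.add(new ArrayList<>(c.subList(from, to)));
            }
        }
        return parts;
    }
}

In the setting of the next section, the circumstances are the individual line changes of a commit, and a "failing" outcome corresponds to a candidate subset that actually fixes the bug.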

6.5.2 Minimizing Fixes In the context of minimizing fixes, we define the set of circumstances and the test function as follows:

Circumstances Every bug is fixed by a single commit to the version archive. To minimize the commit, it is necessary to further split it into a set of changes that each add, remove, or change a single line. This set of changes forms the set of circumstances for minimizing fixes.

Test Function The test function evaluates whether a given subset of changes is a fix for the bug at hand. To automate the test function, we rely on the existence

5A proof is available in [120].

of a regression test suite that contains at least one failing test to reproduce the problem, and one or more passing tests. To fix a bug, a set of changes Cfix has to meet the following conditions:

1. Applying Cfix alters the outcome of the failing test from fail to pass.

2. Applying Cfix does not change the outcome of any of the passing tests.

The first condition tests that Cfix actually fixes the problem, whereas the second condition ensures that the fix does not break any other tests in the test suite.

Experimental Setup

To perform the experiments, we use a publicly available implementation of DELTA DEBUGGING called DDCHANGE [18]. We use all bugs of ASPECTJ and RHINO that provide an associated test case that fails in the pre-fix version and passes in the post-fix version. This is to ensure that the test case is actually related to the bug, and the change applied actually fixes the problem. Altogether we found 15 bugs in RHINO, and 33 bugs in ASPECTJ. In order to minimize a problem, DDCHANGE requires a test function that evaluates a set of changes. In our setting, the test function (1) applies the set of changes, (2) tries to compile the modified program, and (3) executes the test suite to determine the test outcome. Depending on the results of steps 2 and 3, the test function produces the following result:

Unresolved (?) The modified program cannot be compiled.

Fail (✘) The program compiles and all tests (including the failing test) pass. This is somewhat counter-intuitive, as DELTA DEBUGGING was initially designed to minimize failure-inducing input and therefore is interested in sets of changes that cause the test to fail. However, in our setting we look for changes that cause the failing test to pass and hence swap the meaning of pass and fail.

Pass (✔) The program compiles but at least one test in the test suite fails.

As explained above, including the whole test suite is important to make sure that the minimized set of changes not only fixes the bug, but also does not break other aspects of the program. To study the importance of considering all tests in the test function, we perform our experiments twice: In the first run, we consider only the failing test, whereas in the second run we also include the remaining tests of the test suite.
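The following sketch shows how such a test function could be wired up. The hooks applyChanges(), build(), and runTests() are hypothetical stand-ins for the subject's build and test infrastructure, not part of DDCHANGE or IBUGS; note the swapped meaning of pass and fail described above.

import java.util.List;

// Sketch of the test function used for minimizing fixes.
abstract class FixTestFunction {

    enum Outcome { PASS, FAIL, UNRESOLVED }

    Outcome test(List<Change> changes) {
        applyChanges(changes);                     // (1) apply the candidate subset of changes
        if (!build()) {
            return Outcome.UNRESOLVED;             // (2) the modified program cannot be compiled
        }
        boolean allTestsPass = runTests();         // (3) run the failing test and the regression tests
        // Pass and fail are swapped: a subset that makes all tests pass is a fix,
        // which DELTA DEBUGGING treats as the "failing" outcome it searches for.
        return allTestsPass ? Outcome.FAIL : Outcome.PASS;
    }

    interface Change { /* a single added, removed, or changed line */ }

    abstract void applyChanges(List<Change> changes);  // check out the pre-fix version and apply the patch
    abstract boolean build();                          // invoke the project's build file
    abstract boolean runTests();                       // true if the whole test suite passes
}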

Results

The results of our experiments with RHINO are summarized in Table 6.6. Tables 6.7 and 6.8 show results for ASPECTJ. Column Bug Id lists the bug identifier for each bug included in the study. Column Size of Fix gives the size of each fix in terms of the number of churned6 lines. Since some fixes also change comments, we also give the fraction of non-comment lines churned in column three. Columns four and five list the minimization results in terms of the fraction of relevant lines when only considering the failing run (column four) and for the whole test suite (column five). To aggregate the results, we provide average and median values in the last two rows of each table. Since average values can be misleading if the data is not normally distributed, we first check whether the data is normally distributed. To that end, we compare each row against a normal distribution using a Kolmogorov-Smirnov test [14]. Parameters for the normal distribution are estimated using the maximum likelihood method. The results of the Kolmogorov-Smirnov test reveal that the data is not normally distributed, which tells us that the average values should not be used. In the remainder of this section we will therefore only use median values when interpreting data. Based on the values in Tables 6.6 and 6.7 we make the following observations:

Size of Fixes The median size of a fix is roughly equivalent for both projects (11 lines for ASPECTJ compared to 14 lines for RHINO). Considering the size distribution of all bugs presented in Figure 6.7, we can conclude that the bugs included in our evaluation are slightly larger than the typical bug in the repository. An interesting observation is the big gap between average and median sizes. For RHINO, a few bugs with very large fixes are enough to tilt the average towards a misleadingly high number. This highlights the importance of using median values to analyze the data.

Comments In terms of comments, we observe that fixes for RHINO typically contain far fewer comments than fixes for ASPECTJ. As comments are important for the maintenance of a project, we conclude that the ASPECTJ team is more concerned with project quality than the RHINO developers. This is in line with observations made in Section 6.4.

Minimization Our minimization tool is able to identify irrelevant lines in 88% of all bugs in ASPECTJ, and 80% of the bugs in RHINO. Thus we can conclude that the majority of commits classified as bug fixes contain changes that are irrelevant for

6Churned lines consist of added, deleted and changed lines.

                                                 Minimization Results
            Size of Fix    Fraction of Non-      Failing
Bug Id         (Lines)      Comment Lines      Test Only     All Tests
#114491             10                1.00           0.90          0.90
#114493              3                1.00           0.33          0.33
#137181             66                0.94           0.64          0.86
#157509            310                0.90           0.05          0.63
#159334             58                1.00           0.31          0.36
#177314             10                0.90           0.20          0.50
#179068            173                0.91           0.03          0.50
#181654             23                0.96           0.78          0.96
#181834             14                0.93           0.07          0.36
#193555            115                0.97           0.01          0.01
#194364              5                1.00           0.60          0.80
#203402              3                1.00           0.33          1.00
#203841            490                0.97           0.52          0.65
#210682              8                1.00           0.75          0.75
#220584              4                1.00           1.00          1.00
Average             88                0.96           0.43          0.64
Median              14                0.97           0.33          0.65

Table 6.6: Minimization results for RHINO. For example, the fix for bug #177314 touches 10 lines, out of which 1 is a comment. Minimizing with only the failing test removes 7 further lines, whereas when considering the whole test suite, only 4 lines can be removed. Overall, if minimization uses only the failing test, DELTA DEBUGGING considers the majority of lines irrelevant for fixing the bug.

                                                 Minimization Results
            Size of Fix    Fraction of Non-      Failing
Bug Id         (Lines)      Comment Lines      Test Only     All Tests
#34925              79                0.89           0.48          0.48
#36803               7                0.43           0.14          0.14
#37739               4                0.75           0.75          0.75
#39993             178                0.80           0.49          0.49
#42993              63                0.97           0.02          0.02
#43033               5                1.00           0.80          0.80
#47754               7                0.86           0.29          0.29
#49457              11                1.00           0.64          0.64
#49638               6                1.00           1.00          1.00
#51320              43                0.91           0.67          0.67
#51322               6                0.83           0.67          0.67
#53981               8                0.25           0.12          0.12
#53999               4                1.00           0.50          0.50
#54421              27                1.00           0.59          0.59
#55341               1                1.00           1.00          1.00
#60015               5                0.80           0.60          0.60

Table 6.7: Minimization results for ASPECTJ (1/2). The remaining results can be found in Table 6.8.

                                                 Minimization Results
            Size of Fix    Fraction of Non-      Failing
Bug Id         (Lines)      Comment Lines      Test Only     All Tests
#61536              15                0.93           0.60          0.60
#62642              18                0.94           0.22          0.22
#64069              23                0.83           0.35          0.35
#64331              23                0.96           0.57          0.61
#65319              41                0.88           0.59          0.61
#67774               4                0.75           0.50          0.50
#68991              17                1.00           1.00          1.00
#69459              11                1.00           0.82          0.82
#70619              10                0.80           0.90          0.90
#71377              55                0.82           0.55          0.55
#72157               7                0.86           0.71          0.71
#72528              74                0.73           0.22          0.32
#72531              20                0.85           0.60          0.60
#76096              12                0.75           0.50          0.50
#80249              86                0.81           0.29          0.29
#87376               6                1.00           0.83          0.83
#173602              2                1.00           0.50          0.50
Average             26                0.86           0.56          0.56
Median              11                0.88           0.59          0.60

Table 6.8: Minimization results for ASPECTJ (2/2). The median bug has 44% of irrelevant lines.

fixing the bug. To measure the extent of the problem, we compare the fraction of lines deemed relevant for the fix to the fraction of non-comment lines (columns three and five in Tables 6.6 and 6.7): For both projects, DELTA DEBUGGING identifies roughly one third of the changed lines as irrelevant, which is a considerable fraction. The discussion at the end of this section presents ideas on how to reduce the amount of noise.

Test Function To evaluate the effect of including all tests in the test function, we compare median values from columns four and five. For RHINO, using only the failing test causes roughly one third more lines of the median minimized fix to be removed than when using all tests. We conclude that for RHINO it is very important to use the whole test suite in the test function. Otherwise, DELTA DEBUGGING identifies lines as irrelevant that are actually relevant for the correctness of the fix. In comparison, for ASPECTJ the results are almost identical, indicating that the regression tests provided by the developers are of high quality. This confirms our previous observations that the ASPECTJ team puts more emphasis on project maintenance and test quality.

To validate the results of the minimization, we investigated a sample of the bugs included in the study. We chose bugs for which including all tests made a strong difference, and bugs that showed a large overall minimization. Since understanding the effects of a fix requires a lot of human effort, we restricted ourselves to a sample size of four bugs split evenly over both projects:

ASPECTJ #42993 & #80249 For both bugs, DELTA DEBUGGING is able to significantly reduce the number of lines. The fix for bug #42993 contains large blocks of changes that seem to deal with optimizations made by the compiler. These optimizations seem to have no influence on the remaining tests and therefore are deemed irrelevant. For bug #80249 the commit contains a number of changes that implement a new feature that is unrelated to the bug. These lines are removed by DELTA DEBUGGING.

RHINO #157509 & #203402 The fix for bug #157509 is concerned with parsing a number of special characters. The failing test only contains one of those characters. Hence, DELTA DEBUGGING removes the code for the remaining characters when using only the failing run. However, other tests in the test suite contain more of the special characters and therefore DELTA DEBUGGING removes much fewer lines. In the case of bug #203402, one line of the fix replaces the values of variables with constants (see Figure 6.8). This line is identified as irrelevant by DELTA DEBUGGING as the failing run has the same values as the constants for

1 <     generateCodeFromNode(child, node, trueLabel, falseLabel);
2 <     generateCodeFromNode(child.getNext(),
3 <         node, trueLabel, falseLabel);
4 ---
5 >     generateCodeFromNode(child, node, -1, -1);
6 >     generateCodeFromNode(child.getNext(), node, -1, -1);

Figure 6.8: The fix for bug 203402. If DELTA DEBUGGING only relies on the failing run, the first line is deemed irrelevant since the failing run has −1 for both trueLabel and falseLabel.

the variables. Thus, the failing run actually is too imprecise to fully capture the effects of the bug.

Overall, our manual investigation revealed no misclassified lines, which indicates that our implementation is correct and produces reliable results. In the next section, we discuss general threats to the validity of our results.

Threats to Validity As with any empirical evaluation, the results of our experiments are subject to threats to validity:

Threats to external validity concern our ability to generalize our results. The scope of our study is limited as it only includes 48 bugs of two different projects. Also, the way we selected bugs introduces a bias towards projects that use version archives, bug repositories and automated tests. Thus, we cannot claim that our results are generalizable to all projects.

Threats to internal validity concern our ability to draw conclusions about the connections between our independent and dependent variables. DDCHANGE, the implementation of DELTA DEBUGGING used in our experiments, has been used in several different projects. Thus, we believe that it is sufficiently mature to be useful. However, it may still be the case that our implementation contains errors that affect the correctness of our results. To counter this threat, we have manually investigated a subset of the bugs and found no obvious errors.

Threats to construct validity concern the adequacy of our measures for capturing dependent variables. In our experiments, the sole dependent variable captured is

the number of lines affected by the minimized fix. We believe that this is an adequate measure to capture the size of a fix. More fine-grained processing would allow minimizing a fix at lower levels of abstraction, for example at the byte code level. However, we believe that using single lines as granularity is the best option, as many lines in a program represent logical entities (for example method calls, boolean queries) that should not be split further.

Conclusions The results of our experiment show that for the investigated sample roughly 30% of the changed lines are not required to fix the bug. Due to the nature of our minimization technique, our sample is biased towards bugs with associated test cases. However, we make no requirements regarding the structure or the location of a fix. This gives us reason to believe that the ratio of irrelevant lines in our experiments is comparable to other bugs in the repository, and possibly also in other projects. Overall, the ratio of irrelevant lines is considerable. The root cause is the imperfect heuristic to identify bug fixes, which is also used in other approaches [124]. The main conclusion of our results is that this technique should be used with care and its results need to be analyzed carefully. For IBUGS, we have put our results online and advise researchers to only use the minimized fixes. In the future, we need more reliable approaches to identify fixes in version archives. Possible ways to approach this problem include the following:

Manual Investigation One way to remove irrelevant lines is to manually minimize each potential fix. However, for large repositories and fixes, this clearly is too much effort.

Minimization Another option is to rely on minimization techniques as proposed in this section. However, since minimization requires an oracle, this approach always produces samples biased towards bugs with automated tests.

Explicit Linking The best solution would be to improve the development process such that linking no longer needs to rely on heuristics. For example, the JAZZ platform [43] provides means to explicitly link a bug report to a transaction in the version archive. However, it may still be the case that a developer misuses this feature, thereby creating spurious links that again introduce irrelevant changes.

The next section presents work by other researchers that compares IBUGS to similar bug data sets and investigates whether the sample of bugs contained in the repository is biased or not.

6.6 Biased Data Sets

IBUGS only includes a subset of all bugs that occurred in the history of each subject. This is due to several requirements a bug has to fulfill to qualify for the repository. For example, many bugs cannot be linked to transactions at all. Also, some bugs are not included because the source code of the revision cannot be compiled. As a consequence, the set of bugs in IBUGS is only a sample of all bugs. Hence, this sample may be biased, i.e., the bugs in the sample may differ significantly from the remaining bugs. In that case, the results of an evaluation performed on the sample are not generalizable to all bugs in the subject. To judge the quality of IBUGS as a benchmark, we would like to know whether the sample of bugs for the subjects is biased or not. Other researchers recognized the problem of biased samples in bug data sets. In recent work, Bird et al. [12] investigate the bias of several bug data sets mined from history, among them the ASPECTJ dataset. The results of their study give important insights on the quality of the ASPECTJ dataset, which is why we present a summary of their findings in the following sections.

6.6.1 Bug Features

To evaluate whether a bug benchmark is biased, Bird et al. compare the set of bugs included in a benchmark to all bugs available for the program. To be able to compare bugs, Bird et al. introduce a set of bug features that measure different aspects of both the development process and the bug itself. These features are then used to compare the sample of each subject against the whole set of bugs. The whole set of bugs available for a subject s is denoted as B. The set of bugs included in the dataset for a subject is denoted as Bi. Obviously, Bi is a subset of B. The question is whether Bi is a representative subset of the bugs in B. To investigate this, Bird et al. compare the distribution of several bug features in B and Bi. If Bi is a representative sample of B, there should be no statistically significant differences in the two distributions. For the bugs in B \ Bi, the fix is unknown. The selection of bug features therefore has to be restricted to features that are known for all bugs. In their work, Bird et al. examine the following features:

Severity is a feature that classifies the importance of a bug. Developers assign a severity level as soon as a bug is confirmed. Typical values for severity levels are blocker, minor and trivial. Bird et al. argue that the severity level is an important bug feature because developers are more interested in fixing severe

                                                Hypotheses
Subject        Total Bugs   Linked Bugs   Severity   Experience   Verified
ECLIPSE-B          24 119        10 017          ✓            ✓          ✓
ECLIPSE-Z         113 877        34 919          ✓            ✓          ✓
APACHE               1383           686          ✓            ✓          ✗
NETBEANS           68 299        37 498        N/A            ✓          ✓
OPENOFFICE         33 924          2754        N/A            ✓          ✓
GNOME             117 021        45 527          ✓            ✓          ✗
ASPECTJ              1121           343          ✗            ✗          ✗

Table 6.9: Subjects used in the bias evaluation presented by Bird et al. [12]. The right side shows evaluation results for different hypotheses. Each entry lists whether a hypothesis for a subject was accepted (✓) or rejected (✗).

bugs. Hence, the sample set Bi should contain the same proportion of severe bugs as found in B.

Experience measures a developer’s background in fixing bugs. Bird et al. approximate experience of a developer d at time t as the number of bugs d fixed prior to t.

Verification is a binomial variable indicating whether or not a bug is verified after it has been resolved. This feature captures the process a bug goes through. Severe bugs generally receive more attention and hence are more likely to be verified.

To gather the data required to measure these features, Bird et al. use publicly available bug databases for all benchmark subjects included in the study. In their study, Bird et al. examine two variants of the ECLIPSE dataset [124], the ASPECTJ subject of IBUGS, and four additional projects for which they use their own tool to identify fixes. The left part of Table 6.9 shows the total number of bugs and the number of linked bugs for all projects. In this setting ASPECTJ is the smallest subject in terms of the number of bugs.

6.6.2 Results

To evaluate whether the bugs in Bi differ from those in B, Bird et al. formulate the following three hypotheses based on the bug features introduced above:

Severity There is a difference in the distribution of severity levels between B and Bi.

Experience Bugs in Bi are fixed by more experienced people than those who fix bugs in B.

Verified Bugs in Bi are more likely to have been verified than the population of B.

For every subject, Bird et al. test each of the three hypotheses. In order for a subject to be unbiased, all three hypotheses need to be rejected. The results of the evaluation are summarized in columns “Severity”, “Experience” and “Verified” of Table 6.9. Each entry lists whether a hypothesis for a subject was accepted (✓) or rejected (✗). In two cases (marked as N/A), the subjects did not provide severity data and hence the hypotheses could not be evaluated. The results show that for ASPECTJ, all three hypotheses were rejected. This means that there is no statistically significant difference between bugs in B and Bi in terms of severity, experience of developers and verification status. In contrast, all of the other five subjects showed a strong bias for at least two of the investigated features. Bird et al. summarize:

The conclusion from this in-depth analysis is that there is no bias in the IBUGS ASPECTJ data set with regard to the bug features examined. This is an encouraging result, in that it gives us a concrete data set that lacks bias (along the dimensions we tested).

6.7 Conclusions

The version history of a project collects all past successes and failures. In this chapter we presented IBUGS, an approach that leverages the history of a project to automatically extract benchmarks for bug localization tools. We have applied our approach to two different projects and mined a total of over 380 bugs. In contrast to similar repositories, IBUGS also identifies tests associated with a fix, and provides an infrastructure to execute the test suite in each revision. Thus, IBUGS is useful for both static and dynamic approaches that want to perform an evaluation using realistic bug data. IBUGS makes several assumptions about the structure of commits and log messages and uses heuristics to identify fixes. Due to these assumptions, IBUGS sometimes incorrectly classifies changes (or parts of a commit) as fixes. To investigate the extent of this issue, we have implemented a minimization algorithm that relies on associated tests and the test suite to identify irrelevant changes. In our experiments, roughly one third of the lines changed by 48 fixes in two projects are irrelevant, which is a considerable amount of noise. In the absence of a more reliable technique to identify fixes, we recommend using minimized fixes obtained by running DELTA DEBUGGING on a subset of the bugs in IBUGS. Despite this problem we believe that IBUGS can be of great value to the research community. The bugs collected by IBUGS are real bugs as they occur in real projects. Also, work by other researchers [12] shows that the ASPECTJ subject is the only dataset that contains an unbiased sample of bugs. Therefore, results obtained by using IBUGS are more likely to transfer to real projects than results obtained from other repositories such as SIR [37] or the ECLIPSE PROMISE dataset [124]. To allow other researchers to benefit from our work, the dataset is publicly available. In addition, we also provide a fully-fledged infrastructure for reconstructing, building, and testing the versions with and without bugs (see the step-by-step guide in Figure A.1 on page 162). The IBUGS dataset is a first step towards the “huge collection of software bugs” that was demanded by Spacco et al. [96] at the Bugs workshop at PLDI 2005. The history of open source projects offers a huge number of bugs waiting to be discovered by researchers.

Chapter 7

Mining Models for Typestate Verification

In the last decade, we have witnessed a tremendous increase in the amount and the quality of available open source software. There are hundreds of thousands1 of open-source projects, many of which offer well-tested libraries. Such projects are of interest to project managers, since they can help a project avoid reinventing the wheel. The downside of using third-party libraries is that it often requires a considerable amount of effort to learn the correct usage. Some classes have a rather complex interface, which makes them easy to misuse. To avoid bugs due to incorrect usage of classes, researchers have devised a number of different techniques. One well-known approach is called typestate verification, which statically verifies that the usage of a class C cannot raise an exception at runtime. To verify a class C, the typestate verifier requires a typestate automaton for C, which relates transitions in the state of C to method calls. In particular, a typestate automaton contains special transitions for method invocations that raise exceptions. Recent approaches for typestate verification [40] are able to analyze projects with up to four thousand methods in minutes. In a few years, typestate verification will be fast enough to be built into modern IDEs. Unfortunately, almost no projects provide typestate automata (or any other form of machine-readable specification). However, many of them have executable usage examples, for instance in the form of a regression test suite or example programs. In this chapter, we investigate the usage of object behavior models as input to a static

1To date, Sourceforge.net (the biggest open-source project hoster) hosts over 230000 projects.



Figure 7.1: TAUTOKO overview. TAUTOKO takes an executable JAVA program (a) and observes its execution (b) to extract an initial typestate model (c). It then generates additional executions (test cases) to cover missing model transitions (d). The additional observed behavior results in an enriched specification (e). typestate verifier. If mined from a regression test suite, object behavior models can make previously hidden usage rules explicit and provide them in a format suitable for a typestate verifier. Since both approaches are based on finite state automata, converting object behavior models into typestate automata is straightforward. Unfortunately, in practice, most regression test suites do not provide enough cover- age to produce meaningful and complete automata. The reason is that most developers only test core functionality with positive usage examples. Only few test suites also provide tests for negative examples that trigger exceptions, and they usually do not ex- plore all statically possible ways to use a class. However, to mine typestate automata it is vital to have good coverage and tests that trigger exceptions. Otherwise, the verifier may produce too many false positives and miss calls that cause exceptions. To solve this problem, we propose to use test case generation to explore new be- havior and enrich existing specifications. We have implemented our approach in a tool called TAUTOKO 2, which builds upon the ADABU tool presented in the previ- ous section. Our approach is as follows (see Figure 7.1): First, we use ADABU to mine object behavior models from the execution of regression test suites. The initially mined model contains only observed transitions (Section 7.2). To enrich the specifica- tion, our TAUTOKO tool generates test cases to cover all possible transitions between all observed states, and thus extracts additional transitions from their executions (Sec- tion 7.3). These transitions can either end in legal states, thus indicating additional legal interaction; or they can raise an exception, thus indicating illegal interaction. Discov- ering such illegal interactions is the biggest advantage of our approach, as exceptional behavior is rarely covered by conventional executions or tests. To assess the benefits of enriched specifications, we put them to use in static type- state verification. The success of typestate verification depends on the completeness

2“Tautoko” is the Maori˜ word for “enhance, enrich”. 7.1. TYPESTATE ANALYSIS 109 of the given model: The more transitions are known as illegal, the more bugs can be reported; and the more transitions are known as legal, the more likely it is that addi- tional transitions can be treated as illegal. We expect that our enriched specifications are much closer to completeness than the initially mined specifications; and therefore, the static verifier should be much more accurate in its reports. This hypothesis is confirmed by an experiment (Section 7.4): On a sample of 800 bugs seeded into six Java subjects, we show that our static typestate verifier fed with enriched models reports significantly more true positives, and significantly less false positives than when being fed with the initial models.3 To show that object behavior models can actually help developers to avoid misusing classes, we have developed a plugin for ECLIPSE that runs typestate analysis whenever the developer saves a new version of a file. In the remainder of this chapter, we introduce the typestate miner (Section 7.2), explain our technique for test case generation (Section 7.3), present the results of our evaluation (Section 7.4), discuss related work (Section 7.5) and conclude with a sum- mary and ideas for future work (Section 7.6). Parts of this chapter are accepted for publication at the 2010 International Sympo- sium on Software Testing and Analysis [27].

7.1 Typestate Analysis

The term typestate was introduced by Strom et al. [99]. In object-oriented languages, the type of an object encompasses the set of all methods that can be invoked on the object. In contrast, the typestate enables only a subset of all operations based on the current state of the object. Transitions between states occur by invoking enabled methods, possibly enabling a different set of methods. Whenever a disabled method is invoked, an error occurs and the typestate switches to an error state.

A typestate automaton (or simply typestate) is a finite state automaton which encodes the typestate sequence of a class. As an example, consider Figure 7.2, showing the typestate for the SMTPProtocol class from the ristretto [35] library. After initialization, an SMTPProtocol object is in its initial state 0; calling openPort() brings it into state 1; and calling quit() from this state brings it back into the initial state 0. As explained above, invoking a method m that is disabled according to the typestate automaton causes a transition labeled with m to a special state ex. In our example, this is the case if quit() is invoked from the initial state 0, which raises a

3 In the remainder of this chapter, we will use the terms “specification” and “model” interchangeably.

Figure 7.2: Typestate for SMTPProtocol. The failing call to quit() shows as a transition to ex.

NullPointerException. A static typestate verifier can take this very specification and check a client for conformance; if it is possible to invoke quit() while still being in the initial state 0, the verifier will flag an error.
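To make this concrete, the following hypothetical client fragment shows the kind of misuse a typestate verifier would report against the model in Figure 7.2. Only openPort() and quit() are taken from the model; the enclosing method and the way the SMTPProtocol instance is obtained are assumptions for illustration.

// Hypothetical client of the ristretto SMTPProtocol class; how the
// instance is created is assumed for illustration.
void closeTwice(SMTPProtocol protocol) throws Exception {
    protocol.quit();      // flagged: quit() is disabled in the initial state 0
                          // and raises a NullPointerException at run time
    protocol.openPort();  // legal: state 0 -> state 1
    protocol.quit();      // legal: state 1 -> state 0
}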

7.2 Mining Typestate Automata

How can we mine typestate automata from program executions? When comparing them to object behavior models, we immediately see that they are closely related. Both types of models are finite state automata where transitions are labeled with the names of methods. The only differences between them are the way states are identified, and the representation of exceptions:

States In typestate automata, states do not have a label; in object behavior models, they are labeled with the values of fields and inspector methods.

Exceptions Typestates represent failing method calls by transitions to a special state ex. In object behavior models, information about exceptions is stored as meta data in the transition labels.

In a sense, a typestate automaton is simply an object behavior model with unlabeled states. Hence, an algorithm to convert behavior models into typestate automata is straightforward and essentially consists of the following three steps:

1. The automaton is initialized with two states labeled start and ex.

2. Each state s of the behavior model is assigned a unique number n, and a corresponding state labeled n is added to the typestate.

3. For each invocation of a method m between two states si and sj, a new transition labeled with m is added to the typestate: If the invocation raised an exception, the transition is added from si to ex, otherwise it is added from si to sj.

With this conversion algorithm, we can use ADABU to mine object behavior models, and afterwards convert them to typestate automata.
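The sketch below spells out these three steps; the Invocation and Transition records are hypothetical stand-ins for ADABU's mined data structures, not its actual API.

import java.util.*;

class TypestateConverter {
    // Hypothetical stand-ins for ADABU's data structures.
    record Invocation(String sourceLabel, String targetLabel,
                      String method, boolean threwException) {}
    record Transition(String from, String to, String method) {}

    List<Transition> convert(String initialStateLabel, List<Invocation> invocations) {
        Map<String, String> ids = new HashMap<>();
        ids.put(initialStateLabel, "start");          // step 1: dedicated start state;
                                                      // "ex" is used on demand below
        int[] next = {0};
        List<Transition> typestate = new ArrayList<>();
        for (Invocation inv : invocations) {
            String from = ids.computeIfAbsent(inv.sourceLabel(),    // step 2: replace state
                    k -> String.valueOf(next[0]++));                // labels by numbers
            String to = inv.threwException()
                    ? "ex"                                          // step 3: failing calls
                    : ids.computeIfAbsent(inv.targetLabel(),        // end in ex, all others
                            k -> String.valueOf(next[0]++));        // keep their target state
            typestate.add(new Transition(from, to, inv.method()));
        }
        return typestate;
    }
}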

7.3 Enriching Typestate Automata

To yield precise results and few false positives during verification, a typestate needs to be complete, i.e., it needs to contain all relevant states and transitions for all methods in all states. When we first began our work, we ran TAUTOKO on a set of projects and mined typestates from the test suite executions for a set of interesting classes. Unfortunately, for the investigated classes, we found that most typestates only contained a fraction of all transitions. In particular, most typestates were missing transitions for failing methods, which renders mined typestates useless for typestate verification. We believe that the lack of observed failures is an issue that is common to many projects—and thus affects every approach for dynamic specification mining:

• Most bugs due to wrong usage of a class raise exceptions and are therefore easy to detect and fix. Thus, a specification miner will seldom record misuse and exceptions when tracing normal application executions.

• Unfortunately, we observed the same problem of missing exceptions when tracing test suites. Most developers do not test for exceptions. One explanation for this is that triggering an exception often only covers a few lines, and hence developers concentrate on tests for normal behavior.

• To generate a complete model, lots of tests are required. Usually, developers do not have enough time to write so many tests. Also, developers tend to skip tests that they consider too obvious, or that they are convinced should work.

One way to approach this problem is to use test case generation to create new tests that execute previously unknown states and transitions. The general idea of combining specification mining with test case generation was first described by Xie and Notkin [114].

Figure 7.3: Initial model of the SMTPProtocol class.

In this work, we extend the original idea to generate tests specifically targeted at enriching typestate automata. There is a huge variety of test generation strategies, ranging from complex static analyses such as symbolic execution [104] to simple random testing techniques [21, 75]. In this work, we use a test generation strategy that generates new tests by mutating an existing test suite.

Our technique works as follows: In the first step, TAUTOKO executes the test suite and mines a model for the class under test (CUT). This model is called the initial model. After that, it attempts to generate mutations to the test suite such that all methods are executed in all states of the initial model. TAUTOKO then applies each mutant in isolation and mines new models from the execution of the modified test suite. Finally, the initial model and all new models are combined into the model for the CUT.

To demonstrate the effect of TAUTOKO, consider Figure 7.3, which shows the initial model of class SMTPProtocol mined from an execution of the project's test suite. In contrast, Figure 7.4 shows the enriched model generated by TAUTOKO after evaluating all mutations. Not only does the enriched model contain several additional transitions, but it now also explicitly lists the exceptional behavior in its ex state. We will use these models to illustrate the techniques presented in this section.

Mutant generation starts by statically determining the set of methods that belong to the CUT or one of its super types. For every such method m, TAUTOKO tries to generate mutations such that m is invoked in all states of the initial model. To invoke method m in state s, TAUTOKO will either add an invocation of m, or suppress one or more existing method invocations. The choice of adding or deleting invocations depends on the number and types of the parameters m expects.


Figure 7.4: Enriched model of the SMTPProtocol class. Compare with the initial model in Figure 7.3.

If m only requires a reference to the receiver object, TAUTOKO simply adds a new call to m right after a method call that caused a transition to s in the initial model. For example, in Figure 7.3, to invoke method dropConnection() in state 1, TAUTOKO adds a call to dropConnection() right after the call to openPort() that causes the transition to state 1.

A problem arises if m expects parameters beyond the receiver object. In this case, we need to provide values for the parameters in order to call m. Our approach is to reuse existing invocations of m. If the initial model contains an invocation of m in another state t, TAUTOKO suppresses method calls such that the call occurs in state s instead. For example, to call method authSend(byte[]) in state 0, we can suppress the invocation of openPort() that causes the transition from state 0 to 1. The advantage of this approach is that it is simple to implement and works also for complex parameters that are difficult to generate. However, our approach is unable to handle methods with parameters that are never invoked by the program. To call such methods, we would need more generic test generation schemes [75, 103]. Still, our evaluation results show that even with this simple approach, enriched specifications already contain much more information and are likely to be much more useful in any verification setting.
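As an illustration of the two mutation operators, consider the following hypothetical JUnit-style test for SMTPProtocol and two mutants TAUTOKO could derive from it. The test body, the constructor arguments, and the credential bytes are assumptions; only the mutated methods and the add/suppress strategy follow the description above.

// Hypothetical original test (constructor arguments assumed).
@Test public void testAuth() throws Exception {
    SMTPProtocol p = new SMTPProtocol("localhost", 25);
    byte[] credentials = "user:secret".getBytes();
    p.openPort();               // state 0 -> state 1
    p.authSend(credentials);    // invoked in state 1
    p.quit();                   // back to state 0
}

// Mutant A (added call): dropConnection() needs no parameters beyond the
// receiver, so a call is inserted right after the transition into state 1.
@Test public void testAuth_addDropConnection() throws Exception {
    SMTPProtocol p = new SMTPProtocol("localhost", 25);
    byte[] credentials = "user:secret".getBytes();
    p.openPort();
    p.dropConnection();         // newly inserted invocation
    p.authSend(credentials);
    p.quit();
}

// Mutant B (suppressed call): to observe authSend(byte[]) in state 0, the
// openPort() call that would lead to state 1 is suppressed; the existing
// invocation is reused, so no argument has to be synthesized.
@Test public void testAuth_suppressOpenPort() throws Exception {
    SMTPProtocol p = new SMTPProtocol("localhost", 25);
    byte[] credentials = "user:secret".getBytes();
    // p.openPort();            // suppressed by TAUTOKO
    p.authSend(credentials);    // now invoked in state 0
    p.quit();
}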

Subject          Type              Description
javamail         SMTPTransport     Sending mails via SMTP.
javax.security   LoginModule       User authentication.
ristretto        SMTPProtocol      Sending mails via SMTP.
signature        Signature         Handling of digital signatures.
socket           Socket            Network communication.
zip              ZipOutputStream   File compression with zip algorithm.

Table 7.1: Subjects used in the case studies. The first three subjects are publicly available libraries, whereas the remaining classes are part of the JAVA standard API.

Algorithm 2 shows pseudo code for the procedure to enrich a typestate for class c. Input to the algorithm consists of the test suite, the initial typestate and the set of methods that can be called on c. The main loop of the algorithm (lines 3-21) iterates over all states s of the initial typestate. For every method m that expects parameters other than the receiver (lines 7-13), TAUTOKO finds all invocations of m in the initial typestate (line 7), tries to find a path that leads to s, and creates a mutated test that suppresses all method calls along the path (line 11). If the sole parameter to m is the receiver (lines 15-18), TAUTOKO finds all transitions after which the object is in state s (line 15) and generates a new test that invokes m right after the call that caused the transition (line 17). The final loop (lines 23-26) executes all tests, mines new typestates from each execution, and merges the new typestate into the current version. After the loop has finished, the procedure returns the enriched typestate.

7.4 Experimental Evaluation

In this section, we investigate how well TAUTOKO works in practice. Our goal is to compare the usefulness of enriched models versus initial models as well as manually generated complete models, and thus investigate the benefits and potential drawbacks of our approach.

Algorithm 2 Enrich Typestate Automaton

Input: Test Suite T = (t1, ..., tn)
Input: Initial typestate Minit = (Vinit, Einit)
Input: Methods to investigate M
Output: Enriched Typestate Mfinal = (Vfinal, Efinal)

 1: procedure ENRICH(T, Minit)
 2:   T' ← {}
 3:   for all s ∈ Vinit \ {start} do
 4:     Ms ← {Methods invoked in s}
 5:     for all m ∈ {M \ Ms} do
 6:       if hasParameters(m) then
 7:         Sm ← {u ∈ Vinit | ∃(u, u', n) ∈ Einit : m = n}
 8:         for all u ∈ Sm do
 9:           p ← getPath(u, s)
10:           if length(p) < ∞ then
11:             T'.add(suppressAllCalls(T, p))
12:           end if
13:         end for
14:       else
15:         Ts ← {(u, s, n) ∈ Einit}
16:         for all t ∈ Ts do
17:           T'.add(addCall(T, t, m))
18:         end for
19:       end if
20:     end for
21:   end for
22:   Mfinal ← Minit
23:   for all t ∈ T' do
24:     Mnew ← run(t)
25:     Mfinal ← merge(Mnew, Mfinal)
26:   end for
27: end procedure
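Algorithm 2 leaves the run() and merge() helpers abstract. The sketch below shows one plausible merge step, assuming that states can be identified by their labels, as in object behavior models; this is an assumption for illustration, not TAUTOKO's actual implementation. Under this assumption, merging is a union of states and transitions, which is also why an enriched model always contains the initial model (see Section 7.4.5).

import java.util.*;

class TypestateMerger {
    // Hypothetical representation of a mined typestate automaton.
    record Edge(String from, String to, String method) {}
    record Typestate(Set<String> states, Set<Edge> edges) {}

    // Union of states and transitions; records collapse duplicates via equals().
    Typestate merge(Typestate mined, Typestate current) {
        Set<String> states = new HashSet<>(current.states());
        Set<Edge> edges = new HashSet<>(current.edges());
        states.addAll(mined.states());
        edges.addAll(mined.edges());
        return new Typestate(states, edges);
    }
}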

7.4.1 Subjects

To evaluate the effectiveness of TAUTOKO, we have applied it to the six JAVA subjects listed in Table 7.1; for each subject class we generated and evaluated typestate automata. Three classes (upper half in Table 7.1) are part of publicly available libraries, whereas the remaining classes are part of the JAVA standard API. In terms of domain, the subjects are either security-related (javax.security and signature) or I/O-related (javamail, ristretto, socket and zip). We chose our subjects by investigating a subset of open-source projects from big hosting sites such as Sourceforge and java.net, as well as classes from the JAVA standard API. We included subjects that met the following criteria:

1. The API documentation of the class explicitly or implicitly mentions restrictions on the order of method invocation. In other words, we made sure that our subjects are complex enough to yield interesting specifications.

2. As a source for test runs, we solely rely on executions as provided by the developers of the subject class. This is to avoid introducing additional bias with self-constructed test cases. For the first three subjects in Table 7.1, we use sample executions and regression test suites provided by the respective projects. We made sure that these runs cover all essential methods of the subject class. For the JAVA standard classes in our evaluation, we use conformance tests of the HARMONY project. This project aims at providing an open-source alternative to the JAVA standard classes, and therefore has a sophisticated test suite to ensure compliance with the original implementation by SUN.

3. To conduct the evaluation using the static typestate verifier, we needed an additional application for each subject that uses the subject class in its implementation. To find such applications, we searched the web using the koders.com and Google Code search engines. To qualify for our evaluation, a project had to offer a minimum level of maturity and provide a test run that executes the subject class (see Section 7.4.4 for a rationale).

We are aware that our selection process creates a bias towards complex classes and well-tested projects. However, the purpose of this evaluation is not to evaluate the usage of mined specifications in general. Instead, we study how our approach for enriching mined specifications improves quality and applicability of the specifications. Section 7.4.5 provides a detailed discussion of threats to the validity of our results.

                           Original model             Enriched model
                                   Transitions                Transitions
Subject          Mutations  States  Regular  Fail   States  Regular  Fail
javamail                61       6        5     0       13       48     2
javax.security           9       6        5     0        6       14     6
ristretto               55       5       11     0        5       33     9
signature               23       5       30     8        5       39    13
socket                 540      11       35     2       17      251    55
zip                    145      11       24     5       14       62    18

Table 7.2: Results of the quantitative evaluation. Column “Fail” gives the number of transitions that raise an exception. Overall, enriched models have more transitions, and many more exceptional transitions.

7.4.2 Quantitative Evaluation

In this section, we provide a quantitative evaluation of our technique for enriching mined specifications. For every subject, we mine an initial model (see Section 7.2) from the execution of the test suite. Afterward, we use TAUTOKO to mutate the test suite and mine an enriched model. To quantify the difference between the two versions, we count the number of states and the number of transitions. A transition in this context means a method call. Since we are mostly interested in exceptional behavior, we also measure the number of exceptional transitions. The results of the quantitative evaluation are summarized in Table 7.2.

For SMTPProtocol, we also provided the initial model in Figure 7.3 and the enriched version in Figure 7.4⁴. Both versions have the same number of states. However, the enriched version has about three times as many transitions. Also, the initial model has no exceptional transitions, compared to 9 transitions in the enriched version.

Applied to all subjects, TAUTOKO discovers new states for three out of six subjects, and significantly increases the number of transitions for all of them. None of the initial models for the first three subjects has exceptional transitions, hinting at a low quality of the test suite. This is a general trend we observed in many projects, as discussed earlier. For each of those subjects, TAUTOKO discovers new transitions that trigger exceptions. Initial models for the JAVA API classes already contain transitions to the error state. Obviously, the conformance tests of the HARMONY project also test for expected negative behavior. For the API subjects, TAUTOKO significantly increases the number of both exceptional and normal transitions. The largest relative increase is observed for socket, with a total of 55 exceptional transitions compared to only 2 in the initial model.

4 Models for the remaining subjects are available online at the website given in Section 7.6.

                       Transitions
Subject     States  Regular  Fail
ristretto        7       86    29
signature        5       48    12
zip              6       31     9

Table 7.3: Manually specified typestate models.

Overall, applying TAUTOKO leads to larger models with significantly more transitions. In the next section, we investigate whether TAUTOKO also improves the quality of the mined specifications.

7.4.3 Qualitative Evaluation

In this section, we take a look at how well the initial and enriched models reflect the complete model of the class. To this end, we compared the mined models with complete usage models. Since there are no models available for our subjects, we had to manually create them. To create the models, we investigated the source code to build a mental model that was translated into a typestate. In a few cases it was difficult to reliably judge whether a method could be called in a certain state. To clarify those cases, we wrote small test cases that resolved the issue.

One problem with manual model generation is how to deal with unchecked exceptions. In JAVA, many instructions may cause null pointer dereferences or illegal array accesses. Including transitions for all those exceptions would introduce a high degree of non-determinism, which essentially renders the model useless. We therefore only include transitions for checked exceptions.

Manually creating models involves a lot of human effort (which, of course, is why we wanted to build TAUTOKO in the first place). We therefore restricted our evaluation to only three subjects, namely SMTPProtocol, ZipOutputStream and Signature. In total, we spent over 10 hours on creating the specifications, where most of the time was spent on SMTPProtocol, which is also the most complex. Unfortunately, the models are too large to depict in this thesis. However, they are available for download at the address given in Chapter 9. Table 7.3 lists structural details of the manually created models.

To investigate whether TAUTOKO also improves the quality of the mined specifications, we compared initial and enriched models against the complete model:

ristretto The complete model has two more states than the initial and the enriched models. The two additional states are related to sending mails, which requires a call to initiate the mail, followed by several calls to set receivers, and a final call to send the mail. No state-based specification miner can detect this protocol, since the relevant state information is transmitted to the server and is not kept locally. Apart from this, all states and transitions of the initial model are also reflected in the complete model. The enriched model adds twenty valid transitions and nine exceptional transitions. Two exceptional transitions are invalid according to the complete model. They are caused by limitations of the mock server, which is used in the test suite of SMTPProtocol. This shows a limitation of our completion technique: TAUTOKO may break the boundaries of the test suite and generate invalid transitions.

signature The initial and the enriched models have the same number of states. All transitions in the initial model are in accordance with the complete model. TAUTOKO adds nine additional transitions, five out of which are exceptional transitions. All transitions are also reflected in the complete model. In total, the enriched model misses six transitions. This is due to the way TAUTOKO injects and suppresses method calls, which prevents some methods from being called in certain states.

zip The complete model has much fewer states than both the initial and the enriched model. This occurs because states in the model miner also include values for fields that are irrelevant for the usage of the object, such as the fields comment or method. The initial model essentially contains the structure of the complete model twice, once with the method field set and once without. The enriched model contains additional states with comments. Despite the blow-up, the mined models are still useful since they capture all exceptional transitions of the complete model.

In summary, we found that the specification miner in combination with TAUTOKO generates valid specifications compared to manually deduced models. Like any test case generation technique, TAUTOKO cannot guarantee to cover all possible transitions; and this limitation also holds for the present subjects. Section 7.6 presents ideas for future work to improve coverage. In one case, TAUTOKO generates transitions that do not match the complete model. This is due to restrictions which are inherent to the general technique of enriching models by manipulating an existing test suite.

7.4.4 Usefulness

Results of the previous sections show that applying TAUTOKO yields better specifications. However, we would also like to know if this improvement matters in practice. To investigate this, we ran a static typestate verifier on a set of randomly generated bugs and compared the results for initial and enriched models. For ristretto, signature and zip, we also included complete models from the previous section. The evaluation setting is summarized in Figure 7.5 and detailed in the following sections.

Experimental setting

Our experiment assumes the following situation: A developer starts building an application and uses classes from a library l for the first time. To help the developer avoid bugs due to incorrect usage of those classes, the IDE supports lightweight typestate verification. Whenever the developer changes a method that uses classes of l for which a specification is available, the IDE launches the typestate verifier. The verifier then analyzes all changed methods and looks for incorrect usage of classes; if it finds a violation, it is presented to the user. Obviously, we would like to catch as many bugs (true positives) and report as few false alarms (false positives) as possible. To simulate the above situation in a controlled experiment, we take the following steps:

1. For each subject used in the evaluation so far, we find an application that uses the subject. We also require the application to provide a test suite or other means to execute the program.

2. We use our mutation tool to simulate changes a developer might make to the application. To this end, we generate mutants that randomly inject or suppress method calls to instances of the subject class in the application.

3. For each mutated version, we execute the test suite of the application to classify mutants. Mutants that raise an exception at runtime are bugs that we would like a typestate verifier to detect. Mutants that do not raise an exception use the class correctly, and therefore the verifier should not report a warning.

Figure 7.5: Evaluation overview. We take the client (a) of a class and use TAUTOKO (b) to mine both the initial model (c) and the enriched model (d). We then take a second client (e) of the same class and seed in a bug (f). The JFTA (see Section 7.4.4) static typestate verifier (g) then produces error reports (h) for the mutated client using both the initial model and the enriched model. Where available, we also include complete models (see Section 7.4.3). We compare the error reports in terms of true positives and false positives.

Subject         Test Source                # Tests  Application
javamail        Regression test suite            6  JVerify Binary Verifier
javax.security  Regression test suite            5
ristretto       Regression test suite            5  Fin J2EE calendar server
signature       Harmony compliance tests        16  opensc project
socket          Harmony compliance tests         5  CRSMail Server
zip             Harmony compliance tests         9  Huf 3.0

Table 7.4: Details about where tests came from and which applications were tested.

4. Finally, we run the verifier for each mutated version to analyze all methods touched by the mutant and remember all reported violations. We use the generated mutants to measure how often the verifier points to a method invocation that actually triggers an exception (true positive), and how often the verifier reports a violation although the program runs without producing an error (false positive).

The purpose of this experiment is to measure the effect of using enriched specifications as generated by TAUTOKO over using initial specifications produced by the test suite. We therefore repeat step 4 with initial models generated by the test suite and enriched models generated by TAUTOKO. For three subjects, we also include results for the complete models created for the qualitative evaluation of Section 7.4.3. We ran our evaluation for the same set of subjects used for the previous experiments. Table 7.4 lists the test sources for generating models, as well as the names of all applications used in the evaluation.

The JFTA Static Typestate Verifier

Unfortunately, existing typestate verifiers are either not released [40], or require additional input [10]. We have therefore implemented our own typestate verifier called JFTA. JFTA is a partially inter-procedural, flow- and context-sensitive typestate verifier for JAVA classes. Input to JFTA consists of the program's byte code, a set of typestate automata, and a set of methods which are to be analyzed. In contrast to other tools such as Plural [10], JFTA does not require the programmer to provide annotations of the program code.

The core part of JFTA consists of a conservative dataflow analysis algorithm. Aliasing information is calculated using a demand-driven points-to analysis [98]. As the primary focus of JFTA is to execute quickly, the implementation uses several heuristics that trade precision for speed:

                        Initial model      Enriched model     Complete model
Subject          Bugs   Flagged  Actual    Flagged  Actual    Flagged  Actual
javamail            5         0       0          4       3        n/a     n/a
javax.security      3         0       0          2       1        n/a     n/a
ristretto          28         0       0         25      15         21      19
signature          12         6       4         12      10         12      10
socket             49         2       2         48      47        n/a     n/a
zip                23        19      14         19      18         22      19

Table 7.5: Enriched models show more true positives.

• When analyzing a method, JFTA only follows method calls up to a certain (configurable) depth. Thus, the analysis may miss method calls, which potentially causes false positives or negatives.

• Information from different paths through a method is merged. Thus, the analysis is path-insensitive, which may cause false positives.

• Whenever the analysis is unable to determine the state of an object, it simply assumes that the object can be in any possible state. This may again generate false positives.
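The following, greatly simplified sketch illustrates how a conservative analysis can track states under the heuristics above: it keeps a set of possible typestates per object, unions the sets where paths merge, and falls back to "any state" when the state is unknown or a call is not covered by the model. It is an illustration of the approximation only, not JFTA's implementation.

import java.util.*;

class StateSetTracker {
    // transitions: state -> (method -> successor state); a missing entry
    // means the method is not known to be legal in that state.
    private final Map<String, Map<String, String>> transitions;
    private Set<String> possible;                 // current abstraction of the object

    StateSetTracker(Map<String, Map<String, String>> transitions, String initial) {
        this.transitions = transitions;
        this.possible = new HashSet<>(Set.of(initial));
    }

    void assumeUnknown() {                        // third heuristic: any state is possible
        possible = new HashSet<>(transitions.keySet());
    }

    void joinWith(Set<String> other) {            // second heuristic: merge paths
        possible.addAll(other);
    }

    // Applies a call; returns false if the call is illegal in every possible
    // state, in which case a verifier would report a violation.
    boolean call(String method) {
        Set<String> successors = new HashSet<>();
        for (String state : possible) {
            String next = transitions.getOrDefault(state, Map.of()).get(method);
            if (next != null) successors.add(next);
        }
        if (successors.isEmpty()) {               // illegal everywhere: warn and give up
            assumeUnknown();
            return false;
        }
        possible = successors;
        return true;
    }
}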

Due to the above heuristics, our approach is less precise than other tools such as the approach presented by Fink et al. [40]. However, in our setting we are interested in the effect of using enriched specifications rather than in absolute precision; and our results thus are likely to generalize to all sorts of typestate verifiers. This is further discussed in Section 7.4.5.

Results: True Positives

Table 7.5 summarizes the results for all changes that trigger exceptions. The six columns list results using initial, enriched and complete models where available. Column Flagged lists the number of bugs for which the verifier flags a violation. Actual gives the number of cases where the reported method call exactly matches the call that raises the exception. For all numbers of reported errors, higher values are better.

The results show that, when using enriched models, the verifier pinpoints more violations than with the initial versions. For the first three subjects, initial models cannot point to bugs since they do not contain exceptional transitions. For the remaining three subjects, initial models also detect violations. For signature and socket, enriched models detect considerably more violations. For zip, both versions report violations for the same number of changes. However, enriched models more frequently point to the method call that raises the exception.

Better performance of enriched models in finding violations comes as no big surprise, as they include many more exceptional transitions than initial models. Still, the increase is considerable and the difference is statistically significant according to a paired t-test with p = 0.05.

For zip and signature, complete models yield slightly better results than enriched models. Thus for those two cases, models enriched by TAUTOKO are almost as good as manually created specifications. However, for ristretto complete models find 4 more bugs (19 compared to 15). This is due to the nature of the typestate miner, which relies on the values of fields to capture an object's state (see Section 7.4.3). Even when using complete models, the verifier does not catch all bugs. This is due to technical limitations of JFTA, such as the limited call stack depth.

Results: False Positives

Apart from finding errors, we would also like to have as few false positives as possible. To investigate the false positive rate of initial and enriched models, we repeated the above experiment with changes that did not cause exceptions.5 For those changes, the verifier should not output violations. The results of this experiment are shown in Table 7.6. The columns "Initial" and "Enriched" list the number of false positives for all types of models.

For javamail and signature, we observe significantly fewer false positives. For the remaining subjects, the difference is smaller, but enriched models generally produce fewer false positives. A paired t-test yields a p-value of 0.0124, which tells us that enriched models produce statistically significantly fewer false positives than initial models. Using complete models again yields the biggest improvement for ristretto with only seven false positives remaining. For the other two subjects, using manually created models provides no benefits over using enriched models from TAUTOKO.

7.4.5 Threats to Validity

As with any empirical study, the results of our experiments are subject to threats to validity. We distinguish between threats to internal, external, and construct validity:

5 We used coverage analysis to make sure that each change is actually covered.

                                   Models
Subject          Changes  Initial  Enriched  Complete
javamail              28       26         2       n/a
javax.security         4        4         2       n/a
ristretto             53       53        47         7
signature             29       12         0         0
socket               460      300       283       n/a
zip                   30       26        18        15

Table 7.6: Enriched models show fewer false positives.

Threats to external validity concern our ability to generalize the results of our study. We cannot claim that the results of our experimental evaluation are generalizable. Our sample size is small; in total we investigate six subjects in twelve different applications. Also, our choice of subjects is biased towards more complex classes of projects with executable regression test suites. Less complex classes tend to generate only trivial models, and therefore TAUTOKO is unlikely to enrich them. However, applying TAUTOKO on such classes would not cause any harm, since the enriched model always contains the initial model. In practice, though, only specifications for classes that are complex enough to be misused should be distributed.

Threats to internal validity concern our ability to draw conclusions about the connections between our independent and dependent variables. Our process of manually creating complete models in Section 7.4.3 may be subject to errors or bias. When creating the models, we may have unintentionally left out states or transitions, which may influence our results. We therefore have used test cases to resolve ambiguities wherever necessary. In addition, we make the models available at our website so that other researchers can investigate them (see Section 7.6).

Threats to construct validity concern the adequacy of our measures for capturing dependent variables. The last experiment uses our typestate verifier to compare models in terms of their ability to detect errors. A potential problem exists because the typestate verifier may miss violations due to over-approximations or technical limitations. We may therefore be unable to measure the number of correctly identified violations for a specification. However, our evaluation uses the same set of changes for both types of models. If over-approximations prevent the verifier from detecting a violation, it will do so for both types. As our evaluation focuses on the increase (or decrease for false positives), we believe that this is no real threat to the results of this experiment.


7.5 Related Work

The idea of combining test case generation with specification mining was conceived by Xie and Notkin [114]. They present a generic feedback loop framework where specifications are fed into a test case generator, the generated tests are used to refine the specifications, and the refined specifications are again given as input to the test case generator. We extend this work by providing an implementation of the framework for typestate mining, as well as an evaluation of how useful enriched specifications are for a real-world application. TAUTOKO uses techniques from several different areas of software engineering. The following sections summarize related work in the fields of test case generation, typestate verification, and specification mining.

7.5.1 Test Case Generation

There is a large body of work on test case generation, which is why we will limit the discussion to only a few representative approaches. If available, we cite surveys that provide more details in specific areas.

Several approaches use simple randomized algorithms to generate tests. Ciupa et al. [21] apply random testing to several industrial sized applications. Their work uses the AUTOTEST approach, which relies on invariants as test oracles. Milicevic et al. [77] present KORAT, which also leverages preconditions but works for JAVA programs. In contrast to random techniques, TAUTOKO specifically generates test cases to enrich a given initial model.

Another area in test case generation is search-based techniques. The majority of these approaches systematically analyze control flow. Symbolic execution [60] simulates execution of the program using symbolic values rather than concrete ones and relies on constraint solvers to derive test data. Recently (e.g. [72]), combinations of concrete and symbolic execution were proposed to overcome limitations of symbolic execution in terms of scalability. A survey of existing search-based approaches can be found in [73]. In contrast to these approaches, TAUTOKO mutates the program to explore new behavior, thus changing the control flow rather than analyzing it.

TAUTOKO is an instance of a model-based test generation tool. Such tools require the presence of a model that describes the intended system behavior. This model is then used to derive tests or input data. They come in very different forms, e.g. as finite state machines, or algebraic specifications. A survey on existing model-based approaches can be found in [53]. One example of a model-based testing tool is SPECEXPLORER [104], which is developed by Microsoft Research. SPECEXPLORER explores specifications written in SPEC# [8] using model-checking techniques and provides test cases for explored behavior.

The idea of mutating the test suite to generate test cases was inspired by work of Tonella et al. [103]. They propose evolutionary testing: using genetic algorithms, an initially generated test suite is mutated to satisfy a given coverage criterion. In contrast to evolutionary testing, TAUTOKO uses a model to guide test case generation. To our knowledge, we are the first to use test generation techniques to improve the quality of mined specifications.

In the area of web application testing, Mesbah et al. [74] extract state machines that describe the user interface of AJAX applications. Their tool, called ATUSA, derives sequences of operations that are executed to explore the application and trigger bugs. In contrast, our approach explores JAVA classes and generates new tests to enrich specifications.

Gupta and Heidepriem [50] explore a new structural coverage criterion based on dynamic invariants. They use DAIKON [39] to mine an initial set of likely invariants. Based on this set, Gupta and Heidepriem generate a new test suite that tries to cover as many invariants as possible. This test suite can be used to remove spurious invariants from the initial set. In contrast, TAUTOKO mines typestate automata and uses mutation to generate new tests.

7.5.2 Typestate Verification

The term typestate was introduced in 1986 by Strom et al. [99]. Initially, typestates were used to distinguish uninitialized and valid pointers. This information was used to detect potential null pointer dereferences and memory leaks in PASCAL programs. Since then, several approaches have been developed for different platforms such as .NET [32] or JAVA [46] with varying levels of precision.

A promising sound typestate verifier for JAVA was presented by Fink et al. [40]. The tool uses a staged approach with a total of four stages: early stages use imprecise and fast techniques to filter instances that need not be considered in later (more precise and thus expensive) stages. The last stage is only required for objects referenced by more than one method or objects stored in collections. Fink et al. report analysis times ranging from one to ten minutes for projects with up to 200 classes. In contrast to their approach, JFTA is less precise due to its lack of flow-sensitivity. We would expect that using the tool by Fink et al. would further reduce the number of false positives in our evaluation.

7.5.3 Specification Mining

The large body of work on mining specifications can be grouped into dynamic and static approaches. The first technique by Cook and Wolf [23] considers the general problem of extracting a finite state machine based model from an event trace. They reduce the problem to the well-known grammar inference problem [48] and discuss algorithmic, statistical and hybrid approaches. Later, Larus et al. [1] proposed mining specifications for automatic verification. Their approach learns probabilistic finite state automata for C programs. Following the assumption that common behavior is correct behavior, Larus et al. use the inferred automata to search for anomalies in other executions of the program.

Among the first approaches that specifically mine models for classes is the work by Whaley et al. [108]. Their technique mines models with anonymous states and slices models by grouping methods that access the same fields. Lorenzoli et al. [70] mine so-called extended finite state machines with anonymous states. To compress models, the gk-tail algorithm merges states that have the same k-future.

In terms of static techniques, there is also a huge number of different approaches. Wasylkowski et al. [105] mine object usage models that describe the usage of an object in a program. They apply concept analysis to find code locations where rules derived from usage models are violated. Ramanathan et al. [85] use an inter-procedural path-sensitive analysis to infer preconditions for method invocations. Shoham et al. [94] discover that static mining of automata-based specifications requires precise aliasing information to produce reliable results.

In the area of web services, Bertolino et al. [9] mine behavior protocols that describe the usage of a web service. The approach uses a sequence of synthesis and testing stages that uses heuristics to refine an initially mined automaton. In contrast, TAUTOKO mines typestate automata for JAVA programs.

7.6 Conclusions

Dynamic specification mining is a promising technique, but its effectiveness entirely depends on the observed executions. If not enough tests are available, the resulting specification may be too incomplete to be useful. By systematically generating test cases, our TAUTOKO prototype explores previously unobserved aspects of the execution space. The resulting enriched specifications cover more general behavior and much more exceptional behavior.

An evaluation with six different subjects shows that TAUTOKO is able to enrich specifications with new transitions in all cases. With enriched specifications, a typestate verifier produces significantly more true positives, and significantly fewer false positives. We expect test case generation to be applicable to all techniques of dynamic specification mining, generally improving the effectiveness of mined specifications.

Our initial motivation for starting this work was to help developers avoid misusing third party libraries. With the help of TAUTOKO, we can now mine useful specifications for such libraries. To further support developers, we have built an integration of our typestate verifier into ECLIPSE. If the plugin is turned on, ECLIPSE will trigger the typestate verifier whenever a user saves a new version of his project. If the verifier detects a violation, it will mark the appropriate location in the source code as shown in Figure 7.6.

Figure 7.6: A screenshot of the ECLIPSE integration for JFTA.

Chapter 8

Generating Fixes from Object Behavior Anomalies

Recent years have seen considerable advances in automated debugging: sophisticated program analysis guides the programmer along dependencies [63], statistical debugging highlights execution features that correlate with failures [57, 67, 36], and experimental techniques automatically isolate failure causes in the input [120] or program changes [22]. All these techniques narrow down the set of possible bug locations, presenting the programmer with a list of likely locations.

Even with automated bug localization, the programmer must still assess these locations to choose where and how to fix the program. The goal of the work presented in this chapter is to automate this final step as well, effectively automating the entire debugging process for a significant subset of programming errors.

The following example, simple but addressing a real-life application, illustrates the approach. The APACHE MINA project provides a framework for building network applications. The project's bug database contains an entry for bug 293, complaining that test VmPipeBindTest crashes with an assertion error. To debug the failure, we first want to know how the failing run differs from passing runs; we are searching for anomalies that correlate with the failure.

Our approach to finding anomalies is to compare object behavior models from passing and failing runs. Figure 8.1 shows a combined model for the MINA BaseIoAcceptor class; the solid transitions occur in the passing runs. In the passing run, clients call setLocalAddress(), then setHandler() to set up the attributes; a sequence of alternating bind() and unbind() calls then alters the object state.


Figure 8.1: A combined model of passing and failing runs for the MINA BaseIoAcceptor class. In the failing run, unbind() is invoked when the acceptor is not bound.

Figure 8.2: How PACHIKA works. PACHIKA takes a JAVA program (a) and out of its passing and failing runs (b), it mines object behavior models (c). From differences (d) between the models, it derives fix candidates (e), which it then validates against automated quality assurance (e.g., a regression test suite). Only validated fixes remain (f).

The failing run follows different transitions, shown by dashed lines in the figure. Besides a different method call order when setting up the object, the client now calls unbind() multiple times in a row—even when the bound attribute is already false. This behavior occurs only in the failing run. But is it also the cause of the failure?

To investigate this, we systematically generate patches based on differences in models mined from passing and failing runs. The generated patches alter the failing run to match the behavior from the passing runs. If a patch fixes the failure and does not break the regression test suite, we consider it valid.

In the example, there are several ways to change the behavior from failing to passing: we can (a) make the call to unbind() conditional such that it only occurs when bound is true (as in the passing run), or (b) insert a bind() call to reach the correct state in which unbind() can be called. All of these fix candidates would be valid at this level of abstraction—but would they also work for the concrete program?

We have built a tool called PACHIKA 1 that extracts the above models from passing and failing runs of programs (currently in JAVA), compares the models to determine anomalies, and automatically generates possible fixes. PACHIKA validates the fixes against the original failing run, ensuring that the fix indeed solves the problem at hand; it also runs the program's regression test suite to minimize the risk of introducing new problems. Only fixes that pass this validation will eventually be presented to the programmer.

In the MINA example, PACHIKA finds that the fix candidate (a) introduces an alternate failure in the failing run, while candidate (b)—inserting an additional bind() call—passes all the tests; this candidate is the fix PACHIKA suggests to the programmer. Incidentally, this is also how the real MINA bug was eventually fixed, as indicated by the project's history.

1 “Pachika” is the Swahili word for “fix”, “insert”.
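In source code, the two candidates correspond to patches of the following shape. The surrounding client code and the isBound() accessor are assumptions for illustration; only the calls to bind() and unbind() and the bound attribute come from the model.

// Fix candidate (a): guard the violating call with the precondition
// observed in the passing runs (bound must be true).
if (acceptor.isBound()) {     // isBound() is an assumed accessor for "bound"
    acceptor.unbind();
}

// Fix candidate (b): insert a call that re-establishes the expected state
// before the violating call. This is the candidate that survives validation
// and matches the fix recorded in the MINA project history.
acceptor.bind();
acceptor.unbind();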

The rest of this chapter presents the details of the above approach, and we evaluate its performance on real-life programs with real-life bugs. We make the following contributions:

• We present a technique to automatically derive fix candidates from anomalies in program executions (Section 8.3). To our knowledge, this is the first time that fixes are directly generated from mined specifications.

• We present a method for validating these fix candidates using the failing run as well as automated quality assurance (Section 8.4), eventually suggesting the best fix.

• We evaluate the effectiveness and the efficiency of the approach on the IBUGS collection of real-life bugs (Section 8.5).

The remainder of this chapter is organized along the individual stages of PACHIKA (Figure 8.2). In the first step, PACHIKA uses the tracer of ADABU (Section 5.1) to trace the execution of a failing and one or more passing runs. After that, the tool analyzes the traces to identify relevant objects and uses ADABU to mine models for these objects (Section 8.1). The tool then searches the models from the failing run and identifies method invocations that violate preconditions as specified by the models of the passing run (Section 8.2). PACHIKA then derives fix candidates from model differences (Section 8.3). These fix candidates are validated against the test suite to find the best fix (Section 8.4). Throughout these sections, we will use the MINA example as well as another real-life example taken from the APACHE JDO project to illustrate our approach. After evaluating effectiveness and efficiency of PACHIKA (Section 8.5), we discuss the general applicability to real-life bugs (Section 8.6). We close with related work (Section 8.7), and conclusion and consequences (Section 8.8). Parts of this chapter were published at the Automated Software Engineering Conference 2009 [30].

8.1 Mining Models

PACHIKA uses ADABU (Chapter 5) to mine models from passing and failing runs. For the experiments in this chapter, we use the default abstraction function as defined in Section 4.7. In Section 8.1, we introduced another parameter for model mining called model depth. This parameter specifies the number of indirections on the heap ADABU follows when extracting the state of objects. Obviously, increasing the depth of models yields more detailed models. In the context of PACHIKA, an increase in model depth leads to more deviations being flagged, and thus more fix candidates being generated.

For the MINA example, a model depth of one is actually sufficient to generate a good fix for the bug: PACHIKA detects an incorrect value of bound and synthesizes a fix from the passing model. In another real-life example taken from the bug database of the APACHE JDO project, it is necessary to use a model depth of two. In this example, a PersistenceManager class manages objects stored in a database. Internally, PersistenceManager uses a Transaction object to synchronize access to the database. The Transaction object is available to clients via a getter method. For consistency reasons, access to persistent objects requires an active Transaction. In the failing run, a client requests an object by calling getObjectById() when the Transaction is inactive. In all passing runs, this is handled correctly.

Figure 8.3 shows a simplified version² of the passing and failing model for the PersistenceManager. The state encompasses the transaction tx, as well as the Transaction object's active flag if the Transaction is not null. Transitions in the state of the PersistenceManager now also occur if a method changes the state of the Transaction. This model captures the interplay between calls to getObjectById(), tx.begin() and tx.commit(), which essentially is a protocol that involves two objects. With model depth two, PACHIKA is able to capture this protocol as it includes the state of the Transaction into the state of the PersistenceManager. To study the effect of the model depth parameter on the results of PACHIKA, we repeat our experiments with model depth one and two.
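To make the depth parameter concrete, consider the following simplified and hypothetical versions of the two JDO classes; only getObjectById(), begin(), commit() and the tx and active fields follow the models in this chapter, everything else is assumed. With model depth one, the mined state of a PersistenceManager only records whether tx is null; with depth two, ADABU also follows the tx reference and includes tx.active, which is exactly the information needed to capture the protocol of the JDO example.

// Simplified, hypothetical versions of the two JDO classes used in the example.
class Transaction {
    boolean active;                      // included in the state only at depth two
    void begin()  { active = true; }
    void commit() { active = false; }
}

class PersistenceManager {
    Transaction tx = new Transaction();  // depth one records only "tx != null";
                                         // depth two also records tx.active

    Transaction currentTransaction() {   // assumed getter exposing the transaction
        return tx;
    }

    Object getObjectById(Object id) {
        // the mined precondition requires tx.active == true here
        return null;                     // placeholder for the database lookup
    }
}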

8.1.1 Mining Preconditions

Every method has a (potentially empty) set of preconditions that need to be satisfied in order to invoke the method successfully. For example, the unbind() method in the MINA example has the precondition that bound needs to be true. To find anomalies, PACHIKA mines such preconditions from passing run models and looks for transitions in failing run models that violate these preconditions. In some languages, such as Eiffel, Spec# and JML, programmers would be able to provide preconditions explicitly. In this chapter we are working with plain JAVA programs where preconditions have to be inferred. Section 10 discusses this idea further.

A first approach to mining preconditions from models would be to search for common properties of attributes in states in which a method is invoked. This approach has two disadvantages.

2 The actual models mined for this example are too large to present here but can be viewed at the project's web page (Section 8.8).


Figure 8.3: A deep model of PersistenceManager for the passing and failing runs of bug JDO 28. In the failing run, the second invocation of getObjectById() violates the precondition that tx.active is true.

First, it limits preconditions to the state of the object the method is invoked on. Second, a method typically does not read all attributes of the state; PACHIKA would thus generate spurious preconditions. To solve these problems, PACHIKA traces the set of fields that are read by a method invocation and generates preconditions only for those fields. For example, in the case of unbind(), PACHIKA detects that the method only reads the fields bound and handler, and therefore only looks for preconditions that affect those two fields. If a method reads a field that is part of a parameter, the field will also be included in the set, and thus PACHIKA also detects preconditions for fields of parameters.

In practice, identifying the set of relevant fields is more complex than only tracing field reads for every method invocation:

• Many methods create and use temporary objects. Field reads on such objects cannot yield preconditions since those objects did not exist when the method was invoked. We therefore only include field reads on objects that existed prior to the invocation.

• Many programs make extensive use of getter and setter methods. To retrieve the value of a field, a method invokes a getter rather than accessing the field directly.

To deal with this, PACHIKA propagates a field access to the calling method if the accessed object is also visible in the caller.

When generating models, PACHIKA annotates each method invocation in the model with the set of fields read. This information is then used in the next step to detect violations.

8.2 Detecting Violations

The basic technique for detecting anomalies is to compare models of passing and failing runs. From the passing models, PACHIKA learns preconditions for a method invocation and checks the failing model for violations of these preconditions. Even a very short run of an object-oriented program creates a large number of objects. In MINA, for example, the failing run lasts only 0.3 seconds but generates over 18000 objects. Analyzing all these models, while possible in principle, would take too much time in practice. We need to find a heuristic that reduces the search space by only considering a subset of all objects. A good heuristic selects all objects whose behavior is relevant for the failure, and only few objects that are irrelevant. One way to approach this is to identify suspicious points in the execution of the program and include all objects that are accessible at those points. The challenge is how to identify such suspicious program points. Depending on the type of the bug at hand, PACHIKA follows two different strategies:

Crashing Bugs For crashing bugs (i.e. bugs that terminate with an exception), our approach includes all objects accessible from methods that were active when the program crashed. The assumption behind this approach is that for crashing bugs, the failure typically occurs close to the infection of the program state. This assumption does not hold for all bugs. In our experience, however, many crashing bugs are fixed in one of the methods that are active when the program crashes. We therefore believe that this heuristic will include the relevant objects in the majority of cases.

Non-crashing Bugs If the program does not crash, but simply produces incorrect output, locating suspicious program points is more difficult. In the current version, PACHIKA uses a JAVA implementation of the TARANTULA [57] fault localization approach to automatically identify suspicious methods. In essence, TARANTULA ranks source code lines based on deviations in the coverage of passing and failing runs. The ranking method puts those lines that were executed often in failing, but seldom or never in passing runs to the top.

but seldom or never in passing runs to the top. PACHIKA leverages this rank- ing to sort all methods executed in the failing run by the maximum rank of all lines in the method. It then investigates a configurable percentage of the highest ranked methods, and looks for anomalies in objects accessible when one of these methods is active.
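For reference, the ranking TARANTULA computes can be sketched as follows. The class and method names are ours; the formula is the standard TARANTULA suspiciousness metric, which assigns values close to 1 to lines covered mostly by failing runs.

// Suspiciousness of a source line s, following the TARANTULA metric:
// lines executed often in failing runs but rarely in passing runs
// receive values close to 1.
public final class Tarantula {
    public static double suspiciousness(int passedCoveringS, int totalPassed,
                                        int failedCoveringS, int totalFailed) {
        double passRatio = totalPassed == 0 ? 0.0 : (double) passedCoveringS / totalPassed;
        double failRatio = totalFailed == 0 ? 0.0 : (double) failedCoveringS / totalFailed;
        if (passRatio + failRatio == 0.0) {
            return 0.0; // line not covered at all
        }
        return failRatio / (passRatio + failRatio);
    }

    public static void main(String[] args) {
        // A line covered by 3 of 4 failing runs and 1 of 10 passing runs.
        System.out.println(suspiciousness(1, 10, 3, 4)); // roughly 0.88
    }
}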

For each suspicious method, PACHIKA extracts models for all objects that are reachable through the parameters of the methods on the stack. This approach was inspired by the work of Artzi et al. [4], who use a similar technique to reproduce crashes. Unlike that approach, however, PACHIKA does not include all transitively reachable objects, but only follows references up to a certain depth (see Section 8.1). Once PACHIKA has mined models for all relevant objects from the failing run, the next problem is to choose passing models against which to compare them. PACHIKA currently takes the following approach:

• First, PACHIKA searches the passing run for invocations of the same methods as in the failing run. For every such invocation, PACHIKA extracts objects accessible from the method and compares models for objects that were accessible through the same path in the passing and failing runs. For example, if a method m has a first parameter of complex type, PACHIKA compares the passing and failing models for the first parameter.

• If no such method invocation is found in the passing run, PACHIKA identifies the set of classes for which models were mined from the failing run. It then extracts models for all instances of those classes from the passing run and compares the models of corresponding classes.

• If the passing run contains no suitable models either, PACHIKA is unable to detect any violations and therefore exits without generating a fix.

If PACHIKA is able to find comparable models, it searches the models of the passing run for preconditions of method invocations. For every method m that is part of the model, PACHIKA examines all invocations of m and extracts the values of all fields accessed by m (see Section 8.1.1). PACHIKA then mines the values for each field and tries to derive simple preconditions, such as a field having the same value before all invocations of a method. The tool currently has its own engine to detect preconditions. If necessary, however, it could use DAIKON's [39] invariant detection engine to mine more complex preconditions. The final step for detecting violations is to check all method invocations from the failing model to see whether they violate any of the preconditions mined from the passing model. If a method invocation violates at least one precondition, PACHIKA remembers the violated preconditions, as well as the state in which the violating method was invoked. For the MINA example, PACHIKA finds three relevant objects. The passing run does not include an invocation of the crashing method, and therefore PACHIKA compares models based on classes. PACHIKA finds only one model with violations, shown in Figure 8.1. The violation is that unbind() is only called in the passing run when bound is true. Note that PACHIKA does not extract preconditions for setLocalAddress() and setHandler(), as those methods do not read fields. For the JDO example, PACHIKA mines three models from the failing run and compares them based on their classes. Altogether, there are four violations, one of which is that getObjectById() requires tx.active to be true.
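A minimal sketch of such a constant-value precondition check is shown below. It assumes that the abstracted field values observed before each invocation are available as plain strings, which is a simplification of PACHIKA's actual state abstraction; the class name is ours.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Derives simple preconditions of the form "field f always has value v
// before method m is invoked" from the observations of the passing run.
public final class PreconditionMiner {
    // observations: for each invocation of m, a map from field name to the
    // (abstracted) value observed right before the call.
    public static Map<String, String> constantPreconditions(
            List<Map<String, String>> observations) {
        Map<String, String> preconditions = new HashMap<>();
        if (observations.isEmpty()) {
            return preconditions;
        }
        preconditions.putAll(observations.get(0));
        for (Map<String, String> obs : observations) {
            // keep only fields that have the same value before every invocation
            preconditions.entrySet().removeIf(
                e -> !e.getValue().equals(obs.get(e.getKey())));
        }
        return preconditions;
    }

    public static void main(String[] args) {
        List<Map<String, String>> obs = List.of(
            Map.of("bound", "true", "handler", "non-null"),
            Map.of("bound", "true", "handler", "null"));
        System.out.println(constantPreconditions(obs)); // {bound=true}
    }
}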

8.3 Generating Fixes

For each invocation of a method m that violates at least one precondition, PACHIKA generates fix candidates based on the passing and failing models. In general, there are two ways to fix a violation based on models. The first is to satisfy the preconditions of m by inserting calls that make the necessary changes to the state. The second is to avoid the violation by deleting the violating call to m.3

8.3.1 Inserting Calls

In order to satisfy the preconditions of a method m, PACHIKA searches the failing and passing models for states that satisfy the preconditions and looks for a path from the violating state to any of them. For example, the violating method call in MINA happens in a state where bound is false. The precondition from the passing run requires bound to be true. PACHIKA finds one state that satisfies this condition and two possible paths from the violating state to the correct state:

1. The first path is to invoke setLocalAddress() first and then bind(). This path is not considered, because setLocalAddress() requires an argument, and PACHIKA cannot synthesize arguments.4

3 In Chapter 7, we use similar techniques to explore new edges in a model.
4 Generally, PACHIKA is limited to methods that do not take arguments. We are aware that this is a severe restriction. However, synthesizing arguments for method invocations is a problem in itself and is therefore left for future work.

2. The second path is to invoke only bind(). PACHIKA produces this path as a fix candidate.

Every feasible path is translated into code which injects calls to all methods on the path right before the violating method call.
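For the MINA example, the effect of such an injected call can be illustrated as follows. The AcceptorStub class is a stand-in for MINA's VmPipeAcceptor, reduced to the two methods relevant here; the real fix is woven into the program at the bytecode level rather than written as source.

// Illustration only: a stand-in for MINA's VmPipeAcceptor, reduced to the
// two methods that matter for the fix.
class AcceptorStub {
    private boolean bound = false;

    void bind() {
        bound = true;
    }

    void unbind() {
        if (!bound) throw new IllegalStateException("not bound");
        bound = false;
    }
}

public class InsertCallFixDemo {
    public static void main(String[] args) {
        AcceptorStub acceptor = new AcceptorStub();
        // Failing run: unbind() is called while bound is still false.
        // The fix candidate injects the repair path right before that call:
        acceptor.bind();    // establishes the precondition bound == true
        acceptor.unbind();  // no longer violates the precondition
    }
}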

8.3.2 Deleting Calls

The second strategy is to avoid the violation by deleting the method call if at least one precondition is violated. Depending on where the fix is to be applied, the call can be removed at either the caller or the callee site. To remove the call at the callee site, PACHIKA generates an if-block that checks the precondition at the beginning of the method and adds a return instruction as the body of the if-block. At the caller site, PACHIKA also creates an if-block that suppresses the call if the precondition is violated. If the removed method has a return type other than void, we try default values such as true, false or null. For MINA, PACHIKA generates a fix candidate consisting of an if-block around the call to unbind() such that the method is only invoked if bound is true.
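The two variants of the delete strategy can be illustrated for the MINA example as follows; again, the Acceptor class is only a stand-in, not MINA's actual code.

// Sketch of the two delete-call variants for the MINA example.
class Acceptor {
    boolean bound = false;

    void bind() {
        bound = true;
    }

    // Callee-site fix: check the mined precondition at the beginning of
    // the method and return if it is violated.
    void unbind() {
        if (!bound) {
            return;
        }
        bound = false;
    }
}

public class DeleteCallFixDemo {
    public static void main(String[] args) {
        Acceptor acceptor = new Acceptor();
        // Caller-site fix: suppress the call if the precondition is violated.
        if (acceptor.bound) {
            acceptor.unbind();
        }
    }
}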

8.4 Choosing the Best Fix

We refer to the non-validated fixes generated by PACHIKA as fix candidates. Each fix candidate is applied in isolation and evaluated in two steps. First, we execute the failing test. If the fix changes the outcome to passing, we call it a potential fix. Each potential fix is then subjected to the program's automated quality assurance, in our case all tests of the program's regression test suite. If the fix does not alter the outcome of any test, we refer to it as a validated fix. Only validated fixes are presented to the programmer as proposed fixes for the failure. In the case of MINA, PACHIKA generates two fix candidates, out of which one is successfully validated against the test suite. The fix is to add a call to bind(), which ensures that the precondition for unbind() is satisfied. For JDO, PACHIKA generates eight fix candidates, of which only one is a potential fix that is then successfully validated against the test suite. The fix is to insert a call to tx.begin() right before the second call to getObjectById(). Both fixes are semantically equivalent to the fixes applied by the developers, and can thus be considered valid fixes for the failures. The notion of a "best fix" raises the question whether PACHIKA can produce "bad" fixes, too. If a suggested fix passes all tests but is considered incorrect, the test suite should be improved, very much like an undetected mutation in mutation testing [33] implies a weakness in the test suite. As soon as the test suite (or, more generally, automated quality assurance) is set up to catch the invalid fix, PACHIKA will filter it out.
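The two validation steps can be summarized by the following sketch. The Fix and Test interfaces are placeholders for PACHIKA's actual patching and test-driver machinery, and the check is simplified in that it requires all regression tests to pass rather than comparing their outcomes before and after the fix.

import java.util.ArrayList;
import java.util.List;

// Hypothetical interfaces: a candidate fix can be applied and reverted,
// and a test reports whether it passes.
interface Fix { void apply(); void revert(); }
interface Test { boolean passes(); }

public final class FixValidator {
    // Returns the validated fixes among the candidates: the failing test
    // must now pass, and all regression tests must still pass.
    public static List<Fix> validate(List<Fix> candidates, Test failingTest,
                                     List<Test> regressionSuite) {
        List<Fix> validated = new ArrayList<>();
        for (Fix fix : candidates) {
            fix.apply();
            try {
                if (!failingTest.passes()) {
                    continue;                    // not even a potential fix
                }
                if (regressionSuite.stream().allMatch(Test::passes)) {
                    validated.add(fix);          // validated fix
                }
            } finally {
                fix.revert();                    // evaluate each fix in isolation
            }
        }
        return validated;
    }

    public static void main(String[] args) {
        // Toy scenario: a boolean flag stands in for the program under repair.
        boolean[] patched = {false};
        Fix fix = new Fix() {
            public void apply()  { patched[0] = true; }
            public void revert() { patched[0] = false; }
        };
        Test failingTest = () -> patched[0];          // passes only once patched
        List<Test> regression = List.of(() -> true);  // unaffected by the patch
        System.out.println(validate(List.of(fix), failingTest, regression).size()); // 1
    }
}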

8.5 Experimental Evaluation

In the previous sections, we have seen how PACHIKA was able to generate successful fixes for two bugs as they occurred in real life. The two examples were found by analyzing the bug databases of MINA and JDO, manually inspecting the bug reports, extracting the faulty versions from the source repository, and building and running the test suites. This is a lot of manual effort and is not feasible for a larger study. To evaluate the effectiveness of our approach, we therefore ran PACHIKA on the two subjects provided by the IBUGS repository [31]. IBUGS contains programs together with test runs and bugs as they actually occurred in the history of the project. For a subset of the bugs, IBUGS also provides test cases that reproduce the problem, which we refer to as failing tests. In our experiments, we use the projects' regression test suites as passing runs.

8.5.1 Subjects

Table 8.1 summarizes information about the subjects used in the IBUGS study. Column "Crashing Bugs" gives the number of bugs that caused the program to crash. We included all these bugs in our study. For each bug in the repository, IBUGS contains a snapshot of the project right before and right after the bug was fixed. Thus, the size of the project and the number of tests vary from bug to bug. Columns "Size" and "Number of Tests" therefore list only the values for the latest bug included in the study.

8.5.2 Experimental Setup

Currently, PACHIKA requires only one configuration parameter: the model depth used when mining models with ADABU (cf. Section 8.1). As explained above, the model depth influences the set of bugs PACHIKA is able to detect. To study the impact of increasing model depth, we ran PACHIKA twice, once with model depth one and once with model depth two. In our experience, increasing the model depth beyond two generates models that are too detailed to be useful. Also, the time required to run the experiments increases significantly for model depth values beyond two. Thus, a maximum depth of two is a reasonable compromise between speed and the range of violations that PACHIKA can detect and possibly fix. The general problem of choosing the right depth is further discussed in Section 8.5.6.

Program     Crashing Bugs    Size (LOC)    Number of Tests
MINA        1                14,773        89
JDO         1                64,017        437
ASPECTJ     18               75,123        1,178
RHINO       8                37,902        1,499

Table 8.1: Subjects used in the evaluation. The first two rows show characteristics of the examples used. The last two rows give details on the subjects used in the evaluation. Size was measured using David A. Wheeler's sloccount.

8.5.3 Running the Experiments

To conduct the experiments, we perform the following steps:

1. First, we collect all bugs in the repository for which there is at least one test case that reproduces the failure and group them into crashing and non-crashing bugs. This yields 18 crashing and 16 non-crashing bugs for ASPECTJ, as well as 12 crashing and 7 non-crashing bugs for RHINO.

2. PACHIKA traces the failing run and identifies the set of interesting objects. The approach used to identify interesting objects depends on the type of the bug (crashing vs. non-crashing, see Section 8.2). For each such object, a model is mined from the failing run. The remaining steps are performed for each passing test in the test suite.

3. PACHIKA traces the passing run and searches the trace for executions of the crashing method. If at least one invocation is found, models for all visible objects are mined just like for the failing run. If no invocation is found, PACHIKA mines models for all classes for which at least one model was extracted from the failing run (cf. Section 8.2).

4. If the previous step yields at least one model, PACHIKA compares models to generate candidate fixes for all active methods as described in Section 8.3. Each candidate fix is first checked against the failing test and then against the test suite (cf. Section 8.4).

Subject     Tracing Overhead (factor)    Trace File Size (MB)    Model Mining (s)
MINA        29                           42                      34
JDO         16                           356                     212
ASPECTJ     9                            223                     110
RHINO       26                           11                      8

Table 8.2: Tracing overhead and execution times for all subjects.

8.5.4 Performance

Our experiments were performed on a 2 GHz AMD machine with a maximum of 2 GB of memory. Table 8.2 lists information about overhead and execution times. For MINA and JDO, the results are averages over all runs in the test suite. For ASPECTJ and RHINO, we give averages for the latest version used in the experiments. Tracing overhead is expressed as the factor by which execution time increases when tracing is turned on. The last column gives the execution time the model miner takes to extract models of depth one (cf. Section 8.1).

Table 8.2 does not list the time PACHIKA takes to generate fixes, since it is negligible compared to the other steps. The time needed to validate a candidate fix is equivalent to the execution time of the test suite for almost all candidates. In some cases, a fix candidate causes the program to loop endlessly; in that case, we terminate the run after a timeout of two minutes and consider the test failed. As is to be expected, tracing incurs a huge runtime overhead. Since both ASPECTJ and RHINO contain over 1,000 tests, tracing and mining the test suite was the most time-consuming part of our experiments. For example, tracing and mining all 1,038 runs in the test suite of bug #87376 takes a little less than two days. Unfortunately, this needs to be done for each investigated bug, since each bug is fixed in a different version of the code base. In practice, however, tracing and mining the test suite only needs to happen once for each released version of a program. As soon as a new version is released to the public, we can trace the test suite, mine models for all objects in the traces, and store them for reuse. For every bug report filed against the new version, we can reuse the cached models.

public void resolve(ClassScope upperScope) {
> // Fix from source repository
> if (binding == null) ignoreFurtherInvestigation = true;
> // Fix generated by PACHIKA
> if (binding == null) {
>   return;
> }
  if (munger == null) ignoreFurtherInvestigation = true;
  if (ignoreFurtherInvestigation) return;
  ...
}

Figure 8.4: The proposed fix for bug #173602 skips the execution of resolve() if the precondition on binding is violated.

8.5.5 Results

The results for the crashing and non-crashing bugs of ASPECTJ are summarized in Tables 8.3 and 8.4. Table 8.5 lists the results for all bugs in RHINO. For each investigated bug, we give the number of candidate, potential, and validated fixes (cf. Section 8.4) for model depth one (columns two to five) and two (columns six to nine). For ASPECTJ, PACHIKA generates fix candidates for 31 out of 34 bugs. For a total of nine bugs, there is at least one potential fix, out of which four are successfully validated against the test suite. For RHINO, PACHIKA generates fix candidates for 15 out of 19 bugs; for seven of these, there is at least one potential fix. A validated fix is found for only one bug. In the following sections, we take a closer look at all cases in which PACHIKA is able to generate a validated fix. After that, we compare our results across subjects and model depth values, as well as crashing vs. non-crashing bugs.

Checking for a null reference

Bug #173602 in ASPECTJ manifests itself as a NullPointerException in method resolve() of class InterTypeMethodDeclaration. PACHIKA detects one violation for the invocation of resolve(), namely that binding must not be null. The delete-method-call strategy generates the fix shown in Figure 8.4. The actual fix applied by the developers also amounts to a conditional return, which additionally sets the ignoreFurtherInvestigation flag. This flag is later used by ASPECTJ to stop processing the declaration object. However, not setting the flag in this situation does not cause any problems, since no test in the test suite subsequently fails.

            Depth 1                                     Depth 2
            Candidates       Fixes                      Candidates       Fixes
Bug         Insert   Delete  Potential  Validated       Insert   Delete  Potential  Validated
#34925      0        0       0          0               20       32      0          0
#36803      1        0       0          0               81       79      26         0
#39993      0        6       0          0               113      51      0          0
#43033      0        0       0          0               28       12      0          0
#51320      0        0       0          0               0        0       0          0
#51322      13       1       0          0               71       42      43         1
#62642      68       0       0          0               753      23      0          0
#64331      2        1       0          0               69       5       0          0
#65319      3        2       0          0               161      4       0          0
#67774      41       6       0          0               359      14      2          0
#68991      68       0       0          0               946      23      0          0
#70619      0        0       0          0               26       0       0          0
#71377      68       0       0          0               946      23      0          0
#72528      68       0       0          0               979      19      0          0
#80249      68       0       0          0               766      26      0          0
#87376      2        6       0          0               19       43      0          0
#121616     128      0       38         1               183      0       45         1
#173602     3        0       1          1               459      13      147        15

Table 8.3: Results of the experimental evaluation for crashing bugs in ASPECTJ with model depth one and two.

            Depth 1                                     Depth 2
            Candidates       Fixes                      Candidates       Fixes
Bug         Insert   Delete  Potential  Validated       Insert   Delete  Potential  Validated
#37739      95       15      11         0               1921     189     86         0
#42993      139      35      0          0               1831     165     112        0
#47754      129      8       0          0               1366     118     0          0
#49457      8        0       0          0               164      5       0          0
#49638      89       18      0          0               2214     640     0          0
#53981      43       4       0          0               1040     68      0          0
#53999      47       5       0          0               504      249     0          0
#54421      361      9       0          0               2772     150     0          0
#55341      123      43      4          0               1198     316     67         0
#60015      44       30      0          0               746      1090    215        130
#61536      10       5       0          0               468      23      0          0
#64069      80       7       0          0               1198     266     0          0
#69459      60       2       0          0               1115     79      0          0
#72157      27       3       0          0               981      50      0          0
#72531      0        0       0          0               0        0       0          0
#76096      0        0       0          0               0        0       0          0

Table 8.4: Results of the experimental evaluation for non-crashing bugs in ASPECTJ with model depth one and two.

            Depth 1                                     Depth 2
            Candidates       Fixes                      Candidates       Fixes
Bug         Insert   Delete  Potential  Validated       Insert   Delete  Potential  Validated
#114491     0        650     0          0               13       2       0          0
#114493     982      1       0          0               6930     1       0          0
#159334     0        0       0          0               161      5       0          0
#179068     0        60      0          0               297      68      0          0
#191668     0        0       0          0               0        15      0          0
#191688     0        0       0          0               0        0       0          0
#194364     1273     16      0          0               17941    105     0          0
#193555     12       78      0          0               98       155     0          0
#203402     0        0       0          0               0        0       0          0
#203841     0        0       0          0               0        0       0          0
#220584     335      11      1          0               721      7       0          0
#210682     1300     946     2          0               1628     1128    3          0

#137181     1999     184     22         0               7507     16      0          0
#157509     0        0       0          0               292      39      33         16
#177314     10       1       0          0               0        0       0          0
#181654     1659     488     9          0               33       0       0          0
#181834     3672     163     181        0               27090    108     1424       0
#184107     3        78      0          0               1066     1400    35         0
#185165     0        0       0          0               1        0       0          0

Table 8.5: Results of the experimental evaluation for all bugs in RHINO with model depth one and two. Crashing bugs are listed in the first part of the table, while the lower part shows results for non-crashing bugs.

public boolean visit(MethodDeclaration methodDeclaration, ClassScope scope) {
> // Fix generated by PACHIKA
> // is the same as in the source repository
> if (methodDeclaration.hasErrors()) {
>   return false;
> }
  ContextToken tok = CompilationAndWeavingContext.enteringPhase(...);
  ...
}

Figure 8.5: The fix for bug #121616 suppresses the violation by aborting the execution in case methodDeclaration has errors.

Checking for error conditions

In the failing run of bug #121616, a NullPointerException is raised in method buildFormalAdviceBindingsFrom() of class ValidateAtAspectJAnnotationsVisitor. When comparing the models of the failing and passing runs, PACHIKA detects a precondition violation for parameter methodDeclaration, namely that the ignoreFurtherInvestigation flag returned by hasErrors() is true in the failing run. The delete-method strategy generates a conditional return in case this precondition is violated (Figure 8.5). In this case, the generated fix is identical to the fix applied by the developers.

Invoking methods to set default state

The failing run for bug #51322 crashes ASPECTJ by causing a NullPointerException in method build() of class InterTypeMethodDeclaration. Figure 8.6 shows the relevant parts of this method, together with the fix as applied by the developers, and the fix generated by PACHIKA. The failing run contains two invocations of method build(), of which only the last one fails. For the first invocation, PACHIKA detects a precondition violation for the declaringClass attribute in the binding variable. The model from the passing run contains a path that repairs this violation, which consists of invoking addDefaultAbstractMethods() and methods(). When this fix is applied to ASPECTJ, the state of binding is altered such that the second invocation of build() no longer occurs and the failing run passes. The fixed version also passes all the other tests. The developer's fix for this problem is simply to abort the execution of build(), which is very different from PACHIKA's fix. However, both fixes comply with the specification as given by the program's test suite.5

public EclipseTypeMunger build(ClassScope classScope) {
  ...
  if (ignoreFurtherInvestigation) {
    return null;
  } else {
    binding = classScope.referenceContext.binding.resolveTypesFor(binding);
>   // Fix generated by PACHIKA
>   binding.constantPoolDeclaringClass().addDefaultAbstractMethods();
>   binding.constantPoolDeclaringClass().methods();
>   // Fix from source repository
>   if (binding == null) {
>     throw new AbortCompilation();
>   }
    ResolvedMember sig = new ResolvedMember(...);
    ...
  }
}

Figure 8.6: The proposed fix for bug #51322 invokes methods that initialize values, essentially avoiding the illegal access in a subsequent invocation of build().

Unnecessary warnings

Bug #60015 of ASPECTJ is a non-crashing bug concerning unnecessary warnings the compiler outputs if an input file contains an interface. The actual fix committed by the developers (Figure 8.7) is to recognize these cases and suppress the message. Since the bug does not raise an exception, PACHIKA uses TARANTULA to find suspicious program points and looks for anomalies in the top 20 methods. A surprisingly large number of validated fixes (130) is generated. Many of these are concerned with the same precondition; the fixes differ in the position within the method where the precondition is checked. We investigated a subset of ten validated fixes6 and found that eight of them affect code that outputs warnings. The effect of all these patches is that some warnings usually output by ASPECTJ are not printed. It is unclear whether these warnings are actually useful. At the very least, no test case in the whole test suite of ASPECTJ ever checks for any of these messages.

5 If PACHIKA's fix were considered incorrect, a simple remedy would be to extend the test suite appropriately, as discussed in Section 8.4. We asked the developer who committed the original fix for his opinion, but did not get a reply.
6 Estimating the effect of a patch is very time-consuming, mostly because we are not experts on the system.

private void warnOnConfusingSig(Shadow shadow) {
  // no warnings for declare error/warning
  if (munger instanceof Checker) return;
  ...
> // Fix committed by developers
> // PR60015 - Don't report the warning if the declaring type
> // is object and 'this' is an interface
> if (exactDeclaringType.isInterface(world) &&
>     shadowDeclaringType.equals(world.resolve("java.lang.Object"))) {
>   return;
> }
  ...

Figure 8.7: For bug #60015, PACHIKA generates several candidates that suppress warnings output by the compiler.

Incorrect escaping

The symptom of bug #157509 in RHINO is that illegal identifiers with escapes are not rejected. Instead, RHINO processes them and outputs an incorrect value. The fixing change applied by the developers churns a total of 310 lines, of which 195 are actually relevant to fixing the problem (see Section 6.5.2 on page 96). In essence, this fix recognizes escaped characters in identifiers and treats them correctly. This is a non-crashing bug, and therefore PACHIKA runs TARANTULA to find suspicious methods. In total, PACHIKA generates 16 validated fixes, all of which affect method getToken() in class TokenStream. This method is responsible for parsing the input file. All of the generated patches alter getToken() such that a character signaling the end of file is returned, which in turn causes RHINO to exit. Clearly, this is not a valid way of fixing the bug. However, similar to the previous bug, none of the tests in the test suite detects this problem, which is why PACHIKA considers it a valid fix.

8.5.6 Discussion

Model Depth

In our experiments, increasing the model depth from one to two causes PACHIKA to generate more fix candidates. For bug #194364, for example, model depth two yields more than ten times as many fix candidates as model depth one. This is not surprising, as an increase in model depth leads to more complex models, and thus to more possibilities to create fixes. Of the five bugs for which PACHIKA generates at least one validated fix, three require a model depth of two. Thus, at least in our experimental setting, running PACHIKA with model depth two yields considerably more validated fixes.

Delete vs. Insert

In most cases where PACHIKA detects a violation, both fix strategies generate fixes. In terms of the number of generated fixes, both strategies are also roughly equivalent. In some cases, the insert-method-call strategy generates a large number of fixes due to many different paths through the model. However, the delete strategy generates validated fixes for four of the five bugs for which a validated fix exists at all. Thus, in our experiments, deleting method calls is more successful when it comes to generating a validated fix.

AspectJ versus Rhino

Our results show that PACHIKA works much better on ASPECTJ than on RHINO. We examined the log files of our experiments and found two possible causes:

• RHINO is considerably smaller than ASPECTJ and contains only a very small number of classes that have complex models (see Section 8.6). Thus, PACHIKA finds only a small number of violations per bug.

• In many cases where a violation is found, technical restrictions such as the limitation to methods without parameters prevent PACHIKA from generating a fix. We hope to remove some of these restrictions in the near future and thus be able to generate more fixes for RHINO.

Inadequate test suites

The validated fixes for bugs #51322, #157509 and #60015 highlight a problem for approaches that validate fixes using the test suite: the quality of validated fixes is highly dependent on the quality of the test suite. A bad test suite will cause many fixes to be validated successfully, and thus a lot of false positives to be presented to the user. However, in the absence of a formal specification, a test suite is still the best way to automatically assess the impact of a change on the program.

8.5.7 Threats to Validity

As with any empirical study, the interpretation of the results is subject to several limitations.

External Validity The scope of our study is limited, as it only investigates 53 bugs in two programs. Therefore, the results of our experiments are hardly generalizable.

However, it is difficult to conduct a controlled experiment with realistic data, since little such data is available. A manual investigation, as we performed for MINA and JDO, requires a lot of effort and is also difficult for other researchers to reproduce. Although we are aware of these limitations, we believe that our evaluation is realistic, since it uses real post-release bugs7 and relies only on test runs from a bug database or the test suite.

Internal Validity PACHIKA is a complex system that consists of almost 30,000 lines of code. We verified the correctness of model mining and fix generation on several small artificial test cases. However, the huge amount of data and the complexity of the system make it impossible to check every step for realistic examples. It may well be that PACHIKA contains errors which cause fixes to be missed or invalid fixes to be generated. However, verifying potential fixes against the test suite ensures that there are no false positives. We encourage other researchers to validate our results. All bugs used in the evaluation are available in the IBUGS dataset. PACHIKA is also available for download; see Chapter 9 for details.

Construct Validity PACHIKA uses the test suite as a source of program runs. As such, it depends on the tests to correctly classify a run as passing or failing. In some cases, this check is not precise enough. For example, some tests in ASPECTJ simply check the output for a certain keyword, which may lead to a test outcome incorrectly being classified as passing. However, we observed this problem only for a small number of tests and are confident that the huge number of tests ensures a high quality of the fixes presented to the user. There is also a risk that PACHIKA generates fixes that only address the symptom at hand, rather than the root cause of the problem ("The method crashes when p is null, so let's insert a check for it"). This risk is best countered by quality assurance; in particular, any increased level of automated validation (such as contracts or widespread program proofs) will automatically filter out more bad fix candidates generated by PACHIKA. Indeed, our evaluation indicates that this is already the case.

8.6 Applicability

After coming to the conclusion that automatic fixing of failing programs was indeed feasible for some cases, we wanted to investigate the general applicability of tools like PACHIKA.

7 We expect an evaluation of PACHIKA on artificially seeded bugs to yield much better results, in particular if seeding includes addition or deletion of method calls, as most mutation testing approaches do.

Subject     Number of Classes    Classes with Preconditions
MINA        166                  15
JDO         377                  116
ASPECTJ     443                  154
RHINO       52                   17

Table 8.6: How prevalent are classes with preconditions? With the exception of MINA, roughly one third of all classes are complex enough to be misused.

In the experimental evaluation in Section 8.5, our tool was only able to generate fix candidates for a small number of bugs in RHINO, since only few bugs actually revealed violations of preconditions. Obviously, PACHIKA's applicability is limited to bugs that cause a precondition violation. To get an idea of PACHIKA's potential, we wanted to know how many bugs actually exhibit precondition violations. For this purpose, we investigated a sample of bugs from the bug databases of the projects used in the evaluation. For each bug, we tried to determine whether or not the bug would have caused a violation of a precondition. However, we quickly came to the conclusion that it is not possible to reliably answer this question by looking only at the bug report and the source code. On the other hand, manually building and executing each snapshot for a large enough set of bugs is too time-consuming. PACHIKA's ability to detect bugs correlates with the number of classes that can potentially be used in a wrong way. A high percentage of such classes would mean that there is a large potential for incorrect usage that causes violations. To measure this percentage, we generated models for all classes used in our subjects and classified the models as having preconditions or not. A model has preconditions if there is at least one method invocation, other than that of a getter method, which requires another method to be invoked before it. For example, in Figure 8.1, the model for VmPipeAcceptor has preconditions, because in order to satisfy the precondition of unbind(), method bind() has to be invoked first. Table 8.6 lists the number of classes for which we mined at least one model (column 2), and the number of classes with preconditions (column 3). Except for MINA, approximately one out of three classes has a model with preconditions. Thus, roughly one third of the classes in our projects are complex enough to be misused. Since there are typically several objects of different types in scope at any point in the program, there is considerable potential for detecting anomalies based on violated preconditions.
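The classification itself can be sketched as follows. The representation of a model as a map from method name to the set of methods that must have been called before, and the name-based getter heuristic, are simplifications of our own and do not mirror ADABU's actual data structures.

import java.util.Map;
import java.util.Set;

// Classifies a mined model as "having preconditions": at least one
// non-getter method requires another method to be invoked before it.
public final class PreconditionClassifier {
    public static boolean hasPreconditions(Map<String, Set<String>> requiredPredecessors) {
        for (Map.Entry<String, Set<String>> e : requiredPredecessors.entrySet()) {
            // getters approximated by their name
            boolean isGetter = e.getKey().startsWith("get") || e.getKey().startsWith("is");
            if (!isGetter && !e.getValue().isEmpty()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // VmPipeAcceptor-like model: unbind() requires a prior call to bind().
        System.out.println(hasPreconditions(
            Map.of("unbind", Set.of("bind"), "isBound", Set.of()))); // true
    }
}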

8.7 Related Work

8.7.1 Locating Bugs

Most work on automated debugging deals with the problem of bug localization, that is, relating a failure to possible bug locations. Milestones in this direction include the TARANTULA approach by Jones et al. [57] as well as statistical debugging [67] by Liblit et al., both of which allow the programmer to focus on a small percentage of the code.

Like these approaches, PACHIKA leverages the difference between passing and failing executions; rather than suggesting locations, however, it produces fixes. By leveraging the test suite (and all other forms of automated validation), PACHIKA can thus successfully weed out invalid candidates, resulting in either a valid fix or nothing. This "no-false-positives" approach is where our approach greatly differs from existing bug localization techniques. Nonetheless, it can easily be combined with bug localization: when PACHIKA cannot generate a fix, bug localization may at least suggest a location; or one could use locations suggested by a bug localization technique as suspicious locations for PACHIKA (cf. Section 8.2).

8.7.2 Repairing Programs

Most closely related to PACHIKA is the recent work by Weimer et al. [107] on automatic patch generation. Weimer et al. systematically mutate a failing C program by inserting, swapping, and deleting statements. Their approach then uses an extended form of genetic programming to evolve those mutants that pass (1) the (previously failing) test and (2) as many tests as possible from a regression test suite. The approach produces repairs in less than three minutes on average on a set of ten selected bugs. Our approach is similar to their technique in that it also generates potential fixes and assesses them via a regression test suite. The contribution and potential of their approach over PACHIKA is clearly the wide range of possible mutations, as well as the adaptive approach to generating fixes. Rather than using adaptive random search, however, PACHIKA starts right away with behavioral differences between passing and failing runs, which keeps the search space narrow. Such a focus is very much needed: it is unknown whether the approach of Weimer et al. scales up to a program like ASPECTJ, with more than 75,000 lines of code and a test suite where one single run already takes a minute; it is also unknown how much fine-tuning of parameters is required to quickly find fixes. It is also unclear how the approach of Weimer et al. could integrate bug localization or mined specifications, as PACHIKA does. Last but not least, we evaluate PACHIKA on all previously documented crashing bugs of ASPECTJ and RHINO, and thus get an idea of scalability and applicability on real programs and real bugs.

8.7.3 Leveraging Specifications

Weimer developed a method for automatically and soundly patching programs against a given specification [106]. However, as Weimer states in [107], a formal specification is seldom available, which is why PACHIKA mines and leverages behavior models from passing and failing executions. In the long run, we expect automatic fix generation to rely on both search-based techniques (as in the approach of Weimer et al.) and specification mining (as in PACHIKA), in addition to the wide range of information that is available via static analysis, theorem provers, bug history, and other techniques.

8.7.4 Repairing State

Demsky et al. [34] show how to automatically fix data structures at run-time, again according to a given specification. Rinard et al. [88] suggest similar repair techniques for invalid memory accesses. In both of these works, only the program state is fixed. Weimer's and our work, in contrast, look for repairs not only to the program state of the current run, but to its actual code (which, as a side effect, yields repairs to the state as well). This requires many more checks, such as contracts or a regression test suite, but also increases confidence in the correctness of the repairs, besides, hopefully, providing a permanent fix to the problem.

8.7.5 Mining Specifications

PACHIKA is an instance of a specification mining tool. The behavior models mined by PACHIKA were first implemented in the ADABU tool [28]. The concept was later adapted by Ghezzi et al. [47]. Their ADIHEU tool uses models generated by ADABU to support recovering algebraic specifications from program runs. This approach could also be used in PACHIKA to capture object behavior and find anomalies. Dynamic invariants, as conceived by Ernst et al. [39], express properties of data that hold at specific moments during the observed executions. By checking object attribute states, one could use the DAIKON [39] tool to extract pre- and postconditions for method calls and thus object behavior models. The concept of learning models from actual program runs was first explored by Ammons et al. [1], applying a probabilistic NFA learner to C traces. Their approach relies on manual annotations to relate functions to objects (such as C sockets or X11 selections) and to distinguish object definers from object users.

8.7.6 Generating Tests

Our work on generating fixes was heavily inspired by recent work on generating tests. Ciupa et al. [21] generate random sequences of method calls, leveraging existing contracts to retain only valid sequences. When a test case fails, the approach of Leitner et al. [64] automatically extracts a test case that reproduces the failure. Both the generation and the extraction of call sequences to characterize passing and failing runs are key concepts of PACHIKA.

8.8 Conclusions

Automatic generation of fixes, as the natural next step after fix localization, is an emerging field that is starting to gain momentum. We propose a new approach called PACHIKA that leverages object behavior models to analyze differences between normal and abnormal behavior. Our approach successfully constrains the search space to quickly generate potential fixes that not only remove the problem at hand, but also have a high diagnostic quality. In an evaluation with real bugs, PACHIKA was able to generate fixes that pass the test suite for five out of 53 bugs. The technique can easily be extended to quality assurance beyond testing: as soon as a specification can be automatically validated, PACHIKA can leverage it to filter fix candidates.

Chapter 9

Conclusions and Future Work

What happens when a computer program runs? The answer can be frustratingly elusive, as anyone who has debugged or tuned a program knows. As it runs, a program overwrites its previous state, which might have provided a clue as to how the program got to the point at which it computed the wrong answer or otherwise failed. This all-too-common experience is symptomatic of a more general problem: the difficulty of accurately and efficiently capturing and analyzing the sequence of events that occur when a program executes. – Thomas Ball/James R. Larus [6]

Capturing the dynamic behavior of a program is challenging. Even for short runs that last only a few seconds, a complete trace consists of several gigabytes of data. The sheer size of the trace file makes it difficult to analyze behavior based on this data. Hence, many applications represent dynamic information using different types of software execution models. A software execution model is an abstract representation of the trace file that captures those aspects of the dynamic behavior that are important for the application at hand. Existing types of software execution models mostly use only control-flow information. This thesis has presented a novel type of software execution model called object behavior models, which characterize the behavior of individual objects at runtime. Such models are finite state automata where states represent different states of the object and transitions occur due to method invocations. An object behavior model describes the effect of a method invocation in terms of changes to the object's state. By combining control-flow with information about the values of variables, object behavior models

capture important aspects of the dynamic behavior that are not represented by existing software execution models. The contributions of this thesis are as follows:

• We have presented object behavior models, a novel approach to modeling the runtime behavior of object-oriented programs. To make the models concise, we use an abstraction function, based on existing invariant categories further refined by static analysis, that maps concrete values to abstract categories.

• We have described ADABU, a tool that mines object behavior models from the execution of JAVA programs. Our implementation is robust and capable of mining models from the execution of large interactive programs such as ECLIPSE and ASPECTJ.

• We have presented IBUGS, an approach that mines bug benchmarks from the history of projects. Currently, the repository contains several hundred bugs mined from two large projects. The benchmark is publicly available so that other researchers can benefit from our work.

• We have shown that object behavior models can be used as specifications for the correct usage of a class. Our TAUTOKO tool uses ADABU to mine object behavior models from the execution of test suites. If the initial models do not provide enough coverage, TAUTOKO mutates the test suite to generate enriched specifications. When fed into a typestate verifier, enriched models are able to detect statistically significantly more bugs than initial models.

• We have presented an approach that automatically generates fix candidates for a given failure. Our PACHIKA tool uses ADABU to mine models from passing and failing runs and compares them to find violations of preconditions in the failing run. If a violation is found, PACHIKA analyzes the passing run models to propose patches that fix the violation. These candidate fixes are evaluated against the test suite, and PACHIKA proposes only those candidates that fix the problem and do not break any other tests. In a controlled experiment with the IBUGS subjects, PACHIKA was able to generate fixes that are semantically equivalent to fixes provided by the project developers.

In summary, this thesis advances the state of the art by introducing a new way to model dynamic program behavior, which can be used as a basis for mining specifications and for synthesizing fixes. To enable other researchers to benefit from this work, we have made the source code of all tools presented in this thesis available for download. The following sections present ideas for future work and instructions on where to download each tool.

9.1 iBugs

The IBUGS repository is available for download at

http://www.ibugs.org

Our next steps to improve the infrastructure will include the following:

Extend the repository. We plan to add more subjects to the IBUGS repository. In particular, we want to add projects that are multithreaded and provide a graphical user interface.

Classification of bugs. Our tags and fingerprints provide an initial classification of bugs. We plan to further improve this classification by using automated techniques from data mining. This will greatly improve the value of our data sets, because researchers can test for which kinds of bugs their tools perform best.

Score measure. In order to measure the success of bug localization tools, Renieris and Reiss introduced a score [86] that indicates the fraction of the code that can be ignored when searching for a bug. In future releases of our dataset, we want to provide a tool that computes this score. This will hopefully unify the assessment of results.

9.2 Tautoko

The source code and binary versions of TAUTOKO are available online at

http://www.st.cs.uni-saarland.de/models/tautoko/

Future work on TAUTOKO will include the following:

Technical Improvements. TAUTOKO can only apply one mutation at a time, which is why some states cannot be fully explored. This is due to limitations of the instrumentation framework. In the future, we would like to extend TAUTOKO such that arbitrary combinations of mutations are possible.

Test Case Generation. TAUTOKO's strategy for generating tests is simple but effective and only requires a test suite as input. However, there are many other approaches to test case generation that could be used just as well. One idea for future work is to compare different strategies in terms of their ability to enrich a specification.

9.3 Pachika

The project page for PACHIKA is available online at

http://www.st.cs.uni-saarland.de/models/pachika/

There are several interesting ideas to further explore possibilities for automatically generating fixes:

Alternate differences. Right now, the set of differences we observe and the set of fixes we can generate are limited to conditional method calls. However, there are many more potential fixes that could be generated. For instance, assigning a value to an attribute could instantly fix the object state.

Adaptive fix generation. With a larger set of possible fixes, one could consider adaptive techniques to systematically explore the search space, as in the approach of Weimer et al. [107]. One interesting possibility could be to start with behavioral differences as fix candidates (as PACHIKA does), and to use these as a basis for further mutations.

Assessing the impact of fixes. What happens if there are multiple fix candidates that all pass the test suite? In this case, we also would like to minimize the impact on passing executions, with impact measured using dynamic invariants [93], coverage [49], or object behavior models.

Appendix A

Additional Figures and Tables


1. Select bugs. Use the meta information provided in the file repository.xml to select relevant bugs.
Example: In order to select all bugs that raised a NullPointerException, use the XPath [112] expression
/repository/bug[tag="null pointer exception"]

2. Extract versions. Use the ant task checkoutversion.
Example: In order to check out the pre-fix and post-fix versions for Bug 4711, type
ant -DfixId=4711 checkoutversion
The results are placed in the directory "versions/4711/".

3. Build the program. Use the ant task buildversion.
Example: Build the pre-fix version of Bug 4711 with
ant -DfixId=4711 -Dtag=pre-fix buildversion
If the build succeeds, you find the Jar files in the directory ".../pre-fix/org.aspectj/modules/aj-build/dist/tools/lib/".
Note: Static tools can analyze the Jars in this directory, while dynamic tools that execute tests need to instrument the Jars created in the next step.

4. Build tests (dynamic tools). Use the ant task buildtests.
Example: In order to build the tests for the pre-fix version of Bug 4711, type
ant -DfixId=4711 -Dtag=pre-fix buildtests
This creates a Jar file that includes the ASPECTJ compiler and all resources needed for testing in the directory "versions/4711/pre-fix/org.aspectj/modules/aj-build/jars/".

Figure A.1: Step-by-step guide for an evaluation based on IBUGS (1/2). 163

5. Run test suites (dynamic tools). Use the ant tasks runharnesstests for the integration test suite and runjunittests for the unit test suite of ASPECTJ, respectively.
Example: Run unit tests for the pre-fix version of Bug 4711 with
ant -DfixId=4711 -Dtag=pre-fix runjunittests

6. Run specific tests (dynamic tools). Generate scripts by using the ant task gentestscript and execute them.
Example: In order to execute test "SUID: thisJoinPoint" described in file "org.aspectj/modules/tests/ajcTests.xml", generate a script with
ant -DfixId=4711 -Dtag=pre-fix -DtestFileName="org.aspectj.modules/tests/ajcTests.xml" -DtestName="SUID: thisJoinPoint"
This creates a new ant script in the directory "4711/pre-fix/org.aspectj/modules/tests/". Execute this file to run test "SUID: thisJoinPoint".
Hint: All tests executed by the test suite are described in the file "4711/pre-fix/testresults.xml".

7. Assess your tool. Compare the predicted bug location against the location changed in the fix (see repository.xml).

Figure A.2: Step-by-step guide for an evaluation based on IBUGS (2/2). Static bug localization tools typically integrate with Steps 3 and 4. Dynamic tools need to run programs and therefore integrate with Steps 4, 5, and 6.

Bug 42539: "throw derivative pointcuts not advised." Fingerprint: M Z-if

for (int i = types.length - 1;
     i >= 0; i--) {
-  if (typePattern.matchesExactly
-      (types[i])) return true;
+  if (typePattern.matchesStatically
+      (types[i])) return true;
}
return false;

Bug 43194: "java.lang.VerifyError in generated code" Fingerprint: K-this M

ResolvedTypeX[] parameterTypes =
    searchStart.getWorld().resolve(..);

- arguments = arguments.
-     resolveReferences(bindings);
+ TypePatternList arguments =
+     this.arguments.
+     resolveReferences(bindings);

IntMap newBindings = new IntMap();

Bug 67774: "Nullpointer-exception in pointcuts using withincode() clause" Fingerprint: K-else K-if K-return M O-== Z-if

if (getKind().isEnclosingKind()) {
  return getSignature();
+ } else if (getKind() ==
+     Shadow.PreInitialization) {
+   // PreInit doesn't enclose code
+   // but its signature
+   // is correctly the signature
+   // of the ctor.
+   return getSignature();
} else if (enclosingShadow == null) {
  return getEnclosingMethod().
      getMemberView();

Bug 69011: "ajdoc fails when using default package" Fingerprint: O-!= O-&& T V Y Z-if

String packageName = StructureUtil.
    getPackageDeclarationFromFile
    (inputFile);

- if (packageName != null ) {
+ if (packageName != null &&
+     packageName != "") {
  writer.println( "package " +
      packageName + ";" );
}

Figure A.3: Examples for different bugs with fingerprints (1/3). Bug identifiers refer to the ASPECTJ project. 165

Bug 80916: "In some cases the structure model doesn't contain the matches declare relationship" Fingerprint: M T V

if (shadow.getSourceLocation()
        == null
    || checker.getSourceLocation()
        == null)
  return;

+ // Ensure a node for the target exists
+ IProgramElement targetNode =
+     getNode(...);
+
String sourceHandle = targetNode.
    createHandleIdentifier(
        checker.getSourceLocation().
            getSourceFile(),

Bug 123695: "Internal nullptr exception with complex declare annotation statement that affects injected methods" Fingerprint: K-null O-!= O-&& T V Z-if

// matched by the typePattern.
ResolvedType[] annTypes =
    annotated.getAnnotationTypes();
- if (annTypes.length!=0) {
+ if (annTypes!=null &&
+     annTypes.length!=0) {
  for (int i = 0;
       i < annTypes.length;
       i++) {

Bug 132130: "Missing relationship for declare @method when annotating a co-located method" Fingerprint: Y

  }
}
- if (it.hasNext())
-   sb.append(", ");
+ if (it.hasNext())
+   sb.append(",");
}
sb.append(')');

Figure A.4: Examples for different bugs with fingerprints (2/3). Bug identifiers refer to the ASPECTJ project. 166 Appendix A. Additional Figures and Tables

Bug 151182: "NPE in BcelWeaver using LTW" Fingerprint: K-synchronized T V

try {
+ synchronized (loader) {
    WeavingAdaptor weavingAdaptor =
        WeaverContainer.getWeaver(...);
    if (weavingAdaptor == null) {
      if (trace.isTraceEnabled())
        trace.exit("preProcess",
            bytes);
      return bytes;
    }
    return weavingAdaptor.
        weaveClass(className, bytes);
+ }
} catch (Exception t) {
  trace.error("preProcess", t);

Bug 161217: "NPE in BcelAdvice" Fingerprint: Z-if

// at the moment it only deals with
// 'declared exception is not thrown'
if (!shadow.getWorld().
        isIgnoringUnusedThrownException()
-   && !thrownExceptions.isEmpty()) {
+   && !getThrownExceptions().
+       isEmpty()) {
  Member member =
      shadow.getSignature();
  if (member instanceof
      BcelMethod) {

Figure A.5: Examples for different bugs with fingerprints (3/3). Bug identifiers refer to the ASPECTJ project. 167

K M Z
K-else K-if K-null M O-! O-&& O-+ T V Y Z-if
...
<file location="ajcTests.xml">
...
<fixedFiles>
<file name="ResolvedTypex.java" revision="1.27">
...
1194c1194,1202
<
---
> if (parent.isStatic()
>     && !child.isStatic()) {
...

Figure A.6: XML content descriptor for bug 69459.

Appendix B

Trace File Format Description

This chapter describes the trace file format as processed by the ADABU model miner. The purpose of this chapter is to provide enough information to implement a language frontend that generates a trace file.

B.1 Concepts

The trace file was designed with the following goals in mind:

Simplicity The existing JAVA implementation uses sophisticated instrumentation techniques to gather the required information. As instrumentation in JAVA is very fragile, instrumentation and tracing is kept as simple as possible.

Self-Containedness The trace file is self-contained, i.e. it contains all information in one single file.

Events The trace file essentially is a sequence of events. Each event describes an action in the run that is of interest to ADABU.

Independence The event types are designed for a maximum of independence between events that deal with different aspects of the execution. For example, it is possible to completely turn off tracing of array operations.

Stream processing ADABU only reads the trace file once and does not jump between different locations in the trace file. This requires that information in the trace file respects a certain order. At least for JAVA, this can sometimes be difficult.


Record Everything Sometimes it is difficult to reproduce exactly the same execution due to different schedules for the garbage collector and other issues. To avoid those problems, the existing tracer records as much information as possible for as many objects as possible.

Identifiers For efficiency reasons, the trace file makes heavy use of identifiers. For example, the name and signature of a method m are only transmitted once. All subsequent invocations of m only specify an identifier for the method.

B.1.1 Serialization

The existing implementation serializes data using JAVA's DataOutputStream class. This is a low-level serialization class that provides methods for writing all primitive JAVA types. To read the data, the corresponding DataInputStream class can be used. In order to be processed by ADABU, the trace file has to use the same serialization scheme as DataOutputStream. Porting DataOutputStream to another language should be fairly easy.
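As an illustration, the following sketch performs a round trip with these classes; the particular values written here are arbitrary and do not correspond to one of the event types defined below.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Minimal round trip with the serialization primitives the trace format
// relies on: a byte, an int, and a UTF-8 string.
public class SerializationDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeByte(1);            // e.g. an event identifier
            out.writeInt(42);            // e.g. a thread identifier
            out.writeUTF("some/Class");  // e.g. a class name
        }
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            System.out.println(in.readByte() + " " + in.readInt() + " " + in.readUTF());
        }
    }
}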

B.1.2 Object Identifiers

To mine models, ADABU needs to be able to identify the target object for field accesses and method invocations. To achieve this, all events associated with a target object specify an identifier (a positive integer) for the target object. This identifier must be unique among all objects and remain unchanged over the lifetime of the object. An object's identifier is established right after the object was created and before the constructor is invoked.

B.1.3 Method Identifiers

A method identifier is a positive integer that identifies a method in a unique manner. Similar to object identifiers, method identifiers have to be specified in the trace file before the first use.

B.1.4 Field Identifiers

A field identifier is a positive integer that identifies a field in a unique manner. Field identifiers also have to be specified before they are used for the first time.

B.1.5 Thread Identifiers

ADABU is able to distinguish events in different threads. To do so, ADABU requires almost all events to specify an identifier for the thread that caused the event. Just like the other identifier types, thread identifiers are positive integers that uniquely identify a thread.

B.1.6 Allocation Site Identifiers

For some analyses, ADABU needs to distinguish different allocation sites within a method. The new object event (Section B.2.1) therefore also specifies an integer that uniquely identifies the allocation site within the method.

B.1.7 Invocation Site Identifiers

In some circumstances, ADABU needs to know exactly where a call to a method came from. To this end, the invocation site event (Section B.2.2) specifies a positive identifier that uniquely identifies the code position from which the method call originated.

B.2 Events

This section describes all possible events in the trace file. Every event writes the following two values as its first values:

name       type      description
eventId    byte      An identifier for the event type. A list of event identifiers can be found in Table B.1.
threadId   integer   An identifier for the thread that caused the event (see Section B.1.5).

B.2.1 Identifier Events

Method Identifier Events

This event specifies the identifier for a certain method. The event has to occur before any other event that uses the method identifier. It specifies the following values:

name        type     description
methodId    integer  A positive integer that uniquely identifies the method.
methodName  utf8     A serialized representation of the method name (see below).

A method identifier is serialized as the concatenation of the following parts (a sketch follows the list):

1. The fully qualified name of the class that contains the method, with slashes instead of dots as separator for the package levels.

2. A dot.

3. The name of the method.

4. The signature of the method as specified by the JAVA virtual machine specification [68].

5. A hash sign.

6. The access modifier, encoded as specified by the ASM documentation [82].
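The following sketch (not ADABU code; the class, method, and modifier values are made up, and the "hash" in step 5 is assumed to be the '#' character) assembles such an identifier string:

    public class MethodIdSketch {
        public static void main(String[] args) {
            String className  = "java/util/Vector";      // step 1: slashes as package separator
            String methodName = "add";                    // step 3
            String descriptor = "(Ljava/lang/Object;)Z";  // step 4: JVM method descriptor
            int access        = 1;                        // step 6: e.g. ASM's ACC_PUBLIC

            // steps 1-6: class name, dot, method name, signature, hash sign, access modifier
            String methodId = className + "." + methodName + descriptor + "#" + access;
            System.out.println(methodId);
            // prints: java/util/Vector.add(Ljava/lang/Object;)Z#1
        }
    }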

Field Identifier Events

This event specifies the identifier for a certain field. The event has to occur before any other event that uses it.

name       type     description
fieldId    integer  A positive integer that uniquely identifies the field.
fieldName  utf8     A serialized representation of the field name (see below).

A field identifier is serialized as follows:

1. The fully qualified name of the class that contains the field, with slashes instead of dots as separator for the package levels.

2. A percentage sign.

3. The name of the field.

4. A percentage sign.

5. The type of the field as specified by the JAVA virtual machine specification [68].

6. A percentage sign.

7. A serialized boolean (true or false) indicating whether or not the field is static.

Class Identifier Events

This event specifies an identifier for a class. It has to occur before any other event that uses it.

name       type     description
classId    integer  A positive integer that uniquely identifies the class.
className  utf8     The fully qualified class name with slashes instead of dots.

Object Created Events

This event indicates that a new object was created. It has to occur before any other event that is associated with the object. In particular, it also has to occur before the first call to a constructor.

name              type     description
objectId          integer  The identifier for the new object.
classId           integer  The identifier for the runtime class of the new object.
allocationSiteId  integer  The identifier for the allocation site (see Section B.1.6).

Array Created Events

This event indicates that a new array was created. It is similar to the object created event described in Section B.2.1. In JAVA, arrays are also objects and therefore also get an identifier. For multi-dimensional arrays, the trace must contain array created events (and array write events, see Section B.2.6) for all arrays created.

name              type     description
arrayId           integer  The identifier for the new array.
typeDesc          utf8     The type description for the new array as specified by the virtual machine specification [68].
allocationSiteId  integer  The identifier for the allocation site (see Section B.1.6).

B.2.2 Method Call Events

Method Start Event

This event indicates that the execution of a method has started. It has to occur before any other event that happens while the method is active.

name      type     description
methodId  integer  The identifier for the method.
objectId  integer  The identifier of the target object for the call, or −2 if this is a static method.

Regular Method End Event

This event indicates the end of a method. It has to be the last event that is traced while the method is active. It indicates a regular method end, i.e., the method returned normally and did not raise an exception.

name      type     description
methodId  integer  The identifier for the method.

Exceptional Method End Event

This event indicates that a method ended by raising an exception. It has to be the last event that is traced while the method is active.

name            type     description
methodId        integer  The identifier for the method.
exceptionClass  utf8     The fully qualified class name of the exception that was raised, with slashes instead of dots.

Invocation Site Event

This event specifies the invocation site for a call (see Section B.1.7). It has to occur right before the corresponding method start event.

name    type     description
siteId  integer  The identifier for the invocation site.

B.2.3 Parameter Events

A parameter event specifies the value that was passed as a parameter to the currently active method. Parameter events occur right after the corresponding method start event. Similar to field events, there is one event type for each primitive type and one for complex types (see Table B.1).

name   type             description
index  integer          The index of the parameter.
value  see description  The value of the parameter. For primitive types, this is simply the serialized value. For complex types, the value is the object identifier passed as an integer.

B.2.4 Return Events

A return event specifies the value that was returned by a method invocation. Return events occur right before the corresponding method end event. There is one return event type for all 32-bit integer types (in JAVA these are boolean, byte, char, short and int), one for void methods, one for each of the remaining primitive types, and one for complex return values.

name   type             description
value  see description  The returned value. For primitive types, this is simply the serialized value. For complex types, the value is the object identifier passed as an integer.
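Putting the call-related events together, the following sketch writes the event sequence for one hypothetical call of an int-returning method with a single int parameter. All concrete values (thread 1, method 12, object 17, invocation site 3, parameter 4, return value 99) are made up; the event identifiers are taken from Table B.1, and the real tracer may of course interleave further events:

    import java.io.DataOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public class CallSequenceSketch {
        public static void main(String[] args) throws IOException {
            try (DataOutputStream out =
                     new DataOutputStream(new FileOutputStream("trace.bin"))) {
                int threadId = 1, methodId = 12, objectId = 17;
                // A method identifier event for methodId 12 would have to
                // precede this sequence (see Section B.2.1).

                out.writeByte(7);  out.writeInt(threadId); out.writeInt(3);        // invocation site
                out.writeByte(5);  out.writeInt(threadId); out.writeInt(methodId); // method start
                out.writeInt(objectId);                                            //   target object
                out.writeByte(53); out.writeInt(threadId); out.writeInt(0);        // int parameter, index 0
                out.writeInt(4);                                                   //   parameter value
                out.writeByte(57); out.writeInt(threadId); out.writeInt(99);       // int return value
                out.writeByte(6);  out.writeInt(threadId); out.writeInt(methodId); // regular method end
            }
        }
    }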

B.2.5 Field Access Events

This section summarizes all events that indicate read or write access to a field.

Field Read Events

A field read event occurs if the value of a class field is read by a method. All field read events specify the same values:

name      type             description
objectId  integer          The identifier for the object the field belongs to, or −2 if this is a static field.
fieldId   integer          The identifier of the field that was read.
value     see description  The value that was read from the field. For fields of primitive type, this is simply the serialized value of the field. For fields of complex type, the value is the object identifier passed as an integer.

Table B.1 lists all field read events. Basically, there is one event type for each primitive type, and one for complex types.

Field Write Events

A field write event occurs if the value of a class field is written by a method. All field write events specify the same values:

name      type             description
objectId  integer          The identifier for the object the field belongs to, or −2 if this is a static field.
fieldId   integer          The identifier of the field that was written.
value     see description  The value that was written to the field. For fields of primitive type, this is simply the serialized value of the field. For fields of complex type, the value is the object identifier passed as an integer.

Table B.1 lists all field write events. Basically, there is one event type for each primitive type, and one for complex types.
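To illustrate which source constructs cause these events, consider the following hypothetical traced class; the comments name the events that each field access would produce at run time (the field identifiers 7 and 8 are made up):

    // Hypothetical traced class, annotated with the field access events
    // that its statements would cause.
    public class Counter {
        private int count;     // field, say with fieldId 7
        private Object last;   // field, say with fieldId 8

        public void increment(Object cause) {
            int c = count;     // EV FIELDREAD INT:     objectId, fieldId 7, current value
            count = c + 1;     // EV FIELDWRITE INT:    objectId, fieldId 7, new value
            last = cause;      // EV FIELDWRITE OBJECT: objectId, fieldId 8, identifier of cause
        }
    }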

B.2.6 Array Access Events

Although arrays are in many ways similar to objects, ADABU has separate events for arrays to allow for independent tracing of objects and arrays.

Array Read Events

An array read event occurs if an array element is read by a method. All array read events specify the same values:

name     type             description
arrayId  integer          The identifier for the array.
index    integer          The index in the array that was accessed.
value    see description  The value that was read from the array. For arrays of primitive type, this is simply the serialized value. For arrays of complex type, the value is the object identifier passed as an integer.

Table B.1 lists all array read events. Basically, there is one event type for each primitive type, and one for complex types.

Array Write Events

An array write event occurs if an array element is written by a method. All array write events specify the same values:

name     type             description
arrayId  integer          The identifier for the array.
index    integer          The index in the array that was accessed.
value    see description  The value that was written to the array. For arrays of primitive type, this is simply the serialized value. For arrays of complex type, the value is the object identifier passed as an integer.

B.2.7 Inspector Events

ADABU supports using return values of inspectors in the state representation. To this end, the user has to specify an XML file with a list of inspectors. At trace time, ADABU reads this file and injects additional calls to the inspectors at the beginning and the end of every non-inspector (and non-static) method. The return values are then written to the stream using the inspector events 62 to 66 (see Table B.1).

name      type             description
objectId  integer          The identifier for the target object.
methodId  integer          The identifier for the inspector.
value     see description  The value that was returned by the inspector. For inspectors with primitive return values, this is simply the serialized value. For inspectors with complex return type, the value is the object identifier passed as an integer.
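As an illustration of the inspector concept, consider the following hypothetical class (the format of the XML inspector list is not reproduced here):

    import java.util.ArrayList;
    import java.util.List;

    public class Stack {
        private final List<Object> elements = new ArrayList<>();

        public void push(Object o) { elements.add(o); }

        public Object pop() { return elements.remove(elements.size() - 1); }

        // A typical inspector: it reads state but never modifies it. If size()
        // is listed in the inspector file, ADABU injects calls to it at the
        // beginning and end of push() and pop() and records the returned int
        // via EV INSPECTOR INT (event 62).
        public int size() { return elements.size(); }
    }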

B.2.8 List of Event Identifiers

Table B.1 lists the identifiers of all event types processed by ADABU.

name                    Id   name                    Id
EV METHODNAME            2   EV ARRAYWRITE OBJECT    36
EV FIELDNAME             3   EV EX METHODEND         37
EV CLASSNAME             4   EV ASTORE               38
EV METHODSTART           5   EV FIELDREAD BOOLEAN    39
EV METHODEND             6   EV FIELDREAD BYTE       40
EV CALLSITE              7   EV FIELDREAD SHORT      41
EV FIELDWRITE BOOLEAN    8   EV FIELDREAD CHAR       42
EV FIELDWRITE BYTE       9   EV FIELDREAD INT        43
EV FIELDWRITE SHORT     10   EV FIELDREAD LONG       44
EV FIELDWRITE CHAR      11   EV FIELDREAD FLOAT      45
EV FIELDWRITE INT       12   EV FIELDREAD DOUBLE     46
EV FIELDWRITE LONG      13   EV FIELDREAD OBJECT     47
EV FIELDWRITE FLOAT     14   EV PURGE                48
EV FIELDWRITE DOUBLE    15   EV PARAMETER BOOL       49
EV FIELDWRITE OBJECT    16   EV PARAMETER BYTE       50
EV PARAMETER OBJECT     17   EV PARAMETER SHORT      51
EV OBJECTCREATED        18   EV PARAMETER CHAR       52
EV ARRAYCREATED         20   EV PARAMETER INT        53
EV ARRAYREAD BYTE       21   EV PARAMETER LONG       54
EV ARRAYREAD SHORT      22   EV PARAMETER FLOAT      55
EV ARRAYREAD CHAR       23   EV PARAMETER DOUBLE     56
EV ARRAYREAD INT        24   EV RETURN OBJECT        19
EV ARRAYREAD LONG       25   EV RETURN INT           57
EV ARRAYREAD FLOAT      26   EV RETURN FLOAT         58
EV ARRAYREAD DOUBLE     27   EV RETURN LONG          59
EV ARRAYREAD OBJECT     28   EV RETURN DOUBLE        60
EV ARRAYWRITE BYTE      29   EV RETURN VOID          61
EV ARRAYWRITE SHORT     30   EV INSPECTOR INT        62
EV ARRAYWRITE CHAR      31   EV INSPECTOR FLOAT      63
EV ARRAYWRITE INT       32   EV INSPECTOR LONG       64
EV ARRAYWRITE LONG      33   EV INSPECTOR DOUBLE     65
EV ARRAYWRITE FLOAT     34   EV INSPECTOR OBJECT     66
EV ARRAYWRITE DOUBLE    35

Table B.1: A list of all event identifiers processed by ADABU.

Bibliography

[1] AMMONS, G., BODÍK, R., AND LARUS, J. Mining Specifications. In Conference Record of POPL'02: the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (Portland, Oregon, Jan. 16–18, 2002), pp. 4–16.

[2] ANVIK,J.,HIEW,L., AND MURPHY, G. C. Who Should Fix this Bug? In ICSE ’06: Proceeding of the 28th International Conference on Software Engineering (New York, NY, USA, 2006), ACM Press, pp. 361–370.

[3] ARTZI,S.,KIEZUN,A.,GLASSER,D., AND ERNST, M. D. Combined Static and Dynamic Mutability Analysis. In ASE ’07: Proceedings of the twenty- second IEEE/ACM International Conference on Automated Software Engineer- ing (2007), ACM, pp. 104–113.

[4] ARTZI,S.,KIM,S., AND ERNST, M. D. ReCrash: Making Software Failures Reproducible by Preserving Object States. In ECOOP 2008 — Object-Oriented Programming, 22nd European Conference (Paphos, Cyprus, July 9–11, 2008), pp. 542–565.

[5] AT&T. Graphviz Graph Visualization. http://www.graphviz.org/ as of 04-07-2010.

[6] BALL, T., AND LARUS, J. R. Using Paths to Measure, Explain, and Enhance Program Behavior. Computer 33, 7 (2000), 57–65.

[7] BANNING, J. P. An Efficient Way to Find the Side Effects of Procedure Calls and the Aliases of Variables. In POPL ’79: Proceedings of the 6th ACM SIGACT-SIGPLAN Symposium on Principles of Programming languages (1979), ACM, pp. 29–41.


[8] BARNETT, M., DELINE, R., FÄHNDRICH, M., JACOBS, B., LEINO, K. R., SCHULTE, W., AND VENTER, H. The Spec# Programming System: Challenges and Directions. In Verified Software: Theories, Tools, Experiments: First IFIP TC 2/WG 2.3 Conference, VSTTE 2005, Zurich, Switzerland, October 10-13, 2005, Revised Selected Papers and Discussions (Berlin, Heidelberg, 2008), Springer-Verlag, pp. 144–152.

[9] BERTOLINO,A.,INVERARDI, P., PELLICCIONE, P., AND TIVOLI, M. Auto- matic Synthesis of Behavior Protocols for Composable Web-services. In ES- EC/FSE ’09: Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (New York, NY, USA, 2009), ACM, pp. 141–150.

[10] BIERHOFF,K., AND ALDRICH, J. Modular Typestate Checking of Aliased Objects. In OOPSLA ’07: Proceedings of the 22nd annual ACM SIGPLAN Conference on Object-oriented Programming Systems and applications (New York, NY, USA, 2007), ACM, pp. 301–320.

[11] BIERMANN, A. W., AND FELDMAN, J. A. On the Synthesis of Finite-State Machines from Samples of Their Behavior. IEEE Transactions Comput. 21, 6 (1972), 592–597.

[12] BIRD,C.,BACHMANN,A.,AUNE,E.,DUFFY,J.,BERNSTEIN,A.,FILKOV, V., AND DEVANBU, P. Fair and Balanced?: Bias in Bug-fix Datasets. In ES- EC/FSE ’09: Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Founda- tions of Software Engineering on European Software Engineering Conference and Foundations of Software Engineering Symposium (New York, NY, USA, 2009), ACM, pp. 121–130.

[13] BLACKBURN, S. M., GARNER, R., HOFFMANN, C., KHANG, A. M., MCKINLEY, K. S., BENTZUR, R., DIWAN, A., FEINBERG, D., FRAMPTON, D., GUYER, S. Z., HIRZEL, M., HOSKING, A., JUMP, M., LEE, H., MOSS, J. E. B., MOSS, B., PHANSALKAR, A., STEFANOVIĆ, D., VANDRUNEN, T., VON DINCKLAGE, D., AND WIEDERMANN, B. The DaCapo benchmarks: Java benchmarking development and analysis. In OOPSLA '06: Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications (New York, NY, USA, 2006), ACM, pp. 169–190.

[14] BORTZ, J. Statistik für Sozialwissenschaftler, 4 ed. Springer, Berlin [u.a.], 1993.

[15] BOWRING, J. F., REHG,J.M., AND HARROLD, M. J. Active Learning for Au- tomatic Classification of Software Behavior. In ISSTA ’04: Proceedings of the 2004 ACM SIGSOFT International Symposium on Software Testing and Analy- sis (New York, NY, USA, 2004), ACM, pp. 195–205.

[16] BRANDES, U. GraphML XML Graph File Format. http://graphml.graphdrawing.org/ as of 04-07-2010.

[17] BRIAND, L. C. A Critical Analysis of Empirical Research in Software Testing. In ESEM ’07: Proceedings of the First International Symposium on Empirical Software Engineering and Measurement (Washington, DC, USA, 2007), IEEE Computer Society, pp. 1–8.

[18] BURGER, M. Locating Failure-Inducing Code Changes in an Industrial Envi- ronment. Diploma thesis, Saarland University, December 2005.

[19] CATANO,N., AND HUISMAN, M. CHASE: A Static Checker for JML’s Assignable Clause. In VMCAI 2003: Proceedings of the 4th International Con- ference on Verification, Model Checking, and Abstract Interpretation (2003), Springer-Verlag, pp. 26–40.

[20] CHIBA, S. Javassist 3.2. http://www.jboss.org/javassist as of 07-05-2010.

[21] CIUPA,I.,LEITNER,A.,ORIOL,M., AND MEYER, B. Experimental As- sessment of Random Testing for Object-oriented Software. In ISSTA ’07: Pro- ceedings of the 2007 International Symposium on Software Testing and Analysis (New York, NY, USA, 2007), ACM, pp. 84–94.

[22] CLEVE,H., AND ZELLER, A. Locating Causes of Program Failures. In Pro- ceedings of the 27th International Conference on Software Engineering (ICSE 2005) (St. Louis, USA, 2005).

[23] COOK,J., AND WOLF, A. Discovering Models of Software Processes from Event-Based Data. ACM Transactions on Software Engineering and Methodol- ogy 7, 3 (July 1998), 215–249.

[24] COOPER, K. D., AND KENNEDY, K. Interprocedural Side-effect Analysis in Linear Time. In PLDI '88: Proceedings of the ACM SIGPLAN 1988 Conference on Programming Language Design and Implementation (1988), ACM, pp. 57–66.

[25] CUBRANIC,D.,MURPHY,G.C.,SINGER,J., AND BOOTH, K. S. Hipikat: A Project Memory for Software Development. IEEE Transactions on Software Engineering 31, 6 (June 2005), 446–465.

[26] DAHM, M. Byte Code Engineering with the Java API. Technical Report B-17-98, Freie Universität Berlin, Institut für Informatik, Berlin, Germany, July 07 1999.

[27] DALLMEIER, V., KNOPP,N.,MALLON,C.,HACK,S., AND ZELLER,A. Generating test cases for specification mining. In ISSTA ’10: Proceedings of the 19th International Symposium on Software Testing and Analysis (New York, NY, USA, 2010), ACM, pp. 85–96.

[28] DALLMEIER, V., LINDIG,C.,WASYLKOWSKI,A., AND ZELLER, A. Mining Object Behavior with ADABU. In WODA ’06: Proceedings of the 2006 Inter- national Workshop on Dynamic Systems Analysis (New York, NY, USA, 2006), ACM, pp. 17–24.

[29] DALLMEIER, V., LINDIG,C., AND ZELLER, A. Lightweight Defect Local- ization for Java. In European Conference on Object-Oriented Programming (ECOOP) (2005), A. Black, Ed.

[30] DALLMEIER, V., ZELLER,A., AND MEYER, B. Generating Fixes from Object Behavior Anomalies. Automated Software Engineering, International Confer- ence on (2009), 550–554.

[31] DALLMEIER, V., AND ZIMMERMANN, T. Extraction of Bug Localization Benchmarks from History. In ASE ’07: Proceedings of the twenty-second IEEE/ACM International Conference on Automated Software Engineering (New York, NY, USA, 2007), ACM, pp. 433–436.

[32] DELINE, R., AND FÄHNDRICH, M. Typestates for Objects. In Proceedings of the 18th ECOOP (2004), Springer, pp. 465–490.

[33] DEMILLO,R.A.,LIPTON,R.J., AND SAYWARD, F. G. Hints on Test Data Selection: Help for the Practicing Programmer. Computer 11, 4 (1978), 34–41.

[34] DEMSKY,B., AND RINARD, M. Data Structure Repair using Goal-directed Reasoning. In ICSE ’05: Proceedings of the 27th International Conference on Software Engineering (New York, NY, USA, 2005), ACM, pp. 176–185.

[35] DIETZ, F. Ristretto 1.0. http://ostatic.com/ristretto as of 01-12-2010.

[36] DIETZ,L.,DALLMEIER, V., ZELLER,A., AND SCHEFFER, T. Localizing Bugs in Program Executions with Graphical Model. In Advances in Neural Information Processing Systems (2009).

[37] DO,H.,ELBAUM,S.G., AND ROTHERMEL, G. Supporting Controlled Exper- imentation with Testing Techniques: An Infrastructure and its Potential Impact. Empirical Software Engineering: An International Journal 10, 4 (2005), 405– 435.

[38] ECLIPSE FOUNDATION. Eclipse GPL, 2008. http://www.eclipse. org/.

[39] ERNST,M.D.,COCKRELL,J.,GRISWOLD, W. G., AND NOTKIN, D. Dy- namically Discovering Likely Program Invariants to Support Program Evolu- tion. IEEE Transactions on Software Engineering 27, 2 (Feb. 2001), 1–25. A previous version appeared in ICSE ’99, Proceedings of the 21st International Conference on Software Engineering, pages 213–224, Los Angeles, CA, USA, May 19–21, 1999.

[40] FINK,S.J.,YAHAV,E.,DOR,N.,RAMALINGAM,G., AND GEAY, E. Ef- fective Typestate Verification in the Presence of Aliasing. ACM Transactions Software Engineering Methodology 17, 2 (2008), 1–34.

[41] FISCHER,M.,PINZGER,M., AND GALL, H. Populating a Release History Database from Version Control and Bug Tracking Systems. In Proceedings International Conference on Software Maintenance (ICSM 2003) (Amsterdam, Netherlands, Sept. 2003), IEEE.

[42] FOWLER,M. Refactoring: Improving the Design of Existing Code. Addison- Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999.

[43] FROST, R. Jazz and the Eclipse Way of Collaboration. IEEE Software 24, 6 (2007), 114–117.

[44] GAMMA, E. JUnit 3.8.1 GPL, 2007.

[45] GAMMA,E.,HELM,R.,JOHNSON,R., AND VLISSIDES, J. Design Patterns: Abstraction and Reuse of Object-Oriented Design, 1993.

[46] GEAY, E., YAHAV, E., AND FINK, S. Continuous Code-quality Assurance with SAFE. In PEPM '06: Proceedings of the 2006 ACM SIGPLAN Symposium on Partial Evaluation and Semantics-based Program Manipulation (New York, NY, USA, 2006), ACM Press, pp. 145–149.

[47] GHEZZI,C.,MOCCI,A., AND MONGA, M. Efficient Recovery of Algebraic Specifications for Stateful Components. In IWPSE ’07: Ninth International Workshop on Principles of Software evolution (New York, NY, USA, 2007), ACM, pp. 98–105.

[48] GOLD, E. Language Identification in the Limit. Information and Control (1967), 447–474.

[49] GRÜN, B. J. M., SCHULER, D., AND ZELLER, A. The Impact of Equivalent Mutants. In ICSTW '09: Proceedings of the IEEE International Conference on Software Testing, Verification, and Validation Workshops (Washington, DC, USA, 2009), IEEE Computer Society, pp. 192–199.

[50] GUPTA,N., AND HEIDEPRIEM, Z. V. A New Structural Coverage Criterion for Dynamic Detection of Program Invariants. Automated Software Engineering, International Conference on 0 (2003), 49.

[51] HANGAL,S., AND LAM, M. S. Tracking Down software Bugs using Auto- matic Anomaly Detection. In ICSE ’02: Proceedings of the 24th International Conference on Software Engineering (New York, NY, USA, 2002), ACM Press, pp. 291–301.

[52] HARROLD,M.J.,ROTHERMEL,G.,WU,R., AND YI, L. An Empirical In- vestigation of Program Spectra. In PASTE ’98: Proceedings of the 1998 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and En- gineering (New York, NY, USA, 1998), ACM, pp. 83–90.

[53] HIERONS, R. M., BOGDANOV, K., BOWEN, J. P., CLEAVELAND, R., DERRICK, J., DICK, J., GHEORGHE, M., HARMAN, M., KAPOOR, K., KRAUSE, P., AND LÜTTGEN, G. Using Formal Specifications to Support Testing. ACM Comput. Surv. 41, 2 (2009), 1–76.

[54] HOVEMEYER,D., AND PUGH, W. Finding Bugs is Easy. SIGPLAN Not. 39, 12 (2004), 92–106.

[55] INFO ETHER. PMD. http://pmd.sourceforge.net/.

[56] JHA, S., TAN, K., AND MAXION, R. A. Markov Chains, Classifiers, and Intrusion Detection. In CSFW '01: Proceedings of the 14th IEEE Workshop on Computer Security Foundations (Washington, DC, USA, 2001), IEEE Computer Society, p. 206.

[57] JONES,J.A., AND HARROLD, M. J. Empirical Evaluation of the Tarantula Automatic Fault-localization Technique. In ASE ’05: Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering (New York, NY, USA, 2005), ACM, pp. 273–282.

[58] Java Virtual Machine Tool Interface. http://java.sun.com/javase/6/docs/technotes/guides/jvmti/.

[59] KICZALES,G.,HILSDALE,E.,HUGUNIN,J.,KERSTEN,M.,PALM,J., AND GRISWOLD, W. G. An Overview of AspectJ. In Proceedings of the 15th Eu- ropean Conference on Object-Oriented Programming (ECOOP) (2001), J. L. Knudsen, Ed., vol. 2072 of Lecture Notes in Computer Science, pp. 327–353.

[60] KING, J. C. Symbolic Execution and Program Testing. Commun. ACM 19, 7 (1976), 385–394.

[61] KNIZHNIK,K., AND ARTHO, C. Jlint–Find bugs in Java programs. http://jlint.sourceforge.net/.

[62] KO,A.J., AND MYERS, B. A. A Framework and Methodology for Study- ing the Causes of Software Errors in Programming Systems. Journal of Visual Languages and Computing 16, 1-2 (2005), 41–84.

[63] KO,A.J., AND MYERS, B. A. Debugging Reinvented: Asking and Answering Why and Why Not Questions about Program Behavior. In ICSE ’08: Proceed- ings of the 30th International Conference on Software Engineering (New York, NY, USA, 2008), ACM, pp. 301–310.

[64] LEITNER,A.,CIUPA,I.,ORIOL,M.,MEYER,B., AND FIVA, A. Contract Driven Development = Test Driven Development - Writing Test Cases. In ESEC-FSE ’07: Proceedings of the ACM Symposium on the Foundations of Software Engineering (New York, NY, USA, 2007), ACM, pp. 425–434.

[65] LI,Z.,TAN,L.,WANG,X.,LU,S.,ZHOU, Y., AND ZHAI, C. Have Things Changed Now?: An Empirical Study of Bug Characteristics in Modern Open Source Software. In ASID ’06: Proceedings of the 1st Workshop on Architec- tural and System support for improving Software dependability (New York, NY, USA, 2006), ACM Press, pp. 25–33.

[66] LI, Z., AND ZHOU, Y. PR-Miner: Automatically Extracting Implicit Programming Rules and Detecting Violations in Large Software Code. In ESEC/FSE-13: Proceedings of the 10th European Software Engineering Conference held jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (New York, NY, USA, 2005), ACM Press, pp. 306–315.

[67] LIBLIT,B.,NAIK,M.,ZHENG,A.X.,AIKEN,A., AND JORDAN, M. I. Scal- able Statistical Bug Isolation. In PLDI ’05: Proceedings of the 2005 ACM SIG- PLAN Conference on Programming language design and implementation (New York, NY, USA, 2005), ACM Press, pp. 15–26.

[68] LINDHOLM, T., AND YELLIN, F. The Java Virtual Machine Specification, 1st ed. Addison-Wesley, Reading, Massachusetts, 1997.

[69] LIU,C.,YAN,X.,FEI,L.,HAN,J., AND MIDKIFF, S. P. SOBER: Statistical Model-based Bug Localization. In ESEC/FSE-13: Proceedings of the 10th Eu- ropean Software Engineering Conference held jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (New York, NY, USA, 2005), ACM Press, pp. 286–295.

[70] LORENZOLI, D., MARIANI, L., AND PEZZÈ, M. Automatic Generation of Software Behavioral Models. In ICSE '08: Proceedings of the 30th International Conference on Software Engineering (New York, NY, USA, 2008), ACM, pp. 501–510.

[71] LU,S.,LI,Z.,QIN, F., TAN,L.,ZHOU, P., AND ZHOU, Y. BugBench: Benchmarks for Evaluating Bug Detection Tools. In PLDI Workshop on the Evaluation of Software Defect Detection Tools (June 2005).

[72] MAJUMDAR,R., AND SEN, K. Hybrid Concolic Testing. In ICSE ’07: Pro- ceedings of the 29th International Conference on Software Engineering (Wash- ington, DC, USA, 2007), IEEE Computer Society, pp. 416–426.

[73] MCMINN, P. Search-based Software Test Data Generation: A Survey. Software Testing, Verification and Reliability 14 (2004), 105–156.

[74] MESBAH,A., AND VAN DEURSEN, A. Invariant-based Automatic Testing of AJAX User Interfaces. In ICSE ’09: Proceedings of the 2009 IEEE 31st Inter- national Conference on Software Engineering (Washington, DC, USA, 2009), IEEE Computer Society, pp. 210–220.

[75] MEYER, B., FIVA, A., CIUPA, I., LEITNER, A., WEI, Y., AND STAPF, E. Programs That Test Themselves. Computer 42 (2009), 46–55.

[76] MILANOVA,A.,ROUNTEV,A., AND RYDER, B. G. Parameterized Object Sensitivity for Points-to and Side-effect Analyses for Java. SIGSOFT Software Engineering Notes 27, 4 (2002), 1–11.

[77] MILICEVIC,A.,MISAILOVIC,S.,MARINOV,D., AND KHURSHID, S. Korat: A Tool for Generating Structurally Complex Test Inputs. In ICSE ’07: Proceed- ings of the 29th International Conference on Software Engineering (Washington, DC, USA, 2007), IEEE Computer Society, pp. 771–774.

[78] MOZILLA FOUNDATION. Rhino Javascript Interpreter. http://www. mozilla.org/rhino/ as of 01-28-2010.

[79] NATIONAL INSTITUTE OF STANDARDS. Software Errors Cost U.S. Economy $59.5 Billion Annually. http://www.nist.gov/publicaffairs/releases/n02-10.htm.

[80] NEUHAUS,S. Repeating the Past: Experimental and Empirical Methods in System and Software Security. PhD thesis, Saarland University, Department of Computer Science, 2008.

[81] NEUHAUS,S., AND ZELLER, A. Isolating Intrusions by Automatic Experi- ments. In Proceedings of the 13th Annual Network and Distributed System Se- curity Symposium (Reston, VA, USA, February 2006), Internet Society, pp. 71– 80.

[82] OBJECTWEB. ASM 3.2. http://asm.objectweb.org as of 12-01-2009.

[83] PRADEL,M., AND GROSS, T. R. Automatic Generation of Object Usage Spec- ifications from Large Method Traces. Automated Software Engineering, Inter- national Conference on 0 (2009), 371–382.

[84] PURUSHOTHAMAN,R., AND PERRY, D. E. Towards Understanding the Rhetoric of Small Source Code Changes. IEEE Transactions on Software Engi- neering 31, 6 (2005), 511–526.

[85] RAMANATHAN,M.K.,GRAMA,A., AND JAGANNATHAN, S. Static Specifi- cation Inference using Predicate Mining. In PLDI ’07: Proceedings of the 2007 ACM SIGPLAN Conference on Programming language design and implementa- tion (New York, NY, USA, 2007), ACM, pp. 123–134.

[86] RENIERIS, M., AND REISS, S. P. Fault Localization With Nearest Neighbor Queries. In Proceedings 18th IEEE International Conference on Automated Software Engineering (ASE) (2003), IEEE Computer Society, pp. 30–39.

[87] REPS, T., BALL, T., DAS,M., AND LARUS, J. The Use of Program Profiling for Software Maintenance with Applications to the Year 2000 Problem. In Pro- ceedings of the Sixth European Software Engineering Conference (ESEC/FSE 97) (Sept. 1997), M. Jazayeri and H. Schauer, Eds., Lecture Notes in Computer Science Nr. 1013, Springer–Verlag, pp. 432–449.

[88] RINARD,M.,CADAR,C.,DUMITRAN,D.,ROY,D.M., AND LEU, T. A Dynamic Technique for Eliminating Buffer Overflow Vulnerabilities (and Other Memory Errors). In ACSAC ’04: Proceedings of the 20th Annual Computer Se- curity Applications Conference (Washington, DC, USA, 2004), IEEE Computer Society, pp. 82–90.

[89] ROTHERMEL,G., AND HARROLD, M. J. Empirical Studies of a Safe Regres- sion Test Selection Technique. IEEE Transactions Software Engineering 24, 6 (1998), 401–419.

[90] ROUNTEV, A. Precise Identification of Side-Effect-Free Methods in Java. In ICSM ’04: Proceedings of the 20th IEEE International Conference on Software Maintenance (2004), IEEE Computer Society, pp. 82–91.

[91] RUTAR,N.,ALMAZAN,C.B., AND FOSTER, J. S. A Comparison of Bug Finding Tools for Java. In ISSRE ’04: Proceedings of the 15th International Symposium on Software Reliability Engineering (ISSRE’04) (Washington, DC, USA, 2004), IEEE Computer Society, pp. 245–256.

[92] SANTELICES,R.,JONES,J.A.,YANBING, Y., AND HARROLD,M.J. Lightweight Fault-localization Using Multiple Coverage Types. In ICSE ’09: Proceedings of the 2009 IEEE 31st International Conference on Software Engi- neering (Washington, DC, USA, 2009), IEEE Computer Society, pp. 56–66.

[93] SCHULER,D.,DALLMEIER, V., AND ZELLER, A. Efficient Mutation Test- ing by Checking Invariant Violations. In ISSTA ’09: Proceedings of the eigh- teenth International Symposium on Software Testing and Analysis (New York, NY, USA, 2009), ACM, pp. 69–80.

[94] SHOHAM,S.,YAHAV,E.,FINK,S., AND PISTOIA, M. Static Specification Mining using Automata-based Abstractions. In ISSTA ’07: Proceedings of the 2007 International Symposium on Software Testing and Analysis (New York, NY, USA, 2007), ACM, pp. 174–184.

[95] ŚLIWERSKI, J., ZIMMERMANN, T., AND ZELLER, A. When Do Changes Induce Fixes? On Fridays. In Proceedings International Workshop on Mining Software Repositories (MSR) (St. Louis, Missouri, U.S., May 2005).

[96] SPACCO,J.,HOVEMEYER,D., AND PUGH, W. BugBench: Benchmarks for Evaluating Bug Detection Tools. In PLDI Workshop on the Evaluation of Soft- ware Defect Detection Tools (June 2005).

[97] SPACCO,J.,STRECKER,J.,HOVEMEYER,D., AND PUGH, W. Software Repository Mining with Marmoset: An Automated Programming Project Snap- shot and Testing System. In MSR ’05: Proceedings of the 2005 International Workshop on Mining Software repositories (New York, NY, USA, 2005), ACM Press, pp. 1–5. See also: http://marmoset.cs.umd.edu/.

[98] SRIDHARAN, M., AND BODÍK, R. Refinement-based Context-sensitive Points-to Analysis for Java. SIGPLAN Notes 41, 6 (2006), 387–400.

[99] STROM,R.E., AND YEMINI, S. Typestate: A programming Language Concept for Enhancing software Reliability. Transactions Software Engineering 12, 1 (1986), 157–171.

[100] SU,Z., AND MISHERGHI, G. HDD: Hierarchical Delta Debugging. Software Engineering, International Conference on 0 (2006), 142–151.

[101] SĂLCIANU, A., AND RINARD, M. C. Purity and Side-effect Analysis for Java Programs. In Proceedings of the 6th International Conference on Verification, Model Checking and Abstract Interpretation (VMCAI'05) (2005), pp. 199–215.

[102] THIELE, M. Classification of Software Defects. Bachelor’s thesis, Saarland University, January 2007.

[103] TONELLA, P. Evolutionary Testing of Classes. SIGSOFT Software Engineering Notes 29, 4 (2004), 119–128.

[104] VEANES,M.,CAMPBELL,C.,SCHULTE, W., AND TILLMANN, N. Online testing with Model Programs. SIGSOFT Software Engineering Notes 30, 5 (2005), 273–282.

[105] WASYLKOWSKI,A.,ZELLER,A., AND LINDIG, C. Detecting Object Usage Anomalies. In ESEC-FSE ’07: Proceedings of the the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (New York, NY,USA, 2007), ACM, pp. 35–44.

[106] WEIMER, W. Patches as Better Bug Reports. In GPCE '06: Proceedings of the 5th International Conference on Generative Programming and Component Engineering (New York, NY, USA, 2006), ACM, pp. 181–190.

[107] WEIMER, W., NGUYEN, T., GOUES,C.L., AND FORREST, S. Automatically Finding Patches Using Genetic Programming. In Proceedings of the Interna- tional Conference on Software Engineering (ICSE) (Vancouver, Canada, May 2009).

[108] WHALEY,J.,MARTIN,M.C., AND LAM, M. S. Automatic Extraction of Object-oriented Component Interfaces. In ISSTA ’02: Proceedings of the 2002 ACM SIGSOFT International Symposium on Software Testing and Analy- sis (New York, NY, USA, 2002), ACM, pp. 218–228.

[109] WHITTAKER,J.A.,REKAB,K., AND THOMASON, M. G. A Markov Chain Model for Predicting the Reliability of Multi-build Software. Information and Software Technology 42, 12 (2000), 889 – 894.

[110] WILLIAMS,C., AND HOLLINGSWORTH, J. K. Bug Driven Bug Finders. In Proceedings International Workshop on Mining Software Repositories (MSR 2004) (Edinburgh, Scotland, UK, May 2004), pp. 70–74.

[111] WILLIAMS,C.C., AND HOLLINGSWORTH, J. K. Automatic Mining of Source Code Repositories to Improve Bug Finding Techniques. IEEE Transactions on Software Engineering 31, 6 (2005), 466–480.

[112] WORLD WIDE WEB CONSORTIUM. “XML Path Language (XPath)”. http://www.w3c.org/TR/xpath/.

[113] XIE, T., MARTIN,E., AND YUAN, H. Automatic Extraction of Abstract- object-state Machines from Unit-test Executions. In ICSE ’06: Proceedings of the 28th International Conference on Software Engineering (New York, NY, USA, 2006), ACM, pp. 835–838.

[114] XIE, T., AND NOTKIN, D. Mutually Enhancing Test Generation and Specifi- cation Inference. In Proceedings of the 3rd International Workshop on Formal Approaches to Testing of Software (FATES 03) (October 2003), vol. 2931 of LNCS, pp. 60–69.

[115] XIE, Y., AND ENGLER, D. Using Redundancies to Find Errors. IEEE Transac- tions on Software Engineering 29, 10 (2003), 915–928.

[116] XU, H., PICKETT, C. J. F., AND VERBRUGGE, C. Dynamic Purity Analysis for Java Programs. In Proceedings of the 7th Workshop on Program Analysis for Software Tools and Engineering (2007), ACM, pp. 75–82.

[117] YANG,J.,EVANS,D.,BHARDWAJ,D.,BHAT, T., AND DAS, M. Perracotta: Mining Temporal API Rules from Imperfect Traces. In ICSE ’06: Proceeding of the 28th International Conference on Software Engineering (New York, NY, USA, 2006), ACM Press, pp. 282–291.

[118] ZELLER, A. Yesterday, My Program Worked. Today, it does Not. Why? In Pro- ceedings of the ESEC/FSE’99, 7th European Software Engineering Conference (September 1999), vol. 1687 of Lecture Notes in Computer Science, Springer, pp. 253–267.

[119] ZELLER,A. Why Programs Fail: A Guide to Systematic Debugging. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2009.

[120] ZELLER,A., AND HILDEBRANDT, R. Simplifying and Isolating Failure- Inducing Input. IEEE Transactions on Software Engineering 28, 2 (February 2002), 183–200.

[121] ZHANG,X.,GUPTA,N., AND GUPTA, R. Locating Faults Through Auto- mated Predicate Switching. In ICSE ’06: Proceeding of the 28th International Conference on Software Engineering (New York, NY, USA, 2006), ACM Press, pp. 272–281.

[122] ZIMMERMANN, T. Fine-grained Processing of CVS Archives with APFEL. In Proceedings of the 2006 OOPSLA Workshop on Eclipse Technology eXchange (New York, NY, USA, October 2006), ACM Press.

[123] ZIMMERMANN, T., DALLMEIER, V., HALACHEV,K., AND ZELLER,A. eROSE: Guiding Programmers in Eclipse (Tool Demonstration). In Companion to the 20th Annual ACM SIGPLAN Conference on Object-Oriented Program- ming, Systems, Languages, and Applications, OOPSLA 2005 (New York, NY, USA, October 2005), ACM, pp. 186–187.

[124] ZIMMERMANN, T., PREMRAJ,R., AND ZELLER, A. Predicting Defects for Eclipse. In Proceedings of the Third International Workshop on Predictor Mod- els in Software Engineering (May 2007).

[125] ZIMMERMANN, T., AND WEISSGERBER, P. Preprocessing CVS Data for Fine- Grained Analysis. In Proceedings of International Workshop on Mining Soft- ware Repositories (MSR 2004) (Edinburgh, Scotland, UK, May 2004), pp. 2–6.