Empirically-Grounded Construction of Bug Prediction and Detection Tools
Inaugural dissertation of the Philosophisch-naturwissenschaftliche Fakultät of the Universität Bern

submitted by

Haidar Osman
from Syria

Supervisor of the thesis: Prof. Dr. Oscar Nierstrasz, Institut für Informatik

Accepted by the Philosophisch-naturwissenschaftliche Fakultät.

Bern, 11.12.2017
The Dean: Prof. Dr. Gilberto Colangelo

https://doi.org/10.24442/boristheses.850

This dissertation can be downloaded from scg.unibe.ch.

Copyright © 2017 by Haidar Osman

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Switzerland license. To see the license, go to http://creativecommons.org/licenses/by-sa/2.5/ch/

Acknowledgments

Doing a PhD at SCG has been a beautiful journey full of rewards. The doctorate title is only one of them.

First and foremost, I would like to express my deep gratitude to Oscar. From academic guidance and scientific discussions to personal counselling and after-coffee fun puzzles, Oscar has offered everything that made my PhD journey so enjoyable. And on a personal level, being away from home, I always felt safe knowing that Oscar was there. Oscar ... thank you.

I would like to thank Martin Pinzger for agreeing to serve on the PhD committee and for his valuable feedback on this thesis. I also thank Paolo Favaro for chairing the PhD defence.

A big thank you to the SCG family. I am grateful for each and every one of you. You really made SCG feel like home. I highly value our discussions, fights, papers, darts, swims, barbecues, beers, and coffees. They will always be great memories. I thank Mircea for his uplifting spirit and valuable input, Andrei for being a supportive friend, Boris for the great debates and fun, Andrea for the manly gossip, Jan for his delightfully different character, Nevena for breaking the masculine atmosphere, Claudio for the heated discussions, Manuel for the fruitful collaborations, Yuriy for the context switching, Oli for the deep technical knowledge, Mohammad for bringing a different academic style, and Leo for the fun experiments. You guys are wonderful people and I'm honoured to have served my term with you.

I am grateful to Iris for never saying no when I needed help. I am also indebted to my students: Manuel, Andi, Ensar, Jakob, Sebastien, Simon, Aliya, Cedric, and Mathias. Your hard work has contributed a lot to my PhD.

I would like to thank my friends and brothers back home for their support. Yazan, Montajab, Alaa, Sami, Mothanna, Monther, Safwan, and Rami ... thank you for always being there for me. My in-laws, Ahmad and Samira: thank you for your prayers and warm wishes.

I dedicate this work to my family ... to my parents Mohsen and Ghada, and to my sister Riham, for their unconditional love and support, for believing in me no matter what, and for setting me on the right path throughout my life ... to my kids Rima and Yosef ... You are the joy of my life ... Your faces bring light to my darkest nights ... I hope you grow up to be proud of your father one day ... and finally, to Dima ... my friend, my love, my all. A wise man once said, “Give me one constant and I can build the universe”.
You are that one constant in my life. Without you nothing is possible, nothing makes sense, and nothing matters. Dima ... this is to you.

Abstract

There is an increasing demand for high-quality software, as software bugs have an economic impact not only on software projects but also on national economies in general. Software quality is achieved via the main quality assurance activities of testing and code reviewing. However, these activities are expensive and thus need to be carried out efficiently.

Auxiliary software quality tools such as bug detection and bug prediction tools help developers focus their testing and reviewing activities on the parts of the software that are more likely to contain bugs. However, these tools are far from being adopted as mainstream development tools. Previous research points to their inability to adapt to the peculiarities of individual projects and to their high rate of false positives as the main obstacles to their adoption.

We propose empirically-grounded analysis to improve the adaptability and efficiency of bug detection and prediction tools. For a bug detector to be efficient, it needs to detect bugs that are conspicuous, frequent, and specific to a software project. We empirically show that null-related bugs fulfill these criteria and are worth building detectors for. We analyze the null dereferencing problem and find that its root cause lies in methods that return null. We propose an empirical solution to this problem that relies on the wisdom of the crowd. For each API method, we extract the nullability measure, which expresses how often the return value of this method is checked against null in the ecosystem of the API. We use nullability to annotate API methods with nullness annotations and to warn developers about missing and excessive null checks.

For a bug predictor to be efficient, it needs to be optimized as both a machine learning model and a software quality tool. We empirically show how feature selection and hyperparameter optimization improve prediction accuracy. We then optimize bug prediction to locate the maximum number of bugs in the minimum amount of code by finding the most cost-effective combination of bug prediction configurations, i.e., independent variables, machine learning model, and response variable. We show that using both source code and change metrics as independent variables, applying feature selection to them, and then using an optimized Random Forest to predict the number of bugs results in the most cost-effective bug predictor.

Throughout this thesis, we show how empirically-grounded analysis helps us build efficient bug prediction and detection tools and adapt them to the characteristics of each software project.
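As a rough illustration of the nullability measure described above (its formal definition appears in Section 5.1.2), one can read it as a per-method ratio over a method's call sites in the ecosystem. The notation below is a sketch of that reading, not the thesis's exact definition:

\[
\mathit{nullability}(m) \;\approx\; \frac{\#\{\text{call sites of } m \text{ whose return value is checked against null}\}}{\#\{\text{call sites of } m \text{ in the ecosystem}\}}
\]

Likewise, the cost-effective prediction setup summarized above (source code and change metrics as independent variables, feature selection, and a tuned Random Forest predicting the number of bugs) can be sketched in code. This is a minimal illustration using scikit-learn with synthetic placeholder data; the metric names, parameter grid, and scoring choice are assumptions made only for illustration and do not reproduce the experiments reported in Chapters 6 and 7.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic placeholder table standing in for per-class source code and
# change metrics plus observed bug counts; a real study uses mined data.
rng = np.random.default_rng(0)
features = pd.DataFrame(
    rng.normal(size=(200, 8)),
    columns=["loc", "wmc", "cbo", "dit", "revisions", "authors", "churn", "age"],
)
bug_counts = rng.poisson(lam=1.0, size=200)  # response variable: number of bugs

pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_regression)),   # filter-style feature selection
    ("rf", RandomForestRegressor(random_state=0)),      # regression over bug counts
])

# Hypothetical hyperparameter grid, included only to show the tuning step.
param_grid = {
    "select__k": [4, 6, 8],
    "rf__n_estimators": [100, 300],
    "rf__max_depth": [None, 10],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(features, bug_counts)

print(search.best_params_)                 # selected feature count and forest settings
print(search.predict(features.head()))     # predicted number of bugs per class
```

Predicting bug counts with a regressor, rather than a yes/no label with a classifier, mirrors the response variable discussed in the abstract; the actual datasets, models, and tuning procedures are those evaluated later in the thesis.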
Contents

1 Introduction
  1.1 Auxiliary Software Quality Assurance Tools
  1.2 Our Approach: Empirically-Grounded Analysis
    1.2.1 Adaptability
    1.2.2 Efficiency
    1.2.3 EGA to the Rescue
  1.3 Contributions in Detail
    1.3.1 Building an Efficient Null Dereference Detection Approach
    1.3.2 Efficient Bug Prediction
  1.4 Outline

2 State of the Art
  2.1 Detection Tools for Frequent Bug Categories
    2.1.1 Discovering Prevalent Bugs
    2.1.2 Method Invocation Bug Detection
    2.1.3 Null Dereferencing Bug Detection
  2.2 Bug Prediction Tools
    2.2.1 Software Metrics
    2.2.2 Feature Selection in Bug Prediction
    2.2.3 Prediction Models
    2.2.4 Hyperparameter Optimization
    2.2.5 Experimental Setup
    2.2.6 Cost-Aware Evaluation
  2.3 Conclusions

3 Discovering the Missing Null Check Bug Pattern
  3.1 Bug-Fix Analysis Procedure
    3.1.1 Building the Software Corpora
    3.1.2 Analyzing Source Code and Change History
    3.1.3 Manual Inspection
  3.2 Results
    3.2.1 The Types of Fixed Files
    3.2.2 Fix Size
    3.2.3 The Most Frequent Bug-Fix Patterns
  3.3 Bug-Fix Patterns
    3.3.1 Missing Null Checks
    3.3.2 Wrong Name/Value
    3.3.3 Missing Invocation
    3.3.4 Undue Invocation
    3.3.5 Other Patterns
  3.4 Implications
  3.5 Threats to Validity
  3.6 Conclusions

4 Null Usage Analysis
  4.1 Motivation
  4.2 Null Check Analysis
    4.2.1 Experimental Corpus
    4.2.2 Terminology
    4.2.3 Analysis
    4.2.4 Manual Inspection
  4.3 Results
    4.3.1 How Common Are Null Checks?
    4.3.2 What Entities Do Developers Check For Null?
    4.3.3 Where Does Null Come From?
    4.3.4 What Does Null Mean?
  4.4 Discussion
  4.5 Threats to Validity
  4.6 Conclusions

5 An Empirical Solution to the Missing Null Check Bug
  5.1 Harvesting the Wisdom of the Crowd
    5.1.1 Motivation
    5.1.2 The Nullability Measure
  5.2 Bytecode Collection and Analysis
    5.2.1 Bytecode Collection
    5.2.2 Static Analysis
    5.2.3 Validation
  5.3 Nullability Distribution
    5.3.1 Documentation
    5.3.2 Disagreement between Internal and External Usage
  5.4 Manual Inspection
  5.5 Transferring Nullability to the IDE
    5.5.1 Evaluation
  5.6 Threats to Validity
    5.6.1 Construct Validity
    5.6.2 Generalizability
  5.7 Conclusions

6 Optimizing Bug Prediction by Applying Feature Selection
  6.1 Technical Background
  6.2 Motivation
    6.2.1 Regression vs Classification
    6.2.2 Dimensionality Reduction
    6.2.3 Filters vs Wrappers
  6.3 Empirical Study
    6.3.1 Experimental Setup
    6.3.2 Results
    6.3.3 Threats to Validity
  6.4 Conclusions

7 Optimizing Bug Prediction by Tuning Hyperparameters
  7.1 Empirical Study
    7.1.1 Machine Learning Algorithms
    7.1.2 Parameter Tuning
    7.1.3 Procedure
    7.1.4 Results