EFFECTIVE ASSIGNMENT AND ASSISTANCE TO SOFTWARE DEVELOPERS AND REVIEWERS

A Dissertation by

Motahareh Bahrami Zanjani

Master of Science, Islamic Azad University, 2010

Submitted to the Department of Electrical Engineering and Computer Science and the faculty of the Graduate School of Wichita State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

May 2017

© Copyright 2017 by Motahareh Bahrami Zanjani

All Rights Reserved

EFFECTIVE ASSIGNMENT AND ASSISTANCE TO SOFTWARE DEVELOPERS AND REVIEWERS

The following faculty members have examined the final copy of this dissertation for form and content, and recommend that it be accepted in partial fulfillment of the requirements for the degree of Doctor of Philosophy with a major in Electrical Engineering and Computer Science.

Huzefa Kagdi, Committee Chair

Christian Bird, Committee Member

Vinod Namboodiri, Committee Member

Ehsan Salari, Committee Member

Sergio Salinas Monroy, Committee Member

Kaushik Sinha, Committee Member

Accepted for the College of Engineering

Royce Bowden, Dean

Accepted for the Graduate School

Dennis Livesay, Dean

DEDICATION

To my parents and my sister.

ACKNOWLEDGEMENTS

First of all, I would like to express my gratitude to my supervisor and dissertation committee chair, Dr. Huzefa Kagdi, whose expertise added considerably to my graduate experience. His knowledge and expertise in the area of software evolution are admirable. Without his guidance, this dissertation would not have been possible. I would also like to thank Dr. Christian Bird from the Empirical Software Engineering (ESE) group at Microsoft Research. He provided ample guidance and suggestions related to this dissertation work, and his extensive knowledge in this research area pointed me in the right direction. I would also like to thank the other dissertation committee members, who put in their valuable time to review my dissertation and make it a successful one. My sincere gratitude goes to the faculty members I came across during my graduate program at Wichita State University. Last but not least, the extraordinary support I received from my parents, family, and friends has been the most important part of my journey throughout my graduate studies. I extend my special gratitude to them for supporting me every step of the way.

ABSTRACT

The conducted research is within the realm of software maintenance and evolution. Reliance on humans is ubiquitous and dominant in sustaining a high-quality, large software system. Automatically assigning the right solution providers to the maintenance task at hand is arguably as important as providing the right tool support for it, especially in the far too commonly found state of inadequate or obsolete documentation of large-scale software systems. Several maintenance tasks related to the assignment of, and assistance to, software developers and reviewers are addressed, and multiple solutions are presented. The key insight behind the presented solutions is the analysis and use of micro-levels of human-to-code and human-to-human interactions. The formulated methodology consists of the restrained use of machine learning techniques, lightweight source code analysis, and mathematical quantification of different markers of developer and reviewer expertise from these micro interactions.

In this dissertation, we first present automated solutions for Software Change Impact Analysis based on the interaction and code review activities of developers. Then we describe two separate developer expertise models, which use the micro-levels of human-to-code and human-to-human interactions from the previous code review and interaction activities of developers. Next, we present a reviewer expertise model based on the code review activities of developers and show how this expertise model can be used for Code Reviewer Recommendation. Finally, we examine the influential features that characterize the acceptance probability of a submitted patch (implemented code change) by developers. We present a predictive model that classifies whether a patch will be accepted or not as soon as it is submitted for code review, in order to assist developers and reviewers in prioritizing and focusing their efforts.

A rigorous empirical validation on large open source and commercial systems shows that the solutions based on the presented methodology outperform several existing solutions. The quantitative gains of our solutions across a spectrum of evaluation metrics along with their statistical significance are reported.

TABLE OF CONTENTS

Chapter Page

1 Introduction ...... 1

1.1 Definitions ...... 2
1.2 Conducted Research and Projected Contributions ...... 3
1.3 Overview of The Presented Approaches ...... 4
1.4 Dissertation Findings ...... 6
1.5 Dissertation Organization ...... 7

2 Background and Related Research ...... 8

2.1 Software Maintenance Background Contextualizing the Presented Work ...... 8
2.2 Micro-Evolution Repositories ...... 10
2.2.1 Interaction Histories from IDEs ...... 11
2.2.2 Code Review Histories ...... 11
2.3 Macro-Evolution Repositories ...... 14
2.4 Related Work ...... 15
2.4.1 Interaction History ...... 15
2.4.2 Code Review ...... 16
2.4.3 Software Change Impact Analysis ...... 18
2.4.4 Developer Recommendation ...... 20
2.4.5 Reviewer Recommendation ...... 22
2.4.6 Predicting the Outcome of Code Review ...... 23

3 Change Impact Analysis (InComIA) ...... 24

3.1 Introduction ...... 24
3.2 The InComIA Approach ...... 26
3.2.1 Interactions and Commits for Change Request Resolution ...... 27
3.2.2 Extracting Interacted Entities ...... 29
3.2.3 Extracting Committed Entities ...... 30
3.2.4 Obtaining Source Code of Entities from Interacted and Committed Revisions ...... 31
3.2.5 Creating a Corpus from Source Code and Textual Descriptions ...... 32
3.2.6 Indexing and Querying the Corpus ...... 34
3.2.7 An Example from Mylyn ...... 36
3.3 Empirical Evaluation ...... 37
3.3.1 Research Questions ...... 38
3.3.2 Experiment Setup ...... 39
3.3.3 Subject Software System ...... 39
3.3.4 Dataset ...... 39
3.3.5 Training and Testing Sets ...... 40
3.3.6 Performance Metrics ...... 40
3.3.7 Hypotheses Testing ...... 41
3.3.8 Case Study Results ...... 42
3.4 Threats to Validity ...... 45

TABLE OF CONTENTS (continued)

Chapter Page

4 Change Impact Analysis (RevIA)...... 47

4.1 Introduction ...... 47
4.2 The RevIA Approach ...... 49
4.2.1 Extracting Code Review Comments and Creating a Corpus ...... 50
4.2.2 Re-Ranking the Recommended Source Code Files with Issue Change Proneness ...... 51
4.3 Empirical Evaluation ...... 52
4.3.1 Research Questions ...... 52
4.3.2 Experiment Setup ...... 52
4.3.3 Subject Software System ...... 53
4.3.4 Training and Testing Sets ...... 53
4.3.5 Performance Metrics ...... 53
4.3.6 Hypotheses Testing ...... 54
4.3.7 Case Study Results ...... 54
4.4 Threats to Validity ...... 57

5 Developer Recommendation (iHDev)...... 58

5.1 Introduction ...... 58
5.2 Approach ...... 61
5.2.1 Key Terms and Definition ...... 61
5.2.2 Locating Relevant Entities to Change Request ...... 63
5.2.3 Mining Interaction Histories to Recommend Developers ...... 64
5.2.4 An Example from Mylyn ...... 70
5.3 Case Study ...... 71
5.3.1 Compared Approaches: xFinder, xFinder0, and iMacPro ...... 72
5.3.2 Subject Software Systems ...... 72
5.3.3 Benchmarks: Training and Testing Datasets ...... 74
5.3.4 Metrics and Statistical Analyses ...... 75
5.3.5 Results ...... 77
5.3.6 Discussion ...... 80
5.4 Threats to Validity ...... 81

6 Developer Recommendation (rDevX) ...... 84

6.1 Introduction ...... 84
6.2 The Developer Expertise Model ...... 86
6.2.1 Why Code Reviews to Build a New Model for Developer Expertise? ...... 87
6.2.2 Markers and Expertise Model ...... 91
6.3 Application for Recommending Appropriate Developers in Change Request Triaging ...... 94
6.3.1 Locating Relevant Entities to Change Request ...... 95
6.3.2 Recommending developers based on SoE scores ...... 97
6.3.3 An Example from Mylyn ...... 98

TABLE OF CONTENTS (continued)

Chapter Page

6.4 Case Study ...... 100
6.4.1 xFinder ...... 101
6.4.2 DevCom ...... 102
6.4.3 Benchmarks: Bugs, Commits, and Reviews ...... 103
6.4.4 Metrics and Hypotheses ...... 105
6.4.5 Results ...... 107
6.4.6 Qualitative Analysis ...... 110
6.5 Threats to Validity ...... 111

7 Code Review Recommendation (cHRev)...... 113

7.1 Introduction ...... 113
7.2 Background on Modern Code Review ...... 116
7.3 The cHRev Approach ...... 117
7.3.1 Formulating Reviewer Expertise Model ...... 118
7.3.2 Scoring and Recommending Reviewers ...... 120
7.3.3 Implementation of cHRev ...... 122
7.3.4 A Motivating Example from Mylyn ...... 122
7.4 Case Study ...... 124
7.4.1 Design ...... 124
7.4.2 Compared Approaches: REVFINDER, xFinder and RevCom ...... 125
7.4.3 Subject Systems and Evaluation Datasets ...... 126
7.4.4 Evaluation Protocol for cHRev ...... 128
7.4.5 Accuracy Metrics and Hypothesis Testing ...... 129
7.4.6 Results ...... 131
7.4.7 Discussion ...... 136
7.5 Threats to Validity ...... 139
7.6 Related Work ...... 140
7.6.1 Developer Recommendation ...... 141

8 Patch Acceptance Predictor ...... 143

8.1 Introduction ...... 144
8.2 Background on Modern Code Review (MCR) and Subject Systems ...... 146
8.2.1 Definitions of Key Terms ...... 146
8.2.2 Patch Lifecycle in Gerrit ...... 147
8.2.3 Merged and Abandoned Patches ...... 149
8.2.4 Subject Software Systems ...... 149
8.3 Features Used in the Study ...... 151
8.3.1 Bug Features ...... 151
8.3.2 Patch Features ...... 155
8.3.3 Human Features ...... 159
8.4 Patch Acceptance Descriptive Model ...... 164
8.4.1 Logistic Regression ...... 165

TABLE OF CONTENTS (continued)

Chapter Page

8.4.2 Data Extraction ...... 165
8.4.3 Results ...... 167
8.5 Patch Acceptance Predictive Model ...... 175
8.5.1 Support Vector Machines ...... 175
8.5.2 Predictive Features Used ...... 176
8.5.3 Training and Test Sets ...... 176
8.5.4 Evaluation Metrics ...... 176
8.6 Predictive Results ...... 177
8.7 Threats to Validity ...... 180

9 Conclusions and Future Work ...... 182

9.1 Contribution and Findings ...... 182
9.1.1 InComIA, an Approach for Change Impact Analysis ...... 182
9.1.2 RevIA, an Approach for Change Impact Analysis ...... 183
9.1.3 iHDev, an Approach for Developer Recommendation ...... 183
9.1.4 rDevX, the Developer Expertise Model ...... 184
9.1.5 cHRev, the Code Reviewer Recommendation Model ...... 184
9.1.6 Patch Acceptance Predictor ...... 185
9.2 Opportunities for Future Research ...... 185
9.2.1 Combining Static and Dynamic Analysis with Textual Information ...... 185
9.2.2 A Developer Recommendation Approach Considering the Workload Balancing ...... 186
9.2.3 Considering the Contribution of Developers of Similar Change Requests ...... 186
9.2.4 Developer and Reviewer Expertise Markers from Code Review Contents ...... 186
9.2.5 Revisiting Code Reviewer Recommendation ...... 186
9.2.6 Additional Empirical Studies Considering Different MCR Tools ...... 187
9.2.7 Revisiting the Patch Acceptance Predictor ...... 187

REFERENCES ...... 188

LIST OF TABLES

Table Page

3.1 Predicted files for bug# 201151 by three different approaches ...... 36

3.2 Different comments and expressions extracted from file AbstractRepositorySettingsPage.java in different revision numbers and associated commit messages ...... 37

3.3 Different comments and expressions extracted from file BugzillaRepositorySettingsPage.java in different revision numbers and associated commit messages ...... 37

3.4 Mylyn project interaction and commit histories from June 18, 2007 to July 01, 2011 ...... 40

3.5 A heat-map summarizing hypotheses test results across all the three models for two different values of k...... 42

3.6 Recall@10 and 20 and Precision@10 and 20 of three models InComIA, ComIA, and SIA ...... 44

5.1 The attachers extracted with iHDev from each of the top ten files relevant to Bug# 313712...... 68

5.2 Top five attachers (developers) recommended to resolve bug #313712 by iHDev. . . . . 70

5.3 Top five recommended developers and their associated ranks for the compared approaches: iHDev, xFinder, xFinder0, and iMacPro ...... 71

5.4 The frequency distributions of developers resolving issues in the benchmarks for Mylyn and Eclipse Platform...... 75

5.5 Average of recall @1, 2, 3 and 5 values of the approaches iHDev, xFinder, xFinder0, and iMacPro measured on the Mylyn and Eclipse Platform benchmarks...... 78

5.6 Mean Reciprocal Rank of the approaches iHDev, xFinder, xFinder0, and iMacPro measured on the Mylyn and Eclipse Platform benchmarks...... 80

6.1 The developers extracted with rDevX from each of the top ten files relevant to Bug# 428544...... 99

6.2 Top five developers recommended to resolve bug #428544 by rDevX ...... 99

6.3 Top five recommended developers and their associated ranks for the compared approaches ...... 100

LIST OF TABLES (continued)

Table Page

6.4 The frequency distributions of developers and summary statistics (IQR, unique developers) resolving issues in the benchmarks for Mylyn, Eclipse Platform and OpenStack Nova...... 105

6.5 Descriptive statistics of the review and commit history of the three open source subject systems ...... 105

6.6 Average, gains and p-values of recall @1, 2, 3 and 5 values of the approaches...... 109

6.7 Mean Reciprocal Rank of the approaches rDevX , xFinder, and DevCom measured on the benchmarks...... 110

6.8 p-values from applying one way ANOVA on MRR values for each subject system. . . 110

7.1 The reviewers extracted with cHRev from each of the files related to code change in the review # 33689...... 122

7.2 Top four reviewers recommended to review the review #33689 with their associated ranks and score by cHRev, REVFINDER, xFinder, and RevCom...... 124

7.3 Evaluation benchmarks and the distribution of reviewers per review (code change). . .131

7.4 Average of precision, recall, and F-score @1, 2, 3 and 5 values of the approaches cHRev, REVFINDER, xFinder, and RevCom measured on the benchmarks...... 132

7.5 Average of precision, recall, and F-score gains @1, 2, 3 and 5 values of the approaches cHRev, REVFINDER, xFinder, and RevCom measured on the benchmarks...... 132

7.6 Mean Reciprocal Rank of the approaches cHRev, REVFINDER, xFinder, and RevCom measured on the benchmarks...... 133

7.7 p-values from applying one way ANOVA on Precision@m and Recall@m values for each subject system...... 133

7.8 p-values from applying one way ANOVA on MRR values for each subject system. . . 133

8.1 Descriptive statistics of the reviews considered from three open source projects in our Study...... 150

LIST OF TABLES (continued)

Table Page

8.2 Descriptive statistics of the feature components considered from Mylyn and Eclipse Platform...... 152

8.3 Explanation of bug severity values ...... 153

8.4 Percentage distribution of merged and abandoned reviews based on owner bug assignee match feature in Eclipse Platform and Mylyn ...... 155

8.6 P-values from statistical significance test for features used in the descriptive model with Logistic Regression...... 168

8.7 Performance of Logistic Regression (LR)...... 178

8.8 Performance of Support Vector Machine (SVM)...... 179

8.9 Performance of Dummy Predictor and Zero model ...... 179

LIST OF FIGURES

Figure Page

2.1 A Schematic diagram of presented approaches ...... 8

2.2 A snippet of an interaction event recorded by Mylyn for bug issue #330695 with trace ID #221752. File TaskListView.java is edited...... 12

2.3 An example from Gerrit code review# 74981 related to Eclipse Platform project . . . . . 13

2.4 The patch life cycle in Gerrit ...... 14

3.1 A snippet of 4 interaction events (labelled 1-4) recorded by Mylyn for bug issue #175229 with trace ID #71687...... 28

3.2 Bug ID #175229 from Eclipse bug tracking and commit history...... 29

3.3 A Schematic diagram of InComIA...... 31

3.4 A snippet of the file TaskHistoryTest.java ...... 33

4.1 A snippet of the code review comment in Gerrit which is written by Sam Davis for file TaskReviewRelationshipListener.java related to Mylyn project ...... 48

4.2 Recall, Precision, Mean Average Precision, and the calculated metric gains of two models RevIA and SIA with and without re-ranking mechanisms for K=10 and K=20. . 56

5.1 A snippet of an interaction event recorded by Mylyn for bug issue #330695 with trace ID #221752. File TaskListView.java is edited...... 63

5.2 A snippet of the bug #315184 interaction log entry from its interaction log file...... 65

8.1 An example from Gerrit code review# 74981 related to Eclipse Platform project . . . . 147

8.2 The patch life cycle in Gerrit ...... 148

8.3 Bug severity frequency distribution for merged and abandoned reviews in Eclipse Platform and Mylyn ...... 153

8.4 Bug priority frequency distribution for merged and abandoned reviews in Eclipse Platform and Mylyn ...... 154


LIST OF FIGURES (continued)

Figure Page

8.5 Patch size group frequency distribution for merged and abandoned reviews in Android Platform, Eclipse Platform, and Mylyn ...... 156

8.6 File size Box Plot for merged and abandoned patches in Android Platform, Eclipse Platform, and Mylyn ...... 157

8.7 Number of patch revisions Box Plot for merged and abandoned patches in Android Platform, Eclipse Platform, and Mylyn ...... 157

8.8 Entropy score Box Plot for merged and abandoned patches belonging to Android Platform, Eclipse Platform, and Mylyn...... 159

8.9 (a) Owner reputation Box Plot for merged and abandoned patches belonging to Android Platform, Eclipse Platform, and Mylyn. (b) Percentage distribution of merged reviews based on owner reputation ...... 161

8.10 Percentage distribution of merged reviews based on number of assigned/contributing reviewers ...... 162

8.11 A Box Plot for the number of assigned/contributing reviewers for merged and abandoned patches ...... 162

8.12 Percentage distribution of merged reviews based on assigned/contributing reviewers reputation ...... 163

8.13 A Box Plot for the assigned/contributing reviewers reputation for merged and abandoned patches ...... 164

CHAPTER 1

Introduction

Software products are constantly growing in terms of size, complexity, and application domains, among other things. It is not uncommon in large open source projects to receive several bug reports and new feature requests daily [9]. These change requests need to be triaged and resolved in an efficient and effective manner to sustain (or even retain) the viability of the product in the marketplace, which is not a trivial ask by any means. The development units, i.e., individuals and teams, need to perform several tasks, such as validating the change requests, assigning them to the developer(s), implementing the necessary changes to the source code, reviewing the code changes, and then assembling them into a (new) release for the user base.

Software engineering is very much human driven, and it inherits both human ingenuity and human error. Predominantly, humans (with tool support) need to manage and perform these activities in a distributed software change management setting. The quality and velocity of maintenance and evolution tasks (e.g., bug fixes or feature implementations) are in many ways a direct reflection of the individuals or teams who perform them. Determining the right solution providers for the task at hand is arguably as important as suggesting the right tool support for it, especially in the far too commonly found state of inadequate or obsolete documentation of large-scale software systems.

Not surprisingly, considerable effort has been devoted in software engineering research to effectively support the tasks related to the humans involved in developing and evolving large-scale software systems. Among these key tasks are Developer Recommendation (DR), Software Change Impact Analysis (IA), Reviewer Recommendation (CR), and the Patch Acceptance Predictor (PAR).

DR is the task of assigning the most appropriate developers to resolve an incoming change request [9, 53, 65, 94, 107]. IA is the task of estimating the potential impact on source code (and beyond) due to a proposed change [11, 12]. CR is the task of assigning the most appropriate reviewers to review a source code change (e.g., a patch) [14]. PAR is the task of determining whether a submitted patch will be accepted or not.

Over the years, a number of approaches have been proposed in the literature to automate the IA and DR tasks [9, 11, 78, 29, 62, 75, 80]. Recently, there have been a few efforts to address the CR and PAR tasks [96, 98, 14, 17, 48]. Although these investigations show ample progress, the resulting solutions are suboptimal, e.g., in terms of accuracy and coverage, thereby limiting or questioning their practical ubiquity in the workflow. We conjecture that the past solutions are severely hampered by their principal reliance on a limited source of analysis. These solutions are based on analyzing a single snapshot of source code and/or its associated bug or change repositories. Although these sources capture several important aspects, we posit that they are limited in scope and size. They only capture the end points, i.e., macro events, of software maintenance or evolution tasks (e.g., accepted code changes for bug fixes or feature requests), and do not necessarily capture the means, i.e., micro events, used to achieve them.

1.1 Definitions

Software products are constantly growing in terms of size, complexity, and application domains. In this era of software governance, software products are seldom built from scratch, but are continuously evolved. In the life cycle of a software system, maintenance is the final, longest-lasting, and perhaps the most expensive stage; more than 50% of all maintenance costs arise from changing software [63]. All the tasks related to post-delivery changes, such as change control and bug fixing, are handled in the maintenance phase. Therefore, every development or analysis step that can be automated can save a lot of time and money. In this dissertation, we investigate several maintenance tasks and present automated solutions for them. Below, we explain each of these maintenance tasks in more detail.

Software change impact analysis (IA) has been studied for many years, dating back to the 1970s. Bohner and Arnold [11, 12] investigated the foundations of impact analysis, and defined the term impact analysis as “identifying the potential consequences of a change, or estimating what needs to be modified to accomplish a change [11]”. An automated IA tool recommends to the developer a set of source code entities relevant to the incoming change request.

Developer Recommendation (DR), or bug triaging, is a crucial activity in addressing change requests in an effective manner (e.g., within time, priority, and quality factors). It is not uncommon in large open source projects to receive several bug reports and new feature requests daily [9]. These change requests need to be effectively triaged. The task of automatically assigning issues or change requests to the developer(s) who are most likely to resolve them is called Developer Recommendation [9, 8].

Code Reviewer Recommendation (CR): Code review is an important part of the software development process. Recently, many open source projects have begun practicing code review through “modern” tools such as GitHub pull requests and Gerrit. These tools enable the owner of a source code change to request individuals, i.e., reviewers, to participate in the review. However, this task comes with a challenge: prior work has shown that the benefits of code review depend upon the expertise of the reviewers involved [13]. Thus, a common problem faced by authors of source code changes is identifying the best reviewers for their source code change. The task of automatically recommending the reviewers who are best suited to participate in a given review is called Code Reviewer Recommendation [14].

Patch Acceptance Predictor (PAR): Large-scale open source projects typically receive numerous patches to address change requests (e.g., bug fixes). It is not uncommon for a submitted patch to undergo a peer-review process before it can be accepted as a valid solution and merged into the code base. A peer code review process has been shown to have an impact on software quality. Determining whether a submitted patch will be accepted or not could improve the efficiency of the peer-review process, e.g., help developers and reviewers prioritize and focus their efforts. The Patch Acceptance Predictor is a predictive model that classifies whether a patch will be accepted or not as soon as it is submitted.

1.2 Conducted Research and Projected Contributions

In this work, we present automated solutions for the DR, IA, and CR tasks that are centered on micro events. Similarly, we present a predictive model that classifies whether a patch will be accepted or not as soon as it is submitted. Specifically, we root our analysis and solutions in two types of micro-event sources: 1) interaction archives, which store the fine-granular events capturing developer interactions within an Integrated Development Environment (IDE), e.g., with Mylyn, and 2) code review archives, which capture the complete code change review lifecycle, including the associated discussion and discourse. The information in micro-event archives subsumes or complements that in macro-event archives.

Our methodology consists of the restrained use of machine learning techniques, lightweight source code analysis, and mathematical quantification of different markers of developer and reviewer expertise. Based on this methodology, we instantiated specific approaches for the investigated tasks: 1) InComIA and RevIA for IA, based on interaction and code review archives [112], 2) iHDev and rDevX for DR, based on interaction and code review archives respectively [111], 3) cHRev for CR, based on code review archives [110], and 4) a predictive model for PAR, based on code review archives. Our preliminary implementation and empirical evaluation show that our presented approaches outperform the existing ones in terms of a number of accuracy metrics on open source and commercial systems.

The main contributions of our work include: the first IA solutions to integrate developer interaction and code review activity with the textual information from the source code; the first DR solutions based on interaction and code review archives, along with their empirical evaluation; the only empirical evaluation of a CR solution in the commercial domain; and, finally, the prediction, as soon as a patch is submitted for Modern Code Review, of whether it will eventually be accepted or not, which had not been investigated previously.

1.3 Overview of The Presented Approaches

In this section, we briefly provide an overview of each approach presented in our conducted research.

Change Impact Analysis (InComIA): This is an approach to perform impact analysis (IA) of an incoming change request on source code. The approach is based on a combination of interaction (e.g., Mylyn) and commit (e.g., CVS) histories. The source code entities (i.e., files and methods) that were interacted with or changed in the resolution of past change requests (e.g., bug fixes) were used. Information retrieval, machine learning, and lightweight source code analysis techniques were employed to form a corpus from these source code entities. Additionally, the corpus was augmented with the textual descriptions of the previously resolved change requests and their associated commit messages. Given a textual description of a change request, this corpus is queried to obtain a ranked list of relevant source code entities that are most likely to be change prone.
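As a simple illustration of the querying step, the sketch below ranks corpus documents against a change request with a TF-IDF vector space model and cosine similarity. It is a minimal sketch only: the file names and document texts are hypothetical, and InComIA's actual indexing, weighting, and classifier components (described in Chapter 3) may differ.

```python
# Minimal sketch of querying an InComIA-style corpus, assuming a plain TF-IDF
# vector space model; the actual approach may use different indexing and weighting.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Each document mixes an entity's identifiers/comments with the descriptions of
# past change requests and commit messages that touched it (hypothetical data).
corpus = {
    "TaskListView.java": "task list view refresh synchronize bugzilla query label",
    "BugzillaRepositorySettingsPage.java": "repository settings page validate credentials url",
}

def rank_entities(change_request, corpus, top_k=10):
    names = list(corpus)
    vectorizer = TfidfVectorizer(stop_words="english")
    doc_vectors = vectorizer.fit_transform(corpus[name] for name in names)
    query_vector = vectorizer.transform([change_request])
    scores = cosine_similarity(query_vector, doc_vectors).ravel()
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)[:top_k]

print(rank_entities("task list does not refresh after a repository query", corpus))
```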

Change Impact Analysis (RevIA): Similar to the previous approach, this is an approach to perform impact analysis (IA) of an incoming change request on source code. This approach uses the code review comments provided by developers in the code review phase as additional textual information to form the corpus. Information retrieval, machine learning, and lightweight source code analysis techniques were employed to form a corpus from the source code entities that were previously changed in the resolution of past change requests. Additionally, the corpus was augmented with the review comments written by developers during the review phase of previously submitted patches. These patches are the result of the changes implemented in the source code to resolve the change requests. Given a textual description of a change request, this corpus is queried to obtain a ranked list of relevant source code entities that are most likely to be change prone.

Developer Recommendation (iHDev): This is an approach to recommend developers who are most likely to implement incoming change requests. The basic premise of iHDev is that the developers who interacted with the source code relevant to a given change request are most likely to best assist with its resolution. A machine-learning technique is first used to locate source code entities relevant to the textual description of a given change request. iHDev then mines interaction trails (i.e., Mylyn sessions) associated with these source code entities to recommend a ranked list of developers.
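The ranking step of this idea can be sketched as follows. The interaction data and the simple count-based scoring are illustrative assumptions; iHDev's actual mining and weighting of Mylyn sessions are detailed in Chapter 5.

```python
# Hypothetical sketch: tally how often each developer interacted with the files
# deemed relevant to a change request and rank developers by that tally.
# iHDev's actual scoring (e.g., weighting by event type or recency) may differ.
from collections import Counter

# (developer, file) pairs extracted from past interaction traces -- illustrative data.
interaction_log = [
    ("dev_a", "TaskListView.java"),
    ("dev_a", "TaskListView.java"),
    ("dev_b", "TaskListView.java"),
    ("dev_c", "RepositoryQuery.java"),
]

def recommend_developers(relevant_files, interaction_log, top_n=5):
    scores = Counter()
    for developer, path in interaction_log:
        if path in relevant_files:
            scores[developer] += 1
    return scores.most_common(top_n)

print(recommend_developers({"TaskListView.java"}, interaction_log))
```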

Developer Recommendation (rDevX): This is an approach that uses developer expertise markers from the previous code review activities of developers. We analyzed code reviews that are managed by “modern” tools, such as Gerrit. We found eight markers of developer expertise associated with the source code changes and their acceptance, their timeline, and the human roles and feedback involved in the reviews. We formed a developer expertise model from these markers and showed its application in bug triaging. Specifically, we derived a developer recommendation approach for an incoming change request (e.g., to fix a bug), named rDevX, from this expertise model.
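Conceptually, the expertise model reduces to combining per-developer marker values into a single Score of Expertise (SoE). The sketch below shows only that combination step; the marker names, their normalization, and the equal weights are illustrative assumptions, and the actual eight markers and their quantification are defined in Chapter 6.

```python
# Hypothetical sketch of combining expertise markers into a Score of Expertise (SoE)
# per developer; marker names and equal weights are illustrative placeholders.
ILLUSTRATIVE_MARKERS = ["changes_authored", "changes_accepted",
                        "reviews_participated", "recent_review_activity"]

def soe_score(marker_values, weights=None):
    """Weighted sum of normalized marker values (all markers assumed in [0, 1])."""
    weights = weights or {m: 1.0 for m in ILLUSTRATIVE_MARKERS}
    return sum(weights[m] * marker_values.get(m, 0.0) for m in ILLUSTRATIVE_MARKERS)

developers = {
    "dev_a": {"changes_authored": 0.8, "changes_accepted": 0.7, "recent_review_activity": 0.2},
    "dev_b": {"changes_authored": 0.3, "reviews_participated": 0.9, "recent_review_activity": 0.6},
}
ranking = sorted(developers, key=lambda d: soe_score(developers[d]), reverse=True)
print(ranking)  # developers ordered by their illustrative SoE scores
```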

Code Review Recommendation (cHRev): This is an approach to automatically recommend reviewers who are best suited to participate in a given review, based on their historical contributions as demonstrated in their prior reviews. It utilizes the past code changes and their reviewers to form a quantifiable model of the expertise of each reviewer in each source code file. This expertise model is then used to rank reviewers based on their expertise score. Finally, a ranked list of top n reviewers is recommended.
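A minimal sketch of this scoring idea appears below: each candidate reviewer is scored on the files touched by the new change, combining how many review comments they previously left on those files with how recently they left them. The data, the recency decay, and the exact combination are illustrative assumptions; cHRev's actual expertise formulation is given in Chapter 7.

```python
# Hypothetical sketch of a cHRev-style ranking step: weight a reviewer's past
# comment counts on the changed files by recency and rank by the total score.
from datetime import date

# (reviewer, file, comment_count, review_date) tuples -- illustrative data.
past_reviews = [
    ("rev_a", "TaskEditor.java", 5, date(2014, 6, 1)),
    ("rev_b", "TaskEditor.java", 2, date(2014, 11, 20)),
    ("rev_a", "TaskListView.java", 1, date(2013, 3, 2)),
]

def recommend_reviewers(changed_files, past_reviews, today=date(2015, 1, 1), top_n=3):
    scores = {}
    for reviewer, path, comments, when in past_reviews:
        if path not in changed_files:
            continue
        recency = 1.0 / (1.0 + (today - when).days / 365.0)  # newer reviews weigh more
        scores[reviewer] = scores.get(reviewer, 0.0) + comments * recency
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:top_n]

print(recommend_reviewers({"TaskEditor.java"}, past_reviews))
```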

Patch Acceptance Predictor: This is a predictive model that classifies whether a patch will be accepted or not as soon as it is submitted. Logistic regression and a support vector machine were employed to form the patch predictor. We examined bug, patch, developer, and reviewer features that characterize whether patches get accepted, and used these features to build the classifier.
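The classifier can be set up as in the sketch below, which trains a logistic regression model on a handful of the feature types named above (patch size, number of patch revisions, owner reputation, number of reviewers). The feature values and labels are fabricated for illustration only, and swapping sklearn.svm.SVC in place of LogisticRegression gives the SVM variant; the actual features, data extraction, and evaluation are described in Chapter 8.

```python
# Hypothetical sketch of the patch acceptance classifier using scikit-learn;
# the rows below are made-up examples, not data from the studied projects.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# columns: patch_size, num_revisions, owner_reputation, num_reviewers
X = np.array([[ 10, 1, 0.9, 2],
              [450, 6, 0.2, 1],
              [ 30, 2, 0.7, 3],
              [900, 9, 0.1, 1]])
y = np.array([1, 0, 1, 0])  # 1 = merged, 0 = abandoned

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

new_patch = np.array([[120, 3, 0.6, 2]])
print(model.predict_proba(new_patch)[0, 1])  # estimated probability of acceptance
```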

1.4 Dissertation Findings

Our findings show that:

1. An approach based on a combination of interaction (e.g., Mylyn) and commit (e.g., CVS) histories outperforms traditional approaches to impact analysis (IA) of an incoming change request on source code. (InComIA, Chapter 3)

2. Code review comments provide textual information related to both the source code and the bug itself. This textual information improves the performance of IA approaches. (RevIA, Chapter 4)

3. Developers who interacted with the source code relevant to a given change request are most likely to best assist with its resolution. (iHDev, Chapter 5)

4. The historical code reviews capture many unique aspects that can be useful markers of developer expertise. (Chapter 6)

5. A developer expertise model based on the previous code review activities of developers typically recommends the correct developers at higher ranks than the competing approaches. (rDevX, Chapter 6)

6. Reviewers who reviewed the units of source code in the past are most likely to best assist with reviewing them in the future. (cHRev, Chapter 7)

7. Leveraging the specific information in previously completed reviews (i.e., quantification of review comments and their recency) can dramatically improve the performance of a reviewer recommendation approach in comparison with prior approaches, which (limitedly) operate on generic review information (i.e., reviewers of similar source code file and path names) or source code repository data. (cHRev, Chapter 7)

8. The most influential features that characterize the acceptance probability of a submitted patch by developers are: Patch Size, Number of Patch Revisions, Patch Owner's Reputation, Number of Assigned Reviewers, Number of Contributing Reviewers, and Contributing Reviewer's Reputation. (Chapter 8)

1.5 Dissertation Organization

The remainder of this dissertation is organized as follows. Chapter 2 provides background information and presents research related to our studies. Chapters 3 and 4 present the InComIA and RevIA approaches for Change Impact Analysis, with results from separate empirical studies. Chapters 5 and 6 present the iHDev and rDevX approaches for Developer Recommendation, with results from several empirical studies on open source and commercial software projects. Chapter 7 presents cHRev, our approach to select expert reviewers, with results from empirical studies. Chapter 8 presents a classifier that predicts the outcome of code review. Finally, Chapter 9 draws conclusions, summarizes the main contributions of this dissertation, and outlines future work.

CHAPTER 2

Background and Related Research

In this chapter, we first explain the software maintenance background contextualizing the presented work and then provide the definitions of micro- and macro-evolution repositories. We explain the interaction activity of developers and describe the Modern Code Review (MCR) process. Similarly, a survey of related research on software maintenance tasks is provided in this chapter. More specifically, we describe how the related work motivates our presented approaches.

2.1 Software Maintenance Background Contextualizing the Presented Work

In the life cycle of a software system, maintenance is the final and perhaps the most expensive stage. All tasks related to post-delivery changes, such as change control and bug fixing, are handled in the maintenance phase.

(Schematic overview: the diagram comprises five blocks, 1. Impact Analysis, 2. Creating the Patch, 3. Developer Recommendation, 4. Reviewer Recommendation, and 5. Review Process, connected to the bug, commit, source code, and code review repositories.)

Figure 2.1: A Schematic diagram of presented approaches

The change requests and bugs reported in a bug tracking tool are typically specified in natural language (e.g., English), and they mark the beginning of the maintenance phase. The bug reports include issues identified and submitted by programmers or end users during the post-delivery maintenance of a product, and the change requests include enhancements to the original product. In either case, developers need to identify the potential source code entities that are change prone due to a given change request. This task is called Impact Analysis (IA). An automated tool is needed to help developers identify the related source code entities and simplify the task of fixing an issue or implementing a new change request. Previously investigated IA approaches use the textual similarity between the bug description and the textual information related to source code entities from macro-evolution repositories. In our conducted research, we investigate the task of IA by using micro-evolution repositories. Figure 2.1 shows a schematic diagram of the presented approaches; block number 1 corresponds to the IA task. InComIA in Chapter 3 and RevIA in Chapter 4 are our presented approaches for IA.

IA recommends to the developer a set of source code entities relevant to the incoming change request. Developers then need to know which of their teammates have the most expertise on those relevant source code entities and hence are most likely to best assist with its resolution. The task of automatically assigning issues or change requests to the developer(s) who are most likely to resolve them is called bug triaging or developer recommendation (DR). In this dissertation, we use techniques that compute an expertise score of each developer for each source code entity relevant to the incoming change request. These techniques use information from micro-evolution repositories. Block number 3 in Figure 2.1 corresponds to the DR task. The two approaches iHDev and rDevX, in Chapters 5 and 6, are our presented automated solutions for DR.

So far, we have identified the parts of the source code that need to be changed and the developers who have the most expertise to implement the change. Once the identified developer addresses the incoming change request, she submits the patch (the set of modified source code files to fix a bug or add a new feature) to the code review repository (blocks 2 and 5 in Figure 2.1). Reviewers then inspect the change through the code review tool (e.g., Gerrit) and provide feedback to the developers in the form of review comments. The developer may modify the source code based on the review comments received and submit the updated patch for review. Once the submitted patch meets the desired code quality, a reviewer "signs off" on the review for it to be checked into the code repository (block 5 in Figure 2.1). It is not always easy to determine who has the most expertise for a particular change under review, especially for newcomers to a codebase or for those changing parts of the code with shared ownership by many people. The task of automatically assigning to a code change the reviewers who are most likely to review it is called code reviewer recommendation (CR) (block 4 in Figure 2.1). We present the approach cHRev in Chapter 7, which uses the previous review activity of developers from micro-evolution repositories to recommend a set of expert reviewers for a code change.

We assist developers by providing a set of expert reviewers to review the submitted patch. Determining whether a submitted patch will be accepted or not by these reviewers could improve the efficiency of the peer-review process, e.g., help developers and reviewers prioritize and focus their efforts. In this dissertation, we present a predictive model in Chapter 8 that extracts the relevant features from code review data and classifies whether a patch will be accepted or not as soon as it is submitted.

2.2 Micro-Evolution Repositories

In this work, we present automated solutions for the DR, IA, and CR tasks that are centered on events from micro-evolution repositories. The term micro-evolution repositories is used to refer to all the repositories that store the data generated from developer activities between the start and end points of a change request and its resolution. Broadly speaking, micro-evolution repositories store data about all the "in-between" activities. The prominent instances of micro-evolution repositories are the ones that track and manage developer activities within IDEs (interaction activity) and code review tools (code review activity).

2.2.1 Interaction Histories from IDEs

Interaction is the activity of programmers in an IDE during a development session (e.g., editing a file or referencing API documentation). Tools, such as Mylyn, have been developed to model programmers' actions in IDEs [74]. Mylyn (https://www.eclipse.org/mylyn/) monitors programmers' activities inside the Eclipse IDE and uses the data to create an Eclipse user interface focused around a task. The Mylyn interaction data consists of traces of interaction histories. Each historical record encapsulates a set of interaction events needed to complete a task (e.g., a bug fix). Once a task is defined and activated, the Mylyn monitor records all the interaction events (the smallest unit of interaction within an IDE) for the active task. A trace file contains the interaction history of a task. This file is typically attached to the project's issue tracking system (e.g., Bugzilla). For each interaction, the monitor captures about eleven different data attributes. The most important of these is the structure handle attribute, which contains a unique identifier for the target element affected by the interaction. For example, the identifier of a Java class contains the names of the package, the file to which the class belongs, and the class. Similarly, the identifier of a Java method contains the names of the package, the file, and the class the method belongs to, the method name, and the parameter type(s) of the method. Figure 2.2 shows an example of a Mylyn interaction edit event.
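For illustration, the sketch below pulls the structure handles of edited elements out of such a trace. It assumes the trace has already been extracted from the issue attachment as an XML file whose InteractionEvent elements carry Kind and StructureHandle attributes; these element and attribute names are assumptions to verify against the actual Mylyn data.

```python
# Minimal sketch of reading a Mylyn interaction trace, assuming an XML file with
# InteractionEvent elements carrying "Kind" and "StructureHandle" attributes
# (names assumed here; check them against the traces attached to the issues).
import xml.etree.ElementTree as ET

def edited_handles(trace_path):
    root = ET.parse(trace_path).getroot()
    for event in root.iter("InteractionEvent"):
        if event.get("Kind") == "edit":
            yield event.get("StructureHandle")

# Example (hypothetical file name):
# for handle in edited_handles("bug-330695-context.xml"):
#     print(handle)
```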

The interaction history captures in-between activities (e.g., editing a source code file) of developers during a development session.

2.2.2 Code Review Histories

Code-review repositories archive all the data resulting from all the review activities

throughout the lifetime of a software system. In our conducted research, we used the code review data captured by Modern Code Review (MCR) tools. In recent years, many modern software

organizations have adopted MCR, which is based on collaborative tools. MCR tools that tightly integrate with version control systems (e.g., git) provide a lightweight code review environment to manage reviewing processes that are similar to formal software inspection, while

vironment to manage reviewing processes that are similar to the formal software inspection, while

allowing for asynchronous collaboration during code reviews. Google-based Gerrit is one example

1https://www.eclipse.org/mylyn/

11 Figure 2.2: A snippet of an interaction event recorded by Mylyn for bug issue #330695 with trace ID #221752. File TaskListView.java is edited.

of tool supported peer review. A Gerrit review begins when the owner posts a patch to be reviewed.

Reviewers are manually assigned, so that they can take part in the reviewing of the patch uploaded by the owner. Unassigned reviewers can also make comments. Reviewers can provide comments

on individual lines that have changed or they can provide general comments. Reviewers can ap- prove or reject the uploaded patch. A patch set encapsulates details regarding the author, committer, and the inline comments made by the reviewers [73]. Modern Code Review (MCR) [13], which

is tool assisted, supports the quality-control process of reviewing a proposed code change or patch for a maintenance task. It includes human feedback, which drives whether a proposed code change needs to be revised, and eventually accepted or abandoned. Only accepted source code changes

are merged to the code base. That is, it is one of the sources that captures the means adopted in reviewing and revising not only the patches that eventually get merged (i.e., committed), but also those abandoned. The code review history captures the in-between activities (e.g., providing the review feedback) of developers during a review session.

Patch Lifecycle in Gerrit

Gerrit is a modern peer-review tool that facilitates a traceable review process for git-based

software projects [1]. Developers make local changes in their private git repositories and then

submit these changes as a patch for review [82]. The owner may indicate the intended reviewers, who are subsequently notified about the review invitation. Alternatively, the owner can broadcast

12 Figure 2.3: An example from Gerrit code review# 74981 related to Eclipse Platform project

a request for review to find reviewers for their patch. It should be noted that the invited reviewers do not necessarily accept the invitation and contribute to the review. The approval step requires human contributions (e.g., to ascertain the patch meets project guidelines and intent), whereas, the verification step is largely automatic (e.g., the build works and complies, and passes unit test cases).

If it fails either of these steps, it is abandoned. Figure 2.3 shows an example from Eclipse Platform

Gerrit for patch review # 74981. It was accepted and merged to the projects’s git repository. This patch is for bug # 495797 which was assigned to Ian Pun for a fix and he patched it. The number of line changes is 10 and the number of changed files is 1. Eric Williams reviewed and accepted this patch.

Figure 2.4 shows the patch life cycle in Gerrit. Initially, a developer (the patch owner) makes changes to the source code in response to a bug report or feature request. They submit these code changes for review. The patch owner may indicate the intended reviewers (assigned reviewers), who are subsequently notified about the review invitation. It should be noted that the invited reviewers do not necessarily accept the invitation and contribute to the review. Contributing reviewers then inspect the change through the code review tool Gerrit and provide feedback in the form of review comments and scores to the owner. A submitted patch can have different statuses such as Needs Code Review, Ready to Submit, Abandoned, and Merged. For a patch to get accepted



Figure 2.4: The patch life cycle in Gerrit

and finally merged to the repository, it must receive at least one +2 score and not a single -2 score.

If a patch receives a score of -2, its status is marked Abandoned. If the patch has not received a -2 score, the patch owner can rework the patch and resubmit a new version of it. The review process for a particular submission may include multiple iterations between the patch owner and the contributing reviewers. Eventually, a reviewer signs off on the review once they perceive the code change to be of sufficient quality and ready to be checked into the code repository. If a change never receives sign-offs, it is abandoned. The number of sign-offs required to check in a code change typically depends on a particular team's policy. Automatic tools typically perform the verification part.
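The sign-off rule just described can be summarized in a few lines of code, shown below as a minimal sketch: a patch is ready to be merged once it has at least one +2 code-review score and no -2, a -2 blocks it, and anything else leaves it pending. Real Gerrit projects may configure additional labels (e.g., Verified), so this is an illustration of the rule as stated here rather than a general implementation.

```python
# Minimal sketch of the Gerrit code-review scoring rule described above.
def review_outcome(scores):
    if -2 in scores:
        return "abandoned"        # a single -2 blocks the patch
    if 2 in scores:
        return "ready to submit"  # at least one +2 and no -2
    return "pending"              # keep iterating on the patch

print(review_outcome([+1, +2]))   # ready to submit
print(review_outcome([+2, -2]))   # abandoned
print(review_outcome([+1, -1]))   # pending
```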

2.3 Macro-Evolution Repositories

Macro-evolution repositories include traditional sources such as the information stored in source code version-control systems (e.g., Subversion), requirements/bug-tracking systems (e.g., Bugzilla), and communication archives (e.g., email). Source-code control systems are used for storing and managing changes to source code artifacts. We term the repositories that store and

manage data about the end points as macro-evolution repositories. Typical examples of macro-evolution repositories are bug tracking systems (e.g., the start of a change) and source-control systems (the end of a change). The macro-evolution repositories typically mark the culmination of a major evolution task, e.g., the report of a bug or a code commit.

2.4 Related Work

In this section, we survey the related research on interaction history, code review, and various software maintenance tasks such as IA, DR, CR, and PAR. More specifically, we describe how the related work motivates our presented approaches.

2.4.1 Interaction History

Researchers have been developing IDE plug-ins to capture programmers' interactions during programming activities [37, 93, 88]. NavTracks was used to mine interaction coupling at the file-level granularity [93]. Team Track [37] was used to provide navigation support to programmers unfamiliar with the code base. In HeatMaps [88], the interestingness of a programming element is determined by computing a Degree-of-Interest (DOI) value based on its historical selection and modification. If an artifact is found interesting, it is decorated with colors to indicate its importance to the task.

Similarly, there are studies that present applications of the captured interaction data for software maintenance tasks. For example, Zou et al. [115] used the interaction history to identify evolutionary information about a development process (such as restructuring being more costly than other maintenance activities). Rastkar et al. [79] report on an investigation of how bug reports considered similar based on change-set information compare to bug reports considered similar based on interaction information. Robbes et al. [85] developed an incremental change-based repository by retrieving program information from an IDE (which includes more information about the evolution of a system than traditional SCM) to identify refactoring events. Parnin and Gorg [76] identified relevant methods for the current task by using programmers' interactions with an IDE. Kobayashi et al. [59] presented a Change Guide Graph (CGC) based on

interaction information to guide programmers to the location of the next change. Each node in the graph represents a changed artifact and each edge represents a relation between consecutive changes. By following the CGC, the next target in the change sequence can be identified. Schneider et al. [90] presented a visual tool for mining local interaction histories to help address some of the awareness problems experienced in distributed software development projects. Both interaction history and static dependencies were used to provide a set of potentially interesting elements for a programming task.

These studies motivated us to investigate the usefulness of interaction data to provide the solutions for different software maintenance tasks such as IA and DR.

2.4.2 Code Review

There has been much investigation in the various aspects of modern code review. We briefly discuss a few representative efforts from this investigation.

Previous studies in the code review area can be classified into empirical studies describing the different features of the modern code review process, predicting the outcome of code review, the influence of code review on code quality, optimizing the effort of reviewers, and tools that support the code review process. We focus our discussion of related code review research on work that considers code reviewers as a primary subject.

Empirical studies on code review: While a very rigid Fagan code inspection process may have been appropriate in the mid-70s, a significant amount of time and effort is required to collate review material and coordinate its distribution and review [38]. In contrast, contemporary or modern code review encompasses a series of less rigid practices [81, 60, 32]. These lightweight practices allow peer review to be adapted to fit the needs of the development team. Peer code review, a manual inspection of source code by developers other than the author, is recognized as a valuable tool for reducing software defects and improving the quality of software projects. Peer review is seen as an important quality assurance mechanism in both industrial development and the open source software

(OSS) community. Rigby et al. [83] examined two peer review techniques, review-then-commit and commit-then-review, used by the Apache server project. The frequency of reviews, the level of

participation in reviews, and the size of artifacts under review are a few factors that they measured in their studies. Modern code review often leaves out the team meeting and reduces the number of people involved in the review process to two. Wood et al. [104] found that the optimal number of reviewers should be two. Rigby et al. [82] compared three types of peer review methods: traditional inspection, OSS email-based peer review, and lightweight tool-supported review. Despite differences among projects, many of the characteristics of the review process have independently converged to similar values, which indicates general principles of code review practice.

Software peer review has proven to be a successful technique in open source software development. In contrast to industry, where reviews are typically assigned to specific individuals, changes are broadcast to hundreds of potentially interested stakeholders. Rigby et al. [84] describe an empirical study investigating the mechanisms and behaviors that developers use to find code changes they are competent to review. Bacchelli et al. [13] conducted an empirical study across diverse teams at Microsoft to empirically explore the motivations, challenges, and outcomes of tool-based code reviews. Baysal et al. [18] studied the patch lifecycle of the Mozilla Firefox project. Their study shows that patches from casual developers should receive extra care to ensure software quality and encourage future contributions. Porter et al. [77] studied the effect of variance among elements of the software inspection process, such as team size and the number and sequencing of sessions, on inspection effectiveness.

Influence of code review on the code quality: There are several studies investigating the

effect of code review on code quality. McIntosh et al. [69, 72] studied the effect of code reviews on code quality by mining the code review and change repositories of open source projects. They report that the percentage of reviewed changes a code component underwent correlates inversely

to its chance of being involved in post-release fixes. Beller et al. [21] studied the types of defects fixed in modern code review repositories. Kemerer et al. [55] show that code review reduces the number of defects in student projects. With the available data, they were also able to study the

impact of review rate on inspection performance. They found high review rates (i.e., a high number of reviewed LOC/hour) to be associated with a decrease in inspection effectiveness.

Tool support for code review: There are studies presenting developed tools that support the code review process. For example, Kim et al. [56] developed a tool called LSdiff to help reviewers inspect program differences. LSdiff addresses the limitations of program-differencing tools by inferring systematic structural differences as logic rules. Zhang et al. [114] developed a tool called CRITICS, an Eclipse plug-in that assists developers in inspecting systematic changes. There are several researched tools that help support other aspects of modern code review. Modern code review is often supported by tools, preferably integrated into the development environment (IDE)

[31]. One of these integrated IDE tools is ReviewClipse [22] and another is Mylyn Reviews [3]. A popular review tool is the OSS Gerrit [1], offering web-based reviewing for projects using Git [70]. A number of other review tools exist: CodeFlow [13], Microsoft's code review tool; Facebook's open-sourced tool [4]; and Mondrian, a tool that Google uses for its closed-source projects [2]. With the advent of open source code review tools such as Gerrit, along with projects that use them, code review data is now available for collection and analysis. Mukadam et al. [73] extracted

Android peer review data from Gerrit and provide the data for future empirical software engineer- ing questions. González et al. [42] presented an approach to retrieve and analyze the information produced by Gerrit based on a previously designed tool called Bicho.

None of the previous studies on Modern Code Review has investigated the application of code review data to the IA and DR tasks.

2.4.3 Software Change Impact Analysis

Information Retrieval (IR) methods were proposed and used successfully to address tasks of extracting and analyzing textual information in software artifacts, including change impact analysis in source code [30, 51, 78]. Existing approaches to IA using IR operate at two levels of abstraction: change request [30] and source code [51, 78]. In the first case, the technique relies on mining and indexing the history of change requests (e.g., bug reports). In particular, this IA method utilizes IR to link an incoming change request description to similar past change requests and the file revisions that were modified to address them [30, 101]. While this technique has been shown to be relatively robust in certain settings, it is entirely dependent on the history of prior change requests.

In cases where textually similar change requests cannot be identified (or simply do not exist), the technique may not be able to identify relevant impact sets. Also, these works show that a sizeable change request history must exist to make this approach operational in practice, which may limit its effectiveness. The other set of techniques for IA that use IR operates at the source code level and requires a starting point (e.g., a source code method that is likely to be modified in response to an incoming change request) [51, 78]. This approach is based on the hypothesis that modules (or classes) in software systems are related in multiple ways. The evident and most explored set of relationships is based on data and control dependencies; however, the classes can also be related conceptually (or textually), as they may contribute to the implementation of similar domain concepts. In our previous work [51, 41], given a textual change request (e.g., a bug report), a single snapshot (i.e., release) of source code, indexed using Latent Semantic Indexing, is used to estimate the impact set and to derive conceptual couplings from the source code. Machine learning techniques and information retrieval are used in different areas such as bug triaging, impact analysis, feature location, bug severity, and classifying software changes [97, 58, 99]. Different machine learning techniques such as ML-KNN, SVM, LSI, or decision trees are also used for recommending developers to resolve a reported bug request [105, 106, 66].

Although ample progress has been made, there remains quite a bit of work left in improving the effectiveness of IA. The previous studies used information from either the bug repository or the source code repository. None of these approaches considered the interaction activity of developers as a source of information. Developers may interact with (e.g., navigate, view, and modify) entities within an Integrated Development Environment (IDE) that may not be eventually committed to the code repository. These interactions could have contributed to locating and/or verifying the entities that were changed due to a change request, which potentially makes them candidates for future changes. Code review is another in-between activity of developers in the software maintenance process. Developers provide feedback in the form of review comments. These review comments include textual information related to both the source code and the bug itself. This textual information holds a wide range of information that is instrumental in finding a solution for IA. None of the previous approaches considered code review comments provided by developers as a source of information for the IA task. We present a new approach, namely InComIA, for IA that is centered on the developer interaction and commit histories of source code entities that were involved in the resolution of previous change requests. Similarly, we present another approach, called RevIA, which includes the code review comments written by developers during the review phase as additional textual information. Several empirical studies are conducted to assess the effectiveness of the InComIA and RevIA approaches.

2.4.4 Developer Recommendation

The task of automatically assigning issues or change requests (e.g., bug fixes or new features) to the developer(s) who are most likely to resolve them has been studied under the umbrella of issue triaging. A number of approaches exist in the literature [9, 107, 47, 7, 35, 16, 8, 10, 111].

Anvik and Murphy [7] conducted an empirical evaluation of two techniques for identifying expert developers. Developers acquire expertise as they work on specific parts of a system. Expertise Browser (ExB) is a tool to identify developers with the desired expertise [71]. Tamrawi et al. [94] used fuzzy sets to model the bug-fixing expertise of developers based on the hypothesis that developers who recently fixed bugs are likely to fix them in the near future. An approach based on a machine learning technique is used to automatically assign a bug report to a developer [9]. Another approach to facilitate bug triaging uses a graph model based on Markov chains, which captures the bug reassignment history [47]. Matter et al. [68] used the similarity of textual terms between a given bug report of interest and source code changes (e.g., word frequencies of the diffs of changes from source code repositories). Fritz et al. [39] introduced the degree-of-knowledge model that computes a real value for each source code element based on both authorship and interaction information. Their expertise model was applied to, and evaluated on, the developer recommendation problem. Bird et al. [25] analyzed the communication and coordination activities of the participants by mining email archives. Ma et al. [67] proposed a technique that uses implementation expertise (i.e., developers' usage of API methods) to identify developers. Weissgerber et al. [102] depict the relationship between the lifetime of the project and the number of files each author updates by analyzing and visualizing the check-in information for open source projects. German [40] provided a visualization to show which developers tend to modify certain files by studying the modification records (MRs) of CVS logs. Bortis et al. [27] introduced PorchLight, a tag-based interface with customized query expressions, to offer triagers the ability to explore, work with, and assign bugs in groups. Shokripour et al. [92] proposed an approach for bug report assignment based on the predicted location (in the source code) of the bug. Begel et al. [20] conducted a survey of inter-team coordination needs and presented a flexible Codebook framework that can address most of those needs. Robbes et al. [87] used the interaction data from Mylyn to evaluate developer-expertise metrics based on time, which is the start time of a Mylyn interaction session.

Approaches for developer recommendation typically operate on software repositories (e.g., models trained from past bugs/issues or source-code changes), source-code authorship, or their combinations, under the rationale that if a developer frequently commits changes to, or fixes bugs in, particular parts of a system, they have knowledge of that area and can competently make changes to it. We agree with the fundamental concept that historical records of developers' activity yield insight into their knowledge and ability, but we also posit that additional traces of developer activity, such as interactions with source code (beyond those leading to commits) when resolving an issue, can provide a more comprehensive picture of their expertise. Similarly, historical code reviews capture many unique aspects that can be useful markers of developer expertise. Among them are the reviewer role and contributions in terms of code critique, and the formative experience of authoring code changes, which may or may not have been successful (i.e., eventually included in the project code base). Code reviews subsume the commits in source-code repositories (e.g., git or Subversion). We posit that these markers provide a unique opportunity to expand the scope of forming models of developer expertise in source code and their improved applications in software maintenance and evolution tasks. The records of developers' interactions with code in resolving an issue and the code review activity of developers remain largely untapped in solving this problem.

We present a new approach, namely iHDev, for assigning incoming change requests to appropriate developers. iHDev is centered on the developers' interactions with source code entities that were involved in the resolution of previous change requests. Similarly, we present another approach, namely rDevX, for assigning incoming change requests to appropriate developers based on the past code review activities of developers. Several empirical studies are conducted to assess the effectiveness of the presented approaches.

2.4.5 Reviewer Recommendation

The reviewer recommendation task has not been examined much in the literature yet. There are only three approaches reported for reviewer recommendation. Balachandran [14] proposed a git-blame-like, line-oriented approach. Recently, Thongtanunam et al. [96] proposed an approach, namely REVFINDER, which is based on the past reviews of files with similar names and paths. They showed that REVFINDER outperformed Balachandran's approach on open source systems. REVFINDER finds past reviews with files whose paths and names are similar (based on string comparison) to the ones in the patch under review. It assigns all the reviewers from each such past review the same (string comparison) score. All the reviewers are ranked based on the sum of their scores. It does not look into other attributes of the reviewers' past contributions (e.g., how much and when) and is limited to whether a reviewer contributed or not. In summary, REVFINDER's expertise model favors breadth or generality of review contributions. In parallel to our work, Xia et al. [98] proposed another approach for reviewer recommendation, namely TIE. The intuition of TIE is that the same reviewers are likely to review changes containing similar terms (words) and reviewers are likely to review changes to the same files or files in similar locations. TIE outperformed REVFINDER on open source systems. Similar to REVFINDER, the TIE approach only looks into the similarity of the patch description and file paths.

Previous approaches for reviewer recommendation either need textual information from code changes or do not account for attributes such as the amount of comments and their recency. None of the previous approaches provides an empirical evaluation that includes the closed-source domain. We present an approach, cHRev, to automatically recommend reviewers who are best suited to participate in a given review, based on their historical contributions as demonstrated in their prior reviews. We evaluate the effectiveness of cHRev on three open source systems as well as a commercial codebase at Microsoft.

2.4.6 Predicting the Outcome of Code Review

There are a few studies that evaluate the influence of different factors on the output of code review (e.g., code review response time and outcome). Baysal et al. [17] described an empirical study of the code review process for WebKit; the reported results show that non-technical factors such as bug priority and patch writer experience can have a significant impact on the code review outcome. Weissgerber et al. [103] performed data mining on the email archives of two open source projects to study the patch contributions. They found that smaller patches have a higher chance of being accepted than larger ones. Jiang et al. [49] found that patch acceptance is affected by the developer experience, patch maturity, and prior subsystem churn, whereas the reviewing time is impacted by the submission time, the number of affected subsystems, the number of suggested reviewers, and the developer experience. Jeong et al. [48] examined the review process for two Mozilla projects along with approaches to recommend reviewers. They then presented predictors to predict the outcome of the code review process performed by the recommended reviewers.

There have been a few previous efforts addressing the question of what makes patches get accepted or rejected; however, they were not in the context of Modern Code Review [17], [50], [44], [19]. Also, the predictive recommendation and assessment of whether a patch will eventually be accepted or not, as soon as it is submitted for MCR, were not investigated previously. We first investigate the factors that influence patch acceptance. We take into account several direct and indirect measures, which are derived from the structured and unstructured data available from bug tracking and code-review repositories. These measures are then used to formulate a descriptive model. Secondly, we build a classifier to predict whether a patch will be accepted or not as soon as it is submitted to the MCR process.

CHAPTER 3

Change Impact Analysis (InComIA)

This chapter presents an approach to perform impact analysis (IA) of an incoming change request on source code [112]. The approach is based on a combination of interaction (e.g., Mylyn) and commit (e.g., CVS) histories. As we mentioned in Chapter 2, the interaction activity of developers is an example of micro-events. The source code entities (i.e., files and methods) that were interacted with or changed in the resolution of past change requests (e.g., bug fixes) were used. Information retrieval, machine learning, and lightweight source code analysis techniques were employed to form a corpus from these source code entities. Additionally, the corpus was augmented with the textual descriptions of the previously resolved change requests and their associated commit messages. Given a textual description of a change request, this corpus is queried to obtain a ranked list of relevant source code entities that are most likely change prone. Such an approach that combines information from interactions and commits for IA at the change request level was not previously investigated. Furthermore, the approach requires only the entities that were interacted with and/or committed in the past, which differs from previous solutions that require indexing of a complete snapshot (e.g., a release).

An empirical study on 3272 interactions and 5093 commits from Mylyn, an open source task management tool, was conducted. The results show that the combined approach outperforms an individual approach based on commits. Moreover, it also outperformed an approach based on indexing a single, complete snapshot of a software system.

3.1 Introduction

Software-change impact analysis, or simply impact analysis (IA), is an important activity during the software maintenance and evolution phase. IA aims at estimating the potentially impacted entities of a system due to a proposed change [11]. The applications of IA include cost estimation, resource planning, testing, change propagation, managing ripple effects, and traceability [78, 29, 62, 75, 80].

Change requests are typically specified in natural language (e.g., English). They include bug reports submitted by programmers or end users during the post-delivery maintenance of a product and managed with issue tracking systems (e.g., Bugzilla). One variant of IA that is widely investigated in the literature is to identify the potential source code entities that are change prone on account of a given change request. A number of approaches to address this task have been presented in the literature for performing IA [41]. They range from traditional static and dynamic analysis techniques to modern methods based on Information Retrieval (IR) and Mining Software Repositories (MSR). For example, an IR technique is used to index all the source code entities in a single snapshot and then use the incoming change request to query for the top relevant source code entities. MSR techniques use the historic information from bug and source code repositories to predict the change-prone source code. Although ample progress has been made, there remains quite a bit of work left in improving the effectiveness of IA.

We present a new approach, namely InComIA, for IA that is centered on the developer interaction and commit histories of source code entities that were involved in the resolution of previous change requests. Developers may interact with (e.g., navigate, view, and modify) entities within an Integrated Development Environment (IDE) that may not be eventually committed to the code repository. These interactions could have contributed to locating and/or verifying the entities that were changed due to a change request, which potentially makes them candidates for future changes. Tools such as Mylyn capture and store such interaction histories. Our approach combines the source code entities (i.e., files and methods) that were interacted with or changed in the resolution of past change requests (e.g., bug fixes). Information retrieval, machine learning, and lightweight source code analysis techniques are then used to form a corpus from these source code entities. Additionally, the corpus is augmented with the textual descriptions of the previously resolved change requests and their associated commit messages. Given a textual description of a change request, this corpus is queried to obtain a ranked list of relevant source code entities that are most likely change prone (i.e., the estimated impact set). In our previous work [15], we used combinations of interaction and commit histories for source-code-to-source-code IA. In this chapter, we investigate IA at the onset of an incoming change request to source code.

To evaluate the accuracy of our technique, we conducted an empirical study on the open source system Mylyn. Precision and recall metric values on a number of bug reports sampled from this system for IA are presented. That is, how effective our approach is at recommending the actual source code entities (files and methods) that are changed to fix these bugs. Additionally, we empirically compared our approach to those using commits and the entire source code from a single snapshot. The results show that the presented approach outperformed the baseline competitors: lowest recall gains of 28% and 44%, highest recall gains of 225% and 350%, lowest precision gains of 28% and 33%, and highest precision gains of 250% and 350% were recorded.

Our work makes the following noteworthy contributions in the context of performing IA on source code due to an incoming change request:

1. To the best of our knowledge, our approach is the first to integrate the developer interaction and commit histories of source code.

2. We performed a comparative study with an approach that makes exclusive use of source-code commits.

3. We performed a comparative study with an approach that uses the source code in a single snapshot.

The rest of the chapter is organized as follows: Our InComIA approach is discussed in Section 3.2. The empirical study on the Mylyn open-source project and the results are presented in Section 3.3. Threats to validity are listed and analyzed in Section 3.4.

3.2 The InComIA Approach

Our InComIA approach to IA of an incoming change request on source code consists of the following steps:

• The past change requests (e.g., bug reports) from software repositories are analyzed, and all the source code entities that were interacted with and committed are extracted.

• The source code of each unique entity, from each revision (i.e., in Subversion vocabulary) in which it was interacted with or committed in the past, is parsed using a developer-defined granularity level (e.g., file and method). For each entity, comments, identifiers, and expressions are then extracted from the parsed source code. For each entity, all of its commit messages and bug descriptions are also obtained. A corpus is created such that each source code entity has a corresponding document (i.e., in IR vocabulary) in it.

• Given a change request description, a ranked list of relevant source code entities (e.g., files and methods) is recommended for IA based on the K-Nearest Neighbor algorithm and cosine similarity.

Before we present the details of these steps, we briefly discuss the two principal sources of information of InComIA.

3.2.1 Interactions and Commits for Change Request Resolution

Interaction is the activity of programmers in an IDE during a development session (e.g., editing a file or referencing API documentation). Different tools (such as Mylyn) have been developed to model programmers' actions in IDEs [74, 93, 90, 88]. Mylyn1 monitors programmers' activities inside the Eclipse IDE and uses the data to create an Eclipse user interface focused around a task.

The Mylyn interaction consists of traces of interaction histories. Each historical record encapsulates a set of interaction events needed to complete a task. Once a task is defined and activated, the Mylyn monitor records all the interaction events (the smallest unit of interaction within an IDE) for the active task. For each interaction, the monitor captures about eight different types of data attributes. The structure handle attribute contains a unique identifier for the target element affected by the interaction. For example, the identifier of a Java class contains the names of the package, the file to which the class belongs, and the class. Similarly, the identifier of a Java method contains the names of the package, the file and the class the method belongs to, the method name, and the parameter type(s) of the method. Figure 3.1 shows an example of 4 consecutive Mylyn interaction events. For each active task, Mylyn creates an XML trace file called Mylyn-context.zip. In the 1st interaction, the createEditorTab method is selected. In the 2nd, 3rd, and 4th interactions, the contextActivated method is indirectly manipulated, then directly selected, and finally edited. A) Method name: createEditorTab; B) Class name: ContextEditManager; C) File name: ContextEditorManager.java; D) Parameter types for the createEditorTab method: Lorg.eclipse.ui.internal.EditorReference and Ljava.lang.String.

1https://www.eclipse.org/mylyn/

Figure 3.1: A snippet of 4 interaction events (labelled 1-4) recorded by Mylyn for bug issue #175229 with trace ID #71687.

A trace file contains the interaction history of a task. This file is typically attached to the project’s issue tracking system (e.g., Bugzilla or JIRA). The trace files for the Mylyn project are archived in the Eclipse bug tracking system as attachments to a bug report2.

A common practice in open source software development is for developers to include an explicit bug or issue ID in the commit message. The presence of this information establishes the traceability between an issue or bug reported in the bug tracking system and the specific commit(s) performed to address it. A regular-expression based method can be employed to process commits and extract this traceability information. Figure 3.2 shows the bug description for bug ID #175229, which is extracted from the Eclipse bug tracking system, and the commit message related to this bug ID, which is for revision #5113. The files that were included in this commit are listed between the paths tags. Now, we describe the details of our InComIA approach.

2https://bugs.eclipse.org/bugs/query.cgi

Figure 3.2: Bug ID #175229 from the Eclipse bug tracking and commit histories (bug description "Should be able to open editor automatically when a task is activated", reported 2007-02-23; matching commit by mkersten on 2007-06-19 with the message "RESOLVED - bug 175229: Should be able to open editor automatically when a task is activated"; https://bugs.eclipse.org/bugs/show_bug.cgi?id=175229).
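To make the regular-expression based linkage concrete, the following is a minimal sketch rather than the actual tool used in this work; the commit-message convention and the helper name are assumptions for illustration.

```python
import re

# Assumed convention: commit messages mention the issue as "bug <ID>"
# (e.g., "RESOLVED - bug 175229: ..."); real logs may use other formats.
BUG_ID_PATTERN = re.compile(r"\bbug\s*#?\s*(\d+)\b", re.IGNORECASE)

def extract_bug_ids(commit_message):
    """Return all bug IDs referenced in a commit log message."""
    return [int(match) for match in BUG_ID_PATTERN.findall(commit_message)]

# Hypothetical commit message modeled on Figure 3.2.
message = "RESOLVED - bug 175229: Should be able to open editor automatically"
print(extract_bug_ids(message))  # -> [175229]
```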

3.2.2 Extracting Interacted Entities

We first need to identify bug reports that contain mylyn-context.zip attachment(s), because not all bug issues contain interaction trace(s). To do so, we searched the Eclipse bug-tracking system for bugs containing at least one mylyn-context.zip attachment. Another factor to consider is that not all interactions with a system result in committed changes to a source control system. If a bug issue is not fixed with a resolution, it is unlikely that a corresponding commit history exists. Thus, we only searched for bug issues with a "Resolved" status and a "Fixed" resolution.

We developed a tool that processes the search results and performs the following major tasks:

Downloading trace files: The tool takes the search result from the Eclipse bug-tracking site as input and automatically downloads all the trace files to a user-specified directory. The trace files all have the same name, mylyn-context.zip. The tool renames each file by using the bug ID and attachment ID (separated by an underscore), giving each a unique identifier in the directory it resides in. Internally, the tool identifies the trace file ID(s) for each bug issue. If options are specified to output this result, the tool can save the bug IDs with the corresponding trace IDs in a Java properties file format, where the key is the bug ID and the value is a comma-separated list of trace IDs. It uses the URL pattern https://bugs.eclipse.org/bugs/attachment.cgi?id=X to download each trace file by replacing X with the trace ID.
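A minimal sketch of this download step is shown below, assuming the attachment URL pattern given above; the requests dependency, the output directory, and the example (bug ID, trace ID) pair are illustrative assumptions rather than the actual tool.

```python
import os
import requests  # assumed third-party HTTP dependency

ATTACHMENT_URL = "https://bugs.eclipse.org/bugs/attachment.cgi?id={trace_id}"

def download_traces(bug_to_traces, out_dir):
    """Download each mylyn-context.zip and save it as <bugID>_<traceID>.zip."""
    os.makedirs(out_dir, exist_ok=True)
    for bug_id, trace_ids in bug_to_traces.items():
        for trace_id in trace_ids:
            response = requests.get(ATTACHMENT_URL.format(trace_id=trace_id))
            response.raise_for_status()
            path = os.path.join(out_dir, f"{bug_id}_{trace_id}.zip")
            with open(path, "wb") as f:
                f.write(response.content)

# Hypothetical mapping in the spirit of the Java properties output described
# above: bug ID -> list of trace (attachment) IDs.
download_traces({175229: [71687]}, "traces")
```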

Processing trace files: The tool takes the directory that contains the trace files as input and parses each trace file to identify the list of Java files and methods manipulated by each interaction history. We consider each trace file as an interaction transaction. For each transaction, the tool outputs the issue number together with a tab-separated list of Java files and methods. We need the issue number to create a link between interaction and commit transactions. The targeted files and methods are identified from the structure handle of the interaction event. Figure 3.1 (A and C) shows a method name and a file name from events 1 and 3, respectively. Mylyn can create different types of interaction events (e.g., edit and selection) on the same target in a single interaction history. We consider only the first interaction with an element, regardless of the type of interaction event. For further details on this step, please refer to our previous work [15]. We refer to the set of interacted entities as $I = \{(i, E_{i,m})\}$, where $i$ is the revision number and $E_{i,m}$ is the $m$-th entity that was interacted with in revision $i$.
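The trace-processing step can be sketched as follows. The XML element and attribute names (InteractionEvent, StructureHandle) are assumptions about the Mylyn trace format, and handle parsing is omitted; the sketch only illustrates keeping the first interaction per target element.

```python
import zipfile
import xml.etree.ElementTree as ET

def interacted_handles(trace_zip_path):
    """Return target element handles in the order of their first interaction.

    The element/attribute names below are assumed for illustration; the real
    Mylyn trace schema should be verified before use.
    """
    seen, ordered = set(), []
    with zipfile.ZipFile(trace_zip_path) as archive:
        for name in archive.namelist():
            if not name.endswith(".xml"):
                continue
            root = ET.fromstring(archive.read(name))
            for event in root.iter("InteractionEvent"):
                handle = event.get("StructureHandle")
                if handle and handle not in seen:
                    seen.add(handle)       # keep only the first interaction
                    ordered.append(handle)
    return ordered

# Hypothetical path produced by the download sketch above.
print(interacted_handles("traces/175229_71687.zip"))
```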

3.2.3 Extracting Committed Entities

Our approach also requires commit data from version archives, such as SVN and CVS, because we need files that have been changed together in a single commit operation. SVN preserves the atomicity of commit operations; however, older versions of CVS did not. Subversion assigns a new "revision" number to the entire repository structure after each commit. For a project hosted in an "older" CVS repository, we convert the CVS repository into an SVN repository using the CVS2SVN tool, which has been used in popular projects such as gcc. Our tool mines file-level commit transactions from the SVN repository. For mining method-level transactions, we used a previously developed tool with some modifications to identify the issue number associated with each commit. We refer to the set of committed entities as $C = \{(i, E_{i,n})\}$, where $i$ is the revision number and $E_{i,n}$ is the $n$-th entity that was committed in revision $i$. Figure 3.3 shows the overall structure of our InComIA approach. This section contributes to the left three blocks of Figure 3.3.

Figure 3.3: A schematic diagram of InComIA (blocks: bug and commit histories, BM and CM; new bug report; set of interacted entities I; set of committed entities C; corpus; doc_new and doc_past; K-NN; top K relevant entities).

3.2.4 Obtaining Source Code of Entities from Interacted and Committed Revisions

Our InComIA approach needs the source code of all of the entities from all of the revisions in which they were interacted with or committed previously in resolving change requests. The extraction step described above only gives the list of entity names and the revisions in which they were involved, but not the source code. It is relatively straightforward to determine the source code revision of an entity in which it was committed; however, the precise revision number that was interacted with is not available from the interaction history. Therefore, we consider that the source code in the (n-1)-th revision of an entity is the version that was interacted with if the same entity was committed in the n-th revision for an issue.

Source code from each of the revisions in which an entity was interacted with is obtained. Similarly, source code from each of the revisions in which an entity was committed is obtained. At the end of this step, we have all the needed source code of each interacted or committed source code entity. We use these source code files to form a corpus.

3.2.5 Creating a Corpus from Source Code and Textual Descriptions

Extracting Comments, Identifiers and Expressions

The source code files from each interacted or committed revision are converted to the srcML representation3. This conversion is done for ease of extraction of identifiers, comments, and expressions from the source code. All the comments are extracted from all of the srcML files with an XML query approach. We consider two types of comments in a source code file.

• Header comments, which are generally the first comment in a source code file, or lead a source code class and/or source code method, and may contain useful keywords related to the specific role of the related class or method.

• Body comments, which sometimes contain important descriptions (e.g., related to a specific change that was made in a specific class or method).

For extracting identifier names, we use the CamelCase splitting technique for compound identifiers. For example, the identifier "TaskHistoryTest" is split into "Task", "History", and "Test", and the identifier "SSLCertificate" is split into "SSL" and "Certificate". Similarly, we consider all of the expressions in a source code file, including all of the output messages, which may contain important keywords.
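A minimal sketch of such a CamelCase splitter is shown below; the regular expression is an illustrative choice that reproduces the two examples above and is not necessarily the exact splitter used in this work.

```python
import re

# Matches acronym runs (SSL), capitalized words (Task), lowercase runs, and digits.
CAMEL_CASE_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z][a-z]+|[a-z]+|\d+")

def split_camel_case(identifier):
    """Split a compound identifier into its constituent words."""
    return CAMEL_CASE_RE.findall(identifier)

print(split_camel_case("TaskHistoryTest"))  # ['Task', 'History', 'Test']
print(split_camel_case("SSLCertificate"))   # ['SSL', 'Certificate']
```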

Extracting Issue Descriptions

We add the issue (i.e., bug) descriptions related to each revision that has files that were interacted with or committed to our corpus. Therefore, we need a list of all the issue IDs with their textual descriptions. We refer to the set of bug descriptions as $BM = \{(i, B_{i,k})\}$, where $i$ is the revision number and $B_{i,k}$ is the $k$-th bug description for revision $i$. It is possible that more than one bug description exists for one revision.

3http://www.srcml.org/

Figure 3.4: A snippet of the file TaskHistoryTest.java

Extracting Commit Messages

After converting the Mylyn CVS repository to an SVN repository, we simply parse the SVN log to find the commit message and bug ID related to each revision number. We refer to the set of commit messages as $CM = \{(i, C_i)\}$, where $i$ is the revision number and $C_i$ is the commit message for revision $i$.

After this step, we have a document in the corpus for each source code entity that was either interacted with or committed. The dimensions of these documents in the corpus are formed from all of the words extracted from processing the revisions for source code comments and identifiers, issue descriptions, and commit messages.

Figure 3.4 shows that there is a strong text similarity between the header comment of file TaskHistoryTest.java "Test navigation to previous/next tasks....", the method body comment "....by clicking rather than navigating previous/next", and the commit message from the SVN log "provide context for previous/next task actions ....". After this step, all of the information needed from the left three blocks in Figure 3.3 is gathered and the initial corpus is formed.

3.2.6 Indexing and Querying the Corpus

In InComIA, we use techniques from Natural Language Processing in order to locate textually relevant files based on comments, identifiers, expressions, and bug and commit descriptions. We have the two following documents: a new bug description for which we want to find all of the relevant files (which we will refer to as doc_new) and a corpus document made up of all of the information extracted from source code files plus bug descriptions and commit messages (which we will refer to as doc_past). Before transforming the two documents into numeric vectors (document representation), we preprocess both of our documents.

Preprocessing Step

The preprocessing step includes two important parts:

Removing Stop words: Most common words in English (called stop words) are often removed from documents before any attempt to classify them is made.

Stemming each word: Stemming is the process of reducing inflected (or sometimes derived) words to their stem, base, or root form. After removing the stop words, we stem each of the words in the bug description.

nltk.stopwords and nltk.PorterStemmer from the Natural Language Toolkit are used for these two steps4.
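A minimal sketch of these two preprocessing steps, using the NLTK components named above, is shown below; the simple tokenization rule and the one-time corpus download are assumptions of this sketch.

```python
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

nltk.download("stopwords", quiet=True)  # one-time download of the stop word list

STOP_WORDS = set(stopwords.words("english"))
STEMMER = PorterStemmer()

def preprocess(text):
    """Lower-case, tokenize, remove stop words, and stem the remaining terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [STEMMER.stem(token) for token in tokens if token not in STOP_WORDS]

print(preprocess("Should be able to open editor automatically when a task is activated"))
# e.g. ['abl', 'open', 'editor', 'automat', 'task', 'activ']
```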

Document Presentation (Including Document Indexing and Term Weighting)

To perform the process of document presentation we use Gensim5 (topic modelling for humans), a Python library.

Document Indexing: We produce a dictionary from all of the terms in our document and assign a unique integer ID to each term appearing in it.

Term Weighting: We use tf-idf in our approach. The global importance of a term $i$ is based on its inverse document frequency ($idf$), which is derived from the document frequency of term $i$ in the whole document collection; $tf_{i,j}$ is the term frequency of term $i$ in document $j$. Each document $d_j$ is represented as a vector

$$ d_j = (w_{1,j}, \ldots, w_{i,j}, \ldots, w_{n,j}), $$

where $n$ is the total number of terms in our document collection and $w_{i,j}$ is the weight for term $i$ in document $j$.

4http://www.nltk.org
5http://radimrehurek.com/gensim/tut3.html

Dimensionality Reduction

In InComIA, we use the singular value decomposition (SVD) of the feature-document matrix representation of our dataset (doc_past) [91]. We transform our tf-idf corpus into a latent space of lower dimensionality via the models.LsiModel class in the Gensim library. Thus, all the processing of corpus creation in the "corpus" block of Figure 3.3 is completed.
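Using the Gensim components named above, the indexing pipeline could look roughly as follows. The two toy documents and the preprocess helper (from the preprocessing sketch) are assumptions, and num_topics is set to 2 only because the toy corpus is tiny; the evaluation later uses a dimensionality of 500.

```python
from gensim import corpora, models

# Toy documents standing in for the per-entity documents (source code text,
# bug descriptions, and commit messages), tokenized with preprocess() above.
docs_past = [
    preprocess("provide context for previous next task actions"),
    preprocess("add Bugzilla repository templates for version support"),
]

dictionary = corpora.Dictionary(docs_past)                   # document indexing
bow_corpus = [dictionary.doc2bow(doc) for doc in docs_past]  # bag-of-words vectors
tfidf = models.TfidfModel(bow_corpus)                        # term weighting
tfidf_corpus = tfidf[bow_corpus]
# Dimensionality reduction: project the tf-idf vectors into a latent space.
lsi = models.LsiModel(tfidf_corpus, id2word=dictionary, num_topics=2)
```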

K-Nearest-Neighbor

The task of multi-label classification is to predict, for each data instance, a set of labels that applies to it. Standard classification assigns only one label to each data instance. However, in many settings a data instance can be assigned more than one label. For IA, each data instance (i.e., a bug report) can be assigned multiple labels (i.e., entities). ML-KNN is a state-of-the-art algorithm in the multi-label classification literature [113]. We employ the K-Nearest-Neighbor (KNN) search with a defined value of K to search the existing corpus (doc_past) based on similarities with the new bug description (doc_new) (i.e., the "K-NN" block in Figure 3.3). This search finds the top K similar files or methods. Cosine similarity is used to measure the similarity of the two document vectors (see the three corresponding blocks in Figure 3.3):

$$ f_{sim}(doc_{new}, doc_{past}) = \frac{doc_{new} \cdot doc_{past}}{|doc_{new}|\,|doc_{past}|} \qquad (3.1) $$
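A sketch of this query step is shown below: the new bug description is projected into the same latent space, and cosine similarity against every entity document yields the top-K neighbors. It reuses the dictionary, tfidf, tfidf_corpus, lsi, and preprocess names from the previous sketches, and the entity names passed in are hypothetical.

```python
from gensim import similarities

# Cosine-similarity index over the latent-space representation of doc_past.
index = similarities.MatrixSimilarity(lsi[tfidf_corpus])

def top_k_entities(new_bug_text, entity_names, k=10):
    """Rank entity documents by cosine similarity to the new bug description."""
    query_vec = lsi[tfidf[dictionary.doc2bow(preprocess(new_bug_text))]]
    scores = index[query_vec]
    ranked = sorted(zip(entity_names, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical entity names corresponding to the two toy documents above.
print(top_k_entities("task repository settings wizard version support",
                     ["TaskHistoryTest.java", "AbstractRepositorySettingsPage.java"], k=2))
```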

Once a new change request is received, by examining the created corpus, we can recommend the relevant source code entities that need to be fixed to resolve it. In summary, our InComIA approach (Formula (3.2)) is a model based on a union of the elements in sets C and I plus the information from sets CM and BM, where Cor_{e_{i,n}} is the corpus created for entity e_{i,n}:

$$ InComIA = \biguplus_{\substack{(i, e_{i,n}) \in C \cup I \\ (i, b_i) \in BM,\; (i, C_i) \in CM}} Cor_{e_{i,n}} \uplus b_i \uplus C_i \qquad (3.2) $$

We use Mylyn interaction data and commits from the SCM for supporting IA. To obtain the list of files that were interacted with to fix previous bugs, we first extract the Mylyn interaction trace files from the bug tracking system and then process them into transactions. Similarly, for creating a list of files that were committed to fix previous bugs, we extract commits from the source code repository and then process them into transactions.

Table 3.1: Predicted files for bug# 201151 by three different approaches

Approach   Predicted files
SIA        —
ComIA      AbstractRepositorySettingsPage.java
InComIA    AbstractRepositorySettingsPage.java, BugzillaRepositorySettingsPage.java

3.2.7 An Example from Mylyn

Here, we demonstrate our approach using an example from Mylyn. The change request of interest here is bug# 201151, "[patch] Bugzilla TaskRepositori Setting let Version Automatic being left when Version not supported". This bug is fixed in revision# 5595. Our gold set GS, i.e., the files changed to fix this specific bug, is AbstractRepositorySettingsPage.java, IBugzillaConstants.java, and BugzillaRepositorySettingsPage.java.

Table 3.1 shows the results of the search, i.e., the files predicted by three different approaches. Approaches SIA and ComIA are discussed in Section 3.1. Only correctly predicted files out of the 10 recommended ones are shown. The number of terms for SIA is 130607 and the number of documents is 1102. The number of terms for ComIA is 5294145 and the number of documents is 2398, and for InComIA, the number of terms is 15825476 and the number of documents is 2405. InComIA has the largest corpus. InComIA has the best performance in comparison with ComIA and SIA. ComIA and InComIA have larger corpora than SIA because these two approaches consider all the revisions in which files were interacted with or committed, plus commit messages and bug descriptions. Tables 3.2 and 3.3 show a few interesting commit messages, expressions, and comments for the revisions in which the file AbstractRepositorySettingsPage.java was committed and the file BugzillaRepositorySettingsPage.java was committed or interacted with. This information does not exist in SIA, because in SIA we only consider a single snapshot of the software (in this case revision 5594, which is the last). The expressions and comments extracted from source code that are shown in Tables 3.2 and 3.3 do not exist in the gold set source code files in revision 5594.

Table 3.2: Different comments and expressions extracted from file AbstractRepositorySettingsPage.java in different revision numbers and associated commit messages

Revision   Comment, expression, and commit messages
2526       "Refactor TaskRepository to be extensible"
2872       "add Bugzilla repository templates"
2827       "connector: add repository templates for version support"

Table 3.3: Different comments and expressions extracted from file BugzillaRepositorySettingsPage.java in different revision numbers and associated commit messages

Revision   Comment, expression, and commit messages
5189       "Task List Repository preference page allows a negative number of days"
5268       "TaskRepository settings wizards don't persist individual proxy settings"

Our example suggests that comments and expressions from the different revisions of a file, plus all of the commit messages and bug descriptions related to those revisions, could contribute important related words to the corpus. If only one revision of the software is considered in creating the corpus, we could lose important keywords from other revisions.

3.3 Empirical Evaluation

The main purpose of this case study was to investigate how well our combined approach InComIA for IA performed in predicting relevant entities for fixing a new incoming change request (e.g., bug). We compared InComIA with a previous approach for IA that indexes all the source code in a single snapshot (denoted here as SIA). Moreover, we compared InComIA with an approach that uses past commits and bug descriptions (denoted here as ComIA). This comparison would permit the assessment of the impact of including interactions in InComIA.

3.3.1 Research Questions

We addressed the following research questions (RQs) in our case study:

• RQ1. How do ComIA (an IA approach trained from entities in only set C) and InComIA (an IA approach trained from entities in the combined sets C and I) compare in predicting the top K relevant entities for incoming change requests?

• RQ2. How do the IA approaches ComIA and InComIA compare with the previous model SIA in predicting the top K relevant entities for incoming change requests?

ComIA is a model based on the committed entities in set C plus the information from sets CM and BM, where Cor_{e_{i,n}} is the corpus created for entity e_{i,n}, with (i, e_{i,n}) ∈ C, (i, b_i) ∈ BM, and (i, C_i) ∈ CM:

$$ ComIA = \biguplus_{\substack{(i, e_{i,n}) \in C \\ (i, b_i) \in BM,\; (i, C_i) \in CM}} Cor_{e_{i,n}} \uplus b_i \uplus C_i \qquad (3.3) $$

SIA is a common model that is based on indexing the source code from a single snapshot (e.g., software release).

The purpose of RQ1 was to assess whether incorporating an additional set of interacted entities into our training set improved the accuracy of IA. The accuracy comparison between the approaches InComIA and ComIA would help us assess the level of the past interacted entities' contribution to the accuracy of InComIA. That is, is it really worthwhile to put in the additional work of extracting entities interacted with in the resolution of past change requests, or would the committed entities suffice? The purpose of RQ2 was to assess how IA approaches based only on source code entities that were interacted with and/or committed in the past compare with an approach that uses the entire source code body in a single snapshot.

3.3.2 Experiment Setup

For ComIA, we considered all of the bugs that have an associated ID in a commit message. Subversion (SVN) repository commit logs were used to aid in this process. For example, keywords (such as the bug ID) in the commit messages/logs were used as starting points to determine whether the commits were in fact associated with the mentioned change request in the issue tracking system. Similarly, we obtained the associated revision numbers for each bug ID.

For InComIA, we considered all of the bugs that have at least one associated ID either in the commit messages or in the interaction trace files. A list of files (or methods) with the associated revision number was then extracted. The specific versions of the files (or methods) that were committed (in ComIA) and either committed or interacted with (in InComIA) for each bug in the testing set were excluded from the training set. All of the necessary information (comments, identifiers, and expressions) was extracted from each revision of the entity, plus all of the commit messages and bug descriptions for each of the revisions, in order to create the final training corpus. We used a dimensionality of 500, which is recommended for a large number of terms [28]. The experiment was run for two K values: K@10 and K@20.

3.3.3 Subject Software System

We focused our evaluation on the Mylyn project, which contains about 4 years of interaction data. It is an Eclipse Foundation project with the largest number of interaction history attachments. It is mandatory for Mylyn project committers to use the Mylyn plug-in. Commit history started 2 years prior to that of interaction, and commits to the Mylyn CVS repository terminated on July 01, 2011. To get both interaction and commit histories within the same period, we considered the history between June 18, 2007 (the first day of interaction history attachment) and July 01, 2011 (the last day of a commit to the Mylyn CVS repository).

3.3.4 Dataset

The Mylyn project consists of 2275 bug issues containing 3272 trace files. After preprocessing the traces and filtering out noise, 2357 file-level and 2174 method-level transactions were identified. Table 3.4 provides information about the file and method levels of interaction transactions for the Mylyn project. There were more file-level interaction transactions than method-level interaction transactions. This difference may be due to the fact that Mylyn propagates lower-level interaction events into their parents. The Mylyn project contained 5093 revision histories. Out of 5093 change sets, 3727 revisions contained a change to at least one Java file and 2058 revisions contained a change to at least one Java method. About 3572 (96%) of the file-level changes and 1947 (95%) of the method-level changes were associated with issues.

Table 3.4: Mylyn project interaction and commit histories from June 18, 2007 to July 01, 2011.

System   Interaction (3272 traces)      Commit (5093 revisions)
         File         Method            File         Method
Mylyn    2357         2174              3727         2058

3.3.5 Training and Testing Sets

Both interaction and commit datasets were split into two groups: training and testing sets. The training sets were used to train our machine learning technique, and the testing sets were used to measure the effectiveness of three different models for impact analysis. We used the first 90% of the bugs for the training set and the next 10% of the bugs for the testing set. We had three different models and two levels of granularity. Therefore, we had a total of 6 training sets.
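As a small illustration of this split, the sketch below assumes the bug reports are already ordered chronologically; the IDs shown are hypothetical.

```python
def chronological_split(bug_ids, train_fraction=0.9):
    """Split chronologically ordered bug IDs into training and testing sets."""
    cut = int(len(bug_ids) * train_fraction)
    return bug_ids[:cut], bug_ids[cut:]

# Hypothetical, date-ordered bug IDs.
bugs_by_date = [201151, 201160, 201188, 201210, 201255,
                201301, 201340, 201388, 201402, 201455]
train_bugs, test_bugs = chronological_split(bugs_by_date)
print(len(train_bugs), len(test_bugs))  # 9 1
```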

3.3.6 Performance Metrics

To evaluate the accuracy of the three models, we used two popular measures: precision and recall. Suppose that there are m bug reports. For each bug report b_i, let the set of actual entities that were changed be F_i. We recommend the top K entities E_i for b_i with each of the approaches. Recall@K and Precision@K for the m bug reports are given by:

$$ Recall@K = \frac{1}{m} \sum_{i=1}^{m} \frac{|E_i \cap F_i|}{|F_i|} \qquad (3.4) $$

$$ Precision@K = \frac{1}{m} \sum_{i=1}^{m} \frac{|E_i \cap F_i|}{|E_i|} \qquad (3.5) $$

These metrics were computed for recommendation lists of entities with different sizes (i.e., K=10, K=20).
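A minimal sketch of how Equations (3.4) and (3.5) can be computed is shown below; the two example recommendation sets and gold sets are hypothetical.

```python
def recall_precision_at_k(recommended, actual):
    """Average Recall@K and Precision@K over m bug reports.

    recommended: list of top-K entity sets E_i, one per bug report.
    actual:      list of gold-set entity sets F_i, one per bug report.
    """
    m = len(recommended)
    recall = sum(len(E & F) / len(F) for E, F in zip(recommended, actual)) / m
    precision = sum(len(E & F) / len(E) for E, F in zip(recommended, actual)) / m
    return recall, precision

# Hypothetical example with two bug reports and K = 3.
E = [{"A.java", "B.java", "C.java"}, {"D.java", "E.java", "F.java"}]
F = [{"A.java", "X.java"},           {"D.java", "E.java"}]
print(recall_precision_at_k(E, F))  # (0.75, 0.5)
```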

3.3.7 Hypotheses Testing

We derived testable hypotheses to evaluate our research questions. We only list the null hypotheses because one can easily derive the alternative hypotheses from them. SSD indicates statistically significant difference.

H01: There is no SSD between the precision/recall values of InComIA and ComIA for the file-level.

H02: There is no SSD between the precision/recall values of InComIA and ComIA for the method-level.

H03: There is no SSD between the precision/recall values of ComIA and SIA for the file-level.

H04: There is no SSD between the precision/recall values of ComIA and SIA for the method-level.

H05: There is no SSD between the precision/recall values of InComIA and SIA for the file-level.

H06: There is no SSD between the precision/recall values of InComIA and SIA for the method-level.

We performed the analysis of variance (parametric ANOVA) test with α = 0.05 to validate whether there is a statistically significant difference between the models. Table 3.5 is a heat-map summarizing the hypotheses test results across all three different models for two values of K. Significant difference: cells colored black indicate that it exists for both the method and file levels; dark-gray indicates it exists for only the file level; light-gray indicates it exists for only the method level; white indicates there is none for both levels.

Table 3.5: A heat-map summarizing hypotheses test results across all three models for two different values of K (columns: Precision and Recall at K@10 and K@20; rows: H01-H06; the color-coded cells of the original heat-map are not reproduced here).

3.3.8 Case Study Results

In this section, we compare the results of three different models: ComIA, InComIA, and SIA. For each change request in the benchmark, we parametrized three different models to predict the top ten and top twenty relevant entities. These recommendations were compared with the actual entities that were changed to fix the considered change request in order to compute the precision and recall values. Table 3.6 compares recall@10 and recall@20 at both the file and method levels of granularity. As expected, the recall value generally increases with the increase in the K value for each model, and the precision value decreases with the increase in the K value for each model. For example, for the InComIA model at the file level granularity, the recall@10 and recall@20 values are 0.26 and 0.32, and the precision@10 and precision@20 values are 0.15 and 0.12.

To answer research question RQ1, we compare the recall and precision values of the InComIA and ComIA models for different values of K. We computed the precision gain of InComIA over ComIA using the formula:

$$ GainP@k_{InComIA-ComIA} = \frac{prec@k_{InComIA} - prec@k_{ComIA}}{prec@k_{ComIA}} \times 100 \qquad (3.6) $$

The two GainP InComIA-ComIA columns in Table 3.6 show the precision gain of InComIA over ComIA for different K values at both the file and method levels. As can be seen, InComIA clearly outperforms ComIA at both the file and method levels for all K values. The precision gain at the file level ranges from 33.33% to 36%, and at the method level from 28.57% to 75%. Similarly, we computed the recall gain of InComIA over ComIA using the formula:

$$ GainR@k_{InComIA-ComIA} = \frac{rec@k_{InComIA} - rec@k_{ComIA}}{rec@k_{ComIA}} \times 100 \qquad (3.7) $$
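As a worked check of Formula (3.7), using the file-level Recall@10 values reported in Table 3.6 (0.26 for InComIA and 0.18 for ComIA):

$$ GainR@10_{InComIA-ComIA} = \frac{0.26 - 0.18}{0.18} \times 100 \approx 44\%, $$

which matches the corresponding entry in Table 3.6 up to rounding of the reported recall values.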

The two GainR InComIA-ComIA columns in Table 3.6 show the recall gain of InComIA over ComIA for the different K values at both the file and method levels. As can be seen, InComIA clearly outperforms ComIA at both the file and method levels for all K values. The recall gain at the file level ranges from 28% to 44%, and at the method level from 58.82% to 80%. In summary, the overall results suggest that InComIA would generally perform better than ComIA in terms of recall and precision.

To answer research question RQ2, we compare the recall and precision values of InComIA and ComIA with those of SIA for different values of K. We computed the precision gain of InComIA over SIA and the precision gain of ComIA over SIA, which are computed similarly with formula (3.6). The precision gain at the file level for ComIA over SIA ranges from 120% to 125%, and at the method level it ranges from 100% to 250%. Therefore, ComIA outperforms SIA at both the file and method levels for all values of K. The precision gain at the file level for InComIA over SIA is 200%, and at the method level it ranges from 250% to 350%. Therefore, InComIA outperforms SIA at both the file and method levels in terms of precision.

The recall gain of InComIA and ComIA over SIA is computed similarly with formula (3.7). From Table 3.6, the two columns labelled GainR ComIA-SIA show that at both the file and method levels ComIA outperforms SIA for all values of K. The recall gain for ComIA over SIA ranges from 108.33% to 125% at the file level and from 150% to 183% at the method level. The recall gain for InComIA over SIA ranges from 166.66% to 225% at the file level and is 350% at the method level. Therefore, InComIA outperforms SIA at both the file and method levels for recall. In summary, the overall results suggest that both InComIA and ComIA would generally perform better than SIA in terms of recall and precision.

Table 3.6: Recall@10 and @20 and Precision@10 and @20 of the three models InComIA, ComIA, and SIA

Recall@10
Granularity   InComIA   ComIA   SIA    GainR InComIA-ComIA   GainR ComIA-SIA   GainR InComIA-SIA
File          0.26      0.18    0.08   44.00%                125.00%           225.00%
Method        0.18      0.10    0.04   80.00%                150.00%           350.00%

Precision@10
Granularity   InComIA   ComIA   SIA    GainP InComIA-ComIA   GainP ComIA-SIA   GainP InComIA-SIA
File          0.15      0.11    0.05   36.00%                120.00%           200.00%
Method        0.09      0.07    0.02   28.57%                250.00%           350.00%

Recall@20
Granularity   InComIA   ComIA   SIA    GainR InComIA-ComIA   GainR ComIA-SIA   GainR InComIA-SIA
File          0.32      0.25    0.12   28.00%                108.33%           166.66%
Method        0.27      0.17    0.06   58.82%                183.00%           350.00%

Precision@20
Granularity   InComIA   ComIA   SIA    GainP InComIA-ComIA   GainP ComIA-SIA   GainP InComIA-SIA
File          0.12      0.09    0.04   33.33%                125.00%           200.00%
Method        0.07      0.04    0.02   75.00%                100.00%           250.00%

To test our hypotheses, we applied the one-way ANOVA test to the recall and precision values of all three approaches with different values of K at both the file and method levels. From Table 3.5, we can see that there is a statistically significant difference between the precision and recall values of InComIA and ComIA at the file level for the two values of K, so we reject H01. However, for the precision value at K@10 at the method level there is no statistically significant difference, so we accept H02. For ComIA and SIA there is a statistically significant difference for the precision and recall values at the file level. However, for the precision value at K@20 at the method level there is no statistically significant difference. Therefore we reject H03 and accept H04. Results of the ANOVA test show a statistically significant difference between the precision and recall values of InComIA and SIA at both the file and method levels, so we reject both H05 and H06. In summary, the overall results from Table 3.5 suggest that InComIA would generally perform better than ComIA and SIA in terms of recall and precision.

3.4 Threats to Validity

We discuss internal, construct, and external threats to validity of the results of our empirical study.

Incomplete or Missing Interaction History: Although a common period was considered for extracting the interaction and commit datasets in the Mylyn dataset, the number of commit transactions is significantly higher than the number of interaction transactions. This difference may not be the result of a single task getting defined for multiple commits, because there are many cases in which committed files were never part of one of the corresponding interactions.

CVS to SVN Conversion: We do not know the error rate of CVS2SVN when grouping individual CVS files. It may erroneously split a commit into multiple commits, or group multiple commits into one. There are 1366 more commits than the number of interaction traces for the same period. This difference could be due to errors from CVS2SVN.

Explicit Bug ID Linkage: We considered interactions and commits to be related if there was an explicit bug id mentioned in them. Implicit relationships were not considered.

Training and Testing Set Split: We considered only a 90%:10% split between training and testing sets. It is possible that a different split point could produce different results.

Single Period of History: We considered only the history between June 18, 2007 and July 01, 2011. It is possible that this history is not reflective of the optimum results for all the models. A different history period might produce different results in terms of their relative performance.

Accuracy and Practicality: The accuracies of the two standalone techniques, although low enough in certain cases to raise a practicality concern, are comparable to other previous results (Zimmermann et al. 2005). Our work shows how to improve accuracy by forming effective combinations.

Only One System Considered: Due to the lack of adequate Mylyn interaction histories for open source projects, our validation study was performed only on a single system written in Java. It was the one with the largest available dataset within the Eclipse Foundation. It had over 2600 fixed bug reports that contained at least one interaction trace attachment. The second and third largest projects (the Eclipse Platform and Modeling) had about 700 and 450 such bugs. Nonetheless, this fact may limit the generality of our results.

CHAPTER 4

Change Impact Analysis (RevIA)

Similar to the previous chapter, this chapter presents an approach, called RevIA, to perform impact analysis (IA) of an incoming change request on source code. This approach uses the code review comments provided by developers in the code review phase as additional textual information to form the corpus. As we mentioned in Chapter 2, the code review activity of developers is an example of micro-events. Information retrieval, machine learning, and lightweight source code analysis techniques were employed to form a corpus from the source code entities that were previously changed in the resolution of past change requests. Additionally, the corpus was augmented with the review comments written by developers during the review phase of previously submitted patches. These patches are the result of the changes implemented in the source code to resolve the change requests. Given a textual description of a change request, this corpus is queried to obtain a ranked list of relevant source code entities that are most likely change prone. Such an approach, which combines the code review comments with the textual information from the source code entities for IA at the change request level, was not previously investigated. An empirical study on 1617 code reviews, 14691 review comments, and 4000 source code files from Mylyn, an open source task management tool, was conducted. The results show that the approach based on code review comments outperforms the approach based on only source code textual information.

4.1 Introduction

We introduced the software change impact analysis (IA) problem in the previous chapter (Chapter 3). The solution we presented in Chapter 3 for the IA task was a combined approach based on the interaction and commit activities of developers. In this chapter we present a new approach called RevIA. The novelty of RevIA is the use of code review comments from the code review process (a micro-evolution repository) to create the corpus. All the review comments written for each source code file are added to the corpus. As we described earlier about the review process, developers make changes in their local git repositories and then submit these changes as a new patch for code review. Reviewers then inspect the code change through the code review tool (a web page in the case of Gerrit) and provide feedback in the form of review comments to the patch owner. Figure 4.1 shows an example of a review comment written by Sam Davis for the file TaskReviewRelationshipListener.java in the Mylyn project. Review comments are written for each source code entity within a patch, i.e., each source code entity that is changed to fix the bug. Hence, these review comments provide textual information related to both the source code and the bug itself. The review comments are a mechanism that reviewers use to express their feedback and communicate with the owner and other peer reviewers of a code change. That is, these comments act as a medium for technical discussion about the code fix and its quality. They also play a role in transferring knowledge among developers, and the latest information related to the source code can be retrieved from these review comments. Therefore, the code review repository holds a wide range of information that is instrumental in finding solutions for various problems such as Impact Analysis (IA). RevIA combines these review comments with the textual information from the source code in order to form the corpus.

Figure 4.1: A snippet of a code review comment in Gerrit written by Sam Davis for the file TaskReviewRelationshipListener.java in the Mylyn project

To evaluate the accuracy of our technique, we conducted an empirical study on the open source system Mylyn. Precision, recall, and mean average precision metric values on a number of bug reports sampled from this system for IA are presented; that is, we measure how effective our approach is at recommending the actual source code entities that were changed to fix these bugs. Additionally, we empirically compared our approach to another approach that is based on only textual information from the source code.

The results show that adding the review comments to the corpus increases the performance. Lowest recall, precision, and mean average precision gains of 67%, 100%, and 50%, and highest recall, precision, and mean average precision gains of 126%, 166%, and 260%, were recorded.

Our work makes the following noteworthy contributions in the context of performing IA on source code due to an incoming change request:

1. To the best of our knowledge, our approach is the first to integrate the code review activity of developers with the source-code textual information.

2. We performed a comparative study with an approach that makes an exclusive use of source-code textual information.

The rest of the chapter is organized as follows: Our RevIA approach is discussed in Section 4.2. The empirical study on the Mylyn open-source project and its results are presented in Section 4.3. Threats to validity are listed and analyzed in Section 4.4.

4.2 The RevIA Approach

Our RevIA approach to IA of an incoming change request on source code is similar to the InComIA approach presented in the previous chapter, and it consists of the following steps:

• The past change requests (e.g., bug reports) from software repositories are analyzed, and all the source code entities that were committed are extracted.

• The source code of each unique entity, from each revision (i.e., in Subversion vocabulary) in which it was committed in the past, is parsed using a developer-defined granularity level. For each entity, comments, identifiers, and expressions are then extracted from the parsed source code.

• For each source code entity, the relevant code review comments are extracted from the Gerrit repository and parsed. A corpus is created such that each source code entity has a corresponding document (i.e., in IR vocabulary) in it.

• Given a change request description, a ranked list of relevant source code entities (e.g., files and methods) is recommended for IA based on the K-Nearest Neighbor algorithm and cosine similarity.

The first, second, and fourth steps are explained in detail in the previous chapter. Below we explain the third step.

4.2.1 Extracting Code Review Comments and Creating a Corpus

In this section we briefly explain how we extract the code review comments from Gerrit.

Gerrit: To collect code review data for Mylyn, we reverse engineered the Gerrit JSON API and queried the Gerrit servers for data regarding the reviews of each project. Gerrit works by initially sending a web page skeleton and some JavaScript to the browser. The JavaScript then makes a number of web requests back to the Gerrit server and requests information about code reviews, which is returned in JSON format. The Gerrit API provides an interface to JSON-formatted review data 1. This JSON data must still be parsed [73]. We also need to determine which fields in the displayed web page correspond to which fields within the JSON. The JSON response is fairly complex, deep, and redundant. After reading the "REST API/Query Changes" section of the Gerrit Code Review documentation, we found that there is a GET command ('GET /changes') 2 that can be used for querying the changes submitted to Gerrit. A query string must be provided via the q parameter, and the n parameter can be used to limit the returned results. Each review in Gerrit has a unique number that can be used to query the information related to that review. Not all of the data fields in the JSON response meet our needs; we filter out the fields of interest that are used in our study.
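To make this step concrete, the following is a minimal sketch (not the exact harvesting scripts used in this work) of querying Gerrit's 'GET /changes' endpoint with the q and n parameters and stripping the JSON prefix before parsing; the Eclipse Gerrit base URL and the example query string are assumptions.

# Minimal sketch of querying Gerrit's "GET /changes/" endpoint; the base URL
# and the example query below are assumptions made for illustration only.
import json
import requests

GERRIT_BASE = "https://git.eclipse.org/r"

def query_changes(query, limit=25):
    # 'q' carries the query string; 'n' limits the number of returned results.
    resp = requests.get(f"{GERRIT_BASE}/changes/", params={"q": query, "n": limit})
    resp.raise_for_status()
    body = resp.text
    # Gerrit prepends ")]}'" to JSON responses to prevent cross-site script inclusion.
    if body.startswith(")]}'"):
        body = body.split("\n", 1)[1]
    return json.loads(body)

if __name__ == "__main__":
    # Hypothetical query: merged changes whose description mentions mylyn.
    for change in query_changes("status:merged mylyn", limit=5):
        print(change.get("_number"), change.get("subject"))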

1https://gerrit-review.googlesource.com/Documentation/rest-api.html 2https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html#list-changes

Creating a corpus from source code and code review comments: The corpus is created in a similar way to what we explained in the previous chapter (Section 3.2.5).
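The difference from the previous chapter is the augmentation of each file's document with its review comments. As an illustrative sketch only (the dictionaries below are hypothetical placeholders for the parsing and Gerrit-extraction outputs), the augmentation could look as follows.

# Illustrative sketch of the RevIA corpus augmentation: each source file becomes
# one document containing its extracted source-code terms plus all review
# comments written for that file. The input dictionaries are hypothetical.
def build_revia_corpus(source_terms, review_comments):
    """source_terms:    {file_path: "identifiers comments expressions ..."}
    review_comments: {file_path: ["review comment text", ...]}"""
    corpus = {}
    for path, code_text in source_terms.items():
        comments_text = " ".join(review_comments.get(path, []))
        corpus[path] = (code_text + " " + comments_text).strip()
    return corpus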

Indexing and Querying the Corpus: In RevIA, we use techniques from natural language processing in order to locate the textually relevant files based on comments, identifiers, expressions, and code review comments. This step is explained in detail in the previous chapter (Section 3.2.6).

4.2.2 Re-Ranking the Recommended Source Code Files with Issue Change Proneness

As we discussed in the previous chapter, there is a one-to-many relationship between an IR query, i.e., the description of a bug bi, and source code files. Given a user-provided cutoff point K, we get the K top-ranked source code files f1, f2, ..., fk for the bug bi. We use the change-proneness measure to re-rank these top K files. Issue Change Proneness (ICP) of a source code entity is a measure of its change affinity as determined from its previous change (commit) history. A straightforward measure of the ICP of a source code entity is the number of commits in the commit history that contain that source code entity [46]. The rationale behind this choice for re-ranking the set of recommendations is based on the premise that the larger the number of changes related to past requests (e.g., bug fixes) in which a relevant source code file is involved, the higher the likelihood of the same file requiring changes due to a given (new) change request.

For each bug bi in our benchmark, we first extract the date on which the bug was reported. Then we extract all the commits from the code change history that were submitted within m days before the bug report date (m is a configurable parameter). For each of the relevant source code files f1, f2, ..., fk, we then calculate their ICPs from the extracted commits. The file with the highest ICP is ranked first, the one with the lowest ICP is ranked last, and so on. In our study we consider three different values for m.

• m=0 means considering the entire code change history.

• m=20 means considering all the commits from the code change history that were submitted within the 20 days before the bug report date.

• m=30 means considering all the commits from the code change history that were submitted within the 30 days before the bug report date.

We first run RevIA and SIA without using any re-ranking mechanism; then we apply these three different re-ranking mechanisms and compare the results.
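As a sketch of this re-ranking step (under the assumption that the commit history is available as date/file-set pairs; the data structures are illustrative, not the exact implementation), the ICP computation and sorting could be expressed as follows.

# Sketch of the ICP re-ranking; m = 0 means the entire code change history is
# considered, as in the list above. Commits are assumed to be (date, files) pairs.
from datetime import timedelta

def icp(file_path, commits, bug_date, m):
    """Number of commits containing file_path within the m days before bug_date."""
    window_start = None if m == 0 else bug_date - timedelta(days=m)
    return sum(
        1
        for commit_date, files in commits
        if file_path in files
        and commit_date <= bug_date
        and (window_start is None or commit_date >= window_start)
    )

def rerank_by_icp(top_k_files, commits, bug_date, m):
    # The file with the highest ICP is ranked first.
    return sorted(top_k_files, key=lambda f: icp(f, commits, bug_date, m), reverse=True)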

4.3 Empirical Evaluation

The main purpose of this case study was to investigate how well our approach RevIA for IA performed in predicting the relevant entities for fixing a new incoming change request (e.g., a bug). We compared RevIA with a previous approach for IA that indexes all the source code in a single snapshot, denoted here as SIA, which is explained in detail in the previous chapter (Section 3.3). This comparison permits the assessment of the impact of including code review comments in RevIA.

4.3.1 Research Questions

We addressed the following research questions (RQs) in our case study:

RQ1. How does the IA approach RevIA compare with the previous model SIA in predicting the top K relevant entities for incoming change requests?

RQ2. How do the re-ranking mechanisms affect the performance results?

4.3.2 Experiment Setup

For RevIA, we considered all of the bugs that have at least one associated ID either in the commit messages or in the code review repository. We used a dimensionality of 500, which is recommended for a large number of terms [28]. The experiment was run for two K values: K=10 and K=20. For SIA, we considered all of the bugs that have an associated ID in a commit message. Git repository commit logs were used to aid in this process. For example, keywords (such as the bug ID) in the commit messages/logs were used as starting points to determine whether the commits were in fact associated with the mentioned change request in the issue tracking system. Similarly, we obtained the associated revision numbers for each bug ID.
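As an illustration of using bug IDs as starting points when scanning commit logs, a small sketch is shown below; the message patterns matched here are assumptions, and matches were still confirmed manually as described above.

# Hypothetical sketch of scanning commit messages for a bug ID; the patterns
# (e.g., "bug 313712" or "#313712") are assumptions used for illustration.
import re

def candidate_commits(bug_id, commit_log):
    """commit_log: iterable of (revision, commit_message) pairs."""
    pattern = re.compile(rf"(?:bug\s*#?\s*|#){bug_id}\b", re.IGNORECASE)
    return [rev for rev, message in commit_log if pattern.search(message)]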

4.3.3 Subject Software System

We focused on Mylyn in this study. Mylyn contains about 3 years of code review data in Gerrit and is an Eclipse Foundation project whose source code is written in Java. Its code review history in Gerrit started in February 2012. The Mylyn project consists of 1617 code reviews uploaded to Gerrit that include at least one Java file 3. Out of these 1617 reviews, 1549 have the status "MERGED" or "ABANDONED", and we consider only these reviews in our study. 78% of the reviews in Mylyn include patches that were eventually merged into the repository, and 22% of the reviews were abandoned. As with the Eclipse Platform project, noise was removed from the review comments. After removing the noise, a total of 14691 review comments remained for the Mylyn project.

4.3.4 Training and Testing Sets

The dataset is split into two groups: training and testing sets. The training sets were used to train our machine learning technique, and the testing sets were used to measure the effectiveness of two different models for impact analysis. We used the first 90% of the bugs for the training set and the next 10% of the bugs for the testing set. We considered 100 bugs in our benchmark.

4.3.5 Performance Metrics

To evaluate the accuracy of our approach, we used two popular measures: precision and recall. Suppose that there are m bug reports. For each bug report bi, let the set of actual entities that were changed be Fi. We recommend the top K entities Ei for bi with each of the approaches. Recall@K and precision@K for the m bug reports are given by:

Recall@K = \frac{1}{m} \sum_{i=1}^{m} \frac{|E_i \cap F_i|}{|F_i|}    (4.1)

Precision@K = \frac{1}{m} \sum_{i=1}^{m} \frac{|E_i \cap F_i|}{|E_i|}    (4.2)

MAP@K = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{n_i} \sum_{z=1}^{n_i} Precision(R_{iz}), \quad n_i = |E_i \cap F_i|    (4.3)

3https://git.eclipse.org/r/#/q/mylyn,n,z

These metrics were computed for recommendation lists of entities with different sizes (i.e., K=10 and K=20). We also calculated Mean Average Precision (MAP) using Equation 4.3. MAP is the mean of the average precision of each recommendation set for each bug report in our benchmark. For each bug report bi we have a set of recommended top K entities Ei = e1, e2, ..., ez, and Riz is the set of ranked retrieval results from the top result until we reach the correct entities.

4.3.6 Hypotheses Testing

We derived testable hypotheses to evaluate our research questions. We only list the null hypotheses because one can easily derive the alternative hypotheses from them. SSD indicates a statistically significant difference.

H01: There is no SSD between the precision, recall, and MAP values of RevIA and SIA.

H02: There is no SSD between the precision, recall, and MAP values of results produced by running RevIA with and without using re-ranking mechanisms.

4.3.7 Case Study Results

In this section, we compare the results of RevIA and SIA. For each change request in the benchmark, we parameterized these two models to predict the top ten and top twenty relevant entities. These recommendations were compared with the actual entities that were changed to fix the considered change request in order to compute the precision, recall, and MAP values.

Figure 4.2 shows the recall, precision, mean average precision, and calculated metric gains of the two models RevIA and SIA, with and without re-ranking mechanisms, for K=10 and K=20. Based on what we explained in Section 4.2.2, we have four different types of ranking mechanisms. Hence, Figure 4.2 presents four different groups of results, which are explained as follows.

• No re-ranking presents the results of RevIA and SIA without performing any re-ranking mechanism.

• Re-ranking based on the entire history presents the results of RevIA and SIA after re-ranking the recommended source code files based on the ICP calculated from the entire code change history.

• Re-ranking based on the last 20 days before the bug report date presents the results of RevIA and SIA after re-ranking the recommended source code files for each bug report in the benchmark. Re-ranking is done based on the ICP calculated from all the commits in the code change history that were submitted within the 20 days before the bug report date.

• Re-ranking based on the last 30 days before the bug report date presents the results of RevIA and SIA after re-ranking the recommended source code files for each bug report in the benchmark. Re-ranking is done based on the ICP calculated from all the commits in the code change history that were submitted within the 30 days before the bug report date.

For each of these four groups we calculated recall, precision, and MAP for each test case in the benchmark for both RevIA and SIA and then we calculated the average of the results.

As expected, the recall value generally increases as K increases for each model, the precision value decreases as K increases for each model, and MAP remains consistent as K increases for each model. The re-ranking mechanisms explained in Section 4.2.2 are applied in order to improve the precision and MAP results by changing the ranking of the recommendations.

To answer research question RQ1, we compare the recall, precision, and MAP values of the RevIA and SIA models for different values of K. We computed the metric gain of RevIA over SIA (where X stands for precision, recall, or MAP) using the following formula:

Gain_{X@K_{RevIA-SIA}} = \frac{X@K_{RevIA} - X@K_{SIA}}{X@K_{SIA}} \times 100    (4.4)

As can be seen, RevIA clearly outperforms SIA in all four groups of results for both K=10 and K=20.

The Gain% column in Figure 4.2 presents the calculated metric gain of RevIA over SIA. The recall gain ranges from 67.74% to 126.67%, and the precision gain ranges from 100% to 166.67%. The MAP gain ranges from 50% to 260%. Although the precision results for both RevIA and SIA are quite low, RevIA still improves them by at least 100%.

No Re-ranking
Metric                    RevIA (K=20)  SIA (K=20)  SSD  Gain %   RevIA (K=10)  SIA (K=10)  SSD  Gain %
Recall                    0.39          0.22        Yes  77.27    0.34          0.15        Yes  126.67
Precision                 0.04          0.02        Yes  100.00   0.08          0.03        Yes  166.67
Mean Average Precision    0.18          0.05        Yes  260.00   0.18          0.05        Yes  260.00

Re-ranking based on the entire history
Metric                    RevIA (K=20)  SIA (K=20)  SSD  Gain %   RevIA (K=10)  SIA (K=10)  SSD  Gain %
Recall                    0.53          0.31        Yes  70.97    0.41          0.21        Yes  95.24
Precision                 0.06          0.03        Yes  100.00   0.10          0.04        Yes  150.00
Mean Average Precision    0.15          0.10        No   50.00    0.14          0.09        No   55.56

Re-ranking based on the last 20 days before the bug report date
Metric                    RevIA (K=20)  SIA (K=20)  SSD  Gain %   RevIA (K=10)  SIA (K=10)  SSD  Gain %
Recall                    0.51          0.30        Yes  70.00    0.47          0.28        Yes  67.86
Precision                 0.06          0.03        Yes  100.00   0.12          0.05        Yes  140.00
Mean Average Precision    0.31          0.20        Yes  55.00    0.30          0.20        Yes  50.00

Re-ranking based on the last 30 days before the bug report date
Metric                    RevIA (K=20)  SIA (K=20)  SSD  Gain %   RevIA (K=10)  SIA (K=10)  SSD  Gain %
Recall                    0.52          0.31        Yes  67.74    0.50          0.29        Yes  72.41
Precision                 0.06          0.03        Yes  100.00   0.12          0.06        Yes  100.00
Mean Average Precision    0.29          0.19        Yes  52.63    0.29          0.19        Yes  52.63

Figure 4.2: Recall, Precision, Mean Average Precision, and the calculated metric gains of two models RevIA and SIA with and without re-ranking mechanisms for K=10 and K=20.

To answer research question RQ2, we compare the recall, precision, and MAP values of RevIA running with and without the re-ranking mechanisms. The results in Figure 4.2 show that re-ranking the recommended source code files based on their calculated ICP improves the recall and precision in comparison with not using re-ranking. The re-ranking mechanisms for RevIA achieve a maximum recall gain of 47%, a maximum precision gain of 50%, and a maximum MAP gain of 66%. There is not much difference between the three types of re-ranking mechanisms in terms of precision and recall.

In summary, the overall results suggest that RevIA generally performs better than SIA. Similarly, using the re-ranking techniques improves the results.

To test our hypotheses, we applied the One-Way ANOVA test to the recall, precision, and MAP values of the RevIA and SIA approaches with different values of K. From Figure 4.2, we can see that there is a statistically significant difference between the recall, precision, and MAP values of RevIA and SIA for both K=20 and K=10, so we reject H01. Similarly, there is a statistically significant difference between the recall, precision, and MAP values of the results produced by running RevIA with and without the re-ranking mechanisms, so we reject H02.

4.4 Threats to Validity

The threats to the validity of the results of our empirical study are similar to those explained in the previous chapter (Section 3.4) and differ in only one case.

Code Review Comments' Noise: During code review practice, several review comments are generated and automatically submitted by tools such as Hudson for the Mylyn projects. Similarly, there are review comments submitted by reviewers that include only the review score and no other meaningful information. We considered these review comments as noise and removed them from our dataset. There could be other sources of noise in review comments that are not treated as noise in our approach. This could affect the accuracy of the results.

CHAPTER 5

Developer Recommendation (iHDev)

This chapter presents an approach, namely iHDev, to recommend developers who are most likely to implement incoming change requests [111]. The basic premise of iHDev is that the developers who interacted with the source code relevant to a given change request are most likely to best assist with its resolution. A machine-learning technique is first used to locate source code entities relevant to the textual description of a given change request. iHDev then mines interaction trails (i.e., Mylyn sessions) associated with these source code entities to recommend a ranked list of developers. iHDev integrates the interaction trails (micro-events) in a unique way to perform its task, which was not investigated previously.

An empirical study on the open source systems Mylyn and Eclipse Platform was conducted to assess the effectiveness of iHDev. A number of change requests were used in the evaluation benchmark. Recall for the top one to five recommended developers and Mean Reciprocal Rank (MRR) values are reported. Furthermore, a comparative study with two previous approaches that use commit histories and/or source code authorship information for developer recommendation was performed. The results show that iHDev provides a recall gain of up to 127.27% with equivalent or improved MRR values (by up to 112.5%).

5.1 Introduction

Software change requests and their resolution are an integral part of software maintenance and evolution. It is not uncommon in open source projects to receive tens of change requests daily that need to be promptly resolved [9]. Issue triage is a crucial activity in addressing change requests in an effective manner (e.g., within time, priority, and quality factors). The task of automatically assigning issues or change requests to the developer(s) who are most likely to resolve them has been studied under the umbrella of bug or issue triaging. A number of approaches to address this task have been presented in the literature [9, 53, 65, 94, 107]. The fundamental idea underlying most triage approaches is to identify the expertise and interests of developers, infer the concern or component that must be addressed for a task, and then match developers to tasks. Approaches typically operate on the information available from software repositories (e.g., models trained from past bugs/issues or source code changes), the source code authorship [65], or their combinations [46], under the rationale that if a developer frequently commits changes to, or fixes bugs in, particular parts of a system, they have knowledge of that area and can competently make changes to it.

We agree with the fundamental concept that historical records of developers' activity yield insight into their knowledge and ability, but we also posit that additional traces of developer activity, such as interactions with source code (beyond those leading to commits) when resolving an issue, can provide a more comprehensive picture of their expertise. The records of developers' interactions with code in resolving an issue remain largely untapped in solving this problem.

In this chapter, we show that exploiting these additional sources of such interactions leads to better task assignment than relying on commit and bug tracking histories. We present a new approach, namely iHDev, for assigning incoming change requests to appropriate developers. iHDev is centered on the developers' interactions with the source code entities that were involved in the resolution of previous change requests. Developers may interact with source code entities within an Integrated Development Environment (IDE) in ways that may or may not be eventually committed to the code repository [15, 112]. These interactions (e.g., navigate, view, and modify) could have contributed to locating and/or verifying the entities that were changed due to a change request. Therefore, it suggests that the interacting developers are knowledgeable in those entities. At face value, it could be conjectured that commits (i.e., changed entities) are a subset of interactions (i.e., viewed and changed entities). Our previous investigation found that a superset or subset relationship does not always hold [15]. Thus, the interaction trails have the potential to offer aspects that may not be embodied in commit histories. Tools such as Mylyn capture and store such interaction trails (histories) [74].

iHDev takes the textual description of an incoming change request (e.g., a short bug description) and locates relevant entities (e.g., files) from a source code snapshot. A machine learning technique, the K-Nearest Neighbor (KNN) algorithm with cosine similarity, is used in this step. The interaction histories of these entities are then mined to produce a ranked list of candidate developers to resolve the change request. The basic premise of our approach is that the developers who interacted with the source code relevant to a given change request in the past are most likely to best assist with its resolution. In a nutshell, our approach favors interaction Histories over other types of past information to recommend Developers; hence, the name iHDev. It neither needs to mine for textually similar past change requests nor source code change (commit) histories. It only needs the developer-interaction sessions from the issue repository of a system, which are typically attached to issue/bug reports (e.g., in the Mylyn prescribed XML format).

To evaluate the accuracy of our technique, we conducted an empirical study on two open source systems, Mylyn and Eclipse Platform. Recall and Mean Reciprocal Rank (MRR) metric values of the developer recommendations on a number of bug reports sampled from these systems are presented; that is, we measure how effective our iHDev approach is at recommending the actual developer who ended up fixing these bugs. Additionally, our iHDev approach is empirically compared with two other approaches that use commit and/or source code authorship information [65], [46], [53]. The results show that the presented iHDev approach outperformed these baseline competitors. Lowest recall gains of 6.17% and 9.72% were recorded against the two respective approaches, and highest recall gains of 125% and 127.27% were recorded against them. These gains came without incurring any decrease in Mean Reciprocal Rank (MRR) values; rather, iHDev recorded improvements in them. That is, iHDev would typically recommend the correct developers at higher ranks than the subjected competitors.

Our work makes the following noteworthy contributions in the context of recommending relevant developers to resolve incoming change requests:

1. To the best of our knowledge, our iHDev approach is the first to utilize developers' source code interaction histories involved with past change requests.

2. We performed a comparative study with two other approaches that use commit and/or source code authorship information.

The rest of the chapter is organized as follows: Our approach is discussed in Section 5.2. The empirical study on Mylyn and Eclipse Platform, and its results, are presented in Section 5.3. Threats to validity are listed and analyzed in Section 5.4.

5.2 Approach

Our approach iHDev to assign an incoming change request to the appropriate developer(s) consists of the following steps:

1. Locating Relevant Entities to Change Request: We use the K-Nearest Neighbor (KNN) algorithm to locate relevant units of source code (e.g., files and classes) that match the given textual description of a change request or reported issue. The indexed source code release/snapshot is typically between the one in which an issue is reported and before the change request is resolved (e.g., a bug is fixed).

2. Mining Interaction Histories to Recommend Developers: The interaction histories of the units of source code from the above step are then analyzed to recommend a ranked list of developers who are the most experienced and/or have substantial contributions in dealing with those units (e.g., classes). The interaction histories are extracted from the issue-tracking system.

5.2.1 Key Terms and Definition

Interaction: Interaction is the activity of programmers in an IDE during a development session (e.g., editing a file or referencing API documentation).

Tools such as Mylyn have been developed to model programmers' actions in IDEs [74]. Mylyn monitors programmers' activities inside the Eclipse IDE and uses the data to create an Eclipse user interface focused around a task. The Mylyn interaction data consists of traces of interaction histories. Each historical record encapsulates a set of interaction events needed to complete a task (e.g., a bug fix). Once a task is defined and activated, the Mylyn monitor records all the interaction events (the smallest unit of interaction within an IDE) for the active task. For each interaction, the monitor captures about eleven different types of data attributes. The most important of these is the structure handle attribute, which contains a unique identifier for the target element affected by the interaction. For example, the identifier of a Java class contains the names of the package, the file to which the class belongs, and the class. Similarly, the identifier of a Java method contains the names of the package, the file and the class the method belongs to, the method name, and the parameter type(s) of the method.

Trace file: For each active task, Mylyn creates an XML trace file, typically named Mylyn-context.zip or a derivative thereof. A trace file contains the interaction history of a task. This file is typically attached to the project's issue tracking system (e.g., Bugzilla or JIRA). The trace files for the Mylyn project are archived in Bugzilla as attachments to a bug report. For example, for issue #315184 there is one trace file named Mylyn-context.zip1.

Attacher: Each issue/bug could have trace files. A developer submits these trace files to the project's issue tracking system, where they are attached to the associated issue report. We term this developer the attacher. This term helps differentiate from the use of committers and developers in the context of source code repositories. There is no explicit information to distinguish between the attacher and the actual developer who performed the interaction session. We assume that the attacher is the developer who performed the attached interaction session. This issue is similar to the distinction between the developer who performed the changes and the committer who committed them to a source code repository. For example, Steffen Pingel attached the file Mylyn-context.zip for issue #315184. Considering the Mylyn workflow for interactions, we did not expect nor find any instances where the attacher was not the developer who performed the interaction session.

1https://bugs.eclipse.org/bugs/show_bug.cgi?id=315184

Figure 5.1: A snippet of an interaction event recorded by Mylyn for bug issue #330695 with trace ID #221752. File TaskListView.java is edited.

5.2.2 Locating Relevant Entities to Change Request

In our approach, we use techniques from natural language processing and machine learning to locate textually relevant source code files to a given change request for which we need to assign developers. The specific steps are given below:

Creating a corpus from software

The source code of a release, in or before which the change request is resolved, is parsed using a developer-defined granularity level (e.g., file) and documents are extracted. A corpus is created such that each file will have a corresponding document therein. Identifiers, comments, and expressions are extracted from the source code. Each document in the corpus is referred to as docpast.

Preprocessing Step

Each document docpast is preprocessed in two parts: removing stop words and stemming.

Document-Term Representation

Document Indexing: We produce a dictionary from all of the terms in our document and assign a unique integer Id to each term appearing in it.

Term Weighting: We use tf-idf in our approach. The global importance of a term i is based on its inverse document frequency (idf), calculated as the reciprocal of the number of documents that the term appears in (i.e., it is derived from the document frequency of term i in the whole document collection). tf_{i,j} is the term frequency of the term i in the document j. Each document d_j is represented as a vector d_j = (w_{1,j}, ..., w_{i,j}, ..., w_{n,j}), where n is the total number of terms in our document collection and w_{i,j} is the weight of the term i in document j.

Using Change Request

The textual description of a change request for which we want to find all of the relevant files, and eventually to be assigned developer(s), is referred to as docnew. docnew also goes through the preprocessing step and is represented as a document.

K-Nearest-Neighbor

The task of multi-label classification is to predict for each data instance a set of labels that apply to it. Standard classification assigns only one label to each data instance; however, in many settings a data instance can be assigned more than one label. In our context, each data instance (i.e., a bug report) can be assigned multiple labels (i.e., source code files). ML-KNN is a state-of-the-art algorithm in the multi-label classification literature [113]. We employ the ML-KNN search with a user-defined value of K (e.g., the value of 10, as used in previous work [107]) to search the existing corpus (docpast) based on similarities with the new bug description (docnew). This search finds the top K similar files. We consider these top K files to be a set, i.e., they are eventually relevant to the change request regardless of their similarity levels or ranking. Cosine similarity is then used to measure the similarity of the two document vectors:

fsim(doc_{new}, doc_{past}) = \frac{doc_{new} \cdot doc_{past}}{|doc_{new}| |doc_{past}|}    (5.1)

At the conclusion of this step, we have identified the K most relevant source code files to the given change request. Next, we need to mine the candidate developers from the interaction histories of these files.
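As a sketch of this retrieval step, the snippet below uses TF-IDF weighting and cosine similarity to return the top K files for a bug description; scikit-learn is used here purely as an illustrative stand-in for the indexing and ML-KNN search described above, not as the exact implementation.

# Sketch of TF-IDF indexing and cosine-similarity retrieval of the top-K files.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def top_k_files(corpus, bug_description, k=10):
    """corpus: {file_path: preprocessed document text}; returns the K most similar files."""
    paths = list(corpus)
    vectorizer = TfidfVectorizer(stop_words="english")
    doc_matrix = vectorizer.fit_transform(corpus[p] for p in paths)
    query_vec = vectorizer.transform([bug_description])
    scores = cosine_similarity(query_vec, doc_matrix).ravel()
    ranked = sorted(zip(paths, scores), key=lambda pair: pair[1], reverse=True)
    return [path for path, _ in ranked[:k]]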

5.2.3 Mining Interaction Histories to Recommend Developers

The basic premise of iHDev is that the attachers who substantially interacted with specific source code in the past are most likely the best candidates to assist with the issues/bugs associated with it. Our approach uses an interaction log, which was assembled from the source code interactions submitted by attachers to the bug tracking system (e.g., Bugzilla). Interaction log entries include the dimensions attacher, date, and path (e.g., files) involved in an interaction event. Interaction logs are not directly available from issue/bug tracking systems. We engineered the interaction log to simplify the implementation of the mining component of iHDev. A notable side effect is that it provides an analogy to commit logs, which are the basis for several developer-recommendation approaches.

Figure 5.2: A snippet of the bug #315184 interaction log entry from its interaction log file.

Interaction Log: An interaction log is a file that includes the interaction history and the attacher(s) who created the trace file for each bug in the issue tracking system. The specific steps for creating an interaction log are given below:

Extracting Interactions

We first need to identify the bug reports with the mylyn-context.zip attachment(s), because not all bug issues necessarily contain the interaction trace(s). We searched the Bugzilla issue tracking system and included bugs containing at least one mylyn-context.zip attachment.

Determining the list of attachers

All the trace files from the issue-tracking system are automatically downloaded to a user-specified directory. A complete list of attachers for each trace file of every bug is then produced.

Creating Interaction log

The tool then takes the directory that contains the trace files as input and parses each trace file to create an interaction log. For each bug ID, this interaction log includes the attacher (from the list of attachers) and three of the eleven attributes from the trace file (see Figure 5.1): EndDate, Kind, and StructureHandle. There are several different kinds of interaction events (e.g., edit, manipulation, selection, and propagation); however, we consider only the edit interaction events, because these events refer to interactions that resulted in a change to the source code file (even if that change was never committed), an action that often requires some level of knowledge of the source code. Other events, such as navigation or selection, do not imply explicit interaction or expertise and can therefore potentially introduce noise rather than provide additional useful information. Also, it is possible for one bug to have multiple trace files, each with a different attacher. Thus, for each bug in the interaction log file, multiple log entries may exist. Figure 5.2 shows a log entry for bug #315184 in the interaction log, which has only one trace file (and one attacher). For this bug, only one file has an edit interaction event (i.e., "AbstractTaskRepositoryPage.java").

The interaction log includes data such as the bug ID, the attacher, the interacted source code, and the date on which the source code was interacted with to fix the issue (see Figure 5.2 for an example). This data explains who interacted with the source code and when they did so. An attacher may contribute multiple interactions with the same file, or multiple attachers may interact with the same file in different trace files. Therefore, interactions give us an opportunity to analyze both the exclusive and shared contributions of attachers to a source code file.
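A small sketch of turning one Mylyn trace file into interaction-log entries that keep only edit events is shown below; the element name "InteractionEvent" and the exact attribute spellings are assumptions about the trace XML format, and only the three attributes mentioned above are retained.

# Sketch of extracting edit events from a Mylyn trace file into log entries.
import xml.etree.ElementTree as ET

def edit_events(trace_path, bug_id, attacher):
    entries = []
    root = ET.parse(trace_path).getroot()
    for event in root.iter("InteractionEvent"):
        if (event.get("Kind") or "").lower() != "edit":
            continue  # ignore selections, navigations, and other event kinds
        entries.append({
            "bug_id": bug_id,
            "attacher": attacher,
            "file": event.get("StructureHandle"),
            "date": event.get("EndDate"),
        })
    return entries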

Developer Expertise Measures: One measure of an attacher's contribution is the total number of interactions on source code performed in the past [71]. An attacher who contributed a larger number of interactions on a specific part of the source code than another attacher can be considered more knowledgeable about those parts. Another consideration for attacher contributions is the workdays (i.e., activity) involved in the interactions that are attached as a trace file. The activity of a specific attacher is the percentage of their workdays over the total workdays of the system. Here, an attacher's workday is considered a day (calendar date) on which they interacted with at least one part of the source code, because an attacher can have multiple interactions on a given workday. A system's workday is considered a day on which at least one part of the source code is interacted with. A day on which no interactions exist is not considered a workday. These two measures give us two different views on attachers' development styles. Some may perform smaller interaction sessions and submit frequently in a workday (e.g., multiple attachments), while others may do it differently (e.g., a single attachment). The third measure accounts for the recency of these interactions. We used these three measures, which were inspired by our previous work on commits [53], to determine the attachers that are most likely to be experts in a specific source code file, i.e., the attacher-interaction map. The attacher-interaction map AI for the attacher a and file f is given below:

AI(a, f) = \langle I_f, A_f, R_f \rangle

• I_f is the number of interactions that include the file f and are performed by the attacher a.

• A_f is the number of workdays in the activity of the attacher a with interactions that include the file f.

• R_f is the most recent workday in the activity of the attacher a with an interaction that includes the file f.

Similarly, the file-interaction map FI represents the interaction contribution to the file f, and is shown below:

FI(f) = \langle I'_f, A'_f, R'_f \rangle, where

• I'_f is the number of interactions that include the file f.

• A'_f is the total number of workdays in the activity of all attachers that include interactions with the file f.

• R'_f is the most recent workday with an interaction that includes the file f.

The measures I_f, A_f, and R_f are computed from the interaction log. More specifically, the dimensions attacher, date, and path of the log entries are used in the computation. The dimension date is used to derive workdays or calendar days. The dimension attacher is used to derive the attacher information. The dimension path (StructureHandle) is used to derive the file information. The measures I'_f, A'_f, and R'_f are computed similarly. The log entries are readily available in the form of XML, and straightforward XPath queries are formulated to compute the measures. The contribution or expertise factor, termed xFactor, for the attacher a and the file f is computed using the ratios of the attacher-interaction and file-interaction maps. The contribution factor, xFactor, is given below:

xFactor(a, f) = \frac{AI(a, f)}{FI(f)}    (5.2)

xFactor(a, f) =
\begin{cases}
\frac{I_f}{I'_f} + \frac{A_f}{A'_f} + \frac{1}{|R_f - R'_f|} & \text{if } |R_f - R'_f| \neq 0 \\[4pt]
\frac{I_f}{I'_f} + \frac{A_f}{A'_f} + 1 & \text{if } |R_f - R'_f| = 0
\end{cases}    (5.3)

The xFactor score is computed for each of the source code files relevant to the given change request (see Section 5.2.2). According to Equation 5.3, the maximum value of xFactor can be three, because we use three measures, each of which can have a maximum contribution ratio of 1.

Table 5.1: The attachers extracted with iHDev from each of the top ten files relevant to Bug #313712.

File                                      Ranked attachers based on xFactor value
…/TaskEditorNewCommentPart.java           Jingwen 'Owen':1.23, Frank Becker:1.06, Steffen Pingel:0.54, Jacek Jaroczynski:0.19, David Shepherd:0.07, David Green:0.06
…/CopyContextHandler.java                 Shawn Minto:2.00, Steffen Pingel:0.61, Mik Kersten:0.42
…/NewAttachmentWizardDialog.java          Frank Becker:1.40, David Green:0.90, Steffen Pingel:0.60, Mik Kersten:0.30
…/Messages.java                           ....
…/WebBrowserDialog.java                   ....
…/BugzillaResponseDetailDialog.java       Frank Becker:3.00
…/UpdateAttachmentJob.java                Frank Becker:1.94, Robert Elves:1.40
…/TaskAttachmentPropertyTester.java       Philippe Marschall:1.63, Steffen Pingel:1.38
…/TaskEditorRichTextPart.java             Frank Becker:1.11, Jingwen 'Owen' Ou:0.90, Steffen Pingel:0.66, Thomas Ehrnhoefer:0.26, Jacek Jaroczynski:0.12, David Green:0.11, David Shepherd:0.03
…/BugzillaPlanningEditorPart.java         Steffen Pingel:1.85, Robert Elves:1.35
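A sketch of Equation 5.3 is given below, under the assumption that the attacher-interaction map AI(a, f) and the file-interaction map FI(f) are provided as (I, A, R) triples, with the workday R expressed as a comparable day number so that |R_f - R'_f| is a difference in days.

# Sketch of the xFactor score (Equation 5.3).
def xfactor(ai, fi):
    i_a, a_a, r_a = ai        # I_f, A_f, R_f for attacher a on file f
    i_all, a_all, r_all = fi  # I'_f, A'_f, R'_f for file f over all attachers
    recency_gap = abs(r_a - r_all)
    recency_term = 1.0 if recency_gap == 0 else 1.0 / recency_gap
    return i_a / i_all + a_a / a_all + recency_term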

Recommending developers based on xFactor scores: We now describe how the ranked list of developers is obtained from all of the scored attachers of each source code file relevant to a given bug. From Section 5.2.3, there is a one-to-many relationship between the source code files and attachers. That is, each file fi may have multiple attachers; however, it is not necessary for all of the files to have the same number of attachers. For example, the file f1 could have two attachers and the file f2 could have three attachers. Each row in the matrix Da (see Equation 5.4) gives the list of unique attachers for each relevant file fi. Da_{fi} represents the set of attachers, without duplication, for the file fi, where 1 ≤ i ≤ n and n is the number of relevant files. a_{ij} is the j-th attacher in the file fi, which has l unique attachers.

Da = \begin{pmatrix} f_1 & Da_{f_1} \\ f_2 & Da_{f_2} \\ \vdots & \vdots \\ f_n & Da_{f_n} \end{pmatrix}, \quad Da_{f_i} = \{ a_{i1}, a_{i2}, \ldots, a_{il} \}    (5.4)

Although a single file does not have any duplicate attachers, two files may have common attachers. In Equation 5.5, Da_u is the union of all unique attachers from all relevant files.

Da_u = \bigcup_{i=1}^{n} Da_{f_i}    (5.5)

Score(a) = \sum_{i=1}^{n} xFactor(a, f_i)    (5.6)

Each attacher a of a file f has an xFactor score. To obtain the likelihood of the attacher a resolving the given change request, i.e., Score(a), we sum the xFactor scores over the relevant files in which the attacher appears (see Equation 5.6). That is, their overall expertise in all the files is computed collectively.

The Score(a) value is calculated for each unique attacher a in the set Dau.

In Equation 5.7, we have a set of candidate developers. The developers in this set are ranked based on their Score(a) values. Once the developers are ranked in descending order of their Score(a) values, we have a ranked list of candidate developers. By using a cutoff value of m, we recommend the top m candidate developers, i.e., those with the top m Score(a) values, from the ranked list obtained from the set DF.

DF = {(a, Score(a)), ∀a ∈ Dau} (5.7)

This step concludes iHDev and we have the top m candidate developers recommended for the given change request.
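A sketch of this aggregation and cutoff (Equations 5.6-5.7) is shown below; the input mapping of per-file xFactor values is a hypothetical placeholder for scores such as those in Table 5.1. Applied to those Table 5.1 values, such an aggregation would place Frank Becker first for bug #313712, consistent with Table 5.2.

# Sketch of summing xFactor values per attacher and recommending the top m.
from collections import defaultdict

def recommend_developers(xfactor_per_file, m=5):
    """xfactor_per_file: {file: {attacher: xFactor value}} for the K relevant files."""
    score = defaultdict(float)
    for attacher_scores in xfactor_per_file.values():
        for attacher, value in attacher_scores.items():
            score[attacher] += value
    ranked = sorted(score.items(), key=lambda item: item[1], reverse=True)
    return ranked[:m]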

69 5.2.4 An Example from Mylyn

Here, we demonstrate iHDev using an example from Mylyn. The change request of interest is bug #313712, "attachment dialog does not resize when Advanced section is expanded". We first collected a snapshot of Mylyn's source code prior to this bug being fixed and then parsed it using the file-level granularity (i.e., each document is a file). After indexing with a machine learning technique, we obtained a corpus consisting of 1,825 documents and 201,554 words. A search query was then formulated using the bug's textual description, the result of which (i.e., the top 10 relevant files) is summarized in Table 5.1. These (k = 10) files are our starting point for iHDev. The correct developer who fixed this bug and committed the change is Frank Becker. In Table 5.1, the second column shows the set of all attachers with their xFactor values for each file fi.

In iHDev, we first obtained the set Da_u from all of the attachers recommended for each file fi relevant to bug #313712 in Table 5.1. The set Da_u consists of 11 unique attachers. Because a developer could use different identities for the attacher and committer roles, we normalized them to a single identity, which was their full name. For each of the 11 unique attachers, the Score value is calculated according to Equation 5.6. Table 5.2 shows the top five Score values and the corresponding attachers, i.e., m = 5. Frank Becker has the highest score in the set DF (a value of 8.51), so he is the first recommended developer. For the remaining attachers, the value of the function Score is less than Frank Becker's score, so they all have a rank greater than 1.

Table 5.2: Top five attachers (developers) recommended to resolve bug #313712 by iHDev.

Attacher             Score  Rank
Frank Becker         8.51   1
Steffen Pingel       5.64   2
Robert Elves         2.75   3
Jingwen 'Owen' Ou    2.13   4
Shawn Minto          2.0    5

Table 5.3 shows the results for the approaches iHDev, xFinder, xFinder′, and iMacPro (described in Section 5.3.1) for m = 5. Clearly, the best result is for iHDev, as it recommends Frank Becker (the correct developer who fixed bug #313712) in the first position, whereas xFinder, xFinder′, and iMacPro recommend Frank Becker in the third, fourth, and third positions, respectively. iHDev outperforms the others with respect to the recall@1 and recall@2 values. At recall@5, all the approaches would be equivalent; however, iHDev provides the best ranked position (i.e., the reciprocal rank value).

Table 5.3: Top five recommended developers and their associated ranks for the compared approaches iHDev, xFinder, xFinder′, and iMacPro.

Approach    Top five recommended developers in ranked order
iHDev       Frank Becker (1), Steffen Pingel (2), Robert Elves (3), Jingwen 'Owen' Ou (4), Shawn Minto (5)
xFinder     Steffen Pingel (1), Robert Elves (2), Mik Kersten (2), Frank Becker (3)
xFinder′    Steffen Pingel (1), Mik Kersten (2), Robert Elves (3), Frank Becker (4), Shawn Minto (5)
iMacPro     Steffen Pingel (1), Shawn Minto (2), Frank Becker (3)

5.3 Case Study

The purpose of this study was to investigate how well our iHDev approach recommends expert developers to assist with incoming change requests. We also compared our approach with two previously published approaches. The first approach, xFinder [53], is based on the mining of commit logs. The second approach, iMacPro [46], uses the authorship information and maintainers of the relevant change-prone source code to recommend developers. We used these two approaches for comparison because they require information from the commit repository, whereas iHDev uses the interaction history of the source code. Therefore, this part of the study allows us to compare interactions and commits with respect to the developer recommendation task. We addressed the following research question RQ: How does iHDev (trained from the interaction history) compare with xFinder, xFinder′, and iMacPro (trained from the commit history) in recommending developers?

5.3.1 Compared Approaches: xFinder, xFinder′, and iMacPro

The xFinder approach uses the source code commit history to recommend developers to assist with a given change request. The first step is finding the relevant source code files for the change request, similar to iHDev (see Section 5.2.2). The commit (and not interaction) histories of these files are analyzed to recommend a ranked list of developers. xFinder uses the frequency count of the developers in the relevant files for ranking purposes. For example, if a developer occurs in three relevant files, their score is assigned a value of 3, and they are ranked above another developer that occurs in two relevant files (with a score of 2). If multiple developers have the same score, they are given the same rank (e.g., two developers are ranked second in Table 5.3). iHDev uses a different ranking mechanism. We replaced xFinder's ranking algorithm with that of iHDev, which is based on the sum of xFactor scores. This modified xFinder approach is termed xFinder′. xFinder′ allows us to compare the core of the two approaches (i.e., interactions and commits) by neutralizing the variability in their rankings.

The iMacPro approach uses the source code authorship and commit history to assign incoming change requests to the appropriate developers. The first step is finding the relevant source code files for the change request, similar to iHDev (see Section 5.2.2). These source code units are then ranked based on their change proneness. The change proneness of each source code entity is derived from its commit history. The developers who authored and maintained these source code files are discovered and combined. Finally, a ranked list of developers for the given change request is recommended. Hossen et al. [46] showed that iMacPro outperformed the iA approach [65], which was shown to perform equivalently to or better than the approach of Anvik et al. [9].

5.3.2 Subject Software Systems

We focused on the Mylyn and Eclipse Platform.

Mylyn

Mylyn contains about 4 years of interaction data and is the Eclipse Foundation project with the largest number of interaction history attachments. It is mandatory for Mylyn project committers to use the Mylyn plug-in. The commit history started 2 years prior to that of the interactions, and commits to the Mylyn CVS repository terminated on July 01, 2011. To get both interaction and commit histories within the same period, we considered the history between June 18, 2007 (the first day of an interaction history attachment) and July 01, 2011 (the last day of a commit to the Mylyn CVS repository). Doing so ascertained that none of the approaches had any particular advantage or disadvantage over the others due to a lack of history for training. The Mylyn project consists of 2272 bug issues containing 3282 trace files, for a total of 276390 interaction log entries for interaction events that led to an edit of the interacted source code. The Mylyn project consists of 5093 revisions, out of which 3727 revisions contained a change to at least one Java file, and for 3536 revisions there exists at least one trace file in the interaction history.

Eclipse Platform

Eclipse Platform contains 6 different sub-projects: e4, Incubator, JDT, Orion, PDE, and Platform2. Eclipse Platform contains about 7 years of interaction data, from July 2007 to May 2014. It is not mandatory for the Eclipse Platform committers to use the Mylyn plug-in. Therefore, it is not surprising that the number of issues that contain interactions is smaller than in the Mylyn project. The commit history for Eclipse Platform started in April 2001, 6 years prior to that of the interactions, and commits to the Eclipse Platform CVS repository terminated on November 05, 2012. To get both interaction and commit histories within the same period, we considered the history between July 25, 2007 (the first day of an interaction history attachment) and November 05, 2012 (the last day of commits/revisions to the Eclipse Platform CVS repository). During this period, Eclipse Platform consists of 700 bug issues containing 897 trace files, for a total of 95834 interaction log entries for interaction events that led to an edit of the interacted source code. It consists of 52126 revisions, out of which 35290 revisions contained a change to at least one Java file, and for 861 revisions there exists at least one trace file in the interaction history.

2http://www.eclipse.org/eclipse/

Both xFinder and iMacPro need the commit histories of the Mylyn and Eclipse Platform open source systems because they need files that have been changed together in a single commit operation. SVN preserves the atomicity of commit operations; however, older versions of CVS did not. Subversion assigns a new "revision" number to the entire repository structure after each commit. The "older" CVS repositories were converted to SVN repositories using the CVS2SVN tool, which has been used in popular projects such as gcc3. For the datasets considered in our study, Mylyn and Eclipse Platform had their commit histories in CVS repositories. Therefore, we converted the CVS repositories to SVN repositories.

5.3.3 Benchmarks: Training and Testing Datasets

For Mylyn and Eclipse Platform, we created a benchmark of bugs and the actual developers who fixed them to conduct our case study. The benchmark consists of a set of change requests that has the following information for each request: a natural language query (request summary) and a gold set of developers that addressed each change request. The benchmark was established by a manual inspection of the change requests, source code, their historical interactions in the bug tracking system, and their historical changes recorded in version-control repositories. Interaction trace files were used to find the interaction events related to each bug, and the attacher who created the trace file was used as the attacher in our interaction log. For tracing each bug ID in the Subversion (SVN) repository commit logs, keywords such as the bug ID in the commit messages/logs were used as starting points to examine whether the commits were in fact associated with the change request in the issue tracking system that was indicated by these keywords. The author and commit messages in those commits, which can be readily obtained from SVN, were processed to identify the developers who contributed changes to the change requests (i.e., the gold set). These developers were then used to form our actual developer set for evaluation. A vast majority of change requests are handled by a single developer (92% in Mylyn and 97% in Eclipse Platform). Table 5.4 shows the frequency distributions of developers resolving issues in the benchmarks for Mylyn and Eclipse Platform. For example, 277 issues were resolved by a single developer in Mylyn, and none of the issues were resolved by two developers in Eclipse Platform. Our technique operates at the change request level, so we also needed input queries to test. We considered all of the bugs that have at least one associated ID in the commit messages and trace files. We created the final training corpus from all of the source code files in the last revision before the bug issues in our benchmark were fixed. We split the benchmark into training and testing sets. We picked June 18, 2007 to March 16, 2010 as our training set and March 16, 2010 to July 01, 2011 as our testing set for Mylyn. The testing set period contains 600 different revisions and 301 different issues/bugs. Similarly, we picked July 25, 2007 to March 01, 2010 as our training set and March 01, 2010 to November 05, 2012 as our testing set for Eclipse Platform. The testing set contains 140 different revisions and 72 different issues/bugs. Our benchmarks are available online3. The experiment was run for m = 1, m = 2, m = 3, and m = 5, where m is the number of recommended developers. We considered the top ten relevant files from the machine learning step.

Table 5.4: The frequency distributions of developers resolving issues in the benchmarks for Mylyn and Eclipse Platform.

System              # 1   # 2   # 3   Total Issues
Mylyn               277   21    3     301
Eclipse Platform    70    0     2     72

5.3.4 Metrics and Statistical Analyses

We evaluated the accuracy of each of the approaches for all of the bug issues in our testing set using the Recall and Mean Reciprocal Rank (MRR) metrics used in previous work [9, 65, 107]. For each bug b, in a set of bugs B of size n, in the benchmark of a system and a number m of recommended developers, the formula for recall@m is given below:

recall@m = \frac{|RD(b) \cap AD(b)|}{|AD(b)|}    (5.8)

3http://serl.cs.wichita.edu/svn/projects/dev-rec-interactions/trunk/MSR2015/data/Benchmark

where RD(b) and AD(b) are the developers recommended by the approach and the actual developers who resolved the bug b, respectively. This metric is computed for recommendation lists of developers with different sizes (e.g., m = 1, m = 2, m = 3, and m = 5 developers).

Increasing the value of m could lead to a higher recall value (and typically does); however, it may come at the cost of an increased effort in examining the potential noise (false positives). Over 90% of the cases in our benchmarks have only a single correct developer. In such a scenario, each increment to m in pursuit of a correct developer could add drastically to the proportion of false positives. Therefore, a traditional metric like precision is not a suitable fit for our context. For example, if a correct developer is found at m = 5, the possible precision value is in the range [0.2, 1.0] for the rank positions [5, 1], typically around the lower bound of 0.2. Note that the precision metric is also agnostic to the rank positions of recommendations. Therefore, for a cutoff value of m, it would produce the same value for two approaches presenting the same number of correct answers. For m = 5, two approaches presenting a single correct answer at positions 1 and 5, respectively, will have the same precision value of 0.2. Nonetheless, a complementary measure to recall is also needed to assess the potential effort in addressing noise (false positives). We focused on evaluating the ranked position of the correct developer for each bug in each benchmark from a cumulative perspective regardless of the cutoff point m.

Mean Reciprocal Rank (MRR) is one such measure that can be used for evaluating any process that produces a list of possible responses to a sample of queries, ordered by probability of correctness. This metric has been used in previous work [100]. The reciprocal rank of a query response is the multiplicative inverse of the rank of the first correct answer. Intuitively, the lower the value (between 0 and 1), the farther down the list, examining incorrect responses, one would have to search to find a correct response.

MRR = \frac{1}{|n|} \sum_{i=1}^{|n|} \frac{1}{rank_i}    (5.9)

Here, the reciprocal rank for a query (bug) is the reciprocal of the position of the correct developer in the returned ranked list of developers (rank_i), and n is the total number of bugs in our benchmark.

When the correct developer for a bug is not recommended at all, we consider its inverse rank to be zero. When there are multiple correct developers (which happens in very few cases in our benchmark), we consider the highest/first ranked position. The higher the MRR value, the less effort is likely to be spent examining noise. For example, an MRR value of 0.5 suggests that the approach typically produces the correct answer at the 2nd rank. Overall, our main research hypothesis is that iHDev will outperform the subjected competitors in our study in terms of recall without incurring additional costs in terms of MRR.
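As an illustration of how these two metrics are computed, the following sketch (in Python, with made-up recommendation lists; it is not the evaluation tooling used in this work) mirrors Equations 5.8 and 5.9, including the convention that a missing correct developer contributes a reciprocal rank of zero.

    # Illustrative computation of recall@m (Eq. 5.8) and MRR (Eq. 5.9).
    # The developer names and lists below are hypothetical.

    def recall_at_m(recommended, actual, m):
        """Fraction of the actual resolvers found among the top-m recommendations."""
        return len(set(recommended[:m]) & set(actual)) / len(actual)

    def mean_reciprocal_rank(all_recommended, all_actual):
        """Average of 1/rank of the first correct developer; 0 when none is recommended."""
        total = 0.0
        for recommended, actual in zip(all_recommended, all_actual):
            ranks = [recommended.index(d) + 1 for d in actual if d in recommended]
            total += 1.0 / min(ranks) if ranks else 0.0
        return total / len(all_actual)

    # One bug resolved by "alice", who is recommended at rank 2.
    print(recall_at_m(["bob", "alice", "carol"], ["alice"], m=1))            # 0.0
    print(recall_at_m(["bob", "alice", "carol"], ["alice"], m=2))            # 1.0
    print(mean_reciprocal_rank([["bob", "alice", "carol"]], [["alice"]]))    # 0.5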

We applied the One Way ANOVA test to validate whether there was a statistically significant difference, with α = 0.05, between the results of both the recall and MRR values. For MRR, we considered the ranks of correct answers of the approaches for each bug (data point). We did not assume normality in the distributions of the recall results. The purpose of the test is to assess whether one of the two compared samples tends to yield higher values than the other. Therefore, we defined the following null hypotheses for our study (the alternative hypotheses can be easily derived from the respective null hypotheses):

• H-1: There is no statistically significant difference between the recall@m values of iHDev and each of xFinder, xFinder′, and iMacPro.

• H-2: There is no statistically significant difference between the MRR values of iHDev and each of xFinder, xFinder′, and iMacPro.
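For concreteness, a minimal sketch of this per-metric comparison is shown below using SciPy's one-way ANOVA; the per-bug recall vectors are placeholders, not the actual study data.

    # Hypothetical per-bug recall@m values for two approaches; the real study uses
    # the benchmarks described above.
    from scipy.stats import f_oneway

    alpha = 0.05
    recall_iHDev   = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
    recall_xFinder = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]

    f_stat, p_value = f_oneway(recall_iHDev, recall_xFinder)
    if p_value < alpha:
        print("reject the null hypothesis: the two approaches differ significantly")
    else:
        print("statistical tie at alpha = 0.05")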

5.3.5 Results

The recall@1, recall@2, recall@3, and recall@5 values for each of the compared approaches for Mylyn and Eclipse Platform were calculated from the established benchmarks. Table 5.5 shows the recall@m values of Mylyn and Eclipse Platform for all the compared approaches (see the Recall@m column). The MRR values for each of the compared approaches for Mylyn and Eclipse Platform were then calculated. Table 5.6 shows the MRR values of Mylyn and Eclipse Platform for all approaches (see the MRR column).

We computed the recall gain of iHDev over another compared approach (i.e., Y ) using the following formula:

Table 5.5: Average recall@1, 2, 3, and 5 values of the approaches iHDev, xFinder, xFinder′, and iMacPro measured on the Mylyn and Eclipse Platform benchmarks.

                     Recall@m                               iHDev vs xFinder          iHDev vs xFinder′         iHDev vs iMacPro
 m        iHDev  xFinder  xFinder′  iMacPro     Gain%   Pvalue  Adv      Gain%   Pvalue  Adv      Gain%   Pvalue  Adv
 Mylyn
 1        0.50   0.45     0.52      0.50        11.11   0.22    None     -3.84   0.58    None     0       0.9     None
 2        0.71   0.63     0.59      0.63        12.69   0.02    iHDev    20.33   0.001   iHDev    12.69   0.03    iHDev
 3        0.79   0.68     0.69      0.72        16.17   0.001   iHDev    14.49   0.004   iHDev    9.72    0.05    iHDev
 5        0.86   0.69     0.81      0.76        24.63   ≤0.001  iHDev    6.17    0.08    None     13.15   0.0008  iHDev
 Eclipse Platform
 1        0.25   0.12     0.12      0.11        108.33  0.03    iHDev    108.33  0.03    iHDev    127.27  0.02    iHDev
 2        0.37   0.20     0.17      0.18        85.00   0.03    iHDev    117.64  0.007   iHDev    105.55  0.01    iHDev
 3        0.41   0.20     0.22      0.22        105.00  0.005   iHDev    86.36   0.01    iHDev    86.36   0.01    iHDev
 5        0.45   0.20     0.23      0.23        125.00  0.001   iHDev    95.65   0.004   iHDev    95.65   0.003   iHDev

$$Gain_{R@m}^{iHDev-Y} = \frac{recall@m_{iHDev} - recall@m_{Y}}{recall@m_{Y}} \times 100 \qquad (5.10)$$

The MRR column in Table 5.6 shows the MRR values of the compared approaches.

We computed the MRR gain of iHDev over another compared approach (i.e., Y ) using the

following formula:

$$Gain_{MRR}^{iHDev-Y} = \frac{MRR_{iHDev} - MRR_{Y}}{MRR_{Y}} \times 100 \qquad (5.11)$$

To answer the research question RQ, we compared the recall values of iHDev, xFinder, xFinder′, and iMacPro for m = 1, m = 2, m = 3, and m = 5. We computed the recall gain of iHDev over Y in {xFinder, xFinder′, iMacPro} using Equation 5.10. Similarly, we compared the MRR values of iHDev, xFinder, xFinder′, and iMacPro. We computed the MRR gain of iHDev over Y in {xFinder, xFinder′, iMacPro} using Equation 5.11. Table 5.5 and Table 5.6 show the recall and MRR results respectively. The Gain % columns in Table 5.5 show the recall gains of iHDev over each compared approach for the different m values. The Gain % columns in Table 5.6 show the MRR gains of iHDev over each compared approach. The Pvalue columns in Table 5.5 show the p-values from applying the One Way ANOVA test on the recall values for m = 1, m = 2, m = 3, and m = 5. The Pvalue columns in Table 5.6 show the p-values from applying the One Way ANOVA test on the reciprocal rank values. The Advantage (Adv) columns show the approach that had a statistically significant gain over the other. In cases where neither approach did, None is shown.

For each pair of competing approaches, there were eight comparison points in terms of recall

values (four each for Mylyn and Eclipse Platform). Overall, out of eight comparison cases between

iHDev and xFinder for recall values, iHDev was advantageous over xFinder in seven of them; the remaining one was a statistical tie. Out of eight comparison cases between iHDev and xFinder′ for recall values, iHDev was advantageous in six of them; the remaining two were statistical ties. Out of eight comparison cases between iHDev and iMacPro for recall values, iHDev was advantageous in seven of them; the remaining one was a statistical tie. Despite a few observations of negative gains, there was not a single case in which another approach was statistically advantageous over iHDev in terms of recall. Therefore, we reject the hypothesis H-1.

For each pair of competing approaches, there were two comparison points in terms of MRR

values (one each for Mylyn and Eclipse Platform). Overall, iHDev was advantageous in both cases of comparison between iHDev and xFinder. iHDev was advantageous in one case each for the comparisons between iHDev and xFinder′, and between iHDev and iMacPro. The remaining two cases were statistical ties. There were no cases in which the other approaches were statistically advantageous over iHDev in terms of MRR. Therefore, we reject the hypothesis H-2.

In summary, the overall results suggest that iHDev generally performs better than xFinder, xFinder′, and iMacPro in terms of both recall and MRR. Using the interaction history typically

leads to improvements in accuracy. Not only does it identify the correct developers more often

(as evident by the significant recall gains or no loss), but also at a high enough position in the list of recommended candidates (as evident by the significant MRR gains or no loss). For example, iHDev recorded recall gains over xFinder in the range [85% , 125%] for Eclipse Platform. Also,

on average, the correct developer would appear at the 3rd position (MRR = 0.34) for iHDev recommendations, whereas in xFinder this value would be around the 6th position (MRR = 0.16). Thus, the

Table 5.6: Mean Reciprocal Rank of the approaches iHDev, xFinder, xFinder′, and iMacPro measured on the Mylyn and Eclipse Platform benchmarks.

                      MRR                                    iHDev vs xFinder          iHDev vs xFinder′         iHDev vs iMacPro
 System            iHDev  xFinder  xFinder′  iMacPro    Gain%   Pvalue  Adv      Gain%   Pvalue  Adv      Gain%  Pvalue  Adv
 Mylyn             0.66   0.56     0.62      0.64       17.85   0.003   iHDev    6.45    0.22    None     3.12   0.4     None
 Eclipse Platform  0.34   0.16     0.16      0.17       112.5   0.005   iHDev    112.5   0.007   iHDev    100    0.01    iHDev

MRR gain is over 112%. iHDev takes us a step forward toward achieving the ideal goal of a recommender that not only always identifies the correct developers, but also puts them in the top positions.

5.3.6 Discussion

We discuss a few qualitative points that would help understand the rationale behind the improved performance with using interactions in iHDev.

Multiple Attempts at Resolution. Given the nature of OSS, there are often multiple attempts at resolving a given change request. For example, multiple (a few incomplete or incorrect) fixes are attempted, perhaps by multiple developers. In the end, only a complete and correct resolution is accepted and/or merged into the source code repository (i.e., the main development trunk or branch). The interaction history records these attempts; however, the commit history only records the final outcome (i.e., only the things committed), if any. Gousios et al. [43] observed that in GitHub some issues receive multiple pull-requests with solutions (all representing interaction and competence), but not all are accepted and merged. Our results show that past experiences (including failures) are important ingredients in shaping developer expertise. Interactions offer a valuable insight into these micro-level developer activities in building their expertise, whereas the commit repositories would largely miss out on them. For example, the file UpdateAttachmentJob.java has 458 edit events, but only 6 commits. Additionally, the developer Frank Becker contributed 50 edit events, but had only one commit. The interaction history shows that he attempted resolutions for 6 bugs, but the commit history shows that he contributed only one bug fix. This example suggests there is quite a disparity between the micro- and macro-level perspectives of contributions.

Resolutions and Peer Review. The practice of peer review of proposed resolutions/patches exists in both Mylyn and Eclipse Platform. We observed cases in which resolutions (even those that were correct in the sense of a technical fix for the problem at hand) were not merged to the source code repository (e.g., they had potential conflicts with other code elements, did not make it in time for the review process to go through, or arrived after something else was already accepted) or were revised and then merged (e.g., a few changes to a few files were revised or deemed not needed). The interaction history captures these resolutions, while the commit history misses out on these alternative solutions.

Interactions come first, Commits later. In the typical workflow, interactions come first

and commits later. Interactions are available much earlier than commits to mine and integrate the

results. Our datasets show that for a given time period, there are more interactions than commits. Therefore, using interaction data may reduce the latency in building and deploying actionable models on a project, as the training data would become available sooner.

Self-Serving Benefits. We believe that the potential benefits shown to developers by converting the interactions into actionable support for their routine tasks could serve as a motivating factor in using Mylyn-type activity loggers. That is, developers would see the value in logging their activities now, so that they could benefit from them for concrete tasks in the future.

Task Applicability. Our work shows the potential value of interactions in improving developer recommendation tasks; however, they could be used for other software maintenance and evolution tasks that have typically relied on commit histories. Previously, we used interactions for impact analysis [15, 112].

In summary, our findings highlight many aspects that a developer recommendation methodology based on the commit history may not capture, but interactions do.

5.4 Threats to Validity

We discuss internal, construct, and external threats to validity of the results of our empirical study.

Accuracy measures and correctness of developer recommendations: We used the widely used metric recall in our study. We also calculated mean reciprocal rank. We considered the gold set to be the developers who contributed source code changes to address the change requests. Of course, it is possible that other team members are also equally qualified to handle these change requests; however, such a gold set would be very difficult to ascertain in practice. Nonetheless, our benchmark can be viewed as a conservative bound.

Accuracy@k [108, 23, 94] generally considers only one correct answer to claim 100%. In our study, over 90% of cases had only one correct answer. Therefore, the differences in recall@k and accuracy@k values are negligibly small. Although (as discussed in Section 5.3.4) we deemed the precision metric to be unsuitable for our technique due to the ordered nature of responses, we summarize the results. In the case of Mylyn and xFinder, the precision gain ranges between 12.76% and 136.36%, and out of eight comparisons with ANOVA, iHDev was advantageous in four of them. In the case of Mylyn and xFinder′, the precision gain ranges between 12% and 137.50%, and out of eight comparisons with ANOVA, iHDev was advantageous in six of them. In the case of Mylyn and iMacPro, the precision gain ranges between 6% and 136.36%, and out of eight comparisons with ANOVA, iHDev was advantageous in seven of them. Furthermore, should there be a larger number of cases with multiple correct answers, a metric such as Mean Average Precision (MAP) could be employed.

Machine Learning-based matching of change requests to relevant files: KNN sometimes returned classes (i.e., files) that were not found in the commits related to the bug fixes or change request implementations. However, based on our prior work we observed that the files that were recommended as textually similar were either relevant (but not involved in the change that resolved the issue) or conceptually related (i.e., developers were also knowledgeable in these parts).

Narrow Ground Truth: Our ground truth included only the developers who eventually resolved the change request. It is likely that other developers are equally capable of resolving the same requests. Identifying them is not obvious from repositories and would require a quantitative study.

Developer identity mismatch: Although we carefully examined all of the available sources

of information in order to match the different identities of the same developer, it is possible that we

missed or mismatched a few cases.

Incomplete or Missing Interaction History: Although, a common period was considered for extracting the interaction and commit datasets, the number of commits is significantly higher than the number of interaction transactions. This difference is not necessarily the result of a single task defined for multiple commits, because there are many cases in which committed files were never part of one of the interactions.

Explicit Bug ID Linkage: We considered interactions and commits to be related if there was an explicit bug ID mentioned in them. Implicit relationships were not considered.

Two Systems Considered: Due to the limited availability of Mylyn interaction histories, our study was performed on only two systems written in Java. Mylyn and Eclipse Platform were the largest available datasets within the Eclipse Foundation. Nonetheless, this fact may limit the generality of our results.

CHAPTER 6

Developer Recommendation (rDevX)

The practice of code review of patches submitted for software maintenance is common in

open source and commercial domains. There is empirical evidence supporting many of its benefits, including improving software quality and knowledge transfer. In this chapter, we investigated the research question "what can historical records of code reviews tell us about developer expertise?". We analyzed code reviews that are managed by "modern" tools, such as Gerrit. We found eight markers of developer expertise associated with the source code changes and their acceptance, time line, and human roles and feedback involved in the reviews. We formed a developer-expertise model from these markers and showed its application in bug triaging. Specifically, we derived a developer recommendation approach for an incoming change request (e.g., to fix a bug), named rDevX, from this expertise model. An empirical study on the open source systems Eclipse Platform,

Mylyn, and OpenStack Nova was conducted to assess the effectiveness of rDevX. A number of

change requests were used in the evaluated benchmark. Furthermore, a comparative study against another previous approach that uses the commit history for developer recommendation was performed. The metrics recall and MRR (Mean Reciprocal Rank) were used to measure their quantitative effectiveness. Results show that rDevX outperforms the subjected competitor with statistical significance.

We also performed a qualitative analysis to reason about the gains of rDevX.

6.1 Introduction

Software engineering is very much human driven and prone. The quality and velocity of maintenance and evolution tasks (e.g., bug fixes or feature implementations) are in many ways a direct reflection of the individuals or teams who perform them. Determining the right solution providers for the task at hand is arguably as important as suggesting the right tool support for it, especially in the far too commonly found state of inadequate or obsolete documentation of large-scale software systems. The expressiveness and effectiveness of the artifacts could be directly attributed to the expertise of their authors. Questions about a specific source code unit from newcomers to a project may be best answered by that code’s developers or owners [13]. The developers who have the needed expertise for a change request should be assigned to resolve it [8]. Unfortunately, the knowledge or expertise of developers is oftentimes not readily documented or easily available in large, distributed projects such as open source software development.

There have been previous efforts to build models of developer expertise in source code [9, 54,

65, 94, 107, 111, 46]. Most of these models use historical change records of source code captured in software repositories (bug reports and/or commits). Recently, there have been efforts to build knowledge models from developer interactions that go beyond a change event (e.g., navigation) captured in integrated development environments. Although these models capture aspects related to developer activity, they cannot and do not include other important markers of expertise.

Modern Code Review (MCR) [13], which is tool driven and assisted, has been found to be useful in not only improving software quality, but also as a means of knowledge transfer and collaboration [13]. The historical code reviews capture many unique aspects that can be useful markers of developer expertise. Among them is the reviewer role and contributions in terms of the code critique and formative experience in authoring code changes, which may or may not have been successful (i.e., eventually included in the project code base). Code reviews subsume the commits in source-code repositories (e.g., git or Subversion). We posit that these markers provide a unique opportunity to expand the scope of forming models of developer expertise in source code and their improved applications in software maintenance and evolution tasks.

Our Contribution: In this chapter, we investigated the research question “how to transform the historical records of code reviews into a developer expertise model, which can be used in the software-change-request-resolution phase?”. Approaches for recommending reviewers from analyzing code reviews have been proposed before [110, 96, 98, 14]; however, developer expertise models and automatically recommending developers for change-request (e.g., bug) triaging using code reviews have not been investigated yet. Our work makes the following noteworthy contributions:

Developer Expertise Model: We identify eight markers of developer expertise in source code from analyzing past code reviews – expert-markers¹. The markers are associated with the source code changes and their acceptance, time line, and human roles and feedback involved in the reviews. We then unify these markers to form a developer expertise model in source code.

rDevX – Developer Recommendation Technique: We show the application of our developer-expertise model for the task of developer recommendation, i.e., automatically assigning issues or change requests to the developer(s) who are most likely to resolve them. rDevX takes the textual description of a change request (e.g., a short bug description) and recommends a ranked list of developers to resolve it.

Empirical Evaluation: To evaluate the accuracy of rDevX, we conducted an empirical study on three open source systems: Eclipse Platform, Mylyn, and OpenStack Nova. Recall and Mean Reciprocal Rank (MRR) metric values of the developer recommendations on a number of bug reports sampled from these systems are presented. Our rDevX approach is empirically compared with an approach based on the commit history [54]. The results show that the presented rDevX approach outperformed its competitor. rDevX registers gains of as much as 75% over the other approach.

These gains came without incurring any decreased Mean Reciprocal Rank (MRR) values; rather rDevX recorded improvements in them. That is, rDevX would typically recommend the correct developers at higher ranks than the competitor.

The rest of this chapter is organized as follows: Our presented developer expertise model is discussed in Section 6.2 and its application in developer recommendation, i.e., rDevX, is presented in Section 6.3. The empirical study is presented in Section 6.4. Threats to validity are discussed in Section 6.5.

6.2 The Developer Expertise Model

We first provide our rationale for going beyond developer expertise models formed from macro-repositories such as commit archives, and then discuss our presented model based on historical code reviews.

¹ aspired to be in the spirit of biomarkers

6.2.1 Why Code Reviews to Build a New Model for Developer Expertise?

Numerous approaches have been proposed to formulate the expertise of developers in source code from software repositories [9, 107, 47, 7, 46, 16, 8, 10, 111]. Source-code-version archives, i.e., commits, are the most commonly analyzed repositories for this purpose. The underlying premise of these approaches is typically that developers who contributed changes to specific source code entities are knowledgeable or experts in those entities. Their expertise is predominantly determined from their commit activity and their roles as code owners or authors or maintainers [46].

Although developer expertise modeled in such a way provides several important aspects, we posit that it is limited in scope and size. Are there other markers that demonstrate developers gaining expertise in particular source code entities beyond their record of the code owner role and accepted changes that are eventually merged to the code base, i.e., commits? Commit repositories only capture the end points, i.e., macro events, of software maintenance or evolution tasks (e.g., accepted code changes for bug fixes or feature requests), and do not necessarily capture the means, i.e., micro events, associated with achieving them. The question is where to find the records of these means and how they could help in forming an enhanced and more effective developer expertise model than those formed from the archives of ends?

Several open source projects and commercial companies practice modern code review. Gerrit is one popular tool that facilitates MCR. Previous empirical studies show that the code review process helps developers spread knowledge across the development team [81, 82, 13]. A senior developer at Microsoft made the following remark:

one of the things that should be happening with code reviews over time is a distribution of knowledge [13].

Bacchelli et al. [13] similarly state that Team Awareness and Shared Code Ownership are other aspects of the code review process at Microsoft. These findings suggest that the benefits of MCR extend beyond finding and reducing defects.

We resort to the code-review archives to detect markers of developer expertise in source code. Modern Code Review (MCR), which is tool assisted, supports the quality-control process of

reviewing a proposed code change or patch for a maintenance task. It includes human feedback, which drives whether a proposed code change needs to be revised, and eventually accepted or abandoned. Only accepted source code changes are merged to the code base. That is, it is one of the sources that captures the means adopted in reviewing and revising not only the patches that eventually get merged (i.e., committed), but also those abandoned. We explore code reviews, especially the proposed changes (regardless of whether they were committed or not), and the human roles and feedback. These aspects are not captured in commit repositories. Next, we discuss the exclusive markers (measures) that can be obtained from the code-review archives compared to those from commit archives.

Owner and Reviewer Roles: In MCR, humans have two main roles: 1) the owner is the one who submits the proposed code change or patch for review, and 2) a reviewer is one who reviews the proposed change, provides feedback on it, and/or accepts it. Typically, there is one code owner and multiple reviewers for a proposed change. At least one reviewer must accept the change for it to be merged/committed to the project code base or else it is abandoned. Is there a difference in the sets of owners and reviewers, and their code expertise? If there is none, the corresponding commit and code-review archives should produce the identical set of human contributors. That is, both sources provide equivalent information, and perhaps the commit archives are a sufficient source for expertise.

We investigated three open source systems, Eclipse Platform, Mylyn, and OpenStack Nova, to examine this issue. For each system, we obtained the set of authors from the commit archives, and the set of owners and reviewers from the code-review archives. We calculated the Jaccard similarity between these sets. The Jaccard coefficient measures the similarity between sample sets as their intersection size divided by their union size. The Jaccard values for Eclipse Platform, Mylyn, and OpenStack Nova are 0.45, 0.67, and 0.48 respectively. The low Jaccard similarity between the set of commit authors and the set of review owners and reviewers shows how dissimilar the two sets are. This observation shows that a role change could occur. For example, a reviewer of code today may turn into its author tomorrow. We also investigated the historical records of files in both commit and code review repositories for these systems. For each system, we obtained the set of unique files from the commit archives and the set of unique files from the code-review archives. We calculated the Jaccard similarity between these sets. The Jaccard values for Eclipse Platform, Mylyn, and OpenStack Nova are 0.55, 0.55, and 0.60 respectively. These values show that these two sets are also quite dissimilar. This observation shows that there is a potentially broader scope of source code files available in code review archives than in commit archives.
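The set comparison itself is straightforward; a small sketch (with invented contributor names, not the mined data) is shown below.

    # Jaccard similarity between two contributor (or file) sets: |A ∩ B| / |A ∪ B|.

    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if (a or b) else 0.0

    commit_authors = {"alice", "bob", "carol"}            # authors mined from the git history
    review_people  = {"bob", "carol", "dave", "erin"}     # owners and reviewers mined from Gerrit
    print(jaccard(commit_authors, review_people))         # 0.4, i.e., the sets overlap only partially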

Reviewer to Author Role Change: The role of a reviewer is not always confined to providing feedback, suggesting improvements, or critiquing the patch. It is possible that during the code review process, a reviewer ends up taking the mantle of ownership. For example, consider the review #62379² in Eclipse Platform. The owner of the patch under this review is Xavier Coulon³. Jeff Johnston and Roland Grunberg provided review comments for this patch. Roland

Grunberg offered suggestions on the file DockerContainersView.java. Jeff Johnston implemented those suggested changes and submitted the second, revised patch, which was accepted. This case suggests that others who may have only the review knowledge about the source code can start contributing as authors. Again, this aspect of role changes is not documented in commit archives. The knowledge garnered in reviewing code can later be used in its authorship, as early as in the same patch.

Review Before Commit: Studies show that the code review process helps in knowledge transfer [81], [82], [13]. New contributors are initiated into the software development via the code review process. They first participate as reviewers and observe how other, more experienced owners and reviewers conduct their activities. After they gain enough knowledge in this incubation period, they also become code contributors. For example, the developer Niraj Modi started his first review experience in the source code file Control.java from the package swt of Eclipse Platform in the review #28462⁴. He made an in-line comment for the reviewed changes in the file Control.java on June 13, 2014. Several months later, on January 23, 2015, he submitted his fix

2https://git.eclipse.org/r/#/c/62379/ 3We use the names from publicly available data 4https://git.eclipse.org/r/#/c/28462/

for bug #435384, which included a change to the file Control.java. This case suggests that the code-review archive may capture developer expertise on specific source code elements much earlier than the commit archives.

Abandoned Patches: The code changes that are accepted in code review are eventually committed to the code base. Thus, they are captured in the source-code version archives. Not all the submitted patches are eventually accepted (see Table 6.5). The abandoned patches are not recorded in the commit repositories. The question is what could the abandoned patches tell us about developer expertise? It should be noted that not all patches are abandoned due to the inferior expertise of developers or their suboptimal solutions, i.e., because they are perceived to be of low quality (defect prone or noncompliant with the coding style). In some instances, they are abandoned due to timing or feature-priority issues. Nevertheless, even when patches are abandoned due to their lack of quality, they may be indicators of developers gaining expertise for future success, i.e., their patches in the future are accepted. Thus, abandoned patches may embed indicators of expertise equivalent to that of other developers, or serve as breeding grounds for developers.

We manually examined several abandoned patches and observed interesting characteristics.

Two developers submitted two separate patches in close time proximity. One got accepted and the other was abandoned after a successful build. The patch submitted in review #60129⁵, which is related to bug #475941, in Eclipse Platform was abandoned, whereas another patch submitted in review #60243⁶ was merged.

In another instance, two developers submitted two different patches, and one was abandoned not necessarily because of its poor quality in terms of verification and validation, but due to the preferred design of the other⁷ (which perhaps indicates a better internal quality). We also found a case in which one of the developers submitted a patch for an enhancement request (bug #398110⁸)

5https://git.eclipse.org/r/#/c/60129 6https://git.eclipse.org/r/#/c/60243/ 7https://git.eclipse.org/r/#/c/35282/ 8https://bugs.eclipse.org/bugs/show_bug.cgi?id=398110

and then it was determined that the enhancement was not needed⁹. These observations suggest that abandoned patches are a valuable source for inferring developer expertise.

There have been attempts to model developer expertise from records of interactions captured in IDEs [111, 39]. It should be noted that interactions capture only the code owner’s activities involved in resolving and implementing an issue, and not those of the reviewers or the owner-reviewer discussion and discourse. Similarly, bug repositories [8, 107] typically record the front stages of a change request, i.e., from reporting to resolving it. For example, a bug report may include comments clarifying or adding information on a defect. This information may help design and implement the needed changes to address it, but usually not to review them.

6.2.2 Markers and Expertise Model

The discussion in the preceding section suggests that code review archives provide a valu- able source for inferring developer expertise in source code. We considered markers (measures) of developer expertise associated with the source code changes and their acceptance, and human roles and feedback involved in the reviews. These markers are lightweight, yet capture important dimensions of expertise, which could make maintenance tasks effective. We considered two types of measures: 1) frequency and 2) recency. Given a unit of source code, e.g., the file f, determine the expertise of the contributor d in it. Our model unifies the markers and provides the relative expertise of a particular contributor in a specific source code unit. We next describe the markers and the model.

Patches Submitted, Nf : We calculate the contributor d’s total number of submitted patches that include the file f. That is, the contributor d is in the owner’s role. This metric is frequency based.

Previous studies show that the recency of contributions is an important and influential factor

[111, 54]. It helps provide a more favorable weight to relevant contributions than irrelevant (much older) ones. Therefore, we also consider the date of the last submitted patch.

9https://git.eclipse.org/r/#/c/9891/

Last Submitted Patch Date, Df, is the date of the contributor d’s last submitted patch that includes the file f.

Patches Accepted, Pf : We calculate the contributor d’s total number of accepted patches

that include the file f.

Last Accepted Patch Date, Af, is the date of the contributor d’s last accepted patch that includes the file f.

Review Comments on Patches Submitted, Tf : We calculate the contributor d’s total number

of comments written for the submitted patches that include the file f. That is, the contributor d is

in the reviewer’s role. This metric is frequency based.

Last Comment Date on Submitted Patch, Lf , is the last comment date of the contributor d

on the submitted patches that include the file f.

Review Comments on Patches Accepted, Rf : We calculate the contributor d’s total number

of comments written for the accepted patches that include the file f.

Last Comment Date on Accepted Patch, Wf , is the last comment date of the contributor d

on the accepted patches that include the file f.

We used the above eight measures to determine the developer’s expertise in a specific source

code file, i.e., developer-review map. The developer-review map, DR for the developer d and file f

is formulated into a vector, which is given below:

DR(d, f) = ⟨Pf, Af, Nf, Df, Rf, Wf, Tf, Lf⟩

Note that there is a one-to-many relationship between a given source code entity and developers, i.e., multiple developers may be knowledgeable in it. To account for this issue, we also compute the equivalent eight measures for a specific source code entity and represent them in a vector form. The file-review map FR represents the review contributions to the file f, and is shown

below:

FR(f) = ⟨P′f, A′f, N′f, D′f, R′f, W′f, T′f, L′f⟩, where

• P′f is the total number of accepted patches that include the file f.

• A′f is the date of the last accepted patch that includes the file f.

• N′f is the total number of submitted patches that include the file f.

• D′f is the date of the last submitted patch that includes the file f.

• R′f is the total number of review comments written on accepted patches that include the file f.

• W′f is the date of the last review comment written on accepted patches that include the file f.

• T′f is the total number of review comments written on submitted patches that include the file f.

• L′f is the date of the last review comment written on submitted patches that include the file f.

The contribution or summary of expertise, termed SoE, for the developer d and the file f is computed using the sum of the ratios of the individual features from the developer-review map and the respective features from the file-review map. That is, the file-review map serves as a normalizing factor. The total score of a developer for every file, in other words the SoE, is calculated in two steps: the total of the recency factors and the total of the frequency factors for every developer with respect to the source code file. The total recency factor is the sum of the scores of all the date-related features, and the total frequency factor is the sum of all the frequency-related features. The score for each date feature is calculated as follows:

Y(d, f) is a generic meta-representation of how the score for each of the date-related features is calculated:

$$Y(d, f) = \begin{cases} \frac{1}{|y_f - y'_f|} & \text{if } |y_f - y'_f| \neq 0 \\ 1 & \text{if } |y_f - y'_f| = 0 \end{cases} \qquad (6.1)$$

The terms y_f and y'_f are numerical representations of date values, and hence the term y_f − y'_f is also numerical. Here, y_f can take the value of Af, Df, Wf, or Lf, and y'_f can take the corresponding value of A′f, D′f, W′f, or L′f. TD(d, f) is the total of the recency factors, i.e., the score corresponding to all date-related features:

$$TD(d, f) = \sum_{i=1}^{4} Y_i(d, f) \qquad (6.2)$$

Y_i(d, f) is calculated as in Equation 6.1 for each of the date-related feature pairs, i.e., (Af, A′f), (Df, D′f), (Wf, W′f), and (Lf, L′f). The value of i ranges from one to four because there are four such pairs of date-related features.

PD(d, f) is the total of the frequency factors, i.e., the score corresponding to all patch- and contributor-related features:

$$PD(d, f) = \frac{P_f}{P'_f} + \frac{N_f}{N'_f} + \frac{R_f}{R'_f} + \frac{T_f}{T'_f} \qquad (6.3)$$

The final equation for SoE(d, f) is given as the sum of the total of the frequency and recency

factors:

SoE(d, f) = TD(d, f) + PD(d, f) (6.4)

According to Equation 6.4, the maximum value of SoE can be eight because we have used

eight measures. It should also be noted that these measures can be computed for any given historical window, and not necessarily on the entire historical data (e.g., to form an expertise model at a given

point of time from prior contributions – see Section 6.3). Although we described our model at the file level, it can be extrapolated to higher or lower levels of granularity. For high-level entities (e.g., the package containing a particular file), we compute the measures for all its files and aggregate them to determine the SoE value. For lower-level granularities (e.g., classes and methods), we need

to further process the file and line changes, i.e., it requires additional cost. The measures and SoE

value are then computed for each entity at that level.
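To make the computation concrete, the following sketch implements Equations 6.1–6.4 for a single developer–file pair. The dictionary layout, field names, and the use of day differences for the date terms are illustrative assumptions rather than the exact implementation used in this work.

    # Sketch of SoE(d, f) = TD(d, f) + PD(d, f) (Equations 6.1-6.4).
    from datetime import date

    FREQ_KEYS = ["P", "N", "R", "T"]   # accepted patches, submitted patches, comments on accepted/submitted
    DATE_KEYS = ["A", "D", "W", "L"]   # the corresponding last-activity dates

    def date_score(y, y_prime):
        """Equation 6.1: reciprocal of the gap (here, in days) between the two dates."""
        gap = abs((y - y_prime).days)
        return 1.0 if gap == 0 else 1.0 / gap

    def soe(dr, fr):
        td = sum(date_score(dr[k], fr[k]) for k in DATE_KEYS)       # Equation 6.2
        pd_total = sum(dr[k] / fr[k] for k in FREQ_KEYS if fr[k])   # Equation 6.3
        return td + pd_total                                        # Equation 6.4

    # Hypothetical developer-review map DR(d, f) and file-review map FR(f).
    dr = {"P": 2, "N": 3, "R": 4, "T": 5,
          "A": date(2014, 3, 1), "D": date(2014, 3, 5), "W": date(2014, 2, 20), "L": date(2014, 3, 5)}
    fr = {"P": 10, "N": 12, "R": 20, "T": 25,
          "A": date(2014, 3, 10), "D": date(2014, 3, 12), "W": date(2014, 3, 1), "L": date(2014, 3, 12)}
    print(round(soe(dr, fr), 3))   # roughly 1.36 for this made-up data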

6.3 Application for Recommending Appropriate Developers in Change Request Triaging

We show the application of our developer expertise model from code reviews for the de-

veloper recommendation task in bug triaging. Developer recommendation is the assignment of

an incoming change request (e.g., a bug report or feature request) to the developers who are most likely to address it. Our presented approach, named rDevX, consists of the following steps:

Locating Relevant Entities to Change Request: We use the K-Nearest Neighbor (KNN) algorithm to locate relevant units of source code (e.g., files and classes) that match the given textual description of a change request or reported issue. The indexed source code release/snapshot is typically the one in which an issue is reported for the first time (i.e. the issue is not resolved in that revision). We impose this criterion because we cannot use any forward looking information in training our model.

Recommending developers based on the SoE Model: For each of the relevant files from the above step, we compute its SoE score using our developer expertise model from the reviews contributed prior to the snapshot time in the above step. Again, we impose this criterion because we cannot use forward looking reviews to train our model. We use a ranking mechanism from these scores to arrive at a ranked list of recommended developers.

6.3.1 Locating Relevant Entities to Change Request

In our approach, we use techniques from natural language processing and machine learning to locate textually relevant source code files to a given change request. The specific steps are given below:

Creating a corpus from software

The source code of a release, in or before which the change request is resolved, is parsed using a developer-defined granularity level (e.g., file) and documents are extracted. A corpus is created such that each file will have a corresponding document therein. Identifiers, comments, and expressions are extracted from the source code. Each document in the corpus is referred to as docpast.

Preprocessing Step

Each document docpast is preprocessed in two parts: removing stop words and stemming.

Document-Term Representation

We produce a dictionary from all of the terms in our document and assign a unique integer

Id to each term appearing in it.

Term Weighting: We use tf-idf in our approach. The global importance of a term i is based on its inverse document frequency (idf), which is computed from the document frequency of the term i, i.e., the number of documents in the whole collection in which the term appears. tf_{i,j} is the term frequency of the term i in the document j, and the weight w_{i,j} combines the two (tf-idf). Each document d_j is represented as a vector d_j = (w_{1,j}, ..., w_{i,j}, ..., w_{n,j}), where n is the total number of terms in our document collection and w_{i,j} is the weight of the term i in document j.

Using Change Request

The textual description of a change request for which we want to find all of the relevant

files, and eventually to be assigned developer(s), is referred to as docnew. docnew also goes through

the preprocessing step and is represented as a document.

K-Nearest-Neighbor

The task of multi-label classification is to predict for each data instance a set of labels that

applies to it. Standard classification only assigns one label to each data instance. However, in many

settings a data instance can be assigned more than one label. In our context, each data instance (i.e., a bug report) can be assigned multiple labels (i.e., source code files). ML-KNN is a state-of-the-art algorithm in the multi-label classification literature [113]. We employ the ML-KNN search

with a user-defined value of K (e.g., the value of 10 as used in previous work [107]) to search the

existing corpus (docpast) based on similarities with the new bug description (docnew). This search

finds the top K similar files. We consider these top K files to be a set, i.e., they are eventually

relevant to the change request regardless of their similarity levels or ranking. Cosine similarity is then used to measure the similarity between the two document vectors.

$$f_{sim}(doc_{new}, doc_{past}) = \frac{doc_{new} \cdot doc_{past}}{|doc_{new}|\,|doc_{past}|} \qquad (6.5)$$

The conclusion of this step identifies the K most relevant source code files to the given

change request. Now, we need to find the developer expertise using the SoE scores.
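The sketch below illustrates this retrieval step with off-the-shelf components (scikit-learn's tf-idf vectorizer plus cosine similarity). It is a simplified stand-in for the ML-KNN search described above, and the file names and texts are invented.

    # Simplified retrieval of the top-K textually relevant files for a change request.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    corpus_files = ["Foo.java", "Bar.java", "Baz.java"]              # one document (doc_past) per file
    corpus_docs  = ["review dashboard gerrit server selection",      # extracted identifiers and comments,
                    "attachment update job scheduling",              # already stop-word filtered and stemmed
                    "gerrit server site handler selection"]

    vectorizer = TfidfVectorizer(stop_words="english")               # tf-idf weighting of the corpus terms
    doc_matrix = vectorizer.fit_transform(corpus_docs)

    bug_text = "Review Dashboard selection for a Gerrit server"      # doc_new: the change request text
    sims = cosine_similarity(vectorizer.transform([bug_text]), doc_matrix)[0]

    K = 2
    top_k = sorted(zip(corpus_files, sims), key=lambda p: p[1], reverse=True)[:K]
    print(top_k)                                                     # the K files fed to the SoE scoring step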

6.3.2 Recommending developers based on SoE scores

We now describe how the ranked list of developers is obtained from all of the scored contributors of each relevant source code file to a given bug. Each contributor can be an author or a

reviewer. Each file fi may have multiple contributors; however, it is not necessary for all of the

files to have the same number of contributors. For example, the file f1 could have two contributors

and the file f2 could have three contributors. Each row in the matrix Dc (see Equation 6.6) gives

the list of unique contributors for each relevant file fi. Dcfi represents the set of contributors, with

no duplication, for the file fi, where 1 ≤ i ≤ n and n is the number of relevant files. c_{ij} is the j-th contributor in the file fi, which has l unique contributors.

$$Dc = \begin{pmatrix} f_1 & Dc_{f_1} \\ f_2 & Dc_{f_2} \\ \vdots & \vdots \\ f_n & Dc_{f_n} \end{pmatrix}, \qquad Dc_{f_i} = \{\, c_{i1}, c_{i2}, \ldots, c_{il} \,\} \qquad (6.6)$$

Although a single file does not have any duplicate contributors, two files may have common contributors. In Equation 6.7, Dcu is the union of all unique contributors from all relevant files.

$$Dc_u = \bigcup_{i=1}^{n} Dc_{f_i} \qquad (6.7)$$

$$Score(c) = \sum_{i=1}^{n} SoE(c, f_i) \qquad (6.8)$$

Each contributor c for a file f has an SoE score. To obtain the likelihood of the contributor c resolving the given change request, i.e., Score(c), we sum the SoE scores over the relevant files in which the contributor c appears (see Equation 6.8). That is, their overall expertise in all the files is computed collectively. The Score(c) value is calculated for each unique contributor c in the set

Dcu. In Equation 6.9, we have a set of candidate developers. The developers in this set are ranked based on their Score(c) values. Once the developers are ranked in the descending order of their

Score(c) values, we have a ranked list of candidate developers. By using a cutoff value of m, we

recommend the top m candidate developers, i.e., those with the top m Score(c) values, from the ranked list obtained from the set DF.

DF = {(c, Score(c)), ∀c ∈ Dcu} (6.9)

This step concludes rDevX and we have the top m candidate developers recommended for the given change request. If we do not get any recommendations, we resort to the package level or higher.
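A compact sketch of this ranking step is given below; the per-file SoE values are taken from a few rows of Table 6.1, and the aggregation follows Equations 6.8 and 6.9.

    # Aggregate SoE scores over the relevant files and recommend the top-m contributors.
    from collections import defaultdict

    soe_per_file = {   # SoE(c, f_i) for three of the relevant files of bug #428544 (see Table 6.1)
        "SelectReviewSiteHandler.java": {"Jacques Bouthillier": 6.08, "Steffen Pingel": 2.00},
        "AddGerritSiteHandler.java":    {"Jacques Bouthillier": 6.01, "Guy Perron": 0.44},
        "GerritOperationFactory.java":  {"Steffen Pingel": 4.51, "Miles Parker": 3.01},
    }

    scores = defaultdict(float)                 # Score(c) = sum over the relevant files of SoE(c, f_i)
    for contributors in soe_per_file.values():
        for contributor, value in contributors.items():
            scores[contributor] += value

    m = 2
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    print(ranked[:m])                           # top-m recommended developers for the change request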

6.3.3 An Example from Mylyn

We demonstrate rDevX using an example from Mylyn. The change request of interest here is bug #428544¹⁰, "Review Dashboard selection for a Gerrit server". We first collected a snapshot of Mylyn's source code prior to this bug being fixed and then parsed it using the file-level granularity (i.e., each document is a file). After indexing with a machine learning technique, we obtained a corpus consisting of 1,825 documents and 201,554 words. A search query was then formulated using the bug's textual description, the result of which (i.e., the top 10 relevant files) is summarized in Table 6.1. These (k = 10) files are fed to the second step of rDevX. The correct developer who fixed this bug and committed the change is Jacques Bouthillier. In Table 6.1, the second column shows the set of all developers with the SoE values for each file fi.

In rDevX, we first obtained the set Dcu from all of the developers recommended for each relevant file fi to the bug #428544 in Table 6.1. The set Dcu consists of 9 unique developers.

Because a developer could use different identities for the owner and reviewer roles, we normalized them to a single identity, which was their full name. For each of the 9 unique developers, the Score value is calculated according to Equation 6.8. Table 6.2 shows the top five Score values and the corresponding developers, i.e., m = 5. Jacques Bouthillier has the highest score in the set DF (a value of 30.48), so he is ranked first. For the remaining developers, the value of the function Score is less than Jacques Bouthillier’s score, so they all have a rank greater than 1.

Table 6.3 shows the results for the approaches rDevX, xFinder, and DevCom (described in Section 6.4.1) for m = 5. Clearly, the best results are for rDevX and DevCom, as they recommend

10https://bugs.eclipse.org/bugs/show_bug.cgi?id=428544

Table 6.1: The developers extracted with rDevX from each of the top ten files relevant to Bug #428544.

Files                            Ranked developers based on SoE value

…/SelectReviewSiteHandler.java   Francois Chouinard:0.43, Steffen Pingel:2.00, Jacques Bouthillier:6.08, Marc-Andre Laperle:0.86
…/AddGerritSiteHandler.java      Francois Chouinard:0.36, Steffen Pingel:2.00, Jacques Bouthillier:6.01, Guy Perron:0.44, Marc-Andre Laperle:0.86
…/GerritOperationFactory.java    Steffen Pingel:4.51, Miles Parker:3.01, Tomasz Zarna:0.16
…/AdjustMyStarredHandler.java    Francois Chouinard:1.23, Steffen Pingel:2.00, Jacques Bouthillier:5.36, Marc-Andre Laperle:0.86
…/BuildServerValidator.java      ....
…/R4EGerritQueryUtils.java       Francois Chouinard:6.09, Jacques Bouthillier:1.04
…/AllOpenReviewsHandler.java     Francois Chouinard:0.24, Steffen Pingel:2.00, Jacques Bouthillier:6.25, Marc-Andre Laperle:0.86
…/RefreshConfigRequest.java      Steffen Pingel:1.00, Miles Parker:3.00, Tomasz Zarna:0.16
…/R4EGerritServerUtility.java    Francois Chouinard:0.70, Jacques Bouthillier:5.74
…/GerritUiPlugin.java            Sebastien Dubois:0.05, Sam Davis:1.05, Steffen Pingel:1.31, Miles Parker:5.93, Tomasz Zarna:1.17

Jacques Bouthillier (the correct developer who fixed bug #428544) in the first position, whereas xFinder recommends Steffen Pingel. The original developer, Jacques Bouthillier who fixed the bug was not recommended by xFinder. Clearly, rDevX outperforms its competitor with respect to recall@1, recall@2, recall@3, and recall@5 values.

Table 6.2: Top five developers recommended to resolve bug #428544 by rDevX.

Owner                  Score   Rank

Jacques Bouthillier    30.48   1
Steffen Pingel         14.82   2
Miles Parker           11.94   3
Francois Chouinard      9.05   4
Marc-Andre Laperle      3.44   5

Table 6.3: Top five recommended developers and their associated ranks for the compared approaches.

Approach   Top five recommended developers in ranked order

rDevX      Jacques Bouthillier (1), Steffen Pingel (2), Miles Parker (3), Francois Chouinard (4), Marc-Andre Laperle (5)
xFinder    Steffen Pingel (1), Miles Parker (2)
DevCom     Jacques Bouthillier (1), Steffen Pingel (2), Miles Parker (3), Francois Chouinard (4), Marc-Andre Laperle (5)

6.4 Case Study

We assess our developer-expertise model derived from code-review archives in terms of its effectiveness in developer recommendation, i.e., we evaluate rDevX. Thus, the purpose of this study was to investigate how well our rDevX approach recommends expert developers to assist with incoming change requests, especially when compared to a commit-based approach (see Section

6.2.1 for the commit and code review comparison). We addressed the following research question:

RQ: How do the accuracies of rDevX (trained from the code review history), xFinder (trained from the commit history [54]), and DevCom (trained from a combination of the code review and commit histories) compare in recommending developers?

Because xFinder is an approach that uses developer expertise features only from the commit history, and its expertise markers bear a close resemblance to the ones we used from code reviews in rDevX, we chose to compare our approach with xFinder. Additionally, we had access to its tool support. We also consider another approach, DevCom, which is a combination of developer expertise features from both the commit and code review histories. The idea of comparing our approach with xFinder and DevCom was to investigate how expertise built from the code review history compares with that built from the commit history. Additionally, does their combination provide for a more effective developer-expertise model than considering them individually? Next, we elaborate on xFinder and DevCom.

6.4.1 xFinder

xFinder builds the developer expertise based on the number of commits, and their number

of workdays and recency. Similar to rDevX it adds the scores for each developer. It considers the actual developers or authors of changed code in commits (and not their committers). In a git repository, there are separate fields for authors and committers. These fields are populated automatically from Gerrit after the patch gets merged to the git repository. For example, in review

#279312¹¹ for OpenStack Nova, the author in the git repository is copied from the owner of the

patch in Gerrit, who in this case was Dan Smith. The committer was Jay Pipes. Therefore, this

patch/commit is attributed to Dan Smith and not Jay Pipes in xFinder. xFinder was shown to

be competitive with other developer recommendation approaches, including those that use bug

repositories [54].

xFinder uses three measures, the number of commits, their number of workdays, and recency, to determine the developers who are more likely to be experts in a specific source code file,

i.e., developer-commit map. The developer-commit map, DC for the developer d and file f is given

below:

DC(d, f) = ⟨If, Cf, Mf⟩

• If is the number of commits that include file f and are done by the developer d.

• Cf is the number of workdays in the commit activity of developer d that include the file f.

• Mf is the most recent workday in the commit activity of the developer d that includes the

file f.

Similarly, the file-commit map FC represents the commit contribution to the file f, and is shown

below:

FC(f) = ⟨I′f, C′f, M′f⟩, where

• I′f is the number of commits that include the file f.

11https://review.openstack.org/#/c/279312/

• C′f is the total number of workdays in the commit activity of all developers that include the file f.

• M′f is the most recent workday with a commit that includes the file f.

The measures If , Cf , and Mf are computed from the commit history. The contribution or

expertise factor, termed xFactor, for the developer d and the file f is computed using the ratios of

the developer–commit and file–commit maps. The contribution factor, xFactor, is given below:

$$xFactor(d, f) = \frac{DC(d,f)}{FC(f)} \qquad (6.10)$$

$$xFactor(d, f) = \begin{cases} \frac{I_f}{I'_f} + \frac{C_f}{C'_f} + \frac{1}{|M_f - M'_f|} & \text{if } |M_f - M'_f| \neq 0 \\[4pt] \frac{I_f}{I'_f} + \frac{C_f}{C'_f} + 1 & \text{if } |M_f - M'_f| = 0 \end{cases} \qquad (6.11)$$

We first need to locate relevant source code files to the change request. This step is similar to the first step of rDevX, which is explained in Section 6.3.1. The xFactor score is then computed for each of these relevant source code files. According to Equation 6.11, the maximum value of xFactor can be three because we have used three measures, each of which can have a maximum contribution ratio of 1. We used the same approach as in rDevX (see Section 6.3.2) to obtain the ranked list of developers from all of the scored contributors of each relevant source code file to a given bug. The only difference is that we replaced the SoE score with the xFactor score in Equation 6.8.
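A small sketch of the xFactor computation for one developer–file pair follows; as with the SoE sketch, the data layout and the day-based recency term are assumptions made for illustration.

    # Sketch of xFactor(d, f) from Equation 6.11.
    from datetime import date

    def xfactor(dc, fc):
        gap = abs((dc["M"] - fc["M"]).days)                  # recency: distance to the file's latest workday
        recency = 1.0 if gap == 0 else 1.0 / gap
        return dc["I"] / fc["I"] + dc["C"] / fc["C"] + recency

    dc = {"I": 4,  "C": 3,  "M": date(2014, 5, 1)}           # developer-commit map DC(d, f), hypothetical
    fc = {"I": 20, "C": 15, "M": date(2014, 5, 6)}           # file-commit map FC(f), hypothetical
    print(round(xfactor(dc, fc), 3))                         # 0.2 + 0.2 + 0.2 = 0.6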

6.4.2 DevCom

The presence of orthogonality between different sources has been leveraged in several other software engineering tasks previously [41], which served as an inspiration to emulate the combination for the developer recommendation task. To assess the potential orthogonality between the commits and reviews, we devised a combined approach, namely DevCom, which is based on the factors of rDevX and xFinder. Similar to rDevX and xFinder, we first locate relevant source code files to the change request. This step is similar to the first step of rDevX (see Section 6.3.1).

For each of these source code files, DevCom considers eight metrics from reviews and another three from commits. That is, for each source code file we calculate the sum of its relevant SoE score (Equation 6.4) from the code review history and its relevant xFactor score (Equation 6.11) from the commit history. The maximum value of the cumulative sum can be 11 because we have used a total of 11 measures, each of which can have a maximum contribution ratio of 1. We used the same approach from Section 6.3.2 to obtain the ranked list of developers from all of the scored contributors of each relevant source code file to a given bug. The only difference is that we replaced the SoE score with the calculated cumulative score in Equation 6.8. The commits and code reviews are managed in two different systems (git and Gerrit). As such, there might be two different identities for the same developer. We did string matching on their credentials (e.g., names) between the two systems in case of an id mismatch to ascertain whether they were different or the same developer. This automatic step was augmented with a manual examination.
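The combination itself is a per-file sum of the two scores before the aggregation of Equation 6.8; a minimal sketch with hypothetical values is shown below.

    # DevCom per-file score: SoE (reviews, Eq. 6.4) + xFactor (commits, Eq. 6.11),
    # then summed over the relevant files as in Equation 6.8. Values are made up.
    per_file = {
        "SelectReviewSiteHandler.java": (6.08, 0.60),   # (SoE, xFactor) for one developer
        "AddGerritSiteHandler.java":    (6.01, 0.00),   # no commit activity on this file by the developer
    }
    total_score = sum(soe + xfactor for soe, xfactor in per_file.values())
    print(round(total_score, 2))                        # 12.69 for this developer and change request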

Developer interactions captured in IDEs are another source that has been used for developer expertise models [111, 39] and recommendations [112, 86, 64]. We could not include this approach in our comparison because of the lack of availability of such interaction repositories in the open source domain. The Mylyn and Eclipse Platform projects, which did collect developer interactions extensively, no longer seem to do so. Only 120 bug reports from 2013-04-08 to 2016-04-08 have an attachment of Mylyn interaction traces. The interactions seem to have been replaced with code reviews in these projects. The transition perhaps indicates that code reviews are perceived to provide better value than interactions.

6.4.3 Benchmarks: Bugs, Commits, and Reviews

We collected all the Bug IDs for our benchmark from Bugzilla¹² for Mylyn and Eclipse Platform, and LaunchPad¹³ for OpenStack Nova. The benchmark consists of a set of change requests. Each change request has a natural language query (short, textual summary) and a gold set of developers that addressed it. The benchmark was established by a manual inspection of the change requests, the source code, their code review history in the code review tool, Gerrit, and their historical changes

12https://bugs.eclipse.org/bugs 13https://bugs.launchpad.net/nova

recorded in version-control repositories, i.e., Git. In order to create our benchmark, we considered the following bug selection criteria: the bugs 1) were fixed and resolved, 2) had both commits and code reviews available prior to their reporting dates, 3) contained changes to the source code, 4) had fix commits traceable to the bug tracking system, and 5) were traceable to a release of the software, because we need the latest snapshot of a system on or before the date a particular bug was reported.

For tracing each Bug ID in the commit repository, keywords such as Bug ID in the commit messages/logs were used as starting points to examine if the commits were in fact associated with the change request in the issue tracking system that was indicated with these keywords. The author and commit messages, which can be readily obtained, were processed to identify the developers who contributed changes to the change requests (i.e., gold sets). It should be noted that committers are not necessarily the actual authors or developers who resolved the bug and contributed fixes. The actual authors are typically mentioned in the commit messages or in a special author field (which is different from the committer field, e.g., in git). We considered the actual authors or developers in our gold set.
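A simplified sketch of this commit-to-bug linking is shown below; the regular expression and the sample commit records are illustrative only and stand in for the manual-plus-keyword inspection described above.

    # Link commits to bug IDs mentioned in their messages and collect the gold-set authors.
    import re

    BUG_ID = re.compile(r"[Bb]ug\s*#?\s*(\d+)")

    commits = [   # hypothetical commit records (author field, not committer)
        {"author": "Jacques Bouthillier", "message": "Bug 428544: Review Dashboard selection for a Gerrit server"},
        {"author": "Frank Becker",        "message": "cleanup: remove unused imports"},
    ]

    gold_sets = {}                                        # bug id -> set of resolving authors
    for commit in commits:
        for bug_id in BUG_ID.findall(commit["message"]):
            gold_sets.setdefault(bug_id, set()).add(commit["author"])
    print(gold_sets)                                      # {'428544': {'Jacques Bouthillier'}}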

A vast majority of change requests are handled by a single developer (90% in Mylyn, 89% in Eclipse Platform and 95% in OpenStack Nova). Table 6.4 shows the frequency distributions of developers resolving issues in the benchmarks for Mylyn, Eclipse Platform and OpenStack Nova.

For example, 114 issues were resolved by a single developer in Mylyn, 11 were resolved by two developers in Eclipse Platform, and none of the issues were resolved by three developers in OpenStack Nova. The Inter Quartile Range (IQR) column shows the second and third quartile values for the number of bugs resolved by unique developers in the benchmarks. For example, Mylyn has 23 unique developers, of which 50% (second quartile) resolved at most one bug and 75% (third quartile) resolved at most 6 bugs out of the total 126 bugs in the benchmark. OpenStack Nova has the largest number of unique developers, and hence its second and third quartile values are lower than those of the other projects. Our technique operates at the change request level, so we also needed input queries to test. We considered all of the bugs that have at least one associated Id in the commit messages and trace files. We created the final training corpus from all of the source code files in the last revision

Table 6.4: The frequency distributions of developers and summary statistics (IQR, unique developers) resolving issues in the benchmarks for Mylyn, Eclipse Platform and OpenStack Nova.

System | Resolved by 1 Developer | Resolved by 2 Developers | Resolved by 3 Developers | Total Issues | IQR Q2 | IQR Q3 | #Unique Developers
Mylyn | 114 | 10 | 2 | 126 | 1 | 6.50 | 23
Eclipse Platform | 116 | 11 | 3 | 130 | 1 | 2.25 | 40
OpenStack Nova | 114 | 6 | 0 | 120 | 1 | 2.00 | 64

Table 6.5: Descriptive statistics of the review and commit history of the three open source subject systems

System | Code Review Start Date | Code Review End Date | #Submitted Patches | #Merged Patches | #Abandoned Patches | #Review Comments | Commit Start Date | Commit End Date | #Commits
Eclipse Platform | 02/28/2012 | 12/22/2014 | 1902 | 1419 | 331 | 10975 | 02/28/2012 | 12/22/2014 | 3100
Mylyn | 02/29/2012 | 12/26/2014 | 1617 | 1212 | 337 | 10392 | 02/29/2012 | 12/26/2014 | 1867
OpenStack Nova | 09/22/2011 | 03/10/2016 | 24776 | 17467 | 6384 | 401181 | 09/22/2011 | 03/10/2016 | 25607

before the bug issues in our benchmark were fixed. We considered the commits and reviews prior to that revision for training the respective approaches.

Table 6.5 shows the descriptive statistics of the review and commit history of the three studied systems. The number of commits is larger than the number of reviews because not all developers adopted the review practice in the first couple of months after it was introduced. Also, we could not retrieve a few reviews from Gerrit. The experiment was run for m = 1, m = 2, m = 3, and m = 5, where m is the number of recommended developers. We considered the top ten relevant files from the machine learning step.

6.4.4 Metrics and Hypotheses

We evaluated the accuracy of each of the approaches for all of the bug issues in our test set using the Recall and Mean Reciprocal Rank (MRR) metrics used in previous work [9, 65, 107]. For each bug b, in a set of bugs B of size n, in the benchmark of a system and a number m of recommended developers, recall@m is given below:

\[ \mathrm{recall@}m = \frac{|RD(b) \cap AD(b)|}{|AD(b)|} \qquad (6.12) \]

where RD(b) and AD(b) are the set of developers recommended by the approach and the set of actual developers who resolved the bug b, respectively. This metric is computed for recommendation lists of developers with different sizes (e.g., m = 1, m = 2, m = 3, and m = 5 developers).

Increasing the value of m could lead to a higher recall value (and typically does); however, it may come at the cost of increased effort in examining the potential noise (false positives). Over 90% of the cases in our benchmarks have only a single correct developer. In such a scenario, each increment to m in pursuit of the correct developer could add drastically to the proportion of false positives. Therefore, a traditional metric such as precision is not suitable for our context. For example, if a correct developer is found at m = 5, the possible precision value is in the range [0.2, 1.0] for the rank positions [5, 1], typically around the lower bound of 0.2. Note that the precision metric is also agnostic to the rank positions of recommendations. Therefore, for a cutoff value of m, it would produce the same value for two approaches presenting the same number of correct answers. For m = 5, two approaches presenting a single correct answer at positions 1 and 5 respectively would have the same precision value of 0.2. Therefore, we focused on evaluating the ranked position of the correct developer for each bug in each benchmark from a cumulative perspective, regardless of the cutoff point m.

Mean Reciprocal Rank (MRR) is one such measure that can be used for evaluating any process that produces a list of possible responses to a sample of queries, ordered by probability of correctness. This metric has been used in previous work [100]. The reciprocal rank of a query response is the multiplicative inverse of the rank of the first correct answer. Intuitively, the lower the value (between 0 and 1), the farther down the list, examining incorrect responses, one would have to search to find a correct response.

\[ \mathrm{MRR} = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{\mathrm{rank}_i} \qquad (6.13) \]

The reciprocal rank for a query (bug) is the reciprocal of the position of the correct developer in the returned ranked list of developers (rank_i), and n is the total number of bugs in our benchmark. When the correct developer for a bug is not recommended at all, we consider its inverse rank to be zero. When there are multiple correct developers (which occurs in very few cases in our benchmark), we consider the highest/first ranked position. The higher the MRR value, the less effort is potentially spent examining noise. For example, an MRR value of 0.5 suggests that the approach typically produces the correct answer at the 2nd rank.
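The following sketch illustrates how recall@m and MRR can be computed for this setting, assuming each bug has a ranked recommendation list and a (usually single-member) gold set; the function names are illustrative only.

```python
# Illustrative sketch of recall@m (Equation 6.12) and MRR (Equation 6.13) for
# ranked developer recommendations against per-bug gold sets.

def recall_at_m(recommended, actual, m):
    """recommended: ranked list of developers; actual: gold set for one bug."""
    return len(set(recommended[:m]) & set(actual)) / len(set(actual))

def mrr(all_recommended, all_actual):
    """Average reciprocal rank of the first correct developer over all bugs."""
    total = 0.0
    for recommended, actual in zip(all_recommended, all_actual):
        ranks = [recommended.index(d) + 1 for d in actual if d in recommended]
        total += 1.0 / min(ranks) if ranks else 0.0  # missing developer counts as 0
    return total / len(all_recommended)
```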

We defined the following null hypotheses for our study (the alternative hypotheses can be easily derived from the respective null hypotheses):

• H-1: There is no statistically significant difference among the recall@m values of rDevX, xFinder, and DevCom.

• H-2: There is no statistically significant difference among the MRR values of rDevX, xFinder, and DevCom.

6.4.5 Results

The recall@1, recall@2, recall@3, and recall@5 values using rDevX, xFinder, and DevCom for Eclipse Platform, Mylyn, and OpenStack Nova were calculated from the established benchmarks. Table 6.6 shows the recall@m values of all three systems using all three approaches (see the Recall@m column). Similarly, the MRR values for each of the compared approaches for Eclipse Platform, Mylyn, and OpenStack Nova were then calculated. Table 6.7 shows the MRR values of all three systems using the compared approaches (see the MRR column).

To investigate the research question RQ, we computed the metric gain of rDevX (where X is recall or MRR) over another compared approach Y (where Y is xFinder or DevCom) using the following formula:

\[ \mathrm{Gain}_{X@m}^{\,rDevX\text{-}Y} = \frac{X@m_{rDevX} - X@m_{Y}}{X@m_{Y}} \times 100 \qquad (6.14) \]

We computed the recall gain of rDevX over xFinder and DevCom using Equation 6.14. Similarly, we computed the MRR gain of rDevX over xFinder and DevCom using Equation 6.14. The Gain % columns in Table 6.6 show the recall gains of rDevX over each compared approach for the different m values. The Gain % columns in Table 6.7 show the MRR gains of rDevX over each compared approach. The p-value columns in Table 6.6 show the p-values from applying the One Way ANOVA test on the recall values for m = 1, m = 2, m = 3, and m = 5. The p-value columns in Table 6.8 show the p-values from applying the One Way ANOVA test on the reciprocal rank values.
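A small sketch of the gain computation in Equation 6.14, using the averaged metric values as inputs; the worked example below reuses values reported in Table 6.6.

```python
# Illustrative sketch of Equation 6.14: percentage gain of rDevX over another approach.

def gain(metric_rdevx, metric_other):
    return (metric_rdevx - metric_other) / metric_other * 100

# Example with the Mylyn recall@2 values from Table 6.6:
# gain(0.42, 0.30) -> 40.0, i.e., the reported rDevX-xFinder gain of 40.00%.
```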

Clearly, rDevX outperforms xFinder in terms of both recall and MRR in all three systems. All the recall gains for different values of m (with the exception of Mylyn and OpenStack Nova recall at m=1) and all the MRR gains are statistically significant (i.e., p-values < 0.05). Nonetheless, rDevX is no worse than xFinder in these exceptional cases. We manually investigated the results for OpenStack Nova and noticed that both rDevX and xFinder recommend four core developers at the top position for 50% of the bugs in the benchmark. These core developers have contributed to the majority of the source code files in OpenStack Nova, which is why both approaches recommend them at the top position and produce similar results with no significant difference.

In the case of the comparison between rDevX and DevCom, all the gains (with the exception of Mylyn recall at m=3, OpenStack Nova recall at m=1 and m=2, and OpenStack Nova MRR) are positive or zero. The statistical testing showed that the gains are not significant (p-values > 0.05). This analysis suggests that combining the commit history data with the code review data does not have a significant impact on the developer expertise model. In fact, the rDevX approach is preferable to DevCom, as we obtain comparable or better results with relatively less data, i.e., with the code review data alone.

In the case of the comparison between DevCom and xFinder, DevCom outperforms xFinder in terms of both recall and MRR in all three systems. All the recall gains for different values of m (with the exception of Mylyn at m=1 and m=2, Eclipse Platform at m=3, and OpenStack Nova

Table 6.6: Average, gains, and p-values of recall@1, 2, 3 and 5 values of the approaches.

System | m | Recall@m rDevX | Recall@m xFinder | Recall@m DevCom | Gain rDevX-xFinder % | Gain rDevX-DevCom % | Gain DevCom-xFinder % | p-value rDevX-xFinder | p-value rDevX-DevCom | p-value DevCom-xFinder
Mylyn | 1 | 0.31 | 0.27 | 0.31 | 14.90 | 0.00 | 14.90 | < 0.50 | < 0.10 | ≤ 0.40
Mylyn | 2 | 0.42 | 0.30 | 0.40 | 40.00 | 5.00 | 33.33 | ≤ 0.04 | < 0.80 | ≤ 0.10
Mylyn | 3 | 0.57 | 0.38 | 0.59 | 50.00 | -3.34 | 55.26 | ≡ 0.00 | < 0.90 | ≡ 0.00
Mylyn | 5 | 0.70 | 0.40 | 0.70 | 75.00 | 0.00 | 75.00 | ≡ 0.00 | < 0.10 | ≡ 0.00
Eclipse Platform | 1 | 0.31 | 0.20 | 0.29 | 55.00 | 6.90 | 45.00 | ≤ 0.03 | ≤ 0.80 | ≤ 0.04
Eclipse Platform | 2 | 0.45 | 0.30 | 0.43 | 50.00 | 4.65 | 43.33 | ≡ 0.00 | ≤ 0.80 | ≤ 0.02
Eclipse Platform | 3 | 0.52 | 0.40 | 0.50 | 30.00 | 4.00 | 25.00 | ≤ 0.04 | ≤ 0.80 | ≤ 0.08
Eclipse Platform | 5 | 0.60 | 0.49 | 0.59 | 22.44 | 1.70 | 20.40 | ≤ 0.05 | ≤ 0.90 | ≤ 0.05
OpenStack Nova | 1 | 0.20 | 0.14 | 0.21 | 42.85 | -4.76 | 50.00 | < 0.20 | < 0.90 | ≤ 0.10
OpenStack Nova | 2 | 0.31 | 0.19 | 0.34 | 63.15 | -8.82 | 78.95 | ≤ 0.02 | < 0.60 | ≡ 0.00
OpenStack Nova | 3 | 0.43 | 0.29 | 0.43 | 48.27 | 0.00 | 48.27 | ≤ 0.01 | < 0.10 | ≤ 0.01
OpenStack Nova | 5 | 0.50 | 0.37 | 0.49 | 35.13 | 2.04 | 32.43 | ≤ 0.03 | < 0.90 | ≤ 0.04

at m=1), and MRR gains are statistically significant (i.e., p-values<0.05). Nonetheless, DevCom is no worse than xFinder in these exceptional cases.

In summary, the overall results suggest that rDevX generally performs better than xFinder in terms of both recall and MRR. Using the code review history typically leads to improvements in accuracy. Not only does it identify the correct developers more often (as evident from the significant recall gains or no loss), but also at a high enough position in the list of recommended candidates (as evident from the significant MRR gains or no loss). For example, rDevX recorded recall gains over xFinder in the range [14.90%, 75%] across Eclipse Platform, Mylyn, and OpenStack Nova. Also, on average, the correct developer would appear at the 2nd or 3rd position (MRR = [0.33, 0.48]) for rDevX recommendations, whereas for xFinder this would be at the 3rd or 4th position (MRR = [0.24, 0.34]), which corresponds to MRR gains of up to 41%. Based on the results, we find support to reject Hypotheses H-1 and H-2 in favor of rDevX.

RQ: rDevX performs much better than xFinder which relies on source code repository data, and rDevX is statistically equivalent to DevCom which requires both past reviews and commits.

Table 6.7: Mean Reciprocal Rank of the approaches rDevX, xFinder, and DevCom measured on the benchmarks.

System | MRR rDevX | MRR xFinder | MRR DevCom | Gain rDevX-xFinder % | Gain rDevX-DevCom % | Gain DevCom-xFinder %
Mylyn | 0.48 | 0.34 | 0.47 | 41.17 | 2.12 | 38.24
Eclipse | 0.46 | 0.34 | 0.45 | 35.29 | 2.22 | 32.35
OpenStack | 0.33 | 0.24 | 0.34 | 37.50 | -2.94 | 41.67

Table 6.8: p-values from applying one way ANOVA on MRR values for each subject system.

System | p-value rDevX-xFinder | p-value rDevX-DevCom | p-value DevCom-xFinder
Mylyn | < 0.01 | < 0.9 | ≤ 0.01
Eclipse | < 0.02 | < 0.9 | ≤ 0.02
OpenStack | ≤ 0.04 | < 0.9 | ≤ 0.02

6.4.6 Qualitative Analysis

We discuss a few points that would help understand the rationale behind the improved performance from rDevX:

Different Contributions: There are indeed different levels of contributions measured from the commit and review repositories. For example, David Green submitted eight patches to the package “org.eclipse.mylyn.wikitext.markdown.core”, which were captured in the review archives. Only seven were accepted, which were recorded in the commit archives. The one abandoned patch was not abandoned due to poor quality; it was coauthored with another developer and moved to a different review record. Thus, David Green is more knowledgeable in this package than the commit history would suggest.

Multiple Roles: We did find evidence that the same individuals contribute in multiple roles. Again, David Green reviewed 13 patches authored by other developers in the package “org.eclipse.mylyn.wikitext.markdown.core”, but not necessarily code he authored. Manual examination of his textual comments suggests that they were very detailed and demonstrated subject knowledge. Again, the commit archives completely miss this dimension of knowledge acquisition, but the code review archives do capture it.

Makeup for Inexact Relevant Files: David Green resolved the issue #440935 “create a document builder for Markdown” for Mylyn, which appears to be a feature request. The KNN method for feature location did not list any of the source code files he changed, i.e., the accepted patch, to resolve this issue. Instead, KNN listed top files (e.g., MarkdownReferenceValidationRule.java) which were again in the same package, “org.eclipse.mylyn.wikitext.markdown.core”. David Green did not have a single commit related to these files; however, he did contribute reviews to them. Thus, he had a previous record of code review, which was used in rDevX.

Dissimilar Recommendations: We calculated the Jaccard similarity between the recommended lists of developers from xFinder and rDevX for our benchmarks. The average Jaccard values for Eclipse Platform, Mylyn, and OpenStack Nova were 0.57, 0.40, and 0.70 respectively. These values suggest their recommendations are quite dissimilar.
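A small sketch of the Jaccard similarity used for this comparison, treating the two recommendation lists as sets of developers; the function name is illustrative.

```python
# Illustrative sketch: Jaccard similarity between two recommendation lists.

def jaccard(list_a, list_b):
    a, b = set(list_a), set(list_b)
    return len(a & b) / len(a | b) if a | b else 1.0
```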

6.5 Threats to Validity

Accuracy measures and correctness of developer recommendations: We used the widely used metric recall in our study. We also calculated the mean reciprocal rank. We considered the gold set to be the developers who contributed source code changes to address the change requests. Of course, it is possible that other team members are equally qualified to handle these change requests; however, such a gold set is very difficult to obtain. Nonetheless, our benchmark can be viewed as a conservative bound.

Machine Learning-based matching of change requests to relevant files: KNN sometimes returned classes (i.e., files) that were not found in the commits related to the bug fixes or change request implementations. However, based on our prior work [112], we observed that the files that were recommended as textually similar were either relevant (but not involved in the change that resolved the issue) or conceptually related (i.e., developers were also knowledgeable in these parts).

Normalizing Developer Expertise: Our approach calculates eight markers for each developer’s contribution to a specific source code entity (file f). Each of these markers was normalized to the corresponding marker of the total activity on that entity (file f), i.e., the SoE score. We could have used other distance measures, such as the cosine similarity between their vector representations. This issue may have introduced a construct validity threat to the results.

Individual Effect of Expertise Markers: Although we empirically compared the commit and review sets of markers, we did not study the statistical effect of individual markers in each category. For example, we did not empirically compare the effects of the number of patches submitted vs. accepted, which were markers derived from code reviews.

Narrow Ground Truth: Our ground truth included only the developers who eventually re- solved the change request. It is likely that other developers are equally capable of resolving the same requests. Identifying them is not obvious from repositories and would require a quantitative study.

Reviewer Identity Mismatch: Although we carefully examined the available sources of information to match the different identities of the same reviewer, it is possible that we missed or mismatched a few cases. There were several cases of different IDs of the same individuals in the git and Gerrit repositories, which we manually mapped with string/pattern matching.

Availability and Workload of Developers: Our approach only considers the expertise of developers and not other factors such as their availability, workload, and/or willingness. If the highest ranked developers cannot be assigned due to such non-technical factors, others down the list should be considered. Therefore, our approach recommends a user-defined number of developers.

Different Review Processes and Tools: We only considered Gerrit’s review process and model. We did not consider other methods and tools of reviews such as emails, and pull requests in git or GitHub.

Generality: Although we investigated three open source systems, which actively use the code review process, we do not claim that our results would generalize to every software system in these domains.

CHAPTER 7

Code Reviewer Recommendation (cHRev)

Code review is an important part of the software development process. Recently, many open source projects have begun practicing code review through “modern” tools such as GitHub pull-requests and Gerrit. Many commercial software companies use similar tools for code review internally. These tools enable the owner of a source code change to request individuals, i.e., reviewers, to participate in the review. However, this task comes with a challenge. Prior work has shown that the benefits of code review are dependent upon the expertise of the reviewers involved. Thus, a common problem faced by authors of source code changes is that of identifying the best reviewers for their source code change. To address this problem, we present an approach, namely cHRev, to automatically recommend reviewers who are best suited to participate in a given review, based on their historical contributions as demonstrated in their prior reviews [110]. We evaluate the effectiveness of cHRev on three open source systems as well as a commercial codebase at Microsoft and compare it to the state of the art in reviewer recommendation. We show that by leveraging the specific information in previously completed reviews (i.e., quantification of review comments and their recency), we are able to improve dramatically on the performance of prior approaches, which (limitedly) operate on generic review information (i.e., reviewers of similar source code files and path names) or source code repository data. We also present insights into why our approach cHRev outperforms the existing approaches.

7.1 Introduction

Software peer review, which is a manual inspection of source code by other stakeholders

besides its author, has been in practice for several years [5, 6]. Recently, a number of empirical studies about various facets of the modern code review process have been reported in the literature [69, 82, 17, 103, 49, 18, 81, 83, 84]. Deeply inspired by these efforts, we focus our work on the

critical topic of finding the human reviewers who are most likely to contribute in peer reviewing

source code changes. Bacchelli et al. [13] studied modern code review at Microsoft and found that if reviewers have prior knowledge of the context and code under review, they complete the reviews more quickly and provide more valuable feedback to the author. Thus, expertise and knowledge have a direct effect on code review quality. Rigby et al. [84] studied broadcast-based peer review in OSS. They discovered that sometimes an author of a patch, based on their confidence that they have a good working knowledge of the code involved in the patch, prefers to use an explicit review request or to send an email message directly to potential reviewers.

These studies demonstrate that a developer’s expertise on a certain part of the source code is an important factor for considering them as a potential reviewer. However, it is not always easy to determine who has the most expertise given a particular change for review, especially for newcomers to a codebase or those changing parts of the code with shared ownership by many people. Thus, authors of a change and/or reviewer assigners are often confronted with the question “Who should review this change?” In the area of code review, requests for help selecting the right reviewers are one of the most common requests from developers at Microsoft (requests for a system providing such help occur weekly on review mailing lists). One developer recently shared his frustration:

“I made a one line change to Exchange in a part of the code that I don’t typically work on and so of course I had to have it reviewed. I added the dev who most recently changed the file and he reviewed it, but then told me to be sure to add the owner and told me who it was. So I added him and he told me to make sure and have his lead review it as well. In the end, it took two weeks to get my one line change in!”

Finding the right reviewers does not often take two weeks; however, this experience is emblematic of the need to find appropriate reviewers in a timely manner for a code change. Clearly, there is strong anecdotal and empirical evidence from both OSS and commercial domains on the importance of finding the most appropriate reviewers to sustain the code review process effectively and efficiently [13, 82]. The value of choosing the right reviewers to examine code is not new.

Selecting and assigning reviewers to a review process was one of the managers’ responsibilities in traditional inspection, which was done manually [38]. Unfortunately, there has been little effort in

building automatic approaches to recommend the most suitable reviewers in the modern (tool-based)

code review process, which includes the work of Xia et al. [98], Thongtanunam et al. [96], and Balachandran [14]. Balachandran termed the task of identifying the most appropriate reviewers for a change as Reviewer Recommendation.

This chapter presents an approach, namely cHRev, to solve this problem automatically based on historical code review information. In a nutshell, it favors code review Histories over other types of past information to recommend Reviewers; hence, the name cHRev. cHRev rests on two key insights. The first is that reviewers are not necessarily confined to the developers who previously committed changes to the source code that is now the subject of review for another change, e.g., a bug fix or a feature implementation. For example, there may be team members who own other related features and/or source code modules, or who do not work on the code directly, yet have the expertise to provide quality code review feedback. The second is that expertise changes over time, and thus both the frequency and recency of contributions must be accounted for to find the most appropriate reviewers.

In an effort to demonstrate the effectiveness of our approach, we compare cHRev with REVFINDER [96], xFinder [52], and RevCom. We show that cHRev outperforms all three of these approaches. REVFINDER is a recently proposed technique that uses code review history to identify reviewers. REVFINDER assigns an expertise score to reviewers based on their number of past reviews on similar file names and paths. Unlike cHRev, it does not consider the amount of contributions (feedback comments and days) in each past review and their temporal recency. xFinder is a developer recommendation approach for source code, which is used here for reviewer recommendation. To assess the potential orthogonality between the code commits and reviews, we devised a combined approach, namely RevCom, which is based on the factors of cHRev and xFinder.

Our work makes the following noteworthy contributions in recommending relevant reviewers for a given change:

1. We present cHRev that utilizes code review histories for recommending reviewers for a code change.

2. We perform a comparative study of cHRev, REVFINDER, xFinder, and RevCom.

3. We demonstrate the effectiveness of cHRev through an empirical evaluation on one industrial (MS Office) and three open source (Android Platform, Eclipse Platform, and Mylyn) systems.

The rest of this chapter is organized as follows: Section 7.2 presents background on modern code review and associated terminology. Our approach is discussed in Section 7.3. The empirical study on Android Platform, Eclipse Platform, Mylyn, and MS Office and its results are presented in Section 7.4. Threats to validity are discussed in Section 7.5. Related work is discussed in Section 7.6.

7.2 Background on Modern Code Review

In this section, we define the key concepts involved in the modern code review, which is driven by supporting infrastructure and tools, e.g., Gerrit and CodeFlow.

Code Change: A code change is a set of modified source code files submitted to fix a bug or add a new feature.

Review: A code review is a record of the interactions between the owner of a change and reviewers of the change including comments on the code and signoffs from reviewers.

Owner: An owner is the developer who makes the change in the source code and submits it for review.

Reviewer: A reviewer on a particular review is a developer who is assigned to and/or con- tributes to that review.

Review Comment: A review comment is textual feedback written by a reviewer about the code change during the review process. A review comment may be about the change in general or may be explicitly tied to a particular part of the change.

The lifecycle of a review is as follows: Initially a developer (the owner) makes changes to the source code in response to a bug report or feature request. Once complete, they submit the code change for review. The owner may indicate the intended reviewers, who are subsequently notified about the review invitation. It should be noted that the invited reviewers do not necessarily accept

the invitation and contribute to the review. Reviewers then inspect the change through the code

review tool (a web page in the case of Gerrit or a Windows application in the case of CodeFlow)

and provide feedback in the form of review comments to the owner. The code change is typically depicted by showing the difference of the code before and after the change. The owner may update

the change and submit the update to the review as a result of such feedback. Eventually, a reviewer “signs off” on the review, once they believe the code change is of sufficient quality to be checked into the code repository. If a change never receives sign-offs, it is abandoned. The number of

sign-offs required to check in a code change is typically dependent on the team policy. Gerrit is a

modern peer-review tool that facilitates a traceable review process for git-based software projects

[69]. Developers make local changes in their private git repositories and then submit these changes

as a patch for review [82]. Most Microsoft developers practice code review using CodeFlow, an internal tool for reviewing code, which is under active development and regularly used by more than 50,000 developers. CodeFlow is a collaborative code review tool similar to other popular review tools such as Gerrit.

Code review is a quality assurance mechanism and is required for check-in. Therefore, it is critical that it is both effective (actually improves code changes and blocks poor code from being checked into the repository) and timely (does not act as a bottleneck by slowing down changes). Prior research [13] has found that higher expertise of reviewers leads to both.

7.3 The cHRev Approach

The basic premise of our approach is that the reviewers who reviewed the units of source

code in the past are most likely to best assist with reviewing it in the future. Our approach, cHRev,

takes a code change submitted for review and mines the archives of reviews, i.e., review history,

from the code review system (e.g., Gerrit) to recommend a ranked list of candidates for reviewing the given code change. It utilizes the past code changes and their reviewers to form a quantifiable model of the expertise of each reviewer in each source code file. In a code change, the cardinality of source code files is typically greater than 1. Therefore, the overall expertise of each candidate reviewer for the given code change is derived from a cumulative scoring function for all source

code files in it. Finally, a ranked list of the top m (a tunable user parameter) reviewers is recommended.

To be specific, cHRev consists of the following steps:

Step 1: Extract Source Code Under Review: Given a code change under review for

which reviewers are desired, it extracts each source code file.

Step 2: Formulate Reviewer Expertise: For each source code file in Step 1, it forms a reviewer expertise model based on how many reviews were performed on it in the past, who performed them, and when. That is, we need to know the contribution of each past reviewer relative to the total number of reviews on the file from the code-review history.

Step 3: Score and Recommend Reviewers: Finally, the cumulative contributions of the

reviewer in Step 2 for all the source code files in Step 1 are scored to arrive at a ranked list of

candidate reviewers. A user defined parameter m is used to recommend the top m candidates from

this list. The choice of m can be guided by the organizational or project practices or historical

information on the typical number of reviewers.

7.3.1 Formulating Reviewer Expertise Model

The review comments are a mechanism that reviewers use to express their feedback and

communicate with the owner and other peer reviewers of a code change. That is, these comments

are a primary means for discussion and discourse in modern peer code review. They can be considered a manifestation of reviewer expertise. Now, the question is how this valuable source can be used to quantify the expertise of reviewers. We use three metrics to quantify reviewers’ expertise

from their contributed review comments.

One measure of a reviewer’s contribution is the total number of review comments they

contributed to previous code changes. A particular reviewer who contributed a larger number of

review comments than another peer to specific units of source code (i.e., files) can be considered more knowledgeable on those parts. Although this count measure may capture valuable expertise information, it may not be the only reflection of expertise. Depending on the complexity and

nature of each code change, different levels of effort may be needed. We consider time as a proxy measure of effort, which is typically used in other domains [87]. We consider the smallest unit of

work, i.e., effort, devoted by a reviewer to be a workday. A reviewer’s workday is considered as a day (calendar date) on which they contributed at least one review comment (to at least one file) in a code change, because a reviewer can have multiple review comments on a given workday. A day on which no such review comments exist is not considered a workday. Two reviewers are considered to have made the same overall effort in reviewing changes to the same source code file if they wrote all their review comments (regardless of the variation in their counts) in the same number of calendar (work) days. Accounting for the frequency (review count) and effort (workday) may not suffice if they are not relevant to the submitted code change under review. The third measure accounts for the recency of the review comments. Recent review comments are given a higher weight than distant ones, i.e., it is an inverse measure. Each of these three measures is normalized with respect to the total contributions on each source code file.

Previously, these three measures were used and validated in the context of commit history and developer recommendation, i.e., finding the developers who are most likely experts in particular source code units and/or in fixing a bug [52]. Therefore, using this foundation, we contextualized and redefined them for the reviewer recommendation task, i.e., to determine the reviewers who are more likely than others to be experts in reviewing a specific source code file, captured as a reviewer-expertise map.

The reviewer-expertise map, RE, for the reviewer r and file f is given by

RE(r, f) = [C_f, W_f, T_f], where C_f is the number of review comments contributed by the reviewer r for the file f, W_f is the number of workdays of the reviewer r on which they contributed review comments for the file f, and T_f is the most recent workday of the reviewer r with the file f. Similarly, the file–review map, FR, represents the review contribution to the file f and is given by

FR(f) = [C'_f, W'_f, T'_f], where C'_f is the number of review comments that were written for the file f, W'_f is the total number of workdays on which review comments were contributed for the file f, and T'_f is the most recent workday on which a review comment was contributed for the file f. The contribution or expertise factor, termed xFactor, for the reviewer r and the file f is computed using the ratios of the reviewer-expertise and file–review maps. The contribution factor, xFactor, is given below:

\[ xFactor(r, f) = \frac{RE(r, f)}{FR(f)} \qquad (7.1) \]

\[ xFactor(r, f) = \begin{cases} \dfrac{C_f}{C'_f} + \dfrac{W_f}{W'_f} + \dfrac{1}{|T_f - T'_f|} & \text{if } |T_f - T'_f| \neq 0 \\[1.5ex] \dfrac{C_f}{C'_f} + \dfrac{W_f}{W'_f} + 1 & \text{if } |T_f - T'_f| = 0 \end{cases} \qquad (7.2) \]

The xFactor score is computed for each of the source code files that exist in the code change.

According to Equation 7.2, the maximum value of xFactor can be three because we have used three

measures, each of which can have the maximum contribution ratio of 1.
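A minimal sketch of Equations 7.1–7.2, assuming the reviewer-expertise and file–review maps are available as tuples and that workdays are represented as dates so that the recency gap can be taken in days; the function name is illustrative.

```python
# Illustrative sketch of the xFactor score in Equations 7.1-7.2.

def xfactor(re_map, fr_map):
    """re_map: (C_f, W_f, T_f) for reviewer r on file f;
    fr_map: (C'_f, W'_f, T'_f) for the total review activity on file f.
    T_f and T'_f are assumed to be datetime.date objects."""
    c_f, w_f, t_f = re_map
    c_tot, w_tot, t_tot = fr_map
    gap = abs((t_f - t_tot).days)
    recency = 1.0 if gap == 0 else 1.0 / gap   # second case of Equation 7.2
    return c_f / c_tot + w_f / w_tot + recency  # bounded above by 3
```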

7.3.2 Scoring and Recommending reviewers

We now describe how the ranked list of reviewers is obtained from all of the scored reviewers of each source code file in the code change. There is a one-to-many relationship between the source code files and reviewers. That is, each file f_i may have multiple reviewers; however, it is not necessary for all of the files to have the same number of reviewers. For example, the file f_1 could have two reviewers and the file f_2 could have three reviewers. The matrix Dr (see Equation 7.3) gives the list of unique reviewers for each file f_i. Dr_{f_i} represents the set of reviewers, with no duplication, for the file f_i, where 1 ≤ i ≤ n and n is the number of unique files in the patch. r_{ij} is the j-th reviewer of the file f_i, which has l unique reviewers.

\[ Dr = \begin{pmatrix} Dr_{f_1} \\ Dr_{f_2} \\ \vdots \\ Dr_{f_n} \end{pmatrix}, \qquad Dr_{f_i} = \{\, r_{i1}, r_{i2}, \ldots, r_{il} \,\} \qquad (7.3) \]

Although a single file does not have any duplicate reviewers, two files may have common reviewers. In Equation 7.4, Dr_u is the union of all unique reviewers from all files.

\[ Dr_u = \bigcup_{i=1}^{n} Dr_{f_i} \qquad (7.4) \]

\[ Score(r) = \sum_{i=1}^{n} xFactor(r, f_i) \qquad (7.5) \]

Each reviewer r for a file f has an xFactor score. To obtain the likelihood of the reviewer r, i.e., Score(r), to review the code change, we sum the xFactor scores over the unique files in which the reviewer appears (see Equation 7.5). The Score(r) value is calculated for each unique reviewer r in the set Dr_u.

In Equation 7.6, we have a set of candidate reviewers. If the owner of the review occurs in this list, we remove them, as recommending the owner as a reviewer makes little sense because we want to recommend other peers. The reviewers in this set are ranked based on their Score(r) values.

Once the reviewers are ranked in descending order of their Score(r) values, we have a ranked list of candidate reviewers. By using a cutoff value of m, we recommend the top m candidate reviewers, i.e., with top m Score(r) values, from the ranked list obtained from the set RF .

\[ RF = \{\, (r, Score(r)) \mid \forall\, r \in Dr_u \,\} \qquad (7.6) \]

This step concludes cHRev and we have the top m candidate reviewers recommended for the given code change.
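The scoring and cutoff step described above can be sketched as follows, assuming the per-file xFactor scores have already been computed; the names are illustrative, not the actual implementation.

```python
# Illustrative sketch of Equations 7.5-7.6: sum each candidate's xFactor over the
# files in the code change, drop the change owner, and return the top-m reviewers.
from collections import defaultdict

def recommend(xfactor_by_file, owner, m):
    """xfactor_by_file: dict mapping file -> {reviewer: xFactor score}."""
    score = defaultdict(float)
    for per_file in xfactor_by_file.values():
        for reviewer, xf in per_file.items():
            score[reviewer] += xf
    score.pop(owner, None)  # do not recommend the owner of the change
    return sorted(score, key=score.get, reverse=True)[:m]
```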

We considered a cumulative view of all the files in a patch to recommend reviewers. Therefore, we lower the probability of an empty recommendation because a code change typically has multiple files; however, some files may not have been reviewed in a very long time or may have been added to the review process for the very first time. As a result, there will not be any recommendations at the file level. To overcome this problem, we look for reviewers with review contributions to the package that contains the file, and recommend them instead. If no package-level reviewers can be identified, we turn to the system-level reviewers as the final option. Package here means the immediate directory that contains the file, i.e., we consider the physical organization of source code. The system means a collection of packages. It can be a subsystem or a module (i.e., the top level directory). In this way, we move from the lowest, most specific expertise level (file) to the higher, broader levels of expertise (package, then system). With this approach, we guarantee that our tool always gives a recommendation, unless this is the first file ever added to the system.
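The file-to-package-to-system fallback can be sketched as below, under the assumption that three precomputed score maps (possibly empty) are available at each level; the path handling is a simplification.

```python
# Illustrative sketch of the file -> package -> system fallback described above.

def reviewers_with_fallback(path, file_scores, package_scores, system_scores):
    """Each *_scores argument maps a key to a {reviewer: score} dict."""
    package = path.rsplit("/", 1)[0]  # immediate containing directory
    for candidates in (file_scores.get(path, {}),
                       package_scores.get(package, {}),
                       system_scores):
        if candidates:
            return candidates
    return {}  # only possible for the very first file ever added to the system
```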

Table 7.1: The reviewers extracted with cHRev from each of the files related to the code change in review #33689.

File | Reviewers and their xFactor
…/TaskListFilteredTree.java | Sam Davis: 3.00
…/CustomTaskListDecorationDrawer.java | Frank Becker: 1.83, Sam Davis: 1.62, Tomasz Zarna: 1.54

7.3.3 Implementation of cHRev

To extract the code review data from Android Platform, Eclipse Platform, and Mylyn, we

used the Gerrit JSON API and queried their Gerrit servers1. We also engineered a review log,

which is akin to a version log from source code repositories. Unfortunately, a review log is not

readily available like the version log. It was assembled from the code review history available in Gerrit. Review log entries include the dimensions reviewer, date, and path (e.g., files) involved in the review process. After assembling the review log from the available code review history in Gerrit, the review log entries are readily available in the form of XML, and straightforward XPath queries are formulated to compute the measures. The measures C_f, W_f, and T_f are computed from the review log. More specifically, the dimensions reviewer’s name, date, and path of the log entries are used in the computation. The dimension date is used to derive workdays or calendar days. The dimension reviewer’s name is used to derive the reviewer information. The dimension path is used to derive the file information. The measures C'_f, W'_f, and T'_f are similarly computed. The expertise model and scoring functions are implemented in Java.
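A minimal sketch of how the three per-reviewer measures could be derived from review-log entries, assuming each entry carries (reviewer, date, file); this is an assumption for illustration, not the XML/XPath pipeline of the actual implementation.

```python
# Illustrative sketch: derive C_f, W_f, and T_f from review-log entries.
from collections import defaultdict

def reviewer_expertise_maps(log_entries):
    """log_entries: iterable of (reviewer, date, file) tuples, one per comment."""
    comments = defaultdict(int)   # C_f per (reviewer, file)
    workdays = defaultdict(set)   # distinct dates per (reviewer, file)
    latest = {}                   # T_f per (reviewer, file)
    for reviewer, date, path in log_entries:
        key = (reviewer, path)
        comments[key] += 1
        workdays[key].add(date)
        latest[key] = max(latest.get(key, date), date)
    return comments, {k: len(v) for k, v in workdays.items()}, latest
```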

7.3.4 A Motivating Example from Mylyn

Here, we demonstrate our approach cHRev using an example from Mylyn. The goal is

to show the inner workings of the cHRev mechanisms and compare its results with three other

approaches. The first approach REVFINDER is based on reviews. The second approach xFinder is based on commits. The third approach RevCom is based on a combination of commits and reviews (see

Section 7.4.2 for details). The review of interest here is the review #33689: “clean up workspace

1 https://gerrit-review.googlesource.com/Documentation/rest-api.html

warnings in tasks.ui”. Steffen Pingel is the owner and the code change includes two files. Sam

Davis and Tomasz Zarna are the actual reviewers of review #33689 (their names are marked with an asterisk in Table 7.2).

In cHRev, we first obtained the set Dr_u from all of the reviewers recommended for each file f_i that exists in the review #33689 (see Table 7.1). The set Dr_u consists of 3 unique reviewers. A review log is created, and all the reviews before this example review are considered in calculating the expertise metrics and forming the model. Table 7.1 shows the two files related to the review #33689; for each file f_i, there is a set of recommended reviewers with their associated xFactor values calculated by cHRev.

For each of the 3 unique reviewers, the Score value is calculated according to Equation 7.5.

Table 7.2 shows the top four Score values and the corresponding reviewers, i.e., m = 4, for the four approaches: cHRev, REVFINDER, xFinder, and RevCom. Score values for REVFINDER are calculated in a different way (see Section 7.4.2). Sam Davis has the highest score in the set RF (a value of 4.62 in the first column) for cHRev, so he is the first recommended reviewer. For the remaining reviewers, the value of the function Score is less than Sam Davis’s score, so they all have a rank greater than 1. REVFINDER recommended one of the correct reviewers at rank 4, and two of the reviewers recommended by REVFINDER do not appear in the recommendation list of cHRev. xFinder did not recommend any of the correct reviewers in the gold set. One of the reviewers recommended by xFinder does not appear in the recommendation list of cHRev. RevCom recommended the correct reviewers but with different (worse) rankings. Clearly, the best result belongs to cHRev because it recommended Sam Davis and Tomasz Zarna at ranks 1 and 3. Based on the degree of file name and path similarity, which REVFINDER determines with string comparison techniques,

Sebastien Dubois has the highest score. After investigating the review history and commit history, we ascertained that Sam Davis and Tomasz Zarna did not have any commits on those two files before the creation date of review #33689. Hence, xFinder could not recommend them as candidate reviewers. The largest contribution to those two files according to the commit history belongs to Steffen Pingel (owner) and Frank Becker. One could argue that if the largest contribution belongs to Frank Becker, then he is probably the best candidate to review the code, based on the findings of previous work [52]. Even considering this point, Table 7.2 shows that cHRev recommends Frank Becker at rank 2 and REVFINDER recommends him at rank 3. Sam Davis and Tomasz Zarna had acted as reviewers in Mylyn, hence cHRev picked them.

Table 7.2: Top four reviewers recommended for the review #33689 with their associated ranks and scores by cHRev, REVFINDER, xFinder, and RevCom.

Reviewer | cHRev Score | cHRev Rank | REVFINDER Score | REVFINDER Rank | xFinder Score | xFinder Rank | RevCom Score | RevCom Rank
Sam Davis* | 4.62 | 1 | 33.31 | 4 | - | - | 4.62 | 2
Frank Becker | 1.83 | 2 | 37.29 | 3 | 3.0 | 1 | 4.83 | 1
Tomasz Zarna* | 1.54 | 3 | - | - | - | - | 1.54 | 3
Caitlin Matthew | - | - | - | - | 0.50 | 2 | 0.50 | 4
Sebastien Dubois | - | - | 63.80 | 1 | - | - | - | -
Miles Parker | - | - | 55.51 | 2 | - | - | - | -

7.4 Case Study

The purpose of this study was to investigate how well our cHRev approach recommends correct reviewers to review a given code change and to compare it with the available alternatives: the code review based REVFINDER, the commit based xFinder, and the combined RevCom based on commits and reviews. Next, we present the details of the study design, its execution, and the observed results.

7.4.1 Design

We conducted a case study to empirically assess our approach according to the design and reporting guidelines presented in [89]. The case of our study is the event of assigning reviewers to code changes in closed and open source systems. The units of analysis are the code changes considered from four systems. Therefore, this study would allow us to compare code reviews and commits with respect to the reviewer recommendation task. We addressed the following research questions:

RQ1: What is the accuracy of cHRev in recommending reviewers on real software systems across closed and open source projects?

RQ2: How do the accuracies of cHRev (trained from the code review history), REVFINDER (also trained from the code review history, albeit differently), xFinder (trained from the commit history), and RevCom (trained from a combination of the code review and commit histories) compare in recommending code reviewers?

Guided by the Goal-Question-Metric (GQM) method, the main goal of the first part of our study is to assess the effectiveness of our approach, i.e., asking how accurate the reviewer recommendations are when applied to the change requests of real systems across domains. The main focus of the quantitative analysis is on addressing different viewpoints, i.e., theory triangulation, of recommendation accuracy. We collected a fixed dataset, i.e., code changes, from the software review archives found in modern peer review systems. We used a data triangulation approach to include a variety of factors from closed and open source subject systems. These systems represent different main implementation languages (e.g., C/C++ and Java), sizes, review systems, and development environments. We used four metrics (precision, recall, F-score, and MRR) to cover different perspectives of accuracy.

7.4.2 Compared Approaches: REVFINDER, xFinder and RevCom

REVFINDER is a recently reported code-review based review recommendation approach.

Its model is based on finding reviewers of source files with names and paths similar to those submitted in a given code change. The degree of file name and path similarity is determined with string comparison techniques, and the reviewers are scored with the string similarity score. REVFINDER was shown to perform better than Balachandran’s REVIEWBOT [14]. We also compare with a previous approach, namely xFinder, for developer recommendation that uses past commits on source code. These recommendations are used for reviewer recommendation for the source code submitted for change review. xFinder builds the developer expertise based on the number of commits, and their number of workdays and recency. xFinder subsumes the default reviewer recommender in Gerrit2. xFinder was shown to be competitive with other developer recommendation approaches [65]. To assess the potential orthogonality between the commits and reviews, we devised a combined approach, namely RevCom, which is based on the factors of cHRev and xFinder. RevCom considers three metrics from reviews and another three from commits. The orthogonality between different sources has been leveraged in several other software engineering tasks previously [41], which served as an inspiration to emulate the combination for the reviewer recommendation task.

7.4.3 Subject Systems and Evaluation Datasets

Our evaluation datasets were derived from three open and one closed source systems.

Open Source: Android Platform, Eclipse Platform, and Mylyn

Android contains 7 years of code review related to different sub projects. In this study we

considered the code review history of Android Platform3 sub project between February 7, 2015

and March 26, 2015. During the defined period, there were a total of 2,052 source code changes

and 2,680 code reviews that include at least one source file. We considered this period of history because it contains a similar number of code reviews to that used in the evaluation of REVFINDER on Android. Reviewers provided 23,181 review comments. We considered the author of the commit

(and not the committer) for xFinder.

Eclipse contains six different sub projects. In this study, we consider Eclipse Platform, because it has the largest code review history available in comparison with the other sub projects4. Its code review history in Gerrit is available from March 2013. We considered the history between March 5, 2013 (the first day of the code review history in Gerrit) and November 28, 2014. The Eclipse Platform project consists of 1,854 code reviews uploaded in the Gerrit repository, each of which includes at least one Java file. After removing the noise (e.g., automatically submitted comments by tools such as Hudson), a total of 10,506 review comments were written in Eclipse Platform.

2 https://gerrit-review.googlesource.com/#/admin/projects/plugins/reviewers-by-blame
3 https://android-review.googlesource.com/#/q/platform
4 https://git.eclipse.org/r/#/q/platform,n,z

The Eclipse Platform project consists of 3,155 commits in the commit history (during the defined period), each of which contains a change to at least one Java file.

Mylyn contains about 2 years of code review data in Gerrit and is an Eclipse Foundation

project. Its commit history in the git repository is available from June 2005. Its code review history

in Gerrit is available from March 2012. We considered the history between March 2, 2012 (the

first day of a code review history in Gerrit) and November 28, 2014. The Mylyn project consists

of 1,589 code reviews uploaded in Gerrit, each of which includes at least one Java file5. Similar to Eclipse Platform, noisy review comments were discarded. A total of 10,157 review comments were written in Mylyn. Mylyn consists of 1,838 commits in the commit history (during the defined period), each of which contains a change to at least one Java file.

Closed Source: MS Office

We also evaluated all approaches on activity from one milestone development cycle on

one of the main development branches of MS Office. We gathered source code repository data from CODEMINE [36], our development analytics database, and code review data from CodeFlow

Analytics [24], an internal data collection system for code reviews across Microsoft. It is common for automated systems to make source code changes (e.g., updating copyright dates in headers) in

MS Office. In addition, some teams at Microsoft use automated “Review Bots” in reviews similar to VMWare [14]. We remove such code change authors and reviewers from the data as the rules for their inclusion are automatic and they do not represent humans that a reviewer could assign a review to. After cleansing the data, there were a total of 2,651 source code changes that include at least one source file (C# or C++) and 1,886 code reviews. Reviewers provided 10,746 review comments in 845 of the reviews.

Table 7.3 gives the test benchmarks for all the four systems considered in our study. It consists of code changes and reviewers who contributed to review those code changes. That is, a reviewer who provided at least one comment on the code change is considered a true positive. Note that in tools, such as Gerrit, the patch author can pick the potential reviewers; however, there is

5https://git.eclipse.org/r/#/q/mylyn,n,z

no guarantee that all (or any) of them would actually contribute. Thus, we do not consider such names as a gold set, and only consider the ones who actually contribute, regardless of whether they were originally picked by the patch author or not. To investigate the difference between the list of reviewers assigned by the owner and the list of reviewers who actually participated, we calculated the Jaccard similarity between these two lists of reviewers for all three open source projects used in our study. The average Jaccard similarity values for Android Platform, Eclipse Platform, and Mylyn are 0.58, 0.80, and 0.85 respectively. These values indicate that the two lists are not identical and are quite dissimilar.

The only code change information we use is the files in the code change. The goal of the compared recommendation techniques is to predict reviewers for each of these code changes from the previous commits and/or reviews in the history periods considered for each subject system. Note that we only considered the original version of the code change. Including files from other subsequent revisions of the original version (e.g., to address the review feedback) would be forward-looking information with limited (or no) value in predicting reviewers. Therefore, our benchmark is a set of code changes, each of which includes several unique files. After a manual investigation of reviews in the open source systems in our study, we found that there are several code changes that included only test files. We also found that these code changes did not receive any review comments, indicating that they did not need to be reviewed. We discarded them from our benchmarks. Furthermore, there were code changes that contained a mix of source code and test files. On manual examination, we found that code changes with a majority of source code files were reviewed. To provide a conservative bound, we included such mixed cases in our benchmark.

7.4.4 Evaluation Protocol for cHRev

The source code changes from Gerrit and CodeFlow are used for evaluation purposes. Our general evaluation procedure consists of the following steps:

Step 1: Select a test code change from the code review history that is resolved and whose actual reviewers are known (described in Section 7.4.3).

Step 2: Select the code reviews that were completed in the review system before the (not yet reviewed) test code change was submitted.

Step 3: Use cHRev to collect a ranked list of reviewers from Step 2.

Step 4: Compare the results of Step 3 with the baseline. The reviewers who reviewed the test code change are considered the baseline.

Step 5: Repeat the above steps for N test code changes in the established benchmark.

Step 6: Compute precision, recall, F-score, and MRR metrics from Steps 4 and 5.

REVFINDER, RevCom, and xFinder are evaluated with the same protocol, except that RevCom and xFinder form their expertise models with the inclusion of past commits. xFinder uses past commits instead of past reviews, and RevCom uses a combination of past commits and reviews.
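The evaluation protocol above can be sketched as a simple loop; the attribute names (submitted_date, closed_date, files, actual_reviewers) and the recommender interface are assumptions made for illustration, not the actual tooling.

```python
# Illustrative sketch of the evaluation protocol (Steps 1-6): train only on
# reviews completed before each test change and compare against its actual reviewers.

def evaluate(benchmark, review_history, recommender, m):
    results = []
    for change in benchmark:                                     # Step 5: repeat for N changes
        past = [r for r in review_history
                if r.closed_date < change.submitted_date]        # Step 2
        recommended = recommender(change.files, past, m)         # Step 3
        results.append((recommended, change.actual_reviewers))   # Step 4: baseline comparison
    return results                                               # metrics computed in Step 6
```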

7.4.5 Accuracy Metrics and Hypothesis Testing

To investigate the research question RQ1, we evaluated the accuracy of cHRev, REVFINDER, xFinder, and RevCom for all of the code changes in our benchmark using the precision, recall, Mean Reciprocal Rank (MRR), and F-score (derived from precision and recall) metrics, which were used previously [9, 65, 106]. For each code change p, in a set of code changes P of size n, from the benchmark of each subject system and a number m of recommended reviewers, the formulas for precision@m, recall@m, and F-score@m are given below:

\[ \mathrm{precision@}m = \frac{|RR(p) \cap AR(p)|}{|RR(p)|} \qquad (7.7) \]

\[ \mathrm{recall@}m = \frac{|RR(p) \cap AR(p)|}{|AR(p)|} \qquad (7.8) \]

\[ \mathrm{F\text{-}score@}m = 2 \cdot \frac{\mathrm{precision@}m \cdot \mathrm{recall@}m}{\mathrm{precision@}m + \mathrm{recall@}m} \qquad (7.9) \]

where RR(p) and AR(p) are the recommended reviewers and the actual reviewers who contributed to the review process of the code change p, respectively. These metrics are computed for recommendation lists of reviewers with different sizes (e.g., m = 1, m = 2, m = 3, and m = 5 reviewers).
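A minimal sketch of Equations 7.7–7.9, treating the recommended and actual reviewers of a code change as sets with a cutoff m; the function name is illustrative.

```python
# Illustrative sketch of precision@m, recall@m, and F-score@m for one code change.

def precision_recall_fscore(recommended, actual, m):
    top_m, gold = set(recommended[:m]), set(actual)
    hits = len(top_m & gold)
    precision = hits / len(top_m) if top_m else 0.0
    recall = hits / len(gold) if gold else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score
```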

Table 7.3 shows the frequency distribution of reviewers for each subject software system in our benchmark6. 69% of code changes for Android Platform, 76% of code changes for Eclipse Platform, 71% of code changes for Mylyn, and 64% of code changes for MS Office are reviewed by a single (and not necessarily the same) reviewer. In such a scenario, each increment to m in pursuit of a correct reviewer could add to the proportion of false positives. A complementary measure is also needed to assess the potential effort in addressing noise (false positives). We focused on evaluating the ranked positions of the correct reviewers for each code change for each benchmark from a cumulative perspective, regardless of the cutoff point m. Mean Reciprocal Rank (MRR) is one such measure that can be used for evaluating any process that produces a list of possible responses to a sample of queries, ordered by probability of correctness. The reciprocal rank of a query response is the multiplicative inverse of the rank of the first correct answer. Intuitively, the lower the value (between 0 and 1), the farther down the list, examining incorrect responses along the way, one needs to search to find a correct response.

\[ \mathrm{MRR} = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{\mathrm{rank}_i} \qquad (7.10) \]

Here, the reciprocal rank for a query (code change) is the reciprocal of the position of the correct reviewer in the returned ranked list of reviewers (rank_i), and n is the total number of code changes in our benchmark. When the correct reviewer for a code change is not recommended at all, we consider its inverse rank to be zero. When there are multiple correct reviewers, we consider the highest/first ranked position. The higher the MRR value, the less effort is potentially spent examining noise. For example, an MRR value of 0.5 suggests that the correct answer is, on average, found at the second rank.

Further, we define the following null hypotheses for our study for both closed and open

6 http://serl.cs.wichita.edu/svn/projects/CodeReview/CodeReview/trunk/Data

Table 7.3: Evaluation benchmarks and the distribution of reviewers per review (code change)

System Frequency distribution Total Reviews #1 #2 #3 #4 #5 #6

Mylyn 113 33 12 1 1 0 160 Eclipse Platform 98 24 7 0 0 0 129 Android Platform 105 30 15 3 0 0 153 MS Office 538 219 66 16 6 0 845

source domains to assess the statistical validity of the results (the alternative hypotheses can be easily derived from the respective null hypotheses):

H-1: There is no statistically significant difference (SSD) between the precision@m, recall@m, F-score@m, and MRR values of cHRev and REVFINDER.

H-2: There is no SSD between the precision@m, recall@m, F-score@m, and MRR values of cHRev and xFinder.

H-3: There is no SSD between the precision@m, recall@m, F-score@m, and MRR values of cHRev and RevCom.

H-4: There is no SSD between the precision@m, recall@m, F-score@m, and MRR values of RevCom and xFinder.

We applied the one-way ANOVA test with α = 0.05 to assess the SSD between the precision, recall, and MRR values of the compared approaches. For MRR, we considered the ranks of the correct answers of the approaches for each code change (data point). The purpose of the test is to assess whether the distribution of one of the two samples is stochastically greater than the other.
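A minimal sketch of this hypothesis test follows, assuming the per-code-change metric values of two approaches are available as plain lists; the numbers are placeholders and SciPy's one-way ANOVA is used only to illustrate the procedure.

# Illustrative only: per-code-change precision@1 values for two approaches,
# compared with a one-way ANOVA at alpha = 0.05.
from scipy.stats import f_oneway

precision_at_1_cHRev = [0.6, 1.0, 0.5, 0.0, 1.0]       # placeholder values
precision_at_1_revfinder = [0.3, 0.5, 0.0, 0.0, 0.5]   # placeholder values

f_stat, p_value = f_oneway(precision_at_1_cHRev, precision_at_1_revfinder)
if p_value < 0.05:
    print("statistically significant difference: reject the null hypothesis")
else:
    print("no statistically significant difference at alpha = 0.05")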

7.4.6 Results

The number of recommended reviewers is the only user-defined parameter of our approach. As can be seen from Table 7.3, the maximum number of reviewers in both the closed and open source systems is bounded by five in the benchmarks. Therefore, the experiment was run for m = 1, m = 2, m = 3, and m = 5, where m is the number of recommended reviewers, to provide a realistic view of the performance.

Table 7.4: Average precision, recall, and F-score @1, 2, 3, and 5 values of the approaches cHRev, REVFINDER, xFinder, and RevCom measured on the benchmarks.

                      Precision@m                          Recall@m                             F-score@m
System            m   cHRev  REVFINDER  xFinder  RevCom    cHRev  REVFINDER  xFinder  RevCom    cHRev  REVFINDER  xFinder  RevCom
Mylyn             1   0.59   0.36       0.43     0.55      0.48   0.26       0.34     0.45      0.53   0.30       0.38     0.50
                  2   0.50   0.27       0.43     0.50      0.64   0.37       0.45     0.66      0.56   0.31       0.44     0.57
                  3   0.48   0.23       0.40     0.48      0.81   0.47       0.48     0.81      0.60   0.31       0.44     0.60
                  5   0.41   0.19       0.39     0.41      0.87   0.67       0.56     0.87      0.56   0.30       0.46     0.56
Eclipse Platform  1   0.44   0.44       0.28     0.43      0.38   0.36       0.25     0.36      0.41   0.40       0.26     0.39
                  2   0.40   0.33       0.30     0.38      0.61   0.55       0.46     0.60      0.48   0.41       0.36     0.47
                  3   0.37   0.27       0.26     0.34      0.76   0.67       0.50     0.72      0.50   0.38       0.34     0.46
                  5   0.31   0.20       0.24     0.28      0.82   0.75       0.62     0.80      0.45   0.32       0.35     0.41
Android Platform  1   0.50   0.34       0.23     0.48      0.27   0.18       0.19     0.26      0.35   0.24       0.21     0.34
                  2   0.41   0.29       0.17     0.39      0.42   0.31       0.28     0.41      0.41   0.30       0.21     0.40
                  3   0.35   0.25       0.14     0.34      0.50   0.39       0.31     0.49      0.41   0.30       0.19     0.40
                  5   0.30   0.22       0.11     0.28      0.61   0.48       0.37     0.60      0.40   0.30       0.17     0.38
MS Office         1   0.59   0.38       0.23     0.57      0.42   0.25       0.16     0.40      0.49   0.30       0.19     0.47
                  2   0.47   0.33       0.18     0.46      0.60   0.43       0.23     0.60      0.53   0.37       0.20     0.52
                  3   0.37   0.26       0.16     0.37      0.68   0.51       0.27     0.68      0.48   0.34       0.20     0.48
                  5   0.29   0.22       0.13     0.28      0.75   0.72       0.29     0.75      0.42   0.34       0.18     0.41

Table 7.5: Average precision, recall, and F-score gains @1, 2, 3, and 5 values of the approaches cHRev, REVFINDER, xFinder, and RevCom measured on the benchmarks.

                      Precision gain (%)                                    Recall gain (%)                                       F-score gain (%)
System            m   cHRev-     cHRev-   cHRev-  RevCom-                   cHRev-     cHRev-   cHRev-  RevCom-                   cHRev-     cHRev-   cHRev-  RevCom-
                      REVFINDER  xFinder  RevCom  xFinder                   REVFINDER  xFinder  RevCom  xFinder                   REVFINDER  xFinder  RevCom  xFinder
Mylyn             1   63.88      37.21    7.27    27.90                     84.61      41.18    6.66    32.25                     76.66      39.47    6.00    31.57
                  2   85.18      16.28    0       16.28                     72.97      42.22    -3.03   46.66                     80.64      27.27    -1.78   29.54
                  3   108.69     20.00    0       20.00                     72.34      68.75    0       68.75                     93.54      36.36    0       36.36
                  5   115.78     5.13     0       5.13                      29.85      55.35    0       55.35                     86.66      21.73    0       21.73
Eclipse Platform  1   0          57.14    2.32    53.57                     5.55       52.00    5.55    44.00                     2.50       57.69    5.12    50.00
                  2   21.21      33.33    5.26    26.66                     10.90      32.61    1.66    30.43                     17.07      33.33    2.12    30.55
                  3   37.03      42.31    8.82    30.76                     13.43      52.00    5.55    44.00                     31.57      47.05    8.69    35.29
                  5   55.00      29.17    10.71   16.66                     9.33       32.26    2.50    29.03                     40.62      28.57    9.75    17.14
Android Platform  1   47.05      117.39   4.16    108.69                    50.00      42.10    3.84    36.84                     45.83      66.67    2.94    61.90
                  2   41.37      141.17   5.12    129.41                    35.48      50.00    2.43    46.42                     36.67      95.24    2.50    90.48
                  3   40.00      150.00   2.94    142.85                    28.20      61.29    2.04    58.66                     36.67      115.79   2.50    110.53
                  5   36.36      172.72   7.14    154.54                    27.08      64.86    1.66    62.16                     33.33      135.29   5.26    123.53
MS Office         1   55.26      156.52   3.51    147.83                    68.00      162.50   5.00    150.00                    63.33      157.89   4.26    147.37
                  2   42.42      161.11   2.17    155.55                    39.53      160.87   0       160.87                    43.24      165.00   1.92    160.00
                  3   42.30      131.25   0       131.25                    33.33      151.85   0       151.85                    41.18      140.00   0.00    140.00
                  5   31.81      123.07   3.57    115.38                    4.16       158.62   0       158.62                    23.53      133.33   2.44    127.78

Table 7.6: Mean Reciprocal Rank of the approaches cHRev, REVFINDER, xFinder, and RevCom measured on the benchmarks.

            MRR                                        MRR gain (%)
System      cHRev  REVFINDER  xFinder  RevCom          cHRev-     cHRev-   cHRev-  RevCom-
                                                       REVFINDER  xFinder  RevCom  xFinder
Mylyn       0.72   0.52       0.51     0.71            38.46      41.18    1.40    39.20
Eclipse     0.63   0.58       0.46     0.62            8.62       36.96    1.61    34.78
Android     0.65   0.49       0.35     0.63            32.65      85.71    3.17    80.00
MS Office   0.70   0.56       0.29     0.69            25.00      141.37   1.44    137.93

Table 7.7: p-values from applying one-way ANOVA on the precision@m and recall@m values for each subject system.

                      Precision p-value                                     Recall p-value
System            m   cHRev-     cHRev-   cHRev-  RevCom-                   cHRev-     cHRev-   cHRev-  RevCom-
                      REVFINDER  xFinder  RevCom  xFinder                   REVFINDER  xFinder  RevCom  xFinder
Mylyn             1   < 0.01     < 0.01   ≤ 0.4   < 0.01                    < 0.01     < 0.01   ≤ 0.8   < 0.01
                  2   < 0.01     < 0.01   ≤ 0.9   < 0.01                    < 0.01     ≡ 0.00   ≤ 0.9   ≡ 0.00
                  3   ≡ 0.00     < 0.01   ≤ 0.9   < 0.01                    ≡ 0.00     ≡ 0.00   ≤ 1     ≡ 0.00
                  5   ≡ 0.00     ≤ 0.7    ≤ 0.9   ≤ 0.7                     ≡ 0.00     ≡ 0.00   ≤ 1     ≡ 0.00
Eclipse Platform  1   ≤ 0.1      < 0.01   ≤ 0.8   < 0.01                    ≤ 0.4      < 0.02   ≤ 0.7   < 0.04
                  2   < 0.02     < 0.01   ≤ 0.5   < 0.03                    < 0.04     < 0.01   ≤ 0.9   < 0.01
                  3   < 0.02     < 0.01   ≤ 0.3   < 0.01                    < 0.03     ≡ 0.00   ≤ 0.4   < 0.01
                  5   ≡ 0.00     < 0.04   ≤ 0.3   < 0.03                    < 0.02     ≡ 0.00   ≤ 0.9   ≡ 0.00
Android Platform  1   < 0.01     < 0.01   ≤ 0.8   < 0.01                    < 0.03     < 0.01   ≤ 0.9   < 0.01
                  2   < 0.01     < 0.01   ≤ 0.7   < 0.01                    < 0.02     ≡ 0.00   ≤ 0.9   ≡ 0.00
                  3   < 0.01     ≡ 0.00   ≤ 0.7   ≡ 0.00                    < 0.01     ≡ 0.00   ≤ 0.9   ≡ 0.00
                  5   < 0.01     ≡ 0.00   ≤ 0.7   ≡ 0.00                    < 0.01     ≡ 0.00   ≤ 0.9   ≡ 0.00
MS Office         1   < 0.01     ≡ 0.00   ≤ 0.7   ≡ 0.00                    < 0.01     ≡ 0.00   ≤ 0.7   ≡ 0.00
                  2   < 0.02     ≡ 0.00   ≤ 0.8   ≡ 0.00                    < 0.02     ≡ 0.00   ≤ 0.9   ≡ 0.00
                  3   < 0.02     ≡ 0.00   ≤ 0.9   ≡ 0.00                    < 0.02     ≡ 0.00   ≤ 0.9   ≡ 0.00
                  5   < 0.03     ≡ 0.00   ≤ 0.9   ≡ 0.00                    ≤ 0.04     ≡ 0.00   ≤ 0.9   ≡ 0.00

Table 7.8: p-values from applying one-way ANOVA on the MRR values for each subject system.

            MRR p-value
System      cHRev-     cHRev-   cHRev-  RevCom-
            REVFINDER  xFinder  RevCom  xFinder
Mylyn       < 0.01     ≡ 0.00   ≤ 0.7   ≡ 0.00
Eclipse     < 0.04     < 0.01   ≤ 0.9   < 0.01
Android     ≡ 0.00     ≡ 0.00   ≤ 0.7   ≡ 0.00
MS Office   < 0.01     ≡ 0.00   ≤ 0.9   ≡ 0.00


To answer research question RQ1, we consult Table 7.4. The highest precision is for the lowest value of m and the highest recall is for the highest value of m. The decrease or increase in precision and recall with an increase in the value of m is gradual (no drastic changes were noted). Note that while computing recall for lower values of m (e.g., |RR(p)| = 1 for m = 1), we considered all the correct reviewers for a patch (e.g., |AR(p)| = 3). Therefore, the recall at such values could be lower despite making all the correct recommendations. Furthermore, the accuracy performance of cHRev is consistent across the closed source (MS Office) and open source (Android Platform, Eclipse Platform, and Mylyn) systems. With regard to MRR values, we consult Table 7.6. cHRev gives values greater than 0.5 for all four systems. That is, on average, a maximum of two recommendations need to be examined to get the first correct reviewer. These results indicate the stability of cHRev across systems with different sizes, test sets, and domains.

RQ1: cHRev makes accurate reviewer recommendations in terms of precision and recall. On average, fewer than two recommendations are needed to find the first correct reviewer in both closed and open source systems.

To investigate research question RQ2, we computed the metric gain of cHRev (i.e., X equals precision, recall, F-score, or MRR) over another compared approach (i.e., Y equals REVFINDER, xFinder, or RevCom) using the following formula:

\mathrm{Gain}X@m_{cHRev-Y} = \frac{X@m_{cHRev} - X@m_{Y}}{X@m_{Y}} \times 100 \qquad (7.11)

Tables 7.5 and 7.6 show the precision, recall, F-score, and MRR gain values. Clearly, cHRev outperforms REVFINDER across the precision, recall, F-score, and MRR values in all four systems. cHRev records positive gains with statistical significance (p-values < 0.05) in all cases except precision@m = 1, recall@m = 1, and F-score@m = 1 for Eclipse Platform (see Tables 7.7 and 7.8). In these exceptional cases, both were statistically equivalent. The gains in Eclipse Platform are generally lower than those in Android Platform, Mylyn, and MS Office. We only considered a single component of Eclipse Platform, and it was the smallest dataset in our evaluation. The methodology of REVFINDER should have favored such a dataset because the file names in a single component are typically similar (and thus, so are the reviewers). However, our cHRev approach was able to perform better than REVFINDER even in such a favorable setting. These results suggest that the amount of comments, the workdays needed to make them, and their recency contribute to higher accuracy than simply looking at similar file names and paths. Therefore, we find support to reject Hypothesis H-1 in favor of cHRev.

Clearly, cHRev outperforms xFinder across precision, recall, F-score, and MRR values in all the four systems. It is remarkable to note that the precision and recall gains of cHRev over xFinder

on MS Office (well over 100%) are substantially better than those achieved on Android Platform,

Eclipse Platform, and Mylyn (well below 100%). This fact suggests that cHRev could offer a much

better solution in the commercial domain. All the precision and recall gains for different values of m (with the exception of Mylyn precision at m=5), and MRR gains are statistically significant (i.e.,

p-values<0.05). The only case of Mylyn where there is no statistically significant gain happens at

the largest value of m, where precision was the lowest in both approaches. Nonetheless, cHRev

is no worse than xFinder in this exceptional case. Therefore, we find support to reject Hypothesis

H-2 in favor of cHRev. Note that the same cannot be said about the gains of REVFINDER over

xFinder. REVFINDER did not register a single positive precision or recall gain over xFinder in

Mylyn, which was the largest considered open source dataset.

In the case of the comparison between cHRev and RevCom, a negative gain would indicate RevCom doing better than cHRev, and a positive gain would indicate cHRev doing better than RevCom.

Clearly, the gains (with the exception of Mylyn recall and F-score at m = 2) are positive. Contrary to many successful results from combined approaches reported for other tasks (and perhaps surprisingly), the combination of reviews and commits was not very effective. In fact, our results indicate that a combined approach could be detrimental (i.e., could lead to a drop in precision and recall). The statistical testing showed that the gains are not significant (p-values > 0.05). Nonetheless, the results show that the combined approach RevCom is no better than our approach cHRev.

Therefore, we find support to accept Hypothesis H-3 in favor of cHRev.

Concerned with the potential drop in precision and recall, we continued our investigation of research question RQ2. We did a similar analysis to compute the gains of RevCom over xFinder to ascertain whether the combination would be more effective than xFinder. On a positive note, we found that all the gains are statistically significant (with the exception of Mylyn precision at m = 5). Therefore, RevCom outperforms xFinder. It is worth noting, however, that the gains of RevCom over xFinder could be lower than those of cHRev over xFinder. This behavior can be seen in the precision and recall results of Android Platform, Eclipse Platform, and MS Office. Our results suggest exercising caution about treating the combination-based and review-based recommenders as identical in performance. Overall, we find support to reject Hypothesis H-4.

RQ2: cHRev performs much better than REVFINDER, which is based on reviewers of files with similar names and paths, and xFinder, which relies on source code repository data; cHRev is statistically equivalent to RevCom, which requires both past reviews and commits.

7.4.7 Discussion

Here we discuss a few points that help explain the rationale behind the improved performance obtained by using reviews in cHRev. The reasons can be attributed to two aspects.

First, unlike REVFINDER, cHRev includes the number of individual days on which a reviewer provided feedback and also the time since the most recent review of each file. Both techniques use the number of past reviews of a changed file under current review to model expertise; however, cHRev also uses the number of comments in each review and the number of days on which a reviewer made comments on a file under review (sometimes multiple workdays for one review), because prolonged examination of a source code file could indicate an increased level of expertise. Further, research has shown that expertise in an area of code dwindles with time [39], and thus we incorporate recency, the amount of time since the last review of a file by a potential reviewer, into our approach. Moreover, cHRev was able to recommend reviewers in an overwhelming majority of the cases at the file level.
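A simplified sketch of how such markers could be combined into a per-reviewer, per-file score is shown below; it is not the exact cHRev expertise formula defined earlier in this chapter, and the weighting and decay are purely illustrative.

# Simplified, illustrative scoring (not the exact cHRev formula): more comments
# and more distinct review workdays on a file count as stronger evidence of
# expertise, decayed by the time elapsed since the last review of that file.
from datetime import date

def reviewer_score(comments, workdays, last_review_date, today):
    days_since_last = max((today - last_review_date).days, 1)
    return (comments + workdays) / days_since_last

print(reviewer_score(comments=5, workdays=2,
                     last_review_date=date(2014, 1, 1), today=date(2014, 1, 8)))  # 1.0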

Second, for all of the projects studied, we found many cases where reviewers provided quality feedback despite the fact that they had never made changes to the files or directories under review. We manually investigated the reasons why these people might have the expertise to provide such feedback as reviewers.

The Broader Community: In Android Platform, Mylyn, and Eclipse Platform, there is a limited number of people with permissions to make code changes, but a larger group that contributes patches, participates in bug reporting, or provides feedback on future design plans. Because they are still involved in the project in a technical way, they have expertise that is useful for reviewing changes.

Project Leaders: In all of the projects that we examined there are project leaders (known as

“development leads” at Microsoft) who are experienced developers and have intimate knowledge of the different systems within the project. They often act as reviewers and provide feedback about changes even though they had never changed the actual files under review.

Testers and Managers: In MS Office, we observed that testers and program managers participated in reviews quite often. While these people do not work on shipping code, their job responsibilities require that they take an active interest in changes. Testers must write tests that exercise the code under review, and program managers manage dependencies and interfaces between various systems in the code.

Developers of Related Code: Source code files do not exist in a vacuum. Most source code depends on and is depended on by other parts of the system. The associated developers have an interest in such changes. We observed many cases in which the change to a piece of code is reviewed by developers that work in related code (e.g., code that has a dependency on the changed code). This occurred in all four projects studied.

Developers of Unaccepted Contributions: Given the nature of OSS, there are often multiple attempts at resolving a given change request. For example, multiple (a few incomplete or incorrect) fixes may be attempted, perhaps by multiple developers. In the end, only a complete and correct resolution is accepted and/or merged into the source code repository (i.e., the main development trunk or branch). Of course, the commit history only records the final outcome (i.e., only the things committed). Gousios et al. [43] observed that in GitHub some issues receive multiple pull requests with solutions, but not all are accepted and merged. Our results show that past experiences (including failures) are important ingredients in shaping reviewer expertise.

In all of the cases described above, the source code repository does not capture activity reflective of the expertise of various team members. These people do participate in code reviews, either as reviewers (in most cases) or as authors of changes that are never accepted but are examined by the community. Thus, there are traces of their expertise in the review history. This is not surprising, as Rigby et al. [82] found that project participants in both industrial and OSS contexts are exposed to more source code through review than through making changes to the code (exposed to 44% to 150% more files on average). All of these observations support the conclusion that relying on the commit history of a source code repository carries the threat of missing potential reviewers who have valuable expertise. Similarly, the difference between the number of developers who authored commits and the number of reviewers who participated in the review process reveals the difference between the information provided by the commit history and the review history. To explore these differences, we calculated the Jaccard similarity between the list of reviewers from the review history and the list of developers from the commit history. The Jaccard values for Android Platform, Eclipse Platform, and Mylyn are 0.55, 0.45, and 0.67 respectively. These values show that these two lists are quite dissimilar.
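A minimal sketch of this comparison follows, assuming the two identity lists have already been extracted and normalized; the names are placeholders.

# Minimal sketch: Jaccard similarity between the reviewers seen in the review
# history and the developers seen in the commit history (placeholder names).
def jaccard(reviewers, committers):
    reviewers, committers = set(reviewers), set(committers)
    if not reviewers and not committers:
        return 0.0
    return len(reviewers & committers) / len(reviewers | committers)

print(jaccard({"alice", "bob", "carol"}, {"bob", "dave"}))  # 0.25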

During our study, we noted that the impact of using review data over commit data was more pronounced in MS Office. This is most likely due to the way that responsibility for code is handled at Microsoft. Bird et al. [26] found that strong ownership practices are employed at Microsoft. As a result, it is quite common that the owner of a particular piece of code may be the only one to have touched it in a long time, and in some cases ever. In these cases, the commit history is unlikely to provide much help in identifying a reviewer. However, expertise as exhibited in prior reviews by members falling into the groups discussed above is captured by cHRev, leading to improved recommendations. The fact that cHRev outperformed xFinder in our study reveals that considering only code ownership (code commits) provides a suboptimal solution for recommending reviewers. Additionally, cHRev outperformed RevCom, which indicates that combining the code ownership and review features may not necessarily improve the accuracy of recommending reviewers.

7.5 Threats to Validity

We will now discuss the internal, construct, and external threats to validity of the results of

our empirical study.

Considering Review Comments for the Entire Patch: We only considered review comments that were written for the entire code change, because that was the predominant case in our subject systems. For the Android Platform, Eclipse Platform, and Mylyn projects, most review comments are for the entire code change and the number of in-line comments is comparatively low (20% in-line comments). We did not perform a precise mapping of these comments to individual files. This may have given less relevant and more irrelevant weights to certain files. We plan to study systems with in-line review comments in the future.

Verifying or Reviewing the Code: In the Gerrit code review system, reviewers can take on two different roles: reviewing the code or verifying the code. Verification is generally related to running the test cases. We did not separate reviewers based on their roles. It is possible that separating the reviewers based on their roles could affect the results.

Correctness of Reviewer Recommendations: We considered the gold set to be the reviewers who contributed to the reviews of a given code change, and not those reviewers who were merely assigned to review the code change. We considered reviewers who contributed at least one comment rather than assigned reviewers, because not all (or any) of the assigned reviewers may eventually participate in the review process. We do not know whether these reviewers were the best, nor whether other reviewers were equally capable (but did not contribute due to issues such as workload and schedule). However, creating a gold set accounting for such factors is a challenging issue. Recently, different metrics for recommender systems have been defined, such as diversity [109]. We plan to incorporate this metric in future work.

Reviewer Identity Mismatch: Although we carefully examined the available sources of information to match the different identities of the same reviewer, it is possible that we missed or mismatched a few cases. There are several cases in which a developer's IDs are different in the git and Gerrit repositories.

Incongruent History: Although a common period was considered for extracting the review and commit datasets, the number of commit transactions is higher than the number of review transactions. There are several cases in which a code change was directly merged into the git repository without going through the review process. For the open source projects, commits were available before the reviews. It is possible that these datasets are not reflective of the optimum results.

Single Period of History: We considered one period of history for each system (see Section

7.4.3 for the specific reasons); however, we do not claim that our results would hold equally well for other chosen periods of history. A different history period might produce different results in terms of their relative performance.

Generalization: Although we investigated both closed and open source systems, we do not claim that our results would generalize to every software system in these domains.

7.6 Related Work

The reviewer recommendation task has not been examined much in the literature yet. There are only three approaches reported for reviewer recommendation. Balachandran [14] proposed a git-blame-like, line-oriented approach. Recently, Thongtanunam et al. [96] proposed an approach, namely REVFINDER, which is based on the past reviews of files with similar names and paths. They showed that REVFINDER outperformed Balachandran's approach on open source systems. REVFINDER finds past reviews with files whose paths and names are similar (based on string comparison) to the ones in the patch under review. It assigns all the reviewers from each such past review the same (string comparison) score. All the reviewers are ranked based on the sum of their scores. It does not look into other attributes of the past contributions of the reviews (e.g., how much and when) and is limited to whether a reviewer contributed or not. In summary, REVFINDER's expertise model favors breadth or generality of review contributions. In parallel to our work, Xia et al. [98] proposed another approach for reviewer recommendation, namely TIE. The intuition of TIE is that the same reviewers are likely to review changes containing similar terms (words), and reviewers are likely to review changes to the same files or to files in similar locations. TIE outperformed REVFINDER on open source systems. Similar to REVFINDER, TIE only looks into the similarity of patch descriptions and file paths. Unlike them, cHRev does not need textual information from code changes. Furthermore, TIE does not account for attributes such as the amount of comments and their recency. Additionally, our empirical evaluation included the closed-source domain and a comparison with a closely related methodology for developer recommendation.

Key difference between REVFINDER and our cHRev approach

cHRev looks at the specific contribution of each reviewer in past reviews on the code under review. This contribution is quantified using the numbers of feedback comments and days, and their recency. In summary, cHRev’s expertise model favors depth or specificity of review contributions.

The implication of the breadth and depth difference on the performance is that REVFINDER may end up with too many generic recommendations of package/subsystem owners (or gatekeepers) at the expense of too few specific contributors, including developers and leads who focus on a narrower code base. Our results on commercial and open source systems suggest that the depth analysis of past reviews leads to improved accuracy of recommendations.

7.6.1 Developer Recommendation

The task of automatically assigning issues or change requests (e.g., bug fixes or new features) to the developer(s) who are most likely to resolve them has been studied under the umbrella of issue triaging. A number of approaches exist in the literature [9, 106, 47, 7, 35, 16, 8, 10, 111]. Approaches for developer recommendation typically operate on software repositories (e.g., models trained from past bugs/issues or source-code changes), on source-code authorship, or on their combinations.

While similar to developer recommendation, reviewer recommendation is in a different domain. The developer recommendation task is in the domain of resolving a bug, whereas the reviewer recommendation task is in the domain of reviewing a change. Issue triage is a crucial activity in addressing change requests in an effective manner (e.g., within time, priority, and quality factors). One simple way to view the difference is that the developer recommendation occurs (developers/owners to resolve the change request) before the reviewer recommendation (reviewers to review the changed code to address the change request). Our results from both the open and closed source domains show that commit history is insufficient for gauging the needed reviewers. It is necessary to utilize past code reviews to find the appropriate reviewers accurately.

CHAPTER 8

Patch Acceptance Predictor

Large-scale open source projects typically receive numerous patches to address change requests (e.g., bug fixes). It is not uncommon for a submitted patch to undergo a peer-review process before it can be accepted as a valid solution and merged into the code base. A peer code review process has been shown to have an impact on software quality. Determining whether a submitted patch will be accepted or not could improve the efficiency of the peer-review process, e.g., help developers and reviewers prioritize and focus their efforts. In this chapter, we examine bug, patch, developer, and reviewer features that characterize the patches that get accepted or not. These features are extracted from the macro and micro repositories. We use logistic regression to form a descriptive model from these features. Moreover, we present a predictive model that classifies whether a patch will be accepted or not as soon as it is submitted. Logistic regression and support vector machine were employed to form the patch predictors.

To validate our approach, we conducted an empirical study on three open source projects: Android Platform, Eclipse Platform, and Mylyn. These systems require a peer review of submitted patches, which is managed by Gerrit. Our results show that features such as the owner bug assignee match (e.g., the developer who is assigned to resolve the issue in the bug tracking system is the same as the one who implemented the code change and submitted it for review), the reputation of developers in terms of their successful history of patch acceptance, the number of revisions made to the patch, and the number of reviewers greatly influence the likelihood of a patch getting accepted. Our predictor model achieves average accuracy values of 68%, 93%, and 92% on Android Platform, Eclipse Platform, and Mylyn respectively. Furthermore, we note that support vector machine performs better than logistic regression for patch acceptance prediction.

8.1 Introduction

Modern Code Review (MCR) is becoming prevalent across commercial and open source projects [81]. It is a peer-review process of inspecting the code changes, i.e., patches, submitted to resolve change requests. The benefits of MCR have been shown in improving software quality and knowledge transfer, among other things [13]. It is not uncommon for large open source projects to receive numerous change requests (e.g., bug reports or feature requests) daily [9]. Developers submit code changes or patches to resolve these change requests. These code changes go through an intensive MCR process before they can be merged (or not) into the main code base of the system. It involves human discussion and discourse on critiquing and improving the changes. Human reviewers inspect the code and offer suggestions. The developers of the code changes may revise their patches in response to the reviewers' feedback and submit them for another round of review.

Clearly, MCR is quite human intensive and may require a substantial amount of effort [82]. Not all changes or patches are eventually accepted and integrated into the code base. For example, in the Android Platform project only 54% of patches were eventually accepted from 01/21/2009 to 01/26/2014. Therefore, the question arises as to what makes patches get accepted or not. There have been a few previous efforts addressing this question; however, they have not been in the context of MCR [17], [50], [44], [19]. Also, the predictive recommendation and assessment of whether a patch will eventually be accepted or not, as soon as it is submitted for MCR, was not investigated previously.

In this chapter, we first investigate the factors that influence patch acceptance. We take into account several direct and indirect measures, which are derived from the structured and unstructured data available from the bug tracking (cause and importance of the issue) and code-review repositories (who, what, and how of the quality control of the fixes). As we mentioned in Chapter 2, the bug repository is an example of a macro repository and the code review repository is an example of a micro repository. These measures are then used to formulate a descriptive model. Secondly, we build a classifier to predict whether a patch will be accepted or not as soon as it is submitted for MCR. Understanding the factors and developing a patch-acceptance predictor may help improve MCR, including the tasks related to developers and reviewers. Knowing a positive likelihood may encourage developers to submit their patches for review promptly. In the case of a not so favorable likelihood, they could proactively revise their patches without waiting for reviewers' feedback. From the reviewers' perspective, the likelihood indicator may help them tailor their effort and focus on promising or challenging patches. We present predictive models based on two different techniques: Support Vector Machine and Logistic Regression. Additionally, we grouped the features from different domains and analyzed their impact within and across groups. Results from several evaluation metrics, such as accuracy, recall, and precision, are reported. More specifically, we investigate the following two research questions:

RQ1. Which features characterize the patches that get accepted, according to a statistical model? Logistic regression is used to infer the fix likelihood from features available throughout the bug tracking system and the patch lifecycle in the Gerrit code review system, i.e., a descriptive model.

RQ2. How well can a statistical model predict whether a patch will be accepted or not? Logistic regression and support vector machine are used to build predictors from the most influential factors available at the time of patch submission, i.e., a predictive model.
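A minimal sketch of the predictive setup behind RQ2 is given below. It assumes the features have already been extracted into a numeric matrix and uses random placeholder data; the choice of 10-fold cross-validation and the hyperparameters are illustrative, not the exact configuration of our experiments.

# Minimal sketch of the predictive setup: X would hold the extracted features
# (e.g., owner reputation, patch size, number of assigned reviewers) and y the
# Merged (1) / Abandoned (0) labels. Random placeholder data is used here, and
# the cross-validation setting and hyperparameters are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((200, 4))            # placeholder feature matrix
y = rng.integers(0, 2, size=200)    # placeholder Merged/Abandoned labels

for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("support vector machine", SVC(kernel="rbf"))]:
    scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.2f}")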

We formulated and validated the descriptive and predictive models on three open source projects: Android Platform, Eclipse Platform, and Mylyn. These systems require a peer review of submitted patches, which is managed by Gerrit. Results from the descriptive model show that factors such as the owner bug assignee match (e.g., the developer who is assigned to resolve the issue in the bug tracking system is the same as the one who implemented the code change and submitted it for review), the reputation of developers in terms of their successful history of patch acceptance, the number of revisions made to the patch, and the number of reviewers have a statistically significant role. Factors such as the component, severity, and priority of the reported issue, the number of changed lines and files in the patch, and reviewers' reputations based on past accepted patches did not have a statistically significant role. Our predictor model achieves accuracy values of 71%, 94%, and 93% on Android Platform, Eclipse Platform, and Mylyn respectively.

The rest of this chapter is organized as follows: Background on modern code review (MCR) and the subject systems is discussed in Section 8.2. The features used in the study are explained in Section 8.3. The patch acceptance descriptive model is defined and analyzed in Section 8.4. The patch acceptance predictive model is explained in Section 8.5. Predictive results are presented in Section 8.6. Threats to validity are listed and analyzed in Section 8.7.

8.2 Background on Modern Code Review (MCR) and Subject Systems

We first define key terms and then provide a brief background on the patch life cycle in

Modern Code Review (MCR).

8.2.1 Definitions of Key Terms

We use the following terms:

Patch: A patch is a set of changed files submitted to fix a bug or add a new feature.

Patch Review: A review is a session or record in the peer review process (e.g., in the Gerrit system) to manage the lifecycle of a submitted patch. For example, the author/developer of a patch creates a review while uploading it to the Gerrit code review system. The patch lifecycle is

maintained in this review entry.

Merged Patch: A merged patch is the one that successfully gets through the review process

and is merged to the source code repository of the system.

Abandoned Patch: An abandoned patch is one for which none of its revisions are accepted to be merged into the source code repository of the system.

Patch Owner: A patch owner or simply owner is the developer who submits the patch for review.

Patch Reviewer: A patch reviewer is the developer who is assigned and/or contributed to the patch review.

Assigned Reviewer: An assigned reviewer is the one who is selected by the patch owner while uploading the patch.

Figure 8.1: An example from Gerrit code review #74981 related to the Eclipse Platform project

Contributing Reviewer: A contributing reviewer is the one who actually participates in the code review process (e.g., by commenting on the patch).

Note that all the assigned reviewers for a patch may not be contributing reviewers. Also, contributing reviewers may not necessarily be assigned reviewers.

8.2.2 Patch Lifecycle in Gerrit

Gerrit is a modern peer-review tool that facilitates a traceable review process for git-based software projects [1]. Developers make local changes in their private git repositories and then submit these changes as a patch for review [82]. The owner may indicate the intended reviewers, who are subsequently notified about the review invitation. Alternatively, the owner can broadcast a request for review to find reviewers for their patch. It should be noted that the invited reviewers do not necessarily accept the invitation and contribute to the review. The approval step requires human contributions (e.g., to ascertain that the patch meets the project guidelines and intent), whereas the verification step is largely automatic (e.g., the build compiles and passes the unit test cases). If the patch fails either of these steps, it is abandoned. Figure 8.1 shows an example from the Eclipse Platform Gerrit for patch review #74981. It was accepted and merged into the project's git repository. This patch is for bug #495797, which was assigned to Ian Pun for a fix, and he patched it. The number of changed lines is 10 and the number of changed files is 1. Eric Williams reviewed and accepted this patch.


Figure 8.2: The patch life cycle in Gerrit

Figure 8.2 shows the patch life cycle in Gerrit. Initially, a developer (the patch owner) makes changes to the source code in response to a bug report or feature request. They submit these code changes for review. The patch owner may indicate the intended reviewers (assigned reviewers), who are subsequently notified about the review invitation. It should be noted that the invited reviewers do not necessarily accept the invitation and contribute to the review. Contributing reviewers then inspect the change through the code review tool Gerrit and provide feedback, in the form of review comments and scores, to the owner. A submitted patch can have different statuses, such as Needs Code Review, Ready to Submit, Abandoned, and Merged. For a patch to get accepted and finally merged into the repository, it is necessary that it receive at least one +2 score and not a single -2 score.

If a patch receives a score of -2, its status will be marked abandoned. If the patch has not received a -2 score, then the patch owner can rework the patch and resubmit a new version. The review process for a particular submission may include multiple iterations between the patch owner and the contributing reviewers. Eventually, a reviewer signs off on the review after they perceive the code change to be of sufficient quality and ready to be checked into the code repository. If a change never receives a sign-off, it is abandoned. The number of sign-offs required to check in a code change is typically dependent on a particular team's policy. Automatic tools typically perform the verification part.

8.2.3 Merged and Abandoned Patches

In our work, we need to characterize (describe) and predict whether a given patch will be merged into the repository or not. That is, we need two labels, Merged and Abandoned, which are derived from the Gerrit code review status. As discussed earlier, reviews in Gerrit can have different statuses. In the formulation of our predictor model, we only consider the code reviews whose status is Merged or Abandoned. The other statuses are not indicators of a definitive outcome and may indicate work in progress.
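A small sketch of this labeling step follows, assuming a simple record layout for the extracted Gerrit reviews; the review ids and statuses below are placeholders.

# Small sketch (assumed record layout and placeholder ids): keep only reviews
# with a definitive outcome and map the status to a binary label.
reviews = [
    {"id": 1, "status": "MERGED"},
    {"id": 2, "status": "ABANDONED"},
    {"id": 3, "status": "NEEDS CODE REVIEW"},  # work in progress, excluded
]

labeled = [(r["id"], 1 if r["status"] == "MERGED" else 0)
           for r in reviews if r["status"] in {"MERGED", "ABANDONED"}]
print(labeled)  # [(1, 1), (2, 0)]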

8.2.4 Subject Software Systems

We focused on Android Platform, Eclipse Platform, and Mylyn in this study. Table 8.1

shows review statistics for these three projects.

Android Platform: Out of the different Android projects, we selected Android Platform, which contains 6 years of code review history starting in January 2009 1. Android Platform is written in C, C++, and Java and consists of 4511 code reviews uploaded to the Gerrit repository that include at least one source code file. 3660 out of the 4511 reviews have the status "MERGED" or "ABANDONED"; we consider only these reviews in our study. 55% of the reviews in Android Platform include patches that were finally merged into the repository and 45% of the reviews are abandoned. In total, 44054 review comments were written for the Android Platform project.

Eclipse Platform: Eclipse contains 6 different sub-projects; in this study we consider only Eclipse Platform, because it has the most code review history available compared with the other sub-projects 2. Eclipse Platform contains about 3 years of code review data in Gerrit, starting in February 2012. The Eclipse Platform project is written in Java and consists of 1902 code reviews uploaded to the Gerrit repository that include at least one Java file.

1https://android-review.googlesource.com/#/q/platform/frameworks 2https://git.eclipse.org/r/#/q/platform,n,z

Table 8.1: Descriptive statistics of the reviews considered from the three open source projects in our study.

                      Android Platform   Eclipse Platform   Mylyn
Start Date            01/21/2009         02/28/2012         02/29/2012
End Date              01/26/2014         12/22/2014         12/26/2014
#Submitted Patches    4511               1902               1617
#Patches Analyzed     3660               1750               1549
#Merged Patches       2010               1419               1212
#Abandoned Patches    1650               331                337

1750 out of the 1902 reviews have the status "MERGED" or "ABANDONED"; we consider only these reviews in our study. 81% of the reviews in Eclipse Platform include patches that were finally merged into the repository and 19% of the reviews are abandoned. In the Eclipse Platform project, after each patch is created, an automatic review comment is added to the review by the Hudson tool. In our experiment, these review comments were deleted as noise. After removing this noise, a total of 14857 review comments were written for the Eclipse Platform project.

Mylyn: Mylyn is an Eclipse Foundation project whose source code is written in Java; it contains about 3 years of code review data in Gerrit, starting in February 2012. The Mylyn project consists of 1617 code reviews uploaded to Gerrit that include at least one Java file 3. 1549 out of the 1617 reviews have the status "MERGED" or "ABANDONED"; we consider only these reviews in our study. 78% of the reviews in Mylyn include patches that were finally merged into the repository and 22% of the reviews are abandoned. As with the Eclipse Platform project, noise was removed from the review comments. After removing this noise, a total of 14691 review comments were written for the Mylyn project.

For each of these systems, we removed the patches and reviews related to license updates

and major branch restructuring or merges.

3https://git.eclipse.org/r/#/q/mylyn,n,z

8.3 Features Used in the Study

We describe the features that were extracted (direct) and/or computed (derived) from the code-review (Gerrit) and issue-tracking (Bugzilla) repositories. These features capture the human effort, issue (bug) characteristics, and patch (review) characteristics, and their traceable relationships. The features are divided into three groups: bug features, patch features, and human features.

For selecting features, we used some features from [17, 19] and [44], and we extended this set. Features such as severity, owner bug assignee match, file size, entropy score, number of assigned and contributing reviewers, and the assigned reviewers' reputation are not investigated in [17, 19]. The only one of our features that has been investigated in [44] is the entropy score.

8.3.1 Bug Features

This category of features is related to the specific bug that the patch was written to resolve. In Mylyn and Eclipse Platform, the traceability between the bug tracking and patch review systems is explicit. That is, the bug id in the bug-tracking system is usually mentioned in the patch description in the code-review system. Some of the reviews in Android Platform include a bug id, but these bug ids refer to Android Platform's internal bug tracking system. Bug ids in the Android Platform internal bug tracking system are not the same ids as in the Android Platform public bug tracking system 4. Therefore, we could not extract bug features for Android Platform. The bug information for Mylyn and Eclipse Platform is available in the Eclipse bug tracking system 5.

Bug Component: In the Eclipse bug tracking system, bugs are categorized into classifications, products, and components. Components are defined as the second level of organization. We did not consider the product level because Eclipse Platform is already a product of Eclipse, hence defining the product level for Eclipse Platform would be meaningless. As discussed earlier, there is no trace to the bug id for each review in Android. Therefore, the product level cannot be considered for Android Platform and Eclipse Platform.

Eclipse Platform and Mylyn have different components and different frequencies of bugs and patch reviews. Table 8.2 shows the component frequency distribution for the analyzed reviews from Mylyn and Eclipse Platform. The None type is for reviews with no explicit link to a bug id. The Other type accounts for those reviews with less than a 1% share. In Mylyn, the highest share is 22% for the Framework component. In Eclipse Platform, the highest share is 71% for the UI component. This feature allows us to investigate whether patches for certain components are more successful than others. This feature is treated as categorical in our descriptive and predictive models.

4 https://code.google.com/p/android/issues/list
5 https://bugs.eclipse.org/bugs/

Table 8.2: Descriptive statistics of the feature components considered from Mylyn and Eclipse Platform.

Mylyn                                         Eclipse Platform
Component         %Frequency Distribution     Component    %Frequency Distribution
Framework         22                          UI           71
Gerrit Connector  13                          SWT          11
Wikitext          11                          Runtime      4
UI                9                           IDE          2
Bugzilla          5                           Resources    2
Other             15                          Other        3
None              25                          None         7

Bug Severity: This describes the importance of an issue/bug from the reporter's perspective (e.g., the loss of critical functionality or a purely cosmetic concern) and shows the level of impact the bug could have. Note that a reporter could be an end user, a developer, or a tester; therefore, the perspectives span both the application and solution domains. Table 8.3 lists the different severity types, which reporters provide while reporting a bug in open source systems [45]. For example, the first entry is from the solution/implementation domain, the last entry is an enhancement request, and the others indicate different intermediate levels of bug severity. The severity gets increasingly stringent from enhancement to blocker. Figure 8.3 shows the severity distribution between merged and abandoned reviews for both the Mylyn and Eclipse Platform projects. None is for reviews with no traceable bug ids. In Mylyn, the severity enhancement has the most merged reviews and the severity normal has the most abandoned reviews. In Eclipse Platform,

Table 8.3: Explanation of bug severity values

Severity value   Explanation
blocker          Blocks development and/or testing work
critical         Crashes, loss of data, severe memory leak
major            Major loss of function
normal           Regular issue, some loss of functionality under specific circumstances
minor            Minor loss of function, or other problem where an easy workaround is present
trivial          Cosmetic problem like misspelled words or misaligned text
enhancement      Request for enhancement

Figure 8.3: Bug severity frequency distribution for merged and abandoned reviews in Eclipse Platform and Mylyn.

the severity normal has the highest share of both merged and abandoned reviews. In our models,

we handle severity as a categorical feature.

Bug Priority: This is the assigned urgency of resolving a bug. The bug priority describes the importance and order in which a bug should be fixed compared to other bugs that are also open. Developers use this field to prioritize their work, where P1 is considered the highest and P5 the lowest priority in open source projects. The Eclipse documentation 6 provides these levels: P1 - a "stop ship" defect, i.e., we won't ship if it is not fixed; P2 - the intent is to fix it before shipping, but we will not delay the milestone or release; P3 - nice to have; P4 - low priority; P5 - lowest priority.

6 https://wiki.eclipse.org/Bug_Reporting_FAQ

Figure 8.4: Bug priority frequency distribution for merged and abandoned reviews in Eclipse Platform and Mylyn.

Figure 8.4 shows the priority distribution between merged and abandoned reviews for both Mylyn and Eclipse Platform. None is for those reviews for which we could not find the related bug id. For both Mylyn and Eclipse Platform, the highest frequency among both merged and abandoned reviews belongs to P3. Previous work [45] found that developers used at most three levels of priority in Eclipse and that the use of the priority/severity fields is inconsistent. They also found that P3 bugs have a shorter time to close than P1 and P2 bugs. We discarded priorities P4 and P5 because their frequency distribution was very low (below 1%). The percentage distribution of accepted patches for Eclipse Platform is 0.78%, 0.70%, and 0.81% for P1, P2, and P3 respectively. The percentage distribution of accepted patches for Mylyn is 0.70%, 0.77%, and 0.80% for P1, P2, and P3 respectively. There is not much difference between the acceptance rates of the different bug priority levels. In our models, we handle priority as a categorical feature.

Owner Bug Assignee Match: This feature is compounded from the assignee field in the bug tracking repository and the owner field in the code review repository. It takes a binary value indicating whether these two fields are the same, that is, whether the bug assignee is the same person who submitted the patch.

Table 8.4 shows the percentage distribution of merged and abandoned reviews based on this feature.

Table 8.4: Percentage distribution of merged and abandoned reviews based on the owner bug assignee match feature in Eclipse Platform and Mylyn.

                    Merged%                                                Abandoned%
                    None   Owner Bug        Owner Bug Assignee            None   Owner Bug        Owner Bug Assignee
                           Assignee Match   Does not Match                       Assignee Match   Does not Match
Eclipse Platform    31     63               6                             10     33               57
Mylyn               24     68               8                             28     50               22

In Eclipse Platform, 33% of abandoned reviews belong to owners who are also the bug

assignees, 57% of abandoned reviews belong to owners who are not the bug assignees, and for 10% of abandoned reviews we could not find the associated bug ids. 63% of Eclipse Platform merged

reviews belong to owners who are also the bug assignees, 6% of merged reviews belong to owners

who are not the bug assignees, and for 31% of merged reviews we could not find the associated bug ids. In Mylyn, 68% of merged reviews belong to owners who are also the bug assignees, 8% of merged reviews belong to owners who are not the bug assignees, and for 24% of merged reviews we could not find the associated bug ids. 50% of Mylyn abandoned reviews belong to owners who are also the bug assignees, 22% of abandoned reviews belong to owners who are not the bug assignees, and for 28% of abandoned reviews we could not find the associated bug ids. For Android

Platform, we do not have this feature because of the lack of traceability between patches and their bug reports. This quantitative analysis from Eclipse Platform and Mylyn shows that a substantial

portion of merged reviews belongs to owners who are assigned to resolve the change request.

8.3.2 Patch Features

These features are related to the submitted patches in Gerrit and were extracted for all three projects: Android Platform, Eclipse Platform, and Mylyn. All of the patch features except the number of patch revisions are extracted from the first submitted version of the patch.

Patch Size: The patch size is the sum of the number of lines added to and deleted from source code files in a patch. The size of patch could have an impact on the outcome of a review.

Larger patches could be tedious and require more effort to review [17]. Previous studies have found that smaller patches are more likely to be accepted [103].

Figure 8.5: Patch size group frequency distribution for merged and abandoned reviews in Android Platform, Eclipse Platform, and Mylyn.

Figure 8.5 shows the patch size frequency distributions for merged and abandoned reviews. Patches are divided into different groups based on the interquartile range (IQR) of their sizes; for each project, the quartiles are calculated separately. For example, the patches with sizes less than or equal to Q1 are grouped as G1, the patches with sizes greater than Q1 and less than or equal to Q2 are grouped as G2, and so on. Additionally, the average size for each group is provided. The highest frequencies of both merged and abandoned patches for Android Platform belong to group G1, with an average size of 2.5 lines. In Eclipse Platform, abandoned and merged patches follow the same frequency distribution across the different groups. In Mylyn, the frequencies of merged patches across the different groups are quite similar, and the highest frequency for abandoned patches belongs to group G4, with an average size of 792 lines.
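A minimal sketch of this quartile-based grouping is shown below, with placeholder patch sizes; in the study the quartiles are computed separately for each project on the benchmark data.

# Minimal sketch of quartile-based patch size grouping (placeholder sizes).
import numpy as np

patch_sizes = np.array([2, 3, 5, 8, 13, 21, 40, 90, 350, 800])
q1, q2, q3 = np.percentile(patch_sizes, [25, 50, 75])

def size_group(size):
    if size <= q1:
        return "G1"
    if size <= q2:
        return "G2"
    if size <= q3:
        return "G3"
    return "G4"

print([size_group(s) for s in patch_sizes])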

File Size: This is the number of source code files in the first revision of the submitted patch. Figure 8.6 shows the box plots for Android Platform, Eclipse Platform, and Mylyn. There is not much difference between the file size distributions of abandoned and merged patches for each project.

Number of patch revisions: It is the number of resubmitted revisions for a patch review.

It shows how many times the patch owner had to revise the patch (probably to address the review

156 Figure 8.6: File size Box Plot for merged and abandoned patches in Android Platform, Eclipse Platform, and Mylyn.

Figure 8.7: Number of patch revisions Box Plot for merged and abandoned patches in Android Platform, Eclipse Platform, and Mylyn.

comments) and resubmit it for another round of review. Figure 8.7 shows the box plots for all three systems. Eclipse Platform has the highest variation in terms of the number of patch revisions.

Entropy Score: This feature is based on the previous work of Hellendoorn et al. [44]. That study proposed that the project's coding style is one of the important aspects that reviewers consider, and used language models to capture the stylistic aspects of code. Its findings show that abandoned patches are significantly less similar to the project than merged patches. The main goal of using this feature is to quantitatively evaluate the influence of the stylistic similarity of submitted code (to existing code) on the code review outcome. Our use of this feature is in the same vein.

Hellendoorn et al. [44] state that, to judge the similarity of a sequence of tokens with respect to a corpus, a language model assigns it a probability by counting the relative frequency of the sequence of tokens in the corpus. In the natural language setting, these models are used for tasks such as speech recognition, machine translation, and spelling correction. Hellendoorn et al. [44] express the probability of a sequential language fragment s of N tokens w_1 ... w_N as:

p(s) = p(w_1) \cdot p(w_2 \mid w_1) \cdots p(w_N \mid w_1 \ldots w_{N-1}) = \prod_{i=1}^{N} p(w_i \mid w_1 \ldots w_{i-1}) \qquad (8.1)

Each p(w_i \mid w_1 \ldots w_{i-1}) can then be estimated as:

p(w_i \mid w_1 \ldots w_{i-1}) = \frac{c(w_1 \ldots w_i)}{c(w_1 \ldots w_{i-1})} \qquad (8.2)

where c denotes a count. As explained by Hellendoorn et al. [44], as the context of a token

lengthens, it becomes increasingly less likely that the sequence has been observed in the training data, which is detrimental to the performance of the model. N-gram models approach this problem

by approximating the probability of each token based on a context of the last n tokens only:

p(s) = \prod_{i=1}^{N} p(w_i \mid w_1 \ldots w_{i-1}) \approx \prod_{i=1}^{N} p(w_i \mid w_{i-n+1} \ldots w_{i-1}) \qquad (8.3)

Estimating the probabilities in an N-gram model is analogous to Equation 8.2, with the

counts only considering the last n words.

In NLP, the most commonly used metrics to measure the performance of a language model p are cross-entropy (H_p, measured in bits/token) and perplexity (PP_p). Given a sentence s of length N and the probability of each word in s, p(w_i \mid h) (where h denotes a context of some length), the cross-entropy (H_p) and perplexity (PP_p) are defined as:

H_p(s) = -\frac{1}{N} \sum_{i=1}^{N} \log_2 p(w_i \mid h), \qquad PP_p(s) = 2^{H_p(s)} \qquad (8.4)
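A toy sketch of Equation 8.4 with a bigram model and add-one smoothing is given below; the study itself relies on a full n-gram language model trained on the project's source code, so this is only an illustration of the computation.

# Toy illustration of Equation 8.4: a bigram model with add-one smoothing,
# trained on a tiny token sequence and evaluated on the added lines of a patch.
import math
from collections import Counter

def train_bigram(tokens):
    return Counter(zip(tokens, tokens[1:])), Counter(tokens), set(tokens)

def cross_entropy(test_tokens, bigrams, unigrams, vocab):
    v = len(vocab) + 1  # +1 to reserve mass for unseen tokens
    log_prob = 0.0
    pairs = list(zip(test_tokens, test_tokens[1:]))
    for prev, cur in pairs:
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + v)  # add-one smoothing
        log_prob += math.log2(p)
    return -log_prob / len(pairs)  # bits per token; lower means more "project-like"

train_tokens = "int i = 0 ; i = i + 1 ;".split()
patch_tokens = "i = i + 1 ;".split()
bigrams, unigrams, vocab = train_bigram(train_tokens)
entropy = cross_entropy(patch_tokens, bigrams, unigrams, vocab)
print(round(entropy, 2), round(2 ** entropy, 2))  # cross-entropy and perplexity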

Figure 8.8: Entropy score Box Plot for merged and abandoned patches in Android Platform, Eclipse Platform, and Mylyn.

The entropy score calculated for each patch in our study is based on Equation 8.4. For each review, we consider the source code in the first submitted patch version. All the added lines of each source code file inside the patch are considered as the test set. For creating the training set, a copy of the relevant project is reverted to the project's state just prior to when the patch was submitted. All the source code files existing in that revision of the project form the training set. We train a language model on all source code files from the training set and test it on the lines of code in the patch. The output of this step is an entropy score, reflecting the similarity of the submitted patch to the project code at the time of submission. Figure 8.8 shows the box plots for Android Platform, Eclipse Platform, and Mylyn. Eclipse Platform has the highest variation of entropy score. There is not much difference between the entropy scores of merged and abandoned patches in each project. The average entropy values for the Mylyn, Eclipse Platform, and Android Platform merged and abandoned patches are 6.62, 8.89, 12.45, 12.73, 8.86, and 9.27 respectively.

8.3.3 Human Features

These features will help us to investigate how much the effort and contributions of the patch owner and reviewers affect the outcome of code review process. These features are triangulated from the dimensions of patch and review sources.

Owner's Reputation: This feature shows the expertise of the patch owner in terms of both local and global contributions to eventually accepted patches. Equation (8.5) consists of two parts: the first term captures the owner's local acceptance rate and the second term captures the global acceptance rate over all owners. Multiplying these two terms yields a normalized reputation value.

Owner\ Reputation(p) = \frac{\#Merged_{ow}}{\#Submitted_{ow}} \times \frac{\#Merged_{all}}{\#Submitted_{all}}    (8.5)

ow denotes the owner of the patch p, #Merged_{ow} is the number of patches submitted by owner ow that are merged to the code repository, and #Submitted_{ow} is the number of all patches submitted for review by owner ow. #Merged_{all} is the number of all patches that are merged to the repository and #Submitted_{all} is the number of all patches submitted for review by all owners.

This feature is calculated dynamically for each individual patch p at a specific point of history (baselined at the patch's creation date). One possible case is that the status of a review changed after the baseline, i.e., after the patch's creation date. Reviews whose status was finalized after our baseline date should not be counted as merged or abandoned. The update date field of a review is usually the date on which its status was finalized as merged or abandoned. We therefore used the update date field to determine when the status of a review changed and compared this date with our baseline. If the update date is after the baseline, that review is considered a review in progress and is counted only toward the number of submitted reviews in the formula, even if its status is merged or abandoned. Figure 8.9 (a) shows the owner reputation box plot for the three subject systems. As expected, the owners' reputations for merged patches are higher than those for abandoned patches in all three systems. Figure 8.9 (b) shows the percentage distribution of merged reviews based on owner reputation. 34%, 12%, and 6% of merged patches for Android Platform, Eclipse Platform, and Mylyn belong to owners with no established reputation. These cases include owners with no prior successful patch acceptance or with a first submission. They mostly occurred in the early days of adopting Gerrit on these projects, when there was no prior history of its use.

Figure 8.9: (a) Owner reputation Box Plot for merged and abandoned patches belonging to Android Platform, Eclipse Platform, and Mylyn. (b) Percentage distribution of merged reviews based on owner reputation

Baysal et al. [17] calculate the Patch Writer Experience based on the number of submitted patches for each developer and then discretize the patch owners according to their contributions. In our method, the owner's patch acceptance rates (and not simply their submission rates) formulate their expertise.
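A minimal sketch of how Equation (8.5) could be computed at a patch's creation-date baseline is shown below. The dictionary field names (owner, status, created, updated) are assumptions loosely modeled on Gerrit change records, not the exact schema used in this study; dates are assumed to be comparable values (e.g., datetime objects or ISO-8601 strings).

```python
def owner_reputation(patch, all_reviews):
    """Equation (8.5): the owner's local acceptance rate times the global acceptance
    rate, both computed at the baseline, i.e., the creation date of `patch`."""
    baseline = patch["created"]

    def counts(reviews):
        merged = submitted = 0
        for r in reviews:
            if r["created"] >= baseline:
                continue  # review did not exist yet at the baseline
            submitted += 1
            # A review finalized after the baseline is treated as still in progress,
            # so it is counted as submitted but not as merged.
            if r["status"] == "MERGED" and r["updated"] < baseline:
                merged += 1
        return merged, submitted

    merged_ow, submitted_ow = counts([r for r in all_reviews if r["owner"] == patch["owner"]])
    merged_all, submitted_all = counts(all_reviews)
    if submitted_ow == 0 or submitted_all == 0:
        return 0.0  # owner with no established reputation (or an empty history)
    return (merged_ow / submitted_ow) * (merged_all / submitted_all)
```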

Number of Assigned Reviewers: Gerrit enables the patch owner to specify reviewers of their choice for their submitted patch, i.e., assigned reviewers. To investigate the difference between the lists of assigned reviewers and contributing reviewers, we calculated the Jaccard similarity on our three subject systems. The average Jaccard similarity values for Android Platform, Eclipse Platform, and Mylyn are 0.58, 0.80, and 0.80 respectively. These values indicate that the two lists are not identical. Figure 8.10 shows the percentage distributions of the number of assigned/contributing reviewers for abandoned and merged patches, and their clear differences. Figure 8.11 shows a box plot related to this feature. In these figures, assignee stands for the assigned reviewers and contributing stands for the contributing reviewers.
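For completeness, a small sketch of the Jaccard similarity between the assigned and the contributing reviewer lists of a single review; the field names are hypothetical.

```python
def jaccard(assigned, contributing):
    """Jaccard similarity |A intersect B| / |A union B| between two reviewer lists."""
    a, b = set(assigned), set(contributing)
    if not a and not b:
        return 1.0  # two empty lists are treated as identical
    return len(a & b) / len(a | b)

# e.g., averaging over all reviews of a system:
# avg = sum(jaccard(r["assigned"], r["contributing"]) for r in reviews) / len(reviews)
```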

Assigned Reviewer's Reputation: The assigned reviewer's reputation is computed similarly to the owner's reputation, except that the number of reviewed patches is considered instead of the number of submitted patches. Note that reviewers do not submit patches for review. For each assigned reviewer, rw, the reputation metric is calculated according to Equation (8.6).


Figure 8.10: Percentage distribution of merged reviews based on number of assigned/contributing reviewers


Figure 8.11: A Box Plot for the number of assigned/contributing reviewers for merged and abandoned patches

The sum of the reviewer reputations over all assigned reviewers of the patch p is taken as the final value.

Reviewer\ Reputation(p) = \frac{\#Merged_{rw}}{\#Reviewed_{rw}} \times \frac{\#Merged_{all}}{\#Reviewed_{all}}    (8.6)

#Merged_{rw} is the number of patches reviewed by reviewer rw that have been merged to the repository, and #Reviewed_{rw} is the number of all patches reviewed by reviewer rw. #Reviewed_{all} is the number of all patches reviewed by any reviewer in the review history.


Figure 8.12: Percentage distribution of merged reviews based on assigned/contributing reviewers' reputation

In our training model, this feature is calculated dynamically for each individual patch p at a specific point of history (baselined at the patch's creation date). Similar to the owner's reputation, we used the update date field when calculating the reputation. Figure 8.12 shows the percentage distribution of merged reviews based on the assigned reviewers' reputation. Figure 8.13 shows the differences in the assigned reviewers' reputation for merged and abandoned patches. A reviewer reputation of zero can arise for two reasons: 1) no established reputation of patch acceptance, and 2) no one was assigned to review the submitted patch. Baysal et al. [17] calculated the Reviewer Activity based on the number of previously reviewed patches and then discretized them according to their reviewing efforts using quartiles. Similar to the owner's reputation, we consider the acceptance rates of reviewers in formulating their expertise.

Number of Contributing Reviewers: A contributing reviewer is one who actually participated in the review process, including writing review comments and submitting review scores. It is typical for the patch owner to also act as a reviewer. In Mylyn, 60% of reviews are reviewed by someone other than the owner; of these, 71% are reviewed by one reviewer and 23% by two reviewers. In Eclipse Platform, 50% of reviews are reviewed by someone other than the owner; of these, 70% are reviewed by one reviewer and 22% by two reviewers. In Android Platform, 70% of reviews are reviewed by someone other than the owner; of these, 40% are reviewed by one reviewer, 30% by two reviewers, and 15% by three reviewers.


Figure 8.13: A Box Plot for the assigned/contributing reviewers' reputation for merged and abandoned patches

Figures 8.10 and 8.11 show more detailed information related to this feature.

Contributing Reviewer's Reputation: This feature only includes the contributing reviewers. Their reputation is calculated using Equation (8.6). Although both assigned and contributing reviewer reputations are calculated using the same formula, their values for the same reviewer can be different. The reputation is calculated dynamically, i.e., when a reviewer is associated with the patch review. For the same reviewer, their assigned reputation is calculated when the patch is submitted, whereas their contributing reputation is calculated when they start contributing to the review. The difference between these two times could result in different values of the two reputations (e.g., due to the same reviewer contributing to other patches in the meantime, thereby affecting their reputation). Figures 8.12 and 8.13 show more detailed information related to this feature.

8.4 Patch Acceptance Descriptive Model

The purpose of this model and study was to assess how well, if at all, the features extracted from bug and code-review repositories impact the likelihood of patches getting accepted (merged) or rejected (abandoned) in open source systems, i.e., to investigate the research question RQ1: Which features characterize the patches that get accepted (or rejected) using a statistical model? We use logistic regression, which takes the studied features (independent variables) described in Section 8.3 and gives a descriptive model that characterizes their impact on the binary outcome (dependent) variable, i.e., whether a patch is accepted or not. Specifically, we analyze the coefficients of the features and their statistical significance to determine whether they influence the outcome positively or negatively. We use the odds ratios to infer how much they impact the patch-acceptance likelihood.

8.4.1 Logistic Regression

Logistic regression is used as a classification method for analyzing a dataset in which one or more independent variables determine an outcome. It trains a model from one or several variables and gives the probability of the likely outcome/label for a previously unseen test example. In the logit model, the log odds of the outcome are modeled as a linear combination of the predictor variables [34]. In our study, we use the logit model for binary classification; a patch is predicted to be accepted when the model gives a probability greater than 0.6.
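A minimal sketch of this kind of descriptive logit model using the statsmodels library is given below. The data-frame column names are hypothetical stand-ins for the features of Section 8.3, and the dissertation's own analysis appears to use R; this Python sketch is only an equivalent illustration, not the original tooling.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# df is assumed to hold one row per review, with the features of Section 8.3 and a
# binary column `merged` (1 = accepted, 0 = abandoned); column names are hypothetical.
def fit_descriptive_model(df: pd.DataFrame):
    formula = ("merged ~ C(component) + C(priority) + C(severity) + C(owner_assignee_match) "
               "+ patch_size + num_revisions + file_size + entropy_score "
               "+ owner_reputation + num_assigned + num_contributing "
               "+ assigned_rep + contributing_rep")
    result = smf.logit(formula, data=df).fit(disp=False)

    summary = pd.DataFrame({
        "coefficient": result.params,          # log-odds ratios (sign gives direction)
        "odds_ratio": np.exp(result.params),   # exp(coefficient), as in Table 8.5
        "p_value": result.pvalues,             # significance, as in Table 8.6
    })
    # Likelihood-ratio test against the null (empty) model, analogous to the
    # chi-square comparison of the null and residual deviances discussed below.
    print("model vs. null model p-value:", result.llr_pvalue)
    return result, summary

def classify(result, df_new, threshold=0.6):
    # A patch is predicted as accepted when the modeled probability exceeds 0.6.
    return (result.predict(df_new) > threshold).astype(int)
```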

8.4.2 Data Extraction

In this section, we briefly explain how we extracted the data from Gerrit and Bugzilla.

Gerrit: To collect code review data for Android Platform, Eclipse Platform, and Mylyn, we reverse engineered the Gerrit JSON API and queried the Gerrit servers for the review data of each project. Gerrit works by initially sending a web page skeleton and some JavaScript to the browser. The JavaScript then makes a number of web requests back to the Gerrit server and requests information about code reviews, which is returned in JSON format. The Gerrit API provides an interface to JSON-formatted review data (https://gerrit-review.googlesource.com/Documentation/rest-api.html). This JSON data must still be parsed [73]. We also need to determine which fields in the displayed web page correspond to which fields within the JSON. The JSON response is fairly complex, deep, and redundant. After reading the Gerrit Code Review tutorial documentation, in the section "REST API/Query Changes" we found that there is a GET command ('GET /changes/', https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html#list-changes) that we can use for querying the changes submitted to Gerrit.

Table 8.5: Coefficient value and odds ratio of each feature used in the descriptive model with Logistic Regression.

                                        Coefficient                          Odds ratio
Feature                                 Mylyn     Eclipse    Android         Mylyn     Eclipse    Android
Priority (P2)                            0.31     -1.142        -             1.36      0.31         -
Priority (P3)                            0.91      0.32         -             2.48      1.37         -
Severity (critical)                      0.69    -12.9          -             1.98      ≈ 0          -
Severity (enhancement)                   0.63    -13.68         -             1.87      ≈ 0          -
Severity (major)                         1.62    -13.22         -             5.07      ≈ 0          -
Severity (minor)                         0.54    -13.24         -             1.71      ≈ 0          -
Severity (normal)                        0.89    -13.25         -             2.44      ≈ 0          -
Severity (trivial)                       1.65    -12.71         -             5.20      ≈ 0          -
Owner Bug Assignee Match (1)             1.76      0.68         -             5.79      1.96         -
Patch Size                              -0.0008   -0.0001    -0.0005          1         0.99       0.99
Number of Patch Revisions                0.23      0.523      0.22            1.26      1.68       1.25
File Size                               -0.009    -0.008     -0.01            0.99      0.99       0.98
Entropy Score                           -0.007    -0.011      0.04            0.99      0.98       1.03
Owner's Reputation                       2.14      1.24       2.58            8.55      3.46      13.24
Number of Assigned Reviewers             2.74      1.76       0.26           15.55      5.81       1.29
Number of Contributing Reviewers        -3.01     -1.45      -0.63            0.049     0.23       0.52
Assigned Reviewer's Reputation           1.47      1.88       0.05            4.37      6.54       1.05
Contributing Reviewer's Reputation       1.82      0.01       3.67            6.22      1.01      39.28

(Columns: Mylyn, Eclipse Platform, Android Platform.)

A query string must be provided via the q parameter, and the n parameter can be used to limit the number of returned results.

Each review in Gerrit has a unique number, which can be used to query the information related to that review. Not all of the data fields in the JSON response meet our needs; we extract only the fields of interest, which correspond to the features used by our predictor.
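A minimal sketch of such a query loop is shown below. The server URL and project query are illustrative, and the handful of fields kept at the end are only examples of the ones used as features; note that Gerrit prefixes its JSON responses with a magic )]}' line that must be stripped before parsing.

```python
import json
import requests

GERRIT = "https://git.eclipse.org/r/"   # illustrative server; Android uses its own Gerrit instance

def query_changes(query, batch=100):
    """Page through GET /changes/ for all reviews matching `query` (the q parameter)."""
    start, changes = 0, []
    while True:
        resp = requests.get(GERRIT + "changes/",
                            params={"q": query, "n": batch, "start": start,
                                    "o": ["DETAILED_ACCOUNTS", "MESSAGES", "DETAILED_LABELS"]})
        resp.raise_for_status()
        # Gerrit prepends the magic ")]}'" line to JSON responses; strip it before parsing.
        body = json.loads(resp.text.split("\n", 1)[1])
        changes.extend(body)
        if not body or not body[-1].get("_more_changes"):
            return changes
        start += len(body)

# Keep only the fields that feed the features, e.g.:
# for c in query_changes("project:mylyn/org.eclipse.mylyn.tasks status:merged"):
#     row = {"id": c["_number"], "status": c["status"], "created": c["created"],
#            "updated": c["updated"], "owner": c["owner"].get("_account_id")}
```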

Bugzilla: In our study, the bug reports for Mylyn and Eclipse Platform are extracted from the online bug tracking system Bugzilla (https://bugs.eclipse.org/bugs/). For Android Platform, we could not find the related bug IDs from the reviews.

8.4.3 Results

For each feature in our descriptive model, three different values (the coefficient value, the odds ratio, and the magnitude of the p-value) are reported in Tables 8.5 and 8.6. Not all the features we considered are numeric: Component, Priority, Severity, and Owner Bug Assignee Match are categorical features. In the logistic regression model, these features are treated as factors. In logistic regression, one value of a factor is treated as the baseline and the other values take their coefficients relative to it. The coefficient for each value of these features is reported in Table 8.5. The coefficients returned from a logistic regression model are log-odds ratios. They tell us how the log-odds of a "success" change with a one-unit change in the independent variable. Increasing the log-odds of a success means increasing the probability, and vice versa, i.e., decreasing the log-odds of a success means decreasing the probability. Therefore, the sign of the log-odds ratio indicates the direction of its relationship: a + sign means a positive relationship between the specific feature and the likelihood of the patch getting accepted, and a - sign means a negative relationship.

Next, we wanted to see how much each feature affects the likelihood of a patch getting accepted or not. Table 8.5 shows the odds ratio for each feature used in the descriptive model. The odds of accepting a patch are defined as the ratio of the probability of acceptance to the probability of non-acceptance. For a feature, the odds ratio is calculated from the exponent of its coefficient. The odds ratio gives the change in the outcome for a one-unit increase in the feature. For example, the odds ratio of 8.55 for the numeric feature Owner's Reputation in Mylyn states that for every one-unit increase in it, the odds of the submitted patch getting accepted (over remaining unaccepted) increase by 755%. An odds ratio of less than 1 means that the odds of the submitted patch getting accepted decrease. For factors (categorical features), the interpretation is with respect to the baseline value. That is, the odds ratio gives the change in the outcome for a change from the baseline value of a feature to a specific other value.
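As a worked example, using the Owner's Reputation coefficient for Mylyn from Table 8.5 (β = 2.14, reported odds ratio 8.55, where the small gap comes from rounding the printed coefficient):

OR = e^{\beta}, \qquad e^{2.14} \approx 8.5, \qquad (8.55 - 1) \times 100\% = 755\%

so a one-unit increase in this feature multiplies the odds of acceptance by roughly 8.55, i.e., an increase of about 755%.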

Finally, we wanted to see the impact of each feature on the outcome of a patch getting accepted or not. Table 8.6 shows the magnitude of the p-values from statistical testing for each feature. A dot represents a p-value between 0.05 and 0.1, one star a p-value between 0.01 and 0.05, two stars a p-value between 0.001 and 0.01, three stars a p-value less than 0.001, and a blank entry a p-value greater than 0.1.

Table 8.6: P-values from statistical significance tests for the features used in the descriptive model with Logistic Regression.

Feature                              Mylyn    Eclipse Platform    Android Platform
Component                                     *                   -
Priority                                                          -
Severity                                                          -
Owner Bug Assignee Match             ***      ***                 -
Patch Size                           .                            **
Number of Patch Revisions            *        ***                 ***
File Size
Entropy Score                                                     *
Owner's Reputation                   *        .                   ***
Number of Assigned Reviewers         ***      ***                 ***
Number of Contributing Reviewers     ***      ***                 ***
Assigned Reviewer's Reputation                **
Contributing Reviewer's Reputation                                ***

For factors, the p-value was determined collectively over all of their values (using the Wald test in R). A feature with at least one star is considered statistically significant. It is clear from Table 8.6 that each system has a different set of significant features. Therefore, we needed to assess how good the model was in terms of describing the outcome for each system. We analyzed the null and residual deviances with a chi-square test. In each system, the p-value was very close to zero, which indicates that the model as a whole is statistically better than a null model (i.e., the empty model with no features).

In summary, logistic regression on the considered features provided a good-fit model for patches that get accepted or not.

Now, we explain the coefficient value, the odds ratio, and the magnitude of the p-value for each feature in detail.

With regard to the feature Component, Framework for Mylyn and UI for Eclipse Platform are considered as the baselines in our descriptive model. Considering Table 8.2, these two components have the highest frequency distributions of reviews in our dataset; therefore, we consider them as the baselines.

The highest coefficient value of the feature Component for Mylyn is 16.16, which is for build. Although the build component does not have the highest frequency distribution of reviews, all the patches related to this component were eventually merged to the repository. The highest coefficient value of the feature Component for Eclipse Platform is 15.67, which is for Ant; again, although the Ant component does not have the highest frequency distribution of reviews, all the patches related to this component were eventually merged to the repository. For both Mylyn and Eclipse Platform, there are components with negative as well as positive coefficients.

The feature Component is not significant for Mylyn but it is significant for Eclipse Platform. This difference can be attributed to the nature of these two systems. Mylyn is a project from the Eclipse Foundation, whereas Eclipse Platform is a product; that is, Mylyn is at a higher level of organization than Eclipse Platform. Referring to Table 8.2, the frequency distributions of reviews over components in Mylyn and Eclipse Platform are different. For example, 71% of Eclipse Platform patches are in the UI component. Baysal et al. [19] also found that the feature Component is not significant in two other open source projects, WebKit and Blink.

With regard to the feature Priority, we consider P1, which has the highest priority, as the baseline for both Mylyn and Eclipse Platform. We want to assess whether changing the priority level to P2 or P3 changes the chance of acceptance. The Priority coefficient values of Mylyn for both P2 and P3 are positive. Although these values are not particularly high, they still indicate an increase in the probability of the patches getting accepted. As mentioned in Section 8.3.1, the patch acceptance rates for Mylyn are 70%, 77%, and 80% for P1, P2, and P3 respectively. These acceptance rates explain the positive coefficient values for Mylyn. In Eclipse Platform, the coefficient value for P2 is negative and for P3 it is positive. The patch acceptance rates for Eclipse Platform are 78%, 70%, and 81% for P1, P2, and P3 respectively. These acceptance rates explain the negative and positive coefficient values for Eclipse Platform. The feature Priority is not significant for Mylyn or Eclipse Platform. Baysal et al. [19] found that the feature Priority is significant for WebKit but not available for Blink. This difference can be attributed to the nature of WebKit and the Eclipse Foundation projects.

With regard to the feature Severity, we consider blocker, which has the highest severity, as the baseline for both Mylyn and Eclipse Platform. We want to assess whether changing the severity to a less severe level changes the chance of acceptance. Blocker has a patch acceptance rate of 100% for Eclipse Platform, but this is not the case for the other severity levels. Therefore, the reported coefficient values for Eclipse Platform's severity levels in Table 8.5 are all negative. In the case of Mylyn, blocker has the lowest patch acceptance rate; therefore, the reported coefficient values for the other Mylyn severity levels in Table 8.5 are all positive. We manually checked the abandoned Mylyn patches with severity blocker to see the reason for this low acceptance rate. All of them were abandoned because they were moved to, and merged through, other reviews with the same severity level (blocker). For example, review id #28749, related to bug id #436398 with severity level blocker, was abandoned and moved to review id #27759, which was merged to the repository. This is the case for all of Mylyn's abandoned patches related to bugs with severity blocker. Therefore, the actual acceptance rate for Mylyn's blocker-level reviews is similar to Eclipse Platform and is 100%. Figure 8.3 shows the frequency distribution of merged and abandoned reviews for Mylyn and Eclipse Platform across the different severity values. The highest patch acceptance rate for Mylyn belongs to blocker (indirectly) and trivial, and for Eclipse Platform it belongs to blocker. Regarding Table 8.6, the feature Severity is significant neither for Mylyn nor for Eclipse Platform. The odds ratios in Table 8.5 for Eclipse Platform are all close to zero, which means that changing the severity level from blocker to any other severity level dramatically decreases the odds of the patch getting accepted. This feature is not investigated in [19].

With regard to the feature Owner Bug Assignee Match, we consider the value 0 (i.e., the assignee is not the same person submitting the patch) as the baseline. The coefficient values related to this feature, shown in Table 8.5, are positive for both Mylyn and Eclipse Platform. This implies that the chance of a patch getting accepted is higher when the assignee is the same person as the patch submitter. Table 8.4 shows the percentage distribution of merged and abandoned reviews for Mylyn and Eclipse Platform regarding this feature; this distribution explains the positive coefficient values. Regarding Table 8.6, the feature Owner Bug Assignee Match is significant for both Mylyn and Eclipse Platform. Baysal et al. [19] did not investigate this feature.

With regard to the feature Patch Size, the coefficient values are negative for all three systems. Based on Figure 8.5, the highest acceptance rate for Mylyn is 78% and belongs to group G2, with an average size of 7 lines. The highest acceptance rates for Eclipse Platform and Android Platform are 81% and 60%, which belong to group G1, with average sizes of 5 and 2.5 lines respectively. Increasing the size of a patch is associated with a decreasing chance of its acceptance. This is consistent with findings from previous studies that smaller patches are more likely to be accepted [103]. Similarly, Tao et al. [95] found that an overly large patch size is one of the rejection reasons for the Eclipse project. In Table 8.6, Patch Size is significant only for Android Platform. The effect of this feature on the positivity of the review outcome is not available in [19].

The coefficient values for the feature Number of Patch Revisions are positive for all three systems, and this feature is significant for all three systems. Considering Figure 8.7, the merged patches of Mylyn and Eclipse Platform have gone through more rounds of revision than the abandoned patches, which explains the positive coefficient values. One might expect successful patches to go through fewer rounds of revision, but in fact, when reviewers are quite sure that a submitted patch has the potential to be merged to the code repository, they tend to be more demanding about the code change and ask the author to submit the best possible quality of code. This causes the author to rework the patch and submit several revisions. This is not the case when reviewers do not see merge potential in the code change. In general, Android Platform's patches have gone through fewer rounds of revision than those in Eclipse Platform and Mylyn; Eclipse Platform patches have the highest number of revision rounds. The effect of this feature on the positivity of the review outcome is not available in [19].

The coefficient values for the feature File Size are negative for all three systems, and this feature is not significant for any of them. Considering Figure 8.6, patches submitted for Android Platform touch the lowest number of files and are the smallest in comparison with Mylyn and Eclipse Platform. This feature is not investigated in [19].

Hellendoorn et al. [44] found that the average entropy of rejected patches is significantly higher than that of accepted patches. In our study, there is not much difference between the average entropies of merged and abandoned patches in any of the three systems, although the average entropy of merged patches is slightly higher than that of abandoned patches. Mylyn patches have the lowest entropy score, which means the patches written in Mylyn are more similar to its codebase than in the two other systems. The coefficient values reported for the Entropy Score feature are negative for Mylyn and Eclipse Platform; this means that increasing the entropy score decreases the chance of acceptance, which is consistent with the findings in [44]. The coefficient value is positive for Android Platform, but the odds ratio is 1.03, which means that a one-unit increase in the entropy score increases the odds of patch acceptance by only 3%; in practice, an increase in the entropy score barely changes the odds of the patch getting accepted. This feature is significant only for Android Platform. This feature is not investigated in [19].

Perhaps as expected, the coefficient values for the feature Owner's Reputation are positive (see Table 8.5). Similarly, Figure 8.9 (a) shows that the owners of merged patches have higher reputations than those of abandoned patches in all three systems. The owners of Android Platform patches have lower reputations than those of Mylyn and Eclipse Platform. This feature is statistically significant for Mylyn and Android Platform; for Eclipse Platform the reported p-value is 0.059, which is close to 0.05. The odds ratio values reported for this feature are substantial for all three systems. For example, a one-unit increase in the owner's reputation increases the odds of the patch getting accepted by 1224% for Android Platform. Our finding is consistent with [19].

The coefficient values for the feature Number of Assigned Reviewers are positive for all three systems; increasing the number of assigned reviewers increases the chance of the patch getting accepted. Figure 8.11 shows that merged patches usually have more assigned reviewers than abandoned patches. In the case of Eclipse Platform, a large portion of the abandoned patches have no assigned reviewers. Some of these cases belong to the main developers of the project who have commit permission and hence do not need other reviewers to review their code; they uploaded their patch to Gerrit and later abandoned it for a specific reason. Android Platform has the highest variation in the number of assigned reviewers. Figure 8.10 shows that the majority of merged patches for Eclipse Platform have one assigned reviewer, whereas for Mylyn the typical value is one or two and for Android Platform two or three. This feature is statistically significant for all three systems. This feature is not investigated in [19].

The coefficient values for the feature Number of Contributing Reviewers are all negative, which means that increasing the number of reviewers in the review discussion has a negative impact on the review outcome. Rigby et al. [84] refer to this as "bike shed painting" or Parkinson's law of triviality: too many reviewers contribute to the review discussion, and unimportant changes get discussed to a deadlock, though it is unclear whether this alone explains why such patches are abandoned. Figure 8.11 shows that for Mylyn and Eclipse Platform the numbers of contributing reviewers for merged and abandoned patches are quite similar. This feature is statistically significant for all three systems. This feature is not investigated in [19]. The two reviewer-count features act in opposite directions: increasing the number of assigned reviewers increases the chance of the patch getting accepted, while increasing the number of contributing reviewers decreases it. We compared the number of assigned reviewers with the number of participating reviewers for each review in all three systems. For 40%, 87%, and 83% of abandoned reviews in Android Platform, Eclipse Platform, and Mylyn, the number of participating reviewers is higher than the number of assigned reviewers, while these ratios are 4%, 20%, and 5% for the merged reviews of Android Platform, Eclipse Platform, and Mylyn respectively.

The feature Assigned Reviewer's Reputation considers the reputation of the assigned reviewers. For all three systems, the coefficient values are positive. Similarly, Figure 8.13 shows that the assigned reviewers of merged patches have a higher reputation than those of abandoned patches in all three systems. Android Platform has less variation between the reputations of merged and abandoned patches than Mylyn and Eclipse Platform, which explains why Android Platform has the lowest coefficient value (0.05) and the lowest odds ratio (1.05) among the three systems. Figure 8.12 shows the percentage distribution of merged patches based on the assigned reviewers' reputation. Interestingly, for all three systems the highest percentage belongs to a reputation value of 0.6. This feature is statistically significant only for Eclipse Platform. Baysal et al. [19] found that reviewer activity is not statistically significant for WebKit and Blink.

The coefficient values related to the feature Contributing Reviewer's Reputation are positive for all three systems. We would like to compare this feature with Assigned Reviewer's Reputation. It is clear from Figure 8.13 that the contributing reviewers' reputation varies less between merged and abandoned patches than the assigned reviewers' reputation does, for all three systems. Figure 8.12 shows that for Mylyn and Eclipse Platform, the contributing reviewers' reputation and the assigned reviewers' reputation follow the same pattern, but for Android Platform the pattern is quite different. As mentioned earlier in Section 8.3.3, the average Jaccard similarity values between the lists of contributing and assigned reviewers for Android Platform, Eclipse Platform, and Mylyn are 0.58, 0.80, and 0.80 respectively. Android Platform has the lowest similarity, which explains the difference observed for Android Platform. This feature is statistically significant only for Android Platform. Baysal et al. [19] found that reviewer activity is not statistically significant for WebKit and Blink.

As noted above, each system has a different set of significant features, yet the chi-square analysis of the null and residual deviances showed that, for each system, the model as a whole is statistically significantly better than a null model (i.e., the empty model with no features). In summary, logistic regression on the considered features provided a good-fit model for patches that get accepted or not.

8.5 Patch Acceptance Predictive Model

The purpose of this model and study was to assess how well the prediction model performs on the subject software systems, i.e., to investigate the research question RQ2: How well can whether a patch will be accepted or not be predicted using a statistical model? Additionally, we wanted to compare the results of two different classification techniques: Logistic Regression (LR) and Support Vector Machine (SVM). The choice of SVM was driven by the fact that it was used in other related empirical studies [57].

The problem of predicting whether a given patch will be accepted or rejected can be considered an instance of classification. More specifically, it is an instance of binary classification whose prediction outcome is accept or reject.

There are previous studies [17], [50] that attempted to study the impact of different features on the outcome of the code review process. We additionally created a predictive model that compares the effect of different groups of features on the patch acceptance outcome. We formulated and evaluated our predictive model on different combinations of the bug, patch, and human groups of features. Next, we describe the prediction model and the specific features used.

8.5.1 Support Vector Machines

A support vector machine (SVM) is a supervised learning model for two-group classification problems. Input vectors are non-linearly mapped to a very high-dimensional feature space, in which a linear decision surface is constructed. Special properties of the decision surface ensure a high generalization ability of the learning machine. An SVM performs classification by finding the hyperplane that maximizes the margin between the two classes; the vectors that define the hyperplane are the support vectors. New examples are then mapped into that same space and predicted to belong to a category based on which side of the margin they fall on [33].
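A minimal sketch of such an SVM classifier with scikit-learn is given below; the RBF kernel, the feature scaling, and the default hyperparameters are assumptions, since the text does not specify the SVM configuration used.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_svm(X_train, y_train):
    # RBF kernel with feature scaling; y is 1 for accepted (merged), 0 for unaccepted.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
    model.fit(X_train, y_train)
    return model

# predictions = train_svm(X_train, y_train).predict(X_test)
```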

8.5.2 Predictive Features Used

Not all the features we used in the descriptive model are available at patch submission time. Hence, we considered only those features that are available when the patch is submitted. Specifically, the features Component, Priority, Severity, Owner Bug Assignee Match, Patch Size, File Size, Entropy Score, Owner's Reputation, Number of Assigned Reviewers, and Assigned Reviewer's Reputation were used in the predictive model. Note that Component, Priority, Severity, and Owner Bug Assignee Match are not applicable for Android Platform.

8.5.3 Training and Test Sets

To conduct our evaluation, we divided the dataset for each system into training and test sets. To create them, we first sorted the dataset based on the review creation date, since we cannot use information from future reviews in the training set to predict past reviews. The dataset was split into 10 folds and we conducted 10 runs. In each run, 2/3 and 1/3 of the data were used as the training and test sets, and one fold was dropped for the next run. That is, the first run used 10 folds and the last run used 1 fold.
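The following sketch shows one possible reading of this scheme: the reviews are sorted by creation date, cut into 10 equal folds, the most recent remaining fold is dropped after each run, and the remaining data are split chronologically into 2/3 for training and 1/3 for testing. The field name created is a hypothetical stand-in for the review creation date.

```python
def chronological_runs(reviews, n_folds=10):
    """Yield (train, test) pairs; reviews must be sortable by their creation date."""
    data = sorted(reviews, key=lambda r: r["created"])
    fold_size = len(data) // n_folds
    for run in range(n_folds):
        kept = data[: len(data) - run * fold_size]   # drop one (most recent) fold per run
        if len(kept) < 3:
            break
        cut = (2 * len(kept)) // 3                   # 2/3 oldest for training, 1/3 newest for testing
        yield kept[:cut], kept[cut:]
```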

8.5.4 Evaluation Metrics

Our predictive model should correctly predict both accepted and rejected patches, so we need to calculate the evaluation metrics separately for accepted and rejected predictions. Hence, we define four possible outcomes of the patch prediction model.

Patch_aa: the number of patches that are predicted as accepted and were actually accepted.

Patch_au: the number of patches that are predicted as accepted and were actually unaccepted.

Patch_uu: the number of patches that are predicted as unaccepted and were actually unaccepted.

Patch_ua: the number of patches that are predicted as unaccepted and were actually accepted.

We use the well-defined metrics precision and recall for the two different outputs of the classifier, and the accuracy metric to evaluate the overall performance of the models. Accuracy is the overall ability to correctly predict both accepted and unaccepted patches over the total number of predictions (Equation 8.7a). It accounts for both type I errors (false positives) and type II errors (false negatives).

Accuracy = \frac{Patch_{aa} + Patch_{uu}}{Patch_{aa} + Patch_{au} + Patch_{uu} + Patch_{ua}}    (8.7a)

Precision_{ap} = \frac{Patch_{aa}}{Patch_{aa} + Patch_{au}}    (8.7b)

Recall_{ap} = \frac{Patch_{aa}}{Patch_{aa} + Patch_{ua}}    (8.7c)

Precision_{uap} = \frac{Patch_{uu}}{Patch_{uu} + Patch_{ua}}    (8.7d)

Recall_{uap} = \frac{Patch_{uu}}{Patch_{uu} + Patch_{au}}    (8.7e)

In our context, precision and recall allow us to evaluate the model from the perspective of the accepted patches; similarly, we need to evaluate the model from the perspective of the unaccepted patches. Hence, we report the precision and recall values for both accepted and unaccepted patches. Precision_ap and Recall_ap are the precision and recall values for accepted patches (Equations 8.7b and 8.7c), and Precision_uap and Recall_uap are the precision and recall values for unaccepted patches (Equations 8.7d and 8.7e). Comparing these four metrics allows us to compare the performance of the predictive model for predicting both accepted and unaccepted patches in terms of both precision and recall, while the accuracy metric evaluates the overall performance of the model.
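A small sketch of these metrics (Equations 8.7a-8.7e), computed from actual and predicted labels where 1 denotes accepted and 0 unaccepted:

```python
def patch_metrics(actual, predicted):
    """Compute Accuracy and Precision/Recall for accepted (ap) and unaccepted (uap) patches."""
    aa = sum(1 for a, p in zip(actual, predicted) if p == 1 and a == 1)  # Patch_aa
    au = sum(1 for a, p in zip(actual, predicted) if p == 1 and a == 0)  # Patch_au
    uu = sum(1 for a, p in zip(actual, predicted) if p == 0 and a == 0)  # Patch_uu
    ua = sum(1 for a, p in zip(actual, predicted) if p == 0 and a == 1)  # Patch_ua

    def safe_div(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": safe_div(aa + uu, aa + au + uu + ua),   # Eq. 8.7a
        "precision_ap": safe_div(aa, aa + au),              # Eq. 8.7b
        "recall_ap": safe_div(aa, aa + ua),                 # Eq. 8.7c
        "precision_uap": safe_div(uu, uu + ua),             # Eq. 8.7d
        "recall_uap": safe_div(uu, uu + au),                # Eq. 8.7e
    }
```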

8.6 Predictive Results

We used either a Logistic Regression (LR) model or a Support Vector Machine (SVM) model with the training sets, and then used the test sets to assess the performance of the models. Each technique was run for four different groups of features. The average, minimum, and maximum values of Accuracy, Precision_ap, Recall_ap, Precision_uap, and Recall_uap over the 10 different test sets are reported in Tables 8.7 and 8.8.

In the case of logistic regression, we assessed whether the model formed from each of the training sets in each group of features was a good fit. We employed the chi-square statistical test, and the p-value for each model was much less than 0.05, which implied that the model was a good fit.

Table 8.7: Performance of Logistic Regression (LR).

                        Bug+Patch+Human       Patch+Human           Bug+Patch             Bug+Human
                        Max   Avg   Min       Max   Avg   Min       Max   Avg   Min       Max   Avg   Min
Accuracy
  Mylyn                 0.91  0.84  0.76      0.92  0.85  0.76      0.84  0.74  0.43      0.91  0.87  0.77
  Eclipse Platform      0.97  0.91  0.84      0.96  0.93  0.84      0.88  0.80  0.72      0.98  0.91  0.75
  Android Platform      -     -     -         0.77  0.65  0.47      0.65  0.46  0.23      0.76  0.65  0.46
Precision_ap
  Mylyn                 0.92  0.86  0.80      0.91  0.86  0.80      0.86  0.79  0.54      0.92  0.88  0.80
  Eclipse Platform      0.97  0.92  0.83      0.97  0.93  0.83      0.89  0.80  0.72      0.97  0.92  0.83
  Android Platform      -     -     -         0.97  0.66  0.45      1     0.43  0.31      0.87  0.66  0.44
Recall_ap
  Mylyn                 0.99  0.94  0.81      1     0.96  0.72      0.98  0.88  0.47      0.99  0.96  0.91
  Eclipse Platform      1     0.97  0.89      1     0.99  0.97      1     0.99  0.96      1     0.96  0.76
  Android Platform      -     -     -         0.81  0.65  0.50      0.63  0.08  0         0.8   0.65  0.5
Precision_uap
  Mylyn                 0.90  0.71  0.27      0.98  0.86  0.67      0.59  0.40  0.1       0.95  0.76  0.43
  Eclipse Platform      1     0.86  0.48      1     0.92  0.86      0.5   0.16  0         1     0.88  0.31
  Android Platform      -     -     -         0.87  0.62  0.48      0.70  0.48  0.23      0.86  0.62  0.48
Recall_uap
  Mylyn                 0.81  0.45  0.15      0.81  0.42  0.14      0.43  0.19  0.05      0.81  0.51  0.17
  Eclipse Platform      0.80  0.67  0.47      0.82  0.72  0.47      0.08  0.01  0         0.80  0.69  0.47
  Android Platform      -     -     -         0.79  0.64  0.41      1     0.92  0.37      0.79  0.65  0.40

The results from each predictor model (LR, SVM), based on the different groups of features and trained on the 10 different training sets for each system, were compared with each other using Kruskal-Wallis tests; all p-values were less than 0.05, which means there are statistically significant differences between the results.

To evaluate the classifiers, we compare the results of the predictive models with a dummy predictor. The dummy predictor randomly predicts the label of a patch; based on the distribution of accepted and unaccepted patches for each system, a corresponding weight is used for the randomization. For example, in the case of Eclipse Platform, the proportion of accepted patches is 81% and the proportion of unaccepted patches is 19%, so the dummy predictor labels 81% of the test set as accepted and 19% as unaccepted. We ran the dummy predictor 20 times for each test set and then calculated the average, maximum, and minimum.

We also compare our results with a Zero model, which has been used in previous studies [49]. The Zero model always predicts "accepted". For example, if 80% of the patches are accepted, then a Zero model would have an accuracy and precision of 80% and a recall of 100%. Table 8.9 shows the results for the dummy predictor and the Zero model.
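A minimal sketch of the two baselines, under the assumption that the weighted dummy predictor simply draws each label independently with the observed acceptance ratio:

```python
import random

def dummy_predict(test_size, accepted_ratio, runs=20, seed=0):
    """Randomly label patches, accepted with probability equal to the observed ratio."""
    rng = random.Random(seed)
    return [[1 if rng.random() < accepted_ratio else 0 for _ in range(test_size)]
            for _ in range(runs)]

def zero_predict(test_size):
    """The Zero model always predicts 'accepted'."""
    return [1] * test_size

# e.g., for Eclipse Platform with 81% accepted patches:
# runs = dummy_predict(len(y_test), accepted_ratio=0.81)
```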

Table 8.8: Performance of Support Vector Machine (SVM).

                        Bug+Patch+Human       Patch+Human           Bug+Patch             Bug+Human
                        Max   Avg   Min       Max   Avg   Min       Max   Avg   Min       Max   Avg   Min
Accuracy
  Mylyn                 0.95  0.92  0.77      0.95  0.91  0.78      0.84  0.75  0.44      0.95  0.93  0.82
  Eclipse Platform      0.98  0.93  0.85      0.98  0.93  0.85      0.89  0.80  0.72      0.98  0.94  0.85
  Android Platform      -     -     -         0.79  0.68  0.56      0.64  0.47  0.28      0.79  0.68  0.54
Precision_ap
  Mylyn                 0.95  0.92  0.83      0.95  0.91  0.81      0.85  0.78  0.52      0.95  0.92  0.85
  Eclipse Platform      0.97  0.93  0.83      0.97  0.93  0.83      0.89  0.80  0.72      0.97  0.94  0.83
  Android Platform      -     -     -         0.90  0.67  0.50      0.86  0.52  0         0.88  0.68  0.50
Recall_ap
  Mylyn                 1     0.96  0.75      1     0.98  0.81      1     0.90  0.52      1     0.98  0.85
  Eclipse Platform      1     0.99  0.97      1     0.99  0.97      1     1     1         1     0.99  0.98
  Android Platform      -     -     -         0.87  0.76  0.61      0.48  0.11  0         0.89  0.77  0.61
Precision_uap
  Mylyn                 0.82  0.72  0.57      0.81  0.65  0.43      0.39  0.11  0         0.81  0.72  0.57
  Eclipse Platform      0.82  0.72  0.47      0.81  0.71  0.47      0     0     0         0.83  0.73  0.47
  Android Platform      -     -     -         0.72  0.61  0.35      0.99  0.91  0.56      0.81  0.62  0.29
Recall_uap
  Mylyn                 1     0.91  0.70      1     0.95  0.73      0.39  0.17  0.29      1     0.96  0.79
  Eclipse Platform      1     0.94  0.86      1     0.96  0.88      0     0     0         1     0.96  0.87
  Android Platform      -     -     -         0.92  0.68  0.49      0.71  0.48  0.24      0.93  0.69  0.50

Table 8.9: Performance of the Dummy Predictor and the Zero model.

                        Dummy Predictor          Zero Model
                        Max   Min   Avg          Max   Min   Avg
Accuracy
  Mylyn                 0.46  0.38  0.42         0.85  0.59  0.78
  Eclipse Platform      0.58  0.50  0.54         0.89  0.72  0.81
  Android Platform      0.52  0.43  0.47         0.77  0.30  0.55
Recall
  Mylyn                 0.82  0.77  0.78         1     1     1
  Eclipse Platform      0.81  0.79  0.80         1     1     1
  Android Platform      0.56  0.49  0.50         1     1     1
Precision
  Mylyn                 0.42  0.28  0.36         0.85  0.59  0.78
  Eclipse Platform      0.62  0.51  0.57         0.89  0.72  0.81
  Android Platform      0.70  0.30  0.50         0.77  0.30  0.55
Specificity
  Mylyn                 0.23  0.21  0.22         0     0     0
  Eclipse Platform      0.21  0.18  0.20         0     0     0
  Android Platform      0.44  0.42  0.44         0     0     0
NPV
  Mylyn                 0.75  0.57  0.65         0     0     0
  Eclipse Platform      0.50  0.35  0.43         0     0     0
  Android Platform      0.70  0.22  0.43         0     0     0

Comparing the results of LR and SVM from Tables 8.7 and 8.8, the average results achieved by SVM are better than those of LR. Among the four different groups of SVM results, the results of the Bug+Patch+Human, Patch+Human, and Bug+Human groups are very similar. These three groups achieved average accuracy values of 92%, 93%, and 68% for Mylyn, Eclipse Platform, and Android Platform respectively. The results of these three groups represent average accuracy gains of 119%, 72%, and 44% over the dummy predictor. Similarly, the gains of the SVM predictor over the Zero model are 18%, 15%, and 24% for Mylyn, Eclipse Platform, and Android Platform respectively.

Overall, the predictor model based on the Patch+Human group run with SVM is a good predictor for all three systems because (1) without the cost of mining and extracting the bug features from macro repositories, we achieve the same results as the predictor based on information from bug repositories, and (2) it is applicable to systems such as Android Platform, for which the relevant bug ID of a patch is not traceable in the code review repository.

8.7 Threats to Validity

Different Review Processes and Tools: We only considered Gerrit's review process. We did not consider other code review tools and processes, such as email-based review or pull requests in Git or GitHub.

Developer and Reviewer Feedback: We did not ask the developers and reviewers of these three systems to provide comments and feedback on our findings.

Logistic Regression Model: In our study, a patch is classified as accepted when the logistic regression model gives a probability greater than 0.6. Different thresholds can produce different results.

Android Platform's Bug Information: We could not provide the complete statistics and results for Android Platform because of the lack of traceability between the code review and bug repositories.

Only Considering the p-values from the Logistic Regression Model: To explore which features have a statistically significant impact on the outcome of code review, we only considered the p-values from the logistic regression model. Different statistical tests, such as the Kruskal-Wallis analysis of variance [61], could be used instead.

CHAPTER 9

Conclusions and Future Work

In the conducted research, we investigated and presented automated solutions for several maintenance tasks, namely Change Impact Analysis, Developer Recommendation, Code Reviewer Recommendation, and Patch Acceptance Prediction. In this chapter, we outline the contributions of this dissertation and suggest opportunities for future research.

9.1 Contribution and Findings

The goal of the conducted research is to provide automated solutions for several software maintenance tasks to assist developers and reviewers during the maintenance phase. The solutions resulting from previous studies are sub-optimal, e.g., in terms of accuracy and coverage, thereby limiting or questioning their practical ubiquity in the software development workflow. We conjecture that the past solutions are severely hampered by their principal reliance on a limited source of analysis. In order to improve the accuracy and coverage of the solutions, we investigated different sources of information from developers' and reviewers' maintenance activities, which capture the means, i.e., micro events, associated with achieving the maintenance and evolution tasks. Our formulated methodology consists of the restrained use of machine learning techniques, lightweight source code analysis, and mathematical quantification of different markers of developer and reviewer expertise from these micro events. We ran several empirical studies on large open source and commercial systems to assess the effectiveness of our presented solutions. Below, we reiterate the empirical observations of this dissertation:

9.1.1 InComIA, an Approach for Change Impact Analysis

We presented an approach based on a combination of interaction (e.g., Mylyn) and commit (e.g., CVS) histories to perform impact analysis (IA) of an incoming change request on source code. The approach requires only the entities that were interacted with and/or committed in the past, which differs from previous solutions that require indexing of a complete snapshot (e.g., a release) or past commits alone. We conducted an empirical study on the open source system Mylyn and empirically compared our approach to those using commits and the entire source code from a single snapshot. The results show that the presented approach outperformed the baseline competitors: lowest recall gains of 28% and 44%, highest recall gains of 225% and 350%, lowest precision gains of 28% and 33%, and highest precision gains of 250% and 350% were recorded.

9.1.2 RevIA, an Approach for Change Impact Analysis

We presented an approach that uses the code review comments provided by reviewers in the code review phase as additional textual information to form the corpus. Such an approach, which combines code review comments with the textual information from source code entities for IA at the change request level, had not been investigated previously. The presented approach differs from previous solutions that require indexing of a complete snapshot (e.g., a release) or past commits alone. We conducted an empirical study on the open source system Mylyn and empirically compared our approach to the approach that uses only the textual information from the source code entities. The results show that the presented approach outperformed the baseline competitor in terms of precision, recall, and mean average precision, i.e., adding review comments to the corpus improves the performance. Lowest recall, precision, and mean average precision gains of 67%, 100%, and 50% and highest recall, precision, and mean average precision gains of 126%, 166%, and 260% were recorded.

9.1.3 iHDev, an Approach for Developer Recommendation

We presented the iHDev approach to recommend the developers who are most likely to implement incoming change requests. iHDev utilizes the archives of developers' interactions within an integrated development environment (e.g., software entities viewed or modified) associated with past change requests. Such use of interaction histories in conjunction with machine learning techniques to recommend developers had not been investigated previously. Furthermore, a comparative study with two previous approaches that use commit histories and/or source code authorship information for developer recommendation was performed. Results show that iHDev could provide a statistically significant recall gain of up to 127.27% with equivalent or improved MRR values, by up to 112.5%.

9.1.4 rDevX, the Developer Expertise Model

We presented a developer-expertise model based on lightweight markers of developer expertise from the past code reviews of a software system. We also showed an application of this model, rDevX, in automatically recommending the developers who should resolve an incoming software change request (e.g., a bug report or feature request). Our empirical investigation on three open source projects, Eclipse Platform, Mylyn, and OpenStack Nova, shows that rDevX performs better than two other approaches based on historical commits and their combination. rDevX registers recall gains of as much as 75% over the other approach. These gains came without incurring any decrease in Mean Reciprocal Rank (MRR) values; rather, rDevX recorded improvements in them. That is, rDevX would typically recommend the correct developers at higher ranks than the competitor.

We found evidence to attribute the gains of rDevX to the exclusive (human roles and discourse, and alternative attempts in code changes) and subsuming (authors and accepted changes, i.e., commits) information captured in code review archives.

9.1.5 cHRev, the Code Reviewer Recommendation Model

We presented an approach, cHRev, to automatically recommend peer reviewers in modern code review. An empirical study on one commercial and three open source systems showed that our approach provides improved precision and recall over the state-of-the-art competitors. Our results show the added value of analyzing specific information available from previously completed reviews (i.e., quantification of review comments and their recency) for peer reviewer recommendation. We also observed that a developer recommendation approach based on past source code commits was inadequate in effectively supporting this task. However, our experience from this investigation shows that the general principles of frequency, workdays, and recency from such a developer recommendation approach are transferable. That is, these measures, when computed on past code reviews instead of commits, are very effective for the task of reviewer recommendation.

9.1.6 Patch Acceptance Predictor

We examined bug, patch, developer, and reviewer features that characterize the patches that get accepted or not. We used logistic regression to form a descriptive model from these features. Moreover, we presented a predictive model that classifies whether a patch will be accepted or not as soon as it is submitted. Logistic regression and support vector machine were employed to form the patch predictors.

To validate our approach, we conducted an empirical study on three open source projects: Android Platform, Eclipse Platform, and Mylyn. These systems require a peer review of submitted patches, which is managed by Gerrit. Our results showed that features such as owner bug assignee match (i.e., the developer who is assigned to resolve the issue in the bug tracking system is the same as the one who implemented the code change and submitted it for review), the reputation of developers in terms of their history of successful patch acceptance, the number of revisions made to the patch, and the number of reviewers greatly influence the likelihood of a patch getting accepted. Our predictor model achieved average accuracy values of 68%, 93%, and 92% on Android Platform, Eclipse Platform, and Mylyn respectively. Furthermore, we noted that the support vector machine performs better than logistic regression for patch acceptance prediction.

9.2 Opportunities for Future Research

In the conducted research, we presented several automated solutions for different software maintenance tasks by utilizing software system repositories. Our empirical studies show that these solutions outperform the state of the art and make a major contribution towards helping developers and improving the software maintenance process. However, there are several opportunities for future work, which are listed below.

9.2.1 Combining Static and Dynamic Analysis with Textual Information

In Chapters 3 and 4 we presented two different approaches for IA based on a combination of textual information from different software repositories. However, previous studies have demonstrated the benefits of using static and dynamic analysis for the IA task [41]. Thus, future work should include this additional information in an approach for the IA task. Similarly, future work can investigate an approach based on a combination of both interaction and code review histories.

9.2.2 A Developer Recommendation Approach Considering Workload Balancing

In Chapters 5 and 6 we presented two different approaches for DR. The presented approaches recommend a ranked list of expert developers, based on their expertise scores, to resolve the change request. Developers with higher expertise and broad knowledge of the software project are more likely to be selected as suitable developers to fix the bug. This may cause such developers to feel overwhelmed by the large number of assigned tasks. Thus, future work should consider the developer's workload in the recommendation.

9.2.3 Considering the Contribution of Developers of Similar Change Requests

The approaches presented in Chapters 5 and 6 consider developers' activities on the source code files relevant to the change request. Future work should also consider the contributions of developers who were involved in the resolution of similar change requests in the bug repository.

9.2.4 Developer and Reviewers Expertise Markers from Code Review Contents

The approaches presented in Chapters 6 and 7 use the code review activities of developers to build the developer and reviewer expertise models. Previous studies have suggested that code review comments are indicators of developer expertise [82]. Future work can perform textual analysis and modeling of review comments to obtain additional expertise markers from the review content. These markers can then be used in future developer expertise models.

9.2.5 Revisiting Code Reviewer Recommendation

The approach presented in Chapter 7 recommends peer reviewers in modern code review for a specific piece of source code under review. The assignment of a reviewer to a specific task is based on his/her previous review activity. This assignment can be improved by providing more detailed information about the code change itself, for example, the specific part of the code change under review that is more complex and needs more attention during the review process. Future work can provide this information in addition to recommending the reviewers for a review task.

9.2.6 Additional Empirical Studies Considering Different MCR Tools

The code review data we used in our studies in Chapters 6, 7, and 8 was extracted from Gerrit and CodeFlow. Although we used several open source systems and one industrial project in our empirical studies, our observations may not generalize to all software systems and MCR tools. Future work should revisit these code review studies on other systems that use different MCR tools, such as pull requests in GitHub, and perform additional empirical studies based on them.

9.2.7 Revisiting the Patch Acceptance Predictor

We presented a patch predictor classifier in Chapter 8 that determines whether a submitted patch will be accepted or not. This predictor considers several features, grouped into bug, patch, developer, and reviewer features. Although the patch predictor classifies a patch as accepted or not with high accuracy, it does not provide suggestions to developers on how to increase the acceptance chance of their submitted patches. Future work can provide such suggestions to developers.


LIST OF REFERENCES

[1] Gerrit. https://code.google.com/p/gerrit/. Accessed: 2014-12-08.

[2] Google mondrian: web-based code review and storage. http://www.niallkennedy.com/blog/2006/11/google-mondrian.html. Accessed: 2014-12-08.

[3] Mylyn reviews. https://projects.eclipse.org/projects/mylyn.reviews. Accessed: 2014-12-08.

[4] Phabricator. http://phabricator.org/. Accessed: 2014-12-08.

[5] A. Frank Ackerman, Lynne S. Buchwald, and Frank H. Lewski. Software inspections: An effective verification process. IEEE Softw., 6(3):31–36, May 1989.

[6] A. Frank Ackerman, Priscilla J. Fowler, and Robert G. Ebenau. Software inspections and the industrial production of software. In Proc. of a Symposium on Software Validation: Inspection-testing-verification-alternatives, pages 13–40, New York, NY, USA, 1984. Elsevier North-Holland, Inc.

[7] J. Anvik and G.C. Murphy. Determining implementation expertise from bug reports. In Proceedings of the Fourth International Workshop on Mining Software Repositories (MSR ’07), pages 2–2, 2007.

[8] John Anvik. Automating bug report assignment. In Proceedings of the 28th International Conference on Software Engineering, ICSE ’06, pages 937–940, New York, NY, USA, 2006. ACM.

[9] John Anvik, Lyndon Hiew, and Gail C. Murphy. Who should fix this bug? In Proceedings of the 28th ACM International Conference on Software Engineering, ICSE ’06, pages 361–370, 2006.

[10] John Anvik and Gail C. Murphy. Reducing the effort of bug report triage: Recommenders for development-oriented decisions. ACM Trans. Softw. Eng. Methodol., 20(3):10:1–10:35, August 2011.

[11] Robert S. Arnold. Software Change Impact Analysis. IEEE Computer Society Press, Los Alamitos, CA, USA, 1996.

[12] Robert S. Arnold and Shawn A. Bohner. Impact analysis - towards a framework for comparison. In Proceedings of the Conference on Software Maintenance, ICSM ’93, pages 292–301, Washington, DC, USA, 1993. IEEE Computer Society.


[13] Alberto Bacchelli and Christian Bird. Expectations, outcomes, and challenges of modern code review. In Proceedings of the 35th International Conference on Software Engineering, 2013.

[14] Vipin Balachandran. Reducing human effort and improving quality in peer code reviews using automatic static analysis and reviewer recommendation. In Proceedings of the 2013 International Conference on Software Engineering, pages 931–940. IEEE Press, 2013.

[15] F. Bantelay, M.B. Zanjani, and H. Kagdi. Comparing and combining evolutionary couplings from interactions and commits. In Proceedings of the 20th Working Conference on Reverse Engineering (WCRE), pages 311–320, Oct 2013.

[16] O. Baysal, M.W. Godfrey, and R. Cohen. A bug you like: A framework for automated assignment of bugs. In Program Comprehension, 2009. ICPC ’09. IEEE 17th International Conference on, pages 297–298, May 2009.

[17] O. Baysal, O. Kononenko, R. Holmes, and M.W. Godfrey. The influence of non-technical factors on code review. In Reverse Engineering (WCRE), 2013 20th Working Conference on, pages 122–131, Oct 2013.

[18] Olga Baysal, Oleksii Kononenko, Reid Holmes, and Michael W. Godfrey. The secret life of patches: A firefox case study. In Proceedings of the 2012 19th Working Conference on Reverse Engineering, WCRE ’12, pages 447–455, Washington, DC, USA, 2012. IEEE Computer Society.

[19] Olga Baysal, Oleksii Kononenko, Reid Holmes, and Michael W. Godfrey. Investigating technical and non-technical factors influencing modern code review. Empirical Software Engineering, 21(3):932–959, 2016.

[20] Andrew Begel, Yit Phang Khoo, and Thomas Zimmermann. Codebook: Discovering and exploiting relationships in software repositories. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, ICSE ’10, pages 125–134, 2010.

[21] Moritz Beller, Alberto Bacchelli, Andy Zaidman, and Elmar Juergens. Modern code reviews in open-source projects: Which problems do they fix? In Proceedings of the 11th Working Conference on Mining Software Repositories, MSR 2014, pages 202–211, New York, NY, USA, 2014. ACM.

[22] M. Bernhart, A. Mauczka, and T. Grechenig. Adopting code reviews for agile software development. In Agile Conference (AGILE), 2010, pages 44–47, Aug 2010.

[23] Nicolas Bettenburg, R. Premraj, T. Zimmermann, and Sunghun Kim. Duplicate bug reports considered harmful ..... really? In Software Maintenance, 2008. ICSM 2008. IEEE International Conference on, pages 337–345, Sept 2008.


[24] Christian Bird, Trevor Carnahan, and Michaela Greiler. Lessons learned from deploying a code review analytics platform. Technical Report MSR-TR-2015-22 (Under submission to MSR 2015), Microsoft Research, http://research.microsoft.com/apps/pubs/default.aspx?id=241497, February 2015.

[25] Christian Bird, Alex Gourley, Prem Devanbu, Michael Gertz, and Anand Swaminathan. Mining email social networks. In Proceedings of the 2006 International Workshop on Mining Software Repositories, MSR ’06, pages 137–143, 2006.

[26] Christian Bird, Nachiappan Nagappan, Brendan Murphy, Harald Gall, and Premkumar Devanbu. Don’t Touch My Code! Examining the Effects of Ownership on Software Quality. In Proceedings of the eighth joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering. ACM, 2011.

[27] Gerald Bortis and André van der Hoek. Porchlight: A tag-based approach to bug triaging. In Proceedings of the 2013 International Conference on Software Engineering, ICSE ’13, pages 342–351. IEEE Press, 2013.

[28] Roger B. Bradford. An empirical study of required dimensionality for large-scale latent semantic indexing applications. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM ’08, pages 153–162, New York, NY, USA, 2008. ACM.

[29] L.C. Briand, J. Wust, and H. Lounis. Using coupling measurement for impact analysis in object-oriented systems. In Proceedings of the IEEE International Conference on Software Maintenance, 1999. (ICSM ’99), pages 475–482, 1999.

[30] Gerardo Canfora and Luigi Cerulo. Fine grained indexing of software repositories to support impact analysis. In Proceedings of the 2006 International Workshop on Mining Software Repositories, MSR ’06, pages 105–111, New York, NY, USA, 2006. ACM.

[31] Li-Te Cheng, Cleidson R.B. de Souza, Susanne Hupfer, John Patterson, and Steven Ross. Building collaboration into IDEs. Queue, 1(9):40–50, December 2003.

[32] J. Cohen. Best Kept Secrets of Peer Code Review. Smart Bear Inc, Austin, TX, USA, 2006.

[33] Corinna Cortes and Vladimir Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.

[34] D. R. Cox. The regression analysis of binary sequences. Journal of the Royal Statistical Society. Series B (Methodological), 20(2):215–242, 1958.


[35] Davor Cubranic. Automatic bug triage using text categorization. In SEKE 2004: Proceedings of the Sixteenth International Conference on Software Engineering & Knowledge Engineering, pages 92–97. KSI Press, 2004.

[36] Jacek Czerwonka, Nachiappan Nagappan, Wolfram Schulte, and Brendan Murphy. CODEMINE: Building a software development data analytics platform at Microsoft. IEEE Software, 30(4):64–71, 2013.

[37] R. DeLine, Mary Czerwinski, and G. Robertson. Easing program comprehension by sharing navigation data. In Proceedings of the IEEE Symposium on Visual Languages and Human-Centric Computing, pages 241–248, Sept 2005.

[38] M. E. Fagan. Design and code inspections to reduce errors in program development. IBM Systems Journal, 38(2.3):258–287, 1999.

[39] Thomas Fritz, Jingwen Ou, Gail C. Murphy, and Emerson Murphy-Hill. A degree-of-knowledge model to capture source code familiarity. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, ICSE ’10, pages 385–394, New York, NY, USA, 2010. ACM.

[40] D.M. German. An empirical study of fine-grained software modifications. In Proceedings of the 20th IEEE International Conference on Software Maintenance, pages 316–325, 2004.

[41] Malcom Gethers, Bogdan Dit, Huzefa Kagdi, and Denys Poshyvanyk. Integrated impact analysis for managing software changes. In Proceedings of the 2012 International Conference on Software Engineering, ICSE 2012, pages 430–440, Piscataway, NJ, USA, 2012. IEEE Press.

[42] Jesús M. González-Barahona, Daniel Izquierdo-Cortazar, Gregorio Robles, and Alvaro del Castillo. Analyzing Gerrit code review parameters with Bicho. ECEASST, 2014.

[43] Georgios Gousios, Martin Pinzger, and Arie van Deursen. An exploratory study of the pull- based software development model. In ICSE, pages 345–355, 2014.

[44] V.J. Hellendoorn, P.T. Devanbu, and A. Bacchelli. Will they like this? Evaluating code contributions with language models. In Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on, pages 157–167, May 2015.

[45] Israel Herraiz, Daniel M. German, Jesus M. Gonzalez-Barahona, and Gregorio Robles. Towards a simplification of the bug report form in Eclipse. In Proceedings of the 2008 International Working Conference on Mining Software Repositories, MSR ’08, pages 145–148, New York, NY, USA, 2008. ACM.


[46] K. Hossen, H. Kagdi, and D. Poshyvanyk. Amalgamating source code authors, maintainers, and change proneness to triage change requests. In Proceedings of the 22nd IEEE International Conference on Program Comprehension, ICPC, 2014.

[47] Gaeul Jeong, Sunghun Kim, and Thomas Zimmermann. Improving bug triage with bug tossing graphs. In Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, ESEC/FSE ’09, pages 111–120, New York, NY, USA, 2009. ACM.

[48] Gaeul Jeong, Sunghun Kim, Thomas Zimmermann, and Kwangkeun Yi. Improving code review by predicting reviewers and acceptance of patches. Technical Report ROSAEC MEMO 2009-006, Research on Software Analysis for Error-free Computing Center, September 2009.

[49] Yujuan Jiang, B. Adams, and D.M. German. Will my patch make it? and how fast? case study on the linux kernel. In Mining Software Repositories (MSR), 2013 10th IEEE Working Conference on, pages 101–110, May 2013.

[50] Yujuan Jiang, B. Adams, and D.M. German. Will my patch make it? and how fast? case study on the linux kernel. In Mining Software Repositories (MSR), 2013 10th IEEE Working Conference on, pages 101–110, May 2013.

[51] H. Kagdi, M. Gethers, D. Poshyvanyk, and M.L. Collard. Blending conceptual and evolutionary couplings to support change impact analysis in source code. In Proceedings of the 17th Working Conference on Reverse Engineering (WCRE), pages 119–128, Oct 2010.

[52] H. Kagdi, M. Hammad, and J.I. Maletic. Who can help me with this source code change? In Software Maintenance, 2008. ICSM 2008. IEEE International Conference on, pages 157–166, Sept 2008.

[53] Huzefa Kagdi, Malcom Gethers, Denys Poshyvanyk, and Maen Hammad. Assigning change requests to software developers. Journal of Software: Evolution and Process, 24(1):3–33, 2012.

[54] Huzefa Kagdi, Malcom Gethers, Denys Poshyvanyk, and Maen Hammad. Assigning change requests to software developers. Journal of Software: Evolution and Process, 24(1):3–33, 2012.

[55] C.F. Kemerer and M.C. Paulk. The impact of design and code reviews on software quality: An empirical study based on PSP data. Software Engineering, IEEE Transactions on, 35(4):534–550, July 2009.


[56] Miryung Kim and David Notkin. Discovering and representing systematic code changes. In Proceedings of the 31st International Conference on Software Engineering, ICSE ’09, pages 309–319, Washington, DC, USA, 2009. IEEE Computer Society.

[57] S. Kim, E. J. Whitehead Jr., and Y. Zhang. Classifying software changes: Clean or buggy? IEEE Transactions on Software Engineering, 34(2):181–196, March 2008.

[58] Sunghun Kim, E.J. Whitehead, and Yi Zhang. Classifying software changes: Clean or buggy? IEEE Transactions on Software Engineering, 34(2):181–196, March 2008.

[59] T. Kobayashi, N. Kato, and K. Agusa. Interaction histories mining for software change guide. In Proceedings of the Third International Workshop on Recommendation Systems for Software Engineering (RSSE), pages 73–77, June 2012.

[60] Sami Kollanus and Jussi Koskinen. Survey of software inspection research. The Open Software Engineering Journal, 3:15–34, 2009.

[61] W. H. Kruskal and W. A. Wallis. Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47(260):583–621, 1952.

[62] J. Law and G. Rothermel. Whole program path-based dynamic impact analysis. In Proceedings of the 25th International Conference on Software Engineering, pages 308–318, May 2003.

[63] Michelle Lee, A. Jefferson Offutt, and Roger T. Alexander. Algorithmic analysis of the impacts of changes to object-oriented software. In Proceedings of the Technology of Object-Oriented Languages and Systems (TOOLS 34’00), TOOLS ’00, pages 61–, Washington, DC, USA, 2000. IEEE Computer Society.

[64] Taek Lee, Jaechang Nam, DongGyun Han, Sunghun Kim, and Hoh Peter In. Micro interaction metrics for defect prediction. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, ESEC/FSE ’11, pages 311–321, New York, NY, USA, 2011. ACM.

[65] M. Linares-Vasquez, K. Hossen, Hoang Dang, H. Kagdi, M. Gethers, and D. Poshyvanyk. Triaging incoming change requests: Bug or commit history, or code authorship? In Proceedings of the 28th IEEE International Conference on Software Maintenance (ICSM), pages 451–460, 2012.

[66] M. Linares-Vasquez, K. Hossen, Hoang Dang, H. Kagdi, M. Gethers, and D. Poshyvanyk. Triaging incoming change requests: Bug or commit history, or code authorship? In Proceedings of the 28th IEEE International Conference on Software Maintenance (ICSM), pages 451–460, Sept 2012.


[67] D. Ma, D. Schuler, T. Zimmermann, and J. Sillito. Expert recommendation with usage expertise. In Proceedings of the IEEE International Conference on Software Maintenance (ICSM 2009), pages 535–538, 2009.

[68] D. Matter, A. Kuhn, and O. Nierstrasz. Assigning bug reports using a vocabulary-based expertise model of developers. In Proceedings of the 6th IEEE International Working Conference on Mining Software Repositories (MSR ’09), pages 131–140, 2009.

[69] Shane McIntosh, Yasutaka Kamei, Bram Adams, and Ahmed E. Hassan. The impact of code review coverage and code review participation on software quality: A case study of the qt, vtk, and itk projects. In Proceedings of the 11th Working Conference on Mining Software Repositories, MSR 2014, pages 192–201, New York, NY, USA, 2014. ACM.

[70] Luca Milanesio. Learning Gerrit Code Review. Packt Publishing Ltd, 2013.

[71] Audris Mockus and James D. Herbsleb. Expertise browser: A quantitative approach to identifying expertise. In Proceedings of the 24th International Conference on Software Engineering, ICSE ’02, pages 503–512, New York, NY, USA, 2002. ACM.

[72] Rodrigo Morales, Shane McIntosh, and Foutse Khomh. Do code review practices impact design quality? a case study of the qt, vtk, and itk projects. In Proc. of the 22nd Int’l Conf. on Software Analysis, Evolution, and Reengineering (SANER), pages 171–180, 2015.

[73] Murtuza Mukadam, Christian Bird, and Peter C. Rigby. Gerrit software code review data from android. In Proceedings of the International Working Conference on Mining Software Repositories (Data Track). IEEE, 2013.

[74] G.C. Murphy, M. Kersten, and L. Findlater. How are Java software developers using the Eclipse IDE? IEEE Software, 23(4):76–83, July 2006.

[75] Alessandro Orso, Taweesup Apiwattanapong, James Law, Gregg Rothermel, and Mary Jean Harrold. An empirical comparison of dynamic impact analysis algorithms. In Proceedings of the 26th International Conference on Software Engineering, ICSE ’04, pages 491–500, Washington, DC, USA, 2004. IEEE Computer Society.

[76] C. Parnin and C. Gorg. Building usage contexts during program comprehension. In Procee- dings of the 14th IEEE International Conference on Program Comprehension, ICPC, pages 13–22, 2006.

[77] Adam Porter, Harvey Siy, Audris Mockus, and Lawrence Votta. Understanding the sources of variation in software inspections. ACM Trans. Softw. Eng. Methodol., 7(1):41–79, January 1998.


[78] Denys Poshyvanyk, Andrian Marcus, Rudolf Ferenc, and Tibor Gyimóthy. Using information retrieval based coupling measures for impact analysis. Empirical Softw. Engg., 14(1):5–32, February 2009.

[79] S. Rastkar and G.C. Murphy. On what basis to recommend: Changesets or interactions? In Proceedings of the 6th IEEE International Working Conference on Mining Software Repositories, MSR ’09, pages 155–158, May 2009.

[80] Xiaoxia Ren, B.G. Ryder, M. Stoerzer, and F. Tip. Chianti: a change impact analysis tool for Java programs. In Proceedings of the 27th International Conference on Software Engineering, ICSE, pages 664–665, May 2005.

[81] P.C. Rigby, B. Cleary, F. Painchaud, M. Storey, and D.M. German. Contemporary peer review in action: Lessons from open source development. Software, IEEE, 29(6):56–61, Nov 2012.

[82] Peter C. Rigby and Christian Bird. Convergent software peer review practices. In Proceedings of the joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE). ACM, 2013.

[83] Peter C. Rigby, Daniel M. German, and Margaret-Anne Storey. Open source software peer review practices: A case study of the Apache server. In Proceedings of the 30th International Conference on Software Engineering, ICSE ’08, pages 541–550, New York, NY, USA, 2008. ACM.

[84] Peter C. Rigby and Margaret-Anne Storey. Understanding broadcast based peer review on open source software projects. In Proceedings of the 33rd International Conference on Software Engineering, ICSE ’11, pages 541–550, New York, NY, USA, 2011. ACM.

[85] R. Robbes. Mining a change-based software repository. In Proceedings of the Fourth International Workshop on Mining Software Repositories (MSR ’07), pages 15–15, May 2007.

[86] R. Robbes, D. Pollet, and M. Lanza. Replaying IDE interactions to evaluate and improve change prediction approaches. In Mining Software Repositories (MSR), 2010 7th IEEE Working Conference on, pages 161–170, May 2010.

[87] Romain Robbes and David Röthlisberger. Using developer interaction data to compare expertise metrics. In Proceedings of the 10th Working Conference on Mining Software Repositories, MSR ’13, pages 297–300, Piscataway, NJ, USA, 2013. IEEE Press.


[88] D. Rothlisberger, O. Nierstrasz, S. Ducasse, D. Pollet, and R. Robbes. Supporting task-oriented navigation in IDEs with configurable heatmaps. In Proceedings of the IEEE 17th International Conference on Program Comprehension, ICPC ’09, pages 253–257, May 2009.

[89] Per Runeson and Martin Höst. Guidelines for conducting and reporting case study research in software engineering. Empirical Software Engineering, 14(2):131–164, 2009.

[90] Kevin Schneider, Carl Gutwin, Reagan Penner, and David Paquette. Mining a software developer’s local interaction history. In Proceedings of the IEEE International Conference on Software Engineering Workshop on Mining Software Repositories, 2004.

[91] M. Shafiei, Singer Wang, R. Zhang, E. Milios, Bin Tang, J. Tougas, and R. Spiteri. Document representation and dimension reduction for text clustering. In Proceedings of the IEEE 23rd International Conference on Data Engineering Workshop, pages 770–779, 2007.

[92] Ramin Shokripour, John Anvik, Zarinah Mohd Kasirun, and Sima Zamani. Why so complicated? Simple term filtering and weighting for location-based bug report assignment recommendation. In Proceedings of the 10th IEEE Working Conference on Mining Software Repositories (MSR), pages 2–11, 2013.

[93] J. Singer, R. Elves, and M. Storey. Navtracks: supporting navigation in software maintenance. In Proceedings of the 21st IEEE International Conference on Software Maintenance, ICSM ’05, pages 325–334, Sept 2005.

[94] Ahmed Tamrawi, Tung Thanh Nguyen, Jafar M. Al-Kofahi, and Tien N. Nguyen. Fuzzy set and cache-based approach for bug triaging. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, ESEC/FSE ’11, pages 365–375, 2011.

[95] Y. Tao, D. Han, and S. Kim. Writing acceptable patches: An empirical study of open source project patches. In Software Maintenance and Evolution (ICSME), 2014 IEEE International Conference on, pages 271–280, Sept 2014.

[96] Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Raula Gaikovina Kula, Norihiro Yoshida, Hajimu Iida, and Ken-ichi Matsumoto. Who should review my code? A file location-based code-reviewer recommendation approach for modern code review. In Proceedings of the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER), pages 141–150, 2015.

[97] F. Thung, Shaowei Wang, D. Lo, and Lingxiao Jiang. An empirical study of bugs in machine learning systems. In Proceedings of the IEEE 23rd International Symposium on Software Reliability Engineering (ISSRE), pages 271–280, Nov 2012.


[98] Y. Tian, M. Nagappan, D. Lo, and A.E. Hassan. Who should review this change?: Putting text and file location analyses together for more accurate recommendations. In Proceedings of the 31st IEEE International Conference on Software Maintenance and Evolution, ICSME 2015, September 2015.

[99] Yuan Tian, D. Lo, and Chengnian Sun. Information retrieval based nearest neighbor classification for fine-grained bug severity prediction. In Proceedings of the 19th Working Conference on Reverse Engineering (WCRE), pages 215–224, Oct 2012.

[100] Shaowei Wang and David Lo. Version history, similar report, and structure: Putting them together for improved bug localization. In Proceedings of the 22nd International Conference on Program Comprehension, ICPC 2014, pages 53–63, New York, NY, USA, 2014. ACM.

[101] Cathrin Weiss, Rahul Premraj, Thomas Zimmermann, and Andreas Zeller. How long will it take to fix this bug? In Proceedings of the Fourth International Workshop on Mining Software Repositories, MSR ’07, pages 1–, Washington, DC, USA, 2007. IEEE Computer Society.

[102] P. Weissgerber, M. Pohl, and M. Burch. Visual data mining in software archives to detect how developers work together. In Proceedings of the Fourth International Workshop on Mining Software Repositories (MSR ’07), pages 9–9, 2007.

[103] Peter Weissgerber, Daniel Neu, and Stephan Diehl. Small patches get in! In Proceedings of the 2008 International Working Conference on Mining Software Repositories, MSR ’08, pages 67–76, New York, NY, USA, 2008. ACM.

[104] Murray Wood, Marc Roper, Andrew Brooks, and James Miller. Comparing and combining software defect detection techniques: A replicated empirical study. SIGSOFT Softw. Eng. Notes, 22(6):262–277, November 1997.

[105] Wenjin Wu, Wen Zhang, Ye Yang, and Qing Wang. Drex: Developer recommendation with k-nearest-neighbor search and expertise ranking. In Proceedings of the 18th Asia Pacific Software Engineering Conference (APSEC), pages 389–396, Dec 2011.

[106] Xin Xia, D. Lo, Xinyu Wang, and Bo Zhou. Accurate developer recommendation for bug resolution. In Proceedings of the 20th Working Conference on Reverse Engineering (WCRE), pages 72–81, Oct 2013.

[107] Xin Xia, David Lo, Xinyu Wang, and Bo Zhou. Accurate developer recommendation for bug resolution. In Proceedings of the 20th Working Conference on Reverse Engineering (WCRE), pages 72–81, 2013.


[108] Jifeng Xuan, He Jiang, Zhilei Ren, and Weiqin Zou. Developer prioritization in bug repositories. In Proceedings of the 34th International Conference on Software Engineering, ICSE ’12, pages 25–35, Piscataway, NJ, USA, 2012. IEEE Press.

[109] Cong Yu, Laks Lakshmanan, and Sihem Amer-Yahia. It takes variety to make a world: Diversification in recommender systems. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, EDBT ’09, pages 368–378, New York, NY, USA, 2009. ACM.

[110] M. Bahrami Zanjani, H. Kagdi, and C. Bird. Automatically recommending peer reviewers in modern code review. IEEE Transactions on Software Engineering, PP(99):1–1, 2015.

[111] M.B. Zanjani, H. Kagdi, and C. Bird. Using developer-interaction trails to triage change requests. In Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on, pages 88–98, May 2015.

[112] M.B. Zanjani, G. Swartzendruber, and H. Kagdi. Impact analysis of change requests on source code based on interaction and commit histories. In Proceedings of the 11th Working Conference on Mining Software Repositories, MSR ’14, 2014.

[113] Min-Ling Zhang and Zhi-Hua Zhou. Ml-knn: A lazy learning approach to multi-label learning. Pattern Recogn., 40(7):2038–2048, July 2007.

[114] Tianyi Zhang, Myoungkyu Song, and Miryung Kim. Critics: An interactive code review tool for searching and inspecting systematic changes. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2014, pages 755–758, New York, NY, USA, 2014. ACM.

[115] Lijie Zou, M.W. Godfrey, and A.E. Hassan. Detecting interaction coupling from task interaction histories. In Proceedings of the 15th IEEE International Conference on Program Comprehension, ICPC ’07, pages 135–144, June 2007.
