
W&M ScholarWorks

Dissertations, Theses, and Masters Projects

2010

Supporting feature-level maintenance

Meghan Kathleen Revelle College of William & Mary - Arts & Sciences


Recommended Citation: Revelle, Meghan Kathleen, "Supporting feature-level maintenance" (2010). Dissertations, Theses, and Masters Projects. Paper 1539623567. https://dx.doi.org/doi:10.21220/s2-jjz0-7b75

Supporting Feature-Level Software Maintenance

Meghan Revelle

Columbia, Maryland

Master of Science, The College of William and Mary, 2005
Bachelor of Science, Mary Washington College, 2003

A Dissertation presented to the Graduate Faculty of the College of William and Mary in Candidacy for the Degree of Doctor of Philosophy

Department of Computer Science

The College of William and Mary
May 2010

APPROVAL PAGE

This Dissertation is submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Approved by the Committee, April 2010

Assistant Professor Denys Poshyvanyk, Computer Science
The College of William and Mary

Associate Professor Peter Kemper, Computer Science
The College of William and Mary

Professor Robert Noonan, Computer Science
The College of William and Mary

Assistant Professor Xipeng Shen
The College of William and Mary

ABSTRACT PAGE

Software maintenance is the process of modifying a software system to fix defects, improve performance, add new functionality, or adapt the system to a new environment. A maintenance task is often initiated by a bug report or a request for new functionality. Bug reports typically describe problems with incorrect behaviors or functionalities. These behaviors or functionalities are known as features. Even in very well-designed systems, the source code that implements features is often not completely modularized. The delocalized nature of features makes maintaining them challenging. Since maintenance tasks are expressed in terms of features, the goal of this dissertation is to support software maintenance at the feature-level. We focus on two tasks in particular: feature location and impact analysis via feature coupling.

Feature location is the process of identifying the source code that implements a feature, and it is an essential first step to any maintenance task. There are many existing techniques for feature location that incorporate various types of analyses such as static, dynamic, and textual. In this dissertation, we recognize the advantages of leveraging several types of analyses and introduce a new approach to feature location based on combining dynamic analysis, textual analysis, and web mining applied to software. The use of web mining for feature location is a novel contribution, and we show that our new techniques based on web mining are significantly more effective than the current state of the art.

After using feature location to identify a feature's source code, maintenance can be completed on that feature. Impact analysis should then be performed to revalidate the system and determine which other features may have been affected by the modifications. We define three feature coupling metrics that capture the relationship between features based on structural information, textual information, and their combination. Our novel feature coupling metrics can be used for impact analysis to quantify the strength of coupling between pairs of features. We performed three empirical studies on open-source software systems to assess the feature coupling metrics and established three major results. First, there is a moderate to strong statistically significant correlation between feature coupling and faults. Second, feature coupling can be used to correctly determine about half of the other features that would be affected by a change to a given feature. Finally, we found that the metrics align with developers' opinions about pairs of features that are actually coupled.

Table of Contents

Acknowledgments

List of Tables

List of Figures

1 Introduction
1.1 Research Goals and Contributions
1.2 Scope of this Dissertation
1.3 Dissertation Organization
1.4 Bibliographical Notes

2 Survey of Feature Location Research
2.1 Dimensions of the Survey
2.1.1 Type of Article
2.1.2 Type of Analysis
2.1.3 Sources of Information
2.1.4 Granularity
2.1.5 Programming Language Support
2.1.6 Presentation of the Results
2.1.7 Evaluation
2.1.8 Comparison to Other Feature Location Techniques
2.1.9 Systems used for Evaluation
2.2 Survey of Feature Location Techniques
2.2.1 Dynamic Feature Location
2.2.2 Static Feature Location
2.2.3 Textual Feature Location
2.2.4 Combined Dynamic and Static Feature Location
2.2.5 Combined Dynamic and Textual Feature Location
2.2.6 Combined Static and Textual Feature Location
2.2.7 Combined Dynamic, Static, and Textual Feature Location
2.2.8 Other Feature Location Techniques
2.3 Feature Location Tools and Studies
2.3.1 Tools
2.3.1.1 Tools for Dynamic Feature Location
2.3.1.2 Tools for Static Feature Location
2.3.1.3 Tools for Textual Feature Location
2.3.1.4 Tools for Documenting Features
2.3.1.5 Tools
2.3.1.6 Program Exploration Tools
2.3.2 Case Studies
2.3.2.1 Case Studies Comparing Feature Location Techniques
2.3.2.2 Industrial Case Studies
2.3.2.3 User Studies
2.4 Discussion and Open Issues
2.4.1 Comparisons
2.4.2 Benchmarks
2.4.3 Tools
2.4.4 User Studies
2.4.5 Feature Location and Education
2.5 Conclusion

3 An Exploratory Study on Assessing Feature Location Techniques
3.1 Feature Location Techniques
3.1.1 Core Techniques
3.1.2 Combined Techniques
3.2 Exploratory Study
3.2.1 Research Questions
3.2.2 Subject Systems
3.2.3 Input to the Analyses
3.2.4 Relevancy Assessment
3.3 Results
3.3.1 jEdit Study Findings
3.3.2 Eclipse Study Findings
3.3.3 Discussion
3.3.4 Threats to Validity
3.4 Related Work
3.5 Conclusion

4 Using Data Fusion and Web Mining to Support Feature Location
4.1 A Data Fusion Model for Feature Location
4.1.1 Textual Information from ...
4.1.2 Execution Information from Dynamic Analysis
4.1.3 Dependence Information from Web Mining
4.1.3.1 HITS
4.1.3.2 PageRank
4.1.4 Fusions
4.2 Experimental Evaluation
4.2.1 Systems and Benchmarks
4.2.2 Hypotheses
4.2.3 Data Collection and Analysis
4.3 Results and Discussion
4.3.1 Statistical Analysis
4.3.2 Impact of the Selection of a Threshold
4.3.3 Locating All of a Feature's Methods
4.3.4 Using a Static Call Graph
4.3.5 Discussion
4.3.6 Threats to Validity
4.4 Related Work
4.5 Conclusion

5 Feature Coupling
5.1 Related Work
5.1.1 Structural Coupling Measures
5.1.2 Other Static Coupling Measures
5.1.3 Dynamic Coupling Measures
5.1.4 Applications of Coupling Metrics
5.2 Analyzing Structured and Unstructured Information in Source Code
5.2.1 Structured Information
5.2.2 Unstructured Information
5.2.2.1 Build the Corpus
5.2.2.2 Index the Corpus
5.2.2.3 Compute Textual Similarities
5.3 Using Structural and Textual Information for Feature Coupling
5.3.1 System Representation
5.3.2 Structural Feature Coupling
5.3.3 Textual Feature Coupling
5.3.4 Hybrid Feature Coupling
5.3.5 Theoretical Evaluation
5.3.6 Classification within the Unified Framework for Coupling Measurement
5.3.7 Measurement Tool
5.3.8 An Example of Measuring Feature Coupling
5.4 Case Studies
5.4.1 Subject Systems and Data Sets
5.4.2 Case Study Settings
5.4.3 The Relationship Between Feature Coupling and Faults
5.4.3.1 Hybrid Feature Coupling
5.4.3.2 Comparison with an Existing Metric
5.4.3.3 The Confounding Effect of Size
5.4.4 Using Structural and Textual Coupling to Support Feature-Level Impact Analysis
5.4.4.1 Feature-Class Coupling
5.4.5 Developer Study
5.4.5.1 Programmers
5.4.5.2 Task Description
5.4.5.3 Agreement Among the Participants and with the Metrics
5.4.6 Threats to Validity
5.4.6.1 Internal Threats to Validity
5.4.6.2 External Threats to Validity
5.5 Conclusion

6 FLAT3: Feature Location and Textual Tracing Tool
6.1 FLAT3
6.1.1 Textual Feature Location
6.1.2 Dynamic Feature Location
6.1.3 Integrated Feature Location
6.1.4 Annotating Features
6.1.5 Visualization
6.2 Related Work
6.3 Conclusion

7 Conclusion

A Classification of Feature Location Articles
A.1 Dimensions of the Feature Location Taxonomy
A.2 Classification of Surveyed Feature Location Articles

B Exploratory Study Instructions
B.1 Overview
B.1.1 System
B.1.2 Feature
B.1.3 Running jEdit
B.2 Detailed Instructions

C Feature Coupling Study Instructions
C.1 Instructions

D List of Features
D.1 jEdit Features
D.2 Eclipse Features
D.3 dbViz Features
D.4 Rhino Features
D.5 iBatis Features

Bibliography

Vita

Dedicated to my husband.

ACKNOWLEDGMENTS

It would have been next to impossible to write this dissertation, or even get to the point where I could write it, without the help, guidance, and support of many individuals, both internal and external to the William and Mary community. The most important person I have to thank at W&M is my advisor, Denys Poshyvanyk. I credit him and his excellent research vision for getting me through these last two years of grad school. I also thank my committee for all their comments that helped shape and improve my work. Likewise, I am grateful to each of the members of the Semeru lab for all the contributions they made to this dissertation. I would also like to express my gratitude to the department's office staff for answering all those questions I had over the years. I received immeasurable amounts of encouragement and moral support from my family and friends. Thank you to my parents for inspiring me not to give up when times got tough. I am indebted to my friends for always being there to help me celebrate my achievements and move past my mistakes. Most of all, I thank my husband, Jeff. He provided the motivation and support I needed to finish. It is amazing to have a spouse who loves you so much, can answer LaTeX questions, and set up a CVS repository for you. Financial support for this work was provided by the Air Force Office of Scientific Research, the Virginia Space Grant Consortium, and the National Science Foundation.

List of Tables

2.1 Venues which have published the articles included in this survey.
3.1 Queries, scenarios, and seed methods for each feature.
3.2 Classification results for jEdit.
3.3 Classification results for Eclipse.
4.1 The feature location techniques evaluated.
4.2 Descriptive statistics of the execution traces.
4.3 Case-by-case comparison of the effectiveness of the feature location techniques.
4.4 The results of the Wilcoxon test.
5.1 State of the art in coupling measurement across two dimensions.
5.2 Mapping LSI concepts to source code.
5.3 Types of connection, a dimension of the unified framework for coupling measurement.
5.4 Mapping coupling measure to domain.
5.5 Textual similarities between methods of Rhino's ToString and ToObject features.
5.6 Descriptive statistics of the feature coupling metrics.
5.7 Spearman correlation coefficients for dbViz and Rhino.
5.8 Spearman correlation coefficients for HFC in dbViz and Rhino.
5.9 Precision and recall values for impact analysis of different metric thresholds in Rhino.
5.10 Feature coupling values for the dbViz, Rhino, and iBatis feature pairs in the developer survey.
A.1 Dimensions of the feature location taxonomy.
A.2 Feature location articles classified within the taxonomy.
B.1 An example of classifying methods.
C.1 The dbViz feature pairs.
C.2 The Rhino feature pairs.
C.3 The iBatis feature pairs.

List of Figures

2.1 Distribution of the surveyed articles across publication venues.
3.1 Percent agreement among the volunteers and our classifications.
4.1 An example of an execution trace translated into a call graph.
4.2 Combining textual analysis, dynamic analysis, and web mining for feature location.
4.3 The effectiveness measure for the feature location techniques based on standalone web mining.
4.4 The effectiveness measure for the feature location techniques that use web mining as a filter.
4.5 Different filtering thresholds in Eclipse.
4.6 Different filtering thresholds in Rhino.
4.7 Average position of all of a feature's gold set methods for the standalone web mining feature location techniques.
4.8 Average position of all of a feature's gold set methods for the techniques that use web mining as a filter.
4.9 The effectiveness measure when using a static call graph in Eclipse.
4.10 The effectiveness measure when using a static call graph in Rhino.
4.11 The effectiveness measure when filtering getter and setter methods.
4.12 Average position of all a feature's gold set methods when filtering getter and setter methods.
4.13 The effectiveness of the fan-in filtering heuristic.
4.14 Average position of all of a feature's gold set methods when using the fan-in filtering heuristic.
5.1 Architecture of the feature coupling component of FLAT3.
5.2 Data triangulation evaluation approach.
5.3 An example of how bugs shared by features were determined.
5.4 Average F-measure of coupled features for various thresholds.
5.5 Impact analysis F-measure values of TFC for different Rhino corpora.
5.6 F-measure of HFC in Rhino at selected thresholds.
5.7 Average F-measure of feature-class coupling in Rhino.
5.8 Number of each rating for dbViz's, Rhino's, and iBatis' feature pairs.
6.1 Overview of the architecture of FLAT3.
6.2 Entering a query to begin textual feature location.
6.3 FLAT3's Search/Trace Results view.
6.4 Collecting a trace in FLAT3.
6.5 Import a saved trace.
6.6 Linking a search result with a feature.
6.7 A search result has been annotated with the File Saving feature.
6.8 FLAT3's Features view.
6.9 FLAT3's visualization view.

Supporting Feature-Level Software Maintenance

Chapter 1

Introduction

Software maintenance is the process of modifying a software system after its initial development and deployment [209]. Software systems undergo changes for a variety of reasons: to fix problems, to improve performance, to add new functionalities, or to be adapted to a new environment. The software maintenance process has three main steps [17, 16]. First, the programmer must understand the existing software, at least partially. This crucial step can take 50%-60% of the total time required for maintenance [51, 115, 232]. Second, once adequate comprehension is achieved, the programmer can modify the software. Finally, the programmer must revalidate the newly modified software to ensure proper functionality and performance.

The software maintenance process is often triggered by a bug report or a feature request. Bug reports typically describe problems related to incorrect system behaviors or functionalities. These program behaviors or functionalities are known as features. In the literature, a feature is defined as "a requirement of a program that a user can exercise and which produces an observable behavior" [5]. For instance, a feature of a web browser is the ability to save a bookmark for a web page. As another example, features of a word processor include the abilities to spell-check and print a document.

Since many maintenance tasks are initiated by bug reports, and most bug reports are expressed in terms of features, it is the goal of this work to support software maintenance tasks at the feature level and to promote features to first-class entities [216] within at least one phase of the software life cycle. We focus on two software maintenance activities in particular: feature location and impact analysis.

1 Features are also sometimes referred to as concepts or concerns [175, 176]. The definitions of "concept" and "concern" are broader than the definition of "feature" because they cover behaviors of a software system that users cannot invoke or observe. This dissertation focuses on features.

Feature location, also referred to as concept location, is the process of identifying the source code that implements a feature [5, 12]. Before maintainers can change a system, they must explore its source code in order to locate and understand the code that is relevant to the feature undergoing modification. Thus, feature location corresponds to the first step of the software maintenance process. Locating a feature's source code is a challenging task, especially in large systems with hundreds of classes, thousands of methods, and millions of lines of code. Compounding the problem is the fact that features' implementations are often not encapsulated in a single module [61, 111, 174, 213]. Even in well-designed systems, it is inevitable that some features will have to be implemented in multiple modules.

Because features are scattered throughout the modules of a system, they are hard to locate and maintain. Feature location techniques seek to help maintainers more effectively and efficiently identify a feature's source code. Most existing feature location techniques locate features using textual, dynamic, or static analyses. Textual approaches leverage the semantic information embedded in source code comments and identifier names [45, 85, 103, 142, 157, 201]. However, if the system has poor identifier names, textual feature location techniques may not perform well. Dynamic approaches identify a feature's relevant source code by analyzing execution traces [5, 65, 66, 77, 229]. These dynamic techniques are prone to 1) being noisy, because of the difficulty of invoking only the feature of interest at runtime, and 2) being incomplete, because all of a feature's relevant source code may not be executed in a trace. Static feature location techniques generally require more user input and involve a programmer exploring the structural dependencies of code known to implement a feature to find additional relevant code [39, 176, 214]. Researchers have recognized that textual, dynamic, and static feature location techniques have their limitations, but that by combining them, their weaknesses can be mitigated [5, 65, 66, 77, 229]. This work also introduces new feature location techniques that are based on combining several types of analyses such as textual, dynamic, and static. In addition, this work introduces the use of web mining algorithms for feature location and shows that these new approaches are significantly more effective than existing techniques.

Once maintainers locate and understand a feature's source code, they can make appropriate changes to the feature to satisfy the maintenance task. These actions cover the first two steps of the software maintenance process. The third step is to revalidate the software, and this step can be achieved by performing impact analysis. Impact analysis is the process of determining the effects of a change to a software system [18, 40, 189]. One way of performing impact analysis is to use coupling metrics [29]. For instance, if class A is modified and is also tightly coupled to class B, then B is likely to be affected by changes to A.
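To illustrate how a coupling metric can drive impact analysis, the following sketch (in Python, with invented class names, coupling values, and threshold, none of which come from this dissertation) flags every class whose coupling to the modified class exceeds a chosen cutoff:

# Minimal sketch: estimating change impact from pairwise coupling values.
# The coupling table and the 0.4 threshold are illustrative placeholders.
coupling = {
    ("A", "B"): 0.72,
    ("A", "C"): 0.15,
    ("A", "D"): 0.41,
}

def estimate_impact(modified, coupling, threshold=0.4):
    """Return the classes coupled to 'modified' above the threshold."""
    impacted = set()
    for (x, y), strength in coupling.items():
        if strength < threshold:
            continue
        if x == modified:
            impacted.add(y)
        elif y == modified:
            impacted.add(x)
    return impacted

print(estimate_impact("A", coupling))  # {'B', 'D'}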

Existing coupling metrics are defined for classes. However, features transcend the boundaries of classes, so these existing metrics cannot be applied to them. To more effectively support software maintenance of features, this work introduces metrics that are specifically designed to capture the coupling among features. Maintainers can use these metrics to determine whether other features might be affected by the changes they make.

The new metrics capture the coupling among features by relying on structural and textual sources of information in source code.

This dissertation takes a novel view of supporting software maintenance by focusing on features. This work concentrates on the software maintenance tasks of feature location and feature-level impact analysis. Thus, this dissertation provides a comprehensive approach to supporting feature-level software maintenance tasks.

1.1 Research Goals and Contributions

Since the implementations of features are not always modularized, they are difficult to locate. Also, determining the relationships between un-modularized features is challenging.

This work aims to expressly support two maintenance tasks in terms of features: feature location and feature impact analysis via coupling. Both of these tasks are achieved by combining information from different sources, a process known as data fusion [112]. The principle behind data fusion is that combining information from different sources yields better results than if the data sources were used individually. This idea has been successfully applied to feature location [62, 76, 102, 130, 160, 244] as well as other areas of software engineering research [60, 79, 223]. The sources of data that can be analyzed from software are structural dependencies among program elements, execution information derived dynamically at runtime, and textual information embedded in the identifiers and comments found in source code. This work proposes to combine these sources of information to support both feature location and feature coupling.
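A minimal sketch of the data fusion idea follows; it simply normalizes and averages the relevance scores that two hypothetical analyses assign to the same methods, and is not the specific fusion model defined later in this dissertation. All method names and scores are invented.

# Generic data-fusion sketch: normalize and average relevance scores that
# different analyses assign to the same methods.
def normalize(scores):
    top = max(scores.values())
    return {method: score / top for method, score in scores.items()} if top else scores

def fuse(*rankings):
    normalized = [normalize(r) for r in rankings]
    methods = set().union(*[r.keys() for r in rankings])
    fused = {m: sum(r.get(m, 0.0) for r in normalized) / len(normalized) for m in methods}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

textual_scores = {"Buffer.save": 0.9, "Buffer.load": 0.2, "View.paint": 0.1}
dynamic_scores = {"Buffer.save": 0.7, "View.paint": 0.6}
print(fuse(textual_scores, dynamic_scores))

Methods supported by several sources rise toward the top of the fused ranking even when no single analysis ranks them highly on its own.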

Specifically, this dissertation makes the following research contributions:

1. A survey of feature location research.

We have conducted a comprehensive survey of existing feature location research and classified the literature within a taxonomy that has nine dimensions. The taxonomy captures key facets of typical feature location techniques and can be useful to both researchers and practitioners. Researchers can use this survey to identify related work as well as opportunities for future research. Practitioners can use this overview to determine which feature location approach is most suited to their needs.

2. An exploration of the use of several types of analyses for feature location.

We carried out an exploratory study of ten feature location techniques that use various combinations of textual, dynamic, and static analyses. A new way of applying textual analysis is introduced by which queries are automatically composed of the identifiers of a method known to be relevant to a feature. Our results show that this new type of query is just as effective as a query formulated by a human. We also provide insight into situations when certain feature location approaches are successful and unsuccessful. The results and observations of this exploratory study were used to guide the direction of the feature location research presented in this dissertation.

3. Development and evaluation of new feature location techniques based on web mining.

We created a data fusion model for feature location which defines new feature location techniques based on combining information from textual, dynamic, and web mining analyses applied to software. A novel contribution of the proposed model is the use of web mining algorithms to analyze execution information during feature location. The results of an extensive evaluation indicate that the new feature location techniques based on web mining improve the effectiveness of existing approaches by as much as 62%.

4. Development and evaluation of feature coupling metrics.

We have defined new feature coupling metrics based on structural and textual source code information and extended the unified framework for coupling measurement to include these new metrics. We also conducted three extensive case studies to evaluate these new metrics. The first study examined the relationship between feature coupling and fault-proneness, the second assessed feature coupling in the context of impact analysis, and the third study surveyed developers to determine if the metrics align with what they consider to be coupled features. All three studies provide evidence that feature coupling metrics are indeed useful new measures that capture coupling at a higher level of abstraction than classes and can be useful for finding bugs, guiding testing efforts, and assessing change impact.

5. Tool support for feature location and feature coupling.

We have developed a tool called FLAT3 that integrates textual and dynamic feature location techniques along with feature annotation capabilities, a useful visualization technique, and the ability to compute the proposed feature coupling metrics. FLAT3 provides a complete suite of tools that allows developers to quickly and easily locate the code that implements a feature, save these annotations for future use, and compute feature coupling based on the saved annotations.

1.2 Scope of this Dissertation

Features have been extensively studied in the literature. To clarify the scope of this dissertation, we discuss some of the research related to features and how our work differs.

In this section, we cover other programming paradigms that have been introduced to work with features and research on feature analysis.

Feature-oriented programming (FOP) [10, 11] focuses on creating software product lines, which are families of software systems in which each program is composed of a unique set of features. In FOP, programs are layered, and each layer adds a new feature; thus, features are modularized. FOP is a paradigm for synthesizing software, and there are also approaches for refactoring and remodularizing object-oriented systems into the FOP paradigm [131, 153]. In our work, we do not introduce any new paradigms but seek to support the maintenance of features within the object-oriented paradigm, where features cannot always be modularized.

Aspect-oriented programming (AOP) [111] is a paradigm that seeks to solve the problem of un-modularized features. In AOP, a new language construct called an aspect is created that modularizes a feature's implementation. Then a specialized compiler, called a weaver, follows instructions, called advice, on where to inject the aspect into the code base. AOP works well for a few types of features, such as logging, which have implementations that can be easily injected automatically by the weaver. However, AOP cannot entirely solve the problem of un-modularized features, and our work helps support maintainers who have to deal with the lack of localization.

Besides programming paradigms meant to cope with features, there is a wealth of existing research on feature analysis, in which features are considered first-class entities of a software system. These approaches have focused on how programmers develop features [94], a feature-centric environment for source code browsing [188], identifying canonical sets of features [120, 121, 119], reverse engineering [91], and identifying and refactoring features that need evolution [149]. Our work focuses on locating features' implementations and on determining the relationships between features using coupling.

1.3 Dissertation Organization

Chapter 2 presents our survey of feature location literature. Eighty-eight research, tool, case, industrial, and user study articles from 30 software engineering venues have been reviewed and classified within a taxonomy in order to organize and structure existing work in the field of feature location.

Chapter 3 presents the results of an exploratory study of feature location techniques based on combining textual, dynamic, and static analyses. The goal of the study was to examine how well these techniques locate multiple methods that are relevant to a feature, whereas most previous studies focus on how well feature location techniques find a single relevant method. In addition, different parameters to the analyses used are explored. The results of a user study comparing the various techniques with different parameters are presented.

Chapter 4 introduces a data fusion model for feature location. The model is based on combining textual analysis, dynamic analysis, and web mining. Web mining is a branch of data mining that extracts useful information from the structure of the web. We employ web mining algorithms to extract useful information from a program's call graph. The extracted information is used to effectively filter out false positives from the results of a feature location technique based on combining textual and dynamic analyses. The results of two case studies on open source systems are presented.
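The sketch below illustrates the general flavor of applying a web mining algorithm to a call graph: a simplified PageRank computation over an invented caller-to-callee map. It is not the exact formulation or filtering strategy used in Chapter 4; the graph, damping factor, and iteration count are assumptions for illustration only.

# Simplified PageRank over a hypothetical call graph (caller -> callees).
# Dangling methods (no outgoing calls) are ignored for simplicity.
# Methods with very low rank could then be filtered from a candidate list.
def pagerank(graph, damping=0.85, iterations=50):
    nodes = set(graph) | {m for callees in graph.values() for m in callees}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for caller, callees in graph.items():
            if not callees:
                continue
            share = damping * rank[caller] / len(callees)
            for callee in callees:
                new_rank[callee] += share
        rank = new_rank
    return rank

call_graph = {                       # invented example
    "Main.run": ["Parser.parse", "Interp.eval"],
    "Interp.eval": ["Interp.eval", "Scope.lookup"],
    "Parser.parse": ["Scope.lookup"],
}
ranks = pagerank(call_graph)
print(sorted(ranks, key=ranks.get, reverse=True))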

Chapter 5 defines three feature coupling metrics. One metric is based on structural information, one is based on textual information, and the final metric is based on a combination of structural and textual information. The unified framework for coupling measurement [25] is extended to include these new metrics. The chapter also reports the results of three separate evaluations aimed at answering the question, "Are feature coupling metrics useful?"
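As a rough illustration only (not the structural, textual, or hybrid metric definitions given in Chapter 5), the sketch below averages a placeholder structural signal with a placeholder textual signal to produce a combined coupling score; all names and numbers are invented.

# Illustrative-only hybrid coupling: average a structural signal with a
# textual signal. Both component measures here are placeholders.
def structural_coupling(calls_between, calls_total):
    """Fraction of a feature pair's method calls that cross the pair."""
    return calls_between / calls_total if calls_total else 0.0

def textual_coupling(words_a, words_b):
    """Jaccard overlap of the two features' identifier/comment vocabularies."""
    a, b = set(words_a), set(words_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def hybrid_coupling(structural, textual, weight=0.5):
    return weight * structural + (1.0 - weight) * textual

s = structural_coupling(calls_between=6, calls_total=40)
t = textual_coupling(["save", "buffer", "file"], ["load", "buffer", "file"])
print(hybrid_coupling(s, t))  # 0.5 * 0.15 + 0.5 * 0.5 = 0.325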

Chapter 6 presents a tool called FLAT3, the Feature Location and Textual Tracing Tool, that implements the ideas proposed in this dissertation. FLAT3 supports several types of feature location. These techniques can be used to find features' implementations, and then the located code can be associated with features. These associations can be used to automatically compute our feature coupling metrics and to visualize the distribution of a feature throughout a system's classes.

1.4 Bibliographical Notes

Portions of this dissertation have been previously published or have been submitted for publication and are under review at the time of this writing. Chapter 2 is based on a submission to an international journal. Material from Chapter 3 was published and presented at the 17th International Conference on Program Comprehension (ICPC 2009) [173]. Portions of the work presented in Chapter 4 have been accepted for publication at the 18th International Conference on Program Comprehension (ICPC 2010) and also submitted to an international journal. The ideas and results in Chapter 5 are currently under review for publication at an international journal. Finally, parts of Chapter 6 appear in a formal research demonstration accepted for publication at the 32nd International Conference on Software Engineering (ICSE 2010) [198].

Chapter 2

Survey of Feature Location Research

In software systems, a feature represents a functionality that is defined by requirements and accessible to developers and users. Software maintenance and evolution involves adding new features to programs, improving existing functionalities, and removing bugs. Identifying the parts of the source code that correspond to a specific functionality is known as feature (or concept) location [12, 168]. It is one of the most frequent maintenance activities undertaken by developers because it is part of the incremental change process [167]. During the incremental change process, programmers use feature location to find where in the code the first change to complete a task needs to be made. The full extent of the change is then handled by impact analysis, which starts with the source code found by feature location and finds all code affected by the change. Methodologically, the two activities of feature location and impact analysis are different and are treated separately in the literature.

Feature location is one of the most important and common activities performed by programmers during software maintenance and evolution. No maintenance activity can be completed without first locating the code that is relevant to the task at hand, making feature location essential to software maintenance. For example, Alice is a new developer on a software project, and her manager has given her the task of fixing a bug that has been reported. Since Alice is new, she is unfamiliar with the large code base of the software system and does not know where to begin. Lacking sufficient documentation on the system and the ability to ask the code's original authors for help, the only option Alice sees is to manually search for the code relevant to her task.

Alice's situation is one faced by many programmers needing to understand and modify an unfamiliar codebase. However, a manual search of a large amount of source code, even with the help of tools such as pattern matchers or an integrated development environment, can be frustrating and time-consuming. Recognizing this problem, software engineering researchers have developed numerous feature location techniques to aid programmers in Alice's position. The various techniques that have been introduced are all unique in terms of their input requirements, how they locate a feature's implementation, and how they present their results. Thus, even the task of choosing a suitable feature location technique can be challenging.

The existence of such a large body of feature location research calls for a comprehensive overview. Since there currently is no broad summary of the field of feature location, this chapter provides a survey and operational taxonomy of this pertinent research area.

The survey includes research articles that introduce new feature location approaches; case, industrial, and user studies; and tools that can be used in support of feature location. The articles are characterized within a taxonomy that has nine dimensions, and each dimension has a set of attributes associated with it. The dimensions and attributes of the taxonomy capture key facets of typical feature location techniques and can be useful to both software engineering researchers and practitioners [140]. Researchers can use this survey to identify what has been done in the area of feature location and what needs to be done; that is, they can use it to find related work as well as opportunities for future research. Practitioners can use this overview to determine which approach is most suited to their needs.

This survey encompasses 88 articles (51 research articles and 37 tool and case study papers) from 30 venues published before October 2009. The research articles were selected because they either state feature/concept location as their goal or present a technique that is essentially equivalent to feature location. The tool papers include tools developed specifically for feature location as well as program exploration tools that support feature location. The case study articles include industrial and user studies as well as studies that compare existing approaches.

There are several research areas that are closely related to feature location such as traceability link recovery, impact analysis, and aspect mining. Traceability link recovery seeks to connect documentation with source code, while feature location is more concerned with identifying source code associated with functionalities, not specific sections of a document.

Impact analysis is the step in the incremental change process performed after feature location with the purpose of expanding on feature location's results, especially after a change is made to the source code. Feature location focuses on finding the starting point for that change. The main goal of aspect mining is to identify cross-cutting concerns and determine the source code that should be refactored into aspects, meaning the aspects themselves are not known a priori. By contrast, in the contexts in which feature location is used, the features are already known and only the code that implements them is unknown. Therefore, articles and research from these related fields are not included here as they are beyond the scope of this survey.

The work presented in this chapter has two main contributions. The first is a comprehensive survey of feature location techniques, relevant case studies, and tools. The second is a taxonomy derived from those techniques. Appendix A lists all of the surveyed articles classified within the taxonomy. Section 2.1 introduces the dimensions of the taxonomy, and Section 2.2 provides brief descriptions of the surveyed research articles. Section 2.3 overviews the tools and studies, and Section 2.4 discusses open issues in feature location.

Section 2.5 concludes.

2.1 Dimensions of the Survey

The goal of this survey is to provide researchers and practitioners with an organized overview of existing feature location research. From a thorough inspection of the literature, a number of key dimensions emerged. These dimensions objectively describe the different techniques and give structure to the surveyed literature. The dimensions are as follows.

• The type of the article: Is it a research article, case study, or tool paper?

• The type of analysis: What analyses are used to support feature location?

• The sources of information: What sources of information are used for feature location?

• The granularity: What is the granularity of the located program elements?

• Programming language support: To what languages has the technique been applied?

• The presentation of the results: How are the located program elements presented to the user?

• The evaluation of the approach: How is the technique assessed?

• Comparison to other feature location techniques: To what other approaches is the new one compared, if any?

• The systems used for evaluation: To what software systems has the technique been applied?

The order in which these dimensions are presented does not imply any explicit priority or importance.

Each dimension has a number of distinct attributes associated with it. For a given dimension, a feature location technique may be associated with multiple attributes. These dimensions and their attributes were derived by examining an initial set of articles of interest. They were then refined and generalized to succinctly characterize the properties that 1) make feature location techniques unique and 2) can be used to evaluate and compare them. The goal of the taxonomy is to allow researchers and practitioners to easily locate the feature location techniques that are most suited to their needs. The dimensions and their associated attributes are listed in Table A.1 in Appendix A along with their abbreviations (given in parentheses) that are used in the taxonomy of the surveyed articles. These dimensions and attributes are discussed in the remainder of this section.

2.1.1 Type of Article

This survey encompasses three types of feature location articles: research articles, tool papers, and case studies. The research articles introduce new feature location techniques.

The tool papers describe applications that perform feature location and program exploration tools that, while not necessarily designed for feature location, support the search for a feature's implementation. The case study articles include direct comparisons of several feature location techniques, industrial case studies, and user studies.

2.1.2 Type of Analysis

A primary distinguishing factor of feature location techniques is the type, or types, of analysis they employ to identify the code that pertains to a feature. The most common analyses include dynamic, static, and textual. While these are not the only analyses possible, they are the ones used by the vast majority of feature location techniques, and some approaches even leverage more than one of these analyses. In Section 2.2, descriptions of all the surveyed articles are given, and the section is organized by the types of analyses used.

Dynamic analysis refers to examining a software system's execution, and it is often used for feature location when features can be invoked and observed during runtime. Feature location using dynamic analysis generally relies on a post-mortem analysis of an execution trace. Typically, one or more feature-specific scenarios are developed that invoke only the desired feature. Then, the scenarios are run and execution traces are collected, recording information about the code that was invoked. These traces are captured either by instrumenting the system or through profiling. Once the traces are obtained, feature location can be performed in many ways. The traces can be compared to other traces in which the feature was not invoked to find code only invoked in the feature-specific traces [76, 229]. Alternatively, the frequency of execution of portions of code can be analyzed to locate a feature's implementation [5, 77, 190]. Using dynamic analysis for feature location is a popular choice since most features can be mapped to execution scenarios. However, there are some limitations associated with dynamic analysis. The collection of traces can impose significant overhead on a system's execution. Additionally, the scenarios used to collect traces may not invoke all of the code that is relevant to the feature, meaning some of the feature's implementation may not be located [233]. Conversely, it may be difficult to formulate a scenario that invokes only the desired feature, causing irrelevant code to be executed [68, 76, 233]. Dynamic feature location techniques are discussed in Section 2.2.1.

Static analysis examines structural information such as control and data flow dependencies. In manual feature location, developers may follow program dependencies in a section of code they deem to be relevant in order to find additional useful code, and this idea is used in some approaches to static feature location [39]. Other techniques analyze the topology of the structural information to point programmers to potentially relevant code [176]. While using static analysis for feature location is very close to what a human searching for code may do, it often overestimates what is pertinent to a feature and is prone to giving many false positive results. Static approaches to feature location are summarized in Section 2.2.2.
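The core of such dependency-following can be pictured as a bounded traversal of a dependence graph from a seed method; the graph, seed, and hop limit below are invented, and real static techniques add substantial filtering and ranking on top of this idea.

# Bounded breadth-first exploration of a hypothetical dependence graph,
# starting from a method already believed to be relevant to the feature.
from collections import deque

def explore(dependencies, seed, max_hops=2):
    seen, frontier = {seed}, deque([(seed, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for neighbor in dependencies.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return seen

deps = {
    "SpellChecker.check": ["Dictionary.lookup", "Editor.getText"],
    "Dictionary.lookup": ["Dictionary.load"],
    "Editor.getText": ["Buffer.read"],
}
print(explore(deps, "SpellChecker.check"))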

Textual approaches to feature location analyze the words used in source code. The idea is that identifiers and comments encode domain knowledge, and a feature may be implemented using a similar set of words throughout a software system, making it possible to find a feature's relevant code textually. Textual analysis is performed using three main techniques: pattern matching, information retrieval (IR), and natural language processing (NLP). Pattern matching usually involves a textual search of source code using a utility such as grep (http://www.gnu.org/software/grep/). Information retrieval techniques, such as Latent Semantic Indexing (LSI) and the Vector Space Model (VSM), are statistical methods used to find a feature's relevant code by looking for identifiers and comments that are similar to a query provided by a user. NLP approaches can also use a query, but they analyze the parts of speech of the words used in source code. Pattern matching is relatively robust but not very precise because of the vocabulary problem [81]; the chances of a programmer choosing query terms that match the vocabulary of unfamiliar source code are very low. On the other hand, NLP is more precise than pattern matching but much more expensive. Information retrieval lies between the two. No matter the type of textual analysis used, the quality of feature location is heavily tied to the quality of the source code naming conventions and/or the user-issued query. Textual feature location techniques are reviewed in Section 2.2.3.
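As a toy illustration of the vector space flavor of these IR techniques (invented method texts and query; not the LSI setup used elsewhere in this dissertation), methods can be represented as bags of identifier and comment terms and ranked by cosine similarity to the query:

# Toy vector space model: rank methods by cosine similarity to a query.
# The "corpus" of method texts is invented for illustration.
import math
from collections import Counter

def cosine(a, b):
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

corpus = {
    "Buffer.save":  "save buffer write file disk autosave",
    "Buffer.load":  "load buffer read file open",
    "View.repaint": "repaint view screen draw",
}
query = Counter("save file to disk".split())
vectors = {m: Counter(text.split()) for m, text in corpus.items()}
ranked = sorted(vectors, key=lambda m: cosine(query, vectors[m]), reverse=True)
print(ranked)  # Buffer.save first for this query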

Feature location is not limited to just dynamic, static, or textual analysis. Many techniques draw on multiple analyses to find a feature's implementation, and some do not use any of these types of analyses. Existing approaches that combine analyses do so with the goal of using one type of analysis to compensate for the limitations of another, thus achieving better results than standalone techniques. The unique ways in which multiple types of analyses are combined for feature location are described in Sections 2.2.4 through 2.2.6.

Other approaches do not rely on dynamic, static, or textual analysis. For instance, two feature location techniques rely on historical analysis by mining repositories in order to identify lines of code [38] or artifacts related to a feature [218]. Another technique examines the code visible to a programmer during a maintenance task and tries to infer what was important [180]. These exceptions are explained in Section 2.2.8.

2.1.3 Sources of Information

Related to the types of analyses used by a feature location technique are the sources of information used by that approach. In the same way that a technique can make use of multiple types of analyses, one or more sources of information can also be used for feature location. The analysis must be performed on some artifact(s) related to the software system such as source code, documentation, execution traces, and dependence graphs. Typically, the sources of information used match the type of analysis employed. Dynamic analysis uses execution traces captured when a feature is executed. Different representations of source code, such as a call graph, can be used by static analysis. Source code and documentation can be leveraged in textual analysis to find code that is relevant to a feature.

For the exceptional cases, the two techniques that use historical analysis mine version control systems, issue trackers, and communication archives, and the other exception uses a transcript of a program investigation listing the code visible to a programmer and how it was accessed.

2.1.4 Granularity

The purpose of feature location is to find the source code that implements a specific feature. Existing feature location techniques identify code at different granularities: classes, methods, basic blocks, lines, decisions, or uses of variables. In this survey, the phrase program elements refers to portions of code at any of these levels of granularity. The more fine-grained the program elements located by a technique, the more specific and expressive the feature location technique is. For instance, all the basic blocks or variables may be relevant, but when classes are located, not all the methods in the class may pertain to the feature. Some approaches may be applicable to multiple levels of granularity, but only those program elements that are actually shown to be supported in an article are reported in this survey.

2.1.5 Programming Language Support

The programming language in which a software system is written can be a factor in the types of feature location techniques that can be applied to it. Textual and historical analyses are programming language agnostic at the file level, but require parsers to be applied to methods or other levels of granularity. Static and dynamic analyses can be limited due to tool support for a given language. In this survey, all programming languages on which a technique has been applied are reported. The majority of existing feature location approaches have been exercised on Java or C/C++ systems since ample tool support is available for these languages. Other programming languages that have been supported include FORTRAN and COBOL. Knowing the languages under which an approach works can help researchers and practitioners select an appropriate technique, though the fact that an approach has not been used on a certain programming language does not imply that it is not applicable to systems implemented in that language.

2.1.6 Presentation of the Results

Once a feature location technique identifies candidate program elements for a feature, those results must be presented to the programmer. Existing feature location approaches have different ways of reporting their findings. One option is to present a list of candidate program elements ranked by their relevance to the feature [5, 77, 130, 142, 176]. Generally, the programmer only examines the top-ranked elements on the list, or only the most relevant program elements are reported to the programmer. Another way in which feature location results are presented is as an unordered set of program elements [62, 76, 229]. A set of elements is identified as being relevant to a feature, but no notion of their degree of relevance is given. Another alternative form of presentation is to visualize the software system and highlight the relevant program elements [22, 222, 235]. Finally, some feature location techniques do not automatically identify relevant program elements but describe a process a programmer can follow to manually search for a feature's implementation [39]. Since different feature location techniques present their results in different ways, comparing approaches that use different reporting styles can be challenging.

2.1.7 Evaluation

The way in which a feature location technique is evaluated provides researchers and practitioners with useful information on the approach's quality, robustness, and practical applicability. Evaluating a feature location technique is difficult because defining the program elements that are relevant to a feature is subjective. Despite this difficulty, researchers have devised a number of ways to assess feature location techniques. The most simplistic evaluations are preliminary in nature and involve a small, toy system or anecdotal evidence that the approach works. More advanced evaluations adopt a benchmark that designates the program elements that are related to a feature. Common benchmarks include documentation, patches submitted to an issue tracker, and program elements modified to fix a bug with a particular feature. Patches or bugs provide documented, reproducible gold sets of a feature's program elements. These benchmarks are generated by developers who are familiar with the software, so they carry more weight than anecdotal evidence. However, there is no guarantee that these benchmarks are 100% correct and complete. Patches and bugs may only pertain to a small portion of a feature and not touch all of its program elements. Another way to evaluate a feature location approach is to have system experts or even non-experts assess its results, which is an evaluation method often used by IR-based search engines. When multiple experts or non-experts are used, the intersection of their results can be used to create a benchmark. However, the agreement among programmers as to what program elements are relevant to a feature can be low [183].
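When a gold set is available, results are typically scored with precision, recall, and the F-measure; the sketch below uses invented method names and is not tied to any particular study surveyed here.

# Precision, recall, and F-measure of a candidate set against a gold set.
def precision_recall_f(candidates, gold):
    candidates, gold = set(candidates), set(gold)
    hits = len(candidates & gold)
    precision = hits / len(candidates) if candidates else 0.0
    recall = hits / len(gold) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f

gold = {"Buffer.save", "Autosave.run", "SaveDialog.show"}
found = {"Buffer.save", "SaveDialog.show", "View.repaint"}
print(precision_recall_f(found, gold))  # (0.67, 0.67, 0.67), roughly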

2.1.8 Comparison to Other Feature Location Techniques

When new feature location techniques are introduced, they should be directly compared with existing approaches in order to demonstrate their (expected) superior performance.

Articles that include comparisons of feature location techniques are very useful to researchers and practitioners because they highlight the advantages and limitations of the compared approaches in certain situations. Feature location techniques that appear frequently in comparisons are Abstract System Dependence Graphs (ASDG) [39], Dynamic Feature Traces (DFT) [77], Formal Concept Analysis-based feature location (FCA) [76], Latent Semantic Indexing-based feature location (LSI) [142], Probabilistic Ranking of Methods based on Execution Scenarios and Information Retrieval (PROMESIR) [160], software reconnaissance [229], and Scenario-based Probabilistic Ranking (SPR) [5]. UNIX grep is another popular point of comparison because programmers often use it to textually search for relevant source code.

2.1.9 Systems used for Evaluation

A wide variety of software systems have been studied in feature location research, and the size and type of systems used in a case study reflect, to a degree, the applicability of a technique. By reviewing the software systems that have previously been used for feature location, some de facto benchmarks emerge. Some of the more popular systems are web browsers like Mozilla (http://www.mozilla.org/), Firefox (http://www.mozilla.org/firefox), Mosaic, and Chimera (http://www.chimera.org/). Other systems that have been investigated frequently are Eclipse (http://www.eclipse.org/), jEdit (http://www.jedit.org/), and JHotDraw (http://www.jhotdraw.org/). For some of these systems, there are a few features that are repeatedly used, but overall, no gold standard of features and their associated program elements has emerged. An abundance of other software systems have been studied. The systems on which a feature location technique has been applied are listed in the taxonomy. Having a comprehensive list of the software systems studied for feature location allows researchers to identify good candidates for systems to use in their own evaluations. Also, it lets practitioners recognize approaches that may be successfully applied to their own software system if the program they wish to apply a feature location technique to is similar to a system on which the approach has already been used.

2.2 Survey of Feature Location Techniques

This section summarizes the 88 research, tool, and case study articles reviewed for this survey. An initial subset of articles of interest was selected, then additional relevant articles were found by following references, visiting authors' websites, and using online search tools. The articles were published in 30 different venues. Figure 2.1 shows the distribution of articles across venues, and Table 2.1 lists the abbreviations and names of the venues. The height of the bars represents the number of feature location articles published. Venues at which only one surveyed paper was published are grouped together in the "Other" bar. Black bars represent journals, and gray bars denote conferences and workshops.

When multiple articles describe a given approach (such as conference and journal versions), both are cited, but the summary primarily pertains to the journal version. The articles are classified by the types of analysis used for feature location, and other dimensions of the taxonomy are mentioned as appropriate.

The types of analyses employed are the most distinguishing characteristic of feature location approaches, so they are a logical choice for organizing this survey. In the subsections below, the surveyed articles are categorized by their use of one or more types of analyses: dynamic; static; textual; dynamic and static; dynamic and textual; static and textual; dynamic, static, and textual; and other. Table A.2 (located in Appendix A) presents the articles and their classifications within the dimensions of the taxonomy.

2.2.1 Dynamic Feature Location

Dynamic feature location relies on collecting information from a system during runtime.

Dynamic analysis has a rich history in the area of program comprehension [53], and feature location is just one subfield in which it is used. A number of dynamic approaches exist that deal with feature interactions [69, 192, 194], feature evolution [93], hidden dependencies among features [80], as well as identifying a canonical set of features for a given software system [120]. These techniques are beyond the scope of this survey, which focuses only on approaches that seek to identify candidate program elements that implement a feature.


Table 2.1: Venues which have published the articles included in this survey.

Acronym Description JSME Journal on Software Maintenance and Evolution JSS Journal on Systems and Software TOSEM Transactions on Software Engineering TSE Transactions on Software Engineering and Methodology AOSD Aspect-Oriented ASE International Conference on Automated Software Engineering CSMR European Conference on Software Maintenance and Reengineering ESEC/FSE European Software Engineering Conference/Symposium on the Foundations of Software Engineering ICSE International Conference on Software Engineering ICSM International Conference on Software Maintenance IWPC/ICPC International Workshop/Conference on Program Comprehension VIS SOFT International Workshop on Visualizing Software for Understanding and Analysis WCRE Working Conference on Reverse Engineering

This subsection summarizes articles that achieve this goal using dynamic analysis.

Software reconnaissance [228, 229] is one of the earliest feature location techniques, and it relies solely on dynamic information. Two sets of scenarios or test cases are defined, scenarios that activate a feature and scenarios that do not, and then execution traces of all the scenarios are collected. For example, in a word processor, if the feature to be located is spell checking, feature-specific scenarios would activate the spell checker and the other scenarios would not. Feature location is then performed by analyzing the two sets of traces and identifying the program elements (methods) that only appear in the traces that invoked the feature. This idea of comparing traces from scenarios that do and do not invoke a feature has been heavily used and extended by other researchers in the field.
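To make the core comparison concrete, the following minimal Python sketch applies the software reconnaissance idea to invented trace data; the method names and trace contents are purely illustrative and are not taken from the original tool.

```python
# Minimal sketch of software reconnaissance: methods that appear only in
# traces of feature-invoking scenarios are candidate feature code.
# The trace data below is fabricated for illustration.

invoking_traces = [
    {"Editor.open", "SpellChecker.check", "Dictionary.lookup"},
    {"Editor.open", "SpellChecker.check", "SpellChecker.suggest"},
]
non_invoking_traces = [
    {"Editor.open", "Editor.save"},
    {"Editor.open", "Printer.print"},
]

executed_with_feature = set().union(*invoking_traces)
executed_without_feature = set().union(*non_invoking_traces)

# Candidate feature-specific methods: executed when the feature runs,
# never executed when it does not.
candidates = executed_with_feature - executed_without_feature
print(sorted(candidates))
# ['Dictionary.lookup', 'SpellChecker.check', 'SpellChecker.suggest']
```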

One extension to the software reconnaissance approach is Dynamic Feature Traces (DFT) [77]. First, scenarios/test cases are grouped by feature, and then execution traces are collected. Next, all pairs of method callers and callees are identified from a trace, and each method is assigned a rank for the feature. The rank is based on the average of three heuristics: multiplicity, specialization, and depth. Multiplicity is the percentage of a feature's tests that exercise a method compared to the percentage of methods in each non-feature's set of tests. Specialization is the degree to which a method was executed only by a feature and no others. Depth measures how directly a set of tests exhibits a feature compared to the other test sets. Since DFT is a refinement of software reconnaissance, the two techniques were compared head-to-head on three Java systems; DFT was found to be more useful because developers using it are more likely to discover a feature's relevant methods.

Like software reconnaissance and DFT, Scenario-based Probabilistic Ranking (SPR) [4, 5] relies only on dynamic analysis to identify a feature's relevant program elements (methods). The idea behind SPR is to assign a probability that an event in an execution trace is associated with a feature and then rank all events. In SPR, as in software reconnaissance, two sets of scenarios are defined, scenarios that do and do not exercise a feature, and method-level execution traces are collected for each scenario. The traces are partitioned into intervals. An interval corresponds to a sub-sequence of contiguous events (method calls) from a trace, where I is an interval from a relevant scenario and I' is an interval from an irrelevant scenario. Events are classified as relevant to a feature or not by determining if their frequency in interval I is greater than their frequency in interval I'. For any interval, an event's frequency is computed as the ratio of the number of times the event appears in the interval to the total number of events in the interval. Essentially, determining whether an event is relevant to a feature or not is a statistical hypothesis test. The null hypothesis is that an event's frequency in the two types of intervals is the same. A threshold, δ, is chosen, and if an event is classified as relevant to a feature more than δ times, the null hypothesis is rejected with a confidence level α. Events are also ranked by their relevance to a feature using a relevance index score that is computed from the number of times an event appears in relevant intervals versus the number of times it appears in irrelevant intervals. SPR has been applied to a number of systems including Mozilla (http://www.mozilla.org/), Firefox (http://www.mozilla.org/firefox), Chimera (http://www.chimera.org/), JHotDraw, and XFig (http://www.xfig.org/). Case studies have compared SPR directly to feature location using grep, information retrieval [142], and formal concept analysis [76]. Unlike the other two approaches in the comparison, SPR ranks its results, so it is successful at reducing the amount of data a programmer needs to inspect to find relevant program elements.
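The frequency comparison underlying SPR can be illustrated with the short Python sketch below; the interval data is invented, and the simple win-ratio used here is a stand-in for the statistical test and relevance index defined in the original work.

```python
# Illustrative sketch of SPR's frequency comparison (not its full
# statistical machinery). Intervals are lists of method-call events.

def freq(event, interval):
    """Fraction of an interval's events that are this event."""
    return interval.count(event) / len(interval)

relevant_intervals = [["a", "b", "b", "c"], ["b", "b", "d"]]    # feature exercised
irrelevant_intervals = [["a", "c", "d"], ["a", "d", "d"]]        # feature not exercised

def relevance(event):
    # Count interval pairs (I, I') in which the event is more frequent in
    # the feature-relevant interval than in the irrelevant one.
    wins = sum(freq(event, i) > freq(event, j)
               for i in relevant_intervals for j in irrelevant_intervals)
    return wins / (len(relevant_intervals) * len(irrelevant_intervals))

events = {e for interval in relevant_intervals + irrelevant_intervals for e in interval}
for e in sorted(events, key=relevance, reverse=True):
    print(e, round(relevance(e), 2))    # 'b' ranks highest in this toy example
```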

Like SPR, the approach introduced by Safyallah and Sartipi [190] uses a threshold to classify relevant events from execution traces. They apply a sequential pattern mining technique to execution traces to locate the methods that implement a feature. An execution pattern is a continuous portion of a trace that appears in at least a given number (called MinSupport) of a feature's traces. There are strategies for identifying execution patterns for a single feature or a group of features based on setting the MinSupport threshold. In a case study on XFig, not only was code for the invoked features located, but execution patterns for less visible features such as mouse pointer handling and canvas view updating were also identified.

While the dynamic approaches discussed thus far have focused on locating a feature's methods or classes, Wong et al. [234] use execution slices to locate features at the level of basic blocks, decisions, or uses of variables. A small set of carefully selected test cases that invoke the desired feature is run along with another set that does not execute the feature. From these sets, they define heuristics for finding code that is unique to a feature or common to several features. Code that is in the union of the invoking sets but not in the union of the non-invoking sets is unique to the feature, while code in the intersection of the sets invoked by two separate features is common to those features. An evaluation of this execution slice technique was performed on five features of SHARPE [191], a stochastic model analyzer, in which system experts verified the results. Tool support for this approach was built into xVue [2], in which a system is instrumented at compile time, traces can be collected and assigned to the (non-)invoking set, and source code related to a feature is highlighted in a development environment.
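The union and intersection operations described above amount to simple set algebra over coverage data, as the following Python sketch shows; the basic-block identifiers and coverage sets are fabricated.

```python
# Toy sketch of execution-slice-based feature location: code unique to a
# feature versus code common to two features. Each set holds the basic
# blocks covered by one test case.

feature_a_runs = [{"b1", "b2", "b5"}, {"b1", "b5", "b6"}]
feature_b_runs = [{"b1", "b3"}, {"b1", "b3", "b4"}]
non_invoking_runs = [{"b1"}, {"b1", "b7"}]

union = lambda runs: set().union(*runs)
inter = lambda runs: set.intersection(*map(set, runs))

# Unique to feature A: covered by some A-invoking run, never by a
# non-invoking run.
unique_to_a = union(feature_a_runs) - union(non_invoking_runs)

# Common to features A and B: in the intersection of the code exercised
# by each feature.
common_a_b = inter(feature_a_runs) & inter(feature_b_runs)

print(sorted(unique_to_a))   # ['b2', 'b5', 'b6']
print(sorted(common_a_b))    # ['b1']
```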

The previous approaches are susceptible to imprecision when applied to multi-threaded and distributed systems. To overcome this problem, Edwards et al. [65] developed a dynamic approach for feature location in distributed systems. The technique is based on causal relationships among events (messages) and assumes a developer can identify the first and last events associated with a feature. An interval is defined as all events that causally follow or precede a feature's starting and ending events, respectively. Program elements are assigned a component relevance index, which is the proportion of executions of that element during a feature's interval. This score can be used to rank messages, and two case studies showed that the approach ranks relevant methods highly.

A main shortcoming of dynamic analysis is the overhead it imposes on a system's execution. In distributed and time-sensitive systems, the use of dynamic analysis can be prohibitive. Edwards et al. [66] report on their experiences using dynamic analysis to perform feature location in time-sensitive systems. Instrumenting a software system in order to collect execution traces of the program elements that are invoked affects the system's runtime performance. Edwards et al. developed a minimally intrusive instrumentation technique called minist that reduces the number of instrumentation points while still keeping test code coverage high. For an initial evaluation, Apache's httpd (http://httpd.apache.org/) and several large in-house programs were used, and minist was compared to uninstrumented executions as well as several other tools for collecting traces. The minist approach increased execution time by only 1% on httpd, while the other tools caused increases of 7% to over 2,000%.


2.2.2 Static Feature Location

In contrast to dynamic feature location, static feature location does not involve executing a software system. Instead, source code is statically analyzed and its dependencies and structure are explored manually or automatically. Feature location techniques leverage several types of dependencies such as control and data. The structure of a software system's dependencies can also be exploited to locate candidate program elements. Generally, static analysis can be performed to construct a dependence graph, but the actual search requires some initial starting set of relevant program elements, which means this type of analysis involves some degree of human input.

Abstract System Dependence Graphs (ASDGs) [39] were introduced by Chen and Rajlich as an aid to programmers who need to find the code related to a maintenance task. The ASDG is a feature location technique based on statically built program dependence graphs. Nodes in an ASDG correspond to either methods or global data, and edges denote either control or data dependencies between nodes. Using an ASDG is a manual technique whereby programmers choose a starting point (e.g., a known relevant method, or main by default) and then search the graph using a depth-first or breadth-first strategy until all relevant program elements are found. ASDGs have been used in a case study involving the Mosaic web browser, allowing partial comprehension of the system.
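A minimal Python sketch of the kind of guided exploration an ASDG supports appears below; the dependence graph is invented, and the programmer's accept/reject decisions are simulated by a simple predicate.

```python
# Breadth-first exploration of a (made-up) dependence graph from a seed,
# expanding only the nodes the programmer judges relevant.
from collections import deque

asdg = {  # node -> nodes it depends on (calls or shared global data)
    "main":         ["parseArgs", "loadDocument"],
    "loadDocument": ["readFile", "buildModel"],
    "buildModel":   ["globalConfig"],
    "parseArgs":    [],
    "readFile":     [],
    "globalConfig": [],
}

def explore(graph, seed, is_relevant):
    visited, relevant = set(), []
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if node in visited:
            continue
        visited.add(node)
        if is_relevant(node):               # in practice, the programmer decides
            relevant.append(node)
            queue.extend(graph.get(node, []))
    return relevant

# Simulated programmer judgment for this toy example.
marked = {"main", "loadDocument", "readFile", "buildModel"}
print(explore(asdg, "main", lambda n: n in marked))
# ['main', 'loadDocument', 'readFile', 'buildModel']
```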

Feature location with landmarks and barriers is a more automated approach than ASDGs. Developed by Walkinshaw et al. [222], it is a static feature location technique based on slicing a call graph. The first step of the approach is to identify landmark and barrier methods in a static call graph, where a landmark is a method that contributes to a feature and barriers are irrelevant methods. Direct paths between landmark nodes, known as hammock graphs, are found, and additional dependencies are obtained via backward slicing. Barriers and their dependencies are removed from the call graph to prevent exploration of irrelevant methods. The output of this approach is a pruned call graph. The technique was evaluated on NanoXML (http://devkix.com/nanoxml.php), Freemind (http://freemind.sourceforge.net/), and JHotDraw, finding that the landmark and barrier technique substantially reduces the size of the call graph that a programmer has to investigate.

Another approach that is more automated than ASDGs is topology analysis of the structural dependencies [175, 176] in a software system. The main thesis of the approach is that by analyzing a program's topology (method calls and field accesses), programmers can be guided toward relevant sections of code. The algorithm begins with an initial set I of program elements identified as relevant by the programmer. The algorithm then examines the structural dependencies between the elements in I and the rest of the system to produce a suggestion set S. Both sets are actually fuzzy sets, and each element in S is assigned a value signifying its relevance. The value is based on two heuristics: specificity and reinforcement. Specificity refers to how specific or unique a structural dependency is. If program element a is in I, and its only dependency is with b, then b would be ranked highly in S. The other heuristic, reinforcement, ranks highly program elements that appear to be the odd ones out. For example, if element x from I has five dependencies, and four of them are already in I, then the fifth would be ranked highly in S. Tool support for this algorithm has been implemented in Suade (http://www.cs.mcgill.ca/~swevo/suade/) [224], which works in conjunction with the ConcernMapper (http://www.cs.mcgill.ca/~martin/cm/) [184] Eclipse plug-in, which programmers use to define the initial set I.
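The sketch below conveys the spirit of the specificity and reinforcement heuristics on an invented dependency graph; the scoring formulas are simplified stand-ins for the fuzzy-set computation actually used by the topology analysis and Suade.

```python
# Rough illustration of specificity and reinforcement over a made-up
# undirected dependency graph. The formulas are simplifications.

graph = {
    "a": {"b"},
    "b": {"a"},
    "x": {"p", "q", "r", "s", "t"},
    "p": {"x"}, "q": {"x"}, "r": {"x"}, "s": {"x"}, "t": {"x"},
}
interest = {"a", "x", "p", "q", "r", "s"}       # the programmer's initial set I

def suggestions(graph, interest):
    scores = {}
    for elem in interest:
        neighbors = graph[elem]
        for cand in neighbors - interest:
            # Specificity: a dependency shared with few others counts more.
            specificity = 1.0 / len(neighbors)
            # Reinforcement: if most of elem's dependencies are already in I,
            # the remaining ones are likely relevant too.
            reinforcement = len(neighbors & interest) / len(neighbors)
            scores[cand] = max(scores.get(cand, 0.0), specificity + reinforcement)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(suggestions(graph, interest))
# 'b' scores through specificity (a's only dependency); 't' scores through
# reinforcement (four of x's five dependencies are already in I).
```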

Saul et al. [197] developed an approach for recommending program elements that are relevant to a maintenance task (i.e., a feature) by approximating a random walk of a system's call graph. The goal of their work is to find other methods that are related to a given method by relying only on structural, call graph information. In their FRAN (Finding with RANdom walks) algorithm, a large set of related program elements is identified and ranked. Candidate related program elements are identified if they are on the same "layer" as the relevant program element, meaning they call or are called by the same methods. Relevance information about those program elements is then assigned using the HITS web mining algorithm [113]. An evaluation on Apache httpd shows that the FRAN algorithm improves upon the performance of Suade.
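Because FRAN's ranking step rests on HITS, the following sketch shows a bare-bones HITS computation over a small, invented call graph; how FRAN selects the candidate layer and interprets hub versus authority scores is not reproduced here.

```python
# Power-iteration HITS over a made-up call graph: authorities are methods
# called by good hubs, hubs are methods that call good authorities.

calls = {                               # caller -> callees
    "ui.handleClick": ["core.saveFile", "core.validate"],
    "core.saveFile":  ["io.write", "core.validate"],
    "core.validate":  [],
    "io.write":       [],
}
nodes = sorted(set(calls) | {c for cs in calls.values() for c in cs})

hub = {n: 1.0 for n in nodes}
auth = {n: 1.0 for n in nodes}
for _ in range(50):
    auth = {n: sum(hub[m] for m in nodes if n in calls.get(m, [])) for n in nodes}
    hub = {n: sum(auth[c] for c in calls.get(n, [])) for n in nodes}
    for scores in (auth, hub):          # normalize each score vector
        norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
        for n in scores:
            scores[n] /= norm

print(sorted(auth.items(), key=lambda kv: kv[1], reverse=True))
# core.validate, called from two well-connected methods, gets the top
# authority score in this toy graph.
```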


While ASDGs and topology analysis use both control and data dependencies to some extent, Trifu [214] introduced an approach to feature location based only on dataflow. The technique identifies the implementation of the functionality needed to produce a certain set of values called information sinks. A programmer defines the information sinks as a starting point, and then all other variables that contribute to them are found by tracking dataflow dependencies, making this approach more fine-grained than most other feature location techniques. Tool support for the approach is provided by CoDEx, and a case study was performed on JHotDraw. Variables with no outgoing dataflow paths were automatically identified as information sinks, and the tool grouped 6,049 variables into 310 concerns (features). The approach was improved with the introduction of information sources [215], which define a boundary for a concern.

2.2.3 Textual Feature Location

Textual information embedded in source code comments and identifiers provides important clues about where features are implemented. This type of information has been used in many approaches for feature location in three main ways: textual search with grep [157], information retrieval [45, 85, 142, 165], and natural language processing [103, 201]. The articles that introduce an approach to textual feature location are summarized below.

One simple and straightforward way in which programmers often search for source code that is relevant to their task is textual search. They formulate a query that describes what they are looking for and then use a tool such as grep to find and investigate lines of code that match the query. Petrenko et al. [157] developed a feature location technique based on grep and ontology fragments. The ontology fragments record programmers' knowledge of a feature. As programmers gain more knowledge of the system, the ontology fragments can grow and be extended. Petrenko et al. hypothesized that the use of ontology fragments would improve query formulation and, in turn, the effectiveness of feature location. Case studies on Eclipse and Mozilla showed that with ontology fragments, on average, only nine methods in Mozilla and ten methods in Eclipse had to be inspected. These results were comparable to other feature location techniques [130, 159] in which programmers also only had to examine about ten methods.

Instead of pattern matching with grep, more sophisticated methods such as information retrieval can be used. Marcus et al. [134] employ Latent Semantic Indexing (LSI) [59] to locate features in source code. LSI is an advanced information retrieval technique that analyzes the relationships between words and passages in large bodies of text. LSI is applied to source code by extracting all identifiers and comments to form a corpus. As is common in most IR approaches, compound identifiers are split following observed naming conventions, and both the original identifier and its separate parts are added to the corpus. The corpus is partitioned into documents representing all terms found within a program element. Documents can be of different granularities such as classes or methods. The corpus is then transformed into an LSI subspace through Singular Value Decomposition (SVD). After SVD, each document in the corpus has a corresponding vector. To search for code relevant to a feature, a programmer formulates a query consisting of terms that describe the feature. The query is also transformed into a vector, and a similarity measure between the query vector and all the document vectors is used to rank documents by their relevance to the query. The similarity measure is known as the cosine similarity because it computes the cosine of the angle between the query and document vectors. The use of LSI for feature location was evaluated on the Mosaic web browser and compared to grep and ASDGs, and several advantages were found. LSI is almost as flexible as grep yet yields better results. Also, LSI was able to identify some relevant program elements missed by ASDGs.
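The pipeline can be illustrated with the NumPy sketch below on a toy corpus of invented method "documents"; real implementations add tf-idf weighting, identifier splitting, and stop-word removal, which are omitted here.

```python
# LSI in miniature: term-by-document matrix, SVD, query folding, and
# cosine-similarity ranking. All documents and the query are invented.
import numpy as np

docs = {
    "SpellChecker.check": "check spelling word dictionary suggest",
    "Editor.save":        "save file buffer disk",
    "Dictionary.lookup":  "lookup word dictionary entry",
}
terms = sorted({t for text in docs.values() for t in text.split()})
A = np.array([[text.split().count(t) for text in docs.values()] for t in terms],
             dtype=float)                       # terms x documents

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                           # dimensionality of the LSI space
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T          # one row per document

query = "spell check word"
q = np.array([query.split().count(t) for t in terms], dtype=float)
q_vec = q @ U[:, :k]                            # fold the query into the space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

names = list(docs)
ranked = sorted(names, key=lambda d: cosine(q_vec, doc_vecs[names.index(d)]),
                reverse=True)
print(ranked)        # SpellChecker.check is expected to rank first here
```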

Poshyvanyk and Marcus [165] introduce an extension to feature location with LSI that uses formal concept analysis (FCA) [83] to cluster IR-based results. FCA takes as input a matrix specifying objects and their associated attributes and then produces clusters, called concepts, of the objects based on their shared attributes. These concepts can be organized hierarchically in a lattice. In this case, the objects are methods and the attributes are words that appear in the source code of those methods. To combine the two types of analyses, LSI's results are automatically organized using FCA. The top k attributes of the first n methods ranked by LSI are used to construct FCA's input matrix and create a lattice. Nodes in the lattice have associated attributes (terms) and objects (methods), and programmers can focus on the nodes with attributes similar to their query to find feature-relevant methods. In an evaluation on Eclipse, this approach was compared to LSI and found to be more efficient in terms of the number of methods programmers must consider before locating a relevant one.

Cleary and Exton [44, 45] also use information retrieval for feature location, but their contribution is to incorporate non-source code artifacts. Their approach, called cognitive assignment, considers indirect correspondences between query and document terms so that relevant source code can be retrieved even if it does not contain the query terms. Queries are expanded by analyzing term relationships from both source code and non-source code artifacts. An extensive case study was conducted on Eclipse in which cognitive assignment was compared to language modeling [241], a dependency language model [84], the vector space model [196], and LSI, and cognitive assignment was found to be competitive with the other approaches.

The results of any query-based textual feature location technique will only be as good as the query used. Often, the query needs to be iteratively modified or refined. Gay et al. [85] introduce the notion of relevance feedback into textual feature location with IR. Relevance feedback incorporates user input to improve information retrieval results. After IR returns a ranked list of program elements relevant to a query, the developer rates the top n results as relevant or irrelevant. Then a new query is automatically formulated, new results are returned, and the process repeats. A case study was performed in which a single developer was asked to use IR and relevance feedback to locate the source code associated with change requests (representing features) in Eclipse, jEdit, and Adempiere (http://sourceforge.net/projects/adempiere/). Each change request had approved patches that had already been implemented in the system. The results indicate that relevance feedback is more effective and efficient than a pure IR-based approach.
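Relevance feedback is commonly realized with a Rocchio-style query update, sketched below on plain term-frequency vectors; the specific weighting scheme used by Gay et al. may differ, and all documents and weights here are illustrative.

```python
# Rocchio-style query reformulation: move the query toward documents the
# developer marked relevant and away from those marked irrelevant.
from collections import Counter

ALPHA, BETA, GAMMA = 1.0, 0.75, 0.15            # conventional Rocchio weights

def vec(text):
    return Counter(text.split())

def rocchio(query, relevant_docs, irrelevant_docs):
    new_q = Counter()
    for term, w in vec(query).items():
        new_q[term] += ALPHA * w
    for doc in relevant_docs:
        for term, w in vec(doc).items():
            new_q[term] += BETA * w / len(relevant_docs)
    for doc in irrelevant_docs:
        for term, w in vec(doc).items():
            new_q[term] -= GAMMA * w / len(irrelevant_docs)
    return {t: w for t, w in new_q.items() if w > 0}

query = "autosave interval"
relevant = ["save buffer autosave timer interval"]
irrelevant = ["print page margin"]
print(rocchio(query, relevant, irrelevant))
```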

Like information retrieval, Independent Component Analysis (ICA) [47] can examine source code text to identify features and their implementations [90]. ICA is a blind signal analysis technique that separates a set of input signals into statistically independent components. To apply ICA for feature location, a term-by-document matrix is constructed in which the rows correspond to methods, columns represent terms, and cells contain the frequency of a term in a method. ICA factors the matrix into two new matrices. The first new matrix, called the source signal matrix, stores independent signals which can be thought of as features. The second new matrix, the mixing matrix, holds information about how relevant each signal is to a method. Unlike feature location with LSI, feature location with ICA does not need a query for a specific feature since it seeks to identify multiple independent signals (features) at once.
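The sketch below applies scikit-learn's FastICA to a tiny, invented method-by-term matrix to show how independent components and per-method activations can be read off; it mirrors the general idea rather than the original study's exact setup.

```python
# FastICA on a fabricated method-by-term frequency matrix. Each component
# plays the role of an independent "signal" (a candidate feature); the
# activations indicate how strongly each method expresses each signal.
import numpy as np
from sklearn.decomposition import FastICA

terms = ["save", "file", "spell", "check", "word", "print"]
methods = ["Editor.save", "Io.writeFile", "SpellChecker.check", "Dictionary.lookup"]
X = np.array([
    [3, 2, 0, 0, 0, 0],     # Editor.save
    [2, 3, 0, 0, 0, 1],     # Io.writeFile
    [0, 0, 3, 3, 2, 0],     # SpellChecker.check
    [0, 0, 1, 1, 3, 0],     # Dictionary.lookup
], dtype=float)

ica = FastICA(n_components=2, random_state=0)
activations = ica.fit_transform(X)       # methods x signals
signals = ica.mixing_.T                  # signals x terms (term profile per signal)

for i, signal in enumerate(signals):
    top = [terms[j] for j in np.argsort(-np.abs(signal))[:3]]
    print(f"signal {i}: top terms {top}")
for name, act in zip(methods, activations):
    print(name, np.round(act, 2))
```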

Textual analysis is not limited to information retrieval. Shepherd et al. [201] employ natural language processing (NLP). They observe that in source code, actions are represented by verbs, and nouns correspond to objects. Their technique, implemented in a tool called Find-Concept, leverages information about the use of verbs and their direct objects (nouns) in source code identifiers to create a natural language representation of the code called an action-oriented identifier graph (AOIG) [204]. In the AOIG, verb-direct object pairs are mapped to each of their occurrences in the code. The approach has three main steps: initial query formulation, query expansion, and a search of the AOIG. First, a user creates a query consisting of a verb and a direct object. Then, Find-Concept expands the query using NLP and its knowledge of the terms used within the software's source code to recommend new queries. Once the user refines the query, the tool locates nodes in the AOIG that contain a verb and direct object from the query and returns the methods to which they are mapped. Find-Concept then identifies dependencies between the methods returned by the AOIG search and presents the user with a visualization of the results as a graph. In a user study, Find-Concept's verb-direct object approach was compared to lexical searches using Eclipse and Google Eclipse Search [166] on a suite of open-source Java systems. Overall, Find-Concept was found to be the most effective search technique.

Hill et al. [103] also use NLP and the idea of query expansion and refinement in their approach to feature location based on contextual searching. Instead of focusing on verbs and direct objects, their analysis centers on three types of phrases: noun phrases, verb phrases, and prepositional phrases. They extract phrases from method and field names and generate additional phrases by also looking at a method's parameters. Once the phrases are extracted, they are grouped into a hierarchy based on partial phrase matching. The phrases are linked to the source code from which they were extracted. A user looking for a particular feature formulates a query, and the tool searches the extracted phrases for matches. The result returned to the user is a hierarchy of phrases and the method signatures associated with them, giving some context to the results. This approach was compared to Shepherd et al.'s [201] on the same software systems. Contextual search was shown to significantly outperform the verb-direct object approach in terms of effort (number of queries needed) and effectiveness (F-measure).

2.2.4 Combined Dynamic and Static Feature Location

The combination of dynamic and static analysis is well known and powerful in other areas of research such as testing and program analysis [60, 79]. Feature location researchers have also made use of this combination in their own work. Dynamic analysis can be used to reduce the search space to only those program elements that were executed in a trace, and then static analysis can work on the smaller set of program elements to rank them or find additional relevant elements.

Eisenbarth et al. [73, 74, 75, 76] use FCA to cluster information collected via dynamic analysis. FCA's input matrix is composed from execution traces: the objects are methods and the attributes are the features invoked during an execution scenario. After FCA is performed, the resulting concept lattice can be interpreted to identify candidate program elements that are solely relevant to a feature or that contribute to a feature but are also used by other features. The program elements located by FCA are only a starting point, and programmers seeking additional relevant code can follow an approach similar to ASDGs [39]. Eisenbarth et al.'s approach was evaluated at the method level; Koschke and Quante [118] extended this work to locate features at the level of basic blocks.
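A brute-force Python sketch of deriving formal concepts from such a trace-based context is shown below; the methods, features, and scenario data are invented, and real lattices are built with dedicated FCA tooling rather than this enumeration.

```python
# Formal concepts from a toy context: objects are methods, attributes are
# the features whose scenarios executed the method.
from itertools import combinations

context = {
    "SpellChecker.check": {"spellcheck"},
    "Dictionary.lookup":  {"spellcheck", "autocomplete"},
    "Editor.insertText":  {"autocomplete", "editing"},
    "Editor.save":        {"editing"},
}

# Closed intents: all intersections of the methods' attribute sets.
intents = {frozenset(a) for a in context.values()}
for r in range(2, len(context) + 1):
    for combo in combinations(context.values(), r):
        intents.add(frozenset.intersection(*map(frozenset, combo)))

for intent in sorted(intents, key=len, reverse=True):
    extent = sorted(m for m, feats in context.items() if intent <= feats)
    print(sorted(intent), "->", extent)
# Singleton intents such as {'spellcheck'} group the methods specific to one
# feature; larger intents expose methods shared across features.
```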

While Koschke and Quante combined dynamic and static analyses in an approach that is more fine-grained than methods, Rohatgi et al. [186, 187] combine the two at a coarser level of granularity: classes. Their approach uses an execution trace and a class or component dependency graph (CDG) for feature location based on impact analysis. Distinct classes are extracted from a feature-specific execution trace, and then the CDG is used to rank the classes by the impact a change to them would have on the software system. The classes with the least amount of impact are most likely related to the feature. In an evaluation on Weka (http://www.cs.waikato.ac.nz/ml/weka/), a data mining tool, the approach was able to identify and highly rank classes noted in the system's documentation.

2.2.5 Combined Dynamic and Textual Feature Location

Dynamic and textual analyses are very synergistic when it comes to their use in feature location. Dynamic analysis generally yields good recall, while textual analysis has good precision. Their combination may lead to better results on both fronts. Both analyses can be used to rank program elements by their relevance to a feature, so a logical next step is to combine both of the rankings produced by these techniques. Another rational combination of dynamic and textual analyses is to use dynamic analysis to filter the program elements for textual analysis instead of ranking all the program elements in a software system.

PROMESIR (Probabilistic Ranking of Methods Based on Execution Scenarios and Information Retrieval) [159, 160] performs feature location by combining "expert" opinions from two existing feature location techniques: SPR [5] and information retrieval with LSI [142]. The two approaches both rank program elements according to their relevance to the feature of interest. Those rankings are combined through an affine transformation to produce PROMESIR's results. The weight given to SPR or LSI can be varied to reflect the amount of confidence that should be assigned to each expert. Case studies performed on Eclipse and Mozilla show that PROMESIR typically outperforms the two techniques on which it is based.
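The affine combination at the heart of PROMESIR can be sketched as follows; the expert scores and the weight below are invented, and the original work calibrates the weight to reflect confidence in each expert.

```python
# Combining two experts' scores with an affine transformation.

spr_score = {"m1": 0.90, "m2": 0.40, "m3": 0.10}   # dynamic-analysis expert
lsi_score = {"m1": 0.20, "m2": 0.85, "m3": 0.30}   # IR expert (cosine similarity)

lam = 0.5                                           # equal confidence in both experts
combined = {m: lam * spr_score[m] + (1 - lam) * lsi_score[m] for m in spr_score}

for method, score in sorted(combined.items(), key=lambda kv: kv[1], reverse=True):
    print(method, round(score, 3))
# m2 (0.625) edges out m1 (0.55) here; changing lam changes the ranking.
```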

Like PROMESIR, the SITIR (Single Trace + Information Retrieval) [130] approach to feature location combines execution information and IR, but only a single execution trace is collected for a feature. Then, using a query relevant to the feature and LSI, only the executed methods from the trace are ranked by their similarity to the query instead of all methods in the system. In case studies on jEdit and Eclipse, SITIR generally ranked relevant methods higher than LSI [142], SPR [5], or PROMESIR [160]. Liu et al. [129] developed a variant of SITIR called TAG, short for TrAce + Grep. TAG performs tracing first and uses grep instead of LSI, with the reasoning that grep is more lightweight. Liu et al. replicated SITIR's case studies with TAG; however, the results could not be compared directly because TAG's output is not ordered.

2.2.6 Combined Static and Textual Feature Location

Several researchers have combined static and textual analyses for feature location. This combination is a natural choice because either textual analysis can be used to reduce the overestimation that static analysis is prone to or static analysis can be used to find additional candidate program elements given a starting set of highly relevant ones from textual analysis. Thus, uniting these two types of analysis has the potential to yield better results than either static or textual analysis alone.

SNIAFL [243, 244] is a static, non-interactive approach to feature location. SNIAFL uses information retrieval in conjunction with a branch-reserving call graph (BRCG), essentially an expanded version of a call graph with branch information. An initial set of program elements (methods) specific to the feature is located using information retrieval, and then additional relevant elements are found using the BRCG. The initial set is produced by using the vector space model [196] to obtain and rank methods by their similarity to a query. A gap threshold technique is used to find the largest difference between the similarities of consecutively ranked methods; the methods above this gap are considered to be the initial elements specific to the feature. From the initial set, the BRCG is pruned to remove branches that are not in the initial set. Also, the relevance of branches that are included in the initial set is propagated through the graph's dependencies, essentially generating a static pseudo-execution trace. In case studies on two GNU software systems, SNIAFL had better precision and recall than both a pure IR approach and a purely dynamic approach, lending evidence to the claim that combining static and textual analyses is more successful than using them as standalone techniques.
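The gap threshold idea is easy to illustrate; the ranked similarities below are fabricated, and the cut-off logic is a minimal sketch rather than SNIAFL's exact procedure.

```python
# Gap threshold: rank methods by query similarity, find the largest drop
# between consecutive scores, and keep everything above that drop.

ranked = [("parseConfig", 0.81), ("loadConfig", 0.78), ("applySettings", 0.74),
          ("drawToolbar", 0.31), ("openFile", 0.27)]

gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
cut = gaps.index(max(gaps)) + 1          # the largest drop separates the initial set

initial_set = [name for name, _ in ranked[:cut]]
print(initial_set)                       # ['parseConfig', 'loadConfig', 'applySettings']
```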

Like SNIAFL, Dora [102] combines static and textual analysis to perform feature location. Programmers formulate a query, which is used to compute a method relevance score based on the term frequency-inverse document frequency of words that appear in methods' names. Then, starting from a set of seed methods defined by the programmer, Dora follows static caller/callee edges to identify additional relevant methods using the relevance score. Dora was evaluated on a number of open-source Java systems and compared to Suade and two naive textual and static approaches. The benchmark for these systems was determined by a user study [183] in which programmers were asked to locate the implementations of several features. Dora was found to be the most successful technique in the evaluation.

In Dora and SNIAFL, one type of analysis is used to prune another. Shao and Smith [200] combine information retrieval and static control flow information in a different manner for feature location. First, LSI is used to rank all the methods in a software system by their relevance to a query. Then, for each method in the ranked list, a call graph is constructed. A method's call graph is inspected to assign it a call graph score. The call graph score counts the number of a method's direct neighbors that also appear in LSI's ranked list. Finally, the method's cosine similarity from LSI and its call graph score are combined using an affine transformation, and a new ranked list is produced. This technique has only been evaluated in one case study where it was compared against LSI on a C++ program called iVistaDesktop, which simulates Microsoft's Windows Vista. The study showed that this approach ranked the one relevant method of a change request first, while LSI ranked it seventh.

Ratiu and Deissenboeck [169, 170] use ontologies to recover the mapping between source code and real-world concepts. Their approach is not explicitly aimed at feature location but at linking program elements to concepts, which could be features. They developed a framework that describes semantic defects caused by improper naming and an algorithm to recover the mappings between ontology elements and program elements. The algorithm maps concepts and program elements via graph matching. Concepts are represented in an ontology, and programs are represented by a UML-like dependency graph. The framework and algorithm have been applied to the Java standard library, finding actual examples of semantic defects.

2.2.7 Combined Dynamic, Static, and Textual Feature Location

Cerberus [62] is a feature location technique that utilizes three types of analysis: dynamic, static, and textual. Currently, it is the only approach that leverages all three types of analysis. At the core of Cerberus is a technique called prune dependency analysis (PDA), whereby a relationship between a program element and a feature exists if the program element should be removed or modified if the feature were to be pruned from the software system. Given an initial set of relevant elements to be pruned, PDA infers additional relevant elements. Cerberus uses PROMESIR to combine rankings of program elements from execution traces with rankings from information retrieval to produce seeds for PDA. Cerberus' authors created a large benchmark for Rhino (http://www.mozilla.org/rhino/), an open-source Java implementation of JavaScript, in which the code for over 400 features defined in the system's documentation was manually located. This benchmark was used to evaluate Cerberus against software reconnaissance [229], SPR [5], DFT [77], and LSI [142], finding that combining the three types of analysis was the most effective approach.

2.2.8 Other Feature Location Techniques

Only four feature location techniques surveyed do not rely on dynamic, static, or textual analysis. Instead, they use other types of analysis and sources of information to locate features. One looks at a developer's program exploration behavior, while the others utilize historical, archived information. The use of alternative types of analysis in conjunction with dynamic, static, and textual analyses remains an open issue.

Robillard and Murphy [180] propose a unique approach to feature location that automatically analyzes a transcript of a program investigation session in an integrated development environment. The transcript records which program elements were visible to a developer during a maintenance task and how they were accessed: through a code browser, following a cross-reference, recalling an open window or tab, scrolling, or keyword search. For each event in the transcript, all visible program elements are determined. Then, for each visible element, a probability that it is the element in which the programmer was interested is assigned to it. The probabilities are based on weights associated with each event type. Next, a correlation metric is calculated between all pairs of program elements. The correlation is based on how closely two elements were accessed in the transcript. Finally, concerns (features) are generated by clustering program elements, and the concerns can be named and saved for later retrieval.

CVSSearch [38] is an approach and tool for feature location that searches for source code by using CVS log comments. CVS comments generally describe the changes made to the lines of code being committed, and those comments typically hold true for many future revisions of the software. The tool maps CVS comments to their corresponding revision and then examines the changes between consecutive versions to map source code to comments. A user can enter a query, and CVSSearch (http://cvssearch.sourceforge.net/) returns all lines of code whose comments contain at least one of the query words. Each returned line also has a score indicating how well it matches the query.

Like CVSSearch, Robillard and Dagenais [178] also use historical information from a repository for feature location. They use change history to identify clusters of program elements related to a task (i.e., a feature). Given a query consisting of a set of program elements, their approach groups repository transactions by the number of nearest-neighbor program elements they share and returns a cluster of elements related to the query. Various filtering heuristics can be applied to the results to remove program elements that are unlikely to be related. For instance, if a program element is modified in a high percentage of all of the transactions in the repository, it can be ignored. The approach was evaluated on 12 years of change data for seven open-source systems, and the evaluation found that only a small fraction of changes would have been helped by change clusters.

Hipikat [217, 218] is a feature location approach that also makes use of archival information for feature location, but instead of identifying candidate program elements, Hipikat recommends artifacts from a project's archives such as online documentation, versions, bugs, or communications. Hipikat forms a group memory [219] from a project's history as recorded in source code repositories, issue trackers, communication channels, and web documents. Links between these artifacts are inferred using IR. For example, a source code version can be linked to a bug report if the bug's id is included in a repository commit log message. This history is used to find relevant artifacts in response to a user query. The query consists of an artifact, potentially a program element, for which the user wants recommendations of related artifacts. Hipikat responds with a list of artifacts ranked by their relevance. The tool has been used in two case studies. In the first, Hipikat was validated on AVID (http://people.cs.ubc.ca/~murphy/AVID/), and in the second, it was used to aid programmers performing a change to Eclipse.

2.3 Feature Location Tools and Studies

In addition to the many research articles that introduce feature location techniques, there are numerous articles describing feature location tools, case studies, industrial studies, and user studies. This section summarizes these tools and studies.

2.3.1 Tools

Tool support for feature location removes much of the manual burden associated with searching for a feature's program elements. In addition to providing an overview of existing feature location techniques, this survey also describes tools that can be used for feature location. Some of the techniques summarized in Section 2.2 have prototype tools that are not available; therefore, they are not listed here. Also, some tools are not directly associated with any particular approach, but they can be used for feature location, feature documentation, or program exploration, so they are included here.

2.3.1.1 Tools for Dynamic Feature Location

Some of the earliest feature location techniques relied on dynamic analysis since features are typically visible during the execution of a software system. Feature location tools take advantage of this visibility.

RECON, RECON2, and RECON3 (http://www.cs.uwf.edu/~recon/) are tools that implement the software reconnaissance [229] approach to feature location described in Section 2.2.1. Wilde and Casey [227] report on applying RECON to industrial software. In their study on using software reconnaissance for program exploration, Wilde and Casey found the tool to be very selective because it never marked more than 13 methods for a feature. They also observed that the tool seemed to find code that was near relevant program elements. In a second part of their study, they examined using software reconnaissance for traceability to build a large mapping of multiple features to code. The tool was used to run a large set of test cases that were marked with the features they exhibited, and then the collected traces were analyzed to find traceability relations that mapped features to code. With this traceability knowledge, a programmer modifying a program element is aware of the other features implemented in that program element.

Ibrahim et al. [105] also report on their experiences applying RECON2 to the Generate Index (GI) project. Their findings echo the conclusions of the previous study. Software reconnaissance is based on test cases, but selecting appropriate scenarios to execute can be difficult; however, only a few test cases are generally needed for a feature. After the analysis, software reconnaissance is good at locating a starting point for feature location, but further investigation for additional relevant program elements should be performed.

STRADA (Scenario-based TRAce Detection and Analysis) [69] is a tool to help developers uncover the mappings between features and code during testing. It is based on Egyed's trace analysis research [67, 70, 71, 72]. Given a set of test cases for a feature, STRADA observes the code that is executed during testing, initially identifying all the executed code as relevant to the feature. However, since not all of the invoked code actually pertains to the feature, STRADA analyzes the traces using logical constraints to exclude irrelevant program elements. The tool visualizes its knowledge of feature-to-code mappings in a matrix. It has been evaluated on ArgoUML (http://argouml.tigris.org/), GanttProject (http://www.ganttproject.biz/), and a video-on-demand player (http://peace.snu.ac.kr/dhkim/java/MPEG/).

2.3.1.2 Tools for Static Feature Location

Static feature location tools analyze the dependencies and relationships among program elements, similar to the way a developer might explore a program. The two tools discussed here realize the ASDG and topology analysis static feature location techniques.

Ripples [40] is a tool that implements the ASDG approach to feature location. The tool extracts an ASDG from C code and visualizes it for the programmer, who can mark relevant nodes. JRipples (http://jripples.sourceforge.net/) [35] is a similar tool that supports the approach for Java source code in Eclipse, except without the visualization. Both tools can also be used for impact analysis and change propagation by tracking and monitoring the status of program elements.

Suade [224], an Eclipse plug-in, is another tool that statically performs feature location. It implements the topology analysis [176] approach discussed in Section 2.2.2. Suade has been used in a case study comparing several program exploration tools [56], and it has also been directly compared to Dora [102], another static feature location technique.

2.3.1.3 Tools for Textual Feature Location

Textual feature location tools search for relevant program elements based on a user query. The standard utilities for searching source code are grep and an integrated development environment's built-in search functionality. The tools presented here go beyond these basic search techniques by employing information retrieval.

Google Eclipse Search (GES, http://ges.sourceforge.net/) [166] is an Eclipse plug-in that facilitates efficient source code searching and browsing by integrating Google Desktop Search (GDS, http://desktop.google.com/) and Eclipse. GDS is an off-the-shelf component that uses information retrieval. It allows users to search for files on their computer similar to the way they would search for information on the Internet by issuing a query. By integrating GDS with Eclipse, programmers can search source code in a similar fashion. One advantage of using GDS is that it unobtrusively re-indexes the search space when the source code changes. Also, compared to Eclipse's file search functionality, GES is considerably faster.

IRiSS [163] and JIRiSS [162] are both tools for textual feature location. IRiSS implements information retrieval-based feature location as an add-on for MS Visual Studio .NET, while JIRiSS is an Eclipse plug-in. Both tools work like a development environment's built-in search functionality, but instead of only displaying the lines of code that match a query, those lines' corresponding classes and methods are also listed. This allows a programmer to sort the results by different levels of granularity and to visit the classes or methods with the most matches. Also, since IR is used, the results returned for a query can be ranked by their relevance. JIRiSS is an extension to IRiSS that also includes fragment-based searches, software vocabulary extraction, query spell checking, and word suggestions to improve queries.

2.3.1.4 Tools for Documenting Features

Once a feature's relevant program elements have been found using feature location, they should be saved so that the search does not have to be repeated in the future. Features and their relevant program elements can be documented in ConcernGraphs [179, 182], a model that describes which program elements pertain to a feature. Tool support for ConcernGraphs is provided by FEAT (http://www.cs.mcgill.ca/~swevo/feat/) [181], the Feature Exploration and Analysis Tool, as well as ConcernMapper [184]. ConcernTagger (http://www.cs.columbia.edu/~eaddy/concerntagger/) extends ConcernMapper with the ability to compute a number of concern-specific metrics. The Feature Location and Textual Tracing Tool (FLAT3, http://www.cs.wm.edu/semeru/flat3/) [198] also extends ConcernMapper by adding automated support for textual and dynamic feature location. FLAT3 is discussed in detail in Chapter 6. In each of these tools, programmers can define and name features and then associate entire or partial classes, methods, and fields with them. The tools, except for FLAT3, leave feature location as a manual task and focus on documenting features and their related elements once they are found. However, once the features and their program elements are documented, they can be saved and retrieved at a later time, thus avoiding the need to repeat searches.

2.3.1.5 Visualization Tools

Visualization tools for feature location either allow for the exploration of dynamic information or highlight candidate program elements in source code. The visualizations generally create an abstracted global view of the system in which relevant program elements are emphasized.

TraceGraph [133] is a feature location tool that allows for the visualization of execution traces. As a software system is running, TraceGraph analyzes the execution and visualizes which program elements were invoked during a time interval. The visualization is essentially a matrix in which the rows represent program elements, the columns correspond to time intervals, and the cells indicate whether the program elements were called during that time interval or not. Additionally, the first invocation of a program element is highlighted in the visualization. TraceGraph was evaluated on the Mosaic web browser as well as the Joint Surveillance Target Attack Radar System (Joint STARS), a proprietary system developed by Northrop Grumman for the United States Air Force. The tool's visualization was useful for feature location because it emphasized the first time an element was called, which often corresponded to a feature being triggered. TraceGraph was also applied in an industrial case study on feature location [207], where it was used for trace differencing and identifying code uniquely executed by a feature, and in another study on distributed simulation software [230].

Like TraceGraph, the prototype tools created by Bohnet and Doellner [19, 20, 21, 22, 23, 24] also visually explore dynamically extracted information, but in this case, as a call graph. Since a call graph can be large, in order to reduce the search space for the user, the tools provide cues to identify code relevant to the feature of interest. The tools also provide a number of different types of visualizations. In one prototype, a graph exploration view shows other methods that pass control flow to or receive control flow from a given method. In this view, the tool only shows methods in a neighborhood if they are judged to be relevant based on execution time, while another tool has textual and 3D landscape views. These tools effectively extract dynamic call graph information and guide programmers during navigation.

Instead of relying on dynamic information, AspectBrowser (http://cseweb.ucsd.edu/~wgg/Software/AB/) [96] is a tool that assumes that features follow the idea of information transparency [95]: design decisions that cannot be encapsulated in a single module use a common signature or terminology that can easily be exploited by search tools. The AspectBrowser tool allows users to search a code base using pattern matching and then visualizes the results in two ways. All query matches can be highlighted in the source code, and the programmer can browse to find them. Alternatively, programmers can use a global view to see how a feature is scattered throughout the system. In the view, each line of code is represented by a row of pixels, and highlighted rows indicate lines of code that match the query. Multiple search results can be viewed at once to understand the interaction between several features.

Xie et al. [235] also developed a suite of tools that visualize feature location results based on textual analysis, but instead of grep, they use information retrieval. IRiSS [163] performs feature location via IR. Then, sv3D (source viewer 3D) [135] creates 3D renderings of the results, showing poly-cylinders that represent program elements. The colors of the poly-cylinders correspond to the elements' similarity to the query following a pre-defined color scheme. The height of the poly-cylinders represents browsing history, so the taller the cylinder, the more times the program element was visited in the past.

2.3.1.6 Program Exploration Tools

Program exploration tools support developers when performing a variety of maintenance tasks. Since feature location is central to many maintenance activities, program exploration tools can be used for feature location as well.

JQuery [107], an Eclipse plug-in, is a source code browsing tool designed to help programmers when dealing with features that have scattered implementations. The tool combines navigation based on relationships (as in a hierarchical browser) with the flexibility of query languages. Program exploration in JQuery begins with a query and a list of variables. The query determines which elements to show in the browser, and the variables establish how to organize them into a tree. The query defines the type of program element to search for given some parameters such as its name or a type of relationship. The results of the query are returned in a hierarchical tree, and users can further explore the tree with additional queries that expand nodes into sub-trees. The tool aims to reduce the burden of program investigation on developers. It helps them remain oriented by not having to switch between multiple views, and it records their exploration path in the tree format.

Ferret [55] is a tool for answering conceptual queries, which are questions a programmer may have about a software system. The model on which Ferret is based supports the composition and integration of different sources of information into a queryable knowledge base. A source of information is known as a sphere, and examples include structural relationships in source code, dynamic call information from an execution trace, and revision history recorded in a software repository. Ferret supports 36 different conceptual queries such as "What calls this method?", "What are this class' subclasses?", "What are all the fields declared by this type?", and "What transactions changed this element?" These types of queries represent questions programmers may have when investigating a software system in order to locate a feature's implementation.

De Alwis et al. [56] performed a comparative study of three program exploration tools: JQuery [107], Ferret [55], and Suade [224]. Eclipse was used as a baseline for comparison. They hypothesized that programmers would find it easier to work with a tool, would need to examine less code as compared to using Eclipse, and would generally gain a better understanding of the task at hand. The participants in the study were 18 professional programmers, and they were asked to investigate two change tasks in jEdit. In the first task, they used only Eclipse, and in the second task, they used one of the exploration tools. The order of the tasks and choice of tools was randomized. An instrumented version of Eclipse captured all events the programmers performed during their investigation. Additionally, the participants recorded the relevant elements they found in an Eclipse view built for the study. The NASA Task Load Index (TLX) [99] was used to assess task difficulty, and distance profiles were used to gauge the degree to which the participants remained on-task. The TLX scores showed no difference in task difficulty that could be attributed to using a tool or not. Similarly, the distance profiles did not indicate that the tools had any strong effect on the tasks. Overall, the authors concluded that program exploration tools had little effect on the tasks and that individual programmers' strategies caused them to be more or less efficient.

2.3.2 Case Studies

A number of case studies involving feature location have been performed, including comparisons of existing techniques, industrial case studies, and user studies. Each type of study is useful. Comparisons evaluate several feature location techniques on the same systems and features, making it easier for researchers and practitioners to understand the strengths and weaknesses of each approach. Industrial case studies show the applicability of an approach in non-trivial settings. Finally, user studies provide insight into how programmers understand and search for code, and these insights can be incorporated into feature location techniques and tools.

2.3.2.1 Case Studies Comparing Feature Location Techniques

Early feature location techniques were applied when procedural programming was the predominant paradigm. After object-oriented programming gained popularity, Marcus et al. [141] studied whether feature location was still needed, since object-oriented code is supposed to be structured such that classes implement singular concepts. They compared the performance of three static feature location techniques: pattern matching with grep, a depth-first dependency search [39], and information retrieval using LSI [142]. Three programmers, each assigned to a different technique, participated in a case study to locate features in Art of Illusion (http://aoi.sourceforge.net/), a 3D modeling studio written in Java, and in Doxygen (http://www.stack.nl/~dimitri/doxygen/), a source code documentation generator written in C++. They concluded that object-orientation does not always allow for quick and easy identification of the program elements relevant to a feature. Therefore, feature location techniques are still needed for object-oriented systems.

When a new feature location technique is introduced, it is often directly compared with similar existing approaches as part of its evaluation. Some articles related to feature location focus solely on case studies comparing several techniques. Wilde et al. [225, 226] compare software reconnaissance [229], ASDGs [39], and grep in a case study to locate two features in legacy FORTRAN code. The system, CONVERT3, is part of a suite of geometric modeling programs and is used to convert models to formats required by other tools. For the study, three teams each used one of the feature location techniques to find the code for two features of CONVERT3. The software reconnaissance and ASDG teams were able to gain sufficient understanding of the source code, but the team using grep was not. The authors concluded that grep was the least reliable approach, but it is very quick and can locate features that cannot be explicitly invoked dynamically. After grep, software reconnaissance was deemed to be the next fastest method of feature location. However, its results may not present a user with enough context to be comprehensible. The ASDG approach was the most difficult to apply but was the most systematic and allows for the best understanding of the relevant code.

Revelle and Poshyvanyk [173] performed an exploratory study evaluating several feature location techniques that return ranked lists of program elements (methods). The approaches they compared were information retrieval (LSI-based feature location [142]), information retrieval plus dynamic analysis (SITIR [130]), and information retrieval plus dynamic and static analysis (similar to Cerberus [62]). For IR, they assessed user-formulated queries as well as method seed queries, in which the text of a method already known to be relevant to a feature was used as the query. For dynamic analysis, they used both full execution traces and marked traces, in which only the portion of a system's execution when a feature is invoked was traced. Dynamic analysis was combined with IR by pruning unexecuted methods from the ranked list. When all three types of analyses were combined, a program dependence graph was traversed starting from a seed by following dependencies only if they were executed and had textual similarities to the query above a given threshold. Most feature location techniques that return a ranked list are evaluated in terms of where the first relevant element appears on the list. This case study aimed to evaluate these approaches in terms of how well they find near-complete implementations of features, meaning how well they find as many relevant program elements as possible. Their conclusions were that none of these approaches perform particularly well in that regard, since feature location is usually used to find a starting point and impact analysis tools are used to find more complete implementations. They did note that marked traces generally outperformed full traces and that the method seed queries, which can be automatically generated, performed just as well as user-formulated queries. More details on this exploratory study are given in Chapter 3.

2.3.2.2 Industrial Case Studies

Most feature location case studies focus on open source software. However, case studies carried out on industrial software give a sense of a technique's real-world applicability. Unfortunately, only a few such studies have been performed, and more are needed. As previously discussed, TraceGraph was used in an industrial setting [133], and Wilde et al. [226] compared a number of approaches on industrial software. In addition to these studies, Van Geet and Demeyer [221] report on their experiences of applying Eisenbarth et al.'s [76] feature location technique, based on formal concept analysis of execution traces, in an industrial setting. The context was a pre-study phase for the migration of a banking system written in COBOL. Scenarios for two features were executed using a web interface, and three separate iterations of the approach were conducted. Each iteration attempted to reduce the number of modules considered by using different combinations of scenarios that did and did not invoke the feature. A domain expert provided the modules relevant to each feature for evaluation purposes, and in two out of three iterations of the approach, all the relevant modules were in the generated concept lattice. Three additional relevant modules were also identified that had not previously been named by the domain expert.

2.3.2.3 User Studies

Studies that focus on how programmers search and comprehend source code are important to feature location research. These types of studies provide insights into how developers find a feature's implementation or gain understanding of a system. In turn, these insights can be incorporated into feature location research in order to develop approaches that are organic and easy for programmers to use. Four relevant user studies are discussed below, and while this is not an exhaustive list of user studies related to feature location, even more studies are necessary to advance the state of the art.

LaToza et al. [123] performed a user study in which 13 participants worked for three hours on understanding and improving the design of two features in jEdit. The participants' activities were recorded using think-aloud, video, and Eclipse instrumentation. The goal of the study was to answer questions about how programmers' experience affects the changes they make to code, how it affects how they work, and how they reason about design during coding tasks. LaToza et al. found that the more experienced programmers addressed the causes of problems while beginners focused on the symptoms, and that the experienced programmers identified relevant methods and implemented changes more quickly than the novices. They also discovered that the participants' activities centered on fact finding. The programmers sought facts relevant to their task, so they investigated certain methods and learned facts about the software system, and as they learned enough facts, they were able to propose design changes. Therefore, feature location techniques should not only help identify relevant program elements, but they should also aid in fact finding and program comprehension.

Robillard et al. [177] also conducted a study on how programmers explore source code when performing a change task. Five programmers were asked to modify jEdit so that users can explicitly disable the autosave functionality. They were also given five requirements for their solution. The data collected included artifacts produced or modified by the participants as well as video recordings of their screens. The programmers' success was judged in terms of time to complete the task and by how many of the task's requirements they successfully implemented. Robillard et al. analyzed the behavior of each participant by transcribing the screen videos into events. Each event recorded the time it occurred, the method being examined at that time, how the method was accessed (scrolling, browsing, searching, etc.), and whether the method was modified. Based on their observations, they concluded that a methodical, ordered investigation of a system's source code is more effective than an opportunistic one. They found that programmers should follow a plan when exploring a program, that they should perform focused searches in the context of their plan, and that they should keep a record of their findings. Based on these findings, feature location techniques should facilitate orderly program exploration.

Revelle et al. [172] undertook two exploratory studies on how programmers identify features and their implementations in source code. In the first study, the features of GNU sort (http://www.gnu.org/software/coreutils/) plus their relevant source code were found manually by one programmer and then compared to those of Carver and Griswold [36]. In the second study, two programmers manually located features and their implementations for a Java implementation of the Minesweeper game. Revelle et al. compared the actual concepts recognized as features as well as the code associated with those features, looking for common trends in how developers identify and locate features. Based on their observations, they developed a set of guidelines for how to recognize the existence of a feature and how to record a feature's associated code in a tool called Spotlight [50]. The guidelines suggest relying on both static and textual information and mapping features to program elements at various levels of granularity.

Ko et al. [114, 117] performed an exploratory study to investigate developers' strategies for understanding unfamiliar code. Ten participants worked using Eclipse on five maintenance tasks associated with the Paint application (http://www.cs.cmu.edu/~marmalade/studies.html). Screen-capture videos recorded the developers' work during the study. To simulate a more realistic working environment, the programmers were interrupted every 2.5 to 3.5 minutes and required to answer a multiplication question. Monetary incentives were offered for correctly completing the tasks, and penalties were inflicted for ignoring or incorrectly answering the multiplication questions. The study found that programmers interleave three activities when exploring source code: searching for relevant code either manually or with tools, following the dependencies of found relevant code, and collecting relevant code and information in Eclipse's interface (i.e., package explorer, tabs, and scroll bars). However, searches often failed, Eclipse's navigation tools imposed overhead when following dependencies, and developers lost track of relevant code in the interface. On average, 35% of a developer's time was spent reviewing search results and on navigation. Based on the observations of this study, Ko et al. make a number of suggestions for future tool development. First, tools need to provide better relevance cues so programmers do not miss important code or misinterpret irrelevant code. Second, dependency searches need to be more practical, such as by highlighting the dependencies of the currently selected program element. Third and finally, programmers need a better way to collect, organize, and view the relevant information they find, such as being able to see all relevant information for a given task at once. These recommendations may help programmers find task-relevant code more quickly and efficiently and were used in the design of a new tool [116].

2.4 Discussion and Open Issues

Feature location is an essential aspect of many software maintenance tasks, and because it can be challenging to perform manually, researchers have introduced many techniques to lessen the burden of searching for a feature's relevant code. Even with these numerous approaches and advancements, open issues remain in the field of feature location. One question that remains unanswered is, "What is the best way to perform feature location?" This question cannot be easily answered without an extensive comparison of analysis-specific issues and a comparison of all existing approaches. Such a comparison could be facilitated by well-established benchmarks. Currently, there is no commonly accepted set of features mapped to the code that implements them that could be used to compare feature location techniques. Such a benchmark is needed in the research area. Additionally, while there are various techniques that support feature location, not all approaches have publicly available tools, and the tools that are available do not support both locating and documenting a feature's implementation. Other open issues are usability studies of feature location techniques and integrating feature location into software engineering courses. The remainder of this section discusses these open issues and their associated avenues for future research. While this discussion brings to light these important topics, more panels and workshops, like the one on the identification of concepts, features, and concerns in source code held at the International Conference on Software Maintenance in 2005 [140], are necessary to resolve many of these issues.

2.4.1 Comparisons

Given the wide variety of existing techniques, developers who need to perform feature location have many options, but which approach is the best for a specific situation? What parameters should be used for a certain type of analysis? Which type of analysis yields the best results, or is a combination of analyses the best? Some case studies have been performed comparing multiple feature location techniques [173, 225, 226], but they only have a few data points, which impedes the ability to draw statistically significant generalizations from their results. These studies are also limited in the number of examined approaches, focusing on a subset of approaches that present results in a similar fashion. An obstacle to comparing techniques is the presentation of their results. How does one compare a result set that ranks program elements with another that does not? Determining the best way to directly compare the performance of feature location techniques remains an open issue.

Not only does there need to be a comparison of techniques based on different types of analysis, but there also needs to be an evaluation of the best configuration of each type of analysis. For instance, dynamic analysis has many possible options for collecting traces. The granularity of execution traces can be classes, methods, or even lines of code. In addition, the entire execution can be logged from start-up to shutdown, or only select portions of the run can be captured. With static analysis, as with dynamic, granularity is also a parameter. Additionally, the type of dependencies (control or data) to take into account is another issue to consider. With textual analysis, preprocessing options such as stemming and stop word removal are commonly used, but their effect on feature location has not been fully studied. Also, textual analysis can be achieved through information retrieval methods or through natural language processing. While the varied IR methods have been compared, the effectiveness of IR and NLP has not been compared in the context of feature location. A comparison would determine if the extra expense associated with NLP is worth the precision, or if the less expensive IR methods are sufficient. Thorough investigations comparing these different configurations of each type of analysis would reveal the most favorable settings for feature location.

There are many other open issues in feature location that could potentially be resolved through comparisons. The main types of analyses are dynamic, static, and textual, but historical analysis has also been used, though not in conjunction with any other analysis. It remains to be seen if combining historical analysis with any of the others is a viable approach to feature location. Just as three types of analysis comprise the majority of existing techniques, two programming languages dominate the area of feature location. Most existing approaches have been applied to Java or C/C++. However, feature location should branch out to support more languages so it can be applied to more software systems.

2.4.2 Benchmarks

The comparison of feature location techniques would be facilitated by the existence of benchmarks that could be used to consistently evaluate the approaches. Currently, there are a number of systems that have been used in the evaluation of many feature location techniques, such as Eclipse, JHotDraw, jEdit, Mozilla, and Firefox, but the features used for the evaluation are not consistent. Even if two approaches are evaluated on the same system, if different features are used, comparing the two techniques is difficult. Another problem with assessing feature location is in knowing the "gold set" of program elements that implement a feature. The most commonly used method for determining the source code that is relevant to a feature is to mine bug tracking systems. However, the code associated with a feature may be incomplete if a bug fix only touches part of a feature's implementation. In the presence of these issues, the field of feature location research needs to establish standards for validation. The best solution may be benchmarks that can be used to easily compare approaches. The benchmarks would consist of a set of features from a software system or several systems. Each feature would be mapped to the source code that implements it. Ideally, the benchmarks would have variable granularity so that they could be applied to approaches that identify relevant classes, methods, and variables. Robillard et al. [183] and Eaddy et al. [61] have taken a step in this direction, making available their data sets in which programmers who were not necessarily system experts mapped features to methods and fields in open source Java applications. Still, well-established and complete benchmarks of systems from a variety of domains and languages will make the evaluation and comparison of feature location techniques easier.

2.4.3 Tools

Even though this survey encompasses many tools that support feature location, the majority of feature location techniques do not have a publicly available tool, meaning programmers wanting to apply such an approach may need to recreate the technique's methodology. Additionally, some tools are useful for investigating a program and locating features (see Section 2.3.1), while others can be used to store the mappings between features and source code [181, 184], but currently only one tool does both [198]. Combining the functionalities of finding features' implementations and being able to save them is a logical next step for tool development. Finally, de Alwis et al. [56] found that existing tools have little effect on programmers' efficiency, so further research is needed to improve the effectiveness of tools.

2.4.4 User Studies

While there have been several user studies investigating how programmers search and explore source code during maintenance, these studies are based on a small number of users. Further studies are needed with more users to be able to derive conclusive results. Additionally, there has been a lack of studies examining usability aspects of feature location. Do existing feature location techniques reduce the amount of time and effort developers spend on maintenance? What are the practical benefits and costs of using different types of approaches? For instance, collecting execution traces for an approach that uses dynamic analysis requires overhead in terms of the time spent to develop scenarios or test cases and capture traces. Information retrieval involves indexing the source code of a software system, which can be time-consuming. Studies are needed to determine whether or not the overhead of collecting traces or indexing a corpus yields improved feature location results and is worth the cost.

2.4.5 Feature Location and Education

Given that feature location is such an extensive area of research and also an important part of software maintenance, it should be taught in software engineering courses at universities and colleges. Petrenko et al. [34, 155] have argued for the inclusion of software maintenance and evolution in software engineering courses along with traditional development. Teaching maintenance exposes students to more realistic experiences since in industry, 70% or more of programmers' time is devoted to maintenance [199, 209]. Feature location is a significant part of the maintenance phase since before changes can be made to a system, the relevant program elements must be found. Therefore, feature location should be introduced as a topic in software engineering courses to better prepare students.

2.5 Conclusion

Through a comprehensive examination of 88 feature location articles encompassing research, tools, and case, industrial, and user studies, this survey has presented a taxonomy that classifies the literature along nine key dimensions. The taxonomy facilitates the comparison of existing feature location techniques and illuminates possible areas of future research. Researchers can use the taxonomy and survey as a basis for advancing the field, while practitioners can use it to identify techniques and tools that are well-suited to their needs. This survey has also shed light on open issues in feature location such as the need for comparisons and benchmarks. By structuring and organizing the research area of feature location, this taxonomy and survey contribute clarity to the field and should aid in resolving some of the open issues.

This chapter has given a comprehensive overview of existing work in feature location.

The next two chapters cover our novel contributions to the area. Our work compares and expands on some of these existing techniques. Chapter 3 presents an exploratory study comparing feature location techniques based on combinations of dynamic, static, and textual analyses. Chapter 4 introduces new feature location techniques that incorporate web mining algorithms with textual and dynamic analyses.

Chapter 3

An Exploratory Study on Assessing Feature Location Techniques

Software maintenance and evolution tasks first require programmers to understand the implementation of specific parts of an existing software system [125]. To do so requires locating the source code that implements functionality, an activity known as concept assignment [12] or feature location. The previous chapter gave an overview of this research area and described existing feature location approaches. Most feature location techniques have been shown to be effective at finding a starting point of a feature's implementation, i.e., one method that is relevant to that feature [130, 160, 165]. However, it is rarely the case that a single method is the sole contributor to a feature. These techniques leave it up to programmers to find the other methods that implement a feature.

For feature location approaches to be truly effective, they need to find near-complete implementations of features. We define the term near-complete to denote a partial but close to total set of methods that implement a feature, since knowing all the methods that implement a feature is rather subjective. One programmer may consider a method relevant, while another may not [183]. This chapter presents an exploratory study of ten feature location techniques that use various combinations of textual, dynamic, and static analyses. The approaches are evaluated in terms of how well they locate near-complete implementations of several features in the jEdit and Eclipse software systems. As part of the assessment, we designed easy-to-follow guidelines for evaluating feature location techniques. Additionally, we explored a new mechanism for formulating queries used by textual analysis that automatically constructs a query from the identifiers of a method.

Our results highlight the challenge of feature location since no single technique was universally successful. We provide observations of situations when the approaches do and do not work well. One promising result is that our new means of automatically creating a query for textual analysis performs comparably to a query formed by a human. We used the results of this exploratory study to guide the development of new feature location techniques presented in Chapter 4.

3.1 Feature Location Techniques

A feature is a functional requirement of a program that produces an observable behavior which users can trigger. Examples include spell checking in a word processor or drawing a shape in a paint program. The term feature is intentionally defined weakly in the literature so it is suitable in many situations [5, 76].

Feature location is the activity of identifying the source code elements that implement a feature [12]. We investigate several approaches to locate the source code associated with a feature using textual, dynamic, and static analyses. Next, we explain each type of analysis and how we combined them in this work.

3.1.1 Core Techniques

Textual analysis. The implementation of a feature, even if dispersed among many methods, may use a consistent vocabulary in terms of identifiers and the words appearing in comments [95]. One approach to locate features is to determine textual similarities between a user query and source code elements (e.g., methods). A query is a set of words formulated by a user that describe a feature. Alternatively, a query can be automatically composed of the identifiers and comments in a method that is known to be relevant to a feature. In either case, textual analysis and feature location can be performed using the information retrieval technique known as Latent Semantic Indexing (LSI) [59]. With LSI, the relation between terms (words) and documents (methods) can be discovered. In brief, comments and identifiers are extracted to form a corpus. LSI indexes the corpus and creates a signature for each document (method), and these indices are used to define similarity measures between methods. Users can formulate queries in natural language (nl-queries) or by using the identifiers of a known relevant method (method-queries). LSI returns a list of all the methods in the software ranked by their textual similarity to the query. An advantage of this approach is that a working version of the source code is not required. However, if a program's identifiers are not meaningful, the results can be negatively affected.

For large systems, a ranked list with thousands of methods is a formidable amount of information, unless the majority of the methods that implement the feature appear near the top of the ranked list. Often, a threshold is set to limit the number of methods that users consider. The threshold may be set by a cut point, as in only the top n or only the top x percent of results are considered. Alternatively, the threshold can be set at a specific value such that only the results with a similarity greater than or equal to the threshold are considered. Determining an appropriate threshold is an open research problem.
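As an illustration only, the following Python sketch shows this ranking step; it is not part of the tooling evaluated in this chapter. It assumes each method's identifiers and comments have already been extracted into a string, uses scikit-learn's TruncatedSVD as a stand-in for an LSI implementation, and applies a cut point of ten to the ranked list. The corpus entries and the query are made up for the example.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical corpus: one preprocessed string of identifiers/comments per method.
corpus = {
    "EditPane.initPainter": "init painter edit pane caret thick text area",
    "SearchDialog.updateEnabled": "update enabled search dialog reverse regexp",
    "TextUtilities.findMatchingBracket": "find matching bracket text utilities",
}
nl_query = "configuration global option thick caret text area block"

methods = list(corpus)
vectorizer = TfidfVectorizer()
doc_term = vectorizer.fit_transform(corpus.values())   # term-by-document matrix
query_term = vectorizer.transform([nl_query])

lsi = TruncatedSVD(n_components=2)                      # a few hundred dimensions in practice
doc_vectors = lsi.fit_transform(doc_term)               # one signature per method
query_vector = lsi.transform(query_term)

similarities = cosine_similarity(query_vector, doc_vectors)[0]
ranked = sorted(zip(methods, similarities), key=lambda pair: -pair[1])
for method, score in ranked[:10]:                       # cut point of ten
    print(f"{score:.3f}  {method}")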

Dynamic Analysis. Using dynamic information is another approach to feature location [229, 234]. Dynamic information complements textual information since not all methods relevant to a feature may use a similar vocabulary, but they may be executed when a system is run. To collect dynamic information, an executable version of the system must be available. Users develop scenarios that trigger a feature. A scenario is a sequence of user inputs to a system. As scenarios are being exercised, traces can be collected. A trace is a list of events that occurred during the system's execution. Events can be method invocations, object instantiations, and variable accesses. This work focuses on method calls.

There are two types of traces that we consider. A full trace [229] captures all events from a system's start-up to shutdown. A marked trace [130, 192] only captures events during part of a system's execution. When the system is running, users can start and stop tracing. By starting tracing immediately before triggering a feature and stopping tracing once the feature's behavior is observed, more of the events (methods) listed in the trace should pertain to the feature because irrelevant events are not traced.
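For the purposes of combining analyses, a trace only needs to yield the set of methods that were executed. The sketch below is purely illustrative and assumes a simple one-event-per-line trace format with START/STOP markers for marked traces; the actual tracer output in this work differs.

def executed_methods(trace_path, marked=False):
    """Return the set of executed methods recorded in a trace file.

    Assumes a hypothetical format with one "ENTER <Class>.<method>" event per
    line and START/STOP lines written when the user marks a feature.
    """
    executed = set()
    recording = not marked          # full traces record everything
    with open(trace_path) as trace:
        for line in trace:
            line = line.strip()
            if line == "START":     # user begins exercising the feature
                recording = True
            elif line == "STOP":    # feature behavior has been observed
                recording = False
            elif recording and line.startswith("ENTER "):
                executed.add(line[len("ENTER "):])
    return executed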

Static Analysis. Dynamic information is only as good as the scenarios used to collect traces. If scenarios fail to invoke a feature in a certain way, relevant methods may be missing from an execution trace. Since static analysis does not rely on a program's execution, statically collected information can compensate for dynamic analysis' weaknesses [79]. Static analysis can provide a wealth of information on different types of dependencies such as control flow, data dependence, and inheritance. For this work, we use light-weight static analysis and focus on method caller-callee relationships by using a static call graph in which nodes are methods and edges represent method invocations. We obtain such a graph using JRipples (http://jripples.sourceforge.net/) [35].

Additional methods relevant to a feature can be found by exploring a static call graph. Starting at a seed method, one that is known to be relevant to a feature, other methods pertinent to the feature can be discovered by traversing the graph. Executing a program may not invoke a relevant method, but if that relevant method has a static dependency with the seed method, static analysis can locate it. However, in the case that a method related to a feature has no static dependencies with the seed, static analysis will fail to locate relevant source code.

3.1.2 Combined Techniques

Textual, dynamic, and static information complement each other, so in theory, when working in tandem, they should produce better results than when used individually. In this work, we investigate the following combinations of analyses that produce a ranked list of results. We limit this exploratory study to techniques that rank methods to be better able to compare and evaluate the techniques.

Textual Analysis. The first feature location technique we consider employs only textual analysis, and we consider it to be our baseline approach. We evaluate two configurations of textual analysis, one using nl-queries as in [130] and one using our new method-queries. We call these approaches IRquery and IRseed, referring to the fact that the textual analysis used is a form of information retrieval. The IRquery approach was introduced by Marcus et al. [142], whereas IRseed is a new version of this technique.

Textual Analysis plus Dynamic Analysis. We also examine the combination of textual and dynamic analysis for feature location. To combine these analyses, methods that are not executed are removed from the ranked list provided by textual analysis [130]. We investigate all configurations of the different types of queries and traces. Abbreviated, these configurations are IRquery + Dynmarked, IRquery + Dynfull, IRseed + Dynmarked, and IRseed + Dynfull, where "Dyn" stands for dynamic analysis and the subscripts denote the type of trace (i.e., full or marked). IRquery + Dynmarked is like the SITIR approach [130], while IRquery + Dynfull is somewhat similar to the PROMESIR approach [160] because it uses LSI and full traces. The two other approaches are novel combinations.
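As a sketch of this combination (reusing the illustrative helpers defined earlier, which are assumptions rather than the actual implementation), pruning reduces to a single filter over the IR-ranked list:

def prune_unexecuted(ranked, executed):
    """Keep only the IR-ranked methods that appear in the execution trace."""
    return [(method, score) for method, score in ranked if method in executed]

# e.g., combining the LSI ranking with a marked trace:
# executed = executed_methods("thick_caret.trace", marked=True)
# ir_dyn_marked = prune_unexecuted(ranked, executed)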

Textual, Dynamic, and Static Analyses. The final feature location technique we evaluate incorporates all three types of analyses. Again, we investigate all configurations of queries and traces in conjunction with static analysis: IRquery + Dynmarked + Static, IRquery + Dynfull + Static, IRseed + Dynmarked + Static, and IRseed + Dynfull + Static. The IRquery + Dynfull + Static approach is conceptually similar to Cerberus [62], but instead of using prune-dependency analysis, it uses light-weight static analysis. The other three combinations are new.

Unlike combining textual and dynamic analysis, utilizing static analysis does not involve pruning an existing ranked list. Instead, static analysis entails exploring a call graph to find relevant methods and then ranking them once exploration stops. Searching begins at a seed method that is known to be relevant to the feature. The static neighbors of the seed (i.e., callers and callees) are examined to see if they meet textual and dynamic criteria. The textual criterion is a threshold similarity value, and the dynamic criterion is whether the method appears in a given trace. If a method's textual similarity is above the threshold and it was executed, it is added to the list of results, and its neighbors are added to a list of methods to be examined. Once the list of methods to examine is empty, exploration stops and the list of results is sorted by textual similarity. Cerberus [62] uses all three types of analyses. We did not use Cerberus because it does not produce a ranked list of methods and the other techniques in our evaluation do. Therefore, for the sake of comparison, we developed our own combination of textual, dynamic, and static analyses.
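The exploration just described can be summarized by the following sketch. The call graph, similarity scores, and executed-method set are assumed inputs; names such as call_graph and similarity are illustrative rather than part of the actual tool chain.

from collections import deque

def explore_call_graph(seed, call_graph, executed, similarity, threshold):
    """Breadth-first walk from the seed, keeping executed, textually similar methods."""
    results = []
    visited = {seed}
    worklist = deque(call_graph.get(seed, []))           # static neighbors of the seed
    while worklist:
        method = worklist.popleft()
        if method in visited:
            continue
        visited.add(method)
        if method in executed and similarity.get(method, 0.0) >= threshold:
            results.append(method)
            worklist.extend(call_graph.get(method, []))  # examine its neighbors too
    # Exploration has stopped; rank the accepted methods by textual similarity.
    return sorted(results, key=lambda m: -similarity.get(m, 0.0))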

In total, we investigate ten different feature location techniques, many of which are novel because they involve method-queries. There are other possible combinations of textual, dynamic, and static analysis that we decided not to study, such as dynamic and static analysis together. We decided against including these other approaches in our study since they do not produce a ranked list and the results of using standalone versions of static and dynamic analyses are available elsewhere [39, 130, 160]. The details of how we evaluated and compared the ten approaches described above are provided in the next section.

3.2 Exploratory Study

We performed an exploratory study to evaluate the feature location techniques described in the previous section. The goal of the study was to determine which combination of analyses provides the best results and under what circumstances. This section outlines the software systems used in the study, our research goals, and the specifics on how we used each type of analysis.

3.2.1 Research Questions

We set out to seek the answers to a number of research questions in this exploratory study. These research questions (RQ) are as follows:

• RQ1: What is the best combination of textual, dynamic, and static analyses for feature location? Specifically, which techniques are most effective at finding multiple feature-relevant methods?

• RQ2: Which type of IR query produces better results in terms of finding multiple methods associated with a feature: an nl-query provided by a user (i.e., requiring human effort in formulating a query) or a method-query using the text of a seed method (completely automatic)?

• RQ3: Which type of execution trace, marked or full, is better at discovering numerous methods that implement a feature?

Since this study is exploratory in nature, we did not know what to expect as the outcome, so we did not formulate any hypotheses. However, intuition and previous research results led us to conjecture that the approaches that incorporated more types of analyses would perform better than those with fewer.

3.2.2 Subject Systems

For our study, we chose two open-source Java software systems of different sizes and from different domains. jEdit (http://www.jedit.org/) is a highly configurable and customizable text editor. We used version 4.3pre16, which consists of approximately 105 KLOC in 910 classes and 5,530 methods. We selected four features from jEdit to study. These features were chosen from feature requests with submitted patches in the "Patches" section of the system's online tracking software.

• Patch #1608486, Support for "Thick" Caret - Add a configurable option to make the cursor two pixels wide instead of one so it is easier to see.

• Patch #1818140, Edit History Text - Add the ability to edit the history text of searches.

• Patch #1923613, Reverse Regex Search - Add the ability to search backwards with regular expressions.

• Patch #1849215, Bracket Matching Enhancements - Add the ability to match angle brackets.

Eclipse (http://www.eclipse.org/), the other system in our study, is a popular integrated development environment. We used version 2.1, and it has approximately 2.3 MLOC in over 7,000 classes and 89,000 methods. Like with jEdit, we selected four features from its bug tracking system. With Eclipse, we chose fixed bugs corresponding to misbehaving features. These bugs are:

• Bug #5138 - Double-click-drag to select multiple words is broken. (https://bugs.eclipse.org/bugs/show_bug.cgi?id=5138)

• Bug #31779 - UnifiedTree should ensure file/folder exists. (https://bugs.eclipse.org/bugs/show_bug.cgi?id=31779)

• Bug #19819 - Add support for Emacs-style incremental search. (https://bugs.eclipse.org/bugs/show_bug.cgi?id=19819)

• Bug #32712 - Repeated error message when deleting and file is in use. (https://bugs.eclipse.org/bugs/show_bug.cgi?id=32712)

3.2.3 Input to the Analyses

Textual Analysis. We formulated the nl-queries used by textual analysis by reviewing the description and comments in the thread for the patch/bug in jEdit and Eclipse's tracking systems. The nl-queries are listed in Table 3.1. The method-queries consist of the identifiers from the seed methods also listed in the table. The seed methods were randomly chosen from the patch for each feature to ensure that they do actually pertain to the feature.

Dynamic Analysis. We created one usage scenario per feature to collect traces in this study. Descriptions of the scenarios are in Table 3.1. We devised the jEdit scenarios by reading the description and comments for the patch in the bug tracking software. For Eclipse, two bug reports (#5138 and #32712) had steps to reproduce the errors, and those steps were used as the scenarios for those two features. The scenario for bug #31779 is the same as the one used in [130]. For bug #19819, a scenario was created in which the behaviors of the incremental search feature, as described in the bug report, were exercised.

Static Analysis. The seed methods used as the starting point of static analysis are listed in Table 3.1. They were the same methods used for constructing the method-queries and were randomly selected from the feature's patch. As explained in Section 3.1.2, static analysis starts at the seed method and branches out in part based on a textual similarity threshold. To determine the textual similarity threshold to set for examining neighbors in a static call graph, we adapted the gap threshold technique [142, 244]. A gap threshold is found by determining the largest difference between two adjacent textual similarity values in a ranked list. The threshold is set as the larger of the two values at this location in the list. We adapted this technique to incorporate a relaxation strategy. If the size of a ranked list did not reach our minimum (e.g., ten methods), then we decreased the threshold by 0.05 and repeated the procedure.
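The following sketch illustrates the adapted gap threshold with relaxation. The 0.05 step and the minimum of ten methods come from the description above; the explore callable (for example, the call graph exploration sketched earlier) and all other names are illustrative assumptions, not the actual implementation.

def gap_threshold_with_relaxation(similarities, explore, minimum=10, step=0.05):
    """Pick a threshold at the largest gap, then relax it until enough methods are found."""
    scores = sorted(similarities, reverse=True)      # assumes at least two scores
    gaps = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]
    threshold = scores[gaps.index(max(gaps))]        # the larger value at the biggest gap
    results = explore(threshold)
    while len(results) < minimum and threshold > 0:
        threshold -= step                            # relaxation strategy
        results = explore(threshold)
    return threshold, results

# e.g.:
# threshold, methods = gap_threshold_with_relaxation(
#     similarity.values(),
#     lambda t: explore_call_graph(seed, call_graph, executed, similarity, t))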

Table 3.1: Queries, scenarios, and seed methods for each feature.

jEdit Patch #1608486 - Support for "thick" caret.
  Query: configuration global option thick caret text area block
  Scenario: Start jEdit; click "Global Options" button then "Text Area;" start tracing; click "thick" checkbox then "OK;" stop tracing; exit.
  Seed method: EditPane.initPainter (49 LOC, 114 terms)

jEdit Patch #1818140 - Edit the entries in the History Text.
  Query: history text edit string menu
  Scenario: Start jEdit; click "Find" button; start tracing; right click in text area; select "Previously entered searches;" delete, insert, and modify an entry; click "OK;" stop tracing; exit.
  Seed method: ListModelEditor.createTableModel (9 LOC, 18 terms)

jEdit Patch #1923613 - Reverse searching with regular expressions.
  Query: reverse regex search regular expression
  Scenario: Start jEdit; place cursor at end of file; start tracing; click "Find" button; select "Regular Expressions" and "Backwards;" enter "[0-9]+;" click "Find" several times; stop tracing; exit.
  Seed method: SearchDialog.updateEnabled (30 LOC, 53 terms)

jEdit Patch #1849215 - Match angle brackets.
  Query: angle right find next
  Scenario: Start jEdit; start tracing; place cursor to right of "<" whose match is on same line; place cursor to right of "<" whose match is on another line; stop tracing; exit.
  Seed method: TextUtilities.findMatchingBracket (147 LOC, 117 terms)

Eclipse Bug #5138 - Double-click-drag to select multiple words.
  Query: mouse double click up down drag release
  Scenario: Start Eclipse; start tracing; click and release the mouse button; click a second time quickly and hold the mouse button down, drag and select some text; release the mouse button; stop tracing; exit.
  Seed method: TextViewer.mouseUp (11 LOC, 36 terms)

Eclipse Bug #31779 - UnifiedTree should ensure file/folder exists.
  Query: unified tree node file system folder location
  Scenario: Start Eclipse; start tracing; create a file from the file system in a project; refresh; stop tracing; exit.
  Seed method: UnifiedTree.addChildren (53 LOC, 108 terms)

Eclipse Bug #19819 - Emacs-style incremental search.
  Query: incremental search
  Scenario: Start Eclipse; start tracing; press Ctrl+J; type search criteria; use up and down arrow keys to find matches; stop tracing; exit.
  Seed method: IncrementalFindAction.run (14 LOC, 23 terms)

Eclipse Bug #32712 - Repeated error message.
  Query: delete resource project file folder fail
  Scenario: Start Eclipse; create a simple project; add a file; edit foo.doc externally; start tracing; delete the project; stop tracing; exit.
  Seed method: ResourceTree.standardDeleteProject (78 LOC, 216 terms)

3.2.4 Relevancy Assessment

Each combination of analyses is a feature location technique that produces a ranked list of methods suggested to be relevant to a feature. We restrict our evaluation to the top ten methods of each list because other researchers have shown that users are generally unlikely to look at more than ten elements on a list [157, 247]. If most of the top ten suggestions provided by a feature location approach are false positives, then the effort that would be needed to examine more results lower in the list is likely to not be worth the cost. In reviewing the top ten methods returned by each technique, there needs to be well-defined criteria for judging whether a method is relevant to a feature or not. In almost all cases, the methods that implement a feature are not documented; otherwise feature location would not be necessary. Therefore, other ways of determining a method's relevance to a feature are needed. One option is to present the top ten suggestions to an expert. If no expert is available, then if a bug related to the feature has been fixed, the methods in the patch can be used. However, the bug may only pertain to a small subset of the feature's relevant methods, so relying on a patch may give an incomplete picture of a feature's implementation. For this reason, we decided not to use this evaluation approach, even though we had patches for each feature.

An alternative is to ask programmers to identify relevant methods by exploring the source code. Robillard et al. [183] provided some guidance to participants asked to locate methods relevant to features. The participants were instructed to decide if a method was relevant by asking if it would be useful to know if the method was related to the feature if the feature had to be modified in the future. We take a similar but adapted approach in our evaluation. Instead of asking programmers to locate relevant methods on their own, we present them with lists of methods and ask them to determine the relevance of each method.

In our study, the participants were provided with code and an executable, a description of a feature and how to invoke it, and the following guidelines for how to determine if a method is relevant to the feature or not.

1. Method names that are similar to the words in the feature's description are good indicators of possibly relevant code, but the method's source code should be inspected to ensure the method is actually relevant to the feature.

2. Determine if the method is relevant to the feature by asking "Would it be useful to know that this method is associated with the feature if I had to modify the feature in the future?"

3. If most of the code in the method seems relevant to the feature, classify the method as Relevant. If some code within the method seems relevant but other code in the method is irrelevant to the feature, classify the method as Somewhat Relevant. If no code within the method seems relevant to the feature, classify it as Not Relevant.

4. If unable to classify the method by reviewing its code, explore the method's structural dependencies, i.e., what other methods call it and are called by it. If the method's dependencies seem relevant, then the method probably is also.

Having a number of programmers follow these guidelines and focusing on the agreement between the programmers eliminates any one individual's bias. We classified every method in the resulting ranked lists for all eight features without knowing which technique produced each list. To give support to the resulting categorizations, we solicited volunteers to also classify methods and compared them to ours. Four students volunteered to participate in this study. The students were enrolled in a graduate-level software engineering course.

They were given ten ranked lists each containing ten methods. The ten lists corresponded to the ten different feature location techniques under evaluation. The students were not aware of which feature location technique produced each list. They were instructed to classify the methods based on the guidelines above (see Appendix B for the instructions given to the students), and jEdit's thick caret feature was used. The patch for this feature has six methods, and the feature location techniques were able to find between one and three of these methods in the top ten of their ranked lists.

Figure 3.1 shows the average agreement between our classifications and the student volunteers'. To demonstrate how percent agreement was calculated, consider the following example. We classified four methods from a list of ten as relevant and six as not relevant, and a volunteer classified only three methods as relevant and seven as not relevant.



Figure 3.1: Percent agreement among the volunteers and our classifications for the jEdit thick caret feature.

The volunteer's three methods were included in the four identified by us, so the percent agreement is 90%: nine times out of ten, the two programmers agreed that a method belonged in either the relevant or the not relevant category. The percent agreement was averaged over all ten lists generated by the different feature location techniques. When computing agreement between more than two programmers, all individuals involved had to categorize a method in the same way for there to be agreement. The percent agreement between us and the volunteers is high; it is always greater than 70%. The agreement declines only slightly when more individuals are taken into account. Agreement about relevant methods was highest, followed by agreement about irrelevant methods, suggesting that it is easiest to identify methods that definitely do or do not implement a feature.
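A minimal sketch of this pairwise computation, under the assumption that each rater's classifications are stored in a dictionary keyed by method name (the data structures are illustrative, not the study's actual spreadsheets):

def percent_agreement(rater_a, rater_b):
    """Percentage of methods that both raters placed in the same category."""
    methods = rater_a.keys() & rater_b.keys()
    matches = sum(rater_a[m] == rater_b[m] for m in methods)
    return 100.0 * matches / len(methods)

# e.g., in the worked example above, 9 of the 10 methods match, giving 90%.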

The average agreement among programmers about a method's relevance in this study was higher than that observed by Robillard et al. [183]. The two approaches to evaluating method relevance differ: our study provided lists of methods for programmers to judge, while Robillard et al. asked programmers to find the methods implementing a feature themselves. Also, our study allowed programmers to place methods into one of three categories to allow for uncertainty instead of a binary yes/no classification.

Table 3.2: Average percentage of the number of methods classified as relevant, somewhat relevant, and not relevant in the top ten results returned by each feature location technique for jEdit.

Technique                          Relevant   Somewhat Relevant   Not Relevant
IRquery [142]                      12.5%      15%                 72.5%
IRseed                             12.5%      20%                 67.5%
IRquery + Dynmarked [130]          30%        20%                 50%
IRquery + Dynfull [160]            15%        22.5%               62.5%
IRseed + Dynmarked                 20%        15%                 65%
IRseed + Dynfull                   15%        27.5%               57.5%
IRquery + Dynmarked + Static       30%        17.5%               52.5%
IRquery + Dynfull + Static [61]    12.5%      25%                 62.5%
IRseed + Dynmarked + Static        17.5%      17.5%               65%
IRseed + Dynfull + Static          12.5%      30%                 57.5%
Average                            17.5%      21.25%              61.25%
Standard Deviation                 7.1%       5.2%                6.9%

3.3 Results

In our study, only the top ten ranked methods returned by a feature location technique for each feature were examined. Those methods were then classified into three categories (relevant, somewhat relevant, or not relevant) as described in the previous section. The results of the jEdit and Eclipse studies are discussed in the next sections. An online appendix (http://www.cs.wm.edu/semeru/data/icpc09-feature-location/) contains the source code, classifications, and other data related to this evaluation.

3.3.1 jEdit Study Findings

The average percentages of relevant, somewhat relevant, and not relevant methods found in the top ten lists of each feature location technique are in Table 3.2 and Table 3.3. An in-depth discussion of the results is below.

RQ1. Table 3.2 lists the average number of relevant, somewhat relevant, and not relevant methods found in the top ten lists of each technique in jEdit. For jEdit, the techniques that found the most relevant methods on average were IRquery + Dynmarked and IRquery + Dynmarked + Static, with 30% of the top ten methods being relevant, meaning three methods in the top ten were relevant on average. These approaches found nearly double the amount of relevant code compared with most of the other techniques, which averaged between 12.5% and 20%. Different programmers may consider the methods classified as somewhat relevant as pertaining to the implementation of a feature, while others might not. If the somewhat relevant methods are considered important to a feature's implementation, then IRquery + Dynmarked is the best performing technique in the jEdit study, with 50% of the located methods being relevant on average. At least for jEdit, the IRquery + Dynmarked feature location technique is readily able to locate many methods implementing a feature and not just a single method.

Since the IRquery + Dynmarked and IRquery + Dynmarked + Static approaches performed the same, these results suggest that adding static analysis provides no additional benefits over a combination of only textual and dynamic analysis. However, the approach that located the most relevant methods for the edit history text feature was IRquery + Dynmarked + Static. Seventy percent of the methods in its top ten list were relevant. The methods implementing this feature have very clear structural dependencies because they can be found along the same branch of the call graph. Therefore, static analysis was easily able to identify multiple methods related to this feature. With the three other jEdit features, static analysis did not perform as expected and did not improve the number of relevant methods located. Incorporating static analysis yielded no more relevant methods than using a combination of textual and dynamic analysis. For jEdit's reverse regex search feature, a different seed than the one listed in Table 3.1 was originally selected. However, that seed method was isolated in the call graph, so static analysis could not expand far beyond it to locate more potentially relevant methods. This is one of the observed limitations of static analysis for feature location.

Another reason static analysis may not produce improved results is that even when there is a dependency between a seed method and a relevant method, they may be distant from each other in the call graph. If one method along a branch in a call graph between the seed and a relevant method is not executed or has a textual similarity below the threshold, static analysis will be unable to locate the relevant method. Therefore, the ranked list is populated with other, irrelevant methods that meet both the textual and dynamic criteria when searching the call graph.

In general, combining just textual and dynamic analysis either did not affect the number of relevant methods located (reverse regexp feature) or slightly improved the results (edit history text and angle bracket matching features) by pruning unexecuted methods from the ranked list. This result supports the findings of previous studies [130]. However, the combination of the two analyses did not find a substantial number of relevant methods for each feature.

For jEdit's thick caret feature, surprisingly, we observed that adding dynamic analysis to textual analysis produced worse results than textual analysis alone. The StandaloneTextArea.initPainter method appears to be a code clone of EditPane.initPainter, the seed method, meant to be used when jEdit is embedded in another system. The IRseed approach locates this method, but neither the IRseed + Dynmarked nor the IRseed + Dynfull approach can identify this method because it was not executed. This case highlights a challenge associated with using dynamic analysis for feature location. One solution is to create a better scenario, or perhaps, when combining textual and dynamic analysis, if a method has a high enough textual similarity, the fact that it was not executed should be ignored.

Our goal was to locate as many methods relevant to a feature as possible. Had we set out to find only a single method to use as a starting point for searching for more methods associated with a feature, the techniques we evaluated would have performed with effectiveness comparable to that reported in previous studies [130, 160]. On average, at least one relevant method was found in the top ten for each feature by every technique. However, since the average number of relevant methods found by the feature location techniques is low, this work highlights the fact that finding a near-complete set of methods that implement a feature is not simple.

RQ2. Based on the jEdit data, there is no consensus on whether an nl-query or a method-query is best. For the reverse regexp feature, the nl-query performed better, while for the thick caret feature, the method-query was best. For the two other features, both queries returned the same number of relevant methods. This result suggests that using an automatically generated query of identifiers from a seed method performs just as well as a query constructed by a human, which could eliminate much of the subjectivity inherent in formulating a query.

Even though there is no clear winner, some interesting observations can still be drawn. The nl-queries consisted of a few words, while the method-queries comprised many identifiers. The larger the seed methods, the more identifiers there generally were. The seed methods (refer to Table 3.1) varied in size from 9 LOC and fewer than 20 identifiers (edit history text) to 147 LOC and over 100 identifiers (match angle brackets). Considering only the IRquery and IRseed results, the method-query for the thick caret feature (114 terms) performed better than the nl-query (8 terms), with 30% relevant vs. 10%. The wealth of identifiers in larger methods may aid textual analysis by providing more query terms, but this trend is not universal. The seed for the angle bracket matching feature has over 100 terms, but the two types of queries performed the same.

RQ3. On average, the use of marked traces produced better results than full traces when locating relevant methods for features in jEdit, which supports the results of previous studies as well [130]. Using marked traces limits the number of methods that are traced, meaning more irrelevant methods will be pruned from a ranked list. On the other hand, full traces were better at finding methods categorized as somewhat relevant. The methods classified as somewhat relevant generally seem to be in the call chain of relevant methods but do not directly implement the feature. We can find no explanation for why full traces found more somewhat relevant methods and conjecture it may be coincidental.

The nature of a feature should be considered before deciding to use marked traces over full traces. A feature like angle bracket matching that does not have a menu interface is suitable for marked traces, but for features that involve setting options in a dialog or menu, like jEdit's thick caret and reverse regex features, full traces might be the better option.

Consider the method TextAreaOptionPane._init that adds various options for jEdit's main text area, including the thick caret option, to a dialog. This method was executed, but it did not appear in the marked trace since tracing was started after the dialog opened.

Marked traces run the risk of omitting initialization code that full traces include.

3.3.2 Eclipse Study Findings

Table 3.3 lists the average number of relevant, somewhat relevant, and not relevant methods found in the top ten lists of each technique in Eclipse. Below, we discuss the results with regard to our research questions.

Table 3.3: Average percentage of the number of methods classified as relevant, somewhat relevant, and not relevant in the top ten results returned by each feature location technique for Eclipse.

Technique                          Relevant   Somewhat Relevant   Not Relevant
IRquery [142]                      22.5%      12.5%               65%
IRseed                             12.5%      22.5%               65%
IRquery + Dynmarked [130]          25%        5%                  70%
IRquery + Dynfull [160]            25%        12.5%               67.5%
IRseed + Dynmarked                 27.5%      25%                 47.5%
IRseed + Dynfull                   27.5%      35%                 42.5%
IRquery + Dynmarked + Static       30%        12.5%               57.5%
IRquery + Dynfull + Static [61]    30%        12.5%               57.5%
IRseed + Dynmarked + Static        30%        15%                 55%
IRseed + Dynfull + Static          27.5%      22.5%               50%
Average                            24.75%     19.5%               55.75%
Standard Deviation                 5.6%       9.6%                11.2%

RQ1. For Eclipse, there were three approaches that, on average, performed the best at finding relevant methods: IRquery + Dynmarked + Static, IRquery + Dynfull + Static, and IRseed + Dynmarked + Static. Thirty percent of the top ten methods identified were relevant. When taking both relevant and somewhat relevant methods into account, the best performing approach was IRseed + Dynmarked + Static, with on average 62.5%, or slightly better than six methods out of the top ten.

Unlike in jEdit, these results suggest that static analysis does aid feature location. Examining individual features, a mixed story emerges. For bugs #19819 and #32712, adding static analysis produced no improvement over a combination of textual and dynamic analysis. Bug #5138 actually saw the number of relevant methods decrease when static analysis was used. Combining textual and dynamic analysis essentially involves eliminating unexecuted methods from a ranked list, but using static analysis entails building a new list from scratch. Only methods with a static dependency to the seed are included. Therefore, methods that are located by a combined textual and dynamic approach may not be found by one that uses static analysis. This is exactly what happened in the case of bug #5138. The seed method was isolated in the call graph, so static analysis was not able to branch out.

Feature location on bug #31779 resulted in the biggest improvement when adding static analysis. Ninety percent of the methods in the top ten list for IRquery + Dynmarked + Static were relevant, while 100% of the methods for IRquery + Dynfull + Static were. Static analysis was able to succeed with this feature because many of the relevant methods were located in the same class as the seed. The results for these two approaches for this feature may have skewed Eclipse's averages. Nevertheless, this case shows that it is possible to locate near-complete feature implementations and that static analysis is a useful tool to do so.

Overall, the combination of textual and dynamic information improved results over only textual analysis, but for one feature the use of textual and dynamic information caused the number of relevant methods located to decrease. The IRquery technique identified JavadocDoubleClickStrategy.doubleClicked as relevant to bug #5138. However, this method is not executed in the scenario because no Javadoc comments were double-clicked. Therefore, this method has no chance of being identified by an approach that uses dynamic analysis unless a new scenario is used. Alternatively, since this method has a high textual similarity, a revised combination of textual and dynamic analysis that allows for cases when a method is unexecuted but has high similarity could also solve this problem.

The purpose of this exploratory study was to learn how effective feature location techniques are at finding multiple methods relevant to a feature instead of just a single starting point. In Eclipse, all but one approach had at least 20% of its top ten located methods categorized as relevant. Most approaches found closer to 30%. These results are more encouraging than those for jEdit, but they still show room for improvement. Being able to fully locate the implementation of a feature is a difficult problem that requires further research.

RQ2. The Eclipse data showed that method-queries perform comparably to nl-queries.

This outcome is similar to what was observed in jEdit. Considering the IR_query and IR_seed results for bugs #5138 and #19819, the seed methods were short (11 LOC/36 terms and 14 LOC/23 terms), and nl-queries performed better for these features. For bug #31779, the two types of queries achieved comparable results to each other, but for bug #32712 (78 LOC/216 terms), the method-query was the winner. This possible trend of method-queries from longer methods performing better was also seen in jEdit, adding weight to the idea of automatically constructing queries from the identifiers of seed methods.

RQ3. In Eclipse, marked traces outperformed full traces slightly. When the same type of query was used, marked traces found about 5% more relevant methods than full traces.

We attribute this outcome to marked traces limiting the method invocations recorded, thus removing much noise from the resulting trace. Collecting full traces is difficult because they are very large and take time to collect, especially for a system like Eclipse. This fact, plus the better performance of marked traces, makes them the ideal choice in most cases.

3.3.3 Discussion

Based on this exploratory study, we draw a number of notable conclusions, which are discussed below.

Method-queries perform as well as nl-queries. There was no clear winner when it comes to nl-queries vs. method-queries. This result is promising because it means that automatically generated queries perform just as well as ones created by humans. We observed that method-queries from larger seeds seem to perform the best. These results motivate further exploration into strategies for formulating queries automatically.

No feature location technique is universally successful at finding near-complete implementations of features. At best, they are good at locating a few relevant methods. This research motivates the need for feature location techniques that successfully discover as many feature-relevant methods as possible.

The effectiveness of static analysis might be tied to the effectiveness of textual analysis. The biggest difference between the results of the two systems concerns the use of static analysis. In jEdit, feature location with static analysis did not produce better results than approaches without it. In Eclipse, the best techniques used static analysis.

One possible reason for this discrepancy stems from textual analysis. Static exploration of a call graph was performed using textual and dynamic criteria. If a method did not meet the textual similarity threshold, then exploration down that path of the call graph would halt. LSI generated better results for Eclipse; therefore, it is possible that static analysis was able to explore a call graph more fully and find more relevant methods in Eclipse than in jEdit. Using additional types of static dependencies along with light-weight analysis may improve results.

Marked traces slightly outperform full traces. In both systems, marked traces were able to find slightly more relevant methods than full traces due to the fact that marked traces capture a higher concentration of feature-relevant methods. However, full traces should be used for features that are invoked through menus.

LSI performs better on larger systems. One difference between the results of the two systems is that textual analysis yielded better results in Eclipse. There are two possible reasons for this outcome. First, Eclipse is a professional-grade system, so its naming conventions may be stricter than in jEdit, which would aid LSI. Another possible reason is that the performance of LSI has been shown to degrade on smaller corpora [62]. jEdit's corpus is small (about 7K terms and 5K methods) in comparison to Eclipse's (56K terms and 89K methods); therefore, LSI's ranking strategies may be more effective with Eclipse.

The textual similarity threshold selected by the gap technique was too high. We adapted the gap threshold technique with a relaxation strategy for the case in which fewer than ten methods were found. The initial textual similarity threshold selected was always too high. The relaxation strategy that we incorporated had to be used in every feature location technique involving static exploration of a call graph. In each case, the threshold had to be lowered significantly, sometimes by as much as 0.5. This observation suggests that feature-relevant methods are not always located close to each other in a call graph.
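
To make this concrete, the sketch below (Python; the function name, step size, and the exact gap rule are illustrative assumptions, not the implementation used in this study) selects an initial cutoff at the largest gap between consecutive sorted similarity values and then relaxes it until at least ten methods qualify:

    def gap_threshold(similarities, min_methods=10, step=0.05):
        # Sort the cosine similarities from highest to lowest.
        ranked = sorted(similarities, reverse=True)
        # Initial threshold: midpoint of the largest gap between adjacent values.
        gaps = [(ranked[i] - ranked[i + 1], i) for i in range(len(ranked) - 1)]
        _, idx = max(gaps)
        threshold = (ranked[idx] + ranked[idx + 1]) / 2.0
        # Relaxation: lower the threshold until enough methods pass it.
        while sum(s >= threshold for s in similarities) < min_methods and threshold > 0:
            threshold -= step
        return threshold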

3.3.4 Threats to Validity

There are several issues that may limit the generalizations that can be drawn from our results. Foremost is the subjective manner in which the results were judged. We determined the relevance of the methods found by the feature location techniques. To minimize bias, we did not know to which approach each top ten list belonged. Also, we formalized how methods were classified by creating guidelines. For one feature, we also asked several programmers to categorize the methods and compared their classifications to ours. Since the agreement between us and the students was high, it is reasonable to assume that our classifications are representative of the features.

Another subjective aspect of this work is the construction of the nl-queries and the selection of the seed methods. To form the nl-queries, we used words from the change requests and bug reports. The seed methods were randomly selected from methods that were submitted in patches to the features/bugs. Since those methods had to be changed to perform maintenance on the features, they must be relevant to the feature. However, the use of different queries and different seeds could alter the results.

Another threat to validity is that only one scenario was used to collect execution traces.

Every effort was made to ensure that the scenarios dependably captured the behavior of the features, although certain aspects may have been missed. In many cases, the scenarios were based on the descriptions given in a bug report. Finally, we only studied a small number of features from two systems, both written in Java, limiting the ability to generalize our results to other types of software systems. Eclipse is a real-world system, but jEdit is rather small in comparison. This threat can be reduced if we experiment on more systems written in other languages and taken from other domains.

3.4 Related Work

Since feature location is an important part of software maintenance, there are many existing techniques. Chapter 2 gives a comprehensive overview of the research area, while this section focuses on the related work that is most pertinent to the work presented in this chapter. This section reviews these existing approaches by categorizing them as either static, dynamic, or hybrid feature location. In addition, we offer brief discussions of how our work differs from these techniques.

Most static feature location techniques are either structural or textual. Structural approaches [13, 39, 121, 179, 176] explore the relationships among classes, methods, and other program elements to locate features. We did not explore a purely structural feature location technique in this work due to the fact that the other approaches we studied ranked methods, and obtaining a ranking from only structural information is difficult. Textual approaches use comments and identifiers to locate code relevant to a feature by utilizing such techniques as information retrieval [142, 165], independent component analysis [90], and natural language processing [201]. We have focused on using IR for textual feature location. A number of tools use both structural and textual information to locate pertinent code [102, 244] by using textual information to prune irrelevant structural relationships, or

vice versa. In our work, we have not combined structural and textual analysis on their own; rather, we have combined them in conjunction with dynamic analysis.

Software reconnaissance [229] is a dynamic approach to feature location that compares a trace of a program when a feature is invoked to a trace when the feature is not executed.

Software reconnaissance has been recently expanded and improved [5, 77]. We did not evaluate software reconnaissance because its results are not ranked.

Hybrid feature location approaches seek to leverage the benefits provided by both static and dynamic analysis. Eisenbarth et al. [76, 118] developed a technique that is mostly dynamic and applies formal concept analysis to traces to produce a mapping of features to the program's methods. However, its results are not ranked, so this technique was not included in our exploratory study. Several approaches combine LSI and dynamic information. In

PROMESIR [160], LSI is combined with SPR [5] to give a ranking of methods likely relevant to a feature. In SITIR [130], LSI is used to rank the methods appearing in a single execution trace and thereby extract code relevant to the feature of interest. In this work, we have evaluated techniques similar to the PROMESIR and SITIR approaches because they represent the state of the art.

Cerberus [62] is the only approach we are aware of that combines three types of analyses for feature location. Our work differs from Cerberus in that we investigate several alternative combinations of analyses, since Cerberus is not always able to locate methods relevant to some features. We also distinguish ourselves from Cerberus by examining the trade-offs of using textual, dynamic, and static analyses for feature location and by evaluating our approaches on small and large systems.

3.5 Conclusion

This chapter presented an exploratory study evaluating the effectiveness of ten feature location approaches at finding near-complete implementations of features. Although we did not discover an approach that clearly works best in all situations, we did observe that combining analyses generally improves the results. One promising result is that method-queries perform comparably to queries formed by a human. We also summarized cases in which certain combinations of analyses were more effective than others. We used some of these observations to guide our work on developing new approaches to feature location in the next chapter.

Chapter 4

Using Data Fusion and Web Mining to Support Feature Location

Software systems are constantly changing and evolving in order to eliminate defects, improve performance or reliability, and add new functionalities. When the software engineers who maintain and evolve a system are unfamiliar with it, they must go through the program comprehension process. During this process, they obtain sufficient knowledge and understanding of at least the part of the system to which a change is to be made. An important part of the program comprehension process is feature or concept location [5, 12], which is the practice of identifying the source code that implements functionality, also known as a feature. Before software engineers can make changes to a feature, they must first find and understand its implementation.

For software developers who are unfamiliar with a system, feature location can be a laborious task if performed manually. In large software systems, there may be hundreds of classes and thousands of methods. Finding even one method that implements a feature can be extremely challenging and time consuming. Fortunately for software engineers in this situation, there are feature location techniques that automate, to a certain extent, the search for a feature's implementation.

Existing feature location techniques use different tactics to find a feature's source code.

Approaches based on information retrieval (IR) leverage the fact that identifiers and comments embed domain knowledge to locate source code that is textually similar to a query describing a feature [142]. Dynamic feature location techniques collect and analyze execution traces to identify a feature's source code based on set operations [229] or probabilistic ranking [5]. Static approaches to feature location rely on following or analyzing structural program dependencies [39, 176].

The state of the art in feature location involves integrating information from multiple sources. Researchers have recognized that combining more than one approach to feature location can produce better results than standalone techniques [62, 76, 102, 130, 160, 244].

Generally in these combined approaches, information from one source is used to filter results from another. For instance in the SITIR approach to feature location [130], a single execution trace is collected, and then IR is used to rank only the methods that appear in the trace instead of all of the system's methods. Thus, dynamic analysis is used as a filter to IR, and filtering is one way to combine information from several sources to perform feature location. Instead of using filtering, PROMESIR [160] combines the opinions of two "experts" (scenario-based probabilistic ranking [5] and IR [142]) using an affine transformation.

The idea of integrating data from multiple sources is known as data fusion. The sources of data have their individual benefits and limitations, but when they are combined, those drawbacks can be minimized and better results can be achieved. Data fusion is used heavily in sensor networks and geospatial applications to attain better results in terms of accuracy, completeness, or dependability. For example, the position of an object can be calculated using an inertial navigation system (INS) or a global positioning system (GPS). An INS continuously calculates the position of an object with relatively little noise and centimeter-level accuracy, though over time the position data will drift and become less accurate. GPS calculates position discretely, has relatively more noise, and meter-level accuracy. However, when data from an INS and GPS are used together in the proper proportions, the GPS data can correct for the drift in the INS data. Thus the fusion of INS and GPS data produces more accurate and dependable results than if they were used separately.

Inspired by the benefits of using data fusion to integrate multiple sources of information, this work applies data fusion to feature location. This chapter presents a data fusion model for feature location that is based on the idea that combining data from several sources in the right proportions will be effective at identifying a feature's source code.

The previous chapter explored how well existing feature location techniques locate near-complete implementations of features. Since the results of the study in Chapter 3 indicated that feature location techniques are better at finding one relevant method than many, this chapter primarily focuses on finding a feature's first relevant method in a ranked list.

The data fusion model defines different types of information that can be integrated to perform feature location including textual, execution, and dependence. Textual information is analyzed by IR, execution information is collected by dynamic analysis, and dependencies are analyzed using web mining. Applying web mining to feature location is a novel idea, but it has been previously used for other program comprehension tasks, such as identifying key classes for program comprehension [239] and ranking components in a software repository

[106]. Software lends itself well to web mining approaches, because like the World Wide

Web, software can be represented by a graph, and that graph can be mined for useful information such as the source code that implements a feature.

This chapter makes the following contributions:

• A data fusion model for feature location is defined that integrates different types of information to locate features using IR, dynamic analysis, and web mining algorithms.

• An extensive evaluation of the feature location techniques defined in the model.

• New feature location techniques that have better effectiveness than the state of the art in feature location. Statistical analysis indicates that this improvement is significant.

In addition, all of the data used in the evaluation is made freely available online (http://www.cs.wm.edu/semeru/data/icpc10-data-fusion/), and other researchers are welcome to replicate this work. Making the data available will help facilitate the creation of feature location benchmarks.

The remainder of this chapter is structured as follows. Section 4.1 introduces the data fusion model for feature location. Section 4.2 outlines the evaluation methodology, and Section 4.3 discusses the results. Related work is summarized in Section 4.4, and Section 4.5 concludes.

4.1 A Data Fusion Model for Feature Location

The feature location model presented here defines several sources of information, the analyses used to derive the data, and how the information can be combined using data fusion.

4.1.1 Textual Information from Information Retrieval

Textual information in source code, represented by identifier names and internal comments, embeds domain knowledge about a software system. This information can be leveraged to locate a feature's implementation through the use of IR. Information retrieval is the methodology of searching for textual artifacts or for relevant information within artifacts.

IR works by comparing a set of artifacts to a query and ranking these artifacts by their relevance to the query. There are many IR techniques that have been applied in the context of program comprehension tasks such as the Vector Space Model (VSM) [195],

Latent Semantic Indexing (LSI) [59], and Latent Dirichlet Allocation (LDA) [15]. This work focuses on evaluating LSI for feature location, and the notation IR_LSI is used to denote that LSI is the method used to instantiate IR analysis in the model. IR_LSI follows five main steps [142]: creating a corpus, preprocessing, indexing, querying, and generating results.

Corpus creation. To begin the IR process, a document granularity needs to be chosen so a corpus can be formed. A document lists all the text found in a contiguous section of source code such as a method, class, or package. A corpus consists of a set of documents. For instance in this work, a corpus contains method-level granularity documents that include the text of each method in a software system.

Preprocessing. Once the corpus is created, it is preprocessed. Preprocessing involves normalizing the text of the documents. For source code, operators and programming language keywords are removed. Additionally, source code identifiers and other compound words are split (e.g., "featureLocation" becomes "feature" and "location"). Finally, stemming is performed to reduce words to their root forms (e.g., "stemmed" becomes "stem").
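
As an illustration of these preprocessing steps (the keyword list, splitting rule, and suffix stripping below are simplified assumptions; a real pipeline would use a complete keyword set and a proper stemming algorithm such as Porter's):

    import re

    JAVA_KEYWORDS = {"public", "private", "static", "void", "class",
                     "return", "new", "if", "else", "for", "while"}

    def split_identifier(identifier):
        # "featureLocation" -> ["feature", "location"]
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", identifier)
        return [p.lower() for p in parts]

    def naive_stem(word):
        # Crude suffix stripping standing in for a real stemmer.
        for suffix in ("ing", "ed", "es", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[:-len(suffix)]
        return word

    def preprocess(method_source):
        # Keep identifiers, drop operators and language keywords, then split and stem.
        tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", method_source)
        words = [w for t in tokens if t not in JAVA_KEYWORDS for w in split_identifier(t)]
        return [naive_stem(w) for w in words]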

Index the corpus. The corpus is used to create a term-by-document matrix. The matrix's rows correspond to the terms in the corpus, and the columns represent documents (i.e., source code methods). A cell m_{i,j} in the matrix holds a measure of the weight or relevance of the ith term in the jth document. The weight can be expressed as a simple count of the number of times the term appears in the document or as a more complex measure such as term frequency-inverse document frequency (tf-idf). Singular Value Decomposition (SVD) [195] is then used to reduce the dimensionality of the matrix by exploiting the co-occurrence of related terms.

Issue a query. A user formulates a natural language query consisting of words or phrases that describe the feature to be located (e.g., "print file to PDF format").

Generate the results. In the SVD model, each document corresponds to a vector.

The query is also converted to a vector, and then the cosine of the angle between the two vectors is used as a measure of the similarity of the document to the query. The closer the cosine is to one, the more similar the document is to the query. A cosine similarity value is computed between the query and each document, and then the documents are sorted by their similarity values. The user inspects the ranked list, generally only reviewing the top results to decide if they are relevant to the feature.
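
A minimal numpy sketch of the indexing and querying steps is shown below; the dimensionality k, weighting scheme, and query folding are assumptions for illustration and not the exact settings used in this work:

    import numpy as np

    def lsi_rank(term_doc, query, k=300):
        # term_doc: terms x documents matrix of term weights (e.g., tf-idf).
        # query:    vector of length terms built from the preprocessed query words.
        U, S, Vt = np.linalg.svd(term_doc, full_matrices=False)
        k = min(k, len(S))
        docs_k = Vt[:k, :]                                   # documents in k-dimensional LSI space
        query_k = np.diag(1.0 / S[:k]) @ U[:, :k].T @ query  # fold the query into the same space
        sims = []
        for j in range(docs_k.shape[1]):
            d = docs_k[:, j]
            denom = (np.linalg.norm(d) * np.linalg.norm(query_k)) or 1.0
            sims.append(float(d @ query_k) / denom)          # cosine similarity
        # Documents (methods) sorted by similarity, most similar first.
        return sorted(range(len(sims)), key=lambda j: sims[j], reverse=True), sims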

4.1.2 Execution Information from Dynamic Analysis

Execution information is gathered via dynamic analysis, which is commonly used in program comprehension [52] and involves executing a software system under specific conditions.

For feature location, these conditions involve running a test case or scenario that invokes a feature in order to collect an execution trace. For example, if the feature of interest in a text editor is printing, the test case or scenario would involve printing a file. Invoking the desired feature during runtime generates a feature-specific execution trace.

Most existing feature location techniques that employ dynamic analysis use it to explicitly locate a feature's implementation by analyzing patterns in traces post-mortem

[5, 76, 229]. The model presented in this work takes a different approach to applying dynamic analysis for feature location. Information collected from execution traces is combined with other data sources instead of being analyzed itself. Execution information is integrated with other information by using it as a filter, as in the SITIR approach [130], where methods not executed in a feature-specific scenario are pruned from the ranked list produced by IR_LSI. The model in this work takes a similar approach to using execution information (denoted as "Dyn") as a filter. By extracting information from a single trace, the sequence of method calls can be used to create a graph where nodes represent methods and edges indicate method calls. This graph is a subgraph of a static call graph that only contains methods that were executed. The edges in the graph can be weighted or weightless. When weights are used, they can be derived from execution frequency information captured by a trace.

Figure 4.1: An example of an execution trace translated into a call graph with execution frequency weights on the edges. x_e is the entry to method x, and x_r is the return from method x.

For instance, Figure 4.1 shows a portion of an execution trace where method x calls method y two times and calls method z three times. This trace is represented by a graph where the weight of the edge from x to y is 2/5, and the weight of the edge from x to z is 3/5.

Alternatively, instead of normalizing the edge weights, the values on the edge from x to y can be 2, and the weight of the edge from x to z can be 3. When dynamic execution information is used in either of these ways, it is denoted with the "freq" subscript, referring to the fact that execution frequency information is used. If no weights are placed on the edges of a graph, this is denoted with the "bin" subscript, referring to the fact that only binary information about a method's execution is used.
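
The sketch below illustrates the two weighting schemes on a list of (caller, callee) pairs extracted from a trace; the trace format is a simplifying assumption:

    from collections import defaultdict

    def trace_to_graph(call_pairs, use_frequency=True, normalize=True):
        # call_pairs: (caller, callee) method pairs observed in one execution trace.
        counts = defaultdict(lambda: defaultdict(int))
        for caller, callee in call_pairs:
            counts[caller][callee] += 1
        graph = {}
        for caller, callees in counts.items():
            total = sum(callees.values())
            graph[caller] = {}
            for callee, n in callees.items():
                if not use_frequency:
                    graph[caller][callee] = 1           # "bin": edge present, no weight
                elif normalize:
                    graph[caller][callee] = n / total   # "freq", normalized (e.g., 2/5 and 3/5)
                else:
                    graph[caller][callee] = n           # "freq", raw counts (e.g., 2 and 3)
        return graph

    # Figure 4.1's example: x calls y twice and z three times.
    g = trace_to_graph([("x", "y"), ("x", "y"), ("x", "z"), ("x", "z"), ("x", "z")])
    # g == {"x": {"y": 0.4, "z": 0.6}}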

4.1.3 Dependence Information from Web Mining

Web mining is a branch of data mining that concentrates on analyzing the structure of the World Wide Web (WWW) [49]. The structure of the WWW can be used to extract useful information. For instance, search engines use web mining to rank web pages by their relevance to a user's query. Web mining algorithms view the WWW as a graph. The graph is constructed of nodes, which represent web pages, and edges, which represent hyperlinks between pages.

Software can also be represented in graph form as a call graph. Nodes represent methods, and edges correspond to relationships or calls among methods. Therefore, web mining algorithms can be naturally applied to software to discover useful information from its structure, such as key classes for program comprehension [239], component ranks in software repositories [106], and statements that can be refined from concept bindings [127]. This work explores whether web mining can also be applied to feature location, either as a standalone technique or used as a filter to an existing feature location technique. Two web mining algorithms are discussed below.

4.1.3.1 HITS

The Hyperlink-Induced Topic Search (HITS) [113] algorithm identifies hubs and authorities from a graph representing the WWW. Hubs are pages that have links to many other pages that contain relevant information on a topic. These pages with pertinent information are known as authorities. Good hubs point to many good authorities, and good authorities are pointed to by many hubs. Thus, hub and authority values are defined in a mutually recursive way. Let h_p stand for the hub value of page p and a_p represent the authority value of p. The hub and authority values of p are defined in Equation 4.1, where i is a page connected to p, and n is the total number of pages connected to p.

h_p = \sum_{i=1}^{n} a_i \qquad \text{and} \qquad a_p = \sum_{i=1}^{n} h_i \qquad (4.1)

To start, HITS initializes all hub and authority values to one. Then, the algorithm is run for a given number of iterations (or until the values converge), during which the hub

and authority values are updated according to Equation 4.1. The values are normalized after each iteration.

A slight variation of the HITS algorithm allows weights to be added to the links between pages. Weighted links denote relative importance. Let w_{i \to p} represent the weight of the link between i and p. The formulas for hubs and authorities now become:

h_p = \sum_{i=1}^{n} w_{i \to p} \cdot a_i \qquad \text{and} \qquad a_p = \sum_{i=1}^{n} w_{i \to p} \cdot h_i \qquad (4.2)

When using software to construct a graph instead of the WWW, the nodes and edges can be determined from a static call graph or dynamic execution trace. This work concentrates on constructing a graph from execution traces. Nodes in the graph correspond to methods, and edges represent dependencies (calls) between methods. If weights are placed on the graph edges, dynamic execution frequency can be used (the HITS algorithm does not require edge weights to be normalized, so the execution frequency values are used without normalization). Otherwise, if no weights are used, binary dynamic information is used. Using either frequency or binary dynamic information to construct a method call graph, the HITS algorithm can potentially be used for feature location in two ways. First, the methods in a graph can be ranked by extending the concepts of hubs and authorities to source code. Hub methods are those that call upon many other methods, while authority methods are called by a large number of other methods.

Intuitively, hubs do not perform much functionality themselves but delegate to others, and authorities actually perform specific functionalities. Ranking methods in a software system by either their hub or authority values is a novel feature location technique. The notation WM_HITS(h,freq), WM_HITS(h,bin), WM_HITS(a,freq), WM_HITS(a,bin) is used, where

WM refers to web mining, HITS(h) and HITS(a) stand for hub and authority scores respectively, and the "freq" and "bin" subscripts denote how dynamic information is used to weight the graph's edges. The second way in which the HITS algorithm can be used for feature location is as a filter. Instead of directly using the hub and authority values to rank methods, those rankings can be combined with other information. The intuition is that the methods with high hub values will be methods that are more general purpose in nature and not specific

to a feature, i.e., methods in "god" classes. Conversely, methods with high authority values may be highly relevant to a feature. Therefore, top-ranked hub methods and bottom-ranked authority methods can be filtered from the results of other techniques such as IR_LSI Dyn_bin. The "top" superscript is used to represent when the top-ranked methods are filtered, and the "bottom" superscript stands for the case when the bottom-ranked methods are filtered. The evaluation investigates the best method of filtering by hub and authority values.
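
The following sketch shows the (optionally weighted) HITS iteration over such a method call graph, using the {caller: {callee: weight}} format from the earlier sketch; it is an illustration of Equations 4.1 and 4.2, not the tooling used in the evaluation:

    import math

    def hits(graph, iterations=50):
        # graph: {caller: {callee: weight}}; use weight 1 for the unweighted variant.
        nodes = set(graph) | {m for callees in graph.values() for m in callees}
        hubs = {n: 1.0 for n in nodes}
        auths = {n: 1.0 for n in nodes}
        for _ in range(iterations):
            # Authority of p: weighted sum of the hub values of methods that call p.
            new_auths = {n: 0.0 for n in nodes}
            for i, callees in graph.items():
                for p, w in callees.items():
                    new_auths[p] += w * hubs[i]
            # Hub of p: weighted sum of the authority values of methods that p calls.
            new_hubs = {p: sum(w * new_auths[q] for q, w in graph.get(p, {}).items())
                        for p in nodes}
            # Normalize after each iteration.
            a_norm = math.sqrt(sum(v * v for v in new_auths.values())) or 1.0
            h_norm = math.sqrt(sum(v * v for v in new_hubs.values())) or 1.0
            auths = {n: v / a_norm for n, v in new_auths.items()}
            hubs = {n: v / h_norm for n, v in new_hubs.items()}
        return hubs, auths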

4.1.3.2 PageRank

PageRank [31] is a web mining algorithm that estimates the relative importance of web pages. It is based on the random surfer model, which states that a web surfer on any given page p will follow one of p's links with a probability of β and will jump to a random page with a probability of (1 − β). Generally, β = 0.85. Given a graph representing the WWW, let N be the total number of pages or nodes in the graph. Let I(p) be the set of pages that link to p, and O(p) be the pages that p links to. PageRank is defined by the equation

PR(p) = \frac{1 - \beta}{N} + \beta \cdot \sum_{j \in I(p)} \frac{PR(j)}{|O(j)|} \qquad (4.3)

PageRank's definition is recursive and must be iteratively evaluated until it converges.

Like HITS, PageRank can be applied to software if a system is represented by a graph where nodes are methods executed in a trace and edges are method calls. In the PageRank algorithm, edges always have weights. When binary execution information is used, the weight of all the outgoing edges from a node is equally distributed among those edges

(e.g., if x has three outgoing edges, their weight will each be 1/3). Otherwise, execution frequency information can be used for the edge weights. PageRank requires normalized values, so the execution frequency values are normalized, as in the example in Figure 4.1. Like HITS, PageRank can be used to directly rank and locate a feature's relevant methods or as a filter to other sources of information. When used directly as a feature location technique, it is denoted as WM_PR(freq) or WM_PR(bin), referring to the use of frequency or binary execution information to create a graph. PageRank, applied to software, is an estimate of the global importance of a method within the system. Therefore, methods

that have global significance within a system will be ranked highly. Methods relevant to a specific feature are unlikely to have high global importance, so they may be ranked lower in the list. The evaluation examines PageRank as a feature location technique.

Since PageRank identifies methods of global importance, instead of using it as a standalone feature location technique, it can be used as a filter to be combined with other sources of information. Pruning the top-ranked PageRank methods from consideration may produce better feature location results. The "top" and "bottom" superscripts denote that the top and bottom results returned by PageRank are filtered. The evaluation explores the best way to use PageRank as a filter.
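
A corresponding sketch of the weighted PageRank iteration (a direct generalization of Equation 4.3 in which the uniform 1/|O(j)| split is replaced by the normalized edge weights; illustrative only):

    def pagerank(graph, beta=0.85, iterations=100):
        # graph: {caller: {callee: weight}} with each caller's outgoing weights summing to 1
        # (uniform weights for binary information, normalized counts for frequency information).
        nodes = set(graph) | {m for callees in graph.values() for m in callees}
        N = len(nodes)
        pr = {n: 1.0 / N for n in nodes}
        for _ in range(iterations):
            new_pr = {n: (1.0 - beta) / N for n in nodes}
            for j, callees in graph.items():
                for p, w in callees.items():
                    new_pr[p] += beta * pr[j] * w   # j passes a share of its rank to p
            pr = new_pr
        return pr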

4.1.4 Fusions

Data fusion combines information from multiple sources to achieve potentially more accurate results. For feature location, this model has defined three information sources derived from three types of analysis: information retrieval, execution tracing, and web mining.

This subsection outlines the feature location techniques instantiated within the model that are evaluated. Table 4.1 lists all of the techniques.

Information Retrieval via LSI. This feature location technique, introduced in [142], ranks all methods in a software system based on their relevance to a query. Only one source of information is used, so no data fusion is performed. This approach is referred to as IR_LSI.

Information Retrieval and Execution Information. The idea of fusing IR with dynamic analysis is used by the SITIR approach [130] and is the state of the art of feature location techniques that rank program elements (e.g., methods) by their relevance to a feature. A single feature-specific execution trace is collected. Then, LSI ranks all the methods in the trace instead of all the methods in the system. Thus dynamic information is used as a filter to eliminate methods that were not executed and therefore are less likely to be relevant to the feature. In this work, this technique is abbreviated IR_LSI Dyn_bin and represents the baseline for comparison. Note that the IR_LSI Dyn_freq approach is not evaluated. It filters the same methods as IR_LSI Dyn_bin because it only matters whether a method was executed or not.

Table 4.1: The feature location techniques evaluated.

IR & Dynamic Analysis:  IR_LSI; IR_LSI Dyn_bin
Web Mining:             WM_HITS(h,bin); WM_HITS(h,freq); WM_HITS(a,bin); WM_HITS(a,freq); WM_PR(bin); WM_PR(freq)
IR, Dyn, & HITS:        IR_LSI Dyn_bin WM_HITS(h,bin)^top; IR_LSI Dyn_bin WM_HITS(h,bin)^bottom; IR_LSI Dyn_bin WM_HITS(h,freq)^top; IR_LSI Dyn_bin WM_HITS(h,freq)^bottom; IR_LSI Dyn_bin WM_HITS(a,bin)^top; IR_LSI Dyn_bin WM_HITS(a,bin)^bottom; IR_LSI Dyn_bin WM_HITS(a,freq)^top; IR_LSI Dyn_bin WM_HITS(a,freq)^bottom
IR, Dyn, & PageRank:    IR_LSI Dyn_bin WM_PR(bin)^top; IR_LSI Dyn_bin WM_PR(bin)^bottom; IR_LSI Dyn_bin WM_PR(freq)^top; IR_LSI Dyn_bin WM_PR(freq)^bottom

Web Mining. The HITS and PageRank algorithms can be used as feature location techniques that rank all methods in an execution trace using either binary or frequency information. Web mining has not been applied to feature location before; therefore all of the approaches involving web mining are novel. Table 4.1 lists all the feature location techniques based on web mining.

Information Retrieval, Execution Information, and Web Mining. Applying data fusion, IR, execution tracing, and web mining can be combined to perform feature location. This work proposes the use of web mining as a filter for IR_LSI Dyn_bin's results in order to eliminate methods that are irrelevant. Figure 4.2 illustrates the process. Each web mining algorithm can be applied to binary or execution frequency information. To combine IR_LSI Dyn_bin and web mining, the top or bottom web mining results can be pruned from IR_LSI Dyn_bin's ranked list. If the results returned by a standalone web mining technique rank methods that are relevant to a feature at the top of the list, then methods at the bottom of the list can be filtered from consideration. However, since the standalone web mining techniques are based on a dynamically-constructed call graph, the resulting rankings could be similar across many different features, meaning the top-ranked results are not relevant to the feature. In this case, those top-ranked results are eliminated from consideration.

Figure 4.2: Combining textual analysis, dynamic analysis, and web mining for feature location.

For example, IR_LSI Dyn_bin WM_HITS(h,bin)^top is a feature location technique that uses IR to rank all of the executed methods by their relevance to a query. A graph is constructed using binary execution information from a trace, and the methods in the graph are ranked according to their HITS hub values. Finally, the top methods from the HITS hub rankings are pruned from the IR_LSI Dyn_bin results. In this technique, methods with high HITS hub values are filtered. Table 4.1 lists all of the feature location techniques that filter IR_LSI Dyn_bin's results using HITS or PageRank.
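
The filtering step itself is straightforward; the sketch below assumes a ranked list from IR_LSI Dyn_bin and a web mining score per method (the helper name and parameters are hypothetical):

    def filter_by_web_mining(ir_ranked_methods, wm_scores, fraction=0.5, prune="top"):
        # ir_ranked_methods: executed methods ordered by LSI similarity, best first.
        # wm_scores: {method: hub, authority, or PageRank value} from the trace's call graph.
        wm_ranked = sorted(ir_ranked_methods, key=lambda m: wm_scores.get(m, 0.0), reverse=True)
        cutoff = int(len(wm_ranked) * fraction)
        if prune == "top":
            pruned = set(wm_ranked[:cutoff])
        else:
            pruned = set(wm_ranked[len(wm_ranked) - cutoff:]) if cutoff else set()
        # Keep the IR ordering of whatever survives the web mining filter.
        return [m for m in ir_ranked_methods if m not in pruned]

    # e.g., IR_LSI Dyn_bin WM_HITS(h,bin)^top prunes the methods with the highest hub values:
    # final = filter_by_web_mining(lsi_ranking, hub_values, fraction=0.5, prune="top")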

4.2 Experimental Evaluation

This section describes the design of a case study to assess the feature location techniques defined by the data fusion model. The evaluation seeks to answer the following research questions:

RQ1: Does combining web mining algorithms with an existing approach to feature location improve its effectiveness?

RQ2: Which web-mining algorithm, HITS or PageRank, produces better results?

The answers to these research questions will help reveal the best instantiation of the data fusion model.

4.2.1 Systems and Benchmarks

The evaluation was conducted on two open source Java software systems: Eclipse and Rhino. Eclipse (http://www.eclipse.org/) is an integrated development environment. Version 3.0 has approximately 10K classes, 120K methods, and 1.6 million lines of code. Forty-five features from Eclipse were studied. The features are represented by bug reports submitted to Eclipse's online issue tracking system (https://bugs.eclipse.org/). The bug reports are change requests that pertain to faulty features.

The bug reports provide steps to reproduce the problem, and these steps were used as scenarios to collect execution traces. Table 4.2 lists information about the size of the collected traces. The short descriptions in the bug reports were used as the IR queries.

The bug reports also include submitted patches that detail the code that was changed to fix the bug. The modified methods are considered to be the "gold set" of methods that implement the feature. Since their code had to be altered to correct a problem with the feature, they are likely to be relevant to the feature. These gold set methods are used as the benchmark to evaluate the feature location techniques. This way of determining a feature's relevant methods from patches has also been used by other researchers [130, 132, 160].

The other system evaluated is Rhino (http://www.mozilla.org/rhino/), a Java implementation of JavaScript. Rhino version 1.5 consists of 138 classes, 1,870 methods, and 32,134 lines of code. Rhino implements the ECMAScript specification (http://www.ecmascript.org/). The Rhino distribution comes with a test suite, and individual test cases in the suite are labeled with the section of the specification they test. Therefore, these test cases were used to collect execution traces for 241 features. The text from the corresponding section of the specification was used to formulate IR queries.

For the gold set benchmarks for each feature, the mappings of source code to features made available by Eaddy et al. [63] were used. They considered the sections of the ECMAScript documentation to be features and associated code with each following the prune dependency rule, which states: "A program element is relevant to a [feature] if it should be removed, or otherwise altered, when the [feature] is pruned" [62]. Their mappings are made publicly available online (http://www.cs.columbia.edu/~eaddy/concerntagger/) and have been used in several other research evaluations [62, 63].

Table 4.2: Descriptive statistics on the execution traces. The columns represent the minimum, maximum, lower quartile, median, upper quartile, mean, and standard deviation. Forty-five traces were collected for Eclipse, and 241 for Rhino.

                 Min     Max      25%     Med     75%     Mean    Std Dev
Eclipse
  Methods        88K     1.5MM    312K    525K    1MM     666K    406K
  Unique         1.9K    9.3K     3.9K    5K      6.3K    5.1K    2K
  Size-MB        9.5     290      55      98      202     124     83
  Nesting*       22      178      37      54      71      59      32
  Threads        1       26       7       10      12      10      5
Rhino
  Methods        160K    12MM     612K    909K    1.8MM   1.8MM   2.3MM
  Unique         777     1.1K     870     917     943     912     54
  Size-MB        18      1,668    71      104     214     210     273
  Nesting*       25      37       28      27      28      28      1
  Threads        1       1        1       1       1       1       0
* Nesting is based on the average nesting level per feature.

The position of the first relevant method from the gold set was used as the primary means to evaluate the feature location techniques and is referred to as the effectiveness measure [160]. Techniques that rank relevant methods near the top of the list are more effective because they reduce the number of false positives a developer has to consider.

The effectiveness measure is an accepted metric to evaluate feature location techniques.

It is used here instead of precision and recall to be consistent with previous approaches

[130, 160] and because feature location techniques have been shown to be better at finding one relevant method for a feature as opposed to many [173]. However, the evaluation also investigates how well the techniques locate all of a feature's relevant methods.
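
Computed from a ranked list and a gold set, the effectiveness measure is simply the position of the first hit, as in this sketch (names illustrative):

    def effectiveness(ranked_methods, gold_set):
        # 1-based rank of the first gold set method, or None if every
        # relevant method was filtered out of the ranked list.
        for position, method in enumerate(ranked_methods, start=1):
            if method in gold_set:
                return position
        return None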

4.2.2 Hypotheses

Several null hypotheses were formed to test whether the performance of the baseline feature location technique improves with the use of web mining. The testing of the hypotheses is based on the effectiveness measure. Two null hypotheses are presented here; the other hypotheses can be derived analogously.

H0,WM_PR(bin): There is no significant difference between the effectiveness of WM_PR(bin) and the baseline (IR_LSI Dyn_bin).

H0,IR_LSI Dyn_bin WM_PR(bin)^top: There is no significant difference between the effectiveness of IR_LSI Dyn_bin WM_PR(bin)^top and the baseline (IR_LSI Dyn_bin).

If a null hypothesis can be rejected with high confidence, an alternative hypothesis that states that a technique has a positive effect on the ranking of the first relevant method can be supported. The corresponding alternative hypotheses to the null hypotheses above are given. The remaining alternative hypotheses are formulated in a similar manner.

HA,WM_PR(bin): The effectiveness of WM_PR(bin) is significantly better than the baseline.

HA,IR_LSI Dyn_bin WM_PR(bin)^top: The effectiveness of IR_LSI Dyn_bin WM_PR(bin)^top is significantly better than the baseline.

4.2.3 Data Collection and Analysis

The primary data collected in the evaluation is the effectiveness measure. For each feature location technique, there are 45 data points for Eclipse and 241 for Rhino, one for each feature. Descriptive statistics of the effectiveness measure for each system are reported that summarize the data in terms of mean, median, minimum, maximum, lower quartile, and upper quartile.

The feature location techniques can also be evaluated by the number of features for which they return at least one relevant result. Many of the techniques in the model filter methods from consideration, and some of those methods may belong to the gold set. It is possible for a technique to filter out all of a feature's gold set methods and return no relevant results. Therefore, the percentage of features for which a technique can locate at least one relevant method is reported.

If a feature location technique ranks one of a feature's relevant methods higher than another technique, then the first approach is more effective. Every feature location technique can be compared to every other technique in this manner, and the percentage of times the first technique is more effective is reported.

Data on whether one technique is more effective than another is not enough. Statistical analysis must be performed to determine if the difference between the effectiveness of two techniques is significant. The Wilcoxon Rank Sum test [48] is used to test if the difference between the effectiveness measures of two feature location techniques is statistically significant. Essentially, the test determines if the decrease in the number of false positives reported by one technique as compared to another is significant. The Wilcoxon test is a non-parametric test that accepts paired data. Since a technique may not rank any of a feature's gold set methods, it would have no data to be paired with the data from another feature location technique. Therefore, only cases where both techniques rank a method are input to the test. In this evaluation, the significance level of the Wilcoxon Rank Sum test is α = 0.05.
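
A sketch of the pairing and test (using scipy's paired signed-rank variant of the Wilcoxon test, since the comparison is over paired per-feature measures; data structures are illustrative):

    from scipy import stats

    def compare_techniques(eff_a, eff_b, alpha=0.05):
        # eff_a, eff_b: {feature_id: effectiveness measure or None} for two techniques.
        # Only features for which both techniques rank a gold set method are paired.
        paired = [(eff_a[f], eff_b[f]) for f in eff_a
                  if f in eff_b and eff_a[f] is not None and eff_b[f] is not None]
        a_vals = [a for a, _ in paired]
        b_vals = [b for _, b in paired]
        statistic, p_value = stats.wilcoxon(a_vals, b_vals)
        return p_value, p_value < alpha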

4.3 Results and Discussion

This section presents the results of using the feature location techniques listed in Table 4.1 to identify the first relevant method of 45 features of Eclipse and 241 features of Rhino. Figure 4.3 and Figure 4.4 show box plots representing the descriptive statistics of the effectiveness measure for Eclipse and Rhino. The y-axis represents the effectiveness measure. The graphs for Eclipse and Rhino have different scales because Eclipse has more methods. Figure 4.3 plots the feature location techniques based on IR (T1), IR and dynamic analysis (T2), and web mining as a standalone approach (T3 through T8). Figure 4.4 shows the techniques that combine IR, dynamic analysis, and web mining (T2 through T13). IR_LSI Dyn_bin is also included in this figure for reference since it represents the baseline for comparison. In Figure 4.3 and Figure 4.4, the diamonds represent the average effectiveness measure. The dark grey and light grey boxes stand for the upper and lower quartiles, respectively, and the line between the boxes represents the median. The whiskers above and below the boxes denote the minimum and maximum effectiveness measure. In some cases, the maximum is beyond the scale of the graphs. The figures also report, for each feature location technique, the percentage of features for which the technique was able to identify at least one relevant method.

Figure 4.3: The effectiveness measure for the standalone web mining feature location techniques applied to 45 features in Eclipse (a) and 241 features in Rhino (b). The values above the boxes represent the percentage of features for which the technique was able to locate at least one relevant method.
Legend: T1: IR_LSI; T2: IR_LSI Dyn_bin; T3: WM_PR(freq); T4: WM_PR(bin); T5: WM_HITS(a,freq); T6: WM_HITS(a,bin); T7: WM_HITS(h,freq); T8: WM_HITS(h,bin).

The box plots in Figure 4.3 show that using web mining as a standalone feature location technique produces results that are comparable to IR_LSI even though no query is used.

However, these techniques are less effective than the state of the art, no matter the web mining algorithm used. Feature location based on PageRank, HITS hub values, or HITS authority values has a higher (i.e., worse) effectiveness measure than IR_LSI Dyn_bin. PageRank's effectiveness was the lowest, followed by HITS authorities and HITS hubs. Overall, there is little difference between the use of binary and execution frequency information. It is surprising that ranking methods by their hub values is more effective than ranking them by their authority values. Intuitively, hubs are methods that delegate functionality to authorities which actually implement it. Therefore, authorities should be more valuable for feature location, but this was not observed.

Even though feature location techniques based on standalone web mining are not more effective than the state of the art, when web mining is used as a filter to IR, the results significantly improve in some cases. Figure 4.4 presents box plots of the effectiveness measure of the techniques that used web mining to filter IR_LSI Dyn_bin's results. The filters prune either the top or bottom methods ranked by a web mining algorithm.

Figure 4.4: The effectiveness measure for the feature location techniques that use web mining as a filter, applied to 45 features in Eclipse (a) and 241 features in Rhino (b). The top and bottom percentages in brackets have two values. The first value is the percentage used in Eclipse, and the second is the percentage used in Rhino. The values above the boxes represent the percentage of features for which the technique was able to locate at least one relevant method.
Legend: T1: IR_LSI Dyn_bin; T2: IR_LSI Dyn_bin WM_PR(freq)^top[40,60]%; T3: IR_LSI Dyn_bin WM_PR(freq)^bottom[20,70]%; T4: IR_LSI Dyn_bin WM_PR(bin)^top[40,60]%; T5: IR_LSI Dyn_bin WM_PR(bin)^bottom[10,70]%; T6: IR_LSI Dyn_bin WM_HITS(a,freq)^top[30,70]%; T7: IR_LSI Dyn_bin WM_HITS(a,freq)^bottom[40,60]%; T8: IR_LSI Dyn_bin WM_HITS(h,freq)^top[10,70]%; T9: IR_LSI Dyn_bin WM_HITS(h,freq)^bottom[50,50]%; T10: IR_LSI Dyn_bin WM_HITS(a,bin)^top[20,70]%; T11: IR_LSI Dyn_bin WM_HITS(a,bin)^bottom[40,40]%; T12: IR_LSI Dyn_bin WM_HITS(h,bin)^top[10,70]%; T13: IR_LSI Dyn_bin WM_HITS(h,bin)^bottom[70,60]%.

The threshold for the percentage of methods to filter was selected for each technique individually such that at least one gold set method remained in the results for 66% of the features. In Eclipse, IR_LSI Dyn_bin WM_HITS(h,freq)^bottom had the best effectiveness measure on average. In Rhino, IR_LSI Dyn_bin WM_HITS(h,bin)^bottom was the most effective technique. In fact, all of the techniques that use web mining to filter IR are more effective than IR_LSI Dyn_bin in Eclipse by 13% to 62% on average. In Rhino, most of the IR plus web mining techniques have an average effectiveness 1% to 51% better than IR_LSI Dyn_bin, except for IR_LSI Dyn_bin WM_HITS(a,freq)^bottom, IR_LSI Dyn_bin WM_HITS(h,freq)^top, IR_LSI Dyn_bin WM_HITS(a,bin)^bottom, and IR_LSI Dyn_bin WM_HITS(h,bin)^top. These results help answer RQ1 because they lend strong support to the fact that integrating the ranking of methods using web mining with information retrieval is a very effective way to perform feature location. In regard to RQ2, the techniques based on HITS were generally more effective than the PageRank approaches, so HITS, used either as a standalone technique or as a filter, seems better suited to the task of feature location.

In addition to measuring the effectiveness of each of the feature location techniques, the new approaches based on web mining were directly compared to IR_LSI and IR_LSI Dyn_bin.

Table 4.3 shows, for each new technique, the percentage of times it ranks a method from a feature's gold set better (i.e., closer to the top of the list) than the existing approaches do. The table shows a different view of the data presented in Figures 4.3 and 4.4. It shows, on a case-by-case basis, which feature location technique is more effective. The data in this table is derived from the subset of methods that are ranked by both techniques, while Figures 4.3 and 4.4 show data for all methods. In Table 4.3, if one approach ranks a method and another does not, the method is not included in the reported data. The table shows that feature location techniques based solely on web mining never have better effectiveness than IR_LSI Dyn_bin.

On the other hand, the techniques that use web mining as a filter routinely rank methods higher than IR_LSI Dyn_bin. This finding also helps answer RQ1: combining web mining with existing approaches improves their effectiveness. RQ2 addresses which of the two web mining algorithms is more effective. Based on the results in Table 4.3, the techniques based on HITS are more effective than the PageRank techniques.

Table 4.3: For each feature location technique listed in a row, the percentage of times its effectiveness measure is better than the technique in the corresponding column is given.

                                          Eclipse                   Rhino
Technique                              IR_LSI  IR_LSI Dyn_bin   IR_LSI  IR_LSI Dyn_bin
IR_LSI Dyn_bin                          97%    X                 91%    X
WM_PR(freq)                             59%    13%               49%    20%
WM_PR(bin)                              59%    10%               44%    19%
WM_HITS(a,freq)                         67%    18%               45%    15%
WM_HITS(a,bin)                          56%    18%               25%    6%
WM_HITS(h,freq)                         77%    26%               45%    20%
WM_HITS(h,bin)                          77%    26%               41%    22%
IR_LSI Dyn_bin WM_PR(freq)^top          97%    90%               85%    72%
IR_LSI Dyn_bin WM_PR(freq)^bottom      100%    83%               83%    63%
IR_LSI Dyn_bin WM_PR(bin)^top           97%    91%               85%    73%
IR_LSI Dyn_bin WM_PR(bin)^bottom        97%    94%               82%    54%
IR_LSI Dyn_bin WM_HITS(a,freq)^top      97%    90%               88%    74%
IR_LSI Dyn_bin WM_HITS(a,freq)^bottom   97%    94%               82%    53%
IR_LSI Dyn_bin WM_HITS(h,freq)^top      97%    94%               72%    40%
IR_LSI Dyn_bin WM_HITS(h,freq)^bottom   97%    97%               93%    88%
IR_LSI Dyn_bin WM_HITS(a,bin)^top       97%    94%               85%    68%
IR_LSI Dyn_bin WM_HITS(a,bin)^bottom    97%    91%               73%    60%
IR_LSI Dyn_bin WM_HITS(h,bin)^top       97%    94%               72%    40%
IR_LSI Dyn_bin WM_HITS(h,bin)^bottom    97%    97%               89%    81%

4.3.1 Statistical Analysis

The Wilcoxon Rank Sum test was used to test if the difference between the effectiveness measures of two feature location techniques is statistically significant. Table 4.4 shows the results of the test (p-values) for all of the techniques based on web mining as compared to IR_LSI Dyn_bin and whether the null hypotheses can be rejected based on the p-values. In the table, statistically significant results are those for which the null hypothesis is rejected. None of the approaches in which web mining is used as a standalone technique have statistically significant results.

However in Eclipse, all of the feature location techniques that employ web mining as a filter to IR have significantly better effectiveness than IR_LSI Dyn_bin. Likewise in Rhino, most of the approaches that use web mining as a filter have statistically significant results with a few exceptions. Therefore, the null hypotheses for these approaches that do not have significant results for both systems cannot be rejected. However, for the techniques with statistically significant results for both Eclipse and Rhino, their null hypotheses are rejected, and there is evidence to suggest that the corresponding alternative hypotheses can be supported. These feature location techniques have better effectiveness than the baseline technique.

Table 4.4: The results of the Wilcoxon test.

Technique                              Eclipse     Rhino       Null Hypothesis
WM_PR(freq)                            1           1           Not Rejected
WM_PR(bin)                             1           1           Not Rejected
WM_HITS(a,freq)                        1           1           Not Rejected
WM_HITS(a,bin)                         1           1           Not Rejected
WM_HITS(h,freq)                        1           1           Not Rejected
WM_HITS(h,bin)                         1           1           Not Rejected
IR_LSI Dyn_bin WM_PR(freq)^top         < 0.0001    < 0.0001    Rejected
IR_LSI Dyn_bin WM_PR(freq)^bottom      0.004       0           Rejected
IR_LSI Dyn_bin WM_PR(bin)^top          < 0.0001    < 0.0001    Rejected
IR_LSI Dyn_bin WM_PR(bin)^bottom       < 0.0001    0.74        Not Rejected
IR_LSI Dyn_bin WM_HITS(a,freq)^top     0           < 0.0001    Rejected
IR_LSI Dyn_bin WM_HITS(a,freq)^bottom  < 0.0001    0.99        Not Rejected
IR_LSI Dyn_bin WM_HITS(h,freq)^top     0           1           Not Rejected
IR_LSI Dyn_bin WM_HITS(h,freq)^bottom  < 0.0001    < 0.0001    Rejected
IR_LSI Dyn_bin WM_HITS(a,bin)^top      < 0.0001    < 0.0001    Rejected
IR_LSI Dyn_bin WM_HITS(a,bin)^bottom   < 0.0001    1           Not Rejected
IR_LSI Dyn_bin WM_HITS(h,bin)^top      0           1           Not Rejected
IR_LSI Dyn_bin WM_HITS(h,bin)^bottom   < 0.0001    < 0.0001    Rejected

4.3.2 Impact of the Selection of a Threshold

The results in the previous section for the techniques that use web mining as a filter present only one possible threshold for what percentage of the top or bottom web mining results to eliminate from the baseline results. The threshold was chosen such that a given feature location technique returned at least one relevant method for at least 66% of the features studied. This section examines how varying the filtering threshold impacts the results, focusing on the two techniques with the best average effectiveness, IR_LSI Dyn_bin WM_HITS(h,freq)^bottom and IR_LSI Dyn_bin WM_HITS(h,bin)^bottom. Figure 4.5 shows, for Eclipse, box plots of the average effectiveness of the two techniques with different filtering thresholds. Figure 4.6 shows the results for Rhino.

Not surprisingly, the higher the filtering threshold, the lower (i.e., better) the average effectiveness measure, since more methods are eliminated from consideration. However, there is a tradeoff; the improvement in effectiveness comes at the cost of completeness. The values above the boxes in Figures 4.5 and 4.6 represent the percentage of features for which the technique was able to locate at least one relevant method.

Figure 4.5: Summary of the effectiveness measure of IR_LSI Dyn_bin WM_HITS(h,freq)^bottom (a) and IR_LSI Dyn_bin WM_HITS(h,bin)^bottom (b) at different filtering thresholds for the 45 features of Eclipse. The values above the boxes represent the percentage of features for which the technique was able to locate at least one relevant method.

Figure 4.6: Summary of the effectiveness measure of IR_LSI Dyn_bin WM_HITS(h,freq)^bottom (a) and IR_LSI Dyn_bin WM_HITS(h,bin)^bottom (b) at different filtering thresholds for the 241 features of Rhino. The values above the boxes represent the percentage of features for which the technique was able to locate at least one relevant method.

When a higher percentage of the HITS hubs results are filtered, the techniques find at least one relevant method for fewer features.

For instance in Eclipse with IR_LSI Dyn_bin WM_HITS(h,freq)^bottom, when the bottom 90% of the HITS hubs results are pruned from the baseline, the average effectiveness is 67, but the technique can identify a relevant method for only 29% of features. Setting the threshold too high means that methods relevant to a feature are removed from the results and become false negatives. Therefore, at least with the IR_LSI Dyn_bin WM_HITS(h,freq)^bottom and IR_LSI Dyn_bin WM_HITS(h,bin)^bottom techniques, a threshold of 50%-60% yields acceptable results. Selecting appropriate thresholds for individual features remains part of our future work.

4.3.3 Locating All of a Feature's Methods

Chapter 3 explored how effective existing feature location techniques are at finding near-complete implementations of features and found that the existing techniques showed room for improvement. So far, this chapter has focused on the effectiveness of feature location only in terms of the position of the first relevant method (i.e., the effectiveness measure).

However, since gold sets defining all the methods relevant to a feature were available, the feature location techniques can also be evaluated in terms of how well they locate all of a feature's methods. Figures 4.7 and 4.8 show box plots summarizing the average position of all of a feature's relevant methods. Figure 4.7 presents the results for IR_LSI, IR_LSI Dyn_bin, and the standalone web mining feature location techniques, while Figure 4.8 shows the results for the baseline and the techniques that use web mining as a filter.

Figure 4.7 shows that the baseline approach, IR_LSI Dyn_bin, is more effective at locating all of a feature's relevant methods than the standalone web mining techniques. However, using web mining as a filter improves the average effectiveness of locating all of the methods from a feature's gold set, as seen in Figure 4.8. As with the effectiveness measure results presented earlier, IR_LSI Dyn_bin WM_HITS(h,freq)^bottom and IR_LSI Dyn_bin WM_HITS(h,bin)^bottom were the two most effective techniques.


T1: IR_LSI    T5: WM_HITS(a,freq)
T2: IR_LSI Dyn_bin    T6: WM_HITS(a,bin)
T3: WM_PR(freq)    T7: WM_HITS(h,freq)
T4: WM_PR(bin)    T8: WM_HITS(h,bin)

Figure 4.7: The average position of all gold set methods for the standalone web mining feature location techniques applied to 45 features of Eclipse and 241 features of Rhino. The values above the boxes represent the percent of all the gold set methods the technique could locate.


(a) Eclipse. (b) Rhino.
T1: IR_LSI Dyn_bin    T8: IR_LSI Dyn_bin WM_HITS(h,freq)^top[10,30]%
T2: IR_LSI Dyn_bin WM_PR(freq)^top[50,30]%    T9: IR_LSI Dyn_bin WM_HITS(h,freq)^bottom[60,40]%
T3: IR_LSI Dyn_bin WM_PR(freq)^bottom[20,30]%    T10: IR_LSI Dyn_bin WM_HITS(a,bin)^top[20,40]%
T4: IR_LSI Dyn_bin WM_PR(bin)^top[50,30]%    T11: IR_LSI Dyn_bin WM_HITS(a,bin)^bottom[40,30]%
T5: IR_LSI Dyn_bin WM_PR(bin)^bottom[20,40]%    T12: IR_LSI Dyn_bin WM_HITS(h,bin)^top[10,30]%
T6: IR_LSI Dyn_bin WM_HITS(a,freq)^top[20,30]%    T13: IR_LSI Dyn_bin WM_HITS(h,bin)^bottom[60,40]%
T7: IR_LSI Dyn_bin WM_HITS(a,freq)^bottom[40,30]%

Figure 4.8: The average position of all gold set methods for the feature location techniques that use web mining as a filter applied to 45 features of Eclipse and 241 features of Rhino. The top and bottom percentages in brackets have two values. The first value is the percentage used in Eclipse, and the second is the percentage used in Rhino. The values above the boxes represent the percent of all the gold set methods the technique could locate.

4.3.4 Using a Static Call Graph

All of the feature location techniques investigated have leveraged a call graph that is constructed from execution traces specific to each feature. Collecting execution traces is computationally expensive and time consuming. This section explores whether comparable results can be achieved using a static call graph. The tradeoff is that only one static call graph is needed instead of a different dynamic call graph for each feature, but a static call graph is generalized and not feature-specific.
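As a small illustration of the two edge-weighting schemes used for dynamic call graphs, the following sketch (with an invented toy trace) derives frequency and binary weights from a sequence of caller/callee events; the real traces come from instrumented runs of the subject systems.

from collections import Counter

# Toy execution trace: a list of (caller, callee) events.
trace = [("main", "parse"), ("parse", "nextToken"),
         ("parse", "nextToken"), ("parse", "emit")]

freq_edges = Counter(trace)                     # frequency-weighted edges
bin_edges = {edge: 1 for edge in freq_edges}    # binary-weighted edges

print(freq_edges[("parse", "nextToken")])       # 2
print(bin_edges[("parse", "nextToken")])        # 1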

Figure 4.9 shows, for Eclipse, summaries of the effectiveness measure for each of the feature location techniques based on using web mining as a filter. Figure 4.10 shows the results for Rhino. In each graph, the first plot represents using a dynamic call graph with binary weights and the second corresponds to using a dynamic call graph with execution frequency weights. The third, patterned plot represents using a static call graph. For example, Figure 4.9(a) compares the results of IR_LSI Dyn_bin WM_PR(bin)^top, IR_LSI Dyn_bin WM_PR(freq)^top, and filtering PageRank's top-ranked methods from a static call graph from IR_LSI Dyn_bin's results.

Figure 4.9 shows that in Eclipse, using a static call graph is not as effective as using a dynamically-constructed call graph. A static call graph includes all of a system's methods, not just those that were executed. Eclipse has over 84,000 methods, so using a static call graph significantly increases the number of methods that need to be ranked. This increase in the number of methods leads to a decrease in effectiveness because there are more false positives in the ranked list.

Figure 4.10 shows the results of using a dynamic call graph and a static call graph for each feature location technique that uses web mining as a filter in Rhino. Unlike the Eclipse results, using a static call graph in Rhino has comparable effectiveness. In general, the static approaches are not quite as effective as the dynamic ones, but the difference is not large. In Rhino, using a static call graph gives results that are close to those when using a dynamic call graph without the cost of collecting traces. Rhino is a smaller system than Eclipse, so ranking all of its methods instead of only those that were executed introduces fewer false positives. There may be other factors explaining why the static results are comparable

[Figure 4.9, panels (a) Filter top PageRank results and (b) Filter bottom PageRank results: box plots over the bin, freq, and static variants.]

[Figure 4.9, panels (c) Filter top HITS authority results, (d) Filter bottom HITS authority results, (e) Filter top HITS hub results, and (f) Filter bottom HITS hub results: box plots over the bin, freq, and static variants.]

Figure 4.9: The effectiveness measure for the feature location techniques applied to 45 features of Eclipse. The values above the boxes represent the percentage of features for which the technique was able to locate at least one relevant method.

[Figure 4.10, panels (a) Filter top PageRank results, (b) Filter bottom PageRank results, (c) Filter top HITS authority results, (d) Filter bottom HITS authority results, (e) Filter top HITS hub results, and (f) Filter bottom HITS hub results: box plots over the bin, freq, and static variants.]

Figure 4.10: The effectiveness measure for the feature location techniques applied to 241 features of Rhino. The values above the boxes represent the percentage of features for which the technique was able to locate at least one relevant method.

to the dynamic results in Rhino but not Eclipse. Future work will include investigating the circumstances under which a static call graph might yield comparable results.

4.3.5 Discussion

The findings of the evaluation show that combining web mining with an existing feature location technique results in a more effective approach (RQ1). Additionally, in the context of feature location, HITS is a more effective web mining algorithm than PageRank (RQ2). The most effective techniques evaluated were IR_LSI Dyn_bin WM_HITS(h,freq)^bottom and IR_LSI Dyn_bin WM_HITS(h,bin)^bottom. The results indicate that filtering bottom-ranked hub methods from IR_LSI Dyn_bin's results is the most effective approach from both the perspective of the position of the first relevant method and of all relevant methods. For instance, for one feature in Eclipse, IR_LSI ranked the first relevant method at position 1,696, and for IR_LSI Dyn_bin, the best rank of a relevant method was at position 61. IR_LSI Dyn_bin WM_HITS(h,bin)^bottom, on the other hand, ranked the first relevant method to the feature at position 24. Filtering the bottom HITS hub methods eliminated 37 false positives from the results obtained by the state of the art technique. Examining the results in detail reveals why. Methods with high hub values call many other methods, while methods that do not make many calls have low hub values. These bottom-ranked hub methods are generally getter and setter methods or other methods that do not make any calls and perform very specific tasks. The IR_LSI Dyn_bin WM_HITS(h,bin)^bottom technique prunes these methods from the results since they are not relevant to the feature, thus improving effectiveness.
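The following minimal HITS sketch (toy call graph and method names are invented; this is not the implementation used in this work) illustrates why methods that make no calls, such as getters and setters, receive the lowest hub scores.

# Minimal HITS sketch on a toy call graph (adjacency list: caller -> callees).
# A node's hub score depends on the authority of the methods it calls, so
# methods that call nothing end up with hub scores of zero.

def hits(graph, iterations=50):
    nodes = set(graph) | {n for callees in graph.values() for n in callees}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # authority: sum of hub scores of callers
        auth = {n: sum(hub[c] for c in graph if n in graph[c]) for n in nodes}
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # hub: sum of authority scores of callees
        hub = {n: sum(auth[t] for t in graph.get(n, [])) for n in nodes}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth

call_graph = {"run": ["parse", "emit"], "parse": ["getType", "setValue"],
              "emit": ["getType"]}
hub, auth = hits(call_graph)
print(min(hub, key=hub.get))  # a leaf such as getType/setValue has the lowest hub score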

The two most effective techniques remove bottom-ranked hub methods, and these methods tend to be getters and setters. We also compared IR_LSI Dyn_bin WM_HITS(h,freq)^bottom and IR_LSI Dyn_bin WM_HITS(h,bin)^bottom to a technique that filters out all getter and setter methods from IR_LSI Dyn_bin's results to see if this simpler filtering heuristic is more effective than using web mining. Figure 4.11 shows the average effectiveness measure of the baseline (T1), the baseline with getter and setter methods pruned from the results (T2), and IR_LSI Dyn_bin WM_HITS(h,bin)^bottom (T3). In both Eclipse and Rhino, removing getters and setters from IR_LSI Dyn_bin's results is not as effective as

[Figure 4.11 box plots: (a) Eclipse. (b) Rhino.]

T1: IR_LSI Dyn_bin
T2: IR_LSI Dyn_bin with getters and setters filtered
T3: IR_LSI Dyn_bin WM_HITS(h,bin)^bottom

Figure 4.11: Comparing the effectiveness measure for the baseline technique (T1), filtering getters and setters from the baseline (T2), and one of the most effective techniques based on using web mining as a filter (T3). The values above the boxes represent the percentage of features for which the technique was able to locate at least one relevant method.

[Figure 4.12 box plots: (a) Eclipse. (b) Rhino.]

T1: IR_LSI Dyn_bin
T2: IR_LSI Dyn_bin with getters and setters filtered
T3: IR_LSI Dyn_bin WM_HITS(h,bin)^bottom

Figure 4.12: Comparing the average position of all gold set methods for the baseline (T1), filtering getters and setters from the baseline (T2), and one of the most effective techniques based on using web mining as a filter (T3). The values above the boxes represent the percent of all the gold set methods the technique could locate.

IR_LSI Dyn_bin WM_HITS(h,bin)^bottom. Similarly, when considering the ranks of all of a feature's relevant methods, IR_LSI Dyn_bin WM_HITS(h,bin)^bottom is still the more effective technique, as seen in Figure 4.12. Therefore, using the HITS web mining algorithm and filtering bottom-ranked hub methods eliminates more false positives than simply pruning getter and setter methods.

In addition to investigating the filtering heuristic of eliminating getter and setter methods, we also explored another simplified heuristic in which methods with certain fan-in values are pruned from IR_LSI Dyn_bin's results. The fan-in of a module is defined as the number of locations from which control is passed into the module [101] and is derived from a static call graph. Fan-in is similar to web mining. Both count the number of incoming links/calls to a page/method. The difference is that the web mining algorithms are more powerful because they incorporate indirect information. Not only are the number of incoming links counted, but the importance of those incoming links is also considered. For instance, the PageRank of a page is based upon how many other pages link to the page and the PageRank of those pages. Similarly with HITS, a page's authority score is based on how many hubs point to it, not just the total number of pages that link to it. The web mining algorithms are defined recursively (see Sections 4.1.3.2 and 4.1.3.1) to capture this indirect information. Another difference between our work and research on using fan-in is that we apply web mining to a dynamic call graph, while fan-in is computed from a static call graph.
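The contrast can be illustrated with a small sketch (toy call graph and method names are invented; the PageRank here is a simplified version without dangling-node handling): fan-in only counts a method's callers, while PageRank also propagates the importance of those callers.

# Sketch contrasting fan-in with PageRank on a toy call graph.
def fan_in(graph):
    nodes = set(graph) | {n for cs in graph.values() for n in cs}
    return {n: sum(n in cs for cs in graph.values()) for n in nodes}

def pagerank(graph, d=0.85, iterations=50):
    nodes = set(graph) | {n for cs in graph.values() for n in cs}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {}
        for n in nodes:
            # each caller passes on a share of its own rank
            incoming = sum(rank[c] / len(graph[c]) for c in graph if n in graph[c])
            new[n] = (1 - d) / len(nodes) + d * incoming
        rank = new
    return rank

call_graph = {"a": ["util"], "b": ["util"], "hotspot": ["core"], "core": ["util"]}
print(fan_in(call_graph)["util"], fan_in(call_graph)["core"])        # 3 1
print(pagerank(call_graph)["util"] > pagerank(call_graph)["core"])   # True: importance flows indirectly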

Figure 4.13 compares the effectiveness of IR_LSI Dyn_bin WM_HITS(h,bin)^bottom to several techniques based on filtering methods with certain fan-in values from IR_LSI Dyn_bin's results. For instance, T3 prunes all methods with a fan-in value less than or equal to 2. In both Eclipse and Rhino, the approaches that filter more methods have lower average effectiveness. However, these techniques are only able to locate at least one gold set method for a smaller percentage of all the features. The results are similar when the rankings of all of a feature's methods are considered, as seen in Figure 4.14. Therefore, using fan-in as a filtering heuristic is too naïve and simplistic because it eliminates too many of a feature's relevant methods, unlike using web mining.

Concerning the use of execution frequency or binary weights, the results do not show a significant difference between the two, nor is one consistently more effective than the 107 73% 3GO.c. 18~0 85~0 800 200 700 - 180 83% GO~·o 600 160 9% 140 500 120 400 100 27% 22% 300 80 .. 130 200 40 100 13'% 20 i 0 0 ~ll T, T: T~ T., T~ Tt> T: T~ T: T~ T.. T~ Te T: (a) Eclipse. (b) Rhino. T1: IRLsiDynbinWlviHITS(h,bin)bottom Ts : I RLsi Dynbin, filter fan-in :::; 4 T2: IRLsiDynbin, filter fan-in:::; 1 T6: IRLsiDynbin, filter fan-in:::; 5 Ta: IRLsiDynbin, filter fan-in:::; 2 T1 : I RLsi Dynbin, filter fan-in :::; 10 T4: IRLsiDynbin, filter fan-in:::; 3

Figure 4.13: Comparing the effectiveness measure for one of the most effective techniques based on using web mining as a filter (T1) and techniques based on filtering methods with certain fan-in values from IR_LSI Dyn_bin's results (T2 - T7). The values above the boxes represent the percentage of features for which the technique was able to locate at least one relevant method.

[Figure 4.14 box plots: (a) Eclipse. (b) Rhino.]
T1: IR_LSI Dyn_bin WM_HITS(h,bin)^bottom    T5: IR_LSI Dyn_bin, filter fan-in ≤ 4
T2: IR_LSI Dyn_bin, filter fan-in ≤ 1    T6: IR_LSI Dyn_bin, filter fan-in ≤ 5
T3: IR_LSI Dyn_bin, filter fan-in ≤ 2    T7: IR_LSI Dyn_bin, filter fan-in ≤ 10
T4: IR_LSI Dyn_bin, filter fan-in ≤ 3

Figure 4.14: Comparing the average position of all gold set methods for one of the most effective techniques based on using web mining as a filter (T1) and techniques based on filtering methods with certain fan-in values from IR_LSI Dyn_bin's results (T2 - T7). The values above the boxes represent the percent of all the gold set methods the technique could locate.

other. However, one observation is that in Rhino, binary weights were more effective, likely because the Rhino traces had many loops, which artificially inflated the execution frequencies of many of the methods. Using binary weights avoided this situation.

Each of the analyses used in the data fusion model has its own costs and overheads that must be weighed against the benefits of using the techniques. The main cost associated with LSI is indexing the corpus, which for large corpora can take several minutes, depending on many factors such as the size of the corpus and CPU speed. However, this is a one-time cost and can be performed incrementally when the source code changes [108]. Gathering execution information by collecting traces is probably the most expensive analysis used in the model in terms of both time and space. Tracing a program's execution can impose considerable overhead and significantly slow down execution speed [52]. Collecting a trace of a large system such as Eclipse could take an hour. Additionally, the collected trace will be large in size, possibly over a gigabyte (see Table 4.2). Collecting multiple traces requires sufficient storage space to save them all. The final type of analysis used in the framework is web mining. Running the web mining algorithms can take several minutes for a large system. Like indexing with LSI, this is a one-time cost.

4.3.6 Threats to Validity

There are several threats to the validity of the evaluation presented in this chapter. Conclusion validity refers to the relationship between the treatment and the outcome and whether it is statistically significant. Since no assumptions were made about the distribution of the effectiveness measures, a non-parametric statistical test was used. The results of the test showed that the improvement in effectiveness of most of the web mining based feature location techniques over the state of the art is significant.

Internal validity refers to whether the relationship between the treatment and the outcome is causal and not due to chance. The effectiveness measure is based on the position of a feature's first relevant method, and the relevant methods are defined by a gold set. In Eclipse, the gold set was defined by bug report patches. These patches may contain only a subset of the methods that implement a feature, and sometimes the methods were not implemented until a later version. In Rhino, the gold set methods were defined manually by other researchers who were not system experts. Thus, relevant methods could be missing from the gold sets of each system. This threat is minimized by the fact that the patches were approved by the module owners and the Rhino data has been previously used by other researchers [62, 63].

Another threat to internal validity pertains to the collection of data from IR and dynamic analysis. Information retrieval requires a query. The queries in this evaluation were taken directly from bug reports and documentation. It is possible that the queries used do not accurately reflect the features being located or that the use of different queries with vocabularies more in line with the source code would yield better results. However, using these default queries instead of formulating our own eliminated the introduction of bias.

Similarly, execution traces were collected for each feature based on either the bug reports or test cases. The collection of these traces may not have invoked all of a feature's relevant methods or may have inadvertently invoked another feature. This is a threat to validity common to all approaches that use dynamic analysis. The use of test cases distributed with the software reduces this threat since the tests were created by the system's authors.

External validity concerns whether or not the results of this evaluation can be generalized beyond the scope of this work. Two open source systems written in Java were evaluated. Eclipse is large enough to be comparable to an industrial software system, but Rhino is only medium-sized. Additional evaluations on other systems written in other languages are needed to know if the results of this study hold in general.

4.4 Related Work

As discussed in Chapter 2, existing feature location techniques can be broadly classified by the types of analysis they employ, be it static, dynamic, textual, or a combination of two or more of these. This section reviews some of the related work that is most relevant to the work presented in this chapter. We also explain the key differences between our work and the related work.

There are several static approaches to feature location. Chen and Rajlich [39] proposed the use of Abstract System Dependence Graphs (ASDG) as a means of static feature location, whereby users follow system dependencies to find relevant code. Robillard [176] introduced a more automated static approach that analyzed the topology of a system's dependencies. Harman et al. [98] used hypothesis-based concept assignment (HB-CA) [89] and program slicing to create executable concept slices and found these slices can be used to decompose a system into smaller executable units corresponding to concepts (features) [13].

In our work, instead of using static information, we focus on using textual and dynamic information to get results that are more tailored to a specific feature.

Software reconnaissance [229] is a well-known dynamic approach to feature location. Two execution traces are collected: one that invokes the feature of interest and another that does not. The traces are compared, and methods invoked only in the feature-specific trace are deemed relevant. SPR [5] is another dynamic feature location technique in which statistical hypothesis testing is used to rank executed methods. We also employ dynamic information for feature location, but we use it as a filter to textual information instead of directly identifying a feature's implementation from pure dynamic analysis.

Textual feature location was introduced by Marcus et al. [142] when they applied LSI to source code. The approach has been extended to include relevance feedback [85], where users indicate which results are relevant and a new query is automatically formulated from the feedback. Textual analysis of source code is not limited to LSI. Grant et al. [90] employ Independent Component Analysis (ICA) for feature location. ICA is an analysis technique that separates a set of input signals into statistically independent components. For each method, the analysis determines its relevance to each of the signals, which represent features. Textual feature location is at the foundation of our work. We rely on LSI as opposed to other analyses because LSI is the de facto standard.

In addition to these techniques based on a single type of analysis, there are many hybrid approaches. Both SITIR [130] and PROMESIR [160] combine textual and dynamic analysis. Eisenbarth et al. [76] applied formal concept analysis to execution traces and combined the results with an approach similar to ASDGs. This approach involves human input and does not produce ranked results, so we did not include it in our evaluation. Dora [102] and SNIAFL [244] incorporate information from textual and static analysis. Cerberus [62] is the only hybrid approach that combines static, dynamic, and textual analyses. Dora and Cerberus do not produce ranked results, but SNIAFL does, so future work involves comparing our new techniques to it.

No existing feature location techniques rely on web mining. However, web mining has been used for other program comprehension tasks. Zaidman and Demeyer [240, 239] used the HITS algorithm on a dependence graph of a system weighted with dynamic coupling measures to identify the classes that are most important for understanding the software. Saul et al. [197] also use HITS to recommend related API calls. SPARS-J [106] is a system that analyzes the usage relations of components in a software repository using a ranking algorithm that is similar to PageRank. Components that are generic and frequently reused are ranked highly. Li [127] also uses a variant of PageRank called the Vertex Rank Model (VRM) to refine concept bindings found using HB-CA. The VRM works on a dependence graph of concept bindings to identify statements that can be removed from the concept bindings without losing domain knowledge.

Aspect mining is closely related to feature location. The goal of aspect mining is to identify concerns^8 that are scattered throughout a system's modules so that they can be refactored into their own modules known as aspects. The concerns are not known a priori, whereas in feature location, the features of interest are known before searching begins. Marin et al. use fan-in to identify concerns that can be refactored into aspects [144, 145]. Methods with high fan-in are called from many different locations within the system, and thus possibly represent a scattered concern. Other aspect mining approaches have employed the idea of data fusion by combining multiple techniques [202] including fan-in, clone detection [32, 33, 203], and natural language analysis [205].

4.5 Conclusion

This work has introduced a data fusion model for feature location. The basis of the model is that combining information from multiple sources is more effective than using the information individually. Feature location techniques based on web mining and approaches using web mining as a filter to information retrieval were instantiated within the model.

^8 A concern is an area of interest or focus in a system. Features can be concerns, but not all concerns are features.

A large number of features from two open source Java systems were studied in order to discover if feature location based on combining IR and web mining is more effective than the current state of the art and which of two web mining algorithms is better suited to feature location.

The results of an extensive evaluation reveal that new feature location techniques based on using web mining as a filter are more effective than the state of the art, and that their improvement in effectiveness is statistically significant. Future work includes instantiating the model with different IR techniques and investigating when static call graphs are acceptable to use. All of the data used to generate the results presented in this chapter is made freely available to other researchers who wish to replicate the case studies.

Chapter 5

Feature Coupling

Chapters 3 and 4 focus on feature location, an important step in the software maintenance process. Programmers use feature location to find the source code that implements features related to a maintenance task. This chapter delves into another step of the maintenance process: revalidating a software system once changes to a feature have been made. Programmers can use impact analysis to determine the effect of their change to a feature. Coupling is one way to perform impact analysis.

Coupling is an important software relationship that has been used for numerous tasks related to software development and maintenance such as predicting fault-proneness [9, 30, 28, 97, 152, 211] and impact analysis [29, 164, 231]. Coupling is primarily measured at the class level by determining the degree to which two classes in an object-oriented system depend on one another.

Features, also known as concepts or concerns, are functionalities described in a requirements or specification document that have been actualized in a software system [5]. Often, features have implementations that span multiple methods or classes and cannot be modularized due to design decisions [111, 213]. Features are important software entities that transcend the boundaries of classes. Currently, there are no metrics that explicitly capture the coupling between features, and the usefulness of such measures is not known.

In this chapter, we argue that feature-level coupling metrics are needed and show that they are useful. Feature coupling can be used as a predictor of fault-proneness. Just as class coupling has been used in testing [109], if it is known that two features are tightly coupled, more testing effort can be applied to them to help eliminate bugs. Another example is software maintenance. Many software change tasks are framed in terms of a system's functionalities or features. Since a feature's implementation may be scattered throughout the source code of a software system, programmers may have difficulty determining which other features interact with it. Therefore, changes made to one feature may have unintended consequences for other, seemingly unrelated features, causing improper system behavior. To avoid such situations, feature-level impact analysis should be performed to discover other features that are tightly coupled to the feature undergoing modification. Thus, feature coupling metrics are needed to measure the dependencies among features to support a variety of software development and maintenance tasks.
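As a simple illustration of feature-level impact analysis, the sketch below (feature names and coupling values are invented) ranks the remaining features by their coupling to the feature about to be changed.

# Sketch of impact analysis via feature coupling: rank other features by their
# (hypothetical) coupling values to the feature being modified.
coupling = {("Save File", "Autosave"): 0.62,
            ("Save File", "Print"): 0.08,
            ("Save File", "Syntax Highlighting"): 0.21}

def impact_ranking(changed_feature, coupling):
    related = [(pair[1], value) for pair, value in coupling.items()
               if pair[0] == changed_feature]
    return sorted(related, key=lambda item: item[1], reverse=True)

print(impact_ranking("Save File", coupling))
# [('Autosave', 0.62), ('Syntax Highlighting', 0.21), ('Print', 0.08)]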

We introduce new feature coupling metrics because current coupling metrics are designed for classes, and features exist at a higher level of abstraction than classes. Features are defined by a portion of a specification and implemented in source code, meaning features are represented by both structured (e.g., source code dependencies) and unstructured (e.g., identifiers and comments in source code) information. Therefore, it is logical to measure feature coupling using both types of data: structured and unstructured. Structured information refers to source code and other related artifacts such as call graphs and program dependence graphs that are ordered in a particular way (i.e., following programming language grammar rules). Unstructured information, on the other hand, refers to internal source code comments, identifier names, and external documentation that encode domain knowledge and design decisions. While comments and documentation can be structured in the form of sentences and organized into sections, they are more free-form, unstructured, and do not follow specific rules. We define feature coupling metrics based on these different sources of information.

Structural Feature Coupling (SFC) captures the relationship between two features based on structured information, while Textual Feature Coupling (TFC) measures the coupling between features based on unstructured, textual information in source code using an information retrieval technique called Latent Semantic Indexing (LSI) [59]. In addition, we conjecture that the structured and unstructured data are complementary, as has been shown elsewhere [62, 102, 160, 164], so we propose to combine SFC and TFC into a hybrid feature coupling metric called HFC. Hybrid feature coupling can be used when one source of information cannot be completely relied on but programmers still want to incorporate it. For instance, in systems that are poorly structured, more weight can be given to textual information to compensate. Likewise, in software with little or no comments or poorly named identifiers, more weight can be placed on structural information.

This work makes the following research contributions:

1. Define feature coupling metrics. We formally define coupling metrics for features using structural and textual information. Our metrics are novel and fill a void in the research area, which currently lacks feature coupling metrics based on either type of information. We also theoretically validate our metrics and introduce a new dimension to the unified framework for coupling measurement [25].

2. Demonstrate the relationship between feature coupling and fault-proneness. To demonstrate both the usefulness and applicability of our new feature coupling metrics, we perform three separate case studies. In the first case study, we empirically investigate the relationship between our feature coupling metrics and fault-proneness. In this study, we establish that there is a statistically significant correlation between feature coupling and defects. Our results build on previously published findings [63] that cross-cutting concerns (features) may cause defects. In essence, our first case study extends prior results by showing that there is also a relationship between coupled features and bugs.

3. Evaluate the application of feature coupling to impact analysis. We also demonstrate some implications of feature coupling measurement for feature-level impact analysis. Feature coupling is a good starting point for understanding how a change to one feature is likely to affect others. For example, during impact analysis, all features can be ranked by their strength of coupling to the feature being modified. If programmers know that feature A is more tightly coupled to feature B than to feature C, they can expect that a change to A is likely to impact B more than C and spend more time ensuring B was not adversely affected by the change to A. Also, analyzing related features using coupling metrics can help avoid introducing defects caused by intricate and potentially hidden dependencies [238] among features. We show that feature coupling can be effectively used for impact analysis under certain configurations.

4. Explore how feature coupling metrics align with developers' opinions. The final way in which we evaluate our new feature coupling metrics is by investigating if they agree with developers' opinions of whether two features are coupled or not. We find that overall, there is agreement between the developers' ratings and our measures, meaning our feature coupling metrics do capture coupling among features as recognized by software developers.

5. Create tool support for feature coupling. We have developed an Eclipse plug-in for managing features. The tool has functionality to assign portions of code to features and the ability to compute and analyze feature coupling metrics on demand.

The three case studies provide evidence that feature coupling metrics are useful tools programmers can use while performing feature-level software maintenance tasks. Like class coupling measures, they can be used to predict fault-proneness and for impact analysis. These new metrics give programmers greater flexibility because they allow for analysis at a higher level of abstraction than classes.

5.1 Related Work

There are many existing coupling metrics that employ different types of information such as structural, dynamic, textual, or evolutionary. Most of these metrics determine coupling between classes. Our work is distinct from previous research in that it provides a formal way to capture and analyze the strength of coupling among features using various types of information, namely structural and textual. Furthermore, there are no existing metrics that combine information from two or more distinct sources (e.g., structural and textual) to capture coupling. Table 5.1 summarizes the state of the art in coupling measurement, and we offer a brief overview below.

Table 5.1: State of the art in coupling measurement across two dimensions: level of coupling and type of information used to capture the strength of coupling. The metrics proposed in this work are highlighted in boldface.

Coupling Dimension | Structural | Dynamic | Textual | Hybrid | Evolutionary
Class | CBO, RFC, MPC, DAC, Ce, Ca, Information flow coupling, class-attribute interaction, class-method interaction | Dynamic import and export coupling | CoCC | Future work | Interaction coupling, Evolutionary coupling, Logical coupling
Feature | SFC, SFC' | DIST | TFC, TFCmax | HFC | Future work

5.1.1 Structural Coupling Measures

Most existing coupling metrics capture coupling between classes structurally. Coupling Between Objects (CBO) and Response for a Class (RFC) were introduced in Chidamber and Kemerer's suite of object-oriented metrics [42]. According to CBO, two classes are coupled if methods in one class use methods or fields in the other. RFC and RFC_α are counts of a class' methods plus methods that are directly or indirectly [43] invoked by those methods. Li and Henry [126] introduced several class coupling metrics that also utilize structural information. Message Passing Coupling (MPC) between classes A and B is based on the number of static invocations of methods from class B in class A. Data Abstraction Coupling (DAC) is a count of the number of fields in class A that are of type B, while DAC' is a binary version of this metric. There are a wealth of other structural metrics based on class dependencies such as Efferent Coupling (Ce) and Afferent Coupling (Ca) [146].

Briand et al. [26] developed several metrics for measuring the coupling between classes based on structural information from method invocations and the types of fields and parameters. These metrics, plus those by [104] and [64], were reviewed in [25] to build a unified framework for coupling measurement in object-oriented systems.

118 unified framework for coupling measurement in object-oriented systems.

Information flow-based coupling (ICP) [124] is a structural measure that takes polymorphism into account. ICP counts the number of methods from a class B invoked in a class A, weighted by the number of parameters. Two alternative versions, IH-ICP and NIH-ICP, count invocations of inherited methods and of classes not related through inheritance, respectively. Like ICP, some of the coupling measures defined in [29] take polymorphism into account. All of these existing coupling metrics are defined for classes, and therefore are at a lower level of abstraction than our feature coupling metrics.

5.1.2 Other Static Coupling Measures

Other static coupling measures exist along textual and evolutionary dimensions. Poshyvanyk and Marcus [161] define a coupling metric for classes based on textual information extracted from source code identifiers and comments. Their conceptual coupling metric, CoCC (which stands for Conceptual Coupling of Classes), captures a new dimension of coupling not addressed by structural or dynamic measures. CoCC is defined for classes, while the metrics we propose are for features. Interaction [248], logical [82], and evolutionary [247] coupling metrics mine software repositories for artifacts that are frequently co-changed. Such evolutionary information has been used for impact analysis [206], much like coupling metrics. Additionally, coupling metrics have been defined for other applications such as knowledge-based [122] and aspect-oriented [242] systems.

5.1.3 Dynamic Coupling Measures

Arisholm et al. [7] introduced dynamic import and export metrics to capture the coupling between classes at runtime. Dynamic analysis is often used to locate the code associated with features [62, 76, 130, 160, 193, 229] since a feature's behavior can be observed during execution. Currently, the only existing feature-level coupling-like metric that we are aware of is based on dynamic information. Wong and Gokhale [233] defined the distance (DIST) between two features using an execution slice-based technique. Similar feature metrics have been proposed to dynamically measure certain relationships or dependencies between features [92, 128] other than coupling. Greevy et al. [93] also created metrics for dynamically measuring the evolution of a feature. Similarly, Giroux and Robillard [88] defined a measure for feature coupling across versions of a system using regression tests since tests typically align with features. The association graph matching similarity measure (AGM) introduced by Kothari et al. [120] is a measure of pair-wise similarity between features based on dynamic call graphs. It has been used to find canonical feature sets [120], feature version similarity [119], and feature implementation overlap [121]. All of these feature metrics solely utilize dynamic information. However, dynamic information may not be sufficient to precisely capture coupling among features. The best way to collect dynamic information is to execute scenarios that exercise only one feature at a time, but developing such scenarios can be difficult, if possible at all [233]. Our metrics are the first to capture feature coupling using structural and textual information, thus avoiding the overhead of collecting execution traces.

5.1.4 Applications of Coupling Metrics

There have been numerous studies showing that coupling is a good predictor of external quality attributes such as fault-proneness [9, 26, 37, 246], maintainability [126], reengineering effort [151], and change-proneness [29]. Other studies have shown that coupling can be used for different tasks [54] such as impact analysis [164, 231], program comprehension [240], reengineering [1], quality assessment [8], reuse [41], change propagation [87], and clone detection [86]. These studies focus on coupling at the class level, while our work examines feature coupling and investigates if it is also useful for predicting fault-proneness and performing impact analysis.

5.2 Analyzing Structured and Unstructured Information in Source Code

The source code of a software system contains structured and unstructured data. The structured data is used primarily by parsers, while the unstructured information (i.e., comments and identifiers) is meant mostly for human readers. The SFC metric that we propose measures the coupling between two features structurally, drawing on information used by existing class-level coupling metrics. The other feature coupling metric we introduce, TFC, measures the conceptual or textual similarity between two features. Our approach is based on the premise that the unstructured information embedded in source code reflects, to a reasonable degree, the software's domain concepts, since existing feature location techniques [130, 142, 160] leverage such textual information to find code that implements features. In order to extract and analyze the unstructured information from source code, we use Latent Semantic Indexing, an advanced information retrieval method. In the remainder of this section, we provide details on how we obtain structured and unstructured information from software.

5.2.1 Structured Information

Class relationships, method invocations, and field references have all been used to compute class coupling [25]. In our work, we focus on methods as the main unit of structural information for several reasons. Working with method-level granularity is common in feature location. Most feature location techniques attempt to find methods associated with features [62, 76, 130, 160, 229] because methods implement functionality in code. Also, several existing class coupling metrics, such as CBO and RFC, use methods only [7, 42, 161], ignoring fields.

Most software engineers are familiar with structural source code information that can be represented in various forms such as a call graph. We use a call graph to add additional information to our structural feature coupling metric. We obtain a method-level call graph using JRipples [35, 156]. We provide more details on how we use this information to capture structural feature coupling in Section 5.3.2.

5.2.2 Unstructured Information

Latent Semantic Indexing identifies relationships between terms and concepts in unstructured text and has been successfully applied to a number of software engineering tasks such as feature location [45, 160, 165, 173], traceability link recovery [3, 57, 100, 108, 137, 148, 220], impact analysis [139, 164], and detecting code clones [136, 212]. In LSI, a word is a basic unit of discrete data defined to be an item from a vocabulary V = {w_1, w_2, ..., w_n}. A document is a sequence of n words denoted by d = (w_1, w_2, ..., w_n), where w_n is the nth word in the sequence. A corpus is a collection of m documents, D = (d_1, d_2, ..., d_m). Table 5.2 shows how these LSI concepts are mapped to source code. The process of applying LSI to source code has three steps. First, the source code must be preprocessed to build a corpus. Second, the corpus is indexed. Third and finally, textual similarities between all pairs of documents (methods) are computed. If two methods use similar terminology and have a high textual similarity, they may implement related concepts and therefore be coupled. Each of these steps is explained in more detail in the following subsections.

Table 5.2: Mapping LSI concepts to source code.

LSI Model | Source Code Entities
word | Identifiers and comments extracted from source code comprise a vocabulary set. This set is refined to exclude programming language keywords, stop words, and punctuation. Finally, all compound identifiers are split based on the observed naming conventions. V = {w_1, w_2, ..., w_n}.
document | A method is treated as a document, which can be expressed as the n identifiers and comments from the vocabulary that appear in the implementation of a method, m_i = (w_1, w_2, ..., w_n).
feature | A feature corresponds to a collection of documents representing methods that may belong to different classes, F_i = (m_1, m_2, ..., m_z).
corpus | The software system S consists of a set of classes and features comprised of methods*, S = (C_1, ..., C_n, F_1, ..., F_p), which forms a corpus D = (d_1, d_2, ..., d_m).
*Some of the system's features may be associated with methods and some not. Therefore, the corpus contains both features and classes.

5.2.2.1 Build the Corpus

A corpus represents all the words found in each document of a body of text. A document can be a sentence, a paragraph, a chapter, or in the case of source code, a method, a class, or a package. To build a corpus for the source code of a software system, a document granularity must first be chosen. In our work, we use methods as documents. Next, the text of each document must be preprocessed before being included in the corpus. There are several options for preprocessing, such as removing stop words and programming language keywords, splitting compound identifiers, including or excluding comments, and performing or not performing stemming. Stemming [158] reduces words to their root form, such that "stemming" and "stemmed" would become "stem." For every corpus created in this work, stop words (e.g., the, of) and programming language keywords (e.g., public, for, try) were removed and compound identifiers were split.

5.2.2.2 Index the Corpus

The central concept of LSI is that the information about the contexts in which a word appears or does not appear provides a set of mutual constraints that determines the similarity of meaning of sets of words (documents) to each other. LSI indexes a corpus and generates a real-valued vector description for each document based on the vector space model (VSM) [196]. LSI was originally developed in the context of information retrieval as a way of overcoming problems with polysemy and synonymy that occurred with VSM approaches. Some words appear in the same contexts, and an important part of word usage patterns is blurred by accidental and inessential information. The method used by LSI to capture essential semantic information is dimension reduction, selecting the most important dimensions from a co-occurrence matrix (words by documents) decomposed using singular value decomposition (SVD) [195]. The word-by-document matrix holds term frequency-inverse document frequency (tf-idf) values, which assess how important a particular word is to a given document. SVD is a form of factor analysis and acts as a method for reducing the dimensionality of a matrix without serious loss of specificity. Typically, the word-by-document matrix is very large and quite sparse. SVD is applied to the word-by-document matrix to eliminate noise.

5.2.2.3 Compute Textual Similarities

Once the corpus is indexed, the similarities between documents can be computed by taking the cosine between their corresponding vectors. The textual similarity between two documents (methods) m_i and m_j is defined as the cosine between vectors v_{m_i} and v_{m_j}, corresponding to m_i and m_j after dimensionality reduction is applied. Just as cosine values range from -1 to 1, so do textual similarities. The closer a value is to one, the more similar the texts of the documents/methods are. Note that textual similarities are symmetric; the similarity between m_i and m_j is the same as the similarity between m_j and m_i. In Section 5.3.8, an example of how to compute textual feature coupling using the textual similarities between two features is given.
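The sketch below indexes a toy three-method corpus with tf-idf weighting and SVD and then compares the reduced document vectors with the cosine; it only illustrates the mechanics, since real corpora and the LSI tooling used in this work are far larger.

import numpy as np

# Sketch of LSI indexing and similarity computation on a tiny invented corpus.
docs = [["save", "file", "buffer"],        # method 1
        ["write", "file", "disk"],         # method 2
        ["parse", "token", "expression"]]  # method 3
vocab = sorted({w for d in docs for w in d})

tf = np.array([[d.count(w) for d in docs] for w in vocab], dtype=float)
df = (tf > 0).sum(axis=1)
tfidf = tf * np.log(len(docs) / df)[:, None]          # word-by-document matrix

U, s, Vt = np.linalg.svd(tfidf, full_matrices=False)  # dimensionality reduction
k = 2                                                 # retained dimensions
doc_vectors = (np.diag(s[:k]) @ Vt[:k]).T             # one reduced vector per method

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(round(cosine(doc_vectors[0], doc_vectors[1]), 2))  # methods 1 and 2 share "file"
print(round(cosine(doc_vectors[0], doc_vectors[2]), 2))  # unrelated methods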

5.3 Using Structural and Textual Information for Feature Coupling

Our approach to measuring feature coupling is based on two main ideas: 1) features are entities that are coupled at a higher level of abstraction than methods and classes, and 2) coupling can be measured in multiple ways by using structured and unstructured (textual) information. Features are domain concepts implemented in a system, and their implementations are often scattered across a system's classes [63]. Therefore, features exist at a level of abstraction outside of or above classes in object-oriented languages. As described in Section 5.1, there exists an abundance of class coupling metrics that rely on structural dependencies and some that utilize textual information to measure class coupling. These metrics are useful and important because they capture essential forms of coupling. However, since features transcend class boundaries, we propose and define metrics that comprehensively capture and measure feature coupling using both structural and textual information.

5.3.1 System Representation

To define structural and textual feature coupling metrics, we first define a representation of a software system.

Definition 1: (System, Classes, Methods)

A system S is an object-oriented software system. S has a set of classes C = {c_1, c_2, ..., c_n}. The number of classes in S is n = |C|. A class has a set of methods. For each class c ∈ C, let M_c = {m_1, m_2, ..., m_z} be the set of methods implemented in c, where z = |M_c| is the number of methods in c. The set of all methods in the system S is defined as M_S.

Definition 2: (Feature)

A feature f is a requirement, functionality, or behavior described in the specification of a system S. We base this definition on Eaddy et al.'s [63] well-established model for representing cross-cutting concerns (features)^1. A system S has a set of features F = {f_1, f_2, ..., f_p}, where p = |F|. A feature f is implemented by a set of methods M_f ⊆ M_S. The methods of M_f may belong to multiple classes. A method may belong to several features, and a feature may have methods that belong to other features as well.

5.3.2 Structural Feature Coupling

We define structural feature coupling metrics using our representation of a system, features, and methods.

Definition 3: (Structural Feature Coupling - SFC)

The structural feature coupling (SFC) between features f_a and f_b, implemented by the methods in sets M_a and M_b, respectively, is defined as the ratio of the number of methods shared by the features to the total number of methods associated with the two features.

SFC(f_a, f_b) = \frac{|M_a \cap M_b|}{|M_a \cup M_b|}    (5.1)

We only consider features with non-empty method sets to avoid a potential division by zero. SFC uses structured information to capture feature coupling by measuring the degree to which two features share code.

Definition 4: (Structural Feature Coupling Prime - SFC')

Instead of solely basing coupling on the methods that implement two features, an alternative is to consider the first-order structural dependencies of those methods to also be associated with the features. Dependencies are taken into account in some existing coupling metrics (e.g., RFC), plus they are often traversed for maintenance, feature location, and program comprehension tasks. Therefore, we include the static callers and callees of a feature's methods in a variant of SFC, which we coin SFC'.

Let f_a and f_b be features implemented by the methods in sets M_a and M_b, respectively. Let M'_a ⊇ M_a and M'_b ⊇ M_b be the sets of methods that implement features f_a and f_b, respectively, plus the methods that are first-order structural dependencies of the methods in M_a and M_b. That is, M'_a and M'_b include the methods that call or are called by the methods in M_a and M_b. The structural feature coupling prime (SFC') is defined as the number of methods shared by two features over the total number of methods associated with both features.

SFC'(f_a, f_b) = \frac{|M'_a \cap M'_b|}{|M'_a \cup M'_b|}    (5.2)

Thus, SFC' incorporates additional structured information in the form of dependencies to measure feature coupling. Both SFC and SFC' are normalized, i.e., they have values in the range [0, 1]. The closer the value is to one, the stronger the structural coupling between the features.

5.3.3 Textual Feature Coupling

We define textual feature coupling metrics based on unstructured, textual information found in source code. In order to define a metric for the textual coupling between features, we first define the conceptual similarity between two methods as well as between a method and a feature. These measures are building blocks needed to define our textual feature coupling metric.

Definition 5: (Conceptual Similarity between Methods - CSM)

As defined in [138], the conceptual similarity, also known as the textual similarity, between methods m_i ∈ M_S and m_j ∈ M_S is CSM(m_i, m_j), where

CSM(m_i, m_j) = \frac{v_{m_i} \cdot v_{m_j}}{\|v_{m_i}\| \, \|v_{m_j}\|}    (5.3)

CSM(m_i, m_j) is the cosine between vectors v_{m_i} and v_{m_j}, corresponding to m_i and m_j after indexing. As defined, the value of CSM(m_i, m_j) ∈ [-1, 1]. In order to comply with the non-negativity property of coupling [27], if CSM(m_i, m_j) ≤ 0, we redefine CSM(m_i, m_j) = 0. CSM measures the textual similarity of two methods, but most features are composed of more than one method. Next, we define the conceptual similarity between a single method and a feature.

Definition 6: (Conceptual Similarity between a Method and a Feature - CSMF)

Let f_a and f_b be two distinct features in S. Each feature has a set of methods M_a = {m_a1, m_a2, ..., m_ax}, where x = |M_a|, and M_b = {m_b1, m_b2, ..., m_by}, where y = |M_b|. Between every pair of methods, there is a similarity measure CSM(m_a, m_b). The textual similarity between a method m_a from f_a and a feature f_b is:

CSMF(m_a, f_b) = \frac{1}{|M_b|} \sum_{m_b \in M_b} CSM(m_a, m_b)    (5.4)

which is the average of the textual similarities between a method m_a and all methods in feature f_b. Now that we have a measure of the textual similarity of one method to a feature, we can define the textual similarity among all the methods of two features, i.e., their textual coupling.

Definition 7: (Textual Feature Coupling - TFC)

Let f_a and f_b be two distinct features in S. The textual coupling between f_a and f_b is:

TFC(f_a, f_b) = \frac{1}{|M_a|} \sum_{m_a \in M_a} CSMF(m_a, f_b)    (5.5)

which is the average of the textual similarity measures between all unordered pairs of methods from features f_a and f_b. TFC(f_a, f_b) is a measure of the textual coupling between the two features. This definition guarantees that the coupling between two features is symmetric.
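The sketch below computes TFC for two invented features from hypothetical CSM values (already clipped to be non-negative).

# Sketch of TFC on toy data: `csm` holds invented method-to-method similarities.
def csmf(method, other_methods, csm):
    return sum(csm[(method, m)] for m in other_methods) / len(other_methods)

def tfc(methods_a, methods_b, csm):
    return sum(csmf(m, methods_b, csm) for m in methods_a) / len(methods_a)

methods_a = ["save", "flush"]
methods_b = ["autosave", "schedule"]
csm = {("save", "autosave"): 0.8, ("save", "schedule"): 0.2,
       ("flush", "autosave"): 0.4, ("flush", "schedule"): 0.1}

print(tfc(methods_a, methods_b, csm))  # average of all pairwise similarities: 0.375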

Definition 8: (Maximum Textual Feature Coupling - TFC_max)

In [138], a variant of the conceptual class coupling metric was used in which only the highest textual similarities between methods of a class are considered. Similarly, we define such an alternative measure for textual feature coupling. We refine TFC to only capture the strongest textual similarity between features. Under this definition, the textual similarity between a method m_a and a feature f_b is computed using the maximum value,

CSMF_max(m_a, f_b) = \max_{m_b \in M_b} CSM(m_a, m_b).

With this variation, the maximum textual coupling (TFC_max) between two features f_a and f_b is:

TFC_max(f_a, f_b) = \frac{1}{|M_a|} \sum_{m_a \in M_a} CSMF_max(m_a, f_b)    (5.6)

5.3.4 Hybrid Feature Coupling

Definition 9: (Hybrid Feature Coupling - HFC)

Structural information aligns with a program's structured artifacts (e.g., source code), while unstructured, textual information aligns with domain concepts (e.g., requirements). We combine structural and textual information into a single feature coupling metric to take advantage of this complementary relationship. To combine the structural and textual coupling measured by SFC and TFC, we rely on an affine transformation. Thus, the hybrid coupling between features f_a and f_b is defined as:

HFC(f_a, f_b) = w_{SFC} \cdot SFC(f_a, f_b) + w_{TFC} \cdot TFC(f_a, f_b)    (5.7)

The structural and textual weights, w_SFC and w_TFC, are values between zero and one and are chosen such that the sum of the weights is equal to one. The higher the weight, the more preference is given to that metric. Affine transformations have been used to combine different types of information for class cohesion [58], feature location [160], and identifying duplicate bug reports [223]. We chose this straightforward means of combining the two metrics because we were interested in investigating, in a controllable fashion, whether combining structural and textual information captures new facets of feature coupling.
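Finally, a one-function sketch of the affine combination, using the hypothetical SFC and TFC values from the earlier sketches:

# Sketch of HFC: an affine combination of (hypothetical) SFC and TFC values
# for a pair of features, with weights that sum to one.
def hfc(sfc_value, tfc_value, w_sfc=0.5, w_tfc=0.5):
    assert abs(w_sfc + w_tfc - 1.0) < 1e-9
    return w_sfc * sfc_value + w_tfc * tfc_value

print(hfc(0.33, 0.375))              # equal weights
print(hfc(0.33, 0.375, 0.25, 0.75))  # trust the textual signal more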

5.3.5 Theoretical Evaluation

Our feature coupling metrics comply with the five mathematical measurement properties proposed by Briand et al. [27]: non-negativity, null value, monotonicity, merging of modules, and merging of unconnected modules. Both our structural and textual feature coupling measures assume non-negative values. SFC and SFC' are based on the cardinality of sets and therefore their minimum value is zero. By redefining CSM to always produce a value greater than or equal to zero, TFC and TFC_max comply with the non-negativity property. Since HFC is based on an affine transformation of SFC and TFC, it also obeys the property. Additionally, when there is no relationship between two features, our metrics return a measurement of zero, meeting the null value property. To fulfill the monotonicity property, when a new method is added to a feature that is shared by another feature or has a strong textual similarity to methods in another feature, our coupling metrics increase instead of decreasing. Finally, the coupling obtained after merging two features is not greater than the sum of the coupling of the two original features; thus the final two properties are met.

5.3.6 Classification within the Unified Framework for Coupling Measurement

Briand et al. [25] classified coupling metrics along a number of criteria such as the type of coupling, the direction of coupling, direct vs. indirect coupling, inheritance-based vs. non-inheritance-based coupling, and domain of measurement. Our metrics are new and rely on several mechanisms not currently supported by the unified framework. The framework needs to be expanded to include a new level of granularity for features. Additionally, at the time the framework was created, class coupling was measured using structural information only. Since its definition, conceptual/textual coupling [161] has been established, which necessitates the introduction of a new dimension to the framework that takes into account textual information. We extend the unified framework for coupling measurement to account for feature-level granularity and textual coupling and classify our metrics within the expanded version.

All of the existing coupling measures surveyed for the framework take into account structural information to define the type of connectivity between elements of a class. The existing coupling metrics were classified according to seven different types of connectivity, listed in Table 5.3. We extend the types of connection to include structural and textual relationships between methods of features. We also classify our metrics using the other criteria proposed by Briand et al. [25]. Import coupling refers to a class that uses (imports) another class, while export coupling denotes a class that is used by another. Our feature coupling metrics measure both import and export coupling. Direct and indirect coupling measure direct connections and indirect connections, respectively. SFC, TFC, TFCmax, and HFC are all direct measures, but SFC' is indirect because it also includes callers and callees of a feature's methods. Currently, inheritance is not explicitly considered in our feature coupling measures, and only methods that are implemented or overloaded in a class are associated with features. Therefore, all of our feature coupling metrics can be classified as non-inheritance-based.

Table 5.3: Types of connection, a dimension of the unified framework for coupling measurement.

#  Element 1               Element 2                      Description                               Measures
1  Attribute a of class c  Class d, d ≠ c                 Class d is the type of a                  DAC, DAC', class-attribute interaction
2  Method m of class c     Class d, d ≠ c                 Class d is the type of a parameter of m   class-method interaction
                                                          or of m's return type
3  Method m of class c     Class d, d ≠ c                 Class d is the type of a local variable
                                                          of m
4  Method m of class c     Class d, d ≠ c                 Class d is the type of a parameter of a
                                                          method invoked by m
5  Method m of class c     Attribute a of class d, d ≠ c  m references a                            CBO, CBO', COF
6  Method m of class c     Method m' of class d, d ≠ c    m invokes m'                              CBO, CBO', RFC, RFCa, MPC, COF, ICP,
                                                                                                    NIH-ICP, IH-ICP, method-method interaction
7  Class c                 Class d, d ≠ c                 High-level relationships between
                                                          classes
8  Method m of feature f   Method m' of feature g, g ≠ f  m is the same as m'                       SFC, SFC'
9  Method m of feature f   Method m' of feature g, g ≠ f  m and m' are textually similar            TFC, TFCmax

Table 5.4: Mapping coupling measures to domains.

Domain          Measures
Attribute       -
Method          ICP, NIH-ICP, IH-ICP
Class           CBO, CBO', RFC, RFCa, MPC, COF, class-attribute interaction, class-method interaction, method-method interaction, CoCC
Set of classes  ICP, NIH-ICP, IH-ICP
Feature         SFC, SFC', TFC, TFCmax, HFC
System          COF

Figure 5.1: Architecture of the feature coupling component of FLAT3. (The figure shows the corpus builder pre-processing a system's methods into a corpus, from which textual similarities are computed.)

Finally, the dimension that most distinguishes our coupling metrics from existing ones is the domain of measurement. Table 5.4 lists the five domains identified by Briand et al. [25] and their associated measures. We extend the unified framework for coupling measurement with a new dimension, the feature domain, and our metrics belong in this classification.

5.3.7 Measurement Tool

We have developed tool support for feature coupling measurement. FLAT3 (Feature Location and Textual Tracing Tool), which is overviewed in Figure 5.1 and described in detail in Chapter 6, is an Eclipse plug-in based on ConcernMapper2 and ConcernTagger3 that supports mapping features to source code and the computation of feature coupling metrics. Users can manually associate features with source code or use an embedded feature location technique based on prior research [130]. Alternatively, feature-method mappings can be imported from existing models [143, 182] or tools such as ConcernMapper or ConcernTagger. If the source code or mappings are changed in successive versions of a system, the data given to FLAT3 must also be updated.

Admittedly, mapping features to code can be expensive, but research areas such as feature location are focused on automatically recovering such mappings. For instance, Ratiu and Deissenboeck [169, 170] have developed a formal framework for mapping domain concepts to program elements. Also, some integrated development environments like IBM's Jazz4 have embedded automatic traceability functionalities for requirements and bug fixes that could be leveraged. These techniques and tools can ease the burden of creating feature-method mappings.

2 http://www.cs.mcgill.ca/~martin/cm/
3 http://www.cs.columbia.edu/~eaddy/concerntagger/

Based on the mappings of features to code, our feature coupling metrics can be computed. First, the source code of a system is parsed into methods. Then, the text of the methods is pre-processed to form the documents of the corpus. Pre-processing always removes stop words and programming language keywords and splits compound identifiers.

Options include removing comments from the corpus and performing stemming. Then, LSI is used to create a word-by-document matrix that describes the distribution of terms in the methods of the corpus. Through the use of SVD, a semantic subspace is constructed in which each method from the corpus is represented as a vector. The cosine between two vectors (i.e., CSM) is a measure of the textual similarity between two methods. Given the similarities between methods and the mappings of features to methods, FLAT3 can compute TFC. To compute SFC, the tool simply requires feature-method maps as well as dependency information.
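As a rough illustration of this pipeline (a simplified sketch, not the FLAT3 code; the toy stop-word list, the identifier-splitting rule, and the choice of two LSI dimensions are assumptions made for brevity), the CSM values can be obtained as follows:

import re
import numpy as np

STOP = {"the", "a", "of", "to", "public", "static", "void", "int", "return"}  # toy stop-word/keyword list

def preprocess(source):
    # Split compound identifiers (camelCase, digits), lowercase, and drop stop words/keywords.
    tokens = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", source)
    return [t.lower() for t in tokens if t.lower() not in STOP]

def csm_matrix(method_sources, k=2):
    # method_sources: list of method texts (the documents of the corpus).
    docs = [preprocess(src) for src in method_sources]
    vocab = sorted({t for d in docs for t in d})
    A = np.array([[d.count(t) for d in docs] for t in vocab], dtype=float)  # term-by-document matrix
    U, s, Vt = np.linalg.svd(A, full_matrices=False)                        # LSI: SVD of the matrix
    vecs = (np.diag(s[:k]) @ Vt[:k]).T                                      # each method as a k-dimensional vector
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    vecs = vecs / np.where(norms == 0, 1, norms)
    return vecs @ vecs.T                                                    # cosines between method vectors (CSM)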

5.3.8 An Example of Measuring Feature Coupling

We provide an illustrative example of how SFC and TFC are calculated. The example is taken from our evaluation of Rhino, a Java implementation of JavaScript, and two of its features are the type conversions ToString (fstring) and ToObject (fobject). Feature fstring is implemented by four methods (Mstring = {ms1, ..., ms4}), and fobject is implemented by eight methods (Mobject = {mo1, ..., mo8}). Note that ms2 is the same as mo8. The structural coupling between these two features is straightforward to compute.

SFC(fstring, fobject) = 1/11 = 0.09 because the two features have one method in common out of 11 total. Our metric captures the weak structural coupling between fstring and fobject. The two features are concerned with converting an argument, and the only method they share deals with determining the type of the argument before the conversion.

4 http://jazz.net/

Table 5.5: Textual similarities (CSM values) between methods of Rhino's ToString and ToObject features. ms1: ScriptRuntime.toString(Object); ms2-mo8: FunctionObject.convertArg(...); ms3: Context.toString(Object); ms4: NativeRegExpCtor.setInstanceIdValue(...); mo1-mo5: ScriptRuntime.toObject(*); mo6-mo7: Context.toObject(*)

       mo1   mo2   mo3   mo4   mo5   mo6   mo7   mo8
ms1    0.60  0.24  0.54  0.68  0.36  0.23  0.19  0.24
ms2    0.28  0.25  0.27  0.33  0.25  0.48  0.37  1.00
ms3    0.17  0.16  0.18  0.22  0.18  0.57  0.28  0.42
ms4    0.06  0.08  0.06  0.07  0.05  0.13  0.11  0.19

To compute textual coupling, the following formula is used: TFC(fstring, fobject) = (CSMF(ms1, fobject) + CSMF(ms2, fobject) + CSMF(ms3, fobject) + CSMF(ms4, fobject))/4. CSMF(ms1, fobject) is the average of the textual similarities between method ms1 and all methods in fobject, that is, CSMF(ms1, fobject) = (CSM(ms1, mo1) + CSM(ms1, mo2) + ... + CSM(ms1, mo8))/8. The textual similarities between methods (the CSM values) are shown in Table 5.5. Thus CSMF(ms1, fobject) = (0.60 + 0.24 + 0.54 + 0.68 + 0.36 + 0.23 + 0.19 + 0.24)/8 = 0.39, CSMF(ms2, fobject) = 0.40, CSMF(ms3, fobject) = 0.27, and CSMF(ms4, fobject) = 0.09. Finally, TFC(fstring, fobject) = (0.39 + 0.40 + 0.27 + 0.09)/4 = 0.29. The textual coupling between fstring and fobject is stronger than the structural coupling. The two features do use some common identifiers such as "Number," "Object," "ScriptRuntime," and "val," but otherwise, they have their own vocabularies. To calculate the hybrid coupling between these two features, the weight given to each type of coupling needs to be established. If wSFC = 0.5 and wTFC = 0.5, then the hybrid feature coupling is computed as HFC(fstring, fobject) = 0.5 * 0.09 + 0.5 * 0.29 = 0.19.
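As a sanity check, the calculation above can be reproduced mechanically; the snippet below (illustrative only, with the CSM values copied from Table 5.5) prints SFC, TFC, and HFC for the two Rhino features:

# CSM values from Table 5.5: rows are ms1..ms4, columns are mo1..mo8.
csm_rows = [
    [0.60, 0.24, 0.54, 0.68, 0.36, 0.23, 0.19, 0.24],  # ms1
    [0.28, 0.25, 0.27, 0.33, 0.25, 0.48, 0.37, 1.00],  # ms2 (same method as mo8)
    [0.17, 0.16, 0.18, 0.22, 0.18, 0.57, 0.28, 0.42],  # ms3
    [0.06, 0.08, 0.06, 0.07, 0.05, 0.13, 0.11, 0.19],  # ms4
]

shared, total = 1, 11                     # ms2 = mo8; 4 + 8 - 1 distinct methods in total
sfc = shared / total                      # structural coupling: 0.09
tfc = sum(sum(row) / len(row) for row in csm_rows) / len(csm_rows)  # textual coupling: 0.29
hfc = 0.5 * sfc + 0.5 * tfc               # hybrid coupling with equal weights: 0.19

print(round(sfc, 2), round(tfc, 2), round(hfc, 2))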

5.4 Case Studies

The purpose of our evaluation is to assess the usefulness of our new feature coupling metrics as well as to show that they have a practical application. We perform three assessments of the metrics, each targeting a different aspect of their utility or applicability.

Figure 5.2: Data triangulation evaluation approach. (Three questions feed a common conclusion: Are feature coupling metrics good predictors of fault-proneness? Method: Spearman correlation. Can feature coupling metrics be used for impact analysis? Method: Precision and recall. Do feature coupling metrics align with developers' opinions? Method: Survey.)

In the first case study, we explore the relationship between feature coupling and fault-proneness. To that end, we calculate the correlation between the metric values and bugs for all unique pairs of features in two software systems. If there is a high correlation between a feature coupling metric and defects, then that metric may serve as a useful predictor of fault-proneness among features. For our second case study, we examine the application of feature coupling metrics for impact analysis. If feature coupling metrics help determine other features likely to be affected by a change to a feature undergoing modification, then these new measures are helpful in the context of impact analysis. Finally, our third case study involves testing if the feature coupling metrics align with developers' opinions about which features are coupled or not. We carry out a survey in which 31 programmers rated the strength of coupling between 16 pairs of features from three different software systems.

By considering the results of three evaluations, we can come to a stronger conclusion about the usefulness of feature coupling metrics than if we had used only one assessment.

This idea of synthesizing data from multiple analyses is known as data triangulation [237].

The advantage of such an approach is that by corroborating multiple sources of evidence, any findings or conclusions are likely to be more valid. Figure 5.2 summarizes our data triangulation approach, and in the following sections we provide the details and results of each part of our evaluation.

5.4.1 Subject Systems and Data Sets

To be able to compute our feature coupling metrics, we required mappings of features to the methods that implement them in a given software system. Obtaining this information from a single developer is difficult, time-consuming, and biased [183]. These factors led us to select several existing data sets made available by Eaddy et al. [63] in which multiple researchers compiled mappings of features to code. We used the information in these data sets to compute our feature coupling metrics, and we consider these data sets to be reliable since they have been previously used in other studies [62, 63]. Since we utilized previously published data, our study is reproducible; we invite other researchers to replicate our work.

All of our data and results are provided in an online appendix5 .

The first data set we use is dbViz6 version 0.5, an open-source visualization tool written in Java. The system comprises 12,700 LOC (lines of code), 93 classes, and 554 methods. We also utilize the Rhino data set. Rhino7 is a Java implementation of

JavaScript consisting of approximately 32,000 LOC, 138 classes, and over 1,800 methods.

The final data set we use is iBatis8 version 2.3, an object-relational mapping tool written in Java that has 13,300 LOC, 212 classes, and over 1,800 methods.

The data sets include mappings of program elements to features. Eaddy et al. [63] identified 13 features from dbViz's use cases, 411 features in Rhino from the ECMAScript specification9 of JavaScript, and 132 features for iBatis. For each feature in the data sets, the code associated with it was manually identified using the prune dependency rule: "A program element is relevant to a [feature] if it should be removed, or otherwise altered, when the [feature] is pruned" [63]. In other words, to assign code (methods and fields) to the features they implement, Eaddy et al. [63] considered a scenario where a feature was to be removed from a system and attempted to remove as much relevant code as possible without affecting other features. While the data sets map some fields to features, we excluded field mappings from our evaluation because our model does not currently support them.

5 http://www.cs.wm.edu/semeru/data/ese-feature-coupling/
6 http://jdbv.sourceforge.net/dbViz
7 http://www.mozilla.org/rhino
8 http://ibatis.apache.org/
9 http://www.ecma-international.org/publications/standards/Ecma-262.htm

If the features identified in the data sets are well encapsulated by classes, then measuring feature-level coupling is without merit. To check if the features in the three data sets are implemented in multiple classes, we calculated the average and median number of classes per feature (this data is also available in [61]). In dbViz, features are located in nine classes, on average, with two being the minimum, 21 the maximum, and 9 the median. The average number of classes per feature in Rhino is four, with a minimum of one, a maximum of 67, and a median of 2. Finally, iBatis' features are implemented in six classes on average, with a minimum of one, a maximum of 128, and a median of 3. Since most of the features from the three data sets are implemented in multiple classes, traditional class-level coupling metrics are not able to capture the dependencies between features. Therefore, metrics at a higher level of abstraction, such as feature coupling metrics, are needed.

The data sets also include defect information. We use this data on bugs and where they occur in our first two case studies. In dbViz, 47 bugs are mapped directly to features. Each feature has at least two bugs associated with it, and on average, a feature has 4.7 bugs. In Rhino, 149 bugs are mapped to program elements. Of the 411 features, 344 have bugs, and each feature has 6.4 bugs on average. The publicly available data sets did not include defect data for iBatis. If a method was modified to fix a bug, that method is associated with that bug. Transitively, if a feature is associated with a method, and that method was changed to fix a bug, then that bug is mapped to that feature. See [63] for the complete details on how the mappings were obtained.
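This transitive mapping is simple to express; the sketch below (hypothetical data structures, not the scripts used to build the published data sets) derives the set of bugs associated with each feature:

def bugs_per_feature(feature_methods, bug_methods):
    # feature_methods: {feature: set of methods implementing it}
    # bug_methods:     {bug: set of methods changed to fix it}
    # A bug is mapped to a feature if the two share at least one method.
    return {feature: {bug for bug, changed in bug_methods.items() if changed & methods}
            for feature, methods in feature_methods.items()}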

5.4.2 Case Study Settings

In Section 5.2.2.1, we explained the process of building a corpus in order to obtain textual similarities between methods. There are several options for building a corpus; comments can be included or excluded, and text can be stemmed or not. Comments embed additional domain knowledge within the source code of a system. Their inclusion, or exclusion, from a corpus can have an impact on the textual similarities between methods [138]. Stemming reduces words to their root, thus potentially increasing the textual similarity of two documents. Both of these options have implications for textual feature coupling. One of the secondary goals of our evaluation is to discover the optimal configuration for measuring textual feature coupling. We generated different versions of the corpus of a software system in order to explore the effect of corpus creation on feature coupling. The four corpus versions we created were 1) comments included but without stemming (c-ns), 2) comments included and stemming performed (c-s), 3) without comments but with stemming (nc-s), and 4) comments excluded and no stemming (nc-ns). These corpora represent all possible combinations of the preprocessing options for comments and stemming. We consider the c-ns corpus to be the default. For one system in which external documentation was available (Rhino), we made a fifth corpus (c-ns+d). This corpus included the source code text with comments and the external documentation's text, and words were not stemmed. The documentation is simply added to the corpus as more text; it is not mapped to source code. This augmented corpus was then used by LSI to compute similarities. The idea behind including documentation is that it encodes additional domain knowledge which may bolster the textual information in source code.

5.4.3 The Relationship Between Feature Coupling and Faults

To investigate the relationship between feature coupling and fault-proneness, we performed an empirical study. We conjecture that since features can be implemented in classes and methods dispersed throughout a system, the impact of changes to features can be difficult to determine, possibly leading to faults or system failures. Therefore, we hypothesize that the more coupled two features are, the more likely they are to share a bug. More formally, we seek to evaluate the following hypotheses.

H0 The null hypothesis is that there is no significant correlation between the strength of coupling of two features and the number of bugs they have in common.

H1 The alternative hypothesis is that there is a statistically significant correlation between the strength of coupling of two features and the number of bugs they share.

If H1 is true, it means that if programmers are aware of other features that are highly coupled to the one of interest, they can potentially prevent the introduction of tedious, feature-related faults. To test our hypotheses, we computed feature coupling metrics between all pairs of features in dbViz and Rhino. Additionally, we counted the number of bugs shared by two features for all feature pairs in each system. Then, we computed the Spearman rank order correlation between the metrics and defects.

Table 5.6: Descriptive statistics of the feature coupling metrics.

System  Metric  Max   75%   Med.  25%   Min   Mean  Std. dev.
dbViz   SFC     0.85  0.08  0.04  0.01  0     0.08  0.15
dbViz   SFC'    0.92  0.4   0.32  0.25  0     0.33  0.19
dbViz   TFC     0.22  0.09  0.08  0.06  0.02  0.08  0.04
dbViz   TFCmax  0.97  0.46  0.35  0.29  0.08  0.41  0.2
dbViz   HFC     0.53  0.09  0.06  0.04  0.01  0.08  0.09
Rhino   SFC     1.0   0.0   0.0   0.0   0.0   0.02  0.11
Rhino   SFC'    1.0   0.05  0.01  0.0   0.0   0.06  0.16
Rhino   TFC     1.0   0.22  0.13  0.09  0.0   0.19  0.17
Rhino   TFCmax  1.0   0.44  0.23  0.14  0.0   0.32  0.24
Rhino   HFC     1.0   0.12  0.07  0.04  0.0   0.11  0.12
iBatis  SFC     1.0   0.0   0.0   0.0   0.0   0.01  0.05
iBatis  SFC'    1.0   0.02  0.0   0.0   0.0   0.03  0.09
iBatis  TFC     0.99  0.13  0.09  0.06  0.0   0.11  0.1
iBatis  TFCmax  1.0   0.33  0.22  0.13  0.0   0.26  0.18
iBatis  HFC     0.91  0.07  0.04  0.03  0.0   0.06  0.07

For each system, we computed five feature coupling metrics for each pair of features:

SFC, SFC', TFC, TFCmax, and HFC. TFC and TFCmax are based on the default corpus, and for HFC, we placed equal weight on structural and textual information. We also refer to this instance of HFC as S0.5T0.5, indicating a structural weight of 0.5 and a textual weight of 0.5. For each of these feature coupling metrics, we investigated their relationship with faults. Table 5.6 summarizes the descriptive statistics for the feature coupling measures. We list the maximum (max), minimum (min), inter-quartiles (75%, median, 25%), mean (μ), and standard deviation (σ). The values are based on the 78 unique pairs of features in dbViz, 84,255 pairs in Rhino, and 13,041 pairs in iBatis.

In addition to computing coupling metrics for each pair of features, we also determined the number of defects shared by any two features. We considered a bug to be associated with a pair of features if any methods mapped to the features are also associated with the bug. Consider the example in Figure 5.3. Bug1 is mapped to methods m1, m2, and m3, while Bug2 is associated with m4 and m5. Feature fa is implemented by methods m1 and m3, while fb is mapped to m3, m4, and m5. fa is associated with Bug1 because its two methods are associated with the defect. Likewise, fb is associated with Bug1 and Bug2. Therefore, fa and fb are both associated with Bug1, so these features share that bug.

Figure 5.3: An example showing how shared bugs between features fa and fb were determined.

Table 5.7: Spearman correlation coefficients for dbViz and Rhino. All values are statistically significant at the one percent level (two-tailed). The sample size (number of feature pairs) is 78 for dbViz and 84,255 for Rhino. c-ns: comments, no stemming; c-s: comments, stemming; nc-s: no comments, stemming; nc-ns: no comments, no stemming; c-ns+d: comments, no stemming, external documentation.

                dbViz                       Rhino
Metric   c-ns  c-s   nc-s  nc-ns   c-ns  c-s   nc-s  nc-ns  c-ns+d
SFC      0.38  0.38  0.38  0.38    0.62  0.62  0.62  0.62   0.62
SFC'     0.35  0.35  0.35  0.35    0.58  0.58  0.58  0.58   0.58
TFC      0.52  0.13  0.15  0.15    0.38  0.35  0.35  0.37   0.38
TFCmax   0.50  0.23  0.21  0.21    0.52  0.51  0.50  0.50   0.50
HFC      0.49  0.47  0.47  0.47    0.44  0.42  0.41  0.43   0.44

Using our feature coupling metrics and the defect data, we calculated the Spearman rank order correlation coefficient [210] to determine the relationship between the feature coupling measures and fault-proneness. Table 5.7 lists the Spearman correlation coefficients for dbViz and Rhino for all the versions of the corpora. Correlation coefficients can take values in the range of -1.0 to 1.0. A perfect negative correlation is denoted by -1.0, a perfect positive correlation is designated by a value of 1.0, and zero means no correlation.
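The correlation analysis itself can be scripted in a few lines; a minimal sketch using SciPy (with toy input lists standing in for the actual study data) is:

from scipy.stats import spearmanr

# One entry per unique feature pair: the pair's coupling value and its number of shared bugs.
coupling_values = [0.09, 0.61, 0.00, 0.33, 0.71]   # e.g., SFC per pair (toy values)
shared_bugs     = [0,    1,    0,    15,   17]     # bugs shared by the same pairs

rho, p_value = spearmanr(coupling_values, shared_bugs)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")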

All of the Spearman correlations in Table 5.7 are statistically significant at the one percent confidence level, meaning the probability of observing such a relationship by chance is less than 1%.

The results for dbViz and Rhino indicate that there is a moderate to strong10 correlation between the feature coupling metrics and defects. Under the default configuration (comments, no stemming) in dbViz, textual coupling had the strongest correlation (0.52) with bugs, while in Rhino structural coupling was the strongest (0.62). HFC is also moderately correlated with bugs in both systems. The correlations between bugs and the variants of our structural and textual coupling metrics, SFC' and TFCmax, are not very different from the metrics on which they are based. From these results, we can reject H0, the null hypothesis, and support H1, the alternative hypothesis. In other words, feature coupling is correlated with defects.

Under the different versions of the corpus, SFC is unchanged since corpus building does not impact structural information. However, textual coupling does change, and with it, its correlation with bugs. In dbViz, TFC's correlation with defects is significantly impacted by the exclusion of comments and the use of stemming since dbViz is a relatively small system. TFC's correlation with bugs in Rhino does not suffer from the lack of comments or use of stemming as greatly as in dbViz, but there is still a slight weakening of the correlation. From this, we conclude that the best configuration under which to build a corpus to measure textual feature coupling is to include comments but not to use stemming. Stemming may be useful in other contexts [57], but we did not observe it to have an impact on these results.

Using this top-performing configuration, we created one additional corpus for Rhino that included the ECMAScript specification, an external document. By including this documentation in the corpus, we are adding domain information. The last column of Table 5.7 (c-ns+d) lists the correlation values between the metrics and bugs for this version of the corpus. Because the numbers in the table are rounded it is not obvious, but for all the metrics except TFCmax, the version of the corpus with the strongest correlation with bugs is c-ns+d. Consequently, if programmers are seeking to use feature coupling to evaluate the fault-proneness of features and have documentation available, it should be included in the corpus for improved results. This finding supports other results in the literature that state that the inclusion of documents besides source code aids IR results [236].

10 We use "strong" and "moderate" based on convention in [46], which have also been used in other software engineering contexts [63].

Table 5.8: Spearman correlation coefficients for HFC in dbViz and Rhino. All values are statistically significant at the one percent level (two-tailed). The sample size (number of feature pairs) is 78 for dbViz and 84,255 for Rhino.

Weights      dbViz  Rhino    Weights      dbViz  Rhino
S0.05T0.95   0.52   0.38     S0.55T0.45   0.47   0.44
S0.1T0.9     0.53   0.39     S0.6T0.4     0.46   0.45
S0.15T0.85   0.53   0.39     S0.65T0.35   0.45   0.46
S0.2T0.8     0.52   0.4      S0.7T0.3     0.44   0.47
S0.25T0.75   0.52   0.41     S0.75T0.25   0.43   0.48
S0.3T0.7     0.51   0.41     S0.8T0.2     0.42   0.48
S0.35T0.65   0.51   0.42     S0.85T0.15   0.41   0.5
S0.4T0.6     0.51   0.42     S0.9T0.1     0.4    0.51
S0.45T0.55   0.5    0.43     S0.95T0.05   0.39   0.53
S0.5T0.5     0.49   0.44

5.4.3.1 Hybrid Feature Coupling

In addition to investigating the five feature coupling metrics above, we also explored the effect of varying the weights assigned to our hybrid feature coupling metric, HFC. By varying the weights, preference is given to one type of information over the other, which may be useful in cases when one source of information is more reliable than the other. For instance, if a system is poorly structured but has good identifier names, more weight can be placed on textual coupling. Table 5.8 lists the Spearman correlation coefficients for all possible HFC combinations with a step size of 0.05 for the default corpus. All the correlations are statistically significant at the one percent confidence level. In dbViz, textual coupling is more strongly correlated with bugs than structural coupling (0.52 vs. 0.38), so increasing the textual weight improves HFC's correlation. The opposite is true in Rhino, where structural coupling has a stronger correlation with bugs than textual coupling (0.62 vs. 0.38). Therefore, increasing the structural weight strengthens HFC's correlation with defects. Rhino may have a stronger structural coupling than dbViz since it is an order of magnitude larger in size. Overall, the HFC variants have moderate correlations with defects, and programmers using HFC should select weights based on their assessment of the system and the type of coupling they want to emphasize. However, when the quality of the structured or unstructured information is unknown, using the default weight of 0.5 provides good results.

5.4.3.2 Comparison with an Existing Metric

The distance between features metric (DIST) introduced by Wong and Gokhale [233] is a feature metric that is very similar to coupling because it measures the distance (or similarity) between features. DIST is computed based on information collected by dynamically executing a system. Since DIST is the state of the art in feature measurement, we compared our metrics to it. DIST was originally defined on basic blocks, but we redefine it here at the method level to be able to directly compare it with our metrics. Let Ma and Mb be the sets of methods executed by inputs that invoke features fa and fb respectively. Therefore, the distance between features fa and fb is

DIST(f_a, f_b) = \frac{|M_a \oplus M_b|}{|M_a \cup M_b|}    (5.8)

where ⊕ is the exclusive OR operator.
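A set-based sketch of this method-level redefinition (illustrative only, and assuming the normalization by the union of the two method sets shown in Eq. 5.8) is:

def dist(methods_a, methods_b):
    # Distance between two features based on the methods executed in their traces:
    # size of the symmetric difference (exclusive OR) normalized by the union.
    union = methods_a | methods_b
    if not union:
        return 0.0
    return len(methods_a ^ methods_b) / len(union)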

We collected one execution trace for each of dbViz's 13 features and 51 of Rhino's. The dbViz traces were based on the developers' use cases, while the Rhino traces were based on available test cases, and not all features had a test case. We computed DIST between all pairs of features and calculated the Spearman correlation to determine the relationship between DIST and fault-proneness. Bugs were associated with features as described in Section 5.4.1. For dbViz, the Spearman correlation coefficient for DIST and bugs is 0.02, and for Rhino, it is 0.05. Neither value is statistically significant. DIST's correlation with defects is very close to zero, meaning that there is almost no correlation between the metric values and bugs. In comparison, all of our metrics have positive moderate to strong statistically significant correlations with bugs. DIST is expensive to compute because of the overhead of collecting traces. It is not a good predictor of faults, likely due to the imprecise nature of dynamic analysis. In contrast, our metrics are less expensive, and all of them are good predictors of fault-proneness.

DIST is the only feature metric we studied with no statistically significant correlation to bugs, and an example highlights the problems associated with using dynamic information. Consider dbViz's features to start and to exit the system. The dynamic coupling between these two features is 1 because regardless of what other features are invoked, the system must always be started and exited. This example shows the difficulty inherent in using dynamic information for feature coupling because some features cannot be invoked separately. On the other hand, SFC between these features is 0 and TFC is 0.08, reflecting the true lack of coupling between them, as is also supported by the fact that they do not share a bug.

5.4.3.3 The Confounding Effect of Size

Class coupling metrics have been shown to be good predictors of fault-proneness [9]. The higher the coupling of a class, the more likely it is to have defects. However, larger classes are also more likely to contain defects, and class size has been shown to have a confounding effect on the association between coupling and fault-proneness [78] and change-proneness [245]. Without accounting for class size, the relationship between coupling and fault-proneness may be overestimated. Therefore, we must also investigate if size has a confounding effect on our feature coupling metrics. Traditional class coupling metrics are defined for a single class. For instance, CBO is the number of other classes that a class uses. By contrast, our feature coupling metrics are defined for pairs of features and measure the degree of coupling between those two features.

This presents a challenge when testing for the confounding effect of size. Instead of a single metric capturing how coupled a feature is, each feature has many different coupling measures, one for each other feature in the system. Also, instead of the size of a single feature, we have two features. The best we can do is determine if there is a correlation between the size of a feature (in terms of number of methods) and fault-proneness. The Spearman correlation coefficient between the number of methods associated with a feature and the number of bugs a feature has is 0.53 (α = 0.05) in dbViz and 0.83 (α < 0.001) in Rhino. These results mean that there is a strong, statistically significant relationship between the size of a feature and the number of defects it has. This relationship could be confounding the correlation between the feature coupling metrics and bugs. However, SFC', the variant of SFC, increases the size of a feature by considering the first-order structural dependencies of a feature's relevant methods to also be relevant, and SFC''s correlation with bugs is weaker than SFC's (see Table 5.7). Therefore, the confounding effect of size on coupling metrics may not be as pronounced as it is for class coupling.

5.4.4 Using Structural and Textual Coupling to Support Feature-Level Impact Analysis

Our second case study investigates the application of feature coupling metrics for impact analysis. Given a starting point, such as a change to some module, impact analysis involves detecting other modules within a system that may be affected by a change [154, 171]. Both class-level coupling and information retrieval have been used for impact analysis [29, 164].

Generally, to select candidate modules to investigate for impact analysis, a threshold is set on the coupling or textual similarity values. Previous research on using coupling or information retrieval for impact analysis has focused on identifying methods and classes

[164], not features. Therefore, we explore if feature coupling metrics can be used to find other features that are likely to be affected by a change to a feature undergoing modification by using defects identified in these features as an oracle.

To evaluate feature coupling in the context of impact analysis, we use available bug data from two systems to compute the precision, recall, and f-measure of the relevant coupled features recommended by our metrics. The process can be described as follows. For a bug b, we create a set Fb = {f1, f2, ..., fn} of features that all share the bug. That is, every feature in the set is associated with bug b. For each feature fi in Fb, we determine which other features from all of the system's features are coupled to fi by setting a threshold. For example, if the threshold is 0.5, then every feature that is coupled to fi with a metric value equal to or above 0.5 is included in a new set T. Then, precision and recall are computed with T being the retrieved set and Fb (excluding fi) as the relevant set. Precision is the ratio of the number of relevant features retrieved over the total number of features retrieved, while recall is computed as the number of relevant features retrieved divided by the total number of relevant features. The f-measure is the harmonic mean of precision and recall. For each bug b, we get precision, recall, and f-measure values. To get an overall measure of all bugs in the system, we summarize these precision, recall, and f-measure values using a macroevaluation averaging technique as in [247]. Macroevaluation means an average is taken of the values for all fi in Fb and then for all bugs in the system. These values were computed for all threshold values with a step size of 0.05.
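A compact sketch of this evaluation loop (with hypothetical data structures; coupling[(fi, fj)] is assumed to hold a symmetric feature coupling value for every pair) is:

def impact_analysis_scores(bug_features, coupling, all_features, threshold):
    # bug_features: {bug: set of features sharing that bug}
    # coupling:     {(fi, fj): metric value}, assumed symmetric
    # Returns macro-averaged precision, recall, and f-measure over all bugs.
    precisions, recalls = [], []
    for features in bug_features.values():
        p_vals, r_vals = [], []
        for fi in features:
            relevant = features - {fi}
            retrieved = {fj for fj in all_features
                         if fj != fi and coupling[(fi, fj)] >= threshold}
            if retrieved:
                p_vals.append(len(retrieved & relevant) / len(retrieved))
            if relevant:
                r_vals.append(len(retrieved & relevant) / len(relevant))
        if p_vals:
            precisions.append(sum(p_vals) / len(p_vals))
        if r_vals:
            recalls.append(sum(r_vals) / len(r_vals))
    precision = sum(precisions) / len(precisions) if precisions else 0.0
    recall = sum(recalls) / len(recalls) if recalls else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f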

For each bug b, we get precision, recall, and £-measure values. To get an overall measure of all bugs in the system, we summarize these precision, recall, and £-measure values using a macroevaluation averaging technique as in [247]. Macroevaluation means an average is taken of the values for all fi in Fb and then for all bugs in the system. These values were 144 60%

50% --TFC "' 40% ---&--HFCI------1'-1--- 5 m 30% +------.()--<')--6-...(7.-,,...... ,·--~?r---- ~ u. 20% +------,.--,.0==0-----/k==£----+----

o.· O;O . rrilrrn I u ,u,m I~ I m I~~ I ~ I D ,r,.·J-.-a-...... 1--r-1 ...... --~~~---i

' .....6?__ ~Oj "49...... '? ,...'\_<~ ....-:- 1? ~Cc <9 ~'? .,;> ~> <5' ..... ~ .... '~'.'" '1.- ~~ ' (9 "' '-' " " " ~- ~- 1:)· I:)· " " ~· ~- <::>· <::>· Coupling Metric ThreShold

(a) dbViz.

60% ,------

50% -o-SFC~------~~~~. --TFC :~~,.HFC

' ~ ~ @ '0 ~., '\ 1? 'c <9 ,., ~., > rl;> ,,':" "'" '1.- ~., ' \5-:J <:)· C)· <::>· <:l· <::>· <:l· <:)· <::> <:)· C) <::>· <::> <:)·" '"' <:)·' <::>· <;:,· C)· <::>· Coupling Metric ThreShold

(b) Rhino.

Figure 5.4: Average f-measure of coupled features for various thresholds. computed for all threshold values with a step size of 0.05.

Figure 5.4 shows the average f-measure for dbViz and Rhino, and Table 5.9 shows the average precision and recall values of SFC, TFC, and one version of HFC (S0.5T0.5) at various coupling thresholds with a step size of 0.05 in Rhino. These results are for the default corpus. Focusing on the Rhino results, the best precision for structural coupling is 78.4% with a recall of 24.8% at a threshold of 0.1, while the best recall is 30.2% with a precision of 77.9% at the 0.05 threshold, meaning at best slightly over three quarters of the candidate features are relevant, but only 25% to 30% of the relevant features are found.

Textual coupling's best performance in terms of precision is 54.4% with a recall of 38.1% at the 0.3 threshold, while its best recall of 86% with 28.1% precision is at a threshold of

0.05. The precision of SFC seems to increase and then level out as the threshold decreases.

The precisions of both TFC and HFC increase until a certain point, then both decline, likely due to the fact that the threshold becomes low enough that too many features are deemed textually coupled when they are not. SFC had the best precision overall but the worst recall. The precision of HFC generally fell below that of SFC but above that of TFC, and its recall is above SFC's and below TFC's. Therefore, using hybrid feature coupling is a good compromise between the two other metrics. For example, at a threshold of 0.1, HFC's precision is 55% and its recall is 60%. While feature coupling may not provide the best solution to the impact analysis problem, these results suggest that the metrics can still be useful. More research is needed to provide more practical techniques. However, these initial results are promising and comparable to some existing techniques on impact analysis based on structural and textual information [29, 164].

Table 5.9: Precision and recall values for impact analysis at different metric thresholds in Rhino. The first value in a cell is precision, and the second is recall.

Threshold  SFC        TFC        HFC
1          40%, 3%    5%, 1%     5%, 1%
0.95       41%, 3%    5%, 1%     8%, 1%
0.9        43%, 3%    8%, 1%     18%, 2%
0.85       43%, 3%    17%, 2%    22%, 2%
0.8        45%, 3%    18%, 3%    25%, 2%
0.75       48%, 4%    24%, 5%    31%, 3%
0.7        50%, 4%    27%, 8%    35%, 4%
0.65       53%, 5%    30%, 11%   42%, 5%
0.6        58%, 7%    36%, 15%   48%, 6%
0.55       60%, 8%    37%, 19%   52%, 7%
0.5        66%, 9%    39%, 22%   57%, 8%
0.45       67%, 9%    42%, 25%   61%, 9%
0.4        71%, 10%   48%, 28%   66%, 10%
0.35       72%, 10%   52%, 33%   69%, 15%
0.3        75%, 13%   54%, 38%   68%, 22%
0.25       75%, 14%   53%, 45%   70%, 30%
0.2        77%, 17%   52%, 54%   68%, 39%
0.15       77%, 20%   46%, 65%   63%, 46%
0.1        78%, 25%   34%, 79%   55%, 60%
0.05       78%, 30%   28%, 86%   34%, 80%

The precision and recall results also add weight to our claim that structural and textual feature coupling are complementary since their curves are different. We also executed the

Kruskal-Wallis statistical test, a non-parametric alternative to the analysis of variance test, to assess if SFC and TFC are significantly different. At a significance level of 0.01, the tests for both dbViz and Rhino show that SFC's and TFC's precision and recall values are indeed significantly different.

The results are better if individual metric thresholds in Table 5.9 are considered. For instance, TFC at a threshold of 0.2 has 52% precision and 54% recall (f-measure: 50%), meaning one in every two features deemed coupled to a feature of interest would be impacted by a change. If these values are not high enough, a mix of metrics and thresholds could be used to achieve better results. For example, SFC has 78% precision at threshold 0.1, and it could be combined with TFC's recall of 75% at the same threshold.

The impact analysis results presented thus far have been for the default version of the corpus used to obtain textual information. We also investigate the use of feature coupling metrics for impact analysis using different corpora configurations (see Section 5.4.2) for the

Rhino system. Figure 5.5 shows the average f-measure of TFC for the various versions of the corpus. Recall that only textual information is affected by the corpus' configuration, so SFC remains the same across corpora. The graph indicates that the way in which a corpus is built does little to influence precision and recall for impact analysis, no matter the threshold. However, the corpus with comments and stemming typically has the highest precision and recall. Just as was observed with the Spearman correlation coefficients, the inclusion of comments yields better results. Nevertheless, textual feature coupling still works well in cases where comments are missing.

Finally, we study the effects of HFC's weights on precision, recall, and f-measure during feature-level impact analysis. We provide only the results for the default corpus since the results for the other versions were similar. We select two metric thresholds that performed well for SFC and TFC (0.2 and 0.15) and calculate precision and recall of HFC for all weights with a step size of 0.05. Figure 5.6 shows the f-measure curves for HFC at a threshold of 0.2 (black lines) and 0.15 (gray lines). The x-axis denotes the structural weight. The corresponding textual weight is simply one minus the structural weight.

The graph illustrates the effect of relying on one type of information over another. Depending more heavily on structural information yields good precision at the cost of poor recall. Overall, using HFC produces better results than the standalone SFC and TFC metrics.

Consider HFC with a structural weight of 0.2 and a textual weight of 0.8 at a threshold of 0.2. The precision is 51% and the recall is 55%. At the same threshold, SFC's precision is 77%, but its recall is only 17%, so HFC is a better overall performer in this situation because its recall is much higher without sacrificing too much precision. Therefore, HFC helps alleviate those cases where the quality of either structural or textual information is low.

Figure 5.5: Impact analysis f-measure values of TFC for different Rhino corpora (c-ns, c-s, nc-s, nc-ns, and c-ns+d).

Figure 5.6: F-measure of HFC in Rhino at selected thresholds (0.2 and 0.15), plotted against the structural weight.

5.4.4.1 Feature-Class Coupling

There is a cost associated with computing feature coupling metrics: programmers must first identify the methods that implement a feature. They can use a feature location technique to do so. However, they may locate the implementations of only a subset of features in which they are interested instead of all of the features in the system. In this situation, programmers making changes to features may want to know which classes in the system may be affected by their changes. Therefore, we also investigated feature-class coupling.

Instead of measuring the coupling between two features, these alternative metrics determine the degree of coupling between a feature and a class. Their definitions are similar to those of the feature coupling metrics given in Section 5.3, except instead of two features, one feature is replaced with a class and its methods.
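For example, a textual variant of feature-class coupling can reuse the tfc helper sketched earlier by treating the class's methods as the second method set (a hypothetical adaptation for illustration, not the exact FLAT3 behavior):

def textual_feature_class_coupling(feature_methods, class_methods, csm):
    # Treat the class's method set like a second "feature" and reuse TFC (Eq. 5.5).
    return tfc(feature_methods, class_methods, csm)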

In the same way that we explored if feature coupling could be used for impact analysis, we also explored if feature-class coupling could be used to determine classes that would be affected by a change to a feature. Figure 5.7 shows the average f-measure of using feature-class coupling in Rhino. The results are based on randomly selecting 40 of Rhino's features (about 10% of the total number of features) and considering only those features' implementations to have been located. The coupling between these features and Rhino's classes was computed. The same criteria as explained in Section 5.4.4 were used to determine bugs shared by features and classes. The results in Figure 5.7 are an average over 10 randomly selected sets of features. In the best case, feature-class coupling has an f-measure of 22%.

Comparatively, the best f-measure of the feature coupling metrics was above 50%. Feature-class coupling is not as effective at determining what would be affected by a change as the feature coupling metrics. The difference is likely due to the fact that not every method in a class would be impacted by a change, so looking at class coupling is noisier. Features have scattered implementations, so metrics that are designed specifically to account for this dispersion are more effective.

Figure 5.7: Average f-measure of feature-class coupling in Rhino.

5.4.5 Developer Study

In the final part of our evaluation, we investigate if our feature coupling metrics align with developers' opinions of feature coupling. If the metrics indicate that two features are coupled and so do the majority of developers surveyed, then we can be confident in the utility of the measures. More formally, we formulate two hypotheses.

H2 The null hypothesis is that there will be no consensus among the developers and the metrics about whether or not two features are coupled.

H3 The alternative hypothesis is that the majority of developers will indicate that two features are coupled when the features' SFC or TFC values are high and that the features are not coupled when either metric is low.

To test our hypotheses, we conducted a survey in which developers were asked to rate the strength of coupling among pairs of features. Below, we offer general details about the participants and the task they performed, as well as explore the results of the survey.

5.4.5.1 Programmers

The respondents to our survey were 31 volunteer programmers from several different institutions. Twenty-three of the programmers were graduate students, one was an undergraduate, and seven were industry professionals. On average, they had 7.2 years of programming experience, 3.8 with Java, and 2.6 with Eclipse. Each volunteer was given a link to the survey's instructions and could complete it on their own time. The survey took 97 minutes to complete, on average.

5.4.5.2 Task Description

The programmers downloaded an Eclipse installation that was preloaded with our FLAT3 plug-in and all the necessary source code. FLAT3 included mappings of features to code for selected features from Eaddy et al.'s [63] data sets. The programmers could click on a feature's name to see the methods associated with it and double-click on a method to show its source code in the editor. The programmers were asked to consider the code of two features and rate whether the features were coupled. The responses varied according to a four-level Likert scale: "Strong No," "Weak No," "Weak Yes," or "Strong Yes." If a developer could not decide on a rating, they could respond "Unknown." The pairs of features included five from dbViz, six from Rhino, and five from iBatis. The exact instructions and pairs of features given to the participants can be found in Appendix C.

5.4.5.3 Agreement Among the Participants and with the Metrics

The survey is a rating of n subjects (the 16 feature pairs) by k raters (the 31 programmers).

We tested if there was a sufficient amount of agreement among the developers' responses to be able to draw conclusions about the feature coupling metrics. To determine the amount of agreement among the raters, we designed our analysis in a fashion similar to [150] by using the intra-class correlation coefficient (ICC) [147]. We used ICC(A, 1), which calculates the agreement of all the raters, where each person rates each subject (feature pair). The A in ICC(A, 1) means it is an absolute agreement, and the one indicates the ratings are not an average. With the ratings stored in a matrix with feature pairs as the rows and raters as the columns, ICC(A, 1) is calculated as follows:

ICC(A, 1) = \frac{MS_r - MS_e}{MS_r + (k - 1)MS_e + \frac{k}{n}(MS_c - MS_e)}    (5.9)

where k is the number of raters, n is the number of subjects (feature pairs), MS_r is the mean square for rows, MS_c is the mean square for columns, and MS_e is the mean square error. ICC(A, 1) relates the variance of the ratings of each feature pair to the overall variance.

The ratings given by the developers were ordinal, but numeric data is required to compute ICC. Therefore, we transformed the ratings of "Strong No," "Weak No," "Weak Yes," and "Strong Yes" to the values 1, 2, 3, and 4, respectively. Five programmers gave a rating of "Unknown" for at least one feature pair. These "Unknown" responses were omitted in the calculation of ICC. The ICC(A, 1) of the programmers in our survey is 0.45 (values can range from -1 to 1), meaning there is a moderate amount of agreement in their ratings of the pairs of features. We believe that there is enough concordance to be able to draw conclusions. The law of large numbers states that if the population sample is suitably large (between 30 and 50), then the central limit theorem applies even if the population is not normally distributed [208]. In our case, we have 31 participants, so the central limit theorem applies. While we had to remove some responses from the computation of ICC (replies of "Unknown"), the rest of our analyses are based on the responses of all 31 developers, so they can be considered significant.

Knowing that the programmers have a sufficient amount of agreement about which feature pairs are coupled or not, we can examine if the developers' opinions support the feature coupling metrics. Figure 5.8(a), Figure 5.8(b), and Figure 5.8(c) summarize the number of developers that gave each rating for dbViz's, Rhino's, and iBatis' feature pairs, respectively. The height of each bar corresponds to the number of developers that gave the feature pair that rating. Note that each feature pair may not have the same number of total ratings because responses of "Unknown" are excluded. Each cluster of bars can be compared to the metric values in Table 5.10. For instance, the first group of bars in Figure 5.8(a) corresponds to the pair dbViz #1, "Connect to database" and "Exit dbViz." When both SFC and TFC are low, the majority of responses are "Strong No," as can be seen by the first pair of features in dbViz. These features are to connect to a database and exit the program and have little in common, so low structural and textual coupling values are valid, as supported by the developers' ratings. Additionally, these features do not share any common bugs, which is further evidence that they are not coupled.

Figure 5.8: Number of each rating ("Strong No," "Weak No," "Weak Yes," "Strong Yes") for dbViz's, Rhino's, and iBatis' feature pairs. (a) dbViz feature pairs. (b) Rhino feature pairs. (c) iBatis feature pairs.

Table 5.10: Feature coupling values for the dbViz, Rhino, and iBatis feature pairs in the developer survey. Bug data was not available for iBatis.

Features                                           Pair       SFC   TFC   Bugs
Connect to database & Exit dbViz                   dbViz #1   0     0.03  0
Autoarrange Diagram & Undo/Redo                    dbViz #2   0.07  0.06  0
Import from database & Import from SQL             dbViz #3   0.61  0.15  1
Add table & Remove table                           dbViz #4   0.85  0.22  0
Save/Load diagram & Load saved diagram             dbViz #5   0.45  0.13  2
Unary + operator & Addition operator               Rhino #1   0.33  0.27  15
Addition operator & Subtraction operator           Rhino #2   0.71  0.28  17
Date.prototype.toString & Date.prototype.valueOf   Rhino #3   0.75  0.74  2
Unicode format chars & ToPrimitive                 Rhino #4   0     0.08  0
parseInt & parseFloat                              Rhino #5   0.4   0.46  2
SQRT2 & Date.prototype.getTimezoneOffset           Rhino #6   0     0.85  1
Data Sources & JTA                                 iBatis #1  0.08  0.42  N/A
JDBC & JTA                                         iBatis #2  0     0.44  N/A
Query & Max Results                                iBatis #3  0.15  0.47  N/A
Update & Autogenerated Keys                        iBatis #4  0     0     N/A
SELECT & SQL Scripts                               iBatis #5  0     0.06  N/A

153 Another overall trend is that when SFC is high, most raters responded "Strong Yes."

As an example of high structural coupling, consider the feature pair dbViz #3, "Import from database" and "Import from SQL file." These features are very similar in function and share a number of methods, so a high SFC value (0.61) makes sense, and the programmers' responses also support SFC being high. Furthermore, the two import features have a common bug, which also supports higher coupling between them. However, TFC between these two features is rather low (0.15) because the methods that are distinct to each feature have their own vocabulary. Since the metrics are low when most participants responded

"Strong No" and high when the responses were "Strong Yes," we can reject H2, the null hypothesis, and support H3, the alternative hypothesis.

One interesting case is the Rhino #6 feature pair. The two features are "SQRT2," the number value of the square root of two, and "Date.prototype.getTimezoneOffset," which returns the difference between local time and UTC in minutes. There is no structural coupling between the features, but rather high textual coupling. The majority of responses for this feature pair were "Strong No" despite these two features having a high TFC value. Two options are possible: the features are not actually coupled and the high textual coupling is a coincidence, or the programmers did not pick up on the similarity in the two features' vocabularies because textual coupling is not as well-known a concept as structural coupling. The two features do have a shared bug, but after reviewing the features' source code, the textual coupling seems to be artificial. "SQRT2" has two methods, and both of those methods' names happen to be the same as two of "Date.prototype.getTimezoneOffset"'s three methods. These methods perform similar parsing functionalities and use many of the same variable names, so high textual coupling in this case seems to be accidental.

Another interesting case is the Rhino #5 feature pair: "parseInt" and "parseFloat." SFC and TFC have approximately equal values, which are both substantially greater than the average for each metric in Rhino. The developers are almost evenly split in their opinions of whether these two features are coupled, with a slight majority thinking they are coupled. The features' coupling is also supported by the fact that they have two bugs in common. The developers' mixed ratings suggest that perhaps there is a coupling threshold, but that threshold varies from person to person.

Overall, the ratings given by the programmers seem to support our feature coupling metrics. This implies that the measures do capture the coupling between features. Generally, the respondents' opinions support SFC more than TFC, but that may be due to the fact that textual coupling is a newer concept.

5.4.6 Threats to Validity

In this section, we discuss the main threats to the validity of our case studies and provide details on how we minimized these threats.

5.4.6.1 Internal Threats to Validity

Internal validity refers to the degree to which statements about cause and effect are valid.

Since we use previously published data sets, we inherit all of the threats to validity associated with them. One internal threat of the data sets is the subjective manner in which methods were assigned to features. These facts limit the consistency of our results because different mappings would produce different results. However, since the data sets have been used and verified by other researchers [62, 63], these threats are minimized. Additionally,

Spearman rank-order correlation can mitigate unreliable measurements as long as their relative order is correct [110]. Also, the Rhino data set has a large sample size (84,252 feature pairs). The moderate and strong correlations observed are unlikely if the data is unreliable. Another threat we inherit from the data sets pertains to the assignment of bugs. As with any approach to mining software repositories, defects can potentially be mapped to wrong or missing methods if methods undergo a change in signature. Similarly, automated repository mining does not always provide a complete picture of a bug's history.

It may lack social, technological, and organizational knowledge [6] or may be biased and only record a fraction of bug fixes [14]. Another threat related to the data sets is their granularity. Full methods are associated with features. However, only a small portion of the code in a method may actually pertain to a feature [118, 172]. Therefore, a finer level of granularity such as statements or basic blocks would be more accurate. Since we are not experts in any of the systems we studied, we made no attempts to refine the granularity of the data sets.

In our case studies, we observed a high correlation between feature coupling and defects, which may imply that feature coupling can serve as a predictor for faults. However, correlation values only measure goodness of fit, not predictive power. To better assess predictive power, we would need to perform some form of data splitting, such as ten-fold cross-validation, which is part of our future work.

In the context of our survey, there are a number of threats to validity. First, the programmers' proficiency with Java and Eclipse is a threat because we did not select participants based on their familiarity with either technology. Some of the programmers had no experience with Java or Eclipse. By including programmers with little or no Java experience in our survey, there is a danger that they made poor choices due to their unfamiliarity with the programming language. Another threat related to the programmers is their motivation. All the developers who participated were volunteers and received no compensation, so they had no external incentive to perform well. On the other hand, there was no time pressure to complete the survey quickly because there was no time constraint.

Two final threats to the validity of our survey pertain to the task the programmers were asked to complete. The participants were instructed to consider if two features were coupled in the context of performing a change task to either, leaving the task rather open-ended and general. However, it may be difficult to gauge the relationship between two features without a specific context. There could be changes made to a feature that affect the other one, but other changes made to the same feature may not affect the other feature. To avoid making a judgment about a specific change task, we kept the task general.

5.4.6.2 External Threats to Validity

External threats to validity limit the degree to which generalizations can be drawn from our results. We studied only three systems, one small and the other two medium in size. In future work, our feature coupling metrics will be validated on larger systems. However, the number of features studied in Rhino was large (411), and the feature coupling metrics of both systems had statistically significant correlations with bugs. While the systems are open-source, their development shares many characteristics with industrial systems, such as the use of specifications, use cases, and change management systems. Therefore, it is reasonable to expect that our results would hold for industrial software of similar sizes. All the systems we studied are written in Java. To confirm that our results are not language-specific, additional studies on systems written in other languages are needed.

Concerning the survey we conducted, there are also threats to external validity. The majority of the programmers were graduate students, so the participants are not necessarily representative of all developers. However, some of our participants were industrial programmers, and in general, their responses aligned with those of the graduate students.

5.5 Conclusion

We have introduced novel metrics that capture feature-level coupling by using structural and textual information, filling a critical gap in the area of software measurement. We have theoretically validated our metrics and extended the unified framework for coupling measurement [25] with important new dimensions. Through our three-pronged evaluation, we have shown that these metrics are useful because they are good predictors of fault-proneness. Additionally, they have an application in feature-level impact analysis to determine if a change made to one feature may have undesirable effects on other features.

Finally, based on the results of a survey of 31 developers asked to rate the strength of coupling between pairs of features, our metrics align with those ratings. Altogether, these results point to the conclusion that structural and textual feature coupling metrics are valid and useful tools for developers performing feature-level software maintenance.

A secondary goal of this work was to discover the optimal way in which to obtain our metrics so developers can use them most efficiently. Both TFC and HFC can be computed under different configurations. Textual information can be mined based on several options (e.g., including comments, performing stemming). When available, external documentation should be included in the corpus to boost textual similarities by adding more domain terminology and concepts. In the absence of external documentation, comments should be preserved. When combining structural and textual information for HFC, more weight should be placed on the stronger of the two sources to better predict faults or perform impact analysis tasks. If the quality of the structural and textual information is unknown, placing equal weight on each still performs well.
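As an illustration of this weighting scheme, the sketch below shows one way such a combination could be computed. It is only an illustration: the precise definitions of SFC, TFC, and HFC are the ones given in Chapter 5, and the class name, method name, and lambda weighting parameter here are hypothetical.

// Illustrative sketch only; the actual SFC, TFC, and HFC definitions are in Chapter 5.
public final class HybridFeatureCoupling {

    // lambda = 1.0 uses only the structural score, lambda = 0.0 only the textual score,
    // and lambda = 0.5 weights both sources equally (the fallback suggested when the
    // quality of the two information sources is unknown).
    public static double hfc(double sfc, double tfc, double lambda) {
        if (lambda < 0.0 || lambda > 1.0) {
            throw new IllegalArgumentException("lambda must be in [0, 1]");
        }
        return lambda * sfc + (1.0 - lambda) * tfc;
    }

    public static void main(String[] args) {
        double sfc = 0.42; // hypothetical structural coupling for a feature pair
        double tfc = 0.31; // hypothetical textual coupling for the same pair
        System.out.println(hfc(sfc, tfc, 0.5));  // equal weight on both sources
        System.out.println(hfc(sfc, tfc, 0.75)); // favor the structural source
    }
}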

We make all of the source code, data, and results of the case studies available and invite other researchers to replicate our work.

Chapter 6

FLAT3: Feature Location and Textual Tracing Tool

During software maintenance, it is very common for developers to search for source code that is relevant to their task. When their task pertains to modifying, extending, or adding functionality, their search is known as feature (or concept) location [5, 12]. For example, assume a developer working on an open source text editor needs to modify the file saving feature. The developer first needs to find the existing source code that implements file saving before she can make any changes. If the developer has never worked with this particular feature before, she will not know where to begin and may spend significant time and effort manually searching for the feature's source code before being able to make any changes.

To aid developers in this situation, automated feature location techniques have been proposed to reduce the amount of time and effort spent searching for a feature's implementation. Some of these approaches employ information retrieval (IR) to search a body of text, such as source code, for sections that are relevant [142]. Other techniques analyze dynamically-collected execution traces to identify a feature's implementation [76, 229]. IR and dynamic analysis have also been combined to form hybrid feature location techniques [5, 130]. To make these feature location approaches more accessible to developers, we have created FLAT3, the Feature Location and Textual Tracing Tool¹. It is an Eclipse plug-in that supports three well-established feature location techniques²: 1) information retrieval (IR), 2) dynamic collection of execution traces, and 3) a combination of IR and dynamic tracing known as SITIR [130]. Feature location via IR involves textually searching a project's source code for code that is similar to a query that describes a feature. Dynamic feature location entails running the software and invoking the feature of interest to capture a trace of the source code that was executed. FLAT3 also implements SITIR, which integrates textual and dynamic feature location techniques so that they can be used together effectively.

In addition to providing support for multiple feature location techniques, FLAT3 also supports annotating and saving relevant search results. The tool permits developers to create and name features to which the source code implementing them can be linked. This feature mapping functionality allows developers to save their feature location results and avoids the need to repeatedly search for a given feature's implementation. It also allows developers to automatically compute feature coupling metrics.

FLAT3 makes two significant contributions that current feature location tools do not provide. First, existing tools generally support one way of searching (i.e., IR only or dynamic tracing only). FLAT3 makes both the IR and dynamic techniques available, and it also integrates them. FLAT3's second contribution is its feature annotation function which documents a feature's source code and can be used to compute metrics about features.

While there are some tools that provide this tagging functionality [181, 184], they are not coupled with feature location techniques, and existing feature location tools do not provide mechanisms for saving the mappings of features to source code. FLAT3 is a complete suite of feature location, annotation, and visualization tools.

6.1 FLAT3

FLAT3 is implemented as an Eclipse plug-in. Figure 6.1 gives an overview of FLAT3's architecture. The tool combines the functionality of several existing libraries and applications.

¹ FLAT3 is available online at http://www.cs.wm.edu/semeru/flat3/.
² Part of the future work planned for FLAT3 is to have it implement the feature location techniques based on web mining introduced in Chapter 4.


Figure 6.1: Overview of the architecture of FLAT3.

It uses information retrieval from the Lucene³ library to locate and rank code by similarity to a user's query. FLAT3 also uses MUTT⁴ to capture execution traces of feature-specific scenarios and test cases. FLAT3's feature annotation capability is based on ConcernMapper⁵ and ConcernTagger⁶, Eclipse plug-ins that allow for the creation of concern (feature) models and for source code to be linked to features. By integrating these existing tools, FLAT3 provides developers with a way to easily search for features' implementations and annotate their findings for future reuse. Based on the annotations, FLAT3 can also visualize the location of a feature's source code across a system's classes using a map metaphor similar to the one used in AspectBrowser⁷. FLAT3's features are described in detail below.

6.1.1 Textual Feature Location

The first way in which FLAT3 allows developers to perform feature location is textually. FLAT3 textually searches for a feature's source code by leveraging the Lucene information retrieval library. To use this functionality, developers open the FLAT3 Features view in Eclipse and click on the search toolbar button. This action opens a dialog box (see Figure 6.2) into which developers can enter a query that describes the feature they are trying to find, such as "file saving."

³ http://lucene.apache.org/java/docs/
⁴ http://sourceforge.net/projects/muttracer/
⁵ http://www.cs.mcgill.ca/~martin/cm/
⁶ http://www1.cs.columbia.edu/~eaddy/concerntagger/
⁷ http://cseweb.ucsd.edu/~wgg/Software/AB/


Figure 6.2: Entering a query to begin textual feature location.


Figure 6.3: FLAT3's Search/Trace Results view with a list of classes, methods, and fields returned by Lucene sorted by similarity to a query.

After the query is issued, Lucene indexes Eclipse's workspace if it has not already been indexed. Indexing involves creating a document for each method and field consisting of all the words used in the method or field. Keywords and common stop words (e.g., "the" and "a") are removed. Also, words are split (e.g., "compoundIdentifier" becomes "compound" and "identifier") and stemmed (e.g., "searching" becomes "search").

Each document is converted to a vector, as is the query. Then, all the document vectors are compared to the query vector to determine their similarity, and a score is assigned to each method or field based on that similarity.
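As an illustration of the indexing and scoring steps, the following sketch implements a deliberately simplified vector-space model over raw term frequencies. It is not FLAT3's implementation (the tool delegates this work to Lucene); the class, helper names, and example documents are hypothetical, and stemming is omitted for brevity.

import java.util.*;

// Simplified sketch of vector-space ranking; FLAT3 itself relies on Lucene for this.
public final class TinyCodeSearch {

    private static final Set<String> STOP_WORDS = Set.of("the", "a", "an", "of", "to");

    // Split camelCase identifiers, lower-case the pieces, and drop stop words.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String raw : text.split("[^A-Za-z]+")) {
            for (String part : raw.split("(?<=[a-z])(?=[A-Z])")) {
                String term = part.toLowerCase();
                if (!term.isEmpty() && !STOP_WORDS.contains(term)) {
                    terms.add(term);
                }
            }
        }
        return terms;
    }

    static Map<String, Integer> termFrequencies(List<String> terms) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : terms) tf.merge(t, 1, Integer::sum);
        return tf;
    }

    // Cosine similarity between two term-frequency vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        for (int v : b.values()) normB += v * v;
        return (normA == 0 || normB == 0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        // Hypothetical "documents": one per method, built from its identifiers.
        Map<String, String> methods = Map.of(
                "Saver.twoStageSaveFile", "twoStageSave saveFile buffer path",
                "Buffer.insert", "insert offset text undo");
        Map<String, Integer> query = termFrequencies(tokenize("file saving"));
        for (Map.Entry<String, String> m : methods.entrySet()) {
            double score = cosine(query, termFrequencies(tokenize(m.getValue())));
            System.out.printf("%s -> %.3f%n", m.getKey(), score);
        }
    }
}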

Figure 6.3 shows FLAT3's Search/Trace Results view, listing the results returned by Lucene for the "file saving" query from the source code of jEdit, an open source text editor. The results include the method or field's name, class, a score of how similar it is to the query, its fully qualified name, and any features with which it has been previously annotated (not visible in the figure). The results are ordered by their relevance to the query.

Developers can double click on a result to view that method or field's source code. If a result is deemed to be relevant to the feature of interest, it can be annotated in this view, as will be explained in Section 6.1.4. Developers can also refine their results by searching within the original results with a new query.

6.1.2 Dynamic Feature Location

In addition to textual feature location, developers can also use FLAT3 to locate features dynamically. This approach to feature location uses MUTT, a tracing tool based on the Java Platform Debugger Architecture⁸ (JPDA). MUTT runs a subject program on its own Java virtual machine and collects a trace of runtime method calls. What is unique about MUTT is that the user can control when to turn tracing on and off with a button.

To perform dynamic feature location in FLAT3, developers first determine a scenario or test case that invokes the desired feature. For instance, for the file saving feature, a scenario would be to start jEdit, open a file, make changes, save the file, and exit. To collect an execution trace, developers right-click on the class that contains the project's main method and select "Trace with MUTT," as in the first part of Figure 6.4. This will launch the program along with a separate window with a start/stop button to control tracing, as in the second part of Figure 6.4. The start button should be clicked just before the feature is invoked, and tracing should be stopped just after the feature's behavior completes. All methods that were executed between the start and stop interval are collected in a trace.

Once developers are done tracing, they can close the application and return to FLAT3 to find a listing of the methods executed by the scenario. The listing is very similar to Lucene's results (see Figure 6.3) with the exception that no similarity scores are given. Developers can browse these results to find relevant methods instead of searching the full source code of the system. Just as with Lucene's results, double clicking a method from the trace opens its source code for viewing. Traces can be saved and loaded again instead of having to be recollected, as shown in Figure 6.5.
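The start/stop interval described above can be pictured with the small sketch below. It is not MUTT's implementation (MUTT hooks method-entry events through JPDA, which is omitted here); the class and method names are hypothetical and only illustrate collecting, saving, and reloading the set of methods executed while tracing is enabled.

import java.io.IOException;
import java.nio.file.*;
import java.util.*;

// Illustrative sketch of start/stop-controlled trace collection, not MUTT itself.
public final class TraceCollector {

    private volatile boolean tracing = false;
    // Preserve first-execution order but record each method only once.
    private final Set<String> executedMethods = new LinkedHashSet<>();

    public void start() { tracing = true; }   // clicked just before invoking the feature
    public void stop()  { tracing = false; }  // clicked after the feature's behavior completes

    // Called for every method entry reported by the tracing back end.
    public void onMethodEntry(String fullyQualifiedMethod) {
        if (tracing) {
            executedMethods.add(fullyQualifiedMethod);
        }
    }

    public Set<String> trace() {
        return Collections.unmodifiableSet(executedMethods);
    }

    // Traces can be exported and re-imported instead of being recollected.
    public void save(Path file) throws IOException {
        Files.write(file, executedMethods);
    }

    public void load(Path file) throws IOException {
        executedMethods.clear();
        executedMethods.addAll(Files.readAllLines(file));
    }
}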

⁸ http://java.sun.com/javase/technologies/core/toolsapis/jpda/


Figure 6.4: Invoking MUTT on jEdit in Eclipse (1) and jEdit running with MUTT's tracing control button (2).

Figure 6.5: The Feature view's toolbar buttons to refine a search, return to the original search, export a trace, import a trace, and visualize a feature or search results.

6.1.3 Integrated Feature Location

FLAT3 allows for the integration of its two separate feature location techniques. Since dynamic feature location in FLAT3 is likely to return many methods, to narrow the results, it can be integrated with textual feature location following the SITIR approach [130]. After collecting an execution trace for a feature, IR is used to rank only the invoked methods instead of all of the methods in the system. In FLAT3, after collecting a trace with MUTT, Lucene can be used to textually search only within the executed methods by clicking the "Refine Search" button. This opens a dialog in which developers can enter a query, causing Lucene to compute similarity scores for the methods in the trace as described in Section 6.1.1. The methods are indexed beforehand, and only similarities are computed at this point. After the scores are calculated, developers are presented with a list of the trace's methods ranked by their similarity to the query. Combining two types of feature location techniques employs more sources of information to find a feature's implementation than a standalone approach. Dynamic tracing acts as a filter to IR by limiting the methods that are ranked to only those that are executed. This idea was first introduced in the PROMESIR approach [160] and further refined in SITIR [130].
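The filter-then-rank idea can be sketched as follows. This is not the SITIR or FLAT3 implementation; it simply reuses the hypothetical TinyCodeSearch helpers from the earlier sketch (assumed to be in the same package) and assumes a set of fully qualified names of the methods that appeared in the trace.

import java.util.*;

// Sketch of trace-filtered ranking in the spirit of SITIR: only executed methods are ranked.
public final class TraceFilteredSearch {

    public static LinkedHashMap<String, Double> rank(String query,
                                                      Map<String, String> methodDocuments,
                                                      Set<String> executedMethods) {
        Map<String, Integer> queryVector =
                TinyCodeSearch.termFrequencies(TinyCodeSearch.tokenize(query));
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, String> doc : methodDocuments.entrySet()) {
            if (!executedMethods.contains(doc.getKey())) {
                continue; // the trace acts as a filter: unexecuted methods are never ranked
            }
            Map<String, Integer> docVector =
                    TinyCodeSearch.termFrequencies(TinyCodeSearch.tokenize(doc.getValue()));
            scored.add(Map.entry(doc.getKey(), TinyCodeSearch.cosine(queryVector, docVector)));
        }
        scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        LinkedHashMap<String, Double> ranking = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : scored) {
            ranking.put(e.getKey(), e.getValue());
        }
        return ranking;
    }
}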

6.1.4 Annotating Features

Once a feature's source code has been found, it can be annotated and saved with FLAT3. In the Features view, features can be created and given a name. Then classes, methods, and fields can be associated with a feature from any of the results views by right-clicking on the method and selecting "Link" and the name of the feature to which the code belongs, as in Figure 6.6. Code can also be mapped to features through Eclipse's package explorer, outline view, and editor. Code can be mapped to multiple features. Once code has been linked to a feature, the feature's name appears in the search results, as in Figure 6.7 where the method "Saver.Saver" has been tagged with the File Saving feature.
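A simplified picture of the underlying mapping is sketched below. FLAT3 builds this functionality on ConcernMapper and ConcernTagger rather than on such a class; the names here are hypothetical and only illustrate a many-to-many mapping between features and code elements with link and unlink operations.

import java.util.*;

// Hypothetical, simplified model of feature-to-code annotations; FLAT3 itself
// builds on ConcernMapper and ConcernTagger rather than this class.
public final class FeatureModel {

    // Feature name -> fully qualified names of linked classes, methods, and fields.
    private final Map<String, Set<String>> links = new LinkedHashMap<>();

    public void createFeature(String feature) {
        links.putIfAbsent(feature, new LinkedHashSet<>());
    }

    // Code can be mapped to multiple features, so linking never removes other links.
    public void link(String feature, String codeElement) {
        createFeature(feature);
        links.get(feature).add(codeElement);
    }

    public void unlink(String feature, String codeElement) {
        Set<String> elements = links.get(feature);
        if (elements != null) {
            elements.remove(codeElement);
        }
    }

    public Set<String> elementsOf(String feature) {
        return Collections.unmodifiableSet(links.getOrDefault(feature, Set.of()));
    }

    public static void main(String[] args) {
        FeatureModel model = new FeatureModel();
        // Hypothetical fully qualified method names used only for illustration.
        model.link("File Saving", "com.example.io.Saver.twoStageSave");
        model.link("File Saving", "com.example.ui.SaveAction.saveFile");
        System.out.println(model.elementsOf("File Saving"));
    }
}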

Figure 6.8 shows the Features view, listing the code associated with the File Saving feature. A feature's methods are grouped hierarchically by class. FLAT3 also supports a hierarchy of features so that a feature can have child features, as shown in Figure 6.8, where Save As is a child of the File Saving feature. Code can be removed from features by right-clicking on them and selecting "Unlink" and the name of the feature. Features and their mappings are saved and can be revisited when FLAT3 is reopened. FLAT3 supports multiple feature domains so that features from different software systems can be kept separate. Saving the mappings of source code to features acts as a form of documentation, making it easier to keep track of and modify features and their implementations [182]. An additional benefit of annotating a feature's source code is that FLAT3 can automatically compute the feature coupling metrics defined in Chapter 5.

Figure 6.6: Linking a search result with a feature.

Figure 6.7: A search result has been annotated with the File Saving feature.

Figure 6.8: FLAT3's Features view showing code associated with the File Saving feature. FLAT3 supports child features; here, the File Saving feature is shown with a child feature, Save As.


Figure 6.9: FLAT3's visualization view showing classes from a search that have code similar to the query. The highlighted rows indicate the degree of similarity of the code to the feature.

6.1.5 Visualization

FLAT3 also provides a visualization functionality that shows the distribution of a feature or search results across files. The visualization is accessible by right-clicking on a feature and selecting "Visualize feature..." or by clicking the "Visualize" button after obtaining results from Lucene or MUTT. FLAT3 uses the same map metaphor as AspectBrowser [96] to visualize the location of aspects in files. Figure 6.9 shows an example of the FLAT3 visualization. Each box represents a class, and each row of pixels in a class' box corresponds to a section of code. If the row is highlighted in red, it means that code is associated with the feature or present in the search results. If Lucene's results are visualized, the shade of the row of pixels indicates the degree of similarity of that section of code to the user's query. This visualization gives developers a global idea of where a feature of interest is implemented.
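One plausible way to derive a row's shade from a similarity score is sketched below; the actual rendering in FLAT3 and AspectBrowser may differ, and the class and method names here are hypothetical.

import java.awt.Color;

// Sketch of the shading idea only: map a similarity score in [0, 1] to a shade of red.
public final class SimilarityShading {

    public static Color shadeFor(double similarity) {
        double s = Math.max(0.0, Math.min(1.0, similarity));
        int fade = (int) Math.round(255 * (1.0 - s)); // s = 1 -> pure red, s = 0 -> nearly white
        return new Color(255, fade, fade);
    }

    public static void main(String[] args) {
        System.out.println(shadeFor(0.9)); // highly similar code: a saturated red row
        System.out.println(shadeFor(0.1)); // weakly similar code: a pale red row
    }
}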

6.2 Related Work

FLAT3 is based on several existing tools. The Lucene library provides full-text searches, MUTT collects execution traces, and ConcernTagger and ConcernMapper [184] lend the ability to annotate and save feature mappings. These functionalities are integrated in FLAT3. There are other existing tools that implement either feature location or annotations, but not both. IRiSS [163], JIRiSS [162], and Google Eclipse Search [166] are tools that support feature location via Latent Semantic Indexing (LSI) [59], an advanced IR method. FLAT3 relies on Lucene, so it is faster than LSI-based tools. While none of these tools allow for the saving of located feature code, FEAT [181] and ConcernTagger do. However, these tools rely on manual feature location. There are several other feature location tools such as STRADA [69] which uses dynamic information; JRipples [35] and Suade [224] which use static analysis; Find-Concept [201] which uses natural language processing; and Dora [102] which uses textual and static analysis. However, FLAT3 is unique in that it combines textual and dynamic feature location with annotations and visualization.

6.3 Conclusion

FLAT3 is a novel tool suite for feature location and feature coupling. It is implemented as an Eclipse plug-in and combines the functionality of a number of existing tools in one easy-to-use application. FLAT3 allows developers to perform feature location textually and dynamically, to save their results for future reference, to visualize the dispersion of features or search results throughout a project, and to automatically compute feature coupling metrics.

Future work on FLAT3 includes incorporating the feature location techniques based on web mining introduced in Chapter 4 and making the tool more robust so that it can index large source code bases, trace larger programs, and save and update annotations for evolving programs.

Chapter 7

Conclusion

The motivation of this dissertation is to support feature-level software maintenance. Most maintenance tasks (e.g., bug reports) are expressed in terms of features, so supporting maintenance at the feature-level is more user-oriented than traditional class-level approaches. Features also tend to have non-modularized implementations, meaning that locating them and determining the relationships between them is difficult. This dissertation has introduced novel methods for supporting two software maintenance tasks: feature location and impact analysis via feature coupling.

Regarding feature location, this dissertation makes a number of contributions:

• We have conducted a survey of published research articles related to feature location. The articles have been classified within a taxonomy that has nine dimensions. Each dimension captures an essential characteristic of feature location research. The survey can be used by both researchers and practitioners to discover useful approaches and potential avenues for future research.

• We have completed an exploratory study of existing feature location techniques with the goal of determining how well the approaches locate multiple methods that are relevant to a feature. This study examined techniques that employ textual, dynamic, and static analyses. We explored different combinations of these analyses and different configurations of each. We found that existing feature location techniques can successfully locate one relevant method for a feature but rarely many more. These results led us to focus on developing new, more effective feature location techniques.

• We have introduced new feature location techniques that employ web mining algorithms to rank the methods executed in a trace and then use that ranking to filter false positives from the results of an IR-based approach. Our evaluation shows that pruning the bottom-ranked methods according to the HITS hubs algorithm is the most effective approach. Statistical analysis also shows that the improvement in effectiveness of our web mining approaches over the baseline is significant.

• We have developed tool support for feature location. The tool, called FLAT3, allows users to search for features textually and dynamically, annotate the results, and visualize their annotations.

This dissertation also makes a number of contributions related to feature coupling:

• We have developed feature coupling metrics based on structural information, textual information, and their combination. All existing coupling metrics are defined at the class-level, so our metrics are novel and fill a gap in the research area.

• We have shown that there is a moderate to strong statistically significant correlation between our feature coupling metrics and defects. Just like with classes, the more coupled two features are, the more likely they are to have defects.

• The feature coupling metrics can be used for impact analysis. If a modification is made to one feature, the metrics can be used to determine what other features may be affected. In our evaluation, we found that as many as half of the features deemed as coupled would be affected by a change to the given feature, and over half of the affected features are retrieved.

• We conducted a survey with 31 programmers to determine if the feature coupling metrics indeed capture a recognizable relationship between features. The programmers were asked to rate the strength of coupling between 16 pairs of features, and the results show that when the programmers rated the features as tightly coupled, the metrics' values were high, indicating stronger coupling. Likewise, when the programmers rated the features as loosely coupled, the metrics' values were low, indicating an absence of coupling.

• We have developed tool support for our metrics. FLAT3 also supports the automatic computation of our feature coupling metrics from the annotations provided by the user.

In addition, all of the data generated in this dissertation is made publicly available online so that other researchers can replicate this work.

While the work presented in this dissertation shows promising results for the new feature location techniques and feature coupling metrics, there is still room for improvement in future work. In terms of feature location, there are several possible avenues. We have seen in two separate studies that feature location techniques do very well at locating one method that is relevant to a feature. However, features are often implemented by multiple methods, so approaches that more effectively locate more of a feature's source code are needed. Additionally, since the feature location techniques we have introduced make use of thresholds, an exploration of how to automatically select a threshold for a given feature is an area of future work.

There are also a number of avenues for future work related to feature coupling. We can expand the metrics' definitions to include fields or to be more fine-grained than methods. We have shown that there is a statistically significant correlation between feature coupling and defects. However, correlation measures goodness of fit and not predictive power. Future work includes performing data splitting to assess the predictive power of the metrics. The metrics also need to be computed for more features in other types of software systems to determine their generalizability.

In conclusion, this dissertation has introduced novel feature location techniques and feature coupling metrics to aid programmers performing software maintenance on features. We have shown these techniques and metrics to be effective and useful and envision that one day they may be widely adopted by software developers and maintainers.

Appendix A

Classification of Feature Location Articles

This appendix contains tables listing 1) the dimensions of the feature location taxonomy and their related attributes and 2) the surveyed papers classified within the taxonomy.

A.1 Dimensions of the Feature Location Taxonomy

The feature location taxonomy has nine dimensions. Section 2.1 of Chapter 2 discusses these dimensions, and Table A.1 lists each dimension and its associated attributes.

Table A.1: Dimensions of the feature location taxonomy.

Type of Article
  Research (R): The article introduces a new feature location technique.
  Tool (T): The article describes a tool that supports feature location.
  Case Study (CS): The article presents a case, industrial, or user study.

Type of Analysis
  Dynamic (D): Dynamic analysis is used to locate features.
  Static (S): Static analysis is used to locate features.
  Textual (T): Textual analysis is used to locate features.
  Historical (H): Repository mining is used to locate features.
  Other (O): Another type of analysis is used to locate features.

Source of Information
  Source Code (SC): Source code is used to locate features.
  Documentation (Doc): Documentation is leveraged to find features.
  Execution Trace (ET): Execution information is used to locate features.
  Dependence Graph (DG): Features are found using a dependency graph.
  Repository (Rep): Features are located by mining a repository.
  Other: Another source of information is used for feature location.

Granularity
  Class (C): The classes related to a feature are found.
  Method (M): The methods that implement a feature are identified.
  Statement (St): Statements, lines, or basic blocks associated with a feature are located.
  Variable (V): Variables relevant to a feature are located.
  Artifact (A): Non-code artifacts are associated with features.

Programming Language Support
  Java: The approach supports feature location in Java.
  C/C++: The technique can find features for C/C++ systems.
  Other: Feature location in some other language is supported.

Presentation of Results
  Ranked List (Ranked): The located program elements are presented as a ranked list.
  Suggestion Set (Set): The found program elements are presented as an unordered set.
  Visualization (Visual): The located program elements are presented using a visualization.
  Manual: The feature location technique is a manual process.

Evaluation
  Preliminary (P): The evaluation is on small systems or preliminary evidence is given.
  Benchmark (B): The evaluation is based on a predetermined benchmark.
  Expert (E): System experts are used to evaluate the results.
  Non-expert (NE): Non-experts are used to evaluate the results.

Comparison
  ASDG: Abstract System Dependence Graph [39]
  DFT: Dynamic Feature Traces [77]
  FCA: Formal Concept Analysis-based feature location [76]
  grep: UNIX grep
  LSI: Latent Semantic Indexing-based feature location [142]
  PROMESIR: Probabilistic Ranking of Methods based on Execution Scenarios and Information Retrieval [160]
  SPR: Scenario-based Probabilistic Ranking [5]
  SR: Software Reconnaissance [229]
  Suade: Suade Topology-based Search Tool [224]
  Other: Comparison to another feature location technique.
  None: No comparison to any feature location approach.

Systems Evaluated
  The software systems upon which the technique has been applied are listed.

A.2 Classification of Surveyed Feature Location Articles

Table A.2 classifies all of the feature location articles included in the survey in Chapter 2 within the taxonomy defined in Table A.1.

Table A.2: Articles classified within the taxonomy.

Article Analysis Info. Gran. Results Langs. Eval. Compared Systems Evaluated

Antoniol 2006 [5] R D ET M Ranked Java, p grep, SR, Mozilla, Firefox, Chimera, C/C++ FCA XFig, ICEBrowser, JHot- Draw

Antoniol 2005 [4] R D ET M Ranked C/C++ p grep, FCA Mozilla Bohnet 2008 [24] T D, S ET,DG M Visual C/C++ p None (N/A) Bohnet 2008 [23] T D, S ET,DG M Visual C/C++ p None (N/A) Bohnet 2007 [22] T D, S ET,DG M Visual C/C++ p None (N/A) p >-' Bohnet 2007 [21] T D, S ET,DG M Visual C/C++ None gee .....:( .....:( Bohnet 2006 [20] T D, S ET,DG M Visual C/C++ p None Firefox Bohnet 2006 [19] T D, S ET,DG M Visual C/C++ p None LandXplorer Studio Buckner 2005 [35] T s DG M Visual Java NE None Art of Illusion Chen 2001 [38] R H Rep St Ranked C/C++ p None KDE Chen 2001 [40] T s DG M Visual C/C++ NE None Mosaic Chen 2000 [39] R s DG M Manual C/C++ p None Mosaic -- Cleary 2009 [45] R T SC, Doc M Set Java B Other Eclipse Cleary 2007 [44] R T SC, Doc M Set Java B Other Eclipse

Cubranic 2005 [218] R H Rep A Ranked Java B, None Eclipse NE Table A.2: (continued).

Article Analysis Info. Gran. Results Langs. Eval. Compared Systems Evaluated

Cubranic 2004 [219] R H Rep A Ranked Java B None Eclipse Cubranic 2003 [217] R H Rep A Ranked Java B None Eclipse -- de Alwis 2008 [55] T S, D, H DG, ET, C,M Set .Java p None ArgoUML Rep de Alwis 2007 [56] cs S, D, H DG, ET, C,M Set Java NE JQuery, Fer- jEdit Rep ret, Suade Eaddy 2008 [62] R D, S, T SC, ET, C,M Set Java B LSI, SPR, Rhino

...... DG DFT, SR ---1 00 Edwards 2009 [66] R D ET St Set C/C++ p Other Apache httpd Edwards 2006 [65] R D ET M Ranked C/C++ p None Gunner, Joint STARS -- Egyed 2007 [69] T D ET M Visual Java p None ArgoUML, GanttProject, Video-on-Demand player Eisenbarth 2003 [76] R D, S ET,DG M Set Java, E None XFig, Mosaic, Chimera, Agi-

C/C++ lent Eisenbarth 2001 [75] R D, S ET,DG M Set C/C++ p None XFig Eisenbarth 2001 [74] R D, S ET,DG M Set C/C++ p None XFig Eisenbarth 2001 [73] R D, S ET,DG M Set C/C++ p None Mosaic, Chimera Table A.2: (continued).

Article Analysis Info. Gran. Results Langs. Eval. Compared Systems Evaluated

Eisenberg 2005 [77] R D ET M Ranked Java B SR HTMLUnit, HTTPUnit, Ax- ion Gay 2009 [85] R T sc M Ranked Java B None jEdit, Eclipse, Adempiere Grant 2008 [90] R T sc M Ranked C/C++ p None cook Griswold 2000 [96] T T sc St Visual Icon p None wine2html Hill 2009 [103] R T sc M Ranked Java B Other Rhino, jajuk, jBidWatcher, javaHMO

...... Hill 2007 [102] R S, T SC, DG M Set Java NE Suade GanttProject, jBidWatcher, -.:r co Freemind

Ibrahim 2003 [105] cs D ET M Set C/C++ p None Generate Index Janzen 2003 [107] T s DG C,M Set Java p None Jin Ko 2006 [117] cs - Java NE None Paint Koschke 2005 [118] R D, S ET,DG St Set C/C++ p FCA sdcc, eel LaToza 2007 [123] cs - Java NE None jEdit Liu 2008 [129] R D,T SC,ET IVI Set Java B SITIR jEdit, Eclipse -- Liu 2007 [130] R D,T SC,ET M Ranked Java B LSI, SPR, jEdit, Eclipse PRO ME SIR

Lukoit 2000 [133] T D ET M Visual C/C++ p None Joint STARS, Mosaic Table A.2: (continued).

Article Analysis Info. Gran. Results Langs. Eval. Compared Systems Evaluated

Marcus 2005 [141] cs S, T SC, DG M Ranked Java, NE grep, LSI, Art of Illusion, Doxygen C/C++ ASDG Marcus 2004 [142] R T sc M Ranked C/C++ NE ASDG, grep Mosaic Petrenko 2008 [157] R T sc M Set Java, NE None Eclipse, Mozilla C/C++ Poshyvanyk 2007 R D,T SC,ET M Ranked Java, B LSI, SPR Eclipse, Mozilla [160] C/C++

...... Poshyvanyk 2007 R T sc M Set Java B LSI Eclipse 00 0 [165] Poshyvanyk 2006 R D,T SC,ET M Ranked Java, B LSI, SPR Eclipse, Mozilla [159] C/C++ Poshyvanyk 2006 T T sc c Ranked Java p Other Violet, Art of Illusion [166] Poshyvanyk 2006 T T sc St Ranked Java p None (N/A) [162] Poshyvanyk 2005 T T sc St Ranked C/C++ p None vVinMerge, Doxygen [163] Ratiu 2007 [170] R S, T SC, DG C, M, Set Java p None Java standard library St Table A.2: (continued).

Article Analysis Info. Gran. Results Langs. Eval. Compared Systems Evaluated

Ratiu 2006 [169] R S, T SC, DG C, M, Set Java p None Java standard library St Revelle 2009 [173] cs D, S, T SC, ET, M Ranked Java NE LSI, SITIR, jEdit, Eclipse DG Other Revelle 2005 [172] cs 0 sc M Set Java, p None sort, Minesweeper C/C++ Robillard 2008 [176] R s DG M Ranked Java B,E, None jEdit, JHotDraw, Azureus,

f-ooo' NE Violet, LOCC 00 f-ooo' Robillard 2008 [178] R H Rep M Set Java NE None Ant, Azure us, Hibernate, JDT-Core, JDT-UI, Spring, Xerces Robillard 2007 [182] R 0 sc M Manual Java NE None AVID, Jex, Redback, jEdit -- Robillard 2007 [183] cs 0 sc M Set Java NE None GanttProject, jajuk, jBid- Watcher, Freemind Robillard 2005 [175] R s DG M Ranked Java p None JHotDraw, Azureus -- Robillard 2005 [184] T 0 sc M Set Java p None JHotDraw -- Robillard 2004 [177] cs - .Java NE None jEdit Robillard 2003 [180] R 0 Other M Set Java NE None jEdit, .JHotDraw Table A.2: (continued).

Article Analysis Info. Gran. Results Langs. Eval. Compared Systems Evaluated

Robillard 2003 [181] T s sc M Manual Java NE None (N/A) Robillard 2002 [179] R s sc M Manual Java NE None AVID, Jex, NSC -- Rohatgi 2009 [187] R D, S ET,DG c Ranked Java B None , Weka -- Rohatgi 2008 [186] R D, S ET,DG c Ranked Java B None Checkstyle -- Rohatgi 2007 [185] R D, S ET,DG c Ranked Java B None Weka Safyallah 2006 [190] R D ET M Set CJC++ p None XFig Shao 2009 [200] R S, T SC, DG M Ranked C/C++ p LSI iVistaDesktop f-' Saul 2007 [197] R s DG M Ranked CJC++ B, Suade Apache httpd 00 1:-.:> NE Shepherd 2007 [201] R T sc M Visual Java B Other jBidWatcher, javaHMO, ja- juk, iReport Shepherd 2006 [204] R T sc M Visual Java B Other JHotDraw Simmons 2006 [207] cs D ET M Set CJC++ NE Other Apache httpd Trifu 2009 [215] R s DG v Set Java B None JHotDraw -- Trifu 2008 [214] R s DG v Set Java B None JHotDraw Van Geet 2009 [221] cs D ET M Set COBOL E None Belgian banking software Walkinshaw 2007 R s DG M Visual Java NE None JHotDraw, NanoXML, Free- [222] mind Table A.2: (continued).

Article Analysis Info. Gran. Results Langs. Eval. Compared Systems Evaluated

Weigand-Warr 2008 T s DG M Ranked Java p None jEdit [224]

Wilde 2003 [226] cs D, S ET,DG M Set FORTRAN NE ASDG, SR CONVERT3 Wilde 2002 [230] R D ET M Set C/C++ p None FoodFight

Wilde 2001 [225] cs D, S ET,DG M Set FORTRAN NE ASDG, SR CONVERT3 Wilde 1996 [227] cs D ET M Set C/C++ E, None Visitor control program, NE graph display system, test

coverage monitor
Wilde 1995 [229] R D ET M Set C/C++ E None (N/A)
Wilde 1992 [228] R D ET M Set C/C++ E None (N/A)
Wong 1999 [234] R D ET St, V Set C/C++ E None SHARPE
Xie 2006 [235] T T SC M Visual C/C++ P None (N/A)
Zhao 2006 [244] R S, T SC, DG M Ranked C/C++ E None DC, UnRTF
Zhao 2004 [243] R S, T SC, DG M Ranked C/C++ E None DC

Appendix B

Exploratory Study Instructions

This appendix contains the instructions given to the participants of the exploratory study presented in Chapter 3. These instructions, along with the source code used in the study, can be found online at http://www.cs.wm.edu/semeru/data/icpc09-feature-location/case-study-instructions.

B.1 Overview

In this "experiment," you will investigate the relevance of methods to a particular feature from jEdit. You will be presented with several lists, each containing 10 methods, and asked to determine how relevant they are to the implementation of a feature.

B.1.1 System

jEdit, version 4.3pre16, is a programmer's text editor written in Java.

B.1.2 Feature

jEdit has a global option for configuring the cursor/caret to be "thick," i.e., two pixels wide instead of one pixel so it is easier to see. This global option can be set by going to Utilities -> Global Options -> Text Area and checking the box for "thick" next to the caret options. Then click "OK" or "Apply."

B.1.3 Running jEdit

To run jEdit from the command line, type "java -jar jedit.jar" while in the build directory. On Windows, double-clicking on the jedit.jar icon also works.

B.2 Detailed Instructions

1. Download and unzip the source code for jEdit 4.3pre16.

2. Optionally, but highly recommended, create a new Java project in Eclipse using the jEdit source code. File -> New -> Java Project. Give the project a name, select "Create project from existing source," and browse to the unzipped jEdit code you downloaded.

3. Download the lists of methods to inspect.

4. For each method in the lists, classify its relevance to the thick caret global option feature as either Relevant, Somewhat Relevant, or Not Relevant. Use the following guidelines:

(a) Method names that are similar to the words in the feature's description are good indicators of possibly relevant code, but the method's source code should be inspected to ensure the method is actually relevant to the feature.

(b) Determine if the method is relevant to the feature by asking "Would it be useful to know that this method is associated with the feature if I had to modify the implementation of the feature in the future?"

(c) If most of the code in the method seems relevant to the feature, classify the method as Relevant. If some code within the method seems relevant but other code in the method is irrelevant to the feature, classify the method as Somewhat Relevant. If no code within the method seems relevant to the feature, classify it as Not Relevant.

(d) If unable to classify the method by reviewing its code, explore the method's structural dependencies, i.e., what methods call it and are called by it. If the method's dependencies seem relevant, then the method probably is as well. In Eclipse, to find references to or the declaration of a method, right-click on the method's name and select References -> Project or Declaration -> Project.

(e) For any method you classify and are still hesitant about the classification you chose, please provide a brief one-sentence explanation of your decision.

5. Once you have decided your classification for a method, place an "X" in the row for the method under the appropriate column, as in the example below.

Table B.1: An example of classifying methods. (Each method is marked with an "X" under exactly one of the columns Relevant, Somewhat Relevant, and Not Relevant.)

package1.class1.method1 X
package1.class1.method2 X
package1.class2.method1 X
package2.class3.method4 X
package2.class3.method5 X
package2.class4.method6 X
package3.class5.method7 X
package3.class5.method8 X
package3.class5.method9 X
package4.class6.method10 X

6. Review your classifications one more time. Since the same method may appear in multiple lists, ensure it is classified consistently in every list.

7. Return your results as an e-mail attachment when you are done.

Appendix C

Feature Coupling Study Instructions

This appendix contains the instructions given to the participants of the developer study on feature coupling presented in Chapter 5. These instructions, as well as the files necessary to follow them, are also available online at http://www.cs.wm.edu/semeru/data/feature-coupling-study.

C.1 Instructions

Thank you for volunteering your time to participate in this study. The time required to complete this task is approximately 1-2 hours. Feel free to take breaks if you need to, but try to keep track of the actual amount of time spent working on this study. A computer running Windows or Linux is required; unfortunately a piece of software used in the study does not work on Macs. All information that you provide will be kept strictly confidential. In this study, you will be asked to examine the source code of several pairs of features (behaviors or functionalities of a software system) and determine if the features are coupled to one another. In other words, you are being asked if there is some relationship or dependency between the two features.

The Eclipse IDE (integrated development environment) will be used in this study. If you are unfamiliar with Eclipse, you may still participate in this study as it will primarily be used to view source code files.

Follow the instructions below to begin.

1. Download the questionnaire and answer the first three questions about your programming experience. The questionnaire is available in PDF for participants who are at W&M and want to write their responses on a print-out. Otherwise, the questionnaire is available as an Excel spreadsheet or text file, and responses can be saved directly to the file.

2. Download the appropriate file for your operating system (windows-study-files.zip, linux32-study-files.zip, or linux64-study-files.zip) which contains a copy of Eclipse that has been pre-loaded with all the necessary source code, plug-ins, and data you will need.

3. Unzip study-files.zip to a convenient location which will be referred to as STUDY_HOME.

4. Change directories to STUDY_HOME/eclipse and start Eclipse.

5. As Eclipse is loading, you will be asked to select a workspace. Select STUDY_HOME/workspace and click "OK."

6. If, once Eclipse starts, a window pops up with the title "Usage Data Upload," you may select "Turn UDC feature off" and click Finish.

7. Once Eclipse loads, you will see three projects listed on the left: dbViz, iBatis, and Rhino. You can ignore the fact that they have compilation errors. Right-click on each individually and select "Refresh." Alternatively, you can select each project and press F5. By refreshing the projects, you ensure the files are in sync with the file system.

8. Go to Window -> Show View -> Other -> FLAT3. Select "Features" and click OK. FLAT3 is a tool that, among other things, manages the links between features and the source code that implements them.

9. In the view that was just opened, click the "View Menu" button in the FLAT3 view and select Open Concern Domain -> dbViz.

10. dbViz is a tool for visualizing database tables in a format similar to UML diagrams. Listed in the FLAT3 view are some of dbViz's features and the methods associated with each. Clicking on a method's name takes you to that method's source code.

11. For the following pairs of features, examine the code associated with each feature and answer the question "Are the two features coupled?" Answer either Strong No, Weak No, Weak Yes, Strong Yes, or Unknown.

For example, two features could be coupled if a change to the code of one feature could impact the behavior or performance of the other feature. Base your answer on what you observe in the code of the two features, and please give some rationale for your answer. You may answer the questions in any order, and if in the course of the study you wish to change one of your previous answers because you gained more knowledge of the system, you may.

Record your answers in the appropriate place on the questionnaire.

Table C.1: The dbViz feature pairs.

Feature 1: Connect to Database - The user enters the location of the database and authentication information to connect to the database.
Feature 2: Exit dbViz - The user exits dbViz.

Feature 1: Auto Arrange ER Diagram - The tables in the diagram are automatically arranged.
Feature 2: Undo/Redo - Undo the last command or redo a command that has been undone.

Feature 1: Import Schema from Database - The user provides the location of the database and authentication information to connect to the database and dbViz loads the database's tables.
Feature 2: Import Schema from SQL - The user provides the location of an SQL file and dbViz loads the database's tables.

Feature 1: Add Table to ER Diagram - The user adds a database table to the diagram.
Feature 2: Remove Table from ER Diagram - The user removes a database table from the diagram, and dbViz also removes any relationships to the table.

Feature 1: Save/Load ER Diagram - The user saves the currently open diagram and then loads another.
Feature 2: Load Saved ER Diagram - The user loads an existing diagram.

12. Click the "View Menu" button in the FLAT3 view and select Open Concern Domain ---+ Rhino.

13. Rhino is an open-source implementation of JavaScript written entirely in Java. It is typically embedded into Java applications to provide scripting to end users. Listed in the FLAT3 view are some of Rhino's features and the methods associated with each. Clicking on a method's name takes you to that method's source code.

14. For the following pairs of features, examine the code associated with each feature and answer the question "Are the two features coupled?" Answer either Strong No, Weak No, Weak Yes, Strong Yes, or Unknown. Base your answer on what you observe in the code of the two features, and please give some rationale for your answer. You may answer the questions in any order, and if in the course of the study you wish to change one of your previous answers because you gained more knowledge of the system, you may. Record your answers on the same questionnaire as above.

Table C.2: The Rhino feature pairs.

Feature 1: Unary + Operator - The unary + operator converts its operand to the Number type.
Feature 2: The Addition Operator (+) - The addition operator performs string concatenation or numeric addition.

Feature 1: The Addition Operator (+) - The addition operator performs string concatenation or numeric addition.
Feature 2: The Subtraction Operator (-) - The subtraction operator performs numeric subtraction.

Feature 1: Date.prototype.toString - Return a string value representing the date in the current time zone in human-readable form.
Feature 2: Date.prototype.valueOf - Return a number which is the time value.

Feature 1: Unicode Format-Control Characters - The characters in category "Cf" in the Unicode Character Database.
Feature 2: ToPrimitive - Convert a value to a non-object type.

Feature 1: parseInt - Produce an integer value dictated by interpretation of the contents of a string.
Feature 2: parseFloat - Produce a number value dictated by interpretation of the contents of a string.

Feature 1: SQRT2 - The number value for the square root of 2.
Feature 2: Date.prototype.getTimezoneOffset - The difference between local time and UTC time in minutes.

15. Click the "View Menu" button in the Features view and select Open Concern Domain --> iBatis.

16. The iBatis Data Mapper framework makes it easier to use a database with Java and .NET applications. The iBatis project is heavily focused on the persistence layer frameworks known as SQL Maps and Data Access Objects (DAO). Listed in the FLAT3 view are some of iBatis's features and the methods associated with each. Clicking on a method's name takes you to that method's source code.

17. For the following pairs of features, examine the code associated with each feature and answer the question "Are the two features coupled?" Answer either Strong No, Weak No, Weak Yes, Strong Yes, or Unknown. Base your answer on what you observe in the code of the two features, and please give some rationale for your answer. You may answer the questions in any order, and if in the course of the study you wish to change one of your previous answers because you gained more knowledge of the system, you may. Record your answers on the same questionnaire as above.

Table C.3: The iBatis feature pairs.

Feature 1: Data Sources - Data sources manage connections to the database.
Feature 2: JTA - The Java Transaction API (JTA) specifies standard Java interfaces between a transaction manager and the parties involved in a distributed transaction system.

Feature 1: JDBC - The Java Database Connectivity (JDBC) API is the industry standard for database-independent connectivity between the Java programming language and a wide range of databases.
Feature 2: JTA - The Java Transaction API (JTA) specifies standard Java interfaces between a transaction manager and the parties involved in a distributed transaction system.

Feature 1: Query - Execute a query on the database.
Feature 2: Max Results - The maximum number of records to return.

Feature 1: Update - Execute an update on the database.
Feature 2: Autogenerated Keys - Automatically generate primary key fields.

Feature 1: SELECT - Execute a select on the database.
Feature 2: SQL Script - Run an SQL script.

18. You have completed the study. Please return your completed questionnaire. Thank you for participating. You may remove the files you downloaded for this study from your computer.

Appendix D

List of Features

This appendix contains lists of the names of the features from the various systems that were located in this dissertation's evaluations.

D.1 jEdit Features

The features of jEdit used in the exploratory study presented in Chapter 3.

Patch #1608486, Support for "Thick" Caret

Patch #1818140, Edit History Text

Patch #1923613, Reverse Regex Search

Patch #1849215, Bracket Matching Enhancements

D.2 Eclipse Features

The features of Eclipse used in the exploratory study presented in Chapter 3.

Bug #5138, Double-click-drag to select multiple words is broken

Bug #31779, UnifiedTree should ensure file/folder exists

Bug #19819, Add support for Emacs-style incremental search

Bug #32712, Repeated error message when deleting and file is in use

The features of Eclipse used in the study presented in Chapter 4. To see the actual bug reports, go to https://bugs.eclipse.org/bugs/show_bug.cgi?id=xxxxx, where xxxxx is one of the id numbers listed.

54771 66182 66914 67821 64498 66216 66923 67873 64882 66234 66947 67913 64887 66297 67168 67930 64893 66357 67297 67944 65074 66651 67314 68013 65183 66819 67413 68117 65217 66835 67427 68146 65637 66858 67437 68157 65704 66880 67468 65948 66888 67716 66157 66898 67789

D.3 dbViz Features

The features of dbViz used in the study in Chapter 5. The code associated with each of these features was determined manually by Eaddy et al. [61, 63]. The mappings can be found online at http://www.cs.columbia.edu/~eaddy/concerntagger/.

Add Table to ER Diagram
AutoArrange ER Diagram
Connect To Database
Exit dbViz
Import Schema From Database
Import Schema From SQL File
Import Schema
Load Saved ER Diagram
Print ER Diagram
Remove Table From ER Diagram
Save/Load ER Diagram
Start dbViz
Undo/Redo

D.4 Rhino Features

All of the features listed below were used in the studies presented in Chapter 5. The features in boldface were used in the study in Chapter 4. The number before the feature's name indicates the section of the ECMAScript specification (third edition) that defines the feature. The specification can be found online at http://www.ecmascript.org/docs.php. The code associated with each of these features was determined manually by Eaddy et al. [61, 63]. The mappings can be found online at http://www.cs.columbia.edu/~eaddy/concerntagger/.

7.1 - Unicode Format-Control Characters; 7.2 - White Space; 7.3 - Line Terminators; 7.4 - Comments; 7.5.1 - Reserved Words; 7.5.2 - Keywords; 7.5.3 - Future Reserved Words; 7.6 - Identifiers; 7.7 - Punctuators; 7.8.1 - Null Literals; 7.8.2 - Boolean Literals; 7.8.3 - Numeric Literals; 7.8.4 - String Literals; 7.8.5 - Regular Expression Literals; 7.9.1 - Rules of Automatic Semicolon Insertion

8.1 - Undefined Type; 8.2 - Null Type; 8.3 - Boolean Type; 8.4 - String Type; 8.5 - Number Type; 8.6 - Object Type; 8.6.1 - Property Attributes; 8.6.2 - Internal Properties and Methods; 8.6.2.1 - [[Get]]; 8.6.2.2 - [[Put]]; 8.6.2.4 - [[HasProperty]]; 8.6.2.5 - [[Delete]]; 8.6.2.6 - [[DefaultValue]]; 8.7.1 - GetValue; 8.7.2 - PutValue

9.1 - ToPrimitive; 9.2 - ToBoolean; 9.3 - ToNumber; 9.3.1 - ToNumber Applied to String Type; 9.4 - ToInteger; 9.5 - ToInt32; 9.6 - ToUint32; 9.7 - ToUint16; 9.8 - ToString; 9.8.1 - ToString Applied to Number Type; 9.9 - ToObject

10 - Execution Contexts; 10.1.3 - Variable Instantiation; 10.1.4 - Scope Chain and Identifier Resolution; 10.1.5 - Global Object; 10.1.6 - Activation Object; 10.1.7 - This; 10.1.8 - Arguments Object; 10.2 - Entering An Execution Context; 10.2.1 - Global Code; 10.2.2 - Eval Code; 10.2.3 - Function Code

11 - Expressions; 11.1 - Primary Expressions; 11.1.1 - this Keyword; 11.1.2 - Identifier Reference; 11.1.3 - Literal Reference; 11.1.4 - Array Initialiser; 11.1.5 - Object Initialiser; 11.1.6 - Grouping Operator; 11.2 - Left-Hand-Side Expressions; 11.2.1 - Property Accessors; 11.2.2 - new Operator; 11.2.3 - Function Calls; 11.2.4 - Argument Lists; 11.2.5 - Function Expressions; 11.3.1 - Postfix Increment Operator; 11.3.2 - Postfix Decrement Operator; 11.4 - Unary Operators; 11.4.1 - delete Operator; 11.4.2 - void Operator; 11.4.3 - typeof Operator; 11.4.4 - Prefix Increment Operator; 11.4.5 - Prefix Decrement Operator; 11.4.6 - Unary PLUS Operator; 11.4.7 - Unary MINUS Operator; 11.4.8 - Bitwise NOT Operator; 11.4.9 - Logical NOT Operator; 11.5 - Multiplicative Operators; 11.5.1 - Applying the MULTIPLY Operator; 11.5.2 - Applying the DIVIDE Operator; 11.5.3 - Applying the PERCENT Operator; 11.6 - Additive Operators; 11.6.1 - Addition Operator; 11.6.2 - Subtraction Operator; 11.6.3 - Applying Additive Operators to Numbers; 11.7 - Bitwise Shift Operators; 11.7.1 - Left Shift Operator; 11.7.2 - Signed Right Shift Operator; 11.7.3 - Unsigned Right Shift Operator; 11.8 - Relational Operators; 11.8.1 - Less-than Operator; 11.8.2 - Greater-than Operator; 11.8.3 - Less-than-or-equal Operator; 11.8.4 - Greater-than-or-equal Operator; 11.8.6 - instanceof Operator; 11.8.7 - in Operator; 11.9 - Equality Operators; 11.9.1 - Equals Operator; 11.9.2 - Does-not-equals Operator; 11.9.4 - Strict Equals Operator; 11.9.5 - Strict Does-not-equal Operator; 11.10 - Binary Bitwise Operators; 11.11 - Binary Logical Operators; 11.12 - Conditional Operator; 11.13 - Assignment Operators; 11.13.1 - Simple Assignment; 11.13.2 - Compound Assignment; 11.14 - Comma Operator

12 - Statements; 12.1 - Block; 12.2 - Variable Statement; 12.3 - Empty Statement; 12.4 - Expression Statement; 12.5 - if Statement; 12.6 - Iteration Statements; 12.6.1 - do-while Statement; 12.6.2 - while Statement; 12.6.3 - for Statement; 12.6.4 - for-in Statement; 12.7 - continue Statement; 12.8 - break Statement; 12.9 - return Statement; 12.10 - with Statement; 12.11 - switch Statement; 12.12 - Labelled Statements; 12.13 - throw Statement; 12.14 - try Statement

13 - Function Definition; 13.2.1 - [[Call]]; 13.2.2 - [[Construct]]; 14 - Program

15 - Native ECMAScript Objects; 15.1 - Global Object; 15.1.1.1 - NaN; 15.1.1.2 - Infinity; 15.1.1.3 - undefined; 15.1.2.1 - eval; 15.1.2.2 - parseInt; 15.1.2.3 - parseFloat; 15.1.2.4 - isNaN; 15.1.2.5 - isFinite; 15.1.3.1 - decodeURI; 15.1.3.2 - decodeURIComponent; 15.1.3.3 - encodeURI; 15.1.3.4 - encodeURIComponent; 15.1.4.1 - Object; 15.1.4.2 - Function; 15.1.4.3 - Array; 15.1.4.4 - String; 15.1.4.5 - Boolean; 15.1.4.6 - Number; 15.1.4.7 - Date; 15.1.4.8 - RegExp; 15.1.4.9 - Error; 15.1.4.10 - EvalError; 15.1.4.11 - RangeError; 15.1.4.12 - ReferenceError; 15.1.4.13 - SyntaxError; 15.1.4.14 - TypeError; 15.1.4.15 - URIError; 15.1.5.1 - Math

15.2 - Object Objects; 15.2.1.1 - Object(); 15.2.2.1 - new Object(); 15.2.3 - Properties of Object Constructor; 15.2.3.1 - prototype; 15.2.4 - Properties of Object Prototype Object; 15.2.4.1 - constructor; 15.2.4.2 - toString; 15.2.4.3 - toLocaleString; 15.2.4.4 - valueOf; 15.2.4.5 - hasOwnProperty; 15.2.4.6 - isPrototypeOf; 15.2.4.7 - propertyIsEnumerable

15.3 - Function Objects; 15.3.1.1 - Function(); 15.3.2.1 - new Function(); 15.3.3 - Properties of Function Constructor; 15.3.3.1 - prototype; 15.3.4 - Properties of Function Prototype Object; 15.3.4.1 - constructor; 15.3.4.2 - toString; 15.3.4.3 - apply; 15.3.4.4 - call; 15.3.5 - Properties of Function Instances; 15.3.5.1 - length; 15.3.5.2 - prototype; 15.3.5.3 - [[HasInstance]]

15.4 - Array Objects; 15.4.1.1 - Array(); 15.4.2.1 - new Array(...); 15.4.2.2 - new Array(len); 15.4.3 - Properties of Array Constructor; 15.4.4 - Properties of Array Prototype Object; 15.4.4.1 - constructor; 15.4.4.2 - toString; 15.4.4.3 - toLocaleString; 15.4.4.4 - concat; 15.4.4.5 - join; 15.4.4.6 - pop; 15.4.4.7 - push; 15.4.4.8 - reverse; 15.4.4.9 - shift; 15.4.4.10 - slice; 15.4.4.11 - sort; 15.4.4.12 - splice; 15.4.4.13 - unshift; 15.4.5 - Properties of Array Instances; 15.4.5.1 - [[Put]]; 15.4.5.2 - length

15.5 - String Objects; 15.5.1.1 - String(); 15.5.2.1 - new String(); 15.5.3 - Properties of String Constructor; 15.5.3.2 - fromCharCode; 15.5.4 - Properties of String Prototype Object; 15.5.4.1 - constructor; 15.5.4.2 - toString; 15.5.4.3 - valueOf; 15.5.4.4 - charAt; 15.5.4.5 - charCodeAt; 15.5.4.6 - concat; 15.5.4.7 - indexOf; 15.5.4.8 - lastIndexOf; 15.5.4.10 - match; 15.5.4.11 - replace; 15.5.4.12 - search; 15.5.4.13 - slice; 15.5.4.14 - split; 15.5.4.15 - substring; 15.5.4.16 - toLowerCase; 15.5.4.18 - toUpperCase; 15.5.5 - Properties of String Instances; 15.5.5.1 - length

15.6 - Boolean Objects; 15.6.1.1 - Boolean(); 15.6.2.1 - new Boolean(); 15.6.3 - Properties of Boolean Constructor; 15.6.4 - Properties of Boolean Prototype Object; 15.6.4.1 - constructor; 15.6.4.2 - toString; 15.6.4.3 - valueOf; 15.7 - Number Objects; 15.7.1.1 - Number(); 15.7.2.1 - new Number(); 15.7.3 - Properties of Number Constructor; 15.7.3.2 - MAX_VALUE; 15.7.3.3 - MIN_VALUE; 15.7.3.4 - NaN; 15.7.3.5 - NEGATIVE_INFINITY; 15.7.3.6 - POSITIVE_INFINITY; 15.7.4 - Properties of Number Prototype Object; 15.7.4.1 - constructor; 15.7.4.2 - toString; 15.7.4.3 - toLocaleString; 15.7.4.4 - valueOf; 15.7.4.5 - toFixed; 15.7.4.6 - toExponential; 15.7.4.7 - toPrecision

15.8 - Math Object; 15.8.1.1 - E; 15.8.1.2 - LN10; 15.8.1.3 - LN2; 15.8.1.4 - LOG2E; 15.8.1.5 - LOG10E; 15.8.1.6 - PI; 15.8.1.7 - SQRT1_2; 15.8.1.8 - SQRT2; 15.8.2.1 - abs; 15.8.2.2 - acos; 15.8.2.3 - asin; 15.8.2.4 - atan; 15.8.2.5 - atan2; 15.8.2.6 - ceil; 15.8.2.7 - cos; 15.8.2.8 - exp; 15.8.2.9 - floor; 15.8.2.10 - log; 15.8.2.11 - max; 15.8.2.12 - min; 15.8.2.13 - pow; 15.8.2.14 - random; 15.8.2.15 - round; 15.8.2.16 - sin; 15.8.2.17 - sqrt; 15.8.2.18 - tan

15.9 - Date Objects; 15.9.1.2 - Day Number and Time within Day; 15.9.1.3 - Year Number; 15.9.1.4 - Month Number; 15.9.1.5 - Date Number; 15.9.1.6 - Week Day; 15.9.1.7 - Daylight Saving Time Adjustment; 15.9.1.9 - Local Time; 15.9.1.11 - MakeTime; 15.9.1.12 - MakeDay; 15.9.1.13 - MakeDate; 15.9.1.14 - TimeClip; 15.9.2.1 - Date(); 15.9.3.1 - new Date(...); 15.9.3.2 - new Date(value); 15.9.3.3 - new Date(); 15.9.4 - Properties of Date Constructor; 15.9.4.2 - parse; 15.9.4.3 - UTC; 15.9.5 - Properties of Date Prototype Object; 15.9.5.1 - constructor; 15.9.5.2 - toString; 15.9.5.3 - toDateString; 15.9.5.4 - toTimeString; 15.9.5.5 - toLocaleString; 15.9.5.6 - toLocaleDateString; 15.9.5.7 - toLocaleTimeString; 15.9.5.8 - valueOf; 15.9.5.9 - getTime; 15.9.5.10 - getFullYear; 15.9.5.11 - getUTCFullYear; 15.9.5.12 - getMonth; 15.9.5.13 - getUTCMonth; 15.9.5.14 - getDate; 15.9.5.15 - getUTCDate; 15.9.5.16 - getDay; 15.9.5.17 - getUTCDay; 15.9.5.18 - getHours; 15.9.5.19 - getUTCHours; 15.9.5.20 - getMinutes; 15.9.5.21 - getUTCMinutes; 15.9.5.22 - getSeconds; 15.9.5.23 - getUTCSeconds; 15.9.5.24 - getMilliseconds; 15.9.5.25 - getUTCMilliseconds; 15.9.5.26 - getTimezoneOffset; 15.9.5.27 - setTime; 15.9.5.28 - setMilliseconds; 15.9.5.29 - setUTCMilliseconds; 15.9.5.30 - setSeconds; 15.9.5.31 - setUTCSeconds; 15.9.5.32 - setMinutes; 15.9.5.33 - setUTCMinutes; 15.9.5.34 - setHours; 15.9.5.35 - setUTCHours; 15.9.5.36 - setDate; 15.9.5.37 - setUTCDate; 15.9.5.38 - setMonth; 15.9.5.39 - setUTCMonth; 15.9.5.40 - setFullYear; 15.9.5.41 - setUTCFullYear; 15.9.5.42 - toUTCString

15.10 - RegExp Objects; 15.10.1 - Patterns; 15.10.2 - Pattern Semantics; 15.10.2.1 - Notation; 15.10.2.2 - Pattern; 15.10.2.3 - Disjunction; 15.10.2.4 - Alternative; 15.10.2.5 - Term; 15.10.2.6 - Assertion; 15.10.2.7 - Quantifier; 15.10.2.8 - Atom; 15.10.2.9 - AtomEscape; 15.10.2.10 - CharacterEscape; 15.10.2.11 - DecimalEscape; 15.10.2.12 - CharacterClassEscape; 15.10.2.13 - CharacterClass; 15.10.2.14 - ClassRanges; 15.10.2.15 - NonemptyClassRanges; 15.10.2.16 - NonemptyClassRangesNoDash; 15.10.2.17 - ClassAtom; 15.10.2.18 - ClassAtomNoDash; 15.10.2.19 - ClassEscape; 15.10.3.1 - RegExp(); 15.10.4 - RegExp Constructor; 15.10.4.1 - new RegExp(); 15.10.5 - Properties of RegExp Constructor; 15.10.6 - Properties of RegExp Prototype Object; 15.10.6.1 - constructor; 15.10.6.2 - exec; 15.10.6.3 - test; 15.10.6.4 - toString; 15.10.7 - Properties of RegExp Instances; 15.10.7.1 - source; 15.10.7.2 - global; 15.10.7.3 - ignoreCase; 15.10.7.4 - multiline; 15.10.7.5 - lastIndex

15.11 - Error Objects; 15.11.1.1 - Error(); 15.11.2.1 - new Error(); 15.11.3 - Properties of Error Constructor; 15.11.4.1 - constructor; 15.11.4.2 - name; 15.11.4.3 - message; 15.11.4.4 - toString; 15.11.6 - Native Error Types Used in This Standard; 15.11.6.1 - EvalError; 15.11.6.2 - RangeError; 15.11.6.3 - ReferenceError; 15.11.6.4 - SyntaxError; 15.11.6.5 - TypeError; 15.11.6.6 - URIError; 15.11.7.1 - NativeError Constructors Called as Functions; 15.11.7.2 - NativeError(); 15.11.7.4 - New NativeError(); 15.11.7.5 - Properties of the NativeError Constructors; 15.11.7.6 - NativeError.prototype; 15.11.7.7 - Properties of the NativeError Prototype Objects; 15.11.7.8 - NativeError.prototype.constructor; 15.11.7.9 - NativeError.prototype.name; 15.11.7.10 - NativeError.prototype.message; 16 - Errors

D.5 iBatis Features

This appendix lists the features of iBatis used in the study in Chapter 5. The code associated with each of these features was determined manually by Eaddy et al. [61, 63]. The mappings can be found online at http://www.cs.columbia.edu/~eaddy/concerntagger/.

Statements, isParameterPresent, DataSources, isNotParameterPresent, JDBC, ComparePropertyid, DBCP, CompareValue, JNDI, isEqual, SimpleDataSource, isNotEqual, StatementTypes, isGreaterThan, Query, isGreaterEqual, SELECT, IsLessThan, Update, isLessEqual, INSERT, Iteration, UPDATE, Prepend, DELETE, RemoveFirstPrepend, Auto-GeneratedKeys, Open, StoredProcedures, Close, Arbitrary, Conjunction, ComposingSQL, SQLFragments, GlobalVariableSubstitution, ParameterTypes, DynamicFragments, PrimitiveTypes, Blocks, boolean, Propertyid, byte, Conditionals, short, isPropertyAvailable, int, isNotPropertyAvailable, long, isNull, Collections, isNotNull, JavaBeans, isEmpty, Caching, isNotEmpty, CacheModel, CacheFlushing, SOFT, FlushInterval, STRONG, FlushTriggers, LRU, Mutability, FIFO, Serializability, OSCACHE, Transactions, java.sql.Connection, AutomaticTransactions, java.sql.PreparedStatement, Batches, Log4J, Configuration, java.sql.ResultSet, Config, java.sql.Statement, SQLMap, JakartaCommonsLogging, GlobalProperties, JDKLogging, Namespaces, JDBC.Driver, PropertyAliases, JDBC.ConnectionURLProperty, Resources, JDBC.UsernameProperty, Classes, JDBC.PasswordProperty, Files, JDBC.DefaultAutoCommitProperty, Streams, Pool.MaximumActiveConnectionsProperty, Readers, Pool.MaximumIdleConnectionsProperty, Properties, Pool.MaximumCheckoutTimeProperty, URLs, Pool.TimeToWaitProperty, JTA, Pool.PingQueryProperty, LoggingImplementations, Pool.PingEnabledProperty, TransactionManagers, Pool.PingConnectionsOlderThanProperty, Timeout, Pool.PingConnectionsNotUsedForProperty, UserTransactions, Driver.*Property, EXTERNAL, Scripts, MaxRequests, XML, MaxSessions, Arrays, MaxTransactions, Lists, float, Maps, In-Memory, MappingInputParameterstoSQL, WEAK, InlineParameters, InlineParameterSyntax, NullLogger, ParameterMapping, SQLScripts, CustomTypeHandlers, Profiling, NumericScale, Testing, MappingOutputParametersfromSQL, ErrorHandling, ResultMapping, SimpleFragments, ResultMapInheritence, ParameterClasses, ResultMapAggregation, ResultSetType, NestedSelects, Input, GroupBy, Output, ColumnIdentifiers, ResultSets, NullValueSubstitution, FetchSize, CustomRowHandlers, StatementExecution, ByteCodeGeneration, Query1:1, ClassCaching, Query1:N, RequestCaching, MaxResults, StatementCaching, IsolationLevel, XMLSyntax, Sessions, LazyLoading, Connections, SQLMapActivity

Bibliography

[1] F. ABREU, G. PEREIRA, AND P. SOUSA. A coupling-guided cluster analysis approach to reengineer the modularity of object-oriented systems. In Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR'00), pages 13-22, Zurich, Switzerland, 2000.

[2] H. AGRAWAL, J. L. ALBERI, J. R. HORGAN, J. J. LI, S. LONDON, W. E. WONG, S. GHOSH, AND N. WILDE. Mining system tests to aid software maintenance. Computer, 31(7):64-73, 1998.

[3] G. ANTONIOL, G. CANFORA, G. CASAZZA, A. DE LUCIA, AND E. MERLO. Re­ covering traceability links between code and documentation. IEEE Transactions on Software Engineering, 28(10):970-983, 2002.

[4] G. ANTONIOL AND Y. G. GUEHENEUC. Feature identification: A novel approach and a case study. In Proceedings of the 21st IEEE International Conference on Software Maintenance (ICSM'05), pages 357-366, Budapest, Hungary, 2005.

[5] G. ANTONIOL AND Y.G. GUEHENEUC. Feature identification: An epidemiological metaphor. IEEE Transactions on Software Engineering, 32(9):627-641, 2006.

[6] J. ARANDA AND G. VENOLIA. The secret life of bugs: Going past the errors and omissions in software repositories. In Proceedings of the 31st IEEE/ACM Interna­ tional Conference on Software Engineering (ICSE'09}, pages 298-308, Vancouver, British Columbia, Canada, 2009.

[7] E. ARISHOLM, L. C. BRIAND, AND A. FØYEN. Dynamic coupling measurement for object-oriented software. IEEE Transactions on Software Engineering, 30(8):491-506, 2004.

[8] J. BANSIYA AND C. G. DAVIS. A hierarchical model for object-oriented design quality assessment. IEEE Transactions on Software Engineering, 28(1):4-17, 2002.

[9] V. R. BASILI, L. C. BRIAND, AND W. L. MELO. A validation of object-oriented design metrics as quality indicators. IEEE Transactions on Software Engineering, 22(10):751-761, 1996.

[10] D. BATORY AND S. O'MALLEY. The design and implementation of hierarchical soft­ ware systems with reusable components. A CM Transactions on Software Engineering and Methodology (TOSEM}, 1(4):355-398, 1992.

[11] D. BATORY, V. SINGHAL, J. THOMAS, S. DASARI, B. GERACI, AND M. SIRKIN. The genvoca model of software-system generators. IEEE Software, 11(5):89-94, 1994.

[12] T. J. BIGGERSTAFF, B. G. MITBANDER, AND D. E. WEBSTER. The concept assignment problem in program understanding. In Proceedings of the 15th IEEE/ACM International Conference on Software Engineering (ICSE'94), pages 482-498, 1994.

[13] D. BINKLEY, G. GOLD, M. HARMAN, Z. LI, AND K. MAHDAVI. An empirical study of the relationship between the concepts expressed in source code and dependence. Journal of Systems and Software, 81:2287-2298, 2008.

[14] C. BIRD, A. BACHMANN, E. AUNE, J. DUFFY, A. BERNSTEIN, V. FILKOV, AND P. DEVANBU. Fair and balanced? bias in bug-fix datasets. In Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE'09}, pages 121-130, 2009.

[15] D. M. BLEI, A. Y. NG, AND M. I. JORDAN. Latent dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, 2003.

[16] B. W. BOEHM. Software engineering. IEEE Transactions on Computers, 12(25): 1226-1242, 1976.

[17] B. W. BOEHM, J.R. BROWN, AND M. LIPOW. Quantitative evaluation of software quality. In Proceedings of the 2nd International Conference on Software Engineering, pages 592-605, Los Alamitos, CA, 1976.

[18] S. BoHNER. Impact analysis in the software change process: A year 2000 perspective. In Proceedings of the International Conference on Software Maintenance (ICSM '96}, pages 42-51, Monterey, CA, 1996.

[19] J. BOHNET AND J. DOELLNER. Analyzing feature implementation by visual exploration of architecturally-embedded call-graphs. In Proceedings of the International Workshop on Dynamic Analysis (WODA'06), pages 41-48, Shanghai, China, 2006.

[20] J. BOHNET AND J. DOELLNER. Visual exploration of function call graphs for feature location in complex software systems. In Proceedings of the ACM Symposium on Software Visualization (SOFTVIS'06), pages 95-104, Brighton, United Kingdom, 2006. ACM.

[21] J. BoHNET AND J. DOELLNER. Cga call graph analyzer -locating and understanding functionality within the gnu compiler collection's million lines of code. In Proceedings of the IEEE International Workshop on Visualizing Software for Understanding and Analysis (VISSOFT'07), pages 161-162, 2007.

[22] J. BOHNET AND J. DOELLNER. Facilitating exploration of unfamiliar source code by providing 2 1/2 D visualizations of dynamic call graphs. In Proceedings of the IEEE International Workshop on Visualizing Software for Understanding and Analysis (VISSOFT'07), pages 63-66, 2007.

[23] J. BOHNET AND J. DOELLNER. Analyzing dynamic call graphs enhanced with program state information for feature location and understanding. In Proceedings of the International Conference on Software Engineering, pages 915-916, Leipzig, Germany, 2008.

[24] J. BOHNET, S. VOIGT, AND J. DOELLNER. Locating and understanding features of complex software systems by synchronizing time-, collaboration- and code-focused views on execution traces. In Proceedings of the IEEE International Conference on Program Comprehension (ICPC'08), pages 268-271, 2008.

[25] L. C. BRIAND, .J. DALY, AND J. WUST. A unified framework for coupling mea­ surement in object oriented systems. IEEE Transactions on Software Engineering, 25(1):91-121, 1999.

[26] L. C. BRIAND, P. DEVANBU, AND W. L. MELO. An investigation into coupling measures for C++. In Proceedings of the International Conference on Software Engineering (ICSE'97), pages 412-421, Boston, MA, 1997.

[27] L. C. BRIAND, S. MORASCA, AND V. R. BASILI. Property-based software engineering measurements. IEEE Transactions on Software Engineering, 22(1):68-85, 1996.

[28] L. C. BRIAND, J. WUST, J. W. DALY, AND V. D. PORTER. Exploring the re­ lationship between design measures and software quality in object-oriented systems. Journal of System and Software, 51(3):245-273, 2000.

[29] L. C. BRIAND, J. WUST, AND H. LOUNIS. Using coupling measurement for impact analysis in object-oriented systems. In Proceedings of the IEEE International Conference on Software Maintenance (ICSM'99), pages 475-482, 1999.

[30] L.C. BRIAND, W. MELO, AND J. WUST. Assessing the applicability of fault­ proneness models across object-oriented software projects. IEEE Transactions on Software Engineering, 28(7):706-720, 2002.

[31] S. BRIN AND L. PAGE. The anatomy of a large-scale hypertextual web search engine. In Proceedings of the 7th International Conference on World Wide Web, pages 107- 117, Brisbane, Australia, 1998.

[32] M. BRUNTINK, A. VAN DEURSEN, T. TOURWE, AND R. VAN ENGELEM. An eval­ uation of clone detection techniques for identifying cross-cutting concerns. In Pro­ ceedings of the International Confer-ence on Software Maintenance, pages 200-209, 2004.

[33] M. BRUNTINK, A. VAN DEURSEN, R. VAN ENGELEN, AND T. TOURWE. On the use of clone detection for identifying crosscutting concern code. IEEE Transactions on Software Engineering, 31:804-818, 2005.

[34] J. BUCHTA, M. PETRENKO, D. POSHYVANYK, AND V. RAJLICH. Teaching evolution of open source projects in software engineering courses. In Proceedings of the IEEE International Conference on Software Maintenance (ICSM'06}, pages 136-144, 2006.

[35] J. BUCKNER, J. BUCHTA, M. PETRENKO, AND V. RAJLICH. Jripples: A tool for program comprehension during incremental change. In Proceedings of the IEEE In­ ternational Workshop on Program Comprehension (IWPC'05}, pages 149-152, 2005.

[36] L. CARVER AND W. G. GRISWOLD. Sorting out concerns. In Proceedings of the OOPSLA'99 Workshop on Multi-Dimensional Separation of Concerns, 1999.

[37] M. CATALDO, A. MOCKUS, J. A. ROBERTS, AND J. D. HERBSLEB. Software dependencies, work dependencies, and their impact on failures. IEEE Transactions on Software Engineering, 35(6):864-978, 2009.

[38] A. CHEN, E. CHOU, J. WONG, A.Y. YAO, QING ZHANG, SHAO ZHANG, AND A. MICHAIL. Cvssearch: searching through source code using cvs comments. In Pro­ ceedings of the IEEE International Conference on Software Maintenance (ICSM'Ol}, pages 364-373, 2001.

[39] K. CHEN AND V. RAJLICH. Case study of feature location using dependence graph. In Proceedings of the 8th IEEE International Workshop on Program Comprehension (IWPC'00), pages 241-249, Limerick, Ireland, 2000.

[40] K. CHEN AND V. RAJLICH. Ripples: Tool for change in legacy software. In Pro­ ceedings of the International Conference on Software Maintenance (ICSM'Ol}, pages 230-239, Florence, Italy, 2001.

[41] S. CHIDAMBER, D. DARCY, AND C. KEMERER. Managerial use of metrics for object­ oriented software: An exploratory analysis. IEEE Transactions on Software Engi­ neering, 24(8):629-639, 1998.

[42] S. R. CHIDAMBER AND C. F. KEMERER. A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6):476-493, 1994.

[43] N. CHURCHER AND M. SHEPPERD. Towards a conceptual framework for object oriented software metrics. SIGSOFT Softw. Eng. Notes, 20(2):69-75, 1995.

[44] B. CLEARY AND C. ExTON. Assisting concept location in software comprehension. In Pmceedings of the 19th Psychology of Programming Workshop, pages 42-55, 2007.

[45] B. CLEARY, C. EXTON, J. BUCKLEY, AND M. ENGLISH. An empirical analysis of information retrieval based concept location techniques in software comprehension. Empirical Software Engineering, 14(1) :93-130, 2009.

[46] J. COHEN. Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Assoc., New York, second edition, 1988.

[47] P. COMON. Independent component analysis, a new concept? Signal Processing, 36(3):287-314, 1994.

[48] W.L. CONOVER. Practical Nonparametric Statistics. Wiley, third edition, 1998.

[49] R. COOLEY, B. MOBASHER, AND J. SRIVASTAVA. Web mining: Information and pattern discovery on the world wide web. In Proceedings of the 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'97), pages 558-567, 1997.

[50] D. COPPIT, R. PAINTER, AND M. REVELLE. Spotlight: A prototype tool for software plans. In Proceedings of the International Conference on Software Engineering (ICSE'07), pages 754-757, 2007.

[51] T. A. CORBI. Program understanding: Challenge for the 1990's. IBM Systems Journal, 28(2):294-396, 1989.

[52] B. CORNELISSEN, A. ZAIDMAN, A. VAN DEURSEN, L. MOONEN, AND R. KOSCHKE. A systematic survey of program comprehension through dynamic analysis. IEEE Transactions on Software Engineering, 99(2):684-702, 2009.

[53] B. CORNELISSEN, A. ZAIDMAN, A. VAN DEURSEN, AND B. ROMPAEY. Trace visu­ alization for program comprehension: a controlled experiment. In Proceedings of the 17th IEEE International Conference on Program Comprehension (ICPC'09), pages 100-109, Vancouver, BC, Canada, 2009.

[54] D. DARCY AND C. KEMERER. Oo metrics in practice. IEEE Software, 22(6):17-19, 2005.

[55] B. DE ALWIS AND G. C. MURPHY. Answering conceptual queries with ferret. In Proceedings of the International Conference on Software Engineering (ICSE'08), pages 21-30, 2008.

[56] B. DE ALWIS, G. C. MURPHY, AND M. P. ROBILLARD. A comparative study of three program exploration tools. In Proceedings of the International Conference on Program Comprehension (ICPC'07}, pages 103-112, 2007.

[57] A. DE LUCIA, F. FASANO, R. OLIVETO, AND G. TORTORA. Recovering traceabil­ ity links in software artefact management systems. A CM Transactions on Software Engineering and Methodology, 16(4):13, 2007.

[58] A. DE LUCIA, R. OLIVETO, AND L. VORRARO. Using structural and semantic met­ rics to improve class cohesion. In Proceedings of the IEEE International Conference on Software Maintenance (ICSM'08}, pages 27-36, 2008.

[59] S. DEERWESTER, S. T. DUMAIS, G. W. FURNAS, T. K. LANDAUER, AND R. HARSHMAN. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6):391-407, 1990.

[60] B. DUFOUR, B. G. RYDER, AND G. SEVITSKY. Blended analysis for performance understanding of framework-based applications. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA'07), pages 118-128, London, United Kingdom, 2007.

[61] M. EADDY. An Empirical Assessment of the Crosscutting Concern Problem. PhD thesis, Columbia University, 2008.

[62] M. EADDY, A. V. AHO, G. ANTONIOL, AND Y. G. GUEHENEUC. Cerberus: Tracing requirements to source code using information retrieval, dynamic analysis, and program analysis. In Proceedings of the International Conference on Program Comprehension (ICPC'08), pages 53-62, Amsterdam, The Netherlands, 2008.

[63] M. EADDY, T. ZIMMERMANN, K. D. SHERWOOD, V. GARG, G. C. MURPHY, N. NAGAPPAN, AND A. V. AHO. Do crosscutting concerns cause defects? IEEE Transactions on Software Engineering, 34(4):497-515, 2008.

[64] J. EDER, G. KAPPEL, AND M. SCHREFL. Coupling and cohesion in object-oriented systems. Technical report, University of Klagenfurt, 1994.

[65] D. EDWARDS, S. SIMMONS, AND N. WILDE. An approach to feature location in distributed systems. Journal of Systems and Software, 79(1):57-68, 2006.

[66] D. EDWARDS, N. WILDE, S. SIMMONS, AND E. GOLDEN. Instrumenting time­ sensitive software for feature location. In Proceedings of the International Conference on Program Comprehension (ICPC'09}, pages 130-137, Vancouver, British Columbia, 2009.

[67] A. EGYED. A scenario-driven approach to trace dependency analysis. IEEE Trans­ actions on Software Engineering, 29(2):116 - 132, 2003.

[68] A. EGYED. Resolving uncertainties during trace analysis. In Proceedings of the 12th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE'04), pages 3-12, Newport Beach, CA, 2004.

[69] A. EGYED, G. BINDER, AND P. GRUNBACHER. Strada: A tool for scenario-based feature-to-code trace detection and analysis. In Proceedings of the International Conference on Software Engineering (ICSE'07), pages 41-42, 2007.

[70] A. EGYED AND P. GRUNBACHER. Identifying requirements conflicts and cooperation. IEEE Softwar-e, 21(6):50-58, 2004.

[71] A. EGYED AND P. GRUNBACHER. Supporting software understanding with auto­ mated traceability. International Journal of Software Engineering and Knowledge Engineering, 15(5):783-810, 2005.

[72] A. EGYED, M. HEINDL, S. BIFFL, AND P. GRUNBACHER. Determining the cost­ quality trade-off for automated software traceability. In Proceedings of the Inter­ national Confer-ence on Automated Software Engineering (ASE'05}, pages 360-363, Long Beach, CA, 2005.

[73] T. EISENBARTH, R. KOSCHKE, AND D. SIMON. Aiding program comprehension by static and dynamic feature analysis. In Proceedings of the International Conference on Software Maintenance (ICSMOl}, pages 602-611, Florence, Italy, 2001.

[74] T. EISENBARTH, R. KOSCHKE, AND D. SIMON. Derivation of feature component maps by means of concept analysis. In Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR'Ol}, pages 176-179, 2001.

[75] T. EISENBARTH, R. KOSCHKE, AND D. SIMON. Feature-driven program understand­ ing using concept analysis of execution traces. In Proceedings of the International Workshop on Program Comprehension (IWPC'Ol}, pages 300-309, 2001.

[76] T. EISENBARTH, R. KOSCHKE, AND D. SIMON. Locating features in source code. IEEE Transactions on Software Engineering, 29(3):210-224, 2003.

[77] A. D. EISENBERG AND K. DE VOLDER. Dynamic feature traces: Finding features in unfamiliar code. In Proceedings of the International Confer-ence on Software Main­ tenance (ICSM'05}, pages 337-346, Budapest, Hungary, 2005.

[78] K. EL-EMAM, S. BENLARBI, N. GOEL, AND S. N. RAJ. The confounding effect of class size on the validity of object-oriented metrics. IEEE Transactions on Software Engineering, 27(7):630-650, 2001.

[79] M. ERNST. Static and dynamic analysis: Synergy and duality. In Proceedings of the Workshop on Dynamic Analysis (WODA'03), pages 24-27, 2003.

[80] M. FISCHER, M. PINZGER, AND H. GALL. Analyzing and relating bug report data for feature tracking. In Proceedings of the Working Conference on Reverse Engineering (WCRE'03), pages 90-101, 2003.

[81] G. W. FURNAS, T. K. LANDAUER, L. M. GOMEZ, AND S. T. DUMAIS. The vocabulary problem in human-system communication. Communications of the A CM, 30(11):964-971, 1987.

[82] H. GALL, M. JAZAYERI, AND J. KRAJEWSKI. Cvs release history data for detecting logical couplings. In Proceedings of the Sixth International Workshop on Principles of Software Evolution (IWPSE'03), pages 13-23, 2003.

[83] B. GANTER AND R. WILLE. Formal Concept Analysis. Springer-Verlag, Berlin, Heidelberg, New York, 1996.

[84] J. GAO, J.Y. NIE, G. Wu, AND G. CAO. Dependence language model for in­ formation retrieval. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 170-177, Sheffield, United Kingdom, 2004.

[85] G. GAY, S. HAIDUC, A. MARCUS, AND T. MENZIES. On the use of relevance feedback in ir-based concept location. In Proceedings of the International Conference on Software Maintenance (ICSM'09}, pages 351-360, 2009.

[86] R. GEIGER, B. FLURI, H. GALL, AND M. PINZGER. Relation of code clones and change couplings. In Proceedings of the 9th International Conference of Fundamental Approaches to Software Engineering (FASE'06), pages 411-425, 2006.

[87] M. M. GEIPEL AND F. SCHWEITZER. Software change dynamics: Which depen­ dencies do matter? empirical evidence from 35 java projects. In Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE'09), pages 269-272, 2009.

[88] O. GIROUX AND M. P. ROBILLARD. Detecting increases in feature coupling using regression tests. In Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 163-174, Portland, Oregon, USA, 2006.

[89] N. E. GOLD AND K.H. BENNETT. Hypothesis-based concept assignment in software maintenance. lEE Proceedings-Software, 149(4):103-110, 2002.

[90] S. GRANT, J. R. CORDY, AND D. B. SKILLICORN. Automated concept location using independent component analysis. In Proceedings of the Working Conference on Reverse Engineering (WCRE'08), pages 138-142, 2008.

[91] O. GREEVY. Enriching Reverse Engineering with Feature Analysis. PhD thesis, University of Bern, 2007.

[92] O. GREEVY AND S. DUCASSE. Correlating features and code using a compact two-sided trace analysis approach. In Proceedings of the Ninth European Conference on Software Maintenance and Reengineering (CSMR'05), pages 314-323, 2005.

[93] 0. GREEVY, S. DUCASSE, AND T. GIRBA. Analyzing software evolution through fea­ ture views. Journal of Software Maintenance and Evolution: Research and Practice, 18(6):425-456, 2006.

[94] 0. GREEVY, T. GIRBA, AND S. DUCASSE. How developers develop features. In Proceedings of the 11th European Conference on Software Maintenance and Reengi­ neering (CSMR'07), pages 265-274, 2007.

[95] W. G. GRISWOLD. Coping with crosscutting software changes using information transparency. In Proceedings of the 3rd International Conference on Metalevel Ar­ chitectures and Separation of Crosscutting Concerns, Kyoto, Japan, 2001.

[96] W.G. GRISWOLD, Y. KATO, AND J.J. YUAN. Aspect browser: Tool support for managing dispersed aspects. In Proceedings of the Workshop on Multi-Dimensional Separation of Concerns, 2000.

[97] T. GYIMOTHY, R. FERENC, AND I. SIKET. Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Transactions on Software Engineering, 31(10):897-910, 2005.

[98] M. HARMAN, N. E. GOLD, R. M. HIERONS, AND D. W. BINKLEY. Code extraction algorithms which unify slicing and concept assignment. In Proceedings of the IEEE Working Conference on Reverse Engineering (WCRE'02), pages 11-21, 2002.

[99] S.G HART AND L.E. STAVELAND. Development of nasa-tlx (task load index): Results of empirical and theoretical research. Human Mental Workload, 1:139-183, 1988.

[100] J.H. HAYES, A. DEKHTYAR, AND S.K. SUNDARAM. Advancing candidate link gener­ ation for requirements tracing: the study of methods. IEEE Transactions on Software Engineering, 32(1):4-19, 2006.

[101] S. HENRY AND K. KAFURA. Software structure metrics based on information flow. IEEE Transactions on Software Engineering, 7(5):510-518, 1981.

[102] E. HILL, L. POLLOCK, AND K. VIJAY-SHANKER. Exploring the neighborhood with dora to expedite software maintenance. In Proceedings of the International Conference on Automated Software Engineering (ASE'07), pages 14-23, 2007.

[103] E. HILL, L. POLLOCK, AND K. VIJAY-SHANKER. Automatically capturing source code context of nl-queries for software maintenance and reuse. In Proceedings of the International Conference on Software Engineering {ICSE'09), pages 232-242. Vancouver, Canada, 2009.

[104] M. HITZ AND B. MONTAZERI. Measuring coupling and cohesion in object-oriented systems. In Proceedings of the International Symposium on Applied Corporate Computing, Monterrey, Mexico, 1995.

[105] S. IBRAHIM, N. B. IDRIS, AND A. DERAMAN. Case study: Reconnaissance techniques to support feature location using recon2. In Proceedings of the Asia-Pacific Software Engineering Conference (APSEC'03), pages 371-378. IEEE Computer Society, 2003.

[106] K. INOUE, R. YOKOMORI, T. YAMAMOTO, M. MATSUSHITA, AND S. KUSUMOTO. Ranking significance of software components based on use relations. IEEE Transactions on Software Engineering (TSE), 31(3):213-225, 2005.

[107] D. JANZEN AND K. VOLDER. Navigating and querying code without getting lost. In Proceedings of the International Conference on Aspect-Oriented Software Develop­ ment (AOSD'03}, pages 178-187, 2003.

[108] H. JIANG, T. NGUYEN, I. X. CHE, H. JAYGARL, AND C. CHANG. Incremental latent semantic indexing for effective, automatic traceability link evolution management. In Proceedings of the 23rd IEEE/ACM International Conference on Automated Software Engineering (ASE'08}, pages 59-68, L'Aquila, Italy, 2008.

[109] Z. JIN AND A. J. OFFUTT. Coupling-based . In Proceedings of the IEEE International Conference on Engineering of Complex Computer Systems (ICECCS'96}, pages 10-17, Montreal, Quebec, Canada, 1996.

[110] S.H. KAN. Metrics and Models in Software Quality Engineering. Addison-Wesley, Boston, second edition, 2003.

[111] G. KICZALES, J. LAMPING, A. MENDHEKAR, C. MAEDA, C. VIDEIRA LOPES, J. M. LOINGTIER, AND J. IRWIN. Aspect-oriented programming. In Proceedings of the European Conference on Object-Oriented Programming, volume 1241 of Lec­ ture Notes in Computer Science, pages 220-242, Jyvaskyla, Finland, 1997. Springer­ Verlag.

[112] L. A. KLEIN. Sensor and data fusion: A Tool for Information Assessment and Decision Making. SPIE Publications, Bellingham, WA, 2004.

[113] J. M. KLEINBERG. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5):604-632, 1999.

[114] A.J. Ko, H.H. AUNG, AND B.A. MYERS. Eliciting design requirements for maintenance-oriented ides: a detailed study of corrective and perfective mainte­ nance tasks. In Proceedings of the International Conference on Software Engineering (ICSE'05}, pages 126-135, 2005.

[115] A.J. Ko, R. DELINE, AND G. VENOLIA. Information needs in collocated software development teams. In Proceedings of the 29th IEEE/A CM International Conference on Software Engineering (ICSE'01}, pages 344-353, 2007.

[116] A. J. KO AND B. A. MYERS. Debugging reinvented: asking and answering why and why not questions about program behavior. In Proceedings of the International Conference on Software Engineering (ICSE'08), pages 301-310, Leipzig, Germany, 2008.

[117] A. J. KO, B. A. MYERS, M. J. COBLENZ, AND H. H. AUNG. An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks. IEEE Transactions on Software Engineering, 32(12):971-987, 2006.

[118] R. KOSCHKE AND J. QUANTE. On dynamic feature location. In Proceedings of the International Conference on Automated Software Engineering (ASE'05), pages 86-95, Long Beach, CA, USA, 2005.

[119] J. KOTHARI, D. BESPALOV, S. MANCORIDIS, AND A. SHOKOUFANDEH. On evaluat­ ing the efficiency of software feature development using algebraic manifolds. In Pro­ ceedings of the IEEE International Conference on Software Maintenance (ICSM'08), pages 7-16, 2008.

[120] J. KOTHARI, T. DENTON, S. MANCORIDIS, AND A. SHOKOUFANDEH. On computing the canonical features of software systems. In Proceedings of the Working Conference on Reverse Engineering (WCRE'06), pages 93-102, 2006.

[121] J. KOTHARI, T. DENTON, S. MANCORIDIS, AND A. SHOKOUFANDEH. Reducing pro­ gram comprehension effort in evolving software by recognizing feature implementation convergence. In Proceedings of the International Conference on Program Comprehen­ sion (ICPC), pages 17-26, Banff, Canada, 2007.

[122] S. KRAMER AND H. KAINDL. Coupling and cohesion metrics for knowledge-based systems using frames and rules. A CM Transactions on Software Engineering and Methodology, 13(3):332-358, 2004.

[123] T. D. LATOZA, DAVID GARLAN, J. D. HERBSLEB, AND B. A. MYERS. Program comprehension as fact finding. In Proceedings of the Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foun­ dations of Software Engineering, pages 361 - 370, 2007.

[124] Y. S. LEE, B. S. LIANG, S. F. Wu, AND F. J. WANG. Measuring the coupling and cohesion of an object-oriented program based on information flow. In Proceedings of the International Conference on Software Quality, pages 81-90, Maribor, Slovenia, 1995.

[125] S. LETOVSKY AND E. SOLOWAY. Delocalized plans and program comprehension. IEEE Software, 3(3):41-49, 1986.

[126] W. LI AND S. HENRY. Object-oriented metrics that predict maintainability. Journal of Systems and Software, 23(2):111-122, 1993.

[127] Z. LI. Identifying High-Level Dependence Structures Using Slice-Based Dependence Analysis. PhD thesis, King's College London, University of London, 2009.

[128] A. LIENHARD, O. GREEVY, AND O. NIERSTRASZ. Tracking objects to detect feature dependencies. In Proceedings of the 15th IEEE International Conference on Program Comprehension (ICPC'07), pages 59-68, Banff, Canada, 2007.

[129] D. LIU, M. BROCKMEYER, AND S. XU. Tag (trace + grep): a simple feature location approach. In Proceedings of the International Workshop on Program Comprehension through Dynamic Analysis (PCODA'08), pages 17-21, Antwerp, Belgium, 2008.

[130] D. LIU, A. MARCUS, D. POSHYVANYK, AND V. RAJLICH. Feature location via information retrieval based filtering of a single scenario execution trace. In Proceedings of the International Conference on Automated Software Engineering (ASE'07), pages 234-243, Atlanta, Georgia, 2007.

[131] J. LIU, D. BATORY, AND C. LENGAUER. Feature oriented refactoring of legacy applications. In Proceedings of the 28th IEEE/ACM International Conference on Software Engineering (ICSE'06), pages 112 - 121, Shanghai, China, 2006. [132] S. LUKINS, N. KRAFT, AND L. ETZKORN. Source code retrieval for bug location using latent dirichlet allocation. In Proceedings of the 15th Working Conference on Reverse Engineering (WCRE'08}, pages 155-164, Antwerp, Belgium, 2008.

[133] K. LUKOIT, N. WILDE, S. STOWELL, AND T. HENNESSEY. Tracegraph: Immediate visual location of software features. In Proceedings of the International Conference on Software Maintenance (ICSM'OO}, pages 33-39, Washington DC, 2000. [134] A. MARCUS. Semantic driven program analysis. In Proceedings of the 20th IEEE International Conference on Software Maintenance (ICSM'04), pages 496-473, Chicago, IL, 2004.

[135] A. MARCUS, L. FENG, AND J. I. MALETIC. 3d representations for software visualiza­ tion. In Proceedings of the A CM Symposium on Software Visualization (Soft Vis '03}, pages 27-36, San Diego, CA, 2003.

[136] A. MARCUS AND J. I. MALETIC. Identification of high-level concept clones in source code. In Pr-oceedings of the Automated Software Engineering (ASE'01}, pages 107- 114, San Diego, CA, 2001.

[137] A. MARCUS, J. I. MALETIC, AND A. SERGEYEV. Recovery of traceability links between documentation and source code. International Journal of Software Engineering and Knowledge Engineering, 15(4):811-836, 2005.

[138] A. MARCUS AND D. POSHYVANYK. The conceptual cohesion of classes. In Proceedings of the 21st IEEE International Conference on Software Maintenance (ICSM'05), pages 133-142, Budapest, Hungary, 2005.

[139] A. MARCUS, D. POSHYVANYK, AND R. FERENC. Using the conceptual cohesion of classes for fault prediction in object oriented systems. IEEE Transactions on Software Engineering, 34(2):287-300, 2008.

[140] A. MARCUS AND V. RAJLICH. Panel summary: Identifications of concepts, features, and concerns in source code. In Proceedings of the International Conference on Software Maintenance (ICSM'05), page 718, Budapest, Hungary, 2005.

[141] A. MARCUS, V. RAJLICH, J. BUCHTA, M. PETRENKO, AND A. SERGEYEV. Static techniques for concept location in object-oriented code. In Proceedings of the IEEE International Workshop on Program Comprehension (IWPC'05}, pages 33-42, 2005.

[142] A. MARCUS, A. SERGEYEV, V. RAJLICH, AND J. MALETIC. An information retrieval approach to concept location in source code. In Proceedings of the Working Conference on Reverse Engineering (WCRE'04), pages 214-223, Delft, The Netherlands, 2004.

[143] M. MARIN, L. MOONEN, AND A. VAN DEURSEN. Documenting typical crosscutting concerns. In Proceedings of the Working Conference on Reverse Engineering (WCRE'07), pages 31-40, 2007.

[144] M. MARIN, A. VAN DEURSEN, AND L. MOONEN. Identifying aspects using fan-in analysis. In Proceedings of the 11th IEEE Working Conference on Reverse Engineer­ ing (WCRE'04}, 2004.

[145] M. MARIN, A. VAN DEURSEN, AND L. MOONEN. Identifying crosscutting concerns using fan-in analysis. ACM Transactions on Software Engineering and Methodology, 17(1), 2007.

[146] R. MARTIN. Oo design quality metrics-an analysis of dependencies. In Proceedings of the Workshop on Pragmatic and Theoretical Directions in Object-Oriented Software Metrics, OOPSLA '94, pages 1-8, 1994.

[147] K. 0. MCGRAW AND S.P. WONG. Forming inferences about some intraclass corre­ lation coefficients. Psychological Methods, 1(1):30-46, 1996.

[148] C. MCMILLAN, D. POSHYVANYK, AND M. REVELLE. Combining textual and struc­ tural analysis of software artifacts for traceability link recovery. In Proceedings of the International Workshop on Traceability in Emerging Forms of Software Engineering (TEFSE'09), pages 41-48, Vancouver, Canada, 2009.

[149] A. MEHTA AND G.T. HEINEMAN. Evolving features into fine-grained. In Proceedings of the 24th International Conference on Software Engineering, pages 417-427, 2002.

[150] T. MENDE, R. KOSCHKE, AND F. BECKWERMERT. An evaluation of code similarity identification for the grow-and-prune model. Journal of Software Maintenance and Evolution: Research and Practice, 21(2):143-169, 2009.

[151] T. M. MEYERS AND D. BINKLEY. An empirical study of slice-based cohesion and coupling metrics. ACM Transactions on Software Engineering and Methodology, 17(1):1-27, 2007.

[152] H. OLAGUE, L. ETZKORN, S. GHOLSTON, AND S. QUATTLEBAUM. Empirical val­ idation of three software metrics suites to predict fault-proneness of object-oriented classes developed using highly iterative or agile software development processes. IEEE Transactions on Software Engineering, 33(6):402-419, 2007.

[153] A. OLSZAK AND B. NØRREGAARD JØRGENSEN. Remodularizing java programs for comprehension of features. In Proceedings of the First International Workshop on Feature-Oriented Software Development (FOSD'09), pages 19-26, 2009.

[154] A. ORSO, T. APIWATTANAPONG, J. LAW, G. ROTHERMEL, AND M. J. HARROLD. An empirical comparison of dynamic impact analysis algorithms. In Proceedings of the IEEE/ACM International Conference on Software Engineering (ICSE'04), pages 776-786, 2004.

[155] M. PETRENKO, D. POSHYVANYK, V. RAJLICH, AND J. BUCHTA. Teaching software evolution in open source. IEEE Computer, 40(11):25-31, 2007.

[156] M. PETRENKO AND V. RAJLICH. Variable granularity for improving precision of impact analysis. In Proceedings of the International Conference on Program Comprehension (ICPC'09), pages 10-19, 2009.

[157] M. PETRENKO, V. RA.JLICH, AND R. VANCIU. Partial domain comprehension in software evolution and maintenance. In Proceedings of the International Conference on Program Comprehension (ICPC'08}, pages 13-22, 2008.

[158] M. PORTER. An algorithm for suffix stripping. Program, 14(3):130-137, 1980.

[159] D. POSHYVANYK, Y. G. GUEHENEUC, A. MARCUS, G. ANTONIOL, AND V. RAJLICH. Combining probabilistic ranking and latent semantic indexing for feature identification. In Proceedings of the International Conference on Program Comprehension (ICPC'06), pages 137-146, Athens, Greece, 2006.

[160] D. POSHYVANYK, Y. G. GUEHENEUC, A. MARCUS, G. ANTONIOL, AND V. RAJLICH. Feature location using probabilistic ranking of methods based on execution scenarios and information retrieval. IEEE Transactions on Software Engineering, 33(6):420-432, 2007.

[161] D. POSHYVANYK AND A. MARCUS. The conceptual coupling metrics for object­ oriented systems. In Proceedings of the 22nd IEEE International Conference on Software Maintenance (ICSM'06}, pages 469-478, Philadelphia, PA, 2006.

[162] D. POSHYVANYK, A. MARCUS, AND Y. DONG. Jiriss- an eclipse plug-in for source code exploration. In Proceedings of the International Conference on Program Com­ prehension (ICPC'06}, pages 252-255, Athens, Greece, 2006.

[163] D. POSHYVANYK, A. MARCUS, Y. DONG, AND A. SERGEYEV. IRiSS - a source code exploration tool. In Proceedings of the 21st IEEE International Conference on Software Maintenance (ICSM'05), pages 69-72, Budapest, Hungary, 2005.

[164] D. POSHYVANYK, A. MARCUS, R. FERENC, AND T. GYIMOTHY. Using information retrieval based coupling measures for impact analysis. Empirical Software Engineer­ ing, 14(1):5-32, 2009.

[165] D. POSHYVANYK AND A. MARCUS. Combining formal concept analysis with information retrieval for concept location in source code. In Proceedings of the International Conference on Program Comprehension (ICPC'07), pages 37-48, Banff, Alberta, Canada, 2007.

[166] D. POSHYVANYK, M. PETRENKO, A. MARCUS, X. XIE, AND D. LIU. Source code exploration with google. In Proceedings of the International Conference on Software Maintenance (ICSM'06), pages 334-338, Philadelphia, PA, 2006.

[167] V. RAJLICH AND P. GOSAVI. Incremental change in object-oriented programming. IEEE Software, pages 2-9, July-Aug. 2004.

[168] V. RAJLICH AND N. WILDE. The role of concepts in program comprehension. In Proceedings of the International Workshop on Program Comprehension (IWPC'02), pages 271-278. IEEE Computer Society Press, 2002.

[169] D. RATIU AND F. DEISSENBOECK. How programs represent reality (and how they don't). In Proceedings of the Working Conference on Reverse Engineering (WCRE'06), pages 83-92, 2006.

[170] D. RATIU AND F. DEISSENBOECK. From reality to programs and (not quite) back again. In Proceedings of the International Conference on Program Comprehension (ICPC'07), pages 91-102, Banff, Canada, 2007.

[171] X. REN, F. SHAH, F. TIP, B. G. RYDER, AND O. CHESLEY. Chianti: a tool for change impact analysis of java programs. In Proceedings of the 19th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA'04), pages 432-448, Vancouver, BC, Canada, 2004.

[172] M. REVELLE, T. BROADBENT, AND D. COPPIT. Understanding concerns in software: Insights gained from two case studies. In Proceedings of the International Workshop on Program Comprehension (IWPC'05}, pages 23-32, St. Louis, Missouri, 2005.

[173] M. REVELLE AND D. POSHYVANYK. An exploratory study on assessing feature lo­ cation techniques. In Proceedings of the International Conference on Program Com­ prehension (ICPC'09), pages 218-222, Vancouver, British Columbia, Canada, 2009.

[174] M. P. ROBILLARD. Representing Concerns in Sour·ce Code. PhD thesis, University of British Columbia, 2003.

[175] M. P. RoBILLARD. Automatic generation of suggestions for program investigation. In Proceedings of the Joint European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering, pages 11 - 20, Lisbon, Portugal, 2005.

[176] M. P. ROBILLARD. Topology analysis of software dependencies. ACM Transactions on Software Engineering and Methodology, 17(4), 2008.

[177] M. P. ROBILLARD, W. COELHO, AND G. C. MURPHY. How effective develop­ ers investigate source code: an exploratory study. IEEE Transactions on Software Engineering, 30(12):889- 903, 2004.

[178] M. P. ROBILLARD AND B. DAGENAIS. Retrieving task-related clusters from change history. In Proceedings of the Working Conference on Reverse Engineering (WCRE'05), pages 17-26, 2008.

[179] M. P. ROBILLARD AND G. C. MURPHY. Concern graphs: Finding and describing concerns using structural program dependencies. In Proceedings of the International Conference on Software Engineering (ICSE'02}, pages 406-416, 2002.

[180] M. P. ROBILLARD AND G. C. MURPHY. Automatically inferring concern code from program investigation activities. In Proceedings of the International Conference on Automated Software Engineering (ASE'03), pages 225-234, Linz, Austria, 2003.

[181] M. P. ROBILLARD AND G. C. MURPHY. Feat: a tool for locating, describing, and analyzing concerns in source code. In Proceedings of the International Conference on Software Engineering (ICSE'03), pages 822-823, 2003.

[182] M. P. ROBILLARD AND G. C. MURPHY. Representing concerns in source code. ACM Transactions on Software Engineering and Methodology, 16(1):1-38, 2007.

[183] M. P. ROBILLARD, D. SHEPHERD, E. HILL, K. VIJAY-SHANKER, AND L. POLLOCK. An empirical study of the concept assignment problem. Technical report, McGill University, June 2007.

[184] M. P. ROBILLARD AND F. WEIGAND-WARR. Concernmapper: Simple view-based separation of scattered concerns. In Proceedings of the Eclipse Technology Exchange at OOPSLA (ETX'05), pages 65-69, 2005.

[185] A. ROHATGI, A. HAMOU-LHADJ, AND J. RILLING. Feature location based on impact analysis. In Proceedings of the International Conference on Software Engineering and Applications (SEA '01), 2007.

[186] A. ROHATGI, A. HAMOU-LHADJ, AND J. RILLING. An approach for mapping features to code based on static and dynamic analysis. In Proceedings of the International Conference on Program Comprehension (ICPC'08), pages 236-241, 2008.

[187] A. ROHATGI, A. HAMOU-LHADJ, AND J. RILLING. An approach for solving the feature location problem by measuring the component modification impact. IET Software, 3(4):292-311, 2009.

[188] D. ROTHLISBERGER, O. GREEVY, AND O. NIERSTRASZ. Feature driven browsing. In Proceedings of the 2007 International Conference on Dynamic Languages, in conjunction with the 15th International Smalltalk Joint Conference 2007, pages 79-100, 2007.

[189] B. G. RYDER AND F. TIP. Change impact analysis for object-oriented programs. In Proceedings of the ACM SIGPLAN- SIGSOFT workshop on Program analysis for Software Tools and Engineering, ACM Press, editor, pages 46- 53, 2001.

[190] H. SAFYALLAH AND K. SARTIPI. Dynamic analysis of software systems using execution pattern mining. In Proceedings of the International Conference on Program Comprehension (ICPC'06), pages 84-88, 2006.

[191] R. A. SAHNER AND K. S. TRIVEDI. Sharpe: Symbolic hierarchical automated reliability and performance evaluator, introduction and guide for users. Technical report, Duke University, Sept. 1986.

[192] M. SALAH AND S. MANCORIDIS. A hierarchy of dynamic software views: from object-interactions to feature-interactions. In Proceedings of the International Conference on Software Maintenance (ICSM'04), pages 72-81, Chicago, IL, 2004.

[193] M. SALAH, S. MANCORIDIS, G. ANTONIOL, AND M. DI PENTA. Towards employing use-cases and dynamic analysis to comprehend mozilla. In Proceedings of the 21st IEEE International Conference on Software Maintenance (ICSM'05), pages 639-642, Budapest, Hungary, 2005.

[194] M. SALAH, S. MANCORIDIS, G. ANTONIOL, AND M. DI PENTA. Scenario-driven dynamic analysis for comprehending large software systems. In Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR'06), pages 71-80, 2006.

[195] G. SALTON AND M. MCGILL. Introduction to Modern Information Retrieval. McGraw-Hill, New York, 1986.

[196] G. SALTON, A. WONG, AND C.S. YANG. A vector space model for automatic indexing. Communications of the ACM, 18(11):620, 1975.

[197] M.Z. SAUL, V. FILKOV, P. DEVANBU, AND C. BIRD. Recommending random walks. In Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), pages 15-24, 2007.

[198] T. SAVAGE, M. REVELLE, AND D. POSHYVANYK. Flat3: Feature location and textual tracing tool. In Proceedings of the International Conference on Software Engineering (ICSE'10), Cape Town, South Africa, 2010.

[199] S. SCHACH. Object-Oriented and Classical Software Engineering. McGraw-Hill Higher Education, New York, 5th edition, 2001.

[200] P. SHAO AND R. K. SMITH. Feature location by ir modules and call graph. In Proceedings of the ACM Annual Southeast Regional Conference, Clemson, South Car­ olina, 2009.

[201] D. SHEPHERD, Z. P. FRY, E. GIBSON, L. POLLOCK, AND K. VIJAY-SHANKER. Using natural language program analysis to locate and understand action-oriented concerns. In Proceedings of the International Conference on Aspect Oriented Software Development (AOSD'07), pages 212-224, 2007.

[202] D. SHEPHERD, J. PALM, L. POLLOCK, AND M. CHU-CARROLL. Timna: a framework for automatically combining aspect mining analyses. In Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering (ASE'05), pages 193-202, 2005.

[203] D. SHEPHERD, L. POLLOCK, AND E. GIBSON. Design and evaluation of an automated aspect mining tool. In Proceedings of the International Conference on Software Engineering Research and Practice, 2004.

[204] D. SHEPHERD, L. POLLOCK, AND K. VIJAY-SHANKER. Towards supporting on­ demand virtual remodularization using program graphs. In Proceedings of the In­ ternational Conference on Aspect-Oriented Software Development (AOSD'06), pages 3-14, Bonn, Germany, 2006.

[205] D. SHEPHERD, L. POLLOCK, AND K. VIJAY-SHANKER. Case study: supplementing program analysis with natural language analysis to improve a reverse engineering task. In Proceedings of the 7th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, pages 54-59, 2007.

[206] M. SHERRIFF AND L. WILLIAMS. Empirical software change impact analysis using singular value decomposition. In Proceedings of the International Conference on Software Testing, Verification, and Validation (ICST'08), pages 268-277, 2008.

[207] S. SIMMONS, D. EDWARDS, N. WILDE, J. HOMAN, AND M. GROBLE. Industrial tools for the feature location problem: an exploratory study. Journal of Software Maintenance: Research and Practice, 18(6):457-474, 2006.

[208] R. M. SIRKIN. Statistics for the Social Sciences. Sage Publications, CA, third edition, 2005.

[209] I. SOMMERVILLE. Software Engineering. Addison-Wesley, New York, sixth edition, 2001.

[210] C. SPEARMAN. The proof and measurement of association between two things. The American Journal of Psychology, 15(1):72-101, 1904.

[211] R. SUBRAMANYAM AND M. S. KRISHNAN. Empirical analysis of ck metrics for object­ oriented design complexity: Implications for software defects. IEEE Transactions on Software Engineering, 29(4):297-310, 2003.

[212] R. TAIRAS AND J. GRAY. An information retrieval process to aid in the analysis of code clones. Empirical Software Engineering, 14(1):33-56, 2009.

[213] P. TARR, H. OSSHER, W. HARRISON, AND S. M. SUTTON JR. N degrees of separation: multi-dimensional separation of concerns. In Proceedings of the International Conference on Software Engineering (ICSE'99), pages 107-119, 1999.

[214] M. TRIFU. Using dataflow information for concern identification in object-oriented software systems. In Proceedings of the European Conference on Software Mainte­ nance and Reengineering (CSMR '08}, pages 193-202, 2008.

[215] M. TRIFU. Improving the dataflow-based concern identification approach. In Pro­ ceedings of the European Conference on Software Maintenance and Reengineering (CSMR '09}, pages 109-118, 2009.

[216] C. R. TURNER, A. FUGGETTA, L. LAVAZZA, AND A. L. WOLF. A conceptual basis for feature engineering. Journal of Systems and Software, 49(1):3-15, 1999.

[217] D. CuBRANIC AND G. C. MURPHY. Hipikat: Recommending pertinent software development artifacts. In Proceedings of the International Conference on Software Engineering (ICSE'03}, pages 408-418, Portland, OR, 2003.

[218] D. CUBRANIC, G. C. MuRPHY, J. SINGER, AND K. S. BooTH. Hipikat: A project memory for software development. IEEE Transactions on Software Engineering, 31(6):446-465, 2005.

[219] D. CUBRANIC, G. C. MURPHY, J. SINGER, AND K. S. BOOTH. Learning from project history: a case study for software development. In Proceedings of the ACM Conference on Computer Supported Cooperative Work (CSCW'04), pages 82-91, Chicago, Illinois, USA, 2004.

[220] A. VAN DEURSEN AND HANS-GERHARD GROSS. An industrial case study in recon­ structing requirements views. Empirical Software Engineering, 13(6):727-760, 2008.

[221] J. VAN GEET AND S. DEMEYER. Feature location in mainframe systems: an experience report. In Proceedings of the International Conference on Software Maintenance (ICSM'09), pages 361-370, 2009.

[222] N. WALKINSHAW, M. ROPER, AND M. WOOD. Feature location and extraction using landmarks and barriers. In Proceedings of the International Conference on Software Maintenance (ICSM'07), pages 54-63, Paris, France, 2007.

[223] X. WANG, L. ZHANG, T. XIE, J. ANVIK, AND J. SUN. An approach to detecting duplicate bug reports using natural language and execution information. In Proceedings of the 30th International Conference on Software Engineering (ICSE'08), pages 461-470, Leipzig, Germany, 2008.

[224] F. WEIGAND-WARR AND M. P. ROBILLARD. Suade: Topology-based searches for software investigation. In Proceedings of the International Conference on Software Engineering (ICSE'08}, pages 780-783, 2008.

[225] N. WILDE, M. BUCKELLEW, H. PAGE, AND V. RAJLICH. A case study of fea­ ture location in unstructured legacy fortran code. In Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR'01}, pages 68-76, 2001.

[226] N. WILDE, M. BUCKELLEW, H. PAGE, V. RAJLICH, AND L. POUNDS. A comparison of methods for locating features in legacy software. Journal of Systems and Software, 65(2): 105-114, 2003.

[227] N. WILDE AND C. CASEY. Early field experience with the software reconnaissance technique for program comprehension. In Proceedings of the International Conference on Software Maintenance (ICSM'96), page 270, 1996.

[228] N. WILDE, J. A. GOMEZ, T. GUST, AND D. STRASBURG. Locating user functionality in old code. In Proceedings of the International Conference on Software Maintenance (ICSM'92), pages 200-205, Orlando, FL, 1992.

[229] N. WILDE AND M. C. SCULLY. Software reconnaissance: Mapping program features to code. Software Maintenance: Research and Practice, 7(1):49-62, 1995.

[230] N. WILDE, S. SIMMONS, D. EDWARDS, AND L. POUNDS. But where does it do that? locating features in a distributed simulation. In Proceedings of the Fall Simulation Interoperability Workshop, Orlando, FL, 2002.

[231] F. G. WILKIE AND B. A. KITCHENHAM. Coupling measures and change ripples in C++ application software. Journal of Systems and Software, 52(2-3):157-164, 2000.

[232] K. WONG, S. R. TILLEY, H. A. MUELLER, AND M.-A. D. STOREY. Structural redocumentation: A case study. IEEE Software, 12(1):46-54, 1995.

[233] W. E. WONG AND S. GOKHALE. Static and dynamic distance metrics for feature-based code analysis. Journal of Systems and Software, 74(3):283-295, 2005.

[234] W. E. WONG, S. S. GOKHALE, J. R. HORGAN, AND K. S. TRIVEDI. Locating program features using execution slices. In Proceedings of the Symposium on Application-Specific Systems and Software Engineering and Technology (ASSET'99), pages 194-203, 1999.

[235] X. XIE, D. POSHYVANYK, AND A. MARCUS. 3D visualization for concept location in source code. In Proceedings of the International Conference on Software Engineering (ICSE'06), pages 839-842, Shanghai, China, 2006.

[236] Y. YE AND G. FISCHER. Reuse-conducive development environments. Journal of Automated Software Engineering, 12(2):199-235, 2005.

[237] R. K. YIN. Applications of Case Study Research. Sage Publications, Thousand Oaks, CA, USA, 2nd edition, 2003.

[238] Z. YU AND V. RAJLICH. Hidden dependencies in program comprehension and change propagation. In Proceedings of the 9th IEEE International Workshop on Program Comprehension (IWPC'01), pages 293-299, Toronto, Canada, 2001.

[239] A. ZAIDMAN AND S. DEMEYER. Automatic identification of key classes in a software system using webmining techniques. Journal of Software Maintenance and Evolution: Research and Practice, 20(6):387-417, 2008.

[240] A. ZAIDMAN, B. DU BOIS, AND S. DEMEYER. How webmining and coupling metrics improve early program comprehension. In Proceedings of the 14th IEEE International Conference on Program Comprehension (ICPC'06), pages 74-78, 2006.

[241] C. ZHAI AND J. LAFFERTY. A study of smoothing methods for language models applied to information retrieval. ACM Transactions on Information Systems, 22(2):179-214, 2004.

[242] J. ZHAO. Measuring coupling in aspect-oriented systems. In Proceedings of the 10th IEEE International Software Metrics Symposium (METRICS'04), pages 1-9, Chicago, USA, 2004.

[243] W. ZHAO, L. ZHANG, Y. LIU, J. SUN, AND F. YANG. SNIAFL: Towards a static non-interactive approach to feature location. In Proceedings of the International Conference on Software Engineering (ICSE'04), pages 293-303, 2004.

[244] W. ZHAO, L. ZHANG, Y. LIU, J. SUN, AND F. YANG. SNIAFL: Towards a static non-interactive approach to feature location. ACM Transactions on Software Engineering and Methodology, 15(2):195-226, 2006.

[245] Y. ZHOU, H. LEUNG, AND B. XU. Examining the potentially confounding effect of class size on the associations between object-oriented metrics and change-proneness. IEEE Transactions on Software Engineering, 35(5):607-623, 2009.

[246] T. ZIMMERMANN AND N. NAGAPPAN. Predicting defects using network analysis on dependency graphs. In Proceedings of the International Conference on Software Engineering (ICSE'08), pages 531-540, Leipzig, Germany, 2008.

[247] T. ZIMMERMANN, A. ZELLER, P. WEISSGERBER, AND S. DIEHL. Mining version histories to guide software changes. IEEE Transactions on Software Engineering, 31(6):429-445, 2005.

[248] L. ZOU, M. W. GODFREY, AND A. E. HASSAN. Detecting interaction coupling from task interaction histories. In Proceedings of the 15th IEEE International Conference on Program Comprehension (ICPC'07), pages 135-144, Banff, Alberta, Canada, 2007.

VITA

Meghan Revelle

Meghan Revelle is a PhD candidate at the College of William and Mary in Williamsburg, Virginia. She graduated summa cum laude with her B.S. in Computer Science from Mary Washington College in 2003, and she received her M.S. in Computer Science from William and Mary in 2005. Her research interests include software engineering, software maintenance and evolution, program comprehension, and reverse engineering.
