The Impact of Software Evolution on Code Coverage Information
Sebastian Elbaum
Dept. of Computer Science and Engineering
University of Nebraska-Lincoln
Lincoln, Nebraska
[email protected]

David Gable
Dept. of Computer Science and Engineering
University of Nebraska-Lincoln
Lincoln, Nebraska
[email protected]

Gregg Rothermel
Computer Science Dept.
Oregon State University
Corvallis, OR
[email protected]

Proceedings, IEEE International Conference on Software Maintenance, 2001, pp. 170-179. DOI: 10.1109/ICSM.2001.972727. Available from DigitalCommons@University of Nebraska - Lincoln, CSE Conference and Workshop Papers 132: https://digitalcommons.unl.edu/cseconfwork/132

Abstract

Many tools and techniques for addressing software maintenance problems rely on code coverage information. Often, this coverage information is gathered for a specific version of a software system, and then used to perform analyses on subsequent versions of that system without being recalculated. As a software system evolves, however, modifications to the software alter the software's behavior on particular inputs, and code coverage information gathered on earlier versions of a program may not accurately reflect the coverage that would be obtained on later versions. This discrepancy may affect the success of analyses dependent on code coverage information. Despite the importance of coverage information in various analyses, in our search of the literature we find no studies specifically examining the impact of software evolution on code coverage information. Therefore, we conducted empirical studies to examine this impact. The results of our studies suggest that even relatively small modifications can greatly affect code coverage information, and that the degree of impact of change on coverage may be difficult to predict.

1 Introduction

Many software maintenance techniques and tools require knowledge about the dynamic behavior of software. Program profiling techniques [2, 15] provide such knowledge, collecting code coverage information about such things as the statements, branches, paths, or functions encountered or taken during a program's execution. Such code coverage information supports maintenance-related activities such as impact analysis [3, 8], dynamic slicing [1, 12, 14], assessments of test adequacy [16, 19], selective regression testing [10, 21, 25], predictions of fault likelihood [6], dynamic code measurement [17], test suite minimization [22, 27], and test case prioritization [4, 23, 26].

Often, code coverage information is collected for a version of a program to aid in some maintenance or testing task performed on that particular version. For example, the execution of test suite T on version P_i of program P generates coverage information that could be used to determine the statement coverage adequacy of T on P_i.

In many other cases, however, code coverage information collected on a particular version P_i of program P is used to aid in analyses or tasks performed on subsequent versions of P. For example, most regression test selection and test case prioritization techniques (e.g., [4, 10, 21, 23, 25, 26]) use test coverage information from P_i to help select or prioritize the tests that should be executed on some later version P_{i+j} of P. Similarly, some techniques for reliability estimation [11] use coverage information from P_i to assess the risk of executing certain components in P_{i+j}. In many such cases, reuse of coverage data is essential.
It would make no sense, for example, to run all the tests in T on P_{i+1} in order to use that coverage information to select the subset of T that must be run on P_{i+1}! In other cases, it simply is not cost-effective to re-gather coverage information (reapplying expensive profiling techniques and re-executing the program many times) for each successive version of an evolving program. Instead, coverage information is gathered on some version of P, and re-used, without being recalculated, on several subsequent versions.

Of course, as software evolves, modifications to that software can alter that software's behavior on particular inputs, and code coverage information calculated for a set of inputs on a version P_i of P may not accurately reflect the coverage that would be obtained if that set of inputs were applied to subsequent versions. Thus, techniques that rely on previously computed code coverage information may depend for their success on the assumption that coverage information remains sufficiently stable as software evolves.

Despite the importance of code coverage information, and the frequency with which maintenance techniques depend on its reuse, in our search of the research literature we find little data on the effects of program evolution on coverage information. Rosenblum and Weyuker [20] conjecture that coverage information remains relatively stable across program versions in practice, and this hypothesis is supported in their study of regression test selection predictors. Studies of techniques that rely on the relative stability of coverage information [4, 23] have shown that those techniques can succeed, suggesting indirectly that sufficient stability may exist. Beyond these informal or indirect reports, however, we can find no previous research specifically addressing questions about the impact of software evolution on code coverage information.

We are therefore conducting empirical studies investigating code coverage and its stability in evolving software. This paper reports the results of two such studies: a controlled experiment and a case study. Our results indicate that even small changes during the evolution of a program can have a profound impact on coverage information, and that this impact increases rapidly as the degree of change increases. Furthermore, our results suggest that the impact of evolution on coverage information may be difficult to predict. These findings have consequences for certain techniques that use code coverage information from earlier software versions to perform tasks on subsequent versions.

In the next section of this paper we present our research questions and measures, and discuss our empirical approaches. Sections 3 and 4 present our two studies in turn, describing their design and results. Finally, Section 5 discusses the overall implications of the results of both studies and discusses future work.

2 Empirical Studies

We are interested in the following research questions:

RQ1: How does program evolution affect code coverage information?

RQ2: What impact can a code modification have on code coverage information?

RQ3: Are certain granularities of code coverage information more stable than others?

2.1 Measures

Code coverage information for a version v of a program can be recorded in a coverage matrix, C(v), that relates the program's c components to the t tests in a test suite; the value of a cell in the matrix can be 1 or 0, depending on whether a component was covered by a test or not.

For example, Table 1 depicts two versions of the function computeTax (v_0 and v_1). The changes between versions are shown in italics. Table 2 shows a test suite developed to achieve statement coverage of version v_0 of the function, listing its coverage of statements in both versions, and Table 3 presents the coverage matrices, C(v_0) and C(v_1), that result from executing that test suite on the two versions.
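To make the matrix structure concrete, the following minimal Python sketch shows one possible representation of a pair of coverage matrices. The data is hypothetical and merely stands in for the computeTax matrices of Table 3, which are not reproduced in this excerpt.

    # Hypothetical coverage matrices for two versions of a small function.
    # One row per component (here, statements s1..s3), one 0/1 entry per test.
    # Purely illustrative: this is NOT the computeTax data from Table 3.
    c_v0 = [
        [1, 0],  # s1: covered by test 1 only
        [1, 1],  # s2: covered by both tests
        [0, 1],  # s3: covered by test 2 only
    ]
    c_v1 = [
        [1, 0],  # s1: coverage unchanged
        [0, 1],  # s2: a change rerouted test 1 around this statement
        [0, 0],  # s3: no longer reached by any test
    ]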
lected four metrics. Let i and be coverage ma- j We are therefore conducting empirical studies investi- trices for versions i and of a program, respectively. gating code coverage and its stability in evolving software. Matrix density (MD) measures the distribution of a test This paper reports the results of two such studies: a con- trolled experiment and a case study. Our results indicate suite across a set of components. MD is computed by C (v ) c t counting each 1 in i , and then dividing by that even small changes during the evolution of a program and multiplying by 100. For our sample program MD can have a profound impact on coverage information, and C (v ) C (v ) 2 that this impact increases rapidly as the degree of change is 62% for 1 and 48% for . Component coverage (CC) measures the percentage of increases. Furthermore, our results suggest that the im- components executed by a test suite. CC is computed pact of evolution on coverage information may be difficult to predict. These findings have consequences for certain by counting each component that was executed by at least one test, and then dividing by c and multiplying techniques that use code coverage information from earlier C (v ) software versions to perform tasks on subsequent versions. by 100. For our sample program CC is 100% for 0 C (v ) and 86% for 1 . In the next section of this paper we present our re- Change across components (CAC) measures the per- search questions and measures, and discuss our empirical centage of change in component coverage between approaches. Sections 3 and 4 present our two studies in C (v ) C (v ) j coverage matrices i and . CAC is computed turn, describing their design and results. Finally, Section by counting the number of components that did not re- 5 discusses the overall implications of the results of both ceive identical coverage on both versions (vector com- studies, and discusses future work. parison), dividing it by c, and multiplying by 100. For the two versions of our sample program, CAC is 71%. 2 Empirical Studies Change across tests (CAT) measures the percentage of change in test suite execution between coverage ma- We are interested in the following research questions: C (v ) C (v ) j trices i and . CAT is computed by count- ing the number of inputs that did not execute the same RQ1: How does program evolution affect code coverage components on both versions, dividing it by t,and information? multiplying by 100. For the two versions of our sample RQ2: What impact can a code modification have on code program, CAT is 67%. coverage information? RQ3: Are certain granularities of code coverage informa- These metrics capture different aspects of code coverage tion more stable than others? information. The first two metrics relate to individual cov- erage matrices, and quantify the relationship of a test suite 2.1 Measures to a set of components.