Comparison of Unit-Level Automated Test Generation Tools

Shuang Wang and Jeff Offutt
Software Engineering, George Mason University, Fairfax, VA 22030, USA
{swangb,offutt}@gmu.edu

Abstract

Data from projects worldwide show that many software projects fail and most are completed late or over budget. Unit testing is a simple but effective technique to improve software in terms of quality, flexibility, and time-to-market. A key idea of unit testing is that each piece of code needs its own tests, and the best person to design those tests is the developer who wrote the software. However, generating tests for each unit by hand is very expensive, possibly prohibitively so. Automatic test data generation is essential to support unit testing, and as unit testing receives more attention, developers are using automated unit test data generation tools more often. However, developers have very little information about which tools are effective. This experiment compared three well-known, publicly accessible unit test data generation tools: JCrasher, TestGen4J, and JUB. We applied them to Java classes and evaluated them based on their mutation scores. As a comparison, we created two additional sets of tests for each class. One test set contained random values and the other contained values to satisfy edge coverage. Results showed that the automatic test data generation tools generated tests with almost the same mutation scores as the random tests.

1 Introduction

An important goal of unit testing is to verify that each unit of software works properly. Unit testing allows many problems to be found early in software development. A comprehensive unit test suite that runs together with daily builds is essential to a successful software project [20]. As the computing field uses more agile processes, relies more on test-driven development, and has higher reliability requirements for software, unit testing will continue to increase in importance.

However, some developers still do not do much unit testing. One possible reason is that they do not think the time and effort will be worthwhile [2]. Another possible reason is that unit tests must be maintained, and maintenance for unit tests is often not budgeted. A third possibility is that developers may not know how to design and implement high quality unit tests; they are certainly not taught this crucial knowledge in most undergraduate computer science programs.

Using automated unit test tools instead of manual testing can help with all three problems. Automated unit test tools can reduce the time and effort needed to design and implement unit tests, they can make it easier to maintain tests as the program changes, and they can encapsulate knowledge of how to design and implement high quality tests so that developers do not need to know as much. But an important question developers must answer is "which tool should I use?"

This empirical study looks at the most technically challenging part of unit testing: test data generation. We selected tools to empirically evaluate based on the following three factors: (1) the tool must automatically generate test values with little or no input from the tester, (2) the tool must test Java classes, and (3) the tool must be free and readily available (for example, through the web).

We selected three well-known, publicly accessible automated tools (Section 2.1). The first is JCrasher [3], a random testing tool that causes the class under test to "crash." The second is TestGen4J [11], whose primary focus is to exercise boundary value testing of arguments passed to methods. The third is JUB (JUnit test case Builder) [19], which is a framework based on the Builder pattern [7]. We use these tools to automatically generate tests for a collection of Java classes (Section 2.3).

As a control, our second step was to manually generate two additional sets of tests. A set of purely random tests was generated for each class as a "minimal effort" comparison, and tests to satisfy edge coverage on the control flow graph were generated.

Third, we seeded faults using the mutation analysis tool muJava [9, 10] (Section 2.4). MuJava is an automated class mutation system that automatically generates mutants for Java classes and evaluates test sets by calculating the number of mutants killed.

Finally, we applied the tests to muJava (Section 2.4) and compared their mutation scores (the percentage of mutants killed). Results are given in Section 3 and discussed in Section 4. Related work is presented in Section 5, and Section 6 concludes the paper.

2 Experimental Design

This section describes the design of the experiment. First, each unit test data generation tool used is described, then the process used to manually generate additional tests is presented. Next, the Java classes used in the study are discussed and the muJava mutation testing tool is presented. The process used in conducting the experiment is then presented, followed by possible threats to validity.

2.1 Subjects--Unit Testing Tools

Table 1 summarizes the three tools this study examined. Each tool is described in detail below.

Table 1. Automated Unit Testing Tools
Name       Version             Inputs       Interface
JCrasher   0.1.9 (2004)        Source File  Eclipse Plug-in
TestGen4J  0.1.4-alpha (2005)  Jar File     Command Line (Linux)
JUB        0.1.2 (2002)        Source File  Eclipse Plug-in

JCrasher [3] is an automatic robustness testing tool for Java classes. JCrasher examines the type information of methods in Java classes and constructs code fragments that will create instances of different types to test the behavior of the public methods with random data. JCrasher explicitly attempts to detect bugs by causing the class under test to crash, that is, to throw an undeclared runtime exception. Although limited by the randomness of the input values, this approach has the advantage of being completely automatic. No inputs are required from the developer.

TestGen4J [11] automatically generates JUnit test cases from Java class files. Its primary focus is to perform boundary value testing of the arguments passed to methods. It uses rules, written in a user-configurable XML file, that define boundary conditions for the data types. The test code is separated from the test data with the help of JTestCase (http://jtestcase.sourceforge.net/).

JUB (JUnit test case Builder) [19] is a JUnit test case generator framework accompanied by a number of IDE-specific extensions. These extensions (tools, plug-ins, etc.) are invoked from within the IDE and must store generated test case code inside the source code repository administered by the IDE.

2.2 Additional Test Sets

As a control comparison, we generated two additional sets of tests for each class by hand with some limited tool support. Testing with random values is widely considered the "weakest effort" testing strategy, and it seems natural to expect a unit test data generator tool to at least do better than random value generation. We wrote a special-purpose tool that generated random tests in two steps. For each test, the tool arbitrarily selected a method from the class to test. (The number of methods in each class is given in Table 2.) Then the tool randomly generated values for each parameter of that method. The tool did not parse the classes; the methods and parameters were hard-coded into tables in the tool. We decided to create the same number of random tests for each subject class as the tool from Section 2.1 that created the most tests for that class. For all the subject classes, JCrasher generated the most tests, so the study had the same number of random tests as JCrasher had.

We elected to use a test criterion as a second control. Formal test criteria are widely promoted by researchers and educators, but are only spottily used in industry [8]. We chose one of the weakest and most basic test criteria: edge coverage on the control flow graphs. We created control flow graphs by hand for each method in each class, then designed inputs to cover each edge in the graphs.

2.3 Java Classes Tested

Table 2 lists the Java classes used in this experiment. BoundedStack is a small, fixed-size implementation of a stack from the Eclat tool's website [14]. Inventory is taken from the MuClipse [18] project, the Eclipse plug-in version of muJava. Node is a mutable set of Strings that is a small part of a publish/subscribe system. It was a sample solution from a graduate class at George Mason University. Queue is a mutable, bounded FIFO data structure of fixed size. Recipe is also taken from the MuClipse project; it is a JavaBean class that represents a real-world Recipe object. Twelve is another sample solution. It tries to combine three given integers with arithmetic operators to compute exactly twelve. VendingMachine is from Ammann and Offutt's book [1]. It models a simple vending machine for chocolate candy.

Table 2. Subject Classes Used
Name             LOC  Methods
BoundedStack      85       11
Inventory         67       11
Node              77        9
Queue             59        6
Recipe            74       15
TrashAndTakeOut   26        2
Twelve            94        1
VendingMachine    52        6
Total            534       61

2.4 MuJava

Our primary measurement of the test sets in this experiment is their ability to find faults. MuJava is used to seed faults (mutants) into the classes and to evaluate how many mutants each test set kills. MuJava [9, 10] is a mutation system for Java classes.

Table 3. Classes and Mutants
                         Mutants
Classes        Traditional  Class  Total
BoundedStack           224      4    228
Inventory              101     50    151
Node                    18      4     22
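A test set's mutation score, as used throughout this study, is simply the percentage of generated mutants that the test set kills. As a minimal sketch (the class and method names below are illustrative, not part of muJava's actual API), the score for the mutant counts in Table 3 can be computed as:

```java
// Minimal sketch of the mutation-score computation used to compare
// test sets. The class and method names are illustrative only; muJava
// reports this value itself.
public class MutationScore {
    // score = (killed mutants / total mutants) * 100
    static double score(int killed, int total) {
        return 100.0 * killed / total;
    }

    public static void main(String[] args) {
        // Hypothetical example: a test set that kills 171 of
        // BoundedStack's 228 mutants (Table 3) scores 75.0%
        System.out.printf("%.1f%%%n", score(171, 228)); // prints "75.0%"
    }
}
```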
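JCrasher's strategy (Section 2.1) is to drive public methods with random inputs and treat any undeclared runtime exception as a potential bug. The sketch below illustrates that idea with reflection; it is a simplified assumption of how such a probe works, not JCrasher's actual implementation, and the probed method is chosen only for demonstration.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Random;

// Sketch of JCrasher-style robustness testing: invoke a public method
// with random arguments and report whether it threw an undeclared
// runtime exception. Simplified illustration, not JCrasher itself.
public class CrashProbe {
    static final Random RAND = new Random();

    // Random value for a small set of parameter types
    static Object randomArg(Class<?> type) {
        if (type == int.class) return RAND.nextInt(100);
        if (type == String.class) return "s" + RAND.nextInt(100);
        return null;  // unsupported types get null, a common crash trigger
    }

    // Returns the RuntimeException the call threw, or null if it survived
    static RuntimeException probe(Object target, Method m) {
        Object[] args = new Object[m.getParameterCount()];
        for (int i = 0; i < args.length; i++)
            args[i] = randomArg(m.getParameterTypes()[i]);
        try {
            m.invoke(target, args);
            return null;
        } catch (InvocationTargetException e) {
            return e.getCause() instanceof RuntimeException
                    ? (RuntimeException) e.getCause()  // a "crash"
                    : null;
        } catch (IllegalAccessException e) {
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        // Integer.parseInt always rejects the generated "s<digits>"
        // string, so this probe reliably provokes a NumberFormatException
        Method m = Integer.class.getMethod("parseInt", String.class);
        System.out.println(probe(null, m) != null
                ? "crash found" : "no crash");  // prints "crash found"
    }
}
```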
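The two-step random generator of Section 2.2 (arbitrarily pick a method, then generate random values for its parameters) can be sketched as follows. As in the paper's tool, the method table is hard-coded rather than obtained by parsing the class; the Queue method signatures shown are hypothetical stand-ins, since the study does not publish its tool's code.

```java
import java.util.Random;

// Sketch of the special-purpose random test generator from Section 2.2:
// (1) arbitrarily select a method from a hard-coded table, then
// (2) generate a random value for each of its parameters.
// The method table below is a hypothetical example for a Queue class.
public class RandomTestGen {
    // Each row: method name followed by its parameter types
    static final String[][] METHODS = {
        {"enqueue", "int"},
        {"dequeue"},
        {"isEmpty"}
    };
    static final Random RAND = new Random();

    static String generateTestCall() {
        String[] m = METHODS[RAND.nextInt(METHODS.length)];  // step 1
        StringBuilder call = new StringBuilder(m[0]).append('(');
        for (int i = 1; i < m.length; i++) {                 // step 2
            if (i > 1) call.append(", ");
            if (m[i].equals("int")) call.append(RAND.nextInt());
        }
        return call.append(')').toString();
    }

    public static void main(String[] args) {
        // Emit a few random test-case bodies
        for (int i = 0; i < 3; i++)
            System.out.println("queue." + generateTestCall() + ";");
    }
}
```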
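The second control of Section 2.2, edge coverage, requires each edge of a method's control flow graph to be taken by at least one test. The toy method below (an illustrative example, not one of the subject classes in Table 2) shows the idea: a single if statement yields two outgoing edges from the decision node, so two inputs, one per branch, cover every edge.

```java
// Illustrative edge-coverage example. max() has one decision, so its
// control flow graph has a true edge and a false edge out of that
// decision; one test per branch covers all edges of the graph.
public class EdgeCoverageExample {
    static int max(int a, int b) {
        if (a >= b)
            return a;   // true edge, covered by max(5, 3)
        return b;       // false edge, covered by max(2, 7)
    }

    public static void main(String[] args) {
        System.out.println(max(5, 3)); // prints 5 (true edge)
        System.out.println(max(2, 7)); // prints 7 (false edge)
    }
}
```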