Examining the Performance of Static Analyzers

Kevin Daimi and Shadi Banitaan Department of Mathematics, Computer Science and Software Engineering University of Detroit Mercy, 4001 McNichols Road, Detroit, MI 48221 {daimikj, banitash}@udmercy.edu

Kathy Liszka Department of Computer Science The University of Akron Akron, Ohio 44325-4003 [email protected]

ABSTRACT

Static analysis refers to the analysis of computer programs prior to executing them, to reveal potential problems that need to be fixed before the programs are run. In this paper, five static analyzers for Java programs are examined and compared using three Java programs randomly selected from a collection available on the Internet.

Keywords

Static Analyzer, Java, Evaluation, Software Engineering

I. INTRODUCTION

Static analysis is becoming a critical component of software development. Many software developers now appreciate the advantages of using static analyzers to improve software. Static analyzers work by applying techniques from program analysis, model checking, and automated deduction [3]. Static analysis tools can also be used to automate the process of identifying violations of security rules [12].

Despite the popularity of static analysis tools for discovering software flaws, experimental assessments of the correctness and merits of their output are lacking. Ayewah et al. [2] examined the types of warnings generated by FindBugs, a static analysis tool for Java programs, and the classification of those warnings into false positives, trivial bugs, and serious bugs. They offered some insight into why static analysis tools often uncover true but trivial bugs, and some details about violations throughout the software development lifecycle. They further noted that there is little published information evaluating these tools to verify their claims. It is understandable that companies may prohibit the publication of experimental data for commercial tools; publishing such data for open source tools, however, should not be a problem.

During software development, it is valuable to obtain early estimates of the defect density of software components in order to further improve software quality. Such estimates identify fault-prone areas of code requiring further testing. Nagappan et al. [13] presented an empirical methodology for the early projection of pre-release defect density based on the outcomes of static analysis tools. With the aid of two different static analysis tools, the discovered defects were used to predict the actual pre-release defect density for Windows Server 2003. They concluded that there was a strong positive correlation between the list of defects produced by static analysis and the pre-release defect density obtained through actual testing.

There are a number of approaches to static analysis; static analysis by abstract interpretation [16] is one such approach. The authors indicated that this approach offers the considerable assurance and evidence needed to support its findings, and demonstrated that static analysis must be able to scale and report few false positives without calling for expert intervention.

As mentioned above, public information on evaluating static analyzers is scarce. An interesting study by Ware et al. [17] evaluated the degree to which eight static analysis tools can isolate violations of a broad set of coding heuristics for increasing the quality and security of Java SE code. They revealed that a significant number of security violations were not detected by any tool; the resulting vulnerabilities can easily lead to various attacks. Note that three of the tools used in that study (Checkstyle, FindBugs, and PMD) are further analyzed in our study below.

In this paper, four open source static analysis tools and one commercial tool are evaluated. Three levels of evaluation are exercised: general features, performance, and capabilities. For this purpose, three random programs available online are used. To study the performance of each tool in unearthing various fault/violation categories and sub-categories, violations were temporarily inserted into these programs. The outcomes of these evaluations are summarized in various tables.

II. STATIC ANALYSIS TOOLS OVERVIEW

The static analysis tools for Java studied in this paper are briefly described below.

A. FindBugs

FindBugs is an open source static analysis tool that digs into class or JAR files looking for potential problems by matching Java bytecodes against a list of known bug patterns [9]. The current version of FindBugs (2.0.2) requires JRE (or JDK) 1.5.0 or later to operate; however, it can analyze programs compiled for any version of Java, from 1.0 to 1.8. It is capable of identifying over 250 potential types of errors. FindBugs takes real bugs in software, extracts a bug pattern from those bugs, and develops detectors that can efficiently pinpoint that pattern; in other words, it is based on the concept of bug patterns [5]. Each detector is evaluated by trying it on various test cases for that bug pattern [11]. In FindBugs, bugs are ranked from 1 to 20 and grouped into the following categories: scariest (rank 1-4), scary (rank 5-9), troubling (rank 10-14), and of concern (rank 15-20). FindBugs provides a flexible way for developers to share information and to define and install plugins. It can be integrated with Eclipse, Maven, NetBeans, Hudson, and IntelliJ.

B. PMD

PMD is an open-source, rule-based static source code analyzer that checks Java source code against a set of evaluative rules [15]. The tool is equipped with a default set of rules which can be used to reveal common development bugs. PMD also supports custom analyses by allowing users to develop their own (new) evaluative rules. It scans Java source code looking for potential problems including empty try, catch, finally, and switch statements, dead code, suboptimal code, overcomplicated expressions, and duplicate code. It can be integrated with JDeveloper, Eclipse, JEdit, JBuilder, BlueJ, CodeGuide, NetBeans/Sun Java Studio Enterprise/Creator, IntelliJ IDEA, TextPad, Maven, Ant, Gel, JCreator, and Emacs. Copeland [6] indicated that JUnit tests can be kept in good order by using PMD.

C. ESC/Java2

The Extended Static Checker for Java version 2 (ESC/Java2) is a programming tool that endeavors to discover common run-time errors in JML-annotated Java programs by static analysis of the program code and its formal annotations. It gives users the flexibility to control the extent and kinds of checking that ESC/Java2 performs by annotating Java programs with specially formatted comments called pragmas [8]. In effect, the ESC/Java2 tool tries to unearth common run-time errors in Java programs at compile time [10]. The approach used in ESC/Java2 comprises a range of techniques for statically checking the correctness of various program constraints; extended static checking usually deploys an automated theorem prover [7]. ESC/Java2 can be integrated with the Mobius Program Verification Environment, used as a command-line tool with a simple GUI front-end, or added as an Eclipse plugin.

D. Checkstyle

Checkstyle is an open source development tool that helps programmers write Java code adhering to a coding standard [4]. It automates the process of checking Java code, resulting in coding standard enforcement. Checkstyle is highly configurable and can support many coding standards; a number of sample configuration files are supplied for well-known conventions such as the Sun Code Conventions. Historically, Checkstyle's main functionality revolved around checking code layout concerns, but since its internal architecture was modified starting in version 3, more checks for other purposes have been added. Currently, Checkstyle provides checks that uncover a number of issues including class design problems, duplicate code, and bug patterns such as double-checked locking. It supports loading a configuration from a URL reference and can be integrated with Eclipse, IntelliJ IDEA, NetBeans, BlueJ, tIDE, Emacs JDE, JEdit, Maven, and QALab.

E. AppPerfect Java Code Test (AppPerfect)

AppPerfect Java Code Test is a commercial static Java code analysis tool aimed at automating Java code review and enforcing good Java coding practices [1]. It analyzes both Java and Java Server Pages (JSP) source code using a large set of Java coding rules extracted from experts in the Java programming field. These rules are grouped into a number of functional areas such as security, optimization, and portability. AppPerfect analyzes Java code and furnishes detailed information about diverse source code metrics, such as the number of code lines, the number of comment lines, the complexity of methods, and the number of methods. It provides a number of reports describing problems in the source code through its user interface; these reports can be exported into various formats such as HTML, PDF, CSV, XLS, and XML. AppPerfect Java Code Test supports integration with the most commonly used IDEs, including Eclipse, NetBeans, IntelliJ, JBuilder, and JDeveloper.
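To make the bug-pattern idea from the FindBugs description concrete, here is a small illustrative class of our own (not code from the three test programs) containing one classic pattern FindBugs detects in bytecode: comparing strings with == instead of equals(), reported by its ES_COMPARING_STRINGS_WITH_EQ detector.

```java
// Illustrative sketch of a FindBugs-style bug pattern (our own example,
// not taken from the study's test programs): reference comparison of
// strings, which FindBugs reports as ES_COMPARING_STRINGS_WITH_EQ.
class StringCompare {
    // Buggy: == compares object references, so two distinct String
    // objects with identical contents may still compare as unequal.
    static boolean sameBuggy(String a, String b) {
        return a == b;
    }

    // Fixed: equals() compares the character contents.
    static boolean sameFixed(String a, String b) {
        return a.equals(b);
    }
}
```

Running FindBugs on the compiled class flags sameBuggy: the detector matches the bytecode reference comparison of two String operands, which is exactly the "pattern extracted from real bugs" mechanism described above.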

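The "specially formatted comments called pragmas" that ESC/Java2 reads can be sketched as follows. The JML annotations below are a hypothetical illustration of the style, not taken from the tool's documentation; because the pragmas live inside comments, the class compiles with a plain Java compiler.

```java
// Hypothetical sketch of JML-style pragmas of the kind ESC/Java2 checks
// statically; javac treats the //@ lines as ordinary comments.
class Average {
    //@ requires count > 0;
    //@ ensures \result == total / count;
    static int average(int total, int count) {
        // ESC/Java2 would try to prove at "compile time" that this
        // division can never divide by zero, because every verified
        // caller must establish the precondition count > 0.
        return total / count;
    }
}
```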
III. GENERAL FEATURES COMPARISON

In this section, the five tools are compared using Eclipse. The criteria used include the total number of violations found, run time, and memory usage. To this end, three randomly selected large, high-complexity Java programs from PlanetSourceCode [14] are used: 1. A Pong Game, 2. A Basic Calculator Application, and 3. Gtroids Arcade Shooter [14]. Other programs available on PlanetSourceCode could be included if needed. The aim of this random selection was to allow an unbiased comparison and to provide a set of programs that interested readers can use to verify the outcomes of this study. The results of the features comparison for the five tools on the three programs are summarized in Tables I-III. A blank row indicates that the tool did not check the program for some reason. For the Checkstyle tool, violations refer to warnings.

By observing Tables I-III below, we conclude that AppPerfect is more optimized in terms of run time and memory usage than the other tools. This should not cause any surprise, as AppPerfect is a commercial tool. However, PMD was able to discover more violations than AppPerfect in two of the programs. Furthermore, Checkstyle was able to find many warnings, and ESC/Java2 found more violations than AppPerfect in one of the programs.

IV. TOOLS PERFORMANCE EVALUATION

Having analyzed the five tools on the three criteria of violations, run time, and memory, a deeper evaluation is carried out to reveal the actual performance of each tool with regard to various fault categories. For this purpose, various faulty code fragments were temporarily injected into the three programs. The fault categories used for this evaluation are data faults, control faults, interface faults, measurement faults, duplicate code, and code convention violations. Each of these categories is further divided into subcategories. Detailed analysis is provided in Tables IV-IX below; in these tables, "Y" indicates that the tool is able to catch such a fault. The performance of these tools and their analysis are based on the examples that were selected. As the tables reveal, only the stated violations were investigated; it is possible that the performance will differ should other violations be exercised. It is worth noting that most of these faults/violations were flagged immediately by the Eclipse IDE for Java even before the tools were applied. This implies that, for these violations/examples, the Eclipse IDE for Java behaved as well as the tools above.

By examining Tables IV-IX below, it is obvious that the commercial tool, AppPerfect, has the best capabilities. However, it failed to catch the "Variable assigned twice but never used between assignments" and "Long variable name" faults; as can be seen, PMD was able to catch them. Furthermore, both PMD and Checkstyle were able to find the "Variable/method/class/interface names have dollar signs" fault, where AppPerfect failed.

The evaluation of the tools presented in Tables IV-IX is based on the number of faults discovered for each fault category. For the data faults, PMD performed the best among the remaining four tools, followed by ESC/Java2. With regard to control faults, Checkstyle and PMD were the best, with Checkstyle doing better. With regard to catching interface faults, PMD performed better than ESC/Java2. Using "Classes with high Cyclomatic Complexity" as a criterion, only PMD among the open source tools was able to detect such a fault. None of the open source tools was able to uncover the duplicate code faults. Finally, PMD was superior with regard to code convention violations.

V. TOOL CAPABILITY ANALYSIS

The third evaluation investigates the capabilities of each tool. The following five capabilities are employed for this purpose: test support, rule configuration, violation classification, auto fixing, and metrics analysis. Test support indicates whether the tool can provide test cases to test the program. Rule configuration implies that users can add, remove, and modify rules. Violation classification deals with allocating faults to classes/types. Auto fixing refers to the automatic correction of some faults. Finally, metrics analysis concentrates on extracting simple measurements (metrics). Table X summarizes the results of this evaluation. Using this table, it is evident that FindBugs and PMD satisfied three out of five capabilities; the two open source tools only lacked two capabilities as compared to the commercial tool.

TABLE I. TOOLS COMPARISON USING PROGRAM-1

Tool         Violations   Run Time (Sec.)   Memory (MB)
FindBugs         12              4              123
PMD              71              3              139
Checkstyle      982*             3               99
ESC/Java2
AppPerfect       25              1               44

TABLE II. TOOLS COMPARISON USING PROGRAM-2

Tool         Violations   Run Time (Sec.)   Memory (MB)
FindBugs
PMD              12              4              123
Checkstyle     2049*             2              105
ESC/Java2       498              3              194
AppPerfect       93              2               60

TABLE III. TOOLS COMPARISON USING PROGRAM-3

Tool         Violations   Run Time (Sec.)   Memory (MB)
FindBugs
PMD            1593              6              226
Checkstyle    10495*             2              150
ESC/Java2        94              8              239
AppPerfect      564              5              108

TABLE IV. ANALYSIS USING DATA FAULTS

Violation                                                    PMD   FindBugs   Checkstyle   AppPerfect   ESC/Java2
Uninitialized local variable                                  N        N           N            Y            N
Variable declared but never used                              Y        N           N            Y            N
Variable assigned twice but never used between assignments    Y        N           N            N            N
Undeclared variable                                           N        N           N            Y            Y
Assigning a variable to itself                                Y        N           N            Y            N
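The rows of Table IV can be read against a fragment like the following, our own illustrative sketch of the kinds of data faults that were injected (the actual injected statements differed):

```java
// Illustrative data faults of the kinds listed in Table IV
// (our own sketch, not the exact code injected into the programs).
class DataFaults {
    static int demo(int x) {
        int unused = 7;   // variable declared but never used
        int y = 1;        // variable assigned here ...
        y = x + 1;        // ... and assigned again with no use in between
        x = x;            // assigning a variable to itself
        return y;
    }
}
```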

TABLE V. ANALYSIS USING CONTROL FAULTS

Violation                                PMD   FindBugs   Checkstyle   AppPerfect   ESC/Java2
Unreachable code                          N        N           N            Y            N
Empty try/catch/finally/switch blocks     Y        N           Y            Y            N
Empty if/while statements                 Y        N           Y            Y            N
Method calls in loop                      N        N           N            Y            N
Switch case does not cover all cases      N        N           N            Y            N
Array length in loop condition            N        N           N            Y            N
Empty for statement                       N        N           Y            Y            N
Unnecessary do-while loop                 N        N           N            Y            N
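Several of the control faults in Table V can likewise be sketched in a few lines; the fragment below is our own illustration, not the injected code itself:

```java
// Illustrative control faults of the kinds listed in Table V
// (our own sketch, not the exact code injected into the programs).
class ControlFaults {
    static int countPositives(int[] data) {
        int n = 0;
        // array length re-evaluated in the loop condition on every pass
        for (int i = 0; i < data.length; i++) {
            if (data[i] > 0) {
                n++;
            }
        }
        try {
            n += Integer.parseInt("1");
        } catch (NumberFormatException e) {
            // empty catch block: the failure is silently swallowed
        }
        switch (n % 2) {  // switch case does not cover all cases
            case 0:
                n++;
                break;
        }
        return n;
    }
}
```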

TABLE VI. ANALYSIS USING INTERFACE FAULTS

Violation                     PMD   FindBugs   Checkstyle   AppPerfect   ESC/Java2
Mismatched parameter type      N        N           N            Y            Y
Mismatched parameter number    N        N           N            Y            Y
Unused parameter               Y        N           N            Y            N
Uncalled methods               N        N           N            Y            N
Unnecessary return             Y        N           N            Y            N
Unused imports                 Y        N           N            Y            N
Unused public classes          N        N           N            Y            N
Unused public field            N        N           N            Y            N
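A few of the interface faults in Table VI are shown below in an illustrative sketch of our own (not the injected code); note that all of them compile without error, which is why a static analyzer is needed to surface them:

```java
import java.util.List;  // unused import

// Illustrative interface faults of the kinds listed in Table VI
// (our own sketch, not the exact code injected into the programs).
class InterfaceFaults {
    static int scale(int value, int ignored) {  // unused parameter
        return value * 2;
    }

    private static void helper() {  // uncalled method
        return;                     // unnecessary return in a void method
    }
}
```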

TABLE VII. ANALYSIS USING MEASUREMENT FAULTS

Violation                                 PMD   FindBugs   Checkstyle   AppPerfect   ESC/Java2
Classes with high Cyclomatic Complexity    Y        N           N            Y            N
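For the measurement fault, recall that McCabe's cyclomatic complexity is the number of decision points plus one; a class full of methods in the style below is what the "Classes with high Cyclomatic Complexity" check targets. This is a deliberately branchy sketch of our own; real tool thresholds are commonly around 10.

```java
// Deliberately branchy sketch: each if/else-if adds one decision point
// to McCabe's cyclomatic complexity (decision points + 1).
class Grades {
    static char grade(int score) {
        if (score >= 90)      return 'A';
        else if (score >= 80) return 'B';
        else if (score >= 70) return 'C';
        else if (score >= 60) return 'D';
        else                  return 'F';
    }
}
```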

TABLE VIII. ANALYSIS USING DUPLICATE CODE FAULTS

Violation                                             PMD   FindBugs   Checkstyle   AppPerfect   ESC/Java2
Copied/pasted code (could imply copied/pasted bugs)    N        N           N            Y            N
Methods have same name                                 N        N           N            Y            N
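The first row of Table VIII can be illustrated with a sketch of our own in which one method body is a copy/paste of another, so a later fix (or bug) in one silently diverges from the other:

```java
// Illustrative duplicate code fault of the kind in Table VIII
// (our own sketch): maxB's body is a copy/paste of maxA's.
class DuplicateCode {
    static int maxA(int[] v) {
        int m = v[0];
        for (int i = 1; i < v.length; i++) {
            if (v[i] > m) m = v[i];
        }
        return m;
    }

    static int maxB(int[] v) {  // copied/pasted body of maxA
        int m = v[0];
        for (int i = 1; i < v.length; i++) {
            if (v[i] > m) m = v[i];
        }
        return m;
    }
}
```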

TABLE IX. ANALYSIS USING CODE CONVENTION FAULTS

Violation                                              PMD   FindBugs   Checkstyle   AppPerfect   ESC/Java2
Method names start with capital letter                  Y        Y           Y            Y            N
Short method name                                       Y        N           N            Y            N
Long variable name                                      Y        N           N            N            N
Class name starting with lower case character           Y        N           Y            Y            N
Variable/method/class/interface names have dollar       Y        N           Y            N            N
  signs
For loops that could be while loops                     Y        N           N            Y            N
If statement without curly braces                       Y        N           Y            Y            N
Incomplete parts of for                                 N        Y           N            Y            N
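Most of the convention faults in Table IX fit in a single sketch of our own; every construct below is legal Java, which is exactly why convention checkers exist:

```java
// Illustrative code convention violations of the kinds in Table IX
// (our own sketch, not the exact code injected into the programs).
class conventions {                        // class name starts lower case
    static int Sum(int[] values) {         // method name starts with a capital
        int the$total = 0;                 // dollar sign in a variable name
        int i = 0;
        for (; i < values.length;) {       // incomplete parts of for: a loop
            the$total += values[i];        // that could be a while loop
            i++;
        }
        if (the$total < 0) the$total = 0;  // if statement without curly braces
        return the$total;
    }
}
```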

TABLE X. TOOLS CAPABILITY ANALYSIS

Tool         Test Support   Rule Configuration   Violation Classification   Auto Fixing   Metrics Analysis
PMD               Y                 Y                       Y                     N                N
FindBugs          Y                 Y                       N                     N                N
Checkstyle        Y                 Y                       Y                     Y                Y
AppPerfect        Y                 N                       N                     N                N
ESC/Java2         Y                 Y                       Y                     N                N


TABLE XI. VIOLATION COVERAGE STATISTICS

Violation Category        PMD    FindBugs   Checkstyle   AppPerfect   ESC/Java2
Data Faults               3/5      0/5         0/5          4/5          1/5
Control Faults            2/8      0/8         3/8          8/8          0/8
Interface Faults          3/8      0/8         0/8          8/8          2/8
Measurement Faults        1/1      0/1         0/1          1/1          0/1
Duplicate Code Faults     0/2      0/2         0/2          2/2          0/2
Code Convention Faults    7/8      1/8         4/8          6/8          0/8
Total Coverage           20/32    1/32        7/32        29/32         3/32

VI. VIOLATION COVERAGE STATISTICS

Statistics are important for the analysis and presentation of the collected data. Table XI presents the fault coverage statistics based on the data collected in Tables IV-IX. The denominator represents the number of subcategories for the category in question, and the numerator refers to how many fault subcategories the tool was able to successfully detect. Note that in Table XI, "32" represents the total number of violations/faults (the total number of fault subcategories). Once again, PMD proved reasonable with regard to the total number of violation sub-categories: excluding the commercial tool, PMD is clearly the best and Checkstyle is second best.

VII. CONCLUSIONS

Static analyzers can locate potential problems in software code and facilitate good practices among software designers. In an attempt to assist software developers in selecting suitable static analyzers for their projects, five static analyzers were evaluated and compared. The result of this evaluation indicated that some of the open source tools can be good enough at discovering problems to be comparable to commercial ones. Based on the Java programs used and the examples of faults introduced, it is concluded that PMD's performance proved to be the best. Different results might be produced when using more programs and introducing additional examples from further fault categories. Furthermore, it was interesting to discover that the Eclipse IDE for Java was able to unearth almost all the fault sub-categories immediately after the statements were typed in, prior to using any of the static analyzers.

Future improvements will concentrate on including more open source tools, expanding the fault categories, deploying more Java programs, and checking Java coding security.

ACKNOWLEDGEMENT

The authors would like to thank Xiaodan Lu and Xiaochen Zhang for their help.

REFERENCES

[1] AppPerfect Java Code Test, AppPerfect Corporation, Available: http://www.appperfect.com/products/java-code-test.
[2] N. Ayewah and W. Pugh, "Evaluating Static Analysis Defect Warnings on Production Software," in Proc. 7th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE '07), San Diego, California, USA, 2007, pp. 1-7.
[3] T. Ball and S. K. Rajamani, "The SLAM Project: Debugging System Software via Static Analysis," in Proc. 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '02), Portland, OR, USA, 2002, pp. 1-3.
[4] Checkstyle 5.6, Available: http://checkstyle.sourceforge.net, September 2012.
[5] B. Cole, D. Hakim, D. Hovemeyer, R. Lazarus, W. Pugh, and K. Stephens, "Improving Your Software Using Static Analysis to Find Bugs," in Proc. ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA '06), Portland, Oregon, USA, 2006, pp. 637-674.
[6] T. Copeland, PMD Applied: An Easy-to-Use Guide for Developers, Centennial Books, 2005.
[7] ESC/Java, Wikipedia, Available: http://en.wikipedia.org/wiki/ESC/Java, December 2012.
[8] ESC/Java2 Summary, Available: http://kindsoftware.com/products/opensource/ESCJava2, May 2012.
[9] FindBugs - Find Bugs in Java Programs, Available: http://findbugs.sourceforge.net, 2012.
[10] C. Flanagan, K. R. M. Leino, M. Lillibridge, G. Nelson, J. B. Saxe, and R. Stata, "Extended Static Checking for Java," in Proc. Conference on Programming Language Design and Implementation (PLDI '02), Berlin, Germany, 2002, pp. 234-245.
[11] D. Hovemeyer and W. Pugh, "Finding Bugs is Easy," ACM SIGPLAN Notices, Vol. 39, No. 12, pp. 92-106, 2004.
[12] R. Krishnan, M. Nadworny, and N. Bharill, "Static Analysis Tools for Security Checking in Code at Motorola," Ada Letters, Vol. 28, No. 1, pp. 76-82, 2008.
[13] N. Nagappan and T. Ball, "Static Analysis Tools as Early Indicators of Pre-Release Defect Density," in Proc. 27th International Conference on Software Engineering (ICSE '05), St. Louis, MO, USA, 2005, pp. 580-586.
[14] PlanetSourceCode, Available: http://www.planet-source-code.com.
[15] PMD, Available: http://pmd.sourceforge.net, May 2012.
[16] A. Venet and M. Lowry, "Static Analysis for Software Assurance: Soundness, Scalability, and Adaptiveness," in Proc. Workshop on Future of Software Engineering Research (FoSER 2010), Santa Fe, New Mexico, USA, 2010, pp. 393-396.
[17] M. Ware and C. Fox, "Securing Java Code: Heuristics and an Evaluation of Static Analysis Tools," in Proc. SAW '08, Tucson, Arizona, USA, 2008, pp. 12-21.