An Overview on the Static Code Analysis Approach in Software Development
Ivo Gomes 1, Pedro Morgado 1, Tiago Gomes 1, Rodrigo Moreira 2

1 Software Testing and Quality, Master in Informatics and Computing Engineering
2 Software Testing and Quality, Doctoral Program in Informatics Engineering
Faculdade de Engenharia da Universidade do Porto, Rua Dr. Roberto Frias 4200-465, Porto, Portugal
{ei05021, ei05051, ei05080, pro08007}@fe.up.pt

Abstract. Static analysis examines program code and reasons over all possible behaviors that might arise at run time. Tools based on static analysis can be used to find defects in programs. Recent technological advances have brought forward tools that perform deeper analyses, discover more defects and produce a limited number of false warnings. The aim of this work is to succinctly describe static code analysis, its features and potential, giving an overview of the concepts and technologies behind this approach to software development, as well as the code reviewing tools that help programmers improve their code and correct errors before the code is actually executed.

Keywords: static analysis, code review, code inspection, source code, bugs, dynamic analysis, software testing, manual review.

1 Introduction

The use of analytical methods to review source code in order to correct implementation bugs is, and has been, one of the pillars of software development. In the early days of software development there was little awareness of how necessary and effective a review might be, but in the 1970s formal reviews and inspections were recognized as important to productivity and product quality, and were thus adopted by development projects [1]. This new approach acknowledged that removing defects in the early stages of the development process produces more reliable and efficient programs.
Fagan's definition of error detection efficiency is as follows [2]:

\[ \text{Error Detection Efficiency} = \frac{\text{Errors found by an inspection}}{\text{Total errors in the product before inspection}} \times 100 \qquad (1) \]

For illustration, with hypothetical numbers, an inspection that finds 30 of the 40 errors present in a product before inspection has an efficiency of 75%.

So, as far as source code is concerned, it is in the best interest of the programmer to take advantage of static analysis. This does not imply that other forms of software analysis should be discouraged; on the contrary, the best way to ensure that an implementation has the fewest errors or defects is to combine static and dynamic analysis. The static approach reviews the source code, checking compliance with specific rules, the usage of arguments and so forth; the dynamic approach executes the code, running the program and checking the results for inconsistencies. Testing and reviewing code are therefore separate and distinguishable activities, but neither should be performed without the other, and it remains arguable which should be done first, testing or reviewing [3]. This work focuses on describing the static methods of analysis, with special attention to the tools available on the market that provide this kind of service.

This paper is organized as follows: Section 1, the current section, introduces the static analysis approach; Section 2 gives a relatively brief overview of static analysis, followed by a description of the most common methods of code review performed by humans: self review, walkthrough, peer review, inspection and audit. In order to identify the fundamental qualities of static code analysis and, more importantly, to distinguish them from dynamic testing approaches, Section 3 describes the advantages and disadvantages of static analysis.
A comprehensive comparison between code review and testing explains why relying on just one of them is discouraged; Section 4 lists the most popular software tools capable of performing this type of code analysis, followed by a comparison of some aspects of these tools; a further evaluation of these tools is given in Section 5; Section 6 presents some possible enhancements to such tools; and finally Section 7 discusses static code analysis tools in software development.

2 Overview of the Static Analysis approach

Static code analysis is the analysis of computer software performed without actually executing the programs built from that software, as opposed to dynamic analysis (testing software by executing programs). In most cases the analysis is performed on some version of the source code; in the remaining cases it is performed on some form of the object code. The term is usually applied to analysis performed by an automated software tool, with human analysis being called program understanding, program comprehension or code inspection. It can be argued that software metrics and reverse engineering are forms of static analysis, but that discussion is not the aim of this work.

Programmers make small mistakes all the time: a missing semicolon here, an extra parenthesis there, and so on. Most of the time these slips are inconsequential: the compiler reports the error, the programmer fixes the code, and development continues. However, this quick cycle of feedback and response normally does not apply to security vulnerabilities, which can lie dormant for an indefinite amount of time before being discovered. As noted earlier, the longer a defect lies dormant in the software, the more expensive it can be to fix [4]. The promise of static analysis is to identify many common coding problems automatically before a program is released.
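As a minimal sketch of what such automated checking can look like, the following Python fragment uses the standard `ast` module to parse a piece of source text and flag two common coding problems without ever running it. The sample input, the function name `find_issues` and the two rules are illustrative assumptions and are not taken from any of the tools discussed in this paper.

```python
import ast

# Hypothetical sample input: it is only parsed, never executed.
SAMPLE = '''\
def append_item(item, bucket=[]):
    bucket.append(item)
    return bucket

def read_config(path):
    try:
        return open(path).read()
    except:
        return None
'''


def find_issues(source):
    """Return sorted (line, message) pairs for two illustrative rules."""
    issues = []
    tree = ast.parse(source)  # builds a syntax tree; no code is run
    for node in ast.walk(tree):
        # Rule 1: a mutable literal used as a default argument value.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for default in node.args.defaults + node.args.kw_defaults:
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    issues.append((default.lineno, "mutable default argument"))
        # Rule 2: an except clause that names no exception type.
        elif isinstance(node, ast.ExceptHandler) and node.type is None:
            issues.append((node.lineno, "bare except clause"))
    return sorted(issues)


if __name__ == "__main__":
    for line, message in find_issues(SAMPLE):
        print(f"line {line}: {message}")
```

Both findings are reported from the syntax tree alone, which is the distinction from dynamic analysis: the functions in the sample are never called, so the defects are found before any execution takes place.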
Static analysis aims to examine the text of a program statically, without attempting to execute it. Theoretically, static analysis tools can examine either a program's source code or a compiled form of the program to equal benefit, although decoding the latter can be difficult [4].

2.1 Manual Review

Manual reviewing, or auditing, is a form of static analysis. It is very time-consuming, and to perform it effectively human code auditors must first know what types of errors they are supposed to find before they can rigorously examine the code. An application's code can be reviewed in any phase of software development, but the best results are obtained when this is done at an early stage, because the cost and risk of detecting and correcting security vulnerabilities and quality defects late in the development process can be high. When those bugs escape into the market and are discovered by customers, the fallout can affect the bottom line and damage reputations [5].

Reviewing covers not only the code but also all the documentation, requirements and designs the developer produces. Everything can be reviewed, because errors can be hidden in every step of software development. Static code analysis performed by humans can be divided into two major categories, self reviews and third-party reviews, which are closely related to the Personal Software Process and the Team Software Process [6].

In Fig. 1, the initial phase shows the actual implementation of the code, which is obviously not a form of static analysis. Next comes the self review of the written code, in which the programmer tries to evaluate and correct by himself the code he implemented. The walkthrough is a presentation of the code in question to an audience by its programmer. The peer review is when the programmer presents his code to a colleague for review.
Finally come the inspection and the audit, which are usually performed by third-party evaluators, the audit being the most formal review [5].

Fig. 1. Flow of review types in order of increasing formality.

The best way to detect and correct bugs at an early stage of development is for the programmer himself to review his code and try to find and correct its problems; this is commonly known as self review. Every programmer should feel a sense of personal responsibility for his implementations, and it is therefore always a good idea to keep track of the mistakes he makes most often. Over time this makes it easier to avoid repeating them. There are some guidelines on how to perform a proper self review: produce reviewable items (code, design, specifications, etc.); avoid reviewing code on screen, to circumvent the tendency to correct bugs as they are found; do not review the code right after it is written; follow a structured review process; create personal checklists of the most common mistakes; and take enough time to review the code, so as to be certain that everything is as it should be (usually half the time required to write the code is more than enough to review it properly) [7].

The team review process can be somewhat more complex, and there are several different steps in reviewing software as a group. An interesting method is the walkthrough, in which the developer explains his code and ideas to an audience and is subject to their criticism. In addition, there are formal requirements for performing static reviews of code. This kind of group review can be carried out with a before-after technique: a review plan is needed prior to the review (assembled by the lead reviewer), and a review report containing all the results is produced afterwards.
The components of a formal review plan are: the review goals, the collection of items being reviewed, a set of preconditions for the review, roles, team size, participants, training requirements, review steps and procedures, checklists and other related documents to be distributed to participants, the time requirements, the nature of the review log and summary report, and rework and follow-up criteria and procedures.