Static Code Analysis for Embedded Systems Master of Science Thesis in Computer Science and Engineering
Total Page:16
File Type:pdf, Size:1020Kb
Static Code Analysis For Embedded Systems Master of Science Thesis in Computer Science and Engineering MAGNUS ÅGREN Department of Computer Science and Engineering CHALMERS UNIVERSITY OF TECHNOLOGY UNIVERSITY OF GOTHENBURG Göteborg, Sweden, August 2009 The Author grants to Chalmers University of Technology and University of Gothenburg the non-exclusive right to publish the Work electronically and in a non-commercial purpose make it accessible on the Internet. The Author warrants that he/she is the author to the Work, and warrants that the Work does not contain text, pictures or other material that violates copyright law. The Author shall, when transferring the rights of the Work to a third party (for example a publisher or a company), acknowledge the third party about this agreement. If the Author has signed a copyright agreement with a third party regarding the Work, the Author warrants hereby that he/she has obtained any necessary permission from this third party to let Chalmers University of Technology and University of Gothenburg store the Work electronically and make it accessible on the Internet. Static Code Analysis For Embedded Systems MAGNUS ÅGREN © MAGNUS ÅGREN, August 2009. Examiner: PATRIK JANSSON Department of Computer Science and Engineering Chalmers University of Technology SE-412 96 Göteborg Sweden Telephone + 46 (0)31-772 1000 Department of Computer Science and Engineering Göteborg, Sweden August 2009 Abstract Much software for embedded systems is written in languages such as C. This is known to be error prone, because of manual memory management and similar insecurities. A countermeasure against such problems is static code analysis. This thesis presents an evaluation of techniques for static code analysis, focusing on methods of fault detection. A number of different analysis tools have been tested, at Ascom Wireless Solutions, a developer of embedded system for wireless communication, on production code. The tools were able to detect real faults, but with significant manual interaction required. Acknowledgements This thesis presents the project \Static Code Analysis For Embedded Systems", for the degree of master of science in computer science and engineering, at Chalmers University of Technology. The work was carried out at Ascom Wireless Solutions, in Gothenburg, during the spring of 2009. I want to thank my examiner at Chalmers, Patrik Jansson, for taking on this project and guiding me though it. I also wish to thank my supervisors at Ascom, Jonas Estberger and Martin Karlsson, for many fruitful discussions and continuous support. Lastly, I thank the developers in the Amazon project at Ascom for putting up with me running much of the analysis software on their compilation server. i Contents 1 Introduction 1 1.1 Ascom Wireless Solutions . .1 1.2 Objectives . .1 1.3 Scope . .2 2 Analysis 3 2.1 Software measures . .4 2.2 Tools . .5 3 Pilot study 6 3.1 Tools evaluated . .6 3.2 Simplified faults . .8 4 Results 13 4.1 Lint . 13 4.2 Cppcheck . 13 4.3 Saturn . 14 4.4 Coverity Prevent . 15 4.5 Klocwork Insight . 16 5 Discussion 25 5.1 Conclusions . 25 Bibliography 28 ii Chapter 1 Introduction Writing fault-free software is a notoriously difficult task. As software has be- come ubiquitous, methods for addressing the issue of software faults are of high importance. Static code analysis is the \analysis of source code carried out with- out execution of that software"[21]. Ideally, because the code does not need to be executable, the analysis can be applied at an early stage of development. How well this works in practise is discussed in chapter 5. Early feedback allows developers to correct potential faults before code is committed into central repositories. The merit of this approach lies in the difficulty of deriving from a failure the fault that caused it. The process of tracking down the source of a failure is well-known to be costly. 1.1 Ascom Wireless Solutions Ascom Wireless Solutions is a developer of products and systems for wireless communication. The end users work within industry, medical care, correctional institutions etc. The section Mobile Software Design (MSWD) is responsible for the development of software for wireless telephones. Requirements on the software are high reliability and usability in safety critical systems. Static code analysis is proposed as a means to improve the development work and the feed- back to the developers. 1.2 Objectives The purpose of this work is to evaluate different methods of static code analysis. The objective is threefold: • The definition of a number of different methods of analysis and discussions of their strengths, limitations and theoretical underpinnings. • A prototype toolkit for static code analysis, based on these findings. Its implementation shall permit a high degree of automation and the toolkit shall be used on code developed by MSWD. • A recommendation for how MSWD shall proceed with static code analysis. 1 To achieve this, four guiding question will be answered: • What methods of static code analysis are available? • What are the inherent limitations of static code analysis? • Which measurements correlate well to software quality? • What tools are readily available? 1.3 Scope The code that is to be analysed is written in a combination of C and C++. Tool front-ends must therefore be able to handle both C and C++ constructs. The main intended use case of the toolkit is as an aid during everyday development work. The amount of manual interaction should therefore be minimised. The target of the analysis shall be detection of faults. There are numerous tools for static code analysis available. To avoid rein- venting the wheel, implementation work will focus on combining existing tools, as well as processing their output to ease use. The subject area of software testing is closely related to static code analysis. Both practises are concerned with defect detection. Testing however, usually requires executable code, and hence becomes applicable later in the development cycle. While testing in general is beyond the scope of this report, there is no clear-cut distinction between test and static analysis. General quality improvement activities, such as coding guidelines, are be- yond the scope of this work. Organisation of this Thesis Chapter 2 details the notion of software faults and discusses methods for static code analysis. Chapter 3 introduces an eval- uation of different tools, and chapter 4 presents the results of this evaluation. Finally, chapter 5 contains conclusions. 2 Chapter 2 Analysis Defects in software can result in unintended, erroneous behaviour. More for- mally, a defect or fault is defined as: A flaw in a component or system that can cause the component or system to fail to perform its required function e.g. an incorrect state- ment or data definition. A defect, if encountered during execution, may cause a failure of the component or system [21]. Consequently, a failure is defined to be: Deviation of the component or system from its expected delivery, service or result [21]. Thus, not all faults lead to failures. The terms fault and defect are used inter- changeably in this thesis. To achieve an understanding of software defects, and to create a nomencla- ture, taxonomies of faults have been created. One taxonomy, that combines results of previous efforts with new contributions, is the Common Weakness Enumeration (CWE) [3], maintained by MITRE, an American, non-profit fed- eral research corporation. The focus of CWE is on security flaws, and thus, many of the faults listed belong to the subset of vulnerabilities: faults that can be exploited through malicious intent. The classical example of a secu- rity vulnerability is the buffer overflow: through a carefully crafted string, that is written to memory past the end of an input buffer, an attacker alters the function of a program. Most relevant to this report are faults that are not vulnerabilities; since the failures most relevant to prevent are those that occur during normal program execution. Given below is a selection of faults, listed in CWE, that during normal program execution, without malicious intent, can lead to failures, and that can be identified through static code analysis. • Resource exhaustion; an unbounded amount of resources are allocated, but never released. • Memory leak; allocated memory is not freed before the last reference is removed. • Double free; one memory address is consecutively freed twice. 3 • Use after free; memory is referenced after it has been deallocated. • Freeing unallocated memory, the free function is invoked on a pointer not referencing dynamically allocated memory. • Null pointer dereference. • Array index out of bounds. • Data race; data can be altered between two operations, when it is expected to remain constant. • Dead code: code that, because of its surrounding context, cannot be exe- cuted. 2.1 Software measures One specific method of static code analysis is the collection of different software measures1. This effort aims to quantify qualitative aspects of software; such as complexity, testability, reliability, and maintainability. The obtained values have two different uses. Either to function as basis for resource estimation, by serving as an assessment of required effort. E.g., a value of testability may function as basis for determining the number of testers to employ. Alternatively, the measurement results can serve to predict defect density. Although this may be valid for entire software products, note however, that Fenton and Neil [11] did not find correspondence between complexity metrics and the number of faults of a software module. Voas argue that, although complexity itself is not an error, it increases the probability of latent, hidden errors [22]. In conclusion, note also that, in their critique of software defect prediction models [10], Fenton and Neil remark that: Despite the many efforts to predict defects, there appears to be little consensus on what the constituent elements of the problem really are.