Automatic Bug-Finding Techniques for Large Software Projects
Masaryk University
Faculty of Informatics

Automatic Bug-finding Techniques for Large Software Projects

Ph.D. Thesis

Jiri Slaby
Brno, 2013
Revision 159

Supervisor: prof. RNDr. Antonín Kučera, PhD.
Co-supervisor: assoc. prof. RNDr. Jan Strejček, PhD.

Copyright © Jiri Slaby, 2013. All rights reserved.

Abstract

This thesis studies the application of contemporary automatic bug-finding techniques to large software projects. Since such projects tend to be complex systems, there is a need for efficient analyzers that can process them without generating too many false positives. To meet these requirements, this thesis focuses on static analysis, with primary attention to techniques like abstract interpretation and symbolic execution. Since their introduction in the 1970s, their applicability has kept increasing thanks to fast modern processors and further refinements of the techniques. The thesis summarizes the current state of the art of the chosen techniques as implemented in current static analyzers. Further, we discuss several large code bases that can serve as a test bed for the techniques; in the end, we choose the Linux kernel. On top of the explored state of the art, we develop our own techniques. We present the framework Stanse, which builds on abstract interpretation and handles the Linux kernel without difficulty. The framework also prunes false positives produced by some checkers. The pruning method is based on pattern matching, but it turns out to be weak. We therefore subsequently try to improve the situation with symbolic execution. For that purpose we use an executor called Klee in our new tool Symbiotic. But since symbolic execution requires a long computation time, we first slice the code with a static slicer. During our experiments, we accumulated knowledge about the categories of bugs usually present in code such as the Linux kernel.
We investigated this further and developed a database of known bugs in a fixed version of the kernel. It is publicly available and contains information about errors and false positives found by several bug-finding tools. This can help developers of new static analyzers: they can compare their brand new tool against the database and see how it stands with respect to false positives, missed errors and, of course, true error hits. We also participated with Symbiotic in the Software Verification Competition. The tool had to be adapted to the competition's needs. Since we participated with only a preliminary version of the tool, the results were not encouraging. Nonetheless, the tool was improved later and now gives good results – we collect nearly all points in some competition categories. We plan to compete in the competition with our tuned tool again.

Acknowledgement

I would like to thank my supervisors, prof. Kučera and assoc. prof. Strejček, for their patience with my ignorance during the nice past years spent at the university. I really enjoyed that time and was taught many new things which would otherwise have stayed unrevealed to me. I thank my colleagues for their ideas and inspiration, especially Marek, with whom I started the studies and walked most of the path. Of course, my parents and my whole family deserve many thanks for their eternal support. Without them and their care, I would be only a poor man on this Earth. And last, but definitely not least, Zuzanka must not be missing from this list. Her love, patience, and care are much appreciated. I am really glad we met each other.

Contents

1 Introduction
  1.1 Motivation
    1.1.1 Requirements for Analyzers
    1.1.2 Picking the Project
  1.2 Background
    1.2.1 Classes of Reports
    1.2.2 Theoretical Properties
    1.2.3 Sensitivity of Analyzers
    1.2.4 Types of Analyses
    1.2.5 Other Views
    1.2.6 Code Parsing
    1.2.7 Pointer Analysis
  1.3 Thesis Goals
  1.4 Thesis Contribution
  1.5 Thesis Organization
2 State of the Art
  2.1 Pattern Matching
  2.2 Data-flow Analysis
  2.3 Abstract Interpretation
  2.4 Symbolic Execution
    2.4.1 Concolic Execution
  2.5 Program Slicing
  2.6 Model Checking
  2.7 Industrial Tools
  2.8 Summary of the Linux Kernel Checkers
3 Stanse
  3.1 Framework Functionality
  3.2 Checkers
    3.2.1 AutomatonChecker
    3.2.2 LockChecker
    3.2.3 ThreadChecker
    3.2.4 ReachabilityChecker
  3.3 Results on Linux Kernel
    3.3.1 Important Errors Found by Stanse
  3.4 Discussion
  3.5 Conclusions
4 Symbiotic
  4.1 Running Example
  4.2 Our Contribution
  4.3 Instrumentation
  4.4 Slicing
  4.5 Symbolic Execution
  4.6 Implementation
    4.6.1 Prepare
    4.6.2 Slicer
    4.6.3 Kleerer
    4.6.4 Symbolic Execution
    4.6.5 Stitching Everything Together
  4.7 Experimental Results
  4.8 Related Work
  4.9 Future Work
  4.10 Conclusion
5 Building Bug-database
  5.1 Database Structure
  5.2 Database Frontend
  5.3 Current Contents of the Database
    5.3.1 Sources of Bug-Reports
    5.3.2 Linux Kernel Error Types
    5.3.3 Bug-Reports in the Database
  5.4 Use of the Database
  5.5 Conclusion and Future Plans
6 Participating SV-COMP
  6.1 The Competition
  6.2 Our Participation
  6.3 Results
  6.4 Conclusion
7 Conclusion
Bibliography
A Author's Contribution
  A.1 List of Publications
  A.2 List of Software

Chapter 1

Introduction

In the contemporary world, we are surrounded by devices making our lives easier. They range from modern lifts that speak to us in natural language to supercomputers helping to treat cancer, such as IBM Watson [But12].
Since Moore's law [Moo65] has held for the past four decades, most present devices feature quite fast processing units thanks to the increasing number of built-in transistors. This allows developers to implement many interesting but complex algorithms. However, since the implementations are mostly written by human beings, they may behave erratically, as they are only as strong as the weakest link in an imaginary chain. One example is the error that led to the crash of the Ariane 5 rocket, which was supposed to launch a satellite into the Earth's orbit [Dow97]. $370M were buried in the middle of nowhere because a programmer omitted a check for an overflow in the acceleration computation.

Such errors obviously cause companies revenue losses. It is known that the later in the development cycle an error is revealed, the more expensive it is to fix [BB01]. The situation becomes even worse when the software is already deployed at the customer's site, or in a rocket flying 13 000 ft above sea level.

Companies have always desired cheap solutions that perfectly fit their business plans – and, at best, solutions based on algorithms that cause no harm when deployed, ideally completely error-free. But they are out of luck: in general, there cannot be a tool that tells us whether an arbitrary algorithm is error-free. If there were one, it would contradict the undecidability of the halting problem [Tur36]. However, while this is impossible in theory, it does not necessarily mean we should not at least try hard for non-general cases when dealing with algorithms implemented in software. There, the implementations have at least a finite state space thanks to finite memory. Even though the limited state space is still huge, companies have always used techniques that help avoid and remove errors from their programs.
Even some kind of standardization of coding principles, such as a mandated coding style, may reduce the number of introduced errors [BB01]. The reason may be that project management imposes a duty on developers to write simpler code. For example, the management can ban complex constructs that are known to introduce errors. Or, for some environments like embedded systems, heap allocations can be disallowed. This weeds out one whole category of programming errors: memory leaks. Some companies have also adopted standards like ISO 9001:2000 [ISO00] or MISRA C [Bur13]. Their customers are assured that the internal management of such a company conforms to standardized behaviour that tries to increase the quality of developed products.

Standardization cannot, by definition, prevent programming errors completely. Since the early times of computer technology, programmers in companies – or testers, if they were distinct persons – have been forced to test programs. Testers run the programs with diverse input, random input included, to make the program misbehave or crash in order to catch and fix program errors. This approach is called dynamic program testing [MSTW10]. It is used very often, as it is a relatively cheap and mostly very effective technique. Because the code is executed in real time with real data, whenever dynamic testing reveals an error, the error is guaranteed to be genuine (if we look aside from malfunctions of broken hardware). But it indeed takes a huge amount of time to perform systematic testing of large projects. Moreover, dynamic testing obviously needs a runnable executable, which is usually not available in the early phases of development.