Undefined Behaviour in the C Language
Faculty of Informatics, Masaryk University
Bachelor's Thesis
Tobiáš Kamenický
Brno, May 2015

Declaration

I hereby declare that this thesis is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during the elaboration of this work are properly cited and listed with complete reference to the source.

Thesis supervisor: RNDr. Adam Rambousek

Acknowledgements

I am very grateful to my supervisor Miroslav Franc for his guidance, invaluable help and feedback throughout the work on this thesis.

Summary

This bachelor’s thesis deals with the concept of undefined behavior and its aspects. It explains selected undefined behaviors taken from the C standard and describes each in detail from the point of view of a programmer and of a tester. It summarizes the options for preventing and testing for these undefined behaviors. To that end, several compilers and tools are introduced and described. The thesis contains a set of example programs to make the discussed undefined behaviors easier to understand.

Keywords

undefined behavior, C, testing, detection, secure coding, analysis tools, standard, programming language

Table of Contents

Declaration
Acknowledgements
Summary
Keywords
1 Introduction
2 Compilers and Analysis Tools
  2.1 GCC
  2.2 Clang
  2.3 Splint
  2.4 Cppcheck
  2.5 Valgrind
3 Undefined Behaviors
  3.1 Demotion of one real floating type to another produces a value outside the range that can be represented (6.3.1.5).
    3.1.1 Programmer View
    3.1.2 Tester View
  3.2 An lvalue designating an object of automatic storage duration that could have been declared with the register storage class is used in a context that requires the value of the designated object, but the object is uninitialized. (6.3.2.1).
    3.2.1 Programmer View
    3.2.2 Tester View
  3.3 The program attempts to modify a string literal (6.4.5).
    3.3.1 Programmer View
    3.3.2 Tester View
  3.4 Two pointer types that are required to be compatible are not identically qualified, or are not pointers to compatible types (6.7.6.1).
    3.4.1 Programmer View
    3.4.2 Tester View
  3.5 The } that terminates a function is reached, and the value of the function call is used by the caller (6.9.1).
    3.5.1 Programmer View
    3.5.2 Tester View
  3.6 The signal function is used in a multi-threaded program (7.14.1.1).
    3.6.1 Programmer View
    3.6.2 Tester View
  3.7 The string pointed to by the mode argument in a call to the fopen function does not exactly match one of the specified character sequences (7.21.5.3).
    3.7.1 Programmer View
    3.7.2 Tester View
  3.8 There are insufficient arguments for the format in a call to one of the formatted input/output functions, or an argument does not have an appropriate type (7.21.6.1, 7.21.6.2, 7.29.2.1, 7.29.2.2).
    3.8.1 Programmer View
    3.8.2 Tester View
  3.9 The pointer argument to the free or realloc function does not match a pointer earlier returned by a memory management function, or the space has been deallocated by a call to free or realloc (7.22.3.3, 7.22.3.5).
    3.9.1 Programmer View
    3.9.2 Tester View
  3.10 The value of the result of an integer arithmetic or conversion function cannot be represented (7.8.2.1, 7.8.2.2, 7.8.2.3, 7.8.2.4, 7.22.6.1, 7.22.6.2, 7.22.1).
    3.10.1 Programmer View
    3.10.2 Tester View
4 Conclusion and Future Work
Reference List
Attachments
1 Introduction

The C programming language does not have a very strict specification and thus leaves a lot of leeway to the implementation. Rather than imposing strict and complex rules, C is designed for performance, simplicity, and portability. C source code can be compiled on computers with different processors or operating systems with only slight changes. It is the programmer’s responsibility to write portable code and to avoid constructs that depend on a particular architecture (e.g. assumptions about the sizes of data types). The programmer also has full control over memory management and therefore must allocate and free memory explicitly, since there is no built-in garbage collector to take care of it.

All this low-level programming has consequences. It is much easier, even for a very experienced programmer, to make a mistake. With certain mistakes, a program might fail to compile, produce wrong results, crash, or even behave exactly as the programmer expects. In fact, anything could happen, depending on circumstances such as the contents of registers. Such unpredictable behavior is referred to as “undefined behavior” [1, 2].

The C standard contains an annex [3, Sec. J.2] listing cases of undefined behavior, but it is more a summary than anything else. Further undefined behaviors not included in the annex are mentioned in the body of the standard. It is generally assumed that situations leading to undefined behavior will not occur, and it is up to the programmer to make sure that they do not. Some compilers and analysis tools are able to recognize a few undefined behaviors in the code. This does not mean, however, that the programmer need not be familiar with them or need not try to reduce the risk of their occurrence.

2 Compilers and Analysis Tools

In this thesis, the implementation of the C standard was tested