Random Testing of Open Source C Compilers

Total Page:16

File Type:pdf, Size:1020Kb

Random Testing of Open Source C Compilers View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by The University of Utah: J. Willard Marriott Digital Library RANDOM TESTING OF OPEN SOURCE C COMPILERS by Xuejun Yang A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science School of Computing The University of Utah December 2014 Copyright c Xuejun Yang 2014 All Rights Reserved The University of Utah Graduate School STATEMENT OF DISSERTATION APPROVAL The dissertation of Xuejun Yang has been approved by the following supervisory committee members: John Regehr , Chair 06/06/2014 Date Approved Ganesh Gopalakrishnan , Member 06/06/2014 Date Approved Matthew Flatt , Member 04/03/2014 Date Approved Eric Eide , Member 06/05/2014 Date Approved Chris Lattner , Member 05/12/2014 Date Approved and by Ross Whitaker , Chair/Dean of the Department/College/School of Computing and by David B. Kieda, Dean of The Graduate School. ABSTRACT Compilers are indispensable tools to developers. We expect them to be correct. However, compiler correctness is very hard to be reasoned about. This can be partly explained by the daunting complexity of compilers. In this dissertation, I will explain how we constructed a random program generator, Csmith, and used it to find hundreds of bugs in strong open source compilers such as the GNU Compiler Collection (GCC) and the LLVM Compiler Infrastructure (LLVM). The success of Csmith depends on its ability of being expressive and unambiguous at the same time. Csmith is composed of a code generator and a GTAV (Generation-Time Analysis and Validation) engine. They work interactively to produce expressive yet unambiguous random programs. The expressiveness of Csmith is attributed to the code generator, while the unambiguity is assured by GTAV. GTAV performs program analyses, such as points-to analysis and effect analysis, efficiently to avoid ambiguities caused by undefined behaviors or unspecified behaviors. During our 4.25 years of testing, Csmith has found over 450 bugs in the GNU Compiler Collection (GCC) and the LLVM Compiler Infrastructure (LLVM). We analyzed the bugs by putting them into different categories, studying the root causes, finding their locations in compilers' source code, and evaluating their importance. We believe analysis results are useful to future random testers, as well as compiler writers/users. To my wife, for the countless weekends with a missing husband. CONTENTS ABSTRACT :::::::::::::::::::::::::::::::::::::::::::::::::::::::: iii LIST OF FIGURES ::::::::::::::::::::::::::::::::::::::::::::::::: ix LIST OF TABLES ::::::::::::::::::::::::::::::::::::::::::::::::::: x ACKNOWLEDGMENTS :::::::::::::::::::::::::::::::::::::::::::: xi CHAPTERS 1. INTRODUCTION ::::::::::::::::::::::::::::::::::::::::::::::: 1 1.1 A monkey . .1 1.2 A compiler . .1 1.3 Putting a monkey and a compiler together - random testing of a compiler . .2 1.4 Adding more compilers to the equation - random differential testing . .4 1.5 Challenges . .5 1.5.1 A random program generator that is both expressive and unambiguous5 1.5.2 Find and understand bugs in open source C compilers . .6 1.6 Evaluation . .7 1.7 Contributions . .8 1.8 Thesis organization . .9 2. BACKGROUND :::::::::::::::::::::::::::::::::::::::::::::::: 10 2.1 Software testing . 10 2.2 Compilers . 12 2.2.1 History of compilers . 12 2.2.2 Structure of a compiler . 13 2.3 GCC ........................................................ 14 2.3.1 History . 14 2.3.2 System overview . 14 2.3.3 GCC testing . 15 2.4 LLVM and Clang . 16 2.4.1 History . 16 2.4.2 System overview . 17 2.4.3 LLVM testing . 17 2.5 Compiler correctness and testing . 18 2.6 Random testing of compilers . 19 3. CODE GENERATOR :::::::::::::::::::::::::::::::::::::::::::: 21 3.1 The objectives . 21 3.2 Design choices . 22 3.2.1 Maximizing expressiveness while maintaining unambiguity . 22 3.2.2 Finding a test oracle . 23 3.2.3 Validating intermediate states vs. final states . 25 3.2.4 Grammar-based generation vs. AST-based generation . 25 3.2.5 Avoiding false positives vs. postgeneration trimming . 26 3.3 The shape of a randomly generated program . 26 3.4 Overview of the random generation process . 28 3.5 The generation grammar of Csmith . 31 3.6 Other considerations of context sensitivity: checks, balances, and biases . 32 3.6.1 Type check . 33 3.6.2 Value-range check . 34 3.6.3 Qualifiers check . 34 3.6.4 Unsafe arithmetic check . 35 3.6.5 Pointer check . 36 3.6.6 Array out-of-bound access check . 37 3.6.7 Use before initialization check . 38 3.6.8 The program size balance . 38 3.6.9 The biases . 38 3.7 Exceptions to the general AST construction order . 41 4. GENERATION TIME ANALYSIS AND VALIDATION ::::::::::::: 42 4.1 The objectives . 42 4.2 The necessity and challenges of generation time analysis and validation (GTAV) 43 4.3 GTAV framework . 43 4.3.1 Data representations and control representations . 43 4.3.2 Pre-commit GTAV . 44 4.3.3 Post-commit GTAV. 45 4.3.4 Transfer function for programs . 45 4.3.5 Transfer function for functions . 45 4.3.6 Transfer functions for statements . 47 4.3.7 Transfer function for loops . 47 4.3.8 Transfer function for expressions . 50 4.4 Points-to analysis and validation within GTAV framework . 50 4.4.1 Abstract variables . 50 4.4.2 The lattice . 52 4.4.3 Validation of pointer references . 52 4.4.4 Validation of points-to analysis itself . 54 4.5 Side-effect analysis within GTAV framework and the evaluation order problem 55 4.5.1 The algorithm and the lattice . 57 4.5.2 The implementation . 58 4.5.3 Side-effect validation for volatiles . 58 4.6 Union-field analysis and validation within GTAV . 59 4.6.1 The lattice . 59 4.6.2 Validation of union-field reads . 60 4.6.3 Validation of assignment between overlapping objects . 61 4.7 Guaranteed convergence of fixed point of the loop transfer function . 61 vi 5. EVALUATION :::::::::::::::::::::::::::::::::::::::::::::::::: 63 5.1 Uncontrolled opportunistic compiler bug finding . 63 5.1.1 What kinds of bugs are there? . 63 5.1.2 Experience with commercial compilers . 64 5.1.3 Experience with open source compilers . 65 5.1.4 Testing CompCert . 66 5.2 Comparison with the other random program generators . 67 5.2.1 Previously published random C program generators . 67 5.2.2 Generating performance . 68 5.2.3 Compiler code coverage . 70 5.2.4 Bug-finding power . 72 5.2.5 Bug-finding performance as a function of test-case size . 77 5.2.6 Bug-finding performance compared to other tools . 78 6. COMPILER BUG STUDIES ::::::::::::::::::::::::::::::::::::: 81 6.1 Compile-time bugs vs. wrong-code bugs . ..
Recommended publications
  • The Compcert C Verified Compiler: Documentation and User's Manual
    The CompCert C verified compiler: Documentation and user’s manual Xavier Leroy To cite this version: Xavier Leroy. The CompCert C verified compiler: Documentation and user’s manual: Version 2.4. [Intern report] Inria. 2014. hal-01091802v1 HAL Id: hal-01091802 https://hal.inria.fr/hal-01091802v1 Submitted on 6 Dec 2014 (v1), last revised 9 Dec 2020 (v8) HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. The CompCert C verified compiler Documentation and user’s manual Version 2.4 Xavier Leroy INRIA Paris-Rocquencourt September 17, 2014 Copyright 2014 Xavier Leroy. This text is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The text of the license is available at http://creativecommons.org/ licenses/by-nc-sa/4.0/ Contents 1 CompCert C: a trustworthy compiler 5 1.1 Can you trust your compiler? .................................. 5 1.2 Formal verification of compilers ................................ 6 1.3 Structure of the CompCert C compiler ............................. 9 1.4 CompCert C in practice ..................................... 11 1.4.1 Supported target platforms ............................... 11 1.4.2 The supported C dialect ................................ 12 1.4.3 Performance of the generated code ..........................
    [Show full text]
  • An Executable Semantics for Compcert C⋆
    An Executable Semantics for CompCert C? Brian Campbell LFCS, University of Edinburgh, [email protected] Abstract. CompCert is a C compiler developed by Leroy et al, the ma- jority of which is formalised and verified in the Coq proof assistant. The correctness theorem is defined in terms of a semantics for the `CompCert C' language, but how can we gain faith in those semantics? We explore one approach: building an equivalent executable semantics that we can check test suites of code against. Flaws in a compiler are often reflected in the output they produce: buggy compilers produce buggy code. Moreover, bugs can evade testing when they only manifest themselves on source code of a particular shape, when particular opti- misations are used. This has made compilers an appealing target for mechanized verification, from early work such as Milner and Weyhrauch's simple compiler in LCF [15] to modern work on practical compilers, such as the Verisoft C0 compiler [10] which is used as a key component of a larger verified software system. The CompCert project [11] has become a nexus of activity in recent years, in- cluding the project members' own efforts to refine the compiler and add certified optimisations [22,18,8], and external projects such as an extensible optimisation framework [20] and compiling concurrent programs [23]. The main result of the project is a C compiler which targets the assembly languages for a number of popular processor architectures. Most of the compiler is formalised in the Coq proof assistant and accompanied by correctness results, most notably proving that any behaviour given to the assembly output must be an acceptable be- haviour for the C source program.
    [Show full text]
  • A Multipurpose Formal RISC-V Specification Without Creating New Tools
    A Multipurpose Formal RISC-V Specification Without Creating New Tools Thomas Bourgeat, Ian Clester, Andres Erbsen, Samuel Gruetter, Andrew Wright, Adam Chlipala MIT CSAIL Abstract experimentation in basic research, progressively extensible RISC-V is a relatively new, open instruction set architecture to full-fledged realistic systems. with a mature ecosystem and an official formal machine- readable specification. It is therefore a promising playground 1.2 From a golden standard to formal-methods for formal-methods research. infrastructure However, we observe that different formal-methods re- There is the potential for a mutually beneficial relationship search projects are interested in different aspects of RISC-V between the community of formal methods and the RISC-V and want to simplify, abstract, approximate, or ignore the community. other aspects. Often, they also require different encoding On the one hand, the RISC-V Foundation already adopted a styles, resulting in each project starting a new formalization golden standard in the form of an ISA specification in the Sail from-scratch. We set out to identify the commonalities be- language [6]. There is now an agreed-on, machine-processed tween projects and to represent the RISC-V specification as description of authorized behaviors for a significant portion a program with holes that can be instantiated differently by of the spec, beyond the usual natural-language description. different projects. On the other hand, the formal-methods community can Our formalization of the RISC-V specification is written leverage RISC-V as a great laboratory to prototype ideas, in Haskell and leverages existing tools rather than requiring where researchers will be able to get mature compilation new domain-specific tools, contrary to other approaches.
    [Show full text]
  • Certification of an Instruction Set Simulator Xiaomu Shi
    Certification of an Instruction Set Simulator Xiaomu Shi To cite this version: Xiaomu Shi. Certification of an Instruction Set Simulator. Systèmes embarqués. Université de Grenoble, 2013. Français. tel-00937524v1 HAL Id: tel-00937524 https://tel.archives-ouvertes.fr/tel-00937524v1 Submitted on 28 Jan 2014 (v1), last revised 28 Jun 2017 (v2) HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. THÈSE Pour obtenir le grade de DOCTEUR DE L’UNIVERSITÉ DE GRENOBLE Spécialité : Informatique Arrêté ministérial : 7 août 2006 Présentée par Xiaomu SHI Thèse dirigée par Jean-François MONIN et codirigée par Vania JOLOBOFF préparée au sein de VERIMAG et de Mathématiques, Sciences et Technologies de l’Information, Informatique Certification of an Instruction Set Simulator Thèse soutenue publiquement le 10 juillet 2013, devant le jury composé de : M. Yves Bertot Directeur de Recherche, INRIA Sophia-Antipolis, Examinateur Mme Sandrine Blazy Professeur, IRISA, Rapporteur M. Vania Joloboff Directeur de Recherche, LIAMA, Co-Directeur de thèse M. Xavier Leroy Directeur de Recherche, INRIA Rocquencourt, Examinateur M. Laurent Maillet-Contoz Ingénieur, STMicroelectronics, Examinateur M. Claude Marché Directeur de Recherche, INRIA Saclay - Île-de-France et LRI, Rapporteur M.
    [Show full text]
  • A Trustworthy Monadic Formalization of the Armv7 Instruction Set Architecture
    A Trustworthy Monadic Formalization of the ARMv7 Instruction Set Architecture Anthony Fox and Magnus O. Myreen Computer Laboratory, University of Cambridge, UK Abstract. This paper presents a new HOL4 formalization of the cur- rent ARM instruction set architecture, ARMv7. This is a modern RISC architecture with many advanced features. The formalization is detailed and extensive. Considerable tool support has been developed, with the goal of making the model accessible and easy to work with. The model and supporting tools are publicly available { we wish to encourage others to make use of this resource. This paper explains our monadic specifica- tion approach and gives some details of the endeavours that have been made to ensure that the sizeable model is valid and trustworthy. A novel and efficient testing approach has been developed, based on automated forward proof and communication with ARM development boards. 1 Introduction Instruction set architectures (ISAs) provide a precise interface between hard- ware (microprocessors) and software (high-level code). Formal models of instruc- tion sets are pivotal when verifying computer micro-architectures and compilers. There are also areas where it is appropriate to reason directly about low-level machine code, e.g. in verifying operating systems, embedded code and device drivers. Detailed architecture models have also been used in formalizing multi- processor memory models, see [15]. In the past, academic work has frequently examined the pedagogical DLX architecture, see [6]. When compared to commercial architectures, DLX is rela- tively uncomplicated { being a simplified version of the MIPS RISC architecture. Consequently, it is rarely cumbersome to work with the entire DLX instruction set.
    [Show full text]
  • Certified and Efficient Instruction Scheduling. Application To
    Certified and efficient instruction scheduling. Application to interlocked VLIW processors. Cyril Six, Sylvain Boulmé, David Monniaux To cite this version: Cyril Six, Sylvain Boulmé, David Monniaux. Certified and efficient instruction scheduling. Application to interlocked VLIW processors.. Proceedings of the ACM on Programming Languages, ACM, 2020, OOPSLA 2020, pp.129. 10.1145/3428197. hal-02185883v3 HAL Id: hal-02185883 https://hal.archives-ouvertes.fr/hal-02185883v3 Submitted on 23 Nov 2020 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Certified and Efficient Instruction Scheduling Application to Interlocked VLIW Processors CYRIL SIX, Kalray S.A., France and Univ. Grenoble Alpes, CNRS, Grenoble INP, Verimag, France SYLVAIN BOULMÉ, Univ. Grenoble Alpes, CNRS, Grenoble INP, Verimag, France DAVID MONNIAUX, Univ. Grenoble Alpes, CNRS, Grenoble INP, Verimag, France CompCert is a moderately optimizing C compiler with a formal, machine-checked, proof of correctness: after successful compilation, the assembly code has a behavior faithful to the source code. Previously, it only supported target instruction sets with sequential semantics, and did not attempt reordering instructions for optimization. We present here a CompCert backend for a VLIW core (i.e. with explicit parallelism at the instruction level), the first CompCert backend providing scalable and efficient instruction scheduling.
    [Show full text]
  • The Compcert C Verified Compiler: Documentation and User’S Manual Xavier Leroy
    The CompCert C verified compiler: Documentation and user’s manual Xavier Leroy To cite this version: Xavier Leroy. The CompCert C verified compiler: Documentation and user’s manual: Version 3.1. [Intern report] Inria. 2017, pp.1-68. hal-01091802v5 HAL Id: hal-01091802 https://hal.inria.fr/hal-01091802v5 Submitted on 5 Nov 2017 (v5), last revised 9 Dec 2020 (v8) HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution - NonCommercial - ShareAlike| 4.0 International License The CompCert C verified compiler Documentation and user’s manual Version 3.1 Xavier Leroy INRIA Paris August 18, 2017 Copyright 2017 Xavier Leroy. This text is distributed under the terms of the Creative Commons Attribution-NonCommercial- ShareAlike 4.0 International License. The text of the license is available at http://creativecommons. org/licenses/by-nc-sa/4.0/ Contents 1 CompCert C: a trustworthy compiler5 1.1 Can you trust your compiler?.....................................5 1.2 Formal verification of compilers...................................6 1.3 Structure of the CompCert C compiler................................9 1.4 CompCert C in practice........................................ 11 1.4.1 Supported target platforms.................................
    [Show full text]
  • An Abstract Stack Based Approach to Verified Compositional Compilation to Machine Code
    An Abstract Stack Based Approach to Verified Compositional Compilation to Machine Code YUTING WANG, Yale University, USA PIERRE WILKE, Yale University, USA and CentraleSupélec, France ZHONG SHAO, Yale University, USA A key ingredient contributing to the success of CompCert, the state-of-the-art verified compiler for C, is its block-based memory model, which is used uniformly for all of its languages and their verified compilation. However, CompCert’s memory model lacks an explicit notion of stack. Its target assembly language represents the runtime stack as an unbounded list of memory blocks, making further compilation of CompCert assembly into more realistic machine code difficult since it is not possible to merge these blocks into afiniteand continuous stack. Furthermore, various notions of verified compositional compilation rely on some kindof mechanism for protecting private stack data and enabling modification to the public stack-allocated data, which is lacking in the original CompCert. These problems have been investigated but not fully addressed before, in the sense that some advanced optimization passes that significantly change the ways stack blocks are (de-)allocated, such as tailcall recognition and inlining, are often omitted. We propose a lightweight and complete solution to the above problems. It is based on the enrichment of CompCert’s memory model with an abstract stack that keeps track of the history of stack frames to bound the stack consumption and that enforces a uniform stack access policy by assigning fine-grained permissions to stack memory. Using this enriched memory model for all the languages of CompCert, we are able to reprove the correctness of the full compilation chain of CompCert, including all the optimization passes.
    [Show full text]
  • Simple, Light, Yet Formally Verified, Global Common Subexpression
    Simple, light, yet formally verified, global common subexpression elimination and loop-invariant code motion David Monniaux Verimag, Univ. Grenoble Alpes, CNRS, Grenoble INP∗ Cyril Six Kalray S.A. & Verimag May 5, 2021 Abstract We present an approach for implementing a formally certified loop- invariant code motion optimization by composing an unrolling pass and a formally certified yet efficient global subexpression elimination. This approach is lightweight: each pass comes with a simple and inde- pendent proof of correctness. Experiments show the approach signif- icantly narrows the performance gap between the CompCert certified compiler and state-of-the-art optimizing compilers. Our static analysis employs an efficient yet verified hashed set structure, resulting in fast compilation. This is the full version, with appendices, of an article published in LCTES 2021. 1 Introduction arXiv:2105.01344v1 [cs.PL] 4 May 2021 In this article, we present an approach for obtaining a complex optimization (loop-invariant code motion), which brings a major performance boost to certain applications (including linear algebra kernels), by composition of simple optimization passes that can be proved correct using simple local arguments. We have implemented that scheme in the CompCert verified compiler.1 ∗Institute of engineering Univ. Grenoble Alpes 1Our developments are available at https://gricad-gitlab.univ-grenoble-alpes.fr/certicompil/compcert-kvx.git 1 1.1 CompCert CompCert Leroy [2009b,a] is the first optimizing C compiler with a formal proof of correctness mature enough to be used in industry Bedin Fran¸caet al. [2012], K¨astneret al. [2018]; it is now available both as a research tool2 and commercial software.3 This proof of correctness, verified by the Coq proof assistant, ensures that the behavior of the assembly code produced by the compiler matches the behavior of the source code; in particular, CompCert does not have the middle-end bugs usually found in compilers Yang et al.
    [Show full text]
  • Factsheet Compcert C Compiler
    CompCert C Compiler CompCert is an optimizing C compiler which is formally • Static assertions via the _Static_assert key- verified, using machine-assisted mathematical proofs, to word from ISO C 2011. guarantee the absence of compiler bugs. • Pragmas and attributes to control alignment and sec- Its intended use is compiling safety-critical and mission- tion placement of global variables. critical software written in C and meeting high levels of assurance. The code it produces is proved to behave exactly as specified by the semantics of the source C program. Supported targets The formal proof covers all transformations from the ab- • PowerPC 32-bit (Signal Processing Extension SPE stract syntax tree to the generated assembly code. and Variable Length Encoding VLE are currently not supported) Key benefits • PowerPC 32-/64-bit hybrid (32-bit pointers, 64-bit integer computations, SPE and VLE are currently not • The correctness proof of CompCert guarantees that supported) all safety properties verified on the source code (e.g. • ARMv6 ISA with VFPv2 coprocessor (big or little by static analyzers or model checking) automatically endian) hold for the generated code as well. • ARMv7 ISA with VFPv3-D16 coprocessor (big or • On typical embedded processors, code generated by little endian) CompCert offers performance comparable to code • AArch64 (ARMv8 ISA, 64-bit, little endian) generated by GCC at optimization level 1. • ia32 (x86 32-bit, SSE2 extension required) • AMD64 (x86 64-bit) • RISC-V (Base instruction sets RV32I and RV64I; ex- Optimizations tensions M, F, and D) Supported target systems are Linux with the GNU toolchain CompCert implements the following optimizations, all for all architectures.
    [Show full text]
  • The Compcert C Verified Compiler: Documentation and User's Manual
    The CompCert C verified compiler: Documentation and user’s manual Xavier Leroy To cite this version: Xavier Leroy. The CompCert C verified compiler: Documentation and user’s manual. [Intern report] Inria. 2020, pp.1-78. hal-01091802v8 HAL Id: hal-01091802 https://hal.inria.fr/hal-01091802v8 Submitted on 9 Dec 2020 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution - NonCommercial - ShareAlike| 4.0 International License The CompCert C verified compiler Documentation and user’s manual Version 3.8 Xavier Leroy Collège de France and Inria November 16, 2020 Copyright 2020 Xavier Leroy. This text is distributed under the terms of the Creative Commons Attribution-NonCommercial- ShareAlike 4.0 International License. The text of the license is available at http://creativecommons. org/licenses/by-nc-sa/4.0/ Contents 1 CompCert C: a trustworthy compiler5 1.1 Can you trust your compiler?.....................................5 1.2 Formal verification of compilers...................................6 1.3 Structure of the CompCert C compiler................................9 1.4 CompCert C in practice........................................ 11 1.4.1 Supported target platforms.................................. 11 1.4.2 The supported C dialect...................................
    [Show full text]
  • 1 Introduction
    Ben Greenman CompCert Overview 2015-11-19 Abstract CompCert is a verified C compiler, designed and implemented by Xavier Leroy along with: • Sandrine Blazy • Bernard Serpette • Michael Schmidt • Zaynah Dargaye • Guillarme Claret • Guillaume Melquiond • Damien Doligez • Jacques-Henri Jourdan • Sylvie Boldo • Benjamin Gregoire` • Franc¸ois Pottier • Andrew Appel • Thomas Moniot • Silvain Rideau • Gordon Stewart • Laurence Rideau • Tahina Ramananandro • . and many more CompCert compiles a safe subset of the C89 and C99 languages. CompCert guarantees that all ob- servable behaviors of compiled code are observable source code behaviors. The project began in 2005 and continues to improve today. This document is a summary of the CompCert users manual and annotated source code. 1 Introduction Debugging is always a bummer. This is especially true if buggy software starts losing money, releasing secrets, or killing people. Hence the push for verified software: let’s get it right the first time. Embedded systems is a large application area where mistakes can be fatal. Airplanes, automobiles, construction equipment, spaceships, and the like all run embedded systems. Embedded systems have another constraint: the programs must follow a strict resource limit. For these reasons, the C programming language is commonly used in embedded systems because of its proximity to the hardware and portability. Debugging embedded systems code is very difficult. The CompCert C compiler offers an efficient, optimizing, and safe C compiler [18]. CompCert claims that if it compiles a program, the executable will only perform only a subset of the actions one would expect the source code to perform. This guarantee does not eliminate debugging—nor does it specify execution time or memory consumption1—but it at least rules out translation bugs, thereby encouraging source-program reasoning.
    [Show full text]