
Verified static analyses for low-level languages

Vincent Laporte

To cite this version:

Vincent Laporte. Verified static analyzes for low-level languages. Programming Languages [cs.PL]. Université Rennes 1, 2015. English. NNT: 2015REN1S078. tel-01285624

HAL Id: tel-01285624 https://tel.archives-ouvertes.fr/tel-01285624 Submitted on 9 Mar 2016

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

ANNÉE 2015

THESIS / UNIVERSITÉ DE RENNES 1, under the seal of the Université Européenne de Bretagne

for the degree of DOCTOR OF THE UNIVERSITÉ DE RENNES 1. Specialty: Computer Science. Doctoral school: Matisse

presented by Vincent Laporte

prepared at the Unité Mixte de Recherche 6074 IRISA (Institut de recherche en informatique et systèmes aléatoires), UFR Informatique Électronique (ISTIC)

Vérification d'analyses statiques pour langages de bas niveau

Thesis reviewed by:
John Gregory MORRISETT, Professor, Cornell University (reviewer)
Anders MØLLER, Professor, Aarhus University (reviewer)

and defended in Rennes on 25 November 2015 before the jury composed of:
Anders MØLLER, Professor, Aarhus University (reviewer)
Xavier LEROY, Research Director, INRIA Rocquencourt (examiner)
Antoine MINÉ, Professor, Paris VI (examiner)
Marie-Laure POTET, Professor, ENSIMAG (examiner)
Sandrine BLAZY, Professor, Université de Rennes 1 (thesis advisor)
David PICHARDIE, Professor, ENS Rennes (thesis co-advisor)

Contents

Extended Summary vii
1 Static Verification of Programs ...... vii
2 Analysis of Low-Level Languages ...... ix
3 Can a Static Analyzer Be Trusted? ...... x
4 Contributions and Organization of this Document ...... xi

1 Introduction 1
1.1 Static Verification of Software ...... 1
1.2 Analysis of Low-level Languages ...... 2
1.3 Trusting a Static Analyzer ...... 4
1.4 Contributions and Organization of this Document ...... 5

2 Context 7
2.1 Introduction to abstract-interpretation-based static analysis in Coq ...... 7
2.1.1 Syntax of a Toy Language ...... 7
2.1.2 Sets as Propositions ...... 8
2.1.3 Abstract Data Types ...... 8
2.1.4 Type Classes ...... 11
2.1.5 Concrete Semantics ...... 12
2.1.6 Abstract Semantics ...... 14
2.1.7 Flow-Sensitive Analyzer ...... 15
2.1.8 Non-Relational Value Analysis ...... 18
2.2 CompCert ...... 19
2.3 Conclusion ...... 23

3 Modular construction of static analyzers of a C-like language in Coq 25
3.1 Abstract Domain Library ...... 26
3.1.1 Abstract Domain ...... 27
3.1.2 Example of Abstract Domain: Intervals ...... 28
3.1.3 Abstract Domain Functors ...... 29
3.2 Fixpoint Resolution ...... 31
3.3 Numerical Abstraction ...... 33
3.3.1 Abstraction of Numerical Environments ...... 34
3.3.2 Building Non-relational Abstraction of Numerical Environments ...... 34
3.3.3 Abstraction of Numerical Values: Instances and Functor ...... 38
3.4 Memory Abstraction ...... 40
3.4.1 Specification ...... 40
3.4.2 Basic ...... 42
3.5 Experimental Evaluation ...... 44
3.6 Related Work and Conclusion ...... 45


4 Verified value analysis & self-modifying programs 47
4.1 Disassembling by Abstract Interpretation ...... 48
4.2 Semantics of Goto★ ...... 51
4.2.1 Abstract Syntax ...... 51
4.2.2 Semantics ...... 54
4.3 Abstract Interpreter ...... 56
4.3.1 Memory Abstract Domain ...... 56
4.3.2 Abstract Semantics ...... 59
4.3.3 Fixpoint Computation ...... 61
4.3.4 Soundness of the Abstract Interpreter ...... 63
4.4 Case Studies and Analysis Extensions ...... 64
4.4.1 Basic Example ...... 64
4.4.2 A First Extension: Dealing with Flags ...... 65
4.4.3 A Second Extension: Trace Partitioning ...... 66
4.4.4 A Third Extension: Abstract Decoding ...... 68
4.5 Related Work ...... 70
4.6 Conclusion and Perspectives ...... 71

5 Precise abstraction of C-like memories 73
5.1 Background ...... 73
5.2 Overview of the Memory Domain ...... 75
5.2.1 Memory Cells ...... 76
5.2.2 Points-to Domain ...... 77
5.2.3 Underlying (Relational) Numerical Domain ...... 77
5.2.4 Conversion ...... 78
5.3 Technical Details ...... 78
5.3.1 Pointer Expressions ...... 79
5.3.2 Points-to Domain ...... 79
5.3.3 Numerical Conversion ...... 80
5.3.4 Load Elimination ...... 80
5.3.5 Abstract Transformers ...... 82
5.3.6 Assign & Store ...... 84
5.4 Soundness ...... 85
5.4.1 Functional Memory ...... 85
5.4.2 Points-to Domain ...... 87
5.4.3 Concretization Relation ...... 88
5.4.4 Key Lemmas ...... 89
5.5 Extensions ...... 90
5.5.1 Local Variables ...... 90
5.5.2 Realization & Type Punning ...... 90
5.5.3 The Case of Null Pointers ...... 92
5.6 Conclusion ...... 94

6 Practicalities 95
6.1 Abstract Interpretation of Structured Programs with Gotos ...... 96
6.2 Inter-Procedural Analysis ...... 98


6.3 Progress Verification ...... 101
6.3.1 The Non-Block Monad ...... 101
6.3.2 Progress Verification in the Memory Domain ...... 102
6.3.3 Progress Verification in Numerical Domains ...... 104
6.3.4 Summary of Progress Checks ...... 104
6.4 More Numerical Domains ...... 104
6.4.1 Handling Machine Integers ...... 106
6.4.2 Polyhedral Domain ...... 106
6.4.3 Communication Between Domains ...... 107
6.5 On Using the Analyzer ...... 108
6.5.1 Common Pitfalls ...... 108
6.5.2 Sample Results ...... 110
6.6 Conclusion ...... 113

7 Conclusion 115
7.1 Summary ...... 115
7.2 Perspectives ...... 116
7.2.1 Improving the usability of the Verasco analyzer ...... 116
7.2.2 Weak cells and summarization ...... 117
7.2.3 Verified static analysis of concurrent programs ...... 119

Bibliography 121


List of Figures

2.1 Syntax of ge, a toy control-flow graph language with structured expressions ...... 7
2.2 Example ge program computing the factorial ...... 8
2.3 Sets as propositions ...... 9
2.4 Semantics of ge expressions ...... 13
2.5 Semantics of ge programs ...... 13
2.6 Generic post-fixpoint solver ...... 15
2.7 Syntax of CFG programs ...... 20
2.8 Syntax of C♯minor statements ...... 21
2.9 Two C programs that get more defined when compiled by CompCert ...... 22

3.1 Overview of the architecture of the analyzer ...... 27
3.2 Signature of weak lattices with widening ...... 27
3.3 Specification of weak lattices ...... 28
3.4 Domain of intervals, abstracting signed machine integers ...... 29
3.5 Lattice structure of a domain with bottom ...... 30
3.6 Fixpoint checker ...... 32
3.7 Syntax of numerical expressions (nexpr) ...... 34
3.8 Semantics of numerical expressions ...... 35
3.9 Signature of relational numerical domains ...... 36
3.10 Signature of non-relational numerical domains ...... 37
3.11 Signature of abstract memory domains ...... 41
3.12 Abstract semantics of CFG ...... 41
3.13 Soundness theorem of the analysis ...... 42
3.14 Number of bounded intervals (bounded per program and analyzer) ...... 45

4.1 A self-modifying program: as a sequence (left); after some execution steps (middle); assembly source (right) ...... 49
4.2 Iterative fixpoint computation ...... 51
4.3 Abstract syntax of Goto★ ...... 52
4.4 Decoding binary code ...... 53
4.5 Concrete execution states of Goto★ ...... 54
4.6 Concrete semantics ...... 55
4.7 Signature of abstract memory domains for analysis of Goto★ ...... 57
4.8 Example of abstract transformer ...... 59
4.9 Abstract small-step semantics of Goto★ ...... 60
4.10 Internal state of the Goto★ analyzer ...... 61
4.11 Main loop of the Goto★ analyzer ...... 62
4.12 Multilevel run-time code generation ...... 65
4.13 Array bounds check ...... 65
4.14 Implementation of the assume transfer function ...... 66


4.15 Polymorphic program ...... 67
4.16 Self-modifying Goto★ program computing Fibonacci numbers ...... 69
4.17 Abstract instructions ...... 69
4.18 Not-a-branch ...... 69
4.19 Abstract conditional jump ...... 69
4.20 Summary of self-modifying examples ...... 70

5.1 Interface of the memory domain ...... 74
5.2 Sketch of the memory domain ...... 76
5.3 Lattice of types and points-to abstract values ...... 79
5.4 Conversion (excerpt) ...... 81
5.5 Implementation of assume ...... 83
5.6 Test case for assume ...... 83
5.7 Updating cells: assign and store ...... 84
5.8 Abstract memory specification ...... 87
5.9 Type punning examples ...... 90
5.10 C programs checking for null pointers ...... 92
5.11 Lattice of abstract types to null pointers ...... 93

6.1 Global architecture of the Verasco analyzer ...... 96
6.2 Representative cases of the C♯minor abstract interpreter ...... 97
6.3 Operators of the memory domain for inter-procedural analysis ...... 99
6.4 The non-block monad ...... 101
6.5 Invariant of the reachable concrete states ...... 103
6.6 Permission domain ...... 103
6.7 Summary of progress checks in the Verasco analyzer ...... 105
6.8 Two “” of arrays ...... 112

7.1 Incorrect C program which is proved safe by Verasco ...... 116

Extended Summary

1 Static Verification of Programs

Software is pervasive, and everywhere it contains errors. An error manifests itself when a program behaves differently from what is expected. This may stem from a design or usage problem: the software is used for a purpose other than the one for which it was designed. It may also stem from a mismatch between the specification of the program and the functionality it actually implements. Finally, it may be a run-time error; for instance, the program crashes abruptly. In many cases, we put up with an erroneous program. However, if it is a critical program, the consequences of an error can be dramatic. Software is called "critical" when an incorrect behavior of that software can threaten the safety or the integrity of its users or of its run-time environment. Such software is found in many systems, from airbags to flight control programs, through railway signaling and power plant controllers.

Programming errors are diverse and have many causes, and many remedies have been proposed either to avoid them or to contain their effects. In particular, programming languages can, by their very design, rule out whole families of errors. For instance, automatic memory management [JHM11], offered by many languages, prevents any safety error related to memory management. If programming errors cannot be avoided, their consequences can be limited by dynamic enforcement of safety policies: whenever a programming error would lead to an incorrect behavior, the program is stopped without causing further harm.
For instance, some consequences of buffer overflows [Ale96] can be avoided using canaries [Cow+98], which make it possible to check at run time, to some extent, the integrity of the return addresses stored on the stack. Similarly, memory access protocols can be enforced dynamically by integrity checks injected into the program at compile time [CCH06]. In high-level languages such as Java™, programs systematically check that accesses to array elements are within bounds and satisfy a typing discipline: any violation raises an error that can either be handled by the program or abort it.

Enforcing the correct behavior of a program at run time in this way has a cost, since it may increase execution time, program size, or memory consumption. Fortunately, some checks are superfluous, and this can be proved statically: before execution, a simple examination of the source code proves that some checks cannot fail; they can therefore be removed. This is what some optimizing compilers do; for instance, the Java™ HotSpot compiler systematically inserts instructions that check accesses to array elements, and then tries to remove them by automatically proving that they are not needed [WWM07]. Another approach is to statically prove that unchecked operations are safe (for instance, that accesses to array elements are indeed within bounds), and to inject checking instructions only where the result of the static analysis is not precise enough. This is the approach followed, for example, by CCured [Nec+05] and WIT [Akr+08], which instrument C programs to inject safety checks. Low-level languages can also be designed with safety goals in mind, notably Cyclone [Jim+02; Gro+05] and Rust [Rust]. In these languages, all programs are a priori safe: the compiler rejects the programs it cannot prove safe. The programmer must explicitly annotate the parts of the program in which the safety properties are to be checked dynamically.

These dynamic or hybrid approaches are very effective at preventing the exploitation of programming errors and at containing their consequences, as for instance in the case of the "Heartbleed" vulnerability [CVE13]: the affected servers could leak kibibytes of secret data (passwords, secret cryptographic keys…) through out-of-bounds array accesses. However, dynamically preventing errors from manifesting is not always sufficient: in a critical context, a program must not stop because of a programming error. Moreover, in low-level languages, in which breaking the abstractions is allowed, dynamic checks are not always possible, or only at an exorbitant cost. This is why, to ensure that a program is meaningful and that its execution cannot produce an error, said program must be studied statically.
That is, all its possible behaviors must be predicted, before any execution of the program, taking into account every possible environment. Static analyzers are tools that automatically inspect the text of a program and infer properties about the executions of that program; for instance, that no error of a given family can occur. There is a wide range of static analyzers, with different goals. Some, such as Coverity [Bes+10], aim at finding as many programming errors as possible: they are applied to programs under development in order to find bugs as early as possible. These tools do not claim to be sound: if they find no error, it does not mean that there is none. Some analyzers make pragmatic assumptions and are sound only when these are satisfied. For instance, the Infer tool [Cal+15], which targets mobile phone applications, assumes that there is neither concurrency nor dynamic dispatch. Similarly, the Static Driver Verifier [Bal+06], which targets device drivers written in C for Microsoft Windows, is sound provided that the analyzed program manipulates memory safely. This analyzer checks that specific programs (device drivers) satisfy specific properties (that the driver API is used correctly). Finally, some analyzers are sound. For instance, Astrée [Mau04; Cou+05] aims at verifying programs, proving that they cannot produce run-time errors. It is used by Airbus to prove the safety of the fly-by-wire flight control software of some of its aircraft.

In this thesis, we focus on sound static analysis of programs, which faces the following two difficulties: on the one hand, the problem is undecidable; on the other hand, such an analyzer must be trustworthy.
By Rice's theorem, no sound static analyzer can produce precise results for all programs. Such an analyzer must resort to approximations and fail in many cases. Designing a static analysis therefore requires finding the right balance: which programs will be analyzed precisely, and in which cases will the analyzer give up? A useful analyzer must give up in many (indeed infinitely many) cases, but must be precise enough for the targeted category of programs. Another important property of an analyzer is its efficiency: it must deliver a precise result within a reasonably short time. Abstract interpretation [CC77] provides a theoretical framework for building families of static analyzers of varying cost and precision.

2 Analysis of Low-Level Languages

In order to obtain precise and sound results, static analyzers rely on the abstractions provided by the targeted programming language: variables, functions, objects, types, automatic memory management, and so on. Low-level languages provide few abstractions, and even when they do provide some, the programmer is allowed to break them. This suggests that it is preferable to analyze programs written in higher-level languages. However, this is not always possible, nor even desirable. Some programs are written directly in low-level languages, either because they need to break the abstractions, or in order to provide low-level services such as the standard library of a higher-level language, a hardware driver, or an operating system. In such cases, there is no high-level version that could be analyzed. And even in the cases where one exists, it may not be available, for instance for intellectual property reasons.

Programs are, however, generally written in languages intended first and foremost for programmers and, before they can be executed by a machine, undergo several transformations, collectively called compilation, that produce a corresponding program in a form suitable for execution by a machine. The diagram opposite outlines the successive compilation stages of an avionics program initially written in the Scade language: Scade (design), C, RTL (optimization), assembly, and binary (execution). This compilation process may itself be faulty and introduce errors into the compiled program, or more simply invalidate analysis results obtained on the source code. An analysis can therefore take place at the lowest possible level, so as to handle the program in its final form, as it will be executed.
This approach is notably used by Reps [BR08; RBL06], and by ourselves in chapter 4. Its major advantage is that the distance between the analyzed program and the executed program is reduced to a minimum. But this is also its main drawback: the program is analyzed at a very low level, where the programming language (if it can be called that) offers few abstractions, if any, and it is very hard to make any assumption about the shape of the program under study. No suitable approximation can be chosen: what is the right abstraction for analyzing a given family of programs? Moreover, most static analyses are designed for higher-level languages that offer some abstractions (functions, variables, types…); and an analyzer can rely on these abstractions to be precise: it may be designed to be very imprecise when the analyzed program does not comply with them. As long as the program complies with them, the analysis is as precise as possible; as soon as it does not, the analysis result, although sound, becomes very coarse.

It is therefore wise to analyze, whenever possible (that is, when the source code is available), a program in a higher-level form, and to prove that the properties established on this high-level form are preserved by the compilation process. There are two ways to establish such a proof: verified compilation and translation validation.

Using a verified compiler justifies that the properties established on the source code of a program hold for all executions of the compiled program. A verified compiler is a compiler accompanied by a correctness theorem; it is proved once and the theorem applies to all runs of the compiler. For instance, the CompCert compiler guarantees that compilation introduces no new behaviors into compiled programs and that it preserves every safety property established on the source code. The compiled program must however be free of run-time errors, which is, fortunately, a property that can be established by a static analyzer. Translation validation [PSS98; Nec00] consists in checking that a particular result is correct for a particular program. This requires an additional computation at each run of an analyzer. This approach is used for the verification of the Sparrow analyzer [Kan+14].
The example of the CompCert compiler shows that the two methodologies can be combined: some compilation passes are verified once and for all, while others (such as register allocation) are validated at each run of the compiler.

3 Can a Static Analyzer Be Trusted?

It may seem dubious to rely on a static analyzer to justify that a particular program is safe. The static analyzer being itself a program, it may also contain programming errors. Worse, to make sure that a program is safe, one must now make sure that another program, the analyzer, is correct and fulfills its role; and this latter property is a priori much harder to establish. Here again, abstract interpretation is a methodology that makes it possible to build sound static analyzers.

To formally establish that a program cannot produce a run-time error, or that it computes something correct, one must first formally define the meaning of programs. The formal semantics of a programming language is a mathematical description of the meaning of programs. It defines precisely how a program executes, what function it computes. The soundness proof of a static analyzer, which relates its results to the behaviors of the analyzed programs, is usually a manual effort separate from the effort of implementing the analyzer. This raises two problems: on the one hand, a manual proof is error-prone and hard to check; on the other hand, there is a gap between the verified model and the implemented program.

Both problems can be alleviated by resorting to a proof assistant such as Coq [Coq15]. Such a tool is first of all a skeptical theorem prover: it can be used to state properties and proofs and, given a proof term — a syntactic object that represents the justification that a statement holds — Coq can check the validity of this proof. Even though the prover provides automated proof search facilities and decision procedures that help the user build proof terms, it is only necessary to trust the kernel, generally small, that checks the proof terms. Coq is moreover a purely functional programming language: programs and properties can be written in one and the same formalism. These programs can be executed directly within Coq, but they can also be extracted to produce an OCaml program that can be linked against other programs. The latter can perform input and output, and thus communicate with the outside world.

The CompCert compiler is a notable example of a program extracted from a Coq development. The syntax and semantics of the programming languages handled by this compiler (from C down to assembly) as well as the compilation passes are defined in Coq, so that the properties, also written in Coq, can mention the actual code of the compiler to state its correctness. The gap between the verified program and the program that runs is thus closed. Mechanized proof has recently been applied successfully in various areas: mathematical results such as the Kepler conjecture [Hal+15], the four-color theorem [Gon07], and the Feit-Thompson theorem [Gon13]; and the verification of operating system implementations [Kle+10].
Mechanized verification of static analyses has in the past mostly been applied to the classic setting of data-flow analysis rather than to the more general abstract interpretation. Klein and Nipkow proposed a type inference for Java bytecode by means of a data-flow analysis [KN06]; Coupet-Grimal and Delobel [CD04] as well as Bertot et al. [BGL06] applied this analysis framework to compiler optimizations, and Cachera et al. [Cac+05] to control-flow analysis. Vafeiadis et al. [VN11] rely on a simple data-flow analysis to verify an optimization that eliminates synchronization barriers in concurrent C programs. David Pichardie, in his PhD thesis [Pic05], proposed a methodology for verifying the soundness of static analyzers based on abstract interpretation. He applied this methodology to analyzers that handle programs written, for illustration purposes, in toy languages.

In this thesis, we extend this methodology to build and verify sound static analyzers, based on abstract interpretation, that handle executable programs: either binary code, or source code that can be compiled by a verified compiler; such a compiler preserves the validity of the analysis results. Our analyses target realistic languages and take into account the subtleties inherent to the analysis of low-level languages.

4 Contributions and Organization of this Document

In this thesis, we show how mechanized verification methods can be applied to the construction of verified static analyzers that target low-level languages, and in particular intermediate representations of the CompCert compiler: the CFG (chapters 3 and 5) and C♯minor (chapter 6) languages, and a toy binary language (chapter 4). Throughout this document, we describe several static analyzers, all of which have been proved sound using the Coq proof assistant. Their soundness is established with respect to a semantics of the targeted language, and this semantics is also defined in Coq. Whenever possible, the definitions of these semantics are shared with the CompCert compiler.

Chapter 2 presents the formal verification of static analyzers in Coq: how the syntax and semantics of programming languages are defined, what the general shape of a static analyzer based on abstract interpretation is, and how the soundness proof of such an analyzer is conducted. In chapter 3 we present a general methodology, which we apply throughout this document. In that chapter, this methodology is applied to a relatively simple language (a subset of the CFG intermediate representation of the CompCert compiler: no global memory, no function calls, no floating-point arithmetic), and we nevertheless obtain a value analysis that produces results comparable to those of state-of-the-art analyzers.

Following the same methodology and reusing components of the previous analyzer (in particular the interfaces and implementations of the abstract numerical domains), we study in chapter 4 the analysis of programs whose structure (notably their control-flow graph) is not known before the analysis. We address the problem of soundly disassembling binary programs and build a verified static analyzer for a toy binary language. Although very simple, this language, inspired by x86 assembly, features variable-length instruction encoding and flag-based conditional branches.
The analyzer is able to disassemble programs even when their instructions overlap or are produced during program execution (self-modifying programs). It can then prove that the analyzed programs are safe, that is, that they cannot produce an error during their execution (errors could stem from an impossibility to decode the instructions).

Chapter 5 returns to the analysis of structured programs (given as an immutable control-flow graph) that operate on a weakly structured memory: variables are no longer simple disjoint entities but aggregate data (arrays, …) that are partially accessed through pointers. Finally, chapter 6 is devoted to an analyzer for the C♯minor language, quite close to C, which automatically proves the precondition of the CompCert correctness theorem. All the programs presented in this manuscript are available online on the companion website [Web].

This work is part of the ANR Verasco project. In particular, for the sake of completeness, this document presents contributions by other PhD students:
- The experimental evaluation of the CFG analysis presented in section 3.5 was carried out by André Maroneze [Mar14].
- The polyhedral abstract domain of the Verasco analyzer, presented in section 6.4.2, is due to Alexis Fouilhé [FB14].
- The intra-procedural iterator for C♯minor (§ 6.1), the abstract numerical domain functor between ideal integers and machine integers (§ 6.4.1), and the mechanism of communication channels between domains (§ 6.4.3), presented in chapter 6, are due to Jacques-Henri Jourdan [Jou+15].

Chapter 1 Introduction

1.1 Static Verification of Software

Software is pervasive; and everywhere it has bugs. A bug manifests itself when a program behaves differently than expected. This may be a design or usage issue: the software is used for a certain purpose whereas it was designed for a different one. The bug may result from a functional error: the program is not functionally correct, it fails to fulfill its specification. Or the bug can be a run-time error; for instance, the program crashes. In many cases, buggy software is not an issue, at most an unpleasantness. However, when the failing software is safety-critical, the consequences of a bug can be dramatic. Software is called "safety-critical" when a misbehavior of this software would threaten the safety or the integrity of its users or of its run-time environment. Such software is found in many places, including airbags, fly-by-wire avionics, railway signaling, and power plant control.

Programming errors have many causes and there have been many attempts to prevent them or to limit their effects. The design of programming languages itself can help avoid whole families of errors. For instance, automatic memory management [JHM11], featured by many languages, avoids all safety errors related to memory management (e.g., double free, use after free). If programming errors cannot be avoided, their consequences can be mitigated by dynamic enforcement of safety policies: in case a programming error would lead to a buggy behavior, the program aborts without causing any more harm. For instance, some consequences of buffer overflows [Ale96] can be avoided by using canaries [Cow+98], which make it possible to check at run time, to some extent, the integrity of return addresses stored on the stack. Similarly, memory access protocols can be dynamically enforced by integrity checks introduced at compile time [CCH06].
High-level languages like Java™ systematically check that array accesses are in-bounds and obey a typing discipline: any violation results in an error that may be caught by the program itself or halts it. Such run-time enforcement of safety properties has costs, as it may increase the execution time, code size, or memory footprint of the running program. Fortunately, some checks can be statically proved to be unnecessary: before the execution, simply by looking at the source code, it can be proved that some checks cannot fail; therefore, they can be removed. This is what is done in optimizing compilers; for instance, the Java HotSpot™ compiler systematically inserts checks and then tries to remove them by automatically proving that they are not needed [WWM07]. Another approach is to statically prove that unchecked operations are safe (e.g., array accesses are in-bounds), and to insert checks whenever the static analysis result is not precise enough. This approach is taken by CCured [Nec+05] and WIT [Akr+08], which instrument C programs to perform safety checks. Low-level languages can also be designed with safety in mind, as for instance Cyclone [Jim+02; Gro+05] or Rust [Rust]. In these languages, safety is the default: the compiler will reject programs that it cannot prove safe. The programmer has to explicitly annotate the parts of the programs where safety properties must be dynamically checked. These dynamic or hybrid approaches are very effective to prevent the exploitation of bugs and to avoid some bad consequences of programming errors, as in the case of the Heartbleed bug [CVE13]: buggy servers silently leaked kibibytes of secret data (user passwords, secret cryptographic keys…) by reading arrays out-of-bounds. Nonetheless, such dynamic avoidance of errors is no panacea: in critical settings, programs should not halt nor crash because of programming errors. Also, in low-level languages, in which it is allowed to break the abstractions (the programmer is trusted), dynamic checks are not possible or extremely costly. Therefore, to ensure that a program is meaningful and that its execution cannot produce an error, said program must be statically studied. This means predicting all possible behaviors, prior to any execution of the program, taking into account any possible environment. Static analyzers are tools that automatically inspect a program text and infer properties about the executions of the program, for instance that no error of some family can occur. There is a wide range of such tools, with different aims. Some, like the Coverity tool [Bes+10], aim at finding as many programming errors as possible: they apply to development versions of software so as to find bugs as early as possible in the development process. Such tools do not pretend to be sound: if they find no bug, it does not mean that there are none. Some analyzers make practical assumptions and are sound under them.
As an example, the Infer tool [Cal+15], targeting applications for smartphones, assumes that there is no “concurrency or dynamic dispatch”. Similarly, the Static Driver Verifier [Bal+06], which targets device drivers for Microsoft Windows written in C, is sound provided the analyzed program is memory-safe. This analyzer verifies that specific software (device drivers) satisfies specific requirements (that the driver API is correctly used). Finally, some analyzers are sound. For instance, Astrée [Mau04; Cou+05] aims at verifying software, at proving that it cannot produce any run-time error. It is used by the Airbus company to verify the fly-by-wire software of some of its planes. In this work, we focus on such sound static verification of programs, which faces the two following issues: the problem at hand is undecidable; and the analyzer should be reliable. By Rice’s theorem, no static analyzer can yield precise results for all programs. It must resort to approximations and fail to produce any useful result in many cases. Designing a static analysis involves finding the right trade-off: what kinds of programs are precisely analyzed and in which cases the analyzer gives up. A useful analyzer must give up in (infinitely) many cases but be precise enough on the family of programs of interest. Another important property of an analyzer is its efficiency: it must yield a meaningful answer fast enough. Abstract interpretation [CC77] provides a theoretical framework to build families of static analyzers with various precisions and costs.

1.2 Analysis of Low-level Languages

Static analyzers rely, to get precise and sound results, on the abstractions provided by the targeted programming language: variables, functions, objects, types, automatic memory management, etc. Low-level languages feature fewer abstractions, or the programmer is

allowed to break them. This would thus suggest to preferably analyze programs written in higher-level languages. This is however not always possible or even desirable. Some programs are directly written in low-level languages, because they need the power of breaking the abstractions, so as to implement low-level services like the standard library and run-time support for higher-level languages, hardware drivers, or operating systems. In such cases, there is no high-level artifact that can be analyzed. And even if it exists, it may not be available, e.g., for intellectual property reasons. Anyway, programs are usually written in languages friendly to the programmers and, before they can actually be run on a machine, they undergo several transformations, collectively referred to as compilation, to produce a corresponding program in a form friendly to the executing machine. The picture on the right sketches various steps in the compilation of an avionic program initially written in Scade.

[Figure: compilation pipeline — Scade design → C → RTL optimization → Assembly → Binary execution]

This compilation process may itself be wrong and introduce bugs, or more simply invalidate analysis results obtained on the source code. An analysis can therefore take place at the lowest-possible level, operating on the program as it is executed. This approach is notably taken by Reps et al. [BR08; RBL06] and applied in Chapter 4. Its main advantage is that there is no gap between the analyzed program and the run one. This is also its main drawback: the program is analyzed at a very low level, at which the programming language (if it can be called so) features little or no abstraction, and it is very hard to make any hypothesis about the program shape. Adequate approximations can hardly be chosen: what is the right abstract domain to analyze a given family of programs?
In addition, most static analyses are designed for higher-level languages which feature some abstractions (functions, variables, types…); and an analyzer can rely on these abstractions so as to be precise: as long as the analyzed program adheres to the abstractions, the analysis is as precise as possible; as soon as it does not, the analysis result, while correct, gets very coarse. It is therefore judicious to analyze, when this is possible (i.e., when the source code is available), a high-level form of the program and to prove that properties inferred on this high-level form are preserved by the compilation process. There are two approaches to establish such properties: verified compilation and translation validation. The use of a verified compiler can justify that properties established on the source program hold for all executions of the compiled program. A verified compiler is a compiler which comes with a correctness theorem; it is proved once and holds for all runs of the compiler. For instance, the CompCert compiler guarantees that the compilation does not introduce new behaviors in the compiled program and preserves all safety properties established on the source code. The source program must nonetheless be free of run-time errors, which fortunately is the kind of properties that a static analyzer may infer. Translation validation [PSS98; Nec00] consists in verifying that a particular result is correct for a particular program. This involves an additional computation on every execution of the analyzer. Such an approach has been applied to the verification of the Sparrow analyzer [Kan+14]. The example of CompCert shows that both methods can be combined: some compilation passes are certified while others are validated (register allocation, for instance).


1.3 Trusting a Static Analyzer

Relying on a static analyzer to justify that a given program is safe may look dubious. The static analyzer itself being a program, it is also subject to bugs. Worse, to believe that some program is safe, we now have to trust that another program is functionally correct, a property that is a priori much harder to establish. Again, abstract interpretation is a methodology to design sound static analyzers. So as to formally establish that a program cannot produce an error, or that it computes the right thing, we need to formally give a meaning to programs. The formal semantics of a programming language is a mathematical description of the meaning of programs. It precisely defines how a program is executed, what function it computes. The soundness proof of static analyzers, which links their results to the behaviors of the analyzed programs, is often a pen-and-paper effort, distinct from the implementation effort. This raises two issues: first, the proof effort is error-prone and difficult to check; second, there is a gap between the verified model and the implemented program. Both issues can be addressed by resorting to a proof assistant like Coq [Coq15]. Such a tool is first a skeptical theorem prover: it can be used to write statements and proofs, and given a proof term — a syntactic object that conveys the argument that some statement holds — Coq can check its validity. Even if the theorem prover features some automated proof search and decision procedures that help the user in the process of building proof terms, we only need to trust a reduced kernel that performs the proof checking. Coq is also a purely functional programming language: we can write, in a single formalism, programs and properties. These programs can be directly run inside Coq, but they can also be extracted to the OCaml programming language, so as to be linked with other programs which may perform inputs and outputs and communicate with the outer world.
The CompCert compiler is a notable example of a program that is extracted from a Coq development. The syntax and semantics of the programming languages manipulated by the compiler (from C to assembly) and the compilation passes are defined in Coq, so that properties, also written in Coq, can mention the actual code of the compiler to state its correctness. There is no longer a gap between the code that is verified and the code that runs. Mechanized proof has recently been applied with great success to various domains: mathematical results such as the Kepler conjecture [Hal+15], the Four Colour Theorem [Gon07], and the Feit-Thompson theorem [Gon13]; and the verification of implementations of operating systems [Kle+10]. Previous work on mechanized verification of static analyses has been mostly based on classic data-flow frameworks rather than on the more general abstract interpretation. Klein and Nipkow instantiate the data-flow framework for the inference of Java bytecode types [KN06]; Coupet-Grimal and Delobel [CD04] and Bertot et al. [BGL06] for compiler optimizations; and Cachera et al. [Cac+05] for control-flow analysis. Vafeiadis et al. [VN11] rely on a simple data-flow analysis to verify a fence elimination optimization for concurrent C programs. David Pichardie, in his Ph.D. [Pic05], proposes a methodology to verify the soundness of static analyzers based on abstract interpretation. He applies this methodology to analyzers which operate on mini languages and programs written for the purpose of illustration.


In this work, we extend this methodology and build and verify sound static analyzers, based on abstract interpretation, that analyze programs that can be executed: either binary code, or source code that can be compiled by a verified compiler; said compiler preserves the analysis results. Our analyses target realistic languages and take into account the subtleties inherent to the analysis of low-level languages.

1.4 Contributions and Structure of this Document

In this work we show how mechanized verification methods can be applied to the construction of certified static analyzers that target low-level languages, in particular intermediate representations of the CompCert compiler: CFG (in chapters 3 and 5) and C♯minor (in chapter 6); and a toy binary language (in chapter 4). Throughout this document, we describe several static analyzers that are all proved sound within the Coq proof assistant. The soundness is established with respect to a semantics of the target languages, also defined in Coq. Where relevant, the definitions of these semantics are shared with the CompCert compiler. Chapter 2 introduces the formal verification of static analyzers in Coq: how a programming language’s syntax and semantics can be defined, and what the general shape of an abstract-interpretation-based static analyzer and of its soundness proof is. We then present in chapter 3 a general methodology that is used throughout this document, and apply it to a simple language (a subset of the CFG intermediate representation of CompCert: no memory, no function calls, no floating-point values), and nonetheless get a value analyzer which yields results comparable to state-of-the-art analyzers. Following the same methodology, and reusing parts of the aforementioned analyzer (interfaces and implementations of numerical domains in particular), we study in chapter 4 the analysis of programs whose structure (notably their control-flow graph) is not known before the analysis. We address the issue of soundly disassembling binary programs, and build a verified static analyzer for a toy binary language. Despite being very small, this language, which is inspired by x86 assembly, features a variable-length instruction encoding and flag-based conditional branches. The analyzer is able to disassemble programs, even when instructions overlap or are produced during the program execution.
It then proves that analyzed programs are safe, i.e., cannot produce any error when they are run (errors correspond to decoding failures). The following chapter 5 comes back to the analysis of well-structured programs (given as an immutable control-flow graph) that operate on a loosely structured memory: variables are not simple disjoint entities but aggregated data (arrays, structures) accessed chunk-wise through pointers. Bringing it all together, chapter 6 is devoted to an analyzer for the C♯minor language, close to C, so as to automatically discharge the precondition of the CompCert correctness theorem.

All programs that are shown in this document are available on the companion web-site [Web]. This work is part of the Verasco ANR project. In particular, this document presents, for completeness, the following contributions of other Ph.D. students.

• The experimental evaluation of the CFG analyzer presented in Section 3.5 is due to André Maroneze [Mar14].


• The polyhedral domain of the Verasco analyzer, presented in Section 6.4.2, is due to Alexis Fouilhé [FB14].

• The intra-procedural iterator for C♯minor (§ 6.1), the functor from ideal integers to machine integers (§ 6.4.1), and the channel mechanism for communication between domains (§ 6.4.3), presented in chapter 6, are due to Jacques-Henri Jourdan [Jou+15].

Chapter 2 Context

In this chapter we briefly introduce the main tools our work is built on: static analysis based on abstract interpretation and the CompCert verified C compiler, both verified with the Coq proof assistant.

2.1 Introduction to abstract-interpretation-based static analysis in Coq

So as to describe the design, implementation and proof of static analyzers in Coq, we will first present how a programming language’s syntax and semantics can be defined. This will be the opportunity to highlight some aspects of Coq that are used throughout this work. Then we will discuss the abstract-interpretation framework that encompasses all the interpreters that will be presented in this document. All Coq snippets in this section are extracted from a short development (about a thousand lines of Coq) available on the companion web page [Web].

2.1.1 Syntax of a Toy Language

Let’s start with a (minimalist) toy language, called ge (as graph & expressions), with (unbounded) variables, arithmetic expressions, and basic instructions (assign and branch) arranged in a control-flow graph. Its formal abstract syntax is given in Figure 2.1. Program variables (type var) are represented by positive numbers. Expressions (type expr) are represented as abstract syntax trees whose leaves are constant values or variable names and whose nodes represent addition, multiplication or comparison of their subtrees.

Definition var : Type := positive.
Definition node : Type := positive.

Inductive expr : Type :=
| EConst (v: Z)
| EVar (x: var)
| EAdd (e₁ e₂: expr)
| EMul (e₁ e₂: expr)
| ELt (e₁ e₂: expr).

Inductive instr : Type :=
| IAssn (x: var) (e: expr) (s: node)
| IIf (g: expr) (s₁ s₂: node)
| IStop.

Definition prog : Type := Map [ node, instr ].

Figure 2.1: Syntax of ge, a toy control-flow graph language with structured expressions


Example fact (x y n: var) : prog := of_list (
     (1, IAssn n (EConst 0) 2)                    (* 1: n ← 0           *)
  :: (2, IAssn y (EConst 1) 3)                    (* 2: y ← 1           *)
  :: (3, IIf (ELt (EVar n) (EVar x)) 4 6)         (* 3: while (n < x) { *)
  :: (4, IAssn n (EAdd (EVar n) (EConst 1)) 5)    (* 4:   n ← n + 1     *)
  :: (5, IAssn y (EMul (EVar y) (EVar n)) 3)      (* 5:   y ← y × n }   *)
  :: (6, IStop)                                   (* 6: stop            *)
  :: nil).

Figure 2.2: Example ge program computing the factorial function

A program (type prog) is a finite map¹ from program points (type node) to instructions (type instr). Each instruction embeds the name of its successor nodes. An instruction is either an assignment IAssn x e s of an expression e to a variable x (and execution proceeds at node s), a conditional branch IIf g s₁ s₂ to node s₁ or s₂, depending on the value of the guard g, or IStop, which terminates the program execution. As an illustration, we can build the program fact, shown on Figure 2.2, which computes the factorial of x in variable y using the auxiliary variable n (the same program is given on the right in an informal concrete syntax). The of_list function builds a map from an association list. So as to be able to reason about such programs, we need to define the language semantics. To this end, we first introduce two aspects of Coq: how properties about sets can be expressed, and how we specify abstract data types, a cornerstone of modular developments.

2.1.2 Sets as Propositions

When used in specifications (as opposed to programs and computations), sets of values of some type A can be described by predicates over this type, i.e., functions of type A → Prop that map every value of type A to a proposition (of type Prop) that expresses whether this value is a member of the set. Figure 2.3 shows how various set operations are thus defined. Given a set X, i.e., a predicate over A, the fact that an element a is a member of this set is reflected by the property (X a), which we conveniently write “a ∈ X”. Set inclusion, intersection and union are described respectively as implication, conjunction and disjunction of membership facts. The empty set is the predicate that is always false. A singleton is defined as the equality to its element. Given two sets X and Y, their product X × Y is the set of pairs (x, y) such that x is a member of X and y is a member of Y. The term (Union A B X f) represents the set ⋃_{a ∈ X} f(a). We use the Notation keyword to define non-alphabetic symbols and infix notations. The Union operator is defined as usual, with a plain Definition; a notation for it will be defined later (after type classes are introduced).

2.1.3 Abstract Data Types

The Coq infrastructure proposes at least two mechanisms to define and use abstract data types: modules and records. The main difference is that records are first-class objects:

1 The notation Map [ K, V ] represents the type of finite maps whose keys have type K and values type V; it comes from the Containers library, contributed by Stéphane Lescuyer.


Notation "℘ A" := (A → Prop).
Notation "a ∈ P" := (P a).
Notation "X ⊆ Y" := (∀ a, a ∈ X → a ∈ Y).
Notation "X ∩ Y" := (λ a, a ∈ X ∧ a ∈ Y).
Notation "X ∪ Y" := (λ a, a ∈ X ∨ a ∈ Y).
Notation "∅" := (λ _, False).
Notation "{{ a }}" := (eq a).
Notation "X × Y" := (λ a, let (x, y) := a in x ∈ X ∧ y ∈ Y).

Definition Union (A B: Type) (X: ℘ A) (f: A → ℘ B) : ℘ B :=
  λ b, ∃ a, a ∈ X ∧ b ∈ f a.

Figure 2.3: Sets as propositions

abstracting over a module requires a functor (i.e., a construction specific to modules), whereas abstracting over a record is the usual for-all quantification. A decisive advantage of records is the type-classes mechanism built atop them (briefly described below). Here we focus on describing our use of records to specify, define and use abstract data types in this work. As an example, we define an abstract data type of monads. Given a type constructor M, its monad structure is defined by a record with two fields, named unit and bind. The first one takes as arguments a type A and a value a of this type, and builds the corresponding monadic value of type M A. The second takes as arguments two types A and B, and composes a monadic value of type M A with a function from A to M B. These functions are polymorphic: for instance, unit takes as argument any type and a value of this type. There is no specific concept of polymorphism in Coq: it is a particular instance of the dependent product. After their definition, the fields of the record take two additional arguments: the type constructor M and the record itself (of type monad M).

Record monad (M: Type → Type) : Type := {
  unit : ∀ A, A → M A;
  bind : ∀ A B, M A → (A → M B) → M B
}.

This defines a new type constructor, monad, that can be used to define generic programs over any monad. For instance, given a type constructor M and its monad structure S, we can define a generic lift that takes any function (f: A → B) into the corresponding lifted function (lift f: M A → M B). The Context keyword introduces hypotheses (or variables) that are shared by several definitions; their scope is bounded by sections. To ease readability, we define the usual notations ret and >>= for the two monadic operators. This program does not know what the actual monad is, but it can refer to the fields of the structure S. Underscores _ in terms are placeholders that are automatically filled by the type-inference engine of Coq.

Context (M: Type → Type) (S: monad M).
Notation "'ret' x" := (unit M S _ x).
Notation "x >>= f" := (bind M S _ _ x f).

Definition lift (A B: Type) (f: A → B) (m: M A) : M B := m >>= λ a, ret (f a).


To define an actual monad structure for some concrete type constructor, we have to define a particular record with the appropriate operators. For instance, the set monad can be defined as follows. The unit operator builds a singleton set and the bind operator is the Union that we have previously defined.

Definition set A := ℘ A.
Definition set_monad : monad set := {|
  unit := λ A a, {{ a }};
  bind := λ A B (X: set A) (f: A → set B), Union _ _ X f
|}.
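For comparison, a monad structure can be given to other type constructors in exactly the same way. The following sketch (ours, not part of the thesis development; the name option_monad is an assumption) instantiates the record with the option type, whose bind propagates failure:

```coq
(* Sketch: the option monad as another instance of the monad record.
   unit wraps a value in Some; bind short-circuits on None. *)
Definition option_monad : monad option := {|
  unit := λ A (a: A), Some a;
  bind := λ A B (m: option A) (f: A → option B),
            match m with Some a => f a | None => None end
|}.
```

This is the monadic structure implicitly at work in the >>= notation used by the semantics of Section 2.1.5, where evaluation may fail.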

In order to be able to prove facts about programs that use monads, we need to specify the monad abstract data type. As for the operators, the specification of an abstract data type can be defined as a record, whose fields are invariants of the data type. In the case of monads, the three monad laws can be specified as follows. This specification is parameterized by an equivalence relation e over monadic values (with the infix notation · === ·). The parameter E is a proof that this relation is actually an equivalence.

Context (e: ∀ A, relation (M A)) (E: ∀ A, Equivalence (e A)).

Record monad_spec : Prop := {
  unit_left : ∀ A B (a: A) (f: A → M B), (ret a >>= f) === f a;
  unit_right : ∀ A (m: M A), (m >>= (λ x, ret x)) === m;
  bind_assoc : ∀ A B C (m: M A) (f: A → M B) (g: B → M C),
    ((m >>= f) >>= g) === (m >>= (λ x, f x >>= g))
}.

Then, we can prove properties about generic programs; for instance, that the lifted identity is the identity (up to functional extensionality).

Context (S_correct: monad_spec).

Lemma lift_id A : lift A A id === id. Proof. intros x. apply S_correct. Qed.

Finally, to be able to use such a theorem for a particular monad instance, we have to prove the monad laws. In the case of the set monad, we prove them for a particular equivalence relation, eq_set, which is extensional.

Definition eq_set A : relation (set A) := λ X Y, ∀ a, a ∈ X ↔ a ∈ Y.
Lemma eq_set_equiv A : Equivalence (eq_set A).
Proof. vm_compute; firstorder. Qed.

Lemma set_monad_correct : monad_spec set set_monad eq_set eq_set_equiv. Proof. now split; vm_compute; firstorder; subst. Qed.


2.1.4 Type Classes

This is a brief introduction to type classes in Coq [SO08; SW11], presenting sufficient material to understand the uses of this feature in this work. Many aspects are not discussed, in particular how it relates to canonical structures — another Coq feature somehow similar to type classes — and how to automate proof search thanks to the inference mechanism. Type classes provide a facility for (static) overloading, i.e., using the same name for different functions, ambiguities being resolved from the context during type inference. As an example, consider join, a binary operator defined on several data-types (say A and B). These operators have similar properties and similar uses, and thus deserve the same name. The different operators could simply be named joinA and joinB, but the type of the arguments — hence which operator is meant — is often easily guessed from the arguments themselves. Also, it is convenient to use a notation — say x ⊔ y — for both operators. The basic idea is to add a level of indirection and define a generic join function:

Definition join (X: Type) (j: X → X → X) : X → X → X := j.

Then, we can use (join A joinA a a′) in lieu of (joinA a a′) and similarly (join B joinB b b′) in lieu of (joinB b b′). The type-classes feature provides a powerful inference mechanism that makes it possible to leave the arguments X and j of join implicit and simply write (join a a′) and (join b b′), or more conveniently (a ⊔ a′) and (b ⊔ b′) respectively, after the corresponding notation has been defined. Those expressions have holes that are automatically filled. This mechanism requires some annotations. The types of the holes need to belong to a declared class and the value that will fill this hole has to be declared an instance of this class. To come back to the example of join, the class of types with such an operator could be defined as follows.

Class join_op (X: Type) := join : X → X → X.

This reads as: a type X belongs to the class join_op only if there is an operator with type X → X → X. This operator can then be referred to using the generic name join. The inference mechanism does not pick any term with the right type, but only one that has been explicitly designated. To label a term of the right type as a join operator for a given type, the Instance keyword is to be used as follows.

Instance join_op_A : join_op A := joinA.

Then, when join is applied to arguments of type A, the inference machinery will kindly find the expected operator. The final step is to define a nice notation for this overloaded operator.

Notation "x ⊔ y" := (join x y).

Usually, in a given context, there is only one instance for a given class with a given set of arguments. The Global and Local keywords, used with the Section facility, make it possible to control the scopes of the instance declarations. Some instances can be parameterized by other instances. As an example, consider the data-type X+⊥ (a notation for botlift X) that adds a new element Bot to an existing type X.


Inductive botlift (X: Type) : Type := | NotBot (x: X) | Bot.

This element is meant to be an identity with respect to a join operator. Therefore, we can define a generic join operator for any lifted data-type that belongs to the join_op class:

Instance botlift_join (X: Type) (J: join_op X) : join_op (X+⊥) :=
  λ x y : X+⊥,
  match x, y with
  | Bot, _ => y
  | _, Bot => x
  | NotBot x′, NotBot y′ => NotBot (x′ ⊔ y′)
  end.

Now, in a context in which terms u and v have type B+⊥, the expression u ⊔ v, which actually is the term (join _ _ u v) with two holes, will be filled with respectively B+⊥ (from the types of u and v) and (botlift_join B join_op_B) (from the available instances). Similarly, given an instance for a lifted type (X+⊥), we could define an instance for the corresponding base type X:

Instance botunlift_join (X: Type) (J: join_op (X+⊥)) : join_op X :=
  λ x y : X, NotBot x ⊔ NotBot y.

This would be a bad idea in general. Indeed, instance inference may now loop: to find an instance of (join_op X), apply botunlift_join and botlift_join and start again.
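As a small illustration (ours, not taken from the thesis development; the instance name join_op_Z is an assumption), a join_op instance for Z can be declared with the maximum, and botlift_join then equips Z+⊥ with a join for free:

```coq
(* Sketch: a join operator on Z (the maximum), automatically lifted
   to Z+⊥ through the parameterized instance botlift_join. *)
Instance join_op_Z : join_op Z := Z.max.

(* Bot is an identity, and joins of NotBot values use Z.max: *)
Example join_bot : (NotBot 1 ⊔ Bot : Z+⊥) = NotBot 1 := eq_refl.
Example join_max : (NotBot 1 ⊔ NotBot 3 : Z+⊥) = NotBot 3 := eq_refl.
```

Here the inference mechanism resolves the ⊔ at type Z+⊥ to (botlift_join Z join_op_Z) without any further annotation.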

In this work, the main type classes are for: decidable equality; a to_string function, for debugging purposes, that produces a readable representation of its argument; lattice structure (maximal element, join, widening); concretizations; reductions; and abstract domains.

2.1.5 Concrete Semantics

With powerful Coq tools at hand, we can come back to our toy programming language and define its semantics. The semantics of expressions is given in Figure 2.4. Programs operate on an environment (type env) that maps program variables to integers. Truth values are encoded with integers zero and one. Any non-zero integer is considered true. The semantics of expressions is defined through an evaluation function, which can be seen as a big-step semantics. The meaning of programs is defined as a small-step operational semantics in Figure 2.5. The state of (the machine executing) the program (type exec_state) is made of a program counter (the node, in the program control-flow graph, corresponding to the next instruction to execute) and an environment. The step function returns the next execution state, or None when the execution is over or stuck (execution is stuck when the program point is not bound in the program control-flow graph, or when an expression evaluates an unbound variable). The run function executes at most n steps of a program. We can then define the set of reachable states from an initial state i when running a program p. A state is said to be initial when the program counter is 1; notice that the environment is not necessarily empty in an initial state (bound variables can be seen as input data). Finally, we define


Definition env : Type := Map [ var, Z ].

Definition of_bool (b: bool) : Z :=
  if b then 1 else 0.

Definition val_is_true (v: Z) : bool :=
  if v == 0 then false else true.

Definition zlt (a b: Z) : Z :=
  of_bool (a <? b).

Fixpoint eval_expr (e: expr) (ρ: env) : option Z :=
  let binop op e₁ e₂ :=
    eval_expr e₁ ρ >>= λ v₁,
    eval_expr e₂ ρ >>= λ v₂,
    ret (op v₁ v₂) in
  match e with
  | EConst v => ret v
  | EVar x => ρ[x]
  | EAdd e₁ e₂ => binop Z.add e₁ e₂
  | EMul e₁ e₂ => binop Z.mul e₁ e₂
  | ELt e₁ e₂ => binop zlt e₁ e₂
  end.

Figure 2.4: Semantics of ge expressions
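The evaluation function of Figure 2.4 can be exercised on a tiny example. The following sketch is ours (not part of the development); the map operations (empty _)[ k <- v ] and ρ[x] are assumed from the Containers library mentioned earlier:

```coq
(* Sketch: evaluating the expression 1 + x, where x is variable 1,
   in an environment binding variable 1 to 41. *)
Example eval_demo :
  let ρ : env := (empty _) [ 1 <- 41 ] in
  eval_expr (EAdd (EConst 1) (EVar 1)) ρ = Some 42.
Proof. vm_compute. reflexivity. Qed.
```

Evaluating an unbound variable instead returns None, which is precisely what makes an execution stuck in the small-step semantics below.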

Definition exec_state : Type := (node * env).

Definition step (p: prog) (st: exec_state) : option exec_state :=
  let (pc, ρ) := st in
  p [ pc ] >>= λ i,
  match i with
  | IAssn x e s =>
    eval_expr e ρ >>= λ v,
    let ρ' := ρ [ x <- v ] in
    Some (s, ρ')
  | IIf g t e =>
    eval_expr g ρ >>= λ v,
    let pc' := if val_is_true v then t else e in
    Some (pc', ρ)
  | IStop => None
  end.

Fixpoint run (p: prog) (n: nat) (st: exec_state) : exec_state :=
  match n with
  | O => st
  | S n′ =>
    match step p st with
    | Some st′ => run p n′ st′
    | None => st
    end
  end.

Definition reachable (p: prog) (i: exec_state) : ℘ exec_state :=
  λ e, ∃ n, run p n i = e.

Definition initial : ℘ exec_state :=
  λ e, fst e = 1.

Definition sem (p: prog) : ℘ exec_state :=
  Union initial (reachable p).

Definition stuck (p: prog) : ℘ exec_state :=
  λ e, step p e = None ∧ p[fst e] ≠ Some IStop.

Notation "⟦ p ⟧" := (sem p).

Figure 2.5: Semantics of ge programs

the reachability semantics sem of a program p as the set of states reachable (in some finite number of steps) from any initial state. An interesting characterization of this semantics is that it is the least post-fixpoint² of the following function, where of_opt casts the result of step into a set of states.

Definition SEM (p: prog) := λ s, initial_state ∪ Union s (of_opt ∘ step p).

This collecting semantics of programs is precise enough to express safety properties of programs such as “in all reachable states, variable x is non-negative” or “execution cannot be stuck”. In the following, we will describe how abstract interpretation enables the automatic proof of such properties.
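To make the collecting semantics concrete, the following Python sketch (an illustration only, not the thesis's Coq development) mirrors it on a hypothetical three-instruction program: states pair a program counter with an environment, step mirrors the small-step semantics, and sem iterates SEM(s) = initial ∪ { step(e) | e ∈ s } until the least fixpoint is reached.

```python
# Python sketch of the reachability semantics; the toy program below is
# hypothetical:  1: x := 7;   2: y := x * 2;   3: stop
PROG = {
    1: ("assign", "x", lambda env: 7, 2),
    2: ("assign", "y", lambda env: env["x"] * 2, 3),
    3: ("stop",),
}

def step(state):
    """One small step; None when execution is over or stuck."""
    pc, env = state
    instr = PROG.get(pc)
    if instr is None:
        return None                       # stuck: pc not bound in the graph
    if instr[0] == "assign":
        _, x, e, succ = instr
        new_env = dict(env)
        new_env[x] = e(new_env)
        return (succ, frozenset(new_env.items()))
    return None                           # stop: execution is over

def sem(initial):
    """Least fixpoint of SEM(s) = initial ∪ { step(e) | e ∈ s }."""
    states = set(initial)
    while True:
        nxt = {s2 for s in states if (s2 := step(s)) is not None}
        if nxt <= states:
            return states
        states |= nxt

reach = sem({(1, frozenset())})
print(sorted(pc for pc, _ in reach))      # program points reached
```

Since the toy program has no loop, the iteration stabilizes after a few rounds; with a loop, the same iteration would still terminate because the state space explored from one initial state is finite here.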

2.1.6 Abstract Semantics

Abstract interpretation is a wide topic that provides a general framework to design and prove sound static analyzers [CC77]. The meaning of “interpretation” is twofold. On the one hand, it is a way to “give a meaning to” programs, i.e., a non-standard semantics. On the other hand, it describes a way to execute programs. We use abstract interpretation as a tool for building sound static analyzers; we do not formalize this theory. Also, since we will only prove results about the soundness of the analyses (and none about their precision), we do not introduce abstraction functions nor Galois connections.

A static analysis computes facts about programs. These facts belong to some class specific to the analysis. This class is defined through a so-called “abstract domain”, a machine representation of these properties. This data-type is partially ordered by a relation le, noted · ⊑ ·. Each element of the abstract domain represents a collection of execution states: this is formalized through a concretization relation, usually called γ. The partial order of the domain reflects the precision of the elements: greater elements denote coarser properties (i.e., larger sets of execution states).

Context (A: Type) (le: A → A → bool) (γ: A → ℘ exec_state).
Infix "⊑" := le.
Context (le_sound: ∀ a a', a ⊑ a' → γ a ⊆ γ a').

The abstract semantics of a program p is defined as a (usually monotone) function f: A → A. It should be correct w.r.t. the concrete semantics in the following sense: given an abstract value a and a concrete state e in the concretization γ(a) of the abstract value, for every step of the program p from e to some state e', this final state e' is in the concretization of the abstract result f(a). This soundness condition corresponds to a commuting square (a and f a above, related by γ to e and e' below), in which the right-most γ edge is the conclusion.

Context (f_sound: ∀ a e,
  e ∈ γ(a) → step p e ⊆ γ(f a)).

In this setting, any post-fixpoint of the abstract semantics that also over-approximates initial states approximates all reachable states, as summarized by the following theorem.

² The least post-fixpoint of a function SEM is the smallest set s that satisfies (SEM s) ⊆ s.


(* Iterates f at most n times from a. *)
Fixpoint iter (n: nat) (a: A) : A * nat :=
  match n with
  | O => (a, n)              (* give up: no more fuel *)
  | S n' =>
      let a' := f a in       (* do one step *)
      if a' ⊑ a              (* post-fixpoint reached? *)
      then (a, n)
      else iter n' a'        (* continue *)
  end.

(* The possible analysis outcomes. *)
Inductive outcome : Type := OK | Fuel.

(* Analysis from an initial state. *)
Definition analyzer n (ι: A) : A * outcome :=
  let (π, n) := iter n ι in
  (π, if n == O then Fuel else OK).

(* Soundness of this generic analyzer. *)
Theorem analyzer_pfp n ι π :
  analyzer n ι = (π, OK) → f π ⊑ π.

Figure 2.6: Generic post-fixpoint solver
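For illustration, the fuel-based solver of Figure 2.6 can be transliterated into Python; the example domain at the end (finite sets of small integers, ordered by inclusion) is an assumption of this sketch, not part of the thesis.

```python
# A Python rendering of the generic post-fixpoint solver: iterate `f`
# at most `n` times from `iota`, and report OK only when a
# post-fixpoint f(a) ⊑ a is reached before the fuel runs out.

OK, FUEL = "OK", "Fuel"

def analyzer(f, le, n, iota):
    a = iota
    while n > 0:
        a2 = f(a)                 # do one step
        if le(a2, a):             # post-fixpoint reached?
            return a, OK
        a = a2
        n -= 1
    return a, FUEL                # give up: no more fuel

# Example on the powerset of {0..9}: f adds the successor of every
# element below 9; from {0} the iteration stabilizes at {0..9}.
f = lambda s: s | {x + 1 for x in s if x < 9}
le = lambda x, y: x <= y          # set inclusion
res, out = analyzer(f, le, 100, {0})
print(res, out)
```

With fuel 100 the fixpoint is found; with fuel 5 the same call would return the Fuel outcome, matching the behavior the analyzer_pfp theorem accounts for.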

Theorem soundness: ∀ a: A,
  initial_state ⊆ γ(a) → f a ⊑ a → ⟦p⟧ ⊆ γ(a).

We can then build a simple iterator, as shown on Figure 2.6, that starts from an initial state ι and repeatedly applies the abstract semantics until a post-fixpoint is found. This iterator may not terminate: if the abstract semantics is not monotone, if ι ⊑ f(ι) does not hold (i.e., the sequence is not initially increasing), or if the domain A has infinite ascending chains. Therefore, so as to guarantee the termination of the analyzer, we rely on fuel, i.e., a counter that decreases on every recursive call; when the analyzer runs out of fuel, it gives up.

However, we take care to design analyses such that all three conditions hold. In particular, when dealing with domains with infinite (or very long) ascending chains, we use widening operators so as to accelerate the iteration and ensure its termination. A widening is a binary operator, usually written · ∇ ·, that enjoys the following condition: for each increasing sequence (xᵢ)ᵢ∈ℕ (i.e., ∀i ∈ ℕ, xᵢ ⊑ xᵢ₊₁), the sequence (yᵢ)ᵢ∈ℕ defined as y₀ = x₀ and ∀i ∈ ℕ, yᵢ₊₁ = yᵢ ∇ xᵢ₊₁ is such that there is a rank k after which the sequence is constant (i.e., ∀i ∈ ℕ, yₖ₊ᵢ = yₖ) [CC77].

2.1.7 Flow-Sensitive Analyzer

The generic iterator that we have just shown is somewhat too general. All analyzers that we will describe in this document have a more specific shape: the abstract domain (of some type Σ) only represents environments (rather than execution states). From this domain, we build an abstraction of states by attaching to each program point a value of this domain: the previous framework is instantiated with the type Map [ node, Σ ]. The abstraction for environments comes with two operators (in addition to the usual partial order and join) that model the two fundamental instructions of the language: assignments and conditional branches.
These operators may return no value when they discover a contradiction: for instance, a call to (assume e b σ) may prove that the expression e cannot evaluate, in any concrete environment in the concretization of σ, to a value whose boolean interpretation is b; in such a case, it returns None. These operators come with soundness conditions (in which γo is the concretization relation of Σ lifted to options, and ρ[x <- v] is the map ρ updated at key x with the value v).


Context
  (assign: var → expr → Σ → option Σ)
  (assume: expr → bool → Σ → option Σ)
  (assign_sound: ∀ x e σ ρ v,
     ρ ∈ γ(σ) → v ∈ eval_expr e ρ →
     ρ[x <- v] ∈ γo (assign x e σ))
  (assume_sound: ∀ g σ ρ v,
     ρ ∈ γ(σ) → v ∈ eval_expr g ρ →
     ρ ∈ γo (assume g (val_is_true v) σ)).

The abstract semantics f of a program p performs one execution step from an abstract state π. For every program point n of the program, the new abstract environment at (f π)[n] is computed as the join of all states coming from the predecessors of n. The map predecessors p attaches to each program point n the set of nodes k such that there is an edge in p from k to n. The function mapio applies a function f to each binding in a map to build a map with the results; when f returns None, the binding is removed from the resulting map. The function set_map_reduce applies a function to each element of a set in parallel, then joins all the results.

Context (p: prog).

Definition f (π: Map [ node, Σ ]) : Map [ node, Σ ] :=
  mapio (λ (n: node),
    set_map_reduce (λ k,
      π[k] >>= λ σ,
      match p[k] with
      | Some (IAssn x e _) => assign x e σ
      | Some (IIf g t e) => assume g (t == n) σ
      | _ => None
      end)
  ) (predecessors p).

Definition fs_analyzer (n: nat) (ι: Map [ node, Σ ]) : Map [ node, Σ ] * outcome :=
  analyzer _ fs_leb (λ π, ι ⊔ f π) n ι.

Definition γ_fs (π: Map [ node, Σ ]) : ℘ exec_state :=
  λ e, snd e ∈ γo(π[fst e]).

Theorem fs_analyzer_sound n ι π :
  initial_state ⊆ γ_fs(ι) → fs_analyzer n ι = (π, OK) → ⟦p⟧ ⊆ γ_fs(π).

The flow-sensitive analyzer is finally defined by specializing the generic analyzer with the function λ π, ι ⊔ f π. This function is the abstract counterpart of the concrete SEM previously defined. It therefore yields a sound analyzer, as stated by the fs_analyzer_sound theorem: if the initial abstract state is a correct approximation of the concrete initial states, the analysis result is a correct abstraction of the program behaviors. The γ_fs concretization relation lifts the concretization γo of abstract environments to flow-sensitive abstract states.

The analysis could be defined as a post-fixpoint of other functions: using λ π, π ⊔ f π would also work. Moreover, if the domain Σ has infinite ascending chains, we could define a widening operator ∇ and compute a post-fixpoint of λ π, π ∇ f π. However, since these two last functions are not known to be monotone, there is no guarantee that their post-fixpoints (if any) are greater than the initial abstract state ι. Therefore, the analyzer should check this fact in order to be sound.


More generally, there are more subtle ways to compute a solution to this analysis problem than iterating the abstract semantics. One of them is described by Bourdoncle [Bou93] along with a suitable widening operator.
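As an illustration of the widening condition stated earlier, here is a minimal interval widening in Python (a standard textbook operator, not taken verbatim from the thesis): any bound that is still moving is extrapolated to infinity, so the iterates yᵢ₊₁ = yᵢ ∇ xᵢ₊₁ of an increasing chain stabilize after finitely many steps.

```python
# Illustrative interval widening: an unstable bound jumps to ±infinity,
# which cuts every infinite ascending chain of intervals short.
INF = float("inf")

def widen(y, x):
    (a, b), (c, d) = y, x
    return (a if a <= c else -INF, b if b >= d else INF)

# Analyzing `i := 0; while i < n: i := i + 1` with unknown n yields the
# infinite chain [0,0] ⊑ [0,1] ⊑ [0,2] ⊑ ... ; with widening the
# iterates stabilize immediately:
y = (0, 0)
for x in [(0, 1), (0, 2), (0, 3)]:
    y = widen(y, x)
print(y)   # the upper bound has been widened to infinity
```

The price of termination is precision: the result [0, +∞) is coarser than any interval of the chain, which is why real analyzers combine widening with a subsequent narrowing phase.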

Example: definite initialization

This flow-sensitive analyzer can be used to define data-flow analyses; for instance, the following one computes, for each program point, the set of variables that are known to be initialized when the execution reaches said point. In this domain, an abstract value is a set of variables. These sets are ordered by reverse inclusion, the greatest set being the empty one. The join operation is set intersection. Assignment of variable x with the result of the evaluation of expression e is modeled, in an abstract state m, as adding x to the set m. The assume operator is just the identity. Finally, to analyze a program, we run the analyzer from the initial state [][1 <- {}] (the empty map to which one binding is added, from the node 1 to the empty set {}), which maps the initial node to the greatest element (no variable is (known to be) initialized before the execution starts) and other nodes to nothing (they may be unreachable).

Definition di : Type := set var.
Definition di_leb (x y: di) : bool := subset y x.
Definition di_join (x y: di) : di := inter x y.
Definition di_assign (x: var) (e: expr) (m: di) : option di := Some (add x m).
Definition di_assume (g: expr) (b: bool) (m: di) : option di := Some m.

Definition di_analyze (p: prog) (n: nat) : Map [ node, di ] * analyze_t :=
  analyze di di_leb di_join di_assign di_assume p n [][1 <- {}].

This abstract domain is specified through the following concretization relation: any variable x in an abstract set a is bound in all environments ρ represented by this set.

Definition di_gamma (a: di) (ρ: env) : Prop :=
  ∀ x, x ∈ a → ρ[x] ≠ None.

Given an analysis result, we can perform a safety check and establish, for instance, the following safety property: any program whose analysis result π satisfies the safety check cannot be stuck.

Theorem is_safe_not_stuck p n π :
  di_analyzer p n = (π, OK) → is_safe p π = true → ⟦p⟧ ∩ stuck p ⊆ ∅.

The safety check is_safe verifies that the control-flow graph is properly defined (no successor of an instruction is outside the graph) and that all free variables of expressions are definitely initialized when the execution evaluates said expressions. This analyzer coupled with the safety check cannot prove that our fact program is safe: indeed, it is not safe if the variable x is not initialized in the initial environment. We could refine the analyzer to take into account some additional knowledge about the initial execution state, so as to establish facts like: “the program is safe provided the variable x is initially bound”. Notice that the correctness property of the safety analysis does not involve the concretization relation of the abstract domain: it only refers to the semantics of the programming language.
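The analysis above can be mimicked in a few lines of Python (the five-node program is hypothetical, and the worklist-free chaotic iteration is a simplification of the Coq development): abstract values are sets of definitely-initialized variables, ordered by reverse inclusion and joined by intersection, and an assignment to x adds x to the set.

```python
# Definite-initialization sketch on a hypothetical mini-CFG.
# prog: node -> ("assign", var, successors) | ("if", successors)
PROG = {
    1: ("assign", "x", [2]),
    2: ("if", [3, 4]),          # branch on some input
    3: ("assign", "y", [5]),
    4: ("assign", "z", [5]),
    5: ("assign", "r", []),
}
ALL_VARS = {"x", "y", "z", "r"}

def analyze(prog):
    # Top is the set of all variables (di_leb x y = y ⊆ x); node 1
    # starts with {}: nothing is initialized before execution begins.
    pi = {n: ALL_VARS for n in prog}
    pi[1] = set()
    changed = True
    while changed:
        changed = False
        for n, instr in prog.items():
            out = pi[n] | {instr[1]} if instr[0] == "assign" else pi[n]
            for s in instr[-1]:
                new = pi[s] & out        # join (intersection) over predecessors
                if new != pi[s]:
                    pi[s] = new
                    changed = True
    return pi

result = analyze(PROG)
print(result)
```

At node 5 only x is definitely initialized: y and z are each assigned on a single branch, so the intersection at the join point discards them, exactly the behavior di_join prescribes.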


Such an analysis does not need all the power of abstract interpretation. We discuss next a more involved class of analyses.

2.1.8 Non-Relational Value Analysis

This particular class of analyses computes, for each program variable, an abstraction of the values it may hold; for instance, an interval. We will build such analyses on top of the flow-sensitive analyzer that we have just defined. This analyzer is parameterized by the domain of abstract values of type V, with a partial order and a join operator. The meaning of abstract values is defined by a concretization relation γ: V → ℘(Z). The flow-sensitive analyzer is instantiated with the type Σ := Map [ var, V ], with the convention that variables that are not bound may have any value. We then implement, in addition to the ordering and join operators, the abstract transfer functions assign and assume. To do so, we first define forward and backward evaluations of expressions, relying on abstract forward and backward operators modeling the basic concrete operators.

The forward evaluation function is very similar to its concrete counterpart and enjoys a soundness property: the evaluation of an expression e in an abstract state a yields a result that over-approximates all concrete results of the evaluation of e in any environment represented by a. The backward evaluation is more interesting. It takes as arguments an expression e and an abstraction res of the result of the evaluation of this expression. It is a reduction: it receives an abstract state a and tries to return a smaller one after taking into account the information that e is known to evaluate to res.

As an example, suppose a sign domain, with six abstract values: Z for zero, P (resp. N) for strictly positive (resp. negative) numbers, PZ (resp. NZ) for non-negative (resp. non-positive) numbers, and ⊤ (read top) for all numbers. These values are organized in a lattice, with ⊤ at the top, NZ and PZ below it, and N, Z, P at the bottom.
The abstract backward multiplication operator, when applied to three abstract values x, y and z, returns better approximations for x and y, taking into account that the concrete multiplication returns a value represented by z. More specifically, (ab_mul_bw PZ PZ P) returns Some (P, P): the product of two non-negative numbers is non-zero only if both numbers are also non-zero. The call (ab_mul_bw P P Z) returns None: no product of positive numbers can be zero.

Equipped with a backward operator for each primitive of the expression language, we build a backward evaluation function. It works backwards: starting from the result, it applies binary operators until it reaches the expression leaves. To evaluate a compound expression such as (EMul e₁ e₂), the backward operator for multiplication is applied first; this yields approximations for the results of the sub-expressions, which are then recursively evaluated. Evaluating a variable sets it in the abstract environment.

Finally, we can implement the assume function: to take into account that an expression is false when evaluating the else branch of a conditional, the guard is evaluated backwards from the abstraction of the boolean false (i.e., the integer zero). The case of the then branch is more involved, since we have no easy way to abstract all non-zero values. When the guard is a comparison e₁ < e₂, we can assume that a negated form of this expression, e₂ < e₁ + 1, evaluates to false, and do as in the case of the else branch.
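The two ab_mul_bw examples from the text can be reproduced with a small Python sketch; the encoding (GAMMA, alpha, a brute-force search over a finite sample of integers) is an assumption of this illustration, not the thesis's implementation, and brute force over a sample is only adequate for the sign domain demonstrated here.

```python
# Sign domain sketch: refine x and y given that x * y must lie in z.
SAMPLE = range(-3, 4)
GAMMA = {
    "Z":  lambda v: v == 0,
    "P":  lambda v: v > 0,  "N":  lambda v: v < 0,
    "PZ": lambda v: v >= 0, "NZ": lambda v: v <= 0,
    "T":  lambda v: True,   # top
}

def alpha(vals):
    """Most precise sign value whose concretization covers `vals`."""
    for name in ["Z", "P", "N", "PZ", "NZ", "T"]:
        if all(GAMMA[name](v) for v in vals):
            return name
    return "T"

def ab_mul_bw(x, y, z):
    """Backward multiplication; None signals a contradiction."""
    xs = {vx for vx in SAMPLE if GAMMA[x](vx)
          and any(GAMMA[y](vy) and GAMMA[z](vx * vy) for vy in SAMPLE)}
    ys = {vy for vy in SAMPLE if GAMMA[y](vy)
          and any(GAMMA[x](vx) and GAMMA[z](vx * vy) for vx in SAMPLE)}
    return None if not xs or not ys else (alpha(xs), alpha(ys))

print(ab_mul_bw("PZ", "PZ", "P"))   # ('P', 'P'): matches Some (P, P)
print(ab_mul_bw("P", "P", "Z"))     # None: the contradiction case
```

Both calls agree with the behavior described above: knowing the product is strictly positive refines both non-negative factors to strictly positive, and a zero product of positive factors is impossible.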


To come back to the example program of Figure 2.2, the value analysis using the sign domain yields in particular the following results: the final value of y is strictly positive, the final value of n is non-negative, and inside the loop (program points 4 and 5), the value of x is strictly positive. This last property is inferred by the backward evaluation of the loop guard, even though x is an input value never set in the program.

2.2 CompCert

CompCert [Ler06] is a full-fledged compiler for the C programming language, targeted at the compilation of safety-critical applications (e.g., fly-by-wire flight control systems). It has been formally specified and proved correct within the Coq proof assistant: each intermediate language comes with a formal description of its semantics and, for each compilation pass, a theorem states that this pass does not introduce new behaviors in the compiled program.

Compilation is performed in many steps, doing one transformation or optimization at a time, and going through various intermediate representations: CompCert C, Clight, C♯minor, then RTL. The main stages of the front-end are as follows. The source language is called CompCert C, a sibling of C99. Then comes Clight, in which all expressions are pure (i.e., without side effects). The next step is C♯minor, which has a simpler structure (e.g., only one kind of loop) and no overloaded operators. Finally, stack-allocated variables are merged into a single stack block, control is represented as a graph, and expressions are transformed into three-address code: this leads to the RTL language. This last language of CompCert's front-end is architecture dependent: the instruction set depends on the architecture for which the program is compiled. The back-end, which compiles RTL towards assembly, is less relevant to this document [Ler09].

For the purpose of building static analyzers into CompCert, an additional intermediate representation has been introduced between C♯minor and RTL. It is called CFG³; it is structured as a graph (like RTL) but features a rich expression language (like C♯minor). We will highlight some properties of these languages.

CompCert C

CompCert C deviates from the C99 standard [C99] in various ways. In particular, operations on machine integers are (almost) completely defined in CompCert. Notable exceptions include division by zero and logical shifts by negative quantities. Many more operations have undefined behaviors in C99, for instance overflows in signed arithmetic. Another important example is the strict aliasing rule, which roughly says that no two pointers of different types alias. The following program has undefined behavior in standard C (and its execution leads to an assertion violation when compiled with gcc or clang at optimization level 2), as in function check, pointers h and k alias but have distinct types. However, in CompCert C, this program is well-defined.

³ CFG is also a common abbreviation for “control-flow graph”; in this document however, it is used consistently as the name of the CompCert intermediate representation.


Inductive expr : Type :=
| Evar (x: ident)
| Econst (cst: constant)
| Eunop (op: unary_operation) (e: expr)
| Ebinop (op: binary_operation) (ℓ r: expr)
| Eload (κ: memory_chunk) (addr: expr)
| Econdition (b ℓ r: expr).

Inductive constant : Type :=
(* numerical constant *)
| Ointconst (i: int) | Ofloatconst (f: float)
(* address of the symbol plus the offset *)
| Oaddrsymbol (g: ident) (ofs: int)
(* stack pointer plus the given offset *)
| Oaddrstack (ofs: int).

Inductive unary_operation : Type :=
| Onegint | Oboolval | …

Inductive binary_operation : Type :=
| Oadd | Osub | Omul | …
| Ocmp (c: comparison)
| Ocmpu (c: comparison)

Inductive comparison : Type :=
| Ceq | Cne | Clt | Cle | Cgt | Cge.

Inductive instruction : Type :=
| Iskip (s: node)
| Iassign (x: ident) (e: expr) (s: node)
| Istore (κ: memory_chunk) (a e: expr) (s: node)
| Iifthenelse (g: expr) (s₁ s₂: node)
| Iswitch (e: expr) (tbl: list (int * node)) (s: node)
| Icall (x: option ident) (sig: signature) (f: expr) (args: list expr) (s: node)
| Ireturn (r: option expr)
| …

Figure 2.7: Syntax of CFG programs

void check (int *h, long *k)
{
  *h = 5;
  *k = 6;
  assert ( *h == 6 );
}

int main (void)
{
  long k;
  check((int *)&k, &k);
  return 0;
}

The CFG Intermediate Language

The syntax of CFG expressions is shown in Figure 2.7. They include reading local variables, constants and arithmetic operations, reading store locations, and conditional expressions (similar to b ? e₁ : e₂ in C). The set of unary and binary operators is quite large so as to handle the diversity of values: machine integers, floats and pointers. The instructions of the language are arranged in a graph: each program point is bound to an instruction which holds the names of its successors. They feature GOTOs (Iskip), assignments of temporary variables (Iassign), memory writes (Istore), conditional branches (Iifthenelse), multiway branches (Iswitch), and function calls and returns.

The C♯minor Intermediate Language

C♯minor has expressions very similar to CFG; their syntax is therefore not shown. The only difference is that taking the address of a variable is an expression in C♯minor whereas it is a constant in CFG.

The syntax of statements however is very different; it is shown on Figure 2.8. A statement is a tree, and every function is made of only one such statement. This language features


Inductive stmt : Type :=
| Sskip
| Sset (x: ident) (e: expr)
| Sstore (κ: memory_chunk) (a e: expr)
| Sseq (s₁ s₂: stmt)
| Sifthenelse (g: expr) (s₁ s₂: stmt)
| Sloop (body: stmt)
| Sblock : stmt → stmt
| Sexit (d: nat)
| Scall (x: option ident) (sig: signature) (f: expr) (args: list expr)
| Sreturn (r: option expr)
| …

Figure 2.8: Syntax of C♯minor statements

as control structures: sequences, conditionals, infinite loops, nested blocks and associated exits, as well as function calls, switches, and GOTOs to named labels (not shown).

Soundness Theorem

The soundness theorems of the compilation passes are of the following form:

∀p : good, ∀p′, Compile(p) = ⌊p′⌋ → ⟦p′⟧ ⊆ ⟦p⟧

In this statement, ⟦p⟧ is the set of all behaviors of program p, according to the semantics of the language in which p is written, and good is the set of programs whose behavior cannot go wrong.

The behaviors of programs, or their observable semantics, are defined independently of the programming language: this enables comparing the behaviors of programs written in several languages, as is the case of a source program and the corresponding output of the compiler. A program has four possible kinds of behaviors:

• it produces a finite sequence of events and returns a final value;

• it produces a finite sequence of events and its execution continues forever without emitting any further event;

• it produces an infinite sequence of events; or

• it produces a finite sequence of events and then goes wrong (i.e., its execution is stuck).

An event is either a call to an external function, a volatile memory access (i.e., a read from or a write to a variable whose declaration is annotated with the C keyword volatile), or the execution of an annotation (i.e., a special kind of instruction, whose execution has no other effect than being visible in the semantics).

The soundness property ensures that, for every program whose execution cannot go wrong, the compilation does not introduce new behaviors. The compiler is free however to remove behaviors from source programs. Also, it can turn a wrong program (i.e., one whose execution can get stuck) into a good one (i.e., one whose execution cannot get stuck). Two examples of such programs are given on Figure 2.9. In the first one, the execution gets stuck when evaluating the expression j + 1. The + operator is statically resolved, according to the declared types of its arguments, as the addition on integers. At run-time, variable j


1 int
2 main(void)
3 {
4   int i;
5   int *pi = &i;
6   int j = (int)pi;
7   int k = j + 1; /* Error! */
8   return 0;
9 }

1 int
2 main(void)
3 {
4   int x, y;
5
6   (&y)[0] = 1;
7   (&x)[1] = 2; /* Error! */
8   return y;
9 }

Figure 2.9: Two C programs that get more defined when compiled by CompCert

evaluates to a pointer, so that the addition cannot proceed. However, in C♯minor, there is only one operator for the addition of integers and pointers: therefore, after a few compilation passes, the program only has good behaviors. In the program on the right of the figure, the store on line 7 is out-of-bounds. However, during the compilation of this program, CompCert first allocates the two stack variables x and y contiguously, so that the offending store becomes well-defined as an overwrite of variable y. Further optimizations will then remove both variables and yield a very simple program that immediately returns 2. Another example is the expression &(*p) when p is a null pointer. It is undefined at the Clight level but is simplified to p in C♯minor⁴.

Machine Arithmetic

Most integer types in programming languages are bounded, with arithmetic overflows treated either as run-time errors or by “wrapping around” and taking the result modulo the range of the type. In C♯minor, integer arithmetic is defined modulo 2^N, with N = 32 or N = 64 depending on the operation. Moreover, C♯minor does not distinguish between signed and unsigned integer types: both are just N-bit vectors. Some integer operations such as division or comparisons come in two flavors, one that interprets its arguments as unsigned integers and the other as signed integers; but other integer operations such as addition and multiplication are presented as a single operator that handles signed and unsigned arguments identically.
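This single-operator view can be checked with a small illustration of two's-complement wraparound (the names N, MOD and to_signed are mine, chosen for this sketch): one modular addition gives the correct result under both the signed and the unsigned reading of its arguments.

```python
# N-bit modular arithmetic: a single addition, no signed/unsigned
# variants, consistent with the C♯minor description above.
N = 32
MOD = 1 << N

def to_signed(v):
    """Read an N-bit vector as a signed (two's complement) integer."""
    return v - MOD if v >= MOD // 2 else v

a, b = MOD - 1, 2            # as unsigned: 2^32 - 1; as signed: -1
s = (a + b) % MOD            # the one and only addition
print(s, to_signed(s))       # 1 under both interpretations: -1 + 2 = 1
```

Division and comparison, by contrast, genuinely depend on the interpretation (e.g., MOD - 1 is the largest unsigned value but the signed value -1), which is why those operations come in two flavors.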

Memory Model

The concrete memory model of CompCert defines the behavior of the C store (of type Mem.mem), which encompasses the heap (dynamically allocated by calling malloc), the stack (holding local variables that cannot be allocated to machine registers), and global variables. This store is organized in blocks, each having a unique name. A fresh block is allocated by a call to Mem.alloc (which never fails) and released by a call to Mem.free (which may fail). A particular byte in a given block can be referred to by an (unsigned) 32-bit machine integer. Therefore, a pointer is a value (Vptr b ofs) that represents the address of the byte at offset ofs in the block b.

⁴ This expression is well-defined in C99, and is equivalent to a null pointer [C99, § 6.5.3.2].


An access to the content of a block, either to read from it (Mem.loadv) or to write to it (Mem.storev), is defined by a pointer and a chunk, which is one of the following variants.

Inductive memory_chunk : Type := | Mint8signed | Mint8unsigned | Mint16signed | Mint16unsigned | Mint32 | Mfloat32 | Mfloat64.

It describes the type of the data being transferred, its size, and its signedness. Many memory operations may fail, e.g., dereferencing a dangling pointer. This is modeled through a permission system: each pointer (i.e., each block·offset pair) is mapped to a permission. The permissions are, in increasing order:

None: dangling pointer;

Nonempty: valid pointer, comparisons are permitted (e.g., pointer to a function);

Readable: loads are also permitted (e.g., constant static data);

Writable: stores are also permitted (e.g., global variable);

Freeable: free is also permitted (e.g., stack-allocated local variable).

This permission system is very fine-grained: a different permission may be given to every offset in a block. During the execution of a C program, actual permissions do not use all this flexibility. Blocks are like arrays: they have bounds such that there is no permission outside the bounds and all offsets within the bounds have the same permission.

Since the Freeable permission implies the Writable one, this permission system does not enable modeling const-qualified local variables, which are read-only but may be freed on function exit. Some other properties cannot be expressed in this model, for instance sizes of nested arrays, as illustrated by the following example. In C99, this program has undefined behavior (the array denoted by a[1] only has five elements), whereas in CompCert it returns 42.

static int a[4][5];

int
main(void)
{
  a[1][7] = 42; /* Error */
  return a[2][2];
}
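The reason the out-of-bounds store lands on a[2][2] is plain row-major index arithmetic on the single block holding the global array (this is an illustration of the layout, not CompCert code):

```python
# `a` is one contiguous block of 4 rows of 5 ints; in row-major layout
# the element a[i][j] lives at flat index i * COLS + j.
COLS = 5

def flat(i, j):
    return i * COLS + j

print(flat(1, 7), flat(2, 2))   # both flat indices are 12
```

Since 1 * 5 + 7 = 2 * 5 + 2 = 12, the store a[1][7] = 42 writes exactly the cell later read as a[2][2], hence the program returns 42 under CompCert's per-block bounds.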

2.3 Conclusion

Verified construction of sound static analyzers based on abstract interpretation is well understood when applied to small and simple languages [Pic05]. The development of the CompCert compiler has shown that the complexity of programming languages such as C is within the scope of formally verified tools. In the following chapters, we describe how we have built and verified static analyzers, on the model of the ones presented in this chapter, for programs written in CompCert intermediate representations: Chapters 3 and 5 present analyses for the CFG language, and Chapter 6 presents an analyzer of C♯minor. This

methodology also applies at the lowest level of programming languages, binary code, as shown in Chapter 4.

Chapter 3 Modular construction of static analyzers of a C-like language in Coq

The static analyzer Astrée, which operates on embedded programs written in C, is an impressive achievement. Its soundness is formally assessed in the abstract interpretation framework. Unfortunately, the soundness proof is neither machine-checked nor directly related to the actual implementation. On the other hand, various static analyzers have recently been implemented and proved correct within a proof assistant [KN06; Cac+05; VN11; CP10; Nip12; RL12]. However, most of these works apply to simple, idealized languages. The CompCert compiler verification has shown that providing a machine-checked soundness proof of a large piece of software, dealing with the semantics of real-life languages, is a realistic task. Moreover, various components of this compiler are of great interest for implementing a static analyzer and can be reused as-is: in particular, the intermediate languages — together with the formal definition of their semantics, defined with respect to a formal memory model [LB08] — and the (verified) front-end that brings source code (from files) into such representations.

In this chapter, we present the implementation and machine-checked verification of a static analysis for a language close to C in terms of realism and complexity. It is built on CompCert and operates on one of its intermediate languages, CFG. This demonstrates the usability of theorem proving in the static analysis of real programs. The implementation focuses on the modularity of its design, with precise interfaces to promote flexibility (various components may be composed in various ways to yield a range of analyses, each with its own precision and cost), extensibility (it is easy to replace a component or to add a new one) and reuse (some components may be reused as-is or with little modification, e.g., in the other analyzers presented in the following chapters). In addition, the proof effort benefits from such modularity: each component has a clear interface and a precise specification; the proof of each component is therefore independent of the proofs of the other components.

The implementation of such a value analysis faces the following difficulties.

Machine Arithmetic

The object language operates on machine integers rather than on ideal unbounded integers. The analysis has to be sound (and hopefully precise) even in case of arithmetic overflows.

Pointer Arithmetic

Low-level languages, such as every intermediate representation in CompCert, allow complex expressions to describe memory locations, and more generally feature a subtle arithmetic of pointer values. This mandates the use of dedicated techniques. In

this chapter, we only set down the required architecture, along with an implementation that does not track the contents of the memory; the actual implementation of a precise domain devoted to memory will be discussed in Chapter 5.

Extensibility and Reuse

All components of the analyzer shall be clearly separated, with precise interfaces and specifications. Therefore, further components may be developed and proved independently, and optionally plugged into the analyzer. In addition, some components that are of independent interest — e.g., a numerical abstract domain — can be reused in different analyzers, as we will do, for instance, in the next chapter.

Mitigation of the proof effort

We have at hand various techniques to assess various properties of the analyzer: proof of the implementation, proof of a validator that treats the implementation as an untrusted oracle, and testing the implementation on some inputs. The first two methods can be applied to critical properties (soundness), whereas the last method can be applied to non-critical properties (termination, precision, cost as analysis time and memory usage). Each method has its own advantages and drawbacks. In particular, quantitative and hard-to-define properties (such as performance) are empirically evaluated by comparing the behaviors of different implementations on the same inputs.

Overview of the analysis The static analysis described in this chapter is a value analysis: it computes an over-approximation of the values held in the variables of the analyzed program. It is flow-sensitive: there is one result for every program point; the result for a given variable at a given program point represents at least all possible values of this variable when the execution reaches this program point. It operates on the CFG intermediate language, described in Section 2.2. The analyzer is built as sketched in Figure 3.1 (white boxes represent the interfaces of each component; red boxes are the implementations of these interfaces that are discussed in this chapter; dotted boxes are other implementations that will be incorporated in following chapters to enhance the precision of the analyzer). Most of the components share a common interface of abstract domain (adom) that we describe in Section 3.1. The main component, the CFG iterator, is parameterized by the memory abstraction. Our implementation uses an external solver, whose result is validated. This iterator is the subject of Section 3.2. The implementation of the memory abstraction is parameterized by a relational numerical domain. The interface of numerical domains is given in Section 3.3, along with methods to build them, and in particular how to combine them. Finally, an experimental evaluation of the analyzer is presented in Section 3.5. A short version of this chapter has been published at the international Static Analysis Symposium (SAS) [Bla+13].

3.1 Abstract Domain Library

This section describes the library we have designed to represent our abstract domains. First, it defines generic abstract domains. Then, it details the interval abstract domain. Last, it explains how to combine abstract domains.


[Figure: architecture diagram. Interfaces: abstract domain, numerical abstraction, numerical environment abstraction, memory abstraction. Implementations: signed intervals, unsigned intervals, congruences and their reduced product; a non-relational environment abstraction (with relational abstract domains as an alternative); a local memory abstraction (with Miné's memory abstraction as an alternative); the CFG analyzer.]

Figure 3.1: Overview of the architecture of the analyzer

3.1.1 Abstract Domain Interface

The basic structure of an abstract domain is a (weak) lattice with widening, as described by the following signature.

Class weak_lattice (A: Type) : Type := {
  leb: A → A → bool;
  top: A;
  join: A → A → A;
  widen: A → A → A
}.

Notation "x ⊑ y" := (leb x y).
Notation "⊤" := top.
Notation "x ⊔ y" := (join x y).
Notation "x ∇ y" := (widen x y).

Figure 3.2: Signature of weak lattices with widening

Here, A is the type of abstract values. A record of type (weak_lattice A) holds various operators. Values of type A are partially ordered by a comparison operator called leb; a distinguished element of type A is called top and is meant to represent the greatest element of the lattice; two binary operators join and widen compute upper bounds of their arguments, the first being expected to be more precise than the last, whereas the last should ensure termination of the analysis, as discussed in Section 2.1.6. The use of type-classes allows in particular to define overloaded notations for the various components of the weak_lattice record. For instance, the greatest element of any lattice will be written ⊤. Such a lattice is specified with respect to a concretization relation, which relates the type A of abstract values to the type B of concrete values (see Figure 3.3). The adom record contains only three properties: the monotonicity of the gamma operator, the soundness of the top element and the soundness of the least upper bound operator join. We do not provide formal proofs relating the abstract order with top or join: any weak join will be suitable here. The lack of properties about the widening operator is particularly surprising at first sight. In fact, as we will explain in Section 3.2, the widening operator is used only during fixpoint iteration and this step is validated a posteriori. Thus, only


Class gamma_op (A B: Type) : Type := γ : A → ℘(B).

Record adom (A B: Type) (WL: weak_lattice A) (G: gamma_op A B) : Prop := {
  (* monotonicity of gamma *)
  gamma_monotone: ∀ a₁ a₂, a₁ ⊑ a₂ → γ a₁ ⊆ γ a₂;
  (* top over-approximates any concrete property *)
  gamma_top: ∀ b, b ∈ γ(⊤);
  (* join over-approximates concrete union *)
  join_sound: ∀ a₁ a₂, γ a₁ ∪ γ a₂ ⊆ γ (a₁ ⊔ a₂)
}.

Figure 3.3: Specification of weak lattices

the result of this iteration step is verified and we do not need a widening operator in the verification process.

3.1.2 Example of Abstract Domain: Intervals

Our value analysis operates over compositions of abstract domains. One of the most basic abstract domains is the domain of intervals [CC76]. Using an interval to represent a set of machine integers is not straightforward. Indeed, machine integers form a circle, whereas an interval is a segment on a straight line. Also, depending on the operator applied to a machine integer, it may be interpreted as signed or unsigned. Here we focus on the signed interpretation; this signedness problem will be dealt with in Section 3.3.3. Figure 3.4 defines an abstract domain of intervals; one element of this domain represents the machine integers whose signed interpretation falls into this range. This instance is called signed_itv_adom. The definitions of leb and join are standard: the order corresponds to interval inclusion and the union of two intervals is their convex hull (the boolean operator x <=? y tests whether x is less than or equal to y). The widen operator is parameterized by two functions, next_smaller and next_larger, that enable a family of widening strategies. In particular, such a strategy may depend on the very program being analyzed. The basic widening operator for the domain of intervals over-approximates a non-stable bound by the largest (resp. smallest) integer. This is captured by a next_larger function (resp. next_smaller) that always returns Int.max_signed (resp. Int.min_signed). A refined strategy would try several thresholds between the current bound and the limit integer. For instance, (next_larger m) may return the smallest power of two larger than m. Also, to capture the frequent pattern of constant for-loops, e.g., when iterating over an array tab of constant size N, next_larger may return a value that appears as a literal constant in the analyzed program. For instance, when analyzing the following C snippet, trying N as a widening threshold would lead to a correct interval for the variable i.

for (int i = 0; i < N; ++i) { /* ... */ tab[i] /* ... */ }

An interval represents the range of the signed interpretation of a machine integer, as defined by the relation sgamma. Thus, top is defined as the largest interval with bounds Int.min_signed and Int.max_signed. The concretization is defined as follows. A machine integer n belongs to the concretization of an interval itv iff (Int.signed n) belongs to itv.


Record itv := { min: Z; max: Z }.

Instance signed_itv_wl : weak_lattice itv := {
  leb := λ i₁ i₂,
    let (min₁, max₁) := i₁ in
    let (min₂, max₂) := i₂ in
    (min₂ <=? min₁) && (max₁ <=? max₂);
  top := { min := Int.min_signed; max := Int.max_signed };
  join := λ i₁ i₂,
    let (min₁, max₁) := i₁ in
    let (min₂, max₂) := i₂ in
    { min := Z.min min₁ min₂; max := Z.max max₁ max₂ };
  widen := λ i₁ i₂,
    let (min₁, max₁) := i₁ in
    let (min₂, max₂) := i₂ in
    { min := if min₁ <=? min₂ then min₁ else next_smaller min₂;
      max := if max₂ <=? max₁ then max₁ else next_larger max₂ }
}.

Instance sgamma : gamma_op itv int := λ i v, min i <= Int.signed v <= max i.

Lemma signed_itv_adom : adom itv int signed_itv_wl sgamma. Proof. (* proof omitted here *) Qed.

Figure 3.4: Domain of intervals, abstracting signed machine integers
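To make the structure of Figure 3.4 concrete for readers less familiar with Coq, here is an illustrative Python transcription (not part of the verified development; intervals are pairs of Python integers, the signed 32-bit range is hard-coded, and `widen` defaults to the basic strategy that jumps straight to the machine bounds):

```python
# Illustrative sketch of the signed interval weak lattice (Figure 3.4).
INT_MIN, INT_MAX = -2**31, 2**31 - 1

TOP = (INT_MIN, INT_MAX)          # top: the largest representable interval

def leb(i1, i2):                  # abstract order: interval inclusion
    return i2[0] <= i1[0] and i1[1] <= i2[1]

def join(i1, i2):                 # convex hull of the two intervals
    return (min(i1[0], i2[0]), max(i1[1], i2[1]))

def widen(i1, i2,
          next_smaller=lambda _: INT_MIN,   # basic widening strategy;
          next_larger=lambda _: INT_MAX):   # thresholds can be plugged here
    lo = i1[0] if i1[0] <= i2[0] else next_smaller(i2[0])
    hi = i1[1] if i2[1] <= i1[1] else next_larger(i2[1])
    return (lo, hi)

def gamma(i, n):                  # signed concretization: n in [min, max]?
    return i[0] <= n <= i[1]
```

With the basic strategy, an unstable upper bound is widened all the way to INT_MAX, e.g. `widen((0, 1), (0, 2))` yields `(0, INT_MAX)`; a threshold-based `next_larger` would instead stop at intermediate values such as the loop bound N.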

We also define a variant of this domain with a concretization using an unsigned interpretation of machine integers: (λ i n, min i ≤ Int.unsigned n ≤ max i). As explained in Section 3.3, combining both domains recovers some precision that may be lost when using only one of them. The itv record type provides only lower and upper bounds of type Z. Using the expressiveness of Coq, we could choose to add an extra field requiring a proof that min ≤ max holds. While elegant at first sight, this would be rather heavyweight in practice, since we must provide such a proof every time we build a new interval. For the kind of proofs we perform, if such a property were required, we would generally have an hypothesis of the form n ∈ γ(i) in our context and it would imply that min i ≤ max i holds.

3.1.3 Abstract Domain Functors

Our library provides several functors that build complex abstract domains from simpler ones.

Direct Product A first example is the product (adom (A * A′) B WLx Gx) of two abstract domains (adom A B WL G) and (adom A′ B WL′ G′), where the lattice structure WLx is straightforward: maximal element, join and widening are defined component-wise, and two pairs are ordered if and only if both of their components are respectively ordered:

Definition leb : (A * A′) → (A * A′) → bool :=
  λ x y,
  let (x, x′) := x in
  let (y, y′) := y in
  x ⊑ y && x′ ⊑ y′.

The concretization Gx of a pair (a, a′) is the intersection (γ a) ∩ (γ a′) of their concretizations.

Instance lift_bot_wl (A: Type) (WL: weak_lattice A) : weak_lattice (A+⊥) := {
  top := NotBot (⊤);
  leb x y :=
    match x, y with
    | Bot, _ => true
    | _, Bot => false
    | NotBot x, NotBot y => x ⊑ y
    end;
  join x y :=
    match x, y with
    | Bot, _ => y
    | _, Bot => x
    | NotBot x, NotBot y => NotBot (x ⊔ y)
    end;
  widen x y :=
    match x, y with
    | Bot, _ => y
    | _, Bot => x
    | NotBot x, NotBot y => NotBot (x ∇ y)
    end
}.

Figure 3.5: Lattice structure of a domain with bottom

Definition leb : (A * A′) → (A * A′) → bool := λ x y, let (x, x′) := x in let (y, y′) := y in x y && x′ y′. ⊑ ⊑ The concretization Gx of a pair (a, a′) is the intersection (γ a) (γ a′) of their concretizations. ∩ Lifting a Bottom Element A bottom element is not mandatory in our definition of abstract domains because some sub-domains do not necessarily contain one. For instance, the domain of intervals does not contain such a least element. Still in our development, the bottom element plays a specific and important role since it represents the empty set of concrete values, i.e., any contradiction (for instance, unreachable states). We hence introduce a polymorphic type A+ that lifts a type A with an extra bottom element called 1 Bot . ⊥

Inductive botlift (A:Type): Type := Bot | NotBot (x:A).
Notation A+⊥ := (botlift A).

We then define a simple functor lift_bot that lifts any domain (adom A B) on a type A to a domain on A+⊥ (see Figure 3.5). The maximal element is the lifted maximal element of the original domain A. The partial order extends the original one such that the new element is actually minimal. The join and widening operators extend the original ones such that the new element is a unit. In this new domain, the concretization function extends the concretization of the input domain with γ Bot = ∅. We can then prove that the lifted lattice satisfies the specification, provided the original lattice is correct.

Instance lift_gamma (A B: Type) (G:gamma_op A B) : gamma_op (A+⊥) B :=
  λ x, match x with Bot => ∅ | NotBot x => γ(x) end.

¹The lifted data-type is an instance of the error monad; we will use the notation “do_bot a <- e; f(a)” for its bind operator.


Lemma lift_bot (A B: Type) WL G (D: adom A B WL G) :
  adom (A+⊥) B (lift_bot_wl WL) (lift_gamma G).
Proof. (* proof omitted here *) Qed.
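The bot-lifting functor can be sketched in Python as follows (illustrative only; Bot is modeled by `None`, and the underlying domain's operators are passed explicitly):

```python
# Illustrative sketch of bot-lifting (cf. Figure 3.5): None plays Bot.

def lift_leb(leb):
    def leb_bot(x, y):
        if x is None: return True      # Bot is the least element
        if y is None: return False
        return leb(x, y)
    return leb_bot

def lift_join(join):
    def join_bot(x, y):
        if x is None: return y         # Bot is a unit for join
        if y is None: return x
        return join(x, y)
    return join_bot

def lift_gamma(gamma):
    def gamma_bot(x, n):
        return x is not None and gamma(x, n)   # gamma(Bot) = empty set
    return gamma_bot
```

For instance, lifting the interval join gives `join_bot(None, (1, 2)) == (1, 2)`.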

Finite Reduced Map Finite maps are a pervasive data structure: they are used to bind abstract states to program points, abstract values to program variables, and so on. When building abstract domains, such maps are generally used to represent functions. We thus provide a generic functor, parameterized by an abstract domain of type adom A B WL G, i.e., a domain in which abstract elements have type A and represent sets of values of type B. It then builds an abstract domain whose abstract elements have type Map [ K, A ] (for some type K of keys) and represent sets of functions of type K → B, as described by the following definitions.

Definition get (m: Map [ K, A ]) (r: K) : A :=
  match m[r] with None => ⊤ | Some i => i end.

Instance gamma : gamma_op t (K → B) :=
  λ m rs, ∀ r, rs r ∈ γ (get m r).

Keys that are not bound in the abstract map are implicitly bound to the greatest element of A. Therefore, bindings to this element are lazily represented and we ensure that no binding to it is actually present in the map. This invariant could be enforced by typing, using a method similar to the above bottom-lifting. There is a variant of this domain in which keys that are not bound in the map implicitly represent the least element in A. This variant is useful to represent flow-sensitive information, i.e., when keys are program points. In that case, program points that are not bound are known to be unreachable, i.e., dead code.
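The implicit-top invariant of these reduced maps can be sketched in Python (an illustrative model, using dicts; `top` is passed explicitly since Python has no type-class resolution):

```python
# Sketch of the finite reduced map domain: unbound keys mean top,
# and we maintain the invariant that no key is ever bound to top.

def get(m, k, top):
    return m.get(k, top)

def set_bind(m, k, v, top):
    m2 = dict(m)
    if v == top:
        m2.pop(k, None)    # keep the invariant: never store a binding to top
    else:
        m2[k] = v
    return m2

def map_gamma(m, env, gamma, top):
    # env is a total function from keys to concrete values; only the
    # finitely many bound keys impose a constraint (unbound keys map to top,
    # whose concretization accepts everything).
    return all(gamma(get(m, k, top), env(k)) for k in m)
```

The dual variant for flow-sensitive information would instead interpret unbound keys as bottom (unreachable program points).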

3.2 Fixpoint Resolution

Like many data flow analyses, our value analysis can be turned into the fixpoint resolution of an equation system on a lattice. CompCert already provides a classical Kildall iteration framework [Kil73] to iteratively find the least fixpoint of an equation system. But using such a framework is impossible here, for two reasons. First, the lattice of bounded intervals contains very long ascending chains that make standard Kleene iterations too slow. Second, the non-monotonic nature of widening and narrowing makes fixpoint iteration sensitive to the iteration order of the equations. We have therefore designed a new fixpoint resolution framework that relies on the general iteration techniques defined by Bourdoncle [Bou93]. First, Bourdoncle provides a strategy computation, based on Tarjan's algorithm, that computes the strongly connected subcomponents of a directed graph and finds loop headers where widening should be positioned. This algorithm also sorts each strongly connected subcomponent in order to obtain an iteration strategy that iterates inner loops until stabilization before iterating outer loops. Bourdoncle then provides an efficient fixpoint iteration algorithm that iterates along the previous strategy and requires a minimal number of abstract order tests to detect convergence.


Definition check_fxp t instruction wl transfer entry code init
  (fxp: node → t+⊥) : bool :=
  NotBot init ⊑ fxp entry
  && map_forall code (λ pc ins,
       match fxp pc with
       | NotBot ab =>
           List.forallb
             (λ x, let (pc′, tf) := x in tf ab ⊑ fxp pc′)
             (transfer pc ins)
       | Bot => true
       end).

Figure 3.6: Fixpoint checker

This algorithm relies on advanced reasoning in graph theory and formally verifying it would be highly challenging. This frontal approach would also certainly be too rigid, because widening iteration requires several heuristic adjustments to reach a satisfactory precision in practice (loop unrolling, delayed widenings, decreasing iterations). We have therefore opted for a more flexible verification strategy: Bourdoncle strategies and fixpoints are computed by an external tool (represented by the function called get_extern_fixpoint) and we only formally verify a fixpoint checker (called check_fxp). The external tool has the following type.

Parameter get_extern_fixpoint
  (t: Type) (instruction: Type)
  (wl: weak_lattice t)
  (transfer: node → instruction → list (node * (t → t+⊥)))
  (P: Map [ node, instruction ])
  (entry: node)
  (init: t)
  : node → t+⊥.

It is parameterized by an abstract domain (type t and lattice wl), and the instruction set (type instruction and abstract semantics transfer). The transfer function defines the abstract semantics of the CFG programming language: given a node and the instruction at this node, it yields a list of possible successors with the corresponding abstract transfer function. The output of the oracle is flow-sensitive: at each node of the program is attached an abstract value. Nodes that are not bound in the result are unreachable. This external oracle is not trusted. Instead, its output is validated after each call. The code of the checker is given in Figure 3.6. Given a program to analyze and the putative solution fxp, the checker ensures two properties:

1. the solution for the entry node is larger than the initial abstract state init; and

2. for every program point pc, at which the instruction is ins, the solution at this program point is either Bot (i.e., this program point is unreachable), or NotBot ab such that any possible abstract step from ab leads to an abstract state that is actually predicted by the solution.
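These two checks can be sketched in Python (an illustrative model, not the Coq code; bot-lifted abstract values are modeled by `None`-or-value, `fxp` is a function from nodes to lifted values, and transfer functions return lifted values):

```python
# Sketch of the a-posteriori fixpoint checker (cf. Figure 3.6).
# code: dict node -> instruction; transfer(pc, ins): list of (succ, tf);
# leb: abstract order of the underlying (unlifted) domain.

def check_fxp(code, transfer, entry, init, fxp, leb):
    # 1. the solution at the entry node must over-approximate init
    if fxp(entry) is None or not leb(init, fxp(entry)):
        return False
    # 2. every abstract step from a reachable node must be predicted
    for pc, ins in code.items():
        ab = fxp(pc)
        if ab is None:
            continue                      # claimed unreachable: nothing to do
        for succ, tf in transfer(pc, ins):
            post = tf(ab)                 # lifted result: None means Bot
            if post is None:
                continue                  # Bot is below everything
            tgt = fxp(succ)
            if tgt is None or not leb(post, tgt):
                return False
    return True
```

A wrapper in the style of solve_pfp would call the untrusted solver, run `check_fxp` on its answer, and fall back to the constant-top solution on failure.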

Our fixpoint analyzer is defined below. Given an abstract domain ab, the transfer functions transfer, a program P, its entry point entry, and the initial abstract value init, it calls the external oracle, and checks its answer. If the check succeeds, this answer is returned; otherwise, a trivial result is returned.


Definition solve_pfp
  (ab: adom t B wl G)
  (transfer: node → instruction → list (node * (t → t+⊥)))
  (P: Map [ node, instruction ])
  (entry: node) (init: t)
  : node → t+⊥ :=
  let fxp := get_extern_fixpoint ab transfer P entry init in
  if check_fxp ab transfer P entry init fxp then fxp else (λ _, ⊤).

The verification of the fixpoint checker yields the following property: the concretization of the result of the solve_pfp function is a post-fixpoint of the concrete transfer function. That is, given the analysis result fxp, for each node pc of the program, applying the corresponding transfer function tf to the analysis result yields an abstract value included in the analysis result.

Lemma solve_pfp_postfixpoint:
  ∀ ab transfer P entry init fxp,
  fxp = solve_pfp ab transfer P entry init →
  ∀ pc i, P[pc] = ⌊i⌋ →
  ∀ pc′ tf, In (pc′, tf) (transfer pc i) →
  γ(tf (fxp pc)) ⊆ γ(fxp pc′).
Proof. (* proof omitted here *) Qed.

The proof goes as follows. First we examine the outcome of the fixpoint checker. If it failed, the analysis result is ⊤, whose soundness is one of the properties of all abstract domains (gamma_top in Figure 3.3). Otherwise, every possible step has been examined by the checker: in particular, for the step from pc to pc′, the checker enforces that the abstract states are correctly ordered: tf(fxp pc) ⊑ fxp pc′. Then, the monotonicity of the concretization yields the conclusion.

The actual soundness proof of the analysis now relies on the soundness of the abstract semantics, as represented by the transfer function given to the solver. Our implementation of this function is built from a memory abstraction: a type with abstract transformers that model the CFG instructions. This memory abstraction is, in turn, parameterized by a numerical abstraction, which is the topic of the next section.

3.3 Numerical Abstraction

Following the design of the Astrée analyzer [Cou+05], our value analysis is parameterized by a numerical abstract domain that is unaware of the C memory model. We first present the interface of abstract numerical environments, then how we abstract numerical values in order to build non-relational abstract environments. Finally, we show concrete instances of numerical domains and how they can be combined.


3.3.1 Abstraction of Numerical Environments

The first interface captures the notion of numerical environment abstraction. This interface matches most implementations of a relational abstract domain [CH78; Min04] on machine integers. It relies on a numerical expression language (type nexpr) defined in Figure 3.7. These expressions are parameterized by the type var of their variables. The unary and binary operations are the same as in CompCert's CFG language. The constant constructor NOintunknown denotes any machine integer; it makes it possible to write non-deterministic expressions. These expressions are associated with a big-step operational semantics eval_nexpr of type (var → int) → nexpr → ℘(int). This semantics depends on a valuation ρ of the variables. It returns a set of machine integers, as expressions are non-deterministic and can be stuck. Its formal definition is given in Figure 3.8. The interface ab_env of relational numerical domains, given in Figure 3.9, includes an abstract domain id_adom with a concretization id_gamma to sets of functions from variables to machine integers, and three operators: range, assign, and assume. Each operator comes with a correctness property, defined in terms of the concretization relation. The query range e ab returns two intervals from the interval domain previously defined (one for each signedness) that represent all values that expression e may evaluate to, given any valuation ρ in the concretization of ab. The transformer assign x e ab returns an abstract element that represents functions in which variable x is now bound to the result of the evaluation of expression e. The transformer assume e ab refines the abstract element ab to take into account that expression e evaluates to true.

3.3.2 Building Non-relational Abstractions of Numerical Environments

Implementing a fully verified relational abstract domain is a challenge in itself and is not the topic of this dissertation. Advances on this subject can be found in the work of Fouilhé et al. [FB14; FMP13]. We instead implement the previous interface with a standard non-relational abstract environment of the form var → V♯, where V♯ abstracts numerical values. The notion of abstraction of numerical values is captured by the interface num_dom shown in Figure 3.10. It is defined as a carrier t, an abstract domain structure num_adom and several abstract transformers. Some operators are forward ones: they provide properties about the output of an operation. For instance, the operator const builds an abstraction of a single value. Some operators are backward ones: given some properties about the inputs and expected output of an operation, they provide refined properties about its inputs. These backward operators are needed to implement a precise assume transformer (see § 2.1.8). Each operator is required to come with a soundness proof.

Variable var : Type.

Inductive nconstant : Type :=
| NOintconst: int → nconstant
| NOintunknown : nconstant.

Inductive nexpr : Type :=
| NEvar (v: var)
| NEconst (cst: nconstant)
| NEunop (op: unary_operation) (e: nexpr)
| NEbinop (op: binary_operation) (e₁ e₂: nexpr)
| NEcondition (b: nexpr) (e₁ e₂: nexpr).

Figure 3.7: Syntax of numerical expressions (nexpr)


Definition eval_nconstant (c: nconstant) : ℘(int) :=
  match c with
  | NOintconst i => {{ i }}
  | NOintunknown => λ _, True
  end.

Definition Ntrue: int := Int.one. Definition Nfalse: int := Int.zero.

Definition eval_nunop (op: unary_operation) (i: int) : ℘(int) :=
  match op with
  | Ocast8unsigned => {{ Int.zero_ext 8 i }}
  | Onegint => {{ Int.neg i }}
  | Onegf => ∅
  | …
  end.

Definition eval_nbinop (op: binary_operation) (i_j: int * int) : ℘(int) :=
  let (i, j) := i_j in
  match op with
  | Oadd => {{ Int.add i j }}
  | Odiv =>
      if Int.eq j Int.zero
         || Int.eq i (Int.repr Int.min_signed) && Int.eq j Int.mone
      then ∅ else {{ Int.divs i j }}
  | Oaddf => ∅
  | …
  end.

Variable ρ: var → int.

Fixpoint eval_nexpr (ne: nexpr) : ℘(int) :=
  match ne with
  | NEvar v => {{ ρ v }}
  | NEconst cst => eval_nconstant cst
  | NEunop op e => Union (eval_nexpr e) (eval_nunop op)
  | NEbinop op e₁ e₂ => Union (eval_nexpr e₁ × eval_nexpr e₂) (eval_nbinop op)
  | NEcondition b e₁ e₂ =>
      Union (eval_nexpr b)
            (λ k, eval_nexpr (if Int.eq Int.zero k then e₂ else e₁))
  end.

Figure 3.8: Semantics of numerical expressions
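The set-valued flavor of this semantics can be sketched in Python (an illustrative model with a tiny tuple-encoded expression type, not the Coq nexpr; only a few operators are covered and machine arithmetic is modeled by wrap-around modulo 2^32):

```python
# Sketch of the collecting semantics of numerical expressions (cf. Figure 3.8):
# evaluation returns a *set* of machine integers, since expressions are
# non-deterministic.
M = 2**32

def wrap(n):
    return n % M

def eval_nexpr(rho, e):
    tag = e[0]
    if tag == "var":                       # NEvar
        return {rho[e[1]]}
    if tag == "const":                     # NEconst (NOintconst)
        return {wrap(e[1])}
    if tag == "unknown":                   # NOintunknown: any machine integer
        return set(range(M))               # (conceptual; never materialized)
    if tag == "add":                       # NEbinop Oadd
        return {wrap(i + j)
                for i in eval_nexpr(rho, e[1])
                for j in eval_nexpr(rho, e[2])}
    if tag == "cond":                      # NEcondition: branch on zero/non-zero
        return set().union(*(eval_nexpr(rho, e[2] if k != 0 else e[3])
                             for k in eval_nexpr(rho, e[1])))
    raise ValueError(tag)                  # stuck expression: no value
```

For example, evaluating the conditional `("cond", ("const", 0), e1, e2)` selects the set of values of `e2`, mirroring the `Int.eq Int.zero k` test of the Coq semantics.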


Inductive signedness := Signed | Unsigned.

Definition ints_in_range (r: signedness → itv+⊥) : ℘(int) :=
  (γₛ (r Signed)) ∩ (γu (r Unsigned)).

Class ab_env (var t: Type) {E:EqDec var} : Type := {
  id_wl:> weak_lattice t;
  id_gamma:> gamma_op t (var → int);
  id_adom:> adom t (var → int) id_wl id_gamma;

  range: nexpr var → t → signedness → Interval.itv+⊥;
  assign: var → nexpr var → t → t+⊥;
  assume: nexpr var → t → t+⊥;

  range_correct: ∀ e ρ ab, ρ ∈ γ ab →
    eval_nexpr ρ e ⊆ ints_in_range (range e ab);
  assign_correct: ∀ x e ρ n ab, ρ ∈ γ ab →
    n ∈ eval_nexpr ρ e →
    (upd ρ x n) ∈ γ (assign x e ab);
  assume_correct: ∀ e ρ ab, ρ ∈ γ ab →
    Ntrue ∈ eval_nexpr ρ e →
    ρ ∈ γ (assume e ab)
}.

Figure 3.9: Signature of relational numerical domains


Class num_dom (t:Type) := {
  (* abstract domain structure *)
  num_wl: weak_lattice t;
  num_gamma: gamma_op t int;
  num_adom : adom t int num_wl num_gamma;
  (* over-approximation of the concrete intersection *)
  meet: t → t → t+⊥;
  meet_sound: ∀ x y, (γ x) ∩ (γ y) ⊆ γ (meet x y);
  range: t → signedness → itv+⊥; (* signed/unsigned range *)
  range_sound: ∀ x:t, γ x ⊆ ints_in_range (range x);
  const: int → t;
  const_sound: ∀ n, n ∈ γ(const n);
  forward_unop: unary_operation → t → t+⊥;
  forward_unop_sound: ∀ op x,
    Eval_unop op (γ x) ⊆ γ (forward_unop op x);
  forward_binop: binary_operation → t → t → t+⊥;
  forward_binop_sound: ∀ op x y,
    Eval_binop op (γ x) (γ y) ⊆ γ(forward_binop op x y);
  backward_unop: (* omitted *);
  backward_unop_sound: (* omitted *);
  backward_binop: binary_operation → t → t → t → (t * t)+⊥;
  backward_binop_sound: ∀ op x y z i j k,
    k ∈ eval_binop op i j → i ∈ γ x → j ∈ γ y → k ∈ γ z →
    (i, j) ∈ γ(backward_binop op x y z)
}.

Figure 3.10: Signature of non-relational numerical domains

We also implement a functor that lifts any abstraction of numerical values into a numerical environment abstraction. It relies on the functor for finite reduced maps that we have presented at the end of Section 3.1.3.
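The core of such a lifting can be sketched in Python (an illustrative model: environments are dicts from variable names to abstract values, the value domain is a dict of operators, and only a toy `add` is interpreted; this is not the Coq functor):

```python
# Sketch of lifting a numerical value abstraction to an environment
# abstraction (in the spirit of NonRelDom.make): expressions are evaluated
# variable-wise, with no relations tracked between variables.

def aeval(env, e, dom):
    # forward abstract evaluation of a tuple-encoded expression
    tag = e[0]
    if tag == "var":
        return env.get(e[1], dom["top"])        # unbound variable: top
    if tag == "const":
        return dom["const"](e[1])
    if tag == "binop":
        return dom["forward_binop"](e[1],
                                    aeval(env, e[2], dom),
                                    aeval(env, e[3], dom))
    raise ValueError(tag)

def assign(env, x, e, dom):
    env2 = dict(env)
    env2[x] = aeval(env, e, dom)               # strong update of x
    return env2

# A toy interval value domain (overflow handling omitted for brevity):
itv_dom = {
    "top": (-2**31, 2**31 - 1),
    "const": lambda n: (n, n),
    "forward_binop": lambda op, i, j:
        (i[0] + j[0], i[1] + j[1]) if op == "add" else (-2**31, 2**31 - 1),
}
```

For instance, after `assign({}, "x", ("const", 3), itv_dom)` the environment maps x to the singleton interval (3, 3).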

NonRelDom.make(t): num_dom t → ab_env ((Map [ var, t ])+⊥)

The most advanced operator in this functor is the assume function. It relies on a backward abstract semantics of expressions.

Fixpoint backward_expr (e: nexpr) (ab: (Map [ var, t])+⊥) (itv: t)
  : (Map [ var, t])+⊥ :=
  match e with
  | …
  | NEcond b ℓ r =>
      (backward_expr b (backward_expr r ab itv) (const Nfalse))
      ⊔ (backward_expr b (backward_expr ℓ ab itv)
           (backward_unop Oboolval (eval_expr b ab) (const Ntrue)))
  end.


We only show and comment on the case of conditional expressions. Given such an expression NEcond b ℓ r, an abstract environment ab and the expected value itv of this expression, we explore the two branches of the condition. In one case, the condition b evaluated to Nfalse and the right branch r evaluated to itv. In the other case, the condition b evaluated to anything whose boolean value is Ntrue and the left branch ℓ evaluated to itv. Then we have to consider that any of the two branches might have been taken, hence the join. Equipped with such backward operators, the analysis is then able to deal with complex conditions like the following: if (0 <= x && x < y && y < z && z < t && t < u && u < v && v < 10). When analyzing the true branch of this if, it is sound to assume that the condition holds. The backward operator will propagate this information and infer one bound for each variable. Since backward evaluation of conditions goes right to left, the following bounds are inferred: v < 10, u < 9, t < 8, z < 7, y < 6, and 0 ≤ x < 5. Unfortunately, no information is propagated from left to right. However, applying the assume function again does propagate information between the various conditions. Iterating this process finally yields the most precise intervals for all variables involved in this condition. Notice that inferring such precise information is possible thanks to the availability of complex expressions in the analyzed CFG program. Compare for example with Frama-C which, prior to any analysis, and as part of the parsing phase, destructs boolean operations into nested ifs; it is thus unable to give both bounds for each variable. Nevertheless, it is possible to recover similar precision in the absence of complex expressions, thanks to dedicated relational symbolic domains [Bla+03; Jou+15]. This is also more general: good precision can be achieved even if the original source code does not feature the rich expressions.
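The effect of iterating assume on such a chain of comparisons can be sketched in Python (a hypothetical helper on intervals, not the Coq code; `assume_lt` is an ad hoc backward operator for x < y, ignoring overflow):

```python
# Sketch: iterating backward propagation over a chain v0 < v1 < ... < vn.
# Intervals are (lo, hi) pairs over unbounded Python integers.

def assume_lt(env, x, y):
    # refine the intervals of x and y knowing that x < y holds
    ix, iy = env[x], env[y]
    out = dict(env)
    out[x] = (ix[0], min(ix[1], iy[1] - 1))   # x <= max(y) - 1
    out[y] = (max(iy[0], ix[0] + 1), iy[1])   # y >= min(x) + 1
    return out

def assume_chain(env, chain, iters):
    # each pass propagates lower bounds rightward immediately, but upper
    # bounds move leftward only one link per pass, hence the iteration
    for _ in range(iters):
        for x, y in zip(chain, chain[1:]):
            env = assume_lt(env, x, y)
    return env
```

Starting from 0 ≤ x and v < 10, a few passes converge to the most precise intervals: x ∈ [0, 4], y ∈ [1, 5], ..., v ∈ [5, 9].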

3.3.3 Abstraction of Numerical Values: Instances and Functor

We give two instances of the numerical value abstraction interface: the intervals of signed integers and the intervals of unsigned integers. In both domains, intervals are represented using unbounded integers (type Z), but we have to take into account that they model bounded integers that wrap around on overflows. Therefore, we systematically test whether an overflow may occur and fall back to top when we cannot prove the absence of overflow. This logic is implemented in the repr operator, which is systematically used to implement abstract operators, as exemplified by the definition of the addition of intervals.

Definition repr (i: itv) : itv := if i ⊑ ⊤ then i else ⊤.
Definition add (i j: itv) : itv :=
  repr {| min := min i + min j; max := max i + max j |}.

We also rely on a reduction operator when the result of an operation may lead to an empty interval. Since our representation of intervals contains several elements with the same (empty) concretization, it is important to always use the same representative for them.² This function is in particular used in the implementation of backward operators (such as backward_lt below) that are reductions.

Definition reduce (ℓ u: Z) : itv+⊥ :=
  if ℓ <=? u then NotBot {| min := ℓ; max := u |} else Bot.

²Otherwise the analyzer may encounter two equivalent values without noticing it and lose precision.
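The repr discipline can be sketched in Python (an illustrative model of the signed 32-bit case, not the Coq development): compute on unbounded integers, then give up and return top whenever the result may not fit the machine range.

```python
# Sketch of overflow-aware interval addition via a repr-style fallback.
INT_MIN, INT_MAX = -2**31, 2**31 - 1
TOP = (INT_MIN, INT_MAX)

def repr_itv(i):
    # keep i only if it provably fits the signed machine range;
    # otherwise fall back to top, which is always sound w.r.t. wrap-around
    return i if INT_MIN <= i[0] and i[1] <= INT_MAX else TOP

def add(i, j):
    # exact addition on unbounded bounds, then soundness check
    return repr_itv((i[0] + j[0], i[1] + j[1]))
```

For instance, adding (INT_MAX, INT_MAX) and (1, 1) may overflow, so the result degrades to TOP rather than the (wrong) interval (INT_MAX + 1, INT_MAX + 1).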


Definition backward_lt (i j: itv) : itv+⊥ * itv+⊥ :=
  (meet i (reduce Int.min_signed (max j - 1)),
   meet j (reduce (min i + 1) Int.max_signed)).

At run-time, there are no signed or unsigned integers; there are only machine integers, i.e., bit arrays whose interpretation may vary depending on the operations they undergo. Therefore choosing one of the two interval domains may hamper the precision of the analysis. Consider the following example C program, in which (*) denotes a boolean condition unknown to the analyzer and INT_MAX is the maximal signed integer, called Int.max_signed in CompCert.

1 int main(void){
2   signed s; unsigned u;
3   if (*) u = INT_MAX; else u = INT_MAX + 1;
4   if (*) s = 0; else s = -1;
5   return u + s;
6 }

At the end of line 3, an unsigned interval can exactly represent the two values that the variable u may hold. However, the least signed interval that contains them both is top. Similarly, at the end of line 4, a signed interval can precisely approximate the content of variable s whereas an unsigned interval would be extremely imprecise. Moreover, comparison operations can be precisely translated into operations over intervals (e.g., intersections) only when they share the same signedness. Therefore, so as to get as precise information as possible, we need to combine the two interval domains. This is done through reduction. To combine abstractions of numerical values in a generic and precise way, we implement a functor that takes two abstractions and a sound reduction operator and returns a new abstraction based on their reduced product.

Instance reduced_product A₁ A₂ {N₁:num_dom A₁} {N₂:num_dom A₂} (Ρ: reduction A₁ A₂) : num_dom (A₁ * A₂) := (* omitted definition *).

A reduction is made of an operator ρ and a proof that this operator is a sound reduction.

Class reduction A₁ A₂ : Type := ρ: A₁+⊥ → A₂+⊥ → (A₁ * A₂)+⊥.

Class reduction_correct A₁ A₂ B
  (G₁: gamma_op A₁ B) (G₂: gamma_op A₂ B) (R: reduction A₁ A₂) :=
  reduction_spec: ∀ (a: A₁+⊥) (b: A₂+⊥), γ a ∩ γ b ⊆ γ (ρ a b).

Each operator of this functor is implemented by first using the operator of both input domains and then reducing the result with ρ. We hence ensure that each encountered value is systematically of the form (ρ a b), but we do not prove this fact formally, avoiding the heavy manipulation of quotients. Note also that, for soundness purposes, we do not need to prove that the reduction actually reduces (i.e., returns a lower element in the abstract lattice).
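The pattern "operate component-wise, then reduce" can be sketched in Python (an illustrative model, not the Coq functor: abstract pairs are tuples, `None` plays Bot, and the reduction shown here only detects emptiness on either side):

```python
# Sketch of a reduced product: each pairwise operation is followed by a
# reduction rho, so that every value handled is of the form rho(a, b).

def rho(a, b):
    # a, b: intervals or None (Bot); a contradiction on one side
    # contaminates the whole pair
    if a is None or b is None:
        return None
    if a[0] > a[1] or b[0] > b[1]:   # empty interval on either side
        return None
    return (a, b)

def product_meet(p, q, meet1, meet2):
    # meet component-wise in each input domain, then reduce
    return rho(meet1(p[0], q[0]), meet2(p[1], q[1]))
```

A richer reduction would also exchange bounds between the signed and unsigned views; soundness only requires that `rho` over-approximates the intersection of the two concretizations.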

In the next section, we discuss how the numerical domains that we have built in this section are connected to the iterator defined in the previous section.

39 Chapter 3 Modular construction of static analyzers of a C-like language in Coq

3.4 Memory Abstraction

The last layer of our modular architecture connects the CFG abstract interpreter with numerical abstract domains. It fills the gap between the CFG constructs that manipulate memory locations through pointers and the numerical domain that operates over a set of identified variables. The memory domain aims at translating every C feature into useful information in the numerical world. On the interpreter side, the interface with this abstract memory model is called mem_dom. It provides a basic domain structure, a query operator range that returns an interval for each program variable, and four abstract transformers, forget, assign, store and assume. Our final analyzer (value_analysis, whose signature is given below) is parameterized by a structure of this type. The analysis of a program returns a function that for each node returns either Bot, meaning that this node is unreachable, or some function that attaches an interval to each local variable and each signedness.

Definition value_analysis (t:Type) :
  mem_dom t → program → node → (ident → sign_flag → Interval.itv+⊥)+⊥.

We first present the specification of this domain and then describe a basic implementation that does not track the contents of the memory.

3.4.1 Specification

The signature of this domain is given in Figure 3.11. It features, as every other domain, a sound lattice structure. The query operator range, as before, provides two intervals for each local variable. The four transformers forget, assign, store and assume model the basic operations of the CFG language: updating a local variable or a memory location, and conditional execution. Indeed, the transfer function to be given to the external fixpoint solver (see § 3.2) is defined as shown in Figure 3.12 (given some abmem implementing the interface mem_dom), mapping each CFG instruction to the corresponding abstract semantics. These transformers are specified as over-approximations of their concrete counterparts (with a capital initial letter) that operate on sets of concrete states (defined as strongest post-conditions). For instance, a concrete state em′ is in the set (Assign x q E) if and only if em′ is the result of updating some concrete state (e, m) in E at local variable x with a value v that results from the evaluation of expression q in this concrete state.

Definition Assign ge (x: positive) (q: expr) (E: ℘(memory)) : ℘(memory) :=
  λ em′, ∃ e m, (e, m) ∈ E ∧ ∃ sp v,
    eval_expr ge (Vptr sp Int.zero) e m q v ∧
    em′ = (PTree.set x v e, m).
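By analogy, the concrete counterpart of forget can be stated as the strongest post-condition of setting a variable to an arbitrary value. This is only a sketch; the actual definition of Forget in the development may differ.

```coq
(* A state em′ is in (Forget x E) if it results from updating some
   state (e, m) of E at variable x with an arbitrary value v. *)
Definition Forget (x: positive) (E: ℘(memory)) : ℘(memory) :=
  λ em′, ∃ e m v, (e, m) ∈ E ∧ em′ = (PTree.set x v e, m).
```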

Before we describe an implementation of this interface, we describe how to combine the soundness of a memory domain with the soundness of the iterator to yield the final soundness theorem of the analysis. This theorem is stated on Figure 3.13, for a given sound memory domain, not shown here. It means that given a program p, for each reachable state of this program, at program point pc in function f, the analysis result for this program


Record mem_dom (t:Type) := {
  (* abstract domain with concretization to local environment and global memory *)
  mem_wl : weak_lattice t;
  mem_gamma: gamma_op t (env * mem);
  mem_adom: adom t (env * mem) mem_wl mem_gamma;
  (* consult the range of a local variable *)
  range: t → (ident → signedness → itv+⊥);
  range_sound: ∀ ab e m, (e,m) ∈ γ ab →
    ∃ ab′, ab = NotBot ab′ ∧
    ∀ x, match e ! x with
         | Some (Vint i | Vptr _ i) => i ∈ ints_in_range (range ab′ x)
         | _ => True
         end;
  (* project the value of a local variable *)
  forget: ident → t → t+⊥;
  forget_sound: ∀ x ab, Forget x (γ ab) ⊆ γ (forget x ab);
  (* assign a local variable *)
  assign: ident → expr → t → t+⊥;
  assign_sound: ∀ ge x e ab, Assign ge x e (γ ab) ⊆ γ (assign x e ab);
  (* assign a memory cell *)
  store: memory_chunk → expr → expr → t → t+⊥;
  store_sound: ∀ ge κ a e ab, Store ge κ a e (γ ab) ⊆ γ (store κ a e ab);
  (* assume an expression evaluates to non-zero value *)
  assume: expr → t → t+⊥;
  assume_sound: ∀ ge e ab, Assume ge e (γ ab) ⊆ γ (assume e ab)
}.

Figure 3.11: Signature of abstract memory domains

Definition transfer (n : node) (ins: instruction) : list (node * (abmem → abmem+⊥)) :=
  match ins with
  | Iskip s => (s, λ ab, NotBot ab) :: nil
  | Iassign x e s => (s, assign x e) :: nil
  | Istore κ e a s => (s, store κ e a) :: nil
  | Iifthenelse e ifso ifno =>
      (ifso, assume e) :: (ifno, assume (Eunop Onotbool e)) :: nil
  | … end.

Figure 3.12: Abstract semantics of CFG


Theorem value_analysis_sound :
  ∀ (p: program) (stk: list stackframe) (f: function) (sp: val)
    (pc: node) (e: env) (m: Mem.mem),
  Reach p (State stk f sp pc e m) →
  ∃ range, vanalysis f pc = NotBot range ∧
  ∀ x, match e ! x with
       | Some (Vint i | Vptr _ i) => ints_in_range (range x) i
       | _ => True
       end.
Proof. (* proof omitted here *) Qed.

Figure 3.13: Soundness theorem of the analysis

point is some range function that conservatively approximates the (numerical) content of each local variable x (the property ints_in_range has been defined in Figure 3.9). This theorem follows from the fact that the following property is a state invariant:

Definition gamma_state : ℘(state) := λ s,
  match s with
  | State s f sp pc rs m => (rs, m) ∈ γ (solve_pfp f pc)
  | … (* omitted details *)
  end.

This invariant expresses that at program point pc, the contents of the registers (rs) and memory m are over-approximated by the analysis result at this program point. The omitted details deal with call stacks and enable the handling of (external) function calls. The proof that this is an invariant follows from the soundness of the abstract memory transformers and the solve_pfp_postfixpoint lemma discussed before (§ 3.2).

3.4.2 Basic Implementation

The implementation of this interface leverages the (possibly relational) numerical domains previously introduced to compute over-approximations of memory states. A concrete memory state is made of a local environment of type env that maps temporary variables (registers) to values and a global CompCert memory. The simple implementation that we present here does not recover any information about the global memory. It only keeps track of the content of the registers, which are variables that do not have an address: they cannot be targeted by a pointer expression and are always referred to explicitly. Nonetheless, this implementation features the basic architecture that will be expanded further on (this is the topic of Chapter 5). It is parameterized by a numerical domain over some type N. The concretization of this domain (of type N → ℘(var → int)) is extended to represent sets of concrete memory states (the γ in the definition refers to the concretization of the numerical domain N):

Instance gamma_mem : gamma_op N memory := λ (ab: N) (em: env * mem),
  let (e,_) := em in
  ∃ ρ: var → int, ρ ∈ γ ab ∧
  (∀ x, match e ! x with Some (Vint i | Vptr _ i) => ρ x = i | _ => True end).

This means that the memory domain uses the numerical domain to compute, for each temporary variable that holds an integer or pointer value, an over-approximation of the integer component of this value (i.e., the values of integers and the offsets of pointers). The core operation of this domain is the conversion from CFG expressions (type expr, defined in Figure 2.7) to numerical expressions (type nexpr, defined in Figure 3.7). This transformation is performed in two steps: first remove loads, then remove pointers. The first step produces an intermediate expression of type pexpr, similar to a CFG expression but without loads (the details of this expression type will be discussed in a following chapter, in particular in Section 5.3.1). In this basic implementation, which does not compute anything about the content of the memory, loads are trivially removed. Removing pointers mostly consists in translating pointer operations into the corresponding operations on their offsets. The only corner case is with comparisons and conditional expressions: to compare two pointers, it is not enough to compare their offsets; no pointer is equal to an integer; and all pointers are “true”. As an example, consider the expression x ==u y (unsigned equality test). If x and y both are integers, it is already a numerical expression. If both are pointers, the analyzer needs some information about the blocks these pointers point to (if the blocks are equal, the offsets are to be compared; if the blocks are different, the pointers are different). Finally, if one is a pointer whereas the other is an integer, then the result is always false. Therefore, to correctly translate comparisons and conditional expressions, the analyzer must be able to predict the types of the involved subexpressions.
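The case analysis for x ==u y could, for instance, take the following shape. This is a sketch: the constructors Just, NEcmpu and NEfalse are hypothetical names for the top-lifted type and for numerical expressions, not the identifiers of the actual development.

```coq
(* Sketch: translating an unsigned equality test using the inferred
   types of its operands; nx and ny are the already-converted
   numerical operands, tx and ty their inferred types. *)
Definition convert_cmpu (tx ty: IOP.t+⊤) (nx ny: nexpr) : option nexpr :=
  match tx, ty with
  | Just VI, Just VI => Some (NEcmpu nx ny) (* both integers: compare them *)
  | Just VI, Just VP
  | Just VP, Just VI => Some NEfalse        (* pointer ≠ integer: always false *)
  | _, _ => None  (* two pointers, or unknown types: no block information here *)
  end.
```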
The numerical analysis is thus complemented with a (parallel, non-relational) type analysis that computes, for each temporary variable, an approximation of its contents: is it definitely an integer, definitely a (non-null) pointer, or any value. The type IOP.t is made of the two constants VI and VP to represent integers and pointers respectively; it is complemented with a generic All element to form the type IOP.t+⊤. This leads to the following type for the memory domain.

Definition t := (N * Map [ var, IOP.t ]).

Then, each transfer function applies the conversion to feed the numerical domain and the type analysis. For instance, the assign operator, applied to variable x and expression e, takes the following form. First, loads are removed from the CFG expression e by the convert function. If this conversion fails (i.e., returns None), everything known about variable x is invalidated (forget). Otherwise, the resulting expression pe is given on one hand to the type analysis that infers the type of the result, and on the other hand to the numerical domain after a second step of conversion.

Definition assign (x: ident) (e: expr) (ab: t) : t+⊥ :=
  match convert e with
  | None => forget x ab
  | Some pe =>
      let (nm, tp) := ab in
      do_bot ty <- Eval.eval_expr IOP.adom tp pe;
      do_bot nm′ <- assign x (nconvert tp pe) nm;
      NotBot (nm′, TDom.set tp x ty)
  end.

In the previous sections, we have seen how our value analysis is built and proved correct. The next section discusses some unproved properties of this analysis, such as its precision.

3.5 Experimental Evaluation

The soundness theorem of the analyzer states that if the analysis produces a result, then this result over-approximates all possible behaviors of the analyzed program. It does not guarantee that the analysis indeed terminates or produces any result; it does not say anything about the precision of the result. A sound analyzer may fail to terminate within a reasonable amount of time or may return imprecise results. Therefore, we have conducted some experiments to evaluate the precision and the efficiency of our analyzer. A more detailed description of this evaluation can be found in the dissertation of A. Maroneze [Mar14, § 6.3]. We compare our analyzer to two interval-based analyzers operating over C programs: a state-of-the-art industrial tool, Frama-C [Cuo+12], and an implementation of a value-range analysis [Nav+12], which will be referred to as Wrapped. Frama-C is an industrial-strength framework for static analysis, developed at CEA. It integrates an abstract interpretation-based inter-procedural value analysis on interval domains with congruence, k-sets and memory analysis. It operates over C programs and has a very deep knowledge of their semantics, allowing it to slice out undefined behaviors for more precise results. It currently does not handle recursive functions. The Wrapped value-range analyzer relies on the LLVM compiler infrastructure and operates over its intermediate representation to perform an interval analysis in a signedness-agnostic manner, using so-called “wrapped” intervals (hence its name) to deal with machine-integer issues such as overflows while retaining precision. It is an intra-procedural tool, but can benefit from LLVM’s inlining to perform inter-procedurally in the absence of recursion and function pointers. The three tools have been compared on significant C programs from the CompCert test suite. They range from a few dozen to a few thousand statements.
To relate information from different analyses, we annotated the programs to capture information on integer variables at function entries and exits and at loops (for iteration variables). This amounts to 545 annotations in the 20 programs considered. For each program point, we counted the number of bounded variables. We consider as bounded any variable whose inferred interval has no more than 2³¹ elements, and hence rule out useless intervals like x ∈ [−2³¹; 2³¹ − 2], inferred after a guard like x < y. Finally, to be able to compare the results of an inter-procedural analysis with those of two intra-procedural analyses with inlining, we considered for each annotation the union of the intervals of all call contexts. Less than 10% of the annotations present a union of different intervals, and among those several preserve the boundedness for all contexts. Overall, the impact of this union on the results is negligible. The results are shown in Figure 3.14, which displays the number of bounded variables per program and per analyzer. In total, Frama-C bounded 398 variables, our analyzer got 355, and Wrapped ended up with 305. The main differences between our analyzer and Frama-C, especially on the larger benchmarks (lzw, arcode and lzss), result from global variable tracking and congruence information. Such reasoning is not handled by our



Figure 3.14: Number of bounded variables per program and analyzer

analyzer. On the other hand, the precision of our product of signed and unsigned domains allows us to bound more variables (e.g., on fannkuch), where Wrapped also obtains a good score, mainly due to variables bounded as [0, 2³¹ − 1] and similar values. Some issues with the inlining used by Wrapped explain its worse results in fft, knucleotide and spectral. We also compared the execution times of the analyses. Overall, our analysis runs faster than Frama-C because we track less information, such as pointers and global variables. For programs without these features, both analyses run in roughly the same time, from a few tenths of a second for the smaller programs up to a few seconds for the larger ones. Wrapped’s analysis is faster than the others. On a larger benchmark (over 3,000 lines of C code and about 10,000 CFG instructions after inlining) our analysis took 34 seconds. It is hard to draw final conclusions about the precision of our tool from these experiments. Frama-C, for instance, is likely to perform better on specific industrial critical software for which it has been specially tuned. Nevertheless, we give evidence that our analyzer performs non-trivial reasoning over the C semantics, close to that of state-of-the-art (non-verified) tools.

3.6 Related Work and Conclusion

So as to be confident in the results of value analyzers, special care has to be given to the verification of such tools. In this chapter, we have described the design, implementation, and soundness proof in the Coq proof assistant of a value analyzer for a realistic language close to C: the CFG intermediate representation of the CompCert compiler. The current formalization of this analyzer is directly inspired by the design choices of the Astrée static analyzer [Cou+05], trying to capture some of its key interfaces. Our current abstract memory model is aligned with the model developed by Miné [Min06] because we connect a C abstract semantics with a generic notion of numerical abstract domain. Still, our treatment of memory is simplified since we only track values of local variables in the current implementation of our analyzer. This analyzer is also inspired by the static analyses devoted to a precise handling of signed and unsigned integers [SK07;


Nav+12]. The precision of the analysis has been experimentally evaluated and compared on several benchmarks. It performs comparably to existing off-the-shelf (unverified) tools, Frama-C [Cuo+12] and Wrapped [Nav+12].

Our work constitutes the first machine-checked proof of a nontrivial value analysis based on abstract interpretation, and a reference implementation of such a tool. Previous work on mechanized verification of static analyses has been mostly based on classic data flow frameworks: Klein and Nipkow instantiate this framework for inference of Java bytecode types [KN06]; Coupet-Grimal and Delobel [CD04] and Bertot et al. [BGL06] for compiler optimizations; and Cachera et al. [Cac+05] for control flow analysis. Vafeiadis et al. [VN11] rely on a simple data flow analysis to verify a fence elimination optimization for concurrent C programs. Compared to these prior works, our value analysis relies on fixpoint iterations that are accelerated with widening operators. Cachera and Pichardie [CP10] and Nipkow [Nip12] describe verified static analyses based on widenings, but their techniques are restricted to structured programs and target languages without machine arithmetic or pointers. Robert and Leroy [RL12] have developed a points-to analysis in the CompCert framework. This static analysis technique is quite orthogonal to what we formalize here. Hofmann et al. [HKS10] provide a machine-checked correctness proof in Coq for a generic post-fixpoint solver named RLD. That solver takes as input a set of constraints and finds a suitable solution by implementing an optimized dynamic iteration strategy. The formalized algorithm is not fully executable and cannot be extracted to OCaml code.

The proof effort required to establish the soundness of our analyzer has been mitigated by several design choices. We have adopted a lightweight abstract interpretation framework that requires few properties on the abstract domains. For instance, we do not prove that abstract domains enjoy a lattice structure, nor that the analysis always terminates. Following lessons learned from the CompCert development, we rely on a posteriori validation of complex computations: the Bourdoncle algorithm that computes the iteration strategy and the actual computation of the analysis result are seen as untrusted external solvers whose results are checked as part of the analysis. Finally, properties that are not critical (but still important, like the precision of the analysis result) are assessed experimentally rather than formally specified and proved.
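The a posteriori validation of the analysis result can be pictured by the following sketch, where leb is assumed to be the decidable order test of the abstract domain, botbind the bind of the +⊥ monad, and cfg_edges a hypothetical enumeration of the (source, transfer, target) triples of the program; only this checker, not the solver, needs to be proved correct.

```coq
(* Sketch: a candidate result res is accepted only if it is a
   post-fixpoint, i.e., for every edge of the control-flow graph,
   applying the abstract transformer to the abstract state of the
   source yields an abstract state below the one of the target. *)
Definition check_pfp (p: program) (res: node → abmem+⊥) : bool :=
  List.forallb
    (λ '(n, τ, s), leb (botbind τ (res n)) (res s))
    (cfg_edges p).
```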

This analyzer has some weaknesses that we will address in the next chapters. In particular, we will implement all the dotted boxes shown in Figure 3.1. First, we want to replace the current memory abstraction with a domain similar to Miné’s memory model [Min06]; this is the topic of Chapter 5. We also intend to connect more numerical domains to the interface for numerical environments, in particular relational abstract domains. Several such improvements will be discussed in Chapter 6. Meanwhile, we explore in Chapter 4 methods for verified static analysis of lower-level programming languages, in which the control-flow graph of programs is unknown to the analyzer and discovered as part of the analysis.

Chapter 4 Verified value analysis & self-modifying programs

The previous chapter has introduced a general methodology and techniques to build verified static analyzers. These techniques assumed that the programs to analyze are given as a control-flow graph. However, static analysis can be applied to various kinds of programs: it is generally applied to source code or some intermediate representation such as CFG, but it is sometimes more suitable to apply it to lower-level representations such as binary code. Indeed, no source code may be available, or the compiler might not be trusted; even a correct compiler may not preserve the property of interest (e.g., if this property is not functional); this is sometimes referred to as the wysinwyx phenomenon [BR10]. Static analysis techniques should be adapted when they are applied to binary programs, i.e., when the program to be analyzed comes in the form of a sequence of bytes and must first be disassembled. Disassembling is the process of translating a program from a machine-friendly binary format to a textual representation of its instructions. It requires decoding the instructions (i.e., understanding which instruction is represented by each particular bit pattern) but also precisely locating the instructions in memory. Indeed, instructions may be interleaved with data or arbitrary padding. Moreover, once encoded, instructions may have various byte sizes and may not be well aligned in memory, so that a single byte may belong to several instructions. The two main disassembling techniques are known as linear sweep and recursive traversal [LD03]. The first decodes each code segment from its beginning and assumes that each further instruction starts where the previous one ends; as it ignores the control flow of the program, it may unexpectedly decode data or padding or fail to identify the beginning of instructions.
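Linear sweep can be pictured by the following sketch, where decode_at is a hypothetical decoder returning an instruction together with its size in bytes; the fuel argument only serves to convince Coq of termination.

```coq
(* Sketch of linear sweep: decode from a starting point and assume
   every instruction starts where the previous one ends. *)
Fixpoint linear_sweep (fuel: nat) (m: mem) (pp: addr) :
    list (addr * instruction) :=
  match fuel with
  | O => nil
  | S fuel′ =>
      match decode_at m pp with
      | Some (i, sz) => (pp, i) :: linear_sweep fuel′ m (pp + sz)
      | None => nil (* not a valid instruction: stop *)
      end
  end.
```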
Recursive traversal tries to address these weaknesses by following the control flow rather than blindly following the linear ordering of the bytes in the binary: after decoding one instruction, disassembling continues at all possible successors of this instruction. This in turn suffers from the limitation that computed jumps have unknown targets, so this technique may fail to disassemble some parts of the code. To succeed in disassembling all code without performing a linear sweep from each byte, a static analysis must predict a sound but hopefully precise over-approximation of the targets of every reachable computed jump. In addition, instructions may be produced at run-time, as a result of the very execution of the program. A simple example is the modification of some operands (e.g., registers) of existing instructions. Another example is the creation of new sequences of instructions in an existing code. Such programs are called self-modifying programs; they are commonly used in security as an obfuscation technique (e.g., to protect the intellectual property of the program authors, or to increase the stealth of malware) [Szo05], as well as in just-in-time

compilation and in operating systems (mainly for improving performance). Because the instructions of a self-modifying program, as they appear in the binary, are not the instructions that will be executed, most standard reverse engineering tools (e.g., the IDA Pro disassembler and debugger, a de facto standard for the analysis of binary code) cannot disassemble and analyze self-modifying programs. In order to disassemble and analyze such programs, one must very precisely understand which instructions are written and where. And for all programs, one must check every single memory write to decide whether it modifies the program code. As the real code of a self-modifying program is hidden and varies over time, self-modifying programs are also beyond the scope of the vast majority of formal semantics of programming languages. Indeed, a prerequisite in such semantics is the isolation and the non-modification of code in memory. To address the challenge of formal reasoning on self-modifying code, Shao et al. [CSV07] propose a Hoare-logic-like framework, called GCAP, and illustrate its use and effectiveness on a dozen programs that manipulate their code at run-time. They can establish safety properties of all these programs. In this chapter, we formalize, with the Coq proof assistant, key static analysis techniques to predict the possible targets of the computed jumps and make precise which instructions alter the code and how, while ensuring that the other instructions do not modify the program. This allows automatic reasoning about self-modifying programs and automatically proving their safety. Our formalization effort is divided into three parts. Firstly, we formalize a small binary language (called Goto★) in which code is handled as regular mutable data. This language has few constructs, so as to limit the cost of its formal definition.
Nonetheless, it features properties that are inspired by real languages like x86 assembly: computed memory accesses and jumps, variable-length encoding of instructions, and conditional branches based on flags. We formally define an executable concrete semantics for this language and a non-trivial class of “safe” programs. Secondly, we formalize and prove correct an abstract interpreter that takes as input an initial memory state, computes an over-approximation of the reachable states that may be generated during the program execution, and then checks that all reachable states maintain memory safety. Good precision of the analyzer is achieved through specific techniques including a simple form of trace partitioning. Finally, we extract from our formalization an executable OCaml tool that we run on several self-modifying challenges [CSV07]. A short version of this chapter has been published at the international conference on Interactive Theorem Proving (ITP) [BLP14], and submitted for publication to the special issue for ITP 2014 of the Journal of Automated Reasoning (JAR).

4.1 Disassembling by Abstract Interpretation

We now present the main principles of our analysis on the program shown in Figure 4.1. It is printed as a sequence of bytes (on the extreme left) as well as under a disassembled form (on the extreme right) for readability purposes. This program, as we will see, is self-modifying, so these bytes correspond to the initial content of the memory from addresses 0 to 11. The remainder of the memory (addresses in [−2³¹; −1] ∪ [12; 2³¹ − 1]), as well as the content of the registers, is unknown and can be regarded as the program input. All our example programs target a machine operating over a low-level memory made


Initial program   Possible final program   Initial assembly listing
07000607          07000607                  0: cmp R6, R7
03000000          03000000                  1: gotoLE 5
00000005          00000004                  2:
00000000          00000000                  3: halt R0
00000100          00000100                  4: halt R1
09000000          09000000                  5: cst 4 → R0
00000004          00000004                  6:
09000002          09000002                  7: cst 2 → R2
00000002          00000002                  8:
05000002          05000002                  9: store R0 → *R2
04000000          04000000                 10: goto 1
00000001          00000001                 11:

Figure 4.1: A self-modifying program: as a byte sequence (left); after some execution steps (middle); assembly source (right).

of 2³² cells, eight registers, and flags — boolean registers that are set by comparison instructions. Each memory cell or register stores a 32-bit integer value¹, that may be used as an address in the memory. Programs are stored as regular data in the memory; their execution starts from address zero. Nevertheless, throughout this chapter we write the programs using the following custom syntax. The instruction cst v → r loads register r with the given value v. The instruction cmp r, r' denotes the comparison of the contents of registers r and r'. The instruction gotoLE d is a conditional jump to d, that is taken if in the previous comparison (cmp r, r') the content of r' was less than or equal to the one of r; goto d is an unconditional jump to d. The instructions load *r → r' and store r' → *r denote accesses to memory at the address given in register r; and halt r halts the machine with as final value the content of register r. The programming language we consider is inspired by x86 assembly; notably instructions have variable size (one or two bytes, e.g., the length of the instruction gotoLE 5 stored at line 1 is two bytes, the byte 03000000 for gotoLE and one byte for 5) and conditional jumps rely on flags. In this setting, a program is no more than an initial memory state, and a program point is simply the address of the next instruction to execute. In order to understand the behavior of this program, one can follow its code as it is executed starting from the entry point (byte 0). The first instruction cmp R6, R7 compares the (statically unknown) content of two registers. This comparison modifies only the states of the flags. Then, the gotoLE 5 instruction is executed and, depending on the outcome of this comparison, the execution proceeds either on the following instruction (stored at byte 3), or from byte 5.
Since the analysis cannot predict which branch will be taken, both branches must be analyzed. Executing the block from byte 5 will modify the byte 2 belonging to the gotoLE instruction (highlighted in Figure 4.1); more precisely it will change the jump destination from 5 to 4: the store R0 → *R2 instruction writes the content of register R0 (namely 4) in memory

¹ Each byte — the addressable unit of storage — is therefore four octets long.

at the address given in register R2 (namely 2). Notice that a program may directly read from or write to any memory cell: we assume that there is no protection mechanism as provided by usual operating systems. After the modification is performed, the execution jumps back to the modified instruction, jumps to byte 4 and then halts, with final value the content of register R1. This example highlights that the code of a program (or its control-flow graph) is not necessarily a static property of this program: it may vary as the program runs. To correctly analyze such a program, one must discover, during the fixpoint iteration, the two possible states of the goto instruction at program points 1 and 2 and its two possible targets (i.e., 4 and 5). More generally, we need at least to know, for each program point (i.e., memory location), which instructions may be decoded from there when the execution reaches this point. This in turn requires knowing which values the program operates on. We therefore devise a value analysis that computes, for each reachable program point (i.e., in a flow-sensitive way), an over-approximation of the content of the memory and the registers, and the state of the flags, when the execution reaches that point. The analysis relies on a numeric abstract domain 햭♯ that provides a representation for sets of machine integers and abstract arithmetic operations; γ햭 : 햭♯ → ℘(int) denotes the associated concretization function. Relying on such a numeric domain, one can build abstract transformers. They model the execution of each instruction over an abstract memory that maps locations (i.e., memory addresses² and registers) to abstract numeric values. An abstract state is then a mapping that attaches such an abstract memory to each program point of the program, and thus belongs to addr → ((addr + reg) → 햭♯).

To perform one abstract execution step, from a program point pp and an abstract memory state m♯ that is attached to pp, we first enumerate all instructions that may be decoded from the set γ햭(m♯(pp)). Then for each of such instructions, we apply the matching abstract transformer. This yields a new set of successor states whose program points are dynamically discovered during the fixpoint iteration. The abstract interpretation of a whole program iteratively builds an approximation by executing all reachable instructions until nothing new is learned. This iterative process may not terminate, since there might be infinite increasing chains in the abstract search space. As usual in abstract interpretation, we accelerate the iteration using widening operations [CC77]. Once a stable approximation is finally reached, an approximation of the program listing or control-flow graph can be produced. To illustrate this process, Figure 4.2 shows how the analysis of the program from Figure 4.1 proceeds. We do not expose a whole abstract memory but only the underlying control-flow graph it represents. On this specific example, three different graphs are encountered during the analysis. For each program point pp, we represent a node with the same name and link it with all the possible successor nodes according to the decoding of the set γ햭(m♯(pp)). The array shows the construction of the fixpoint: each line represents a program point and the columns represent the iterations of the analysis. In each array cell lies the name of the control-flow graph representing the abstract memory for the given program point during the given iteration; the symbol ⊥ stands for an unreachable program point. The shaded array cells highlight the program points that need to be analyzed: they are the worklist. Initially, at iteration 0, only program point 0 is known to be reachable and the memory
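One abstract execution step can thus be summarized by the following sketch, where abstract_decode_at is assumed to enumerate the instructions that may be decoded from γ햭(m♯(pp)) and ab_step to be the matching abstract transformer; both names are illustrative, not those of the actual development.

```coq
(* Sketch: from program point pp with abstract memory m♯, collect
   the successor states produced by every instruction that may be
   decoded at pp. *)
Definition abstract_step (pp: addr) (m♯: ab_mem) : list (addr * ab_mem) :=
  List.flat_map (λ i, ab_step pp i m♯) (abstract_decode_at m♯ pp).
```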

2 Type addr is a synonym of int, the type of machine integers.


is known to exactly contain the program, denoted by the first control-flow graph (called a in Figure 4.2 and corresponding to the initial program of Figure 4.1).

[Figure 4.2 shows, for each program point 0 to 11 (rows) and each iteration of the analysis (columns), which control-flow graph (a, b or c) the abstract memory at that point represents, or ⊥ for an unreachable point; the three graphs a, b and c are drawn alongside the table.]

Figure 4.2: Iterative fixpoint computation

The only successor of point 0 is point 1, and it is updated at the next iteration. After a few iterations, point 9 is reached and the abstract control-flow graph a is updated into the control-flow graph b, which is propagated to point 10. This control-flow graph corresponds to the possible final program of Figure 4.1, in which program point 5 became unreachable. At the next iteration, program point 1 (i.e., the loop condition) is reached again and the control-flow graph b is updated into the control-flow graph c, which corresponds to the union of the two previous control-flow graphs. After a few more iterations, the process converges. In addition to a control-flow graph or an assembly listing, more properties can be deduced from the analysis result. We can prove safety properties about the analyzed program, such as the fact that its execution is never stuck. Since the semantics only defines the good behaviors of programs, unsafe programs reach states that are not final and from which no further execution step is possible (e.g., the byte sequence at the current program point is not the valid encoding of an instruction). The analysis produces an over-approximation of the set of reachable states. In particular, a superset of the reachable program points is computed, and for each of these program points, an over-approximation of the memory state when the execution reaches this program point is available.
Thus we can check that for every program point that may be reached, the next execution step from this point cannot be stuck. This verification procedure is formally verified, as described in the following section.

4.2 Semantics of Goto★

This section defines the abstract syntax and semantics of the low-level language our static analyzer operates over. The semantics uses a decoding function from binary code to our low-level language. The semantics is presented as a small-step operational semantics that can observe self-modifying programs.

4.2.1 Abstract Syntax


Definition addr := int. Inductive reg := R0 | R1 | R2 | R3 | R4 | R5 | R6 | R7. Inductive flag := FLE | FLT | FEQ.

Inductive comparison := Ceq | Cne | Clt | Cle | Cgt | Cge. Inductive binop := OpAdd | OpSub | OpMul | OpDivs | OpShl | OpShr | OpShru | OpAnd | OpOr | OpXor | OpCmp (c: comparison) | OpCmpu (c: comparison).

Inductive instruction := (* arithmetic *) | ICst (v:int) (dst:reg) | ICmp (src dst: reg) | IBinop (op: binop) (src dst: reg) (* memory *) | ILoad (src dst: reg) | IStore (src dst: reg) (* control *) | IGoto (tgt: addr) | IGotoInd (r: reg) | IGotoCond (f: flag) (tgt: addr) | ISkip | IHalt (r: reg).

Figure 4.3: Abstract syntax of Goto★

The programming language in which the programs to analyze are written is formalized using the abstract syntax shown in Figure 4.3. In the Coq formalization, the abstract syntax is presented as inductive data types. Machine integers (type int) are those of the CompCert library Int of 32-bit machine integers [BL09]. The eight registers of our language are called R0,… R7 and there are three flags called FLE (for “less or equal” comparisons), FLT and FEQ. Instructions are either arithmetic expressions, memory accesses or control-flow instructions. Instructions for accessing memory are ILoad and IStore; their operands are registers. So as to keep the language simple, memory accesses are limited to these two instructions: the other instructions, which are described next, only operate on registers. Arithmetic expressions consist of integer constants, signed comparisons and binary operations. Control-flow instructions consist of unconditional and conditional jump instructions, the empty instruction ISkip and the IHalt instruction which halts the program execution. For unconditional jumps, we distinguish register-indirect jumps (IGotoInd r instructions, where r is a register) from other jumps (i.e., absolute jumps, written as IGoto v, where v is a literal constant address). In a binary language, there is no distinction between code and data: a value stored in memory can be interpreted either as representing data or as encoding instructions. So as to model a binary language, we first introduce a decoding function called decode_from. Its type is (addr → int) → addr → option (instruction × nat). Given a memory mem of type addr → int (i.e., a function from addresses to values) and an address pp of type addr, this function yields the instruction stored at this address along with its byte size (so as to know where the next instruction begins). This size is of type nat, the Coq type for natural numbers.
Since not all integer sequences are valid encodings, this decoding may fail (hence the option type). In order to be able to conveniently write programs, there is also a matching encoding function called enc, whose type is instruction → list int. However, the development does not depend on it at all: properties are stated in terms of already encoded programs.


Definition decode_register (v: Z) : option register :=
  match v with | 0 => ⌊R0⌋ | 1 => ⌊R1⌋ | … | 7 => ⌊R7⌋ | _ => None end.

Definition decode_flag (v: Z) : option flag :=
  match v with | 0 => ⌊FLE⌋ | 1 => ⌊FLT⌋ | 2 => ⌊FEQ⌋ | _ => None end.

Definition decode_binop (v: Z) : option binop :=
  match v with | 0 => ⌊OpAdd⌋ | 1 => ⌊OpSub⌋ | … | 9 => ⌊OpXor⌋ | _ => None end.

Definition split_instruction (v: int) : Z × Z × Z × Z := let v := Int.unsigned v in let (v, dst) := Z.div_eucl v 256 in let (v, src) := Z.div_eucl v 256 in let (v, arg) := Z.div_eucl v 256 in let (v, typ) := Z.div_eucl v 256 in (typ, arg, src, dst).

Definition decode_from (m: addr → int) (base: addr) : option (instruction × nat) :=
  match split_instruction (m base) with
  | (0, 0, src, 0) => do rs <- decode_register src; ⌊(IHalt rs, 1)⌋
  | (1, 0, 0, 0) => ⌊(ISkip, 1)⌋
  | (2, 0, src, 0) => do rs <- decode_register src; ⌊(IGotoInd rs, 1)⌋
  | (3, flg, 0, 0) => do f <- decode_flag flg; ⌊(IGotoCond f (m (base+1)), 2)⌋
  | (4, 0, 0, 0) => ⌊(IGoto (m (base+1)), 2)⌋
  | (5, 0, src, dst) => do rs <- decode_register src;
                        do rd <- decode_register dst; ⌊(IStore rs rd, 1)⌋
  | (6, 0, src, dst) => do rs <- decode_register src;
                        do rd <- decode_register dst; ⌊(ILoad rs rd, 1)⌋
  | (7, 0, src, dst) => do rs <- decode_register src;
                        do rd <- decode_register dst; ⌊(ICmp rs rd, 1)⌋
  | (8, o, src, dst) => do op <- decode_binop o;
                        do rs <- decode_register src;
                        do rd <- decode_register dst; ⌊(IBinop op rs rd, 1)⌋
  | (9, 0, 0, dst) => do rd <- decode_register dst; ⌊(ICst (m (base+1)) rd, 2)⌋
  | _ => None
  end.

Figure 4.4: Decoding binary code


Definition flag_state : Type := flag → bool. Definition register_state : Type := register → int. Definition memory : Type := addr → int.

Record machine_config : Type :=
  { pc : int; mc_flg : flag_state; mc_reg : register_state; mc_mem : memory }.

Inductive machine_state := Halted (v: int) | Running (c: machine_config).

Notation "⌈ v ⌉" := (Halted v).
Notation "⟨ pp , f , r , m ⟩" :=
  (Running {| pc := pp; mc_flg := f; mc_reg := r; mc_mem := m |}).

Figure 4.5: Concrete execution states of Goto★

The decoding function is defined in Figure 4.4. The binary decoding of a sequence of bytes stored in memory m at program point pp is written (decode_from m pp). A successful decoding yields a pair ⌊(i, sz)⌋, where sz is the size of i, the instruction stored at address pp in m³. The notation ⌊·⌋ denotes a successful result, as opposed to a failure, represented by None. The binary format of instructions is arbitrary and has little impact on the design of the analyzer. We rely on the fact that the encoding length can be inferred from the first byte of any encoded instruction⁴. The self-modifying programs that we consider rely on the particular encoding that we chose. This encoding works as follows. Instructions that hold a value (e.g., IGoto 5) require two bytes: the value occupies the second byte; other instructions require one byte. The first byte is made of four fields of one octet each: the decoding of this first byte first extracts the content of each field using Euclidean divisions (performed by the split_instruction function). The first field (typ) corresponds to the constructor of the instruction data type. From its value, one can deduce the size of the instruction and how to interpret the next fields. The second field (flg) holds a flag (only used in the IGotoCond instruction). The third and fourth fields hold the source and destination registers, respectively. Depending on the instruction, none, both, or only one of them may be relevant. Unused fields always have the value zero. This encoding is very sparse: many byte sequences do not represent any valid instruction. Moreover, in the decoding function, errors are propagated by the bind operator of the error monad, written do a <- m; b.
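The field extraction described above can be re-modeled in a few lines of Python (a hypothetical sketch, not the thesis's Coq code; the helper encode_first_word is invented for illustration, since the development's enc function is not shown in this chapter):

```python
# Hypothetical Python re-model of split_instruction: four 8-bit fields are
# peeled off the first word by repeated Euclidean division by 256,
# least-significant field (dst) first.

def split_instruction(v):
    v, dst = divmod(v, 256)
    v, src = divmod(v, 256)
    v, arg = divmod(v, 256)
    v, typ = divmod(v, 256)
    return typ, arg, src, dst

def encode_first_word(typ, arg=0, src=0, dst=0):
    # Inverse of split_instruction, for illustration only.
    return ((typ * 256 + arg) * 256 + src) * 256 + dst
```

For instance, an IBinop word with typ = 8, operator field 0, source R1 and destination R2 round-trips through these two functions.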

4.2.2 Semantics

The language semantics is given as a small-step transition relation between machine states (see Figure 4.5). A machine state may be ⟨pp, f, r, m⟩, where pp is the current program point (the address of the next instruction to be executed), f is the current flag state, r is the current register state, and m is the current memory.

3 The size cannot be deduced from the instruction since: 1. the encoding function is not known; and 2. there may be several encodings, of various sizes, for a single instruction.
4 This is not the case, for instance, of the encoding of x86 instructions, which may begin with an arbitrary number of one-byte prefixes.


Definition compare (i j: int) (f: flag) : bool := match f with | FLE => negb (Int.lt j i) | FLT => Int.lt i j | FEQ => Int.eq i j end.

Definition step (ms: machine_state) : option machine_state :=
  match ms with
  | ⟨ pp, f, r, m ⟩ =>
      do i_sz <- decode_from m pp;
      let (i, sz) := i_sz in
      ⌊match i with
       | ICst v rd => ⟨pp+sz, f, r#rd ← v, m⟩
       | ICmp rs rd => ⟨pp+sz, compare (r rd) (r rs), r, m⟩
       | IBinop op rs rd => ⟨pp+sz, f, r#rd ← (r rs [op] r rd), m⟩
       | ILoad rs rd => ⟨pp+sz, f, r#rd ← m (r rs), m⟩
       | IStore rs rd => ⟨pp+sz, f, r, m # r rd ← r rs⟩
       | IGoto v => ⟨v, f, r, m⟩
       | IGotoCond c v => ⟨if f(c) then v else pp+sz, f, r, m⟩
       | IGotoInd rd => ⟨r(rd), f, r, m⟩
       | ISkip => ⟨pp+sz, f, r, m⟩
       | IHalt rd => ⌈r rd⌉
       end⌋
  | _ => None
  end.

Definition small_step : relation machine_state := λ a b, step a = Some b.
Infix "⇒" := small_step.

Figure 4.6: Concrete semantics

Such a tuple is called a machine configuration (type machine_config). Otherwise, a machine state is ⌈v⌉, meaning that the program stopped, returning the value v. Values are machine integers (type int). The semantics is defined in Figure 4.6 by a partial function step from machine states to machine states. This function is then lifted to the small_step relation. A step can only be performed from a running configuration ⟨pp, f, r, m⟩. From such a configuration, the contents of the memory m from address pp is first decoded. If this decoding fails, the execution is stuck. Otherwise, the decoding yields a pair made of an instruction i and its size sz. Then, depending on the actual instruction, the step function builds the state resulting from its execution. In each case, most of the state is kept unchanged. Instructions that are not branching proceed at program point pp+sz (since sz is the size of the instruction that has been decoded). In the code, the notation s # id ← v stands for the update of s with a new value v for register or memory cell id; and the notation [op] stands for the denotation of the binary operator op on machine integers. Instruction ICst v rd updates destination register rd with value v. Instruction ICmp rs rd updates the flag state according to the comparison (compare) of the values held by the two involved registers. Instruction IBinop op rs rd applies the denotation [op] of the given binary operator op to the contents r(rs) and r(rd) of registers rs and rd. Then, it updates the

state of register rd: in r, the new value of rd thus becomes (r rs) [op] (r rd). Instruction ILoad rs rd updates register rd with the value m(r(rs)) found in memory at the address given in register rs. Instruction IStore rs rd updates the memory at the address given in register rd with the value given in register rs. Instruction IGoto v sets the program point to v. Instruction ISkip does nothing: execution proceeds at the next program point. The conditional jump instruction IGotoCond c v jumps to address v or falls through to pp+sz, depending on the current state of flag c. The indirect jump instruction IGotoInd rd proceeds at the program point found in register rd. Instruction IHalt rs terminates the execution, returning the content of register rs. Finally, we define the semantics ⟦P⟧ of a program P as the set of states that are reachable from an initial state ⟨0, f, r, P⟩, with current program point zero and memory P (where ⇒⋆ denotes the reflexive-transitive closure of the small-step relation):

⟦P⟧ = { 푠 ∣ ∃ f r, ⟨0, f, r, P⟩ ⇒⋆ 푠 }.

Notice that the program P belongs to the state: it is initially known, but can be modified as the execution goes on.
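To make this semantics concrete, here is a hypothetical Python model of a small subset of Goto★ (only IHalt, IGoto, IStore and ICst), using the same first-word field layout as split_instruction; the self-modifying demo program at the end is invented for illustration and is not one of the thesis's examples:

```python
# Sketch of a Goto★-like machine: memory maps addresses to words; a word is
# decoded on the fly, so a store can rewrite an instruction before it runs.

def split_instruction(v):
    v, dst = divmod(v, 256)
    v, src = divmod(v, 256)
    v, arg = divmod(v, 256)
    v, typ = divmod(v, 256)
    return typ, arg, src, dst

def enc(typ, arg=0, src=0, dst=0):
    return ((typ * 256 + arg) * 256 + src) * 256 + dst

def step(pp, regs, mem):
    """One small step; returns ('run', ...), ('halt', v), or None (stuck)."""
    typ, arg, src, dst = split_instruction(mem.get(pp, 0))
    if typ == 0 and arg == 0 and dst == 0:          # IHalt src
        return ('halt', regs[src])
    if typ == 4:                                    # IGoto (mem[pp+1])
        return ('run', mem.get(pp + 1, 0), regs, mem)
    if typ == 5:                                    # IStore src dst
        return ('run', pp + 1, regs, {**mem, regs[dst]: regs[src]})
    if typ == 9:                                    # ICst (mem[pp+1]) dst
        return ('run', pp + 2, {**regs, dst: mem.get(pp + 1, 0)}, mem)
    return None                                     # invalid encoding: stuck

def run(mem, fuel=50):
    state = ('run', 0, {r: 0 for r in range(8)}, dict(mem))
    while fuel > 0 and state is not None and state[0] == 'run':
        state = step(*state[1:])
        fuel -= 1
    return state

# Self-modifying demo: cell 9 initially encodes a looping IGoto, but the
# IStore at address 6 overwrites it with 0 = enc(IHalt R0) before the IGoto
# at address 7 jumps there.
program = {
    0: enc(9, dst=1), 1: enc(0),      # ICst (enc IHalt R0) R1
    2: enc(9, dst=2), 3: 9,           # ICst 9 R2 (address to patch)
    4: enc(9, dst=0), 5: 42,          # ICst 42 R0 (the return value)
    6: enc(5, src=1, dst=2),          # IStore R1 R2: mem[9] := enc(IHalt R0)
    7: enc(4), 8: 9,                  # IGoto 9
    9: enc(4),                        # overwritten at run time
}
```

Running this program halts with value 42 only because the store at address 6 patches address 9 before control reaches it, which is precisely the behavior a static analyzer must account for.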

4.3 Abstract Interpreter

The static analysis that we devise in this chapter is parameterized by an abstract memory domain (of type ab_mc) and an abstract numerical domain (of type int♯). The interface of the memory abstract domain, as well as its implementation, is presented in Section 4.3.1. The numerical domain specification is the same as the (non-relational) one from the previous chapter (§ 3.3.2). This memory domain interface enables the definition of an abstract semantics that is discussed in Section 4.3.2. The analysis is flow-sensitive: it computes an abstract state for each (reachable) program point. It also computes a numerical approximation of the final value (produced by instruction IHalt). Thus, the result of the analysis has the following type.

Definition AbEnv : Type := (Map [ addr, ab_mc ] * int♯+⊥).

The iterative computation of the analysis result, which will be described in Section 4.3.3, follows a different algorithm than in the previous chapter. Indeed, the control-flow graph of the program is not known until the analysis completes; the dependencies between the abstract states at different program points are discovered during the analysis. A pre-computation of an iteration strategy is therefore not possible.

4.3.1 Memory Abstract Domain

An abstract memory domain is a carrier type along with some primitive operators whose signatures are given in Figure 4.7. The carrier type ab_mc is equipped with a lattice structure (defined in § 3.1.1). An object of this type represents a set of triples flag-state × register-state × memory, as described by the primitive as_gamma. Such a triple ultimately represents any machine configuration with matching components, at any program point (see gamma_to_mc). A memory domain can be queried for the values stored in some register (var) or at some known address (load_single); these operators return an abstract numeric value.


Definition pre_machine_config : Type := flag_state × register_state × memory.

Instance gamma_to_mc A (G: gamma_op A pre_machine_config) : gamma_op A machine_config :=
  λ a mc, (mc_flg mc, mc_reg mc, mc_mem mc) ∈ γ(a).

(* [mem_dom] is parameterized by a numerical domain. *)
Context (int♯: Type) (ab_num: num_dom int♯).

Record mem_dom (ab_mc: Type) : Type :=
{ (* abstract domain with concretization to machine configurations *)
  as_wl: weak_lattice ab_mc
; as_gamma : gamma_op ab_mc pre_machine_config
; as_adom : adom ab_mc machine_config as_wl as_gamma
  (* consult the contents of a register *)
; var: ab_mc → reg → int♯
; var_sound: ∀ ab: ab_mc, ∀ m: machine_config, m ∈ γ(ab) →
    ∀ r, mc_reg m r ∈ γ(var ab r)
  (* consult the contents of the memory at a particular address *)
; load_single: ab_mc → addr → int♯
; load_sound: ∀ ab: ab_mc, ∀ m: machine_config, m ∈ γ(ab) →
    ∀ a: addr, mc_mem m a ∈ γ(load_single ab a)
  (* update the contents of the memory at a particular address *)
; store_single: ab_mc → addr → int♯ → ab_mc
; store_sound: ∀ ab: ab_mc, ∀ dst v,
    Store (γ ab) dst v ⊆ γ(store_single ab dst v)
  (* update the flag state *)
; compare: ab_mc → register → register → ab_mc
; compare_sound: ∀ ab: ab_mc, ∀ rs rd,
    Compare (γ ab) rs rd ⊆ γ(compare ab rs rd)
  (* update the contents of a register *)
; assign: ab_mc → reg → int♯ → ab_mc
; assign_sound: ∀ ab: ab_mc, ∀ rd v,
    Assign (γ ab) rd v ⊆ γ(assign ab rd v)
  (* assume a flag is in a known state *)
; assume: ab_mc → flag → bool → ab_mc+⊥
; assume_sound: ∀ ab: ab_mc, ∀ f b,
    Assume (γ ab) f b ⊆ γ(assume ab f b)
  (* abstraction of a fragment of a concrete memory *)
; init: memory → list addr → ab_mc
; init_sound: ∀ (m: memory) (dom: list addr) f r (m′: memory),
    (∀ a, List.In a dom → m a = m′ a) → (f, r, m′) ∈ γ(init m dom)
}.

Figure 4.7: Signature of abstract memory domains for analysis of Goto★


Other operators enable us to alter an abstract state, like assign, which sets the contents of a register to a given abstract numeric value, and store_single, which similarly updates the memory at a given address. The operator compare updates the abstract counterpart of the flag state when two given registers are compared. We can also use the operator assume when we know the boolean value of a flag. This operator is a reduction. It is always sound to return the same abstract state as the first argument, but more precise information may prove precious when reaching a conditional branch. Indeed, our first implementation of this operator is the identity; we then refine it in our first extension (§ 4.4.2). The operator init is used when initializing the abstract interpreter with an abstraction of the initial memory. Part of the initial memory is exactly known: the initial program text, static data and so on. Therefore the init operator gets a list dom of addresses and a function m that gives the values of the actual initial memory m′ at these addresses. All these operators obey some specifications. As an example, the load_sound property states that given a concrete state m in the concretization of an abstract state ab, the concrete value stored at any address a in m is over-approximated by the abstract value returned by the matching abstract load. The γ symbol is overloaded through the use of type classes: its first occurrence refers to the concretization from the abstract memory domain (the as_gamma field of record mem_dom) and its second occurrence is the concretization from the numeric domain ab_num. This signature is similar to the one of the previous chapter (Figure 3.11 on page 41) as it features a lattice structure and basic operators to model the programming language instructions. The main difference is that operators do not take expressions as arguments but values directly.
Consider for instance the load_single operator: it takes a concrete address as argument, whereas the assign operator of Figure 3.11 receives an expression (and that expression may contain loads). This prevents the implementation from reasoning symbolically about addresses; such reasoning would be mandatory to deal with dynamically allocated memory (either with a system call like mmap or VirtualAlloc, or from a memory manager that is part of the program) but is beyond the scope of this work.

Such an abstract memory domain is implemented using two maps: one from registers to abstract numeric values, to represent the register state, and one from addresses to abstract numeric values, to represent the memory.

Record ab_machine_config :=
  { ab_reg: Map [ reg, int♯ ]; ab_mem: Map [ addr, int♯ ] }.

To prevent the domain of the ab_mem map from growing infinitely, we bound it by a finite set computed before the analysis: the analysis will try to compute some information only for the memory addresses found in this set. The content of this set does not alter soundness: the values stored at addresses not in it are unknown and the analyzer makes no assumptions about them. On the other hand, the success of the analysis and its precision depend on it. In particular, the analyzed set must cover the whole code segment. To compute this set, one possible method [BR10] is to start from an initial guess and, every time the analysis discovers that the set is too small (when it infers that control may reach a point that is not in the set), to restart the analysis using a larger set. In practice, for all our examples, running the analysis once was enough, taking as initial guess the addresses of the instructions of the initial program.
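The bounded-tracking idea can be sketched as follows (a hypothetical Python model with invented names, not the Coq development; abstract values are modeled as finite sets of ints, with None standing for TOP):

```python
# The abstract memory tracks a fixed finite set of addresses; anything
# outside that set is TOP, i.e., "any value". This keeps the map finite
# while remaining sound.

TOP = None

class AbMachineConfig:
    def __init__(self, tracked_addrs):
        self.tracked = frozenset(tracked_addrs)
        self.ab_mem = {}                 # addr -> set of possible values

    def store_single(self, addr, ab_val):
        # Writes outside the tracked set are discarded: this is sound,
        # because untracked cells are always read back as TOP.
        if addr in self.tracked:
            self.ab_mem[addr] = ab_val

    def load_single(self, addr):
        if addr not in self.tracked:
            return TOP
        return self.ab_mem.get(addr, TOP)  # tracked but never written: TOP
```

Precision is lost only outside the tracked set, which is why that set must cover the whole code segment.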


Definition load_many (m: ab_mc) (a: int♯) : int♯+⊥ :=
  match concretize a with
  | Just addr_set =>
      IntSet.fold
        (λ acc addr, acc ⊔ NotBot (T.(load_single) m addr))
        addr_set Bot
  | All => NotBot top
  end.

Figure 4.8: Example of abstract transformer

4.3.2 Abstract Semantics

As a second layer, we build abstract transformers over any such abstract domain. Consider for instance the abstract load called load_many and presented in Figure 4.8; it is used to analyze any ILoad instruction (T denotes a record of type mem_dom int♯ ab_mc). The source address may not be exactly known, but only represented by an abstract numeric value a. Since any address in γ(a) may be read, we have to query all of them and take the least upper bound of all values that may be stored at any of these addresses: ⨆ { T.(load_single) m x ∣ x ∈ 훾(a) }. However, the set of concrete addresses may be huge and care must be taken: if the size of this set exceeds some threshold, the analysis gives up on this load and yields top, representing all possible values.
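A minimal Python sketch of this transformer (invented names; abstract ints are finite sets, None is TOP, and the threshold value is an assumption for illustration) could read:

```python
# Sketch of load_many: join the values stored at every concrete address the
# abstract address may denote, or give up beyond a size threshold.

THRESHOLD = 16  # assumed cut-off; the real analyzer's threshold may differ

def load_many(load_single, addr_abs):
    # addr_abs is None (TOP: any address) or a finite set of addresses.
    if addr_abs is None or len(addr_abs) > THRESHOLD:
        return None                      # give up: result is TOP
    result = set()                       # the empty set plays the role of Bot
    for a in sorted(addr_abs):
        v = load_single(a)
        if v is None:
            return None                  # one cell may hold anything
        result |= v                      # join = set union
    return result
```

The threshold trades precision for speed: a load through a very imprecise address would otherwise force thousands of single-cell queries for a result that is almost top anyway.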

We build enough such abstract transformers to be able to analyze any instruction (function ab_post_single, shown in Figure 4.9). This function returns a list of possible next states, each of which is either Hlt v (the program halts, returning a value approximated by v), Run pp m (the execution proceeds at program point pp in a configuration approximated by m), or GiveUp (the analysis is too imprecise to compute anything meaningful).

The computed jump (IGotoInd) also has a dedicated abstract transformer (inlined in Figure 4.9): in order to know from where to continue the analysis, we have to enumerate all possible targets; the assign operation in each branch refines the knowledge about the value of the branching register rs. The abstract transformer for the conditional jump IGotoCond f v returns a two-element list. The first element means that the execution may proceed at pp + sz (i.e., fall through) in a state where the branching flag f is known to evaluate to false; the second element represents the case when the branch is taken: the flag is known to evaluate to true, and the next program point, v, is the one given in the instruction. Since each assume may return ⊥, meaning that the considered branch cannot be taken, we use the combinator bot_cons, which propagates this information: the returned list does not contain the unreachable states.
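The pruning performed by bot_cons can be sketched in Python (a hypothetical model with invented names; BOT marks an unreachable branch):

```python
# Sketch of bot_cons and the conditional-jump transformer: a branch whose
# assume returns BOT is simply dropped from the successor list.

BOT = object()

def bot_cons(f, a, tail):
    return tail if a is BOT else [f(a)] + tail

def goto_cond_successors(assume, m, flag, target, fallthrough):
    # Fall-through successor (flag = False) first, then branch-taken (True).
    return bot_cons(lambda ab: ('run', fallthrough, ab),
                    assume(m, flag, False),
                    bot_cons(lambda ab: ('run', target, ab),
                             assume(m, flag, True), []))

# An abstract state in which the flag is known to be True: assuming it False
# yields BOT, so only the branch-taken successor remains.
assume_true = lambda m, f, b: m if b else BOT
```

With an uninformative assume (the identity in both cases), both successors are produced, matching the two-element list described above.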

Then, function ab_post_many performs one execution step in the abstract. To do so, we first need to identify the next instruction, i.e., to decode, in the abstract memory, from the current program point (function abstract_decode_at, not shown). This may require enumerating all concrete values that may be stored at this address. Therefore this abstract decoding either returns a set of possible next instructions or gives up. In the latter case, the whole analysis aborts, since the analyzed program is unknown.


Inductive ab_post_res := Hlt (v: int♯) | Run (pp: addr) (m: ab_mc) | GiveUp.

Definition bot_cons A B (f: A → B) (a: A+⊥) (b: list B) : list B :=
  match a with NotBot a′ => f a′ :: b | Bot => b end.

Definition ab_post_single (m: ab_mc) (pp: addr) (instr: instruction × nat) : list ab_post_res :=
  match instr with
  | (IHalt rs, sz) => Hlt (T.(var) m rs) :: nil
  | (ISkip, sz) => Run (pp + sz) m :: nil
  | (IGoto v, sz) => Run v m :: nil
  | (IGotoInd rs, sz) =>
      match concretize (T.(var) m rs) with
      | Just tgt =>
          IntSet.fold
            (λ acc addr, Run addr (assign m rs (const_int addr)) :: acc)
            tgt nil
      | All => GiveUp :: nil
      end
  | (IGotoCond f v, sz) =>
      bot_cons (Run (pp + sz)) (T.(assume) m f false)
        (bot_cons (Run v) (T.(assume) m f true) nil)
  | (IStore rs rd, sz) =>
      Run (pp + sz) (store_many m (T.(var) m rd) (T.(var) m rs)) :: nil
  | (ILoad rs rd, sz) =>
      match load_many m (T.(var) m rs) with
      | NotBot v => Run (pp + sz) (T.(assign) m rd v) :: nil
      | Bot => nil
      end
  | (ICmp rs rd, sz) => Run (pp + sz) (T.(compare) m rs rd) :: nil
  | (ICst v rd, sz) => Run (pp + sz) (T.(assign) m rd v) :: nil
  | (IBinop op rs rd, sz) =>
      match T.(forward_int_binop) op (T.(var) m rs) (T.(var) m rd) with
      | NotBot v => Run (pp + sz) (T.(assign) m rd v) :: nil
      | Bot => nil
      end
  end.

Definition ab_post_many (pp: addr) (m: ab_mc) : list ab_post_res :=
  match abstract_decode_at pp m with
  | Just instr => flat_map (ab_post_single m pp) instr
  | All => GiveUp :: nil
  end.

Figure 4.9: Abstract small-step semantics of Goto★


Record analysis_state :=
{ worklist : list int (* list of program points remaining to visit *)
; result_fs : Map [ int, ab_mc ] (* one value per program point; unbound values are ⊥ *)
; result_hlt: d+⊥ (* final value *)
}.

Definition analysis_init P : analysis_state :=
  {| worklist := Int.zero :: nil
   ; result_fs := ([])[ Int.zero <- P ]
   ; result_hlt := Bot |}.

Figure 4.10: Internal state of the Goto★ analyzer

4.3.3 Fixpoint Computation

Finally, a full program analysis is performed by applying this abstract semantics iteratively. The analysis follows a worklist algorithm like the one found in [BR10, § 3.4]. It maintains a state holding three pieces of data (see Figure 4.10):

1. the worklist, a list of program points left to explore; initially a singleton;

2. the current solution, mapping each program point to an abstract machine configuration; initially empty, except at program point zero, where it holds an abstraction of the program;

3. an abstraction of the final value, initially ⊥.

A single step of analysis is performed by the function analysis_step shown in Figure 4.11. It picks a node n in the worklist (unless the worklist is empty, in which case the analysis is over) and retrieves the abstract configuration associated with this program point in the current state (function bot_get finds the value bound to a key in a map and returns Bot if the key is not bound). The abstract semantics is then applied to this configuration; it yields a list next of outcomes (see Figure 4.9) that are then propagated to the analysis state (function propagate). If an outcome is GiveUp, the whole analysis aborts. Otherwise, if it is Run n′ ab, meaning that ab describes reachable configurations at program point n′, this abstract configuration is joined with the one previously associated with that program point (function bot_set updates a binding in a map and ensures that no key is ever bound to the Bot value). If something new is learned, the program point n′ is pushed onto the worklist. If the outcome is Hlt res, the abstraction of the final value is updated similarly. Since there may be infinite ascending chains, we need, so as to ensure termination, to apply widening operators instead of regular joins frequently enough during the search. Therefore the analysis is parameterized by a widening strategy that decides along which edges of the control-flow graph widening should be applied instead of a plain join. The implementation makes it easy to try different strategies. The one we implemented mandates a widening on every edge from a program point to a smaller one. The analysis repeatedly applies the analysis step until the worklist is empty (see the bottom of Figure 4.11). So as to ensure that the analysis indeed terminates, we rely on a counter (known as fuel) that decreases at each iteration; when it reaches zero, the analyzer must give up.
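The core of this worklist iteration can be sketched in a few lines of Python (a hypothetical model with invented names: abstract values are finite sets, join is union, and the order is inclusion; widening and fuel are omitted for brevity, so this sketch only terminates on finite domains):

```python
# Sketch of the worklist algorithm: successors are discovered on the fly by
# `post`, which plays the role of ab_post_many over an unknown control flow.

def analyze(post, init_pp, init_ab):
    result = {init_pp: init_ab}       # missing keys play the role of Bot
    worklist = [init_pp]
    while worklist:
        n = worklist.pop()
        for n2, ab in post(n, result[n]):
            old = result.get(n2, set())
            new = old | ab                    # join
            if not new <= old:                # something new was learned
                result[n2] = new
                worklist.append(n2)
    return result

# A two-point "program" with a loop 0 -> 1 -> 0; the edge into point 1 adds
# the value 1 to the abstract state.
post = lambda n, ab: [(1, ab | {1})] if n == 0 else [(0, ab)]
```

Starting from {0} at point 0, the loop stabilizes with {0, 1} at both points after the second visit of point 0.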


Definition analysis_step (E: analysis_state) : analysis_state+⊤ :=
  match E.(worklist) with
  | nil => Just E (* fixpoint reached *)
  | n :: w =>
      match bot_get E.(result_fs) n with
      | NotBot ab_mc =>
          let next := ab_post_many n ab_mc in
          List.fold_left
            (λ acc res, do E′ <- acc; propagate (widen_oracle n res) E′ res)
            next
            (Just {| worklist := w
                   ; result_fs := E.(result_fs)
                   ; result_hlt := E.(result_hlt) |})
      | Bot => All
      end
  end.

Definition propagate (widenp: bool) (E: analysis_state) (n: ab_post_res) : analysis_state+⊤ :=
  match n with
  | GiveUp => All
  | Run n′ ab =>
      let old := bot_get E.(result_fs) n′ in
      let new := (if widenp then widen else join) old (NotBot ab) in
      if new ⊑ old then Just E
      else Just {| worklist := push n′ E.(worklist)
                 ; result_fs := bot_set E.(result_fs) n′ new
                 ; result_hlt := E.(result_hlt) |}
  | Hlt res => (* similar case, not shown. *)
  end.

Fixpoint analysis_loop (fuel: nat) (E: analysis_state) : analysis_state+⊤ :=
  match fuel with
  | O => Just E
  | S fuel′ =>
      do E′ <- analysis_step E;
      if is_final E′ then Just E′ else analysis_loop fuel′ E′
  end.

Definition analysis (P: memory) (dom: list int) fuel : analysis_state+⊤ :=
  analysis_loop fuel (analysis_init (T.(init) P dom)).

Figure 4.11: Main loop of the Goto★ analyzer


This iteration strategy differs from the one used in the previous chapter (§ 3.2), which relies on knowledge of the control-flow graph to compute, at the beginning of the analysis, the iteration order and the widening points. Here, the control-flow graph is discovered during the analysis, so the iteration order has to be adapted as the edges of the graph are discovered. Similarly, loops are not known a priori, and an optimal set of widening points cannot be pre-computed.

4.3.4 Soundness of the Abstract Interpreter

We now describe the formal verification of our analyzer. The soundness property we ensure is that the result of the analysis of a program P over-approximates its semantics ⟦P⟧. This involves, on the one hand, a proof that the analysis result is indeed a fixpoint of the abstract semantics and, on the other hand, a proof that the abstract semantics is correct with respect to the concrete one. The soundness of the abstract semantics is expressed by the following lemma, which reads: given an abstract state ab and a concrete one m in the concretization of ab, for each concrete small-step m ⇒ m′, there exists a result ab′ in the list ab_post_many m.(pc) ab that over-approximates m′. Our use of Coq type classes enables us to extensively overload the 훾 notation and write this statement in a concise way as follows.

Lemma ab_post_many_correct :
  ∀ (m: machine_config) (m′: machine_state) (ab: ab_mc),
  m ∈ γ(ab) → m ⇒ m′ → m′ ∈ γ(ab_post_many m.(pc) ab).

The proof of this lemma follows from the soundness of the various abstract transformers (such as load_sound in Figure 4.7) and of the decoder:

Lemma abstract_decode_at_sound :
  ∀ (m: machine_config) (ab: ab_mc) (pp: addr),
  m ∈ γ(ab) → decode_from m.(mc_mem) pp ∈ γ(abstract_decode_at pp ab).

The proof that the analyzer produces a fixpoint is not done directly. Instead, we rely on a posteriori verification: we do not trust the fixpoint computation; instead, we program and prove a checker called validate_fixpoint. Its specification, proved thanks to the previous lemma, reads as follows.

Lemma validate_correct :
  ∀ (P: memory) (dom: list addr) (E: AbEnv),
  validate_fixpoint P dom E → ⟦P⟧ ⊆ γ(E).

Going through this additional programming effort has various benefits: on the one hand, a direct proof of the fixpoint iterator would be very hard; on the other hand, we can adapt the iteration strategy, optimize the algorithm and so on with no additional proof effort. This validation checks two properties of the result E: that the result over-approximates the initial state; and that the result is a post-fixpoint of the abstract semantics, i.e., for each abstract state in the result, performing one abstract step leads to abstract states that are already included in the result. These properties, combined with the soundness of the abstract semantics, ensure the conclusion of this lemma. Finally we pack together the iterator and the checker with another operation performed on sound results that checks for their safety. The resulting analysis enjoys the following property: if, given a program P, it outputs some result, then that program is safe.
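The role of the checker can be sketched in a few lines. This is an illustrative Python rendering of the two properties listed above, with hypothetical names, not the actual Coq validate_fixpoint.

```python
def validate_fixpoint(entry, init, env, step, leq):
    """A posteriori validation of an untrusted invariant `env`
    (a map: program point -> abstract state).

    Checks that: (1) env covers the initial abstract state, and
    (2) env is a post-fixpoint: one abstract step (`step`) from any
    state recorded in env stays within env."""
    if entry not in env or not leq(init, env[entry]):
        return False
    for pp, ab in env.items():
        for succ, ab2 in step(pp, ab):
            if succ not in env or not leq(ab2, env[succ]):
                return False
    return True
```

Note that the checker never iterates: however the untrusted solver produced `env`, a single pass over the abstract steps suffices to establish soundness.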

63 Chapter 4 Verified value analysis & self-modifying programs

Theorem analysis_sound :
  ∀ (P: memory) (dom: list addr) (fuel: nat) (int♯: num_dom_index),
  analysis int♯ P dom fuel ≠ None → P ∈ safe.

The arguments of the analysis program are the program to analyze, the list of addresses in memory to track, the counter that enforces termination and the name of the numeric domain to use. We provide two numeric domains: intervals with congruence information (also known as strided intervals [Bal07]) and finite sets.
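To illustrate the first of these two domains, here is a hedged sketch (in Python, with a representation of our choosing) of the join of two strided intervals; the Coq domain may differ in its details.

```python
from math import gcd

def si_join(a, b):
    """Join of two strided intervals (lo, hi, stride), whose
    concretization is {lo, lo + stride, ..., hi} (stride 0 denotes a
    singleton).  Sketch of 'intervals with congruence information';
    the result must cover both arguments, so the new stride divides
    both strides and the gap between the two lower bounds."""
    (alo, ahi, astr), (blo, bhi, bstr) = a, b
    lo, hi = min(alo, blo), max(ahi, bhi)
    stride = gcd(gcd(astr, bstr), abs(alo - blo))
    return (lo, hi, stride)
```

For instance, joining {0, 4, 8} with {2, 6, 10} loses the congruence modulo 4 but keeps congruence modulo 2, which is exactly what makes this domain useful for tracking addresses of aligned, fixed-size instructions.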

To enhance the precision, we have introduced three more techniques: a dedicated domain to abstract the flag state, a partitioning of the state space, and abstract instructions. They are described in the next section.

4.4 Case Studies and Analysis Extensions

The extraction mechanism of Coq enables us to generate an OCaml program from our development and to link it with a front-end. Hence we can automatically analyze programs and prove them safe. This section shows the behavior of our analyzer on chosen examples, most of them taken from [CSV07] (they have been rewritten to fit our custom syntax). All examples are written in an assembly-like syntax with some syntactic sugar: labels refer to byte offsets in the encoded program, and the enc(i) notation denotes the encoding of the instruction i. The study of some examples highlights the limits of the basic technique presented before and suggests refinements of the analyzer, which we describe below. These extensions have been integrated into our formalization and proved correct. The source code of all the examples mentioned thereafter is available on the companion web site [Web].

4.4.1 Basic Example

The multilevel run-time code generation program of Figure 4.12 is a program that, when executed, writes some code to the addresses starting at line gen and runs it; this generated program, in turn, writes some more code at line ggen and runs it. Finally execution starts again from the beginning. Moreover, at each iteration, register R6 is incremented. The analysis of such a program follows its concrete execution and exactly computes the content of each register at each program point. It thus correctly tracks what values are written and where, so as to be able to analyze the program as it is generated. However, when the execution reaches program point loop again, both states that may lead to that program point are merged. And the analysis of the loop body starts again. After the first iteration, the program text is exactly known, but each iteration yields more information about the dynamic content of register R6. Therefore we apply widening steps to ensure the termination of the analysis: the widening operator (of the memory domain) is used instead of the join operator on every edge from a program point to a smaller program point (i.e., in this example, from program point ggen to program point loop). Finally, the set of reachable program points is exactly computed and for each of them, we know what instruction will be executed from there. Many self-modifying programs are successfully analyzed in a similar way: opcode modification, code obfuscation, and code checking [Web].


cst 0 -> R6
cst 1 -> R5
loop: add R5 -> R6
cst gen -> R0
cst enc(store R1 -> *R2) -> R1
store R1 -> *R0
cst enc(goto R2) -> R1
cst gen + 1 -> R0
store R1 -> *R0
cst ggen -> R2
cst loop -> R0
cst enc(goto R0) -> R1
goto gen
gen: skip
skip
ggen: skip

Figure 4.12: Multilevel run-time code generation

cst -128 -> R6
add R6 -> R1
cmp R6, R1
gotoLT ko
cst -96 -> R7
cmp R1, R7
gotoLE ko
store R0 -> *R1
ko: halt R0

Figure 4.13: Array bounds check

4.4.2 A First Extension: Dealing with Flags

The example program in Figure 4.13 illustrates how conditional branching relies on implicit flags. This program stores the content of R0 in an array (stored in memory from address −128 to address −96) at the offset given in register R1. Before that store, checks are performed to ensure that the provided offset lies inside the bounds of the array. The destination address is compared against the lowest and highest addresses of the array; if any of the comparisons fails, then the store is bypassed. To properly analyze this program, we need to understand that the store does not alter the code. When analyzing a conditional branch instruction, the abstract state is refined differently at its two targets, to take into account that a particular branch has been taken and not the other. However, the only information we have is about one flag, whereas the comparison that set this flag operated on the content of registers. We therefore need to keep the link between the flags and the registers. To this end, we extend our ab_machine_config record5 with a field ab_flg: (reg × reg)+⊤ containing an optional pair of registers. It enables the analyzer to remember which registers were involved in the last comparison (the All value is used when this information is unknown). With such information available, even though the conditional jump is not directly linked to the comparison operation, we can gain some precision in the various branches. More precisely, the compare operator can now be implemented as follows.

Definition compare ab rs rd :=
  lift (λ ab', {| ab_flg := ⌊(rs, rd)⌋ ;
                  ab_reg := ab'.(ab_reg) ;
                  ab_mem := ab'.(ab_mem) |}) ab.

Back to the example of Figure 4.13, when we assume that the first conditional branch is not taken, the flag state is abstracted by the pair ⌊(R6, R1)⌋, so we refine our knowledge

5This record has been introduced in Section 4.3.1.


Definition assume (x: ab_machine_config+⊤) (f: flag) (b: bool)
    : ab_machine_config+⊤+⊥ :=
  match x with
  | ⌊x'⌋ =>
    match x'.(ab_flg) with
    | ⌊(Ru, Rv)⌋ =>
      let u := find_def x'.(ab_reg) Ru in
      let v := find_def x'.(ab_reg) Rv in
      let op := match f with FLE => Cle | FLT => Clt | FEQ => Ceq end in
      let v'u' := backward_int_binop (OpCmp op) v u (const_int (of_bool b)) in
      match v'u' with
      | (NotBot v', NotBot u') =>
        NotBot ⌊{| ab_reg := (x'.(ab_reg)) [ Ru <- u' ] [ Rv <- v' ] ;
                   ab_flg := x'.(ab_flg) ;
                   ab_mem := x'.(ab_mem) |}⌋
      | _ => Bot
      end
    | _ => NotBot x
    end
  | _ => NotBot x
  end.

Figure 4.14: Implementation of the assume transfer function

about register R1: its content is not less than the content of register R6, namely −128. Similarly, when we assume that the second conditional branch is not taken, the abstract flag state is ⌊(R1, R7)⌋, so we can finally infer that the content of register R1 is in the bounds. The actual implementation of such a precise assume relies on a backward transfer function of the numeric domain, discussed in Section 3.3.2. Given such a backward transfer function, assume can be implemented as shown in Figure 4.14: if the registers Ru and Rv involved in the last comparison are known, then the abstract values u and v associated to them can be refined using the backward operator for the given comparison. In case any of these refined values is ⊥, this information is propagated to the whole abstract state: the branch is unreachable and should not be analyzed any further. Special care has to be taken in the assign transfer function. If a register that is part of the abstract flag is updated, then no information about its new content can be inferred from the outcome of the comparison. Therefore, in such cases, the abstract flag is simply forgotten, i.e., set to All. This extension of the abstract domain has little impact on the formalization, but greatly increases the precision of the analyzer on programs with conditional branches. Indeed, without this extension, the analyzer cannot deduce anything from the guards of conditional branches as it ignores all comparison instructions.
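The backward refinement that assume performs can be illustrated on intervals alone. The following Python sketch (with names of our choosing, not the Coq backward_int_binop) refines the intervals of the two compared registers, given whether the less-or-equal branch was taken.

```python
def assume_le(u, v, taken):
    """Backward transfer for a comparison `cmp Ru, Rv` followed by a
    branch on the less-or-equal flag.  `u` and `v` are the intervals
    (lo, hi) of Ru and Rv; `taken` tells which branch was followed.
    Returns the refined pair of intervals, or None (playing the role
    of Bot) when the branch is unreachable."""
    (ulo, uhi), (vlo, vhi) = u, v
    if taken:                       # constraint u <= v
        uhi, vlo = min(uhi, vhi), max(vlo, ulo)
    else:                           # constraint u > v
        ulo, vhi = max(ulo, vlo + 1), min(vhi, uhi - 1)
    if ulo > uhi or vlo > vhi:
        return None                 # contradiction: branch unreachable
    return (ulo, uhi), (vlo, vhi)
```

On the array bounds check of Figure 4.13, it is exactly this kind of refinement, applied to the register pair recorded in the abstract flag, that brings the offset into the bounds of the array.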

4.4.3 A Second Extension: Trace Partitioning

During the execution of a self-modifying program, a given part of the memory may contain completely unrelated code fragments. When these fragments are analyzed, since they are stored at the same addresses, flow sensitivity is not enough to distinguish them.


0:  cst 40 -> R7
2:  cst 21 -> R0
4:  cst 13 -> R4
6:  load *R4 -> R1
7:  cst 14 -> R4
9:  load *R4 -> R2
10: cst 13 -> R4
12: store R2 -> *R4
13: goto R7
14: add R0 -> R3
15: cst 14 -> R4
17: store R1 -> *R4
18: cst 11 -> R4
20: load *R4 -> R2
21: cst 16 -> R4
23: load *R4 -> R1
24: store R2 -> *R4
25: cst 11 -> R4
27: store R1 -> *R4
28: cst 12 -> R4
30: load *R4 -> R2
31: cst 17 -> R4
33: load *R4 -> R1
34: store R2 -> *R4
35: cst 12 -> R4
37: store R1 -> *R4
38: goto 4
40:

Figure 4.15: Polymorphic program

If these fragments are merged in the abstract state, then the two programs get mixed and it is no longer possible to predict the code that is executed with sufficient precision. To prevent such a precision loss, we use a specific form of trace partitioning [Kin12] that makes an analysis sensitive to the value of a particular memory location. Consider as an example the polymorphic program of Figure 4.15. Polymorphism here refers to a technique, used for instance by viruses, of changing the code while preserving its behavior, so as to hide its presence. The main loop of this program (bytes 4 to 39) repeatedly adds forty-two to register R3 (two add instructions at bytes 13 and 14). However, it is obfuscated in two ways. First, the source code initially contains a jump to some random address (byte 13). But this instruction will be overwritten (bytes 7 to 12) before it is executed. Second, this bad instruction is written back (bytes 4 to 6 and 15 to 17), but at a different address (byte 14 is overwritten). The remainder of the loop swaps the contents of memory at addresses 11 and 16, and at addresses 12 and 17 (execution from byte 18 to byte 27, and from byte 28 to byte 37, respectively). So when the execution reaches the beginning of the loop, the program stored in memory is one of two different versions, both featuring the unsafe jump. In other words, this program features two variants that are functionally equivalent and look equally unsafe. And running any version changes the program into the other version. When analyzing this program, the abstract state computed at the beginning of the loop must over-approximate the two program versions. Unfortunately it is not possible to analyze the mere superposition of both versions, in which the unsafe jump may occur. The two versions can be distinguished through, for instance, the value at address 12. We therefore prevent the merging of any two states that disagree on the value stored at this address.
Two different abstract states are then computed at each program point in the loop, as if the loop were unrolled once. More generally, the analysis is parameterized by a partitioning criterion δ: ab_mc → K that maps abstract states to keys (of some type K). No two abstract states whose keys differ according to this criterion are merged. Taking a constant criterion amounts to disabling this partitioning. The abstract interpreter now computes for each program point a map from keys to abstract states (rather than only one abstract state).

Definition vpAbEnv : Type := (Map [ addr, Map [ K, ab_mc ] ] * int♯)+⊥.

Such an environment E represents the following set of machine configurations (ignoring

the halted configurations represented by the second component):

γ(E) = { c ∈ machine_config ∣ ∃k, c ∈ γ((fst E)[c.(pc)][k]) }

This means that the actual key under which an abstract state is stored has no influence on the concrete states it represents. It can only improve the precision: if two abstract states x and y are mapped to different keys, hence not merged, they can represent the concrete set γ(x) ∪ γ(y), which may be smaller than γ(x ⊔ y). For instance, the criterion used to analyze the polymorphic program of Figure 4.15 maps an abstract state m to the value stored at address 12 in all concrete states represented by m, or to an arbitrary constant if there may be many values at this address. To implement this technique, we do not need to modify the abstract domain, but only the iterator and fixpoint checker. The worklist holds pairs (program point, criterion value) rather than simple program points. The iterator and fixpoint checker (along with its proof) are straightforwardly adapted. The safety checker does not need to be updated since we can forget the partitioning before applying the original safety check. Thanks to this technique, we can selectively enhance the precision of the analysis and correctly handle challenging self-modifying programs: control-flow modification, mutual modification, and code encryption [Web]. However, the analyst must manually pick a suitable criterion for each program to analyze; the analyzer itself is not able to figure out what criterion to use. In practice, we have used the contents of some particular register or memory location. When using this extension, the termination of the analysis may no longer be guaranteed, as the type K may have infinitely many values (or too many for the analysis to enumerate them all). To ensure termination, Kinder [Kin12] proposes a widening operator that merges keys at a particular program point when the number of different keys encountered at this program point exceeds some threshold.
We did not implement such a widening operator and require the analyst to be careful when the partitioning criterion is designed.
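The partitioned update can be sketched as follows, in illustrative Python (hypothetical names; the real vpAbEnv also carries abstract halted states, which are ignored here): an incoming abstract state is only merged with the stored states that share its key under the criterion δ.

```python
def add_state(env, pp, ab, delta, join, leq):
    """Insert abstract state `ab` at program point `pp` in a
    partitioned environment env: pp -> {key -> state}.

    States are merged only when the criterion `delta` gives them the
    same key, so states that disagree on the partitioning value (e.g.
    the byte at address 12) are kept separate.  Returns True iff the
    environment changed."""
    key = delta(ab)
    part = env.setdefault(pp, {})
    old = part.get(key)
    if old is None:
        part[key] = ab
        return True
    if leq(ab, old):
        return False                  # already covered: stable
    part[key] = join(old, ab)
    return True
```

With a constant `delta` this degenerates to the usual one-state-per-point environment, which matches the remark above that a constant criterion disables partitioning.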

4.4.4 A Third Extension: Abstract Decoding

The program in Figure 4.16 computes the nth Fibonacci number in register R2, where n is an input value read from address −1 and held in register R0. There is a for-loop in which register R1 goes from 1 to n and some constant value is added to register R2. The trick is that the actual constant (which is encoded as part of an instruction and is stored at the address held in R6) is overwritten at each iteration by the previous value of R2. When analyzing this program, we cannot infer much information about the content of the patched cell. Therefore, we cannot enumerate all instructions that may be stored at the patched point. So we introduce abstract instructions: instructions that are not exactly known, but of which some part is abstracted by a suitable abstract domain. Here we only need to abstract values using a numeric domain: the resulting instruction set is shown in Figure 4.17. This abstraction of the instructions could be pushed further to capture other self-modification patterns. For instance a program might modify only the encoding of a register; in such a case, the “register” part of the instructions could be abstracted by a finite set of registers.
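The idea of decoding in the abstract can be conveyed by a toy sketch, under a hypothetical encoding of our own invention (opcode byte 1 for a cst into R4, opcode byte 2 for halt), written in Python for illustration only.

```python
def abstract_decode(opcode_itv, operand_itv):
    """Decode an instruction whose bytes are only known abstractly
    (here as intervals (lo, hi)).  When the opcode is exactly known
    but the operand is not, a single abstract instruction carrying the
    operand's abstract value is produced, instead of one concrete
    instruction per possible operand.  None plays the role of giving
    up when even the opcode is unknown."""
    lo, hi = opcode_itv
    if lo != hi:
        return None                       # opcode unknown: give up
    if lo == 1:                           # hypothetical: cst v -> R4
        return ('AICst', operand_itv, 'R4')
    if lo == 2:                           # hypothetical: halt
        return ('AIHalt',)
    return None
```

This is what makes the Fibonacci example tractable: the patched operand is abstracted once, rather than enumerated over every value it may take.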


cst -1 -> R7
load *R7 -> R0
cst key+1 -> R6
cst 1 -> R1
cst 1 -> R2
loop: cmp R1, R0
gotoLE last
cst 1 -> R7
add R7 -> R1
cst 0 -> R3
add R2 -> R3
key: cst 0 -> R4
add R4 -> R2
store R3 -> *R6
goto loop
last: halt R2

Figure 4.16: Self-modifying Goto★ program computing Fibonacci numbers

Inductive ab_instruction (int♯: Type) : Type :=
| AICst (v: int♯) (dst: reg)
| AICmp (src dst: reg)
| AIBinop (op: int_binary_operation) (src dst: reg)
| AILoad (src dst: reg)
| AIStore (src dst: reg)
| AIGoto (tgt: int♯)
| AIGotoInd (r: reg)
| AIGotoCond (f: flag) (tgt: int♯)
| AISkip
| AIHalt (r: reg).

Figure 4.17: Abstract instructions

With such a tool, we can decode in the abstract: the analyzer does not recover the exact instructions of the program, but only the information that some (unknown) value is loaded into register R4, which is harmless (no stores and no jumps depend on it). This self-modifying code pattern, in which only part of an instruction is overwritten, also occurs in the vector dot product example [Web], where specialized multiplication instructions are emitted depending on an input vector. For this technique to be effective, the numeric abstract domain has to support it: mapping abstract values to abstract instructions (i.e., abstract decoding) should be more efficient than just enumerating all concrete values. The abstract semantics (Figure 4.9) has to be slightly modified to deal with this new instruction set. In particular, all jumps behave like indirect jumps: their targets are only known as abstract values. For the analysis to follow such a jump, all concrete

cst 0 -> R5
cst j+1 -> R0
cmp R5, R6
gotoLE h
store R7 -> *R0
j: gotoLE 0
h: halt R5

Figure 4.18: Not-a-branch

(* … slice of ab_post_single … *)
| (AIGotoCond f tgt, sz) =>
  bot_cons (Run (pp + sz)) (T.(assume) m f false)
    (match T.(assume) m f true with
     | NotBot m' =>
       match concretize tgt with
       | Just tgt => IntSet.fold (λ acc tgt, Run tgt m' :: acc) tgt nil
       | All => GiveUp :: nil
       end
     | Bot => nil
     end)

Figure 4.19: Abstract conditional jump


Program                          Result  Comment
Opcode modification              ✔
Multi-level run-time code gen.   ✔
Bootloader                       ✘       needs a model of system calls and interrupts
Control-flow modification        ✔       partitioning on the jump target (address 15)
Vector dot product               ✔       partitioning on loop counter (register R0); and abstract decoding
Run-time code checking           ✔
Fibonacci                        ✔       abstract decoding
Self-replication                 ✘       code segment is “infinite”
Mutual modification              ✔       partitioning on the instruction to write (held in register R0)
Polymorphic code                 ✔       partitioning according to the different versions of the program (e.g., address 12)
Code obfuscation                 ✔
Code encryption                  ✔       partitioning on the loop counter (register R0)

Figure 4.20: Summary of self-modifying examples

destinations need to be enumerated. However, consider the example shown in Figure 4.18. The conditional jump on line j is unsafe: its target is overwritten on the line before with some input value (the contents of register R7). But the whole program is actually safe, because the branching condition is always false6. Therefore, the abstract transformer for conditional jumps (Figure 4.19) tries to prove that the branch cannot be taken (with the call T.(assume) m f true) before it enumerates its possible targets; and the program of Figure 4.18 can be proved safe by our analyzer.
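The shape of that transformer can be mirrored in a short Python sketch (hypothetical rendering of Figure 4.19, not the Coq code): assume-false yields the fall-through successor, and the taken branch is only explored, and its targets only enumerated, when it cannot be proved unreachable.

```python
def post_goto_cond(pp, sz, flag_assume, concretize_tgt):
    """Abstract successors of a conditional jump with an abstract
    target.  `flag_assume(b)` returns the refined state when the flag
    equals b, or None when that branch is unreachable (Bot);
    `concretize_tgt()` returns the set of possible targets, or None
    when it is unbounded.  Returns None to signal GiveUp."""
    succs = []
    fall = flag_assume(False)
    if fall is not None:                  # branch not taken: fall through
        succs.append((pp + sz, fall))
    taken = flag_assume(True)
    if taken is not None:                 # branch possibly taken
        targets = concretize_tgt()
        if targets is None:
            return None                   # unknown jump target: give up
        succs.extend((t, taken) for t in sorted(targets))
    return succs
```

On an opaque predicate like the one of Figure 4.18, `flag_assume(True)` returns None, so the unknown target is never enumerated and the analysis proceeds safely through the fall-through edge.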

The techniques presented here enable us to automatically prove the safety of various self-modifying programs including almost all the examples of Cai et al. [CSV07], as summarized in Figure 4.20. Out of twelve, only two cannot be dealt with. The comment column of the table lists the extensions that are needed to handle each example (if any), or the limitation of our analyzer. The boot loader example does not fit in the considered machine model, as it calls BIOS interrupts and reads files. The self-replicating example is a program that fills the memory with copies of itself: the code, being infinite, cannot be represented with our abstract domain. Our Coq development features all the extensions along with their correctness proofs, and several Goto★ examples including the implementation of the programs listed in Figure 4.20.

4.5 Related Work

Most of the previous works on mechanized verification of static analyses focused on standard data-flow frameworks [KN06; CD04; BGL06; Cac+05] or abstract interpretation for small imperative structured languages [Ber09; CP10; Nip12]. Klein and Nipkow instantiate

6Such spurious branching instructions are known as “opaque predicates” and are mainly used for obfusca- tion [CTL98].

such a framework for inference of Java bytecode types [KN06]; Coupet-Grimal and Delobel [CD04] and Bertot et al. [BGL06] for compiler optimizations, and Cachera et al. [Cac+05] for control flow analysis. Our current work formalizes advanced abstract interpretation techniques, complementary to the ones described in the previous chapter, but targeting self-modifying low-level code; it is based on several recent non-verified static analyses. A large amount of work was done by Balakrishnan et al. in this area [BR10]. Control-flow graph reconstruction was specially studied by Kinder et al. [KV08] and Bardin et al. [BHV11]. Still, these works are unsound with respect to self-modifying code. The first formal semantics for self-modifying programs were defined by Debray et al. [DCT08] and Bonfante et al. [BMR09]. These are paper-and-pencil semantics that operate over small low-level languages such as ours; they are used in a very different way from our semantics. In the first one, a denotational semantics based on traces is used to identify the different phases of successive program modifications during a program execution. An operational semantics is defined in the second one, where a rewriting process from self-modifying programs to non-modifying programs is defined. Our current work tackles a core subset of a self-modifying low-level programming language. More realistic formalizations of x86 semantics have been proposed [Myr10; Mor+12; Ken+13] but none of them handles the problem of disassembling self-modifying programs. Our work complements other verification efforts of low-level programs [Chl11; CSV07; JBK13] based on program logics. While we provide automatic inference of loop invariants, they are able to handle more expressive correctness properties. The RockSalt project [Mor+12] features a formally verified checker of the safety policy of Native Client, a facility for running untrusted binary programs within a web browser.
The safety policy is draconian (and in particular implies that the code is not self-modifying), so that checking it is relatively simple; therefore, the verification of the checker is in some respects simpler than the verification of our value analysis. However, RockSalt tackles a large part of the x86 syntax, including the many addressing modes and the quirky encoding (prefixes, w bit). Their formalization is particularly interesting in that it features a core intermediate representation, which is then used to define the semantics of all supported instructions of x86. So as to precisely analyze programs that are obfuscated by virtualization, Kinder [Kin12] proposes an abstract domain that is similar to our ad-hoc trace partitioning: a particular location is identified as holding the virtual program counter and the abstract domain is partitioned on the value at this location. A more general trace partitioning domain is proposed by Rival and Mauborgne [RM07]. The formal verification in Coq of such techniques is beyond the scope of this work.

4.6 Conclusion and Perspectives

This work provides the first verified static analysis for self-modifying programs. In order to tackle this challenge, we formalized original techniques such as control-flow graph reconstruction and partitioning. We formalized these techniques on a small core language but we managed to verify ten out of twelve of the challenges proposed by Cai et al. [CSV07]. An important further work is to scale these techniques to more realistic Coq language

models [Mor+12; Ken+13]. Developing directly an analyzer on these representations may be a huge development task because of the number of instructions to handle. One strategy could be to rely on a good intermediate representation such as the one proposed by RockSalt [Mor+12]. It could be very interesting to integrate our analyzer architecture within their language formalization; it would in particular enable the handling of compiled programs whereas our analyzer is currently limited to tediously hand-written Goto★ code. Our language lacks some realistic features, such as floating-point arithmetic or SIMD instructions. More importantly, there is no mechanism such as interrupts or system calls to interact with the environment and perform run-time I/O. Also, there are no explicit instructions for function call and return, though they could be emulated with computed jumps. Our current work does not consider the specific challenge of call stack reconstruction [BR10; Fle+10] that may require some form of verified alias analysis [RL12]. This is an important place for further work. More generally, our memory domain signature prevents any symbolic reasoning on memory addresses: all accessed addresses have to be exactly known at analysis time. Also, the lack of expressions in the interface prevents the computation of relational invariants, i.e., properties that may relate the contents of different registers and memory cells. The next chapter is dedicated to a more general and more powerful memory domain. It is targeted at the analysis of the CFG language, in which the control flow is statically known and memory is slightly structured; it implements the memory domain signature of the previous chapter (Figure 3.11).

72 Chapter 5 Precise abstraction of C-like memories

The value analyses that have been presented in the two previous chapters feature a memory domain. This component is in charge of representing the data manipulated by the analyzed program, and of modeling the basic instructions that affect this data (as opposed to the control, which is handled by the iterator). The implementations of such domains in previous chapters have some limitations. The memory domain of chapter 3 computes precise information about the contents of the registers only and ignores the contents of the memory. The domain of chapter 4 is a simple map from addresses to abstract numerical values: it cannot represent relational invariants about the contents of different locations and needs, when modeling a memory access, to know the exact (concrete) address being accessed. It also operates on a simple memory in which all accesses target only one byte, whereas common languages (such as C) usually provide means to access several bytes at once. A basic and important component of a memory domain is usually a (relational) numerical domain. An extensive literature is dedicated to them [Min04; CC76; Gra89; Kar76; Gra91; CH78], various efficient libraries are available and they usually agree on their interface, similar to the one given in a previous chapter (§ 3.3.1). These domains operate on variables that need to be explicitly named: they do not permit referencing variables through pointers, let alone pointer arithmetic. Analyses of programs with pointers require dedicated treatments, known as points-to analyses. Indeed, the analysis needs to predict pointer targets in order to infer something about a value accessed through said pointer. However, points-to analysis in presence of pointer arithmetic requires a numerical analysis: when the offset of a pointer is expressed as an arbitrary arithmetic expression, predicting the values of this offset amounts to analyzing this expression.
Such an analysis that integrates points-to and numerical analyses has been described by Miné [Min06]. We propose in this chapter to implement a memory domain that follows that work and integrates into the value analyzer for the CFG intermediate representation of the CompCert compiler presented in chapter 3. This implementation is also mechanically verified in Coq. This contribution is part of a larger work that has been recently published [Jou+15]. This chapter goes into the details of the memory domain, which was only briefly sketched in that publication.

5.1 Background

The value analysis discussed in this chapter builds on the one of chapter 3. The main difference with that work is the memory domain at the heart of this chapter. We briefly


Class pre_mem_dom (t: Type) : Type := {
  (* abstract domain structure *)
  mem_wl : weak_lattice t;
  (* consult the range of a local variable *)
  range: t → ident → signedness → itv+⊥;
  (* assignment to a local variable *)
  assign: ident → expr → t → t+⊥;
  (* assignment to a memory cell *)
  store: memory_chunk → expr → expr → t → t+⊥;
  (* assume an expression evaluates to a non-zero value *)
  assume: expr → t → t+⊥;
  (* non-deterministic assignment to a local variable *)
  forget: ident → t → t+⊥
}.

Figure 5.1: Interface of the memory domain

recall the overall design of the analyzer and its two main components: an abstract domain and a fixpoint solver. The fixpoint solver takes as input the code of a function and an abstract semantics of the CFG programming language. It then computes a flow-sensitive invariant that is sound w.r.t. the given semantics. It uses Bourdoncle’s algorithm [Bou93] programmed in OCaml; the output of this solver is therefore not trusted and validated a posteriori. The abstract domain is a data-type whose values represent sets of concrete memory states (local variables and global memory). Its signature is given on Figure 5.1. The type of the abstract values is t. The domain is equipped with a lattice structure (mem_wl) and a query operator (range ab x) that returns an over-approximation of the values of local variable x in any state represented by ab, as a pair of intervals (one if the value is interpreted as a signed integer, one if it is interpreted as an unsigned integer; this prevents a premature loss of precision due to an early choice of the signedness interpretation). It also features operators that model CFG statements: assign, store and assume. The last operator, forget, performs a non-deterministic assignment and can be used to model inputs, for instance. The t+⊥ notation refers to the type t extended with an extra Bot element whose concretization is empty: it means that a contradiction has been found and that no concrete state can satisfy the given constraints. For instance, when analyzing a conditional branch, the analyzer may prove that the condition never holds, i.e., that the branch cannot be taken, hence returns Bot. There is no operator in this interface to model function calls. Therefore, the analysis will be restricted to programs without function calls. This is a limitation that will be relaxed in a following chapter (§ 6.2), where interprocedural analysis is discussed.
Meanwhile, we consider programs without function calls, or, equivalently, programs without recursion nor function pointers in which all function calls can be inlined before the analysis. A key building block of such an abstract domain is a (relational) numerical domain. It has similar operators (range, assume and assign) that rely on numerical expressions (no loads, no pointers) of type nexpr. The precise definition of a numerical domain is given in


a previous chapter (§ 3.3). The remainder of this chapter is organized as follows. Section 5.2 gives a high-level overview of the implementation of such a memory domain, and Section 5.3 a more detailed description of this domain. Section 5.4 reviews its soundness proof. Finally, Section 5.5 describes some extensions to the domain that enhance the precision of the analysis.

5.2 Overview of the Memory Domain

Relational numerical domains are well suited to represent numerical environments, i.e., concrete sets of functions from named variables to numbers. The operators of such domains always refer to the variables through their names (as in “assign x + 2 · y to z”). Such a domain cannot be directly applied to the analysis of languages like C in which: values are not only numerical but also comprise pointers; and variables are referred to by expressions which may involve arithmetic (as in “assign *(x + 2 · y) to *(z+4)”, where *p denotes the dereference of pointer p). Such an assignment may modify several memory locations: the exact variable targeted by a pointer expression may not be known statically. The analyzer must account for this uncertainty and consider that any of the possibly targeted locations may have been updated. Programs to analyze manipulate data that is stored in various locations: registers, local and global variables, dynamically allocated memory. The numerical content of these locations is represented as a point in a (relational) numerical domain. The “numerical content” refers to the actual value of integers and the offset part of pointers. One of the main tasks of the memory domain is to translate the queries that it receives in terms of CFG expressions (with pointer arithmetic and memory loads) into queries for the numerical domain, in terms of purely numerical expressions. Let us consider an example program to be analyzed. The analyzer operates on the CFG intermediate representation but the program is written here in concrete C syntax for the sake of readability. Note that scaling (i.e., multiplication of the offset by the size of the target type) is explicit in CFG expressions (but hidden here, when incrementing pointers).

1 int S[2], T[2];
2 int main(void){
3   int b1 = any_bool(), b2 = any_bool(), *x = S, *p = T;
4   S[0] = T[1] = 0; S[1] = T[0] = 1;
5   if (b1) p = S; if (b2) ++p;
6   x = x + *p;
7   return *x;
8 }

In this program, there are two global variables (arrays S and T) that are initialized at the beginning of the main function. This program builds a pointer p to some element of the arrays S and T, depending on the values of b1 and b2 that can be seen as inputs in this example (as suggested by the calls to any_bool(); we will come back to the modeling of user input in the next chapter, § 6.5). It then uses the value of this element as an offset in array S (pointer x) and returns the value referenced by x. The following picture schematically shows a possible concrete state when reaching line 6. Variables S and T are allocated respectively to blocks 100 and 101. Registers b1 and b2 contain respectively integers 0 and 1. Therefore p points to offset 4 in block 101.

75 Chapter 5 Precise abstraction of C-like memories

Figure 5.2: Sketch of the memory domain (diagram: queries over CFG expressions go through load elimination, yielding load-free expressions of type pexpr, then numerical conversion, yielding numerical expressions of type nexpr; results flow to the points-to domain and to the numerical domain)

Memory:                              Registers:
       +--------+--------+
  100: | Vint 0 | Vint 1 |             b1: Vint 0      p: Vptr 101 4
       +--------+--------+             b2: Vint 1      x: Vptr 100 0

       +--------+--------+
  101: | Vint 1 | Vint 0 |
       +--------+--------+
       0        4        8

We will focus on line 6. When analyzing this line, the memory domain is asked to model the assignment to variable x of the value resulting from the evaluation of expression x + *p. This query is expressed using CFG expressions, which cannot be directly given to the numerical domain. One difficulty in answering this query lies in the load: the memory domain needs to predict which locations may be targeted by pointer p. To do so, the memory domain embeds a (generic) numerical domain (as shown on the right of Figure 5.2) and a points-to domain (at the top). Queries addressed to the memory domain flow (plain arrows) to the points-to and numerical domains. On the way, they are converted to load-free queries for the points-to domain (using load-free expressions of type pexpr) and numerical queries for the numerical domain (using numerical expressions of type nexpr). These conversions use (dashed arrows) the information provided by these domains: the numerical conversion uses the points-to information to distinguish integers from pointers, and the load elimination uses both the points-to and numerical information to resolve loaded addresses. We now roughly describe each component of the memory domain.

5.2.1 Memory Cells

In the CFG language, when a program accesses an array element or a structure field (as in “S[1] = 1”), the variable (here S) is not involved as a whole, but only a chunk of it. Therefore, we introduce a notion of abstract cell (and the corresponding data-type acell) to represent the locations that are accessed by the program. Such an abstract cell is either a chunk of a global variable or a register. (The discussion of stack-allocated local variables is delayed until Section 5.5.1.)


Inductive acell : Type :=
| ACglobal (s: ident) (κ: memory_chunk) (ofs: Z)
| ACreg (x: ident).

The memory chunk in the global-variable location describes how this cell is accessed (in particular, it gives the size of the cell and how to interpret its contents). There is no such meta-data for the register location, since the contents of registers are accessed directly and as-is (i.e., without transformation or reinterpretation of the data). For instance, the (abstract) memory cells involved in the analysis of our example program are: ACglobal S Mint32 0 and ACglobal S Mint32 4, which represent the two elements of array S; ACglobal T Mint32 0 and ACglobal T Mint32 4, which represent the two elements of array T; and ACreg b1, ACreg b2, ACreg p, and ACreg x, which represent the four register variables of this program.

These abstract cells are used as variables of the numerical and points-to domains: they operate as if the program were manipulating cells rather than anything else. The role of the memory domain is to make this illusion correct, by converting queries about CFG expressions into queries about cells.

Notice that cells may overlap: two cells may refer to the same address with different chunks, or to overlapping chunks of the memory. For instance, if the program wrote a 64-bit integer at the address of array T, the cell ACglobal T Mint64 0 would be used to describe this access; and this cell overlaps with the two other cells about T. Most often, the abstract domains will not compute invariants about overlapping cells. Should this happen, the conjunction of these invariants would apply to the part of concrete memory they represent. Therefore, we need to take care of overlapping cells on stores (all possibly written cells have to be updated) rather than on loads (reading any one of the read cells always returns a sound invariant).
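The cell model and its overlap test can be made concrete with a small illustrative Python sketch (this is not the Coq development; all names here are hypothetical, and chunk sizes are restricted to the two chunks used in this chapter):

```python
from dataclasses import dataclass

# Byte sizes of the two memory chunks used in the running example.
CHUNK_SIZE = {"Mint32": 4, "Mint64": 8}

@dataclass(frozen=True)
class ACglobal:
    var: str        # name of the global variable
    chunk: str      # memory chunk, e.g. "Mint32"
    ofs: int        # byte offset inside the variable

@dataclass(frozen=True)
class ACreg:
    name: str       # temporary (register) variable

def overlap(c1, c2):
    """Two cells overlap iff they are chunks of the same global
    variable whose byte ranges intersect; a register cell never
    overlaps a memory cell or a differently named register."""
    if isinstance(c1, ACglobal) and isinstance(c2, ACglobal):
        if c1.var != c2.var:
            return False
        return (c1.ofs < c2.ofs + CHUNK_SIZE[c2.chunk]
                and c2.ofs < c1.ofs + CHUNK_SIZE[c1.chunk])
    return False

# A 64-bit write at offset 0 of T overlaps both 32-bit cells of T:
t64 = ACglobal("T", "Mint64", 0)
assert overlap(t64, ACglobal("T", "Mint32", 0))
assert overlap(t64, ACglobal("T", "Mint32", 4))
assert not overlap(t64, ACglobal("S", "Mint32", 0))
```

The two 32-bit cells of T themselves are disjoint, which is why the domain may keep separate invariants for each of them.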

5.2.2 Points-to Domain

In order to precisely understand expressions involving loads and pointers, the memory domain needs to: distinguish pointers from integers; and predict the set of blocks a pointer may target. We thus attach to each cell a type (integer (Int), pointer (Ptr) or unknown (All)) and, if it is a pointer, a finite set of blocks. In the example program, when reaching line 6, the state of the points-to domain is as depicted in the following table.

Cell                  | Points-to
----------------------|------------
ACglobal S Mint32 0   | Int
ACglobal S Mint32 4   | Int
ACglobal T Mint32 0   | Int
ACglobal T Mint32 4   | Int
ACreg b1              | Int
ACreg b2              | Int
ACreg p               | Ptr({S; T})
ACreg x               | Ptr({S})

5.2.3 Underlying (Relational) Numerical Domain

In addition to the points-to domain, the memory domain embeds a relational numerical domain whose abstract values (of type num) represent sets of functions from (abstract) cells to numerical values (machine integers, of type int). The signature of such a domain

is given in Section 3.3.1. It is used to represent the numerical part of the contents of the abstract cells: the values of integers and the offsets of pointers.

5.2.4 Conversion

To analyze an instruction like the one on line 6 above, the memory domain needs to translate the CFG expression x + *p into numerical expressions to hand them over to the numerical domain. This translation is decomposed into two steps: elimination of the loads; and elimination of the pointers. These two steps are now described.

Load elimination In order to eliminate a sub-expression of the form *e, the memory domain first computes an over-approximation of the set of cells that may be designated by expression e. This results in a set of expressions, obtained by replacing the load *e in the original expression by every possible cell it may read from. To compute this set of cells, the loads in sub-expression e are recursively eliminated; this yields a set of expressions without loads. Then each expression is given to: the points-to domain, which computes a set of blocks (i.e., names of global variables); and the numerical domain (after pointer elimination), which computes a concrete set of offsets. In our example, the load to eliminate is the dereference of p. The points-to evaluation may result in the set { S; T }, since this pointer points inside one of these two arrays. The numerical evaluation may result in the set { 0; 4 }. Finally, we get a set of four load-free expressions: ACreg x + 4 × (ACglobal S Mint32 0), ACreg x + 4 × (ACglobal S Mint32 4), ACreg x + 4 × (ACglobal T Mint32 0), and ACreg x + 4 × (ACglobal T Mint32 4).
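The combination of points-to blocks and concrete offsets on the running example can be sketched as follows (an illustrative Python model with hypothetical encodings of cells and expressions, not the Coq development):

```python
from itertools import product

# Load elimination on the running example: the load *p is replaced
# by every cell it may read from. The points-to evaluation of p
# yields the blocks {S, T}; the numerical evaluation of its offset
# yields the concrete set {0, 4}.
blocks = {"S", "T"}
offsets = {0, 4}
chunk = "Mint32"

# Each (block, offset) pair names one abstract cell (Cartesian product).
cells = {(b, chunk, o) for b, o in product(blocks, offsets)}

# The original expression x + *p becomes one load-free expression
# per possible cell (expressions are shown here as nested tuples).
load_free = {("add", ("reg", "x"), ("mul", 4, ("cell",) + c)) for c in cells}

assert len(load_free) == 4
```

The four resulting expressions correspond exactly to the four expressions listed above.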

Pointer elimination The expressions resulting from the load elimination may still contain pointers: pointer constants or pointer arithmetic. To translate such expressions into purely numerical ones, constants are replaced by their offsets and operators by their numerical counterparts. This translation is mostly unsurprising: addition is mapped to addition and so on. The (only) non-trivial point is about boolean operations: all (non-null) pointers are true. Therefore, to correctly translate a boolean operation (e.g., the C expression !q), the analyzer needs to predict whether q is always a non-null pointer (in which case this expression is always false), whether it is always an integer (in which case this expression is already purely numerical), or whether it may be either (in which case this expression may return any boolean). This information is available thanks to the points-to domain.

In the next section, we look more precisely at each component of this domain.

5.3 Technical Details

One central operation of this memory domain is conversion. It relies on an intermediate expression language that we present first. Then we dig into the details of the two phases of the conversion. Finally we describe how the abstract transformers are implemented.


                 All
               /     \
            Int       Ptr(All)
                         |
                    Ptr({S; T})
                    /         \
             Ptr({S})       Ptr({T})

Figure 5.3: Lattice of types and points-to abstract values

5.3.1 Pointer Expressions

The pointer-expressions, of type pexpr, are CFG expressions without loads, or equivalently numerical expressions with pointers. They are parameterized by the type of the variables they may contain.

Inductive pexpr (var: Type) : Type :=
| PEvar (v: var)
| PEconst (cst: constant)
| PEunop (op: unary_operation) (e: pexpr var)
| PEbinop (op: binary_operation) (e₁ e₂: pexpr var)
| PEcond (b e₁ e₂: pexpr var).

They use the same constants (type constant) and the same operators (types unary_operation and binary_operation) as plain CFG expressions. The concrete semantics of these expressions is very similar to that of CFG expressions, except that it does not depend on a memory (for loads) but only on the permissions (for pointer comparisons).

5.3.2 Points-to Domain

The purpose of this domain is two-fold:

1. distinguish cells holding integers from cells holding pointers; and

2. predict the set of blocks a pointer may target.

This domain is non-relational and independent from the numerical domain. It attaches an abstract value to each (abstract) cell. The abstract value combines a type information (integer, pointer, or don’t-know) and, in case of pointers, a points-to set. This set is either a subset of the finite set of all global variables, or the special All value that denotes any block and is used, for instance, to abstract pointers to stack-allocated variables. We finally get a semi-lattice (of type pointsto+⊤) that is pictured in Figure 5.3. This domain features in particular a forward evaluation function,

eval_ptr: Map [ acell, pointsto ] → pexpr acell → pointsto+⊤

that, in the context of an abstract state mapping each cell to its points-to approximation, computes, for a given expression, the points-to information corresponding to the result of the evaluation of this expression.
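The join of the semi-lattice of Figure 5.3 can be sketched in a few lines of illustrative Python (hypothetical encodings: "Int" and "All" are plain strings, pointer values are pairs ("Ptr", s) where s is a frozenset of blocks or the marker "AllBlocks" standing for Ptr(All); this is a model of the idea, not the Coq development):

```python
def join(v1, v2):
    """Least upper bound in the lattice of types and points-to sets."""
    if v1 == v2:
        return v1
    if v1 == "All" or v2 == "All":
        return "All"
    if v1 == "Int" or v2 == "Int":
        # one side is Int, the other a pointer: only All is above both
        return "All"
    # both sides are pointers: join the points-to sets
    s1, s2 = v1[1], v2[1]
    if s1 == "AllBlocks" or s2 == "AllBlocks":
        return ("Ptr", "AllBlocks")
    return ("Ptr", s1 | s2)

# Joining pointers into S and into T yields a pointer into {S, T}:
assert join(("Ptr", frozenset({"S"})), ("Ptr", frozenset({"T"}))) \
       == ("Ptr", frozenset({"S", "T"}))
# Mixing an integer with a pointer loses all type information:
assert join("Int", ("Ptr", frozenset({"S"}))) == "All"
```

A forward evaluation in the style of eval_ptr would then fold this join over the abstract values of the sub-expressions.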

The abstract memory type is thus a pair made of a points-to information, and a point in the numerical domain:

Definition t : Type := ( Map [ acell, pointsto ] * num ).

5.3.3 Numerical Conversion

The second phase¹ of the conversion, implemented by the function nconvert, produces a numerical expression (of type nexpr) from a pointer expression (of type pexpr). Its code is shown at the top of Figure 5.4. To convert a pointer-expression, it is recursively traversed and its structure is mostly kept; e.g., a negation of an integer expression is mapped to the negation of the converted sub-expression. When converting constants, integers are kept, floats are forgotten, and pointers are replaced by their offsets. Non-trivial cases occur when converting boolean expressions. For instance, the Oboolval unary operator casts its argument into a boolean (i.e., integer zero or one). To correctly convert such an expression, we need to distinguish whether the argument is an integer or a pointer:

• if it is an integer, the conversion goes on as usual;

• if it is a pointer, the whole expression is mapped to the constant expression that evaluates to integer one; indeed, all (non-null) pointers are true;

• if its type is not known at analysis time, the result is soundly over-approximated by the expression that may evaluate to integers zero and one.
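The three cases above can be summed up in a tiny illustrative Python sketch (hypothetical names and expression encodings; the integer case elides the recursive conversion):

```python
def convert_boolval(arg_type, converted_arg):
    """Numerical conversion of (Oboolval e), dispatching on the
    points-to type of e as described above."""
    if arg_type == "Int":
        return ("boolval", converted_arg)   # conversion goes on as usual
    if arg_type == "Ptr":
        return ("const", 1)                 # all (non-null) pointers are true
    return ("any_bool",)                    # type unknown: may be 0 or 1

assert convert_boolval("Ptr", None) == ("const", 1)
assert convert_boolval("All", None) == ("any_bool",)
```

The same type-dispatch pattern appears in the Ocmpu case of Figure 5.4, where comparing an integer with a pointer is resolved from the comparison operator alone.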

5.3.4 Load Elimination

The first phase of the conversion, implemented by the function convert, is shown at the bottom of Figure 5.4. It produces a set of pointer expressions out of one CFG expression (or gives up and returns All). It traverses an expression, converts its sub-expressions, and combines all possible results (as in a regular set monad). When encountering an expression of the form (Eload κ e′), the sub-expression e′ is recursively converted. This produces a set lp of pointer-expressions. This set is dereferenced, i.e., the set of cells that may be targeted by these expressions is computed: each expression pe in the set lp is evaluated by the points-to domain; this yields a set bs of blocks; the expression pe is also converted to a numerical expression (as described in Section 5.3.3) that, thanks to the concretize operator of the numerical domain, yields a concrete set of offsets (or gives up, if the result is not known to be small enough). The Cartesian product of these two sets produces a concrete set of cells. This logic is implemented by the deref_pexpr function, whose code follows.

Definition deref_pexpr (ab: t) κ (lp: set (pexpr acell)) : (set acell)+⊤ :=
  let (pt, nm) := ab in

¹ The first phase depends on the second one and is described in the next section.


Definition nconvert_constant (cst: constant) : nconstant :=
  match cst with
  | Ointconst n => NOintconst n
  | Ofloatconst n => NOintunknown
  | Oaddrsymbol _ ofs | Oaddrstack ofs => NOintconst ofs
  end.

Fixpoint nconvert (pe: pexpr acell) : nexpr acell :=
  match pe with
  | PEvar x => NEvar x
  | PEconst cst => NEconst (nconvert_constant cst)
  | PEunop op pe′ =>
    match op with
    | Onegint | Onotint => NEunop op (nconvert pe′)
    | Oboolval =>
      match eval_type pe′ with
      | Just PtInt => NEunop op (nconvert pe′)
      | Just (PtPtr _) => ne_true
      | All => any_bool
      end
    end
  | PEbinop op pe₁ pe₂ =>
    match op with
    | Oadd | … => NEbinop op (nconvert pe₁) (nconvert pe₂)
    | Ocmpu c =>
      match eval_type pe₁, eval_type pe₂ with
      | All, _ | _, All => any_bool
      | Just ty₁, Just ty₂ =>
        match ty₁, ty₂ with
        | PtInt, PtInt => NEbinop op (nconvert pe₁) (nconvert pe₂)
        | PtInt, PtPtr _ | PtPtr _, PtInt =>
          match c with Cne => ne_true | _ => ne_false end
        | PtPtr bs₁, PtPtr bs₂ => …
        end
      end
    end
  | … end.

Fixpoint convert (e: expr) (ab: t) : (set (pexpr acell))+⊤ :=
  match e with
  | Evar s => Just (singleton (PEvar (ACreg s)))
  | Econst cst => Just (singleton (PEconst cst))
  | Eunop op e′ =>
    do_top pe′ <- convert e′ ab;
    Just (SetConstructs.map (PEunop op) pe′)
  | Eload κ e′ =>
    do_top lp <- convert e′ ab;
    do_top cells <- deref_pexpr ab κ lp;
    Just (SetConstructs.map PEvar cells)
  | … end.

Figure 5.4: Conversion (excerpt)


  SetInterface.fold
    (λ (pe: pexpr acell) (s: (set acell)+⊤),
      do_top s′ <- s;
      do_top ptr <- eval_ptr pt pe;
      match ptr with
      | PointsTo.PtPtr bs =>
          do_top bs <- bs;
          let ne := nconvert pt pe in
          do_top ofs <- concretize ne nm;
          Just (cell_product bs κ ofs ++ s′)
      | _ => s
      end)
    lp (Just {}).

In the example from Section 5.2, to analyze line 6, one CFG expression to convert is (in a more readable syntax): EVar x + 4 × ELoad Mint32 (EVar p). To do so, we first dereference the sub-expression (EVar p):

1. This sub-expression is converted to a set of pointer expressions, namely to the singleton {PEVar (ACreg p)}.

2. All expressions in this set (there is only one) are evaluated (in parallel) in the points-to domain and in the numerical one. In the points-to domain, cell ACreg p is mapped to the value Ptr({ S; T }), which represents all pointers into the blocks to which global variables S or T are allocated. Evaluation in the numerical domain may result in the precise set { 0; 4 }.²

3. The Cartesian product of these sets yields a set of four abstract cells: { ACglobal S Mint32 0; ACglobal S Mint32 4; ACglobal T Mint32 0; ACglobal T Mint32 4 }.

Then conversion proceeds and builds an expression PEVar x + 4 × c for each cell c from the above set.

5.3.5 Abstract Transformers

Conversion enables the implementation of the operators that model CFG instructions: forget, assume, assign, and store. We describe their design in this section.

Forget

The forget operator over-approximates any instruction that may overwrite a temporary in an unknown way, e.g., an input instruction. During the analysis, we need a more general operator that can forget the content of any cell. We therefore provide a forget_cells function that forgets anything known about a set of cells. It removes them from the points-to map and calls the corresponding forget operator of the numerical domain. This operation involves no conversion.

² Such precision is achieved using a product of interval and congruence domains.


Definition assume (e: expr) (ab: t) : t+⊥ :=
  match convert (Eunop Oboolval e) ab with
  | All => NotBot ab
  | Just pes =>
    let (pt, nm) := ab in
    set_map_reduce (λ pe,
      do_bot pt′ <- pt_assume pe pt;
      do_bot nm′ <- assume (nconvert pt′ pe) nm;
      NotBot (pt′, nm′)
    ) pes
  end.

Figure 5.5: Implementation of assume

1  int x, y, z;
2
3  int main(void)
4  {
5    int *p = any_bool() ? & x : & y;
6    if ( x < z && y < z ) {
7      if ( 0 < *p ) {
8        assert ( 1 < z );
9      }
10   }
11   return 0;
12 }

Figure 5.6: Test case for assume

Assume The purpose of the assume e ab transformer is to refine an abstract state ab to take into account the fact that expression e evaluated to a true value. Its implementation is given in Figure 5.5. The expression is first prefixed by the operator Oboolval to ensure that the result is actually cast to a boolean. The resulting expression is then converted to a pexpr in order to eliminate the loads in the expression. This conversion may fail, in which case nothing new can be learned. Otherwise, the conversion returns a set pes of pointer expressions. All these expressions are then given to the pt_assume transformer of the points-to domain and to the assume transformer of the underlying numerical domain (recall that nconvert casts a pointer-expression into the corresponding numerical expression). The combinator set_map_reduce iterates over the set pes, runs the function it gets as argument in parallel for each element pe in the set, then joins all the results. The pt_assume operator is in charge of refining the points-to information. It cannot do anything useful yet; it will be discussed in more detail later (§ 5.5.3).

Example Consider the analysis of the program shown in Figure 5.6, using a relational numerical domain (e.g., polyhedra) that is able to infer, after the first if, that z is larger than both x and y. When the analysis reaches line 8, it knows that the guard 0 < *p is true. However, pointer p targets either x or y, so conversion of the guard yields two numerical expressions: one about x, the other about y. Since in both cases the numerical domain is able to infer that z is larger than one, in spite of the uncertainty about the target of p, the analysis is able to prove the assertion on line 8. This implementation is not the most precise one can think of. To illustrate the loss of precision, consider the following C snippet.

1 int x = 0, y = 1; int *p = any_bool() ? & x : & y; 2 if ( *p ) { /* p points to y */ } else { /* p points to x */ }

The conversion of the expression *p yields two expressions: one corresponding to variable x, one corresponding to variable y. The numerical domain is then able to prove that assuming that x is true leads to a contradiction. However, this information is not propagated


Definition assign (x: ident) (e: Cminor.expr) (ab: t) : t+⊥ :=
  assign_cells None (singleton (ACreg x)) e ab.

Definition store (κ: memory_chunk) (ℓ r: Cminor.expr) (ab: t) : t+⊥ :=
  match (do_top ptr <- convert ab ℓ; deref_pexpr ab κ ptr) with
  | All => NotBot (⊤) (* give up: the whole heap may be modified *)
  | Just cells => assign_cells (Some κ) cells r ab
  end.

Definition assign_cells (κ: option memory_chunk) (dst: set acell) (e: expr) (ab: t) : t+⊥ :=
  match convert ab e with
  | All => forget_cells (overlapping_cells dst ++ dst) ab
  | Just pes =>
    let (pt, nm) := ab in
    do_bot ab′ <-
      set_map_reduce (λ c,
        do_bot res <-
          set_map_reduce (λ pe,
            let ty := eval_ptr pt pe in
            if chunk_type κ ty then
              let ne := nconvert pt (ensure_cast_for_chunk κ pe) in
              do_bot nm′ <- assign c ne nm;
              NotBot (ty, nm′)
            else NotBot (⊤)) pes;
        let (ty, nm′) := res in
        NotBot (map_assign pt c ty, nm′)
      ) dst;
    forget_cells (overlapping_cells dst) ab′
  end.

Figure 5.7: Updating cells: assign and store

to the points-to domain. Indeed, no information about the choices made during the conversion is kept. To address this limitation, the conversion could produce not only a set of expressions, but a set of pairs made of an expression and an abstract environment, where the abstract environment is the original one refined with the choices made during conversion. This would also address the precision loss in expressions like *x + *x. Such an improvement is left as future work.
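The behaviour on the snippet above can be replayed with a tiny interval model (an illustrative Python sketch with hypothetical names, not the Coq development): the guard *p is converted to one numerical guard per possible target of p, each guard refines a copy of the state, and the copies are joined.

```python
# State: each variable maps to an interval (lo, hi); None denotes bottom.
state = {"x": (0, 0), "y": (1, 1)}

def assume_nonzero(st, var):
    """Refine st with 'var != 0'; an interval reduced to {0} is contradictory."""
    if st is None:
        return None
    if st[var] == (0, 0):
        return None               # contradiction: bottom
    return st                     # (coarse: no further refinement)

def join(st1, st2):
    """Point-wise interval join; bottom is the neutral element."""
    if st1 is None:
        return st2
    if st2 is None:
        return st1
    return {v: (min(st1[v][0], st2[v][0]), max(st1[v][1], st2[v][1]))
            for v in st1}

# *p converts to two guards, one per possible target of p:
result = join(assume_nonzero(state, "x"), assume_nonzero(state, "y"))
assert result == state
```

The x-branch collapses to bottom, so only the y-branch survives the join; yet the resulting state is unchanged and, in particular, nothing about p itself was learned, which is exactly the precision loss discussed above.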

5.3.6 Assign & Store

Both operators assign and store, whose implementation is given in Figure 5.7, share the same logic: evaluate an expression, and store its result at some location. The difference between the two is that the destination is definitely known in the case of assign, whereas it is denoted by an expression in the case of store. Therefore, this second operator first needs to compute an over-approximation of the set of cells that may be designated by this expression (thanks to deref_pexpr, also used in the load elimination phase). Then, both operators rely on the more general assign_cells that updates a set of cells. Abstract cells may overlap. Therefore, an explicit update to one cell may hide implicit

updates to overlapping cells. To soundly model this property, we conservatively forget anything that is known about any overlapping cell. To do so we rely on an auxiliary function overlapping_cells that computes the set of all cells that are distinct from all cells in a given set but overlap with some of them. To assign to a set of cells the result of a given expression e, this expression is first converted to a set of pointer-expressions. In case of failure of the conversion, the target cells (and the overlapping ones) are erased. Otherwise, for each destination cell c and each pointer expression pe resulting from the conversion, the assignment of this expression to that cell is performed, in parallel, in the points-to domain and in the numerical one. All these parallel assignments are then joined together. This results in a strong update if the destination set is a singleton, and a weak update otherwise. Finally, all overlapping cells are erased.

Abstract cells have a memory chunk as meta-data, which simplifies the view of the memory: otherwise, when reading a cell, we would have to decode its contents. In contrast, we chose to perform the trans-coding on stores. That is why there is first a check of the type of the value to store, and why the expression is changed to add an explicit cast, so that the numerical domain knows that the value has to be trans-coded. For instance, when adding a cast to 8-bit unsigned integer, the numerical domain truncates the abstract value to the interval [0; 255].

5.4 Soundness

We have seen so far the implementation of the memory domain. We now move on to its soundness proof. We first introduce an intermediate specification of the concrete memory in Section 5.4.1, then discuss the concretization relation of the points-to domain in Section 5.4.2 and of the whole memory domain in Section 5.4.3. Finally, we present the central lemmas of the soundness proof in Section 5.4.4.

5.4.1 Functional Memory

The (concrete) memory model of CompCert has been defined to enable the specification of various programming language semantics, and the verification of the compiler. Unfortunately, it is not very convenient for the verification of the abstract domain. Therefore, we introduce an intermediate specification of the concrete memory which is structurally close to the abstract memory. This concrete memory is made of cells. Each such cell represents a location from which some data can be fetched. A memory is then simply a total function from cells to values. The value Vundef represents uninitialized data and thus makes it possible to see the memory as a total function. There are two kinds of cells: cells that are in memory (global variables and local variables whose address is taken); and temporary variables (a.k.a. registers). The similarity with the abstract memory is intentional.

Inductive ccell : Type :=
| CCmem (b: block) (κ: memory_chunk) (ofs: Z)
| CClocal (s: ident).

To be able to correctly define the semantics of the CFG expressions in this model, we need to keep track of a little bit more information. Indeed, pointer comparison in


CompCert behaves differently depending on whether its arguments are valid pointers or not. The validity of pointers is defined in terms of permissions attached to each memory address. Therefore, the memory is defined as the following record type.

Record fmem : Type := {
  f_of_fmem :> ccell → val;
  perm : block → Z → option permission
}.

The first field, namely f_of_fmem, is a coercion (as denoted by :>); this means that a value of type fmem can be used as a function of type ccell → val. Therefore, reading the content of a cell c in a memory f amounts to the function application written f(c). Writing to the memory is a little more intricate. Indeed, memory cells can overlap. We introduce a notion of disjointness (i.e., non-overlap) with the following predicates. Overlapping cells are either the same temporary or overlapping chunks of the same memory block.

Definition disjoint_ranges (r r′: Z * memory_chunk) : Prop :=
  let (i, κ) := r in
  let (i′, κ′) := r′ in
  i′ + size_chunk κ′ <= i ∨ i + size_chunk κ <= i′.

Definition disjoint_cells (c c′: ccell) : Prop :=
  match c, c′ with
  | CCmem b κ ofs, CCmem b′ κ′ ofs′ => b′ ≠ b ∨ disjoint_ranges (ofs, κ) (ofs′, κ′)
  | CClocal i, CClocal i′ => i ≠ i′
  | _, _ => True
  end.

Then a store can be specified by the following relation between memories. It says that the memory resulting from the store, f′, holds the new value v at the target cell c, and agrees with the original memory f for all cells disjoint from c³.

Definition fmem_update (c: ccell) (v: val) (f f′: fmem) : Prop :=
  f′ c = v ∧ (∀ c′, c′ ∈ disjoint_cells c → f′ c′ = f c′).

The abstract transformers of the memory domain can then be specified against this concrete view of the memory, as shown in Figure 5.8. The forget_sound property reads as follows: the concretization of the result of (forget x ab) contains (at least) all concrete memories obtained after updating an initial memory f (in the concretization of ab) at the target cell with any value v. The specification for the assign transformer is very similar, except that the value v is taken among the possible results of the evaluation of the expression e. Again, the specification of the store transformer is similar, except that the updated cell (CCmem b κ (Int.unsigned i)) is built from any result (Vptr b i) of the evaluation of the address expression ℓ, and that the stored value v is transformed according to the chunk κ specifying the memory access (as defined by CompCert’s Val.load_result function). In

³ This does not specify the value of overlapping cells, and is therefore not sufficient for a precise analysis of programs performing type-punning; see § 5.5.2 for a discussion.


Class fmem_dom (ge: Genv.t) (t: Type) (D: pre_mem_dom t) (G: gamma_op t fmem) : Prop := {
  range_sound: ∀ (ab: t) (f: fmem) (x: ident),
    f ∈ γ ab →
    match f (CClocal x) with
    | Vint i | Vptr _ i => i ∈ ints_in_range (range ab x)
    | _ => True
    end;
  forget_sound: ∀ (x: ident) (ab: t) (f: fmem) (v: val),
    f ∈ γ ab →
    fmem_update (CClocal x) v f ⊆ γ(forget x ab);
  assign_sound: ∀ (x: ident) (e: expr) (ab: t) (f: fmem) (v: val),
    f ∈ γ(ab) →
    v ∈ eval_expr ge f e →
    fmem_update (CClocal x) v f ⊆ γ(assign x e ab);
  store_sound: ∀ (κ: memory_chunk) (ℓ r: expr) (ab: t) (f: fmem) (b: block) (i: int) (v: val),
    let c := CCmem b κ (Int.unsigned i) in
    f ∈ γ(ab) →
    Vptr b i ∈ eval_expr ge f ℓ →
    v ∈ eval_expr ge f r →
    c ∈ writable_cell f →
    fmem_update c (Val.load_result κ v) f ⊆ γ(store κ ℓ r ab);
  assume_sound: ∀ (e: expr) (ab: t) (f: fmem),
    f ∈ γ(ab) →
    true ∈ Union (eval_expr ge f e) Val.bool_of_val →
    f ∈ γ(assume e ab)
}.

Figure 5.8: Abstract memory specification

addition, this transformer may use the fact that the destination cell is writable, i.e., that the offset i is correctly aligned and that there are sufficient permissions to write there. The whole specification is parameterized by a global environment ge, corresponding to the program being analyzed.
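The fmem_update relation of Section 5.4.1 can be read executably; here is a small illustrative Python model (hypothetical names; memories are dicts from cells to values, cells are tuples carrying their byte size, which stands in for size_chunk):

```python
def disjoint(c1, c2):
    """Disjointness of concrete cells, mirroring disjoint_cells:
    two memory cells are disjoint if their blocks differ or their
    byte ranges do not intersect; two locals are disjoint if their
    names differ; a memory cell and a local never overlap."""
    if c1[0] == "mem" and c2[0] == "mem":
        (_, b1, o1, sz1), (_, b2, o2, sz2) = c1, c2
        return b1 != b2 or o1 + sz1 <= o2 or o2 + sz2 <= o1
    if c1[0] == "local" and c2[0] == "local":
        return c1[1] != c2[1]
    return True

def fmem_update(c, v, f, f2):
    """f2 results from storing v at c in f: f2 holds v at c and agrees
    with f on every cell disjoint from c (overlapping cells are free)."""
    return f2[c] == v and all(f2[c2] == f[c2] for c2 in f if disjoint(c2, c))

T0 = ("mem", "T", 0, 4); T4 = ("mem", "T", 4, 4); T64 = ("mem", "T", 0, 8)
f  = {T0: 1, T4: 0, T64: "undef"}
f2 = {T0: 7, T4: 0, T64: "undef"}   # wrote 7 at T0; T64 overlaps, T4 does not
assert fmem_update(T0, 7, f, f2)
assert not fmem_update(T0, 8, f, f2)
```

Because T64 overlaps T0, its value is unconstrained by the relation, which is exactly why a precise analysis of type-punning needs the extension of § 5.5.2.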

5.4.2 Points-to Domain

The soundness of the points-to domain is established against a concretization relation defined as follows. The Int abstract value represents all machine integers. The abstract values of the form Ptr(bs) represent all pointers whose block is in the set bs (the offset is not constrained). The trivial abstract type All is related to any value. In particular, it is the only abstract value that represents the bogus value Vundef. This invariant makes it possible to prove that some value cannot be Vundef, which is particularly useful for proving progress (see next chapter, § 6.3) and for precisely analyzing programs with type-punning (see § 5.5.2).


5.4.3 Concretization Relation

The relational domain comes with a concretization to functions from (abstract) cells to numerical values (machine integers). Using the ncompat relation (see below), it can be given a concretization to functions from cells to values (including pointers). The type domain, which maps abstract cells to an abstraction of their types, also concretizes to a function from abstract cells to values. We can then define the following concretization relation for the abstract memory domain; here concrete values are functions from abstract cells to values.

Definition ncompat (v: val) (j: int) : Prop :=
  match v with
  | Vint i | Vptr _ i => i = j
  | Vfloat _ | Vundef => True
  end.

Instance pre_gamma : gamma_op t (acell → val) := λ ab f,
  let (pt, nm) := ab in
  f ∈ γ(pt) ∧ ∃ (ν: acell → int), ν ∈ γ(nm) ∧ ∀ c, ncompat (f c) (ν c).

The set pre_gamma(pt, nm) is the intersection of the concretization γ(pt) of the points-to map and of the set of all memories related (by the point-wise lifting of the ncompat relation) to some function ν in the concretization γ(nm) of the numerical abstraction. In order to relate an abstract memory to a concrete one, we still have to relate abstract cells to concrete cells. This is done through a partial allocation function, which maps abstract cells to concrete cells. One such allocation function, called δ₀, is given below. Since the set of abstract cells is a bit limited, this function is not far from the identity function (modulo the correspondence between block identifiers and variable names). It will be refined when the type acell is extended (such as with local variables).

Definition allocation : Type := acell → option ccell.

Definition δ₀ : allocation := λ ac,
  match ac with
  | ACglobal g κ o => do_opt b <- Genv.find_symbol ge g; Some (CCmem b κ o)
  | ACreg i => Some (CClocal i)
  end.

Given such an allocation function, we can relate memories as follows: related cells are mapped to equal values (and cells that are related to nothing have unconstrained value).

Definition mem_rel (δ: allocation) (m: fmem) (ρ: acell → val) : Prop :=
  ∀ c a, δ(a) = Some c → m(c) = ρ(a).

Notice that, given an allocation function δ and a memory m, the set mem_rel δ m is not empty: it contains at least the function m ∘ δ′, where δ′ allocates every unallocated cell to a dummy one (e.g., δ′ a := match δ a with Some c => c | None => CClocal 1 end).


Then we can define the concretization relation from the memory abstract domain to concrete memories. A memory m is in the concretization of an abstract memory ab whenever every function related to m (through mem_rel δ₀) belongs to the pre_gamma concretization of ab.

Instance gamma : gamma_op t fmem := λ ab m,
  mem_rel δ₀ m ⊆ pre_gamma(ab).

By using a universal quantification (i.e., set inclusion) when an existential one (i.e., non-emptiness of the intersection) would be plausible as well, we insist on the fact that we do not care about the values of the non-allocated cells, and that the memory domain should not compute anything about these cells.

5.4.4 Key Lemmas

So as to depict how the soundness proof of the memory domain is carried out, this section highlights the most important lemmas; their proofs are done in Coq and not described here.

Conversion There are two phases during conversion: load elimination; and pointer elimination. The first phase produces a set of expressions such that, collectively, the resulting expressions may evaluate to all possible values of the original expression.

Lemma convert_correct (m: fmem) (ρ: acell → val) (ab: t) (e: expr) pes :
  mem_rel δ₀ m ρ → m ∈ γ(ab) → convert e ab = Just pes →
  eval_expr m e ⊆ Union pes (eval_pexpr ρ).

The second phase produces, for each pointer-expression, a purely numerical expression. Numerical expressions evaluate to numerical values (machine integers) whereas pointer-expressions may also evaluate to pointers. Therefore, the soundness property of this second phase states that for each possible value v of the original expression, the resulting expression evaluates to some number n that is compatible with value v. The compatibility relation has been defined in Section 5.2.3. It relates in particular numbers to themselves and pointers to their offsets.

Lemma nconvert_correct pe :
  ∀ v, v ∈ eval_pexpr pe →
  ∃ n, n ∈ (eval_nexpr (nconvert pe) ∩ ncompat v).

Assign The soundness lemma for the (assign_cells dst e ab) operation reads as follows. For every concrete memory f and value v the expression e may evaluate to in f, for every destination cell a (related to the concrete cell c through δ₀), the concrete memories obtained by updating f at c with value (Val.load_result κ v) are in the concretization of the resulting abstract state. The stored value is trans-coded according to chunk κ.


Left (endianness check):

  union {
    int i;
    unsigned char b[4];
  } u;
  u.i = 0x12345678;
  switch (u.b[0]) {
  case 0x12:
    return BIG_ENDIAN;
  case 0x78:
    return LITTLE_ENDIAN;
  }

Right (byte-wise memcpy):

  void *
  memcpy(void *dest, const void *src, size_t n)
  {
    uint8_t *d = dest;
    const uint8_t *s = src;
    int i;
    for ( i = 0 ; i < n ; ++i ) {
      d[i] = s[i];
    }
    return dest;
  }

Figure 5.9: Type punning examples
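The byte-wise copy on the right of Figure 5.9 can be exercised concretely; the following self-contained sketch renames it my_memcpy (the rename and the test values are mine) to avoid clashing with the standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The byte-wise copy on the right of Figure 5.9, made self-contained;
   renamed my_memcpy to avoid clashing with the standard function. */
void *my_memcpy(void *dest, const void *src, size_t n) {
    uint8_t *d = dest;
    const uint8_t *s = src;
    for (size_t i = 0; i < n; ++i)
        d[i] = s[i];   /* one byte-wise load and one byte-wise store */
    return dest;
}
```

Each d[i] = s[i] is, in CompCert terms, a Mint8unsigned load followed by a Mint8unsigned store; proving that the copied integer equals the original is exactly the kind of fact that requires tracking such byte-level puns.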

Lemma assign_cells_sound (κ: option memory_chunk) (dst: set acell) (e: expr) (ab: t) :
  ∀ m, m ∈ γ ab →
  ∀ v, v ∈ eval_expr m e →
  ∀ a, a ∈ dst →
  ∀ c, δ₀ a = Some c →
  fmem_update c (Val.load_result κ v) m ⊆ γ(assign_cells κ dst e ab).

5.5 Extensions

The memory domain presented so far has various limitations; in this section we sketch how to remove some of them.

5.5.1 Local Variables

In the CFG language, (non-temporary) variables that are local to a function are allocated to one single memory block, corresponding to the stack frame of this function. Since we consider programs without function calls, the stack pointer (i.e., the pointer to the beginning of the stack frame) is constant and can be statically predicted.

The type of abstract blocks is therefore enriched with a name for the stack block. Similarly, the type of abstract cells is enriched with a constructor for stack-allocated local variables; it is parameterized, as for global variables, by a memory chunk and an offset. The allocation function is also defined for these cells. The points-to domain can then be refined to precisely abstract constant stack pointers.

Modeling of local variables would be more involved in the presence of function calls. We defer the description of the analysis of programs with functions until the next chapter (§ 6.2).

5.5.2 Realization & Type Punning

Some low-level programs perform so-called type punning. This means accessing some data of a given type (say a 32-bit integer) as if it were of a different type (say an array of four 8-bit integers). This is usually done to access the bit-level representation of some data.


For instance, the program on the left of Figure 5.9 is a C snippet that discovers at run-time the endianness of the architecture. Such puns are only loosely specified by the C standard. Most of them are however precisely defined in the CompCert semantics. There are several kinds of puns. They all involve reading some data with a different chunk than the one that had been used to write it. We will focus on the following two cases.

• Reading at the same address with a chunk of the same size. This covers changes in signedness and manipulations of the bit-level representation of floats.⁴ The first are similar to casts (with the Ocast8unsigned operator, for instance) whereas the latter correspond to no standard C operation and are not handled by the numerical abstract domains.

• Reading one byte of a multi-byte data, as in the implementation of the memcpy function given on the right of Figure 5.9.

In case of type punning, a cell whose content has to be read is not bound in the abstract domain. Indeed, when a cell is written, the (abstract) contents of all overlapping cells are forgotten. To precisely approximate the read value, the cell needs to be realized [Min06], that is, bound to some non-trivial abstract value derived from the values of overlapping cells. We expect the following property from the realize c ab function, meant to realize the cell c in the abstract state ab: all memories in the concretization of some abstract state ab before realization are in the concretization after realization.

Lemma realize_sound (c: acell) (ab: t) :
  γ(ab) ⊆ γ(realize c ab).

However, this is hardly provable, since in the functional view of the concrete memory, the values of overlapping cells are not constrained. One way to address this issue is to further constrain the fmem type. In a first step, we can add the following property:

Definition pun_similar (f: cell → val) :=
  ∀ b κ κ′ ofs,
    size_chunk κ = size_chunk κ′ →
    align_chunk κ = align_chunk κ′ →
    ∃ bytes,
      f (CCmem b κ ofs) = decode_val κ bytes ∧
      f (CCmem b κ′ ofs) = decode_val κ′ bytes.

This is enough to handle some kinds of puns: when some data is written with some chunk and read back with a different one, but at the exact same offset and with similar alignment constraints. This covers reinterpretation of the signedness of a small int and manipulation of the bit-level representation of floats. Another helpful property, which covers the endianness check and the memcpy function of Figure 5.9, is the following.

Definition pun_u8 (f: cell → val) :=
  ∀ b κ ofs, (align_chunk κ | ofs) →
  ∃ bytes,
    f (CCmem b κ ofs) = decode_val κ bytes ∧
    length bytes = size_chunk κ ∧
    ∀ i (LT: i < length bytes),
      f (CCmem b Mint8unsigned (ofs + i)) =
        decode_val Mint8unsigned (get i bytes LT :: nil).

⁴ Fast computations of approximations of inverse square roots are a notable example of floating-point operations performed on the bit-level representation of floats.

Left:

  1  int x;
  2
  3  int main(void)
  4  {
  5    int * p = 0;
  6    x = 0;
  7    if ( any_bool() ) p = & x;
  8    if ( p != 0 ) *p = 1;
  9    int z = x;
  10   return z;
  11 }

Right:

  1  int x[2];
  2
  3  int main(void)
  4  {
  5    int * p = 0;
  6    x[0] = 0; x[1] = 1;
  7    if ( any_bool() ) p = & x[1];
  8    if ( p != 0 ) *p = 1;
  9    int z = x[0];
  10   return z;
  11 }

Figure 5.10: C programs checking for null pointers

It states that for every value read through some chunk κ at a properly aligned address, there is a sequence of bytes from which this value is decoded, such that reading in memory at the same address the i-th Mint8unsigned chunk yields the decoding of the i-th byte in the sequence.⁵ Notice that this would not hold without the alignment requirement: mis-aligned reads of multi-byte chunks always yield Vundef, even though the byte-wise readings return well-defined values.

These properties of the memory are added as invariants of the type fmem. This makes it possible to prove the soundness of realization functions that bind new cells in the memory domain (i.e., in both the points-to and numerical domains) from information about overlapping cells. Such functions are called optimistically when expressions dereference unbound cells, so as to try to bind them to some non-trivial value.
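Both situations can be pictured with ordinary C (a sketch; the function names are mine): as_unsigned mirrors the pun_similar case (same address, same-size chunk, different signedness; the expected result assumes a two's-complement machine), and byte_roundtrip mirrors the pun_u8 case (reassembling the four byte-wise reads of a 32-bit store).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* pun_similar-style pun: store a signed int and reload the same bytes
   through a same-size unsigned view at the same address. The union
   members share one byte sequence. */
union pun { int i; unsigned u; };

unsigned as_unsigned(int x) {
    union pun p;
    p.i = x;      /* write with a signed 32-bit chunk */
    return p.u;   /* re-read the same bytes, unsigned */
}

/* pun_u8-style pun: after a 32-bit store, the four byte-wise reads
   return the encoded byte sequence; reassembling that sequence decodes
   back to the stored value. memcpy keeps the sketch independent of the
   host's endianness. */
uint32_t byte_roundtrip(uint32_t v) {
    uint8_t bytes[4];
    memcpy(bytes, &v, 4);   /* the four byte-wise reads  */
    uint32_t w;
    memcpy(&w, bytes, 4);   /* decode the same sequence  */
    return w;
}
```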

Realization needs to take place during conversion. Here is an example: there is a pun on variable x, that is written as a 32-bit integer, and read as an 8-bit unsigned integer.

  int t[2] = { 1, 2 };
  union { int i; uint8_t c[4]; } x;
  x.i = 1;
  int v = t[x.c[0]];

To understand what cells are targeted by the array access, we need to numerically eval- uate the subscript expression. This requires the 8-bit cell to be realized during conversion. The conversion function is therefore adapted to manipulate the abstract state as in the state monad, rather than as an input value.
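A self-contained version of the snippet (the wrapper function is mine) shows why realization matters for precision: the subscript is an 8-bit cell whose value is determined by the 32-bit write, and it depends on the byte order of the host.

```c
#include <assert.h>
#include <stdint.h>

/* Runnable version of the pun above (the wrapper is mine). The
   subscript x.c[0] is an 8-bit cell that must be realized from the
   32-bit write x.i = 1; its value, hence v, depends on endianness. */
int punned_index(void) {
    int t[2] = { 1, 2 };
    union { int i; uint8_t c[4]; } x;
    x.i = 1;
    return t[x.c[0]];   /* 2 on a little-endian host, 1 on a big-endian one */
}
```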

5.5.3 The Case of Null Pointers

⁵ The get function takes as an argument a proof LT that there are at least i elements in the list, to be sure to be able to return a meaningful value.


[Figure: a lattice of abstract types, with top element All, the elements Int and Maybe-null pointer below it, and Null and Not-null pointer at the bottom.]

Figure 5.11: Lattice of abstract types to handle null pointers

The analyzer described so far is not able to prove that, after a test ensuring that a pointer is not null, said pointer is actually not null. Consider for instance the program on the left of Figure 5.10. There, pointer p is either null or points to variable x, depending on an unknown input. The assignment on line 8 is guarded, so it is safe and it updates variable x.

The problem comes from the fact that zero and null pointers are the same value (in CompCert as in standard C99 [C99, § 6.3.2.3]). Indeed, in the above example, variable p may have two distinct types: integer or pointer. In other circumstances, the memory domain soundly ignores variables whose type cannot be inferred: a safe program is not expected to store values of different types in the same variable. The null pointer is an exception to this typing rule: a pointer can be used as an integer without an explicit cast, provided it is null (and symmetrically). We therefore refine two components of the memory domain, as described below.

Richer types Currently, the points-to domain loses precision when joining the abstraction for a null value (an integer) and a non-null pointer. This domain is therefore refined to still precisely handle maybe-null pointers (see Figure 5.11). Values with pointer type are also associated with a points-to set, as before (not shown in the figure).

Assume When analyzing a guard such as ( p != 0 ), if p is known to be a maybe-null pointer, its abstract type has to be refined to the not-null pointer type with the same points-to set. More generally, we introduce a backward evaluation function in the points-to domain, similar to the one presented in the case of numerical environments in Section 3.3.2. This backward evaluation function is the implementation of the pt_assume operator introduced in Section 5.3.5.

This enhancement enables the analysis of programs that manipulate pointers that may be null. However, precision could still be improved, as illustrated by the program on the right of Figure 5.10. This program is very similar to its neighbor, but variable x is now an array of two integers and p, when not null, targets the second cell (i.e., its offset is four). Thus, when the analysis reaches line 8, the points-to domain knows that p, when not null, points to the block of x, and the numerical domain knows that the offset of p is zero or four. This comes from the join, in both domains, of the information about the initial null value of p and the information about the non-null pointer to the second cell of x. Thus the analyzer considers that the store on line 8 may target both cells of x, resulting in a precision loss. This situation could be improved by a better communication between the numerical and points-to domains, left as future work.
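The program on the left of Figure 5.10 can be made runnable as follows (the wrapper and the stubbed any_bool parameter are mine): in the guarded branch, p is necessarily non-null, so the store only ever updates x, which is the fact the refined abstract types let the analyzer prove.

```c
#include <assert.h>

/* Runnable version of the left program of Figure 5.10; the external
   any_bool is replaced by a parameter so both outcomes can be tested. */
static int x;

static int run(int any_bool) {
    int *p = 0;
    x = 0;
    if (any_bool) p = &x;
    if (p != 0) *p = 1;   /* safe: guarded by the null test */
    return x;
}
```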


5.6 Conclusion

In this chapter, we have described an abstract domain that represents the state of the global memory and the local environment of a CFG program. This domain is largely inspired by Antoine Miné's [Min06]. This state is abstracted as a collection of cells.

This domain is fully parameterized by a numerical domain, and, if that domain is relational, the memory domain leverages this feature and computes relational invariants about the contents of the cells. For instance, it is able to prove that an array resulting from a bubble sort is indeed sorted (provided that the size of the array is known, that the loops are unrolled, and that the numerical domain, e.g., the domain of polyhedra, can represent inequalities between variables).

A more precise analysis of the points-to relations, known as the shape of the memory, could be achieved by dedicated shape analyses [CR13]. Such analyses are beyond the scope of this memory domain, as they usually focus on dynamically allocated linked data structures, whereas we target embedded programs without dynamic memory allocation.

The cells of the memory domain may overlap, and care is taken so that overlapping cells are associated with consistent information. Cells can be opportunistically realized when needed, in particular in case of type punning.

This domain is fully verified in Coq against the CompCert semantics of the CFG language. The proof introduces a functional view of the (concrete) memory that is much more convenient to relate to its abstract counterpart.

Future improvements of this memory domain include the ability to hold so-called summary cells, i.e., cells in the abstract domain that represent several cells at once in the concrete state. This is required to efficiently represent large arrays, and to model dynamic memory allocation and recursion. This has been overlooked as we target safety-critical, embedded programs in which the use of such features hardly occurs. Such an enhancement would require generalizing the allocation function and adapting the conversion function, for instance by replacing weak cells by a conservative interval.

The next chapter discusses how the various methods presented in this chapter and the previous ones are applied and refined to construct a verified static analyzer for safety-critical C programs.

Chapter 6 Practical static analyzer for safety-critical C

In the previous chapters, we have seen how to implement executable verified value analyzers. Still, so as to analyze actual programs and have stronger guarantees about the analysis results, the following issues need to be addressed.

Source vs. intermediate representation The analyzers of chapters 3 and 5 operate on the CFG intermediate representation of CompCert. When they are used to analyze C programs, the analysis takes place after a fair amount of program transformations: most notably, local variables have been allocated to stack slots or registers, and the control flow is expressed in CFG as a graph rather than as a sequence of nested statements. In addition, the compiler may remove behaviors and, if the source program is unsafe, even introduce new ones. Therefore, it is unlikely that the results that hold for the CFG programs directly apply to the source C program.

Programs with functions The case of function calls has been put aside in the previous chapters: an easy work-around was to inline all function calls. However, this is not always possible (e.g., in the presence of recursive calls or function pointers) nor always desirable (e.g., because of the increase in code size).

Result validity The analyzers assume that the analyzed programs are free of undefined behaviors, but there is no guarantee that this assumption holds. However, a value analysis could be used to prove such facts (no division by zero, array accesses are in bounds, etc.). To tie the knot, we will implement in a value analyzer the required checks so as to prove, at once, that 1. the analyzed program is free of undefined behaviors; and 2. the analysis results hold for all executions of the program.

Rudimentary numerical abstract domains All analyzers are parameterized by an underlying numerical domain. In practice, only intervals, or at best intervals with congruence information, have been implemented. It would be preferable to be able to reuse existing abstract domain libraries without completely re-implementing them in Coq.

Partial programs Actual programs to be analyzed do call library functions and perform I/O. In chapter 3, the issue was eluded since the memory was not analyzed, and library functions cannot tamper with the contents of local registers. In chapter 4, input was allowed before the program is loaded, and no more. A practical analyzer should provide means to precisely handle programs that perform I/O at any time and call external functions whose code is not available for analysis.


[Figure: the analyzer follows the CompCert pipeline (source C → Clight → C♯minor → Cminor → ···); at the top, the abstract interpreter outputs "OK" or alarms; in the middle, the state abstraction; at the bottom, the numerical domains — symbolic equalities and convex polyhedra (relational), integer and floating-point intervals and integer congruences (non-relational) — connected through "NR → R" and "ℤ → int" adapters.]

Figure 6.1: Global architecture of the Verasco static analyzer

In this chapter we present an analyzer for the C♯minor intermediate representation of CompCert, named Verasco. It is structured as sketched on Figure 6.1: the interpreter reuses the CompCert front-end from source code to C♯minor; it is parameterized by a state abstraction that has been described in chapter 5; and it uses a rich composition of numerical domains, where non-relational domains are adapted to the relational interface by a generic functor ("NR → R" on the figure) and relational domains are designed to operate on ideal arithmetic and adapted to machine arithmetic by a functor ("ℤ → int") described in Section 6.4.1.

This analyzer aims at proving that analyzed programs are free of undefined behavior, i.e., that their execution cannot be stuck. These undefined behaviors include the ones produced by failing assertions. Standing on CompCert, this analyzer targets the analysis of safety-critical programs and therefore requires that said programs do not use features such as recursion or dynamic memory allocation.

This chapter does not describe how floating-point arithmetic is handled. It must be noted that the Verasco analyzer features a non-relational abstract domain of float intervals that is able to infer invariants about the range of floating-point values.

Part of the contents of this chapter has been published at the Symposium on Principles of Programming Languages [Jou+15].

6.1 Abstract Interpretation of Structured Programs with Gotos

All analyzers presented in previous chapters operate on a graph-based representation of programs. The C♯minor intermediate representation, that has been presented in Section 2.2, relies on a structured abstract syntax that prevents a direct application of the

⟦x := e⟧ (Ai,Aℓ) = (assign x e Ai, , , ) ♯ ⟦b; c⟧ (Ai,Aℓ) = (Co,Br Cr,B⊥e ⊥Ce⊥,Bg Cg) ♯ where ⊔(Bo,Br,B⊔e,Bg) = ⟦b⟧⊔ (Ai,Aℓ) and (Co,Cr,Ce,Cg) = ⟦c⟧ (Bo,Aℓ) ♯ ♯ ⟦if e then b else c⟧ (Ai,Aℓ) = ⟦b⟧ (assume e true Ai,Aℓ) ⟦c⟧ (assume e false Ai,Aℓ) ♯ ♯ ♯ ⟦loop s⟧ (Ai,Aℓ) = ( ,Ar,Ae,Ag) ⊔ ♯ where⊥ (Ao,Ar,Ae,Ag) = pfp (λ (Xo,Xr,Xe,Xg), ⟦s⟧ (Ai Xo,Aℓ)) Nwiden ♯ ⟦exit n⟧ (Ai,Aℓ) = ( , , (λ m, m = n ? Ai : ), ) ⊔ ⊥ ♯ ⟦block s⟧ (Ai,Aℓ) = (A⊥o ⊥ Ae(0), Ar, (λ n, Ae(n+1)),⊥ ⊥ Ag) where (Ao,Ar,Ae,Ag) = ⟦b⟧ (Ai,Aℓ) ♯ ♯ ⟦goto L⟧ (Ai,Aℓ) = ( , ⊔, , (λ X, X = L ? Ai : )) ♯ ⟦L: s⟧ (Ai,Aℓ) = ⟦s⟧⊥ ⊥(Ai⊥ Aℓ(L), Aℓ) ⊥ ♯ ♯ ⊔ Figure 6.2: Representative cases of the C minor abstract interpreter ♯ iteration techniques discussed in previous chapters (§ 3.2, § 4.3.3). It would be possible to make explicit the graph structure of C minor programs by introducing program points. However, this task is not trivial, and♯ the analyzer can instead take advantage of the structured form of programs (only goto statements are unstructured). For instance, structural abstract interpreters use less memory than graph- based ones, maintaining only a few different abstract states at any time, instead of one per program point [Cou+09]. The transfer function, noted ⟦·⟧ , is defined over basic as well as compound statements: given the abstract state A “before”♯ the execution of statement s, it returns ⟦s⟧ (A), the abstract state “after” the execution of s. For sequences, we have ⟦s₁; s₂⟧ = ⟦s₂⟧ ∘♯ ⟦s₁⟧ . For loops, the transfer function takes a local fixpoint of the transfer function♯ for the♯ loop♯ body. However, the transfer function for our abstract interpreter is more involved than usual, because control can enter and leave a C minor statement in several ways. A statement s can be entered normally at the beginning, or♯ via a goto that branches to one of the labels defined in s. Likewise, s can terminate either normally by running to the end, or prematurely by executing a return, exit, or goto statement. 
Consequently, the transfer function is of the form ⟦s⟧ (Ai,Aℓ) = (Ao,Ar,Ae,Ag) where Ai (input) is the abstract state at the beginning ♯ of s, Ao (output) is the abstract state after s terminates normally, Ar (return) is the state if it returns, and Ae (exits) maps exit numbers to the corresponding abstract states. The goto statements are handled by two maps from labels to abstract states: Aℓ (labels) and Ag (gotos), the first representing the states that can flow to a label defined in s, the second representing the states at goto statements executed by s. Figure 6.2 excerpts from the definition of the transfer function and shows all these components in action. The loop case computes a post-fixpoint with widening and narrowing, starting at and iterating at most widen times. The pfp iterator is, classically, defined as ⊥ 푁

pfp F A N = ⊤                          if N = 0
          = narrow F A Nnarrow         if A ⊒ F A
          = pfp F (A ∇ F A) (N − 1)    otherwise

with the narrowing phase defined as

narrow F A N = A                         if N = 0
             = narrow F (F A) (N − 1)    if A ⊒ F A
             = A                         otherwise

Each iteration of pfp uses the widening operator ∇ provided by the abstract domain to speed up convergence. Once a post-fixpoint is found, F is iterated up to Nnarrow times in the hope of finding a smaller post-fixpoint. In both widening and narrowing iterations, we use "fuel" N to convince Coq that the recursions above are terminating, and to limit analysis time.

Optionally, the abstract interpreter can unroll loops on the fly: the first N iterations of the loop are analyzed independently in sequence; the remaining iterations are analyzed with a pfp fixpoint. This delays widening and gains precision. The unrolling factor N is currently given by an annotation in the source code.

In addition to the analysis of loop statements, a post-fixpoint is computed for every C♯minor function to analyze goto statements. This function-global iteration ensures that the abstract states at goto statements are consistent with those assumed at the corresponding labeled statements. In other words, if s is the body of a function and ⟦s⟧♯ (Ai,Aℓ) = (Ao,Ar,Ae,Ag) is its analysis, the analysis iterates until Ag(L) ⊑ Aℓ(L) for every label L. When this condition holds, the abstraction of the function maps entry state Ai to exit state Ao ⊔ Ar, corresponding to the two ways a C♯minor function can return (explicitly or by reaching the end of the function body).

Concerning functions, the abstract interpreter reanalyzes the body of a function at every call site, effectively unrolling the function definition on demand. We use fuel again to limit the depth of function unrolling. More details on inter-procedural analysis are given in the next section.

The soundness proof for the abstract interpreter is massive, owing to the complexity of the C♯minor language. To keep the proof manageable, we break it into two parts: 1. the definition and soundness proof of a suitable Hoare logic for C♯minor; and 2. a proof that the abstract interpreter infers Hoare "triples" that are valid in this logic.

C♯minor statements can terminate in multiple ways: normally, or prematurely on an exit, return or goto statement. They can also be entered in two ways: at the beginning of the statement, or via a goto to a label defined within. Consequently, our program logic for C♯minor manipulates Hoare "heptuples" of the form { P, Pℓ } s { Q, Qr, Qe, Qg } where P is the precondition if s is entered normally, Pℓ(L) the precondition if s is entered by a goto L, Q the postcondition if s terminates normally, Qr(v) the postcondition if s terminates by a return of value v, Qe(i) the postcondition if s terminates by exit(i), and Qg(L) the postcondition if s terminates by goto L. The rules of this logic are similar to those of the program logics for Cminor and Clight by Appel and Blazy [AB07; App14] (without the separation logic aspects).
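The pfp and narrow iterations can be sketched on a toy interval domain (all names and the example loop are mine): analyzing the loop head of i := 0; while (i < 100) i := i + 1 with transfer function F(X) = [0,0] ⊔ ((X ⊓ (−∞,99]) + [1,1]), widening first jumps to [0,+∞] and narrowing then recovers [0,100].

```c
#include <assert.h>
#include <stdint.h>

/* Toy interval domain: intervals over 64-bit ints with sentinel
   infinities, plus an explicit bottom element. */
#define NINF INT64_MIN
#define PINF INT64_MAX
typedef struct { int bot; int64_t lo, hi; } itv;

static const itv BOT = {1, 0, 0};
static itv mk(int64_t lo, int64_t hi) { itv r = {0, lo, hi}; return r; }
static int64_t imin(int64_t a, int64_t b) { return a < b ? a : b; }
static int64_t imax(int64_t a, int64_t b) { return a > b ? a : b; }

static itv join(itv a, itv b) {
    if (a.bot) return b;
    if (b.bot) return a;
    return mk(imin(a.lo, b.lo), imax(a.hi, b.hi));
}
/* a ⊒ b : a over-approximates b */
static int geq(itv a, itv b) {
    return b.bot || (!a.bot && a.lo <= b.lo && a.hi >= b.hi);
}
/* Widening: unstable bounds jump to infinity. */
static itv widen(itv a, itv b) {
    if (a.bot) return b;
    if (b.bot) return a;
    return mk(b.lo < a.lo ? NINF : a.lo, b.hi > a.hi ? PINF : a.hi);
}

/* Transfer function for the loop head of
     i := 0; while (i < 100) i := i + 1;
   F(X) = [0,0] ⊔ ((X ⊓ (-∞,99]) + [1,1]) */
static itv F(itv x) {
    itv body = x;
    if (!body.bot) {
        body.hi = imin(body.hi, 99);           /* guard i < 100 */
        if (body.lo > body.hi) body = BOT;
        else body = mk(body.lo == NINF ? NINF : body.lo + 1,
                       body.hi + 1);           /* i := i + 1 */
    }
    return join(mk(0, 0), body);
}

/* narrow F A N: iterate F while it still shrinks the post-fixpoint */
static itv narrow(itv a, int n) {
    while (n-- > 0 && geq(a, F(a))) a = F(a);
    return a;
}
/* pfp F A N: widen until a post-fixpoint is found, then narrow */
static itv pfp(itv a, int n_widen, int n_narrow) {
    while (n_widen-- > 0) {
        itv fa = F(a);
        if (geq(a, fa)) return narrow(a, n_narrow);
        a = widen(a, fa);
    }
    return mk(NINF, PINF);   /* out of fuel: give up with top */
}
```

Starting from bottom, the iterates are [0,0], then [0,1], widened to [0,+∞], which is a post-fixpoint; narrowing then shrinks it to the exact invariant [0,100].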

6.2 Inter-Procedural Analysis

Inlining all function calls before analyzing a program is a convenient work-around for the lack of inter-procedural analysis in an analyzer. Unfortunately, this is not always possible, in particular when functions are called through (non-trivial) pointers or recursively; it may dramatically increase the code size (when a large function is called many times); and it precludes clever inter-procedural analyses (approximations for faster or smaller results). Here we present an extension of our analyzer to inter-procedural analysis that is very similar to inlining (without any increase in code size) but supports function pointers: each function is analyzed in every calling context, by a recursive call of the analyzer. We do not attempt to summarize or merge different calling contexts.

Class mem_dom (t: Type) : Type := {
  (* … *)
  deref_fun: (* dereference a function pointer *)
    expr → t → list (ident * fundef);

  push_frame: (* push a new stack frame *)
    (* function to call *) ident → function →
    (* parameter values *) list expr →
    t → t+⊥;

  pop_frame: (* pop a stack frame *)
    (* return value *) option expr →
    (* temporary to write to *) option ident →
    t → t+⊥
}.

Definition concrete_state : Type :=
  list (ident * (temp_env * env)) * mem.

Figure 6.3: Operators of the memory domain for inter-procedural analysis

Extending the iterator  The memory domain signature (presented in Figure 5.1) is completed (see Figure 6.3) with two operators that model function calls and returns, and an operator (deref_fun) that, given an expression and an abstract state, returns a super-set of all functions that may be designated by that expression in this state. The concrete state against which the memory domain is specified is now a pair made of a call-stack and a concrete memory. The call-stack holds the names of the called functions and, for each called function, an allocation map of the local variables (mapping variable names to memory blocks) and an environment (mapping each temporary variable to its value). To analyze a function call such as x := f(args), where f is an expression, the iterator first resolves the function by calling deref_fun. Then the abstract state at function entry is prepared by calling push_frame. Finally the function body is (recursively) analyzed from this state. There, the analysis of return statements computes the abstract state after the call and the assignment to x, thanks to the pop_frame function. As an illustration, consider the following program. The external any_int64 function is interpreted as returning a non-deterministic value.


#include <assert.h>

extern long long any_int64(void);

int z = 0;

int f(int n) { return 1 / n; }

int main(void) {
  while ( any_int64() ) {
    z += f(-1);
    z += f(+1);
  }
  assert ( z == 0 );
  return 0;
}

The body of the function f will be analyzed twice each time the loop body is analyzed: once with the local variable n bound to (an abstraction of) -1, once bound to +1. The two possible abstract states at the beginning of f are never merged: this simple approach to inter-procedural analysis is also the most precise. Merging the numerical abstractions of variable n using a convex domain (e.g., intervals or polyhedra) could indeed lead to a false alarm about a division by zero in f.
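The precision argument can be replayed on a small interval check (our own illustration, not Verasco code): the divisor check succeeds for each calling context analyzed separately, but fails on their convex join.

```c
#include <assert.h>

/* A division is safe iff the divisor interval cannot contain 0. */
static int div_safe(int lo, int hi) { return !(lo <= 0 && 0 <= hi); }

/* Convex join of two intervals, as a context-merging analysis would
   compute at the entry of f. */
static void join(int lo1, int hi1, int lo2, int hi2, int *lo, int *hi) {
  *lo = lo1 < lo2 ? lo1 : lo2;
  *hi = hi1 > hi2 ? hi1 : hi2;
}
```

Analyzing f once per call site sees n in [-1; -1] and then in [+1; +1], both safe; a merged context [-1; +1] contains 0 and would trigger a false alarm.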

Extending the memory domain  The main issue is how to handle stack-allocated local variables. In C♯minor, each stack variable is allocated in its own block, and the particular name of this block cannot (in general) be predicted at analysis time. Since we only consider programs without recursive calls, stack variables can be unambiguously represented by a pair made of a function name and the variable name. To implement the interface for inter-procedural analysis in the memory domain, the abstract domain is extended with a stack that keeps track of the names of the called functions. In addition, to be able to distinguish local from global variables in expressions, the set of variables local to each function is remembered in this abstract stack. Function resolution uses points-to information to compute a set of functions that expression f may point to: forward evaluation of expression f in the points-to domain yields a set of blocks, which can be mapped to a set of functions by looking at the program. The push_frame operation of the state abstract domain performs the assignments corresponding to argument passing: arguments are evaluated in the context of the caller and then assigned to local variables of the callee. Symmetrically, the pop_frame operation is used when analyzing return statements. The returned expression e is evaluated in the callee context and assigned to a temporary in the caller context. Then, pop_frame simulates the freeing of local variables on function exit: this consists in invalidating the information associated with them, as well as invalidating pointers that may point to them. This is a costly operation: the whole points-to state has to be scanned for pointers to the top stack-frame. The following program illustrates the need for this freeing.

int *
f(int b, int *pz)
{
  int i = 0;
  if (b)
    return &i + *pz;
  return &i;
}

int
main(void)
{
  int *p = f(0, 0);
  int *q = f(1, p);
  return 0;
}

This program calls a function f twice. The first time, this function returns a pointer to its local variable i. The second time, it gets this pointer back in its argument pz and dereferences it, which is an error, even though it is a pointer to a local variable of this very function! Therefore, pointers to stack-allocated variables have to be invalidated on function return, to prevent any further use, even in contexts that may look safe.

Definition block_mon (A: Type) : Type := (A * list string).

Inductive block_t (B: Type) : Type :=
| Blocked
| NotBlocked (b: B).

Instance block_mon_gamma A B (G: gamma_op A B) :
    gamma_op (block_mon A) (block_t B) :=
  λ t: block_mon A,
  let (a, alarms) := t in
  match alarms with
  | nil => λ b, match b with
                | NotBlocked b' => b' ∈ γ(a)
                | Blocked => False
                end
  | _ => λ _, True
  end.

Figure 6.4: The non-block monad
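The invalidation performed on function exit can be pictured as a scan of the points-to state (a simplified sketch with invented names; the real domain is much richer):

```c
#include <assert.h>
#include <string.h>

/* Each tracked pointer variable may point to a global block or to a
   stack slot (owning function, variable name). */
enum kind { GLOBAL, STACK };
struct fact {
  const char *var;   /* pointer variable */
  enum kind k;
  const char *fn;    /* for STACK: owning function */
  int known;         /* 0 = invalidated (may dangle) */
};

/* Popping fn's frame: any fact mentioning one of fn's stack slots is
   dropped; everything else survives. */
static void pop_invalidate(const char *fn, struct fact *pts, int n) {
  for (int i = 0; i < n; i++)
    if (pts[i].known && pts[i].k == STACK && strcmp(pts[i].fn, fn) == 0)
      pts[i].known = 0;
}
```

In the example above, the fact "p points to f's variable i" is erased when f returns, so the later dereference of pz is flagged.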

6.3 Progress Verification

At the same time as we transform abstract states, we perform verifications to prove that every C♯minor statement or expression evaluates safely (without blocking) in its evaluation context. These checks are scattered all over the analyzer: in the numerical domains (e.g., no division by zero), in the memory domain (e.g., load and store are performed within bounds and with the correct alignment) and in the iterator (e.g., a function must be called with the right number of arguments). So the interfaces of the analyzer are recast in terms of a so-called non-block monad that enables the proof of progress.

6.3.1 The Non-Block Monad

Every operator of the memory domain now wraps its result in a logging monad: alongside the result, a list of alarms (unspecified strings) is returned. The block_mon type constructor is defined in Figure 6.4. For instance, the assign operator is now given the following type, in which the result type is wrapped in block_mon:

assign: ident → expr → t → block_mon (t+⊥).

The soundness of the result is only guaranteed when there are no alarms: a sound implementation is allowed to return any result provided there is at least one alarm. This enables the analyzer to make unsound assumptions once an alarm has been raised. For instance, when the target of a store cannot be precisely determined, an alarm is raised (unsafe store) but the analysis continues (e.g., ignoring the possible effects of said store).
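The alarm-logging idea can be sketched in C (our names loosely mirror the Coq ones, but the helpers and the fixed-size alarm array are invented for illustration):

```c
#include <assert.h>
#include <string.h>

/* A result paired with the alarms raised while computing it; soundness
   is only claimed when nalarms is zero. */
struct result {
  int value;
  int nalarms;
  const char *alarms[8];
};

static struct result ret(int v) {
  struct result r = { v, 0, {0} };
  return r;
}

/* Raise an alarm but keep going with a (possibly unsound) result. */
static struct result alarm(const char *msg, int v) {
  struct result r = { v, 1, {0} };
  r.alarms[0] = msg;
  return r;
}

/* Sequencing: run the next step and concatenate its alarms. */
static struct result bind(struct result r, struct result (*f)(int)) {
  struct result s = f(r.value);
  for (int i = 0; i < s.nalarms && r.nalarms < 8; i++)
    r.alarms[r.nalarms++] = s.alarms[i];
  r.value = s.value;
  return r;
}
```

An unsafe store would log "unsafe store" and continue; the final result is trusted only if the accumulated alarm list is empty.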


The specification of this block_mon data-type, i.e., its concretization, is given in Figure 6.4. It combines two aspects of this data-type. First, the list of alarms needs to be empty for the value to actually represent a non-trivial concrete set. Second, if the list of alarms is empty, then evaluation is not stuck. This last fact is not directly stated in the definition of the concretization relation, since there we do not know what computation it is about. However, concrete values are wrapped in the NotBlocked constructor, so that the Blocked value can appear in the concretization only when the list of alarms is not empty. We use this property in the specifications of the abstract transformers. Consider for instance the case of (assign x q ab), which models the assignment of the result of evaluating expression q to variable x in any concrete state represented by ab. Recall the definition, given in Section 3.4 on page 40, of the strongest post-condition of an assignment. The following definition refines it to additionally express that the expression is not stuck when there are no alarms. This specification relies on the fact that eval_expr is deterministic: there cannot be two (different) values v and v' in the set eval_expr ge e t m q.

Definition Assign (x: ident) (q: expr) (E: ℘ concrete_state)
    : ℘ (block_t concrete_state) :=
  λ cs: block_t concrete_state,
  ∃ f t e s m,
    ((f, (t, e)) :: s, m) ∈ E ∧
    ∀ v, v ∈ eval_expr ge e t m q →
      cs = NotBlocked ((f, (PTree.set x v t, e)) :: s, m).

Definition assign_sound :=
  ∀ x q ab, Assign x q (γ ab) ⊆ γ (assign x q ab).

The idea is that the set (Assign x q (γ ab)) contains Blocked only if there is a concrete state in (γ ab) from which the evaluation of expression q is stuck. Indeed, in such a case, the hypothesis (v ∈ eval_expr … q) is a contradiction from which (Blocked = NotBlocked (…)) can be proved. If the list of alarms returned by assign x q ab is empty, then Blocked is not in its concretization, and the assign_sound property states that the evaluation of expression q cannot be stuck.

6.3.2 Progress Verification in the Memory Domain

Proving that the execution cannot be stuck, as opposed to taking that fact as a hypothesis, requires proving various additional properties of the concrete execution states. Some of these properties hold for all executions of all programs: they follow from the semantics of the programming language. However, they are not readily available: states do not come with any side well-formedness condition. We thus prove them for each analyzed program, i.e., check them at analysis time. To this purpose, we maintain as an invariant of the abstract memory domain that all concrete states (in the concretization of the current abstract memory) satisfy said properties (an excerpt of this invariant is shown in Figure 6.5).

Record invariant (cs: concrete_state) : Prop := {
  globalValid: (* Globals are allocated *)
    ∀ x b, Genv.find_symbol ge x = Some b →
    b ∈ Mem.valid_block (snd cs);
  localNotGlobal: (* Globals and locals are in different blocks *)
    ∀ f t e, In (f, (t, e)) (fst cs) →
    ∀ x b y b' z,
      Genv.find_symbol ge x = Some b →
      e ! y = Some (b', z) →
      b ≠ b';
  noRec: (* All functions are different (i.e., no recursion) *)
    list_norepet (map fst (fst cs))
  (* … *)
}.

Figure 6.5: Invariant of the reachable concrete states

An example of such a property is about the allocation of global variables (globalValid): each global variable x of the analyzed program (i.e., bound in the global environment to some memory block b) is in a valid block. This is a useful invariant since the properties of valid blocks (permissions, contents…) are preserved when allocating new blocks, i.e., when calling functions, whereas nothing is guaranteed about invalid blocks: this justifies that abstract properties known about global variables are not invalidated by function calls. Another important property stated in the invariant is that global and local variables never alias (localNotGlobal): no global variable can be allocated in the same block as a (stack) variable. This is needed to justify the fact that updating some information about one cell preserves what is known about other cells.

Using this kind of invariant in the concretization relation enables proving properties that do not hold for all programs but only for some. For instance, we state in the invariant that no function appears twice in the call-stack (noRec); this is only true because the analyzer aborts when it encounters recursive calls.

One of the main roles of the memory domain w.r.t. progress verification is to ensure that pointers are properly used. This is enforced in the concretization relation of the types domain: whenever a cell is given a non-trivial (abstract) type, accessing this cell cannot result in undefined behavior. This is in particular established for stores: for every cell that may be assigned during a store, we check that it is in bounds, i.e., that its offset lies within the bounds that are known for its block. This requires an additional component in the memory domain: a permission domain (shown in Figure 6.6). For each (abstract) block, we keep track of its size and the permission associated to the cells in this block. The specification of this domain states that an abstract permission map ab represents all memories m such that, for each (abstract) block b bound to size sz and permission p, any offset i between 0 and sz in the concrete block b' represented by b has permission p.

Definition permissions : Type := Map [ ablock, (Z * permission) ].

Definition permissions_gamma stk : gamma_op permissions fmem :=
  λ ab m,
  ∀ b sz p b',
    get_perm ab b = Just (sz, p) →
    b' ∈ ablock_gamma stk b →
    ∀ i, 0 <= i < sz →
    m.(perm) b' i = Some p.

Figure 6.6: Permission domain
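The in-bounds check enabled by the size information of the permission domain boils down to interval reasoning on offsets (a simplified sketch; the function name and byte-level formulation are ours):

```c
#include <assert.h>

/* A store of chunk_size bytes, at any offset in [off_lo, off_hi], into a
   block of block_size bytes is in bounds iff every possible access,
   including its last byte, stays inside the block. */
static int store_in_bounds(long block_size, long chunk_size,
                           long off_lo, long off_hi) {
  return 0 <= off_lo && off_hi + chunk_size <= block_size;
}
```

For instance, a 4-byte store into a 40-byte block is allowed for offsets up to 36, while offset 40 (one past the end) is rejected.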

6.3.3 Progress Verification in Numerical Domains

Wrapping all numerical domains in the block_mon monad would require rewriting them all. We chose a more lightweight modification of the interface of these numerical domains: a single function is added to the interface:

nonblock: nexpr var → t → bool

This function is used by the memory domain at the end of numerical conversion, after calling nconvert (see § 5.3.3), to ensure that every numerical expression considered during the analysis is not stuck.

6.3.4 Summary of progress checks

All progress checks that are performed during the analysis are summarized in Figure 6.7. They are grouped in four categories. Numerical checks correspond to the ones done in the nonblock function of the numerical domains; they amount to verifying, for each operator, that the types and ranges of its arguments are correct. Memory checks are performed in the memory domain: they relate to permissions, variable initialization, and pointer validity. Pointer subtraction is only allowed between pointers within the same block; pointer equality (== and !=) is only defined for valid pointers; pointer inequality (<, <=, > and >=) is only defined for weakly valid pointers within the same block. Checking that a pointer (expression) is valid involves the points-to domain, which predicts a set of targeted blocks; the permission domain, which knows the sizes and permissions associated with these blocks; and the numerical domain, which is able to prove that the offset is between zero and the smallest size of these blocks. Well-formedness checks are mostly syntactic and not really interesting. Programs failing to pass them may nonetheless reveal bugs in their generation process: for instance, a development version of CompCert would give the same name to two built-in functions. The last category, named "analyzer limitations", corresponds to checks that are not required by the semantics. Some features of the language are not handled by the analyzer (recursion, external function calls, inline assembly, etc.) and failing on programs that do use them is mandatory for the analyzer to be sound. Finally, since the memory domain enumerates all possible targets of pointers when they are dereferenced, the analysis raises an alarm, for efficiency reasons, if there are too many such targets to explore when performing an abstract store.

6.4 More Numerical Domains

The Verasco analyzer is by design parameterized by the underlying numerical domain. In addition, the interface of this domain is general enough that any domain from the literature can implement it. A major issue is that numerical domains usually represent unbounded integers rather than machine integers. Therefore, Verasco provides a generic functor that makes any relational numerical domain cognizant of the wrapping behavior of integers on overflow. This functor has been applied to non-relational domains such as intervals and congruences, and also to the domain of polyhedra. Several domains can also be combined so as to communicate.

Numerical
• Expressions are well-typed
• Divisions and modulo: the divisor is not null and, in the signed case, we do not divide min_signed by -1
• Shifts: the shift amount, as an unsigned integer, is smaller than the bit-width

Memory
• Alignment of memory accesses
• No dereference of a non-pointer value
• No read from an uninitialized register or memory location
• Targets of stores have sufficient permissions (in-bound access in a writable block)
• Function pointers have a null offset and target a function
• The main function returns an integer value
• Pointer comparisons and subtractions

Program well-formedness
• There is a non-extern main function
• The signature at a call site agrees with the declaration of the called function
• Good number of arguments
• No two local variables (including function parameters) have the same name
• Goto targets are existing labels
• All used symbols (e.g., variables in expressions) are declared
• Chunks ManyXX are not used for memory access
• No two functions have the same name

Analyzer limitations
• No recursive calls
• No unknown external functions
• No annotations, built-ins, inline assembly fragments, etc.
• Pointers used in store instructions should be known to have fewer targets than the value of the max_concretize run-time parameter

Figure 6.7: Summary of progress checks in the Verasco analyzer

6.4.1 Handling Machine Integers

Numerical domains such as intervals and polyhedra are well understood as abstractions of unbounded mathematical integers. Subtleties of machine-level integer arithmetic complicate static analysis. For specific abstract domains such as intervals and congruences, ad hoc approaches are known, such as strided intervals [RBL06], wrapped intervals [Nav+12], or a reduced product of two intervals of Z, tracking the signed and unsigned interpretations respectively, as we have seen in a previous chapter (§ 3.3.3). These approaches are difficult to extend to other domains, especially relational domains (notice however the polyhedral domain with wrapping of Simon and King [SK07]). In Verasco, we use a more generic construction that transforms any relational domain over mathematical integers (Z) into a relational domain over N-bit machine integers with modulo-2^N arithmetic, provided this domain satisfies the interface of "ideal" relational domains (similar to the interface given in Figure 3.9, using ideal expressions rather than CFG expressions; ideal expressions operate on ideal numbers rather than machine integers). Consider such a domain, with abstract states A, that represents, through its concretization relation γ, ideal environments ρ: var → Z. We can build an abstract domain, as we did in the memory domain (§ 5.4.3), with the same data-type A. The concretization is weakened by point-wise lifting of the following compatibility relation:

Definition compat (z: Z) (i: int) : Prop := (i = Int.repr z).

The abstract transformers, which are defined over ideal expressions in the original domain, are composed with a conversion function that translates numerical expressions (of type nexpr, see § 3.3.1) into ideal expressions.
The conversion of numerical expressions is non-trivial only in the case of operators that are not compatible with the compat relation, i.e., the ones that come in two flavors depending on the signedness interpretation of their operands (division, comparison…). These operators behave consistently with the compat relation only when their arguments are in a specific range: [0; Int.max_unsigned] for unsigned operators, and [Int.min_signed; Int.max_signed] for signed operators. The conversion tries to shift the arguments of these operators by adding to them a constant multiple of 2^32; if this fails, an interval covering the whole range is returned. The correctness of the conversion of expressions is expressed by a theorem similar to nconvert_correct, stated in Section 5.4.4. In addition, the conversion performs the progress checks discussed in Section 6.3. Indeed, the nonblock operator is specific to our analyzer and usually not provided by standard numerical domains.
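The range-shifting step can be sketched as follows (our simplified model, not Verasco's conversion function: intervals of mathematical integers in 64-bit host arithmetic, shifted by a multiple of 2^32 into the signed range when possible):

```c
#include <assert.h>

static const long long TWO32      = 4294967296LL;   /* 2^32 */
static const long long MIN_SIGNED = -2147483648LL;
static const long long MAX_SIGNED = 2147483647LL;

/* Floor division (C's / truncates toward zero). */
static long long fdiv(long long a, long long b) {
  return (a >= 0) ? a / b : -((-a + b - 1) / b);
}

/* Try to translate [lo, hi] by k * 2^32 into [MIN_SIGNED, MAX_SIGNED];
   return 0 when no such k exists (the conversion then gives up and
   falls back to an interval covering the whole range). */
static int shift_into_signed(long long lo, long long hi,
                             long long *lo2, long long *hi2) {
  long long k = fdiv(lo - MIN_SIGNED, TWO32);   /* largest k keeping lo >= MIN_SIGNED */
  if (hi - k * TWO32 > MAX_SIGNED) return 0;
  *lo2 = lo - k * TWO32;
  *hi2 = hi - k * TWO32;
  return 1;
}
```

For instance, the unsigned constant 4294967295 shifts to the signed value -1, whereas the full unsigned range [0; 2^32 − 1] cannot be shifted and the conversion gives up.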

6.4.2 Polyhedral Domain

Thanks to this generic adaptation of (relational) numerical domains, various existing domains can be used. The most notable example is the domain from the Verasco Polyhedra Library [FB14]. This implementation uses the methodology of a posteriori validation: it is programmed in OCaml and each operator returns certificates along with its results. Said certificates enable a checker, written in Coq, to verify that the result is valid. Only the checker is proved correct, which is dramatically simpler than verifying the full optimized implementation of the domain. Experiments show that the run-time cost of the additional validation is acceptably low.

6.4.3 Communication Between Domains

Numerical domains are complementary: they are able to infer different kinds of properties that may all be needed to achieve good precision. Moreover, domains may be able to use the properties inferred by other domains to discover more precise invariants. The following example program illustrates that the interval domain needs the information discovered by the congruence domain to prove the safety of the analyzed program.

1 int t[N];
2
3 int main(void)
4 {
5   int * p = t;
6   while ( p < t + N )
7     ++p;
8   return 0;
9 }

In this example, pointer p visits in sequence all N cells of array t (where N is a compile-time constant). Since this is an array of 32-bit integers, the size of t is 4×N bytes and the offset of p successively takes the values 0, 4, …, 4×N. The critical point for the analysis is the comparison between p and the end of array t, at line 6. The value t + N is one past the address of the last byte of the array. This is known as a weakly valid pointer: comparison of such pointers is allowed, but dereferencing them is forbidden. In order to prove that this comparison is safe, the analyzer needs to establish that pointer p is always weakly valid, i.e., that its offset is in the range [0; 4×N]. An interval domain can infer that, at the beginning of the loop body, before p is incremented, the offset of this pointer is in the range [0; 4×N − 1]. Thus, after the increment, this offset is known to be in the range [4; 4×N + 3]. Therefore, when the guard at line 6 is evaluated, the interval domain knows that the offset of p is in the range [0; 4×N + 3]. This is not enough to prove that the comparison is safe, since offsets 4×N + 1, 4×N + 2 and 4×N + 3 are not weakly valid. Fortunately, the congruence domain is able to infer that the offset of p is a multiple of four. This fact alone is not enough either to prove that the pointer is weakly valid: both domains need to interact to discharge this proof obligation. The most precise approach for combining domains is to build their reduced product [CC79]. However, this approach has some known drawbacks: a new domain has to be designed, implemented and proved for each possible combination of domains; and the reduced product may not exist or may not be efficiently implementable. In Verasco, we rely on a more flexible scheme, inspired by Astrée and mostly contributed by J.-H. Jourdan [Jou+15]. Domains are composed in sequence, and communication between them is directed from the beginning of the sequence towards its end.
The interface of numerical domains comprises a finite set of channels through which a domain may read information coming from a domain that comes before it in the sequence, or send information to the domains that come after it. The composition operator is thus generic (i.e., it does not depend on the particular abstract domains it operates on) and plugs the output port of each domain into the input port of the next one. Notice that channels are lazy: a domain writes to a channel only when another one attempts to read from it. For instance, the congruence domain implements the get_congr channel, and the interval domain queries it when performing backward evaluation, so as to tighten the bounds of an interval, as in the example above: from the interval [0; 4×N + 3] and the congruence information "multiple of 4", the more precise interval [0; 4×N] can be deduced.
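The reduction used in this example is a simple rounding of the interval bounds to the congruence class (a sketch with invented names; the modulus m is assumed positive):

```c
#include <assert.h>

/* Smallest x >= lo with x ≡ r (mod m).  The double "% m" normalizes
   C's truncating remainder for negative operands. */
static long round_up(long lo, long m, long r) {
  return lo + (((r - lo) % m) + m) % m;
}

/* Largest x <= hi with x ≡ r (mod m). */
static long round_down(long hi, long m, long r) {
  return hi - (((hi - r) % m) + m) % m;
}
```

With N = 1000, tightening [0; 4003] by "multiple of 4" yields [0; 4000], exactly what is needed to prove weak validity of the pointer.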

6.5 On Using the Analyzer

We conducted preliminary experiments with the executable C♯minor static analyzer obtained after extraction. We ran the analyzer on a number of small test C programs (up to a few hundred lines). The purpose was to verify the absence of run-time errors in these programs, including violations of user-defined assertions (invariants that are expressible as C expressions).

6.5.1 Common Pitfalls

We now describe some issues encountered when running the Verasco analyzer on these programs.

Wrong main signature  The Verasco analyzer rejects programs with the common signature int main(int argc, char** argv); since, as stated in the CompCert manual [Ler15b], programs do not have access to command-line arguments. The analyzed program could be (automatically) wrapped into one with a suitable main function that calls the original one with null arguments, as is done in the reference interpreter. This might not be desirable, as highlighted by the following example.

int
main(int argc, char **argv)
{
  if ( argc < 2 )
    exit ( EXIT_FAILURE );
  // ...
}

If this program is analyzed only with null arguments, then most of the code (not shown but suggested by the ellipsis) will be considered dead and silently ignored.

External calls Some programs rely on library functions, whose code is not available or trusted. Unfortunately, since external calls may completely obliterate the memory, the Verasco analyzer fails on such calls: it raises an alarm and the current branch is cut (as if that function never returned). (Another possibility would be to assume that the external call behaves as a no-op, still raising an alarm, but continuing the analysis.)


So as to be able to analyze programs with such library calls, the analyzer provides three primitives: any_int64(), any_double(), and verasco_assume(b). The first two functions nondeterministically return a value of the requested type. The last one constrains the set of possible executions: the boolean expression b is necessarily true when the call verasco_assume(b) is reached. All other library functions need to be defined, as exemplified below, either with a (possibly naive) deterministic implementation, or with calls to the aforementioned primitives to specify the function. The resulting specification may be partial; its precision may be refined depending on the analyzed program.

double
fabs(double d)
{
  if (d <= 0.)
    return - d;
  return d;
}

double
cos(double d)
{
  double res = any_double();
  verasco_assume(-1. <= res && res <= 1.);
  return res;
}

These primitives can also be used to model user inputs, or more generally any interaction between the analyzed programs and their environments.

Large arrays  Some programs operate on large arrays to perform I/O or to hold static data. When reading from a large array, a summary of all cell contents that may be fetched is computed, by enumerating all these cells; the command-line parameter -max-concretize sets a limit on this enumeration. Therefore, analyzing programs with such arrays can be very costly, in both time and memory footprint. When precisely knowing the contents of these arrays is not required for the analysis, they can be abstracted as described in a later paragraph (about abstract data-types).

Array initialization Proving that an array is properly initialized is known to be a difficult task that requires dedicated analyses or domains [GRS05; NS13; HP08]. Indeed, consider the following example program that initializes an array a with zeros, in which N is a strictly positive constant.

int a[N];
for ( int i = 0; i < N; ++i )
{
  a[i] = 0;
}

When the analysis first reaches the loop header, nothing is known about the array contents and i is known to be zero. At the end of the loop body, the first cell is known to be initialized (with value zero) and i is known to be 1. The analysis thus considers the loop header again, with an over-approximation of both states, losing the information about the contents of the first cell. To overcome this difficulty, array initialization loops can be unrolled. The iterator recognizes calls to a function named verasco_unroll as a hint for unrolling, at analysis time, the enclosing loop.
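The precision loss at the loop header can be seen on a toy abstraction of array cells (our own illustration): a cell is either known to hold a value or unknown, and the join at the header keeps a fact only when both incoming states agree.

```c
#include <assert.h>

/* Abstract cell: known == 1 means "initialized with value v". */
struct cell { int known; int v; };

/* Join: the fact survives only if both branches agree on it. */
static struct cell join_cell(struct cell a, struct cell b) {
  struct cell r = { a.known && b.known && a.v == b.v, a.v };
  return r;
}
```

At the header, the entry state (nothing known about a[0]) is joined with the state after one iteration (a[0] = 0); the resulting cell is unknown again, which is why such loops are unrolled instead.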


Relational loop invariants  A fairly frequent pattern is to use a pointer as the loop index, preferring, of the two examples below, the second form to the first one.

for ( i = 0 ; i < len ; ++i)
  /* ... */ p[i] /* ... */

for ( ; 0 < len ; --len, ++p )
  /* ... */ *p /* ... */

In order to prove that the array access is valid, the required loop invariant is, in the first case, 0 <= i <= len, and, in the second case, 0 <= p ∧ 0 <= len ∧ p + len = len₀, where len₀ is the initial value of the variable len. This last invariant is relational, hence it can only be represented in a relational domain, such as polyhedra. Unfortunately, this domain is very expensive: applying weakly relational domains [Min04] or techniques like packing [Bla+03] could be helpful. These improvements are left as future work; for now the analyst is kindly required to rewrite the loops.

6.5.2 Sample Results

We now present various interesting sample programs that we have analyzed. They are distributed with the source code of Verasco [Web].

Function integration The example integr.c is a small program adapted from a CompCert benchmark. Most of its code is given below.

typedef double (*fun)(double);
fun functions[N] = { id, square, fabs, sqrt };

double
integr(fun f, double low, double high, int n) {
  double h, x, s; int i;
  h = (high - low) / n; s = 0;
  for (i = n, x = low; i > 0; i--, x += h)
    s += f(x);
  return s * h;
}

int main(void){
  for (int i = 0; i < any_int(); ++i) {
    double m = any_double();
    verasco_assume( 1. <= m );
    int n = any_int();
    verasco_assume (0 < n);
    integr(functions[i % N], 0., m, n);
  }
  return 0;
}

This program repeatedly computes an approximation of the integral of a positive function between zero and some number greater than one. The function in question is picked from a constant array. It stresses various aspects of the analyzer such as function pointers, arrays, floating point and machine arithmetic.


Numerical simulations  Two programs of a few hundred lines taken from the CompCert benchmark, nbody.c and almabench.c, feature heavy numerical (floating-point) computations and array manipulation.

Cryptographic routines The smult.c example performs scalar multiplication. It is taken from the cryptography library NaCl. Scalars and group elements are stored in arrays of bytes or unsigned integers. Many of these arrays are initialized within a loop: to be able to prove that they are indeed initialized, we annotated the program to request full unrolling of these loops during analysis.

Abstract data-type specification  The example spectral.c comes from the shootout benchmark, is only 81 lines long, and performs many numerical (floating-point) computations in arrays. It computes the so-called spectral norm of the infinite matrix A such that:

A_{i,j} = 1 / ((i + j) × (i + j + 1) / 2 + i + 1)

This program takes as input a size N; a typical value for this input is 5500. It then builds a unit vector of size N and multiplies it twenty times by matrix A and its transpose Aᵀ (both truncated to square matrices of size N). It uses three arrays of size N, which are dynamically allocated (since their sizes are a run-time parameter). This is an issue for the Verasco analyzer, which is currently not able to deal with such data structures (dynamically allocated arrays). However, we can treat this data structure as an abstract data type whose implementation is trusted, give it a specification, and analyze the program nonetheless. The specification need not be complete; in particular, no information is needed about the contents of the array (apart from holding well-defined floating-point values). The signature of the array abstract data-type follows.

    /** Arrays as an abstract data type. */
    struct Array;
    typedef struct Array array;

    void array_init(array*, int);
    double array_get(array*, int);
    void array_set(array*, int, double);
    void array_free(array*);

This abstract data-type is given two implementations (see Figure 6.8): one for program execution (on the left) and one for program analysis (on the right). The concrete implementation is a naked array that is directly accessed. The weak specification only checks that accesses are in bounds; it does not specify what happens to the array contents. Analyzing a specification rather than the actual implementation yields weaker guarantees on the actual program. However, it has various benefits.

• The program, seen as a client of the trusted array library, can be proved free of undefined behaviors, provided that the library specification characterizes all sources of undefined behaviors.

• Analysis time and memory consumption do not depend on the size of the arrays.


    /* Concrete array implementation. */
    struct Array {
      double *data;
    };
    double array_get(array *a, int i)
    {
      return a->data[i];
    }
    void
    array_set(array *a, int i, double d)
    {
      a->data[i] = d;
    }
    void array_init(array *a, int size)
    {
      a->data = calloc(size, sizeof(*a->data));
    }
    void array_free(array *a)
    { free(a->data); }

    /* Weak array specification. */
    struct Array {
      int size;
    };
    double array_get(array *a, int i)
    {
      int size = a->size;
      assert(0 <= i && i < size);
      return any_double();
    }
    void
    array_set(array *a, int i, double d)
    {
      int size = a->size;
      assert(0 <= i && i < size);
    }
    void array_init(array *a, int size)
    { a->size = size; }
    void array_free(array *a)
    {}

Figure 6.8: Two “implementations” of arrays

• Using relational domains (e.g., polyhedra), the size of the arrays can be left unknown.

Notice finally that this transformation cannot be performed automatically during the analysis because of the issue of array initialization. Moreover, such a transformation is not beneficial if the contents of the arrays need to be precisely tracked.

Preliminary results On the examples described above, Verasco was able to prove the absence of run-time errors. This is encouraging, since these examples exercise many delicate aspects of the C language: arrays, pointer arithmetic, function pointers, and floating-point arithmetic. The table below gathers some of the examples and gives the order of magnitude of the analysis time (on a laptop with a 3 GHz CPU).

    Program       Size (lines)  Time (s)  Comment
    integr.c           42       1·10⁻¹
    smult.c           330       4·10¹
    nbody.c           179       1·10¹
    almabench.c       352       8·10¹
    arc4.c            157       2·10²
    bubble.c           65       1·10¹    array of size 10, with polyhedra
    spectral.c        213       1        array of size 5500
    spectral.c        213       1·10²    array of unknown size, with polyhedra

The analysis times are quite high, in particular when the analyzed program manipulates large arrays. A pre-analysis program transformation (triggered by the -funload command-line argument) dramatically shortens the analysis time on some examples: it pulls loads out of expressions and prevents combinatorial explosion of the conversion function of the memory domain.

6.6 Conclusion

This chapter presents the Verasco static analyzer: it operates on an intermediate representation of the CompCert compiler (C♯minor) to automatically prove that the analyzed program is free of undefined behaviors, and its soundness proof is machine-checked. It is built following the modular architecture discussed in Chapter 3, and it features the precise memory domain detailed in Chapter 5 along with several numerical domains.

Simple experiments show that this analyzer is able to prove the safety of several non-trivial programs featuring most of the C language constructs. Still, this analyzer could be improved in various directions. In particular, when the analyzer fails to prove that a program is safe and raises an alarm, understanding the meaning and the origin of this alarm can be very difficult. The Astrée analyzer features a sophisticated framework for a better understanding of the analysis results [Riv05a; Riv05b]. Integrating similar facilities in the Verasco analyzer could tremendously improve its usability.

The execution times of the analyzer are in some cases higher than what could be expected for such an analyzer. The Astrée analyzer, which is a major source of inspiration for this work, is both precise and efficient. Our primary focus has been on precision, in particular by using suitable interfaces which enable the implementation of precise relational numerical domains.

The analyzer operates on the C♯minor intermediate language, thus some analysis results may not apply to intermediate representations that come earlier in the compilation chain. For instance, some C programs are unsafe but become safe when compiled to C♯minor, since some type annotations are erased in the process. The C♯minor analyzer could be complemented so as to perform additional safety checks to ensure that the original Clight program is actually safe.


Chapter 7 Conclusion

7.1 Summary

Static analyzers are a key tool to increase the reliability of software and our trust in it. The soundness of an analyzer is even more critical than the safety of the analyzed program, since a buggy analyzer could increase our confidence in a wrong program. In this work, we have shown how to build sound static analyzers with very strong guarantees: in order to trust an analysis result, there is no need to trust the implementation of the analyzer nor even its proof; only the definition of the semantics of the analyzed program, the soundness statement of the analyzer, and the Coq kernel need to be trusted. Throughout this document, we have described and applied a methodology to design and prove sound static analyzers in Coq, and shown central interfaces of core components. The three analyzers described in this work all share the same architecture and some modules, even though they operate on different languages.

This work focuses on low-level languages, i.e., languages which feature few abstractions (e.g., objects, functions, variables, types) or in which abstractions are broken on purpose by the programmers. These languages are pervasive as they are closer to the concrete machines that actually run the programs: every program eventually gets transformed into a low-level one so as to be executed; and some programs are directly written in such languages because they need this freedom of breaking the abstractions, for instance to implement operating systems, drivers, or the standard library of higher-level languages. In order to soundly analyze programs written in such languages, it is not possible to assume that the analyzed programs do not break the abstractions. At least, if the analysis makes such assumptions, it must verify, as part of the analysis, that the assumptions hold.

One of the most extreme examples of abstractions that can be broken in low-level languages is the notion of program text: an immutable document that the machine executes.
In self-modifying programs, this document is an ordinary piece of data that can be read and even modified and extended at execution time. In Chapter 4 we studied how to design and verify sound static analyses of such programs. The resulting analyzer is able to verify safety properties of several self-modifying programs.

Low-level languages provide a low-level view of the memory. Notably, fields of structures are referred to by offset (i.e., a number) rather than by name, making pointer arithmetic pervasive; a given piece of data can be accessed piecewise rather than at once; and a given bit pattern can be interpreted as data of different types. All these properties are soundly taken into account in the abstract memory domain presented in Chapter 5. This domain is plugged into a value analysis of the CFG intermediate representation of the CompCert compiler.

Finally, Chapter 6 reports on the Verasco static analyzer, a result of the combined effort


    void *
    f(int x, int* p)
    {
      if (x <= 0)
        return (void*)(-1);
      return p;
    }

    int
    main(void)
    {
      int i;
      int * p = f(0, &i);
      return p + 1 < p; /* Error! */
    }

Figure 7.1: Incorrect C program which is proved safe by Verasco

of the Verasco team. This analyzer has been built following the methodology and using the abstract domains described in this document; it operates on the C♯minor intermediate representation of CompCert and automatically proves the safety of the analyzed programs. It soundly handles all features of the C programming language but is extremely imprecise when the analyzed program resorts to recursive calls or dynamic memory allocation, which is not expected in safety-critical embedded software.

7.2 Perspectives

7.2.1 Improving the usability of the Verasco analyzer

The Verasco analyzer could be improved in various ways. We describe here some shallow improvements, mostly related to the usability of the analyzer.

Analysis closer to the source The analysis could take place closer to the source code, for instance at the Clight level. For programs in which all expressions are side-effect free (which can be syntactically checked), the compilation from CompCert C to Clight is the identity. So analyzing Clight would indeed achieve analysis of the source code. Rewriting the analyzer to operate at the Clight level would be tedious and difficult. A more efficient approach would be to complement the analysis at the C♯minor level with a simple analysis at the Clight level which would prove that the compilation from Clight to C♯minor does not remove errors. This amounts to proving that the evaluation of expressions in Clight cannot be stuck because of typing errors. The analysis at the C♯minor level provides enough information to type-check the program at the Clight level.

Consider the C program shown on Figure 7.1. There, the function f takes an argument x and checks that it satisfies some condition; if it does, then the execution proceeds normally and the function returns a valid pointer. Otherwise, if the condition does not hold, an error is returned. Such a protocol is found in several POSIX functions (e.g., mmap, shmat). So as to prove that a program calling a function similar to f is safe, the analyzer must prove that for every call, the precondition holds. Unfortunately, this example program is unsafe, as it uses the erroneous result of f as if it were a pointer. But this error is not caught by Verasco since it operates on the C♯minor intermediate representation: at this level, the addition does not verify the type of its arguments as at the Clight level. In order to verify that such a Clight program is well-typed, the type-checker needs to verify that all calls to f satisfy its precondition, which involves reasoning about arithmetic.


Such a type-checker can therefore benefit from the results of a prior analysis at the C♯minor level.

Understanding alarms When the analyzer fails, it is very difficult to understand the alarms. When the error is raised, the error message is very local: the only available information is the current abstract transformer being executed and the current abstract state. The location of the offending instruction is not available, and no context is given. One way to recover some context would be to log every step of the analysis. This does not work in practice as traces are huge and unreadable: too much data is as useless as no data, without proper tooling to mine it. X. Rival [Riv05a; Riv05b] proposes some inspiring work on this topic applied to Astrée.

Annotation language The analyzer can only be used to prove, in addition to safety, properties that are expressible as C expressions. However, as it stands, the analyzer infers more involved properties. To broaden the application range of the analyzer, one could design an assertion language richer than the C expression language, and ask the analyzer to check properties like: a given expression has a given type; a pointer is valid; two pointers are comparable (i.e., point to the same block); two pointers are incomparable (so as to check restrict annotations). Moreover, such a communication mechanism between the source code and the analyzer could include debugging primitives that would be a first step towards a better understanding of the raised alarms.

Use analysis results for compilation The analyzer could be better integrated into the CompCert compiler: the analysis results could be used by the compiler, e.g., for optimizations (replace constant expressions by their result, remove useless tests and dead code). One major issue is that an analysis result may no longer be sound after a program transformation (optimization or compilation pass). In particular, when a program transformation moves code around or changes the scopes of variables, it is unlikely that the analysis result holds as is. It would be nice to be able to transform the result along with the program rather than re-analyzing the transformed program. Kunz [Kun09] describes some techniques of certificate translation to transform the description of semantic properties across several compilation passes.

7.2.2 Weak cells and summarization

In the current state of the memory abstract domain (discussed mainly in Chapter 5), each abstract cell represents exactly one concrete cell (or none). All allocation functions (which map abstract cells to the concrete cells they may represent) agree on all (valid) cells. Relaxing this invariant and allowing a single abstract cell to represent several concrete cells at once has various uses but raises new challenges; in particular, no strong update can be performed on such summary cells: the properties inferred about the contents of these cells can only lose precision during the analysis, they never get more precise.

Array summarization A single abstract cell can be used to represent a summary of all elements in an array. This prevents the analyzer from computing a precise approximation for each element, but in some cases the analyzer cannot achieve great precision anyway (e.g., when the array holds input data, or when the contents of the array is produced by a program beyond the capacity of the analyzer, say a cryptographic routine). Using one cell for the whole array instead of one cell per element should improve the efficiency of the analyzer: the memory footprint is lower, since only one cell is bound in the abstract states rather than many; and the conversion of an expression involving an access to some cell produces only one expression rather than one expression per cell. Since only weak updates can be performed on weak cells, it is harder to prove that such cells are initialized. A possibility would be to summarize an array only after it has been initialized. As in the case of unrolling of initialization loops, the user could be asked to write annotations in the program to suggest which arrays should be summarized and when. For instance, in the case of the following initialization loop:

    for (int i = 0; i < 4096; ++i) { t[i] = 0; }

the current state of the memory model requires this loop to be fully unrolled and the array not summarized. It is unfortunate to have to go through four thousand iterations to prove that this loop indeed initializes the array, and relying on dedicated abstract domains could improve this situation.

Recursive calls Programs with recursive function calls are challenging, first, for the termination of the iterator, since a systematic (recursive) analysis of all recursive calls may not terminate. Therefore, different calling contexts must be merged so as to guarantee that each function is analyzed in finitely many such contexts. In particular, this may involve merging abstract states with different call stacks. Also, each concrete cell which represents a local variable cannot be exactly represented by one abstract cell: when analyzing a recursive function f with a local variable x, there cannot be as many abstract cells as live variables x (one per running or suspended function f). Indeed, so as to guarantee the termination of the analysis, at least one abstract cell must represent many variables x. Therefore, being able to reason about summary cells in the abstract memory domain is a prerequisite for analyzing programs with (statically unbounded) recursive calls. But as we have seen, it is far from sufficient.

Dynamic memory allocation As for variables that are local to recursive functions, a static analyzer may not be able to bound the number of dynamically allocated cells (for instance, if such allocation occurs inside loops). Therefore a single abstract cell should represent several dynamically allocated concrete cells. A possibility is to introduce one abstract block to represent the dynamically allocated heap and have malloc return the pointer to offset zero in this block. This implies that all dynamically allocated regions are merged. This is far from ideal, and even useless in practice, since there is then no way to prove that such memory is at some point initialized. Indeed, only weak updates can be performed on summarized cells; so if malloc returns a pointer to an uninitialized summarized cell, no information about the contents of such a cell can ever be computed. Balakrishnan and Reps [BR06] present an abstract domain targeting this issue. Having several abstract blocks for dynamically allocated memory could improve the precision. A common practice is to have one abstract block per static call to malloc (known as the allocation-site abstraction). Due to the structure of the iterator, it is unclear how to produce such identifiers. Notice that safe memory reclamation of summarized cells is out of scope. The analysis is unlikely to be able to prove safe any use of maybe-reclaimed memory¹.

So as to properly handle summary cells in relational numerical domains, particular care is required. Suppose that a guard like if (t[i] < 0) is analyzed, and that there is one summary cell s representing the contents of the array t. This could result in the query assume(s < 0) addressed to a numerical domain. If this domain is unaware of summary cells, it would infer that all elements of the array are negative in the true branch, and that all of them are non-negative in the else branch; which is not valid (unless the array has only one element). Gopan et al. [Gop+04] propose to extend the interface of relational numerical domains with additional primitives and implement support for summary cells with these primitives. A less intrusive and simpler approach would be to systematically replace, in queries directed to numerical domains, summary cells by a conservative interval over-approximating their contents.

7.2.3 Verified static analysis of concurrent programs

All programs and programming languages that have been considered in this work are sequential, or single-threaded. However, the current way to improve the performance of software is to execute several tasks at the same time: most processors are multi-core, featuring distinct processing units that operate simultaneously on the same memory. Programming such multiprocessors introduces new kinds of errors, including in particular data races: several concurrent accesses to the same memory location, one of them being a write. Static analyses can help in proving that concurrent programs are free of data races.

To build and verify a sound static analyzer for languages featuring concurrency, a precise formal semantics of these languages is first needed. The interleaving semantics composes per-thread semantics with a non-deterministic scheduler which runs one step in one thread at a time. Taking into account all possible interleavings during an analysis seems to be too costly. A. Miné proposes a scalable approach [Min11; Min13], featuring rely-guarantee reasoning [Jon83]: each thread is analyzed independently (no interleaving at all) under an assumption, named interference, about the changes that the shared memory may undergo as a result of the execution of the concurrent threads. These interferences are deduced from the results of the analysis of each thread. The circular dependency between the computation of the interferences and the per-thread analyses is solved as another fix-point problem.

Unfortunately, interleaving semantics do not accurately model multiprocessors or languages featuring concurrency: actual behaviors, referred to as weak memory models, are not captured by interleavings. For instance, x86 processors feature store buffers: when a thread writes a value to some location, this value hits the shared memory only after it exits the buffer; meanwhile, other threads may read past values at that location.
This introduces behaviors that cannot be explained by a simple interleaving of the reads and writes of the threads.

¹ Moreover, the semantics of free (the function that releases dynamically allocated memory) in CompCert is not yet satisfactory [Ler15a].


So as to precisely take into account these weak behaviors in an analysis, Meshman et al. [Mes+14] propose a program transformation that makes the store buffers explicit in the source code, and to analyze the transformed program under an interleaving semantics. This completely isolates the verification problems due to the weak memory model. However, since the analysis is unaware of the memory model, it cannot be designed so as to take the buffers into account precisely.

The set of behaviors introduced by weak memory models is, in some cases (including the memory model of x86 processors), characterized by a small family of reorderings. If the abstract semantics is preserved by these reorderings, then an analysis which only considers the interleavings is sound in the weaker memory model. The interference semantics of Miné is indeed preserved by a set of “reasonable” reorderings. For weaker semantics, such as those of POWER multiprocessors or of the Java programming language, it is unclear how to encode the memory model as a program transformation or as reorderings. Designing sound and precise analyses aware of such weak memory models would benefit from a better understanding of these models.

Bibliography

[AB07] Andrew W. Appel and Sandrine Blazy. “Separation Logic for Small-Step Cminor”. In: Proc. of TPHOLs. Vol. 4732. LNCS. Springer, 2007, pp. 5–21 (cit. on p. 98). [Akr+08] Periklis Akritidis, Cristian Cadar, Costin Raiciu, Manuel Costa, and Miguel Castro. “Preventing memory error exploits with WIT”. In: Symposium on Security and Privacy (SP). IEEE, 2008, pp. 263–277 (cit. on pp. viii, 1). [Ale96] Aleph One. “Smashing The Stack For Fun And Profit”. In: Phrack 7.49 (Nov. 1996) (cit. on pp. vii, 1). [App14] Andrew W. Appel. Program Logics for Certified Compilers. Cambridge University Press, 2014 (cit. on p. 98). [Bal+06] Thomas Ball, Ella Bounimova, Byron Cook, Vladimir Levin, Jakob Lichtenberg, Con McGarvey, Bohus Ondrusek, Sriram K. Rajamani, and Abdullah Ustuner. “Thorough static analysis of device drivers”. In: ACM SIGOPS Operating Systems Review. Vol. 40. ACM, 2006, pp. 73–85 (cit. on pp. viii, 2). [Bal07] Gogul Balakrishnan. “WYSINWYX: What You See Is Not What You eXecute”. PhD thesis. Madison: University of Wisconsin, 2007 (cit. on p. 64). [Ber09] Yves Bertot. “Structural abstract interpretation, A formal study in Coq”. In: Language Engineering and Rigorous Software Development, International LerNet ALFA Summer School 2008, revised tutorial lectures. Vol. 5520. LNCS. Springer, 2009, pp. 153–194 (cit. on p. 70). [Bes+10] Al Bessey, Ken Block, Ben Chelf, Andy Chou, Bryan Fulton, Seth Hallem, Charles Henri-Gros, Asya Kamsky, Scott McPeak, and Dawson Engler. “A few billion lines of code later: using static analysis to find bugs in the real world”. In: Communications of the ACM 53.2 (2010), pp. 66–75 (cit. on pp. viii, 2). [BGL06] Yves Bertot, Benjamin Grégoire, and Xavier Leroy. “A structured approach to proving compiler optimizations based on dataflow analysis”. In: TYPES. Vol. 3839. LNCS. Springer, 2006, pp. 66–81 (cit. on pp. xi, 4, 46, 70, 71). [BHV11] Sébastien Bardin, Philippe Herrmann, and Franck Védrine. “Refinement-Based CFG Reconstruction from Unstructured Programs”. In: Verification, Model Checking and Abstract Interpretation (VMCAI). Vol. 6538. LNCS. Springer, 2011, pp. 54–69 (cit. on p. 71). [BL09] Sandrine Blazy and Xavier Leroy. “Mechanized semantics for the Clight subset of the C language”. In: Journal of Automated Reasoning 43.3 (2009), pp. 263–288 (cit. on p. 52).


[Bla+03] B. Blanchet, P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Miné, D. Monniaux, and X. Rival. “A static analyzer for large safety-critical software”. In: Proc. of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’03). Vol. 38. San Diego, California, USA: ACM, June 2003, pp. 196–207 (cit. on pp. 38, 110). [Bla+13] Sandrine Blazy, Vincent Laporte, André Maroneze, and David Pichardie. “Formal Verification of a C Value Analysis Based on Abstract Interpretation”. In: Static Analysis Symposium (SAS). Vol. 7935. LNCS. Springer, 2013, pp. 324–344 (cit. on p. 26). [BLP14] Sandrine Blazy, Vincent Laporte, and David Pichardie. “Verified Abstract Interpretation Techniques for Disassembling Low-level Self-modifying Code”. In: Proc. of the 5th Conf. on Interactive Theorem Proving (ITP). LNCS. Springer, 2014 (cit. on p. 48). [BMR09] Guillaume Bonfante, Jean-Yves Marion, and D. Reynaud-Plantey. “A Computability Perspective on Self-Modifying Programs”. In: SEFM. 2009, pp. 231–239 (cit. on p. 71). [Bou93] F. Bourdoncle. “Efficient Chaotic Iteration Strategies With Widenings”. In: Proc. of FMPA 1993. Vol. 735. LNCS. Springer, 1993, pp. 128–141 (cit. on pp. 17, 31, 74). [BR06] Gogul Balakrishnan and Thomas Reps. “Recency-abstraction for heap-allocated storage”. In: Static analysis (SAS). Springer, 2006, pp. 221–239 (cit. on p. 118). [BR08] Gogul Balakrishnan and Thomas Reps. “Analyzing stripped device-driver executables”. In: Tools and Algorithms for the Construction and Analysis of Systems (2008), pp. 124–140 (cit. on pp. ix, 3). [BR10] Gogul Balakrishnan and Thomas Reps. “WYSINWYX: What You See Is Not What You eXecute”. In: ACM Transactions on Programming Languages and Systems (TOPLAS) 32.6 (2010), p. 23 (cit. on pp. 47, 58, 61, 71, 72). [C99] ISO. The ANSI C standard (C99). Tech. rep. WG14 N1124. ISO/IEC, 1999. url: http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1124.pdf (cit. on pp. 19, 22, 93).
[Cac+05] David Cachera, Thomas P. Jensen, David Pichardie, and Vlad Rusu. “Extracting a data flow analyser in constructive logic”. In: Theoretical Computer Science 342.1 (2005), pp. 56–78 (cit. on pp. xi, 4, 25, 46, 70, 71). [Cal+15] Cristiano Calcagno, Dino Distefano, Jérémy Dubreil, Dominik Gabi, Pieter Hooimeijer, Martino Luca, Peter W. O’Hearn, Irene Papakonstantinou, Jim Purbrick, and Dulma Rodriguez. “Moving Fast with Software Verification”. In: NASA Formal Methods - 7th International Symposium. 2015, pp. 3–11 (cit. on pp. viii, 2). [CC76] P. Cousot and R. Cousot. “Static determination of dynamic properties of programs”. In: Proceedings of the Second International Symposium on Programming. Dunod, Paris, France, 1976, pp. 106–130 (cit. on pp. 28, 73).

[CC77] Patrick Cousot and Radhia Cousot. “Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints”. In: Symp. on Principles of Programm. Lang. (POPL). Los Angeles, California: ACM, 1977, pp. 238–252 (cit. on pp. ix, 2, 14, 15, 50). [CC79] Patrick Cousot and Radhia Cousot. “Systematic design of program analysis frameworks”. In: POPL. ACM, 1979, pp. 269–282 (cit. on p. 107). [CCH06] Miguel Castro, Manuel Costa, and Tim Harris. “Securing Software by Enforcing Data-flow Integrity”. In: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation. OSDI’06. Berkeley, CA, USA: USENIX Association, 2006, pp. 11–11 (cit. on pp. vii, 1). [CD04] Solange Coupet-Grimal and William Delobel. “A Uniform and Certified Approach for Two Static Analyses”. In: TYPES. Vol. 3839. LNCS. 2004, pp. 115–137 (cit. on pp. xi, 4, 46, 70, 71). [CH78] P. Cousot and N. Halbwachs. “Automatic discovery of linear restraints among variables of a program”. In: 5th Symp. on Principles of Programm. Lang. (POPL). Tucson, Arizona: ACM, 1978, pp. 84–97 (cit. on pp. 34, 73). [Chl11] Adam Chlipala. “Mostly-automated verification of low-level programs in computational separation logic”. In: Conf. on Programming Language Design and Implementation (PLDI). ACM, 2011 (cit. on p. 71). [Coq15] The Coq development team. The Coq Proof Assistant. 1989–2015. url: https://coq.inria.fr/ (cit. on pp. xi, 4). [Cou+05] Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, David Monniaux, and Xavier Rival. “The Astrée analyzer”. In: Proceedings of the European Symposium on Programming (ESOP’05). Springer, 2005, pp. 21–30 (cit. on pp. viii, 2, 33, 45). [Cou+09] Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, and Xavier Rival. “Why does Astrée scale up?” In: Formal Methods in System Design 35.3 (Dec. 2009), pp. 229–264 (cit. on p. 97).
[Cow+98] Crispan Cowan, Calton Pu, Dave Maier, Jonathan Walpole, Peat Bakke, Steve Beattie, Aaron Grier, Perry Wagle, Qian Zhang, and Heather Hinton. “StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks.” In: Usenix Security. Vol. 98. 1998, pp. 63–78 (cit. on pp. vii, 1). [CP10] David Cachera and David Pichardie. “A Certified Denotational Abstract Interpreter”. In: Interactive Theorem Proving (ITP). Vol. 6172. LNCS. Springer, 2010, pp. 9–24 (cit. on pp. 25, 46, 70). [CR13] Bor-Yuh Evan Chang and Xavier Rival. “Modular Construction of Shape-Numeric Analyzers”. In: Semantics, Abstract Interpretation, and Reasoning about Programs: Essays Dedicated to David A. Schmidt on the Occasion of his Sixtieth Birthday, Manhattan, Kansas, USA, 19-20th September 2013. 2013, pp. 161–185 (cit. on p. 94). [CSV07] Hongxu Cai, Zhong Shao, and Alexander Vaynberg. “Certified Self-Modifying Code”. In: Conf. on Programming Language Design and Implementation (PLDI). ACM, 2007, pp. 66–77 (cit. on pp. 48, 64, 70, 71).


[CTL98] Christian Collberg, Clark Thomborson, and Douglas Low. “Manufacturing cheap, resilient, and stealthy opaque constructs”. In: 25th Symp. on Principles of Programm. Lang. (POPL). ACM, 1998, pp. 184–196 (cit. on p. 70). [Cuo+12] P. Cuoq, F. Kirchner, N. Kosmatov, V. Prevosto, J. Signoles, and B. Yakobowski. “Frama-C: A Software Analysis Perspective”. In: Proc. of SEFM 2012. Vol. 7504. LNCS. Springer, 2012, pp. 233–247 (cit. on pp. 44, 46). [CVE13] CVE-2014-0160. Available from MITRE, CVE-ID CVE-2014-0160. Dec. 2013. url: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2014-0160 (cit. on pp. viii, 2). [DCT08] Saumya K. Debray, Kevin P. Coogan, and Greg M. Townsend. “On the semantics of self-unpacking malware code”. Draft. 2008 (cit. on p. 71). [FB14] Alexis Fouilhé and Sylvain Boulmé. “A certifying frontend for (sub)polyhedral abstract domains”. In: VSTTE. Vol. 8471. LNCS. Springer, 2014, pp. 200–215 (cit. on pp. xii, 6, 34, 107). [Fle+10] Andrea Flexeder, Bogdan Mihaila, Michael Petter, and Helmut Seidl. “Interprocedural control flow reconstruction”. In: Programming Languages and Systems. Springer, 2010, pp. 188–203 (cit. on p. 72). [FMP13] Alexis Fouilhé, David Monniaux, and Michaël Périn. “Efficient Generation of Correctness Certificates for the Abstract Domain of Polyhedra”. In: Static analysis (SAS). 2013 (cit. on p. 34). [Gon07] Georges Gonthier. “The Four Colour Theorem: Engineering of a Formal Proof”. In: ASCM 2007. Vol. 5081. LNCS. Springer, 2007, p. 333 (cit. on pp. xi, 4). [Gon13] Georges Gonthier. “Engineering mathematics: the odd order theorem proof”. In: Proc. of POPL’13. ACM, 2013, pp. 1–2 (cit. on pp. xi, 4). [Gop+04] Denis Gopan, Frank DiMaio, Nurit Dor, Thomas Reps, and Mooly Sagiv. “Numeric domains with summarized dimensions”. In: Tools and Algorithms for the Construction and Analysis of Systems. Springer, 2004, pp. 512–529 (cit. on p. 119). [Gra89] Philippe Granger. “Static analysis of arithmetical congruences”. In: International Journal of Computer Mathematics 30.3-4 (1989), pp. 165–190 (cit. on p. 73). [Gra91] Philippe Granger. “Static analysis of linear congruence equalities among variables of a program”. In: TAPSOFT’91. Springer, 1991, pp. 169–192 (cit. on p. 73). [Gro+05] Dan Grossman, Michael Hicks, Trevor Jim, and Greg Morrisett. “Cyclone: A type-safe dialect of C”. In: C/C++ Users Journal 23.1 (2005), pp. 112–139 (cit. on pp. viii, 2). [GRS05] Denis Gopan, Thomas W. Reps, and Shmuel Sagiv. “A framework for numeric analysis of array operations”. In: Proc. of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2005. 2005, pp. 338–350 (cit. on p. 109).

[Hal+15] Thomas C. Hales et al. “A formal proof of the Kepler conjecture”. In: CoRR abs/1501.02155 (2015) (cit. on pp. xi, 4).
[HKS10] M. Hofmann, A. Karbyshev, and H. Seidl. “Verifying a local generic solver in Coq”. In: Proc. of SAS’10. Springer, 2010, pp. 340–355 (cit. on p. 46).
[HP08] Nicolas Halbwachs and Mathias Péron. “Discovering Properties About Arrays in Simple Programs”. In: PLDI. Tucson, AZ, USA: ACM, 2008, pp. 339–348 (cit. on p. 109).
[JBK13] Jonas Jensen, Nick Benton, and Andrew Kennedy. “High-Level Separation Logic for Low-Level Code”. In: Symp. on Principles of Programm. Lang. (POPL). ACM, 2013 (cit. on p. 71).
[JHM11] Richard E. Jones, Antony L. Hosking, and J. Eliot B. Moss. The Garbage Collection Handbook: The Art of Automatic Memory Management. CRC Press, 2011. isbn: 978-1420082791. url: http://gchandbook.org/ (cit. on pp. vii, 1).
[Jim+02] Trevor Jim, J. Gregory Morrisett, Dan Grossman, Michael W. Hicks, James Cheney, and Yanling Wang. “Cyclone: A Safe Dialect of C”. In: USENIX Annual Technical Conference, General Track. 2002, pp. 275–288 (cit. on pp. viii, 2).
[Jon83] Cliff B. Jones. “Tentative Steps Toward a Development Method for Interfering Programs”. In: ACM Trans. Program. Lang. Syst. 5.4 (1983), pp. 596–619 (cit. on p. 119).
[Jou+15] Jacques-Henri Jourdan, Vincent Laporte, Sandrine Blazy, Xavier Leroy, and David Pichardie. “A Formally-Verified C Static Analyzer”. In: Proc. of the 42nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2015. ACM, 2015, pp. 247–259 (cit. on pp. xii, 6, 38, 73, 96, 107).
[Kan+14] Jeehoon Kang, Sungkeun Cho, Joonwon Choi, Chung-Kil Hur, and Kwangkeun Yi. “Towards Scalable Translation Validation of Static Analyzers”. In: (2014). url: http://rosaec.snu.ac.kr/publish/2014/techmemo/ROSAEC-2014-003.pdf (cit. on pp. x, 3).
[Kar76] Michael Karr. “Affine relationships among variables of a program”. In: Acta Informatica 6.2 (1976), pp. 133–151 (cit. on p. 73).
[Ken+13] Andrew Kennedy, Nick Benton, Jonas Braband Jensen, and Pierre-Évariste Dagand. “Coq: The world’s best macro assembler?” In: Symp. on Principles and Practice of Declarative Programming (PPDP). ACM, 2013, pp. 13–24 (cit. on pp. 71, 72).
[Kil73] Gary A. Kildall. “A unified approach to global program optimization”. In: Proceedings of the 1st Annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages. 1973, pp. 194–206 (cit. on p. 31).
[Kin12] Johannes Kinder. “Towards static analysis of virtualization-obfuscated binaries”. In: Working Conference on Reverse Engineering. 2012, pp. 61–70 (cit. on pp. 67, 68, 71).


[Kle+10] Gerwin Klein, June Andronick, Kevin Elphinstone, Gernot Heiser, David Cock, Philip Derrin, Dhammika Elkaduwe, Kai Engelhardt, Rafal Kolanski, Michael Norrish, Thomas Sewell, Harvey Tuch, and Simon Winwood. “seL4: formal verification of an operating-system kernel”. In: Comm. of the ACM 53.6 (June 2010), pp. 107–115. issn: 0001-0782 (cit. on pp. xi, 4).
[KN06] Gerwin Klein and Tobias Nipkow. “A Machine-Checked Model for a Java-Like Language, Virtual Machine and Compiler”. In: ACM Trans. on Programm. Lang. and Syst. 28.4 (2006), pp. 619–695 (cit. on pp. xi, 4, 25, 46, 70, 71).
[Kun09] César Kunz. “Program Compilation and Proof Transformation”. PhD thesis. INRIA Sophia Antipolis-Méditerranée, Feb. 3, 2009. 167 pp. (cit. on p. 117).
[KV08] Johannes Kinder and Helmut Veith. “Jakstab: A Static Analysis Platform for Binaries”. In: Computer Aided Verification. LNCS. Springer, 2008, pp. 423–427 (cit. on p. 71).
[LB08] X. Leroy and S. Blazy. “Formal verification of a C-like memory model and its uses for verifying program transformations”. In: J. Automated Reasoning 41.1 (2008) (cit. on p. 25).
[LD03] Cullen Linn and Saumya K. Debray. “Obfuscation of executable code to improve resistance to static disassembly”. In: ACM Conference on Computer and Communications Security. 2003, pp. 290–299 (cit. on p. 47).
[Ler06] Xavier Leroy. “Formal certification of a compiler back-end or: Programming a compiler with a proof assistant”. In: POPL (2006), pp. 42–54 (cit. on p. 19).
[Ler09] Xavier Leroy. “A formally verified compiler back-end”. In: Journal of Automated Reasoning 43.4 (2009), pp. 363–446 (cit. on p. 19).
[Ler15a] Xavier Leroy. Personal communication. June 1, 2015 (cit. on p. 119).
[Ler15b] Xavier Leroy. The CompCert C verified compiler: documentation and user’s manual. June 11, 2015. url: http://compcert.inria.fr/man/ (cit. on p. 108).
[Mar14] André Oliveira Maroneze. “Verified Compilation and Worst-Case Execution Time Estimation”. PhD thesis. Université Rennes 1, June 17, 2014. 145 pp. (cit. on pp. xii, 5, 44).
[Mau04] Laurent Mauborgne. “Astrée: Verification of Absence of Runtime Error”. In: Building the Information Society. Springer, 2004, pp. 385–392 (cit. on pp. viii, 2).
[Mes+14] Yuri Meshman, Andrei Marian Dan, Martin T. Vechev, and Eran Yahav. “Synthesis of Memory Fences via Refinement Propagation”. In: Static Analysis (SAS). 2014, pp. 237–252 (cit. on p. 120).
[Min04] Antoine Miné. “Weakly relational numerical abstract domains”. PhD thesis. École Polytechnique, Dec. 2004. 322 pp. (cit. on pp. 34, 73, 110).
[Min06] Antoine Miné. “Field-sensitive value analysis of embedded C programs with union types and pointer arithmetics”. In: Proc. of LCTES’06. Ottawa, Ontario, Canada: ACM, June 2006, pp. 54–63 (cit. on pp. 45, 46, 73, 91, 94).
[Min11] Antoine Miné. “Static analysis of run-time errors in embedded critical parallel C programs”. In: Programming Languages and Systems. Springer, 2011, pp. 398–418 (cit. on p. 119).

[Min13] Antoine Miné. “Static analysis by abstract interpretation of concurrent programs”. HDR. ENS, May 28, 2013 (cit. on p. 119).
[Mor+12] Greg Morrisett, Gang Tan, Joseph Tassarotti, Jean-Baptiste Tristan, and Edward Gan. “RockSalt: better, faster, stronger SFI for the x86”. In: Conf. on Programming Language Design and Implementation (PLDI). 2012, pp. 395–404 (cit. on pp. 71, 72).
[Myr10] Magnus O. Myreen. “Verified just-in-time compiler on x86”. In: Symp. on Principles of Programm. Lang. (POPL). ACM, 2010, pp. 107–118 (cit. on p. 71).
[Nav+12] Jorge A. Navas, Peter Schachte, Harald Søndergaard, and Peter J. Stuckey. “Signedness-Agnostic Program Analysis: Precise Integer Bounds for Low-Level Code”. In: APLAS. Vol. 7705. LNCS. Springer, 2012, pp. 115–130 (cit. on pp. 44, 46, 106).
[Nec+05] George C. Necula, Jeremy Condit, Matthew Harren, Scott McPeak, and Westley Weimer. “CCured: type-safe retrofitting of legacy software”. In: ACM Trans. Program. Lang. Syst. 27.3 (2005), pp. 477–526. doi: 10.1145/1065887.1065892 (cit. on pp. viii, 1).
[Nec00] George Necula. “Translation validation for an optimizing compiler”. In: SIGPLAN Not. 35.5 (2000), pp. 83–94 (cit. on pp. x, 3).
[Nip12] Tobias Nipkow. “Abstract Interpretation of Annotated Commands”. In: Interactive Theorem Proving (ITP). Vol. 7406. LNCS. Springer, 2012, pp. 116–132 (cit. on pp. 25, 46, 70).
[NS13] Đurica Nikolić and Fausto Spoto. “Inferring complete initialization of arrays”. In: Theoretical Computer Science 484 (May 2013), pp. 16–40 (cit. on p. 109).
[Pic05] David Pichardie. “Interprétation abstraite en logique intuitionniste : extraction d’analyseurs Java certifiés” [Abstract interpretation in intuitionistic logic: extraction of certified Java analysers]. In French. PhD thesis. Université Rennes 1, 2005 (cit. on pp. xi, 4, 23).
[PSS98] A. Pnueli, M. Siegel, and E. Singerman. “Translation Validation”. In: TACAS ’98. Vol. 1384. LNCS. 1998, pp. 151–166 (cit. on pp. x, 3).
[RBL06] Thomas W. Reps, Gogul Balakrishnan, and Junghee Lim. “Intermediate representation recovery from low-level code”. In: PEPM. ACM, 2006, pp. 100–111 (cit. on pp. ix, 3, 106).
[Riv05a] Xavier Rival. “Abstract dependences for alarm diagnosis”. In: Programming Languages and Systems. Springer, 2005, pp. 347–363 (cit. on pp. 113, 117).
[Riv05b] Xavier Rival. “Understanding the origin of alarms in Astrée”. In: Static Analysis. Springer, 2005, pp. 303–319 (cit. on pp. 113, 117).
[RL12] Valentin Robert and Xavier Leroy. “A formally-verified alias analysis”. In: Certified Proofs and Programs (CPP). Vol. 7679. LNCS. Springer, 2012, pp. 11–26 (cit. on pp. 25, 46, 72).
[RM07] Xavier Rival and Laurent Mauborgne. “The trace partitioning abstract domain”. In: ACM Trans. Program. Lang. Syst. 29.5 (2007) (cit. on p. 71).
[Rust] The Rust Programming Language. url: https://www.rust-lang.org/ (cit. on pp. viii, 2).


[SK07] A. Simon and A. King. “Taming the Wrapping of Integer Arithmetic”. In: Proc. of SAS 2007. Vol. 4634. LNCS. Springer, 2007, pp. 121–136 (cit. on pp. 45, 106).
[SO08] Matthieu Sozeau and Nicolas Oury. “First-class type classes”. In: Theorem Proving in Higher Order Logics. Springer, 2008, pp. 278–293. url: http://link.springer.com/chapter/10.1007/978-3-540-71067-7_23 (cit. on p. 11).
[SW11] Bas Spitters and Eelis van der Weegen. “Type classes for mathematics in type theory”. In: Mathematical Structures in Computer Science 21.4 (2011), pp. 795–825 (cit. on p. 11).
[Szo05] Peter Szor. The Art of Computer Virus Research and Defense. Addison-Wesley Professional, 2005. isbn: 0321304543 (cit. on p. 47).
[VN11] V. Vafeiadis and F. Zappa Nardelli. “Verifying fence elimination optimisations”. In: Proc. of SAS’11. Springer, 2011, pp. 146–162 (cit. on pp. xi, 4, 25, 46).
[Web] Companion website. url: http://people.irisa.fr/Vincent.Laporte/phd.html. Subject to change (cit. on pp. xii, 5, 7, 64, 68, 69, 110).
[WWM07] Thomas Würthinger, Christian Wimmer, and Hanspeter Mössenböck. “Array bounds check elimination for the Java HotSpot™ client compiler”. In: PPPJ. ACM Press, 2007, p. 125 (cit. on pp. viii, 1).
