Denotational Translation Validation
Total Page:16
File Type:pdf, Size:1020Kb
Denotational Translation Validation The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation Govereau, Paul. 2012. Denotational Translation Validation. Doctoral dissertation, Harvard University. Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:10121982 Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http:// nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#LAA c 2012 - Paul Govereau All rights reserved. Thesis advisor Author Greg Morrisett Paul Govereau Denotational Translation Validation Abstract In this dissertation we present a simple and scalable system for validating the correctness of low-level program transformations. Proving that program transfor- mations are correct is crucial to the development of security critical software tools. We achieve a simple and scalable design by compiling sequential low-level programs to synchronous data-flow programs. Theses data-flow programs are a denotation of the original programs, representing all of the relevant aspects of the program se- mantics. We then check that the two denotations are equivalent, which implies that the program transformation is semantics preserving. Our denotations are computed by means of symbolic analysis. In order to achieve our design, we have extended symbolic analysis to arbitrary control-flow graphs. To this end, we have designed an intermediate language called Synchronous Value Graphs (SVG), which is capable of representing our denotations for arbitrary control-flow graphs, we have built an algorithm for computing SVG from normal assembly language, and we have given a formal model of SVG which allows us to simplify and compare denotations. Finally, we report on our experiments with LLVM M.D., a prototype denotational translation validator for the LLVM optimization framework. iii Contents Title Page . .i Abstract . iii Table of Contents . iv List of Figures . vii List of Tables . ix 1 Introduction 1 1.1 Design . .6 1.2 A brief history of translation validation . 10 1.3 Outline of this dissertation . 13 2 Overview 16 2.1 Introduction . 17 2.2 Normalizing translation validation . 18 2.3 Validation by Example . 21 2.3.1 Basic Blocks . 21 2.3.2 Extended Basic Blocks . 29 2.3.3 Loops . 32 2.4 Normalization . 36 2.4.1 Efficiency . 41 2.4.2 Extended Example . 43 3 Assembly Language 48 3.1 Assembly Language Syntax . 49 3.1.1 Working with Assembly Language . 54 3.2 Target Language . 56 3.2.1 Target Language Syntax . 57 3.2.2 Translation . 60 3.3 Side Effects . 62 3.3.1 Modeling Memory . 65 3.4 Completing the Semantics . 66 3.4.1 Translation Validation . 68 iv Contents v 3.4.2 λ-Graphs . 71 4 Synchronous Value Graphs 77 4.0.3 SVG Syntax . 80 4.1 Categorical Semantics . 91 4.1.1 Motivation . 91 4.2 Generalizing Monads . 100 4.3 Arrows . 104 4.3.1 Relationship to Monads . 109 4.3.2 Choice . 112 4.3.3 Loops . 114 4.3.4 Denotational Model . 116 5 Mechanized Semantics and Rewrite Rules 122 5.1 SVG Category . 123 5.1.1 Co-inductive Proof Techniques . 127 5.1.2 Category Lemmas . 131 5.2 SVG Arrow . 132 5.2.1 Choice . 135 5.2.2 Loops . 137 6 Implementation 139 6.1 Introduction . 140 6.1.1 Gated SSA Form . 140 6.1.2 Computing Gated SSA . 145 6.2 Background . 146 6.2.1 Relation Between The Dominator Tree and The Dominance Frontier . 149 6.2.2 Gating Paths . 151 6.3 Gating as a single-source path-expression problem . 154 6.3.1 From Path Expressions to Gating Functions . 159 6.4 Path Expressions via Gaussian Elimination . 165 6.4.1 Path Sequences . 167 6.4.2 Front- and Back-solving . 170 6.4.3 Decomposition . 172 6.4.4 Examples . 174 6.4.5 Simple Optimizations . 179 6.5 Decomposition using dominators . 180 6.5.1 Gating Monad . 182 6.5.2 Path Compression . 185 6.5.3 Block Processing Order . 187 Contents vi 6.5.4 Main Algorithm . 189 6.6 Irreducible graphs . 193 6.6.1 The Derived Graph . 195 6.6.2 Modifications to Main Algorithm . 197 7 Experimental Evaluation 202 7.1 Experimental Setup . 205 7.1.1 Pipeline information . 207 7.2 Compiler User Experiments . 208 7.2.1 Pipeline Results . 208 7.2.2 Validated Optimization . 210 7.2.3 Validation Time . 215 7.3 Compiler Developer Experiments . 217 7.3.1 Testing Individual Optimizations . 218 7.3.2 Rewrite Rules . 221 7.3.3 Improving Results with Additional Rewrite Rules . 226 8 Conclusion 235 8.0.4 Discussion . 236 8.1 Future work . 238 A Data-flow Semantics 248 A.1 Operational Semantics . 248 A.2 Static Semantics . 248 List of Figures 2.1 LLVM M.D. from a bird's eye view . 20 2.2 Extraction of loop iterations . 32 2.3 Representation of while loops . 35 2.4 Normalization of Large Example. 46 2.5 Normalization of Large Example (continued). 47 3.1 Syntax of Assembly Language Functions . 49 3.2 Assembly Language Type System . 50 3.3 Assembly Lanugauge Values . 51 3.4 Assembly Language Instructions . 51 3.5 Assembly Language Operations . 52 3.6 Loop Example . 53 3.7 Syntax of the Extended Assembly Language . 58 3.8 Graph representation of isEven function . 72 4.1 Syntax of Synchronous Value Graphs . 80 5.1 Rewrite lemmas corresponding to composition. 130 5.2 Lemmas for first properties. 134 5.3 Lemmas for left properties. 136 5.4 Lemmas for left properties. 138 6.1 Gated Assembly Language Instructions . 141 6.2 Loop Example . 143 6.3 SSA and GSA for Loop Example . 144 6.4 A Control-flow Graph . 147 6.5 Dominator Tree for Control-flow Graph in Figure 6.4 . 148 6.6 Example: a conditional statement. 174 6.7 Control-flow graph for a simple loop . 177 6.8 Main Algorithm . 190 6.9 Irreducible graph and its dominator tree. 193 6.10 Graph derived from Control-flow Graph in Figure 6.9 . 196 vii List of Figures viii 7.1 Validation results for optimization pipeline . 209 7.2 A validated optimizer using LLVM M.D. and an off-the-shelf optimizer. 211 7.3 Comparison of validated and unvalidate AES benchmark. 214 7.4 Validation time normalized to compilation time. 216 7.5 Validator results for individual optimizations . 219 7.6 Validator results for individual optimizations (continued) . 220 7.7 Results for GVN optimization . 221 7.8 LICM . 224.