CIS 341: Compilers, Lecture 24

Announcements
• HW6: Analysis & Optimizations
  – Alias analysis, constant propagation, dead code elimination, register allocation
  – Due: TOMORROW at midnight
• Final Exam:
  – Date: Friday, May 4th
  – Time: noon – 2:00pm
  – Location: Moore 216
  – Coverage: cumulative, but emphasis on material since the midterm

REVISITING SSA
• Phi nodes
• Alloca "promotion"
• Register allocation

Dominator Trees
• Domination is transitive:
  – if A dominates B and B dominates C, then A dominates C
• Domination is anti-symmetric:
  – if A dominates B and B dominates A, then A = B
• Every flow graph has a dominator tree
  – the Hasse diagram of the dominates relation

(Figure: a flow graph on nodes 1–9, 0 shown side by side with its dominator tree; a sketch of the standard computation follows.)
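As a companion to the slide, here is a minimal sketch (mine, not the course's implementation) of the classic iterative dominator computation: dom(n) is n together with the intersection of dom(p) over all predecessors p of n. The block names and the dict-based CFG encoding are hypothetical.

  # Iterative fixpoint computation of dominator sets.
  def compute_dominators(blocks, preds, entry):
      """blocks: list of node names; preds: dict node -> list of predecessor names."""
      dom = {n: set(blocks) for n in blocks}   # start with "everything dominates everything"
      dom[entry] = {entry}                     # except the entry, which dominates only itself
      changed = True
      while changed:                           # iterate until no set shrinks
          changed = False
          for n in blocks:
              if n == entry:
                  continue
              new = set(blocks)
              for p in preds[n]:
                  new &= dom[p]                # intersect over all predecessors
              new |= {n}                       # a node always dominates itself
              if new != dom[n]:
                  dom[n] = new
                  changed = True
      return dom

  # Example: the diamond CFG from the phi-node example below.
  doms = compute_dominators(
      ["entry", "then", "else", "merge"],
      {"entry": [], "then": ["entry"], "else": ["entry"], "merge": ["then", "else"]},
      "entry")
  # doms["merge"] == {"entry", "merge"}: neither branch dominates the merge block.

The dominator tree is then read off by linking each node to its immediate dominator, the unique closest element of dom(n) \ {n}.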
Phi Functions
• Solution: φ functions
  – Fictitious operator, used only for analysis
    • implemented by Mov at the x86 level
  – Chooses among different versions of a variable based on the path by which control enters the phi node:

    %uid = phi <ty> v1, <label1>, … , vn, <labeln>

Source program:

  int y = …;
  int x = …;
  int z = …;
  if (p) {
    x = y + 1;
  } else {
    x = y * 2;
  }
  z = x + 3;

LLVM IR:

  entry:
    %y1 = …
    %x1 = …
    %z1 = …
    %p = icmp …
    br i1 %p, label %then, label %else
  then:
    %x2 = add i64 %y1, 1
    br label %merge
  else:
    %x3 = mul i64 %y1, 2
    br label %merge
  merge:
    %x4 = phi i64 %x2, %then, %x3, %else
    %z2 = add i64 %x4, 3

Alloca Promotion
• Not all source variables can be allocated to registers:
  – if the address of the variable is taken (as permitted in C, for example)
  – if the address of the variable "escapes" (by being passed to a function)
• An alloca instruction is called promotable if neither of the two conditions above holds:

  entry:
    %x = alloca i64                 // %x cannot be promoted
    %y = call malloc(i64 8)
    %ptr = bitcast i8* %y to i64**
    store i64* %x, i64** %ptr       // store the pointer into the heap

  entry:
    %x = alloca i64                 // %x cannot be promoted
    %y = call foo(i64* %x)          // foo may store the pointer into the heap

• Happily, most local variables declared in source programs are promotable
  – That means they can be register allocated (a promotability check is sketched below)
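The following is a minimal sketch, assuming a toy (opcode, operands) instruction encoding of my own invention, of the promotability test the slide describes: an alloca is promotable iff its result is used only as the address operand of loads and stores.

  # An alloca's address "escapes" if it appears anywhere other than the
  # pointer position of a load or store.
  def is_promotable(alloca_id, instrs):
      for opcode, operands in instrs:
          for i, op in enumerate(operands):
              if op != alloca_id:
                  continue
              if opcode == "load" and i == 0:
                  continue                  # reading FROM the slot: fine
              if opcode == "store" and i == 1:
                  continue                  # writing TO the slot: fine
              return False                  # any other use lets the address escape
      return True

  # %x escapes by being passed to foo (as in the second slide example):
  instrs = [("store", ["0", "%x"]),
            ("call", ["foo", "%x"]),
            ("load", ["%x"])]
  assert not is_promotable("%x", instrs)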
Phi Placement Alternative
• Less efficient, but easier to understand:
• Place phi nodes "maximally" (i.e., at every node with more than one predecessor)
• If all values flowing into a phi node are the same, then eliminate it:

    %x = phi t %y, %pred1, t %y, %pred2, … , t %y, %predK
    // code that uses %x
  ⇒
    // code with %x replaced by %y

• Interleave with other optimizations (a sketch of the pruning step appears at the end of this section):
  – copy propagation
  – constant propagation
  – etc.

Example SSA Optimizations
How do we place phi nodes without breaking SSA? The pass pipeline used here is: find promotable allocas; insert maximal φ-nodes; load-after-store/load-after-alloca elimination (LAS/LAA); dead store elimination (DSE); dead alloca elimination (DAE); φ-node elimination.
• Note: the "real" implementation combines many of these steps into one pass
  – it places φs directly at the dominance frontier
• This example also illustrates other common optimizations:
  – load after store/alloca elimination
  – dead store/alloca elimination

Starting program:

  l1: %p = alloca i64
      store 0, %p
      %b = %y > 0
      br %b, %l2, %l3
  l2: store 1, %p
      br %l3
  l3: %x = load %p
      ret %x

Step 1 – Insert loads at the end of each block, φ-nodes at the start of each block (one argument per predecessor), and stores after the φ-nodes:

  l1: %p = alloca i64
      store 0, %p
      %b = %y > 0
      %x1 = load %p
      br %b, %l2, %l3
  l2: %x3 = φ[%x1;%l1]
      store %x3, %p
      store 1, %p
      %x2 = load %p
      br %l3
  l3: %x4 = φ[%x1;%l1, %x2;%l2]
      store %x4, %p
      %x = load %p
      ret %x

Step 2 – For loads after stores (LAS): substitute all uses of the load by the value being stored, then remove the load. Here %x1 becomes 0, %x2 becomes 1, and %x becomes %x4:

  l1: %p = alloca i64
      store 0, %p
      %b = %y > 0
      br %b, %l2, %l3
  l2: %x3 = φ[0;%l1]
      store %x3, %p
      store 1, %p
      br %l3
  l3: %x4 = φ[0;%l1, 1;%l2]
      store %x4, %p
      ret %x4

Step 3 – Dead Store Elimination (DSE): eliminate all stores with no subsequent loads. Dead Alloca Elimination (DAE): eliminate all allocas with no subsequent loads/stores:

  l1: %b = %y > 0
      br %b, %l2, %l3
  l2: %x3 = φ[0;%l1]
      br %l3
  l3: %x4 = φ[0;%l1, 1;%l2]
      ret %x4

Step 4 – Eliminate φ-nodes (see Aycock & Horspool, 2002):
  – singletons (like %x3, whose one argument comes from its only predecessor)
  – φs with identical values from each predecessor

Done:

  l1: %b = %y > 0
      br %b, %l2, %l3
  l2: br %l3
  l3: %x4 = φ[0;%l1, 1;%l2]
      ret %x4

LLVM Phi Placement
• This transformation is also sometimes called register promotion
  – older versions of LLVM called this "mem2reg": memory-to-register promotion
• In practice, LLVM combines this transformation with scalar replacement of aggregates (SROA), i.e., structs and arrays are first broken apart into their scalar fields so that those fields can themselves be promoted to registers.
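To close the loop on the "maximal placement, then prune" alternative, here is a minimal sketch (not the course's implementation) of the pruning step: repeatedly remove any φ whose incoming values are all the same value (or the φ itself), substituting that value for the φ. The dict-of-argument-lists encoding is a hypothetical stand-in for a real IR.

  # Prune trivial phis to a fixpoint; removing one phi can make another trivial.
  def eliminate_trivial_phis(phis):
      """phis: dict phi_name -> list of incoming SSA values."""
      changed = True
      while changed:
          changed = False
          for name in list(phis):
              # trivial iff every argument is either the phi itself or one same value v
              args = {a for a in phis[name] if a != name}
              if len(args) == 1:
                  (v,) = args
                  del phis[name]
                  for other in phis:           # substitute v for the removed phi everywhere
                      phis[other] = [v if a == name else a for a in phis[other]]
                  changed = True
      return phis

  # From the worked example: %x3 = φ[0] is a singleton and disappears,
  # while %x4 = φ[0, 1] merges two distinct values and must stay.
  phis = {"%x3": ["0"], "%x4": ["0", "1"]}
  print(eliminate_trivial_phis(phis))          # {'%x4': ['0', '1']}

This matches the outcome of the walkthrough above: after pruning, only %x4 remains.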