4. Selected Topics in Compiler Construction

Content of Lecture

Compilers and Language Processing Tools
Summer Term 2011
Prof. Dr. Arnd Poetzsch-Heffter
Software Technology Group, TU Kaiserslautern

1. Introduction
2. Syntax and Type Analysis
   2.1 Lexical Analysis
   2.2 Context-Free Syntax Analysis
   2.3 Context-Dependent Analysis
3. Translation to Target Language
   3.1 Translation of Imperative Language Constructs
   3.2 Translation of Object-Oriented Language Constructs
4. Selected Topics in Compiler Construction
   4.1 Intermediate Languages
   4.2 Optimization
   4.3 Register Allocation
   4.4 Just-in-time Compilation
   4.5 Further Aspects of Compilation
5. Garbage Collection
6. XML Processing (DOM, SAX, XSLT)

Chapter Outline

4. Selected Topics in Compiler Construction
   4.1 Intermediate Languages
       4.1.1 3-Address Code
       4.1.2 Other Intermediate Languages
   4.2 Optimization
       4.2.1 Classical Optimization Techniques
       4.2.2 Potential of Optimizations
       4.2.3 Data Flow Analysis
       4.2.4 Non-local Optimization
   4.3 Register Allocation
       4.3.1 Sethi-Ullman Algorithm
       4.3.2 Register Allocation by Graph Coloring
   4.4 Just-in-time Compilation
   4.5 Further Aspects of Compilation

Selected topics in compiler construction

Focus:
• techniques that go beyond the direct translation of source languages to target languages
• concentration on concepts instead of language-dependent details
• program representations tailored to the considered tasks (instead of source language syntax):
  - this simplifies the representation,
  - but needs more work to integrate the tasks

Learning objectives:
• intermediate languages for the translation and optimization of imperative languages
• different optimization techniques
• different static analysis techniques for (intermediate) programs
• register allocation
• some aspects of code generation

4.1 Intermediate Languages

Intermediate languages are used as
• an appropriate program representation for certain language implementation tasks, and
• a common representation for programs of different source languages:

    Source Language 1   Source Language 2   ...   Source Language n
                     \          |          /
                        Intermediate Language
                     /          |          \
    Target Language 1   Target Language 2   ...   Target Language m

• Intermediate languages for translation are comparable to data structures in algorithm design: for each task, a given intermediate language is more or less suitable.
• Intermediate languages can conceptually be seen as abstract machines.

4.1.1 3-Address Code

3-address code (3AC) is a common intermediate language with many variants.

Properties:
• only elementary data types (but often arrays)
• no nested expressions
• sequential execution; jumps and procedure calls as statements
• named variables as in a high-level language
• unbounded number of temporary variables
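As a small illustration of the "no nested expressions" property (this example is not from the slides): a nested C expression is flattened into a sequence of elementary 3AC commands over temporaries t1, t2:

    /* C source with a nested expression */
    x = (a + b) * (c - d);

    /* corresponding 3AC: one elementary operation per command */
    t1 := a + b
    t2 := c - d
    x  := t1 * t2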
A program in 3AC consists of
• a list of global variables,
• a list of procedures with parameters and local variables, and
• a main procedure.
Each procedure has a sequence of 3AC commands as its body.

3AC commands

    Syntax               Explanation
    x := y bop z         x: variable (global, local, parameter, temporary);
                         y, z: variable or constant; bop: binary operator
    x := uop z           uop: unary operator
    x := y               copy assignment
    goto L               jump to label L
    if x cop y goto L    conditional jump; cop: comparison operator;
                         only procedure-local jumps
    x := a[i]            a: one-dimensional array
    a[i] := y
    x := &a              &a: address of a; a: global or local variable,
                         or parameter
    x := *y              *: dereferencing operator
    *x := y

A call p(x1, ..., xn) is encoded as follows (the whole block is considered as one command):

    param x1
    ...
    param xn
    call p

return y causes a jump to the return address, with optional result y.

We assume that 3AC only contains labels for which jumps are used in the program.

Basic blocks

A sequence of 3AC commands can be uniquely partitioned into basic blocks. A basic block B is a maximal sequence of commands such that
• exactly one jump, procedure call, or return command occurs, at the end of B, and
• labels only occur at the first command of a basic block.

Remarks:
• The commands of a basic block are always executed sequentially; there are no jumps into the interior of a block.
• Often a designated exit block for a procedure, containing the return jump at its end, is required. This is handled by additional transformations.
• The transitions between basic blocks are often depicted as flow charts.
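The partitioning into basic blocks can be computed with the classic "leader" approach: mark every command that must start a block and cut before each of them. Below is a minimal C sketch (not from the lecture; the Cmd representation and its field names are hypothetical simplifications):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical, simplified 3AC command representation. */
    typedef struct {
        const char *text;  /* the command, e.g. "param 0" */
        bool has_label;    /* command carries a label that a jump targets */
        bool ends_block;   /* goto, if..goto, call, or return */
    } Cmd;

    /* Mark the "leaders", i.e. the first commands of basic blocks:
       - the first command of a procedure,
       - every labeled command,
       - every command following a jump, call, or return. */
    static void mark_leaders(const Cmd *cmds, size_t n, bool *leader) {
        for (size_t i = 0; i < n; i++)
            leader[i] = (i == 0) || cmds[i].has_label
                                 || cmds[i - 1].ends_block;
    }

    int main(void) {
        /* Tail of the procedure main from the example below: the call
           ends one block, so "return 0" starts a new basic block. */
        Cmd cmds[] = {
            { "param 0",     false, false },
            { "param 1",     false, false },
            { "param 2",     false, false },
            { "call skprod", false, true  },
            { "return 0",    false, true  },
        };
        size_t n = sizeof cmds / sizeof cmds[0];
        bool leader[sizeof cmds / sizeof cmds[0]];

        mark_leaders(cmds, n, leader);
        for (size_t i = 0; i < n; i++)
            printf("%s%s\n", leader[i] ? "block> " : "       ", cmds[i].text);
        return 0;
    }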
Example: 3AC and basic blocks

We consider the 3AC for a C program:

    int a[2]; int b[7];
    int skprod(int i1, int i2, int lng) { ... }
    int main() {
        a[0] = 1; a[1] = 2;
        b[0] = 4; b[1] = 5; b[2] = 6;
        skprod(0,1,2);
        return 0;
    }

3AC with basic block partitioning for the procedure main (the param/call block and the return form separate basic blocks):

    main:
        a[0] := 1
        a[1] := 2
        b[0] := 4
        b[1] := 5
        b[2] := 6
        param 0
        param 1
        param 2
        call skprod

        return 0

Procedure skprod with 3AC and basic block partitioning:

    int skprod(int i1, int i2, int lng) {
        int ix, res = 0;
        for( ix=0; ix <= lng-1; ix++ ){
            res += a[i1+ix] * b[i2+ix];
        }
        return res;
    }

    skprod:
        res := 0
        ix  := 0

        t0 := lng-1
        if ix <= t0          (true: loop body; false: return)

    true:
        t1  := i1+ix
        t2  := a[t1]
        t1  := i2+ix
        t3  := b[t1]
        t1  := t2*t3
        res := res+t1
        ix  := ix+1
        (back to the loop test)

    false:
        return res

Variation within an intermediate language: the same example as 3AC after elimination of array operations. Array accesses are resolved into address arithmetic and dereferencing; the factor 4 is the size of an array element in bytes:

    skprod:
        res := 0
        ix  := 0

        t0 := lng-1
        if ix <= t0

    true:
        t1  := i1+ix
        tx  := t1*4
        ta  := a+tx
        t2  := *ta
        t1  := i2+ix
        tx  := t1*4
        tb  := b+tx
        t3  := *tb
        t1  := t2*t3
        res := res+t1
        ix  := ix+1

    false:
        return res

Characteristics of 3-address code:
• Control flow is explicit.
• There are only elementary operations.
• Rearrangement and exchange of commands can be handled relatively easily.

4.1.2 Other Intermediate Languages

We consider
• 3AC in Static Single Assignment (SSA) representation, and
• stack machine code.

Static Single Assignment Form

SSA is essentially a refinement of 3AC. In SSA representation, every variable has exactly one definition; if a variable a is read at a program position, this is a use of a. The connection between uses and definitions is thereby explicit in the intermediate language, i.e., no additional def-use or use-def chaining is needed.
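To illustrate (a sketch, not from the slides; it uses the standard φ-notation of the SSA literature): the skprod loop from above in SSA form. Every variable is split into versions with exactly one definition each, and φ-commands select the incoming version at the control flow join at the loop head:

    skprod:
        res0 := 0
        ix0  := 0
    L1: res1 := phi(res0, res2)    ; from entry or from the loop body
        ix1  := phi(ix0, ix2)
        t0   := lng - 1
        if ix1 <= t0 goto L2
        return res1
    L2: t1   := i1 + ix1
        t2   := a[t1]
        t3   := i2 + ix1
        t4   := b[t3]
        t5   := t2 * t4
        res2 := res1 + t5
        ix2  := ix1 + 1
        goto L1

Note that the temporary t1, which the plain 3AC version reuses three times, is split into t1, t3, t5 here, since every SSA name may be defined only once.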