Register Allocation on the Intel® Itanium® Architecture


Technisch-Naturwissenschaftliche Fakultät

Register Allocation on the Intel® Itanium® Architecture

DISSERTATION zur Erlangung des akademischen Grades Doktor im Doktoratsstudium der Technischen Wissenschaften

Eingereicht von: Dipl.-Math. Gerolf Fritz Hoflehner
Angefertigt am: Institut für Systemsoftware
Beurteilung: o.Univ.-Prof. Dr. Dr.h.c. Hanspeter Mössenböck (Betreuung), a.Univ.-Prof. Dr. Andreas Krall

San Jose, März 2010

Abstract

Register allocators based on graph coloring have been implemented in commercial and research compilers since the pioneering work of Gregory Chaitin and his colleagues in the early 1980s. A coloring allocator decides which live range (a program or compiler-generated variable) is allocated a machine register (the "allocation problem"). The Itanium processor architecture supports predicated code, control and data speculation, and a dynamic register stack. These features make the allocation problem more challenging. This thesis analyses and describes efficient extensions to a coloring allocator for the Itanium processor.

• Predicated code: The thesis describes compile-time-efficient coloring methods in the presence of predicated instructions that do not compromise the run-time performance of a more elaborate allocator based on the predicate query system (PQS). In particular, it classifies predicated live ranges and shows that classical register allocation techniques can be used effectively to engineer efficient coloring allocators for predicated code. When predicated code is generated from compiler control flow, more expensive predicate analysis frameworks such as PQS do not have to be employed.

• Speculated code: The thesis describes a new method for efficiently allocating speculated live ranges that avoids the spill code generated by a more conservative method. In particular, the NaT propagation problem is solved efficiently.

• Dynamic register stack: The thesis reviews methods for using the dynamic register stack effectively, in particular in regions with function calls and/or pipelined loops.

• Scalable allocation: A generic problem of coloring allocators is that they can be slow for large candidate sets. This thesis proposes the scalable allocator as a generic coloring method capable of effectively allocating programs with large register candidate sets. The methods can also be used for parallel allocation, e.g. on multi-core machines.

The experimental results on the CPU2006 benchmark suite illustrate the effectiveness of the new methods. Finally, the thesis reviews the development of coloring allocators since Chaitin.

Kurzfassung

Gregory Chaitin und seine Kollegen haben um 1980 die Pionierarbeit für Registerallokation basierend auf Graphfärbung („Farballokator") geleistet. Diese Allokatoren entscheiden, welchen „Lebensspannen" (d.h. deklarierte oder vom Compiler erzeugte Variablen, engl. live ranges) Maschinenregister („Farben") zugeteilt werden (Allokationsproblem). Der Itanium-Prozessor unterstützt Instruktionen mit Prädikaten (bedingt ausführbare Instruktionen), Kontroll- und Datenspekulation sowie einen dynamischen Register Stack. Diese Eigenschaften erschweren die Lösung des Allokationsproblems. Die vorliegende Dissertation untersucht und beschreibt effiziente Erweiterungen in einem Farballokator für den Itanium-Prozessor.

• Prädikate: Es werden effiziente Methoden (zur Übersetzerzeit) für Farballokatoren vorgestellt, die Lebensspannen in Code mit bedingt ausführbaren Instruktionen allokieren.
Insbesondere werden „prädikatierte" Lebensspannen klassifiziert, und es wird gezeigt, dass klassische Methoden zu einem effizienten Farballokator für diese Lebensspannen erweitert werden können. In einem Compiler kann die Allokation mit diesen Methoden genauso effizienten Code generieren wie aufwendigere Verfahren, insbesondere Verfahren, die das „predicate query system" (PQS) benutzen.

• Spekulation: Es wird eine neue Methode erläutert, die – im Vergleich zu konservativen Verfahren – Spill Code für spekulative Lebensspannen vermeiden kann. Insbesondere wird eine effiziente Lösung für das NaT-Propagation-Problem vorgestellt.

• Dynamischer Register Stack: Es wird beschrieben, wie der dynamische Register Stack in Code mit Funktionsaufrufen oder „pipelined" Schleifen (engl. „software-pipelined loops") effizient verwendet werden kann.

• Skalierbare Allokation: Es wird der skalierbare Allokator für die Lösung von Allokationsproblemen beliebiger Größe vorgeschlagen. Skalierbare Allokation erlaubt insbesondere die Parallelisierung des Allokationsproblems und ist unabhängig von der Prozessor-Architektur.

Die experimentellen Resultate für die CPU2006 Benchmark Suite zeigen die Effizienz der vorgestellten Verfahren. Schließlich enthält diese Dissertation einen ausführlichen Überblick über die Forschungsergebnisse für Farballokatoren seit Chaitin.

Acknowledgements

The tree with our fruits is watered by many. First, this work would not have been possible without Intel and many of its excellent engineers. Roland Kenner implemented the original version of the graph-coloring allocator and PQS. The idea to eliminate interferences for portions of a data-speculated live range (Section 7.2) had been implemented in the compiler before 1999, the year I started at Intel. Daniel Lavery was my exceptional engineering manager for seven years, taking a lot of interest in and continuously challenging my thoughts on register allocation. His questions and curiosity helped shape the section on predicate-aware register allocation. I had the good fortune to benefit from many discussions and from working with an outstanding team and colleagues: Howard Chen, Darshan Desai, Kalyan Muthukumar, Robyn Sampson and Sebastian Winkel. Alban Douillet and Alex Settle helped bring to life the product version of the multiple-alloc algorithms during their summer internships. The Intel Compiler Lab provided a great place to work and let me drive innovation amidst the challenges of tight product schedules. Second, I would like to thank Prof. Hanspeter Mössenböck for his patience and support. His kind personality, openness, and advice were inspiring. I cannot imagine how he could have been more supportive. Finally, and most importantly, I thank the woman in my life, Elisabeth Reinhold. Her push and loving support were always there when I needed it most.

Contents

Abstract
Kurzfassung
Acknowledgements
1 Introduction
1.1 Compilers and Optimizations
1.2 Register Allocation based on Graph-Coloring
1.3 Itanium Processor Family
1.4 Overview
2 Background on IA-64
2.1 IA-64 Instructions
2.2 Predication
2.3 Architected Registers
2.4 Register Stack Frame
2.5 Register Spilling and Backing Store
2.6 Speculation
2.6.1 Control Speculation
2.6.2 Data Speculation
2.6.3 Combined Control- and Data Speculation
3 Review of Graph-Coloring based Register Allocation
3.1 Foundations
3.1.1 Chaitin-style Register Allocation
3.1.2 Priority-based Register Allocation
3.2 Worst-case Time and Space Analysis
3.3 Developments
3.3.1 Spill Cost Reduction
3.3.2 Scoping
3.3.3 Coalescing
3.3.4 Extensions
3.4 Alternative
Recommended publications
  • Control Flow Analysis in Scheme
    Control Flow Analysis in Scheme. Olin Shivers, Carnegie Mellon University, [email protected]

    Abstract. Traditional flow analysis techniques, such as the ones typically employed by optimising Fortran compilers, do not work for Scheme-like languages. This paper presents a flow analysis technique – control flow analysis – which is applicable to Scheme-like languages. As a demonstration application, the information gathered by control flow analysis is used to perform a traditional flow analysis problem, induction variable elimination. Extensions and limitations are discussed. The techniques presented in this paper are backed up by working code. They are applicable not only to Scheme, but also to related languages, such as Common Lisp and ML.

    1 The Task. Flow analysis is a traditional optimising compiler technique for determining useful information about a program at compile time (… Fortran). Representative texts describing these techniques are [Dragon], and in more detail, [Hecht]. Flow analysis is perhaps the chief tool in the optimising compiler writer's bag of tricks; an incomplete list of the problems that can be addressed with flow analysis includes global constant subexpression elimination, loop invariant detection, redundant assignment detection, dead code elimination, constant propagation, range analysis, code hoisting, induction variable elimination, copy propagation, live variable analysis, loop unrolling, and loop jamming. However, these traditional flow analysis techniques have never successfully been applied to the Lisp family of computer languages. This is a curious omission. The Lisp community has had sufficient time to consider the problem. Flow analysis dates back at least to 1960 ([Dragon], pp. 516), and Lisp is one of the oldest computer programming languages currently in use, rivalled only by Fortran and COBOL. Indeed, the Lisp community has long been concerned with the execution speed of their programs.
  • Language and Compiler Support for Dynamic Code Generation by Massimiliano A. Poletto
    Language and Compiler Support for Dynamic Code Generation, by Massimiliano A. Poletto. S.B., Massachusetts Institute of Technology (1995); M.Eng., Massachusetts Institute of Technology (1995). Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology, September 1999. © Massachusetts Institute of Technology 1999. All rights reserved. Author: Department of Electrical Engineering and Computer Science, June 23, 1999. Certified by: M. Frans Kaashoek, Associate Professor of Electrical Engineering and Computer Science, Thesis Supervisor. Accepted by: Arthur C. Smith, Chairman, Departmental Committee on Graduate Students.

    Abstract. Dynamic code generation, also called run-time code generation or dynamic compilation, is the creation of executable code for an application while that application is running. Dynamic compilation can significantly improve the performance of software by giving the compiler access to run-time information that is not available to a traditional static compiler. A well-designed programming interface to dynamic compilation can also simplify the creation of important classes of computer programs. Until recently, however, no system combined efficient dynamic generation of high-performance code with a powerful and portable language interface. This thesis describes a system that meets these requirements, and discusses several applications of dynamic compilation.
  • Strength Reduction of Induction Variables and Pointer Analysis – Induction Variable Elimination
    Loop Optimizations and Pointer Analysis (CS 412/413 Spring 2008, Introduction to Compilers)

    Loop optimizations
    • Optimize loops
    – Loop invariant code motion [last time]
    – Strength reduction of induction variables
    – Induction variable elimination

    Strength Reduction
    • Basic idea: replace expensive operations (multiplications) with cheaper ones (additions) in definitions of induction variables
    Before: while (i<10) { j = 3*i+1; // <i,3,1> a[j] = a[j] - 2; i = i+2; }
    After: s = 3*i+1; while (i<10) { j = s; a[j] = a[j] - 2; i = i+2; s = s+6; }
    • Benefit: cheaper to compute s = s+6 than j = 3*i
    – s = s+6 requires an addition
    – j = 3*i requires a multiplication

    Induction Variables
    • An induction variable is a variable in a loop, whose value is a function of the loop iteration number: v = f(i)
    • In compilers, this is a linear function: f(i) = c*i + d
    • Observation: linear combinations of linear functions are linear functions
    – Consequence: linear combinations of induction variables are induction variables

    Families of Induction Variables
    • Basic induction variable: a variable whose only definition in the loop body is of the form i = i + c, where c is a loop-invariant value
    • Derived induction variables: Each basic induction variable i defines

    Representation
    • Representation of induction variables in family i by triples:
    – Denote basic induction variable i by <i, 1, 0>
    – Denote induction variable k = i*a + b by triple <i, a, b>
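    The slide fragments above can be assembled into a small, runnable C sketch of strength reduction; the function and variable names below are illustrative, not from the slides.

```c
#include <stdio.h>

/* Original loop: j is the derived induction variable <i, 3, 1>,
 * recomputed with a multiplication on every iteration. */
static void before(int a[], int i) {
    while (i < 10) {
        int j = 3 * i + 1;   /* expensive: multiply each iteration */
        a[j] = a[j] - 2;
        i = i + 2;
    }
}

/* After strength reduction: s tracks 3*i+1 and is updated by adding
 * 6 (= 3 * step) per iteration, replacing the multiplication. */
static void after(int a[], int i) {
    int s = 3 * i + 1;
    while (i < 10) {
        int j = s;
        a[j] = a[j] - 2;
        i = i + 2;
        s = s + 6;
    }
}

int main(void) {
    int x[40] = {0}, y[40] = {0};
    before(x, 0);
    after(y, 0);
    /* Both variants perform the same array updates. */
    printf("%d %d\n", x[1], y[1]);  /* prints: -2 -2 */
    return 0;
}
```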
  • CSE 401/M501 – Compilers
    CSE 401/M501 – Compilers: Dataflow Analysis. Hal Perkins, Autumn 2018.

    Agenda
    • Dataflow analysis: a framework and algorithm for many common compiler analyses
    • Initial example: dataflow analysis for common subexpression elimination
    • Other analysis problems that work in the same framework
    • Some of these are the same optimizations we've seen, but more formally and with details

    Common Subexpression Elimination
    • Goal: use dataflow analysis to find common subexpressions
    • Idea: calculate available expressions at beginning of each basic block
    • Avoid re-evaluation of an available expression – use a copy operation
    – Simple inside a single block; more complex dataflow analysis used across blocks
    [Example CFG with blocks A–G: A: m = a + b; n = a + b. B: p = c + d; r = c + d. C: q = a + b; r = c + d. D: e = b + 18; s = a + b; u = e + f. E: e = a + 17; t = c + d; u = e + f. F: v = a + b; w = c + d; x = e + f. G: y = a + b; z = c + d.]

    "Available" and Other Terms
    • An expression e is defined at point p in the CFG if its value is computed at p
    – Sometimes called definition site
    • An expression e is killed at point p if one of its operands is defined at p
    – Sometimes called kill site
    • An expression e is available at point p if every path leading to p contains a prior definition of e and e is not killed between that definition and p
    [Side example: t1 = a + b (a+b defined) … t10 = a + b (a+b available) … b = 7 (a+b killed)]

    Available Expression Sets
    • To compute available expressions, for each block
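    A minimal C sketch (not from the slides) of what common subexpression elimination does to source-level code once an expression is known to be available: the redundant evaluation is replaced by a copy. Names are hypothetical.

```c
#include <stdio.h>

/* Before CSE: a + b is evaluated twice on the same path. */
static int before(int a, int b, int c) {
    int m = a + b;
    int n = a + b;      /* a + b is available here but is recomputed */
    return m * c + n;
}

/* After CSE: the available expression is reused via a copy operation. */
static int after(int a, int b, int c) {
    int t = a + b;      /* single evaluation */
    int m = t;
    int n = t;          /* copy instead of re-evaluation */
    return m * c + n;
}

int main(void) {
    printf("%d %d\n", before(2, 3, 4), after(2, 3, 4));  /* prints: 25 25 */
    return 0;
}
```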
  • Loop Transformations and Parallelization
    Loop transformations and parallelization. Claude Tadonki, LAL/CNRS/IN2P3, University of Paris-Sud, [email protected], December 2010.

    Introduction. Most of the time, the most time-consuming part of a program is its loops. Thus, loop optimization is critical in high-performance computing. Depending on the target architecture, the goals of loop transformations are: improve data reuse and data locality; efficient use of the memory hierarchy; reducing overheads associated with executing loops; instruction pipelining; maximize parallelism. Loop transformations can be performed at different levels by the programmer, the compiler, or specialized tools. At a high level, some well-known transformations are commonly considered: loop interchange, loop (node) splitting, loop unswitching, loop reversal, loop fusion, loop inversion, loop skewing, loop fission, loop vectorization, loop blocking, loop unrolling, loop parallelization.

    Dependence analysis. Extracting and analyzing the dependencies of a computation from its polyhedral model is a fundamental step toward loop optimization or scheduling. Definition: For a given variable V and given indexes I1, I2, if the computation of X(I1) requires the value of X(I2), then I1 − I2 is called a dependence vector for variable V. Drawing all the dependence vectors within the computation polytope yields the so-called dependency diagram. Example: The dependence vectors are (1, 0), (0, 1), (−1, 1).

    Scheduling. Definition: The computation on the entire domain of a given loop can be performed following any valid schedule. A timing function $t_V$ for variable V yields a valid schedule if and only if $t(x) > t(x - d) \quad \forall d \in D_V$ (1), where $D_V$ is the set of all dependence vectors for variable V.
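    A small C sketch (not from the slides) of one of the transformations listed above, loop interchange, applied to improve locality on a row-major array; the function names and the toy size N are assumptions for illustration.

```c
#include <stdio.h>

#define N 4

/* Column-major traversal of a row-major array: consecutive inner
 * iterations touch elements N apart in memory (poor locality). */
static void sum_before(double a[N][N], double *s) {
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            *s += a[i][j];
}

/* After loop interchange: the inner loop walks consecutive addresses.
 * The result is unchanged because the additions carry no dependence
 * that constrains the iteration order. */
static void sum_after(double a[N][N], double *s) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            *s += a[i][j];
}

int main(void) {
    double a[N][N], s1 = 0.0, s2 = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = i * N + j;
    sum_before(a, &s1);
    sum_after(a, &s2);
    printf("%.0f %.0f\n", s1, s2);  /* prints: 120 120 */
    return 0;
}
```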
  • Foundations of Scientific Research
    FOUNDATIONS OF SCIENTIFIC RESEARCH. N. M. Glazunov, National Aviation University, 25.11.2012.

    CONTENTS
    Preface
    Introduction
    1. General notions about scientific research (SR)
    1.1. Scientific method
    1.2. Basic research
    1.3. Information supply of scientific research
    2. Ontologies and upper ontologies
    2.1. Concepts of Foundations of Research Activities
    2.2. Ontology components
    2.3. Ontology for the visualization of a lecture
    3. Ontologies of object domains
    3.1. Elements of the ontology of spaces and symmetries
    3.1.1. Concepts of electrodynamics and classical gauge theory
    4. Examples of Research Activity
    4.1. Scientific activity in arithmetics, informatics and discrete mathematics
    4.2. Algebra of logic and functions of the algebra of logic
    4.3. Function of the algebra of logic
    5. Some Notions of the Theory of Finite and Discrete Sets
    6. Algebraic Operations and Algebraic Structures
    7. Elements of the Theory of Graphs and Nets
    8. Scientific activity on the example "Information and its investigation"
    9. Scientific research in Artificial Intelligence
    10. Compilers and compilation
    11. Objective, Concepts and History of Computer security
    12. Methodological and categorical apparatus of scientific research
    13. Methodology and methods of scientific research
    13.1. Methods of theoretical level of research
    13.1.1. Induction
    13.1.2. Deduction
    13.2. Methods of empirical level of research
    14. Scientific idea and significance of scientific research
    15. Forms of scientific knowledge organization and principles of SR
    15.1. Forms of scientific knowledge
    15.2.
  • Hybrid Analysis of Memory References and Its Application to Automatic Parallelization
    HYBRID ANALYSIS OF MEMORY REFERENCES AND ITS APPLICATION TO AUTOMATIC PARALLELIZATION. A Dissertation by SILVIUS VASILE RUS, submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY, December 2006. Major Subject: Computer Science. Approved by: Chair of Committee, Lawrence Rauchwerger; Committee Members, Nancy Amato, Narasimha Reddy, Vivek Sarin; Head of Department, Valerie Taylor.

    ABSTRACT. Hybrid Analysis of Memory References and Its Application to Automatic Parallelization. (December 2006) Silvius Vasile Rus, B.S., Babes-Bolyai University. Chair of Advisory Committee: Dr. Lawrence Rauchwerger.

    Executing sequential code in parallel on a multithreaded machine has been an elusive goal of the academic and industrial research communities for many years. It has recently become more important due to the widespread introduction of multi-cores in PCs. Automatic multithreading has not been achieved because classic, static compiler analysis was not powerful enough and program behavior was found to be, in many cases, input dependent. Speculative thread level parallelization was a welcome avenue for advancing parallelization coverage but its performance was not always optimal due to the sometimes unnecessary overhead of checking every dynamic memory reference. In this dissertation we introduce a novel analysis technique, Hybrid Analysis, which unifies static and dynamic memory reference techniques into a seamless compiler framework which extracts almost maximum available parallelism from scientific codes and incurs close to the minimum necessary run time overhead.
  • Optimization: Compiler Passes, Optimizations, and the Role of the Optimizer
    Compiler Passes
    [Diagram: Analysis of input program (front-end) and synthesis of output program (back-end): character stream → Lexical Analysis → token stream → Syntactic Analysis → abstract syntax tree → Semantic Analysis → annotated AST → Intermediate Code Generation → intermediate form → Optimization → intermediate form → Code Generation → target language]

    Optimization
    Before and after generating machine code, devote one or more passes over the program to "improve" code quality.

    The Role of the Optimizer
    • The compiler can implement a procedure in many ways
    • The optimizer tries to find an implementation that is "better" – Speed, code size, data space, …
    To accomplish this, it
    • Analyzes the code to derive knowledge about run-time behavior – Data-flow analysis, pointer disambiguation, … – General term is "static analysis"
    • Uses that knowledge in an attempt to improve the code – Literally hundreds of transformations have been proposed – Large amount of overlap between them
    Nothing "optimal" about optimization
    • Proofs of optimality assume restrictive & unrealistic conditions
    • Better goal is to "usually improve"

    Optimizations
    Identify inefficiencies in intermediate or target code; replace with equivalent but better sequences.
    • equivalent = "has the same externally visible behavior"
    Target-independent optimizations best done on IL code; target-dependent optimizations best done on target code.

    Traditional Three-pass Compiler
    [Diagram: Source Code → Front End → IR → Middle End → IR → Back End → Machine code; errors reported from each stage]

    The Optimizer (or Middle End)
    [Diagram: IR → Opt 1 → IR → Opt 2 → IR → Opt 3 → … → Opt n → IR; errors reported from each pass] Modern
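    A small, self-contained C illustration (not from the slides) of "replace with equivalent but better sequences": the second function has the same externally visible behavior as the first, with the dead computation removed and the constant expression folded. Names are hypothetical.

```c
#include <stdio.h>

/* Before: contains work an optimizer would identify as inefficient. */
static int scale_before(int x) {
    int unused = x * x;   /* dead code: the result is never used */
    int k = 4 * 10;       /* constant expression, foldable at compile time */
    return x + k;
}

/* After: an equivalent but better sequence — same externally visible
 * behavior, less work (dead code eliminated, constant folded). */
static int scale_after(int x) {
    return x + 40;
}

int main(void) {
    printf("%d %d\n", scale_before(2), scale_after(2));  /* prints: 42 42 */
    return 0;
}
```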
  • CS 4120 Introduction to Compilers Andrew Myers Cornell University
    CS 4120 Introduction to Compilers. Andrew Myers, Cornell University. Lecture 19: Introduction to Optimization, 2020 March 9.

    Optimization
    • Next topic: how to generate better code through optimization.
    • This course covers the most valuable and straightforward optimizations – much more to learn!
    – Other sources: Appel, Dragon book; Muchnick has 10 chapters of optimization techniques; Cooper and Torczon

    How fast can you go?
    [Chart: relative execution cost on a log scale with tick marks 10000, 1000, 100, 10, 1, 0.1, labeling (in the order shown): direct source code interpreters (parsing included!), tokenized program interpreters (BASIC, Tcl), AST interpreters (Perl 4), bytecode interpreters (Java, Perl 5, OCaml, Python), call-threaded interpreters, pointer-threaded interpreters (FORTH), simple code generation (PA4, JIT), register allocation, naive assembly code, local optimization, global optimization, FPGA, expert assembly code, GPU]

    Goal of optimization
    • Compile clean, modular, high-level source code to expert assembly-code performance.
    • Can't change meaning of program to behavior not allowed by source.
    • Different goals:
    – space optimization: reduce memory use
    – time optimization: reduce execution time
    – power optimization: reduce power usage

    Why do we need optimization?
    • Programmers may write suboptimal code for clarity.
    • Source language may make it hard to avoid redundant computation: a[i][j] = a[i][j] + 1
    • Architectural independence
    • Modern architectures assume optimization—hard to optimize by hand!

    Where to optimize?
    • Usual goal: improve time performance
    • But: many optimizations trade off space vs. time.
    • Example: loop unrolling replaces a loop body with N copies.
    – Increasing code space speeds up one loop but slows rest of program down a little.
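    A minimal C sketch (not from the slides) of the loop-unrolling example mentioned in the last bullet; the array size, unroll factor, and function names are assumptions for illustration.

```c
#include <stdio.h>

#define N 16

/* Rolled loop: one compare-and-branch per element. */
static int sum_rolled(const int *a) {
    int s = 0;
    for (int i = 0; i < N; i++)
        s += a[i];
    return s;
}

/* Unrolled by 4: the body is replicated four times, so loop overhead
 * (compare, branch, increment) is paid once per four elements, at the
 * cost of a larger code footprint. N is assumed to be a multiple of 4
 * here to keep the sketch short. */
static int sum_unrolled(const int *a) {
    int s = 0;
    for (int i = 0; i < N; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    return s;
}

int main(void) {
    int a[N];
    for (int i = 0; i < N; i++)
        a[i] = i;
    printf("%d %d\n", sum_rolled(a), sum_unrolled(a));  /* prints: 120 120 */
    return 0;
}
```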
  • Compiler Transformations for High-Performance Computing
    Compiler Transformations for High-Performance Computing. DAVID F. BACON, SUSAN L. GRAHAM, AND OLIVER J. SHARP, Computer Science Division, University of California, Berkeley, California 94720.

    In the last three decades a large number of compiler transformations for optimizing programs have been implemented. Most optimizations for uniprocessors reduce the number of instructions executed by the program using transformations based on the analysis of scalar quantities and data-flow techniques. In contrast, optimizations for high-performance superscalar, vector, and parallel processors maximize parallelism and memory locality with transformations that rely on tracking the properties of arrays using loop dependence analysis. This survey is a comprehensive overview of the important high-level program restructuring techniques for imperative languages such as C and Fortran. Transformations for both sequential and various types of parallel architectures are covered in depth. We describe the purpose of each transformation, explain how to determine if it is legal, and give an example of its application. Programmers wishing to enhance the performance of their code can use this survey to improve their understanding of the optimizations that compilers can perform, or as a reference for techniques to be applied manually. Students can obtain an overview of optimizing compiler technology. Compiler writers can use this survey as a reference for most of the important optimizations developed to date, and as a bibliographic reference for the details of each optimization.
  • Compiler Construction
    Compiler construction. PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 10 Dec 2011 02:23:02 UTC.

    Contents
    Introduction: Compiler construction, Compiler, Interpreter, History of compiler writing
    Lexical analysis: Lexical analysis, Regular expression, Regular expression examples, Finite-state machine, Preprocessor
    Syntactic analysis: Parsing, Lookahead, Symbol table, Abstract syntax, Abstract syntax tree, Context-free grammar, Terminal and nonterminal symbols, Left recursion, Backus–Naur Form, Extended Backus–Naur Form, TBNF, Top-down parsing, Recursive descent parser, Tail recursive parser, Parsing expression grammar, LL parser, LR parser, Parsing table, Simple LR parser, Canonical LR parser, GLR parser, LALR parser, Recursive ascent parser, Parser combinator, Bottom-up parsing, Chomsky normal form, CYK algorithm, Simple precedence grammar, Simple precedence parser, Operator-precedence grammar, Operator-precedence parser, Shunting-yard algorithm, Chart parser, Earley parser, The lexer hack, Scannerless parsing
    Semantic analysis: Attribute grammar, L-attributed grammar, LR-attributed grammar, S-attributed grammar, ECLR-attributed grammar, Intermediate language, Control flow graph, Basic block, Call graph, Data-flow analysis, Use-define chain, Live variable analysis, Reaching definition, Three address
  • Codegeneratoren
    Codegeneratoren. Sebastian Geiger, Twitter: @lanoxx, Email: [email protected], January 26, 2015.

    Abstract. This is a very short summary of the topics that are taught in "Codegeneratoren". Code generators are the backend part of a compiler that deals with the conversion of the machine-independent intermediate representation into machine-dependent instructions and the various optimizations on the resulting code. The first two sections deal with prerequisites that are required for the later parts. Section 1 discusses computer architectures in order to give a better understanding of how processors work and why different optimizations are possible. Section 2 discusses some of the higher-level optimizations that are done before the code generator does its work. Section 3 discusses instruction selection, which concerns the actual conversion of the intermediate representation into machine-dependent instructions. The later chapters all deal with different kinds of optimizations that are possible on the resulting code. I have written this document to help my self-study for this subject. If you find this document useful or if you find errors or mistakes then please send me an email.

    Contents
    1 Computer architectures
    1.1 Microarchitectures
    1.2 Instruction Set Architectures
    2 Optimizations
    2.1 Static Single Assignment
    2.2 Graph structures and memory dependencies
    2.3 Machine independent optimizations
    3 Instruction selection
    4 Register allocation
    5 Coalescing
    6 Optimal register allocation using integer linear programming
    7 Interprocedural register allocation
    7.1 Global Register allocation at link time (Annotation based)
    7.2 Register allocation across procedure and module boundaries (Web)
    8 Instruction Scheduling
    8.1 Phase ordering problem