Register Allocation on the Intel® Itanium® Architecture
Total Pages: 16
File Type: PDF, Size: 1020 KB
Recommended publications
-
Control Flow Analysis in Scheme
Control Flow Analysis in Scheme. Olin Shivers, Carnegie Mellon University, [email protected].
Abstract: Traditional flow analysis techniques, such as the ones typically employed by optimising Fortran compilers, do not work for Scheme-like languages. This paper presents a flow analysis technique, control flow analysis, which is applicable to Scheme-like languages. As a demonstration application, the information gathered by control flow analysis is used to perform a traditional flow analysis problem, induction variable elimination. Extensions and limitations are discussed. The techniques presented in this paper are backed up by working code. They are applicable not only to Scheme, but also to related languages, such as Common Lisp and ML.
1 The Task. Flow analysis is a traditional optimising compiler technique for determining useful information about a program at compile time. Representative texts describing these techniques are [Dragon] and, in more detail, [Hecht]. Flow analysis is perhaps the chief tool in the optimising compiler writer's bag of tricks; an incomplete list of the problems that can be addressed with flow analysis includes global common subexpression elimination, loop invariant detection, redundant assignment detection, dead code elimination, constant propagation, range analysis, code hoisting, induction variable elimination, copy propagation, live variable analysis, loop unrolling, and loop jamming. However, these traditional flow analysis techniques have never successfully been applied to the Lisp family of computer languages. This is a curious omission. The Lisp community has had sufficient time to consider the problem: flow analysis dates back at least to 1960 ([Dragon], p. 516), and Lisp is one of the oldest computer programming languages currently in use, rivalled only by Fortran and COBOL. Indeed, the Lisp community has long been concerned with the execution speed of their programs.
-
Language and Compiler Support for Dynamic Code Generation by Massimiliano A. Poletto
Language and Compiler Support for Dynamic Code Generation, by Massimiliano A. Poletto. S.B., Massachusetts Institute of Technology (1995); M.Eng., Massachusetts Institute of Technology (1995). Submitted to the Department of Electrical Engineering and Computer Science on June 23, 1999, in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology, September 1999. © Massachusetts Institute of Technology 1999. All rights reserved. Certified by M. Frans Kaashoek, Associate Professor of Electrical Engineering and Computer Science, Thesis Supervisor. Accepted by Arthur C. Smith, Chairman, Departmental Committee on Graduate Students.
Abstract: Dynamic code generation, also called run-time code generation or dynamic compilation, is the creation of executable code for an application while that application is running. Dynamic compilation can significantly improve the performance of software by giving the compiler access to run-time information that is not available to a traditional static compiler. A well-designed programming interface to dynamic compilation can also simplify the creation of important classes of computer programs. Until recently, however, no system combined efficient dynamic generation of high-performance code with a powerful and portable language interface. This thesis describes a system that meets these requirements, and discusses several applications of dynamic compilation.
-
Strength Reduction of Induction Variables and Pointer Analysis – Induction Variable Elimination
Loop Optimizations and Pointer Analysis (CS 412/413, Introduction to Compilers, Spring 2008).
Loop optimizations
• Optimize loops: loop invariant code motion [last time]; strength reduction of induction variables; induction variable elimination.
Strength Reduction
• Basic idea: replace expensive operations (multiplications) with cheaper ones (additions) in definitions of induction variables.
Before: while (i<10) { j = 3*i+1; // <i,3,1>  a[j] = a[j] - 2; i = i+2; }
After:  s = 3*i+1; while (i<10) { j = s; a[j] = a[j] - 2; i = i+2; s = s+6; }
• Benefit: it is cheaper to compute s = s+6 than j = 3*i, since s = s+6 requires an addition while j = 3*i requires a multiplication.
Induction Variables
• An induction variable is a variable in a loop whose value is a function of the loop iteration number: v = f(i).
• In compilers, this is a linear function: f(i) = c*i + d.
• Observation: linear combinations of linear functions are linear functions. Consequence: linear combinations of induction variables are induction variables.
Families of Induction Variables
• Basic induction variable: a variable whose only definition in the loop body is of the form i = i + c, where c is a loop-invariant value.
• Derived induction variables: each basic induction variable i defines …
Representation
• Induction variables in the family of i are represented by triples: the basic induction variable i is denoted <i, 1, 0>, and the induction variable k = i*a+b is denoted by the triple <i, a, b> (for example, j = 3*i+1 is <i, 3, 1>).
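The excerpt stops just short of the elimination step named in the item's title. As a hedged illustration (not taken from the slides; the function name, the array bound, and the starting value i = 0 are assumptions), a C sketch of induction variable elimination after strength reduction might look like this:

    /* Assumed example: after strength reduction, j is the derived
       induction variable <i, 3, 1>.  Rewriting the loop test i < 10 as
       j < 3*10 + 1 removes the last use of i, so i and its update
       i = i + 2 can be deleted.  Assumes a[] has at least 32 elements. */
    void update(int a[]) {
        int j = 1;                 /* j = 3*i + 1 at the assumed start i = 0 */
        while (j < 31) {           /* former test: i < 10 */
            a[j] = a[j] - 2;
            j = j + 6;             /* i = i + 2 advances j by 3*2 */
        }
    }
-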
CSE 401/M501 – Compilers
CSE 401/M501 – Compilers: Dataflow Analysis. Hal Perkins, Autumn 2018.
Agenda
• Dataflow analysis: a framework and algorithm for many common compiler analyses.
• Initial example: dataflow analysis for common subexpression elimination.
• Other analysis problems that work in the same framework.
• Some of these are the same optimizations we've seen, but more formally and with details.
Common Subexpression Elimination
• Goal: use dataflow analysis to find common subexpressions.
• Idea: calculate available expressions at the beginning of each basic block.
• Avoid re-evaluation of an available expression: use a copy operation. This is simple inside a single block; a more complex dataflow analysis is used across blocks.
(The slide's example CFG has blocks A through G computing expressions such as a+b, c+d, and e+f: A: m = a+b; n = a+b. B: p = c+d; r = c+d. C: q = a+b; r = c+d. D: e = b+18; s = a+b; u = e+f. E: e = a+17; t = c+d; u = e+f. F: v = a+b; w = c+d; x = e+f. G: y = a+b; z = c+d.)
"Available" and Other Terms
• An expression e is defined at point p in the CFG if its value is computed at p (sometimes called the definition site).
• An expression e is killed at point p if one of its operands is defined at p (sometimes called the kill site).
• An expression e is available at point p if every path leading to p contains a prior definition of e and e is not killed between that definition and p.
(Margin examples on the slide: t1 = a + b as a definition of a+b, b = 7 as a kill of a+b, and a later t10 = a + b at which a+b is available.)
Available Expression Sets
• To compute available expressions, for each block …
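As a small illustration in the spirit of these slides (the function and variable names below are invented, not taken from the lecture), the following C sketch shows an expression that is available at its later uses being replaced by copies of a temporary:

    /* Before: a + b is computed in the entry block and neither operand is
       redefined on either path, so it is available at the later uses. */
    int before(int a, int b, int c, int d, int cond) {
        int m = a + b;            /* definition site of a + b */
        int p;
        if (cond)
            p = a + b;            /* redundant: a + b is available here */
        else
            p = c + d;
        int x = a + b;            /* available along both paths */
        return m + p + x;
    }

    /* After CSE: evaluate once into a temporary and use copy operations. */
    int after(int a, int b, int c, int d, int cond) {
        int t1 = a + b;
        int m = t1;
        int p;
        if (cond)
            p = t1;
        else
            p = c + d;
        int x = t1;
        return m + p + x;
    }
-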
Loop Transformations and Parallelization
Loop transformations and parallelization. Claude Tadonki, LAL/CNRS/IN2P3, University of Paris-Sud, [email protected], December 2010.
Introduction
Most of the time, the most time-consuming part of a program is its loops, so loop optimization is critical in high performance computing. Depending on the target architecture, the goals of loop transformations are: improving data reuse and data locality; efficient use of the memory hierarchy; reducing the overheads associated with executing loops; keeping the instruction pipeline full; maximizing parallelism.
Loop transformations can be performed at different levels by the programmer, the compiler, or specialized tools. At a high level, some well-known transformations are commonly considered: loop interchange, loop (node) splitting, loop unswitching, loop reversal, loop fusion, loop inversion, loop skewing, loop fission, loop vectorization, loop blocking, loop unrolling, loop parallelization.
Dependence analysis
Extracting and analyzing the dependencies of a computation from its polyhedral model is a fundamental step toward loop optimization or scheduling.
Definition: For a given variable V and given indexes I1, I2, if the computation of X(I1) requires the value of X(I2), then I1 - I2 is called a dependence vector for variable V. Drawing all the dependence vectors within the computation polytope yields the so-called dependence diagram.
Example: the dependence vectors are (1, 0), (0, 1), (-1, 1).
Scheduling
Definition: The computation on the entire domain of a given loop can be performed following any valid schedule. A timing function tV for variable V yields a valid schedule if and only if t(x) > t(x - d) for all d in DV, (1) where DV is the set of all dependence vectors for variable V.
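As a hedged example of one of the transformations listed above (loop interchange; the code is an illustration assuming a row-major C array, not an example taken from the slides):

    #define N 1024

    /* Before: the inner loop walks a column, so consecutive accesses are
       N elements apart - poor locality for a row-major array. */
    void scale_before(double a[N][N], double s) {
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                a[i][j] = a[i][j] * s;
    }

    /* After loop interchange: unit-stride accesses and better data reuse.
       The interchange is legal here because the iterations are independent
       (no dependence vector is violated by swapping the loop order). */
    void scale_after(double a[N][N], double s) {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                a[i][j] = a[i][j] * s;
    }
-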
Foundations of Scientific Research
Foundations of Scientific Research. N. M. Glazunov, National Aviation University, 2012 (25.11.2012).
Contents: Preface; Introduction; 1. General notions about scientific research (SR): 1.1. Scientific method; 1.2. Basic research; 1.3. Information supply of scientific research. 2. Ontologies and upper ontologies: 2.1. Concepts of Foundations of Research Activities; 2.2. Ontology components; 2.3. Ontology for the visualization of a lecture. 3. Ontologies of object domains: 3.1. Elements of the ontology of spaces and symmetries; 3.1.1. Concepts of electrodynamics and classical gauge theory. 4. Examples of Research Activity: 4.1. Scientific activity in arithmetics, informatics and discrete mathematics; 4.2. Algebra of logic and functions of the algebra of logic; 4.3. Function of the algebra of logic. 5. Some Notions of the Theory of Finite and Discrete Sets. 6. Algebraic Operations and Algebraic Structures. 7. Elements of the Theory of Graphs and Nets. 8. Scientific activity on the example "Information and its investigation". 9. Scientific research in Artificial Intelligence. 10. Compilers and compilation. 11. Objective, Concepts and History of Computer security. 12. Methodological and categorical apparatus of scientific research. 13. Methodology and methods of scientific research: 13.1. Methods of theoretical level of research (13.1.1. Induction; 13.1.2. Deduction); 13.2. Methods of empirical level of research. 14. Scientific idea and significance of scientific research. 15. Forms of scientific knowledge organization and principles of SR: 15.1. Forms of scientific knowledge; 15.2. …
-
Hybrid Analysis of Memory References and Its Application to Automatic Parallelization
Hybrid Analysis of Memory References and Its Application to Automatic Parallelization. A Dissertation by Silvius Vasile Rus, B.S., Babes-Bolyai University. Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of Doctor of Philosophy, December 2006. Major Subject: Computer Science. Chair of Advisory Committee: Lawrence Rauchwerger. Committee Members: Nancy Amato, Narasimha Reddy, Vivek Sarin. Head of Department: Valerie Taylor.
Abstract: Executing sequential code in parallel on a multithreaded machine has been an elusive goal of the academic and industrial research communities for many years. It has recently become more important due to the widespread introduction of multicores in PCs. Automatic multithreading has not been achieved because classic, static compiler analysis was not powerful enough and program behavior was found to be, in many cases, input dependent. Speculative thread level parallelization was a welcome avenue for advancing parallelization coverage, but its performance was not always optimal due to the sometimes unnecessary overhead of checking every dynamic memory reference. In this dissertation we introduce a novel analysis technique, Hybrid Analysis, which unifies static and dynamic memory reference techniques into a seamless compiler framework which extracts almost maximum available parallelism from scientific codes and incurs close to the minimum necessary run time overhead.
-
Compiler Passes, Optimization, and the Role of the Optimizer
Compiler Passes
• Analysis of the input program (front end) and synthesis of the output program (back end).
(Figure: character stream → Lexical Analysis → token stream → Syntactic Analysis → abstract syntax tree → Semantic Analysis → annotated AST → Intermediate Code Generation → intermediate form → Optimization → intermediate form → Code Generation → target language.)
Optimization
• Before and after generating machine code, devote one or more passes over the program to "improve" code quality.
The Role of the Optimizer
• The compiler can implement a procedure in many ways; the optimizer tries to find an implementation that is "better": speed, code size, data space, …
• To accomplish this, it analyzes the code to derive knowledge about run-time behavior (data-flow analysis, pointer disambiguation, …; the general term is "static analysis") and uses that knowledge in an attempt to improve the code. Literally hundreds of transformations have been proposed, with a large amount of overlap between them.
• Nothing is "optimal" about optimization: proofs of optimality assume restrictive and unrealistic conditions; a better goal is to "usually improve".
Optimizations
• Identify inefficiencies in intermediate or target code and replace them with equivalent but better sequences; "equivalent" means "has the same externally visible behavior".
• Target-independent optimizations are best done on IL code; target-dependent optimizations are best done on target code.
Traditional Three-pass Compiler; The Optimizer (or Middle End)
(Figure: Source → Front End → IR → Middle End → IR → Back End → Machine code, with errors reported along the way; the middle end is itself a sequence of optimization passes Opt 1, Opt 2, Opt 3, …, Opt n over the IR.)
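As a small hedged example of an "equivalent but better sequence" (invented for illustration; the slides do not give this particular code), a target-independent rewrite might turn the first function into the second while preserving the externally visible behavior:

    /* Before: as a naive lowering might produce it. */
    unsigned twice_before(unsigned w) {
        unsigned t1 = w * 2;      /* multiply by a power of two */
        unsigned t2 = 0 * w;      /* always zero */
        return t1 + t2;
    }

    /* After: strength reduction plus dead-code elimination give an
       equivalent sequence with fewer, cheaper operations. */
    unsigned twice_after(unsigned w) {
        return w << 1;
    }
-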
CS 4120 Introduction to Compilers Andrew Myers Cornell University
CS 4120 Introduction to Compilers, Andrew Myers, Cornell University. Lecture 19: Introduction to Optimization, March 9, 2020.
Optimization
• Next topic: how to generate better code through optimization.
• This course covers the most valuable and straightforward optimizations; there is much more to learn. Other sources: Appel, the Dragon book, Muchnick (10 chapters of optimization techniques), Cooper and Torczon.
How fast can you go?
(Chart: approximate slowdowns spanning several orders of magnitude, from direct source code interpreters (parsing included!) at the slow end, through tokenized program interpreters (BASIC, Tcl) and AST interpreters (Perl 4), bytecode interpreters (Java, Perl 5, OCaml, Python) and call-threaded interpreters, pointer-threaded interpreters (FORTH) and simple code generation (PA4, JIT), down to naive assembly code, register allocation, local optimization, global optimization, and expert assembly code, with FPGA and GPU implementations faster still.)
Goal of optimization
• Compile clean, modular, high-level source code to expert assembly-code performance.
• Can't change the meaning of the program to behavior not allowed by the source.
• Different goals: space optimization (reduce memory use), time optimization (reduce execution time), power optimization (reduce power usage).
Why do we need optimization?
• Programmers may write suboptimal code for clarity.
• The source language may make it hard to avoid redundant computation: a[i][j] = a[i][j] + 1
• Architectural independence.
• Modern architectures assume optimization: it is hard to optimize by hand!
Where to optimize?
• Usual goal: improve time performance, but many optimizations trade off space vs. time.
• Example: loop unrolling replaces a loop body with N copies; increasing code space speeds up one loop but slows the rest of the program down a little (see the sketch below).
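As a hedged sketch of the loop unrolling trade-off mentioned in the last bullet (an invented example; the unroll factor of 4 is chosen arbitrarily):

    /* Original loop. */
    void inc_all(int *a, int n) {
        for (int i = 0; i < n; i++)
            a[i] = a[i] + 1;
    }

    /* Unrolled by 4: less loop overhead per element and more room for
       instruction-level parallelism, but roughly four times the code for
       this loop, plus a remainder loop when n is not a multiple of 4. */
    void inc_all_unrolled(int *a, int n) {
        int i = 0;
        for (; i + 3 < n; i += 4) {
            a[i]     = a[i]     + 1;
            a[i + 1] = a[i + 1] + 1;
            a[i + 2] = a[i + 2] + 1;
            a[i + 3] = a[i + 3] + 1;
        }
        for (; i < n; i++)        /* remainder iterations */
            a[i] = a[i] + 1;
    }
-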
Compiler Transformations for High-Performance Computing
Compiler Transformations for High-Performance Computing. David F. Bacon, Susan L. Graham, and Oliver J. Sharp. Computer Science Division, University of California, Berkeley, California 94720.
In the last three decades a large number of compiler transformations for optimizing programs have been implemented. Most optimizations for uniprocessors reduce the number of instructions executed by the program using transformations based on the analysis of scalar quantities and data-flow techniques. In contrast, optimizations for high-performance superscalar, vector, and parallel processors maximize parallelism and memory locality with transformations that rely on tracking the properties of arrays using loop dependence analysis. This survey is a comprehensive overview of the important high-level program restructuring techniques for imperative languages such as C and Fortran. Transformations for both sequential and various types of parallel architectures are covered in depth. We describe the purpose of each transformation, explain how to determine if it is legal, and give an example of its application. Programmers wishing to enhance the performance of their code can use this survey to improve their understanding of the optimizations that compilers can perform, or as a reference for techniques to be applied manually. Students can obtain an overview of optimizing compiler technology. Compiler writers can use this survey as a reference for most of the important optimizations developed to date, and as a bibliographic reference for the details of each optimization.
-
Compiler Construction
Compiler construction. (Collected articles; PDF generated using the open source mwlib toolkit, see http://code.pediapress.com/ for more information; generated 10 Dec 2011.)
Contents:
Introduction: Compiler construction; Compiler; Interpreter; History of compiler writing.
Lexical analysis: Lexical analysis; Regular expression; Regular expression examples; Finite-state machine; Preprocessor.
Syntactic analysis: Parsing; Lookahead; Symbol table; Abstract syntax; Abstract syntax tree; Context-free grammar; Terminal and nonterminal symbols; Left recursion; Backus–Naur Form; Extended Backus–Naur Form; TBNF; Top-down parsing; Recursive descent parser; Tail recursive parser; Parsing expression grammar; LL parser; LR parser; Parsing table; Simple LR parser; Canonical LR parser; GLR parser; LALR parser; Recursive ascent parser; Parser combinator; Bottom-up parsing; Chomsky normal form; CYK algorithm; Simple precedence grammar; Simple precedence parser; Operator-precedence grammar; Operator-precedence parser; Shunting-yard algorithm; Chart parser; Earley parser; The lexer hack; Scannerless parsing.
Semantic analysis: Attribute grammar; L-attributed grammar; LR-attributed grammar; S-attributed grammar; ECLR-attributed grammar; Intermediate language; Control flow graph; Basic block; Call graph; Data-flow analysis; Use-define chain; Live variable analysis; Reaching definition; Three address …
-
Codegeneratoren
Codegeneratoren. Sebastian Geiger (Twitter: @lanoxx, Email: [email protected]), January 26, 2015.
Abstract: This is a very short summary of the topics that are taught in "Codegeneratoren". Code generators are the backend part of a compiler that deal with the conversion of the machine-independent intermediate representation into machine-dependent instructions and the various optimizations on the resulting code. The first two sections deal with prerequisites that are required for the later parts. Section 1 discusses computer architectures in order to give a better understanding of how processors work and why different optimizations are possible. Section 2 discusses some of the higher-level optimizations that are done before the code generator does its work. Section 3 discusses instruction selection, which concerns the actual conversion of the intermediate representation into machine-dependent instructions. The later chapters all deal with different kinds of optimizations that are possible on the resulting code. I have written this document to help myself study for this subject. If you find this document useful or if you find errors or mistakes then please send me an email.
Contents:
1. Computer architectures: 1.1 Microarchitectures; 1.2 Instruction Set Architectures.
2. Optimizations: 2.1 Static Single Assignment; 2.2 Graph structures and memory dependencies; 2.3 Machine-independent optimizations.
3. Instruction selection.
4. Register allocation.
5. Coalescing.
6. Optimal register allocation using integer linear programming.
7. Interprocedural register allocation: 7.1 Global register allocation at link time (annotation based); 7.2 Register allocation across procedure and module boundaries (web).
8. Instruction scheduling: 8.1 Phase ordering problem: …