Introduction to Optimization • Muchnick Has 10 Chapters of Optimization Techniques • Cooper and Torczon Also Cover Optimization

Optimization
CS 4120 Introduction to Compilers, Ross Tate, Cornell University
Lecture 25: Introduction to Optimization

• Next topic: how to generate better code through optimization.
• This course covers the most valuable and straightforward optimizations – much more to learn!
• Other sources:
  – Muchnick has 10 chapters of optimization techniques
  – Cooper and Torczon also cover optimization

How fast can you go?
[Chart: relative execution time of implementation strategies, spanning several orders of magnitude from direct source code interpreters (slowest), through tokenized program interpreters (BASIC, Tcl), AST interpreters (CS 3110 RCL, Perl 4), bytecode interpreters (Java, Perl 5, OCaml), call-threaded interpreters, pointer-threaded interpreters (FORTH), and simple code generation (PA4, JIT), down to register allocation, local optimization, and global optimization; naive and expert assembly code serve as reference points.]

Goal of optimization
• Help programmers
  – clean, modular, high-level source code
  – but compile to assembly-code performance
• Optimizations are code transformations
  – can't change meaning of program to behavior not allowed by source
• Different kinds of optimization:
  – space optimization: reduce memory use
  – time optimization: reduce execution time
  – power optimization: reduce power usage

Why do we need optimization?
• Programmers may write suboptimal code to make it clearer.
• High-level language may make avoiding redundant computation inconvenient or impossible
  – a[i][j] = a[i][j] + 1
• Exploit patterns inexpressible at source level
• Clean up mess from translation stages
• Architectural independence
  – Modern architectures make it hard to hand optimize

Where to optimize?
• Usual goal: improve time performance
• But: many optimizations trade off space versus time
  – E.g.: loop unrolling replaces a loop body with N copies
  – Increasing code space slows the program down a little, speeds up one loop
  – Frequently executed code with long loops: space/time tradeoff is generally a win
  – Infrequently executed code: optimize code space at expense of time, saving instruction cache space
• Complex optimizations may never pay off!
• Ideal focus of optimization: program hot spots

Safety
• Possible opportunity for loop-invariant code motion:
    while (b) {
      z = y/x;   // x, y not assigned in loop
      ...
    }
• Transformation: move the invariant code out of the loop:
    z = y/x;
    while (b) {
      ...
    }
  Preserves meaning? Faster?
• Three aspects of an optimization:
  – the code transformation
  – safety of the transformation
  – performance improvement

Writing fast programs in practice
1. Pick the right algorithms and data structures: design for locality and few operations
2. Turn on optimization and profile to figure out program hot spots
3. Evaluate whether design works; if so…
4. Tweak source code until the optimizer does “the right thing” to machine code
   – understanding optimizers helps!
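
To make the safety question above concrete, here is a small, self-contained C sketch (an illustration added here, not part of the original slides; function names are invented) of why hoisting y/x is unsafe when the loop may execute zero times:

    /* If the loop body never runs, the original program never evaluates y/x,
       but the "hoisted" version always does. */
    #include <stdio.h>

    int original(int b, int x, int y) {
        int z = 0;
        while (b) {
            z = y / x;      /* never reached when b == 0, even if x == 0 */
            b = 0;
        }
        return z;           /* returns 0 when the loop does not execute */
    }

    int hoisted(int b, int x, int y) {
        int z = y / x;      /* evaluated unconditionally: divides by zero
                               (undefined behavior) when b == 0 and x == 0 */
        while (b) {
            b = 0;
        }
        return z;
    }

    int main(void) {
        printf("%d\n", original(0, 0, 7));   /* fine: prints 0 */
        /* hoisted(0, 0, 7) would divide by zero: the transformation changed
           observable behavior, so it is unsafe without a dominating guard or
           a proof that the loop executes at least once. */
        return 0;
    }
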
Structure of an optimization
• Optimization is a code transformation
• Applied at some stage of the compiler (HIR, MIR, LIR)
• In general requires some analysis:
  – safety analysis to determine where the transformation does not change meaning (e.g. live variable analysis)
  – cost analysis to determine where it ought to speed up code (e.g., which variable to spill)

When to apply optimization
[Diagram, reconstructed: optimizations attached to the IR level at which they are typically applied]
  – AST / HIR: inlining, specialization, constant folding, constant propagation, value numbering
  – Canonical IR / MIR: dead code elimination, loop-invariant code motion, common sub-expression elimination, strength reduction, constant folding & propagation, branch prediction/optimization
  – Abstract assembly / LIR: register allocation, loop unrolling, cache optimization
  – Assembly: peephole optimizations

Register allocation
• Goal: convert abstract assembly (infinite # of registers) into real assembly (6 registers)
    mov t1, t2              mov rax, rbx
    add t1, -4(rbp)         add rax, -4(rbp)
    mov t3, -8(rbp)    ⇒    mov rbx, -8(rbp)
    mov t4, t3
    cmp t1, t4              cmp rax, rbx
• Need to reuse registers aggressively (e.g., rbx)
• Coalesce registers (t3, t4) to eliminate mov's
• May be impossible without spilling some temporaries to stack

Constant folding
• Idea: if operands are known at compile time, evaluate at compile time when possible.
    int x = (2 + 3)*4*y;  ⇒  int x = 5*4*y;  ⇒  int x = 20*y;
• Can perform at every stage of compilation
• Constant expressions are created by translation and by optimization
    a[2]  ⇒  MEM(MEM(a) + 2*4)  ⇒  MEM(MEM(a) + 8)

Constant folding conditionals
    if (true) S           ⇒  S
    if (false) S          ⇒  ;
    if (true) S else S’   ⇒  S
    if (false) S else S’  ⇒  S’
    while (false) S       ⇒  ;
    while (true) ;        ⇒  ;  ????
    if (2 > 3) S  ⇒  if (false) S  ⇒  ;

Algebraic simplification
• More general form of constant folding: take advantage of simplification rules
    a * 1 ⇒ a        b | false ⇒ b        (identities)
    a + 0 ⇒ a        b & true ⇒ b
    a * 0 ⇒ 0        b & false ⇒ false    (zeroes)
    (a + 1) + 2 ⇒ a + (1 + 2) ⇒ a + 3     (reassociation)
    a * 4 ⇒ a shl 2
    a * 7 ⇒ (a shl 3) − a                 (strength reduction)
    a / 32767 ⇒ a shr 15 + a shr 30
• Must be careful with floating point and with overflow – algebraic identities may give wrong or less precise answers
  – E.g., (a+b)+c ≠ a+(b+c) in floating point if a, b small

Unreachable-code elimination
• Basic blocks not contained by any trace leading from the starting basic block are unreachable and can be eliminated
• Performed at canonical-IR or assembly-code levels
• Reductions in code size improve cache, TLB performance

Inlining
• Replace a function call with the body of the function:
    f(a: int): int = { b: int = 1; n: int = 0; while (n < a) (b = 2*b); return b; }
    g(x: int): int = { return 1 + f(x); }
  ⇒
    g(x: int): int = {
      fx: int; a: int = x;
      { b: int = 1; n: int = 0; while (n < a) (b = 2*b); fx = b; goto finish; }
      finish: return 1 + fx;
    }
• Best done on HIR
• Can inline methods, but more difficult – there can be only one f.
• May need to rename variables to avoid name capture – consider if f refers to a global variable x
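
The constant folding and algebraic simplification described above can be sketched as a bottom-up rewrite over an expression tree. The following minimal C program is an illustration added here (the Expr type and helpers are invented, not the course's IR):

    #include <stdio.h>
    #include <stdlib.h>

    typedef enum { CONST, VAR, ADD, MUL } Kind;

    typedef struct Expr {
        Kind kind;
        int value;                  /* CONST: the constant; VAR: variable id */
        struct Expr *left, *right;  /* ADD/MUL operands */
    } Expr;

    static Expr *mk(Kind k, int v, Expr *l, Expr *r) {
        Expr *e = malloc(sizeof *e);
        e->kind = k; e->value = v; e->left = l; e->right = r;
        return e;
    }

    /* Fold constant subexpressions bottom-up and apply two identities. */
    static Expr *fold(Expr *e) {
        if (e->kind == ADD || e->kind == MUL) {
            e->left  = fold(e->left);
            e->right = fold(e->right);
            if (e->left->kind == CONST && e->right->kind == CONST) {
                int v = (e->kind == ADD) ? e->left->value + e->right->value
                                         : e->left->value * e->right->value;
                return mk(CONST, v, NULL, NULL);   /* 2+3 => 5, 5*4 => 20 */
            }
            if (e->kind == MUL && e->right->kind == CONST && e->right->value == 1)
                return e->left;                    /* a*1 => a */
            if (e->kind == ADD && e->right->kind == CONST && e->right->value == 0)
                return e->left;                    /* a+0 => a */
        }
        return e;
    }

    int main(void) {
        /* (2 + 3) * 4 * y  =>  20 * y */
        Expr *y = mk(VAR, 0, NULL, NULL);
        Expr *e = mk(MUL, 0,
                     mk(MUL, 0, mk(ADD, 0, mk(CONST, 2, NULL, NULL),
                                           mk(CONST, 3, NULL, NULL)),
                                mk(CONST, 4, NULL, NULL)),
                     y);
        e = fold(e);
        printf("%d\n", e->left->value);   /* the folded left operand: 20 */
        return 0;
    }

A real compiler runs this kind of folding at several IR levels, as the diagram above suggests.
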
Constant propagation
• If the value of a variable is known to be a constant, replace uses of the variable with the constant
• Value of the variable must be propagated forward from the point of assignment
    int x = 5;
    int y = x*2;
    int z = a[y];   // = MEM(MEM(a) + y*4)
• Interleave with constant folding!

Dead-code elimination
• If the side effect of a statement can never be observed, the statement can be eliminated
    x = y*y;      // dead!
    …             // x unused      ⇒      x = z*z;
    x = z*z;
• Dead variable: a variable that is never read after its definition
    int i;
    while (m<n) ( m++; i = i+1 )    ⇒    while (m<n) (m++)
• Other optimizations create dead statements/variables

Copy propagation
• Given an assignment x = y, replace subsequent uses of x with y
• May make x a dead variable, resulting in dead code
• Need to determine where copies of y propagate to
    x = y                        x = y
    x = x * f(x - 1)      ⇒      x = y * f(y - 1)

Redundancy elimination
• Common-subexpression elimination (CSE) combines redundant computations
    a(i) = a(i) + 1
      ⇒  [[a]+i*4] = [[a]+i*4] + 1
      ⇒  t1 = [a] + i*4; [t1] = [t1] + 1
• Need to determine that the expression always has the same value in both places
    b[j] = a[i]+1; c[k] = a[i]    ⇏    t1 = a[i]; b[j] = t1+1; c[k] = t1

Loops
• Program hot spots are usually loops (exceptions: OS kernels, compilers)
• Most execution time in most programs is spent in loops: 90/10 is typical
• Loop optimizations are important, effective, and numerous

Loop-invariant code motion
• Another form of redundancy elimination
• If the result of a statement or expression does not change during the loop, and it has no externally visible side effect (!), its computation can be hoisted before the loop
• Often useful for array element addressing computations – invariant code not visible at source level
• Requires analysis to identify loop-invariant expressions

Loop-invariant code motion (example)
    for (i = 0; i < a.length; i++) {
      S   // a not assigned in S
    }
  Hoisted loop-invariant expression:
    t1 = a.length;
    for (i = 0; i < t1; i++) {
      S
    }

Strength reduction
• Replace expensive operations (*, /) by cheap ones (+, −) via a dependent induction variable
    for (int i = 0; i < n; i++) {          int j = 0;
      a[i*3] = 1;                    ⇒     for (int i = 0; i < n; i++) {
    }                                        a[j] = 1; j = j+3;
                                           }

Loop unrolling
• Branches are expensive – unroll the loop to avoid them:
    for (i = 0; i < n; i++) { S }
  ⇒
    for (i = 0; i < n−3; i += 4) { S; S; S; S; }
    for ( ; i < n; i++) S;
• Transformation is relatively easy
• Gets rid of ¾ of the conditional branches!
• Space-time tradeoff: not a good idea for large S or small n

Summary
• Many useful optimizations can transform code to make it faster/smaller/...
• The whole is greater than the sum of the parts: optimizations should be applied together, sometimes more than once, at different levels
• The hard problem: when are optimizations safe and when are they effective?
  – Data-flow analysis
  – Control-flow analysis
  – Pointer analysis
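
As a concrete illustration of the summary's point that optimizations compose (added here, not from the slides; names are invented), the following C program shows one function before optimization and the equivalent code after constant folding, constant propagation, dead-code elimination, and strength reduction have each been applied by hand:

    #include <stdio.h>

    int before(int n, int a[]) {
        int sum = 0;
        int scale = 2 + 2;             /* constant folding: scale = 4 */
        for (int i = 0; i < n; i++) {
            int x = scale;             /* constant propagation: x = 4 */
            int dead = x * x;          /* dead code: never read */
            sum = sum + a[i * 4] * x;  /* strength reduction target: i*4 */
        }
        return sum;
    }

    /* The same computation after the optimizations above. */
    int after(int n, int a[]) {
        int sum = 0;
        int j = 0;                     /* induction variable replacing i*4 */
        for (int i = 0; i < n; i++) {
            sum = sum + a[j] * 4;
            j += 4;
        }
        return sum;
    }

    int main(void) {
        int a[40];
        for (int i = 0; i < 40; i++) a[i] = i;
        printf("%d %d\n", before(10, a), after(10, a));   /* same value twice */
        return 0;
    }
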
Recommended publications
  • CS153: Compilers Lecture 19: Optimization
    CS153: Compilers – Lecture 19: Optimization. Stephen Chong, https://www.seas.harvard.edu/courses/cs153. Contains content from lecture notes by Steve Zdancewic and Greg Morrisett.

    Announcements
    • HW5: Oat v.2 out • Due in 2 weeks
    • HW6 will be released next week • Implementing optimizations! (and more)

    Today
    • Optimizations • Safety • Constant folding • Algebraic simplification • Strength reduction • Constant propagation • Copy propagation • Dead code elimination • Inlining and specialization • Recursive function inlining • Tail call elimination • Common subexpression elimination

    Optimizations
    • The code generated by our OAT compiler so far is pretty inefficient. • Lots of redundant moves. • Lots of unnecessary arithmetic instructions.
    • Consider this OAT program:
        int foo(int w) {
          var x = 3 + 5;
          var y = x * w;
          var z = y - 0;
          return z * 4;
        }

    Unoptimized vs. Optimized Output
    • Unoptimized output:
        .globl _foo
        _foo:
          pushl %ebp
          movl %esp, %ebp
          subl $64, %esp
        __fresh2:
          leal -64(%ebp), %eax
          movl %eax, -48(%ebp)
          movl 8(%ebp), %eax
          movl %eax, %ecx
          movl -48(%ebp), %eax
          movl %ecx, (%eax)
          movl $3, %eax
          movl %eax, -44(%ebp)
          movl $5, %eax
          movl %eax, %ecx
          addl %ecx, -44(%ebp)
          leal -60(%ebp), %eax
          movl %eax, -40(%ebp)
          movl -44(%ebp), %eax
          movl %eax, %ecx
          ...
    • Hand optimized code:
        _foo:
          shlq $5, %rdi
          movq %rdi, %rax
          ret
    • Function foo may be inlined by the compiler, so it can be implemented by just one instruction!

    Why do we need optimizations?
    • To help programmers… • They write modular, clean, high-level programs • Compiler generates efficient, high-performance assembly
    • Programmers don’t write optimal code
    • High-level languages make avoiding redundant computation inconvenient or impossible • e.g.
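
    A worked note added here (not part of the quoted CS153 notes): the single-shift version of foo above follows from the optimizations this page discusses; the C translation below is illustrative only.

        #include <stdio.h>

        int foo(int w) {
            int x = 3 + 5;     /* constant folding:     x = 8                  */
            int y = x * w;     /* constant propagation: y = 8 * w              */
            int z = y - 0;     /* algebraic identity:   z = y                  */
            return z * 4;      /* (8*w)*4 = 32*w, strength-reduced to w << 5,
                                  i.e. the shlq $5, %rdi instruction above     */
        }

        int main(void) {
            printf("%d %d\n", foo(3), 3 << 5);   /* both print 96 */
            return 0;
        }
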
  • Control Flow Analysis in Scheme
    Control Flow Analysis in Scheme. Olin Shivers, Carnegie Mellon University, [email protected]

    Abstract: Traditional flow analysis techniques, such as the ones typically employed by optimising Fortran compilers, do not work for Scheme-like languages. This paper presents a flow analysis technique, control flow analysis, which is applicable to Scheme-like languages. As a demonstration application, the information gathered by control flow analysis is used to perform a traditional flow analysis problem, induction variable elimination. Extensions and limitations are discussed. The techniques presented in this paper are backed up by working code. They are applicable not only to Scheme, but also to related languages, such as Common Lisp and ML.

    1 The Task. Flow analysis is a traditional optimising compiler technique for determining useful information about a program at compile […] Fortran). Representative texts describing these techniques are [Dragon], and in more detail, [Hecht]. Flow analysis is perhaps the chief tool in the optimising compiler writer's bag of tricks; an incomplete list of the problems that can be addressed with flow analysis includes global constant subexpression elimination, loop invariant detection, redundant assignment detection, dead code elimination, constant propagation, range analysis, code hoisting, induction variable elimination, copy propagation, live variable analysis, loop unrolling, and loop jamming. However, these traditional flow analysis techniques have never successfully been applied to the Lisp family of computer languages. This is a curious omission. The Lisp community has had sufficient time to consider the problem. Flow analysis dates back at least to 1960 ([Dragon], pp. 516), and Lisp is one of the oldest computer programming languages currently in use, rivalled only by Fortran and COBOL. Indeed, the Lisp community has long been concerned with the execution speed of their programs.
  • Language and Compiler Support for Dynamic Code Generation by Massimiliano A
    Language and Compiler Support for Dynamic Code Generation, by Massimiliano A. Poletto. S.B., Massachusetts Institute of Technology (1995); M.Eng., Massachusetts Institute of Technology (1995). Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY, September 1999. © Massachusetts Institute of Technology 1999. All rights reserved. Author: Department of Electrical Engineering and Computer Science, June 23, 1999. Certified by: M. Frans Kaashoek, Associate Professor of Electrical Engineering and Computer Science, Thesis Supervisor. Accepted by: Arthur C. Smith, Chairman, Departmental Committee on Graduate Students.

    Abstract: Dynamic code generation, also called run-time code generation or dynamic compilation, is the creation of executable code for an application while that application is running. Dynamic compilation can significantly improve the performance of software by giving the compiler access to run-time information that is not available to a traditional static compiler. A well-designed programming interface to dynamic compilation can also simplify the creation of important classes of computer programs. Until recently, however, no system combined efficient dynamic generation of high-performance code with a powerful and portable language interface. This thesis describes a system that meets these requirements, and discusses several applications of dynamic compilation.
  • Strength Reduction of Induction Variables and Pointer Analysis – Induction Variable Elimination
    Loop Optimizations and Pointer Analysis. CS 412/413 Spring 2008, Introduction to Compilers.

    Loop optimizations
    • Optimize loops
      – Loop invariant code motion [last time]
      – Strength reduction of induction variables
      – Induction variable elimination

    Strength Reduction
    • Basic idea: replace expensive operations (multiplications) with cheaper ones (additions) in definitions of induction variables
      Original loop:
        while (i<10) {
          j = 3*i+1;   // <i,3,1>
          a[j] = a[j] - 2;
          i = i+2;
        }
      After strength reduction:
        s = 3*i+1;
        while (i<10) {
          j = s;
          a[j] = a[j] - 2;
          i = i+2;
          s = s+6;
        }
    • Benefit: cheaper to compute s = s+6 than j = 3*i
      – s = s+6 requires an addition
      – j = 3*i requires a multiplication

    Induction Variables
    • An induction variable is a variable in a loop whose value is a function of the loop iteration number: v = f(i)
    • In compilers, this is a linear function: f(i) = c*i + d
    • Observation: linear combinations of linear functions are linear functions
      – Consequence: linear combinations of induction variables are induction variables

    Families of Induction Variables
    • Basic induction variable: a variable whose only definition in the loop body is of the form i = i + c, where c is a loop-invariant value
    • Derived induction variables: Each basic induction variable i defines

    Representation
    • Representation of induction variables in family i by triples:
      – Denote basic induction variable i by <i, 1, 0>
      – Denote induction variable k = i*a + b by triple <i, a, b>
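
    A runnable C sketch added here (not from the quoted lecture) of the strength reduction shown above; it checks that the reduced loop, which replaces the per-iteration multiplication with an addition, produces the same result:

        #include <stdio.h>

        void original(int a[]) {
            int i = 0;
            while (i < 10) {
                int j = 3*i + 1;   /* derived induction variable <i, 3, 1> */
                a[j] = a[j] - 2;
                i = i + 2;
            }
        }

        void reduced(int a[]) {
            int i = 0;
            int s = 3*i + 1;       /* initialized once, before the loop */
            while (i < 10) {
                int j = s;
                a[j] = a[j] - 2;
                i = i + 2;
                s = s + 6;         /* 3 * (step of i): an addition only */
            }
        }

        int main(void) {
            int a[32] = {0}, b[32] = {0};
            original(a);
            reduced(b);
            for (int k = 0; k < 32; k++)
                if (a[k] != b[k]) { printf("mismatch\n"); return 1; }
            printf("identical\n");
            return 0;
        }
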
  • Lecture Notes on Peephole Optimizations and Common Subexpression Elimination
    Lecture Notes on Peephole Optimizations and Common Subexpression Elimination. 15-411: Compiler Design, Frank Pfenning and Jan Hoffmann, Lecture 18, October 31, 2017.

    1 Introduction. In this lecture, we discuss common subexpression elimination and a class of optimizations that is called peephole optimizations. The idea of common subexpression elimination is to avoid performing the same operation twice by replacing (binary) operations with variables. To ensure that these substitutions are sound we introduce dominance, which ensures that substituted variables are always defined. Peephole optimizations are optimizations that are performed locally on a small number of instructions. The name is inspired by the picture that we look at the code through a peephole and make optimizations that only involve the small amount of code we can see and that are independent of the rest of the program. There is a large number of possible peephole optimizations. The LLVM compiler implements for example more than 1000 peephole optimizations [LMNR15]. In this lecture, we discuss three important and representative peephole optimizations: constant folding, strength reduction, and null sequences.

    2 Constant Folding. Optimizations have two components: (1) a condition under which they can be applied and (2) the code transformation itself. The optimization of constant folding is a straightforward example of this. The code transformation itself replaces a binary operation with a single constant, and applies whenever c1 ⊕ c2 is defined:

        l : x ← c1 ⊕ c2   ⟶   l : x ← c   (where c = c1 ⊕ c2)
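
    A minimal sketch added here (invented, not from the quoted notes) of peephole rewriting applied to a single three-address instruction x ← a OP b, covering the three rule classes named above: constant folding, strength reduction, and null sequences.

        #include <stdbool.h>
        #include <stdio.h>

        typedef struct {
            char op;               /* '+' or '*' */
            bool a_const, b_const; /* which operands are known constants */
            int  a, b;             /* operand values when constant */
        } Instr;

        typedef enum { FOLDED, SHIFT, COPY, UNCHANGED } Rewrite;

        /* Inspect one instruction and report which peephole rule applies. */
        Rewrite peephole(const Instr *ins, int *out_const, int *out_shift) {
            if (ins->a_const && ins->b_const) {                /* constant folding   */
                *out_const = (ins->op == '+') ? ins->a + ins->b : ins->a * ins->b;
                return FOLDED;                                 /* x <- c             */
            }
            if (ins->op == '*' && ins->b_const && ins->b == 8) {
                *out_shift = 3;                                /* strength reduction */
                return SHIFT;                                  /* x <- a << 3        */
            }
            if (ins->op == '+' && ins->b_const && ins->b == 0)
                return COPY;                                   /* null sequence: x <- a */
            return UNCHANGED;
        }

        int main(void) {
            Instr i1 = {'+', true,  true, 2, 3};   /* x <- 2 + 3  =>  x <- 5      */
            Instr i2 = {'*', false, true, 0, 8};   /* x <- a * 8  =>  x <- a << 3 */
            Instr i3 = {'+', false, true, 0, 0};   /* x <- a + 0  =>  x <- a      */
            int c = 0, sh = 0;
            printf("%d\n", peephole(&i1, &c, &sh) == FOLDED && c == 5);
            printf("%d\n", peephole(&i2, &c, &sh) == SHIFT && sh == 3);
            printf("%d\n", peephole(&i3, &c, &sh) == COPY);
            return 0;
        }

    Here 8 stands in for any power of two; a real pass would test that property and handle more operators.
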
  • Efficient Symbolic Analysis for Optimizing Compilers*
    Efficient Symbolic Analysis for Optimizing Compilers. Robert A. van Engelen, Dept. of Computer Science, Florida State University, Tallahassee, FL 32306-4530, [email protected]

    Abstract. Because most of the execution time of a program is typically spent in loops, loop optimization is the main target of optimizing and restructuring compilers. An accurate determination of induction variables and dependencies in loops is of paramount importance to many loop optimization and parallelization techniques, such as generalized loop strength reduction, loop parallelization by induction variable substitution, and loop-invariant expression elimination. In this paper we present a new method for induction variable recognition. Existing methods are either ad-hoc and not powerful enough to recognize some types of induction variables, or existing methods are powerful but not safe. The most powerful method known is the symbolic differencing method as demonstrated by the Parafrase-2 compiler on parallelizing the Perfect Benchmarks(R). However, symbolic differencing is inherently unsafe and a compiler that uses this method may produce incorrectly transformed programs without issuing a warning. In contrast, our method is safe, simpler to implement in a compiler, better adaptable for controlling loop transformations, and recognizes a larger class of induction variables.

    1 Introduction. It is well known that the optimization and parallelization of scientific applications by restructuring compilers requires extensive analysis of induction variables and dependencies
  • CSE 401/M501 – Compilers
    CSE 401/M501 – Compilers: Dataflow Analysis. Hal Perkins, UW, Autumn 2018.

    Agenda
    • Dataflow analysis: a framework and algorithm for many common compiler analyses
    • Initial example: dataflow analysis for common subexpression elimination
    • Other analysis problems that work in the same framework
    • Some of these are the same optimizations we’ve seen, but more formally and with details

    Common Subexpression Elimination
    • Goal: use dataflow analysis to find common subexpressions
    • Idea: calculate available expressions at beginning of each basic block
    • Avoid re-evaluation of an available expression – use a copy operation
      – Simple inside a single block; more complex dataflow analysis used across blocks
    [Example CFG, reconstructed from the slide: A: m = a + b; n = a + b.  B: p = c + d; r = c + d.  C: q = a + b; r = c + d.  D: e = b + 18; s = a + b; u = e + f.  E: e = a + 17; t = c + d; u = e + f.  F: v = a + b; w = c + d; x = e + f.  G: y = a + b; z = c + d.]

    “Available” and Other Terms
    • An expression e is defined at point p in the CFG if its value is computed at p
      – Sometimes called definition site (e.g., a+b is defined at t1 = a + b)
    • An expression e is killed at point p if one of its operands is defined at p
      – Sometimes called kill site (e.g., a+b is killed at b = 7)
    • An expression e is available at point p if every path leading to p contains a prior definition of e and e is not killed between that definition and p (e.g., a+b is available at t10 = a + b)

    Available Expression Sets
    • To compute available expressions, for each block
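
    A small C illustration added here (not from the quoted slides) of the available/killed distinction: a+b stays available, so its second evaluation can reuse a temporary, while c+d is killed when c is reassigned and must be recomputed.

        #include <stdio.h>

        int before_cse(int a, int b, int c, int d) {
            int m = a + b;    /* a+b becomes available                */
            int p = c + d;    /* c+d becomes available                */
            c = 7;            /* kills c+d: an operand was redefined  */
            int n = a + b;    /* redundant: a+b is still available    */
            int r = c + d;    /* not redundant: c+d was killed above  */
            return m + p + n + r;
        }

        int after_cse(int a, int b, int c, int d) {
            int t1 = a + b;   /* single evaluation of a+b ...         */
            int m = t1;
            int p = c + d;
            c = 7;
            int n = t1;       /* ... reused via a copy                */
            int r = c + d;    /* recomputed because c+d was killed    */
            return m + p + n + r;
        }

        int main(void) {
            printf("%d %d\n", before_cse(1, 2, 3, 4), after_cse(1, 2, 3, 4));
            return 0;
        }
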
  • Code Optimizations Recap Peephole Optimization
    Code Optimizations. Amey Karkare, Dept of Computer Science and Engg, IIT Kanpur, visiting IIT Bombay. [email protected], [email protected]. https://www.cse.iitb.ac.in/~karkare/cs618/

    Program Analysis Recap
    • Optimizations
      – To improve efficiency of generated executable (time, space, resources …)
      – Maintain semantic equivalence
    • Two levels
      – Machine Independent
      – Machine Dependent

    Peephole Optimization
    • target code often contains redundant instructions and suboptimal constructs
    • examine a short sequence of target instructions (peephole) and replace by a shorter or faster sequence
    • peephole is a small moving window on the target systems

    Peephole optimization examples: Redundant loads and stores
    • Consider the code sequence
        Move R0, a
        Move a, R0
    • Instruction 2 can always be removed if it does not have a label.

    Peephole optimization examples: Unreachable code
    • Consider the following code sequence
        int debug = 0
        if (debug) { print debugging info }
      This may be translated as
        if debug == 1 goto L1
        goto L2
        L1: print debugging info
        L2:
      Eliminate jumps:
        if debug != 1 goto L2
        print debugging information
        L2:
      With constant propagation the test becomes a comparison of constants:
        if 0 <> 1 goto L2
        print debugging information
        L2:
      Evaluate the boolean expression. Since the condition is always true the code becomes
        goto L2
        print debugging information
        L2:
      The print statement is now unreachable. Therefore, the code becomes
        L2:

    Peephole optimization examples…
    • flow of control: replace jump over jumps
    • Strength reduction
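
    A compilable C version of the debug-flag example, added here for illustration (not from the quoted slides): once constant propagation substitutes the known value of debug, the guard folds to false, the call becomes unreachable, and the whole block can be deleted.

        #include <stdio.h>

        static const int debug = 0;

        static void print_debug_info(void) { fprintf(stderr, "debug\n"); }

        int before_opt(int x) {
            if (debug) {              /* constant propagation: if (0)        */
                print_debug_info();   /* unreachable once the test is folded */
            }
            return x + 1;
        }

        int after_opt(int x) {
            return x + 1;             /* the guarded block has been removed  */
        }

        int main(void) {
            printf("%d %d\n", before_opt(41), after_opt(41));   /* 42 42 */
            return 0;
        }
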
  • Compiler-Based Code-Improvement Techniques
    Compiler-Based Code-Improvement Techniques. KEITH D. COOPER, KATHRYN S. MCKINLEY, and LINDA TORCZON. Since the earliest days of compilation, code quality has been recognized as an important problem [18]. A rich literature has developed around the issue of improving code quality. This paper surveys one part of that literature: code transformations intended to improve the running time of programs on uniprocessor machines. This paper emphasizes transformations intended to improve code quality rather than analysis methods. We describe analytical techniques and specific data-flow problems to the extent that they are necessary to understand the transformations. Other papers provide excellent summaries of the various sub-fields of program analysis. The paper is structured around a simple taxonomy that classifies transformations based on how they change the code. The taxonomy is populated with example transformations drawn from the literature. Each transformation is described at a depth that facilitates broad understanding; detailed references are provided for deeper study of individual transformations. The taxonomy provides the reader with a framework for thinking about code-improving transformations. It also serves as an organizing principle for the paper. Copyright 1998, all rights reserved. You may copy this article for your personal use in Comp 512. Further reproduction or distribution requires written permission from the authors. 1 INTRODUCTION. This paper presents an overview of compiler-based methods for improving the run-time behavior of programs — often mislabeled code optimization. These techniques have a long history in the literature. For example, Backus makes it quite clear that code quality was a major concern to the implementors of the first Fortran compilers [18].
  • Redundant Computation Elimination Optimizations
    Redundant Computation Elimination Optimizations. CS2210 Compiler Design 2004/5, Lecture 20.

    Redundancy Elimination
    ■ Several categories:
    ■ Value Numbering – local & global
    ■ Common subexpression elimination (CSE) – local & global
    ■ Loop-invariant code motion
    ■ Partial redundancy elimination – most complex – subsumes CSE & loop-invariant code motion

    Value Numbering
    ■ Goal: identify expressions that have same value
    ■ Approach: hash expression to a hash code
    ■ Then have to compare only expressions that hash to same value

    Example
        a := x | y
        b := x | y
        t1 := !z
        if t1 goto L1
        x := !z
        c := x & y
        t2 := x & y
        if t2 trap 30

    Global Value Numbering
    ■ Generalization of value numbering for procedures
    ■ Requires code to be in SSA form
    ■ Crucial notion: congruence
    ■ Two variables are congruent if their defining computations have identical operators and congruent operands
    ■ Only precise in SSA form, otherwise the definition may not dominate the use

    Value Graphs
    ■ Labeled, directed graph
    ■ Nodes are labeled with operators, function symbols, and constants
    ■ Edges point from an operator (or function) to its operands
    ■ Number labels indicate operand position

    Example
    [Reconstructed from the garbled figure: an SSA-form loop. entry: receive n1(val). B1: i1 := 1; j1 := 1. B2: i3 := φ(i1, i2); j3 := φ(j1, j2); test i3 mod 2 = 0. B3: i4 := i3 + 1; j4 := j3 + 1. B4: i5 := i3 + 3; j5 := j3 + 3. B5: i2 := φ(i4, i5); j2 := φ(j4, j5); test j2 > n1; exit.]

    Congruence Definition
    ■ Maximal relation on the value graph
    ■ Two nodes are congruent iff they are the same node, or their labels are constants and the constants are the same, or they have the same operators and their operands are congruent
    ■ Variable equivalence: x and y are equivalent at program point p iff they are congruent and their defining assignments dominate p

    Global Value Numbering Algo.
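
    A runnable C rendering added here (not from the quoted lecture) of the first value-numbering example above, with comments showing which right-hand sides receive the same value number and can therefore be replaced by copies:

        #include <stdio.h>

        int original(int x, int y, int z) {
            int a = x | y;
            int b = x | y;         /* same value number as a     */
            int t1 = !z;
            if (t1) return a + b;  /* stands in for "goto L1"    */
            x = !z;                /* same value number as t1    */
            int c = x & y;
            int t2 = x & y;        /* same value number as c     */
            if (t2) return -1;     /* stands in for "trap 30"    */
            return a + b + c + t2;
        }

        int numbered(int x, int y, int z) {
            int a = x | y;
            int b = a;             /* reuse the value of x | y   */
            int t1 = !z;
            if (t1) return a + b;
            x = t1;                /* reuse the value of !z      */
            int c = x & y;
            int t2 = c;            /* reuse the value of x & y   */
            if (t2) return -1;
            return a + b + c + t2;
        }

        int main(void) {
            printf("%d %d\n", original(5, 3, 1), numbered(5, 3, 1));   /* 14 14 */
            printf("%d %d\n", original(8, 3, 0), numbered(8, 3, 0));   /* 22 22 */
            return 0;
        }
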
  • Hybrid Analysis of Memory References And
    Hybrid Analysis of Memory References and Its Application to Automatic Parallelization. A Dissertation by Silvius Vasile Rus, submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of Doctor of Philosophy, December 2006. Major Subject: Computer Science. Approved by: Chair of Committee, Lawrence Rauchwerger; Committee Members, Nancy Amato, Narasimha Reddy, Vivek Sarin; Head of Department, Valerie Taylor.

    ABSTRACT. Hybrid Analysis of Memory References and Its Application to Automatic Parallelization. (December 2006) Silvius Vasile Rus, B.S., Babes-Bolyai University. Chair of Advisory Committee: Dr. Lawrence Rauchwerger. Executing sequential code in parallel on a multithreaded machine has been an elusive goal of the academic and industrial research communities for many years. It has recently become more important due to the widespread introduction of multicores in PCs. Automatic multithreading has not been achieved because classic, static compiler analysis was not powerful enough and program behavior was found to be, in many cases, input dependent. Speculative thread level parallelization was a welcome avenue for advancing parallelization coverage but its performance was not always optimal due to the sometimes unnecessary overhead of checking every dynamic memory reference. In this dissertation we introduce a novel analysis technique, Hybrid Analysis, which unifies static and dynamic memory reference techniques into a seamless compiler framework which extracts almost maximum available parallelism from scientific codes and incurs close to the minimum necessary run time overhead.
  • Optimization Compiler Passes Optimizations the Role of The
    Compiler Passes
    [Diagram, reconstructed: a compiler organized as a sequence of passes. The analysis of the input program (front end) takes the character stream through Lexical Analysis to a token stream, Syntactic Analysis to an abstract syntax tree, and Semantic Analysis to an annotated AST; the synthesis of the output program (back end) performs Intermediate Code Generation to an intermediate form, Optimization over the intermediate form, and Code Generation to the target language.]

    Optimization
    • Before and after generating machine code, devote one or more passes over the program to “improve” code quality

    The Role of the Optimizer
    • The compiler can implement a procedure in many ways
    • The optimizer tries to find an implementation that is “better”
      – Speed, code size, data space, …
    • To accomplish this, it
      – Analyzes the code to derive knowledge about run-time behavior (data-flow analysis, pointer disambiguation, …); the general term is “static analysis”
      – Uses that knowledge in an attempt to improve the code; literally hundreds of transformations have been proposed, with a large amount of overlap between them
    • Nothing “optimal” about optimization
      – Proofs of optimality assume restrictive & unrealistic conditions
      – Better goal is to “usually improve”

    Optimizations
    • Identify inefficiencies in intermediate or target code
    • Replace with equivalent but better sequences
    • equivalent = “has the same externally visible behavior”
    • Target-independent optimizations best done on IL code
    • Target-dependent optimizations best done on target code

    Traditional Three-pass Compiler / The Optimizer (or Middle End)
    [Diagram, reconstructed: Source code → Front End → IR → Middle End → IR → Back End → Machine code, with errors reported along the way; the middle end itself is a chain of IR-to-IR optimization passes Opt 1, Opt 2, Opt 3, …, Opt n.]
    Modern
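
    A minimal sketch added here (invented, not from the quoted slides) of the "middle end" as a sequence of IR-to-IR passes run to a fixed point: each pass reports whether it changed anything, and the driver keeps iterating while any pass makes progress. The IR struct and pass bodies are placeholders.

        #include <stdbool.h>
        #include <stdio.h>

        typedef struct { int value; bool folded; } IR;   /* stand-in for a real IR */

        typedef bool (*Pass)(IR *ir);                    /* returns true if IR changed */

        static bool fold_constants(IR *ir) {
            if (!ir->folded) { ir->folded = true; return true; }
            return false;
        }

        static bool remove_dead_code(IR *ir) {
            if (ir->value > 10) { ir->value = 10; return true; }
            return false;
        }

        static void run_pipeline(IR *ir, Pass passes[], int n) {
            bool changed = true;
            while (changed) {              /* rerun passes until nothing changes */
                changed = false;
                for (int i = 0; i < n; i++)
                    changed |= passes[i](ir);
            }
        }

        int main(void) {
            IR ir = { 42, false };
            Pass pipeline[] = { fold_constants, remove_dead_code };
            run_pipeline(&ir, pipeline, 2);
            printf("%d %d\n", ir.value, ir.folded);   /* 10 1 */
            return 0;
        }
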