15-745: Optimizing Compilers for Modern Architectures Course

15-745: Optimizing Compilers for Modern Architectures Course

Course Logistics • Want to get off the waitlist? – See me in my office (GHC 7221) after class to discuss – This course is not intended to be your first compiler course • Let Dominic know if can’t get on Piazza or Blackboard for this course 15-745: Optimizing Compilers for Modern Architectures • Need to get the book Lecture 1: Introduction What would you get out of this course? Structure of a Compiler Optimization Example • Let’s run through the course webpage at http://www.cs.cmu.edu/~15745/ Carnegie Mellon Carnegie Mellon Phillip B. Gibbons 15-745: Introduction 1 15-745: Introduction 2 Phillip B. Gibbons What Do Compilers Do? How Can the Compiler Improve Performance? Execution time = Operation count * Machine cycles per operation 1. Translate one language into another – e.g., convert C++ into x86 object code • Minimize the number of operations – difficult for “natural” languages, but feasible for computer languages – arithmetic operations, memory accesses • Replace expensive operations with simpler ones – e.g., replace 4-cycle multiplication with 1-cycle shift Processor 2. Improve (i.e. “optimize”) the code • Minimize cache misses – e.g., make the code run 3 times faster – both data and instruction accesses cache • or more energy efficient, more robust, etc. • Perform work in parallel memory – driving force behind modern processor design – instruction scheduling within a thread – parallel execution across multiple threads More accurately, machine cycles per operation must account for instruction overlap Carnegie Mellon Carnegie Mellon 15-745: Introduction 3 15-745: Introduction 4 1 What Would You Get Out of This Course? II. Structure of a Compiler • Basic knowledge of existing compiler optimizations Source Code Intermediate Form Object Code • Hands-on experience in constructing optimizations within a fully functional C x86Alpha research compiler C++ Front Optimizer Back SPARCARM • Basic principles and theory for the development of new optimizations End End Java SPARCx86 Verilog MIPSIA-64 • Optimizations are performed on an “intermediate form” – similar to a generic RISC instruction set • Allows easy portability to multiple source languages, target machines Carnegie Mellon Carnegie Mellon 15-745: Introduction 5 15-745: Introduction 6 Ingredients in a Compiler Optimization Ingredients in a Compiler Optimization • Formulate optimization problem • Formulate optimization problem – Identify opportunities of optimization – Identify opportunities of optimization • applicable across many programs • applicable across many programs • affect key parts of the program (loops/recursions) • affect key parts of the program (loops/recursions) • amenable to “efficient enough” algorithm • amenable to “efficient enough” algorithm • Representation • Representation – Must abstract essential details relevant to optimization – Must abstract essential details relevant to optimization Mathematical • Analysis Programs Model – Detect when it is desirable and safe to apply transformation graphs • Code Transformation static statements abstraction matrices dynamic execution integer programs relations • Experimental Evaluation (and repeat process) generated code solutions Carnegie Mellon Carnegie Mellon 15-745: Introduction 7 15-745: Introduction 8 2 Representation: Instructions III. Optimization Example • Three-address code • Bubblesort program that sorts an array A that is allocated in static storage: A := B op C – an element of A requires four bytes of a byte-addressed machine • LHS: name of variable e.g. x, A[t] (address of A + contents of t) – elements of A are numbered 1 through n (n is a variable) • RHS: value – A[j] is in location &A+4*(j-1) • Typical instructions FOR i := n-1 DOWNTO 1 DO A := B op C FOR j := 1 TO i DO A := unaryop B A := B IF A[j]> A[j+1] THEN BEGIN temp := A[j]; GOTO s A[j] := A[j+1]; IF A relop B GOTO s A[j+1] := temp CALL f END RETURN Carnegie Mellon Carnegie Mellon 15-745: Introduction 9 15-745: Introduction 10 Translated Code Representation: a Basic Block i := n-1 t8 :=j-1 • Basic block = a sequence of 3-address statements S5: if i<1 goto s1 t9 := 4*t8 j := 1 temp := A[t9] ;A[j] – only the first statement can be reached from outside the block s4: if j>i goto s2 t10 := j+1 (no branches into middle of block) t1 := j-1 t11:= t10-1 – all the statements are executed consecutively if the first one is t2 := 4*t1 t12 := 4*t11 (no branches out or halts except perhaps at end of block) t3 := A[t2] ;A[j] t13 := A[t12] ;A[j+1] t4 := j+1 t14 := j-1 t5 := t4-1 t15 := 4*t14 • We require basic blocks to be maximal t6 := 4*t5 A[t15] := t13 ;A[j]:=A[j+1] – they cannot be made larger without violating the conditions t7 := A[t6] ;A[j+1] t16 := j+1 if t3<=t7 goto s3 t17 := t16-1 t18 := 4*t17 • Optimizations within a basic block are local optimizations FOR i := n-1 DOWNTO 1 DO A[t18]:=temp ;A[j+1]:=temp s3: j := j+1 FOR j := 1 TO i DO goto S4 IF A[j]> A[j+1] THEN BEGIN S2: i := i-1 temp := A[j]; goto s5 A[j] := A[j+1]; s1: A[j+1] := temp END Carnegie Mellon Carnegie Mellon 15-745: Introduction 11 15-745: Introduction 12 3 Flow Graphs Find the Basic Blocks • Nodes: basic blocks i := n-1 t8 :=j-1 S5: if i<1 goto s1 t9 := 4*t8 j := 1 temp := A[t9] ;A[j] • Edges: Bi -> Bj, iff Bj can follow Bi immediately in some execution s4: if j>i goto s2 t10 := j+1 – Either first instruction of B is target of a goto at end of B t1 := j-1 t11:= t10-1 j i t2 := 4*t1 t12 := 4*t11 – Or, Bj physically follows Bi, which does not end in an unconditional goto. t3 := A[t2] ;A[j] t13 := A[t12] ;A[j+1] t4 := j+1 t14 := j-1 • The block led by first statement of the program is the start, or entry node. t5 := t4-1 t15 := 4*t14 t6 := 4*t5 A[t15] := t13 ;A[j]:=A[j+1] t7 := A[t6] ;A[j+1] t16 := j+1 if t3<=t7 goto s3 t17 := t16-1 t18 := 4*t17 A[t18]:=temp ;A[j+1]:=temp s3: j := j+1 goto S4 S2: i := i-1 goto s5 s1: Carnegie Mellon Carnegie Mellon 15-745: Introduction 13 15-745: Introduction 14 Basic Blocks from Example Sources of Optimizations in • Algorithm optimization B1 i := n-1 • Algebraic optimization B2 if i<1 goto out out A := B+0 => A := B i := i-1 B3 j := 1 B5 goto B2 • Local optimizations – within a basic block -- across instructions if j>i goto B5 • Global optimizations B4 t8 :=j-1 B7 ... – within a flow graph -- across basic blocks A[t18]=temp • Interprocedural analysis t1 := j-1 – within a program -- across procedures (flow graphs) B6 ... if t3<=t7 goto B8 j := j+1 B8 goto B4 Carnegie Mellon Carnegie Mellon 15-745: Introduction 15 15-745: Introduction 16 4 Local Optimizations Example • Analysis & transformation performed within a basic block i := n-1 t8 :=j-1 S5: if i<1 goto s1 t9 := 4*t8 • No control flow information is considered j := 1 temp := A[t9] ;A[j] • Examples of local optimizations: s4: if j>i goto s2 t10 := j+1 • local common subexpression elimination t1 := j-1 t11:= t10-1 analysis: same expression evaluated more than once in b. t2 := 4*t1 t12 := 4*t11 transformation: replace with single calculation t3 := A[t2] ;A[j] t13 := A[t12] ;A[j+1] t4 := j+1 t14 := j-1 t5 := t4-1 t15 := 4*t14 t6 := 4*t5 A[t15] := t13 ;A[j]:=A[j+1] t7 := A[t6] ;A[j+1] t16 := j+1 • local constant folding or elimination if t3<=t7 goto s3 t17 := t16-1 analysis: expression can be evaluated at compile time t18 := 4*t17 transformation: replace by constant, compile-time value A[t18]:=temp ;A[j+1]:=temp s3: j := j+1 goto S4 S2: i := i-1 • dead code elimination goto s5 s1: Carnegie Mellon Carnegie Mellon 15-745: Introduction 17 15-745: Introduction 18 Example (Intraprocedural) Global Optimizations B1: i := n-1 B7: t8 :=j-1 • Global versions of local optimizations B2: if i<1 goto out t9 := 4*t8 – global common subexpression elimination B3: j := 1 temp := A[t9] ;temp:=A[j] – global constant propagation B4: if j>i goto B5 t12 := 4*j B6: t1 := j-1 t13 := A[t12] ;A[j+1] – dead code elimination t2 := 4*t1 A[t9]:= t13 ;A[j]:=A[j+1] • Loop optimizations t3 := A[t2] ;A[j] A[t12]:=temp ;A[j+1]:=temp – reduce code to be executed in each iteration t6 := 4*j B8: j := j+1 – code motion t7 := A[t6] ;A[j+1] goto B4 – if t3<=t7 goto B8 B5: i := i-1 induction variable elimination goto B2 • Other control structures out: – Code hoisting: eliminates copies of identical code on parallel paths in a flow graph to reduce code size. Carnegie Mellon Carnegie Mellon 15-745: Introduction 19 15-745: Introduction 20 5 Example Example (After Global CSE) B1: i := n-1 B7: t8 :=j-1 B1: i := n-1 B7: A[t2] := t7 B2: if i<1 goto out t9 := 4*t8 B2: if i<1 goto out A[t6] := t3 B3: j := 1 temp := A[t9] ;temp:=A[j] B3: j := 1 B8: j := j+1 B4: if j>i goto B5 t12 := 4*j B4: if j>i goto B5 goto B4 B6: t1 := j-1 t13 := A[t12] ;A[j+1] B6: t1 := j-1 B5: i := i-1 t2 := 4*t1 A[t9]:= t13 ;A[j]:=A[j+1] t2 := 4*t1 goto B2 t3 := A[t2] ;A[j] A[t12]:=temp ;A[j+1]:=temp t3 := A[t2] ;A[j] out: t6 := 4*j B8: j := j+1 t6 := 4*j t7 := A[t6] ;A[j+1] goto B4 t7 := A[t6] ;A[j+1] if t3<=t7 goto B8 B5: i := i-1 if t3<=t7 goto B8 goto B2 out: Carnegie Mellon Carnegie Mellon 15-745: Introduction 21 15-745: Introduction 22 Induction Variable Elimination Example • Intuitively B1: i := n-1 B7: A[t2] := t7 B2: if i<1 goto out A[t6] := t3 – Loop indices are induction variables (counting iterations) B3: j := 1 B8: j := j+1 B4: if j>i goto B5 goto B4 – Linear functions of the loop indices are also induction variables B6: t1 := j-1 B5: i := i-1 (for accessing arrays) t2 := 4*t1 goto B2 t3 := A[t2] ;A[j] out: • Analysis: detection of induction variable t6 := 4*j t7 := A[t6] ;A[j+1] • Optimizations if t3<=t7 goto B8 – strength reduction: • replace multiplication by

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    7 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us