Naive Translation to LLVM Step 2: Translating to SSA Form

Naive Translation to LLVM Step 2: Translating to SSA Form

Compiler construction 2014 An example of optimization in LLVM int f () { int i, j, k; i = 8; Lecture 8 j = 1; Comments More on code optimization k = 1; while (i != j) { Human reader sees, with some if (i==8) effort, that the C/Javalette function SSA form f k = 0; returns 8. Constant propagation else We follow how LLVM:s Common subexpression elimination i++; optimizations will discover this Loop optimizations i = i+k; fact. j++; } return i; } Step 1: Naive translation to LLVM Step 2: Translating to SSA form (opt -mem2reg) define i32 @f() { if.then: entry: store i32 0, i32* %k define i32 @f() { if.then: %i = alloca i32 br label %if.end entry: br label %if.end %j = alloca i32 if.else: br label %while.cond %k = alloca i32 %tmp4 = load i32* %i if.else: store i32 8, i32* %i %inc = add i32 %tmp4, 1 while.cond: %inc = add i32 %i.1, 1 store i32 1, i32* %j store i32 %inc, i32* %i %k.1 = phi i32 [ 1, %entry ], br label %if.end store i32 1, i32* %k br label %if.end [ %k.0, %if.end ] br label %while.cond if.end: %j.0 = phi i32 [ 1, %entry ], if.end: while.cond: %tmp5 = load i32* %i [ %inc8, %if.end ] %k.0 = phi i32 [ 0, %if.then ], %tmp = load i32* %i %tmp6 = load i32* %k %i.1 = phi i32 [ 8, %entry ], [ %k.1, %if.else ] %tmp1 = load i32* %j %add = add i32 %tmp5, %tmp6 [ %add, %if.end ] %i.0 = phi i32 [ %i.1, %if.then ], %cmp = icmp ne i32 %tmp, %tmp1 store i32 %add, i32* %i %cmp = icmp ne i32 %i.1, %j.0 [ %inc, %if.else ] br i1 %cmp, label %while.body, %tmp7 = load i32* %j br i1 %cmp, label %while.body, %add = add i32 %i.0, %k.0 label %while.end %inc8 = add i32 %tmp7, 1 label %while.end %inc8 = add i32 %j.0, 1 while.body: store i32 %inc8, i32* %j br label %while.cond %tmp2 = load i32* %i br label %while.cond while.body: %cmp3 = icmp eq i32 %tmp2, 8 while.end: %cmp3 = icmp eq i32 %i.1, 8 while.end: br i1 %cmp3, label %if.then, %tmp9 = load i32* %i br i1 %cmp3, label %if.then, ret i32 %i.1 label %if.else ret i32 %tmp9 label %if.else } } Step 3: Sparse Conditional Constant Propagation Step 4: CFG Simplification (opt -simplifycfg) (opt -sccp) define i32 @f() { entry: br label %while.cond define i32 @f() { if.then: entry: br label %if.end while.cond: br label %while.cond %j.0 = phi i32 [ 1, %entry ], if.else: [ %inc8, %if.end ] Comments while.cond: br label %if.end %k.1 = phi i32 [ 1, %entry ], %j.0 = phi i32 [ 1, %entry ], [ 0, %if.end ] If the function terminates, the [ %inc8, %if.end ] if.end: %cmp = icmp ne i32 8, %j.0 return value is 8. %k.1 = phi i32 [ 1, %entry ], %inc8 = add i32 %j.0, 1 br i1 %cmp, label %if.end, [ 0, %if.end ] br label %while.cond label %while.end opt has not yet detected that the %cmp = icmp ne i32 8, %j.0 loop is certain to terminate. br i1 %cmp, label %while.body, while.end: if.end: label %while.end ret i32 8 %inc8 = add i32 %j.0, 1 } br label %while.cond while.body: br i1 true, label %if.then, while.end: label %if.else ret i32 8 } Step 5: Dead Loop Deletion (opt -loop-deletion) Static Single Assignment form Def-use chains define i32 @f() { One more -simplifycfg step Dataflow analysis often needs to connect a definition with its uses entry: yields finally and, conversely, find all definitions reaching a use. br label %while.end This can be simplified if each variable has only one definition. define i32 @f() { while.end: entry: A new form of IR ret i32 8 ret i32 8 Three-address code can be converted to SSA form by renaming } } variables so that each variable has just one definition. For realistic code, dozens of passes are performed, some of them A non-example Converted to SSA repeatedly. Many heuristics are used to determine order. s := 0 s1 := 0 Use opt -std-compile-opts for a default selection. x := 1 x1 := 1 s := s + x s2 := s1 + x1 x := x + 1 x2 := x1 + 1 Conversion to SSA An artificial device: φ-functions Naive version First pass: Add “definitions” of the form A harder example Conversion started x := φ(x,..., x) in the beginning of each block with several s := 0 predecessors and for each variable. x := 1 s1 := 0 Second pass: Do renumbering. L1: if x > n goto L2 x1 := 1 s := s + x L1: if x > n goto L2 After 1st pass After 2nd pass x := x + 1 s2 := s? + x? goto L1 x2 := x? + 1 s := 0 s1 := 0 L2: goto L1 x := 1 x1 := 1 L2: L1: s := φ(s,s) L1: s3 := φ(s1,s2) x := φ(x,x) x3 := φ(x1,x2) Note the def/use difficulty: What should replace the three if x > n goto L2 if x3 > n goto L2 In s + x, which def of s does the ’?’ ? s := s + x s2 := s3 + x3 use refer to? x := x + 1 x2 := x3 + 1 goto L1 goto L1 L2: L2: What are φ-functions? Step 2 of example revisited: To SSA form A device during optimization !" # $ Think of φ-functions as function calls during optimization. ! "! #! " Later, some of them will be eliminated (e.g. by dead code #! "!! " !" # $ $ ! "!$! " %&$ % % ! %&$ elimination). %& %& %& %& %! ! %& !" # $ !" # $ & Others will after optimization be transformed to real code. ' x3 := (x1,x2) Idea: φ will be transformed to an instruction !" # $ %&$ !" # $ % %' !# %& %&( %& %&)$ %! ! ( x3 := x1 at the end of left predecessor and x3 := x2 at end of %& # " # %&) right predecessor. ' & # # " %&$ ( * $$ ! %& ( ! # $ # $ Advantages Many analyses become much simpler when code is in SSA form. # $ %&+$ %&,$ ! "! ! (" $$$$ ! %&+ %&, Main reason: we see immediately for each use of a variable where $$ $ ! "!$ ! (" %&-$ ( !$ $$ ! %&- # ( ! it was defined. !" # $ Computing SSA form; algorithm Simple constant propagation A dataflow analysis based on SSA form Uses values from a lattice L with elements We already did this : Not a constant, as far as the analysis can tell. Yes, but the conversion inserts unnecessary φ-functions and is too > c1, c2, c3,...: The value is constant, as indicated. inefficient – the gains in analysis with SSA form may be lost in : Yet unknown, may be constant. ⊥ conversion. Each variable v is assigned an initial value val(v) L: ∈ Variables with definitions v := c get val(v) = c, Better algorithms input variables/parameters v get val(v) = , > There are algorithms for finding the right number of φ-functions and the rest get val(v) = . needed. ⊥ These are based on the notion of dominance; if you intend to use The lattice L SSA form, you need to learn about that – or use LLVM, which has The lattice order tools to do it for you. c for all c. c1 c2 c3 c4 . ⊥ ≤ ≤ > ci and cj not related. Propagation phase, 1 Propagation phase, 2 Iteration Iteration, continued Initially, place all names n with val(n) = on a worklist. 6 > Update val for the defined variables, putting variables that get a Iterate by picking a name from the worklist, examining its uses and new value back on the worklist. computing val of the RHS’s, using rules as Terminate when worklist is empty. 0 x = 0 (for any x) · Termination x = · ⊥ ⊥ Values of variables on the worklist can only increase (in lattice x = (x = 0) · > > 6 order) during iteration. Each value can only have its value increased twice. plus ordinary multiplication for constant operands. For φ-functions, we take the join of the arguments, where A disappointment ∨ x = x for all x, x = for all x, and In our running example, this algorithm will terminate with all ⊥ ∨ > ∨ > variables having value . , if c = c > c c = > i 6 j i ∨ j c , otherwise. We need to take reachability into account. i Sparse Conditional Constant Propagation Correctness of SCCP Sketch of algorithm Uses also a worklist of reachable blocks. ! "! #! " A combination of two dataflow analyses $ ! "!! " Initially, only the entry block is % % #! Sparse conditional constant propagation can be seen as the %! ! reachable. & combination of simple constant propagation and reachability In evaluation of φ functions, analysis/dead code analysis. ! ! ' only flows from # ⊥ & unreachable blocks. Both of these can be expressed as dataflow problems and a framework can be devised where the correctness of such New blocks added to worklist ' combination can be proved. when elaborating terminating instructions. Result for running example as # ' ! shown to the right Final steps Common subexpression elimination Control flow graph simplification Problem Fairly simple pass; SCCP does not change graph structure of CFG We want to avoid re-computing an expression; instead we want to even when “obvious” simplifications can be done. use the previously computed value. Dead Loop Elimination Notes j Identifies an induction variable (namely ), which Code example The second occurrence of a - d increases with 1 for each loop iteration, a := b + c should not be computed; terminates the loop when reaching a known value, b := a - d instead we should use d := b. is initialised to a smaller value. c := b + c Both occurrences of b + c must d := a - d When such a variable is found, loop termination is guaranteed and be computed, since b is redefined the loop can be removed. in-between. Value numbering, 1 Value numbering, 2 Algorithm A classic technique For each instruction x := y # z: Works on three-address code within a basic block. Look up VN ny for y in D1. Each expression is assigned a value number (VN), so that If not present, generate new unique VN ny and expressions that have the same VN must have the same value.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    11 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us