Compiler construction 2014
Lecture 8: More on code optimization

Topics:
  SSA form
  Constant propagation
  Common subexpression elimination
  Loop optimizations

An example of optimization in LLVM

  int f () {
    int i, j, k;
    i = 8;
    j = 1;
    k = 1;
    while (i != j) {
      if (i == 8)
        k = 0;
      else
        i++;
      i = i + k;
      j++;
    }
    return i;
  }

Comments: A human reader sees, with some effort, that the C/Javalette function f returns 8. We follow how LLVM's optimizations will discover this fact.

Step 1: Naive translation to LLVM

  define i32 @f() {
  entry:
    %i = alloca i32
    %j = alloca i32
    %k = alloca i32
    store i32 8, i32* %i
    store i32 1, i32* %j
    store i32 1, i32* %k
    br label %while.cond

  while.cond:
    %tmp = load i32* %i
    %tmp1 = load i32* %j
    %cmp = icmp ne i32 %tmp, %tmp1
    br i1 %cmp, label %while.body, label %while.end

  while.body:
    %tmp2 = load i32* %i
    %cmp3 = icmp eq i32 %tmp2, 8
    br i1 %cmp3, label %if.then, label %if.else

  if.then:
    store i32 0, i32* %k
    br label %if.end

  if.else:
    %tmp4 = load i32* %i
    %inc = add i32 %tmp4, 1
    store i32 %inc, i32* %i
    br label %if.end

  if.end:
    %tmp5 = load i32* %i
    %tmp6 = load i32* %k
    %add = add i32 %tmp5, %tmp6
    store i32 %add, i32* %i
    %tmp7 = load i32* %j
    %inc8 = add i32 %tmp7, 1
    store i32 %inc8, i32* %j
    br label %while.cond

  while.end:
    %tmp9 = load i32* %i
    ret i32 %tmp9
  }

Step 2: Translating to SSA form (opt -mem2reg)

  define i32 @f() {
  entry:
    br label %while.cond

  while.cond:
    %k.1 = phi i32 [ 1, %entry ], [ %k.0, %if.end ]
    %j.0 = phi i32 [ 1, %entry ], [ %inc8, %if.end ]
    %i.1 = phi i32 [ 8, %entry ], [ %add, %if.end ]
    %cmp = icmp ne i32 %i.1, %j.0
    br i1 %cmp, label %while.body, label %while.end

  while.body:
    %cmp3 = icmp eq i32 %i.1, 8
    br i1 %cmp3, label %if.then, label %if.else

  if.then:
    br label %if.end

  if.else:
    %inc = add i32 %i.1, 1
    br label %if.end

  if.end:
    %k.0 = phi i32 [ 0, %if.then ], [ %k.1, %if.else ]
    %i.0 = phi i32 [ %i.1, %if.then ], [ %inc, %if.else ]
    %add = add i32 %i.0, %k.0
    %inc8 = add i32 %j.0, 1
    br label %while.cond

  while.end:
    ret i32 %i.1
  }

Step 3: Sparse Conditional Constant Propagation (opt -sccp)

  define i32 @f() {
  entry:
    br label %while.cond

  while.cond:
    %j.0 = phi i32 [ 1, %entry ], [ %inc8, %if.end ]
    %k.1 = phi i32 [ 1, %entry ], [ 0, %if.end ]
    %cmp = icmp ne i32 8, %j.0
    br i1 %cmp, label %while.body, label %while.end

  while.body:
    br i1 true, label %if.then, label %if.else

  if.then:
    br label %if.end

  if.else:
    br label %if.end

  if.end:
    %inc8 = add i32 %j.0, 1
    br label %while.cond

  while.end:
    ret i32 8
  }

Step 4: CFG Simplification (opt -simplifycfg)

  define i32 @f() {
  entry:
    br label %while.cond

  while.cond:
    %j.0 = phi i32 [ 1, %entry ], [ %inc8, %if.end ]
    %k.1 = phi i32 [ 1, %entry ], [ 0, %if.end ]
    %cmp = icmp ne i32 8, %j.0
    br i1 %cmp, label %if.end, label %while.end

  if.end:
    %inc8 = add i32 %j.0, 1
    br label %while.cond

  while.end:
    ret i32 8
  }

Comments: If the function terminates, the return value is 8. opt has not yet detected that the loop is certain to terminate.

Step 5: Dead Loop Deletion (opt -loop-deletion)

  define i32 @f() {
  entry:
    br label %while.end

  while.end:
    ret i32 8
  }

One more -simplifycfg step finally yields

  define i32 @f() {
  entry:
    ret i32 8
  }

For realistic code, dozens of passes are performed, some of them repeatedly. Many heuristics are used to determine the order. Use opt -std-compile-opts for a default selection.

Static Single Assignment form

Def-use chains: Dataflow analysis often needs to connect a definition with its uses and, conversely, to find all definitions reaching a use. This becomes much simpler if each variable has only one definition.

A new form of IR: Three-address code can be converted to SSA form by renaming variables so that each variable has just one definition.
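To see why the single-definition property pays off, here is a small C sketch of def-use bookkeeping under SSA (the types and names are our own illustration, not LLVM's): finding the definition of a name is a single field access, and all uses are one list away.

  typedef struct Instr Instr;            /* opaque IR instruction */

  typedef struct Use {
      Instr *user;                       /* an instruction that reads the name */
      struct Use *next;
  } Use;

  typedef struct {
      const char *name;
      Instr *def;                        /* the unique defining instruction */
      Use   *uses;                       /* head of the list of all uses */
  } SSAName;

Without SSA, "the definition of x" is not unique, and connecting a use to its reaching definitions requires a separate dataflow analysis.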

Conversion to SSA

A non-example:
  s := 0
  x := 1
  s := s + x
  x := x + 1

Converted to SSA:
  s1 := 0
  x1 := 1
  s2 := s1 + x1
  x2 := x1 + 1

An artificial device: φ-functions

A harder example

Naive version:
  s := 0
  x := 1
  L1: if x > n goto L2
      s := s + x
      x := x + 1
      goto L1
  L2:

Conversion started:
  s1 := 0
  x1 := 1
  L1: if x > n goto L2
      s2 := s? + x?
      x2 := x? + 1
      goto L1
  L2:

Note the def/use difficulty: in s + x, which def of s does the use refer to? What should replace the three '?'?

First pass: Add "definitions" of the form x := φ(x,...,x) at the beginning of each block with several predecessors, for each variable.
Second pass: Do the renumbering.

After 1st pass:
  s := 0
  x := 1
  L1: s := φ(s,s)
      x := φ(x,x)
      if x > n goto L2
      s := s + x
      x := x + 1
      goto L1
  L2:

After 2nd pass:
  s1 := 0
  x1 := 1
  L1: s3 := φ(s1,s2)
      x3 := φ(x1,x2)
      if x3 > n goto L2
      s2 := s3 + x3
      x2 := x3 + 1
      goto L1
  L2:

What are φ-functions?

A device during optimization: Think of φ-functions as function calls during optimization. Later, some of them will be eliminated (e.g. by dead code elimination). Others will, after optimization, be transformed to real code.

Idea: x3 := φ(x1,x2) will be transformed to an instruction x3 := x1 at the end of the left predecessor and x3 := x2 at the end of the right predecessor.

Step 2 of example revisited: To SSA form

[Figure lost in extraction: the LLVM code of step 2, shown again in SSA form.]

Advantages: Many analyses become much simpler when code is in SSA form. The main reason: for each use of a variable, we see immediately where it was defined.

Computing SSA form; algorithm

We already did this?
Yes, but that conversion inserts unnecessary φ-functions and is too inefficient – the gains in analysis with SSA form may be lost in conversion.

Better algorithms: There are algorithms for finding the right number of φ-functions needed. These are based on the notion of dominance; if you intend to use SSA form, you need to learn about that – or use LLVM, which has tools to do it for you.

Simple constant propagation

A dataflow analysis based on SSA form. It uses values from a lattice L with elements
  ⊤: Not a constant, as far as the analysis can tell.
  c1, c2, c3, ...: The value is constant, as indicated.
  ⊥: Yet unknown, may be constant.

Each variable v is assigned an initial value val(v) ∈ L: variables with definitions v := c get val(v) = c, input variables/parameters v get val(v) = ⊤, and the rest get val(v) = ⊥.

The lattice order: ⊥ ≤ c ≤ ⊤ for every constant c; ci and cj are not related to each other.

Propagation phase, 1

Iteration: Initially, place all names n with val(n) ≠ ⊥ on a worklist (i.e., every name about which something is already known). Iterate by picking a name from the worklist, examining its uses, and computing val of the right-hand sides, using rules such as

  0 · x = 0 (for any x)
  ⊥ · x = ⊥
  ⊤ · x = ⊤ (for x ≠ 0)

plus ordinary multiplication for constant operands.

For φ-functions, we take the join ∨ of the arguments, where x ∨ x = x for all x, ⊥ ∨ x = x for all x, x ∨ ⊤ = ⊤ for all x, and

  ci ∨ cj = ⊤, if ci ≠ cj
  ci ∨ cj = ci, otherwise.

Propagation phase, 2

Iteration, continued: Update val for the defined variables, putting variables that get a new value back on the worklist. Terminate when the worklist is empty.

Termination: Values of variables on the worklist can only increase (in lattice order) during the iteration, and each variable can have its value increased at most twice (⊥ → c → ⊤).

A disappointment: In our running example, this algorithm terminates with all variables having value ⊤. We need to take reachability into account.
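The rules above are compact enough to state directly in code. A minimal sketch in C (the Kind/Val encoding and the function names are our own illustration, not LLVM code):

  /* BOT = "yet unknown" (⊥), TOP = "not a constant" (⊤) */
  typedef enum { BOT, CONST, TOP } Kind;
  typedef struct { Kind kind; int c; } Val;

  /* The join x ∨ y, used when evaluating φ-functions. */
  Val join(Val a, Val b) {
      if (a.kind == BOT) return b;                      /* ⊥ ∨ x = x */
      if (b.kind == BOT) return a;
      if (a.kind == TOP || b.kind == TOP)
          return (Val){ TOP, 0 };                       /* x ∨ ⊤ = ⊤ */
      if (a.c == b.c) return a;                         /* x ∨ x = x */
      return (Val){ TOP, 0 };                           /* ci ∨ cj = ⊤ if ci ≠ cj */
  }

  /* Abstract multiplication, following the rules above. */
  Val mul(Val a, Val b) {
      if ((a.kind == CONST && a.c == 0) ||
          (b.kind == CONST && b.c == 0))
          return (Val){ CONST, 0 };                     /* 0 · x = 0, for any x */
      if (a.kind == BOT || b.kind == BOT)
          return (Val){ BOT, 0 };                       /* ⊥ · x = ⊥ */
      if (a.kind == TOP || b.kind == TOP)
          return (Val){ TOP, 0 };                       /* ⊤ · x = ⊤ (x ≠ 0) */
      return (Val){ CONST, a.c * b.c };                 /* ordinary multiplication */
  }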

Sparse Conditional Constant Propagation

Sketch of algorithm:
  The analysis also uses a worklist of reachable blocks.
  Initially, only the entry block is reachable.
  In the evaluation of φ-functions, only ⊥ flows in from unreachable blocks (see the sketch below).
  New blocks are added to the worklist when elaborating terminator instructions.

[Figure lost in extraction: the result for the running example.]

Correctness of SCCP

A combination of two dataflow analyses: Sparse conditional constant propagation can be seen as the combination of simple constant propagation and reachability analysis/dead code analysis. Both of these can be expressed as dataflow problems, and a framework can be devised in which the correctness of such a combination can be proved.
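The φ rule is the heart of the "sparse conditional" part and fits in a few lines of C, reusing Val and join from the sketch above (eval_phi and edge_reachable are our own hypothetical names):

  #include <stdbool.h>

  /* Evaluate a φ-function under reachability: arguments arriving over
     unreachable edges contribute nothing (they act as ⊥). */
  Val eval_phi(int nargs, const Val args[], const bool edge_reachable[]) {
      Val v = { BOT, 0 };
      for (int i = 0; i < nargs; i++)
          if (edge_reachable[i])        /* only ⊥ flows from unreachable blocks */
              v = join(v, args[i]);
      return v;
  }

Correspondingly, when a terminator's condition evaluates to a constant, only the taken successor is added to the block worklist, so code behind the untaken edge never contributes any values.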

Final steps

Control flow graph simplification: A fairly simple pass; SCCP does not change the graph structure of the CFG even when "obvious" simplifications can be done.

Dead Loop Elimination: Identifies an induction variable (namely j) which increases by 1 in each loop iteration, terminates the loop when it reaches a known value, and is initialised to a smaller value. When such a variable is found, loop termination is guaranteed and the loop can be removed.

Common subexpression elimination

Problem: We want to avoid re-computing an expression; instead we want to use the previously computed value.

Code example:
  a := b + c
  b := a - d
  c := b + c
  d := a - d

Notes: The second occurrence of a - d should not be computed; instead we should use d := b. Both occurrences of b + c must be computed, since b is redefined in between.

Value numbering, 1

A classic technique: Works on three-address code within a basic block. Each expression is assigned a value number (VN), so that expressions that have the same VN must have the same value. (Note: the VN is not the value of the expression.)

Data structures:
  A dictionary D1 that associates
    a variable or a literal with a VN,
    a triple (VN, operator, VN) with a VN.
  Typically, D1 is implemented as a hash table.
  A dictionary D2, mapping VNs to sets of variables (implemented as an array).

Value numbering, 2

Algorithm: For each instruction x := y # z:
  Look up the VN ny for y in D1. If not present, generate a new unique VN ny and put D1(y) = ny, D2(ny) = {y}. Do the same for z.
  Look up x in D1; if a VN n is found, remove x from D2(n).
  Look up (ny, #, nz) in D1.
    If a VN m is found, put D1(x) = m (m has been computed before). If D2(m) is non-empty, replace the instruction by x := v for some v in that set.
    Otherwise, generate a new unique VN m and put D1(ny, #, nz) = m, D1(x) = m.
  Add x to D2(m).
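As a concrete illustration, here is a toy C implementation of the algorithm for instructions x := y # z. It uses linear search in fixed-size arrays where the slides suggest a hash table, and all names are our own — a sketch, not production code. Run on the code example from the previous slide, it prints d := b for the last instruction, as claimed there.

  #include <stdio.h>
  #include <string.h>

  #define MAX 64

  /* D1, part 1: variable or literal -> VN */
  static const char *var_key[MAX]; static int var_vn[MAX]; static int nvars;
  /* D1, part 2: (VN, op, VN) -> VN */
  static int tl[MAX], tr[MAX]; static char t_op[MAX]; static int tvn[MAX]; static int ntrips;
  /* D2: VN -> set of variables currently holding that value */
  static const char *d2[MAX][MAX]; static int d2n[MAX];
  static int next_vn;

  static int lookup_var(const char *s) {
      for (int i = 0; i < nvars; i++)
          if (strcmp(var_key[i], s) == 0) return i;
      return -1;
  }

  static int vn_of(const char *y) {      /* look up y in D1, adding it if absent */
      int i = lookup_var(y);
      if (i >= 0) return var_vn[i];
      int vn = next_vn++;
      var_key[nvars] = y; var_vn[nvars++] = vn;
      d2[vn][d2n[vn]++] = y;             /* D2(ny) = {y} */
      return vn;
  }

  static void d2_remove(int vn, const char *x) {
      for (int i = 0; i < d2n[vn]; i++)
          if (strcmp(d2[vn][i], x) == 0) { d2[vn][i] = d2[vn][--d2n[vn]]; return; }
  }

  /* Process x := y # z, printing the possibly improved instruction. */
  void value_number(const char *x, const char *y, char op, const char *z) {
      int ny = vn_of(y), nz = vn_of(z);

      int ix = lookup_var(x);            /* x is redefined: remove it from D2(n) */
      if (ix >= 0) d2_remove(var_vn[ix], x);

      int m = -1;                        /* look up (ny, #, nz) in D1 */
      for (int i = 0; i < ntrips; i++)
          if (tl[i] == ny && t_op[i] == op && tr[i] == nz) { m = tvn[i]; break; }

      if (m >= 0 && d2n[m] > 0) {        /* computed before, and still held */
          printf("%s := %s\n", x, d2[m][0]);
      } else {
          if (m < 0) {                   /* new triple: generate a fresh VN */
              m = next_vn++;
              tl[ntrips] = ny; t_op[ntrips] = op; tr[ntrips] = nz; tvn[ntrips++] = m;
          }
          printf("%s := %s %c %s\n", x, y, op, z);
      }
      if (ix >= 0) var_vn[ix] = m;       /* D1(x) = m */
      else { var_key[nvars] = x; var_vn[nvars++] = m; }
      d2[m][d2n[m]++] = x;               /* add x to D2(m) */
  }

  int main(void) {                       /* the code example from the slide */
      value_number("a", "b", '+', "c");  /* a := b + c */
      value_number("b", "a", '-', "d");  /* b := a - d */
      value_number("c", "b", '+', "c");  /* c := b + c  (recomputed: b redefined) */
      value_number("d", "a", '-', "d");  /* a - d has VN of b: prints d := b */
      return 0;
  }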

Value numbering, 3

Extended basic blocks: A subtree of the CFG where each node has only one predecessor. Each path through the EBB is handled by value numbering. To avoid starting from scratch, use stacks of dictionaries. (Needs SSA form.)

[Figure lost in extraction: an example CFG with blocks B1–B6 illustrating extended basic blocks.]

Algebraic identities: Value numbering can be combined with code improvement using identities such as

  x · 0 = 0        0 · x = 0
  x · 1 = x        1 · x = x

and similar identities for other operators x # y. Avoid long sequences of tests!

Available expressions: a dataflow analysis

Purpose: An auxiliary concept in an intraprocedural analysis for finding common subexpressions that are not handled by value numbering.

Definition: An expression x # y is available at a point P in a CFG if the expression is evaluated on every path from the entry node to P, and neither x nor y is redefined after the last such evaluation.

Locally defined sets: We consider sets of expressions.
  gen(n) is the set of expressions x # y that are evaluated in n without a subsequent definition of x or y.
  kill(n) is the set of expressions x # y where n defines x or y without a subsequent evaluation of x # y.

Available expressions: the flow equations

Sets to compute by flow analysis:
  avail-in(n) is the set of available expressions at the beginning of n.
  avail-out(n) is the set of available expressions at the end of n.

The equations:
  avail-in(n0)  = {}   for the entry node n0
  avail-in(n)   = ∩_{p ∈ preds(n)} avail-out(p)   (for other n)
  avail-out(n)  = gen(n) ∪ (avail-in(n) − kill(n))

Motivation: An expression is available on exit from n if it is either generated in n, or it was already available on entry and not killed in n. An expression is available on entry if it is available on exit from all predecessors.

Available expressions: Comments

Solution method: Iterate from the initial sets avail-in(n) = avail-out(n) = U, where U is the set of all expressions occurring in the CFG (except avail-in(n0) = {}).
  The iteration converges to the greatest fixpoint; all sets shrink monotonically during the iterations.
  The fixpoint solution has the property that any expression declared available really is available. This does not hold for earlier iterations.
  Sets can be represented as bit-vectors (U = all ones).
  This is a forward problem; information flows from predecessors to successors. Thus one should try to compute predecessors first.
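As an illustration of the solution method, here is a small C sketch of the iteration with bit-vector sets, using one machine word per set (our own encoding; a real implementation would use multi-word vectors):

  #include <stdint.h>
  #include <stdbool.h>

  #define MAXB 32                      /* max number of basic blocks in this sketch */

  typedef uint64_t Set;                /* one bit per expression; at most 64 exprs */

  int nblocks;                         /* block 0 is the entry node n0 */
  Set gen_set[MAXB], kill_set[MAXB];   /* gen(n) and kill(n) from the slide */
  int npreds[MAXB], pred[MAXB][MAXB];
  Set avail_in[MAXB], avail_out[MAXB];

  void available_expressions(Set U) {  /* U = all expressions in the CFG */
      for (int n = 0; n < nblocks; n++) {
          avail_in[n]  = U;            /* initial sets: everything available... */
          avail_out[n] = U;
      }
      avail_in[0] = 0;                 /* ...except avail-in(n0) = {} */

      bool changed = true;
      while (changed) {                /* sets shrink monotonically towards */
          changed = false;             /* the greatest fixpoint */
          for (int n = 0; n < nblocks; n++) {
              Set in = (n == 0) ? 0 : U;
              for (int p = 0; p < npreds[n]; p++)
                  in &= avail_out[pred[n][p]];         /* ∩ over all preds */
              Set out = gen_set[n] | (in & ~kill_set[n]);
              if (in != avail_in[n] || out != avail_out[n]) {
                  avail_in[n]  = in;
                  avail_out[n] = out;
                  changed = true;
              }
          }
      }
  }

The visiting order only affects the number of iterations, not the result; visiting predecessors first, as the slide suggests, makes the information propagate faster.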

Common subexpression elimination

Available expressions can be eliminated: If dataflow analysis finds that y # z in an instruction x := y # z is available, we could eliminate it. This is a second, separate step (a code transformation): replace the instruction by x := w. But how to find w?

Basic idea: Generate a new name w. Follow the control flow backwards along all paths until a definition v := y # z is found (such a def must exist in all paths!). Replace the def by
  w := y # z
  v := w

A more powerful idea: Find these definitions by dataflow analysis: reaching definitions.

Tail recursion

A different optimization: A recursive function is tail-recursive if it returns a value computed by (just) a recursive call. This can (and should) be optimized to a loop.

Recursive form:
  int sumTo(int lim) {
    return ack(1, lim, 0);
  }

  int ack(int n, int k, int s) {
    if (n > k)
      return s;
    else
      return ack(n+1, k, s+n);
  }

ack rewritten:
  int ack(int n, int k, int s) {
  L: if (n > k)
       return s;
     else {
       k = k;      // not needed
       s = s + n;
       n = n + 1;  // note reordering!
       goto L;
     }
  }

A motivating example

A simple Javalette function (in extension arrays1):

  int sum (int [] a) {
    int res = 0;
    for (int x : a)
      res = res + x;
    return res;
  }

What code would you generate?

Possible naive LLVM code, part 1

  %arr = type { i32, [ 0 x i32 ] }*

  define i32 @sum(%arr %__p__a) {
  entry:
    %a = alloca %arr
    store %arr %__p__a , %arr* %a
    %_res_t0 = alloca i32
    store i32 0 , i32* %_res_t0
    %_x_t1 = alloca i32
    %t2 = load %arr* %a
    %t3 = getelementptr %arr %t2 , i32 0, i32 0
    %t4 = load i32* %t3
    %_indexx_t5 = alloca i32
    store i32 0 , i32* %_indexx_t5
    br label %lab0

  lab0:
    %t6 = load i32* %_indexx_t5
    %t7 = icmp slt i32 %t6 , %t4
    br i1 %t7 , label %lab1 , label %lab2

Possible naive LLVM code, part 2

  lab1:
    %t8 = getelementptr %arr %t2 , i32 0, i32 1, i32 %t6
    %t9 = load i32* %t8
    store i32 %t9 , i32* %_x_t1
    %t10 = load i32* %_res_t0
    %t11 = load i32* %_x_t1
    %t12 = add i32 %t10 , %t11
    store i32 %t12 , i32* %_res_t0
    %t13 = add i32 %t6 , 1
    store i32 %t13 , i32* %_indexx_t5
    br label %lab0

  lab2:
    %t14 = load i32* %_res_t0
    ret i32 %t14
  }

After opt -mem2reg

  define i32 @sum(%arr %__p__a) {
  entry:
    %t3 = getelementptr %arr %__p__a, i32 0, i32 0
    %t4 = load i32* %t3
    br label %lab0

  lab0:
    %_res_t0.0 = phi i32 [ 0, %entry ], [ %t12, %lab1 ]
    %_indexx_t5.0 = phi i32 [ 0, %entry ], [ %t13, %lab1 ]
    %t7 = icmp slt i32 %_indexx_t5.0, %t4
    br i1 %t7, label %lab1, label %lab2

  lab1:
    %t8 = getelementptr %arr %__p__a, i32 0, i32 1, i32 %_indexx_t5.0
    %t9 = load i32* %t8
    %t12 = add i32 %_res_t0.0, %t9
    %t13 = add i32 %_indexx_t5.0, 1
    br label %lab0

  lab2:
    ret i32 %_res_t0.0
  }

After opt -std-compile-opts

  define i32 @sum(%arr nocapture %__p__a) nounwind readonly {
  entry:
    %t3 = getelementptr %arr %__p__a, i32 0, i32 0
    %t4 = load i32* %t3
    %t71 = icmp sgt i32 %t4, 0
    br i1 %t71, label %bb.nph, label %lab2

  bb.nph:
    %tmp = zext i32 %t4 to i64
    br label %lab1

  lab1:
    %indvar = phi i64 [ 0, %bb.nph ], [ %indvar.next, %lab1 ]
    %_res_t0.02 = phi i32 [ 0, %bb.nph ], [ %t12, %lab1 ]
    %t8 = getelementptr %arr %__p__a, i64 0, i32 1, i64 %indvar
    %t9 = load i32* %t8
    %t12 = add i32 %t9, %_res_t0.02
    %indvar.next = add i64 %indvar, 1
    %exitcond = icmp eq i64 %indvar.next, %tmp
    br i1 %exitcond, label %lab2, label %lab1

  lab2:
    %_res_t0.0.lcssa = phi i32 [ 0, %entry ], [ %t12, %lab1 ]
    ret i32 %_res_t0.0.lcssa
  }

Generated x86 assembly (with llc)

  _sum:    push EDI
           push ESI
           mov ECX, DWORD PTR [ESP + 12]
           mov EDX, DWORD PTR [ECX]
           test EDX, EDX
           jg LBB0_2
           xor EAX, EAX
           jmp LBB0_4
  LBB0_2:  xor ESI, ESI
           add ECX, 4
           xor EAX, EAX
  LBB0_3:  add EAX, DWORD PTR [ECX]
           add EDX, -1
           adc ESI, -1
           add ECX, 4
           mov EDI, EDX
           or EDI, ESI
           jne LBB0_3
  LBB0_4:  pop ESI
           pop EDI
           ret

Comments:
  No local vars; no stack frame handling.
  Uses callee-save registers EDI and ESI; note the save/restore.
  ECX holds the address of the current array element; increased by 4 in each iteration.
  EDX counts the number of elements remaining.
  Use of ESI in the loop termination test??

Optimizations of loops

In computationally demanding applications, most of the time is spent executing (inner) loops.

Moving loop-invariant code out of the loop

A simple example / Not quite as simple
[The slide's code examples, beginning "for (i=0; i<...", were lost in extraction.]
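Since the slide's own code did not survive, here is a typical illustration of the transformation (our example, not necessarily the lecture's): b * c does not change inside the loop, so it can be computed once before it.

  void hoist_demo(int a[], int n, int b, int c) {
      /* Before: b*c is loop-invariant but recomputed in every iteration. */
      for (int i = 0; i < n; i++)
          a[i] = b * c + i;

      /* After moving the invariant computation out of the loop: */
      int t = b * c;
      for (int i = 0; i < n; i++)
          a[i] = t + i;
  }

A case that is "not quite as simple": if the loop can execute zero times, the hoisted computation runs although the original never evaluated it — the same issue that motivates the loop inversion on the strength-reduction slides below.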

Induction variables

A basic induction variable is an (integer) variable which has a single definition in the loop body, which increases its value by a fixed (loop-invariant) amount. Example: n = n + 3.

A basic IV will assume values in arithmetic progression when the loop executes.

Given a basic IV, we can find a collection of derived IVs, each of which has a single def of the form
  m = a*n + b;
where a and b are loop-invariant. The def can be extended to allow RHSs of the form a*k + b, where k is an already established derived IV.

Strength reduction for IVs

  while (n < 100) {
    k = 7*n + 3;
    a[k]++;
    n++;
  }

Here n is a basic IV (its only def increases it by 1), and k is a derived IV. Replace the multiplication involved in the def of k by an addition:

  k = 7*n + 3;
  while (n < 100) {
    a[k]++;
    n++;
    k += 7;
  }

Could there be some problem with this transformation?

Strength reduction for IVs, continued

The loop might not execute at all, in which case k would not be evaluated. Better to perform loop inversion first:

  if (n < 100) {
    k = 7*n + 3;
    do {
      a[k]++;
      n++;
      k += 7;
    } while (n < 100);
  }

If n is not used after the loop, it can be eliminated from the loop:

  if (n < 100) {
    k = 7*n + 3;
    do {
      a[k]++;
      k += 7;
    } while (k < 703);
  }

One more example

Sample loop:
  int sum = 0;
  for (i = 0; i < 1000; i++)
    sum += a[i];

What can these techniques do for this loop?

Naive assembler code:
  %sum = 0
  %i = 0
  L1: %off = mul %i, 4
      %addr = add %addr.a, %off
      %a.i = load %addr
      %sum = add %sum, %a.i
      %i = add %i, 1
      %stop = cmp lt %i, 1000
      br %stop, L1, L2
  L2:

After strength reduction/IV techniques:
  %sum = 0
  %off = 0
  %addr = %addr.a
  %end = add %addr.a, 4000
  L1: %a.i = load %addr
      %sum = add %sum, %a.i
      %addr = add %addr, 4
      %stop = cmp lt %addr, %end
      br %stop, L1, L2
  L2:

Loop unrolling

  for (i = 0; i < 100; i++)
    a[i] = a[i] + x[i];

becomes

  for (i = 0; i < 100; i = i + 4) {
    a[i]   = a[i]   + x[i];
    a[i+1] = a[i+1] + x[i+1];
    a[i+2] = a[i+2] + x[i+2];
    a[i+3] = a[i+3] + x[i+3];
  }

In which ways is this an improvement? What to do if the upper bound is n? (One common answer is sketched after the gcc notes below.) Is unrolling four steps the best choice? What could be the disadvantages?

Optimizations in gcc

On ASTs: Inlining, constant folding, arithmetic simplification.
On RTL code (≈ three-address code):
  Tail (and sibling) call optimization.
  Jump optimization.
  SSA pass: constant propagation, ...
  Common subexpression elimination, more constant propagation.
  ...
Difficult decisions: optimization order, repetitions.
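One common answer to "what to do if the upper bound is n" (our sketch, not from the slides): unroll by four and add a remainder loop for the last few elements.

  void add_arrays(int a[], const int x[], int n) {
      int i;
      for (i = 0; i + 3 < n; i += 4) {   /* unrolled main loop */
          a[i]   += x[i];
          a[i+1] += x[i+1];
          a[i+2] += x[i+2];
          a[i+3] += x[i+3];
      }
      for (; i < n; i++)                 /* remainder: at most 3 iterations */
          a[i] += x[i];
  }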

Summing up

On optimization We have only looked at a few of many, many techniques. Modern optimization techniques use sophisticated algorithms and clever data structures. Frameworks such as LLVM make it possible to get the benefits of state-of-the-art techniques in your own compiler project.