Control-Flow and Low-Level Optimizations

Outline • Unreachable-Code Elimination Control-Flow and Low-Level • Straightening • If and Loop Simplifications Optimizations • Loop Inversion and Unswitching • Branch Optimizations • Tail Merging (Cross Jumping) • Conditional Moves • Dead-Code Elimination • Branch Prediction • Peephole Optimization • Machine Idioms & Instruction Combining Unreachable-Code Elimination Unreachable-Code Elimination Unreachable code is code that cannot be executed , entry entry regardless of the input data c = a + b c = a + b – codthtide that is never execut tblfable for any i nput tdt data d = c d = c f = a + c e > c e > c – code that has become unreachable due to a previous g = e compiler tftransforma tion a = e +c f = c – g f = c – g b = c+1c + 1 b = c+1c + 1 Unreachable code elimination removes this code h = e + 1 d = 4 * a d = 4 * a e < a e = d – 7 e = d – 7 – reduces the code space f = e+2e + 2 f = e+2e + 2 – improves instruction-cache utilization – enables other control -flow transformations exit exit Straightening Straightening Example Straightening is applicable to pairs of basic blocks such Straightening in the presence of fall-throughs is tricky... that the first has no successors other than the second and the second has no predecessors other than the first L1: … L1: … a = b + c a = b + c goto L2 b = c * 2 a = a + 1 … … L6: … if c < 0 goto L3 a = b + c a=b+ca = b + c gotL4to L4 gotL5to L5 b = c * 2 b = c * 2 a = a + 1 L2:b: b = c * 2 L6: … a = a + 1 c < 0 a = a + 1 goto L4 c < 0 if c < 0 goto L3 L5: … L5: … If Simplifications If Simplification Example If simplifications apply to conditional constructs a > d a > d Y N Y N one or both of whose branches are empty: b = a – if either the then or the else part of an if-construct is c = 4 * b b = a (a >= d) or bool c = 4 * b empty, the corresponding branch can be eliminated Y N – one branch of an if with a constant-valued condition d = b d = c d = b can also be eliminated – we can also simplify ifs whose condition, C, occurs e = a + b e = a + b in the scope of a condition that implies C (and none of the condition’s operands has changed value) … … Loop Simplifications Loop Simplification Example • A loop whose body is empty can be eliminated s = 0 if the iteration-control code has no side-effects i = 0 s = 0 i = i+1i + 1 (Side-effects might be simple enough that they can be i = 0 s = s + i i = 4 replaced with non-looping code at compile time) L1: if i >= 4 goto L2 i = i + 1 ii+1i = i + 1 s = 10 s = s + i L2: … s = s + i i = i + 1 • If number of iterations is small enough , loops goto L1 ss+is = s + i can be unrolled into branchless code and the L2: … i = i + 1 s = s + i lbdbloop body can be execute d at compil ilie time L2: … Loop Inversion Loop Inversion Example 1 Loop inversion transforms a while loop into a for (i = 0; i < 100; i++) { Loop bounds repeat loop (i.e. moves the loop-closing test a[i] = i + 1; are known } from before the loop to after it). – Has the advantaggye that only one branch instruction needs to be executed to close the loop. – Requires that we determine that the loop is entered i=0;i = 0; i0i = 0; at least once! while (i < 100) { do { a[i] = i+1;i + 1; a[i]=i+1;a[i] = i + 1; i++; i++; } } while (i < 100) Loop Inversion Example 2 Unswitching Unswitching is a control -flow transformation that moves loop-invariant conditional branches out if (k > = n) goto L of loops for (i = k; i < n; i++) { i = k; if (k == 2) { a[i] = i + 1; do { for (i = 1; i < 100; i++) { for (i = 1; i < 100; i++) } a[i] = i + 1; if (k == 2) a[i] = a[i] + 1; i++; a[i] = a[i] + 1; }l} else { Loop bounds } while (i < n) else for (i = 1; i < 100; i++) are unknown L: a[i] = a[i] – 1; a[i] = a[i] – 1; } } Unswitching Example Branch Optimizations Branches to branches are remarkably common! if (k == 2) { – An unconditional branch to an unconditional branch can be for (i = 1; i < 100; i++) { replaced by a branch to the latter ’s target for (i = 1; i < 100; i++) { if (a[i] > 0) – A conditional branch to an unconditional branch can be if (k == 2&&2 && a[i] > 0) a[i] = a [i] + 1; replaced by the corresponding conditional branch to the latter branch’s target a[i] = a[i] + 1; } } }else{} else { – An unconditional branch to a conditional branch can be replaced by a copy of the conditional branch i = 100; } – A conditional branch to a conditional branch can be replaced by a conditional branch with the former’s test and the latter’s target as long as the latter condition is true whenever the fiformer one is Branch Optimization Examples Eliminating Useless Control -Flow if a == 0 go to L1 if a == 0 go to L2 The Problem: … … – After optimization, the CFG might contain empty blocks L1: if a >= 0 goto L2 L1: if a >= 0 goto L2 … … – “Empty” blocks still end with either a branch or jump L2: … L2: … – Produces jump to jump, which wastes time and space goto L1 The Algorithm: (Clean) L1: … L1: … – Use four distinct transformations if a == 0 goto L1 – AlhApply them in a caref flllully selected dd order goto L2 if a != 0 goto L2 – Iterate until done L1: … L1: … Eliminating Useless Control -Flow Eliminating Useless Control -Flow Transformation 1 Both sides of branch target B2 Transformation 2 Merging an empty block – Neither block must be empty – Empty B1 ends with a jump B1 B1 – Rep lace it with a j ump t o B1 empty – Coalesce B1 and B2 B1 – Move B1’s incoming edges – Simple rewrite of the last – Eliminates extraneous jjpump operation in B1 B2 B2 B2 B2 – Faster, smaller code BhtjBranch, not a jump How does this happen? How does this happen? Eliminating redundant branches – By rewriting other branches Eliminating empty blocks – By eliminating operations in B1 How do we recognize it? How do we recognize it? – Check each branch – Test for empty block Eliminating Useless Control -Flow Eliminating Useless Control -Flow Transformation 3 Coalescing blocks Transformation 4 Jump to a branch – Neither block must be empty – B1 ends with a jump, B2 is B1 – B1 ends with a jump to B2 B1 B1 emppyty – Eliminates pointless jump B1 – B2 has one predecessor – Combine the two blocks – Copy branch into end of B1 B2 empty empty – Might make B2 unreachable B2 – Eliminates a jump B2 B2 How does this happen? How does this happen? Eliminating non-empty blocks – By eliminating operations in B1 – By simplifying edges out of B1 Hoisting branches How do we recognize it? from empty blocks How do we recognize it? – Check target of jump – Jump to empty block Eliminating Useless Control -Flow Tail Merging (Cross Jumping) Puttinggg the transformations together Tail merging applies to basic blocks whose last few instructions – Process the blocks in postorder are identical and that continue to the same location. • Clean up Bi’s successors before Bi It replaces the matching instructions of one block with a branch to • Simplifies implementation and understanding the corresponding point in the other. – At each node, apply transformations in a fixed order … … r12+31 = r2 + r3 r12+31 = r2 + r3 • Eliminate redundant branch r4 = r3 shl 2 goto L2 • Eliminate empty block r2 = r2 + 1 … • Merge bloc k w ith successors r2 = r4 – r2 r5 = r4 – 6 • Hoist branch from empty successor goto L1 L2: r4 = r3 shl 2 – May need to iterate … r2 = r2 + 1 r5 = r4 – 6 r2 = r4 – r2 •Postorder ⇒ unprocessed successors along back edges r4 = r3 shl 2 L1: … • Can bound iterations, but derivinggg a tight bound is hard r2 = r2 + 1 • Must recompute postorder between iterations r2 = r4 – r2 L1: … Conditional Moves Conditional Moves Help Loop Unrolling Conditional moves are instructions that copy a source to for(i=1;i<=n;i++){for (i = 1; i <= n; i++) { for(i=1;i<=n;i++){for (i = 1; i <= n; i++) { a target if and only if a specified condition is satisfied x = a[i]; x = a[i]; if (x>0) u = z * x; w = z * x; u = b[i]; – availa ble in several mod ern arc hitect ures (SPARC-V9, PentiumPro) else u = b[i]; u = (x>0) w; – are used to replace simple branching code sequences with s = s + u; non-branching code s = s + u; } } if a > b go to L1 • BiBy using con ditilditional move i ittinstructions, we can unroll max = b t1 = a > b loops containing internal control-flow and end up with goto L2 max = b “straight-line” code L1: max = a max = (t1) a – helps because instruction scheduling is then more effective L2: … – works if the two instruction blocks of the if are small in size Dead-Code Elimination Dead-Code Elimination Example entry entry A variable is dead if it is not used on any path from the location in k is only used the code where it is defined to the exit point of the routine. i = 1 j = 2 to define new i = 1 An instruction is dead if it computes values that are not used on j=2j = 2 k = 3 values for itself! any executable path leading from the instruction.

Load more