CS 515 Programming Language and I

Lecture 10: Data Flow Analysis II (The lectures are based on the slides copyrighted by Keith Cooper and Linda Torczon from Rice University.) Zheng (Eddy) Zhang Rutgers University Fall 2017, 11/14/2017 2 Review: Middle End (Optimizer)

Front IR Middle IR Back Machine End End End code

Errors

• Front end works from the syntax of the source code • Rest of the compiler works from intermediate representation (IR) - Analyzes IR to learn about the code - Transforms IR to improve final code - Examples 1. Parallelization 2. Redundancy elimination 3. Data flow analysis (DFA) framework 4. Single static assignment (SSA) 3 Review: Compiler Middle End (Optimizer)

Front IR Middle IR Back Machine End End End code

Errors

• Front end works from the syntax of the source code • Rest of the compiler works from intermediate representation (IR) - Analyzes IR to learn about the code - Transforms IR to improve final code - Examples 1. Parallelization 2. Redundancy elimination 3. Data flow analysis (DFA) framework 4. Single static assignment (SSA) 4 Review: Redundancy Elimination

An expression x+y is redundant if and only if, along every path from the procedure’s entry, it has been evaluated, and its constituent subexpressions (x & y) have not been re-defined.

If the compiler can prove that an expression is redundant • It can preserve the results of earlier evaluations • It can replace the current evaluation with a reference

Two pieces to the problem • Proving that x+y is redundant • Rewriting the code to eliminate the redundant evaluation

One single-pass, local technique to accomplish both is called (VN) 5 Review

Redundancy Elimination • Local Value Numbering (LVN) • Superlocal Value Numbering (SVN) • Dominator Value Numbering (DVN) • Global Common Subexpression Elimination (GCSE) 6 Review: Local Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundant operation within a block q ← a + b B r ← c + d p ← c + d D r ← c + d E e ← b + 18 e ← a + 17 LVN finds 2 redundant ops s ← a + b t ← c + d u ← e + f u ← e + f LVN misses 8 redundant ops!

v ← a + b F w ← c + d x ← e + f G y ← a + b z ← c + d 7 Review: Local Value Numbering

The Algorithm

For each operation o in the block 1. Get value numbers for the operands from a hash lookup 2. Hash to get a value number for o 3. If o already had a value number, replace o with a reference 4. If o1 & o2 are constant, evaluate it & use a “load immediate”

Focus on each operand’s value number, not its name. 8 Review: Superlocal Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundant operation within a block q ← a + b B Superlocal Value Numbering (SVN) r ← c + d p ← c + d Find redundant operation within a EBB D r ← c + d E e ← b + 18 e ← a + 17 s ← a + b t ← c + d u ← e + f u ← e + f

v ← a + b F w ← c + d x ← e + f G y ← a + b z ← c + d 9 Review: Superlocal Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundant operation within a block q ← a + b B Superlocal Value Numbering (SVN) r ← c + d p ← c + d Find redundant operation within a EBB D r ← c + d E An Extended Basic Block (EBB) e ← b + 18 e ← a + 17 Set of blocks B1, B2, …, Bn s ← a + b t ← c + d • B1 is either the entry node to the u ← e + f u ← e + f procedure, or it has > 1 pred. •All other B have 1 pred. & that pred. v ← a + b i F w ← c + d is in the EBB x ← e + f Three EBBs in this CFG G { A, B, C, D, E } y ← a + b 1. z ← c + d 2. { F } 3. { G } 10 Review: Superlocal Value Numbering

The SVN Algorithm 1. Identify EBBs 2. In depth-first order over an EBB, starting with the head of the EBB, p0 (I) Apply LVN to pi (II) Invoke SVN on each of pi’s EBB successors • When going from pi to its EBB successor pj, extend the symbol table with a new scope for pj, apply LVN to pj , & process pj’s EBB successors • When going from pj to its EBB predecessor pi , discard the scope for pj

Use a scoped table & the right name space 11 Review: Superlocal Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundant operation within a block q ← a + b B Superlocal Value Numbering (SVN) r ← c + d p ← c + d Find redundant operation within a EBB D r ← c + d E An Extended Basic Block (EBB) e ← b + 18 e ← a + 17 Set of blocks B1, B2, …, Bn s ← a + b t ← c + d • B1 is either the entry node to the u ← e + f u ← e + f procedure, or it has > 1 pred. •All other B have 1 pred. & that pred. v ← a + b i F w ← c + d is in the EBB x ← e + f Three EBBs in this CFG G { A, B, C, D, E } y ← a + b 1. z ← c + d 2. { F } 3. { G } 12 Review: Superlocal Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundancy within a block q ← a + b B Superlocal Value Numbering (SVN) r ← c + d p ← c + d Find redundancy within a EBB D r ← c + d E e ← b + 18 e ← a + 17 s ← a + b t ← c + d u ← e + f u ← e + f SVN finds 5 redundant ops.

v ← a + b SVN misses 5 redundant ops! F w ← c + d x ← e + f G y ← a + b z ← c + d 13 Review: Superlocal Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundancy within a block q ← a + b B Superlocal Value Numbering (SVN) r ← c + d p ← c + d Find redundancy within a EBB D r ← c + d E Dominator Value Numbering (DVN) e ← b + 18 e ← a + 17 Find redundancy in IDOM-tree s ← a + b t ← c + d u ← e + f u ← e + f

v ← a + b F w ← c + d x ← e + f G y ← a + b z ← c + d 14 Redundancy Elimination

An expression x+y is redundant if and only if, along every path from the procedure’s entry, it has been evaluated, and its constituent subexpressions (x & y) have not been re-defined.

We need to understand the properties of those paths A m ← a + b • Specifically, what blocks occur on any path from entry to p n ← a + b C

• If we can find those blocks, we can use their hash tables q ← a + b r ← c + d B For example: D E • A is on every path to any other node; and C is on every path p ← c + d e ← b + 18 e ← a + 17 r ← c + d s ← a + b t ← c + d to D, E, and F. u ← e + f u ← e + f • Only A is on every path to G. v ← a + b F w ← c + d x ← e + f G y ← a + b z ← c + d 15 Review: Dominance in a Control Flow Graph

Definition In a flow graph, x dominates y if and only if every path from the entry node of the control-flow graph to y includes x • By definition, x dominates x itself A m ← a + b • We associate a DOM set with each node n ← a + b C DOM(x) contains the set of nodes that dominate x • q ← a + b r ← c + d • |DOM(x )| ≥ 1 B D E Block DOM IDOM p ← c + d e ← b + 18 e ← a + 17 A A — r ← c + d s ← a + b t ← c + d B A,B A u ← e + f u ← e + f C A,C A D A,C,D C v ← a + b F w ← c + d E A,C,E C x ← e + f F A,C,F C G y ← a + b G A,G A z ← c + d 16 Review: Dominance in a Control Flow Graph Immediate dominator • For any node x, there must be a y in DOM(x ) closest to x • We call this y the immediate dominator of x - x cannot be its own immediate dominator, unless x is n0 A m ← a + b • As a matter of notation, we write this as IDOM(x ) n ← a + b C

q ← a + b r ← c + d B D E Block DOM IDOM p ← c + d e ← b + 18 e ← a + 17 A A — r ← c + d s ← a + b t ← c + d B A,B A u ← e + f u ← e + f C A,C A D A,C,D C v ← a + b F w ← c + d E A,C,E C x ← e + f F A,C,F C G y ← a + b G A,G A z ← c + d 17 Computing Dominators

To use dominance information, we need to compute DOM sets

• A node n dominates node m iff n is on every path from n0 to m - Every node dominates itself, by definition - n’s immediate dominator is the node in DOM(n ) that is closest to n in the graph

IDOM(n) ≠ n, unless n is n0 , by convention. 18 Computing Dominators

Computing DOM • We formulate the computation of DOM sets as a data-flow analysis (DFA) problem • Simultaneous equations over sets associated with the nodes in the CFG

Equations relate the set value at a node n to those of its predecessors and successors.

Initially, DOM(n) = N, ∀ n ≠ n0 .

1. DOM(n0 ) = { n0 } 2. DOM(n) = { n } ∪ (∩p preds(n) DOM(p)) 19 Computing Dominators

Initially, DOM(n) = N, ∀ n ≠ n0 .

1. DOM(n0 ) = { n0 }

2. DOM(n) = { n } ∪ (∩p preds(n) DOM(p))

Example:

DOM(B7) = { B7 } ∪ { DOM( B6 ) ∩ DOM( B8 )} 20 Computing Dominators A “fixed-point” data flow analysis (DFA) solution:

• Build a control-flow graph to identify the blocks & their DOM(n0 ) ← { n0 } predecessors for x ← n1 to nn • Initialize DOM(n ) appropriately, for all n in the CFG DOM(x) ← { all nodes in graph, N } Set DOM(n) to N, for all n ≠ n0 change ← true Set DOM(n0) to { n0 } while (change) • Repeatedly apply the following equation until the sets change ← false stop changing for x ← n0 to nn TEMP ← { x } ∪ (∩ypred (x) DOM(y )) if DOM(x) ≠ TEMP then DOM(n) = { n } ∪ (∩p preds(n) DOM(p)) change ← true DOM(x) ← TEMP 21 Computing Dominators Initially, DOM(n) = N, ∀ n ≠ n0 . DOM(n0 ) = { n0 } Data Flow Equation:

DOM(n) = { n } ∪ (∩p preds(n) DOM(p))

DOM(n) B0 B1 B2 B3 B4 B5 B6 B7 B8

- Ӕ0Ӗ N N N N N N N N 1 {0} {0,1} {0,1,2} {0,1,2,3} {0,1,2,3,4} {0,1,5} {0,1,5,6} {0,1,5,7} {0,1,5,8}

2 {0} {0,1} {0,1,2} {0,1,3} {0,1,3,4} {0,1,5} {0,1,5,6} {0,1,5,7} {0,1,5,8} 3 {0} {0,1} {0,1,2} {0,1,3} {0,1,3,4} {0,1,5} {0,1,5,6} {0,1,5,7} {0,1,5,8} 22 Review: Superlocal Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundancy within a block q ← a + b B Superlocal Value Numbering (SVN) r ← c + d p ← c + d Find redundancy within a EBB D r ← c + d E Dominator Value Numbering (DVN) e ← b + 18 e ← a + 17 Find redundancy in IDOM-tree s ← a + b t ← c + d u ← e + f u ← e + f 1. Find blocks DOM(p) occur on any path from entry to p. v ← a + b 2. Use hash table of DOM(p) to F w ← c + d remove redundancy x ← e + f G y ← a + b z ← c + d 23 Review: Static Single Assignment (SSA) Form

Two principles • Each name is defined by exactly one operation • Each operand refers to exactly one definition

To reconcile these principles with real code • Insert ϕ-functions at merge points to reconcile name space • Add subscripts to variable names for uniqueness

x0 ¬ ... x1 ¬ ... x ¬ ... x ¬ ...

x ¬f(x ,x ) ... ¬ x + ... 2 0 1 ¬ x2 + ... 24 Static Single Assignment (SSA)

m0 ← a + b A m ← a + b A n0 a + b n ← a + b C ← C

B q ← a + b q0 ← a + b r ← c + d p ← c + d B r1 ← c + d D E r ← c + d p0 ← c + d D E e0 ← b + 18 e ← b + 18 e ← a + 17 r0 ← c + d e1 ← a + 17 s0 ← a + b s ← a + b t ← c + d t0 ← c + d u0 ← e + f u ← e + f u ← e + f u1 ← e + f

F F e3 ← ϕ(e0,e1) G G v ← a + b r2 (r0,r1) y ← a + b ← ϕ u2 ← ϕ(u0,u1) w ← c + d y0 a + b z ← c + d ← v0 ← a + b x ← e + f z0 ← c + d w0 ← c + d x0 ← e + f Original Form SSA Form 25 Use Dominator for Superlocal Value Numbering

m0 ← a + b SVN did not help with blocks F or G A n0 ← a + b • Multiple predecessors C • Must decide what facts hold in F and in G q0 ← a + b • Can look at F’s or G’s DOM sets B r1 ← c + d

• Or use table from IDOM(x) to start x D p0 ← c + d E -Use C for F and A for G e0 ← b + 18 r0 ← c + d e1 ← a + 17 s0 a + b -Imposes a DOM-tree application order ← t0 ← c + d u0 e + f ← u1 ← e + f

Dominator Tree F A G e3 ← ϕ(e0,e1) r2 ← ϕ(r0,r1) u2 ← ϕ(u0,u1) B C G y0 ← a + b v0 ← a + b z0 ← c + d w0 ← c + d D E F x0 ← e + f 26 Use Dominator for Superlocal Value Numbering

m0 ← a + b SVN did not help with blocks F or G A n0 ← a + b • Multiple predecessors C • Must decide what facts hold in F and in G q0 ← a + b • Can look at F’s or G’s DOM sets B r1 ← c + d

• Or use table from IDOM(x) to start x D p0 ← c + d E -Use C for F and A for G e0 ← b + 18 r0 ← c + d e1 ← a + 17 s0 a + b -Imposes a DOM-tree application order ← t0 ← c + d u0 e + f ← u1 ← e + f

Dominator Tree F A G e3 ← ϕ(e0,e1) r2 ← ϕ(r0,r1) u2 ← ϕ(u0,u1) B C G y0 ← a + b v0 ← a + b z0 ← c + d w0 ← c + d D E F x0 ← e + f 27 Use Dominator for Superlocal Value Numbering

m0 ← a + b SVN did not help with blocks F or G A n0 ← a + b • Multiple predecessors C • Must decide what facts hold in F and in G q0 ← a + b • Can look at F’s or G’s DOM sets B r1 ← c + d

• Or use table from IDOM(x) to start x D p0 ← c + d E -Use C for F and A for G e0 ← b + 18 r0 ← c + d e1 ← a + 17 s0 a + b -Imposes a DOM-tree application order ← t0 ← c + d u0 e + f ← u1 ← e + f

Dominator Tree F A G e3 ← ϕ(e0,e1) r2 ← ϕ(r0,r1) u2 ← ϕ(u0,u1) B C G y0 ← a + b v0 ← a + b z0 ← c + d w0 c + d D E F ← x0 ← e + f 28 Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundancy within a block q ← a + b B Superlocal Value Numbering (SVN) r ← c + d p ← c + d Find redundancy within a EBB D E r ← c + d Dominator Value Numbering (DVN) e ← b + 18 e ← a + 17 Find redundancy in IDOM-tree s ← a + b t ← c + d u ← e + f u ← e + f DVN finds 8 redundant ops.

v ← a + b DVN misses 2 redundant ops! F w ← c + d x ← e + f Another shortcoming of DVN G y ← a + b No loop-carried redundancies z ← c + d 29 Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundancy within a block q ← a + b B Superlocal Value Numbering (SVN) r ← c + d p ← c + d Find redundancy within a EBB D r ← c + d E Dominator Value Numbering (DVN) e ← b + 18 e ← a + 17 Find redundancy in IDOM-tree s ← a + b t ← c + d u ← e + f u ← e + f Global Common Subexpression Elimination (GCSE) v ← a + b Find redundancy in all basic F w ← c + d x ← e + f blocks G y ← a + b z ← c + d 30 Global Redundancy Elimination Availability definition:

An expression is available on entry to block b if and only if, along every path from n0 to b, it has been evaluated and none of its constituent subexpressions has been re-defined. 31 Data Flow Analysis for Expression Availability

Annotate each block b with a set AVAIL(b) x+y ∈ AVAIL(b) iff x+y is available at the entry to block b

AVAIL(b) = ∩ x ∈ predecessors(b) (DEEXPR(x) ∪ (AVAIL(x) - EXPRKILL(x)))

DEEXPR(x): the set of expressions defined in block x and not killed before the end of block x (downward exposed expressions) EXPRKILL(x): the set of expressions killed in block x u = f + e (killed expressions) l = b + u

Initial Conditions: a = b + c EXPRKILL: {f+e} AVAIL(n0 ) = ∅ f = a DEExpr: {b+c} AVAIL(x) = { all expressions } AVAIL: {b+c, b+u} ...... 32 Computing Availability

To compute AVAIL sets:

1. Build a control-flow graph (CFG) 2. For each block b in the CFG, compute DEEXPR and EXPRKILL 3. Solve the equations for AVAIL I. Use a data-flow solver (e.g., round-robin iterative solver) II. The AVAIL equations are well-behaved (unique fixed point, fast convergence)

AVAIL(b) = ∩ x ∈ predecessors(b) (DEEXPR(x) ∪ (AVAIL(x) - EXPRKILL(x))) 33 Example AVAIL(A) = Ø A m ← a + b AVAIL(B) = {a+b} ∪ (Ø ∩ all) n ← a + b = {a+b} C AVAIL(C) = {a+b} AVAIL(D) = {a+b,c+d} ∪ ({a+b,c+d} q ← a + b B ∩ all) r ← c + d p ← c + d = {a+b,c+d} D r ← c + d E AVAIL(E) = {a+b,c+d} AVAIL(F) = [{b+18,a+b,e+f} ∪ e ← b + 18 e ← a + 17 ({a+b,c+d} ∩ {all - e+f})] s ← a + b t ← c + d ∩ [{a+17,c+d,e+f} ∪ u ← e + f u ← e + f ({a+b,c+d} ∩ {all - e+f})] = {a+b,c+d,e+f} v ← a + b AVAIL(G) = [ {c+d} ∪ ({a+b} ∩ all)] F w ← c + d ∩ [{a+b,c+d,e+f} ∪ x ← e + f ({a+b,c+d,e+f} ∩ all)] G y ← a + b = {a+b,c+d} z ← c + d A B C D E F G DEEXPR a+b c+d a+b,c+d b+18,a+b,e+f a+17,c+d,e+f a+b,c+d,e+f a+b,c+d EXPRKILL { } { } { } e+f e+f { } { } 34 Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundancy within a block q ← a + b B Superlocal Value Numbering (SVN) r ← c + d p ← c + d Find redundancy within a EBB D r ← c + d E Dominator Value Numbering (DVN) e ← b + 18 e ← a + 17 Find redundancy in IDOM-tree s ← a + b t ← c + d u ← e + f u ← e + f Global Common Subexpression Elimination (GCSE) v ← a + b Find redundancy in all basic blocks F w ← c + d AVAIL(F) = {a+b,c+d,e+f} AVAIL(G) = {a+b,c+d} x ← e + f G y ← a + b z ← c + d 35 Value Numbering

A m ← a + b n ← a + b Local Value Numbering (LVN) C Find redundancy within a block q ← a + b B Superlocal Value Numbering (SVN) r ← c + d p ← c + d Find redundancy within a EBB D r ← c + d E Dominator Value Numbering (DVN) e ← b + 18 e ← a + 17 Find redundancy in IDOM-tree s ← a + b t ← c + d u ← e + f u ← e + f Global Common Subexpression Elimination (GCSE) v ← a + b Find redundancy in all basic blocks F w ← c + d x ← e + f GCSE finds 10 redundant ops. G y ← a + b GCSE misses 0 redundant ops! z ← c + d 36 Summary Redundancy Elimination A m ← a + b • Local Value Numbering (LVN) n ← a + b LVN - Finds redundancy, constants, & algebraic identities in a block C q ← a + b SVN • Superlocal Value Numbering (SVN) B r ← c + d - Extends local value numbering to EBBs p ← c + d r ← c + d LVN D E SVN - Use SSA-like name space to simplify bookkeeping e ← b + 18 e ← a + 17 • Dominator Value Numbering (DVN) s ← a + b SVN t ← c + d u ← e + f u ← e + f - Extends scope to “almost” global (no back edges) - Use dominance information to handle join points in CFG v ← a + b DVN F w ← c + d DVN • Global Common Subexpression Elimination (GCSE) x ← e + f GCSE - Calculate at every basic block G y ← a + b DVN - Use data-flow analysis to calculate available expressions z ← c + d GCSE 37 Comparison of Redundancy Elimination Techniques

Basis of Name Scope Operates On Loops Identity LVN Local blocks value no

SVN Superlocal EBBs value no

DVN Regional dom. Tree value no

GCSE Global CFG lexical yes 38 Review: Compiler Middle End (Optimizer)

Front IR Middle IR Back Machine End End End code

Errors

• Front end works from the syntax of the source code • Rest of the compiler works from intermediate representation (IR) - Analyzes IR to learn about the code - Transforms IR to improve final code - Examples 1. Parallelization 2. Redundancy elimination 3. Data flow analysis (DFA) framework 4. Single static assignment (SSA) 39 Review: Compiler Middle End (Optimizer)

Front IR Middle IR Back Machine End End End code

Errors

• Front end works from the syntax of the source code • Rest of the compiler works from intermediate representation (IR) - Analyzes IR to learn about the code - Transforms IR to improve final code - Examples 1. Parallelization 2. Redundancy elimination 3. Data flow analysis (DFA) framework 4. Single static assignment (SSA) Live Variables

A variable v is live at point p if and only if there is a path from p to a use of v along which v is not redefined.

Uses: e = b + c 1. Global 2. Improve SSA construction: a = b + c e = a reduce # of φ-functions 3. Detect references to uninitialized variables & defined but not used variables a = e + c 4. Drive transformations: useless-store elimination

40 Equations for Live Variables

Data Flow Equation (nf is the exit node of the CFG)

LIVEOUT(nf) = ϕ

LIVEOUT(b) = ∪ x∈succ(b) (UEEXPR(x) ∪ (LIVEOUT(x) ∩ VARKILL(x) ))

• LIVEOUT(n) contains the name of every variable that is live on exit from n • UEVAR(n) contains the upward-exposed variables in n, i.e. those that are used in n before any redefinition in n • VARKILL(n) contains all the variables that are defined in n

41 Three Steps in Live Variable Analysis

1. Build a CFG 2. Gather the initial information for each block 3. Use an iterative fixed-point algorithm to propagate information around the CFG

LIVEOUT(nf) = ϕ

LIVEOUT(b) = ∪ x∈succ(b) (UEEXPR(x) ∪ (LIVEOUT(x) ∩ VARKILL(x) ))

42 43 Live Variable Analysis Terms • Postorder: visits as many of a nodes’ children as possible before visiting the node • Reverse Postorder (RPO): visits as many of a nodes’ predecessors as possible before visiting the node • Forward problem: RPO on CFG. • Backward problem: Postorder on CFG or RPO on reverse CFG.

4 1

2 3 3 2

1 4

Postorder Reverse Postorder

44 Comparison with AVAIL Analysis

What’s in common? AVAIL(b) = ∩x∈pred(b) (DEEXPR(x) ∪ (AVAIL(x) ∩ EXPRKILL(x) )) • Three steps

• Fixed-point algorithm LIVEOUT(b) = What is different? ∪ x∈succ(b) (UEEXPR(x) ∪ (LIVEOUT(x) ∩ VARKILL(x) ))

• AVAIL: domain is a set of expressions LIVEOUT: domain is a set of variables • AVAIL: forward problem LIVEOUT: backward problem • AVAIL: intersection of all paths (all path problem0 LIVEOUT: union of all paths (any path problem)

45 46 Reaching Definitions A definition d of some variable v reaches operation i iff i uses the defined value of v before v is redefined

• REACHES(n): the set of definitions that reach the start of node n • DEDEF(n): the set of downward-exposed definitions in n i.e., their defined variables are not redefined before leaving n • DEFKILL(n): all definitions killed by a definition in n.

• Initial Condition:

REACHES(no) = Ø • Data Flow Equation:

REACHES(n) = ∪ m ∈ proc(n) DEDEF(m) ∪ (REACHES(m) ∩ DEFKILL(m)) 47 Example: Def-Use Chains

a ← 5 b ← 3 c ← b + 2 d ← a - 2 d is dead It has no use

e ← a + b e ← 13 e ← e + c

f ← 2 + e Write f 48

More data flow analysis in next class 49

Global Register Coloring 50 Global Register Allocation

The Big Picture • At each point in the code • Determine which values will reside in registers • Select a register for each such value • The goal is an allocation that “minimizes” running time

Most modern, global allocators use a graph-coloring paradigm • Build a “conflict graph” or “interference graph” • Find a k-coloring for the graph, where k is the number of available registers, or change the code to a nearby problem that it can k-color 51 Building the Interference Graph What is an “interference” ? (or conflict) • Two values interfere if there exists an operation where both are simultaneously live • If x and y interfere, they cannot occupy the same register • Interference graph construction relies on LIVE information

The interference graph, GI = (NI , EI ) • Nodes in GI represent values, or live ranges • Edges in GI represent individual interferences For x, y NI , Ei iff x and y interfere

A k-coloring of GI can be mapped into an allocation to k registers 52 Live Range

Live ranges are simpler in a single block

A value is live from its definition to its last use. • A live range is just the interval in the block from first definition to last use • In a single block, live ranges form an interval graph

A simple Example: # Operation a ← … a’s live range is [0,4] 0 b’s live range is [1,3] 1 b ← … e’s live range is [4,5] 2 c ← … a … Live ranges may, of course, overlap 3 d ← … b … a & b are simultaneously live, so they cannot occupy the same PR 4 e ← … a … a & e can occupy the same PR, as could b & e 5 f ← … e … 53 Live Range In the multi-block case, live ranges are more complex . • Consider x, y & z in the code to the right x ¬ … ‣ x has 2 different live ranges z ¬ …

B1 y ¬ …

x ¬ y B3 … ¬ y y ¬ x y ¬ z x ¬ …

B4 … ¬ … z ¬ …

B 5 … ¬ x + y + z 54 Live Range In the multi-block case, live ranges are more complex . Consider x, y & z in the code to the right • x ¬ … ‣ x has 2 different live ranges z ¬ … ‣ y has 2 different live ranges B1 y ¬ …

x ¬ y B3 … ¬ y y ¬ x y ¬ z x ¬ …

B4 … ¬ … z ¬ …

B 5 … ¬ x + y + z 55 Live Range In the multi-block case, live ranges are more complex . • Consider x, y & z in the code to the right ‣ x has 2 different live ranges ‣ y has 2 different live ranges ‣ z has 1 live range 56 Finding Live Ranges We can use SSA to find live ranges in a simple way • Build static single assignment form (SSA form) • Consider each SSA name a set • At each phi-function, union together the sets of the phi-function arguments • Each remaining set is a live range • Rename into live ranges

SSA construction details will be covered in next lecture. 57 Live Range

B0 Example in (Pruned) SSA Form x0 ¬ … z0 ¬ … • Each name is defined in exactly one operation

• Each use refers to one name B1 z1 ¬ Φ(z0,z2) ¬ • Live ranges are y0 …

1. (x0,x2,x3) and (x1) B3 x1 ¬ y0 … ¬ y0 ¬ ¬ 2. (y0) and (y1,y2,y3) y1 x1 y2 z1 x2 ¬ … 3. (z0,z1,z2)

As predicted several slides ago B4 x3 ¬ Φ(x2,x0) y3 ¬Φ(y1,y2) … ¬ …

z2 ¬ …

B5 … ¬ x3 + y3 + z2 58 Rename

B0 ¬ x0 … B0 x0 ¬ … ¬ z0 … z0 ¬ …

B1 z ¬ Φ(z ,z ) 1 0 2 B1 y0 ¬ … y0 ¬ …

B3 x1 ¬ y0 … ¬ y0 B2 x1 ¬ B3 … ¬ y1 ¬ x1 y2 ¬ z1 y0 y0 x2 ¬ … y1 ¬ x1 y1 ¬ z0 x0 ¬ …

B4 x3 ¬ Φ(x2,x0) B4 y3 ¬Φ(y1,y2) … ¬ … … ¬ … z0 ¬ … z2 ¬ …

B5 B5 … ¬ x3 + y3 + z2 … ¬ x0 + y1 + z0 59 Chaitin’s Allocation Algorithm Assume we have k physical registers: k-coloring problem

Observation: Any vertex n that fewer than k neighbors in the interference graph can always be colored. Algorithm: 1. Pick any vertex n such that deg(n) < k and push it into the stack S. Remove n and all its incident edges from the interference graph. If there does not exist such a vertex with degree < k, prune nodes from the graph until we can find a vertex with degree < k. The pruning criteria can be customized. 2. Repeat step 1 until no vertex is left in the graph. 3. Successively pop vertices off the stack S and color one node at one time. 60 Example 61 Example 62 Example 63 Example 64 Example 65 Example 66 Example 67 Example 68 Example 69 Example 70 Chaitin-Briggs Algorithm An Improvement over Chaitin’s algorithm Observation: A node that has more than k-1 neighbors is not necessarily un-colorable.

Brigg’s idea: • Keep the pruned nodes as coloring candidates and still push them into stack • When you pop it off, a color might be available for it. If so, color it.

Maximum degree is a loose upper bound on colorability 71 Example 72 Example 73 Example 74 Example 75 Example 76 Example 77 Example

Copyright 2013. Keith D. Cooper & Linda Torczon. Rice University 78 Example

Copyright 2013. Keith D. Cooper & Linda Torczon. Rice University 79 Example

Copyright 2013. Keith D. Cooper & Linda Torczon. Rice University 80 Example

Copyright 2013. Keith D. Cooper & Linda Torczon. Rice University 81 Example

Copyright 2013. Keith D. Cooper & Linda Torczon. Rice University 82 Reading

• Engineering A Compiler - Chapter 5.3.4 - Chapter 5.5 - Chapter 8.6.1 - Chapter 9