<<

Program Control Flow  Control flow  Sequence of operations  Representations

Control Flow Analysis  Control flow graph

 Control dependence

 Call graph

COMP 621 – Program Analysis and  Transformations  Analyzing program to discover its control These slides have been adapted from structure http://cs.gmu.edu/~white/CS640/Slides/CS640202.ppt  Today’s topic: CFGbased analysis by Professor Liz White.

2 Control Flow Analysis

Control Flow Graph Basic Blocks  CFG models flow of control in the program (procedure)  Definition  G = (N, E) as a directed graph  A is a maximal sequence of  Node n ∈ N: basic blocks consecutive statements with a single entry  A basic block is a maximal sequence of stmts with a single entry point, single exit point, and no internal branches point, a single exit point, and no internal  For simplicity, we assume a unique entry node n 0 and a unique exit branches node n in later discussions  Edge e=(n i, n j) ∈ E: possible transfer of control from block n i to  Basic unit in control flow analysis block n j  Local level of code optimizations if (x==y) if (x==y)  Redundancy elimination then { } else { }  Registerallocation .

3 4 Control Flow Analysis Control Flow Analysis

Basic Block Example Basic Block Example (1) i := m – 1 (1) i := m – 1 (2) j := n • How many basic blocks (2) j := n • How many basic blocks (3) t1 := 4 * n in this code fragment? (3) t1 := 4 * n in this code fragment? (4) v := a[t1] (4) v := a[t1] (5) i := i + 1 • What are they? • What are they? (5) i := i + 1 (6) t2 := 4 * i (7) t3 := a[t2] (6) t2 := 4 * I (8) if t3 < v goto (5) (7) t3 := a[t2] (9) j := j – 1 (8) if t3 < v goto (5) (10) t4 := 4 * j (9) j := j – 1 (11) t5 := a[t4] (10) t4 := 4 * j (12) if t5 > v goto (9) (11) t5 := a[t4] (13) if i >= j goto (23) (12) if t5 > v goto (9) (14) t6 := 4*i (13) if i >= j goto (23) (15) x := a[t6] (14) t6 := 4*I … (15) x := a[t6] 5 6 Control Flow Analysis Control Flow Analysis

1 Identify Basic Blocks Example Input : A sequence of intermediate code (1) i := m – 1 (16) t7 := 4 * i statements (2) j := n (17) t8 := 4 * j (3) t1 := 4 * n (18) t9 := a[t8] 1. Determine the leaders , the first statements of (4) v := a[t1] (19) a[t7] := t9 basic blocks (5) i := i + 1 (20) t10 := 4 * j • The first statement in the sequence (entry point) is a (6) t2 := 4 * i (21) a[t10] := x leader (7) t3 := a[t2] (22) goto (5) • Any statement that is the target of a branch (8) if t3 < v goto (5) (23) t11 := 4 * i (conditional or unconditional) is a leader (9) j := j 1 (24) x := a[t11] (10) t4 := 4 * j (25) t12 := 4 * i • Any statement immediately following a branch (conditional or unconditional) or a return is a leader (11) t5 := a[t4] (26) t13 := 4 * n (12) If t5 > v goto (9) (27) t14 := a[t13] 2. For each leader, its basic block is the leader (13) if i >= j goto (23) (28) a[t12] := t14 and all statements up to, but not including, the (14) t6 := 4*i (29) t15 := 4 * n next leader or the end of the program (15) x := a[t6] (30) a[t15] := x

7 8 Control Flow Analysis Control Flow Analysis

Example: Leaders Example: Basic Blocks (1) i := m – 1 (16) t7 := 4 * i (1) i := m – 1 (16) t7 := 4 * i (2) j := n (17) t8 := 4 * j (2) j := n (17) t8 := 4 * j (3) t1 := 4 * n (18) t9 := a[t8] (3) t1 := 4 * n (18) t9 := a[t8] (4) v := a[t1] (19) a[t7] := t9 (4) v := a[t1] (19) a[t7] := t9 (5) i := i + 1 (20) t10 := 4 * j (5) i := i + 1 (20) t10 := 4 * j (6) t2 := 4 * i (21) a[t10] := x (6) t2 := 4 * i (21) a[t10] := x (7) t3 := a[t2] (22) goto (5) (7) t3 := a[t2] (22) goto (5) (8) if t3 < v goto (5) (23) t11 := 4 * i (8) if t3 < v goto (5) (23) t11 := 4 * i (9) j := j - 1 (24) x := a[t11] (9) j := j - 1 (24) x := a[t11] (10) t4 := 4 * j (25) t12 := 4 * i (10) t4 := 4 * j (25) t12 := 4 * i (11) t5 := a[t4] (26) t13 := 4 * n (11) t5 := a[t4] (26) t13 := 4 * n (12) If t5 > v goto (9) (27) t14 := a[t13] (12) If t5 > v goto (9) (27) t14 := a[t13] (13) if i >= j goto (23) (28) a[t12] := t14 (13) if i >= j goto (23) (28) a[t12] := t14 (14) t6 := 4*i (29) t15 := 4 * n (14) t6 := 4*i (29) t15 := 4 * n (15) x := a[t6] (30) a[t15] := x (15) x := a[t6] (30) a[t15] := x

9 10 Control Flow Analysis Control Flow Analysis

Generating CFGs Example: CFG

 Partition intermediate code into basic blocks (1) i := m – 1 (16) t7 := 4 * i (2) j := n (17) t8 := 4 * j  Add edges corresponding to control flows (3) t1 := 4 * n (18) t9 := a[t8] between blocks (4) v := a[t1] (19) a[t7] := t9  Unconditional goto (5) i := i + 1 (20) t10 := 4 * j (6) t2 := 4 * i (21) a[t10] := x  Conditional branch – multiple edges (7) t3 := a[t2] (22) goto (5)  Sequential flow – control passes to the next block (if (8) if t3 < v goto (5) (23) t11 := 4 * i no branch at the end) (9) j := j - 1 (24) x := a[t11] (10) t4 := 4 * j (25) t12 := 4 * i  If no unique entry node n or exit node n , add 0 f (11) t5 := a[t4] (26) t13 := 4 * n dummy nodes and insert necessary edges (12) If t5 > v goto (9) (27) t14 := a[t13]

 Ideally no edges entering n0; no edges exiting nf (13) if i >= j goto (23) (28) a[t12] := t14 (14) t6 := 4*i (29) t15 := 4 * n  Simplify many analysis and transformation (15) x := a[t6] (30) a[t15] := x

11 12 Control Flow Analysis Control Flow Analysis

2 CFG and HL code Complications in CFG Construction I = 1 1  Function calls J = 1 K = 1  Instruction may prefer function calls as L = 1 2 basic block boundaries repeat 3  Special functions as setjmp() and longjmp() if (P) then begin 7 J = I  4 5 if (Q) then L = 2  Ambiguous jump else L = 3  Jump r1 //target stored in register r1 K = K + 1 6 end  Static analysis may generate edges that never occur else K = K + 2 8 9 at runtime

print (I,J,K,L)  Record potential targets if possible repeat 10  Jumps target outside the current procedure if (R) then L = L + 4 11 12 until (S)  PASCAL, Algol: still restricted to lexically enclosing I = I + 6 procedure until () 13 14 Control Flow Analysis Control Flow Analysis

Nodes in CFG Depth First Traversal

 CFG is a rooted, directed graph  Given a CFG = A → ∈  Entry node as the root  If there is an edge ni nj E  Depthfirst traversal (depthfirst searching)  ni is a predecessor of nj B  Idea: start at the root and explore as far/deep as  nj is a successor of ni possible along each branch before backtracking  For any node n ∈ N  Can build a spanning tree for the graph  Pred(n) : the set of predecessors of n  Spanning tree of a directed graph G contains all  Succ(n) : the set of successors of n nodes of G such that  A branch node is a node that has more than one successor  There is a path from the root to any node reachable in the original graph and  A join node is a node that has more than one predecessor  There are no cycles

15 16 Control Flow Analysis Control Flow Analysis

DFS Spanning Tree DFST Example

Nodes are numbered procedure span(v) /* v is a node in the A 1 in the order visited graph */ during the search B == depth first pre-order InTree(v) = true C numbering . For each w that is a successor of v do

if (!InTree(w)) then D Add edge v  w to spanning tree span(w) E F end span G J

 Initial: span(n 0) I H

17 18 Control Flow Analysis Control Flow Analysis

3 DFST Example CFG Edges Classification

Nodes are numbered Edge x  y in a CFG is an A 1 in the order visited 2 during the search  Advancing edge – if x is an ancestor of y B == depth first pre-order in the tree C 3 Numbering .  Tree edge – if part of the spanning tree  Forward edge – if not part of the spanning tree D 4 and x is an ancestor of y in the tree 5 9 E F  Retreating edge – if not part of the spanning tree and y is an ancestor of x in 10 G 6 J the tree  I H 7 Cross edge – if not part of the spanning 8 tree and neither is an ancestor of the other

19 20 Control Flow Analysis Control Flow Analysis

DFST Example Back Edges and Reducibility

Tree Edge  An edge x  y in a CFG is a back edge if A Forward Edge Retreating Edge every path from the entry node of the flow B Cross Edge graph to x goes through y C  y dominates x : more details later

D  Every back edge is a retreating edge  Vice versa? E F  A flow graph is reducible if all its retreating G J edges in any DFST are also back edges  I H Flow graphs that occur in practice are almost always reducible

21 22 Control Flow Analysis Control Flow Analysis

NonReducible Graphs Nodes Ordering wrt DFST  Testing reducibility: Take any DFST for the  Enhanced depthfirst spanning tree algorithm: time =0; flow graph, remove the back edges, and procedure span(v) /* v is a node in the graph */ check that the result is acyclic InTree(v) = true; d[v] = ++time; For each w that is a successor of v do A if (!InTree(w)) then Add edge v  w to spanning tree span(w) B C f[v]=++time; end span

In any DFST, one  Associate two numbers to each node v in the graph of these edges will  d[v]: discovery time of v in the spanning be a retreating edge  f[v]: finish time of v in the spanning

23 24 Control Flow Analysis Control Flow Analysis

4 Nodes Ordering wrt DFST Ordering Example  Preordering  Ordering of vertices based on discovery time D 5  Postordering 7 6 E F  Ordering of vertices based on finish time  Reverse postordering G 8  The reverse of a postordering, i.e. ordering of vertices in the opposite order of their finish time  Preordering: DEGF  Not the same as preordering  Postordering: GEFD  Commonly used in forward data flow analysis  Reverse postordering: DFEG  Backward data flow analysis: RPO on the reverse CFG 25 26 Control Flow Analysis Control Flow Analysis

Big Picture Regions in CFG

Why care about ordering / back edges?  Extended basic block (EBB)  CFGs are commonly used to propagate  EBB is a maximal set of nodes in a CFG that contains information between nodes (basic blocks) no join nodes other than the entry node

 Data flow analysis  A single entry and possibly multiple exits

 Some optimizations like value numbering and instruction  The existence of back edges / cycles in flow graphs indicates that we may need to traverse scheduling are more effective if applied in EBBs the graph more than once  Natural loop  Iterative algorithms: when to stop? How quickly can  Loop is a collection of nodes in a CFG such that we stop?  All nodes in the collection are strongly connected, and  The collection of nodes has a unique entry : the only way to  Proper ordering of nodes during iterative visit the loop from outside algorithm assures number of passes limited by  A loop that contains no other loops is an inner loop the number of “nested” back edges  Main target of program optimizations

27 28 Control Flow Analysis Control Flow Analysis

EBB Example Dominance

Maxsize EBBs:  Node d of a CFG dominates node n if every path A {A,B}, {C,J}, from the entry node of the graph to n passes {D,E,F}, {G,H,I} B through d (d dom n ) C Loops?  Dom( n): the set of dominators of node n Not that obvious  Every node dominates itself: n ∈ Dom( n) Can use dominator  Node d strictly dominates n if d ∈ Dom( n) and d ≠ n D based loop detection  Dominancebased loop recognition: entry of a loop E F dominates all nodes in the loop

 Each node n has a unique immediate dominator G J m which is the last dominator of n on any path from the entry to n (m idom n ), m ≠ n I H  The immediate dominator m of n is the strict dominator of n that is closest to n 29 30 Control Flow Analysis Control Flow Analysis

5 Dominator Example Dominator Trees

Block Dom IDom 1 1 1 1 {1} — 2 2 3 2 2 {1,2} 1 3 3 4 3 {1,3} 1 4 4 5 6 7 4 {1,3,4} 3 5 6 5 6 5 {1,3,4,5} 4 7 8 7 10 6 {1,3,4,6} 4 8 9 8 7 {1,3,4,7} 4 9 10 9 10 8 {1,3,4,7,8} 7

9 {1,3,4,7,8,9} 8  In a dominator tree, a node’s parent is its immediate 10 {1,3,4,7,8,10} 8 dominator

31 32 Control Flow Analysis Control Flow Analysis

Other sets of interest Example 2 Block SDom Dom 1 1 Block Dom IDom Domn 1 1 2 1 {} {1,2,3,4,5,6,7,8,9,10} 2 2 3 3 2 {1} {2} 7 3 4 3 {1} {3,4,5,6,7,8,9,10} 4 4 5 5 4 {1,3} {4,5,6,7,8,9,10} 5 6 6 6 7 5 {1,3,4} {5} 7 6 {1,3,4} {6} 8 9 8 8 7 {1,3,4} {7,8,9,10} 9 9 10 10 8 {1,3,4,7} {8,9,10} 10 11 12 9 {1,3,4,7,8} {9} 11 10 {1,3,4,7,8} {10} 12 33 34 Control Flow Analysis Control Flow Analysis

Example 2 Algorithm: Computing DOM 1 Block Dom IDom  An iterative fixedpoint calculation 1 1 N is the set of nodes in the CFG 2 2 1,2 1 DOM( n0) = { n0} ( n0 is the entry) 3 1,2,3 2 For all nodes x ≠ n 3 7 0 4 1,2,3,4 3 DOM( x) = N 4 5 5 1,2,3,5 3 6 1,2,3,6 3 Until no more changes to dominator sets 6 7 1,2,7 2 for all nodes x ≠ n0 DOM( x) = { x } + ( ∩ DOM(P) ) for all predecessors P of x 8 9 8 1,2,8 2 9 1,2,8,9 8 10 10 1,2,8,9,10 9 11 12 11 1,2,8,9,11 9  At termination, node d in DOM( n) iff d dominates n 12 1,2,8,9,11,12 11

35 36 Control Flow Analysis Control Flow Analysis

6 Dominator Example Dominator Example initial iteration1 Dom 0 0 {0} {0} 0 Block initial iteration1 iteration2 1 N {1} + (Dom(0) ∩∩∩ Dom(9)) = {0,1} 0 {0} {0} {0} 1 1 2 N {2} + Dom(1) = {0,1,2} 1 N {0,1} {0,1} 2 2 3 N {3} + (Dom(1) ∩∩∩ Dom(2) ∩∩∩ Dom(8) ∩∩∩ 2 N {0,1,2} {0,1,2} 3 Dom(4)) = {0,1,3} 3 3 N {0,1,3} {0,1,3} 4 4 N {4} + (Dom(3) ∩∩∩ Dom(7)) = {0,1,3,4} 4 4 N {0,1,3,4} {0,1,3,4} 5 6 5 N {5} + Dom(4) = {0,1,3,4,5} 5 6 5 N {0,1,3,4,5} {0,1,3,4,5} 7 6 N {6} + Dom(4) = {0,1,3,4,6} 7 6 N {0,1,3,4,6} {0,1,3,4,6} 7 N {7} + (Dom(5) ∩∩∩ Dom(6) ∩∩∩ Dom(10)) = 8 {0,1,3,4,7} 8 7 N {0,1,3,4,7} {0,1,3,4,7} 9 10 8 N {8} + Dom(7) = {0,1,3,4,7,8} 9 10 8 N {0,1,3,4,7,8} {0,1,3,4,7,8} 9 N {9} + Dom(8) = {0,1,3,4,7,8,9} 9 N {0,1,3,4,7,8,9} {0,1,3,4,7,8,9} 10 N {10} + Dom(8) = {0,1,3,4,7,8,10} 10 N {0,1,3,4,7,8,10} {0,1,3,4,7,8,10} 37 38 Control Flow Analysis Control Flow Analysis

Computing IDOM from DOM IDominator Example IDom 0 Block initial (SDOM) 1. For each node n, initially set IDOM( n) = 0 {} {} DOM( n){n} (SDOM strict dominators) 1 1 {0} {0} 2 2. For each node p in IDOM(n), see if p has 2 {0,1} {1} //0 1’s dominator 3 dominators other than itself also included in 3 {0,1} {1} //0 1’s dominator IDOM(n): if so, remove them from IDOM(n) 4 4 {0,1,3} {3} // 0,1 3’s dominators 5 6 5 {0,1,3,4} {4} // 0,1,3 4’s dominators 7 6 {0,1,3,4} {4} // 0,1,3 4’s dominators 8 7 {0,1,3,4} {4} // 0,1,3 4’s dominators  The immediate dominator m of n is the strict 8 {0,1,3,4,7} {7} // 0,1,3,4 7’s dominators 9 10 dominator of n that is closest to n 9 {0,1,3,4,7,8} {8} // 0,1,3,4,7 8’s dominators 10 {0,1,3,4,7,8} {8} // 0,1,3,4,7 8’s dominators

39 40 Control Flow Analysis Control Flow Analysis

PostDominance Postdominator Example

 Related concept Block Pdom IPdom  Node d of a CFG postdominates node n if 1 1 {3,4,7,8,10,exit} 3 2 every path from n to the exit node passes 2 {2,3,4,7,8,10,exit} 3 3 through d (d pdom n) 3 {3,4,7,8,10,exit} 4 4  Pdom(n): the set of postdominators of node n 4 {4,7,8,10,exit} 7 5 6 ∈ 5 {5,7,8,10,exit} 7  Every node postdominates itself: n 7 Pdom(n) 6 {6,7,8,10,exit} 7 8 7 {7,8,10,exit} 8  Each node n has a unique immediate post 9 10 8 {8,10,exit} 10 9 {1,3,4,7,8,10,exit} 1 dominator m exit 10 {10,exit} exit

41 42 Control Flow Analysis Control Flow Analysis

7 CFG Natural Loops 1 exit  Natural loops that are suitable for improvement have two essential properties: 2 12  A loop must have a single entry point called header 3 7 11  There must be at least one way to iterate the loop, i.e., at least one path back to the header 4 5 10 9  Identifying natural loops 6 8  Searching for back edges (n→d) in CFG whose 8 9 2 6 7 heads dominate their tails  For an edge a→b, b is the head and a is the tail

10 1 3 4 5  A back edge flows from a node n to one of n’s dominators d 11 12  The natural loop for that edge is { d}+the set of nodes that can reach n without going through d exit  d is the header of the loop 43 44 Control Flow Analysis Control Flow Analysis

Back Edge Example Identifying Natural Loops  Given a back edge n→d, the natural loop of the Block Dom IDom edge includes 1 1 1 —  Node d 2 2 1,2 1  Any node that can reach n without going through d 3 3 1,3 1 4  Loop construction 4 1,3,4 3 5 6  Set loop ={ d} 5 1,3,4,5 4 7  Add n into loop if n ≠d 6 1,3,4,6 4 8  Consider each node m≠d that we know is in loop , 7 1,3,4,7 4 make sure that m’s predecessors are also inserted in 9 10 8 1,3,4,7,8 7 loop 9 1,3,4,7,8,9 8 Back edges? 10 1,3,4,7,8,10 8

45 46 Control Flow Analysis Control Flow Analysis

Natural Loops Example Inner Loops

 A useful property of natural loops: unless two Back edge Natural loop loops have the same header, they are either 1 10 →7 {7,10,8} disjoint or one is entirely contained (nested 2 7→4 {4,7,5,6 within) the other 3 10,8} B2

4 4→3 B0 B1 {3,4,7,5,6,10,8} 5 6 8→3 B3 7 9→1 {1,9,8,7,5,6, 10,4,3,2}  An inner loop is a loop that contains no other 8 loops 9 10  Why neither {3,4} nor  Good optimization candidate {4,5,6,7} is a natural loop?  The inner loop of the previous example: {7,8,10}

47 48 Control Flow Analysis Control Flow Analysis

8 Dominance Frontiers Dominance Frontier for Node 7

 For a node n in CFG, DF( n) denotes the 1 1 dominance frontier set of n 2 3 2  DF( n) contains all nodes x s.t. n dominates an immediate predecessor of x but does not strictly 3 4 dominate x 4 5 6 7  For this to happen, there is some path from node n to 5 6 8 x, n   y  x where ( n DOM y) but !( n SDOM x) 7 9 10  Informally, DF( n) contains the first nodes reachable 8 from n that n does not strictly dominate, on each CFG 9 10 Paths of interest: path leaving n 7  4 7  8  3  Used in SSA calculation and redundancy 7  8  9  1 elimination 7  8  10  7 DF(7)={1,3,4,7}

49 50 Control Flow Analysis Control Flow Analysis

Dominance Frontier for Node 4 Computing Dominance Frontiers

1  Easiest way: 1 2 3 DF(x) = SUCC(DOM 1(x)) – SDOM 1(x) where 2 3 4 SUCC(x) = set of successors of x in the CFG  But not the most efficient 4 5 6 7

5 6 8  Observation 7 9 10  Nodes in a DF must be join nodes 8  The predecessor of any join node j must have 9 10 Paths of interest: DF(4)={1,3,4} j in its DF unless it dominates j  The dominators of j’s predecessors must have j in their DF sets unless they also dominate j

51 52 Control Flow Analysis Control Flow Analysis

Computing Dominance Frontiers Computing Dominance Frontier 1 for all nodes n, initialize DF(n) =Ø 0 2 for all nodes n 3 if n has multiple predecessors, then 1 4 for each predecessor p of n 2 5 6 7 runner = p 3 while (runner ≠IDom(n)) 4 8 DF(runner) = DF(runner) ∪ {n} 5 6 9 10 runner = IDom(runner) 7 Join node 1: runner = 0 = IDom(1)  First identify join nodes j in CFG 8 runner = 9 : DF(9) += {1} runner = 8 : DF(8) += {1}  Starting with j’s predecessors, walk up the dominator tree 9 10 runner = 7 : DF(7) += {1} until we reach the immediate dominator of j runner = 4 : DF(4) += {1}  Node j should be included in the DF set of all the nodes we pass runner = 3 : DF(3) += {1} by except for j’s immediate dominator runner = 1 : DF(1) += {1} runner = 0 = IDom(1) 53 54 Control Flow Analysis Control Flow Analysis

9 Computing Dominance Frontier Computing Dominance Frontier 1 1 0 0 2 3 2 3

1 4 1 4 2 2 3 5 6 7 3 5 6 7

4 8 4 8 5 6 9 10 5 6 9 10 7 Join node 3: 7 Join node 4: runner = 1 = IDom(3) runner = 3 = IDom(4) 8 runner = 2: DF(2) += {3} 8 runner = 7: DF(7) += {4} runner = 4: DF(4) += {3} runner = 4: DF(4) += {4} 9 10 runner = 3: DF(3) += {3} 9 10 Join node 7: runner = 8 : DF(8) += {3} runner = 5: DF(5) += {7} runner = 7 : DF(7) += {3} runner = 6: DF(6) += {7} runner = 10: DF(10) += {7} 55 runner = 8: DF(8) += {7} 56 Control Flow Analysis Control Flow Analysis runner = 7: DF(7) += {7}

Dominance Frontier Example Example 2 1 Bloc DF 0 Block DF k 1 {1} 2 1 1 2 {3} 2 2 3 7 3 {1,3} 3 3 4 {1,3,4} 4 5 4 4 5 5 {7} 5 6 6 6 6 {7} 7 8 9 7 7 {1,3,4,7} 8 8 8 {1,3,7} 10 9 9 10 9 {1} 11 12 10 10 {7} 11

57 12 58 Control Flow Analysis Control Flow Analysis

Dominatorbased Analysis Summary  Idea  CFG construction  Use dominators to discover loops for  Basic blocks identification optimization  CFG traversal  Depthfirst spanning tree  Advantages  Vertex ordering  Sufficient for use by iterative dataflow analysis and optimizations  CFG analysis  Important regions: EBB and loop  Least timeintensive to implement  Dominators  Favored by most current optimizing compilers  Dominance frontiers  Alternative approach  Additional references  Intervalbased analysis/structural analysis  Advanced compiler design and implementation, by S. Muchinick, Morgan Kaufmann

59 60 Control Flow Analysis Control Flow Analysis

10