10/14/2019

Organization

• Dominator relation of CFGs Dominators, – postdominator relation control-dependence • Dominator and SSA form • Computing dominator relation and tree – Dataflow algorithm – Lengauer and Tarjan algorithm • Control-dependence relation • SSA form

12

Control-flow graphs Dominators

START • CFG is a START • Unique node START from which • In a CFG G, node a is all nodes in CFG are reachable a said to dominate node a • Unique node END reachable from b b all nodes b if every path from • Dummy edge to simplify c c discussion START  END START to b contains • Path in CFG: sequence of nodes, de de possibly empty, such that a. successive nodes in sequence are connected in CFG by edge f • Dominance relation: f – If x is first node in sequence and y is last node, we will write the path g relation on nodes g as x * y – If path is non-empty (has at least – We will write a dom b one edge) we will write x + y END if a dominates b END

34

1 10/14/2019

Example Computing dominance relation

• Dataflow problem: START START A B C D E F G END A START x xxxxxxx x Domain: powerset of nodes in CFG A x x x x xxx B B x x x xxx N C C Dom(N) = {N} U ∩ Dom(M) D x M ε pred(N) DE E x F xx F G x Find greatest solution. END x G Work through example on previous slide to check this. Question: what do you get if you compute least solution? END

56

Properties of dominance Example of proof

• Dominance is • Let us prove that dominance is transitive. – reflexive: a dom a – Given: a dom b and b dom c – anti-symmetric: a dom b and b dom a  a = b – Consider any path P: START + c – transitive: a dom b and b dom c  a dom c – Since b dom c, P must contain b. – tree-structured: – Consider prefix of P = Q: START + b • a dom c and b dom c  a dom b or b dom a – Q must contain a because a dom b. • intuitively, this means dominators of a node are – Therefore P contains a. themselves ordered by dominance

78

2 10/14/2019

Dominator tree example Computing dominator tree

START • Inefficient way: START a – Solve dataflow equations to compute full dominance relation b a END c – Build tree top-down c b • Root is START de • For every other node f f d e – Remove START from its dominator set g g – If node is then dominated only by itself, add node as child of START in dominator tree

END • Keep repeating this process in the obvious way

Check: verify that from dominator tree, you can generate full relation

910

Building dominator tree directly Immediate dominators

• Algorithm of Lengauer and Tarjan • Parent of node b in tree, if it exists, is – Based on depth-first search of graph called the immediate dominator of b –O(E*(E)) where E is number of edges in – written as idom(b) CFG – idom not defined for START – Essentially linear time • Intuitively, all dominators of b other than b • Linear time algorithm due to Buchsbaum itself dominate idom(b) et al – In our example, idom(c) = a – Much more complex and probably not efficient to implement except for very large graphs

11 12

3 10/14/2019

Useful lemma Postdominators

• Lemma: Given CFG G • Given a CFG G, node b is said to postdominate and edge ab, idom(b) START dominates a node a if every path from a to END contains b. – we write b pdom a to say that b postdominates a • Proof: Otherwise, there is q END a path P: START + a • Postdominance is dominance in reverse CFG that does not contain c idom(b). Concatenating b obtained by reversing direction of all edges and edge ab to path P, we interchanging roles of START and END. a get a path from START to d e • Caveat: a dom b does not necessarily imply b b that does not contain g idom(b) which is a pdom a. contradiction. ab is edge in CFG – See example: a dom b but b does not pdom a idom(b) = q which dominates f

13 14

Obvious properties Control dependence

• Postdominance is a tree-structured relation • Intuitive idea: • Postdominator relation can be built using a – node w is control-dependent on a node u if backward dataflow analysis. node u determines whether w is executed • Postdominator tree can be built using Lengauer •Example: and Tarjan algorithm on reverse CFG START • Immediate postdominator: ipdom e START ….. • Lemma: if a  b is an edge in CFG G, then if e then S1 else S2 S1 S2 ipdom(a) postdominates b. …. END m END

We would say S1 and S2 are control-dependent on e

15 16

4 10/14/2019

Examples (contd.) Example (contd.)

• S1 and S3 are control- dependent on f START START • Are they control-dependent on e START e? e ….. • Decision at e does not fully while e do S1; determine if S1 (or S3 is S1 f …. S2 executed) since there is a later END S1 S3 test that determines this • So we will NOT say that S1 n END and S3 are control-dependent END on e – Intuition: control-dependence We would say node S1 is control-dependent on e. m is about “last” decision point It is also intuitive to say node e is control-dependent on itself: • However, f is control- - execution of node e determines whether or not e is executed again. dependent on e, and S1 and S3 are transitively (iteratively) control-dependent on e

17 18

Formal definition of control Example (contd.) dependence • Can a node be control- • Formalizing these intuitions is quite tricky dependent on more than one node? • Starting around 1980, lots of proposed – yes, see example n definitions – nested repeat-until loops t1 • Commonly accepted definition due to • n is control-dependent on t1 and t2 (why?) t2 Ferrane, Ottenstein, Warren (1987) • In general, control- • Uses idea of postdominance dependence relation can be quadratic in size of • We will use a slightly modified definition program due to Bilardi and Pingali which is easier to think about and work with

19 20

5 10/14/2019

Control dependence definition Control dependence definition

• First cut: given a CFG G, a node w is control- • Small caveat: what if w = u in previous definition? u dependent on an edge (uv) if – See picture: is u control- dependent on edge uv? – w postdominates v v – Intuition says yes, but – ……. w does not postdominate u definition on previous slides says “u should not • Intuitively, postdominate u” and our – first condition: if control flows from u to v it is definition of postdominance is guaranteed that w will be executed reflexive • Fix: given a CFG G, a node w – second condition: but from u we can reach END is control-dependent on an without encountering w edge (uv) if – so there is a decision being made at u that – w postdominates v determines whether w is executed – if w is not u, w does not postdominate u

21 22

Strict postdominance Example

• A node w is said to strictly postdominate a node START u if a b c d e f g –w != u a STARTa xx xx – w postdominates u b fb xx x cd x • That is, strict postdominance is the irreflexive c ce x ab x version of the postdominance relation de • Control dependence: given a CFG G, a node w f is control-dependent on an edge (uv) if – w postdominates v g – w does not strictly postdominate u END

23 24

6 10/14/2019

Computing control-dependence Computing control-dependence relation relation

• Control dependence: END given a CFG G, a node w • Compute the postdominator tree is control-dependent on • Overlay each edge uv on pdom tree and determine an edge (uv) if g START nodes in interval [v,ipdom(u)) – w postdominates v • Time and space complexity is O(EV). – w does not strictly f postdominate u • Faster solution: in practice, we do not want the full relation, we only make queries c – cd(e): what are the nodes control-dependent on an edge e? • Nodes control dependent d e on edge (uv) are nodes – conds(w): what are the edges that w is control-dependent on? on path up the a b – cdequiv(w): what nodes have the same control-dependences as postdominator tree from v node w? to ipdom(u), excluding a b c d e f g ipdom(u) • It is possible to implement a simple data structure that STARTa xx xx – We will write this as takes O(E) time and space to build, and that answers [v,ipdom(u)) fb xx x these queries in time proportional to output of query • half-open interval in tree cd x (optimal) (Pingali and Bilardi 1997). ce x ab x

25 26

SSA form SSA example

• Static single assignment form START START A – Intermediate representation of program in which A Z0:= … every use of a variable is reached by exactly one Z:= … definition Z1 := (Z4,Z0) B p1 B – Most programs do not satisfy this condition p1 • (eg) see program on next slide: use of Z in node F is reached C C …… D Z3:= (Z1,Z3) by definitions in nodes A and C Z:= …. D Z2:= …. – Requires inserting dummy assignments called - G p3 G p3 E functions at merge points in the CFG to “merge” E multiple definitions p2 Z4:= (Z2,Z3) – Simple algorithm: insert -functions for all variables at p2 all merge points in the CFG and rename each real F print(Z) and dummy assignment of a variable uniquely F print(Z4) • (eg) see transformed example on next slide END END

27 28

7 10/14/2019

Minimal-SSA form Example Minimal SSA form

• In previous example, dummy assignment Z3 is not really needed since there is no actual assignment to Z in nodes D and G of the original program. • Minimal SSA form – SSA form of program that does not contain such “unnecessary” dummy assignments – See example on next slide • Question: how do we construct minimal SSA form directly?

29 30

Minimal SSA form Minimal-SSA form Example

31 32

8 10/14/2019

Computing Merge(v) Dominance frontier

START • If u ε Merge(w), w does not strictly dominate u • Dominance frontier of node w – Proof: there is a path from – Node u is in dominance frontier of node w if w START to v that does not contain w • dominates a CFG predecessor v of u, but w • Conversely • does not strictly dominate u – if w dominates u, u ε Merge(w) • Idea: • Dominance frontier = control dependence – compute nodes on the in reverse graph A B C D E F G dominance frontier of w u A • w does not strictly dominate u x but dominates some CFG B predecessor of u C x Running example: – iterate D x E x F G x

33 34

Iterated dominance frontier Why is SSA form useful? • Irreflexive closure of dominance frontier relation • Related notion: iterated control • For many dataflow problems, SSA form enables dependence in reverse graph sparse dataflow analysis that • Where to place -functions for a variable Z – yields the same precision as bit-vector CFG-based – Let Assignments = {START} U dataflow analysis {nodes with assignments to Z in original CFG} – but is asymptotically faster since it permits the – Find set I = iterated dominance exploitation of sparsity frontier of nodes in Assignments –Place -functions in nodes of set I • SSA has two distinct features •For example – factored def-use chains – Assignments = {START,A,C} – DF(Assignments) = {E} – renaming – DF(DF(Assignments)) = {B} – you do not have to perform renaming to get – DF(DF(DF(Assignments))) = {B} advantage of SSA for many dataflow problems – So I = {E,B} – This is where we place - functions, which is correct

35 36

9 10/14/2019

Computing SSA form Dependences

• Cytron et al algorithm • We have seen control-dependences. – compute DF relation (see slides on computing • What other kind of dependences are there control-dependence relation) – find irreflexive transitive closure of DF relation for set in programs? of assignments for each variable – Data dependences: dependences that arise • Computing full DF relation from reads and writes to memory locations – Cytron et al algorithm takes O(|V| +|DF|) time • Think of these as constraints on reordering – |DF| can be quadratic in size of CFG of statements • Faster algorithms – O(|V|+|E|) time per variable: see Bilardi and Pingali

37 38

Data dependences Anti-dependences

• Flow-dependence (read-after-write): S1S2 • Anti-dependence (write-after-read): S1S2 – Execution of S2 may follow execution of S1 in program – Execution of S2 may follow execution of S1 in order program order – S1 may write to a memory location that may be read by – S1 may read from a memory location that may be S2 (over)written by S2 –Example: –Example:

….. while e do x := … …x… x := 3 flow-dependence flow-dependence ..x…. x: = … anti-dependence …x.. …… x:= … ……. This is called a -carried dependence

39 40

10 10/14/2019

Output-dependence Summary of dependences

• Output-dependence (write-after-write): • Dependence S1S2 – Data-dependence: relation between nodes – Execution of S2 may follow execution of S1 in • Flow- or read-after-write (RAW) program order • Anti- or write-after-read (WAR) – S1 and S2 may both write to same memory • Output- or write-after-write (WAW) location – Control-dependence: relation between nodes and edges

41 42

11