Topic-I-C

Dataflow Analysis

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 1 Global Dataflow Analysis

Motivation We need to know variable def and use information between basic blocks for: • • dead-code elimination • redundant computation elimination • code motion • elimination • build data dependence graph (DDG) • etc.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 2 Topics of DataFlow Analysis

• Reaching definition • • ud-chains and du-chains • Available expressions • Others ..

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 3 Definition and Use

1. Definition & Use 푆: 푣1 = … 푣2

푆 is a “definition” of 푣1 푆 is a “use” of 푣2

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 4 Compute Def and Use Information of a Program P?

 Case 1: P is a basic block ?  Case 2: P contains more than one basic blocks ?

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 5 Points and Paths

B1 points in a basic block:

d1: i := m-1 - between statements d2: j := n - before the first statement d : a := u1 3 - after the last statement B2

d4: i := i+1 B3

d5: j := j+1 In the example, how many points basic block B1, B2, B3, and B5 have? B4

B5 B6 B1 has four, B2, B3, and B5 d : a := u2 6 have two points each.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 6 Points and Paths

B1

d1: i := m-1 A path is a sequence of points d : j := n 2 p1, p2, …, pn such that either: d3: a := u1 (i) if pi immediately precedes S, B2 than pi+1 immediately follows S.

d4: i := i+1 (ii) or pi is the end of a basic block and pi+1 is the beginning of a successor block B3 d : j := j+1 5 In the example, is there a path from B4 the beginning of block B5 to the beginning of block B6? B5 B6 Yes, it travels through the end point d : a := u2 6 of B5 and then through all the points in B2, B3, and B4.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 7 Reach and Kill

Kill

a definition d1 of a variable v d : x := … 1 is killed between p1 and p2 if in every path from p1 to p2 there is another definition of v.

Reach d : x := … a definition d reaches a point p 2 if  a path d  p, and d is not killed along the path

In the example, do d1 and d2 reach the both d1, d2 reach point points and ? but only d1 reaches point

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 8 Reach Example

S0: PROGRAM S1: READ(L) 1 B1 S2: N = 0

S3: K = 0 S4: M = 1

S5: K = K + M B2 2 S6: C = K > L S7: IF C THEN GOTO S11

S8: N = N + 1 B3 S9: M = M + 2 S10: GOTO S5 3 4

B4 S11: WRITE(N) S12: END

The set of defs reaching the use of N in S8: {S2, S8} def S2 reach S11 along statement path: (S2, S3, S4, S5, S6, S7, S11) S8 reach S11 along statement path: (S8, S9, S10, S5, S6, S7, S11)

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 9 Problem Formulation: Example 1

Can d1 reach point p1? d1 x := exp1 d1 x := exp1 s if p > 0 s1 if p > 0 1 s2 x := x + 1 p1 s3 a = b + c s2 x := x + 1 s4 e = x + 1 p1

s3 a = b + c It depends on what point

p1 represents!!! s4 e = x + 1

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 10 Problem Formulation: Example 2

d1 x := exp1 Can d1 and d4 reach point p3?

s2 if y > 0 d1 x := exp1

s2 while y > 0 do s3 a := b + 2 s3 a := b + 2 p3 p3 d4 x := exp2 d4 x = exp2 s5 c := a + 1 end while s5 c = a + 1

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 11 Data-Flow Analysis of Structured Programs

Structured programs have an useful property: there is a single point of entrance and a single exit point for each statement.

We will consider program statements that can be described by the following syntax:

Statement  id := Expression | Statement ; Statement | if Expression then Statement else Statement | do Statement while Expression Expression  id + id | id

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 12 Structured Programs

S ::= id := E | S ; S This restricted syntax results in the | if E then S else S forms depicted below for flowgraphs | do S while E E ::= id + id | id

S1 S1 If E goto S1

S2 S1 S2 If E goto S1

S1 S2 if E then S1 else S2 do S1 while E

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 13 Data-Flow Values

1. Each program point associates with a data-flow value 2. A data-flow value represents the possible program states that can be observed for that program point. 3. The data-flow value depends on the goal of the analysis.

Given a statement 푆, 𝑖푛(푆) and 표푢푡(푆) denote the data-flow values before and after 푆, respectively.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 14 Data-Flow Values of Basic Block

Assume basic block 푩 consists of statement 풔ퟏ, 풔ퟐ, … , 풔풏 (풔ퟏ is the first statement of 푩 and 풔풏 is the last statement of 푩), the data-flow values immediately before and after 푩 is denoted as: 풊풏 푩 = 풊풏 풔ퟏ 풐풖풕 푩 = 풐풖풕(풔풏)

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 15 Instances of Data-Flow Problems

● Reaching Definitions ● Live-Variable Analysis ● DU Chains and UD Chains ● Available Expressions

To solve these problems we must take into consideration the data-flow and the control flow in the program. A common method to solve such problems is to create a set of data- flow equations.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 16 Iterative Method for Dataflow Analysis

Establish a set of dataflow relations for each basic block Establish a set dataflow equations between basic blocks Establish an initial solution Iteratively solve the dataflow equations, until a fixed point is reached.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 17 Generate set: gen(S) In general, 푑 ∈ 푔푒푛(푆) if 푑 reaches the end of 푆 independent of whether it reaches the beginning of 푆.

We restrict 푔푒푛(푆) contains only the definition in 푆.

If 푆 is a basic block, gen(푆) contains all the definitions inside the basic block that are “visible” immediately after the block.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 18 Kill Set: kill(S)

풅 ∈ 푘𝑖푙푙 푺 ⇒ 푑 never reaches the end of 푺. This is equivalent to say: reaches end of  풅 푺 풅 ∉ 푘𝑖푙푙(푆) d1 a := d2 a := ...

dk a : dd a := S dd: a := b + c

푘𝑖푙푙(푠) = 퐷푎 − { 푑푑 }

Of course the statements d1,d2, …, dk all get killed except dd itself. A basic block’s kill set is simply the union of all the definitions killed by its individual statements! 2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 19 Reaching Definitions

Problem Statement:

Given a program and a program point determine the set of definitions reaching this point in a program.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 20 Iterative Algorithm for Reaching Definitions dataflow equations The set of definitions reaching the entry of basic block B: 𝑖푛(푩) = 표푢푡(푷) 푷 ∈ 푝푟푒푑푒푐푒푠푠표푟(푩)

The set of definitions reaching the exit of basic block B: 표푢푡 푩 = 푔푒푛 푩 ∪ { 𝑖푛 푩 − kill 퐁 }

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 21 (AhoSethiUllman, pp. 606) Iterative Algorithm for Reaching Definitions Algorithm

1) 풐풖풕(푬푵푻푹풀) =  ; 2) for (each basic block 푩 other than 푬푵푻푹풀)  풐풖풕(푩) = ; Need a flag to test if a out 3) while ( changes to any out occur) is changed! The initial value of the flag is true. 4) for ( each 푩 other than 푬푵푻푹풀) {

풊풏 푩 = 푷 ∈풑풓풆풅풆풄풆풔풔풐풓풔 풐풇 푩 풐풖풕(푷) ; 풐풖풕 푩 = 품풆풏 푩 ∪ 풊풏 푩 − 풌풊풍풍 풃 ; } Dataflow Equations –

a simple case

푔푒푛 푆 = 푑 Da is the set of all definitions in the S d : a := b + c 푘𝑖푙푙 (푆) = 퐷푎 − {푑} program for variable a!

S1 푔푒푛(푆) = 푔푒푛 (푆2) ∪ (푔푒푛(푆1) − 푘𝑖푙푙(푆2)) S S 2 푘𝑖푙푙 푆 = 푘𝑖푙푙(푆2) ∪ (푘𝑖푙푙(푆1) − 푔푒푛(푆2))

푔푒푛(푆) = 푔푒푛(푆1) ∪ 푔푒푛(푆2)

S S1 S2

푘𝑖푙푙(푆) = 푘𝑖푙푙(푆1) ∩ 푘𝑖푙푙(푆2) flow equations for reaching definitions reaching for equations flow - 푔푒푛(푆) = 푔푒푛(푆1) S S1

푘𝑖푙푙(푆) = 푘𝑖푙푙(푆1) Date

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 23

Dataflow Equations

S d : a := b + c 표푢푡(푆) = 푔푒푛(푆) ∪ (𝑖푛(푆) − 푘𝑖푙푙(푆))

S 𝑖푛(푆) = 𝑖푛(푆1) S 1 𝑖푛(푆2) = 표푢푡(푆1) S2 표푢푡(푆) = 표푢푡(푆2)

S S1 S2 𝑖푛 푆 = 𝑖푛 푆1 = 𝑖푛 푆2

표푢푡(푆) = 표푢푡(푆1) ∪ 표푢푡(푆2)

flow equations for reaching definitions reaching for equations flow - 𝑖푛(푆) = 𝑖푛(푆 ) ∪ 표푢푡(푆 )

S S1 1 1 Date 표푢푡(푆) = 표푢푡(푆1)

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 24 Dataflow Analysis: An Example

Using RD (reaching def) as an example:

d1 : i = 0 Question: in What is the set of reaching loop L . definitions at the exit of the loop L? .

d2 : i = i + 1

𝑖푛 퐿 = 푑1 ∪ 표푢푡(퐿) out 푔푒푛 (퐿) = {푑2} 푘𝑖푙푙 (퐿) = {푑1} 표푢푡(퐿) = 푔푒푛(퐿) ∪ {𝑖푛(퐿) − 푘𝑖푙푙(퐿)}

𝑖푛(퐿) depends on 표푢푡(퐿), and 표푢푡(퐿) depends on 𝑖푛(퐿)!!

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 25 Initialization Solution? out[L] =  First iteration d1 : i = 0 표푢푡 퐿 = 푔푒푛 퐿 ∪ ( 𝑖푛 (퐿) − 푘𝑖푙푙 퐿 ) in = {푑2} ∪ ({푑1} − {푑1}) = {푑2} loop L . . Second iteration d2 : i = i + 1 표푢푡(퐿) = 푔푒푛(퐿) ∪ ( 𝑖푛(퐿) − 푘𝑖푙푙 퐿 ) but now: out 𝑖푛 퐿 = 푑1 ∪ 표푢푡(퐿) = {푑1} ∪ {푑2} = {푑1, 푑2} therefore: 𝑖푛 퐿 = 푑1 ∪ 표푢푡(퐿) 표푢푡 퐿 = 푑2 ∪ ({푑1, 푑2} − {푑1}) 푔푒푛(퐿) = {푑2} = {푑2} ∪ {푑2} = {푑 } 푘𝑖푙푙(퐿) = {푑1} 2 표푢푡 퐿 = 푔푒푛 퐿 ∪ {𝑖푛(퐿) − 푘𝑖푙푙(퐿)} So, we reached the fixed point!

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 26 Reaching Definitions: Another Example

ENTRY

B1 d1: i := m-1 Step 1: Compute gen and kill for each d2: j := n basic block d3: a := u1

푔푒푛(퐵1) = {푑1, 푑2, 푑3} B2 d : i := i+1 4 푘𝑖푙푙(퐵1) = {푑4, 푑5, 푑6, 푑7} d5: j :=j - 1 푔푒푛(퐵 ) = {푑4, 푑5} B3 2 푘𝑖푙푙(퐵2) = {푑1, 푑2, 푑7} d6: a := u2 푔푒푛(퐵3) = {푑6} B4 d : i := u3 7 푘𝑖푙푙(퐵3) = {푑3}

EXIT 푔푒푛(퐵4) = {푑7} 푘𝑖푙푙(퐵4) = {푑1, 푑4}

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 27 Reaching Definitions: Another Example (Con’t)

ENTRY Step 2: For every basic block, make:

B1 d1: i := m-1 out[B] =  d2: j := n d : a := u1 3 Initialization:

B2 d4: i := i+1 표푢푡(퐵1) =  d5: j :=j - 1 표푢푡(퐵 ) =  B3 2

d6: a := u2 표푢푡(퐵3) = 

B4 d7: i := u3 표푢푡(퐵4) =  EXIT

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 28 Reaching Definitions: Another Example (Con’t)

ENTRY To simplify the representation, the in[B]

B1 d1: i := m-1 and out[B] sets are represented by d2: j := n bit strings. Assuming the representation d3: a := u1 d1d2d3 d4d5d6d7 we obtain: Initialization: B2 d4: i := i+1 d5: j :=j - 1 표푢푡(퐵 ) =  Initial 1 Block B3 in[B] out[B] B 000 0000 d6: a := u2 표푢푡(퐵 ) =  1 2 B 000 0000 2 B4 d : i := u3 B3 000 0000 7 표푢푡(퐵 ) =  B 000 0000 3 4

EXIT 표푢푡(퐵4) =  Notation: d1d2d3 d4d5d6d7

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 29 푔푒푛(퐵2) = {푑1, 푑2, 푑3} 푘𝑖푙푙(퐵1) = {푑4, 푑5, 푑6, 푑7}

푔푒푛(퐵2) = {푑Reaching4, 푑5} Definitions: 푘𝑖푙푙(퐵2) = {푑1, 푑2, 푑7} 푔푒푛(퐵3) = {푑6} 푘𝑖푙푙(퐵Another) = {푑 } Example (Con’t) 3 3 푔푒푛(퐵4) = {푑7} 푘𝑖푙푙(퐵4) = {푑1, 푑4} ENTRY while a fixed point is not found:

B1 d1: i := m-1 풊풏(푩) = 푷 ∈ 풑풓풆풅(푩) 풐풖풕(푷) d : j := n 2 풐풖풕 푩 = 품풆풏 푩 ∪ (풊풏(푩) − 풌풊풍풍(푩)) d3: a := u1

Initial Block B2 d4: i := i+1 in[B] out[B] d5: j :=j - 1 B1 000 0000 B2 000 0000 B3 B3 000 0000 B 000 0000 d : a := u2 4 6 First Iteration Block in[B] out[B] B4 d7: i := u3 B 1 000 0000 111 0000 out(B) = B 2 000 0000 000 1100 gen(B) EXIT B 3 000 00 0 0 000 0010 B 4 000 00 0 0 00 0 00 0 1

Notation: d1d2d3 d4d5d6d7

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 30 푔푒푛(퐵2) = {푑1, 푑2, 푑3} 푘𝑖푙푙(퐵1) = {푑4, 푑5, 푑6, 푑7}

푔푒푛(퐵2) = {푑Reaching4, 푑5} Definitions: 푘𝑖푙푙(퐵2) = {푑1, 푑2, 푑7} 푔푒푛(퐵3) = {푑6} 푘𝑖푙푙(퐵Another) = {푑 } Example (Con’t) 3 3 푔푒푛(퐵4) = {푑7} 푘𝑖푙푙(퐵4) = {푑1, 푑4} ENTRY while a fixed point is not found:

B1 d1: i := m-1 풊풏(푩) = 푷 ∈ 풑풓풆풅(푩) 풐풖풕(푷) d : j := n 2 풐풖풕 푩 = 품풆풏 푩 ∪ (풊풏(푩) − 풌풊풍풍(푩)) d3: a := u1

First Iteration Block B2 d4: i := i+1 in[B] out[B] d5: j :=j - 1 B 1 000 0000 111 0000 B 2 000 0000 000 1100 B3 B 3 00 0 00 0 0 000 0010 B 4 00 0 00 0 0 00 0 00 0 1 d6: a := u2

Second Iteration Block B4 d7: i := u3 in[B] out[B] B 1 000 0000 111 0000 B 2 111 0010 001 1110 EXIT B 3 000 1100 000 11 10 B 4 000 1100 000 0101

Notation: d1d2d3 d4d5d6d7

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 31 gen[B1] = {d1, d2, d3} kill[B1] = {d4, d5, d6, d7} gen[B2] = {d , d } 4 Reaching5 Definitions: kill [B2] = {d1, d2, d7} gen[B3] = {d6} kill [B3] = Another{d } Example (Con’t) 3 gen[B4] = {d7} kill [B4] = {d1, d4} ENTRY while a fixed point is not found: in[B] =  out[P] where P is a B1 d1: i := m-1 predecessor of B d2: j := n out[B] = gen[B]  (in[B]-kill[B]) d3: a := u1 Second Iteration Block in[B] out[B] B2 d4: i := i+1 B 1 000 0000 111 0000 d5: j :=j - 1 B 2 111 0010 001 1110 B 000 1100 000 11 1 0 B3 3 B 4 000 1100 001 0111

d : a := u2 6 Third Iteration Block in[B] out[B] B4 d : i := u3 7 B 1 000 0000 111 0000 B 2 111 1110 001 1110 EXIT B 3 001 1110 000 11 1 0 B 4 001 1110 001 0111

Notation: d d d d d d d 1 2 3 4 5 6 7 we reached the fixed point! 2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 32 gen[B1] = {d1, d2, d3} kill[B1] = {d4, d5, d6, d7} gen[B2] = {d , d } 4 Reaching5 Definitions: kill [B2] = {d1, d2, d7} gen[B3] = {d6} kill [B3] = Another{d } Example (Con’t) 3 gen[B4] = {d7} kill [B4] = {d1, d4} ENTRY while a fixed point is not found: in[B] =  out[P] where P is a B1 d1: i := m-1 predecessor of B d2: j := n out[B] = gen[B]  (in[B]-kill[B]) d3: a := u1 Third Iteration Block in[B] out[B] B2 d4: i := i+1 B 1 000 0000 111 0000 d5: j :=j - 1 B 2 111 0010 001 1110 B 000 1100 000 11 1 0 B3 3 B 4 000 1100 001 0111

d : a := u2 6 Forth Iteration Block in[B] out[B] B4 d : i := u3 7 B 1 000 0000 111 0000 B 2 111 1110 001 1110 EXIT B 3 001 1110 000 11 1 0 B 4 001 1110 001 0111

Notation: d d d d d d d 1 2 3 4 5 6 7 we reached the fixed point! 2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 33 Other Applications of Data flow Analysis

Live Variable Analysis DU and UD Chains Available Expressions Constant Propagation Constant Folding Others ..

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 34 Live Variable Analysis: Another Example of Flow Analysis

• A variable 푉 is live at the exit of a basic block 푛, if there is a def-free path from 푛 to an outward exposed use of 푉 in a node 푛’.

“Live variable analysis problem” – determine the set of variables which are live at the exit from each program point. Live variable analysis is a “backwards dataflow” analysis, that is the analysis is done in a backwards order .

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 35 Live Variable Analysis: Another Example of Flow Analysis

The set of live variables at L1: b := 3; line L2 is {푏, 푐}, but the set of live variables at line L1 is L2: c := 5; only {푏} since variable "c" is updated in line 2. The L3: a := b + c; value of variable "a" is never used, so the variable goto L1; is never live.

Copy from Wikipedia, the free encyclopedia 2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 36 Live Variable Analysis: Def and use set

• 푑푒푓(퐵): the set of variables defined in basic block 퐵 prior to any use of that variable in 퐵 • 푢푠푒(퐵): the set of variables whose values may be used in 퐵 prior to any definition of the variable.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 37 Live Variable Analysis

dataflow equations The set of variables live at the entry of basic block B:

𝑖푛 퐵 = 푢푠푒 퐵 ∪ { 표푢푡 퐵 − 푑푒푓 퐵 }

The set of variables live at the exit of basic block B:

표푢푡 퐵 = 𝑖푛(푆) 푆 ∈ 푠푢푐푐푒푠푠표푟푠(퐵)

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 38 Iterative Algorithm for Live Variable Analysis Algorithm 1) out(EXIT) = ∅ ; 2) for (each basic block B other than EXIT) in(B) = ∅; Need a flag to test if 3) while ( changes to any “in” occur) a in is changed! The initial value of the flag 4) for ( each B other than EXIT) is true. { 표푢푡 퐵 = 𝑖푛 푆 푆 ∈ 푠푢푐푐푒푠푠표푟푠 퐵

𝑖푛 퐵 = 푢푠푒 퐵 ∪ {표푢푡 퐵 − 푑푒푓 퐵 } 2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 39 (AhoSethiUllman, pp. 607)

Live Variable Analysis: a Quiz

Calculate the live variable sets 𝑖푛(퐵) and 표푢푡(퐵) for the program:

B1 d1: i := m-1 d2: j := n d3: a := u1

B2 d4: i := i+1 d5: j :=j - 1

B3

d6: a := u2

B4 d7: i := u3

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 40 D-U and U-D Chains

Many dataflow analyses need to find the use-sites of each defined variable or the definition-sites of each variable used in an expression. Def-Use (D-U), and Use-Def (U-D) chains are efficient data structures that keep this information. Notice that when a code is represented in Static Single-Assignment (SSA) form (as in most modern compilers) there is no need to maintain D-U and U-D chains.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 41 UD Chain

An UD chain is a list of all definitions that can reach a given use of a variable.

......

S1’: v= ... Sm’: v = ...

Sn: ... = … v … . . .

A UD chain: 푈퐷(푆푛, 푣) = (푆1’, … , 푆푚’).

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 42 DU Chain

A DU chain is a list of all uses that can be reached by a given definition of a variable. DU Chain is a counterpart of a UD Chain.

. . .

Sn’: v = …

S1: … = … v … Sk: … = … v … ......

′ A DU chain: 퐷푈(푆푛, 푣) = (푆1, … , 푆푘). 2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 43 Use of DU/UD Chains in Optimization/Parallelization

• Dependence analysis • Live variable analysis • • Analysis for various transformations

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 44 Available Expressions

An expression 푥 + 푦 is available at a point 푝 if:

1. Every path from the 푠푡푎푟푡 node to 푝 evaluates 푥 + 푦.

2. After the last evaluation prior to reaching 푝, there are no subsequent assignments to 푥 or 푦.

We say that a basic block kills expression 푥 + 푦 if it may assign 푥 or 푦, and does not subsequently recompute 푥 + 푦.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 45 Available Expression: Example

S2: TEMP = A * B S1: TEMP = A * B B2 S2: Y Y= =A TEMP * B + +C C B1 S1: XX == TEMP A * B ++ CC

B3 S3: C = 1

B4 S4:S4: ZZ == TEMP A * B + CC - DD ** EE

Yes. ItIs isexpression generated A * inB availableall paths at leading the beginning to B4 ofand basic it is block not killed B4? after its generation in any path. Thus the redundant expression can be eliminated.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 46 Available Expression: Example

S3: Y = A * B S1: X = A * B B2 S4: W=Y + C B1 S2: Z = X + C

B3 S5: C = 1

S6: T = A * B B4 S7: V= D * T

Yes. It is generated in all paths leading to B4 and it is not killed after its generation in any path. Thus the redundant expression can be eliminated.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 47 Available Expression: Example

S1: temp = A * B S3: temp = A * B B2 B1 S2: Z = temp + C S4: W=temp + C

B3 S5: C = 1

S6: T = temp B4 S7: V= D * T

Yes. It is generated in all paths leading to B4 and it is not killed after its generation in any path. Thus the redundant expression can be eliminated.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 48 Available Expressions: gen and kill set

Assume 푈 is the “universal” set of all expressions appearing on the right of one or more statements in a program.

푒푔푒푛(퐵) : the set of expressions generated by 퐵

푒푘푖푙푙(퐵): the set of expressions in 푈 killed in 퐵.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 49 Calculate the Generate Set of Available Expressions

No generated expression

p S:  x= y+z q S’: add y+z to S; delete expressions involving x from S

S  a = b + c b + c b= a – d ab -+ d c , a - d c= b + c a – d,d b + c d= a - d a - d

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 50 Iterative Algorithm for Available Expressions

dataflow equations The set of expressions available at the entry of basic block 퐵: 풊풏 푩 = 풐풖풕(푷) 푷∈풑풓풆풅풆풄풆풔풔풐풓풔(푩)

The set of expressions available at the exit of basic block 퐵:

풐풖풕 푩 = 풆품풆풏 푩 ∪ { 풊풏 푩 − 풆풌풊풍풍 푩 }

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 51 (AhoSethiUllman, pp. 606) Iterative Algorithm for Reaching Definitions Algorithm 1. 풐풖풕 푬푵푻푹풀 = ∅; 2. 푭풐풓 풆풂풄풉 풃풂풔풊풄 풃풍풐풄풌 푩 풐풕풉풆풓 풕풉풂풏 푬푵푻푹풀

풐풖풕 푩 = 푼 Need a flag to test if 3. 푾풉풊풍풆 풄풉풂풏품풆풔 풕풐 풂풏풚 풐풖풕 풐풄풄풖풓 a out is changed! The initial value of the flag 4. 풇풐풓 풆풂풄풉 푩 풐풕풉풆풓 풕풉풂풏 푬푵푻푹풀 is true. { 풊풏 푩 = 풐풖풕(푷) 푷 ∈ 풑풓풆풅풆풄풆풔풔풐풓풔(푩)

풐풖풕 푩 = 풆품풆풏 푩 ∪ { 풊풏 푩 − 풆풌풊풍풍 푩 } } 2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 52 (AhoSethiUllman, pp. 607)

Use of Available Expressions

• Detecting global common subexpressions

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 53 More Useful Data- Flow Frameworks

Constant propagation is the process of substituting the values of known constants in expressions at compile time. Constant folding is a compiler optimization technique where constant sub-expressions are evaluated at compiler time.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 54 Constant Folding Example

i = 32*48-1530 i = 6 Constant folding can be implemented : • In a compiler’s front end on the IR tree (before it is translted into three-address codes • In the back end, as an adjunct to constant propagation

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 55 Constant Propagation Example

int x = 14; int y = 7 - x / 2; return y * (28 / x + 2);

Constant propagation int x = 14; int y = 7 - 14 / 2; return y * (28 / 14 + 2); Constant folding int x = 14; int y = 0; return 0;

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 56 Summary

• Basic Blocks

• Control Flow Graph (CFG)

• Dominator and Dominator Tree

• Natural Loops

• Program point and path

• Dataflow equations and the iterative method

• Reaching definition

• Live variable analysis

• Available expressions

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 57 Remarks of Mathematical Foundations on Solving Dataflow Equations

• As long as the dataflow value domain is “nice” (e.g. semi-lattice) • And each function specified by the dataflow equation is “nice” -- then iterative application of the dataflow equations at each node will eventually terminate with a stable solution (a fix point). • For mathematical foundation -- read

• Ken Kenedy: “A Survey of Dataflow Analysis Techniques”, In Programm Flow Analysis: Theory and Applications, Ed. Muchnik and Jones, Printice Hall, 1981. • Muchnik’s book: Section 8.2, pp 223

For a good discussion: also read 9.3 (pp 618-632) in new Dragon Book

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 58 Algorithm Convergence

Intuitively we can observe that the algorithm converges to a fix point because the 풐풖풕(푩) set never decreases in size.

It can be shown that an upper bound on the number of iterations required to reach a fix point is the number of nodes in the flow graph.

Intuitively, if a definition reaches a point, it can only reach the point through a cycle free path, and no cycle free path can be longer than the number of nodes in the graph.

Empirical evidence suggests that for real programs the number of iterations required to reach a fix point is less then five.

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 59 More remarks

 If a data-flow framework meets “good” conditions then it has a unique fixed-point solution The iterative algorithm finds the (best) answer The solution does not depend on order of computation Algorithm can choose an order that converges quickly

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 60 Ordering the Nodes to Maximize Propagation

4 1

2 3 3 2

1 4

Postorder Reverse Postorder Visit children first Visit parents first

 Reverse postorder visits predecessors before visiting a node  Use reverse preorder for backward problems Reverse postorder on reverse CFG is reverse preorder

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 61 Iterative solution to general data-flow frameworks

 INPUT: A data-flow framework with the following components: 1. A data-flow graph, with specially labeled ENTRY and EXIT nodes, 2. A direction of the data-flow D, 3. A set of values V, 4. A meet operator Λ, 5. A set of functions F, where fB in F is the transfer function for block B, and

6. A constant value vENTRY or vEXIT in V, representing the boundary condition for forward and backward frameworks, respectively.  OUTPUT: Values in V for IN[B] and OUT[B] for each block B in the data-flow graph. (From p925 in Dragon Book Edition 2) 2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 62 Iterative algorithm for a forward data-flow problem

1. 푶푼푻(푬푵푻푹풀) = 푽푬푵푻푹풀; 2. 풇풐풓 (푒푎푐ℎ 푏푎푠𝑖푐 푏푙표푐푘 푩 표푡ℎ푒푟 푡ℎ푎푛 푬푵푻푹풀) 푶푼푻(푩) = 푻; 3. 풘풉풊풍풆 푐ℎ푎푛푔푒푠 푡표 푎푛푦 푶푼푻 표푐푐푢푟 4. 풇풐풓 (푒푎푐ℎ 푏푎푠𝑖푐 푏푙표푐푘 푩 표푡ℎ푒푟 푡ℎ푎푛 푬푵푻푹풀) { 푰푵 푩 = 푶푼푻 푷 ; 푷 ∈ 풑풓풆풅풆풄풆풔풔풐풓풔 푩 푶푼푻 푩 = 풇푩 푰푵 푩 }

(From p926 in Dragon Book Edition 2)

2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 63 Iterative algorithm for a backward data-flow problem

1. 푰푵 푬푿푰푻 = 푽푬푿푰푻; 2. 풇풐풓 (푒푎푐ℎ 푏푎푠𝑖푐 푏푙표푐푘 푩 표푡ℎ푒푟 푡ℎ푎푛 푬푿푰푻) 푰푵(푩) = 푻; 3. 풘풉풊풍풆 푐ℎ푎푛푔푒푠 푡표 푎푛푦 푰푵 표푐푐푢푟 4. 풇풐풓 (푒푎푐ℎ 푏푎푠𝑖푐 푏푙표푐푘 푩 표푡ℎ푒푟 푡ℎ푎푛 푬푿푰푻) {

푶푼푻(푩) = 푺 ∈풔풖풄풄풆풔풔풐풓풔(푩) 푰푵(푺)

푰푵(푩) = 풇푩( 푶푼푻(푩) ); } (From p926 in Dragon Book Edition 2) 2012/3/2 \course\cpeg421-08s\Topic4-a.ppt 64