Program Optimizations using Data-Flow Analysis

Last time Remove statements that define only one variable and the variable being – Lattice theoretic framework for data-flow analysis defined is not in the live out set.

Algorithm Today – Dead-code elimination do { – Copy propagation 1) generate a FlowGraph from the list of instructions – Constant propagation 2) perform liveness – Common sub-expression elimination (CSE) 3) for each node in FlowGraph if the defs set contains only one temporary if the temporary being defined is not in the live out set remove the node from the FlowGraph 4) generate a list of instructions from the modified FlowGraph

} while (changes);

CS553 Lecture Program Optimizations using Data-Flow Analysis 2 CS553 Lecture Program Optimizations using Data-Flow Analysis 3

Dead code elimination with the MiniJava Copy Propagation

1) Method in FlowGraph for removing a node that attaches predecessors with Propagate the results of a move statement. successors.

Algorithm 2) Modify Targets class so that we can query getFallThrough() and get a label for constructing a new list of Assem.Instr. do { 1) generate a FlowGraph from the list of instructions 3) Method to trace through a FlowGraph to generate an Assem.Instr list. 2) perform reaching definitions 3) for each node in FlowGraph 4) DataFlowSet interface (will be provided) for each use

5) DataFlowProblem interface (will be provided) if the use is reached by a def in a move statement change the use to the rhs in the move statement 6) DataFlowSolver class where the constructor takes a DataFlowProblem as input and solves it with IDFA. 4) generate a list of instructions from the modified FlowGraph

7) Reimplement Liveness using the data-flow framework (an implementation for } while (changes); LiveDataFlowSet will be provided)

CS553 Lecture Program Optimizations using Data-Flow Analysis 4 CS553 Lecture Program Optimizations using Data-Flow Analysis 5

1 Copy propagation with the MiniJava compiler Constant Propagation using Reaching Definitions

Assume the DataFlowSolver has already been written. Propagate the results of a CONSTANT move statement.

– The ReachDefs class should implement the DataFlowProblem interface. Algorithm – Implement a ReachDefsDataFlowSet class with reaching def statements paired do { with the temporary they define. 1) generate a FlowGraph from the list of instructions 2) perform reaching definitions – Implement a CopyProp class. 3) for each node in FlowGraph for each use if the use is reached by a def in a constant move statement change the use to the rhs in the move statement 4) generate a list of instructions from the modified FlowGraph

} while (changes);

CS553 Lecture Program Optimizations using Data-Flow Analysis 6 CS553 Lecture Program Optimizations using Data-Flow Analysis 7

Constant Propagation with the MiniJava compiler Constant Propagation

Assume the DataFlowSolver has already been written. Goal – Discover constant variables and expressions and propagate them forward – Subclass Assem.Instr with a CONSTMOVE instruction type. through the program

– The ReachDefs class should implement the DataFlowProblem interface.

Uses – Implement a ReachDefsDataFlowSet class with reaching def statements paired – Evaluate expressions at instead of run time with the temporary they define. – Eliminate dead code (e.g., code) – Improve efficacy of other optimizations (e.g., and – Implement a ConstPropReachDef class. )

CS553 Lecture Program Optimizations using Data-Flow Analysis 8 CS553 Lecture Program Optimizations using Data-Flow Analysis 9

2 Roadmap Kinds of Constants

Simple constants Kildall [1973] – Constant for all paths through a program 1 3 Simple Constants More Conditional Constants

Kildall [1973] constants Wegbreit [1975] Conditional constants Wegbreit [1975] – Constant for actual paths through a program (when only one direction of a conditional is taken) faster faster := 1 2 4 Sparse Conditional 1 Sparse Simple Constants More ... Constants Reif and Lewis [1977] constants if c=1 Wegman & Zadeck [1991] true false

2 j := 3 3 j := 5

4 j?

CS553 Lecture Program Optimizations using Data-Flow Analysis 10 CS553 Lecture Program Optimizations using Data-Flow Analysis 11

Data-Flow Analysis for Simple Constant Propagation Data-Flow Analysis for Simple Constant Propagation (cont)

Simple constant propagation: analysis is “reaching constants” Reaching constants for simple constant propagation v×c Using tuples of lattices – D: 2 – D: {All constants} ∪ {⊤,⊥}

– ⊓: – ⊓: ∩ c ⊓ ⊤ = c ¨

– F: c ⊓ ⊥ = ⊥

– Kill(x←. . .) = {(x,c) ∀c} c ⊓ d = ⊥ if c ≠ d . . . -3 -2 -1 0 1 2 3 . . .

– Gen(x←c) = {(x,c)} c ⊓ d = c if c = d

– Gen(x←y⊕z) = if (y,c )∈In & (z,c )∈In, {(x,c ⊕ c )} y z y z – F: ⊥ – . . . – Fx←c(In) = c

– Fx←y⊕z(In) = if cy=Iny & cz=Inz, then cy ⊕ cz, else ⊤or ⊥ – . . .

CS553 Lecture Program Optimizations using Data-Flow Analysis 12 CS553 Lecture Program Optimizations using Data-Flow Analysis 13

3 Initialization for Reaching Constants Simple Constants with the MiniJava compiler

Assume the DataFlowSolver has already been written. Pessimistic – Each variable is initially set to ⊥ in data-flow analysis – Subclass Assem.Instr with a CONSTMOVE instruction type. – Forces merges at loop headers to go to ⊥ conservatively – Subclass Assem.Instr with a BINOPMOVE instruction type that can be queried for the operator type, each operand, and the temporary being defined. Optimistic – Each variable is initially set to ⊤in data-flow analysis – The SimpleConstants class should implement the DataFlowProblem interface. – Each variable is set to ⊥ at the entry node in the flow graph. – In the transfer function, BINOPMOVE instructions may be replaced with – What assumption is being made when optimistic reaching constants is CONSTMOVE instructions. performed? – Can we replace operands to a BINOPMOVE instruction with a constant?

– The SimpleConstants class should take a reference to a list of Assem.Instrs as input and as a side-effect modify that list.

CS553 Lecture Program Optimizations using Data-Flow Analysis 14 CS553 Lecture Program Optimizations using Data-Flow Analysis 15

Common Subexpression Elimination CSE Example

Before CSE After CSE Idea c := a + b t1 := a + b – Find common subexpressions whose range spans the same basic blocks d := m & n c := t1 and eliminate unnecessary re-evaluations e := b + d t2 := m & n – Leverage available expressions f := a + b d := t2 g := -b t3 := b + d h := b + a e := t3 a := j + a f := t1 Recall available expressions k := m & n g := -b – An expression (e.g., x+y) is available at node n if every path from the j := b + d h := t1 entry node to n evaluates x+y, and there are no definitions of x or y after a := -b a := j + a the last evaluation along that path if m & n goto L2 k := t2 j := t3 Summary a := -b if t2 goto L2 Strategy 11 instructions – If an expression is available at a point where it is evaluated, it need not be 12 variables 9 binary operators Summary recomputed 14 instructions 15 variables 4 binary operators

CS553 Lecture Program Optimizations using Data-Flow Analysis 16 CS553 Lecture Program Optimizations using Data-Flow Analysis 17

4 CSE Approach 1 CSE Approach 2

Notation Idea – Avail(b) is the set of expressions available at block b – Reduce number of copies by assigning a unique name to each unique – Gen(b) is the set of expressions generated and not killed at block b expression

Summary If we use e and e ∈ Avail(b) – Allocate a new name n – ∀e Name[e] = unassigned – Search backward from b (in CFG) to find statements (one for each path) – if we use e and e ∈ Avail(b) that most recently generate e – if Name[e]=unassigned, allocate new name n and Name[e] = n – Insert copy to n after generators else n = Name[e] – Replace e with n Example – Replace e with n – In a subsequent traversal of block b, if e ∈ Gen(b) and Name[e] ≠ Problems a := b + c unassigned, then insert a copy to Name[e] after the generator of e – Backward search for each use is expensive t1 := a Problem Example – Generates unique name for each use t2 := a – May still insert unnecessary copies – |Uses| > |Avail| e := tb1 + c a := b + c – Requires two passes over the code – Each generator may have many copies f := tb2 + c t1 := a

CS553 Lecture Program Optimizations using Data-Flow Analysis 18 CS553 Lecture Program Optimizations using Data-Flow Analysis 19

CSE Approach 3 Extraneous Copies

Idea Extraneous copies degrade performance – Don’t worry about temporaries – Create one temporary for each unique expression Let other transformations deal with them – Let subsequent pass eliminate unnecessary temporaries – Dead code elimination

At an evaluation of e – Coalescing – Hash e to a name, n, in a table Coalesce assignments to t1 and t2 into a single statement – Insert an assignment of e to n t1 := b + c

At a use of e in b, if e ∈ Avail(b) t2 := t1 – Lookup e’s name in the hash table (call this name n) – Replace e with n – Greatly simplifies CSE

Problems – Inserts more copies than approach 2 (but extra copies are dead) – Still requires two passes (2nd pass is very general)

CS553 Lecture Program Optimizations using Data-Flow Analysis 20 CS553 Lecture Program Optimizations using Data-Flow Analysis 21

5 CSE Approach 4 CSE with the MiniJava compiler

Idea Assume the DataFlowSolver has already been written. – Create one temporary for each unique expression – Calculate reaching expressions at the same time as available expressions – Subclass Assem.Instr with a CONSTMOVE instruction type.

At an evaluation of e – Subclass Assem.Instr with a BINOPMOVE instruction type that can be queried – Hash e to a name, n, in a table for the operator type, each operand, and the temporary being defined.

At a use of e in b, if e ∈ Avail(b) – The AvailExpr class should implement the DataFlowProblem interface. – Lookup e’s name in the hash table (call this name n) – Replace e with n – The CSE class should take a reference to a list of Assem.Instrs as input and as a – Insert an assignment of e to n after each reaching expression side-effect modify that list.

Problems? – Only requires modification of available expressions analysis to include reaching definitions

CS553 Lecture Program Optimizations using Data-Flow Analysis 22 CS553 Lecture Program Optimizations using Data-Flow Analysis 23

Next Time

Reading – Ch 17.4 and Ch 18 intro

Lecture – Speeding up data-flow analysis

CS553 Lecture Program Optimizations using Data-Flow Analysis 24

6