Agenda

„ Initial example: data-flow analysis for CSE P 501 – common subexpression elimination „ Other analysis problems that work in the same framework Data-flow Analysis Hal Perkins Autumn 2005 „ Credits: Largely based on Keith Cooper’s slides from Rice University

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-1 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-2

The Story So Far… Dominator

A m = a + b „ Most sophisticated 0 0 0 „ Redundant expression elimination n0 = a0 + b0 algorithm so far „ Local Value Numbering „ Still misses some B C p = c + d q = a + b opportunities 0 0 0 0 0 0 „ Superlocal Value Numbering r0 = c0 + d0 r1 = c0 + d0 „ Can’t handle loops „ Extends VN to EBBs D E e0 = b0 + 18 e1 = a0 + 17 „ SSA-like namespace s0 = a0 + b0 t0 = c0 + d0 u0 = e0 + f0 u1 = e1 + f0 „ Dominator VN Technique (DVNT) F e = Φ(e ,e ) „ All of these propagate along forward edges 2 0 1 u2 = Φ(u0,u1) v = a + b „ None are global G 0 0 0 w = c + d r2 = Φ(r0,r1) 0 0 0 x = e + f „ In particular, can’t handle back edges (loops) y0 = a0 + b0 0 2 0 z0 = c0 + d0

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-3 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-4

Available Expressions “Available” and Other Terms

„ Goal: use data-flow analysis to find „ An expression e is defined at point p in the CFG if its value is computed at p common subexpressions whose range „ Sometimes called definition site spans basic blocks „ An expression e is killed at point p if one of its operands is defined at p „ Idea: calculate available expressions at „ Sometimes called kill site beginning of each basic block „ An expression e is available at point p if every path leading to p contains a prior „ Avoid re-evaluation of an available definition of e and e is not killed between expression – use a copy operation that definition and p

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-5 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-6

CSE P 501 Au05 R-1 Computing Available Sets Expressions

„ For each block b, define „ AVAIL(b) is the set

„ AVAIL(b) – the set of expressions available AVAIL(b) = ∩x∈preds(b) (DEF(x) ∪ on entry to b (AVAIL(x) ∩ NKILL(x)) )

„ NKILL(b) – the set of expressions not killed „ preds(b) is the set of b’s predecessors in in b the control flow graph „ DEF(b) – the set of expressions defined in „ This gives a system of simultaneous b and not subsequently killed in b equations – a data-flow problem

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-7 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-8

GCSE with Available Name Space Issues Expressions

„ In previous value-numbering „ For each block b, compute DEF(b) and algorithms, we used a SSA-like NKILL(b) renaming to keep track of versions „ For each block b, compute AVAIL(b)

„ In global data-flow problems, we use „ For each block b, value number the the original namespace block starting with AVAIL(b) „ Replace expressions in AVAIL(b) with „ The KILL information captures when a value is no longer available references to the previously computed values

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-9 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-10

Global CSE Replacement Analysis

„ After analysis and before „ Main problem – inserts extraneous copies at transformation, assign a global name to all definitions and uses of every e that each expression e by hashing on e appears in any AVAIL(b) „ But the extra copies are dead and easy to remove „ During transformation step „ Useful copies often coalesce away when registers „ At each evaluation of e, insert copy and temporaries are assigned name(e ) = e „ Common strategy

„ At each reference to e, replace e with „ Insert copies that might be useful name(e ) „ Let sort it out later

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-11 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-12

CSE P 501 Au05 R-2 Computing Available Expressions Computing DEF and NKILL (1)

„ Big Picture „ For each block b with operations o1, o2, …, ok KILLED = ∅ „ Build control-flow graph DEF(b) = ∅ „ Calculate initial local data – DEF(b) and NKILL(b) for i = k to 1 assume oi is “x = y + z” „ This only needs to be done once if (y ∉ KILLED and z ∉ KILLED) „ Iteratively calculate AVAIL(b) by repeatedly add “y + z” to DEF(b) evaluating equations until nothing changes add x to KILLED „ Another fixed-point algorithm …

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-13 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-14

Computing Available Computing DEF and NKILL (2) Expressions

„ After computing DEF and KILLED for a „ Once DEF(b) and NKILL(b) are block b, computed for all blocks b NKILL(b) = { all expressions } Worklist = { all blocks bi } for each expression e while (Worklist ≠∅) remove a block b from Worklist for each variable v ∈ e recompute AVAIL(b) if v ∈ KILLED then if AVAIL(b) changed NKILL(b) = NKILL(b) - e Worklist = Worklist ∪ successors(b)

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-15 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-16

Comparing Algorithms Comparing Algorithms (2)

A m = a + b „ LVN – Local Value n = a + b „ LVN => SVN => DVN form a strict hierarchy Numbering „ SVN – Superlocal Value B C – later algorithms find a superset of previous p = c + d q = a + b Numbering r = c + d r = c + d information „ DVN – Dominator-based D E Value Numbering e = b + 18 e = a + 17 „ Global RE finds a somewhat different set „ GRE – Global Redundancy s = a + b t = c + d „ Discovers e+f in F (computed in both D and E) Elimination u = e + f u = e + f F „ Misses identical values if they have different v = a + b w = c + d names (e.g., a+b and c+d when a=c and b=d) x = e + f „ Value Numbering catches this G y = a + b z = c + d

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-17 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-18

CSE P 501 Au05 R-3 Data-flow Analysis (1) Data-flow Analysis (2)

„ A collection of techniques for compile- „ Usually formulated as a set of time reasoning about run-time values simultaneous equations (data-flow problem) „ Almost always involves building a graph „ Sets attached to nodes and edges „ Trivial for basic blocks „ Need a lattice (or semilattice) to describe „ Control-flow graph or derivative for global values

problems „ In particular, has an appropriate operator to combine values and an appropriate “bottom” or „ Call graph or derivative for whole-program minimal value problems

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-19 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-20

Data-flow Analysis (3) Data-flow Analysis (4)

„ Desired solution is usually a meet over „ Limitations all paths (MOP) solution „ Precision – “up to symbolic execution” „ Assumes all paths taken

„ “What is true on every path from entry” „ Sometimes cannot afford to compute full solution

„ “What can happen on any path from entry” „ Arrays – classic analysis treats each array as a single fact „ Usually relates to safety of optimization „ Pointers – difficult, expensive to analyze

„ Imprecision rapidly adds up „ Summary: for scalar values we can quickly solve simple problems

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-21 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-22

Scope of Analysis Some Problems (1)

„ Larger context (EBBs, regions, global, „ Merge points often cause loss of interprocedural) sometimes helps information „ More opportunities for optimizations „ Sometimes worthwhile to clone the code at „ But not always the merge points to yield two straight-line „ Introduces uncertainties about flow of control sequences „ Usually only allows weaker analysis

„ Sometimes has unwanted side effects

„ Can create additional pressure on registers, for example

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-23 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-24

CSE P 501 Au05 R-4 Some Problems (2) Other Data-Flow Problems

„ Procedure/function/method calls are problematic „ The basic data-flow analysis framework „ Have to assume anything could happen, which kills local assumptions can be applied to many other problems „ Calling sequence and register conventions are often more general than needed beyond redundant expressions „ One technique – inline substitution „ Allows caller and called code to be analyzed together; more „ Different kinds of analysis enable precise information different optimizations „ Can eliminate overhead of function call, parameter passing, register save/restore „ But… Creates dependency in compiled code on specific version of procedure definition – need to avoid trouble (inconsistencies) if (when?) the definition changes.

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-25 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-26

Characterizing Data-flow Efficiency of Data-flow Analysis Analysis

„ All of these involve sets of facts about each „ The algorithms eventually terminate, basic block b „ IN(b) – facts true on entry to b but the expected time needed can be „ OUT(b) – facts true on exit from b reduced by picking a good order to visit „ GEN(b) – facts created and not killed in b nodes in the CFG „ KILL(b) – facts killed in b „ Forward problems – reverse postorder „ These are related by the equation OUT(b) = GEN(b) ∪ (IN(b) – KILL(b) „ Backward problems - postorder „ Solve this iteratively for all blocks „ Sometimes information propagates forward; sometimes backward

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-27 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-28

Example: Equations for Live Variables

„ A variable v is live at point p iff there is any „ Sets path from p to a use of v along which v is not „ USED(b) – variables used in b before being redefined defined in b „ Uses „ NOTDEF(b) – variables not defined in b „ – only live variables need a register (or temporary) „ LIVE(b) – variables live on exit from b „ Eliminating useless stores „ Equation „ Detecting uses of uninitialized variables LIVE(b) = ∪s∈succ(b) USED(s) ∪ „ Improve SSA construction – only need Φ-function (LIVE(s) ∩ NOTDEF(s)) for variables that are live in a block

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-29 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-30

CSE P 501 Au05 R-5 Example: Available Expressions Example: Reaching Definitions

„ This is the analysis we did earlier to „ A definition d of some variable v eliminate redundant expression reaches operation i iff i reads the evaluation (i.e., compute AVAIL(b)) value of v and there is a path from d to i that does not define v

„ Uses

„ Find all of the possible definition points for a variable in an expression

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-31 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-32

Equations for Reaching Example: Very Busy Definitions Expressions

„ Sets „ An expression e is considered very busy

„ DEFOUT(b) – set of definitions in b that reach the at some point p if e is evaluated and end of b (i.e., not subsequently redefined in b) used along every path that leaves p, „ SURVIVED(b) – set of all definitions not obscured and evaluating e at p would produce by a definition in b the same result as evaluating it at the „ REACHES(b) – set of definitions that reach b original locations „ Equation „ Uses REACHES(b) = ∪p∈preds(b) DEFOUT(p) ∪ „ Code hoisting – move e to p (reduces code (REACHES(p) ∩ SURVIVED(p)) size; no effect on execution time)

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-33 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-34

Equations for Very Busy Expressions Summary

„ Sets „ Dataflow analysis gives a framework for „ USED(b) – expressions used in b before they are killed finding global information „ KILLED(b) – expressions redefined in b before „ Key to enabling most optimizing they are used transformations „ VERYBUSY(b) – expressions very busy on exit from b „ Equation

VERYBUSY(b) = ∩s∈succ(b) USED(s) ∪ (VERYBUSY(s) - KILLED(s))

12/6/2005 © 2002-05 Hal Perkins & UW CSE R-35 12/6/2005 © 2002-05 Hal Perkins & UW CSE R-36

CSE P 501 Au05 R-6