Dominator Value Numbering and Available Expressions


CSE P 501 – Compilers: Data-flow Analysis
Hal Perkins, Autumn 2005
Credits: Largely based on Keith Cooper's slides from Rice University
© 2002-05 Hal Perkins & UW CSE

Agenda
- Initial example: data-flow analysis for common subexpression elimination
- Other analysis problems that work in the same framework

The Story So Far…
- Redundant expression elimination:
  - Local Value Numbering
  - Superlocal Value Numbering – extends value numbering to EBBs, using an SSA-like namespace
  - Dominator VN Technique (DVNT)
- All of these propagate along forward edges
- None are global; in particular, none can handle back edges (loops)

Dominator Value Numbering
- The most sophisticated algorithm so far
- Still misses some opportunities
- Can't handle loops

Example CFG in SSA form (reconstructed from the slide figure; the Φ functions imply the shape: A branches to B and C, C branches to D and E, D and E merge at F, and B and F merge at G):

  A: m0 = a0 + b0
     n0 = a0 + b0

  B: p0 = c0 + d0        C: q0 = a0 + b0
     r0 = c0 + d0           r1 = c0 + d0

  D: e0 = b0 + 18        E: e1 = a0 + 17
     s0 = a0 + b0           t0 = c0 + d0
     u0 = e0 + f0           u1 = e1 + f0

  F: e2 = Φ(e0, e1)
     u2 = Φ(u0, u1)
     v0 = a0 + b0
     w0 = c0 + d0
     x0 = e2 + f0

  G: r2 = Φ(r0, r1)
     y0 = a0 + b0
     z0 = c0 + d0

Available Expressions
- Goal: use data-flow analysis to find common subexpressions whose range spans basic blocks
- Idea: calculate the available expressions at the beginning of each basic block
- Avoid re-evaluating an available expression – use a copy operation instead

"Available" and Other Terms
- An expression e is defined at point p in the CFG if its value is computed at p (sometimes called a definition site)
- An expression e is killed at point p if one of its operands is defined at p (sometimes called a kill site)
- An expression e is available at point p if every path leading to p contains a prior definition of e, and e is not killed between that definition and p

Available Expression Sets
For each block b, define:
- AVAIL(b) – the set of expressions available on entry to b
- NKILL(b) – the set of expressions not killed in b
- DEF(b) – the set of expressions defined in b and not subsequently killed in b

Computing Available Expressions
For each block b:

  AVAIL(b) = ∩x∈preds(b) ( DEF(x) ∪ (AVAIL(x) ∩ NKILL(x)) )

where preds(b) is the set of b's predecessors in the control-flow graph. This gives a system of simultaneous equations – a data-flow problem.

Name Space Issues
- In the previous value-numbering algorithms we used SSA-like renaming to keep track of versions
- In global data-flow problems we use the original namespace; the KILL information captures when a value is no longer available

GCSE with Available Expressions
- For each block b, compute DEF(b) and NKILL(b)
- For each block b, compute AVAIL(b)
- For each block b, value number the block starting from AVAIL(b)
- Replace expressions in AVAIL(b) with references to the previously computed values

Global CSE Replacement
- After analysis and before transformation, assign a global name to each expression e by hashing on e
- During the transformation step:
  - At each evaluation of e, insert the copy name(e) = e
  - At each reference to e, replace e with name(e)

Analysis
- Main problem: this inserts extraneous copies at all definitions and uses of every e that appears in any AVAIL(b)
- But the extra copies are dead and easy to remove, and useful copies often coalesce away when registers and temporaries are assigned
- Common strategy: insert copies that might be useful, then let dead-code elimination sort it out later

Computing Available Expressions: Big Picture
- Build the control-flow graph
- Calculate the initial local data – DEF(b) and NKILL(b) for each block; this only needs to be done once
- Iteratively calculate AVAIL(b) by repeatedly evaluating the equations until nothing changes – another fixed-point algorithm

Computing DEF and NKILL (1)
For each block b with operations o1, o2, …, ok:

  KILLED = ∅
  DEF(b) = ∅
  for i = k to 1
    assume oi is "x = y + z"
    if (y ∉ KILLED and z ∉ KILLED)
      add "y + z" to DEF(b)
    add x to KILLED

Computing DEF and NKILL (2)
After computing DEF and KILLED for a block b:

  NKILL(b) = { all expressions }
  for each expression e
    for each variable v ∈ e
      if v ∈ KILLED
        NKILL(b) = NKILL(b) - e
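The two loops above translate almost directly into code. Here is a minimal Python sketch, assuming each operation in a block is a (target, op, left, right) tuple standing for "target = left op right" and that all_exprs is the universe of (op, left, right) expressions in the procedure; the function and variable names are illustrative, not from the slides.

    def compute_def_nkill(block, all_exprs):
        """Compute DEF(b) and NKILL(b) for one basic block."""
        killed = set()   # variables defined (and hence killing) in b
        def_b = set()    # DEF(b)
        # Walk the block bottom-up (i = k down to 1): an expression lands
        # in DEF(b) only if neither operand is redefined later in b.
        for target, op, left, right in reversed(block):
            if left not in killed and right not in killed:
                def_b.add((op, left, right))
            killed.add(target)
        # NKILL(b): every expression none of whose operands is killed in b.
        nkill_b = {e for e in all_exprs
                   if not any(v in killed for v in (e[1], e[2]))}
        return def_b, nkill_b

For block A in the example above, compute_def_nkill([("m", "+", "a", "b"), ("n", "+", "a", "b")], all_exprs) returns DEF(A) = {("+", "a", "b")}, and NKILL(A) excludes every expression that mentions m or n.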
Computing Available Expressions (Worklist Algorithm)
Once DEF(b) and NKILL(b) are computed for all blocks b:

  Worklist = { all blocks bi }
  while (Worklist ≠ ∅)
    remove a block b from Worklist
    recompute AVAIL(b)
    if AVAIL(b) changed
      Worklist = Worklist ∪ successors(b)
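A sketch of this worklist loop in the same Python representation (preds and succs map each block name to its CFG neighbors, and entry names the entry block; all of these are assumed names). Available expressions is a "must" problem solved by intersection, so each block except the entry starts optimistically at the full expression set, and the loop shrinks the sets until the greatest fixed point is reached.

    def compute_avail(entry, blocks, preds, succs, def_b, nkill_b, all_exprs):
        """AVAIL(b) = ∩ over x ∈ preds(b) of (DEF(x) ∪ (AVAIL(x) ∩ NKILL(x)))."""
        avail = {b: set(all_exprs) for b in blocks}  # optimistic "top"
        avail[entry] = set()                         # nothing available at entry
        worklist = set(blocks) - {entry}
        while worklist:
            b = worklist.pop()   # any order is safe; reverse postorder converges fastest
            new = set(all_exprs)
            for x in preds[b]:
                new &= def_b[x] | (avail[x] & nkill_b[x])
            if not preds[b]:     # block unreachable from entry: nothing available
                new = set()
            if new != avail[b]:
                avail[b] = new
                worklist |= set(succs[b]) - {entry}
        return avail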
Comparing Algorithms
- LVN – Local Value Numbering
- SVN – Superlocal Value Numbering
- DVN – Dominator-based Value Numbering
- GRE – Global Redundancy Elimination

The same example CFG, now in the original namespace:

  A: m = a + b
     n = a + b

  B: p = c + d          C: q = a + b
     r = c + d             r = c + d

  D: e = b + 18         E: e = a + 17
     s = a + b             t = c + d
     u = e + f             u = e + f

  F: v = a + b
     w = c + d
     x = e + f

  G: y = a + b
     z = c + d

Comparing Algorithms (2)
- LVN => SVN => DVN form a strict hierarchy – each later algorithm finds a superset of the information found by its predecessors
- Global RE finds a somewhat different set:
  - It discovers e + f in F (computed in both D and E)
  - It misses identical values that have different names (e.g., a + b and c + d when a = c and b = d); value numbering catches this

Data-flow Analysis (1)
- A collection of techniques for compile-time reasoning about run-time values
- Almost always involves building a graph:
  - Trivial for basic blocks
  - Control-flow graph (or a derivative) for global problems
  - Call graph (or a derivative) for whole-program problems

Data-flow Analysis (2)
- Usually formulated as a set of simultaneous equations (a data-flow problem), with sets attached to nodes and edges
- Needs a lattice (or semilattice) to describe the values – in particular, one with an appropriate operator to combine values and an appropriate "bottom" or minimal value

Data-flow Analysis (3)
- The desired solution is usually a meet-over-all-paths (MOP) solution, which assumes all paths can be taken:
  - "What is true on every path from entry"
  - "What can happen on any path from entry"
- Usually relates to the safety of an optimization

Data-flow Analysis (4)
- Limitations:
  - Precision – "up to symbolic execution"
  - Sometimes we cannot afford to compute the full solution
  - Arrays – classic analysis treats each array as a single fact
  - Pointers – difficult and expensive to analyze; imprecision rapidly adds up
- Summary: for scalar values we can quickly solve simple problems

Scope of Analysis
- A larger context (EBBs, regions, global, interprocedural) sometimes helps: more opportunities for optimization
- But not always: it introduces uncertainties about the flow of control and usually allows only weaker analysis
- It can also have unwanted side effects – it can create additional register pressure, for example

Some Problems (1)
- Merge points often cause loss of information
- It is sometimes worthwhile to clone the code at a merge point to yield two straight-line sequences

Some Problems (2)
- Procedure/function/method calls are problematic:
  - We have to assume anything could happen, which kills local assumptions
  - Calling sequences and register conventions are often more general than needed
- One technique – inline substitution:
  - Allows caller and called code to be analyzed together, giving more precise information
  - Can eliminate the overhead of the function call, parameter passing, and register save/restore
  - But it creates a dependency in the compiled code on a specific version of the procedure definition – we need to avoid trouble (inconsistencies) if (when?) the definition changes

Other Data-Flow Problems
- The basic data-flow analysis framework can be applied to many problems beyond redundant expressions
- Different kinds of analysis enable different optimizations

Characterizing Data-flow Analysis
- All of these involve sets of facts about each basic block b:
  - IN(b) – facts true on entry to b
  - OUT(b) – facts true on exit from b
  - GEN(b) – facts created and not killed in b
  - KILL(b) – facts killed in b
- These are related by the equation

    OUT(b) = GEN(b) ∪ (IN(b) – KILL(b))

- Solve this iteratively for all blocks
- Sometimes information propagates forward, sometimes backward

Efficiency of Data-flow Analysis
- The algorithms eventually terminate, but the expected running time can be reduced by visiting the CFG nodes in a good order:
  - Forward problems – reverse postorder
  - Backward problems – postorder

Example: Live Variable Analysis
- A variable v is live at point p iff there is any path from p to a use of v along which v is not redefined
- Uses:
  - Register allocation – only live variables need a register (or temporary)
  - Eliminating useless stores

Equations for Live Variables
Sets:
- USED(b) – variables used in b before being defined in b
- NOTDEF(b) – variables not defined in b
- LIVE(b) – variables live on exit from b
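The slides stop short of the LIVE equation itself; with the sets above, the standard formulation (following Cooper) is LIVE(b) = ∪ over s ∈ succ(b) of (USED(s) ∪ (LIVE(s) ∩ NOTDEF(s))). A minimal backward solver under that assumption, in the same style as the forward one above:

    def compute_live(blocks, succs, used, notdef):
        """LIVE(b) = ∪ over s ∈ succ(b) of (USED(s) ∪ (LIVE(s) ∩ NOTDEF(s)))."""
        # A backward "may" problem: sets grow from ∅ under union, so we start
        # at bottom and iterate to the least fixed point.
        live = {b: set() for b in blocks}
        changed = True
        while changed:
            changed = False
            for b in blocks:   # visiting blocks in postorder converges fastest
                new = set()
                for s in succs[b]:
                    new |= used[s] | (live[s] & notdef[s])
                if new != live[b]:
                    live[b] = new
                    changed = True
        return live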