Partial Redundancy Elimination for Access Path Expressions

Total pages: 16 · File type: PDF, size: 1020 KB

Partial Redundancy Elimination for Access Path Expressions
Purdue University, Purdue e-Pubs, Department of Computer Science Technical Reports, Department of Computer Science, 1998.
Antony L. Hosking (Purdue University, [email protected]), Nathaniel Nystrom, David Whitlock, Quintin Cutts, Amer Diwan. Report Number: 98-044.
Hosking, Antony L.; Nystrom, Nathaniel; Whitlock, David; Cutts, Quintin; and Diwan, Amer, "Partial Redundancy Elimination for Access Path Expressions" (1998). Department of Computer Science Technical Reports. Paper 1431. https://docs.lib.purdue.edu/cstech/1431
This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact [email protected] for additional information.

PARTIAL REDUNDANCY ELIMINATION FOR ACCESS PATH EXPRESSIONS
Antony Hosking, Nathaniel Nystrom, David Whitlock, Quintin Cutts, Amer Diwan
CSD TR #98-044, December 1998
International Workshop on Aliasing in Object-Oriented Systems, Lisbon, Portugal, June 1999

Antony L. Hosking, Nathaniel Nystrom, David Whitlock
Department of Computer Sciences, Purdue University, West Lafayette, IN 47907-1398, USA

Quintin Cutts
Department of Computing Science, University of Glasgow, Glasgow G12 8QQ, Scotland

Amer Diwan
Department of Computer Science, Stanford University, Stanford, CA 94305-9030, USA

Abstract

Pointer traversals pose significant overhead to the execution of object-oriented programs, since every access to an object's state requires a pointer dereference. Eliminating redundant pointer traversals reduces both instructions executed as well as redundant memory accesses to relieve pressure on the memory subsystem. We describe an approach to elimination of redundant access expressions that combines partial redundancy elimination (PRE) with type-based alias analysis (TBAA). To explore the potential of this approach we have implemented an optimization framework for Java class files incorporating TBAA-based PRE over pointer access expressions. The framework is implemented as a classfile-to-classfile transformer; optimized classes can then be run in any standard Java execution environment. Our experiments demonstrate improvements in the execution of optimized code for several Java benchmarks running in diverse execution environments: the standard interpreted JDK virtual machine, a virtual machine using "just-in-time" compilation, and native binaries compiled off-line ("way-ahead-of-time"). We isolate the impact of access path PRE using TBAA, and demonstrate that Java's requirement of precise exceptions can noticeably impact code-motion optimizations like PRE.

1 Introduction

Pointer traversals pose significant overhead to the execution of object-oriented programs, since every access to an object's state requires a pointer dereference. Objects can refer to other objects, forming graph structures, and they can be modified, with such modifications visible in future accesses. Just as common subexpressions often appear in numerical code, common access expressions are likewise often encountered in object-oriented code. Where two such expressions redundantly compute the same value it is desirable to avoid repeated computation of that value by caching the result of the first computation in a temporary variable, and reusing it from the temporary at the later occurrence of the expression. Eliminating redundant computations in this way certainly eliminates redundant CPU overhead. Perhaps just as important for modern machine architectures, eliminating redundant access expressions also has the effect of eliminating redundant memory references, which are often the source of large performance penalties incurred in the memory subsystem.

In this paper we evaluate an approach to elimination of common access expressions that combines partial redundancy elimination (PRE) [Morel and Renvoise 1979] with type-based alias analysis (TBAA) [Diwan et al. 1998]. To explore the potential of this approach we have built an optimization framework for Java class files incorporating TBAA-based PRE for access expressions, and measured its impact on the performance of several benchmark programs. An interesting aspect of the optimization framework is that it operates entirely as a bytecode-to-bytecode translator, sourcing and targeting Java class files. Our experiments compare the execution of optimized and unoptimized classes for several SPARC-based execution environments: the interpreted JDK reference virtual machine, the Solaris 2.6 virtual machine with "just-in-time" (JIT) compilation, and native binaries compiled off-line ("way-ahead-of-time") to C and thence to native code using the Solaris C compiler.

We have measured both the static and dynamic impact of bytecode-level PRE optimization for a set of Java benchmark applications, including static code size, bytecode execution counts, native-instruction execution counts, and elapsed time. The results demonstrate general improvement on all measures for all execution environments, although some benchmarks have degraded performance in certain environments.

The remainder of the paper is organized as follows. Section 2 introduces the approach to elimination of redundant access path expressions based on partial redundancy elimination and type-based alias analysis. Section 3 describes our implementation of the analysis and optimization framework that supports transformation of Java class files using TBAA-based PRE. Section 4 describes our experimental methodology for evaluation of TBAA-based PRE for several Java benchmark applications. The experimental results are reported and interpreted in Section 5. We conclude with a discussion of related work and speculate on directions for future work.

2 PRE for Access Expressions

Our analysis and optimization framework revolves around PRE over object access expressions. We adopt standard terminology and notations used in the specification of the Java programming language to frame the analysis and optimization problem.

2.1 Terminology and notation

The following definitions paraphrase the Java specification [Gosling et al. 1996]. An object in Java is either a class instance or an array. Reference values in Java are pointers to these objects, as well as the null reference. Both objects and arrays are created by expressions that allocate and initialize storage for them. The operators on references to objects are field access, method invocation, casts, type comparison (instanceof), equality operators and the conditional operator. There may be many references to the same object. Objects have mutable state, stored in the variable fields of class instances or the variable elements of arrays. Two variables may refer to the same object: the state of the object can be modified through the reference stored in one variable and then the altered state observed through the other. Access expressions refer to the variables that comprise an object's state. A field access expression refers to a field of some class instance, while an array access expression refers to a component of an array. Table 1 summarizes the two kinds of access expressions in Java.

Table 1: Access expressions

  Notation   Name           Variable accessed
  p.f        Field access   Field f of the class instance to which p refers
  p[i]       Array access   Component with subscript i of the array to which p refers

We adopt the term access path [Larus and Hilfinger 1988; Diwan et al. 1998] to mean a non-empty sequence of accesses. For example, the Java expression a.b[i].c is an access path. Without loss of generality, our notation assumes that distinct fields within an object have different names (i.e., fields that override inherited fields of the same name from superclasses are trivially renamed).

A variable is a storage location and has an associated type, sometimes called its compile-time type. Given an access path p, the compile-time type of p, written Type(p), is simply the compile-time type of the variable it accesses. An array type S[] is assignable with array type T[] if type S is assignable to type T. Interface types also yield rules on assignability: an interface type S is assignable to an interface type T only if T is the same interface as S or a superinterface of S; a class type S is assignable to an interface type T if S implements T. Finally, array types, interface types and class types are all assignable to class type Object. For our purposes we say that a type S is a subtype of a type T if S is assignable to T. We write Subtypes(T) to denote all subtypes of type T, including T. Thus, an access path p can legally access variables of type Subtypes(Type(p)).

2.2 Partial redundancy elimination

Our approach to optimization of access expressions is based on application of partial redundancy elimination (PRE) [Morel and Renvoise 1979]. To our knowledge this is the first time PRE has been applied to access paths. PRE is a powerful global optimization technique that subsumes the more standard common subexpression elimination (CSE). PRE eliminates computations that are only partially redundant; that is, redundant only on some, but not all, paths to some later re-computation. By inserting evaluations on those paths where the computation does not occur, the later re-evaluation can be eliminated and replaced instead with a use of the precomputed value. This is illustrated in Figure 1. In Figure 1(a), both a and b are available along each path to the merge point, where expression a + b is evaluated. However, this evaluation is partially redundant since a + b is available on one path to the merge but not both. By hoisting the second evaluation of a + b into the path where it was not originally available, as in Figure 1(b), a + b need only be evaluated once along any path through the program, rather than twice as before.

(Figure 1: PRE for arithmetic expressions. (a) Before PRE, one path computes a + b and the merge point re-evaluates it. (b) After PRE, t <- a + b is evaluated on each path and the merge uses t.)

Consider the Java access expression a.b[i].c, which translates to Java bytecode of the form:

  aload a       load local variable a
  getfield b    load field b of a
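The transformation described above can be illustrated at the Java source level. The sketch below uses hypothetical classes and field names (A, Inner, field c), not the paper's benchmark code; it shows the redundant traversal of the access path a.b[i].c in `before` and the cached temporary PRE would introduce in `after`. The rewrite is sound only when no intervening store can alias a.b[i].c, which is exactly the question TBAA answers.

```java
class Inner { int c; Inner(int c) { this.c = c; } }
class A { Inner[] b; A(Inner[] b) { this.b = b; } }

class AccessPathPre {
    // Before PRE: the access path a.b[i].c is traversed twice; each
    // traversal is two pointer dereferences plus an array index.
    static int before(A a, int i) {
        int x = a.b[i].c * 2;
        int y = a.b[i].c + 1;   // fully redundant re-traversal
        return x + y;
    }

    // After PRE: the first traversal is cached in a temporary, and the
    // later occurrence becomes a cheap reuse of a local.
    static int after(A a, int i) {
        int t = a.b[i].c;       // t <- a.b[i].c
        int x = t * 2;
        int y = t + 1;
        return x + y;
    }

    public static void main(String[] args) {
        A a = new A(new Inner[] { new Inner(7), new Inner(42) });
        if (before(a, 1) != after(a, 1)) throw new AssertionError();
        System.out.println(after(a, 1));  // 127
    }
}
```

Note that the two methods are observably equivalent only in the absence of aliasing stores between the two uses; a write to a.b[i].c between them would make the caching in `after` incorrect, which is why the optimization needs alias analysis.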
Recommended publications
  • Efficient Run Time Optimization with Static Single Assignment
    Jason W. Kim and Terrance E. Boult, EECS Dept., Lehigh University, Room 304 Packard Lab, 19 Memorial Dr. W., Bethlehem, PA 18015 USA, {jwk2, tboult}@eecs.lehigh.edu. Abstract: We introduce a novel optimization engine for META4, a new object-oriented language currently under development. It uses Static Single Assignment (henceforth SSA) form coupled with certain reasonable, albeit very uncommon language features not usually found in existing systems. This reduces the code footprint and increases the optimizer's "reuse" factor. This engine performs the following optimizations: Dead Code Elimination (DCE), Common Subexpression Elimination (CSE) and Constant Propagation (CP), at both runtime and compile time, with linear time complexity. CP is essentially free, whether the values are really source-code constants or specific values generated at runtime. CP runs alongside the other optimization passes, thus allowing efficient runtime specialization of the code during any point of the program's lifetime. 1. Introduction. A recurring theme in this work is that powerful, expensive analysis and optimization facilities are not necessary for generating good code. Rather, by using information ignored by previous work, we have built a facility that produces good code with simple linear-time algorithms. This report will focus on the optimization parts of the system. More detailed reports on META4 as well as the compiler are under development. Section 0.2 will introduce the scope of the optimization algorithms presented in this work. Section 1 will discuss some of the important definitions and concepts related to the META4 programming language and the optimizer used by the algorithms presented herein.
  • CS153: Compilers Lecture 19: Optimization
    CS153: Compilers. Lecture 19: Optimization. Stephen Chong, Harvard University. https://www.seas.harvard.edu/courses/cs153. Contains content from lecture notes by Steve Zdancewic and Greg Morrisett. Announcements: HW5 (Oat v.2) out, due in 2 weeks; HW6 will be released next week (implementing optimizations, and more). Today: optimizations — safety, constant folding, algebraic simplification, strength reduction, constant propagation, copy propagation, dead code elimination, inlining and specialization, recursive function inlining, tail call elimination, common subexpression elimination. The code generated by our OAT compiler so far is pretty inefficient: lots of redundant moves, lots of unnecessary arithmetic instructions. Consider this OAT program: int foo(int w) { var x = 3 + 5; var y = x * w; var z = y - 0; return z * 4; }. The unoptimized output is a long sequence of stack moves and redundant arithmetic (listing elided); the hand-optimized code is just: _foo: shlq $5, %rdi; movq %rdi, %rax; ret. Function foo may be inlined by the compiler, so it can be implemented by just one instruction! Why do we need optimizations? To help programmers: they write modular, clean, high-level programs, and the compiler generates efficient, high-performance assembly. Programmers don't write optimal code, and high-level languages make avoiding redundant computation inconvenient or impossible, e.g.
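The running example above can be checked end to end: constant folding turns 3 + 5 into 8, algebraic simplification removes y - 0, and the two multiplications (by 8 and by 4) compose into a single shift by 5, matching the hand-optimized `shlq $5, %rdi`. A minimal sketch (method names are ours, not the course's code):

```java
class FoldDemo {
    // Direct transcription of the OAT program from the slides.
    static int foo(int w) {
        int x = 3 + 5;      // constant folding: x = 8
        int y = x * w;      // y = 8 * w
        int z = y - 0;      // algebraic simplification: z = y
        return z * 4;       // total effect: 32 * w
    }

    // What the optimizations compose to: one shift.
    static int fooOpt(int w) {
        return w << 5;
    }

    public static void main(String[] args) {
        for (int w = -3; w <= 3; w++) {
            if (foo(w) != fooOpt(w)) throw new AssertionError("mismatch at " + w);
        }
        System.out.println(fooOpt(2)); // 64
    }
}
```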
  • Precise Null Pointer Analysis Through Global Value Numbering
    Precise Null Pointer Analysis Through Global Value Numbering. Ankush Das (Carnegie Mellon University, Pittsburgh, PA, USA) and Akash Lal (Microsoft Research, Bangalore, India). Abstract: Precise analysis of pointer information plays an important role in many static analysis tools. The precision, however, must be balanced against the scalability of the analysis. This paper focusses on improving the precision of standard context and flow insensitive alias analysis algorithms at a low scalability cost. In particular, we present a semantics-preserving program transformation that drastically improves the precision of existing analyses when deciding if a pointer can alias Null. Our program transformation is based on Global Value Numbering, a scheme inspired from compiler optimization literature. It allows even a flow-insensitive analysis to make use of branch conditions such as checking if a pointer is Null and gain precision. We perform experiments on real-world code and show that the transformation improves precision (in terms of the number of dereferences proved safe) from 86.56% to 98.05%, while incurring a small overhead in the running time. Keywords: Alias Analysis, Global Value Numbering, Static Single Assignment, Null Pointer Analysis. 1 Introduction. Detecting and eliminating null-pointer exceptions is an important step towards developing reliable systems. Static analysis tools that look for null-pointer exceptions typically employ techniques based on alias analysis to detect possible aliasing between pointers. Two pointer-valued variables are said to alias if they hold the same memory location during runtime. Statically, aliasing can be decided in two ways: (a) may-alias [1], where two pointers are said to may-alias if they can point to the same memory location under some possible execution, and (b) must-alias [27], where two pointers are said to must-alias if they always point to the same memory location under all possible executions.
  • Value Numbering
    SOFTWARE—PRACTICE AND EXPERIENCE, VOL. 0(0), 1–18 (MONTH 1900). Value Numbering. PRESTON BRIGGS, Tera Computer Company, 2815 Eastlake Avenue East, Seattle, WA 98102, and KEITH D. COOPER, L. TAYLOR SIMPSON, Rice University, 6100 Main Street, Mail Stop 41, Houston, TX 77005. Summary: Value numbering is a compiler-based program analysis method that allows redundant computations to be removed. This paper compares hash-based approaches derived from the classic local algorithm [1] with partitioning approaches based on the work of Alpern, Wegman, and Zadeck [2]. Historically, the hash-based algorithm has been applied to single basic blocks or extended basic blocks. We have improved the technique to operate over the routine's dominator tree. The partitioning approach partitions the values in the routine into congruence classes and removes computations when one congruent value dominates another. We have extended this technique to remove computations that define a value in the set of available expressions (AVAIL) [3]. Also, we are able to apply a version of Morel and Renvoise's partial redundancy elimination [4] to remove even more redundancies. The paper presents a series of hash-based algorithms and a series of refinements to the partitioning technique. Within each series, it can be proved that each method discovers at least as many redundancies as its predecessors. Unfortunately, no such relationship exists between the hash-based and global techniques. On some programs, the hash-based techniques eliminate more redundancies than the partitioning techniques, while on others, partitioning wins. We experimentally compare the improvements made by these techniques when applied to real programs. These results will be useful for commercial compiler writers who wish to assess the potential impact of each technique before implementation.
  • Global Value Numbering Using Random Interpretation
    Global Value Numbering using Random Interpretation. Sumit Gulwani ([email protected]) and George C. Necula ([email protected]), Department of Electrical Engineering and Computer Science, University of California, Berkeley, Berkeley, CA 94720-1776. General Terms: Algorithms, Theory, Verification. Keywords: Global Value Numbering, Herbrand Equivalences, Random Interpretation, Randomized Algorithm, Uninterpreted Functions. Abstract: We present a polynomial time randomized algorithm for global value numbering. Our algorithm is complete when conditionals are treated as non-deterministic and all operators are treated as uninterpreted functions. We are not aware of any complete polynomial-time deterministic algorithm for the same problem. The algorithm does not require symbolic manipulations and hence is simpler to implement than the deterministic symbolic algorithms. The price for these benefits is that there is a probability that the algorithm can report a false equality. We prove that this probability can be made arbitrarily small by controlling various parameters of the algorithm. 1 Introduction. Detecting equivalence of expressions in a program is a prerequisite for many important optimizations like constant and copy propagation [18], common sub-expression elimination, invariant code motion [3, 13], induction variable elimination, branch elimination, branch fusion, and loop jamming [10]. It is also important for discovering equivalent computations in different programs, for example, plagiarism detection and translation validation [12, 11]. Our algorithm is based on the idea of random interpretation, which relies on executing a program on a number of random inputs and discovering relationships from the computed values. The computations are done by giving random linear interpretations to the operators in the program. Both branches of a conditional are executed. At join points, the program states are combined using a random affine combination.
  • Code Optimization
    Code Optimization. Note by Baris Aktemur: our slides are adapted from Cooper and Torczon's slides that they prepared for COMP 412 at Rice. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. Traditional three-phase compiler: Source Code → Front End → IR → Optimizer → IR → Back End → Machine code, with errors reported from each phase. Optimization (or code improvement) analyzes IR and rewrites (or transforms) IR; the primary goal is to reduce the running time of the compiled code, though it may also improve space, power consumption, and more. It must preserve the "meaning" of the code, as measured by the values of named variables; it is a course (or two) unto itself. The Optimizer: modern optimizers are structured as a series of passes (IR → Opt 1 → Opt 2 → ... → Opt n → IR). Typical transformations: discover and propagate some constant value; move a computation to a less frequently executed place; specialize some computation based on context; discover a redundant computation and remove it; remove useless or unreachable code; encode an idiom in some particularly efficient form. The role of the optimizer: the compiler can implement a procedure in many ways, and the optimizer tries to find an implementation that is "better" (speed, code size, data space, ...). To accomplish this, it analyzes the code to derive knowledge
  • Topic 1D Slides
    Topic I (d): Static Single Assignment Form (SSA) [course slides 621-10F/Topic-1d-SSA]. Reading list: slides for Topic Ix; other readings as assigned in class. ABET outcomes: ability to apply knowledge of the SSA technique in compiler optimization; an ability to formulate and solve the basic SSA construction problem based on the techniques introduced in class; ability to analyze the basic algorithms using SSA form to express and formulate dataflow analysis problems; knowledge of contemporary issues on this topic. Roadmap: motivation; introduction to SSA form; construction method; application of SSA to dataflow analysis problems; PRE (Partial Redundancy Elimination) and SSAPRE; summary. Prelude: a program is said to be in SSA form iff each variable is statically defined exactly once, and each use of a variable is dominated by that variable's definition. So, is straight-line code in SSA form? Example: definitions X1 and X2 on two merging branches feed X3 := φ(X1, X2), followed by X4. In general, how do we transform an arbitrary program into SSA form? Does the definition of X2 dominate its use in the example? SSA motivation: provide a uniform basis of an IR to solve a wide range of classical dataflow problems; encode both dataflow and control flow information; an SSA form can be constructed and maintained efficiently; many SSA dataflow analysis algorithms are more efficient (have lower complexity) than their CFG counterparts. Algorithm complexity: assume a 1 GHz machine, and an algorithm that takes f(n) steps (1 step = 1 nanosecond).
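For the straight-line case the slides ask about, SSA construction reduces to renaming: every assignment creates a fresh version of its variable, and every use refers to the latest version. The sketch below is ours (invented encoding, no control flow, so no φ-functions are needed):

```java
import java.util.*;

class SsaRename {
    // Minimal SSA renaming for straight-line three-address code.
    // Each statement is {def, operand} or {def, operand1, op, operand2};
    // operands that name a previously defined variable get its current
    // version, and each definition mints a fresh version.
    static List<String> rename(List<String[]> stmts) {
        Map<String, Integer> version = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String[] s : stmts) {
            StringBuilder rhs = new StringBuilder();
            for (int i = 1; i < s.length; i++) {
                String t = s[i];
                // known variables are versioned; constants and operators pass through
                rhs.append(version.containsKey(t) ? t + version.get(t) : t);
                if (i < s.length - 1) rhs.append(' ');
            }
            int v = version.merge(s[0], 1, Integer::sum);  // fresh version for the def
            out.add(s[0] + v + " = " + rhs);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> prog = Arrays.asList(
            new String[]{"x", "1"},
            new String[]{"x", "x", "+", "2"},   // uses x1, defines x2
            new String[]{"y", "x", "*", "x"});  // both uses see x2
        System.out.println(rename(prog));
        // [x1 = 1, x2 = x1 + 2, y1 = x2 * x2]
    }
}
```

With branches, the missing piece is exactly the φ-placement the slides go on to discuss: at a merge point the renamer must introduce a new version defined as φ of the versions flowing in from each predecessor.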
  • A General Compiler Framework for Speculative Optimizations Using Data Speculative Code Motion
    A General Compiler Framework for Speculative Optimizations Using Data Speculative Code Motion. Xiaoru Dai, Antonia Zhai, Wei-Chung Hsu, Pen-Chung Yew, Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455, {dai, zhai, hsu, yew}@cs.umn.edu. Abstract: Data speculative optimization refers to code transformations that allow load and store instructions to be moved across potentially dependent memory operations. Existing research work on data speculative optimizations has mainly focused on individual code transformations. The required speculative analysis that identifies data speculative optimization opportunities and the required recovery code generation that guarantees the correctness of their execution are handled separately for each optimization. Obtaining precise data dependence analysis is both difficult and expensive for languages such as C in which dynamic and pointer-based data structures are frequently used. When the data dependence analysis is unable to show that there is definitely no data dependence between two memory references, the compiler must assume that there is a data dependence between them. It is quite often that such an assumption is overly conservative. The examples in Figure 1 illustrate how such conservative data dependences may affect compiler optimizations. (Figure 1 code fragments elided.) This paper proposes a new compiler framework to facilitate the design and implementation of general data speculative optimizations such as dead store
  • Redundant Computation Elimination Optimizations
    Redundant Computation Elimination Optimizations. CS2210 Lecture 20, CS2210 Compiler Design 2004/5. Redundancy elimination comes in several categories: value numbering (local and global); common subexpression elimination (CSE, local and global); loop-invariant code motion; and partial redundancy elimination, the most complex, which subsumes CSE and loop-invariant code motion. Value numbering. Goal: identify expressions that have the same value. Approach: hash each expression to a hash code; then only expressions that hash to the same value need be compared. Example: a := x | y; b := x | y; t1 := !z; if t1 goto L1; x := !z; c := x & y; t2 := x & y; if t2 trap 30. Global value numbering is a generalization of value numbering for procedures; it requires code to be in SSA form. The crucial notion is congruence: two variables are congruent if their defining computations have identical operators and congruent operands. This is only precise in SSA form; otherwise the definition may not dominate the use. Value graphs: a labeled, directed graph whose nodes are labeled with operators, function symbols, or constants; edges point from an operator (or function) to its operands, and number labels indicate operand position. Example (SSA form): B1: receive n1(val); i1 := 1; j1 := 1. B2: i3 := φ(i1, i2); j3 := φ(j1, j2); branch on i3 mod 2 = 0. B3: i4 := i3 + 1; j4 := j3 + 1. B4: i5 := i3 + 3; j5 := j3 + 3. B5: i2 := φ(i4, i5); j2 := φ(j4, j5); exit when j2 > n1. Congruence is the maximal relation on the value graph such that two nodes are congruent iff they are the same node, or their labels are constants and the constants are the same, or they have the same operators and their operands are congruent. Variable equivalence: x and y are equivalent at program point p iff they are congruent and their defining assignments dominate p. Global Value Numbering Algo.
  • A Simple Graph-Based Intermediate Representation
    A Simple Graph-Based Intermediate Representation Cliff Click Michael Paleczny [email protected] [email protected] understand, and easy to extend. Our goal is a repre- Abstract sentation that is simple and light weight while allowing We present a graph-based intermediate representation easy expression of fast optimizations. (IR) with simple semantics and a low-memory-cost C++ This paper discusses the intermediate representa- implementation. The IR uses a directed graph with la- beled vertices and ordered inputs but unordered outputs. tion (IR) used in the research compiler implemented as Vertices are labeled with opcodes, edges are unlabeled. part of the author’s dissertation [8]. The parser that We represent the CFG and basic blocks with the same builds this IR performs significant parse-time optimi- vertex and edge structures. Each opcode is defined by a zations, including building a form of Static Single As- C++ class that encapsulates opcode-specific data and be- signment (SSA) at parse-time. Classic optimizations havior. We use inheritance to abstract common opcode such as Conditional Constant Propagation [23] and behavior, allowing new opcodes to be easily defined from Global Value Numbering [20] as well as a novel old ones. The resulting IR is simple, fast and easy to use. global code motion algorithm [9] work well on the IR. These topics are beyond the scope of this paper but are 1. Introduction covered in Click’s thesis. Intermediate representations do not exist in a vac- The intermediate representation is a graph-based, uum. They are the stepping stone from what the pro- object-oriented structure, similar in spirit to an opera- grammer wrote to what the machine understands.
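The design the abstract describes (each opcode defined by a class that encapsulates opcode-specific behavior, vertices with ordered inputs, inheritance to share common behavior) can be miniaturized as follows. This is a Java sketch with invented class names, not the paper's C++ IR; note how sharing a node makes a common subexpression exist only once in the graph:

```java
import java.util.*;

// Each opcode is a class; a node holds its ordered inputs.
abstract class Node {
    final List<Node> in;                       // ordered inputs
    Node(Node... inputs) { in = Arrays.asList(inputs); }
    abstract int eval();                       // toy evaluation for testing
}

class ConstNode extends Node {
    final int value;
    ConstNode(int v) { value = v; }
    int eval() { return value; }
}

class AddNode extends Node {
    AddNode(Node a, Node b) { super(a, b); }
    int eval() { return in.get(0).eval() + in.get(1).eval(); }
}

class MulNode extends Node {
    MulNode(Node a, Node b) { super(a, b); }
    int eval() { return in.get(0).eval() * in.get(1).eval(); }
}

class GraphIrDemo {
    public static void main(String[] args) {
        // (2 + 3) * (2 + 3): the AddNode is shared, so the redundant
        // computation appears exactly once in the graph.
        Node sum = new AddNode(new ConstNode(2), new ConstNode(3));
        Node prod = new MulNode(sum, sum);
        System.out.println(prod.eval()); // 25
    }
}
```

New opcodes are added by subclassing an existing node class, which is the extensibility argument the paper makes for defining opcodes via inheritance.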
  • Global Value Numbering
    Reuse Optimization / Local Value Numbering [CS553 lecture slides]. Announcement: HW2 is due Monday! I will not accept late HW2 turnins. Idea: eliminate redundant operations in the dynamic execution of instructions. How do redundancies arise? Loop-invariant code (e.g., index calculation for arrays); sequences of similar operations (e.g., method lookup); the same value generated in multiple places in the code. Types of reuse optimization: value numbering, common subexpression elimination, partial redundancy elimination. Local value numbering. Idea: each variable, expression, and constant is assigned a unique number; when we encounter a variable, expression or constant, see if it has already been assigned a number; if so, use the value for that number; if not, assign a new number; same number ⇒ same value. Example: a := b + c; d := b; b := a; e := d + c. Numbering: b → #1, c → #2, b + c is #1 + #2 → #3, so a → #3; d → #1; then d + c is again #1 + #2 → #3, so e → #3. Local value numbering (cont.): temporaries may be necessary. In a := b + c; a := b; d := a + c, the later a + c has the same value number (#1 + #2 → #3) as the earlier b + c, but a has been overwritten, so the code is rewritten with a temporary: t := b + c; a := t; a := b; d := t. Global value numbering: how do we handle control flow? If one branch sets w = 5 and x = 5 while the other sets w = 8 and x = 8 (w → #1, x → #1 on one side; w → #2, x → #2 on the other), then after the merge y = w + 1 and z = x + 1 compute the same value along either path, which local numbering cannot see. Role of SSA form. Idea [Alpern, Wegman, and Zadeck 1988]: partition program variables into congruence classes, where all variables in a particular congruence class have the same value. SSA form is helpful: it allows us to avoid data-flow analysis, and variables correspond to values, e.g., a = b becomes a1 = b.
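The local value numbering algorithm walked through in the slides can be sketched directly. This is a toy cut at the local, hash-based algorithm (our encoding; commutativity and constant folding are omitted): each statement is {dst, src1, op, src2} or a copy {dst, src}, and a repeated value number marks a redundant right-hand side.

```java
import java.util.*;

class Lvn {
    // Returns the value number computed by each statement of a basic block.
    static int[] number(String[][] block) {
        Map<String, Integer> varNum = new HashMap<>();   // variable/constant -> value number
        Map<String, Integer> exprNum = new HashMap<>();  // "n op m" -> value number
        int[] next = {0};
        int[] out = new int[block.length];
        for (int s = 0; s < block.length; s++) {
            String[] st = block[s];
            int v;
            if (st.length == 2) {
                v = num(varNum, st[1], next);            // copy: reuse the source's number
            } else {
                // canonical key over operand numbers, so equal-valued operands collide
                String key = num(varNum, st[1], next) + " " + st[2] + " " + num(varNum, st[3], next);
                Integer hit = exprNum.get(key);
                if (hit == null) { hit = next[0]++; exprNum.put(key, hit); }
                v = hit;
            }
            varNum.put(st[0], v);                        // (re)bind the destination
            out[s] = v;
        }
        return out;
    }

    static int num(Map<String, Integer> varNum, String operand, int[] next) {
        Integer n = varNum.get(operand);
        if (n == null) { n = next[0]++; varNum.put(operand, n); }
        return n;
    }

    public static void main(String[] args) {
        String[][] block = {
            {"a", "b", "+", "c"},
            {"d", "b"},
            {"b", "a"},
            {"e", "d", "+", "c"},   // same value as the earlier b + c
        };
        System.out.println(Arrays.toString(number(block)));  // [2, 0, 2, 2]
    }
}
```

On the slides' example the last statement gets the same number as the first, so d + c is redundant even though it is textually different from b + c: the copy d := b made their operand numbers collide.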
  • GVN Notes for Slides
    [PLDI'02, 17 June 2002] This talk is divided into 7 parts. Part 1 begins with a quick look at the SSA optimization framework and global value numbering. Part 2 describes a well known brute force algorithm for GVN. Part 3 modifies this to arrive at an efficient sparse algorithm. Part 4 unifies the sparse algorithm with a wide range of additional analyses. Parts 5-7 put it all together, present measurements and suggest conclusions. The framework of an SSA based global optimizer can be represented as a pipeline of processing stages. The intermediate representation of a routine flows in at the top and is first translated to static single assignment form. The SSA form IR is now processed by a number of passes, one of which is global value numbering. The results of GVN are used to transform the IR. Finally the IR is translated out of SSA form, and optimized IR flows out of the pipeline. These are the basic notions of value numbering. A value is a constant or an SSA variable. The values of a routine can be partitioned into congruence classes. Congruent values are guaranteed to be identical for any possible execution of the routine. Every congruence class has a representative value called a leader. GVN is an analysis phase - it does not transform IR. Its input is the SSA form IR of a routine. It produces 4 outputs: the congruence classes of the routine, the values in every congruence class, the leader of every congruence class and the congruence class of every value.