Equality Saturation: a New Approach to Optimization ∗
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Exploring Languages with Interpreters and Functional Programming Chapter 8
Exploring Languages with Interpreters and Functional Programming Chapter 8 H. Conrad Cunningham 24 September 2018 Contents 8 Evaluation Model 2 8.1 Chapter Introduction . .2 8.2 Referential Transparency Revisited . .2 8.3 Substitution Model . .3 8.4 Time and Space Complexity . .6 8.5 Termination . .7 8.6 What Next? . .7 8.7 Exercises . .8 8.8 Acknowledgements . .8 8.9 References . .9 8.10 Terms and Concepts . .9 Copyright (C) 2016, 2017, 2018, H. Conrad Cunningham Professor of Computer and Information Science University of Mississippi 211 Weir Hall P.O. Box 1848 University, MS 38677 (662) 915-5358 Browser Advisory: The HTML version of this textbook requires a browser that supports the display of MathML. A good choice as of September 2018 is a recent version of Firefox from Mozilla. 1 8 Evaluation Model 8.1 Chapter Introduction This chapter introduces an evaluation model applicable to Haskell programs. As in the previous chapters, this chapter focuses on use of first-order functions and primitive data types. A goal of this chapter and the next one is enable students to analyze Haskell functions to determine under what conditions they terminate normally and how efficient they are. This chapter presents the evaluation model and the next chapter informally analyzes simple functions in terms of time and space efficiency and termination. How can we evaluate (i.e. execute) an expression that “calls” a function like the fact1 function from Chapter 4? We do this by rewriting expressions using a substitution model, as we see in this chapter. This process depends upon a property of functional languages called referential transparency. -
Expression Rematerialization for VLIW DSP Processors with Distributed Register Files ?
Expression Rematerialization for VLIW DSP Processors with Distributed Register Files ? Chung-Ju Wu, Chia-Han Lu, and Jenq-Kuen Lee Department of Computer Science, National Tsing-Hua University, Hsinchu 30013, Taiwan {jasonwu,chlu}@pllab.cs.nthu.edu.tw,[email protected] Abstract. Spill code is the overhead of memory load/store behavior if the available registers are not sufficient to map live ranges during the process of register allocation. Previously, works have been proposed to reduce spill code for the unified register file. For reducing power and cost in design of VLIW DSP processors, distributed register files and multi- bank register architectures are being adopted to eliminate the amount of read/write ports between functional units and registers. This presents new challenges for devising compiler optimization schemes for such ar- chitectures. This paper aims at addressing the issues of reducing spill code via rematerialization for a VLIW DSP processor with distributed register files. Rematerialization is a strategy for register allocator to de- termine if it is cheaper to recompute the value than to use memory load/store. In the paper, we propose a solution to exploit the character- istics of distributed register files where there is the chance to balance or split live ranges. By heuristically estimating register pressure for each register file, we are going to treat them as optional spilled locations rather than spilling to memory. The choice of spilled location might pre- serve an expression result and keep the value alive in different register file. It increases the possibility to do expression rematerialization which is effectively able to reduce spill code. -
Week 3: Scope and Evaluation Order Assignment 1 Has Been Posted!
Week 3: Scope and Evaluation Order CSC324 Principles of Programming Languages David Liu, Department of Computer Science Assignment 1 has been posted! Closures and static scope static (adjective) determined only by the program source code dynamic (adjective) determined only when the program is run E.g., referential transparency is a static property How are closures implemented? Are they static or dynamic? Or both? Function bodies can be processed statically (e.g., Haskell compiler generates code once per lambda). The closure environment (and therefore the closure itself) can only be generated dynamically. The closure depends only on where the function is evaluated, not where that function is called. (define (make-adder n) (lambda (x) (+ x n))) (define adder (make-adder 1)) (adder 100) So we can determine where each free identifier obtains its values statically, based on where its enclosing function is defined. scope (of an identifier) The parts of program code that may refer to that identifier. static (aka lexical) scope The scope of every identifier is determined by the structure of the source code (e.g., by nesting of lambdas and lets). Every identifier obtains its value from the closest enclosing expression that binds it. (define (make-adder n) (lambda (x) (+ x n))) (define adder (make-adder 1)) ; (0x..., {n: 1}) (let* ([n 100]) (adder 2)) Implementing static scope in an interpreter A simple interpreter (define/match (interpret env expr) [(_ (? number?)) expr] [(_ (? symbol?)) (hash-ref env expr)] [(_ (list '+ l r)) (+ (interpret env l) (interpret env r))]) A simple interpreter (define/match (interpret env expr) [(_ (? number?)) expr] [(_ (? symbol?)) (hash-ref env expr)] [(_ (list '+ l r)) (+ (interpret env l) (interpret env r))]) The environment is passed recursively when interpreting each subexpression. -
Equality Saturation: a New Approach to Optimization
Logical Methods in Computer Science Vol. 7 (1:10) 2011, pp. 1–37 Submitted Oct. 12, 2009 www.lmcs-online.org Published Mar. 28, 2011 EQUALITY SATURATION: A NEW APPROACH TO OPTIMIZATION ROSS TATE, MICHAEL STEPP, ZACHARY TATLOCK, AND SORIN LERNER Department of Computer Science and Engineering, University of California, San Diego e-mail address: {rtate,mstepp,ztatlock,lerner}@cs.ucsd.edu Abstract. Optimizations in a traditional compiler are applied sequentially, with each optimization destructively modifying the program to produce a transformed program that is then passed to the next optimization. We present a new approach for structuring the optimization phase of a compiler. In our approach, optimizations take the form of equality analyses that add equality information to a common intermediate representation. The op- timizer works by repeatedly applying these analyses to infer equivalences between program fragments, thus saturating the intermediate representation with equalities. Once saturated, the intermediate representation encodes multiple optimized versions of the input program. At this point, a profitability heuristic picks the final optimized program from the various programs represented in the saturated representation. Our proposed way of structuring optimizers has a variety of benefits over previous approaches: our approach obviates the need to worry about optimization ordering, enables the use of a global optimization heuris- tic that selects among fully optimized programs, and can be used to perform translation validation, even on compilers other than our own. We present our approach, formalize it, and describe our choice of intermediate representation. We also present experimental results showing that our approach is practical in terms of time and space overhead, is effective at discovering intricate optimization opportunities, and is effective at performing translation validation for a realistic optimizer. -
Scalable Conditional Induction Variables (CIV) Analysis
Scalable Conditional Induction Variables (CIV) Analysis ifact Cosmin E. Oancea Lawrence Rauchwerger rt * * Comple A t te n * A te s W i E * s e n l C l o D C O Department of Computer Science Department of Computer Science and Engineering o * * c u e G m s E u e C e n R t v e o d t * y * s E University of Copenhagen Texas A & M University a a l d u e a [email protected] [email protected] t Abstract k = k0 Ind. k = k0 DO i = 1, N DO i =1,N Var. DO i = 1, N IF(cond(b(i)))THEN Subscripts using induction variables that cannot be ex- k = k+2 ) a(k0+2*i)=.. civ = civ+1 )? pressed as a formula in terms of the enclosing-loop indices a(k)=.. Sub. ENDDO a(civ) = ... appear in the low-level implementation of common pro- ENDDO k=k0+MAX(2N,0) ENDIF ENDDO gramming abstractions such as filter, or stack operations and (a) (b) (c) pose significant challenges to automatic parallelization. Be- Figure 1. Loops with affine and CIV array accesses. cause the complexity of such induction variables is often due to their conditional evaluation across the iteration space of its closed-form equivalent k0+2*i, which enables its in- loops we name them Conditional Induction Variables (CIV). dependent evaluation by all iterations. More importantly, This paper presents a flow-sensitive technique that sum- the resulted code, shown in Figure 1(b), allows the com- marizes both such CIV-based and affine subscripts to pro- piler to verify that the set of points written by any dis- gram level, using the same representation. -
SI 413, Unit 3: Advanced Scheme
SI 413, Unit 3: Advanced Scheme Daniel S. Roche ([email protected]) Fall 2018 1 Pure Functional Programming Readings for this section: PLP, Sections 10.7 and 10.8 Remember there are two important characteristics of a “pure” functional programming language: • Referential Transparency. This fancy term just means that, for any expression in our program, the result of evaluating it will always be the same. In fact, any referentially transparent expression could be replaced with its value (that is, the result of evaluating it) without changing the program whatsoever. Notice that imperative programming is about as far away from this as possible. For example, consider the C++ for loop: for ( int i = 0; i < 10;++i) { /∗ some s t u f f ∗/ } What can we say about the “stuff” in the body of the loop? Well, it had better not be referentially transparent. If it is, then there’s no point in running over it 10 times! • Functions are First Class. Another way of saying this is that functions are data, just like any number or list. Functions are values, in fact! The specific privileges that a function earns by virtue of being first class include: 1) Functions can be given names. This is not a big deal; we can name functions in pretty much every programming language. In Scheme this just means we can do (define (f x) (∗ x 3 ) ) 2) Functions can be arguments to other functions. This is what you started to get into at the end of Lab 2. For starters, there’s the basic predicate procedure?: (procedure? +) ; #t (procedure? 10) ; #f (procedure? procedure?) ; #t 1 And then there are “higher-order functions” like map and apply: (apply max (list 5 3 10 4)) ; 10 (map sqrt (list 16 9 64)) ; '(4 3 8) What makes the functions “higher-order” is that one of their arguments is itself another function. -
Induction Variable Analysis with Delayed Abstractions1
Induction Variable Analysis with Delayed Abstractions1 SEBASTIAN POP, and GEORGES-ANDRE´ SILBER CRI, Mines Paris, France and ALBERT COHEN ALCHEMY group, INRIA Futurs, Orsay, France We present the design of an induction variable analyzer suitable for the analysis of typed, low-level, three address representations in SSA form. At the heart of our analyzer stands a new algorithm that recognizes scalar evolutions. We define a representation called trees of recurrences that is able to capture different levels of abstractions: from the finer level that is a subset of the SSA representation restricted to arithmetic operations on scalar variables, to the coarser levels such as the evolution envelopes that abstract sets of possible evolutions in loops. Unlike previous work, our algorithm tracks induction variables without prior classification of a few evolution patterns: different levels of abstraction can be obtained on demand. The low complexity of the algorithm fits the constraints of a production compiler as illustrated by the evaluation of our implementation on standard benchmark programs. Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors—compilers, interpreters, optimization, retargetable compilers; F.3.2 [Logics and Meanings of Programs]: Semantics of Programming Languages—partial evaluation, program analysis General Terms: Compilers Additional Key Words and Phrases: Scalar evolutions, static analysis, static single assignment representation, assessing compilers heuristics regressions. 1Extension of Conference Paper: -
A Formally-Verified Alias Analysis
A Formally-Verified Alias Analysis Valentin Robert1;2 and Xavier Leroy1 1 INRIA Paris-Rocquencourt 2 University of California, San Diego [email protected], [email protected] Abstract. This paper reports on the formalization and proof of sound- ness, using the Coq proof assistant, of an alias analysis: a static analysis that approximates the flow of pointer values. The alias analysis con- sidered is of the points-to kind and is intraprocedural, flow-sensitive, field-sensitive, and untyped. Its soundness proof follows the general style of abstract interpretation. The analysis is designed to fit in the Comp- Cert C verified compiler, supporting future aggressive optimizations over memory accesses. 1 Introduction Alias analysis. Most imperative programming languages feature pointers, or object references, as first-class values. With pointers and object references comes the possibility of aliasing: two syntactically-distinct program variables, or two semantically-distinct object fields can contain identical pointers referencing the same shared piece of data. The possibility of aliasing increases the expressiveness of the language, en- abling programmers to implement mutable data structures with sharing; how- ever, it also complicates tremendously formal reasoning about programs, as well as optimizing compilation. In this paper, we focus on optimizing compilation in the presence of pointers and aliasing. Consider, for example, the following C program fragment: ... *p = 1; *q = 2; x = *p + 3; ... Performance would be increased if the compiler propagates the constant 1 stored in p to its use in *p + 3, obtaining ... *p = 1; *q = 2; x = 4; ... This optimization, however, is unsound if p and q can alias. -
Practical and Accurate Low-Level Pointer Analysis
Practical and Accurate Low-Level Pointer Analysis Bolei Guo Matthew J. Bridges Spyridon Triantafyllis Guilherme Ottoni Easwaran Raman David I. August Department of Computer Science, Princeton University {bguo, mbridges, strianta, ottoni, eraman, august}@princeton.edu Abstract High−Level IR Low−Level IR Pointer SuperBlock .... Inlining Opti Scheduling .... Analysis Formation Pointer analysis is traditionally performed once, early Source Machine Code Code in the compilation process, upon an intermediate repre- Lowering sentation (IR) with source-code semantics. However, per- forming pointer analysis only once at this level imposes a Figure 1. Traditional compiler organization phase-ordering constraint, causing alias information to be- char A[10],B[10],C[10]; . come stale after subsequent code transformations. More- foo() { int i; over, high-level pointer analysis cannot be used at link time char *p; or run time, where the source code is unavailable. for (i=0;i<10;i++) { if (...) 1: p = A 2: p = B 1: p = A 2: p = B This paper advocates performing pointer analysis on a 1: p = A; 3': C[i] = p[i] 3: C[i] = p[i] else low-level intermediate representation. We present the first 2: p = B; 4': A[i] = ... 4: A[i] = ... 3: C[i] = p[i]; context-sensitive and partially flow-sensitive points-to anal- 4: A[i] = ...; 3: C[i] = p[i] } 4: A[i] = ... ysis designed to operate at the assembly level. As we will } demonstrate, low-level pointer analysis can be as accurate (a) Source code (b) Source code CFG (c) Transformed code CFG as high-level analysis. Additionally, our low-level pointer analysis also enables a quantitative comparison of prop- agating high-level pointer analysis results through subse- Figure 2. -
Strength Reduction of Induction Variables and Pointer Analysis – Induction Variable Elimination
Loop optimizations • Optimize loops – Loop invariant code motion [last time] Loop Optimizations – Strength reduction of induction variables and Pointer Analysis – Induction variable elimination CS 412/413 Spring 2008 Introduction to Compilers 1 CS 412/413 Spring 2008 Introduction to Compilers 2 Strength Reduction Induction Variables • Basic idea: replace expensive operations (multiplications) with • An induction variable is a variable in a loop, cheaper ones (additions) in definitions of induction variables whose value is a function of the loop iteration s = 3*i+1; number v = f(i) while (i<10) { while (i<10) { j = 3*i+1; //<i,3,1> j = s; • In compilers, this a linear function: a[j] = a[j] –2; a[j] = a[j] –2; i = i+2; i = i+2; f(i) = c*i + d } s= s+6; } •Observation:linear combinations of linear • Benefit: cheaper to compute s = s+6 than j = 3*i functions are linear functions – s = s+6 requires an addition – Consequence: linear combinations of induction – j = 3*i requires a multiplication variables are induction variables CS 412/413 Spring 2008 Introduction to Compilers 3 CS 412/413 Spring 2008 Introduction to Compilers 4 1 Families of Induction Variables Representation • Basic induction variable: a variable whose only definition in the • Representation of induction variables in family i by triples: loop body is of the form – Denote basic induction variable i by <i, 1, 0> i = i + c – Denote induction variable k=i*a+b by triple <i, a, b> where c is a loop-invariant value • Derived induction variables: Each basic induction variable i defines -
Dead Code Elimination Based Pointer Analysis for Multithreaded Programs
Journal of the Egyptian Mathematical Society (2012) 20, 28–37 Egyptian Mathematical Society Journal of the Egyptian Mathematical Society www.etms-eg.org www.elsevier.com/locate/joems ORIGINAL ARTICLE Dead code elimination based pointer analysis for multithreaded programs Mohamed A. El-Zawawy Department of Mathematics, Faculty of Science, Cairo University, Giza 12316, Egypt Available online 2 February 2012 Abstract This paper presents a new approach for optimizing multitheaded programs with pointer constructs. The approach has applications in the area of certified code (proof-carrying code) where a justification or a proof for the correctness of each optimization is required. The optimization meant here is that of dead code elimination. Towards optimizing multithreaded programs the paper presents a new operational semantics for parallel constructs like join-fork constructs, parallel loops, and conditionally spawned threads. The paper also presents a novel type system for flow-sensitive pointer analysis of multithreaded pro- grams. This type system is extended to obtain a new type system for live-variables analysis of mul- tithreaded programs. The live-variables type system is extended to build the third novel type system, proposed in this paper, which carries the optimization of dead code elimination. The justification mentioned above takes the form of type derivation in our approach. ª 2011 Egyptian Mathematical Society. Production and hosting by Elsevier B.V. All rights reserved. 1. Introduction (a) concealing suspension caused by some commands, (b) mak- ing it easier to build huge software systems, (c) improving exe- One of the mainstream programming approaches today is mul- cution of programs specially those that are executed on tithreading. -
Generalizing Loop-Invariant Code Motion in a Real-World Compiler
Imperial College London Department of Computing Generalizing loop-invariant code motion in a real-world compiler Author: Supervisor: Paul Colea Fabio Luporini Co-supervisor: Prof. Paul H. J. Kelly MEng Computing Individual Project June 2015 Abstract Motivated by the perpetual goal of automatically generating efficient code from high-level programming abstractions, compiler optimization has developed into an area of intense research. Apart from general-purpose transformations which are applicable to all or most programs, many highly domain-specific optimizations have also been developed. In this project, we extend such a domain-specific compiler optimization, initially described and implemented in the context of finite element analysis, to one that is suitable for arbitrary applications. Our optimization is a generalization of loop-invariant code motion, a technique which moves invariant statements out of program loops. The novelty of the transformation is due to its ability to avoid more redundant recomputation than normal code motion, at the cost of additional storage space. This project provides a theoretical description of the above technique which is fit for general programs, together with an implementation in LLVM, one of the most successful open-source compiler frameworks. We introduce a simple heuristic-driven profitability model which manages to successfully safeguard against potential performance regressions, at the cost of missing some speedup opportunities. We evaluate the functional correctness of our implementation using the comprehensive LLVM test suite, passing all of its 497 whole program tests. The results of our performance evaluation using the same set of tests reveal that generalized code motion is applicable to many programs, but that consistent performance gains depend on an accurate cost model.