Inlining: a Tool for Eliminating Mutual Recursion
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
A Deep Dive Into the Interprocedural Optimization Infrastructure
Stes Bais [email protected] Kut el [email protected] Shi Oku [email protected] A Deep Dive into the Luf Cen Interprocedural [email protected] Hid Ue Optimization Infrastructure [email protected] Johs Dor [email protected] Outline ● What is IPO? Why is it? ● Introduction of IPO passes in LLVM ● Inlining ● Attributor What is IPO? What is IPO? ● Pass Kind in LLVM ○ Immutable pass Intraprocedural ○ Loop pass ○ Function pass ○ Call graph SCC pass ○ Module pass Interprocedural IPO considers more than one function at a time Call Graph ● Node : functions ● Edge : from caller to callee A void A() { B(); C(); } void B() { C(); } void C() { ... B C } Call Graph SCC ● SCC stands for “Strongly Connected Component” A D G H I B C E F Call Graph SCC ● SCC stands for “Strongly Connected Component” A D G H I B C E F Passes In LLVM IPO passes in LLVM ● Where ○ Almost all IPO passes are under llvm/lib/Transforms/IPO Categorization of IPO passes ● Inliner ○ AlwaysInliner, Inliner, InlineAdvisor, ... ● Propagation between caller and callee ○ Attributor, IP-SCCP, InferFunctionAttrs, ArgumentPromotion, DeadArgumentElimination, ... ● Linkage and Globals ○ GlobalDCE, GlobalOpt, GlobalSplit, ConstantMerge, ... ● Others ○ MergeFunction, OpenMPOpt, HotColdSplitting, Devirtualization... 13 Why is IPO? ● Inliner ○ Specialize the function with call site arguments ○ Expose local optimization opportunities ○ Save jumps, register stores/loads (calling convention) ○ Improve instruction locality ● Propagation between caller and callee ○ Other passes would benefit from the propagated information ● Linkage -
Unfold/Fold Transformations and Loop Optimization of Logic Programs
Unfold/Fold Transformations and Loop Optimization of Logic Programs Saumya K. Debray Department of Computer Science The University of Arizona Tucson, AZ 85721 Abstract: Programs typically spend much of their execution time in loops. This makes the generation of ef®cient code for loops essential for good performance. Loop optimization of logic programming languages is complicated by the fact that such languages lack the iterative constructs of traditional languages, and instead use recursion to express loops. In this paper, we examine the application of unfold/fold transformations to three kinds of loop optimization for logic programming languages: recur- sion removal, loop fusion and code motion out of loops. We describe simple unfold/fold transformation sequences for these optimizations that can be automated relatively easily. In the process, we show that the properties of uni®cation and logical variables can sometimes be used to generalize, from traditional languages, the conditions under which these optimizations may be carried out. Our experience suggests that such source-level transformations may be used as an effective tool for the optimization of logic pro- grams. 1. Introduction The focus of this paper is on the static optimization of logic programs. Speci®cally, we investigate loop optimization of logic programs. Since programs typically spend most of their time in loops, the generation of ef®cient code for loops is essential for good performance. In the context of logic programming languages, the situation is complicated by the fact that iterative constructs, such as for or while, are unavailable. Loops are usually expressed using recursive procedures, and loop optimizations have be considered within the general framework of inter- procedural optimization. -
A Right-To-Left Type System for Value Recursion
1 A right-to-left type system for value recursion 61 2 62 3 63 4 Alban Reynaud Gabriel Scherer Jeremy Yallop 64 5 ENS Lyon, France INRIA, France University of Cambridge, UK 65 6 66 1 Introduction Seeking to address these problems, we designed and imple- 7 67 mented a new check for recursive definition safety based ona 8 In OCaml recursive functions are defined using the let rec opera- 68 novel static analysis, formulated as a simple type system (which we 9 tor, as in the following definition of factorial: 69 have proved sound with respect to an existing operational seman- 10 let rec fac x = if x = 0 then 1 70 tics [Nordlander et al. 2008]), and implemented as part of OCaml’s 11 else x * (fac (x - 1)) 71 type-checking phase. Our check was merged into the OCaml distri- 12 Beside functions, let rec can define recursive values, such as 72 bution in August 2018. 13 an infinite list ones where every element is 1: 73 Moving the check from the middle end to the type checker re- 14 74 let rec ones = 1 :: ones stores the desirable property that compilation of well-typed programs 15 75 Note that this “infinite” list is actually cyclic, and consists of asingle does not go wrong. This property is convenient for tools that reuse 16 76 cons-cell referencing itself. OCaml’s type-checker without performing compilation, such as 17 77 However, not all recursive definitions can be computed. The MetaOCaml [Kiselyov 2014] (which type-checks quoted code) and 18 78 following definition is justly rejected by the compiler: Merlin [Bour et al. -
Handout – Dataflow Optimizations Assignment
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2013 Handout – Dataflow Optimizations Assignment Tuesday, Mar 19 DUE: Thursday, Apr 4, 9:00 pm For this part of the project, you will add dataflow optimizations to your compiler. At the very least, you must implement global common subexpression elimination. The other optimizations listed below are optional. You may also wait until the next project to implement them if you are going to; there is no requirement to implement other dataflow optimizations in this project. We list them here as suggestions since past winners of the compiler derby typically implement each of these optimizations in some form. You are free to implement any other optimizations you wish. Note that you will be implementing register allocation for the next project, so you don’t need to concern yourself with it now. Global CSE (Common Subexpression Elimination): Identification and elimination of redun- dant expressions using the algorithm described in lecture (based on available-expression anal- ysis). See §8.3 and §13.1 of the Whale book, §10.6 and §10.7 in the Dragon book, and §17.2 in the Tiger book. Global Constant Propagation and Folding: Compile-time interpretation of expressions whose operands are compile time constants. See the algorithm described in §12.1 of the Whale book. Global Copy Propagation: Given a “copy” assignment like x = y , replace uses of x by y when legal (the use must be reached by only this def, and there must be no modification of y on any path from the def to the use). -
Comparative Studies of Programming Languages; Course Lecture Notes
Comparative Studies of Programming Languages, COMP6411 Lecture Notes, Revision 1.9 Joey Paquet Serguei A. Mokhov (Eds.) August 5, 2010 arXiv:1007.2123v6 [cs.PL] 4 Aug 2010 2 Preface Lecture notes for the Comparative Studies of Programming Languages course, COMP6411, taught at the Department of Computer Science and Software Engineering, Faculty of Engineering and Computer Science, Concordia University, Montreal, QC, Canada. These notes include a compiled book of primarily related articles from the Wikipedia, the Free Encyclopedia [24], as well as Comparative Programming Languages book [7] and other resources, including our own. The original notes were compiled by Dr. Paquet [14] 3 4 Contents 1 Brief History and Genealogy of Programming Languages 7 1.1 Introduction . 7 1.1.1 Subreferences . 7 1.2 History . 7 1.2.1 Pre-computer era . 7 1.2.2 Subreferences . 8 1.2.3 Early computer era . 8 1.2.4 Subreferences . 8 1.2.5 Modern/Structured programming languages . 9 1.3 References . 19 2 Programming Paradigms 21 2.1 Introduction . 21 2.2 History . 21 2.2.1 Low-level: binary, assembly . 21 2.2.2 Procedural programming . 22 2.2.3 Object-oriented programming . 23 2.2.4 Declarative programming . 27 3 Program Evaluation 33 3.1 Program analysis and translation phases . 33 3.1.1 Front end . 33 3.1.2 Back end . 34 3.2 Compilation vs. interpretation . 34 3.2.1 Compilation . 34 3.2.2 Interpretation . 36 3.2.3 Subreferences . 37 3.3 Type System . 38 3.3.1 Type checking . 38 3.4 Memory management . -
Introduction Inline Expansion
CSc 553 Principles of Compilation Introduction 29 : Optimization IV Department of Computer Science University of Arizona [email protected] Copyright c 2011 Christian Collberg Inline Expansion I The most important and popular inter-procedural optimization is inline expansion, that is replacing the call of a procedure Inline Expansion with the procedure’s body. Why would you want to perform inlining? There are several reasons: 1 There are a number of things that happen when a procedure call is made: 1 evaluate the arguments of the call, 2 push the arguments onto the stack or move them to argument transfer registers, 3 save registers that contain live values and that might be trashed by the called routine, 4 make the jump to the called routine, Inline Expansion II Inline Expansion III 1 continued.. 3 5 make the jump to the called routine, ... This is the result of programming with abstract data types. 6 set up an activation record, Hence, there is often very little opportunity for optimization. 7 execute the body of the called routine, However, when inlining is performed on a sequence of 8 return back to the callee, possibly returning a result, procedure calls, the code from the bodies of several procedures 9 deallocate the activation record. is combined, opening up a larger scope for optimization. 2 Many of these actions don’t have to be performed if we inline There are problems, of course. Obviously, in most cases the the callee in the caller, and hence much of the overhead size of the procedure call code will be less than the size of the associated with procedure calls is optimized away. -
PGI Compilers
USER'S GUIDE FOR X86-64 CPUS Version 2019 TABLE OF CONTENTS Preface............................................................................................................ xii Audience Description......................................................................................... xii Compatibility and Conformance to Standards............................................................xii Organization................................................................................................... xiii Hardware and Software Constraints.......................................................................xiv Conventions.................................................................................................... xiv Terms............................................................................................................ xv Related Publications.........................................................................................xvii Chapter 1. Getting Started.....................................................................................1 1.1. Overview................................................................................................... 1 1.2. Creating an Example..................................................................................... 2 1.3. Invoking the Command-level PGI Compilers......................................................... 2 1.3.1. Command-line Syntax...............................................................................2 1.3.2. Command-line Options............................................................................ -
Functional Programming (ML)
Functional Programming CSE 215, Foundations of Computer Science Stony Brook University http://www.cs.stonybrook.edu/~cse215 Functional Programming Function evaluation is the basic concept for a programming paradigm that has been implemented in functional programming languages. The language ML (“Meta Language”) was originally introduced in the 1970’s as part of a theorem proving system, and was intended for describing and implementing proof strategies. Standard ML of New Jersey (SML) is an implementation of ML. The basic mode of computation in SML is the use of the definition and application of functions. 2 (c) Paul Fodor (CS Stony Brook) Install Standard ML Download from: http://www.smlnj.org Start Standard ML: Type ''sml'' from the shell (run command line in Windows) Exit Standard ML: Ctrl-Z under Windows Ctrl-D under Unix/Mac 3 (c) Paul Fodor (CS Stony Brook) Standard ML The basic cycle of SML activity has three parts: read input from the user, evaluate it, print the computed value (or an error message). 4 (c) Paul Fodor (CS Stony Brook) First SML example SML prompt: - Simple example: - 3; val it = 3 : int The first line contains the SML prompt, followed by an expression typed in by the user and ended by a semicolon. The second line is SML’s response, indicating the value of the input expression and its type. 5 (c) Paul Fodor (CS Stony Brook) Interacting with SML SML has a number of built-in operators and data types. it provides the standard arithmetic operators - 3+2; val it = 5 : int The Boolean values true and false are available, as are logical operators such as not (negation), andalso (conjunction), and orelse (disjunction). -
Abstract Recursion and Intrinsic Complexity
ABSTRACT RECURSION AND INTRINSIC COMPLEXITY Yiannis N. Moschovakis Department of Mathematics University of California, Los Angeles [email protected] October 2018 iv Abstract recursion and intrinsic complexity was first published by Cambridge University Press as Volume 48 in the Lecture Notes in Logic, c Association for Symbolic Logic, 2019. The Cambridge University Press catalog entry for the work can be found at https://www.cambridge.org/us/academic/subjects/mathematics /logic-categories-and-sets /abstract-recursion-and-intrinsic-complexity. The published version can be purchased through Cambridge University Press and other standard distribution channels. This copy is made available for personal use only and must not be sold or redistributed. This final prepublication draft of ARIC was compiled on November 30, 2018, 22:50 CONTENTS Introduction ................................................... .... 1 Chapter 1. Preliminaries .......................................... 7 1A.Standardnotations................................ ............. 7 Partial functions, 9. Monotone and continuous functionals, 10. Trees, 12. Problems, 14. 1B. Continuous, call-by-value recursion . ..................... 15 The where -notation for mutual recursion, 17. Recursion rules, 17. Problems, 19. 1C.Somebasicalgorithms............................. .................... 21 The merge-sort algorithm, 21. The Euclidean algorithm, 23. The binary (Stein) algorithm, 24. Horner’s rule, 25. Problems, 25. 1D.Partialstructures............................... ...................... -
Generating Mutually Recursive Definitions
Generating Mutually Recursive Definitions Jeremy Yallop Oleg Kiselyov University of Cambridge, UK Tohoku University, Japan [email protected] [email protected] Abstract let rec s = function A :: r ! s r j B :: r ! t r j [] ! true and t = function A :: r ! s r j B :: r ! u r j [] ! false Many functional programs — state machines (Krishnamurthi and u = function A :: r ! t r j B :: r ! u r j [] ! false 2006), top-down and bottom-up parsers (Hutton and Meijer 1996; Hinze and Paterson 2003), evaluators (Abelson et al. 1984), GUI where each function s, t, and u realizes a recognizer, taking a list of initialization graphs (Syme 2006), &c. — are conveniently ex- A and B symbols and returning a boolean. However, the program pressed as groups of mutually recursive bindings. One therefore that builds such a group from a description of an arbitrary state expects program generators, such as those written in MetaOCaml, machine cannot be expressed in MetaOCaml. to be able to build programs with mutual recursion. The limited support for generating mutual recursion is a con- Unfortunately, currently MetaOCaml can only build recursive sequence of expression-based quotation: brackets enclose expres- groups whose size is hard-coded in the generating program. The sions, and splices insert expressions into expressions — but a group general case requires something other than quotation, and seem- of bindings is not an expression. There is a second difficulty: gen- ingly weakens static guarantees on the resulting code. We describe erating recursive definitions with ‘backward’ and ‘forward’ refer- the challenges and propose a new language construct for assuredly ences seemingly requires unrestricted, Lisp-like gensym, which de- generating binding groups of arbitrary size – illustrating with a col- feats MetaOCaml’s static guarantees. -
Dataflow Optimizations
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2010 Handout – Dataflow Optimizations Assignment Tuesday, Mar 16 DUE: Thursday, Apr 1 (11:59pm) For this part of the project, you will add dataflow optimizations to your compiler. At the very least, you must implement global common subexpression elimination. The other optimizations listed below are optional. We list them here as suggestions since past winners of the compiler derby typically implement each of these optimizations in some form. You are free to implement any other optimizations you wish. Note that you will be implementing register allocation for the next project, so you don’t need to concern yourself with it now. Global CSE (Common Subexpression Elimination): Identification and elimination of redun dant expressions using the algorithm described in lecture (based on available-expression anal ysis). See §8.3 and §13.1 of the Whale book, §10.6 and §10.7 in the Dragon book, and §17.2 in the Tiger book. Global Constant Propagation and Folding: Compile-time interpretation of expressions whose operands are compile time constants. See the algorithm described in §12.1 of the Whale book. Global Copy Propagation: Given a “copy” assignment like x = y , replace uses of x by y when legal (the use must be reached by only this def, and there must be no modification of y on any path from the def to the use). See §12.5 of the Whale book and §10.7 of the Dragon book. Loop-invariant Code Motion (code hoisting): Moving invariant code from within a loop to a block prior to that loop. -
Eliminating Scope and Selection Restrictions in Compiler Optimizations
ELIMINATING SCOPE AND SELECTION RESTRICTIONS IN COMPILER OPTIMIZATION SPYRIDON TRIANTAFYLLIS ADISSERTATION PRESENTED TO THE FACULTY OF PRINCETON UNIVERSITY IN CANDIDACY FOR THE DEGREE OF DOCTOR OF PHILOSOPHY RECOMMENDED FOR ACCEPTANCE BY THE DEPARTMENT OF COMPUTER SCIENCE SEPTEMBER 2006 c Copyright by Spyridon Triantafyllis, 2006. All Rights Reserved Abstract To meet the challenges presented by the performance requirements of modern architectures, compilers have been augmented with a rich set of aggressive optimizing transformations. However, the overall compilation model within which these transformations operate has remained fundamentally unchanged. This model imposes restrictions on these transforma- tions’ application, limiting their effectiveness. First, procedure-based compilation limits code transformations within a single proce- dure’s boundaries, which may not present an ideal optimization scope. Although aggres- sive inlining and interprocedural optimization can alleviate this problem, code growth and compile time considerations limit their applicability. Second, by applying a uniform optimization process on all codes, compilers cannot meet the particular optimization needs of each code segment. Although the optimization process is tailored by heuristics that attempt to a priori judge the effect of each transfor- mation on final code quality, the unpredictability of modern optimization routines and the complexity of the target architectures severely limit the accuracy of such predictions. This thesis focuses on removing these restrictions through two novel compilation frame- work modifications, Procedure Boundary Elimination (PBE) and Optimization-Space Ex- ploration (OSE). PBE forms compilation units independent of the original procedures. This is achieved by unifying the entire application into a whole-program control-flow graph, allowing the compiler to repartition this graph into free-form regions, making analysis and optimization routines able to operate on these generalized compilation units.