CS502: Compiler Design - Code Optimization (Manas Thakur, Fall 2020)


Fast. Faster. Fastest?

● [Figure: the compiler pipeline. The front end (lexical analyzer, syntax analyzer, semantic analyzer, intermediate code generator) turns the character stream into a token stream, a syntax tree, and finally an intermediate representation, consulting the symbol table throughout; the back end runs a machine-independent code optimizer over the IR, then the code generator, and finally a machine-dependent code optimizer to produce target machine code.]

Role of Code Optimizer

● Make the program better
  – time, memory, energy, ...
● No guarantees in this land!
  – Will a particular optimization improve something for sure?
  – Will performing an optimization affect something else?
  – In what order should I perform the optimizations?
  – At what “scope” should a certain optimization be performed?
  – Is the optimizer fast enough?
● Can an optimized program be optimized further?

Full employment theorem for compiler writers

● Statement: There is no fully optimizing compiler.
● Assume one exists:
  – i.e., it transforms any program P into the smallest program Opt(P) that has the same behaviour as P.
  – The halting problem comes to the rescue:
    ● The smallest program that never halts is: L1: goto L1
  – Thus, a fully optimizing compiler could solve the halting problem: P never halts exactly when Opt(P) is L1: goto L1, so checking the optimizer's output decides halting.
  – But the halting problem is undecidable.
  – Hence, a fully optimizing compiler can’t exist!
● Therefore we talk just about an optimizing compiler,
  – and keep working without worrying about future prospects!

How to perform optimizations?

● Analysis
  – Go over the program
  – Identify some (potentially useful) properties
● Transformation
  – Use the information computed by the analysis to transform the program
    ● without affecting the semantics
● An example that we have (not literally) seen:
  – Compute liveness information
  – Delete assignments to variables that are dead

Classifying optimizations

● Based on scope:
  – Local to basic blocks
  – Intraprocedural
  – Interprocedural
● Based on positioning:
  – High-level (transform source code or a high-level IR)
  – Low-level (transform a mid/low-level IR)
● Based on (in)dependence w.r.t. the target machine:
  – Machine independent (general enough)
  – Machine dependent (specific to the architecture)

May versus Must information

● Consider the program:

    if (c) {
        a = ...
        b = ...
    } else {
        a = ...
        c = ...
    }

● Which variables may be assigned?
  – a, b, c
● Which variables must be assigned?
  – a
● May analysis:
  – the computed information may hold in at least one execution of the program.
● Must analysis:
  – the computed information must hold every time the program is executed.

Many many optimizations

● Constant folding, constant propagation, tail-call elimination, redundancy elimination, dead code elimination, loop-invariant code motion, loop splitting, loop fusion, strength reduction, array scalarization, inlining, synchronization elision, cloning, data prefetching, parallelization, ... etc.
● How do they interact?
  – Optimist: we get the sum of all improvements.
  – Realist: many are in direct opposition.
● Let us study some of them!

Constant propagation

● Idea:
  – If the value of a variable is known to be a constant at compile-time, replace the uses of the variable with the constant.

    n = 10;                      n = 10;
    c = 2;                       c = 2;
    for (i=0; i<n; ++i)    =>    for (i=0; i<10; ++i)
        s = s + i * c;               s = s + i * 2;

  – Usually a very helpful optimization
  – e.g., can we now unroll the loop?
● Why is it good?
● Why could it be bad?
  – When can we eliminate n and c themselves?
● Now you know how different optimizations might interact!

Constant folding

● Idea:
  – If the operands are known at compile-time, evaluate the expression at compile-time.

    r = 3.141 * 10;    =>    r = 31.41;

  – What if the code was the following?

    PI = 3.141;
    r = PI * 10;

    ● Constant propagation (replace PI by 3.141), then constant folding (evaluate 3.141 * 10).
  – And what now?

    PI = 3.141;
    r = PI * 10;
    d = 2 * r;

    ● Propagate and fold repeatedly; this process is called partial evaluation.
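
To make the propagate-then-fold interplay concrete, here is a minimal sketch (not the course's implementation) of the two optimizations over straight-line three-address code. The tuple IR, the prop_and_fold name, and the restriction to +, - and * are assumptions made only for this illustration.

    # A sketch of constant propagation followed by constant folding over
    # straight-line three-address code. Each statement is (dest, op, arg1, arg2);
    # op None means a plain copy, and arguments are variable names or literals.

    def is_const(x):
        return isinstance(x, (int, float))

    def prop_and_fold(stmts):
        consts = {}  # variables currently known to hold a constant value
        out = []
        for dest, op, a, b in stmts:
            # Constant propagation: replace operands whose constant value is known.
            a = consts.get(a, a)
            b = consts.get(b, b)
            if op is not None and is_const(a) and is_const(b):
                # Constant folding: evaluate the expression now (only +, -, * here).
                a, b, op = {'+': a + b, '-': a - b, '*': a * b}[op], None, None
            if op is None and is_const(a):
                consts[dest] = a             # dest is now a known constant
            else:
                consts.pop(dest, None)       # dest is no longer a known constant
            out.append((dest, op, a, b))
        return out

    # PI = 3.141; r = PI * 10; d = 2 * r  becomes (up to float rounding)
    # PI = 3.141; r = 31.41;   d = 62.82  -- the partial evaluation above.
    print(prop_and_fold([('PI', None, 3.141, None),
                         ('r', '*', 'PI', 10),
                         ('d', '*', 2, 'r')]))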
Common sub-expression elimination

● Idea:
  – If the program computes the same value multiple times, reuse the value.

    a = b + c;          t = b + c;
    c = b + c;    =>    a = t;
    d = b + c;          c = t;
                        d = b + c;

  – Subexpressions can be reused until their operands are redefined.

Copy propagation

● Idea:
  – After an assignment x = y, replace the uses of x with y.

    x = y;                  x = y;
    if (x > 1)        =>    if (y > 1)
        s = x + f(x);           s = y + f(y);

  – Can only be applied up to another assignment to x, or ... another assignment to y!
  – What if there was an assignment y = z earlier?
    ● Apply transitively to all assignments.
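
The local (basic-block) version of common sub-expression elimination is easy to sketch. The code below reuses the hypothetical (dest, op, arg1, arg2) tuples from the earlier sketch and keeps a table mapping each available expression to the variable that currently holds its value. Unlike the slide, it reuses that variable directly instead of introducing a temporary t, which is safe only because entries are killed as soon as their holder or one of their operands is redefined; commutativity (b + c versus c + b) is deliberately ignored.

    # A sketch of local common sub-expression elimination within one basic block.
    # available maps an expression (op, arg1, arg2) to the variable holding it.

    def local_cse(block):
        available = {}
        out = []
        for dest, op, a, b in block:
            key = (op, a, b)
            if op is not None and key in available:
                # Reuse the previously computed value instead of recomputing it.
                stmt = (dest, None, available[key], None)
            else:
                stmt = (dest, op, a, b)
            out.append(stmt)
            # dest is redefined: kill expressions that use dest as an operand,
            # and expressions whose cached holder is dest itself.
            available = {e: v for e, v in available.items()
                         if v != dest and dest not in (e[1], e[2])}
            # Record the expression computed by this statement (if any).
            if stmt[1] is not None:
                available[(stmt[1], stmt[2], stmt[3])] = dest
        return out

    # a = b + c; c = b + c; d = b + c   becomes   a = b + c; c = a; d = b + c
    # (d = b + c is recomputed because c was redefined in between).
    print(local_cse([('a', '+', 'b', 'c'),
                     ('c', '+', 'b', 'c'),
                     ('d', '+', 'b', 'c')]))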
Dead-code elimination

● Idea:
  – If the result of a computation is never used, remove the computation.

    x = y + 1;
    y = 1;         =>    y = 1;
    x = 2 * z;           x = 2 * z;

  – Remove code that assigns to dead variables.
    ● Liveness analysis done beforehand would help!
  – This may, in turn, create more dead code.
    ● Dead-code elimination usually works transitively.

Unreachable-code elimination

● Idea:
  – Eliminate code that can never be executed.

    #define DEBUG 0
    if (DEBUG)
        print("Current value = ", v);

  – High-level: look for if (false) or while (false)
    ● perhaps after constant folding!
  – Low-level: more difficult
    ● Code is just labels and gotos
    ● Traverse the CFG, marking reachable blocks

Next class

● Next class:
  – How to perform the optimizations that we have seen using a dataflow analysis?
● Starting with:
  – The back-end full form of CFG (a Control-Flow Graph this time, not a Context-Free Grammar)!
● Approximately only 10 more classes left.
  – Hope this course is being successful in making (y)our hectic days a bit more exciting :-)

CS502: Compiler Design - Code Optimization (Cont.), Manas Thakur, Fall 2020

Recall A2

● Consider the program:

    int a;
    if (*) {
        a = 10;
    } else {
        // something that doesn’t touch ‘a’
    }
    x = a;

● Is ‘a’ initialized in this program?
  – Reality during run-time: depends
  – What to tell at compile-time?
    ● Is this a ‘must’ question or a ‘may’ question?
    ● Correct answer: No
  – How do we obtain such answers?
    ● Need to model the control flow

Control-Flow Graph (CFG)

● Nodes represent instructions; edges represent flow of control.

        a = 0
    L1: b = a + 1
        c = c + b
        a = b * 2
        if a < N goto L1
        return c

  – The corresponding CFG has one node per instruction: a = 0 → b = a + 1 → c = c + b → a = b * 2 → a < N, with an edge from a < N back to b = a + 1 and another on to return c.

Some CFG terminology

● Number the nodes of the CFG above:

    1: a = 0
    2: b = a + 1
    3: c = c + b
    4: a = b * 2
    5: a < N        (edges to 2 and to 6)
    6: return c

● pred[n] gives the predecessors of n
  – pred[1]? pred[4]? pred[2]?
● succ[n] gives the successors of n
  – succ[2]? succ[5]?
● def(n) gives the variables defined by n
  – def(3) = {c}
● use(n) gives the variables used by n
  – use(3) = {b, c}

Live ranges revisited

● A variable is live if its current value may be used in the future.
  – Insight:
    ● work from the future to the past,
    ● i.e., backward over the CFG.
● Live ranges (for the CFG above):
  – a: {1->2, 4->5->2}
  – b: {2->3, 3->4}
  – c: all edges except 1->2

Liveness

● A variable v is live on an edge if there is a directed path from that edge to a use of v that does not go through any def of v.
● A variable is live-in at a node if it is live on any of the in-edges of that node.
● A variable is live-out at a node if it is live on any of the out-edges of that node.
● Verify:
  – a: {1->2, 4->5->2}
  – b: {2->3, 3->4}

Computation of liveness

● Say the live-in set of n is in[n], and the live-out set of n is out[n].
● We can compute in[n] and out[n] for any n as follows:

    in[n]  = use[n] ∪ (out[n] – def[n])
    out[n] = ∪_{s ∈ succ[n]} in[s]

  – These are called flow functions; together they form the dataflow equations.

Liveness as an iterative dataflow analysis (IDFA)

    for each n: in[n] = {}; out[n] = {}                       // initialize
    repeat
        for each n:
            in'[n] = in[n]; out'[n] = out[n]                  // save previous values
            in[n]  = use[n] ∪ (out[n] – def[n])               // compute new values
            out[n] = ∪_{s ∈ succ[n]} in[s]
    until in'[n] == in[n] and out'[n] == out[n] ∀ n           // repeat till fixed point

Liveness analysis example

● Apply the two equations to the six-node CFG above, recomputing in[n] and out[n] for every node in each iteration, until no set changes (a fixed point).

In backward order

● If the nodes are processed in backward order (from return c back towards a = 0), the fixed point is reached in only 3 iterations!
● Thus, the order of processing statements is important for efficiency.

Complexity of our liveness computation algorithm

● For an input program of size N:
  – ≤ N nodes in the CFG and ≤ N variables ⇒ ≤ N elements per in/out set ⇒ O(N) time per set union
  – the for loop performs a constant number of set operations per node ⇒ O(N²) time for the for loop
  – each iteration of the for loop can only add elements to each set (monotonicity)
  – the sizes of all in and out sets sum to at most 2N², bounding the number of iterations of the repeat loop ⇒ worst-case complexity of O(N⁴)
  – Much less in practice (usually O(N) or O(N²)) if the nodes are ordered properly.
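
The iterative algorithm above is small enough to sketch directly. The following is a minimal, illustrative implementation (not the A2 framework) for the six-node example CFG from the slides; the succ, use and def tables are transcribed from the slides, while the liveness function name and the dict-of-sets representation are assumptions made for this sketch.

    # Iterative liveness analysis over the six-node example CFG from the slides.
    # 'defs' is used instead of 'def' because def is a Python keyword; N is
    # treated as a constant, as in the slides.

    succ = {1: [2], 2: [3], 3: [4], 4: [5], 5: [2, 6], 6: []}
    use  = {1: set(), 2: {'a'}, 3: {'b', 'c'}, 4: {'b'}, 5: {'a'}, 6: {'c'}}
    defs = {1: {'a'}, 2: {'b'}, 3: {'c'}, 4: {'a'}, 5: set(), 6: set()}

    def liveness(nodes, order=None):
        live_in  = {n: set() for n in nodes}
        live_out = {n: set() for n in nodes}
        iterations = 0
        changed = True
        while changed:                       # repeat till fixed point
            changed = False
            iterations += 1
            for n in (order or nodes):
                # out[n] = union of in[s] over all successors s of n
                new_out = set().union(*(live_in[s] for s in succ[n]))
                # in[n] = use[n] ∪ (out[n] − def[n])
                new_in = use[n] | (new_out - defs[n])
                if new_in != live_in[n] or new_out != live_out[n]:
                    changed = True
                live_in[n], live_out[n] = new_in, new_out
        return live_in, live_out, iterations

    nodes = [1, 2, 3, 4, 5, 6]
    li, lo, its = liveness(nodes, order=list(reversed(nodes)))  # backward order
    print(its, li, lo)   # e.g., the live-in set of node 2 should be {'a', 'c'}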
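
Finally, to close the "analysis + transformation" loop from the first lecture, the liveness results can drive a simple dead-store elimination: an assignment whose target is not live-out at its node can be deleted. This sketch reuses nodes, defs and liveness from the sketch above and assumes every right-hand side is side-effect free, as in the slides' examples.

    # Liveness-driven dead-store elimination: drop a node when the variable it
    # defines is not live-out there (assuming side-effect-free right-hand sides).

    def dead_store_elimination(nodes, defs, live_out):
        kept = []
        for n in nodes:
            defined = defs[n]
            if defined and not (defined & live_out[n]):
                continue                    # result never used afterwards: drop it
            kept.append(n)
        return kept

    live_in, live_out, _ = liveness(nodes, order=list(reversed(nodes)))
    print(dead_store_elimination(nodes, defs, live_out))
    # In the example CFG every definition is live-out, so all six nodes survive.
    # Deleting a node could make further definitions dead, so a real pass would
    # re-run liveness and repeat -- dead-code elimination "works transitively".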