10/28/2013
Optimization
• Next topic: how to generate better code CS 4120 through optimization. Introduction to Compilers • This course covers the most valuable Ross Tate and straightforward optimizations – Cornell University much more to learn! • Other sources: Lecture 25: Introduction to Optimization • Muchnick has 10 chapters of optimization techniques • Cooper and Torczon also cover optimization
CS 4120 Introduction to Compilers 2
How fast can you go? Goal of optimization 10000 direct source code interpreters • Help programmers • clean, modular, high-level source code 1000 tokenized program interpreters (BASIC, Tcl) • but compile to assembly-code performance AST interpreters (CS 3110 RCL, Perl 4) 100 • Optimizations are code transformations bytecode interpreters (Java, Perl 5, OCaml) • can’t change meaning of program to behavior call-threaded interpreters 10 pointer-threaded interpreters (FORTH) not allowed by source simple code generation (PA4, JIT) • Different kinds of optimization: • space optimization: reduce memory use 1 register allocation naive assembly code local optimization expert assembly code global optimization • time optimization: reduce execution time 0.1 • power optimization: reduce power usage
CS 4120 Introduction to Compilers 3 CS 4120 Introduction to Compilers 4
Why do we need optimization? Where to optimize?
• Programmers may write suboptimal code to • Usual goal: improve time performance make it clearer. • But: many optimizations trade off space versus time. • High-level language may make avoiding • E.g.: loop unrolling replaces loop body with N copies. redundant computation inconvenient or • Increasing code space slows program down a little, impossible speeds up one loop • Frequently executed code with long loops: space/time – a[i][j] = a[i][j] + 1 tradeoff is generally a win • Infrequently executed code: optimize code space at • Exploit patterns inexpressible at source level expense of time, saving instruction cache space • Clean up mess from translation stages • Complex optimizations may never pay off! • Architectural independence. • Ideal focus of optimization: program hot spots – Modern architectures make it hard to hand optimize
CS 4120 Introduction to Compilers 5 CS 4120 Introduction to Compilers 6
1 10/28/2013
Safety Writing fast programs in practice
• Possible opportunity for loop-invariant code motion: 1.Pick the right algorithms and data while (b) { z = y/x; // x, y not assigned in loop structures: design for locality and few … operations } • Transformation: invariant code out of loop: 2.Turn on optimization and profile to figure z = y/x; out program hot spots while (b) { Preserves meaning? … 3.Evaluate whether design works; if so… } Faster? • Three aspects of an optimization: 4.Tweak source code until optimizer does • the code transformation “the right thing” to machine code • safety of transformation • understanding optimizers helps! • performance improvement
CS 4120 Introduction to Compilers 7 CS 4120 Introduction to Compilers 8
Structure of an optimization When to apply optimization
• Optimization is a code transformation AST Inlining HIR Specialization • Applied at some stage of compiler Constant folding IR – HIR, MIR, LIR Constant propagation Value numbering • In general requires some analysis: Canonical Dead code elimination • safety analysis to determine where MIR IR Loop-invariant code motion Common sub-expression elimination transformation does not change meaning Strength reduction (e.g. live variable analysis) Abstract Constant folding & propagation Assembly Branch prediction/optimization • cost analysis to determine where it ought to Register allocation speed up code (e.g., which variable to spill) Loop unrolling LIR Assembly Cache optimization Peephole optimizations
CS 4120 Introduction to Compilers 9 CS 4120 Introduction to Compilers 10
Register allocation Constant folding
• Goal: convert abstract assembly (infinite # of registers) • Idea: if operands are known at compile time, into real assembly (6 registers) evaluate at compile time when possible. mov t1, t2 mov rax, rbx int x = (2 + 3)*4*y; ⇒ int x = 5*4*y; add t1, -4(rbp) add rax, -4(rbp) ⇒ int x = 20*y; mov t3, -8(rbp) mov rbx, -8(rbp) • Can perform at every stage of compilation mov t4, t3 cmp t1, t4 cmp rax, rbx • Constant expressions are created by translation and by optimization a[2] ⇒ MEM(MEM(a) + 2*4) • Need to reuse registers aggressively (e.g., rbx) ⇒ MEM(MEM(a) + 8) • Coalesce registers (t3, t4) to eliminate mov’s • May be impossible without spilling some temporaries to stack
CS 4120 Introduction to Compilers 11 CS 4120 Introduction to Compilers 12
2 10/28/2013
Constant folding conditionals Algebraic simplification
• More general form of constant folding: take advantage of if (true) S ⇒ S simplification rules a * 1 ⇒ a b | false ⇒ b if (false) S ⇒ ; a + 0 ⇒ a b & true ⇒ b identities if (true) S else S’ ⇒ S a * 0 ⇒ 0 b & false ⇒ false zeroes (a + 1) + 2 ⇒ a + (1 + 2) ⇒ a+3 reassociation if (false) S else S’ ⇒ S’ a * 4 ⇒ a shl 2 a * 7 ⇒ (a shl 3) − a strength reduction a / 32767 ⇒ a shr 15 + a shr 30 while (false) S ⇒ ; • Must be careful with floating point and with overflow - algebraic while (true) ; ⇒ ; ???? identities may give wrong or less precise answers if (2 > 3) S ⇒ if (false) S ⇒ ; • E.g., (a+b)+c ≠ a+(b+c) in floating point if a,b small
CS 4120 Introduction to Compilers 13 CS 4120 Introduction to Compilers 14
CS 4120 Introduction to Compilers 25 CS 4120 Introduction to Compilers 26
5