Compilers Optimization

Yannis Smaragdakis, U. Athens (original slides by Sam Guyer @ Tufts)

What we already saw
- Lowering: from language-level constructs to machine-level constructs
- At this point we could generate machine code
  - Output of lowering is a correct translation
- What's left to do?
  - Map from the lower-level IR to the actual ISA
  - Maybe some register management (could be required)
  - Pass off to the assembler
- Why have a separate assembler?
  - Handles "packing the bits":

        Assembly:  addi <target>, <source>, <value>
        Machine:   0010 00ss ssst tttt iiii iiii iiii iiii

But first…
- The compiler "understands" the program
  - The IR captures program semantics
  - Lowering is a semantics-preserving transformation
  - Why not do other transformations as well?
- Compiler optimizations
  - Oh great, now my program will be optimal!
  - Sorry, it's a misnomer
  - What is an "optimization"?

Optimizations
- What are they?
  - Code transformations
  - Improve some metric
- Metrics
  - Performance: time, instructions, cycles (are these metrics equivalent?)
  - Memory
    - Memory hierarchy (reduce cache misses)
    - Reduce memory usage
  - Code size
  - Energy

Big picture
- The pipeline: chars → Scanner → tokens → Parser → AST → Semantic checker → Checked AST → IR lowering → tuples → Instruction selection → assembly (∞ registers) → Register allocation → assembly (k registers) → Assembler → Machine code
- Errors are reported from the front-end stages
- Optimizations fit in after semantic checking, around IR lowering and before instruction selection

Why optimize?
- High-level constructs may make some optimizations difficult or impossible:

        A[i][j] = A[i][j-1] + 1

  versus the lowered form:

        t = A + i*row + j
        s = A + i*row + j - 1
        (*t) = (*s) + 1

- High-level code may be more desirable
  - Program at a high level
  - Focus on design; clean, modular implementation
  - Let the compiler worry about the gory details
  - Premature optimization is the root of all evil!

Limitations
- What are optimizers good at?
  - Consistent and thorough
  - Find all opportunities for an optimization
  - Uniformly apply the transformation
- What are they not good at?
  - Asymptotic complexity
  - Compilers can't fix bad algorithms
  - Compilers can't fix bad data structures
  - There's no magic

Requirements
- Safety
  - Preserve the semantics of the program
  - What does that mean?
- Profitability
  - Will it help our metric?
  - Do we need a guarantee of improvement?
- Risk
  - How will it interact with other optimizations?
  - How will it affect other stages of compilation?

Example: loop unrolling

        for (i=0; i<100; i++)        for (i=0; i<25; i++) {
          *t++ = *s++;                 *t++ = *s++;
                                       *t++ = *s++;
                                       *t++ = *s++;
                                       *t++ = *s++;
                                     }

- Safety: always safe; getting the loop conditions right can be tricky (see the sketch below)
- Profitability: depends on hardware – usually a win – why?
- Risk: increases the size of the code in the loop; may not fit in the instruction cache
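To make the "loop conditions" caveat concrete, here is a minimal C sketch (not from the slides) of unrolling a copy loop by 4 when the trip count n is not necessarily a multiple of 4; the function name copy_unrolled and the factor of 4 are illustrative choices:

        #include <stddef.h>

        /* Original loop:  for (i = 0; i < n; i++) t[i] = s[i]; */
        void copy_unrolled(char *t, const char *s, size_t n)
        {
            size_t i = 0;

            /* Main unrolled loop: four copies per iteration, one branch test. */
            for (; i + 4 <= n; i += 4) {
                t[i]     = s[i];
                t[i + 1] = s[i + 1];
                t[i + 2] = s[i + 2];
                t[i + 3] = s[i + 3];
            }

            /* Remainder loop: the last n % 4 elements, easy to get wrong. */
            for (; i < n; i++)
                t[i] = s[i];
        }

The slide's version can drop the remainder loop only because the trip count (100) is known to be divisible by the unroll factor.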
Optimizations
- Many, many optimizations have been invented
  - Constant folding, constant propagation, tail-call elimination, redundancy elimination, dead code elimination, loop-invariant code motion, loop splitting, loop fusion, strength reduction, array scalarization, inlining, cloning, data prefetching, parallelization, etc.
- How do they interact?
  - Optimist: we get the sum of all improvements!
  - Realist: many are in direct opposition

Rough categories
- Traditional optimizations
  - Transform the program to reduce work
  - Don't change the level of abstraction
- Resource allocation
  - Map the program to specific hardware properties
  - Register allocation
  - Instruction scheduling, parallelism
  - Data streaming, prefetching
- Enabling transformations
  - Don't necessarily improve code on their own
  - Inlining, loop unrolling

Constant propagation
- Idea: if the value of a variable is known to be a constant at compile time, replace uses of the variable with the constant

        n = 10;              n = 10;
        c = 2;               c = 2;
        for (i=0;i<n;i++)    for (i=0;i<10;i++)
          s = s + i*c;         s = s + i*2;

- Safety: prove that the value is constant
- Notice: may interact favorably with other optimizations, like loop unrolling – now we know the trip count

Constant folding
- Idea: if the operands are known at compile time, evaluate the expression at compile time

        r = 3.141 * 10;      r = 31.41;

- What do we need to be careful of?
  - Is the result the same as if executed at runtime?
  - Overflow/underflow, rounding and numeric stability
- Often repeated throughout the compiler

        x = A[2];            t1 = 2*4;
                             t2 = A + t1;
                             x = *t2;

Partial evaluation
- Constant propagation and folding together
- Idea: evaluate as much of the program at compile time as possible
- More sophisticated schemes:
  - Simulate data structures, arrays
  - Symbolic execution of the code
- Caveat: floating point
  - Preserving the error characteristics of floating-point values

Algebraic simplification
- Idea: apply the usual algebraic rules to simplify expressions

        a * 1        →  a
        a / 1        →  a
        a * 0        →  0
        a + 0        →  a
        b || false   →  b

- Repeatedly apply to complex expressions
- Many, many possible rules
- Associativity and commutativity come into play

Common sub-expression elimination
- Idea: if the program computes the same expression multiple times, reuse the value

        a = b + c;           t = b + c;
        c = b + c;           a = t;
        d = b + c;           c = t;
                             d = b + c;

- Safety: a subexpression can only be reused until its operands are redefined
- Often occurs in address computations
  - Array indexing and struct/field accesses

Dead code elimination
- Idea: if the result of a computation is never used, then we can remove the computation

        x = y + 1;           y = 1;
        y = 1;               x = 2 * z;
        x = 2 * z;

- Safety
  - A variable is dead if it is never used after it is defined
  - Remove code that assigns to dead variables
  - This may, in turn, create more dead code
  - Dead-code elimination usually works transitively

Copy propagation
- Idea: after an assignment x = y, replace any uses of x with y

        x = y;               x = y;
        if (x > 1)           if (y > 1)
          s = x + f(x);        s = y + f(y);

- Safety:
  - Only apply up to another assignment to x, or
  - …another assignment to y!
  - What if there was an assignment y = z earlier?
- Apply transitively to all assignments

Unreachable code elimination
- Idea: eliminate code that can never be executed

        #define DEBUG 0
        ...
        if (DEBUG)
          print("Current value = ", v);

- Different implementations
  - High-level: look for if (false) or while (false)
  - Low-level: more difficult
    - Code is just labels and gotos
    - Traverse the graph, marking reachable blocks
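How these transformations feed each other is easiest to see on a small worked example. The following C fragment is an illustrative sketch (not from the slides; the function names are made up) tracing constant propagation, copy propagation, constant folding, algebraic simplification, and dead-code elimination applied in sequence:

        /* Original source. */
        int area(void)
        {
            int w = 10;           /* w is a compile-time constant */
            int h = w;            /* h is a copy of w             */
            int pad = w - 10;     /* will fold to 0               */
            return h * w + pad;
        }

        /* After constant propagation (w -> 10) and copy propagation (h -> w):
         *     return 10 * 10 + (10 - 10);
         * after constant folding and algebraic simplification (x + 0 -> x):
         *     return 100;
         * The assignments to w, h, and pad are now dead, so dead-code
         * elimination leaves only:
         */
        int area_optimized(void)
        {
            return 100;
        }

No programmer writes area() this way directly; as the next heading asks, such code typically comes from lowering high-level constructs or from earlier transformations.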
How do these things happen?
- Who would write code with:
  - Dead code
  - Common subexpressions
  - Constant expressions
  - Copies of variables
- Two ways they occur
  - High-level constructs – we've already seen examples
  - Other optimizations
    - Copy propagation often leaves dead code
    - Enabling transformations: inlining, loop unrolling, etc.

Loop optimizations
- Program hot spots are usually in loops
  - Most programs: 90% of execution time is in loops
  - What are possible exceptions? OS kernels, compilers and interpreters
- Loops are a good place to expend extra effort
- Numerous loop optimizations
  - Often expensive – complex analysis
  - For languages like Fortran, very effective
  - What about C?

Loop-invariant code motion
- Idea: if a computation won't change from one loop iteration to the next, move it outside the loop

        for (i=0;i<N;i++)        t1 = x*x;
          A[i] = A[i] + x*x;     for (i=0;i<N;i++)
                                   A[i] = A[i] + t1;

- Safety:
  - Determine when expressions are invariant
  - Just check for variables with no assignments?
- Useful for array address computations
  - Not visible at the source level

Strength reduction
- Idea: replace expensive operations (multiplication, division) with cheaper ones (addition, subtraction, bit shift)
- Traditionally applied to induction variables
  - Variables whose value depends linearly on the loop count
  - Special analysis to find such variables

        for (i=0;i<N;i++) {      v = 0;
          v = 4*i;               for (i=0;i<N;i++) {
          A[v] = ...               A[v] = ...
        }                          v = v + 4;
                                 }

Strength reduction (continued)
- Can also be applied to simple arithmetic operations:

        x * 2      →  x + x
        x * 2^c    →  x << c
        x / 2^c    →  x >> c

- Typical example of premature optimization
  - Programmers use a bit shift instead of multiplication
  - "x << 2" is harder to understand
  - Most compilers will get it right automatically

Inlining
- Idea: replace a function call with the body of the callee
- Safety
  - What about recursion?
- Risk
  - Code size
  - Most compilers use heuristics to decide when
  - Has been cast as a knapsack problem

Inlining
- What are the benefits of inlining?
  - Eliminate call/return overhead
  - Customize the callee code in the context of the caller (see the sketch below)
    - Use actual arguments
    - Push them into the copy of the callee code using constant propagation
    - Apply other optimizations to reduce code
  - Hardware
    - Eliminate the two jumps
    - Keep the pipeline filled
- Critical for OO languages
  - Methods are often small
  - Encapsulation, modularity force code apart

Inlining
- In C:
  - At a call site, decide whether or not to inline
    (often a heuristic about callee/caller size)
  - Look up the callee
  - Replace the call with the body of the callee
- What about Java?

        class A { void M() {…} }
        class B extends A { void M() {…} }

        void foo(A x) { x.M(); }

- What complicates this?
  - Virtual methods
- Even worse?
  - Dynamic class loading
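As a rough illustration of how inlining enables the other optimizations above (a sketch under assumed names, not code from the slides), consider a small C callee inlined at a call site with a constant argument:

        /* Callee: small enough that most inlining heuristics would accept it. */
        static int scale(int x, int factor)
        {
            if (factor == 0)
                return 0;
            return x * factor;
        }

        int caller(int x)
        {
            return scale(x, 2);      /* call/return overhead, two jumps */
        }

        /* After inlining the body at the call site: */
        int caller_inlined(int x)
        {
            int factor = 2;
            if (factor == 0)         /* constant propagation makes this false */
                return 0;
            return x * factor;       /* becomes x * 2                         */
        }

        /* Unreachable-code elimination removes the dead branch and strength
         * reduction rewrites x * 2, leaving: */
        int caller_final(int x)
        {
            return x + x;
        }

For the Java foo above, the compiler can only do this after it determines which M() the virtual call will reach, which is why virtual methods and dynamic class loading complicate inlining.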
