6.035 Lecture 9, Introduction to Program Analysis and Optimization

Spring 2009
Lecture 9: Introduction to Program Analysis and Optimization

Outline
• Introduction
• Basic Blocks
• Common Subexpression Elimination
• Copy Propagation
• Dead Code Elimination
• Algebraic Simplification
• Summary

Program Analysis
• Compile-time reasoning about run-time behavior of program
  – Can discover things that are always true:
    • “x is always 1 in the statement y = x + z”
    • “the pointer p always points into array a”
    • “the statement return 5 can never execute”
  – Can infer things that are likely to be true:
    • “the reference r usually refers to an object of class C”
    • “the statement a = b + c appears to execute more frequently than the statement x = y + z”
  – Distinction between data and control-flow properties

Saman Amarasinghe 6.035 ©MIT Fall 1998

Transformations
• Use analysis results to transform program
• Overall goal: improve some aspect of program
• Traditional goals:
  – Reduce number of executed instructions
  – Reduce overall code size
• Other goals emerge as space becomes more complex
  – Reduce number of cycles
    • Use vector or DSP instructions
    • Improve instruction or data cache hit rate
  – Reduce power consumption
  – Reduce memory usage

Control Flow Graph
• Nodes Represent Computation
  – Each Node is a Basic Block
  – Basic Block is a Sequence of Instructions with
    • No Branches Out Of Middle of Basic Block
    • No Branches Into Middle of Basic Block
    • Basic Blocks should be maximal
  – Execution of basic block starts with first instruction
  – Includes all instructions in basic block
• Edges Represent Control Flow

Control Flow Graph (example)
    int add(n, k) {
      s = 0; a = 4; i = 0;
      if (k == 0) b = 1;
      else b = 2;
      while (i < n) {
        s = s + a*b;
        i = i + 1;
      }
      return s;
    }
[Figure: control flow graph of add. Its basic blocks are:
    s = 0; a = 4; i = 0; k == 0
    b = 1;
    b = 2;
    i < n
    s = s + a*b; i = i + 1;
    return s;]

Basic Block Construction
• Start with instruction control-flow graph
• Visit all edges in graph
• Merge adjacent nodes if
  – Only one edge from first node
  – Only one edge into second node
[Figure sequence (slides ©MIT Fall 2006): step-by-step construction of the basic blocks for the add example, merging nodes of the instruction control-flow graph one edge at a time]
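The merge rule from the Basic Block Construction slide can be sketched in Python. This is a minimal illustration under stated assumptions, not the course's implementation: the function name build_basic_blocks, the dictionary encoding of the graph, and the node numbers 1 to 10 are all hypothetical choices made for this example. The input is the instruction-level control-flow graph of the add example.

```python
def build_basic_blocks(instrs, edges):
    """Merge adjacent nodes of an instruction-level CFG into basic blocks.

    instrs: dict node_id -> instruction text (hypothetical encoding)
    edges:  dict node_id -> list of successor node_ids
    Returns (blocks, succ): blocks maps the id of each block's first
    instruction to its instruction list; succ maps it to successor blocks.
    """
    blocks = {n: [text] for n, text in instrs.items()}
    succ = {n: list(edges.get(n, [])) for n in instrs}
    pred = {n: [] for n in instrs}
    for u, vs in succ.items():
        for v in vs:
            pred[v].append(u)

    changed = True
    while changed:
        changed = False
        for u in list(blocks):
            # Rule 1: only one edge out of the first node.
            if len(succ[u]) != 1:
                continue
            v = succ[u][0]
            # Rule 2: only one edge into the second node.
            if v == u or len(pred[v]) != 1:
                continue
            # Merge v into u: u inherits v's instructions and out-edges.
            blocks[u].extend(blocks.pop(v))
            succ[u] = succ.pop(v)
            del pred[v]
            for w in succ[u]:
                pred[w] = [u if p == v else p for p in pred[w]]
            changed = True
            break  # restart the scan, the graph has changed
    return blocks, succ


# Instruction-level CFG of the add example (node ids are arbitrary).
instrs = {
    1: "s = 0", 2: "a = 4", 3: "i = 0", 4: "k == 0",
    5: "b = 1", 6: "b = 2", 7: "i < n",
    8: "s = s + a*b", 9: "i = i + 1", 10: "return s",
}
edges = {
    1: [2], 2: [3], 3: [4], 4: [5, 6], 5: [7], 6: [7],
    7: [8, 10], 8: [9], 9: [7], 10: [],
}

blocks, block_succs = build_basic_blocks(instrs, edges)
for head, body in blocks.items():
    print(head, body, "->", block_succs[head])
```

Restarting the scan after every merge keeps the sketch short at the cost of efficiency; a production compiler would more likely identify block leaders in a single pass.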
Program Points, Split and Join Points
• One program point before and after each statement in program
• Split point has multiple successors – conditional branch statements are the only split points
• Merge point has multiple predecessors
• Each basic block
  – Either starts with a merge point or its predecessor ends with a split point
  – Either ends with a split point or its successor starts with a merge point

Basic Block Optimizations
(each pair shows the code before and after the optimization)
• Common Sub-Expression Elimination
  – a=(x+y)+z; b=x+y;
  – t=x+y; a=t+z; b=t;
• Constant Propagation
  – x=5; b=x+y;
  – x=5; b=5+y;
• Algebraic Identities
  – a=x*1;
  – a=x;
• Copy Propagation
  – a=x+y; b=a; c=b+z;
  – a=x+y; b=a; c=a+z;
• Dead Code Elimination
  – a=x+y; b=a; b=a+z;
  – a=x+y; b=a+z
• Strength Reduction
  – t=i*4;
  – t=i<<2;

Basic Block Analysis Approach
• Assume normalized basic block - all statements are of the form
  – var = var op var (where op is a binary operator)
  – var = op var (where op is a unary operator)
  – var = var
• Simulate a symbolic execution of basic block
  – Reason about values of variables (or other aspects of computation)
  – Derive property of interest

Two Kinds of Variables
• Temporaries Introduced By Compiler
  – Transfer values only within basic block
  – Introduced as part of instruction flattening
  – Introduced by optimizations/transformations
  – Typically assigned to only once
• Program Variables
  – Declared in original program
  – May be assigned to multiple times
  – May transfer values between basic blocks

Value Numbering
• Reason about values of variables and expressions in the program
  – Simulate execution of basic block
  – Assign virtual value to each variable and expression
• Discovered property: which variables and expressions have the same value
• Standard use:
  – Common subexpression elimination
  – Typically combined with transformation that
    • Saves computed values in temporaries
    • Replaces expressions with temporaries when value of expression previously computed

Value Numbering (example)
    Original Basic Block      New Basic Block
    a = x+y                   a = x+y
                              t1 = a
    b = a+z                   b = a+z
                              t2 = b
    b = b+y                   b = b+y
                              t3 = b
    c = a+z                   c = t2

    Var to Val                Exp to Val        Exp to Tmp
    x   v1                    v1+v2   v3        v1+v2   t1
    y   v2                    v3+v4   v5        v3+v4   t2
    a   v3                    v5+v2   v6        v5+v2   t3
    z   v4
    b   v5, then v6
    c   v5

Value Numbering Summary
• Forward symbolic execution of basic block
• Each new value assigned to temporary
  – a=x+y; becomes a=x+y; t=a;
  – Temporary preserves value for use later in program even if original variable rewritten
    • a=x+y; a=a+z; b=x+y becomes
    • a=x+y; t=a; a=a+z; b=t;
• Maps
  – Var to Val – specifies symbolic value for each variable
  – Exp to Val – specifies value of each evaluated expression
  – Exp to Tmp – specifies tmp that holds value of each evaluated expression

Map Usage
• Var to Val
  – Used to compute symbolic value of y and z when processing statement of form x = y + z
• Exp to Tmp
  – Used to determine which tmp to use if value(y) + value(z) previously computed when processing statement of form x = y + z
• Exp to Val
  – Used to update Var to Val when
    • processing statement of the form x = y + z, and
    • value(y) + value(z) previously computed

Interesting Properties
• Finds common subexpressions even if they use different variables in expressions
  – y=a+b; x=b; z=a+x becomes
  – y=a+b; t=y; x=b; z=t
  – Why? Because computes with symbolic values
• Finds common subexpressions even if variable that originally held the value was overwritten
  – y=a+b; y=1; z=a+b becomes
  – y=a+b; t=y; y=1; z=t
  – Why? Because saves values away in temporaries

One More Interesting Property
• Flattening and CSE combine to capture partial and arbitrarily complex common subexpressions
    w=(a+b)+c; x=b; y=(a+x)+c; z=a+b;
  – After flattening:
    t1=a+b; w=t1+c; x=b; t2=a+x; y=t2+c; z=a+b;
  – CSE algorithm notices that
    • t1+c and t2+c compute same value
    • In the statement z = a+b, a+b has already been computed so generated code can reuse the result
    t1=a+b; w=t1+c; t3=w; x=b; t2=t1; y=t3; z=t1;

Problems I
• Algorithm has a temporary for each new value
  – a=x+y; t1=a;
• Introduces
  – lots of temporaries
  – lots of copy statements to temporaries
• In many cases, temporaries and copy statements are unnecessary
• So we eliminate them with copy propagation and dead code elimination

Problems II
• Expressions have to be identical –
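Returning to the value numbering slides, the three maps (Var to Val, Exp to Val, Exp to Tmp) can be exercised with a small Python sketch. It is an illustration under stated assumptions rather than the lecture's code: the function name value_number, the tuple encoding of statements, and the omission of unary operators are choices made for this example, and generated temporaries t1, t2, ... are assumed not to clash with program variables.

```python
from itertools import count


def value_number(block):
    """Local value numbering with CSE over a normalized basic block.

    block: list of statements, each either (dst, src) for a copy or
           (dst, lhs, op, rhs) for a binary operation (assumed encoding).
    Returns (new_block, var_to_val), with new_block a list of strings.
    """
    var_to_val = {}   # Var to Val: symbolic value of each variable
    exp_to_val = {}   # Exp to Val: value of each evaluated expression
    exp_to_tmp = {}   # Exp to Tmp: temporary holding that value
    fresh_val = count(1)
    fresh_tmp = count(1)

    def val(var):
        # A variable seen for the first time gets a fresh symbolic value.
        if var not in var_to_val:
            var_to_val[var] = f"v{next(fresh_val)}"
        return var_to_val[var]

    out = []
    for stmt in block:
        if len(stmt) == 2:                      # copy: dst = src
            dst, src = stmt
            var_to_val[dst] = val(src)
            out.append(f"{dst} = {src}")
            continue
        dst, lhs, op, rhs = stmt                # dst = lhs op rhs
        key = (val(lhs), op, val(rhs))
        if key in exp_to_val:
            # Value already computed: reuse the saved temporary.
            var_to_val[dst] = exp_to_val[key]
            out.append(f"{dst} = {exp_to_tmp[key]}")
        else:
            # New value: record it and save it in a fresh temporary.
            v = f"v{next(fresh_val)}"
            t = f"t{next(fresh_tmp)}"
            exp_to_val[key] = v
            exp_to_tmp[key] = t
            var_to_val[dst] = v
            out.append(f"{dst} = {lhs}{op}{rhs}")
            out.append(f"{t} = {dst}")
    return out, var_to_val


# The basic block from the value numbering example.
original = [
    ("a", "x", "+", "y"),
    ("b", "a", "+", "z"),
    ("b", "b", "+", "y"),
    ("c", "a", "+", "z"),
]
new_block, var_to_val = value_number(original)
print("\n".join(new_block))

# The "different variables" case: y=a+b; x=b; z=a+x
second_block, _ = value_number([("y", "a", "+", "b"), ("x", "b"), ("z", "a", "+", "x")])
print("; ".join(second_block))
```

Like the algorithm as described on the slides, this sketch only matches expressions whose operand values appear in the same order, so a+b and b+a are treated as different expressions.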
