Control Flow Graphs

Optimizations
• Code transformations to improve program
  – Mainly: improve execution time
  – Also: reduce program size
• Can be done at high level or low level
  – E.g., constant folding
• Optimizations must be safe
  – Execution of transformed code must yield same results as the original code for all possible executions


Optimization Safety
• Safety of code transformations usually requires certain information that may not be explicit in the code
• Example: dead code elimination
    (1) x = y + 1;
    (2) y = 2 * z;
    (3) x = y + z;
    (4) z = 1;
    (5) z = x;
• What statements are dead and can be removed?
• Need to know whether the value assigned to x at (1) is never used later (i.e., x is dead at statement (1))
  – Obvious for this simple example (with no control flow)
  – Not obvious for complex flow of control
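For straight-line code like the five statements above, the question can be answered with one backward scan. The following is a minimal sketch, not the course's algorithm: the function name `dead_assignments`, the `(target, reads)` encoding, and the assumption that x, y and z are all needed after statement (5) are illustrative choices.

```python
def dead_assignments(stmts, live_at_end):
    """Return 1-based indices of assignments whose value is never used.

    stmts       -- list of (target, [variables read]) in program order
    live_at_end -- variables assumed to be needed after the last statement
    """
    live = set(live_at_end)
    dead = []
    # Scan backwards: a variable is live at a point if some later
    # statement reads it before any later statement overwrites it.
    for index in range(len(stmts), 0, -1):
        target, reads = stmts[index - 1]
        if target in live:
            live.discard(target)   # this assignment supplies the later use
            live.update(reads)     # so its operands are needed before it
        else:
            dead.append(index)     # value is overwritten or never read
    return sorted(dead)


# The five statements from the example (hypothetical encoding).
example = [
    ("x", ["y"]),        # (1) x = y + 1
    ("y", ["z"]),        # (2) y = 2 * z
    ("x", ["y", "z"]),   # (3) x = y + z
    ("z", []),           # (4) z = 1
    ("z", ["x"]),        # (5) z = x
]

dead = dead_assignments(example, live_at_end={"x", "y", "z"})
print(dead)  # prints [1, 4]
```

Statements (1) and (4) are reported because x and z are overwritten at (3) and (5) before being read. The scan only works without branches or loops.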


Dead Variable Example
• Add control flow to example:
    x = y + 1;
    y = 2 * z;
    if (d) x = y+z;
    z = 1;
    z = x;
• Is ‘x = y+1’ dead code? Is ‘z = 1’ dead code?
• Statement ‘x = y+1’ is not dead code!
• On some executions (when d is false), its value is used later


Dead Variable Example
• Add more control flow:
    while (c) {
        x = y + 1;
        y = 2 * z;
        if (d) x = y+z;
        z = 1;
    }
    z = x;
• Is ‘x = y+1’ dead code? Is ‘z = 1’ dead code?
• Statement ‘x = y+1’ not dead (as before)
• Statement ‘z = 1’ not dead either!
• On some executions, value from ‘z = 1’ is used later (by ‘y = 2 * z’ in the next loop iteration)


Low-level Code
• Harder to eliminate dead code in low-level code:
    label L1
    fjump c L2
    x = y + 1;
    y = 2 * z;
    fjump d L3
    x = y+z;
    label L3
    z = 1;
    jump L1
    label L2
    z = x;
• Are these statements dead?


Optimizations and Control Flow
• Application of optimizations requires information
  – Dead code elimination: need to know if variables are dead when assigned values
• Required information:
  – Not explicit in the program
  – Must compute it statically (at compile-time)
  – Must characterize all dynamic (run-time) executions
• Control flow makes it hard to extract information
  – Branches and loops in the program
  – Different executions = different branches taken, different number of loop iterations executed

Control Flow Graphs
• Control Flow Graph (CFG) = graph representation of computation and control flow in the program
  – Framework for static analysis of program control-flow
• Nodes are basic blocks = straight-line, single-entry code, no branching except at end of sequence
• Edges represent possible flow of control from the end of one block to the beginning of the other
  – There may be multiple incoming/outgoing edges for each block
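One way such information might be computed statically is a standard backward liveness analysis iterated to a fixed point over the graph. The sketch below is illustrative, not taken from the slides: the one-statement-per-node encoding `(defined variable, used variables, successors)`, the function name `dead_assignments`, and the assumption that only z is needed after the program ends are all hypothetical.

```python
def dead_assignments(cfg, live_at_exit):
    """cfg maps node name -> (defined var or None, used vars, successor names).

    Returns the nodes whose assigned variable is not live afterwards
    on any path, i.e. candidates for dead code elimination.
    """
    live_in = {n: set() for n in cfg}
    live_out = {n: set() for n in cfg}
    changed = True
    while changed:                      # iterate until nothing changes
        changed = False
        for node, (defined, used, succs) in cfg.items():
            # live_out = union of live_in over all successors
            out = set(live_at_exit) if not succs else set()
            for s in succs:
                out |= live_in[s]
            # live_in = used here, or live afterwards and not overwritten
            new_in = set(used) | (out - {defined})
            if out != live_out[node] or new_in != live_in[node]:
                live_out[node], live_in[node] = out, new_in
                changed = True
    return sorted(n for n, (defined, _, _) in cfg.items()
                  if defined is not None and defined not in live_out[n])


# The example with the if only (hypothetical encoding).
branch_cfg = {
    "x = y+1": ("x", {"y"}, ["y = 2*z"]),
    "y = 2*z": ("y", {"z"}, ["if (d)"]),
    "if (d)":  (None, {"d"}, ["x = y+z", "z = 1"]),
    "x = y+z": ("x", {"y", "z"}, ["z = 1"]),
    "z = 1":   ("z", set(), ["z = x"]),
    "z = x":   ("z", {"x"}, []),
}

# The same statements inside while (c) { ... }.
loop_cfg = {
    "while (c)": (None, {"c"}, ["x = y+1", "z = x"]),
    "x = y+1":   ("x", {"y"}, ["y = 2*z"]),
    "y = 2*z":   ("y", {"z"}, ["if (d)"]),
    "if (d)":    (None, {"d"}, ["x = y+z", "z = 1"]),
    "x = y+z":   ("x", {"y", "z"}, ["z = 1"]),
    "z = 1":     ("z", set(), ["while (c)"]),
    "z = x":     ("z", {"x"}, []),
}

print(dead_assignments(branch_cfg, {"z"}))  # prints ['z = 1']
print(dead_assignments(loop_cfg, {"z"}))    # prints []
```

With only the if, ‘z = 1’ is reported because it is always overwritten by ‘z = x’. Inside the loop nothing is reported, since the value from ‘z = 1’ can reach ‘y = 2*z’ on the next iteration.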


CFG Example
• Program:
    x = z-2;
    y = 2*z;
    if (c) {
        x = x+1;
        y = y+1;
    }
    else {
        x = x-1;
        y = y-1;
    }
    z = x+y;
• Control Flow Graph:
    B1: x = z-2; y = 2*z; if (c)    (T -> B2, F -> B3)
    B2: x = x+1; y = y+1;           (-> B4)
    B3: x = x-1; y = y-1;           (-> B4)
    B4: z = x+y;

Basic Blocks
• Basic block = sequence of consecutive statements such that:
  – Control enters only at beginning of sequence
  – Control leaves only at end of sequence
    incoming control -> [ a = a+1; b = c*a; switch(b) ] -> outgoing control
• No branching in or out in the middle of basic blocks


Computation and Control Flow
• Basic Blocks = Nodes in the graph = computation in the program
• Edges in the graph = control flow in the program

Multiple Program Executions
• CFG models all program executions
• Possible execution = path in the graph
• Multiple paths = multiple possible program executions

Control Flow Graph (shown on both slides):
    B1: x = z-2; y = 2*z; if (c)    (T -> B2, F -> B3)
    B2: x = x+1; y = y+1;           (-> B4)
    B3: x = x-1; y = y-1;           (-> B4)
    B4: z = x+y;


Execution 1
• CFG models all program executions
• Possible execution = path in the graph
• Execution 1:
  – c is true
  – Program executes basic blocks B1, B2, B4

Execution 2
• Execution 2:
  – c is false
  – Program executes basic blocks B1, B3, B4
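The correspondence between executions and paths can be illustrated by enumerating every path from B1 to B4. This is a small sketch under stated assumptions: the adjacency-list encoding and the helper name `paths` are hypothetical, and the recursion only terminates because this particular graph has no loops.

```python
# Successor lists for the example CFG (assumed encoding).
cfg = {
    "B1": ["B2", "B3"],   # if (c): T -> B2, F -> B3
    "B2": ["B4"],
    "B3": ["B4"],
    "B4": [],
}


def paths(cfg, node, exit_node):
    """Return all paths from node to exit_node in an acyclic graph."""
    if node == exit_node:
        return [[node]]
    result = []
    for succ in cfg[node]:
        for rest in paths(cfg, succ, exit_node):
            result.append([node] + rest)
    return result


all_paths = paths(cfg, "B1", "B4")
print(all_paths)  # prints [['B1', 'B2', 'B4'], ['B1', 'B3', 'B4']]
```

The two paths are exactly Execution 1 and Execution 2. A CFG with a loop has unboundedly many paths, which is why analyses summarize paths rather than list them.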


Infeasible Executions
• CFG models all program executions, and then some
• Possible execution = path in the graph
• Execution 2:
  – c is false and true (?!)
  – Program executes basic blocks B1, B3, B4 and the T successor of B4
• Control Flow Graph:
    B1: x = z-2; y = 2*z; if (c)    (F -> B3)
    B3: x = x-1; y = y-1;           (-> B4)
    B4: z = x+y; if (c)             (T -> successor of B4)

Edges Going Out
• Multiple outgoing edges
• Basic block executed next may be one of the successor basic blocks
• Each outgoing edge = outgoing flow of control in some execution of the program
    Basic Block -> outgoing edges

Edges Coming In
• Multiple incoming edges
• Control may come from any of the predecessor basic blocks
• Each incoming edge = incoming flow of control in some execution of the program
    incoming edges -> Basic Block

Building the CFG
• Can construct CFG for either high-level IR or the low-level IR of the program
• Build CFG for high-level IR
  – Construct CFG for each high-level IR node
• Build CFG for low-level IR
  – Analyze jump and label statements


CFG for High-level IR
• CFG(S) = flow graph of high-level statement S
• CFG(S) is single-entry, single-exit graph:
  – one entry node (basic block)
  – one exit node (basic block)
    CFG(S) = Entry -> … -> Exit
• Recursively define CFG(S)

CFG for Block Statement
• CFG( S1; S2; …; SN ) =
    CFG(S1) -> CFG(S2) -> … -> CFG(SN)


CFG for If-then-else Statement
• CFG( if (E) S1 else S2 ) =
    if (E):  T -> CFG(S1),  F -> CFG(S2)
    CFG(S1), CFG(S2) -> empty basic block (exit)

CFG for If-then Statement
• CFG( if (E) S ) =
    if (E):  T -> CFG(S),  F -> exit
    CFG(S) -> exit


CFG for While Statement
• CFG for: while (e) S
    if (e):  T -> CFG(S),  F -> exit
    CFG(S) -> if (e)

Recursive CFG Construction
• Nested statements: recursively construct CFG while traversing IR nodes
• Example:
    while (c) {
        x = y + 1;
        y = 2 * z;
        if (d) x = y+z;
        z = 1;
    }
    z = x;


Recursive CFG Construction
• Nested statements: recursively construct CFG while traversing IR nodes
• Step 1, block statement for the while example:
    CFG(while) -> CFG(z=x)
• Step 2, expand the while statement:
    if (c):  T -> CFG(body),  F -> z = x
    CFG(body) -> if (c)


Recursive CFG Construction
• Nested statements: recursively construct CFG while traversing IR nodes
• Step 3, expand the loop body:
    if (c):  T -> x = y+1,  F -> z = x
    x = y+1 -> y = 2*z -> CFG(if) -> z = 1 -> if (c)

Recursive CFG Construction
• Simple algorithm to build CFG
• Generated CFG
  – Each basic block has a single statement
  – There are empty basic blocks
• Small basic blocks = inefficient
  – Small blocks = many nodes in CFG
  – Compiler uses CFG to perform optimization
  – Many nodes in CFG = compiler optimizations will be time- and space-consuming
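The recursive definition can be sketched as a function that returns the (entry, exit) pair of CFG(S) for each statement kind. Everything about the encoding here is an assumption made for illustration: the tuple-based IR, the `CFG` class, the `build` function, and the choice to give the while statement an empty exit block so that it stays single-exit.

```python
class CFG:
    def __init__(self):
        self.label = []   # label[i] = text of block i ("" for an empty block)
        self.succ = []    # succ[i]  = list of successor block ids

    def block(self, text=""):
        self.label.append(text)
        self.succ.append([])
        return len(self.label) - 1

    def edge(self, src, dst):
        self.succ[src].append(dst)


def build(g, stmt):
    """Add CFG(stmt) to g and return its (entry, exit) block ids."""
    kind = stmt[0]
    if kind == "assign":                   # one statement = one block
        b = g.block(stmt[1])
        return b, b
    if kind == "seq":                      # CFG(S1) -> CFG(S2) -> ... (non-empty)
        entry, exit_ = build(g, stmt[1][0])
        for s in stmt[1][1:]:
            e, x = build(g, s)
            g.edge(exit_, e)
            exit_ = x
        return entry, exit_
    if kind == "if":
        _, cond, then_s, else_s = stmt
        test = g.block(f"if ({cond})")
        join = g.block()                   # empty block acts as single exit
        t_entry, t_exit = build(g, then_s)
        g.edge(test, t_entry)              # T edge
        g.edge(t_exit, join)
        if else_s is None:
            g.edge(test, join)             # F edge of if-then
        else:
            f_entry, f_exit = build(g, else_s)
            g.edge(test, f_entry)          # F edge of if-then-else
            g.edge(f_exit, join)
        return test, join
    if kind == "while":
        _, cond, body = stmt
        test = g.block(f"if ({cond})")
        after = g.block()                  # empty exit block (assumption)
        b_entry, b_exit = build(g, body)
        g.edge(test, b_entry)              # T edge into the body
        g.edge(b_exit, test)               # back edge
        g.edge(test, after)                # F edge leaves the loop
        return test, after
    raise ValueError(f"unknown statement kind: {kind}")


program = ("seq", [
    ("while", "c", ("seq", [
        ("assign", "x = y+1"),
        ("assign", "y = 2*z"),
        ("if", "d", ("assign", "x = y+z"), None),
        ("assign", "z = 1"),
    ])),
    ("assign", "z = x"),
])

g = CFG()
entry, exit_ = build(g, program)
for i, text in enumerate(g.label):
    print(i, repr(text), "->", g.succ[i])
```

For the while example this yields nine blocks, two of them empty, each holding at most one statement. That is the inefficiency described above.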


Efficient CFG Construction
• Basic blocks in CFG:
  – As few as possible
  – As large as possible
• There should be no pair of basic blocks (B1,B2) such that:
  – B2 is a successor of B1
  – B1 has one outgoing edge
  – B2 has one incoming edge
• There should be no empty basic blocks

Example
• Program:
    while (c) {
        x = y + 1;
        y = 2 * z;
        if (d) x = y+z;
        z = 1;
    }
    z = x;
• Efficient CFG:
    [ if (c) ]                      (T -> next block, F -> [ z = x ])
    [ x = y+1; y = 2*z; if (d) ]    (T -> [ x = y+z ], F -> [ z = 1 ])
    [ x = y+z ]                     (-> [ z = 1 ])
    [ z = 1 ]                       (-> [ if (c) ])
    [ z = x ]
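The two conditions can be enforced by a clean-up pass over a one-statement-per-block CFG. The sketch below is one possible way to do it, not the slides' procedure: the dictionary encoding `id -> (statements, successors)`, the hand-written input graph, and the function names are hypothetical, and the entry block is treated as having an extra incoming edge from program start.

```python
def predecessors(blocks):
    preds = {b: [] for b in blocks}
    for b, (_, succs) in blocks.items():
        for s in succs:
            preds[s].append(b)
    return preds


def simplify(blocks, entry):
    """Remove empty blocks and merge (B1, B2) pairs until none remain."""
    blocks = {b: (list(st), list(su)) for b, (st, su) in blocks.items()}
    changed = True
    while changed:
        changed = False
        preds = predecessors(blocks)
        for b in list(blocks):
            stmts, succs = blocks[b]
            # Rule 1: bypass an empty block that has exactly one successor.
            if not stmts and len(succs) == 1 and succs[0] != b and b != entry:
                target = succs[0]
                for p in preds[b]:
                    p_stmts, p_succs = blocks[p]
                    blocks[p] = (p_stmts,
                                 [target if s == b else s for s in p_succs])
                del blocks[b]
                changed = True
                break
            # Rule 2: B1 has one outgoing edge, B2 has one incoming edge.
            if len(succs) == 1:
                s = succs[0]
                if s != b and s != entry and len(preds[s]) == 1:
                    s_stmts, s_succs = blocks[s]
                    blocks[b] = (stmts + s_stmts, s_succs)
                    del blocks[s]
                    changed = True
                    break
    return blocks


# One-statement-per-block CFG for the while example, with two empty
# blocks, written out by hand (assumed encoding).
naive = {
    0: (["if (c)"], [2, 1]),
    1: ([], [8]),
    2: (["x = y+1"], [3]),
    3: (["y = 2*z"], [4]),
    4: (["if (d)"], [6, 5]),
    5: ([], [7]),
    6: (["x = y+z"], [5]),
    7: (["z = 1"], [0]),
    8: (["z = x"], []),
}

efficient = simplify(naive, entry=0)
for block_id, (stmts, succs) in efficient.items():
    print(block_id, stmts, "->", succs)
```

The result has five blocks, matching the efficient CFG in the example. The sketch does not deduplicate edges if bypassing an empty block makes both branches of a test point at the same successor.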


CFG for Low-level IR
• Identify pre-basic blocks as sequences of:
  – Non-branching instructions
  – Non-label instructions
• No branch (jump) instructions = control doesn’t flow out of basic blocks
• No label instructions = control doesn’t flow into blocks

CFG for Low-level IR
• Basic blocks start:
  – At label instructions
  – After jump instructions
• Basic blocks end:
  – At jump instructions
  – Before label instructions
• Low-level code:
    label L1
    fjump c L2
    x = y + 1;
    y = 2 * z;
    fjump d L3
    x = y+z;
    label L3
    z = 1;
    jump L1
    label L2
    z = x;
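The start and end rules translate directly into a single pass over the instruction list. This is a minimal sketch assuming instructions are plain strings in the `label`/`jump`/`fjump` syntax of the example. The helper names are hypothetical, and keeping each label and jump inside its block (rather than treating them purely as delimiters) is a design choice of this sketch.

```python
def is_label(instr):
    return instr.startswith("label ")


def is_jump(instr):
    return instr.startswith(("jump ", "fjump "))


def split_blocks(instrs):
    """Partition a list of low-level instructions into basic blocks."""
    blocks, current = [], []
    for instr in instrs:
        if is_label(instr) and current:
            blocks.append(current)    # a block ends before a label
            current = []
        current.append(instr)
        if is_jump(instr):
            blocks.append(current)    # a block ends at a jump
            current = []
    if current:
        blocks.append(current)
    return blocks


code = [
    "label L1",
    "fjump c L2",
    "x = y + 1",
    "y = 2 * z",
    "fjump d L3",
    "x = y + z",
    "label L3",
    "z = 1",
    "jump L1",
    "label L2",
    "z = x",
]

blocks = split_blocks(code)
for block in blocks:
    print(block)
```

The example code splits into five blocks, one per node of the efficient CFG for the same loop.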


CFG for Low-level IR
• Conditional jump: 2 successors
• Unconditional jump: 1 successor
• CFG for the low-level code:
    [ if (c) ]                      (-> [ x = y+1; ... ], [ z = x ])
    [ x = y+1; y = 2*z; if (d) ]    (-> [ x = y+z ], [ z = 1 ])
    [ x = y+z ]                     (-> [ z = 1 ])
    [ z = 1 ]                       (-> [ if (c) ])
    [ z = x ]
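Once the code is split into blocks, the successor rule can be applied to the last instruction of each block. The sketch below makes several assumptions: blocks are given as lists of instruction strings, `fjump c L` is read as "jump to L if c is false, otherwise fall through", a block without a final jump falls through to the next one, and the function name `successors` is hypothetical.

```python
def successors(blocks):
    """Return succ[i] = list of block indices that may follow block i."""
    # Map each label name to the index of the block that starts with it.
    label_to_block = {}
    for i, block in enumerate(blocks):
        first = block[0].split()
        if first[0] == "label":
            label_to_block[first[1]] = i

    succ = []
    for i, block in enumerate(blocks):
        last = block[-1].split()
        fall_through = [i + 1] if i + 1 < len(blocks) else []
        if last[0] == "jump":             # unconditional: 1 successor
            succ.append([label_to_block[last[1]]])
        elif last[0] == "fjump":          # conditional: 2 successors
            succ.append(fall_through + [label_to_block[last[2]]])
        else:                             # no jump: falls into next block
            succ.append(fall_through)
    return succ


# Basic blocks of the example code, written out by hand.
blocks = [
    ["label L1", "fjump c L2"],
    ["x = y + 1", "y = 2 * z", "fjump d L3"],
    ["x = y + z"],
    ["label L3", "z = 1", "jump L1"],
    ["label L2", "z = x"],
]

edges = successors(blocks)
print(edges)  # prints [[1, 4], [2, 3], [3], [0], []]
```

Blocks 0 and 1 end in conditional jumps and get two successors each, block 3 ends in an unconditional jump and gets one, and block 2 simply falls through.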
