CIS 341: COMPILERS Announcements

Lecture 24
CIS 341: COMPILERS

Announcements
• HW6: Analysis & Optimizations
  – Alias analysis, constant propagation, dead code elimination, register allocation
  – Due: TOMORROW at Midnight
• Final Exam:
  – Date: Friday, May 4th
  – Time: noon – 2:00pm
  – Location: Moore 216
  – Coverage: cumulative, but emphasis on material since the midterm

Zdancewic CIS 341: Compilers

REVISITING SSA
Phi nodes
Alloca "promotion"
Register allocation

Dominator Trees
• Domination is transitive:
  – if A dominates B and B dominates C then A dominates C
• Domination is anti-symmetric:
  – if A dominates B and B dominates A then A = B
• Every flow graph has a dominator tree
  – The Hasse diagram of the dominates relation

[Figure: a flow graph with nodes 1–9 and 0, shown next to its dominator tree]

Phi Functions
• Solution: φ functions
  – Fictitious operator, used only for analysis
    • implemented by Mov at x86 level
  – Chooses among different versions of a variable based on the path by which control enters the phi node.
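The path-dependent choice made by a φ function can be sketched in a few lines of Python. This is only an illustration, not course code: the names `eval_phi` and `phi_to_moves`, the `(value, predecessor label)` pair representation, and the printed `mov` text are all assumptions made for the example. The second function shows one common way a φ could be lowered to moves, by placing one copy at the end of each predecessor block.

```python
from typing import Dict, List, Tuple, Union

Value = Union[int, str]             # a constant or an SSA name such as "%x2"
Incoming = List[Tuple[Value, str]]  # (value, predecessor label) pairs


def eval_phi(incoming: Incoming, pred_label: str) -> Value:
    """Pick the phi operand that belongs to the edge control arrived on."""
    for value, label in incoming:
        if label == pred_label:
            return value
    raise KeyError(f"phi has no operand for predecessor {pred_label!r}")


def phi_to_moves(dest: str, incoming: Incoming) -> Dict[str, str]:
    """Map each predecessor label to the copy it would run before branching.

    The strings are illustrative pseudo-assembly, not real x86 syntax.
    """
    return {label: f"mov {value}, {dest}" for value, label in incoming}


# Hypothetical phi with two incoming edges.
x4 = [("%x2", "%then"), ("%x3", "%else")]
print(eval_phi(x4, "%else"))
print(phi_to_moves("%x4", x4))
```

Representing the operands as a list of pairs keeps the sketch close to how a φ is written down; a dictionary keyed by label would work equally well as long as each predecessor appears once.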
    %uid = phi <ty> v1, <label1>, … , vn, <labeln>

Source program:

    int y = …
    int x = …
    int z = …
    if (p) {
      x = y + 1;
    } else {
      x = y * 2;
    }
    z = x + 3;

Corresponding IR:

    entry:
      %y1 = …
      %x1 = …
      %z1 = …
      %p = icmp …
      br i1 %p, label %then, label %else
    then:
      %x2 = add i64 %y1, 1
      br label %merge
    else:
      %x3 = mul i64 %y1, 2
      br label %merge
    merge:
      %x4 = phi i64 %x2, %then, %x3, %else
      %z2 = add i64 %x4, 3

Alloca Promotion
• Not all source variables can be allocated to registers
  – If the address of the variable is taken (as permitted in C, for example)
  – If the address of the variable "escapes" (by being passed to a function)
• An alloca instruction is called promotable if neither of the two conditions above holds

    entry:
      %x = alloca i64                   // %x cannot be promoted
      %y = call malloc(i64 8)
      %ptr = bitcast i8* %y to i64**
      store i64** %ptr, %x              // store the pointer into the heap

    entry:
      %x = alloca i64                   // %x cannot be promoted
      %y = call foo(i64* %x)            // foo may store the pointer into the heap

• Happily, most local variables declared in source programs are promotable
  – That means they can be register allocated

Phi Placement Alternative
• Less efficient, but easier to understand:
• Place phi nodes "maximally" (i.e. at every node with ≥ 2 predecessors)
• If all values flowing into phi node are the same, then eliminate it:

    %x = phi t %y, %pred1  t %y, %pred2  …  t %y, %predK
    // code that uses %x
  ⇒
    // code with %x replaced by %y

• Interleave with other optimizations
  – copy propagation
  – constant propagation
  – etc.

Example SSA Optimizations

Pipeline stages: Find alloca → max φs → LAS/LAA → DSE → DAE → elim φs

• How to place phi nodes without breaking SSA?
• Note: the "real" implementation combines many of these steps into one pass.
  – Places phis directly at the dominance frontier
• This example also illustrates other common optimizations:
  – Load after store/alloca
  – Dead store/alloca elimination

Starting code:

    l1: %p = alloca i64
        store 0, %p
        %b = %y > 0
        br %b, %l2, %l3
    l2: store 1, %p
        br %l3
    l3: %x = load %p
        ret %x

• Insert
  – Loads at the end of each block
  – φ-nodes at each block
  – Stores after φ-nodes

After inserting loads, φ-nodes, and stores:

    l1: %p = alloca i64
        store 0, %p
        %b = %y > 0
        %x1 = load %p
        br %b, %l2, %l3
    l2: %x3 = φ[%x1:%l1]
        store %x3, %p
        store 1, %p
        %x2 = load %p
        br %l3
    l3: %x4 = φ[%x1:%l1, %x2:%l2]
        store %x4, %p
        %x = load %p
        ret %x

• For loads after stores (LAS):
  – Substitute all uses of the load by the value being stored
  – Remove the load

The slides apply this once per load: %x1 follows store 0, so its uses become 0; %x2 follows store 1, so its use becomes 1; %x follows store %x4, so ret %x becomes ret %x4.

After LAS:

    l1: %p = alloca i64
        store 0, %p
        %b = %y > 0
        br %b, %l2, %l3
    l2: %x3 = φ[0:%l1]
        store %x3, %p
        store 1, %p
        br %l3
    l3: %x4 = φ[0:%l1, 1:%l2]
        store %x4, %p
        ret %x4

• Dead Store Elimination (DSE)
  – Eliminate all stores with no subsequent loads.
• Dead Alloca Elimination (DAE)
  – Eliminate all allocas with no subsequent loads/stores.

After DSE and DAE:

    l1: %b = %y > 0
        br %b, %l2, %l3
    l2: %x3 = φ[0:%l1]
        br %l3
    l3: %x4 = φ[0:%l1, 1:%l2]
        ret %x4

• Eliminate φ nodes:
  – Singletons
  – With identical values from each predecessor
  – See Aycock & Horspool, 2002

• Done!

    l1: %b = %y > 0
        br %b, %l2, %l3
    l2: br %l3
    l3: %x4 = φ[0:%l1, 1:%l2]
        ret %x4

LLVM Phi Placement
• This transformation is also sometimes called register promotion
  – older versions of LLVM called this "mem2reg" memory to register promotion
• In practice, LLVM combines this transformation with scalar replacement of aggregates (SROA)
  – i.e.
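The φ-elimination step from the example (drop singleton φs and φs whose incoming values are all identical) can be sketched in Python as a small fixpoint loop. This is a simplified illustration under assumptions, not LLVM's implementation and not the full algorithm from the cited paper: `eliminate_trivial_phis` is a hypothetical name, φ nodes are modeled as a dictionary from destination to `(value, label)` pairs, and the function only reports the substitution to apply instead of rewriting a whole program.

```python
def eliminate_trivial_phis(phis):
    """Remove phis whose incoming values are all the same.

    phis maps a destination name to a list of (value, predecessor label)
    pairs. Returns (remaining_phis, substitution), where substitution maps
    each removed destination to the value that replaces it.
    """
    phis = {dest: list(incoming) for dest, incoming in phis.items()}
    subst = {}
    changed = True
    while changed:
        changed = False
        for dest in list(phis):
            incoming = phis.get(dest)
            if incoming is None:
                continue
            values = {value for value, _ in incoming}
            if len(values) != 1:
                continue
            (value,) = values
            if value == dest:
                continue  # a phi that only refers to itself is left alone here
            del phis[dest]
            subst[dest] = value
            # Replace uses of the removed phi in the remaining phis.
            for other in phis:
                phis[other] = [(value if v == dest else v, label)
                               for v, label in phis[other]]
            # Keep earlier substitutions pointing at the final value.
            for key, old in subst.items():
                if old == dest:
                    subst[key] = value
            changed = True
    return phis, subst


# The phis left after DSE/DAE in the example.
example = {"%x3": [(0, "%l1")], "%x4": [(0, "%l1"), (1, "%l2")]}
remaining, replaced = eliminate_trivial_phis(example)
print(remaining)
print(replaced)
```

The loop repeats because removing one φ can make another one trivial, which is also why the slides suggest interleaving this step with copy and constant propagation.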
