Implementation of High Level Languages

Advanced Compiler Techniques 2005
Erik Stenman
EPFL

Overview

♦ In this second part of the course we will talk about how to implement:
  ♦ Objects and inheritance.
  ♦ FPLs: higher order functions, laziness.
  ♦ Concurrency: processes, message passing.
  ♦ Automatic memory management. (GC)
  ♦ Virtual Machines.
  ♦ Just in time compilation.

Advanced Compiler Techniques 5/6/2005 2 http://lamp.epfl.ch/teaching/advancedCompiler/

Implementation of High Level Languages

♦ We will look at some simple ways to implement concepts in HLL.
♦ We will look at some more complex and more efficient implementations of these concepts.
♦ We will also look at some general optimization techniques that can be used with great advantage in HLL.

Implementation of Object Oriented Languages

♦ In class based OO languages each object belongs to a class that defines the fields, methods, and the type of the object.

    class A {
      int x = 42;
      int y = 17;
      int foo() {
        return x;
      }
    }

    A a = new A;
    a.foo();

    Stack:  a -> object on the heap
    Heap:   header -> VMT
            x: 42
            y: 17
    VMT:    foo -> Code: foo
    Code:   foo: return x

The same example, annotated with how many copies of each part exist at runtime:

♦ Reference to object: many per object.
♦ Representation of object: one per object.
♦ Virtual Method Table: one per class.
♦ Code for functions (foo): at most one per class.

Implementation of Object Oriented Languages

♦ Object oriented languages support inheritance.
♦ Inheritance complicates the answer to some questions:
  ♦ Where is the value of a field stored?
  ♦ Where is the code for a certain method?
  ♦ What type will a value have at runtime?

Single Inheritance: Fields

♦ With single inheritance we can order the fields in such a way that all fields of a class are stored after the fields of the superclass.
♦ This way we know at compile time the offset of each field.

Single Inheritance: Fields

♦ Example:

    class A { int x = 0; }
    class B extends A { int y = 0;
                        int z = 0; }
    class C extends A { int r = 0; }
    class D extends B { int s = 0; }

    Stack: A a, B b, C c, D d (one reference per object)

    Heap:
      a: header | x: 0
      b: header | x: 0 | y: 0 | z: 0
      c: header | x: 0 | r: 0
      d: header | x: 0 | y: 0 | z: 0 | s: 0

    Offsets:
      (A,B,C,D). x: 1
      (B,D).     y: 2
      (B,D).     z: 3
      (C).       r: 2
      (D).       s: 4
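The offset table in the example can be derived mechanically: a class inherits the offsets of its superclass and appends its own fields after them. The following is a minimal sketch of that rule, not the course's implementation; the names ClassLayout and FieldLayoutDemo are made up for illustration, and offsets are counted in words with the header at offset 0.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical compile-time model of a class: inherited offsets plus its own fields.
final class ClassLayout {
    final String name;
    final Map<String, Integer> offsets = new LinkedHashMap<>();

    ClassLayout(String name, ClassLayout superclass, String... ownFields) {
        this.name = name;
        // Inherited fields keep exactly the offsets they have in the superclass.
        if (superclass != null) offsets.putAll(superclass.offsets);
        // Offset 0 is the header, so the first free slot is (number of fields) + 1.
        int next = offsets.size() + 1;
        for (String field : ownFields) offsets.put(field, next++);
    }

    int offsetOf(String field) { return offsets.get(field); }
}

final class FieldLayoutDemo {
    public static void main(String[] args) {
        ClassLayout a = new ClassLayout("A", null, "x");
        ClassLayout b = new ClassLayout("B", a, "y", "z");
        ClassLayout c = new ClassLayout("C", a, "r");
        ClassLayout d = new ClassLayout("D", b, "s");
        System.out.println(c.name + " " + c.offsets);
        System.out.println(d.name + " " + d.offsets);
    }
}
```

Because every subclass only appends, code compiled against A can read x at offset 1 from any B, C, or D object without knowing the object's exact class.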

Single Inheritance: Methods

♦ If we only have single inheritance we can handle methods in much the same way as fields.
♦ We store addresses to methods in the VMT instead of in the object.
♦ We copy all the addresses of the superclasses to the VMT of the subclasses.
♦ If a method is overridden we use the address of the new definition instead of the definition in the superclass.

♦ Example:

    class A { int f {…}; }
    class B extends A { int g {…}; }
    class C extends B { int f {…}; }
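The copy-then-override rule for VMTs can be sketched as below. This is an illustrative model only: the class Vmt, its fields, and the label format "Class_method" are assumptions chosen to match the A_f, B_g, C_f names used in the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a virtual method table: slot i holds the label of the code to call.
final class Vmt {
    final List<String> names = new ArrayList<>();   // method name per slot
    final List<String> code = new ArrayList<>();    // code label per slot

    Vmt(String className, Vmt superVmt, String... methods) {
        // Copy all entries of the superclass, keeping their slot numbers.
        if (superVmt != null) {
            names.addAll(superVmt.names);
            code.addAll(superVmt.code);
        }
        for (String m : methods) {
            int slot = names.indexOf(m);
            if (slot >= 0) {
                code.set(slot, className + "_" + m);   // overridden: reuse the inherited slot
            } else {
                names.add(m);                          // new method: append a slot
                code.add(className + "_" + m);
            }
        }
    }

    String target(String method) { return code.get(names.indexOf(method)); }
}

final class VmtDemo {
    public static void main(String[] args) {
        Vmt a = new Vmt("A", null, "f");
        Vmt b = new Vmt("B", a, "g");
        Vmt c = new Vmt("C", b, "f");
        System.out.println("VMT(B) = " + b.code);
        System.out.println("VMT(C) = " + c.code);
    }
}
```

Since f keeps slot 0 and g keeps slot 1 in every subclass, a call site only needs the slot number, which is known at compile time.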

Single Inheritance: Methods

    class A { int f {…}; }
    class B extends A { int g {…}; }
    class C extends B { int f {…}; }

    A a = new A;
    B b = new B;
    C c = new C;
    b.g();
    c.f();

The header of each object on the heap points to the VMT of its class:

    VMT(A):  f -> Code: A_f
    VMT(B):  f -> Code: A_f
             g -> Code: B_g
    VMT(C):  f -> Code: C_f
             g -> Code: B_g

The two calls compile to:

    ; b.g()
    LD   r1,SP(8)   ; Get b
    LD   r2,r1(0)   ; Get &VMT(B)
    LD   r3,r2(4)   ; Get &B_g
    call r3         ; Call B_g

    ; c.f()
    LD   r1,SP(4)   ; Get c
    LD   r2,r1(0)   ; Get &VMT(C)
    LD   r3,r2(0)   ; Get &C_f
    call r3         ; Call C_f

Single Inheritance: Testing Class Membership

♦ Many OO languages allow you to test class membership of an object.
♦ In Java there is “o instanceof C”.
♦ An object is a member of all its superclasses.
♦ We need to be able to find the superclass of a class. Let us extend our implementation with class descriptors.

Single Inheritance: Class Membership

    class A { int f {…}; }
    class B extends A { int g {…}; }
    class C extends B { int f {…}; }

    A a = new A;
    B b = new B;
    C c = new C;
    c instanceof A;

The header of each object now points to a class descriptor, which holds a super link and the VMT:

    Class A:  super: (none)   VMT: f -> Code: A_f
    Class B:  super: A        VMT: f -> Code: A_f, g -> Code: B_g
    Class C:  super: B        VMT: f -> Code: C_f, g -> Code: B_g

Now we can do c instanceof A as:

          t = c.header
    L:    if t == A goto True
          t = t.super
          if t != nil goto L
          res = false
          goto End
    True: res = true
    End:

Single Inheritance: Testing Class Membership

♦ Searching through the class hierarchy is inefficient.
♦ We can trade space for speed.
♦ Let each class descriptor have a display of all superclasses, i.e., a direct link to each superclass.
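The two membership tests can be compared side by side. The sketch below is an assumption-laden model, not the slides' exact layout: ClassDesc is a hypothetical descriptor, the display is stored in a separate array indexed from 0, and each class also lists itself in its own display.

```java
// Hypothetical class descriptor with both a super link and a display.
final class ClassDesc {
    final String name;
    final ClassDesc superclass;
    final int level;              // 1 for a root class
    final ClassDesc[] display;    // display[i] is the ancestor (or the class itself) at level i + 1

    ClassDesc(String name, ClassDesc superclass) {
        this.name = name;
        this.superclass = superclass;
        this.level = (superclass == null) ? 1 : superclass.level + 1;
        this.display = new ClassDesc[level];
        if (superclass != null) {
            System.arraycopy(superclass.display, 0, display, 0, superclass.level);
        }
        display[level - 1] = this;
    }

    // Linear search through the super links: cost grows with the depth of the hierarchy.
    boolean isSubclassBySearch(ClassDesc target) {
        for (ClassDesc t = this; t != null; t = t.superclass) {
            if (t == target) return true;
        }
        return false;
    }

    // Constant-time test: compare levels, then do a single display lookup.
    boolean isSubclassByDisplay(ClassDesc target) {
        return level >= target.level && display[target.level - 1] == target;
    }
}

final class MembershipDemo {
    public static void main(String[] args) {
        ClassDesc a = new ClassDesc("A", null);
        ClassDesc b = new ClassDesc("B", a);
        ClassDesc c = new ClassDesc("C", b);
        System.out.println(c.isSubclassBySearch(a) + " " + c.isSubclassByDisplay(a));
        System.out.println(a.isSubclassBySearch(c) + " " + a.isSubclassByDisplay(c));
    }
}
```

The display costs one extra word per ancestor in every class descriptor, which is the space being traded for speed.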

Single Inheritance: Class Membership

    class A { }
    class B extends A { }
    class C extends B { }

    A a = new A;
    B b = new B;
    C c = new C;
    c instanceof A;

Each class descriptor now holds its level, a display with one link per superclass, and the VMT:

    Class A:  level: 1
    Class B:  level: 2
    Class C:  level: 3

Now we can do c instanceof A as:

         t1 = c.header
         res = t1[0] >= 1      // A_level
         if !res goto End
         t2 = t1[2]            // 2 <- A_level+1
         res = (t2 == A)
    End:

Multiple Inheritance

♦ In languages with multiple inheritance, i.e., where it is possible to extend several parent classes with a class, all the operations we have seen become more difficult.
♦ Java’s hybrid approach with interfaces complicates these issues in the same way as multiple inheritance.

Multiple Inheritance: Graph Coloring

♦ One way to handle the layout of fields would be to use graph coloring. (This can also be used for methods.)
♦ All identical fields would have to occupy the same offset in the object.
♦ For some objects there would be holes in the array of fields. To reduce the wasted space the fields can be compacted in the object by storing the offsets in the class descriptor.

Multiple Inheritance: Graph Coloring

    class A { int x = 0; }
    class B { int y = 0;
              int z = 0; }
    class C extends A,B { int r = 0; }

    Stack: A a, B b, C c

    Heap:
      a: header | x: 0
      b: header | ----- | y: 0 | z: 0
      c: header | x: 0 | y: 0 | z: 0 | r: 0

    Offsets:
      (A,C). x: 1
      (B,C). y: 2
      (B,C). z: 3
      (C).   r: 4

Multiple Inheritance: Graph Coloring

    class A { int x = 0; }
    class B { int y = 0;
              int z = 0; }
    class C extends A,B { int r = 0; }

    A a = new A;
    B b = new B;
    B d = new B;
    C c = new C;

    Heap (compacted objects):
      a: header | x: 0
      b: header | y: 0 | z: 0
      d: header | y: 0 | z: 0
      c: header | x: 0 | y: 0 | z: 0 | r: 0

    Class descriptors (offset of each field in the object):
      Class A: x: 1
      Class B: ---- | y: 1 | z: 2
      Class C: x: 1 | y: 2 | z: 3 | r: 4

    Offsets:
      (A,C). x: header[0]
      (B,C). y: header[1]
      (B,C). z: header[2]
      (C).   r: header[3]

Multiple Inheritance: Graph Coloring

♦ One problem with global graph coloring is that it is global: you need the whole program, so it must be done at link time.
♦ If dynamic linking is possible this approach becomes even harder.

Multiple Inheritance: Hashing

♦ Second approach: Hashing.
♦ Instead of a global compile- or link time solution we can calculate a hash value for each name at compile time.
♦ At runtime we use the hash value as an offset into a hash table in the class descriptor.
♦ This contains the offset to fields in the object. (This also works for method addresses.)
♦ This can be costly if there are many collisions in the hash table.

Multiple Inheritance: Trampolines

♦ Third approach: Trampoline functions.
♦ We give each object several headers, one for each extended class.
♦ We add trampoline functions that change the view of the object from one class to another in an efficient way.
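Returning to the hashing approach, a per-class hash table from field name to offset might look like the sketch below. Everything here is an assumption for illustration: the class HashedClassDesc, the use of String.hashCode as the hash, and linear probing as the collision strategy. The table must be larger than the number of fields.

```java
// Hypothetical class descriptor holding a small open-addressing hash table
// that maps a field name to its offset in the object.
final class HashedClassDesc {
    private final String[] keys;
    private final int[] offsets;

    HashedClassDesc(int tableSize, String... fieldsInLayoutOrder) {
        keys = new String[tableSize];
        offsets = new int[tableSize];
        int offset = 1;                                   // offset 0 is the header
        for (String f : fieldsInLayoutOrder) {
            int i = slot(f);
            while (keys[i] != null) i = (i + 1) % keys.length;   // collision: probe the next slot
            keys[i] = f;
            offsets[i] = offset++;
        }
    }

    // A compiler would compute this hash once, at compile time; the sketch recomputes it.
    private int slot(String name) { return Math.floorMod(name.hashCode(), keys.length); }

    int offsetOf(String field) {
        // Each extra probe here is the runtime cost of a collision.
        for (int i = slot(field); keys[i] != null; i = (i + 1) % keys.length) {
            if (keys[i].equals(field)) return offsets[i];
        }
        throw new IllegalArgumentException("no field " + field);
    }
}

final class HashingDemo {
    public static void main(String[] args) {
        HashedClassDesc b = new HashedClassDesc(8, "y", "z");
        HashedClassDesc c = new HashedClassDesc(8, "x", "y", "z", "r");
        System.out.println("B.y at " + b.offsetOf("y") + ", C.y at " + c.offsetOf("y"));
    }
}
```

The same name hashes to the same slot in every class descriptor, so code that only knows it has "some object with a field y" can find the offset without global layout information.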

Multiple Inheritance: Trampolines

    class A { int x = 0; int f() {…} }
    class B { int y = 0; int g() {… y …} }
    class C extends A,B { int z = 0; }

    C c1 = new C();
    A a  = (A) c1;
    C c2 = (C) a;
    B b  = (B) c2;
    C c3 = (C) b;

An object of class C has two headers, one for each extended class:

    header -> C VMT-A
    x: 0
    header -> C VMT-B
    y: 0
    z: 0

    C VMT-A:
      tramp:    Code: tramp     return r1
      f:        Code: f
    C VMT-B:
      tramp:    Code: tramp     r1 = r1 - 8
                                return r1
      tramp_g:  Code: tramp_g   r1 = r1 + 8
                                call g
    Code: g

The casts are translated to:

    c1 = <address of the new object>
    a  = c1;
    c2 = a.tramp();   /* = a */
    b  = c2 + 8;
    c3 = b.tramp();   /* = b - 8 */

Field accesses and calls are translated to:

    c1.z;     // c1[16]
    c1.x;     // c1[4]
    c1.y;     // c1[12]
    c1.g();   // t = c1[8]; t2 = t[8]; call t2;
    a.f();    // t = a[0]; t2 = t[8]; call t2;
    b.g();    // t = b[0]; t2 = t[8]; call t2;

Inside g, the field y is accessed as r1[4].

Optimizing OO-Programs

♦ In modern machines a jump to a known address is much faster than a jump to an address fetched from a table.
♦ Dispatching through a table also makes inlining and interprocedural analysis harder.
♦ Possible solutions: Whole program optimization, link time optimization, JIT compilation, or runtime optimizations.
♦ When we have the whole program we can turn many dynamic properties into static properties.

Inline caching

♦ Many dynamic calls actually go to the same class all the time.
♦ For each call site remember the actual target of the last call.
♦ Next time jump directly to this location, and check if we end up in the right place.

Polymorphic Inline Caching

♦ If a call site is polymorphic, inline caching can lead to degraded performance.
♦ Solution: Polymorphic inline caching, remember more than one target address.
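A one-entry inline cache can be modelled in a few lines. This is a sketch under stated assumptions: CallSite is a hypothetical class, a Map stands in for the slow VMT lookup, classes are identified by name, and the receiver check happens before the jump rather than in the callee's prologue as the slide describes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Hypothetical call site with a one-entry (monomorphic) inline cache.
final class CallSite {
    private final Map<String, IntUnaryOperator> methodTable;  // stands in for the slow lookup
    private String cachedClass;              // receiver class seen at the last call
    private IntUnaryOperator cachedTarget;   // target of the last call
    int misses = 0;

    CallSite(Map<String, IntUnaryOperator> methodTable) { this.methodTable = methodTable; }

    int call(String receiverClass, int arg) {
        // Check that the cached target is still right for this receiver.
        if (!receiverClass.equals(cachedClass)) {
            misses++;
            cachedClass = receiverClass;
            cachedTarget = methodTable.get(receiverClass);    // full lookup, then update the cache
        }
        return cachedTarget.applyAsInt(arg);
    }
}

final class InlineCacheDemo {
    public static void main(String[] args) {
        Map<String, IntUnaryOperator> table = new HashMap<>();
        table.put("A", x -> x + 1);
        table.put("B", x -> x * 2);
        CallSite site = new CallSite(table);
        for (int i = 0; i < 5; i++) site.call("A", i);        // monomorphic: one miss in total
        System.out.println("misses after monomorphic calls: " + site.misses);
        for (int i = 0; i < 4; i++) site.call(i % 2 == 0 ? "B" : "A", i);  // alternating receivers
        System.out.println("misses after alternating calls: " + site.misses);
    }
}
```

With alternating receivers every call misses, which is the degraded case that remembering more than one target is meant to fix.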

Polymorphic Inline Caching

♦ Polymorphic inline caching can be implemented with an if-then-else search tree. The call

    v.f()

  becomes

    if v.header < C {
      if v.header < B A.f() else B.f()
    } else {
      if v.header < D C.f() else D.f()
    }

OO: Summary

♦ Implementing OO efficiently means implementing inheritance efficiently.
♦ There are several possible solutions available and there is still research going on in this area.
♦ One of the most successful techniques for optimizing OO is to do it at runtime using JIT compilation, something we will look closer at later in the course.

Implementation of Functional Programming Languages

♦ There is no common agreement on exactly what a functional language is. But usually such a language should have at least one of the following concepts:
  ♦ No statements, only functions (or expressions).
  ♦ Higher order functions.
  ♦ Pureness (no side effects).
  ♦ Laziness.
  ♦ Automatic memory management (garbage collection).

Higher Order Functions

♦ In Eins (and in C) you have “second”-order functions.
♦ That is, functions are also values in the language: you can take their addresses and pass them around and apply them.

    def apply(f: (Int) => Int, x: Int): Int = f(x);

♦ These functions can be represented with just a function pointer, i.e., the address of the function.
♦ Functions that take functions as arguments are called higher order functions.
♦ For a language to have interesting higher order functions you need to be able to create new functions at runtime. E.g., in Scala you can write:

    val f: (Int => Int) = x => x + 1;

Higher Order Functions

♦ To get really interesting functions at runtime you need to be able to capture the free variables of the function.
♦ A free variable is a variable that is not bound by the definition of the function. (y is free in x => x+y.)

    def f(y:Int):(Int => Int) = x => x + y;

♦ In order to do this we need closures.
♦ A closure is a data structure that contains a function pointer and a way to access all free variables of the body of the function.

Higher Order Functions

♦ In an OO language a closure can be implemented as an object with a single method and several instance variables.

    def f(y:Int):(Int => Int) = x => x + y;
    f(42)(17)

  can be implemented as:

    class F {
      int y;
      public F(int y) { this.y = y; }
      public int apply(int x) {
        return x+y;
      }
    }
    public F f(int y) { return new F(y); }
    f(42).apply(17);

Higher Order Functions

♦ This is more or less the way Scala implements functions.
♦ To make it more general we can make all closures implement the Function interface:

    public interface Function1 {
      public abstract java.lang.Object apply(java.lang.Object a0);
    }

♦ We also need to take care of local (mutable) variables that are captured by the function. This can be done by turning them into references.

Higher Order Functions

    def f(y:Int):(Int => Int) = {
      var z = y*2;
      val f = x=>x+z;
      z = z +1;
      f;
    }

  can be implemented as:

    class IntRef {
      int v;
      public IntRef(int i) {v=i;}
      public void set(int i) {v=i;}
    }

    class F {
      IntRef y;
      public F(IntRef y) {
        this.y = y; }
      public int apply(int x) {
        return x+y.v; }
    }

    public F f(int y) {
      IntRef z = new IntRef(y*2);
      F f = new F(z);
      z.set(z.v + 1);
      return f;
    }
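The point of the Function1 interface is that one higher order function can accept any closure, whatever it captured. The sketch below repeats the interface from the slide and adds two illustrative pieces that are not in the original: a closure class Adder and a higher order function twice. Arguments are boxed as Object, as in the slide's signature.

```java
// The uniform closure interface from the slide.
interface Function1 {
    Object apply(Object a0);
}

// Hypothetical closure for x => x + y, with the free variable y stored as a field.
final class Adder implements Function1 {
    private final int y;
    Adder(int y) { this.y = y; }
    public Object apply(Object a0) { return (Integer) a0 + y; }   // unbox, add, box again
}

final class Function1Demo {
    // A higher order function: it only knows the interface, not the closure class.
    static Object twice(Function1 f, Object x) { return f.apply(f.apply(x)); }

    public static void main(String[] args) {
        System.out.println(twice(new Adder(42), 17));
    }
}
```

The price of the uniform interface is the boxing and the cast on every call, which is one reason specialised function classes are attractive for primitive types.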

Pure Functional Languages

♦ In a pure functional language there are no side effects.
♦ This includes no updates of variables. That is, variables are immutable.
♦ Variables are, like variables in mathematics, just names for values.
♦ If we say x = 42; then we give the value 42 a new name: x. From now on x and 42 are interchangeable.
♦ With a pure functional language it is possible to do equational reasoning.

Lazy Evaluation

♦ With lazy evaluation, an expression is not evaluated unless its value is demanded by some other part of the computation.
♦ In contrast, strict languages (Java, ML, C, Erlang) evaluate each expression as the control flow reaches it.

Call-by-Name Evaluation

♦ Most languages pass function arguments using call-by-value:
  ♦ i.e., all arguments are evaluated before a function is called.
  ♦ e.g., in the expression f(g(x+y)), first (x+y) is evaluated, then the function g is called before the function f is called.
♦ If the function f doesn’t use its argument then the evaluation of g and of x+y is done in vain.

Call-by-Name Evaluation

♦ Call-by-name evaluation avoids this problem by not evaluating the arguments; instead a thunk is created for each argument.
♦ A thunk is a function that can be called to compute the value on demand.

    f(g(x+y))
  is translated to
    f(()=>g(()=>x+y))

♦ Any use of the argument in f is replaced by an application of the function:

    f(x) = x;
  is translated to
    f(x) = x();
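The translation of f(g(x+y)) into thunks can be tried out directly with lambdas. This is a minimal sketch: the names ignore, identity, and the evaluations counter are introduced here for illustration, and java.util.function.IntSupplier plays the role of the thunk.

```java
import java.util.function.IntSupplier;

final class CallByNameDemo {
    static int evaluations = 0;   // counts how often g actually runs

    // g forces its own argument thunk when it is called.
    static int g(IntSupplier arg) { evaluations++; return arg.getAsInt() * 2; }

    // A function that never uses its argument: the thunk is never called.
    static int ignore(IntSupplier x) { return 0; }

    // f(x) = x is translated to f(x) = x().
    static int identity(IntSupplier x) { return x.getAsInt(); }

    public static void main(String[] args) {
        int x = 1, y = 2;
        ignore(() -> g(() -> x + y));            // f(g(x+y)) with an f that ignores its argument
        System.out.println("evaluations of g: " + evaluations);
        int r = identity(() -> g(() -> x + y));  // now the thunk is demanded once
        System.out.println("result " + r + ", evaluations of g: " + evaluations);
    }
}
```

Each lambda captures x and y, so a thunk is really a closure with no parameters.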

Call-by-Name Evaluation

♦ Scala provides call-by-name with explicit def parameters.
♦ A problem with call-by-name is that a thunk may be executed many times.

    f(x) = x+x;
  is translated to
    f(x) = x()+x();

Call-by-Need

♦ With call-by-need each thunk is only evaluated once.
♦ This is implemented by giving each thunk a memo slot that stores the evaluated value; each evaluation of the thunk first checks the memo slot: if it is empty the expression is evaluated and stored in the slot, otherwise the value in the slot is returned.

Call-by-Need

Conceptually a thunk for x+y can be implemented as:

    class Thunk {
      res = null;
      apply() = {
        if (res == null) res = x+y;
        res
      }
    }

Call-by-Need

♦ A thunk can also be implemented just as two words: a function pointer and a memo slot.
♦ When the thunk is evaluated both fields are updated: the memo slot with the value and the function with a new function that returns the value.
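A memoizing thunk along these lines can be written as a small generic class. The sketch is illustrative rather than the slides' exact design: the class name Thunk and the method force are choices made here, and a boolean flag marks the memo slot as filled (instead of a null test) so that a null result is also cached.

```java
import java.util.function.Supplier;

// A call-by-need thunk: the suspended expression is evaluated at most once.
final class Thunk<T> {
    private Supplier<T> compute;      // the suspended expression
    private T memo;                   // the memo slot
    private boolean evaluated = false;

    Thunk(Supplier<T> compute) { this.compute = compute; }

    T force() {
        if (!evaluated) {
            memo = compute.get();
            evaluated = true;
            compute = null;           // release the expression and the variables it captured
        }
        return memo;
    }
}

final class CallByNeedDemo {
    static int evaluations = 0;

    // f(x) = x + x: the argument is demanded twice but computed once.
    static int twice(Thunk<Integer> x) { return x.force() + x.force(); }

    public static void main(String[] args) {
        int a = 1, b = 2;
        int r = twice(new Thunk<Integer>(() -> { evaluations++; return a + b; }));
        System.out.println(r + " after " + evaluations + " evaluation(s)");
    }
}
```

Clearing the supplier after the first evaluation plays the same role as overwriting the function word in the two-word representation: later calls never reach the original expression again.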

Optimization of FP

♦ Functional programs tend to use many small functions. Modern hardware is optimized for imperative programs with few large functions, i.e., function calls are relatively expensive.
♦ Hence it can be profitable to reduce the number of function calls and increase the size of functions.
♦ This can be done by inline expansion.

Inline Expansion

♦ Inline expansion or inlining is an optimization where a function call is replaced by the body of the function.
♦ If this is done in a stage in the compiler where all independent names are replaced by unique symbols then the process is quite straightforward. Otherwise the formal parameters need to be renamed (α-converted).

Inline Expansion

♦ If inline expansion is applied indiscriminately, the size of the program explodes.
♦ To limit the code growth we can:
  1. Expand only frequent call sites.
  2. Expand only small functions.
  3. Expand functions called only once, and perform dead function elimination.

Inline Expansion

♦ If we inline a recursive function just as any other function we would probably end up with a call to the original function, either directly after the first iteration or after a while.

Inline Expansion

    def loop(int x, int max, int y) =
      if (x > max) y
      else loop(x + 1, max, y * y);

    def f(int z) =
      loop(1, 10, z);

  ⇓

    def f(int z) = {
      val x = 1; val max = 10; val y = z;
      if (x > max) y
      else loop(x + 1, max, y * y);
    }

Inline Expansion

♦ To remedy this we can bring the definition of the recursion with us in the inlining by splitting the function into a prelude and a loop header.

Inline Expansion

With the loop definition carried along into f, the inlined function becomes:

    def f(int z) = {
      val x = 1; val max = 10; val y = z;
      val loop =
        (int xX, int maxX, int yX) =>
          if (xX > maxX) yX
          else loop(xX + 1, maxX, yX * yX);
      if (x > max) y
      else loop(x + 1, max, y * y);
    }

Loop-Invariant Hoisting

♦ We can avoid passing around values that are the same in each recursive call by using loop-invariant hoisting.
♦ Just let the constant value become a free variable.
♦ In our example lift max from an argument to a free variable.

Loop-Invariant Hoisting

After hoisting max out of the local loop:

    def f(int z) = {
      val x = 1; val max = 10; val y = z;
      val loop =
        (int xX, int yX) =>
          if (xX > max) yX
          else loop(xX + 1, yX * yX);
      if (x > max) y
      else loop(x + 1, y * y);
    }

Inline Expansion

♦ Inline expansion in itself can be useful since the overhead for a function call and return is removed, but the real benefit comes from applying standard optimizations on the inline expanded program.
♦ Constant propagation and folding, dead code and unreachable code elimination all work better when the scope (of a function) is increased.

Inline Expansion & constant prop

    def f(int z) = {
      val x = 1; val max = 10; val y = z;
      val loop =
        (int xX, int yX) =>
          if (xX > 10) yX
          else loop(xX + 1, yX * yX);
      if (x > 10) y
      else loop(x + 1, y * y);
    }

Inline Expansion & copy prop

    def f(int z) = {
      val x = 1; val max = 10; val y = z;
      val loop =
        (int xX, int yX) =>
          if (xX > 10) yX
          else loop(xX + 1, yX * yX);
      if (1 > 10) z
      else loop(1 + 1, z * z);
    }

Inline Expansion & constant folding

    def f(int z) = {
      val loop =
        (int xX, int yX) =>
          if (xX > 10) yX
          else loop(xX + 1, yX * yX);
      loop(2, z * z);
    }

Efficient Tail Calls

♦ A function call f(x) within the body of a function g is in a tail position if calling f is the last thing g will do before returning.
♦ We can save stack space and execution time by turning the call to f into a jump to f.
♦ For some languages, like Erlang and Scheme, proper tail calls are not an optimization but a requirement.
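The folded version of f and the idea of a tail call as a jump can be checked against the original definition. In the sketch below, fOriginal transcribes the starting program and fOptimized is a hand-written version of the final result with the tail call replaced by a loop; both names are introduced here. Java itself does not guarantee tail call elimination, so the while loop stands in for the jump.

```java
final class TailCallDemo {
    // Direct transcription of the original recursive definition.
    static int loop(int x, int max, int y) {
        return (x > max) ? y : loop(x + 1, max, y * y);
    }

    static int fOriginal(int z) { return loop(1, 10, z); }

    // After inlining, hoisting, propagation and folding: start at loop(2, z * z).
    static int fOptimized(int z) {
        int xX = 2;
        int yX = z * z;
        while (!(xX > 10)) {      // the tail call loop(xX + 1, yX * yX) becomes a jump back here
            yX = yX * yX;
            xX = xX + 1;
        }
        return yX;
    }

    public static void main(String[] args) {
        for (int z = -2; z <= 2; z++) {
            System.out.println(z + ": " + fOriginal(z) + " " + fOptimized(z));
        }
    }
}
```

Both versions square the argument ten times, and int overflow wraps the same way in each, so they agree on every input while the optimized one uses constant stack space.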

Tail Calls

♦ A tail call can be transformed from a call to a jump as follows:
  1. Move actual parameters into argument registers (and stack positions).
  2. Restore callee-save registers.
  3. Pop the stack frame of the calling function.
  4. Jump to the callee.
♦ If both the caller and the callee have few arguments so that they all fit in argument registers then step 1 might be eliminated by a coalescing register allocator, and steps 2 and 3 might also be unnecessary: the tail call becomes just a jump.

Equational Reasoning

♦ In a pure language we can perform β-substitution.
♦ That is, replacing a call to a function with a version of the body of the function where each occurrence of the formal parameter is replaced by the argument.

    ((x) => x + x)(42)  →β  42 + 42

♦ Basically: we can perform function calls at compile time.

Optimization of Lazy FP

♦ A lazy language allows us to do some optimizations that would not be safe in a strict language:
  ♦ Invariant hoisting.
  ♦ Dead code removal (of function calls).
  ♦ Strictness analysis.

Optimization of Lazy FP

♦ Invariant hoisting:

    def f(i) = {
      def g(j) = h(i) * j;
      g
    }

  ⇓

    def f(i) = {
      val hi = h(i);
      def g(j) = hi * j;
      g
    }

♦ If h(n) loops infinitely but the result of f(n) is never called, a strict language would loop in the call to f(n) after the transformation.

Optimization of Lazy FP

♦ Dead code removal:

    def f(i:int): int = {
      var d = g(x);
      i + 2;
    }

♦ In an imperative language g(x) cannot be removed, since there might be side effects.
♦ In a strict pure language removing g(x) might turn a non-terminating computation into a terminating one.

Optimization of Lazy FP

♦ The overhead of thunk creation and evaluation is quite high, so thunks should only be used when needed.
♦ If a function f(x) is certain to evaluate its argument x, there is no need to create a thunk for x.
♦ We can use a strictness analysis to find out which arguments should be evaluated at the call site and which should be passed as thunks.
♦ In general exact strictness analysis is not computable, so a conservative approximation must be used, i.e., assume that arguments that cannot be proved strict are non-strict.
