10/30/2007

First-Class Functions

• Data values are first-class if they can – be assigned to local variables – be components of data structures – be passed as arguments to functions Functional Languages and – be returned from functions Higher-Order Functions – be created at run-time • How functions are treated by programming languages? language passed as arguments returned from functions nested Java No No No Leonidas Fegaras CYesYesNo C++ Yes Yes No Pascal Yes No Yes Modula-3 Yes No Yes Scheme Yes Yes Yes ML Yes Yes Yes

Functional Languages and Higher-Order Functions 1 Functional Languages and Higher-Order Functions 2

Function Types How can you do this in Java?

• A new type constructor interface Comparison { (T1,T2,...,Tn) -> T0 boolean compare ( int x, int y ); Takes n arguments of type T1, T2, ..., Tn and returns a value of type T0 } Unary function: T1 -> T0 Nullary function: () -> T0 void sort ( int[] A, Comparison cmp ) { •Example: for (int i = 0; iboolean ) { for (int j=i+1; j= y; } sort(A,new Leq()); sort(A,leq) sort(A,geq) Functional Languages and Higher-Order Functions 3 Functional Languages and Higher-Order Functions 4

... or better Nested Functions class Comparison { • Without nested scopes, a function may be represented as a pointer abstract boolean compare ( int x, int y ); to its code } • Functional languages (Scheme, ML, Haskell), as well as Pascal and Modula-3, support nested functions sort(A,new Comparison() – They can access variables of the containing lexical scope { boolean compare ( int x, int y ) { return x <=y; } }) plot ( f: (float)->float ) { ... }

plotQ ( a, b, c: float ) { p ( x: float ) { return a*x*x + b*x + c; } plot(p); } • Nested functions may access and update free variables from containing scopes • Representing functions as pointers to code is not good any more

Functional Languages and Higher-Order Functions 5 Functional Languages and Higher-Order Functions 6

1 10/30/2007

Closures What about Returned Functions?

• Nested functions may need to access variables in previous frames • If the frame of the function that defined the passing function has in the stack been popped out from the run-time stack, the static link will be a • Function values is a closure that consists of dangling pointer – a pointer to code – an environment (dictionary) for free variables ()->int make_counter () { • Implementation of the environment: int count = 0; – It is simply a static link to the beginning of the frame that defined the int inc () { return count++; } function bottom code return inc; for p } p plot ( f: (float)->float ) { ... } plotQ make_counter()() + make_counter()(); plotQ ( a, b, c: float ) { f closure of p ()->int c = make_counter(); p ( x: float ) { return a*x*x + b*x + c; } plot plot(p); top c()+c(); } Run-time stack

Functional Languages and Higher-Order Functions 7 Functional Languages and Higher-Order Functions 8

Frames in Heap!

• Solution: heap-allocate function frames • Local variables need to be – No need for run-time stack – stored in heap only if they can escape – Frames of all lexically enclosing functions are reachable from a closure via – accessed after the defining function returns static link chains • It happens only if – The GC will collect unused frames – the variable is referenced from within some nested function • Problem: Frames will make a lot of garbage look reachable – the nested function is returned or passed to some function that might store it in a data structure • Variables that do not escape are allocated on a stack frame rather than on heap • No escaping variable => no heap allocation • Escape analysis must be global – Often approximate (conservative analysis)

Functional Languages and Higher-Order Functions 9 Functional Languages and Higher-Order Functions 10

Functional Programming Languages Lambda Calculus

• Programs consist of functions with no side-effects • The theoretical foundation of functional languages is lambda • Functions are first class values calculus – Formalized by Church in 1941 • Build modular programs using function composition – Minimal in form • No accidental coupling between components – Turing-complete • No assignments, statements, for-loops, while-loops, etc • Syntax: if e1, e2, and e are expressions in lambda calculus, so are • Supports higher -level, declarative programming style – Variable: v – Application: e e • Automatic memory management (garbage collection) 1 2 – Abstraction: λv. e • Emphasis on types and type inference • Bound vs free variables – Built-in support for lists and other recursive data types • Beta reduction: – Type inference is like type checking but no type declarations are required –(λv. e ) e → e [e /v] • Types of variables and expressions can be inferred from context 1 2 1 2 (e but with all free occurrences of v in e replaced by e ) – Parametric data types and polymorphic type inference 1 1 2 – need to be careful to avoid the variable capturing problem (name clashes) • Strict vs lazy languages

Functional Languages and Higher-Order Functions 11 Functional Languages and Higher-Order Functions 12

2 10/30/2007

Church encoding: Integers Other Types

• Integers: • Booleans –0 = λs. λz. z –true = λt. λf. t –1 = λs. λz. s z –false = λt. λf. f –2 = λs. λz. s s z –if pred e1 e2 = pred e1 e2 • eg, if pred is true, then (λt. λf. t) e e =e –6 = λs. λz. s s s s s s z 1 2 1 • Lists – ... they correspond to successor (s) and zero (z) – nil = λc. λn. n • Simple arithmetic: – [2,5,8] = λc. λn. c 2 (c 5 (c 8 n)) –add = λn. λm. λs. λz. n s (m s z) –cons = λx. λr. λc. λn. c x (r c n) • cons 2 (cons 5 (cons 8 nil)) = … = λc. λn. c 2 (c 5 (c 8 n)) add 2 3 = (λn. λm. λs. λz. n s (m s z)) 2 3 –append = λr. λs. λc. λn. r c (s c n) = λs. λz. 2 s (3 s z) – head = λs. s (λx. λr. x) ? = λs. λz. (λs. λz. s s z) s ((λs. λz. s s s z) s z) •Pairs = λs. λz. (λs. λz. s s z) s (s s s z) –pair = λx. λy. λp. p x y = λs. λz. s s s s s z –first = λs. s (λx. λy. x) = 5

Functional Languages and Higher-Order Functions 13 Functional Languages and Higher-Order Functions 14

Reductions Recursion

• REDucible EXpression (redex) • Infinite reduction: (λx. x x) (λx. x x) – an application expression is a redex – no normal form; no termination – abstractions and variables are not redexes • A fixpoint combinator Y satisfies: • Use beta reduction to reduce – Y f is reduced to f (Y f) –(λx. add x x) 5 is reduced to 10 – Y = (λg. (λx. g (x x)) (λx. g (x x))) • Normal form = no reductions are – Y is always built-in • Reduction is confluent (has the Church-Rosser property) – normal forms are unique regardless of the order of reduction • IlImplement s recursi on • Weak normal forms (WNF) • factorial = Y (λf. λn. if (= n 0) 1 (* n (f (- n 1)))) – no redexes outside of abstraction bodies • Call by value (eager evaluation): WNF + leftmost innermost reductions • Call by name: WNF + leftmost outermost reductions (normal order) • Call by need (lazy evaluation): call by name, but each redex is evaluated at most once – terms are represented by graphs and reductions make shared subgraphs

Functional Languages and Higher-Order Functions 15 Functional Languages and Higher-Order Functions 16

Second-Order Polymorphic Lambda Calculus Type Checking

• Types are: – Type variable: v – Universal quantification: ∀v. t

– Function: t1 → t2 • Lambda terms are: – Variable: v

– Application: e1 e2 – Abstraction: λv:t. e – Type abstraction: Λv. e – Type instantiation: e[t] • Integers –int = ∀a. (a → a) → a → a –succ = λx:int. Λa. λs:(a → a). λz:a. s (x[a] s z) –plus = λx:int. λy:int. x[int] succ y

Functional Languages and Higher-Order Functions 17 Functional Languages and Higher-Order Functions 18

3 10/30/2007

Functional Languages Type Inference

• Functional languages = typed lambda calculus + syntactic sugar • Functional languages need type inference rather than type checking • Functional languages support parametric (generic) data types – λv:t. e requires type checking – λv. e requires type inference (need to infer the type of v) data List a = Nil • Type inference is undecidable in general | Cons a (List a) • Solution: type schemes (shallow types): data Tree a b = Leaf a – ∀a . ∀a . … ∀a . t no other universal quantification in t | Node b (Tree a b) (Tree a b) 1 2 n –(∀b. b → int) → (∀b. b → int) is not shallow Cons 1 (Cons 2 Nil) • When a type is missing, then a fresh type variable is used Cons “a” (Cons “b” Nil) • Type checking is based on type equality; type inference is based on type • Lists are built-in Haskell: [1,2,3] = 1:2:3:[] unification • Polymorphic functions: – A type variable can be unified with any type append (Cons x r) s = Cons x (append r s) • Example in Haskell: append Nil s = s let f = λx. x in (f 5, f “a”) The type of append is ∀a. (List a) → (List a) → (List a) λx. x has type ∀a. a → a • Cost of polymorphism: polymorphic values must be boxed (pointers to heap) • Parametric polymorphism vs ad-hoc polymorphism (overloading)

Functional Languages and Higher-Order Functions 19 Functional Languages and Higher-Order Functions 20

Higher-Order Functions

• Map a function f over every element in a list map:: (a → b) → [a] → [b] map f [] = [] map f (a:s) = (f a):(map f s) e.g. map (+) [1,2,3,4] = [2,3,4,5]

• Replace all cons list constructions with the function c and the nil with the value z foldr:: (a → b → b) → b → [a] → b foldr c z [] = z foldr c z (a:s) = c a (foldr c z s) e.g. foldr (+) 0 [1,2,3] = 6 e.g. append x y = foldr (++) y x e.g. map f x = foldr (λa r. (f a):r) [] x

Functional Languages and Higher-Order Functions 21

4