<<

Tail Calls

CMSC 330: Organization of •A tail call is a function call that is the last thing a Programming Languages function does before it returns

let add x y = x + y let f z = add z z (* tail call *)

let rec length = function More on Scope [] -> 0 | (_::t) -> 1 + (length t) (* not a tail call *) Operational Semantics

let rec length a = function [] -> a | (_::t) -> length (a + 1) t (* tail call *)

CMSC 330 2

Tail Tail Recursion (cont’d)

• Recall that in OCaml, all looping is via recursion let rec length l = match l with l [1;2] – Seems very inefficient [] -> 0 [2] – Needs one stack frame for recursive call | (_::t) -> 1 + (length t) l [] length [1; 2] l

• A function is tail recursive if it is recursive and eax: 012 the recursive call is a tail call

• However, if the program is tail recursive... – Can instead reuse stack frame for each recursive call

CMSC 330 3 CMSC 330 4

Tail Recursion (cont’d) Names and Binding

• Programs use names to refer to things let rec length a l = match l with a 012 – E.g., in x = x + 1, x refers to a variable [] -> a | (_::t) -> (length (a + 1) t) l [1;2][2][] •A binding is an association between a name and what it refers to length 0 [1; 2] – int x; /* x is bound to a stack location containing an int */ eax: 2 – int f (int) { ... } /* f is bound to a function */ – class { ... } /* C is bound to a class */ – let x = e1 in e2 (* x is bound to e1 *) • The same stack frame is reused for the next call, since we’d just pop it off and return anyway

CMSC 330 5 CMSC 330 6

1 Name Restrictions Names and Scopes

• Languages often have various restrictions on • Good names are a precious commodity names to make lexing and parsing easier – They help document your code – Names cannot be the same as keywords in the – They make it easy to remember what names language correspond to what entities – OCaml function names must be lowercase – OCaml type constructor and module names must be • We want to be able to reuse names in different, uppercase non-overlapping regions of the code – Names cannot include special characters like ;,:etc • Usually names are upper- and lowercase letters, digits, and _ (where the first character can’t be a digit) • Some languages also allow more symbols like ! or -

CMSC 330 7 CMSC 330 8

Names and Scopes (cont’d) Example

void w(int i) { •iis in scope •A scope is the region of a program where a ... binding is active } – in the body of w, the body of y, and after the – The same name in a different scope can refer to a void x(float j) { declaration of j in z different binding (refer to a different program object) ... – but all those i’s are } different • A name is in scope if it's bound to something void y(float i) { ... •jis in scope within the particular scope we’re referring to } – in the body of x and z void z(void) { int j; char *i; ... }

CMSC 330 9 CMSC 330 10

Ordering of Bindings Order of Bindings – OCaml

• Languages make various choices for when • let x = e1 in e2 – x is bound to e1 in scope of e2 declarations of things are in scope • let rec x = e1 in e2 – x is bound in e1 and in e2

let x = 3 in let y = x + 3 in... (* x is in scope here *)

let x = 3 + x in ... (* error, x not in scope *)

let rec length = function [] -> 0 | (h::t) -> 1 + (length t) (* ok, length in scope *) in ...

CMSC 330 11 CMSC 330 12

2 Order of Bindings – C Order of Bindings – Java

• All declarations are in scope from the • Declarations are in scope from the declaration declaration onward onward, except for methods and fields, which int i; are in scope throughout the class int j = i; /* ok, i is in scope */ i = 3; /* also ok */ class C { void f(){ void f(...) { ... } ...g()... // OK } int i; int j = j + 3; /* error */ void g(){ f(...); /* ok, f declared */ ... } } f(...); /* may be error; need prototype (or oldstyle C) */

void f(...) { ... }

CMSC 330 13 CMSC 330 14

Shadowing Names Namespaces

• Shadowing is rebinding a name in an inner • Languages have a “top-level” or outermost scope scope to have a different meaning – Many things go in this scope; hard to control collisions – May or may not be allowed by the language • Common solution seems to be to add a hierarchy – OCaml: Modules C OCaml •List.hd, String.length, etc. int i; letg=3;; •opento add names into current scope letgx=x+3;; void f(float i) { – Java: Packages Java • java.lang.String, java.awt.Point, etc. { void h(int i) { • import to add names into current scope char *i = NULL; { ... – C++: Namespaces float i; // not allowed } ... • namespace f { class g { ... } }, f::g b, etc. } • using namespace to add names to current scope } }

CMSC 330 15 CMSC 330 16

Mangled Names Static Scope Recall

• What happens when these names need to be •In static scoping, a name refers to its closest seen by other languages? binding, going from inner to outer scope in the – What if a C program wants to call a C++ method? program text • C doesn’t know about C++’s naming conventions – Languages like C, C++, Java, Ruby, and OCaml are statically scoped • For multilingual communication, names are int i; often mangled into some flat form { – E.g., classC{intf(int *x, int y) { ... } } int j; becomes symbol __ZN1C3fEPii in g++ – E.g., native valueOf(int) in java.lang.String { float i; corresponds to the C function Java_java_lang_String_valueOf__I j = (int) i; } } CMSC 330 17 CMSC 330 18

3 Free and Bound Variables Static Scoping and Nested Functions • The bound variables of a scope are those • To allow arbitrary nested functions with higher- names that are declared in it order functions and static scoping, we needed • If a variable is not bound in a scope, it is free closures

– The bindings of variables which are free in a scope let add x = (fun y -> x + y) are "inherited" from declarations of those variables in outer scopes in static scoping (add3)4  4  3+4  7

{/*1*/ j is bound in int j;

scope 1 {/*2*/ float i; i is bound in

j is free in j = (int) i; scope 2 scope 2 } } CMSC 330 19 CMSC 330 20

Nested Functions (cont’d) Example

• We need closures for upward funargs x letfx= – Functions that are returned by other functions letgy=x+y g y = x + y in y g3 • If we only have downward funargs, then we don’t need full closures – These are functions that are only passed inward – So when they’re called, any nonlocal variables they • When g is called, x is still on the stack access from outer scopes are still around

CMSC 330 21 CMSC 330 22

Example Downward Funargs

x • It turns out that if we only pass functions letappfz=fz app f z = f z downward, there are cheaper implementation f let f x = strategies for static scoping than closures letgy=x+yingy=x+y z app g 3 y • They're called static links and displays, and they're used by – Pascal and Algol-family languages • When g is called, x is still on the stack – gcc nested functions

• We won’t go into details, though (CMSC 430 covers these in exciting detail.)

CMSC 330 23 CMSC 330 24

4 Dynamic Scope Nested Dynamic Scopes

• In a language with dynamic scoping, a name • Full dynamic scopes can be nested refers to its closest binding at runtime – Static scope relates to the program text – LISP was the common example – Dynamic scope relates to program execution trace

Perl (the keyword local introduces dynamic scope) Scheme (top-level scope only is dynamic) $l = "global"; (define f (lambda () a)) ; defines a no-argument function which returns a sub A { local $l = "local"; (define a 3) ; bind a to 3 B(); } (f) ; calls f and returns 3 global (define a 4) (f) ; calls f and returns 4 sub B { print "$l\n"; } local global B(); A(); B();

CMSC 330 25 CMSC 330 26

Static vs. Dynamic Scope

Static scoping Dynamic scoping CMSC 330: Organization of – Local understanding of – Can be hard to Programming Languages function behavior understand behavior of functions – Know at compile-time – Requires finding name what each name refers bindings at runtime to Operational Semantics – Easier to implement (just – A bit trickier to keep a global table of implement stacks of variable/value bindings )

CMSC 330 27

Introduction Roadmap: Compilation of program

• So far we’ve looked at regular expressions, Grammar: Program: (Rec Des) Parsing: Postorder: SA X = 2 + 3 ast postfix A  id = E X 2 3 + = automata, and context-free grammars E  T+E | T = T  P * T | P – These are ways of defining sets of strings P  id | n | (E) X + – We can use these to describe what programs you 2 3 can write down in a language

• (Almost...) Compilation: Program semantics: Push X – I.e., these describe the syntax of a language What does program Push 2 mean Push 3 Add top 2 Assign top to • What about the semantics of a language? ? second

– What does a program “mean”? Is this correct? What is program supposed to do?

CMSC 330 29 CMSC 330 30

5 Operational Semantics Roadmap: Semantics of a program

• There are several different kinds of semantics Grammar: Program: (Rec Des) Parsing: Postorder: SA X = 2 + 3 ast postfix – Denotational: A program is a mathematical function A  id = E X 2 3 + = E  T+E | T = T  P * T | P – Axiomatic: Develop a logical proof of a program P  id | n | (E) X + • Give predicates that hold when a program (or 2 3 part) is executed

Program semantics: Compilation: • We will briefly look at operational semantics Push X = Push 2 Push 3 – A program is defined by how you execute it on a value Add top 2 X + mathematical model of a machine Assign top to 2 3 second

• We will look at a subset of OCaml as an example

CMSC 330 31 CMSC 330 32

Evaluation OCaml Programs

• We’re going to define a relation E  v • E ::= x | n | true | false | [] | if E then E else E – This means “expression E evaluates to v” | fun x = E | E E

• So we need a formal way of defining programs –xstands for any identifier and of defining things they may evaluate to –nstands for any integer – true and false stand for the two boolean values • We’ll use grammars to describe each of these –[]is the empty list – One to describe abstract syntax trees E – Using = in fun instead of -> to avoid some confusion later – One to describe OCaml values v

CMSC 330 33 CMSC 330 34

Values Grammars for Trees • We’re just using grammars to describe trees • v ::= n | true | false | [] | v::v E ::= x | n | true | false | [] | if E then E else E – n is an integer (not a string corresp. to an integer) | fun x = E | E E Given a program, we saw • Same idea for true, false, [] last time how to convert v ::= n | true | false | [] | v::v – v1::v2 is the pair with v1 and v2 it to an ast (e.g., type ast = recursive descent parsing) • This will be used to build up lists Id of string type value = • Notice: nothing yet requires v2 to be a list | Num of int Val_Num of int | Bool of bool – Important: Be sure to understand the difference | Val_Bool of bool | Nil between program text S and mathematical objects v. | Val_Nil |Ifofast*ast*ast • E.g., the text 3 evaluates to the mathematical number 3 | Val_Pair of value * | Fun of string * ast value – To help, we’ll use different colors and italics | App of ast * ast • This is usually not done, and it’s up to the reader to Goal: For any ast, we want an operational rule remember which is which to obtain a value that represents the execution

CMSC 330 35 CMSC 330 of ast 36

6 Operational Semantics Rules Operational Semantics Rules (cont’d)

• How about built-in functions? n  n

true  true ( + ) n m  n + m – We’re applying the + function false  false • (we put parens around it because it’s not in infix notation; will skip this from now on) []  [] • Ignore currying for the moment, and pretend we have multi- argument functions – On the right-hand side, we’re computing the mathematical sum; the left-hand side is source code – But what about + (+ 3 4) 5 ? • Each basic entity evaluates to the • We need recursion corresponding value

CMSC 330 37 CMSC 330 38

Rules with Hypotheses Error Cases

• To evaluate + E E , we need to evaluate E ,   1 2 1 E1 n E2 m then evaluate E , then add the results 2 + E E  n + m – This is call-by-value 1 2 E  n E  m • Because we wrote n, m in the hypothesis, we mean that they must 1 2 be integers  + E1 E2 n + m • But what if E1 and E2 aren’t integers? – E.g., what if we write + false true ? – This is a “natural deduction” style rule – It can be parsed, but we can’t execute it • We will have no rule that covers such a case – It says that if the hypotheses above the line hold, – Convention: If there is not rule to cover a case, then the expression is then the conclusion below the line holds erroneous – A program that evaluates to a stuck expression produces a run time • i.e., if E1 executes to value n and if E2 executes to value m, error in practice then +E1 E2 executes to value n+m

CMSC 330 39 CMSC 330 40

Trees of Semantic Rules Rules for If

• When we apply rules to an expression, we   E1 true E2 v actually get a tree if E then E else E  v – Corresponds to the recursive evaluation procedure 1 2 3 • For example: + (+ 3 4 ) 5   E1 false E3 v 3  344   if E1 then E2 else E3 v (+ 3 4)  7 5  5 •Examples + ( + 3 4) 5  12 – if false then 3 else 4  4 – if true then 3 else 4  3 • Notice that only one branch is evaluated CMSC 330 41 CMSC 330 42

7 Rule for :: Rules for Identifiers

  x  ??? E1 v1 E2 v2  :: E1 E2 v1::v2 • Let’s assume for now that the only identifiers are parameter names –Ex. (fun x = + x 3) 4 • So :: allocates a pair in memory – When we see x in the body, we need to look it up – So we need to keep some sort of environment • Are there any conditions on E and E ? 1 2 • This will be a map from identifiers to values – No! We will allow E2 to be anything – OCaml’s type system will disallow non-lists

CMSC 330 43 CMSC 330 44

Semantics with Environments Rules for Identifiers and Application

• Extend rules to the form A; E  v  no hypothesis means – Means in environment A, the program text E evaluates to v A; x A(x) “in all cases” • Notation: – We write • for the empty environment   – We write A(x) for the value that x maps to in A A; E2 v A, x:v; E1 v' – We write A, x:v for the same environment as A, except x is now v •xmight or might not have mapped to anything in A  A; ( (fun x = E1) E2) v' – We write A, A' for the environment with the bindings of A' added to and overriding the bindings of A • To evaluate a user-defined function applied to an – The empty environment can be omitted when things are clear, and in adding argument: other bindings to an empty environment we can write just those bindings if – Evaluate the argument (call-by-value) things are clear – Evaluate the function body in an environment in which the formal parameter is bound to the actual argument – Return the result

CMSC 330 45 CMSC 330 46

Example: (fun x = + x 3) 4 = ? Nested Functions

• This works for cases of nested functions – ...as long as they are fully applied •, x:4; x  4 •, x:4; 3  3 •; 4  4 •, x:4; + x 3  7 • But what about the true higher-order cases? – Passing functions as arguments, and returning •; (fun x = + x 3) 4  7 functions as results – We need closures to handle this case – ...and a closure was just a function and an environment – We already have notation around for writing both parts

CMSC 330 47 CMSC 330 48

8 Closures Revised Rule for Lambda

• Formally, we add closures (A, x.E) to values –Ais the environment in which the closure was created A; fun x = E  (A, x.E) –xis the parameter name –Eis the source code for the body • x will be discussed next time. Means a binding of x in E. • v ::= n | true | false | [] | v::v • To evaluate a function definition, create a closure when the function is created | (A, x.E) – Notice that we don’t look inside the function body

CMSC 330 49 CMSC 330 50

Revised Rule for Application Example

     A; E1 (A', x.E) A; E2 v •; (fun x = (fun y = + x y)) (•, x.(fun y = + x y)) A, A', x:v; E  v' •; 3  3  A; (E1 E2) v' x:3; (fun y = + x y)  (x:3, y.(+ x y)) •; (fun x = (fun y = + x y)) 3  (x:3, y.(+ x y)) • To apply something to an argument: – Evaluate it to produce a closure – Evaluate the argument (call-by-value) – Evaluate the body of the closure, in Let = (fun x = (fun y = + x y)) 3 • The current environment, extended with the closure’s environment, extended with the binding for the parameter CMSC 330 51 CMSC 330 52

Example (cont’d) Why Did We Do This? (cont’d)

• Operational semantics are useful for •;  (x:3, y.(+ x y)) – Describing languages • Not just OCaml! It’s pretty hard to describe a big language  like C or Java, but we can at least describe the core •; 4 4 components of the language x:3, y:4; (+ x y)  7 – Giving a precise specification of how they work • Look in any language standard – they tend to be vague in •; ( 4 ) 7 many places and leave things undefined – Reasoning about programs • We can actually prove that programs do something or don’t do something, because we have a precise definition of how they work

CMSC 330 53 CMSC 330 54

9