NAMES, SCOPES AND BINDING: A REVIEW OF THE CONCEPTS

Name Binding and Binding Time

• Name binding is the association of objects (data and/or code) with names (identifiers):
      Shape S = new Shape();
• The binding of a program element to a particular property is the choice of that property from a set of possible properties
• Bindings and binding times are properties of program elements that are determined by the definition of the language or by its implementation
• The time during program formulation or processing when this choice is made is the binding time
• There are many classes of bindings in programming languages, as well as many different binding times

Binding Time

Binding times:
• Run time (execution time):
  o On entry to a subprogram or block
    • Binding of formal to actual parameters
    • Binding of formal parameters to storage locations
  o At arbitrary points during execution
    • Binding of variables to values
  o Dynamic binding


• Compile time (static time)
  o Declarations (programmer action)
    • Variable names
    • Variable types
    • Program statement structure
  o Compiler action
    • Relative location of data objects
  o Linker actions
    • Relative location of different object modules

• The sum operator (+) is bound at compilation time (depending on the type of the operands, because of overloading)
  o If x is declared integer, + means one thing
  o If x is declared real, + means something else
  o + can also be overloaded by the programmer

  o Example (C++): it is possible to specify that + operates on strings:

        // Returns the concatenation without modifying either operand.
        string operator+(const string& a, const string& b) {
            return string(a).append(b);
        }
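As a small sketch of the same idea, the functions below each use +, and the operation that + denotes is chosen at compile time from the static operand types. The Vec2 type and the add_* function names are hypothetical, introduced only for this illustration.

```cpp
#include <string>

// Hypothetical user-defined type, used only to illustrate overloading.
struct Vec2 {
    double x;
    double y;
};

// Programmer-supplied overload: + on two Vec2 values adds componentwise.
Vec2 operator+(const Vec2& a, const Vec2& b) {
    return Vec2{a.x + b.x, a.y + b.y};
}

// Each + below is bound to a different operation at compile time,
// according to the static types of its operands.
int add_ints(int a, int b) { return a + b; }              // integer addition

double add_reals(double a, double b) { return a + b; }    // floating-point addition

std::string add_strings(const std::string& a, const std::string& b) {
    return a + b;                                         // string concatenation
}

Vec2 add_vecs(const Vec2& a, const Vec2& b) {
    return a + b;                                         // the Vec2 overload above
}
```

No run-time test is needed to pick the operation: changing the declared type of an operand changes which + the compiler selects.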


    Shape s = new Shape();
    s.getArea(); // The compiler can resolve this method call statically.

    public void MakeSomeFoo(object a) {
        // Things happen...
        ((Shape) a).getArea(); // You won't know if this works until runtime!
    }
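A rough C++ counterpart of the cast above, as a sketch: dynamic_cast performs the check at run time and yields a null pointer when the object is not a Shape. The class names Object, Shape and Text and the function areaOrMinusOne are assumptions made for this example, and the area value is a placeholder.

```cpp
// Hypothetical class hierarchy, mirroring the Shape example above.
struct Object {
    virtual ~Object() = default;   // polymorphic base, required by dynamic_cast
};

struct Shape : Object {
    double getArea() const { return 1.0; }   // placeholder area
};

struct Text : Object {};

// Returns the area if `a` really is a Shape, or -1.0 otherwise.
// The compiler cannot decide this: it depends on the object that
// `a` is bound to at run time.
double areaOrMinusOne(const Object& a) {
    if (const Shape* s = dynamic_cast<const Shape*>(&a)) {
        return s->getArea();
    }
    return -1.0;
}
```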

Binding Time: discussion

• Many of the most important and subtle differences between programming languages involve differences in binding time
• The trade-off is between static analysis, efficient execution and flexibility
• The language comes with a type system. The compiler assigns a type expression to parts of the source program and checks that the type usage in the program conforms to the type system of the language
• When efficiency is a consideration (Fortran, C), languages are designed so that as many bindings as possible are performed during translation
• Where flexibility is the prime determiner, bindings are delayed until execution time so that they may be made data dependent

Dynamic Dispatch

Dynamic dispatch allows the code executed when a message is sent to an object (o.m(x)) to be determined by run-time values.

    class Shape { ... void draw() { ... } }
    class Circle extends Shape { ... void draw() { ... } }
    class Square extends Shape { ... void draw() { ... } }
    ...
    Shape s = ...; // could be a circle, a square, or something else
    s.draw();

Invoking s.draw() could run the code for any of the methods shown in the program (or for any other class that extends Shape).
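The same example can be sketched in C++, where draw must be declared virtual to be dispatched dynamically. As an assumption made for testability, draw returns a string instead of drawing anything; render and renderAll are hypothetical helper names.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string draw() const { return "shape"; }
};

struct Circle : Shape {
    std::string draw() const override { return "circle"; }
};

struct Square : Shape {
    std::string draw() const override { return "square"; }
};

// The static type of `s` is Shape; which draw() runs depends on the
// dynamic type of the object that `s` refers to.
std::string render(const Shape& s) { return s.draw(); }

// Hypothetical helper: draw every shape in a heterogeneous collection.
std::string renderAll(const std::vector<std::unique_ptr<Shape>>& shapes) {
    std::string out;
    for (const auto& s : shapes) {
        out += s->draw();
        out += ' ';
    }
    return out;
}
```

The call site in render is compiled once, yet it can run code from classes written after it, as long as they derive from Shape.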

In Java, all methods except static methods are dispatched dynamically (private methods and constructors are also bound statically). In C++, only virtual member functions are dispatched dynamically. Note that dynamic dispatch is not the same as overloading, which is usually resolved using the static types of the arguments to the function being called.

Object lifetime

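Events along program execution time, in order:
• Creation of an object (start of the object lifetime)
• Creation of a binding (start of the binding lifetime)
• Destruction of a binding (end of the binding lifetime)
• Destruction of an object (end of the object lifetime)
A dangling reference arises if the last two times are interchanged, that is, if the object is destroyed while a binding to it still exists.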
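These events can be observed in C++ with std::shared_ptr and std::weak_ptr: a weak_ptr is a binding that may outlive its object, and it can report that the object is gone instead of dangling. This is only a sketch; LifetimeReport, isLive and observeLifetimes are hypothetical names.

```cpp
#include <memory>

// Returns true if the binding `name` still refers to a live object.
bool isLive(const std::weak_ptr<int>& name) {
    return !name.expired();
}

// Records whether a binding was live before and after its object died.
struct LifetimeReport {
    bool liveBeforeDestruction;
    bool liveAfterDestruction;
};

// Creates an object, binds a second name to it, destroys the object
// first, and reports what the surviving binding observes.
LifetimeReport observeLifetimes() {
    std::shared_ptr<int> owner = std::make_shared<int>(42); // object created
    std::weak_ptr<int> name = owner;                        // binding created
    LifetimeReport report{};
    report.liveBeforeDestruction = isLive(name);
    owner.reset();                                          // object destroyed
    report.liveAfterDestruction = isLive(name);             // binding outlived it
    return report;
}
```

With a raw pointer in place of the weak_ptr, the second check would not be possible and the binding would simply dangle.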

Dangling References

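    int* p = new int;
    int* q = new int;

    // things happen to p and q (for example, q = p;)
    delete p;
    // other things happen
    use(q); // dangling reference if q still aliases the object deleted through p

Dangling References (Cont.)

    #include <stdio.h>

    int *call(void);

    int main(void) {
        int *ptr;
        ptr = call();
        printf("%d", *ptr); /* undefined behavior: x no longer exists */
        return 0;
    }

    int *call(void) {
        int x = 25;
        ++x;
        return &x; /* returns the address of a local variable */
    }

Storage Management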

• Programming languages provide three storage allocation mechanisms
  o Static: absolute address retained throughout the program's execution
  o Stack: dynamic allocation with calls and returns
  o Heap: allocated and de-allocated at arbitrary times
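A minimal C++ sketch showing one instance of each mechanism. The names callCount, countDown and makeOnHeap are hypothetical.

```cpp
#include <memory>

int callCount = 0;   // static allocation: one fixed location for the whole run

// `depth` and `local` live in this invocation's stack frame and
// disappear when the invocation returns.
int countDown(int depth) {
    ++callCount;
    int local = depth;                 // stack allocation
    if (depth == 0) return local;
    return countDown(depth - 1);
}

// Heap allocation: the int outlives the call that created it and is
// released when the returned unique_ptr is destroyed.
std::unique_ptr<int> makeOnHeap(int value) {
    return std::make_unique<int>(value);
}
```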

Static Allocation

• Global variables
• Constants
  o manifest, declared (parameter variables in Fortran) or identified by the compiler
• Variables identified as const in C can be a function of non-constants; in that case they cannot be statically allocated
• Constant tables generated by the compiler for debugging and other purposes

• In the absence of recursion, all variables can be statically allocated
• The following can also be statically allocated:
  o Arguments and return values (or their addresses). Allocation can be in processor registers rather than in memory
  o Temporaries
  o Bookkeeping information
    • return address
    • saved registers
    • debugging information

Stack-based Allocation

• Needed when the language permits recursion
• Useful in languages without recursion because it can save space
• Each subroutine invocation creates a frame or activation record
  o arguments
  o return address
  o local variables
  o temporaries
  o bookkeeping information
• Stack maintained by
  o calling sequence
  o prologue
  o epilogue
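Why recursion needs a frame per invocation can be seen by forcing a local into a single static cell. In this sketch (sumStaticLocal and sumStackLocal are hypothetical names), both functions are meant to compute 1 + 2 + ... + n, but the first keeps its local in static storage shared by all activations.

```cpp
// The local is kept in one statically allocated cell. Every activation
// shares `saved`, so the recursive call overwrites the caller's value.
int sumStaticLocal(int n) {
    static int saved;                   // one location for all invocations
    saved = n;
    if (n == 0) return 0;
    int rest = sumStaticLocal(n - 1);   // leaves `saved` at 0
    return saved + rest;
}

// Same function with an ordinary local: each activation record
// holds its own copy of `saved`.
int sumStackLocal(int n) {
    int saved = n;                      // lives in this invocation's frame
    if (n == 0) return 0;
    int rest = sumStackLocal(n - 1);
    return saved + rest;
}
```

For n = 3 the stack version yields 6, while the static version yields 0 because the innermost call has already stored 0 in the shared cell by the time any addition happens.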

A recursive function

    // Computes a * b by repeated addition; assumes b >= 0.
    int Func ( /* in */ int a, /* in */ int b )
    {
        int result = 0;
        if ( b == 0 )              // base case
            result = 0;
        else if ( b > 0 )          // first general case
            result = a + Func ( a , b - 1 ) ;
        return result;
    }


Run-Time Stack Activation Records

    x = Func(5, 2); // original call at instruction 100

1. The original call at instruction 100 pushes the record for Func(5,2):

    Record      FCTVAL   result               b   a   Return Address
    Func(5,2)   ?        ?                    2   5   100

2. The call in Func(5,2) code at instruction 50 pushes the record for Func(5,1):

    Record      FCTVAL   result               b   a   Return Address
    Func(5,1)   ?        ?                    1   5   50
    Func(5,2)   ?        5+Func(5,1) = ?      2   5   100

3. The call in Func(5,1) code at instruction 50 pushes the record for Func(5,0):

    Record      FCTVAL   result               b   a   Return Address
    Func(5,0)   ?        ?                    0   5   50
    Func(5,1)   ?        5+Func(5,0) = ?      1   5   50
    Func(5,2)   ?        5+Func(5,1) = ?      2   5   100

4. The record for Func(5,0) is popped first with its FCTVAL:

    Record      FCTVAL   result               b   a   Return Address
    Func(5,0)   0        0                    0   5   50
    Func(5,1)   ?        5+Func(5,0) = ?      1   5   50
    Func(5,2)   ?        5+Func(5,1) = ?      2   5   100

5. The record for Func(5,1) is popped next with its FCTVAL:

    Record      FCTVAL   result               b   a   Return Address
    Func(5,1)   5        5+Func(5,0) = 5+0    1   5   50
    Func(5,2)   ?        5+Func(5,1) = ?      2   5   100

6. The record for Func(5,2) is popped last with its FCTVAL:

    Record      FCTVAL   result               b   a   Return Address
    Func(5,2)   10       5+Func(5,1) = 5+5    2   5   100
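The push and pop sequence above can be reproduced with an explicit stack of records. This is a sketch under the assumption b >= 0; ActivationRecord and funcWithExplicitStack are hypothetical names, and the return addresses 100 and 50 are carried along only to mirror the trace.

```cpp
#include <vector>

// One activation record, with the fields shown in the trace above.
struct ActivationRecord {
    int a;
    int b;
    int returnAddress;   // 100 for the original call, 50 for recursive calls
    int result;
};

// Evaluates Func(a, b) with an explicit stack instead of recursion.
// `maxDepth`, if non-null, receives the largest number of records
// that were on the stack at once. Assumes b >= 0.
int funcWithExplicitStack(int a, int b, int* maxDepth = nullptr) {
    std::vector<ActivationRecord> stack;

    // Push phase: one record per call, until the base case b == 0.
    stack.push_back({a, b, 100, 0});
    while (stack.back().b > 0) {
        stack.push_back({a, stack.back().b - 1, 50, 0});
    }
    if (maxDepth != nullptr) *maxDepth = static_cast<int>(stack.size());

    // Pop phase: the base-case record is popped with FCTVAL 0, then each
    // caller adds `a` to the FCTVAL it received and is popped in turn.
    int fctval = 0;
    stack.pop_back();
    while (!stack.empty()) {
        stack.back().result = stack.back().a + fctval;
        fctval = stack.back().result;
        stack.pop_back();
    }
    return fctval;
}
```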

Heap-based Allocation

• Region of storage in which blocks of memory can be allocated and de-allocated at arbitrary times
• Because they are not allocated on the stack, the lifetime of objects allocated in the heap is not confined to the subroutine where they are created
  o They can be assigned to parameters (or to components of objects accessed via pointers by parameters)
  o They can be returned as the value of the subroutine/function/method

Heap-based Allocation (Cont.)

• Several strategies to manage space in the heap
• Fragmentation
  o Internal fragmentation: the space allocated is larger than needed
  o External fragmentation: allocated blocks are scattered through the heap. The total space available might be more than requested, but no single block has the needed size

• One approach to maintaining the free memory space is to use a free list
• Two strategies to find a block for a given request
  o First fit: use the first block in the list large enough to satisfy the request
  o Best fit: search the list to find the smallest block that satisfies the request
• The free list could be organized as an array of free lists, where each list in the array contains blocks of the same size
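The two search strategies can be sketched over a free list represented, as a simplifying assumption, by a vector of block sizes in list order. The function names firstFit and bestFit are hypothetical; both return the index of the chosen block, or -1 if no block is large enough.

```cpp
#include <cstddef>
#include <vector>

// First fit: the first block in list order that is large enough.
int firstFit(const std::vector<std::size_t>& freeList, std::size_t request) {
    for (std::size_t i = 0; i < freeList.size(); ++i) {
        if (freeList[i] >= request) return static_cast<int>(i);
    }
    return -1;
}

// Best fit: the smallest block that is large enough (whole list scanned).
int bestFit(const std::vector<std::size_t>& freeList, std::size_t request) {
    int best = -1;
    for (std::size_t i = 0; i < freeList.size(); ++i) {
        if (freeList[i] < request) continue;
        if (best == -1 || freeList[i] < freeList[static_cast<std::size_t>(best)]) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

With free blocks of sizes 8, 32, 16 and 64 and a request of 12, first fit picks the 32-byte block while best fit picks the 16-byte one, at the cost of scanning the whole list.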

Cells and Liveness

• Cell = data item in the heap
  o Cells are “pointed to” by pointers held in registers, stack, global/static memory, or in other heap cells
• Roots: registers, stack locations, global/static variables
• A cell is live if its address is held in a root or held by another live cell in the heap

Garbage

• Garbage is a block of heap memory that cannot be accessed by the program
  o An allocated block of heap memory does not have a reference to it (the cell is no longer “live”)
• Garbage collection (GC): automatic management of dynamically allocated storage
  o Reclaim unused heap blocks for later use by the program


Example of Garbage

    class node {
        int value;
        node next;
    }
    node p, q;

    p = new node();
    q = new node();
    q = p;     // the node originally referenced by q is now garbage
    delete p;  // q is now a dangling reference

The Perfect Garbage Collector

• No visible impact on program execution
• Works with any program and its data structures
  o For example, handles cyclic data structures
• Collects garbage (and only garbage) cells quickly
  o Incremental; can meet real-time constraints
• Has excellent spatial locality of reference
  o No excessive paging, no negative cache effects
• Manages the heap efficiently
  o Always satisfies an allocation request and does not fragment


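Reference Counting: Example

[Figure: heap space and root set; each heap cell is labeled with its reference count.]

Reference Counting: Cycles

[Figure: heap space and root set; a cycle of cells that is no longer reachable from the root set keeps nonzero reference counts, which causes a memory leak.]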
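The cycle problem can be reproduced with std::shared_ptr, which is a reference-counting scheme. In this sketch, Node, liveNodes, makeCountedCycle and makeWeakCycle are hypothetical names; liveNodes counts the objects that have been constructed and not yet destroyed.

```cpp
#include <memory>

int liveNodes = 0;   // number of Node objects currently alive

struct Node {
    std::shared_ptr<Node> next;   // owning (counted) reference
    std::weak_ptr<Node> back;     // non-owning reference, not counted
    Node() { ++liveNodes; }
    ~Node() { --liveNodes; }
};

// Builds a two-node cycle out of counted references and drops the roots.
// Each node still has count 1 afterwards, so neither is reclaimed.
// The returned weak_ptr lets the caller inspect or break the cycle.
std::weak_ptr<Node> makeCountedCycle() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->next = b;
    b->next = a;   // cycle: a -> b -> a
    return a;
}

// Same shape, but the back edge is weak, so the counts reach zero
// when the local roots go out of scope.
void makeWeakCycle() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->next = b;
    b->back = a;   // does not contribute to a's count
}
```

After makeCountedCycle returns, both nodes are unreachable from any root yet still alive; resetting one of the counted links is what finally lets the counts drop to zero.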

Mark-Sweep Example

[Figures (1) to (4): heap space and root set. In the last step, unmarked cells are freed and the mark bit of marked cells is reset.]
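The two phases of mark-sweep can be sketched over a toy heap in which cells refer to each other by index. This is an illustration under that assumption, not a real collector; Cell, mark and sweep are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// A heap cell: outgoing pointers are indices of other cells.
struct Cell {
    std::vector<std::size_t> pointsTo;
    bool marked = false;
    bool allocated = true;
};

// Mark phase: mark every cell reachable from the roots.
void mark(std::vector<Cell>& heap, const std::vector<std::size_t>& roots) {
    std::vector<std::size_t> worklist(roots.begin(), roots.end());
    while (!worklist.empty()) {
        std::size_t i = worklist.back();
        worklist.pop_back();
        if (heap[i].marked) continue;   // already visited (handles cycles)
        heap[i].marked = true;
        for (std::size_t j : heap[i].pointsTo) worklist.push_back(j);
    }
}

// Sweep phase: free unmarked cells and reset the mark bit of marked cells.
// Returns the number of cells freed.
std::size_t sweep(std::vector<Cell>& heap) {
    std::size_t freed = 0;
    for (Cell& c : heap) {
        if (!c.allocated) continue;
        if (c.marked) {
            c.marked = false;
        } else {
            c.allocated = false;
            c.pointsTo.clear();
            ++freed;
        }
    }
    return freed;
}
```

Unlike reference counting, an unreachable cycle is reclaimed here, because liveness is decided by reachability from the roots rather than by counts.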

Generational Garbage Collection

• Observation: most cells that die, die young
  o Nested scopes are entered and exited more frequently, so temporary objects in a nested scope are born and die close together in time
• Divide the heap into generations, and GC the younger cells more frequently
  o Amortize the cost across generations

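Example with Immediate “Aging”

[Figure (1): the root set and heap cells A to G; A, B, C, D and E are in the young generation, F and G in the old generation.]

[Figure (2): A and B have moved to the old generation alongside F and G; C, D and E remain in the young generation.]

Generations with Semi-Spaces

[Figure: the root set and a heap divided into a youngest generation, middle generation(s) and an oldest generation; the youngest and oldest generations are each split into a from-space and a to-space.]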

SCOPE

Variable

• A variable is a location (AKA reference) that can be associated with a value
• Obtaining the value associated with a variable is called dereferencing, and creating or changing the association is called assignment
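One common way to model these definitions, sketched here as an assumption rather than as the course's formal model, uses an environment that maps names to locations and a store that maps locations to values. Machine and its member names are hypothetical, and values are plain ints to keep the sketch small.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Environment: name -> location. Store: location -> value.
struct Machine {
    std::map<std::string, std::size_t> env;
    std::vector<int> store;

    // Bind a new name to a fresh location holding `initial`.
    void declare(const std::string& name, int initial) {
        store.push_back(initial);
        env[name] = store.size() - 1;
    }

    // Bind `alias` to the location already bound to `name`.
    void aliasTo(const std::string& alias, const std::string& name) {
        env[alias] = env.at(name);
    }

    // Assignment: change the value associated with the variable.
    void assign(const std::string& name, int value) {
        store[env.at(name)] = value;
    }

    // Dereferencing: obtain the value associated with the variable.
    int deref(const std::string& name) const {
        return store[env.at(name)];
    }
};
```

Because two names may be bound to the same location, an assignment through one name is visible when dereferencing the other.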

Semantics of Programming Languages

[Figure: formal model and graphical view. The environment (Env) maps the names x, y, z, s and fun to locations, and the store maps locations to values such as 123, 43.21, 3.14 and “abc”.]

Scope and extent

Typed Variables

• In statically-typed languages, a variable also has a type, meaning that only values of a given type can be stored in it
• In dynamically-typed languages, values, not variables, have types
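C++ is statically typed, but the dynamically typed situation can be imitated with std::variant, where each value carries a tag recording its type. The alias Value and the functions typeName, staticallyTyped and dynamicallyTyped are hypothetical names for this sketch.

```cpp
#include <string>
#include <variant>

// A dynamically typed value: the value itself records its type.
using Value = std::variant<int, double, std::string>;

// Name of the type currently carried by the value.
std::string typeName(const Value& v) {
    if (std::holds_alternative<int>(v)) return "int";
    if (std::holds_alternative<double>(v)) return "double";
    return "string";
}

// A statically typed variable: only ints can ever be stored in it.
// `int n = std::string("abc");` would be rejected by the compiler.
int staticallyTyped() {
    int n = 123;
    n = 456;
    return n;
}

// A variable of type Value can hold values of different types over time.
std::string dynamicallyTyped() {
    Value v = 123;
    std::string trace = typeName(v);
    v = 43.21;
    trace += "," + typeName(v);
    v = std::string("abc");
    trace += "," + typeName(v);
    return trace;
}
```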

Scope rules

• The region of the program in which a binding is active is its scope
• Most languages today are lexically scoped
• Some languages (e.g. Perl) have both lexical and dynamic scoping

Static Scope - Nested Subroutines

• In most languages any constants, types, variables or subroutines declared within a subroutine are not visible outside the subroutine
• Closest nested scope rule: a name is known in the scope in which it is declared, and in every scope nested inside it, unless it is hidden by another declaration of the same name
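The closest nested scope rule can be seen with nested blocks in C++. The function names innerView, middleView and outerView are hypothetical; each returns the value of the x (or an expression on the x) that is visible at that point.

```cpp
int x = 1;   // outermost declaration of x

int innerView() {
    int x = 2;            // hides the global x inside this function
    {
        int x = 3;        // hides the function-level x inside this block
        return x;         // closest enclosing declaration: 3
    }
}

int middleView() {
    int x = 2;
    {
        int y = x + 10;   // no x declared in this block: refers to x == 2
        return y;
    }
}

int outerView() {
    return x;             // no local x: refers to the global x == 1
}
```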

Static Scope - Nested Subroutines (2)

    procedure P1(A1: T1);
    var X: real;
    …
        procedure P2(A2: T2);
        …
            procedure P3(A3: T3);
            …
            begin
                … (* body of P3 *)
            end;
        …
        begin
            … (* body of P2 *)
        end;
        …
        procedure P4(A4: T4);
        …
            function F1(A5: T5) : T6;
            var X: integer;
            …
            begin
                … (* body of F1 *)
            end;
        …
        begin
            … (* body of P4 *)
        end;
    …
    begin
        … (* body of P1 *)
    end;

Static Scope - Nested Subroutines (3)

• To find the frames of the surrounding scopes where the desired data is stored, a static link can be used

    function A() {
        int I;
        function B() {
            int J;
            function C() {
                int K;
                K = I + J;
                B();
            }
            C();
        } // body of B.
        B();
    } // body of A.

A calls B calls C calls B calls C.
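The use of static links for K = I + J can be simulated with explicit frames. This sketch models a single chain A, B, C without the recursive call back to B; Frame, enclosing, runA, runB and runC are hypothetical names, and each frame holds just one local.

```cpp
// A frame with a static link to the frame of the lexically
// enclosing subroutine (nullptr for the outermost one).
struct Frame {
    const Frame* staticLink;
    int local;
};

// Follow `hops` static links outward from `f`.
const Frame* enclosing(const Frame* f, int hops) {
    while (hops-- > 0) f = f->staticLink;
    return f;
}

// C is nested in B, which is nested in A. C computes K = I + J.
int runC(const Frame* frameOfB) {
    Frame c{frameOfB, 0};
    int j = enclosing(&c, 1)->local;   // J is one level out (in B)
    int i = enclosing(&c, 2)->local;   // I is two levels out (in A)
    c.local = i + j;                   // K
    return c.local;
}

int runB(const Frame* frameOfA, int j) {
    Frame b{frameOfA, j};              // B's local J
    return runC(&b);
}

int runA(int i, int j) {
    Frame a{nullptr, i};               // A's local I
    return runB(&a, j);
}
```

The number of hops is known at compile time from the nesting depth of the use and of the declaration, so only the links themselves need to be followed at run time.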

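Static Scope - Nested Subroutines (5)

    { /* B1 */
        { /* B2 */
            { /* B3 */
                { /* B4 */
                }
            }
        }
    }

[Figure: the stack of activation records while block B4 executes, from top to bottom: ARB4, ARB3, ARB2, ARB1.]

    class Outer {
        final int x = 1;
        class Inner {
            //int x;
            void foo() { int y = x; } // x is Outer.x, found through the enclosing class
        }

        void bar() {
            Inner i = new Inner();
            int x; // this local x is not visible inside foo()
            i.foo();
        }
    }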

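[Figures: stack frame layouts for GNU MIPS, gcc MIPS and gcc x386.]

[Figure: a stack frame, listed from the top of the stack: local m, local m-1, ..., local 1, old fp, return addr, arg1, arg2, ..., argn, with the sp and fp registers pointing into it.]

Example

    int baz(int x, int y) {
        char buf[256];
        {
            int z = y + 1;
            x += z;
        }
        return x + y;
    }

Modules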

• Modularization depends on information hiding
• Functions and subroutines can be used to hide information. However, this is not flexible enough
• One reason is that persistent data is usually needed to create an abstraction. This can be addressed in some cases using statically allocated values
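A statically allocated value hidden inside a function can be sketched in C++ with a static local variable. The function name nextId is hypothetical.

```cpp
// The counter is persistent (statically allocated) but visible only
// inside nextId(): callers cannot read or reset it directly.
int nextId() {
    static int counter = 0;
    return ++counter;
}
```

The data is hidden, but only this one operation can ever touch it, which is the limitation of using a single subroutine as the unit of information hiding.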

Static Scope - Modules (Cont.)

• But modularization often requires a variety of operations on persistent data.

Static Scope - Modules (Cont.)

• Objects inside a module are visible to each other
• Objects inside can be hidden explicitly (using a keyword like private) or implicitly (objects are only visible outside if they are exported)
• In some languages, objects outside need to be imported to be visible within the module

Static Scope - Modules: Modula-2 examples

    VAR a,b: CARDINAL;
    MODULE M;
        IMPORT a; EXPORT w,x;
        VAR u,v,w: CARDINAL;
        MODULE N;
            IMPORT u; EXPORT x,y;
            VAR x,y,z: CARDINAL;
            (* x,u,y,z visible here *)
        END N;
        (* a,u,v,w,x,y visible here *)
    END M;
    (* a,b,w,x visible here *)

Modules as types

Dynamic scope

• In early Lisp systems, variables were bound dynamically rather than statically
• In a language with dynamic binding, free variables in a procedure get their values from the environment in which the procedure is called rather than the environment in which the procedure is defined

Symbol Tables

• Symbol tables are used to keep track of scope and binding information about names
• The symbol table is searched every time a name is encountered in the source text
• Changes occur when a new name or new information about a name is discovered
• The abstract syntax tree will contain pointers to the symbol table rather than the actual names used for objects in the source text

Symbol Tables (Cont.)

• Each symbol table entry contains
  o the symbol name
  o its category (scalar variable, array, constant, type, procedure, field name, parameter, etc.)
  o scope number
  o type (a pointer to another symbol table entry)
  o and additional, category-specific fields (e.g. rank and shape for arrays)
• To keep symbol table records uniform, it may be convenient for some of the information about a name to be kept outside the table entry, with only a pointer to this information stored in the entry

Symbol Tables (Cont.)

• The symbol table may contain the keywords at the beginning if the lexical scanner searches the symbol table for each name
• Alternatively, the lexical scanner can identify keywords using a separate table or by creating a separate final state for each keyword

Symbol Tables (Cont.)

• One of the important issues is handling static scope
• A simple solution is to create a symbol table for each scope and attach it to the node in the abstract syntax tree corresponding to the scope
• An alternative is to use an additional data structure to keep track of the scope. This structure would resemble a stack:

[Figure: a symbol table used as a stack, with top = 4 and entries for the names A, B, C and A, plus an additional scope_marker array indexed by the lexical level LL.]
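The stack-like organization can be sketched as follows. It is an illustration under simplifying assumptions: entries hold only a name and a category, lookup is a linear search, and exitScope must be matched by an earlier enterScope. Entry, SymbolTable and their member names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Entry {
    std::string name;
    std::string category;   // e.g. "variable", "procedure"
};

class SymbolTable {
public:
    // Remember where the entries of the new scope begin.
    void enterScope() { scopeMarker.push_back(table.size()); }

    // Discard every entry declared in the innermost scope.
    void exitScope() {
        table.resize(scopeMarker.back());
        scopeMarker.pop_back();
    }

    void declare(const std::string& name, const std::string& category) {
        table.push_back({name, category});
    }

    // Search from the top of the stack down, so the innermost
    // declaration of a name is found first. Returns nullptr if absent.
    const Entry* lookup(const std::string& name) const {
        for (std::size_t i = table.size(); i-- > 0;) {
            if (table[i].name == name) return &table[i];
        }
        return nullptr;
    }

private:
    std::vector<Entry> table;              // symbol table used as a stack
    std::vector<std::size_t> scopeMarker;  // where each open scope begins
};
```

Declaring A and B in an outer scope and then C and A in an inner scope makes lookup of A return the inner entry until exitScope pops it.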

Symbol Tables (Cont.)

• A hash table can be added to the previous data structure to accelerate the search
• Elements with the same name are linked from top to bottom
• The search starts at the entry of the hash table and proceeds through the linked list until the end of the list is reached (old_id) or until the linked list refers to an element below scope_marker(LL - 1) (new_id)


Association Lists and Central Reference Tables

The binding of referencing environments

    P1()
    { REAL X
        { /* B1 */
            { /* B2 */
                { /* B3 */
                    P2(P3)
                }
                P3()
                {
                    x
                }
            }
            P2(PX)
            {
                PX()
            }
        }
    }

[Figure: activation records on the stack while P3 runs, called through the formal parameter PX, from top to bottom: ARP3 (PX), ARP2, ARB3, ARB2, ARB1, ARP1.]
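Dynamic scope implemented with an association list can be sketched as follows: bindings are pushed on entry to a subroutine, popped on exit, and a name is resolved by searching from the newest binding backwards. AList, lookup, readX and caller are hypothetical names, and returning -1 for an unbound name is a simplification.

```cpp
#include <string>
#include <utility>
#include <vector>

// Association list: the most recent binding of a name is nearest the end.
using AList = std::vector<std::pair<std::string, int>>;

// Dynamic lookup: search from the newest binding to the oldest.
// Returns -1 if the name is unbound (a simplification for this sketch).
int lookup(const AList& alist, const std::string& name) {
    for (auto it = alist.rbegin(); it != alist.rend(); ++it) {
        if (it->first == name) return it->second;
    }
    return -1;
}

// readX has a free variable x: under dynamic scope it sees whatever
// binding of x is newest at the time of the call.
int readX(const AList& alist) { return lookup(alist, "x"); }

// caller binds its own x for the duration of its body, then calls readX.
int caller(AList& alist) {
    alist.emplace_back("x", 2);   // entering caller: push binding x = 2
    int seen = readX(alist);
    alist.pop_back();             // leaving caller: pop the binding
    return seen;
}
```

The same readX returns 1 when called directly with a global binding x = 1 and 2 when called from caller, which is the behavior that distinguishes dynamic from static scope.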