Names, Bindings, Types and Scopes
Matt Evett
Dept. Computer Science
Eastern Michigan Univ.
(adapted from Sebesta's slides)

Names (Identifiers)
• Design issues
  – Maximum length?
  – Are connector characters allowed?
  – Are names case sensitive?
  – Are special words reserved words or keywords?
• Length
  – FORTRAN I: maximum 6
  – COBOL: maximum 30
  – FORTRAN 90 and ANSI C: maximum 31
  – Ada: no limit, and all are significant
  – C++: no limit, but implementors often impose one
• Connectors
  – Pascal, Modula-2, and FORTRAN 77 don't allow
  – Others do

Identifier Case Sensitivity
• Disadvantage: readability (names that look alike are different)
• Worse in Modula-2 because predefined names are mixed case (e.g. WriteCard)
• C, C++, Java, and Modula-2 names are case sensitive
• The names in other languages are not

Special Identifiers
• Def: A keyword is a word that is special only in certain contexts
  – Example: Fortran's REAL APPLE vs. REAL = 3.4
  – Disadvantage: poor readability
• Def: A reserved word is a special word that cannot be used as a user-defined name
  – C's switch, case, etc.

Variables
• A variable is an abstraction of a memory cell
• Variables can be characterized as a sextuple of attributes:
  – name, address, value, type, lifetime, and scope
  – Name - not all variables have them
  – Address - the memory address with which it is associated.
  – Value - the contents of the location with which the variable is associated

Addresses
• Abstract memory cell - the physical cell or collection of cells associated with a variable
  – The l-value of a variable is its address
  – The r-value of a variable is its value
• A variable may have different addresses at different times during execution.
• A variable may have different addresses at different places in a program
• If two variable names can be used to access the same memory location, they are called aliases

Aliases
• Creating aliases:
  – Pointers, reference variables, Pascal variant records, C and C++ unions, and FORTRAN EQUIVALENCE (and through parameters - discussed in Ch 8)
• Some of the original justifications for aliases are no longer valid; e.g. memory reuse in FORTRAN. Replace them with dynamic allocation

Variable Type & Value
• Determines the range of values of variables and the set of operations that are defined for values of that type; in the case of floating point, type also determines the precision
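Returning to aliases: two of the alias-creating mechanisms listed on the Aliases slide, pointers and reference variables, can be sketched in C++ as follows. This is only an illustration; the function name alias_demo and the variable names are made up for the example.

```cpp
#include <cassert>

// Hypothetical demo: modifies one memory cell through two different aliases
// and returns the final value stored in it.
int alias_demo() {
    int total = 1;
    int& ref = total;   // reference variable: a second name for the same cell
    int* ptr = &total;  // pointer: *ptr designates the same cell as well

    ref = 5;            // changes total
    *ptr += 2;          // changes total again

    // All three names share one address (one l-value) ...
    assert(&ref == &total && ptr == &total);
    // ... and therefore one r-value.
    return total;
}
```

Calling alias_demo() yields 7 even though total is never assigned directly after its initialization, which is the readability problem aliases cause.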

Binding
• Def: A binding is an association, such as between an attribute and an entity, or between an operation and a symbol
• Def: Binding time is the time at which a binding takes place.

Binding Times
• Language design time--e.g., bind operator symbols to operations
• Language implementation time--e.g., bind floating point type to an internal representation
• Compile time--e.g., bind a variable to a type in C or Java
• Load time--e.g., bind a FORTRAN 77 variable to a memory cell (or a C static variable)
• Runtime--e.g., bind a nonstatic local variable to a memory cell

Types of Bindings
• Def: A binding is static if it occurs before run time and remains unchanged throughout program execution.
• Def: A binding is dynamic if it occurs during execution or can change during execution of the program.

Typing Variables
• Type Bindings
  – How is a type specified?
  – When does the binding take place?
• If static, type may be specified by either an explicit or an implicit declaration
  – Def: An explicit declaration is a statement used for declaring the types of variables
  – Def: An implicit declaration is a default mechanism for specifying types of variables (at their first appearance in the program)

Example Typing
• FORTRAN, PL/I, BASIC, and Perl provide implicit declarations
  – Advantage: writability
  – Disadvantage: reliability (less trouble with Perl)
    • First char = $ for scalar, @ for array, etc.

Dynamic Type Binding
• Specified through an assignment statement
  – e.g. APL: LIST ← 2 4 6 8 vs. LIST ← 17.3
  – E.g. Lisp: (setq bob "hi") vs. (setq bob 3)
• Advantage: flexibility (generic program units)
• Disadvantages:
  – High cost (dynamic type checking and interpretation)
  – Type error detection by the compiler is difficult
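C++ binds types statically, so the LIST example can only be emulated there. The sketch below assumes C++17 and uses std::variant as a stand-in for a dynamically typed variable; the names Dyn, type_name and rebinding_demo are hypothetical. It shows where the run-time cost comes from: a type tag is stored with the value and must be inspected during execution.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a dynamically typed variable: the variant keeps
// a type tag next to the value, and the tag can change on assignment.
using Dyn = std::variant<double, std::string, std::vector<int>>;

// Run-time type inspection, the cost attributed to dynamic type binding.
std::string type_name(const Dyn& v) {
    if (std::holds_alternative<double>(v)) return "double";
    if (std::holds_alternative<std::string>(v)) return "string";
    return "vector<int>";
}

std::string rebinding_demo() {
    Dyn list = std::vector<int>{2, 4, 6, 8};  // like LIST <- 2 4 6 8
    std::string history = type_name(list);
    list = 17.3;                               // like LIST <- 17.3
    assert(std::holds_alternative<double>(list));
    history += " -> " + type_name(list);
    return history;
}
```

rebinding_demo() reports the two types the same variable was bound to in turn. A compiler cannot tell which of them is current at a given statement, which is why type errors are hard to detect before run time.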

Dynamic Binding via Inference
• Type Inferencing (e.g. ML, Miranda, and Haskell)
  – Rather than by assignment statement, types are determined from the context of the reference
  – E.g. ML: fun circ(r) = 3.1415 * r * r (r and the result are inferred to be real)
  – E.g. ML: fun circ(r) = 10 * r * r (r and the result are inferred to be int)

Storage Bindings
• Keeping track of binding of variables to their memory cells.
• Allocation - getting a cell from some pool of available cells
• Deallocation - putting a cell back into the pool
• Def: The lifetime of a variable is the time during which it is bound to a particular memory cell

Categories of Variables
• To speak of storage bindings, it is useful to categorize variables by their lifetimes:
  – Static variables, global variables
  – Stack-dynamic
  – Explicit heap-dynamic
  – Implicit heap-dynamic

Static Variables
• Bound to memory cells before execution begins and remain bound to the same memory cell throughout execution.
  – e.g. all FORTRAN 77 variables, C static variables
• Advantage: efficiency (direct addressing), history-sensitive subprogram support
• Disadvantage: lack of flexibility (no recursion)

Stack-Dynamic Variables
• Storage bindings are created for vars when their declaration statements are elaborated.
  – If scalar, all attributes except address are statically bound
  – e.g. local variables in Pascal and C
• Advantage: allows recursion; conserves storage
• Disadvantages:
  – Overhead of allocation and deallocation
  – Subprograms cannot be history sensitive
  – Inefficient references (indirect addressing)

Explicit Heap-Dynamic Variables
• Allocated and deallocated by explicit directives, specified by the programmer, which take effect during execution
  – Referenced only through pointers or references
  – e.g. dynamic objects in C++ (via new and delete), all objects in Java
• Advantage: provides for dynamic storage management
• Disadvantage: inefficient and unreliable
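The first three categories can be seen side by side in a short C++ sketch. The function names below are illustrative only and do not come from the slides.

```cpp
#include <cassert>

// Static variable: bound to one cell before execution and kept for the whole
// run, so the function is history sensitive.
int next_ticket() {
    static int issued = 0;
    return ++issued;
}

// Stack-dynamic variable: every activation gets a fresh cell for n, which is
// what makes recursion possible.
int factorial(int n) {
    assert(n >= 0);  // precondition of this sketch
    return n <= 1 ? 1 : n * factorial(n - 1);
}

// Explicit heap-dynamic variable: created by new, reachable only through a
// pointer, and destroyed by delete.
int heap_demo() {
    int* cell = new int(41);  // explicit allocation during execution
    *cell += 1;               // indirect reference through the pointer
    int result = *cell;
    delete cell;              // explicit deallocation
    return result;
}
```

Two successive calls of next_ticket() return 1 and then 2, because issued survives between calls. The local n in factorial does not survive, and forgetting the delete in heap_demo would leak the cell, which illustrates the "unreliable" remark on the slide.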

Implicit Heap-Dynamic Variables
• Allocation and deallocation caused by assignment statements. I.e., when a variable is assigned a value, its cell (and all attributes) are allocated
  – e.g. all variables in APL
• Advantage: flexibility
• Disadvantages:
  – Inefficient, because all attributes are dynamic
  – Loss of error detection

Type Checking
• Generalize the concept of operands and operators to include subprograms and assignments
• Def: Type checking is the activity of ensuring that the operands of an operator are of compatible types
• Def: A compatible type is one that is either legal for the operator, or is allowed under language rules to be implicitly converted, by compiler-generated code, to a legal type. This automatic conversion is called a coercion.
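A small C++ sketch of coercion follows; the function names are hypothetical. The first function shows a harmless widening coercion, the second a narrowing one that the language rules also accept, so a likely mistake passes type checking.

```cpp
#include <cassert>

// Coercion: the int operand is implicitly converted to double by
// compiler-generated code before the addition takes place.
double mixed_sum(int count, double rate) {
    return count + rate;
}

// The same mechanism accepts a probable mistake: the double result is
// silently truncated when it is stored in an int.
int truncated_average(int total, int n) {
    assert(n != 0);                                 // precondition
    double exact = static_cast<double>(total) / n;  // explicit conversion
    int stored = exact;                             // implicit double -> int
    return stored;
}
```

For example, truncated_average(7, 2) returns 3 rather than 3.5, and no type error is reported.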

Type Errors
• Def: A type error is the application of an operator to an operand of an inappropriate type
  – If all type bindings are static, nearly all type checking can be static
  – If type bindings are dynamic, type checking must be dynamic
• Def: A programming language is strongly typed if type errors are always detected

Strong Typing
• Advantage: allows the detection of the misuses of variables that result in type errors
• Languages:
  – FORTRAN 77 is not: parameters, EQUIVALENCE
  – Pascal is not: variant records
  – Modula-2 is not: variant records, WORD type
  – C and C++ are not: parameter type checking can be avoided; unions are not type checked
  – Ada is, almost (UNCHECKED CONVERSION is loophole) (Java is similar)
• Coercion rules can strongly weaken strong typing (C++ vs Ada)

Dynamic Type Binding
• Advantage of dynamic type binding: programming flexibility
• Disadvantages:
  – efficiency
  – late error detection (costs more)
• Ex: Lisp

Type Compatibility
• Def: Type compatibility by name means the two variables have compatible types if they are in either the same declaration or in declarations that use the same type name
  – Easy to implement but highly restrictive:
    • Subranges of integer types are not compatible with integer types
    • If function parameters are to be a structure type, T, that type must be declared in one, global location. Can't be declared in both formal and actual parameter lists (e.g. Pascal)

Compatibility by Structure
• Def: Type compatibility by structure means that two variables have compatible types if their types have identical structures
  – More flexible, but harder to implement

Problems with Structured Types
• Consider the problem of two structured types:
  – Suppose they are circularly defined
  – Are two record types compatible if they are structurally the same but use different field names?
  – Are two array types compatible if they are the same except that the subscripts are different? (e.g. [1..10] and [-5..4])
  – Are two enumeration types compatible if their components are spelled differently?
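The contrast between the two compatibility rules can be sketched in C++, which uses name compatibility for class types. The types Celsius, Fahrenheit and Speed and the function to_fahrenheit are made up for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// Two class types with identical structure. Under name compatibility they
// are distinct types, so one cannot be used where the other is expected.
struct Celsius    { double degrees; };
struct Fahrenheit { double degrees; };

// A type alias introduces no new type: Speed and double are interchangeable,
// which is how all same-structure types behave under structural compatibility.
using Speed = double;

static_assert(!std::is_convertible<Celsius, Fahrenheit>::value,
              "same structure, different names: not compatible");
static_assert(std::is_same<Speed, double>::value,
              "an alias names the same type as its target");

// The only way from one unit to the other is an explicit conversion.
Fahrenheit to_fahrenheit(Celsius c) {
    assert(c.degrees >= -273.15);  // below absolute zero is not a temperature
    return Fahrenheit{c.degrees * 9.0 / 5.0 + 32.0};
}
```

With these declarations, passing a Celsius value to a function that expects a Fahrenheit is rejected at compile time, whereas two Speed values in different units would mix freely.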

More Problems
• With structural type compatibility, you cannot differentiate between types of the same structure (e.g. different units of speed, both float)
  – See the Mars Climate Orbiter disaster! Fall 1999.

Example Compatibility
• Language examples:
  – Pascal: usually structure, but in some cases name is used (formal parameters)
  – C: structure, except for records
  – C++: name
  – Ada: restricted form of name
    • Derived (sub-)types allow types with the same structure to be different.
      – type celsius is new FLOAT; type fahrenheit is new FLOAT;
    • Anonymous types are all unique, even in: A, B : array (1..10) of INTEGER;

Scope
• Def: The scope of a variable is the range of statements over which it is visible.
• Def: The nonlocal variables of a program unit are those that are visible but not declared there.
• The scope rules of a language determine how references to names are associated with variables

Static Scope
• Based on program text (syntax)
  – To connect a name reference to a variable, the compiler must find the declaration.
  – Search process: search declarations, first locally, then in increasingly larger enclosing scopes, until one is found for the given name.
• Enclosing static scopes (to a specific scope) are called its static ancestors; the nearest static ancestor is called a static parent.

Nested Scopes
• Variables can be hidden (shadowed) from a unit by having a "closer" variable with the same name.
  – I.e., identifier refers to the variable with that name in the nearest static ancestor scope.
  – C++, Lisp and Ada allow access to shadowed variables.
    • C++ uses scope operator "::". E.g: ::x accesses the global x, rather than the local variable x.

Creating static scopes
• Blocks - a method of creating static scopes inside program units--from ALGOL 60
• Examples:
  – C and C++: "{" and "}"
      for (...) {
        int index;
        …
      }
  – Ada: "begin" and "end"
      declare LCL : FLOAT;
      begin
        ...
      end
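Shadowing, blocks, and the scope operator can be combined in one short C++ sketch. The variable values and the function name shadow_demo are chosen only for illustration.

```cpp
#include <cassert>

int x = 1;  // global x

int shadow_demo() {
    int x = 20;             // local x hides the global x
    int seen_in_block = 0;
    {
        int x = 300;        // the block is a new static scope; hides both
        seen_in_block = x;  // nearest declaration wins
    }                       // the block's x is no longer visible here
    assert(seen_in_block == 300);
    return x + ::x;         // local x plus the global x reached through ::
}
```

shadow_demo() returns 21: the unqualified x after the block is the function's local (20), and ::x reaches the shadowed global (1).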

Evaluating Static Scopes
Consider the PASCAL-like example:
• Lexical program structure (A is def'd within MAIN, etc.):
    MAIN
      A
        C
        D
      B
        E
• Scope tree: MAIN is the parent of A and B; A is the parent of C and D; B is the parent of E

Evaluating Static Scopes (2)
• Assume MAIN calls A and B, A calls C and D, B calls A and E
• Graph of desired potential callability
    [Figure: MAIN → A, B; A → C, D; B → A, E]
• Graph of actual potential callability
    [Figure: call graph over MAIN, A, B, C, D, E]
  – Danger!

Problems with Static Scoping
• Suppose the spec is changed so that D must now access some data in B
• Solutions:
  – Put D in B (but then C can no longer call it and D cannot access A's variables)
  – Move the data from B that D needs to MAIN (but then all procedures can access them)
• Same problem for procedure access!
• Overall: static scoping often encourages many globals (hack to provide access)

Dynamic Scope
• Based on program unit calling sequences, not their textual layout
  – temporal versus spatial scope resolution
• References to variables are connected to declarations by searching back through the chain of subprogram calls that forced execution to this point.
  – Lisp provides dynamic scoping via special declarations

Example: Dynamic Scoping, Lisp
• The function FIND-BIGGEST takes a list of positive integers, and returns a dotted pair consisting of the biggest and second-biggest integers in the list. FIND-BIGGEST uses REDUCE in conjunction with another function, BIGGESTYET, and a global variable (boo!!).

    (defvar second-biggest -1) ; used in BIGGESTYET

    (defun find-biggest (L)
      (setq second-biggest -1)
      (let ((result (reduce #'biggestYet L)))
        (cons result second-biggest)))

    (defun biggestYet (a b)
      (let ((max (if (< a b) b a))    ; larger of the pair
            (min (if (< a b) a b)))   ; smaller of the pair
        (if (> min second-biggest)
            (setq second-biggest min))
        max))

    USER(20): (find-biggest '(1 3 8 5 2 6 2))
    (8 . 6)

Example, (continued)
• Now, we will use dynamic scoping (a SPECIAL variable) to solve the same problem without a global variable. In effect, secondB is like a "temporary" global variable, that exists only within the lifetime of FIND-BIGGEST.

    (defun find-biggest (L)
      (let ((secondB -1))
        (declare (special secondB))   ; secondB is dynamically scoped
        (let ((result (reduce #'biggestYet L)))
          (cons result secondB))))

    (defun biggestYet (a b)
      (let ((max (if (< a b) b a))
            (min (if (< a b) a b)))
        (declare (special secondB))   ; refers to FIND-BIGGEST's binding
        (if (> min secondB)
            (setq secondB min))
        max))

Example: Imperative Example
    MAIN
      - declaration of x
        SUB1
          - declaration of x -
          ...
          call SUB2
          ...
        SUB2
          ...
          - reference to x -
          ...
      ...
      call SUB1
      ...
• MAIN calls SUB1
• SUB1 calls SUB2
• SUB2 uses x
• Static scoping - reference to x is to MAIN's x
• Dynamic scoping - reference to x is to SUB1's x

Evaluating Dynamic Scoping
• Evaluation of Dynamic Scoping:
  – Advantage: convenience
  – Disadvantage: poor readability
• Scope and lifetime are sometimes closely related, but are different concepts!!
  – Consider a static variable in a C or C++ function (its scope is the function body, but its lifetime is the whole program execution)

Referencing Environments
• Def: The referencing environment of a statement is the collection of all names that are visible in the statement
  – In a static scoped language, that is the local variables plus all of the visible variables in all of the enclosing scopes
  – See book example (p. 184)

Referencing Environments with Dynamic Scoping
• In a dynamic-scoped language, the referencing environment is the local variables plus all visible variables in all active subprograms
  – A subprogram is active if its execution has begun but has not yet terminated
  – See book example (p. 185)
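The MAIN / SUB1 / SUB2 example can be traced in C++ under both rules. C++ itself is statically scoped, so the dynamic half below is only an emulation: it assumes a hand-made stack of (name, value) bindings, one entry per declaration in a currently active subprogram. All function names and the bindings stack are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// --- Static scoping: what C++ itself does --------------------------------
int x = 1;                       // plays the role of MAIN's x

int sub2_static() { return x; }  // resolved from the program text: MAIN's x

int sub1_static() {
    int x = 2;                   // SUB1's x is not visible inside sub2_static
    (void)x;
    return sub2_static();
}

// --- Dynamic scoping: emulated with a stack of active bindings -----------
std::vector<std::pair<std::string, int>> bindings;

int lookup(const std::string& name) {
    // Search back from the most recently activated subprogram.
    for (auto it = bindings.rbegin(); it != bindings.rend(); ++it) {
        if (it->first == name) return it->second;
    }
    assert(false && "name is not bound in any active subprogram");
    return 0;
}

int sub2_dynamic() { return lookup("x"); }

int sub1_dynamic() {
    bindings.emplace_back("x", 2);   // SUB1 becomes active and declares x
    int result = sub2_dynamic();
    bindings.pop_back();             // SUB1 terminates; its x disappears
    return result;
}

int main_dynamic() {
    bindings.emplace_back("x", 1);   // MAIN declares x
    int result = sub1_dynamic();
    bindings.pop_back();
    return result;
}
```

sub1_static() returns 1 (MAIN's x) while main_dynamic() returns 2 (SUB1's x). The contents of bindings at the moment of the lookup correspond to the referencing environment of a dynamic-scoped language.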

Named Constants
• Def: A named constant is a variable bound to a value only at the time it is bound to storage
  – Advantages: readability and modifiability
• The binding of values to named constants can be either static (called manifest constants) or dynamic
• Languages:
  – Pascal: literals only
  – Modula-2 and FORTRAN 90: constant-valued expressions
  – Ada, C++, and Java: expressions of any kind

Variable Initialization
• Def: The binding of a variable to a value at the time it is bound to storage is called initialization
• Initialization is often done on the declaration statement
  – e.g., Ada
      SUM : FLOAT := 0.0;
  – C++:
      int foo = 1;
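Both kinds of value binding for named constants, plus initialization on the declaration, can be sketched in C++. The names kMaxStudents, seats_left and sum_to are invented for the example.

```cpp
#include <cassert>

// Static binding of value (a manifest constant): known at compile time.
constexpr int kMaxStudents = 30;
static_assert(kMaxStudents == 30, "value is available to the compiler");

// Dynamic binding of value: the constant is bound when its declaration is
// elaborated at run time, from an arbitrary expression, and is fixed after.
int seats_left(int enrolled) {
    assert(enrolled >= 0);                          // precondition
    const int remaining = kMaxStudents - enrolled;  // bound once per call
    // remaining = 0;  // would not compile: remaining is a named constant
    return remaining;
}

// Initialization on the declaration statement, as in `int foo = 1;`.
int sum_to(int n) {
    int sum = 0;  // bound to a value at the time it is bound to storage
    for (int i = 1; i <= n; ++i) sum += i;
    return sum;
}
```

Changing the single definition of kMaxStudents updates every use, which is the modifiability advantage named on the slide.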