06-02552 Princ. of Progr. Languages (and “Extended”)
The University of Birmingham, School of Computer Science
Spring Semester 2019-20
© Uday Reddy 2019-20

Handout 10: Procedures and objects

The basic imperative language (see Handout 11) has only commands, variables and expressions. In this handout, we examine adding procedures and objects (actually classes).

1 Algol-like languages

Procedures can be added to the basic imperative language using the typed lambda calculus. This was in fact done in the first systematically designed programming language, Algol 60, defined by an international committee of computer scientists.[1] John Reynolds,[2] in 1981, proposed a systematic redesign of Algol based on typed lambda calculus and called it Idealized Algol. We follow his approach.

1. Typed lambda calculus. Recall that the types of a typed lambda calculus are given by the syntax:

    T ::= b | T1 → T2

where b ranges over basic types. For a functional programming language, we choose basic types such as Int, Bool etc. For an imperative programming language, we can choose basic types that represent state-based computations.

2. Lambda calculus for imperative programs. To obtain a typed imperative programming language, we pick basic types to be those representing imperative programming concepts (cf. Handout 11). These are:

• Mutable variables, also called references, representing storage locations.
• Expressions that read the state of variables and return a value (“state readers”).
• Commands that alter the state of variables (“state transformers”).

Variables, expressions and commands are not treated as types in typical imperative programming languages. Rather, they are designated as separate syntactic categories. However, to obtain the full power of typed lambda calculus, it is useful to regard them as types.

The basic data types such as int, bool, and char are not regarded as types of the lambda calculus.
This is because there are no terms in imperative programming languages that directly denote data values (except constants). Rather, terms denote either mutable variables or expressions, each of which might deal with values of particular data types. Let δ stand for data types such as int, bool, ... Then the basic types of the lambda calculus for imperative programs are:

• var[δ], also written as ref[δ], for variables that store δ-typed data values.
• exp[δ], for expressions that return δ-typed data values.
• comm, for commands.

In summary, the types of our lambda calculus for imperative programs are as follows:

    T ::= var[δ] | exp[δ] | comm | T1 → T2

[1] It is doubtful if the committee members knew lambda calculus fully, but they reinvented some of its ideas for themselves. Thus Algol had only part of lambda calculus, not the full calculus. Peter Landin showed the correspondence between the two procedure mechanisms a few years later. See Landin, Peter. A correspondence between ALGOL 60 and Church’s Lambda-notation: Part II, Communications of the ACM, March 1965.
[2] Reynolds, John. The essence of Algol, in Algorithmic Languages, North-Holland, 1981.

3. Terminology: variables, references and identifiers. Note that the term “variable” in imperative programming refers to storage locations whose values can be modified. In contrast, lambda calculus as well as standard mathematics use the term “variable” for a completely different concept, viz., symbols used to stand for values.

To avoid conflict between the two uses, Algol 68 introduced the term “reference” for mutable variables in the sense of imperative programming. The terminology did not catch on within the imperative language culture (except in isolated usages like “call by reference”). However, it became standard in functional programming culture. So, we use both the terms “variable” and “reference” to refer to this concept.
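As an informal illustration of the two senses of “variable”, and of the roles played by var[δ], exp[δ] and comm, the following Python sketch may help. It is not Idealized Algol. The names Var, x_plus_one, increment_x and double are hypothetical, and encoding expressions and commands as zero-argument functions is an assumption made only for this illustration.

```python
from typing import Generic, TypeVar

D = TypeVar("D")  # plays the role of a data type δ such as int or bool


class Var(Generic[D]):
    """A mutable variable (reference): a storage location, like var[δ]."""

    def __init__(self, initial: D) -> None:
        self.contents = initial


# Assumed encodings, for this sketch only:
#   exp[δ] ~ a function that reads the current state and returns a δ value
#   comm   ~ a function that alters the current state and returns nothing

x: Var[int] = Var(5)  # a "variable" in the imperative sense


def x_plus_one() -> int:
    """An expression (state reader): its value depends on the state of x."""
    return x.contents + 1


def increment_x() -> None:
    """A command (state transformer): it changes the state of x."""
    x.contents = x.contents + 1


# In the lambda-calculus sense, n below is a "variable": a symbol that
# stands for a value. It names no storage location of its own.
double = lambda n: n + n

before = x_plus_one()  # reads x while it holds 5
increment_x()          # x now holds 6
after = x_plus_one()   # the same expression now yields a different value
```

The same expression gives different results before and after the command runs, which is what “state-dependent” amounts to in this encoding, whereas double depends only on the value its parameter stands for.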
Algol 60 used the term “identifier” for what we call a variable in mathematics and the lambda calculus. So an “identifier” is a symbol, used for the formal parameters of functions and for naming various things, like functions, types, classes etc. For example, in the term λx. x + y of type exp[int] → exp[int], the symbol x is a bound identifier and y is a free identifier.

4. Constants for imperative programs. All the primitive operations of imperative programs are modelled as constants in our typed lambda calculus. We group them into four classes, for ease of exposition:

• Primitive operations for expressions. All the constants and primitive operations needed for data values are expressed as constants that act on exp types. Some examples are:

    0, 1, 2, ...   :: exp[int]
    true, false    :: exp[bool]
    +, -, ...      :: exp[int] → exp[int] → exp[int]
    =, <, ...      :: exp[int] → exp[int] → exp[bool]
    &&, ||         :: exp[bool] → exp[bool] → exp[bool]
    not            :: exp[bool] → exp[bool]

The only thing surprising about these types is that they involve the exp type constructor. We need the exp type constructor because, in general, the arguments for operations such as + are “expressions” which can read the state of variables. The result of such an application, e.g., x + y, is again an expression that is state-dependent.

• Primitive operations that deal with commands. These are as follows:

    skip  :: comm
    (;)   :: comm → comm → comm
    if    :: exp[bool] → comm → comm → comm

Note that (;) is an infix operator. For if, we will use the syntactic sugar:

    if B then C1 else C2    stands for    if B C1 C2

• Primitive operations that deal with variables. These are as follows:

    read  :: var[δ] → exp[δ]
    (:=)  :: var[δ] → exp[δ] → comm

The assignment operator (:=) is an infix operator. Note that the type implies that, if V is a variable and E an expression, then V := E is a command. Its effect is to evaluate E and assign its value to V. The read operation is normally left implicit in typical Algol-like languages.
But its type says that, if V is a variable, then read V is an expression. Its effect is to read the value of V and return it. Leaving the read operation implicit means that, whenever a variable is used in a position where an expression is expected, we automatically insert a read operator. For example, we write

    x := x + 1

where the variable x on the right hand side is used where an expression is needed. So, we understand it as:

    x := (read x) + 1

This convention is only used in what we traditionally call “imperative” languages. Functional languages like ML and Haskell make the read operation explicit.

• Finally, we have an operation for local variable declarations:

    local[δ] :: (var[δ] → comm) → comm

The effect of local[δ] B is to create a new local variable for δ-typed values, say V, and then execute B(V). After B(V) finishes, the local variable is deallocated. Since this form looks a little heavy in normal usage:

    local[int](λx. C)

we use the simpler notation:

    {local[int] x; C}

and understand it to have the same effect.

In summary, all the behaviour of imperative programs can be modelled using a few primitive functions in terms of the basic types var[δ], exp[δ] and comm.

5. Procedures. The procedures of Algol-like languages are mapped directly into the functions of lambda calculus. For example, the Algol 60 procedure declaration:

    procedure swap(var[int] x, var[int] y)
    { local[int] t; t := x; x := y; y := t }

is thought of as the definition of a function swap:

    let swap = λx. λy. {local[int] λt. t := read x; x := read y; y := read t}

The type of swap is:

    swap : var[int] → var[int] → comm

In general, a procedure is always a function that has comm as its codomain type.[3] A function that has an exp[δ] type as its codomain type is thought of as a “function procedure”, or sometimes just called a “function”. (But this is misleading because all procedures are functions.)

6. Semantics of procedure call.
Prior to Algol 60, the meaning of a procedure such as swap was understood operationally, in terms of machine instructions that would be executed. For a call such as swap i j, that story might run as follows:

1. Push references to the variables i and j on the system stack.
2. Push the program counter on the stack, and jump to the code of swap.
3. When the code of swap finishes, pop the arguments i and j as well as the saved program counter from the system stack, and jump back to the saved program counter position.

The definition of Algol 60 put paid to such operational descriptions. The semantics of a procedure call, as given in the Algol 60 Report, is to simply copy the body of the procedure to where the procedure call appears, and replace the formal parameters by the arguments, like so:

    ...                ...
    swap i j;    ⟹    {int t; t := read i; i := read j; j := read t}
    ...                ...

This semantics came to be known as the Algol copy rule. We might also call it procedure unfolding. Note that the copy rule is precisely the β-reduction rule of the lambda calculus.

[3] It is not useful to think of this as a function “returning” a command. The idea of “returning” results only makes sense in purely functional languages. Here it is better to think of swap as “mapping” two variables x and y into a command.

2 Objects and classes

Whereas in functional programming, data abstraction is achieved by abstracting over a type, in imperative programming it is more common to abstract over storage. The resulting abstractions are called objects. The behaviour of objects is defined via classes. Historically, objects and classes were introduced in the language Simula 67.[4] They were popularised by the language C++ in the 1980s and have been widely adopted since then.

An object encapsulates some amount of storage, represented by mutable variables or other objects, and provides operations that can be used by clients.
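The idea of an object as encapsulated storage plus operations can be sketched in Python as follows. This is only an illustration under assumptions: the names Var, make_counter, increment and current are hypothetical and do not come from the handout, and closures over a local variable stand in for a class mechanism.

```python
from typing import Callable, Tuple


class Var:
    """A mutable variable holding an int (a storage location)."""

    def __init__(self, initial: int = 0) -> None:
        self.contents = initial


def make_counter() -> Tuple[Callable[[], None], Callable[[], int]]:
    """Create a counter object: private storage plus two operations.

    The variable `count` is local to this call, so clients can reach it
    only through the two operations returned here.
    """
    count = Var(0)  # encapsulated storage (initial value 0 is an assumption)

    def increment() -> None:
        # Behaves like a command: it alters the hidden state.
        count.contents = count.contents + 1

    def current() -> int:
        # Behaves like an expression: it reads the hidden state.
        return count.contents

    return increment, current


# Two objects created from the same definition have separate storage.
inc_a, cur_a = make_counter()
inc_b, cur_b = make_counter()
inc_a()
inc_a()
inc_b()
```

Each call to make_counter allocates fresh storage, so cur_a and cur_b report independent counts, and neither client can touch the underlying variable directly.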