GEDANKEN-A Simple Typeless context of the language is permissible in any other mean- ingful context. In particular, functions and labels are per- Language Based on the mitted to be results of functions or values of references (e.g. variables), without imposing restrictions which main- Principle of Completeness and the tain a stack discipline for run-time storage allocation. (2) The Reference Concept. Assignment and indirect Reference Concept addressing are formalized in the following manner: among the possible values which may occur in a program are objects called references, which in turn possess other values. JOHN C. REYNOLDS The assignment operation always affects the relation be- Argonne National Laboratory,* Argonne, Illinois tween some reference and its value. Neither of these principles is novel. LISP [la and lb] (in its interpretive implementations), IswIM [2], and PAL GEDANKEN is an experimental with the [3] all satisfy the principle of completeness, and the refer- following characteristics. (1) Any value which is permitted in ence concept is used in ALGOL 68 [4] and BASEL [5]. But some context of the language is permissible in any other mean- GEDANKEN goes beyond these languages in exploiting the ingful context. In particular, functions and labels are permissible power of these principles, i.e. in eliminating other language results of functions and values of variables. (2) Assignment and features which are rendered redundant by completeness indirect addressing are formalized by introducing values, and references. SpecificMly: called references, which in turn possess other values. The as- (1) The existence of function-returning and reference- signment operation always affects the relation between some returning functions allows all compound data structures to reference and its value. (3) All compound data structures are be treated as functions. For example, a one-dimensional treated as functions. (4) Type declarations are not permitted. ALGoL-like array is treated as a function whose domain is a The functional approach to data structures and the use of finite set of consecutive integers and which maps each of references insure that any process which accepts some data these integers into a unique reference. This approach in- structure will accept any logically equivalent structure, regard- sures that any process which accepts some data structure less of its internal representation. More generally, any data will accept any logically equivalent structure, regardless of structure may be implicit; i.e. it may be specified by giving an its internal representation. More generally, any data struc- arbitrary algorithm for computing or accessing its components. ture may be implicit; i.e. it may be specified by giving an The existence of label variables permits the construction of co- arbitrary algorithm for computing or accessing its compo- routines, quasi-parallel processes, and other unorthodox control nents. (Functional data structures have been suggested by mechanisms. Balzer [6], but his realization of the concept is quite differ- A variety of programming examples illustrates the generality ent than GEDANKEN.) of the language. Limitations and possible extensions are dis- (2) The existence of label variables permits the construc- cussed briefly. tion of coroutines, quasi-parallel processes, and other un- KEY WORDS AND PHRASES: programming language, data structure, refer- orthodox control mechanisms. This is a direct consequence ence, assignment, coroutine, quasi-parallel process, typeless language, applica- of not imposing a stack discipline on the program control tive language, , list processing, nondetermlnlstlc algorithm information. CR CATEGORIES: 4.20, 4.22, 5.23, 5.24 The main limitation of GEDANKEN is that declara- tions are not allowed to restrict the value ranges of identi- fiers, references, or function results. Languages with this Introduction property are usually called "typeless," although the types The recent development of programming languages of values may be tested during execution. We do not sug- suggests that the simultaneous achievement of simplicity gest that type declarations are unimportant or that it is and generality in language design is a serious unsolved trivial to add them to GEDANKEN without destroying problem. This paper describes an experimental language, the generality of the language; this is a major theoretical called GEDANKEN, which was developed to attack this problem. problem. The originality of GEDANKEN lies primarily in the GEDANKEN is not intended to be a generally useful language features which have been excluded, and the main language, althoughit could be effective in situations where a aim of this paper is to demonstrate that these exclusions fair degree of object program inefficiency is tolerable. Its (except typelessness) do not impair generality. For this major purpose (reflected in its name, which is meant as an purpose, we include extensive programming examples. analogy to gedankenexperiments in physics) is to explore A formal definition of GEDANKEN is given in [7]. A the consequences of two basic design principles: complete but extremely inefficient implementation has (1) Completeness. Any value which is permitted in some been produced by translating this formal definition into * Applied Mathematics Division. Work performed under the LisP; this implementation has been used to check all auspices of the US Atomic Energy Commission. examples given in this paper.

308 Communications of the ACM Volume 13 / Number 5 / May, 1970 After describing the syntax of the language and the types TABLE I. A GRAMMAI¢ FOR GEDANKEN of values which are manipulated during program execution, we discuss the applicative part of the language, i.e. the (expo) ::= (constant> [ (identifier) I ((block)) (expl) ::= (expo) ] (function designator> evaluation of expressions and the application of functions. (function designator) ::= (expc> (expl) Finally, the imperative aspects, such as references, assign- (exp2) ::= (expl) [ (expl) = (exp2) ment, labels, and jumps, will be introduced. (expa) ::= (exp2) ] (exp2) AND (exp,) (exp4) ::= (exp3) ] (exp3> OR (exp4) Syntax (exp.> ::= (exp4) ] (conditional exp) [ (lambda exp>[ (exp4) := (exps) (conditional exp) ::= IF (exp6) THEN (exps) ELSE (exps) Mthough the importance of GEDANKEN lies in its (lambda exp) ::= k (pform0) (exp,) semantics, a definite syntax must be specified so that pro- (exp6) ::= (exps) I (sequence exp) ] (case exp) gramming examples can be given. A GEDANKEN program (sequence exp) ::= (empty> [ (exp,), (exps> {, (exp6>}* is a sequence of tokens separated by zero or more blanks, (case exp) ::= CASE (exp6) OF (exp6) {, (exps)}* (pform¢> ::= (identifier) [ ((pforml)) with at least one blank used as a separator whenever the (pforml) ::= (pform6) [ (sequence pform) juxtaposition would otherwise be ambiguous. The tokens (sequence pform) ::= (empty> ] (pform¢), (pform0) [, (pform0)}* are sequences of characters classified as follows: (decl) ::= (pforml> IS (exp6) (recursive decl) ::= (identifier> ISR (lambda exp) constants digit strings (denoting integers), quoted strings (label> ::= (identifier> : reserved words AND, OR, IF, THEN, ELSE, CASE, OF, (statement> ::= {(label>}* (exp6) IS, ISR (block) :: = {(decl);}* {(recursive decl); }* {(statement>; }* identifiers all other alphanumeric strings beginning with (statement> a letter (program> ::= (block) punctuation tokens k, = : () ; := Certain predefined identifiers have standard meanings. These include: TRUE, FALSE, LL, and UL, which denote they are similar to atoms in LisP, except that they lack specific primitive values; ERROR, which denotes a built-in property lists and print names, l~Iore precisely, the atoms label value causing program termination; and the names of are a denumerably infinite set of values which may be all built-in functions. (These predefined identifiers differ tested for equality, but which do not possess any ordering from reserved words in that the programmer can override or arithmetic operations. Two particular atoms, denoted the standard meanings by declarations.) by the predefined identifiers LL and UL, play a special role The set of token sequences which are well-formed in the language. Additional atoms are created by the GEDANKEN programs is specified by the context-free built-in function ATOM, which returns a distinct atom grammar (over an infinite vocabulary of tokens) in Table each time it is applied. I. The syntactic variables in this grammar are subscripted A function is a value which may be applied to another to distinguish among phrases with a similar semantic role value called its argument. When so applied, the function but different levels of precedence. Thus phrases of the will either: (i) return a value called its result, (ii) transfer classes (exp0), ... , (exps) are all called expressions, while control to a label value without returning a result, (iii) phrases of the classes (pform0) and (pform~) are called cause an error stop, or (iv) initiate a nonterminating com- parameter forms. The notation {a} * is used to indicate an putation. (The application of a function may also alter the arbitrary number (including zero) of occurrences of the state of a computation by producing various side effects, string a. which will be discussed later.) The set of arguments for It should be noted that a block can consist of a single which a function will return a result is called the domain expression; this permits any expression to be parenthesized of the function. A number of built-in functions are pro- without changing its semantics. vided which may be used without being defined; additional "user-defined" functions are produced by the evaluation of Primitive Values and Functions various expressions. The items of data which are manipulated during the (Proper procedures, in the sense of ALGOL, are not pro- execution of a GEDANKEN program are called values. vided in GEDANKEN, since they are equivalent to func- The set of all values is partitioned into seven types: inte- tions which execute useful side effects but return an irrele- gers, Booleans, characters, and atoms (collectively called vant result. Functions with multiple arguments are not primitive values), and functions, references, and label values provided, since they are equivalent to functions whose (collectively called nonprimitive values). (Floating-point arguments are sequences, as described below.) numbers are excluded, but their inclusion would not raise The functional approach to data structures is reflected in any significant problems.) Although the language does not the absense of a distinct type of value corresponding to the contain type declarations, a complete set of built-in func- conventional notion of a vector or array; the analogous tions is available for testing the type of a value during values in GEDANKEN are functions. Thus we will use program execution. the word "vector" to denote those functions which are Among the primitive values, only atoms are unusual; logically equivalent to conventional vectors.

Volume 13 / Number 5 / May, 1970 Communications of the ACC~I 309 It is evident that the domain of a GEDANKEN function the applicative sublanguage, which is obtained by dis- which is a vector must include a finite set of consecutive regarding references, label values, and the operations which integers; these integers are the analogue of the subscripts of manipulate them. a conventional vector. But a conventional vector also has Within this sublanguage, the basic operation is the the property that its set of subscripts is explicit; i.e. there evaluation of expressions. Since the evaluation of an ex- must be some method of testing the vector to determine its pression will usually involve the evaluation of its sub- least and greatest subscripts. To reflect this property in expressions, the definition of this operation is inherently GEDANKEN, we require that the domain of a vector recursive. Also, when an expression contains freeidentifiers, must include, in addition to the subscript set, the atoms its evaluation is only meaningful in the presence of some LL and UL, and that the results of applying the vector to mapping of these identifiers into values; such a mapping is LL and UL must be the least and greatest subscripts. called an environment and is said to bind each identifier to a This leads to the following definition. A function F is value. called a vector whenever: (1) its domain includes the atoms A complete program is always evaluated in an environ- LL and UL; (2) the results of applying F to LL and UL ment which binds the predefined identifiers into their are integers such that F(UL) _> F(LL) -- 1; (3) the domain standard values. Whenever the evaluation of an expression of F includes all integers i such that F(LL) < i < F(UL). e involves the evaluation of an immediate subexpression If F is a vector, then the integers F(LL), F(UL), and d, then, unless e is a lambda expression or a block, e' is F(UL) -- F(LL) -Jr 1 are called the lower limit, upper limit, evaluated in the same environment as e. The evaluation of and length of F, respectively, and for each integer i such lambda expressions and blocks (described in detail below) that F(LL) < i < F(UL), the result of applying F to i involves the concept of extension: if i is an identifier, v is a is called the ith component of F. value, and 7 and 7' are environments such that n' binds i A vector is called a sequence if its lower limit is 1. to v and specifies the same binding as 7 for all other identi- Although a vector is a kind of function, and a sequence fiers, then n' is called the extension of n formed by binding is a kind of vector, neither "vector" nor "sequence" is a ito v. "type" in the usual sense, since one cannot write a program We now describe the evaluation of each nontrivial form which will test whether an arbitrary function is a vector or of expression. The application of a function to an argument a sequence. Certain operations in GEDANKEN (e.g. is performed by a function designator: evaluation of sequence expressions or application of the (function designator) ::= (exp0) (expl) built-in function VECTOR) are guaranteed to produce function argument vectors, but equally valid vectors may also be produced by part part more general mechanisms (e.g. evaluation of lambda ex- which is evaluated by first evaluating its function part and pressions). Vectors produced in the latter manner are said its argument part to obtain values vf (which must be a to be implicit. function) and va, and then applying vf to v~. (Since the (The realization of vectors in GEDANKEN is in con- argument part is evaluated before the function is applied, trast to several languages, such as PAL, in which subscript this form of evaluation is similar to call by value in ALGOL, limits are obtained by applying built-in functions to vec- rather than call by name.) Since function designators have tors. In the latter approach vectors are not purely func- a right-associative syntax, the usual composition of func- tional, since they are amenable to other operations than tions may be written without parentheses; e.g. F(G(X)) application. The practical effect is to prohibit implicit may be written as F G X. vectors.) Functions may be produced by the evaluation of lambda The existence of sequences in GEDANKEN justifies the expressions: elimination of functions with multiple arguments. The analogue of a conventional function with k arguments, (lambda exp) ::= k (pformo) (exp.} when either k = 0 or/c _> 2, is a function whose single body argument is a sequence of length k. For example, the do- Basically, the value of a lambda expression is a function main of the built-in function ADD is the set of sequences which (when it is applied to an argument at some later of length two whose components are both integers. (This point during the computation) computes its result by approach is a direct borrowing from PAL.) binding the parameter form to its argument and then eval- The remaining types of nonprimitive values, references uating the body. More precisely, if f is the function ob- and label values, will be defined later. tained by evaluating k(p)e in the environment n, then the result of applying f to an argument a will be obtained by Applicative Semantics evaluating e in an environment which is the extension of To describe the semantics of GEDANKEN, we follow formed by binding p to a. (The meaning of binding p to a, Landin [2] and Evans [3] in dividing the language into an when p is not an identifier, will be defined below.) applicative part, involving the evaluation of expressions This binding mechanism is quite conventional (it is and the application of functions, and an imperative part, called FUNARG binding in LisP and is similar to the involving assignment and control jumps. We first consider mechanism used in ALGOLand in PL/I), but a clear under-

310 Communications of the ACM Volume 13 / Number 5 / May, 1970 standing of its implications is vital. There are two separate body) (3, 4), X is bound to 3 and Y is bound to 4. However, actions: (i) the evaluation of the lambda expression to the sequence argument approach also provides useful produce a function, and (ii) the application of this function unconventional capabilities, e.g. (k(X, Y) body) (IF P to its arguments. The body of the lambda expression is not THEN (3, 4) ELSE (5, 6)). More importantly, the ability evaluated until (ii), but the environment in which the body to bind a single identifier to an entire sequence provides is evaluated is an extension of the environment used during the equivalent of a function with an indefinite number of (i) rather than (ii). As a result, when a lambda expression arguments, e.g. (XX body) (IF P THEN (3, 4) ELSE contains free identifiers, its evaluation in different environ- (5, 6, 7)). meats will produce different functions. For example, in an GEDANKEN is similar to EUL~R [8] in treating all environment where Y is bound to an integer k, the evalua- types of unlabeled statements as expressions. In particular, tion of h(X) ADD(X, Y) produces a function which in- a block is a form of expression with a meaningful value: creases its argument by k. (block) ::= {(decl);}* {(recursive decl);}* Functions which are sequences may also be produced by {(statement);}* (statement) the evaluation of sequence expressions: where (sequence exp) ::= (empty) ] (exp,), (exps) {, (exps)}* (decl) ::= (pforml) IS (exp,) Let n be the number of subexpressions. Then the sequence (reeursive decl) ::= (identifier) ISt~ (lambda exp) expression is evaluated by first evaluating its subexpres- Basically, a block is evaluated by first carrying out the sions to obtain values vl, ... , v~ and then producing a bindings indicated by its declarations, recursive declara- sequence of length n whose ith component (for 1 < i < n) is v~. tions, and labels, and then evaluating the statements in order from left to right. The value of the block is the value Because of their low precedence, sequence expressions of the rightmost statement. The values of preceding state- are usually parenthesized, but the parentheses themselves ments are ignored; in the absence of imperative features, do not indicate a sequence expression. Thus the expressions these statements have no effect. () and (X, Y) both produce sequences, but (X) has the More precisely, a block is evaluated as follows (we in- same value as X. There is no sequence expression which clude the binding of labels although it is an imperative produces a sequence of length one, but such sequences can aspect of the language): be produced by the built-in function UNITSEQ, which (1) For each declaration ((decl)), in order from left to returns a sequence whose only component is the value of its argument. right: the right side of the declaration is evaluated, and then the current environment is extended by binding the As noted earlier, a function of n arguments (n ~ 1) is left side of the declaration to the value of the right side. treated in GEDANKEN as a function of a sequence of length n. This suggests that when a function produced by a (2) The current environment is further extended by binding each identifier which occurs on the left of a recur- lambda expression expects to receive a sequence as its argument, the parameter form within the lambda expres- sire declaration ((recursive decl)), or as the label of a statement, to a distinct "dummy" value. sion should be able to bind several different identifiers to (3) The right side of each recursive declaration is the components of the sequence. To provide this capability evaluated, and its value replaces the corresponding dummy we extend the notion of a parameter form to include a value. sequence parameter form (which is a rough analogue of a formal parameter list in ALGOL) : (4) For each label, an apropriate label value is created and replaces the corresponding dummy value. (sequence pform) ::= (empty) I (pformo), (pform0) {, (pform0)}* (5) The statements are evaluated in order from left to The relevant semantics are given by defining (recur- right. sively) the extension of an environment n formed by bind- (6) The value of the block is the value of the rightmost ing an arbitrary parameter form p to a value v. This statement. extension is computed as follows: In steps 2 to 4, the device of binding identifiers to dummy values and then replacing the dummy values (1) If p is an identifier, then n is extended by binding allows an environment to be cyclic, i.e. to bind an identifier ptov. to a value which is produced by evaluating a lambda (2) If p has the form (p'), then ~ is extended by binding expression (or label) in the same environment. p' to v. The essential difference between (nonrecursive) declara- (3) If p is a sequence parameter form, pl,..., p. tions and recursive declarations is that the right side of a (n ~ 1), then v, which must be a function, is applied declaration "feels" only the bindings caused by preceding to each integer from 1 to n, and ~/ is repeatedly declarations, while the right side of a recursive declaration extended by binding each p~ to the result of v(i). feels the bindings caused by all declarations in the block, The syntax of sequence expressions and sequence param- including implicit label declarations. Recursive declara- eter forms preserves conventional notation for functions tions are needed to define recursive functions conveniently, of several arguments. Thus in the evaluation of (h(X, Y) including families of functions which call one another.

Volume 13 / Number 5 / May, 1970 Communications of the ACM 311 (They also permit the definition of functions which jump CDR, we treat the two-field list cell produced by CONS as into the immediately enclosing block.) a function whose domain contains two elements (e.g. 1 Nonrecursive declarations are less essential, but they and 2) and which maps these elements into the values of its permit convenient constructions such as X IS ADD(X, CAR and CDR fields. This viewpoint leads directly to the 1). More important, their existence allows the right sides definitions: of recursive declarations to be limited to lambda expres- CONS IS X(X, Y) XZ IF Z = 1 THEN X ELSE Y; sions, so that meaningless constructions such as X ISR CAR ISXXX 1; ADD(X, 1) are syntactically illegal. CDR IS XX X 2; Conditional expressions have the same meaning as in ALGOL. Case expressions have a rather unorthodox mean- These definitions imply an ability to do list processing ing (which is convenient for defining implicit sequences): without special built-in functions. In a conventional list- CASE e0 OF el, ..., en is evaluated by first evaluating e0 processing system (e.g. compiled Lisp 1.5 [la and lb] or to obtain a value i; then if i is an integer satisfying 1 _< i some extensions of ALGOL [4, 9]) user-defined functions are < n, the value of the case expression is obtained by restricted so that storage for the wdues of their identifiers evaluating e~ ; if i is LL or UL the value is 1 or n respec- obeys a stack discipline. Then list structures, which do not tively; all other values of i give an error stop. obey a stack discipline, must be allocated in a separate The remaining forms of expressions are most easily storage area, and built-in functions or operations must be defined as abbreviations. Except for coercion (discussed provided for accessing this area. But in GEDANKEN, the later), they can be eliminated from a program by applying user may develop list-processing by defining function- the following transformations: returning functions (such as CONS above) which violate a stack discipline. In effect, all storage is potentially list- el = e2 ~ EQUAL(el , e2) structured. el AND e~ ~ (IF el THEN e~ ELSE FALSE) Although the above approach is workable and theoreti- el OR e2 ~ (IF el THEN TRUE ELSE e2) cally attractive, it is more convenient to use sequence el := e2 ~ SET(el , e2) expressions to create list elements and direct application The built-in function SET will be defined later. EQUAL to obtain their subfields. Thus, we write (X, Y) instead of tests the equality of primitive data, but if either com- CONS (X, Y), X 1 instead of CAR X, and X 2 instead of ponent of its argument is a function or a label value, it CDR X. Following this approach, we introduce lists by will return FALSE. Its action on references will be de- first creating an atom to denote the empty list: scribed later. NIL IS ATOM( ); Theoretically, nonrecursive declarations, sequence pa- rameter forms, and sequence expressions can also be and then defining a list to be either the atom NIL or a regarded as abbreviations. Their occurrences in a program sequence of length two whose second component is a list. can be eliminated by repeated application of the following The following functions will return the length of a list, equivalences: find the ith element of a list, and append one list to another: p IS e; b ~ (k(p)(b))(e) k(pl , .-. , p,) b (when n~ 1) LISTLENGTH ISR XL IF L = NIL THEN 0 ki(p~ ISil; ... ; p~ ISin'; b) ELSE INC LISTLENGTH L 2; el , .-- , e, (when n ~ 1) LISTELEM IStL x(I, L) IF L = NIL THEN GOTO ERROR (il ISe~; -.- ; i.ISe,; Xi(CASEiOFil,.-. ,i,)) ELSE IF I = 1 THEN L 1 ELSE LISTELEM(DEC I, L 2); APPEND ISR X(X, Y) IF X = NIL THEN Y where n' is an integer constant whose value is n, and i, i~, ELSE (X 1, APPEND(X 2, Y)); ..., i~ are distinct identifiers which do not occur in the program being transformed. Hence INC and DEC are built-in functions which increase It should be noted that GEDANKEN does not include or decrease an integer by one. certain features, such as infix arithmetic operators or for As a second example, consider one-dimensional arrays. statements, which would enhance the conciseness of the We have defined a type of function called a vector which is language without expanding the range of programs which the analogue of a one-dimensional array, and we have could be expressed. Such features could be added easily, introduced sequence expressions for creating vectors. But but they are not germane to the basic purposes of the a sequence expression can only produce a vector which is a language. sequence, and it is inconvenient for producing very long vectors. What is needed is a function which will produce a Functional Data Structures vector from a functional specification of its components, Even the applicative part of GEDANKEN is sufficient i.e. which will accept another function, tabulate its results to demonstrate the power and flexibility which can be over a finite range, and return a "lookup" function for the obtained by treating data structures functionally. resulting table. As a first example, consider LisP-like list structures. To Thus we define a function VECTOR which accepts an define analogues of the Lisp functions CONS, CAR, and argument (L, U, F), where L and U are integers and F is a

312 Communications of the ACM Volume 13 / Number 5 / May, 1970 function. If U < L, VECTOR returns an empty vector V that we wish to give it a sequence whose ith component is such that V(LL) = L and V(UL) = L -- 1. Otherwise, the ith element of a list L. This can be done in a conven- VECTOR evaluates F(I) for each integer I between L and tional manner by evaluating P VECTOR (1, LIST- U inclusive, and returns a vector V such that V(LL) = L, LENGTH L, ~, I LISTELEM(I, L)), which copies the V(UL) = U and for L < I < U, V(I) is the value of F(I). elements of L into a contiguous array. But it is also possible The basic approach is to recur on the length of the vector, to evaluate P MAKESEQFROMLIST L, where tabulating a single value (bound to T) at each level of MAKESEQFROMLIST IS X L recursion. k I IF I = LL THEN 1 ELSE IF I = UL THEN LISTLENGTH L ELSE LISTELEM(I, L); VECTOR ISR X(L, U, F) IF GREATER(L, U) THEN MAKESEQFROMLIST does not copy the components of x I IF I = LL THEN L ELSE IF I = UL THEN DEC L L; instead, it returns an implicit sequence which will look ELSE GOTO ERROR ELSE (V IS VECTOR(L, DEC U, F); T IS F U; up the appropriate element of L each time one of its com- XIIFI= ULTHENUELSEIFI = UTHENTELSEVI); ponents is accessed. It is equally possible to produce an implicit list from a It is evident that this function, although theoretically sequence: correct, will be extremely inefficient in any reasonable MAKELISTFROMSEQ ISR X S MLFSI(1, S); implementation. For this reason, a built-in function MLFS1 ISR x(I, S) IF GREATER (I, S UL) THEN NIL VECTOR is provided which is defined to be equivalent to ELSE X K (CASE K OF S I, MLFSI(INC I, S)); the function above (except for coercion). (This question of efficiency may be clarified by consider- (Here MLFS1 is a subsidiary function which produces an ing implementation mechanisms. In a simple implementa- implicit list from the subsequence of S that begins with tion, functions would possess two distinct internal repre- the Ith component.) sentations: If a function was produced by evaluating a The data structures shown so far have the limitation that lambda expression, it would be represented by a "lambda once a structure has been created, its components or ele- record" containing a pointer to code which was compiled ments cannot be altered. To overcome this limitation we from the lambda expression plus values for each free must introduce the imperative aspects of GEDANKEN. identifier in the lambda expression (i.e. a representation of References the environment in which the lambda expression was evaluated). On the other hand, if a function was created by In any programming language which permits assign- evaluating a sequence expression or by the application of ment, there is a class of objects which are affected by VECTOR, it would be represented by a "vector record" assignment. We will call these objects references; other containing domain limit and indexing information plus a terms used commonly in the literature are "name" and contiguous array of component values. It is evident that "L-value." At any time during the execution of a program, the above definition of VECTOR would yield a vector each reference possesses some value. The effect of an assign- whose internal representation was a linked list of lambda ment operation r: = v is to cause the reference denoted by records, each containing one component value, rather than r to possess the value denoted by v. a contiguous array.) Within this definitional framework, there are at least Using lists and vectors, we may illustrate our assertion three distinct approaches to assignment (see Figure 1): that any process which accepts some data structure will (1) Identifiers are used as references. This approach is accept any logically equivalent structure. Suppose that P used in SNOBOL [10], where a form of indirect addressing is a function which expects a sequence as its argument, and is achieved by allowing identifiers to occur as values. Un- fortunately, the approach does not mesh well with block structure; a discussion of the difficulties is given by Kain [11]. [IDENTIFIERS] IIDENTIFIERSI (2) References are distinct from either identifiers or values, and are interposed between all other value-denoting entities and their values. Thus the bindings of identifiers, r~ the arguments and results of functions, and the com- VALUES ponents of vectors are all references, and the values J]REFERENCESvALUES li~° I%}1/ ~28} VALUES denoted by these entities are actually the values possessed 28 by the references. This approach is used in PAL, and to a IDENTIFIERS IREFERENCESF large extent in FORTnAN and PL/I, except that in the latter / languages function results are values, and identifiers may ~'1FUNCTIONS t/ FUNCTIONS~ ~FUNCTIONS~/ be bound directly to functions and label values, but not to primitive values. The approach meshes well with block (i) (2) (3) structure but is rather inflexible; one moves from the Fzo. 1. Three approaches to assignment applicative situation, where assignment is impossible, to

Volume 13 / Number 5 / May, 1970 Communications of the ACM 313 the opposite extreme, where every value-denoting entity Then: can be affected by assignment. (1) All built-in functions which would otherwise be (3) References are treated as a distinct type of value, so meaningless coerce their argument or the appropriate that any value-denoting entity can denote either a con- components of their arguments. For example, ADD(X, Y) ventionalvalue or a reference which in turn possesses a is equivalent to ADD(COERCE X, COERCE Y), but value. This approach is used in ALGOL 68 and BASEL. ISREF X is not equivalent to ISREF COERCE X, nor (BASEL also permits a form of assignment which alters VAL X to VAL COERCE X. identifier binding.) It is compatible with block structure (2) REF X coerces X, SET(R, X) (and therefore R and is more flexible than the previous approach, since the := X) coerces X, and EQUAL(X, Y) (and therefore X = programmer can introduce references in just those con- Y) coerces both X and Y. Since these functions would each texts where he intends to do assignment. Advantages be meaningful for references without coercion, analogous should accrue in both the optimization of data representa- noncoercing functions, named NCREF, NCSET, and tions and the checking of erroneous assignment statements. NCEQUAL, are also provided. NCREF and NCSET (The above categorization must be qualified by the fact permit references to possess values which are also refer- that FORTRAN, PL/I, ALGOL 68, and BASEL all have type- ences. NCEQUAL can be used to determine whether two declaration mechanisms which affect their treatment of values are the same reference. assignment. A discussion of this interaction is beyond the (3) Conditional and case expressions coerce the values of this paper.) of their leftmost subexpressions. In GEDANKEN we have chosen to use the third ap- (4) Expressions involving AND and OR coerce the proach to assignment. Thus we introduce a new, denum- values of both their subexpressions. erably infinite set of values called references, and stipulate (5) A function designator coerces the value of its function that each reference possesses some other value (which may part. itself be a reference). Three built-in functions are provided (6) When a sequence parameter form pl, ..., p. is to manipulate references: REF, SET, and VAL. REF X bound to a value a, each p~ will be bound to (COERCE a) returns a distinct reference each time it is applied; this (i). reference is initialized to possess the value X. SET(R, X) (7) Vectors which are created by evaluating sequence (which can be abbreviated R := X) causes R (which must expressions or by application of the built-in functions be a reference) to possess the value X, and also returns X; VECTOR or UNITSEQ will coerce their argument. its action on R is an example of a side effect. VAL R returns Despite their ad hoc appearance, most of these coercion the value possessed by R (which must be a reference). rules are instances of the general principle that coercion For example, under the scope of the declaration X IS 3, should only occur in situations which would otherwise give the identifier X is bound to the integer 3, and this binding an error termination. The exceptions are rules (2) and cannot be altered by assignment. Evaluation of the expres- (4), which are simply concessions to conventional nota- sion X := 4 would give an error, since 3 is not a reference. tion. Analogously, under the scope of the declaration X IS REF 3, the identifier X is bound to the reference created Data Structures with Embedded References by REF, and this binding cannot be changed by assign- The utility of references becomes apparent when refer- ment. But now evaluation of X := 4 is legitimate, and ence-returning functions are used to embed references causes the value possessed by the reference bound to X to within data structures, yielding structures which can be change from 3 to 4. Thus in the execution of the block altered by assignment. (x ISREF3;VALX = 3;X := 4;VALX = 4) This approach provides precise control over the ways in which data structures can be altered. Thus the GEDAN- both equality predicates will be true. KEN equivalent of an ALGOL-like one-dimensional array The major difficulty with this approach is the frequent is a vector whose components are references, e.g. necessity for using the function VAL. For example, under the scope of the declarations X IS REF 3; Y IS REF 4; X IS VECTOR(i, 100, X I REF 0); one would write ADD(VAL X, VAL Y) rather than Under the scope of this declaration, assignment can be ADD(X, Y), since ADD acts upon integers rather than made to the components of X, e.g. X(7) := 10, but not to references. To alleviate this difficulty, we introduce X itself. In particular, the subscript limits X LL and X coercion conventions into GEDANKEN; i.e. we stipulate UL are fixed by the declaration. that references will be replaced by their values in certain On the other hand, the equivalent of a string variable is contexts which would otherwise be meaningless. provided by a reference whose value is a vector: Specifically, let COERCE be the function S IS REF VECTOR(i, 100, F); COERCE ISR k X IF ISREF X THEN COERCE VAL X ELSE X; Here assignment can be made to S itself (possibly changing (which is available as a built-in function), and define "to the subscript limits) but not to its components. coerce X" to mean the replacement of X by COERCE X. A second consequence of the reference concept is the

314 Communications of the ACCM Volume 13 / Number 5 / May, 1970 ability to define data structures or sets of data structures such functional property lists: which share elements, in the sense that assignment to one MAKEPROPLIST IS k( ) (L IS REF NIL; k P PROPVAL(P, L)); element will affect another. Consider a square matrix M. We could define M as a vector of vectors, i.e. Each application of MAKEPROPLIST returns a now instance of PROPVAL, with L bound to a private "own M IS VECTOR(l, 10, k I VECTOR(l, 10, k J REF 0)); variable." Since a property can be any primitive value, a but this leads to the inconvenience of referring to an ele- functional property list is similar to a reference-valued ment of 1Vi by (M I) J. It is more natural to define I~{[ as a vector, except that it has an indefinite domain. Indeed, reference-returning function of pairs of integers: functional property lists can be used to provide an efficient implementation of sparse vectors. M IS (M1 IS VECTOR(l, 10, k I VECTOR(l, 10, k J REF 0)); As a final example of the use of references, suppose that ~(I, J) (M1 I) J); READ is a function such that each application of READ so that an element is referred to as M(I, J). Now consider produces the next item of data from some input stream, the additional declarations: and that we wish to produce an implicit list of the succes- sive items in the stream. The following function (of no MT IS k(I, J) M(J, I); MD IS k I M(I, I); arguments) returns such a list: Here MT and MD denote the transpose and diagonal of hi, MAKERLIST ISR k( ) in the sense that assignment to an element of one matrix (B ISREF0;kI affects the corresponding elements of the others. (IF B = 0 THEN B := (READ ( ),MAKERLIST()) ELSE ( ); Elements may also be shared within the same data B I)); structure. For example, The result of MAKERLIST is an implicit list (whose S IS(S1 IS VECTOR(l, 10, k I VECTOR(l, I, k J REF 0)); implicit length is infinite) which only applies READ as x(I, J) IF NOT GREATER(J, I) THEN (S1 I) J ELSE (S1 J) I); items of data are actually needed, and only stores pre- defines a symmetric matrix in which assignment to S(I, J) viously read items which are still accessible. also alters S(J, I). The embedding of references in list structures also pro- Implicit References vides control over the ways in which these structures may be altered. An example is the property list, which is a list The utility of implicit data structures suggests the of property-value pairs subject to two operations: the introduction of an analogous facility for references. Thus value paired with a given property may be looked up; or we introduce the concept of an implicit reference, i.e. a the value paired with a given property may be changed, value whose external appearance is the same as a reference, adding a new pair to the list if the property is not already but which may carry out an arbitrary computation each present. It is evident that references must occur in the time it is set or evaluated. (Implicit references are related property list at two points: each value must be a reference, to doublets in POP-2 [12].) so that it can be changed; and the entire list must be a To specify an implicit reference, the programmer must reference, so that new pairs can be added. provide two functions: a "setting function" S which will The following function manipulates such property lists. be executed each time a value is assigned to the implicit Given a property P and a (reference to a) property list L, reference, and an "evaluating function" V which will be PROPVAL(P, L) searches L for an occurrence of P. If P is executed each time the implicit reference is evaluated. Thus found, the reference paired with P is returned. Otherwise, an implicit reference is produced by applying the built-in a pair consisting of P and a new reference (initialized to function IMMPREF(S, V), where S and V may be arbitrary zero) is added to L, and the new reference is returned. The functions of one and zero arguments respectively. Each argument P is coerced. application of IMPREF produces a distinct implicit PROPVAL IS k(P, L) reference, and these implicit references satisfy the predicate (P IS COERCE P; ISREF and are coerced in the same manner as conven- SEARCHL ISR ~ X tional references. But the effect of SET or VAL on an IF X = NIL THEN implicit reference is to execute S or V. Specifically, if R (NEWV IS REF 0; L := ((P, NEWV), VAL L); NEWV) is the result of IMPREF(S, V), then ELSE IF (X 1) 1 = P THEN (X 1) 2 ELSE SEARCHL X 2; SEARCHL VAL L); NCSET(R, X) = (S X; X) SET(R,X) = (X IS COERCEX;SX;X) An application of this function can occur on either side of VALR = V () an assignment operation; on the right side it will act to look up a value, on the left side it will act to alter a value. To illustrate the use of implicit references, consider the A further step can be taken by viewing the property list problem of protecting a reference-valued vector. Suppose itself as a reference-returning function which accepts a that P is a function which accepts a vector whose com- property and returns a reference to the corresponding ponents are references. We wish to apply P to such a value. The following function (of no arguments) returns vector V, but to protect the components of V from being

Volume 13 / Number 5 / May, 1970 Communications of the ACM 315 affected by P; i.e. we want these components to revert to (3) A dump, which specifies the computations to be their original values after the application of P is finished. performed after the current block is completed. The The simplest approach is to copy V by executing P dump is a pushdown stack containing an entry for VECTOR(V LL, V UL, X I REF V I), but this will be each block and lambda-expression body whose inefficient if V is large and only a few components are reset evaluation is incomplete; each entry contains a by P. An alternative approach is to maintain a "change control and an environment (plus additional in- list" of the components of V which have been altered by formation which is needed to describe partially P. This may be done by executing P PSEUDOCOPY V, evaluated compound expressions). where (4) A memory, which specifies the mapping of references into their values. PSEUDOCOPY IS x V A label value consists of a control, an environment, and (CL IS REF NIL; SEARCHCL ISg X(X, I, F, G) IF X = NIL THEN G( ) a dump. During the evaluation of a block, immediately ELSE IF (X1) 1 = ITHENF (X1) 2 before the first statement is evaluated, a label value is ELSE SEARCHCL(X 2, I, F, G); created for each label in the block; each label value con- x I (I IS COERCE I; tains a list of the statements between the corresponding IF I = LL THEN V LL ELSE IF I = IJL THEN V UL label and the block end, plus the current environment ELSE IF NOT ISINTEGER I OR GREATER(V LL, I) OR GREATER(I, V UL) (including the bindings of the labels themselves) and dump. THEN GOTO ERROR When the built-in function GOTO is applied to a label ELSE IMPREF ( value, the current control, environment, and dump are X X SEARCHCL(VAL CL, I, X R NCSET(R, X), replaced by the constituents of the label value, and execu- x( ) (CL := ((I, NCREF X), VAL CL)), tion continues with the first statement of the new control. X( ) SEARCHCL(VAL CL, I, VAL, X( ) VAL V I)))); The memory is not altered. The result of PSEUDOCOPY is an implicit vector This mechanism permits jumps within the same block whose components are implicit references. Internally, CL (which leave the environment and dump unchanged) or to is a reference to the change list, which is a list of pairs, each higher level blocks, with the same effect as in ALGOL. But containing an integer argument of some altered com- the fact that label values can be possessed by references or ponent and a reference to the current value of that com- returned by functions also provides the ability to jump ponent. SEARCHCL is a subsidiary function which back into a block after it has been exited from. It is this searches a change list X for a pair beginning with the capability which allows the construction of coroutines. integer I. If such a pair is found, SEARCHCL returns the result of F applied to the reference paired with I; otherwise Coroutines SEARCHCL returns the result of G, which is a function A coroutine is a procedure which can relinquish control of no arguments. (The noncoercing functions NCSET and to its calling program and later be reactivated to continue NCREF are used to allow the values possessed by the computation. The simplest situation is that of two pro- components of V to be references.) cedures, each of which treats the other as a subroutine. As an example, suppose that COMPILE is a procedure Label Values which produces a succession of data items called instruc- The final type of value used in GEDANKEN is the tions, outputting each instruction by applying a function label value. These values are created during execution of a OUT, and that ASSEMBLE is a procedure which accepts block containing labeled statements, and are used as a succession of instructions, inputting each instruction by applying a function IN. If OUT and IN are arguments to arguments to the built-in function GOTO, which never COMPILE and ASSEMBLE respectively, we have returns but instead causes a transfer of control to the computational state represented by the label value. COMPILE ISRXOUT (... OUTX... ); A more precise description requires introducing a model ASSEMBLE ISI~ X IN ( ... X := IN( ) ... ); of the interpretation of GEDANKEN by an abstract We now want to couple these procedures so that machine. A complete description of such a model (given ASSEMBLE receives the output of COMPILE. Speci- in [7]) is beyond the scope of this paper, but the following fically, we want to run ASSEMBLE until it requests input, aspects are relevant to an understanding of the label and than run COMPILE until it produces the required output, GOTO mechanisms: then run ASSEMBLE again, etc. The necessary program During the execution of a program (at any instant when can be written by using label-valued references which are a statement is about to be evaluated) the state of the global to both IN and OUT: abstract interpreter will include the following entities: (LC IS REF 0; LA IS I~EF 0; INST IS REF 0; (1) A conlrol, which gives a list of the statements re- LC := LC1;ASSEMBLE(X( ) (LA := LA1; GOTOLC; maining to be evaluated in the current block. LAI: VAL INST)); GOTO DONE; LCi: COMPILE(XX (LC := LC2; INST := X; GOTO LA; (2) An environment, which gives the identifier bindings LC2:)) ; GOTO ERROR; to be used in the current block. DONE :) ;

316 Communications of the ACCM Volume 13 / Number 5 / May, 1970 Here LA and LC are label-valued references saving the CONT: IF R = NIL AND W = NIL THEN GOTO DONE current states of ASSEMBLE and COMPILE, and INST ELSE IF R = NIL THEN (CHAR :=READCHAR();R :=W;W := NIL) is a third reference used to hold the instruction being ELSE ( ) ; transmitted from COMPILE to ASSEMBLE. If COM- (L IS 1~ 1; R := R 2; GOTO L); PILE finishes while ASSEMBLE is still waiting for another DONE: VAL C) instruction, an error stop occurs. Each independent parser is represented by a label value if 1Nondeterministie Algorithms it has not completed its parse, or by its result if it has Label values in GEDANKEN are closely related to completed its parse. The finite set of parsers is main- "processes" in simulation languages such as SIMULA. [13a tained by the values of the references C, W, and R. C gives and 13b]; both are mechanisms which allow the state of a a list of the results of completed parses, W gives a list of suspended computation to be saved as an item of data. The label values representing the parsers which are waiting for essential difference is that further execution of a computa- the next character, and R gives a similar list for the parsers tion which was saved as a process causes the process to be which are ready for execution before reading the next updated, while further execution of a computation saved character. The reference CHAR keeps track of the current as a label value leaves the label value unchanged. Thus character, and is updated by the built-in function READ- label values can be used to repeatedly initiate execution CHAR. The label CONT is reached whenever execution is from the same state. to be switched from one parser to another. The final value This capability can be used to program a mode of execu- of the block is the list of completed parses; the input tion for nondeterministic algorithms [14] in which alterna- string is ill formed, well formed, or ambiguous depending tive paths are pursued concurrently. A simple example is upon whether this list has zero, one, or more than one nondeterministie parsing. It is fairly straightforward to element. convert a context-free grammar into a recursive parsing (This approach to parsing is basically the same as that function. Unfortunately, for many grammars this function used in the COGENT programming system [15a and 15b]. will contain nondeterministic branches, i.e. points at which It is presented here as an illustration of the generality of a conditional branch must be performed although the cur- GEDANKEN, but it does not represent a significant rent state of the parse is insufficient to determine this advance in the field of parsing techniques. Although it is branch. reasonably efficient for a large class of unambiguous gram- When such nondeterminism exists, parsing can be ac- mars, at least if the function PARSE is carefully con- complished by simulating a finite set of independent structed, some ambiguous grammars will cause an ex- parsers, all accepting the same input string and obeying ponential growth in the number of parsers and are better the same program, but with different control states. When treated by other methods, such as that of Earley [16].) a parser encounters a nondeterministic branch, it expands into two separate parsers; when a parser reads an input Limitations and Possible Extensions character which is inconsistent with its control state, it is The goal of applying the basic principles of GEDANKEN" deleted. to the design of an efficient general purpose programming Specifically, we assume that PARSE(IN, AMB, FAIL) language raises several interesting research problems: is a function which accepts two functions IN and AMB, (1) Addition of Type Declarations. The most natural and a label value FAIL, and returns some representation approach is probably an extension of Hoare's concept of of a successful parse. The function IN, of no arguments, is record classes [9]. The programmer would be able to applied by PARSE to read each character of the input declare an arbitrary number of disjoint function, reference, string. The function AMB, whose argument is a label and label classes, and would specify the range of each value, is applied to execute a nondeterministic branch; one identifier, function result, and reference value to be some side of the branch returns from AMB while the other union of such classes (and/or predefined classes of primi- jumps to the label-valued argument. PARSE jumps to the tive values). All functions in the same class would have the label value FAIL when it encounters an inconsistent same domain-range relation, and all references in the same character. We assume that PARSE does not set any class would have the same set of possible values. references, or at least that it does not expect the value of However, the functional approach to data structures will any reference to be preserved across an application of IN require unusual flexibility in the specification of the do- or AMB. main-range relations of functions. If an inhomogeneous The following program carries out the concurrent execu- data structure such as a record is to be treated as a func- tion of PARSE, synchronizing the independent parsers by tion, then it must be possible to specify that the range their reading of characters: of such a function depends on its argument. For example, (C IS REF NIL; W IS REF NIL; R IS REF NIL; the set of lists of integers would be the ~.mion of the set CHAR IS REF NIL; {NIL} with a class of functions with domain (1, 2) which C := (PARSE(k() (W := (L1, VALW); GOTO CONT; Li: VAL CHAR), map 1 into an integer but map 2 into a list of integers. k L2 (R := (L2, VAL R)), CONT), An elaboration of this approach to type, limited to a VAL C); purely applicative language, is described in [17].

Volume 13 / Number 5 / May, 1970 Communications of the ACM 317 (2) Open Functions. Efficient implementation of func- expression, but produces zero with right-to-left evaluation. tional data structures will require that certain functions be Label-valued references lead to more paradoxical pro- compiled into open code, i.e. that function designators grams, such as should be replaced by modified copies of the corresponding (x IS iILEF 0; L IS R.EF 0; M IS ILEF 0; L := L1; lambda-expression body, and that these copies should then (X := INC X, (iV[ := Mi; Mi: GOTO L)); be simplified to take advantage of constant arguments. LI: L := L2;GOTOM;L2: VALX) This capability could be provided by a macro-definitional which produces one with left-to-right evaluation, zero with facility. A second approach, more in keeping with the spirit right-to-left evaluation, and possibly two with intermixed of GEDANKEN, would be to permit certain lambda evaluation. expressions to be given an OPEN attribute. This problem is common to a wide variety of languages. This raises the question of whether a compiler could One either imposes a fixed order of evaulation, as in determine automatically when a designator of a lambda- ALGOL 60 or GEDANKEN, or permits a significant class defined function should be replaced by a copy of the func- of well-formed programs to have indeterminate inter- tion body. One might conjecture that such an expansion pretations, as in A.~GOL 68 or PL/I. But a more flexible could be performed for any function which was defined by approach might be possible, e.g. a limited form of impera- a nonrecursive declaration. Unfortunately, this conjecture tive features which could be added to an applicative lan- is disproved by the existence of a nonrecursive fixed-point guage without destroying order-of-evaluation indepen- function: dence. Y IS xG (U IS xV G(xX (V V) X); U U); (5) Other Label-Value Problems. Label-valued refer- ences can easily cause the preservation of data which will which can be used to convert any simply recursive function no longer be accessed by a computation. If L is a label- (i.e. a function which calls itself directly but not indirectly valued reference, then GOTO L will cause execution to via other functions) into an equivalent nonrecursive func- proceed from the computational state denoted by L. But tion [18]. the unchanged state must also be saved in case GOTO L is Thus suppose a recursive function F is defined by F executed again before the value of L is changed. If, in fact, ISR b, where F is the only identifier which occurs free in b. such a repeated jump cannot occur, then information Let F1 be the nonrecursive function defined by F1 IS XF will be saved unnecessarily unless the programmer goes to (b). Then the function (Y F1) can be shown to be equi- the trouble of resetting L immediately after the original valent to F, with the same domain of termination. More- jump. (As an example, the program for linking the co- over, the expansion of a function designator such as (Y routines COMPILE and ASSEMBLE will preserve the F1) X by repeated substitution of the definitions of Y and states of these routines unnecessarily.) F1 will never terminate. Presumably, it would be better to force the programmer (3) Storage Allocation. A serious drawback of the to extra trouble in order to preserve, rather than discard, principle of completeness is the elimination of any run- a reactivated computational state. This might be accom- time stack discipline, so that all data storage must be plished by adapting the concept of "process" used in recovered by garbage collection. This problem might be simulation languages, and providing a basic function for alleviated by adding language facilities for indicating copying processes. However, it is not clear how to combine contexts where a stack discipline is applicable. Even with- the process concept with an ALGoL-like use of label values out such facilities, it may be possible to determine by in a clean manner which does not violate the principle of program analysis, particularly with appropriate type completeness. declarations, situations where storage can be recovered A further difficulty is the inability of a label value to without garbage collection. preserve the values of references (i.e. the memory). In the (4) Side Effects. In the applicative subset of GEDAN- nondeterministic parser described earlier, the restriction KEN, the immediate subexpressions of a function designa- on the use of references in the function PARSE arises from tor or a sequence expression can be evaluated in any order, this problem. or the steps of their evaluation can be intermixed, without (6) Secondary Storage and File Management. Even with affecting the result or termination of any program. This open functions and sophisticated code optimization, it property, which is obviously desirable for code optimiza- may be intolerably inefficient to impose a purely functional tion or multiprocessing, is destroyed by the introduction approach on all data structures. But the functional ap- of assignment, since subexpressions can execute interfering proach still holds considerable promise for the treatment of side effects. large structures which require secondary storage. A stated, The situation is exacerbated by the introduction of but usually unmet goal of most data management systems label values, since then the order of evaluation can affect is the complete separation of the logical properties of a file the number of times a subexpression is executed. The from its physical representation. A natural approach to program this goal would be to equate a logical file with a collection (XISREF0;(X:=INCX, GOTOL); L: VALX) of functions for accessing the file, and to permit these produces one with left-to-right evaluation of the sequence functions to be implicit.

318 Communications of the ACCM Volume 13 / Number 5 / May, 1970 Acknowledgments. The author wishes to thank Dr. 8. WIRTH, N., AND WEBER, H. EULER--A generalization of M. D. MacLaren of Argonne National Laboratory and ALGOL and its formal definition: Pt. I, Pt. II. Comm. ACM 9, 1 and 2 (Jan., Feb. 1966), 13-25, 89-99. Professor Arthur Evans, Jr., of Massachusetts Institute 9. WIRTH, N., AND HOARE, C. A. R. A contribution to the of Technology for their stimulating discussions and helpful development of ALGOL. Comm. ACM 9, 6 (June 1966), suggestions. 413-432. 10. FARBER, D. J., GRISWOLD, R. E., AND POLONSKY, L P. The RECEIVED APRIL, 1969; REVISED OCTOBER 1969; FEBRUARY, 1970 SNOBOL3 programming language. Bell Syst. Tech. J. 45 (July-Aug. 1966), 895--944. REFERENCES 11. KAIN, R. Y. Block structures, indirect addressing, and la. McCARTrtY, J. Recursive functions of symbolic expressions garbage collection. Comm. ACM lZ, 7 (July 1969), 395-398. and their computation by machine, Pt. I. Comm. ACM 3, 4 12. BURSTALL,R. M., AND POPPLESTONE, R.J. POP-2 reference (Apr. 1960), 184-195. manual. In Machine Intelligence g, E. Dale and D. Michie (Eds.), American Elsevier, New York, 1968, pp. 205-246. lb. , ET AL. LISP 1.5 programmers manual. MIT Press, Cambridge, Mass., 1962. 13a. DAHL, O. J., AND NYGAARD g. SIMULA--An ALGOL- based simulation language. Comm. ACM 9, 9 (Sept. 1966), 2. LANDIN, P.J. The next 700 programming languages. Comm. ACM 9, 3 (Mar. 1966), 157-166. 671-678. 13b. --, MYrmHAU¢, B., AND NYGAARD,K. SIMULA 67 com- 3. EVANS, A. PAL--Alanguage designed for teaching program- mon base language. Publ. No. S-2, Norwegian Computing ming linguistics. Proc. ACM 23rd Nat. Conf. 1968, Brandin Systems Press, Princeton, N.J., pp. 395--403. Center, Oslo, May 1968. 14. FLOYD, R.W. Nondeterministic algorithms. J. ACM 15, 3 4. VAN WIJNGAARDEN,A. (Ed.), MAILLOUX,B. J., PECK, J. E. L., (Oct. 1967), 636-644. AND KOSTER, C. H.A. Report on the algorithmic language 15a. REYNOr,DS,J.C. An introduction to the COGENT program- ALGOL 68. MR 101, Mathematisch Centrum, Amsterdam, ming system. Proc. ACM 20th Natl. Conf., 1965, pp. 422- Feb., 1969. 436. 5. CHEATHAM, T. E., JR., FISCHER, A., AND JORRAND, P. On 15b. --, COGENT programming manual. ANL-7022, Argonne basis for ELF--An extensible language facility. Proc. Nat. Lab., Argonne, Ill., Mar. 1965. AFIPS 1968 Fall Joint Comput. Conf., Vol. 33 Pt. 2, MDI 16. EARLEY, J. An efficient context-free parsing algorithm. Publications, Wayne, Pa., pp. 937-948. Com. ACM 18:2 (Feb. 1970), 94-102. 6. BALZER, R. M. Dataless programming. Proc. AFIPS 1967 17. REYNOLDS, J. C. A set-theoretic approach to the concept of type. Working paper, NATO Conf. on Techniques in Fall Joint Comput. Conf. Vol. 31, MDI Publications, Software Engineering, Rome, Oct. 1969. Wayne, Pa., pp. 535-544. 18a. EVANS, A. Private communication. 7. REYNOLDS,J.C. GEDANKEN--A simple typeless language 18b. MORRIS, J. H. Lambda-calculus models of programming which permits functional data structures and coroutines. languages. MAC-TR-57, Project MAC, MIT, Cambridge, ANL-7621, Argonne Nat. Lab., Argonne, Ill., Sept. 1969. Mass., Dec. 1968.

v

A Language for Treating Graphs Introduction Graphs are an important tool in applied mathematics. Their use in engineering is frequent in such areas as auto- S. CRESPI-REGHIZZI* AND R. MORPURGO matic control, electric networks, printed circuits design, Politecnico di Milano, t Milan, Italy etc. Even greater is the importance of graphs in operation A language for the representation of graphs is described, and research, where many important problems can be modeled the formulation of graph operations such as node and/or link and solved by means of graphs. A large number of these deletion or insertion, union, intersection, comparison, and tra- problems are discussed by Kaufmann [1] and include trans- versal of graphs is given. portation, distribution, and traffic problems. Still other Graphs are represented by linked lists. The language is areas, like epidemiology and linguistics, frequently use syntactically defined as an extension to ALGOL 60, and it is graphs. translated into ALGOL by means of a syntax-driven compiler. Graphs, it is true, could be replaced by other mathe- Application areas for this language are operation research, matical techniques--Boolean methods for instance [2]-- network problems, control theory, traffic problems, etc. but we think that graphs have the important advantage of making the solution of the problem more visual and more KEY WORDS AND PHRASES: graphs, oriented, nonoriented,multiple, colored intuitive. graph, language, extended ALGOL, operator-precedence, syntax-driven com- Therefore it was felt that a language for handling graphs piler, operation research, network, traffic CR CATEGORIES: 3.2, 3.5, 4.2, 5.3 would be of enough interest to some groups of users to warrant the effort involved in its development. * Present address: University of California at Los Angeles, De- partment of Engineering, Los Angeles, Calif. The major goal for this language was that it should en- t Istituto di Elettrotecnica ed Elettronica able the programmer to formulate operations on graphs in

Volume 13 / Number 5 / May, 1970 Communications of the ACCM 319