1977 ACM Turing Award Lecture
The 1977 ACM Turing Award was presented to John Backus at the ACM Annual Conference in Seattle, October 17. In introducing the recipient, Jean E. Sammet, Chairman of the Awards Committee, made the following comments and read a portion of the final citation. The full announcement is in the September 1977 issue of Communications, page 681.

"Probably there is nobody in the room who has not heard of Fortran and most of you have probably used it at least once, or at least looked over the shoulder of someone who was writing a Fortran program. There are probably almost as many people who have heard the letters BNF but don't necessarily know what they stand for. Well, the B is for Backus, and the other letters are explained in the formal citation. These two contributions, in my opinion, are among the half dozen most important technical contributions to the computer field and both were made by John Backus (which in the Fortran case also involved some colleagues). It is for these contributions that he is receiving this year's Turing award.

The short form of his citation is for 'profound, influential, and lasting contributions to the design of practical high-level programming systems, notably through his work on Fortran, and for seminal publication of formal procedures for the specifications of programming languages.'

The most significant part of the full citation is as follows:

'... Backus headed a small IBM group in New York City during the early 1950s. The earliest product of this group's efforts was a high-level language for scientific and technical computations called Fortran. This same group designed the first system to translate Fortran programs into machine language. They employed novel optimizing techniques to generate fast machine-language programs. Many other compilers for the language were developed, first on IBM machines, and later on virtually every make of computer. Fortran was adopted as a U.S. national standard in 1966.

During the latter part of the 1950s, Backus served on the international committees which developed Algol 58 and a later version, Algol 60. The language Algol, and its derivative compilers, received broad acceptance in Europe as a means for developing programs and as a formal means of publishing the algorithms on which the programs are based.

In 1959, Backus presented a paper at the UNESCO conference in Paris on the syntax and semantics of a proposed international algebraic language. In this paper, he was the first to employ a formal technique for specifying the syntax of programming languages. The formal notation became known as BNF, standing for "Backus Normal Form," or "Backus Naur Form" to recognize the further contributions by Peter Naur of Denmark.

Thus, Backus has contributed strongly both to the pragmatic world of problem-solving on computers and to the theoretical world existing at the interface between artificial languages and computational linguistics. Fortran remains one of the most widely used programming languages in the world. Almost all programming languages are now described with some type of formal syntactic definition.' "

Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs

John Backus
IBM Research Laboratory, San Jose
Conventional programming languages are growing ever more enormous, but not stronger. Inherent defects at the most basic level cause them to be both fat and weak: their primitive word-at-a-time style of programming inherited from their common ancestor--the von Neumann computer, their close coupling of semantics to state transitions, their division of programming into a world of expressions and a world of statements, their inability to effectively use powerful combining forms for building new programs from existing ones, and their lack of useful mathematical properties for reasoning about programs.

An alternative functional style of programming is founded on the use of combining forms for creating programs. Functional programs deal with structured data, are often nonrepetitive and nonrecursive, are hierarchically constructed, do not name their arguments, and do not require the complex machinery of procedure declarations to become generally applicable. Combining forms can use high level programs to build still higher level ones in a style not possible in conventional languages.

General permission to make fair use in teaching or research of all or part of this material is granted to individual readers and to nonprofit libraries acting for them provided that ACM's copyright notice is given and that reference is made to the publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Association for Computing Machinery. To otherwise reprint a figure, table, other substantial excerpt, or the entire work requires specific permission as does republication, or systematic or multiple reproduction.

Author's address: 91 Saint Germain Ave., San Francisco, CA 94114.

© 1978 ACM 0001-0782/78/0800-0613 $00.75
613 Communications of the ACM, August 1978, Volume 21, Number 8

Associated with the functional style of programming is an algebra of programs whose variables range over programs and whose operations are combining forms. This algebra can be used to transform programs and to solve equations whose "unknowns" are programs in much the same way one transforms equations in high school algebra. These transformations are given by algebraic laws and are carried out in the same language in which programs are written. Combining forms are chosen not only for their programming power but also for the power of their associated algebraic laws. General theorems of the algebra give the detailed behavior and termination conditions for large classes of programs.

A new class of computing systems uses the functional programming style both in its programming language and in its state transition rules. Unlike von Neumann languages, these systems have semantics loosely coupled to states--only one state transition occurs per major computation.

Key Words and Phrases: functional programming, algebra of programs, combining forms, functional forms, programming languages, von Neumann computers, von Neumann languages, models of computing systems, applicative computing systems, applicative state transition systems, program transformation, program correctness, program termination, metacomposition

CR Categories: 4.20, 4.29, 5.20, 5.24, 5.26

Introduction

I deeply appreciate the honor of the ACM invitation to give the 1977 Turing Lecture and to publish this account of it with the details promised in the lecture. Readers wishing to see a summary of this paper should turn to Section 16, the last section.

1. Conventional Programming Languages: Fat and Flabby

Programming languages appear to be in trouble. Each successive language incorporates, with a little cleaning up, all the features of its predecessors plus a few more. Some languages have manuals exceeding 500 pages; others cram a complex description into shorter manuals by using dense formalisms. The Department of Defense has current plans for a committee-designed language standard that could require a manual as long as 1,000 pages. Each new language claims new and fashionable features, such as strong typing or structured control statements, but the plain fact is that few languages make programming sufficiently cheaper or more reliable to justify the cost of producing and learning to use them.

Since large increases in size bring only small increases in power, smaller, more elegant languages such as Pascal continue to be popular. But there is a desperate need for a powerful methodology to help us think about programs, and no conventional language even begins to meet that need. In fact, conventional languages create unnecessary confusion in the way we think about programs.

For twenty years programming languages have been steadily progressing toward their present condition of obesity; as a result, the study and invention of programming languages has lost much of its excitement. Instead, it is now the province of those who prefer to work with thick compendia of details rather than wrestle with new ideas. Discussions about programming languages often resemble medieval debates about the number of angels that can dance on the head of a pin instead of exciting contests between fundamentally differing concepts.

Many creative computer scientists have retreated from inventing languages to inventing tools for describing them. Unfortunately, they have been largely content to apply their elegant new tools to studying the warts and moles of existing languages. After examining the appalling type structure of conventional languages, using the elegant tools developed by Dana Scott, it is surprising that so many of us remain passively content with that structure instead of energetically searching for new ones.

The purpose of this article is twofold; first, to suggest that basic defects in the framework of conventional languages make their expressive weakness and their cancerous growth inevitable, and second, to suggest some alternate avenues of exploration toward the design of new kinds of languages.

2. Models of Computing Systems

Underlying every programming language is a model of a computing system that its programs control. Some models are pure abstractions, some are represented by hardware, and others by compiling or interpretive programs. Before we examine conventional languages more closely, it is useful to make a brief survey of existing models as an introduction to the current universe of alternatives. Existing models may be crudely classified by the criteria outlined below.

2.1 Criteria for Models

2.1.1 Foundations. Is there an elegant and concise mathematical description of the model? Is it useful in proving helpful facts about the behavior of the model? Or is the model so complex that its description is bulky and of little mathematical use?

2.1.2 History sensitivity. Does the model include a notion of storage, so that one program can save information that can affect the behavior of a later program? That is, is the model history sensitive?

2.1.3 Type of semantics. Does a program successively transform states (which are not programs) until a terminal state is reached (state-transition semantics)? Are states simple or complex? Or can a "program" be successively reduced to simpler "programs" to yield a final "normal form program," which is the result (reduction semantics)?

2.1.4 Clarity and conceptual usefulness of programs. Are programs of the model clear expressions of a process or computation? Do they embody concepts that help us to formulate and reason about processes?

2.2 Classification of Models

Using the above criteria we can crudely characterize three classes of models for computing systems--simple operational models, applicative models, and von Neumann models.

2.2.1 Simple operational models. Examples: Turing machines, various automata. Foundations: concise and useful. History sensitivity: have storage, are history sensitive. Semantics: state transition with very simple states. Program clarity: programs unclear and conceptually not helpful.

2.2.2 Applicative models. Examples: Church's lambda calculus [5], Curry's system of combinators [6], pure Lisp [17], functional programming systems described in this paper. Foundations: concise and useful. History sensitivity: no storage, not history sensitive. Semantics: reduction semantics, no states. Program clarity: programs can be clear and conceptually useful.

2.2.3 Von Neumann models. Examples: von Neumann computers, conventional programming languages. Foundations: complex, bulky, not useful. History sensitivity: have storage, are history sensitive. Semantics: state transition with complex states. Program clarity: programs can be moderately clear, are not very useful conceptually.

The above classification is admittedly crude and debatable. Some recent models may not fit easily into any of these categories. For example, the data-flow languages developed by Arvind and Gostelow [1], Dennis [7], Kosinski [13], and others partly fit the class of simple operational models, but their programs are clearer than those of earlier models in the class and it is perhaps possible to argue that some have reduction semantics. In any event, this classification will serve as a crude map of the territory to be discussed. We shall be concerned only with applicative and von Neumann models.

3. Von Neumann Computers

In order to understand the problems of conventional programming languages, we must first examine their intellectual parent, the von Neumann computer. What is a von Neumann computer? When von Neumann and others conceived it over thirty years ago, it was an elegant, practical, and unifying idea that simplified a number of engineering and programming problems that existed then. Although the conditions that produced its architecture have changed radically, we nevertheless still identify the notion of "computer" with this thirty year old concept.

In its simplest form a von Neumann computer has three parts: a central processing unit (or CPU), a store, and a connecting tube that can transmit a single word between the CPU and the store (and send an address to the store). I propose to call this tube the von Neumann bottleneck. The task of a program is to change the contents of the store in some major way; when one considers that this task must be accomplished entirely by pumping single words back and forth through the von Neumann bottleneck, the reason for its name becomes clear.

Ironically, a large part of the traffic in the bottleneck is not useful data but merely names of data, as well as operations and data used only to compute such names. Before a word can be sent through the tube its address must be in the CPU; hence it must either be sent through the tube from the store or be generated by some CPU operation. If the address is sent from the store, then its address must either have been sent from the store or generated in the CPU, and so on. If, on the other hand, the address is generated in the CPU, it must be generated either by a fixed rule (e.g., "add 1 to the program counter") or by an instruction that was sent through the tube, in which case its address must have been sent ... and so on.

Surely there must be a less primitive way of making big changes in the store than by pushing vast numbers of words back and forth through the von Neumann bottleneck. Not only is this tube a literal bottleneck for the data traffic of a problem, but, more importantly, it is an intellectual bottleneck that has kept us tied to word-at-a-time thinking instead of encouraging us to think in terms of the larger conceptual units of the task at hand. Thus programming is basically planning and detailing the enormous traffic of words through the von Neumann bottleneck, and much of that traffic concerns not significant data itself but where to find it.

4. Von Neumann Languages

Conventional programming languages are basically high level, complex versions of the von Neumann computer. Our thirty year old belief that there is only one kind of computer is the basis of our belief that there is only one kind of programming language, the conventional--von Neumann--language. The differences between Fortran and Algol 68, although considerable, are less significant than the fact that both are based on the programming style of the von Neumann computer. Although I refer to conventional languages as "von Neumann languages" to take note of their origin and style, I do not, of course, blame the great mathematician for their complexity. In fact, some might say that I bear some responsibility for that problem.

Von Neumann programming languages use variables to imitate the computer's storage cells; control statements elaborate its jump and test instructions; and assignment statements imitate its fetching, storing, and arithmetic.
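The CPU/store/tube picture above can be caricatured in a few lines of Python. This toy model is mine, not the paper's: it simply counts every word, data and address alike, that must cross a single channel while a word-at-a-time program sums a three-element vector.

```python
# Toy model of the von Neumann bottleneck (illustrative sketch only).
# Every word moving between CPU and store crosses one "tube"; we count
# that traffic while summing a vector one word at a time.

class Store:
    def __init__(self, words):
        self.cells = dict(enumerate(words))
        self.traffic = 0           # words (addresses or data) through the tube

    def read(self, address):
        self.traffic += 2          # one address word in, one data word out
        return self.cells[address]

    def write(self, address, word):
        self.traffic += 2          # one address word plus one data word
        self.cells[address] = word

def sum_vector(store, n):
    """Word-at-a-time summation: one fetch through the tube per element."""
    total = 0
    for i in range(n):
        total = total + store.read(i)
    return total

store = Store([6, 5, 4])
print(sum_vector(store, 3))        # 15
print(store.traffic)               # 6 words through the tube for 3 data words
```

Even this trivial program moves twice as many words through the tube as there are data words, and a real machine would also fetch each instruction through the same tube.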
The assignment statement is the von Neumann bottleneck of programming languages and keeps us thinking in word-at-a-time terms in much the same way the computer's bottleneck does.

Consider a typical program; at its center are a number of assignment statements containing some subscripted variables. Each assignment statement produces a one-word result. The program must cause these statements to be executed many times, while altering subscript values, in order to make the desired overall change in the store, since it must be done one word at a time. The programmer is thus concerned with the flow of words through the assignment bottleneck as he designs the nest of control statements to cause the necessary repetitions.

Moreover, the assignment statement splits programming into two worlds. The first world comprises the right sides of assignment statements. This is an orderly world of expressions, a world that has useful algebraic properties (except that those properties are often destroyed by side effects). It is the world in which most useful computation takes place.

The second world of conventional programming languages is the world of statements. The primary statement in that world is the assignment statement itself. All the other statements of the language exist in order to make it possible to perform a computation that must be based on this primitive construct: the assignment statement.

This world of statements is a disorderly one, with few useful mathematical properties. Structured programming can be seen as a modest effort to introduce some order into this chaotic world, but it accomplishes little in attacking the fundamental problems created by the word-at-a-time von Neumann style of programming, with its primitive use of loops, subscripts, and branching flow of control.

Our fixation on von Neumann languages has continued the primacy of the von Neumann computer, and our dependency on it has made non-von Neumann languages uneconomical and has limited their development. The absence of full scale, effective programming styles founded on non-von Neumann principles has deprived designers of an intellectual foundation for new computer architectures. (For a brief discussion of that topic, see Section 15.)

Applicative computing systems' lack of storage and history sensitivity is the basic reason they have not provided a foundation for computer design. Moreover, most applicative systems employ the substitution operation of the lambda calculus as their basic operation. This operation is one of virtually unlimited power, but its complete and efficient realization presents great difficulties to the machine designer. Furthermore, in an effort to introduce storage and to improve their efficiency on von Neumann computers, applicative systems have tended to become engulfed in a large von Neumann system. For example, pure Lisp is often buried in large extensions with many von Neumann features. The resulting complex systems offer little guidance to the machine designer.

5. Comparison of von Neumann and Functional Programs

To get a more detailed picture of some of the defects of von Neumann languages, let us compare a conventional program for inner product with a functional one written in a simple language to be detailed further on.

5.1 A von Neumann Program for Inner Product

   c := 0
   for i := 1 step 1 until n do
      c := c + a[i] × b[i]

Several properties of this program are worth noting:

a) Its statements operate on an invisible "state" according to complex rules.

b) It is not hierarchical. Except for the right side of the assignment statement, it does not construct complex entities from simpler ones. (Larger programs, however, often do.)

c) It is dynamic and repetitive. One must mentally execute it to understand it.

d) It computes word-at-a-time by repetition (of the assignment) and by modification (of variable i).

e) Part of the data, n, is in the program; thus it lacks generality and works only for vectors of length n.

f) It names its arguments; it can only be used for vectors a and b. To become general, it requires a procedure declaration. These involve complex issues (e.g., call-by-name versus call-by-value).

g) Its "housekeeping" operations are represented by symbols in scattered places (in the for statement and the subscripts in the assignment). This makes it impossible to consolidate housekeeping operations, the most common of all, into single, powerful, widely useful operators. Thus in programming those operations one must always start again at square one, writing "for i := ..." and "for j := ..." followed by assignment statements sprinkled with i's and j's.

5.2 A Functional Program for Inner Product

   Def Innerproduct ≡ (Insert +)∘(ApplyToAll ×)∘Transpose

Or, in abbreviated form:

   Def IP ≡ (/+)∘(α×)∘Trans.

Composition (∘), Insert (/), and ApplyToAll (α) are functional forms that combine existing functions to form new ones. Thus f∘g is the function obtained by applying first g and then f, and αf is the function obtained by applying f to every member of the argument. If we write f:x for the result of applying f to the object x, then we can explain each step in evaluating Innerproduct applied to the pair of vectors <<1,2,3>, <6,5,4>> as follows:

   IP: <<1,2,3>, <6,5,4>> =
   Definition of IP          ⇒ (/+)∘(α×)∘Trans: <<1,2,3>, <6,5,4>>
   Effect of composition, ∘  ⇒ (/+):((α×):(Trans: <<1,2,3>, <6,5,4>>))
   Applying Transpose        ⇒ (/+):((α×): <<1,6>, <2,5>, <3,4>>)
   Effect of ApplyToAll, α   ⇒ (/+): <×:<1,6>, ×:<2,5>, ×:<3,4>>

provides a general environment for its changeable features
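The two inner-product programs of Section 5 can be rendered side by side in modern terms. The Python sketch below is my own approximation, not the paper's notation: `compose`, `apply_to_all`, and `insert` stand in for Backus's Composition (∘), ApplyToAll (α), and Insert (/), and sequences become Python lists.

```python
from functools import reduce

# --- von Neumann style: word-at-a-time, explicit state c and subscript i ---
def inner_product_vn(a, b, n):
    c = 0
    for i in range(n):           # "for i := 1 step 1 until n do"
        c = c + a[i] * b[i]      # "c := c + a[i] x b[i]"
    return c

# --- functional style: Def IP = (/+) o (ApplyToAll x) o Trans ---
def compose(f, g):               # Composition: (f o g):x = f:(g:x)
    return lambda x: f(g(x))

def apply_to_all(f):             # ApplyToAll: af applies f to every element
    return lambda xs: [f(x) for x in xs]

def insert(f):                   # Insert: /f folds f across a sequence
    return lambda xs: reduce(f, xs)

trans = lambda pair: list(zip(*pair))   # Transpose of a pair of vectors
add = lambda x, y: x + y
mul = lambda p: p[0] * p[1]

ip = compose(insert(add), compose(apply_to_all(mul), trans))

print(inner_product_vn([1, 2, 3], [6, 5, 4], 3))   # 28
print(ip([[1, 2, 3], [6, 5, 4]]))                  # 28
```

Note how the functional version mirrors the properties claimed in the text: it takes a single argument (the pair of vectors), names no variables, carries no length n, and is assembled entirely from three combining forms.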
two worlds prevents the combining forms in either world from attaining the full power they can achieve in an undivided world.

A second obstacle to the use of combining forms in von Neumann languages is their use of elaborate naming conventions, which are further complicated by the substitution rules required in calling procedures. Each of these requires a complex mechanism to be built into the framework so that variables, subscripted variables, pointers, file names, procedure names, call-by-value formal parameters, call-by-name formal parameters, and so on, can all be properly interpreted. All these names, conventions, and rules interfere with the use of simple combining forms.

8. APL versus Word-at-a-Time Programming

Since I have said so much about word-at-a-time programming, I must now say something about APL [12]. We owe a great debt to Kenneth Iverson for showing us that there are programs that are neither word-at-a-time nor dependent on lambda expressions, and for introducing us to the use of new functional forms. And since APL assignment statements can store arrays, the effect of its functional forms is extended beyond a single assignment.

Unfortunately, however, APL still splits programming into a world of expressions and a world of statements. Thus the effort to write one-line programs is partly motivated by the desire to stay in the more orderly world of expressions. APL has exactly three functional forms, called inner product, outer product, and reduction. These are sometimes difficult to use, there are not enough of them, and their use is confined to the world of expressions.

Finally, APL semantics is still too closely coupled to states. Consequently, despite the greater simplicity and power of the language, its framework has the complexity and rigidity characteristic of von Neumann languages.

9. Von Neumann Languages Lack Useful Mathematical Properties

So far we have discussed the gross size and inflexibility of von Neumann languages; another important defect is their lack of useful mathematical properties and the obstacles they present to reasoning about programs. Although a great amount of excellent work has been published on proving facts about programs, von Neumann languages have almost no properties that are helpful in this direction and have many properties that are obstacles (e.g., side effects, aliasing).

Denotational semantics [23] and its foundations [20, 21] provide an extremely helpful mathematical understanding of the domain and function spaces implicit in programs. When applied to an applicative language (such as that of the "recursive programs" of [16]), its foundations provide powerful tools for describing the language and for proving properties of programs. When applied to a von Neumann language, on the other hand, it provides a precise semantic description and is helpful in identifying trouble spots in the language. But the complexity of the language is mirrored in the complexity of the description, which is a bewildering collection of productions, domains, functions, and equations that is only slightly more helpful in proving facts about programs than the reference manual of the language, since it is less ambiguous.

Axiomatic semantics [11] precisely restates the inelegant properties of von Neumann programs (i.e., transformations on states) as transformations on predicates. The word-at-a-time, repetitive game is not thereby changed, merely the playing field. The complexity of this axiomatic game of proving facts about von Neumann programs makes the successes of its practitioners all the more admirable. Their success rests on two factors in addition to their ingenuity: First, the game is restricted to small, weak subsets of full von Neumann languages that have states vastly simpler than real ones. Second, the new playing field (predicates and their transformations) is richer, more orderly and effective than the old (states and their transformations). But restricting the game and transferring it to a more effective domain does not enable it to handle real programs (with the necessary complexities of procedure calls and aliasing), nor does it eliminate the clumsy properties of the basic von Neumann style. As axiomatic semantics is extended to cover more of a typical von Neumann language, it begins to lose its effectiveness with the increasing complexity that is required.

Thus denotational and axiomatic semantics are descriptive formalisms whose foundations embody elegant and powerful concepts; but using them to describe a von Neumann language cannot produce an elegant and powerful language any more than the use of elegant and modern machines to build an Edsel can produce an elegant and modern car.

In any case, proofs about programs use the language of logic, not the language of programming. Proofs talk about programs but cannot involve them directly since the axioms of von Neumann languages are so unusable. In contrast, many ordinary proofs are derived by algebraic methods. These methods require a language that has certain algebraic properties. Algebraic laws can then be used in a rather mechanical way to transform a problem into its solution. For example, to solve the equation

   ax + bx = a + b

for x (given that a + b ≠ 0), we mechanically apply the distributive, identity, and cancellation laws, in succession, to obtain

   (a + b)x = a + b
   (a + b)x = (a + b)1
   x = 1.
618 Communications August 1978 of Volume 21 the ACM Number 8 Thus we have proved that x = 1 without leaving the past three or four years and have not yet found a "language" of algebra. Von Neumann languages, with satisfying solution to the many conflicting requirements their grotesque syntax, offer few such possibilities for that a good language must resolve. But I believe this transforming programs. search has indicated a useful approach to designing non- As we shall see later, programs can be expressed in von Neumann languages. a language that has an associated algebra. This algebra This approach involves four elements, which can be can be used to transform programs and to solve some summarized as follows. equations whose "unknowns" are programs, in much the a) A functional style of programming without varia- same way one solves equations in high school algebra. bles. A simple, informal functional programming (FP) Algebraic transformations and proofs use the language system is described. It is based on the use of combining of the programs themselves, rather than the language of forms for building programs. Several programs are given logic, which talks about programs. to illustrate functional programming. b) An algebra of functional programs. An algebra is described whose variables denote FP functional pro- 10. What Are the Alternatives to von Neumann grams and whose "operations" are FP functional forms, Languages? the combining forms of FP programs. Some laws of the algebra are given. Theorems and examples are given that Before discussing alternatives to von Neumann lan- show how certain function expressions may be trans- guages, let me remark that I regret the need for the above formed into equivalent infinite expansions that explain negative and not very precise discussion of these lan- the behavior of the function. The FP algebra is compared guages. 
But the complacent acceptance most of us give to these enormous, weak languages has puzzled and disturbed me for a long time. I am disturbed because that acceptance has consumed a vast effort toward making von Neumann languages fatter that might have been better spent in looking for new structures. For this reason I have tried to analyze some of the basic defects of conventional languages and show that those defects cannot be resolved unless we discover a new kind of language framework.

In seeking an alternative to conventional languages we must first recognize that a system cannot be history sensitive (permit execution of one program to affect the behavior of a subsequent one) unless the system has some kind of state (which the first program can change and the second can access). Thus a history-sensitive model of a computing system must have a state-transition semantics, at least in this weak sense. But this does not mean that every computation must depend heavily on a complex state, with many state changes required for each small part of the computation (as in von Neumann languages).

To illustrate some alternatives to von Neumann languages, I propose to sketch a class of history-sensitive computing systems, where each system: a) has a loosely coupled state-transition semantics in which a state transition occurs only once in a major computation; b) has a simply structured state and simple transition rules; c) depends heavily on an underlying applicative system both to provide the basic programming language of the system and to describe its state transitions.

These systems, which I call applicative state transition (or AST) systems, are described in Section 14. These simple systems avoid many of the complexities and weaknesses of von Neumann languages and provide for a powerful and extensive set of changeable parts. However, they are sketched only as crude examples of a vast area of non-von Neumann systems with various attractive properties. I have been studying this area for the ...

... with algebras associated with the classical applicative systems of Church and Curry.

c) A formal functional programming system. A formal (FFP) system is described that extends the capabilities of the above informal FP systems. An FFP system is thus a precisely defined system that provides the ability to use the functional programming style of FP systems and their algebra of programs. FFP systems can be used as the basis for applicative state transition systems.

d) Applicative state transition systems. As discussed above.

The rest of the paper describes these four elements, gives some brief remarks on computer design, and ends with a summary of the paper.

11. Functional Programming Systems (FP Systems)

11.1 Introduction

In this section we give an informal description of a class of simple applicative programming systems called functional programming (FP) systems, in which "programs" are simply functions without variables. The description is followed by some examples and by a discussion of various properties of FP systems.

An FP system is founded on the use of a fixed set of combining forms called functional forms. These, plus simple definitions, are the only means of building new functions from existing ones; they use no variables or substitution rules, and they become the operations of an associated algebra of programs. All the functions of an FP system are of one type: they map objects into objects and always take a single argument.

In contrast, a lambda-calculus based system is founded on the use of the lambda expression, with an associated set of substitution rules for variables, for building new functions.

Communications of the ACM, August 1978, Volume 21, Number 8

The lambda expression (with its substitution rules) is capable of defining all possible computable functions of all possible types and of any number of arguments. This freedom and power has its disadvantages as well as its obvious advantages. It is analogous to the power of unrestricted control statements in conventional languages: with unrestricted freedom comes chaos. If one constantly invents new combining forms to suit the occasion, as one can in the lambda calculus, one will not become familiar with the style or useful properties of the few combining forms that are ...

There is one important constraint in the construction of objects: if x is a sequence with ⊥ as an element, then x = ⊥. That is, the "sequence constructor" is "⊥-preserving." Thus no proper sequence has ⊥ as an element.

Examples of objects

⊥   1.5   φ   AB3   ...
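The ⊥-preserving sequence constructor described above can be sketched in a toy Python encoding (my own illustration, not the paper's notation; `BOTTOM` and `seq` are names I have chosen):

```python
# Toy encoding of FP objects: an object is BOTTOM (the undefined object
# "bottom"), an atom such as a number or symbol, or a tuple of objects.
BOTTOM = "_|_"

def seq(*elems):
    """Sequence constructor; it is bottom-preserving: a sequence that
    would contain BOTTOM as an element collapses to BOTTOM itself."""
    return BOTTOM if BOTTOM in elems else tuple(elems)

print(seq(1, 2, 3))       # (1, 2, 3) -- a proper sequence
print(seq(1, BOTTOM, 3))  # _|_ -- no proper sequence has bottom as an element
```

The point of the encoding is only that strictness is built into construction itself, not into the functions that later consume the sequence.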
Atom

atom:x ≡ x is an atom → T; x ≠ ⊥ → F; ⊥

Equals

eq:x ≡ x = <y,z> & y = z → T; x = <y,z> & y ≠ z → F; ⊥

... and g; f and g are its parameters, and it denotes the function such that, for any object x,

(f∘g):x = f:(g:x).
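The composition form and these two primitives can be rendered in the same toy Python encoding (my own sketch; `BOTTOM`, `compose`, and the sample lambdas are mine, not the paper's definitions):

```python
BOTTOM = "_|_"  # encodes the undefined object, bottom

def compose(f, g):
    """Functional form f.g: (f.g):x = f:(g:x)."""
    return lambda x: f(g(x))

def atom(x):
    """atom:x = x is an atom -> T; x != bottom -> F; bottom"""
    if x == BOTTOM:
        return BOTTOM
    return not isinstance(x, tuple)

def eq(x):
    """eq:x = x is a pair <y,z> with y = z -> T; a pair with y != z -> F; bottom"""
    if isinstance(x, tuple) and len(x) == 2 and BOTTOM not in x:
        return x[0] == x[1]
    return BOTTOM

double = lambda x: 2 * x
succ = lambda x: x + 1
print(compose(double, succ)(3))  # 8, i.e. double(succ(3))
print(atom("A"), atom((1, 2)))   # True False
print(eq((4, 4)), eq((4, 5)))    # True False
```

Note how every function takes exactly one argument and is total only up to bottom, matching the "one type: objects to objects" discipline of FP systems.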
Binary to unary (x is an object parameter)

(bu f x):y ≡ f:<x,y>

... and knows how to apply it. If f is a functional form, then ...
Def IP ≡ (/+)∘(α×)∘trans

The above illustrates the simple rule: to apply a defined symbol, replace it by the right side of its definition. Of course, some definitions may define nonterminating functions. A set D of definitions is well formed if ...

11.3.3 Matrix multiply. This matrix multiplication program yields the product of any pair
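The inner-product definition Def IP ≡ (/+)∘(α×)∘trans can be transcribed into Python, assuming stand-ins of my own for insert (/f), apply-to-all (αf), and trans:

```python
from functools import reduce

def trans(x):
    """trans: <<a1,...,an>, <b1,...,bn>> -> <<a1,b1>, ..., <an,bn>>."""
    return tuple(zip(*x))

def alpha(f):
    """Apply-to-all form (alpha f): maps f over every element of a sequence."""
    return lambda x: tuple(f(e) for e in x)

def insert(f):
    """Insert form /f: folds a binary f through a sequence (a left fold here,
    which agrees with the paper's right-insert for associative f like +)."""
    return lambda x: reduce(f, x)

add = lambda a, b: a + b
mul = lambda p: p[0] * p[1]

# IP = (/+) . (alpha x) . trans
IP = lambda x: insert(add)(alpha(mul)(trans(x)))

print(IP(((1, 2, 3), (6, 5, 4))))  # 1*6 + 2*5 + 3*4 = 28
```

Applied to the pair of vectors <1,2,3> and <6,5,4>, the pipeline transposes, multiplies pairwise, and sums, yielding 28 with no named variables anywhere.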
pi = distl:
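The whole matrix-multiply pipeline, Def MM ≡ ααIP ∘ αdistl ∘ distr ∘ [1, trans∘2] (as given later in Section 12.3), can be sketched the same way; `distl`, `distr`, and the step-by-step staging below are my own encodings:

```python
from functools import reduce

trans = lambda x: tuple(zip(*x))                      # transpose a sequence of rows
IP = lambda x: reduce(lambda a, b: a + b,             # inner product (/+).(alpha x).trans
                      tuple(p[0] * p[1] for p in zip(*x)))

def distl(x):
    """distl: <y, <z1,...,zn>> -> <<y,z1>, ..., <y,zn>>."""
    y, zs = x
    return tuple((y, z) for z in zs)

def distr(x):
    """distr: <<y1,...,yn>, z> -> <<y1,z>, ..., <yn,z>>."""
    ys, z = x
    return tuple((y, z) for y in ys)

def MM(x):
    """MM = (alpha alpha IP) . (alpha distl) . distr . [1, trans.2]."""
    m, n = x
    step1 = (m, trans(n))                   # construction [1, trans.2]
    step2 = distr(step1)                    # pair each row of m with trans(n)
    step3 = tuple(distl(p) for p in step2)  # alpha distl: all (row, column) pairs
    return tuple(tuple(IP(p) for p in row)  # alpha alpha IP: inner product of each
                 for row in step3)

A = ((1, 2), (3, 4))
B = ((5, 6), (7, 8))
print(MM((A, B)))  # ((19, 22), (43, 50))
```

Every "housekeeping" operation (transposing, distributing rows against columns) is a general operator, so the program works for conformable matrices of any size without naming its arguments.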
so that we now have a solution whose right side is free of f's? Such solutions are helpful in two ways: first, they give proofs of "termination" in the sense that (4) means that f:x is defined if and only if there is an n such that, for every i less than n, pi:x = F and pn:x = T and qn:x is defined. Second, (4) gives a case-by-case description of f that can often clarify its behavior.

The foundations for the algebra given in a subsequent section are a modest start toward the goal stated above. For a limited class of equations its "linear expansion theorem" gives a useful answer as to when one can go from indefinitely extendable equations like (1) to infinite expansions like (4). For a larger class of equations, a more general "expansion theorem" gives a less helpful answer to similar questions. Hopefully, more powerful theorems covering additional classes of equations can be found. But for the present, one need only know the conclusions of these two simple foundational theorems in order to follow the theorems and examples appearing in this section. The results of the foundations subsection are summarized in a separate, earlier subsection titled "expansion theorems," without reference to fixed point concepts.

defined∘g →→ 1∘[f, g] ≡ f

to indicate that the law (or theorem) on the right holds within the domain of objects x for which defined∘g:x = T. Where

Def defined ≡ T̄

i.e. defined:x ≡ x = ⊥ → ⊥; T. In general we shall write a qualified functional equation:

p →→ f ≡ g

to mean that, for any object x, whenever p:x = T, then f:x = g:x.

Ordinary algebra concerns itself with two operations, addition and multiplication; it needs few laws. The algebra of programs is concerned with more operations (functional forms) and therefore needs more laws. Each of the following laws requires a corresponding proposition to validate it. The interested reader will find most proofs of such propositions easy (two are given below). We first define the usual ordering on functions and equivalence in terms of this ordering:

DEFINITION. f ≤ g iff for all objects x, either f:x = ⊥, or f:x = g:x; and f ≡ g iff f ≤ g and g ≤ f.
where f&g ≡ and∘[f, g]; pair ≡ atom → F̄; eq∘[length, 2̄]

II Composition and condition (right-associated parentheses omitted) (Law II.2 is noted in Manna et al. [16], p. 493.)

II.1 (p → f; g)∘h ≡ p∘h → f∘h; g∘h
II.2 h∘(p → f; g) ≡ p → h∘f; h∘g

PROPOSITION 2

pair & not∘null∘1 →→ apndl∘[[1², 2], distr∘[tl∘1, 2]] ≡ distr

where f&g is the function and∘[f, g], and f² ≡ f∘f.

PROOF. We show that both sides produce the same result when applied to any pair
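Law II.1, (p → f; g)∘h ≡ p∘h → f∘h; g∘h, is easy to spot-check by evaluation; the encoding of the condition form and the sample functions p, f, g, h below are my own, chosen only for illustration:

```python
def cond(p, f, g):
    """Condition form (p -> f; g): apply f when p:x is true, else g."""
    return lambda x: f(x) if p(x) else g(x)

compose = lambda f, g: (lambda x: f(g(x)))

# Arbitrary sample functions (assumptions for the check, not from the paper).
p = lambda x: x > 0
f = lambda x: ("pos", x)
g = lambda x: ("nonpos", x)
h = lambda x: x - 5

lhs = compose(cond(p, f, g), h)                          # (p -> f; g) . h
rhs = cond(compose(p, h), compose(f, h), compose(g, h))  # p.h -> f.h; g.h

print(all(lhs(x) == rhs(x) for x in range(-10, 11)))  # True
```

Sampling is of course no proof; the point is that each law is an executable identity between programs, which is what makes the algebra usable for program transformation.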
IV Condition and construction 12.3 Example: Equivalence of Two Matrix IV.l [fi ..... (p--~g;h) ..... fn] Multiplication Programs - p---~ [fi ..... g ..... fn]; [fl ..... h ..... fi~] We have seen earlier the matrix multiplication pro- IV.l.1 [fl ..... (pl --~ gl; ... ;pn ---> gn; h) ..... fm] gram: =pl ~ [J~ ..... gl ..... fm]; Def MM - aaIP o adistl o distr o [1, transo2]. •.-;pn "-'> [fi ..... g...... fm]; Ill ..... h ..... fm] We shall now show that its initial segment, MM', where This concludes the present list of algebraic laws; it is by Def MM' - aaIP o adistl o distr, no means exhaustive, there are many others. can be defined recursively. (MM' "multiplies" a pair of matrices after the second matrix has been transposed. Proof of two laws Note that MM', unlike MM, gives A_ for all arguments We give the proofs of validating propositions for laws that are not pairs.) That is, we shall show that MM' I. 10 and I. 11, which are slightly more involved than most satisfies the following equation which recursively defines of the others. the same function (on pairs): PROPOSITION 1 f-= null o 1 ~ q~; apndlo [alpodistlo [1 o 1, 2], fo [tlo 1, 2]]. apndl o [fog, afoh] ~ af o apndl o [g,h] Our proof will take the form of showing that the follow- PROOF. We show that, for every object x, both of the ing function, R, above functions yield the same result. Def R m null o 1 --~ 6; CASE 1. h:x is neither a sequence nor q,. apndlo[aIpodistlo[l o 1, 2], MM'o[tlo 1, 2]] Then both sides yield ± when applied to x. CASE 2. h:x = ~. Then is, for all pairs
626. Communicat; ~ns August 1978 of Volume 21 the ACM Number 8 Def MM' - aalP o adistl o distr where the pi's and qi's are particular functions, so that E Def R -- nullo 1 ~ ~; has the property: apndlo[alpodistlo[12, 2], MM'o[tlo 1, 2]] E(fi) -fi+l for i = 0, 1, 2 .... (E3) PROOF. Then we say that E is expansive and has the jS's as CASE 1. pair&nullol • ~MM'-=R. approximating functions. pair&nullol > >R--=6 bydefofR If E is expansive and has approximating functions as pair & nullo 1 ---~---~ MM' - in (E2), and iff is the solution of (El), thenf can be since distr: <$,x> = $ by def of distr written as the infinite expansion and aj~ = $ by def of Apply to all. f-po--* qo; ... ;pn ---> qn; ... (E4) And so: aaIP o adistl o distr: <~,x> = q~. Thus pair & nullo 1 ---~---~ MM' - R. meaning that, for any x, fix # ± iff there is an n _> 0 CASE 2. pair & notonullo I ~ MM' -- R. such that (a)pi:x = F for all i < n, and (b)pn:x = T, and (c) qn:X # _l_. Whenf:x # ±, thenf.'x = qn:X for this n. pair & notonullo 1 ---~---~ R - R', (l) (The foregoing is a consequence of the "expansion theo- rem".) by def of R and R', where 12.4.2 Linear expansion. A more helpful tool for Def R' - apndlo[alPodistlo[12, 2], MM'o[tlo 1, 2]]. solving some equations applies when, for any function h,
We note that E(h) - p0 ---, q0; El(h) (LEI) R' -- apndlo[fog, afoh] and there exist pi and qi such that
where El(pi ---> qi; h) = pi+l ~ qi+l; El(h) for i = 0, 1, 2 .... (LE2) f-= aIpodistl g - [12, 21 and h =- distro[tlo 1, 2] E,(i) - _[_. (LE3) af- a(alpodistl) = aalpoadistl (by 111.4). (2) Under the above conditions E is said to be linearly Thus, by I. 10, expansive. If so, andf is the solution of R '= afoapndlo[g,h]. (3) f ~- E(f) (LE4) Now apndlo[g,h] -= apndlo[[l 2, 2], distro[tlo 1, 2]], then E is expansive and f can again be written as the thus, by I. 11, infinite expansion pair & notonullo I ---~---> apndlo[g,h] = distr. (4) f=-po--, q0;'... ;pn "-> qn; ... (LE5) And so we have, by (1), (2), (3) and (4), using the pi's and qi's generated by (LE 1) and (LE2). pair & notonullo 1 ---~---~ R - R' Although the pi's and qi's of (E4) or (LE5) are not - afodistr - aaIPoadistlodistr - MM'. unique for a given function, it may be possible to find Case l and Case 2 together prove the theorem. [] additional constraints which would make them so, in which case the expansion (LE5) would comprise a can- onical form for a function. Even without uniqueness 12.4 Expansion Theorems these expansions often permit one to prove the equiva- In the following subsections we shall be "solving" lence of two different function expressions, and they some simple equations (where by a "solution" we shall often clarify a function's behavior. mean the "least" function which satisfies an equation). To do so we shall need the following notions and results drawn from the later subsection on foundations of the 12.5 A Recursion Theorem algebra, where their proofs appear. Using three of the above laws and linear expansion, 12.4.1 Expansion. Suppose we have an equation of one can prove the following theorem of moderate gen- the form erality that gives a clarifying expansion for many recur- sively defined functions. f- E(f) (El) RECURSION THEOREM: Letf be a solution of where E(f) is an expression involvingf. 
Suppose further that there is an infinite sequence of functions fi for i = 0, 1, 2, ..., each having the following form:

f0 ≡ ⊥̄
fi+1 ≡ p0 → q0; ... ; pi → qi; ⊥̄   (E2)

f ≡ p → g; Q(f)   (1)

where

Q(k) ≡ h∘[i, k∘j] for any function k   (2)

and p, g, h, i, j are any given functions, then
627 Communications August 1978 of Volume 2 l the ACM Number 8 f-p---> ~,,poj-. Q(g); ... ;pojn'-> Q~(g); ... (3) Using these results for ios ~, eq0os ~, and Qn(~) in the previous expansion for f, we obtain (where Q~(g) is ho[i, Qn-a(g)°J], and j~ is joj n-1 for n >__ 2) and fix - x=O--~ 1; ... ; x=n --~n×(n- l) x...x Ix l;... Qn(g) ___/h o[i, ioj, .... ioj n-a, gojn]. (4) Thus we have proved thatf terminates on precisely the PROOF. We verify thatp --> g;, Q(f) is linearly expansive. set of nonnegative integers and that it is the factorial Let p~, qn and k be any functions. Then function thereon. Q(pn ~ qn, k) - ho[i, (pn ---~ q~; k)°j] by (2) 12.6 An Iteration Theorem - ho[i, (pn°j--~ qn°j; koj)] by II.l This is really a corollary of the recursion theorem. It - ho(p~oj---~ [i, qnOj]; [i, koj]) by IV.l gives a simple expansion for many iterative programs. - p~oj---~ ho[i, q~oj]; ho[i, koj] by II.2 ITERATION THEOREM: Letf be the solution (i.e., the least -p, oj-~ Q(q~); Q(k) by (2) (5) solution) of Thus ifpo -p and qo - g, then (5) gives px -poj and f - p---~ g;, hofok ql = Q(g) and in general gives the following functions satisfying (LE2) then pn -p°j n and qn -~ Qn(g). (6) f =_ p .-o g; pok ~ hogok; ... ; pok n --o hnogok~; ...
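The iterative factorial treated in 12.6.1, f ≡ eq0∘1 → 2; f∘[s∘1, ×], is exactly this shape; a direct Python transcription (my own encoding, with pairs <n, acc> as tuples) makes the expansion concrete:

```python
def f(x):
    """f = eq0.1 -> 2 ; f.[s.1, x], applied to a pair (n, acc)."""
    n, acc = x
    if n == 0:                    # eq0.1: test the first component against 0
        return acc                # selector 2: return the accumulator
    return f((n - 1, n * acc))    # f.[s.1, x]: decrement n, multiply acc by n

print(f((5, 1)))  # 120, i.e. 5!
```

Unwinding the recursion reproduces the theorem's expansion: f:<n,1> is defined precisely when n is a nonnegative integer, and then equals n!.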
Finally, PROOF. Let h' - ho2, i' =- id, f =- k, then Q(i) -- ho[i, ioj] f -- p ---> g; h' o[i', foj'] - ho[i, &] by III.l.l since ho2o[id, fok] - hofok by 1.5 (id is defined except = ho~ by 1.9 for A_, and the equation holds for _1_). Thus the recursion ~- i by III.1.1. (7) theorem gives Thus (5) and (6) verify (LE2) and (7) verifies (LE3), with f_p___> g; ... ;pokn _.> Qnfg); ... E1 -- Q. If we let E(--f) -= p ~ g; Q(f), then we have (LE1); thus E is linearly expansive. Since f is a solution where off-- E(f), conclusion (3) follows from (6) and (LE5). Qn(g) _ ho2o[id, Qn-l(g)ok] Now =_ hoOn-t(g)ok =__ hnogok n Q~(g) = ho[i, Qn-~(g)°J] byI.5 [] - ho[i, ho[ioj, .... ho[ioj n-~, goj n] ... ]] 12.6.1 Example: Correctness proof for an iterative factorial function. Letf be the solution of by I. 1, repeatedly f- eq0ol --> 2;fo[so 1, ×] -/ho[i, ioj, .... iojn-l,'g°j ~] by 1.3 (8) where Def s - -o[id, i] (substract 1). We want to prove Result (8) is the second conclusion (4). [] thatf.'
628 Communications August 1978 of Volume 21 the ACM Number 8 pair-->--* k ~+~ =- [soan, Xo[an, bn]] by 1.1 and 1.5 (6) since pn and p" provide the qualification needed for q~ -- q" - h%2. To pass from (5) to (6) we must check that whenever an Now suppose there is an x such thatfx # g:x. Then or bn yield £ in (5), so will the right side of (6). Now there is an n such that pi:x = p¢:x = F for i < n, and p,:x soan =- s n+l° 1 -- an+l (7) # p~:x. From (12) and (13) this can only happen when ×o[an, b.] -/× ° [s~° 1, s n-lo 1..... so 1, 1, 2] h%2:x = ±. But since h is ±-preserving, hmo2:x = I for - b.+l by 1.3. (8) all m _> n. Hencef:x = g:x = i by (14) and (15). This contradicts the assumption that there is an x for which Combining (6), (7), and (8) gives fix # g:x. Hence f- g. pair--->---> k ~+~ - [an+l, bn+l]. (9) This example (by J. H. Morris, Jr.) is treated more elegantly in [16] on p. 498. However, some may find that Thus (2) holds for n = 1 and holds for n + 1 whenever the above treatment is more constructive, leads one more it holds for n, therefore, by induction, it holds for every mechanically to the key questions, and provides more n _> 1. Now (2) gives, for pairs: insight into the behavior of the two functions. definedok n --,--. pok n = eq0o l o[an, bn] - eq0oan = eq0os"o 1 (10) 12.7 Nonlinear Equations defmedok n ---~--~ gok ~ The preceding examples have concerned "linear" ---- 2°[an, bn] --/× o [s~-X°l ..... sol, 1, 2] (11) equations (in which the "unknown" function does not have an argument involving itself). The question of the (both use 1.5). Now (1) tells us thatf.'
629 Communications August 1978 of Volume 21 the ACM Number 8 = f'.(foh:x) =f.'(f.'x) by (7) DEFINITION. Let E(f) be a function expression satisfying = f2:x. the following: Ifp:x iS neither T nor F, thenf.'x -- ± =f2:x. Thus E(h) -po---~ qo; El(h) for all h E F (EEl) f_f2. where pi E F and qi E F exist such that El(pi ~ qi; h) - pi+l ~ qi+l; El(h) 12.8 Foundations for the Algebra of Programs for all h E F and i = 0, 1.... (LE2) Our purpose in this section is to establish the validity of the results stated in Section 12.4. Subsequent sections and do not depend on this one, hence it can be skipped by EI(_T_) -= &. (LE3) readers who wish to do so. We use the standard concepts and results from [16], but the notation used for objects Then E is said to be linearly expansive with respect to and functions, etc., will be that of this paper. these pi's and qi's. We take as the domain (and range) for all functions LINEAR EXPANSION THEOREM: Let E be linearly expansive the set O of objects (which includes ±) of a given FP with respect to pi and qi, i = 0, 1..... Then E is expansive system. We take F to be the set of functions, and F to be with approximating functions the set of functional forms of that FP system. We write E(f) for any function expression involving functional fo- i (1) forms, primitive and defined functions, and the function f,+l -po ~ q0; ... ;pi ~ qi; i. (2) symbol f, and we regard E as a functional that maps a PROOF. We want to show that E(fi) =-fi+~ for any i _ 0. function f into the corresponding function E(f). We Now assume that all f ~ F are &-preserving and that all functional forms in F correspond to continuous function- E(fo) = p0---~ qo; E~ (i) ----p0---~ q0; & --fi (3) als in every variable (e.g., [f, g] is continuous in bothf by (LE1) (LE3) (1). and g). (All primitive functions of the FP system given Let i > 0 be fLxed and let earlier are _L-preserving, and all its functional forms are continuous.) 
fi = po ~ qo; W1 (4a) W1 ~ px ~ ql; W2 (4b) DEFINITIONS. Let E(f) be a function expression. Let etc. fo-=£ Wi--1 ~ pi-1 ~ qi--1; ~. (4-) fi+x ----po ~ qo; ... ; pi ~ qi; J- for i = 0, 1.... Then, for this i > 0 where pi, qi E F. Let E have the property that E(fi) - p0 ~ q0; El(fi) by (LE1) E(fi)----fi+~ fori=0,1 ..... E~(fi) - pl ~ ql; El(Wa) by (LE2) and (4a) El(w~) - p2 --~ q2; E~(w2) by (LE2) and (4b) Then E is said to be expansive with the approximating functionsfi. We write etc. E~(wi-~) -= pi ~ qi; E~ (i) by (LE2) and (4-) f=po---~ q0; ... ;pn---~ qn;-.. - pi --~ qi; A- by (LE3) to mean that f = limi{fi}, where the fi have the form above. We call the right side an infinite expansion off. Combining the above gives We takef.'x to be defined iff there is an n _> 0 such that E(fi) -f+l for arbitrary i > 0, by (2). (5) (a) pi:x = F for all i < n, and (b) p,:x = T, and (c) qn:x By (3), (5) also holds for i -- 0; thus it holds for all i >__ 0. is defined, in which casef.'x = qn:X. Therefore E is expansive and has the required approxi- EXPANSION THEOREM: Let E(f) be expansive with ap- mating functions. [] proximating functions as above. Let f be the least func- COROLLARY. If E is linearly expansive with respect to pi tion satisfying and qi, i = 0, 1..... andfis the least function satisfying f~ E(f). f---- E(f) (LE4) Then then f-- p0 ~ q0; ... ; p, ~ qn; ... f - po ~ qo; ... ; pn ---~ qn; .... (LE5) PROOF. Since E is the composition of continuous func- tionals (from F) involving only monotonic functions (_l_-preserving functions from F) as constant terms, E is 12.9 The Algebra of Programs for the Lambda Calculus continuous ([16] p. 493). Therefore its least fixed pointf and for Combinators is limi{Ei(j-)} -= limi(fi} ([16] p. 494), which by defmition Because Church's lambda calculus [5] and the system is the above inf'mite expansion forf. [] of combinators developed by Sch6nfinkel and Curry [6]
630 Communications August 1978 of Volume 21 the ACM Number 8 are the primary mathematical systems for representing of these complexities. (The Church-Rosser theorem, or the notion of application of functions, and because they Scott's proof of the existence of a model [22], is required are more powerful than FP systems, it is natural to to show that the lambda calculus has a consistent seman- enquire what an algebra of programs based on those tics.) The defmition of pure Lisp contained a related systems would look like. error for a considerable period (the "funarg" problem). The lambda calculus and combinator equivalents of Analogous problems attach to Curry's system as well. FP composition, fog, are In contrast, the formal (FFP) version of FP systems (described in the next section) has no variables and only hfgx.(f(gx)) --- B an elementary substitution rule (a function for its name), where B is a simple combinator defined by Curry. There and it can be shown to have a consistent semantics by a is no direct equivalent for the FP object 631 Communications August 1978 of Volume 21 the ACM Number 8 13.2 Syntax sible a function, apply, which is meaningless in FP We describe the set O of objects and the set E of systems: expre.,;sions of an FFP system. These depend on the apply: 632 Communications August 1978 of Volume 21 the ACM Number 8 position is that it permits the definition of new functional -- Ix(pMLAST:< = aapply:< 633 Communications August 1978 of Volume 2 I the ACM Number 8 Store and push, pop, purge pendent on D; if Tatom:D # # (that is, atom is defined Like fetch, store takes an object n as its parameter; it in D) then its meaning is It(c:x), where c = Tatom:D, the is written J,n ("store n"). When applied to a pair 634 Communications August 1978 of Volume 2 l the ACM Number 8 their programming language and the description of their not depend on a host of baroque protocols for commu- state transitions. 
The underlying applicative system of the AST system described below is an FFP system, but other applicative systems could also be used.

To understand the reasons for the structure of AST systems, it is helpful first to review the basic structure of a von Neumann system, Algol, observe its limitations, and compare it with the structure of AST systems. After that review a minimal AST system is described; a small, top-down, self-protecting system program for file maintenance and running user programs is given, with directions for installing it in the AST system and for running an example user program. The system program uses "name functions" instead of conventional names and the user may do so too. The section concludes with subsections discussing variants of AST systems, their general properties, and naming systems.

... not depend on a host of baroque protocols for communicating with the state, and we want to be able to make large transformations in the state by the application of general functions. AST systems provide one way of achieving these goals. Their semantics has two protocols for getting information from the state: (1) get from it the definition of a function to be applied, and (2) get the whole state itself. There is one protocol for changing the state: compute the new state by function application. Besides these communications with the state, AST semantics is applicative (i.e. FFP). It does not depend on state changes because the state does not change at all during a computation. Instead, the result of a computation is output and a new state. The structure of an AST state is slightly restricted by one of its protocols: It must be possible to identify a definition (i.e. cell) in it. Its structure, a sequence, is far simpler than that of the Algol state.
14.2 The Structure of Algol Compared to That of AST Thus the structure of AST systems avoids the com- Systems plexity and restrictions of the von Neumann state (with An Algol program is a sequence of statements, each its communications protocols) while achieving greater representing a transformation of the Algol state, which power and freedom in a radically different and simpler is a complex repository of information about the status framework. of various stacks, pointers, and variable mappings of identifiers onto values, etc. Each statement communi- 14.3 Structure of an AST System cates with this constantly changing state by means of An AST system is made up of three elements: complicated protocols peculiar to itself and even to its 1) An applicative subsystem (such as an FFP system). different parts (e.g., the protocol associated with the 2) A state D that is the set of definitions of the variable x depends on its occurrence on the left or right applicative subsystem. of an assignment, in a declaration, as a parameter, etc.). 3) A set of transition rules that describe how inputs It is as if the Algol state were a complex "store" that are transformed into outputs and how the state D is communicates with the Algol program through an enor- changed. mous "cable" of many specialized wires. The complex The programming language of an AST system is just communications protocols of this cable are fixed and that of its applicative subsystem. (From here on we shall include those for every statement type. The "meaning" assume that the latter is an FFP system.) Thus AST of an Algol program must be given in terms of the total systems can use the FP programming style we have effect of a vast number of communications with the state discussed. The applicative subsystem cannot change the via the cable and its protocols (plus a means for identi- state D and it does not change during the evaluation of fying the output and inserting the input into the state). an expression. 
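The three elements above can be caricatured in a few lines of Python. This is entirely my own toy, not the paper's FFP machinery: the state D is a dict of definitions, `SYSTEM` is the distinguished system program, and one input causes exactly one transition yielding <output, new state>:

```python
def SYSTEM(D, x):
    """The system program: a pure function (state, input) -> (output, new state)."""
    name, value = x
    new_D = dict(D)              # the new state is computed whole, never updated in place
    new_D[name] = value
    return "stored " + name, new_D

def ast_step(D, x):
    """One major computation = one state transition."""
    return D["SYSTEM"](D, x)     # protocol: fetch SYSTEM's definition from the state

D0 = {"SYSTEM": SYSTEM}          # initial state containing only the system program
out1, D1 = ast_step(D0, ("pi", 3.14159))
out2, D2 = ast_step(D1, ("e", 2.71828))
print(out1, out2)                # stored pi stored e
print(sorted(D2))                # ['SYSTEM', 'e', 'pi']
```

Because nothing mutates during evaluation, there are no side effects inside a transition; the "state change" is simply the replacement of D by the freshly computed new state.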
A new state is computed along with output By comparison with this massive cable to the Algol and replaces the old state when output is issued. (Recall state/store, the cable that is the von Neumann bottleneck that a set of definitions D is a sequence of cells; a cell of a computer is a simple, elegant concept. name is the name of a defined function and its contents Thus Algol statements are not expressions represent- is the defining expression. Here, however, some cells ing state-to-state functions that are built up by the use of may name data rather than functions; a data name n will orderly combining forms from simpler state-to-state be used in l'n (fetch n) whereas a function name will be functions. Instead they are complex messages with con- used as an operator itself.) text-dependent parts that nibble away at the state. Each We give below the transition rules for the elementary part transmits information to and from the state over the AST system we shall use for examples of programs. cable by its own protocols. There is no provision for These are perhaps the simplest of many possible transi- applying general functions to the whole state and thereby tion rules that could determine the behavior of a great making large changes in it. The possibility of large, variety of AST systems. powerful transformations of the state S by function 14.3.1 Transition rules for an elementary AST sys- application, S---, f.'S, is in fact inconceivable in the von tem. When the system receives an input x, it forms the Neumann--cable and protocol--context: there could be application (SYSTEM:x) and then proceeds to obtain its no assurance that the new state f:S would match the meaning in the FFP subsystem, using the current state cable and its fLxed protocols unless f is restricted to the D as the set of definitions. SYSTEM is the distinguished tiny changes allowed by the cable in the first place. name of a function defined in D (i.e. 
it is the "system We want a computing system whose semantics does program"). Normally the result is a pair 635 Communications August 1978 of Volume 21 the ACM Number 8 #(SYSTEM:x) = 636 Communications August 1978 of Volume 21 the ACM Number 8 System-change inputs. When and the FP function is the same as that represented by the FFP object, provided that update = pUPDA TE and is-system-change o~ KE Y:operand = COMP, STORE, and CONS represent the controlling is-system-change:key = T, functions for composition, store, and construction. key specifies that the user is authorized to make a system All FP definitions needed for our system can be change and that input = ~INPUT:operand represents a converted to cells as indicated above, giving a sequence functionfthat is to be applied to D to produce the new Do. We assume that the AST system has an empty state statef:D. (Of coursef:D can be a useless new state; no to start with, hence SYSTEM is not defined. We want to constraints are placed on it.) The output is a report, define SYSTEM initially so that it will install its next namely report-change: . input as the state; having done so we can then input Do Expression inputs. When is-expression:key = T, the and all our definitions will be installed, including our system understands that the output is to be the meaning program--system--itseff. To accomplish this we enter of the FFP expression input; ~INPUT:operand produces our first input it and it is evaluated, as are all expressions. The state is 637 Communications August 1978 of Volume 2 l the ACM Number 8 Our miniature system program has no provision for d) By defining appropriate functions one can, I be- giving control to a user's program to process many lieve, introduce major new features at any time, using inputs, but it would not be difficult to give it that the same framework. Such features must be built into capability while still monitoring the user's program with the framework of avon Neumann language. 
I have in the option of taking control back. mind such features as: "stores" with a great variety of naming systems, types and type checking, communicat- 14.5 Variants of AST Systems ing parallel processes, nondeterminacy and Dijkstra's A major extension of the AST systems suggested "guarded command" constructs [8], and improved meth- abow; would provide combining forms, "system forms," ods for structured programming. for building a new AST system from simpler, component e) The framework of an AST system comprises the AST systems. That is, a system form would take AST syntax and semantics of the underlying applicative sys- systems as parameters and generate a new AST system, tem plus the system framework sketched above. By just as a functional form takes functions as parameters current standards, this is a tiny framework for a language and generates new functions. These system forms would and is the only fixed part of the system. have properties like those of functional forms and would become the "operations" of a useful "algebra of systems" 14.7 Naming Systems in AST and von Neumann in much the same way that functional forms are the Models "operations" of the algebra of programs. However, the In an AST system, naming is accomplished by func- problem of finding useful system forms is much more tions as indicated in Section 13.3.3. Many useful func- difficult, since they must handle RESETS, match inputs tions for altering and accessing a store can be defined and outputs, and combine history-sensitive systems (e.g. push, pop, purge, typed fetch, etc.). All these defi- rather than fixed functions. nitions and their associated naming systems can be in- Moreover, the usefulness or need for system forms is troduced without altering the AST framework. Different less clear than that for functional forms. 
The latter are kinds of "stores" (e.g., with "typed cells") with individual essential for building a great variety of functions from naming systems can be used in one program. A cell in an initial primitive set, whereas, even without system one store may contain another entire store. forms, the facilities for building AST systems are already The important point about AST naming systems is so rich that one could build virtually any system (with that they utilize the functional nature of names (Rey- the general input and output properties allowed by the nolds' OEDANr~N [19] also does so to some extent within given AST scheme). Perhaps system forms would be avon Neumann framework). Thus name functions can useful for building systems with complex input and be composed and combined with other functions by output arrangements. functional forms. In contrast, functions and names in von Neumann languages are usually disjoint concepts 14.6 Remarks About AST Systems and the function-like nature of names is almost totally As I have tiled to indicate above, there can be concealed and useless, because a) names cannot be ap- innumerable variations in the ingredients of an AST plied as functions; b) there are no general means to system--how it operates, how it deals with input and combine names with other names and functions; c) the output, how and when it produces new states, and so on. objects to which name functions apply (stores) are not In any case, a number of remarks apply to any reasonable accessible as objects. AST system: The failure of von Neumann languages to treat a) A state transition occurs once per major computa- names as functions may be one of their more important tion and can have useful mathematical properties. State weaknesses. 
In any case, the ability to use names as transitions are not involved in the tiniest details of a functions and stores as objects may turn out to be a computation as in conventional languages; thus the lin- useful and important programming concept, one which guistic yon Neumann bottleneck has been eliminated. should be thoroughly explored. No complex "cable" or protocols are needed to com- municate with the state. b) Programs are written in an applicative language 15. Remarks About Computer Design that can accommodate a great range of changeable parts, parts whose power and flexibility exceed that of any von The dominance of von Neumann languages has left Neumann language so far. The word-at-a-time style is designers with few intellectual models for practical com- replaced by an applicative style; there is no division of puter designs beyond variations of the von Neumann programming into a world of expressions and a world of computer. Data flow models [1] [7] [13] are one alterna- statements. Programs can be analyzed and optimized by tive class of history-sensitive models. The substitution an algebra of programs. rules of lambda-calculus based languages present serious c) Since the state cannot change during the compu- problems for the machine designer. Berkling [3] has tation of system:x, there are no side effects. Thus inde- developed a modified lambda calculus that has three pendent applications can be evaluated in parallel. kinds of applications and that makes renaming of vail- 638 Communications August 1978 of Volume 21 the ACM Number 8 ables unnecessary. He has developed a machine to eval- programming style of the von Neumann computer. Thus uate expressions of this language. Further experience is variables = storage cells; assignment statements = fetch- needed to show how sound a basis this language is for ing, storing, and arithmetic; control statements = jump an effective programming style and how efficient his and test instructions. 
Magó [15] has developed a novel applicative machine built from identical components (of two kinds). It evaluates, directly, FP-like and other applicative expressions from the bottom up. It has no von Neumann store and no address register, hence no bottleneck; it is capable of evaluating many applications in parallel; its built-in operations resemble FP operators more than von Neumann computer operations. It is the farthest departure from the von Neumann computer that I have seen.

There are numerous indications that the applicative style of programming can become more powerful than the von Neumann style. Therefore it is important for programmers to develop a new class of history-sensitive models of computing systems that embody such a style and avoid the inherent efficiency problems that seem to attach to lambda-calculus based systems. Only when these models and their applicative languages have proved their superiority over conventional languages will we have the economic basis to develop the new kind of computer that can best implement them. Only then, perhaps, will we be able to fully utilize large-scale integrated circuits in a computer design not limited by the von Neumann bottleneck.

16. Summary

The fifteen preceding sections of this paper can be summarized as follows.

Section 1. Conventional programming languages are large, complex, and inflexible. Their limited expressive power is inadequate to justify their size and cost.

Section 2. The models of computing systems that underlie programming languages fall roughly into three classes: (a) simple operational models (e.g., Turing machines), (b) applicative models (e.g., the lambda calculus), and (c) von Neumann models (e.g., conventional computers and programming languages). Each class of models has an important difficulty: The programs of class (a) are inscrutable; class (b) models cannot save information from one program to the next; class (c) models have unusable foundations and programs that are conceptually unhelpful.

Section 3. Von Neumann computers are built around a bottleneck: the word-at-a-time tube connecting the CPU and the store. Since a program must make its overall change in the store by pumping vast numbers of words back and forth through the von Neumann bottleneck, we have grown up with a style of programming that concerns itself with this word-at-a-time traffic through the bottleneck rather than with the larger conceptual units of our problems.

Section 4. Conventional languages are based on the programming style of the von Neumann computer. Thus variables = storage cells; assignment statements = fetching, storing, and arithmetic; control statements = jump and test instructions. The symbol ":=" is the linguistic von Neumann bottleneck. Programming in a conventional von Neumann language still concerns itself with the word-at-a-time traffic through this slightly more sophisticated bottleneck. Von Neumann languages also split programming into a world of expressions and a world of statements; the first of these is an orderly world, the second is a disorderly one, a world that structured programming has simplified somewhat, but without attacking the basic problems of the split itself and of the word-at-a-time style of conventional languages.

Section 5. This section compares a von Neumann program and a functional program for inner product. It illustrates a number of problems of the former and advantages of the latter: e.g., the von Neumann program is repetitive and word-at-a-time, works only for two vectors named a and b of a given length n, and can only be made general by use of a procedure declaration, which has complex semantics. The functional program is nonrepetitive, deals with vectors as units, is more hierarchically constructed, is completely general, and creates "housekeeping" operations by composing high-level housekeeping operators. It does not name its arguments, hence it requires no procedure declaration.

Section 6. A programming language comprises a framework plus some changeable parts. The framework of a von Neumann language requires that most features must be built into it; it can accommodate only limited changeable parts (e.g., user-defined procedures) because there must be detailed provisions in the "state" and its transition rules for all the needs of the changeable parts, as well as for all the features built into the framework. The reason the von Neumann framework is so inflexible is that its semantics is too closely coupled to the state: every detail of a computation changes the state.

Section 7. The changeable parts of von Neumann languages have little expressive power; this is why most of the language must be built into the framework. The lack of expressive power results from the inability of von Neumann languages to effectively use combining forms for building programs, which in turn results from the split between expressions and statements. Combining forms are at their best in expressions, but in von Neumann languages an expression can only produce a single word; hence expressive power in the world of expressions is mostly lost. A further obstacle to the use of combining forms is the elaborate use of naming conventions.

Section 8. APL is the first language not based on the lambda calculus that is not word-at-a-time and uses functional combining forms. But it still retains many of the problems of von Neumann languages.

Section 9. Von Neumann languages do not have useful properties for reasoning about programs. Axiomatic and denotational semantics are precise tools for describing and understanding conventional programs, but they only talk about them and cannot alter their ungainly properties. Unlike von Neumann languages, the language of ordinary algebra is suitable both for stating its laws and for transforming an equation into its solution, all within the "language."

Section 10. In a history-sensitive language, a program can affect the behavior of a subsequent one by changing some store which is saved by the system. Any such language requires some kind of state transition semantics. But it does not need semantics closely coupled to states in which the state changes with every detail of the computation. "Applicative state transition" (AST) systems are proposed as history-sensitive alternatives to von Neumann systems. These have: (a) loosely coupled state-transition semantics in which a transition occurs once per major computation; (b) simple states and transition rules; (c) an underlying applicative system with simple "reduction" semantics; and (d) a programming language and state transition rules both based on the underlying applicative system and its semantics. The next four sections describe the elements of this approach to non-von Neumann language and system design.
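The contrast drawn in Section 5 can be made concrete. The functional version below mirrors the shape of the paper's inner-product definition, (Insert +) composed with (ApplyToAll x) composed with Transpose; the Python names and helper functions are mine, not the paper's FP notation:

```python
# A hedged Python rendering of the two inner-product styles compared in
# Section 5. The functional version mirrors the shape of the paper's
#   Def IP = (Insert +) o (ApplyToAll x) o Transpose,
# but the Python names here are mine, not the paper's notation.
from functools import reduce

def ip_von_neumann(a, b, n):
    # repetitive, word-at-a-time, tied to the names a, b and the length n
    c = 0
    for i in range(n):
        c = c + a[i] * b[i]
    return c

# general-purpose operators, each dealing with vectors as units
transpose = lambda pair: list(zip(*pair))               # Transpose
apply_to_all_mul = lambda ps: [x * y for x, y in ps]    # ApplyToAll x
insert_add = lambda xs: reduce(lambda x, y: x + y, xs)  # Insert +

def compose(*fns):
    # right-to-left composition: compose(f, g, h)(x) = f(g(h(x)))
    return reduce(lambda f, g: lambda x: f(g(x)), fns)

# nonrepetitive, completely general, names none of its arguments
ip_functional = compose(insert_add, apply_to_all_mul, transpose)

print(ip_von_neumann([1, 2, 3], [6, 5, 4], 3))
print(ip_functional([[1, 2, 3], [6, 5, 4]]))
```

Both print the same result; the difference is that the functional program is assembled entirely from reusable combining forms and never mentions indices, lengths, or argument names.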
Section 11. A class of informal functional programming (FP) systems is described which use no variables. Each system is built from objects, functions, functional forms, and definitions. Functions map objects into objects. Functional forms combine existing functions to form new ones. This section lists examples of primitive functions and functional forms and gives sample programs. It discusses the limitations and advantages of FP systems.

Section 12. An "algebra of programs" is described whose variables range over the functions of an FP system and whose "operations" are the functional forms of the system. A list of some twenty-four laws of the algebra is followed by an example proving the equivalence of a nonrepetitive matrix multiplication program and a recursive one. The next subsection states the results of two "expansion theorems" that "solve" two classes of equations. These solutions express the "unknown" function in such equations as an infinite conditional expansion that constitutes a case-by-case description of its behavior and immediately gives the necessary and sufficient conditions for termination. These results are used to derive a "recursion theorem" and an "iteration theorem," which provide ready-made expansions for some moderately general and useful classes of "linear" equations. Examples of the use of these theorems treat: (a) correctness proofs for recursive and iterative factorial functions, and (b) a proof of equivalence of two iterative programs. A final example deals with a "quadratic" equation and proves that its solution is an idempotent function. The next subsection gives the proofs of the two expansion theorems. The algebra associated with FP systems is compared with the corresponding algebras for the lambda calculus and other applicative systems. The comparison shows some advantages to be drawn from the severely restricted FP systems, as compared with the much more powerful classical systems. Questions are suggested about algorithmic reduction of functions to infinite expansions and about the use of the algebra in various "lazy evaluation" schemes.

Section 13. This section describes formal functional programming (FFP) systems that extend and make precise the behavior of FP systems. Their semantics are simpler than that of classical systems and can be shown to be consistent by a simple fixed-point argument.
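One of the laws of the algebra of programs described in Section 12, composition distributing over construction ([f, g] composed with h equals [f composed with h, g composed with h]), can be checked mechanically. A small sketch in my own Python rendering, not the paper's notation:

```python
# A mechanical check (my Python rendering, not the paper's notation) of one
# law from the algebra of programs of Section 12: composition distributes
# over construction,  [f, g] o h  =  [f o h, g o h].
def construction(*fns):
    # FP construction [f1, ..., fn]: x -> <f1:x, ..., fn:x>
    return lambda x: [f(x) for f in fns]

def compose(f, g):
    return lambda x: f(g(x))

f = lambda x: x + 1
g = lambda x: x * 2
h = lambda x: x - 3

lhs = compose(construction(f, g), h)                # [f, g] o h
rhs = construction(compose(f, h), compose(g, h))    # [f o h, g o h]

assert all(lhs(x) == rhs(x) for x in range(20))     # the law holds pointwise
print(lhs(10))
```

Because the law is an identity between functions, a proof in the algebra replaces such pointwise testing; the test merely illustrates what the identity says.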
Section 14. This section compares the structure of Algol with that of applicative state transition (AST) systems. It describes an AST system using an FFP system as its applicative subsystem. It describes the simple state and the transition rules for the system. A small self-protecting system program for the AST system is described, and how it can be installed and used for file maintenance and for running user programs. The section briefly discusses variants of AST systems and functional naming systems that can be defined and used within an AST system.

Section 15. This section briefly discusses work on applicative computer designs and the need to develop and test more practical models of applicative systems as the future basis for such designs.

Acknowledgments. In earlier work relating to this paper I have received much valuable help and many suggestions from Paul R. McJones and Barry K. Rosen. I have had a great deal of valuable help and feedback in preparing this paper. James N. Gray was exceedingly generous with his time and knowledge in reviewing the first draft. Stephen N. Zilles also gave it a careful reading. Both made many valuable suggestions and criticisms at this difficult stage. It is a pleasure to acknowledge my debt to them. I also had helpful discussions about the first draft with Ronald Fagin, Paul R. McJones, and James H. Morris, Jr. Fagin suggested a number of improvements in the proofs of theorems.

Since a large portion of the paper contains technical material, I asked two distinguished computer scientists to referee the third draft. David J. Gries and John C. Reynolds were kind enough to accept this burdensome task. Both gave me large, detailed sets of corrections and overall comments that resulted in many improvements, large and small, in this final version (which they have not had an opportunity to review). I am truly grateful for the generous time and care they devoted to reviewing this paper.

Finally, I also sent copies of the third draft to Gyula A. Magó, Peter Naur, and John H. Williams. They were kind enough to respond with a number of extremely helpful comments and corrections. Geoffrey A. Frank and Dave Tolle at the University of North Carolina reviewed Magó's copy and pointed out an important error in the definition of the semantic function of FFP systems. My grateful thanks go to all these kind people for their help.
References

1. Arvind, and Gostelow, K.P. A new interpreter for data flow schemas and its implications for computer architecture. Tech. Rep. No. 72, Dept. Comptr. Sci., U. of California, Irvine, Oct. 1975.
2. Backus, J. Programming language semantics and closed applicative languages. Conf. Record ACM Symp. on Principles of Programming Languages, Boston, Oct. 1973, 71-86.
3. Berkling, K.J. Reduction languages for reduction machines. Interner Bericht ISF-76-8, Gesellschaft für Mathematik und Datenverarbeitung MBH, Bonn, Sept. 1976.
4. Burge, W.H. Recursive Programming Techniques. Addison-Wesley, Reading, Mass., 1975.
5. Church, A. The Calculi of Lambda-Conversion. Princeton U. Press, Princeton, N.J., 1941.
6. Curry, H.B., and Feys, R. Combinatory Logic, Vol. 1. North-Holland Pub. Co., Amsterdam, 1958.
7. Dennis, J.B. First version of a data flow procedure language. Tech. Mem. No. 61, Lab. for Comptr. Sci., M.I.T., Cambridge, Mass., May 1973.
8. Dijkstra, E.W. A Discipline of Programming. Prentice-Hall, Englewood Cliffs, N.J., 1976.
9. Friedman, D.P., and Wise, D.S. CONS should not evaluate its arguments. In Automata, Languages and Programming, S. Michaelson and R. Milner, Eds., Edinburgh U. Press, Edinburgh, 1976, pp. 257-284.
10. Henderson, P., and Morris, J.H. Jr. A lazy evaluator. Conf. Record Third ACM Symp. on Principles of Programming Languages, Atlanta, Ga., Jan. 1976, pp. 95-103.
11. Hoare, C.A.R. An axiomatic basis for computer programming. Comm. ACM 12, 10 (Oct. 1969), 576-583.
12. Iverson, K. A Programming Language. Wiley, New York, 1962.
13. Kosinski, P. A data flow programming language. Rep. RC 4264, IBM T.J. Watson Research Ctr., Yorktown Heights, N.Y., March 1973.
14. Landin, P.J. The mechanical evaluation of expressions. Computer J. 6, 4 (1964), 308-320.
15. Magó, G.A. A network of microprocessors to execute reduction languages. To appear in Int. J. Comptr. and Inform. Sci.
16. Manna, Z., Ness, S., and Vuillemin, J. Inductive methods for proving properties of programs. Comm. ACM 16, 8 (Aug. 1973), 491-502.
17. McCarthy, J. Recursive functions of symbolic expressions and their computation by machine, Pt. 1. Comm. ACM 3, 4 (April 1960), 184-195.
18. McJones, P. A Church-Rosser property of closed applicative languages. Rep. RJ 1589, IBM Res. Lab., San Jose, Calif., May 1975.
19. Reynolds, J.C. GEDANKEN--a simple typeless language based on the principle of completeness and the reference concept. Comm. ACM 13, 5 (May 1970), 308-318.
20. Reynolds, J.C. Notes on a lattice-theoretic approach to the theory of computation. Dept. Syst. and Inform. Sci., Syracuse U., Syracuse, N.Y., 1972.
21. Scott, D. Outline of a mathematical theory of computation. Proc. 4th Princeton Conf. on Inform. Sci. and Syst., 1970.
22. Scott, D. Lattice-theoretic models for various type-free calculi. Proc. Fourth Int. Congress for Logic, Methodology, and the Philosophy of Science, Bucharest, 1972.
23. Scott, D., and Strachey, C. Towards a mathematical semantics for computer languages. Proc. Symp. on Comptrs. and Automata, Polytechnic Inst. of Brooklyn, 1971.

14.4 An Example of a System Program

The above description of our elementary AST system, plus the FFP subsystem and the FP primitives and functional forms of earlier sections, specify a complete history-sensitive computing system. Its input and output behavior is limited by its simple transition rules, but otherwise it is a powerful system once it is equipped with a suitable set of definitions. As an example of its use we shall describe a small system program, its installation, and operation.

Our example system program will handle queries and updates for a file it maintains, evaluate FFP expressions, run general user programs that do not damage the file or the state, and allow authorized users to change the set of definitions and the system program itself. All inputs it

... primitive function, defs, named DEFS, such that for any object x ≠ ⊥,

    defs:x = ρDEFS:x = D

where D is the current state and set of definitions of the AST system. This function allows programs access to the whole state for any purpose, including the essential one of computing the successor state. (We might have constructed a different operand than the one above, one with just three cells, for key, input, and file. We did not do so because real programs, unlike subsystem, would contain many name functions referring to data in the state, and this "standard" construction of the operand would suffice then as well.)

14.4.2 The "subsystem" function. We now give the FP definition of the function subsystem, followed by brief explanations of its six cases and auxiliary functions.

    Def subsystem ≡
      is-system-change∘↑KEY → [report-change, apply]∘[↑INPUT, defs];
      is-expression∘↑KEY → [↑INPUT, defs];
      is-program∘↑KEY → system-check∘apply∘[↑INPUT, defs];
      is-query∘↑KEY → [query-response∘[↑INPUT, ↑FILE], defs];
      is-update∘↑KEY →
        [report-update, ↓FILE∘[update, defs]]∘[↑INPUT, ↑FILE];
      [report-error∘[↑KEY, ↑INPUT], defs].

This subsystem has five "p → f" clauses and a final default function, for a total of six classes of inputs; the treatment of each class is given below. Recall that the operand of subsystem is a sequence of cells containing key, input, and file as well as all the defined functions of D, and that subsystem:operand =
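The clause structure of subsystem is FP's condition form (p → f; g): predicates are tried left to right, and a final default applies when none holds. A hedged Python sketch of that dispatch shape (the predicates and actions below are stand-ins of my own, not the paper's auxiliary functions):

```python
# A hedged Python sketch of the clause structure of subsystem: FP's condition
# form (p -> f; g), with clauses tried left to right and a final default.
# The predicates and actions below are stand-ins, not the paper's functions.
def condition(*clauses, default):
    """Build  p1 -> f1; p2 -> f2; ...; default  as a single function."""
    def apply(operand):
        for p, f in clauses:
            if p(operand):
                return f(operand)
        return default(operand)
    return apply

subsystem_sketch = condition(
    (lambda op: op["key"] == "expression", lambda op: ("value", op["input"])),
    (lambda op: op["key"] == "query",      lambda op: ("answer", op["file"])),
    default=lambda op: ("report-error", op["key"]),
)

print(subsystem_sketch({"key": "query", "input": None, "file": "f1"}))
```

As in the FP definition, the whole dispatcher is itself just a function of the operand, so it can be composed, replaced, or protected like any other function in the system.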