
1977 ACM Turing Award Lecture

The 1977 ACM Turing Award was presented to John Backus at the ACM Annual Conference in Seattle, October 17. In introducing the recipient, Jean E. Sammet, Chairman of the Awards Committee, made the following comments and read a portion of the final citation. The full announcement is in the September 1977 issue of Communications, page 681.

"Probably there is nobody in the room who has not heard of Fortran and most of you have probably used it at least once, or at least looked over the shoulder of someone who was writing a Fortran program. There are probably almost as many people who have heard the letters BNF but don't necessarily know what they stand for. Well, the B is for Backus, and the other letters are explained in the formal citation. These two contributions, in my opinion, are among the half dozen most important technical contributions to the computer field and both were made by John Backus (which in the Fortran case also involved some colleagues). It is for these contributions that he is receiving this year's Turing award.

The short form of his citation is for 'profound, influential, and lasting contributions to the design of practical high-level programming systems, notably through his work on Fortran, and for seminal publication of formal procedures for the specifications of programming languages.'

The most significant part of the full citation is as follows:

'... Backus headed an IBM group in New York City during the early 1950s. The earliest product of this group's efforts was a high-level language for scientific and technical computations called Fortran. This same group designed the first system to translate Fortran programs into machine language. They employed novel optimizing techniques to generate fast machine-language programs. Many other compilers for the language were developed, first on IBM machines, and later on virtually every make of computer. Fortran was adopted as a U.S. national standard in 1966.

During the latter part of the 1950s, Backus served on the international committees which developed Algol 58 and a later version, Algol 60. The language Algol, and its derivative compilers, received broad acceptance in Europe as a means for developing programs and as a formal means of publishing the algorithms on which the programs are based.

In 1959, Backus presented a paper at the UNESCO conference in Paris on the syntax and semantics of a proposed international algebraic language. In this paper, he was the first to employ a formal technique for specifying the syntax of programming languages. The formal notation became known as BNF, standing for "Backus Normal Form," or "Backus Naur Form" to recognize the further contributions by Peter Naur of Denmark.

Thus, Backus has contributed strongly both to the pragmatic world of problem-solving on computers and to the theoretical world existing at the interface between artificial languages and computational linguistics. Fortran remains one of the most widely used programming languages in the world. Almost all programming languages are now described with some type of formal syntactic definition.' "

Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs

John Backus
IBM Research Laboratory, San Jose

Conventional programming languages are growing ever more enormous, but not stronger. Inherent defects at the most basic level cause them to be both fat and weak: their primitive word-at-a-time style of programming inherited from their common ancestor--the von Neumann computer, their close coupling of semantics to state transitions, their division of programming into a world of expressions and a world of statements, their inability to effectively use powerful combining forms for building new programs from existing ones, and their lack of useful mathematical properties for reasoning about programs.

An alternative functional style of programming is founded on the use of combining forms for creating programs. Functional programs deal with structured data, are often nonrepetitive and nonrecursive, are hierarchically constructed, do not name their arguments, and do not require the complex machinery of procedure declarations to become generally applicable. Combining forms can use high level programs to build still higher level ones in a style not possible in conventional languages.

General permission to make fair use in teaching or research of all or part of this material is granted to individual readers and to nonprofit libraries acting for them provided that ACM's copyright notice is given and that reference is made to the publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Association for Computing Machinery. To otherwise reprint a figure, table, other substantial excerpt, or the entire work requires specific permission as does republication, or systematic or multiple reproduction.

Author's address: 91 Saint Germain Ave., San Francisco, CA 94114.

© 1978 ACM 0001-0782/78/0800-0613 $00.75

613    Communications of the ACM    August 1978, Volume 21, Number 8

Associated with the functional style of programming is an algebra of programs whose variables range over programs and whose operations are combining forms. This algebra can be used to transform programs and to solve equations whose "unknowns" are programs in much the same way one transforms equations in high school algebra. These transformations are given by algebraic laws and are carried out in the same language in which programs are written. Combining forms are chosen not only for their programming power but also for the power of their associated algebraic laws. General theorems of the algebra give the detailed behavior and termination conditions for large classes of programs.

A new class of computing systems uses the functional programming style both in its programming language and in its state transition rules. Unlike von Neumann languages, these systems have semantics loosely coupled to states--only one state transition occurs per major computation.

Key Words and Phrases: functional programming, algebra of programs, combining forms, functional forms, programming languages, von Neumann computers, von Neumann languages, models of computing systems, applicative computing systems, applicative state transition systems, program transformation, program correctness, program termination, metacomposition

CR Categories: 4.20, 4.29, 5.20, 5.24, 5.26

Introduction

I deeply appreciate the honor of the ACM invitation to give the 1977 Turing Lecture and to publish this account of it with the details promised in the lecture. Readers wishing to see a summary of this paper should turn to Section 16, the last section.

1. Conventional Programming Languages: Fat and Flabby

Programming languages appear to be in trouble. Each successive language incorporates, with a little cleaning up, all the features of its predecessors plus a few more. Some languages have manuals exceeding 500 pages; others cram a complex description into shorter manuals by using dense formalisms. The Department of Defense has current plans for a committee-designed language standard that could require a manual as long as 1,000 pages. Each new language claims new and fashionable features, such as strong typing or structured control statements, but the plain fact is that few languages make programming sufficiently cheaper or more reliable to justify the cost of producing and learning to use them.

Since large increases in size bring only small increases in power, smaller, more elegant languages such as Pascal continue to be popular. But there is a desperate need for a powerful methodology to help us think about programs, and no conventional language even begins to meet that need. In fact, conventional languages create unnecessary confusion in the way we think about programs.

For twenty years programming languages have been steadily progressing toward their present condition of obesity; as a result, the study and invention of programming languages has lost much of its excitement. Instead, it is now the province of those who prefer to work with thick compendia of details rather than wrestle with new ideas. Discussions about programming languages often resemble medieval debates about the number of angels that can dance on the head of a pin instead of exciting contests between fundamentally differing concepts.

Many creative computer scientists have retreated from inventing languages to inventing tools for describing them. Unfortunately, they have been largely content to apply their elegant new tools to studying the warts and moles of existing languages. After examining the appalling type structure of conventional languages, using the elegant tools developed by denotational semantics, it is surprising that so many of us remain passively content with that structure instead of energetically searching for new ones.

The purpose of this article is twofold; first, to suggest that basic defects in the framework of conventional languages make their expressive weakness and their cancerous growth inevitable, and second, to suggest some alternate avenues of exploration toward the design of new kinds of languages.

2. Models of Computing Systems

Underlying every programming language is a model of a computing system that its programs control. Some models are pure abstractions, some are represented by hardware, and others by compiling or interpretive programs. Before we examine conventional languages more closely, it is useful to make a brief survey of existing models as an introduction to the current universe of alternatives. Existing models may be crudely classified by the criteria outlined below.

2.1 Criteria for Models

2.1.1 Foundations. Is there an elegant and concise mathematical description of the model? Is it useful in proving helpful facts about the behavior of the model? Or is the model so complex that its description is bulky and of little mathematical use?

2.1.2 History sensitivity. Does the model include a notion of storage, so that one program can save information that can affect the behavior of a later program? That is, is the model history sensitive?

2.1.3 Type of semantics. Does a program successively transform states (which are not programs) until a terminal state is reached (state-transition semantics)? Are states simple or complex? Or can a "program" be successively reduced to simpler "programs" to yield a final

"normal form program," which is the result (reduction semantics)?

2.1.4 Clarity and conceptual usefulness of programs. Are programs of the model clear expressions of a process or computation? Do they embody concepts that help us to formulate and reason about processes?

2.2 Classification of Models

Using the above criteria we can crudely characterize three classes of models for computing systems--simple operational models, applicative models, and von Neumann models.

2.2.1 Simple operational models. Examples: Turing machines, various automata. Foundations: concise and useful. History sensitivity: have storage, are history sensitive. Semantics: state transition with very simple states. Program clarity: programs unclear and conceptually not helpful.

2.2.2 Applicative models. Examples: Church's lambda calculus [5], Curry's system of combinators [6], pure Lisp [17], functional programming systems described in this paper. Foundations: concise and useful. History sensitivity: no storage, not history sensitive. Semantics: reduction semantics, no states. Program clarity: programs can be clear and conceptually useful.

2.2.3 Von Neumann models. Examples: von Neumann computers, conventional programming languages. Foundations: complex, bulky, not useful. History sensitivity: have storage, are history sensitive. Semantics: state transition with complex states. Program clarity: programs can be moderately clear, are not very useful conceptually.

The above classification is admittedly crude and debatable. Some recent models may not fit easily into any of these categories. For example, the data-flow languages developed by Arvind and Gostelow [1], Dennis [7], Kosinski [13], and others partly fit the class of simple operational models, but their programs are clearer than those of earlier models in the class and it is perhaps possible to argue that some have reduction semantics. In any event, this classification will serve as a crude map of the territory to be discussed. We shall be concerned only with applicative and von Neumann models.

3. Von Neumann Computers

In order to understand the problems of conventional programming languages, we must first examine their intellectual parent, the von Neumann computer. What is a von Neumann computer? When von Neumann and others conceived it over thirty years ago, it was an elegant, practical, and unifying idea that simplified a number of engineering and programming problems that existed then. Although the conditions that produced its architecture have changed radically, we nevertheless still identify the notion of "computer" with this thirty year old concept.

In its simplest form a von Neumann computer has three parts: a central processing unit (or CPU), a store, and a connecting tube that can transmit a single word between the CPU and the store (and send an address to the store). I propose to call this tube the von Neumann bottleneck. The task of a program is to change the contents of the store in some major way; when one considers that this task must be accomplished entirely by pumping single words back and forth through the von Neumann bottleneck, the reason for its name becomes clear.

Ironically, a large part of the traffic in the bottleneck is not useful data but merely names of data, as well as operations and data used only to compute such names. Before a word can be sent through the tube its address must be in the CPU; hence it must either be sent through the tube from the store or be generated by some CPU operation. If the address is sent from the store, then its address must either have been sent from the store or generated in the CPU, and so on. If, on the other hand, the address is generated in the CPU, it must be generated either by a fixed rule (e.g., "add 1 to the program counter") or by an instruction that was sent through the tube, in which case its address must have been sent ... and so on.

Surely there must be a less primitive way of making big changes in the store than by pushing vast numbers of words back and forth through the von Neumann bottleneck. Not only is this tube a literal bottleneck for the data traffic of a problem, but, more importantly, it is an intellectual bottleneck that has kept us tied to word-at-a-time thinking instead of encouraging us to think in terms of the larger conceptual units of the task at hand. Thus programming is basically planning and detailing the enormous traffic of words through the von Neumann bottleneck, and much of that traffic concerns not significant data itself but where to find it.

4. Von Neumann Languages

Conventional programming languages are basically high level, complex versions of the von Neumann computer. Our thirty year old belief that there is only one kind of computer is the basis of our belief that there is only one kind of programming language, the conventional--von Neumann--language. The differences between Fortran and Algol 68, although considerable, are less significant than the fact that both are based on the programming style of the von Neumann computer. Although I refer to conventional languages as "von Neumann languages" to take note of their origin and style, I do not, of course, blame the great mathematician for their complexity. In fact, some might say that I bear some responsibility for that problem.

Von Neumann programming languages use variables to imitate the computer's storage cells; control statements elaborate its jump and test instructions; and assignment statements imitate its fetching, storing, and arithmetic.
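As a rough illustration of this correspondence, the traffic that a single assignment stands for can be modeled with a toy store and tube. This is our sketch, not the paper's; the names `store`, `fetch`, and `put` are hypothetical:

```python
# A toy model of the von Neumann store and its connecting "tube".
# Every fetch and put moves exactly one word, steered by an address;
# this is the word-at-a-time traffic the paper describes.
store = {"x": 3, "y": 4, "z": 0}

def fetch(address):             # one word travels store -> CPU
    return store[address]

def put(address, word):         # one word travels CPU -> store
    store[address] = word

# The assignment  z := x + y  expressed as the traffic it stands for:
put("z", fetch("x") + fetch("y"))
assert store["z"] == 7
```

Even this two-operand addition already pushes three addresses and three words through the tube.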

The assignment statement is the von Neumann bottleneck of programming languages and keeps us thinking in word-at-a-time terms in much the same way the computer's bottleneck does. Consider a typical program; at its center are a number of assignment statements containing some subscripted variables. Each assignment statement produces a one-word result. The program must cause these statements to be executed many times, while altering subscript values, in order to make the desired overall change in the store, since it must be done one word at a time. The programmer is thus concerned with the flow of words through the assignment bottleneck as he designs the nest of control statements to cause the necessary repetitions.

Moreover, the assignment statement splits programming into two worlds. The first world comprises the right sides of assignment statements. This is an orderly world of expressions, a world that has useful algebraic properties (except that those properties are often destroyed by side effects). It is the world in which most useful computation takes place.

The second world of conventional programming languages is the world of statements. The primary statement in that world is the assignment statement itself. All the other statements of the language exist in order to make it possible to perform a computation that must be based on this primitive construct: the assignment statement.

This world of statements is a disorderly one, with few useful mathematical properties. Structured programming can be seen as a modest effort to introduce some order into this chaotic world, but it accomplishes little in attacking the fundamental problems created by the word-at-a-time von Neumann style of programming, with its primitive use of loops, subscripts, and branching flow of control.

Our fixation on von Neumann languages has continued the primacy of the von Neumann computer, and our dependency on it has made non-von Neumann languages uneconomical and has limited their development. The absence of full scale, effective programming styles founded on non-von Neumann principles has deprived designers of an intellectual foundation for new computer architectures. (For a brief discussion of that topic, see Section 15.)

Applicative computing systems' lack of storage and history sensitivity is the basic reason they have not provided a foundation for computer design. Moreover, most applicative systems employ the substitution operation of the lambda calculus as their basic operation. This operation is one of virtually unlimited power, but its complete and efficient realization presents great difficulties to the machine designer. Furthermore, in an effort to introduce storage and to improve their efficiency on von Neumann computers, applicative systems have tended to become engulfed in a large von Neumann system. For example, pure Lisp is often buried in large extensions with many von Neumann features. The resulting complex systems offer little guidance to the machine designer.

5. Comparison of von Neumann and Functional Programs

To get a more detailed picture of some of the defects of von Neumann languages, let us compare a conventional program for inner product with a functional one written in a simple language to be detailed further on.

5.1 A von Neumann Program for Inner Product

   c := 0
   for i := 1 step 1 until n do
      c := c + a[i] × b[i]

Several properties of this program are worth noting:

a) Its statements operate on an invisible "state" according to complex rules.

b) It is not hierarchical. Except for the right side of the assignment statement, it does not construct complex entities from simpler ones. (Larger programs, however, often do.)

c) It is dynamic and repetitive. One must mentally execute it to understand it.

d) It computes word-at-a-time by repetition (of the assignment) and by modification (of variable i).

e) Part of the data, n, is in the program; thus it lacks generality and works only for vectors of length n.

f) It names its arguments; it can only be used for vectors a and b. To become general, it requires a procedure declaration. These involve complex issues (e.g., call-by-name versus call-by-value).

g) Its "housekeeping" operations are represented by symbols in scattered places (in the for statement and the subscripts in the assignment). This makes it impossible to consolidate housekeeping operations, the most common of all, into single, powerful, widely useful operators. Thus in programming those operations one must always start again at square one, writing "for i := ..." and "for j := ..." followed by assignment statements sprinkled with i's and j's.

5.2 A Functional Program for Inner Product

   Def Innerproduct ≡ (Insert +)∘(ApplyToAll ×)∘Transpose

Or, in abbreviated form:

   Def IP ≡ (/+)∘(α×)∘Trans.

Composition (∘), Insert (/), and ApplyToAll (α) are functional forms that combine existing functions to form new ones. Thus f∘g is the function obtained by applying first g and then f, and αf is the function obtained by applying f to every member of the argument. If we write f:x for the result of applying f to the object x, then we can explain each step in evaluating Innerproduct applied to the pair of vectors <<1,2,3>, <6,5,4>> as follows:

   IP: <<1,2,3>, <6,5,4>> =
   Definition of IP         ⇒ (/+)∘(α×)∘Trans: <<1,2,3>, <6,5,4>>
   Effect of composition, ∘ ⇒ (/+):((α×):(Trans: <<1,2,3>, <6,5,4>>))

   Applying Transpose       ⇒ (/+):((α×): <<1,6>, <2,5>, <3,4>>)
   Effect of ApplyToAll, α  ⇒ (/+): <×: <1,6>, ×: <2,5>, ×: <3,4>>
   Applying ×               ⇒ (/+): <6,10,12>
   Effect of Insert, /      ⇒ +: <6, +: <10,12>>
   Applying +               ⇒ +: <6,22>
   Applying + again         ⇒ 28

Let us compare the properties of this program with those of the von Neumann program.

a) It operates only on its arguments. There are no hidden states or complex transition rules. There are only two kinds of rules, one for applying a function to its argument, the other for obtaining the function denoted by a functional form such as composition, f∘g, or ApplyToAll, αf, when one knows the functions f and g, the parameters of the forms.

b) It is hierarchical, being built from three simpler functions (+, ×, Trans) and three functional forms f∘g, αf, and /f.

c) It is static and nonrepetitive, in the sense that its structure is helpful in understanding it without mentally executing it. For example, if one understands the action of the forms f∘g and αf, and of the functions × and Trans, then one understands the action of α× and of (α×)∘Trans, and so on.

d) It operates on whole conceptual units, not words; it has three steps; no step is repeated.

e) It incorporates no data; it is completely general; it works for any pair of conformable vectors.

f) It does not name its arguments; it can be applied to any pair of vectors without any procedure declaration or complex substitution rules.

g) It employs housekeeping forms and functions that are generally useful in many other programs; in fact, only + and × are not concerned with housekeeping. These forms and functions can combine with others to create higher level housekeeping operators.

Section 14 sketches a kind of system designed to make the above functional style of programming available in a history-sensitive system with a simple framework, but much work remains to be done before the above applicative style can become the basis for elegant and practical programming languages. For the present, the above comparison exhibits a number of serious flaws in von Neumann programming languages and can serve as a starting point in an effort to account for their present fat and flabby condition.

6. Language Frameworks versus Changeable Parts

Let us distinguish two parts of a programming language. First, its framework which gives the overall rules of the system, and second, its changeable parts, whose existence is anticipated by the framework but whose particular behavior is not specified by it. For example, the for statement, and almost all other statements, are part of Algol's framework but library functions and user-defined procedures are changeable parts. Thus the framework of a language describes its fixed features and provides a general environment for its changeable features.

Now suppose a language had a small framework which could accommodate a great variety of powerful features entirely as changeable parts. Then such a framework could support many different features and styles without being changed itself. In contrast to this pleasant possibility, von Neumann languages always seem to have an immense framework and very limited changeable parts. What causes this to happen? The answer concerns two problems of von Neumann languages.

The first problem results from the von Neumann style of word-at-a-time programming, which requires that words flow back and forth to the state, just like the flow through the von Neumann bottleneck. Thus a von Neumann language must have a semantics closely coupled to the state, in which every detail of a computation changes the state. The consequence of this semantics closely coupled to states is that every detail of every feature must be built into the state and its transition rules.

Thus every feature of a von Neumann language must be spelled out in stupefying detail in its framework. Furthermore, many complex features are needed to prop up the basically weak word-at-a-time style. The result is the inevitable rigid and enormous framework of a von Neumann language.

7. Changeable Parts and Combining Forms

The second problem of von Neumann languages is that their changeable parts have so little expressive power. Their gargantuan size is eloquent proof of this; after all, if the designer knew that all those complicated features, which he now builds into the framework, could be added later on as changeable parts, he would not be so eager to build them into the framework.

Perhaps the most important element in providing powerful changeable parts in a language is the availability of combining forms that can be generally used to build new procedures from old ones. Von Neumann languages provide only primitive combining forms, and the von Neumann framework presents obstacles to their full use.

One obstacle to the use of combining forms is the split between the expression world and the statement world in von Neumann languages. Functional forms naturally belong to the world of expressions; but no matter how powerful they are they can only build expressions that produce a one-word result. And it is in the statement world that these one-word results must be combined into the overall result. Combining single words is not what we really should be thinking about, but it is a large part of programming any task in von Neumann languages. To help assemble the overall result from single words these languages provide some primitive combining forms in the statement world--the for, while, and if-then-else statements--but the split between the

two worlds prevents the combining forms in either world from attaining the full power they can achieve in an undivided world.

A second obstacle to the use of combining forms in von Neumann languages is their use of elaborate naming conventions, which are further complicated by the substitution rules required in calling procedures. Each of these requires a complex mechanism to be built into the framework so that variables, subscripted variables, pointers, file names, procedure names, call-by-value formal parameters, call-by-name formal parameters, and so on, can all be properly interpreted. All these names, conventions, and rules interfere with the use of simple combining forms.

8. APL versus Word-at-a-Time Programming

Since I have said so much about word-at-a-time programming, I must now say something about APL [12]. We owe a great debt to Kenneth Iverson for showing us that there are programs that are neither word-at-a-time nor dependent on lambda expressions, and for introducing us to the use of new functional forms. And since APL assignment statements can store arrays, the effect of its functional forms is extended beyond a single assignment.

Unfortunately, however, APL still splits programming into a world of expressions and a world of statements. Thus the effort to write one-line programs is partly motivated by the desire to stay in the more orderly world of expressions. APL has exactly three functional forms, called inner product, outer product, and reduction. These are sometimes difficult to use, there are not enough of them, and their use is confined to the world of expressions.

Finally, APL semantics is still too closely coupled to states. Consequently, despite the greater simplicity and power of the language, its framework has the complexity and rigidity characteristic of von Neumann languages.

9. Von Neumann Languages Lack Useful Mathematical Properties

So far we have discussed the gross size and inflexibility of von Neumann languages; another important defect is their lack of useful mathematical properties and the obstacles they present to reasoning about programs.

Although a great amount of excellent work has been published on proving facts about programs, von Neumann languages have almost no properties that are helpful in this direction and have many properties that are obstacles (e.g., side effects, aliasing).

Denotational semantics [23] and its foundations [20, 21] provide an extremely helpful mathematical understanding of the domain and function spaces implicit in programs. When applied to an applicative language (such as that of the "recursive programs" of [16]), its foundations provide powerful tools for describing the language and for proving properties of programs. When applied to a von Neumann language, on the other hand, it provides a precise semantic description and is helpful in identifying trouble spots in the language. But the complexity of the language is mirrored in the complexity of the description, which is a bewildering collection of productions, domains, functions, and equations that is only slightly more helpful in proving facts about programs than the reference manual of the language, since it is less ambiguous.

Axiomatic semantics [11] precisely restates the inelegant properties of von Neumann programs (i.e., transformations on states) as transformations on predicates. The word-at-a-time, repetitive game is not thereby changed, merely the playing field. The complexity of this axiomatic game of proving facts about von Neumann programs makes the successes of its practitioners all the more admirable. Their success rests on two factors in addition to their ingenuity: First, the game is restricted to small, weak subsets of full von Neumann languages that have states vastly simpler than real ones. Second, the new playing field (predicates and their transformations) is richer, more orderly and effective than the old (states and their transformations). But restricting the game and transferring it to a more effective domain does not enable it to handle real programs (with the necessary complexities of procedure calls and aliasing), nor does it eliminate the clumsy properties of the basic von Neumann style. As axiomatic semantics is extended to cover more of a typical von Neumann language, it begins to lose its effectiveness with the increasing complexity that is required.

Thus denotational and axiomatic semantics are descriptive formalisms whose foundations embody elegant and powerful concepts; but using them to describe a von Neumann language can not produce an elegant and powerful language any more than the use of elegant and modern machines to build an Edsel can produce an elegant and modern car.

In any case, proofs about programs use the language of logic, not the language of programming. Proofs talk about programs but cannot involve them directly since the axioms of von Neumann languages are so unusable. In contrast, many ordinary proofs are derived by algebraic methods. These methods require a language that has certain algebraic properties. Algebraic laws can then be used in a rather mechanical way to transform a problem into its solution. For example, to solve the equation

   ax + bx = a + b

for x (given that a + b ≠ 0), we mechanically apply the distributive, identity, and cancellation laws, in succession, to obtain

   (a + b)x = a + b
   (a + b)x = (a + b) · 1
   x = 1.
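Steps of this kind can also be checked by machine. A small Python spot-check of the derivation, an illustration of ours rather than part of the paper's argument:

```python
# Verify numerically that each step of the derivation holds, and that
# x = 1 satisfies the original equation, for sample coefficients a, b
# with a + b != 0.
samples = [(1, 2), (3, -1), (-7, 10)]

for a, b in samples:
    assert a + b != 0
    x = 1
    assert a * x + b * x == a + b      # ax + bx = a + b
    assert (a + b) * x == a + b        # (a + b)x = a + b    (distributive law)
    assert (a + b) * x == (a + b) * 1  # (a + b)x = (a + b)1 (identity law)
```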

618 Communications of the ACM, August 1978, Volume 21, Number 8

Thus we have proved that x = 1 without leaving the "language" of algebra. Von Neumann languages, with their grotesque syntax, offer few such possibilities for transforming programs.

As we shall see later, programs can be expressed in a language that has an associated algebra. This algebra can be used to transform programs and to solve some equations whose "unknowns" are programs, in much the same way one solves equations in high school algebra. Algebraic transformations and proofs use the language of the programs themselves, rather than the language of logic, which talks about programs.

10. What Are the Alternatives to von Neumann Languages?

Before discussing alternatives to von Neumann languages, let me remark that I regret the need for the above negative and not very precise discussion of these languages. But the complacent acceptance most of us give to these enormous, weak languages has puzzled and disturbed me for a long time. I am disturbed because that acceptance has consumed a vast effort toward making von Neumann languages fatter that might have been better spent in looking for new structures. For this reason I have tried to analyze some of the basic defects of conventional languages and show that those defects cannot be resolved unless we discover a new kind of language framework.

In seeking an alternative to conventional languages we must first recognize that a system cannot be history sensitive (permit execution of one program to affect the behavior of a subsequent one) unless the system has some kind of state (which the first program can change and the second can access). Thus a history-sensitive model of a computing system must have a state-transition semantics, at least in this weak sense. But this does not mean that every computation must depend heavily on a complex state, with many state changes required for each small part of the computation (as in von Neumann languages).

To illustrate some alternatives to von Neumann languages, I propose to sketch a class of history-sensitive computing systems, where each system: a) has a loosely coupled state-transition semantics in which a state transition occurs only once in a major computation; b) has a simply structured state and simple transition rules; c) depends heavily on an underlying applicative system both to provide the basic programming language of the system and to describe its state transitions.

These systems, which I call applicative state transition (or AST) systems, are described in Section 14. These systems avoid many of the complexities and weaknesses of von Neumann languages and provide for a powerful and extensive set of changeable parts. However, they are sketched only as crude examples of a vast area of non-von Neumann systems with various attractive properties. I have been studying this area for the past three or four years and have not yet found a satisfying solution to the many conflicting requirements that a good language must resolve. But I believe this search has indicated a useful approach to designing non-von Neumann languages.

This approach involves four elements, which can be summarized as follows.
a) A functional style of programming without variables. A simple, informal functional programming (FP) system is described. It is based on the use of combining forms for building programs. Several programs are given to illustrate functional programming.
b) An algebra of functional programs. An algebra is described whose variables denote FP functional programs and whose "operations" are FP functional forms, the combining forms of FP programs. Some laws of the algebra are given. Theorems and examples are given that show how certain function expressions may be transformed into equivalent infinite expansions that explain the behavior of the function. The FP algebra is compared with algebras associated with the classical applicative systems of Church and Curry.
c) A formal functional programming system. A formal (FFP) system is described that extends the capabilities of the above informal FP systems. An FFP system is thus a precisely defined system that provides the ability to use the functional programming style of FP systems and their algebra of programs. FFP systems can be used as the basis for applicative state transition systems.
d) Applicative state transition systems. As discussed above.
The rest of the paper describes these four elements, gives some brief remarks on computer design, and ends with a summary of the paper.

11. Functional Programming Systems (FP Systems)

11.1 Introduction
In this section we give an informal description of a class of simple applicative programming systems called functional programming (FP) systems, in which "programs" are simply functions without variables. The description is followed by some examples and by a discussion of various properties of FP systems.

An FP system is founded on the use of a fixed set of combining forms called functional forms. These, plus simple definitions, are the only means of building new functions from existing ones; they use no variables or substitution rules, and they become the operations of an associated algebra of programs. All the functions of an FP system are of one type: they map objects into objects and always take a single argument.

In contrast, a lambda-calculus based system is founded on the use of the lambda expression, with an associated set of substitution rules for variables, for building new functions. The lambda expression (with its substitution rules) is capable of defining all possible computable functions of all possible types and of any number of arguments. This freedom and power has its disadvantages as well as its obvious advantages. It is analogous to the power of unrestricted control statements in conventional languages: with unrestricted freedom comes chaos. If one constantly invents new combining forms to suit the occasion, as one can in the lambda calculus, one will not become familiar with the style or useful properties of the few combining forms that are adequate for all purposes. Just as structured programming eschews many control statements to obtain programs with simpler structure, better properties, and uniform methods for understanding their behavior, so functional programming eschews the lambda expression, substitution, and multiple function types. It thereby achieves programs built with familiar functional forms with known useful properties. These programs are so structured that their behavior can often be understood and proven by mechanical use of algebraic techniques similar to those used in solving high school algebra problems.

Functional forms, unlike most programming constructs, need not be chosen on an ad hoc basis. Since they are the operations of an associated algebra, one chooses only those functional forms that not only provide powerful programming constructs, but that also have attractive algebraic properties: one chooses them to maximize the strength and utility of the algebraic laws that relate them to other functional forms of the system.

In the following description we shall be imprecise in not distinguishing between (a) a function symbol or expression and (b) the function it denotes. We shall indicate the symbols and expressions used to denote functions by example and usage. Section 13 describes a formal extension of FP systems (FFP systems); they can serve to clarify any ambiguities about FP systems.

11.2 Description
An FP system comprises the following:
1) a set O of objects;
2) a set F of functions f that map objects into objects;
3) an operation, application;
4) a set 𝐅 of functional forms; these are used to combine existing functions, or objects, to form new functions in F;
5) a set D of definitions that define some functions in F and assign a name to each.
What follows is an informal description of each of the above entities with examples.

11.2.1 Objects, O. An object x is either an atom, a sequence ⟨x1, …, xn⟩ whose elements xi are objects, or ⊥ ("bottom" or "undefined"). Thus the choice of a set A of atoms determines the set of objects. We shall take A to be the set of nonnull strings of capital letters, digits, and special symbols not used by the notation of the FP system. Some of these strings belong to the class of atoms called "numbers." The atom φ is used to denote the empty sequence and is the only object which is both an atom and a sequence. The atoms T and F are used to denote "true" and "false."

There is one important constraint in the construction of objects: if x is a sequence with ⊥ as an element, then x = ⊥. That is, the "sequence constructor" is "⊥-preserving." Thus no proper sequence has ⊥ as an element.

Examples of objects
⊥   1.5   φ   AB3   ⟨A, ⟨B, C⟩, D⟩   ⟨A, ⊥⟩ = ⊥

11.2.2 Application. An FP system has a single operation, application. If f is a function and x is an object, then f:x is an application and denotes the object which is the result of applying f to x. f is the operator of the application and x is the operand.

Examples of applications
+:⟨1,2⟩ = 3   tl:⟨A,B,C⟩ = ⟨B,C⟩
1:⟨A,B⟩ = A   2:⟨A,B⟩ = B

11.2.3 Functions, F. All functions f in F map objects into objects and are bottom-preserving: f:⊥ = ⊥, for all f in F. Every function in F is either primitive, that is, supplied with the system, or it is defined (see below), or it is a functional form (see below).

It is sometimes useful to distinguish between two cases in which f:x = ⊥. If the computation for f:x terminates and yields the object ⊥, we say f is undefined at x, that is, f terminates but has no meaningful value at x. Otherwise we say f is nonterminating at x.

Examples of primitive functions
Our intention is to provide FP systems with widely useful and powerful primitive functions rather than weak ones that could then be used to define useful ones. The following examples define some typical primitive functions, many of which are used in later examples of programs. In the following definitions we use a variant of McCarthy's conditional expressions [17]; thus we write

p1 → e1; … ; pn → en; en+1

instead of McCarthy's expression

(p1 → e1, …, pn → en, T → en+1).

The following definitions are to hold for all objects x, xi, y, yi, z, zi:

Selector functions
1:x ≡ x = ⟨x1, …, xn⟩ → x1; ⊥
and for any positive integer s
s:x ≡ x = ⟨x1, …, xn⟩ & n ≥ s → xs; ⊥
Thus, for example, 3:⟨A,B,C⟩ = C and 2:⟨A⟩ = ⊥. Note that the function symbols 1, 2, etc. are distinct from the atoms 1, 2, etc.

Tail
tl:x ≡ x = ⟨x1⟩ → φ; x = ⟨x1, …, xn⟩ & n ≥ 2 → ⟨x2, …, xn⟩; ⊥

Identity
id:x ≡ x
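The objects and the primitive functions just described can be modeled concretely. The following sketch is mine, not the paper's: it assumes Python tuples for sequences, Python strings and numbers for atoms, and an arbitrary sentinel value standing in for ⊥. It shows the ⊥-preserving sequence constructor together with the selector, tail, and identity primitives.

```python
# Illustrative Python model of FP objects (assumptions: tuples = sequences,
# strings/numbers = atoms, the sentinel BOTTOM = the undefined object).
BOTTOM = "_|_"   # plays the role of the undefined object, written ⊥ above
PHI = ()         # the empty sequence φ

def seq(*xs):
    """Sequence constructor; bottom-preserving: a sequence containing
    BOTTOM collapses to BOTTOM, so no proper sequence contains it."""
    return BOTTOM if BOTTOM in xs else tuple(xs)

def selector(s):
    """Selector function s (1-origin): s-th element of a sequence, else bottom."""
    def sel(x):
        return x[s - 1] if isinstance(x, tuple) and len(x) >= s else BOTTOM
    return sel

def tl(x):
    """Tail of a nonempty sequence; bottom on anything else (including PHI)."""
    return x[1:] if isinstance(x, tuple) and len(x) >= 1 else BOTTOM

def identity(x):
    """id:x = x (trivially bottom-preserving)."""
    return x
```

The choice of sentinel is arbitrary; what matters is that every operation checks for it and propagates it, mirroring the requirement that all functions be ⊥-preserving.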

Atom
atom:x ≡ x is an atom → T; x ≠ ⊥ → F; ⊥

Equals
eq:x ≡ x = ⟨y,z⟩ & y = z → T; x = ⟨y,z⟩ & y ≠ z → F; ⊥

Null
null:x ≡ x = φ → T; x ≠ ⊥ → F; ⊥

Reverse
reverse:x ≡ x = φ → φ; x = ⟨x1, …, xn⟩ → ⟨xn, …, x1⟩; ⊥

Distribute from left; distribute from right
distl:x ≡ x = ⟨y, φ⟩ → φ; x = ⟨y, ⟨z1, …, zn⟩⟩ → ⟨⟨y,z1⟩, …, ⟨y,zn⟩⟩; ⊥
distr:x ≡ x = ⟨φ, y⟩ → φ; x = ⟨⟨y1, …, yn⟩, z⟩ → ⟨⟨y1,z⟩, …, ⟨yn,z⟩⟩; ⊥

Length
length:x ≡ x = ⟨x1, …, xn⟩ → n; x = φ → 0; ⊥

Add, subtract, multiply, and divide
+:x ≡ x = ⟨y,z⟩ & y,z are numbers → y+z; ⊥
−:x ≡ x = ⟨y,z⟩ & y,z are numbers → y−z; ⊥
×:x ≡ x = ⟨y,z⟩ & y,z are numbers → y×z; ⊥
÷:x ≡ x = ⟨y,z⟩ & y,z are numbers → y÷z; ⊥   (where y÷0 = ⊥)

Transpose
trans:x ≡ x = ⟨φ, …, φ⟩ → φ; x = ⟨x1, …, xn⟩ → ⟨y1, …, ym⟩; ⊥
where xi = ⟨xi1, …, xim⟩ and yj = ⟨x1j, …, xnj⟩, 1 ≤ i ≤ n, 1 ≤ j ≤ m.

And, or, not
and:x ≡ x = ⟨T,T⟩ → T; x = ⟨T,F⟩ ∨ x = ⟨F,T⟩ ∨ x = ⟨F,F⟩ → F; ⊥
etc.

Append left; append right
apndl:x ≡ x = ⟨y, φ⟩ → ⟨y⟩; x = ⟨y, ⟨z1, …, zn⟩⟩ → ⟨y, z1, …, zn⟩; ⊥
apndr:x ≡ x = ⟨φ, z⟩ → ⟨z⟩; x = ⟨⟨y1, …, yn⟩, z⟩ → ⟨y1, …, yn, z⟩; ⊥

Right selectors; right tail
1r:x ≡ x = ⟨x1, …, xn⟩ → xn; ⊥
2r:x ≡ x = ⟨x1, …, xn⟩ & n ≥ 2 → xn−1; ⊥
etc.
tlr:x ≡ x = ⟨x1⟩ → φ; x = ⟨x1, …, xn⟩ & n ≥ 2 → ⟨x1, …, xn−1⟩; ⊥

Rotate left; rotate right
rotl:x ≡ x = φ → φ; x = ⟨x1⟩ → ⟨x1⟩; x = ⟨x1, …, xn⟩ & n ≥ 2 → ⟨x2, …, xn, x1⟩; ⊥
etc.

11.2.4 Functional forms, 𝐅. A functional form is an expression denoting a function; that function depends on the functions or objects which are the parameters of the expression. Thus, for example, if f and g are any functions, then f∘g is a functional form, the composition of f and g, f and g are its parameters, and it denotes the function such that, for any object x,

(f∘g):x = f:(g:x).

Some functional forms may have objects as parameters. For example, for any object x, x̄ is a functional form, the constant function of x, so that for any object y

x̄:y ≡ y = ⊥ → ⊥; x.

In particular, ⊥̄ is the everywhere-⊥ function.

Below we give some functional forms, many of which are used later in this paper. We use p, f, and g with and without subscripts to denote arbitrary functions; and x, x1, …, xn, y as arbitrary objects. Square brackets [...] are used to indicate the functional form for construction, which denotes a function, whereas pointed brackets ⟨...⟩ denote sequences, which are objects. Parentheses are used both in particular functional forms (e.g., in condition) and generally to indicate grouping.

Composition
(f∘g):x ≡ f:(g:x)

Construction
[f1, …, fn]:x ≡ ⟨f1:x, …, fn:x⟩
(Recall that since ⟨…, ⊥, …⟩ = ⊥ and all functions are ⊥-preserving, so is [f1, …, fn].)

Condition
(p → f; g):x ≡ (p:x) = T → f:x; (p:x) = F → g:x; ⊥
Conditional expressions (used outside of FP systems to describe their functions) and the functional form condition are both identified by "→". They are quite different although closely related, as shown in the above definitions. But no confusion should arise, since the elements of a conditional expression all denote values, whereas the elements of the functional form condition all denote functions, never values. When no ambiguity arises we omit right-associated parentheses; we write, for example, p1 → f1; p2 → f2; g for (p1 → f1; (p2 → f2; g)).

Constant (Here x is an object parameter.)
x̄:y ≡ y = ⊥ → ⊥; x

Insert
/f:x ≡ x = ⟨x1⟩ → x1; x = ⟨x1, …, xn⟩ & n ≥ 2 → f:⟨x1, /f:⟨x2, …, xn⟩⟩; ⊥
If f has a unique right unit uf ≠ ⊥, where f:⟨x, uf⟩ ∈ {x, ⊥} for all objects x, then the above definition is extended: /f:φ = uf. Thus
/+:⟨4,5,6⟩ = +:⟨4, +:⟨5, /+:⟨6⟩⟩⟩ = +:⟨4, +:⟨5,6⟩⟩ = 15
/+:φ = 0

Apply to all
αf:x ≡ x = φ → φ; x = ⟨x1, …, xn⟩ → ⟨f:x1, …, f:xn⟩; ⊥
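The functional forms above are ordinary higher-order functions, and can be sketched in the same hypothetical Python modeling used earlier (tuples for sequences, the sentinel BOTTOM for ⊥, and Python True/False standing in for the atoms T and F); the right-unit extension of insert, e.g. /+:φ = 0, is omitted for brevity.

```python
# Illustrative Python rendering of some FP functional forms (assumptions as
# before: tuples = sequences, BOTTOM = the undefined object).
BOTTOM = "_|_"

def compose(f, g):
    """(f.g):x = f:(g:x)"""
    return lambda x: f(g(x))

def construction(*fs):
    """[f1,...,fn]:x = <f1:x,...,fn:x>; bottom-preserving like sequences."""
    def con(x):
        ys = tuple(f(x) for f in fs)
        return BOTTOM if BOTTOM in ys else ys
    return con

def condition(p, f, g):
    """(p -> f; g):x -- anything other than True/False from p gives bottom."""
    def cond(x):
        t = p(x)
        return f(x) if t is True else (g(x) if t is False else BOTTOM)
    return cond

def constant(x):
    """The constant function of x: maps BOTTOM to BOTTOM, everything else to x."""
    return lambda y: BOTTOM if y == BOTTOM else x

def insert(f):
    """/f: right-to-left fold of a binary function over a nonempty sequence."""
    def ins(x):
        if not isinstance(x, tuple) or len(x) == 0:
            return BOTTOM
        return x[0] if len(x) == 1 else f((x[0], ins(x[1:])))
    return ins

def apply_to_all(f):
    """(alpha f):<x1,...,xn> = <f:x1,...,f:xn>; bottom-preserving."""
    def app(x):
        if not isinstance(x, tuple):
            return BOTTOM
        ys = tuple(f(e) for e in x)
        return BOTTOM if BOTTOM in ys else ys
    return app

def add(x):
    """Primitive +: defined on pairs (the numeric check is omitted here)."""
    return x[0] + x[1] if isinstance(x, tuple) and len(x) == 2 else BOTTOM
```

For example, insert(add)((4, 5, 6)) reproduces the reduction /+:⟨4,5,6⟩ = 15, and apply_to_all gives the α form used later in the matrix programs.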

Binary to unary (x is an object parameter)
(bu f x):y ≡ f:⟨x, y⟩
Thus
(bu + 1):x = 1 + x

While
(while p f):x ≡ p:x = T → (while p f):(f:x); p:x = F → x; ⊥

The above functional forms provide an effective method for computing the values of the functions they denote (if they terminate) provided one can effectively apply their function parameters.

11.2.5 Definitions. A definition in an FP system is an expression of the form

Def l ≡ r

where the left side l is an unused function symbol and the right side r is a functional form (which may depend on l). It expresses the fact that the symbol l is to denote the function given by r. Thus the definition Def last1 ≡ 1∘reverse defines the function last1 that produces the last element of a sequence (or ⊥). Similarly,

Def last ≡ null∘tl → 1; last∘tl

defines the function last, which is the same as last1. Here in detail is how the definition would be used to compute last:⟨1,2⟩:

last:⟨1,2⟩ = (null∘tl → 1; last∘tl):⟨1,2⟩   (definition of last)
= last∘tl:⟨1,2⟩   (action of the form (p → f; g), since null∘tl:⟨1,2⟩ = null:⟨2⟩ = F)
= last:(tl:⟨1,2⟩)   (action of the form f∘g)
= last:⟨2⟩   (definition of primitive tail)
= (null∘tl → 1; last∘tl):⟨2⟩   (definition of last)
= 1:⟨2⟩   (action of the form (p → f; g), since null∘tl:⟨2⟩ = null:φ = T)
= 2   (definition of selector 1)

The above illustrates the simple rule: to apply a defined symbol, replace it by the right side of its definition. Of course, some definitions may define nonterminating functions. A set D of definitions is well formed if no two left sides are the same.

11.2.6 Semantics. It can be seen from the above that an FP system is determined by choice of the following sets: (a) The set of atoms A (which determines the set of objects). (b) The set of primitive functions P. (c) The set of functional forms 𝐅. (d) A well formed set of definitions D. To understand the semantics of such a system one needs to know how to compute f:x for any function f and any object x of the system. There are exactly four possibilities for f:
(1) f is a primitive function;
(2) f is a functional form;
(3) there is one definition in D, Def f ≡ r; and
(4) none of the above.
If f is a primitive function, then one has its description and knows how to apply it. If f is a functional form, then the description of the form tells how to compute f:x in terms of the parameters of the form, which can be done by further use of these rules. If f is defined, Def f ≡ r, as in (3), then to find f:x one computes r:x, which can be done by further use of these rules. If none of these, then f:x ≡ ⊥. Of course, the use of these rules may not terminate for some f and some x, in which case we assign the value f:x ≡ ⊥.

11.3 Examples of Functional Programs
The following examples illustrate the functional programming style. Since this style is unfamiliar to most readers, it may cause confusion at first; the important point to remember is that no part of a function definition is a result itself. Instead, each part is a function that must be applied to an argument to obtain a result.

11.3.1 Factorial.

Def ! ≡ eq0 → 1̄; ×∘[id, !∘sub1]

where

Def eq0 ≡ eq∘[id, 0̄]
Def sub1 ≡ −∘[id, 1̄]

Here are some of the intermediate expressions an FP system would obtain in evaluating !:2:

!:2 → (eq0 → 1̄; ×∘[id, !∘sub1]):2
→ ×∘[id, !∘sub1]:2
→ ×:⟨id:2, !∘sub1:2⟩
→ ×:⟨2, !:1⟩
→ ×:⟨2, ×:⟨1, !:0⟩⟩
→ ×:⟨2, ×:⟨1, 1⟩⟩
→ ×:⟨2, 1⟩ → 2.

In Section 12 we shall see how theorems of the algebra of FP programs can be used to prove that ! is the factorial function.

11.3.2 Inner product. We have seen earlier how this definition works.

Def IP ≡ (/+)∘(α×)∘trans

11.3.3 Matrix multiply. This matrix multiplication program yields the product of any pair ⟨m, n⟩ of conformable matrices, where each matrix m is represented as the sequence of its rows:

m = ⟨m1, …, mr⟩ where mi = ⟨mi1, …, mis⟩ for i = 1, …, r.

Def MM ≡ (ααIP)∘(αdistl)∘distr∘[1, trans∘2]

The program MM has four steps, reading from right to left; each is applied in turn, beginning with [1, trans∘2], to the result of its predecessor. If the argument is ⟨m, n⟩, then the first step yields ⟨m, n′⟩, where n′ = trans:n. The second step yields ⟨⟨m1, n′⟩, …, ⟨mr, n′⟩⟩, where the mi are the rows of m. The third step, αdistl, yields ⟨distl:⟨m1, n′⟩, …, distl:⟨mr, n′⟩⟩ = ⟨p1, …, pr⟩,

where pi = distl:⟨mi, n′⟩ = ⟨⟨mi, n1′⟩, …, ⟨mi, ns′⟩⟩ for i = 1, …, r, and nj′ is the jth column of n (the jth row of n′). Thus pi, a sequence of row and column pairs, corresponds to the ith product row. The operator ααIP, or α(αIP), causes αIP to be applied to each pi, which in turn causes IP to be applied to each row and column pair in each pi. The result of the last step is therefore the sequence of rows comprising the product matrix. If either matrix is not rectangular, or if the length of a row of m differs from that of a column of n, or if any element of m or n is not a number, the result is ⊥.

This program MM does not name its arguments or any intermediate results; contains no variables, no loops, no control statements nor procedure declarations; has no initialization instructions; is not word-at-a-time in nature; is hierarchically constructed from simpler components; uses generally applicable housekeeping forms and operators (e.g., αf, distl, distr, trans); is perfectly general; yields ⊥ whenever its argument is inappropriate in any way; does not constrain the order of evaluation unnecessarily (all applications of IP to row and column pairs can be done in parallel or in any order); and, using algebraic laws (see below), can be transformed into more "efficient" or into more "explanatory" programs (e.g., one that is recursively defined). None of these properties hold for the typical von Neumann matrix multiplication program.

Although it has an unfamiliar and hence puzzling form, the program MM describes the essential operations of matrix multiplication without overdetermining the process or obscuring parts of it, as most programs do; hence many straightforward programs for the operation can be obtained from it by formal transformations. It is an inherently inefficient program for von Neumann computers (with regard to the use of space), but efficient ones can be derived from it and realizations of FP systems can be imagined that could execute MM without the prodigal use of space it implies. Efficiency questions are beyond the scope of this paper; let me suggest only that since the language is so simple and does not dictate any binding of lambda-type variables to data, there may be better opportunities for the system to do some kind of "lazy" evaluation [9, 10] and to control data management more efficiently than is possible in lambda-calculus based systems.

11.4 Remarks About FP Systems
11.4.1 FP systems as programming languages. FP systems are so minimal that some readers may find it difficult to view them as programming languages. Viewed as such, a function f is a program, an object x is the contents of the store, and f:x is the contents of the store after program f is activated with x in the store. The set of definitions is the program library. The primitive functions and the functional forms provided by the system are the basic statements of a particular programming language. Thus, depending on the choice of primitive functions and functional forms, the FP framework provides for a large class of languages with various styles and capabilities. The algebra of programs associated with each of these depends on its particular set of functional forms. The primitive functions, functional forms, and programs given in this paper comprise an effort to develop just one of these possible styles.

11.4.2 Limitations of FP systems. FP systems have a number of limitations. For example, a given FP system is a fixed language; it is not history sensitive: no program can alter the library of programs. It can treat input and output only in the sense that x is an input and f:x is the output. If the set of primitive functions and functional forms is weak, it may not be able to express every computable function.

An FP system cannot compute a program since function expressions are not objects. Nor can one define new functional forms within an FP system. (Both of these limitations are removed in formal functional programming (FFP) systems in which objects "represent" functions.) Thus no FP system can have a function, apply, such that

apply:⟨x, y⟩ = x:y

because, on the left, x is an object, and, on the right, x is a function. (Note that we have been careful to keep the set of function symbols and the set of objects distinct: thus 1 is a function symbol, and 1 is an object.)

The primary limitation of FP systems is that they are not history sensitive. Therefore they must be extended somehow before they can become practically useful. For discussion of such extensions, see the sections on FFP and AST systems (Sections 13 and 14).

11.4.3 Expressive power of FP systems. Suppose two FP systems, FP1 and FP2, both have the same set of objects and the same set of primitive functions, but the set of functional forms of FP1 properly includes that of FP2. Suppose also that both systems can express all computable functions on objects. Nevertheless, we can say that FP1 is more expressive than FP2, since every function expression in FP2 can be duplicated in FP1, but by using a functional form not belonging to FP2, FP1 can express some functions more directly and easily than FP2.

I believe the above observation could be developed into a theory of the expressive power of languages in which a language A would be more expressive than language B under the following roughly stated conditions. First, form all possible functions of all types in A by applying all existing functions to objects and to each other in all possible ways until no new function of any type can be formed. (The set of objects is a type; the set of continuous functions [T→U] from type T to type U is a type. If f ∈ [T→U] and t ∈ T, then f:t in U can be formed by applying f to t.) Do the same in language B. Next, compare each type in A to the corresponding type in B. If, for every type, A's type includes B's corresponding type, then A is more expressive than B (or equally expressive). If some type of A's functions is incomparable to B's, then A and B are not comparable in expressive power.

11.4.4 Advantages of FP systems. The main reason FP systems are considerably simpler than either conventional languages or lambda-calculus-based languages is that they use only the most elementary fixed naming system (naming a function in a definition) with a simple fixed rule of substituting a function for its name. Thus they avoid the complexities both of the naming systems of conventional languages and of the substitution rules of the lambda calculus. FP systems permit the definition of different naming systems (see Sections 13.3.4 and 14.7) for various purposes. These need not be complex, since many programs can do without them completely. Most importantly, they treat names as functions that can be combined with other functions without special treatment.

FP systems offer an escape from conventional word-at-a-time programming to a degree greater even than APL [12] (the most successful attack on the problem to date within the von Neumann framework) because they provide a more powerful set of functional forms within a unified world of expressions. They offer the opportunity to develop higher level techniques for thinking about, manipulating, and writing programs.

12. The Algebra of Programs for FP Systems

12.1 Introduction
The algebra of the programs described below is the work of an amateur in algebra, and I want to show that it is a game amateurs can profitably play and enjoy, a game that does not require a deep understanding of logic and mathematics. In spite of its simplicity, it can help one to understand and prove things about programs in a systematic, rather mechanical way.

So far, proving a program correct requires knowledge of some moderately heavy topics in mathematics and logic: properties of complete partially ordered sets, continuous functions, least fixed points of functionals, the first-order predicate calculus, predicate transformers, weakest preconditions, to mention a few topics in a few approaches to proving programs correct. These topics have been very useful for professionals who make it their business to devise proof techniques; they have published a lot of beautiful work on this subject, starting with the work of McCarthy and Floyd, and, more recently, that of Burstall, Dijkstra, Manna and his associates, Milner, Morris, Reynolds, and many others. Much of this work is based on the foundations laid down by Dana Scott (denotational semantics) and C. A. R. Hoare (axiomatic semantics). But its theoretical level places it beyond the scope of most amateurs who work outside of this specialized field.

If the average programmer is to prove his programs correct, he will need much simpler techniques than those the professionals have so far put forward. The algebra of programs below may be one starting point for such a proof discipline and, coupled with current work on algebraic manipulation, it may also help provide a basis for automating some of that discipline.

One advantage of this algebra over other proof techniques is that the programmer can use his programming language as the language for deriving proofs, rather than having to state proofs in a separate logical system that merely talks about his programs.

At the heart of the algebra of programs are laws and theorems that state that one function expression is the same as another. Thus the law [f,g]∘h ≡ [f∘h, g∘h] says that the construction of f and g (composed with h) is the same function as the construction of (f composed with h) and (g composed with h) no matter what the functions f, g, and h are. Such laws are easy to understand, easy to justify, and easy and powerful to use. However, we also wish to use such laws to solve equations in which an "unknown" function appears on both sides of the equation. The problem is that if f satisfies some such equation, it will often happen that some extension f′ of f will also satisfy the same equation. Thus, to give a unique meaning to solutions of such equations, we shall require a foundation for the algebra of programs (which uses Scott's notion of least fixed points of continuous functionals) to assure us that solutions obtained by algebraic manipulation are indeed least, and hence unique, solutions.

Our goal is to develop a foundation for the algebra of programs that disposes of the theoretical issues, so that a programmer can use simple algebraic laws and one or two theorems from the foundations to solve problems and create proofs in the same mechanical style we use to solve high-school algebra problems, and so that he can do so without knowing anything about least fixed points or predicate transformers.

One particular foundational problem arises: given equations of the form

f ≡ p0 → q0; … ; pi → qi; Ei(f),   (1)

where the pi's and qi's are functions not involving f and Ei(f) is a function expression involving f, the laws of the algebra will often permit the formal "extension" of this equation by one more "clause" by deriving

Ei(f) ≡ pi+1 → qi+1; Ei+1(f)   (2)

which, by replacing Ei(f) in (1) by the right side of (2), yields

f ≡ p0 → q0; … ; pi+1 → qi+1; Ei+1(f).   (3)

This formal extension may go on without limit. One question the foundations must then answer is: when can the least f satisfying (1) be represented by the infinite expansion

f ≡ p0 → q0; … ; pn → qn; …   (4)

in which the final clause involving f has been dropped,

so that we now have a solution whose right side is free of f's? Such solutions are helpful in two ways: first, they give proofs of "termination" in the sense that (4) means that f:x is defined if and only if there is an n such that, for every i less than n, pi:x = F and pn:x = T and qn:x is defined. Second, (4) gives a case-by-case description of f that can often clarify its behavior.

The foundations for the algebra given in a subsequent section are a modest start toward the goal stated above. For a limited class of equations its "linear expansion theorem" gives a useful answer as to when one can go from indefinitely extendable equations like (1) to infinite expansions like (4). For a larger class of equations, a more general "expansion theorem" gives a less helpful answer to similar questions. Hopefully, more powerful theorems covering additional classes of equations can be found. But for the present, one need only know the conclusions of these two simple foundational theorems in order to follow the theorems and examples appearing in this section. The results of the foundations subsection are summarized in a separate, earlier subsection titled "expansion theorems," without reference to fixed point concepts.

We first define the usual ordering on functions and equivalence in terms of this ordering:

DEFINITION. f ≤ g iff for all objects x, either f:x = ⊥, or f:x = g:x; and f ≡ g iff f ≤ g and g ≤ f.

PROPOSITION. For all functions f, g, and h and all objects x, ([f,g]∘h):x ≡ [f∘h, g∘h]:x.
PROOF.
([f,g]∘h):x = [f,g]:(h:x)   by definition of composition
= ⟨f:(h:x), g:(h:x)⟩   by definition of construction
= ⟨(f∘h):x, (g∘h):x⟩   by definition of composition
= [f∘h, g∘h]:x   by definition of construction □

Some laws have a domain smaller than the domain of all objects. Thus 1∘[f,g] ≡ f does not hold for objects x such that g:x = ⊥. We write

defined∘g →→ 1∘[f,g] ≡ f

to indicate that the law (or theorem) on the right holds within the domain of objects x for which defined∘g:x = T, where

Def defined ≡ T̄

i.e. defined:x ≡ x = ⊥ → ⊥; T. In general we shall write a qualified functional equation

p →→ f ≡ g

to mean that, for any object x, whenever p:x = T, then f:x = g:x.

Ordinary algebra concerns itself with two operations, addition and multiplication; it needs few laws. The algebra of programs is concerned with more operations (functional forms) and therefore needs more laws. Each of the following laws requires a corresponding proposition to validate it. The interested reader will find most proofs of such propositions easy (two are given below).

I Composition and construction
I.1 [f1, …, fn]∘g ≡ [f1∘g, …, fn∘g]
I.5 s∘[f1, …, fn] ≡ fs   (for any selector s, s ≤ n)
I.5.1 [f1∘1, …, fn∘n]∘[g1, …, gn] ≡ [f1∘g1, …, fn∘gn]
I.6 tl∘[f1] ≡ φ̄ and
    defined∘f1 →→ tl∘[f1, …, fn] ≡ [f2, …, fn] for n ≥ 2
I.7 distl∘[f, [g1, …, gn]] ≡ [[f, g1], …, [f, gn]]
    defined∘f →→ distl∘[f, φ̄] ≡ φ̄
    The analogous law holds for distr.
I.8 apndl∘[f, [g1, …, gn]] ≡ [f, g1, …, gn]
    null∘g →→ apndl∘[f, g] ≡ [f]
    And so on for apndr, reverse, rotl, etc.
I.9 […, ⊥̄, …] ≡ ⊥̄
I.10 apndl∘[f∘g, αf∘h] ≡ αf∘apndl∘[g, h]
I.11 pair & not∘null∘1 →→ apndl∘[[1∘1, 2], distr∘[tl∘1, 2]] ≡ distr
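Since each law asserts that two function expressions denote the same function, any law can be spot-checked by applying both sides to sample objects. The sketch below uses the same assumed Python modeling as earlier (tuples for sequences, a sentinel for ⊥) to check the construction-composition proposition and law I.10 on numeric test data; the sample functions f, g, h are arbitrary choices of mine.

```python
# Spot-checking two laws of the algebra numerically (illustrative modeling;
# the sample functions below are chosen only so both sides are defined).
BOTTOM = "_|_"

def compose(f, g):
    return lambda x: f(g(x))

def construction(*fs):
    def con(x):
        ys = tuple(fn(x) for fn in fs)
        return BOTTOM if BOTTOM in ys else ys
    return con

def apndl(x):
    """apndl:<y,<z1,...,zn>> = <y,z1,...,zn>"""
    if isinstance(x, tuple) and len(x) == 2 and isinstance(x[1], tuple):
        return (x[0],) + x[1]
    return BOTTOM

def alpha(f):
    """Apply to all."""
    def app(x):
        if not isinstance(x, tuple):
            return BOTTOM
        ys = tuple(f(e) for e in x)
        return BOTTOM if BOTTOM in ys else ys
    return app

f = lambda x: 2 * x                  # sample functions on numbers
g = lambda x: x + 1
h = lambda x: 3 * x
h_seq = lambda x: (x, x + 1, x + 2)  # sample function yielding a sequence

# Proposition above: ([f,g] . h):x = [f.h, g.h]:x
law1_lhs = compose(construction(f, g), h)
law1_rhs = construction(compose(f, h), compose(g, h))

# Law I.10: apndl . [f.g, (alpha f).h] = (alpha f) . apndl . [g, h]
i10_lhs = compose(apndl, construction(compose(f, g), compose(alpha(f), h_seq)))
i10_rhs = compose(alpha(f), compose(apndl, construction(g, h_seq)))
```

Such a check is of course no substitute for the validating propositions, but it is a cheap way to catch a mis-stated law.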

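Laws of this kind can be spot-checked mechanically by running both sides on sample objects. Below is a minimal Python sketch of a few FP combining forms, checking law I.1, [f,g]∘h ≡ [f∘h, g∘h], pointwise; the names compose, construct, and apply_to_all are illustrative choices of mine, not Backus's notation.

```python
def compose(f, g):            # f o g
    return lambda x: f(g(x))

def construct(*fs):           # [f1, ..., fn]: a sequence of the results
    return lambda x: tuple(f(x) for f in fs)

def apply_to_all(f):          # af: apply f to every element of a sequence
    return lambda xs: tuple(f(x) for x in xs)

# Law I.1: [f, g] o h  ==  [f o h, g o h]  (checked pointwise)
f = lambda x: 2 * x
g = lambda x: x * x
h = lambda x: x + 1

lhs = compose(construct(f, g), h)
rhs = construct(compose(f, h), compose(g, h))
print(lhs(3), rhs(3))   # both (8, 16)
```

A pointwise check is of course no proof, but it makes the content of a law concrete before reading its validating proposition.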
Where f&g ≡ and∘[f,g]; pair ≡ atom → F̄; eq∘[length, 2̄]; and f² ≡ f∘f.

II Composition and condition (right-associated parentheses omitted; law II.2 is noted in Manna et al. [16], p. 493)
II.1 (p→f; g)∘h ≡ p∘h → f∘h; g∘h
II.2 h∘(p→f; g) ≡ p → h∘f; h∘g
II.3 or∘[q, not∘q] →→
and∘[p,q] → f; and∘[p, not∘q] → g; h ≡ p → (q→f; g); h
II.3.1 p → (p→f; g); h ≡ p → f; h

III Composition and miscellaneous
III.1 x̄∘f ≤ x̄; defined∘f →→ x̄∘f ≡ x̄
III.1.1 ⊥̄∘f ≡ f∘⊥̄ ≡ ⊥̄
III.2 f∘id ≡ id∘f ≡ f
III.3 pair →→ 1∘distr ≡ [1∘1, 2]; also: pair →→ 1∘tl ≡ 2, etc.
III.4 α(f∘g) ≡ αf∘αg
III.5 null∘g →→ αf∘g ≡ φ̄

PROPOSITION 2: pair & not∘null∘1 →→ apndl∘[[1∘1, 2], distr∘[tl∘1, 2]] ≡ distr
PROOF. We show that both sides produce the same result when applied to any pair ⟨x,y⟩, where x ≠ φ, as per the stated qualification.
CASE 1. x is an atom or ⊥. Then distr:⟨x,y⟩ = ⊥, since x ≠ φ. The left side also yields ⊥ when applied to ⟨x,y⟩, since tl∘1:⟨x,y⟩ = ⊥ and all functions are ⊥-preserving.
CASE 2. x = ⟨x1, ..., xn⟩. Then
apndl∘[[1∘1, 2], distr∘[tl∘1, 2]]:⟨x,y⟩
= apndl:⟨⟨x1,y⟩, distr:⟨⟨x2, ..., xn⟩, y⟩⟩
= apndl:⟨⟨x1,y⟩, φ⟩ = ⟨⟨x1,y⟩⟩   if tl:x = φ
also: = apndl:⟨⟨x1,y⟩, ⟨⟨x2,y⟩, ..., ⟨xn,y⟩⟩⟩
= ⟨⟨x1,y⟩, ⟨x2,y⟩, ..., ⟨xn,y⟩⟩   if tl:x ≠ φ
= distr:⟨x,y⟩ □

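The laws relating condition to composition can be checked the same way. A small Python sketch, again with names of my own choosing, verifying II.1 and II.2 on sample points:

```python
def compose(f, g):                 # f o g
    return lambda x: f(g(x))

def condition(p, f, g):            # (p -> f; g)
    return lambda x: f(x) if p(x) else g(x)

p = lambda x: x > 0
f = lambda x: x + 100
g = lambda x: -x
h = lambda x: x - 5

# II.1: (p -> f; g) o h  ==  (p o h -> f o h; g o h)
lhs = compose(condition(p, f, g), h)
rhs = condition(compose(p, h), compose(f, h), compose(g, h))
print([lhs(x) for x in (0, 5, 10)], [rhs(x) for x in (0, 5, 10)])  # identical lists

# II.2: h o (p -> f; g)  ==  (p -> h o f; h o g)
lhs2 = compose(h, condition(p, f, g))
rhs2 = condition(p, compose(h, f), compose(h, g))
print(lhs2(7), rhs2(7))   # both 102
```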
IV Condition and construction
IV.1 [f1, ..., (p→g; h), ..., fn] ≡ p → [f1, ..., g, ..., fn]; [f1, ..., h, ..., fn]
IV.1.1 [f1, ..., (p1→g1; ...; pn→gn; h), ..., fm]
≡ p1 → [f1, ..., g1, ..., fm];
...; pn → [f1, ..., gn, ..., fm]; [f1, ..., h, ..., fm]

This concludes the present list of algebraic laws; it is by no means exhaustive, there are many others.

Proofs of two laws
We give the proofs of validating propositions for laws I.10 and I.11, which are slightly more involved than most of the others.
PROPOSITION 1: apndl∘[f∘g, αf∘h] ≡ αf∘apndl∘[g, h]
PROOF. We show that, for every object x, both of the above functions yield the same result.
CASE 1. h:x is neither a sequence nor φ. Then both sides yield ⊥ when applied to x.
CASE 2. h:x = φ. Then
apndl∘[f∘g, αf∘h]:x = apndl:⟨f∘g:x, φ⟩ = ⟨f:(g:x)⟩
αf∘apndl∘[g, h]:x = αf∘apndl:⟨g:x, φ⟩ = αf:⟨g:x⟩ = ⟨f:(g:x)⟩
CASE 3. h:x = ⟨y1, ..., yn⟩. Then
apndl∘[f∘g, αf∘h]:x = apndl:⟨f∘g:x, αf:⟨y1, ..., yn⟩⟩ = ⟨f:(g:x), f:y1, ..., f:yn⟩
αf∘apndl∘[g, h]:x = αf∘apndl:⟨g:x, ⟨y1, ..., yn⟩⟩ = αf:⟨g:x, y1, ..., yn⟩ = ⟨f:(g:x), f:y1, ..., f:yn⟩ □

12.3 Example: Equivalence of Two Matrix Multiplication Programs
We have seen earlier the matrix multiplication program:
Def MM ≡ ααIP ∘ αdistl ∘ distr ∘ [1, trans∘2].
We shall now show that its initial segment, MM', where
Def MM' ≡ ααIP ∘ αdistl ∘ distr,
can be defined recursively. (MM' "multiplies" a pair of matrices after the second matrix has been transposed. Note that MM', unlike MM, gives ⊥ for all arguments that are not pairs.) That is, we shall show that MM' satisfies the following equation which recursively defines the same function (on pairs):
f ≡ null∘1 → φ̄; apndl∘[αIP∘distl∘[1∘1, 2], f∘[tl∘1, 2]].
Our proof will take the form of showing that the following function, R,
Def R ≡ null∘1 → φ̄; apndl∘[αIP∘distl∘[1∘1, 2], MM'∘[tl∘1, 2]],
is, for all pairs ⟨x, y⟩, the same function as MM'. R "multiplies" two matrices, when the first has more than zero rows, by computing the first row of the "product" (with αIP∘distl∘[1∘1, 2]) and adjoining it to the "product" of the tail of the first matrix and the second matrix. Thus the theorem we want is
THEOREM: pair →→ MM' ≡ R,
from which the following is immediate:
MM ≡ MM'∘[1, trans∘2] ≡ R∘[1, trans∘2];
where Def pair ≡ atom → F̄; eq∘[length, 2̄].

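The program MM can be transcribed combinator by combinator into ordinary code, which helps in reading the proof that follows. A Python sketch, with distr, distl, trans, and the inner product IP spelled out under names of my own choosing:

```python
def trans(m):                      # transpose a matrix (sequence of rows)
    return tuple(zip(*m))

def distr(pair):                   # <<x1,...,xn>, y> -> <<x1,y>, ..., <xn,y>>
    xs, y = pair
    return tuple((x, y) for x in xs)

def distl(pair):                   # <y, <x1,...,xn>> -> <<y,x1>, ..., <y,xn>>
    y, xs = pair
    return tuple((y, x) for x in xs)

def ip(pair):                      # IP: inner product of a pair of vectors
    u, v = pair
    return sum(a * b for a, b in zip(u, v))

def mm(pair):                      # MM = aaIP o adistl o distr o [1, trans o 2]
    a, b = pair
    step1 = (a, trans(b))                        # [1, trans o 2]
    step2 = distr(step1)                         # pair each row of a with trans(b)
    step3 = tuple(distl(p) for p in step2)       # adistl
    return tuple(tuple(ip(q) for q in row) for row in step3)   # aaIP

print(mm((((1, 2), (3, 4)), ((5, 6), (7, 8)))))  # ((19, 22), (43, 50))
```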
Def MM' ≡ ααIP ∘ αdistl ∘ distr
Def R ≡ null∘1 → φ̄; apndl∘[αIP∘distl∘[1∘1, 2], MM'∘[tl∘1, 2]]
PROOF.
CASE 1. pair & null∘1 →→ MM' ≡ R.
pair & null∘1 →→ R ≡ φ̄   by def of R
pair & null∘1 →→ MM' ≡ φ̄
since distr:⟨φ, x⟩ = φ by def of distr, and αf:φ = φ by def of Apply to all. And so: ααIP ∘ αdistl ∘ distr:⟨φ, x⟩ = φ. Thus pair & null∘1 →→ MM' ≡ R.
CASE 2. pair & not∘null∘1 →→ MM' ≡ R.
pair & not∘null∘1 →→ R ≡ R',   (1)
by def of R and R', where
Def R' ≡ apndl∘[αIP∘distl∘[1∘1, 2], MM'∘[tl∘1, 2]].
We note that
R' ≡ apndl∘[f∘g, αf∘h]
where
f ≡ αIP∘distl, g ≡ [1∘1, 2], and h ≡ distr∘[tl∘1, 2];
αf ≡ α(αIP∘distl) ≡ ααIP∘αdistl (by III.4).   (2)
Thus, by I.10,
R' ≡ αf∘apndl∘[g, h].   (3)
Now apndl∘[g, h] ≡ apndl∘[[1∘1, 2], distr∘[tl∘1, 2]], thus, by I.11,
pair & not∘null∘1 →→ apndl∘[g, h] ≡ distr.   (4)
And so we have, by (1), (2), (3) and (4),
pair & not∘null∘1 →→ R ≡ R' ≡ αf∘distr ≡ ααIP∘αdistl∘distr ≡ MM'.
Case 1 and Case 2 together prove the theorem. □

12.4 Expansion Theorems
In the following subsections we shall be "solving" some simple equations (where by a "solution" we shall mean the "least" function which satisfies an equation). To do so we shall need the following notions and results drawn from the later subsection on foundations of the algebra, where their proofs appear.
12.4.1 Expansion. Suppose we have an equation of the form
f ≡ E(f)   (E1)
where E(f) is an expression involving f. Suppose further that there is an infinite sequence of functions fi for i = 0, 1, 2, ..., each having the following form:
f0 ≡ ⊥̄
fi+1 ≡ p0 → q0; ...; pi → qi; ⊥̄   (E2)
where the pi's and qi's are particular functions, so that E has the property:
E(fi) ≡ fi+1 for i = 0, 1, 2, ...   (E3)
Then we say that E is expansive and has the fi's as approximating functions.
If E is expansive and has approximating functions as in (E2), and if f is the solution of (E1), then f can be written as the infinite expansion
f ≡ p0 → q0; ...; pn → qn; ...   (E4)
meaning that, for any x, f:x ≠ ⊥ iff there is an n ≥ 0 such that (a) pi:x = F for all i < n, and (b) pn:x = T, and (c) qn:x ≠ ⊥. When f:x ≠ ⊥, then f:x = qn:x for this n. (The foregoing is a consequence of the "expansion theorem.")
12.4.2 Linear expansion. A more helpful tool for solving some equations applies when, for any function h,
E(h) ≡ p0 → q0; E1(h)   (LE1)
and there exist pi and qi such that
E1(pi → qi; h) ≡ pi+1 → qi+1; E1(h) for i = 0, 1, 2, ...   (LE2)
and
E1(⊥̄) ≡ ⊥̄.   (LE3)
Under the above conditions E is said to be linearly expansive. If so, and f is the solution of
f ≡ E(f)   (LE4)
then E is expansive and f can again be written as the infinite expansion
f ≡ p0 → q0; ...; pn → qn; ...   (LE5)
using the pi's and qi's generated by (LE1) and (LE2).
Although the pi's and qi's of (E4) or (LE5) are not unique for a given function, it may be possible to find additional constraints which would make them so, in which case the expansion (LE5) would comprise a canonical form for a function. Even without uniqueness these expansions often permit one to prove the equivalence of two different function expressions, and they often clarify a function's behavior.

12.5 A Recursion Theorem
Using three of the above laws and linear expansion, one can prove the following theorem of moderate generality that gives a clarifying expansion for many recursively defined functions.
RECURSION THEOREM: Let f be a solution of
f ≡ p → g; Q(f)   (1)
where
Q(k) ≡ h∘[i, k∘j] for any function k   (2)
and p, g, h, i, j are any given functions, then

627 Communications August 1978 of Volume 2 l the ACM Number 8 f-p---> ~,,poj-. Q(g); ... ;pojn'-> Q~(g); ... (3) Using these results for ios ~, eq0os ~, and Qn(~) in the previous expansion for f, we obtain (where Q~(g) is ho[i, Qn-a(g)°J], and j~ is joj n-1 for n >__ 2) and fix - x=O--~ 1; ... ; x=n --~n×(n- l) x...x Ix l;... Qn(g) ___/h o[i, ioj, .... ioj n-a, gojn]. (4) Thus we have proved thatf terminates on precisely the PROOF. We verify thatp --> g;, Q(f) is linearly expansive. set of nonnegative integers and that it is the factorial Let p~, qn and k be any functions. Then function thereon. Q(pn ~ qn, k) - ho[i, (pn ---~ q~; k)°j] by (2) 12.6 An Iteration Theorem - ho[i, (pn°j--~ qn°j; koj)] by II.l This is really a corollary of the recursion theorem. It - ho(p~oj---~ [i, qnOj]; [i, koj]) by IV.l gives a simple expansion for many iterative programs. - p~oj---~ ho[i, q~oj]; ho[i, koj] by II.2 ITERATION THEOREM: Letf be the solution (i.e., the least -p, oj-~ Q(q~); Q(k) by (2) (5) solution) of Thus ifpo -p and qo - g, then (5) gives px -poj and f - p---~ g;, hofok ql = Q(g) and in general gives the following functions satisfying (LE2) then pn -p°j n and qn -~ Qn(g). (6) f =_ p .-o g; pok ~ hogok; ... ; pok n --o hnogok~; ...

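The shape covered by the recursion theorem, f ≡ p → g; h∘[i, f∘j], can be transcribed directly; instantiated with p = eq0, g = the constant 1, h = multiply, i = id, and j = subtract-one, it yields the recursive factorial of the example that follows. The helper names below are mine:

```python
def recursion_form(p, g, h, i, j):
    """Least solution of f = (p -> g; h o [i, f o j]), sketched as plain recursion."""
    def f(x):
        return g(x) if p(x) else h((i(x), f(j(x))))
    return f

fact = recursion_form(
    p=lambda x: x == 0,                  # eq0
    g=lambda x: 1,                       # the constant function 1
    h=lambda pair: pair[0] * pair[1],    # x (multiply a pair)
    i=lambda x: x,                       # id
    j=lambda x: x - 1,                   # s (subtract 1)
)
print(fact(5))   # 120
```

Running fact on a negative argument fails to terminate, matching the theorem's conclusion that f is defined precisely where some p∘jⁿ eventually holds.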
Finally, PROOF. Let h' - ho2, i' =- id, f =- k, then Q(i) -- ho[i, ioj] f -- p ---> g; h' o[i', foj'] - ho[i, &] by III.l.l since ho2o[id, fok] - hofok by 1.5 (id is defined except = ho~ by 1.9 for A_, and the equation holds for _1_). Thus the recursion ~- i by III.1.1. (7) theorem gives Thus (5) and (6) verify (LE2) and (7) verifies (LE3), with f_p___> g; ... ;pokn _.> Qnfg); ... E1 -- Q. If we let E(--f) -= p ~ g; Q(f), then we have (LE1); thus E is linearly expansive. Since f is a solution where off-- E(f), conclusion (3) follows from (6) and (LE5). Qn(g) _ ho2o[id, Qn-l(g)ok] Now =_ hoOn-t(g)ok =__ hnogok n Q~(g) = ho[i, Qn-~(g)°J] byI.5 [] - ho[i, ho[ioj, .... ho[ioj n-~, goj n] ... ]] 12.6.1 Example: Correctness proof for an iterative factorial function. Letf be the solution of by I. 1, repeatedly f- eq0ol --> 2;fo[so 1, ×] -/ho[i, ioj, .... iojn-l,'g°j ~] by 1.3 (8) where Def s - -o[id, i] (substract 1). We want to prove Result (8) is the second conclusion (4). [] thatf.' = x! iff x is a nonnegative integer. Let p -= 12.5.1 Example: correctness proof of a reeursive eq0o 1, g - 2, h - id, k - [so 1, ×]. Then factorial function. Letfbe a solution of f-p --> g; hofok f-eq0--~ ]; ×o[id, fos] and so where f-p--> g; ... ;pokn ~ g°kn; ... (1) Def s - -o[id, i] (subtract 1). by the iteration theorem, since h n - id. We want to show Thenf satisfies the hypothesis of the recursion theorem that with p - eq0, g - L h - x, i - id, and j - s. Therefore pair--->---> k n --- Jan, bn] (2) f- eq0 --~ ]; ... ; eq0os n --~ Q~(h; ... holds for every n _> 1, where and an - s% 1 (3) Q~()) _/× o [id, idos ..... idos n-l, ]osn]. bn -/× ° [s~-1° 1..... so l, 1, 2] (4) NOW idos k -~ s k by III.2 and eq0os n --.--* los n - ] by Now (2) holds for n = 1 by definition of k. We assume III.1, since eq0osn:x implies defmedosn:x; and also it holds for some n _ 1 and prove it then holds for eq0osn:x --- eq0: (x - n) - x=n. Thus if eq0osn: x = T, n + 1. 
then x = n and
Qⁿ(1̄):n = n × (n − 1) × ... × (n − (n − 1)) × (1̄:(n − n)) = n!.
Now
pair →→ kⁿ⁺¹ ≡ k∘kⁿ ≡ [s∘1, ×]∘[an, bn]   (5)
since (2) holds for n. And so

628 Communications August 1978 of Volume 21 the ACM Number 8 pair-->--* k ~+~ =- [soan, Xo[an, bn]] by 1.1 and 1.5 (6) since pn and p" provide the qualification needed for q~ -- q" - h%2. To pass from (5) to (6) we must check that whenever an Now suppose there is an x such thatfx # g:x. Then or bn yield £ in (5), so will the right side of (6). Now there is an n such that pi:x = p¢:x = F for i < n, and p,:x soan =- s n+l° 1 -- an+l (7) # p~:x. From (12) and (13) this can only happen when ×o[an, b.] -/× ° [s~° 1, s n-lo 1..... so 1, 1, 2] h%2:x = ±. But since h is ±-preserving, hmo2:x = I for - b.+l by 1.3. (8) all m _> n. Hencef:x = g:x = i by (14) and (15). This contradicts the assumption that there is an x for which Combining (6), (7), and (8) gives fix # g:x. Hence f- g. pair--->---> k ~+~ - [an+l, bn+l]. (9) This example (by J. H. Morris, Jr.) is treated more elegantly in [16] on p. 498. However, some may find that Thus (2) holds for n = 1 and holds for n + 1 whenever the above treatment is more constructive, leads one more it holds for n, therefore, by induction, it holds for every mechanically to the key questions, and provides more n _> 1. Now (2) gives, for pairs: insight into the behavior of the two functions. definedok n --,--. pok n = eq0o l o[an, bn] - eq0oan = eq0os"o 1 (10) 12.7 Nonlinear Equations defmedok n ---~--~ gok ~ The preceding examples have concerned "linear" ---- 2°[an, bn] --/× o [s~-X°l ..... sol, 1, 2] (11) equations (in which the "unknown" function does not have an argument involving itself). The question of the (both use 1.5). Now (1) tells us thatf.' is defined iff existence of simple expansions that "solve .... quadratic" there is an n such thatpoki: = F for all i < n, and and higher order equations remains open. pok": = T, that .is, by (10), eq0os~:x = T, i.e., The earlier examples concerned solutions off-- E(f), x=n; and goId: is defined, in which case, by (1 l), where E is linearly expansive. 
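The iteration theorem's shape, f ≡ p → g; h∘f∘k, can likewise be run directly. With the instance of Section 12.6.1 (p = eq0∘1, g = 2, h = id, k = [s∘1, ×]) it computes factorial on a pair ⟨n, acc⟩. A sketch, names mine:

```python
def iteration_form(p, g, h, k):
    """Least solution of f = (p -> g; h o f o k), sketched as plain recursion."""
    def f(x):
        return g(x) if p(x) else h(f(k(x)))
    return f

# Iterative factorial on a pair <n, acc>:  f = (eq0 o 1 -> 2; f o [s o 1, x])
fact_pair = iteration_form(
    p=lambda t: t[0] == 0,                  # eq0 o 1
    g=lambda t: t[1],                       # 2 (select the accumulator)
    h=lambda y: y,                          # id
    k=lambda t: (t[0] - 1, t[0] * t[1]),    # [s o 1, x]
)
print(fact_pair((5, 1)))   # 120
```

Because h = id here, each step only rewrites the argument pair, which is exactly why the theorem's expansion f ≡ p → g; p∘k → h∘g∘k; ... reads like a loop unrolled.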
The following example involves an E(f) that is quadratic and expansive (but f =/X: = n!, not linearly expansive). which is what we set out to prove. 12.7.1 Example: proof of idemlmtency ([16] p. 497). 12.6.2 Example: proof of equivalence of two iterative Letfbe the solution of programs. In this example we want to prove that two f-= E(f) -p--~ id;f%h. (1) iteratively defined programs,f and g, are the same func- tion. Letfbe the solution of We wish to prove that f--f2. We verify that E is expansive (Section 12.4.1) with the following approxi- f=_pol ~ 2; hofo[kol, 2]. (1) mating functions: Let g be the solution of j~-= i (2a) g - po 1 ~ 2; go[ko 1, ho2]. (2) fn -- p ~ id; ... ; poh n-1 ----> hn-1; J. for n > 0 (2b) First we note that p ~ fn - id and so Then, by the iteration theorem: (3) p°hi > >fn°hi --- hi. (3) f - p0 ---, q0; ... ;pn ---) qn; ... g --p6---> q6; ... ," p, ' ---) q .....'" (4) Now E(J~) -p --~ id; J_2oh ~-Jq, (4) where (letting r ° =- id for any r), for n = 0, 1.... and pn -polo[kol, 2] n --polo[k%l, 2] by 1.5.1 (5) E(fn) q, = h%2o[ko 1, 2] n -- h%2o[/Co 1, 2] by 1.5.1 (6) - p ---> id;f~o(p ---> id; ... ;poh n-1 ~ hn-12 j_)oh p'n-p°lo[k°l,h°2]n-p°lo[k%l,h%2] by 1.5.1 (7) -= p ~ id;fn°(p°h ~ h; ... ; p°h n --) h"; ± °h) q~ -= 2o[ko 1, ho2]" - 2o[/,3ol, hno2] by 1.5.1. (8) - p ..-.) id; poh --., f~oh; ... ; poh'~ ---~f~ oh~; fn Oi - p---> id; p°h---~ h; ... ; p°h"--) hn; & by(3) Now, from the above, using 1.5, -fn+~. (5) defmedo2 ~ p~ - po/Co 1 (9) Thus E is expansive by (4) and (5); so by (2) and Section defmedoh%2 ~ - p" - pok% 1 (10) 12.4.1 (E4) defmedo~o I ) • q~ = q~ - hno2 (11) f = p --* id; ... ; poh ~ --* h"; .... (6) Thus But (6), by the iteration theorem, gives defmedohno2 ) ) defmedo2 - f (12) defmedoh%2, • • p~ - p" (13) f- p --) id;foh. (7) and Now, ffp:x = T, thenf.'x = x =f2:x, by (1). Ifp:x = F, then f--po--> qo; ... ;pn---~ h%2; ... (14) g ---p~ ~ q~; ... ;p~ ~ h%2; ... (15) fix = f2oh:x by (1)

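The quadratic equation f ≡ p → id; f²∘h of the idempotency example can also be executed: its solution applies h repeatedly until p holds, so f∘f = f wherever f is defined. A sketch with sample p and h of my own choosing:

```python
def solve(p, h):
    """Least solution of f = (p -> id; f o f o h), as plain recursion."""
    def f(x):
        return x if p(x) else f(f(h(x)))
    return f

f = solve(p=lambda x: x >= 10, h=lambda x: x + 3)
print(f(4), f(f(4)))   # 10 10  -> f o f agrees with f here
```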
629 Communications August 1978 of Volume 21 the ACM Number 8 = f'.(foh:x) =f.'(f.'x) by (7) DEFINITION. Let E(f) be a function expression satisfying = f2:x. the following: Ifp:x iS neither T nor F, thenf.'x -- ± =f2:x. Thus E(h) -po---~ qo; El(h) for all h E F (EEl) f_f2. where pi E F and qi E F exist such that El(pi ~ qi; h) - pi+l ~ qi+l; El(h) 12.8 Foundations for the Algebra of Programs for all h E F and i = 0, 1.... (LE2) Our purpose in this section is to establish the validity of the results stated in Section 12.4. Subsequent sections and do not depend on this one, hence it can be skipped by EI(_T_) -= &. (LE3) readers who wish to do so. We use the standard concepts and results from [16], but the notation used for objects Then E is said to be linearly expansive with respect to and functions, etc., will be that of this paper. these pi's and qi's. We take as the domain (and range) for all functions LINEAR EXPANSION THEOREM: Let E be linearly expansive the set O of objects (which includes ±) of a given FP with respect to pi and qi, i = 0, 1..... Then E is expansive system. We take F to be the set of functions, and F to be with approximating functions the set of functional forms of that FP system. We write E(f) for any function expression involving functional fo- i (1) forms, primitive and defined functions, and the function f,+l -po ~ q0; ... ;pi ~ qi; i. (2) symbol f, and we regard E as a functional that maps a PROOF. We want to show that E(fi) =-fi+~ for any i _ 0. function f into the corresponding function E(f). We Now assume that all f ~ F are &-preserving and that all functional forms in F correspond to continuous function- E(fo) = p0---~ qo; E~ (i) ----p0---~ q0; & --fi (3) als in every variable (e.g., [f, g] is continuous in bothf by (LE1) (LE3) (1). and g). (All primitive functions of the FP system given Let i > 0 be fLxed and let earlier are _L-preserving, and all its functional forms are continuous.) 
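The approximating functions f0 ≡ ⊥̄, fi+1 ≡ E(fi) defined above can be observed concretely. In the sketch below, ⊥ is modeled by Python's None (an encoding choice of mine), and E is the functional of the recursive factorial; the i-th approximation is then defined exactly on the arguments 0, ..., i−1.

```python
def bottom(x):
    return None                      # models the function that is ⊥ everywhere

def E(f):
    """E(f) = (eq0 -> 1; x o [id, f o s]), with None propagated as ⊥."""
    def g(x):
        if x == 0:
            return 1
        r = f(x - 1)
        return None if r is None else x * r
    return g

f3 = E(E(E(bottom)))                 # the third approximating function
print([f3(n) for n in range(4)])     # [1, 1, 2, None]: defined exactly below 3
```

Iterating E further extends the domain one argument at a time, which is the finite image of the infinite expansion (E4).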
fi = po ~ qo; W1 (4a) W1 ~ px ~ ql; W2 (4b) DEFINITIONS. Let E(f) be a function expression. Let etc. fo-=£ Wi--1 ~ pi-1 ~ qi--1; ~. (4-) fi+x ----po ~ qo; ... ; pi ~ qi; J- for i = 0, 1.... Then, for this i > 0 where pi, qi E F. Let E have the property that E(fi) - p0 ~ q0; El(fi) by (LE1) E(fi)----fi+~ fori=0,1 ..... E~(fi) - pl ~ ql; El(Wa) by (LE2) and (4a) El(w~) - p2 --~ q2; E~(w2) by (LE2) and (4b) Then E is said to be expansive with the approximating functionsfi. We write etc. E~(wi-~) -= pi ~ qi; E~ (i) by (LE2) and (4-) f=po---~ q0; ... ;pn---~ qn;-.. - pi --~ qi; A- by (LE3) to mean that f = limi{fi}, where the fi have the form above. We call the right side an infinite expansion off. Combining the above gives We takef.'x to be defined iff there is an n _> 0 such that E(fi) -f+l for arbitrary i > 0, by (2). (5) (a) pi:x = F for all i < n, and (b) p,:x = T, and (c) qn:x By (3), (5) also holds for i -- 0; thus it holds for all i >__ 0. is defined, in which casef.'x = qn:X. Therefore E is expansive and has the required approxi- EXPANSION THEOREM: Let E(f) be expansive with ap- mating functions. [] proximating functions as above. Let f be the least func- COROLLARY. If E is linearly expansive with respect to pi tion satisfying and qi, i = 0, 1..... andfis the least function satisfying f~ E(f). f---- E(f) (LE4) Then then f-- p0 ~ q0; ... ; p, ~ qn; ... f - po ~ qo; ... ; pn ---~ qn; .... (LE5) PROOF. Since E is the composition of continuous func- tionals (from F) involving only monotonic functions (_l_-preserving functions from F) as constant terms, E is 12.9 The Algebra of Programs for the Lambda Calculus continuous ([16] p. 493). Therefore its least fixed pointf and for Combinators is limi{Ei(j-)} -= limi(fi} ([16] p. 494), which by defmition Because Church's lambda calculus [5] and the system is the above inf'mite expansion forf. [] of combinators developed by Sch6nfinkel and Curry [6]

are the primary mathematical systems for representing the notion of application of functions, and because they are more powerful than FP systems, it is natural to enquire what an algebra of programs based on those systems would look like.
The lambda calculus and combinator equivalents of FP composition, f∘g, are
λfgx.(f(gx)) ≡ B
where B is a simple combinator defined by Curry. There is no direct equivalent for the FP object ⟨x, y⟩ in the Church or Curry systems proper; however, following Landin [14] and Burge [4], one can use the primitive functions prefix, head, tail, null, and atomic to introduce the notion of list structures that correspond to FP sequences. Then, using FP notation for lists, the lambda calculus equivalent for construction is λfgx.⟨fx, gx⟩. A combinatory equivalent is an expression involving prefix, the null list, and two or more basic combinators. It is so complex that I shall not attempt to give it.
If one uses the lambda calculus or combinatory expressions for the functional forms f∘g and [f, g] to express the law I.1 in the FP algebra, [f,g]∘h ≡ [f∘h, g∘h], the result is an expression so complex that the sense of the law is obscured. The only way to make that sense clear in either system is to name the two functionals: composition ≡ B, and construction ≡ A, so that Bfg ≡ f∘g, and Afg ≡ [f, g]. Then I.1 becomes
B(Afg)h ≡ A(Bfh)(Bgh),
... of these complexities. (The Church-Rosser theorem, or Scott's proof of the existence of a model [22], is required to show that the lambda calculus has a consistent semantics.) The definition of pure Lisp contained a related error for a considerable period (the "funarg" problem). Analogous problems attach to Curry's system as well.
In contrast, the formal (FFP) version of FP systems (described in the next section) has no variables and only an elementary substitution rule (a function for its name), and it can be shown to have a consistent semantics by a relatively simple fixed-point argument along the lines developed by Dana Scott and by Manna et al. [16]. For such a proof see McJones [18].

12.10 Remarks
The algebra of programs outlined above needs much work to provide expansions for larger classes of equations and to extend its laws and theorems beyond the elementary ones given here. It would be interesting to explore the algebra for an FP-like system whose sequence constructor is not ⊥-preserving (law I.5 is strengthened, but IV.1 is lost). Other interesting problems are: (a) find rules that make expansions unique, giving canonical forms for functions; (b) find algorithms for expanding and analyzing the behavior of functions for various classes of arguments; and (c) explore ways of using the laws and theorems of the algebra as the basic rules either of a formal, preexecution "lazy evaluation" scheme [9, 10], or of one which operates during execution. Such schemes would, for example, make use of the law 1∘[f, g] ≡ f

631 Communications August 1978 of Volume 21 the ACM Number 8 13.2 Syntax sible a function, apply, which is meaningless in FP We describe the set O of objects and the set E of systems: expre.,;sions of an FFP system. These depend on the apply: = (x:y). choice of some set A of atoms, which we take as given. We assume that T (true), F (false), ff (the empty se- The result of apply:, namely (x:y), is meaningless quence), and # (default) belong to A, as well as "num- in FP systems on two levels. First, (x:y) is not itself an bers" of various kinds, etc. object; it illustrates another difference between FP and 1) Bottom, ±, is an object but not an atom. FFP systems: some FFP functions, like apply, map ob- 2) Every atom is an object. jects into expressions, not directly into objects as FP 3) Every object is an expression. functions do. However, the meaning of apply: is 4) If x~..... xn are objects [expressions], then an object (see below). Second, (x:y) could not be even an is an object [resp., expression] called a intermediate result in an FP system; it is meaningless in sequence (of length n) for n _> 1. The object [expression] FP systems since x is an object, not a function and FP xi for 1 ___ i __% n, is the ith element of the sequence systems do not associate functions with objects. Now if . (ff is both a sequence and an atom; APPLY represents apply, then the meaning of its length is 0.) (APPL Y:) is 5) If x and y are expressions, then (x:y) is an expression #(APPL Y:) called an application, x is its operator andy is its operand. = #((pAPPL Y):) Both are elements of the expression. = #(apply:) 6) If x = and if one of the elements of x is = It(NULL:A) = #((pNULL):A) _1_, then x = .1_. That is, <..., ± .... > = ±. = #(null:A) = #F = F. 7) All objects and expressions are formed by finite use of the above rules. The last step follows from the fact that every object is its A subexpression of an expression x is either x itself or own meaning. 
Since the meaning function/t eventually a subexpression of an element of x. An FFP object is an evaluates all applications, one can think of expression that has no application as a subexpression. apply as yielding F even though the actual Given the same set of atoms, FFP and FP objects are result is (NULL:A). the same. 13.3.2 How objects represent functions; the repre- 13.3 Informal Remarks About FFP Semantics sentation function #. As we have seen, some atoms 13.3.1 The meaning of expressions; the semantic (primitive atoms) will represent the primitive functions of function p. Every FFP expression e has a meaning, #e, the system. Other atoms can represent defined functions which is always an object; #e is found by repeatedly just as symbols can in FP systems. If an atom is neither replacing each innermost application in e by its meaning. primitive nor defined, it represents 1, the function which If this process is nonterminating, the meaning of e is ±. is .1_ everywhere. The meaning of an innermost application (x:y) (since it Sequences also represent functions and are analogous is innermost, x and y must be objects) is the result of to the functional forms of FP. The function represented applying the function represented by x to y, just as in FP by a sequence is given (recursively) by the following rule. systems, except that in FFP systems functions are rep- Metacomposition rule resented by objects, rather than by function expressions, (p):y = (pXl):<, y>, with atoms (instead of function symbols) representing primitive and defined functions, and with sequences where the xi's and y are objects. Here pxl determines representing the FP functions denoted by functional what functional form represents, forms. and x2 ..... Xn are the parameters of the form (in FFP, xl The association between objects and the functions itself can also serve as a parameter). 
Thus, for example, they represent is given by the representation function, P, let Def oCONST- 2ol; then in FFP of the FFP system. (Both p and # belong to the descrip- represents the FP functional form ~, since, by the meta- tion of the system, not the system itself.) Thus if the composition rule, ify ~ .1_, atom NULL represents the FP function null, then (o):y = (pCONST):<,y> pNULL = null and the meaning of (NULL:A) is = 20 I:<,y> = x. #(NULL:A) = (pNULL):A = null:A = F. From here on, as above, we use the colon in two senses. Here we can see that the first, controlling, operator of a When it is between two objects, as in (NULL:A), it sequence or form, CONST in this case, always has as its identifies an FFP application that denotes only itself; operand, after metacomposition, a pair whose first ele- when it comes between a function and an object, as in ment is the sequence itseff and whose second element is (oNULL):A or null:A, it identifies an FP-like application the original operand of the sequence, y in this case. The that denotes the result of applying the function to the controlling operator can then rearrange and reapply the object. elements of the sequence and original operand in a great The fact that FFP operators are objects makes pos- variety of ways. The significant point about metacom-

632 Communications August 1978 of Volume 21 the ACM Number 8 position is that it permits the definition of new functional -- Ix(pMLAST:<,>) forms, in effect, merely by defining new functions. It also by metacomposition permits one to write recursive functions without a deft- = #(applyo[1, tlo2]:<,>) nition. = #(apply:<,>) We give one more example of a controlling function = #(:) for a functional form: Def pCONS = aapplyotlodistr. = p(pMLAST:<,>) This defmition results in --where the = ~(lo2:<,>) fi are objects--representing the same function as mn. [pfl ..... #fn]. The following shows this. 13.3.3 Summary of the properties of p and p. So far (p):X we have shown how p maps atoms and sequences into = (pCONS):<,x> functions and how those functions map objects into by metacomposition expressions. Actually, p and all FFP functions can be extended so that they are defined for all expressions. = aapplyotlodistr:<,X> With such extensions the properties of p and/~ can be by def of pCONS summarized as follows:

= aapply:< ..... > 1) # E [expressions ~ objects]. by def of tl and distr and o 2) If x is an object,/~x = x. = ..... apply:> 3) If e is an expression and e = , then by def of a /~e -- <#el, ...,/ten>. - <(fi:x) ..... (fn:X)> by def of apply. 4) p E [expressions ---> ]expressions ---> expressions]]. 5) For any expression e, pe --- p~e). In evaluating the last expression, the meaning function 6) If x is an object and e an expression, then # will produce the meaning of each application, giving px:e = px:~e). pfi:x as the ith element. 7) If x and y are objects, then #(x:y) = #(px:y). In Usually, in describing the function represented by a words: the meaning of an FFP application (x:y) is found sequence, we shall give its overall effect rather than show by applying px, the function represented by x, to y and how its controlling operator achieves that effect. Thus then finding the meaning of the resulting expression we would simply write (which is usually an object and is then its own meaning). (p):x = <(fl:x) ..... (fn:X)> 13.3.4 Cells, fetching, and storing. For a number of reasons it is convenient to create functions which serve instead of the more detailed account above. as names. In particular, we shall need this facility in We need a controlling operator, COMP, to give us describing the semantics of definitions in FFP systems. sequences representing the functional form composition. To introduce naming functions, that is, the ability to We take pCOMP to be a primitive function such that, fetch the contents of a cell with a given name from a for all objects x, store (a sequence of cells) and to store a cell with given (p):X name and contents in such a sequence, we introduce = (jq:(f2:(.-- :(fn:X)...))) for n _> 1. objects called cells and two new functional forms, fetch and store. 
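The metacomposition rule — apply a sequence by applying the function represented by its first element to the pair ⟨whole sequence, operand⟩ — can be sketched as a tiny interpreter. In the sketch below (an encoding of mine, not the paper's formal semantics) atoms are strings, sequences are tuples, and the representation map rho carries only a CONST analogue whose controlling function is 2∘1:

```python
def apply_obj(x, y, rho):
    """Apply the function represented by object x to object y (sketch)."""
    if isinstance(x, tuple):                  # sequence: metacomposition rule
        return apply_obj(x[0], (x, y), rho)   # (rho x1) : <<x1,...,xn>, y>
    return rho[x](y)                          # atom: its primitive function

rho = {
    # rho CONST = 2 o 1: from <<CONST, c>, y> select the stored parameter c
    "CONST": lambda pair: pair[0][1],
}

# <CONST, A> represents the constant function that always returns A:
print(apply_obj(("CONST", "A"), "B", rho))   # A
```

The point the text makes is visible here: the controlling operator receives the whole sequence as data, so new functional forms arise merely from defining new first-element functions.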
(I am indebted to Paul McJones for his observation that Cells ordinary composition could be achieved by this primitive function rather than by using two composition rules in A cell is a triple . We use this the basic semantics, as was done in an earlier paper form instead of the pair so that ceils [21.) can be distinguished from ordinary pairs. Fetch Although FFP systems permit the definition and investigation of new functional forms, it is to be expected The functional form fetch takes an object n as its that most programming would use a fixed set of forms parameter (n is customarily an atom serving as a name); (whose controlling operators are primitives), as in FP, so it is written ~'n (read "fetch n"). Its definition for objects that the algebraic laws for those forms could be em- n and x is ployed, and so that a structured programming style could "rn:x --- x = ~ ---> #; atom:x ~ 3-; be used based on those forms. (l:x) = ---> c; l'notl:x, In addition to its use in defining functional forms, metacomposition can be used to create recursive func- where # is the atom "default." Thus ~'n (fetch n) applied tions directly without the use of recursive definitions of to a sequence gives the contents of the first cell in the the form Deff =- E(f). For example, if pMLAST - sequence whose name is n; If there is no cell named n, nullotl*2 --+ lo2; applyo[1, tlo2], then p - the result is default, #. Thus l'n is the name function for last, where last:x -- x = --+ Xn; 3_. Thus the the name n. (We assume that pFETCH is the primitive operator .works as follows: function such that p -- Tn. Note that Tn simply passes over elements in its operand that are not p(:) cells.)

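The fetch and store forms can be modeled over a storage sequence of cells. A sketch with cells encoded as ("CELL", name, contents) triples and "#" as the default atom (the encoding and function names are my choices):

```python
DEFAULT = "#"

def fetch(n):
    """fetch n: contents of the first cell named n, else the default atom."""
    def fetch_n(store_seq):
        for item in store_seq:
            if isinstance(item, tuple) and len(item) == 3 \
                    and item[0] == "CELL" and item[1] == n:
                return item[2]
        return DEFAULT                  # no cell named n; non-cells are passed over
    return fetch_n

def store(n):
    """store n: <x, y> -> y with its first cell named n removed and a new
    cell <CELL, n, x> pushed at the head."""
    def store_n(pair):
        x, y = pair
        y = list(y)
        for k, item in enumerate(y):    # (pop n): drop the first cell named n
            if isinstance(item, tuple) and item[:2] == ("CELL", n):
                del y[k]
                break
        return (("CELL", n, x),) + tuple(y)   # (push n): new cell at the head
    return store_n

s = store("a")((1, ()))
print(fetch("a")(s), fetch("b")(s))   # 1 #
```

Because push places the new cell at the head while pop removes only the first match, repeated stores under the same name behave like a one-deep replacement; keeping several cells with one name gives the LIFO stacks the text mentions.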
633 Communications August 1978 of Volume 2 I the ACM Number 8 Store and push, pop, purge pendent on D; if Tatom:D # # (that is, atom is defined Like fetch, store takes an object n as its parameter; it in D) then its meaning is It(c:x), where c = Tatom:D, the is written J,n ("store n"). When applied to a pair , contents of the first cell in D named atom. If ~atom:D where y is a sequence, ~,n removes the first cell named n = #, then atom is not defined in D and either atom is from y, if any, then creates a new cell named n with primitive, i.e. the system knows how to compute patom:x, contents x and appends it to y. Before defining ~n (store and It(atom:x) = It(patom:x), otherwise It(atom:x) = ±. n) we shall specify four auxiliary functional forms. (These can be used in combination with fetch n and store 13.4 Formal Semantics for FFP Systems n to obtain multiple, named, LIFO stacks within a We assume that a set A of atoms, a set D of defini- storage sequence.) Two of these auxiliary forms are tions, a set P C A of primitive atoms and the primitive specified by recursive functional equations; each takes functions they represent have all been chosen. We as- an object n as its parameter. sume that pa is the primitive function represented by a if a belongs to P, and that pa = ± if a belongs to Q, the (cellname n) - atom --, F; set of atoms in A-P that are not defined in D. Although eqo[length, 3] ---> eqo[[CELL, h], [1, 2]]; P p is defined for all expressions (see 13.3.3), the formal (push n) - pair -->apndlo[[CELL, h, 1], 2]; ± semantics uses its definition only on P and Q. The (pop n) - null --> qb; functions that p assigns to other expressions x are im- (cellname n)o 1 ---> tl; apndlo [1, (pop n)otl] plicitly determined and applied in the following semantic (purge n) =- null ---> ~; (cellname n)o 1 ---> (purge n)otl; rules for evaluating #(x:y). 
13.3.5 Definitions in FFP systems. The semantics of an FFP system depends on a fixed set of definitions D (a sequence of cells), just as an FP system depends on its informally given set of definitions. Thus the semantic function μ depends on D; altering D gives a new μ' that reflects the altered definitions. We have represented D as an object because in AST systems (Section 14) we shall want to transform D by applying functions to it and to fetch data from it, in addition to using it as the source of function definitions in FFP semantics.

If <CELL, n, c> is the first cell named n in the sequence D (and n is an atom) then it has the same effect as the FP definition Def n ≡ ρc; that is, the meaning of (n:x) will be the same as that of (c:x). Thus, for example, if <CELL, CONST, <COMP, 2, 1>> is the first cell in D named CONST, then it has the same effect as Def CONST ≡ 2∘1, and the FFP system with that D would find

μ(<CONST, y>:z) = y

and consequently

μ(<CONST, A>:B) = A.

In general, in an FFP system with definitions D, the meaning of an application of the form (atom:x) is dependent on D; if ↑atom:D ≠ # (that is, atom is defined in D) then its meaning is μ(c:x), where c = ↑atom:D, the contents of the first cell in D named atom. If ↑atom:D = #, then atom is not defined in D and either atom is primitive, i.e. the system knows how to compute ρatom:x, and μ(atom:x) = μ(ρatom:x); otherwise μ(atom:x) = ⊥.
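The lookup behavior just described, first the definitions D, then the primitives, and bottom otherwise, can be rendered as a toy evaluator. The `App` node, the primitive table, and all Python names are inventions of this sketch; metacomposition and composite operators are omitted.

```python
# A toy rendering of the meaning of (atom:x): consult the definitions D,
# then the primitives, and yield "bottom" otherwise. Metacomposition and
# the other rules of the formal semantics are omitted for brevity.

CELL, DEFAULT, BOTTOM = "CELL", "#", "_|_"

def lookup(n, D):
    # ↑n:D, the contents of the first cell named n, else the default #.
    for c in D:
        if isinstance(c, tuple) and len(c) == 3 and c[:2] == (CELL, n):
            return c[2]
    return DEFAULT

PRIMITIVES = {
    "TL": lambda x: x[1:],    # a stand-in rho(TL): tail of a sequence
    "FIRST": lambda x: x[0],  # a stand-in rho(FIRST): selector 1
}

class App:
    # the application (op : operand)
    def __init__(self, op, operand):
        self.op, self.operand = op, operand

def mu(x, D):
    if not isinstance(x, App):
        return x                            # objects mean themselves
    y, z = x.op, mu(x.operand, D)
    w = lookup(y, D) if isinstance(y, str) else DEFAULT
    if w != DEFAULT:
        return mu(App(w, z), D)             # defined atom: meaning of (w:z)
    if isinstance(y, str) and y in PRIMITIVES:
        return PRIMITIVES[y](z)             # primitive atom: apply rho(y)
    return BOTTOM

D = ((CELL, "REST", "TL"),)                 # acts like Def REST ≡ tl
assert mu(App("REST", (1, 2, 3)), D) == (2, 3)
assert mu(App("MYSTERY", (1, 2)), D) == BOTTOM
```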
13.4 Formal Semantics for FFP Systems

We assume that a set A of atoms, a set D of definitions, a set P ⊂ A of primitive atoms and the primitive functions they represent have all been chosen. We assume that ρa is the primitive function represented by a if a belongs to P, and that ρa = ⊥ if a belongs to Q, the set of atoms in A-P that are not defined in D. Although ρ is defined for all expressions (see 13.3.3), the formal semantics uses its definition only on P and Q. The functions that ρ assigns to other expressions x are implicitly determined and applied in the following semantic rules for evaluating μ(x:y).

The above choices of A and D, and of P and the associated primitive functions, determine the objects, expressions, and the semantic function μD for an FFP system. (We regard D as fixed and write μ for μD.) We assume D is a sequence and that ↑y:D can be computed (by the function ↑y as given in Section 13.3.4) for any atom y. With these assumptions we define μ as the least fixed point of the functional τ, where the function τμ is defined as follows for any function μ (for all expressions x, xi, y, yi, z, and w):

(τμ)x ≡ x ∈ A → x;
    x = <x1, ..., xn> → <μx1, ..., μxn>;
    x = (y:z) →
        (y ∈ A & (↑y:D) = # → μ((ρy)(μz));
         y ∈ A & (↑y:D) = w → μ(w:z);
         y = <y1, ..., yn> → μ(y1:<y, z>);
         μ((μy):z));
    ⊥

The above description of μ expands the operator of an application by definitions and by metacomposition before evaluating the operand. It is assumed that predicates like "x ∈ A" in the above definition of τμ are ⊥-preserving (e.g., "⊥ ∈ A" has the value ⊥) and that the conditional expression itself is also ⊥-preserving. Thus (τμ)⊥ ≡ ⊥ and (τμ)(⊥:z) ≡ ⊥. This concludes the semantics of FFP systems.

14. Applicative State Transition Systems (AST Systems)

14.1 Introduction

This section sketches a class of systems mentioned earlier as alternatives to von Neumann systems. It must be emphasized again that these applicative state transition systems are put forward not as practical programming systems in their present form, but as examples of a class in which applicative style programming is made available in a history sensitive, but non-von Neumann system. These systems are loosely coupled to states and depend on an underlying applicative system for both

their programming language and the description of their state transitions. The underlying applicative system of the AST system described below is an FFP system, but other applicative systems could also be used.

To understand the reasons for the structure of AST systems, it is helpful first to review the basic structure of a von Neumann system, Algol, observe its limitations, and compare it with the structure of AST systems. After that review a minimal AST system is described; a small, top-down, self-protecting system program for file maintenance and running user programs is given, with directions for installing it in the AST system and for running an example user program. The system program uses "name functions" instead of conventional names and the user may do so too. The section concludes with subsections discussing variants of AST systems, their general properties, and naming systems.

14.2 The Structure of Algol Compared to That of AST Systems

An Algol program is a sequence of statements, each representing a transformation of the Algol state, which is a complex repository of information about the status of various stacks, pointers, and variable mappings of identifiers onto values, etc. Each statement communicates with this constantly changing state by means of complicated protocols peculiar to itself and even to its different parts (e.g., the protocol associated with the variable x depends on its occurrence on the left or right of an assignment, in a declaration, as a parameter, etc.).

It is as if the Algol state were a complex "store" that communicates with the Algol program through an enormous "cable" of many specialized wires. The complex communications protocols of this cable are fixed and include those for every statement type. The "meaning" of an Algol program must be given in terms of the total effect of a vast number of communications with the state via the cable and its protocols (plus a means for identifying the output and inserting the input into the state). By comparison with this massive cable to the Algol state/store, the cable that is the von Neumann bottleneck of a computer is a simple, elegant concept.

Thus Algol statements are not expressions representing state-to-state functions that are built up by the use of orderly combining forms from simpler state-to-state functions. Instead they are complex messages with context-dependent parts that nibble away at the state. Each part transmits information to and from the state over the cable by its own protocols. There is no provision for applying general functions to the whole state and thereby making large changes in it. The possibility of large, powerful transformations of the state S by function application, S → f:S, is in fact inconceivable in the von Neumann (cable and protocol) context: there could be no assurance that the new state f:S would match the cable and its fixed protocols unless f is restricted to the tiny changes allowed by the cable in the first place.

We want a computing system whose semantics does not depend on a host of baroque protocols for communicating with the state, and we want to be able to make large transformations in the state by the application of general functions. AST systems provide one way of achieving these goals. Their semantics has two protocols for getting information from the state: (1) get from it the definition of a function to be applied, and (2) get the whole state itself. There is one protocol for changing the state: compute the new state by function application. Besides these communications with the state, AST semantics is applicative (i.e. FFP). It does not depend on state changes because the state does not change at all during a computation. Instead, the result of a computation is an output and a new state. The structure of an AST state is slightly restricted by one of its protocols: it must be possible to identify a definition (i.e. cell) in it. Its structure (it is a sequence) is far simpler than that of the Algol state.

Thus the structure of AST systems avoids the complexity and restrictions of the von Neumann state (with its communications protocols) while achieving greater power and freedom in a radically different and simpler framework.

14.3 Structure of an AST System

An AST system is made up of three elements:
1) An applicative subsystem (such as an FFP system).
2) A state D that is the set of definitions of the applicative subsystem.
3) A set of transition rules that describe how inputs are transformed into outputs and how the state D is changed.

The programming language of an AST system is just that of its applicative subsystem. (From here on we shall assume that the latter is an FFP system.) Thus AST systems can use the FP programming style we have discussed. The applicative subsystem cannot change the state D and it does not change during the evaluation of an expression. A new state is computed along with output and replaces the old state when output is issued. (Recall that a set of definitions D is a sequence of cells; a cell name is the name of a defined function and its contents is the defining expression. Here, however, some cells may name data rather than functions; a data name n will be used in ↑n (fetch n) whereas a function name will be used as an operator itself.)

We give below the transition rules for the elementary AST system we shall use for examples of programs. These are perhaps the simplest of many possible transition rules that could determine the behavior of a great variety of AST systems.

14.3.1 Transition rules for an elementary AST system. When the system receives an input x, it forms the application (SYSTEM:x) and then proceeds to obtain its meaning in the FFP subsystem, using the current state D as the set of definitions. SYSTEM is the distinguished name of a function defined in D (i.e. it is the "system program"). Normally the result is a pair

μ(SYSTEM:x) = <o, d>

where o is the system output that results from input x and d becomes the new state D for the system's next input. Usually d will be a copy or partly changed copy of the old state. If μ(SYSTEM:x) is not a pair, the output is an error message and the state remains unchanged.

14.3.2 Transition rules: exception conditions and startup. Once an input has been accepted, our system will not accept another (except <RESET, x>, see below) until an output has been issued and the new state, if any, installed. The system will accept the input <RESET, x> at any time. There are two cases: (a) if SYSTEM is defined in the current state D, then the system aborts its current computation without altering D and treats x as a new normal input; (b) if SYSTEM is not defined in D, then x is appended to D as its first element. (This ends the complete description of the transition rules for our elementary AST system.)

If SYSTEM is defined in D it can always prevent any change in its own definition. If it is not defined, an ordinary input x will produce μ(SYSTEM:x) = ⊥ and the transition rules yield an error message and an unchanged state; on the other hand, the input <RESET, <CELL, SYSTEM, s>> will define SYSTEM to be s.

14.3.3 Program access to the state; the function defs. Our FFP subsystem is required to have one new primitive function, defs, named DEFS such that for any object x ≠ ⊥,

defs:x = ρDEFS:x = D

where D is the current state and set of definitions of the AST system. This function allows programs access to the whole state for any purpose, including the essential one of computing the successor state.

14.4 An Example of a System Program

The above description of our elementary AST system, plus the FFP subsystem and the FP primitives and functional forms of earlier sections, specify a complete history-sensitive computing system. Its input and output behavior is limited by its simple transition rules, but otherwise it is a powerful system once it is equipped with a suitable set of definitions. As an example of its use we shall describe a small system program, its installation, and operation.

Our example system program will handle queries and updates for a file it maintains, evaluate FFP expressions, run general user programs that do not damage the file or the state, and allow authorized users to change the set of definitions and the system program itself. All inputs it accepts will be of the form <key, input> where key is a code that determines both the input class (system-change, expression, program, query, update) and also the identity of the user and his authority to use the system for the given input class. We shall not specify a format for key. Input is the input itself, of the class given by key.

14.4.1 General plan of the system program. The state D of our AST system will contain the definitions of all nonprimitive functions needed for the system program and for users' programs. (Each definition is in a cell of the sequence D.) In addition, there will be a cell in D named FILE with contents file, which the system maintains. We shall give FP definitions of functions and later show how to get them into the system in their FFP form.

The transition rules make the input the operand of SYSTEM, but our plan is to use name functions to refer to data, so the first thing we shall do with the input is to create two cells named KEY and INPUT with contents key and input and append these to D. This sequence of cells has one cell each for key, input, and file; it will be the operand of our main function called subsystem. Subsystem can then obtain key by applying ↑KEY to its operand, etc. Thus the definition

Def system ≡ pair → subsystem∘f; [NONPAIR, defs]

where

f ≡ ↓INPUT∘[2, ↓KEY∘[1, defs]]

causes the system to output NONPAIR and leave the state unchanged if the input is not a pair. Otherwise, if it is <key, input>, then

f:<key, input> = <<CELL, INPUT, input>, <CELL, KEY, key>, d1, ..., dn>

where D = <d1, ..., dn>. (We might have constructed a different operand than the one above, one with just three cells, for key, input, and file. We did not do so because real programs, unlike subsystem, would contain many name functions referring to data in the state, and this "standard" construction of the operand would suffice then as well.)

14.4.2 The "subsystem" function. We now give the FP definition of the function subsystem, followed by brief explanations of its six cases and auxiliary functions.

Def subsystem ≡
    is-system-change∘↑KEY → [report-change, apply]∘[↑INPUT, defs];
    is-expression∘↑KEY → [↑INPUT, defs];
    is-program∘↑KEY → system-check∘apply∘[↑INPUT, defs];
    is-query∘↑KEY → [query-response∘[↑INPUT, ↑FILE], defs];
    is-update∘↑KEY →
        [report-update, ↓FILE∘[update, defs]]∘[↑INPUT, ↑FILE];
    [report-error∘[↑KEY, ↑INPUT], defs].

This subsystem has five "p → f" clauses and a final default function, for a total of six classes of inputs; the treatment of each class is given below. Recall that the operand of subsystem is a sequence of cells containing key, input, and file as well as all the defined functions of D, and that subsystem:operand = <output, newstate>.

Default inputs. In this case the result is given by the last (default) function of the definition when key does not satisfy any of the preceding clauses. The output is report-error:<key, input>. The state is unchanged since it is given by defs:operand = D. (We leave to the reader's imagination what the function report-error will generate from its operand.)
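The six clauses of subsystem just defined can be rendered as a Python dispatch over the same tuple encoding of cells. Every helper in the `h` dictionary (the predicates, query-response, report-error, and so on) is an illustrative stand-in whose real behavior the paper deliberately leaves open.

```python
# A dispatch sketch of "subsystem". The operand is
# <CELL INPUT, CELL KEY, d1, ..., dn>, so the state D (what defs returns)
# is the operand minus the two scratch cells.

def fetch(n):
    # ↑n: contents of the first cell named n, or the default atom "#".
    def fn(seq):
        for item in seq:
            if isinstance(item, tuple) and len(item) == 3 and item[:2] == ("CELL", n):
                return item[2]
        return "#"
    return fn

def subsystem(operand, h):
    key, inp = fetch("KEY")(operand), fetch("INPUT")(operand)
    file = fetch("FILE")(operand)
    D = operand[2:]                       # defs: recover the state D
    if h["is_system_change"](key):        # input is a function applied to D
        return h["report_change"]((inp, D)), h["apply"]((inp, D))
    if h["is_expression"](key):           # output = meaning of inp; D unchanged
        return h["eval"]((inp, D)), D
    if h["is_program"](key):              # system-check guards the new state
        return h["system_check"](h["apply"]((inp, D)))
    if h["is_query"](key):
        return h["query_response"]((inp, file)), D
    if h["is_update"](key):               # store the updated file in cell FILE
        new_file = h["update"]((inp, file))
        rest = tuple(c for c in D if c[:2] != ("CELL", "FILE"))
        return h["report_update"]((inp, file)), (("CELL", "FILE", new_file),) + rest
    return h["report_error"]((key, inp)), D   # the default clause

h = {
    "is_system_change": lambda k: k == "SYS",
    "is_expression":    lambda k: k == "EXPR",
    "is_program":       lambda k: k == "PROG",
    "is_query":         lambda k: k == "QUERY",
    "is_update":        lambda k: k == "UPD",
    "apply":            lambda p: p[1],            # dummy: keep the old state
    "eval":             lambda p: p[0],            # dummy evaluator
    "system_check":     lambda r: r,
    "query_response":   lambda p: ("answer-to", p[0]),
    "update":           lambda p: p[1] + (p[0],),  # append transaction to file
    "report_update":    lambda p: ("updated", p[0]),
    "report_change":    lambda p: ("changed", p[0]),
    "report_error":     lambda p: ("error", p),
}

D = (("CELL", "FILE", ("rec1",)), ("CELL", "MM", "def-of-MM"))
operand = (("CELL", "INPUT", "q1"), ("CELL", "KEY", "QUERY")) + D
out, new_state = subsystem(operand, h)
assert out == ("answer-to", "q1") and new_state == D
```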

System-change inputs. When

is-system-change∘↑KEY:operand = is-system-change:key = T,

key specifies that the user is authorized to make a system change and that input = ↑INPUT:operand represents a function f that is to be applied to D to produce the new state f:D. (Of course f:D can be a useless new state; no constraints are placed on it.) The output is a report, namely report-change:<input, D>.

Expression inputs. When is-expression:key = T, the system understands that the output is to be the meaning of the FFP expression input; ↑INPUT:operand produces it and it is evaluated, as are all expressions. The state is unchanged.

Program inputs and system self-protection. When is-program:key = T, both the output and new state are given by (ρinput):D = <output, newstate>. If newstate contains file in suitable condition and the definitions of system and other protected functions, then we wish

system-check:<output, newstate> = <output, newstate>.

Otherwise,

system-check:<output, newstate> = <error-report, D>.

Although program inputs can make major, possibly disastrous changes in the state when it produces newstate, system-check can use any criteria to either allow it to become the actual new state or to keep the old. A more sophisticated system-check might correct only prohibited changes in the state. Functions of this sort are possible because they can always access the old state for comparison with the new state-to-be and control what state transition will finally be allowed.

File query inputs. If is-query:key = T, the function query-response is designed to produce the output = answer to the query input from its operand <input, file>.

File update inputs. If is-update:key = T, input specifies a file transaction understood by the function update, which computes updated-file = update:<input, file>. Thus ↓FILE has <updated-file, D> as its operand and thus stores the updated file in the cell FILE in the new state. The rest of the state is unchanged. The function report-update generates the output from its operand <input, file>.

14.4.3 Installing the system program. We have described the function called system by some FP definitions (using auxiliary functions whose behavior is only indicated). Let us suppose that we have FP definitions for all the nonprimitive functions required. Then each definition can be converted to give the name and contents of a cell in D (of course this conversion itself would be done by a better system). The conversion is accomplished by changing each FP function name to its equivalent atom (e.g., update becomes UPDATE) and by replacing functional forms by sequences whose first member is the controlling function for the particular form. Thus ↓FILE∘[update, defs] is converted to

<COMP, <STORE, FILE>, <CONS, UPDATE, DEFS>>,

and the FP function is the same as that represented by the FFP object, provided that update = ρUPDATE and COMP, STORE, and CONS represent the controlling functions for composition, store, and construction.

All FP definitions needed for our system can be converted to cells as indicated above, giving a sequence D0. We assume that the AST system has an empty state to start with, hence SYSTEM is not defined. We want to define SYSTEM initially so that it will install its next input as the state; having done so we can then input D0 and all our definitions will be installed, including our program, system, itself. To accomplish this we enter our first input

<RESET, <CELL, SYSTEM, loader>>

where loader = <CONS, <CONST, DONE>, ID>. Then, by the transition rule for RESET when SYSTEM is undefined in D, the cell in our input is put at the head of D = φ, thus defining ρSYSTEM ≡ ρloader ≡ [DONE, id]. Our second input is D0, the set of definitions we wish to become the state. The regular transition rule causes the AST system to evaluate

μ(SYSTEM:D0) = [DONE, id]:D0 = <DONE, D0>.

Thus the output from our second input is DONE, the new state is D0, and ρSYSTEM is now our system program (which only accepts inputs of the form <key, input>).

Our next task is to load the file (we are given an initial value file). To load it we input a program into the newly installed system that contains file as a constant and stores it in the state; the input is <program-key, [DONE, store-file]> where

ρstore-file ≡ ↓FILE∘[file, id].

Program-key identifies [DONE, store-file] as a program to be applied to the state D0 to give the output and new state D1, which is

ρstore-file:D0 = ↓FILE∘[file, id]:D0,

or D0 with a cell containing file at its head. The output is DONE:D0 = DONE. We assume that system-check will pass <DONE, D1> unchanged. FP expressions have been used in the above in place of the FFP objects they denote, e.g., DONE for <CONST, DONE>.

14.4.4 Using the system. We have not said how the system's file, queries or updates are structured, so we cannot give a detailed example of file operations. However, the structure of subsystem shows clearly how the system's response to queries and updates depends on the functions query-response, update, and report-update.

Let us suppose that matrices m, n named M, and N are stored in D and that the function MM described earlier is defined in D. Then the input

<expression-key, (MM∘[↑M, ↑N]∘defs:#)>

would give the product of the two matrices as output and an unchanged state. Expression-key identifies the application as an expression to be evaluated; since defs:# = D and [↑M, ↑N]:D = <m, n>, the value of the expression is the result MM:<m, n>, which is the output.
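The FP-to-FFP conversion described in 14.4.3 can be sketched mechanically. The little tagged-tuple syntax for FP expressions is an assumption of this sketch, but the final assertion reproduces the paper's own example.

```python
# A sketch of the FP-to-FFP conversion: function names become atoms, and
# each functional form becomes a sequence headed by its controlling atom
# (COMP for composition, CONS for construction, STORE for store).

def convert(fp):
    kind = fp[0]
    if kind == "name":                   # e.g., update -> UPDATE
        return fp[1].upper()
    if kind == "comp":                   # f∘g -> <COMP, f', g'>
        return ("COMP",) + tuple(convert(f) for f in fp[1:])
    if kind == "cons":                   # [f, g] -> <CONS, f', g'>
        return ("CONS",) + tuple(convert(f) for f in fp[1:])
    if kind == "store":                  # ↓n -> <STORE, n>
        return ("STORE", fp[1])
    raise ValueError("unknown form: " + str(kind))

# ↓FILE∘[update, defs]  ==>  <COMP, <STORE, FILE>, <CONS, UPDATE, DEFS>>
fp = ("comp", ("store", "FILE"), ("cons", ("name", "update"), ("name", "defs")))
assert convert(fp) == ("COMP", ("STORE", "FILE"), ("CONS", "UPDATE", "DEFS"))
```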

Our miniature system program has no provision for giving control to a user's program to process many inputs, but it would not be difficult to give it that capability while still monitoring the user's program with the option of taking control back.

14.5 Variants of AST Systems

A major extension of the AST systems suggested above would provide combining forms, "system forms," for building a new AST system from simpler, component AST systems. That is, a system form would take AST systems as parameters and generate a new AST system, just as a functional form takes functions as parameters and generates new functions. These system forms would have properties like those of functional forms and would become the "operations" of a useful "algebra of systems" in much the same way that functional forms are the "operations" of the algebra of programs. However, the problem of finding useful system forms is much more difficult, since they must handle RESETs, match inputs and outputs, and combine history-sensitive systems rather than fixed functions.

Moreover, the usefulness of or need for system forms is less clear than that for functional forms. The latter are essential for building a great variety of functions from an initial primitive set, whereas, even without system forms, the facilities for building AST systems are already so rich that one could build virtually any system (with the general input and output properties allowed by the given AST scheme). Perhaps system forms would be useful for building systems with complex input and output arrangements.

14.6 Remarks About AST Systems

As I have tried to indicate above, there can be innumerable variations in the ingredients of an AST system: how it operates, how it deals with input and output, how and when it produces new states, and so on. In any case, a number of remarks apply to any reasonable AST system:

a) A state transition occurs once per major computation and can have useful mathematical properties. State transitions are not involved in the tiniest details of a computation as in conventional languages; thus the linguistic von Neumann bottleneck has been eliminated. No complex "cable" or protocols are needed to communicate with the state.

b) Programs are written in an applicative language that can accommodate a great range of changeable parts, parts whose power and flexibility exceed that of any von Neumann language so far. The word-at-a-time style is replaced by an applicative style; there is no division of programming into a world of expressions and a world of statements. Programs can be analyzed and optimized by an algebra of programs.

c) Since the state cannot change during the computation of system:x, there are no side effects. Thus independent applications can be evaluated in parallel.

d) By defining appropriate functions one can, I believe, introduce major new features at any time, using the same framework. Such features must be built into the framework of a von Neumann language. I have in mind such features as: "stores" with a great variety of naming systems, types and type checking, communicating parallel processes, nondeterminacy and Dijkstra's "guarded command" constructs [8], and improved methods for structured programming.

e) The framework of an AST system comprises the syntax and semantics of the underlying applicative system plus the system framework sketched above. By current standards, this is a tiny framework for a language and is the only fixed part of the system.

14.7 Naming Systems in AST and von Neumann Models

In an AST system, naming is accomplished by functions as indicated in Section 13.3.3. Many useful functions for altering and accessing a store can be defined (e.g. push, pop, purge, typed fetch, etc.). All these definitions and their associated naming systems can be introduced without altering the AST framework. Different kinds of "stores" (e.g., with "typed cells") with individual naming systems can be used in one program. A cell in one store may contain another entire store.

The important point about AST naming systems is that they utilize the functional nature of names (Reynolds' GEDANKEN [19] also does so to some extent within a von Neumann framework). Thus name functions can be composed and combined with other functions by functional forms. In contrast, functions and names in von Neumann languages are usually disjoint concepts and the function-like nature of names is almost totally concealed and useless, because (a) names cannot be applied as functions; (b) there are no general means to combine names with other names and functions; (c) the objects to which name functions apply (stores) are not accessible as objects.

The failure of von Neumann languages to treat names as functions may be one of their more important weaknesses. In any case, the ability to use names as functions and stores as objects may turn out to be a useful and important programming concept, one which should be thoroughly explored.

15. Remarks About Computer Design

The dominance of von Neumann languages has left designers with few intellectual models for practical computer designs beyond variations of the von Neumann computer. Data flow models [1] [7] [13] are one alternative class of history-sensitive models. The substitution rules of lambda-calculus based languages present serious problems for the machine designer. Berkling [3] has developed a modified lambda calculus that has three kinds of applications and that makes renaming of variables unnecessary. He has developed a machine to evaluate expressions of this language. Further experience is needed to show how sound a basis this language is for an effective programming style and how efficient his machine can be.

Magó [15] has developed a novel applicative machine built from identical components (of two kinds). It evaluates, directly, FP-like and other applicative expressions from the bottom up. It has no von Neumann store and no address register, hence no bottleneck; it is capable of evaluating many applications in parallel; its built-in operations resemble FP operators more than von Neumann computer operations. It is the farthest departure from the von Neumann computer that I have seen.

There are numerous indications that the applicative style of programming can become more powerful than the von Neumann style. Therefore it is important for programmers to develop a new class of history-sensitive models of computing systems that embody such a style and avoid the inherent efficiency problems that seem to attach to lambda-calculus based systems. Only when these models and their applicative languages have proved their superiority over conventional languages will we have the economic basis to develop the new kind of computer that can best implement them. Only then, perhaps, will we be able to fully utilize large-scale integrated circuits in a computer design not limited by the von Neumann bottleneck.

16. Summary

The fifteen preceding sections of this paper can be summarized as follows.

Section 1. Conventional programming languages are large, complex, and inflexible. Their limited expressive power is inadequate to justify their size and cost.

Section 2. The models of computing systems that underlie programming languages fall roughly into three classes: (a) simple operational models (e.g., Turing machines), (b) applicative models (e.g., the lambda calculus), and (c) von Neumann models (e.g., conventional computers and programming languages). Each class of models has an important difficulty: the programs of class (a) are inscrutable; class (b) models cannot save information from one program to the next; class (c) models have unusable foundations and programs that are conceptually unhelpful.

Section 3. Von Neumann computers are built around a bottleneck: the word-at-a-time tube connecting the CPU and the store. Since a program must make its overall change in the store by pumping vast numbers of words back and forth through the von Neumann bottleneck, we have grown up with a style of programming that concerns itself with this word-at-a-time traffic through the bottleneck rather than with the larger conceptual units of our problems.

Section 4. Conventional languages are based on the programming style of the von Neumann computer. Thus variables = storage cells; assignment statements = fetching, storing, and arithmetic; control statements = jump and test instructions. The symbol := is the linguistic von Neumann bottleneck. Programming in a conventional (von Neumann) language still concerns itself with the word-at-a-time traffic through this slightly more sophisticated bottleneck. Von Neumann languages also split programming into a world of expressions and a world of statements; the first of these is an orderly world, the second is a disorderly one, a world that structured programming has simplified somewhat, but without attacking the basic problems of the split itself and of the word-at-a-time style of conventional languages.

Section 5. This section compares a von Neumann program and a functional program for inner product. It illustrates a number of problems of the former and advantages of the latter: e.g., the von Neumann program is repetitive and word-at-a-time, works only for two vectors named a and b of a given length n, and can only be made general by use of a procedure declaration, which has complex semantics. The functional program is nonrepetitive, deals with vectors as units, is more hierarchically constructed, is completely general, and creates "housekeeping" operations by composing high-level housekeeping operators. It does not name its arguments, hence it requires no procedure declaration.

Section 6. A programming language comprises a framework plus some changeable parts. The framework of a von Neumann language requires that most features must be built into it; it can accommodate only limited changeable parts (e.g., user-defined procedures) because there must be detailed provisions in the "state" and its transition rules for all the needs of the changeable parts, as well as for all the features built into the framework. The reason the von Neumann framework is so inflexible is that its semantics is too closely coupled to the state: every detail of a computation changes the state.

Section 7. The changeable parts of von Neumann languages have little expressive power; this is why most of the language must be built into the framework. The lack of expressive power results from the inability of von Neumann languages to effectively use combining forms for building programs, which in turn results from the split between expressions and statements. Combining forms are at their best in expressions, but in von Neumann languages an expression can only produce a single word; hence expressive power in the world of expressions is mostly lost. A further obstacle to the use of combining forms is the elaborate use of naming conventions.

Section 8. APL is the first language not based on the lambda calculus that is not word-at-a-time and uses functional combining forms. But it still retains many of the problems of von Neumann languages.

Section 9. Von Neumann languages do not have useful properties for reasoning about programs. Axiomatic and denotational semantics are precise tools for describing and understanding conventional programs, but they only talk about them and cannot alter their ungainly properties.

640 Communications August 1978 of Volume 21 the ACM Number 8 References 12. Iverson, K. A Programming Language. Wiley, New York, 1962. I. Arvind, and Gostelow, K.P. A new interpreter for data flow 13. Kosinski, P. A data flow programming language. Rep. RC 4264, schemas and its implications for computer architecture. Tech. Rep. IBM T.J. Watson Research Ctr., Yorktown Heights, N.Y., March No. 72, Dept. Comptr. Sci., U. of California, Irvine, Oct. 1975. 1973. 2. Backus, J. Programming language semantics and closed 14. Landin, P.J. The mechanical evaluation of expressions. Computer applicative languages. Conf. Record ACM Symp. on Principles of J. 6, 4 (1964), 308-320. Programming Languages, Boston, Oct. 1973, 71-86. 15. Mag6, G.A. A network of microprocessors to execute reduction 3. Berkling, K.J. Reduction languages for reduction machines. languages. To appear in Int. J. Comptr. and Inform. Sci. Interner Bericht ISF-76-8, Gesellschaft f'dr Mathematik und 16. Manna, Z., Ness, S., and Vuillemin, J. Inductive methods for Datenverarbeitung MBH, Bonn, Sept. 1976. proving properties of programs. Comm..4 CM 16,8 (Aug. 1973) 4. Burge, W.H. Recursive Programming Techniques. Addison- 491-502. Wesley, Reading, Mass., 1975. 17. McCarthy, J. Recursive functions of symbolic expressions and 5. Church, A. The Calculi of Lambda-Conversion. Princeton U. their computation by machine, Pt. 1. Comm. ,4CM 3, 4 (April 1960), Press, Princeton, N.J., 1941. 184-195. 6. Curry, H.B., and Feys, R. Combinatory Logic, Vol. 1. North- 18. MeJones, P. A Church-Rosser property of closed applicative Holland Pub. Co., Amsterdam, 1958. languages. Rep. RJ 1589, IBM Res. Lab., San Jose, Calif., May 1975. 7. Dennis, J.B. First version of a data flow procedure language. 19. Reynolds, J.C. GEDANKEN--asimple typeless language based on Tech. Mem. No. 61, Lab. for Comptr. Sci., M.I.T., Cambridge, Mass., the principle of completeness and the reference concept. Comm. May 1973. ACM 13, 5 (May 1970), 308-318. 8. 
Dijkstra, E.W. ,4 Discipline of Programming. Prentice-Hall, 20. Reynolds, J..C. Notes on a lattice-theoretic approach to the theory Englewood Cliffs, N.J., 1976. of computation. Dept. Syst. and Inform. Sci., Syracuse U., Syracuse, 9. Friedman, D.P., and Wise, D.S. CONS should not evaluate its N.Y., 1972. arguments. In Automata, Languages and Programming, S. Michaelson 21. Scott, D. Outline of a mathematical theory of computation. Proc. and R. Milner, Eds., Edinburgh U. Press, Edinburgh, 1976, pp. 4th Princeton Conf. on Inform. Sci. and Syst., 1970. 257-284. 22. Scott, D. Lattice-theoretic models for various type-free calculi. 10. Henderson, P., and Morris, J.H. Jr. A lazy evaluator. Conf. Proc. Fourth Int. Congress for Logic, Methodology, and the Record Third ACM Symp. on Principles of Programming Languages, Philosophy of Science, Bucharest, 1972. Atlanta, Ga., Jan. 1976, pp. 95-103. 23. Scott, D., and Strachey, C. Towards a mathematical semantics I1. Hoare, C.A.R. An axiomatic basis for computer programming. for computer languages. Proc. Symp. on Comptrs. and Automata, Comm. ,4CM 12, 10 (Oct. 1969), 576-583. Polytechnic Inst. of Brooklyn, 1971.


but they only talk about them and cannot alter their ungainly properties. Unlike von Neumann languages, the language of ordinary algebra is suitable both for stating its laws and for transforming an equation into its solution, all within the "language."

Section 10. In a history-sensitive language, a program can affect the behavior of a subsequent one by changing some store which is saved by the system. Any such language requires some kind of state transition semantics. But it does not need semantics closely coupled to states in which the state changes with every detail of the computation. "Applicative state transition" (AST) systems are proposed as history-sensitive alternatives to von Neumann systems. These have: (a) loosely coupled state-transition semantics in which a transition occurs once per major computation; (b) simple states and transition rules; (c) an underlying applicative system with simple "reduction" semantics; and (d) a programming language and state transition rules both based on the underlying applicative system and its semantics. The next four sections describe the elements of this approach to non-von Neumann language and system design.

Section 11. A class of informal functional programming (FP) systems is described which use no variables. Each system is built from objects, functions, functional forms, and definitions. Functions map objects into objects. Functional forms combine existing functions to form new ones. This section lists examples of primitive functions and functional forms and gives sample programs. It discusses the limitations and advantages of FP systems.

Section 12. An "algebra of programs" is described whose variables range over the functions of an FP system and whose "operations" are the functional forms of the system. A list of some twenty-four laws of the algebra is followed by an example proving the equivalence of a nonrepetitive matrix multiplication program and a recursive one. The next subsection states the results of two "expansion theorems" that "solve" two classes of equations. These solutions express the "unknown" function in such equations as an infinite conditional expansion that constitutes a case-by-case description of its behavior and immediately gives the necessary and sufficient conditions for termination. These results are used to derive a "recursion theorem" and an "iteration theorem," which provide ready-made expansions for some moderately general and useful classes of "linear" equations. Examples of the use of these theorems treat: (a) correctness proofs for recursive and iterative factorial functions, and (b) a proof of equivalence of two iterative programs. A final example deals with a "quadratic" equation and proves that its solution is an idempotent function. The next subsection gives the proofs of the two expansion theorems. The algebra associated with FP systems is compared with the corresponding algebras for the lambda calculus and other applicative systems. The comparison shows some advantages to be drawn from the severely restricted FP systems, as compared with the much more powerful classical systems. Questions are suggested about algorithmic reduction of functions to infinite expansions and about the use of the algebra in various "lazy evaluation" schemes.

Section 13. This section describes formal functional programming (FFP) systems that extend and make precise the behavior of FP systems. Their semantics are simpler than that of classical systems and can be shown to be consistent by a simple fixed-point argument.

Section 14. This section compares the structure of Algol with that of applicative state transition (AST) systems. It describes an AST system using an FFP system as its applicative subsystem. It describes the simple state and the transition rules for the system. A small self-protecting system program for the AST system is described, and how it can be installed and used for file maintenance and for running user programs. The section briefly discusses variants of AST systems and functional naming systems that can be defined and used within an AST system.

Section 15. This section briefly discusses work on applicative computer designs and the need to develop and test more practical models of applicative systems as the future basis for such designs.

Acknowledgments. In earlier work relating to this paper I have received much valuable help and many suggestions from Paul R. McJones and Barry K. Rosen. I have had a great deal of valuable help and feedback in preparing this paper. James N. Gray was exceedingly generous with his time and knowledge in reviewing the first draft. Stephen N. Zilles also gave it a careful reading. Both made many valuable suggestions and criticisms at this difficult stage. It is a pleasure to acknowledge my debt to them. I also had helpful discussions about the first draft with Ronald Fagin, Paul R. McJones, and James H. Morris, Jr. Fagin suggested a number of improvements in the proofs of theorems.

Since a large portion of the paper contains technical material, I asked two distinguished computer scientists to referee the third draft. David J. Gries and John C. Reynolds were kind enough to accept this burdensome task. Both gave me large, detailed sets of corrections and overall comments that resulted in many improvements, large and small, in this final version (which they have not had an opportunity to review). I am truly grateful for the generous time and care they devoted to reviewing this paper.

Finally, I also sent copies of the third draft to Gyula A. Magó, Peter Naur, and John H. Williams. They were kind enough to respond with a number of extremely helpful comments and corrections. Geoffrey A. Frank and Dave Tolle at the University of North Carolina reviewed Magó's copy and pointed out an important error in the definition of the semantic function of FFP systems. My grateful thanks go to all these kind people for their help.
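The variable-free FP style recapped in Section 11 can be glossed with a small sketch, in Python rather than Backus's FP notation. The names `compose`, `apply_to_all`, `insert`, and `trans` below are this sketch's stand-ins for FP's composition, apply-to-all (α), insert (/), and Trans forms; note that Python's `reduce` is a left fold, which coincides with Backus's insert only for associative operations such as +.

```python
from functools import reduce
import operator

# Functional forms: each takes functions and builds a new function.
def compose(*fs):
    """Composition f1 o f2 o ... : apply rightmost function first."""
    def h(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return h

def apply_to_all(f):          # stand-in for Backus's "alpha f"
    return lambda xs: [f(x) for x in xs]

def insert(f):                # stand-in for Backus's "/f" (for associative f)
    return lambda xs: reduce(f, xs)

def trans(pair):              # transpose a pair of equal-length sequences
    return list(zip(*pair))

# A variable-free inner product in the style of the paper's examples:
# sum (insert +) of the pairwise products (apply-to-all x) of the
# transposed pair of vectors.
ip = compose(insert(operator.add),
             apply_to_all(lambda p: p[0] * p[1]),
             trans)

print(ip([[1, 2, 3], [6, 5, 4]]))   # 1*6 + 2*5 + 3*4 = 28
```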
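Section 12's first example application is a correctness proof relating recursive and iterative factorial programs; the paper's argument is algebraic, over functional forms. The following is merely a finite spot-check of that equivalence claim, with function names of this sketch's own choosing.

```python
def fact_rec(n):
    """Recursive factorial, mirroring the 'linear' recursive definition."""
    return 1 if n == 0 else n * fact_rec(n - 1)

def fact_iter(n):
    """Iterative factorial, accumulating the product in a loop."""
    acc = 1
    while n > 0:
        acc, n = acc * n, n - 1
    return acc

# A spot-check, not a proof: the two programs agree on small inputs.
print(all(fact_rec(n) == fact_iter(n) for n in range(10)))   # True
```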
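Section 10's loosely coupled state-transition semantics (a purely applicative subsystem does the computing, and the state changes exactly once per major computation rather than at every intermediate step) can be caricatured in a few lines. The `transition` function and the request format here are invented for illustration and do not come from the paper.

```python
def transition(state, request):
    """Purely applicative: maps (old state, request) -> (new state, output).

    No store is mutated during the computation; the single state
    transition is the construction of new_state at the end.
    """
    name, value = request
    new_state = {**state, name: value}
    return new_state, f"{name} set to {value}"

state = {}
outputs = []
for req in [("x", 1), ("y", 2), ("x", 3)]:
    state, out = transition(state, req)   # one transition per request
    outputs.append(out)

print(state)     # {'x': 3, 'y': 2}
```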