
The Formal Description

of Programming Languages

using Predicate Logic

by

Christopher D.S. Moss

Submitted for the Ph.D. Degree

Department of Computing

Imperial College, London

July 1981

ABSTRACT

Metamorphosis grammars and the Horn Clause subset of first-order predicate logic as used in the Prolog language provide a powerful formalism for describing all aspects of conventional programming languages.

Colmerauer's M-grammars generalise the traditional grammar rewriting rules to apply to strings of function symbols with parameters. Expressed in first-order logic and using resolution, these provide the facilities of other formalisms such as W-grammars and Attribute Grammars with greatly improved ease of use and comprehension, and the added advantage that they may be run directly on a computer.

The thesis provides a methodology for expressing both syntax and semantics of programming languages in a coherent framework. Unlike some formalisms which attempt to give most of the definition in terms of either syntax or semantics, this tries to preserve a natural balance between the two. The syntax separates lexical and grammar parts and generates an abstract syntax which includes the semantics of 'static' objects such as numbers. The semantics is expressed by means of relations which express state transformations using logic rather than the more traditional lambda calculus. Prolog has a well-defined fixpoint or denotational semantics as well as its proof theoretic semantics, which gives the definitions an adequate mathematical basis. The traditional axiomatic method can also be used to express the semantics using a metalevel proof system in which the proof rules become axioms of the system.

To demonstrate these principles, descriptions of three example languages are presented. These are ASPLE, a small language which has been used to compare other methods, the Prolog language itself (a non-deterministic applicative language) and a subset of Algol 68 including full jumps and procedures. The definition of the latter uses a method similar to the continuation method.

An extensive survey is given of methods of syntax and semantic definition and several applications of the method are suggested, including language prototyping systems, compilers and program proving systems.

CONTENTS

1. Introduction 7

2. Grammars and Logic
   2.1. Metamorphosis Grammars 16
   2.2. The Development of Syntax Descriptions 35

3. Semantics
   3.1. Relational Semantics 60
   3.2. Axiomatic Semantics 71
   3.3. The Development of Semantics 78

4. Examples of Formal Definitions
   4.1. ASPLE 92
   4.2. Prolog 119
   4.3. Mini-Algol 68 139

5. Applications of Formal Definitions
   5.1. Prototyping of languages 153
   5.2. Towards a logic-compiler 158
   5.3. Program proving and transformation 172

References 179

Appendices
   A. The definition of a subset of Algol 68 188
   B. A Compiler for ASPLE 214
   C. The Conversion of M-grammars to Prolog 222

At the still point of the turning world. Neither flesh nor fleshless;
Neither from nor towards; at the still point, there the dance is,
But neither arrest nor movement. And do not call it fixity,
Where past and future are gathered. Neither movement from nor towards,
Neither ascent nor decline. Except for the point, the still point,
There would be no dance, and there is only the dance.

T. S. Eliot (1935)

The Four Quartets - Burnt Norton

Thanks . . .

I would like to express my appreciation to everyone who has helped me in so many ways: by providing inspiration and frustration; by encouragement in chatting over issues and criticizing inane notions; by making life worth living in the real world that exists outside the thesis factory; and by practical help in many ways.

In particular I must thank Bob Kowalski for his continual inspiration as my supervisor; Keith Clark and Maarten van Emden for discussions of tricky questions; Ian Moor and Moez Agha Hosseini for acting as sounding boards and being extremely hospitable room-mates; Sarah Bellows and Ellen Haigh, who managed to locate the most obscure reports in the library; Diane Reeve and Sandra Evans for typing large sections of the thesis; and Karen King for being patient with me when the whole exercise seemed futile.

I was supported during this time by a studentship from the Science Research Council. They have my deep gratitude.

Chris.

Chapter 1

Introduction and Summary

The aim of providing an entirely formal specification for a programming language is a quest which has attracted a great deal of attention over the past twenty years. Although the majority of the problems have now been solved using a variety of techniques, what is still lacking is a common formalism with which to draw these together to make them readily comprehensible to the average practitioner of computing.

The easiest part of a language to formalise is the context-free syntax. In this area BNF and its variants have gradually prevailed over the alternatives such as those used to define COBOL. The context-sensitive parts were solved in principle by van Wijngaarden in the definition of Algol 68, but other related formalisms, such as attribute grammars, have been attracting more attention because of their increased readability and amenability to computer implementation compared with W-grammars.

The definition of semantics has taken much longer to establish and there is still considerable variation in the style of presentation, although the main lines are more generally agreed. Early definitions were essentially "operational" in nature, based on simple automata which could "execute" programs. These are unsatisfactory on several counts: they cannot easily be used for many of the basic tasks for which semantics are required, such as proving properties of programs, or input-output relationships and equivalence; they have no way of describing non-terminating programs; and they are too "low-level" to provide an easy conceptualisation of the "meaning" of program constructs.


Later methods have been much more abstract, with a mathematical or logical basis. Currently, the most complete and widely used method is that of denotational semantics, introduced by Strachey and Scott, which describes a language in functional notation. The lambda calculus is used as a metalanguage and various mappings, of identifiers to stores and stores to values, are described in terms of this. There are two other popular methods which are more abstract than denotational semantics. One is the axiomatic or inductive assertion method of Floyd and Hoare, which is widely used in program proving but requires the user to supply the inductive assertions along with the program, and also has difficulty with such intrinsic programming constructs as jumps and functions with side effects. The other is the algebraic method, which characterises the semantics of programs by a set of properties which are required of programs. It is not clear at this point how well this deals with the more complicated parts of programming languages or how easy it is to show that an algebraic definition is complete. The axiomatic method is probably best considered as a set of theorems or lemmas derived from the denotational definition and useful for specific purposes such as program proving.

In this thesis we demonstrate a logic programming approach to the definition of programming languages. The basis of this is the Prolog language, which uses the Horn-clause subset of predicate logic linked with the resolution method for matching clauses. Each clause is composed of predicates in the form:

A <- B1 & B2 & ... & Bn.

where n >= 0, A and the Bi are predicates, and '<-' stands for 'if'. This may be regarded as an assertion if n=0 and as either an implication or a procedure if n>0. If A is absent it may be regarded as either a denial or a goal. Any variables in the clause are regarded as universally quantified over the clause. An example of a complete Prolog program (including a goal statement) is:

Human(Turing).
Human(Socrates).
Fallible(x) <- Human(x).
Greek(Socrates).
<- Fallible(y) & Greek(y).

for which the only valid solution is y=Socrates.

The procedural interpretation involves matching goals with the heads (left hand sides) of procedures and replacing these by the bodies (right hand sides) of the procedures in a manner very similar to the productions of a grammar, with each branch terminating in an assertion. This process is non-deterministic, since more than one head may match a goal. It can also be interpreted or compiled on a computer with remarkable efficiency.
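The procedural interpretation described above can be made concrete. The sketch below is an illustrative Python transcription, not part of the thesis: terms are tuples whose first element is the predicate name, variables are strings beginning with a lower-case letter (following the thesis' notational convention), and a backtracking generator stands in for resolution.

```python
# Illustrative sketch only: a tiny Horn-clause interpreter.
# The occurs check is omitted, as in most Prolog systems.

def is_var(t):
    return isinstance(t, str) and t[:1].islower()

def walk(t, s):
    # Follow variable bindings in the substitution s.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def rename(t, tag):
    # Give a clause fresh variables each time it is used.
    if is_var(t):
        return t + "#" + tag
    if isinstance(t, tuple):
        return tuple(rename(u, tag) for u in t)
    return t

def solve(goals, clauses, s, depth=0):
    # Match the first goal against the head of each procedure and
    # replace it by the body: the rewriting described in the text.
    if not goals:
        yield s
        return
    for i, (head, body) in enumerate(clauses):
        tag = "%d.%d" % (depth, i)
        s2 = unify(goals[0], rename(head, tag), s)
        if s2 is not None:
            new = [rename(b, tag) for b in body] + list(goals[1:])
            yield from solve(new, clauses, s2, depth + 1)

# The program from the text.
clauses = [
    (("Human", "Turing"), []),
    (("Human", "Socrates"), []),
    (("Fallible", "x"), [("Human", "x")]),
    (("Greek", "Socrates"), []),
]
goal = [("Fallible", "y"), ("Greek", "y")]
answers = [walk("y", s) for s in solve(goal, clauses, {})]
print(answers)   # ['Socrates']
```

Run against the goal <- Fallible(y) & Greek(y), the generator first tries y=Turing, fails on Greek(y), backtracks, and produces the single binding y=Socrates, as stated above.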

An integral feature of Prolog systems is the use of metamorphosis grammars, originally envisaged by Colmerauer. These may be regarded as a regularised form of W-grammars and can be applied directly to the definition of both context-free and context-sensitive portions of programming languages. They are much simpler to comprehend than W-grammars because of the type-free nature of the logic used, and compare favourably with the use of attribute grammars for this purpose. In addition, they can be run directly on a computer to parse or generate a program, and for one-pass languages this requires no modification to the normal definition.


Colmerauer's original definition followed the style of Chomsky's production rules in allowing several symbols on the left hand side, although only following a non-terminal. However, these grammars can be systematically transformed into rules which are similar to context-free rules in that they only have a single non-terminal on the left hand side. These are slightly more general than the "definite clause grammars" defined by Pereira and Warren since they allow non-terminals as parameters, and they have the same power as W-grammars.

The syntax description of a language can be used to generate a more convenient "abstract syntax tree" which may be used to define the semantics of a program. This tree is generated at the root of the parse tree, and may also be termed the "static semantics", as it defines the value of those constructs which may be understood in an essentially static way: e.g. constants in a program.

Defining the semantics of a language may then be understood as giving the specification for the value of any abstract syntax tree defined by the syntax. This value may be defined by a Prolog program which defines the value of any program written in the programming language. Since this value involves programming constructs such as conditionals and loops it is not sufficient to regard it as the effect of evaluating a Prolog program. But the unique dual nature of the semantics of Prolog gives a valid basis which is not "operational".

Logic has both a model-theoretic and a proof-theoretic semantics. The first ensures that any Prolog program has a "model" and this corresponds to the fix-point semantics which defines the value of a program inductively. The second ensures that the proof of any implication can be displayed. The equivalence of these two semantics is known as the completeness of first-order logic. However, another property, the undecidability of first-order logic, ensures that one cannot guarantee that the evaluation of the proof of any proposition will terminate in a finite time. Hence it is undesirable to consider the proof-theoretic method in isolation.

To define the semantics of the constructs of a programming language, Prolog is used as a metalanguage to give the "relational semantics" of the language. This involves the use of terms which name "states" of the identifiers, stores, files etc which the program manipulates. Because of the consistency with which the states are used, the same mechanism that is used in metamorphosis grammars may be involved, giving an effect which is a form of relational composition.

One characteristic of this specification is that it must be "complete" in that it has a value for every input. It is therefore important to include error values to correspond to possible errors in the evaluation of a program, such as overflow or the non-termination of loops. Generally, these correspond to the use in Prolog of negation interpreted as failure to prove, and the whole definition is therefore regarded as a "closed world".

The relational semantics is not the only one that can be given. The axiomatic semantics can be similarly presented as a meta-level proof system. Both program and assertions to be proved are represented as terms and the proof rules become axioms, or clauses, in the Prolog system. To achieve a useful program proving system it is necessary to add to this axioms for algebraic equivalence etc.

Many other uses of the definitions suggest themselves. One of the earliest uses of M-grammars was to write a compiler for a small Algol-like language. This is extended here to propose a schema for a compiler-compiler which includes the transformation of the semantics from one language to another as well as the syntax. There is already a growing literature on the transformation of programs written in Prolog from specifications to efficient algorithms. This work suggests that this can be extended to the more traditional algorithmic languages.

Summary of later Chapters

Chapter 2. Grammars and Syntax

Colmerauer's Metamorphosis Grammar is introduced with examples showing the greater clarity achieved for context-sensitive grammars. One aspect of M-Grammars that has been overlooked (the use of non-terminals as parameters) is corrected and it is shown how any M-grammar may be cast in "context-free" form, in which each left hand side has a single non-terminal. Different realisations of M-grammars in logic are discussed and their relative merits pointed out.

Other forms of syntactic description are then introduced and compared with M-grammars, including VDL, W-grammars and Attribute grammars. The fusion of the latter two, in Extended Attribute Grammars by Watt and Madsen, is shown to be very close to the M-grammar formalism, although their basis is very different. In particular the use of resolution with M-grammar systems simplifies the production of attributes, while there are a number of results from the "input-output" annotations of attribute grammars that are useful in logic programming systems.


Chapter 3. Semantics

The basis of a semantic definition based on Prolog is presented. The model and proof-theoretic semantics of Prolog are summarised and the fixpoint semantics also reviewed. The requirements for relational semantics are presented and a methodology for semantic definitions based on Prolog is outlined.

An outline is also given of the application of the axiomatic method in a first-order system. This uses a meta-level proof system in which the proof rules are axioms and predicates to be proved are treated as terms.

An extended review of the different approaches to semantics is given, covering operational, axiomatic, denotational and algebraic methods. The ideas of complementary definition are reappraised and related back to the twin poles of model and proof theory. We argue for the relational semantics using logic as a metalanguage as an alternative to the denotational semantics based on the lambda calculus.

Chapter 4. Examples of Semantics

Three examples of programming language definitions are presented in order to demonstrate the application of the principles suggested. These are complete definitions, including both syntax and semantics.

The first is ASPLE, a simple Algol 68 based language used by several authors to demonstrate the principles of language definition. It includes declarations, assignments, conditions, loops and references, but no jumps or procedures. Both relational and axiomatic semantics are given.

The second language is Prolog itself, which is a complete contrast to Algol-like languages. It is an applicative, non-deterministic language useful for symbolic manipulation rather than numerical computation. This section is useful in explaining the essential features of Prolog to those who are not familiar with it. It is also useful in demonstrating the way in which the semantic model of a language may be refined progressively to include more concrete implementation details. This leads to a discussion of one of the more controversial features of Prolog - the 'cut' or 'slash' predicate.

The third section returns to Algol-like languages and expands the ASPLE language into a subset of Algol 68 which includes blocks, functions and jumps. Different treatments of programs with jumps are considered, and a technique analogous to the use of continuations in denotational semantics is presented.

Chapter 5. Applications

Formal definitions are not useful unless they can be used, and this chapter looks tentatively at the ways in which the definitions presented earlier may be put to prac- tical use.

Producing prototype versions of a new language quickly and accurately from a specification is potentially valuable to program designers. Though several of the specifications presented here have been run, there are scaling up problems, which are discussed and possible lines of research outlined.

M-grammars have been used in both the parsing and generation phases of a compiler, though their efficiency is not impressive. An example of a compiler for ASPLE is presented and a scheme laid out for a logic-based compiler-compiler. This has the merit of dealing systematically with the semantics as well as the syntax of the language, based on the principles of the PQCC project at Carnegie-Mellon.

The application of axiomatic semantics to program proving systems is immediate, but a usable Prolog-based system needs other facilities. These are discussed and an outline for such a system proposed.

Another approach to program proving is the concept of transforming programs from abstract specifications to efficient running algorithms. A great deal of work is going on in this area based on Prolog as well as other formalisms. The suitability of logic systems is thus more obvious than was once supposed.

Reading the Thesis

Though a brief description of Prolog has been given in this chapter, a full definition of Prolog is not given until chapter 4.2. A reader who is unacquainted with this notation may prefer to look at this section first, or consult one of the other descriptions of Horn Clause programming, such as Kowalski (1979).

In chapter 3 a historical survey of semantic methods is given in section 3.3. The reader may well find it best to read this before tackling the earlier sections of the chapter.

Chapter 2.1

Metamorphosis Grammars

Metamorphosis grammars generalise the rewriting rules of the Chomsky grammars to apply to function symbols with parameters rather than (atomic) symbols. The difference may be compared to the distinction between predicate and propos- itional logic.

We will introduce the grammar using a simple example and then present it formally. The example is that of the context-sensitive language a^n b^n c^n, n>0, and the syntax here is informal, with lower case letters representing variables and terminal symbols underlined.

Sentence(n) -> Letters(n,A) Letters(n,B) Letters(n,C)
Letters(S(m),x) -> Letters(1,x) Letters(m,x)
Letters(1,y) -> y

If we consider the substitution of variables n=S(1) and m=1, then the rewriting takes place as follows:

Sentence(S(1))
-> Letters(S(1),A) Letters(S(1),B) Letters(S(1),C)
-> Letters(1,A) Letters(1,A) Letters(1,B) Letters(1,B) Letters(1,C) Letters(1,C)
-> AABBCC.

Here each line corresponds to one or more applications of the rewrite rules, in the same order as they are given. In the second rewriting there are three applications in which different substitutions for the variable x are made, and in the third there are six for the variable y.
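Because every rewriting step is determined by the parameters, the derivation above can be transcribed almost literally into executable form. The following Python sketch is illustrative only and is not part of the thesis; ordinary numerals stand in for the successor terms, so n=2 plays the role of n=S(1).

```python
# Illustrative transcription of the parameterised rules above.

def letters(n, x):
    # Letters(S(m), x) -> Letters(1, x) Letters(m, x)
    # Letters(1, x)    -> x
    if n == 1:
        return x
    return letters(1, x) + letters(n - 1, x)

def sentence(n):
    # Sentence(n) -> Letters(n, A) Letters(n, B) Letters(n, C)
    return letters(n, "A") + letters(n, "B") + letters(n, "C")

print(sentence(2))   # AABBCC, as in the derivation above
```

Note that the parameters leave no choice at any step: the grammar generates exactly the strings A^n B^n C^n, one for each n.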


In this terminal string there are no uninstantiated or free variables. This is a characteristic of M-grammars. They are defined on the variable-free set consisting of all the function and constant symbols applied in all possible ways to each other (the so-called Herbrand Universe). Thus in a formal definition of M-grammars the rewriting rules are a valid generalisation of Chomsky rules.

As a contrast, consider a grammar for the same language based on the more traditional rewriting rules of Chomsky grammars:

S -> aSBC | abC
CB -> BC
bB -> bb
bC -> bc
cC -> cc

Although compact, the effect of this grammar is far from obvious, and deriving a specific string is a tricky process.

To define a metamorphosis grammar formally one must first define functional terms and strings.

An n-ary functional term is written f(t1,..,tn) where t1,..,tn are terms constructed out of function symbols, which may include zero-order terms (constants) and variables which may range over any terms.

A string is a list of variable-free terms connected by the infix function symbol '.' and terminated by the constant symbol 'NIL', e.g. A.B.NIL, which may be written informally as AB. A term of this form which includes variables is called a string schema. The null string is simply NIL.


A Metamorphosis grammar is defined by 5 parts,

(F, VN, VT, VS, ->)

where

(a) F is a set of functional symbols (containing '.' and NIL).

(b) VT is a vocabulary of terminal symbols (a subset of the Herbrand Universe of F, written H[F]).

(c) VN is a vocabulary of non-terminal symbols (also a subset of H[F]). We assume by convention that VN ∩ VT = ∅.

(d) A set of starting symbols VS (a subset of VN).

(e) A rewriting relation -> on V*, where V = VN ∪ VT, with the restrictions that x -> y implies x ≠ NIL and x contains a non-terminal.

The language, L(G), generated by the grammar G is the set of strings on VT given by:

L(G) = { t ∈ VT* | there exists s ∈ VS with s ->* t }

The potential difficulty of applying the definition (cf. the problems of consistently generating the two-level grammars of van Wijngaarden) is alleviated by three factors:

(1) A consistent mapping onto first-order logic is possible, thus making available the vast quantity of theoretical and practical results in this area.

(2) One important feature of logic is the method of resolution used for 'matching' variables. This allows all possible productions to be represented at any stage of the production process, without requiring an extra set of productions for variables.

(3) It can be shown without loss of generality (see below) that one can limit rewriting rules to those which only contain a single non-terminal symbol on the left hand side. This potentially makes available many of the results from the theory of context-free grammars, which has been extensively researched over the last twenty years.

Representing M-grammars

Several representations of M-grammars in logic have been used: some are more efficient in processing terms, and others are more general. We will show later that all rules may be simplified so that only a single non-terminal appears on the left hand side. Hence it is only necessary to provide rules for this. At the end of the section we will also include Colmerauer's (1978) original formulation, which has historic interest.

Most of the formulations depend on observing the similarity between production rules and the composition of relations. For instance, given the grammar:

Sentence -> NounPhrase VerbPhrase
NounPhrase -> Determiner Adjective Noun
VerbPhrase -> Verb
Determiner -> "the"
Adjective -> "mome"
Noun -> "raths"
Verb -> "outgrabe"

the structure of the sentence "the mome raths outgrabe" is:

                        Sentence
           NounPhrase                  VerbPhrase
    Determiner  Adjective   Noun          Verb
  1   "the"   2   "mome"  3  "raths"  4  "outgrabe"  5

Thus a sentence may be considered as the composition of a noun phrase and verb phrase, given relations naming the constituents as follows:

Sentence from 1 to 5
NounPhrase from 1 to 4
VerbPhrase from 4 to 5

and the description of the sentence is:

Sentence from 1 to 5 if
   NounPhrase from 1 to 4 and
   VerbPhrase from 4 to 5

This may be generalised by placing variables in the place-markers to give a rule:

Sentence(u1,u2) <- NounPhrase(u1,u3) & VerbPhrase(u3,u2).

which is the normal form of relational composition. The other rules may be constructed in a similar way.

These principles may be applied in several ways.


1. Using a single 3-place predicate Connects, a grammar is represented as follows.

A rule

l -> r1 r2 .. rn

is translated

Connects(l,u0,u) <- Connects(r1,u0,u1) & ... & Connects(rn,un-1,u).

Each position in the string is marked by a unique atom (say an integer) and the terminal symbol t between two points p1 and p2 is given by an assertion:

Connects(t, p1, p2).

Thus the sentence above may be represented by the assertions

Connects(the, 1, 2).
Connects(mome, 2, 3).
Connects(raths, 3, 4).
Connects(outgrabe, 4, 5).

and the grammatical rule which parses this statement may be represented by

Connects(Sentence, u1, u3) <-
   Connects(NounPhrase, u1, u2) & Connects(VerbPhrase, u2, u3).

The other rules which are needed may of course be described similarly.

This representation is particularly simple, but has several disadvantages from a practical point of view. All the terminal symbols must be expressed by means of assertions, and the use of only a single predicate symbol is a disadvantage in Prolog systems, which generally index on the name of the predicate in an efficient manner.
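As an illustration of this first representation, the following Python sketch (the search strategy and names are mine, not the thesis') encodes the terminal assertions as a set of facts and each grammar rule as a case of a single connects relation; a bounded search over the intermediate place-markers stands in for resolution.

```python
# Illustrative sketch of the Connects representation.

# Terminal assertions: Connects(the, 1, 2) etc.
facts = {("the", 1, 2), ("mome", 2, 3), ("raths", 3, 4), ("outgrabe", 4, 5)}

# One entry per grammar rule; each body is a list of symbols.
rules = {
    "Sentence":   [["NounPhrase", "VerbPhrase"]],
    "NounPhrase": [["Determiner", "Adjective", "Noun"]],
    "VerbPhrase": [["Verb"]],
    "Determiner": [["the"]],
    "Adjective":  [["mome"]],
    "Noun":       [["raths"]],
    "Verb":       [["outgrabe"]],
}

def connects(sym, u1, u2):
    # Connects(t, p1, p2) holds for terminal assertions ...
    if (sym, u1, u2) in facts:
        return True
    # ... and Connects(l, u0, u) holds if the body symbols span
    # u0..u for some placing of the intermediate points.
    def seq(syms, a):
        if not syms:
            return a == u2
        return any(connects(syms[0], a, mid) and seq(syms[1:], mid)
                   for mid in range(a, u2 + 1))
    return any(seq(body, u1) for body in rules.get(sym, []))

print(connects("Sentence", 1, 5))   # True
```

The exhaustive search over intermediate points makes visible the practical cost noted above: every terminal must be looked up among the assertions.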

2. Two three-place predicates - say, Nonterm and Term - may be used, the first for non-terminals and the second for terminals. This increases the flexibility with which terminal symbols may be represented. A particularly useful method is to use difference lists as follows. A single clause is used to define Term:

Term(s, s.x, x).

This may be read: a terminal symbol s is the first item on the list which is the second argument of Term; the third argument is the remainder of the list with that item removed. Grammar rules are expressed as before, and the starting symbol of the grammar is expressed as:

<- Nonterm(S, list, NIL).

where S is the starting symbol of the grammar, list is the string corresponding to a sentence in the grammar and NIL represents the end of the list. The advantage of this representation is that for both parsing and generation of sentences the target strings may be held as terms rather than as assertions, which is far more convenient in practical logic programming. When parsing from left to right, no searching through assertions is necessary, as the next item in the string is always available.
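The difference-list technique can be sketched as follows (an illustrative Python transcription, not the thesis' notation): each relation consumes terminals from the front of the input list and returns the unconsumed remainder, with None standing for failure.

```python
# Illustrative sketch of difference-list parsing.

def term(symbol, s):
    # Term(s, s.x, x): consume one terminal from the front of the list.
    if s and s[0] == symbol:
        return s[1:]
    return None

def noun_phrase(s):
    # NounPhrase -> Determiner Adjective Noun, each one word here.
    for w in ("the", "mome", "raths"):
        if s is None:
            return None
        s = term(w, s)
    return s

def verb_phrase(s):
    if s is None:
        return None
    return term("outgrabe", s)

def sentence(s):
    # Sentence -> NounPhrase VerbPhrase: relational composition,
    # threading the remainder from one constituent to the next.
    return verb_phrase(noun_phrase(s))

# <- Nonterm(Sentence, list, NIL): the parse succeeds when the
# remainder is the empty list.
print(sentence(["the", "mome", "raths", "outgrabe"]))   # []
```

As the text observes, no searching through assertions occurs: the next terminal is always at the head of the list being threaded through.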

There is one extra rule that must be observed in the general case. If there is a rule of the form:


N(x) -> x.

where x is a variable and N a nonterminal, then it must be represented by two clauses:

Nonterm(N(x), u0, u) <- Nonterm(x, u0, u).

Nonterm(N(x), u0, u) <- Term(x, u0, u).

In practice it is usually possible to select which of these translations is appropriate for the grammar, since one knows whether x will represent a terminal or non-terminal.

3. It is possible to optimise the second representation in two ways:

3.1. The Term predicate may be dispensed with entirely. A clause of the form

Nonterm(A,u0,u) <- Term(B,u0,u1) & Nonterm(C,u1,u2) & Term(D,u2,u).

is replaced (by resolution against the earlier clause for Term) by

Nonterm(A, B.u0, u) <- Nonterm(C, u0, D.u).

3.2. Each occurrence of an atom having the form Nonterm(N(p1,p2,..,pn), ui, uj), where N is a non-terminal with n parameters, is replaced by the atom N(p1,p2,..,pn,ui,uj), which has n+2 parameters.

This has the computational advantage that the name of the non-terminal can be used as the predicate name which is used as a primary index in all Prolog systems. It has the disadvantage that rules of the form N(x) -> x, where x is a non-terminal, cannot be represented directly.


In a rule of this form, x may stand either for a terminal or a non-terminal symbol. Colmerauer's original paper omits to consider the latter possibility, though it only affects his normalisation procedure marginally. However it does affect the representation of the grammar forms in logic. Rules of this form are to be found, for instance, in the Algol 68 report. An example using our notation is:

Pack(x) -> begin x end.

The power of this facility may be illustrated by the ease with which two common notations in extended BNF may be implemented (e.g. see Wirth 1977).

(a) Optional constructs. These may be indicated by placing square brackets around the construct. If the characters '[' and ']' are together regarded as the non-terminal symbol, this may simply be defined by:

[ x ] -> x.
[ x ] -> NIL.

(b) Repetitions. These may be similarly indicated using curly brackets, whose definition is:

{ x } -> x { x }.
{ x } -> NIL.
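These two pairs of rules can be mimicked directly in the difference-list style. The sketch below is illustrative Python, not the thesis' formulation: a nonterminal is treated as a function from an input list to a remainder (or None), and the hypothetical digit nonterminal exists only for the demonstration.

```python
# Illustrative sketch of the extended-BNF constructs.

def optional(nonterm, s):
    # [ x ] -> x.   [ x ] -> NIL.
    rest = nonterm(s)
    return rest if rest is not None else s

def repetition(nonterm, s):
    # { x } -> x { x }.   { x } -> NIL.
    rest = nonterm(s)
    return repetition(nonterm, rest) if rest is not None else s

def digit(s):
    # A hypothetical nonterminal consuming one digit character.
    if s and s[0] in "0123456789":
        return s[1:]
    return None

print(repetition(digit, list("123abc")))   # ['a', 'b', 'c']
```

Both constructs succeed on every input, exactly as the NIL alternatives in the rules guarantee; they simply consume as much as they can.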

The two optimisations (3.1 & 3.2) can be combined, yielding the representation which is used by most Prolog systems:

A(B.u0, u) <- C(u0, D.u).

4. A final representation is of interest as possibly the most natural translation of the rewriting rules, as well as being that used by Colmerauer in his original paper. A two-place predicate D is used and rules are adjusted to the form:

Nt t -> t0 Nt1 t1 ... Ntm tm

where the Nti are nonterminals and the ti are terminals, and are represented using the corresponding string schema:

D(nt.t.u, t0.u0) <- D(nt1.t1.u1, u0) & ... & D(ntm.tm.u, um-1).

This representation shows how extra terminals may appear on the left hand side after the initial non-terminal. Any number of terminals may of course be included at each point that a terminal is indicated.

Although many grammars for natural language were written using left hand side terminals, it has been found that they are more obscure than grammars written using other devices, such as the Extraposition Grammars suggested by Pereira (1980).

Simplification of Metamorphosis Grammars

Chomsky's grammars assume an arbitrary mixture of terminals and non-terminals on the left hand side of rewriting rules. However, most results in language theory have been derived with rules of the "context-free" type which have only a single non-terminal on the left hand side. Colmerauer demonstrates that a grammar may be simplified so that all rules have the form:

nonterminal, terminal,..,terminal -> ...

(where the terminals on the left hand side may be absent).


He then presents a binary logic relation which can represent any grammar written in that form (as presented above).

Here we go one stage further and show that any grammar can be expressed in a form which contains rules having only a single non-terminal symbol on the left hand side. This condition is similar to the 'context-free' condition for Chomsky grammars and leads to a representation using a three-place logic relation.

To translate a grammar to 'context-free' form:

(1) Replace each rule of the form:

l1 l2..lm -> r1 r2..rn

where the li and ri are any terminals or non-terminals, by

N(l1, u0, l2.l3..lm.u) -> N(r1,u0,u1) N(r2,u1,u2) .. N(rn,un-1,u)

where u, u0, u1..un-1 are variables and N is a function symbol which does not occur in the grammar.

(2) Add productions of the form:

N(t, NIL, NIL) -> t.

for each terminal symbol t.

(3) Add a single production of the form:

N(n, n.u, u) -> NIL.

where n and u are variables.

(4) Replace the starting symbol S by N(S, NIL, NIL).
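Step (1) is entirely mechanical, and can be sketched as a small translator (illustrative Python, not part of the thesis) that emits the rewritten rule as text; symbols are plain strings and the variable names u0, u1, .. are generated in order.

```python
# Illustrative sketch: step (1) of the translation, applied
# mechanically to a rule  l1 l2..lm -> r1 r2..rn.

def to_context_free(lhs, rhs):
    # Left hand side: N(l1, u0, l2.l3..lm.u), with the remaining
    # lhs symbols pushed into the right-context parameter.
    left = f"N({lhs[0]}, u0, {'.'.join(lhs[1:] + ['u'])})"
    # Right hand side: thread u0, u1, .., u through the symbols.
    vars_ = ["u0"] + [f"u{i}" for i in range(1, len(rhs))] + ["u"]
    right = " ".join(f"N({r}, {vars_[i]}, {vars_[i+1]})"
                     for i, r in enumerate(rhs))
    return f"{left} -> {right}"

# The rule CB -> BC from the a^n b^n c^n grammar:
print(to_context_free(["C", "B"], ["B", "C"]))
# N(C, u0, B.u) -> N(B, u0, u1) N(C, u1, u)
```

Applied to each rule of the Chomsky-type grammar for a^n b^n c^n, this reproduces the rewritten grammar given at the end of this section.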

Proof


For a term N(a,b,c), consider a as the name of the production, c as the required context to the right of the string which this production produces (its right context), and b as the context which has been required by the productions to the left of this one.

Consider a production of the form A -> BC in which the production for B is of the form BD -> E, represented as:

N(A,u0,u) -> N(B,u0,u1) N(C,u1,u)
N(B,u0,D.u) -> N(E,u0,u).

Then the right context of B is D. This must be 'absorbed' by the left context of some production to the right of B. There are four possibilities:

(1) The right context is the next right production of the parent, A (in this case C). Then this matches with and is absorbed by the production: N(n,n.u,u) -> NIL.

(2) The right context is the leftmost descendent of the next right production (C). Then the right context is transmitted via the parameter u0 in a production: N(l,u0,u) -> N(r1,u0,u1) ...

(3) The right context occurs in some later right production of A or in some descendent of it. Then the right context will be transmitted by the parameters u1 to un-1 (of the production A) as long as no terminal productions intervene (as these have NIL for left and right context).

(4) The right context occurs to the right of some ancestor of A. Then the right context is transmitted via the parameter u. If the ancestor itself has a right context, this occurs after those right contexts. (As an example, consider the treatment of the right context D in the grammar


A -> BCD, BC -> E, ED -> f, which produces A -> BCD -> ED -> f.)

The above argument applies to single terminal or non-terminal symbols as right context; if the right context consists of more than one symbol, each can be transmitted in turn in the same way. It applies equally to clauses whose left hand side starts with a terminal or non-terminal - the only difference between them lies in the restricted production for terminal symbols. End Proof

In the grammar rephrased in this way there is only one non-terminal symbol, N, and the symbols used in the original grammar occur as parameters of it. This technique derives from a consistent method of rewriting grammars developed by Pereira (1980). As an example, the grammar presented (p.17) for a^n b^n c^n as a Chomsky-type grammar may be rewritten as:

N(S,u0,u) -> N(a,u0,u1) N(S,u1,u2) N(B,u2,u3) N(C,u3,u)
           | N(a,u0,u1) N(b,u1,u2) N(C,u2,u).
N(C,u0,B.u) -> N(B,u0,u1) N(C,u1,u).
N(b,u0,B.u) -> N(b,u0,u1) N(b,u1,u).
N(b,u0,C.u) -> N(b,u0,u1) N(c,u1,u).
N(c,u0,C.u) -> N(c,u0,u1) N(c,u1,u).
N(a,NIL,NIL) -> a.
N(b,NIL,NIL) -> b.
N(c,NIL,NIL) -> c.
N(u,u.v,v) -> NIL.

where u, v and the ui are the only variables.

This is a rather ungainly version of the grammar, which was rendered much more cleanly earlier, also using an M-grammar. However it is a grammar which may be parsed directly in a top-down left-to-right fashion.
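Such a grammar can be cross-checked by running the original rewriting rules directly. The following sketch (in Python rather than Prolog, and not part of the thesis) assumes the standard type-0 grammar for a^n b^n c^n - S -> aSBC | abC, CB -> BC, bB -> bb, bC -> bc, cC -> cc - and searches derivations breadth-first, pruning any sentential form longer than the target, which is safe because every rule is length non-decreasing:

```python
from collections import deque

# Standard Chomsky type-0 rules for a^n b^n c^n (n >= 1); an assumption,
# since the thesis cites its own version of the grammar on p.17.
RULES = [("S", "aSBC"), ("S", "abC"),
         ("CB", "BC"), ("bB", "bb"),
         ("bC", "bc"), ("cC", "cc")]

def derivable(target):
    """Breadth-first search over sentential forms starting from S.
    Forms longer than the target are pruned: every rule above has a
    right hand side at least as long as its left hand side."""
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in RULES:
            start = 0
            while True:
                i = form.find(lhs, start)
                if i < 0:
                    break
                new = form[:i] + rhs + form[i + len(lhs):]
                if len(new) <= len(target) and new not in seen:
                    seen.add(new)
                    queue.append(new)
                start = i + 1
    return False
```

Because the state space of bounded-length forms over a finite alphabet is finite, the search always terminates.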


Some further characteristics of M-grammars

The flexibility of M-grammars enables one to draw a clear distinction between the lexical syntax of a language and its syntax. A phrase, say "the slithy toves", may be considered as a string of characters, or a string of words. Using the infix 'dot' notation this may then be either:

t.h.e.' '.s.l.i.t.h.y.' '.t.o.v.e.s.NIL or

"the"."slithy"."toves".NIL where "the" is a shorthand notation for t.h.e.NIL.

The second example is then a string of strings, which is a convenient form in which to present the syntax of a programming language. For, given a rule of the form:

Assignment -> Variable ":=" Expression ";".

the input is assumed to be a string of function symbols and strings which has previously been 'tokenised' by a lexical analyser. A possible string parsed by this process is:

Identifier("day").":=".Integer(23).";".NIL

Here the strings stand for themselves, and any spaces, newlines, comments etc will be removed by the lexical analysis which produced this list. The function symbols in the list hold specific entities which are described by the lexical rules, such as identifiers and numbers. These are described in the syntax by a rule of the form:

Variable -> @Identifier(v).

where the symbol '@' signifies that the function following is a terminal symbol, and 'v' is the value of this particular identifier.

Examples of the description of lexical syntax will be given in chapter 4. However, writing a lexical syntax in BNF is a tedious process, basically because one must allow for every possible construct that may come after each token. For instance, after a number one might encounter a non-significant character, such as a space, or a significant one such as ')', but not a digit or, in most languages, a letter. This is easy to represent in a finite state diagram, but lengthy in BNF. The sensible way to write the lexical rule is:

Number -> Digit (Number | ~Alphameric).

The '~' sign stands for negation, which may easily be introduced into the logic formalism by introducing a new variable symbol and using logical negation. Thus a logic formulation of the above clause might be:

Number(u0,u) <- Digit(u0,u1) & Number(u1,u).
Number(u0,u) <- Digit(u0,u) & ~Alphameric(u,u2).

It is worth noting that the alternative formulation, in which all possibilities are enumerated, must include an empty production if the token can include the last character in the file. This can lead to ambiguous parses if the top level of the lexical syntax is written in the obvious way:

Tokenlist -> Space Tokenlist | Token Tokenlist | NIL.
Token -> Number | ... .

In this case, a string such as "1234" can be parsed as "12"."34" or "1"."234" as well as "1234". In practice, this problem is avoided by using the rules in order, selecting the first valid parse and using the 'cut' predicate. This does not address more general problems of ambiguity.

Given the formulation of the grammar rules in logic, it is easy to see that other conditions can be incorporated into the grammar. A grammar rule is a Prolog clause in which each predicate has two extra parameters. If these parameters represent the same value, or if they are omitted altogether (which amounts to the same thing), then the predicate can generate no symbols. Such extra conditions are extremely useful as they can perform extra checking, or data manipulation. Since their definition is by Prolog clauses, they can be of arbitrary complexity.
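The effect of the negated Number formulation above can be seen by transcribing its two clauses into a conventional language. In this Python sketch (an illustration, not part of the thesis), the difference-list arguments become string positions, and the function returns every position at which a Number beginning at position i may end:

```python
# Hypothetical helpers standing in for the Digit and Alphameric predicates.
def digit(s, i):
    # Digit(u0,u1): a digit character at position i
    return i < len(s) and s[i].isdigit()

def alphameric(s, i):
    # Alphameric(u,u2): a letter or digit at position i
    return i < len(s) and s[i].isalnum()

def number_ends(s, i):
    """All positions j such that Number holds on s[i:j]: a run of
    digits whose last digit is not followed by an alphameric."""
    ends = []
    if digit(s, i):
        j = i + 1
        # Second clause: Digit(u0,u) & ~Alphameric(u,u2)
        if not alphameric(s, j):
            ends.append(j)
        # First clause: Digit(u0,u1) & Number(u1,u)
        ends.extend(number_ends(s, j))
    return ends
```

The negation makes the parse unambiguous: "1234" yields the single end position 4, while "12a" yields no parse at all, since a letter may not follow a number.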

A grammar can also be made to produce output while parsing an input text. There are two ways in which this can be done. The most obvious is to use extra parameters for the non-terminals and build up a tree of output symbols. For instance, one might build up the output for a list of statements as follows:

Statementlist(s1.s2) -> Statement(s1) ";" Statementlist(s2).
Statement(Asgt(v,e)) -> Variable(v) ":=" Expression(e).

The list constructor and the function symbol Asgt form nodes of the tree which is built up as output.
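The same tree-building scheme can be mimicked in a conventional recursive-descent style. In this Python sketch (illustrative only; the single-statement base case and the flat token representation are assumptions not made in the text), a pair plays the part of the 'dot' list constructor and a tagged tuple the part of the function symbol Asgt:

```python
def statement(toks, i):
    # Statement(Asgt(v,e)) -> Variable(v) ":=" Expression(e)
    v = toks[i]
    assert toks[i + 1] == ":="
    e = toks[i + 2]
    return ("Asgt", v, e), i + 3   # output tree node and next position

def statementlist(toks, i):
    # Statementlist(s1.s2) -> Statement(s1) ";" Statementlist(s2)
    s1, j = statement(toks, i)
    if j < len(toks) and toks[j] == ";":
        s2, k = statementlist(toks, j + 1)
        return (s1, s2), k         # the pair is the 'dot' constructor
    return (s1, "NIL"), j          # assumed base case: last statement
```

Parsing the token list for "day := 23 ; x := 5" yields the nested pair structure that the grammar's output parameters would build.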

A disadvantage of this notation is that parameters used for output obscure the meaning of the grammar and may be confused with variables which form context-sensitive restrictions. It is therefore possible to deal systematically with output in a grammar in the same way as input is managed, by using two extra parameters in the logic representation. This may be termed a "correspondence grammar" as it represents the correspondence between input and output. The productions above may be represented as follows:

Statementlist -> Statement Statementlist.
Statement -> Variable>>v ":=" Expression>>e => Asgt(v,e).

In these productions, the symbol '>>' means 'outputs' and '=>' is used for the output of the whole production. The clauses may be represented in logic as follows:

Statementlist(u0,u,x0,x) <- Statement(u0,u1,x0,x1) & Statementlist(u1,u,x1,x).

Statement(u0,u,Asgt(v,e).x,x) <- Variable(u0,":=".u1,v.x,x) & Expression(u1,u,e.x,x).

Obviously this notation is more suited to a linear output than a tree form, though it is capable of doing both. However there are difficulties in applying it in some instances and it has not therefore been used in the language examples in chapter 4.

A Syntax for M-grammars

In the preceding discussion a rather informal syntax for M-grammars has been used which corresponds to that used in the literature. In order to enable grammars to be handled automatically by program it is necessary to have a precise and versatile machine-readable syntax. Because M-grammars have been implemented as parts of Prolog systems, their representations have followed the dictates of these systems. The Marseilles notation (see Colmerauer 1978) consists of a sequence of almost independent terms, while Edinburgh Prolog uses a notation close to W-grammars (with ',' for 'followed by' and ';' for 'alternatively'). The syntax suggested below

follows broadly Wirth's (1977) suggestions, with the exception that an arrow (->) is used instead of '=' for 'produces'.

The following symbols are used:

->   produces
;    followed by (optional)
|    alternatively
" "  enclose terminal strings of characters. Thus "AB" is equivalent to A.B.NIL.
@    precedes a terminal symbol which is a function
&    introduces conditions which do not produce terminal strings
~    indicates the absence of the specified symbol after that point (or negation within conditions)
( )  is used for grouping (as well as parameters)
[ ]  enclose optional parts of a production
{ }  enclose sections repeated 0 or more times

Function and constant symbols are written with an initial capital letter and variables with lower case. Single quotes may be used to enclose constant symbols which do not follow the normal rules for identifiers, such as space.

The syntax may be indicated by giving the operator definitions for Prolog:

Op('->', 10, RL).
Op('|', 20, RL).
Op(';', 30, RL).
Op('&', 30, RL).
Op('~', 40, PREFIX).
Op('@', 40, PREFIX).

where the second number stands for the precedence of the operator (higher is more binding) and the third parameter gives the type of the operator - RL means Right to Left binding.

Note that the sequence operator (';') is considered optional. It is mainly included to be compatible with Prolog's 'operator precedence' system. In the later sections of the thesis it is often omitted for syntax sections, although still used in the semantics parts.

Chapter 2.2

The Development of Syntax Descriptions

"Alice felt dreadfully puzzled. The Hatter's remark seemed to her to have no sort of meaning in it, yet it was certainly English" Lewis Carroll: Alice in Wonderland.

The classical approaches to syntax description are those of Chomsky and Backus. When these apply to context-free or regular grammars they are almost identical, and BNF (Backus Naur Form, or Backus Normal Form) has been the only widely accepted formalism for describing the syntax of programming languages. Chomsky's phrase structure grammars have been much less successful in describing the context-sensitive parts of real languages, and their main use has been for language theorists.

The Results of Language Theory

Language theory has advanced using two concepts which have developed in parallel - generators and recognizers. Much of the practical benefit of this theory has come from the equivalences between the two that have been proved for different classes of language. The four main classes remain those defined by Chomsky by restricting the sets of rewriting rules a -> b, where a and b may contain any sequence of terminals or non-terminals. These are:

Type 0: Unrestricted.
Type 1: Context-sensitive. Each right hand side must be at least as long as the left hand side; hence 'empty' productions are not allowed.
Type 2: Context-free. Each left hand side can only contain a single non-terminal and the right hand side is not empty.
Type 3: Regular. As for context-free, with the additional constraint that each right hand side must start with a terminal symbol.

Corresponding to these categories are the recognizers, which are normally classed as 'automata'. It may be shown (e.g. see Hopcroft & Ullman 1969) that the following equivalences hold:

A language is:
Unrestricted iff it is defined by a two-way unbounded automaton (a Turing machine).
Context-sensitive iff it is defined by a two-way linear bounded automaton.
Context-free iff it is defined by a one-way non-deterministic pushdown automaton.
Regular iff it is defined by a one-way deterministic finite automaton.

These equivalences are important for practical language processing, as one wants guaranteed time and space bounds for the various parts of a system. The similarity of automata to basic M-grammars may be obvious: if we consider the two 'extra parameters' of the M-grammar as a read-only tape which may be traversed in one or two directions, then we may assert (without proof) that:

a regular grammar may be parsed without backtracking and with no function symbols being used in the grammar;

a context-free grammar may be parsed either using no function symbols and backtracking (an implicit stack), or without backtracking and using an explicit stack;

context-sensitive and unrestricted grammars will in general use arbitrary function symbols.
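The first two assertions can be illustrated directly. The Python sketch below (not from the thesis; the example grammars are assumptions chosen for brevity) recognizes a regular identifier language with a finite automaton scanning once left to right, and a context-free language of balanced parentheses with an explicit stack:

```python
def regular_accepts(s):
    """DFA for the regular grammar Identifier -> Letter {Letter | Digit}:
    one left-to-right scan, no backtracking, no auxiliary storage."""
    state = 0                      # 0 = start, 1 = seen initial letter
    for ch in s:
        if state == 0 and ch.isalpha():
            state = 1
        elif state == 1 and ch.isalnum():
            pass                   # remain in the accepting state
        else:
            return False
    return state == 1

def contextfree_accepts(s):
    """Pushdown recognizer for balanced parentheses: the explicit stack
    replaces the backtracking an implicit-stack parse would need."""
    stack = []
    for ch in s:
        if ch == '(':
            stack.append(ch)
        elif ch == ')':
            if not stack:
                return False
            stack.pop()
        else:
            return False
    return not stack
```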

The subclasses of M-grammars have not been deeply explored yet, though Warren (1975) proposed the use of the Earley parsing method as the basis for a Prolog deduction system.

The Vienna Definition Language

In 1962 McCarthy introduced an attitude to the context-sensitive parts of grammars which flowered in the Vienna Definition Language (Lucas, Walk 1969) and has continued through to denotational semantics. This treats the context-free parts as "the syntax" and all other parts as "semantics", which partly explains the attitude of various members of this school that "syntax is uninteresting, semantics is everything".

McCarthy used an "analytic" approach to syntax which contrasts with the "synthetic" approach using rewriting rules. His formulation is based on the use of recursively defined predicates and is 'abstract' in that it defines only the essence of a language, without defining the written representation. Thus, for instance, an assignment statement consists "essentially" of a left part and a right part; conditions attached to this are that the left part is a variable and the right part a term. McCarthy writes this as:

isasgt(t) = isvar(leftpart(t)) & isterm(rightpart(t)).

where leftpart(t) and rightpart(t) are functions which decompose the statement t into its constituent parts. He also suggests a synthetic form which is closer to the abstract syntax used in M-grammars, using constructive functions


(e.g. mkasgt). If this is expressed in the Prolog notation it appears as follows:

mkasgt(leftpart(s),rightpart(s)) = s <- isasgt(s).
isasgt(mkasgt(s,t)).
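These definitions can be transcribed almost literally into a modern language. In this Python sketch (the tuple representation of a statement, and the particular isvar and isterm tests, are assumptions made purely for illustration), mkasgt is the constructor, leftpart and rightpart the selectors, and the synthetic identity mkasgt(leftpart(s), rightpart(s)) = s holds for any assignment s:

```python
def mkasgt(l, r):
    # synthetic (constructive) form: build an assignment term
    return ("asgt", l, r)

def leftpart(t):
    return t[1]      # selector: the variable assigned to

def rightpart(t):
    return t[2]      # selector: the term assigned

def isvar(x):
    # assumed representation: a variable is an alphabetic string
    return isinstance(x, str) and x.isalpha()

def isterm(x):
    # assumed representation: a term is a string or a number
    return isinstance(x, (str, int))

def isasgt(t):
    # analytic form: isasgt(t) = isvar(leftpart(t)) & isterm(rightpart(t))
    return (isinstance(t, tuple) and t[0] == "asgt"
            and isvar(leftpart(t)) and isterm(rightpart(t)))
```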

The Vienna Definition Language (Lucas, Walk, '69) took over the analytic form, using a slightly different syntax. For instance, the assignment statement is represented:

is-asgt = (&lt;left-part: is-var&gt;, &lt;right-part: is-term&gt;).

Here, left-part and right-part are selectors and is-var and is-term are predicates which define the subparts. Thus if x is an assignment statement (for which is-asgt(x) is true), then is-var (left-part(x)) is true.

As the aim of VDL was to produce a complete definition of a programming language (the main object of interest being PL/1), the researchers defined, in addition to the abstract syntax, a "concrete" syntax. Thus the definition of an assignment is split into two parts, is-c-asgt and is-abs-asgt, which may be written

is-c-asgt = (&lt;s1: ...&gt;, &lt;s2: ...&gt;, &lt;s3: is-c-exp&gt;)
is-abs-asgt = (&lt;target: ...&gt;, &lt;source: ...&gt;)

Obviously the concrete syntax is simply a (somewhat clumsy) restatement of BNF with annotations. There is no direct relationship between the concrete and abstract syntaxes. The relationship is defined by a program, the Translator, which is expressed in a functional language using McCarthy's conditional, and which checks the context-sensitive restrictions. Following Marcotty et al (1976) the statements for the assignment (which we will not describe in detail) may be written:

trans-asgt-stm(t) = [valid-mode-for-assignment(t) -> translate-assignment(t);
                     true -> error]
translate-assignment(t) = [ref-chain-lengths(s1(t)) = 1 ->
                             mu0(&lt;target: make-id(s1(t))&gt;, &lt;source: ...&gt;);
                           true -> mu0(&lt;target: ...&gt;, &lt;source: ...&gt;)]

Thus to describe the assignment statement including the context-sensitive parts requires four parts, containing a total of six statements. These are distributed over three separate texts which must be cross referenced for use. This example goes a long way towards explaining why the PL/1 definition (the "Vienna telephone directory") is such a voluminous document and difficult to comprehend.

It should be noted that the point at which the context-sensitive restrictions are applied varies between different authors. For instance Lauer (1968) uses only an abstract syntax, with an interpreter which defines both the context-sensitive restrictions and the semantics.


W-Grammars

The approach of van Wijngaarden contrasts strongly with that of McCarthy. Whereas one might say that McCarthy wished to make everything into semantics, it may equally be said that van Wijngaarden wished to make everything into syntax. The chief and definitive example of W-grammars is in the definition of Algol 68, particularly in the revised report (van Wijngaarden et al 1975).

Context-sensitive restrictions are introduced into a grammar by means of two levels of productions, called respectively metarules and production rules. A third class of rules - the hyper-rule - is in some sense a mixture of the two types of rule and is therefore not fundamental, although this is the form of most rules in the Algol 68 report. As an example let us take the rule for assignment (or 'assignation' in Algol-68 jargon):

REF to MODE NEST assignation:
  REF to MODE NEST destination, becomes token, MODE NEST source.

The syntax of a hyper-rule is that ':' stands for 'produces', ',' stands for 'followed by', and ';' for 'otherwise'. Words in capitals stand for Metanotions, which are ultimately defined by metarules. They perform the same function in W-grammars that variables play in M-grammars. Words in small letters are part of all the production rules that are derived from these rules. Metaproduction rules are written in much the same format, except with a double colon to distinguish them. Production rules are used as follows:

(1) The metanotions (in large letters) are produced using the metaproduction rules until one is left with only small letters. If one metanotion occurs more than once in a rule, it must be produced identically in all places. (Metanotions may be followed by digits to distinguish separate uses within one production.)

(2) The resultant production rule is produced in a normal way similarly to BNF.

In the clause above there are actually three metanotions (REF, MODE, NEST), although this is not altogether obvious, as spaces are irrelevant in rules. Thus, if there were a metarule for "MODENEST" somewhere, it would be valid to use that. 'REF' is the Algol 68 term for a pointer, MODE for a type, and NEST stands for all the names declared at this point of the program.

Let us follow through the expansion of this clause for the case of the assignment of an integer to an integer variable; for brevity we will omit the 'NEST' productions, and assume that only a single identifier 'a' is declared anywhere in the program. The relevant metarules are

REF :: reference; transient reference.
MODE :: PLAIN; STOWED; REF to MODE; ...
PLAIN :: INTREAL; BOOLEAN; character.
INTREAL :: SIZETY integral; SIZETY real.
SIZETY :: long LONGSETY; short SHORTSETY; EMPTY.
EMPTY :: .

Choosing the appropriate rules, we generate

REF -> reference
MODE -> integral
NEST -> new reference to integral letter a symbol

The rule then becomes the following

reference to integral new reference to integral letter a symbol assignation:
  reference to integral new reference to integral letter a symbol destination,
  becomes token,
  integral new reference to integral letter a symbol source.

This may then be put aside for use as a normal production rule. It is obvious that, as many of the metarules can be produced indefinitely, the potential number of production rules is infinite. Thus the essential limitation of CFG rules (that an infinite number would be required to express context-sensitive grammars) is overcome.

The Algol-68 report demonstrates that W-grammars are capable of defining precisely the syntax of major and complex languages. Unfortunately they have not received wide acceptance in computing circles, and the method of definition was certainly partly responsible for the slow acceptance of the language. In fact most implementations of Algol-68 have returned to the use of BNF in their manuals (e.g. the Algol 68R Manual).

A detailed comparison of W-grammars and M-grammars is given in (Moss 1979) and will not be repeated here. But it is worth summarising some of the specific difficulties with the use of W-grammars.

(1) The method of two level productions is combinatorially explosive and not feasible for practical use. Some modified method of using metaproductions directly, rather than to generate intermediate level rules, is presumably used by most (human) users.

(2) Partly because of the above, no means of mechanically analysing W-grammars, or languages based on them, has been successful. Their use has been mostly limited to exposition.

(3) Unlike the functional notation used in mathematics, there is no clear distinction between the 'name' of a production and its 'parameters', and it is frequently difficult to see where one parameter starts and the other leaves off. Thus "REF to MODE NEST assignation" would better be written "assignation(REF to MODE, NEST)", where one might see that it has 2 and not 3 parameters. An ancestor of this hypernotion is "MODE FORM", and it is impossible to guess which is the name and which the parameter.

(4) There is inadequate distinction between productions which produce strings and those which are there simply as conditions. The only distinction is that one produces a notion ending with the word 'symbol' and the other yields an empty terminal production, whereas unsuccessful productions of either kind end in "blind alleys". In the revised report, conditions are normally indicated by prefacing them with the word "where" or "unless", which aids comprehension considerably (although the corresponding hyper-rule may start with "WHETHER", which can produce either).

(5) There is no easy or general way of simplifying a W-grammar to a context-free grammar (by leaving out scope rules). It is of course possible to write a grammar in such a way that this is possible, but this was not done for Algol-68 - presumably the generality of the method invited short cuts which were hard to resist.

It is interesting to note that difficulties (1) and (2) are exactly the same objections that were raised against the use of logic by many in computing during the '60's (i.e. before the discovery of the Resolution principle (Robinson 1965) and its exploitation in a restricted form (Kowalski 1974)). As Robinson (1979) points out concerning early theorem provers, "a proof would eventually be found by the computer - but only after running through combinations of instantiations that might be as many as 10^10^10^10. The 'combinatorial explosions' caused by these early experimenters echoed through the corridors of computer centres and singed the eyebrows of the intrepid pioneers of 'mechanical theorem proving' several times".

An early implementation of W-grammars by de Chastellier and Colmerauer (1969) took 90 secs on a CDC 6400 to produce parses of a 30 symbol string, using up to 300 rules. A more systematic examination of the problem of parsing using W-grammars by Watt (1974) came to the conclusion that they are not suitable for the automatic construction of parsers.

Since both W-grammars and M-grammars have the power of Chomsky type-0 grammars, it is obvious that they are in some sense equivalent. To define a translation procedure from WG's to MG's is somewhat more difficult, for reasons anyone who has looked closely at the Algol-68 report may realise.

(1) We will only attempt to define a translation method for "sensibly written" WG's. Thus we will exclude (as does the Algol 68 report, para 1133d) any metanotions which may be concatenated to form other metanotions; e.g. if MOID and FORM are metanotions, then MOIDFORM may not be a metanotion.

(2) We rely heavily on the idea of "abstraction" (report para 1142b) which produces hypernotions (called paranotions) which do not occur as such in production trees.

(3) We assume the existence of cross references between the definition and use of hypernotions (Report 1134f).


This is a necessary preliminary which is not always easy to deduce.

The translation procedure (of which an example is given below) is as follows:

(1) Group together the metanotions in every hyperrule by bracketing groups of symbols that appear as complete entities on either the left or right hand side of a metaproduction rule, or some metaderivative of these.

(2) Expand the metanotions occurring in the hyperrules, possibly creating new rules, until there is an identifiable common fragment between the definition of each hypernotion and all its uses. In cases where the cross reference does not include the complete hypernotion, there may be more than one such fragment. This fragment forms the 'non-terminal' symbol of the MG. In cases where this fragment is a metanotion, an abstraction may be used instead.

(3) For each non-terminal symbol 's' formed in (2), a number of parameters is chosen depending on the bracketing introduced in (1) in all the hypernotions. Metaproductions may be invoked to expand metanotions to ensure uniformity.

(4) Each hypernotion is now rearranged into the form of function symbol and parameters. Where a parameter consists of a single metanotion this is replaced by a variable, consistently in each hyperrule. Where there is more than one, appropriate function symbols are introduced (which may include the list notation) so that a single term results. Where there is more than one non-terminal symbol in the hypernotion, one must be a parameter of the other. The necessary syntax changes (':' to '->', ',' to ';', ';' to '|') are also made. Where the parameter includes protonotions, these may be represented as constant symbols or function terms.

(5) Any non-terminals all of whose terminal productions are empty may be replaced by conditions, whose productions are expressed as logic clauses rather than grammar rules.

(6) The M-grammar consists of the transformed hyperrules.

As an example of this (somewhat loose) translation procedure let us consider the introduction and definition of the serial clause in the Algol 68 definition (paras 311a and 321a). These demonstrate most of the possibilities above. They are written as they occur in the revised report, including cross references.

311a) SOID NEST closed clause {22a, 551a, A341h, A349a}:
  SOID NEST serial clause defining LAYER {32a} PACK.
321a) SOID NEST serial clause defining new PROPSETY {31a, 34f,l, 35h}:
  SOID NEST new PROPSETY series with PROPSETY {b}.

We first group together the metanotions and their productions using various metarules:

(SOID) (NEST) closed clause:
  (SOID) (NEST) serial clause defining (LAYER) (PACK).
(SOID) (NEST) serial clause defining (new PROPSETY):
  (SOID) (NEST new PROPSETY) series with (PROPSETY).

If we now try to produce the result of the first clause using the second, we find that (LAYER) (PACK) cannot produce (new PROPSETY). In fact, there are two primary fragments in the right hand side of the first rule. If we invoke the metarule:

46 The Development of Syntax Descriptions

31B) PACK :: STYLE pack.

we discover that another hyperrule intervenes between these two rules:

133d) NOTETY STYLE pack: STYLE begin token, NOTETY, STYLE end token.

This rule corresponds to putting "begin...end" or "(...)" round the serial clause (or any non-terminal sequence) depending on which STYLE is chosen. So we must invoke the metarule (31B) before transforming the clause. We may now identify the common fragments in the clauses, which will form the non-terminal symbols (these are underlined):

(SOID) (NEST) closed clause:
  (SOID) (NEST) serial clause defining (LAYER) (STYLE) pack.
(SOID) (NEST) serial clause defining (new PROPSETY):
  (SOID) (NEST new PROPSETY) series with (PROPSETY).

We now choose the number of parameters for each non-terminal. For reasons of compatibility with the rest of the grammar, we expand the metaproduction SOID into SORT MOID {31A}. SORT expresses the amount of coercion to be applied to the resultant value, and MOID is its mode or type. Writing variables in upper case, and constants in lower case, the clauses now take on their final MG form.

closed clause(SORT, MOID, NEST) ->
  pack(STYLE, serial clause defining(SORT, MOID, LAYER, NEST)).
serial clause defining(SORT, MOID, new(PROPSETY), NEST) ->
  series with(SORT, MOID, PROPSETY, new(PROPSETY, NEST)).

Comments:

1. The use of 'pack' in the clause above demonstrates the need for non-terminals as parameters of other non-terminals. As mentioned in ch. 2 this facility is not provided in some versions of MG's (e.g. DCG's).

2. In a small number of cases the translation of metanotions as variables means that restrictions implicit in the metarules are not applied to the variables. In this case an extra constraint must be added to the lowest level of rule in which this metanotion is used, and expressed by means of logic clauses which parallel the metaproduction rules.

Attribute Grammars

The idea of adding attributes to each non-terminal symbol of a context-free grammar dates back to Irons (1961). But the definition of attribute grammars is generally credited to Knuth (1968), who distinguished for the first time the ideas of inherited and synthesised attributes. The value of an inherited attribute is derived from the surrounding productions, whereas that of a synthesised attribute is produced by the production itself. It is worth quoting one of his examples, giving "the most natural" definition of binary numbers, which shows the potential complexity of the evaluation rules arising from a grammar. It is defined in terms of syntactic (context-free) rules and semantic (functional) rules. There are three functions which take as arguments the value represented by the corresponding production rule. Subscripts in the production rules are only used to distinguish multiple occurrences.

48 The Development of Syntax Descriptions

v(P) is the "value" of production P - a rational number
s(P) is the "scale" of production P - an integer
l(P) is the "length" of production P - a natural number

Syntactic rules    Semantic rules
N -> L             v(N) = v(L), s(L) = 0
N -> L1 . L2       v(N) = v(L1) + v(L2), s(L1) = 0, s(L2) = -l(L2)
L -> B             v(L) = v(B), s(B) = s(L), l(L) = 1
L1 -> L2 B         v(L1) = v(L2) + v(B), s(B) = s(L1), s(L2) = s(L1) + 1, l(L1) = l(L2) + 1
B -> 0             v(B) = 0
B -> 1             v(B) = 2 ** s(B)

Though the representation of the semantic rules has been considerably improved since, this does show clearly the underlying mathematical functions (subscripts simply distinguish different instances of one symbol). The dependencies are demonstrated by considering the parse of the binary number "110.01". The values of the functions in the top-level production (N -> L1 . L2) are as follows:

s(L1) = 0      l(L1) = 3      v(L1) = 6
s(L2) = -2     l(L2) = 2      v(L2) = 0.25
                              v(N) = 6.25

The value of leading 1's (set in the production B -> 1) depends on the scale of B, which depends in turn on the scale of L1 in the top-level production, which is not set until the radix point is reached. To the right of the radix point, the situation is even more complex. Considered with respect to the parse tree, the length attributes must be evaluated from the bottom up, before the 'scale' attributes can be evaluated from the top down, and finally the value attributes from the bottom up.

Represented in M-grammars, the value can be evaluated in a single left-to-right pass, assuming that the functions are not evaluated and ignoring backtracking and left-recursion. The grammar for this is:

N(v) -> L(v,l,0).
N(v1+v2) -> L(v1,l1,0); "."; L(v2,l2,-l2).
L(v,1,s) -> B(v,s).
L(v1+v2,l+1,s) -> L(v1,l,s+1); B(v2,s).
B(0,s) -> "0".
B(2**s,s) -> "1".
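The single left-to-right pass works precisely because the arithmetic terms are left unevaluated until the parse is complete. The following sketch (ours; the parser structure and names are invented, with closures standing in for Prolog's unevaluated terms) imitates this:

```python
# A one-pass, left-to-right reading of "110.01", delaying each B(2**s,s)
# term as a closure until the radix point has fixed the scales.

def parse_bits(text, i):
    """L(v,l,s): consume bits left to right, returning delayed bit
    values (closures awaiting their scale) and the next position."""
    thunks = []
    while i < len(text) and text[i] in "01":
        thunks.append(lambda s, b=text[i]: 0 if b == "0" else 2 ** s)
        i += 1
    return thunks, i

def parse_number(text):
    left, i = parse_bits(text, 0)
    right = []
    if i < len(text) and text[i] == ".":
        right, i = parse_bits(text, i + 1)
    # Only now, after the single pass, are the delayed terms evaluated:
    # scales run l-1..0 before the point and -1..-l after it.
    total = sum(t(s) for t, s in zip(left, range(len(left) - 1, -1, -1)))
    total += sum(t(-s) for t, s in zip(right, range(1, len(right) + 1)))
    return total

print(parse_number("110.01"))
```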

Knuth pinpointed one valuable criterion in the analysis of attribute grammars. A grammar can only be well-defined if its attributes contain no circularities; i.e. the value of an attribute must not depend on its own value. This reduces to the problem of whether the directed graph formed, in any syntax tree, by the dependency links between inherited and synthesised attributes contains an oriented cycle. The class of all dependencies can be generated by considering the graphical fragments formed by each production. Dependencies are added by considering all possible matching rules and the process continued until no further links can be added. Since only a finite number of links are possible this process must terminate eventually, although, as Jazayeri et al (1975) point out, it may require exponential time. If the resultant subgraphs do not contain any cycles then no cycles are possible.
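The core step of the circularity test, once the dependency links have been accumulated, is an ordinary search for an oriented cycle. A minimal sketch (ours, not Knuth's full algorithm; the attribute names are taken from the binary-number example):

```python
# Test whether the directed graph of attribute dependencies
# contains an oriented cycle, by depth-first search.

def has_cycle(edges):
    """edges: dict mapping an attribute to the attributes it depends on."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in edges}

    def visit(n):
        colour[n] = GREY
        for m in edges.get(n, ()):
            if colour.get(m, WHITE) == GREY:
                return True                      # back edge: a cycle
            if colour.get(m, WHITE) == WHITE and m in edges and visit(m):
                return True
        colour[n] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in edges)

# v depends on s, s on l, l on nothing: well-defined, no cycle.
print(has_cycle({"v": ["s"], "s": ["l"], "l": []}))   # False
print(has_cycle({"a": ["b"], "b": ["a"]}))            # True
```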

It is interesting to note, in the context of Prolog, that Knuth's algorithm corresponds to a determination of whether the "occurs check" is necessary in a given program. The occurs check is part of the unification algorithm which is designed to prevent the binding of a variable to a term which contains an instance of itself. To produce the corresponding graph in this case, some bidirectional links must be introduced where a variable occurs as the parameter of a predicate in more than one place. If the algorithm does not introduce cycles then the occurs check is unnecessary. The check, which is generally expensive, is normally omitted in working systems.
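What the occurs check prevents can be seen in a small unifier. The sketch below is ours (the term representation - uppercase strings for variables, tuples for compound terms - is an invented convenience, not the thesis's notation):

```python
# A miniature unification algorithm with an optional occurs check.

def walk(t, subst):
    while isinstance(t, str) and t[0].isupper() and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])

def unify(a, b, subst, occurs_check=True):
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    for v, t in ((a, b), (b, a)):
        if isinstance(v, str) and v[0].isupper():
            if occurs_check and occurs(v, t, subst):
                return None            # X = f(X): rejected
            return {**subst, v: t}
    if isinstance(a, tuple) and isinstance(b, tuple) \
            and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

print(unify("X", ("f", "X"), {}))                      # occurs check fires
print(unify("X", ("f", "X"), {}, occurs_check=False))  # circular binding
```

With the check omitted, as in the working systems mentioned above, the second call quietly builds the circular binding X = f(X).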

Closely related to Knuth's grammars are Koster's Affix grammars (Koster 1971a). These incorporate the notions of generation and consistent substitution from W-grammars, but their main development was still geared towards efficient parsing; they formed the basis of the widely used CDL system (Koster 1971b) which produces a top-down parser for any LL(1) grammar. A grammar rule is written in a somewhat similar fashion to W-grammars, but with values of "affix variables" following the symbol '+' after the name of a non-terminal, e.g.

assignment + env:
     identifier + env + mode1, ':=',
     expression + env + mode2,
     checkref + mode1 + mode2.

Each affix is either a value in some specified domain or a variable which ranges over that domain. There are strict rules concerning the definition of variables: the leftmost occurrence of a variable is in general the defining occurrence and each subsequent use is an application of that value (excepting a derived instance on the left hand side, which is defined on the right hand side). Each variable must be defined exactly once. In addition to non-terminals there are 'primitive predicates'. These have affixes as well but are defined by functions; they must map their inherited affixes to their derived affixes. A grammar satisfying these conditions is named "well-formed".


These, with two other conditions related to Koster's use of top-down parsing, form the basis of CDL. Watt (1977) defines a somewhat more general algorithm for bottom-up parsing.

Inherited (i.e. input) and derived (output) instances of variables are distinguished in Koster's grammars, but the intention is only to ensure that for each non-terminal (or primitive predicate) expression these can be uniquely determined. Affixes for parsing purposes are generated by context-free rules (similar to the metarules of W-grammars) which are subsequently substituted in the productions of the affix grammar. The concept of input-output as it is understood in resolution terms is not present.

The question of the evaluation of attributes was taken up by Bochmann (1976). Several language features - such as the scope of labels and the declaration of mutually recursive procedures - require more than one pass over the source text to evaluate all the attributes of a grammar.

Bochmann gives a test by which it can be determined whether or not the semantic evaluation can be performed in one pass. This asserts that the 'dependency set' (the attributes on which values depend) must only include attributes of symbols to the left of the symbol. This is very close to Koster's test for well-formedness. However, he goes further in determining the number of passes necessary to evaluate all the semantic attributes: on each pass one removes the attributes that can be evaluated, and this continues until either none remain or no more are removed. This test obviously subsumes Knuth's circularity test, though it also catches non-circular definitions which cannot be evaluated in a fixed number of passes. Bochmann gives the following example, which describes one possible definition of block structure in an Algol-like programming language:

<...> ::= <...>↓empty-table
<...>↓used1 ::= <...>↓used2↓used1↑update  & condition: used2 = update
<...>↓used1↓used2↑update ::= <...>↓used1↓used2↑update1 <...>↓used1↓update1↑update
<...>↓used1↓used2↑update ::= <...>↓used1↓used2↑update
<...>↓used1↓used2↑update ::= <...>↑declare concatenate ↓used2↓declare↑update
<...>↓used1↓used2↑used2 ::= <...>↓used1
<...>↓used1↓used2↑used2 ::= begin <...>↓used1 end

(The non-terminal names, written between angle brackets, have been lost in this copy and are shown as <...>.)

This example shows Bochmann's notation for attribute grammars (used in Marcotty et al 1976). Inherited attributes are shown following the symbol "↓". Synthesised attributes follow the symbol "↑". The syntax is based around BNF with two other additions: in the second production a condition is given which expresses the context-sensitive conditions. In the fifth is demonstrated an "action symbol", concatenate, which is not defined within attribute grammars, and was first suggested by Lewis, Rosenkrantz and Stearns (1974).

In this grammar (for which lower levels are omitted) there are two types of attribute. Those named 'used' and 'update' represent a symbol table, and 'declare' represents a single item in that table. The grammar is one in which statements and declarations may be mixed indiscriminately, and the problem arises because of the nested block structure (in the last production). The number of passes required for this grammar is one more than the depth of blocks in the program. The attributes can be simply changed to give a definition requiring only two passes.
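Bochmann's pass-counting procedure - sweep repeatedly, removing the attributes that can now be evaluated - can be sketched as follows. The sketch is ours and simplified: a real test also admits attributes evaluable earlier in the same left-to-right pass, which is ignored here.

```python
# Count the evaluation passes a set of attribute dependencies needs:
# one pass per sweep that removes every currently-evaluable attribute.

def count_passes(deps):
    """deps: attribute -> set of attributes whose values it needs."""
    evaluated, passes = set(), 0
    remaining = dict(deps)
    while remaining:
        ready = {a for a, d in remaining.items() if d <= evaluated}
        if not ready:
            return None          # circular: no fixed number of passes
        passes += 1
        evaluated |= ready
        for a in ready:
            del remaining[a]
    return passes

# Binary-number example: l needs nothing, s needs l, v needs s.
print(count_passes({"l": set(), "s": {"l"}, "v": {"s"}}))   # 3
print(count_passes({"a": {"b"}, "b": {"a"}}))               # None
```

The `None` branch is where the procedure subsumes Knuth's circularity test: a dependency cycle leaves a sweep with nothing to remove.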

Translation Grammars

The aim of any compiling system is to produce output corresponding to the text parse. Two approaches have been suggested for attribute grammars.

(1) Include one or more synthesised attributes with each symbol, so that ultimately the translation is expressed by a single term at the root of the syntax tree. This is envisaged by Knuth (1968) but has not been widely used by proponents of attribute grammars. There are probably two reasons for this: (a) it involves storing the whole tree in memory, which is often not desirable for practical systems; (b) the mathematical functions used to compose the output generally have a distinctly algorithmic flavour, and the languages in common use do not provide for the treatment of functions as data structures.

(2) The alternative is to use a second grammar to output the result. This is generally termed a 'Syntax Directed Translation Scheme' and has been used by many systems since Irons (see for instance Aho, Ullman 1972). The output grammar is frequently reverse Polish or a similar easy-to-use form. The concept was incorporated into what Lewis, Rosenkrantz and Stearns (1974) call 'Attributed Translations'. In these the terminal symbols of a single grammar are divided into two classes named 'input symbols' and 'action symbols'. The input grammar may then be abstracted by deleting the action symbols. The 'action symbols' can be implemented as procedures which perform output. They show that the standard pushdown machines used to recognize context-free grammars can be generalised to treat these translation schemes without sacrificing the desirable property of linear time.

In Marcotty et al (1976) action symbols are used in a slightly different way and are roughly equivalent to Koster's primitive predicates. They implement the semantic rules to express the semantics of a language considered as an interpreter. They have also been taken up by Watt and Madsen (1977) to form "Extended Attribute Translation Grammars". Here two entirely separate grammars are used, connected only by the non-terminal symbols which are common to both and the attributes of the input grammar which the output grammar can use.

Extended Attribute Grammars

Attribute grammars, while being highly suitable for use in parsing, are not ideal for the description of languages. It is to facilitate this that Watt and Madsen (1977) defined Extended Attribute Grammars. The differences from Attribute Grammars are:

(1) Attribute positions may be occupied by expressions, not just variables or values.

(2) A variable may be defined in more than one place, so that there is an implicit check on the equality of each instance.

To illustrate the difference, one may note the definition of an assignment statement using Bochmann's notation for attribute grammars and Watt and Madsen's notation for extended attribute grammars.


Attribute Grammar

assignment↓ENV : identifier↓ENV↑MODE1, ":=",
                 expression↓ENV↑MODE2,
                 where MODE1 = ref(MODE2).

Extended Attribute Grammar

assignment↓ENV : identifier↓ENV↑ref(MODE), ":=",
                 expression↓ENV↑MODE.

Both grammars express the notion that the mode of the left hand side must be the same as that of the right hand side with an added "ref", but the first has to introduce an extra variable and a condition in order to do this. In the second example the variable MODE is evaluated in two places. Also the function "ref" is indicated as the result of an evaluation.

Watt and Madsen discuss the conditions under which an extended attribute grammar may be easily parsed, by showing how an extended attribute grammar can be translated back into an attribute grammar. Most of the transformations are illustrated by the comparison above, but they produce one significant condition: every function (such as ref above) may need to be translated to its inverse if, in order to check a context condition, it is necessary to get at its domain. This follows from the definition of these as mathematical functions which are assumed to be immediately evaluated. They thus deduce two rules for the well-formedness of EAGs:

(1) Every variable occurs in at least one defining position in each rule in which it is used.

(2) Every function used in the composition of an attribute expression in a defining position has a (partial) inverse function.

They comment that "These conditions do not seem to be too restrictive in practice".

Having surveyed the development of attribute grammars it is appropriate to evaluate the similarities and differences between them and M-grammars.

Attribute grammars have evolved to a point at which the differences between EAGs and M-grammars are minimal. The most obvious difference is that the input-output pattern of attribute grammars is indicated, whereas in M-grammars this is not necessary. In fact Watt and Madsen (1977) observe that "the distinction between inherited and synthesised attributes makes no difference to the language generated by an EAG. Nevertheless we believe that this distinction makes a language definition easier to understand". This leads us to ask what is the status of the input-output patterns.

The answer may be found in Kowalski's (1979) distinction between logic and control. The input-output patterns may be viewed as annotations (Schwartz 1977) which provide information for the implementer. It is in this sense that they have been used in Prolog implementations (see Clark, McCabe 1979). In attribute grammars they serve several functions: (a) They ensure that every attribute is defined. (b) They ensure that an attribute is defined before it is used. In the case of a one-pass system this may be checked lexically. For multi-pass systems, Bochmann's analysis indicates to which pass the evaluation must be assigned. (c) They indicate the way in which an expression is evaluated in its context.


In a Prolog system the use of resolution provides a symmetry and flexibility in matching input and output which is not found in other languages. It gives a meaning to input and output that is not present in the van Wijngaarden method of substitution, which is closer to pre-resolution concepts in logic. It therefore provides a more realistic basis for attribute grammars. The explicit use of input-output annotations in attribute grammars has pragmatic value. For instance, it is highly desirable that the translation of an input sentence (considered, say, as a term produced at the root of the syntax tree) should be a variable-free term. It is an open question (to the author's knowledge) whether this fact can be determined automatically from an M-grammar, without the help of annotations. With annotations it is a relatively simple task.

The use of these annotations also clarifies the use of functions in a logic program. Prolog systems generally use unevaluated function symbols in most contexts (exceptions are within the "is" predicate in Edinburgh Prolog and in Robinson's logic in LISP). This makes many of the problems of multiple-pass systems irrelevant, and means that functions can be treated as data structures. However, for efficiency it is very important to evaluate functions at the earliest opportunity. Hence the results proved for attribute grammars are relevant to Prolog systems in this area too.

On the other hand, the logic formulation provides a much better basis for semantics than attribute grammars. It is generally accepted that attribute grammars are not complete in themselves. The definitions of action symbols or conditions cannot be completely defined within the formalism. Although, as we have remarked, the word 'semantics' is often used by people working in the area of attribute grammars to refer to the context-sensitive parts of syntax, the original papers by Knuth provided complete semantic definitions. Since these are constructive definitions rather than specifications, we would argue that the grammars presented in this thesis are more complete and the two-pass methodology we will present is generally easier to understand.

Chapter 3.

The Basis of Relational Semantics

To define the semantics of a language means essentially to give a specification for the value of any program written in that language (where the term 'value' is as defined below).

The primary means of giving specifications has traditionally been logical or mathematical. The semantics of logic was the area in which most ideas about semantics were worked out, following Tarski and Carnap.

As a preliminary, one might observe that in linguistics there are very separate fields dealing with the meanings of words and with the meaning of syntactic units, or sentences. This distinction is also useful in describing programming languages. The value of certain elements - e.g. a numerical constant - can be determined without reference to any other part of the language. However, the value of a 'while' loop may depend on any of the other parts of the language. In giving the semantics of a language it is useful to observe this distinction, which is often made by using the terms "static" and "dynamic" semantics. (Note that we do not call context dependencies "static semantics" as they more properly belong to the syntax.) Some formal definitions attempt to make everything into a static function associated with syntax (e.g. attribute grammars) or treat semantics entirely separately from the syntax (e.g. denotational semantics), but they make an exposition less natural and easy to understand.

60 Relational Semantics

The value of a number in a program is easily given as an adjunct of the grammar, by specifying a 'translation' from the text. Many grammar systems (e.g. Irons (1961), Lewis, Rosenkrantz, Stearns (1974), Watt, Madsen (1977)) have specified such features. The value of a program - a dynamic execution involving conditionals, loops etc. - is more easily considered as a function of the program as a whole. If the translation of a program is considered as a term including the value of static units and functional terms which name the dynamic units, then the value of the program is the value of this term.

One characteristic of the value of a program is that it may be undefined because of (for instance) a non-terminating loop. Although, as a consequence of the unsolvability of the halting problem, it is impossible to determine in all cases whether or not a particular program will terminate, it is desirable that the semantics should be able to specify both terminating and non-terminating programs. For this reason a purely mechanical means of extracting the solution (e.g. by means of an interpreter) is unsatisfactory.

Programming languages are generally very complex linguistic objects incorporating many different domains (or types) of objects, the means of structuring these objects into more complex objects, and elaborate control structures to sequence the program. These interact together in a combinatorial fashion, so that to describe the semantics directly would be a huge task.

It is therefore preferable to structure the task: first the semantics of a very simple language is defined; then this is used as a metalanguage to define the semantics of the language being defined.

In effect the language is defined using the metalanguage as an interpreter. However the metalanguage itself is defined as a specification and not as a procedure.

Traditionally, the metalanguage used in computing has been the lambda calculus, developed by Church to explore the behaviour of functions. This did not itself have an adequate semantics until a model was developed by Dana Scott. This was a complex mathematical construction using lattices and, more recently, neighbourhoods (see Scott 1970). Apart from these complexities, which may be largely ignored in practice, the lambda calculus has difficulties for the uninitiated: for instance, it is necessary to introduce the "paradoxical" combinator Y in order to provide recursion in the language, and the same "call-by-name vs call-by-value" problems that have plagued computing languages reappear (albeit with some solutions).

In this thesis we propose the use of first-order predicate logic as a metalanguage. Although logic has been used before in defining programming languages (see Burstall 1969), the recent development of logic as a programming language (Kowalski 1974) gives new insights into the ways in which logic can be used in this process.

The semantics of logic may be defined in two separate ways. The first formalises the idea of logical consequence and is generally called the model-theoretic semantics. The second corresponds to the way in which theorems may be derived from axioms and is called the proof-theoretic semantics. For first-order predicate logic these two are equivalent. Van Emden and Kowalski (1976) argue that the fixpoint semantics of logic considered as a programming language (the procedural interpretation) corresponds to the model-theoretic semantics. The proof procedure for logic is then the operational or proof-theoretic semantics.

62 Relational Semantics

Thus a logic program may be regarded in two ways: as a specification which is a static mathematical object whose semantics are described in a purely formal way; or as a program which may be interpreted on a computer. It does not, of course, follow that any logic program can be executed efficiently or that it will terminate. For any given task one may have a range of logic programs, from one that provides the clearest specification, often requiring the full apparatus of logic (including explicit quantifiers etc), to something which is as efficient as possible in execution terms. It should then be possible to prove that these programs define the same relation.

It is desirable at this point to clear up one possible misconception. It is often assumed that, because of the first-order nature of the logic that is being used (i.e. one does not have function symbols as variables), this formalism is in some way "weaker" than a higher order logic, and cannot therefore be used to describe some programming languages. The answer to this lies in the way in which logic is used to describe a programming language: we represent a program and its value as terms, and the set of predicate and function symbols that is used to define any program is fixed once and for all by the definition of the language. The limitation in first-order logic occurs only at the level of predicate symbols. At the term level, first-order logic is equivalent to any other formalism.

Thus the use of logic as a metalanguage gives the flexibility of higher order systems in a controlled and well-structured way. In the programming language definitions, the relation and function symbols used are essentially metaconcepts, used to name the values for which they stand.

In the following section we review the work that has been done on the semantic basis of the logic formulation that we use. There is no attempt to reproduce the work, only to summarise the results.

The Semantics of Predicate Logic

The semantics of a logic program can be given in two ways.

(1) The Model Theoretic Semantics

This is an essentially mathematical approach which involves considering the value of all possible formulae that can be derived by applying the predicate and function symbols of a program to a specified domain of individuals.

The value (or denotation) of a clause

A <- B1 & B2 & ... & Bn.

is true in an interpretation if and only if, for every assignment of individuals in the domain, if the antecedents B1..Bn are true then so is the consequent, A; otherwise it is false.

In order to give a meaning to this it is necessary to interpret all the predicate and function symbols: this is an essentially intuitive process in which one gives the value 'true' or 'false' to each individual predicate applied to every individual being considered. In fact it can be shown that it is sufficient, for the clausal form of logic, to consider an extremely simple interpretation - the Herbrand interpretation - in which each term denotes itself. It follows from the Skolem-Lowenheim theorem that a program in clausal form has a model iff it has a Herbrand model. The use of a Herbrand model gives a 'syntactic' feel, which is illusory - we are considering the semantics.

A model of a program is an interpretation for which the value of each clause is true for all possible substitutions.

It is argued by van Emden and Kowalski (1976) that the fixpoint semantics of the procedural interpretation is a special case of the model-theoretic semantics. A logic program x is regarded as an equation x = P(x), where P is a continuous function. The ordering that is used is the subset relation. Horn clauses possess the model intersection property: i.e. the intersection of any two Herbrand models is itself a model. Hence the subset relation is a valid partial order. Then Tarski's fixpoint theorem shows that the minimum fixed point exists. Van Emden and Kowalski show that this is consistent with the model-theoretic semantics.
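The least-fixpoint construction can be made concrete for ground programs. The following sketch is ours (the ancestor program is an invented example, and real programs with variables require ranging over all ground instances, which is elided here): iterate the immediate-consequence operator from the empty set until nothing new is derived.

```python
# The least Herbrand model of a ground Horn clause program, computed
# as the least fixpoint of the immediate-consequence operator T_P.
# Clauses are (head, [body atoms...]) pairs of ground atoms.

def t_p(clauses, facts):
    """One application of T_P: add heads whose bodies are already true."""
    return facts | {h for h, body in clauses if all(b in facts for b in body)}

def least_model(clauses):
    facts = set()
    while True:
        new = t_p(clauses, facts)
        if new == facts:
            return facts          # least fixpoint reached
        facts = new

program = [
    ("parent(a,b)", []),
    ("parent(b,c)", []),
    ("anc(a,b)", ["parent(a,b)"]),
    ("anc(b,c)", ["parent(b,c)"]),
    ("anc(a,c)", ["parent(a,b)", "anc(b,c)"]),
]
print(sorted(least_model(program)))
```

The monotone growth of `facts` under the subset ordering is exactly the continuity argument above in miniature.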

The formalism used in later definitions is Horn clauses augmented by negation as failure. The proof rule for this includes a 'closed world' assumption: that if a fact cannot be derived from a set of axioms, then it is false. The original justification for this (Clark 1978) is proof-theoretic. However, Apt and van Emden (1980) demonstrate that finite failure may be characterised as the maximal fixpoint of the definition. The use of negation is essential in certain areas of the semantic definition. Some error situations - e.g. the occurrence of an overflow - can easily be dealt with by introducing negative predicates, which do not require negation. However, in the case of a non-terminating program - which can only be described as one which does not produce a value - negation cannot be avoided. There is another category of programs which produce partial results but are not designed to terminate (e.g. operating systems). These must be described in terms of successive approximations.


(2) The Proof Theoretic Semantics

A proof procedure formalises the way in which theorems may be derived from axioms. In the case of first-order logic it was shown by Godel (1930) that there are complete proof procedures; i.e. procedures which prove any true statement which can be derived from a set of statements in logic. One such proof procedure is Robinson's (1965) resolution method which, together with the SL proof procedure, is used as a basis of the subset of first-order logic known as Prolog (Kowalski 1974).

A definition of this proof procedure is given in section 4.2. Because of the two ways in which the semantics of Prolog are defined, this need not be regarded as a metacircular definition, but rather a model-theoretic definition of the proof procedure. In this definition, negation is defined by means of the 'cut' procedure, which requires an explicit treatment of unification and backtracking.

As a result of Godel's later work, it was shown by Church that any proof procedure which is complete is also undecidable: i.e. it cannot be guaranteed that the proof will terminate in a finite time with the report 'true' or 'false'. Thus the proof procedure might have the value 'undefined'; in practice an algorithm which implements the proof procedure might go into a loop.

The proof procedure thus shares the unfortunate property of operational definitions of programming languages that it cannot give the value of every program written in the language. However, it does provide a constructive method for determining the values of many programs in the language. Van Emden and Kowalski argue that a proof procedure is an "operational semantics" for logic.


A Methodology for Semantic Definition

In order to represent the values of numbers, strings, programs etc, we must write them in some way. Unfortunately, the word 'denotation' is used ambiguously in different areas to refer both to the sign that is used to denote something and the entity that is denoted. Thus '12' is a denotation, but its denotation is also the number it represents (the number 12). To avoid confusion we will use the words 'name' and 'value'. Thus '12' is one of the names of the number whose value is 12 (other names are 1100 in binary, 14 in octal, C in hexadecimal).

In elaborating a methodology for defining the semantics of a programming language, there are four key points that must be established.

(1) Static Semantics

The syntax of a program generates a term which names a program. An example is ASGT(x,0:4), which names the statement written in Algol as "x:=4" (where 0:4 names the number 4). The form of this term is a tree structure whose main virtue is that it is unambiguous and represents the operation simply. It does not imply any restrictions on the way that the language is implemented: it may be possible to "optimise out" this statement during the compilation so that it is never executed. This does not matter. What we are interested in is the value of the program. The value of the number 4 in the program requires no more elaboration than is given. The value of ASGT must be defined by the dynamic semantics, which is given later.


(2) States

In order to talk of the value of a program we must introduce the concept of a state. This may be named like any other value, and its components consist of the variables, files etc that a particular programming language introduces.

Thus in the definition of ASPLE which will be given, a state has 4 components: the value of declared variables, the input and output files, and a value which indicates an error state. The components of the state will vary from definition to definition. The advantage of "bundling them up" in a single function term is in uniformity: the definition of most constructs may be expressed in a way that is unaltered by the exact nature of the state, which is represented solely by variables.

We may distinguish two basic relations, which may be termed Cmd for commands and Exp for expressions. Cmd takes a command and a state and returns a new state. Exp takes an expression and returns a value, together with possibly a change of state. They will be written as follows:

Cmd(command, oldstate, newstate)

Exp(expression, value, oldstate, newstate)

Then an assignment statement may be given as follows:

Cmd(Asgt(tag,exp),s1,s3) <- Exp(exp,val,s1,s2) & Update(tag,val,s2,s3).

One characteristic of this statement, which recurs constantly in the definitions presented later, is the way in which the states s1, s2, s3 are used. We have again a form of relational composition in which the states are passed from one predicate to the next in the clause. In fact this is the same form as in M-grammars, so that the statement could be rephrased as:

Cmd(Asgt(tag,exp)) => Exp(exp,val); Update(tag,val).

Obviously this is not a 'grammar rule' in traditional terms. We are using the notation simply as a convenient shorthand. If one considers the notation as a generalised form of relational composition it becomes more meaningful.

In denotational semantics, a further 'currying' of functions is used to separate out various aspects of the state (locations, stores, label values etc.). This is merely a notational convenience. We find it more useful to structure the states. This means that the clause for top-level constructs, such as the assignment statement above, is unchanged if the form of the state is changed. Obviously any relation which accesses the state must be changed.
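The state-threading pattern of the Cmd and Exp relations can be sketched as a functional interpreter. The sketch is ours: the state layout (a dictionary of variables plus input and output lists) is an invented simplification of ASPLE's four-component state, and the term encoding is illustrative only.

```python
# Cmd and Exp as functions that thread the state explicitly,
# as s1 -> s2 -> s3 in the assignment clause above.

def exp(e, state):
    """Exp(expression, value, oldstate, newstate) as (value, newstate)."""
    kind = e[0]
    if kind == "num":
        return e[1], state
    if kind == "var":
        return state["vars"][e[1]], state
    if kind == "add":
        v1, state = exp(e[1], state)
        v2, state = exp(e[2], state)
        return v1 + v2, state
    raise ValueError(e)

def cmd(c, state):
    """Cmd(command, oldstate, newstate) as newstate."""
    kind = c[0]
    if kind == "asgt":                    # Cmd(Asgt(tag,exp),s1,s3) <-
        val, s2 = exp(c[2], state)        #   Exp(exp,val,s1,s2) &
        return dict(s2, vars={**s2["vars"], c[1]: val})  # Update(tag,val,s2,s3)
    if kind == "seq":
        for sub in c[1]:
            state = cmd(sub, state)
        return state
    raise ValueError(c)

s0 = {"vars": {}, "input": [], "output": []}
s = cmd(("seq", [("asgt", "x", ("num", 4)),
                 ("asgt", "y", ("add", ("var", "x"), ("num", 1)))]), s0)
print(s["vars"])                          # {'x': 4, 'y': 5}
```

Note that, as in the clauses, states are never updated in place: each step produces a new state, so the structure of `cmd` is unchanged if more components are added to the state.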

(3) Types and Function Symbols

Predicate logic as normally defined is type-free. This has disadvantages compared with typed systems when one wishes to make assertions about a set of axioms. For instance, when one is using an induction schema, one may wish to use a specific property of integers and limit the type of a variable correspondingly.

In the definitions we will present, a unique set of function and constant symbols is used with the same effect. Different function symbols are required for:

1. Each data type in the object language.
2. Each descriptor used for a storage class.
3. Each predefined function in the system.
4. Each control structure.


Given these distinctions we can then make use of structural induction (Burstall 1969b) in proofs of program properties. The different function symbols are generated by the static semantics associated with the syntax description. It is obviously possible to include constructors which build up complex data types of the user's choice: Algol 68 and Pascal demonstrate ways of achieving this.

(4) Completeness

Completeness is an essential characteristic of a semantic definition. Each possibility must be covered: this includes constructs which lead to error situations, non-terminating programs, and the space limitations which are inherent in any practical implementation of a programming language.

This completeness is achieved by providing a full set of axioms for each construct. In some cases this can only be achieved by the use of negation, as has already been pointed out.

Chapter 3.2.

Axiomatic Semantics

Since axiomatic semantics is defined in logic it should not be surprising that the task of expressing this in Prolog notation is relatively simple, although there is a choice of several representations. Because of this we take the liberty in this section of defining the semantics of the whole 'ASPLE' language, which is not introduced until section 4.1. In fact the definition should be explicable to anyone acquainted with axiomatic definitions, without needing to refer to the introduction of the language.

In expressing Hoare's axiomatic system in first-order logic one encounters two potential problems.

(1) Some of the axioms are actually "axiom schemata" rather than axioms. Another way of saying this is that there are second-order features in the system (allowing quantification over predicate and function symbols).

(2) The 'rules of inference' are an addition to the normal first-order system.

Both of these problems can be handled if we treat predicates in the axiom schemes as function symbols and rules of inference as implications. This is equivalent to treating the first-order system as a metalanguage for expressing the axioms. Then the 'rules of consequence' become the gateway to proving many assertions directly, and the 'rules of inference' can be expressed using normal variables in place of the assertions.

Thus where Hoare writes P{s}Q we will use the single predicate Ax(p,s,q), where p and q are any formulae in the

predicate calculus, and s is a program in the language, introduced as a tree of the abstract syntax such as that produced by the syntax analysis of the language.

We may then give the axioms for ASPLE statements in Fig 3/2/1. In addition to these we require what Hoare calls "rules of consequence". These may be stated as follows:

Ax(p,s,q) <- Ax(r,s,q) & Demonstrate(p->r). Ax(p,s,q) <- Ax(p,s,r) & Demonstrate(r->q).

These rules state in essence that we can always use a stronger precondition, or a weaker postcondition, in order to prove a program. The predicate Demonstrate was introduced by Kowalski (1979) to provide a link between the metalanguage and the object language of the logic system.

Thus Demonstrate(x) is true if an instance of x can be proved from the facts known about it. We assume the existence of a theorem prover which has the necessary facts about algebraic expressions, simplification etc. For an elaboration of the 'facts' that a typical theorem prover needs to know, see Boyer & Moore (1979). They include such equivalences as:

x + y = y + x
x * (y+z) = x*y + x*z
FALSE & x = FALSE
~(x < x)
x ≠ y = ~(x = y)


1. Statement Sequence
       P{s1}R    R{s2}Q
       -----------------
          P{s1;s2}Q

2. The Null Statement
       P {SKIP} P

3. Assignment
       P[exp/x] {x:=exp} P

4. Conditionals
       P&b {s1} Q    P&Not(b) {s2} Q
       ------------------------------
       P {IF b THEN s1 ELSE s2} Q

5. Loops
       P&b {s} P
       ------------------------------
       P {WHILE b DO s} P&Not(b)

6. Input
       P[File(In,v.y)/File(In,y), v/Deref(x)] {input x} P

7. Output
       P[File(Out,x.y)/File(Out,y)] {output x} P

Fig. 3/2/1

Fig 3/2/1 shows the axioms for ASPLE in the normal Hoare notation, and the translation of these into Prolog notation as a metalanguage is shown in Fig 3/2/2. The only other axiom assumed in this definition is the 'Subs' relation whose interpretation is:

Subs(a,b,c,d) means "the expression produced by substituting a for b in c is d".
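The intended behaviour of Subs may be sketched executably. The following Python fragment (an illustrative sketch only; the thesis itself works in Prolog, and the tuple representation of terms is an assumption made here) computes the d of Subs(a,b,c,d) as a function of the other three arguments:

```python
# Sketch of the Subs relation: Subs(a, b, c, d) holds when substituting
# a for b in the term c yields d.  Terms are represented here as nested
# tuples such as ('Plus', 'x', 1); this representation is hypothetical,
# chosen only for illustration.  Note the sketch does not distinguish
# function symbols from variables, as a real implementation would.

def subs(a, b, c):
    """Return c with every occurrence of the subterm b replaced by a."""
    if c == b:
        return a
    if isinstance(c, tuple):
        return tuple(subs(a, b, part) for part in c)
    return c

result = subs(('Plus', 'x', 1), 'x', ('Eq', 'x', 'y'))
# result is ('Eq', ('Plus', 'x', 1), 'y')
```

Read relationally, Subs constrains four arguments at once; the functional sketch fixes one direction of use (a, b, c given; d computed), which is the direction the axioms of Fig 3/2/2 exploit.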


1. Ax(p,s1.s2,q) <- Ax(p,s1,r) & Ax(r,s2,q).

2. Ax(p,NIL,p).

3. Ax(p,Asgt(var,exp),q) <- Subs(exp,Deref(var),q,p).

4. Ax(p,Cond(exp,s1,s2),q) <- Ax(p&exp,s1,q) & Ax(p&Not(exp),s2,q).

5. Ax(p,While(b,s),p&Not(b)) <- Ax(p&b,s,p).

6. Ax(p,Input(x),q) <- Subs(File(In,v.y),File(In,y),q,r) & Subs(v,Deref(x),r,p).

7. Ax(p,Output(x),q) <- Subs(File(Out,x.y),File(Out,y),q,p).

Fig. 3/2/2
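Read procedurally, as Prolog would execute them, clauses 1-3 compute a precondition backwards from a postcondition. The following Python sketch imitates that backward reading (the tuple term representation and the constructor name 'Seq' for the sequence s1.s2 are assumptions made for illustration, not the thesis's notation):

```python
# Clauses 1-3 of Fig 3/2/2 read backwards: given a statement tree and a
# postcondition q, compute a precondition p such that Ax(p, stmt, q).
# Terms are nested tuples; 'Seq', 'NIL', 'Asgt' etc. are hypothetical
# stand-ins for the abstract syntax.

def subs(a, b, c):
    # Subs(a, b, c, d): d is c with every occurrence of b replaced by a.
    if c == b:
        return a
    if isinstance(c, tuple):
        return tuple(subs(a, b, x) for x in c)
    return c

def ax(stmt, q):
    """Return a precondition p for statement stmt and postcondition q."""
    if stmt == 'NIL':                    # clause 2: the null statement
        return q
    if stmt[0] == 'Seq':                 # clause 1: thread r through s1;s2
        r = ax(stmt[2], q)
        return ax(stmt[1], r)
    if stmt[0] == 'Asgt':                # clause 3: back substitution
        var, exp = stmt[1], stmt[2]
        return subs(exp, ('Deref', var), q)
    raise ValueError(stmt)

# x := x+1 with postcondition Deref(x) = a+1:
post = ('Eq', ('Deref', 'x'), ('Plus', 'a', 1))
pre = ax(('Asgt', 'x', ('Plus', ('Deref', 'x'), 1)), post)
# pre is ('Eq', ('Plus', ('Deref', 'x'), 1), ('Plus', 'a', 1)),
# i.e. x+1 = a+1, which simplifies to x = a.
```

The sketch omits the Demonstrate calls of the rules of consequence: simplifying the computed precondition is left to the assumed theorem prover.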

Most of the axioms are familiar because of their inclusion in any treatment of the axiomatic system, so only the main points will be considered here.

1. The assignment rule is the 'back substitution' axiom adopted by Hoare. This is in fact more general than Floyd's original 'forward' axiom as it allows one to argue about any variables not affected by the assignment, but also allows the deduction of Floyd's axiom as a special case.

This may be illustrated by the simple example

x:=x+1.

Assuming x initially has some value a, the effect of this statement may be phrased in two different ways:


Hoare's method is 'goal-oriented' in that one makes a query starting with the final statement.

P {x:=x+1} x=a+1

which yields by substitution the formula for P

P = (x+1 = a+1)

from which one can then derive the starting condition

P = (x = a).

Since logic programs are invertible one could use this in reverse to derive the result of Floyd's axiom,

x=a {x:=x+1} ∃x0 [x=x0+1 & x0=a]

which reduces to

x=a {x:=x+1} x=a+1
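The simplification step from x+1 = a+1 to x = a is the kind of fact the Demonstrate predicate is assumed to supply. It can be checked mechanically; a small sketch (illustrative only, using exhaustive testing over a small integer range rather than the algebraic proof a real theorem prover would give):

```python
# Check that the back-substituted precondition (x+1 == a+1) agrees with
# the simplified form (x == a) on a range of sample integer values.
# This is evidence, not proof; a prover such as Boyer & Moore's would
# establish the equivalence algebraically.

def equivalent(f, g, values=range(-5, 6)):
    """True if f and g agree at every pair of sample values."""
    return all(f(x, a) == g(x, a) for x in values for a in values)

derived = lambda x, a: x + 1 == a + 1   # P obtained by substitution
simplified = lambda x, a: x == a        # P after simplification

print(equivalent(derived, simplified))  # True on the tested range
```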

Using the tree structure introduced by the ASPLE definition (ch 4.1), the assignment x:=x+1 is represented

Asgt(Id(x),Plus(Deref(Id(x)),Val(1)))

One might note here the presence of the 'Deref' operation on the right hand side which is required in Algol 68-like languages to extract the value of an identifier from its location (the Strachey model of lmode and rmode is in some ways more convenient). It is for this reason that the extra 'Deref' is introduced into the call for Subs in clause 3 of Fig 3/2/2. One must substitute the value represented by the right hand side for the value represented by the left hand side.

Thus this treatment corrects one criticism that has been made of Hoare's method: namely that a single assignment statement can mean many different things. Many languages have transfer functions that are invoked automatically on assignment. In ASPLE this is the dereference, which may be invoked in different ways according to the modes of the operands. For instance, if y has mode REF REF INT, then the assignment x:=y would be represented Asgt(x, Deref(Deref(y))) whereas y:=x would be represented Asgt(y, x), where instances of Id() are omitted for clarity.

The assertion "y points at x" after the assignment above may therefore be represented by Deref(y)=x. Expressed in Hoare's notation this becomes:

P {y:=x} Deref(y)=x

which simplifies directly to the goal:

Subs(x, Deref(y), Deref(y)=x, x=x)

which is clearly satisfied.

2. The axiomatisation of transput is a slight simplification of that presented by Hoare and Wirth (1973) for Pascal. It assumes a specification of files as a dyadic function symbol File(a,b) of which the first parameter specifies which file is used and the second gives the contents as a list of items, using the list constructor '.'. The contents of this list are expressions and not simply constant symbols, so that symbolic proofs may be carried out.


A file may be considered at any point in the program as a variable naming the remainder of the file available at that point (see Fig 3/2/2). Thus at the beginning of the program this is the whole file, and at the end it is empty (assuming that the whole file is consumed).

For the input file, there is an implied condition that there is some input left, or, in ASPLE (or Pascal) terms, that the end of file has not been reached. Since the contents of the file before the input statement are assumed to be 'v.y', this can only be satisfied by the presence of at least one value in the file.

The Development of Semantics

"Let's hear it", said Humpty Dumpty, "I can explain all the poems that ever were invented - and a good many that haven't been invented just yet." Lewis Carroll: Through the Looking Glass.

The Complementary Approach

J.W. de Bakker (1976) sums up well the current position of research into the semantics of programming languages.

"Unfortunately there is no agreement at all on what constitutes a proper methodology for semantic specification. On the contrary, we find ourselves confronted with an embarrassingly rich choice of approaches."

There are three main reasons for this multiplicity of approaches.

(1) Semantics is used for a variety of different purposes and no one method works equally well in every area. Among the main uses are:

Proving the correctness of programs
Proving the equivalence of program schemes
Defining unequivocally the constructs of a programming language
Proving the correctness or equivalence of implementations
Communicating to users the effects of a language


This emphasis on different usage has been used by Hoare and Lauer (1974) to argue for complementary theories of languages. They distinguish the extremes of the constructive (or operational) approach (which defines what effect a language will have on a machine when it is executed) and the implicit approach (which makes statements about properties of programs). Within these they compare a number of approaches, which are considered below.

(2) There are fundamental philosophic disagreements between mathematicians over the issue of semantics. The origin of this is the split between the 'constructivist' school (Hilbert, Brouwer) and the 'infinitist' assumptions of most mathematicians. This is demonstrated in the development of "proof theory" (Herbrand, Gentzen) as an alternative to Tarski's 'model theory' of semantics. This has given rise on the one hand to 'axiomatic' and constructivist approaches (Floyd, Hoare, Dijkstra, van Wijngaarden) and on the other to the 'mathematical' approach (McCarthy, Strachey, Scott).

That this cleavage exists may be illustrated by this exchange of views at the IFIP Working Conference (Steel 1966) between a 'loner' in the constructivist field - Aad van Wijngaarden, who backs a constructive approach to operational semantics - and Christopher Strachey, who represents the more abstract 'mathematical' school.

van Wijngaarden: "I cannot accept certain presuppositions. Deep in the foundations of mathematics there are fundamental differences between the opinions of mathematicians ... Any natural number exists only when I have constructed it and the construction is a specific denotation for that number. If you take that attitude, the completely finitist and constructivist attitude, then I cannot talk

about any number without arriving at it." (p. 293)

Strachey (to van Wijngaarden): "Your technique has been to take all the things that people think are important in languages and replace them by all the features that everybody left out... The last thing I would want to do is to remove a function (or at least what I call a function) because it seems a much better understood mathematical entity than a procedure - which I call a routine - which is a complicated command."

To comment on the details of this exchange requires a more careful treatment of the differences between model and proof-theoretic semantics, or on a deeper level infinitist and intuitionist mathematics, but the difference in emphasis is clear: there are two different paradigms (in Kuhn's sense) in use whose applicability cannot be argued out in a strictly logical fashion.

(3) There are a variety of different languages or notations in use to describe semantics whose relationship is not always clear. Although formal equivalence has been proved in some cases, it is not easy to bridge the gap mentally between the superficial differences. In other cases there is a distinct trade-off between 'power' and 'obscurity'. The three main approaches, which reappear in several otherwise dissimilar methods, are machine definition, the lambda calculus, and predicate logic.

The resolution of this problem proposed by Hoare and Lauer (1974) is the principle of complementarity, already used in physics and other fields. They consider four 'levels of abstraction' in the definition of a programming language.


(1) The machine description, including variable states and control states, which interprets the program by means of an automaton.

(2) The computational model, which is defined by a function which maps programs onto memory states.

(3) The relational theory, which considers relations between memory states by means of axioms.

(4) The deductive (or axiomatic) theory, in which memory states are no longer considered but only formulae about propositions describing properties which one wishes to prove about the program.

Because of reason (2), the simple classification scheme of Hoare and Lauer is inadequate to deal with the development of denotational semantics, which is not simply a step on the road towards an axiomatic approach. The aim of 'abstraction' is in this case a means to make available general mathematical methods and thereby to characterise what must be true of the machine rather than positing a possible realisation.

Classification by means of notation is less helpful as it is often possible to use more than one notation for essentially the same method, and one notation for several methods. However, notation does have a significant effect on the comprehension and usability of a method.

In order to give substance to this comparison an outline is given below of a few of the high-water marks in semantic definition. For more complete bibliographies one may refer to Steel (1966, 346 refs).


Operational Semantics

The first complete operational definition of a language was published by Landin (1964) for what he calls 'applicative expressions', based closely on the lambda calculus. The language is defined by an abstract syntax (see section 2.2) and executed on a hypothetical machine which has four basic components - Stack, Environment, Control and Dump (SECD). The operations on this are simple data copies controlled by conditional expressions which depend simply on the presence of various 'constructed objects'. The effect of an expression may be followed by anyone with the intelligence of a patient bureaucrat.

This method was taken up by the IBM laboratory in Vienna as a means of defining the language PL/1 (see Lucas, Walk 1971). They took over McCarthy's (1962) formulation of abstract syntax and used rather more complicated machines (with several more states), including some non-deterministic features. Although cumbersome, the notation was sufficient to define PL/1, Algol-60 (Lauer 1968) and several other languages (e.g. see Lee 1972). It was also used to prove the equivalence of different implementations of certain features, such as the block concept (Lucas 1968).

A totally different operational approach is taken by van Wijngaarden (1964). He simplifies Algol-60 by a succession of systematic translations until he is left with assignments, procedure calls with parameters (which are essentially jumps), the block concept and arithmetic and boolean operations. This is then processed by a simple processor which produces a dynamically varying text. The philosophy of his approach was taken up in the definition of Algol-68 (van Wijngaarden et al 1969) which defines a relatively small kernel with an outer language defined in terms of that kernel. The semantics of the Algol-68 report was


defined in English, not in van Wijngaarden's two level grammars, and the size of that task has apparently deterred any attempts to rectify this. But W-grammars were used by Cleaveland and Uzgalis (1977) to provide a definition of both syntax and semantics of a small subset of Algol 68, called ASPLE, used below. This was done purely by expanding sets of productions which describe the states of a hypothetical machine and the input and output files of the program. They admit that the addition of further features, such as jumps and block structure, would complicate the definition considerably. Also there has not been any attempt to use this style of definition for other purposes such as program proving and it must therefore be judged purely on its explicative value.

Mathematical Semantics

The requirements of a mathematical theory of programs were laid down by McCarthy (1962, 1966), but it has taken much longer to refine the basic ideas.

The basic aim of the mathematical theory is to define a function which gives the state s' which results from applying a program p to another state s, i.e. s' = meaning(p, s), with functionality:

meaning: Prog x State -> State.
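The shape of such a 'meaning' function can be sketched for a toy fragment of a language (an illustrative sketch in a modern notation; the constructor names and the dictionary representation of states are assumptions made here, not McCarthy's):

```python
# A minimal 'meaning' function of type Prog x State -> State.  The state
# is a mapping from identifiers to values; programs are nested tuples of
# a toy abstract syntax ('Seq', 'Asgt', 'Plus', 'Var', 'Val' are
# hypothetical stand-ins).

def eval_exp(exp, state):
    tag = exp[0]
    if tag == 'Val':
        return exp[1]
    if tag == 'Var':
        return state[exp[1]]
    if tag == 'Plus':
        return eval_exp(exp[1], state) + eval_exp(exp[2], state)
    raise ValueError(exp)

def meaning(prog, state):
    tag = prog[0]
    if tag == 'Asgt':                 # a new state differing at one place
        new = dict(state)
        new[prog[1]] = eval_exp(prog[2], state)
        return new
    if tag == 'Seq':                  # sequencing is function composition
        return meaning(prog[2], meaning(prog[1], state))
    raise ValueError(prog)

s1 = meaning(('Seq', ('Asgt', 'x', ('Val', 1)),
                     ('Asgt', 'x', ('Plus', ('Var', 'x'), ('Val', 2)))),
             {})
# s1 maps x to 3
```

The point of the functional formulation is visible even at this scale: sequencing is ordinary composition, so the algebra of functions becomes available for reasoning about programs.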

To do this sensibly requires a structural description of the program (the abstract syntax) and the state of the machine. These are provided by a set of functions and predicates which act on terms of the language. McCarthy (1966) applied this to a very small subset of Algol including assignments, conditional expressions and jumps. For

the concrete syntax he gives a simple parser acting on strings of symbols written in pure LISP.

Thus far there is very little difference between a mathematical and an operational definition. In particular, the notion of a 'state' which holds the value of variables is central to operational definitions; indeed McCarthy's work was very influential on both Landin and the Vienna group. The difference is that the meaning is expressed as a function for which the manipulations which are normal in mathematics apply.

The concept of the store was further examined by Strachey (1966) who showed how the concepts of L and R values lead, for instance, to a generalisation of the Algol 60 conditional expression: e.g.

IF a THEN b ELSE c FI := d;

The store is then abstracted out (by Curry's method) so that it becomes itself a function (s)

s' = Meaning(Prog) (s)

So the functionality of the semantics is now:

Meaning: Prog -> (Store -> Store)

As with Landin (1964, 1966), Strachey uses the lambda calculus to express the meaning of declarations and block structure. To express recursive procedures (and thus general loops) in the lambda-calculus it is necessary to include the 'paradoxical' fixed point operator Y. Landin (1964) gives a method for constructing this in his mechanical model, but Strachey goes further in using the equation produced to show the equivalence of a number of program

fragments that might be produced in evaluating conditions, loops etc.

But there is a problem inherent in this treatment: functions which return functions are encountered in at least three places: in the functionality of programs (which produce functions from states to states); in the use of the fixed point operator Y (λa.(λy.a(yy))(λy.a(yy))); and in the intended generalisation of programming languages to treat functions as 'first-class citizens'. However, it was far from obvious that such constructions can be mathematically allowed, or that they will not, as in some applications of the fixed point operator, produce contradictions. This problem was solved by Scott's (1970) construction of a complete lattice which proves that the fixed point actually exists and allows one to talk about total instead of partial functions.

The work of Scott and Strachey has given rise to a very active field of study known as denotational semantics. This is a model-theoretic semantics using the Lambda Calculus as a metalanguage. All its domains are total, so that the meaning of non-terminating programs can be described. The last major addition to the armoury was the abstracting of 'continuations' (Strachey, Wadsworth 1974) as a means of handling jumps in a clean fashion. The full functionality of a command in an Algol-like language is now

Meaning: Command -> (Env -> (Cont -> (Store -> Store)))

where 'Env' is an environment (to create variables) and 'Cont' is a continuation (which, intuitively speaking, selects the next statement for execution). Thus the new state is

s' = meaning(prog)(env)(cont)(s)
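The curried, continuation-passing functionality can be sketched with nested closures (an illustrative sketch: the environment argument is omitted for brevity, and the constructor names are hypothetical, not those of Strachey and Wadsworth):

```python
# Continuation-style meaning function, Meaning: Command -> Cont ->
# Store -> Store (the Env argument is dropped to keep the sketch small).
# Each command receives a continuation 'cont' of type Store -> Store
# which selects the rest of the program; a jump would simply discard
# cont and apply a continuation bound to a label instead.

def meaning(cmd):
    tag = cmd[0]
    if tag == 'Asgt':
        var, val = cmd[1], cmd[2]
        def m(cont):
            def run(store):
                new = dict(store)
                new[var] = val
                return cont(new)     # hand the updated store onwards
            return run
        return m
    if tag == 'Seq':
        def m(cont):
            # s1 runs with "the meaning of s2, then cont" as its
            # continuation, so control is threaded explicitly.
            return meaning(cmd[1])(meaning(cmd[2])(cont))
        return m
    raise ValueError(cmd)

final = meaning(('Seq', ('Asgt', 'x', 1), ('Asgt', 'y', 2)))(lambda s: s)({})
# final maps x to 1 and y to 2
```

Because control flow is an explicit value, the clean treatment of jumps follows: GOTO is just the application of a stored continuation in place of the current one.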


The virtue of this abstraction is that it allows one to deal algebraically with such tasks as proving the equivalence of program fragments, and providing a minimal definition for implementation. However, it has been argued (Brady 1977) that the denotational definition is still an operational one, in that it postulates the existence of states, environments etc in a similar way to the operational definition:

"We would argue that the evidence is overwhelmingly against Strachey's conjecture about meaning and the idea that there is a clear distinction between mathematical and operational semantics."

The quest for abstraction comes up against the question "Is it possible to define all the characteristics of a programming language (including references, side effects, jumps etc.) without introducing the basic characteristics of the von Neumann machine - stores, program counters etc?" The answer seems to be "No".

Axiomatic Semantics

Axiomatic semantics started from a different perspective to either the operational or mathematical approach. It does not attempt to trace the total effects of a program on some machine 'state' but rather the effect of the program on statements in the predicate calculus that the programmer wishes to make about the value of various variables.

The history of axiomatics passes through three phases. The first, proposed by Floyd (1967) gives the 'strongest verifiable consequent' of a command, i.e. given that some formula is true before the command, it asks which statements

can be made after the command is executed. He shows how, by choosing suitable 'inductive' assertions, one can prove properties of programs, including termination proofs based on well-ordered sets. Illustrative of his treatment is the assignment statement, given as

V[x:=f(x,y)](R(x,y)) = ∃x0 (x=f(x0,y) & R(x0,y))

i.e. after executing the statement x:=f(x,y) from a state in which some relation R(x,y) is true (where y contains any variables other than x), we can assert two things that are true about x0, some previous value of x.

Hoare (1969, 1971, 1972) extended this from being a set of proof rules to an axiomatic theory. The basic assertions about assignments, conditionals, loops etc become axioms (or axiom schemata) in the theory and rules of inference are given for constructing proofs in the theory. This axiomatic form has been applied by Hoare and Wirth (1973) to the definition of the Pascal language. Some features (e.g. side effects and jumps) are not treated, but the treatment is in other ways remarkably clear and presented as a 'contract' between language designers, implementors and users.

There is a marked shift in Hoare's treatment of the assignment statement to the "backward" rule, i.e.

Py {x:=y} P

i.e. given an assertion P after the assignment, we can assert P before the assignment with all instances of x substituted by y, an expression which may include x and other variables. Here the emphasis is not on the "verifiable consequence" of a command, but the necessary "precondition" to achieve a desired end result.


The chief weakness of this axiomatic system is that it contains no reference to the termination of programs. Thus, while after the statement "while b do s" we can certainly assert Not(b) if the statement finishes, there is no mechanism for deciding whether it will finish. Although suggestions such as the 'sometime' construct of Manna and Waldinger (1976) were proposed to fill this gap, it was left to Dijkstra (1975) to complete the transition to the framing of axiomatic systems in terms of "weakest preconditions".

The 'weakest precondition' is a function which transforms a predicate that is applicable at the end of a program into the preconditions for the program to terminate and produce the required result. It thus completes the 'backwards' process initiated by Hoare. Thus for instance, the rule for the iterative construct includes an induction criterion for termination.

Having considered very briefly the progress of the axiomatic approach it is necessary to compare it with the other methods of formal specification. The immediate question that arises is "what is the definition for?" Axiomatic semantics is evidently superior in two respects - (a) for proving properties of programs, (b) for developing correct programs.

But it fails to answer as readily questions such as "what are the requirements for implementing this language?", "what does this programming construct do?" or "do these two programs compute the same function?". A key drawback is this: to specify the meaning of a program having a loop, it is necessary for the programmer to supply an inductive "invariant" assertion, which cannot be deduced from the text of the program.

Thus from the point of view of specifying what the execution of a program does, axiomatic semantics fails to give a complete answer. It is not a tutor which explains carefully and exactly what a language does; rather it is an oracle which will answer "yea" or "nay" to questions which the user must carefully construct in order to find out the truth about her program.

Algebraic Semantics

The Algebraic approach was originally applied in programming languages to the development of the concepts of abstract data types (see Guttag, Horowitz, Musser 1978) and has only more recently been applied to the task of supplying a semantics for a language. To take a well-used example, it is intuitively obvious that the key properties of the concept 'stack' are given by the equations:

TOP(PUSH(item,stack)) = item
POP(PUSH(item,stack)) = stack

where TOP, PUSH and POP are three functions which act on stacks and their data and return values of stacks and data (this omits any checks for the 'empty stack' state). Such a definition says nothing about the representation of a stack, only about the behaviour of functions acting in sequence on the data type.
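The equations can be read directly as executable checks against any candidate representation. A sketch (the tuple representation below is chosen purely for illustration; the specification itself deliberately says nothing about representation):

```python
# The algebraic stack laws, checked against one concrete representation.
# Any representation satisfying the two equations would do equally well;
# tuples are used here only so the laws can be tested.

def PUSH(item, stack):
    return (item,) + stack

def TOP(stack):
    return stack[0]

def POP(stack):
    return stack[1:]

s = PUSH(2, PUSH(1, ()))
assert TOP(PUSH(3, s)) == 3      # TOP(PUSH(item,stack)) = item
assert POP(PUSH(3, s)) == s      # POP(PUSH(item,stack)) = stack
```

This is the sense in which the algebraic equations specify behaviour rather than structure: they constrain the composite TOP∘PUSH and POP∘PUSH, not the stacks themselves.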

Although this approach is promising for producing specifications, it is not necessarily obvious that such a specification would be either unique, complete or minimal. Hence attention has centred on 'initial algebras' (see Goguen, Thatcher, Wagner 1977) which have the special property of having a unique homomorphism with every other algebra in that category.


The algebraic approach has not been developed as far as the denotational semantics, though a start is made by Burstall and Goguen (1977) in describing the semantics of their specification language CLEAR. It is, however, not yet clear whether every aspect of a programming language can be described in a straightforward manner: it can be very difficult to capture the algebraic 'essence' of a concept.

Some work has been done (see Clark, Tarnlund 1977) in using Prolog to define theories of data types. Further work is being undertaken by van Emden on the relationship between clauses and equations for expressing specifications. It is probable that the definitions given here as 'relational semantics' can be recast as theories, though this requires further study.

Summary

Some of the work on semantics might be criticized as being an endless 'quest for the abstract'. The virtues of abstraction lie in the facts that it does not constrain an implementation more than is necessary and that it makes more powerful mathematical and logical tools available for manipulating and proving programs. The danger is that the notation becomes obscure and that it is not widely usable.

The specification presented in this thesis - relational semantics - is not the most abstract possible. It is in some senses a realisation. However it uses a well-founded language which has a firm denotational basis and using this attempts to express the meaning of the language clearly. It should be possible to extend the methods used following the lead of the algebraic school to present a more abstract specification if this is required.


The axiomatic method may be seen as a proof method for programs and as a means of conveying the essence of a language to the programmer rather than as a specification for the language. Its simplicity makes it appealing to the programmer, but this is offset by the difficult (and non-intuitive) treatment of some of the more 'tricky' features of programming languages. The axioms should be derivable from the specification of the language, though this has not been attempted here.

Chapter 4.1.

The Definition of ASPLE

A Brief Introduction to ASPLE

ASPLE is a simple language based on Algol-68. It is simple enough for the presentation not to be tedious, but complete enough to cover many of the features found in a typical programming language (with the notable exception of blocks, procedures and jumps). It includes:

Assignment.
Several types and type constructors (with declarations).
Arithmetic and relational expressions.
Structured control statements - IF and WHILE.
Input and output files.
Implementation defined constraints, e.g. size of identifiers and integers, arithmetic errors.

As an example of ASPLE, which introduces most of its features and is almost self-explanatory, we give a program (Fig. 4/1/1) to compute factorials, reading the number from the input and outputting the results.


begin
  int fact, i, n;
  input n;
  i := 1;
  fact := 1;
  if (n ≠ 0) then
    while (i ≠ n) do
      i := i + 1;
      fact := fact * i
    end
  fi;
  output n;
  output fact
end

Fig. 4/1/1.

The data structuring tool in ASPLE is the reference. One can, for instance, declare a variable as 'ref int' or 'ref ref bool'. While this is not in itself a very powerful structuring tool it does serve to introduce the possibility of infinite modes and objects.

The BNF syntax for ASPLE is given by Marcotty et al (1976) to which the reader is referred for further details of the language. It is hoped that the M-grammar definition will be found to be simple enough to read so that it is unnecessary to reproduce the BNF.


The Top Level of the ASPLE Definition

The ASPLE language may be considered as a relation between four entities. These are:

1. The text of the program.
2. The input file.
3. The output file.
4. The result of the program (i.e. successful or not).

These are the only 'external' things we need to know about a program written in the language, and a formal definition should, given the text of the program and the input file, tell us what the result is and the contents of the output file. We will use the notation:

ASPLE(text, input, output, result)

to express this relation.

At the first level of refinement, we may detect three separate stages in the analysis of the language.

1. The lexical syntax.
2. The syntax.
3. The semantics.

The first stage, lexical syntax, groups the characters into words or tokens, and discards irrelevant items such as spaces, new lines and comments. In most formal definitions this stage is hidden deep within the syntax (as in the Algol-68 report) or expressed only informally (e.g. in the Pascal report). But it is intentionally a self-contained entity in most languages (Fortran is an exception) to simplify definition and compilers. The facilities of M-grammars make it possible to regard it as a separate relation, between the characters, or text, of a program, and the words or

tokens; symbolically 'LEXEME(tokens, text, NIL)' (the final parameter may be ignored as it simply provides the 'difference list' used by the M-grammar).

In a similar way, the syntax may be regarded as a relation between the list of tokens and the program represented as a tree of morphemes (indivisible grammatical elements) - the 'abstract syntax' of VDL. This relation is named 'MORPHEME(tree.env, tokens, NIL)'. It only holds in the case that the tokens of a program form a correct program. The morphemes are divided into two classes - 'env' (short for 'environment') names the storage locations named by the program, and 'tree' the actions. In a block structured language the relation would be expressed slightly differently, but this is adequate for ASPLE.

Finally, the semantics of a program may be regarded as a relation between the morphemic tree, the initial 'state' of the hypothetical machine executing the program, and the final state of that machine - named 'SEMEME(tree, state1, state2)'. These states are composed of four entities: the internal storage of the machine, the input and output files, and the status of the machine. A state may be represented by a term 'STATE(env, input, output, status)', which is explained more fully later.

We may thus express the meaning of a program in the programming language by saying that the ASPLE relation holds if the three other relations hold. This may be written formally in logic as:

ASPLE(text, input, output, result) <-
   LEXEMES(tokens, text, NIL)
   & MORPHEME(tree.env, tokens, NIL)
   & SEMEME(tree, STATE(env, input, output, OK),
                  STATE(env1, in1, NIL, result)).


Here the variables tokens, tree and env are local to the body of the clause and provide the linkage between the three relations in the body. The two variables env1 and in1 in the final state are simply ignored, and result and output are returned as the result. It may appear surprising that 'output' occurs in the initial state, but this will be explained later.

This clause may be read in two entirely different ways. It may be read declaratively, as a statement of the predicate calculus; or it may be read procedurally, as a series of goals: to prove the ASPLE relation, prove the lexeme, morpheme and sememe relations. These two readings correspond to two different uses of the clause - to understand a program, or to execute the program on a computer.

It is not necessary that the parameters are bound in exactly the way that has been assumed until now. We have assumed that 'text' and 'input' are bound at the start, and 'output' and 'result' are bound on completion. This corresponds to the normal process of compilation and execution. But it would be equally possible for all the parameters to be bound at the start, thus checking that a particular computation is performed. More fancifully, it is possible to bind only the input and output files at the start and let the definition produce the program, thus 'synthesising' a program to do this transformation. This will not produce any very interesting programs unless the inputs and outputs are generalised in some way.
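These alternative binding patterns can be pictured with a small Python sketch (ours, not the thesis's Prolog): a relation held extensionally can be queried with any subset of its arguments bound, including the 'synthesis' direction. The relation contents here are invented for illustration.

```python
UNBOUND = object()  # stands for an unbound logical variable

# A toy "ASPLE-like" relation, given extensionally as
# (program text, input, output) triples; all values are made up.
RELATION = [
    ("copy",   (1, 2), (1, 2)),
    ("double", (1, 2), (2, 4)),
    ("copy",   (5,),   (5,)),
]

def query(text=UNBOUND, inp=UNBOUND, out=UNBOUND):
    """Enumerate every tuple consistent with whichever arguments are bound."""
    return [(t, i, o) for (t, i, o) in RELATION
            if (text is UNBOUND or t == text)
            and (inp is UNBOUND or i == inp)
            and (out is UNBOUND or o == out)]
```

Binding only the program runs it on every recorded input; binding only input and output 'synthesises' the matching program, just as described above.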


Lexical Syntax

There are three difficulties with normal methods of specifying lexical syntax, such as BNF:

1. A lexical syntax is normally a regular grammar in which any item may follow another. Thus if one has (say) five types of item, then each production needs four extra alternatives to allow for each possible item following. This makes the syntax rather tedious in BNF, and is why finite state diagrams are often used.

2. Many compilers use a 'reserved word' convention for structural words such as 'begin' and 'end'. To distinguish these from normal identifiers in the language requires the use of negation, which is not available in BNF and many other formal systems.

3. Using BNF there is no way of formally linking the two levels - characters and tokens - if the lexical syntax is described separately.

The M-grammar form of the lexical syntax is shown in Fig. 4/1/2. 'LEXEMES' is defined as a list of items, possibly separated by spaces. There are three types of item: identifiers, numbers and syntactic tokens.

In the output, identifiers are represented by a functor 'ID' whose parameter is the string consisting of the identifier, e.g. ID(f.a.c.t.NIL) or ID("fact"). Numbers have the functor NUM, whose argument is an internal normal form of the decimal integer, using the (left to right) infix operator ':'. Thus 123 is represented as NUM(0:1:2:3). Syntax words such as BEGIN and := are represented by the strings themselves.
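The ':' numeral can be pictured as a left-nested pair structure. A minimal Python sketch (nested tuples standing in for the ':' functor is our assumption):

```python
def to_numeral(digits):
    """Build the left-infix ':' form, e.g. [1, 2, 3] -> (((0, 1), 2), 3),
    i.e. the term 0:1:2:3 seeded with 0."""
    num = 0
    for d in digits:
        num = (num, d)   # corresponds to num : d
    return num

def value(num):
    """Decode a ':' numeral back to an ordinary integer."""
    if num == 0:
        return 0
    hi, lo = num
    return 10 * value(hi) + lo
```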

/*                                  */
/* Fig 4/1/2. LEXICAL SYNTAX        */
/*                                  */

LEXEMES(x) => SPACE; LEXEMES(x).
LEXEMES(x.y) => LEXEME(x); LEXEMES(y).
LEXEMES(NIL) => NIL.

SPACE => @ch & SPACECH(ch)
       | @'/'; @'*'; COMMENT.

LEXEME(word) => WORD(w) & SYSTEM(w, word).
LEXEME(NUM(num)) => NUMBER(num).
LEXEME(other) => OTHERCHAR(other).

COMMENT => @'*'; COMMENT1
         | @ch & ~ch='*'; COMMENT.
COMMENT1 => @'/'
          | @'*'; COMMENT1
          | @ch & ~ch='*' & ~ch='/'; COMMENT.

WORD(first.rest) => LETTER(first); WORD(rest).
WORD(last.NIL) => LETTER(last); ~LETTER(next).

NUMBER(num) => DIGIT(n);
   ( n=0; (NUMBER(num) | num=0)
   | ~n=0; RNUMBER(0:n, num));
   ~DIGIT(next) & MAXLENGTH(num, N3).

RNUMBER(hi, num) => DIGIT(next); RNUMBER(hi:next, num).
RNUMBER(lo, lo) => NIL.

LETTER(ch) => @ch & ISLETTER(ch).
DIGIT(ch) => @ch & ISDIGIT(ch).

SYSTEM(w, w) <- RESERVED(w).
SYSTEM(w, ID(w)) <- ~RESERVED(w) & MAXLENGTH(w, N4).

RESERVED("BEGIN"). RESERVED("BOOL"). RESERVED("DO").
RESERVED("ELSE"). RESERVED("END"). RESERVED("FALSE").


RESERVED("FI"). RESERVED("IF"). RESERVED("INPUT").
RESERVED("INT"). RESERVED("OUTPUT"). RESERVED("REF").
RESERVED("THEN"). RESERVED("WHILE").

OTHERCHAR(":=") => @':'; @'='.
OTHERCHAR(ch.NIL) => @ch & ~USED(ch).

USED(ch) <- SPACECH(ch) | ISLETTER(ch) | ISDIGIT(ch).
SPACECH(' ').

Thus the string of characters:

begin int fact, i, n; i := 1; ...

is output as the string of terms:

"begin"."int".ID("fact").",".ID("i").",".ID("n").";"
.ID("i").":=".NUM(0:1).";"...
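The same character-to-token step can be sketched in Python. This is a loose analogue of the LEXEMES grammar, not the thesis's definition: the tuple token representation and the omission of comment handling are our simplifications.

```python
import re

# Reserved words, as listed in Fig. 4/1/2.
RESERVED = {"BEGIN", "BOOL", "DO", "ELSE", "END", "FALSE", "FI", "IF",
            "INPUT", "INT", "OUTPUT", "REF", "THEN", "WHILE"}

def lexemes(text):
    """Split characters into reserved words, ID/NUM tokens and punctuation.
    ':=' must be tried before single punctuation characters."""
    tokens = []
    for m in re.finditer(r"[A-Za-z]+|\d+|:=|[^\sA-Za-z0-9]", text):
        w = m.group(0)
        if w.isalpha():
            up = w.upper()
            tokens.append(up if up in RESERVED else ("ID", w))
        elif w.isdigit():
            tokens.append(("NUM", int(w)))
        else:
            tokens.append(w)
    return tokens
```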

At the lexical level, terminals appear mainly as variables, as in the production:

LETTER(ch) => @ch & ISLETTER(ch)

The reason for this is that it is necessary to return the value as the result of the parse, and the predicate 'isletter' is normally supplied as an inbuilt (or evaluable) predicate in a Prolog system. This is preferable to the rather tedious definition which would otherwise be necessary:

LETTER(B) => @B.
LETTER(C) => @C.


The Syntax of ASPLE

The M-grammar version of the syntax of ASPLE is shown in Fig. 4/1/3, and corresponds closely to the BNF version. In fact, if the parameters are removed the similarity is seen very clearly. But the parameters add two extra features that a BNF version does not have: they apply context-sensitive restrictions to the syntax, and they show the correspondence of the morphemes (grammatical items) to the lexemes (or tokens).

/*                       */
/* Fig 4/1/3 SYNTAX      */
/*                       */

MORPHEME(tree.env) => "BEGIN";
   DCLTRAIN(env) & CHECK(env) & MAXLENGTH(env, N2);
   STMTRAIN(env, tree); "END".

DCLTRAIN(env) => DECLARATION(env0); ";";
   DCLTRAIN(env1) & APPEND(env0, env1, env).
DCLTRAIN(NIL) => NIL.
DECLARATION(env) => MODE(m); IDLIST(REF(m), env).
MODE(INT) => "INT".
MODE(BOOL) => "BOOL".
MODE(REF(m)) => "REF"; MODE(m).
IDLIST(m, env.env1) => IDDEC(m, env);
   ( ","; IDLIST(m, env1) | env1 = NIL).
IDDEC(m, LOC(tag, m, UNDEF)) => @ID(tag).

/*                       */
/* STATEMENTS            */
/*                       */

STMTRAIN(env, sem.sem1) => STATEMENT(env, sem);
   ( ";"; STMTRAIN(env, sem1) | sem1 = NIL).
STATEMENT(env, sem) =>
     ASGTSTM(env, sem) | CONDSTM(env, sem)
   | LOOPSTM(env, sem) | TRANSPUTSTM(env, sem).
ASGTSTM(env, ASGT(tag, exp)) =>
   IDENTIFIER(REF(m), env, tag); ":="; EXP(m, env, exp).
CONDSTM(env, COND(exp, s1, s2)) =>
   "IF"; EXP(BOOL, env, exp); "THEN"; STMTRAIN(env, s1);
   ( "FI" & s2 = NIL | "ELSE"; STMTRAIN(env, s2); "FI").
LOOPSTM(env, WHILE(exp, s)) =>
   "WHILE"; EXP(BOOL, env, exp); "DO"; STMTRAIN(env, s); "END".
TRANSPUTSTM(env, INPUT(exp)) =>
   "INPUT"; EXP(REF(m), env, exp) & INTBOOL(m).
TRANSPUTSTM(env, OUTPUT(exp)) =>
   "OUTPUT"; EXP(m, env, exp) & INTBOOL(m).

/*                       */
/* EXPRESSIONS           */
/*                       */

EXP(m, env, exp) => FACTOR(m, env, lh);
   RESTEXP(m, env, lh, exp).
RESTEXP(m, env, lh, exp) => "+"; FACTOR(m, env, rh)
   & OP('+', m, lh, rh, subexp);
   RESTEXP(m, env, subexp, exp).
RESTEXP(m, env, lh, lh) => NIL.
FACTOR(m, env, exp) => PRIMARY(m, env, lh);
   RESTFACTOR(m, env, lh, exp).
RESTFACTOR(m, env, lh, exp) => "*"; PRIMARY(m, env, rh)
   & OP('*', m, lh, rh, subexp);


   RESTFACTOR(m, env, subexp, exp).
RESTFACTOR(m, env, lh, lh) => NIL.
PRIMARY(m, env, exp) =>
     IDENTIFIER(m1, env, tag) & DEREFERENCE(m1, m, tag, exp)
   | "("; EXP(m, env, exp); ")"
   | "("; COMPARE(m, env, exp); ")"
   | DENOTATION(m, env, exp).
COMPARE(BOOL, env, EQ(exp1, exp2)) =>
   EXP(INT, env, exp1); "="; EXP(INT, env, exp2).
COMPARE(BOOL, env, NE(exp1, exp2)) =>
   EXP(INT, env, exp1); "≠"; EXP(INT, env, exp2).

/*                                   */
/* IDENTIFIERS AND DENOTATIONS       */
/*                                   */

IDENTIFIER(mode, env, ID(tag)) => @ID(tag)
   & MEMBER(LOC(tag, mode, v), env).
DENOTATION(BOOL, VAL(FALSE)) => "FALSE".
DENOTATION(BOOL, VAL(TRUE)) => "TRUE".
DENOTATION(INT, VAL(val)) => @NUM(val).

Context-sensitive restrictions are applied in three ways, which may be illustrated from Fig. 4/1/3.

1. The constraint 'CHECK' in the first production checks that each identifier in the program is defined only once. The definition of 'CHECK' is provided by a recursive logic definition in Fig. 4/1/4.

2. The constraint 'OP' in the definition of expressions (RESTEXP and RESTFACTOR) checks that any expression involving '+' or '*' has mode INT or BOOL, and forces the generation of a 'DEREF' operation in the expansion of PRIMARY if necessary. (Note that OP also has the effect of disambiguating the different meanings of '+' and '*' for

integer and boolean operands.)

3. In the definitions of expressions there is a check that each part of the expression 'balances' the other through the shared parameter 'mode'. Although this may take the values INT or BOOL, it must be consistently one or the other in any particular case.

/*                                     */
/* Fig 4/1/4 SYNTACTIC CONSTRAINTS     */
/*                                     */

CHECK(LOC(tag, mode, val).env) <-
   ~MEMBER(LOC(tag, m, v), env) & CHECK(env).
CHECK(NIL).
INTBOOL(INT).
INTBOOL(BOOL).
DEREFERENCE(mode, mode, exp, exp).
DEREFERENCE(REF(mode), mode1, tag, DEREF(exp)) <-
   DEREFERENCE(mode, mode1, tag, exp).
OP('+', INT, lh, rh, PLUS(lh, rh)).
OP('+', BOOL, lh, rh, OR(lh, rh)).
OP('*', INT, lh, rh, TIMES(lh, rh)).
OP('*', BOOL, lh, rh, AND(lh, rh)).
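The DEREFERENCE constraint can be sketched procedurally in Python (the tuple representation of modes and expressions is our assumption): each REF stripped from the mode wraps the expression in one more DEREF.

```python
def dereference(mode, target, exp):
    """Coerce 'mode' down to 'target', wrapping exp in ('DEREF', ...) once
    per REF removed, mirroring the two DEREFERENCE clauses."""
    if mode == target:                         # DEREFERENCE(mode, mode, exp, exp)
        return exp
    if isinstance(mode, tuple) and mode[0] == 'REF':
        # DEREFERENCE(REF(mode), mode1, tag, DEREF(exp)) <- ...
        return dereference(mode[1], target, ('DEREF', exp))
    raise ValueError("modes cannot be reconciled")
```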

/* General Relations */

APPEND(u.v, w, u.x) <- APPEND(v, w, x).
APPEND(NIL, x, x).
MEMBER(x, x.y).
MEMBER(x, y.z) <- NOTEQUAL(x, y) & MEMBER(x, z).
MAXLENGTH(list, max) <- VALUE(max, n) & LENGTH(list, 0, i) & LEQ(i, n).
LENGTH(NIL, n, n).
LENGTH(head.rest, sofar, total) <-
   PLUS(0:1, sofar, more) & LENGTH(rest, more, total).
NOTEQUAL(x, y) <- ~EQUAL(x, y).


EQUAL(x, x).

/* Value of machine dependent constants */

VALUE(N1, 0:1:0:0).   /* LENGTH OF ASPLE MORPHEME */

VALUE(N2, 0:5).   /* NO. OF IDENTIFIERS DECLARED */
VALUE(N3, 0:6).   /* NO. OF DIGITS IN INTEGER CONSTANT */
VALUE(N4, 0:8).   /* NO. OF LETTERS IN IDENTIFIER */

VALUE(N5, 0:1:0:0:0:0:0).   /* VALUE OF LARGEST INTEGER */
VALUE(N6, 0:1:0:0).         /* SIZE OF OUTPUT FILE */

As an example of mode handling, take the statement in the factorial program:

fact := fact * i

The mode of both fact and i is REF(INT) - i.e. reference to a location containing an integer value. The production for the assignment statement is:

ASGTSTM(env, ASGT(tag, exp)) =>
   IDENTIFIER(REF(m), env, tag); ":="; EXP(m, env, exp).

In this statement, 'env' is a variable which is matched with a data structure containing all the declarations in the program - it is considered in more detail below. 'ASGT(tag,exp)' is the output of this clause (if we are considering this definition as a parsing procedure), which is a tree structure defining the program. On the right hand side of the production is the body of the assignment statement. The non-terminal 'IDENTIFIER' will be matched at the lower levels of the program to give

IDENTIFIER(REF(INT), env, ID("fact"))

thus yielding the assignment in this clause of m = INT,

tag = ID("fact"). The binding of env has been omitted for brevity.

The beginning of the expansion of the right hand side of the assignment statement is similar, with indentation indicating the depth of the production tree:

EXP(INT, env, exp) =>
  FACTOR(INT, env, lh) =>
    PRIMARY(INT, env, lh) =>
      IDENTIFIER(m1, env, tag) =>
        @ID("fact") & MEMBER(LOC("fact", m1, v), env)

It may be noted in passing that the slightly cumbersome definition of EXP involving RESTEXP is a standard technique to get round the problem of left recursive rules, which would prevent the correct evaluation of the grammar using Prolog, which uses a left to right top-down evaluation rule. If the grammar isn't used as a program the more obvious definition would be preferable.
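The RESTEXP technique has a direct counterpart in recursive-descent parsing. A Python sketch (our illustration, not the thesis's grammar): instead of the left-recursive rule EXP -> EXP "+" FACTOR, the parser reads one FACTOR and then loops, threading the tree built so far exactly as RESTEXP threads its extra 'lh' parameter.

```python
def parse_exp(tokens, pos, parse_factor):
    """Parse FACTOR ('+' FACTOR)*, accumulating left-associatively."""
    lh, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] == '+':
        rh, pos = parse_factor(tokens, pos + 1)
        lh = ('PLUS', lh, rh)      # the 'lh' accumulator of RESTEXP
    return lh, pos

def atom(tokens, pos):
    """Trivial FACTOR for the demonstration: a single token."""
    return tokens[pos], pos + 1
```

Parsing a+b+c yields PLUS(PLUS(a,b),c), the same shape the RESTEXP rules construct.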

At this point, the constraint 'MEMBER' is matched, with resultant bindings:

IDENTIFIER(REF(INT), env, ID("fact"))

leaving the constraint to be matched:

& DEREFERENCE(REF(INT), INT, ID("fact"), exp).

This constraint expresses the need to find an expression 'exp' which reduces the mode of ID("fact") from REF(INT) to INT. This is performed by the clause for DEREFERENCE in Fig. 4/1/4, which follows a similar pattern to the productions already followed, and yields the binding exp = DEREF(ID("fact")), where the functor 'DEREF' indicates the dereferencing of a store value; i.e. taking its value.

The rest of the productions for the assignment may be represented as follows, picking up the expansions of the

productions which have not been completed:

RESTFACTOR(INT, env, DEREF(ID("fact")), exp) =>

  "*"; PRIMARY(INT, env, rh) =>
         IDENTIFIER(m1, env, tag) =>
           @ID("i")
         & DEREFERENCE(REF(INT), INT, ID("i"), exp) <-
             DEREFERENCE(INT, INT, ID("i"), DEREF(ID("i")))
  & OP('*', INT, DEREF(ID("fact")), DEREF(ID("i")), subexp);
  RESTFACTOR(INT, env,
      TIMES(DEREF(ID("fact")), DEREF(ID("i"))), exp) => NIL;
RESTEXP(INT, env,
    TIMES(DEREF(ID("fact")), DEREF(ID("i"))), exp) => NIL.

Thus the final form of the assignment statement is:

ASGTSTM(env, ASGT(ID("fact"),
      TIMES(DEREF(ID("fact")), DEREF(ID("i"))))) =>
   IDENTIFIER(REF(INT), env, ID("fact")); ":=";
   EXP(INT, env, TIMES(DEREF(ID("fact")), DEREF(ID("i")))).

By the process of matching, it is thus possible to build up complex trees which represent the morphemes of the program. 'DEREF', 'TIMES' and 'ASGT' represent grammatical elements which must be interpreted semantically to produce the result of the program.

With this introduction, it should be possible to work out how the rest of the program is parsed according to these rules. Let us finally consider how the declaration environment 'env' is built up. The final value of env in this program is as follows:


env = LOC("fact", REF(INT), UNDEF)
     .LOC("i", REF(INT), UNDEF)
     .LOC("n", REF(INT), UNDEF).NIL

'LOC' is a three-place function with arguments name, mode and value, with one occurrence for each variable declared in the program. The third argument is set to undefined (UNDEF) by the syntax, which is the value it has at the start of execution. The environment is simply a list of these declarations.
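The environment and the LOOKUP/UPDATE operations on it can be sketched in Python (an illustration using tuples for the LOC triples; the error behaviour for UNDEF follows the LOOKUP clauses given later in Fig. 4/1/6):

```python
# One (name, mode, value) triple per declared variable, as built by IDDEC.
env = [("fact", ("REF", "INT"), "UNDEF"),
       ("i",    ("REF", "INT"), "UNDEF"),
       ("n",    ("REF", "INT"), "UNDEF")]

def lookup(tag, env):
    """Find the value stored for tag; an UNDEF value is an error."""
    for name, mode, val in env:
        if name == tag:
            if val == "UNDEF":
                raise ValueError("undefined variable")
            return val
    raise KeyError(tag)

def update(tag, val, env):
    """Return a new environment with tag's value replaced."""
    return [(n, m, val if n == tag else v) for n, m, v in env]
```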

The Semantics of ASPLE

Using logic it is possible to present all of the several methods of defining semantics outlined in the introduction in a consistent formalism, which demonstrates clearly their similarities and differences. It may then be seen that the difference stems not so much from the formalism employed (as has sometimes seemed the case) as from the purpose for which it is employed. The present treatment uses a relational method which has some of the advantages of the denotational method. The kernel of the axiomatic method may be found in a slightly different context in Moss (1977) as well as in chapter 3.2. Relational semantics provides a guide for the implementor, specifying exactly what must be adhered to and what is implementation dependent. Axiomatic semantics is more useful for the programmer who wants to understand the effects of a program without being bogged down in the details.

The relational semantics of a program is a relation between the program, an 'input' state and an 'output' state. These states consist of four components:

1. The contents of variables in the program.
2. The state of the input file.


3. The state of the output file.
4. The status of the machine, indicating whether or not an error has occurred (required by the definition of ASPLE).

Thus the top level of the semantics is:

<- SEMEME(program, STATE(env, input, output, OK),
          STATE(env1, i1, NIL, result))

where STATE is a four-place constructor for a state having the components above; 'program' and 'env' are derived from the morpheme relation; 'input' and 'output' are the corresponding files, and 'OK' is the initial status. The corresponding items 'env1', 'i1' and 'NIL' are the resultant states of these variables at the end, which are ignored, and 'result' is the final result of the program.

The evaluation of the SEMEME relation produces a sequence of states, and these states may be subdivided in a very similar fashion to a grammar. This suggests that we may use the same formalism to express the sequence of states as is used for a sequence of characters or lexemes; namely a grammar. The sequential evaluation of two statements, s1 and s2, can be represented by a clause:

SEMEME(s1.s2, state1, state3) <-
   SEMEME(s1, state1, state2) &
   SEMEME(s2, state2, state3).

which is more tersely stated by the grammar rule:

SEMEME(s1.s2) => SEMEME(s1); SEMEME(s2).

Here the semi-colon may be interpreted as "after" in the temporal sequence. The use of the grammatical form does not alter the underlying logical clause, but enables one to

concentrate on the meaning of the clause without being distracted by the presence of an additional two parameters. This is demonstrated in Fig. 4/1/5, which shows the top level of the semantics for ASPLE.
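The hidden state-threading can be made explicit in a small Python sketch (ours): reading ';' as "after" amounts to composing two state-to-state transformations, here modelled as functions on a dictionary. The toy transformers are invented for illustration.

```python
def seq(f, g):
    """';' read as 'after': thread the state through f, then through g,
    just as the expanded clause threads state1 -> state2 -> state3."""
    return lambda state: g(f(state))

# Toy state transformers standing in for the SEMEME of two statements.
inc_i  = lambda st: {**st, "i": st["i"] + 1}   # i := i + 1
double = lambda st: {**st, "i": st["i"] * 2}   # i := i * 2

run = seq(inc_i, double)
```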

The 'grammar' that this forms does not have normal 'terminal symbols'. Instead the relations other than 'SEMEME' are expressed by standard Horn clauses including the extra two parameters, as shown in Fig. 4/1/6. These may be divided into two classes: those such as 'UPDATE' and 'TRANSPUT' which change the state parameters; and those such as 'LOOKUP' and 'ADD' which do not.

/*                          */
/* Fig 4/1/5 SEMANTICS      */
/*                          */

SEMEME(s1.s2) => SEMEME(s1); SEMEME(s2).
SEMEME(NIL) => NIL.
SEMEME(ASGT(ID(tag), exp)) =>
   SEMEME(exp, val); UPDATE(tag, mode, val).
SEMEME(COND(exp, s1, s2)) =>
     SEMEME(exp, VAL(TRUE)); SEMEME(s1)
   | SEMEME(exp, VAL(FALSE)); SEMEME(s2).
SEMEME(WHILE(exp, s)) =>
     SEMEME(exp, VAL(FALSE))
   | SEMEME(exp, VAL(TRUE)); SEMEME(s); SEMEME(WHILE(exp, s)).
SEMEME(INPUT(exp)) => SEMEME(exp, ID(tag));
   ( TRANSPUT(IN, mode, val); UPDATE(tag, mode, val)
   | INPUTERROR).
SEMEME(OUTPUT(exp)) => SEMEME(exp, VAL(val));
   TRANSPUT(OUT, mode, val).
SEMEME(any) => ERROROCCURRED.


/*                                  */
/* SEMANTICS OF EXPRESSIONS         */
/*                                  */

SEMEME(PLUS(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & ADD(val1, val2, val).
SEMEME(TIMES(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & MULT(val1, val2, val).
SEMEME(AND(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & AND(val1, val2, val).
SEMEME(OR(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & OR(val1, val2, val).
SEMEME(EQ(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & EQUAL(val1, val2, val).
SEMEME(NE(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & NOTEQUAL(val1, val2, val).
SEMEME(DEREF(exp), val) => SEMEME(exp, ID(tag));
   LOOKUP(tag, val).
SEMEME(ID(tag), ID(tag)) => NIL.
SEMEME(VAL(val), VAL(val)) => NIL.
SEMEME(exp, v) => EVALUATIONERROR(exp, v).

As an example, let us take the assignment statement that was parsed earlier. The top level of the semantics for this is:

SEMEME(ASGT(ID("fact"),
      TIMES(DEREF(ID("fact")), DEREF(ID("i"))))) =>
   SEMEME(TIMES(DEREF(ID("fact")), DEREF(ID("i"))), val);
   UPDATE("fact", mode, val).


The value of STATE at the beginning of this statement contains values for "fact" and "i" which are used in the evaluation of the expression. The value of "fact" in the state is changed by UPDATE to the new value, val, computed by TIMES. The parameter 'mode' to UPDATE is not used in the assignment statement, as the type check has been done at 'compile time'. It is used in the evaluation of the INPUT statement only.

Referring again to Fig. 4/1/5, it will be seen that there are two distinct SEMEME relations: those which describe statements have only one parameter, and those which describe expressions have two. This is suitable for ASPLE, in which statements do not return results. In an 'expression' language such as Algol 68, the first type would be unnecessary.

While the elaboration of statements is sequential, to insist on the sequential evaluation of expressions would be an overspecification of ASPLE, as of Algol 68. This means that it is possible to execute either the left hand branch of an expression first or the right hand branch. This may be represented in the grammar by allowing either order as an option, e.g.

SEMEME(PLUS(exp1,exp2), val) =>
   ( SEMEME(exp1, val1); SEMEME(exp2, val2)
   | SEMEME(exp2, val2); SEMEME(exp1, val1))
   & ADD(val1, val2, val).

This is an example of an indeterminate specification which is perfectly acceptable in a logic program. If this were executed as a program, each of the alternatives would be generated in turn by backtracking. Of course, in the absence of side-effects in expressions, the same result will be achieved and we have therefore not bothered to express all the operators in this fashion.
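The both-orders rule can be mimicked in Python (our illustration): enumerate the two elaboration orders and collect a result from each; with side-effect-free evaluation the two answers coincide, as the text argues.

```python
def eval_plus(exp1, exp2, evaluate):
    """Return the result of exp1 + exp2 under each evaluation order,
    mirroring the two alternatives of the indeterminate rule."""
    results = []
    for order in ((exp1, exp2), (exp2, exp1)):
        vals = {e: evaluate(e) for e in order}   # elaborate in this order
        results.append(vals[exp1] + vals[exp2])
    return results
```

With a pure evaluator, `eval_plus(2, 3, lambda e: e)` yields the same value twice.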


/*                                    */
/* Fig 4/1/6 SEMANTIC RELATIONS       */
/*                                    */

LOOKUP(ID(tag), val, st0, st0) <-
   st0 = STATE(mem, i, o, OK) & LOOKUP(tag, mem, val).
LOOKUP(tag, LOC(tag, mode, val).r, val) <- NOTEQUAL(val, UNDEF).
LOOKUP(tag, LOC(tag1, m, v).rest, val) <-
   NOTEQUAL(tag, tag1) & LOOKUP(tag, rest, val).
UPDATE(tag, mode, val, STATE(mem1, in, out, OK),
                       STATE(mem2, in, out, OK)) <-
   UPDATE(tag, mode, val, mem1, mem2).
UPDATE(tag, mode, val, LOC(tag, REF(mode), v).env,
                       LOC(tag, REF(mode), val).env).
UPDATE(tag, mode, val, l.env1, l.env2) <-
   l = LOC(tag1, m, v) & NOTEQUAL(tag, tag1)
   & UPDATE(tag, mode, val, env1, env2).

TRANSPUT(IN, mode, val, STATE(mem, in, out, OK),
                        STATE(mem, in1, out, OK)) <-
   DENOTATION(mode, val, in, in1).
TRANSPUT(OUT, mode, val, STATE(mem, in, out, OK),
                         STATE(mem, in, out1, OK)) <-
   DENOTATION(mode, val, out, out1).
ERROROCCURRED(STATE(m,i,o,ERROR(x)), STATE(m,i,o,ERROR(x))).
EVALUATIONERROR(exp, v, state, STATE(m,i,o,ERROR(exp))) <-
   state = STATE(m,i,o,OK) & ~SEMEME(exp, any, state, state).
INPUTERROR(STATE(m,NIL,o,OK), STATE(m,NIL,o,ERROR(INPT))).


/*                                         */
/* ARITHMETIC AND BOOLEAN OPERATIONS       */
/*                                         */

ADD(x, y, z) <- PLUS(x, y, z) & INRANGE(z).

MULT(x, y, z) <- TIMES(x, y, z) & INRANGE(z).

PLUS(xa:x, ya:y, z) <- SUCC(x1, x) & SUCC(y, y1) &
   PLUS(xa:x1, ya:y1, z).
PLUS(xa:0, ya:y, za:y) <- PLUS(xa, ya, za).
PLUS(xa:x, ya:9, za:z) <- SUCC(z, x) & PLUS(0:1, ya, z1) &
   PLUS(xa, z1, za).

PLUS(0, x, x).
PLUS(x, 0, x) <- NOTEQUAL(x, 0).
OR(FALSE, FALSE, FALSE).
OR(FALSE, TRUE, TRUE).
OR(TRUE, x, TRUE).

TIMES(0, x, 0).
TIMES(xa:0, y, za:0) <- TIMES(xa, y, za).
TIMES(xa:x, y, z) <- SUCC(x1, x) & TIMES(xa:x1, y, z1) &
   PLUS(z1, y, z).
AND(TRUE, TRUE, TRUE).
AND(TRUE, FALSE, FALSE).
AND(FALSE, x, FALSE).

SUCC(0,1). SUCC(1,2). SUCC(2,3). SUCC(3,4). SUCC(4,5).
SUCC(5,6). SUCC(6,7). SUCC(7,8). SUCC(8,9).
EQUAL(x, y, TRUE) <- EQUAL(x, y).
EQUAL(x, y, FALSE) <- NOTEQUAL(x, y).
NOTEQUAL(x, y, TRUE) <- NOTEQUAL(x, y).
NOTEQUAL(x, y, FALSE) <- EQUAL(x, y).
LEQ(x, x).
LEQ(x, y) <- LTH(x, y).
LTH(xa:x, xa:y) <- SUCC(x, y).
LTH(xa:x, xa:y) <- SUCC(x, z) & LTH(0:z, 0:y).
LTH(xa:x, ya:y) <- LTH(xa, ya).
INRANGE(int) <- VALUE(N5, max) & LEQ(int, max).


The semantics of the subsidiary relations are given by means of the Horn clauses shown in Fig. 4/1/6. Note that this includes the semantics of the arithmetic and relational operators in the language. For instance, '+' is described by the ADD and PLUS relations, which use decimal arithmetic for clarity, defined by recursive routines. The ADD relation interfaces this to the SEMEME relation, including the condition INRANGE which applies a maximum limit to the size of an integer, which is a feature of the definition of ASPLE for a particular machine.

This contrasts with the method of denotational semantics, which relates the arithmetic of a program back to the 'normal' mathematical meaning; of course computer arithmetic is not 'normal' arithmetic to the extent that it is bounded or modular. By giving a logical definition the expected behaviour of the arithmetic in extreme conditions can be made plain. The definition still remains a specification - it is not intended, for instance, that decimal arithmetic be obligatory, though it is useful for tutorial purposes.
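The successor-based decimal addition can be sketched in Python, following the shape of the PLUS clauses (nested pairs again stand in for the ':' functor; this rendering is ours, not a required implementation):

```python
# Digit successor table, SUCC(0,1) ... SUCC(8,9).
SUCC = {i: i + 1 for i in range(9)}

def plus(x, y):
    """Add two ':' numerals (nested pairs, 0 = empty prefix)."""
    if x == 0:                                   # PLUS(0, x, x)
        return y
    if y == 0:                                   # PLUS(x, 0, x)
        return x
    xa, xd = x
    ya, yd = y
    if xd == 0:                                  # PLUS(xa:0, ya:y, za:y)
        return (plus(xa, ya), yd)
    if yd == 9:                                  # carry into the next column
        return (plus(xa, plus((0, 1), ya)), xd - 1)
    return plus((xa, xd - 1), (ya, SUCC[yd]))    # move one unit across

def value(n):
    """Decode a ':' numeral for checking."""
    return 0 if n == 0 else 10 * value(n[0]) + n[1]
```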

As an example of the other relations, we will consider the TRANSPUT relation (Fig. 4/1/6). This has two clauses, the only difference between them being the file, which is either input or output.

TRANSPUT(IN, mode, val, STATE(mem, in1, out, OK),
                        STATE(mem, in2, out, OK)) <-
   DENOTATION(mode, val, in1, in2).
TRANSPUT(OUT, mode, val, STATE(mem, in, out1, OK),
                         STATE(mem, in, out2, OK)) <-
   DENOTATION(mode, val, out1, out2).

The body of each clause is DENOTATION, which is part of the syntax (Fig. 4/1/3). This acts on the contents of a file,


which is a list of characters. The clauses illustrate the flexibility of input/output in logic programs. When the input clause is run as a program in the normal way, the variable in1 is bound and the variables mode and val are unbound initially, but are bound by the elaboration of the clause. For the output clause, val is bound and mode and out1 are unbound initially, and the action of DENOTATION is thus reversed, generating the string of characters in the file.

Note that the output file appears in the first occurrence of STATE in the second clause, and not in the second as one might expect. If one attempts to put it in the 'resultant' state, then although the numbers appear the right way round in the resultant file, they appear in the wrong order, with the last number coming first. There is no intrinsic difference between input and output files in the logic of the program. The only difference lies in the 'control' of the logic. This means, for instance, that the same file may be used for both input and output. This gives a much more satisfactory means of treating 'interactive' files such as terminals than considering them as two separate files, which does not capture the essence of the interaction. It also aids in the treatment of errors.

Errors are treated in this definition by the provision of a separate 'status' parameter in the state of the machine. In normal execution the value of the status is 'OK'. This status is checked by all of the semantic relations which access the state of the machine - e.g. LOOKUP and TRANSPUT. On the occurrence of an error in an expression (which may include arithmetic overflow, an undefined element or end-of-file), the evaluation is 'completed' by the last clause in Fig. 4/1/5, which invokes a call to EVALUATIONERROR. This changes the status OK to ERROR(exp), indicating that an error has occurred in the evaluation of an expression. Because expressions may be evaluated in any order, it is not possible to indicate unequivocally what type of error

has occurred (since two different errors might occur in independent branches). Any further statements that are executed can only now be completed by the last clause for the statement SEMEME, which has a call to ERROROCCURRED. Thus the SEMEME relations are complete, in that any finite failure will lead to a relation which succeeds.

This leaves the question of non-terminating programs open. Since the definition here may be evaluated using a computer we have not 'completed' the definition, although it is quite possible to write a clause similar to EVALUATIONERROR to complete the semantics of statements. However, the meaning of negation as failure has been assumed to be the maximal fixpoint, which corresponds to finite failure.

Therefore by convention, we assume that the final state of a non-terminating program has the value UNDEFINED. This occurs in place of the whole state, not just the status, as it is impossible to state anything about that state.

We might note however that it is still possible to say something about the results of a non-terminated program through the value bound to the output file. Since the term which represents the output file appears in the initial state, any values bound in it are recoverable. The tail of the list will be unbound, as the file is not completed.
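The recoverable prefix of an incomplete output file can be illustrated in Python (our sketch): model the file as a cons chain whose tail is a 'hole' standing for an unbound logical variable. Everything written so far is visible even though the list is never closed.

```python
class Hole:
    """Stands for an unbound logical variable (the open tail)."""
    def __init__(self):
        self.value = None            # None = still unbound

def write(hole, item):
    """Bind the hole to a cell (item . fresh-hole); return the new tail."""
    hole.value = (item, Hole())
    return hole.value[1]

def prefix(hole):
    """Collect the bound prefix of an open list, stopping at the unbound tail."""
    out = []
    while hole.value is not None:
        item, hole = hole.value
        out.append(item)
    return out

out_file = Hole()                    # the output file term, tail unbound
tail = write(out_file, 8)
tail = write(tail, 40320)
```

Even if the program never terminates and the tail stays unbound, `prefix(out_file)` recovers the values output so far.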


A Worked Example

To demonstrate the effect of an implementation of ASPLE using M-grammars, we will demonstrate the bindings of the top level of the ASPLE relation given earlier.

ASPLE(text, input, output, result) <-
   LEXEMES(tokens, text, NIL)
   & MORPHEME(tree.mem, tokens, NIL)
   & SEMEME(tree, STATE(mem, input, output, OK),
                  STATE(m1, i1, o1, result)).

text = the factorial example of Fig. 1.
tokens = "BEGIN"."INT".ID("FACT").",".ID("I").",".ID("N").";"
   ."INPUT".ID("N").";"
   .ID("I").":=".NUM(0:1).";"
   .ID("FACT").":=".NUM(0:1).";"
   ."IF"."(".ID("N")."≠".NUM(0:1).")"."THEN"
   ."WHILE"."(".ID("I")."≠".ID("N").")"."DO"
   .ID("I").":=".ID("I")."+".NUM(0:1).";"
   .ID("FACT").":=".ID("FACT")."*".ID("I")
   ."END"."FI".";"
   ."OUTPUT".ID("N").";"."OUTPUT".ID("FACT")
   ."END".NIL
mem = LOC("FACT", REF(INT), UNDEF)
   .LOC("I", REF(INT), UNDEF)
   .LOC("N", REF(INT), UNDEF).NIL
tree = INPUT(ID("N"))
   .ASGT(ID("I"), VAL(0:1))
   .ASGT(ID("FACT"), VAL(0:1))
   .COND(NE(DEREF(ID("N")), VAL(0:1)),
         WHILE(NE(DEREF(ID("I")), DEREF(ID("N"))),
               ASGT(ID("I"), PLUS(DEREF(ID("I")), VAL(0:1)))
               .ASGT(ID("FACT"), TIMES(DEREF(ID("FACT")), DEREF(ID("I"))))
               .NIL)
         .NIL,
         NIL)


   .OUTPUT(DEREF(ID("N")))
   .OUTPUT(DEREF(ID("FACT")))
   .NIL
input = NUM(0:8).NIL
output = NUM(0:8).NUM(0:4:0:3:2:0).NIL
result = OK

A Formal Definition of Prolog

In this section a definition of Prolog is presented. This is of interest for three reasons. Firstly, it shows the definition of a language very different from the Algol-like structure of ASPLE: a non-deterministic language suitable for symbolic manipulation rather than general algorithms. Secondly, it allows a slightly closer look at the language on which the definitions are based. Thirdly, it shows how definitions can successively be refined to show more of the underlying detail of the implementation.

The syntax of a basic version of Prolog is shown in Fig. 4/2/1. Only three function symbols are used in the abstract syntax which this generates:

(1) C(c)    This names the constant c.
(2) F(n,a)  This names a function which has name n and list of arguments a.
(3) V(v)    This names the variable named v.

No special function symbol is used for clauses, which are represented as lists, of which the first item is the head (left hand side) and the remaining elements the body (right hand side, or subgoals). A program is assumed to consist of a sequence of procedures and goals, and these are collected in separate parameters of the non-terminal 'Program'.

Note that in the following definitions, the optional separator is omitted, for reasons of clarity.

/*                                      */
/* Fig 4/2/1 The Syntax of Prolog       */
/*                                      */

Program(a.b, c) -> Clause(a) "." Program(b, c).
Program(a, b.c) -> Goal(b) "." Program(a, c).
Program(NIL, NIL) -> NIL.

Clause(a.b) -> Atom(a) "<-" Conjunction(b).
Clause(a.NIL) -> Atom(a).

Goal(a) -> "<-" Conjunction(a).

Conjunction(a.b) -> Atomvar(a) "&" Conjunction(b).
Conjunction(a.NIL) -> Atomvar(a).
Atomvar(a) -> Atom(a) | Variable(a).
Atom(a) -> Function(a) | Constant(a).
Function(F(n,a)) -> Constant(n) "(" Termlist(a) ")".

Termlist(a.b) -> Term(a) "," Termlist(b).
Termlist(a.NIL) -> Term(a).
Term(a) -> Function(a) | Constant(a) | Number(a) | Variable(a).
Constant(C(c)) -> @C(c).
Number(C(c)) -> @N(c).
Variable(V(v)) -> @V(v).

/*                            */
/* The Lexical Syntax         */
/*                            */

Tokenlist(a) -> @' ' Tokenlist(a).
Tokenlist(a) -> @'/' @'*' Comment Tokenlist(a).
Tokenlist(a.b) -> Token(a) Tokenlist(b).
Tokenlist(NIL) -> NIL.

Token(C(a.b)) -> Upper(a) Alphamerics(b).


Token(V(a.b)) -> Lower(a) Alphamerics(b).
Token(N(a)) -> Digit(b) Digits(b,a).
Token("(") -> @'('.
Token(")") -> @')'.
Token(",") -> @','.
Token(".") -> @'.'.
Token("&") -> @'&'.
Token("<-") -> @'<' @'-'.
Token(C(a.b)) -> @''' Quoted(a.b) @'''.
Token(C(a.NIL)) -> @a.

Alphamerics(a.b) -> (Upper(a) | Digit(a)) Alphamerics(b).
Alphamerics(NIL) -> NIL.
Digits(a,b) -> Digit(c) & Prod(a,10,d) & Sum(d,c,e) Digits(e,b).
Digits(a,a) -> NIL.
Upper(a) -> @a & Letter(a).
Lower(a) -> @a & Lower(a).
Digit(a) -> @a & Digit(a).
Comment -> @'*' @'/'.
Comment -> @a Comment.
Quoted('''.b) -> @''' @''' Quoted(b).
Quoted(a.b) -> @a & ~a=''' Quoted(b).
Quoted(NIL) -> NIL.

As an example the clause

A(B(*C,D)) <- E(*C).

is represented by the term

F("A", F("B", V("C").C("D").NIL).NIL) . F("E", V("C").NIL) . NIL.

In this representation, variables are represented by constants, and functions and constants by terms, and the atoms (or literals) at the top level of the clause are treated the same as function terms. Recall that an expression in double quotes is equivalent to a list of its component characters, e.g. "AB" is the same as A.B.NIL.
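The encoding can be made concrete with a small executable sketch. The following Python rendering (the tuple encoding and helper names are my own, not the thesis's notation) builds the term for the example clause above:

```python
# Tagged-tuple stand-ins for the three abstract-syntax functors:
#   C(c) as ('C', c),  F(n,a) as ('F', n, a),  V(v) as ('V', v);
# a dot-list x.y. ... .NIL is encoded as nested pairs ending in 'NIL'.
NIL = 'NIL'

def dotlist(*items):
    """Build the list x1.x2. ... .NIL from its elements."""
    out = NIL
    for item in reversed(items):
        out = (item, out)          # the dot operator is pairing
    return out

# The clause A(B(*C,D)) <- E(*C). as the term
#   F("A",F("B",V("C").C("D").NIL).NIL) . F("E",V("C").NIL) . NIL
head = ('F', 'A', dotlist(('F', 'B', dotlist(('V', 'C'), ('C', 'D')))))
body = ('F', 'E', dotlist(('V', 'C')))
clause = dotlist(head, body)       # head first, then the body atoms
```

The clause is just a list whose first element is the head, exactly as the text describes.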

The semantics of a Prolog program may be given as a series of programs which depend in increasing amounts on the language being used in the definition. The model-theoretic and fixpoint semantics have already been given in the previous chapter. The semantics considered in this chapter are increasingly "operational" in nature.

1. The Abstract Definition

The first definition assumes most of the features of Prolog - backtracking, unification etc. - and most of the definition involves replacing the constants naming variables by instances of variables.

The top level takes three parameters: a list of goals, a list of clauses forming the program, and a list of solutions. A solution in this definition is taken to be the goal with variable terms replaced by a valid instance of them. The top level is

Prove(goal.goals, prog, soln.solns) <-
    Prove1(goal, prog, soln) & Prove(goals, prog, solns).
Prove(NIL, prog, NIL).
Prove1(goal, prog, soln) <-
    Renamelist(goal, soln, s) & Solve(soln, prog).
Prove1(goal, prog, NOSOLN) <-
    Renamelist(goal, soln, s) & ~Solve(soln, prog).


The first clause splits a list of goals and processes each of them separately. 'Prove1' is basically present to provide output in the case where there is no solution. 'Renamelist' (see below) replaces the names of variables in the goal clause by instances of the variables. The kernel of the interpreter is in the clause for 'Solve', which has only two parameters - the atom list and the program clause list.

Solve(NIL, prog).
Solve(atom.rest, prog) <-
    Select(proc, prog) &
    Renamelist(proc, atom.subgoals, s) &
    Solve(subgoals, prog) &
    Solve(rest, prog).

The clause 'Select' selects one procedure from the list of procedures which is the program. 'Renamelist' replaces variable terms in that procedure (which are represented by function terms) by real variables (which represent infinitely many terms). If the first term of this renamed procedure agrees with the goal, then the subgoals of that procedure and the rest of the goals are solved by recursive calls. These terminate in assertions, for which the subgoals are nil.

It should be noted that this clause does not specify or constrain the proof procedure. The 'standard' method in Prolog is to take subgoals from left to right, and to take subgoals of the leftmost subgoal first - the so-called left to right depth first (LRDF) search. However, the clause can be interpreted in other ways - e.g. breadth first, or as a specification which has no order of evaluation.

Non-determinism in this clause derives from two sources - one in solving the subgoals, and one from the Select procedure, which returns any of the clauses in the program.


It is coded as follows:

Select(a, a.b).
Select(a, b.c) <- Select(a, c).

The remaining procedure to be described is 'Renamelist'. This has two clauses, which process elements of a list, and the main part is done by the clause 'Rename'. This has three clauses corresponding to the three function symbols in the representation of a program, and one subsidiary procedure 'Pairlist'.

Renamelist(a.b, c.d, s) <-
    Rename(a, c, s) & Renamelist(b, d, s).
Renamelist(NIL, NIL, s).
Rename(C(c), C(c), s).
Rename(F(n,a), F(n,b), s) <- Renamelist(a, b, s).
Rename(V(v), x, s) <- Pairlist(v, x, s).

In these clauses, the third parameter is initially unbound, and when bound consists of pairs of arguments - a constant giving the name of the variable, and an instance of this variable, which may be returned by the 'Rename' procedure.

Pairlist has the form

Pairlist(a, b, a.b.c).
Pairlist(a, b, c.d.e) <- ~a=c & Pairlist(a, b, e).

This is used in two ways. On the first occurrence of a variable in a procedure a pair of items is added to the list 's'. Subsequent occurrences of this variable find the same pair in the list.

Rename is similarly used in two modes. While it is renaming the head of the clause, both of the first two parameters are bound; but when renaming the subgoals, the second parameter is always output (i.e. unbound at the start of execution).
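The control structure of Prove/Solve can be sketched executably for the ground (variable-free) fragment, where 'Renamelist' is the identity and matching is simple equality; the full definition inherits renaming and unification from the logic itself. A minimal Python rendering (the encoding and the tiny example program are mine):

```python
# Ground-fragment sketch of the abstract interpreter.  A program is a
# list of clauses; a clause is [head, subgoal1, ...]; atoms are strings.
# The Select procedure may return any clause, so the search order is
# left unspecified by the thesis; here we simply try clauses in
# textual order (one possible "control regime").

def solve(goals, prog):
    """Solve(goals, prog): True if every goal is derivable."""
    if not goals:
        return True                      # Solve(NIL, prog).
    atom, rest = goals[0], goals[1:]
    for head, *subgoals in prog:         # Select(proc, prog)
        if head == atom:                 # ground "match" is equality
            if solve(subgoals, prog) and solve(rest, prog):
                return True              # depth first, left to right
    return False

def prove(goal_list, prog):
    """Prove: a solution or NOSOLN verdict per goal, as in Prove1."""
    return [g if solve([g], prog) else 'NOSOLN' for g in goal_list]

# A tiny propositional program:  A <- B & C.   B.   C <- B.
program = [['A', 'B', 'C'], ['B'], ['C', 'B']]
```

Goals with variables would additionally need the renaming and unification machinery made explicit in the next refinement.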

This completes the first definition of the semantics, which is summarised in Fig 4/2/2. This definition returns only the first solution of any goal, in common with most Prolog implementations. Other "control regimes" could easily be programmed.

/* Fig 4/2/2 The Abstract Definition */

Prove(NIL, prog, NIL).
Prove(goal.goals, prog, soln.solns) <-
    Prove1(goal, prog, soln) & Prove(goals, prog, solns).
Prove1(goal, prog, soln) <-
    Renamelist(goal, soln, s) & Solve(soln, prog).
Prove1(goal, prog, NOSOLN) <-
    Renamelist(goal, soln, s) & ~Solve(soln, prog).

Solve(NIL, prog).
Solve(atom.rest, prog) <-
    Select(proc, prog) &
    Renamelist(proc, atom.subgoals, s) &
    Solve(subgoals, prog) &
    Solve(rest, prog).
Renamelist(NIL, NIL, s).
Renamelist(a.b, c.d, s) <-
    Rename(a,c,s) & Renamelist(b,d,s).
Rename(C(c), C(c), s).
Rename(F(n,a), F(n,b), s) <- Renamelist(a,b,s).
Rename(V(v), x, s) <- Pairlist(v,x,s).


Pairlist(a, b, a.b.s).
Pairlist(a, b, c.d.s) <- ~a=c & Pairlist(a, b, s).

Select(a, a.b).
Select(a, b.c) <- Select(a, c).

2 Prolog with Unification

The first definition leaves open several important questions. The most obvious is the treatment of variables. We will now present a definition of the semantics which makes explicit the way in which parameters are matched against each other, but leaves open other features, such as the control and backtracking mode.

The main change is in the method of representing vari- ables. We give each instance of the use of a clause a unique number or level, n, starting at 0. Then a variable X which is named by V("X") in the term which names the program, is named by V("X",n) when the clause containing it is used.

The values of variables are held in a list of terms of the form

V(x,n) = S(t,f)

which implies that the variable V(x,n) is bound to some term t encountered at the level f. This term may include variables of the form V(y), and the value f indicates that such a variable corresponds to the instance V(y,f). This method of representing the values of variables is called "structure sharing" (see Boyer, Moore, 1972) and has the advantage that the copying of terms is minimised. Other representations are possible; structure sharing is only one of them.


The heart of this refinement lies in the clauses for Match and Unify. These have the form

Match(t1, t2, b1, b2)

where t1 and t2 are terms, b1 is the binding list before the unification and b2 the list afterwards. The clauses for Match are shown in Fig 4/2/3.

The top levels of the interpreter do not change much, except that the goals are now held as lists of pairs, in which each pair has a level number and a list of subgoals still to be solved for the procedure. There are now two calls to versions of Select - to select the goals and to select the matching procedures. In this way the proof strategy of the interpreter is still left undefined.


/* Fig 4/2/3 Prolog with Unification */

Prove1(goal, prog, soln) <-
    Solve((0.goal).NIL, prog, 0, NIL, bindings) &
    Bindlist(goal, soln, 0, bindings).
Prove1(goal, prog, NOSOLN) <-
    ~Solve((0.goal).NIL, prog, 0, NIL, bindings).

Solve(NIL, prog, f, b, b).
Solve(atoms, prog, f1, b1, b3) <-
    Sum(f1, 1, f2) &
    Select2(atom, atoms, rest) &
    Select1(head.subgoals, prog, others) &
    Match(atom, f2.head, b1, b2) &
    Solve((f2.subgoals).rest, prog, f2, b2, b3).

Select1(a, a.b, b).
Select1(a, b.c, b.d) <- Select1(a, c, d).

Select2(a, (n.NIL).b, c) <- Select2(a, b, c).
Select2(f.a, (f.b.c).d, (f.e).d) <- Select1(a, b.c, e).
Select2(a, (f.b.c).d, (f.b.c).e) <- Select2(a, d, e).

/* The Unification Algorithm */

Match(f1.t1, f2.t2, b1, b2) <-
    Deref(t1, t3, f1, f3, b1) &
    Deref(t2, t4, f2, f4, b1) &
    Unify(t3, t4, f3, f4, b1, b2).

Deref(C(c), C(c), f, f, b).
Deref(F(n,a), F(n,a), f, f, b).
Deref(V(v), t2, f1, f2, b) <-
    Member(V(v,f1)=S(t3,f3), b) &
    Deref(t3, t2, f3, f2, b).


Deref(V(v), V(v), f, f, b) <- ~Member(V(v,f)=s, b).

Unify(C(c), C(c), f1, f2, b, b).

Unify(F(n,a1), F(n,a2), f1, f2, b1, b2) <-
    Unifylist(a1, a2, f1, f2, b1, b2).
Unify(V(v), V(v), f, f, b, b).
Unify(V(v1), t2, f1, f2, b, (V(v1,f1)=S(t2,f2)).b) <-
    ~(V(v1)=t2 & f1=f2) &
    ~Occurs(V(v1), t2, f1, f2).
Unify(t1, V(v2), f1, f2, b, (V(v2,f2)=S(t1,f1)).b) <-
    ~t1=V(v1) &
    ~(t1=V(v2) & f1=f2) &
    ~Occurs(V(v2), t1, f2, f1).

Unifylist(NIL, NIL, f1, f2, b, b).
Unifylist(a1.r1, a2.r2, f1, f2, b1, b2) <-
    Match(f1.a1, f2.a2, b1, b3) &
    Unifylist(r1, r2, f1, f2, b3, b2).

Occurs(V(v), V(v), f, f).

Occurs(V(v), F(n,a), f1, f2) <- Occurs(V(v), a, f1, f2).

Member(a, a.b).

Member(a, b.c) <- Member(a, c).

Bindlist(NIL, NIL, f, b).
Bindlist(a1.r1, a2.r2, f, b) <-
    Bind(a1, a2, f, b) & Bindlist(r1, r2, f, b).
Bind(C(c), C(c), f, b).
Bind(F(n,a1), F(n,a2), f, b) <- Bindlist(a1, a2, f, b).
Bind(V(v), t, f, b) <-
    Member(V(v,f)=S(t2,f2), b) & Bind(t2, t, f2, b).
Bind(V(v), V(v,f), f, b) <- ~Member(V(v,f)=s, b).


Match first calls 'Deref' to apply any bindings of variables which already exist. The new tuple S(t2,f2) is then used in the clause 'Unify', which has five cases corresponding to all the different possibilities. Constants match if they are equal. Function terms match if their names are equal and their arguments match. For variables, one does not need (or wish) to bind a variable to itself; otherwise one binds the first variable to the second term, or the second variable to the first term.

The conditions in these clauses would in practice be replaced by the 'cut' symbol, as they simply exclude cases which have already been allowed for.

As an example, consider the following goal and clause against which it is matched:

<- P(u,A)

P(Q(x),x) <- ...

This gives rise to the list of bindings

V(X,1)=S(C(A),0).V(U,0)=S(F(Q,V(X).NIL),1).NIL

By virtue of this the variable u is bound to the term Q(A), but this is not represented explicitly. To extract it, the interpreter uses the procedure Bindlist, which converts the term

F(P, V(U).C(A).NIL).NIL to F(P, F(Q,C(A).NIL).C(A).NIL).NIL.

The text of Bindlist follows easily from the unification routines: the first two arguments are a term and a level, the third is the constructed term, and the fourth the list of bindings.

The illustration above demonstrates the situation in which variables (such as u) act as output, as input (x matched against A), and in the construction of complex terms (Q(x)). It is this capability which adds much to the flexibility of resolution logic and provides a more pleasing approach to the whole question of parameter handling than other mechanisms used in computing.
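The unification machinery just illustrated can be sketched executably. In this Python rendering (my encoding: terms as tagged tuples, the binding list as a dictionary keyed by (variable, level) pairs), deref, unify and bind play the roles of Deref, Match/Unify and Bindlist respectively:

```python
# Terms: ('C', c) constants, ('F', n, [args]) functions, ('V', v)
# variables.  Bindings map (name, level) -> (term, level), i.e. the
# structure-sharing entries V(v,f) = S(t,f').

def deref(t, f, b):
    """Chase existing bindings of a variable, as in Deref."""
    while t[0] == 'V' and (t[1], f) in b:
        t, f = b[(t[1], f)]
    return t, f

def occurs(v, fv, t, ft, b):
    """Occurs check: does variable (v, fv) appear inside (t, ft)?"""
    t, ft = deref(t, ft, b)
    if t[0] == 'V':
        return (t[1], ft) == (v, fv)
    if t[0] == 'F':
        return any(occurs(v, fv, a, ft, b) for a in t[2])
    return False

def unify(t1, f1, t2, f2, b):
    """Extend the bindings b (mutated in place) or report failure."""
    t1, f1 = deref(t1, f1, b)
    t2, f2 = deref(t2, f2, b)
    if t1[0] == 'V' and t2[0] == 'V' and (t1[1], f1) == (t2[1], f2):
        return True                       # same variable: nothing to do
    if t1[0] == 'V':
        if occurs(t1[1], f1, t2, f2, b):
            return False
        b[(t1[1], f1)] = (t2, f2)         # bind variable to the term
        return True
    if t2[0] == 'V':
        return unify(t2, f2, t1, f1, b)   # symmetric case
    if t1[0] == 'C':
        return t1 == t2                   # constants match if equal
    return (t2[0] == 'F' and t1[1] == t2[1] and len(t1[2]) == len(t2[2])
            and all(unify(a, f1, c, f2, b) for a, c in zip(t1[2], t2[2])))

def bind(t, f, b):
    """Build the explicit term a variable stands for, as in Bindlist."""
    t, f = deref(t, f, b)
    if t[0] == 'F':
        return ('F', t[1], [bind(a, f, b) for a in t[2]])
    return t

# The example from the text: the goal <- P(u,A) at level 0 matched
# against the head P(Q(x),x) at level 1 leaves u bound to Q(A).
goal = ('F', 'P', [('V', 'u'), ('C', 'A')])
head = ('F', 'P', [('F', 'Q', [('V', 'x')]), ('V', 'x')])
bindings = {}
ok = unify(goal, 0, head, 1, bindings)
```

Note that, as in the structure-sharing scheme, unify never copies the term Q(x); bind constructs the explicit instance only on demand.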

3 A Backtracking Interpreter

The third refinement of the definition of Prolog is to represent explicitly the backtracking involved in searching for solutions. To do this we must introduce a new data structure - a list of nodes manipulated in a LIFO manner as a stack.

A node is a 6-place function

Node(atom, procs, rest, parent, binding, level)

where 'atom' is the procedure which has been matched, 'procs' is the list of procedures remaining to be tried, 'rest' is the list of outstanding goals to be tried in the procedure, 'parent' is the state of the stack corresponding to the goal, and 'binding' and 'level' are those applicable at this stage. Initially, these are all NIL, with level equal to 0.

The Solve procedure is represented by

Solve(goal, parent, program, stack1, stack2)

where stack1 is the initial and stack2 the final state of


the proof stack. Solve is now represented by three clauses, which correspond to the end of the program, the end of a procedure, and the general case respectively (see Fig. 4/2/4).

/* Fig 4/2/4 The Backtracking Interpreter */

Prove1(goal, prog, soln) <-
    Solve(goal, NIL, prog, Node(NIL,NIL,NIL,NIL,NIL,0).NIL, stack) &
    Instantiate(goal, stack, soln).
Prove1(goal, prog, NOSOLN) <-
    ~Solve(goal, NIL, prog, Node(NIL,NIL,NIL,NIL,NIL,0).NIL, s).

Solve(NIL, NIL, prog, stack, stack).
Solve(NIL, Node(g,p,rest,par,b,f).s, prog, stack1, stack2) <-
    Solve(rest, par, prog, stack1, stack2).
Solve(goal1.goal2, par, prog, stack1, stack2) <-
    Solve1(goal1, prog, goal2, par, prog, stack1, stack2).

Solve1(C("/"), procs, rest, par, prog,
       Node(g2,p2,r2,par2,b2,f2).s2, stack2) <-
    stack1 = Node(NIL,NIL,rest,par,b2,f2).par &
    Solve(rest, par, prog, stack1, stack2).
Solve1(goal, (head.subgoals).procs, rest, par, prog, stack1, stack2) <-
    ~goal=C("/") &
    Bindings(stack1, b1, f) &
    Match(head, f.goal, b1, b2) &
    Solve(subgoals, stack1, prog,
          Node(goal,procs,rest,par,b2,f).stack1, stack2).
Solve1(goal, (head.subgoals).procs, rest, par, prog, stack1, stack2) <-
    ~goal=C("/") &
    Bindings(stack1, b1, f) &
    ~Match(head, f.goal, b1, b2) &
    Solve1(goal, procs, rest, par, prog, stack1, stack2).


Solve1(goal, NIL, rest, par, prog,
       Node(old,procs,rest1,par1,b,l).stack1, stack2) <-
    Solve1(old, procs, rest1, par1, prog, stack1, stack2).

Bindings(Node(g,p,r,par,b,l).s, b, f) <- Sum(l, 1, f).

Instantiate(goal, Node(g,p,r,par,b,l).s, soln) <-
    Bindlist(goal, 0, soln, b).

Since we are representing sequence and backtracking, it is possible to represent the other control mechanism of Prolog - the "cut" or "slash" symbol, "/". This is given as the first clause of an "inner loop" of Solve, called Solve1, which has the parameters:

Solve1(goal, procs, rest, parent, program, stack1, stack2)

The effect of the cut predicate is to remove the possible backtracking points within the procedure, e.g. given the procedure

A <- B & / & C.
A <- D.

then once B has succeeded and the cut has been passed, backtracking into B and any use of the second clause of A are prevented. We will discuss later the merits and disadvantages of this predicate; its semantics are shown in the first clause for Solve1.

Using the cut predicate we can define a "negation as failure" predicate. This is written:

~(a) <- a & / & FAIL.
~(a).

133 A Formal Definition of Prolog

using the "metavariable" facility in the first clause. To show "not a" one attempts to show a. If this succeeds one evaluates the cut predicate (which succeeds) and the FAIL predicate, which has no defining clause and therefore fails. At this point backtracking would normally occur, but this is prevented by the cut predicate, which fails its parent. Thus if a succeeds, ~a fails. Alternatively, if a fails, the second clause is chosen and ~a succeeds.
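This behaviour can be sketched in a few lines, assuming some enumerator of a goal's solutions; taking at most one element of the enumeration plays the role of the cut (a sketch in Python, not the thesis's code):

```python
def naf(solutions):
    """Negation as failure: ~a, given an iterator over a's solutions.
    Taking at most one element plays the role of the cut; FAIL then
    rejects that branch, so ~a fails exactly when a succeeds."""
    for _ in solutions:
        return False        # a succeeded: cut, then FAIL
    return True             # no solution for a: second clause, ~a holds

def no_solutions():
    return iter(())

def one_solution():
    return iter((('x', 42),))
```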

Interpreting the cut predicate operationally, the stack is replaced by the subset of the stack represented by the parent of the call to cut, with an extra node representing the bindings and levels established in the intervening calls. This is in fact slightly different to most implementations, which store all bindings on the stack so that they can only be deleted on backtracking.

The other three clauses deal with three mutually exclusive cases: when the goal matches with the first procedure; when it does not; and when there are no more clauses left. Note that the call to Match in the third clause is simply a check, so that the second and third clauses would be written with an IF..THEN..ELSE in a conventional program (or using cut in Prolog).

The remainder of the program (the unification routines) is as in the previous section.

Discussion

The idea of demonstrating the implementation of a language at several levels has been well explored for a Lambda Calculus based language by Bjorner (1977). For the simple block structured language SAL he demonstrates four levels of interpreters and three levels of compilation. This shows some of the choices open to the implementer and the concrete details which make certain aspects more explicit.

The same may be observed in the three levels of definition of Prolog. The first level is rather more concrete than a fixpoint definition of Prolog but leaves many aspects of the implementation undefined: for instance, the method of handling variables (which is crucial to the usefulness of resolution) and the order of evaluation of the literals within a clause.

The second level makes the method of handling variables explicit but still says little about the ordering of the clauses (except insofar as this is implied by the bindings) or whether backtracking or some other method of evaluation is to be used (e.g. using parallel evaluation). The third level is much more specific and rather more restrictive. It specifies backtracking and strict left-to-right depth first evaluation.

It is important, though, to remember that all three are intended as specifications: any implementation which yields the same results is permissible. For instance, the use of structure sharing for variables is a convenient but by no means obligatory method. Any other method which produces the same results would be acceptable.

The introduction of the cut predicate does, however, raise several interesting questions.

(1) Its introduction is only meaningful at the third level, when the notions of left-to-right evaluation and backtracking have been introduced. Its semantics cannot be given at either of the higher levels.


(2) It requires the incorporation into the node of the variable 'parent', which refers not to a logical term or variable but to the proof stack itself. Cut is thus in some sense a meta-level facility.

(3) The method by which it is introduced, which involves naming the proof stack, is in some ways similar to the method of continuations employed for handling jumps (see section 4.3).

The introduction of the cut predicate has been contentious in Prolog circles. It has frequently been compared to the 'GOTO' statement (and this is supported by point 3 above) and has given rise to some very obscure programming, with the result that there have been calls for its abolition or severe restriction. Is it therefore possible to give it a "declarative" rather than an "operational" semantics?

One of the novel features introduced into mathematics by computing has been the conditional statement and expression. McCarthy (1962) uses it as the basis of his mathematical approach to computing and Manna (1974) has shown that it can be used as a complete replacement for all the logical operators, e.g. 'Not P' can be represented as IF P THEN FALSE ELSE TRUE.

It would be attractive to have such a feature in Prolog. A natural equivalent of the clause

A <- B THEN C ELSE D.

would be

A <- B & C.
A <- ~B & D.


However, in most Prolog systems these are not equivalent statements, whether implemented using 'Cut' or by 'negation as failure' (Clark, 1978). The reasons are twofold:

(1) Implementations using cut do not allow unrestricted backtracking to B in the first clause. The equivalent statements are

A <- B & / & C.
A <- D.

which is equivalent to

A <- Oneof(B) & C.
A <- ~B & D.

where Oneof(x) is a metapredicate which returns only a single solution of its parameter.

(2) If there are variables shared between B and the other predicates, then both B and ~B can be true for discrete ranges of that variable.

e.g.

A(x) <- B(x) & C(x).
A(x) <- ~B(x) & D(x).
B(A).
C(B).
D(B).

This gives different answers depending on the order of evaluation of the subgoals.
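The divergence can be checked concretely. Encoding each predicate as the set of constants for which it holds (my encoding), the two subgoal orders for the second clause give different answers for the query A(x):

```python
# Facts: B(A). C(B). D(B).  Each predicate is represented by the set
# of constants for which it holds; the query is A(x).
B, C, D = {'A'}, {'B'}, {'B'}

def clause1():
    # A(x) <- B(x) & C(x): generate x from B, then test C(x).
    return sorted(x for x in B if x in C)

def clause2_neg_first():
    # A(x) <- ~B(x) & D(x): x is still unbound when ~B(x) runs, and
    # negation as failure fails because B(x) has *some* solution.
    if B:
        return []
    return sorted(D)

def clause2_d_first():
    # The same clause with the subgoals reversed: D(x) & ~B(x).
    # Now x is bound by D(x) before the negation is tried.
    return sorted(x for x in D if x not in B)
```

Under one order the program has no answer for A(x); under the other it answers A(B), which is the order-dependence described above.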

When we ask what the utility of the conditional construct is, it emerges that it is basically useful as an extension to clausal form for distinguishing several cases of the input. Thus we are only interested in establishing a single value of the predicate in the condition (so objection 1 does not matter) and we are concerned to constrain the order of evaluation so that the condition is evaluated before the body (so objection 2 is not important).

Hence from this point of view, a sequence of clauses containing the cut may be considered a sensible equivalent to a conditional or case statement, as follows.

Prolog Pseudo Algol

A <- B & / & C.          A := if B then C
A <- D & / & E & F.           else if D then E & F
A <- G.                       else G

This structured use of cut, when it is used in every clause except perhaps the last, overcomes the conceptual difficulty of the predicate.

It also points out the key requirements for the equivalent of the cut in versions of Prolog which do not follow the Left-to-Right Depth-First regime: namely that predicates on the left must be evaluated before those on the right, that only one solution of those on the left is required, and that when a left hand side is evaluated to be true, the system is "committed" to that clause. To draw the parallel with Dijkstra's "guarded commands" even more strongly, we should insist that the left hand sides of different clauses may be evaluated in any order (with the exception of the default), so that the order of the clauses does not affect the semantics.
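This committed-choice reading can be sketched directly: evaluate the guards, take one solution of a successful guard, and commit to that clause (a Python sketch; the names are mine):

```python
def committed_choice(clauses, default):
    """clauses: (guard, body) pairs, e.g. A <- B & / & C gives (B, C).
    One solution of a guard suffices, and once a guard is true the
    system is committed to that clause; 'default' is the final,
    guardless clause (A <- G)."""
    for guard, body in clauses:
        if guard():                 # only one solution of B is required
            return body()           # committed: later clauses are cut off
    return default()

# A <- B & / & C.   A <- D & / & E & F.   A <- G.
result = committed_choice(
    [(lambda: False, lambda: 'C'),        # B fails
     (lambda: True, lambda: 'E & F')],    # D succeeds: commit here
    lambda: 'G')
```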

Chapter 4.3

Definition of an Algol 68 Subset

"The Lord said, Go to, let us go down and there confound their language that they may not under- stand one another's speech'" Gen. 11/4 A.V. (1611)

The ASPLE language avoids dealing with some of the most important and tricky features of programming languages - such as block structure, procedures and jumps. In this section we expand the language so that it is closer to Algol 68. Thus most constructs return values, the full procedure mechanism is allowed, together with generalised jumps. The full text of the definition is given in Appendix A.

However, significant features of the language - such as parallel evaluation, arrays, structures etc. - will be omitted. Also certain other features, such as parameterless procedures and widening, are omitted in order to simplify the range of permitted coercions. These were omitted chiefly because of pressures on space and time. Apart from parallel processing, they pose few difficulties and would add little to the exposition.

The lexical syntax is different to that of ASPLE in only one important respect. Algol 68 requires two typefaces - 'bold' and 'small'. These may be implemented using capital and lower case characters, or, when using one case throughout, by enclosing the bold symbols in strops (e.g. 'BEGIN'), or preceding them with a quote ('BEGIN) or a period (.BEGIN). We assume that the lexical syntax transforms any


of these conventions into the standard one, using upper and lower case. Given this specification, the details of the lexical syntax are omitted.

Block Structure

In the Algol languages, blocks may be placed almost anywhere in the program. Local variables introduced inside a block have a scope which is the same extent as the block. Procedures are essentially recursive, so that a number of 'activations' of a block may be in existence at the same point of time. Algol 68 extends this in several ways: declarations may occur anywhere in a block, up to the first occurrence of a label in that block; 'identities' may be declared which cannot be assigned to again; and the procedure mechanism is more general, including procedures used as variables and unnamed procedures.

Several function symbols must be incorporated into the relational definition to model these constructs. We make no apology that these are "stack-like", as the introduction of such a mechanism seems to be the clearest way of introducing the concepts involved. It has been shown by Russell (1977) that the stack model is equivalent to the more abstract denotational model. Hence there is no loss of generality. As in ASPLE, the 'parse' and 'execution' phases are separated and the stack structures are different for each phase.

With block structure a name is no longer sufficient to identify a variable in the semantics: therefore we use a composite value - Id(tag,offset). The parameter 'offset' indicates the difference between the static nesting level of the block in which the variable is declared and the level in which it is used. Thus in the assignment statement in the program:


BEGIN INT i;
   BEGIN INT j;
      j := i; ...

'j' is identified by Id("j",0) and 'i' by Id("i",1).
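Identification can be sketched as a walk outwards through the environment, counting the ranges crossed; this Python rendering of the fragment's environment (the dictionary encoding is mine, and the mode strings are schematic) yields the offsets quoted above:

```python
# Range(declarations, environment) as a pair; the fragment's
# environment is Range("j":Ref(INT).NIL, Range("i":Ref(INT).NIL, r)).
outer = None
env = ({'j': 'Ref(INT)'}, ({'i': 'Ref(INT)'}, outer))

def identify(tag, env):
    """Return Id(tag, offset): offset counts the enclosing ranges
    crossed before the declaration of 'tag' is found."""
    offset = 0
    while env is not None:
        decs, enclosing = env
        if tag in decs:
            return ('Id', tag, offset)
        env = enclosing
        offset += 1
    raise KeyError(tag)          # undeclared identifier
```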

When applying the context sensitive scope restrictions in the grammar rules, the following structure is used to name the environment:

Range(declarations, environment)

where the first parameter is a list of declarations introduced at this level and the second is the declarations at outer levels. For the program fragment above this would be:

Range("j":Ref(INT).NIL, Range("i":Ref(INT).NIL, r))

where r indicates the surrounding ranges.

The introduction of blocks in the language may be illustrated by the productions for Closed Clause and Series which are reproduced below in slightly telescoped form:

ClosedClause(mode,env,Block(stms)) ->
    "BEGIN" Series(mode,Range(props,env),decs,labs,stms) "END"
    & Append(decs,labs,props).

There are three outputs of the serial clause 'Series': the declarations made in the block, the labels in the block, and the statements output. Labels are separated from other declarations for several reasons: primarily because in certain positions, such as the 'test' ('enquiry') part of a conditional, labels are not allowed, and separating the label definitions allows this to be specified easily. The second parameter of Series is the environment for the inner range, which is derived from the 'decs' and 'labs' which are themselves output from the statements within the block. As this is a nearly circular definition, it would not run efficiently as a Prolog program in this form without the use of coroutining. The system would always assume that the variable was declared in this block, causing excessive backtracking if it was not. See also chapter 5.1.

The semantics of blocks require a new definition of 'State', used to model the machine contents at run-time. This has three components (ignoring transput, which can be added following the model of the ASPLE definition):

State(Stack, Heap, Continuation)

Stack provides all local storage for names and values in a series of "frames". Heap provides storage for items which do not follow the stack mechanism. Continuation will be discussed later, in the section on jumps; it may be omitted from the definition if full jumps are not permitted.

The semantic representation of values in Algol 68 is more complex than in ASPLE but allows a very straightforward treatment of pointers and parameters. Values may be stored in the stack or the heap and there is a clear distinction between (what is called) the name of a location and its contents. The names of locations are described by two function symbols:


Loc(number,dynamiclevel) - for values on the stack
Heap(number) - for values in the heap

where 'number' is a unique identifier for that location and 'dynamiclevel' describes the stack frame in which the location is declared.

Accessing the value of a variable is a two-stage process in the semantics:
(1) From Id(tag,offset) look up the name on the stack, which will yield either Loc(n,dl) or Heap(n).
(2) Look up this value in either stack or heap.
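The two stages can be sketched over a toy stack; here a frame keeps only the fields needed for access (its dynamic level, names of locals, values of locals), and Loc/Heap names are tagged tuples (the encoding and the sample values are mine):

```python
# Stage 1: Id(tag, offset) -> Loc(n, dl) or Heap(n), found in the
#          frame 'offset' blocks out from the current one.
# Stage 2: that name -> the stored value, in the stack or the heap.
heap = {7: 'heap value'}
frames = [                                  # innermost frame first
    {'level': 2, 'names': {'j': ('Loc', 0, 2)}, 'values': {0: 5}},
    {'level': 1, 'names': {'i': ('Loc', 0, 1), 'd': ('Heap', 7)},
     'values': {0: 9}},
]

def access(tag, offset):
    name = frames[offset]['names'][tag]     # stage 1
    if name[0] == 'Heap':
        return heap[name[1]]                # stage 2, heap storage
    _, n, dl = name                         # Loc(number, dynamiclevel)
    frame = next(f for f in frames if f['level'] == dl)
    return frame['values'][n]               # stage 2, stack storage
```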

The stack is composed of a number of frames. A frame is a function with 7 parameters:

Frame(display, value of this block, names of locals, values of locals, rest of the stack, statements in this block, previous continuation)

The second parameter holds the value returned by this block. The third and fourth parameters have already been assumed: "names of locals" is a list of pairs mapping the identifiers declared to their names (Loc or Heap). "Values of locals" is simply a list of the current local variables. The "rest of the stack" is the other frames, and the last two parameters can be ignored for the moment.

To understand the dynamic behaviour of the stack we must consider the first parameter - 'display'. This is a list of frames which are accessible to this block; the first


item is a number which names the current frame. Each frame is numbered in this way, starting with 1 for the global block. There are two ways in which the display is built up:
(1) At a normal block entry a new frame is created which has the old frame (as parameter 5) and a display consisting of the old display with the next higher number at the front.
(2) For a procedure call there is a two stage process. When the procedure is declared, what is created and stored as the value of the procedure is a closure, which is a three-place function:

Closure(formal parameters, body of procedure, current display)

When the procedure is evaluated, the new display is the display in the closure, which applies to the declared environment, not to the calling environment, together with the new frame level added to the front.
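The two ways of building the display can be sketched directly, with a display as a list of frame numbers, newest first (the encoding is mine):

```python
def enter_block(display, new_frame):
    """(1) Normal block entry: the old display with the new frame
    number added at the front."""
    return [new_frame] + display

def make_closure(formals, body, display):
    """Declaring a procedure stores a closure holding the *declaring*
    display, so calls will run in the declared environment."""
    return ('Closure', formals, body, display)

def call(closure, new_frame):
    """(2) Procedure call: the new frame number plus the closure's
    display - not the caller's display."""
    _, formals, body, decl_display = closure
    return [new_frame] + decl_display

global_display = [1]
p = make_closure(['x'], '...body...', global_display)    # declared globally
caller_display = enter_block(enter_block(global_display, 2), 3)
```

Even though the caller's display is [3, 2, 1], invoking p builds its display from the declaring environment, which is what gives static scope.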

To see how this applies to parameters, consider a procedure whose declaration is:

PROC test=(INT value, REF INT ref, PROC(INT)VOID p) VOID:
   BEGIN ... END;

and is called by the statement test(i,j,test).

For the first parameter, the required mode is INT and both stages of the evaluation are invoked to yield the value of i. For the second (call by reference) only the first stage is invoked, and what is passed is Loc(n,dl) or Heap(n). Hence when the procedure assigns a value to 'ref' the value is placed in j's location.

Thus to pass a procedure as parameter, one simply finds the value of the procedure (a closure) in the normal way


and sets the value of the formal parameter to that closure. When the parameter is invoked, the correct environment is automatically set up. We might note that the example above illustrates a procedure being passed as a parameter to itself. This causes no problems to the semantics or the description method.

Finally, to see how declarations are handled we will examine the declaration of variables. Examples of valid declarations are:

BOOL a,b,c;
HEAP REF INT d;
INT e:=2, f:=x+y, g;

The second declaration shows a pointer variable which is to be kept on the heap. The third demonstrates ways in which a variable can be initialised within the declaration. The productions for this are:

Declaration(decs,env,val) ->
    Generator(env,mode,gen)
    JoinedDefinition(Var(mode,gen),env,val).
Generator(env,Ref(mode),val) ->
    ( ["LOC"] & val=LOC(mode)
    | "HEAP" & val=HEAP(mode) )
    Mode(Actual,mode,env,n).
Definition(Var(Ref(mode),gen),env,dec,NewVar(gen,id,val)) ->
    DefiningIdentifier(Ref(mode),env,dec,id)
    ( ":=" Unit(mode,env,val)
    | NIL & val=SKIP ).
DefiningIdentifier(mode,env,tag:mode,tag) ->
    @Id(tag) & Unique(tag,mode,env).

The 'Generator' defines whether the location of the variable is to be on the stack or the heap, and the mode of this is always a reference to some mode. The 'joined definition' (whose productions are not shown) allows several

variables to share the same 'definition'. The output from 'definition' is a function:

NewVar(gen,name,val)

where 'gen' names the generator, a function which allocates the correct space on either stack or heap, 'name' is simply the tag for the variable and 'val' is the expression that must be evaluated to initialise the variable. The semantics of NewVar is given as

Semantics(NewVar(gen,name,source),NoVal) =>
    Semantics(gen,locn) // Semantics(source,val);
    Set(name,locn) // Update(locn,val).

Thus one first evaluates the generator and the source collaterally (this is the meaning of //, see last section). One then calls the procedure 'Set' which assigns a pair in the list 'names of locals' and 'Update' which replaces the value in the list 'values of locals'. Note that Set is also invoked in the declaration of identities (in which case the second parameter is a value) and Update is also used in the assignment statement.
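The NewVar rule above can be sketched operationally. This is an illustrative Python rendering only; the names set_local, update_locn and new_var are invented, and dictionaries stand in for the 'names of locals' and 'values of locals' lists of a frame.

```python
# Sketch of Set and Update as operations on a frame's two tables.

def set_local(names, name, locn):
    # Set: record that 'name' denotes location 'locn' in this frame.
    names[name] = locn

def update_locn(store, locn, val):
    # Update: replace the value held at location 'locn'.
    store[locn] = val

def new_var(names, store, name, init):
    # NewVar: run the generator (here a trivial one allocating the next
    # free location), evaluate the source, then Set and Update,
    # mirroring the semantic rule above.
    locn = len(store)
    set_local(names, name, locn)
    update_locn(store, locn, init)
    return locn
```

In the thesis the generator and source are evaluated collaterally; here they are simply sequenced, since either order gives the same final frame.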


Jumps

There is a basic problem with handling jumps in any systematic way using the clauses we have described. This is that they assume that the actions will always be completed properly (or not at all). For instance, a sequence of two statements is represented as a list and may be given in the semantics as:

Semantics(a;b, val) -> Semantics(a,c); Semantics(b,val).

and is given in the underlying Prolog statement as:

Semantics(a;b,val,s1,s3) <- Semantics(a,c,s1,s2) & Semantics(b,val,s2,s3).

If the statement 'a' happens to be a GOTO statement, then the following statement 'b' does not need to be executed, and the final state 's3' cannot be achieved. This also applies to abnormal termination due, for instance, to overflow in an expression. It will be remembered that in the definition of ASPLE this difficulty was 'patched over' by always testing the 'OK' state of the processor before executing a statement. If the processor was not 'OK', then the statement was skipped. Although possible, this method does not work well in Algol 68, where expressions may contain statements which may themselves include jumps. The following is perfectly allowable:

x := y / IF z≠0 THEN z ELSE GOTO divby0 FI;

In this case, the division is never completed if z = 0. Since almost any construct in Algol 68 can return a value, the amount of checking that would go on would swamp the rest of the definition in irrelevant detail and the implementation of jumps would be rather messy.

The key to the resolution of this dilemma is to consider the way in which relations, or functions, are composed together in the definition and to make the remaining actions into a parameter of the function. This method, called continuations, was devised by Strachey and Wadsworth (1975) for denotational semantics and can easily be applied to the case of relational composition. (We use the method they call "impure continuations").

Consider the simple sequential phrase a;b for which the normal Prolog representation was given above. Instead of translating it in the normal fashion, we will translate it using another predicate name, called Do, and treat the non-terminal symbols as function names in the following fashion:

Do(Semantics(a;b,val), State(stack,heap,cont), s2) <-
    Do(Semantics(a,c), State(stack,heap,Semantics(b,val);cont), s1) &
    Continuation(s1, s2).

The third parameter of the state, cont, is the continuation. Before 'a' is evaluated, the other action 'b' is pushed onto the continuation, which forms a stack within the state. Assuming that a is a single action which completes normally, the relation 'Continuation' will then pick the clause 'Semantics(b,val)' from the front of the stack and execute it in the same way. Thus the first clause for Continuation (the others deal with the case of block exit) is simply:

Continuation(State(s, h, b;c), s1) <- Do(b, State(s,h,c), s1).
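The Do/Continuation pair above can be sketched as a small interpreter. This is an illustrative Python analogue, not the thesis's clauses: tuples stand in for Semantics terms, a list of values stands in for the state, and the continuation is a list of pending actions. The 'goto' case anticipates the jump treatment that follows.

```python
# Sketch of impure continuations: the continuation is a stack of
# remaining actions carried alongside the state.

def run(action, state, cont):
    kind = action[0]
    if kind == "seq":                 # ('seq', a, b): push b, run a
        _, a, b = action
        return run(a, state, [b] + cont)
    elif kind == "prim":              # ('prim', f): one normal action
        state = action[1](state)
    elif kind == "goto":              # ('goto', rest): discard the
        cont = action[1]              # supplied continuation entirely
    return continuation(state, cont)

def continuation(state, cont):
    # Pick the next action off the front of the stack, if any.
    if not cont:
        return state
    return run(cont[0], state, cont[1:])
```

Note how the 'goto' branch simply ignores the continuation it was handed and substitutes a new one, which is precisely the mechanism used for jumps below.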

A jump can now be handled by ignoring the continuation that is provided and substituting a new one. Let us deal first with local jumps (within the same block). The abstract syntax of a jump is Goto(label,offset) where offset indicates the (static) block level in the same way as in the case of identifiers. For a local jump, this offset is 0. As indicated in the earlier section, the sixth parameter of the frame is the list of statements for the block. The continuation after a jump comprises the values of the list of statements in the block after the label. This is represented by the following statements:

Semantics(Goto(label,offset), x) => Jump(label,offset).

Do(Jump(label,0), State(stack,heap,cont), State(stack,heap,Semantics(rest,val))) <-
    stack = Frame(a,val,b,c,d,stms,e) &
    FindCont(label,stms,rest).

where FindCont extracts the rest of the statements in the block after the label.
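FindCont itself is a simple list search. The following sketch is illustrative only: the ('label', name) marker representation of statements is an assumption, not the thesis's abstract syntax.

```python
# Sketch of FindCont: given a label and the block's statement list,
# return the statements after the label; None if the label is absent.

def find_cont(label, stms):
    for i, s in enumerate(stms):
        if s == ("label", label):
            return stms[i + 1:]
    return None
```

The returned tail becomes the new continuation for the block, replacing whatever actions were pending before the jump.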

One might note the role of the second parameter of Frame - this holds the value of the block. In Algol 68 every block can return a value. In normal execution this is handled by the second parameter of Semantics. However, this action is destroyed by a jump. The variable 'val' thus provides the linkage by which the value is returned (and would be an equally suitable mechanism for describing the 'VALOF-RESULTIS' pair in BCPL). Note that the 'value' part of the semantics of the Goto statement is an unreferenced variable, but not the value 'NoVal' which is used elsewhere. If it were the latter and the GOTO was the last statement in the block, then the result of the block would be NoVal. However, because of the 'EXIT' statement (described in the appendix), this is not necessarily so.


Non-local jumps present few difficulties now that the groundwork has been laid. In this case the 'offset' parameter and the display are used to find the frame in which the label occurs, and the stack is replaced by this frame, discarding the rest of the stack. The clause which describes this is:

Do(Jump(label,offset), State(stack,heap,cont), State(frame,heap,Semantics(rest,val))) <-
    FindFrame(offset,stack,frame) &
    frame = Frame(a,val,b,c,d,stms,e) &
    FindCont(label,stms,rest).

The clauses to handle jumps are in fact more complicated than most other parts of the semantics. This is not necessarily a disadvantage. Jumps are complex operations in a block-structured language and it is appropriate that this should be apparent in the semantics.

Collateral Actions

Many actions in Algol 68 are defined to be 'collateral', by which is meant that their constituent actions can be evaluated in any order. Examples are the two branches of an expression and any number of declarations joined by commas, as well as the collateral statement itself. It is simple to give a non-deterministic program to allow for this, but inconvenient to have to specify this for each case.

Hence a new operator will be introduced into the syntax, implemented by means of a generic rule of the type discussed in chapter 2.2 to signify 'collateral', written '//'. Its definition is simply:

a // b -> a ; b | b ; a.


This is an example of a non-determinate specification. In any particular program only one of the two branches will be used, but the implementor is free to choose either.

This operator is suitable when only two actions are to be taken in either order, but if a number of actions may be merged it is insufficient. This may be illustrated by taking three actions, which will be implicitly paired:

a // b // c = a // (b // c)

By the above clauses, a can come before or after b and c, and b and c can be interchanged, but a can never be interleaved between b and c. Another definition must be given.

In the abstract syntax, a number of collateral actions are represented by a list and the result is also a list. The semantics of this is given as follows:

Semantics(a.b, val) => DoCollateral(a.b, val).
DoCollateral(a, val) =>
    Select(a,val, a1,v1, a2,v2);
    Semantics(a1, v1);
    DoCollateral(a2, v2).
Do(Select(a1.a2, v1.v2, a1, v1, a2, v2), s, s).
Do(Select(a1.a2, v1.v2, a3, v3, a1.a4, v1.v4), s, s) <-
    Do(Select(a2, v2, a3, v3, a4, v4), s, s).

Here the procedure Select is the non-deterministic one. From a list of actions and values it selects one pair and returns the others as a list. It has two clauses: the first selects the first item in the list, the second chooses some other value. Since either clause can be taken, any of the possible actions can be chosen first. The clause DoCollateral selects one pair, executes it and then does the remaining pairs in the same way. In this way any of the

possible permutations of order can be allowed.
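The Select/DoCollateral scheme can be sketched by enumerating its choices explicitly. In this illustrative Python version (the generator-based encoding is my own, not the thesis's), select yields each "chosen pair plus remainder" in turn, corresponding to repeated use of the two Select clauses; iterating it recursively enumerates every permutation of the actions.

```python
# Sketch of nondeterministic Select: first clause takes the head,
# second clause chooses from the tail, so every element can be chosen.

def select(items):
    for i in range(len(items)):
        yield items[i], items[:i] + items[i + 1:]

def orders(items):
    # DoCollateral: choose one item, then do the rest the same way.
    if not items:
        yield []
    else:
        for chosen, rest in select(items):
            for tail in orders(rest):
                yield [chosen] + tail
```

For three actions a, b, c this admits all six orders, including b, a, c - the interleaving of a between b and c that the pairwise '//' definition could not produce.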

This treatment needs to be augmented outside of the current syntax of Prolog by the observation that only a single result is required even though two ambiguous clauses are given.

It is obviously very much easier to introduce collateral actions into a relational definition than into a functional definition, although there are aspects of this treatment that are less than totally satisfactory. For an extended account of the mathematics required to introduce collateral evaluation using denotational semantics, see de Bakker (1980). The full treatment of parallelism in Algol 68 introduces further problems that have not yet been solved.

152 Chapter 5.1

Applications

Prototype Versions of a Language

One of the merits of a Prolog-based definition is that although it is entirely formal, it is runnable as a program. This opens new possibilities for the language designer. It becomes possible to construct the specifications of a new language and test it out in a much shorter time and with a much greater certainty that implementation errors have not been included. Other systems that may be adapted to this purpose include program proving systems such as Edinburgh LCF (Gordon, Milner, Wadsworth 1979) and the compiler generator SIS (Mosses 1978) which are both based on denotational semantics.

In fact the main addition that must be made to turn an M-grammar definition into a prototyping system is the handling of error situations (but see section 5.2 also for limitations on acceptable syntax). This fact raises interesting questions as to how far these should be specified in the original language.

Although Prolog definitions are runnable for small example languages such as ASPLE, with small example programs, it would not be feasible to test a large-scale language such as Algol 68 on reasonable size programs using existing Prolog implementations. There are limitations in both time and space, and there are two main tactics which could be used to overcome these - coroutining and the exploitation of determinate computations.

153 Prototyping a language

(1) Prolog is a strictly functional language, with very few inherent notions of sequence. Hence there is much scope for using coroutining and parallel evaluation strategies. Coroutining supervisors have been experimented with from the beginning on the Marseilles and Edinburgh Prolog systems and built in at a much more basic level into the IC-Prolog system (Clark, McCabe 1979). The facilities for parallelism are less developed, but also under investigation.

The top level of the ASPLE definition may be written using the annotation facility of IC-Prolog as

ASPLE(text, input, output, result) <-
    LEXEME(tokens^, text, NIL) &
    MORPHEME(tree^, mem, tokens?, NIL) &
    SEMEME(tree?, STATE(mem, input, output, OK), STATE(mem1, in1, NIL, result)).

where '^' signifies that the predicate is a producer of the value and '?' that it is a consumer. One can say alternatively that '^' signifies an output and '?' an input variable.

With the data-flow coroutining used in IC-Prolog, '?' becomes an 'eager' consumer and '^' a 'lazy' producer. This works very well in the case of the lexical and syntax analysis: immediately some variable is bound by the lexical analysis (the lazy producer), it is consumed by the syntax analysis phase (the eager consumer). Control is then returned to the lexical analysis, and so on. If this is coupled with the detection of determinate analysis in both of these computations, it should be possible to run both the lexical and syntax analysis in bounded and reasonable memory capacity.
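The lazy-producer/eager-consumer regime can be sketched with generators. This is only an analogy in Python (the names lexeme and morpheme echo the ASPLE definition, but the encoding is mine): the trace records that lexical and syntax analysis strictly alternate, one token at a time, rather than the whole token list being built first.

```python
# Sketch of data-flow coroutining: the lexer produces tokens lazily,
# the parser consumes each one as soon as it is bound.

trace = []

def lexeme(text):
    # Lazy producer: emits one token at a time, on demand.
    for tok in text.split():
        trace.append("lex:" + tok)
        yield tok

def morpheme(tokens):
    # Eager consumer: handles each token as soon as it is produced.
    tree = []
    for tok in tokens:
        trace.append("syn:" + tok)
        tree.append(("id", tok))
    return tree

tree = morpheme(lexeme("begin x end"))
```

Because control shuttles between the two phases, only one token is ever in flight, which is the bounded-memory behaviour the text anticipates.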


A multipass syntax analyser could also be evaluated directly using coroutining. A typical case is the checking of identifiers. In the Algol 68 definition the top level of this (using the annotations above) is:

SerialClause(mode,env,val) ->
    Series(mode,newenv?,decs^,labs^,stms^)
    Newrange(decs?,labs?,stms?,env,newenv^,val).

The essential limitation is that 'newenv', the environment of declared identifiers within the serial clause, will be incompletely specified on the first pass, and must not be instantiated by this use of the identifier. IC-Prolog would commence this goal immediately and perhaps complete it (if the definition was found), but it might suspend execution if it found a variable in the 'newenv' tree. (Note that this clause is overspecified for IC-Prolog; some of the annotations should be omitted.)

The generalised coroutining facilities in IC-Prolog were not designed for efficiency and a rather simpler interpreter could be envisaged which would be adequate for these tasks. Predicates such as 'Identified', which checks the scope of identifiers, could simply be tagged with the pass number in which they can be evaluated. Then evaluation would consist of a series of left-to-right passes over the proof tree in which predicates for later passes are suspended and reactivated on later passes.

Coroutining between passes is not particularly relevant for the semantics evaluation phase. It is in any case simple to store the abstract syntax tree produced and proceed with the semantics phase separately, as is done in most 'compile and load' systems.


(2) Detection of non-determinacy. One of Prolog's strengths is that it can handle non-determinacy with acceptable efficiency. But non-determinacy has its costs - a time cost, spent in needless backtracking, and a space cost, because the backtracking points that are left mean that stack space is not recovered on exit from a procedure. The ability to handle non-determinacy is particularly useful in handling the syntax analysis of a prototype language which has not been converted to one of the standard forms. Judicious use of the cut procedure can reduce the cost of unnecessary backtracking during this phase.

But in the lexical analysis and semantics phases the non-determinacy carries a cost which brings very few benefits. The tail recursion optimisation (Warren 1980) goes some way towards improving the situation, but more careful analysis seems to be necessary if Prolog is to become a useful prototyping tool.

The equivalence of regular grammars and finite state machines is a useful starting point for this analysis. A program which could take regular grammars expressed in M-grammar form and convert them to acceptable code based on the finite state representation would have many applications - including its incorporation into the Prolog system itself. (Current Prolog systems have ad-hoc tokenizers which are not suitable for all applications.) An example of such a finite state recognizer is provided by the lexical analysis of the ASPLE compiler in Appendix B.
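The target of such a conversion - a table-driven finite state recognizer - can be sketched as follows. The state names, character classes and token classes here are illustrative assumptions, not those of the ASPLE lexer in Appendix B.

```python
# Sketch of a tokenizer driven by a finite-state transition table,
# the deterministic code one would generate from a regular M-grammar.

def char_class(c):
    if c.isalpha():
        return "letter"
    if c.isdigit():
        return "digit"
    return "other"

# (state, class) -> next state; a missing entry ends the current token.
TABLE = {
    ("start", "letter"): "ident", ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
    ("start", "digit"): "number", ("number", "digit"): "number",
}

def tokenize(text):
    tokens, state, cur = [], "start", ""
    for c in text + " ":               # trailing sentinel flushes the last token
        nxt = TABLE.get((state, char_class(c)))
        if nxt:
            state, cur = nxt, cur + c
        else:
            if cur:
                tokens.append((state, cur))
            state, cur = "start", ""
            nxt = TABLE.get(("start", char_class(c)))
            if nxt:                    # the current character starts a token
                state, cur = nxt, c
    return tokens
```

Because every step is a single table lookup, the recognizer is determinate: no backtracking points are created and stack space is constant, which is precisely the gain over a naive nondeterministic grammar.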

A less obvious feature of any Prolog language definition is the time spent searching symbol tables. In the definitions presented we have opted for simplicity in using linear lists, though binary trees are almost as straightforward to implement. A symbol table is essentially a mapping in which the first elements form a set and the second elements are the values. A language such as SETL provides maps as a basic facility and every Prolog system incorporates at least one symbol table (for predicate names). These inbuilt procedures employ efficient algorithms (such as hash codes) to produce acceptable efficiency. Making them available to the user is primarily a question of system design which future implementers of Prolog are encouraged to consider. Such a feature would also provide a more satisfactory treatment of collateral evaluation (see chapter 4.3).

157 Chapter 5.2

Towards a Logic Compiler-Compiler

One of the first uses of Prolog was Colmerauer's (1975/8) very neat definition of a compiler for a small Algol-like language. This is a classical four-phase compiler consisting of

(1) Lexical analysis
(2) Syntax analysis
(3) Synthesis of assembly code
(4) Allocation of space and Assembly

together with input and output phases. The listing of a compiler for the ASPLE language, which is similar except for the addition of error handling facilities, is given in Appendix B.

However, compilers are generally regarded as "systems programs" for which efficiency is of paramount importance. It is for this reason that assembler was used in writing them for many years, and only now is the use of higher level languages (normally efficient "compiler-writing" languages) becoming standard. It is therefore doubtful that compilers written in Prolog would become widely used, despite the ease with which they can be developed. Even more crucial is the use of space by Prolog systems, which currently prevents the compilation of large segments. Space is always a problem for compilers written in conventional languages, and Prolog cannot as yet compete with these.

The "UNCOL" problem - the combinatorial effect of matching source languages to target machines - is simplified in a Prolog compiler (as in other similar systems) by the adoption of a suitable intermediate level "abstract syntax". However, it is not eliminated as the same analysis of languages and machines must be made.

The classical answer to these problems is the compiler-compiler. Since the first seminal version (Brooker et al. 1963), there have been several widely used versions. Koster's Compiler Definition Language (Koster 1971b) is partially based on Affix grammars, but uses in addition a very flexible system of macros, producing a top-down compiler for any LL/1 grammar. XPL (McKeeman, Horning, Wortman 1970) has also been widely used.

A more recent project is the Production Quality Compiler Compiler (see Aho, 1980) which is attempting for the first time to combine several aspects of the problem that have not been attempted in a general way. These include the areas of global and local optimisations, together with the automatic derivation of code generators - an area which has previously escaped formal methods.

In the following sections we will examine two aspects only of a compiler-compiler: the identification of amenable subsets of M-grammars and the construction of parsers for them; and the automatic derivation of code generators from formal machine descriptions. The question of optimisation, both machine dependent and independent, will not be addressed.

Parsers for M-grammars

The basic problem in constructing a usable parser is to reduce backtracking to a minimum, or eliminate it entirely. Backtracking not only affects efficiency, but more importantly reduces enormously the possibility of accurate error-detection. It is a well-known maxim for compiler-writers that "Any fool can write a compiler for correct source programs" (Rohl 1975). The user of a compiler does not want to know either that "there are errors in this program" or that "the first error occurs on line x", but to have each error pinpointed, and does not wish to be flooded by hundreds of "consequent errors" arising because of inadequate recovery from previous errors.

It is largely because of their better error recovery methods that there has been much greater interest in bottom-up methods of syntax analysis in recent years - for instance LR(1) parsers (see also Aho, Ullman 1977). Most work with Prolog has been done with top-down methods, but some work has been done (Warren 1975) on the use of Earley's generalised parsing method as a basis of Prolog implementation. This is of order n in time and space (where n is the length of the input string) for an unambiguous CFG, but does not have good error detection abilities. Another approach is based on Kowalski's (1974b) connection graph theorem proving method, which can mix both top-down and bottom-up methods.

Our approach is much more specific. Most programming languages have a "context-free" basis, so that errors in the context-sensitive parts may be identified subsequently. It is on the basis of this that efficient parsers have been built. It is therefore necessary to identify and separate those parts of an M-grammar which represent a context-free grammar so that these can be used as a basis of the parse and error recovery, with the context-sensitive parts taking a second place. We are thus following the opposite path to Attribute Grammars, which were generalised from context-free grammars (see Watt, Madsen 77).

This may be illustrated by the ASPLE compiler in Appendix B. The syntax analysis phase is written as a standard M-grammar, but there are three basic differences between the syntax for the compiler and that presented in chapter 4.1:

(1) The grammar for the compiler recognizes, or accepts, any string of tokens that is fed to it. If the program is incorrect it will print out error messages (though for technical reasons these are not given line numbers). Error recovery is attempted in two ways. At the expression level an attempt is made to insert symbols to repair the text. At a statement level, tokens will be skipped until the occurrence of ";", "END", "ELSE" or "FI".

(2) Most ambiguities in the syntax have been removed to reduce backtracking, though it has not necessarily been reduced to LL/1 form. Examples are the use of factoring in expressions and the use of brackets, which in the original syntax were used to introduce integer expressions or relations.

(3) Context-sensitive checks have been 'moved back' so that they only check modes, without causing backtracking over the text. An example is the mode of primaries, where the dereferencing cannot be carried out immediately. They are also made 'complete' in that they will never fail (although they may cause some backtracking).

In order to provide a basis for a parser-generator, we need to examine general methods for turning M-grammars into efficient parsers. Here we will consider only top-down methods of analysis. The most flexible currently in use is the condition for a 'one-track' grammar (see Bornat (1979)). The algorithm for this is, in outline, as follows:

For each production in the grammar, construct a list of all the symbols that can possibly start the production (using Warshall's Closure Algorithm and including the non-terminal itself). Then if the lists for each non-terminal have no repeated symbols, the grammar is 'one-track'. This method will work with M-grammars which do not have variables as initial terminal symbols (and can be extended to those of the form @a & Letter(a)).
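The one-track test can be sketched concretely. In this illustrative Python version (the grammar encoding - quoted strings for terminals, lists of symbols for productions - is an assumption of the sketch), the starter sets are closed iteratively in the spirit of Warshall's algorithm and then checked for disjointness across each non-terminal's alternatives.

```python
# Sketch of the one-track check: compute starter sets by closure,
# then require the alternatives of each nonterminal to have
# pairwise-disjoint terminal starter sets.

def starters(grammar):
    # grammar: {nonterminal: [production, ...]}, production = symbol list.
    begins = {nt: set(p[0] for p in prods if p)
              for nt, prods in grammar.items()}
    changed = True
    while changed:                       # transitive closure of 'can begin with'
        changed = False
        for nt, syms in begins.items():
            for s in list(syms):
                if s in begins and not begins[s] <= syms:
                    syms |= begins[s]
                    changed = True
    return begins

def one_track(grammar):
    starts = starters(grammar)
    for nt, prods in grammar.items():
        seen = set()
        for p in prods:
            first = starts.get(p[0], {p[0]}) if p else set()
            terminals = {s for s in first if s not in grammar}
            if terminals & seen:         # two alternatives share a starter
                return False
            seen |= terminals
    return True
```

A failure report from such a check is exactly the diagnostic the text envisages: it points at the non-terminal whose alternatives clash, leaving the grammar writer to factor or transform.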

There are several transformations that can be performed automatically to improve a grammar which does not meet these requirements (see Foster (1968)). These include factoring and removal of left-recursion. Unfortunately there are no procedures that are guaranteed to produce a one-track grammar. The most that a system can do is to point out where a grammar fails to meet the criteria (though this is obviously better than missing them altogether).

We need to define for a grammar the conditions under which the parameters of the grammar will not interfere with the parsing process. To do this we must follow the example of Attribute Grammars (Knuth 68) in distinguishing "inherited" and "synthesised" attributes. These are more often called "input" and "output", respectively, in Prolog. For each non-terminal, we must distinguish which parameters are synthesised and which are inherited.

Given these definitions we may now distinguish 'defining' and 'applied' occurrences of a variable in the clause, as follows: A defining occurrence of a variable is an occurrence as a term of an inherited attribute on the left hand side of a production, or a synthesised attribute on the right hand side. An applied occurrence of a variable is an occurrence at the base level of a synthesised attribute on the left hand side, or an inherited attribute on the right hand side. For each variable there must be exactly one defining occurrence in each production. In addition, constant and function symbols may only occur in applied positions.

Given these conditions, we may then be sure that the grammar is in fact context-free. We must exclude from this process the "total" conditions discussed earlier.

These conditions are rather more strict than those envisaged by Knuth and correspond closely to those used by Bochmann (1976) and Watt and Madsen (1977). They are however less strict than those used by Koster (1971a) in Affix Grammars, which insist that a defining position on the right hand side must be the first r.h.s. occurrence.

We have now reached a stage at which, at least as far as the syntax analysis is concerned, we can parse an M-grammar without backtracking. What are the benefits of this?

(1) We can envisage using a deterministic form of Prolog for parsing, such as those based on pushdown automata. As has been pointed out by several authors recently (e.g. Bruynooghe 1980, Warren 1980, Mellish 1980), substantial economies both in space and time may be made, while maintaining all the advantages of resolution in handling data structures. For instance, by separating local and global variables, local variables can be entirely deleted on exit from a procedure, so that a more Algol-like regime is encountered. Also the size of the local stack frame is reduced.

(2) Instead of having to read in the whole program first, and then pass it between each procedure, one can envisage a more direct form of coroutining in which the recognition of a terminal symbol leads to the direct reading of the next symbol, including the next calling of the lexical analysis phase.


Automatic Code Generation

An idealised description of code generation might be the following:

One starts with two formal descriptions - one of source language and one of the target language. The semantics of both of these languages are expressed in a common notation.

Given a program in the source language, it is then a task of plan-formation (see Cattell 1978, Warren 1974) to find a program in the target language which achieves the goals expressed by the source program.

To take a very simple example, an assignment statement:

x := y

will probably need to be composed of at least two steps in the target language:

LOAD y
STORE x

(where we assume that problems of storage allocation have already been dealt with). This is a small plan composed of two steps which must be deduced by composition from the semantics of the two instructions. Many other plans to achieve this effect are possible, and we therefore wish to choose the optimal (or near-optimal) solution each time.
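Once such a plan has been found and stored as a template, applying it is trivial. The following sketch shows only the template-application step, for the simple variable-to-variable assignment above; the tree encoding and function name are illustrative assumptions, and the planning that would justify the template is not shown.

```python
# Sketch: applying a fixed LOAD/STORE template for a one-accumulator
# machine to the abstract-syntax tree of a simple assignment.

def codegen(tree):
    # tree: ('assign', destination, source), both simple variables.
    op, dest, source = tree
    assert op == "assign"
    return ["LOAD " + source, "STORE " + dest]
```

Note the asymmetry discussed below: after this plan the accumulator still holds the value of the source variable, a side effect the source program never mentioned.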

Clearly this is an infeasible design for a compiler for various reasons: (1) the same code segments are used over and over again and it is stupid to generate them every time; (2) local and global optimisation get completely confused and the strategies for both are rather different; (3) plan-formation and optimisation are both essentially combinatorial in nature, so that they are impracticable on a large scale.

We should note in addition that translation is not essentially a symmetric task. If a program in language a is translated into language b it is very unlikely that the effect will be precisely the same. For instance, in the above example of an assignment, the target language version results also in the previous value of y being kept in the accumulator. The reverse process of "decompilation" might or might not need to express this fact depending on whether that value is used again. In the presence of jumps and procedure calls it is necessary to do it anyway, for safety. The implication of this is that there will be in general no 1:1 correspondence between two languages.

The method of using a code generator envisaged by Cattell is as follows:

(1) The semantics of source and target language is expressed in some intermediate form. He defines a language called TCOL which is a Tree-based Computer-Orientated Language very close to the abstract syntax used here.

(2) Plan and search techniques are used to generate a series of templates in which the elements of the source language are each represented in the target language. More than one template may be supplied for each to allow for special cases, arranged so that the optimum occurs first.

(3) Code generation takes place by matching the portions of the tree generated for a source language program against the templates and recursively matching subtrees.

This may be illustrated in pictorial form as follows:

165 Towards a Logic Compiler-Compiler

[Diagram: the source-language syntax drives a syntax analyser over the program text; the source-language semantics produces source code trees; code formation, using the code segments, yields target code trees via the target-language semantics; the target-language syntax then drives an assembler to produce the output.]

Fig. 5/2/1

One interesting aspect of Colmerauer's work is the "back-to-back" use of two grammars: the first is used to analyse the source text, and the second to generate the assembler code. The intermediary part between the two is what we have described as the 'abstract syntax', which Colmerauer (following Chomsky) calls the 'deep structure'. This dual use of grammars provides a very neat and convenient way of describing a compiler.

Take for example the generation of arithmetic expressions in ASPLE. The clauses for this are as follows:

Arithexp(a.b.c, v, t) => & Simple(c);
    Arithexp(b, v, t);
    @Code(e.d) & Op(a,e) & /.
Arithexp(a.b.c, v, t) => & Simple(b);
    Arithexp(c, v, t);
    @Code(e.d) & Op(a,e) & /.
Arithexp(a.b.c, v, t) => Arithexp(b, v, t) & t=TEMP.d.t1;
    @Code(STOR.d);
    Arithexp(c, v, t1);
    @Code(e.d) & Op(a,e) & /.
Arithexp(a, v, t) => LoadSimple(a, v, t).

The first parameter is a triple representing the arithmetic expression, with the operator first (and the other two parameters are considered below). In the first two cases, at least one of the operands is 'Simple', i.e. it can be accessed in a single memory reference for all instructions. Thus the output consists of the code to load the first operand into the accumulator (by a recursive call to Arithexp), followed by the code to perform the operation. The '@' specifies a terminal symbol of the grammar, which is a function symbol 'Code' whose parameter represents the operation.

In the second clause the expression is commuted (all operators in ASPLE are commutative). The third clause uses a temporary location to store the result of the first part of the expression. The third parameter 't' is treated as a (compile-time) stack of temporary locations. When compiling the subexpression 'c', the location addressed by 'd' is unavailable. On exit from the whole expression it will again become available. At the start of compilation 't' is an unbound variable; at the end, a number of pairs 'TEMP.d' will be bound to it, which will allow the correct space to be allocated.

In a similar way, the parameter 'v' is used to allocate variables. Since all variables are allocated the same space there is no need to pass the dictionary from the syntax analysis phase. Space is only allocated if variables are used.

The semantics of the machine instructions may be represented by very low level tree expressions. This may be illustrated by those needed for the machine which is used for the ASPLE compiler, shown in Fig. 5/2/2.

Machine Instruction    Tree Representation

LOAD n                 Ass.Ac.(Con.M.n)
STOR n                 Ass.(Con.M.n).Ac
ADD n                  Ass.Ac.(+.Ac.(Con.M.n))
SUB n                  Ass.Ac.(-.Ac.(Con.M.n))
MULT n                 Ass.Ac.(*.Ac.(Con.M.n))
JEQ n                  If.(Eq.Ac.0).Ass.Pc.n
JNZ n                  If.(Ne.Ac.0).Ass.Pc.n
JMP n                  Ass.Pc.n
LDI n                  Ass.Ac.(Con.M.(Con.M.n))
STI n                  Ass.(Con.M.(Con.M.n)).Ac

In this tree structure there are operators and memory descriptors. 'Ass.x.y' is the assignment x:=y. 'Con.M.n' is the contents of primary memory (M) location n. 'Ac' is the accumulator and 'Pc' the program counter. Most of the other operators are self-evident.
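To make the dotted tree notation concrete, the following is an illustrative sketch (in Python, not part of the thesis) encoding a few of the instruction semantics above as nested tuples; the constructor names load, stor, add and jeq are purely hypothetical.

```python
# Each M-grammar tree such as Ass.Ac.(Con.M.n) is modelled here as a
# nested tuple, with the operator first and its operands following.

def load(n):
    """LOAD n: Ac := contents of memory location n."""
    return ('Ass', 'Ac', ('Con', 'M', n))

def stor(n):
    """STOR n: memory location n := Ac."""
    return ('Ass', ('Con', 'M', n), 'Ac')

def add(n):
    """ADD n: Ac := Ac + contents of memory location n."""
    return ('Ass', 'Ac', ('+', 'Ac', ('Con', 'M', n)))

def jeq(n):
    """JEQ n: if Ac = 0 then Pc := n."""
    return ('If', ('Eq', 'Ac', 0), ('Ass', 'Pc', n))

print(load(5))  # ('Ass', 'Ac', ('Con', 'M', 5))
```

Because the trees are plain data, the equivalence axioms discussed below can be applied to them as ordinary rewriting on tuples.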

The problem of the code-generator-generator is therefore to produce the clauses used in the compiler, such as Arithexp, that perform the mapping between the abstract syntax describing the source language and that describing the target machine. These may be regarded as templates of greater or lesser generality to be used by the code-generator.

The generation of the code templates depends on a set of equivalence axioms used as rewriting rules between expressions (which we express in the normal notation, though they are obviously not M-grammar rules). Most of these are standard arithmetic, relational and boolean axioms. Several, though, are particular to the code generation problem, and will be discussed briefly.


(1) Fetch/Store Decomposition

E1(E2) => D := E2; E1(D)

D1 := E => D2 := E; D1 := D2

In these rules E stands for an arbitrary expression and D for a storage location. The first expresses the idea that a subexpression (E2) may be calculated and stored; the second is a special case of this for assignment.

(2) Side effects

S; D:=E => S             (if D is a temporary type)
S; D:=E => Alloc(D); S   (if D is a general type)

These deal with ignoring the side effect of a statement S which assigns to a storage location D. This may be ignored either if the storage location is classed as temporary (which includes for instance the carry bit of a processor) or if it is a general type (which includes registers) and that location has been preallocated. Allocation is a common task in compiling which may include saving some other value.

(3) Sequencing semantics

If E Then S => If NOT E Then Goto L; S; L:

Goto L => Pc := L

These are examples of transformations which can be made to simplify control structures of the source language. Jumps on the target machine are expressed in terms of assignments to the program counter (Pc).
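The lowering of control structures can be sketched as a small rewriting function. This is an illustrative Python sketch, not the thesis's mechanism; the statement tags ('if', 'goto', 'ifnot_goto', 'label') and the function lower are hypothetical names.

```python
import itertools

# Fresh label supply for the L introduced by the conditional rule.
_labels = itertools.count(1)

def lower(stmt):
    """Rewrite a statement into jump form, following rules (3)."""
    if stmt[0] == 'if':
        # If E Then S  =>  If NOT E Then Goto L; S; L:
        _, e, s = stmt
        label = f'L{next(_labels)}'
        return [('ifnot_goto', e, label)] + lower(s) + [('label', label)]
    if stmt[0] == 'goto':
        # Goto L  =>  Pc := L
        return [('assign', 'Pc', stmt[1])]
    return [stmt]  # anything else is treated as already primitive
```

On the target machine the 'ifnot_goto' form would itself become a conditional assignment to Pc, matching instructions such as JEQ and JNZ in the table above.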

It is to be noted that the fetch/store decomposition axioms are the only axioms to introduce side effects, and the side effect axioms the only ones which resolve unwanted side effects. Hence it is possible to deal with these in special ways, and the planning process is simplified so that it is similar to algebraic simplification. With each action in the target language is associated a space and time cost, and therefore all solutions within predefined bounds may be found by search and the least expensive chosen.

An example will show the sort of complexity that is involved on real machines, rather than the idealised machine used in the ASPLE compiler. Suppose we wish to produce code for subtracting the contents of the accumulator on a PDP-8. This may be represented:

Ac := (D - Ac)

The relevant instructions available on the PDP-8 are

Instruction   Effect

TAD x         Ass.Ac.+.Ac.Con.M.x
CMA           Ass.Ac.NOT.Ac   (one's complement)
IAC           Ass.Ac.+.Ac.1

Thus the PDP-8 does not have a subtract instruction and it is necessary to use the rewrite rule:

x - y -> x + (-y)

But the PDP-8 does not directly have a negate instruction. We must use the rule for two's complement arithmetic:

-x = (NOT x) + 1

In fact the two instructions CMA and IAC on the PDP-8 may be combined at no extra cost. Using the commutative rule x + y => y + x, we obtain the sequence:

D - Ac -> D + (-Ac) -> (-Ac) + D -> ((NOT Ac) + 1) + D


which may be represented in PDP-8 code as:

CMA
IAC
TAD D
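The derivation above can be replayed mechanically. This is an illustrative Python sketch (not the thesis's planner): each rewrite rule becomes a function on expression tuples, and the hypothetical tags 'neg' and 'not' stand for arithmetic negation and one's complement.

```python
def sub_to_add(e):
    """x - y -> x + (-y): eliminate the missing subtract instruction."""
    op, x, y = e
    assert op == '-'
    return ('+', x, ('neg', y))

def commute(e):
    """x + y -> y + x: the commutative rule."""
    op, x, y = e
    assert op == '+'
    return ('+', y, x)

def neg_to_complement(e):
    """-x -> (NOT x) + 1: two's complement arithmetic."""
    tag, x = e
    assert tag == 'neg'
    return ('+', ('not', x), 1)

e = ('-', 'D', 'Ac')
e = sub_to_add(e)                         # ('+', 'D', ('neg', 'Ac'))
e = commute(e)                            # ('+', ('neg', 'Ac'), 'D')
e = ('+', neg_to_complement(e[1]), e[2])  # ((NOT Ac) + 1) + D
```

The final tree corresponds directly to the instruction sequence CMA; IAC; TAD D.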

The generation of code sequences has in fact been carried out using Prolog by Warren (1974) using a general purpose planning system called 'Warplan'. This system is at least as general as that used by Cattell. It should not be thought that the generation of either the machine descriptions for real machines or the list of transformation axioms is trivial. (There are several points in Cattell's thesis that might be questioned.) However, because it is possible to maintain separate data base descriptions of many machines (as is being done at Carnegie-Mellon) and of programming languages, the idea of a "compiler factory" is several stages nearer to completion.

Chapter

Program Proving and Equivalence

There are two different approaches to semantics that have been described - relational and axiomatic - which both give rise to ways of proving the correctness of programs. We will consider these separately without attempting to assess their relative merits.

Axiomatic Proving Systems

There are two questions that must be settled in order to construct a useful program proving system:

1. What form of assertion language will be used? 2. How will the theorem proving be accomplished?

It is usually assumed that a full first-order logic, complete with quantifiers and a full set of operators, is essential to provide a rich enough language to prove any programs. For many programs this is not the case, and a much simpler notation - such as that provided by Prolog with a few auxiliary notions - is adequate.

For instance, take the program for Factorial given in ASPLE in chapter 4.1. The top-level goal of the proof may be given using the axiomatic semantics presented in chapter 3.2 as:

<- Ax(File(In,N), prog,
      File(Out,N.Fact.NIL) & Factorial(N,Fact)).

where 'prog' is a term standing for the abstract program in ASPLE, and the first and third parameters are the pre- and post-conditions respectively. The values N and Fact represent the data and result of the program; they appear in this case in the input and output files. Although they have the same names, they are of course not related to the program variables in any way. The relation Factorial may be defined by the Prolog clauses:

Factorial(0,1).
Factorial(i+1,fact*(i+1)) <- Factorial(i,fact).
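Read as a problem-reduction definition, the two clauses correspond directly to an ordinary recursive function. A minimal Python rendering (an illustrative translation, not part of the thesis) makes the correspondence explicit:

```python
def factorial(n):
    # Factorial(0,1): the base clause.
    if n == 0:
        return 1
    # Factorial(i+1, fact*(i+1)) <- Factorial(i, fact), taking i = n-1:
    # the result for n is the result for n-1 multiplied by n.
    return factorial(n - 1) * n

print(factorial(5))  # 120
```

The clause heads play the role of patterns on the argument, exactly as the case analysis on n does here.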

Let us sketch informally the proof of the section of the program which is the heart of the algorithm. To do so we will omit most of the syntactic formalities in order to concentrate on the method. The section of program is:

if (n ≠ 0)
then
    while (i ≠ n)
    do
        i := i + 1;
        fact := fact * i;
    end
fi;

At this point, the goal that must be proved when working from the end (and after handling the output file) is

<- Ax(p, Cond(n≠0, loop), Factorial(n,fact)).

where p is the precondition of the loop and 'loop' is the text of the loop. When we match this with the conditional axiom:

Ax(p, Cond(exp,s1,s2), q) <- Ax(p & ~exp, s2, q)
                           & Ax(p & exp, s1, q).

we get two new goals:


<- Ax(p & ~(n≠0), NIL, Factorial(n,fact))
 & Ax(p & n≠0, loop, Factorial(n,fact)).

The first goal is easily solved using the axiom for the empty statement NIL and the first clause for Factorial, resulting in the necessary precondition p of fact=1. The second goal may be matched with the clause:

Ax(p, while(b,s), p & ~b) <- Ax(p & b, s, p).

giving a new goal:

<- Ax(Factorial(i,fact) & i≠n,
      (i:=i+1; fact:=fact*i),
      Factorial(i,fact)).

by using the postcondition ~(i≠n), or i=n, to substitute for n in the Factorial relation. If we now apply the axioms of assignment and sequence we can show:

Ax(Factorial(i+1,fact*(i+1)),
   (i:=i+1; fact:=fact*i),
   Factorial(i,fact)).

By invoking the rule of consequence:

Ax(p, s, q) <- Ax(r, s, q) & Demonstrate(p -> r).

we are left with the goal:

<- Demonstrate(Factorial(i,fact) -> Factorial(i+1,fact*(i+1))).

which is just a restatement of the original Prolog clause.
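The assignment and sequence steps used above amount to backward substitution through the program. The following is an illustrative Python sketch (not the thesis's proof system) that treats predicates as strings and the assignment axiom as word-boundary substitution; the helper names wp_assign and wp_seq are hypothetical.

```python
import re

def wp_assign(var, expr, post):
    """Assignment axiom: the precondition of 'var := expr' w.r.t. post
    is post with every occurrence of var replaced by expr."""
    return re.sub(rf'\b{var}\b', f'({expr})', post)

def wp_seq(stmts, post):
    """Sequence rule: work backwards through the assignments."""
    for var, expr in reversed(stmts):
        post = wp_assign(var, expr, post)
    return post

# Precondition of (i:=i+1; fact:=fact*i) w.r.t. Factorial(i,fact):
p = wp_seq([('i', 'i+1'), ('fact', 'fact*i')], 'Factorial(i,fact)')
print(p)  # Factorial((i+1),(fact*(i+1)))
```

Up to bracketing, this is exactly the precondition Factorial(i+1, fact*(i+1)) derived in the proof.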


Two lessons may be drawn from this example:

(1) We are using the relation 'Ax' to derive the preconditions from the postconditions, which we then attempt to establish. We are in fact using it as a predicate transformer.

(2) The problem reduction pattern of Prolog programs is in fact very suitable for this type of argument. In many cases it is possible to present the clauses in such a way as to simplify the proof. This can also make the specification easier to understand for the user, which reduces another potential source of error.

We are not arguing that a full version of first-order logic is unnecessary - there are many examples which could disprove this. Rather we are suggesting that it is better to start with a clausal base and add more features into the system only when necessary. A parallel may be drawn with natural language understanding using Prolog. Explicit quantifiers have been successfully introduced (Colmerauer 1978, Warren, Pereira 1981) as additional and controlled features.

Program Equivalence

Another approach to the development of efficient and correct programs lies in the area of program transformation. This is a very live research area in the development of Prolog programs (e.g. see Hogger (1979), Clark, Darlington (1979)) as well as the use of recursion equations (Burstall, Darlington 1975).


The application of this thesis lies in a related area - that of showing the equivalence of Algol-like languages, for which we wish to argue that the relational semantics provides an adequate framework. This area of research arose out of Ianov's program schemas, which were extended by Luckham, Park and Paterson (1970). Most of the recent work has been in the context of denotational semantics (e.g. Stoy (1977), de Bakker (1980)).

Let us take a trivial example to illustrate this: the associativity of the sequencing operator. Given any (straight-line) statements x, y, z we wish to show that:

x ; (y ; z) = (x ; y) ; z.

We will assume that the language we are dealing with is the ASPLE language described in Ch. 4.1, using the relational interpretation of sequence, not the continuation semantics used in Ch. 4.3, and that the sole effect of statements is on the states, or environments.

In relational form, what we wish to prove is:

For all a,b:

Semantics(x;(y;z), a, b) <-> Semantics((x;y);z, a, b).

The definition of the sequence operator is:

Semantics(u;v, s1, s3) <- Semantics(u, s1, s2)
                        & Semantics(v, s2, s3).

This is the only clause in the definition that matches a term containing ';'. Using a closed world definition, it is possible to replace the 'if' by 'if and only if', written as '<->'.


The left hand side of what we want to prove may be expanded to:

Semantics(x;(y;z), a, b)
  <-> Semantics(x, a, s1) & Semantics(y;z, s1, b)
  <-> Semantics(x, a, s1) & Semantics(y, s1, s2)
    & Semantics(z, s2, b).

Similarly the right hand side may be expanded:

Semantics((x;y);z, a, b)
  <-> Semantics(x;y, a, t2) & Semantics(z, t2, b)
  <-> Semantics(x, a, t1) & Semantics(y, t1, t2)
    & Semantics(z, t2, b).

To complete the proof of equivalence we must show that s1 = t1 and s2 = t2. To prove this we must show that for any statement u and initial state v there is only one final state w such that Semantics(u,v,w) is true. This is equivalent to a proof of functionality, which may be achieved by showing that only one clause matches each definition at all levels. From this we can deduce that s1=t1 and hence s2=t2.
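When Semantics is functional, sequencing is just composition of state transformers, and the associativity argument can be checked on concrete states. This is an illustrative Python sketch under that assumption (deterministic statements, states as dictionaries); the tags 'seq' and 'assign' are hypothetical.

```python
def semantics(stmt, state):
    """Semantics(u;v, s1, s3) <- Semantics(u,s1,s2) & Semantics(v,s2,s3),
    read as a function: run u, then run v on the intermediate state."""
    if stmt[0] == 'seq':
        _, u, v = stmt
        return semantics(v, semantics(u, state))
    _, var, f = stmt          # ('assign', var, function of the state)
    new = dict(state)
    new[var] = f(state)
    return new

# Three arbitrary straight-line statements.
x = ('assign', 'a', lambda s: s['a'] + 1)
y = ('assign', 'b', lambda s: s['a'] * 2)
z = ('assign', 'c', lambda s: s['b'] - s['a'])

s0 = {'a': 1, 'b': 0, 'c': 0}
left  = semantics(('seq', x, ('seq', y, z)), s0)  # x;(y;z)
right = semantics(('seq', ('seq', x, y), z), s0)  # (x;y);z
print(left == right)  # True
```

The sketch checks one instance, of course; the proof in the text establishes the equivalence for all states by uniqueness of the intermediate states.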

More complicated examples require induction. An example is to show that:

while a do b

is equivalent to

if a then b; while a do b end fi


To prove this requires the use of the fixpoint property in the elaboration of the while loop. As yet little work has been done in this area. However, the treatment of states in the relational semantics appears to offer a promising entry.

References

Aho, A.V. (1980): Translator Writing Systems: where do they now stand? Computer 13/8, 9-14.
Aho, A.V., Ullman, J.D. (1972): The Theory of Parsing, Translation and Compiling. (2 vols). Prentice-Hall.
Aho, A.V., Ullman, J.D. (1977): Principles of Compiler Design. Addison-Wesley.
Anderson, E.R., Belz, F.C. (1978): Issues in Formal Specification of Programming Languages. In Neuhold (1978), 1-30.
Apt, K.R., van Emden, M.H. (1980): Contributions to the Theory of Logic Programming. Res. Rep. CS-80-12, Dept. of Comp. Sc., University of Waterloo, Ontario.
de Bakker, J.W. (1976): Semantics and the Foundations of Program Proving. Rep. IW 62/76, Mathematisch Centrum, Amsterdam.
de Bakker, J.W. (1980): Mathematical Theory of Program Correctness. Prentice-Hall.
Battani, G., Meloni, H. (1973): Interpréteur du Langage de Programmation PROLOG. Rapport de DEA, Groupe d'Intelligence Artificielle, UER de Luminy, Université d'Aix-Marseille.
Bjorner, D. (1977): Programming Languages: Formal Development of Interpreters and Compilers. Int. Comp. Symp. 1977, 1-22.
Bjorner, D., Jones, C. (1978): The Vienna Definition Method: The Metalanguage. Springer-Verlag.
Bochmann, G.V. (1976): Semantic Evaluation from Left to Right. CACM 19, 55-62.
Bornat, R. (1979): Understanding & Writing Compilers. MacMillan.
Boyer, R.S., Moore, J.S. (1972): The Sharing of Structure in Theorem Proving Programs. In Meltzer, Michie (1972), 101-116.

Boyer, R.S., Moore, J.S. (1979): A Computational Logic. Academic Press.
Brady, J.M. (1977): The Theory of Computer Science, a Programming Approach. Chapman & Hall, p.252.
Brooker, R.A., MacCallum, I.R., Morris, D., Rohl, J.S. (1963): The Compiler Compiler. Ann. Rev. Automatic Prog. 3, 229-275.
Bruynooghe, M. (1980): The Memory Management of PROLOG Implementations. Logic Programming Workshop, Debrecen, 12-20.
Burstall, R.M. (1969): Formal Description of Program Structure and Semantics in First Order Logic. In Meltzer, Michie (1969), 79-98.
Burstall, R.M. (1969b): Proving Properties of Programs by Structural Induction. Comp. J. 12, 41-48.
Burstall, R.M., Darlington, J. (1975): A Transformation System for Developing Recursive Programs. JACM 24/1, 44-67.
Burstall, R.M., Goguen, J.A. (1977): Putting Theories Together to Make Specifications. IJCAI 5, 1045-1058.
Cattell, R.G. (1978): Formalization and Automatic Generation of Code Generators. Ph.D. Thesis, TR 78-115, Carnegie-Mellon U.
de Chastellier, G., Colmerauer, A. (1969): W-grammar. Proc. 24th ACM National Conference, New York, 511-517.
Chomsky, N. (1956): Three Models for the Description of Language. IEEE Trans. Information Theory 2/3, 113-124.
Chomsky, N. (1959): On Certain Formal Properties of Grammars. Information & Control 2/2, 137-167.
Clark, K.L. (1978): Negation as Failure. In Gallaire, Minker (1978), 293-322.
Clark, K.L. (1977): Synthesis and Verification of Logic Programs. Res. Rep., Dept. of Computing, Imperial College, London.
Clark, K.L., McCabe, F. (1979): Control Facilities of IC-Prolog. In Expert Systems in the Micro-Electronic Age, ed. D. Michie. Edinburgh Univ. Press.
Clark, K.L., Tarnlund, S-A. (1977): A First Order Theory of Data and Programs. IFIP 77, North-Holland, 939-944.
Cleaveland, J., Uzgalis, R. (1976): Grammars for Programming Languages. American Elsevier.
Clint, M., Hoare, C.A.R. (1972): Program Proving: Jumps and Functions. Acta Informatica 1/3, 214-224.
Colmerauer, A. (1978): Metamorphosis Grammars. In Natural Language Communication with Computers. Springer-Verlag, 33-189. (Originally issued 1975 as internal report, Université d'Aix-Marseille.)
Dijkstra, E.W. (1975): Guarded Commands, Nondeterminacy and the Formal Derivation of Programs. CACM 18/8, 453-457.
Dijkstra, E.W. (1976): A Discipline of Programming. Prentice-Hall.
Donahue, J.E. (1976): Complementary Definitions of Programming Language Semantics. Springer-Verlag.
van Emden, M.H., Kowalski, R.A. (1976): The Semantics of Predicate Logic as a Programming Language. JACM 23, 733-742.
van Emden, M.H., Maibaum, T.S.E. (1980): Equations Compared with Clauses for Specification of Abstract Data Types. Dept. of Comp. Sc., University of Waterloo, Ontario.
Engeler, E. (Ed) (1971): Symposium on Semantics of Algorithmic Languages. Springer-Verlag.
Floyd, R.W. (1967): Assigning Meanings to Programs. Proc. Symp. Appl. Maths, Amer. Math. Soc., 19-32.
Foster, J.M. (1968): A Syntax Improving Device. Comp. J. 11/1, 31-34.
Gallaire, H., Minker, J. (Eds) (1978): Logic and Databases. Plenum Press.
Gödel, K. (1930): Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatsh. Math. Phys. 37, 349-360. (See J. van Heijenoort (1967): From Frege to Gödel.)
Goguen, J.A., Thatcher, J.W., Wagner, E.G., Wright, J.B. (1977): Initial Algebra Semantics and Continuous Algebras. JACM 24/1, 68-95.
Gordon, M.J.C. (1979): The Denotational Description of Programming Languages. Springer-Verlag.
Gordon, M.J., Milner, R., Wadsworth, C.P. (1979): Edinburgh LCF. Springer-Verlag.
Guttag, J.V., Horowitz, E., Musser, D.R. (1978): The Design of Data Type Specifications. In Yeh (ed): Current Trends in Programming Methodology, Vol 4. Prentice-Hall.
Hoare, C.A.R. (1969): An Axiomatic Basis for Computer Programming. CACM 12/12, 576-580.
Hoare, C.A.R. (1971): Procedures and Parameters: an Axiomatic Approach. In Engeler (1971), 102-116.
Hoare, C.A.R. (1974): Programming Correctness Proofs. In Formal Aspects of Computing Science. Newcastle Univ., 7-46.
Hoare, C.A.R., Lauer, P.E. (1974): Consistent and Complementary Formal Theories of Programming Languages. Acta Informatica 3, 135-153.
Hoare, C.A.R., Wirth, N. (1973): Axiomatic Definition of the Programming Language Pascal. Acta Informatica 2, 335-355.
Hogger, C.J. (1979): Derivation of Logic Programs. Ph.D. Thesis, Imperial College, London.
Hopcroft, J.E., Ullman, J.D. (1969): Formal Languages and their Relation to Automata. Addison-Wesley.
Irons, E.T. (1961): A Syntax Directed Compiler for Algol 60. CACM 4, 51-55.
Jazayeri, M., Ogden, W.F., Rounds, W.C. (1975): The Intrinsically Exponential Complexity of the Circularity Problem for Attribute Grammars. CACM 18/12, 697-706.
Kennedy, K., Warren, S.K. (1976): Automatic Generation of Efficient Evaluators for Attribute Grammars. In 3rd ACM Symp. on Principles of Programming Languages, 32-49.


Knuth, D.E. (1968): Semantics of Context-free Languages. Maths. Systems Theory 2, 127-145. Corrections 5 (1971), 95-96.
Knuth, D.E. (1971): Examples of Formal Semantics. In Engeler (1971), 212-235.
Koster, C.H.A. (1971a): Affix Grammars. In Algol 68 Implementation (ed. J.E.L. Peck). North-Holland, 95-110.
Koster, C.H.A. (1971b): A Compiler Compiler. MR 127/71, Mathematisch Centrum, Amsterdam.
Kowalski, R.A. (1974a): Predicate Logic as Programming Language. IFIP 74, 569-574.
Kowalski, R.A. (1974b): A Proof Procedure using Connection Graphs. JACM 22, 572-595.
Kowalski, R.A. (1978): Logic for Data Description. In Gallaire, Minker (1978), 77-102.
Kowalski, R.A. (1979a): Logic for Problem Solving. North-Holland.
Kowalski, R.A. (1979b): Algorithm = Logic + Control. CACM 22/7, 424-436.
Landin, P.J. (1964): The Mechanical Evaluation of Expressions. Comp. J. 6/4.
Landin, P.J. (1965): A Correspondence between Algol 60 and Church's Lambda Notation. CACM 8/2, 89-101; 8/3, 158-165.
Landin, P.J. (1966): A Formal Description of Algol 60. In Steel (1966), 266-294.
Lauer, P.E. (1968): Formal Definition of Algol 60. T.R. 25.088, IBM Laboratory, Vienna.
Lecarme, O., Bochmann, G.V. (1974): A (truly) Usable and Portable Compiler Writing System. IFIP 74, 218-221.
Ledgard, H.F. (1977): Production Systems. IEEE Transactions on Software Engineering, Apr.
Leverett, B.W. et al (1980): An Overview of the Production Quality Compiler Compiler Project. Computer 13/8, 38-49.
Lewis, P.M., Rosencrantz, D.J., Stearns, R.E. (1974): Attributed Translations. J. Comp. Sys. Sc. 9, 279-307.
Lucas, P. (1968): Two Constructive Realisations of the Block Concept and their Equivalence. TR 25.085, IBM Laboratory, Vienna.
Lucas, P. (1972): On the Semantics of Programming Languages. In Rustin (1972), 41-58.
Lucas, P., Walk, K. (1969): On the Formal Definition of PL/1. Ann. Rev. Automatic Programming 6/3, 105-182.
Luckham, D.C., Park, D.M.R., Paterson, M.S. (1970): On Formalised Computer Programs. J. Comp. Sys. Sc. 4/3, 220-249.
McCarthy, J. (1962): Towards a Mathematical Science of Computation. In Information Processing 1962. North-Holland, 21-28.
McCarthy, J. (1963): A Basis for a Mathematical Theory of Computation. In Braffort & Hirschberg (Eds): Computer Programming and Formal Systems. North-Holland, 33-70.
McCarthy, J. (1966): A Formal Description of a Subset of Algol 60. In Steel (1966), 1-12.
McKeeman, W.M., Horning, J.J., Wortman, D.B. (1970): A Compiler Generator. Prentice-Hall.
Manna, Z. (1974): Mathematical Theory of Computation. McGraw-Hill.
Manna, Z., Vuillemin, J. (1972): The Fixpoint Approach to the Theory of Computation. CACM 15/7, 528-536.
Manna, Z., Waldinger, R. (1976): Is "sometime" sometimes better than "always"? Memo A.I.M. 281, Stanford A.I. Laboratory.
Marcotty, M., Ledgard, H.F., Bochmann, G.V. (1976): A Sampler of Formal Definitions. Computing Surveys 8/2, 191-276.
Mellish, C. (1980): An Alternative to Structure Sharing in the Implementation of Prolog. Logic Programming Workshop, Budapest, 21-32.
Meltzer, B., Michie, D. (Eds) (1969): Machine Intelligence 5. Edinburgh Univ. Press.
Meltzer, B., Michie, D. (Eds) (1972): Machine Intelligence 7. Edinburgh Univ. Press.

Meltzer, B., Michie, D. (Eds) (1979): Machine Intelligence 9. Edinburgh Univ. Press.
Moss, C.D.S. (1977): The Relationship between Hoare's Axiomatic Semantics and Plan Formation Studies. M.Sc. Thesis, Imperial College, London.
Moss, C.D.S. (1979): A New Grammar for Algol 68. DOC 79/6, Imperial College, London.
Moss, C.D.S. (1980): A Formal Definition of ASPLE using Predicate Logic. DOC 80/18, Imperial College, London.
Mosses, P. (1974): The Mathematical Semantics of Algol 60. Tech. Monograph 12, Oxford Univ. Comp. Lab.
Mosses, P. (1978): SIS, A Compiler-Generator System using Denotational Semantics. (Reference Manual). Dept. of Computer Science, Aarhus Univ., Denmark.
Neuhold, E.J. (Ed) (1978): Formal Description of Programming Concepts. North-Holland.
Park, D. (1969): Fixpoint Induction and Proofs of Program Semantics. In Meltzer, Michie (1969), 59-78.
Pereira, F. (1980): Extraposition Grammars. Logic Programming Workshop, Budapest, 231-242.
Pereira, F.L.N., Warren, D.H.D. (1978): Definite Clause Grammars compared with Augmented Transition Networks. DAI Rep. 58, Univ. of Edinburgh.
Popplestone, R.J. (1979): Relational Programming. In Meltzer, Michie (1979), 3-25.
Roberts, G.M. (1977): An Implementation of Prolog. Master's Thesis, Computer Sc. Dept., Univ. of Waterloo, Canada.
Robinson, J.A. (1965): A Machine Oriented Logic based on the Resolution Principle. JACM 12, 23-41.
Robinson, J.A. (1979): Logic: Form and Function. Edinburgh U.P.
Rohl, J.S. (1975): An Introduction to Compiler Writing. McDonald & Jones.
Roussel, P. (1975): Prolog: Manuel de Référence et d'Utilisation. Groupe d'Intelligence Artificielle, Univ. d'Aix-Marseille Luminy.
Russell, B. (1977): On the Equivalence between Continuation and Stack Semantics. Acta Informatica 8, 113-123.
Rustin, R. (Ed) (1972): Formal Semantics of Programming Languages. Courant Comp. Sc. Symp. Prentice-Hall.
Schwarz, J. (1977): Using Annotations to make Recursion Equations Behave. Research Memo, D.A.I., Univ. of Edinburgh.
Scott, D. (1970): Outline of a Mathematical Theory of Computation. PRG 2, Oxford Univ. Computer Lab.
Scott, D., Strachey, C. (1972): Towards a Mathematical Semantics for Computer Languages. PRG 6, Oxford Univ. Computer Lab.
Simonet, M. (1977): An Attribute Description of a Subset of Algol 68. Proc. Strathclyde Algol 68 Conf., SIGPLAN 12/6, 129-137.
Simonet, M. (1980): Bibliography on Attribute Grammars. SIGPLAN 15/3, 35-44.
Stoy, J. (1977): Denotational Semantics. M.I.T. Press.
Steel, T.B. (Ed) (1966): Formal Language Description Languages. North-Holland.
Strachey, C. (1966): Towards a Formal Semantics. In Steel (1966), 198-220.
Strachey, C., Wadsworth, C.P. (1974): Continuations: a Mathematical Semantics for Handling Full Jumps. PRG 11, Oxford Univ. Comp. Lab.
Tennent, R.D. (1976): The Denotational Semantics of Programming Languages. CACM 19/8, 437-453.
Warren, D.H.D. (1974): Warplan: A System for Generating Plans. Memo 76, DAI, Univ. of Edinburgh.
Warren, D.H.D. (1975): Implementation of an Efficient Predicate Logic Interpreter based on Earley Deduction. Research Proposal, DAI, Univ. of Edinburgh.
Warren, D.H.D. (1977a): Implementing Prolog. Res. Rep. 39, 40, DAI, Univ. of Edinburgh.
Warren, D.H.D. (1977b): Logic Programming and Compiler Writing. DAI Rep. 44, Univ. of Edinburgh.
Warren, D.H.D. (1980): An Improved Prolog Implementation which Optimises Tail Recursion. Logic Programming Workshop, Budapest, 1-11.
Warren, D.H.D., Pereira, L.M., Pereira, F. (1977): Prolog - the Language and its Implementation compared with Lisp.
Watt, D.A. (1974): Analysis Oriented Two-level Grammars. Ph.D. Thesis, Univ. of Glasgow.
Watt, D.A. (1977): The Parsing Problem for Affix Grammars. Acta Informatica, 1-20.
Watt, D.A. (1979): An Extended Attribute Grammar for Pascal. SIGPLAN 14/2, 60-74.
Watt, D.A., Madsen, O.L. (1977): Extended Attribute Grammars. Report 10, Computing Science Dept., Univ. of Glasgow.
Wegner, P. (1971): Data Structure Models for Programming Languages. Proc. Symp. on Data Struct. in Prog. Lang., SIGPLAN 12/8, 109-115.
Wegner, P. (1972): Programming Language Semantics. In Rustin (1972), 149-248.
van Wijngaarden, A. (1966): Recursive Definitions of Syntax and Semantics. In Steel (1966), 13-24.
van Wijngaarden, A., Mailloux, B.J., Peck, J.E.L., Koster, C.H.A. (1969): Report on the Algorithmic Language Algol 68. MR 101, Mathematisch Centrum, Amsterdam.
van Wijngaarden, A., Mailloux, B.J., Peck, J.E.L., Koster, C.H.A., Meertens, L.G.L.T., Fisker, R.G. (1975): Revised Report on the Algorithmic Language Algol 68. Acta Informatica 5, 1-236. (Also Springer-Verlag 1976.)
Wirth, N. (1977): What Can We Do about the Unnecessary Diversity of Notation for Syntactic Definition? CACM 20/11, 822-823.

Appendix A

Algol 68 Subset Defined using M-Grammars

In the productions of the grammar the names of the nonterminals usually specify a context-free grammar. Terminal symbols are in quotes. Parameters which are constants start with a capital, and variables with a lower case letter. Conditions, which do not take part in the production, follow the symbol '&'. An example which demonstrates most features of the syntax is:

Label(env,lab:Label,lab) -> @Id(lab) & Unique(lab,Label,env).

Here, @ indicates a terminal symbol which is itself a function (generated by the lexical syntax), and ':' is used as an infix function symbol in the output as well as being a terminal symbol. The parameters are in two groups, with 'env' in the middle, standing for 'environment', i.e. the set of all declarations available at this point. Parameters before env (absent in the above case) are mostly context sensitive restrictions; those after are the 'output' or static semantics, though in the case of declarations they contribute to the context conditions. Comments are written between '/*' and '*/'. Paragraph numbers correspond to the Algol 68 report (revised edition).

The semantics are also given using production-like statements acting on an abstract syntax tree produced by the syntax, but using '=>' instead of '->'. A statement of the form:


a => b; c.

means intuitively 'to do a, do b followed by c'. This actually translates into Prolog as the statement:

Do(a, s1, s4) <- Do(b, s2, s3) & Continuation(s3, s4).

where s1..s4 are states of the machine of the form

State(stack, heap, continuation).

and s2 is the same as si except that the statement 'c' is added to the 'front' of the continuation. The clauses which define continuation are:

Continuation(State(st,h,a;b), s2) <- Do(a, State(st,h,b), s2).
Continuation(State(st,h,a), s) <- Do(a, State(st,h,ExitBlock), s).
Continuation(State(st,h,STOP), State(st,h,STOP)).

and the initial call is:

<- Do(program,State(NIL,NIL,STOP),final).

Thus intuitively there is a machine which alternately executes 'Do's and 'Continuation's until it stops. In Algol 68, the semantics are given by a production called 'Semantics', which has two parameters - for the construct to be evaluated and the resultant value. The symbol '//' means evaluate collaterally - it is defined by:

a // b -> a ; b | b ; a.
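The alternation of 'Do' and 'Continuation' steps can be sketched as a tiny driver loop. This is an illustrative Python miniature (not the thesis's full Algol 68 machine): statements are opaque atoms that are merely recorded, the continuation is a tuple of pending statements, and a state is (stack, heap, continuation).

```python
STOP = 'STOP'

def do(stmt, state, trace):
    """The 'Do' step: execute one statement (here: just record it),
    then hand control back to the continuation."""
    trace.append(stmt)
    return continuation(state, trace)

def continuation(state, trace):
    """The 'Continuation' step: take the front statement off the
    continuation, or halt with STOP when nothing remains."""
    stack, heap, cont = state
    if cont == STOP or not cont:
        return (stack, heap, STOP)
    return do(cont[0], (stack, heap, cont[1:]), trace)

trace = []
final = continuation(((), (), ('a', 'b', 'c')), trace)
print(trace, final)  # ['a', 'b', 'c'] ((), (), 'STOP')
```

The mutual recursion between do and continuation mirrors the mutual recursion between the Do and Continuation clauses above.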

The sections in the definition are:


2.2  Programs
3    Clauses
3.1  Closed Clauses
3.2  Serial Clauses
3.3  Collateral Clauses
3.4  Choice Clauses
3.5  Loop Clauses
4.1  Declarations
4.2  Mode Declarations
4.3  Priority Declarations
4.4  Identifier Declarations
4.5  Operator Declarations
4.6  Modes
4.8  Identifiers and Bold Symbols
5.1  Units
5.2  Units associated with names
5.4  Routines
5.5  Units associated with values of any mode
8    Denotations


/*----------------------------------------------*/
/*            Programs and Clauses              */
/*----------------------------------------------*/

/* 2.2 The Program */

Program(val) -> ClosedClause(VOID, Range(NIL,NIL), val).

/* The 'Range' above is the 'primal environment'. Each range has two parameters: Range(declarations in this range (including labels), enclosing ranges). */

/* 3.1 Closed Clauses */

ClosedClause(mode,env,val) ->
    Begin(style) SerialClause(mode,env,val) End(style).

Begin(Bold) -> "BEGIN".   /* BEGIN comes in two styles */
End(Bold) -> "END".
Begin(Brief) -> "(".
End(Brief) -> ")".

EnclosedClause(mode,env,val) ->
    ClosedClause(mode,env,val)
  | CollateralClause(mode,env,val)
  | ChoiceClause(mode,env,val)
  | LoopClause(mode,env,val).

/* 3.2 Serial Clauses

Serial Clauses are the basic units of a program. Declarations (decs) may be mixed with statements until the first occurrence of a label (labs) in the block */

SerialClause(mode,env,val) ->
    Series(mode,newenv,decs,labs,stms)
    & Newrange(decs,labs,stms,env,newenv,val).

Series(mode,env,decs,labs,stms1;stms2) ->
    Unit(VOID,env,stms1) ";" Series(mode,env,decs,labs,stms2).
Series(mode,env,decs,labs,stms1;stms2) ->
    Declarations(env,decs1,stms1)
    Series(mode,env,decs2,labs,stms2) & Append(decs1,decs2,decs).
Series(mode,env,NIL,dec.labs,Label(lab);stms) ->
    Label(env,dec,lab) Series(mode,env,NIL,labs,stms).
Series(mode,env,NIL,dec.labs,Exit(stm,lab,stms)) ->
    Unit(mode,env,stm) "EXIT" Label(env,dec,lab)
    Series(mode,env,NIL,labs,stms).
Series(mode,env,NIL,NIL,stms) -> Unit(mode,env,stms).

Label(env,lab:Label,lab) -> @Id(lab) ":" & Unique(lab,Label,env).

/* Newrange constructs the 'Range' used for checking scope restrictions and the 'Block' which is the output form. */

Newrange(decs,labs,stms,env, Range(props,env), Block(stms)) <- Append(decs,labs,props).

/* Blocks construct stacks which have 7 elements in each frame:

   Frame(display of frame levels (including this one),
         value of this frame,
         names of locals,
         values of locals,
         rest of the stack,
         statements in this block,
         previous continuation) */

Semantics(Block(stms),val) => EnterBlock(stms,val).

Do(EnterBlock(stms,val), State(stack,heap,cont),
   State(Frame(display,val,NIL,0.NIL,stack,stms,cont),
         heap, Semantics(stms,val)))
    <- NewDisplay(stack,display).

NewDisplay(Frame(n.a,b,c,d,e,f,g), level.n.a) <- Sum(n,1,level).
NewDisplay(NIL, 1.NIL).

ExitBlock(State(Frame(a,b,c,d,e,stack,cont),heap,f),
          State(stack,heap,cont)).
ExitBlock(State(NIL,a,b), State(NIL,a,STOP)).

Semantics(a;b, val) => Semantics(a,v); Semantics(b,val).

Semantics(Lab(i), NoVal) => NIL.

Semantics(Exit(stm,lab,stms), val) => Semantics(stm, val); ExitBlock.

/* 3.3 Collateral Clauses */

CollateralClause(VOID,env,val) -> Begin(style) JoinedPortrait(mode,env,val) End(style).

JoinedPortrait(mode,env,val1.val2) -> Unit(mode1,env,val1) ( "," JoinedPortrait(mode2,env,val2) & Balances(mode,mode1,mode2) | NIL & mode=mode1 & val2=NIL).

/* The . operator in the semantics specifies collateral evaluation - any action first. The procedure DoCollateral evaluates the answer in any order by taking any of the actions first. The resultant list is still ordered correctly */

Semantics(a1.a2, val) => DoCollateral(a1.a2, val).

DoCollateral(a, v) => Select(a,v,a1,v1,a2,v2); Semantics(a1,v1); DoCollateral(a2,v2). Do(Select(a1.a2, v1.v2, a1, v1, a2, v2), s, s). Do(Select(a1.a2, v1.v2, a3, v3, a1.a4, v1.v4), s, s) <- Do(Select(a2, v2, a3, v3, a4, v4), s, s).

/* 3.4 Choice Clauses

Conditionals (choices) come in two kinds - boolean and integer (the Case clause). The test, or enquiry, part of a conditional is itself a serial clause which may have declarations etc. (but not labels) */

ChoiceClause(mode,env,val) -> Start(kind,style) ChooserClause(kind,style,mode,env,val) Finish(kind,style).


ChooserClause(kind,style,mode,env,val) -> Series(kind,env,decs,NIL,test) AlternateClause(kind,style,mode,newenv,then.else) & NewRange(decs,NIL,If(kind,test,then,else),env,newenv,val).

AlternateClause(kind,style,mode,env,val1.val2) -> In(kind,style) InClause(kind,mode1,env,val1) ( OutClause(kind,style,mode2,env,val2) & Balances(mode,mode1,mode2) | NIL & mode=mode1 & val2=NIL).

InClause(BOOL,mode,env,val) -> SerialClause(mode,env,val). InClause(INT,mode,env,val) -> JoinedPortrait(mode,env,val). OutClause(kind,style,mode,env,val) -> Out(kind,style) SerialClause(mode,env,val) | Again(kind,style) ChooserClause(kind,style,mode,env,val).

/* The words introducing conditionals come in two styles - Bold and Brief. For Bold, there are two kinds, for boolean and integer conditionals. Note that '(' occurs all over the place in Algol 68 */

Start(BOOL,Bold) -> "IF". Start(INT,Bold) -> "CASE". In(BOOL,Bold) -> "THEN". In(INT,Bold) -> "IN". Out(BOOL,Bold) -> "ELSE". Out(INT,Bold) -> "OUT". Again(BOOL,Bold) -> "ELIF". Again(INT,Bold) -> "OUSE".

Finish(BOOL,Bold) -> "FI". Finish(INT,Bold) -> "ESAC".

Start(kind,Brief) -> "(". In(kind,Brief) -> "|". Out(kind,Brief) -> "|".

Again(kind,Brief) -> "|:". Finish(kind,Brief) -> ")".

Semantics(If(BOOL,test,then,else),val) => Semantics(test,Bool(TRUE)); Semantics(then,val) | Semantics(test,Bool(FALSE)); Semantics(else,val). Semantics(If(INT,test,in,out),val) => Semantics(test,Int(int)); Switchon(int,in,out,stm); Semantics(stm,val).

/* The case statement selects the n'th statement of the options, but takes the OUT clause if not */

Do(Switchon(a,b,c,d),s,s) <- Switchon(a,b,c,d). Switchon(1, a.b, c, a). Switchon(n, a.b, c, d) <- Gt(n,1) & Sum(m,1,n) & Switchon(m,b,c,d). Switchon(n, a, b, b) <- Lt(n,1). Switchon(n, NIL, a, a).
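
/* A sketch of the selection, with hypothetical statements s1, s2 and out-part out (not part of the original text): the goals

Switchon(2, s1.s2.NIL, out, s2). Switchon(5, s1.s2.NIL, out, out).

both hold - the second runs off the end of the list and so returns the OUT clause */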

/* 3.5 Loop Clauses

Loops are very flexible - almost any of the preliminary parts may be omitted with sensible defaults applied. In the 'range' part, the introductory word is supplied as the first parameter. */

LoopClause(VOID,env,val) -> ForPart(env1,dec,id) RangePart("FROM",Int(1),env,from) RangePart("BY",Int(1),env,by) RangePart("TO",NIL,env,to) WhilePart(Repeat(do,id,b,off),env1,newenv,whiledo,off) DoPart(newenv,do) & Newrange(dec,Label(l).NIL, (For(from.by.to.NIL,id,b,t);Label(l); Test(id,b,t,whiledo)), env,env1,val).

ForPart(env,dec,id) -> "FOR" DefiningIdentifier(INT,LOC,env,dec,id). ForPart(env,Aleph:INT,Id(Aleph)) -> NIL.

RangePart(kind,default,env,val) -> @kind Unit(INT,env,val). RangePart(kind,default,env,default) -> NIL.

WhilePart(do,env,newenv,val,1) -> "WHILE" Series(BOOL,newenv,decs,NIL,test) & NewRange(decs,NIL,If(BOOL,test,do,SKIP), env,newenv,val). WhilePart(do,env,env,do,0) -> NIL.

DoPart(env,val) -> "DO" SerialClause(VOID,env,val) "OD".

Semantics(For(fbt,id,by,to),x) => Semantics(fbt,from.by.to.NIL); Set(id,from). Semantics(Test(id,by,NIL,whiledo),val) => Semantics(whiledo,val). /* if TO omitted, no test is done */ Semantics(Test(id,by,to,whiledo),NoVal) => Lookup(id,0,from); (Test(from,by,to); Semantics(whiledo,val) | ~Test(from,by,to)). Do(Test(from,by,to),s,s) <- by > 0 & from <= to | by < 0 & from >= to | by = 0. /* always succeeds */ Semantics(Repeat(do,id,by,off),x) => Semantics(do,y); Semantics(Plus(id,by),newval); Reset(id,newval); Jump(Label(l,off)). /* */

/* 4.1 */ /* Declarations */ /* */

/* Several sets of declarations may be given at the same time, interspersed with commas */

Declarations(env,decs,val) -> Declaration(env,dec1,val1) Declarations(env,dec2,val2) & Append(dec1,dec2,decs) & Append(val1,val2,val).

/* There are 6 basic kinds of definition. Several instances of any one may be given together in a 'joined definition' */

Declaration(env,decs,val) -> "MODE" JoinedDefinition(Mode,env,decs,val) | "PRIO" JoinedDefinition(Prio,env,decs,val) | Generator(env,mode,gen) JoinedDefinition(Var(mode,gen),env,decs,val) | Mode(Formal,env,mode) JoinedDefinition(Ident(mode),env,decs,val) | "PROC" JoinedDefinition(Proc,env,decs,val) | "OP" JoinedDefinition(Op,env,decs,val).

JoinedDefinition(kind,env,decs,val) -> Definition(kind,env,dec1,val1) ( "," JoinedDefinition(kind,env,decs2,val2) & Append(dec1.NIL,decs2,decs) & Append(val1.NIL,val2,val) | NIL & decs=dec1.NIL & val=val1.NIL).


/* 4.2 Mode Declarations */

Definition(Mode,env,dec,SKIP) -> DefiningBold(Mode(mode,i),env,dec,id) "=" Mode(Actual,env,mode,i).

/* 4.3 Priority Declarations */

Definition(Prio,env,Prio(tag):num,SKIP) -> DyadicOp(tag) "=" @Integer(num) & Le(0,num) & Le(num,9) & Unique(Prio(tag),num,env).

/* 4.4 Identifier Declarations

Identities have fixed values once they are evaluated. Variables may be initialised on declaration. The form for routine is the abbreviated one - procedure variables are also allowed */

Definition(Ident(mode),env,dec,NewConst(id,val)) -> DefiningIdentifier(mode,env,dec,id) "=" Unit(mode,env,val). Definition(Proc,env,dec,NewConst(id,val)) -> DefiningIdentifier(mode,env,dec,id) "=" RoutineText(mode,env,val). Definition(Var(Ref(mode),gen),env,dec,NewVar(gen,id,val)) -> DefiningIdentifier(Ref(mode),env,dec,id) (":=" Unit(mode,env,val) | NIL & val=SKIP).

/* A new constant is created by evaluating its value and adding a definition of the local name of the constant bound to the value */


Semantics(NewConst(name,source), NoVal) => Semantics(source,val); Set(name,val).

Do(Set(name,val), State(stack,heap,cont), State(newstack,heap,cont)) <- stack=Frame(a,b,names,c,d,e,f) & newstack=Frame(a,b,name.val.names,c,d,e,f).
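
/* For example (a hypothetical frame with no locals yet, not part of the original text), Set extends the name list of the top frame, so the goal

Do(Set(x,Int(3)), State(Frame(a,b,NIL,c,d,e,f),heap,cont), State(Frame(a,b,x.Int(3).NIL,c,d,e,f),heap,cont))

holds, pairing the name with its value at the front of the locals */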

/* A New variable is created by generating a space, either on stack or heap, putting the appropriate value in it, and setting the local name of the variable to point to the location */

Semantics(NewVar(generator,name,source),NoVal) => Semantics(generator,loc) // Semantics(source,val); Update(loc,val); /* see 5.2 */ Set(name,loc).

/* 4.5 Operator Declarations

Operators behave much the same as procedures, but there can be several declarations for the same operator with different modes - but the modes must be 'independent' (see 4.8). */

Definition(Op,env,dec,NewConst(id,val)) -> DefiningDyadicOp(mode,env,dec,id) "=" RoutineText(Proc([p1,p2],m),env,val) | DefiningMonadicOp(mode,env,dec,id) "=" RoutineText(Proc([par],m),env,val).

/* 4.6 Declarers - Modes

The modes below are only a skeleton, as most modes are declared in the standard prelude. Those given represent the ways of combining modes together. INT and BOOL are given as (predefined) bold symbols. The last parameter, n, in Mode is a device to stop circular definitions. */

Mode(Actual,env,mode,S(n)) -> BoldSymbol(Mode(mode,n),env). Mode(kind,env,Ref(mode),0) -> "REF" Mode(Virtual,env,mode,n). Mode(kind,env,Proc(pars,mode),0) -> "PROC" "(" ModeList(env,pars) ")" Mode(Formal,env,mode).

BoldSymbol(Mode(INT,0),env) -> "INT". BoldSymbol(Mode(BOOL,0),env) -> "BOOL".

ModeList(env,mode.modes) -> Mode(Formal,env,mode,n) ( "," ModeList(env,modes) | NIL & modes=NIL).

Definition(Param(mode),env,dec,id) -> DefiningIdentifier(mode,Param,env,dec,id). /* */

/* 4.8 */ /* Identifiers and Bold Symbols */ /* */

/* Defining Identifiers are for variables and constants. The clauses below test that an identifier is only declared once in any range */

DefiningIdentifier(mode,env,tag:mode,tag) -> @Id(tag) & Unique(tag,mode,env).

Unique(tag,mode,Range(decs,env)) <- OneMember(tag,mode,decs).

OneMember(a,b,(a:b).c) <- ~Member(a,d,c). OneMember(a,b,(c:d).e) <- ~ a=c & OneMember(a,b,e).

Member(a,b,a.b.c). Member(a,b,c.d.e) <- Member(a,b,e).

/* Bold Symbols are used for system words, mode and operator names - upper case is the standard. But DefiningBold only applies to mode names */

DefiningBold(mode,env,tag:mode,id) -> @Bold(tag) & Unique(tag,mode,env).

/* Operators are either Bold symbols or certain other combinations established by the lexical analysis phase */

DefiningDyadicOp(mode,env,tag:mode,id) -> ( @Bold(tag) I @Dyad(tag) ) & (Unique(tag,mode,env) I Independent(tag,mode,env)).

DefiningMonadicOp(mode,env,tag:mode,id) -> ( @Bold(tag) I @Monad(tag) ) & (Unique(tag,mode,env) I Independent(tag,mode,env)).

/* The definitions below are for the applied use of the symbols */

Identifier(mode,env,Id(tag,offset)) -> @Id(tag) & Identified(tag,mode,offset,env).


BoldSymbol(mode,env,Id(tag,offset)) -> @Bold(tag) & Identified(tag,mode,offset,env). DyadicOperator(mode,env,Id(tag,offset)) -> (@Bold(tag) | @Dyad(tag)) & Identified(tag,mode,offset,env). MonadicOperator(mode,env,Id(tag,offset)) -> (@Bold(tag) | @Monad(tag)) & Identified(tag,mode,offset,env).

Identified(id,mode,offset,Range(decs,env)) <- (Member(id,mode,decs) & offset=0 | ~Member(id,m,decs) & Identified(id,mode,lev,env) & Sum(lev,1,offset)).

/* The value of an identifier is the location at which it is stored, not the contents of that location */

Semantics(Id(var,offset), val) => Lookup(var,offset,val).

Do(Lookup(var,offset,val), s, s) <- s = State(stack,heap,cont) & Lookup1(var,offset,stack,val). Lookup1(var,0,Frame(a,b,names,c,d,e,f),val) <- Lookup2(var,names,val). Lookup1(var,offset,Frame(display,a,b,c,stack,d,e),val) <- FindFrameNumber(offset,display,num) & FindFrame(num,stack,frame) & Lookup1(var,0,frame,val). Lookup2(var,var.val.rest,val). Lookup2(var,v.w.rest,val) <- Lookup2(var,rest,val).

FindFrameNumber(0,current.rest,current). FindFrameNumber(lev,current.rest,num) <- Sum(next,1,lev) & FindFrameNumber(next,rest,num). FindFrame(num,frame,frame) <- frame = Frame(num.rest,a,b,c,d,e,f). FindFrame(num,Frame(a,b,c,d,stack,e,f),frame) <- FindFrame(num,stack,frame).

/* The Deref operation is never supplied by the programmer explicitly - it derives from the coercions (see 5.1). It has the effect of fetching a value from the store */

Semantics(Deref(exp), val) => Semantics(exp, v); Fetch(v, val).

Do(Fetch(Loc(x,dyn),val), s, s) <- s=State(stack,heap,cont) & FindFrame(dyn,stack,Frame(dyn.r,a,b,local,c,d,e)) & Fetch1(x,local,val). Do(Fetch(Heap(x),val), s, s) <- s=State(stack,heap,cont) & Fetch1(x,heap,val). Fetch1(n,n.val.rest,val). Fetch1(n,m.this.rest,val) <- Sum(k,1,m) & Fetch1(n,k.rest,val).
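
/* A sketch of the fetch (an assumed store layout, not part of the original text): with two values stored and the counter 2 at the front, most recent first, the goals

Fetch1(2, 2.Int(7).Int(4).NIL, Int(7)). Fetch1(1, 2.Int(7).Int(4).NIL, Int(4)).

both hold - the counter is decremented as each value is stepped over */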


/* */ /* 5.1 */ /* Units */ /* */

/* Units are the basic 'statements' of the language. They are separated into several categories to determine the grouping of constructs, units being the least binding and primaries the most, and to aid the coercions. Brackets may be used to change the groupings (forming an Enclosed Clause) */

Unit(mode1,env,val1) -> ( Assignment(mode,env,val) | IdentityRelation(mode,env,val) | RoutineText(mode,env,val)) & Coerce(Strong,mode,mode1,val,val1) | Tertiary(mode1,env,val1).

Tertiary(mode1,env,val1) -> ( Jump(mode,env,val) | Skip(mode,env,val) | Formula(p,mode,env,val) | Nihil(mode,env,val)) & Coerce(Strong,mode,mode1,val,val1) | Secondary(mode1,env,val1).

Secondary(mode1,env,val1) -> Generator(mode,env,val) & Coerce(Strong,mode,mode1,val,val1) | Primary(mode1,env,val1).

/* Coercions specify the automatic type conversions that are available. Those given here - VOIDing and Dereferencing - are only two of six possible in the full language */


Coerce(Strong,mode1,mode2,val1,val2) <- Coerce(Meek,mode1,mode2,val1,val2). Coerce(Strong,mode,VOID,val,val).

Primary(mode1,env,val1) -> ( Call(mode,env,val) | Cast(mode,env,val) | Denotation(mode,val) | Identifier(mode,env,val) | EnclosedClause(mode,env,val)) & Coerce(Strong,mode,mode1,val,val1).

Coerce(Meek,mode,mode,val,val). Coerce(Meek,Ref(mode1),mode2,val1,Deref(val2)) <- Coerce(Meek,mode1,mode2,val1,val2).

/* 5.2 Units associated with names */

Assignment(mode,env,Asgt(target,val)) -> Primary(Ref(mode),env,target) ":=" Unit(mode,env,val).

/* The target of an assignment yields a location - which may be on stack or heap - and the value of the expression is inserted in this and also returned as the value of the assignment */

Semantics(Asgt(target,exp),val) => Semantics(target,store) // Semantics(exp,val); Update(store,val).

Do(Update(Loc(n,dyn),val), State(stack,heap,cont), State(newstack,heap,cont)) <- Update1(n,dyn,val,stack,newstack).

Do(Update(Heap(n),val), State(stack,heap,cont), State(stack,newheap,cont)) <- Update2(n,val,heap,newheap). Update1(n,dyn,val,Frame(dyn.r,a,b,locs,c,d,e), Frame(dyn.r,a,b,new,c,d,e)) <- Update2(n,val,locs,new). Update1(n,dyn,val,Frame(a,b,c,d,stack,e,f), Frame(a,b,c,d,newstack,e,f)) <- Update1(n,dyn,val,stack,newstack). Update2(n,val,n.old.rest,n.val.rest). Update2(n,val,m.item.rest,m.item.new) <- Sum(k,1,m) & Update2(n,val,k.rest,k.new).
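
/* For illustration (not part of the original text), assuming the same layout of a counter followed by the stored values, most recent first: the goals

Update2(2, Int(9), 2.Int(7).Int(4).NIL, 2.Int(9).Int(4).NIL). Update2(1, Int(9), 2.Int(7).Int(4).NIL, 2.Int(7).Int(9).NIL).

both hold - only the addressed value is replaced */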

/* An identity relation tests whether the names of two objects are the same, not their values */

IdentityRelation(BOOL,env,Ident(":=:",val1,val2)) -> Tertiary(Ref(mode),env,val1) ":=:" Tertiary(Ref(mode),env,val2).

Semantics(Ident(kind,arg1,arg2),Bool(val)) => Semantics(arg1, val1) // Semantics(arg2, val2); Ident(kind,val1,val2,val).

Do(Ident(a,b,c,d),s,s) <- Ident(a,b,c,d). Ident(":=:", x, x, TRUE). Ident(":=:", x, y, FALSE) <- Ne(x,y). Ident(":/=:", x, y, TRUE) <- Ne(x,y). Ident(":/=:", x, x, FALSE).

Generator(env,Ref(mode),val) -> (("LOC" | NIL) & val=LOC(mode) | "HEAP" & val=HEAP(mode)) Mode(Actual,env,mode,n).


/* A local generator creates a new location on the local stack of the appropriate size for the mode and returns a pointer to it, which is a pair, a position in the local stack and the dynamic level */

Semantics(LOC(mode), location) => Generate(LOC(mode), location).

Do(Generate(LOC(mode), Loc(locn,dyn)), State(stack,heap,cont), State(newstack,heap,cont)) <- stack=Frame(dyn.r,a,b,n.locs,c,d,e) & newstack=Frame(dyn.r,a,b,locn.mode.locs,c,d,e) & Sum(n,1,locn).

/* A heap generator creates a new space on the heap of the appropriate size and returns a pointer to it, which is the number of the location */

Semantics(HEAP(mode), location) => Generate(HEAP(mode),location).

Do(Generate(HEAP(mode), Heap(locn)), State(stack,n.heap,cont), State(stack,locn.mode.heap,cont)) <- Sum(n,1,locn).

Nihil(Ref(mode),env,NIL) -> "NIL".

Semantics(NIL, NIL) => NIL.


/* 5.4 Routines

A routine text may be used anywhere that a procedure name is used, e.g. as an actual parameter of a procedure, as well as the normal use in procedure declarations */

RoutineText(Proc(parmode,mode),env,Routine(pars,body)) -> "(" ParameterDeclaration(env,parmode,pars) ")" ( Mode(env,mode) | "VOID" & mode=VOID ) Unit(mode,Range(pars,env),body) & Level(env,lev) & Sum(lev,1,new).

ParameterDeclaration(env,decs,pars) -> Mode(env,mode) JoinedDefinition(Param(mode),env,decs1,pars1) ( "," ParameterDeclaration(env,decs2,pars2) & Append(decs1,decs2,decs) & Append(pars1,pars2,pars) | NIL & decs=decs1 & pars=pars1).

/* The evaluation of a routine forms a 'closure' which records the local environment. Hence a procedure used as an actual parameter knows where to get non-local variables */

Semantics(Routine(pars,body), Closure(pars,body,display)) => CurrentDisplay(display).

Do(CurrentDisplay(display), s, s) <- s = State(Frame(display,a,b,c,d,e,f),heap,cont).

/* Ten levels of expression (formula) are available - with 0 as the least binding. Level 10 is for monadic operators */


Formula(priority,mode,env,Call(val1,lh.rh.NIL)) -> Operand(l,mode1,env,lh) DyadicOperator(Proc([mode1,mode2],mode),env,val1) Operand(r,mode2,env,rh) & LE(priority,l) & LT(priority,r) & LE(1,priority) & LE(priority,9). Formula(10,mode,env,Call(val1,opnd)) -> MonadicOperator(Proc([mode1],mode),env,val1) Operand(10,mode1,env,opnd).

Operand(priority,mode,env,val) -> Formula(priority,mode1,env,val1) & StronglyCoerce(mode1,mode,val1,val). Operand(10,mode,env,val) -> Unit(mode,env,val).

/* A call invokes a procedure - parameterless procedures are omitted here */

Call(mode,env,val) -> Primary(Proc(pars,mode),env,proc) "(" ParameterList(pars,env,vals) ")" & val = Call(proc,vals).

ParameterList(mode.modes,env,val.vals) -> Parameter(mode,env,val) ( "," ParameterList(modes,env,vals) | NIL & modes=NIL & vals=NIL).

Parameter(mode,env,val) -> Unit(mode,env,val).

Semantics(Call(routine,pars),val) => Semantics(routine,closure) // Semantics(pars,actual); EnterProc(closure,actual,val).


/* Parameter passing is handled by the same mechanisms as used in identity declarations (4.4). SetList associates parameters with values */

Do(EnterProc(Closure(formal,body,display),actual,val), State(stack,heap,cont), State(Frame(dyn.display,val,NIL,0.NIL,stack,body,cont),heap, SetList(actual,formal); Semantics(body,val))) <- NewLevel(stack,dyn).

SetList(NIL,NIL) => NIL. SetList(actual.rest1,formal.rest2) => Set(formal,actual); SetList(rest1,rest2).

NewLevel(Frame(n.r,a,b,c,d,e,f),new) <- Sum(n,1,new).

/* The range of jumps is limited by the syntax - one cannot jump into a conditional for instance. However one can jump out of almost anything, an expression, a block etc. */

Jump(mode,env,Goto(label,offset)) -> ("GOTO" | "GO" "TO") @Id(label) & Label(label,offset,env).

Label(tag,0,Range(decs,env)) <- Member(tag,Label,decs). Label(tag,offset,Range(decs,env)) <- ~Member(tag,Label,decs) & Label(tag,lev,env) & Sum(lev,1,offset).

/* The semantics of Jumps is handled by continuations - at each stage the remaining expressions are pushed on the continuation stack and pulled off one at a time at the completion of the stage. A local jump throws away the continuation and finds the label. A non-local jump may also throw away the continuations of higher levels */

Semantics(Goto(label,offset), x) => Jump(label,offset).

Do(Jump(label,0), State(stack,heap,cont), State(stack,heap,Semantics(newcont,val))) <- stack=Frame(a,val,b,c,d,stms,e) & FindCont(label,stms,newcont). Do(Jump(label,offset), State(stack,heap,cont), State(frame,heap,Semantics(newcont,val))) <- FindFrame(offset,stack,frame) & frame=Frame(a,val,b,c,d,stms,e) & FindCont(label,stms,newcont).

FindCont(label,Lab(label);stm, stm). FindCont(label,a;b, c) <- FindCont(label,b,c). FindCont(label,Exit(stm,label,stms), stms). FindCont(label,Exit(stm,lab,stms), cont) <- FindCont(label,stms,cont).
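
/* As a sketch (a hypothetical statement list, not part of the original text): for the list s0;(Lab(l);s1) the goal

FindCont(l, s0;(Lab(l);s1), s1)

holds - everything before the label is discarded and s1 becomes the new continuation */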

/* 5.5 Units associated with values of any mode */

Cast(mode1,env,val1) -> Mode(Formal,env,mode,n) EnclosedClause(mode,env,val) & StronglyCoerce(mode,mode1,val,val1).

Skip(mode,env,SKIP) -> "SKIP".

Semantics(SKIP, NoVal) => NIL.


/* 8 Denotations

Denotations - or constant symbols - are given in the lexical syntax */

Denotation(BOOL,Bool(val)) -> @Bool(val). Denotation(INT,Int(val)) -> @Int(val).

/* The values of denotations are themselves */

Semantics(Int(x), Int(x)) => NIL.

Semantics(Bool(x), Bool(x)) => NIL.

Appendix B

An ASPLE Compiler

COMPILE(file) <- LEX(' ',file,a) & SYNTAX(b,a,NIL) & SYNTHESIS(b,c,NIL) & ASSEMBLY(c,d,NIL) & PRINTING(0,d).

/* LEXICAL SCAN */ /* */

LEX(' ',f,a) <- READCH(ch,f) & / & LEX(ch,f,a) | a=NIL. LEX(ch,f,a.b) <- LETTER(ch) & WORD(ch,f,a,n) & / & LEX(n,f,b) | DIGIT(ch) & NUMBER(ch,f,a,n) & / & LEX(n,f,b). LEX(':',f,a.b) <- READCH(ch,f) & ( ch = '=' & a=":=" | a=ch) & READCH(n,f) & LEX(n,f,b). LEX(ch,f,(ch.NIL).b) <- READCH(n,f) & LEX(n,f,b). LEX(c,f,NIL).

WORD(first,f,word,n) <- READCH(ch,f) & RESTWORD(ch,f,rest,n) & SYSTEM(first.rest,word). RESTWORD(ch,f,ch.rest,n) <- ALPHANUM(ch) & / & READCH(next,f) & RESTWORD(next,f,rest,n). RESTWORD(ch,f,NIL,ch). NUMBER(n,f,NUM(num),next) <- READCH(ch,f) & (DIGIT(ch) & PROD(n,10,m) & SUM(m,ch,p) & / & NUMBER(p,f,NUM(num),next) | LETTER(ch) & COMPLAIN(ch.' FOLLOWS '.n) & FAIL | ch=next & n=num).


ALPHANUM(c) <- LETTER(c) | DIGIT(c). SYSTEM(a,a) <- RESERVED(a) & /. SYSTEM(a,ID(a)).

RESERVED("BEGIN"). RESERVED("BOOL"). RESERVED("DO"). RESERVED("ELSE"). RESERVED("END"). RESERVED("FALSE"). RESERVED("IF"). RESERVED("INPUT"). RESERVED("INT"). RESERVED("OR"). RESERVED("OUTPUT"). RESERVED("REF"). RESERVED("THEN"). RESERVED("TRUE"). RESERVED("WHILE"). RESERVED("FI").
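
/* For example (not part of the original text), SYSTEM passes reserved words through unchanged and wraps any other word as an identifier, so the goals

SYSTEM("IF", "IF"). SYSTEM("X1", ID("X1")).

both hold - the first by the RESERVED table, the second by the default clause */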

/* SYNTAX */ /* */

SYNTAX(tree) => CHECK("BEGIN"); DCLTRAIN(mem) & CHECKIDLIST(mem); STMTRAIN(mem, tree); CHECK("END"). DCLTRAIN(env) => DECLARATION(env0) & /; CHECK(";"); (DCLTRAIN(env1) | & env1=NIL) & APPEND(env0, env1, env).

DECLARATION(env) => MODE(m); (IDLIST(REF(m), env) & / | ERROR("?",' NOT IDENT')).

MODE(INT) => "INT". MODE(BOOL) => "BOOL". MODE(REF(m)) => "REF"; MODE(m).

IDLIST(m, LOC(tag,m).env) => @ID(tag); ( ","; IDLIST(m, env) | & env = NIL).

STMTRAIN(env, sem.sems) => STM(env, sem); RESTSTM(env,sems).

RESTSTM(env,stm) => ";" & /; STMTRAIN(env,stm). RESTSTM(env,NIL) => NEXT(a) & ENDTRAIN(a) & /. RESTSTM(env,stm) => @a & COMPLAIN(' INVALID SYMBOL:'.a); RESTSTM(env,stm). ENDTRAIN("END"). ENDTRAIN("FI"). ENDTRAIN("ELSE").

STM(env, ASGT(tag, exp)) => IDENTIFIER(REF(m), env, tag) & /; CHECK(":="); EXP(m1,env,exp1) & DEREF(m1,m,exp1,exp). STM(env, COND(exp, s1, s2)) => "IF" & /; EXP(m,env,exp1) & DEREF(m,BOOL,exp1,exp); CHECK("THEN"); STMTRAIN(env, s1); ("FI" & s2 = NIL | "ELSE"; STMTRAIN(env, s2); CHECK("FI") | & COMPLAIN('FI OR ELSE EXPECTED')). STM(env, WHILE(exp, s)) => "WHILE" & /; EXP(m,env,exp1) & DEREF(m,BOOL,exp1,exp); CHECK("DO"); STMTRAIN(env, s); CHECK("END"). STM(env, INPUT(exp)) => "INPUT" & /; EXP(m1, env,exp1) & DEREF(m1,REF(m),exp1,exp) & INTBOOL(m). STM(env, OUTPUT(exp)) => "OUTPUT" & /; EXP(m1, env,exp1) & DEREF(m1,m,exp1,exp) & INTBOOL(m). STM(env,NIL) => NIL.

EXP(m, env, exp) => FACTOR(m1, env, lh); RESTEXP(m, env, m1, lh, exp). RESTEXP(m,env,m1,lh1,exp) => "+" & / & DEREF(m1,m2,lh1,lh) & INTBOOL(m2); FACTOR(m3, env, rh1) & DEREF(m3,m2,rh1,rh); RESTEXP(m,env,m2,op.lh.rh,exp) & PLUSOP(m,op). RESTEXP(m,env,m,exp,exp) => NIL. FACTOR(m,env,exp) => PRIMARY(m1, env, lh); RESTFACTOR(m, env, m1, lh, exp). RESTFACTOR(m,env,m1,lh1,exp) => "*" & / & DEREF(m1,m2,lh1,lh) & INTBOOL(m2); PRIMARY(m3,env,rh1) & DEREF(m3,m2,rh1,rh); RESTFACTOR(m,env,m2,op.lh.rh,exp) & MULOP(m,op). RESTFACTOR(m,env,m,exp,exp) => NIL. PRIMARY(m,env,exp) => IDENTIFIER(m, env, exp) | "("; COMPARE(m, env, exp); ")" | DENOTATION(m, exp).

COMPARE(m, env, exp) => EXP(m1, env, exp1); RESTCOMP(m, env, m1, exp1, exp). RESTCOMP(BOOL,env,m1,lh1,op.lh.rh) => RELOP(op) & / & DEREF(m1,INT,lh1,lh); EXP(m2,env,rh2) & DEREF(m2,INT,rh2,rh). RESTCOMP(m,env,m,exp,exp) => NIL. RELOP(EQ) => "=". RELOP(NE) => "$".

IDENTIFIER(mode, env, ID(tag)) => @ID(tag) & (MEMBER(LOC(tag, mode), env) & / | COMPLAIN(tag.' NOT DECLARED')).

DENOTATION(BOOL, VAL(0)) => "FALSE". DENOTATION(BOOL, VAL(1)) => "TRUE". DENOTATION(INT, VAL(val)) => @NUM(val).

/* ERROR ROUTINES */ ERROR(stop,message) => GET(stop,string) & COMPLAIN('?'.message.string). COMPLAIN(message) <- WRITEST('?'.message) & NEWLINE. CHECK(token) => @token & /. CHECK(token) => NEXT(a) & COMPLAIN(token.' INSERTED BEFORE '.a). NEXT(c,c.r,c.r). CHECKEQ(a,a) <- /. CHECKEQ(a,b) <- COMPLAIN(a.' SHOULD BE '.b). GET(stop,NIL) => @stop & /. GET(stop,NIL) => ";" & /. GET(stop,a.b) => @a; GET(stop,b).


WRITEST(a.b) <- WRITECH(a) & / & WRITEST(b). WRITEST(a) <- WRITECH(a).

CHECKIDLIST(dec.env) <- CHECK1(dec,env) & CHECKIDLIST(env). CHECKIDLIST(NIL). CHECK1(LOC(tag,mode),env) <- MEMBER(LOC(tag,m),env) & / & WRITEST('? IDENTIFIER DECLARED TWICE:'.tag). CHECK1(dec,env).

INTBOOL(INT) <- /. INTBOOL(BOOL) <- /. INTBOOL(m) <- COMPLAIN(m.' SHOULD BE INT OR BOOL').

DEREFERENCE(mode, mode, exp, exp). DEREFERENCE(REF(mode), mode1, tag, DEREF(exp)) <- DEREFERENCE(mode, mode1, tag, exp). DEREFERENCE(mode,reqd,tag,tag) <- WRITEST('? WRONG MODE, '.tag.' SHOULD BE '.reqd).

APPEND(u.v, w, u.x) <- APPEND(v, w, x). APPEND(NIL, x, x).

MEMBER(x, x.y). MEMBER(x, y.z) <- ~ x=y & MEMBER(x, z).

PLUSOP(INT,PLUS). PLUSOP(BOOL,OR). MULOP(INT,TIMES). MULOP(BOOL,AND).

/* SYNTHESIS */ /* */

SYNTHESIS(prog) => GEN(prog,temps,vars); ALLOCATE(temps); ALLOCATE(vars).

GEN(a.b, t,v) => GEN(a,t,v); GEN(b,t,v).


GEN(NIL, t,v) => NIL.

GEN(ASGT(n,e),t,v) => LOADVAL(e,v,t) & ADR(n,v,a); @CODE(STORE.a) & /.

GEN(COND(b,s,NIL),t,v) => IFNOTGO(b,l1,v,t); GEN(s,t,v); @LABEL(l1) & /.

GEN(COND(b,s1,s2),t,v) => IFNOTGO(b,l1,v,t); GEN(s1,t,v); @CODE(GOTO.l2); @LABEL(l1); GEN(s2,t,v); @LABEL(l2) & /.

GEN(WHILE(b,s),t,v) => @LABEL(l1); IFNOTGO(b,l2,v,t); GEN(s,t,v); @CODE(GOTO.l1); @LABEL(l2).

GEN(INPUT(ID(x)),t,v) => @CODE(INPUT); @CODE(STORE.y) & ADR(x,v,y).

GEN(INPUT(exp),t,v) => GENEXP(exp,t,v); @CODE(LDI); @CODE(INPUT); @CODE(STI.0).

GEN(OUTPUT(exp),t,v) => GENEXP(exp,t,v); @CODE(OUTPUT).

LOADVAL(a,v,t) => LOADSIMPLE(a,v,t) & /. LOADVAL(a.b.c,v,t) => ARITHEXP(a.b.c,v,t) & /. LOADVAL(a,v,t) => IFNOTGO(a,l1,v,t); LOADSIMPLE(VAL(TRUE),v,t); @CODE(GOTO.l2); @LABEL(l1); LOADSIMPLE(VAL(FALSE),v,t); @LABEL(l2).

LOADSIMPLE(DEREF(a),v,t) => LOADEREF(a,v,t) & /. LOADSIMPLE(ID(a),v,t) => @CODE(LOAD.b) & ADR(ID(a),v,c) & ADR(CONST(c),v,b) & /. LOADSIMPLE(VAL(a),v,t) => @CODE(LOAD.b) & ADR(CONST(a),v,b) & /. LOADEREF(ID(a),v,t) => @CODE(LOAD.b) & ADR(a,v,b) & /. LOADEREF(DEREF(ID(a)),v,t) => @CODE(LDI.b) & ADR(a,v,b) & /. LOADEREF(DEREF(a),v,t) => LOADEREF(a,v,t); @CODE(LDI.0) & /.

IFNOTGO(EQ.a.b,l,v,t) => ARITHEXP(MINUS.a.b,v,t); @CODE(JNE.l). IFNOTGO(NE.a.b,l,v,t) => ARITHEXP(MINUS.a.b,v,t); @CODE(JEQ.l). IFNOTGO(AND.a.b,l,v,t) => IFNOTGO(a,l,v,t); IFNOTGO(b,l,v,t). IFNOTGO(OR.a.b,l,v,t) => IFNOTGO(a,l1,v,t); @CODE(GOTO.l2); @LABEL(l1); IFNOTGO(b,l,v,t); @LABEL(l2). IFNOTGO(DEREF(a),l,v,t) => LOADVAL(DEREF(a),v,t); @CODE(JEQ.l).


IFNOTGO(VAL(TRUE),l,v,t) => NIL. IFNOTGO(VAL(FALSE),l,v,t) => @CODE(GOTO.l).

ARITHEXP(a.b.c,v,t) => SIMPLE(c) & ADR(c,v,d); ARITHEXP(b,v,t); @CODE(e.d) & OP(a,e) & /. ARITHEXP(a.b.c,v,t) => SIMPLE(b) & ADR(b,v,d); ARITHEXP(c,v,t); @CODE(e.d) & OP(a,e) & /. ARITHEXP(a.b.c,v,t) => ARITHEXP(b,v,t) & t=TEMP.d.t1; @CODE(STOR.d); ARITHEXP(c,v,t1); @CODE(e.d) & OP(a,e) & /. ARITHEXP(a,v,t) => LOADSIMPLE(a,v,t).

SIMPLE(VAL(x)). /* SIMPLE OPERANDS CAN BE LOADED BY ANY OP */ SIMPLE(ID(x)). SIMPLE(DEREF(ID(x))).

OP(PLUS, ADD). OP(TIMES, MUL). OP(MINUS, SUB). /* MACHINE OP NAMES */

ADR(id,(id.adr).rest,adr). /* FIND VALUE OF VAR OR CONSTANT */ ADR(id,(b.c).rest,adr) <- ADR(id,rest,adr).

ALLOCATE(a.b) => @a; ALLOCATE(b). /* SPACE FOR TEMPS AND VARS */ ALLOCATE(NIL) => NIL.

/* ASSEMBLY */ /* */

ASS(LAB(adr).rest,out,adr) <- ASS(rest,out,adr) & /. ASS(other.rest,other.out,adr) <- SUM(adr,1,next) & ASS(rest,out,next). ASS(NIL,NIL,adr). /* ASSEMBLY ALLOCATES NAMES OF LABELS */


/* OUTPUT OF RESULTS */ /* */

PRINT(a.b,n) <- WRITECH(n) & WRITECH(' ') & PRINT1(a) & NEWLINE & SUM(n,1,m) & / & PRINT(b,m). PRINT(NIL,n) <- NEWLINE & NEWLINE.

PRINT1(a.b) <- WRITECH(a) & WRITECH(' ') & WRITECH(b) & /. PRINT1(a) <- WRITECH(a).

Appendix C

Converting Metamorphosis Grammars to Prolog

/* These operator declarations allow the system to perform the basic parsing of an M-grammar */

OP('=>', RL, 10). OP(';', RL, 25). OP(&, PREFIX, 30). OP(@, PREFIX, 60).

/* The grammar converts an MG tree into a PROLOG tree in 2 passes. The first leaves possible 'NIL' branches, which the second removes. To use, call LOADG(filename), or LOADW(filename) to get clauses listed. */

LOADG(f) <- READ(a,f) & MG(a,b) & ADDAX(b) & FAIL. LOADG(f). LOADW(f) <- READ(a,f) & MG(a,b) & ADDAX(b) & WRITE(b) & FAIL. LOADW(f).

MG(v => w, x) <- / & MG1(v => w, x). MG(v, v).

/* MG1 does the top level translation */ MG1(v => w, u) <- z = s0.s.NIL & MG2(v, x, z) & MG3(w, y, z, T) & MG4(x <- y, u) & /. MG1(v, x) <- WRITECH(' TRANSLATION FAILURE: ') & WRITE(v) & FAIL.

/* MG2 converts a LHS term */ MG2(u, v, x) <- ATOM(u) & CONS(u.x, v) & /. MG2(u, v, x) <- CONS(w, u) & MGAPP(w, x, y) & CONS(y, v).

222 Metamorphosis Grammars in Prolog

/* MG3 converts an RHS - the last parameter is T normally, or F if the descendant of a | operator */

MG3(u.v, x, y, z) <- / & MG5(z, u.v, x, y).

MG3(u|v, w|x, y, z) <- / & MG3(u, w, y, F) & MG3(v, x, y, F).

MG3(u;v, w&x, s0.s.y, z) <- / & MG3(u, w, s0.s1.y, z) & MG3(v, x, s1.s.y, z).

MG3(u&v, w&v, y, z) <- / & MG3(u, w, y, z).

MG3(@u, x, y, z) <- / & MG5(z, u, x, y).

MG3(~u, x, y, z) <- / & MG6(z, ~u, x, y).

MG3(NIL, x, y, z) <- / & MG6(z, NIL, x, y).

MG3(&u, x, y, z) <- / & MG6(z, u, x, y).

MG3(u, v, w, z) <- MG2(u, v, w).

/* MG4 gets rid of NIL productions */ MG4(NIL, NIL) <- /.

MG4(u & NIL, v) <- / & MG4(u, v). MG4(NIL & u, v) <- / & MG4(u, v).

MG4((u & v) & w, x) <- / & MG4(u & v & w, x).

MG4(u & v, x & y) <- / & MG4(u, x) & MG4(v, y).

MG4(u | v, w | x) <- / & MG4(u, w) & MG4(v, x).

MG4(u <- v, w) <- / & MG4(v, x) & (x = NIL & w = u | w = (u <- x)).

MG4(u, u).

/* MG5 sorts out terminal symbols with non-terms */

MG5(T, x, NIL, (x.s).s.t). MG5(F, x, s0=(x.s), s0.s.t).

/* MG6 does the same for empty productions */

MG6(T, x, x, s.s.t). MG6(F, x, x & s0=s, s0.s.t).

MGAPP(u.v, w, u.x) <- MGAPP(v, w, x). MGAPP(NIL, u, u).
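
/* As a sketch of the overall effect (a hypothetical rule, not from the thesis), the grammar rule

P(x) => Q(x); "A".

is translated by the clauses above into

P(x,s0,s) <- Q(x,s0,"A".s).

the usual difference-list encoding: Q consumes the front of the input string and the terminal "A" is then removed from what remains */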
