
The Formal Description

of Programming Languages

using Predicate Logic

by

Christopher D.S. Moss

Submitted for the Ph.D. Degree

Department of Computing

Imperial College, London

July 1981

ABSTRACT

Metamorphosis grammars and the Horn Clause subset of first-order predicate logic as used in the Prolog language provide a powerful formalism for describing all aspects of conventional programming languages.

Colmerauer's M-grammars generalise the traditional grammar rewriting rules to apply to strings of function symbols with parameters. Expressed in first-order logic and using resolution, these provide the facilities of other formalisms such as W-grammars and Attribute Grammars with greatly improved ease of use and comprehension, and the added advantage that they may be run directly on a computer.

The thesis provides a methodology for expressing both syntax and semantics of programming languages in a coherent framework. Unlike some formalisms which attempt to give most of the definition in terms of either syntax or semantics, this tries to preserve a natural balance between the two. The syntax separates lexical and grammar parts and generates an abstract syntax which includes the semantics of 'static' objects such as numbers. The semantics is expressed by means of relations which express state transformations using logic rather than the more traditional lambda calculus. Prolog has a well-defined fixpoint or denotational semantics as well as its proof theoretic semantics, which gives the definitions an adequate mathematical basis. The traditional axiomatic method can also be used to express the semantics using a metalevel proof system in which the proof rules become axioms of the system.

To demonstrate these principles, descriptions of three example languages are presented. These are ASPLE, a small language which has been used to compare other methods, the Prolog language itself (a non-deterministic applicative language) and a subset of Algol 68 including full jumps and procedures. The definition of the latter uses a method similar to the continuation method.

An extensive survey is given of methods of syntax and semantic definition and several applications of the method are suggested, including language prototyping systems, compilers and program proving systems.

CONTENTS

1. Introduction 7

2. Grammars and Logic
   2.1. Metamorphosis Grammars 16
   2.2. The Development of Syntax Descriptions 35

3. Semantics
   3.1. Relational Semantics 60
   3.2. Axiomatic Semantics 71
   3.3. The Development of Semantics 78

4. Examples of Formal Definitions
   4.1. ASPLE 92
   4.2. Prolog 119
   4.3. Mini-Algol 68 139

5. Applications of Formal Definitions
   5.1. Prototyping of languages 153
   5.2. Towards a logic-compiler 158
   5.3. Program proving and transformation 172

References 179

Appendices
   A. The definition of a subset of Algol 68 188
   B. A Compiler for ASPLE 214
   C. The Conversion of M-grammars to Prolog 222

At the still point of the turning world. Neither flesh nor fleshless;
Neither from nor towards; at the still point, there the dance is,
But neither arrest nor movement. And do not call it fixity,
Where past and future are gathered. Neither movement from nor towards,
Neither ascent nor decline. Except for the point, the still point,
There would be no dance, and there is only the dance.

T. S. Eliot (1935)

The Four Quartets - Burnt Norton

Thanks . . .

I would like to express my appreciation to everyone who has helped me in so many ways: by providing inspiration and frustration; by encouragement in chatting over issues and criticizing inane notions; by making life worth living in the real world that exists outside the thesis factory; and by practical help in many ways.

In particular I must thank Bob Kowalski for his continual inspiration as my supervisor; Keith Clark and Maarten van Emden for discussions of tricky questions; Ian Moor and Moez Agha Hosseini for acting as sounding boards and being extremely hospitable room-mates; Sarah Bellows and Ellen Haigh, who managed to locate the most obscure reports in the library; Diane Reeve and Sandra Evans for typing large sections of the thesis; and Karen King for being patient with me when the whole exercise seemed futile.

I was supported during this time by a studentship from the Science Research Council. They have my deep gratitude.

Chris.

Chapter 1

Introduction and Summary

The aim of providing an entirely formal specification for a programming language is a quest which has attracted a great deal of attention over the past twenty years. Although the majority of the problems have now been solved using a variety of techniques, what is still lacking is a common formalism with which to draw these together to make them readily comprehensible to the average practitioner of computing.

The easiest part of a language to formalise is the context-free syntax. In this area BNF and its variants have gradually prevailed over the alternatives such as those used to define COBOL. The context-sensitive parts were solved in principle by van Wijngaarden in the definition of Algol 68, but other related formalisms, such as attribute grammars, have been attracting more attention because of their increased readability and amenability to computer implementation compared with W-grammars.

The definition of semantics has taken much longer to establish and there is still considerable variation in the style of presentation, although the main lines are more generally agreed. Early definitions were essentially "operational" in nature, based on simple automata which could "execute" programs. These are unsatisfactory on several counts: they cannot easily be used for many of the basic tasks for which semantics are required, such as proving properties of programs, or input-output relationships and equivalence; they have no way of describing non-terminating programs; and they are too "low-level" to provide an easy conceptualisation of the "meaning" of program constructs.


Later methods have been much more abstract, with a mathematical or logical basis. Currently, the most complete and widely used method is that of denotational semantics, introduced by Strachey and Scott, which describes a language in functional notation. The lambda calculus is used as a metalanguage and various mappings, of identifiers to stores and stores to values, are described in terms of this. There are two other popular methods which are more abstract than denotational semantics. One is the axiomatic or inductive assertion method of Floyd and Hoare, which is widely used in program proving but requires the user to supply the inductive assertions along with the program, and also has difficulty with such intrinsic programming constructs as jumps and functions with side effects. The other is the algebraic method, which characterises the semantics of programs by a set of properties which are required of programs. It is not clear at this point how well this deals with the more complicated parts of programming languages or how easy it is to show that an algebraic definition is complete. The axiomatic method is probably best considered as a set of theorems or lemmas derived from the denotational definition and useful for specific purposes such as program proving.

In this thesis we demonstrate a logic programming approach to the definition of programming languages. The basis of this is the Prolog language, which uses the Horn-clause subset of predicate logic linked with the resolution method for matching clauses. Each clause is composed of predicates in the form:

A <- B1 & B2 & ... & Bn.

where n >= 0, A and the Bi are predicates, and '<-' stands for 'if'. This may be regarded as an assertion if n=0 and as either an implication or a procedure if n>0. If A is absent it may be regarded as either a denial or a goal. Any variables in the clause are regarded as universally quantified over the clause. An example of a complete Prolog program (including a goal statement) is:

Human(Turing).
Human(Socrates).
Fallible(x) <- Human(x).
Greek(Socrates).
<- Fallible(y) & Greek(y).

for which the only valid solution is y=Socrates.

The procedural interpretation involves matching goals with the heads (left hand sides) of procedures and replacing these by the bodies (right hand sides) of the procedures in a manner very similar to the productions of a grammar, with each branch terminating in an assertion. This process is non-deterministic, since more than one head may match a goal. It can also be interpreted or compiled on a computer with remarkable efficiency.
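The procedural interpretation described above can be made concrete. The sketch below is an illustrative Python transcription, not part of the thesis: terms are tuples whose first element is the predicate name, variables are strings beginning with a lower-case letter (following the thesis' notational convention), and a backtracking generator stands in for resolution.

```python
# Illustrative sketch only: a tiny Horn-clause interpreter.
# The occurs check is omitted, as in most Prolog systems.

def is_var(t):
    return isinstance(t, str) and t[:1].islower()

def walk(t, s):
    # Follow variable bindings in the substitution s.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def rename(t, tag):
    # Give a clause fresh variables each time it is used.
    if is_var(t):
        return t + "#" + tag
    if isinstance(t, tuple):
        return tuple(rename(u, tag) for u in t)
    return t

def solve(goals, clauses, s, depth=0):
    # Match the first goal against the head of each procedure and
    # replace it by the body: the rewriting described in the text.
    if not goals:
        yield s
        return
    for i, (head, body) in enumerate(clauses):
        tag = "%d.%d" % (depth, i)
        s2 = unify(goals[0], rename(head, tag), s)
        if s2 is not None:
            new = [rename(b, tag) for b in body] + list(goals[1:])
            yield from solve(new, clauses, s2, depth + 1)

# The program from the text.
clauses = [
    (("Human", "Turing"), []),
    (("Human", "Socrates"), []),
    (("Fallible", "x"), [("Human", "x")]),
    (("Greek", "Socrates"), []),
]
goal = [("Fallible", "y"), ("Greek", "y")]
answers = [walk("y", s) for s in solve(goal, clauses, {})]
print(answers)   # ['Socrates']
```

Run against the goal <- Fallible(y) & Greek(y), the generator first tries y=Turing, fails on Greek(y), backtracks, and produces the single binding y=Socrates, as stated above.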

An integral feature of Prolog systems is the use of metamorphosis grammars, originally envisaged by Colmerauer. These may be regarded as a regularised form of W-grammars and can be applied directly to the definition of both context-free and context-sensitive portions of programming languages. They are much simpler to comprehend than W-grammars because of the type-free nature of the logic used, and compare favourably with the use of attribute grammars for this purpose. In addition, they can be run directly on a computer to parse or generate a program, and for one-pass languages this requires no modification to the normal definition.


Colmerauer's original definition followed the style of Chomsky's production rules in allowing several symbols on the left hand side, although only following a non-terminal. However, these grammars can be systematically transformed into rules which are similar to context-free rules in that they only have a single non-terminal on the left hand side. These are slightly more general than the "definite clause grammars" defined by Pereira and Warren since they allow non-terminals as parameters, and they have the same power as W-grammars.

The syntax description of a language can be used to generate a more convenient "abstract syntax tree" which may be used to define the semantics of a program. This tree is generated at the root of the parse tree, and may also be termed the "static semantics", as it defines the value of those constructs which may be understood in an essentially static way: e.g. constants in a program.

Defining the semantics of a language may then be understood as giving the specification for the value of any abstract syntax tree defined by the syntax. This value may be defined by a Prolog program which defines the value of any program written in the programming language. Since this value involves programming constructs such as conditionals and loops it is not sufficient to regard it as the effect of evaluating a Prolog program. But the unique dual nature of the semantics of Prolog gives a valid basis which is not "operational".

Logic has both a model-theoretic and a proof-theoretic semantics. The first ensures that any Prolog program has a "model" and this corresponds to the fix-point semantics which defines the value of a program inductively. The second ensures that the proof of any implication can be displayed. The equivalence of these two semantics is known as the completeness of first-order logic. However, another property, the undecidability of first-order logic, ensures that one cannot guarantee that the evaluation of the proof of any proposition will terminate in a finite time. Hence it is undesirable to consider the proof-theoretic method in isolation.

To define the semantics of the constructs of a programming language, Prolog is used as a metalanguage to give the "relational semantics" of the language. This involves the use of terms which name "states" of the identifiers, stores, files etc which the program manipulates. Because of the consistency with which the states are used, the same mechanism that is used in metamorphosis grammars may be involved, giving an effect which is a form of relational composition.

One characteristic of this specification is that it must be "complete" in that it has a value for every input. It is therefore important to include error values to correspond to possible errors in the evaluation of a program, such as overflow or the non-termination of loops. Generally, these correspond to the use in Prolog of negation interpreted as failure to prove, and the whole definition is therefore regarded as a "closed world".

The relational semantics is not the only one that can be given. The axiomatic semantics can be similarly presented as a meta-level proof system. Both program and assertions to be proved are represented as terms and the proof rules become axioms, or clauses, in the Prolog system. To achieve a useful program proving system it is necessary to add to this axioms for algebraic equivalence etc.

Many other uses of the definitions suggest themselves. One of the earliest uses of M-grammars was to write a compiler for a small Algol-like language. This is extended here to propose a schema for a compiler-compiler which includes the transformation of the semantics from one language to another as well as the syntax. There is already a growing literature on the transformation of programs written in Prolog from specifications to efficient algorithms. This work suggests that this can be extended to the more traditional algorithmic languages.

Summary of later Chapters

Chapter 2. Grammars and Syntax

Colmerauer's Metamorphosis Grammar is introduced with examples showing the greater clarity achieved for context-sensitive grammars. One aspect of M-Grammars that has been overlooked (the use of non-terminals as parameters) is corrected and it is shown how any M-grammar may be cast in "context-free" form, in which each left hand side has a single non-terminal. Different realisations of M-grammars in logic are discussed and their relative merits pointed out.

Other forms of syntactic description are then introduced and compared with M-grammars, including VDL, W-grammars and Attribute grammars. The fusion of the latter two, in Extended Attribute Grammars by Watt and Madsen, is shown to be very close to the M-grammar formalism, although their basis is very different. In particular the use of resolution with M-grammar systems simplifies the production of attributes, while there are a number of results from the "input-output" annotations of attribute grammars that are useful in logic programming systems.


Chapter 3. Semantics

The basis of a semantic definition based on Prolog is presented. The model and proof-theoretic semantics of Prolog are summarised and the fixpoint semantics also reviewed. The requirements for relational semantics are presented and a methodology for semantic definitions based on Prolog is outlined.

An outline is also given of the application of the axiomatic method in a first-order system. This uses a meta-level proof system in which the proof rules are axioms and predicates to be proved are treated as terms.

An extended review of the different approaches to semantics is given, covering operational, axiomatic, denotational and algebraic methods. The ideas of complementary definition are reappraised and related back to the twin poles of model and proof theory. We argue for the relational semantics using logic as a metalanguage as an alternative to the denotational semantics based on the lambda calculus.

Chapter 4. Examples of Semantics

Three examples of programming language definitions are presented in order to demonstrate the application of the principles suggested. These are complete definitions, including both syntax and semantics.

The first is ASPLE, a simple Algol 68 based language used by several authors to demonstrate the principles of language definition. It includes declarations, assignments, conditions, loops and references, but no jumps or procedures. Both relational and axiomatic semantics are given.

The second language is Prolog itself, which is a complete contrast to Algol-like languages. It is an applicative, non-deterministic language useful for symbolic manipulation rather than numerical computation. This section is useful in explaining the essential features of Prolog to those who are not familiar with it. It is also useful in demonstrating the way in which the semantic model of a language may be refined progressively to include more concrete implementation details. This leads to a discussion of one of the more controversial features of Prolog - the 'cut' or 'slash' predicate.

The third section returns to Algol-like languages and expands the ASPLE language into a subset of Algol 68 which includes blocks, functions and jumps. Different treatments of programs with jumps are considered, and a technique analogous to the use of continuations in denotational semantics is presented.

Chapter 5. Applications

Formal definitions are not useful unless they can be used, and this chapter looks tentatively at the ways in which the definitions presented earlier may be put to prac- tical use.

Producing prototype versions of a new language quickly and accurately from a specification is potentially valuable to program designers. Though several of the specifications presented here have been run, there are scaling up problems, which are discussed and possible lines of research outlined.

M-grammars have been used in both the parsing and generation phases of a compiler, though their efficiency is not impressive. An example of a compiler for ASPLE is presented and a scheme laid out for a logic-based compiler-compiler. This has the merit of dealing systematically with the semantics as well as the syntax of the language, based on the principles of the PQCC project at Carnegie-Mellon.

The application of axiomatic semantics to program proving systems is immediate, but a usable Prolog-based system needs other facilities. These are discussed and an outline for such a system proposed.

Another approach to program proving is the concept of transforming programs from abstract specifications to efficient running algorithms. A great deal of work is going on in this area based on Prolog as well as other formalisms. The suitability of logic systems is thus more obvious than was once supposed.

Reading the Thesis

Though a brief description of Prolog has been given in this chapter, a full definition of Prolog is not given until chapter 4.2. A reader who is unacquainted with this notation may prefer to look at this section first, or consult one of the other descriptions of Horn Clause programming, such as Kowalski (1979).

In chapter 3 a historical survey of semantic methods is given in section 3.3. The reader may well find it best to read this before tackling the earlier sections of the chapter.

Chapter 2.1

Metamorphosis Grammars

Metamorphosis grammars generalise the rewriting rules of the Chomsky grammars to apply to function symbols with parameters rather than (atomic) symbols. The difference may be compared to the distinction between predicate and propos- itional logic.

We will introduce the grammar using a simple example and then present it formally. The example is that of the context-sensitive language a^n b^n c^n, n>0, and the syntax here is informal, with lower case letters representing variables and terminal symbols underlined.

Sentence(n) -> Letters(n,A) Letters(n,B) Letters(n,C)
Letters(S(m),x) -> Letters(1,x) Letters(m,x)
Letters(1,y) -> y

If we consider the substitution of variables n=S(1) and m=1, then the rewriting takes place as follows:

Sentence(S(1))
-> Letters(S(1),A) Letters(S(1),B) Letters(S(1),C)
-> Letters(1,A) Letters(1,A) Letters(1,B) Letters(1,B) Letters(1,C) Letters(1,C)
-> AABBCC.

Here each line corresponds to one or more applications of the rewrite rules, in the same order as they are given. In the second rewriting there are three applications in which different substitutions for the variable x are made, and in the third there are six for the variable y.
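Because every rewriting step is determined by the parameters, the derivation above can be transcribed almost literally into executable form. The following Python sketch is illustrative only and is not part of the thesis; ordinary numerals stand in for the successor terms, so n=2 plays the role of n=S(1).

```python
# Illustrative transcription of the parameterised rules above.

def letters(n, x):
    # Letters(S(m), x) -> Letters(1, x) Letters(m, x)
    # Letters(1, x)    -> x
    if n == 1:
        return x
    return letters(1, x) + letters(n - 1, x)

def sentence(n):
    # Sentence(n) -> Letters(n, A) Letters(n, B) Letters(n, C)
    return letters(n, "A") + letters(n, "B") + letters(n, "C")

print(sentence(2))   # AABBCC, as in the derivation above
```

Note that the parameters leave no choice at any step: the grammar generates exactly the strings A^n B^n C^n, one for each n.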


In this terminal string there are no uninstantiated or free variables. This is a characteristic of M-grammars. They are defined on the variable-free set consisting of all the function and constant symbols applied in all possible ways to each other (the so-called Herbrand Universe). Thus in a formal definition of M-grammars the rewriting rules are a valid generalisation of Chomsky rules.

As a contrast, consider a grammar for the same language based on the more traditional rewriting rules of Chomsky grammars:

S -> aSBC | abC
CB -> BC
bB -> bb
bC -> bc
cC -> cc

Although compact, the effect of this grammar is far from obvious, and deriving a specific string is a tricky process.

To define a metamorphosis grammar formally one must first define functional terms and strings.

An n-ary functional term is written f(t1,..,tn) where t1,..,tn are terms constructed out of function symbols, which may include zero-order terms (constants) and variables which may range over any terms.

A string is a list of variable-free terms connected by the infix function symbol '.' and terminated by the constant symbol 'NIL', e.g. A.B.NIL, which may be written informally as AB. A term of this form which includes variables is called a string schema. The null string is simply NIL.


A Metamorphosis grammar is defined by 5 parts,

(F, VN, VT, VS, ->)

where

(a) F is a set of functional symbols (containing '.' and NIL).

(b) VT is a vocabulary of terminal symbols (a subset of the Herbrand Universe of F, written H[F]).

(c) VN is a vocabulary of non-terminal symbols (also a subset of H[F]). We assume by convention that VN ∩ VT = ∅.

(d) A set of starting symbols VS (a subset of VN).

(e) A rewriting relation -> on V*, where V = VN ∪ VT, with the restrictions that x -> y implies x ≠ NIL and x contains a non-terminal.

The language, L(G), generated by the grammar G is the set of strings on VT given by:

L(G) = { t ∈ VT* | there exists s ∈ VS with s ->* t }

The potential difficulty of applying the definition (cf. the problems of consistently generating the two-level grammars of van Wijngaarden) is alleviated by three factors:

(1) A consistent mapping onto first-order logic is possible, thus making available the vast quantity of theoretical and practical results in this area.

(2) One important feature of logic is the method of resolution used for 'matching' variables. This allows all possible productions to be represented at any stage of the production process, without requiring an extra set of productions for variables.

(3) It can be shown without loss of generality (see below) that one can limit rewriting rules to those which only contain a single non-terminal symbol on the left hand side. This potentially makes available many of the results from the theory of context-free grammars, which has been extensively researched over the last twenty years.

Representing M-grammars

Several representations of M-grammars in logic have been used: some are more efficient in processing terms, and others are more general. We will show later that all rules may be simplified so that only a single non-terminal appears on the left hand side. Hence it is only necessary to provide rules for this. At the end of the section we will also include Colmerauer's (1978) original formulation, which has historic interest.

Most of the formulations depend on observing the similarity between production rules and the composition of relations. For instance, given the grammar:

Sentence -> NounPhrase VerbPhrase
NounPhrase -> Determiner Adjective Noun
VerbPhrase -> Verb
Determiner -> "the"
Adjective -> "mome"
Noun -> "raths"
Verb -> "outgrabe"

the structure of the sentence "the mome raths outgrabe" is:

                        Sentence
           NounPhrase                  VerbPhrase
    Determiner  Adjective   Noun          Verb
  1   "the"   2   "mome"  3  "raths"  4  "outgrabe"  5

Thus a sentence may be considered as the composition of a noun phrase and verb phrase, given relations naming the constituents as follows:

Sentence from 1 to 5
NounPhrase from 1 to 4
VerbPhrase from 4 to 5

and the description of the sentence is:

Sentence from 1 to 5 if
   NounPhrase from 1 to 4 and
   VerbPhrase from 4 to 5

This may be generalised by placing variables in the place-markers to give a rule:

Sentence(u1,u2) <- NounPhrase(u1,u3) & VerbPhrase(u3,u2).

which is the normal form of relational composition. The other rules may be constructed in a similar way.

These principles may be applied in several ways.


1. Using a single 3-place predicate Connects, a grammar is represented as follows.

A rule

l -> r1 r2 .. rn

is translated

Connects(l,u0,u) <- Connects(r1,u0,u1) & ... & Connects(rn,un-1,u).

Each position in the string is marked by a unique atom (say an integer) and the terminal symbol t between two points p1 and p2 is given by an assertion:

Connects(t, p1, p2).

Thus the sentence above may be represented by the assertions

Connects(the, 1, 2).
Connects(mome, 2, 3).
Connects(raths, 3, 4).
Connects(outgrabe, 4, 5).

and the grammatical rule which parses this statement may be represented by

Connects(Sentence, u1, u3) <-
   Connects(NounPhrase, u1, u2) & Connects(VerbPhrase, u2, u3).

The other rules which are needed may of course be described similarly.

This representation is particularly simple, but has several disadvantages from a practical point of view. All the terminal symbols must be expressed by means of assertions, and the use of only a single predicate symbol is a disadvantage in Prolog systems, which generally index on the name of the predicate in an efficient manner.
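As an illustration of this first representation, the following Python sketch (the search strategy and names are mine, not the thesis') encodes the terminal assertions as a set of facts and each grammar rule as a case of a single connects relation; a bounded search over the intermediate place-markers stands in for resolution.

```python
# Illustrative sketch of the Connects representation.

# Terminal assertions: Connects(the, 1, 2) etc.
facts = {("the", 1, 2), ("mome", 2, 3), ("raths", 3, 4), ("outgrabe", 4, 5)}

# One entry per grammar rule; each body is a list of symbols.
rules = {
    "Sentence":   [["NounPhrase", "VerbPhrase"]],
    "NounPhrase": [["Determiner", "Adjective", "Noun"]],
    "VerbPhrase": [["Verb"]],
    "Determiner": [["the"]],
    "Adjective":  [["mome"]],
    "Noun":       [["raths"]],
    "Verb":       [["outgrabe"]],
}

def connects(sym, u1, u2):
    # Connects(t, p1, p2) holds for terminal assertions ...
    if (sym, u1, u2) in facts:
        return True
    # ... and Connects(l, u0, u) holds if the body symbols span
    # u0..u for some placing of the intermediate points.
    def seq(syms, a):
        if not syms:
            return a == u2
        return any(connects(syms[0], a, mid) and seq(syms[1:], mid)
                   for mid in range(a, u2 + 1))
    return any(seq(body, u1) for body in rules.get(sym, []))

print(connects("Sentence", 1, 5))   # True
```

The exhaustive search over intermediate points makes visible the practical cost noted above: every terminal must be looked up among the assertions.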

2. Two three-place predicates - say, Nonterm and Term - may be used, the first for non-terminals and the second for terminals. This increases the flexibility with which terminal symbols may be represented. A particularly useful method is to use difference lists as follows. A single clause is used to define Term:

Term(s, s.x, x).

This may be read: a terminal symbol s is the first item on the list which is the second argument of Term; the third argument is the remainder of the list with that item removed. Grammar rules are expressed as before, and the starting symbol of the grammar is expressed as:

<- Nonterm(S, list, NIL).

where S is the starting symbol of the grammar, list is the string corresponding to a sentence in the grammar and NIL represents the end of the list. The advantage of this representation is that for both parsing and generation of sentences the target strings may be held as terms rather than as assertions, which is far more convenient in practical logic programming. When parsing from left to right, no searching through assertions is necessary, as the next item in the string is always available.
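The difference-list technique can be sketched as follows (an illustrative Python transcription, not the thesis' notation): each relation consumes terminals from the front of the input list and returns the unconsumed remainder, with None standing for failure.

```python
# Illustrative sketch of difference-list parsing.

def term(symbol, s):
    # Term(s, s.x, x): consume one terminal from the front of the list.
    if s and s[0] == symbol:
        return s[1:]
    return None

def noun_phrase(s):
    # NounPhrase -> Determiner Adjective Noun, each one word here.
    for w in ("the", "mome", "raths"):
        if s is None:
            return None
        s = term(w, s)
    return s

def verb_phrase(s):
    if s is None:
        return None
    return term("outgrabe", s)

def sentence(s):
    # Sentence -> NounPhrase VerbPhrase: relational composition,
    # threading the remainder from one constituent to the next.
    return verb_phrase(noun_phrase(s))

# <- Nonterm(Sentence, list, NIL): the parse succeeds when the
# remainder is the empty list.
print(sentence(["the", "mome", "raths", "outgrabe"]))   # []
```

As the text observes, no searching through assertions occurs: the next terminal is always at the head of the list being threaded through.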

There is one extra rule that must be observed in the general case. If there is a rule of the form:


N(x) -> x.

where x is a variable and N a nonterminal, then it must be represented by two clauses:

Nonterm(N(x), u0, u) <- Nonterm(x, u0, u).

Nonterm(N(x), u0, u) <- Term(x, u0, u).

In practice it is usually possible to select which of these translations is appropriate for the grammar, since one knows whether x will represent a terminal or non-terminal.

3. It is possible to optimise the second representation in two ways:

3.1. The Term predicate may be dispensed with entirely. A clause of the form

Nonterm(A,u0,u) <- Term(B,u0,u1) & Nonterm(C,u1,u2) & Term(D,u2,u).

is replaced (by resolution against the earlier clause for Term) by

Nonterm(A, B.u0, u) <- Nonterm(C, u0, D.u).

3.2. Each occurrence of an atom having the form Nonterm(N(p1,p2,..,pn), ui, uj), where N is a non-terminal with n parameters, is replaced by the atom N(p1,p2,..,pn,ui,uj), which has n+2 parameters.

This has the computational advantage that the name of the non-terminal can be used as the predicate name which is used as a primary index in all Prolog systems. It has the disadvantage that rules of the form N(x) -> x, where x is a non-terminal, cannot be represented directly.


In a rule of this form, x may stand either for a terminal or a non-terminal symbol. Colmerauer's original paper omits to consider the latter possibility, though it only affects his normalisation procedure marginally. However it does affect the representation of the grammar forms in logic. Rules of this form are to be found, for instance, in the Algol 68 report. An example using our notation is:

Pack(x) -> begin x end.

The power of this facility may be illustrated by the ease with which two common notations in extended BNF may be implemented (e.g. see Wirth 1977).

(a) Optional constructs. These may be indicated by placing square brackets around the construct. If the characters '[' and ']' are together regarded as the non-terminal symbol, this may simply be defined by:

[ x ] -> x.
[ x ] -> NIL.

(b) Repetitions. These may be similarly indicated using curly brackets, whose definition is:

{ x } -> x { x }.
{ x } -> NIL.
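These two pairs of rules can be mimicked directly in the difference-list style. The sketch below is illustrative Python, not the thesis' formulation: a nonterminal is treated as a function from an input list to a remainder (or None), and the hypothetical digit nonterminal exists only for the demonstration.

```python
# Illustrative sketch of the extended-BNF constructs.

def optional(nonterm, s):
    # [ x ] -> x.   [ x ] -> NIL.
    rest = nonterm(s)
    return rest if rest is not None else s

def repetition(nonterm, s):
    # { x } -> x { x }.   { x } -> NIL.
    rest = nonterm(s)
    return repetition(nonterm, rest) if rest is not None else s

def digit(s):
    # A hypothetical nonterminal consuming one digit character.
    if s and s[0] in "0123456789":
        return s[1:]
    return None

print(repetition(digit, list("123abc")))   # ['a', 'b', 'c']
```

Both constructs succeed on every input, exactly as the NIL alternatives in the rules guarantee; they simply consume as much as they can.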

The two optimisations (3.1 & 3.2) can be combined, yielding the representation which is used by most Prolog systems:

A(B.u0, u) <- C(u0, D.u).

4. A final representation is of interest as possibly the most natural translation of the rewriting rules, as well as being that used by Colmerauer in his original paper. A two-place predicate D is used and rules are adjusted to the form:

Nt t -> t0 Nt1 t1 ... Ntm tm

where the Nti are nonterminals and the ti are terminals, and are represented using the corresponding string schema:

D(nt.t.u, t0.u0) <- D(nt1.t1.u1, u0) & ... & D(ntm.tm.u, um-1).

This representation shows how extra terminals may appear on the left hand side after the initial non-terminal. Any number of terminals may of course be included at each point that a terminal is indicated.

Although many grammars for natural language were written using left hand side terminals, it has been found that they are more obscure than grammars written using other devices, such as the Extraposition Grammars suggested by Pereira (1980).

Simplification of Metamorphosis Grammars

Chomsky's grammars assume an arbitrary mixture of terminals and non-terminals on the left hand side of rewriting rules. However, most results in language theory have been derived with rules of the "context-free" type which have only a single non-terminal on the left hand side. Colmerauer demonstrates that a grammar may be simplified so that all rules have the form:

nonterminal, terminal,..,terminal -> ...

(where the terminals on the left hand side may be absent).


He then presents a binary logic relation which can represent any grammar written in that form (as presented above).

Here we go one stage further and show that any grammar can be expressed in a form which contains rules having only a single non-terminal symbol on the left hand side. This condition is similar to the 'context-free' condition for Chomsky grammars and leads to a representation using a three-place logic relation.

To translate a grammar to 'context-free' form:

(1) Replace each rule of the form:

l1 l2..lm -> r1 r2..rn

where the li and ri are any terminals or non-terminals, by

N(l1, u0, l2.l3..lm.u) -> N(r1,u0,u1) N(r2,u1,u2) .. N(rn,un-1,u)

where u, u0, u1..un-1 are variables and N is a function symbol which does not occur in the grammar.

(2) Add productions of the form:

N(t, NIL, NIL) -> t.

for each terminal symbol t.

(3) Add a single production of the form:

N(n, n.u, u) -> NIL.

where n and u are variables.

(4) Replace the starting symbol S by N(S, NIL, NIL).
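Step (1) is entirely mechanical, and can be sketched as a small translator (illustrative Python, not part of the thesis) that emits the rewritten rule as text; symbols are plain strings and the variable names u0, u1, .. are generated in order.

```python
# Illustrative sketch: step (1) of the translation, applied
# mechanically to a rule  l1 l2..lm -> r1 r2..rn.

def to_context_free(lhs, rhs):
    # Left hand side: N(l1, u0, l2.l3..lm.u), with the remaining
    # lhs symbols pushed into the right-context parameter.
    left = f"N({lhs[0]}, u0, {'.'.join(lhs[1:] + ['u'])})"
    # Right hand side: thread u0, u1, .., u through the symbols.
    vars_ = ["u0"] + [f"u{i}" for i in range(1, len(rhs))] + ["u"]
    right = " ".join(f"N({r}, {vars_[i]}, {vars_[i+1]})"
                     for i, r in enumerate(rhs))
    return f"{left} -> {right}"

# The rule CB -> BC from the a^n b^n c^n grammar:
print(to_context_free(["C", "B"], ["B", "C"]))
# N(C, u0, B.u) -> N(B, u0, u1) N(C, u1, u)
```

Applied to each rule of the Chomsky-type grammar for a^n b^n c^n, this reproduces the rewritten grammar given at the end of this section.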

Proof


For a term N(a,b,c), consider a as the name of the production, c as the required context to the right of the string which this production produces (its right context), and b as the context which has been required by the productions to the left of this one.

Consider a production of the form A -> BC in which the production for B is of the form BD -> E, represented as:

N(A,u0,u) -> N(B,u0,u1) N(C,u1,u)
N(B,u0,D.u) -> N(E,u0,u).

Then the right context of B is D. This must be 'absorbed' by the left context of some production to the right of B. There are four possibilities:

(1) The right context is the next right production of the parent, A (in this case C). Then this matches with and is absorbed by the production: N(n,n.u,u) -> NIL.

(2) The right context is the leftmost descendent of the next right production (C). Then the right context is transmitted via the parameter u0 in a production: N(l,u0,u) -> N(r1,u0,u1) ...

(3) The right context occurs in some later right production of A or in some descendent of it. Then the right context will be transmitted by the parameters u1 to un-1 (of the production A) as long as no terminal productions intervene (as these have NIL for left and right context).

(4) The right context occurs to the right of some ancestor of A. Then the right context is transmitted via the parameter u. If the ancestor itself has a right context, this occurs after those right contexts. (As an example, consider the treatment of the right context D in the grammar


A -> BCD, BC -> E, ED -> f, which produces A -> BCD -> ED -> f.)

The above argument applies to single terminal or non-terminal symbols as right context; if the right context consists of more than one symbol, each can be transmitted in turn in the same way. It applies equally to clauses whose left hand side starts with a terminal or non-terminal - the only difference between them lies in the restricted production for terminal symbols. End Proof

In the grammar rephrased in this way there is only one non-terminal symbol, N, and the symbols used in the original grammar occur as parameters of it. This technique derives from a consistent method of rewriting grammars developed by Pereira (1980). As an example, the grammar presented (p.17) for a^n b^n c^n as a Chomsky-type grammar may be rewritten as:

N(S,u0,u) -> N(a,u0,u1) N(S,u1,u2) N(B,u2,u3) N(C,u3,u)
           | N(a,u0,u1) N(b,u1,u2) N(C,u2,u).
N(C,u0,B.u) -> N(B,u0,u1) N(C,u1,u).
N(b,u0,B.u) -> N(b,u0,u1) N(b,u1,u).
N(b,u0,C.u) -> N(b,u0,u1) N(c,u1,u).
N(c,u0,C.u) -> N(c,u0,u1) N(c,u1,u).
N(a,NIL,NIL) -> a.
N(b,NIL,NIL) -> b.
N(c,NIL,NIL) -> c.
N(u,u.v,v) -> NIL.

where u, v and the ui are the only variables.

This is a rather ungainly version of the grammar, which was rendered much more cleanly earlier, also using an M-grammar. However it is a grammar which may be parsed directly in a top-down left-to-right fashion.
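Such a grammar can be cross-checked by running the original rewriting rules directly. The following sketch (in Python rather than Prolog, and not part of the thesis) assumes the standard type-0 grammar for a^n b^n c^n - S -> aSBC | abC, CB -> BC, bB -> bb, bC -> bc, cC -> cc - and searches derivations breadth-first, pruning any sentential form longer than the target, which is safe because every rule is length non-decreasing:

```python
from collections import deque

# Standard Chomsky type-0 rules for a^n b^n c^n (n >= 1); an assumption,
# since the thesis cites its own version of the grammar on p.17.
RULES = [("S", "aSBC"), ("S", "abC"),
         ("CB", "BC"), ("bB", "bb"),
         ("bC", "bc"), ("cC", "cc")]

def derivable(target):
    """Breadth-first search over sentential forms starting from S.
    Forms longer than the target are pruned: every rule above has a
    right hand side at least as long as its left hand side."""
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in RULES:
            start = 0
            while True:
                i = form.find(lhs, start)
                if i < 0:
                    break
                new = form[:i] + rhs + form[i + len(lhs):]
                if len(new) <= len(target) and new not in seen:
                    seen.add(new)
                    queue.append(new)
                start = i + 1
    return False
```

Because the state space of bounded-length forms over a finite alphabet is finite, the search always terminates.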


Some further characteristics of M-grammars

The flexibility of M-grammars enables one to draw a clear distinction between the lexical syntax of a language and its syntax. A phrase, say "the slithy toves", may be considered as a string of characters, or a string of words. Using the infix 'dot' notation this may then be either:

t.h.e.' '.s.l.i.t.h.y.' '.t.o.v.e.s.NIL or

"the"."slithy"."toves".NIL where "the" is a shorthand notation for t.h.e.NIL.

The second example is then a string of strings, which is a convenient form in which to present the syntax of a programming language. For, given a rule of the form:

Assignment -> Variable ":=" Expression ";".

the input is assumed to be a string of function symbols and strings which has previously been 'tokenised' by a lexical analyser. A possible string parsed by this process is:

Identifier("day").":=".Integer(23).";".NIL

Here the strings stand for themselves, and any spaces, newlines, comments etc will be removed by the lexical analysis which produced this list. The function symbols in the list hold specific entities which are described by the lexical rules, such as identifiers and numbers. These are described in the syntax by a rule of the form:

Variable -> @Identifier(v).

where the symbol '@' signifies that the function following is a terminal symbol, and 'v' is the value of this particular identifier.

Examples of the description of lexical syntax will be given in chapter 4. However, writing a lexical syntax in BNF is a tedious process, basically because one must allow for every possible construct that may come after each token. For instance, after a number one might encounter a non-significant character, such as a space, or a significant one such as ')', but not a digit or, in most languages, a letter. This is easy to represent in a finite state diagram, but lengthy in BNF. The sensible way to write the lexical rule is:

Number -> Digit (Number | ~Alphameric).

The '~' sign stands for negation, which may easily be introduced into the logic formalism by introducing a new variable symbol and using logical negation. Thus a logic formulation of the above clause might be:

Number(u0,u) <- Digit(u0,u1) & Number(u1,u).
Number(u0,u) <- Digit(u0,u) & ~Alphameric(u,u2).

It is worth noting that the alternative formulation, in which all possibilities are enumerated, must include an empty production if the token can include the last character in the file. This can lead to ambiguous parses if the top level of the lexical syntax is written in the obvious way:

Tokenlist -> Space Tokenlist | Token Tokenlist | NIL.
Token -> Number | ... .

In this case, a string such as "1234" can be parsed as "12"."34" or "1"."234" as well as "1234". In practice, this problem is avoided by using the rules in order, selecting the first valid parse and using the 'cut' predicate. This does not address more general problems of ambiguity.

Given the formulation of the grammar rules in logic, it is easy to see that other conditions can be incorporated into the grammar. A grammar rule is a Prolog clause in which each predicate has two extra parameters. If these parameters represent the same value, or if they are omitted altogether (which amounts to the same thing), then the predicate can generate no symbols. Such extra conditions are extremely useful as they can perform extra checking, or data manipulation. Since their definition is by Prolog clauses, they can be of arbitrary complexity.
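The effect of the negated Number formulation above can be seen by transcribing its two clauses into a conventional language. In this Python sketch (an illustration, not part of the thesis), the difference-list arguments become string positions, and the function returns every position at which a Number beginning at position i may end:

```python
# Hypothetical helpers standing in for the Digit and Alphameric predicates.
def digit(s, i):
    # Digit(u0,u1): a digit character at position i
    return i < len(s) and s[i].isdigit()

def alphameric(s, i):
    # Alphameric(u,u2): a letter or digit at position i
    return i < len(s) and s[i].isalnum()

def number_ends(s, i):
    """All positions j such that Number holds on s[i:j]: a run of
    digits whose last digit is not followed by an alphameric."""
    ends = []
    if digit(s, i):
        j = i + 1
        # Second clause: Digit(u0,u) & ~Alphameric(u,u2)
        if not alphameric(s, j):
            ends.append(j)
        # First clause: Digit(u0,u1) & Number(u1,u)
        ends.extend(number_ends(s, j))
    return ends
```

The negation makes the parse unambiguous: "1234" yields the single end position 4, while "12a" yields no parse at all, since a letter may not follow a number.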

A grammar can also be made to produce output while parsing an input text. There are two ways in which this can be done. The most obvious is to use extra parameters for the non-terminals and build up a tree of output symbols. For instance, one might build up the output for a list of statements as follows:

Statementlist(s1.s2) -> Statement(s1) ";" Statementlist(s2).
Statement(Asgt(v,e)) -> Variable(v) ":=" Expression(e).

The list constructor and the function symbol Asgt form nodes of the tree which is built up as output.
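The same tree-building scheme can be mimicked in a conventional recursive-descent style. In this Python sketch (illustrative only; the single-statement base case and the flat token representation are assumptions not made in the text), a pair plays the part of the 'dot' list constructor and a tagged tuple the part of the function symbol Asgt:

```python
def statement(toks, i):
    # Statement(Asgt(v,e)) -> Variable(v) ":=" Expression(e)
    v = toks[i]
    assert toks[i + 1] == ":="
    e = toks[i + 2]
    return ("Asgt", v, e), i + 3   # output tree node and next position

def statementlist(toks, i):
    # Statementlist(s1.s2) -> Statement(s1) ";" Statementlist(s2)
    s1, j = statement(toks, i)
    if j < len(toks) and toks[j] == ";":
        s2, k = statementlist(toks, j + 1)
        return (s1, s2), k         # the pair is the 'dot' constructor
    return (s1, "NIL"), j          # assumed base case: last statement
```

Parsing the token list for "day := 23 ; x := 5" yields the nested pair structure that the grammar's output parameters would build.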

A disadvantage of this notation is that parameters used for output obscure the meaning of the grammar and may be confused with variables which form context-sensitive restrictions. It is therefore possible to deal systematically with output in a grammar in the same way as input is managed, by using two extra parameters in the logic representation. This may be termed a "correspondence grammar" as it represents the correspondence between input and output. The productions above may be represented as follows:

Statementlist -> Statement Statementlist.
Statement -> Variable>>v ":=" Expression>>e => Asgt(v,e).

In these productions, the symbol '>>' means 'outputs' and '=>' is used for the output of the whole production. The clauses may be represented in logic as follows:

Statementlist(u0,u,x0,x) <- Statement(u0,u1,x0,x1) & Statementlist(u1,u,x1,x).

Statement(u0,u,Asgt(v,e).x,x) <- Variable(u0,":=".u1,v.x,x) & Expression(u1,u,e.x,x).

Obviously this notation is more suited to a linear output than a tree form, though it is capable of doing both. However there are difficulties in applying it in some instances and it has not therefore been used in the language examples in chapter 4.

A Syntax for M-grammars

In the preceding discussion a rather informal syntax for M-grammars has been used which corresponds to that used in the literature. In order to enable grammars to be handled automatically by program it is necessary to have a precise and versatile machine-readable syntax. Because M-grammars have been implemented as parts of Prolog systems, their representations have followed the dictates of these systems. The Marseilles notation (see Colmerauer 1978) consists of a sequence of almost independent terms, while Edinburgh Prolog uses a notation close to W-grammars (with ',' for 'followed by' and ';' for 'alternatively'). The syntax suggested below

follows broadly Wirth's (1977) suggestions, with the exception that an arrow (->) is used instead of '=' for 'produces'.

The following symbols are used:

->   produces
;    followed by (optional)
|    alternatively
" "  enclose terminal strings of characters. Thus "AB" is equivalent to A.B.NIL.
@    precedes a terminal symbol which is a function
&    introduces conditions which do not produce terminal strings
~    indicates the absence of the specified symbol after that point (or negation within conditions)
( )  is used for grouping (as well as parameters)
[ ]  enclose optional parts of a production
{ }  enclose sections repeated 0 or more times

Function and constant symbols are written with an initial capital letter and variables with lower case. Single quotes may be used to enclose constant symbols which do not follow the normal rules for identifiers, such as space.

The syntax may be indicated by giving the operator definitions for Prolog:

Op('->', 10, RL).
Op('|', 20, RL).
Op(';', 30, RL).
Op('&', 30, RL).
Op('~', 40, PREFIX).
Op('@', 40, PREFIX).

where the second number stands for the precedence of the operator (higher is more binding) and the third parameter gives the type of the operator - RL means Right to Left binding.

Note that the sequence operator (';') is considered optional. It is mainly included to be compatible with Prolog's 'operator precedence' system. In the later sections of the thesis it is often omitted for syntax sections, although still used in the semantics parts.

Chapter 2.2

The Development of Syntax Descriptions

"Alice felt dreadfully puzzled. The Hatter's remark seemed to her to have no sort of meaning in it, yet it was certainly English" Lewis Carroll: Alice in Wonderland.

The classical approaches to syntax description are those of Chomsky and Backus. When these apply to context-free or regular grammars they are almost identical, and BNF (Backus Naur Form, or Backus Normal Form) has been the only widely accepted formalism for describing the syntax of programming languages. Chomsky's phrase structure grammars have been much less successful in describing the context-sensitive parts of real languages, and their main use has been for language theorists.

The Results of Language Theory

Language theory has advanced using two concepts which have developed in parallel - generators and recognizers. Much of the practical benefit of this theory has come from the equivalences between the two that have been proved for different classes of language. The four main classes remain those defined by Chomsky by restricting the sets of rewriting rules a -> b, where a and b may contain any sequence of terminals or non-terminals. These are:

Type 0: Unrestricted.
Type 1: Context-sensitive. Each right hand side must be at least as long as the left hand side; hence 'empty' productions are not allowed.
Type 2: Context-free. Each left hand side can only contain a single non-terminal and the right hand side is not empty.
Type 3: Regular. As for context-free, with the additional constraint that each right hand side must start with a terminal symbol.

Corresponding to these categories are the recognizers, which are normally classed as 'automata'. It may be shown (e.g. see Hopcroft & Ullman 1969) that the following equivalences hold:

A language is:
Unrestricted iff it is defined by a two-way unbounded automaton (a Turing machine).
Context-sensitive iff it is defined by a two-way linear bounded automaton.
Context-free iff it is defined by a one-way non-deterministic pushdown automaton.
Regular iff it is defined by a one-way deterministic finite automaton.

These equivalences are important for practical language processing, as one wants guaranteed time and space bounds for the various parts of a system. The similarity of automata to basic M-grammars may be obvious: if we consider the two 'extra parameters' of the M-grammar as a read-only tape which may be traversed in one or two directions, then we may assert (without proof) that:

a regular grammar may be parsed without backtracking and with no function symbols being used in the grammar;

a context-free grammar may be parsed either using no function symbols and backtracking (an implicit stack), or without backtracking and using an explicit stack;

context-sensitive and unrestricted grammars will in general use arbitrary function symbols.
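The first two assertions can be illustrated directly. The Python sketch below (not from the thesis; the example grammars are assumptions chosen for brevity) recognizes a regular identifier language with a finite automaton scanning once left to right, and a context-free language of balanced parentheses with an explicit stack:

```python
def regular_accepts(s):
    """DFA for the regular grammar Identifier -> Letter {Letter | Digit}:
    one left-to-right scan, no backtracking, no auxiliary storage."""
    state = 0                      # 0 = start, 1 = seen initial letter
    for ch in s:
        if state == 0 and ch.isalpha():
            state = 1
        elif state == 1 and ch.isalnum():
            pass                   # remain in the accepting state
        else:
            return False
    return state == 1

def contextfree_accepts(s):
    """Pushdown recognizer for balanced parentheses: the explicit stack
    replaces the backtracking an implicit-stack parse would need."""
    stack = []
    for ch in s:
        if ch == '(':
            stack.append(ch)
        elif ch == ')':
            if not stack:
                return False
            stack.pop()
        else:
            return False
    return not stack
```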

The subclasses of M-grammars have not been deeply explored yet, though Warren (1975) proposed the use of the Earley parsing method as the basis for a Prolog deduction system.

The Vienna Definition Language

In 1962 McCarthy introduced an attitude to the context-sensitive parts of grammars which flowered in the Vienna Definition Language (Lucas, Walk 1969) and has continued through to denotational semantics. This treats the context-free parts as "the syntax" and all other parts as "semantics", which partly explains the attitude of various members of this school that "syntax is uninteresting, semantics is everything".

McCarthy used an "analytic" approach to syntax which contrasts with the "synthetic" approach using rewriting rules. His formulation is based on the use of recursively defined predicates and is 'abstract' in that it defines only the essence of a language, without defining the written representation. Thus, for instance, an assignment statement consists "essentially" of a left part and a right part; conditions attached to this are that the left part is a variable and the right part a term. McCarthy writes this as:

isasgt(t) = isvar(leftpart(t)) & isterm(rightpart(t)).

where leftpart(t) and rightpart(t) are functions which decompose the statement t into its constituent parts. He also suggests a synthetic form which is closer to the abstract syntax used in M-grammars, using constructive functions


(e.g. mkasgt). If this is expressed in the Prolog notation it appears as follows:

mkasgt(leftpart(s),rightpart(s)) = s <- isasgt(s).
isasgt(mkasgt(s,t)).
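These definitions can be transcribed almost literally into a modern language. In this Python sketch (the tuple representation of a statement, and the particular isvar and isterm tests, are assumptions made purely for illustration), mkasgt is the constructor, leftpart and rightpart the selectors, and the synthetic identity mkasgt(leftpart(s), rightpart(s)) = s holds for any assignment s:

```python
def mkasgt(l, r):
    # synthetic (constructive) form: build an assignment term
    return ("asgt", l, r)

def leftpart(t):
    return t[1]      # selector: the variable assigned to

def rightpart(t):
    return t[2]      # selector: the term assigned

def isvar(x):
    # assumed representation: a variable is an alphabetic string
    return isinstance(x, str) and x.isalpha()

def isterm(x):
    # assumed representation: a term is a string or a number
    return isinstance(x, (str, int))

def isasgt(t):
    # analytic form: isasgt(t) = isvar(leftpart(t)) & isterm(rightpart(t))
    return (isinstance(t, tuple) and t[0] == "asgt"
            and isvar(leftpart(t)) and isterm(rightpart(t)))
```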

The Vienna Definition Language (Lucas, Walk, '69) took over the analytic form, using a slightly different syntax. For instance, the assignment statement is represented:

is-asgt = (&lt;left-part: is-var&gt;, &lt;right-part: is-term&gt;).

Here, left-part and right-part are selectors and is-var and is-term are predicates which define the subparts. Thus if x is an assignment statement (for which is-asgt(x) is true), then is-var (left-part(x)) is true.

As the aim of VDL was to produce a complete definition of a programming language (the main object of interest being PL/1), the researchers defined, in addition to the abstract syntax, a "concrete" syntax. Thus the definition of an assignment is split into two parts, is-c-asgt and is-abs-asgt, which may be written

is-c-asgt = (&lt;s1: ...&gt;, &lt;s2: ...&gt;, &lt;s3: is-c-exp&gt;)
is-abs-asgt = (&lt;target: ...&gt;, &lt;source: ...&gt;)

Obviously the concrete syntax is simply a (somewhat clumsy) restatement of BNF with annotations. There is no direct relationship between the concrete and abstract syntaxes. The relationship is defined by a program, the Translator, which is expressed in a functional language using McCarthy's conditional, and which checks the context-sensitive restrictions. Following Marcotty et al (1976) the statements for the assignment (which we will not describe in detail) may be written:

trans-asgt-stm(t) = [valid-mode-for-assignment(t) -> translate-assignment(t);
                     true -> error]
translate-assignment(t) = [ref-chain-lengths(s1(t)) = 1 ->
                             mu0(&lt;target: make-id(s1(t))&gt;, &lt;source: ...&gt;);
                           true -> mu0(&lt;target: ...&gt;, &lt;source: ...&gt;)]

Thus to describe the assignment statement including the context-sensitive parts requires four parts, containing a total of six statements. These are distributed over three separate texts which must be cross referenced for use. This example goes a long way towards explaining why the PL/1 definition (the "Vienna telephone directory") is such a voluminous document and difficult to comprehend.

It should be noted that the point at which the context-sensitive restrictions are applied varies between different authors. For instance Lauer (1968) uses only an abstract syntax, with an interpreter which defines both the context-sensitive restrictions and the semantics.


W-Grammars

The approach of van Wijngaarden contrasts strongly with that of McCarthy. Whereas one might say that McCarthy wished to make everything into semantics, it may equally be said that van Wijngaarden wished to make everything into syntax. The chief and definitive example of W-grammars is in the definition of Algol 68, particularly in the revised report (van Wijngaarden et al 1975).

Context-sensitive restrictions are introduced into a grammar by means of two levels of productions, called respectively metarules and production rules. A third class of rules - the hyper-rule - is in some sense a mixture of the two types of rule and is therefore not fundamental, although this is the form of most rules in the Algol 68 report. As an example let us take the rule for assignment (or 'assignation' in Algol-68 jargon):

REF to MODE NEST assignation:
  REF to MODE NEST destination, becomes token, MODE NEST source.

The syntax of a hyper-rule is that ':' stands for 'produces', ',' stands for 'followed by', and ';' for 'otherwise'. Words in capitals stand for Metanotions, which are ultimately defined by metarules. They perform the same function in W-grammars that variables play in M-grammars. Words in small letters are part of all the production rules that are derived from these rules. Metaproduction rules are written in much the same format, except with a double colon to distinguish them. Production rules are used as follows:

(1) The metanotions (in large letters) are produced using the metaproduction rules until one is left with only small letters. If one metanotion occurs more than once in a rule, it must be produced identically in all places. (Metanotions may be followed by digits to distinguish separate uses within one production.)

(2) The resultant production rule is produced in a normal way similarly to BNF.

In the clause above there are actually three metanotions (REF, MODE, NEST), although this is not altogether obvious, as spaces are irrelevant in rules. Thus, if there were a metarule for "MODENEST" somewhere, it would be valid to use that. 'REF' is the Algol 68 term for a pointer, MODE for a type, and NEST stands for all the names declared at this point of the program.

Let us follow through the expansion of this clause for the case of the assignment of an integer to an integer variable; for brevity we will omit the 'NEST' productions, and assume that only a single identifier 'a' is declared anywhere in the program. The relevant metarules are

REF :: reference; transient reference.
MODE :: PLAIN; STOWED; REF to MODE; ...
PLAIN :: INTREAL; BOOLEAN; character.
INTREAL :: SIZETY integral; SIZETY real.
SIZETY :: long LONGSETY; short SHORTSETY; EMPTY.
EMPTY :: .

Choosing the appropriate rules, we generate

REF -> reference
MODE -> integral
NEST -> new reference to integral letter a symbol

The rule then becomes the following

reference to integral new reference to integral letter a symbol assignation:
  reference to integral new reference to integral letter a symbol destination,
  becomes token,
  integral new reference to integral letter a symbol source.

This may then be put aside for use as a normal production rule. It is obvious that, as many of the metarules can be produced indefinitely, the potential number of production rules is infinite. Thus the essential limitation of CFG rules (that an infinite number would be required to express context-sensitive grammars) is overcome.

The Algol-68 report demonstrates that W-grammars are capable of defining precisely the syntax of major and complex languages. Unfortunately they have not received wide acceptance in computing circles, and the method of definition was certainly partly responsible for the slow acceptance of the language. In fact most implementations of Algol-68 have returned to the use of BNF in their manuals (e.g. the Algol 68R Manual).

A detailed comparison of W-grammars and M-grammars is given in (Moss 1979) and will not be repeated here. But it is worth summarising some of the specific difficulties with the use of W-grammars.

(1) The method of two level productions is combinatorially explosive and not feasible for practical use. Some modified method of using metaproductions directly, rather than to generate intermediate level rules, is presumably used by most (human) users.

(2) Partly because of the above, no means of mechanically analysing W-grammars, or languages based on them, has been successful. Their use has been mostly limited to exposition.

(3) Unlike the functional notation used in mathematics, there is no clear distinction between the 'name' of a production and its 'parameters', and it is frequently difficult to see where one parameter starts and the other leaves off. Thus "REF to MODE NEST assignation" would better be written "assignation(REF to MODE, NEST)", where one might see that it has 2 and not 3 parameters. An ancestor of this hypernotion is "MODE FORM", and it is impossible to guess which is the name and which the parameter.

(4) There is inadequate distinction between productions which produce strings and those which are there simply as conditions. The only distinction is that one produces a notion ending with the word 'symbol' and the other yields an empty terminal production, whereas unsuccessful productions of either kind end in "blind alleys". In the revised report, conditions are normally indicated by prefacing them with the word "where" or "unless", which aids comprehension considerably (although the corresponding hyper-rule may start with "WHETHER", which can produce either).

(5) There is no easy or general way of simplifying a W-grammar to a context-free grammar (by leaving out scope rules). It is of course possible to write a grammar in such a way that this is possible, but this was not done for Algol-68 - presumably the generality of the method invited short cuts which were hard to resist.

It is interesting to note that difficulties (1) and (2) are exactly the same objections that were raised against the use of logic by many in computing during the '60's (i.e. before the discovery of the Resolution principle (Robinson 1965) and its exploitation in a restricted form (Kowalski 1974)). As Robinson (1979) points out concerning early theorem provers, "a proof would eventually be found by the computer - but only after running through combinations of instantiations that might be as many as 10^10^10^10. The 'combinatorial explosions' caused by these early experimenters echoed through the corridors of computer centres and singed the eyebrows of the intrepid pioneers of 'mechanical theorem proving' several times".

An early implementation of W-grammars by de Chastellier and Colmerauer (1969) took 90 secs on a CDC 6400 to produce parses of a 30 symbol string, using up to 300 rules. A more systematic examination of the problem of parsing using W-grammars by Watt (1974) came to the conclusion that they are not suitable for the automatic construction of parsers.

Since both W-grammars and M-grammars have the power of Chomsky type-0 grammars, it is obvious that they are in some sense equivalent. To define a translation procedure from WG's to MG's is somewhat more difficult, for reasons anyone who has looked closely at the Algol-68 report may realise.

(1) We will only attempt to define a translation method for "sensibly written" WG's. Thus we will exclude (as does the Algol 68 report, para 1133d) any metanotions which may be concatenated to form other metanotions; e.g. if MOID and FORM are metanotions, then MOIDFORM may not be a metanotion.

(2) We rely heavily on the idea of "abstraction" (report para 1142b) which produces hypernotions (called paranotions) which do not occur as such in production trees.

(3) We assume the existence of cross references between the definition and use of hypernotions (Report 1134f).


This is a necessary preliminary which is not always easy to deduce.

The translation procedure (of which an example is given below) is as follows:

(1) Group together the metanotions in every hyperrule by bracketing groups of symbols that appear as complete entities on either the left or right hand side of a metaproduction rule, or some metaderivative of these.

(2) Expand the metanotions occurring in the hyperrules, possibly creating new rules, until there is an identifiable common fragment between the definition of each hypernotion and all its uses. In cases where the cross reference does not include the complete hypernotion, there may be more than one such fragment. This fragment forms the 'non-terminal' symbol of the MG. In cases where this fragment is a metanotion, an abstraction may be used instead.

(3) For each non-terminal symbol 's' formed in (2), a number of parameters is chosen depending on the bracketing introduced in (1) in all the hypernotions. Metaproductions may be invoked to expand metanotions to ensure uniformity.

(4) Each hypernotion is now rearranged into the form of function symbol and parameters. Where a parameter consists of a single metanotion this is replaced by a variable, consistently in each hyperrule. Where there is more than one, appropriate function symbols are introduced (which may include the list notation) so that a single term results. Where there is more than one non-terminal symbol in the hypernotion, one must be a parameter of the other. The necessary syntax changes (':' to '->', ',' to ';', ';' to '|') are also made. Where the parameter includes protonotions, these may be represented as constant symbols or function terms.

(5) Any non-terminals all of whose terminal productions are empty may be replaced by conditions, whose productions are expressed as logic clauses rather than grammar rules.

(6) The M-grammar consists of the transformed hyperrules.

As an example of this (somewhat loose) translation procedure let us consider the introduction and definition of the serial clause in the Algol 68 definition (paras 311a and 321a). These demonstrate most of the possibilities above. They are written as they occur in the revised report, including cross references.

311a) SOID NEST closed clause {22a, 551a, A341h, A349a}:
  SOID NEST serial clause defining LAYER {32a} PACK.
321a) SOID NEST serial clause defining new PROPSETY {31a, 34f,l, 35h}:
  SOID NEST new PROPSETY series with PROPSETY {b}.

We first group together the metanotions and their productions using various metarules:

(SOID) (NEST) closed clause:
  (SOID) (NEST) serial clause defining (LAYER) (PACK).
(SOID) (NEST) serial clause defining (new PROPSETY):
  (SOID) (NEST new PROPSETY) series with (PROPSETY).

If we now try to produce the result of the first clause using the second, we find that (LAYER) (PACK) cannot produce (new PROPSETY). In fact, there are two primary fragments in the right hand side of the first rule. If we invoke the metarule:

46 The Development of Syntax Descriptions

31B) PACK :: STYLE pack.

we discover that another hyperrule intervenes between these two rules:

133d) NOTETY STYLE pack: STYLE begin token, NOTETY, STYLE end token.

This rule corresponds to putting "begin...end" or "(...)" round the serial clause (or any non-terminal sequence) depending on which STYLE is chosen. So we must invoke the metarule (31B) before transforming the clause. We may now identify the common fragments in the clauses, which will form the non-terminal symbols (these are underlined):

(SOID) (NEST) closed clause:
  (SOID) (NEST) serial clause defining (LAYER) (STYLE) pack.
(SOID) (NEST) serial clause defining (new PROPSETY):
  (SOID) (NEST new PROPSETY) series with (PROPSETY).

We now choose the number of parameters for each non-terminal. For reasons of compatibility with the rest of the grammar, we expand the metaproduction SOID into SORT MOID {31A}. SORT expresses the amount of coercion to be applied to the resultant value, and MOID is its mode or type. Writing variables in upper case, and constants in lower case, the clauses now take on their final MG form.

closed clause(SORT, MOID, NEST) ->
  pack(STYLE, serial clause defining(SORT, MOID, LAYER, NEST)).
serial clause defining(SORT, MOID, new(PROPSETY), NEST) ->
  series with(SORT, MOID, PROPSETY, new(PROPSETY, NEST)).

Comments:

1. The use of 'pack' in the clause above demonstrates the need for non-terminals as parameters of other non-terminals. As mentioned in ch. 2 this facility is not provided in some versions of MG's (e.g. DCG's).

2. In a small number of cases the translation of metanotions as variables means that restrictions implicit in the metarules are not applied to the variables. In this case an extra constraint must be added to the lowest level of rule in which this metanotion is used, and expressed by means of logic clauses which parallel the metaproduction rules.

Attribute Grammars

The idea of adding attributes to each non-terminal symbol of a context-free grammar dates back to Irons (1961). But the definition of attribute grammars is generally credited to Knuth (1968), who distinguished for the first time the ideas of inherited and synthesised attributes. The value of an inherited attribute is derived from the surrounding productions, whereas that of a synthesised attribute is produced by the production itself. It is worth quoting one of his examples, giving "the most natural" definition of binary numbers, which shows the potential complexity of the evaluation rules arising from a grammar. It is defined in terms of syntactic (context-free) rules and semantic (functional) rules. There are three functions which take as arguments the value represented by the corresponding production rule. Subscripts in the production rules are only used to distinguish multiple occurrences.

48 The Development of Syntax Descriptions

v(P) is the "value" of production P - a rational number
s(P) is the "scale" of production P - an integer
l(P) is the "length" of production P - a natural number

Syntactic rules    Semantic rules
N -> L             v(N) = v(L), s(L) = 0
N -> L1 . L2       v(N) = v(L1) + v(L2), s(L1) = 0, s(L2) = -l(L2)
L -> B             v(L) = v(B), s(B) = s(L), l(L) = 1
L1 -> L2 B         v(L1) = v(L2) + v(B), s(B) = s(L1), s(L2) = s(L1) + 1, l(L1) = l(L2) + 1
B -> 0             v(B) = 0
B -> 1             v(B) = 2 ** s(B)

Though the representation of the semantic rules has been considerably improved since, this does show clearly the underlying mathematical functions (subscripts simply distinguish different instances of one symbol). The dependencies are demonstrated by considering the parse of the binary number "110.01". The values of the functions in the top-level production (N -> L1 . L2) are as follows:

s(L1) = 0      l(L1) = 3      v(L1) = 6
s(L2) = -2     l(L2) = 2      v(L2) = 0.25
                              v(N) = 6.25

The value of leading 1's (set in the production B -> 1) depends on the scale of B, which depends in turn on the scale of L1 in the top-level production, which is not set until the radix point is reached. To the right of the radix point, the situation is even more complex. Considered with respect to the parse tree, the length attributes must be evaluated from the bottom up, before the 'scale' attributes can be evaluated from the top down, and finally the value attributes from the bottom up.

Represented in M-grammars, the value can be evaluated in a single left-to-right pass, assuming that the functions are not evaluated and ignoring backtracking and left-recursion. The grammar for this is:

N(v) -> L(v,l,0).
N(v1+v2) -> L(v1,l1,0); "."; L(v2,l2,-l2).
L(v,1,s) -> B(v,s).
L(v1+v2,l+1,s) -> L(v1,l,s+1); B(v2,s).
B(0,s) -> "0".
B(2**s,s) -> "1".
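The single left-to-right pass works precisely because the arithmetic terms are left unevaluated until the parse is complete. The following sketch (ours; the parser structure and names are invented, with closures standing in for Prolog's unevaluated terms) imitates this:

```python
# A one-pass, left-to-right reading of "110.01", delaying each B(2**s,s)
# term as a closure until the radix point has fixed the scales.

def parse_bits(text, i):
    """L(v,l,s): consume bits left to right, returning delayed bit
    values (closures awaiting their scale) and the next position."""
    thunks = []
    while i < len(text) and text[i] in "01":
        thunks.append(lambda s, b=text[i]: 0 if b == "0" else 2 ** s)
        i += 1
    return thunks, i

def parse_number(text):
    left, i = parse_bits(text, 0)
    right = []
    if i < len(text) and text[i] == ".":
        right, i = parse_bits(text, i + 1)
    # Only now, after the single pass, are the delayed terms evaluated:
    # scales run l-1..0 before the point and -1..-l after it.
    total = sum(t(s) for t, s in zip(left, range(len(left) - 1, -1, -1)))
    total += sum(t(-s) for t, s in zip(right, range(1, len(right) + 1)))
    return total

print(parse_number("110.01"))
```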

Knuth pinpointed one valuable criterion in the analysis of attribute grammars. A grammar can only be well-defined if its attributes contain no circularities; i.e. the value of an attribute must not depend on its own value. This reduces to the problem of whether the directed graph formed, in any syntax tree, by the dependency links between inherited and synthesised attributes contains an oriented cycle. The class of all dependencies can be generated by considering the graphical fragments formed by each production. Dependencies are added by considering all possible matching rules and the process continued until no further links can be added. Since only a finite number of links are possible this process must terminate eventually, although, as Jazayeri et al (1975) point out, it may require exponential time. If the resultant subgraphs do not contain any cycles then no cycles are possible.
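The core step of the circularity test, once the dependency links have been accumulated, is an ordinary search for an oriented cycle. A minimal sketch (ours, not Knuth's full algorithm; the attribute names are taken from the binary-number example):

```python
# Test whether the directed graph of attribute dependencies
# contains an oriented cycle, by depth-first search.

def has_cycle(edges):
    """edges: dict mapping an attribute to the attributes it depends on."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in edges}

    def visit(n):
        colour[n] = GREY
        for m in edges.get(n, ()):
            if colour.get(m, WHITE) == GREY:
                return True                      # back edge: a cycle
            if colour.get(m, WHITE) == WHITE and m in edges and visit(m):
                return True
        colour[n] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in edges)

# v depends on s, s on l, l on nothing: well-defined, no cycle.
print(has_cycle({"v": ["s"], "s": ["l"], "l": []}))   # False
print(has_cycle({"a": ["b"], "b": ["a"]}))            # True
```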

It is interesting to note, in the context of Prolog, that Knuth's algorithm corresponds to a determination of whether the "occurs check" is necessary in a given program. The occurs check is part of the unification algorithm which is designed to prevent the binding of a variable to a term which contains an instance of itself. To produce the corresponding graph in this case, some bidirectional links must be introduced where a variable occurs as the parameter of a predicate in more than one place. If the algorithm does not introduce cycles then the occurs check is unnecessary. The check, which is generally expensive, is normally omitted in working systems.
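What the occurs check prevents can be seen in a small unifier. The sketch below is ours (the term representation - uppercase strings for variables, tuples for compound terms - is an invented convenience, not the thesis's notation):

```python
# A miniature unification algorithm with an optional occurs check.

def walk(t, subst):
    while isinstance(t, str) and t[0].isupper() and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])

def unify(a, b, subst, occurs_check=True):
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    for v, t in ((a, b), (b, a)):
        if isinstance(v, str) and v[0].isupper():
            if occurs_check and occurs(v, t, subst):
                return None            # X = f(X): rejected
            return {**subst, v: t}
    if isinstance(a, tuple) and isinstance(b, tuple) \
            and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

print(unify("X", ("f", "X"), {}))                      # occurs check fires
print(unify("X", ("f", "X"), {}, occurs_check=False))  # circular binding
```

With the check omitted, as in the working systems mentioned above, the second call quietly builds the circular binding X = f(X).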

Closely related to Knuth's grammars are Koster's Affix grammars (Koster 1971a). These incorporate the notions of generation and consistent substitution from W-grammars, but their main development was still geared towards efficient parsing; they formed the basis of the widely used CDL system (Koster 1971b) which produces a top-down parser for any LL(1) grammar. A grammar rule is written in a somewhat similar fashion to W-grammars, but with values of "affix variables" following the symbol '+' after the name of a non-terminal, e.g.

assignment + env:
     identifier + env + mode1, ':=',
     expression + env + mode2,
     checkref + mode1 + mode2.

Each affix is either a value in some specified domain or a variable which ranges over that domain. There are strict rules concerning the definition of variables: the leftmost occurrence of a variable is in general the defining occurrence and each subsequent use is an application of that value (excepting a derived instance on the left hand side, which is defined on the right hand side). Each variable must be defined exactly once. In addition to non-terminals there are 'primitive predicates'. These have affixes as well but are defined by functions; they must map their inherited affixes to their derived affixes. A grammar satisfying these conditions is named "well-formed".


These, with two other conditions related to Koster's use of top-down parsing, form the basis of CDL. Watt (1977) defines a somewhat more general algorithm for bottom-up parsing.

Inherited (i.e. input) and derived (output) instances of variables are distinguished in Koster's grammars, but the intention is only to ensure that for each non-terminal (or primitive predicate) expression these can be uniquely determined. Affixes for parsing purposes are generated by context-free rules (similar to the metarules of W-grammars) which are subsequently substituted in the productions of the affix grammar. The concept of input-output as it is understood in resolution terms is not present.

The question of the evaluation of attributes was taken up by Bochmann (1976). Several language features - such as the scope of labels and the declaration of mutually recursive procedures - require more than one pass over the source text to evaluate all the attributes of a grammar.

Bochmann gives a test by which it can be determined whether or not the semantic evaluation can be performed in one pass. This asserts that the 'dependency set' (the attributes on which values depend) must only include attributes of symbols to the left of the symbol. This is very close to Koster's test for well-formedness. However, he goes further in determining the number of passes necessary to evaluate all the semantic attributes: on each pass one removes the attributes that can be evaluated, and this continues until either none remain or no more are removed. This test obviously subsumes Knuth's circularity test, though it also catches non-circular definitions which cannot be evaluated in a fixed number of passes. Bochmann gives the following example, which describes one possible definition of block structure in an Algol-like programming language:

<...> ::= <...>↓empty-table
<...>↓used1 ::= <...>↓used2↓used1↑update  & condition: used2 = update
<...>↓used1↓used2↑update ::= <...>↓used1↓used2↑update1 <...>↓used1↓update1↑update
<...>↓used1↓used2↑update ::= <...>↓used1↓used2↑update
<...>↓used1↓used2↑update ::= <...>↑declare concatenate ↓used2↓declare↑update
<...>↓used1↓used2↑used2 ::= <...>↓used1
<...>↓used1↓used2↑used2 ::= begin <...>↓used1 end

(The non-terminal names, written between angle brackets, have been lost in this copy and are shown as <...>.)

This example shows Bochmann's notation for attribute grammars (used in Marcotty et al 1976). Inherited attributes are shown following the symbol "↓". Synthesised attributes follow the symbol "↑". The syntax is based around BNF with two other additions: in the second production a condition is given which expresses the context-sensitive conditions. In the fifth is demonstrated an "action symbol", concatenate, which is not defined within attribute grammars, and was first suggested by Lewis, Rosenkrantz and Stearns (1974).

In this grammar (for which lower levels are omitted) there are two types of attribute. Those named 'used' and 'update' represent a symbol table, and 'declare' represents a single item in that table. The grammar is one in which statements and declarations may be mixed indiscriminately, and the problem arises because of the nested block structure (in the last production). The number of passes required for this grammar is one more than the depth of blocks in the program. The attributes can be simply changed to give a definition requiring only two passes.
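Bochmann's pass-counting procedure - sweep repeatedly, removing the attributes that can now be evaluated - can be sketched as follows. The sketch is ours and simplified: a real test also admits attributes evaluable earlier in the same left-to-right pass, which is ignored here.

```python
# Count the evaluation passes a set of attribute dependencies needs:
# one pass per sweep that removes every currently-evaluable attribute.

def count_passes(deps):
    """deps: attribute -> set of attributes whose values it needs."""
    evaluated, passes = set(), 0
    remaining = dict(deps)
    while remaining:
        ready = {a for a, d in remaining.items() if d <= evaluated}
        if not ready:
            return None          # circular: no fixed number of passes
        passes += 1
        evaluated |= ready
        for a in ready:
            del remaining[a]
    return passes

# Binary-number example: l needs nothing, s needs l, v needs s.
print(count_passes({"l": set(), "s": {"l"}, "v": {"s"}}))   # 3
print(count_passes({"a": {"b"}, "b": {"a"}}))               # None
```

The `None` branch is where the procedure subsumes Knuth's circularity test: a dependency cycle leaves a sweep with nothing to remove.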

Translation Grammars

The aim of any compiling system is to produce output corresponding to the text parse. Two approaches have been suggested for attribute grammars.

(1) Include one or more synthesised attributes with each symbol, so that ultimately the translation is expressed by a single term at the root of the syntax tree. This is envisaged by Knuth (1968) but has not been widely used by proponents of attribute grammars. There are probably two reasons for this: (a) it involves storing the whole tree in memory, which is often not desirable for practical systems; (b) the mathematical functions used to compose the output generally have a distinctly algorithmic flavour, and the languages in common use do not provide for the treatment of functions as data structures.

(2) The alternative is to use a second grammar to output the result. This is generally termed a 'Syntax Directed Translation Scheme' and has been used by many systems since Irons (see for instance Aho, Ullman 1972). The output grammar is frequently reverse Polish or a similar easy-to-use form. The concept was incorporated into what Lewis, Rosenkrantz and Stearns (1974) call 'Attributed Translations'. In these the terminal symbols of a single grammar are divided into two classes named 'input symbols' and 'action symbols'. The input grammar may then be abstracted by deleting the action symbols. The 'action symbols' can be implemented as procedures which perform output. They show that the standard pushdown machines used to recognize context-free grammars can be generalised to treat these translation schemes without sacrificing the desirable property of linear time.

In Marcotty et al (1976) action symbols are used in a slightly different way and are roughly equivalent to Koster's primitive predicates. They implement the semantic rules to express the semantics of a language considered as an interpreter. They have also been taken up by Watt and Madsen (1977) to form "Extended Attribute Translation Grammars". Here two entirely separate grammars are used, connected only by the non-terminal symbols which are common to both and the attributes of the input grammar which the output grammar can use.

Extended Attribute Grammars

Attribute grammars, while being highly suitable for use in parsing, are not ideal for the description of languages. It is to facilitate this that Watt and Madsen (1977) defined Extended Attribute Grammars. The differences from Attribute Grammars are:

(1) Attribute positions may be occupied by expressions, not just variables or values.

(2) A variable may be defined in more than one place, so that there is an implicit check on the equality of each instance.

To illustrate the difference, one may note the definition of an assignment statement using Bochmann's notation for attribute grammars and Watt and Madsen's notation for extended attribute grammars.


Attribute Grammar

assignment↓ENV : identifier↓ENV↑MODE1, ":=",
                 expression↓ENV↑MODE2,
                 where MODE1 = ref(MODE2).

Extended Attribute Grammar

assignment↓ENV : identifier↓ENV↑ref(MODE), ":=",
                 expression↓ENV↑MODE.

Both grammars express the notion that the mode of the left hand side must be the same as that of the right hand side with an added "ref", but the first has to introduce an extra variable and a condition in order to do this. In the second example the variable MODE is evaluated in two places. Also the function "ref" is indicated as the result of an evaluation.

Watt and Madsen discuss the conditions under which an extended attribute grammar may be easily parsed, by showing how an extended attribute grammar can be translated back into an attribute grammar. Most of the transformations are illustrated by the comparison above, but they produce one significant condition: every function (such as ref above) may need to be translated to its inverse if, in order to check a context condition, it is necessary to get at its domain. This follows from the definition of these as mathematical functions which are assumed to be immediately evaluated. They thus deduce two rules for the well-formedness of EAGs:

(1) Every variable occurs in at least one defining position in each rule in which it is used.

(2) Every function used in the composition of an attribute expression in a defining position has a (partial) inverse function.

They comment that "These conditions do not seem to be too restrictive in practice".

Having surveyed the development of attribute grammars it is appropriate to evaluate the similarities and differences between them and M-grammars.

Attribute grammars have evolved to a point at which the differences between EAGs and M-grammars are minimal. The most obvious difference is that the input-output pattern of attribute grammars is indicated, whereas in M-grammars this is not necessary. In fact Watt and Madsen (1977) observe that "the distinction between inherited and synthesised attributes makes no difference to the language generated by an EAG. Nevertheless we believe that this distinction makes a language definition easier to understand". This leads us to ask what is the status of the input-output patterns.

The answer may be found in Kowalski's (1979) distinction between logic and control. The input-output patterns may be viewed as annotations (Schwartz 1977) which provide information for the implementer. It is in this sense that they have been used in Prolog implementations (see Clark, McCabe 1979). In attribute grammars they serve several functions: (a) They ensure that every attribute is defined. (b) They ensure that an attribute is defined before it is used. In the case of a one-pass system this may be checked lexically. For multi-pass systems, Bochmann's analysis indicates to which pass the evaluation must be assigned. (c) They indicate the way in which an expression is evaluated in its context.


In a Prolog system the use of resolution provides a symmetry and flexibility in matching input and output which is not found in other languages. It gives a meaning to input and output that is not present in the van Wijngaarden method of substitution, which is closer to pre-resolution concepts in logic. It therefore provides a more realistic basis for attribute grammars. The explicit use of input-output annotations in attribute grammars has pragmatic value. For instance, it is highly desirable that the translation of an input sentence (considered, say, as a term produced at the root of the syntax tree) should be a variable-free term. It is an open question (to the author's knowledge) whether this fact can be determined automatically from an M-grammar, without the help of annotations. With annotations it is a relatively simple task.

The use of these annotations also clarifies the use of functions in a logic program. Prolog systems generally use unevaluated function symbols in most contexts (exceptions are within the "is" predicate in Edinburgh Prolog and in Robinson's logic in LISP). This makes many of the problems of multiple-pass systems irrelevant, and means that functions can be treated as data structures. However, for efficiency it is very important to evaluate functions at the earliest opportunity. Hence the results proved for attribute grammars are relevant to Prolog systems in this area too.

On the other hand, the logic formulation provides a much better basis for semantics than attribute grammars. It is generally accepted that attribute grammars are not complete in themselves. The definitions of action symbols or conditions cannot be completely defined within the formalism. Although, as we have remarked, the word 'semantics' is often used by people working in the area of attribute grammars to refer to the context-sensitive parts of syntax, the original papers by Knuth provided complete semantic definitions. Since these are constructive definitions rather than specifications, we would argue that the grammars presented in this thesis are more complete and the two-pass methodology we will present is generally easier to understand.

Chapter 3.

The Basis of Relational Semantics

To define the semantics of a language means essentially to give a specification for the value of any program written in that language (where the term 'value' is as defined below).

The primary means of giving specifications has traditionally been logical or mathematical. The semantics of logic was the area in which most ideas about semantics were worked out, following Tarski and Carnap.

As a preliminary, one might observe that in linguistics there are very separate fields dealing with the meanings of words and with the meaning of syntactic units, or sentences. This distinction is also useful in describing programming languages. The value of certain elements - e.g. a numerical constant - can be determined without reference to any other part of the language. However, the value of a 'while' loop may depend on any of the other parts of the language. In giving the semantics of a language it is useful to observe this distinction, which is often made by using the terms "static" and "dynamic" semantics. (Note that we do not call context dependencies "static semantics" as they more properly belong to the syntax.) Some formal definitions attempt to make everything into a static function associated with syntax (e.g. attribute grammars) or treat semantics entirely separately from the syntax (e.g. denotational semantics), but they make an exposition less natural and easy to understand.

60 Relational Semantics

The value of a number in a program is easily given as an adjunct of the grammar, by specifying a 'translation' from the text. Many grammar systems (e.g. Irons (1961), Lewis, Rosenkrantz, Stearns (1974), Watt, Madsen (1977)) have specified such features. The value of a program - a dynamic execution involving conditionals, loops etc. - is more easily considered as a function of the program as a whole. If the translation of a program is considered as a term including the value of static units and functional terms which name the dynamic units, then the value of the program is the value of this term.

One characteristic of the value of a program is that it may be undefined because of (for instance) a non-terminating loop. Although, as a consequence of the unsolvability of the halting problem, it is impossible to determine in all cases whether or not a particular program will terminate, it is desirable that the semantics should be able to specify both terminating and non-terminating programs. For this reason a purely mechanical means of extracting the solution (e.g. by means of an interpreter) is unsatisfactory.

Programming languages are generally very complex linguistic objects incorporating many different domains (or types) of objects, the means of structuring these objects into more complex objects, and elaborate control structures to sequence the program. These interact together in a combinatorial fashion, so that to describe the semantics directly would be a huge task.

It is therefore preferable to structure the task: first the semantics of a very simple language is defined; then this is used as a metalanguage to define the semantics of the language being defined.

In effect the language is defined using the metalanguage as an interpreter. However the metalanguage itself is defined as a specification and not as a procedure.

Traditionally, the metalanguage used in computing has been the lambda calculus, developed by Church to explore the behaviour of functions. This did not itself have an adequate semantics until a model was developed by Dana Scott. This was a complex mathematical construction using lattices and, more recently, neighbourhoods (see Scott 1970). Apart from these complexities, which may be largely ignored in practice, the lambda calculus has difficulties for the uninitiated: for instance, it is necessary to introduce the "paradoxical" combinator Y in order to provide recursion in the language, and the same "call-by-name vs call-by-value" problems that have plagued computing languages reappear (albeit with some solutions).

In this thesis we propose the use of first-order predicate logic as a metalanguage. Although logic has been used before in defining programming languages (see Burstall 1969), the recent development of logic as a programming language (Kowalski 1974) gives new insights into the ways in which logic can be used in this process.

The semantics of logic may be defined in two separate ways. The first formalises the idea of logical consequence and is generally called the model-theoretic semantics. The second corresponds to the way in which theorems may be derived from axioms and is called the proof-theoretic semantics. For first-order predicate logic these two are equivalent. Van Emden and Kowalski (1976) argue that the fixpoint semantics of logic considered as a programming language (the procedural interpretation) corresponds to the model-theoretic semantics. The proof procedure for logic is then the operational or proof-theoretic semantics.

62 Relational Semantics

Thus a logic program may be regarded in two ways: as a specification which is a static mathematical object whose semantics are described in a purely formal way; or as a program which may be interpreted on a computer. It does not, of course, follow that any logic program can be executed efficiently or that it will terminate. For any given task one may have a range of logic programs, from one that provides the clearest specification, often requiring the full apparatus of logic (including explicit quantifiers etc), to something which is as efficient as possible in execution terms. It should then be possible to prove that these programs define the same relation.

It is desirable at this point to clear up one possible misconception. It is often assumed that, because of the first-order nature of the logic that is being used (i.e. one does not have function symbols as variables), this formalism is in some way "weaker" than a higher order logic, and cannot therefore be used to describe some programming languages. The answer to this lies in the way in which logic is used to describe a programming language: we represent a program and its value as terms, and the set of predicate and function symbols that is used to define any program is fixed once and for all by the definition of the language. The limitation in first-order logic occurs only at the level of predicate symbols. At the term level, first-order logic is equivalent to any other formalism.

Thus the use of logic as a metalanguage gives the flexibility of higher order systems in a controlled and well-structured way. In the programming language definitions, the relation and function symbols used are essentially metaconcepts, used to name the values for which they stand.

In the following section we review the work that has been done on the semantic basis of the logic formulation that we use. There is no attempt to reproduce the work, only to summarise the results.

The Semantics of Predicate Logic

The semantics of a logic program can be given in two ways.

(1) The Model Theoretic Semantics

This is an essentially mathematical approach which involves considering the value of all possible formulae that can be derived by applying the predicate and function symbols of a program to a specified domain of individuals.

The value (or denotation) of a clause

A <- B1 & B2 & ... & Bn.

is true in an interpretation if and only if, for every assignment of individuals in the domain, if the antecedents B1..Bn are true then so is the consequent, A; otherwise it is false.

In order to give a meaning to this it is necessary to interpret all the predicate and function symbols: this is an essentially intuitive process in which one gives the value 'true' or 'false' to each individual predicate applied to every individual being considered. In fact it can be shown that it is sufficient, for the clausal form of logic, to consider an extremely simple interpretation - the Herbrand interpretation - in which each term denotes itself. It follows from the Skolem-Lowenheim theorem that a program in clausal form has a model iff it has a Herbrand model. The use of a Herbrand model gives a 'syntactic' feel, which is illusory - we are considering the semantics.

A model of a program is an interpretation for which the value of each clause is true for all possible substitutions.

It is argued by van Emden and Kowalski (1976) that the fixpoint semantics of the procedural interpretation is a special case of the model-theoretic semantics. A logic program x is regarded as an equation x = P(x), where P is a continuous function. The ordering that is used is the subset relation. Horn clauses possess the model intersection property: i.e. the intersection of any two Herbrand models is itself a model. Hence the subset relation is a valid partial order. Then Tarski's fixpoint theorem shows that the minimum fixed point exists. Van Emden and Kowalski show that this is consistent with the model-theoretic semantics.
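The least-fixpoint construction can be made concrete for ground programs. The following sketch is ours (the ancestor program is an invented example, and real programs with variables require ranging over all ground instances, which is elided here): iterate the immediate-consequence operator from the empty set until nothing new is derived.

```python
# The least Herbrand model of a ground Horn clause program, computed
# as the least fixpoint of the immediate-consequence operator T_P.
# Clauses are (head, [body atoms...]) pairs of ground atoms.

def t_p(clauses, facts):
    """One application of T_P: add heads whose bodies are already true."""
    return facts | {h for h, body in clauses if all(b in facts for b in body)}

def least_model(clauses):
    facts = set()
    while True:
        new = t_p(clauses, facts)
        if new == facts:
            return facts          # least fixpoint reached
        facts = new

program = [
    ("parent(a,b)", []),
    ("parent(b,c)", []),
    ("anc(a,b)", ["parent(a,b)"]),
    ("anc(b,c)", ["parent(b,c)"]),
    ("anc(a,c)", ["parent(a,b)", "anc(b,c)"]),
]
print(sorted(least_model(program)))
```

The monotone growth of `facts` under the subset ordering is exactly the continuity argument above in miniature.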

The formalism used in later definitions is Horn clauses augmented by negation as failure. The proof rule for this includes a 'closed world' assumption: that if a fact cannot be derived from a set of axioms, then it is false. The original justification for this (Clark 1978) is proof-theoretic. However, Apt and van Emden (1980) demonstrate that finite failure may be characterised as the maximal fixpoint of the definition. The use of negation is essential in certain areas of the semantic definition. Some error situations - e.g. the occurrence of an overflow - can easily be dealt with by introducing negative predicates, which do not require negation. However, in the case of a non-terminating program - which can only be described as one which does not produce a value - negation cannot be avoided. There is another category of programs which produce partial results but are not designed to terminate (e.g. operating systems). These must be described in terms of successive approximations.


(2) The Proof Theoretic Semantics

A proof procedure formalises the way in which theorems may be derived from axioms. In the case of first-order logic it was shown by Godel (1930) that there are complete proof procedures; i.e. procedures which prove any true statement which can be derived from a set of statements in logic. One such proof procedure is Robinson's (1965) resolution method which, together with the SL proof procedure, is used as a basis of the subset of first-order logic known as Prolog (Kowalski 1974).

A definition of this proof procedure is given in section 4.2. Because of the two ways in which the semantics of Prolog are defined, this need not be regarded as a metacircular definition, but rather a model-theoretic definition of the proof procedure. In this definition, negation is defined by means of the 'cut' procedure, which requires an explicit treatment of unification and backtracking.

As a result of Godel's later work, it was shown by Church that any proof procedure which is complete is also undecidable: i.e. it cannot be guaranteed that the proof will terminate in a finite time with the report 'true' or 'false'. Thus the proof procedure might have the value 'undefined'; in practice an algorithm which implements the proof procedure might go into a loop.

The proof procedure thus shares the unfortunate property of operational definitions of programming languages that it cannot give the value of every program written in the language. However, it does provide a constructive method for determining the values of many programs in the language. Van Emden and Kowalski argue that a proof procedure is an "operational semantics" for logic.


A Methodology for Semantic Definition

In order to represent the values of numbers, strings, programs etc, we must write them in some way. Unfortunately, the word 'denotation' is used ambiguously in different areas to refer both to the sign that is used to denote something and the entity that is denoted. Thus '12' is a denotation, but its denotation is also the number it represents (the number 12). To avoid confusion we will use the words 'name' and 'value'. Thus '12' is one of the names of the number whose value is 12 (other names are 1100 in binary, 14 in octal, C in hexadecimal).

In elaborating a methodology for defining the semantics of a programming language, there are four key points that must be established.

(1) Static Semantics

The syntax of a program generates a term which names a program. An example is ASGT(x,0:4), which names the statement written in Algol as "x:=4" (where 0:4 names the number 4). The form of this term is a tree structure whose main virtue is that it is unambiguous and represents the operation simply. It does not imply any restrictions on the way that the language is implemented: it may be possible to "optimise out" this statement during the compilation so that it is never executed. This does not matter. What we are interested in is the value of the program. The value of the number 4 in the program requires no more elaboration than is given. The value of ASGT must be defined by the dynamic semantics, which is given later.


(2) States

In order to talk of the value of a program we must introduce the concept of a state. This may be named like any other value, and its components consist of the variables, files etc that a particular programming language introduces.

Thus in the definition of ASPLE which will be given, a state has 4 components: the value of declared variables, the input and output files, and a value which indicates an error state. The components of the state will vary from definition to definition. The advantage of "bundling them up" in a single function term is in uniformity: the definition of most constructs may be expressed in a way that is unaltered by the exact nature of the state, which is represented solely by variables.

We may distinguish two basic relations, which may be termed Cmd for commands and Exp for expressions. Cmd takes a command and a state and returns a new state. Exp takes an expression and returns a value, together with possibly a change of state. They will be written as follows:

Cmd(command, oldstate, newstate)

Exp(expression, value, oldstate, newstate)

Then an assignment statement may be given as follows:

Cmd(Asgt(tag,exp),s1,s3) <- Exp(exp,val,s1,s2) & Update(tag,val,s2,s3).

One characteristic of this statement, which recurs constantly in the definitions presented later, is the way in which the states s1, s2, s3 are used. We have again a form of relational composition in which the states are passed from one predicate to the next in the clause. In fact this is the same form as in M-grammars, so that the statement could be rephrased as:

Cmd(Asgt(tag,exp)) => Exp(exp,val); Update(tag,val).

Obviously this is not a 'grammar rule' in traditional terms. We are using the notation simply as a convenient shorthand. If one considers the notation as a generalised form of relational composition it becomes more meaningful.

In denotational semantics, a further 'currying' of functions is used to separate out various aspects of the state (locations, stores, label values etc.). This is merely a notational convenience. We find it more useful to structure the states. This means that the clause for top-level constructs, such as the assignment statement above, is unchanged if the form of the state is changed. Obviously any relation which accesses the state must be changed.
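The state-threading pattern of the Cmd and Exp relations can be sketched as a functional interpreter. The sketch is ours: the state layout (a dictionary of variables plus input and output lists) is an invented simplification of ASPLE's four-component state, and the term encoding is illustrative only.

```python
# Cmd and Exp as functions that thread the state explicitly,
# as s1 -> s2 -> s3 in the assignment clause above.

def exp(e, state):
    """Exp(expression, value, oldstate, newstate) as (value, newstate)."""
    kind = e[0]
    if kind == "num":
        return e[1], state
    if kind == "var":
        return state["vars"][e[1]], state
    if kind == "add":
        v1, state = exp(e[1], state)
        v2, state = exp(e[2], state)
        return v1 + v2, state
    raise ValueError(e)

def cmd(c, state):
    """Cmd(command, oldstate, newstate) as newstate."""
    kind = c[0]
    if kind == "asgt":                    # Cmd(Asgt(tag,exp),s1,s3) <-
        val, s2 = exp(c[2], state)        #   Exp(exp,val,s1,s2) &
        return dict(s2, vars={**s2["vars"], c[1]: val})  # Update(tag,val,s2,s3)
    if kind == "seq":
        for sub in c[1]:
            state = cmd(sub, state)
        return state
    raise ValueError(c)

s0 = {"vars": {}, "input": [], "output": []}
s = cmd(("seq", [("asgt", "x", ("num", 4)),
                 ("asgt", "y", ("add", ("var", "x"), ("num", 1)))]), s0)
print(s["vars"])                          # {'x': 4, 'y': 5}
```

Note that, as in the clauses, states are never updated in place: each step produces a new state, so the structure of `cmd` is unchanged if more components are added to the state.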

(3) Types and Function Symbols

Predicate logic as normally defined is type-free. This has disadvantages compared with typed systems when one wishes to make assertions about a set of axioms. For instance, when one is using an induction schema, one may wish to use a specific property of integers and limit the type of a variable correspondingly.

In the definitions we will present, a unique set of function and constant symbols is used with the same effect. Different function symbols are required for:

1. Each data type in the object language.
2. Each descriptor used for a storage class.
3. Each predefined function in the system.
4. Each control structure.


Given these distinctions we can then make use of structural induction (Burstall 1969b) in proofs of program properties. The different function symbols are generated by the static semantics associated with the syntax description. It is obviously possible to include constructors which build up complex data types of the user's choice: Algol 68 and Pascal demonstrate ways of achieving this.

(4) Completeness

Completeness is an essential characteristic of a semantic definition. Each possibility must be covered: this includes constructs which lead to error situations, non-terminating programs, and the space limitations which are inherent in any practical implementation of a programming language.

This completeness is achieved by providing a full set of axioms for each construct. In some cases this can only be achieved by the use of negation, as has already been pointed out.

Chapter 3.2.

Axiomatic Semantics

Since axiomatic semantics is defined in logic it should not be surprising that the task of expressing this in Prolog notation is relatively simple, although there is a choice of several representations. Because of this we take the liberty in this section of defining the semantics of the whole 'ASPLE' language, which is not introduced until section 4.1. In fact the definition should be explicable to anyone acquainted with axiomatic definitions, without needing to refer to the introduction of the language.

In expressing Hoare's axiomatic system in first-order logic one encounters two potential problems.

(1) Some of the axioms are actually "axiom schemata" rather than axioms. Another way of saying this is that there are second-order features in the system (allowing quantification over predicate and function symbols).

(2) The 'rules of inference' are an addition to the normal first-order system.

Both of these problems can be handled if we treat predicates in the axiom schemes as function symbols and rules of inference as implications. This is equivalent to treating the first-order system as a metalanguage for expressing the axioms. Then the 'rules of consequence' become the gateway to proving many assertions directly, and the 'rules of inference' can be expressed using normal variables in place of the assertions.

Thus where Hoare writes P{s}Q we will use the single predicate Ax(p,s,q), where p and q are any formulae in the

predicate calculus, and s is a program in the language, introduced as a tree of the abstract syntax such as that produced by the syntax analysis of the language.

We may then give the axioms for ASPLE statements in Fig 3/2/1. In addition to these we require what Hoare calls "rules of consequence". These may be stated as follows:

Ax(p,s,q) <- Ax(r,s,q) & Demonstrate(p->r). Ax(p,s,q) <- Ax(p,s,r) & Demonstrate(r->q).

These rules state in essence that we can always use a stronger precondition, or a weaker postcondition, in order to prove a program. The predicate Demonstrate was introduced by Kowalski (1979) to provide a link between the metalanguage and the object language of the logic system.

Thus Demonstrate(x) is true if an instance of x can be proved from the facts known about it. We assume the existence of a theorem prover which has the necessary facts about algebraic expressions, simplification etc. For an elaboration of the 'facts' that a typical theorem prover needs to know, see Boyer & Moore (1979). They include such equivalences as:

x + y = y + x
x * (y+z) = x*y + x*z
FALSE & x = FALSE
~(x < x)
x ≠ y = ~(x = y)


1. Statement Sequence
       P{s1}R    R{s2}Q
       -----------------
          P{s1;s2}Q

2. The Null Statement
       P {SKIP} P

3. Assignment
       P[exp/x] {x:=exp} P

4. Conditionals
       P&b {s1} Q    P&Not(b) {s2} Q
       ------------------------------
       P {IF b THEN s1 ELSE s2} Q

5. Loops
       P&b {s} P
       ------------------------------
       P {WHILE b DO s} P&Not(b)

6. Input
       P[File(In,v.y)/File(In,y), v/Deref(x)] {input x} P

7. Output
       P[File(Out,x.y)/File(Out,y)] {output x} P

Fig. 3/2/1

Fig 3/2/1 shows the axioms for ASPLE in the normal Hoare notation, and the translation of these into Prolog notation as a metalanguage is shown in Fig 3/2/2. The only other axiom assumed in this definition is the 'Subs' relation whose interpretation is:

Subs(a,b,c,d) means "the expression produced by substituting a for b in c is d".
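The intended behaviour of Subs may be sketched executably. The following Python fragment (an illustrative sketch only; the thesis itself works in Prolog, and the tuple representation of terms is an assumption made here) computes the d of Subs(a,b,c,d) as a function of the other three arguments:

```python
# Sketch of the Subs relation: Subs(a, b, c, d) holds when substituting
# a for b in the term c yields d.  Terms are represented here as nested
# tuples such as ('Plus', 'x', 1); this representation is hypothetical,
# chosen only for illustration.  Note the sketch does not distinguish
# function symbols from variables, as a real implementation would.

def subs(a, b, c):
    """Return c with every occurrence of the subterm b replaced by a."""
    if c == b:
        return a
    if isinstance(c, tuple):
        return tuple(subs(a, b, part) for part in c)
    return c

result = subs(('Plus', 'x', 1), 'x', ('Eq', 'x', 'y'))
# result is ('Eq', ('Plus', 'x', 1), 'y')
```

Read relationally, Subs constrains four arguments at once; the functional sketch fixes one direction of use (a, b, c given; d computed), which is the direction the axioms of Fig 3/2/2 exploit.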


1. Ax(p,s1.s2,q) <- Ax(p,s1,r) & Ax(r,s2,q).

2. Ax(p,NIL,p).

3. Ax(p,Asgt(var,exp),q) <- Subs(exp,Deref(var),q,p).

4. Ax(p,Cond(exp,s1,s2),q) <- Ax(p&exp,s1,q) & Ax(p&Not(exp),s2,q).

5. Ax(p,While(b,s),p&Not(b)) <- Ax(p&b,s,p).

6. Ax(p,Input(x),q) <- Subs(File(In,v.y),File(In,y),q,r) & Subs(v,Deref(x),r,p).

7. Ax(p,Output(x),q) <- Subs(File(Out,x.y),File(Out,y),q,p).

Fig. 3/2/2
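Read procedurally, as Prolog would execute them, clauses 1-3 compute a precondition backwards from a postcondition. The following Python sketch imitates that backward reading (the tuple term representation and the constructor name 'Seq' for the sequence s1.s2 are assumptions made for illustration, not the thesis's notation):

```python
# Clauses 1-3 of Fig 3/2/2 read backwards: given a statement tree and a
# postcondition q, compute a precondition p such that Ax(p, stmt, q).
# Terms are nested tuples; 'Seq', 'NIL', 'Asgt' etc. are hypothetical
# stand-ins for the abstract syntax.

def subs(a, b, c):
    # Subs(a, b, c, d): d is c with every occurrence of b replaced by a.
    if c == b:
        return a
    if isinstance(c, tuple):
        return tuple(subs(a, b, x) for x in c)
    return c

def ax(stmt, q):
    """Return a precondition p for statement stmt and postcondition q."""
    if stmt == 'NIL':                    # clause 2: the null statement
        return q
    if stmt[0] == 'Seq':                 # clause 1: thread r through s1;s2
        r = ax(stmt[2], q)
        return ax(stmt[1], r)
    if stmt[0] == 'Asgt':                # clause 3: back substitution
        var, exp = stmt[1], stmt[2]
        return subs(exp, ('Deref', var), q)
    raise ValueError(stmt)

# x := x+1 with postcondition Deref(x) = a+1:
post = ('Eq', ('Deref', 'x'), ('Plus', 'a', 1))
pre = ax(('Asgt', 'x', ('Plus', ('Deref', 'x'), 1)), post)
# pre is ('Eq', ('Plus', ('Deref', 'x'), 1), ('Plus', 'a', 1)),
# i.e. x+1 = a+1, which simplifies to x = a.
```

The sketch omits the Demonstrate calls of the rules of consequence: simplifying the computed precondition is left to the assumed theorem prover.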

Most of the axioms are familiar because of their inclusion in any treatment of the axiomatic system, so only the main points will be considered here.

1. The assignment rule is the 'back substitution' axiom adopted by Hoare. This is in fact more general than Floyd's original 'forward' axiom as it allows one to argue about any variables not affected by the assignment, but also allows the deduction of Floyd's axiom as a special case.

This may be illustrated by the simple example

x:=x+1.

Assuming x initially has some value a, the effect of this statement may be phrased in two different ways:


Hoare's method is 'goal-oriented' in that one makes a query starting with the final statement.

P {x:=x+1} x=a+1

which yields by substitution the formula for P

P = (x+1 = a+1)

from which one can then derive the starting condition

P = (x = a).

Since logic programs are invertible one could use this in reverse to derive the result of Floyd's axiom,

x=a {x:=x+1} ∃x0 [x=x0+1 & x0=a]

which reduces to

x=a {x:=x+1} x=a+1
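The simplification step from x+1 = a+1 to x = a is the kind of fact the Demonstrate predicate is assumed to supply. It can be checked mechanically; a small sketch (illustrative only, using exhaustive testing over a small integer range rather than the algebraic proof a real theorem prover would give):

```python
# Check that the back-substituted precondition (x+1 == a+1) agrees with
# the simplified form (x == a) on a range of sample integer values.
# This is evidence, not proof; a prover such as Boyer & Moore's would
# establish the equivalence algebraically.

def equivalent(f, g, values=range(-5, 6)):
    """True if f and g agree at every pair of sample values."""
    return all(f(x, a) == g(x, a) for x in values for a in values)

derived = lambda x, a: x + 1 == a + 1   # P obtained by substitution
simplified = lambda x, a: x == a        # P after simplification

print(equivalent(derived, simplified))  # True on the tested range
```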

Using the tree structure introduced by the ASPLE definition (ch 4.1), the assignment x:=x+1 is represented

Asgt(Id(x),Plus(Deref(Id(x)),Val(1)))

One might note here the presence of the 'Deref' operation on the right hand side which is required in Algol 68-like languages to extract the value of an identifier from its location (the Strachey model of lmode and rmode is in some ways more convenient). It is for this reason that the extra 'Deref' is introduced into the call for Subs in clause 3 of Fig 3/2/2. One must substitute the value represented by the right hand side for the value represented by the left hand side.

Thus this treatment corrects one criticism that has been made of Hoare's method: namely that a single assignment statement can mean many different things. Many languages have transfer functions that are invoked automatically on assignment. In ASPLE this is the dereference, which may be invoked in different ways according to the modes of the operands. For instance, if y has mode REF REF INT, then the assignment x:=y would be represented Asgt(x, Deref(Deref(y))) whereas y:=x would be represented Asgt(y, x), where instances of Id() are omitted for clarity.

The assertion "y points at x" after the assignment above may therefore be represented by Deref(y)=x. Expressed in Hoare's notation this becomes:

P {y:=x} Deref(y)=x

which simplifies directly to the goal:

Subs(x, Deref(y), Deref(y)=x, x=x)

which is clearly satisfied.

2. The axiomatisation of transput is a slight simplification of that presented by Hoare and Wirth (1973) for Pascal. It assumes a specification of files as a dyadic function symbol File(a,b) of which the first parameter specifies which file is used and the second gives the contents as a list of items, using the list constructor '.'. The contents of this list are expressions and not simply constant symbols, so that symbolic proofs may be carried out.


A file may be considered at any point in the program as a variable naming the remainder of the file available at that point (see Fig 3/2/2). Thus at the beginning of the program this is the whole file, and at the end it is empty (assuming that the whole file is consumed).

For the input file, there is an implied condition that there is some input left, or, in ASPLE (or Pascal) terms, that the end of file has not been reached. Since the contents of the file before the input statement are assumed to be 'v.y', this can only be satisfied by the presence of at least one value in the file.

The Development of Semantics

"Let's hear it", said Humpty Dumpty, "I can explain all the poems that ever were invented - and a good many that haven't been invented just yet." Lewis Carroll: Through the Looking Glass.

The Complementary Approach

J.W. de Bakker (1976) sums up well the current position of research into the semantics of programming languages.

"Unfortunately there is no agreement at all on what constitutes a proper methodology for semantic specification. On the contrary, we find ourselves confronted with an embarrassingly rich choice of approaches."

There are three main reasons for this multiplicity of approaches.

(1) Semantics is used for a variety of different purposes and no one method works equally well in every area. Among the main uses are:

Proving the correctness of programs
Proving the equivalence of program schemes
Defining unequivocally the constructs of a programming language
Proving the correctness or equivalence of implementations
Communicating to users the effects of a language


This emphasis on different usage has been used by Hoare and Lauer (1974) to argue for complementary theories of languages. They distinguish the extremes of the constructive (or operational) approach (which defines what effect a language will have on a machine when it is executed) and the implicit approach (which makes statements about properties of programs). Within these they compare a number of approaches, which are considered below.

(2) There are fundamental philosophic disagreements between mathematicians over the issue of semantics. The origin of this is the split between the 'constructivist' school (Hilbert, Brouwer) and the 'infinitist' assumptions of most mathematicians. This is demonstrated in the development of "proof theory" (Herbrand, Gentzen) as an alternative to Tarski's 'model theory' of semantics. This has given rise on the one hand to 'axiomatic' and constructivist approaches (Floyd, Hoare, Dijkstra, van Wijngaarden) and on the other to the 'mathematical' approach (McCarthy, Strachey, Scott).

That this cleavage exists may be illustrated by this exchange of views at the IFIP Working Conference (Steel 1966) between a 'loner' in the constructivist field - Aad van Wijngaarden, who backs a constructive approach to operational semantics - and Christopher Strachey, who represents the more abstract 'mathematical' school.

van Wijngaarden: "I cannot accept certain presuppositions. Deep in the foundations of mathematics there are fundamental differences between the opinions of mathematicians ... Any natural number exists only when I have constructed it and the construction is a specific denotation for that number. If you take that attitude, the completely finitist and constructivist attitude, then I cannot talk

about any number without arriving at it." (p. 293)

Strachey (to van Wijngaarden): "Your technique has been to take all the things that people think are important in languages and replace them by all the features that everybody left out... The last thing I would want to do is to remove a function (or at least what I call a function) because it seems a much better understood mathematical entity than a procedure - which I call a routine - which is a complicated command."

To comment on the details of this exchange requires a more careful treatment of the differences between model and proof-theoretic semantics, or on a deeper level infinitist and intuitionist mathematics, but the difference in emphasis is clear: there are two different paradigms (in Kuhn's sense) in use whose applicability cannot be argued out in a strictly logical fashion.

(3) There are a variety of different languages or notations in use to describe semantics whose relationship is not always clear. Although formal equivalence has been proved in some cases, it is not easy to bridge the gap mentally between the superficial differences. In other cases there is a distinct trade-off between 'power' and 'obscurity'. The three main approaches, which reappear in several otherwise dissimilar methods, are machine definition, the lambda calculus, and predicate logic.

The resolution of this problem proposed by Hoare and Lauer (1974) is the principle of complementarity, already used in physics and other fields. They consider four 'levels of abstraction' in the definition of a programming language.


(1) The machine description, including variable states and control states, which interprets the program by means of an automaton.

(2) The computational model, which is defined by a function which maps programs onto memory states.

(3) The relational theory, which considers relations between memory states by means of axioms.

(4) The deductive (or axiomatic) theory, in which memory states are no longer considered but only formulae about propositions describing properties which one wishes to prove about the program.

Because of reason (2), the simple classification scheme of Hoare and Lauer is inadequate to deal with the development of denotational semantics, which is not simply a step on the road towards an axiomatic approach. The aim of 'abstraction' is in this case a means to make available general mathematical methods and thereby to characterise what must be true of the machine rather than positing a possible realisation.

Classification by means of notation is less helpful as it is often possible to use more than one notation for essentially the same method, and one notation for several methods. However, notation does have a significant effect on the comprehension and usability of a method.

In order to give substance to this comparison an outline is given below of a few of the high-water marks in semantic definition. For more complete bibliographies one may refer to Steel (1966, 346 refs).


Operational Semantics

The first complete operational definition of a language was published by Landin (1964) for what he calls 'applicative expressions', based closely on the lambda calculus. The language is defined by an abstract syntax (see section 2.2) and executed on a hypothetical machine which has four basic components - Stack, Environment, Control and Dump (SECD). The operations on this are simple data copies controlled by conditional expressions which depend simply on the presence of various 'constructed objects'. The effect of an expression may be followed by anyone with the intelligence of a patient bureaucrat.

This method was taken up by the IBM laboratory in Vienna as a means of defining the language PL/1 (see Lucas, Walk 1971). They took over McCarthy's (1962) formulation of abstract syntax and used rather more complicated machines (with several more states), including some non-deterministic features. Although cumbersome, the notation was sufficient to define PL/1, Algol-60 (Lauer 1968) and several other languages (e.g. see Lee 1972). It was also used to prove the equivalence of different implementations of certain features, such as the block concept (Lucas 1968).

A totally different operational approach is taken by van Wijngaarden (1964). He simplifies Algol-60 by a succession of systematic translations until he is left with assignments, procedure calls with parameters (which are essentially jumps), the block concept and arithmetic and boolean operations. This is then processed by a simple processor which produces a dynamically varying text. The philosophy of his approach was taken up in the definition of Algol-68 (van Wijngaarden et al 1969) which defines a relatively small kernel with an outer language defined in terms of that kernel. The semantics of the Algol-68 report was


defined in English, not in van Wijngaarden's two level grammars, and the size of that task has apparently deterred any attempts to rectify this. But W-grammars were used by Cleaveland and Uzgalis (1977) to provide a definition of both syntax and semantics of a small subset of Algol 68, called ASPLE, used below. This was done purely by expanding sets of productions which describe the states of a hypothetical machine and the input and output files of the program. They admit that the addition of further features, such as jumps and block structure, would complicate the definition considerably. Also there has not been any attempt to use this style of definition for other purposes such as program proving and it must therefore be judged purely on its explicative value.

Mathematical Semantics

The requirements of a mathematical theory of programs were laid down by McCarthy (1962, 1966), but it has taken much longer to refine the basic ideas.

The basic aim of the mathematical theory is to define a function which gives the state s' which results from applying a program p to another state s, i.e. s' = meaning(p, s), with functionality:

meaning: Prog x State -> State.
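The shape of such a 'meaning' function can be sketched for a toy fragment of a language (an illustrative sketch in a modern notation; the constructor names and the dictionary representation of states are assumptions made here, not McCarthy's):

```python
# A minimal 'meaning' function of type Prog x State -> State.  The state
# is a mapping from identifiers to values; programs are nested tuples of
# a toy abstract syntax ('Seq', 'Asgt', 'Plus', 'Var', 'Val' are
# hypothetical stand-ins).

def eval_exp(exp, state):
    tag = exp[0]
    if tag == 'Val':
        return exp[1]
    if tag == 'Var':
        return state[exp[1]]
    if tag == 'Plus':
        return eval_exp(exp[1], state) + eval_exp(exp[2], state)
    raise ValueError(exp)

def meaning(prog, state):
    tag = prog[0]
    if tag == 'Asgt':                 # a new state differing at one place
        new = dict(state)
        new[prog[1]] = eval_exp(prog[2], state)
        return new
    if tag == 'Seq':                  # sequencing is function composition
        return meaning(prog[2], meaning(prog[1], state))
    raise ValueError(prog)

s1 = meaning(('Seq', ('Asgt', 'x', ('Val', 1)),
                     ('Asgt', 'x', ('Plus', ('Var', 'x'), ('Val', 2)))),
             {})
# s1 maps x to 3
```

The point of the functional formulation is visible even at this scale: sequencing is ordinary composition, so the algebra of functions becomes available for reasoning about programs.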

To do this sensibly requires a structural description of the program (the abstract syntax) and the state of the machine. These are provided by a set of functions and predicates which act on terms of the language. McCarthy (1966) applied this to a very small subset of Algol including assignments, conditional expressions and jumps. For

the concrete syntax he gives a simple parser acting on strings of symbols written in pure LISP.

Thus far there is very little difference between a mathematical and an operational definition. In particular, the notion of a 'state' which holds the value of variables is central to operational definitions; indeed McCarthy's work was very influential on both Landin and the Vienna group. The difference is that the meaning is expressed as a function for which the manipulations which are normal in mathematics apply.

The concept of the store was further examined by Strachey (1966) who showed how the concepts of L and R values lead, for instance, to a generalisation of the Algol 60 conditional expression: e.g.

IF a THEN b ELSE c FI := d;

The store is then abstracted out (by Curry's method) so that it becomes itself a function (s)

s' = Meaning(Prog) (s)

So the functionality of the semantics is now:

Meaning: Prog -> (Store -> Store)

As with Landin (1964, 1966), Strachey uses the lambda calculus to express the meaning of declarations and block structure. To express recursive procedures (and thus general loops) in the lambda-calculus it is necessary to include the 'paradoxical' fixed point operator Y. Landin (1964) gives a method for constructing this in his mechanical model, but Strachey goes further in using the equation produced to show the equivalence of a number of program

fragments that might be produced in evaluating conditions, loops etc.

But there is a problem inherent in this treatment: functions which return functions are encountered in at least three places: in the functionality of programs (which produce functions from states to states); in the use of the fixed point operator Y (λa.(λy.a(yy))(λy.a(yy))); and in the intended generalisation of programming languages to treat functions as 'first-class citizens'. However, it was far from obvious that such constructions can be mathematically allowed, or that they will not, as in some applications of the fixed point operator, produce contradictions. This problem was solved by Scott's (1970) construction of a complete lattice which proves that the fixed point actually exists and allows one to talk about total instead of partial functions.

The work of Scott and Strachey has given rise to a very active field of study known as denotational semantics. This is a model-theoretic semantics using the Lambda Calculus as a metalanguage. All its domains are total, so that the meaning of non-terminating programs can be described. The last major addition to the armoury was the abstracting of 'continuations' (Strachey, Wadsworth 1974) as a means of handling jumps in a clean fashion. The full functionality of a command in an Algol-like language is now

Meaning: Command -> (Env -> (Cont -> (Store -> Store)))

where 'Env' is an environment (to create variables) and 'Cont' is a continuation (which, intuitively speaking, selects the next statement for execution). Thus the new state is

s' = meaning(prog)(env)(cont)(s)
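The curried, continuation-passing functionality can be sketched with nested closures (an illustrative sketch: the environment argument is omitted for brevity, and the constructor names are hypothetical, not those of Strachey and Wadsworth):

```python
# Continuation-style meaning function, Meaning: Command -> Cont ->
# Store -> Store (the Env argument is dropped to keep the sketch small).
# Each command receives a continuation 'cont' of type Store -> Store
# which selects the rest of the program; a jump would simply discard
# cont and apply a continuation bound to a label instead.

def meaning(cmd):
    tag = cmd[0]
    if tag == 'Asgt':
        var, val = cmd[1], cmd[2]
        def m(cont):
            def run(store):
                new = dict(store)
                new[var] = val
                return cont(new)     # hand the updated store onwards
            return run
        return m
    if tag == 'Seq':
        def m(cont):
            # s1 runs with "the meaning of s2, then cont" as its
            # continuation, so control is threaded explicitly.
            return meaning(cmd[1])(meaning(cmd[2])(cont))
        return m
    raise ValueError(cmd)

final = meaning(('Seq', ('Asgt', 'x', 1), ('Asgt', 'y', 2)))(lambda s: s)({})
# final maps x to 1 and y to 2
```

Because control flow is an explicit value, the clean treatment of jumps follows: GOTO is just the application of a stored continuation in place of the current one.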


The virtue of this abstraction is that it allows one to deal algebraically with such tasks as proving the equivalence of program fragments, and providing a minimal definition for implementation. However, it has been argued (Brady 1977) that the denotational definition is still an operational one, in that it postulates the existence of states, environments etc in a similar way to the operational definition:

"We would argue that the evidence is overwhelmingly against Strachey's conjecture about meaning and the idea that there is a clear distinction between mathematical and operational semantics."

The quest for abstraction comes up against the question "Is it possible to define all the characteristics of a programming language (including references, side effects, jumps etc.) without introducing the basic characteristics of the von Neumann machine - stores, program counters etc?" The answer seems to be "No".

Axiomatic Semantics

Axiomatic semantics started from a different perspective to either the operational or mathematical approach. It does not attempt to trace the total effects of a program on some machine 'state' but rather the effect of the program on statements in the predicate calculus that the programmer wishes to make about the value of various variables.

The history of axiomatics passes through three phases. The first, proposed by Floyd (1967) gives the 'strongest verifiable consequent' of a command, i.e. given that some formula is true before the command, it asks which statements

can be made after the command is executed. He shows how, by choosing suitable 'inductive' assertions, one can prove properties of programs, including termination proofs based on well-ordered sets. Illustrative of his treatment is the assignment statement, given as

V[x:=f(x,y)](R(x,y)) = ∃x0 (x=f(x0,y) & R(x0,y))

i.e. after executing the statement x:=f(x,y) from a state in which some relation R(x,y) is true (where y contains any variables other than x), we can assert two things that are true about x0, some previous value of x.

Hoare (1969, 1971, 1972) extended this from being a set of proof rules to an axiomatic theory. The basic assertions about assignments, conditionals, loops etc become axioms (or axiom schemata) in the theory and rules of inference are given for constructing proofs in the theory. This axiomatic form has been applied by Hoare and Wirth (1973) to the definition of the Pascal language. Some features (e.g. side effects and jumps) are not treated, but the treatment is in other ways remarkably clear and presented as a 'contract' between language designers, implementors and users.

There is a marked shift in Hoare's treatment of the assignment statement to the "backward" rule, i.e.

Py {x:=y} P

i.e. given an assertion P after the assignment, we can assert P before the assignment with all instances of x substituted by y, an expression which may include x and other variables. Here the emphasis is not on the "verifiable consequence" of a command, but the necessary "precondition" to achieve a desired end result.


The chief weakness of this axiomatic system is that it contains no reference to the termination of programs. Thus, while after the statement "while b do s" we can certainly assert Not(b) if the statement finishes, there is no mechanism for deciding whether it will finish. Although suggestions such as the 'sometime' construct of Manna and Waldinger (1976) were proposed to fill this gap, it was left to Dijkstra (1975) to complete the transition to the framing of axiomatic systems in terms of "weakest preconditions".

The 'weakest precondition' is a function which transforms a predicate that is applicable at the end of a program into the preconditions for the program to terminate and produce the required result. It thus completes the 'backwards' process initiated by Hoare. Thus for instance, the rule for the iterative construct includes an induction criterion for termination.

Having considered very briefly the progress of the axiomatic approach it is necessary to compare it with the other methods of formal specification. The immediate question that arises is "what is the definition for?" Axiomatic semantics is evidently superior in two respects - (a) for proving properties of programs, (b) for developing correct programs.

But it fails to answer as readily questions such as "what are the requirements for implementing this language?", "what does this programming construct do?" or "do these two programs compute the same function?". A key drawback is this: to specify the meaning of a program having a loop, it is necessary for the programmer to supply an inductive "invariant" assertion, which cannot be deduced from the text of the program.

Thus from the point of view of specifying what the execution of a program does, axiomatic semantics fails to give a complete answer. It is not a tutor which explains carefully and exactly what a language does; rather it is an oracle which will answer "yea" or "nay" to questions which the user must carefully construct in order to find out the truth about her program.

Algebraic Semantics

The Algebraic approach was originally applied in programming languages to the development of the concepts of abstract data types (see Guttag, Horowitz, Musser 1978) and has only more recently been applied to the task of supplying a semantics for a language. To take a well-used example, it is intuitively obvious that the key properties of the concept 'stack' are given by the equations:

TOP(PUSH(item,stack)) = item
POP(PUSH(item,stack)) = stack

where TOP, PUSH and POP are three functions which act on stacks and their data and return values of stacks and data (this omits any checks for the 'empty stack' state). Such a definition says nothing about the representation of a stack, only about the behaviour of functions acting in sequence on the data type.
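The equations can be read directly as executable checks against any candidate representation. A sketch (the tuple representation below is chosen purely for illustration; the specification itself deliberately says nothing about representation):

```python
# The algebraic stack laws, checked against one concrete representation.
# Any representation satisfying the two equations would do equally well;
# tuples are used here only so the laws can be tested.

def PUSH(item, stack):
    return (item,) + stack

def TOP(stack):
    return stack[0]

def POP(stack):
    return stack[1:]

s = PUSH(2, PUSH(1, ()))
assert TOP(PUSH(3, s)) == 3      # TOP(PUSH(item,stack)) = item
assert POP(PUSH(3, s)) == s      # POP(PUSH(item,stack)) = stack
```

This is the sense in which the algebraic equations specify behaviour rather than structure: they constrain the composite TOP∘PUSH and POP∘PUSH, not the stacks themselves.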

Although this approach is promising for producing specifications, it is not necessarily obvious that such a specification would be either unique, complete or minimal. Hence attention has centred on 'initial algebras' (see Goguen, Thatcher, Wagner 1977) which have the special property of having a unique homomorphism with every other algebra in that category.


The algebraic approach has not been developed as far as the denotational semantics, though a start is made by Burstall and Goguen (1977) in describing the semantics of their specification language CLEAR. It is, however, not yet clear whether every aspect of a programming language can be described in a straightforward manner: it can be very difficult to capture the algebraic 'essence' of a concept.

Some work has been done (see Clark, Tarnlund 1977) in using Prolog to define theories of data types. Further work is being undertaken by van Emden on the relationship between clauses and equations for expressing specifications. It is probable that the definitions given here as 'relational semantics' can be recast as theories, though this requires further study.

Summary

Some of the work on semantics might be criticized as being an endless 'quest for the abstract'. The virtues of abstraction lie in the facts that it does not constrain an implementation more than is necessary and that it makes more powerful mathematical and logical tools available for manipulating and proving programs. The danger is that the notation becomes obscure and that it is not widely usable.

The specification presented in this thesis - relational semantics - is not the most abstract possible. It is in some senses a realisation. However it uses a well-founded language which has a firm denotational basis and using this attempts to express the meaning of the language clearly. It should be possible to extend the methods used following the lead of the algebraic school to present a more abstract specification if this is required.


The axiomatic method may be seen as a proof method for programs and as a means of conveying the essence of a language to the programmer rather than as a specification for the language. Its simplicity makes it appealing to the programmer, but this is offset by the difficult (and non-intuitive) treatment of some of the more 'tricky' features of programming languages. The axioms should be derivable from the specification of the language, though this has not been attempted here.

Chapter 4.1.

The Definition of ASPLE

A Brief Introduction to ASPLE

ASPLE is a simple language based on Algol-68. It is simple enough for the presentation not to be tedious, but complete enough to cover many of the features found in a typical programming language (with the notable exception of blocks, procedures and jumps). It includes:

Assignment.
Several types and type constructors (with declarations).
Arithmetic and relational expressions.
Structured control statements - IF and WHILE.
Input and output files.
Implementation defined constraints, e.g. size of identifiers and integers, arithmetic errors.

As an example of ASPLE, which introduces most of its features and is almost self-explanatory, we give a program (Fig. 4/1/1) to compute factorials, reading the number from the input and outputting the results.


begin
  int fact, i, n;
  input n;
  i := 1;
  fact := 1;
  if (n ≠ 0) then
    while (i ≠ n) do
      i := i + 1;
      fact := fact * i
    end
  fi;
  output n;
  output fact
end

Fig. 4/1/1.

The data structuring tool in ASPLE is the reference. One can, for instance, declare a variable as 'ref int' or 'ref ref bool'. While this is not in itself a very powerful structuring tool it does serve to introduce the possibility of infinite modes and objects.

The BNF syntax for ASPLE is given by Marcotty et al (1976) to which the reader is referred for further details of the language. It is hoped that the M-grammar definition will be found to be simple enough to read so that it is unnecessary to reproduce the BNF.


The Top Level of the ASPLE Definition

The ASPLE language may be considered as a relation between four entities. These are:

1. The text of the program.
2. The input file.
3. The output file.
4. The result of the program (i.e. successful or not).

These are the only 'external' things we need to know about a program written in the language, and a formal definition should, given the text of the program and the input file, tell us what the result is and the contents of the output file. We will use the notation:

ASPLE(text, input, output, result)

to express this relation.

At the first level of refinement, we may detect three separate stages in the analysis of the language.

1. The lexical syntax.
2. The syntax.
3. The semantics.

The first stage, lexical syntax, groups the characters into words or tokens, and discards irrelevant items such as spaces, new lines and comments. In most formal definitions this stage is hidden deep within the syntax (as in the Algol-68 report) or expressed only informally (e.g. in the Pascal report). But it is intentionally a self-contained entity in most languages (Fortran is an exception) to simplify definition and compilers. The facilities of M-grammars make it possible to regard it as a separate relation, between the characters, or text, of a program, and the words or

tokens; symbolically 'LEXEME(tokens, text, NIL)' (the final parameter may be ignored as it simply provides the 'difference list' used by the M-grammar).

In a similar way, the syntax may be regarded as a relation between the list of tokens and the program represented as a tree of morphemes (indivisible grammatical elements) - the 'abstract syntax' of VDL. This relation is named 'MORPHEME(tree.env, tokens, NIL)'. It only holds in the case that the tokens of a program form a correct program. The morphemes are divided into two classes - 'env' (short for 'environment') names the storage locations named by the program, and 'tree' the actions. In a block structured language the relation would be expressed slightly differently, but this is adequate for ASPLE.

Finally, the semantics of a program may be regarded as a relation between the morphemic tree, the initial 'state' of the hypothetical machine executing the program, and the final state of that machine - named 'SEMEME(tree, state1, state2)'. These states are composed of four entities: the internal storage of the machine, the input and output files, and the status of the machine. A state may be represented by a term 'STATE(env, input, output, status)', which is explained more fully later.

We may thus express the meaning of a program in the programming language by saying that the ASPLE relation holds if the three other relations hold. This may be written formally in logic as:

ASPLE(text, input, output, result) <-
   LEXEMES(tokens, text, NIL)
   & MORPHEME(tree.env, tokens, NIL)
   & SEMEME(tree, STATE(env, input, output, OK),
                  STATE(env1, in1, NIL, result)).


Here the variables tokens, tree and env are local to the body of the clause and provide the linkage between the three relations in the body. The two variables env1 and in1 in the final state are simply ignored, and result and output are returned as the result. It may appear surprising that 'output' occurs in the initial state, but this will be explained later.

This clause may be read in two entirely different ways. It may be read declaratively, as a statement of the predicate calculus; or it may be read procedurally, as a series of goals: to prove the ASPLE relation, prove the lexeme, morpheme and sememe relations. These two readings correspond to two different uses of the clause - to understand a program, or to execute the program on a computer.

It is not necessary that the parameters are bound in exactly the way that has been assumed until now. We have assumed that 'text' and 'input' are bound at the start, and 'output' and 'result' are bound on completion. This corresponds to the normal process of compilation and execution. But it would be equally possible for all the parameters to be bound at the start, thus checking that a particular computation is performed. More fancifully, it is possible to bind only the input and output files at the start and let the definition produce the program, thus 'synthesising' a program to do this transformation. This will not produce any very interesting programs unless the inputs and outputs are generalised in some way.
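These alternative binding patterns can be pictured with a small Python sketch (ours, not the thesis's Prolog): a relation held extensionally can be queried with any subset of its arguments bound, including the 'synthesis' direction. The relation contents here are invented for illustration.

```python
UNBOUND = object()  # stands for an unbound logical variable

# A toy "ASPLE-like" relation, given extensionally as
# (program text, input, output) triples; all values are made up.
RELATION = [
    ("copy",   (1, 2), (1, 2)),
    ("double", (1, 2), (2, 4)),
    ("copy",   (5,),   (5,)),
]

def query(text=UNBOUND, inp=UNBOUND, out=UNBOUND):
    """Enumerate every tuple consistent with whichever arguments are bound."""
    return [(t, i, o) for (t, i, o) in RELATION
            if (text is UNBOUND or t == text)
            and (inp is UNBOUND or i == inp)
            and (out is UNBOUND or o == out)]
```

Binding only the program runs it on every recorded input; binding only input and output 'synthesises' the matching program, just as described above.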


Lexical Syntax

There are three difficulties with normal methods of specifying lexical syntax, such as BNF:

1. A lexical syntax is normally a regular grammar in which any item may follow another. Thus if one has (say) five types of item, then each production needs four extra alternatives to allow for each possible item following. This makes the syntax rather tedious in BNF, and is why finite state diagrams are often used.

2. Many compilers use a 'reserved word' convention for structural words such as 'begin' and 'end'. To distinguish these from normal identifiers in the language requires the use of negation, which is not available in BNF and many other formal systems.

3. Using BNF there is no way of formally linking the two levels - characters and tokens - if the lexical syntax is described separately.

The M-grammar form of the lexical syntax is shown in Fig. 4/1/2. 'LEXEMES' is defined as a list of items, possibly separated by spaces. There are three types of item: identifiers, numbers and syntactic tokens.

In the output, identifiers are represented by a functor 'ID' whose parameter is the string consisting of the identifier, e.g. ID(f.a.c.t.NIL) or ID("fact"). Numbers have the functor NUM, whose argument is an internal normal form of the decimal integer, using the (left to right) infix operator ':'. Thus 123 is represented as NUM(0:1:2:3). Syntax words such as BEGIN and := are represented by the strings themselves.
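The ':' numeral can be pictured as a left-nested pair structure. A minimal Python sketch (nested tuples standing in for the ':' functor is our assumption):

```python
def to_numeral(digits):
    """Build the left-infix ':' form, e.g. [1, 2, 3] -> (((0, 1), 2), 3),
    i.e. the term 0:1:2:3 seeded with 0."""
    num = 0
    for d in digits:
        num = (num, d)   # corresponds to num : d
    return num

def value(num):
    """Decode a ':' numeral back to an ordinary integer."""
    if num == 0:
        return 0
    hi, lo = num
    return 10 * value(hi) + lo
```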

/*                                  */
/* Fig 4/1/2. LEXICAL SYNTAX        */
/*                                  */

LEXEMES(x) => SPACE; LEXEMES(x).
LEXEMES(x.y) => LEXEME(x); LEXEMES(y).
LEXEMES(NIL) => NIL.

SPACE => @ch & SPACECH(ch)
       | @'/'; @'*'; COMMENT.

LEXEME(word) => WORD(w) & SYSTEM(w, word).
LEXEME(NUM(num)) => NUMBER(num).
LEXEME(other) => OTHERCHAR(other).

COMMENT => @'*'; COMMENT1
         | @ch & ~ch='*'; COMMENT.
COMMENT1 => @'/'
          | @'*'; COMMENT1
          | @ch & ~ch='*' & ~ch='/'; COMMENT.

WORD(first.rest) => LETTER(first); WORD(rest).
WORD(last.NIL) => LETTER(last); ~LETTER(next).

NUMBER(num) => DIGIT(n);
   ( n=0; (NUMBER(num) | num=0)
   | ~n=0; RNUMBER(0:n, num));
   ~DIGIT(next) & MAXLENGTH(num, N3).

RNUMBER(hi, num) => DIGIT(next); RNUMBER(hi:next, num).
RNUMBER(lo, lo) => NIL.

LETTER(ch) => @ch & ISLETTER(ch).
DIGIT(ch) => @ch & ISDIGIT(ch).

SYSTEM(w, w) <- RESERVED(w).
SYSTEM(w, ID(w)) <- ~RESERVED(w) & MAXLENGTH(w, N4).

RESERVED("BEGIN"). RESERVED("BOOL"). RESERVED("DO").
RESERVED("ELSE"). RESERVED("END"). RESERVED("FALSE").


RESERVED("FI"). RESERVED("IF"). RESERVED("INPUT").
RESERVED("INT"). RESERVED("OUTPUT"). RESERVED("REF").
RESERVED("THEN"). RESERVED("WHILE").

OTHERCHAR(":=") => @':'; @'='.
OTHERCHAR(ch.NIL) => @ch & ~USED(ch).

USED(ch) <- SPACECH(ch) | ISLETTER(ch) | ISDIGIT(ch).
SPACECH(' ').

Thus the string of characters:

begin int fact, i, n; i := 1; ...

is output as the string of terms:

"begin"."int".ID("fact").",".ID("i").",".ID("n").";"
.ID("i").":=".NUM(0:1).";"...
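The same character-to-token step can be sketched in Python. This is a loose analogue of the LEXEMES grammar, not the thesis's definition: the tuple token representation and the omission of comment handling are our simplifications.

```python
import re

# Reserved words, as listed in Fig. 4/1/2.
RESERVED = {"BEGIN", "BOOL", "DO", "ELSE", "END", "FALSE", "FI", "IF",
            "INPUT", "INT", "OUTPUT", "REF", "THEN", "WHILE"}

def lexemes(text):
    """Split characters into reserved words, ID/NUM tokens and punctuation.
    ':=' must be tried before single punctuation characters."""
    tokens = []
    for m in re.finditer(r"[A-Za-z]+|\d+|:=|[^\sA-Za-z0-9]", text):
        w = m.group(0)
        if w.isalpha():
            up = w.upper()
            tokens.append(up if up in RESERVED else ("ID", w))
        elif w.isdigit():
            tokens.append(("NUM", int(w)))
        else:
            tokens.append(w)
    return tokens
```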

At the lexical level, terminals appear mainly as variables, as in the production:

LETTER(ch) => @ch & ISLETTER(ch)

The reason for this is that it is necessary to return the value as the result of the parse, and the predicate 'isletter' is normally supplied as an inbuilt (or evaluable) predicate in a Prolog system. This is preferable to the rather tedious definition which would otherwise be necessary:

LETTER(B) => @B.
LETTER(C) => @C.


The Syntax of ASPLE

The M-grammar version of the syntax of ASPLE is shown in Fig. 4/1/3, and corresponds closely to the BNF version. In fact, if the parameters are removed the similarity is seen very clearly. But the parameters add two extra features that a BNF version does not have: they apply context-sensitive restrictions to the syntax, and they show the correspondence of the morphemes (grammatical items) to the lexemes (or tokens).

/*                       */
/* Fig 4/1/3 SYNTAX      */
/*                       */

MORPHEME(tree.env) => "BEGIN";
   DCLTRAIN(env) & CHECK(env) & MAXLENGTH(env, N2);
   STMTRAIN(env, tree); "END".

DCLTRAIN(env) => DECLARATION(env0); ";";
   DCLTRAIN(env1) & APPEND(env0, env1, env).
DCLTRAIN(NIL) => NIL.
DECLARATION(env) => MODE(m); IDLIST(REF(m), env).
MODE(INT) => "INT".
MODE(BOOL) => "BOOL".
MODE(REF(m)) => "REF"; MODE(m).
IDLIST(m, env.env1) => IDDEC(m, env);
   ( ","; IDLIST(m, env1) | env1 = NIL).
IDDEC(m, LOC(tag, m, UNDEF)) => @ID(tag).

/*                       */
/* STATEMENTS            */
/*                       */

STMTRAIN(env, sem.sem1) => STATEMENT(env, sem);
   ( ";"; STMTRAIN(env, sem1) | sem1 = NIL).
STATEMENT(env, sem) =>
     ASGTSTM(env, sem) | CONDSTM(env, sem)
   | LOOPSTM(env, sem) | TRANSPUTSTM(env, sem).
ASGTSTM(env, ASGT(tag, exp)) =>
   IDENTIFIER(REF(m), env, tag); ":="; EXP(m, env, exp).
CONDSTM(env, COND(exp, s1, s2)) =>
   "IF"; EXP(BOOL, env, exp); "THEN"; STMTRAIN(env, s1);
   ( "FI" & s2 = NIL | "ELSE"; STMTRAIN(env, s2); "FI").
LOOPSTM(env, WHILE(exp, s)) =>
   "WHILE"; EXP(BOOL, env, exp); "DO"; STMTRAIN(env, s); "END".
TRANSPUTSTM(env, INPUT(exp)) =>
   "INPUT"; EXP(REF(m), env, exp) & INTBOOL(m).
TRANSPUTSTM(env, OUTPUT(exp)) =>
   "OUTPUT"; EXP(m, env, exp) & INTBOOL(m).

/*                       */
/* EXPRESSIONS           */
/*                       */

EXP(m, env, exp) => FACTOR(m, env, lh);
   RESTEXP(m, env, lh, exp).
RESTEXP(m, env, lh, exp) => "+"; FACTOR(m, env, rh)
   & OP('+', m, lh, rh, subexp);
   RESTEXP(m, env, subexp, exp).
RESTEXP(m, env, lh, lh) => NIL.
FACTOR(m, env, exp) => PRIMARY(m, env, lh);
   RESTFACTOR(m, env, lh, exp).
RESTFACTOR(m, env, lh, exp) => "*"; PRIMARY(m, env, rh)
   & OP('*', m, lh, rh, subexp);


   RESTFACTOR(m, env, subexp, exp).
RESTFACTOR(m, env, lh, lh) => NIL.
PRIMARY(m, env, exp) =>
     IDENTIFIER(m1, env, tag) & DEREFERENCE(m1, m, tag, exp)
   | "("; EXP(m, env, exp); ")"
   | "("; COMPARE(m, env, exp); ")"
   | DENOTATION(m, env, exp).
COMPARE(BOOL, env, EQ(exp1, exp2)) =>
   EXP(INT, env, exp1); "="; EXP(INT, env, exp2).
COMPARE(BOOL, env, NE(exp1, exp2)) =>
   EXP(INT, env, exp1); "≠"; EXP(INT, env, exp2).

/*                                   */
/* IDENTIFIERS AND DENOTATIONS       */
/*                                   */

IDENTIFIER(mode, env, ID(tag)) => @ID(tag)
   & MEMBER(LOC(tag, mode, v), env).
DENOTATION(BOOL, VAL(FALSE)) => "FALSE".
DENOTATION(BOOL, VAL(TRUE)) => "TRUE".
DENOTATION(INT, VAL(val)) => @NUM(val).

Context-sensitive restrictions are applied in three ways, which may be illustrated from Fig. 4/1/3.

1. The constraint 'CHECK' in the first production checks that each identifier in the program is defined only once. The definition of 'CHECK' is provided by a recursive logic definition in Fig. 4/1/4.

2. The constraint 'OP' in the definition of expressions (RESTEXP and RESTFACTOR) checks that any expression involving '+' or '*' has mode INT or BOOL, and forces the generation of a 'DEREF' operation in the expansion of PRIMARY if necessary. (Note that OP also has the effect of disambiguating the different meanings of '+' and '*' for

integer and boolean operands.)

3. In the definitions of expressions there is a check that each part of the expression 'balances' the other through the shared parameter 'mode'. Although this may take the values INT or BOOL, it must be consistently one or the other in any particular case.

/*                                     */
/* Fig 4/1/4 SYNTACTIC CONSTRAINTS     */
/*                                     */

CHECK(LOC(tag, mode, val).env) <-
   ~MEMBER(LOC(tag, m, v), env) & CHECK(env).
CHECK(NIL).
INTBOOL(INT).
INTBOOL(BOOL).
DEREFERENCE(mode, mode, exp, exp).
DEREFERENCE(REF(mode), mode1, tag, DEREF(exp)) <-
   DEREFERENCE(mode, mode1, tag, exp).
OP('+', INT, lh, rh, PLUS(lh, rh)).
OP('+', BOOL, lh, rh, OR(lh, rh)).
OP('*', INT, lh, rh, TIMES(lh, rh)).
OP('*', BOOL, lh, rh, AND(lh, rh)).
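The DEREFERENCE constraint can be sketched procedurally in Python (the tuple representation of modes and expressions is our assumption): each REF stripped from the mode wraps the expression in one more DEREF.

```python
def dereference(mode, target, exp):
    """Coerce 'mode' down to 'target', wrapping exp in ('DEREF', ...) once
    per REF removed, mirroring the two DEREFERENCE clauses."""
    if mode == target:                         # DEREFERENCE(mode, mode, exp, exp)
        return exp
    if isinstance(mode, tuple) and mode[0] == 'REF':
        # DEREFERENCE(REF(mode), mode1, tag, DEREF(exp)) <- ...
        return dereference(mode[1], target, ('DEREF', exp))
    raise ValueError("modes cannot be reconciled")
```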

/* General Relations */

APPEND(u.v, w, u.x) <- APPEND(v, w, x).
APPEND(NIL, x, x).
MEMBER(x, x.y).
MEMBER(x, y.z) <- NOTEQUAL(x, y) & MEMBER(x, z).
MAXLENGTH(list, max) <- VALUE(max, n) & LENGTH(list, 0, i) & LEQ(i, n).
LENGTH(NIL, n, n).
LENGTH(head.rest, sofar, total) <-
   PLUS(0:1, sofar, more) & LENGTH(rest, more, total).
NOTEQUAL(x, y) <- ~EQUAL(x, y).


EQUAL(x, x).

/* Value of machine dependent constants */

VALUE(N1, 0:1:0:0).   /* LENGTH OF ASPLE MORPHEME */

VALUE(N2, 0:5).   /* NO. OF IDENTIFIERS DECLARED */
VALUE(N3, 0:6).   /* NO. OF DIGITS IN INTEGER CONSTANT */
VALUE(N4, 0:8).   /* NO. OF LETTERS IN IDENTIFIER */

VALUE(N5, 0:1:0:0:0:0:0).   /* VALUE OF LARGEST INTEGER */
VALUE(N6, 0:1:0:0).         /* SIZE OF OUTPUT FILE */

As an example of mode handling, take the statement in the factorial program:

fact := fact * i

The mode of both fact and i is REF(INT) - i.e. reference to a location containing an integer value. The production for the assignment statement is:

ASGTSTM(env, ASGT(tag, exp)) =>
   IDENTIFIER(REF(m), env, tag); ":="; EXP(m, env, exp).

In this statement, 'env' is a variable which is matched with a data structure containing all the declarations in the program - it is considered in more detail below. 'ASGT(tag,exp)' is the output of this clause (if we are considering this definition as a parsing procedure), which is a tree structure defining the program. On the right hand side of the production is the body of the assignment statement. The non-terminal 'IDENTIFIER' will be matched at the lower levels of the program to give

IDENTIFIER(REF(INT), env, ID("fact"))

thus yielding the assignment in this clause of m = INT,

tag = ID("fact"). The binding of env has been omitted for brevity.

The beginning of the expansion of the right hand side of the assignment statement is similar, with indentation indicating the depth of the production tree:

EXP(INT, env, exp) =>
  FACTOR(INT, env, lh) =>
    PRIMARY(INT, env, lh) =>
      IDENTIFIER(m1, env, tag) =>
        @ID("fact") & MEMBER(LOC("fact", m1, v), env)

It may be noted in passing that the slightly cumbersome definition of EXP involving RESTEXP is a standard technique to get round the problem of left recursive rules, which would prevent the correct evaluation of the grammar using Prolog, which uses a left to right top-down evaluation rule. If the grammar isn't used as a program the more obvious definition would be preferable.
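The RESTEXP technique has a direct counterpart in recursive-descent parsing. A Python sketch (our illustration, not the thesis's grammar): instead of the left-recursive rule EXP -> EXP "+" FACTOR, the parser reads one FACTOR and then loops, threading the tree built so far exactly as RESTEXP threads its extra 'lh' parameter.

```python
def parse_exp(tokens, pos, parse_factor):
    """Parse FACTOR ('+' FACTOR)*, accumulating left-associatively."""
    lh, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] == '+':
        rh, pos = parse_factor(tokens, pos + 1)
        lh = ('PLUS', lh, rh)      # the 'lh' accumulator of RESTEXP
    return lh, pos

def atom(tokens, pos):
    """Trivial FACTOR for the demonstration: a single token."""
    return tokens[pos], pos + 1
```

Parsing a+b+c yields PLUS(PLUS(a,b),c), the same shape the RESTEXP rules construct.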

At this point, the constraint 'MEMBER' is matched, with resultant bindings:

IDENTIFIER(REF(INT), env, ID("fact"))

leaving the constraint to be matched:

& DEREFERENCE(REF(INT), INT, ID("fact"), exp).

This constraint expresses the need to find an expression 'exp' which reduces the mode of ID("fact") from REF(INT) to INT. This is performed by the clause for DEREFERENCE in Fig. 4/1/4, which follows a similar pattern to the productions already followed, and yields the binding exp = DEREF(ID("fact")), where the functor 'DEREF' indicates the dereferencing of a store value; i.e. taking its value.

The rest of the productions for the assignment may be represented as follows, picking up the expansions of the

productions which have not been completed:

RESTFACTOR(INT, env, DEREF(ID("fact")), exp) =>

  "*"; PRIMARY(INT, env, rh) =>
         IDENTIFIER(m1, env, tag) =>
           @ID("i")
         & DEREFERENCE(REF(INT), INT, ID("i"), exp) <-
             DEREFERENCE(INT, INT, ID("i"), DEREF(ID("i")))
  & OP('*', INT, DEREF(ID("fact")), DEREF(ID("i")), subexp);
  RESTFACTOR(INT, env,
      TIMES(DEREF(ID("fact")), DEREF(ID("i"))), exp) => NIL;
RESTEXP(INT, env,
    TIMES(DEREF(ID("fact")), DEREF(ID("i"))), exp) => NIL.

Thus the final form of the assignment statement is:

ASGTSTM(env, ASGT(ID("fact"),
      TIMES(DEREF(ID("fact")), DEREF(ID("i"))))) =>
   IDENTIFIER(REF(INT), env, ID("fact")); ":=";
   EXP(INT, env, TIMES(DEREF(ID("fact")), DEREF(ID("i")))).

By the process of matching, it is thus possible to build up complex trees which represent the morphemes of the program. 'DEREF', 'TIMES' and 'ASGT' represent grammatical elements which must be interpreted semantically to produce the result of the program.

With this introduction, it should be possible to work out how the rest of the program is parsed according to these rules. Let us finally consider how the declaration environment 'env' is built up. The final value of env in this program is as follows:


env = LOC("fact", REF(INT), UNDEF)
     .LOC("i", REF(INT), UNDEF)
     .LOC("n", REF(INT), UNDEF).NIL

'LOC' is a three-place function with arguments name, mode and value, with one occurrence for each variable declared in the program. The third argument is set to undefined (UNDEF) by the syntax, which is the value it has at the start of execution. The environment is simply a list of these declarations.
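The environment and the LOOKUP/UPDATE operations on it can be sketched in Python (an illustration using tuples for the LOC triples; the error behaviour for UNDEF follows the LOOKUP clauses given later in Fig. 4/1/6):

```python
# One (name, mode, value) triple per declared variable, as built by IDDEC.
env = [("fact", ("REF", "INT"), "UNDEF"),
       ("i",    ("REF", "INT"), "UNDEF"),
       ("n",    ("REF", "INT"), "UNDEF")]

def lookup(tag, env):
    """Find the value stored for tag; an UNDEF value is an error."""
    for name, mode, val in env:
        if name == tag:
            if val == "UNDEF":
                raise ValueError("undefined variable")
            return val
    raise KeyError(tag)

def update(tag, val, env):
    """Return a new environment with tag's value replaced."""
    return [(n, m, val if n == tag else v) for n, m, v in env]
```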

The Semantics of ASPLE

Using logic it is possible to present all of the several methods of defining semantics outlined in the introduction in a consistent formalism, which demonstrates clearly their similarities and differences. It may then be seen that the difference stems not so much from the formalism employed (as has sometimes seemed the case) as from the purpose for which it is employed. The present treatment uses a relational method which has some of the advantages of the denotational method. The kernel of the axiomatic method may be found in a slightly different context in Moss (1977) as well as in chapter 3.2. Relational semantics provides a guide for the implementor, specifying exactly what must be adhered to and what is implementation dependent. Axiomatic semantics is more useful for the programmer who wants to understand the effects of a program without being bogged down in the details.

The relational semantics of a program is a relation between the program, an 'input' state and an 'output' state. These states consist of four components:

1. The contents of variables in the program.
2. The state of the input file.


3. The state of the output file.
4. The status of the machine, indicating whether or not an error has occurred (required by the definition of ASPLE).

Thus the top level of the semantics is:

<- SEMEME(program, STATE(env, input, output, OK),
          STATE(env1, i1, NIL, result))

where STATE is a four-place constructor for a state having the components above; 'program' and 'env' are derived from the morpheme relation; 'input' and 'output' are the corresponding files, and 'OK' is the initial status. The corresponding items 'env1', 'i1' and 'NIL' are the resultant states of these variables at the end, which are ignored, and 'result' is the final result of the program.

The evaluation of the SEMEME relation produces a sequence of states, and these states may be subdivided in a very similar fashion to a grammar. This suggests that we may use the same formalism to express the sequence of states as is used for a sequence of characters or lexemes; namely a grammar. The sequential evaluation of two statements, s1 and s2, can be represented by a clause:

SEMEME(s1.s2, state1, state3) <-
   SEMEME(s1, state1, state2) &
   SEMEME(s2, state2, state3).

which is more tersely stated by the grammar rule:

SEMEME(s1.s2) => SEMEME(s1); SEMEME(s2).

Here the semi-colon may be interpreted as "after" in the temporal sequence. The use of the grammatical form does not alter the underlying logical clause, but enables one to

concentrate on the meaning of the clause without being distracted by the presence of an additional two parameters. This is demonstrated in Fig. 4/1/5, which shows the top level of the semantics for ASPLE.
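The hidden state-threading can be made explicit in a small Python sketch (ours): reading ';' as "after" amounts to composing two state-to-state transformations, here modelled as functions on a dictionary. The toy transformers are invented for illustration.

```python
def seq(f, g):
    """';' read as 'after': thread the state through f, then through g,
    just as the expanded clause threads state1 -> state2 -> state3."""
    return lambda state: g(f(state))

# Toy state transformers standing in for the SEMEME of two statements.
inc_i  = lambda st: {**st, "i": st["i"] + 1}   # i := i + 1
double = lambda st: {**st, "i": st["i"] * 2}   # i := i * 2

run = seq(inc_i, double)
```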

The 'grammar' that this forms does not have normal 'terminal symbols'. Instead the relations other than 'SEMEME' are expressed by standard Horn clauses including the extra two parameters, as shown in Fig. 4/1/6. These may be divided into two classes: those such as 'UPDATE' and 'TRANSPUT' which change the state parameters; and those such as 'LOOKUP' and 'ADD' which do not.

/*                          */
/* Fig 4/1/5 SEMANTICS      */
/*                          */

SEMEME(s1.s2) => SEMEME(s1); SEMEME(s2).
SEMEME(NIL) => NIL.
SEMEME(ASGT(ID(tag), exp)) =>
   SEMEME(exp, val); UPDATE(tag, mode, val).
SEMEME(COND(exp, s1, s2)) =>
     SEMEME(exp, VAL(TRUE)); SEMEME(s1)
   | SEMEME(exp, VAL(FALSE)); SEMEME(s2).
SEMEME(WHILE(exp, s)) =>
     SEMEME(exp, VAL(FALSE))
   | SEMEME(exp, VAL(TRUE)); SEMEME(s); SEMEME(WHILE(exp, s)).
SEMEME(INPUT(exp)) => SEMEME(exp, ID(tag));
   ( TRANSPUT(IN, mode, val); UPDATE(tag, mode, val)
   | INPUTERROR).
SEMEME(OUTPUT(exp)) => SEMEME(exp, VAL(val));
   TRANSPUT(OUT, mode, val).
SEMEME(any) => ERROROCCURRED.


/*                                  */
/* SEMANTICS OF EXPRESSIONS         */
/*                                  */

SEMEME(PLUS(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & ADD(val1, val2, val).
SEMEME(TIMES(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & MULT(val1, val2, val).
SEMEME(AND(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & AND(val1, val2, val).
SEMEME(OR(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & OR(val1, val2, val).
SEMEME(EQ(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & EQUAL(val1, val2, val).
SEMEME(NE(exp1,exp2), val) => SEMEME(exp1, val1);
   SEMEME(exp2, val2) & NOTEQUAL(val1, val2, val).
SEMEME(DEREF(exp), val) => SEMEME(exp, ID(tag));
   LOOKUP(tag, val).
SEMEME(ID(tag), ID(tag)) => NIL.
SEMEME(VAL(val), VAL(val)) => NIL.
SEMEME(exp, v) => EVALUATIONERROR(exp, v).

As an example, let us take the assignment statement that was parsed earlier. The top level of the semantics for this is:

SEMEME(ASGT(ID("fact"),
      TIMES(DEREF(ID("fact")), DEREF(ID("i"))))) =>
   SEMEME(TIMES(DEREF(ID("fact")), DEREF(ID("i"))), val);
   UPDATE("fact", mode, val).


The value of STATE at the beginning of this statement contains values for "fact" and "i" which are used in the evaluation of the expression. The value of "fact" in the state is changed by UPDATE to the new value, val, computed by TIMES. The parameter 'mode' to UPDATE is not used in the assignment statement, as the type check has been done at 'compile time'. It is used in the evaluation of the INPUT statement only.

Referring again to Fig. 4/1/5, it will be seen that there are two distinct SEMEME relations: those which describe statements have only one parameter, and those which describe expressions have two. This is suitable for ASPLE, in which statements do not return results. In an 'expression' language such as Algol 68, the first type would be unnecessary.

While the elaboration of statements is sequential, to insist on the sequential evaluation of expressions would be an overspecification of ASPLE, as of Algol 68. This means that it is possible to execute either the left hand branch of an expression first or the right hand branch. This may be represented in the grammar by allowing either order as an option, e.g.

SEMEME(PLUS(exp1,exp2), val) =>
   ( SEMEME(exp1, val1); SEMEME(exp2, val2)
   | SEMEME(exp2, val2); SEMEME(exp1, val1))
   & ADD(val1, val2, val).

This is an example of an indeterminate specification which is perfectly acceptable in a logic program. If this were executed as a program, each of the alternatives would be generated in turn by backtracking. Of course, in the absence of side-effects in expressions, the same result will be achieved and we have therefore not bothered to express all the operators in this fashion.
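The both-orders rule can be mimicked in Python (our illustration): enumerate the two elaboration orders and collect a result from each; with side-effect-free evaluation the two answers coincide, as the text argues.

```python
def eval_plus(exp1, exp2, evaluate):
    """Return the result of exp1 + exp2 under each evaluation order,
    mirroring the two alternatives of the indeterminate rule."""
    results = []
    for order in ((exp1, exp2), (exp2, exp1)):
        vals = {e: evaluate(e) for e in order}   # elaborate in this order
        results.append(vals[exp1] + vals[exp2])
    return results
```

With a pure evaluator, `eval_plus(2, 3, lambda e: e)` yields the same value twice.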


/*                                    */
/* Fig 4/1/6 SEMANTIC RELATIONS       */
/*                                    */

LOOKUP(ID(tag), val, st0, st0) <-
   st0 = STATE(mem, i, o, OK) & LOOKUP(tag, mem, val).
LOOKUP(tag, LOC(tag, mode, val).r, val) <- NOTEQUAL(val, UNDEF).
LOOKUP(tag, LOC(tag1, m, v).rest, val) <-
   NOTEQUAL(tag, tag1) & LOOKUP(tag, rest, val).
UPDATE(tag, mode, val, STATE(mem1, in, out, OK),
                       STATE(mem2, in, out, OK)) <-
   UPDATE(tag, mode, val, mem1, mem2).
UPDATE(tag, mode, val, LOC(tag, REF(mode), v).env,
                       LOC(tag, REF(mode), val).env).
UPDATE(tag, mode, val, l.env1, l.env2) <-
   l = LOC(tag1, m, v) & NOTEQUAL(tag, tag1)
   & UPDATE(tag, mode, val, env1, env2).

TRANSPUT(IN, mode, val, STATE(mem, in, out, OK),
                        STATE(mem, in1, out, OK)) <-
   DENOTATION(mode, val, in, in1).
TRANSPUT(OUT, mode, val, STATE(mem, in, out, OK),
                         STATE(mem, in, out1, OK)) <-
   DENOTATION(mode, val, out, out1).
ERROROCCURRED(STATE(m,i,o,ERROR(x)), STATE(m,i,o,ERROR(x))).
EVALUATIONERROR(exp, v, state, STATE(m,i,o,ERROR(exp))) <-
   state = STATE(m,i,o,OK) & ~SEMEME(exp, any, state, state).
INPUTERROR(STATE(m,NIL,o,OK), STATE(m,NIL,o,ERROR(INPT))).


/*                                         */
/* ARITHMETIC AND BOOLEAN OPERATIONS       */
/*                                         */

ADD(x, y, z) <- PLUS(x, y, z) & INRANGE(z).

MULT(x, y, z) <- TIMES(x, y, z) & INRANGE(z).

PLUS(xa:x, ya:y, z) <- SUCC(x1, x) & SUCC(y, y1) &
   PLUS(xa:x1, ya:y1, z).
PLUS(xa:0, ya:y, za:y) <- PLUS(xa, ya, za).
PLUS(xa:x, ya:9, za:z) <- SUCC(z, x) & PLUS(0:1, ya, z1) &
   PLUS(xa, z1, za).

PLUS(0, x, x).
PLUS(x, 0, x) <- NOTEQUAL(x, 0).
OR(FALSE, FALSE, FALSE).
OR(FALSE, TRUE, TRUE).
OR(TRUE, x, TRUE).

TIMES(0, x, 0).
TIMES(xa:0, y, za:0) <- TIMES(xa, y, za).
TIMES(xa:x, y, z) <- SUCC(x1, x) & TIMES(xa:x1, y, z1) &
   PLUS(z1, y, z).
AND(TRUE, TRUE, TRUE).
AND(TRUE, FALSE, FALSE).
AND(FALSE, x, FALSE).

SUCC(0,1). SUCC(1,2). SUCC(2,3). SUCC(3,4). SUCC(4,5).
SUCC(5,6). SUCC(6,7). SUCC(7,8). SUCC(8,9).
EQUAL(x, y, TRUE) <- EQUAL(x, y).
EQUAL(x, y, FALSE) <- NOTEQUAL(x, y).
NOTEQUAL(x, y, TRUE) <- NOTEQUAL(x, y).
NOTEQUAL(x, y, FALSE) <- EQUAL(x, y).
LEQ(x, x).
LEQ(x, y) <- LTH(x, y).
LTH(xa:x, xa:y) <- SUCC(x, y).
LTH(xa:x, xa:y) <- SUCC(x, z) & LTH(0:z, 0:y).
LTH(xa:x, ya:y) <- LTH(xa, ya).
INRANGE(int) <- VALUE(N5, max) & LEQ(int, max).


The semantics of the subsidiary relations are given by means of the Horn clauses shown in Fig. 4/1/6. Note that this includes the semantics of the arithmetic and relational operators in the language. For instance, '+' is described by the ADD and PLUS relations, which use decimal arithmetic for clarity, defined by recursive routines. The ADD relation interfaces this to the SEMEME relation, including the condition INRANGE which applies a maximum limit to the size of an integer, which is a feature of the definition of ASPLE for a particular machine.

This contrasts with the method of denotational semantics, which relates the arithmetic of a program back to the 'normal' mathematical meaning; of course computer arithmetic is not 'normal' arithmetic to the extent that it is bounded or modular. By giving a logical definition the expected behaviour of the arithmetic in extreme conditions can be made plain. The definition still remains a specification - it is not intended, for instance, that decimal arithmetic be obligatory, though it is useful for tutorial purposes.
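The successor-based decimal addition can be sketched in Python, following the shape of the PLUS clauses (nested pairs again stand in for the ':' functor; this rendering is ours, not a required implementation):

```python
# Digit successor table, SUCC(0,1) ... SUCC(8,9).
SUCC = {i: i + 1 for i in range(9)}

def plus(x, y):
    """Add two ':' numerals (nested pairs, 0 = empty prefix)."""
    if x == 0:                                   # PLUS(0, x, x)
        return y
    if y == 0:                                   # PLUS(x, 0, x)
        return x
    xa, xd = x
    ya, yd = y
    if xd == 0:                                  # PLUS(xa:0, ya:y, za:y)
        return (plus(xa, ya), yd)
    if yd == 9:                                  # carry into the next column
        return (plus(xa, plus((0, 1), ya)), xd - 1)
    return plus((xa, xd - 1), (ya, SUCC[yd]))    # move one unit across

def value(n):
    """Decode a ':' numeral for checking."""
    return 0 if n == 0 else 10 * value(n[0]) + n[1]
```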

As an example of the other relations, we will consider the TRANSPUT relation (Fig. 4/1/6). This has two clauses, the only difference between them being the file, which is either input or output.

TRANSPUT(IN, mode, val, STATE(mem, in1, out, OK),
                        STATE(mem, in2, out, OK)) <-
   DENOTATION(mode, val, in1, in2).
TRANSPUT(OUT, mode, val, STATE(mem, in, out1, OK),
                         STATE(mem, in, out2, OK)) <-
   DENOTATION(mode, val, out1, out2).

The body of each clause is DENOTATION, which is part of the syntax (Fig. 4/1/3). This acts on the contents of a file,


which is a list of characters. The clauses illustrate the flexibility of input/output in logic programs. When the input clause is run as a program in the normal way, the variable in1 is bound and the variables mode and val are unbound initially, but are bound by the elaboration of the clause. For the output clause, val is bound and mode and out1 are unbound initially, and the action of DENOTATION is thus reversed, generating the string of characters in the file.

Note that the output file appears in the first occurrence of STATE in the second clause, and not in the second as one might expect. If one attempts to put it in the 'resultant' state, then although the numbers appear the right way round in the resultant file, they appear in the wrong order, with the last number coming first. There is no intrinsic difference between input and output files in the logic of the program. The only difference lies in the 'control' of the logic. This means, for instance, that the same file may be used for both input and output. This gives a much more satisfactory means of treating 'interactive' files such as terminals than considering them as two separate files, which does not capture the essence of the interaction. It also aids in the treatment of errors.

Errors are treated in this definition by the provision of a separate 'status' parameter in the state of the machine. In normal execution the value of the status is 'OK'. This status is checked by all of the semantic relations which access the state of the machine - e.g. LOOKUP and TRANSPUT. On the occurrence of an error in an expression (which may include arithmetic overflow, an undefined element or end-of-file), the evaluation is 'completed' by the last clause in Fig. 4/1/5, which invokes a call to EVALUATIONERROR. This changes the status OK to ERROR(exp), indicating that an error has occurred in the evaluation of an expression. Because expressions may be evaluated in any order, it is not possible to indicate unequivocally what type of error

has occurred (since two different errors might occur in independent branches). Any further statements that are executed can only now be completed by the last clause for the statement SEMEME, which has a call to ERROROCCURRED. Thus the SEMEME relations are complete, in that any finite failure will lead to a relation which succeeds.

This leaves the question of non-terminating programs open. Since the definition here may be evaluated using a computer we have not 'completed' the definition, although it is quite possible to write a clause similar to EVALUATIONERROR to complete the semantics of statements. However, the meaning of negation as failure has been assumed to be the maximal fixpoint, which corresponds to finite failure.

Therefore by convention, we assume that the final state of a non-terminating program has the value UNDEFINED. This occurs in place of the whole state, not just the status, as it is impossible to state anything about that state.

We might note however that it is still possible to say something about the results of a non-terminated program through the value bound to the output file. Since the term which represents the output file appears in the initial state, any values bound in it are recoverable. The tail of the list will be unbound, as the file is not completed.
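The recoverable prefix of an incomplete output file can be illustrated in Python (our sketch): model the file as a cons chain whose tail is a 'hole' standing for an unbound logical variable. Everything written so far is visible even though the list is never closed.

```python
class Hole:
    """Stands for an unbound logical variable (the open tail)."""
    def __init__(self):
        self.value = None            # None = still unbound

def write(hole, item):
    """Bind the hole to a cell (item . fresh-hole); return the new tail."""
    hole.value = (item, Hole())
    return hole.value[1]

def prefix(hole):
    """Collect the bound prefix of an open list, stopping at the unbound tail."""
    out = []
    while hole.value is not None:
        item, hole = hole.value
        out.append(item)
    return out

out_file = Hole()                    # the output file term, tail unbound
tail = write(out_file, 8)
tail = write(tail, 40320)
```

Even if the program never terminates and the tail stays unbound, `prefix(out_file)` recovers the values output so far.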


A Worked Example

To demonstrate the effect of an implementation of ASPLE using M-grammars, we will demonstrate the bindings of the top level of the ASPLE relation given earlier.

ASPLE(text, input, output, result) <-
   LEXEMES(tokens, text, NIL)
   & MORPHEME(tree.mem, tokens, NIL)
   & SEMEME(tree, STATE(mem, input, output, OK),
                  STATE(m1, i1, o1, result)).

text = the factorial example of Fig. 1.
tokens = "BEGIN"."INT".ID("FACT").",".ID("I").",".ID("N").";"
   ."INPUT".ID("N").";"
   .ID("I").":=".NUM(0:1).";"
   .ID("FACT").":=".NUM(0:1).";"
   ."IF"."(".ID("N")."≠".NUM(0:1).")"."THEN"
   ."WHILE"."(".ID("I")."≠".ID("N").")"."DO"
   .ID("I").":=".ID("I")."+".NUM(0:1).";"
   .ID("FACT").":=".ID("FACT")."*".ID("I")
   ."END"."FI".";"
   ."OUTPUT".ID("N").";"."OUTPUT".ID("FACT")
   ."END".NIL
mem = LOC("FACT", REF(INT), UNDEF)
   .LOC("I", REF(INT), UNDEF)
   .LOC("N", REF(INT), UNDEF).NIL
tree = INPUT(ID("N"))
   .ASGT(ID("I"), VAL(0:1))
   .ASGT(ID("FACT"), VAL(0:1))
   .COND(NE(DEREF(ID("N")), VAL(0:1)),
         WHILE(NE(DEREF(ID("I")), DEREF(ID("N"))),
               ASGT(ID("I"), PLUS(DEREF(ID("I")), VAL(0:1)))
               .ASGT(ID("FACT"), TIMES(DEREF(ID("FACT")), DEREF(ID("I"))))
               .NIL)
         .NIL,
         NIL)


   .OUTPUT(DEREF(ID("N")))
   .OUTPUT(DEREF(ID("FACT")))
   .NIL
input = NUM(0:8).NIL
output = NUM(0:8).NUM(0:4:0:3:2:0).NIL
result = OK

A Formal Definition of Prolog

In this section a definition of Prolog is presented. This is of interest for three reasons. Firstly, it shows the definition of a language very different from the Algol-like structure of ASPLE: a non-deterministic language suitable for symbolic manipulation rather than general algorithms. Secondly, it allows a slightly closer look at the language on which the definitions are based. Thirdly, it shows how definitions can successively be refined to show more of the underlying detail of the implementation.

The syntax of a basic version of Prolog is shown in Fig. 4/2/1. Only three function symbols are used in the abstract syntax which this generates:

(1) C(c)    This names the constant c.
(2) F(n,a)  This names a function which has name n and list of arguments a.
(3) V(v)    This names the variable named v.

No special function symbol is used for clauses, which are represented as lists, of which the first item is the head (left hand side) and the remaining elements the body (right hand side, or subgoals). A program is assumed to consist of a sequence of procedures and goals, and these are collected in separate parameters of the non-terminal 'Program'.

Note that in the following definitions, the optional separator is omitted, for reasons of clarity.

/*                                      */
/* Fig 4/2/1 The Syntax of Prolog       */
/*                                      */

Program(a.b, c) -> Clause(a) "." Program(b, c).
Program(a, b.c) -> Goal(b) "." Program(a, c).
Program(NIL, NIL) -> NIL.

Clause(a.b) -> Atom(a) "<-" Conjunction(b).
Clause(a.NIL) -> Atom(a).

Goal(a) -> "<-" Conjunction(a).

Conjunction(a.b) -> Atomvar(a) "&" Conjunction(b).
Conjunction(a.NIL) -> Atomvar(a).
Atomvar(a) -> Atom(a) | Variable(a).
Atom(a) -> Function(a) | Constant(a).
Function(F(n,a)) -> Constant(n) "(" Termlist(a) ")".

Termlist(a.b) -> Term(a) "," Termlist(b).
Termlist(a.NIL) -> Term(a).
Term(a) -> Function(a) | Constant(a) | Number(a) | Variable(a).
Constant(C(c)) -> @C(c).
Number(C(c)) -> @N(c).
Variable(V(v)) -> @V(v).

/*                            */
/* The Lexical Syntax         */
/*                            */

Tokenlist(a) -> @' ' Tokenlist(a).
Tokenlist(a) -> @'/' @'*' Comment Tokenlist(a).
Tokenlist(a.b) -> Token(a) Tokenlist(b).
Tokenlist(NIL) -> NIL.

Token(C(a.b)) -> Upper(a) Alphamerics(b).


Token(V(a.b)) -> Lower(a) Alphamerics(b).
Token(N(a)) -> Digit(b) Digits(b,a).
Token("(") -> @'('.
Token(")") -> @')'.
Token(",") -> @','.
Token(".") -> @'.'.
Token("&") -> @'&'.
Token("<-") -> @'<' @'-'.
Token(C(a.b)) -> @''' Quoted(a.b) @'''.
Token(C(a.NIL)) -> @a.

Alphamerics(a.b) -> (Upper(a) | Digit(a)) Alphamerics(b).
Alphamerics(NIL) -> NIL.
Digits(a,b) -> Digit(c) & Prod(a,10,d) & Sum(d,c,e) Digits(e,b).
Digits(a,a) -> NIL.
Upper(a) -> @a & Letter(a).
Lower(a) -> @a & Lower(a).
Digit(a) -> @a & Digit(a).
Comment -> @'*' @'/'.
Comment -> @a Comment.
Quoted('''.b) -> @''' @''' Quoted(b).
Quoted(a.b) -> @a & ~a=''' Quoted(b).
Quoted(NIL) -> NIL.

As an example the clause

A(B(*C,D)) <- E(*C).

is represented by the term

F("A", F("B", V("C").C("D").NIL).NIL) . F("E", V("C").NIL) . NIL.

In this representation, variables are represented by constants, and functions and constants by terms, and the atoms (or literals) at the top level of the clause are treated the same as function terms. Recall that an expression in double quotes is equivalent to a list of its component characters, e.g. "AB" is the same as A.B.NIL.
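The encoding can be made concrete with a small executable sketch. The following Python rendering (the tuple encoding and helper names are my own, not the thesis's notation) builds the term for the example clause above:

```python
# Tagged-tuple stand-ins for the three abstract-syntax functors:
#   C(c) as ('C', c),  F(n,a) as ('F', n, a),  V(v) as ('V', v);
# a dot-list x.y. ... .NIL is encoded as nested pairs ending in 'NIL'.
NIL = 'NIL'

def dotlist(*items):
    """Build the list x1.x2. ... .NIL from its elements."""
    out = NIL
    for item in reversed(items):
        out = (item, out)          # the dot operator is pairing
    return out

# The clause A(B(*C,D)) <- E(*C). as the term
#   F("A",F("B",V("C").C("D").NIL).NIL) . F("E",V("C").NIL) . NIL
head = ('F', 'A', dotlist(('F', 'B', dotlist(('V', 'C'), ('C', 'D')))))
body = ('F', 'E', dotlist(('V', 'C')))
clause = dotlist(head, body)       # head first, then the body atoms
```

The clause is just a list whose first element is the head, exactly as the text describes.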

The semantics of a Prolog program may be given as a series of programs which depend in increasing amounts on the language being used in the definition. The model-theoretic and fixpoint semantics have already been given in the previous chapter. The semantics considered in this chapter are increasingly "operational" in nature.

1. The Abstract Definition

The first definition assumes most of the features of Prolog - backtracking, unification etc. - and most of the definition involves replacing the constants naming variables by instances of variables.

The top level takes three parameters: a list of goals, a list of clauses forming the program, and a list of solutions. A solution in this definition is taken to be the goal with variable terms replaced by a valid instance of them. The top level is

Prove(goal.goals, prog, soln.solns) <-
    Prove1(goal, prog, soln) & Prove(goals, prog, solns).
Prove(NIL, prog, NIL).
Prove1(goal, prog, soln) <-
    Renamelist(goal, soln, s) & Solve(soln, prog).
Prove1(goal, prog, NOSOLN) <-
    Renamelist(goal, soln, s) & ~Solve(soln, prog).


The first clause splits a list of goals and processes each of them separately. 'Prove1' is basically present to provide output in the case where there is no solution. 'Renamelist' (see below) replaces the names of variables in the goal clause by instances of the variables. The kernel of the interpreter is in the clause for 'Solve', which has only two parameters - the atom list and the program clause list.

Solve(NIL, prog).
Solve(atom.rest, prog) <-
    Select(proc, prog) &
    Renamelist(proc, atom.subgoals, s) &
    Solve(subgoals, prog) &
    Solve(rest, prog).

The clause 'Select' selects one procedure from the list of procedures which is the program. 'Renamelist' replaces variable terms in that procedure (which are represented by function terms) by real variables (which represent infinitely many terms). If the first term of this renamed procedure agrees with the goal, then the subgoals of that procedure and the rest of the goals are solved by recursive calls. These terminate in assertions, for which the subgoals are nil.

It should be noted that this clause does not specify or constrain the proof procedure. The 'standard' method in Prolog is to take subgoals from left to right, and to take subgoals of the leftmost subgoal first - the so-called left to right depth first (LRDF) search. However, the clause can be interpreted in other ways - e.g. breadth first, or as a specification which has no order of evaluation.

Non-determinism in this clause derives from two sources - one in solving the subgoals, and one from the Select procedure, which returns any of the clauses in the program.


It is coded as follows:

Select(a, a.b).
Select(a, b.c) <- Select(a, c).

The remaining procedure to be described is 'Renamelist'. This has two clauses, which process elements of a list, and the main part is done by the clause 'Rename'. This has three clauses corresponding to the three function symbols in the representation of a program, and one subsidiary procedure 'Pairlist'.

Renamelist(a.b, c.d, s) <-
    Rename(a, c, s) & Renamelist(b, d, s).
Renamelist(NIL, NIL, s).
Rename(C(c), C(c), s).
Rename(F(n,a), F(n,b), s) <- Renamelist(a, b, s).
Rename(V(v), x, s) <- Pairlist(v, x, s).

In these clauses, the third parameter is initially unbound, and when bound consists of pairs of arguments - a constant giving the name of the variable, and an instance of this variable, which may be returned by the 'Rename' procedure.

Pairlist has the form

Pairlist(a, b, a.b.c).
Pairlist(a, b, c.d.e) <- ~a=c & Pairlist(a, b, e).

This is used in two ways. On the first occurrence of a variable in a procedure a pair of items is added to the list 's'. Subsequent occurrences of this variable find the same pair in the list.

Rename is similarly used in two modes. While it is renaming the head of the clause, both of the first two parameters are bound; but when renaming the subgoals, the second parameter is always output (i.e. unbound at the start of execution).
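The control structure of Prove/Solve can be sketched executably for the ground (variable-free) fragment, where 'Renamelist' is the identity and matching is simple equality; the full definition inherits renaming and unification from the logic itself. A minimal Python rendering (the encoding and the tiny example program are mine):

```python
# Ground-fragment sketch of the abstract interpreter.  A program is a
# list of clauses; a clause is [head, subgoal1, ...]; atoms are strings.
# The Select procedure may return any clause, so the search order is
# left unspecified by the thesis; here we simply try clauses in
# textual order (one possible "control regime").

def solve(goals, prog):
    """Solve(goals, prog): True if every goal is derivable."""
    if not goals:
        return True                      # Solve(NIL, prog).
    atom, rest = goals[0], goals[1:]
    for head, *subgoals in prog:         # Select(proc, prog)
        if head == atom:                 # ground "match" is equality
            if solve(subgoals, prog) and solve(rest, prog):
                return True              # depth first, left to right
    return False

def prove(goal_list, prog):
    """Prove: a solution or NOSOLN verdict per goal, as in Prove1."""
    return [g if solve([g], prog) else 'NOSOLN' for g in goal_list]

# A tiny propositional program:  A <- B & C.   B.   C <- B.
program = [['A', 'B', 'C'], ['B'], ['C', 'B']]
```

Goals with variables would additionally need the renaming and unification machinery made explicit in the next refinement.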

This completes the first definition of the semantics, which is summarised in Fig 4/2/2. This definition returns only the first solution of any goal, in common with most Prolog implementations. Other "control regimes" could easily be programmed.

/* Fig 4/2/2 The Abstract Definition */

Prove(NIL, prog, NIL).
Prove(goal.goals, prog, soln.solns) <-
    Prove1(goal, prog, soln) & Prove(goals, prog, solns).
Prove1(goal, prog, soln) <-
    Renamelist(goal, soln, s) & Solve(soln, prog).
Prove1(goal, prog, NOSOLN) <-
    Renamelist(goal, soln, s) & ~Solve(soln, prog).

Solve(NIL, prog).
Solve(atom.rest, prog) <-
    Select(proc, prog) &
    Renamelist(proc, atom.subgoals, s) &
    Solve(subgoals, prog) &
    Solve(rest, prog).
Renamelist(NIL, NIL, s).
Renamelist(a.b, c.d, s) <-
    Rename(a,c,s) & Renamelist(b,d,s).
Rename(C(c), C(c), s).
Rename(F(n,a), F(n,b), s) <- Renamelist(a,b,s).
Rename(V(v), x, s) <- Pairlist(v,x,s).


Pairlist(a, b, a.b.s).
Pairlist(a, b, c.d.s) <- ~a=c & Pairlist(a, b, s).

Select(a, a.b).
Select(a, b.c) <- Select(a, c).

2 Prolog with Unification

The first definition leaves open several important questions. The most obvious is the treatment of variables. We will now present a definition of the semantics which makes explicit the way in which parameters are matched against each other, but leaves open other features, such as the control and backtracking mode.

The main change is in the method of representing vari- ables. We give each instance of the use of a clause a unique number or level, n, starting at 0. Then a variable X which is named by V("X") in the term which names the program, is named by V("X",n) when the clause containing it is used.

The values of variables are held in a list of terms of the form

V(x,n) = S(t,f)

which implies that the variable V(x,n) is bound to some term t encountered at the level f. This term may include variables of the form V(y), and the value f indicates that such a variable corresponds to the instance V(y,f). This method of representing the values of variables is called "structure sharing" (see Boyer, Moore, 1972) and has the advantage that the copying of terms is minimised. Other representations are possible; structure sharing is only one of them.


The heart of this refinement lies in the clauses for Match and Unify. These have the form

Match(t1, t2, b1, b2)

where t1 and t2 are terms, b1 is the binding list before the unification and b2 the list afterwards. The clauses for Match are shown in Fig 4/2/3.

The top levels of the interpreter do not change much, except that the goals are now held as lists of pairs, in which each pair has a level number and a list of subgoals still to be solved for the procedure. There are now two calls to versions of Select - to select the goals and to select the matching procedures. In this way the proof strategy of the interpreter is still left undefined.


/* Fig 4/2/3 Prolog with Unification */

Prove1(goal, prog, soln) <-
    Solve((0.goal).NIL, prog, 0, NIL, bindings) &
    Bindlist(goal, soln, 0, bindings).
Prove1(goal, prog, NOSOLN) <-
    ~Solve((0.goal).NIL, prog, 0, NIL, bindings).

Solve(NIL, prog, f, b, b).
Solve(atoms, prog, f1, b1, b3) <-
    Sum(f1, 1, f2) &
    Select2(atom, atoms, rest) &
    Select1(head.subgoals, prog, others) &
    Match(atom, f2.head, b1, b2) &
    Solve((f2.subgoals).rest, prog, f2, b2, b3).

Select1(a, a.b, b).
Select1(a, b.c, b.d) <- Select1(a, c, d).

Select2(a, (n.NIL).b, c) <- Select2(a, b, c).
Select2(f.a, (f.b.c).d, (f.e).d) <- Select1(a, b.c, e).
Select2(a, (f.b.c).d, (f.b.c).e) <- Select2(a, d, e).

/* The Unification Algorithm */

Match(f1.t1, f2.t2, b1, b2) <-
    Deref(t1, t3, f1, f3, b1) &
    Deref(t2, t4, f2, f4, b1) &
    Unify(t3, t4, f3, f4, b1, b2).

Deref(C(c), C(c), f, f, b).
Deref(F(n,a), F(n,a), f, f, b).
Deref(V(v), t2, f1, f2, b) <-
    Member(V(v,f1)=S(t3,f3), b) &
    Deref(t3, t2, f3, f2, b).


Deref(V(v), V(v), f, f, b) <- ~Member(V(v,f)=s, b).

Unify(C(c), C(c), f1, f2, b, b).

Unify(F(n,a1), F(n,a2), f1, f2, b1, b2) <-
    Unifylist(a1, a2, f1, f2, b1, b2).
Unify(V(v), V(v), f, f, b, b).
Unify(V(v1), t2, f1, f2, b, (V(v1,f1)=S(t2,f2)).b) <-
    ~(V(v1)=t2 & f1=f2) &
    ~Occurs(V(v1), t2, f1, f2).
Unify(t1, V(v2), f1, f2, b, (V(v2,f2)=S(t1,f1)).b) <-
    ~t1=V(v1) &
    ~(t1=V(v2) & f1=f2) &
    ~Occurs(V(v2), t1, f2, f1).

Unifylist(NIL, NIL, f1, f2, b, b).
Unifylist(a1.r1, a2.r2, f1, f2, b1, b2) <-
    Match(f1.a1, f2.a2, b1, b3) &
    Unifylist(r1, r2, f1, f2, b3, b2).

Occurs(V(v), V(v), f, f).

Occurs(V(v), F(n,a), f1, f2) <- Occurs(V(v), a, f1, f2).

Member(a, a.b).

Member(a, b.c) <- Member(a, c).

Bindlist(NIL, NIL, f, b).
Bindlist(a1.r1, a2.r2, f, b) <-
    Bind(a1, a2, f, b) & Bindlist(r1, r2, f, b).
Bind(C(c), C(c), f, b).
Bind(F(n,a1), F(n,a2), f, b) <- Bindlist(a1, a2, f, b).
Bind(V(v), t, f, b) <-
    Member(V(v,f)=S(t2,f2), b) & Bind(t2, t, f2, b).
Bind(V(v), V(v,f), f, b) <- ~Member(V(v,f)=s, b).


Match first calls 'Deref' to apply any bindings of variables which already exist. The new tuple S(t2,f2) is then used in the clause 'Unify', which has five cases corresponding to all the different possibilities. Constants match if they are equal. Function terms match if their names are equal and their arguments match. For variables, one does not need (or wish) to bind a variable to itself; otherwise one binds the first variable to the second term, or the second variable to the first term.

The conditions in these clauses would in practice be replaced by the 'cut' symbol, as they simply exclude cases which have already been allowed for.

As an example, consider the following goal and clause against which it is matched:

<- P(u,A)

P(Q(x),x) <- ...

This gives rise to the list of bindings

V(X,1)=S(C(A),0).V(U,0)=S(F(Q,V(X).NIL),1).NIL

By virtue of this the variable u is bound to the term Q(A), but this is not represented explicitly. To extract it, the interpreter uses the procedure Bindlist, which converts the term

F(P, V(U).C(A).NIL).NIL to F(P, F(Q,C(A).NIL).C(A).NIL).NIL.

The text of Bindlist follows easily from the unification routines: the first two arguments are a term and a level, the third is the constructed term, and the fourth the list of bindings.

The illustration above demonstrates the situation in which variables (such as u) act as output, as input (x matched against A), and in the construction of complex terms (Q(x)). It is this capability which adds much to the flexibility of resolution logic and provides a more pleasing approach to the whole question of parameter handling than other mechanisms used in computing.
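The unification machinery just illustrated can be sketched executably. In this Python rendering (my encoding: terms as tagged tuples, the binding list as a dictionary keyed by (variable, level) pairs), deref, unify and bind play the roles of Deref, Match/Unify and Bindlist respectively:

```python
# Terms: ('C', c) constants, ('F', n, [args]) functions, ('V', v)
# variables.  Bindings map (name, level) -> (term, level), i.e. the
# structure-sharing entries V(v,f) = S(t,f').

def deref(t, f, b):
    """Chase existing bindings of a variable, as in Deref."""
    while t[0] == 'V' and (t[1], f) in b:
        t, f = b[(t[1], f)]
    return t, f

def occurs(v, fv, t, ft, b):
    """Occurs check: does variable (v, fv) appear inside (t, ft)?"""
    t, ft = deref(t, ft, b)
    if t[0] == 'V':
        return (t[1], ft) == (v, fv)
    if t[0] == 'F':
        return any(occurs(v, fv, a, ft, b) for a in t[2])
    return False

def unify(t1, f1, t2, f2, b):
    """Extend the bindings b (mutated in place) or report failure."""
    t1, f1 = deref(t1, f1, b)
    t2, f2 = deref(t2, f2, b)
    if t1[0] == 'V' and t2[0] == 'V' and (t1[1], f1) == (t2[1], f2):
        return True                       # same variable: nothing to do
    if t1[0] == 'V':
        if occurs(t1[1], f1, t2, f2, b):
            return False
        b[(t1[1], f1)] = (t2, f2)         # bind variable to the term
        return True
    if t2[0] == 'V':
        return unify(t2, f2, t1, f1, b)   # symmetric case
    if t1[0] == 'C':
        return t1 == t2                   # constants match if equal
    return (t2[0] == 'F' and t1[1] == t2[1] and len(t1[2]) == len(t2[2])
            and all(unify(a, f1, c, f2, b) for a, c in zip(t1[2], t2[2])))

def bind(t, f, b):
    """Build the explicit term a variable stands for, as in Bindlist."""
    t, f = deref(t, f, b)
    if t[0] == 'F':
        return ('F', t[1], [bind(a, f, b) for a in t[2]])
    return t

# The example from the text: the goal <- P(u,A) at level 0 matched
# against the head P(Q(x),x) at level 1 leaves u bound to Q(A).
goal = ('F', 'P', [('V', 'u'), ('C', 'A')])
head = ('F', 'P', [('F', 'Q', [('V', 'x')]), ('V', 'x')])
bindings = {}
ok = unify(goal, 0, head, 1, bindings)
```

Note that, as in the structure-sharing scheme, unify never copies the term Q(x); bind constructs the explicit instance only on demand.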

3 A Backtracking Interpreter

The third refinement of the definition of Prolog is to represent explicitly the backtracking involved in searching for solutions. To do this we must introduce a new data structure - a list of nodes manipulated in a LIFO manner as a stack.

A node is a 6-place function

Node(atom, procs, rest, parent, binding, level)

where 'atom' is the procedure which has been matched, 'procs' is the list of procedures remaining to be tried, 'rest' is the list of outstanding goals to be tried in the procedure, 'parent' is the state of the stack corresponding to the goal, and 'binding' and 'level' are those applicable at this stage. Initially, these are all NIL, with level equal to 0.

The Solve procedure is represented by

Solve(goal, parent, program, stack1, stack2)

where stack1 is the initial and stack2 the final state of


the proof stack. Solve is now represented by three clauses, which correspond to the end of the program, the end of a procedure, and the general case respectively (see Fig. 4/2/4).

/* Fig 4/2/4 The Backtracking Interpreter */

Prove1(goal, prog, soln) <-
    Solve(goal, NIL, prog, Node(NIL,NIL,NIL,NIL,NIL,0).NIL, stack) &
    Instantiate(goal, stack, soln).
Prove1(goal, prog, NOSOLN) <-
    ~Solve(goal, NIL, prog, Node(NIL,NIL,NIL,NIL,NIL,0).NIL, s).

Solve(NIL, NIL, prog, stack, stack).
Solve(NIL, Node(g,p,rest,par,b,f).s, prog, stack1, stack2) <-
    Solve(rest, par, prog, stack1, stack2).
Solve(goal1.goal2, par, prog, stack1, stack2) <-
    Solve1(goal1, prog, goal2, par, prog, stack1, stack2).

Solve1(C("/"), procs, rest, par, prog,
       Node(g2,p2,r2,par2,b2,f2).s2, stack2) <-
    stack1 = Node(NIL,NIL,rest,par,b2,f2).par &
    Solve(rest, par, prog, stack1, stack2).
Solve1(goal, (head.subgoals).procs, rest, par, prog, stack1, stack2) <-
    ~goal=C("/") &
    Bindings(stack1, b1, f) &
    Match(head, f.goal, b1, b2) &
    Solve(subgoals, stack1, prog,
          Node(goal,procs,rest,par,b2,f).stack1, stack2).
Solve1(goal, (head.subgoals).procs, rest, par, prog, stack1, stack2) <-
    ~goal=C("/") &
    Bindings(stack1, b1, f) &
    ~Match(head, f.goal, b1, b2) &
    Solve1(goal, procs, rest, par, prog, stack1, stack2).


Solve1(goal, NIL, rest, par, prog,
       Node(old,procs,rest1,par1,b,l).stack1, stack2) <-
    Solve1(old, procs, rest1, par1, prog, stack1, stack2).

Bindings(Node(g,p,r,par,b,l).s, b, f) <- Sum(l, 1, f).

Instantiate(goal, Node(g,p,r,par,b,l).s, soln) <-
    Bindlist(goal, 0, soln, b).

Since we are representing sequence and backtracking, it is possible to represent the other control mechanism of Prolog - the "cut" or "slash" symbol, "/". This is given as the first clause of an "inner loop" of Solve, called Solve1, which has the parameters:

Solve1(goal, procs, rest, parent, program, stack1, stack2)

The effect of the cut predicate is to remove the possible backtracking points within the procedure, e.g. given the procedure

A <- B & / & C.
A <- D.

then once B has succeeded and the cut has been passed, backtracking into B and any use of the second clause of A are prevented. We will discuss later the merits and disadvantages of this predicate; its semantics are shown in the first clause for Solve1.

Using the cut predicate we can define a "negation as failure" predicate. This is written:

~(a) <- a & / & FAIL.
~(a).

133 A Formal Definition of Prolog

using the "metavariable" facility in the first clause. To show "not a" one attempts to show a. If this succeeds one evaluates the cut predicate (which succeeds) and the FAIL predicate, which has no defining clause and therefore fails. At this point backtracking would normally occur, but this is prevented by the cut predicate, which fails its parent. Thus if a succeeds, ~a fails. Alternatively, if a fails, the second clause is chosen and ~a succeeds.
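This behaviour can be sketched in a few lines, assuming some enumerator of a goal's solutions; taking at most one element of the enumeration plays the role of the cut (a sketch in Python, not the thesis's code):

```python
def naf(solutions):
    """Negation as failure: ~a, given an iterator over a's solutions.
    Taking at most one element plays the role of the cut; FAIL then
    rejects that branch, so ~a fails exactly when a succeeds."""
    for _ in solutions:
        return False        # a succeeded: cut, then FAIL
    return True             # no solution for a: second clause, ~a holds

def no_solutions():
    return iter(())

def one_solution():
    return iter((('x', 42),))
```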

Interpreting the cut predicate operationally, the stack is replaced by the subset of the stack represented by the parent of the call to cut, with an extra node representing the bindings and levels established in the intervening calls. This is in fact slightly different to most implementations, which store all bindings on the stack so that they can only be deleted on backtracking.

The other three clauses deal with three mutually exclusive cases: when the goal matches with the first procedure; when it does not; and when there are no more clauses left. Note that the call to Match in the third clause is simply a check, so that the second and third clauses would be written with an IF..THEN..ELSE in a conventional program (or using cut in Prolog).

The remainder of the program (the unification routines) is as in the previous section.

Discussion

The idea of demonstrating the implementation of a language at several levels has been well explored for a Lambda Calculus based language by Bjorner (1977). For the simple block structured language SAL he demonstrates four levels of interpreters and three levels of compilation. This shows some of the choices open to the implementer and the concrete details which make certain aspects more explicit.

The same may be observed in the three levels of definition of Prolog. The first level is rather more concrete than a fixpoint definition of Prolog but leaves many aspects of the implementation undefined: for instance, the method of handling variables (which is crucial to the usefulness of resolution) and the order of evaluation of the literals within a clause.

The second level makes the method of handling variables explicit but still says little about the ordering of the clauses (except insofar as this is implied by the bindings) or whether backtracking or some other method of evaluation is to be used (e.g. using parallel evaluation). The third level is much more specific and rather more restrictive. It specifies backtracking and strict left-to-right depth first evaluation.

It is important, though, to remember that all three are intended as specifications: any implementation which yields the same results is permissible. For instance, the use of structure sharing for variables is a convenient but by no means obligatory method. Any other method which produces the same results would be acceptable.

The introduction of the cut predicate does, however, raise several interesting questions.

(1) Its introduction is only meaningful at the third level, when the notions of left-to-right evaluation and backtracking have been introduced. Its semantics cannot be given at either of the higher levels.


(2) It requires the incorporation into the node of the variable 'parent', which refers not to a logical term or variable but to the proof stack itself. Cut is thus in some sense a meta-level facility.

(3) The method by which it is introduced, which involves naming the proof stack, is in some ways similar to the method of continuations employed for handling jumps (see section 4.3).

The introduction of the cut predicate has been contentious in Prolog circles. It has frequently been compared to the 'GOTO' statement (and this is supported by point 3 above) and has given rise to some very obscure programming, with the result that there have been calls for its abolition or severe restriction. Is it therefore possible to give it a "declarative" rather than an "operational" semantics?

One of the novel features introduced into mathematics by computing has been the conditional statement and expression. McCarthy (1962) uses it as the basis of his mathematical approach to computing and Manna (1974) has shown that it can be used as a complete replacement for all the logical operators, e.g. 'Not P' can be represented as IF P THEN FALSE ELSE TRUE.

It would be attractive to have such a feature in Prolog. A natural equivalent of the clause

A <- B THEN C ELSE D.

would be

A <- B & C.
A <- ~B & D.


However, in most Prolog systems these are not equivalent statements, whether implemented using 'Cut' or by 'negation as failure' (Clark, 1978). The reasons are twofold:

(1) Implementations using cut do not allow unrestricted backtracking to B in the first clause. The equivalent statements are

A <- B & / & C.
A <- D.

which is equivalent to

A <- Oneof(B) & C.
A <- ~B & D.

where Oneof(x) is a metapredicate which returns only a single solution of its parameter.

(2) If there are variables shared between B and the other predicates, then both B and ~B can be true for discrete ranges of that variable.

e.g.

A(x) <- B(x) & C(x).
A(x) <- ~B(x) & D(x).
B(A).
C(B).
D(B).

This gives different answers depending on the order of evaluation of the subgoals.
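The divergence can be checked concretely. Encoding each predicate as the set of constants for which it holds (my encoding), the two subgoal orders for the second clause give different answers for the query A(x):

```python
# Facts: B(A). C(B). D(B).  Each predicate is represented by the set
# of constants for which it holds; the query is A(x).
B, C, D = {'A'}, {'B'}, {'B'}

def clause1():
    # A(x) <- B(x) & C(x): generate x from B, then test C(x).
    return sorted(x for x in B if x in C)

def clause2_neg_first():
    # A(x) <- ~B(x) & D(x): x is still unbound when ~B(x) runs, and
    # negation as failure fails because B(x) has *some* solution.
    if B:
        return []
    return sorted(D)

def clause2_d_first():
    # The same clause with the subgoals reversed: D(x) & ~B(x).
    # Now x is bound by D(x) before the negation is tried.
    return sorted(x for x in D if x not in B)
```

Under one order the program has no answer for A(x); under the other it answers A(B), which is the order-dependence described above.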

When we ask what the utility of the conditional construct is, it emerges that it is basically useful as an extension to clausal form for distinguishing several cases of the input. Thus we are only interested in establishing a single value of the predicate in the condition (so objection 1 does not matter) and we are concerned to constrain the order of evaluation so that the condition is evaluated before the body (so objection 2 is not important).

Hence from this point of view, a sequence of clauses containing the cut may be considered a sensible equivalent to a conditional or case statement, as follows.

Prolog Pseudo Algol

A <- B & / & C.          A := if B then C
A <- D & / & E & F.           else if D then E & F
A <- G.                       else G

This structured use of cut, when it is used in every clause except perhaps the last, overcomes the conceptual difficulty of the predicate.

It also points out the key requirements for the equivalent of the cut in versions of Prolog which do not follow the Left-to-Right Depth-First regime: namely that predicates on the left must be evaluated before those on the right, that only one solution of those on the left is required, and that when a left hand side is evaluated to be true, the system is "committed" to that clause. To draw the parallel with Dijkstra's "guarded commands" even more strongly, we should insist that the left hand sides of different clauses may be evaluated in any order (with the exception of the default), so that the order of the clauses does not affect the semantics.
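This committed-choice reading can be sketched directly: evaluate the guards, take one solution of a successful guard, and commit to that clause (a Python sketch; the names are mine):

```python
def committed_choice(clauses, default):
    """clauses: (guard, body) pairs, e.g. A <- B & / & C gives (B, C).
    One solution of a guard suffices, and once a guard is true the
    system is committed to that clause; 'default' is the final,
    guardless clause (A <- G)."""
    for guard, body in clauses:
        if guard():                 # only one solution of B is required
            return body()           # committed: later clauses are cut off
    return default()

# A <- B & / & C.   A <- D & / & E & F.   A <- G.
result = committed_choice(
    [(lambda: False, lambda: 'C'),        # B fails
     (lambda: True, lambda: 'E & F')],    # D succeeds: commit here
    lambda: 'G')
```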

Chapter 4.3

Definition of an Algol 68 Subset

"The Lord said, Go to, let us go down and there confound their language that they may not under- stand one another's speech'" Gen. 11/4 A.V. (1611)

The ASPLE language avoids dealing with some of the most important and tricky features of programming languages - such as block structure, procedures and jumps. In this section we expand the language so that it is closer to Algol 68. Thus most constructs return values, the full procedure mechanism is allowed, together with generalised jumps. The full text of the definition is given in Appendix A.

However, significant features of the language - such as parallel evaluation, arrays, structures etc. - will be omitted. Also certain other features, such as parameterless procedures and widening, are omitted in order to simplify the range of permitted coercions. These were omitted chiefly because of pressures on space and time. Apart from parallel processing, they pose few difficulties and would add little to the exposition.

The lexical syntax is different to that of ASPLE in only one important respect. Algol 68 requires two typefaces - 'bold' and 'small'. These may be implemented using capital and lower case characters, or, when using one case throughout, by enclosing the bold symbols in strops (e.g. 'BEGIN'), or preceding them with a quote ('BEGIN) or a period (.BEGIN). We assume that the lexical syntax transforms any


of these conventions into the standard one, using upper and lower case. Given this specification, the details of the lexical syntax are omitted.

Block Structure

In the Algol languages, blocks may be placed almost anywhere in the program. Local variables introduced inside a block have a scope which is the same extent as the block. Procedures are essentially recursive, so that a number of 'activations' of a block may be in existence at the same point of time. Algol 68 extends this in several ways: declarations may occur anywhere in a block, up to the first occurrence of a label in that block; 'identities' may be declared which cannot be assigned to again; and the procedure mechanism is more general, including procedures used as variables and unnamed procedures.

Several function symbols must be incorporated into the relational definition to model these constructs. We make no apology that these are "stack-like", as the introduction of such a mechanism seems to be the clearest way of introducing the concepts involved. It has been shown by Russell (1977) that the stack model is equivalent to the more abstract denotational model. Hence there is no loss of generality. As in ASPLE, the 'parse' and 'execution' phases are separated and the stack structures are different for each phase.

With block structure a name is no longer sufficient to identify a variable in the semantics: therefore we use a composite value - Id(tag,offset). The parameter 'offset' indicates the difference between the static nesting level of the block in which the variable is declared and the level in which it is used. Thus in the assignment statement in the program:


BEGIN INT i;
   BEGIN INT j;
      j := i; ...

'j' is identified by Id("j",0) and 'i' by Id("i",1).
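Identification can be sketched as a walk outwards through the environment, counting the ranges crossed; this Python rendering of the fragment's environment (the dictionary encoding is mine, and the mode strings are schematic) yields the offsets quoted above:

```python
# Range(declarations, environment) as a pair; the fragment's
# environment is Range("j":Ref(INT).NIL, Range("i":Ref(INT).NIL, r)).
outer = None
env = ({'j': 'Ref(INT)'}, ({'i': 'Ref(INT)'}, outer))

def identify(tag, env):
    """Return Id(tag, offset): offset counts the enclosing ranges
    crossed before the declaration of 'tag' is found."""
    offset = 0
    while env is not None:
        decs, enclosing = env
        if tag in decs:
            return ('Id', tag, offset)
        env = enclosing
        offset += 1
    raise KeyError(tag)          # undeclared identifier
```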

When applying the context sensitive scope restrictions in the grammar rules, the following structure is used to name the environment:

Range(declarations, environment)

where the first parameter is a list of declarations introduced at this level and the second is the declarations at outer levels. For the program fragment above this would be:

Range("j":Ref(INT).NIL, Range("i":Ref(INT).NIL, r))

where r indicates the surrounding ranges.

The introduction of blocks in the language may be illustrated by the productions for Closed Clause and Series which are reproduced below in slightly telescoped form:

ClosedClause(mode,env,Block(stms)) ->
    "BEGIN" Series(mode,Range(props,env),decs,labs,stms) "END"
    & Append(decs,labs,props).

There are three outputs of the serial clause 'Series': the declarations made in the block, the labels in the block, and the statements output. Labels are separated from other declarations for several reasons: primarily because in certain positions, such as the 'test' ('enquiry') part of a conditional, labels are not allowed, and separating the label definitions allows this to be specified easily. The second parameter of Series is the environment for the inner range, which is derived from the 'decs' and 'labs' which are themselves output from the statements within the block. As this is a nearly circular definition, it would not run efficiently as a Prolog program in this form without the use of coroutining. The system would always assume that the variable was declared in this block, causing excessive backtracking if it was not. See also chapter 5.1.

The semantics of blocks require a new definition of 'State', used to model the machine contents at run-time. This has three components (ignoring transput, which can be added following the model of the ASPLE definition):

State(Stack, Heap, Continuation)

Stack provides all local storage for names and values in a series of "frames". Heap provides storage for items which do not follow the stack mechanism. Continuation will be discussed later, in the section on jumps; it may be omitted from the definition if full jumps are not permitted.

The semantic representation of values in Algol 68 is more complex than in ASPLE but allows a very straightforward treatment of pointers and parameters. Values may be stored in the stack or the heap and there is a clear distinction between (what is called) the name of a location and its contents. The names of locations are described by two function symbols:


Loc(number,dynamiclevel) - for values on the stack
Heap(number) - for values in the heap

where 'number' is a unique identifier for that location and 'dynamiclevel' describes the stack frame in which the location is declared.

Accessing the value of a variable is a two-stage process in the semantics:
(1) From Id(tag,offset) look up the name on the stack, which will yield either Loc(n,dl) or Heap(n).
(2) Look up this value in either stack or heap.
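The two stages can be sketched over a toy stack; here a frame keeps only the fields needed for access (its dynamic level, names of locals, values of locals), and Loc/Heap names are tagged tuples (the encoding and the sample values are mine):

```python
# Stage 1: Id(tag, offset) -> Loc(n, dl) or Heap(n), found in the
#          frame 'offset' blocks out from the current one.
# Stage 2: that name -> the stored value, in the stack or the heap.
heap = {7: 'heap value'}
frames = [                                  # innermost frame first
    {'level': 2, 'names': {'j': ('Loc', 0, 2)}, 'values': {0: 5}},
    {'level': 1, 'names': {'i': ('Loc', 0, 1), 'd': ('Heap', 7)},
     'values': {0: 9}},
]

def access(tag, offset):
    name = frames[offset]['names'][tag]     # stage 1
    if name[0] == 'Heap':
        return heap[name[1]]                # stage 2, heap storage
    _, n, dl = name                         # Loc(number, dynamiclevel)
    frame = next(f for f in frames if f['level'] == dl)
    return frame['values'][n]               # stage 2, stack storage
```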

The stack is composed of a number of frames. A frame is a function with 7 parameters:

Frame(display, value of this block, names of locals, values of locals, rest of the stack, statements in this block, previous continuation)

The second parameter holds the value returned by this block. The third and fourth parameters have already been assumed: "names of locals" is a list of pairs mapping the identifiers declared to their names (Loc or Heap). "Values of locals" is simply a list of the current local variables. The "rest of the stack" is the other frames, and the last two parameters can be ignored for the moment.

To understand the dynamic behaviour of the stack we must consider the first parameter - 'display'. This is a list of frames which are accessible to this block; the first


item is a number which names the current frame. Each frame is numbered in this way, starting with 1 for the global block. There are two ways in which the display is built up:
(1) At a normal block entry a new frame is created which has the old frame (as parameter 5) and a display consisting of the old display with the next higher number at the front.
(2) For a procedure call there is a two stage process. When the procedure is declared, what is created and stored as the value of the procedure is a closure, which is a three-place function:

Closure(formal parameters, body of procedure, current display)

When the procedure is evaluated, the new display is the display in the closure, which applies to the declared environment, not to the calling environment, together with the new frame level added to the front.
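The two ways of building the display can be sketched directly, with a display as a list of frame numbers, newest first (the encoding is mine):

```python
def enter_block(display, new_frame):
    """(1) Normal block entry: the old display with the new frame
    number added at the front."""
    return [new_frame] + display

def make_closure(formals, body, display):
    """Declaring a procedure stores a closure holding the *declaring*
    display, so calls will run in the declared environment."""
    return ('Closure', formals, body, display)

def call(closure, new_frame):
    """(2) Procedure call: the new frame number plus the closure's
    display - not the caller's display."""
    _, formals, body, decl_display = closure
    return [new_frame] + decl_display

global_display = [1]
p = make_closure(['x'], '...body...', global_display)    # declared globally
caller_display = enter_block(enter_block(global_display, 2), 3)
```

Even though the caller's display is [3, 2, 1], invoking p builds its display from the declaring environment, which is what gives static scope.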

To see how this applies to parameters, consider a procedure whose declaration is:

PROC test=(INT value, REF INT ref, PROC(INT)VOID p) VOID:
   BEGIN ... END;

and is called by the statement test(i,j,test).

For the first parameter, the required mode is INT and both stages of the evaluation are invoked to yield the value of i. For the second (call by reference) only the first stage is invoked, and what is passed is Loc(n,dl) or Heap(n). Hence when the procedure assigns a value to 'ref' the value is placed in j's location.

Thus to pass a procedure as parameter, one simply finds the value of the procedure (a closure) in the normal way


and sets the value of the formal parameter to that closure. When the parameter is invoked, the correct environment is automatically set up. We might note that the example above illustrates a procedure being passed as a parameter to itself. This causes no problems to the semantics or the description method.

Finally, to see how declarations are handled we will examine the declaration of variables. Examples of valid declarations are:

BOOL a,b,c;
HEAP REF INT d;
INT e:=2, f:=x+y, g;

The second declaration shows a pointer variable which is to be kept on the heap. The third demonstrates ways in which a variable can be initialised within the declaration. The productions for this are:

Declaration(decs,env,val) ->
    Generator(env,mode,gen)
    JoinedDefinition(Var(mode,gen),env,val).
Generator(env,Ref(mode),val) ->
    ( ["LOC"] & val=LOC(mode)
    | "HEAP" & val=HEAP(mode) )
    Mode(Actual,mode,env,n).
Definition(Var(Ref(mode),gen),env,dec,NewVar(gen,id,val)) ->
    DefiningIdentifier(Ref(mode),env,dec,id)
    ( ":=" Unit(mode,env,val)
    | NIL & val=SKIP ).
DefiningIdentifier(mode,env,tag:mode,tag) ->
    @Id(tag) & Unique(tag,mode,env).

The 'Generator' defines whether the location of the variable is to be on the stack or the heap, and the mode of this is always a reference to some mode. The 'joined definition' (whose productions are not shown) allows several

variables to share the same 'definition'. The output from 'definition' is a function:

NewVar(gen,name,val)

where 'gen' names the generator, a function which allocates the correct space on either stack or heap, 'name' is simply the tag for the variable and 'val' is the expression that must be evaluated to initialise the variable. The semantics of NewVar is given as

Semantics(NewVar(gen,name,source),NoVal) =>
    Semantics(gen,locn) // Semantics(source,val);
    Set(name,locn) // Update(locn,val).

Thus one first evaluates the generator and the source collaterally (this is the meaning of //, see last section). One then calls the procedure 'Set' which assigns a pair in the list 'names of locals' and 'Update' which replaces the value in the list 'values of locals'. Note that Set is also invoked in the declaration of identities (in which case the second parameter is a value) and Update is also used in the assignment statement.
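The NewVar rule above can be sketched operationally. This is an illustrative Python rendering only; the names set_local, update_locn and new_var are invented, and dictionaries stand in for the 'names of locals' and 'values of locals' lists of a frame.

```python
# Sketch of Set and Update as operations on a frame's two tables.

def set_local(names, name, locn):
    # Set: record that 'name' denotes location 'locn' in this frame.
    names[name] = locn

def update_locn(store, locn, val):
    # Update: replace the value held at location 'locn'.
    store[locn] = val

def new_var(names, store, name, init):
    # NewVar: run the generator (here a trivial one allocating the next
    # free location), evaluate the source, then Set and Update,
    # mirroring the semantic rule above.
    locn = len(store)
    set_local(names, name, locn)
    update_locn(store, locn, init)
    return locn
```

In the thesis the generator and source are evaluated collaterally; here they are simply sequenced, since either order gives the same final frame.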


Jumps

There is a basic problem with handling jumps in any systematic way using the clauses we have described. This is that they assume that the actions will always be completed properly (or not at all). For instance, a sequence of two statements is represented as a list and may be given in the semantics as:

Semantics(a;b, val) -> Semantics(a,c); Semantics(b,val).

and is given in the underlying Prolog statement as:

Semantics(a;b,val,s1,s3) <- Semantics(a,c,s1,s2) & Semantics(b,val,s2,s3).

If the statement 'a' happens to be a GOTO statement, then the following statement 'b' does not need to be executed, and the final state 's3' cannot be achieved. This also applies to abnormal termination due, for instance, to overflow in an expression. It will be remembered that in the definition of ASPLE this difficulty was 'patched over' by always testing the 'OK' state of the processor before executing a statement. If the processor was not 'OK', then the statement was skipped. Although possible, this method does not work well in Algol 68, where expressions may contain statements which may themselves include jumps. The following is perfectly allowable:

x := y / IF z≠0 THEN z ELSE GOTO divby0 FI;

In this case, the division is never completed if z = 0. Since almost any construct in Algol 68 can return a value, the amount of checking that would go on would swamp the rest of the definition in irrelevant detail and the implementation of jumps would be rather messy.

The key to the resolution of this dilemma is to consider the way in which relations, or functions, are composed together in the definition and to make the remaining actions into a parameter of the function. This method, called continuations, was devised by Strachey and Wadsworth (1975) for denotational semantics and can easily be applied to the case of relational composition. (We use the method they call "impure continuations").

Consider the simple sequential phrase a;b for which the normal Prolog representation was given above. Instead of translating it in the normal fashion, we will translate it using another predicate name, called Do, and treat the non-terminal symbols as function names in the following fashion:

Do(Semantics(a;b,val), State(stack,heap,cont), s2) <-
    Do(Semantics(a,c), State(stack,heap,Semantics(b,val);cont), s1) &
    Continuation(s1, s2).

The third parameter of the state, cont, is the continuation. Before 'a' is evaluated, the other action 'b' is pushed onto the continuation, which forms a stack within the state. Assuming that a is a single action which completes normally, the relation 'Continuation' will then pick the clause 'Semantics(b,val)' from the front of the stack and execute it in the same way. Thus the first clause for Continuation (the others deal with the case of block exit) is simply:

Continuation(State(s, h, b;c), s1) <- Do(b, State(s,h,c), s1).
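The Do/Continuation pair above can be sketched as a small interpreter. This is an illustrative Python analogue, not the thesis's clauses: tuples stand in for Semantics terms, a list of values stands in for the state, and the continuation is a list of pending actions. The 'goto' case anticipates the jump treatment that follows.

```python
# Sketch of impure continuations: the continuation is a stack of
# remaining actions carried alongside the state.

def run(action, state, cont):
    kind = action[0]
    if kind == "seq":                 # ('seq', a, b): push b, run a
        _, a, b = action
        return run(a, state, [b] + cont)
    elif kind == "prim":              # ('prim', f): one normal action
        state = action[1](state)
    elif kind == "goto":              # ('goto', rest): discard the
        cont = action[1]              # supplied continuation entirely
    return continuation(state, cont)

def continuation(state, cont):
    # Pick the next action off the front of the stack, if any.
    if not cont:
        return state
    return run(cont[0], state, cont[1:])
```

Note how the 'goto' branch simply ignores the continuation it was handed and substitutes a new one, which is precisely the mechanism used for jumps below.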

A jump can now be handled by ignoring the continuation that is provided and substituting a new one. Let us deal first with local jumps (within the same block). The abstract syntax of a jump is Goto(label,offset) where offset indicates the (static) block level in the same way as in the case of identifiers. For a local jump, this offset is 0. As indicated in the earlier section, the sixth parameter of the frame is the list of statements for the block. The continuation after a jump comprises the values of the list of statements in the block after the label. This is represented by the following statements:

Semantics(Goto(label,offset), x) => Jump(label,offset).

Do(Jump(label,0), State(stack,heap,cont), State(stack,heap,Semantics(rest,val))) <-
    stack = Frame(a,val,b,c,d,stms,e) &
    FindCont(label,stms,rest).

where FindCont extracts the rest of the statements in the block after the label.
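FindCont itself is a simple list search. The following sketch is illustrative only: the ('label', name) marker representation of statements is an assumption, not the thesis's abstract syntax.

```python
# Sketch of FindCont: given a label and the block's statement list,
# return the statements after the label; None if the label is absent.

def find_cont(label, stms):
    for i, s in enumerate(stms):
        if s == ("label", label):
            return stms[i + 1:]
    return None
```

The returned tail becomes the new continuation for the block, replacing whatever actions were pending before the jump.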

One might note the role of the second parameter of Frame - this holds the value of the block. In Algol 68 every block can return a value. In normal execution this is handled by the second parameter of Semantics. However, this action is destroyed by a jump. The variable 'val' thus provides the linkage by which the value is returned (and would be an equally suitable mechanism for describing the 'VALOF-RESULTIS' pair in BCPL). Note that the 'value' part of the semantics of the Goto statement is an unreferenced variable, but not the value 'NoVal' which is used elsewhere. If it were the latter and the GOTO was the last statement in the block, then the result of the block would be NoVal. However, because of the 'EXIT' statement (described in the appendix), this is not necessarily so.


Non-local jumps present few difficulties now that the groundwork has been laid. In this case the 'offset' parameter and the display are used to find the frame in which the label occurs, and the stack is replaced by this frame, discarding the rest of the stack. The clause which describes this is:

Do(Jump(label,offset), State(stack,heap,cont), State(frame,heap,Semantics(rest,val))) <-
    FindFrame(offset,stack,frame) &
    frame = Frame(a,val,b,c,d,stms,e) &
    FindCont(label,stms,rest).

The clauses to handle jumps are in fact more complicated than most other parts of the semantics. This is not necessarily a disadvantage. Jumps are complex operations in a block-structured language and it is appropriate that this should be apparent in the semantics.

Collateral Actions

Many actions in Algol 68 are defined to be 'collateral', by which is meant that their constituent actions can be evaluated in any order. Examples are the two branches of an expression and any number of declarations joined by commas, as well as the collateral statement itself. It is simple to give a non-deterministic program to allow for this, but inconvenient to have to specify this for each case.

Hence a new operator will be introduced into the syntax, implemented by means of a generic rule of the type discussed in chapter 2.2 to signify 'collateral', written '//'. Its definition is simply:

a // b -> a ; b | b ; a.


This is an example of a non-determinate specification. In any particular program only one of the two branches will be used, but the implementor is free to choose either.

This operator is suitable when only two actions are to be taken in either order, but if a number of actions may be merged it is insufficient. This may be illustrated by taking three actions, which will be implicitly paired:

a // b // c = a // (b // c)

By the above clauses, a can come before or after b and c, and b and c can be interchanged, but a can never be interleaved between b and c. Another definition must be given.

In the abstract syntax, a number of collateral actions are represented by a list and the result is also a list. The semantics of this is given as follows:

Semantics(a.b, val) => DoCollateral(a.b, val).
DoCollateral(a, val) =>
    Select(a,val, a1,v1, a2,v2);
    Semantics(a1, v1);
    DoCollateral(a2, v2).
Do(Select(a1.a2, v1.v2, a1, v1, a2, v2), s, s).
Do(Select(a1.a2, v1.v2, a3, v3, a1.a4, v1.v4), s, s) <-
    Do(Select(a2, v2, a3, v3, a4, v4), s, s).

Here the procedure Select is the non-deterministic one. From a list of actions and values it selects one pair and returns the others as a list. It has two clauses: the first selects the first item in the list, the second chooses some other value. Since either clause can be taken, any of the possible actions can be chosen first. The clause DoCollateral selects one pair, executes it and then does the remaining pairs in the same way. In this way any of the

possible permutations of order can be allowed.
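The Select/DoCollateral scheme can be sketched by enumerating its choices explicitly. In this illustrative Python version (the generator-based encoding is my own, not the thesis's), select yields each "chosen pair plus remainder" in turn, corresponding to repeated use of the two Select clauses; iterating it recursively enumerates every permutation of the actions.

```python
# Sketch of nondeterministic Select: first clause takes the head,
# second clause chooses from the tail, so every element can be chosen.

def select(items):
    for i in range(len(items)):
        yield items[i], items[:i] + items[i + 1:]

def orders(items):
    # DoCollateral: choose one item, then do the rest the same way.
    if not items:
        yield []
    else:
        for chosen, rest in select(items):
            for tail in orders(rest):
                yield [chosen] + tail
```

For three actions a, b, c this admits all six orders, including b, a, c - the interleaving of a between b and c that the pairwise '//' definition could not produce.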

This treatment needs to be augmented outside of the current syntax of Prolog by the observation that only a single result is required even though two ambiguous clauses are given.

It is obviously very much easier to introduce collateral actions into a relational definition than into a functional definition, although there are aspects of this treatment that are less than totally satisfactory. For an extended account of the mathematics required to introduce collateral evaluation using denotational semantics, see de Bakker (1980). The full treatment of parallelism in Algol 68 introduces further problems that have not yet been solved.

152 Chapter 5.1

Applications

Prototype Versions of a Language

One of the merits of a Prolog-based definition is that although it is entirely formal, it is runnable as a program. This opens new possibilities for the language designer. It becomes possible to construct the specifications of a new language and test it out in a much shorter time and with a much greater certainty that implementation errors have not been included. Other systems that may be adapted to this purpose include program proving systems such as Edinburgh LCF (Gordon, Milner, Wadsworth 1979) and the compiler generator SIS (Mosses 1978) which are both based on denotational semantics.

In fact the main addition that must be made to turn an M-grammar definition into a prototyping system is the handling of error situations (but see section 5.2 also for limitations on acceptable syntax). This fact raises interesting questions as to how far these should be specified in the original language.

Although Prolog definitions are runnable for small example languages such as ASPLE, with small example programs, it would not be feasible to test a large-scale language such as Algol 68 on reasonable size programs using existing Prolog implementations. There are limitations in both time and space, and there are two main tactics which could be used to overcome these - coroutining and the exploitation of determinate computations.

153 Prototyping a language

(1) Prolog is a strictly functional language, with very few inherent notions of sequence. Hence there is much scope for using coroutining and parallel evaluation strategies. Coroutining supervisors have been experimented with from the beginning on the Marseilles and Edinburgh Prolog systems and built in at a much more basic level into the IC-Prolog system (Clark, McCabe 1979). The facilities for parallelism are less developed, but also under investigation.

The top level of the ASPLE definition may be written using the annotation facility of IC-Prolog as

ASPLE(text, input, output, result) <-
    LEXEME(tokens^, text, NIL) &
    MORPHEME(tree^, mem, tokens?, NIL) &
    SEMEME(tree?, STATE(mem, input, output, OK), STATE(mem1, in1, NIL, result)).

where '^' signifies that the predicate is a producer of the value and '?' that it is a consumer. One can say alternatively that '^' signifies an output and '?' an input variable.

With the data-flow coroutining used in IC-Prolog, '?' becomes an 'eager' consumer and '^' a 'lazy' producer. This works very well in the case of the lexical and syntax analysis: immediately some variable is bound by the lexical analysis (the lazy producer), it is consumed by the syntax analysis phase (the eager consumer). Control is then returned to the lexical analysis, and so on. If this is coupled with the detection of determinate analysis in both of these computations, it should be possible to run both the lexical and syntax analysis in bounded and reasonable memory capacity.
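The lazy-producer/eager-consumer regime can be sketched with generators. This is only an analogy in Python (the names lexeme and morpheme echo the ASPLE definition, but the encoding is mine): the trace records that lexical and syntax analysis strictly alternate, one token at a time, rather than the whole token list being built first.

```python
# Sketch of data-flow coroutining: the lexer produces tokens lazily,
# the parser consumes each one as soon as it is bound.

trace = []

def lexeme(text):
    # Lazy producer: emits one token at a time, on demand.
    for tok in text.split():
        trace.append("lex:" + tok)
        yield tok

def morpheme(tokens):
    # Eager consumer: handles each token as soon as it is produced.
    tree = []
    for tok in tokens:
        trace.append("syn:" + tok)
        tree.append(("id", tok))
    return tree

tree = morpheme(lexeme("begin x end"))
```

Because control shuttles between the two phases, only one token is ever in flight, which is the bounded-memory behaviour the text anticipates.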


A multipass syntax analyser could also be evaluated directly using coroutining. A typical case is the checking of identifiers. In the Algol 68 definition the top level of this (using the annotations above) is:

SerialClause(mode,env,val) ->
    Series(mode,newenv?,decs^,labs^,stms^)
    Newrange(decs?,labs?,stms?,env,newenv^,val).

The essential limitation is that 'newenv', the environment of declared identifiers within the serial clause, will be incompletely specified on the first pass, and must not be instantiated by this use of the identifier. IC-Prolog would commence this goal immediately and perhaps complete it (if the definition was found), but it might suspend execution if it found a variable in the 'newenv' tree. (Note that this clause is overspecified for IC-Prolog; some of the annotations should be omitted.)

The generalised coroutining facilities in IC-Prolog were not designed for efficiency and a rather simpler interpreter could be envisaged which would be adequate for these tasks. Predicates such as 'Identified', which checks the scope of identifiers, could simply be tagged with the pass number in which they can be evaluated. Then evaluation would consist of a series of left-to-right passes over the proof tree in which predicates for later passes are suspended and reactivated on later passes.

Coroutining between passes is not particularly relevant for the semantics evaluation phase. It is in any case simple to store the abstract syntax tree produced and proceed with the semantics phase separately, as is done in most 'compile and load' systems.


(2) Detection of non-determinacy. One of Prolog's strengths is that it can handle non-determinacy with acceptable efficiency. But non-determinacy has its costs - a time cost, spent in needless backtracking, and a space cost, because the backtracking points that are left mean that stack space is not recovered on exit from a procedure. The ability to handle non-determinacy is particularly useful in handling the syntax analysis of a prototype language which has not been converted to one of the standard forms. Judicious use of the cut procedure can reduce the cost of unnecessary backtracking during this phase.

But in the lexical analysis and semantics phases the non-determinacy carries a cost which brings very few benefits. The tail recursion optimisation (Warren 1980) goes some way towards improving the situation, but more careful analysis seems to be necessary if Prolog is to become a useful prototyping tool.

The equivalence of regular grammars and finite state machines is a useful starting point for this analysis. A program which could take regular grammars expressed in M-grammar form and convert them to acceptable code based on the finite state representation would have many applications - including its incorporation into the Prolog system itself. (Current Prolog systems have ad-hoc tokenizers which are not suitable for all applications.) An example of such a finite state recognizer is provided by the lexical analysis of the ASPLE compiler in Appendix B.
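The target of such a conversion - a table-driven finite state recognizer - can be sketched as follows. The state names, character classes and token classes here are illustrative assumptions, not those of the ASPLE lexer in Appendix B.

```python
# Sketch of a tokenizer driven by a finite-state transition table,
# the deterministic code one would generate from a regular M-grammar.

def char_class(c):
    if c.isalpha():
        return "letter"
    if c.isdigit():
        return "digit"
    return "other"

# (state, class) -> next state; a missing entry ends the current token.
TABLE = {
    ("start", "letter"): "ident", ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
    ("start", "digit"): "number", ("number", "digit"): "number",
}

def tokenize(text):
    tokens, state, cur = [], "start", ""
    for c in text + " ":               # trailing sentinel flushes the last token
        nxt = TABLE.get((state, char_class(c)))
        if nxt:
            state, cur = nxt, cur + c
        else:
            if cur:
                tokens.append((state, cur))
            state, cur = "start", ""
            nxt = TABLE.get(("start", char_class(c)))
            if nxt:                    # the current character starts a token
                state, cur = nxt, c
    return tokens
```

Because every step is a single table lookup, the recognizer is determinate: no backtracking points are created and stack space is constant, which is precisely the gain over a naive nondeterministic grammar.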

A less obvious feature of any Prolog language definition is the time spent searching symbol tables. In the definitions presented we have opted for simplicity in using linear lists, though binary trees are almost as straightforward to implement. A symbol table is essentially a mapping in which the first elements form a set and the second elements are the values. A language such as SETL provides maps as a basic facility and every Prolog system incorporates at least one symbol table (for predicate names). These inbuilt procedures employ efficient algorithms (such as hash codes) to produce acceptable efficiency. Making them available to the user is primarily a question of system design which future implementers of Prolog are encouraged to consider. Such a feature would also provide a more satisfactory treatment of collateral evaluation (see chapter 4.3).

157 Chapter 5.2

Towards a Logic Compiler-Compiler

One of the first uses of Prolog was Colmerauer's (1975/8) very neat definition of a compiler for a small Algol-like language. This is a classical four-phase compiler consisting of

(1) Lexical analysis
(2) Syntax analysis
(3) Synthesis of assembly code
(4) Allocation of space and Assembly

together with input and output phases. The listing of a compiler for the ASPLE language, which is similar except for the addition of error handling facilities, is given in Appendix B.

However, compilers are generally regarded as "systems programs" for which efficiency is of paramount importance. It is for this reason that assembler was used in writing them for many years, and only now is the use of higher level languages (normally efficient "compiler-writing" languages) becoming standard. It is therefore doubtful that compilers written in Prolog would become widely used, despite the ease with which they can be developed. Even more crucial is the use of space by Prolog systems, which currently prevents the compilation of large segments. Space is always a problem for compilers written in conventional languages, and Prolog cannot as yet compete with these.

The "UNCOL" problem - the combinatorial effect of matching source languages to target machines - is simplified in a Prolog compiler (as in other similar systems) by the adoption of a suitable intermediate level "abstract syntax". However, it is not eliminated as the same analysis of languages and machines must be made.

The classical answer to these problems is the compiler-compiler. Since the first seminal version (Brooker et al. 1963), there have been several widely used versions. Koster's Compiler Definition Language (Koster 1971b) is partially based on Affix grammars, but uses in addition a very flexible system of macros, producing a top-down compiler for any LL/1 grammar. XPL (McKeeman, Horning, Wortman 1970) has also been widely used.

A more recent project is the Production Quality Compiler Compiler (see Aho, 1980) which is attempting for the first time to combine several aspects of the problem that have not been attempted in a general way. These include the areas of global and local optimisations, together with the automatic derivation of code generators - an area which has previously escaped formal methods.

In the following sections we will examine two aspects only of a compiler-compiler: the identification of amenable subsets of M-grammars and the construction of parsers for them; and the automatic derivation of code generators from formal machine descriptions. The question of optimisation, both machine dependent and independent, will not be addressed.

Parsers for M-grammars

The basic problem in constructing a usable parser is to reduce backtracking to a minimum, or eliminate it entirely. Backtracking not only affects efficiency, but more importantly reduces enormously the possibility of accurate error-detection. It is a well-known maxim for compiler-writers that "Any fool can write a compiler for correct source programs" (Rohl 1975). The user of a compiler does not want to know either that "there are errors in this program" or that "the first error occurs on line x", but to have each error pinpointed, and does not wish to be flooded by hundreds of "consequent errors" arising because of inadequate recovery from previous errors.

It is largely because of their better error recovery methods that there has been much greater interest in bottom-up methods of syntax analysis in recent years - for instance LR(1) parsers (see also Aho, Ullman 1977). Most work with Prolog has been done with top-down methods, but some work has been done (Warren 1975) on the use of Earley's generalised parsing method as a basis of Prolog implementation. This is of order n in time and space (where n is the length of the input string) for an unambiguous CFG, but does not have good error detection abilities. Another approach is based on Kowalski's (1974b) connection graph theorem proving method, which can mix both top-down and bottom-up methods.

Our approach is much more specific. Most programming languages have a "context-free" basis, so that errors in the context-sensitive parts may be identified subsequently. It is on the basis of this that efficient parsers have been built. It is therefore necessary to identify and separate those parts of an M-grammar which represent a context-free grammar so that these can be used as a basis of the parse and error recovery, with the context-sensitive parts taking a second place. We are thus following the opposite path to Attribute Grammars, which were generalised from context-free grammars (see Watt, Madsen 77).

This may be illustrated by the ASPLE compiler in Appendix B. The syntax analysis phase is written as a standard M-grammar, but there are three basic differences between the syntax for the compiler and that presented in chapter 4.1:

(1) The grammar for the compiler recognizes, or accepts, any string of tokens that is fed to it. If the program is incorrect it will print out error messages (though for technical reasons these are not given line numbers). Error recovery is attempted in two ways. At the expression level an attempt is made to insert symbols to repair the text. At a statement level, tokens will be skipped until the occurrence of ";", "END", "ELSE" or "FI".

(2) Most ambiguities in the syntax have been removed to reduce backtracking, though it has not necessarily been reduced to LL/1 form. Examples are the use of factoring in expressions and the use of brackets, which in the original syntax were used to introduce integer expressions or relations.

(3) Context-sensitive checks have been 'moved back' so that they only check modes, without causing backtracking over the text. An example is the mode of primaries, where the dereferencing cannot be carried out immediately. They are also made 'complete' in that they will never fail (although they may cause some backtracking).

In order to provide a basis for a parser-generator, we need to examine general methods for turning M-grammars into efficient parsers. Here we will consider only top-down methods of analysis. The most flexible currently in use is the condition for a 'one-track' grammar (see Bornat (1979)). The algorithm for this is, in outline, as follows:

For each production in the grammar, construct a list of all the symbols that can possibly start the production (using Warshall's Closure Algorithm and including the non-terminal itself). Then if the lists for each non-terminal have no repeated symbols, the grammar is 'one-track'. This method will work with M-grammars which do not have variables as initial terminal symbols (and can be extended to those of the form @a & Letter(a)).
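The one-track test can be sketched concretely. In this illustrative Python version (the grammar encoding - quoted strings for terminals, lists of symbols for productions - is an assumption of the sketch), the starter sets are closed iteratively in the spirit of Warshall's algorithm and then checked for disjointness across each non-terminal's alternatives.

```python
# Sketch of the one-track check: compute starter sets by closure,
# then require the alternatives of each nonterminal to have
# pairwise-disjoint terminal starter sets.

def starters(grammar):
    # grammar: {nonterminal: [production, ...]}, production = symbol list.
    begins = {nt: set(p[0] for p in prods if p)
              for nt, prods in grammar.items()}
    changed = True
    while changed:                       # transitive closure of 'can begin with'
        changed = False
        for nt, syms in begins.items():
            for s in list(syms):
                if s in begins and not begins[s] <= syms:
                    syms |= begins[s]
                    changed = True
    return begins

def one_track(grammar):
    starts = starters(grammar)
    for nt, prods in grammar.items():
        seen = set()
        for p in prods:
            first = starts.get(p[0], {p[0]}) if p else set()
            terminals = {s for s in first if s not in grammar}
            if terminals & seen:         # two alternatives share a starter
                return False
            seen |= terminals
    return True
```

A failure report from such a check is exactly the diagnostic the text envisages: it points at the non-terminal whose alternatives clash, leaving the grammar writer to factor or transform.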

There are several transformations that can be performed automatically to improve a grammar which does not meet these requirements (see Foster (1968)). These include factoring and removal of left-recursion. Unfortunately there are no procedures that are guaranteed to produce a one-track grammar. The most that a system can do is to point out where a grammar fails to meet the criteria (though this is obviously better than missing them altogether).

We need to define for a grammar the conditions under which the parameters of the grammar will not interfere with the parsing process. To do this we must follow the example of Attribute Grammars (Knuth 68) in distinguishing "inherited" and "synthesised" attributes. These are more often called "input" and "output", respectively, in Prolog. For each non-terminal, we must distinguish which parameters are synthesised and which are inherited.

Given these definitions we may now distinguish 'defining' and 'applied' occurrences of a variable in the clause, as follows: A defining occurrence of a variable is an occurrence as a term of an inherited attribute on the left hand side of a production, or a synthesised attribute on the right hand side. An applied occurrence of a variable is an occurrence at the base level of a synthesised attribute on the left hand side, or an inherited attribute on the right hand side. For each variable there must be exactly one defining occurrence in each production. In addition, constant and function symbols may only occur in applied positions.

Given these conditions, we may then be sure that the grammar is in fact context-free. We must exclude from this process the "total" conditions discussed earlier.

These conditions are rather more strict than those envisaged by Knuth and correspond closely to those used by Bochmann (1976) and Watt and Madsen (1977). They are however less strict than those used by Koster (1971a) in Affix Grammars, which insist that a defining position on the right hand side must be the first r.h.s. occurrence.

We have now reached a stage at which, at least as far as the syntax analysis is concerned, we can parse an M-grammar without backtracking. What are the benefits of this?

(1) We can envisage using a deterministic form of Prolog for parsing, such as those based on pushdown automata. As has been pointed out by several authors recently (e.g. Bruynooghe 1980, Warren 1980, Mellish 1980), substantial economies both in space and time may be made, while maintaining all the advantages of resolution in handling data structures. For instance, by separating local and global variables, local variables can be entirely deleted on exit from a procedure, so that a more Algol-like regime is encountered. Also the size of the local stack frame is reduced.

(2) Instead of having to read in the whole program first, and then pass it between each procedure, one can envisage a more direct form of coroutining in which the recognition of a terminal symbol leads to the direct reading of the next symbol, including the next calling of the lexical analysis phase.


Automatic Code Generation

An idealised description of code generation might be the following:

One starts with two formal descriptions - one of source language and one of the target language. The semantics of both of these languages are expressed in a common notation.

Given a program in the source language, it is then a task of plan-formation (see Cattell 1978, Warren 1974) to find a program in the target language which achieves the goals expressed by the source program.

To take a very simple example, an assignment statement:

x := y

will probably need to be composed of at least two steps in the target language:

LOAD y
STORE x

(where we assume that problems of storage allocation have already been dealt with). This is a small plan composed of two steps which must be deduced by composition from the semantics of the two instructions. Many other plans to achieve this effect are possible, and we therefore wish to choose the optimal (or near-optimal) solution each time.
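Once such a plan has been found and stored as a template, applying it is trivial. The following sketch shows only the template-application step, for the simple variable-to-variable assignment above; the tree encoding and function name are illustrative assumptions, and the planning that would justify the template is not shown.

```python
# Sketch: applying a fixed LOAD/STORE template for a one-accumulator
# machine to the abstract-syntax tree of a simple assignment.

def codegen(tree):
    # tree: ('assign', destination, source), both simple variables.
    op, dest, source = tree
    assert op == "assign"
    return ["LOAD " + source, "STORE " + dest]
```

Note the asymmetry discussed below: after this plan the accumulator still holds the value of the source variable, a side effect the source program never mentioned.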

Clearly this is an infeasible design for a compiler for various reasons: (1) the same code segments are used over and over again and it is stupid to generate them every time; (2) local and global optimisation get completely confused and the strategies for both are rather different; (3) plan-formation and optimisation are both essentially combinatorial in nature, so that they are impracticable on a large scale.

We should note in addition that translation is not essentially a symmetric task. If a program in language a is translated into language b it is very unlikely that the effect will be precisely the same. For instance, in the above example of an assignment, the target language version results also in the previous value of y being kept in the accumulator. The reverse process of "decompilation" might or might not need to express this fact depending on whether that value is used again. In the presence of jumps and procedure calls it is necessary to do it anyway, for safety. The implication of this is that there will be in general no 1:1 correspondence between two languages.

The method of using a code generator envisaged by Cattell is as follows:

(1) The semantics of source and target language is expressed in some intermediate form. He defines a language called TCOL which is a Tree-based Computer-Orientated Language very close to the abstract syntax used here.

(2) Plan and search techniques are used to generate a series of templates in which the elements of the source language are each represented in the target language. More than one template may be supplied for each to allow for special cases, arranged so that the optimum occurs first.

(3) Code generation takes place by matching the portions of the tree generated for a source language program against the templates and recursively matching subtrees.

This may be illustrated in pictorial form as follows:

165 Towards a Logic Compiler-Compiler

[Diagram: the source-language syntax drives a syntax analyser over the program text; the source-language semantics produces source code trees; code formation, using the code segments, yields target code trees via the target-language semantics; the target-language syntax then drives an assembler to produce the output.]

Fig. 5/2/1

One interesting aspect of Colmerauer's work is the "back-to-back" use of two grammars: the first is used to analyse the source text, and the second to generate the assembler code. The intermediary part between the two is what we have described as the 'abstract syntax', which Colmerauer (following Chomsky) calls the 'deep structure'. This dual use of grammars provides a very neat and convenient way of describing a compiler.

Take for example the generation of arithmetic expressions in ASPLE. The clauses for this are as follows:

Arithexp(a.b.c, v, t) => & Simple(c);
    Arithexp(b, v, t);
    @Code(e.d) & Op(a,e) & /.
Arithexp(a.b.c, v, t) => & Simple(b);
    Arithexp(c, v, t);
    @Code(e.d) & Op(a,e) & /.
Arithexp(a.b.c, v, t) => Arithexp(b, v, t) & t=TEMP.d.t1;
    @Code(STOR.d);
    Arithexp(c, v, t1);
    @Code(e.d) & Op(a,e) & /.
Arithexp(a, v, t) => LoadSimple(a, v, t).

The first parameter is a triple representing the arithmetic expression, with the operator first (and the other two parameters are considered below). In the first two cases, at least one of the operands is 'Simple', i.e. it can be accessed in a single memory reference for all instructions. Thus the output consists of the code to load the first operand into the accumulator (by a recursive call to Arithexp), followed by the code to perform the operation. The '@' specifies a terminal symbol of the grammar, which is a function symbol 'Code' whose parameter represents the operation.

In the second clause the expression is commuted (all operators in ASPLE are commutative). The third clause uses a temporary location to store the result of the first part of the expression. The third parameter 't' is treated as a (compile-time) stack of temporary locations. When compiling the subexpression 'c', the location addressed by 'd' is unavailable. On exit from the whole expression it will again become available. At the start of compilation 't' is an unbound variable; at the end, a number of pairs 'TEMP.d' will be bound to it, which will allow the correct space to be allocated.

In a similar way, the parameter 'v' is used to allocate variables. Since all variables are allocated the same space there is no need to pass the dictionary from the syntax analysis phase. Space is only allocated if variables are used.

The semantics of the machine instructions may be represented by very low level tree expressions. This may be illustrated by those needed for the machine which is used for the ASPLE compiler, shown in Fig. 5/2/2.

Machine Instruction    Tree Representation

LOAD n                 Ass.Ac.(Con.M.n)
STOR n                 Ass.(Con.M.n).Ac
ADD n                  Ass.Ac.(+.Ac.(Con.M.n))
SUB n                  Ass.Ac.(-.Ac.(Con.M.n))
MULT n                 Ass.Ac.(*.Ac.(Con.M.n))
JEQ n                  If.(Eq.Ac.0).Ass.Pc.n
JNZ n                  If.(Ne.Ac.0).Ass.Pc.n
JMP n                  Ass.Pc.n
LDI n                  Ass.Ac.(Con.M.(Con.M.n))
STI n                  Ass.(Con.M.(Con.M.n)).Ac

In this tree structure there are operators and memory descriptors. 'Ass.x.y' is the assignment x:=y. 'Con.M.n' is the contents of primary memory (M) location n. 'Ac' is the accumulator and 'Pc' the program counter. Most of the other operators are self-evident.
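To make the dotted tree notation concrete, the following is an illustrative sketch (in Python, not part of the thesis) encoding a few of the instruction semantics above as nested tuples; the constructor names load, stor, add and jeq are purely hypothetical.

```python
# Each M-grammar tree such as Ass.Ac.(Con.M.n) is modelled here as a
# nested tuple, with the operator first and its operands following.

def load(n):
    """LOAD n: Ac := contents of memory location n."""
    return ('Ass', 'Ac', ('Con', 'M', n))

def stor(n):
    """STOR n: memory location n := Ac."""
    return ('Ass', ('Con', 'M', n), 'Ac')

def add(n):
    """ADD n: Ac := Ac + contents of memory location n."""
    return ('Ass', 'Ac', ('+', 'Ac', ('Con', 'M', n)))

def jeq(n):
    """JEQ n: if Ac = 0 then Pc := n."""
    return ('If', ('Eq', 'Ac', 0), ('Ass', 'Pc', n))

print(load(5))  # ('Ass', 'Ac', ('Con', 'M', 5))
```

Because the trees are plain data, the equivalence axioms discussed below can be applied to them as ordinary rewriting on tuples.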

The problem of the code-generator-generator is therefore to produce the clauses used in the compiler, such as Arithexp, that perform the mapping between the abstract syntax describing the source language and that describing the target machine. These may be regarded as templates of greater or lesser generality to be used by the code-generator.

The generation of the code templates depends on a set of equivalence axioms used as rewriting rules between expressions (which we express in the normal notation, though they are obviously not M-grammar rules). Most of these are standard arithmetic, relational and boolean axioms. Several, though, are particular to the code generation problem, and will be discussed briefly.


(1) Fetch/Store Decomposition

E1(E2) => D := E2; E1(D)

D1 := E => D2 := E; D1 := D2

In these rules E stands for an arbitrary expression and D for a storage location. The first expresses the idea that a subexpression (E2) may be calculated and stored; the second is a special case of this for assignment.

(2) Side effects

S; D:=E => S             (if D is a temporary type)
S; D:=E => Alloc(D); S   (if D is a general type)

These deal with ignoring the side effect of a statement S which assigns to a storage location D. This may be ignored either if the storage location is classed as temporary (which includes for instance the carry bit of a processor) or if it is a general type (which includes registers) and that location has been preallocated. Allocation is a common task in compiling which may include saving some other value.

(3) Sequencing semantics

If E Then S => If NOT E Then Goto L; S; L:

Goto L => Pc := L

These are examples of transformations which can be made to simplify control structures of the source language. Jumps on the target machine are expressed in terms of assignments to the program counter (Pc).
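The lowering of control structures can be sketched as a small rewriting function. This is an illustrative Python sketch, not the thesis's mechanism; the statement tags ('if', 'goto', 'ifnot_goto', 'label') and the function lower are hypothetical names.

```python
import itertools

# Fresh label supply for the L introduced by the conditional rule.
_labels = itertools.count(1)

def lower(stmt):
    """Rewrite a statement into jump form, following rules (3)."""
    if stmt[0] == 'if':
        # If E Then S  =>  If NOT E Then Goto L; S; L:
        _, e, s = stmt
        label = f'L{next(_labels)}'
        return [('ifnot_goto', e, label)] + lower(s) + [('label', label)]
    if stmt[0] == 'goto':
        # Goto L  =>  Pc := L
        return [('assign', 'Pc', stmt[1])]
    return [stmt]  # anything else is treated as already primitive
```

On the target machine the 'ifnot_goto' form would itself become a conditional assignment to Pc, matching instructions such as JEQ and JNZ in the table above.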

It is to be noted that the fetch/store decomposition axioms are the only axioms to introduce side effects, and the side effect axioms the only ones which resolve unwanted side effects. Hence it is possible to deal with these in special ways, and the planning process is simplified so that it is similar to algebraic simplification. With each action in the target language is associated a space and time cost, and therefore all solutions within predefined bounds may be found by search and the least expensive chosen.

An example will show the sort of complexity that is involved on real machines, rather than the idealised machine used in the ASPLE compiler. Suppose we wish to produce code for subtracting the contents of the accumulator on a PDP-8. This may be represented:

Ac := (D - Ac)

The relevant instructions available on the PDP-8 are

Instruction   Effect

TAD x         Ass.Ac.+.Ac.Con.M.x
CMA           Ass.Ac.NOT.Ac   (one's complement)
IAC           Ass.Ac.+.Ac.1

Thus the PDP-8 does not have a subtract instruction and it is necessary to use the rewrite rule:

x - y -> x + (-y)

But the PDP-8 does not directly have a negate instruction. We must use the rule for two's complement arithmetic:

-x = (NOT x) + 1

In fact the two instructions CMA and IAC on the PDP-8 may be combined at no extra cost. Using the commutative rule x + y => y + x, we obtain the sequence:

D - Ac -> D + (-Ac) -> (-Ac) + D -> ((NOT Ac) + 1) + D


which may be represented in PDP-8 code as:

CMA
IAC
TAD D
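The derivation above can be replayed mechanically. This is an illustrative Python sketch (not the thesis's planner): each rewrite rule becomes a function on expression tuples, and the hypothetical tags 'neg' and 'not' stand for arithmetic negation and one's complement.

```python
def sub_to_add(e):
    """x - y -> x + (-y): eliminate the missing subtract instruction."""
    op, x, y = e
    assert op == '-'
    return ('+', x, ('neg', y))

def commute(e):
    """x + y -> y + x: the commutative rule."""
    op, x, y = e
    assert op == '+'
    return ('+', y, x)

def neg_to_complement(e):
    """-x -> (NOT x) + 1: two's complement arithmetic."""
    tag, x = e
    assert tag == 'neg'
    return ('+', ('not', x), 1)

e = ('-', 'D', 'Ac')
e = sub_to_add(e)                         # ('+', 'D', ('neg', 'Ac'))
e = commute(e)                            # ('+', ('neg', 'Ac'), 'D')
e = ('+', neg_to_complement(e[1]), e[2])  # ((NOT Ac) + 1) + D
```

The final tree corresponds directly to the instruction sequence CMA; IAC; TAD D.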

The generation of code sequences has in fact been carried out using Prolog by Warren (1974) using a general purpose planning system called 'Warplan'. This system is at least as general as that used by Cattell. It should not be thought that the generation of either the machine descriptions for real machines or the list of transformation axioms is trivial. (There are several points in Cattell's thesis that might be questioned.) However, because it is possible to maintain separate data base descriptions of many machines (as is being done at Carnegie-Mellon) and of programming languages, the idea of a "compiler factory" is several stages nearer to completion.

Chapter

Program Proving and Equivalence

There are two different approaches to semantics that have been described - relational and axiomatic - which both give rise to ways of proving the correctness of programs. We will consider these separately without attempting to assess their relative merits.

Axiomatic Proving Systems

There are two questions that must be settled in order to construct a useful program proving system:

1. What form of assertion language will be used? 2. How will the theorem proving be accomplished?

It is usually assumed that a full first-order logic, complete with quantifiers and a full set of operators, is essential to provide a rich enough language to prove any programs. For many programs this is not the case, and a much simpler notation - such as that provided by Prolog with a few auxiliary notions - is adequate.

For instance, take the program for Factorial given in ASPLE in chapter 4.1. The top-level goal of the proof may be given using the axiomatic semantics presented in chapter 3.2 as:

<- Ax(File(In,N), prog,
      File(Out,N.Fact.NIL) & Factorial(N,Fact)).

where 'prog' is a term standing for the abstract program in ASPLE, and the first and third parameters are the pre- and post-conditions respectively. The values N and Fact represent the data and result of the program; they appear in this case in the input and output files. Although they have the same names, they are of course not related to the program variables in any way. The relation Factorial may be defined by the Prolog clauses:

Factorial(0,1).
Factorial(i+1,fact*(i+1)) <- Factorial(i,fact).
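Read as a problem-reduction definition, the two clauses correspond directly to an ordinary recursive function. A minimal Python rendering (an illustrative translation, not part of the thesis) makes the correspondence explicit:

```python
def factorial(n):
    # Factorial(0,1): the base clause.
    if n == 0:
        return 1
    # Factorial(i+1, fact*(i+1)) <- Factorial(i, fact), taking i = n-1:
    # the result for n is the result for n-1 multiplied by n.
    return factorial(n - 1) * n

print(factorial(5))  # 120
```

The clause heads play the role of patterns on the argument, exactly as the case analysis on n does here.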

Let us sketch informally the proof of the section of the program which is the heart of the algorithm. To do so we will omit most of the syntactic formalities in order to concentrate on the method. The section of program is:

if (n ≠ 0)
then
    while (i ≠ n)
    do
        i := i + 1;
        fact := fact * i;
    end
fi;

At this point, the goal that must be proved when working from the end (and after handling the output file) is

<- Ax(p, Cond(n≠0, loop), Factorial(n,fact)).

where p is the precondition of the loop and 'loop' is the text of the loop. When we match this with the conditional axiom:

Ax(p, Cond(exp,s1,s2), q) <- Ax(p & ~exp, s2, q)
                           & Ax(p & exp, s1, q).

we get two new goals:


<- Ax(p & ~(n≠0), NIL, Factorial(n,fact))
 & Ax(p & n≠0, loop, Factorial(n,fact)).

The first goal is easily solved using the axiom for the empty statement NIL and the first clause for Factorial, resulting in the necessary precondition p of fact=1. The second goal may be matched with the clause:

Ax(p, while(b,s), p & ~b) <- Ax(p & b, s, p).

giving a new goal:

<- Ax(Factorial(i,fact) & i≠n,
      (i:=i+1; fact:=fact*i),
      Factorial(i,fact)).

by using the postcondition ~(i≠n), or i=n, to substitute for n in the Factorial relation. If we now apply the axioms of assignment and sequence we can show:

Ax(Factorial(i+1,fact*(i+1)),
   (i:=i+1; fact:=fact*i),
   Factorial(i,fact)).

By invoking the rule of consequence:

Ax(p, s, q) <- Ax(r, s, q) & Demonstrate(p -> r).

we are left with the goal:

<- Demonstrate(Factorial(i,fact) -> Factorial(i+1,fact*(i+1))).

which is just a restatement of the original Prolog clause.
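The assignment and sequence steps used above amount to backward substitution through the program. The following is an illustrative Python sketch (not the thesis's proof system) that treats predicates as strings and the assignment axiom as word-boundary substitution; the helper names wp_assign and wp_seq are hypothetical.

```python
import re

def wp_assign(var, expr, post):
    """Assignment axiom: the precondition of 'var := expr' w.r.t. post
    is post with every occurrence of var replaced by expr."""
    return re.sub(rf'\b{var}\b', f'({expr})', post)

def wp_seq(stmts, post):
    """Sequence rule: work backwards through the assignments."""
    for var, expr in reversed(stmts):
        post = wp_assign(var, expr, post)
    return post

# Precondition of (i:=i+1; fact:=fact*i) w.r.t. Factorial(i,fact):
p = wp_seq([('i', 'i+1'), ('fact', 'fact*i')], 'Factorial(i,fact)')
print(p)  # Factorial((i+1),(fact*(i+1)))
```

Up to bracketing, this is exactly the precondition Factorial(i+1, fact*(i+1)) derived in the proof.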


Two lessons may be drawn from this example:

(1) We are using the relation 'Ax' to derive the preconditions from the postconditions, which we then attempt to establish. We are in fact using it as a predicate transformer.

(2) The problem reduction pattern of Prolog programs is in fact very suitable for this type of argument. In many cases it is possible to present the clauses in such a way as to simplify the proof. This can also make the specification easier to understand for the user, which reduces another potential source of error.

We are not arguing that a full version of first-order logic is unnecessary - there are many examples which could disprove this. Rather we are suggesting that it is better to start with a clausal base and add more features into the system only when necessary. A parallel may be drawn with natural language understanding using Prolog. Explicit quantifiers have been successfully introduced (Colmerauer 1978, Warren, Pereira 1981) as additional and controlled features.

Program Equivalence

Another approach to the development of efficient and correct programs lies in the area of program transformation. This is a very live research area in the development of Prolog programs (e.g. see Hogger (1979), Clark, Darlington (1979)) as well as the use of recursion equations (Burstall, Darlington 1975).


The application of this thesis lies in a related area - that of showing the equivalence of Algol-like languages, for which we wish to argue that the relational semantics provides an adequate framework. This area of research arose out of Ianov's program schemas, which were extended by Luckham, Park and Paterson (1970). Most of the recent work has been in the context of denotational semantics (e.g. Stoy (1977), de Bakker (1980)).

Let us take a trivial example to illustrate this: the associativity of the sequencing operator. Given any (straight-line) statements x, y, z we wish to show that:

x ; (y ; z) = (x ; y) ; z.

We will assume that the language we are dealing with is the ASPLE language described in Ch. 4.1, using the relational interpretation of sequence, not the continuation semantics used in Ch. 4.3, and that the sole effect of statements is on the states, or environments.

In relational form, what we wish to prove is:

For all a,b:

Semantics(x;(y;z), a, b) <-> Semantics((x;y);z, a, b).

The definition of the sequence operator is:

Semantics(u;v, s1, s3) <- Semantics(u, s1, s2)
                        & Semantics(v, s2, s3).

This is the only clause in the definition that matches a term containing ';'. Using a closed world definition, it is possible to replace the 'if' by 'if and only if', written as '<->'.


The left hand side of what we want to prove may be expanded to:

Semantics(x;(y;z), a, b)
  <-> Semantics(x, a, s1) & Semantics(y;z, s1, b)
  <-> Semantics(x, a, s1) & Semantics(y, s1, s2)
    & Semantics(z, s2, b).

Similarly the right hand side may be expanded:

Semantics((x;y);z, a, b)
  <-> Semantics(x;y, a, t2) & Semantics(z, t2, b)
  <-> Semantics(x, a, t1) & Semantics(y, t1, t2)
    & Semantics(z, t2, b).

To complete the proof of equivalence we must show that s1 = t1 and s2 = t2. To prove this we must show that for any statement u and initial state v there is only one final state w such that Semantics(u,v,w) is true. This is equivalent to a proof of functionality, which may be achieved by showing that only one clause matches each definition at all levels. From this we can deduce that s1=t1 and hence s2=t2.
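When Semantics is functional, sequencing is just composition of state transformers, and the associativity argument can be checked on concrete states. This is an illustrative Python sketch under that assumption (deterministic statements, states as dictionaries); the tags 'seq' and 'assign' are hypothetical.

```python
def semantics(stmt, state):
    """Semantics(u;v, s1, s3) <- Semantics(u,s1,s2) & Semantics(v,s2,s3),
    read as a function: run u, then run v on the intermediate state."""
    if stmt[0] == 'seq':
        _, u, v = stmt
        return semantics(v, semantics(u, state))
    _, var, f = stmt          # ('assign', var, function of the state)
    new = dict(state)
    new[var] = f(state)
    return new

# Three arbitrary straight-line statements.
x = ('assign', 'a', lambda s: s['a'] + 1)
y = ('assign', 'b', lambda s: s['a'] * 2)
z = ('assign', 'c', lambda s: s['b'] - s['a'])

s0 = {'a': 1, 'b': 0, 'c': 0}
left  = semantics(('seq', x, ('seq', y, z)), s0)  # x;(y;z)
right = semantics(('seq', ('seq', x, y), z), s0)  # (x;y);z
print(left == right)  # True
```

The sketch checks one instance, of course; the proof in the text establishes the equivalence for all states by uniqueness of the intermediate states.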

More complicated examples require induction. An example is to show that:

while a do b

is equivalent to

if a then b; while a do b end fi


To prove this requires the use of the fixpoint property in the elaboration of the while loop. As yet little work has been done in this area. However, the treatment of states in the relational semantics appears to offer a promising entry.

References

Aho, A.V. (1980): Translator Writing Systems: where do they now stand? Computer 13/8, 9-14.
Aho, A.V., Ullman, J.D. (1972): The Theory of Parsing, Translation and Compiling. (2 vols). Prentice-Hall.
Aho, A.V., Ullman, J.D. (1977): Principles of Compiler Design. Addison-Wesley.
Anderson, E.R., Belz, F.C. (1978): Issues in Formal Specification of Programming Languages. In Neuhold (1978), 1-30.
Apt, K.R., van Emden, M.H. (1980): Contributions to the Theory of Logic Programming. Res. Rep. CS-80-12, Dept. of Comp. Sc., University of Waterloo, Ontario.
de Bakker, J.W. (1976): Semantics and the Foundations of Program Proving. Rep. IW 62/76, Mathematisch Centrum, Amsterdam.
de Bakker, J.W. (1980): Mathematical Theory of Program Correctness. Prentice-Hall.
Battani, G., Meloni, H. (1973): Interpréteur du Langage de Programmation PROLOG. Rapport de DEA, Groupe d'Intelligence Artificielle, UER de Luminy, Université d'Aix-Marseille.
Bjorner, D. (1977): Programming Languages: Formal Development of Interpreters and Compilers. Int. Comp. Symp. 1977, 1-22.
Bjorner, D., Jones, C. (1978): The Vienna Definition Method: The Metalanguage. Springer-Verlag.
Bochmann, G.V. (1976): Semantic Evaluation from Left to Right. CACM 19, 55-62.
Bornat, R. (1979): Understanding & Writing Compilers. MacMillan.
Boyer, R.S., Moore, J.S. (1972): The Sharing of Structure in Theorem Proving Programs. In Meltzer, Michie (1972), 101-116.

Boyer, R.S., Moore, J.S. (1979): A Computational Logic. Academic Press.
Brady, J.M. (1977): The Theory of Computer Science, a Programming Approach. Chapman & Hall, p.252.
Brooker, R.A., MacCallum, I.R., Morris, D., Rohl, J.S. (1963): The Compiler Compiler. Ann. Rev. Automatic Prog. 3, 229-275.
Bruynooghe, M. (1980): The Memory Management of PROLOG Implementations. Logic Programming Workshop, Debrecen, 12-20.
Burstall, R.M. (1969): Formal Description of Program Structure and Semantics in First Order Logic. In Meltzer, Michie (1969), 79-98.
Burstall, R.M. (1969b): Proving Properties of Programs by Structural Induction. Comp. J. 12, 41-48.
Burstall, R.M., Darlington, J. (1975): A Transformation System for Developing Recursive Programs. JACM 24/1, 44-67.
Burstall, R.M., Goguen, J.A. (1977): Putting Theories Together to Make Specifications. IJCAI 5, 1045-1058.
Cattell, R.G. (1978): Formalization and Automatic Generation of Code Generators. Ph.D. Thesis, TR 78-115, Carnegie-Mellon U.
de Chastellier, G., Colmerauer, A. (1969): W-grammar. Proc. 24th ACM National Conference, New York, 511-517.
Chomsky, N. (1956): Three Models for the Description of Language. IEEE Trans. Information Theory 2/3, 113-124.
Chomsky, N. (1959): On Certain Formal Properties of Grammars. Information & Control 2/2, 137-167.
Clark, K.L. (1978): Negation as Failure. In Gallaire, Minker (1978), 293-322.
Clark, K.L. (1977): Synthesis and Verification of Logic Programs. Res. Rep., Dept. of Computing, Imperial College, London.
Clark, K.L., McCabe, F. (1979): Control Facilities of IC-Prolog. In Expert Systems in the Micro-Electronic Age, ed. D. Michie. Edinburgh Univ. Press.
Clark, K.L., Tarnlund, S-A. (1977): A First Order Theory of Data and Programs. IFIP 77, North-Holland, 939-944.
Cleaveland, J., Uzgalis, R. (1976): Grammars for Programming Languages. American Elsevier.
Clint, M., Hoare, C.A.R. (1972): Program Proving: Jumps and Functions. Acta Informatica 1/3, 214-224.
Colmerauer, A. (1978): Metamorphosis Grammars. In Natural Language Communication with Computers. Springer-Verlag, 33-189. (Originally issued 1975 as internal report, Université d'Aix-Marseille.)
Dijkstra, E.W. (1975): Guarded Commands, Nondeterminacy and the Formal Derivation of Programs. CACM 18/8, 453-457.
Dijkstra, E.W. (1976): A Discipline of Programming. Prentice-Hall.
Donahue, J.E. (1976): Complementary Definitions of Programming Language Semantics. Springer-Verlag.
van Emden, M.H., Kowalski, R.A. (1976): The Semantics of Predicate Logic as a Programming Language. JACM 23, 733-742.
van Emden, M.H., Maibaum, T.S.E. (1980): Equations Compared with Clauses for Specification of Abstract Data Types. Dept. of Comp. Sc., University of Waterloo, Ontario.
Engeler, E. (Ed) (1971): Symposium on Semantics of Algorithmic Languages. Springer-Verlag.
Floyd, R.W. (1967): Assigning Meanings to Programs. Proc. Symp. Appl. Maths, Amer. Math. Soc., 19-32.
Foster, J.M. (1968): A Syntax Improving Device. Comp. J. 11/1, 31-34.
Gallaire, H., Minker, J. (Eds) (1978): Logic and Databases. Plenum Press.
Gödel, K. (1930): Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatsh. Math. Phys. 37, 349-360. (See J. van Heijenoort (1967): From Frege to Gödel.)
Goguen, J.A., Thatcher, J.W., Wagner, E.G., Wright, J.B. (1977): Initial Algebra Semantics and Continuous Algebras. JACM 24/1, 68-95.
Gordon, M.J.C. (1979): The Denotational Description of Programming Languages. Springer-Verlag.
Gordon, M.J., Milner, R., Wadsworth, C.P. (1979): Edinburgh LCF. Springer-Verlag.
Guttag, J.V., Horowitz, E., Musser, D.R. (1978): The Design of Data Type Specifications. In Yeh (ed): Current Trends in Programming Methodology, Vol 4. Prentice-Hall.
Hoare, C.A.R. (1969): An Axiomatic Basis for Computer Programming. CACM 12/12, 576-580.
Hoare, C.A.R. (1971): Procedures and Parameters: an Axiomatic Approach. In Engeler (1971), 102-116.
Hoare, C.A.R. (1974): Programming Correctness Proofs. In Formal Aspects of Computing Science. Newcastle Univ., 7-46.
Hoare, C.A.R., Lauer, P.E. (1974): Consistent and Complementary Formal Theories of Programming Languages. Acta Informatica 3, 135-153.
Hoare, C.A.R., Wirth, N. (1973): Axiomatic Definition of the Programming Language Pascal. Acta Informatica 2, 335-355.
Hogger, C.J. (1979): Derivation of Logic Programs. Ph.D. Thesis, Imperial College, London.
Hopcroft, J.E., Ullman, J.D. (1969): Formal Languages and their Relation to Automata. Addison-Wesley.
Irons, E.T. (1961): A Syntax Directed Compiler for Algol 60. CACM 4, 51-55.
Jazayeri, M., Ogden, W.F., Rounds, W.C. (1975): The Intrinsically Exponential Complexity of the Circularity Problem for Attribute Grammars. CACM 18/12, 697-706.
Kennedy, K., Warren, S.K. (1976): Automatic Generation of Efficient Evaluators for Attribute Grammars. In 3rd ACM Symp. on Principles of Programming Languages, 32-49.


Knuth, D.E. (1968): Semantics of Context-free Languages. Maths. Systems Theory 2, 127-145. Corrections 5 (1971), 95-96.
Knuth, D.E. (1971): Examples of Formal Semantics. In Engeler (1971), 212-235.
Koster, C.H.A. (1971a): Affix Grammars. In Algol 68 Implementation (ed. J.E.L. Peck). North-Holland, 95-110.
Koster, C.H.A. (1971b): A Compiler Compiler. MR 127/71, Mathematisch Centrum, Amsterdam.
Kowalski, R.A. (1974a): Predicate Logic as Programming Language. IFIP 74, 569-574.
Kowalski, R.A. (1974b): A Proof Procedure using Connection Graphs. JACM 22, 572-595.
Kowalski, R.A. (1978): Logic for Data Description. In Gallaire, Minker (1978), 77-102.
Kowalski, R.A. (1979a): Logic for Problem Solving. North-Holland.
Kowalski, R.A. (1979b): Algorithm = Logic + Control. CACM 22/7, 424-436.
Landin, P.J. (1964): The Mechanical Evaluation of Expressions. Comp. J. 6/4.
Landin, P.J. (1965): A Correspondence between Algol 60 and Church's Lambda Notation. CACM 8/2, 89-101; 8/3, 158-165.
Landin, P.J. (1966): A Formal Description of Algol 60. In Steel (1966), 266-294.
Lauer, P.E. (1968): Formal Definition of Algol 60. T.R. 25.088, IBM Laboratory, Vienna.
Lecarme, O., Bochmann, G.V. (1974): A (truly) Usable and Portable Compiler Writing System. IFIP 74, 218-221.
Ledgard, H.F. (1977): Production Systems. IEEE Transactions on Software Engineering, Apr.
Leverett, B.W. et al (1980): An Overview of the Production Quality Compiler Compiler Project. Computer 13/8, 38-49.
Lewis, P.M., Rosencrantz, D.J., Stearns, R.E. (1974): Attributed Translations. J. Comp. Sys. Sc. 9, 279-307.
Lucas, P. (1968): Two Constructive Realisations of the Block Concept and their Equivalence. TR 25.085, IBM Laboratory, Vienna.
Lucas, P. (1972): On the Semantics of Programming Languages. In Rustin (1972), 41-58.
Lucas, P., Walk, K. (1969): On the Formal Definition of PL/1. Ann. Rev. Automatic Programming 6/3, 105-182.
Luckham, D.C., Park, D.M.R., Paterson, M.S. (1970): On Formalised Computer Programs. J. Comp. Sys. Sc. 4/3, 220-249.
McCarthy, J. (1962): Towards a Mathematical Science of Computation. In Information Processing 1962. North-Holland, 21-28.
McCarthy, J. (1963): A Basis for a Mathematical Theory of Computation. In Braffort & Hirschberg (Eds): Computer Programming and Formal Systems. North-Holland, 33-70.
McCarthy, J. (1966): A Formal Description of a Subset of Algol 60. In Steel (1966), 1-12.
McKeeman, W.M., Horning, J.J., Wortman, D.B. (1970): A Compiler Generator. Prentice-Hall.
Manna, Z. (1974): Mathematical Theory of Computation. McGraw-Hill.
Manna, Z., Vuillemin, J. (1972): The Fixpoint Approach to the Theory of Computation. CACM 15/7, 528-536.
Manna, Z., Waldinger, R. (1976): Is "sometime" sometimes better than "always"? Memo A.I.M. 281, Stanford A.I. Laboratory.
Marcotty, M., Ledgard, H.F., Bochmann, G.V. (1976): A Sampler of Formal Definitions. Computing Surveys 8/2, 191-276.
Mellish, C. (1980): An Alternative to Structure Sharing in the Implementation of Prolog. Logic Programming Workshop, Budapest, 21-32.
Meltzer, B., Michie, D. (Eds) (1969): Machine Intelligence 5. Edinburgh Univ. Press.
Meltzer, B., Michie, D. (Eds) (1972): Machine Intelligence 7. Edinburgh Univ. Press.

Meltzer, B., Michie, D. (Eds) (1979): Machine Intelligence 9. Edinburgh Univ. Press.
Moss, C.D.S. (1977): The Relationship between Hoare's Axiomatic Semantics and Plan Formation Studies. M.Sc. Thesis, Imperial College, London.
Moss, C.D.S. (1979): A New Grammar for Algol 68. DOC 79/6, Imperial College, London.
Moss, C.D.S. (1980): A Formal Definition of ASPLE using Predicate Logic. DOC 80/18, Imperial College, London.
Mosses, P. (1974): The Mathematical Semantics of Algol 60. Tech. Monograph 12, Oxford Univ. Comp. Lab.
Mosses, P. (1978): SIS, A Compiler-Generator System using Denotational Semantics. (Reference Manual). Dept. of Computer Science, Aarhus Univ., Denmark.
Neuhold, E.J. (Ed) (1978): Formal Description of Programming Concepts. North-Holland.
Park, D. (1969): Fixpoint Induction and Proofs of Program Semantics. In Meltzer, Michie (1969), 59-78.
Pereira, F. (1980): Extraposition Grammars. Logic Programming Workshop, Budapest, 231-242.
Pereira, F.L.N., Warren, D.H.D. (1978): Definite Clause Grammars compared with Augmented Transition Networks. DAI Rep. 58, Univ. of Edinburgh.
Popplestone, R.J. (1979): Relational Programming. In Meltzer, Michie (1979), 3-25.
Roberts, G.M. (1977): An Implementation of Prolog. Master's Thesis, Computer Sc. Dept., Univ. of Waterloo, Canada.
Robinson, J.A. (1965): A Machine Oriented Logic based on the Resolution Principle. JACM 12, 23-41.
Robinson, J.A. (1979): Logic: Form and Function. Edinburgh U.P.
Rohl, J.S. (1975): An Introduction to Compiler Writing. McDonald & Jones.
Roussel, P. (1975): Prolog: Manuel de Référence et d'Utilisation. Groupe d'Intelligence Artificielle, Univ. d'Aix-Marseille Luminy.
Russell, B. (1977): On the Equivalence between Continuation and Stack Semantics. Acta Informatica 8, 113-123.
Rustin, R. (Ed) (1972): Formal Semantics of Programming Languages. Courant Comp. Sc. Symp. Prentice-Hall.
Schwarz, J. (1977): Using Annotations to make Recursion Equations Behave. Research Memo, D.A.I., Univ. of Edinburgh.
Scott, D. (1970): Outline of a Mathematical Theory of Computation. PRG 2, Oxford Univ. Computer Lab.
Scott, D., Strachey, C. (1972): Towards a Mathematical Semantics for Computer Languages. PRG 6, Oxford Univ. Computer Lab.
Simonet, M. (1977): An Attribute Description of a Subset of Algol 68. Proc. Strathclyde Algol 68 Conf., SIGPLAN 12/6, 129-137.
Simonet, M. (1980): Bibliography on Attribute Grammars. SIGPLAN 15/3, 35-44.
Stoy, J. (1977): Denotational Semantics. M.I.T. Press.
Steel, T.B. (Ed) (1966): Formal Language Description Languages. North-Holland.
Strachey, C. (1966): Towards a Formal Semantics. In Steel (1966), 198-220.
Strachey, C., Wadsworth, C.P. (1974): Continuations: a Mathematical Semantics for Handling Full Jumps. PRG 11, Oxford Univ. Comp. Lab.
Tennent, R.D. (1976): The Denotational Semantics of Programming Languages. CACM 19/8, 437-453.
Warren, D.H.D. (1974): Warplan: A System for Generating Plans. Memo 76, DAI, Univ. of Edinburgh.
Warren, D.H.D. (1975): Implementation of an Efficient Predicate Logic Interpreter based on Earley Deduction. Research Proposal, DAI, Univ. of Edinburgh.
Warren, D.H.D. (1977a): Implementing Prolog. Res. Rep. 39, 40, DAI, Univ. of Edinburgh.
Warren, D.H.D. (1977b): Logic Programming and Compiler Writing. DAI Rep. 44, Univ. of Edinburgh.
Warren, D.H.D. (1980): An Improved Prolog Implementation which Optimises Tail Recursion. Logic Programming Workshop, Budapest, 1-11.
Warren, D.H.D., Pereira, L.M., Pereira, F. (1977): Prolog - the Language and its Implementation compared with Lisp.
Watt, D.A. (1974): Analysis Oriented Two-level Grammars. Ph.D. Thesis, Univ. of Glasgow.
Watt, D.A. (1977): The Parsing Problem for Affix Grammars. Acta Informatica, 1-20.
Watt, D.A. (1979): An Extended Attribute Grammar for Pascal. SIGPLAN 14/2, 60-74.
Watt, D.A., Madsen, O.L. (1977): Extended Attribute Grammars. Report 10, Computing Science Dept., Univ. of Glasgow.
Wegner, P. (1971): Data Structure Models for Programming Languages. Proc. Symp. on Data Struct. in Prog. Lang., SIGPLAN 12/8, 109-115.
Wegner, P. (1972): Programming Language Semantics. In Rustin (1972), 149-248.
van Wijngaarden, A. (1966): Recursive Definitions of Syntax and Semantics. In Steel (1966), 13-24.
van Wijngaarden, A., Mailloux, B.J., Peck, J.E.L., Koster, C.H.A. (1969): Report on the Algorithmic Language Algol 68. MR 101, Mathematisch Centrum, Amsterdam.
van Wijngaarden, A., Mailloux, B.J., Peck, J.E.L., Koster, C.H.A., Meertens, L.G.L.T., Fisker, R.G. (1975): Revised Report on the Algorithmic Language Algol 68. Acta Informatica 5, 1-236. (Also Springer-Verlag 1976.)
Wirth, N. (1977): What Can We Do about the Unnecessary Diversity of Notation for Syntactic Definition? CACM 20/11, 822-823.

Appendix A

Algol 68 Subset Defined using M-Grammars

In the productions of the grammar the names of the nonterminals usually specify a context-free grammar. Terminal symbols are in quotes. Parameters which are constants start with a capital, and variables with a lower case letter. Conditions, which do not take part in the production, follow the symbol '&'. An example which demonstrates most features of the syntax is:

Label(env,lab:Label,lab) -> @Id(lab) & Unique(lab,Label,env).

Here, @ indicates a terminal symbol which is itself a function (generated by the lexical syntax), and ':' is used as an infix function symbol in the output as well as being a terminal symbol. The parameters are in two groups, with 'env' in the middle, standing for 'environment', i.e. the set of all declarations available at this point. Parameters before env (absent in the above case) are mostly context sensitive restrictions; those after are the 'output' or static semantics, though in the case of declarations they contribute to the context conditions. Comments are written between '/*' and '*/'. Paragraph numbers correspond to the Algol 68 report (revised edition).

The semantics are also given using production-like statements acting on an abstract syntax tree produced by the syntax, but using '=>' instead of '->'. A statement of the form:


a => b; c.

means intuitively 'to do a, do b followed by c'. This actually translates into Prolog as the statement:

Do(a, s1, s4) <- Do(b, s2, s3) & Continuation(s3, s4).

where s1..s4 are states of the machine of the form

State(stack, heap, continuation).

and s2 is the same as si except that the statement 'c' is added to the 'front' of the continuation. The clauses which define continuation are:

Continuation(State(st,h,a;b), s2) <- Do(a, State(st,h,b), s2).
Continuation(State(st,h,a), s) <- Do(a, State(st,h,ExitBlock), s).
Continuation(State(st,h,STOP), State(st,h,STOP)).

and the initial call is:

<- Do(program,State(NIL,NIL,STOP),final).

Thus intuitively there is a machine which alternately executes 'Do's and 'Continuation's until it stops. In Algol 68, the semantics are given by a production called 'Semantics', which has two parameters - for the construct to be evaluated and the resultant value. The symbol '//' means evaluate collaterally - it is defined by:

a // b -> a ; b | b ; a.
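The alternation of 'Do' and 'Continuation' steps can be sketched as a tiny driver loop. This is an illustrative Python miniature (not the thesis's full Algol 68 machine): statements are opaque atoms that are merely recorded, the continuation is a tuple of pending statements, and a state is (stack, heap, continuation).

```python
STOP = 'STOP'

def do(stmt, state, trace):
    """The 'Do' step: execute one statement (here: just record it),
    then hand control back to the continuation."""
    trace.append(stmt)
    return continuation(state, trace)

def continuation(state, trace):
    """The 'Continuation' step: take the front statement off the
    continuation, or halt with STOP when nothing remains."""
    stack, heap, cont = state
    if cont == STOP or not cont:
        return (stack, heap, STOP)
    return do(cont[0], (stack, heap, cont[1:]), trace)

trace = []
final = continuation(((), (), ('a', 'b', 'c')), trace)
print(trace, final)  # ['a', 'b', 'c'] ((), (), 'STOP')
```

The mutual recursion between do and continuation mirrors the mutual recursion between the Do and Continuation clauses above.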

The sections in the definition are:


2.2  Programs
3    Clauses
3.1  Closed Clauses
3.2  Serial Clauses
3.3  Collateral Clauses
3.4  Choice Clauses
3.5  Loop Clauses
4.1  Declarations
4.2  Mode Declarations
4.3  Priority Declarations
4.4  Identifier Declarations
4.5  Operator Declarations
4.6  Modes
4.8  Identifiers and Bold Symbols
5.1  Units
5.2  Units associated with names
5.4  Routines
5.5  Units associated with values of any mode
8    Denotations


/*----------------------------------------------*/
/*            Programs and Clauses              */
/*----------------------------------------------*/

/* 2.2 The Program */

Program(val) -> ClosedClause(VOID, Range(NIL,NIL), val).

/* The 'Range' above is the 'primal environment'. Each range has two parameters: Range(declarations in this range (including labels), enclosing ranges). */

/* 3.1 Closed Clauses */

ClosedClause(mode,env,val) ->
    Begin(style) SerialClause(mode,env,val) End(style).

Begin(Bold) -> "BEGIN".   /* BEGIN comes in two styles */
End(Bold) -> "END".
Begin(Brief) -> "(".
End(Brief) -> ")".

EnclosedClause(mode,env,val) ->
    ClosedClause(mode,env,val)
  | CollateralClause(mode,env,val)
  | ChoiceClause(mode,env,val)
  | LoopClause(mode,env,val).

/* 3.2 Serial Clauses

Serial Clauses are the basic units of a program. Declarations (decs) may be mixed with statements until the first occurrence of a label (labs) in the block */

SerialClause(mode,env,val) ->
    Series(mode,newenv,decs,labs,stms)
    & Newrange(decs,labs,stms,env,newenv,val).

Series(mode,env,decs,labs,stms1;stms2) ->
    Unit(VOID,env,stms1) ";" Series(mode,env,decs,labs,stms2).
Series(mode,env,decs,labs,stms1;stms2) ->
    Declarations(env,decs1,stms1)
    Series(mode,env,decs2,labs,stms2) & Append(decs1,decs2,decs).
Series(mode,env,NIL,dec.labs,Label(lab);stms) ->
    Label(env,dec,lab) Series(mode,env,NIL,labs,stms).
Series(mode,env,NIL,dec.labs,Exit(stm,lab,stms)) ->
    Unit(mode,env,stm) "EXIT" Label(env,dec,lab)
    Series(mode,env,NIL,labs,stms).
Series(mode,env,NIL,NIL,stms) -> Unit(mode,env,stms).

Label(env,lab:Label,lab) -> @Id(lab) ":" & Unique(lab,Label,env).

/* Newrange constructs the 'Range' used for checking scope restrictions and the 'Block' which is the output form. */

Newrange(decs,labs,stms,env, Range(props,env), Block(stms)) <- Append(decs,labs,props).

/* Blocks construct stacks which have 7 elements in each frame:

   Frame(display of frame levels (including this one),
         value of this frame,
         names of locals,
         values of locals,
         rest of the stack,
         statements in this block,
         previous continuation) */

Semantics(Block(stms),val) => EnterBlock(stms,val).

Do(EnterBlock(stms,val), State(stack,heap,cont),
   State(Frame(display,val,NIL,0.NIL,stack,stms,cont),
         heap, Semantics(stms,val)))
    <- NewDisplay(stack,display).

NewDisplay(Frame(n.a,b,c,d,e,f,g), level.n.a) <- Sum(n,1,level).
NewDisplay(NIL, 1.NIL).

ExitBlock(State(Frame(a,b,c,d,e,stack,cont),heap,f),
          State(stack,heap,cont)).
ExitBlock(State(NIL,a,b), State(NIL,a,STOP)).

Semantics(a;b, val) => Semantics(a,v); Semantics(b,val).

Semantics(Lab(i), NoVal) => NIL.

Semantics(Exit(stm,lab,stms), val) => Semantics(stm, val); ExitBlock.

/* 3.3 Collateral Clauses */

CollateralClause(VOID,env,val) -> Begin(style) JoinedPortrait(mode,env,val) End(style).

JoinedPortrait(mode,env,val1.val2) -> Unit(mode1,env,val1) ( "," JoinedPortrait(mode2,env,val2) & Balances(mode,mode1,mode2) | NIL & mode=mode1 & val2=NIL).

/* The . operator in the semantics specifies collateral evaluation - any action first. The procedure DoCollateral evaluates the answer in any order by taking any of the actions first. The resultant list is still ordered correctly */

Semantics(a1.a2, val) => DoCollateral(a1.a2, val).

DoCollateral(a, v) => Select(a,v,a1,v1,a2,v2); Semantics(a1,v1); DoCollateral(a2,v2). Do(Select(a1.a2, v1.v2, a1, v1, a2, v2), s, s). Do(Select(a1.a2, v1.v2, a3, v3, a1.a4, v1.v4), s, s) <- Do(Select(a2, v2, a3, v3, a4, v4), s, s).

/* 3.4 Choice Clauses

Conditionals (choices) come in two kinds - boolean and integer (the Case clause). The test, or enquiry, part of a conditional is itself a serial clause which may have declarations etc. (but not labels) */

ChoiceClause(mode,env,val) -> Start(kind,style) ChooserClause(kind,style,mode,env,val) Finish(kind,style).


ChooserClause(kind,style,mode,env,val) -> Series(kind,env,decs,NIL,test) AlternateClause(kind,style,mode,newenv,then.else) & NewRange(decs,NIL,If(kind,test,then,else),env,newenv,val).

AlternateClause(kind,style,mode,env,val1.val2) -> In(kind,style) InClause(kind,mode1,env,val1) ( OutClause(kind,style,mode2,env,val2) & Balances(mode,mode1,mode2) | NIL & mode=mode1 & val2=NIL).

InClause(BOOL,mode,env,val) -> SerialClause(mode,env,val). InClause(INT,mode,env,val) -> JoinedPortrait(mode,env,val). OutClause(kind,style,mode,env,val) -> Out(kind,style) SerialClause(mode,env,val) | Again(kind,style) ChooserClause(kind,style,mode,env,val).

/* The words introducing conditionals come in two styles - Bold and Brief. For Bold, there are two kinds, for boolean and integer conditionals. Note that '(' occurs all over the place in Algol 68 */

Start(BOOL,Bold) -> "IF". Start(INT,Bold) -> "CASE". In(BOOL,Bold) -> "THEN". In(INT,Bold) -> "IN". Out(BOOL,Bold) -> "ELSE". Out(INT,Bold) -> "OUT". Again(BOOL,Bold) -> "ELIF". Again(INT,Bold) -> "OUSE".

Finish(BOOL,Bold) -> "FI". Finish(INT,Bold) -> "ESAC".

Start(kind,Brief) -> "(". In(kind,Brief) -> "|". Out(kind,Brief) -> "|".

Again(kind,Brief) -> "|:". Finish(kind,Brief) -> ")".

Semantics(If(BOOL,test,then,else),val) => Semantics(test,Bool(TRUE)); Semantics(then,val) | Semantics(test,Bool(FALSE)); Semantics(else,val). Semantics(If(INT,test,in,out),val) => Semantics(test,Int(int)); Switchon(int,in,out,stm); Semantics(stm,val).

/* The case statement selects the n'th statement of the options, but takes the OUT clause if not */

Do(Switchon(a,b,c,d),s,s) <- Switchon(a,b,c,d). Switchon(1, a.b, c, a). Switchon(n, a.b, c, d) <- Gt(n,1) & Sum(m,1,n) & Switchon(m,b,c,d). Switchon(n, a, b, b) <- Lt(n,1). Switchon(n, NIL, a, a).
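
/* A sketch of the selection, with hypothetical statements s1, s2 and out-part out (not part of the original text): the goals

Switchon(2, s1.s2.NIL, out, s2). Switchon(5, s1.s2.NIL, out, out).

both hold - the second runs off the end of the list and so returns the OUT clause */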

/* 3.5 Loop Clauses

Loops are very flexible - almost any of the preliminary parts may be omitted with sensible defaults applied. In the 'range' part, the introductory word is supplied as the first parameter. */

LoopClause(VOID,env,val) -> ForPart(env1,dec,id) RangePart("FROM",Int(1),env,from) RangePart("BY",Int(1),env,by) RangePart("TO",NIL,env,to) WhilePart(Repeat(do,id,b,off),env1,newenv,whiledo,off) DoPart(newenv,do) & Newrange(dec,Label(l).NIL, (For(from.by.to.NIL,id,b,t);Label(l); Test(id,b,t,whiledo)), env,env1,val).

ForPart(env,dec,id) -> "FOR" DefiningIdentifier(INT,LOC,env,dec,id). ForPart(env,Aleph:INT,Id(Aleph)) -> NIL.

RangePart(kind,default,env,val) -> @kind Unit(INT,env,val). RangePart(kind,default,env,default) -> NIL.

WhilePart(do,env,newenv,val,1) -> "WHILE" Series(BOOL,newenv,decs,NIL,test) & NewRange(decs,NIL,If(BOOL,test,do,SKIP), env,newenv,val). WhilePart(do,env,env,do,0) -> NIL.

DoPart(env,val) -> "DO" SerialClause(VOID,env,val) "OD".

Semantics(For(fbt,id,by,to),x) => Semantics(fbt,from.by.to.NIL); Set(id,from). Semantics(Test(id,by,NIL,whiledo),val) => Semantics(whiledo,val). /* if TO omitted, no test is done */ Semantics(Test(id,by,to,whiledo),NoVal) => Lookup(id,0,from); (Test(from,by,to); Semantics(whiledo,val) | ~Test(from,by,to)). Do(Test(from,by,to),s,s) <- by > 0 & from <= to | by < 0 & from >= to | by = 0. /* always succeeds */ Semantics(Repeat(do,id,by,off),x) => Semantics(do,y); Semantics(Plus(id,by),newval); Reset(id,newval); Jump(Label(l,off)). /* */

/* 4.1 */ /* Declarations */ /* */

/* Several sets of declarations may be given at the same time, interspersed with commas */

Declarations(env,decs,val) -> Declaration(env,dec1,val1) Declarations(env,dec2,val2) & Append(dec1,dec2,decs) & Append(val1,val2,val).

/* There are 6 basic kinds of definition. Several instances of any one may be given together in a 'joined definition' */

Declaration(env,decs,val) -> "MODE" JoinedDefinition(Mode,env,decs,val) | "PRIO" JoinedDefinition(Prio,env,decs,val) | Generator(env,mode,gen) JoinedDefinition(Var(mode,gen),env,decs,val) | Mode(Formal,env,mode) JoinedDefinition(Ident(mode),env,decs,val) | "PROC" JoinedDefinition(Proc,env,decs,val) | "OP" JoinedDefinition(Op,env,decs,val).

JoinedDefinition(kind,env,decs,val) -> Definition(kind,env,dec1,val1) ( "," JoinedDefinition(kind,env,decs2,val2) & Append(dec1.NIL,decs2,decs) & Append(val1.NIL,val2,val) | NIL & decs=dec1.NIL & val=val1.NIL).


/* 4.2 Mode Declarations */

Definition(Mode,env,dec,SKIP) -> DefiningBold(Mode(mode,i),env,dec,id) "=" Mode(Actual,env,mode,i).

/* 4.3 Priority Declarations */

Definition(Prio,env,Prio(tag):num,SKIP) -> DyadicOp(tag) "=" @Integer(num) & Le(0,num) & Le(num,9) & Unique(Prio(tag),num,env).

/* 4.4 Identifier Declarations

Identities have fixed values once they are evaluated. Variables may be initialised on declaration. The form for routine is the abbreviated one - procedure variables are also allowed */

Definition(Ident(mode),env,dec,NewConst(id,val)) -> DefiningIdentifier(mode,env,dec,id) "=" Unit(mode,env,val). Definition(Proc,env,dec,NewConst(id,val)) -> DefiningIdentifier(mode,env,dec,id) "=" RoutineText(mode,env,val). Definition(Var(Ref(mode),gen),env,dec,NewVar(gen,id,val)) -> DefiningIdentifier(Ref(mode),env,dec,id) (":=" Unit(mode,env,val) | NIL & val=SKIP).

/* A new constant is created by evaluating its value and adding a definition of the local name of the constant bound to the value */


Semantics(NewConst(name,source), NoVal) => Semantics(source,val); Set(name,val).

Do(Set(name,val), State(stack,heap,cont), State(newstack,heap,cont)) <- stack=Frame(a,b,names,c,d,e,f) & newstack=Frame(a,b,name.val.names,c,d,e,f).
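
/* For example (a hypothetical frame with no locals yet, not part of the original text), Set extends the name list of the top frame, so the goal

Do(Set(x,Int(3)), State(Frame(a,b,NIL,c,d,e,f),heap,cont), State(Frame(a,b,x.Int(3).NIL,c,d,e,f),heap,cont))

holds, pairing the name with its value at the front of the locals */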

/* A New variable is created by generating a space, either on stack or heap, putting the appropriate value in it, and setting the local name of the variable to point to the location */

Semantics(NewVar(generator,name,source),NoVal) => Semantics(generator,loc) // Semantics(source,val); Update(loc,val); /* see 5.2 */ Set(name,loc).

/* 4.5 Operator Declarations

Operators behave much the same as procedures, but there can be several declarations for the same operator with different modes - but the modes must be 'independent' (see 4.8). */

Definition(Op,env,dec,NewConst(id,val)) -> DefiningDyadicOp(mode,env,dec,id) "=" RoutineText(Proc([p1,p2],m),env,val) | DefiningMonadicOp(mode,env,dec,id) "=" RoutineText(Proc([par],m),env,val).

/* 4.6 Declarers - Modes

The modes below are only a skeleton, as most modes are declared in the standard prelude. Those given represent the ways of combining modes together. INT and BOOL are given as (predefined) bold symbols. The last parameter, n, in Mode is a device to stop circular definitions. */

Mode(Actual,env,mode,S(n)) -> BoldSymbol(Mode(mode,n),env). Mode(kind,env,Ref(mode),0) -> "REF" Mode(Virtual,env,mode,n). Mode(kind,env,Proc(pars,mode),0) -> "PROC" "(" ModeList(env,pars) ")" Mode(Formal,env,mode).

BoldSymbol(Mode(INT,0),env) -> "INT". BoldSymbol(Mode(BOOL,0),env) -> "BOOL".

ModeList(env,mode.modes) -> Mode(Formal,env,mode,n) ( "," ModeList(env,modes) | NIL & modes=NIL).

Definition(Param(mode),env,dec,id) -> DefiningIdentifier(mode,Param,env,dec,id). /* */

/* 4.8 */ /* Identifiers and Bold Symbols */ /* */

/* Defining Identifiers are for variables and constants. The clauses below test that an identifier is only declared once in any range */

DefiningIdentifier(mode,env,tag:mode,tag) -> @Id(tag) & Unique(tag,mode,env).

Unique(tag,mode,Range(decs,env)) <- OneMember(tag,mode,decs).

OneMember(a,b,(a:b).c) <- ~Member(a,d,c). OneMember(a,b,(c:d).e) <- ~ a=c & OneMember(a,b,e).

Member(a,b,a.b.c). Member(a,b,c.d.e) <- Member(a,b,e).

/* Bold Symbols are used for system words, mode and operator names - upper case is the standard. But DefiningBold only applies to mode names */

DefiningBold(mode,env,tag:mode,id) -> @Bold(tag) & Unique(tag,mode,env).

/* Operators are either Bold symbols or certain other combinations established by the lexical analysis phase */

DefiningDyadicOp(mode,env,tag:mode,id) -> ( @Bold(tag) I @Dyad(tag) ) & (Unique(tag,mode,env) I Independent(tag,mode,env)).

DefiningMonadicOp(mode,env,tag:mode,id) -> ( @Bold(tag) I @Monad(tag) ) & (Unique(tag,mode,env) I Independent(tag,mode,env)).

/* The definitions below are for the applied use of the symbols */

Identifier(mode,env,Id(tag,offset)) -> @Id(tag) & Identified(tag,mode,offset,env).


BoldSymbol(mode,env,Id(tag,offset)) -> @Bold(tag) & Identified(tag,mode,offset,env). DyadicOperator(mode,env,Id(tag,offset)) -> (@Bold(tag) | @Dyad(tag)) & Identified(tag,mode,offset,env). MonadicOperator(mode,env,Id(tag,offset)) -> (@Bold(tag) | @Monad(tag)) & Identified(tag,mode,offset,env).

Identified(id,mode,offset,Range(decs,env)) <- (Member(id,mode,decs) & offset=0 | ~Member(id,m,decs) & Identified(id,mode,lev,env) & Sum(lev,1,offset)).

/* The value of an identifier is the location at which it is stored, not the contents of that location */

Semantics(Id(var,offset), val) => Lookup(var,offset,val).

Do(Lookup(var,offset,val), s, s) <- s = State(stack,heap,cont) & Lookup1(var,offset,stack,val). Lookup1(var,0,Frame(a,b,names,c,d,e,f),val) <- Lookup2(var,names,val). Lookup1(var,offset,Frame(display,a,b,c,stack,d,e),val) <- FindFrameNumber(offset,display,num) & FindFrame(num,stack,frame) & Lookup1(var,0,frame,val). Lookup2(var,var.val.rest,val). Lookup2(var,v.w.rest,val) <- Lookup2(var,rest,val).

FindFrameNumber(0,current.rest,current). FindFrameNumber(lev,current.rest,num) <- Sum(next,1,lev) & FindFrameNumber(next,rest,num). FindFrame(num,frame,frame) <- frame = Frame(num.rest,a,b,c,d,e,f). FindFrame(num,Frame(a,b,c,d,stack,e,f),frame) <- FindFrame(num,stack,frame).

/* The Deref operation is never supplied by the programmer explicitly - it derives from the coercions (see 5.1). It has the effect of fetching a value from the store */

Semantics(Deref(exp), val) => Semantics(exp, v); Fetch(v, val).

Do(Fetch(Loc(x,dyn),val), s, s) <- s=State(stack,heap,cont) & FindFrame(dyn,stack,Frame(dyn.r,a,b,local,c,d,e)) & Fetch1(x,local,val). Do(Fetch(Heap(x),val), s, s) <- s=State(stack,heap,cont) & Fetch1(x,heap,val). Fetch1(n,n.val.rest,val). Fetch1(n,m.this.rest,val) <- Sum(k,1,m) & Fetch1(n,k.rest,val).
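
/* A sketch of the fetch (an assumed store layout, not part of the original text): with two values stored and the counter 2 at the front, most recent first, the goals

Fetch1(2, 2.Int(7).Int(4).NIL, Int(7)). Fetch1(1, 2.Int(7).Int(4).NIL, Int(4)).

both hold - the counter is decremented as each value is stepped over */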


/* */ /* 5.1 */ /* Units */ /* */

/* Units are the basic 'statements' of the language. They are separated into several categories to determine the grouping of constructs, units being the least binding and primaries the most, and to aid the coercions. Brackets may be used to change the groupings (forming an Enclosed Clause) */

Unit(mode1,env,val1) -> ( Assignment(mode,env,val) | IdentityRelation(mode,env,val) | RoutineText(mode,env,val)) & Coerce(Strong,mode,mode1,val,val1) | Tertiary(mode1,env,val1).

Tertiary(mode1,env,val1) -> ( Jump(mode,env,val) | Skip(mode,env,val) | Formula(p,mode,env,val) | Nihil(mode,env,val)) & Coerce(Strong,mode,mode1,val,val1) | Secondary(mode1,env,val1).

Secondary(mode1,env,val1) -> Generator(mode,env,val) & Coerce(Strong,mode,mode1,val,val1) | Primary(mode1,env,val1).

/* Coercions specify the automatic type conversions that are available. Those given here - VOIDing and Dereferencing - are only two of six possible in the full language */


Coerce(Strong,mode1,mode2,val1,val2) <- Coerce(Meek,mode1,mode2,val1,val2). Coerce(Strong,mode,VOID,val,val).

Primary(mode1,env,val1) -> ( Call(mode,env,val) | Cast(mode,env,val) | Denotation(mode,val) | Identifier(mode,env,val) | EnclosedClause(mode,env,val)) & Coerce(Strong,mode,mode1,val,val1).

Coerce(Meek,mode,mode,val,val). Coerce(Meek,Ref(mode1),mode2,val1,Deref(val2)) <- Coerce(Meek,mode1,mode2,val1,val2).

/* 5.2 Units associated with names */

Assignment(mode,env,Asgt(target,val)) -> Primary(Ref(mode),env,target) ":=" Unit(mode,env,val).

/* The target of an assignment yields a location - which may be on stack or heap - and the value of the expression is inserted in this and also returned as the value of the assignment */

Semantics(Asgt(target,exp),val) => Semantics(target,store) // Semantics(exp,val); Update(store,val).

Do(Update(Loc(n,dyn),val), State(stack,heap,cont), State(newstack,heap,cont)) <- Update1(n,dyn,val,stack,newstack).

Do(Update(Heap(n),val), State(stack,heap,cont), State(stack,newheap,cont)) <- Update2(n,val,heap,newheap). Update1(n,dyn,val,Frame(dyn.r,a,b,locs,c,d,e), Frame(dyn.r,a,b,new,c,d,e)) <- Update2(n,val,locs,new). Update1(n,dyn,val,Frame(a,b,c,d,stack,e,f), Frame(a,b,c,d,newstack,e,f)) <- Update1(n,dyn,val,stack,newstack). Update2(n,val,n.old.rest,n.val.rest). Update2(n,val,m.item.rest,m.item.new) <- Sum(k,1,m) & Update2(n,val,k.rest,k.new).
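
/* For illustration (not part of the original text), assuming the same layout of a counter followed by the stored values, most recent first: the goals

Update2(2, Int(9), 2.Int(7).Int(4).NIL, 2.Int(9).Int(4).NIL). Update2(1, Int(9), 2.Int(7).Int(4).NIL, 2.Int(7).Int(9).NIL).

both hold - only the addressed value is replaced */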

/* An identity relation tests whether the names of two objects are the same, not their values */

IdentityRelation(BOOL,env,Ident(":=:",val1,val2)) -> Tertiary(Ref(mode),env,val1) ":=:" Tertiary(Ref(mode),env,val2).

Semantics(Ident(kind,arg1,arg2),Bool(val)) => Semantics(arg1, val1) // Semantics(arg2, val2); Ident(kind,val1,val2,val).

Do(Ident(a,b,c,d),s,s) <- Ident(a,b,c,d). Ident(":=:", x, x, TRUE). Ident(":=:", x, y, FALSE) <- Ne(x,y). Ident(":/=:", x, y, TRUE) <- Ne(x,y). Ident(":/=:", x, x, FALSE).

Generator(env,Ref(mode),val) -> (("LOC" | NIL) & val=LOC(mode) | "HEAP" & val=HEAP(mode)) Mode(Actual,env,mode,n).


/* A local generator creates a new location on the local stack of the appropriate size for the mode and returns a pointer to it, which is a pair, a position in the local stack and the dynamic level */

Semantics(LOC(mode), location) => Generate(LOC(mode), location).

Do(Generate(LOC(mode), Loc(locn,dyn)), State(stack,heap,cont), State(newstack,heap,cont)) <- stack=Frame(dyn.r,a,b,n.locs,c,d,e) & newstack=Frame(dyn.r,a,b,locn.mode.locs,c,d,e) & Sum(n,1,locn).

/* A heap generator creates a new space on the heap of the appropriate size and returns a pointer to it, which is the number of the location */

Semantics(HEAP(mode), location) => Generate(HEAP(mode),location).

Do(Generate(HEAP(mode), Heap(locn)), State(stack,n.heap,cont), State(stack,locn.mode.heap,cont)) <- Sum(n,1,locn).

Nihil(Ref(mode),env,NIL) -> "NIL".

Semantics(NIL, NIL) => NIL.


/* 5.4 Routines

A routine text may be used anywhere that a procedure name is used, e.g. as an actual parameter of a procedure, as well as the normal use in procedure declarations */

RoutineText(Proc(parmode,mode),env,Routine(pars,body)) -> "(" ParameterDeclaration(env,parmode,pars) ")" ( Mode(env,mode) | "VOID" & mode=VOID ) Unit(mode,Range(pars,env),body) & Level(env,lev) & Sum(lev,1,new).

ParameterDeclaration(env,decs,pars) -> Mode(env,mode) JoinedDefinition(Param(mode),env,decs1,pars1) ( "," ParameterDeclaration(env,decs2,pars2) & Append(decs1,decs2,decs) & Append(pars1,pars2,pars) | NIL & decs=decs1 & pars=pars1).

/* The evaluation of a routine forms a 'closure' which records the local environment. Hence a procedure used as an actual parameter knows where to get non-local variables */

Semantics(Routine(pars,body), Closure(pars,body,display)) => CurrentDisplay(display).

Do(CurrentDisplay(display), s, s) <- s = State(Frame(display,a,b,c,d,e,f),heap,cont).

/* Ten levels of expression (formula) are available - with 0 as the least binding. Level 10 is for monadic operators */


Formula(priority,mode,env,Call(val1,lh.rh.NIL)) -> Operand(l,mode1,env,lh) DyadicOperator(Proc([mode1,mode2],mode),env,val1) Operand(r,mode2,env,rh) & LE(priority,l) & LT(priority,r) & LE(1,priority) & LE(priority,9). Formula(10,mode,env,Call(val1,opnd)) -> MonadicOperator(Proc([mode1],mode),env,val1) Operand(10,mode1,env,opnd).

Operand(priority,mode,env,val) -> Formula(priority,mode1,env,val1) & StronglyCoerce(mode1,mode,val1,val). Operand(10,mode,env,val) -> Unit(mode,env,val).

/* A call invokes a procedure - parameterless procedures are omitted here */

Call(mode,env,val) -> Primary(Proc(pars,mode),env,proc) "(" ParameterList(pars,env,vals) ")" & val = Call(proc,vals).

ParameterList(mode.modes,env,val.vals) -> Parameter(mode,env,val) ( "," ParameterList(modes,env,vals) | NIL & modes=NIL & vals=NIL).

Parameter(mode,env,val) -> Unit(mode,env,val).

Semantics(Call(routine,pars),val) => Semantics(routine,closure) // Semantics(pars,actual); EnterProc(closure,actual,val).


/* Parameter passing is handled by the same mechanisms as used in identity declarations (4.4). SetList associates parameters with values */

Do(EnterProc(Closure(formal,body,display),actual,val), State(stack,heap,cont), State(Frame(dyn.display,val,NIL,0.NIL,stack,body,cont),heap, SetList(actual,formal); Semantics(body,val))) <- NewLevel(stack,dyn).

SetList(NIL,NIL) => NIL. SetList(actual.rest1,formal.rest2) => Set(formal,actual); SetList(rest1,rest2).

NewLevel(Frame(n.r,a,b,c,d,e,f),new) <- Sum(n,1,new).

/* The range of jumps is limited by the syntax - one cannot jump into a conditional for instance. However one can jump out of almost anything, an expression, a block etc. */

Jump(mode,env,Goto(label,offset)) -> ("GOTO" | "GO" "TO") @Id(label) & Label(label,offset,env).

Label(tag,0,Range(decs,env)) <- Member(tag,Label,decs). Label(tag,offset,Range(decs,env)) <- ~Member(tag,Label,decs) & Label(tag,lev,env) & Sum(lev,1,offset).

/* The semantics of Jumps is handled by continuations - at each stage the remaining expressions are pushed on the continuation stack and pulled off one at a time at the completion of the stage. A local jump throws away the continuation and finds the label. A non-local jump may also throw away the continuations of higher levels */

Semantics(Goto(label,offset), x) => Jump(label,offset).

Do(Jump(label,0), State(stack,heap,cont), State(stack,heap,Semantics(newcont,val))) <- stack=Frame(a,val,b,c,d,stms,e) & FindCont(label,stms,newcont). Do(Jump(label,offset), State(stack,heap,cont), State(frame,heap,Semantics(newcont,val))) <- FindFrame(offset,stack,frame) & frame=Frame(a,val,b,c,d,stms,e) & FindCont(label,stms,newcont).

FindCont(label,Lab(label);stm, stm). FindCont(label,a;b, c) <- FindCont(label,b,c). FindCont(label,Exit(stm,label,stms), stms). FindCont(label,Exit(stm,lab,stms), cont) <- FindCont(label,stms,cont).
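
/* As a sketch (a hypothetical statement list, not part of the original text): for the list s0;(Lab(l);s1) the goal

FindCont(l, s0;(Lab(l);s1), s1)

holds - everything before the label is discarded and s1 becomes the new continuation */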

/* 5.5 Units associated with values of any mode */

Cast(mode1,env,val1) -> Mode(Formal,env,mode,n) EnclosedClause(mode,env,val) & StronglyCoerce(mode,mode1,val,val1).

Skip(mode,env,SKIP) -> "SKIP".

Semantics(SKIP, NoVal) => NIL.


/* 8 Denotations

Denotations - or constant symbols - are given in the lexical syntax */

Denotation(BOOL,Bool(val)) -> @Bool(val). Denotation(INT,Int(val)) -> @Int(val).

/* The values of denotations are themselves */

Semantics(Int(x), Int(x)) => NIL.

Semantics(Bool(x), Bool(x)) => NIL.

Appendix B

An ASPLE Compiler

COMPILE(file) <- LEX(' ',file,a) & SYNTAX(b,a,NIL) & SYNTHESIS(b,c,NIL) & ASSEMBLY(c,d,NIL) & PRINTING(0,d).

/* LEXICAL SCAN */ /* */

LEX(' ',f,a) <- READCH(ch,f) & / & LEX(ch,f,a) | a=NIL. LEX(ch,f,a.b) <- LETTER(ch) & WORD(ch,f,a,n) & / & LEX(n,f,b) | DIGIT(ch) & NUMBER(ch,f,a,n) & / & LEX(n,f,b). LEX(':',f,a.b) <- READCH(ch,f) & ( ch = '=' & a=":=" | a=ch) & READCH(n,f) & LEX(n,f,b). LEX(ch,f,(ch.NIL).b) <- READCH(n,f) & LEX(n,f,b). LEX(c,f,NIL).

WORD(first,f,word,n) <- READCH(ch,f) & RESTWORD(ch,f,rest,n) & SYSTEM(first.rest,word). RESTWORD(ch,f,ch.rest,n) <- ALPHANUM(ch) & / & READCH(next,f) & RESTWORD(next,f,rest,n). RESTWORD(ch,f,NIL,ch). NUMBER(n,f,NUM(num),next) <- READCH(ch,f) & (DIGIT(ch) & PROD(n,10,m) & SUM(m,ch,p) & / & NUMBER(p,f,NUM(num),next) | LETTER(ch) & COMPLAIN(ch.' FOLLOWS '.n) & FAIL | ch=next & n=num).


ALPHANUM(c) <- LETTER(c) | DIGIT(c). SYSTEM(a,a) <- RESERVED(a) & /. SYSTEM(a,ID(a)).

RESERVED("BEGIN"). RESERVED("BOOL"). RESERVED("DO"). RESERVED("ELSE"). RESERVED("END"). RESERVED("FALSE"). RESERVED("IF"). RESERVED("INPUT"). RESERVED("INT"). RESERVED("OR"). RESERVED("OUTPUT"). RESERVED("REF"). RESERVED("THEN"). RESERVED("TRUE"). RESERVED("WHILE"). RESERVED("FI").
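
/* For example (not part of the original text), SYSTEM passes reserved words through unchanged and wraps any other word as an identifier, so the goals

SYSTEM("IF", "IF"). SYSTEM("X1", ID("X1")).

both hold - the first by the RESERVED table, the second by the default clause */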

/* SYNTAX */ /* */

SYNTAX(tree) => CHECK("BEGIN"); DCLTRAIN(mem) & CHECKIDLIST(mem); STMTRAIN(mem, tree); CHECK("END"). DCLTRAIN(env) => DECLARATION(env0) & /; CHECK(";"); (DCLTRAIN(env1) | & env1=NIL) & APPEND(env0, env1, env).

DECLARATION(env) => MODE(m); (IDLIST(REF(m), env) & / | ERROR("?",' NOT IDENT')).

MODE(INT) => "INT". MODE(BOOL) => "BOOL". MODE(REF(m)) => "REF"; MODE(m).

IDLIST(m, LOC(tag,m).env) => @ID(tag); ( ","; IDLIST(m, env) | & env = NIL).

STMTRAIN(env, sem.sems) => STM(env, sem); RESTSTM(env,sems).

RESTSTM(env,stm) => ";" & /; STMTRAIN(env,stm). RESTSTM(env,NIL) => NEXT(a) & ENDTRAIN(a) & /. RESTSTM(env,stm) => @a & COMPLAIN(' INVALID SYMBOL:'.a); RESTSTM(env,stm). ENDTRAIN("END"). ENDTRAIN("FI"). ENDTRAIN("ELSE").

STM(env, ASGT(tag, exp)) => IDENTIFIER(REF(m), env, tag) & /; CHECK(":="); EXP(m1,env,exp1) & DEREF(m1,m,exp1,exp). STM(env, COND(exp, s1, s2)) => "IF" & /; EXP(m,env,exp1) & DEREF(m,BOOL,exp1,exp); CHECK("THEN"); STMTRAIN(env, s1); ("FI" & s2 = NIL | "ELSE"; STMTRAIN(env, s2); CHECK("FI") | & COMPLAIN('FI OR ELSE EXPECTED')). STM(env, WHILE(exp, s)) => "WHILE" & /; EXP(m,env,exp1) & DEREF(m,BOOL,exp1,exp); CHECK("DO"); STMTRAIN(env, s); CHECK("END"). STM(env, INPUT(exp)) => "INPUT" & /; EXP(m1, env,exp1) & DEREF(m1,REF(m),exp1,exp) & INTBOOL(m). STM(env, OUTPUT(exp)) => "OUTPUT" & /; EXP(m1, env,exp1) & DEREF(m1,m,exp1,exp) & INTBOOL(m). STM(env,NIL) => NIL.

EXP(m, env, exp) => FACTOR(m1, env, lh); RESTEXP(m, env, m1, lh, exp). RESTEXP(m,env,m1,lh1,exp) => "+" & / & DEREF(m1,m2,lh1,lh) & INTBOOL(m2); FACTOR(m3, env, rh1) & DEREF(m3,m2,rh1,rh); RESTEXP(m,env,m2,op.lh.rh,exp) & PLUSOP(m,op). RESTEXP(m,env,m,exp,exp) => NIL. FACTOR(m,env,exp) => PRIMARY(m1, env, lh); RESTFACTOR(m, env, m1, lh, exp). RESTFACTOR(m,env,m1,lh1,exp) => "*" & / & DEREF(m1,m2,lh1,lh) & INTBOOL(m2); PRIMARY(m3,env,rh1) & DEREF(m3,m2,rh1,rh); RESTFACTOR(m,env,m2,op.lh.rh,exp) & MULOP(m,op). RESTFACTOR(m,env,m,exp,exp) => NIL. PRIMARY(m,env,exp) => IDENTIFIER(m, env, exp) | "("; COMPARE(m, env, exp); ")" | DENOTATION(m, exp).

COMPARE(m, env, exp) => EXP(m1, env, exp1); RESTCOMP(m, env, m1, exp1, exp). RESTCOMP(BOOL,env,m1,lh1,op.lh.rh) => RELOP(op) & / & DEREF(m1,INT,lh1,lh); EXP(m2,env,rh2) & DEREF(m2,INT,rh2,rh). RESTCOMP(m,env,m,exp,exp) => NIL. RELOP(EQ) => "=". RELOP(NE) => "$".

IDENTIFIER(mode, env, ID(tag)) => @ID(tag) & (MEMBER(LOC(tag, mode), env) & / | COMPLAIN(tag.' NOT DECLARED')).

DENOTATION(BOOL, VAL(0)) => "FALSE". DENOTATION(BOOL, VAL(1)) => "TRUE". DENOTATION(INT, VAL(val)) => @NUM(val).

/* ERROR ROUTINES */ ERROR(stop,message) => GET(stop,string) & COMPLAIN('?'.message.string). COMPLAIN(message) <- WRITEST('?'.message) & NEWLINE. CHECK(token) => @token & /. CHECK(token) => NEXT(a) & COMPLAIN(token.' INSERTED BEFORE '.a). NEXT(c,c.r,c.r). CHECKEQ(a,a) <- /. CHECKEQ(a,b) <- COMPLAIN(a.' SHOULD BE '.b). GET(stop,NIL) => @stop & /. GET(stop,NIL) => ";" & /. GET(stop,a.b) => @a; GET(stop,b).


WRITEST(a.b) <- WRITECH(a) & / & WRITEST(b). WRITEST(a) <- WRITECH(a).

CHECKIDLIST(dec.env) <- CHECK1(dec,env) & CHECKIDLIST(env). CHECKIDLIST(NIL). CHECK1(LOC(tag,mode),env) <- MEMBER(LOC(tag,m),env) & / & WRITEST('? IDENTIFIER DECLARED TWICE:'.tag). CHECK1(dec,env).

INTBOOL(INT) <- /. INTBOOL(BOOL) <- /. INTBOOL(m) <- COMPLAIN(m.' SHOULD BE INT OR BOOL').

DEREFERENCE(mode, mode, exp, exp). DEREFERENCE(REF(mode), mode1, tag, DEREF(exp)) <- DEREFERENCE(mode, mode1, tag, exp). DEREFERENCE(mode,reqd,tag,tag) <- WRITEST('? WRONG MODE, '.tag.' SHOULD BE '.reqd).

APPEND(u.v, w, u.x) <- APPEND(v, w, x). APPEND(NIL, x, x).

MEMBER(x, x.y). MEMBER(x, y.z) <- ~ x=y & MEMBER(x, z).

PLUSOP(INT,PLUS). PLUSOP(BOOL,OR). MULOP(INT,TIMES). MULOP(BOOL,AND).

/* SYNTHESIS */ /* */

SYNTHESIS(prog) => GEN(prog,temps,vars); ALLOCATE(temps); ALLOCATE(vars).

GEN(a.b, t,v) => GEN(a,t,v); GEN(b,t,v).


GEN(NIL, t,v) => NIL.

GEN(ASGT(n,e),t,v) => LOADVAL(e,v,t) & ADR(n,v,a); @CODE(STORE.a) & /.

GEN(COND(b,s,NIL),t,v) => IFNOTGO(b,l1,v,t); GEN(s,t,v); @LABEL(l1) & /.

GEN(COND(b,s1,s2),t,v) => IFNOTGO(b,l1,v,t); GEN(s1,t,v); @CODE(GOTO.l2); @LABEL(l1); GEN(s2,t,v); @LABEL(l2) & /.

GEN(WHILE(b,s),t,v) => @LABEL(l1); IFNOTGO(b,l2,v,t); GEN(s,t,v); @CODE(GOTO.l1); @LABEL(l2).

GEN(INPUT(ID(x)),t,v) => @CODE(INPUT); @CODE(STORE.y) & ADR(x,v,y).

GEN(INPUT(exp),t,v) => GENEXP(exp,t,v); @CODE(LDI); @CODE(INPUT); @CODE(STI.0).

GEN(OUTPUT(exp),t,v) => GENEXP(exp,t,v); @CODE(OUTPUT).

LOADVAL(a,v,t) => LOADSIMPLE(a,v,t) & /. LOADVAL(a.b.c,v,t) => ARITHEXP(a.b.c,v,t) & /. LOADVAL(a,v,t) => IFNOTGO(a,l1,v,t); LOADSIMPLE(VAL(TRUE),v,t); @CODE(GOTO.l2); @LABEL(l1); LOADSIMPLE(VAL(FALSE),v,t); @LABEL(l2).

LOADSIMPLE(DEREF(a),v,t) => LOADEREF(a,v,t) & /. LOADSIMPLE(ID(a),v,t) => @CODE(LOAD.b) & ADR(ID(a),v,c) & ADR(CONST(c),v,b) & /. LOADSIMPLE(VAL(a),v,t) => @CODE(LOAD.b) & ADR(CONST(a),v,b) & /. LOADEREF(ID(a),v,t) => @CODE(LOAD.b) & ADR(a,v,b) & /. LOADEREF(DEREF(ID(a)),v,t) => @CODE(LDI.b) & ADR(a,v,b) & /. LOADEREF(DEREF(a),v,t) => LOADEREF(a,v,t); @CODE(LDI.0) & /.

IFNOTGO(EQ.a.b,l,v,t) => ARITHEXP(MINUS.a.b,v,t); @CODE(JNE.l). IFNOTGO(NE.a.b,l,v,t) => ARITHEXP(MINUS.a.b,v,t); @CODE(JEQ.l). IFNOTGO(AND.a.b,l,v,t) => IFNOTGO(a,l,v,t); IFNOTGO(b,l,v,t). IFNOTGO(OR.a.b,l,v,t) => IFNOTGO(a,l1,v,t); @CODE(GOTO.l2); @LABEL(l1); IFNOTGO(b,l,v,t); @LABEL(l2). IFNOTGO(DEREF(a),l,v,t) => LOADVAL(DEREF(a),v,t); @CODE(JEQ.l).


IFNOTGO(VAL(TRUE),l,v,t) => NIL. IFNOTGO(VAL(FALSE),l,v,t) => @CODE(GOTO.l).

ARITHEXP(a.b.c,v,t) => SIMPLE(c) & ADR(c,v,d); ARITHEXP(b,v,t); @CODE(e.d) & OP(a,e) & /. ARITHEXP(a.b.c,v,t) => SIMPLE(b) & ADR(b,v,d); ARITHEXP(c,v,t); @CODE(e.d) & OP(a,e) & /. ARITHEXP(a.b.c,v,t) => ARITHEXP(b,v,t) & t=TEMP.d.t1; @CODE(STOR.d); ARITHEXP(c,v,t1); @CODE(e.d) & OP(a,e) & /. ARITHEXP(a,v,t) => LOADSIMPLE(a,v,t).

SIMPLE(VAL(x)). /* SIMPLE OPERANDS CAN BE LOADED BY ANY OP */ SIMPLE(ID(x)). SIMPLE(DEREF(ID(x))).

OP(PLUS, ADD). OP(TIMES, MUL). OP(MINUS, SUB). /* MACHINE OP NAMES */

ADR(id,(id.adr).rest,adr). /* FIND VALUE OF VAR OR CONSTANT */ ADR(id,(b.c).rest,adr) <- ADR(id,rest,adr).

ALLOCATE(a.b) => @a; ALLOCATE(b). /* SPACE FOR TEMPS AND VARS */ ALLOCATE(NIL) => NIL.

/* ASSEMBLY */ /* */

ASS(LAB(adr).rest,out,adr) <- ASS(rest,out,adr) & /. ASS(other.rest,other.out,adr) <- SUM(adr,1,next) & ASS(rest,out,next). ASS(NIL,NIL,adr). /* ASSEMBLY ALLOCATES NAMES OF LABELS */


/* OUTPUT OF RESULTS */ /* */

PRINT(a.b,n) <- WRITECH(n) & WRITECH(' ') & PRINT1(a) & NEWLINE & SUM(n,1,m) & / & PRINT(b,m). PRINT(NIL,n) <- NEWLINE & NEWLINE.

PRINT1(a.b) <- WRITECH(a) & WRITECH(' ') & WRITECH(b) & /. PRINT1(a) <- WRITECH(a).

Appendix C

Converting Metamorphosis Grammars to Prolog

/* These operator declarations allow the system to perform the basic parsing of an M-grammar */

OP('=>', RL, 10). OP(';', RL, 25). OP(&, PREFIX, 30). OP(@, PREFIX, 60).

/* The grammar converts an MG tree into a PROLOG tree in 2 passes. The first leaves possible 'NIL' branches, which the second removes. To use, call LOADG(filename), or LOADW(filename) to get clauses listed. */

LOADG(f) <- READ(a,f) & MG(a,b) & ADDAX(b) & FAIL. LOADG(f). LOADW(f) <- READ(a,f) & MG(a,b) & ADDAX(b) & WRITE(b) & FAIL. LOADW(f).

MG(v => w, x) <- / & MG1(v => w, x). MG(v, v).

/* MG1 does the top level translation */ MG1(v => w, u) <- z = s0.s.NIL & MG2(v, x, z) & MG3(w, y, z, T) & MG4(x <- y, u) & /. MG1(v, x) <- WRITECH(' TRANSLATION FAILURE: ') & WRITE(v) & FAIL.

/* MG2 converts a LHS term */ MG2(u, v, x) <- ATOM(u) & CONS(u.x, v) & /. MG2(u, v, x) <- CONS(w, u) & MGAPP(w, x, y) & CONS(y, v).

222 Metamorphosis Grammars in Prolog

/* MG3 converts an RHS - the last parameter is T normally, or F if the descendant of a | operator */

MG3(u.v, x, y, z) <- / & MG5(z, u.v, x, y).

MG3(u|v, w|x, y, z) <- / & MG3(u, w, y, F) & MG3(v, x, y, F).

MG3(u;v, w&x, s0.s.y, z) <- / & MG3(u, w, s0.s1.y, z) & MG3(v, x, s1.s.y, z).

MG3(u&v, w&v, y, z) <- / & MG3(u, w, y, z).

MG3(@u, x, y, z) <- / & MG5(z, u, x, y).

MG3(~u, x, y, z) <- / & MG6(z, ~u, x, y).

MG3(NIL, x, y, z) <- / & MG6(z, NIL, x, y).

MG3(&u, x, y, z) <- / & MG6(z, u, x, y).

MG3(u, v, w, z) <- MG2(u, v, w).

/* MG4 gets rid of NIL productions */ MG4(NIL, NIL) <- /.

MG4(u & NIL, v) <- / & MG4(u, v). MG4(NIL & u, v) <- / & MG4(u, v).

MG4((u & v) & w, x) <- / & MG4(u & v & w, x).

MG4(u & v, x & y) <- / & MG4(u, x) & MG4(v, y).

MG4(u | v, w | x) <- / & MG4(u, w) & MG4(v, x).

MG4(u <- v, w) <- / & MG4(v, x) & (x = NIL & w = u | w = (u <- x)).

MG4(u, u).

/* MG5 sorts out terminal symbols with non-terms */

MG5(T, x, NIL, (x.s).s.t). MG5(F, x, s0=(x.s), s0.s.t).

/* MG6 does the same for empty productions */

MG6(T, x, x, s.s.t). MG6(F, x, x & s0=s, s0.s.t).

MGAPP(u.v, w, u.x) <- MGAPP(v, w, x). MGAPP(NIL, u, u).
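
/* As a sketch of the overall effect (a hypothetical rule, not from the thesis), the grammar rule

P(x) => Q(x); "A".

is translated by the clauses above into

P(x,s0,s) <- Q(x,s0,"A".s).

the usual difference-list encoding: Q consumes the front of the input string and the terminal "A" is then removed from what remains */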
