A Provably Correct Generator

Jens Palsberg palsberg@daimi, aau. dk

Computer Science Department, Aarhus University Ny Munkegade, DK-8000 Aarhus C, Denmark

Abstract the target language from being an independent product, for general use. We have designed, implemented, and proved the This paper addresses the use of independent, correctness of a compiler generator that accepts realistic target languages without type informa- action semantic descriptions of imperative pro- tion in the semantics. The paper also concerns gramming languages. The generated the possibility of proving correctness of a com- emit absolute code for an abstract RISC machine piler generator, thus making correctness proof a language that currently is assembled into code for once-and-for-all effort. the SPAKC and the HP Precision Architecture. We have overcome these problems. We have Our marline language needs no run-time type- designed, implemented, and proved the correct- checking and is thus more realistic than those ness of a compiler generator, called Cantor, that considered in previous compiler proofs. We use accepts action semantic descriptions of program- solely algebraic specifications; proofs are given in ming languages. The generated compilers emit the initial model. absolute code for an abstract KISC [57] ma- chine language without run-time type-checking. The considered subset of action notation, see ap- 1 Introduction pendix A, is suitable for describing imperative programming languages featuring: The previous approaches to proving correctness of compilers for non-trivial languages all use tar- 9Complicated control flow; get code with run-time type-checking. The fol- lowing semantic rule is typical for these target 9Block structure; languages: 9Non-recursive abstractions, such as proce- (FIRST : C, (Vl, v2) : S) ~ (C, vl : S) dures and functions; and The rule describes the semantics of an instruc- 9Static typing. tion that extracts the first component of the For an example of a language description that has top-element of the stack, provided that the top- been processed by Cantor, see appendix B. The element is a pair. If not, then it is implicit that abstract RISC machine language can easily be the executor of the target language halts the ex- expanded into code for existing RISC processors. ecution. Hence, the executor has to do run-time Currently, implementations exist for the SPARC type-checking. [25] and the HP Precision Architecture [42]. Run-time type-checking imposes an unwelcome The technique needed 'for managing without penalty on execution time because more work has run-time type-checking in the target language is to be done by the executor of the target language. the following: It may be argued, though, that the executor can rely on the source language being statically type- 9 Definethe relationships between semantic checked, and thus avoid the run-time type-checks. values in the source and target languages This implies an unwelcome coupling of the source with respect to both a type and a machine and target languages, however, which prevents state. 419

Thus, we define an operation which given a tar- Mosses [35, 33, 34]. Unified algebras allows get value V, a machine state M, and a type T the algebraic specification of both abstract data will yield the sort of source values which have types and operational semantics in a way such type T and are represented by V and M. Here, that initial models are guaranteed to exist, except "sort" can be thought of as "set". For example, when axioms contradict constraints, in which an integer can represent a value of type truth- case no models of the specification exist. We have value-list by pointing to a heap where the list demonstrated that also a non-trivial compiler can components are represented. In this case, our op- be elegantly specified using unified algebras. In eration will yield a sort containing precisely that comparison with structural operational seman- truth-value-list, when given the integer, the type tics and natural semantics, we replace inference "truth-value-list", and the heap. rules by Horn clauses. The notational difference In contrast, for example Nielson and Nielson is minor, and only superficial differences appear [40] does not involve the machine state when re- in the proofs of theorems about unified specifica~ lating semantic values. Instead, they require tar- tions. Where Despeyroux [10] could prove lem- get values to be "self-contained". Hence, they mas by induction in the length of inference, we in- need to have several types of target values and a stead adopt an axiomatization of Horn logic and target machine that does run-time type checking. prove lemmas by induction in the number of oc- With our approach we can make do with just currences of "modus ponens" in the proof in the one type of target values, namely integer, thus initial model. avoiding run-time type-checking and getting close This paper gives an overview of the author's to the 32-bit words used in the SPARC. Note that forthcoming PhD thesis [44]. Most definitions we do not insert type tags in the run-time repre- and proofs are omitted. For an overview of our sentations of source values; no type information experiments with generating a compiler for a sub- is present at run-time. set of Ads, see [43]. The relationship between semantic values al- In the following section we examine the major lows the proof of a lemma expressing "code well- previous approaches to compiler generation and behavedness" which is essential when reasoning compiler correctness proofs. In section 3 we out- about executions of compiled code. The required line the structure of the Cantor system, includ- type information is useful during compilation, ing the abstract RISC machine language and the too; it is collected by the compiler in a separate action compiler, and we give some performance pass before the code generation. This pass also measures. In section 4 we state the correctness collects the information needed for generating ab- theorem, and finally in section 5 we survey our solute, rather than relative, code. approach to proving correctness in the absence The development of Cantor was guided by the of run-time type-checking in the target language. following principles: We also discuss why we do not treat recursion. The reader is assumed to be familiar with al- Correctness is more important than effi- gebraic specification [12], compilation of block ciency; and structured languages [64], and the notion of a RISC architecture [57]. Specification and proof must be completed before implementation begins. 2 Previous Work As a result, on the positive side, the Cantor im- plementation was quickly produced, and only a 2.1 Compiler Generation handful of minor errors (that had been overlooked in the proof!) had to be corrected before the sys- The problem of compiler generation is usually ap- tem worked. On the negative side, the generated proached by choosing a particular definition of a compilers emit code that run at least two orders specific target language [46]. The task is then to of magnitude slower than corresponding target write and prove the correctness of a compiler for programs produced by handwritten compilers. a notation for defining source languages. Such a The specification and proof of correctness of compiler can then be composed with a language the Cantor system is an experiment in using definition to yield a correct compiler for the lan- the framework of unified algebras, developed by guage, see figure 1. Compiler generators that 420

It appears that the poor performance charac- Language Source definition teristics of the classical compiler generators do P Intermediate language notation not simply stem from inefficient implementations of lambda notation. Mosses observed that deno- rational semantics intertwine model details with Compiler the semantic description, thus blurring the under- lying conceptual analysis [31]. Pleban and Lee further observed that not only a human reader Target but also an automatic compiler generator will language have difficulty in recovering the underlying anal- ysis [48]. Attempts to recover useful information Figure 1: Semantics-directed compiler genera- from lambda expressions include Schmidt's work tion. on detecting so-called single-threaded store ar- guments and stack single-threaded environment operate in this way are often called semantics- arguments [52, 54], and the binding-time analy- directed compiler generators. The Cantor sys- sis of Nielson and Nielson [41]. Despite that, it tem described in this paper is an example of a seems unlikely that the performance characteris- semantics-directed compiler generator. It accepts tics of compiler generators based on denotational language definitions written in action notation, semantics soon will be improved beyond that of and it outputs compilers that emit code in an existing such systems. abstract RISC machine language. A number of compiler generators have been The traditional approach to compiler genera- built that produce compilers of a quality that tion is based on denotational semantics [53]. Ex- compare well with commercially available com- amples of existing compiler generators based on pilers. Major examples are the system of this idea include Mosses' Semantics Implementa- Schmidt and VSller [55, 56], the compiler gen- tion System (SIS) [29], Paulson's Semantics Pro- erator of Kelsey and Hudak [21], and the Mess cessor (PSP) [45, 46], and Wand's Semantic Pro- system of Pleban and Lee [47, 23, 49, 22]. These totyping System (SPS) [62]. In SIS, the lambda approaches are based on rather ad hoc notations expressions are executed by a direct implemen- for defining languages, and they lack correctness tation of beta-reduction; in PSP and SPS they proofs, like the classical systems. They indicate, are compiled into SECD and Scheme code, re- however, that better performance of the produced spectively. There are no considerations of the compiler is obtained when: possible correctness of either the implementation Some model details are omitted from a lan- of beta-reduction, the translations to SECD or guage definition; and Scheme code, or the implementation of SECD or Scheme. The target programs produced by these The notation for defining languages is bi- systems have been reported to run at least three ased towards %ompilable languages". orders of magnitude slower than corresponding target programs produced by handwritten com- A radically different approach to compiler gen- pilers [22]. eration is taken by Dam and Jensen [9]. They After these systems were built, several trans- consider the use of natural semantics [20] (which lations of lambda notation into other abstract they call "relational semantics") as the basis of a machines have been proved correct. Notable compiler generator. They devise an algorithm for instances are the categorical abstract machine transforming a natural semantic definition into a [8] and the abstract machines that can he de- compiling specification. The algorithm requires rived systematically from an operational seman- a language definition to satisfy some conditions; tics of lambda notation, using Hannan's method it is sufficiently general to apply to a language of [16, 14, 15]. It remains to be demonstrated, how- while-programs, but has not been implemented. ever, if a compiler which incorporates one of them The generated compilers emit code for a stack will be more efficient than the classical systems. machine; the correctness of these compilers has Also, the correctness of implementations of these been sketched, whereas the implementation of the abstract machines has not been considered. stack machine is not considered. 421

Finally, compiler generation can be obtained [65] work, using the Boyer-Moore theorem prover, by self-application of a partial evaluator. The and Joyce's [19, 18] work using the HOL system. Ceres system of Tofte [60] is an early example of In both cases, the target code of the translation is this, demonstrating that even compiler genera- a non-idealized machine-level architecture whose tors can be automatically generated. Ceres uses implementation has been verified with respect a language of flowcharts with an implicit state as to a low level of the computer, see for example the notation for defining source languages. An- [17, 27]. The verification of both architectures other notable partial evaluator is the Similix of has even been automatically checked. These ex- Bondorf and Danvy [5, 6] which treats a subset amples of systems verification [4] are important: of Scheme. Gomard and Jones implemented a they minimize the amount of distrust one need self-applicable partial evaluator, called mix, for have to such a verified system. Of course, one an untyped lambda notation [13]. It has been can still suspect errors in the implementation of used to generate a compiler for a language of the gate-level of the computer, or in the imple- while-programs. The generated compiler emits mentation of the theorem prover, but many other programs in lambda notation. The correctness sources of errors have been eliminated. of this compiler generator has been proved; it re- The use of denotational semantics renders dif- mains to be seen, however, if the partial evalua- ficult the specification of languages with non- tion approach will lead to the generation of com- determinism and parallelism. Such features can pilers for conventional machine architectures. be specified easily, however, by adopting the The lack of correctness proofs for the realistic framework of structural operational semantics compiler generators limits the confidence we can [50]. For a survey of recent work on proving have in a generated compiler. Let us therefore ex- the correctness of compilers for such languages, amine the major previous approaches to compiler see the paper by Gammelga~rd and Nielsen [11], correctness proofs. which also contains a detailed account of the ap- proach taken in the ProCoS project, where the source language considered is Occam2. 2.2 Compiler Correctness Proofs In a special form of structural operational se- The traditional approach to proving compiler mantics, called natural semantics [20], one con- correctness is based on denotational semantics siders only steps from configurations to final [24, 26, 58, 51, 39] or algebraic variations hereof states. When both the source and target lan- [7, 28, 59, 3, 30]. The correctness statement can guages have a natural semantics, then there is be pictured as a commuting diagram, see figure 2. hope for proving the correctness of a compiler using the proof technique of Despeyroux [10]. As with the proof techniques used when dealing with SOU;'ce I compiler [ target denotational semantics, Despeyroux's technique syn ;ax " syntax amounts to giving a proof by induction on the length of a computation. The correctness state- ment is different, though. Instead of proving that a diagram commutes, she proves the validity of source target two properties, which informally can be stated as semantics semantics follows:

s Completeness: if the source program ter- minates, then so does the target program, source [ encode J target ....[ and with the same result; and meanings [ meanings [ I Soundness: if the target program termi- nates, then so does the source program, and Figure 2: Compiler correctness. with the same result.

It has been demonstrated that complete proofs Despeyroux proves the correctness statement by of compiler correctness can be automatically induction in the length of the proofs of the as- checked. Two significant instances are Young's sumptions of these properties. A central lemma 422 states that the code for an expression behaves in some performance measurements of the Cantor a disciplined way. We call this property "code system. well-behavedness". We will use a variation of All specifications in this paper, including those Despeyroux's technique, adapted to the frame- of syntax, are given in Mosses' meta-notation for work of unified algebras, see later. unified algebras [36]. A major deficiency of all the previous ap- proaches to compiler correctness, except that 3.1 An Abstract RISC Machine of Joyce [19, 18], is their using a target lan- guage that performs run-time type-checking, as Language explained above. Joyce considers only a language The machine language is patterned after the of while-programs, and it is not clear how to gen- SPARC architecture; it is called Pseudo SPAR.C. eralize his approach. It contains 14 instructions that operate on the Our concern can be sloganized as follows: following machine state: If "well-typed programs don't go wrong", spare-state = then it should be possible to generate cor- (program, program-counter, was-zero, rect code for an independent, realistic ma- was-negative, globals, windows, memory) . chine language that does not perform run- time type-checking. 'program' is a mapping from linenumbers to in- structions. 'program-counter' is a linenumber, The Cantor system is based on the use of such a and 'was-zero' and 'was-negative' are status-bits machine language. (truth-values). 'globals' models the global reg- isters, and 'windows' models a non-overlapping 3 The Cantor System version of the SPARC register-windows. Finally, 'memory' models six separate "pages" of the main Our compiler generator accepts action semantic memory, as a mapping from page-identifications descriptions. Action semantics is a framework for to pages. A page is a mapping from addresses formal semantics of programming languages, de- (natural numbers) to integers. For example, one veloped by Mosses [31, 32, 33, 37, 36] and Watt of the pages is used as a stack, another as a heap. [38, 63]. It is intended to allow useful semantic The only data manipulated by this language descriptions of realistic programming languages, are integers. This means that it is impossible to and it is compositional, like denotational seman- see from a given data value if it should be thought tics. It differs from denotational semantics, how- of as a pointer to an instruction in the program, ever, in using semantic entities called actions, as an address in the memory, or as modeling a rather than higher-order functions. truth-value, an integer, etc. We have designed a subset of action notation The uniformity of the data values makes the which is amenable to compilation and which we Pseudo SPARC language more realistic than have given a natural semantics, by a systematic those considered in previous compiler proofs. It transformation of its structural operational se- contains two major idealizations, however, as fol- mantics [36]. The syntax of this subset is given in lows: appendix A together with a brief overview of the Unbounded word and memory size: principles behind action semantics. Appendix B The data values are unbounded integers and presents a complete description of a toy program- this requires unbounded word size. We also ming language. (Readers who are unfamiliar with assume that the program and memory sizes, action semantics are expected to understand not the number of of registers in a register win- the details in appendix B, despite the suggestive- dow, and the number of register windows ness of the symbols used. See [36] for a full pre- are unbounded. sentation of action semantics.) The central part of the Cantor system is a com- Read-only code: The program is placed piler from action notation to an abstract RISC separately, not in 'memory'. This implies machine language. This section presents both the that code will not be overwritten, and that machine language and the compiler, and it states data will not be "executed". 423

These idealizations simplify the correctness proof Here, 'global' is one of the global registers, and considerably, without removing any of the diffi- 'return-address' is a user-inaccessible register in culties that we address. the register-window. The use of 'default' mod- els that all registers and memory addresses are initialized to 0 before execution starts. Likewise, Pseudo SPARC Real SPARC the program area contains 'skip' instructions ev- skip sub gg0, Zg0, Zg0 erywhere before the program is loaded. jump Z jmpl Z, Zg0 Note that 'step _' and 'next _ _' axe total func- branchequal Z be Z tions. This emphasizes that computation con- branchlessthan Z bneg Z tinues infinitely,once started. For exaxnple, the call jmp]. global, Zr8 'call'instruction will be executed even though the return jmp]. Zr8+8, ~gO global register contained a value that we thought store R1 in t/2 Z P st R1, R2+Z'~P load R1 Z P into R2 ld RI+Z+P, R2 of as a truth-value! It also means that we have stor eregisters save avoided alignment problems, etc., so that a typ- Ioadregisters restore ical run-time error such as "bus error" will not moveRItoR or Zg0, RI, R occur. This is accomplished by having a word- move sum R RI to R' add R, RI, R I rather than byte-oriented definition of the Pseudo move difference R RI to R I sub R, RI, R' SPARC machine. compare R with RI subcc R, RI, 7,gO 3.2 Compiling Action Notation Figure 3: The Pseudo SPARC machine language. The compiler from action notation to Pseudo SPARC machine code proceeds in two passes: Figure 3 shows the 14 Pseudo SPARC instruc- 1. Type analysis and calculation of code size; tions and how they (approximately) can be ex- and panded to real SPARC instructions. In prac- tice, the expansion has to take care of fitting in- 2. Code generation. structions using large integers into several real For each pass there is a function defined for ev- SPARC instructions. It also has to insert ad- ery syntactic category. Those defined for 'Act' ditional "nop" instructions into so-called "delay have the following signatures (we cheat a little slots". Pseudo SPARC instructions can also be bit here, compared to [44], to improve the read- expanded to instructions for the HP Precision Ar- ability): chitecture, though with a littlemore difficulty. The function that models one step of compu- a-count _ _ _ :: Act, data-type, symbol-table -4 tation is defined as follows: (natural, truth-value, data-type, truth-value, data-type, block) . step _ :: sparc-state --~ spare-state (total) . perform ...... :: step m = next Act. data-type, general-register, ((program of m) at frozen, symbol-table, (program-counter of m) default skip) m . cleanup, cleanup, cleanup, linenumber, linenumber-complete, 'step _' models the loading of the current instruc- linenumber-escape, linenumber-fail --~ tion, followed by its execution. The operation (program, general-register, general-register) . 'next _ _ ' is defined in the following style (we give only a single example): Since action notation contains unusual con- structs, e.g., 'complete', 'escape', 'fail',the def- next _ _ :: inition of the type analysis and code genera- instruction, sparc-state --~ spare-state (total) . tion employ unusual techniques, though not very difficult. For example, the definition of 'per- next call (p, pc, cz, cn, g, w, q) = form' requires as argument both the desired start- (p, g at global default 0, cz, cn, 9, address ('[inenumber') of the code to be gener- update to (map of return-address to pc), q) . ated, but also addresses of where to jump to, 424

syntax semantics program input 11 1 I cantor b compiler code output

Figure 4: The Cantor system. should the performance complete ('linenumber- The compiler generator cantor is written in complete'), escape ('linenumber-escape'), or fail Perl [61], and the generated compilers are writ- ('linenumber-fail'). These addresses are calculated ten in Scheme [1]. Examples of a syntax and using 'a-count' which, in addition to type analy- a semantics are given in appendix B; it is the sis, calculates the size of the code to be generated. I~TEX source of the appendix that is processed The function 'a-count' is defined as a forwards by cantor. The generated compiler contains a abstract interpretation, computing with types of syntax checker, a program-to-action transformer, tuples of data ('data-type'), types of bindings the action compiler described above, and finally ('symbol-table'), and code sizes ('natural'). The a Pseudo SPAKC assembler that currently can first 'truth-value' component tells if the action emit code for the SPARC and the HP Precision being analyzed has a chance of completing. If Architecture. The input file is a sequence of in- it does, then the following 'data-type' component tegers, as is the output file. tells the type of the tuple of data that will pro- The HypoPL language, defined in appendix B, duced. The next two components give similar is taken from Lee's book on realistic compiler information about escaping. generation [22], with the difference that we treat The function 'perform' takes as arguments the nesting of procedures in its full generality but do 'data-type' and 'symbol-table' that are also sup- not allow recursion. (For a discussion of why re- plied to 'a-count'. In addition, it takes a 'general- cursion is problematic, see later.) register' which at run-time will contain a pointer Generating a compiler for HypoPL takes 3 to a representation of the tuples of data that seconds. will be received when executing then code. The set 'frozen' contains those registers that the code We have used this compiler to translate Lee's to be produced must not modify, and the three bubblesort program (50 lines). 'cleanup' values are natural numbers that indi- Compile time: 486 seconds; cate how much to pop from the stack, should the performance complete, escape, or fail. Object code size: 114688 bytes; and The calculation of whether an action can com- plete or not, and whether it can escape or not, Object code execution time (for sorting 10 are examples of the compile time analyses that integers): 0.1 seconds. axe built into the compiler. They are used to gen- These figures indicate that the system is rather erate better code, and they are fully integrated in tedious to work with in practice. Additional ex- the proof of correctness, see later. periments, see [43], have shown that the code runs at least two orders of magnitude slower 3.3 Performance Evaluation tha~ a corresponding target program produced by the C compiler (without optimization). This is The Cantor system hem the structure shown in somewhat disappointing but still an improvement figure 4. In practice, a session with Cantor looks compared to the classical systems of Mosses, as follows on the screen: Paulson, and Wand where a slow-down of three orders of magnitude has been reported [22]. In- cantor syntax semantics compiler spection of the code emitted by Cantor-generated compiler program code compilers reveals that the inefficiency mainly code input output stems from three sources: 425

Lack of compile time constant propagation; 2. The operation 'sparc-run p n Be' specifies loading the program p into the program Poor register allocation; and area, and then taking n steps starting in Naive representation of bindings, closures, line 0. It also records if the execution at and lists. any point "jumps outside the code". The memory, registers, status bits, and output Improving the action compiler to avoid this in- file are initialized appropriately, the input efficiency would significantly complicate the cor- file is initialized to Be. 'sparc-run' is defined rectness theorem, which we consider next. in terms of 'step', described above. 3. The operation 'compile A' translates the ac- 4 The Correctness tion A into a machine language program Theorem p and it also gives type information about what will be produced when performing A. To give an overview of the correctness theorem, The program p will start in line 0. we will introduce a bit of notation, as follows (we 4. The operation 'abstract rnp zn hn ze he an cheat a little bit again, compared to [44], to im- ae' will give a sort of all those states (from prove the readability): the action-level) that are represented by the spare-state rap, and that have the type ex- run _ _ :: Act, [integer] list --* state. pressed by the following four arguments. sparc-run _ _ :: The last two arguments are those registers program, natural, page --+ sparc-state. which will contain pointers to the represen- tations of the data produced, should the ac- compile _ :: Act --~ tion complete or escape. (program, truth-value, data-type, truth-value, data-type, 5. The operation 'i-abs n Be' will give the general-register, general-register) . input-file ('[integer] list') which is repre- sented by the natural number n and the abstract ...... :: page Be. sparc-state, truth-value, data-type, truth-value, data-type, The use of both type information and a machine- general-register, general-register --+ state. state in the definition of 'abstract' makes it pos- sible to make do without type information in the i-abs _. :: natural, page --~ [integer] list . semv,ntics of Pseudo SPARC. None of the above five operations are total. (1) a-count A 0 (list of empty-list) = The performance of an action may diverge; the (n, ~,/~, z~, he, empty-list) ; execution of a machine program may "jump out- (2) perform A () (reg 0) empty-set side the code"; the compilation of an action may (list of empty-list) 0 0 0 0 n n n = find a type error; the machine state may repre- (p, ~, a.) sent no state at all from the action-level; and the =:~ compile A = (p, z,,,/~, z., he, ~, a.). page for input-files may contain something with- out the right format. We have only given the definition of 'com pile _', in The lets-notation for unified algebras makes terms of 'a-count' and 'perform'. The operations it particularly easy to specify such partial op- have the following informal meaning: erations. This is because it supports a unified 1. The operation 'run Ail' specifies the per- treatment of sorts and individuals: an individ- formance of an action A which is given ual is treated as a special case of a sort. Thus the empty tuple of data, no bindings, an operations can be applied to sorts as well as in- empty-storage, an empty output-file, and dividuals. A vacuous sort represents the lack of the input-file il (an integer-list). If the per- an individual, in particular the 'undefined' result formance terminates, then that will result of a partial operation. For example, if the per- in a final state ('state') which can be either formance of the action A with input-file il ter- completed, escaped, or failed. minates, then 'run Ail' will be an individual, 426 otherwise it will be a vacuous sort. We need not 2. Soundness: If an n-step execution of specify that such sorts axe vacuous; if it does not p (with input se) reaches m n (and the follow from the specification that they contain an program-counter points to the last line of individual, then they will automatically be vacu- p), then there exists a state m,, represented ous. by rnp, such that a performance of A (with The operations 'run', 'sparc-run', 'compile', and input il) will terminate in ms. 'i-abs' will all yield either an individual or a vac- Notice that it is built into the definition of 'spare- uous sort. In contrast, 'abstract' may yield a sort run', and hence the correctness theorem, that the containing several individuals, and it may also execution of the machine language program never yield a vacuous sort. The possibility of yielding a "jumps outside the code". sort containing several individuals is needed when abstracting with respect to a closure type. This is because if two actions differs only in the naming 5 The Proof Technique of tokens (they axe equal with respect to "alpha- conversion"), then the compiled code for them A number of lemmas axe needed to prove the the- will be identical. orem; here is an overview: We can now state the correctness theorem. * Compiler consistency: These lemmas Note that 't :- s' is another syntax for's : t'. state that the calculation of code size is cor- The meaning is that s is an individual contained rect. They also state that the code is placed in t. consecutively, starting in the desired line. Theorem: Correctness of analysis: These lemmas (z) compile A:Act = state that the type analysis asserts correct (p:program ~:truth-value/~:data-type typings, relative to the semantics of actions. ze:truth-value he:data-type Code well-behavedness: These lemmas ~:general-register a~:general-register) ; state that if the execution of some compiled (2) i-abs (se at 0) se =//:[integer] list code at some point reaches "the end of the code", then it used the memory and reg- =P (z) run Ail = m~:state =~ isters in a disciplined fashion, and, in ad- (:1 mv:sparc-state :l n:natural , dition, the machine state will represent an sparc-run p n se ~- ~, ; abstract state (with the type given by the abstract m v zn/~ ze he an o~ :- me ) ; compiler). (2) sparc-run p n se = mv:sparc-state =~ It is also necessary to prove strengthened versions (:[ ms:state. of completeness and soundness. run A if= mr,; Let us now consider how to adopt Despeyroux's abstract rr~ z~/~ zc he an ae :- ms ) proof technique to the framework of unified alge- The structure of the theorem resembles the cor- bras. rectness statement of Despeyroux. Informally: Despeyroux expresses natural semantics in the Oentzen's system style, with axioms and infer- If the action A is compiled into a machine lan- ence rules. In such a system one can make natu- guage program p (and some additional type in- ral deduction, and can then prove lemmas about formation, etc., is produced), and the input-file the system by induction in the length of such de- il is represented properly in the machine as se, ductions. In contrast, the framework of unified then two properties hold: algebras provide Horn clauses, and there are no 1. Completeness: If the performance of the build-in deduction rules. We have replaced infer- action A (with input-file il) terminates in ence rules by Horn clauses, so to be able to do state ms, then there exists a spaxc-state m r deduction, we adopt a standard axiomatization and a number n such that an n-step execu- of Horn logic, as follows. tion of p will reach mr, and m r represents All specifications in the meta-notation for uni- ms (and the program-counter points to the fied algebras can be transformed into a core no- last line of p). tation which is outlined in the following. Let ~2 427

(Reflexivity) (ft, r) ~- t = t

(ft, r) ~- ~ = t (ft, r) t- t = ,~ (Transitivity) (ft, r) ~ s = u

{(ft, r) ~ al = ti}in_-i if f E E (Functional Congruence) (ft, F) I- f(sl ..... s,) -.= f(tl,..., t,,)

{(ft, F) I- s/= t i}i=,9. (ft, F) I- p(s,, sg.) if p E {- < : -} (Predicative Congruence) (ft, r) ~- p(t,, tg.) -' -

{(ft, F) I- Fi},~l if (Fx;...;F, =~ F) e F (Modus Ponens) (ft, r) F- f Figure 5: Axiomatization of Horn clause logic. be a so-called homogeneous first-order signature, With these deduction rules, we can do proof by that is, a pair (E, H) where E is a set of operation induction in the number of occurrences of "modus symbols and II is a set of predicate symbols. In ponens" in deductions. Note that a single ap- the setting of unified algebras, it is required that plication of modus ponens corresponds closely to a natural deduction step. This makes our _.D {nothing, _ ] _, _&_} proof strategy close to Despeyroux's. All lemmas and proved by induction in the length of deduction are H= {_=_,_<_,_:_} satisfied by the initial model of the specification. The key property of an initial model needed here The value of the constant 'nothing' is a vacuous is that it only contain entities that are values of sort, included in all other sorts. The operation ground terms (it contains "no junk"). '- I -' is sort union, and '_&_' is sort intersection. We will end this section by explaining why we The predicate '_ = _' asserts equality, '_ < _' as- do not treat recursion, in contrast to Despeyroux. serfs sort inclusion, and 'T1 : I"2' asserts that the The reason for this is rather subtle; it hinges on value of the term T1 is an individual included in the expre~iveness of the unified meta-notation. the (sort) value of the term Tg.. Full action notation offers self-referential bind- Further, let F be a set of Horn clauses built up ings as the means for describing for example re- from ft. Any specification F of such Horn clauses cursive procedures. A self-referential binding is a will be augmented with some basic Horn clauses, cyclic structure; the run-time representation will stating for example the reflexivity of '_ < _', see obviously also be cyclic. In Despeyroux's paper, [34]. Finally, let F be a formula built up from ft. such cyclic structures are represented as graphs We will then write with self-loops--both in the source and target (ft, r) t-- F languages. This allows her to uniquely determine the run-time representation of a self-referential (read F is (ft, F)-deducible) if (ft, F) I-- F can environment. be obtained by finitely many applications of the Compared to Despeyroux, we use a much more deduction rules shown in figure 5. A deduction low-level target language where values can be rule consists of a conclusion (given beneath the placed in more than one plas in the memory. line), none, one, or several premises (given above This means that not only can one target value the line), and possibly a condition (given at the represent more than one source value, as in De- right-hand side of the line). A deduction rule speyroux's paper, it is also possible for one source stands for the statement: value to be represented by different parts of the If all premises are deducible, the condition memory. In other words, there is no functional is satisfied, and F is a formula built up from connection between source and target values; ft, then the conclusion (ll, F) I- F is de- there is only a relation stating which source val- ducible. ues are represented by a given part of the mere- 428 ory. Appendix A: In the case of cyclic structures, the relation be- tween semantic values seems to be impossible to Action Notation define in the unified meta-notation. This is be- cause the meta-notation only allows the expres- grammar: sion of Horn clauses. Evidence for this is found in Amadio and Cardelli's paper on subtyping re- Act = "complete" I "escape" I "fail" I "commit" I "diverge" I "regive" I cursive types [2]. They axiomatize several rela- ~" "give" Dep ~ I [ "check" Dep ~ I tionships between cyclic structures, and it seems J "bind" token "to" Dep ] I that a rule of the following non-Horn kind cannot | "store" Dep "in" Dep ] I be avoided: | "allocate" "truth-value" "cell" ] I [ "allocate" "integer" "cell" ~ I ( z R y =P a R # ) =~ gz.a R py.# | "batch-send" Dep ] I | "batch-receive" "an" "integer" ] I Since we want to apply the unified recta-notation [ "enact" "application" Dep exclusively in all specifications, we avoid self- "to" Tuple ] [ referential bindings. Thus we cannot treat re- [ "indivisibly" Act ] I cursion. | "unfolding" Unf ] I | Act Infix Act ] I [ | "furthermore" Act ] "hence" Act ] I 6 Conclusion | | "furthermore" Act J "thence" Act ] . Our compiler generator is specified and proved Unf = | Act Infix Unf ] I [ Unf "or" Act ] I correct solely in an algebraic framework. To our "unfold" . knowledge, it is the firsttime that this has been accomplished. Tuple = "0"I Dep I [Tupte "," Tuple ] I The generated compilers emit realistic, albeit "them" . poor, machine code. Future work includes build- "true" J "false" ] natural J ing in more analyses, for the benefit of the code Dep = [ "empty-llst .... ~: .... [" Type "]" "list" ] I generator. | "closure" "abstraction" "of" Act "~" The use of action semantics makes the process- "[" "perhaps" "using" Data "]" "act" | I able specifications easy to read and pleasant to Unary Dep ] l work with. We believe that the Cantor system is [ Binary "(" Dep "," Dep ")" ~ I a promising first step towards user-friendly and [ Dep "is" Dep ] l automatic generation of realistic and correct com- ~" Dep | "is" "less" "than" ] Dep ] I pilers. ~" "component#" Dep "items" Dep ] I Acknowledgements. This work has been sup- "it" I ported in part by the Danish Research Council [ "the" "given" Datum "#" natural ~ I under the DART Project (5.21.08.03). The au- [ "the" Datum "bound" "to" token ] I thor thanks Peter Mosses, Michael Schwartzbach, [ "the" Datum "stored" "in" Dep ] I and the referees for heipful comments on a draft | "(" Dep ")" ]. of the paper. The author also thanks Peter Or- Infix = [ "and" "then" ] I "then" I "before" I b~ek for implementing the Cantor system. "trap" ] "or".

Unary = "not" I "negation" I [ "llst" "of"] I "head" I "lair'.

Binary = "both" I "either" I "sum" I "difference" I "concatenation".

Datum = "datum" I "cell" I "abstraction" I "list" I [ Datum " I " Datum ] I Type. Data = "0" I Type I [ Data "," Data |. 429

Type = "truth-value" I "integer" I but not hidden, in the action, and it persists un- [ "truth-value" "cell" ]l I til explicitly destroyed. Permanent information [ "integer" "cell" J I cannot even be changed, merely augmented. [ "[" Type "]" "list" ]. When an action is performed, transient infor- mation is given only on completion or escape, A.1 Action Principles and scoped information is produced only on com- pletion. In contrast, changes to stable informs. Action notation is designed to allow comprehen- tion and extensions to permanent information are sible and accessible descriptions of programming made during action performance., and are unaf; languages. Action semantic descriptions scale up fected by subsequent divergence or failure. smoothly from small example languages to real- Our subset of action notation omits all no- istic languages, and they can make widespread tation for communication. Instead, the ad hoc reuse of action semantic descriptions of related constructs 'batch-send' and 'batch-recelve' allow languages. a primitive form of communication with batch- Actions reflect the gradual, stepwise nature of files, as in standard Pascal. computation. A performance of an action, which The information processed by actions consist may be part of an enclosing action, either of items of data, organized in structures that give access to the individual items. Data can include completes, corresponding to normal termi- various farniliar mathematical entities, such as nation (the performance of the enclosing ac- truth-values, integers, and lists. Actions them- tion proceeds normally); or selves are not data, but they can be incorpo- escapes, corresponding to exceptional ter- rated in so-called abstractions, which axe data, mination (the enclosing action is skipped and subsequently 'enacted' back into actions. until the escape is trapped); or Dependent data are entities that can be eval- uated to yield data during action performance. fails, corresponding to abandoning the per- The data yielded may depend on the current in- formance of an action (the enclosing action formation, i.e., the given transients, the received performs an alternative action, if there is bindings, and the current state of the storage and one, otherwise it fails too); or batch-files. Evaluation cannot affect the current information. Data is a special case of dependent diverges, corresponding to nontermination data, and it always yields itself when evaluated. (the enclosing action also diverges).

The information processed by action performance may be classified according to how far it tends to be propagated, as follows:

transient: tuples of data, corresponding to intermediate results;

scoped: bindings of tokens to data, corre- sponding to symbol tables;

8table: data stored in cells, corresponding to the values assigned to variables;

permanent: data communicated between distributed actions.

Transient information is made available to an ac- tion for immediate use. Scoped information, in contrast, may generally be referred to throughout an entire action, although it may also be hidden temporarily. Stable information can be changed, 430

Appendix B: coercively A:act = IA HypoPL Action Semantics then I give the given item #1 or B.1 Abstract Syntax give the item stored in the given cell #1 . grammar: B.3 Semantic Functions Program = iT "program" Identifier Block ]. introduces: Declaration = [ "int" Identifier ] I run _, establish _, activate., execute _, R" "bool" Identifier ] I evaluate _, operatlon-result _, ~" "const" identifier "=" Integer ] J integer-value _, id _. "array" Identifier "[" Integer "]" ] I "procedure" Identifier "(" Identifier ")" Block ] I B.3.1 Programs Declaration ";" Declaration ] .

Block = [ Declaration run _ :: Program --* act. "begin" Statement "end" ] I IT "begin" Statement "end" ]. run [ "program" /:Identifier B:block ] = activate B.

Statement = ~" Expression ":=" Expression ] I R" "write" Expression ]] I B.3.2 Declarations ~" "read" Expression ] I "if" Expression "then" Statement establish _ :: Declaration -+ act. "else" Statement "endif" ] I ~" "while" Expression "do" establish IT "int" /:identifier ] = Statement "endwhile" ] I allocate integer cell then bind id I to it . IT Identifier "(" Expression ")" ] I [ Statement ";" Statement ] [ "skip" . establish IT "boor' /:Identifier ] = allocate truth-value cell then bind id I to it. Expression = "true" I "false" I Integer I Identifier I establish ~" "const" /:Identifier "=" j:integer ] = IT Identifier "[" Expression "]" ] J bind id I to integer-value j . Expression Operation Expression ] ] H "not" Expression ] . establish IT "array" /:identifier "[" j:integer "]" ] = Operation = "+" I"-" I"<" I "=" I "and". I give empty-list & [integer cell] list and then give sum(integer-value j, 1) Integer = natural I ~" "-" natural ]. then unfolding Identifier = token. I check the given integer #2 is 0 and then give the given list #1 B.2 Semantic Entities or I regive and then allocate integer cell B.2.1 Items then give concatenation( introduces: item. list of the given integer cell #3, the given list #1) item = truth-value [ integer. and then give difference( B.2.2 Coercion the given integer #2, 1) then introduces: coercively _. I unfold then coercively _ :: act ~ act. I bind id I to the given list #1. 431 establish ] "procedure" ll:ldentifier execute [ "while" E:Expression "do" S:Statement "(" Is:Identifier ")" B:Block ] = "endwhile" ] = bind id 11 to unfolding closure abstraction of I coercively evaluate E furthermore then I give the given integer #1 and then I I check it then execute S then unfold allocate integer cell or check not it. then I I store the given integer #1 execute [/:Identifier "(" E:Expression ")" ] = in the given cell #2 I give the abstraction bound to id I and then and then coercively evaluate E J bind id I2 to the given cell #2 then thence activate B I enact application the given abstraction #1 & [perhaps using integer] act . to the given integer #2. establish ~ Dl:Declaration ";" ~:Declaration ] = execute ~ Sl:Statement ";" 82:Statement ] = establish D1 before establish D2 . execute $1 and then execute $2 execute "skip" = complete. B.3.3 Blocks

activate _ :: Block --~ act . B.3.5 Expressions activate ~D:Declaration evaluate _ :: Expression --~ act . "begin" S:Statement "end"~ = I furthermore establish D evaluate "true" = give true. hence execute S. evaluate "false" = give false. activate ~ "begin" S:Statement "end"] = execute S. evaluate i:lnteger = give integer-value i. B.3.4 Statements evaluate/:Identifier = give the datum bound to id I.

execute _ :: Statement --* act. evaluate [/:Identifier "[" E: Expression "]" ] = I give the list bound to id I and then execute ~F_,l:Expression ":=" F_~:Expression ] = I coercively evaluate E then give sum(it, 1) l evaluateEland then then coercively evaluate/~ I give component# (the given integer #2) then items (the given list #1) . I store the given item #2 in the given cell#l. evaluate El:Expression O:0peration F__~:Expression ] = execute ] "write" E:Expression ] = I coercively evaluate El and then coercively evaluate E then batch-send it. coercively evaluate F_~ then give operation-result O. execute ~ "read" E:Expression ]= I batch-receive an integer and then evaluate E evaluate ~ "not" E:Expression ] = then coercively evaluate E then I store the given integer#l give not it. in the given integer cell #2. B.3.6 Operations execute I" "if" E:Expression "then" Sl:Statement "else" S2:Statement "endif" ] = operation-result _ :: I coercively evaluate E Operation --~ dependent datum . then I check it then execute $1 operation-result "-I-" = or sum(the given integer #1, I check not it then execute $2 the given integer #2). 432

operation-result "-" = [6] Anders Bondorf and Olivler Danvy. Auto- difference(the given integer #1, matic autopr0jection of recursive equations with the given integer ~2) . global variables and abstract data types. Sci. ence of Computer Programming, 16:151-195, operation-result "<" = 1991. (the given integer ~1) is less than (the given integer #2). [7] Rod M. Burstall and Peter J. Landin. Programs and their proofs: an algebraic approach. In operation-result "--" -- B. Meltzer and D. Mitchie, editors, Machine In. (the given item I cell ~1) is telligence, Vol. ~, pages 17-43. Edinburgh Uni- (the given item I cell #2) . versity Press, 1969. operation-result "and" = [81 G. Cousinean, P.-L. Curien, and M. Mauny. both(the given truth-value #1, The categorical abstract machine. Science of the given truth-value ~2) . Computer Programming, 8:173-202, 1987.

Mads Darn and Frank Jensen. Compiler gen- B.3.7 Integers t9] eration from relational semantics. In Proc. ESOP'86, European Symposium on Program. integer-value _ :: Integer --, integer. ruing. Springer-Verlag (LNCS 213), 1986. integer-value n:natura[ = n. [10] Jo~lle Despeyrotuc. Proof of translation in nat- integer-value [ "-" n:natural ] = negation n . ural semantics. In LICS'86, First Symposium on Logic in Computer Science, June 1986.

B.3.8 Identifiers [11] Anders Gammelgaard and Flemmlng Nielson. Verification of the level O compiling specifica- * id. :: Identifier ~ token. tion. Technical report, Department of Com- puter Science, Aarhus University, July 1990. id k:token = k . [12] Joseph A. Goguen, James W. Thatcher, and Eric G. Wagner. An initial algebra approach to References the specification, correctness, and implementa- tion of abstract data types. In Raymond T. Yeh, [1] Harald Abdson, Gerald Jay Sussman, and Julie editor, Current Trends in Programming Method- Sussman. Structure and Interpretation of Com- ology, Volume IV. Prentice-Hail, 1978. puter Programs. MIT Press, 1985. [13] Carsten K. Gomard and Neil D. Jones. A par- [2] Roberto M. Amadio and Luca Cardelli. Subtyp- tiai evaluator for the untyped lambda-calculus. ing recursive types. In Eightteenth Symposium Journal of Functional Programming, 1(1):21- on Principles of Programming Languages. ACM 69, 1991. Press, January 1991. [14] John Hannan. Making abstract machines less [3] Rudolf Berghammer, Herbert Ehler, and Hans abstract. In Proc. Conference on E~nctional Zierer. Towards an algebraic specification of Programming Languages and Computer Archi- code generation. Science of Computer Program- tecture. Springer-Verlag LNCS, 1991. ming, 11:45-63, 1988. [15] John Hannah. Staging transformations for ab- [4] William R. Bevier, Warren A. Hunt, J. Strother stract machines. In Proc. ACMSIGPLANSym- Moore, and William D. Young. An approach posium on Partial Evaluation and Semantics to systems verification. Journal of Automated Based Program Manipulation. Sigplan Notices, Reasoning, 5:411--428, 1989. 1991. [5] Anders Bondorf. Automatic autoprojection [161 John Hannah and Dale Miller. From opera- of higher order recursive equations. In Proc. tlonal semantics to ahtract machines. Journal ESOP'gO, European Symposium on Program- of Mathmatical Structures in Computer Science, ming. Springer-Verlag (LNCS 432), 1990. To appear, 1991. 433

[17] Warren A. Hunt. Microprocessor design verifi- [30] Peter D. Mosses. A constructive approach to cation. Journal of Automated Reasoning, 5:429- compiler correctness. In Proc. Seventh Collo- 460, 1989. quium of Automata, Languages, and Program. ruing, July 1980. [18] Jeffrey J. Joyce. Totally verified systems: Link- ing verified software to verified hardware. In [31] Peter D. Mosses. Abstract semantic alge- Proc. Hardware Specification, Verification and bras! In Proc. IFIP TC2 Working Confer- Synthesis: Mathmatical Aspects, July 1989. ence on Formal Description of Programming Concepts II (Garmisch-Partenkirehen, 1982). [19] Jeffrey J. Joyce. A verified compiler for a ver- North-Holland, 1983. ified microprocessor. Technical report, Univer- [32] Peter D. Mosses. A basic abstract semantic sity of Cambridge, Computer Laboratory, Eng- algebra. In Proc. Int. Syrup. on Semantics of land, March 1989. Data Types (Sophia-Antipolis). Springer-Verlag [20] Gilles Kahn. Natural semantics. In Proc. (LNCS 173), 1984. STACS'SZ Springer-Verlag (LNCS 247), 1987. [33] Peter D. Mosses. Unified algebras and action semantics. In Proc. STACS'89. Springer-Verlag, [21] Richard Kelsey and Paul Hudak. Realistic com- 1989. pilation by program transformation. In S/x- tecnth Symposium on Principles of Program- [34] Peter D. Mosses. Unified algebras and institu- ming Languages. ACM Press, January 1989. tions. In LICS'89, Fourth Annual Symposium on Logic in Computer Science, 1989. [22] Peter Lee. Realistic Compiler Generation. MIT Press, 1989. [35] Peter D. Mosses. Unified algebras and mod- ules. In Sizteenth Symposium on Principles of [23] Peter Lee and Uwe F. Pleban. A realistic Programming Languages. ACM Press, January compiler generator based on high-level seman- 1989. tics. In Fourteenth Symposium on Principles of [36] Peter D. Mosses. Action semantics. Lecture Programming Languages, pages 284-295. ACM Notes, Version 9 (a revised version is to be pub- Press, January 1987. fished by Cambridge University Press in the Se- ries Tracts in Theoretical Computer Science), [24] John McCarthy and James Painter. Correct- 1991. ness of a compiler for arithmetic expressions. In Proc. Symposium in Applied Mathematics of the [37] Peter D. Mosses. An introduction to action American Mathmatical Society, April 1966. semantics. Technical Report DAIMI IR-102, Computer Science Department, Aaxhus Univer- [25] Sun Microsystems. A RISC tutorial. Technical sity, July 1991. Lecture Notes for the Markto- Report 800-1795-10, revision A, May 1988. berdoff'91 Summer School. [26] Robert E. Milne and Christopher Strachey. A [38] Peter D. Messes and David A. Watt. The use of Theory of Programming Language Semantics. action semantics. In Pros. IFIP TCf Working Chapman and Hall, 1976. Conference on Formal Description of Program- ming Concepts III (Gl. Avernces, 1986). North- [27] J. Strother Moore. A mechanically verified lan- Holland, 1987. guage implementation. Journal of Automated Reasoning, 5:461-492, 1989. [39] Flemming Nielson and Hanne Riis Nielson. Two-level semantics and code generation. The- [28] Francis Lockwood Morris. Advice en structur. oretical Computer Science, 56, 1988. ing compilers and proving them correct. In [40] Flemming Nielson and Hanne Itiis Nielson. Symposium on Principles of Programming Lan- Two-level functional languages. Draft book. To guages, pages 144-152. ACM Press, October be published by Cambridge University Press, 1973. 1991. [29] Peter D. Mosses. SIS--semantics implementa- [41] Hanne It. Nielson and Flemming Nielson. Au- tion system. Technical Report Daimi MD-30, tomatic binding time analysis for a typed ~- Computer Science Department, Aarhus Univer- calculus. Science of Computer Programming, sity, 1979. 10:139--176, 1988. 434

[42] Hewlett Packard. Precision architecture and in- [55] Uwe Schmidt and Relnhard V~ller. A multi- struction. Technical Report 09740-90014, June language compiler system with automatically 1987. generated codegenerators. In Proc. A CM SIG- PLAN'84 Symposium on Compiler Construc- [43] Jens Palsberg. An automatically generated and tion. Sigplan Notices, 1984. provably correct compiler for a subset of Ads. In Proc. ICCL'92, Fourth IEEE International [56] Uwe Schmidt and Keinhard VSller. Experi- Conference on Computer Languages, 1992. ence with VDM in Norsk Data. In VDM'87. VDM--A Formal Method at Work. Springer- [44] Jens Palsberg. Provably Correct Compiler Gen- Verlag (LNCS 252), March 1987. eration. PhD thesis, Computer Science Depart- ment, Aarhus University, 1992. Forthcoming. [57] William StalUngs. Reduced Instruction Set [45] Lawrence Paulson. A semantics-directed com- Computers. IEEE Computer Society Press, piler generator. In Ninth Symposium on Princi- 1986. ples of Programming Languages, pages 224-233. [58] Joseph E. Stoy. Denotational Semantics: The ACM Press, January 1982. Scott-Strachey Approach to Programming Lan- [46] Uwe F. Pleban. Compiler prototyping using for- guage Theory. MIT Press, 1977. real semantics. In Proc. ACM SIGPLAN'8$ [59] James W. Thatcher, Eric G. Wagner, and Symposium on Compiler Construction, pages Jesse B. Wright. More on advice on structuring 94-105. Sigplan Notices, 1984. compilers and proving them correct. Theoretical [47] Uwe F. Pleban and Peter Lee. On the use of Computer Science, 15:223-249, 1981. LISP in implementing denotational semantics. [60] Mads Torte. Compiler Generators. Springer- In Proc. ACM Conference on LISP and Func- Verlag, 1990. tional Programming, August 1986. [48] Uwe F. Pleban and Peter Lee. High-level se- [61] Larry Wall and Itandal L. Schwartz. Program- mantics, an integrated approach to program- ming Perl. O'Reilly, 1991. ming language semantics and the specifica- [62] Mitchell Wand. A semantic prototyping sys- tion of implementations. In Proc. Mathmatical tem. In Proc. ACM SIGPLAN'84 Symposium Foundations of Programming Language Seman- on Compiler Construction, pages 213-221. Sig- tics, April 1987. plan Notices, 1984. [49] Uwe F. Pleban and Peter Lee. An automatically [63] David Watt. Programming Language Syntax generated, realistic compiler for an imperative and Semantics. Prentice-Hall, 1991. programming language. In Proc. SIGPLAN'88 Conference on Programming Language Design [64] Nildans Wirth. Algorithms § Data Structures and Implementation, June 1988. = Programs. Prentice-Hall, 1976. [50] Gordon D. Plotldn. A structural approach [65] William D. Young. A mechanically verified code to operational semantics. Technical Report generator. Journal of Automated Reasoning, DAIMI FN-19, Computer Science Department, 5:493-518, 1989. Aarhus University, September 1981. [51] Wolfgang Polak. Compiler Specification and Verification. Springer-Verlag ( LNCS 213), 1981. [52] David A. Schmidt. Detecting global variables in denotational specifications. ACM Transac- tions on Programming Languages and Systems, 7(2):299-310, 1985. [53] David A. Schmidt. Denotational Semantics. A1- lyn and Bacon, 1986. [54] David A. Schmidt. Detecting stack-based envi- ronments in denotational semantics. Science of Computer Programming, 11:107-131, 1988.