A Projected Comparison of Systems with Compiler-Descriptive Abilities
360 REFERENCE GUIDE
CABAL DESCRIPTION
TECHNICAL MEMO TM CCU-51

Author: Mary Shaw and Janet Fierst
Carnegie Institute of Technology, Computation Center
Date: January 27, 1967
Distribution: 360 CABAL mailing list
Replaces: C Notes No. 2 and No. 3
Supplements: 360 REFERENCE GUIDE

FORMAL COMPILER-DESCRIPTIVE SYSTEMS

This is a preliminary version of a projected comparison of systems with compiler-descriptive abilities. The intent here is to exhibit the formalisms, to examine the black boxes rather than their contents. I have aimed for description rather than evaluation, for it seems desirable to work with the formalisms for a while before attempting evaluation.

Taking last things first, I remark that the Bibliography contains rather more than less than it should -- the aim was for completeness, even at the expense of including material which is not really too useful. This Appendix should be fairly complete up to November, 1965 as a compendium of existing or specified compiler compilers. Comments and additions in this area will be particularly appreciated.

Mary Shaw
Porter Hall 18D, Ext. 44

Appendix B contains design criteria for compiler compilers, which were developed by the CABAL group through study of the compiler compilers noted here and through personal soul searching. Again, we have tried to put in too much rather than too little. A later note is planned to describe how CABAL meets (or does not meet) these criteria.

Janet Fierst, Mary Shaw, Rick Dove
Porter Hall 18N, Ext. 54

"Algol, n. a star of the second magnitude in the constellation Perseus. It is remarkable for its variability, which is due to periodic eclipse by a fainter stellar companion."
The American College Dictionary

The process of algorithmic problem-solving with a computer may normally be regarded as independent of the particular machine on which the computation is to be performed. This was acknowledged early in the history of computing by the development of so-called machine-independent or problem-oriented languages. Many different languages and dialects were developed, most of them directed at particular applications or particular machines, and each of them consuming vast amounts of man and machine time. Indeed, a 1962 survey (International Standards Organization, 1963) showed well over 300 compilers.

Because of the effort involved in implementing a new programming language (described by Pratt (1965) as the implementation bottleneck), little experimentation with languages (e.g., special-purpose languages) was undertaken, and most "new" languages were general-purpose translators differing from their predecessors in only a few features.

The first efforts toward standardization were made by manufacturers and users' groups; the most notable of these was FORTRAN and all its many dialects. There have been three main lines of attack on the language multiplicity problem:

(1) The universal algorithmic language. There has been, in recent years, a trend toward fewer and more common languages in particular fields. ALGOL and FORTRAN in scientific areas and COBOL (to some extent) in commercial fields have become acceptable common languages. This legislative approach may hold down the proliferation, but it also tends to discourage experimentation and freezes the available language forms. In addition to lacking applicability to all present problems, a standardized language specified at any particular time cannot provide for the language requirements resulting from advances in computer technology or in the problem areas themselves.

(2) The universal machine-oriented language.
In 1958 the SHARE organization proposed a universal intermediary language (UNCOL), to be designed such that any problem-oriented language could be transformed to UNCOL, and UNCOL could be transformed to any machine language. JOVIAL was implemented in this spirit. Just as a "universal" algorithmic language leads to inflexibility at the problem level, so a "universal" machine-oriented language lends itself poorly to translation to a large number of computers with wildly varying instruction and data formats and instruction sets.

(3) The compiler compiler. An increasing interest has been shown in programming systems which accept as input not only a description of an algorithm in some language, but also a description of that source language and a specification of the target or object language, which might be the machine code for the machine on which the algorithm is to be executed. The problem which arises in this case is one of description: formalisms must be constructed to describe both the source and target languages.

Metcalfe (1964:3) summarized the possible solutions to the proliferation problem. For M machines and N languages,

(a) No standards require M x N compilers for completeness;
(b) Standard programming language (N = 1) requires M compilers, one for each machine;
(c) Standard machine language (M = 1) requires N compilers, one for each language;
(d) Standard programming language (N = 1) and standard machine language (M = 1) requires 1 compiler;
(e) UNCOL requires M + N partial compilers;
(f) Compiler compiler requires 1 compiler plus M + N or M x N language specifications, depending on whether the source and target languages are specified independently.

Of these possibilities, the compiler compiler is the most promising.

Consider now only systems where the target language is either machine code or assembly language.
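Metcalfe's tally above is simple arithmetic, and a short sketch makes the differences between the schemes concrete. The function name `metcalfe_counts`, the dictionary keys, and the sample values M = 5 and N = 20 are assumptions chosen for illustration; only the formulas come from the list.

```python
def metcalfe_counts(m: int, n: int, independent_specs: bool = True) -> dict:
    """Tally the items each scheme needs for m machines and n languages.

    The formulas follow cases (a)-(f) of Metcalfe's summary; the key
    names are illustrative labels, not terms from the memo.
    """
    return {
        "(a) no standards: compilers": m * n,
        "(b) standard programming language: compilers": m,
        "(c) standard machine language: compilers": n,
        "(d) both standards: compilers": 1,
        "(e) UNCOL: partial compilers": m + n,
        "(f) compiler compiler: compilers": 1,
        # M + N specifications when source and target languages are
        # described independently, M x N when they are not.
        "(f) compiler compiler: language specifications":
            m + n if independent_specs else m * n,
    }


# Hypothetical installation: 5 machines, 20 languages.
counts = metcalfe_counts(m=5, n=20)
for scheme, count in counts.items():
    print(f"{scheme}: {count}")
```

Under these assumed values, case (f) with independent specifications needs the same number of descriptions as UNCOL needs partial compilers; the difference claimed in the text is that a language specification is expected to be far cheaper to write than a partial compiler.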
Iliffe (1960) noted that an actual process of translation from a formula language to sequential code consists of three parts:

(1) An initial equivalence transformation of the formula;
(2) The translation into sequential code;
(3) A final equivalence transformation of the sequential code.

These steps may be associated with, respectively, syntax, semantics, and optimization/assembly/relocation.

A meta-compiler should, above all, contain a convenient facility for describing both source and target languages; the descriptions should, moreover, be independent. It should be possible for any interested and informed programmer to understand the meaning of the source language being defined; the distinction between the syntax of the source language, the associated semantics, and the actual generation of code should be clean. The meta-language of the compiler compiler should be machine-independent, but in the absence of a good formalism for machine description (and quite possibly in the presence of such description) the meta-language should not prevent either the language designer or the language user from getting at the machine directly. In the interest of generality, the meta-language should permit the language specifications to describe data structures (and should provide a data-descriptive facility). For the sake of flexibility, the compiler should provide control over the form of the output code. For the sake of production users, the compiler compiler should provide for both local and global optimization of the object code. For the convenience of the sophisticated user, the meta-language should permit modification of the source language by an individual program. For the sake of reproduction (as well as aesthetics) the meta-language should be capable of describing itself.
Error detection and recovery procedures should be available in the compiler compiler and also expressible in the source language -- it may be desirable for the compiler compiler to check on the consistency of the source language as well as on the syntactic validity of its description.

The most important systems with compiler-descriptive abilities are described below. A number of other systems are briefly described in Appendix A; a set of design criteria for compiler compilers is given in Appendix B; Appendix C is an extensive bibliography.

Brooker and Morris

In the early part of this decade Brooker and Morris described in several papers (1960A, 1960B, 1961, 1962, 1963) a compiler-building system which they have developed for the Ferranti Atlas. (The papers were drawn together in a single discussion by Rosen (1964).) In the discussion above the compilation process was segmented into three phases. Brooker and Morris acknowledge the existence of formal descriptions of source languages and algorithms for their syntactic analysis (the first phase), and concentrate on the second, semantic interpretation. Semantic analysis requires generators which take action when a source statement is parsed; the authors propose a system of format routines associated with the particular source statement forms of each language, and provide a language in which the format routines may conveniently be written. This language is described in the same formal terms as the source language and is converted into tables by the same service routines; it thus provides a set of basic formats which may be used to build up a more complex system.

The format routines may be regarded as macro-generators, and a source language statement as a series of calls on appropriate elementary routines.
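The macro-generator view can be pictured with a small sketch. This is not the Brooker and Morris notation: in their system the format routines are written in a dedicated language and converted into tables, whereas here ordinary Python functions stand in for them. The registry `FORMATS`, the helpers `define_format` and `call_format`, the format names, and the three-instruction target code are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical registry standing in for the system's table of formats.
FormatRoutine = Callable[..., List[str]]
FORMATS: Dict[str, FormatRoutine] = {}


def define_format(name: str) -> Callable[[FormatRoutine], FormatRoutine]:
    """Register a format routine under the given format name."""
    def register(routine: FormatRoutine) -> FormatRoutine:
        FORMATS[name] = routine
        return routine
    return register


def call_format(name: str, *operands: str) -> List[str]:
    """Expand one statement in a known format into target instructions."""
    return FORMATS[name](*operands)


# Elementary formats (invented single-address instructions).
@define_format("load")
def _load(operand: str) -> List[str]:
    return [f"LOAD {operand}"]


@define_format("add")
def _add(operand: str) -> List[str]:
    return [f"ADD {operand}"]


@define_format("store")
def _store(operand: str) -> List[str]:
    return [f"STORE {operand}"]


# A source statement format, written only as calls on formats that are
# already in the system; the generated code is implicit in the routine.
@define_format("assign_sum")
def _assign_sum(target: str, left: str, right: str) -> List[str]:
    return (call_format("load", left)
            + call_format("add", right)
            + call_format("store", target))


code = call_format("assign_sum", "x", "y", "z")
print("\n".join(code))
```

Expanding the hypothetical statement `x = y + z` through `assign_sum` yields the three instructions `LOAD y`, `ADD z`, `STORE x`. New statement formats are added the same way, each one defined in terms of formats registered before it.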
The system proper contains a basic structure consisting of a number of basic instruction formats interpreted directly as format routines to handle housekeeping functions such as system sequencing and table manipulation. A compiler is built up on this structure by adding format routines, each a list of statements in formats already in the system, in a macro-building fashion. These new formats define classes of phrases, source statement formats, and further intermediate formats; the code to be generated is implicit in these routines. With enough source statement formats added, the system will act as a compiler for the language so described.

Syntax: A set of elementary symbols is recognized by the system; the set may be extended to include class identifiers, which are strings of elementary symbols enclosed in square brackets. A phrase is a string of elementary symbols or class identifiers; a phrase class is defined in a form similar to BNF.