SOME TOOLS FOR AUTOMATING THE PRODUCTION

OF COMPILERS AND OTHER TRANSLATORS

by

Christopher Lester, B.Sc., M.Sc.

April 1982

A thesis submitted for the degree of

Doctor of Philosophy of the University of London and for the Diploma of

Membership of the Imperial College.

Department of Computing and Control,

Imperial College, London S.W.7.

ABSTRACT

This thesis explores possible tools to cheapen the creation of compilers of sufficient quality for studies in language design.

A literature study showed only (many) incomplete leads towards the goal: this prompted an analysis of what properties such tools must have and might have, notably the forms of the inputs. The key conclusion is that it is desirable to describe the source language and (separately) the machine instructions, rather than a translation between them.

A design according to this conclusion is proposed, and implementations of the crucial parts of the design are presented.

The description of the language is given in four parts. The first two describe the lexis and syntax: the third gives transformations from the syntactic constructs to intermediate language (IL). An implemented tool considers inter-relationships between the transform cases, and from these inter-relationships fabricates the required case-analysis code to select which transforms to apply for instances of the syntax. A parser consistent with the case analysis is also fabricated.

The fourth part of the language description describes the IL in terms of a machine description language (MDL), which is also used to describe the instructions of the target machine. Another tool is parameterised by these descriptions: it then inputs an IL sequence, analyses it for its data-flow tree, converts that into a tree representing the corresponding MDL sequence, and finally chooses an optimised machine instruction sequence equivalent to the original IL sequence.

Another study suggested that the choosing is more easily optimal if it includes the allocation of temporaries needed for the chosen instructions: a technique for this is suggested. Also, a method is given to reduce searching for optimal choices and avoid building the MDL tree.

Other possible applications of the tools are discussed.

Acknowledgements.

This project would not have been possible without the help of the following people. My most sincere thanks go to them all.

My supervisors at Imperial College: Ian Livingstone and John Darlington, for all their varied assistance, particularly for helping to reduce the problems arising from my part-time status, and also for the many (often unknowing) technical insights they gave me.

At City of London Polytechnic: the Assistant Directors, Ronald Sturt and Michael Brandon-Bravo, for their moral support, and for arranging day-release and financial support; the Head of the Computer Service, Pat Johnson, and all his staff, for their tolerance of what must often have been their most difficult user. Special mention must go to Pam Roberts for her amazingly accurate key-punching, and the operators for their many kindnesses beyond the call of duty.

My present employer, ALSYS S.A., for reproduction facilities.

Finally, and most of all, to my wife, for 4½ years of irregular hours and lost weekends without complaint but with encouragement.

CONTENTS

Section  Title  Page

Abstract 3
Contents 5
List of figures 9
Preface 11

1 ORIGINS AND OBJECTIVES 13

2 FUNDAMENTAL CONCEPTS
2.1 INTRODUCTION 19
2.2 MODELS OF COMPILATION 21
2.2.1 Non-optimising 21
2.2.2 Optimising model 24
2.2.2.1 Examples of optimisations for different phases of compilation 25
2.2.2.2 Optimisation-order problems 27
2.2.2.3 Review and terminology 29
2.2.2.4 Correspondences 29
2.2.3 Limitations of the models 30
2.3 CHEAPER METHODS OF CREATING COMPILERS 30
2.3.1 The UNCOL problem 30
2.3.2 "Portable" compilers 32
2.3.3 High-level machine codes 35
2.3.4 System implementation languages 36
2.3.5 Compilers based on macro-processing and pattern matching 38
2.3.6 Compiler-compilers: main concept and brief history 41
2.4 CLARIFICATION OF TERMINOLOGY 47
2.4.1 Definitions 47
2.4.2 Summary of definitions 51
2.5 COMPILER-COMPILERS AND COMPILER-GENERATORS 52
2.5.1 Required inputs versus inbuilt information 53
2.5.2 Choice of languages for inputs and outputs 55

3 OVERVIEW OF A COMPILER FACTORY
3.1 INTRODUCTION 59
3.2 THE STRATEGY UNDERLYING THIS RESEARCH 60
3.2.1 Design strategy of CF1 63
3.2.2 Design strategy of CF2 68

4 FABRICATING A PRIMARY CODE GENERATOR
4.1 INTRODUCTION 73
4.2 THE LEXIC DIVISION 75
4.3 THE SYNTACTIC DIVISION 83
4.4 THE TRANSFORM DIVISION 91
4.4.1 The notation of transforms 93
4.4.2 Overlapping cases 98
4.4.3 The "more specific than" relation 101
4.4.4 Commonalities in matching patterns 102
4.4.5 The influence of syntax on the allowable patterns 103
4.4.6 Calling the transforms 104
4.4.7 Editing the tree 105
4.4.8 Cases not expressible as simple patterns 108
4.4.9 Syntactic and lexic rules applicable only to patterns 109
4.4.10 Theories of transforms 111

4.5 EXAMPLE SETS OF TRANSFORMS 114
4.6 REVIEW OF THE PROPOSED TECHNIQUES 120

5 A PILOT PRIMARY CODE-GENERATOR FABRICATOR
5.1 INTRODUCTION 123
5.2 OVERVIEW OF THE CF1 SUITE OF PROGRAMS 123
5.2.1 Bootstrap use of the parser generator, and bootchecking 130
5.2.2 The source processor, and the source library 134
5.3 THE LEXICAL ANALYSER FABRICATOR AND SYNTACTIC ANALYSER FABRICATOR 139
5.3.1 The lexical-analyser fabricator 141
5.3.2 The syntactic-analyser fabricator 144
5.4 THE INTERPRETIVE PARSER 146
5.5 THE TRANSFORM FABRICATOR 152
5.6 REVIEW OF THE PRIMARY CODE-GENERATOR FABRICATOR 158

6 FABRICATING A SECONDARY CODE-GENERATOR
6.1 INTRODUCTION 161
6.2 THE EXAMPLE 162
6.2.1 The example instruction and its IL memory architecture 163
6.2.2 The example ILs 165
6.2.2.1 A temporary-stack IL 165
6.2.2.2 An infinitely-registered IL 167
6.2.2.3 The PDP-8 170
6.3 MACHINE DESCRIPTION LANGUAGE (MDL) 172
6.3.1 MDL summary of the addresses in the I:=J+K instruction 174
6.3.2 MDL definition of the temporary-stack IL 174
6.3.3 MDL definition of the register-IL 175
6.4 INSTRUCTIONS AND INSTRUCTION-SEQUENCES AS TREES 178
6.4.1 The gross tree for an IL sequence 181
6.4.2 The trees for temporary-stack IL instructions 185
6.4.3 The trees for register-IL instructions 186
6.4.4 The fine tree for an IL sequence 188
6.4.5 The trees for the PDP-8 instructions 193
6.4.6 The trees of PDP-8 instruction sequences 196
6.5 RELATIONSHIP TO SYMBOLIC EXECUTION 198
6.5.1 Two demonstrations 199
6.6 RELATIVE MERITS OF THE STYLES OF IL 203
6.6.1 Organisation of registers in a register-IL 203
6.6.2 Data-flow analyses 205
6.6.2.1 Data-flow analysis for a stack IL 205
6.6.2.2 Data-flow analysis for register-IL 208
6.6.3 Register-IL vs. stack-IL 213
6.7 AN INSTRUCTION CHOICE METHOD 215
6.7.1 Transforming an MDL fine tree to a machine code gross tree 216
6.7.1.1 A demonstration of the method 217
6.7.1.2 Summary of the method 226
6.7.1.3 Another demonstration of the method 237
6.7.2 Register allocation on the gross tree 242

6.7.3 Choice of alternative machine code gross trees 248

6.7.4 Optimisation of the method of instruction choice 250
6.8 INSTRUCTION ORDERING FROM THE TARGET-MACHINE GROSS TREE 260
6.8.1 Assembly cases of instruction ordering 260
6.8.2 The "Linear section" case of instruction ordering 263
6.8.3 The "Fork point" case of instruction ordering 266
6.9 REMAINING TOPICS IN SECONDARY CODE GENERATION 271
6.10 REVIEW OF THE PROPOSED TECHNIQUES 274

7 A PILOT SECONDARY-CODE-GENERATOR SYSTEM
7.1 INTRODUCTION 275
7.2 THE SYSTEM 275
7.3 A STRUCTURED NOTATION FOR TREES WHICH CONTAIN ALTERNATIVES 283
7.4 A SIMPLE EXAMPLE CODE GENERATION 284
7.5 EXAMPLES INVOLVING TRANSFER SEQUENCES 286
7.6 RETARGETING 287
7.7 TWO REALISTIC EXAMPLES 293

8 OTHER USES OF THE TOOLS
8.1 INTRODUCTION 295
8.2 SOURCE PROGRAM TRANSFORMATION 296
8.2.1 Program transformation for simplification 297
8.2.2 Program transformation for optimisation 298
8.2.3 Program transformation for language extension 298
8.3 MISCELLANEOUS PROGRAMMERS' AIDS 299
8.3.1 Prettyprinting 300
8.3.2 Program profiling 301
8.3.3 Language-specific editing 301
8.3.4 Flowcharting 301
8.3.5 Cross-referencing 301
8.3.6 Portability aids 302
8.4 OTHER APPLICATIONS OF CF2 302
8.4.1 Verifications of programs and translations 302
8.4.2 Assembler-level bridgeware 303
8.4.3 Verification of architecture descriptions 304
8.5 DE-COMPILATION 304
8.6 LANGUAGE DEFINITION 305
8.7 OTHER POSSIBLE CF TOOLS 305
8.8 OTHER APPLICATION DOMAINS 307

9 SUMMARY
9.1 INTRODUCTION AND OBJECTIVES 309
9.2 THE ANALYSIS 310
9.3 THE DRAFT SOLUTION 311
9.4 BEYOND THE OBJECTIVES 312
9.5 FUTURE WORK 313
9.6 CONCLUSION 313

A ALLIED WORK 315

A.1 PQCC AND WASILEW 316
A.1.1 Expert systems 316
A.1.2 Overview of PQCC 317
A.1.3 Instruction Choice and Register Allocation 319
A.1.3.1 Register allocation 320
A.1.3.2 Instruction choice 322

A.2 TUM-CC, MUG1, MUG2, AND MILLER'S DMACS 325
A.2.1 TUM-CC as the base of MUG1 and MUG2 326
A.2.2 Miller's DMACS 328
A.2.3 MUG1 331
A.2.4 MUG2 335

A.3 OTHER ALLIED WORK 338
A.3.1 Aho/Johnson: configurable optimal code generation 338
A.3.2 Frailey: optimisation in an IL framework 339
A.3.3 Fraser: a knowledge-based meta-code generator 339
A.3.4 Glanville: instruction choice by parsing an ambiguous language 340
A.3.5 Goguen: analysis of IL to implement Wilcox's method 342
A.3.6 Kameda and Abdel-Wahib: beyond Sethi-Ullman register allocations 343
A.3.7 Sites: machine-independent storage allocation 343
A.3.8 Terman: a configurable secondary code generator 344

C TWO OPTIMAL CODE GENERATION ALGORITHMS
C.1 INTRODUCTION 347
C.2 THE SETHI-ULLMAN METHOD 347
C.2.1 The method 348
C.2.2 Some remarks on the method 351
C.3 THE HORWITZ-KARP-MILLER-WINOGRAD METHOD 354
C.3.1 The method 354
C.3.2 Some remarks on the method 359

P EXTENDED EXAMPLES OF PRIMARY CODE GENERATION 365

S EXTENDED EXAMPLES OF SECONDARY CODE GENERATION 387

U UNCOL DIAGRAMS 393

BIBLIOGRAPHY 397

LIST OF FIGURES

Figure  Title  Page

5-1 Organisation of the CF1 suite 125
5-2 Bootstrap use of CF1 133

6-1 The memory organisation for the example 164
6-2 A register-IL sequence ... and its gross tree
6-3 Derivation of the fine tree ... from the register-IL gross tree 191
6-4 Derivation of a fine tree ... from a PDP-8 gross tree 197
6-5 ... reverse polish parsing to obtain the gross tree of a stack-IL sequence 207
6-6 ... reverse polish parsing to obtain the multi-rooted gross tree of a register-IL sequence 209-211
6-7 ... fine/gross tree relationship for ... the PDP-8 machine instructions 222
6-8 The PDP-8 gross tree ... after allocation 249
6-9 Relationship of the IL gross tree, the fine tree, and the ... PDP-8 gross tree 257
6-10 Some fabricated optimised IL-to-PDP-8 equivalences 258

PREFACE

This project proceeded from a belief, arising from an extensive literature survey, that the problems of the automation of the creation of compilers have not yet been identified, let alone solved. In particular, a major stumbling block appears to have been the absence of a clear doctrine of how such tools ought to be used. The latter part of chapter 2 develops a doctrine.

Leading up to this, chapter 1 is a statement of why the work was undertaken, and in what frame of mind; the first part of chapter 2 is a distillation of the prior experience that seems relevant.

Chapter 3 develops a proposal according to the doctrine: one tool using a description of transformations from cases of syntax to corresponding intermediate language sequences; another using descriptions of intermediate language and of machine code. Chapters 4 and 6 develop ideas for one tool each; chapters 5 and 7 each describe an implementation of the key parts of the prior chapter.

Different readers may prefer to read only: chapters 1 to 3; or chapters 3, 4 and 5 (but excluding 3.2.2); or chapters 3, 6 and 7 (but excluding 3.2.1). This thesis is therefore almost three theses (and its size reflects that fact).

Some less definite thoughts are presented in appendix C because they influenced chapter 6: they concern the relationship between instruction choice and the allocation of temporaries needed by the chosen instructions.

The project proceeded almost from a rejection of prior work, so an orthodox survey of prior work is not meaningful. However, there have been other projects that groped in the same direction: they are reviewed in appendix A. They had little effect on this project, being either too different, too fragmentary, or too recent.

Chapter 9 summarises as far as possible, because (as the closing quote implies) this thesis is just a beginning.

CHAPTER 1

ORIGINS AND OBJECTIVES

... involve so many circumstances, such a great variety of

objects and of operations, that to give minute instructions

upon the subject absolutely demands a great space ... I

shall do my best to make myself clearly understood.

(William Cobbett, Esq., M.P.

"The English Gardener", para 225, London 1833)

Just as everybody must strive to learn language and writing before

he can use them freely for the expression of his thoughts, here

too there is only one way to escape the weight of formulae. It

is to acquire such power over the tool ... that, unhampered by

formal technique, one can turn to the true problem.

(Hermann Weyl,

"Raum, Zeit, Materie" Berlin, 1923)

The above quotations seem to me two sides of a single truth: Cobbett observes one case of the phenomenon that no language - whether English (as applied to the Training and Pruning of fruit trees), or "the language of mathematics", or any language - can remain adequate, because its existence is a challenge to express more and more complex ideas until it is no longer adequate. Weyl then sees that the objective is to find a better language in which to express these new ideas: then we can resume the search for even newer ideas, which in turn will also exceed the expressive capability of the "better language". Where others claim that the ascent of Man started with the taming of fire or the invention of the wheel, I believe it started with the first vocal expression beyond an animal grunt of exertion or squeal of pain.

This cycle of the development of ideas and of linguistic capability also dominates the history of our attempts to control our computers: a modern microcomputer capable of a million instructions a second would have been of little more use in 1950 than a much slower machine of the period (1), because the programming languages of the time would not have permitted the writing of programs sufficiently complex to take advantage of that speed.

With this observation it can be seen that the history of Computing parallels the development of the "mainstream" of programming language constructs and methods:-

Binary programming (1948) (2)
Octal or teletype-code programming (1949)
Octal programming with labels
Mnemonic and symbolic assemblers (1954)
Macro assemblers, subroutine libraries
Supra-macro languages (e.g. FORTRAN, BASIC) (1956)
High level languages (e.g. ALGOL 60) (1958-1964)
Record structures (e.g. ALGOL 68 and PASCAL) and Parnas modules (e.g. SIMULA 67 classes) - first attempts (1966-72)
(e.g. CLU, ALPHARD, ADA) - second attempts (1975 on).

The milestones of the above history start only a few years apart: they end almost a decade apart. Many reasons can be assigned to this slowing down, but I suspect a major one is that the step from conception of each of these milestones to its implementation is increasingly difficult and therefore increasingly slow. For an extreme example, the ADA project was conceived in 1975, the language was designed by 1979, it merited a further year of revision, and the first compilers should be available about mid-1983. It may not be until 1990 that ADA will have been in use long enough that experience will have shown the inevitable deficiencies, either as a result of attempting currently impracticable applications, or because the deficiencies are also inherent in present languages but hidden by more dominant deficiencies. So another decade will have passed.

(1) E.g. Manchester University MARK I, at 550 additions/second.
(2) The early dates are taken from chapter 15 of /LAV80/.

At a more immediate level, my primary interest is in the (human) efficiency of the process of programming - or rather with its inefficiency. Supervising other programmers I have been appalled at the amount of effort required to extract apparently simple effects from our present-day computers. In my own programming I have tried a variety of ideas for improving my own efficiency: used with great care they have helped, but it still horrifies me that the work described in this thesis has taken me about 2200 hours of programming: I can't help feeling it should have taken a quarter of that, that the extremity of care taken should not have been necessary, and that at the end of the completed parts I should have a much higher quality product, rather than one which in places is as delicate as tissue paper.

How then can we improve that efficiency? There are two approaches:-

• Methodologies of programming. The programmer invests time and self-discipline, and he accepts constraints on his programming freedom, so that later his program is sufficiently more understandable (and therefore more readily debuggable and extendable) that the investment is more than repaid.

• Tools to relieve the programmer of programming tasks that can themselves be programmed: from minor tools such as cross-referencers and pretty-printers to major tools such as high-level languages and the compilers that implement them.

As a supervisor I have learned that the greatest problem with methodologies is that they need enforcing: and this can create major managerial problems. As a worker, I find the self-discipline for the methodologies easy to intend, but in practice it is near-impossible to adhere to the intention without any lapse. Both as manager and well-intentioned worker I would desire that the methodology be enforced by technical (rather than human) means, by appropriate tools: they will be less fallible than we are. The key tools are the programming language, to prohibit particular stupidities (e.g. type rules to prevent incompatible usages), and the environment, to prohibit more general stupidities (e.g. because of the recompilation of one module to have an amended interface, to demand the recompilation of all other modules dependent on that interface prior to subsequent link-loadings).

The key approach therefore is "tools". I believe that since the mid-1960's the development of the tools has fallen behind that of the requirements, thereby leading to poor implementations of the requirements (with massively expensive fiascos), and retarding the development of the next generation of requirements.

THE REASON FOR THE LANGUAGES HAVING FALLEN BEHIND IS MAINLY THE COST OF WRITING COMPILERS: for a moderate sized language, 4 man-years for a "cheap and cheerful" compiler is typical, and 10 to 20 man-years for a "production-quality" compiler. BECAUSE OF THIS COST, EXCESSIVE TIME IS SPENT DITHERING ABOUT LANGUAGE DESIGN - trying to "get it right" by guesswork and by paper examples, rather than by experience of real working programs.

THE INTENTION OF THIS RESEARCH IS THEREFORE TO TRY TO DEVELOP TOOLS TO REDUCE THE COST OF WRITING "PILOT" COMPILERS FOR ORTHODOX LANGUAGES AND ORTHODOX MACHINES, AND SO ACCELERATE THE DESIGN/IMPLEMENT/EXPERIMENT CYCLE. A secondary intention has been that whenever the research has suggested a spin-off I have spent some effort investigating the spin-off. Such spin-offs have included thoughts on upgrading the produced compilers from "pilot" quality (i.e. cheap and cheerful, or even cheap and nasty) to "production" quality by "obvious" optimisations, and the use of the compiler-producing tools to produce other programs that take complex-syntaxed inputs (a major example being that I have used one of my tools to help write itself and others: I estimate this has saved me about 1000 hours: this because, having written the first tool by hand, I could see a purpose-oriented language in which to write major parts of itself - and this was none other than the language it implemented!)

After a considerable literature search it became clear to me that a crucial issue was the conceptual framework within which I would make the design decisions. This framework appears to have been missing from most of the efforts on automation of the production of parts of compilers during the 70's, with the result that those efforts were not usable (except in detail) without heavy amendment. The next chapter attempts to develop such a framework: given the lightness of its dependence on other works, it treats them only superficially: the usual detailed reviews of recent work are postponed to Appendix A.

The only attacks on the problem comparable to mine that I have been able to discover are the Carnegie-Mellon University PQCC (Production Quality Compiler-Compiler) project and the Munich Technischen Universitat MUG1 and MUG2 projects: both have the objective of helping to create complete compilers, but, with much larger teams, the produced compilers are intended to be of much higher quality than mine. This difference in emphasis has considerably reduced the impacts of PQCC and MUG1/2 on the present work, although they remain of prime interest.

CHAPTER 2

FUNDAMENTAL CONCEPTS

2.1 INTRODUCTION

Every computer yet devised is (in its raw state) grossly difficult for a human to program directly. Therefore since about 1960 we have devised synthetic languages intended to be fit for human use, and we have then written special programs (called "compilers") to translate from the synthetic ("high-level") language to the real ("low-level") language of the machine. This strategy increases the human programming efficiency by at least an order of magnitude.

A compiler is necessarily a large complex program: writing it by hand is fabulously expensive (typically one to twenty man-years - 20 to 400 thousand pounds). On the other hand many parts of all compilers are fairly stereotyped: this leads one to hope that programs to aid in the production of at least those parts of the compilers could be devised. Ideally it should even be possible to do this for compilers for any language and for any kind of computer.

In the 1960's many such Compiler-compilers were attempted for limited cases: the most successful confined themselves to a single kind of computer and primitive low-level-style specifications of the part of the target compiler which synthesises the low-level output. They were very successful with the creation of the analytic parts of the compiler, notably the parser (which determines the syntactic construction of the high-level input).

Barring 1960's-style anachronisms, very little attention was addressed to the problem in the 1970's, usually for one of two reasons:-

1) it's been done; or

2) it can't be done.

Whatever significant work was done demonstrated at most the practicality of various methods for PARTS of the synthetic aspects IN ISOLATION: or it discussed the problems of synthesis (i.e. of code-generation) to clarify the issues involved in fabricating code-generators.

The research reported here could be summarised as an attempt, with limited resources, to put together those practicalities, together with some further possibilities I have not seen in the literature. I accept that semi-automated compiler-construction is the best that can be hoped for (at least in the 1980's) and I wish to delineate what can easily be automated, and what can not.

Underlying my ideas are beliefs that:-

(1) at present the theory of the "front end" of compilers (viz parsers) is well understood, as is how to create them both by hand or machine;

(2) the back end is not understood in either of these ways: we have only a mass of previously-used techniques, fragmentary understandings, and folklore; and

(3) consequently too much of the writing about "compilers" is actually about parsers and lexic analysers (i.e. the "front ends").

I regard a parser generator as a minor and routine part of a compiler- compiler, not as the star attraction: the real interest and work lies in doing what comes after parsing and in doing it well (1).

(1) A lexic analyser is usually an elementary subroutine of a parser, and as such merits at most a footnote.

This chapter motivates my research. I contend that most so-called "compiler-compilers" to date have not, for one reason or another, actually been compiler-compilers (but either "compiler-assemblers" or "parser-generators plus a systems programming language" or "compiler-interpreters") and I discuss what would truly be a compiler-compiler.

2.2 MODELS OF COMPILATION

2.2.1 Non-optimising

The simplest compilers consist of code which causes the compilation to be performed in 3 phases (at least conceptually, though in practice the phases may be interleaved):-

1) Parse to construct a syntax tree (or other internal structure);

2) Walk the tree left-to-right, to fabricate an equivalent macro-like intermediate-language (IL) program;

3) Macro expand the IL, allocate and assign registers.

This model is very simple. Ideally phase 1 matches language syntax, phase 2 language semantics, and phase 3 machine code. The design of the IL is clearly crucial to achieve the separation of the language and machine: section 2.3.1 discusses the problems of designing the IL to facilitate such separation.
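By way of illustration only, the three phases can be sketched in a few executable lines (Python; the toy grammar, the IL and the pretend machine code are all invented for this example, and are not part of any tool described in this thesis):

    # A minimal sketch of the 3-phase non-optimising model on the
    # toy statement  a := b + c.

    def parse(tokens):
        # Phase 1: build a syntax tree for  <name> := <name> + <name>
        dest, assign, left, op, right = tokens
        assert assign == ':=' and op == '+'
        return ('assign', dest, ('add', ('var', left), ('var', right)))

    def produce_il(tree):
        # Phase 2: a left-to-right walk fabricating macro-like IL
        _, dest, (_, l, r) = tree
        return [('PUSH', l[1]), ('PUSH', r[1]), ('ADD',), ('POP', dest)]

    def macro_expand(il):
        # Phase 3: each IL instruction expands independently -- the
        # weakness discussed below: no machine instruction is chosen by
        # looking at combinations of adjacent IL instructions.
        macros = {
            'PUSH': lambda a: ['LOAD  R,' + a, 'STORE R,(SP)+'],
            'ADD':  lambda: ['LOAD  R,-(SP)', 'ADD   R,-(SP)', 'STORE R,(SP)+'],
            'POP':  lambda a: ['LOAD  R,-(SP)', 'STORE R,' + a],
        }
        return [line for op, *args in il for line in macros[op](*args)]

    print(macro_expand(produce_il(parse(['a', ':=', 'b', '+', 'c']))))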

The macro expansion of phase 3 prevents the emission of machine code instructions chosen because of combinations of adjacent IL instructions: as a consequence the design of the IL is constrained to what can be said precisely as macros of each of the machines for which it may be an intermediate: this often has unfortunate effects on the design of the IL as a whole. A second unfortunate consequence of the naive macro-expansion is that it eliminates most opportunities for post-walk optimisation.

Many minor variations of the above model are possible: for example the IL might not explicitly exist as text at any instant of the compilation time - it may be an unwritten history of calls of a set of routines in the compiler, each routine corresponding to one particular conceptual kind of IL instruction.

Personal experience of compiler-writing suggests to me something I have not seen in any text book: that some semantic actions are just as viable during parsing as during tree-walk, and sometimes (when phase 2 becomes multi-pass or not entirely left-to-right) actually more viable during parsing. The easiest of these to explain is name-table construction in block-structured languages, given the possibility that names might be declared in nested scopes: separate nested incarnations of the same name must be distinguished as distinct name-table entries at least by the end of the block in which the outer declaration occurs. As a consequence of parse-time name-table construction, several complications are avoided in various languages: for example in ALGOL two mutually recursive procedures may be declared so that at the calls in the first-declared of the second-declared procedure it appears undeclared to a single left-to-right walk, but not to a post-parse walk at the start of which all the declaration information is available. Also in strongly-typed languages some aspects of the symbol-table look-up and type checking are often viable at parse-time: sometimes this also is advantageous, e.g. when types can be declared and then procedures or operators can be over-loaded depending on types of operands (as in ALGOL-68, ALPHARD or ADA), type determination is so related to declaration that one cannot complete one without the other - either you nibble at one, then the other, then repeat: or you complete the two in one operation. On the fly during parsing you do what you can, and when a declaration is met of something to which there has been previous reference one can either go back in the tree to that reference-ahead and resolve it, or make a note to perform the resolution later (e.g. at the next end enclosing the reference-ahead). This parse-time resolution of references-ahead usually turns out to be just as trivial as branch-ahead resolution in an assembler.
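The parse-time name table itself is brief to sketch; the following minimal illustration (Python, all names invented) shows the scope stack that keeps nested incarnations of a name distinct:

    # Sketch of a parse-time name table for a block-structured language:
    # block entry opens a scope, declarations land in the innermost
    # scope, and lookup searches outward.

    class NameTable:
        def __init__(self):
            self.scopes = [{}]                  # the outermost scope

        def enter_block(self):
            self.scopes.append({})

        def exit_block(self):
            self.scopes.pop()                   # inner incarnations vanish here

        def declare(self, name, info):
            self.scopes[-1][name] = info

        def lookup(self, name):
            for scope in reversed(self.scopes):
                if name in scope:
                    return scope[name]
            return None                         # a reference-ahead: note it for
                                                # resolution at the enclosing end

    nt = NameTable()
    nt.declare('x', 'outer x')
    nt.enter_block()
    nt.declare('x', 'inner x')
    assert nt.lookup('x') == 'inner x'
    nt.exit_block()
    assert nt.lookup('x') == 'outer x'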

The policy of pushing semantic code into the parse can be taken to the limit (see paragraph 2.2.2, case A): also the interface between phases 2 and 3 can be made subroutine calls from 2 to 3, so that one has a one-pass compiler in which the tree is never used, and so need never be built.

(One outstanding example is the RSRE ALGOL-68R compiler). Such compilers tend to be fast, big (surprising until you consider they must be monolithic rather than overlaid) and to suffer deficiencies that make unusual requirements of the subsequent link-loader (e.g. ALGOL-68R has its own link-loader, with the result that ALGOL-68R code cannot be link-loaded with conventional ICL 1900 relocatable modules (2)). This phase-mixing is, however, a disaster if you wish to optimise: it compels you into a single simple left-to-right walk, whereas optimisation usually demands multiple walks, some going schizophrenically several ways at once. Only the first walk can easily be made simple left-to-right, and then it is expedient to combine it with parsing.

(2) But who would need - or even want - to do such a silly thing? (says any ALGOL-68 purist). Answer: I for one, because I write real programs rather than textbook examples.

2.2.2 Optimising model

Where in the non-optimising model can we add optimisation? Everywhere:

A) In the PARSE - really a combination with a first walk;

B) At the interface between parsing and IL production, as (possibly multiple) walks which DECORATE the tree locally with information analysed from non-local inspections of the tree, and/or which EDIT the tree;

C) During IL PRODUCTION - which may be a progressive incremental process spread over several walks, or which may be interleaved with some of the decorate/edit walks of (B);

D) REVISION of the produced IL: some optimisations might fit best here because of the linear program-representation (rather than the tree-structured representation of B & C) but before machine-idiomatic details complicate affairs: again this may be several passes;

E) During the INSTRUCTION CHOICE from the IL of a first approximation to the eventual machine instruction sequence: the main optimisation is not simply to macro-expand each IL instruction but to choose (better) collective expansions of several lines at a time.

F) FINALISING the produced machine code, for example:-

• REGISTER ALLOCATION (3) - deciding which operands should be in some register and at what times, given the numbers and kind of registers available on the target machine, and making any amendments this requires to the chosen machine instructions; and

• REGISTER ASSIGNMENT - deciding in which register to hold an operand previously allocated to some register, resolving any conflicts of assignment by again revising the machine code:

and ASSEMBLING the final code into whatever form is externally required - for a production compiler usually to relocatable binary in a file.

(3) Register allocation is sometimes a byproduct of instruction choice! (For example, see chapter 6).

2.2.2.1 Examples of Optimisations for Different Phases of Compilation -

In place of the 3-phase non-optimising model we now have a generalised model with potentially 6 phases. Given a list of optimisations, I found many fit best in only one of the six phases: examples:-

A) Constant subexpression evaluation and consequent dead-code elimination during parsing; this is a common possibility in programs that have been over-maintained, but (more importantly) can arise through use of language features such as named constants, macro expansion, or generic subroutines.

B) Global data and program flow analyses (as a DECORATION); common sub-expression detection, dead variable elimination, constant propagation leading to more constant sub-expression evaluation and consequent dead code elimination (as EDITs);

C) Anchor pointing (also called "short circuit testing"): that is, PRODUCING:-

"if complexboolean then ... else ..."

as a mass of simple tests, rather than evaluating the boolean to a register and performing a single test on that:

e.g. for :- if A or B ...

not:- X:=(A or B); if X ...

but:-

    if A then goto thenbranch;
    if not B then goto elsebranch;
    thenbranch: ...
                goto exit;
    elsebranch: ...
    exit:

D) During the REVISION pass, permuting the order of the linear blocks (4) of the program to reduce the number of jumps compiled (static) or executed (dynamic), e.g. straightening out:-

        ... goto B;     (Horrifyingly
    A:  ... goto C;     (common in
    B:  ... goto A;     (heavily-maintained
    C:  ...             (programs.

(where there are no other references to A, B or C) into:-

        ... goto B;
    B:  ... goto A;
    A:  ... goto C;
    C:  ...

then removing the redundant labels and gotos.

E) During the INSTRUCTION CHOICE phase, employing machine idioms: expanding:-

    LOAD A          ... A very common
    ADD 1               IL sequence.
    STORE A

not as:-

    LOAD R5,A       ... In machine
    ADD R5,1            code.
    STORE R5,A

but as:-

    LOAD R5,1       ... Using A's double
    ADDTOMEM R5,A       reference.

or as:-

    INCREMEM A      ... Using the fact that the increment is 1, on a
                        machine with an "add one to memory" instruction.

Clearly both these latter sequences are dependent on the machine having suitable instructions. For example, on the PDP-8 there is almost a suitable instruction for the INCREMEM of the last sequence:-

ISZ (Increment memory and Skip if result Zero)

which is mainly intended to precede a jump back to the head of a loop controlled by counting a negative loop-index to zero: but ISZ supplies the "increment" action, so to discard the "skip" effect we could write a NOP (no-operation) instruction to be skipped after the ISZ.

(4) A linear block is a section of code that does not contain any labels except at the start of the section, or contain any jumps (conditional or not) except at the end of the section.

F) FINALISING: replacing any jump to an instruction that is itself a jump by a direct jump to the destination of the second of the original jumps (this optimisation is often called "branch chaining"; a sketch follows this list). A more complex case is when the jumps are conditional: if the condition of the first implies that the condition of the second will be satisfied, the first jump may be retargeted at the destination of the second; if it implies the condition of the second cannot be satisfied, the first jump is amended to have the instruction following the second jump as destination. These optimisations might first be desired after re-organisation of linear blocks (at phase D), but since phase E might introduce further local jumps they cannot be carried to completion until after phase E.
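The simple (unconditional) case of the branch chaining of (F) can be sketched as follows (Python; the representation is invented for the example: jumps are ('JUMP', label) pairs and labels are ('LABEL', name) pairs):

    # Sketch of unconditional branch chaining: any jump whose destination
    # is itself an unconditional jump is retargeted at the final
    # destination, with a guard against cycles of jumps.

    def chain_branches(code):
        target = {}                      # label -> where its jump goes
        for i, (op, arg) in enumerate(code):
            if op == 'LABEL' and i + 1 < len(code) and code[i + 1][0] == 'JUMP':
                target[arg] = code[i + 1][1]

        def final(label, seen=()):
            if label in target and label not in seen:
                return final(target[label], seen + (label,))
            return label

        return [(op, final(arg)) if op == 'JUMP' else (op, arg)
                for op, arg in code]

    code = [('JUMP', 'A'), ('LABEL', 'A'), ('JUMP', 'B'),
            ('LABEL', 'B'), ('WORK', None)]
    print(chain_branches(code))          # the first jump now goes straight to B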

2.2.2.2 Optimisation-order Problems -

With some combinations of optimisations it is hard to decide in what order they should occur:-

• Positive optimisation-order problems occur when two or more optimisations should ideally be applied each after the other, e.g. constant sub-expression evaluation and constant propagation each reveal opportunities for the other.

• Negative optimisation-order problems occur when two or more optimisations should ideally neither be applied before the other, because they (may) hide opportunities for the other.

There are several possible ways of resolving these problems:-

other because they (may) hide opportunities for the other. 28 There are several possible ways of resolving these problems:-

* For some combinations of optimisations that cause order-

problems it is possible to find a unifying method: e.g. a

single optimisation that subsumes both the problematic ones.

* Iteration: perform one optimisation, then the other, and

repeat till neither is applicable: i.e. till we have

convergence. We could iterate global walks or local walks.

This method is only applicable to positive optimisation

order problems.

* One could analyse the cost-effectiveness of the choices

(mainly for negative optimisation order problems). The

choices build up combinatially and so exponentially:

if the only analysis method available is brute force we

soon run out of machine power, so at some level we start

needing heuristics, guestimates, or just plain wild guesses.

Guestimate limits may also be placed on iteration for positive optimisation-order problems: this obviates the programming problems due to the possibility of the iteration settling to a cycle: e.g. to a pair of solutions, the first optimisation converting from one to the other, the second converting back, so that convergence never appears (at least when convergence is detected by setting a per-walk flag denoting that some transform was applied during that walk). (One could instead iterate a fixed number of times: often later iterations are not cost-effective, but occasionally they are remarkably effective due to log-jams breaking.) Other techniques of avoiding cycling can be devised: e.g. by not globally iterating indefinitely ABABABAB... but by iterating ABA (only) at each node in the tree, working from leaves to root.

2.2.2.3 Review and Terminology -

The most general model of compilation I assume in the current research is:-

A) PARSE for tree: elementary semantic operations;

B) DECORATE/EDIT the tree;

C) PRODUCE the IL from the tree;

D) REVISE the IL;

E) INSTRUCTION CHOICE of the machine code from the IL; and

F) FINALISE the machine code.

I will use the underlined words to refer to the phases. Each phase may be one or more passes, or it may be part of a pass. Decorate/edit and produce phases may be interleaved. B-F are code generation: B & C are primary code generation: E & F are secondary code generation. D (REVISION) may be included in either primary or secondary code generation, or regarded as an intermediary.

2.2.2.4 Correspondences -

The PARSE fraction of phase A hopefully corresponds to language syntax: primary code generation to language semantics: secondary code generation to machine definition. D (REVISION) ideally corresponds to machine-independent, language-independent abstract knowledge, although much of this tends to be oriented to the domain of objects on which the language operates (5), and so is weakly dependent on how classes of languages interpret the abstractions of the objects (including overflow or other side-effects). However, efficiency tends to demand much of this knowledge elsewhere: earlier for constant sub-expression evaluation, later for deciding which branch-conditions subsume another.

(5) e.g. for the integers we need at least Peano's axioms (and consequent theorems), propositional logic and predicate logic.

2.2.3 Limitations of the models

The main observation to be made about the above models is that they are far too simple. In a hand-crafted compiler there will be many more phases than those described above: each such phase will have a very specific task and will be designed to embody abstract knowledge for that task (as foreshadowed in the preceding sub-section). I do not have the resources to attempt to produce such compilers and so have set more limited objectives. See appendix A section 1, and in particular /LEV80/, for discussions of "expert" phases in more complex models.

2.3 Cheaper methods for creating compilers

Writing any kind of program is a slow and expensive task: compilers - especially optimising compilers - are exceptionally complex and large programs, so it follows that writing them is exceptionally expensive. To reduce that expense a number of strategies have been proposed: this section discusses them. The first (and most obvious) exposed a major problem: the inherent mismatch of the high-level languages we wish to compile with the machine code languages of our computers (see also /McKEE74/).

2.3.1 The UNCOL Problem

This early proposal (1958-61) for reducing the cost of writing compilers for many languages on each of many varieties of machine was to devise a language sufficiently low that it is language-independent, in the sense that for all desired high-level languages it is possible to write a front-end compiler into this language: but also to choose it sufficiently high that it is machine-independent, in the sense that it is possible to write a back-end compiler from it to the language of any of the desired machines. To construct a compiler for a particular language and a particular machine we now take the language's front-end, and input its output into the machine's back-end. For L languages on each of M machines we now need to spend effort of order L + M rather than L * M: i.e. proportionately less as L and M get large.

Obviously the choice of the intermediate-level language is crucial: not too high, not too low. This was therefore the first thing to design: it also gave its name to the project (UNCOL = UNiversal Computer Oriented Language), and, when the project failed, to the problem that caused the failure. To summarise the intent:-

    Languages       L1   L2   L3   L4 ...
                      \   \   /   /
    Intermediate         UNCOL
                      /   /   \   \
    Machines        M1   M2   M3   M4 ...

It emerged however that any "highest common factor" of the languages would be lower than any "lowest common multiple" of the languages of the machines of the period:

    Languages       L1   L2   L3 ...
                (no single intermediate level spans the gap)
    Machines        M1   M2   M3 ...

This phenomenon is apparently due to a need for some phase of the translator to consider certain aspects of language semantics simultaneously with corresponding aspects of machine architecture.

Worse: attempting optimisation increases the number of aspects needing simultaneous consideration.

The project foundered. Since then the variety of languages and of machines - even those that might be labelled "orthodox" - has increased, exacerbating the problem.


See /CAMP!!/ and /SHA58/ for further information on the original UNCOL project.

Variations on the UNCOL idea have been tried several times since the original UNCOL project. The best known example is the JANUS project which liberalises the UNCOL idea from a specific detailed intermediate language to merely providing a framework for intermediate languages: the framework is to be fleshed out for a specific application. There have been several papers published on JANUS including /CBUE.1A/, ADVS^ and ^NEWVj?/7. They appear to be more concerned with the possibilities of JANUS than with actual applications of it: in particular they observe that the differences between high-level-languages in control flow primitives tend to disappear in the process of translation towards machine level, but that differences in data organisation and data manipulation primitives often remain important in the final machine codings.

2.3.2 "Portable" compilers

Compilers for several languages have been written quite successfully using a simplification of the UNCOL concept to achieve a limited machine portability: instead of a language-independent and machine-independent intermediate language they have produced a language-DEPENDENT but machine-independent intermediate language. Thus instead of M front-ends and M back-ends of compilers for the language on M machines, only one front-end is needed - but still M back-ends. If affairs are carefully arranged so that as much work as possible is pushed from back-end to front-end, by "lowering" the level of the intermediate language as much as independence of a reasonably large class of machines allows, then as much of the effort as possible is pushed from the part of the work to be done M times to the part done once.

Over several machines this would reduce the labour for the single language to maybe half or less of that for separate hand-crafted compilers. In the back-ends there would be requirements for identical (or very similar) subroutines, so by re-using such code further savings could be achieved: for the more machine-dependent routines in the back-ends there may be the opposite possibility of using them in back-ends for other languages on the same machine.

There are several well-known examples of this method: two are reviewed below. Most "portable" compilers have been written in their own language (see section 2.3.4).

The BCPL language was designed for writing systems programs, and has been used for programming its own compiler, for operating systems research, and for such run-of-the-mill systems applications as a spooler for output to microfiche. The BCPL implementation "package" is based on an intermediate language called INTCODE /RICH75/: earlier versions of the package were based on a more complex and more inconvenient intermediate language called OCODE /RICH69/ /RICH71/ - the need for change illustrates the difficulty of designing a "best" intermediate language even for a single language.

(like orthodox assembler languages). Essentially they are the machine codes of abstract "BCPL machines" based on stack architectures.

The prototype code-generators for INTCODE and OCODE to machine code simulate the stack, delaying the output of the object codes as long as possible. The longer this "possibility", the more opportunity for local optimisations. Unfortunately these "possibilities" are relatively short due to frequent OCODE/INTCODE directives that require the stack be "normallised" (e.g. because of a label in the BCPL source code): this requires output of machine code to match the currently-delayed previously-read intermediate code (6). This leads to very good very local code at the cost of poor medium-term optimisation. The problem is essentially that in the translation from BCPL to intermediate language, much contextual information is lost, taking with it many optimisation opportunities which it is hard or impossible"for secondary code-generaton to recover. Additionally, since the abstract machine of the intermediate languages is based on the BCPL source language itself, the intermediate language tends to model BCPL and so merely postpones the source/object conflict: in particular on stack architecture machines the hardware-provided stack is especially likely to conflict with intermediate language machine stack. INTCODE was an attempt to ameliorate these problems after the experiences with OCODE, but still was far from an ideal. It was also supported by introduction into the "front-end" of several constants that could be set to specify particular features of the target machine architecture (e.g. the

(6) This need not apply to the well-disciplined labels (and jumps) due to IF-THEN-ELSE constructs or WHILE constructs (even if complex tests are anchor-pointed). It must apply to labels wnich are destinations of explicit GOTOs in the source program. See ^/kC)R807. 35 number of bytes per word). Even with this last, it would still

appear that the intermediate language needs some more flexibility

to adapt to the target machine, both for ease of implementation and

for better optimisation.

A similar regime but with greater adaptability of the front-end and intermediate language is that of the ALGOL 68C compiler and its intermediate language ZCODE. The (front-end) compiler is parameterised with information that includes the numbers and types of the registers of the target machine, and so can try to optimise their use. The parameterisation causes conditional compilation within the compiler, and so also leads to a smaller faster compiler than would be possible with one that had not thus been made target-specific. Even the greater flexibility of the ALGOL 68C approach still seems inadequate for encompassing some current architectures.

In this research, for lack of resources, I have accepted the "language-specific intermediate language" approach and tried to deal with the consequences, at least until such time as experience in using my tools may suggest in which ways they ought to be made more flexible.

2.3.3 High-level machine codes

A converse approach to the "portable" compiler is the language-independent machine-dependent intermediate language. This is inherently lower-level than the "portable" compiler approach and so involves much more work in writing the front-ends of compilers, which then will not run on other machines. This may be what is needed (e.g. for marketing reasons) but technically is less attractive than the "portable" compiler method to anyone who has responsibility for machines that do not fit the chosen regime. For a single machine (e.g. Hewlett Packard 3000) or closely related series of machines (e.g. the ICL 2900 series) it has proven a viable technology. On the 2900s it is the input-language of the link-loader, in which one instruction may generate different numbers of real machine instructions on different models of the 2900 range: a particular input sequence may generate output which depends on the whole input, rather than being an expansion of the first instruction of the input sequence followed by an expansion of the second instruction of the input sequence ... - in short the loader is a back-back-end for all the 2900 compilers.

2.3.4 System implementation languages

One strategy applied to the writing of "systems programs" - including operating systems and support utilities as well as compilers - is based on the observation that when writing such programs one tends to use some programming constructs found in the general-purpose languages, and to neglect others. Furthermore, one also tends to want a few programming language constructs that would be very useful for these programs but are not generally available (7).

The systems implementation languages (SILs) have been designed to incorporate these desirable language constructs (8). One desirable feature in such languages is recursion: this can be forged using an explicit stack in a non-recursive language, but that leads to complex code which is expensive to write, difficult to debug, and near-impossible for someone other than the author to maintain. Another desirable feature is that the language be compilable into machine code of sufficient quality that the systems program runs at an acceptable speed. This last is the requirement that usually rules out the "general purpose" languages such as ALGOL 68 or SIMULA 67, because the compilers for them must be capable of all the non-systems constructs of the languages, and this generality costs the run-speed possible from a language carefully restricted to meet the speed criterion.

(7) In some instances I have suspected that facilities identified as systems-program oriented are instead systems-programmer oriented.

(8) This could be thought of as the least possible form of the stereotyping referred to in section 2.1.

The three well-known languages in this class are BCPL (already mentioned in section 2.3.2), the C language /RICH78/ (9), and BLISS (10).

For such languages there is the problem of in what language to write the compiler. It ought to be the language itself, but cannot be, because the first version of the compiler will be needed in a compiled form to compile that same first version. The usual technique used to escape this impasse is to write the compiler for a minimal subset of the language in itself, hand-translate that to another already-available language and so obtain the minimal-subset compiler; then to amend the source of the minimal version to still be in the minimal language but to compile a larger subset, and compile that using the previously-compiled minimal-version-in-itself. By repeating steps like the last, the language can be expanded as needed. This process is called BOOTSTRAPPING. One variation of bootstrapping was critical to the work reported in this thesis: that version is described in more detail in section 2.5. For a recent general review of self-compiling compilers see /LEC78/.

(9) C is a direct descendant of BCPL. It is the language in which its own compiler and the rest of the system software is written, including UNIX itself (apart from a very few parts which must be in assembler). As a result the UNIX system has been easily transported to several models of computer from the original PDP-11.

(10) The PQCC project is very similar to that reported in this thesis. PQCC is originally a generalisation of the BLISS compiler for the PDP-11.

2.3.5 Compilers based on macro-processing and pattern-matching

In the late fifties it became common for assemblers of machine code programs to have some kind of macro facility: the idea is to define a repetitively-used sequence of instructions as being equivalent to a name, and then to use the name in lieu of the sequence. Providing the name is well-chosen to suggest the effect of the sequence to which it is expanded, the program becomes clearer to read and write, and is easier to maintain because changes within the sequence need only be made once, rather than at each use (which would lead to the practical problems of being totally consistent). These schemes often allowed parameterisation and conditional assembly: for example in the GIN macro-assembler for the ICL 1900 computers the macro definition:-

    //MACRO MOVE
    //ACC %A                if parameter A is an accumulator
    STO %A %B                 assemble this line,
    //ACC %A                if parameter A is an accumulator,
    //SKIP                  this line causes the bracketted
    (                         group of lines to be skipped.
    //ACC %B
    LDX %B %A
    //ACC %B
    //SKIP
    (
    LDX 0 %A
    STO 0 %B
    //NORMAL                ; end of the macro-definition

defines a macro to move the contents of its first parameter (%A) to be the new contents of its second parameter (%B), choosing one of three code sequences dependent on whether or not one (at least) of its parameters is an accumulator (i.e. on the ICL 1900, is of address in the range zero to seven). A use of the macro of form:-

    MOVE 5,NOTACC

where NOTACC represents an address that is not an accumulator, will produce the single instruction STO 5 NOTACC to store from accumulator 5 to NOTACC; whereas:-

    MOVE NOTACC,5

will produce LDX 5 NOTACC to load accumulator 5 from NOTACC; and:-

    MOVE NOTAC1,NOTAC2

loads from NOTAC1 to accumulator zero and then stores that to NOTAC2.

The optimisation facilities shown possible (though tedious to define) by the example make macros much more than "in-line subroutines".
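The case-analysis performed by the MOVE macro can be mimicked outside the assembler; the following sketch (Python, invented for illustration: it is not GIN, and it simply takes the ICL 1900 convention that an accumulator is an address in the range zero to seven) reproduces the three expansions shown above:

    def is_acc(operand):
        # on the ICL 1900, an accumulator is an address in 0..7
        return isinstance(operand, int) and 0 <= operand <= 7

    def expand_move(a, b):
        if is_acc(a):
            return ['STO %s %s' % (a, b)]          # store from accumulator a
        if is_acc(b):
            return ['LDX %s %s' % (b, a)]          # load accumulator b
        return ['LDX 0 %s' % a, 'STO 0 %s' % b]    # via accumulator zero

    print(expand_move(5, 'NOTACC'))        # ['STO 5 NOTACC']
    print(expand_move('NOTACC', 5))        # ['LDX 5 NOTACC']
    print(expand_move('NOTAC1', 'NOTAC2')) # ['LDX 0 NOTAC1', 'STO 0 NOTAC2']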

This prompted the questions:

1) Can the macro-facilities be divorced from the assembler?

2) For what other applications can macros be used?

In answer to question 1, quite a number of "general purpose macro-processors" were written, the best-known early one being Strachey's GPM.

The main loss of the divorce from the assembler is that information available at assembly-time cannot be used, for example the values corresponding to names used in the assembly program. The theoretical answer to question 2 is that GPM is equivalent to the lambda calculus and so is as capable as the most powerful computational system yet devised. The practical answer to question 2 is that whereas simple applications are easy to "program" in GPM-like systems, even mildly complex applications are very difficult to "program". The second generation of macro processors tried to meet this practical difficulty by adding facilities such as variables and elementary pattern recognition facilities to "trigger" a macro into expansion: a good example of this was Brown's ML/1 macro processor /BROW67,69/. An interesting variant was Waite's STAGE2 macro processor, itself implemented in a simpler macro-system written as a 96-instruction FORTRAN program - so either by porting the FORTRAN to another machine or by hand-translating the FORTRAN to assembler for another machine (which took about 4 hours for a machine) it was simple to port STAGE2 from machine to machine /WAI67/ /WAI70/.

At this period it was hoped that macro-generators could be used to implement compilers for languages such as ALGOL 60 and its immediate descendants, but this possibility was soon obsoleted by the development of much more complex languages such as ALGOL 68. Since then there have been a number of related efforts to provide very cheap compilers for simple languages: for example McCaig /McC80/ uses SNOBOL-inspired pattern matching to trigger macro expansions: for one simple language McCaig gives a 355-line compiler, but claims that an equivalent compiler written in PASCAL took 2700 lines. Regrettably McCaig does not give a breakdown of that 2700 lines, but one gains the impression that many of them would be to provide facilities equivalent to some built into his macro processor and likely to be needed in most compilers written in PASCAL.

2.3.6 Compiler-compilers: main concepts and a brief history

Searching for more opportunities for stereotyping in compilers (as opposed to in systems programs generally) led to an alternative to the UNCOL concept: to write a program or suite of programs to produce a compiler from input descriptions of:-

1) The language;

2) The machine; and possibly also

3) Information independent of both machine and language, for example "anything plus zero gives what you started with"; and

4) Information dependent on both machine and language, e.g. decisions by the implementor on "policy" matters such as how best to implement stacks on this machine (would they grow up or down? does the machine have stack instructions usable for the stack of the language being compiled?) and the best procedure call/entry/exit sequences.

The fundamental assumption is that the concatenation of front-end (for parsing and primary-code-generation) and back-end (for secondary-code-generation) of the UNCOL scheme allows no interaction between language- and machine-dependencies, but that it might be possible to write a program capable of resolving the interactions and so produce a compiler either in some machine code, or as source in (say) a SIL.

In the 60's this approach was severely limited, mainly because the problem was not properly understood. Not many compilers had been written, so the desirable stereotypes had to be founded on insufficient experience. The "descriptions" (11) of the various requirements would have to be given in some notation: but what should be the properties of such a notation? If those notations were regarded as a high-level language specific to writing a compiler, then the program that would output the required compiler would be something-like-a-compiler for compilers, and so was called a Compiler-compiler (11).

(11) Vague terms for what (so far) are vague concepts.

As a result of this inexperience the attempts at compiler-compilers in the 1960's met with only minor success. Following the ALGOL report, the BNF (12) notation used in the report for describing language syntax (or variants of the notation) clarified that part of the problem and also provoked considerable research into syntax-analysis methods (also called parsing methods), so that by the end of the 1960's that aspect at least was well understood. High points in this research into parsing were the theses of Earley [EAR68] and DeRemer [DeR69], which presented methods for programs which input a syntax description and so configure themselves into parsers for the syntax (13), and the development of packages like SID (syntax improving device) [FOS68], which for some syntaxes can convert them into an equivalent syntax for which a fast parser can then easily be output.

Unfortunately all these advances had major disadvantages. To analyse an input, the generalised parsers of Earley and DeRemer can (and often do) take time proportional to the cube of the length of the input. SID-like transformations obtain an "equivalent" syntax parsable in time proportional to the length of the input, but only preserve equivalence of recognition of the input, and would produce a differently-organised analysis than that of the original untransformed syntax, so that "actions" expressed in terms of the untransformed syntax would not be applicable to the analysis according to the transformed syntax.

If syntax-related actions are to be performed, there appears to be a choice between either inefficient parsing or structural restrictions on the syntax.

(12) BNF stands for "Backus-Naur form", after the originators of the notation, J. Backus and P. Naur. (13) For over a decade it was thought that the methods of Earley and DeRemer were quite different, until [GRA80] showed they are in fact instances of a single generalisation.

For the semantics of the syntax constructs - i.e., to the compiler writer, into what they are to be translated - little progress was made before 1970 (and what was achieved was ignored in many so-called Compiler-compilers and -generators of the 1970's). These early systems accepted syntax definitions (in variants of BNF) annotated with "actions" to perform during parsing, at times corresponding to the points in the input suggested by the places of annotation in the syntax. The actions usually would be written in a fairly orthodox programming language extended with facilities for accessing the information obtained by the syntax analysis. In the early systems this programming language was usually in (or like) assembler: in later systems (and particularly in the anachronisms of the 1970's) in a language based on ALGOL 60 or PL/1. The programming language would often be extended and supported by libraries of subroutines common in compilers. In short, this first wave of attempts at Compiler-compilers input the syntax of the language and generated the appropriate parser, which was fleshed out with hand-written code. Neither the semantics of the language nor the description of the machine would be inputs to these systems (14). Such systems usually used the parsing as a logical framework in which to fit all or part of the code generation process: as a result they were often referred to as "syntax-directed", but the term appears to have been used very loosely.
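A minimal sketch of that arrangement (in Python; the real systems emitted assembler- or ALGOL-like code, and all names here are invented): the parsing skeleton below is the part such a system generated from the BNF, while the emit calls play the role of the hand-written "actions".

    # BNF: <sum> ::= <term> { "+" <term> }
    def parse_sum(tokens, emit):
        parse_term(tokens, emit)              # generated parsing skeleton
        while tokens and tokens[0] == "+":
            tokens.pop(0)
            parse_term(tokens, emit)
            emit("ADD")                       # hand-written semantic action

    def parse_term(tokens, emit):
        emit("PUSH " + tokens.pop(0))         # assume terms are single operands

    code = []
    parse_sum(["1", "+", "2", "+", "x"], code.append)
    print(code)    # ['PUSH 1', 'PUSH 2', 'ADD', 'PUSH x', 'ADD']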

Some of these systems were successfully used to create compilers that saw considerable practical use (notably the "Compiler-compiler" of Brooker, Rohl and Morris [BRO62, 63] (15)), but their inadequacies prompted research into how to describe semantics both of programming languages and of machine codes, and then how to take such descriptions and create appropriate code-generators (16). Two major problems were encountered: devising a notation for the semantics (17); and how to match language semantics to machine code semantics for individual language constructs and so induce a translation from the language construct to a suitable machine code sequence.

(14) They were also usually specific to a single computer architecture. (15) As far as I can find out, the name of this system is the origin of the generic term.

If only a single computer architecture were to be the target of the produced compilers then these problems become much simpler: so around 1970 there were several systems in which the language semantics could be expressed in absolute terms (and therefore machine-independently), and the system had built into it the "knowledge" of how to translate from those absolute terms to the specific machine. By producing corresponding systems with identical inputs, each outputting compilers for a different machine, a large measure of machine-portability could be achieved: such a series of systems would have much common code, so only the differences would need re-writing to deal with a new machine. Perhaps the best-known of these systems is Feldman's FSL (Formal Semantic Language) [FEL64] (18): it is typical of them in that the writer of an FSL-input was excessively strait-jacketed in the semantics, being unable to easily write (for example) even ALGOL 60-style symbol table processing - and these systems did not provide any "escape hatch" back to assembler or to a high-level general-purpose language to facilitate "filling in" these gaps.

(16) It could be argued that the hand-written code-generators of the first-wave compiler-compilers incorporated such descriptions implicitly, whereas these efforts aimed to make the description explicit. (17) Or even of defining quite what semantics are in general, let alone in a particular instance of a programming language or a program. (18) One of the objectives of FSL was readable semantics (rather than cryptic semantics): this was one of the inspirations of the present thesis.

Research on the more general problem of what semantics should be, and how they should be notated, started in the mid-1960's. It has yet to yield any spectacular results: even now the "new" language that is given a formal definition is a rarity, and in the instances that have a formal definition it has been an "afterthought" intended to

clarify the English text of the original definition (19) rather than to replace the English. Unfortunately this research became confused with that into "semantics for purposes of program validation", which appears to be a quite intractable problem: the requirements of Compiler-compilation are rather different and may be rather easier to meet, but have been largely neglected. There are some examples of validation-directed semantic definitions being used for Compiler-compilation purposes - e.g. [?] - but apparently with rather limited success. The work reported in this thesis has required a few fragments of a semantic theory: some of the ideas have been developed for the purpose, some stolen from the Production Systems branch of Artificial Intelligence.

What little work has been done on the semantics of machine codes has tended to be by computer hardware designers, and so has been oriented to purposes such as the simulation of proposed machines in order to validate the logic design of the machine prior to the expense of machine construction (e.g. [?]).

However, throughout the 1970's there has been an undercurrent of research into the semantics of intermediate languages, of machine codes, and into how they can be related to produce secondary-code-generators.

(19) The ADA language is one such case: and as one of the designers of ADA has said, "The formal definition dries up just as it starts getting interesting".

Most of the individual researches have chosen one facet of the problem in isolation from all others: as a consequence they are mainly useful as a source of ideas rather than of detailed techniques.

A detailed review of all the literature on Compiler-compilers and related work would not add any clarity to this thesis at this point. Reviews of the efforts most relevant to this thesis have therefore been postponed to Appendix A, and detailed issues relating to other work are discussed at appropriate points throughout this thesis.

My personal analysis of the situation at present is that the biggest problem in Compiler-compilation is that we do not properly understand the problem. One manifestation of this is the variety of words that have been used in this subsection as near-synonyms of "make", as applied to the bringing of compilers, and parts of compilers, into existence: "Compile", "Generate", "Produce", "Write", "Output", "Create" - all have different connotations. Similar confusion showed at deeper levels in most of the work I examined in my initial literature search: so in section 2.4 I try to clarify the taxonomy; in section 2.5 I try to clarify the underlying concepts of what exactly is a Compiler-compiler, whether it is possible, and whether it is what we want - and if not, what? Finally, chapter 3 develops the novel approach to Compiler-compilation that is the main subject of this thesis.

2.4 CLARIFICATION OF TERMINOLOGY

What should be called a "compiler-compiler"? I ask this question because it seems to me that most products that have been called "Compiler-Compiler" have instead been one of:-

* a Parser generator with facilities for inserting hand-written semantic generators written in a systems programming language;
* a Parser generator with semantics-assembler (a "compiler-assembler"?), in the sense that the specifications of what code is to be produced (and maybe of how it is to be produced) must be written in a crude low-level notation that is only weakly purpose-oriented; or
* a Compiler-interpreter (20) with ... (various);

but none of them truly either a Compiler-Compiler or a Compiler-Generator.

The problem seems to me to be one of loose terminology leading to loose thinking: so in this section I define the words causing the problem; in the next section I ask what we really want and what modifications are immediately needed to make it practical. Various other restrictions and requirements are developed in subsequent chapters.

2.4.1 Definitions

In the following sub-sections several terms are taken for granted, on the basis that they are "safe" for our present purpose:-

Programming language
High-level programming language
Low-level programming language
Machine code
Syntax
Semantics

Both these terms and those defined below are taken in the computer context, with all the connotations that implies extra to normal English usage.

(20) Sometimes called a "meta-compiler".

A SPECIFICATION is a precise and complete description of a problem to be solved. A METHOD is one way of manipulating data to solve a problem (but seldom the only way to solve that problem). A PROGRAM is a statement of a method written in a form suitable for translation by computer to an equivalent statement (for the same method) in machine code.

To emphasise the differences between the above: a specification is a statement of what is to be done; a method is a statement of how something is to be done; a program is a realisable form of a method. There does not appear to be any need to distinguish between realisable forms of specification and un-realisable forms (e.g. in English) - though as the technology of realising specifications becomes better developed, such a need may arise.

A COMPILER is a program for translating a program in a high-level programming language to an externally available program in a significantly lower-level programming language (usually, but not necessarily, to machine code). A GENERATOR is a program that accepts an input specification and translates it into a program (either in a high- or low-level language): this implies that the generator must supply a method, no matter how "obvious" the method might be (21). The generator may make a choice from a set of methods built into it (22), or it might make use of Artificial Intelligence techniques to deduce a method (23).

(21) This includes (as generators) programs which only have one method available to be supplied: and this may in turn impose restrictions on what might be specified.

An ASSEMBLER is a program for translating a program in a low-level language to machine code. Since there is a spectrum of "levels" of language between high and low, there is a spectrum between assembler and compiler (e.g., a sophisticated macro-assembler is at least comparable with, or even higher than, a compiler for FORTRAN 1 or 2).

An INTERPRETER is a program that reads the source of a program and simulates the effects demanded by that source program: it may first translate the source program into an internal form which may be partly in and/or close to machine code. Between an interpreter and a compiler there is the program that inputs a source program, converts it into machine code and starts the machine code without making the machine code or any intermediate form externally available: this is an IN-CORE COMPILER or LOAD-AND-GO COMPILER. The internally produced translation by an in-core compiler often relies heavily on subroutine calls back into the compiler itself, so the translated code cannot be made externally available for subsequent use (without the required parts of the compiler): because of this, and because the compiled code is in a sense partly interpreted by the subroutines, an in-core compiler is not a true compiler.

(22) e.g. a sort-generator choosing a sorting method considering the bulk of information to be sorted, the nature of the keys (for Quicksort we must be able to estimate medians of keys; for Distributive Partitioning Sort we must have a monotonic map of the keys onto the integers), and the amount and characteristics of main and backing store available. (23) My techniques for secondary code-generation could be argued to be either a parameterisation of one particular method or to be a mild AI deduction: not resolving this distinction does not appear to have led to any confusions.

A PARSER is a program which takes an input notated in a complex syntax and decodes it into a form which outputs the same information as the input but in syntactically simpler forms (e.g. as trees and/or tables). So a PARSER GENERATOR is a program that accepts a specification of a syntax and outputs a program that is a parser for that syntax (24). A PARSER INTERPRETER is a generalised parser that can input a specification of a syntax and uses that to parameterise itself into being a parser for the specified syntax (25).

A CODE GENERATOR is a misnomer: it is not a generator, but a part of a compiler. It begets, originates or causes nothing: it merely duplicates in an amended form. A code generator takes the parsed form of a program and translates it into machine code. See section 2.2.2.3 for the definition of the primary and secondary components of a code generator. A better name might be ENCODER, but "code generator" is so deeply entrenched that it is (reluctantly) used in this thesis.

(24) A) It could at first glance reasonably be argued that a statement of a syntax in (say) BNF is in fact a program in a purpose-oriented programming language, and a parser generator is therefore not a generator but a compiler (because "purpose-oriented" implies a higher level than "general-purpose"). This argument follows from the observation that compiler-translation to a lower level language must make an arbitrary choice between alternative possible translations, at least at the very-detailed level, and therefore is supplying "method" at that low level. This finesses the fact that they are different encodings into programs of one significant method - not into significantly different methods as in the BNF case: the argument must therefore fall. B) By a similar argument BNF is a specification language because it does not imply a method. A parser generator chooses and supplies a method. If an a priori statement is made that the method will be (say) LL(1) then it could be argued we have a parser-compiler instead: and BNF becomes a programming language in which programs that have logical bugs (left recursion, common leading factors) may be written. This seems to me to be a hair we need not split here. (25) Some authors call a Parser Interpreter a META-PARSER.

2.4.2 Summary of Definitions

* An ASSEMBLER performs a minor translation of a program.
* A COMPILER performs a major translation of a program.
* Neither of those significantly changes the method implicit in the program translated.
* A GENERATOR imposes a method on a specification to create a program which solves the problem of the specification.
* Assembler, compiler, generator are ascending zones on a spectrum. (Intermediates on this spectrum are possible: e.g. macro-assembler between assembler and compiler.)
* If the inputs are principally programs (with an implied method) but with minor specificational features then the term "generator" is not admissible. (How many specificational features are required to make "generator" admissible must be a matter of taste.)

2.5 COMPILER-COMPILERS AND COMPILER-GENERATORS

The definition of "COMPILER-COMPILER" is now obvious:- "A program that performs a major translation (but which does not significantly add to the method) to yield a program that performs a major translation (but which does not significantly add to the method)."

Similarly, the definition of "COMPILER-GENERATOR" is:- "A program that performs a major translation (providing a method) to yield a program that performs a major translation (but that does not significantly add to the method)." (26)

(Diagram (27): UNCOL-style comparison of a Compiler-Compiler and a Compiler-generator, each producing a compiler from a high-level to a low-level language.)

Neither of these terms applies strictly to the software being written as the practical part of the research project: my parser-generator could (by the argument of footnote 24B) be more strictly called a parser-compiler: the code to fabricate primary code-generators takes as input what is in many respects a program in a very problem-specific (and technique-specific) language, and so that fabricator is strictly a "compiler": and the code to fabricate secondary code-generators takes as input only descriptions of the machine code and intermediate language plus hints from the implementor and then searches for a method from a spectrum of methods: again a footnote-24B argument applies, but with an opposed weighting.

(26) Similarly, a COMPILER-INTERPRETER would be a generalised compiler that parameterises itself by accepting specifications and/or programs describing how it is to behave as a specific compiler. Many so-called Compiler-compilers are in fact Compiler-interpreters with the semantic phases given as programs to be interpreted. Some authors call compiler-interpreters META-COMPILERS. (27) The UNCOL-diagram notation used here is explained in Appendix U.

Until the issue clarifies in my mind, I am using the term "COMPILER-FACTORY", since that has no connotations which might confuse us. For a verb I use "to FABRICATE".

The following subsections discuss several pragmatic consequences of the above definitions, assuming we are going to construct a monolithic Compiler Factory.

2.5.1 Required Inputs Versus Inbuilt Information

Both a compiler-compiler and a compiler-generator would require for the production of a compiler for a source language and target machine:-

* A specification of the syntax of the source language;
* A program/specification of the semantics of the source language;
* A description of the target machine, and the desired form of the compiler's output.

To date there have been only two classes of method devised for specifying semantics of fragments of programs or of proformae of programs: either to say "this means anything that satisfies those rules" - i.e. an "axiomatic" approach - or to say "this means that" (itself a description of a translation into a standard language) - i.e. a denotational or translational or operational approach.

The axiomatic approach is the subject of many other research projects but, although it is of interest for the present research, has been rejected from this project as not likely to yield sufficiently practical results during the next few years.

Taking the approach "this means that", the "that" may be either:-

(i) in the language the produced compiler is to produce, or
(ii) in some standard language, and the translation from that described either (a) implicitly inside the Compiler Factory; or (b) explicitly as a third required input.

(i) has the disadvantages that every time a compiler is implemented on a new machine only the syntax part of the existing compilers for the same language on other machines can be re-used with little change: the hard part must be largely re-done; also that this in turn will tend to discourage the production of optimising compilers. (iia) prohibits portability except by a family of Compiler-Factories with different "implicit" target machines. (iib) permits portability and opens the doors to the Compiler Factory introducing optimising methods - i.e. to it being a bit more "compiler-generator" than "compiler-compiler" - the catch is that knowledge such as "2+2=4" and "the length of the concatenation of two strings is the sum of the lengths" must come from somewhere (a fourth input? However, this fourth input only needs development once for all languages on all machines and so could be inbuilt into the Compiler Factory software - efficient but inflexible).

(Diagram: the Compiler Factory with its "this means that" semantics input and the fourth, specification-level input.)
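By way of illustration, a fragment of such a fourth input might look like this (sketched in Python; the rule format and names are invented, not the format of any actual system):

    # language- and machine-independent knowledge, written once for all
    # languages and all machines (format invented for illustration)
    UNIVERSAL_RULES = [
        ("x + 0",          "x"),    # anything plus zero gives what you started with
        ("2 + 2",          "4"),
        ("length(a ++ b)", "length(a) + length(b)"),   # ++ = concatenation
    ]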

of two strings is the sum of the CF lengths" must come from somewhere (a fourth input? However, this fourth input V only needs development once for all languages on'all machines and so could be inbuilt into the Compiler

Factory software - efficient but inflexible). The syntax and semantics files need developing once for each language: the machine-orienting file once for each machine. 65 2.5.2 Choice Of Languages For Inputs And Outputs

The language that the produced compiler is in and translates to (possibly the same language) must also be given, either explicitly as further inputs to the Compiler Factory or implicitly in the code of the Compiler Factory itself.

The produced compiler is usually required to produce low-level legible code or absolute or relocatable binary (29). There are viable uses for compilation into another high-level language (28).

There are two obvious candidates for the language the produced compiler is to be in: a fixed high-level language (shown in my diagrams as F) or the low-level target language of the compiler. (The high-level language the compiler is from is not a candidate because it may not be appropriate for compiler-writing and because it would cause bootstrapping problems for each new language, not just for the first implementation of the fixed high-level language.) A fixed high-level language has the disadvantage that a compiler for it to produce the machine code of the target machine may not exist, and so will need to be implemented at least as a cross-compiler on the machine on which the Compiler Factory runs. This is only a mild disadvantage because:-

* The fixed high-level language can be selected as the best-suited of the languages available (diagram on the right); and either
* We have a Compiler-Factory with which to implement the cross-compiler needed for F to M, runnable on the base machine B on which the Compiler-Factory also runs (diagram X below); or
* If we do not have a compiler for the language in which the Compiler-Factory is written on the target machine, then we at worst have to portage the produced compiler from the base machine (diagram Y below).

(Diagram X: a cross-compiler for F-to-M produced and run on the base machine B. Shading denotes programs output by the Compiler-Factory in diagrams X, Y and Z. Key: H = target language; M = target machine's language; F = fixed language of the Compiler Factory; B = base machine's language, i.e. that of the machine on which the Compiler Factory runs.)

(Diagram Y: a compiler for H-to-M produced on machine B and portaged to run on machine M.)

(28) Examples: (i) to take advantage of the other language's previously-written subroutine library, and (ii) compilation into the original language itself, producing either a more legible program or a faster program. These were not objectives of the current work, though they may in some instances be possible as spin-offs. (29) In my diagrams I will draw the fabricated compiler as being from the high-level language H to the machine language M.

Advantages of a fixed high-level language include:-

* the availability of an initial compiler on some machine (to avoid the need either to bootstrap or to write in the language itself and later hand-translate),
* the debugability of the Compiler Factory, and
* the portability of the compilers (e.g. to permit cross-compilation when the memory of the target machine is too small to run the produced compiler) (diagram Z on the left).

(Diagram Z: a cross-compiler for H-to-M run on B.)

If the fixed high-level language is that in which the Compiler Factory itself is written there are the extra advantages that:-

* the portability of the Compiler Factory would follow from the portability of the produced compilers, and
* at least parts of the Compiler Factory can be used to help write the Compiler Factory itself (which in turn simplifies changing the "fixed" language of the Compiler Factory and its compiler - i.e. the fixing of the language is not final and rigid).

Thus we are pushed towards the Compiler Factory and its produced compilers being in the same fixed high-level language, and the target language of the produced compiler being "readable low-level".

(Diagram: the CF as written in the fixed language F, and the CF as run on the base machine B, each shown fabricating a compiler from H to M.)

CHAPTER 3

OVERVIEW OF A COMPILER FACTORY

3.1 INTRODUCTION

In the last chapter the problems of writing compilers were discussed, and an attempt was made to remove some of the confusion surrounding the problem of writing programs that (in some sense) write parts of compilers. As stated at the end of section 2.5, the objective was to deduce some of the properties of a monolithic program for writing compilers: the most obvious properties being the kinds of inputs and outputs of such a system, and the language in which the various parts could be written and their relative merits and demerits.

Sections 2.5.1 and 2.5.2 are therefore concerned with minimal properties for all possible compiler factories: and it appears that these minimal properties are those which are externally visible to the cross-implementor.

However, as soon as one tries to design a specific Compiler Factory (CF), one has to remember the UNCOL problem and that it dictates that the internal organisation of the CF is crucial. Hopefully it is possible to call on all the expertise gained in writing two-part compilers by writing a two-part CF:

* Parser-generator plus Primary-code-generator fabricator;
* Secondary-code-generator fabricator;

producing components which have an intermediate language as interface. Having made this internal decision, the UNCOL

problem dictates that either some language information is used by the secondary fabricator, or some machine information is used by the primary fabricator, or both. Because the proposal is not monolithic, it may be the case that neither component conforms to 2.5, but together they will.

3.2 THE STRATEGY UNDERLYING THIS RESEARCH

In section 2.2.2.4 it was hoped that correspondences could be found from parts of the definition of the language and machine to components of the compiler:-

Definition      Compiler component
Syntax          Parser
Semantics       Primary code-generator
Machine         Secondary code-generator

This presumes the impossible intermediate language (IL): machine-independent and language-independent. Instead this research expects that, for each language to be implemented, an IL will be designed that is language-DEPENDENT but machine-INDEPENDENT. Re-interpreting the above table in terms of such an IL we get:

Definition                                Compiler component
Syntax                                    Parser
Language semantics in terms of the IL    Primary code-generator
IL semantics and machine definition      Secondary code-generator

- that is, the language semantics definition in absolute terms must be split into the definition in terms of the IL and the IL definition in absolute terms. Thus the compilers' structure is a little more defined than in FSL-like systems. It is also expedient in another way: it avoids the problem of how to define the whole language semantics in absolute terms: instead it can be described in terms of the IL sequences to be produced for each of the language constructs. Only the IL and the target machine need definitions that are in any sense "absolute" (how?): and if the IL is regarded as an abstract assembly code for an "abstract machine" underlying the language, then the IL definition is also a machine definition, and the notations for defining the IL and the target machines can hopefully be made very similar. Because of differences in how the two machine definitions are to be used, some differences may be inevitable.

The above approach seems consistent with the experience reported in section 2.3, and hopefully therefore will take us further than the compiler-compilers of the 1960's.

Given section 2.5 and the above, I have attempted to fulfil the objectives of chapter 1 by designing and producing two major tools to assist in the creation of major parts of compilers:-

CF1: a suite of programs that convert syntax-dependent tree-to-IL transformation specifications into code to implement those transformations; and

CF2: a program that takes definitions of an IL and a machine code and interprets these in order to modify itself to be able to translate IL sequences into corresponding part-complete machine-code sequences.

There are major components of primary-code-generation and (to a lesser extent) of syntax-analysis and lexic-analysis for which I have not observed clear stereotypes: so in CF1 I have had to leave "escape hatches" for these requirements to be filled by hand-written code, until sufficiently many compilers have been written using the technology to have suggested some adequate covering stereotype.

Such "escape hatches" were not available in the FSL-like systems: in the pre-FSL systems "escape hatches" were all there was. Both of those approaches underestimated the problem: I have attempted first to clarify the problem, then to identify partial solutions in which I can be reasonably confident, and to extend such solutions to operate together to provide a more complete CF than any prior approach to the problem.

The current CF2 is a secondary-code-generator-interpreter: a later CF2 will output a secondary-code-generator and so be a secondary-code-generator-fabricator.

Another issue which must await experience gained by using the system is the exact nature of the ILs to be used. The machine-code sequences output by a CF2-interpreter (or by a secondary code-generator itself output by a true CF2-fabricator) probably must also be constrained. In particular the implementor may need to be able to prescribe exactly what machine code sequences are to be produced for certain IL instructions: "portable compiler" experience suggests this is so at least for procedure call/entry/exit sequences, and for stack implementation: but for what else? The real machine's input/output will probably need "regularising" for some degree of machine independence by a machine-specific library, and the language may need a language-specific library: both of these will affect some aspects of the design of the IL.

To summarise this introduction, the following diagram attempts to relate the CF1/CF2 approach to the diagrams of section 2.3.1 (page 31) that illustrated the UNCOL problem:

(Diagram: languages L1, L2, L3a, L3b at the top; parsers/primary code-generators produced from descriptions of the syntax and of the semantics-in-terms-of-IL, using CF1; language-dependent ILs, which could be common only to very similar languages; secondary code-generation according to IL and machine code descriptions, using CF2; machine codes at the bottom.)

Given the separation into CF1 and CF2, the discussion of section 2.5 applied primarily to CF1.

The following subsections discuss CF1 and CF2 design strategies in more detail: the following chapters discuss CF1 and CF2 implementation and exemplify initial uses of them.

3.2.1 Design strategy of CF1

The key question here is how to define the translation from language constructs to IL. Before the FSL-like systems it was necessary to program the translation: the FSL-like systems considerably weakened this necessity, but still to a certain extent the FSL user was "programming an FSL machine". Some efforts in the 1970's have attempted to provide languages specific to the kinds of case-analysis code common in hand-written high-quality primary-code-generators [?, ?], but the specifications of the logic are little more readable than those of FSL. To achieve high readability I modified a transformational notation employed in an appendix to [?].

The essential concept of the notation is for each case of each syntactic construct to describe its translation as a sequence of fragments of:-

1) IL; and
2) calls (possibly recursive) on other transforms;

in such a way that ultimately the applications of the other transforms yield IL sub-sequences, so that the whole provides an IL sequence.

For example:-

EVAL(-x) => EVAL(x)

MINUS.

- this transform specifies that monadic minus is equivalent to (and therefore can be translated as) the evaluation of the following subexpression, followed by the application of an IL operator "MINUS".
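The mechanism this implies can be sketched in a few lines of Python (the tree encoding and the PUSHC case are invented for illustration): each transform case emits IL fragments and recurses on sub-constructs until only IL remains.

    def EVAL(node):
        if node[0] == "neg":                  # the case EVAL(-x)
            return EVAL(node[1]) + ["MINUS"]  # => EVAL(x) MINUS.
        if node[0] == "num":                  # an invented base case
            return ["PUSHC " + node[1]]
        raise ValueError("no transform matches this construct")

    print(EVAL(("neg", ("num", "7"))))   # ['PUSHC 7', 'MINUS']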

The 1970 version of the notation was intended for language definition, and reading it calls for some common sense to interpret the fragment of the language being defined, and any fragments in terms of which it is being defined (in the above example, "-x" and "x" respectively).

Two bits of common sense are needed for the above example: to decide that "-" is meant literally, and that "x" is a meta-variable for any sub-expression that may follow monadic minus. Another common sense application is needed when the language fragment includes (possibly mismatched) brackets: to decide that a bracket is the fragment's closing bracket calls for knowledge of the syntax of the language being defined.

The transformation notation therefore needs more detail if it is to be "understood" by CF1:

1) to be able to mechanically distinguish literally-intended lexic constructs (1) of the defined language from the syntactic meta-variables and from the closing bracket of the transformation-language, I place the literally-intended text in inverted commas (2) (3), so the EVAL(-x) example now starts:-

EVAL("-"x)

2) for the language-fragment patterns to be parsable, the syntactic class of which each is a member must be known. I therefore prefix the language pattern with the syntactic-category name and (as a separator) an exclamation mark, so the EVAL(-x) example now starts:-

EVAL(ARITHEXP!"-"x)

if -x is to be parsed as an ARITHEXP. Because this requires that the input to the transform-fabricator must have detailed knowledge of the syntax of the defined language, and because the output of the transform-fabricator depends heavily on the syntax-analysed form of the defined language, the transform-fabricator must match the parser of the defined language very closely. I ensured this by writing my own parser-generator as one of the CF1 suite of programs.

(1) I call a single lexic construct (lexic object, token of the language, etc.) a "lexeme". (2) To avoid many repetitions of the clumsy phrase "literally intended" both in this thesis and in the code of CF1, I have instead adopted the greek prefix "ana-" (meaning "below", as opposed to "meta-", meaning "above"). I extend the use of "meta-" by referring to lexemes of the transformation notation as meta-lexemes (e.g. => and the bracket closing the mixed-level pattern in a transformation). (3) The ana-text may itself include a single or double inverted comma. In this case it must be enclosed in the other kind of inverted comma: for example the ana-text:- He said "Hello" to her. would be input in the form:- 'He said "Hello" to her.' If the ana-text includes both kinds of inverted comma, it must be broken into fragments that do not, and the fragments quoted: He said "Don't you know?". could be input in the form:- 'He said "Don' "'" 't you know?".' This convention seems to avoid clashes with corresponding conventions of defined languages, and horrors such as the last example are rare in programming languages.

3) again for parsability of the language-fragment pattern, each syntactic meta-variable in the pattern must state the syntactic class for which it stands. The EVAL(-x) example now starts:-

EVAL(ARITHEXP!"-"IDENTIFIER)...

if the pattern is to match only arithmetic expressions of the form '"-" followed by an identifier'.

4) if in the pattern preceding the => sign a single syntactic class name must appear more than once, then in patterns following that =>, uses of that name must identify which of the prototypes they match. For example, in the following transform describing how to compile full-evaluation selection in conditional expressions:-

EVAL(CONDEXP! "IF" BOOLEXP "THEN" EXP "ELSE" EXP ) => EVAL(EXP! EXP ) EVAL(EXP! EXP ) SELECTON(BOOLEXP! BOOLEXP ) :

if the fabricated transform-handler were to be applied to IF X=0 THEN 0 ELSE 1/X then it is not clear which EVAL(EXP!EXP) is to be applied to 0, and which to 1/X (but it is quite clear that SELECTON(BOOLEXP!BOOLEXP) is to be applied to X=0). I would have liked to have used subscripts as distinguishing marks (e.g. EXP1 versus EXP2, or EXPthen versus EXPelse, with the distinguishing marks as subscripts), but given the limitations of computer peripherals I chose to separate the syntactic-class name from the "subscript" with a caret sign (also called a circumflex accent) (4).

(4) The following clarification of terms is now possible: a syntactic class-name followed by the caret and subscript (if present) is a meta-lexeme which is a pattern for any instantiation of the syntactic class.

The full IF-example now becomes:

EVAL(CONDEXP! "IF" BOOLEXP "THEN" EXP^1 "ELSE" EXP^2 )
=> EVAL(EXP! EXP^1 ) EVAL(EXP! EXP^2 ) SELECTON(BOOLEXP! BOOLEXP ) :

which states clearly that IF X=0 THEN 0 ELSE 1/X is to be transformed to:-

EVAL(EXP! "0" ) EVAL(EXP! "1/X" ) SELECTON(BOOLEXP! "X=0" )

The BOOLEXP is quite unambiguous without any subscript, so its caret and subscript may be consistently omitted.

The notation was primarily adapted as a powerful tool for writing the controlling code of primary code generators in a very clear notation (5). For example, optimisations can easily be specified as based on "special case" patterns: given the above general transformation for conditional expressions, special cases might include:-

EVAL(CONDEXP! "IF TRUE THEN" EXP^1 "ELSE" EXP^2 ) => EVAL(EXP! EXP^1 ) :

and

EVAL(CONDEXP! "IF" BOOLEXP "THEN" EXP^4 "ELSE" EXP^4 ) => EVAL(EXP! EXP^4 ) :

(In the second of these two examples, the double use of EXP^4 in the pattern to be matched restricts the application of this transform to instantiations of CONDEXP that have identical EXPs after THEN and after ELSE.)
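The matching discipline implied by the doubled EXP^4 - a meta-variable used twice must match identical sub-trees - can be sketched as follows (the tree and pattern encodings are invented for illustration):

    def match(pattern, tree, bindings):
        if isinstance(pattern, str) and pattern.startswith("?"):
            if pattern in bindings:                    # second occurrence:
                return bindings[pattern] == tree       # sub-trees must be identical
            bindings[pattern] = tree                   # first occurrence: bind it
            return True
        if isinstance(pattern, tuple) and isinstance(tree, tuple):
            return (len(pattern) == len(tree) and
                    all(match(p, t, bindings) for p, t in zip(pattern, tree)))
        return pattern == tree                         # literally-intended lexeme

    # IF X=0 THEN 1 ELSE 1 : both arms identical, so the special case applies
    pattern = ("IF", "?bool", "?exp4", "?exp4")
    tree    = ("IF", ("=", "X", "0"), ("num", "1"), ("num", "1"))
    print(match(pattern, tree, {}))    # True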

I have been able to discover no other attempt at such a clear notation for this limited but important task. The price paid for this was 700 hours to write the parser for the patterns, and 600 hours to write the code to analyse the relationships between different transforms of the same name (e.g. EVAL) for the same syntactic class (e.g. CONDEXP), and the relationships between patterns within one single transform (i.e. between the pattern the transform is from, and the pattern(s) the transform is to). These 1300 hours were out of the total 1900 hours for the whole of CF1.

(5) To maximise visual clarity, the user must make judicious use of "white space" when writing the patterns, as in the examples.

3.2.2 Design strategy of CF2

As for CF1, a key question is how to define the required translation. The arguments of section 2.5 are highly relevant to CF1, but not to CF2: this means that the present subsection must develop concepts for CF2 akin to those of section 2.5, given the fundamental policy decision of the CF1/CF2 split as discussed in section 3.1.

If we attempt direct translation from a variety of language-dependent ILs to a variety of machine codes, we encounter only an exaggerated version of the UNCOL problem: the "highest common factor" of the ILs would be lower than some of the higher-level machine codes. To continue the metaphor of the diagrams used in 2.3.1 to illustrate the UNCOL problem:

(Diagram: intermediate languages IL1, IL2, ... above machine codes of various levels, with the highest practical common target of the ILs below some of the machine codes.)

But orthodox ILs are very much like high-level orthodox machine codes: so the definitions of ILs and of machine codes could be very similar. How then would we define either? As for language semantics, we have the choice between the axiomatic approach (this means anything that satisfies that rule) or the approach of describing the meaning in terms of equivalences to sequences in some agreed standard language (this means that). To apply the axiomatic approach to translation from an IL to a machine code would exceed the practical limits of current mechanised theorem-proving methods.

We must then define both machine codes and ILs as translations to standard languages, or ideally to a single MACHINE DESCRIPTION LANGUAGE (MDL). The IL-to-MDL definition and machine-code-to-MDL definitions will be used differently, so some differences in what needs saying in an IL definition and a machine code definition may have to be expected. In the above diagram the "highest practical common target of the ILs" would be a description language for the conceptual IL-machines: to be able to describe orthodox machine codes as well it would need to be even lower:

(Diagram: intermediate languages and machine codes, all defined by translation to the MDL - the Machine Description Language - which lies below both.)

What we now have for a particular translation from a specific IL to a specific machine code is:-

(Diagram: the IL and the machine code M, each with its own translation down to MDL.)

What is then needed is to apply the IL-to-MDL translation and then to apply an inverse of the M-to-MDL translation (6). This could now be represented as:-

(Diagram: the IL translated down to MDL, then back up to M by the inverse translation.)

Initial paper-and-pencil trials also suggested that by analysing commonalities in the definitions of the IL and of the machine code, it should be possible at the time of fabricating a code-generator to produce a list of possible optimal and direct translations from IL sequences to machine code sequences. By transferring the unavoidable heuristic searches from the code-generator (run many times) to the code-generator-fabricator (run once for a single IL/machine combination) it becomes possible to produce optimising code-generators that run fast themselves. This technique for "cutting the corner" is discussed at length in chapter 6, and could be represented as:-

(Diagram: the IL translated directly across to M, cutting the corner through MDL.)

(6) For simple examples in which the MDL is produced as text, the inverse translation is very close to parsing an ambiguous language [GLA77]. Given an MDL sequence, of the possible "parses" into machine code the most interesting is the shortest (i.e. the fastest). Eliminating other parses quickly would appear to be very difficult, especially if any transfers (e.g. ACC->MEM) are needed in the optimal parse. Even worse problems arise if the machine code definition includes opposed pairs of transfer instructions (e.g. ACC->MEM and MEM->ACC) or other longer cycles of transfers: this problem is not mentioned in [GLA77].

A major practical problem with "cutting the corner" is knowing when to stop searching for commonalities: longer and longer sequences give hope of finding more and more remarkable optimisations. The fabricator might never stop: and if it did, it could have produced a vast library of sequences for the secondary code-generator to seek in the input IL, so that the secondary code-generator would take an unacceptable time to translate any IL input as the price for producing remarkably optimised machine code on rare occasions. Chapter 6 proposes one method for avoiding these problems.
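The shape of the technique can be sketched as follows (the library entries, the IL and the machine mnemonics are all invented; chapter 6 gives the real method): the fabricator's expensive search runs once and leaves behind a bounded table, which the fabricated code-generator merely consults.

    MAX_PATTERN_LEN = 3      # the bound that stops the search for commonalities

    # built once per IL/machine pair by the fabricator's heuristic search
    LIBRARY = {
        ("LOAD x", "ADD 1", "STORE x"): ["INCSTORE x"],   # a "remarkable" case
        ("LOAD x",):                    ["LDX 0 x"],      # unit fall-backs
        ("ADD 1",):                     ["ADX 0 1"],
        ("STORE x",):                   ["STO 0 x"],
    }

    def generate(il):                    # the fast secondary code-generator
        out, i = [], 0
        while i < len(il):
            for n in range(MAX_PATTERN_LEN, 0, -1):       # longest match first
                seq = tuple(il[i:i+n])
                if seq in LIBRARY:
                    out += LIBRARY[seq]
                    i += n
                    break
            else:
                raise KeyError("no translation for " + il[i])
        return out

    print(generate(["LOAD x", "ADD 1", "STORE x"]))   # ['INCSTORE x']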

The implementation methods used in CF1 to fabricate a primary-code-generator were complex and many, but were discovered by design and fore-thought. I have not been able to do this for secondary-code-generation, despite a substantial related literature search followed by many hours attempting designs. The initial CF2 is therefore experimental, to determine the methods needed both by the secondary-code-generator-fabricator and by the secondary-code-generator it fabricates. The intention is to let this experience suggest the division of labour between what can be done by a replacement fabricator, and what must be done by the programs it produces. The interim CF2 is therefore not a fabricator, but a secondary-code-generator-interpreter.

Just as I have endeavoured to avoid restricting the choice of ILs until there is experience to guide the choice of restrictions, so I have endeavoured not to restrict the choice of MDLs, except syntactically.

A statement of an effect in MDL is required only to be composed of names (which may be suffixed to indicate identity of multiple occurrences, as in CF1 patterns) and monadic and dyadic operators. One of the inputs required by the current CF2 is a list of the allowed monadic operators and a list of the allowed dyadic operators. The only final choice that I have had to make is that there will be only one transfer (assignment) operator, ->. This is because transfer operators must be handled differently than any other dyadic operators. If more than one transfer operator is required (say for single-length assignment as opposed to double-length assignment) then this must be indicated in the structure of the source and/or destination of the transfer operator: for example, if we choose monadic operators (i) asterisk to indicate a single-length source and (ii) double-asterisk to indicate a double-length source, then:-

*source -> destination     and     **source -> destination

respectively describe single- and double-length transfers. This trick defeats one of the efficiency heuristics of the current experimental CF2 program (but other tricks are described in chapter 6 which preserve efficiency in the current CF2).
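In that spirit, an MDL effect can be pictured as a tree of names and operators with the single transfer operator at its root; a sketch (the constructors and the ADD example are invented for illustration):

    def mon(op, e):          return ("mon", op, e)        # monadic operator
    def dy(op, a, b):        return ("dy", op, a, b)      # dyadic operator
    def transfer(src, dst):  return ("->", src, dst)      # the one transfer op

    # an ADD instruction of some machine:  ACC + store[n] -> ACC
    add_effect = transfer(dy("+", "ACC", "store[n]"), "ACC")

    # single- and double-length assignment, distinguished in the source
    single = transfer(mon("*",  "source"), "destination")
    double = transfer(mon("**", "source"), "destination")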

However, despite the above, in the experimental machine descriptions written so far only a single set of MDL operators has been used, so each description of a machine code can be used with all descriptions of ILs (and vice versa). This has facilitated flexibility in my initial experiments in CF2 use.

CHAPTER 4

FABRICATING A PRIMARY CODE GENERATOR

4.1 INTRODUCTION

The last chapter presented a sketch design for a fabricator of primary code generators: this chapter gives the detailed syntax and interpretation for inputs to the fabricator: the next chapter discusses the main implementation issues.

The input to the fabricator is in the form of transforms from patterns of syntax to a mixture of other transforms and/or IL fragments. To interpret the patterns correctly, the fabricator needs to know the syntax and lexis of the language. Also, the output of the fabricator is heavily dependent on the formats of the syntax trees to be generated. For both these reasons it was expedient to create a parser generator as part of the fabricator. The parser generator was not considered as a key part of the research, but introduces several interesting features. The minimum work was spent on it consistent with the requirements for the transforms and for the produced parsers to be tolerably efficient.

Because of the efficiency requirement, the parser generator was separated into a lexic generator and a syntactic generator. The lexic generator is presented in section 4.2, the syntactic generator in section 4.3. The inputs to these two generators bear marked similarity in their higher-level constructs, but differ in their primitives.

Section 4.4 presents the transform input and transform generator and illustrates its dependence on the syntactic and lexic inputs. Section 4.5 gives a complete but small-scale example: most of the examples in the preceding three sections are extracts from it. Finally, section 4.6 reviews what the present design can and cannot do.

The input to the fabricator is in three divisions: first the lexic division, then the syntactic division, then the transform division. Each division depends heavily on the foregoing division(s): the lexic division also depends lightly on the syntactic division.

The fabricator comprises three generators, one per division. In all three generators of the fabricator, if a single clear stereotype was not available for some kind of compiler-component, then facilities for generating that component have not generally been provided. Instead, in all three generators, facilities have been provided to escape into hand-written SIMULA (the fabricator itself being in SIMULA and generating primary code-generators in SIMULA). When sufficient experience has been gained with such a kind of component that a viable stereotype can be discovered, it can be implemented either as new fabricator features or as extensions to the library of SIMULA source modules that can be incorporated into fabricated code.

In all three divisions a major design requirement was to maximise the legibility of the input. In this respect, experience later suggested minor redesigns which are incorporated in the following description. Legibility is a particularly novel feature for the transform generator: the literature search yielded no prior primary code-generator generator whose input was legible in large quantity.

4.2 The Lexic Division

The syntactic analyser does not concern itself with individual characters, but with lexemes. A lexeme is a combination of characters accepted by a fabricated lexic analyser.

Most of the varieties of lexeme of the language to be implemented by the fabricated code are described to the lexic generator by a set of rules which comprise the first division of the input to the fabricator. The remaining lexemes are extracted from the syntax rules comprising the second division of the input: if a syntax rule includes a sequence of characters enclosed by two double-quote characters then the enclosed sequence is a variety of lexeme in its own right.

The remainder of this section discusses the lexic rules. A lexic rule is of the form:

* an identifier (called the lexic meta-variable of the rule), followed by
* an equals sign, followed by
* a list which defines the effect of the identifier.

An identifier is a sequence of letters and/or digits, of which the first character must be a letter, and may include embedded underscore characters. The letters and digits are significant; underscores (if any) are ignored, but can be used to assist the human eye.

The identifier can be used in the syntax to accept a lexeme described by the list in the lexic rule defining the identifier. Each identifier used must be defined precisely once, in the lexic division or the syntactic division.

The list in a lexic rule is a sequence of atoms: an atom is either primitive or compound. A primitive atom is one of the following:

* one of the identifiers LETTER DIGIT ANY EOL or EOF; or
* a literal character, which is of the form: a single-quote character, any character, and another single-quote character.

The identifier LETTER accepts a letter (upper or lower case). The identifier DIGIT accepts a digit. The identifier ANY accepts any character, but not end-of-line or end-of-file. EOL accepts end-of-line: following accepted characters must be on a different line than preceding accepted characters. EOF accepts end-of-file, but does not consume it (so multiple EOF acceptance is legal). A literal character accepts the character between the two single-quote marks: this character may itself be a single-quote.

As an example of the above, a lexic rule describing the form of literal-characters would be:

LITERAL_CHARACTER = ''' ANY '''

This would accept each of the following:

' '   '('   'g'   'u'   '''

In a list there must be at least one space or end-of-line between two identifier primitive atoms, and there may optionally be spaces or end-of-lines between any other pair of primitive atoms or other signs used in writing the compound atoms.

There are four varieties of compound atom: exclusions, options, choices and repeats.

An exclusion is of the form of a sequence of primitive atoms separated by minus signs. The minus sign is read as "but not". An exclusion accepts any character accepted by the first primitive atom in the list that is not accepted by one of the later primitives in the list. For example:

ANY - LETTER       accepts any character that is not a letter;
DIGIT - '8' - '9'  accepts an octal digit.

An option is of the form of a list enclosed in square brackets. If the first character not accepted by the atom before the option is acceptable to the first item of the list in the square brackets, then the effect is as if the square brackets were not present: otherwise the effect is as if the square brackets and the enclosed list were not present. If the first character is acceptable to the list, then succeeding characters required by the list must be acceptable, or an error is signalled. For example:

EXAMPLE_OPTION = 'A' ['B' 'C'] 'D'

will accept AD or ABCD: but if ABD was presented to the lexic analyser then the D would be signalled as an error ("expecting C"). (Similarly if AE was presented, then the E is not acceptable to B, and would be signalled as an error - "found E when expecting D".)

A choice is of the form of two or more lists separated by exclamation marks, the whole enclosed in round brackets. If a character to be accepted by the choice is acceptable to the first atom of the first list, then the whole has the effect of the first list: otherwise if it is acceptable to the first atom of the second list, then the whole has the effect of the second list: and so on. If the character is not acceptable to the first atom of any of the alternatives, an error is signalled. For example:

EXAMPLE_CHOICE = 'A' ( 'B' 'C' ! LETTER ! DIGIT DIGIT ) 'D'

would accept ABCD (first alternative), AXD (second alternative), A89D (third alternative): but ABD would attempt the first alternative, and give the error message "found D when expecting C". Attempting to accept A* meets none of the alternatives, and gives the error "found * when expecting B or LETTER or DIGIT".

A repeat is of the form:

an open-diamond bracket (<); followed by
a list; followed by
an asterisk (*); followed by
a list; followed by
a close-diamond bracket (>).

It accepts one or more times what would be accepted by the first list, separated by what would be accepted by the second list. For example:

EXAMPLE_REPEAT_1 = 'A' <'B' 'C' * '/'> 'D'

would accept ABCD ABC/BCD ABC/BC/BCD ABC/BC/BC/BCD etc., but not ABCBCD or A/D. Either of the lists may be empty, but not both. For example:

EXAMPLE_REPEAT_2 = 'A' <'B' 'C' *> 'D'

would accept ABCD ABCBCD ABCBCBCD ABCBCBCBCD etc.; and

EXAMPLE_REPEAT_3 = 'A' <* '/'> 'D'

would accept AD A/D A//D A///D etc.
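For readers who think in regular expressions, the three repeats above are equivalent to the following (an equivalence assumed here for illustration, not a mechanism of the generator):

    import re
    repeat_1 = re.compile(r"ABC(/BC)*D")   # 'A' <'B' 'C' * '/'> 'D'
    repeat_2 = re.compile(r"A(BC)+D")      # 'A' <'B' 'C' *> 'D'
    repeat_3 = re.compile(r"A/*D")         # 'A' <* '/'> 'D'

    print(bool(repeat_1.fullmatch("ABC/BC/BCD")))  # True
    print(bool(repeat_1.fullmatch("ABCBCD")))      # False: separator required
    print(bool(repeat_3.fullmatch("A//D")))        # True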

Each form of compound atom may be used freely in lists in options choices and repeats. For example:

NUMBER = ( '8' [ ( < DIGIT * > ! 'R' < DIGIT - '8' - '9' * > ) ]
         ! < DIGIT * > )

accepts either an integer decimal number, or an integer octal number of form 8R followed by one or more octal digits. Also:

IDENTIFIER = LETTER < * (LETTER ! DIGIT ! '_') >

accepts an identifier, except that the underscore is not marked out as being insignificant.

When a character is accepted, it is normally put in a buffer, and so is significant to later processing of the text of the lexeme. If the escape @OFF@ appears before or after an atom that is accepted, from that time on accepted characters are not buffered, and so are not significant. If it appears before the atom, it affects the atom; if it appears after, it does not affect the atom. Buffering is turned on by the escape @ON@: there is an implicit ON-escape after the equals sign in every lexic rule.

The following rule would accept any sequence of characters other than double-quote (including the empty sequence), enclosed in two double-quote marks.

STRING = @OFF@ '"' @ON@ < * ANY - '"' > @OFF@ '"'
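How the buffering escapes behave for this STRING rule can be sketched as follows (again illustrative Python, not the fabricated code; 'inp' with peek() and take() is assumed):

def accept_string(inp):
    """Accept  STRING = @OFF@ '"' @ON@ < * ANY - '"' > @OFF@ '"'  and
    return the buffer: the enclosed text without the quote marks."""
    buf = []
    expect(inp, '"', buf, buffering=False)   # @OFF@ '"': quote not buffered
    while inp.peek() is not None and inp.peek() != '"':
        buf.append(inp.take())               # @ON@ < * ANY - '"' >
    expect(inp, '"', buf, buffering=False)   # @OFF@ '"'
    return ''.join(buf)

def expect(inp, ch, buf, buffering):
    got = inp.take()
    if got != ch:
        raise SyntaxError('found %r when expecting %r' % (got, ch))
    if buffering:                            # only buffered characters are
        buf.append(got)                      # significant to later processing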

There are two other forms of escape allowed in lexic divisions:

@PUT ... @

directly puts the text between @PUT and the following @ (indicated above by an ellipsis) into the fabricated code at the appropriate point.

@GET ... @

expects the text between @GET and the following @ to be the name of a file: the text in the file is inserted at the appropriate point in the fabricated code. These two escapes for inserting handwritten code into the lexic generator have so far proved of most use for debugging complex lexic rules.

Ignoring the varieties of lexeme inferred from the syntactic division, the effect of the lexic rules is as follows. When the syntactic analyser demands a lexeme from the lexic analyser, the next character not so far accepted is considered. If it is a space or end-of-line it is skipped, and the process restarted. The first rule in the lexic division is then tried to see if it will start by accepting the pending character: if it will, the rule is applied, and a lexeme of type identified by the identifier is returned to the syntactic analyser. Failing that, the second rule is tried, and so on, until either a rule is selected by accepting the pending character, or no rule accepts it, in which case an error message is given, the character is skipped, and the lexic analyser re-started.

Thus the order of the rules determines their precedence if the acceptable first characters can overlap; otherwise the order is an efficiency matter - the rules for the most common kinds of lexeme should appear first.
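This demanded-lexeme loop can be sketched as follows (illustrative Python; first_accepts and accept are as in the earlier sketch, the buffer operations on 'inp' and the report_error routine are assumed helpers, and end-of-file is simplified to a built-in EF lexeme):

def next_lexeme(rules, inp):
    """rules: list of (identifier, list_of_atoms) in textual order.
    Returns (lexeme type, buffered text) on each demand."""
    while True:
        ch = inp.peek()
        if ch is None:
            return ('EF', '')                # end-of-file
        if ch in (' ', '\n'):                # inter-lexeme spacing
            inp.take()
            continue
        for name, atoms in rules:            # textual order = precedence
            if first_accepts(atoms[0], ch):
                inp.start_buffer()           # assumed buffer operation
                for atom in atoms:
                    accept(atom, inp)
                return (name, inp.buffer())
        report_error('no rule accepts %r' % (ch,))
        inp.take()                           # skip the character and restart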

For the experiments to date, the inter-lexeme spacing described above has sufficed. Later a facility could be added whereby, if a rule with identifier SPACING is defined, it does not have the normal significance, but instead defines what is to be ignored before significant lexemes. The present convention could then be defined by:

SPACING = < * ( ' ' ! EOL ) >

For most languages, it would also be desirable to regard comments as part of spacing. If the present spacing were extended to allow a comment of form "the character tilde, then any characters up to the end-of-line", the above definition would be changed to:

SPACING = < * ( ' ' ! EOL ! '~' < * ANY - EOL > EOL ) >

However, a number of current languages have comment conventions that cannot be handled by the above facilities. As an example, an Ada comment is recognised by its two leading characters: the first of these alone is a quite distinct type of lexeme. That demands at least two-character testing, not one. A worse example is the ALGOL-60 "fat comma" comment: in actual-parameter-lists, a separating comma can be replaced by a close parenthesis, a sequence of characters not including semicolon or either of the subsequences end or else, and finally a colon and an open parenthesis. The fat comma is recognised by the ":(" at its end - requiring indefinite lookahead. There is currently no way to exclude sequences of characters from a lexeme (only single characters), so the exclusion of end and else cannot be handled (but the exclusion of semicolon can). The whole fat comma is insignificant, but must be replaced by a comma in the buffer - which would require a new escape that appends characters to the buffer which did not originate from the input.

In general, comment conventions are so varied that no stereotype has yet become obvious.

The notation seems to be powerful enough to meet most non-comment

requirements of most current languages, and so is satisfactory

for the present.

Note that it is like a special-purpose programming language in the sense that it is quite easy to program bugs in the lexis notation. Consider the following four attempts at rules for a decimal number that must have both an integer and a fractional part, except that either one of them can be null (i.e. the strings 123 123.456 123. and .456 should all be acceptable):

NUMBER_1 = [ < DIGIT * > '.' ] <*DIGIT>

NUMBER_2 = < DIGIT * > [ '.' <*DIGIT> ]

NUMBER_3 = <*DIGIT> '.' <*DIGIT>

NUMBER_4 = ( < DIGIT * > [ '.' <*DIGIT> ]
           ! '.' < DIGIT * > )

NUMBER_1, accepting 123, would enter the option and so give an error when the '.' is not found. NUMBER_2 will not accept .456. NUMBER_3 will accept a decimal point without any digits. NUMBER_4 is correct, but long: it could be abbreviated as follows.

The first alternative of the choice could be transformed to:

< DIGIT * > [ '.' ] <*DIGIT>

This is of the same effect because the second repeat can only be entered if a point was accepted by the option - if there is no point, the first repeat will accept all digits, leaving none for the second repeat. The second alternative could be transformed to:

'.' DIGIT <*DIGIT>

and then the common trailing loops of the two alternatives can be factorised out of the choice, yielding:

NUMBER_4A = ( < DIGIT * > [ '.' ] ! '.' DIGIT ) <*DIGIT>

This is only marginally shorter than the original, and somewhat harder to understand.

It will be seen that the lexic notation has several of the properties of a programming language. It is reasonably clear to read, until quite complex forms are needed; there is a notion of flow-of-control within rules, and also between them; and bugs are possible. For the present purposes, it suffices.

The following lexic division will be used in syntax and transform examples.

ID = LETTER < * (LETTER ! DIGIT) > :

UNSIGNEDNO = ( < DIGIT * > [ '.' <*DIGIT> ] ! '.' < DIGIT * > ) :

TEXT = @OFF@ '"' @ON@ < * ANY - '"' > @OFF@ '"' :

EF = EOF .

4.3 THE SYNTACTIC DIVISION

In many respects the notation of the syntactic division resembles that of the lexic division: rules, lists, options, choices, and repeats are all present with three slight changes:

* atoms in a list are separated by commas (the syntactic atoms are textually longer than lexic atoms, and experience suggested they need stronger visual separation than lexic atoms);

* in a repeat, the first list may not be null; and

* an option may not be the last atom in a list.

The main differences are that there are no exclusions in the syntax notation, that the primitive atoms are completely different, and that there is no flow of control between rules.

The identifier of a syntactic rule is called the syntactic meta-variable of the rule.

The primitive atoms are meta-variables of lexic or syntactic rules, and reserved lexemes.

An identifier defined by a syntactic rule invokes its rule in place of itself: the resulting sub-parse tree replaces the identifier in building the parse branch-point corresponding to the present rule. An identifier defined by a lexic rule accepts a lexeme defined by that rule, and the resulting buffer is used as a leaf subtree in the parse branch-point of the present rule. A string of characters enclosed in double-quote marks defines a reserved lexeme, and accepts the enclosed characters only. In such a reserved lexeme, two adjacent double-quote marks stand for one enclosed double-quote mark: thus """" stands for a reserved lexeme of one double-quote mark.

Each reserved lexeme must be classified according to the lexic rules. There are two cases:

* If the first character of the reserved lexeme is not acceptable as the first character under any lexic rule, then the generated lexic analyser will examine for the reserved lexeme if on entry it finds its first character pending. If there are two reserved lexemes such that one is a prefix of the other (e.g. if "=" is a prefix of another reserved lexeme) then the longest found is accepted.

* If the first character of the reserved lexeme is acceptable as the first character under a lexic rule, then the lexic rule is interpretively applied by the fabricator to the whole reserved lexeme: there are two subcases.

* SUBCASE 1. The reserved lexeme (or a prefix of it) is acceptable to the rule.

* SUBCASE 2. The reserved lexeme is not acceptable to the rule (including the possibility that the lexeme extended by a non-empty suffix would be acceptable).

In both subcases the fabricated lexic analyser attempts to match the reserved lexeme and the rule with the input. If only one is acceptable, it is accepted. If both the reserved lexeme and the rule would accept the input, in subcase 1 the reserved lexeme is accepted (and the lexeme of the rule rejected), in subcase 2 the rule is accepted (and the reserved lexeme is rejected).

Thus in all cases of ambiguity, the longest possibility is accepted, and in the case of equal lengths the reserved lexeme is accepted. For example, if the only rule starting with LETTER was simply for a sequence of letters, and if the reserved lexemes of the language included 'of', 'oft', and 'office', then the input 'off' would prefer the rule over any of the reserved lexemes (in particular, over 'of'): but input of exactly one of the reserved lexemes followed by a non-letter would prefer the reserved lexeme.
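The summary rule just stated - the longest match wins, with ties going to the reserved lexeme - can be sketched thus (illustrative Python; 'rule_match' is an assumed helper returning how many characters the clashing lexic rule would accept):

def classify(text, reserved, rule_match):
    """text: the remaining input; reserved: the reserved lexemes whose
    first character clashes with the lexic rule."""
    rule_len = rule_match(text)
    word = max((w for w in reserved if text.startswith(w)),
               key=len, default='')
    if word and len(word) >= rule_len:
        return ('reserved', word)        # longest wins; ties favour reserved
    if rule_len:
        return ('rule', text[:rule_len]) # e.g. 'off' prefers the rule to 'of'
    raise SyntaxError('no acceptable lexeme')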

In all the constructs, in "testing" situations of the accepting process, it is the pending lexeme that is tested. If the atom against which it is tested is a reserved lexeme or the identifier of a lexic rule, then it is that against which the lexeme is tested.

If the atom is the identifier of a syntactic rule or is a compound atom, then the set of possible types of lexeme that could start the atom is fabricated: when the fabricated code is run, the type of the pending lexeme is tested for being in that set.

For atoms that specify possible acceptance of an empty string (that is, repeats with a null first list, and options) the testing set must be unified with the testing set of the next atom in the list. If the empty-accepting atom is the last in its list, the testing set cannot easily be formed. In particular, if the atom is last in the rule, or at least in an alternative of a choice ending the rule, the testing set would depend on the rule referring to the current rule. Because the parser-generator was not a major objective, no attempt was made to handle this problem. Instead, options at the end of a list in a testing context were banned, and repeats with a null first list were banned (in all contexts: if the effect is needed, it can be achieved by writing [ < A * > ] in place of < * A >). In practice, few situations have yet arisen in which these prohibitions have been a prima facie problem (e.g. declarative_part in July 80 Ada): but on examination, they have all suggested equivalents that were in any case preferable. (E.g. Ada declarative_part appears in four contexts, in three of which - block, subprogram_body, and task_body - it is followed in the rule by:

begin sequence_of_statements [ exception_handler ] end

and in the fourth - package_body - by:

[ begin sequence_of_statements [ exception_handler ] ] end

It is better replaced by a rule that concatenates its right-hand side with the [ begin ... ] end form above, and by having the transforms check the presence of the begin ... part in the cases of block, subprogram_body, and task_body.)

The role of the repeat loop in the lexis notation was to allow repetition. In the syntax, because the syntactic meta-variables can be used in the right-hand sides of the rules, looping can be achieved by recursive rules:

L_RECURSIVE = [ L_RECURSIVE ], "£"

R_RECURSIVE = "£", [ R_RECURSIVE ]

Each of these examples defines a sequence of '£' signs. However, for the same sequence, they define different parse trees: for three pound signs the trees are respectively a chain descending to the left and a chain descending to the right.

(tree diagrams for L_RECURSIVE and R_RECURSIVE)

The transforms depending on the rule will usually require one of these trees, to the exclusion of the other: thus both tree forms must be specifiable in the syntax. For right-recursive forms such as R_RECURSIVE this is done as above. Unfortunately a left-recursive form such as L_RECURSIVE above introduces a bug into the fabrication process: in forming the test of whether the list in the option is to be used in the accepting process, the fabricator will try to recurse to include the same test. Again, because the parser was not a key part of the research, this was disallowed by giving an error message at the first attempt to recurse forming a test.

Hence the repeat loop is provided for left-recursive specification:

X = < A * B >

is treated as equivalent to the left recursion

X = [ X, B ], A

for the purposes of the transforms, but the fabricator generates quite different code for the parser, in which the test for the repetition depends on B (if it is not null).

Additionally, a repeat is not allowed in a list except as the only atom on a right-hand side. This avoids the nodes of the tree of any parse of it being anonymous: such would be un-namable to the transform generator, and so would be useless.

The trick of using repeats to avoid explicit left-recursions does not help in the case of second- or higher-order left recursions (mutual recursions) such as:

A = ( ... ! B, ... ! ... )

B = [ A, ... ], ...

Such recursions must be removed by manual manipulation of the

syntax: this can become very complex if the same tree structure

is required for purposes of the transforms. So far, this has

not been a serious problem.

A similar inability of the system, which can also easily be avoided by manual manipulation, arises when common left factors occur within a rule: for example, both the following fragments of right-hand sides might have unintended effects for A:

... [ A, B ], A, C, ...

... ( ... ! A, B ! ... ! A, C ! ... ) ...

In both of these, if an A is acceptable, it is the first A that is used for the accepting (in the first case demanding that it be followed by B, A, C; in the second demanding that it be followed by B and not C). In these examples, the syntax should be changed to, respectively:

... A, [ B, A ], C, ...

... ( ... ! A, ( B ! C ) ! ... ) ...

If B and C could start with any shared lexeme types, then in

both of these modified forms B would be preferred to C: if

this is not what is desired then further "left factorisation"

must be carried out.

To summarise all the above, the generated parser is a no-backtrack LR parser, requiring an LL grammar expressed in an extended notation. This is one of the several forms of parser that have been confused together as "LC".

The notation for the syntax rules also allows @PUT and @GET escapes. @PUT escapes are especially useful for invoking a transform of the syntactic class the rule defines, or of one of its components: for clarity, this merits an extension to the notation. Examples that use escapes will carry comments on their effect.

The first rule in a syntax division is the "goal symbol" of the generated parser: that is, the parser expects the input to be acceptable to that rule. The first rule therefore commonly ends with a lexic meta-variable defined as equivalent to EOF.

The following syntax division uses the example lexic division given at the end of the last section. It defines a meta-variable EXPS which is to be matched by an ALGOL-60-like series of arithmetic or boolean expressions, separated by commas.

@PUT ... @  @PUT ... @  @PUT ... @  @PUT ... @  @PUT ... @  @PUT ... @  @PUT ... @

PROGRAM = @PUT INIT_IO; INIT_XC; @ EXPS, ".", @PUT T_OUT; EN_CLOSE; EX_CLOSE; @ EF :
EXPS = < EXP * "," > :
EXP = ( EXI ! "IF", EXP, "THEN", EXI, "ELSE", EXP~1 ) :
EXI = B5, [ "EQV", B5~1 ] :
B5 = B4, [ "IMP", B4~1 ] :
B4 = < B3 * "OR" > :
B3 = < B2 * "AND" > :
B2 = [ "NOT" ], B1 :
B1 = ( "TRUE" ! "FALSE" ! "INPUT", TEXT ! RELATIONorA4 ) :
RELATIONorA4 = A4, [ RELOP, A4~R ] :
RELOP = ( "<" ! ">" ! "=" ! "LE" ! "NE" ! "GE" ) :
A4 = < [ ("+" ! "-") ], A3 * ("+" ! "-") > :
A3 = < A2 * ("*" ! "/") > :
A2 = < A1 * "**" > :
A1 = ( UNSIGNEDNO ! "INPUT", TEXT ! "(", EXP, ")" ) .

Notice the decorations (a tilde sign and a distinguishing character) on meta-variable identifiers repeated in a rule: this is needed by the parser generator, but is not otherwise used. The first seven @PUT escapes incorporate various library routines into the fabricated parser.

The "goal" meta-variable PROGRAM is introduced for three reasons: to ensure that the input is terminated by end-of-file; to include two escapes to initialise and finalise the fabricated parser (rather than to initialise and finalise it once per EXP); and to keep EXPS itself easy to read.

As presented, the syntax does not include any ALGOL-like variables, but does include an expression-evaluation time request for a value to be input in answer to a prompt text.

The above syntax division is the main source of examples for the following section on the transforms.

4.4 THE TRANSFORM DIVISION

"Production systems" are a common technique in Artificial

intelligence for describing manipulations on complex-structured datasets: a set of "productions" is defined, each including a "pattern"; the data is searched for an occurrence of one of the patterns, and then the data around the occurrence is modified in a manner specified by the production including the pattern. After a production has been applied, the search for matches to patterns resumes, the process being repeated until no pattern is applicable (if this ever happens). In production systems there is often a rule for preferring one match over another - usually either there is a priority of the productions, matches of the highest priority production (or productions) being attempted first, then working down through the priority levels - or alternatively the match nearest some point in the data (1) is preferred. Such rules determining the order of application of productions are much weaker than the ordering rules for execution of a conventional programming language: as a consequence, the programmer does not have to consider the control flow to any great degree (if at all). Paradoxically, it seems that such systems have great power, especially if the productions themselves are in the manipulable set of data. A recent such system is OPS5 /FOR81/.
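The control regime of such a production system - productions held in priority order, the search restarted after each application - can be sketched as follows (illustrative Python; the representation of productions as pairs of functions is an assumption of the sketch):

def run_productions(data, productions):
    """productions: list of (find_match, apply_change) pairs held in
    priority order, highest first.  find_match returns a match site or
    None; apply_change rewrites the data around the site."""
    while True:
        for find_match, apply_change in productions:
            site = find_match(data)          # e.g. the match nearest the root
            if site is not None:
                data = apply_change(data, site)
                break                        # restart the search from the top
        else:
            return data                      # no pattern applicable: stop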

For applications such as compilers, a major problem in the phases that operate on the syntax tree is the complexity of the cases being dealt with, and of their inter-relationships. Relatively minor changes in the syntax can imply changes in the tree structure that in turn imply major changes in the case analysis. What is needed is for the compiler-writer to specify the cases that interest him, and to have a program carry out the case analysis for him, and in particular to make the analysis in the light of the syntax. For each case that interests him, the compiler-writer must describe the case, and what to do when it is found - what to generate, and what other classes of case searching are to be applied - very much like a restricted form of production system.

(1) E.g. in linear data, the start: in tree data, the root.

This has been attempted many times before, but usually for the transformation of a source program into a "better" source program ("better" meaning either "more optimal" or "more readable" - objectives which often each oppose the other). /MKH81/ describes a typical such system: to make the transform exemplified by changing:

    if a=b then goto lp;            if not (a=b) then begin
       c := d;               to        c := d;
       e := f;                         e := f;
    lp:                                end;
                                    lp:

they suggest a "pattern" of form:

IF-STATEMENT(BOOL-EX,STATEMENT1) & LAST-STATEMENT(STATEMENT1,JUMP(ID1)) >> UNLABELLED-STATEMENTS(STATEMENT2) >> LABEL(ID2) & IDENT(ID1,ID2)

to be modified to the form:

IF-STATEMENT(NOT-BOOL-EX,STATEMENT2) >> LABEL(ID1) & DEC-LABEL-REF-1(ID1)

The Irvine program transformation system /STA76/, /STA76'/, /KIB77/ requires more information on the contexts in which to search (not having a detailed syntactic knowledge) but represents the patterns and the replacement very legibly so long as knowledge from more global contexts is not required: for example:

Name: DISTIF
Pattern: W(if X then Y else Z) => if X then WY else WZ
Directions: (HERE LOOK-AT(<OP>,W,Y)) (HERE LOOK-AT(<OP>,W,Z)) (UP LOOK-AT(HIOP,HIOPNS))

All prior systems seem either to demand an illegible notation for the pattern and the change to be made, or (in a few cases) to require other unclear notation for other purposes.

The system described here is also transform-based, but is specific to a detailed pre-described syntax, and was designed with maximal legibility as a high priority. To some extent, the legibility requirement has made this part of the project a study of what can be achieved in this context without obscure notation. Notions of preference between cases matching patterns and of control-flow within the non-pattern parts of a case enable code-generators of high efficiency to be fabricated.

Thus, by limiting the application, the advantages of production systems are to some extent obtained without their inefficiency.

4.4.1 The notation of transforms

Each transform has a name, and applies to one syntactic class or lexic class only. For example, a transform named LOAD applying to the syntactic class of meta-variable RELATIONorA4 might be fabricated into a code-generator fragment that generates code that loads the value of an arithmetic expression (of class A4) or of a relational expression.

A syntactic or lexic class may have several transforms with different names applying to it: for example RELATIONorA4 might also have a transform named JUMPIF that fabricates code to jump if the relation is true, or to signal a compile-time error message if an A4 is presented to be code-generated.

A transform-name may be used for several transforms applying to different syntactic or lexic classes: for example LOAD of A4, and LOAD of EXP.

For each transform, there will be a number of cases of instance of its meta-variable. For example, for RELATIONorA4 in the syntax given in the last section, there are the cases:

A4        and        A4 RELOP A4~R

or, if the cases of RELOP are treated as implying cases of RELATIONorA4, there are the cases:

A4
A4 "<" A4~R     A4 ">" A4~R     A4 "=" A4~R
A4 "LE" A4~R    A4 "NE" A4~R    A4 "GE" A4~R

A case of a transform describes the effect of the transform on one particular case of its syntactic or lexic class. It identifies the case of the syntax by a pattern: its effect is a series of actions. The form of a case (of a transform) is: the name of the transform; an opening parenthesis; the meta-variable of the syntactic or lexic class; an exclamation mark; the pattern; a closing parenthesis; the sign =>; and the sequence of actions.

Thus the appearance of a case is as follows:

TNAME(METAVAR ! pattern ) => actions

In addition to the reasons given in 3.2, the meta-variable is needed to avoid ambiguities. (2) Also, as explained in 3.2, in the pattern the text to be taken literally is quoted, whereas names of syntactic classes, standing as identifiers of instances of the class, are not quoted (e.g., in A2"*"A2, "*" stands for the asterisk character itself, whereas A2 stands for any instance of the syntactic class A2: so the pattern is matched by each of the inputs:

1*1
3**2 * 3**2
(1+2) * (1+2)

and so on). The unquoted parts of a pattern can only be meta-variables (either syntactic or lexic): the quoted parts must each represent one or more complete lexemes (not lexic classes) of the defined language. To make the distinction between the two levels of representation, the quoted parts are said to be at ana-level, the unquoted parts at meta-level.

The actions of a transform are a sequence of individual items, each called an action. When a pattern is matched, the corresponding sequence of actions is applied in their order of appearance. There are three kinds of action.

(2) E.g., in the example syntax, A2"*"A2 could be an A3 or an A4.

The first kind of action is quoted text: its effect is that the fabricated code generator will put the text in its intermediate-language output stream.

The second kind of action is an escape, in the form of an "at" sign, a text, and another "at" sign: the enclosed text is fabricated at this point in the code generator. So far, common uses of these escapes have been to output "newline" to the intermediate language stream, to give syntax error messages if the pattern matches an error of context-sensitive syntax, (3) and to invoke handwritten compiler routines for name-identification.

The third kind of action is a transform-call. This has the form:

transform name; open parenthesis; meta-variable; exclamation mark; pattern; close parenthesis.

This is the same form as that appearing on the left hand side of the "=>" of a case of a transform, but has a very different meaning: the specified pattern is to be constructed (rather than detected) by the code generator, and then the specified transform applied to it. Meta-variables in the constructed pattern must appear with the same decoration in the matched pattern: in the constructed tree the meta-variable is replaced by whatever it matched.

Consider the following case of a transform, based on the example syntax, and oriented to an IL that has facilities only for testing relations against zero:

(3) I.e. an error not detectable by the fabricated context-free parser.

TEST(RELATIONorA4 ! A4 RELOP A3 ) =>
     TEST(RELATIONorA4 ! A4 "-" A3 RELOP "0" )

This would cause the fabrication of a code-generator fragment that would generate code to test a relation of which the second operand is not a sum. If it found a match with a source sequence

INPUT "LEVEL" GE 5

the code-generator would create the following parse-like tree:

              (RELATIONorA4)
              /     |      \
           (A4)   RELOP    "0"
          /  |  \    |
        A4  "-"  A3  GE
         |        |
   INPUT "LEVEL"  5

(where the unbracketed A4, A3 and RELOP represent the same subtrees matched by those names in the pattern), and then would apply the transform TEST(RELATIONorA4 ! ... ) to the tree. Thus if all the possible cases of the same transform in which the relation is "to zero" are covered, then they will in turn be invoked. For example, if there is also the case:

TEST(RELATIONorA4 ! A4 "GE" "0" ) =>
     LOAD(A4 ! A4) "IFGEZERO"

then the above example will have the effect LOAD(A4 ! "INPUT""LEVEL""-5") (4)

on the output-stream, then will output the text

IFGEZERO

Note the form LOAD(A4 ! A4) in the last example: the first A4 is part of the identifier of the transform: the second A4 specifies a tree to be created. In this instance the tree-creation is very

(4) Note the doubling of the embedded quote-marks.

simple: the instance matching the A4 in the pattern is to be used. Forms like this are so common (especially on the right hand side of =>) that an abbreviation has been provided: the exclamation mark and the prior meta-variable may be omitted. The above example would then be written as LOAD(A4). Similarly, LOAD(A4!A4~X), using the decorated form in the pattern, would abbreviate to LOAD(A4~X).
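The behaviour of a transform-call on the right of => can be sketched as follows (illustrative Python, not the fabricated SIMULA; the tree and pattern encodings, and the 'dispatch' helper that selects the applicable case of the named transform, are assumptions of the sketch):

def transform_call(name, klass, created, bindings, dispatch):
    """Build the created pattern, substituting the subtrees that the
    decorated meta-variables matched, then apply the named transform."""
    return dispatch(name, klass, build(created, bindings))

def build(node, bindings):
    kind = node[0]
    if kind == 'ana':                  # quoted text: lexemes to construct
        return ('lexeme', node[1])
    if kind == 'meta':                 # a meta-variable of the matched pattern:
        return bindings[node[1]]       # reuse whatever it matched, unchanged
    # otherwise an inner branch-point of the created pattern
    klass, parts = node[1], node[2]
    return (klass, [build(p, bindings) for p in parts])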

4.4.2 Overlapping cases

If the TEST(RELATIONorA4 ! ... ) transform had cases to cover instances of each relation-to-zero, there remain two more cases of the RELATIONorA4 syntax that are not covered: they are the forms

A4   (i.e., not a relation)   and
A4 RELOP A4-which-is-not-an-A3

The second of these could not be covered by changing the A4 RELOP A3 case to have matching pattern A4~X RELOP A4~Y and created pattern

A4~X "-" A4~Y RELOP "0"

because the syntax demands that after a dyadic minus there be an instance of A3, which an A4 is not. The A4~Y would need to be made into an instance of A3 by enclosing it in brackets, thus:

TEST(RELATIONorA4 ! A4~X RELOP A4~Y ) =>
     TEST(RELATIONorA4 ! A4~X "-" "(" A4~Y ")" RELOP "0") (5)

In section 4.4.7 we will see that this latter form will create a very "leggy" tree, needing a relatively large amount of storage (though only transiently), and taking a long time to create and process. If these considerations are important, then

(5) Note the difficulties in laying out the "created pattern" legibly.

it might be desired to have the transforms for both the pattern A4 RELOP A3 and A4~X RELOP A4~Y. In practice it seems often convenient to have overlapping patterns of cases of the same transform: the rule used to select between them that has been implemented is that the most specific pattern matched selects the case to be applied. Thus, in this example, the two patterns respectively apply to syntactic instances of

A4 RELOP A3   (efficiently for most cases)   or
A4 RELOP A4-which-is-not-an-A3

- which is exactly what we wanted.

This notion of selecting the more specific of overlapping patterns is more important in fabricating code generators that produce optimised code, rather than (as above) fabricating a more efficient code generator that produces the same code, or in detecting cases of syntax used in illegal contexts. The following two paragraphs provide respective examples of these.

Consider a transform ARITH(A4 ! ... ) that fabricates generators that match instances of A4 in arithmetic contexts. The A4 syntax is almost covered by the following five non-overlapping cases (the $$ signs are explained in section 4.5):

ARITH(A4!A3)      => ARITH(A3) :
ARITH(A4!"+"A3)   => ARITH(A3) :
ARITH(A4!"-"A3)   => ARITH(A3) ".MINUS $$" :
ARITH(A4!A4"+"A3) => ARITH(A4) ARITH(A3) ".ADD $$" :
ARITH(A4!A4"-"A3) => ARITH(A4) ARITH(A3) ".SUBT $$" :

However, there are a number of optimisations that might be provided that detect particular cases of the A4 syntax, such as the following two cases.

ARITH(A4 ! A4 "+0" )           => ARITH(A4)
ARITH(A4 ! "-" A3~X "+" A3~Y ) => ARITH(A4 ! A3~Y "-" A3~X )

As in the first of these cases, many constructs of many languages can be simplified where constants and other simple forms are present. Cases such as the second are very valuable if the language permits them: if either of the A3's had a side-effect on the value of the other, the transform would yield an expression with a different value. This case would be correct if there were a language rule stating that the order of evaluation of subexpressions is undefined.

There are also cases of the A4 syntax which are not covered by the five patterns of ARITH(A4 ! ... ) given above. In ALGOL-60 (for example) it is forbidden to have two operators adjacent, so that constructs of form A4"+-"A3 are not allowed. With the 'repeat' construct of the syntax notation, it is very difficult to disallow that, but allow "-"A3"+"A3: the easy way out is to allow the illegal syntax at parsing time, but detect it via a checking transform such as the following, which catches all cases of A4 not matched by more specific cases:

ARITH(A4 ! A4 ) => @S_ERR("Illegal form of A4")@

The only difficulties encountered so far with such checking transforms are that the error messages tend to be very bland, cannot be produced at the point of the error, (6) and require escapes to call the error-routine. All these disadvantages could be removed, at some bother, for fabricating production compilers: for example a case such as:

ARITH(A4 ! A4 "+-" A3 ) => ...

could produce a very explicit message. For the present, however, it suffices that context-sensitive checks are possible, albeit untidily.

(6) If the checking transform is invoked at the end of the A4 syntax via a handwritten escape, the error message can be produced at the end of the innermost enclosing A4 - still rather late.

4.4.3 The "more specific than" relation

The notion of "more specific than" is only a partial order relation among patterns of a meta-syntactic class. It is possible to devise patterns for which neither is more specific than the other. As an example, consider the syntax rules

X = A, B, C, D
B = B1, [ B2 ], B3
C = C1, [ C2 ], C3

and a transform of X which includes the patterns

A B C D           --(1)
A B1 B3 C D       --(2)
A B C1 C3 D       --(3)

Certainly, (2) and (3) are both more specific than (1), but if an instance of the pattern A B1 B3 C1 C3 D were to be matched, it matches both (2) and (3). In such a case the present system makes a random choice, determined at fabrication time. The important requirement is that if a fourth case with pattern A B1 B3 C1 C3 D were provided to the fabricator, the consequent code generator must prefer that pattern (when it is matched) to both (2) and (3).
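Selection among overlapping cases can be sketched as follows (illustrative Python; 'matches' and 'more_specific' are assumed predicates, and at least one case is assumed to match). A single scan suffices: by transitivity of "more specific than", no matching case can be strictly more specific than the survivor, so the survivor is maximal - though with incomparable patterns such as (2) and (3) above, which maximal case survives depends only on the order of the cases, the fabrication-time arbitrariness noted above.

def select_case(cases, tree, matches, more_specific):
    """Return a maximal (most specific) matching case of one transform."""
    live = [c for c in cases if matches(c.pattern, tree)]
    best = live[0]
    for c in live[1:]:
        if more_specific(c.pattern, best.pattern):
            best = c                   # strictly more specific: prefer it
    return best                        # incomparable survivors: order decides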

The next section introduces many more opportunities for partial ordering of cases, rather than total ordering. They can all be resolved in the fabricator in ways that are straightforward, though very complex and tedious to program together in the fabricator.

4.4.4 Commonalities in matching patterns.

Many optimisation cases can be detected if there is a means in the matching-pattern of a transform for requiring that two (or more) sub-patterns be identical. In the present system, this is expressed by including a meta-variable twice (or more times) with the same decoration: thus

ARITH(A4 ! A3 "-" A3 ) => ARITH(A1 ! "0" )

for the case of the subtraction of an expression from itself. This applies even if the repeated meta-variable is lexic, although this in general seems less useful.
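Requiring identity of repeated decorations is easily expressed in a matcher sketch (illustrative Python; the pattern and tree encodings follow the earlier sketches, and class-membership checks for meta-variables are omitted for brevity):

def match(pattern, tree, bindings):
    """Match a pattern against a (sub)tree, recording in 'bindings' what
    each decorated meta-variable matches; a repeated decoration matches
    only a subtree identical to its first occurrence."""
    kind = pattern[0]
    if kind == 'ana':                          # quoted lexeme text
        return tree == ('lexeme', pattern[1])
    if kind == 'meta':
        name = pattern[1]                      # e.g. 'A3' (null decoration)
        if name in bindings:                   # second occurrence: demand
            return bindings[name] == tree      # identity of the subtrees
        bindings[name] = tree
        return True
    klass, parts = pattern[1], pattern[2]      # a branch-point of the pattern
    return (tree[0] == klass and len(tree[1]) == len(parts) and
            all(match(p, t, bindings) for p, t in zip(parts, tree[1])))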

There are cases when such optimisations can go wrong. The following optimisation would prevent "division by zero" errors if the instance of the A2 happened to have zero value at some execution time:

ARITH(A3 ! A2 "/" A2 ) => ARITH(A1 ! "1" )

In most languages, the removal of errors (if considered at all) is acceptable. The introduction of errors is not: for example, in the TEST example of section 4.4.1 that converts A4 RELOP A3 into A4 "-" A3 RELOP "0", the subtraction could cause overflow (if the run-time instance of A4 were very positive and the instance of A3 very negative) whereas the original would not. That optimisations can be described nicely is no release from the separate problem that optimisations must always be used with care.

4.4.5 The influence of syntax on the allowable patterns.

Clearly a pattern must conform to the syntax of the syntactic class of its transform, no matter whether the pattern be created or matching. The influence of the syntax on the allowable patterns, and (contrariwise) of the desirable patterns on the syntax, is therefore very important. The first design of the system excluded left-recursion totally: but even before that version of the software was working, experiments in writing transforms showed that limited left recursion was essential. This allows patterns such as:

A4 ! A4 "+" A3 "-" A3

but there is no clear way of directly handling:

A4 ! A3 "+" A4 "-" A3

because the A4 would be a syntax error at the time when the fabricator parses the pattern: nor is there any obvious rewriting of the syntax to allow such a form. Thus it would appear that the best way to handle this pattern would be to extend the system: but there is not yet any experience to guide the design of this extension.

However, some requirements akin to the above can almost be met. For example, for one transform it would have been convenient to have had:

A4 ! A4 "+-" A3

be a more specific form of A4 ! A4 "+" A4, but neither more nor less specific than A4 ! A4 "+" A3. One possibility would have been to change the syntax to:

A4 = < A3X * ("+" ! "-") >
A3X = [ ("+" ! "-") ], A3

in which case A4 ! A4 "+-" A3 would have been more specific than A4 ! A4 "+" A3X, but incomparable to A4 ! A4 "+" A3. In the event, there is a penalty on too many too-small syntax productions (see 4.4.7), so this change was not made.

Recurrent problems such as the above stressed the point that the syntax is not merely a test of legality, of what strings of characters are acceptable: it is intimately related with the desired patterns in the transforms. If we want a syntactic class S of the form of a list of As separated by Bs, there cannot be both a pattern S ! S B A and a pattern S ! A B S. This kind of conflict was expected, but proved to be much more common than anticipated.

4.4.6 Calling the transforms.

At present, the transforms are invoked initially by calls from the syntax rules, given as escapes into SIMULA. It would be desirable to give them as new atoms in the syntax (but which have no direct effect on the parsing): say, in a syntax rule, a form akin to a transform call on the right hand side of a => in the transform division, in which all meta-variables in the created pattern would already have been defined in the rule, or would be the meta-variable defined by the rule. In most cases the created pattern would be a simple meta-variable: for example, in the syntax given at the end of 4.3, it would be better to have:

PROGRAM = EXPS, ".", OUT(PROGRAM), EF

rather than:

PROGRAM = EXPS, ".", @PUT T_OUT; @ EF

The syntax analysis process is invoked for the first-given

syntax rule in the syntax division. This in turn invokes

the analyses for other syntax classes. There is no such clear

stereotype for the calling pattern - or patterns - of the

transforms: sometimes the requirement is to complete the parse,

then call one single master transform that in turn calls all

the other transforms; sometimes it is to have many rules call

transforms of about equal complexity; either way there are

likely to be needs for additional checking transforms.

To summarise the above: the organisation of the calls from the

syntax determines whether the fabricated primary code generator

is in the same pass as the syntax, is one or more separate

passes, or is a hybrid. It is not desirable to restrict this

to a single stereotype. The same is true of calls between

transforms, so long as recursive and mutually-recursive transforms

are guaranteed to terminate: this last is discussed in section

4.4.10.

4.4.7 Editing the tree.

This and the next two sections are concerned with facilities that initial experience suggests are perhaps sufficiently useful to merit extensions to the current software.

The present software fabricates a syntax analyser that parses to build a primary syntax tree for a given compilation: the fabrications of the transforms may create secondary trees which include pointers into the primary tree. This suffices for most purposes that demand trees not present as subtrees of the primary tree,

but the following paragraphs discuss other purposes for which

it is desirable to edit the primary tree.

In a multi-pass compiler, it may be desirable to have early passes canonicalise the tree in ways that reduce the number

of cases to be dealt with by later passes. A major instance

of this is that the early pass may discover an error that

implies that later passes ought not be tried: the erroneous

subtree ought to be edited into a single leaf that the later passes recognise as meaning "error" (that is, they have simple

"error" cases).

When there is a sequence of productions of forms such as:

Xi = ( Xj ! ... )
Xj = < Xk * ... >
Xk = [ ... ], Xl

in which one case of a metasyntactic class is just the next metasyntactic class in the sequence, the trees can become very "leggy": that is, there would be many nodes containing no information other than a pointer to the next node, as illustrated by the chain

Xi
 |
Xj
 |
Xk

This leggyness cannot in general be avoided by fabricating a syntax analyser that does not generate leggy nodes, because it would have the consequence that transform-cases with matching-patterns such as Xj!Xk would never be invoked. The leggy nodes, if they are not needed in the above sense, are a waste of space and of the time needed to process them. At present, the fabrication of the syntax analyser does not depend in any way on the transforms: to inspect the transforms to determine when one of them needs a particular leggyness would therefore be a major complication to the software. It would be much simpler to demand that the transform-writer programs edits that remove inessential leggyness - say, as transforms called at the end of the related syntax rule. He could also program the removal of leggynesses needed in early passes, but inessential in later passes - such cases could only be found by the fabricator after very complex analyses.

Finally, some optimisations might best be specified as tree-edits, so that the edited tree causes other transforms to produce the improved output.

It would be quite easy to introduce an editing facility into the present software. The difficult part would be the fabricating for the creator of replacement trees: but this code already exists to fabricate the creator of the secondary trees.

In using such editing there would be a major methodological danger, akin to the problem of aliasing between parameters of subroutine calls. If an edit causes a subtree to be apparently duplicated (by creating multiple pointers to it) then an edit to one of the apparent duplications will cause the same edit to all other duplications. For example if

A4 ! A1~X "**2" "-" A1~Y "**2"

is edited by the "difference of squares" rule into

A4 ! "(" A1~X "+" A1~Y ") * (" A1~X "-" A1~Y ")"

then a subsequent edit to the sum could affect the difference, and vice versa. It would be a matter of experience to determine if this is a real danger, and if so what would be the best way to avoid it.

4.4.8 Cases not expressible as simple patterns.

In section 4.4.5 it was observed that it is not possible to use a transform with a matching pattern

A4 ! A3 "+" A4 "-" A3

that either edits or creates A4!A4, because the pattern would cause a syntactic error. Thus it is difficult to "cancel out" the two A3s as an optimisation. Also, if such a pattern were allowed, matching it can be very expensive: for example in an A4 comprising 10 instances of A3 (and separating plusses and minuses) it would have to be tried 45 different ways (= 9+8+7+6+5+4+3+2+1). Orthodox compilers often avoid this quadratic explosion of possible matches by revising the expression into some canonical order: say, constants first, then terms containing variables in order of alphabetically-first variable-name, thus "collecting similar terms". A transform which exchanges two adjacent instances of A3 if they are not "in alphabetic order" would need the pattern matched and a second condition, the second condition not being expressible as a pattern to match. The "second condition" could be any one of such a wide variety of tests that in general there would have to be the possibility that it is an escape. Often it would address issues that can only be answered by looking at the table(s) of declared names in an ALGOL-like language, and in the present system all other issues of such tables must be handled by escapes.

The major problems with such escapes are that it is not clear how to pass the meta-variables of a matching pattern to an escape, and that if changes are made to the syntax rules which govern the parse trees that the pattern matches, then it is quite likely that the escape must be re-written (the system cannot just make the appropriate change, as it would in the matching pattern and any created patterns).

This situation seems unsatisfactory, but so far no unavoidable requirements have been encountered for such facilities.

4.4.9 Syntactic and lexic rules applicable only to patterns.

Many minor problems have arisen so far for which the simplest solutions would have been to allow some construct in patterns that would be illegal as input to the fabricated syntax analyser.

A simple case of this is when a checking transform detects a context-sensitive error, and wishes to create (or edit) a tree consisting of just a leaf labelled "ERROR" - as in 4.4.7. The change would be relatively simple: some new mark should be allowed at the start of the list given in a lexic or syntactic option, or of an alternative of a lexic or syntactic choice, that indicates to the lexic or syntactic fabricator not to fabricate code to test for the (start of the) list or to accept the list. In addition, if a reserved word is used only in such lists, it should not be a reserved word of the fabricated lexic analyser (but it should be interpreted as reserved by the transform-fabricator's analyser of the lexic structure of patterns). Finally, if a

lexic or syntactic metavariable is used in only such lists, then the fabricated analyser need contain no code for analysis of that metalexic or metasyntactic class, and for a metasyntactic class all lists in its right hand side are transform-pattern-only and so may indicate further transform-pattern-only reserved words and metavariables.

Detecting the transform-pattern-only reserved words and lexic metavariables at fabrication time is easy: but because of the possibility of circular dependencies between syntax rules, it is not in general easy to detect transform-pattern-only syntactic metavariables. Therefore it would be desirable to require the transform-writer to mark transform-pattern-only syntax rules

(and, if need be, to tell him he shouldn't have done so).

Apart from suppressing the tests and analysis code in the fabricated analysers, there should be no other effect on the fabrication. In particular, parse-tree nodes must be just as they would be if the tests and analysis code were present - otherwise the corresponding transform would not be supported.

So far, studies which considered simple uses of this facility suggest it to be very useful - it can be used as a nice technique for editing out leggyness, or for editing expression-trees to mark every node as either boolean or arithmetic, leading to subsequent improved efficiency. (7)

(7) In Algol-like expressions, if an open bracket is met in a context where a boolean expression is expected, it could be the start of a form such as (arithmetic expression) = ..., or of a form such as (arithmetic expression = arithmetic expression), with the result that the parser cannot know whether the enclosed expression is boolean or arithmetic, and so must parse according to the worst case of syntactic form

WORST_CASE = ARITHEXP, [ RELATIONAL_OPERATOR, ARITHEXP ]

with some subsequent technique of patching up the ambiguity and doing the appropriate checks.

On the other hand, studies of more complex uses break down in obscure ways that suggest conflicts with other parts of the current design of the software, e.g. suggesting requirements for mutual left-recursion of syntactic categories.

4.4.10 Theories of transforms.

The concept of programming by transforms, with very weak concepts of flow-of-control (apart from recursion), is relatively new. As a consequence, the theory of recursive programs needs extending to this new style. Several authors have made first steps in this direction, notably Arsac /ARS79/, but none of them are properly applicable to the present software.

The most important property that a system of transforms ought to have is termination. In the absence of looping constructs in the right-hand sides of the transforms, a transform set can fail to terminate only by infinitely deep recursion: for example, given

XS = < "X" * >

the case

ETERNAL(XS) => ETERNAL(XS ! XS "X")

would never cease recursing if infinite memory were available. Just as the termination problems for ordinary programs and for general recursive functions are undecidable, so is the termination problem in general for transform sets. However, transform sets applied to syntax trees often have a property that guarantees termination. If every created pattern in every case of every

transform of a set is in some sense "smaller" than the matching pattern of the case, then at successively nested levels of recursive calling of transforms, the trees passed will be smaller and smaller. If the measure of size of a tree is an integer - say, the number of nodes in the tree - and cannot be negative, then the sizes at nested calls will be a sequence of decreasing positive integers. Thus, if the size of a tree at a call is n, then in turn it can invoke not more than n recursive levels of nested call: thus it must terminate within a bounded depth of nested calls.

In most of the sets of transforms written so far, most of the cases have this "decreasing monotonic" property (size being measured in tree nodes). Most of the remaining cases create patterns the same size as the matching pattern, and a very few cases create larger patterns than the matching pattern. For the example syntax, the following three cases of transforms respectively illustrate the three possibilities.

ARITH(A3!A3"*"A2) => ARITH(A3) ARITH(A2) ".MULT"
ARITH(A4!A3) => ARITH(A3)
ARITH(A4!A3"+"A3"*"A2) => ARITH(A3 ! A3 "*(" A2 "+1)" )

If it can be shown that "same-size" cases only occur in call cycles that include at least one "decreasing" case, then the termination is still guaranteed. If "increasing" cases are inspected for all cycles in which they can occur, and each cycle shown to be decreasing overall, then termination is again still guaranteed.

Extensions of the normal methods for determining cycles in a graph can thus be used to study the net worst-case increases or decreases in all possible cycles of calling: if all cycles are guaranteed to be decreasing then the set of transforms terminates. For a pre-chosen and programmable measure of "size" of patterns, it would therefore be possible to write a software tool that shows that many sets of transforms terminate. Not so easy would be a tool that uses syntactic information to classify cases within a transform, for transform sets that would be undecidable with the simpler tool.
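Such a tool might be sketched as follows (illustrative Python only; the edge representation is an assumption of the sketch): each case contributes edges from its transform to the transforms it calls, weighted by the worst-case growth in tree size, and a depth-first search looks for any simple call cycle whose net change is not a decrease. Any non-decreasing cycle contains a non-decreasing simple cycle, so simple cycles suffice; the search is exponential in the worst case, which is tolerable for call graphs of this size.

def terminates(calls):
    """calls: transform -> list of (callee, delta), delta being the
    worst-case growth in tree size (created minus matched, in nodes)
    over the cases of that transform which invoke the callee.
    True if every call cycle strictly decreases the size."""
    def non_decreasing_cycle_from(start):
        stack = [(start, 0, frozenset([start]))]
        while stack:
            node, total, seen = stack.pop()
            for callee, delta in calls.get(node, []):
                t = total + delta
                if callee == start and t >= 0:
                    return True              # a cycle that fails to shrink
                if callee not in seen:
                    stack.append((callee, t, seen | {callee}))
        return False
    return not any(non_decreasing_cycle_from(t) for t in calls)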

4.5 EXAMPLE SETS OF TRANSFORMS

The following set of transforms take a PROGRAM tree, analysed

by the syntax rules given at the end of 4.3, and produce code

for a stack machine to evaluate the expressions of the program

and print their values.

OUT(PROGRAM) => "SEARCH TEVAL $$" ".MAKESTAK $$" "GO::.INITSTAK $$"
    OUT(EXPS) ".STOP $$" "END GO $$" :
OUT(EXPS!EXPS","EXP) => OUT(EXPS) OUT(EXP) ".PRINT $$" :
OUT(EXPS!EXP) => OUT(EXP) ".PRINT $$" :

OUT(EXP!"IF"EXP"THEN"EXI"ELSE"EXP~1) =>
    "IF " OUT(EXP) "THEN " OUT(EXI) "ELSE " OUT(EXP~1) :
ARITH(EXP!"IF"EXP"THEN"EXI"ELSE"EXP~1) =>
    "IF " BOOL(EXP) "THEN " ARITH(EXI) "ELSE " ARITH(EXP~1) :
BOOL(EXP!"IF"EXP"THEN"EXI"ELSE"EXP~1) =>
    "IF " BOOL(EXP) "THEN " BOOL(EXI) "ELSE " BOOL(EXP~1) :
OUT(EXP!EXI) => OUT(EXI) :
ARITH(EXP!EXI) => ARITH(EXI) :
BOOL(EXP!EXI) => BOOL(EXI) :

OUT(EXI!A4) => ARITH(A4) :
OUT(EXI) => BOOL(EXI) :
ARITH(EXI) => @OUTTEXTIMAGE(BOLD("EXI which is not an A4 in ARITH context"))@ :
ARITH(EXI!A4) => ARITH(A4) :

BOOL(EXI!B5) => BOOL(B5) :
BOOL(EXI!B5"EQV"B5~1) => BOOL(B5) BOOL(B5~1) ".EQV $$" :
BOOL(B5!B4) => BOOL(B4) :
BOOL(B5!B4"IMP"B4~1) => BOOL(B4) BOOL(B4~1) ".IMP $$" :
BOOL(B4!B3) => BOOL(B3) :
BOOL(B4!B4"OR"B3) => BOOL(B4) BOOL(B3) ".OR $$" :
BOOL(B3!B2) => BOOL(B2) :
BOOL(B3!B3"AND"B2) => BOOL(B3) BOOL(B2) ".AND $$" :
BOOL(B2!B1) => BOOL(B1) :
BOOL(B2!"NOT"B1) => BOOL(B1) ".NOT $$" :
BOOL(B1!"TRUE") => ".TRUE $$" :
BOOL(B1!"FALSE") => ".FALSE $$" :
BOOL(B1!"INPUT"TEXT) => ".INBOOL " ZXB(TEXT) " $$" :
BOOL(B1!RELATIONorA4) => BOOL(RELATIONorA4) :

BOOL(RELATIONorA4!A4~1 RELOP A4~2) =>
    ARITH(A4~1) ARITH(A4~2) "." R(RELOP) " $$" :
R(RELOP!"<") => "LT" :   R(RELOP!">") => "GT" :   R(RELOP!"=") => "EQ" :
R(RELOP!"LE") => "LE" :  R(RELOP!"NE") => "NE" :  R(RELOP!"GE") => "GE" :

ARITH(A4!A3) => ARITH(A3) :
ARITH(A4!"+"A3) => ARITH(A3) :
ARITH(A4!"-"A3) => ARITH(A3) ".MINUS $$" :
ARITH(A4!A4"+"A3) => ARITH(A4) ARITH(A3) ".ADD $$" :
ARITH(A4!A4"-"A3) => ARITH(A4) ARITH(A3) ".SUBT $$" :
ARITH(A3!A2) => ARITH(A2) :
ARITH(A3!A3"*"A2) => ARITH(A3) ARITH(A2) ".MULT $$" :
ARITH(A3!A3"/"A2) => ARITH(A3) ARITH(A2) ".DIV $$" :
ARITH(A2!A1) => ARITH(A1) :
ARITH(A2!A2"**"A1) => ARITH(A2) ARITH(A1) ".POWER $$" :
ARITH(A1!UNSIGNEDNO) => ".EVAL " ZXB(UNSIGNEDNO) " $$" :
ARITH(A1!"INPUT"TEXT) => ".INARITH " ZXB(TEXT) " $$" :
ARITH(A1!"("EXP")") => ARITH(EXP) .

The master call from the syntax is equivalent to OUT(PROGRAM). The transforms OUT( ... ! ... ) are used until it is known that the expression or subexpression is arithmetic or boolean, after which time ARITH or BOOL is used as appropriate. Note the error message output if ARITH of an EXI finds the EXI is not an A4. A similar error message should have been provided for BOOL(RELATIONorA4!A4): but seeing that not all cases of BOOL(RELATIONorA4 ! ... ) have been given, the fabricator generates code that gives error messages for remaining cases. The problem with these fabricator-provided messages is that it is very hard to engineer them to be at once informative, legible, and not excessively long: that seems to require human intuition.

The transforms ZXB(UNSIGNEDNO) and ZXB(TEXT), which output the contents of the buffers associated with input lexemes of the classes UNSIGNEDNO and TEXT, are not shown: there is at present no facility to output or otherwise manipulate these buffers from fabricated transform-cases, and so they have to be handled by escapes or (as here) by hand-written transforms in SIMULA.

At a more trivial level, the pairs of dollar-signs should be noted: these indicate end-of-line output, and need to be edited accordingly in any output of the fabricated compiler. This should be replaced by a more explicit facility of the fabricator.

The five lines output by OUT(PROGRAM) itself will be discussed below.

The following is a listing of a run of the fabricated "compiler".

.EX TEVAL.SIM/CREF(-W)
SIMULA:  TEVAL
ERRORS DETECTED: 4 TYPE U MESSAGES
LINK:  Loading
[LNKXCT TEVAL Execution]
listFile?TTY
INPUT FILE?TEVTST.TEV
OUTPUT FILE?TEVTST.MAC
1+2+3, 27*4-200...
**** Expected: a EF **** found: .
4 garbage collection(s) in 160 ms
End of SIMULA program execution.
CPU time: 0.78   Elapsed time: 4.26

The input file contained the following text.

1+2+3,
27*4-200...

The syntax of PROGRAM demands one full stop then end-of-file: the two extra input full stops cause the listing of an error message relating to the second full stop. After editing the $$ pairs to end-of-lines, the output of the above run is the following.

.TYPE TEVTST.MAC
SEARCH TEVAL
.MAKESTAK
GO::.INITSTAK
.EVAL 1
.EVAL 2
.ADD
.EVAL 3
.ADD
.PRINT
.EVAL 27
.EVAL 4
.MULT
.EVAL 200
.SUBT
.PRINT
.STOP
END GO

Each of the formats of the output lines was programmed as an assembler macro, so the above output could be macro-assembled and run. The first three lines and the last two were needed to initialise and finalise the macro-assembly process and the program run, and to reserve space for a stack. The following is a listing of the program run.

.EX TEVTST.MAC/LIST
MACRO:  .MAIN
LINK:  Loading
[LNKXCT TEVTST Execution]
6
-92

EXIT
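To make the semantics of the generated IL concrete, the following sketch (illustrative Python; the operation names are taken from the listing above, and the assembly housekeeping lines are simply ignored) interprets the TEVTST output and prints 6 and -92, as the run above does:

def run(il_text):
    stack = []
    for line in il_text.splitlines():
        parts = line.split()
        op = parts[0] if parts else ''
        if op == '.EVAL':
            stack.append(int(parts[1]))          # push an operand
        elif op == '.ADD':
            b, a = stack.pop(), stack.pop(); stack.append(a + b)
        elif op == '.SUBT':
            b, a = stack.pop(), stack.pop(); stack.append(a - b)
        elif op == '.MULT':
            b, a = stack.pop(), stack.pop(); stack.append(a * b)
        elif op == '.MINUS':
            stack.append(-stack.pop())           # monadic minus
        elif op == '.PRINT':
            print(stack.pop())
        # SEARCH, .MAKESTAK, GO::, .STOP, END are assembly
        # housekeeping with no effect on the evaluation itself

run(""".EVAL 1
.EVAL 2
.ADD
.EVAL 3
.ADD
.PRINT
.EVAL 27
.EVAL 4
.MULT
.EVAL 200
.SUBT
.PRINT""")                                       # prints 6 and then -92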

The preceding TEVTST example illustrates that the method still requires a substantial set of transforms to achieve the translation of a simple and straightforward syntax: for non-optimising translation the transforms in the examples to date are 3 to 4 times the length of the syntax. (This does not include name identification and other parts of translation that cannot be written as transforms.)

As an example that minor changes in the transforms can result in significant improvement of the output, the following two variants of expression-evaluation (both even simpler than TEVTST) are presented. DEM04 is the naive set of transforms for the syntax.

.RUN ...
listFile?TTY
LIST TREE?N
LIST ORDINALS?N
INPUT FILE?DEM04

LEXIS

ID = < LETTER * > :
NUM = < DIGIT * > :
EF = EOF .

SYNTAX

@GET HANDWRITTEN@

PROGRAM = @PUT INIT_IO; INIT_XC; @ SUM, @PUT A_SUM.T_L; @ EF @PUT EN_CLOSE; EX_CLOSE; @ :
SUM = < [ ADDOP~MONO ], PROD * ADDOP~DYAD > :
PROD = < PRIM * ("*" ! "/") > :
PRIM = ( ID ! NUM ! "(", SUM, ")" ) :
ADDOP = ( "+" ! "-" )

ACTIONS

L(PROD! ID ) => @ ol; @ ZXB(ID) :
L(PROD! NUM ) => @ ol; @ ZXB(NUM) :
L(PROD! "(" SUM ")" ) => L(SUM) :
L(SUM!PROD) => L(PROD) :
L(SUM!ADDOP PROD) => L(SUM!PROD) MONO(ADDOP) :
L(SUM!SUM ADDOP PROD) => L(SUM) L(PROD) DYAD(ADDOP) :
L(SUM!SUM ADDOP~1 ADDOP~2 PROD) =>
    L(SUM) L(SUM! ADDOP~2 PROD ) DYAD(ADDOP~1) :

MONO(ADDOP!"+") => :
MONO(ADDOP!"-") => @ ol; @ " MINUS " :
DYAD(ADDOP!"+") => @ ol; @ " ADD " :
DYAD(ADDOP!"-") => @ ol; @ " SUBTRACT " :

The following is a run of the fabricated translator on the input expression that it lists, followed by a listing of the outputs. (The escapes in the transforms fabricate calls to the output-line subroutine of the fabricated translator.)

.EX DEM04
LINK:  Loading
[LNKXCT DEM04 Execution]
listFile?TTY
INPUT FILE?DEM04.IN
OUTPUT FILE?DEM04.EX
1 + 2 + +3 + -4 - 5 - -6 - -7
3 garbage collection(s) in 80 ms
End of SIMULA program execution.
CPU time: 0.52   Elapsed time: 13.32

.TYPE DEH04.EX

1 2 ADO 3 ADO 4 MINUS ADO 5 SUBTRACT S MINUS SUBTRACT 7 MINUS SUBTRACT

If this is a typical input to this translator, it needs to be cleverer in cases of two adjacent ADDOPs. DEM05 illustrates one possible improvement, replacing the one (long) transform case dealing with the two-ADDOPs case by two (short) transform cases, one dealing with the ADDOPs being the same (i.e. ++ or --, both with arithmetic effect +), the other dealing with the remaining combinations (i.e. +- and -+, both with arithmetic effect -).

.RUN SSI
listFile?TTY
LIST TREE?N
LIST ORDINALS?N
INPUT FILE?DEM05

(The syntax and lexis are identical to DEM04.)

ACTIONS
L(PROD! ID )  => $ ol; $ zxb(ID) :
L(PROD! NUM ) => $ ol; $ zxb(NUM) :
L(PROD! "(" SUM ")" ) => L(SUM) :
L(SUM!PROD) => L(PROD) :
L(SUM!ADDOP PROD) => L(SUM!PROD) MONO(ADDOP) :
L(SUM!SUM ADDOP PROD) => L(SUM) L(PROD) DYAD(ADDOP) :
L(SUM!SUM ADDOP^1 ADDOP^1 PROD) => L(SUM! SUM "+" PROD ) :
L(SUM!SUM ADDOP^1 ADDOP^2 PROD) => L(SUM! SUM "-" PROD ) :
MONO(ADDOP! "+" ) => $$ :
MONO(ADDOP! "-" ) => $ ol; $ " MINUS " :
DYAD(ADDOP! "+" ) => $ ol; $ " ADD " :
DYAD(ADDOP! "-" ) => $ ol; $ " SUBTRACT " :

.EXECUTE
LINK: Loading
[LNKXCT DEM05 Execution]
listFile?TTY
INPUT FILE?DEM04.IN
OUTPUT FILE?DEM05.EX

1 + 2 + +3 + -4 - 5 - -6 - -7

3 garbage collection(s) in 80 ms
End of SIMULA program execution. CPU time: 0.68 Elapsed time: 13.24

.TYPE DEM05.EX
1 2 ADD 3 ADD 4 SUBTRACT 5 SUBTRACT 6 ADD 7 ADD

It will be seen that minor changes, in suitable circumstances, can lead to significant improvement in the final outputs.

For larger examples, see appendix P.

4.6 REVIEW OF THE PROPOSED TECHNIQUES.

Personal experience of compiler writing, backed by the reading

of the source of a variety of compilers, suggested a stereotype

for the case analysis of the earlier code-generation phases.

Writing the case-analysis is often the most time-consuming and

error-prone part of hand-crafting an orthodox compiler, and is

highly susceptible to even minor changes in the syntax.

The techniques proposed and implemented provide a language in

which cases of syntax are dealt with independently, unless

other cases create exclusions from them. There is consequently a high separation of program difficulty into the individual cases, which become very simple in isolation.

The software resolves the inter-relationships between cases

of the same transform, and in most cases resolves the effect

of detailed changes to the syntax on the text of the transforms.

All the case-analysis code is generated automatically.

The transform language has thus been conceived to maximise

localisation of compiler-design issues. Inspection of the fabricated code shows that it is often 20% to 50% less efficient than equivalent code written by hand, but can be implemented in about 10% of the effort otherwise needed.

There are many things this first version cannot do, as commented

in prior sections. Lacking a stereotype, no name-table manipulation is provided: for a particular case it must be added as

hand-written code, or be a subsequent pass over the output of

the primary code-generator.

To close this chapter, the following is a final example of what the present system can and cannot do. Consider a lexic division including:

CONST = DIGIT < DIGIT * >

with a syntactic division including:

MULTIPLE = < PRIMITIVE * ('*'!'/') >

with PRIMITIVE defined in such a way that CONST is a case of it. Then floating-point division (expensive on most machines) can be turned into floating-point multiplication (cheaper) if the divisor is a CONST by the following transform-case:

FLOAD(MULTIPLE!MULTIPLE"/"CONST) =>
    FLOAD(MULTIPLE!MULTIPLE"*(1/"CONST")")

Ultimately, the bracketed term must be handled by something outside of the present system. The following is a simple possibility:

FLOAD(MULTIPLE! "1/"CONST ) => "LOADRECIPROCAL " Z(CONST)

- in which Z is a hand-written transform for outputting the text of lexic items. Already a library of hand-written transforms such as this Z (and the ZXB of the prior examples) is beginning to emerge, written in SIMULA. A later revision of the software will include specific facilities for them, but until then they must be included by the techniques described in the next chapter.

CHAPTER 5

A PILOT PRIMARY CODE-GENERATOR FABRICATOR

5.1 INTRODUCTION

This chapter briefly describes an implementation of the system described in the last chapter, the implementation sufficing to demonstrate the viability of the novel essential ideas of the system.

The implementation is large and complex. This short chapter mentions only the key aspects of the implementation, and in a few cases proposes alternative techniques suggested by the experience. Section 5.2 overviews the suite of programs and the methodology of using them and of writing them. Section

5.3 describes the fabricators of lexic and syntactic analysers.

Section 5.4 describes the parser of the transforms. Section

5.5 describes the program that takes the parsed transforms and fabricates the corresponding code, and also describes a program that produces a library to support the fabricated transforms. Section 5.6 closes the chapter with a final discussion.

5.2 OVERVIEW OF THE CF1 SUITE OF PROGRAMS

For brevity, the program suite described here is called CF1:

Compiler Fabricator for PRIMARY code generators.

From the first phases of the programming it was clear that

it was simpler to have multiple programs, each to fabricate

one part of the final parser/code generator. This separated

many of the problems in writing the suite, but some care had

to be taken to preserve identical functionality in some parts

of the programs. The separation also implied a utility for

joining the various outputs into one final source program.

This source processor, its use in writing the programs of the

suite, and its use as a member of the suite, is discussed in

section 5.2.2.

The suite itself comprises the source processor and six other

programs. Figure 5-1 on the next page illustrates the

organisation of the programs, and the flow of information from

the input file, through the suite, to finally obtain a SIMULA

program to be compiled and run as the desired parser/primary

code generator.

The main input file to the suite is in three divisions: lexic

rules, then syntactic rules, then transforms. Hand-written

Simula code for incorporation can be appended to this file

for inclusion in the fabricated lexic analyser, syntactic

analyser, or transform code: this is specified using GET

escapes at the appropriate point in the corresponding division.

Alternatively, hand-written code can be taken from another file.

The Lexic Analyser Fabricator and Syntactic Analyser Fabricator

each read only the first two divisions of the input file: each

of them outputs its fabricated Simula with a few extra lines required by the source processor.

Figure 5-1 (Organisation of the CF1 suite): in this diagram, programs are represented by rectangles, and the principal files used in an application of the suite are represented by drum-shapes.

The lexic analyser fabricator

needs the syntax division only to pick out the reserved lexemes:

the syntax analyser fabricator needs to read the lexic division

only to know the metavariable names of the lexic rules. These

two fabricators are described in detail in Section 5.3.

The fabrication of the transforms is done in three main stages:

first the transform cases are parsed, and output in an easy-to-

reparse form; this is then sorted; then reparsed and the

transforms finally fabricated. The key reason for the split

is that the intermediate form of the transform cases is

represented starting with the two identifiers that comprise

the name of the transform, so that the sorting collects all

the cases of a transform. Without this intermediate stage,

the freedom of order of giving the transforms in the input

file would imply the possibility that one may need all the

transforms in store at the same time - which for large sets

of transforms would demand a very substantial machine-memory

size. With the sorting, only the memory for the largest

transform is needed.
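To make the staging concrete, the following is a minimal sketch in present-day Python (purely illustrative - the suite itself is written in SIMULA, and the function names here are invented):

    # Sketch of the parse/sort/fabricate staging.  Each intermediate
    # record is assumed to be one line that begins with the two
    # identifiers naming its transform, e.g. "OUT AE ...".
    from itertools import groupby

    def name_pair(line):
        first, second, _rest = line.split(None, 2)
        return (first, second)

    def fabricate_all(case_lines, fabricate_one_transform):
        # Sorting brings all the cases of one transform together, so
        # only the cases of the current transform need be in memory.
        for pair, cases in groupby(sorted(case_lines), key=name_pair):
            fabricate_one_transform(pair, list(cases))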

The intermediate form therefore is one line per transform

case. After the two identifiers comes a form of the matching pattern, in which is embedded all the parse structure inferred

from the syntax division; then escapes, texts, and transform

calls for the right hand side of the transform case (the

calls being the pair of identifiers plus structured parse of

the creation pattern). The program that reads the transforms must therefore parse the patterns in a special manner, dependent on the syntactic and lexic rules. Since those rules are input-dependent, they cannot be built in (unlike the rules for parsing the lexic and syntactic divisions and the non-transform parts of the transform division); instead this program must hold a data structure representing the lexic and syntactic rules, and interpret that in order to parse the patterns. This program is therefore called the Interpretive Parser: it is discussed in more detail in Section 5.4.

The Transform Fabricator takes the sorted interpretively-parsed transform cases until it finds the name of another transform: it inserts each input case according to its matching pattern into a tree of cases, the structure of which represents the relationships between the patterns. From this tree, it is able to fabricate case analysis code appropriate to the transforms.

In each branch of the case analysis it fabricates code matching the right hand side of a transform case: apart from the created patterns, this is straightforward. For a created pattern, it calls a function that builds a parse-tree node representing the pattern, from equivalent function-parameters. This needs

more knowledge of the syntactic and lexic rules than the transform fabricator is given by the interpretive parser, so another program is used to read the lexic and syntactic rules, and fabricate these oddments. This last program of the suite is called the Oddments Fabricator, for want of a more concise name. Some of the fabricated oddments are small, and are directly read by the transform fabricator and incorporated into its output: some are large and are output as a source library file. The transform fabricator outputs calls on the source processor to incorporate the large oddments needed by the transform fabricator output. The transform fabricator and oddments fabricator are discussed in more detail in Section 5.5.

A choice which had to be made right at the beginning was whether to have the parser-fabricator produce either (1) the source of a compiler, or (2) a table (or set of tables or other structured datasets) which would "parameterise" a standard parser which interprets the table as a specification of the syntax.

Method (1) has the advantage of producing fast compilers. A slow part of all compilers - often the slow part - is the lexical

analyser, so there is no doubt in my mind that method (1) is best for the lexical analysers. This method has the disadvantage of producing large parsers: size being roughly proportional to

the length of the lexic and syntactic divisions. Most programming

languages have lexical definitions of about the same complexity, so the fabricated sizes of lexical analysers would not differ by a great factor. However the complexities of syntaxes vary enormously, and consequently so would the sizes of syntax analysers of this type: for big languages the fabricated syntax analysers would be vast.

Method (2) requires a fixed size for the interpretive syntax analyser: this could be of about the size of a Method (1) parser for a ten-production language. Method (2) also would typically call for table sizes about one-tenth the size of the corresponding method (1) code: this sets breakeven on size for ,129

the two methods at an 11-production language! Thus even for relatively small languages method (2) is overwhelmingly more compact. However it is at best about one-half the speed.

This would be acceptable for a fast syntax analysis method - e.g. one which takes time linearly proportional to the length of the text to be analysed. (Many common methods take time quadratic in the length, cubic, or to the power of 2.81 = log2 7.)
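The size breakeven quoted above follows directly from the two estimates: measuring size in units of one production's worth of method (1) code, method (1) costs about n units for an n-production language, while method (2) costs a fixed 10 units of interpreter plus about n/10 units of tables. The two are equal when

    n = 10 + n/10,   i.e.   9n/10 = 10,   i.e.   n ≈ 11.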

In the event the decision was made by implementation considerations. Method (2) would be easy to first-draft, but hard to debug or enhance, because the produced tables would be too obscure to verify easily. Even worse is the problem of initialising the tables into object parsers: possible variants of the method are:

(2a) fabricating code to initialise the table;
(2b) fabricating the tables as a file to be read by hand-written code incorporated into all target compilers;
(2c) using structure-constants of the implementation language to produce the table as a file to be incorporated in the source of the target compiler; or
(2d) same as (2c), but additionally with function entries in the tables.

Variants (2a) and (2b) require space for code used only at start-up: but this could be a soon-discarded overlay. However they would both make method (2) even harder to debug or enhance: to my mind this eliminates them.

Variant (2c) would be viable if the implementation language supported structure-constants - e.g. CORAL, ALGOL-68, ADA.

(Also Fortran if it bore consideration as an implementation language.)

No such language was available to me. The implementation language selected was SIMULA: structure-constants are not one of its features.

Variant (2d) is supported by few languages: e.g. BCPL and

POP-2. BCPL was not available; POP-2 became available too

late for this part of the project, but has a number of other

features that make it highly desirable.

In the first instance, I therefore had to use method (1).

An implementation of Simula written using a working Compiler

Factory could easily be upgraded for tables for variants (2c)

and (2d), and thus the compiler factory could be re-written

in this upgraded Simula to use variant (2c) or (2d).

An additional reason to prefer some variant of method (2)

is that it would be more akin to the interpretive parser

for the patterns in the transforms: and the interpretive

parser had to be written anyway! In a sense, the interpretive

parser already is close to variant (2b).

In the event, it was decided to fabricate method (1) parsers,

using no-backtrack LR parsing techniques on (extended) LL grammars. This method (and others) has been called both LC

and "recursive descent".

5.2.1 Bootstrap use of the parser generator, and bootchecking

All the programs of the suite except the sort program and the

source preprocessor have to syntax-analyse their inputs. It was therefore natural and desirable to write them using the

parser-fabricating programs, that is, the lexic-analyser

fabricator and the syntax-analyser fabricator. Clearly, in

the first instance, these two latter programs could not be

used to write their not-yet-existing selves. A bootstrapping

process was therefore applied. A lexic analyser for the input

language was written by hand: it was very much a first draft

of what a fabricated lexic analyser should be. A syntax

analyser, of the form needed for the syntax-analyser fabricator (and itself very much a draft of its own output) was written by hand,

and the two hand-written sets of code joined to give a first draft of the syntax-analyser fabricator. The main input on which this was tested was a syntax division giving the syntax of syntax divisions, so the first working program part-fabricated by the suite was the second version of the syntax-analyser fabricator. Comparison of the first (handwritten) syntax-analyser fabricator and the second (fabricated) fabricator was invaluable for eliminating several subtle bugs from the hand-written version.


When use of the second (fabricated) version yielded output identical to that of the first (handwritten) version, they were declared bug-free, at least for boot-strapping their own successors. This technique of checking a bootstrapping stage I call a bootcheck. The handwritten version was called version 0, the first fabricated version was called version 1. Bootcheck versions were not numbered, being identical to a numbered version.

A syntax for lexic divisions was then written and fabricated: to this the appropriate fragments of the handwritten dedicated lexic-analyser fabricator were joined, generalised, and re-debugged until this fabricated a lexic analyser, which (when re-used) fabricated itself. This was then version 2 of the software; it had called for a slight amendment in the syntax analyser fabricator, and so had implied a second boot-check.

Whenever any of the programs in the suite was modified, these two fabricators - on which the whole integrity of the system depends - were each bootchecked. Of the 196 versions through which the suite went before completion, about 50 had overt changes to one of these two fabricators, but 6 versions had covert changes (due to modifications of the source library) of which two would have been damaging. Bootchecking caught these covert changes, facilitated validation of the acceptable four and - most importantly - made the unacceptable two changes very obvious. (1)

The remaining three programs of the suite that parse their input are written using the lexic-analyser fabricator, the syntax-analyser fabricator, and large bulks of hand-written code incorporated using escapes. Figure 5-2 illustrates the organisation of the fabrication of one of the five programs, using these two fabricators.

(1) The four acceptable covert changes were given version numbers in their own right - despite there being no input source changes - and bootchecked. It was this second bootcheck, using identical input source, that caught one of the two unacceptable covert changes. The bug in that covert change was very subtle, and by orthodox de-bugging methods would have been extremely difficult to locate.

Figure 5-2 (Bootstrap use of CF1): fabrication of version N+1 of a CF1 program, in Simula.

This bootstrapping stratagem has saved a substantial amount of effort, especially for maintaining the compatibility of various aspects of the five programs.

5.2.2 The source processor, and the source library.

The source processor can be configured in two major ways:

as a set of subroutines in another program, or as a program

in its own right. The program version is little more than

the subroutines called by a loop that reads a line, then writes a line, then loops, with appropriate initialisation and finalisation.

The full version of the processor has 16 special facilities

for input, each of which may be present or absent, and one

special facility for output, which may be present or absent.

Only the three major special facilities are described here.

One of the optional special input facilities is source line

selection. When the input subroutines are initialised, if

source line selection is present then the question "WHICH

SELECTIONS?" is printed on the controlling terminal. The response should be zero, one or more letters and/or digits.

When a line is input via the subroutines, the first character is inspected: if it is a question mark, the next question-mark on the line is located (if there is not another, an error is given). If any of the characters between the two question marks was given in answer to "WHICH SELECTIONS?", the line is passed to the calling routines: otherwise, the line is discarded, another is input, and the process of line selection is restarted.

Using line selection, it is easy to configure variants of a source depending on the reply to "WHICH SELECTIONS?", if the options can be made orthogonal.
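The selection test itself is simple; the following sketch (illustrative Python - the real routines are the SIMULA source processor subroutines, and the stripping of the marker from surviving lines is an assumption of the sketch) shows the filter:

    # Sketch of source line selection.  A line "?AB?text" survives
    # only if 'A' or 'B' was given in reply to "WHICH SELECTIONS?".
    def select_lines(lines, selections):
        for line in lines:
            if line.startswith('?'):
                end = line.find('?', 1)
                if end < 0:
                    raise ValueError("second question mark missing")
                if not any(c in selections for c in line[1:end]):
                    continue                 # discarded: no selector matched
                line = line[end + 1:]        # assumed: marker is stripped
            yield line

    # list(select_lines(['?A?only for A', 'always'], 'AB'))
    #     --> ['only for A', 'always']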

The source processor's own source must be processed through a version of itself in which source line selection (and several other special facilities) is present. It is by this means that the 17 special facilities are made optional.

The four programs of the CF1 suite that read lexic and syntactic divisions (and in one case transform divisions) in CF1 input notation all have their hand-written parts at the end of the same file, with the sections for the different programs marked for source line selection. The syntactic-analyser fabricator and lexic-analyser fabricator include the source processor routines with the source line selection facility present. Thus, when fabricating themselves or one of the other two, they can make the correct selections.

Another optional special input facility is in-lining. If this facility is present, it inspects every input line, determining if it starts with two asterisks. For such lines it has three subcases:

(1) the line is of at least 5 characters, and the last two characters are also asterisks;
(2) the line is of at least 3 characters, and the last two characters are not asterisks; and
(3) the line is the two asterisks only.

In case (1), the line is a control line, and the characters between the asterisks determine the effect. If a control line starts **OPEN, then the remaining characters are the name of a file on external storage, which is to be opened as the secondary input file: if there is already an open secondary input file, an error is given. If a control line is **CLOSE**, then the current secondary input file is closed: if there isn't a secondary input file, an error is given.

If a control line starts **GET, then the remaining characters are the name of a subfile on the current secondary input file which is to be located; then the normal input is switched to the secondary input file: if there isn't a current secondary input file, then an error is given.

In case (2), the characters after the two asterisks are the name of a subfile which starts at the next line. This should only appear in secondary files: in a primary file it will cause an error if read. Thus after reading a **GET from a primary input file, the next line delivered to the calling routine is the line following the subfile name in the secondary file; and subsequent calls deliver subsequent lines from the secondary input file.

While inputting from a secondary input file, if a case (3) or another case (2) line is encountered, the reading is switched back to the primary file, and the reading process restarted. Thus, in one of the latter cases, the next line delivered to the calling routine will be that after the

**GET in the primary file, or if that is a control line, whatever the control line determines. In a secondary file, there may be an **OPEN of, **GET from, and **CLOSE** of a tertiary file: in a tertiary file, of a quaternary file: and so on. Between an **OPEN and its matching **CLOSE**, there may be many (or no) **GET lines.
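As a sketch of these control lines (illustrative Python; only one level of secondary file is modelled, and error handling is reduced to exceptions):

    # Sketch of **OPEN / **GET / **CLOSE** in-lining, one level deep.
    def inline(primary_lines, open_file=open):
        secondary = None
        for line in primary_lines:
            line = line.rstrip('\n')
            if line.startswith('**OPEN'):
                if secondary is not None:
                    raise ValueError("secondary file already open")
                secondary = open_file(line[6:].strip())
            elif line == '**CLOSE**':
                if secondary is None:
                    raise ValueError("no secondary file to close")
                secondary.close()
                secondary = None
            elif line.startswith('**GET'):
                if secondary is None:
                    raise ValueError("no secondary file for **GET")
                wanted = '**' + line[5:].strip()
                for s in secondary:                    # locate the subfile
                    if s.rstrip() == wanted:
                        break
                for s in secondary:                    # copy until it ends
                    s = s.rstrip('\n')
                    if s == '**' or (s.startswith('**')
                                     and not s.endswith('**')):
                        break        # a case (3) line, or the next subfile
                    yield s
            else:
                yield line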

The same routine that is called by a **GET line is called by the lexic-analyser fabricator and syntactic-analyser fabricator when they process a GET escape, with no primary file, and the CF1 input file as the secondary input.

This is the means by which large quantities of hand-written code are most easily incorporated into fabricated programs

(notably into the CF1 programs themselves).

The third facility important to the CF1 suite is library in-lining. A control line starting **CALL simply adds the remaining characters on the line as an entry on the call-list, unless they are already present as an entry of either the call-list or the called-list. When the subroutines are initialised, both lists start empty. A control line **SEARCH** causes the transfer to the next-level **OPENed file, and a search to a line marking the start of a subfile. If the subfile name is not in the call list, the search restarts for another subfile: if it is in the call list, input switches to it, and the name is removed from the call list and put in the called list: if a line of exactly two asterisks terminates the subfile, the search restarts: if the subfile is terminated by a line marking the start of another subfile, that subfile name is checked against the call list and processed as if found by a search. When a search reaches end-of-file, the line

after the **SEARCH** is processed. When the finalisation

routine is called, if the call-list is not empty an error

is given and the names of the un-found subfiles listed.
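A sketch of these two control lines (Python again, names invented; the call-list and called-list are held as lists, and emit() receives the in-lined lines):

    # Sketch of **CALL and **SEARCH** library in-lining.
    def call(name, call_list, called_list):
        if name not in call_list and name not in called_list:
            call_list.append(name)

    def search(library_lines, call_list, called_list, emit):
        copying = False
        for line in library_lines:
            line = line.rstrip('\n')
            starts_subfile = (line.startswith('**') and len(line) > 2
                              and not line.endswith('**'))
            if starts_subfile:
                name = line[2:]
                copying = name in call_list
                if copying:
                    call_list.remove(name)   # each subfile in-lined once only
                    called_list.append(name)
            elif line == '**':
                copying = False              # end of the current subfile
            elif copying:
                emit(line)
        # at finalisation, a non-empty call_list is an error: those
        # subfiles were never found in the library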

Library in-lining is important in CF1 for two purposes.

The code output by the transform-fabricator seldom needs

all the oddments (fabricated by the oddments-fabricator)

that it might have needed. Thus, when it outputs code

requiring an oddment, for the next line it outputs a **CALL

of the oddment. The oddments-fabricator puts each oddment

in a subfile of its own. The reason that **GET could not be

used is that the oddment would be in-lined once per **GET, in place of the **GET: but each oddment is needed only once, and

in a place known to be harmless. Therefore, the only remaining

requirement on the transform fabricator is that it should put

a **SEARCH** of the oddments library at a harmless (and very

late) point in its output to obtain each **CALLed oddment once.

The second purpose of library in-lining is to obtain subfiles

from the CF1 library. This is a library of subroutines common

to all possible CF1-output sources (including the CF1 programs themselves). It includes:

* the source processor routines themselves;
* character-input routines needed by all fabricated lexic analysers (and which call the source processor routines to input lines);
* listing routines and error-message routines for all fabricated parsers, error messages appearing at appropriate points in output listings;
* routines to assist in the listing of trees, for building into debugging versions of CF1 programs;
* routines for manipulating names in all CF1 programs (see 5.3.1); and
* other miscellaneous facilities needed in CF1 programs or in other CF1-fabricated programs, such as to print in bold, return text string representations of binary integers, etc.

Having common source for these routines (even if different variants were to be produced) was a great time-saver, and also assisted in guaranteeing consistency between the CF1 programs, and also between CF1-fabricated programs. (2)

Finally, it facilitated parallel updates of the sources of two or more programs at a time.

5.3 THE LEXICAL ANALYSER FABRICATOR AND SYNTACTIC ANALYSER

FABRICATOR

The lexical analyser fabricator fabricates one large Simula procedure that inputs and checks the next lexeme (lexical symbol, token) of the target language, after ignoring any preceding blanks. A call of the fabricated procedure leaves an integer value representing the kind of the lexeme in a result variable, and the accepted text in a secondary result variable.

Each lexic rule and reserved lexeme of the target language is assigned its own integer.

(2) At one stage, the syntax-analyser generator, the interpretive parser, and the oddments generator were all one program. By introducing appropriate source selection, it was possible to mark the three sources. Trial of the three, with the source processor used to merge their outputs, gave final output identical to that of the combined program at the first attempt.

The syntactic analyser fabricator fabricates one parsing procedure for each syntactic rule: a call of such a routine

returns a pointer to a (new) tree-node object of a Simula

class also fabricated for the same syntactic rule.

The syntactic analyser fabricator consistently assigns the

same integers as the lexic analyser fabricator to each kind

of lexeme: thus, it can test if the latest-read lexeme is of

a certain kind or of one of a certain set of kinds, as is needed for LR parsing of an LL language. Great care was needed to ensure the consistency of the two assignments (and

of the same assignment again in the interpretive parser and

oddments fabricator).

To simplify the syntax analyser fabricator a little, to obtain a subtree node the fabricated procedure makes a call that does not distinguish between nodes representing a syntactic rule or a lexic rule. Thus, for each lexic rule (but not for each reserved lexeme) the lexic analyser fabricator produces a procedure that returns a pointer to a node which includes the text of the lexeme, or gives an error if the lexeme currently awaiting acceptance is not of the expected type.

In either case, the lexeme is consumed and the main lexical-analyser routine called to analyse the next lexeme. This has the side-advantage of avoiding duplicated in-line code for accepting a particular kind of lexeme.

In early versions of the software, a boolean procedure was fabricated for each lexic rule and syntactic rule which returned TRUE if (and only if) the current lexeme was an instance of

the lexic rule, or could start an instance of the syntactic

rule. These procedures were found to consume excessive time

for procedure-calling, and so had to be replaced by in-line

tests conjoining and disjoining equalities of constants

(representing lexeme kinds) against the kind of the current

lexeme. This was acceptably fast, and not excessively bulky,

but demanded the consistency mentioned above. In a substrate

language including case-statements (e.g. BCPL or Ada) it would

be faster and more compact.
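The shape of the replacement can be indicated as follows (a Python rendering - the fabricated code is Simula, and the kind numbers here are invented for the example):

    # The abandoned form: one boolean procedure per rule.
    ID_KIND, NUM_KIND, LPAREN_KIND = 7, 8, 9   # invented kind numbers

    def starts_prim(kind):
        # a procedure call per test: found too slow in practice
        return kind == ID_KIND or kind == NUM_KIND or kind == LPAREN_KIND

    # The fabricated replacement: the same disjunction of constant
    # comparisons written in-line at each point of use, avoiding the
    # procedure-call overhead but demanding consistent kind numbering.
    def parse_step(kind):
        if kind == ID_KIND or kind == NUM_KIND or kind == LPAREN_KIND:
            return "parse a PRIM"
        return "error"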

5.3.1 The Lexical-analyser fabricator.

The fabricator attempts analyses of each reserved lexeme

according to the lexic rules, as discussed in section 4.3

(pages 84 and 85). Reserved lexemes which do not have a

(non-null) common prefix with any rule are treated as having

a null prefix common with a fictitious rule.

For each rule, the corresponding reserved lexemes are grouped

into:

(1) those which cause an error against the rule before being wholly accepted by the rule;
(2) those wholly accepted by the rule (without error); and
(3) those of which a prefix is accepted by the rule (without error).

The rule is then fabricated into the code given below, preceded by tests for the longest acceptable group (1) reserved lexeme, and at every end-point of the code for the longest remaining suffix of a group (2) or (3) reserved lexeme, matching the prefixes possibly reaching that end-point. ("End-point" here

applies when the rule ends with a choice: it is an end-point

of the branch of the if-statement fabricated for the choice.)

A sequence in the rule fabricates acceptors for the atoms of

the sequence, the order of the acceptors matching that of the

atoms. The sequence of acceptors is entered only if the first

acceptor is suitable for the current character; each other

acceptor is preceded by a test that the then-current character

is suitable for the acceptor. For example, the rule

LD = LETTER DIGIT

will fabricate an acceptor for one letter, then a check that

the next character is a digit, then an acceptor for one digit.

The pre-checking implies that acceptors for single-character atoms (including choices of which each branch is a single-character atom) need not "know" the variety of character(s) to be accepted: such an acceptor merely puts the current character into the lexeme buffer and inputs the next character.
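For instance, the fabricated acceptor for LD might take the following shape (a Python stand-in for the fabricated Simula; the Stream scaffolding is invented, and the report-then-continue error behaviour is that described in the next paragraph):

    # Sketch of the acceptor for LD = LETTER DIGIT.  The caller has
    # already checked that the current character is a letter.
    class Stream:
        def __init__(self, text):
            self.text, self.pos = text, 0
        def cur(self):
            return self.text[self.pos] if self.pos < len(self.text) else '\0'
        def advance(self):
            self.pos += 1
        def error(self, message):
            print("lexical error at", self.pos, "-", message)

    def accept_LD(s, buffer):
        buffer.append(s.cur()); s.advance()   # accept the letter, unchecked
        if not s.cur().isdigit():
            s.error("digit expected")         # report, then carry on as if OK
        buffer.append(s.cur()); s.advance()   # accept the digit

    # buf = []; accept_LD(Stream("A7"), buf)  -->  buf == ['A', '7']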

If the check before an acceptor fails, an error is given at

that point in the lexeme, and control continues as if the

check succeeded. Thus the present software gives a single

good error message, then may become utterly confused, giving

many spurious error messages. A production version ought

to try to avoid the spurious messages, and ideally would

attempt more intelligent recovery or even correction.

A choice in a lexic rule fabricates an if., then., else if ...

acceptor: the final else-branch simply gives an error message

at that point in the lexeme, and (to avoid possible hang-ups) accepts the bad character.

An option in a lexic rule fabricates an if..then... acceptor with no else-branch.

A repeat of form < sequence > fabricates an acceptor of form:

    while current character starts sequence
    do begin acceptor of sequence end

A repeat of form < sequence1 * sequence2 > fabricates an acceptor of form:

    acceptor of sequence1
    while current character starts sequence2
    do begin
        acceptor of sequence2
        acceptor of sequence1
    end

This duplicates the code of the acceptance of sequence1: most lexic sequences are simple enough that the duplication is reasonable.

A repeat of form < sequence * > is treated as the last case, with a null sequence2.

The only point of interest in the (pseudo-)parsing routine for each lexic rule is the new node to which it returns a pointer. In most cases, this is just the Simula TEXT object containing the string of characters buffered for the lexeme which is the accepted instance of the rule. When fabricating CF1 programs, the "decorated" items need to be multiples of TEXTs, with leading parts factored out so that (e.g.) determining that two decorated names have the same prefix can be by efficient pointer-comparison, rather than inefficient text-value-comparison:

    TEXT : POINTER field : field

This is achieved by inserts in the "parsers" for the rules.

For names of rules, the above is still a simplification, in that instead of pointing to a common text object, the common object pointed to is the parse tree of the rule, or, before the rule has been read, to a place-holder for the rule. Once

the complete parse-tree has been built, reference to the parse node of a decorated item is only one indirection from

the rule referred to. This avoids expensive and repeated

searches through the lists of rules.

5.3.2 The Syntactic-analyser fabricator.

A parsing procedure is fabricated for each syntactic rule:

the body is organised much as for the acceptor fabricated

for a lexic rule for the same structure. The main differences are the possibility of calling other parsers from this

(including the lexic pseudo-parsers), the expansion of the in-lined tests, and the organisation of the nodes pointed to by the results.

The parsing method used is a variation of GLC [DEM77].

For work such as this project, it is highly desirable to use a strongly typed language: in the project this feature of Simula exposed many trivial errors that could otherwise have been time-wasting. For the simple representation of nodes matching rules with sequences of choices, the nodes would need

embedded variant parts. No common current language has

this feature, except (by definition) the untyped languages.

Consequently a rule such as

FOUR_TWO = ( A1 ! A2 ! A3 ! A4 ), ( B1 ! B2 )

implies a node with space for six pointers, not space for

two. This space-wasting is the worst single feature of

the present software. Some languages allow unchecked type conversions (e.g. Ada): given that the pointers should all be the same size, this would be viable, especially given that the un-checking code would be machine-generated, and so more consistently debuggable than in the case of random human omissions. A Simula "trick" could also have been used, but in a trial led to appalling loss of speed of the fabricated code, and also tied the system much more tightly to Simula as a substrate language.

In addition, to record the reserved lexeme represented by a node of a rule such as

ADDING_OPERATOR = ( "+" ! "-" )

there will be no pointers: so null values of pointers cannot be relied on to indicate that the corresponding branch of a choice did not match the construct instance. For each choice there was therefore a need for an integer in the node to indicate which branch matches the instance, and for each option a boolean to indicate whether it was present in the instance: trying to optimise these away would lose the separation of the representation of the structure of a case or option from the representation of the branches or optional list.
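In a modern rendering, the node class fabricated for FOUR_TWO would look something like this (Python for illustration; the real classes are fabricated Simula):

    # Node for FOUR_TWO = ( A1 ! A2 ! A3 ! A4 ), ( B1 ! B2 ).
    # Without variant parts all six branch pointers are reserved, and
    # two integers record which branch of each choice matched.
    class FourTwoNode:
        def __init__(self):
            self.a1 = self.a2 = self.a3 = self.a4 = None   # choice 1 branches
            self.b1 = self.b2 = None                       # choice 2 branches
            self.which1 = 0     # branch matched in choice 1 (1..4)
            self.which2 = 0     # branch matched in choice 2 (1..2)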

The structure of nodes for repeats is the left-recursive

one implied in section 4.3 (page 87).

5.4 THE INTERPRETIVE PARSER

The interpretive parser is written using the syntax-analyser

fabricator and lexic-analyser fabricator as described in 5.2.1, the resulting tree being walked by handwritten code inserted by GET escapes as described in 5.2.2. The syntax tree of the syntactic and lexic divisions is retained for interpretation by the interpretive parser. The syntax of a case of a transform is given as:

TCASE = $PUT integer e; e:=errors; $
        TFORM, "=>", RHS, $GET **CASEMAIN$ :
RHS   = < (TFORM ! LITERAL ! ESCAPE) * > :
TFORM = NAME, "(", NAME^2, $GET **TFORMMAIN$ ")" :

TCASE gives the syntax of a single transform-case, with two escapes. The first escape has the effect of preserving the number of errors signalled before the start of the parse of the TCASE. The second escape checks if the number of errors by the end of the transform is still the same: if so it outputs a line for the transform fabricator representing the current transform; otherwise it outputs a flag for the transform fabricator warning that there was (at least) one bad transform.

RHS just syntax-accepts a sequence of TFORMs, LITERALS, and

ESCAPES.

TFORM accepts a form NAME(NAME!pattern). The second NAME needs a decoration so that hand-written inserts can distinguish it from the first. The exclamation mark does not appear in the syntax, because if it were accepted normally that would cause the input of the next lexeme according to the lexic rules of the CF1 notation - not of the language being defined. The escape GET **TFORMMAIN therefore starts by ensuring that the current lexeme is an exclamation mark, accepted by the CF1-notation lexic analyser; then, in dismissing that, accepting according to the lexic rules of the defined language. The exclamation mark must be recognisable as a reserved lexeme of the CF1 notation, and so must appear elsewhere in the syntax of the CF1 notation (it does - in the syntax for choices; if it didn't, a dummy rule would have to appear in the syntax rules of the syntax notation:

DUMMY = "!"

to introduce the reserved lexeme). The **TFORMMAIN escape must also recognise closing-bracket ")" as an end-of-pattern mark, and leave the variable whose value indicates the lexic class of the current lexeme indicating a closing-bracket reserved lexeme.

Thus, at the meta-level, the lexic analyser of patterns must recognise NAMEs (standing for instances of the syntactic or lexic rule named), closing-bracket, and the start of a quoted string of characters representing lexemes of the language being defined (here called ana-lexemes). At the ana-level (i.e. accepting ana-lexemes of the defined language) the interpretive lexic analyser must look to the lexic rules and

reserved lexemes of the defined language, and also regard

closing-quote as a signal to return to meta-level lexic

analysis. To avoid problems that could arise if the defined

language has lexic constructs that could include quote

marks, both single and double quote marks are allowed

around a passage at ana-level: such a passage must end with

the same kind of quote-mark as it starts, and inside the

passage the other kind of quote-mark is treated as a normal

character. As an extreme example, if the pattern was to include as a lexeme of a defined language the three characters

    ' " '

then this would need to be given as three separate quoted passages, each containing the other kind of quote:

    "'" '"' "'"

Because of cases such as this, it is necessary to allow a

lexeme of the defined language to be split across two or more

passages of text at ana-level. From the point of view of

legibility, it is fortunate that such cases are fairly rare.

The syntax parts of the interpretive parser are relatively

uncomplicated: the only complications are the need to interpret

the syntax, to be able to accept a name at meta-level in lieu

of an instance of a construct matching the rule of that name

(both lexic and syntactic constructs), and the produced parse

tree which must (in Simula) be from records whose structure

is given at the time of compiling the interpretive parser, and so cannot depend on the defined language. In the following

and so pannot depend on the defined language. In the following examples, a feature in the interpretive parser for listing the parse-trees of the patterns has been turned on. It includes some of the essential syntactic information that will be needed by the transform fabricator: specifically lists and loops (repeats). To keep the trees simple, they omit information on choices and options.

Names representing meta-instances are suffixed by a hash-mark.

After each example the line output for the transform fabricator is shown: this includes all syntactic information. In the output notation an up-arrow precedes a name input at meta-level in a pattern, braces enclose instances of a construct (the opening brace being followed by the name of what it is an instance of), diamond brackets enclose instances of loops, round brackets of choices, and square brackets of options. The opening round bracket of a choice is followed first by a number indicating which choice (in the relevant rule) is being instanced, then a number indicating which branch of the choice is instanced, then the representation of the branch. The opening square bracket of an option is followed by a number indicating which option (in the relevant rule) is being instanced, then the representation of the instance of the body of the option if (and only if) the body is instanced.

For the transform:

OUT(AE!AE4) => OUT(AE4!AE4)

the listing with trees of the patterns is:

    OUT(AE!AE4
         AE-list
          AE4-loop
           AE4#
    ) => OUT(AE4!AE4
          AE4-loop
           AE4#
    )

yielding the following output to the transform fabricator:

    OUT AE {:AE <{:AE4 <^AE4#>}>} => OUT {:AE4 <^AE4#>} :

The tree shows that the AE instance is a list of one item, an AE4-loop: the AE4-loop loops only once, the once round being instanced by a name AE4 at meta-level.

For the transform:

0UT(AE!"IF"BE"THEN"AE4"ELSE"AE) => "IF"0UT(BE!BE) "THEN"0l)T( AE4!AE4)"ELSE"0UT(AE! AE) the listing with trees of the patterns is: birfCAEiTlF,B£,THEN"A64"ELSE«AE

AC-list. "IF" BE* "then" A64-1&0" ! ae4* •else* AE*

) »> •if'outcbeipe

be* )"THEN"0UT(AE41AE4

AE4-1OOP AE4* )"ELSE"OUT(AS IAS AE*

)t yeilding the following output to the transform fabircator:

WouTHPftEnrHre'iF' ? 'then", {T:AE4 > "IF"T'OUT**:BEFRJ "THEN" t OUT CAE4 <-JAE4*> > f "ELS; E"; OUTA:AB* :

({The last represents one line too long to fit across the page, and therefore "wrapped round" onto successive lines.)

The tree above shows an AE instanced by a list of six items: an "IF" reserved lexeme, a BE meta-name, a "THEN" reserved lexeme, an AE4-loop, an "ELSE" reserved lexeme, and an AE meta-name. As in the first example, the AE4-loop is instanced by an AE4 meta-name only, listed identically, but with a vertical line emphasising continuity of the enclosing AE-list. The reserved lexemes are carried across to the transform

fabricator which (strictly) does not need them, but uses

them to improve the quality of error messages and other

listings.

The transform

OUT(AE3!AE3"/"AE2) => "(" OUT(AE3!AE3) "/" OUT(AE2!AE2) ")"

includes an instance of a loop on the repeat AE3 = < AE2 * ("*"!"/") >, and so the listing with trees of the patterns is:

    OUT(AE3!AE3"/"AE2
         AE3-loop
          AE3#
          "/"
          AE2-loop
           AE2#
    ) => "(" OUT(AE3!AE3
          AE3-loop
           AE3#
    ) "/" OUT(AE2!AE2
          AE2-loop
           AE2#
    ) ")" :

yielding the following output to the transform fabricator:

    OUT AE3 {:AE3 <^AE3# * (1 2 "/") ! {:AE2 <^AE2#>}>}
        => "(" OUT {:AE3 <^AE3#>} "/" OUT {:AE2 <^AE2#>} ")" :

The listing shows that the AE3-loop is tantamount to a

list of AE3, "/", and AE2-loop: it does not show what came

to left or right of the asterisk in the syntactic rule. The

output shows that the asterisk of the rule is passed by giving an asterisk, and that the loop is re-started by an exclamation mark.

Note that each output line starts with the name-pair of a transform, so that sorting the output lines of the interpretive parser collects the lines of the cases of a single transform, ready for the transform-fabricator, as described in section 5.2.

5.5 THE TRANSFORM FABRICATOR

The transform-fabricator uses routines produced by the syntax-

analyser fabricator and lexic-analyser fabricator to read a

line of the sorted version of the interpretive-parser output. The notation of this intermediate form is designed so that this parse does not depend in any way on the language being defined:

for example, all reserved lexemes are enclosed in double-quote marks, with any embedded double-quote marks given twice, and the final double-quote mark followed by another character (space, comma, semicolon, end-of-line, etc.).

When a transform is first met, first the prior transform (if

this is not the first) is generated, then a tree of cases is

initialised of the form:

    name of the transform's syntactic class
        |
    the first case of the transform

The top-most node is made to look like the parse-tree of a

transform

X(Y!Y) => :

(that is, with an empty right-hand side), where X(Y..) is

the name of the transform (that is, it is a transform of the

syntactic or lexic rule Y). Thus, comparison of the pattern

in the top-most node with any below it is by the same program

code as the comparison of any two cases input. If at any time

a case with left hand side X(Y!Y) is input, it is placed below

the top node. The top node is thus a device to make insertion

of nodes at the top level subject to the same rules as insertion

at lower levels. The top node is ignored during the output

stage at the end of the processing of the transform. Subsequent cases read are inserted into what becomes a sibling

tree: that is a tree in which each node (except the root) has

either a father or an immediately-elder brother, but not both:

and each node may or may not have a son, and(Independently)

may or may not have an immediately-younger brother. A node

iS.thus the father of its eldest-son node and of its eldest

sons siblings.

    X
    |
    Y1 -- Y2 -- Y3 -- Y4
    |           |     |
    Z11-Z12-Z13    Z31   Z41-Z42

In the above example of a sibling tree, X has four sons Yi, and each Yi except Y2 has son(s) Zij.

In the tree for a transform, the sons of a node all have matching patterns more specific than that of the father, and such that the matching patterns of brothers are pairwise incomparable (that is, neither more specific than the other).

A subroutine in the transform fabricator takes the trees of two cases and examines the patterns, and returns one of three results: the first pattern is more specific than the second, the second is more specific than the first, or they are incomparable. Patterns cannot be equivalent without being identical (within redecoration): in this case an error message is printed which lists the two original transforms, states them to have identical matching patterns, and warns that the latter case will be ignored.

When a new case is to be introduced as a descendant of a node, the immediate sons of the node are examined. If there are no sons, this new case is introduced as the first son. Otherwise, the new case is compared against each case in the list of sons until the end of the list is reached or until a case that is less specific than the new case is found. If a less-specific case is found, the algorithm recurses, making the new case a descendant of the less-specific case. Note that if there is a less-specific case than the new case, there cannot be any brother of the less-specific case that is more specific than the new case, because then (by the transitivity of the specificness relation) there would be brothers of which one is more specific than the other: brothers are incomparable.

If no less-specific case is found, then the brothers are classified into two groups: those incomparable with the new case, and those more specific. The new node is then inserted as a new youngest brother of those to which it is incomparable, with as sons those more specific than it. Against the possibility that there may be cases more specific than two (or more) mutually-incomparable brothers, it is made a descendant of the first of these. For that reason, when a list of brothers is separated into two sublists, the order of the original list must be preserved in the sublists.

The case that all the brothers are more specific than the new case is special, in that the father's pointer to its eldest son must be changed to point to the new case.
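The insertion can be summarised in a few lines (a Python sketch under the assumptions stated in the comments; the identical-pattern error case is omitted, and the fabricator itself is SIMULA):

    # Sketch of inserting a new case into the sibling tree.  compare()
    # is assumed to return FIRST if its first argument's pattern is
    # the more specific, SECOND if the second is, and NEITHER if they
    # are incomparable.
    FIRST, SECOND, NEITHER = 1, 2, 0

    class Case:
        def __init__(self, pattern):
            self.pattern, self.sons = pattern, []

    def insert(parent, new, compare):
        for son in parent.sons:
            if compare(new.pattern, son.pattern) == FIRST:
                insert(son, new, compare)   # a less-specific case: go below it
                return
        keep, below = [], []                # split the brothers, keeping order
        for son in parent.sons:
            if compare(new.pattern, son.pattern) == SECOND:
                below.append(son)           # more specific: becomes a son of new
            else:
                keep.append(son)            # incomparable: stays a brother
        new.sons = below
        parent.sons = keep + [new]          # new is the youngest brother
        # (holding the sons as a list makes the eldest-son special
        # case of the text automatic)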

When the name-pair of a new transform is found, the tree of cases (ignoring the root node) is generated. For each

node, an if-then-else is fabricated, the else being followed immediately by the if of the following brother.

The last else is followed by a begin-end block embodying

the fabrication of the right-hand-side of the parent case.

The condition following the if is the test to be satisfied by an instance of the pattern of the case in addition to those already satisfied by the parent case. This achieves considerable optimisation, but demands either re-discovery of the differences of specificness with the father, or the preservation of the original determination of the differences.

The machine on which the software was implemented was generous with memory, so preservation was selected. Examination of the code reveals that optimisation of the tests between brothers is often also desirable: given syntax rules:

    V = W, (X!Y), Z
    W = < "w" * >
    Z = < "z" * >

then the two matching patterns for a transform of V

    V ! 'wwww' X 'zzzz'
    V ! 'wwww' Y 'zzzz'

are incomparable, and do not at present share their bulky and slow tests for either 'wwww' or 'zzzz'. In practice, examples of this phenomenon seem quite common, but very varied.

The topmost level of brothers demands slightly different treatment in that the parent does not have a right hand side: in this case an error message is fabricated, saying that the transform has been called for an instance not covered by any case.

There is also one even more special possibility at the top

level: that there is only one case below the root, with pattern of form X(Y!Y). In this case, the condition to be satisfied is true, so the if true then ... else error message is suppressed, leaving only the begin ... end body

of the then, starting with the if-then-else groups of the

immediate sons.

Around the top-level fabrication of each transform the heading and footing of a Simula procedure is output, with parameter a pointer to the tree to be transformed. The name of the procedure includes the first name of the transform. Simula has a limited overloading facility: this is used to resolve the ambiguity which results if several transforms have the same first name, without the fabricated procedure-names including both names of the transform.

The fabrication of the right-hand sides is straightforward, except for the created patterns. For a call on a transform, the first name of the pair gives the name of the procedure called: the difficulty lies in the parameter representing the created pattern. This parameter must be a tree of the pattern, according to the record-format of the parse tree created by parsing using the parser fabricated by the lexic-analyser fabricator and syntax-analyser fabricator for the defined language. This is achieved by having a function for each type of record possible in the parse tree (that is, for every lexic and syntactic rule) which takes as parameters the components to go in the records of a node which it builds. These node-faking functions each return as result a pointer to the node faked. Whenever the transform generator outputs a call on such a function, it outputs also a source pre-processor

**CALL to the function. The functions themselves depend only on the original syntactic or lexic rule: they can therefore be fabricated by a separate and much simpler program, the oddments fabricator, as a source library. This has the secondary advantage that if a particular faker is not needed by a set of transforms, it is not **CALLed, and so not included.
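A node-faking function is little more than a constructor call; for a rule such as ADDING_OPERATOR = ( "+" ! "-" ) the fabricated oddment might amount to the following (Python rendering, names invented):

    # Faker for ADDING_OPERATOR = ( "+" ! "-" ): builds the same
    # record that a parse of the construct would have produced.
    class AddingOperatorNode:
        def __init__(self, which):
            self.which = which                  # 1 for "+", 2 for "-"

    def fake_ADDING_OPERATOR(which):
        return AddingOperatorNode(which)        # pointer to the faked node

    # e.g. fake_ADDING_OPERATOR(2) fakes the node for a "-" instance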

The remaining difficulty is in fabricating the parameters giving the values of the components for the nodes to be faked.

For the meaningful components in any particular call, there is no problem. However the branches of a choice which are not instanced, or the body of an option not to be instanced, require layout information not passed from the interpretive parser.

This can also be discovered from the syntax without the transforms, so the oddments fabricator produces them in an organised fashion convenient in the transform fabricator. For each syntactic rule it gives the name of the rule, then the number of options and/or choices in the rule; then left-to-right across the rule in order of the starts of the options and choices. For each it gives the identifying number. On the line after the identifying number of an option, it gives the faked fabrication for the components of an omission of that option. On the lines after the identifying number of a choice, one line per branch of the

choice, it gives in left-to-right order the faked fabrication

for the corresponding branch of the choice, if it is not

the chosen branch.

With this, the problem of the transform-fabricator became

routine, if immensely tedious.

5.6 REVIEW OF THE PRIMARY-CODE GENERATOR FABRICATOR

The present chapter makes light of the task of writing the programs: this took about 2200 hours (i.e. over a man-year)

spread over an elapsed period of 2½ years. The total source

is about 3500 lines: more than an order of magnitude slower

than I have ever worked before. Over 10,000 other lines have been written and discarded, either being superseded by bootstrapping (approx. 2,000 lines) or replaced by simpler, faster, or more compact methods. The product is still big and slow, but more than suffices to meet the objective of creating tools that cut the time to write pilot compilers.

In this sense, the tool is highly successful: re-implementation of a compiler I had previously written, and which had taken

3 man-months, took a week part-time. Inevitably, this used the previous experience, but nonetheless such a gain in human efficiency would repay the investment in only half a dozen such minor compilers.

This first system is far from perfect: for example, most of the programs are far too complex, embodying too much functionality. Many more smaller programs would have been simpler to write: the difficulty would have been transferred

to defining the interfaces between them. In a pilot system,

the requirements on these interfaces would have been most

unclear, implying many errors, and much waste of time replacing

dependent programs. Given the present pilot system, study of

the interfaces inherent within it will facilitate the separation.

If the second version is on a UNIX-like system, the pipe facility

ensures that the inter-program transfers do not cause a loss of

efficiency: alternatively, in a language such as Simula,

co-routining could be used to combine programs that logically

are distinct into what the operating system considers a single

program. Finally: for such a re-implementation, there is now

an ideal tool: the system itself. As discussed in chapter 8,

further utilities to improve CFl-output programs are facilitated

by the CF1 itself.

In the stereotype early parts of orthodox compilers, the tool

is proving a success. In the non-stereotype parts, it is hard to see what could be done, except to continue to write

code by hand and wait for stereotypes to emerge. The direction

then to be taken depends very much on the nature of the

observation.

CHAPTER 6

FABRICATING A SECONDARY CODE-GENERATOR

6.1 INTRODUCTION

Section 2.6.2 developed an external overview for a generalised

method of secondary code-generation. This chapter proposes a

detailed method that conforms to the overview, and an implementation of the key part of the method. The proposal is for

a program that reads a description of a machine code, written

in a machine description language (MDL), and also reads the

description of an IL, given in the form of an MDL description

of the corresponding "IL machine". The method could be used

to produce a secondary code-generator. For experimental

purposes the partial implementation, however, gains some

flexibility by reading the descriptions and then configuring

itself into a secondary code generator: that is, it is an

interpretive secondary code-generator.

To facilitate the explanation of the method, one example has

been chosen that appears to demonstrate most of the important

considerations. This example is presented in section 6.2: it

consists of an ALG0L-60 statement, the stack organisation

chosen for the implementation, descriptions of two rival

"IL machines", and a description of the appropriate instruc-

tions of the Digital Equipment Corporation PDP-8 computer

.(slightly tidied up). ,162

Section 6.3 describes the Machine Description Language (MDL) used in the experiments so far, and rephrases the descriptions of the "IL machines" and the PDP-8 in MDL. Section 6.4 re-interprets the MDL as trees, and hence also both instruction-sequences and individual instructions of both the "IL machines" and the PDP-8. Section 6.7 gives a method for tree-matching from root to leaves and back, to discover best possible machine code instructions of the same algebraic effect as the given IL sequences: this includes register allocation, and an optimisation of the method. The method does not place the instructions in order: techniques for this are given in 6.8.

Sections 6.5 and 6.6 provide concepts that prepare for 6.7 and 6.8. Section 6.5 shows the relationship between the production of the trees and symbolic execution: the instruction-choice method could correspond to an inverse of symbolic execution, in the form of a parse using an ambiguous grammar. This produces some useful insights. Section 6.6 uses criteria arising from 6.4 and 6.5 to eliminate one of the rival styles of IL.

An interpretive implementation of the key parts of the method (subsections 6.7.1 to 6.7.3) is described in chapter 7. This pilot implementation will be used to gain practical experience on which to base a later attempt at a fabricator of optimised secondary code-generators.

6.2 THE EXAMPLE

Subsection 6.2.1 describes a hypothetical implementor's choice of how he wants to organise the "memory architecture" of the ALGOL-60-like language he is implementing. From the point of view of secondary code-generation, this architecture is visible only in the IL definition (1). In the translated target-machine instructions it remains only in its influence on the instructions chosen.

Subsection 6.2.2 presents two rival styles of IL in sufficient

detail for the example, and subsection 6.2.3 presents PDP-8

instructions sufficient for a translation of the example.

6.2.1 The example instruction and its IL memory architecture

The example is the addition of the contents of two variables,

and the assignment of the result to a third variable:

I := J + K ;

The variables are assumed to be implemented in a framed stack, as follows. J is a static global variable (i.e. has a fixed address j). I is in the current "frame" of the stack (2). The base of that frame is pointed to by the contents of a pointer in a static location named F, and I is at a constant offset i from the base of the frame. The address of F is f. K is in the currentmost-but-one frame, with offset k from the base of that frame. The base location of each frame (except the outermost frame) contains a pointer to the base of the next-outermost frame. (See figure 6-1 on the next page.)

(1) The fiction of a memory architecture for the language is much more important in the primary code-generation than in the secondary code-generation.
(2) Often an implementor will choose that a frame of locations corresponds to the variables of a block, or of a procedure invocation.

Figure 6-1
The memory organisation for the example I:=J+K;

[Diagram: the static (global) locations, including J (at address j) and F (at address f), and the stack frame locations: the current frame containing I at offset i, the currentmost-but-one frame containing K at offset k, and the currentmost-but-two frame; each frame base holds the address of the next-outermost frame base.]

The address calculations of the locations are therefore as follows.

J and F are at addresses j and f, respectively. i plus the contents of F is the address of I. k plus the contents of the location whose address is the contents of F is the address of K.

6.2.2 The example ILs.

A fundamental choice to be made in designing an IL is how each intermediate result is transferred from its originating instruction to its using instruction(s). This can be either explicitly, in a named register, or implicitly, in an anonymous location on a stack of such temporary intermediate results (3). To study the relative merits of these two styles of IL, the following two subsections propose a "Temporary-stack IL" and a "Register IL". Section 6.6 compares them, from the points of view of primary code generation, secondary code generation, and optimisation (that is, whether the choice of style of IL circumscribes the machine code sequences eventually produced).

6.2.2.1 A temporary-stack IL

The "temporary-stack IL machine" has at least the instructions given below. Each instruction is written as a mnemonic followed by an optional integer address or integer address-offset. The "machine" has a sequentially addressed memory, and knows about the framed stack only in that F (or f) can be an implicit operand of an instruction.

(3) To prevent confusion between the framed stack of 6.2.1 and this stack of temporary intermediate results, they will be called the "framed stack" and the "temporary stack".

It also has a stack of intermediate temporary results. Taking a value from a location at the top of this temporary stack also deletes the location from the stack. Putting a result onto the temporary stack reveals a new location to contain the result.

Instructions

LG m  (Load global)    The contents of the global static location of
                       address m are put onto the stack. (I.e. the
                       value ~m, where ~ denotes "contents of".)

LF m  (Load framed)    The contents of the location of address m+F
                       are put onto the stack. (I.e. the value
                       ~(m+~f), the contents of the m-th location of
                       the current frame.)

LI m  (Load indirect)  The contents of the top location of the stack
                       are added to m, and the result used as the
                       address of a location whose contents are put
                       on the top of the stack. (I.e. the value
                       ~(m+~§), where § denotes "the address of the
                       top location of the temporary stack".)

ADD                    The contents of the top two locations of the
                       stack are added, and the result put back onto
                       the stack. (I.e. the result put back is
                       (~§)+(~§), where the §'s denote two different
                       addresses. The result arrives in the location
                       of the same address as the first §.)

SF m  (Store framed)   The contents of the top location of the
                       temporary stack are stored into the location
                       of address m+F.

The example instruction I:=J+K; of section 6.2.1 can be translated into this version of IL as the following sequences:

    LG j                                  LF 0
    LF 0                                  LI k
    LI k    or, using the                 LG j
    ADD     commutativity of ADD, as:     ADD
    SF i                                  SF i

The major advantage of this kind of IL is that the primary code generator needs no complications to administer it, in the sense that if the primary code generator is correct then each produced result will be used exactly once, and the primary code generator does not need to keep track of the "stack pointer". The apparent disadvantage is that the secondary code generator may have to do some work to discover the implicit administrative information.
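To make these definitions concrete, the following is a minimal sketch of an interpreter for the five temporary-stack IL instructions. It is my illustration, not part of the thesis's implementation, and is in modern Python rather than the SIMULA used for the outlines later in this chapter; the names (run, mem) and the example addresses are assumptions.

    def run(program, mem, f):
        # Interpret a temporary-stack IL sequence.
        # mem maps address -> value; f is the address of the frame pointer F.
        temp = []                                # the temporary stack
        for op, *args in program:
            m = args[0] if args else None
            if op == "LG":                       # push ~m
                temp.append(mem[m])
            elif op == "LF":                     # push ~(m+~f)
                temp.append(mem[m + mem[f]])
            elif op == "LI":                     # push ~(m+~s), deleting the top cell
                temp.append(mem[m + temp.pop()])
            elif op == "ADD":                    # pop two values, push their sum
                temp.append(temp.pop() + temp.pop())
            elif op == "SF":                     # pop a value, store at m+~f
                mem[m + mem[f]] = temp.pop()
        assert not temp, "a correct sequence leaves the temporary stack empty"

    # The layout of 6.2.1, with arbitrarily chosen addresses: J global at j;
    # I at offset i in the current frame; K at offset k in the but-one frame;
    # each frame base holds the address of the next-outermost frame base.
    j, f, i, k = 100, 101, 2, 3
    mem = {j: 7, f: 200, 200: 300, 300 + k: 5}
    run([("LG", j), ("LF", 0), ("LI", k), ("ADD",), ("SF", i)], mem, f)
    assert mem[200 + i] == 7 + 5                 # I = J + K

Running the commuted sequence against the same initial memory gives the same final memory, which is the sense in which the two sequences are rivals.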

6.2.2.2 An infinitely-registered IL

Section 6.6.1 shows that for an IL which is register-based, there are no advantages in limiting the number of registers. This example IL is therefore designed on the presumption that whenever the primary code-generator needs a "new" register, it can allocate one, without restriction.

The "register IL machine" has at least the instructions given below. Each instruction is written as a mnemonic, followed by the name of a register, followed by either another name of a register, or by an integer address, or by an integer address-offset. The "machine" has a sequentially addressed memory, and knows about the framed stack only in that F (or f) can be an implicit operand of an instruction. It also has an unlimited number of registers, whose names are denoted by capital letters late in the alphabet. The letters R and S are used for generic names of registers.

Instructions

LG R,m   (Load global)    The contents of the location of address m
                          are put into the register R.

LF R,m   (Load framed)    The contents of the location of address m+F
                          are put into the register R.

LI R,m   (Load indirect)  The contents of the location of address m+R
                          are put into the register R.

ADD R,S  (Add)            The contents of R and S are added: the
                          result is put into register R.

SF R,m   (Store framed)   The contents of register R are stored into
                          the location of address m+F.

The example instruction I:=J+K; of section 6.2.1 would, at first glance, appear to translate into this version of IL as the following sequences:

    LG R,j                                  LF R,0
    LF S,0    or, using the                 LI R,k
    LI S,k    commutativity of ADD, as:     LG S,j
    ADD R,S                                 ADD R,S
    SF R,i                                  SF R,i

There is, however, a problem with this style of IL. Either the definition of each IL instruction has built into it that the register(s) are to be allocated early in the instruction, or deallocated late in the instruction, or both; or alternatively an extra IL instruction needs to be defined to deallocate a register. (Allocation is simpler: the secondary code generator could automatically allocate a register encountered when it is not currently allocated.) The explicit de-allocation has the advantage that a result can then be used repeatedly (because, for example, it contains the result of a common sub-expression). In either case, both the primary and secondary code generators must keep track of what registers are currently allocated in the IL stream.

On the convention that every register must be allocated before use with an IL instruction of the form:

    A R

(allocate R), and that every register must be deallocated after use with an IL instruction of the form:

    D R

(deallocate R), the example I:=J+K; will be translated into the following IL sequences:

    A R                                     A R
    LG R,j                                  LF R,0
    A S       or, using the                 LI R,k
    LF S,0    commutativity of ADD, as:     A S
    LI S,k                                  LG S,j
    ADD R,S                                 ADD R,S
    D S                                     D S
    SF R,i                                  SF R,i
    D R                                     D R

The advantage of this style of IL is that all the information the secondary code-generator needs is there, explicitly.
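Because the information is explicit, simple checks become possible. The following is a minimal sketch of such a checker, in Python (my illustration, under the assumption that register names are single upper-case letters): it verifies the A/D discipline just described, that every register is allocated before use and deallocated exactly once.

    def check_allocation(sequence):
        live = set()                            # currently allocated registers
        for instr in sequence:
            op = instr[0]
            regs = [a for a in instr[1:] if isinstance(a, str) and a.isupper()]
            if op == "A":
                assert regs[0] not in live, "double allocation"
                live.add(regs[0])
            elif op == "D":
                assert regs[0] in live, "deallocating a dead register"
                live.remove(regs[0])
            else:
                assert all(r in live for r in regs), f"unallocated register in {instr}"
        assert not live, "registers left allocated at the end"

    check_allocation([("A", "R"), ("LG", "R", "j"), ("A", "S"),
                      ("LF", "S", "0"), ("LI", "S", "k"), ("ADD", "R", "S"),
                      ("D", "S"), ("SF", "R", "i"), ("D", "R")])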

6.2.3. The PDP-8.

The following is a slightly modified description of three instructions of the PDP-8 computer (4). The PDP-8 has a sequentially addressed main memory. It also has an accumulator, which is not part of the sequential memory. Each of the three instructions acts on the implicitly-addressed accumulator, and on an explicitly-addressed location of the main memory. Each instruction comes in two forms, direct and indirect. Each instruction has a three-letter mnemonic to which the letter D is appended for the direct form, and to which the letter I is appended for the indirect form.

A direct-form instruction includes the integer address of the main memory location to be accessed. An indirect-form instruction includes the integer address of a main memory location which, in turn, contains the address of the main memory location to be accessed.

The description below is of a PDP-8 as seen using a "clever" assembler. LOAD instructions usually follow STORE instructions: the real PDP-8 STORE instruction also zeros the accumulator, so the assembler can implement LOAD as an ADD instruction if the preceding instruction was a STORE. The real PDP-8 does not have a LOAD instruction (5). It is assumed in this thesis that the output of secondary code-generators may need to be assembled in such ways if it much simplifies the description of the apparent target machine.

(4) The described LOAD instruction would only be used in contexts where it could be implemented as the TAD of the actual PDP-8. Similarly, STORE would only be implemented as the DCA instruction of the actual PDP-8.
(5) Thus, the following description is in fact of two of the real eight instructions of the PDP-8. The "clever assembler" assumption has also been used here to hide complications in the PDP-8 memory architecture, due to its short wordlength (12 bits).

Instructions

LOADD m   Load to the accumulator the contents of the location of
          which the address is m.
LOADI m   Load to the accumulator the contents of the location of
          which the address is in the location of address m.
STORED m  Store from the accumulator to the location of which the
          address is m.
STOREI m  Store from the accumulator to the location of which the
          address is the contents of the location of address m.
ADDD m    Add to the contents of the accumulator the contents of the
          location of which the address is m.
ADDI m    Add to the contents of the accumulator the contents of the
          location of which the address is the contents of the
          location of address m.

The example I:=J+K; can be translated to the above three instructions in two distinctly different ways: each demands locations to contain temporary results (of addresses t and u):

    LOADD  =k        LOADD  f
    ADDI   f         ADDD   =i
    STORED t         STORED u
    LOADD  f         LOADD  =k
    ADDD   =i        ADDI   f
    STORED u         STORED t
    LOADD  j         LOADD  j
    ADDI   t         ADDI   t
    STOREI u         STOREI u

In each sequence, t contains the address of K, and u contains the address of I. The PDP-8 notation "=x" means "the address of a location containing x": that is, "=" turns its operand into a literal. Note that because of the commutativity of addition, any PDP-8 LOADD/ADDD sequence can have its operands reversed. Similar reversals apply even if one of the LOAD or the ADD is indirect - in this case the indirection is moved with the operand:

    LOADD x                         LOADI y
    ADDI  y    is equivalent to     ADDD  x

With these reversals there are 16 sequences of nine PDP-8 instructions for I:=J+K;. These are in fact the shortest possible sequences, even if other PDP-8 instructions than LOAD, STORE, and ADD are permitted.
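As a check on the claimed translations, here is a minimal sketch of a simulator for the six tidied instructions. It is a modern Python illustration of mine, not the thesis's tooling: "=x" literals are mapped to fresh pool addresses (an assumption about how an assembler might place them), and one of the nine-instruction sequences is run against the memory layout of 6.2.1.

    def simulate(program, mem):
        pool = {}                                 # literal value -> pool address
        def addr(a):
            if isinstance(a, str) and a.startswith("="):
                v = int(a[1:])
                p = pool.setdefault(v, 1000 + len(pool))
                mem[p] = v                        # "=x": a location containing x
                return p
            return a
        ac = 0                                    # the accumulator
        for op, a in program:
            m = addr(a)
            if op == "LOADD":    ac = mem[m]
            elif op == "LOADI":  ac = mem[mem[m]]
            elif op == "STORED": mem[m] = ac
            elif op == "STOREI": mem[mem[m]] = ac
            elif op == "ADDD":   ac = ac + mem[m]
            elif op == "ADDI":   ac = ac + mem[mem[m]]

    j, f, i, k, t, u = 100, 101, 2, 3, 110, 111   # arbitrary addresses
    mem = {j: 7, f: 200, 200: 300, 300 + k: 5}
    simulate([("LOADD", "=" + str(k)), ("ADDI", f), ("STORED", t),
              ("LOADD", f), ("ADDD", "=" + str(i)), ("STORED", u),
              ("LOADD", j), ("ADDI", t), ("STOREI", u)], mem)
    assert mem[200 + i] == 7 + 5                  # I = J + K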

6.3 MACHINE DESCRIPTION LANGUAGE (MDL)

The following describes the machine description language used in the research. It is fairly arbitrary, but fairly obvious until one starts imposing sophisticated requirements on it (such as indicating the length of values that operands represent - see the last page of section 3.2).

A single MDL instruction is a sequence of operands, monadic operators, and dyadic operators. An operand is an integer number (or a bit pattern): it may be interpreted as itself or as the address of a location. In machine models which have locations outside the normal addressable main memory, the special locations need special corresponding non-integer operands as fictional addresses: such special operands will be introduced as needed for the stack-IL, register-IL, and PDP-8.

For the I:=J+K; example we need two dyadic operators and one monadic operator.

Operator  Adicity   Left Operand    Right Operand    Result

~         Monadic   (None)          An address       The contents of the
                                                     location of that
                                                     address.
+         Dyadic    A value         A value          The sum of the
                                                     operands.
->        Dyadic    A value         An address to    Void.
                                    which the value
                                    is to be
                                    assigned.

-> is called the assignment operator. ~ is called the indirection operator. + is called the addition operator. For examples in the text of this thesis, to avoid excessive use of parentheses, ~ will be of highest priority, then +, and finally -> will be of lowest priority. (In the implementation, all dyadic expressions must be fully parenthesised.)

An MDL instruction is an MDL expression with a void result. Thus the principal operator of an MDL instruction must be the assignment operator: and the assignment operator may have no other use.

The distinction between a value operand and an address operand is merely how it is used. That is, a primitive operand (such as an integer), the result of an indirection, or the result of an addition may each be used as a value or as an address. The only exception is that the "fictional addresses" may not be used as an operand of an addition or of an indirection, or as the left operand of an assignment.
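As a concrete rendering of these rules, the following minimal sketch (in Python, my own; the grammar shown in the comment is an assumption about the fully-parenthesised implementation form, not a quotation of it) parses an MDL instruction into a tree. The example instruction is the MDL that 6.3.2 gives for the SF instruction.

    import re

    def tokenize(s):
        # "->" first, then the single-character operators, then operands
        return re.findall(r"->|[~+()]|\w+|§", s)

    def parse(tokens):
        # expr ::= operand | "~" expr | "(" expr ("+" | "->") expr ")"
        t = tokens.pop(0)
        if t == "~":                          # monadic indirection
            return ("~", parse(tokens))
        if t == "(":                          # parenthesised dyadic expression
            left = parse(tokens)
            op = tokens.pop(0)                # "+" or "->"
            right = parse(tokens)
            assert tokens.pop(0) == ")"
            return (op, left, right)
        return ("operand", t)                 # an integer or a fictional address

    tree = parse(tokenize("(~§ -> (m + ~f))"))
    assert tree == ("->", ("~", ("operand", "§")),
                          ("+", ("operand", "m"), ("~", ("operand", "f"))))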

6.3.1 MDL summary of the addresses in the I:=J+K; instruction

Using the implementation of the framed stack presented in section

6.2.1, the addresses of F, I, J, K are calculated as follows.

    Location    Address

    F           f
    I           i+~f
    J           j
    K           k+~~f

6.3.2 MDL definition of the Temporary-stack IL

Using the sign "§" as a fictional address for "the top of the temporary stack", the instructions of the temporary-stack IL machine can be defined as below. We need the convention that whenever "§" is assigned to, it is first altered to point to a "new" cell; and on any other use, after the value in the cell to which it points has been used and the cell deleted, "§" is reset to point to the "newest" of the remaining cells. This convention could be built explicitly into the MDL definition of each IL instruction, for example by defining LG m with the pair of MDL instructions:

    §+1 -> (the address of §)
    ~m  -> §

Section 6.6 shows that this explicitness (and the meta-fiction "the address of §") is not needed, so the definition below uses the implicit convention.

The MDL definition of the instructions

    LG m     ~m -> §
    LF m     ~(m+~f) -> §
    LI m     ~(m+~§) -> §
    ADD      (~§)+(~§) -> §
    SF m     ~§ -> (m+~f)

6.3.3. MDL definition of the register-IL

Recall that each register has as name an upper-case letter. Using the corresponding lower-case letter as a fictional address for the register, the instructions of the register-IL machine can be defined as below.

The MDL definition of the instructions

    LG R,m      ~m -> r
    LF R,m      ~(m+~f) -> r
    LI R,m      ~(m+~r) -> r
    ADD R,S     ~r+~s -> r
    SF R,m      ~r -> (m+~f)

If the IL is described to include explicit or implicit A (allocate register) and D (deallocate register) instructions then this requires extensions to the MDL.

6.3.4 MDL definition of the PDP-8

There are two styles of definition for the three instructions and two addressing modes of the PDP-8 as presented in section 6.2.3:

Exhaustive:   to treat an instruction as indivisible, and so to
              define 6 distinct instructions.

Factorised:   to identify "common factors" across the instructions:
              that is, three "functions" (LOAD, STORE, ADD) and two
              "address modes" (D and I) for the PDP-8.

A definition in each of the styles is given below. In both styles

a fake address "ac" is used for the accumulator. In the factorised

style a fake address "ma" is used for the "memory address register"

whose contents are passed from the "address mode" micro-instruction

to the "function" micro-instruction.

Exhaustive MDL definition of the PDP-8 instructions

    LOADD m     ~m -> ac
    LOADI m     ~~m -> ac
    STORED m    ~ac -> m
    STOREI m    ~ac -> ~m
    ADDD m      ~m + ~ac -> ac
    ADDI m      ~~m + ~ac -> ac

Factorised MDL definition of the PDP-8 instructions

    D m         m -> ma            )  Address-mode
    I m         ~m -> ma           )  micro-instructions

    LOAD        ~~ma -> ac         )
    STORE       ~ac -> ~ma         )  Function micro-instructions
    ADD         ~~ma + ~ac -> ac   )

The obvious disadvantage of the factorised definition is that instead

of the instruction (eg):

STOREI m

we will need the micro-instruction sequence:

    I m
    STORE

The advantage becomes more obvious for a larger instruction set: for example the PDP-10 model KI has 396 functions and 4 common addressing modes: so the exhaustive description would need 1584 "full instructions", but the factorised description would need only 400 micro-instructions (6). Clearly the factorised definition is

simpler by virtue of being much shorter, and in addition makes the

regularity of the instruction set much more obvious.

(6) For the KI-10 the situation is in fact both better and worse than this suggests. BETTER: in the same way that full instructions can be factorised into function and memory address calculations, because of other regularities in the set of KI-10 functions, they can be factorised further into sequences of up to 4 distinct sub-function micro-instructions. One sequence 4 long is: move from right half-word ... to left half-word ... zeroing the other half-word ... from register to memory. There are 142 sub-function microcodes (obviously not all sequences of one, two, three, and four sub-functions are allowed). From 400 micro-instructions to 146 reduces the size of the definition by nearly two thirds. WORSE: the common address modes have a natural two-by-two factorisation: index or not? ... then indirect or not? However, when indirection is selected it is possible to iterate on these two micro-instructions to get the full range of possible memory address calculations. This range would be infinite were it not for limitations on machine memory size, and hence on the number of locations via which indirection can take place (the hardware detects any cyclic indirection, and invokes the operating system to abort the program). The memory address calculation stops when an iteration does not indirect. For example, if four registers were set up containing the indices of an element in a four-dimensional array, then a sequence of the following form will locate the element (without bounds-checks): index on the first register ... then indirect ... then index on the second register ... then indirect ... then index on the third register ... then indirect ... then index on the fourth register ... then don't indirect - provided the intermediate locations of the indirection are appropriately set up. This cannot be exhaustively described, but is still only two-by-two in the factorised form. The factorised form, however, now has to be self-referential.

There are also practical advantages: the description is easier to write and input correctly. We shall also see that it will lead to much faster code-generation without needing to resort to complex indexing schemes for the instructions - it is in a sense a natural two-dimensional indexing scheme.
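The "two-dimensional" nature can be seen mechanically: the exhaustive definitions are recoverable from the factorised ones by grafting at the ma leaf, which on MDL text is just a substitution for ~ma. A minimal Python sketch of mine (the factorised MDL strings are those given in 6.3.4):

    address_modes = {"D": "m",                 # D m:  m -> ma
                     "I": "~m"}                # I m: ~m -> ma
    functions = {"LOAD":  "~~ma -> ac",
                 "STORE": "~ac -> ~ma",
                 "ADD":   "~~ma + ~ac -> ac"}

    for fn, mdl in functions.items():
        for mode, address in address_modes.items():
            # grafting the address-mode tree at the ma leaf is, textually,
            # the substitution of the address expression for ~ma
            print(f"{fn}{mode} m    {mdl.replace('~ma', address)}")
    # prints the six exhaustive definitions, e.g.
    #   LOADD m    ~m -> ac
    #   STOREI m   ~ac -> ~m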

6.4 INSTRUCTIONS AND INSTRUCTION-SEQUENCES AS TREES

The IL-to-machine code translation methods described in section 6.7 were ultimately inspired by the observation that the symbolic evaluation of the MDL of two instruction-sequences yielded the identical MDL instruction. Thus, given an IL sequence, it could be translated to machine code by symbolic evaluation of the MDL of the sequence, then by finding a machine code sequence with the same symbolic evaluation of its MDL. As MDL text the evaluation process is tedious and machine-inefficient, especially the "finding" stage. The same methods on the parse-tree of the MDL are easier to describe, and both easier and more efficient to implement. On trying to optimise the method, the tree form of the method is intuitively clear, whereas the textual form becomes much less clear.

In this "parsing" approach, it turns out that the assignment operator has a double role. The first is as a normal operator: in that context it is the root of the tree in which it appears. The second applies only to the MDL of a micro-instruction of the definition: interpreting the assignment operator as the meta-syntactic "is defined as" mark, and the definition of the micro-instruction as a syntactic production of the grammar defining the parse-tree of the symbolic evaluation. This second role cannot apply if the right operand of the assignment operator is other than a simple address. An example of both roles is the PDP-8 instruction STORED, with non-factorised MDL ~ac -> m: the two parse trees representing this are:

    ac — ~ ———\
               ->          and     ac — ~ —— -> ——(m)
    m ————————/

    "Normal role"                  "Meta-syntactic role"

In the "meta-syntactic role" tree the bracketing of the "m" indicates that at this point the tree can be grafted to the "m" leaf of another tree: see the example below. An example of a PDP-8 instruction with a non-simple address is STOREI, with non-factorised MDL ~ac -> ~m: the one parse tree representing this is:

    ac — ~ ———\
               ->
    m — ~ ————/

    "Normal role"

An example of a grafting is the tree equivalent to the sequence:

    Load something to the PDP-8 accumulator
    STORED m
    Load something else to the accumulator
    STOREI m

where m is the address of a "temporary" memory location. (The effect of the sequence is to store the "something else" value to the "something" address.) The grafting is of the "meta-syntactic role" tree of the STORED to the m-leaf of the tree of the STOREI:

    ... ———————————————————\
                            ->
    ... —— -> ——(m)————————/

In this tree the bracketed names embedded in a branch indicate that the branch represents an intermediate result of the sequence in that address. The ellipses denote the "somethings" loaded to the accumulator.

The interpretation of a branch as a temporary result gives the branch a natural orientation: the direction in which the data flows. This "data flow" interpretation is very natural and intuitive (although it does not seem to have been used for parse trees or expression evaluation trees before). All trees in this thesis have their root on the right, so that one reads the data-flow left-to-right (i.e. according to the conditioning imposed by normal English text). The "address" operand of "normal role" assignment operators has arbitrarily been made the lower branch into the node of the operator.

Note that not all trees "make sense": for example, the grafted tree of

    STORED m
    STORED m

(where the two m-addresses are the same) reflects two STORED instructions from the accumulator to a temporary result that is then never used. That one can graft silly trees is a consequence of being able to write silly instruction-sequences: assuming no primary code-generator will output a silly IL sequence, silly tree-grafts are not a problem in practice.

The following four sub-sections introduce the varieties of tree of the example ILs, and in the process distinguish two kinds of tree for instruction sequences. The final two sub-sections in this section introduce the trees for the example PDP-8 instructions and

sequences.

6.4.1 The gross tree for an IL sequence.

The dataflow analysis of an instruction sequence imposes a tree structure. Multiple use of an intermediate value is reflected as an inter-graft (also called a union). To introduce a cycle into the data-flow graph requires a loop in the instructions.

In section 6.4.4 a second tree for an instruction sequence is introduced, more detailed than the data-flow tree, and with at least the same topology. To make the distinction between these related trees, the data-flow tree will be called the GROSS TREE of the sequence, the more detailed one will be called the FINE

TREE of the sequence.

For the example instruction I:=J+K; the stack-IL sequences and their corresponding gross trees are shown below.

    LG j                      LF 0
    LF 0                      LI k
    LI k                      LG j
    ADD                       ADD
    SF i                      SF i

    LG j ————————————\
                      ADD —— SF i
    LF 0 —— LI k ————/

    LF 0 —— LI k ————\
                      ADD —— SF i
    LG j ————————————/

The two different instruction orderings look less related than their trees, which show more clearly that the difference is only a reversal of the operands of the ADD.

Very often the gross tree of an IL instruction sequence will be very similar to the checked, decorated, and edited parse-tree created during primary code-generation. In this sense the IL can be thought of as just a linear tree-notation which facilitates the communication between the primary and secondary code generators. Intermediate IL-to-IL optimisations may change the IL, so that the gross trees of the IL before and after optimisation differ: such an optimisation can be thought of as a tree transformer.

Long instruction sequences may contain several sub-sequences,

each sub-sequence producing and consuming its own results and

not dependent on the intermediates of the other sequences. Such

a sequence generates not a single tree, but a sequence of trees.

Less obvious data dependencies may apply within this sequence

of trees: for example in the IL corresponding to an ALGOL 60

sequence of form:-

    X(I) := ... ;
    ... ;
    ... X(J) ... ;

In such a sequence, whether the X(J) depends on the X(I) in turn

depends on whether J may have the same value as the I (at the

appropriate times). If I and J can be shown un-equal, then there

is no dependency. If they can be shown equal, then there is

dependency. If nothing can be shown (7) then the worst assump-

tion must be made, i.e. that there is dependency. In the

(7) In the general case it is not possible to make such proofs. Even a "clever" compiler may be unable to prove cases that seem trivial to a human.

absence of dependency, the original sequencing can be ignored (8).

The gross tree of a register-IL sequence will also include nodes

for "A" (allocate) and "D" (deallocate) in a manner corresponding

to their use in the IL sequence. Figure 6-2 (on the next page)

repeats one of the register-IL sequences for I:=J+K; and presents

a matching gross tree. Because a "deallocate" instruction has no

result it must be a root: the sequence needs two deallocates and

so has two roots. This is an organisation difficult to manipulate

by computer. A less problematic organisation is alternatively to allow "deallocate" nodes to have extra "in" branches matched by identical "out" branches: the central portion of figure 6-2 would then be replaced by the following.

    ADD R,S ——(r)———\
            ——(s)————D S ——(r)—— ...

This avoids the difficulty of multiple roots, but creates the difficulty that "deallocate" nodes can be of two types (like "assign" nodes), and leaves us with the "ADD" node still having two "out" branches.

(8) This sequencing could be represented in the gross tree/dataflow tree form by introducing a second kind of tree-node, a "then" node, whose operands are the trees of the sub-sequences:

    first subtree ———\
                      then ——\
    second subtree ——/        then ——\
    third subtree ———————————/        then
    last subtree ————————————————————/

In such trees the implications of non-dependencies between specific subtrees is more complex: this is discussed in section 6.8. One cannot simply introduce a commutative "either-order" tree-node.

Figure 6-2

A register-IL sequence for I:=J+K; and its gross tree

    A R
    LG R,j
    A S
    LF S,0
    LI S,k
    ADD R,S
    D S
    SF R,i
    D R

[Diagram: the matching gross tree, with two roots at the D S and D R nodes.]

As remarked in 6.2.2.2, we can avoid the "allocate" instructions, but we cannot easily avoid the "deallocate" instructions. Consequently we can avoid the (harmless) allocate nodes in the tree, but not the problematic deallocate nodes. Thus the register-IL causes not only the problems in the primary code generator of generating the deallocate IL instructions, but also that the "deallocate" instruction must be "built in" to the secondary code generator because of its special properties (9). This is a crucial fact for the argument of section 6.6.

6.4.2 The trees for temporary-stack IL instructions

They are simply the trees for the data-flow within the MDL definitions of the instructions. For each instruction there can be two possible trees, representing the "normal role" and "meta-syntactic role" of the assignment operator. However, the semantics of the "§" fake address guarantee that, if the address assigned-to is §, then the "normal role" is meaningless. Of the five instructions, the only one of which "§" is not the target address is SF, which has a non-simple target address, with the result that it can have a "normal role" only. Thus LG, LF, LI, ADD (producing intermediate results) have "meta-syntactic role" trees only; SF (not producing an intermediate result) has a "normal role" tree only.

(9) The "assignment operator" also needs building in to the secondary code generator because of its special properties: but we shall see that it has useful properties that facilitate valuable heuristics. The "deallocation operator" does not appear to have useful properties in the IL (though it does in chosen machine instruction-sequences, as an aid to register assignment).

Instruction    "Normal role" tree       "Meta-syntactic role" tree

LG m                                    m —— ~ —— -> ——(§)

LF m                                    m ———————\
                                                  + —— ~ —— -> ——(§)
                                        f —— ~ ——/

LI m                                    m ———————\
                                                  + —— ~ —— -> ——(§)
                                        § —— ~ ——/

ADD                                     § —— ~ ———\
                                                   + —— -> ——(§)
                                        § —— ~ ———/

SF m           § —— ~ ——————————\
                                 ->
               m ———————\       /
                         + —————
               f —— ~ ——/

6.4.3 The trees for register-IL instructions

These instructions present a different problem in drawing their trees: there must be a new kind of branch denoting allocation without use. This is represented in the diagrams as a dotted line. In the LG and LF instructions this is an extra (unused) in-branch to the assignment operator: in the ADD and SF instructions it is an extra (unused) out-branch from the assignment operator (10). As in the temporary-stack IL, LG, LF, LI produce results and so have "meta-syntactic" trees only; but SF has a "normal" role tree but also a result that can be optionally re-used, and so also needs a "meta-syntactic" tree. ADD has a definite result and also an optional result, and so has two trees: their classification as meta-syntactic and doubly meta-syntactic stretches the classification somewhat.

(10) This rather different notation of allocation without use than in 6.4.1 has been chosen so that no false similarity influences the discussion of the "multi-root" versus "multi-branch" problem in 6.6.

Instruction    Normal tree                Meta-syntactic tree

LG R,m                                    m —— ~ ————————\
                                          ...(r).........-> ——(r)

LF R,m                                    m ———————\
                                                    + —— ~ ———\
                                          f —— ~ ——/           -> ——(r)
                                          ...(r)............../

LI R,m                                    m ———————\
                                                    + —— ~ ———\
                                          r —— ~ ——/           -> ——(r)

SF R,m         § see below                (as the normal tree, but with
               r —— ~ ——————————\          the out-branch (r) solid and
               m ———————\        ->...(r)  available for grafting)
                         + ——————/
               f —— ~ ——/

Instruction    Meta-syntactic tree        Doubly-meta-syntactic tree

ADD R,S        r —— ~ ———\                r —— ~ ———\
                          + —— -> ——(r)              + —— -> ——(r)
               s —— ~ ———/        ...(s)  s —— ~ ———/        ——(s)

Finally, there are the A (allocate) and D (deallocate) instructions of the IL machine to be described. If they are needed as trees, then the trees are as below. If allocate is in-built in the instructions that use an allocated register but not its value (i.e. LG and LF), then the dotted lines in their trees are just deleted. The classification of the trees of A and D is even more obscure than for ADD, and has not been attempted here.

    Instruction    Tree        Comment

    A R            ...(r)      Register comes from a dark and murky
                               hole.
    D R            (r)...      Register returns to its hole.

6.4.4 The fine tree for an IL sequence

As stated in 6.4.1, the fine tree for an IL sequence is just the replacement in the gross tree of the instructions by the trees of the instructions, with the in-branches and out-branches of the replacements grafted in.

There is a subtlety in this grafting. All the out-branches carry the address in which the result is put (even if it is a "fake" address like "§"), whereas all the in-branches demand a contents, signified by an indirection operator (e.g. "§ — ~ —" representing an in-branch). In the grafting we must prune out the indirection operator. Using a temporary-stack example:

the indirection operator. Using a temporary-stack example:

    Before grafting:    —— -> ——(§)      § ——(~)——

    After grafting:     —— -> ——(§)——————

Essentially the "before grafting" diagram illustrates a result

put in an address §, and taking an address and indirecting to

get a result. The "after grafting" diagram shows a result

transmitted from one instruction to another, and indicates the

address in which it is transmitted.
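The pruning can be made mechanical. The following toy sketch (in Python; the tuple encoding of trees is my assumption, not the implementation's) grafts the tree of LG j into the §-in-branch of the tree of SF i, pruning the indirection operator as described. A real stack-IL grafter would consume the § occurrences in stack order, one producer per in-branch, rather than matching them all at once.

    def graft(consumer, producer, out_addr):
        # replace an in-branch ("~", ("operand", out_addr)) of the
        # consumer by the producer tree, pruning the "~"
        if consumer == ("~", ("operand", out_addr)):
            return producer
        if consumer[0] == "operand":
            return consumer
        return (consumer[0],) + tuple(graft(c, producer, out_addr)
                                      for c in consumer[1:])

    lg_j = ("~", ("operand", "j"))                  # tree of LG j, result at §
    sf_i = ("->", ("~", ("operand", "§")),          # tree of SF i
                  ("+", ("operand", "i"), ("~", ("operand", "f"))))
    fine = graft(sf_i, lg_j, "§")
    # fine is the parse tree of ~j -> (i+~f): the fine tree of the
    # two-instruction sequence  LG j; SF i  (store J into I)
    assert fine == ("->", ("~", ("operand", "j")),
                          ("+", ("operand", "i"), ("~", ("operand", "f"))))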

The subtlety could be avoided by writing the assignment operator always as "->~": this could be justified by remarking that it emphasises that the result of assignment is to put a value (e.g. ~§) down at an address (e.g. §). One would then graft as follows:

    Before grafting:    —— ->~ ——(~§)      (~§)——

    After grafting:     —— ->~ ——(~§)——————

The problems with this variant are the nuisance of writing more signs, and that it seems initially un-intuitive. In the implementation, typographic restrictions forced a similar solution, but in this chapter we will retain the "assignment to an address" convention.

For a first example of the grafting process we will take one of the stack-IL sequences for the I:=J+K; example. The diagram below illustrates the gross tree as boxes; inside each box is the tree of the instruction, and in-branch indirection operators are parenthesised to indicate that they must be pruned out.

[Diagram: the gross tree of LG j; LF 0; LI k; ADD; SF i, drawn as boxes each containing the MDL tree of its instruction, with the in-branch indirection operators parenthesised.]

Discarding the boxes and the pruned-out indirection operators yields the following fine tree of the MDL of the I:=J+K; example.

    j — ~ ——————————————————\
    0 ———\                   + ————\
          + — ~ —\          /       \
    f — ~/        + — ~ ————         ->
    k ———————————/                  /
    i ———\                         /
          + ————————————————————————
    f — ~/

(This tree suggests that the IL is slightly mis-designed, in that instead of being able to use the "obvious" tree for ~~f, a spurious "0+" has been introduced to give the tree of ~(0+~f). Oddly, this is just what is needed on the real PDP-8: this is one of the reasons that the PDP-8 had a cosmetic treatment in section 6.3.4. The instructions we are trying to generate do not just "cancel out" the defect: it must be properly handled.)

Regarding each MDL operator as defining a production of a grammar such that the fragment of tree around the operator is the parse-tree corresponding to the production, the whole tree is the parse of the MDL instruction (~j) + ~(k+~(0+~f)) -> (i+~f). We return to this in section 6.5.

An important question is: how much does the design of the IL affect the fine tree? Ideally we want the fine tree to be the parse of the MDL sequence that is the translation of the original high-level language fragment (11). To illustrate that differences can arise from the IL, the following diagram is the fine tree for one of the register-IL sequences for I:=J+K; - its derivation from the gross tree is given in figure 6-3 (on the next page).

[Diagram: the same fine tree as above, decorated with the dotted allocation branches and the (r) and (s) register addresses.]

As can be seen from this fine tree, the only difference is the

information relating to allocation and deallocation. (Ultimately

this is because we have a sufficiently small MDL that, within

commutativity of +, its parse trees are unique: that is, regarding

each MDL operator as defining a production of a grammar, that

(11) Or an MDL parse tree of the sequence.

Figure 6-3
Derivation of a fine tree for I:=J+K; from the register-IL gross tree. (Note: all the material in parentheses is pruned out from the fine tree.)

grammar is unambiguous.) This is a highly desirable property of

an MDL, but may not always be possible. One impossible example

would be as follows. Consider a high-level language in which

there are "short integers" and "long integers". It is possible

to convert from short to long. There is "short addition" and

"long addition". If one wishes to add two short integers to

get a long result, there are two possible MDL sequences with

the following fine trees:

    Operand 1 —— widen ———\
                           long add ——
    Operand 2 —— widen ———/

and

    Operand 1 ———\
                  short add —— widen ——
    Operand 2 ———/

With knowledge of the properties of the integer numbers, we see

that the first fine tree cannot lead to overflow, but the second

can. Therefore, somewhere, somehow, the first tree should be

picked. It should be picked in one of two ways: either because

that is what the language demands (and so that is what the IL

must reflect); or because the designer of the primary code-

generator and of the IL observed the problem, and designs

accordingly. This should not be a problem for the secondary

code-generator, its fabricator, or the designer of a fabricator.

Other possible MDL-parse ambiguities are easily found, including

some that look more serious from a syntax-analysis viewpoint.

None, however, seem to have significant semantic consequences

that cannot be resolved in the primary code-generator or original

language design. In addition, they usually ought to be resolved

prior to the IL, and so prior to the MDL, as in the above example.

To conclude the preceding four subsections: from a data-flow analysis of an IL sequence we can derive a tree equivalent to an MDL translation of the IL. To secondary-code-generate, we need to reverse this procedure to obtain machine code from the MDL. We also need initially to produce the data-flow tree of the IL. The next two subsections present the trees for the PDP-8, and hence the PDP-8 to MDL parse-tree translation that must be inverted.

6.4.5 The trees for the PDP-8 instructions

The trees for the PDP-8 instructions can be taken directly from the MDL definitions of the instructions in 6.3.4. If the result is left in the accumulator or in the memory address register, then it cannot be an effect demanded by IL sequences that do not "know" about these registers, so they can only be "meta-syntactic" trees. Results left in "~m" can only be "normal". Results left in "m" can be meta-syntactic or normal. As in 6.3.4, "exhaustive" and "factorised" definitions are given below.

Trees of the exhaustive definition

Instruction   Normal tree               Meta-syntactic tree

LOADD m                                 m —— ~ —— -> ——(ac)

LOADI m                                 m —— ~ —— ~ —— -> ——(ac)

STORED m      ac —— ~ ———\              ac —— ~ —— -> ——(m)
                          ->
              m —————————/

STOREI m      ac —— ~ ———\
                          ->
              m —— ~ ————/

ADDD m                                  m —— ~ —————\
                                                     + —— -> ——(ac)
                                        ac —— ~ ————/

ADDI m                                  m —— ~ —— ~ ——\
                                                       + —— -> ——(ac)
                                        ac —— ~ ——————/

Trees of the factorised definition

Instruction   Normal tree               Meta-syntactic tree

D m                                     m —— -> ——(ma)

I m                                     m —— ~ —— -> ——(ma)

LOAD                                    ma —— ~ —— ~ —— -> ——(ac)

STORE         ac —— ~ ———\
                          ->
              ma —— ~ ———/

ADD                                     ma —— ~ —— ~ ——\
                                                        + —— -> ——(ac)
                                        ac —— ~ ———————/

By grafting together appropriate pairs from the factorised definition (i.e. the tree of an address-mode micro-instruction to an operand of a function micro-instruction) one obtains all the trees of the exhaustive definition EXCEPT ONE: we cannot get the meta-syntactic tree of STORED, only the normal tree:

    ac —— ~ ———————————\                    ac —— ~ ———\
                        ->      becomes                 ->
    m —— -> ——(ma)——(~)/                    m ——————————/

        D   STORE               (removing the redundant material)

which needs to be "straightened" into meta-syntactic form. This kind of deficiency is even more noticeable when dealing with instruction sets with complex and varied addressing modes. It turns out, however, that in the implementation a simple heuristic avoids the problem, and that a "normal" tree is not needed when a "meta-syntactic" tree is possible (see 6.7).

The definition of the machine instructions themselves is not sufficient, however. Recall that in the PDP-8 sequences for the I:=J+K; example we had instructions such as:

    LOADD =k    and    ADDD =i

where the "=" signified the address of a memory location containing the following integer: i.e. what is often provided in

an assembler as a "literal". At present we do not have this facility in the trees of the PDP-8. Consider how this must look as a graft onto a normal instruction LOADD m or ADDD m:

    k ——(m)—— ~ —— -> ——(ac)   ?               (grafted onto LOADD)

    i ——(m)—— ~ ————\
                     + —— -> ——(ac)   ?        (grafted onto ADDD)
    ac —— ~ ————————/

It is clear that we need to introduce a "fake" instruction, with the

"metasyntactic" tree:

    c ——(m)

which takes an integer constant as an operand, and returns the

address of a memory location containing the constant. In this

thesis this fake instruction is called LIT, and its constant

operand is written after it. When a literal effect is required

in a PDP-8 instruction sequence, we can now write subsequences

such as:

    LIT k               LIT i
    LOADD      and      ADDD

This is notationally unsatisfactory: the memory-address operand of the LOADD and ADDD in the examples is not explicitly given. Because the whole LIT instruction returns this address, instead we will write the LIT instruction in lieu of the address:

    LOADD LIT k    and    ADDD LIT i

This has the secondary advantage of being short (12).

(12) Another possibility is to attach the LIT to the address-mode micro-instruction. Thus for the PDP-8 we would have two pre-grafted new address-mode micro-instructions, DLIT and ILIT:

    DLIT k              ILIT i
    LOAD       and      ADD

At first glance it would appear that ILIT is of the same effect as D, and so is not needed. In fact, on the PDP-8 ILIT is wider than D, and so can be used to address the whole of the PDP-8 memory, whereas D can only be used to address part of the memory. This is one of the parts of the architecture of the PDP-8 that are not needed for the I:=J+K; example.

6.4.6 The trees of PDP-8 instruction sequences.

Using the argument of 6.4.1 we can produce a gross tree for a sequence of PDP-8 instructions. Using the argument of 6.4.4 we can graft the trees of the instructions in, to obtain the fine tree. (Figure 6-4 on the next page gives the gross-to-fine derivation.) Within commutativity of addition, this is the same tree as for the two ILs, except that the spurious "0+" in those trees is not present.

Figure 6-4
Derivation of a fine tree for I:=J+K; from a PDP-8 gross tree.
[Diagram: the derivation for the sequence LOADD LIT k; ADDI f; STORED t; LOADD f; ADDD LIT i; STORED u; LOADD j; ADDI t; STOREI u, with the tree of each instruction grafted in and the material in parentheses pruned out.]

6.5 RELATIONSHIP TO SYMBOLIC EXECUTION

Symbolic execution of the MDL definitions of the instructions of an IL sequence yields the MDL instruction(s) equivalent to the sequence. This MDL can then be parsed to obtain the tree (or sequence of trees) of the MDL.

Alternatively, data-flow analysis of the IL sequence to obtain the gross tree(s), followed by substitution of the MDL trees of the individual IL instructions, obtains the fine tree(s). These trees also are MDL trees of the IL:

                    symbolic execution
            IL ——————————————————————————> MDL

            | data-flow               tree- | ^
            | analysis                walk  v | parse
            v
       Gross tree ———————————————————> MDL-tree, or Fine tree
                    tree-substitute

All the processes are constructive: the symbolic execution is described below, the data-flow analysis in 6.6.2, and the substitution in 6.4.4; the parse is the normal process for the syntactic analysis of algebraic-style expressions, and the walk is the inverse of the parse. If the IL is well-defined in MDL, then for an IL sequence there will be at least one corresponding MDL sequence. If the MDL is well-defined, it should allow of no ambiguities beyond algebraic transformations. Thus we can talk of "the MDL of an IL sequence", or "the fine tree of an IL sequence", both within algebraic equivalence.

The method for the inverse process from the MDL (or fine tree) to machine code, presented in 6.7.1, can be interpreted as a parse using an ambiguous grammar [GLA77]. For the insights promoted by the symbolic execution approach, to establish the link between this method and [GLA77], and also for the historical reason that investigations of symbolic execution suggested the inverse method of 6.7.1, two demonstrations of symbolic execution are given below for the I:=J+K; example. For the optimisation suggested in 6.7.5, the tree form of the inversion method is much more suitable. The discussion in this thesis and the implementation are therefore in terms of trees, except for this section. This choice of trees discards some advantages relating to common sub-expressions suggested by [GLA77] for the textual form.

6.5.1 Two demonstrations

First, symbolic execution will be demonstrated to obtain the MDL expression from one of the register-IL sequences for I:=J+K;. The second demonstration is from a PDP-8 translation of I:=J+K; to MDL.

FROM REGISTER-IL

We start with nothing except the sequence of IL instructions to come. We have no values, and no places in which to store values.

Instruction 1: A R. We obtain a place in which to store intermediate results: a register called R.

Instruction 2: LG R,j (MDL: ~j -> r). We note that the current contents of R are to be the value "~j". (We do not evaluate ~j: we cannot, because we do not have a location of address j, or its contents.)

Instruction 3: A S. We obtain a register called S for temporary results.

Instruction 4: LF S,0 (MDL: ~(0+~f) -> s). We note that the contents of S are to be "~(0+~f)".

Instruction 5: LI S,k (MDL: ~(k+~s) -> s). We have a noted value for ~s, so we substitute that value for ~s. The instruction thus becomes ~(k+~(0+~f)) -> s. We now note that the contents of S have been changed to ~(k+~(0+~f)).

Instruction 6: ADD R,S (MDL: ~r+~s -> r). Substituting the noted values for ~r and ~s gives the MDL: ~j+~(k+~(0+~f)) -> r. We note that the contents of R have been changed to ~j+~(k+~(0+~f)).

Instruction 7: D S. We discard the place S, and the result in it.

Instruction 8: SF R,i (MDL: ~r -> (i+~f)). We substitute the noted value for ~r to obtain ~j+~(k+~(0+~f)) -> (i+~f). We note this as a side effect.

Instruction 9: D R. We discard R and its contents.

Summary: the effect of the whole sequence was the side-effect of instruction 8. The parse of that is just the example fine tree of 6.4.4, without the complications of the allocation and deallocation.

Note that the process records effects relative to the values of i, j, k, ~f, ~j, ~k as they were at the start of the symbolic evaluation, and assuming no unknown cross-talk (aliases etc.) between them. The process thus produces a concise statement of the effect of the whole sequence.
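The demonstration just given mechanises directly. Here is a minimal sketch of such a symbolic executor for the register-IL, in Python, as an illustration of mine only; the MDL effects are built as strings in the ~ / -> notation of 6.3.

    def symexec(sequence):
        regs = {}                              # register name -> noted symbolic value
        side_effects = []
        for instr in sequence:
            op, args = instr[0], instr[1:]
            if op == "A":                      # allocate: a place, no value yet
                regs[args[0]] = None
            elif op == "D":                    # deallocate: discard place and value
                del regs[args[0]]
            elif op == "LG":                   # ~m -> r
                r, m = args; regs[r] = "~" + m
            elif op == "LF":                   # ~(m+~f) -> r
                r, m = args; regs[r] = "~(" + m + "+~f)"
            elif op == "LI":                   # ~(m+~r) -> r, substituting the noted ~r
                r, m = args; regs[r] = "~(" + m + "+" + regs[r] + ")"
            elif op == "ADD":                  # ~r+~s -> r
                r, s = args; regs[r] = regs[r] + "+" + regs[s]
            elif op == "SF":                   # ~r -> (m+~f): a side effect
                r, m = args
                side_effects.append(regs[r] + " -> (" + m + "+~f)")
        return side_effects

    print(symexec([("A", "R"), ("LG", "R", "j"), ("A", "S"), ("LF", "S", "0"),
                   ("LI", "S", "k"), ("ADD", "R", "S"), ("D", "S"),
                   ("SF", "R", "i"), ("D", "R")]))
    # ['~j+~(k+~(0+~f)) -> (i+~f)']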

Note that the effects of the A and D instructions cancel out: it would appear that they are more suited to symbolic evaluation than to the gross tree/fine tree approach. Repeating the above for the stack-IL has just the opposite effect: the "§" in the MDL definitions of the instructions must be expanded into separate MDL instructions that increase and decrease the value of a stack pointer by one. If the address of this pointer is s, its initial value is ~s, and after a sequence of instructions we may have effects such as ~s+1+1-1+1-1-1 -> s accumulated, and corresponding horrors in situations that should cancel out as a sequence of side-effects. Even if the algebraic knowledge is applied that "+1-1" and "-1+1" cancel out, the overall effect is still not as clear as for register-IL under the symbolic evaluation.

We now turn to the other demonstration.

FROM PDP-8 MACHINE CODE.

This symbolic evaluation is given in a tabular form, both for brevity and to make the principles more obvious. There is no allocation and de-allocation: it is assumed that AC, T and U are available before, and will be available after. (Hopefully the values carried over will not be used meaninglessly: this should not happen for a "correct" IL sequence - i.e. one produced by a correct primary code generator.)

The exhaustive definition is used. Note that a literal introduced by an "=" in the operand cancels the effect of an indirection in the MDL definition of the instruction.

INSTRUCTION   MDL             ~ac after     ~t after   ~u after   side effect

LOADD =k      k -> ac         k             ?          ?          none
ADDI  f       ~~f+~ac -> ac   ~~f+k         ?          ?          none
STORED t      ~ac -> t        ditto         ~~f+k      ?          none
LOADD f       ~f -> ac        ~f            ditto      ?          none
ADDD  =i      i+~ac -> ac     i+~f          ditto      ?          none
STORED u      ~ac -> u        ditto         ditto      i+~f       none
LOADD j       ~j -> ac        ~j            ditto      ditto      none
ADDI  t       ~~t+~ac -> ac   ~(k+~~f)+~j   ditto      ditto      none
STOREI u      ~ac -> ~u       ditto         ditto      ditto      ~(k+~~f)+~j
                                                                   -> (i+~f)

The side-effect has as parse tree precisely the fine tree of 6.4.6. Within commutativity of addition and the redundancy of "+0", it is also precisely the side-effect of the register-IL symbolic evaluation demonstration.

6.6 RELATIVE MERITS OF THE STYLES OF IL

In subsection 6.6.3 the relative merits of register-IL and stack-IL for the purposes of tree-oriented manipulation are discussed, and a somewhat arbitrary choice is made for the rest of chapter 6. Subsection 6.6.2 completes the groundwork needed for 6.6.3 by giving data-flow analysis methods for the two styles of IL. First, however, there is a problem that so far has been avoided: in a register-IL, what is the best organisation of IL registers?

6.6.1 Organisation of registers in a register-IL

As well as being fictional temporary-result locations, the registers can be used as a means for the primary code-generator

(with its superior knowledge of the source program) to state which temporary results are "most important", as advice to the secondary code-generator (with its superior knowledge of the actual registers of the machine).

This advice could be in one of two forms. Either the IL registers could be ranked in "order of importance", so that if the secondary code generator has a certain number of spare actual registers it can allocate them to the "most important" IL registers. Alternatively, the primary code-generator could generate for several "grades" of register. Grading of IL registers could be valuable if the target machine has corresponding grades of actual register, but introduces the problem of choice of target grade into the secondary code-generator for each grade of IL register. Both ranking and grading of IL registers introduce new complexities into the primary code-generator.

Whether or not ranking is employed, there is also the possibility of limiting the number of registers, or, if grading is employed, of limiting the number of registers of each grade. This requires register-allocation in the primary code generator, and allocation of non-registers in the IL (as places for registers to be dumped for medium-term retrieval, while the IL registers are allocated for higher-priority short-term purposes).

All of ranking, grading, and limiting introduce complications in both the primary and secondary code-generators. None of them make translation more likely to be possible.

The primary objective of this research was to produce something which works: none of ranking, grading, or limiting affect this (except, perhaps, negatively). A secondary objective is to discuss easy and obvious optimisations of the method: none of ranking, grading, or limiting are easy.

Therefore, although the problems of register allocation must be faced for the final target code (see 6.7), there is no clear advantage in introducing it into the IL as well. Finally, the unclear advantages are medium-term: but it is compiler-writers' folklore that, for optimisation, good short-term code is the highest priority. Therefore the unclear advantages are probably secondary as well.

In the following two subsections we assume an unranked, ungraded and unlimited set of registers.

6.6.2 Data-flow analyses

For both the register and stack styles of IL there is the problem of the data-flow analysis to determine the gross tree of an IL sequence. Since this is a final factor that decides the choice between register and stack-IL, it is presented here. The analysis for stack-IL is presented first, it being the simpler.

6.6.2.1 Data-flow analysis for stack-IL

Let us consider a node in the gross tree (i.e. the data-flow tree) corresponding to the ADD IL instruction of 6.2.2.1. It needs two in-branches, and one out-branch. In 6.3.2 this is given the MDL (~§)+(~§) -> §. The one out-branch of the node corresponds to the § sign after the assignment operator: they both represent the result of the instruction. The other two § signs correspond to the two in-branches of the node: they represent the operands.

    (§) ———\
            + ——(§)
    (§) ———/

In general, an IL instruction is to be a root of gross trees if it does not have (only) § after the assignment operator (13), otherwise it has a single out-branch; also the adicity of the node of an IL instruction is just the number of other § signs in the MDL of the instruction.

Since we have a stack IL, the discipline of the stack forces operands to be taken off the stack in the reverse order that they were put onto the stack, and therefore the instruction sequence must be decomposable into subsequences, each subsequence being a reverse polish string in which each item (IL instruction) has an adicity as determined from the MDL, and each root (IL instruction at the end of a subsequence) as determined from the MDL. A reverse polish parse, such as that outlined in SIMULA in figure 6-5 (on the next page), therefore suffices as a data-flow analysis program building the gross tree.

(13) Thus SF, with MDL ~§ -> (m+~f), must correspond to a root.
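The adicity and root tests used by the outline's Adicity and Is_Root procedures can be derived straight from the MDL, as just described. A minimal sketch of mine, in Python, under the assumption that each instruction's MDL is available as a single "expression -> destination" string:

    def node_shape(mdl):
        lhs, dest = (part.strip() for part in mdl.split("->"))
        produces_result = dest == "§"        # result left (only) at §?
        # in-branches: every § sign other than the result one
        adicity = lhs.count("§") + (0 if produces_result else dest.count("§"))
        return adicity, not produces_result  # (adicity, is_root)

    assert node_shape("(~§)+(~§) -> §") == (2, False)   # ADD
    assert node_shape("~§ -> (m+~f)") == (1, True)      # SF m: a root
    assert node_shape("~m -> §") == (0, False)          # LG m: a leaf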

There is one clarification that this introduces into an MDL definition of a stack IL: if an IL instruction has more than one operand (i.e. an adicity greater than one) and is not commutative in the operands, then the operands must be loaded onto the stack in the reverse order of the matching § signs in the MDL definition of the instruction. For example, a MINUS stack-IL instruction, which requires the operand from which to subtract first, and the operand to be subtracted second, must be defined in MDL as -~§+~§ -> §, rather than ~§-~§ -> §. (The latter could be a definition of a "reverse minus" IL instruction, if the overloading of the monadic and dyadic "-" operators in the MDL can be resolved.)

There is also a related restriction on the design of the IL. Suppose we wanted an instruction that took two operands, the first being an address to which to store the value of the second operand. Let us call it "store indirect", with mnemonic SI. It could be defined by the MDL ~§ -> ~§. The restriction is that because of the non-commutativity of the assignment operator in MDL, the § sign for the address must be given after the § sign for the value to be stored: therefore the address must have been put on stack before the value. "Reverse store indirect" cannot be defined (14).

(14) More accurately, reverse store indirect could be defined with a reverse MDL assignment operator: <-. MDL restrictions can lead to restrictions on the IL that can be defined.

Figure 6-5

SIMULA outline of reverse polish parsing to obtain the gross tree of a stack-IL sequence by data-flow analysis.

begin
   class Node(IL, Adicity); text IL; integer Adicity;
   begin
      ref(Node) array SubNode[1:Adicity];
   end of Node;
   comment For simplicity, it is assumed that Simula permits
      declarations of null arrays with bounds (1:0);

   ref(Node) This_Node;  text This_IL;

   integer procedure Adicity(IL); text IL; ... ;   ) These return
   boolean procedure Is_Root(IL); text IL; ... ;   ) appropriate
   text procedure Input_an_IL; ... ;               ) values.

   ref(Node) array Stack[1:1000];
   integer Top;  comment Top-of-stack pointer;

   comment ++++ INITIALISE ++++ ;
   Top := 1;
   This_IL :- Input_an_IL;
   Stack[Top] :- new Node(This_IL, Adicity(This_IL));
   comment if Stack[Top].Adicity ne 0, then the input IL sequence
      starts with a root - i.e. is probably bad. Not checked in
      this outline;

   comment ++++ LOOP ON SUBSEQUENT IL INSTRUCTIONS ++++
      building a node for each IL (the node includes pointers to
      the subnodes). Nodes which do not yet have a superior are
      kept in "Stack";
   while not Is_Root(This_IL) do
   begin
      integer I;
      comment ++++ INPUT AN IL: BUILD ITS NODE ++++ ;
      This_IL :- Input_an_IL;
      This_Node :- new Node(This_IL, Adicity(This_IL));
      comment ++++ GET NODE'S SUBNODES FROM STACK ++++ ;
      for I := This_Node.Adicity step -1 until 1 do
      begin
         This_Node.SubNode[I] :- Stack[Top];
         Top := Top - 1;
      end of node building;
      comment ++++ PUT THE NEW NODE ON STACK ++++ ;
      Top := Top + 1;
      Stack[Top] :- This_Node;
   end of loop, end of IL sequence;
end with Top=1 and Stack[Top] pointing to the root
    (if Top ne 1, the input IL sequence was ill-formed);

put on stack before the value. "Reverse store indirect" cannot

be defined. (14)

The most important thing about this stack-IL data-flow analysis

is that the "apparent disadvantage" feared for temporary-stack IL

(when it was introduced in 6.2.2.1) has proved to be a mirage: using

this method there is no surplus work to be done to keep track of

the top-of-stack pointer.
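(As an illustrative aside: the parse of figure 6-5 can be re-expressed compactly in a present-day notation. The following Python sketch is a minimal re-statement of the outline, not of any implemented tool; adicity and is_root stand for the tables derived from the MDL, and all names are illustrative.)

   # Reverse-polish parse of a stack-IL sequence into its gross tree.
   class Node:
       def __init__(self, il, adicity):
           self.il = il
           self.subnodes = [None] * adicity

   def gross_tree(il_sequence, adicity, is_root):
       ils = iter(il_sequence)
       il = next(ils)              # assumed not a root (adicity 0), as in figure 6-5
       stack = [Node(il, adicity(il))]
       while not is_root(il):
           il = next(ils)
           node = Node(il, adicity(il))
           for i in reversed(range(adicity(il))):
               node.subnodes[i] = stack.pop()     # operands taken rightmost first
           stack.append(node)
       assert len(stack) == 1, "ill-formed IL sequence"
       return stack[0]             # the root of the gross tree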

6.6.2.2 Data-flow analysis for register-IL

The next three pages give a program outline for analysing a register-IL sequence into a multi-rooted tree, the "result" of the program being a pointer to a linked-list of the roots. A program outline to produce the variant of the tree in which multiple roots are avoided by having multiple branches is not presented here: the outline is essentially the same, but involves much more detailed administration - a typed draft was seven pages long, without comments.

The multi-root data-flow analysis is broadly similar to the stack-IL method, but with a different analysis of the numbers of in- and out-branches, and instead of a stack of pointers to nodes not yet used, a heap of pointers to nodes not yet completely used.

In the stack-IL case, for each node it sufficed to know the number of in-branches required for the node of an IL instruction. For

(14) More accurately, reverse store indirect could be defined with a reverse MDL assignment operator: ←. MDL restrictions can lead to restrictions on the IL that can be defined.

Figure 6-6

SIMULA outline of reverse polish parsing to obtain the multi-rooted gross tree of a register-IL sequence by data-flow analysis.

begin

   class Node(IL, Registers); text IL; integer Registers;
   begin
      ref(Node) array SubNode[1:Registers];
      ref(Node) Next; comment For lists of Nodes;

      procedure Prepend_on(List); name List; ref(Node) List;
      comment Prepend this Node onto the list pointed to by the parameter;
      begin
         Next :- List; List :- this Node;
      end of Prepend_on;

      integer Unused_results;

      procedure Used_a_result;
      comment Called when a result has been used: decrements the count of unused results by one, and if the count is then zero, all the results of the node have been used, so it is removed from the list of nodes with unused results;
      begin
         Unused_results := Unused_results - 1;
         if Unused_results = 0 then
         begin
            if Unused_Nodes == this Node then
               Unused_Nodes :- this Node.Next
            else
            begin
               ref(Node) X;
               X :- Unused_Nodes;
               while X.Next =/= this Node do X :- X.Next;
               comment Should now have X such that X.Next == this Node;
               X.Next :- X.Next.Next;
            end
         end
      end of Used_a_result;

      comment Initialise the node by setting the number of unused results to start equal to the number of registers in the IL instruction represented by the node;
      Unused_results := Registers;
   end of the class Node;

   ref(Node) Root_Nodes; comment Pointer to the list of root-nodes of the tree, updated as the root-nodes are constructed;

   ref(Node) Unused_Nodes; comment Pointer to the list of nodes which currently have unused results. Is none only before the first IL of a sequence, and after the last. When a new non-deallocate node (i.e. a node with results) is built, it is prepended onto this list. When a node's last result is used, the node is de-linked from the list;

   integer procedure Registers(IL); text IL; ... ;
   text procedure Register(I, IL); integer I; text IL; ... ;
   boolean procedure Is_Allocate(IL); text IL; ... ;
   boolean procedure Is_Deallocate(IL); text IL; ... ;
   text procedure Input_an_IL; ... ;
   comment These return appropriate values;

   ref(Node) procedure Node_of(This_Register); text This_Register;
   comment Return the most recent node, one of whose registers is the parameter. This is the node that gave the register its current value. Because the "unused" list is only ever prepended to, "most recent" means "earliest in the list";
   begin
      ref(Node) X; integer I;
      X :- Unused_Nodes;
      while true do
      begin comment Loop until "goto Return" out;
         comment NB an allocate node has Registers = 0 but still names one register;
         for I := 1 step 1 until (if X.Registers = 0 then 1 else X.Registers) do
         begin
            if Register(I, X.IL) = This_Register then
            begin Node_of :- X; goto Return; end;
         end of for;
         X :- X.Next;
      end of while;
   Return:
   end of Node_of;

   comment The detailed actions to be performed for each of "allocate" ILs, "deallocate" ILs, and other (i.e. normal) ILs are out-of-lined to make for clarity of the main program;

   procedure Allocate_IL(IL); text IL;
   begin
      new Node(IL, 0).Prepend_on(Unused_Nodes);
      Unused_Nodes.Unused_results := 1;
      comment This node cannot take advantage of the default initialisation for the count of unused results, because for allocate nodes there should be 1 unused register for no in-branches;
   end of Allocate_IL;

   procedure Deallocate_IL(IL); text IL;
   begin
      new Node(IL, 1).Prepend_on(Root_Nodes);
      Root_Nodes.SubNode[1] :- Node_of(Register(1, IL));
      Root_Nodes.SubNode[1].Used_a_result;
      comment NB: the "unused results" fields of deallocate nodes are never used;
   end of Deallocate_IL;

   procedure Normal_IL(IL); text IL;
   begin
      integer I; ref(Node) New_Node;
      New_Node :- new Node(IL, Registers(IL));
      comment Find the producers of the in-branches before prepending, so that Node_of does not find this new node itself;
      for I := 1 step 1 until New_Node.Registers do
      begin
         New_Node.SubNode[I] :- Node_of(Register(I, IL));
         New_Node.SubNode[I].Used_a_result;
      end;
      New_Node.Prepend_on(Unused_Nodes);
   end of Normal_IL;

   comment ++++ INITIALISE ++++ ;
   text This_IL;
   Allocate_IL(Input_an_IL);
   comment If the input was not an allocate instruction, then the input IL sequence is bad;

   comment ++++ LOOP ON SUBSEQUENT IL INSTRUCTIONS, BUILDING A NODE FOR EACH (INCLUDING POINTERS TO ITS SUBNODES). KEEP NODES OF WHICH NOT ALL RESULTS HAVE YET BEEN USED IN A HEAP. KEEP A LIST OF ALL ROOT NODES ++++ ;
   while Unused_Nodes =/= none do
   begin
      This_IL :- Input_an_IL;
      if Is_Allocate(This_IL) then Allocate_IL(This_IL)
      else if Is_Deallocate(This_IL) then Deallocate_IL(This_IL)
      else Normal_IL(This_IL);
   end of while;

   comment We now have a linked list of root nodes;
end of program;

the multi-rooted register-IL case, there are three subcases:

* For an allocate instruction, there are no in-branches, and

one named out-branch;

* For a deallocate instruction, there is one named in-branch,

no out-branches, and the node must be a root; and

* For each other instruction, there are one or more named

in-branches, and for each in-branch there is an out-branch

with the same name. (It is assumed that no two in-branches

to a node will have the same name: this could be an irritating

restriction on the IL design, but the alternative is further

complication of the data-flow analysis program.)
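(The three subcases can again be re-expressed as a sketch. The following Python outline mirrors figure 6-6 under the simplifying assumptions that kind(il) classifies an instruction as 'alloc', 'dealloc' or 'normal' and names(il) gives its register names; the representation is an assumption of this sketch only.)

   class RNode:
       def __init__(self, il, names):
           self.il, self.names = il, names
           self.subnodes = []            # in-branches
           self.unused = 0               # count of unused results

   def register_il_trees(il_sequence, kind, names):
       roots, live = [], []              # live: nodes with unused results, newest first
       def take(name, node):
           # graft the most recent producer of 'name' as an in-branch of node
           for p in live:
               if name in p.names:
                   node.subnodes.append(p)
                   p.unused -= 1
                   if p.unused == 0:
                       live.remove(p)
                   return
       for il in il_sequence:
           node = RNode(il, names(il))
           k = kind(il)
           if k == 'dealloc':
               take(node.names[0], node)      # one named in-branch ...
               roots.append(node)             # ... and the node is a root
           else:
               if k == 'normal':
                   for name in node.names:    # producers are found before this node
                       take(name, node)       # is listed, so it cannot match itself
               node.unused = 1 if k == 'alloc' else len(node.names)
               live.insert(0, node)
       return roots                           # the multi-rooted gross tree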

If there is no use of common-subexpressions in the IL sequence, then

the node that would be the root in the corresponding stack-IL tree will

have all of its out-branches being in-branches to deallocate nodes,

and each other non-allocate non-deallocate node will have all except

one of its out-branches being in-branches to deallocate nodes. To

obtain the corresponding stack-IL tree it would be necessary to prune

away all allocate and de-allocate nodes, and all branches to them:

every other node would have to be inspected to see if it has an

out-branch to an un-pruned node, and the one node that has no such out-branches is to be the root. (Alternatively, if the primary code-generator chooses the last de-allocate in a sequence to be one for which the in-branch is from the stack-IL root, that could be used to identify the desired root.)

In the data-flow analysis outline program, it is assumed that there are "allocate" instructions in the IL sequence. Re-drafting the outline to not need allocate instructions required an extra half-page (without comments) to keep track of what registers were currently allocated, so that when a register was referenced it was possible to know whether it was new, and so to be allocated. This was longer, and much more complex, than adding the generation of allocate instructions to each of three drafted primary code-generators. It is also less machine-efficient (implemented in SIMULA, by a factor of just over 2.6).

The outline was also re-drafted to automatically prune out in-branches from allocate nodes, and so prune out the allocate nodes also. The easiest method was to have the in-branch pointer in the node be null, and to accept the waste of space: this has the disadvantage that if the associated name of the (now null) in-branch were later needed, it would not be available.

6.6.3 Register-IL vs. Stack-IL

It is now clear that the register-IL demands a program for data-flow analysis over twice as long as for stack-IL, which is much slower, yields a tree of which the "true" (non-deallocate) root is hard to identify, and which contains spurious information.

The only advantage apparent for register-IL is that it makes common subexpressions representable as internally cross-grafted trees: to simplify later use of this, the secondary code-generator must mark a node as such when it cross-grafts it.

On the other hand, common-subexpressions can be achieved for a stack-IL by having the primary code-generator instead generate a separate preceding sequence that stores the common sub-expression in a named place. When the main sequence needs the value of the common sub-expression result, it takes it from the named place. (On some

target machines, this will lessen the optimisation advantage of

common sub-expressions.) Both for register-IL and stack-IL, common

sub-expression detection needs code in the primary code-generator:

for register-IL the more complex tree and data-flow analysis are

powerful enough that the only further complication is noting and

using the cross-grafts: for stack-IL there is the weakening of the

optimisation and the introduction of the "named places" (which are

much like registers, except that they are used between IL sequences

representing high-level instructions, not internally inside the

sequences).

For this pilot study, considering the above, it seems adequate to

demand a stack-IL, with perhaps a provision for "named places" (15).

Register-IL is therefore ignored in the rest of this thesis. For a

work based on textual representations the criteria would seem to be

reversed.

(15) Initial studies suggest that the allocation strategy for "named places" (i.e. whether to optimise them away; or if not, when to allocate them to what class of target memory) is so dependent on the high-level language, the target machine, and on other implementor decisions (such as the implementation of procedure calls and parameters) that it must be an implementor decision. This would apply to the results of register-IL tree cross-grafts also, in which case the implementor decision has a major effect inside the sequence, causing further complications: this is another reason to avoid register-IL.

6.7 AN INSTRUCTION CHOICE METHOD.

This section presents a technique for instruction choice which incorporates register allocation. Just as a translation from IL to MDL was presented in 6.4.1 to 6.4.4 in terms of a transformation from a gross tree to a fine tree, this instruction choice method will be presented in terms of a converse translation from fine tree to gross tree (6.7.1). In general there can be many machine-code gross trees matching a fine tree (i.e. translating the original IL sequence and its MDL translation). Some of these alternatives need STORE-LOAD decompositions of the tree to be introduced to conform to the numbers of target registers available (6.7.2): then, knowing the cost of the instructions in alternative subtrees, it is possible to make choices between them (6.7.3) and so derive a set of equal-cost optimal gross trees corresponding to the original IL sequence.

A pilot implementation of this method of choosing gross trees, amending them for allocation, and then choosing between them, is the subject of chapter 7.

The main defect of the method of producing the machine code gross tree from the MDL fine tree is that often too many alternatives are produced. Many of the consequent choices arise from MDL tree components which arise directly from individual IL instruction trees, or combinations of IL instruction trees. Rather than make such choices many times while running the fabricated secondary code-generators, it is preferable to make them once only at fabrication of the secondary code-generator. Subsection 6.7.4 presents a method of doing this by transforming directly from the IL gross tree to the machine code gross tree, the MDL trees being considered only at fabrication time, if possible. The consequent secondary code-generator will be made

more efficient in three ways: spurious choices between equivalents will be avoided; of alternatives that show a clear "best" in a region local to one IL, only that best will ever be considered; and only one tree-transformation is needed, not two. After this gross-to-gross transform, the allocate stage and tree-choice stage would again be needed, but only in restricted sections of the target gross tree.

Having obtained a gross tree of the machine code instructions needed,

those instructions must be placed in order. This is dealt with in

section 6.8.

6.7.1 Transforming an MDL fine tree to a machine code gross tree.

There are two distinct aspects of this transformation: handling the

leaves, and handling the internal nodes.

At a leaf of the MDL fine tree, there will be either a constant, or

an address of a memory class of the IL. In this thesis it is assumed

that translating from either of these to a leaf acceptable in the

machine code is the responsibility of the implementor. He can achieve

this by introducing into the machine description pseudo-instructions

of the machine that assign an IL leaf class to a machine code leaf class.

In the example of translating I:=J+K; to PDP-8 machine code, there is

a need for two implementor-introduced pseudo-PDP-8 instructions:

   Instruction   MDL tree

   LIT c         c → m     (Hold a literal in a memory location, then use the address of the literal. LIT has already been suggested in 6.4.5 for other reasons.)

   MEM x         ↑x → m    (An indirection of a constant is a memory address, of which the contents are to be used.)

These instructions and their trees would be defined by giving them the MDL descriptions c → m and ↑x → m. Some syntactic device is needed to ensure that between them they match all integer leaves of the IL fine tree. The →m results then match the ↑m operands of the trees of the PDP-8 machine instructions.

The method is presented by example in sub-section 6.7.1.1, and summarised in sub-section 6.7.1.2. Another example in 6.7.1.3 illustrates the power of the method, and organisational techniques suggested in 6.7.1.2. The method is related to that of Wasilew /WAS72/ and Cattell's derivative of Wasilew's method /CAT78/, but differs in its treatment of the assignment operator to yield greater efficiency. The Wasilew and Cattell methods are reviewed in Appendix A.

6.7.1.1 A demonstration of the method

The method is demonstrated for the fine tree of the stack-IL translation of the I:=J+K; example, as developed in 6.4.4, and the exhaustive form of the MDL for the PDP-8, as developed in 6.3.4.

The method works from the root to the leaves: working from leaves to root is Wasilew's method, and sacrifices a use of the special properties of the assignment operator.

The root zone of the example tree is an assignment with an addition on each side:

   [diagram: the root zone - an assignment whose two subtrees each have + as principal operator]

In the MDL descriptions of the PDP-8 there are none of the form:

   (...+...) → (...+...)

so on at least one side of the assignment operator the addition must be due to another PDP-8 instruction than that to which the assignment is due. If only one side need be due to another instruction, then that side of the instruction including the assignment must have an indirection operator ready for grafting: so we are now looking for PDP-8 MDL of the form:

   (...+...) → ↑(...)    or
   ↑(...) → (...+...)

There are none of either of these forms, so we must search for a PDP-8 MDL with indirections on both sides, ready for grafting on to two subtrees which do additions: that is, for an MDL of form:

   ↑(...) → ↑(...)

There is only one such instruction:

   STOREI m    ↑ac → ↑m

so that must be the translation of the root of the fine tree. The relationship between the gross tree and the fine tree in the root zone must therefore be:

   [diagram: the tree for the value to be assigned grafts onto ↑ac, and the tree for the address to which to assign onto ↑m, of the STOREI node]

The process must then recurse on each of the subtrees, but now with target locations for results.

The root zone of the "value to be assigned" subtree is now: We search for instructions that do additions and leave the result in the PDP-8 accumulator, and find four:

ADDD m "m + "ac ac

ADDD m "ac + "m ac

ADDI m ""m + "ac ac

ADDI m "ac + ""m ac

The commutative reverses of the plus operators are needed as a second instruction definition, leading to the same PDP-8 mnemonic^ in order to avoid having to give commutative operators a special status. Having found the four possiblities, for each the method

o » attempts to build a subtree: in this case all four y iG.ld succesijui* subtrees, so the four alternative PDP-8 gross trees are noted as possibles for the "value to be assigned" subtree.

For this demonstration, we will follow the construction of the last of the four (↑ac + ↑↑m → ac): the relationship between the gross tree and the fine tree will be:

   [diagram: the j subtree grafts onto ↑ac, and the (k + ↑(f+0)) subtree onto ↑↑m, of the ADDI node, whose result (ac) is the value operand of the STOREI]

Taking the top branch first, the search is for an indirection to give a result in the accumulator, that is for an MDL pattern of form:

   ↑... → ac

There is only one:

   LOADD m    ↑m → ac

So the gross tree of the branch must be of form:

   j — ↑ — (m) — (↑) — (ac) ...
                [LOADD]

A similar search to get j indirect to m yields the MEM pseudo-instruction, and the gross/fine relationship of the branch becomes:

   j — ↑ — (m) — (↑) — (ac) ...
     [MEM]      [LOADD]

Returning to the other branch of the ADDI, we need an addition that leaves its results in m. There is no such operation in the PDP-8, i.e. the MDL descriptions do not include any of form:

   ... + ... → m

There are however the four forms:

   ... + ... → ac

so another search is made to see if there is an instruction to transfer a value from ac to m, i.e. a form:

   ↑ac → m

There is exactly one, the STORED instruction: so the branch to the ADDI is to be of the gross/fine relationship:

   + — (ac) — (↑) — (m) ...
            [STORED]

The fine tree of the addition is now to be:

   k + ↑(f+0) — (ac)

Trying to maximise the match leads us to pick ADDI again:

   ↑ac + ↑↑m → ac

in which the ↑ac will graft to the k, losing its indirection, and the ↑↑m to the ↑(...+...), losing an indirection.

The only way to get an integer constant k to any other place is LIT k, with MDL k → m, so what is needed is an intermediate instruction that gets m to ac: again this is LOADD only, so the integer k is got to ac by a gross/fine tree relationship:

   k — (m) — (↑) — (ac) ...
     [LIT]    [LOADD]

The remaining branch of the second ADDI, and the "address" branch of the STOREI, are converted to gross trees by continuing to use the same searching method: the above description contains all the necessary kinds of search step.

Having to introduce the names of memory-types at the junctions of the gross tree nodes is unavoidable: it is how one gross tree node determines the "target memory type" of its descendants (i.e. where it needs its operand). If there are allocation or assignment problems in a memory type with more than one location (e.g. m but not ac) then the memory type names need to be decorated later with identification of the location of the class chosen at that branch between gross tree nodes. The following diagram shows one of the alternative resulting gross trees of the I:=J+K; example via stack-IL to PDP-8 machine code; figure 6-7 (on the next page) gives the fine/gross relationship and a resulting PDP-8 code sequence.

   [diagram: one of the alternative resulting gross trees, built from LIT, LOADD, ADDD, ADDI and STORED nodes and rooted in the STOREI, with the memory-type names (ac) and (m) at the junctions between gross tree nodes]

The final PDP-8 sequence needs to name three locations in main memory: how this is done is one subject of the next subsection (6.7.2). For clarity, the names have been assumed to be t, u, and v in the code sequence given with figure 6-7, and as m_t, m_u and m_v in the diagrams above and in figure 6-7. (There are also anonymous m-branches in the diagrams, representing the "results" of the LIT pseudo-instruction.) The gross tree without these namings is the input to the method given in subsection 6.7.2, which checks the tree to ensure that no memory-class is over-allocated (modifying the tree, where needed, to prevent over-allocation) and introduces information into the tree to name locations for a later assignment pass (which then only needs to do a linear scan of the produced machine instruction sequence).

Having to introduce indirections in the fine tree for the single purpose of grafting them out again is a nuisance, with no advantages at the practical level. (16) In chapter 7 a slight modification to the MDL notation will elide out the need for these indirections.

One major point to note from the example above is that because of the use of the IL instruction "Load-in-frame" with a zero operand, and the resulting "f+0" in the fine tree, three spurious instructions are generated to do an addition of zero. Some means is needed to prevent this.

(16) However, these indirections have the theoretical and pedagogic advantage of saying exactly what is meant.

                 Obtained                 Desired

   MDL           ... + ↑(f+0)             ... + ↑f

   Fine tree     f + 0                    f

   PDP-8         LOADD f                  ADDI f
   instruction   ADDD (LIT 0)
   sequence      STORED v
                 ADDI v

There are several possibilities to avoid this kind of phenomenon, depending on the time at which they are applied. The following paragraphs deal with them in reverse order of time.

After the machine instruction sequence has been formed, a peephole optimiser could "spot" the redundancy of the ADDD (LIT 0), and so eliminate it: assuming it then backs up to discover that v is set to be a copy of f (so that the reference to v can be modified into a reference to f), then a "deadness" check in the optimiser would discover that there is never any reference to v. Consequently the STORED v can be eliminated, in which case the LOADD f loads a value that is not used and so also may be eliminated. This demands an unusually clever peephole optimiser, and then works it very hard. It is also possibly too late to apply the optimisation, in that at the earlier stage of choosing which one of many alternative machine-oriented gross trees to select, a redundancy such as the consequences of this "+0" might cause a tree to be not chosen, although after peephole optimisation it would in fact have been the cheapest of the alternatives.

Thus the latest reasonable possibility would be after the gross tree alternatives have been built, but not yet selected. A tree-walk could be introduced to "spot" tree fragments of form:

   [diagram: an addition node one of whose operands is the constant 0]

and delete them. This would often leave fragments of form:

   f — [LOADD] — (ac) — [STORED] — (m_v) ...

which would need to be amended to:

   f — (m_f) ...

There would possibly be other kinds of subtree to transform. Especially if some of the branches include alternatives, the mechanics of the tree-editing can become very complex.

A corresponding edit at fine-tree time would simply edit out fine tree fragments of form:

   0 + ...

The disadvantage with this is that optimisations of the method given in subsection 6.7.4 depend on avoiding the construction of the fine tree (at secondary code-generate time) as much as possible, so that the fragment to edit out would never exist.

(At the time of fabricating the secondary code-generator, proformae for fine trees will be built, and the proforma for the "load-in-frame" would include the fragment:

   integer + ...

(That is, the proforma would be more general than the specific "+0".)

The best that could be done would be to realise the possibility of a "+0", and fabricate a secondary code generator that checked for "LF 0" as a special case, treating it differently from "LF" of any other integer. This is complex at fabrication time, and introduces inefficient special-case checks at secondary code-generate time.)

All the above assumes that some program is going to have the abstract knowledge that "+0" is redundant, and the ability to know when to apply that knowledge. It is relatively easy (but tedious) to write such programs, but very difficult to have them apply the check only where meaningful, in order to avoid excessive inefficiencies. The best solution to the problem is to avoid it: the designer of the IL should provide "special case" IL instructions that are likely to facilitate the generation of good code. It would have been better for the IL designer to have provided a specific IL instruction, perhaps called "Load-in-frame-zero" (LFZ), with MDL ↑f → §. The primary code-generator would never generate LF 0 for an access to an outer frame, it would generate LFZ. The relatively intelligent knowledge that "+0" is redundant would not be built into a program, but more appropriately would be implicit in the decisions of the designer of the IL and of the implementor of the primary code generator. No program complications arise.

In the "I:=J+K; examples of the rest of this thesis, it will be

assumed that LFZ is generated instead of LF 0: this simplifies

the gross tree into the form given Qi\ . " >;:_-_* =

6.7.1.2 Summary of the method

This sub-section summarises the kinds of steps used in the

demonstration given in the last sub-section, ignoring avoidable

complications such as tree-edits for optimisation purposes.

The most complex kind of step is that when the method has recursed to try to discover a gross tree (or set of alternative gross trees) matching a subtree of which the principal operator is diadic:

   source 1
            \
             operator — ... — desired destination
            /
   source 2

The recursion desires that the result of the found gross tree be left in a particular class of memory, the "desired (memory class of) destination": that is, the class of memory in which is to be left the value resulting from executing the instructions generated in consequence of the subtree.

The following paragraphs describe this kind of step in general terms, but without the implementation details given in chapter 7.

After that the phases of the step are commented upon, and finally the other kinds of step described by their relationship to it.

The next sub-section re-demonstrates the method in terms of the

I:=J+K; stack-IL fine tree for the PDP-10 computer: a better example of the general case of the method, but not showing the problems due to the PDP-8.

PHASE 1. Find all the MDL trees describing target machine instructions, each such that the out-branch of the tree is of the desired class of destination and such that the principal operator of the description tree is that of the fine tree node being matched.

PHASE 2. For each of the instruction-description trees found:

a) Match the left branch of the instruction description

tree with the "source 1" subtree of the fine tree,

checking for equivalent topology; and

b) similarly match the right branch of the instruction description tree with the "source 2" subtree of the fine tree.

The matching of the branches proceeds as follows. If the remnant of the instruction-description tree is an indirection of a leaf, the leaf must be a class of destination in which is to be left the result of obeying instructions due to the remaining fine subtree: in such cases, note the class of the leaf and a pointer to the subtree. If a leaf of the fine tree is reached without simultaneously reaching an indirection of a leaf of the instruction description tree, the match of the instruction with the fine tree has failed. If operators are reached in both trees (i.e. neither a leaf) then if the operators are different, the match of the instruction with the fine tree has failed; if they are the same operator, for a monadic operator recurse the matching process on the in-branches of the trees, for a diadic operator recurse first for the left in-branches of the subtrees, then for the right in-branches. If at any point in this recursive matching a failure is encountered, the recursive matching is aborted, and the instruction is not considered further. If the recursive matching completes without failure, we have a candidate instruction with a list of subtrees themselves to be successfully generated, each with a desired memory class for the destination of the result to be obtained by executing instructions generated for the subtree.

Thus an instruction description MDL tree can be regarded as a proforma to be matched by the "same shaped" part of the fine tree nearest the root (or, after recursion, by the fine subtree). Phase 2 determines for each candidate instruction whether the proforma it sets fails or survives, and if it survives generates a list of required destination classes, each for a corresponding subtree.

PHASE 3. For each surviving candidate instruction, recurse for each of its leaves as a desired class of memory for the corresponding subtree, as discovered by phase 2. If all the leaves return successful results, generate a gross tree node for the instruction, with the results of the recursions grafted on as the in-branches to the gross tree node, and successfully record a pointer to the new node. If any recursion fails, do not generate a node, and fail to record a result for this instruction.

PHASE 4. Take the set of pointers to gross trees representing instructions successful at phase 3, form a list or array of those pointers, and return a pointer to that as the result for the fine tree (i.e. a pointer to all the alternative gross trees matching the fine tree).

PHASE 5. If phases 1 to 4 failed to produce any gross trees, go through every other memory class, to determine if gross trees exist for which the results can be obtained to another intermediate memory class, and then transferred to the desired class, so that effectively the fine tree had been modified to be of the form:

   source 1
            \
             operator — intermediate destination — ... — desired destination
            /
   source 2

This can be achieved in three sub-phases:

PHASE 5A. For each other memory class, determine if there is a transfer from it to the desired class: i.e. an MDL description:

   ↑other → desired

If not, determine if there are two other classes such that there are MDL descriptions:

   ↑other1 → other2
   ↑other2 → desired

so that a two-step gross/fine tree is possible for a two-instruction transfer sequence. If not, try for a three-step transfer sequence, then a four-step sequence, and so on, until either a (shortest) transfer sequence is found, or not even a transfer sequence as long as the number of classes of memory provides a transfer.

PHASE 5B. For each memory class for which a transfer sequence is found, try phases 1 to 3 for a set of gross trees for the fine tree, with that memory class as desired destination.

PHASE 5C. For each memory class for which a transfer sequence and

a gross tree is found, graft gross tree nodes for the instructions

of the transfer sequence onto the root of the gross tree, build a

list or array of the pointers to the resulting "transferred" gross

trees, and return that as the pointer to the gross trees for the fine

tree.

PHASE 6. If neither phase 4 nor 5 produces a gross tree, then the fine tree cannot be code-generated for the target machine: return an indication that the code-generation has failed. This would normally only occur in recursions of the method: were it to occur at an outermost call it would indicate some error by the implementor - either the IL sequence is un-generatable (suggesting an error in the primary code-generator), or an IL or machine instruction has been mis-specified.
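(To fix ideas, phases 1 to 4 can be sketched as follows in Python, with phase 5 reduced to a prefabricated transfer table and the root cases omitted. Fine trees and description trees are tuples, and ('want', cls) stands for an indirection-of-a-leaf in a description - i.e. "leave the matching subtree's result in cls". The representation and all names are assumptions of this sketch, not of the implemented tools of chapter 7.)

   # Fine-tree nodes: ('leaf', cls) | (op, sub) | (op, left, right).
   # Description patterns have the same shape, with ('want', cls) at the
   # points where a subgoal (subtree, destination class) is to be noted.

   def match(pattern, fine, subgoals):
       """Phase 2: match one description branch against a fine subtree."""
       if pattern[0] == 'want':
           subgoals.append((fine, pattern[1]))
           return True
       if fine[0] == 'leaf':
           return False                       # fine leaf reached too soon
       if pattern[0] != fine[0]:
           return False                       # different operators
       return all(match(p, f, subgoals)
                  for p, f in zip(pattern[1:], fine[1:]))

   def generate(fine, dest, table, transfers):
       """Phases 1-4: all gross trees leaving the value of 'fine' in 'dest'."""
       if fine[0] == 'leaf':                  # leaves go by transfer sequences
           seq = [] if fine[1] == dest else transfers.get((fine[1], dest))
           return [] if seq is None else [('transfer', seq, fine)]
       alternatives = []
       for instr, pattern in table.get((fine[0], dest), ()):   # phase 1
           subgoals = []
           if not match(pattern, fine, subgoals):              # phase 2
               continue
           grafts = [generate(sub, cls, table, transfers)      # phase 3
                     for sub, cls in subgoals]
           if all(grafts):                    # every subgoal must succeed
               alternatives.append((instr, grafts))
       return alternatives                    # phase 4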

The following paragraphs deal with the remaining forms of node in the fine tree.

If the node of the fine tree is a monadic MDL operator, the method of phases 1 to 6 is identical, except that only one recursive match is called for in phase 2.

If the node of the fine tree is a leaf, then instead of phases 1 to 4, the leaf should be of a memory class of the target machine, or by phase 5 there should be a transfer from its IL memory-class to a target-machine memory class (e.g. the LIT or MEM pseudo-instructions suggested in 6.7.1 as extensions to the PDP-8 instruction set).

If the node of the fine tree is the root of the fine tree, then it does not have a "desired destination class of memory", the operator at the node being either assignment or some result-less operator which is an extension of the MDL. If it is an assignment, there are two cases: where the second operand of the assignment is an integer or other atomic form of address, or where it is compound.

If the recipient of the assignment is atomic, the tree can be straightened out by taking the memory class of the recipient as the desired destination class of the result:

   source → recipient    straightens to    source — ... — (recipient)

If the recipient is not atomic, then phase 1 should select the instructions of the target machine whose description trees have assignment as root operator, and which have non-atomic recipients.

If the root node of the fine tree is not an assignment operator,

then it must be matched by a machine instruction with the same

operator at the root node of its description tree. This is a

simple way to deal with operators such as procedure call and

return, goto and label: they would be introduced into the MDL

as equivalents (or near-equivalents) of the corresponding IL

instructions, and would have suitable target machine instructions

or instruction-sequences given as having description trees rooted

on them. They should be simple implementor decisions, not formidable Artificial Intelligence problems.

The remaining paragraphs of this sub-section make some observations

about the method, and concerning the support it needs from other

parts of the compiling system.

If a gross tree is not found for a fine tree or subtree, then there

is no machine code sequence matching the MDL of the tree: this is

so because all possible gross trees have been tried from the root

until they failed.

If at least one gross tree is found by phases 1 to 4 without any transfers at the root, then it may be that omitting phase 5 (i.e. not generating further alternatives) causes a more optimal gross tree to be missed. Consider for example a fine subtree with root operator "£" and two operands both of memory class x, to have desired result of class y - that is, the fine tree:

   ↑x £ ↑x — ... — (y)

and for a target machine with instructions of the following MDL descriptions:

   XTOY      ↑x → y
   XPOUND    ↑x £ ↑x → x
   YPOUND    ↑y £ ↑y → y

Phases 1 to 4 will produce the gross/fine tree and code sequence:

   XTOY; XTOY; YPOUND

Phase 5 as well would find:

   XPOUND; XTOY

Clearly, other factors being ignored, the latter sequence is better.

"Other factors" can influence the choice between the above possibi- lities either way: if x were "short integer", XPOUND were "short add", y "long integer", and YPOUND "Ion/! add", then the first sequence would avoid overflow, and so would often be preferable; but if x were "full integer" and y "short integer" (with appropriate XPOUND and YPOUND additions) then the sequences would have the same the overflow- charac- teristics. The decision to use phase 5 only as a fall back (when phases 1 to 4 had failed) was based on a study of several real machi- nes- The study suggested that,if synthetic counter-example machines were ignored,that for fine trees which yeild gross trees both with and without phase 5, the gross trees yeilded without phase 5 include fewer gross tree instruction nodes. Practical experience of more machines is needed to validate this observation.

The various phases of the method need to refer to different subsets of the MDL descriptions of the target machine instructions.

Instructions with MDL of the form:

   ↑memory-class → another memory-class

are needed only at a leaf of the fine tree if the source address is of an IL memory class, or only at phase 5A if the source address is of a target-machine memory class. In either case, the instruction is requested by its from/to combination of memory classes. Fabrication-time generation of all the transfer sequences from one memory class to another, picking the shortest sequence for any from/to combination, and recording that as if it were a single instruction, will make phase 5A much more efficient. Instructions of the form:

   ↑memory-class → same memory-class

do not yet have a use: one will be found in 6.8. Other instructions of form:

   operator ... → memory-class

or of form:

   ... operator ... → memory-class

are selected only by phase 1, and are requested by the operator/memory-class combination. The only remaining instructions are of form:

   ... → non-atomic address

and are only used at the root of a fine tree corresponding to an MDL instruction.

Clearly the fabricator should pre-classify the machine instructions as above, and place them into groups with the same lookup requirements.

The pre-fabrication of transfer sequences has another important consequence: it avoids the need for inefficiencies in the secondary code-generator which would otherwise be needed to prevent cycling through a sequence of transfers in phase 5A. Suppose phase 5A needs a transfer from class a to class z, when there are transfers between each pair of a, b and c, but none between any of those and class z. The search could cycle round classes a, b and c indefinitely: simply trying to stop it by insisting that we never get back to the start-class still allows the step from a to (say) b, and then to cycle indefinitely between b and c. To guarantee that failed non-prefabricated phase 5A searches stop at failures requires either building all combinations (including cycles) of length up to the number of memory classes (that is, if there are m memory classes, m^m combinations of length m, taking time proportional to m·m^m at code-generate time) or checking at every step that we have not arrived somewhere we have been before in this combination (that is, doing many expensive tests). Such inefficiencies are acceptable only at fabrication time, and then once, but are not acceptable at every call of phase 5A during every secondary code-generation.
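(The prefabrication amounts to one shortest-path search per memory class, done once at fabricate time. A Python sketch, assuming transfers maps a (from-class, to-class) pair to the single transfer instruction where one exists; because breadth-first search visits each class at most once, it terminates even when classes a, b and c transfer among themselves.)

   from collections import deque

   def prefabricate_transfers(classes, transfers):
       """Record, for every ordered pair of memory classes, the shortest
       transfer-instruction sequence, as if it were a single instruction."""
       table = {}
       for start in classes:
           shortest = {start: []}
           queue = deque([start])
           while queue:
               cls = queue.popleft()
               for nxt in classes:
                   if nxt not in shortest and (cls, nxt) in transfers:
                       shortest[nxt] = shortest[cls] + [transfers[(cls, nxt)]]
                       queue.append(nxt)
           for dest, seq in shortest.items():
               if dest != start:
                   table[(start, dest)] = seq
       return table
       # A pair absent from the table fails at once at code-generate time.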

Within the machine code sequence corresponding to an MDL tree, the method produces optimal or near-optimal code. This conforms well to the folklore that the most important objective of optimisation is "good local code". Between sequences corresponding to MDL trees, the method makes no attempt at optimisation. This would need to be the responsibility of the primary code generator or of some third agent, such as a window optimiser. Either the primary code-generator or a window optimiser for the IL could help the secondary code generator in three ways: it could produce

IL that is more explicit (i.e. lower level); it could produce IL that yields "bigger" MDL trees; or it could provide more medium-term information, such as assignments from a location with a complex address to one of simple address before a volume of IL that uses

the location heavily, then using the simply addressed location in

the following IL. If a store is done to the address during the

IL, then a final store-back to the complex address is needed.

(Alternatively, instead of shadowing the location of complex address, the address itself could be shadowed, and used by reference. Which is desired might depend on the address structure of the target machine: but referencing via a shadowed address has the advantage that jumps out of the volume of IL need not provoke a store-out, whereas shadowing the location would have to. Both techniques become more complicated if there is the possibility of a jump into the volume of IL.) An IL-window

optimiser could also remove silly redundancies across borders

in the IL corresponding to the separation in the primary code generator between different high-level instructions. A window

optimiser could also be applied to the machine code output of the secondary code generator, as well as (or instead of) the IL input: it would have all the same advantages, but would be operating on different information and so could often produce different effects. It is likely that varied languages on a selection of machines would have different window-optimisation requirements, so whether to use it on the IL, or on the machine code, or neither, or both, should be an implementor decision - probably after benchmarks.

Since the primary object of this research was to get something that worked, and to only consider optimisation either when it was easy and obvious, or when it was imperative because adding it later would be excessively difficult, window optimisation has not been deeply considered in this research. Recently there have been a number of publications suggesting that window optimisers can relatively easily be generated from IL or machine descriptions in MDL-like notations, at least for small window sizes ("peephole optimisers"), mostly using localised symbolic execution methods /DAV80/, /KIN81/. Other methods have been suggested for specific languages, specific ILs, or specific machines, and some seem capable of generalisation /KE77/, /CAT78/, /PER79/, /RUD79/. There is also a vast body of literature on medium- to long-range optimisation deriving from control flow and data flow analyses: although in general very abstract as methods, it is hard to see how to implement them without classes of information not derivable from MDL-like notations: a notable exception is /BOY78/.

6.7.1.3 Another demonstration of the method

This sub-section demonstrates the code generation of the example fine tree (without the "+0" redundancy: i.e. the tree of the MDL instruction ↑j + ↑(k+↑↑f) → ↑(i+↑f)) for the PDP-10 computer as target, described in terms of the phases of 6.7.1.2.

The necessary PDP-10 instructions are described below in exhaustive form. The instructions including addition in their MDL are also given in commuted form: for the instruction with two additions in its MDL there are 2*2 (=4) cross-commutations. The fabrication-time classification is also shown.

AT PHASE 5A:
                     MDL DESCRIPTION     FROM-CLASS   TO-CLASS
   MOVE  r1,m        ↑m → r1             m            r
   MOVEM r1,m        ↑r1 → m             r            m

AT PHASE 1 FOR A ROOT:
                     MDL DESCRIPTION
   MOVEM r1,(r2)     ↑r1 → ↑↑r2
   MOVEM r1,m(r2)    ↑r1 → ↑(m+↑r2)
                     ↑r1 → ↑(↑r2+m)

   (Root-type MDLs could be classified by the two secondary operators: e.g. (↑/↑) for MOVEM r1,(r2) and (↑/+) for MOVEM r1,m(r2).)

AT PHASE 1 BUT NOT AT A ROOT:
                     MDL DESCRIPTION          PRINCIPAL OPERATOR   REQUIRED TO-CLASS
   MOVE r1,(r2)      ↑↑r2 → r1                ↑                    r
   MOVE r1,m(r2)     ↑(m+↑r2) → r1            ↑                    r
                     ↑(↑r2+m) → r1            ↑                    r
   ADD r1,m          ↑r1 + ↑m → r1            +                    r
                     ↑m + ↑r1 → r1            +                    r
   ADD r1,m(r2)      ↑r1 + ↑(m+↑r2) → r1      +                    r
                     ↑r1 + ↑(↑r2+m) → r1      +                    r
                     ↑(m+↑r2) + ↑r1 → r1      +                    r
                     ↑(↑r2+m) + ↑r1 → r1      +                    r
   ADDI r1,m         ↑r1 + m → r1             +                    r
                     m + ↑r1 → r1             +                    r

It is assumed that the fine tree leaves f, i, j, k are of memory class m, so that in all the following they will be written as m_f, m_i, m_j, and m_k. Consequently no LIT-type pseudo-instructions are needed to convert from the IL-oriented fine tree to the target-machine-oriented fine tree: they are the same.

Note that for the PDP-10 description, all the "m+" and "+m" fragments can only be matched at leaves, because of the absence of an indirection between the m and the addition.

The root zone of the fine tree is of the form (...+...) → ↑(...+↑...). All of the root-phase-1 descriptions are of the form ↑r1 → ..., so at phase 2 matching, the left hand side of the assignment must match ↑r1, and so is to be recorded as a subgoal with destination r1 for phase 3.

The three right-hand sides of the root-phase-1 descriptions are ↑r2, (m+↑r2) and (↑r2+m) (each under the outer indirection), and are to be matched at phase 2 with (m_i+↑m_f). The first of the three, ↑r2 (MOVEM r1,(r2)), succeeds in the match, and creates a subgoal in which (m_i+↑m_f) is to have destination r2. The second, (m+↑r2) (MOVEM r1,m(r2)), succeeds in the match with m matching m_i, and creates a subgoal in which ↑m_f is to have destination r2. The third, (↑r2+m) (also MOVEM r1,m(r2)), fails because it would need a subgoal in which ↑m_f goes to the address m itself, rather than the location of address m: that is, there is no indirection before the m in (↑r2+m), and so it cannot match anything other than some m_x, as observed previously.

The subgoal (m_i+↑m_f) with destination r2 selects the last 8 MDL descriptions in the table for "phase 1 not at root" - i.e. those with destination class r and principal operator +. Phase 2 eliminates all of these except m + ↑r2 → r2 (ADDI r2,m, in which r1 has been renamed r2), either because operators mismatch, or because of requirements to produce a result to the address m itself.

The one survivor matches the m of the ADDI to m_i, and creates a sub-subgoal of ↑m_f to r2. For the sub-subgoal, there is only one phase 5A candidate MDL description, ↑m → r2 (MOVE r2,m), with m matching m_f and r2 matching r2 at phase 5B. The sub-subgoal and the subgoal therefore succeed, so if the other subgoal at the root succeeds, these three goals will produce the gross tree/fine tree relationship:

   [diagram: MOVE r2,m_f; ADDI r2,m_i; MOVEM r1,(r2) grafted over the fine tree]

m. is r2) ( mf — ~ —(r2) rf > MOVE r2,m ADDI r2,m. MOVEM rl,(r2) ,240

The subgoal "m^ with destination r). due to the other candidate instruction at the root, ^MOVEM rl.m (r2j£7 » is the sam$- as the sub-subgoal above: so if the other subgoal at ;the root succeeds, this candidate will produce an alternative gross tree to the above, with the following fine tree/gross tree relation- ship .

...(rl>

MOVEM rl,m.(r2) 1

When the subgoal with destination r1 has succeeded, the two rival gross trees will be produced at phase 4, and later the two-instruction subtree will be preferred to the three-instruction subtree. The r1 subgoal will be identically attempted for each of the two gross subtrees above: the optimisation suggested in 6.7.4 will prevent this inefficiency by eliminating the three-instruction alternative at fabricate time, and will also discover that the secondary code generator never needs to build the MDL subtree, but can directly generate the gross subtree fragment:

   [diagram: the prefabricated two-instruction gross fragment, ending at (r1)]

We now deal with the other subgoal from the root of the fine tree: the subgoal ↑m_j + ↑(m_k+↑↑m_f) with destination r1, due to either candidate at the root. It has destination class r and principal operator +, so selects the last 8 MDL descriptions in the "non-root phase-1" table. Phase 2 eliminates all of these save:

   INSTRUCTION     DESCRIPTION              MATCHES      SUB-SUBGOAL(S)
   ADD r1,m        ↑m + ↑r1 → r1            m with m_j   ↑(m_k+↑↑m_f) for r1
   ADD r1,m(r3)    ↑r1 + ↑(m+↑r3) → r1      m with m_k   ↑m_j for r1, and
                                                         ↑↑m_f for r3

Recursing for the sub-subgoals gives the following gross trees:

   SUB-SUBGOAL              ALTERNATIVE GROSS SUB-SUBTREES
   ↑(m_k+↑↑m_f) for r1      MOVE r4,m_f; MOVE r3,(r4); MOVE r1,m_k(r3)
                            MOVE r4,m_f; MOVE r3,(r4); ADDI r3,m_k; MOVE r1,(r3)
   ↑m_j for r1              MOVE r1,m_j
   ↑↑m_f for r3             MOVE r4,m_f; MOVE r3,(r4)

Grafting these into the two candidates gives the gross trees following as alternatives for the other subgoal at the root. (Note: at the starred node, there is again the longer alternative.)

   [diagram: the two alternative gross trees for the value subgoal]

Returning to the root of the gross tree, the second candidate has succeeded with two alternatives, each of four instructions. The whole translation of I:=J+K; therefore takes six PDP-10 instructions.

Two of the PDP-10 instructions are of form MOVE r,m, used to get ↑f into memory class r. The implementor could decide to reserve a location of class r for ↑f, rather than a location of class m.

There are several ways this could be done. The first is to treat

↑f as ↑r_f (rather than ↑m_f). The second is to introduce a pseudo-instruction NULL with description ↑f → r; but later we will see that if locations of class r are scarce (there are only 16) then this complicates register allocation. The third is to repeat every MDL description in which r appears, but with the r replaced by f: this could be done at fabricate time, and for some target machines will lead to a bigger but faster code-generator - for other target machines it will lead to a bigger and slower code generator.

6.7.2 Register allocation on the gross tree.

There is a large body of work on generalised register allocation: some of the more notable recent contributions are /LEV81/, /SIT79/ and /WEB80/. Almost all of this work attempts the global problem in isolation from other parts of the compiler, usually from a linear form of the chosen instructions after data-flow analysis, or from late versions of IL. For the more limited purposes of this thesis, the problem is attempted only locally, and from the first version of the gross tree. All of the alternatives and combinations of alternatives are treated as representing distinct gross trees:

   [diagram: a "subtree" branch feeding a gross node, the gross node's other in-branch carrying "alternative 1 OR alternative 2"]

would be treated as representing both of the following:

   [diagram: the same gross node with alternative 1 as its in-branch, and the same gross node with alternative 2 as its in-branch]

The register allocation method described below may in some cases modify one alternative gross subtree in a manner incompatible with other alternatives, so that the point at which the alternatives are introduced in the common representation may move. For example, if register allocation modified the first of the two trees shown as being represented above by inserting nodes ("inserted nodes 1") on the branch from alternative 1, and the second by inserting different nodes ("inserted nodes 2") on the branch from alternative 2, then in the combined form representing them the "gross node" must be duplicated, and the branch in which the "OR" appears must be superior to the "gross node":

   [diagram: the OR now selects between two copies of the gross node, one reached through inserted nodes 1 from alternative 1, the other through inserted nodes 2 from alternative 2; the common "subtree" branch feeds both copies without being copied]

This requires some programming that is rather complex, but introduces no theoretical problems. Common subtrees do not need copying (as illustrated in the diagram).

The method given in 6.7.1 proceeds from the root of the fine tree to the leaves, hypothesising gross tree nodes, then returns from the leaves to the root building successful gross tree nodes

(i.e. for those hypotheses of which all the sub-hypotheses succeeded).

Register allocation can therefore be merged in with the root-to-leaf

process or with the leaf-to-root process.

Following the observations made on the Sethi-Ullman method in appendix

C, it would appear simplest to do the register allocation in the

leaf-to-root process, inspecting each branch of the new gross trees

to see whether it represents a result to be held in a class of

memory that is plentiful or scarce. Branches that represent

"plentiful" memory are assigned a location with no attempt to re-use

these locations (in one gross tree: there is no clash if they get

re-used in another gross tree). For such a branch, the subtree

dependent on it can be separated from the rest of the tree, so long

as the target machine code generated from it is completed before the

result is needed. (This is the basis of the instruction-ordering algorithm of 6.8.) Results lying for quite long times in plentiful memory cost

nothing, and isolate the allocation in that sub-tree from that in

all other sub-trees.

Branches that represent "scarce" memory classes must be accounted for in the "bandwidth" manner of appendix C.2.2, as analysed from the Sethi-Ullman method. During the leaf-to-root process, a numeric label is kept, for each branch, of the bandwidth of each scarce memory class used in the subtree from which it leads: if a node has two in-branches with equal-highest labels of one scarce class, then the out-branch needs a label one more than that; otherwise the highest in-branch label is chosen as the label for that class of the out-branch at that node. A branch that represents a result in "plentiful" memory "cuts" all allocations in the manner of C.2.2 - that is, as an in-branch it has a number of zero for each scarce memory-class.
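(The labelling is one bottom-up computation per scarce class. In the following Python sketch it is assumed that each gross node carries the memory class of its out-branch, node.cls, and its in-branch nodes, node.subnodes; the sketch also folds in the n+h-1 generalisation discussed at the end of this subsection.)

   def label(node, scarce, plentiful):
       """Bandwidth label of 'node' for one scarce memory class."""
       if not node.subnodes:                  # a leaf
           return 1 if node.cls == scarce else 0
       labels = [0 if sub.cls in plentiful    # a plentiful result "cuts"
                 else label(sub, scarce, plentiful)
                 for sub in node.subnodes]
       best = max(labels)
       if best > 0:
           # n in-branches of equal-highest label h need n+h-1 locations
           best += labels.count(best) - 1
       if node.cls == scarce:
           best = max(best, 1)                # the node's own result
       return best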

At a node that would exceed the allowance of some class, the allocation algorithm must introduce a "cut" in one of the in- branches equalling that allowance. A cut can be achieved in one of two ways.

The first way of making a cut is, in its simplest case, to introduce two new instructions on the branch: a store to a plentiful memory class, and a load back from it. As suggested in 6.7.1.2, this can be simplified by recognising at fabricate time all cyclic sequences of memory-class-to-memory-class transfers: e.g.

   On the PDP-10:  MOVE r,m     ↑m → r
                   MOVEM r,m    ↑r → m

giving the sequences

   ↑m → r; ↑r → m   (with effect ↑m → m)

and

   ↑r → m; ↑m → r   (with effect ↑r → r)

   On the Z80:  LD a,b   ↑b → a
                LD a,m   ↑m → a
                LD b,m   ↑m → b
                LD m,a   ↑a → m

(and others) giving sequences including

   ↑a → m; ↑m → b; ↑b → a   (with effect ↑a → a)

and the shorter sequence

   ↑a → m; ↑m → a   (also with effect ↑a → a)

From such sequences, the fabricator should keep all those from and back to a scarce class of memory via a plentiful one. If there are several sequences cycling from one particular class, only a "cheapest" (in some sense) sequence should be kept: this prevents introducing alternatives which will only be deleted later. To make a cut, the prefabricated sequence is introduced on the branch.

If there is not a prefabricated sequence for the class, some kind of failure action is needed: but in a survey of 22 real machine architectures, ancient and modern, none had such a deficient memory class within reasonable interpretations of "plentiful" for the machine.

The disadvantage of this method of cutting is that the load at the end of the cutting sequence may need optimising into the node: for example a PDP-10 sequence:

   MOVE r1,m    (MDL is ↑m → r1)
   ADD r2,r1    (MDL is ↑r1 + ↑r2 → r2)

will need optimising into:

   ADD r2,m     (MDL is ↑m + ↑r2 → r2).

If the fabricator finds several idempotent sequences from a class via another, it preserves the cheapest in the sense of the next section, 6.7.3.
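(The fabricate-time choice of cut sequences can reuse the prefabricated transfer table sketched earlier. Another Python sketch, with cost standing for the pricing of 6.7.3; all names are illustrative.)

   def cut_sequences(scarce_classes, plentiful_classes, transfer_table, cost):
       """For each scarce class, keep only the cheapest store/load round
       trip through a plentiful class: the prefabricated "cut"."""
       cuts = {}
       for s in scarce_classes:
           round_trips = [transfer_table[(s, p)] + transfer_table[(p, s)]
                          for p in plentiful_classes
                          if (s, p) in transfer_table
                          and (p, s) in transfer_table]
           if round_trips:
               cuts[s] = min(round_trips, key=cost)
       return cuts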

The second method of introducing a cut is to re-try the expensive sub-tree(s) for new gross trees, but enabling phase 5 of the method even though there were phase 4 successes. This may produce better results than the first cutting method, or may fail (so that the first method must be used). The survey of 22 machines suggested that this expensive technique would seldom produce better results.

All the above assumes that no instruction can have more than two equally-labelled in-branches for one class. If there were n in-branches with equal-highest label h, then the label on the out-branch would need to be n+h-1 (that is, h for the last-evaluated, and at the same time (n-1) for the results of the other (n-1) in-branches). If this exceeds the allowance by m, then m of the in-branches must be cut. A survey of 22 actual machines produced none that had an instruction that used three or more distinct operands of the same scarce memory class, so in practice this situation does not appear to arise often. (Indeed, 9 of the 22 machines had a single accumulator, a plentiful main memory, and no other memory classes: like the PDP-8. The PDP-8 example below will show this case to be rather trivial.)

The following example shows the allocator introducing "ALLOC" and "DEALLOC" pseudo-instructions into the gross tree, to make it obvious where allocated memory locations are introduced and dismissed. The pilot implementation of chapter 7 instead marks the node whose out-branch is the nearest to the root representing a result in a particular scarce-class location.

Either method makes subsequent assignment easy. Also, the pilot implementation uses only the first method of cutting, and for simplicity of programming and for more informative trace output of experiments, it has had the allocation programmed as a separate pass over the tree, rather than being merged in with the leaf-to-root pass of building the gross tree.

6.7.2.1 The PDP-8 example

The PDP-8 has only one accumulator: and that is the only scarce memory class. Consequently, any instruction can only have one in-branch representing a result in the accumulator, and so any other in-branch must be in (plentiful) main memory. There therefore cannot be any need to introduce

"cuts" in the gross tree: all the allocate pass need do is

introduce"ALLOC" and "DEALLOC" instructions to assist later

assignment of main-memory locations. Figure 6-8' (en- the next page)

also shows them for the accumulator, as if there were several

accumulators: but (as remarked above) at no point can two of

these "virtual accumulators" co-exist. No Horwitz-Karp-Miller-

W . ~grad method or similar (appendix C.3) for assignment of these

"virtual registers" is needed - they can simply be ignored.

6.7.3 Choice of alternative machine code gross trees.

This can only take place after the register allocation

phase, so that the instructions in the alternatives have been finalised, and thus so that the alternatives can be costed.

Like the register allocation, choosing among the alternative

gross trees is straightforeward. On a pass from leaves to root,

some costing algorithm is used to "price" each branch. At a

branch with several alternative gross subtrees, all but the

cheapest are pruned out. On arrival at the root, one tree,

containing no alternatives, is left. If, at an alternative,

there are two or more equal-cheapest contenders, then some

tie-breaking rule is applied to eliminate all but one.

The only costing algorithm tried yet is to put a price on each kind of instruction, and to sum the prices of all the instructions in a gross subtree. This is easily done at each node by summing the costs of its subordinates and of the instruction of the node itself. At a leaf node of the gross tree, the cost is the price of the instruction at the leaf.


Figure 6-8 The PDP-8 gross tree of the I:=J+K; example, after allocation. (The identifying digit of each virtual accumulator has been appended to the mnemonic of each instruction that uses it.)

To optimise for compactness of code, the price of an instruction is the size of the instruction. To optimise for speed, the price of an instruction is the time it takes to execute, in some reasonable measure. To obtain a hybrid costing, assuming both size and time are small integers (e.g. number of bytes and number of machine cycles respectively), then pricing each instruction at (say) 10*bytes+cycles would optimise primarily for compactness, with speed as a tie-breaker, but would not produce slightly more compact code at the price of excessively reduced speed.

The pilot implementation described in chapter 7 uses addition of prices given as numbers in the instruction descriptions. The tie-breaker used is to keep the first cheapest encountered. For simplicity of programming, and for more informative tracing of the costing, the pilot implementation makes a separate pass for choosing the cheapest alternatives.
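The pricing-and-pruning pass can be sketched as follows; the representation (each operand position holding a list of rival subtrees) is an assumption of the sketch, not the pilot system's.

from dataclasses import dataclass, field
from typing import List

@dataclass
class GNode:
    price: int                                    # price of the instruction at this node
    operands: List[List["GNode"]] = field(default_factory=list)
    # each operand position holds one or more rival subtrees

def cost(node: GNode) -> int:
    # Leaf-to-root: price each rival, keep the first equal-cheapest
    # (the tie-break described above), and prune the rest.
    total = node.price
    for i, rivals in enumerate(node.operands):
        priced = [(cost(r), r) for r in rivals]
        best_cost, best = min(priced, key=lambda p: p[0])   # min keeps the first minimum
        node.operands[i] = [best]
        total += best_cost
    return total

With instruction prices set at 10*bytes+cycles, as suggested above, the same pass optimises primarily for compactness with speed as the tie-breaker.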

6.7.4 Optimisation of the method of instruction choice

It is desirable at fabrication time to do as much as possible of the instruction choice of alternatives, register allocation in the alternatives, and choice between the alternatives. The method below attempts this, and as a side-effect minimises the need to build the MDL tree at secondary code-generation time, instead often facilitating direct translation from an IL gross tree to a finalised target-machine gross tree. The MDL tree, the alternatives, the register allocation, and the costing and choice of alternatives are all only needed when the gross-to-gross translation fails.

Consider a gross-to-gross translation of the example IL instruction

SF i, which has gross/fine relationship:

If the fabricator attempts the methods of 6.7.1, 6.7.2 and 6.7.3 on the fine tree in this, assuming that the value to be stored can be loaded to the accumulator, then for the PDP-8 it produces the gross/fine relationship:

(diagram: the PDP-8 gross/fine relationship for SF i, with nodes LIT i, LOADD f, ADDD, STORED u and STOREI u grafted over the fine tree, linked by (ac) and (m) branches.)

There is no new difficulty in this. Thus the fabricator can realise that whenever the input IL contains an SF instruction, the following PDP-8 gross tree is to be generated:

The fabricator will therefore configure the secondary code generator to make this gross-to-gross transformation on reading the SF instruction, without even needing to build the IL gross tree.
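As an illustration of this configuring (all names here are invented for the sketch, and the general fall-back is only stubbed), the fabricator's output can be viewed as a table from IL opcode to prefabricated target-machine gross tree, consulted as the IL is read:

PDP8_FOR_IL = {
    # IL opcode -> prefabricated PDP-8 gross tree; the value stands in
    # for whatever tree representation the generator uses
    "SF": "...the PDP-8 gross tree shown above...",
}

def translate(il_ops):
    # Direct gross-to-gross translation while reading the IL stream;
    # only unmatched ILs fall back to the general MDL-tree method.
    for op in il_ops:
        if op in PDP8_FOR_IL:
            yield PDP8_FOR_IL[op]
        else:
            yield general_mdl_method(op)

def general_mdl_method(op):
    # placeholder for the method of 6.7.1-6.7.3
    raise NotImplementedError(f"no direct equivalence for {op}")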

Two problems arise in other kinds of case.

The first is that the IL sequence to produce the result to be stored by the SF may be fabricated as PDP-8 instructions that leave the result in a class of memory other than that required - for the PDP-8, in main memory, rather than the (required) accumulator.

There are two solutions to this. The first is to secondary-code-generate the two ILs separately, then introduce a phase-5-like transfer sequence to link them. This has the disadvantage that if another translation of the value-producing IL is possible with its result in another place, then a "cheaper" transfer sequence may be possible, or even no transfer sequence at all if another translation leaves its result in the correct class of memory. As a consequence, the second solution is that the fabricator discovers all possible translations of an IL, for all possible result and operand memory classes, and the secondary code generator must make its choice from the viable combinations. The resulting method in the secondary code generator is just a version of that of 6.7.1, 6.7.2, and 6.7.3, but using a tree containing many fewer nodes. (17) The larger the MDL of each IL instruction - i.e. the more functionality in the IL instructions - the greater the difference in the number of nodes, and the greater the consequent optimisation of the secondary code generator. Many secondary optimisations are possible with this second solution: for example, if the fabricator discovers two target machine sequences for an IL, the "costs" of the two sequences are C1 and C2, and there is a transfer sequence of cost C12 from the result class of the first sequence to that of the second such that C1+C12 < C2, then the sequence of cost C1 followed by the transfer sequence of cost C12 is cheaper than the sequence of cost C2: the fabricator should discard the latter sequence.
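That discarding rule can be sketched as follows, assuming the candidates for one IL are (cost, result-class) pairs and that transfer_cost maps a (source, destination) pair of memory classes to the cost of the prefabricated transfer sequence between them (absent when there is none):

def prune_dominated(candidates, transfer_cost):
    # Keep a candidate only if no rival plus a transfer sequence is cheaper.
    keep = []
    for c2, cls2 in candidates:
        dominated = any(
            c1 + transfer_cost.get((cls1, cls2), float("inf")) < c2
            for c1, cls1 in candidates
            if (c1, cls1) != (c2, cls2))
        if not dominated:
            keep.append((c2, cls2))
    return keep

# e.g. with hypothetical PDP-8-like figures:
print(prune_dominated([(2, "ac"), (7, "m")], {("ac", "m"): 2}))
# -> [(2, 'ac')]  - the cost-7 form is dominated, since 2+2 < 7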

(17) If this second solution fails, it invokes the first solution as its phase 5.

The second problem arising is more severe: it is possible that some IL instructions do not match target machine instructions or instruction sequences, and the translation is not a simple macro-expansion with re-ordering. The remainder of this subsection deals with this problem, using the remainder of the stack-IL/PDP-8 example.

Superimposing both the IL and the PDP-8 gross trees on the fine tree, and the PDP-8 memory classes, but not the grafted-out introduced indirections (which are different for the two trees), gives the following relationship.

(diagram: the superimposed trees; the PDP-8 node LOADD j and the IL node LG j coincide at the (ac) branch, while ADDI t divides the IL nodes.)

(in the diagram, dotted lines apply to the PDP-8 gross tree, dashed lines to the IL gross tree, solid lines to both.)

The LOADD j and LG j match perfectly. The following discusses processes for the fabricator and secondary code generator to almost avoid constructing any MDL at secondary code generation time, by making all other ILs have matches.

First, the fabricator explores the possibilities, working from results to operands in the IL instructions. The SF instruction had a translation to PDP-8 instructions that required the accumulator to contain a value. Therefore, any value-producing IL may need its PDP-8 translation to have a result in the accumulator. Considering the MDL of the ADD value-producing IL instruction for a result, the fabricator must consider PDP-8 instructions that would have MDL of the form ... + ... -> ac, and also PDP-8 instruction sequences of which the tree after grafting would reflect that form. There are four such instruction descriptions, and no sequences:

ADDD "ac + "m ac ADDD "m + "ac ac ADDI "ac + ""m ->• ac ADDI ""m + "ac -V ac

For the first two, ADD IL is a perfect match, but memory class m must be added to the list of destination classes required for all results. For the two latter instructions, (m)-- " --... must be similarly prescribed as a destination, meaning that for any IL with MDL description of form "(...) -> §, a fabrication must be attempted for (...) -> m, where the ellipses denote the same MDL expression.

Only two of the IL instructions have definitions in which indirection is the principal operator: one is:

LI c    "(c+"§) -> §

For this instruction therefore, a fabrication must be attempted for c+"(...) -> m. This produces only two possibilities, both PDP-8 instruction sequences:

PDP-8 gross tree                       Net MDL effect

LIT c - LOADD - ADDD - STORED          c + "m -> m
LIT c - LOADD - ADDI - STORED          c + ""m -> m

The first of these demands an operand PDP-8 gross tree with result in m; the second (like ADDI above) demands that any IL with MDL description "(...) -> § needs a fabrication of (...) -> m.

The other example IL instruction with indirection as principal operator is LFZ, with description:

LFZ    ""f -> §

To use the first form of fabrication of LI, we would need to fabricate LFZ for a result in memory: i.e. a gross-tree-fragment with MDL effect ""f -> m. This yields:

LOADI f - STORED x    (the result in m-class location x)

To use the second form of fabrication of LI, we would need to strip the principal indirection from the MDL of LFZ and take the desired destination of class m: that is, fabricate a gross-tree-fragment with MDL effect "f -> m. This is just a null operation: all that is needed is to note that the particular memory location is f.

The first translation of LFZ/LI c above could be verbally described thus: "the LI demands an operand of class m, which therefore is the required destination class for the LFZ". The second translation does not have such a neat verbal description: "the LI demands an operand whose principal operator is indirection, and requires fabrication for it, without the indirection, to class m".

Translations of the first type have so far been diagrammed as a simple grafting on the memory class:

LFZ --(§)-- LI k    yields    ...--(m)--...

so that the LFZ can have its translation diagrammed as follows,

if the objective is of class m:

LFZ --(§)...    is equivalent to    LOADI f - STORED x --(m)...

The second form of objective would need something like:

LFZ --(§)...    is equivalent to    ...--(mf)-- " --...

As a diagrammatic form this is inconvenient: so we will instead state that a target is for an indirection from memory by rotating the graft point, as ("m). The above diagram would be redrawn as:

LFZ --(§)...    is equivalent to    ...("mf)...

We can revise the verbal description thus: "the LI demands an operand of class "m, which therefore is the required destination class for the LFZ". Using this fiction of a 'memory class "m', the fabrication of the LI c would be that:

(§)-- LI c --(§)

is equivalent to:

("m)-- LIT c - LOADD - ADDI - STORED --("m)

We do not now need complicated explanations that the target of the gross fragment itself, and of its operand, must first ignore an indirection: that is built in.

Figure 6-9 on the next page gives the resulting relationship between the fine tree, the IL gross tree, and the optimised form of the PDP-8 gross tree. The diagram clearly shows the tidying of the relationship between the two gross trees due to the more complex form of grafting points. Figure 6-10 on the subsequent pages shows the IL-gross-to-PDP-8-gross equivalences we have observed above, in the revised form of diagram.

Figure 6-9 Relationship of the IL gross tree, the fine tree, and the optimised form of the PDP-8 gross tree. (The solid lines represent the IL gross tree. The dotted lines represent the PDP-8 gross nodes dividing the IL gross nodes. Note the null PDP-8 operations yielding tree fragments of form (m)-- " --("m), which just represent the fact that (m)-- " --... is the tree form of the expression "m.)

Figure 6-10 Some fabricated optimised IL-to-PDP-8 equivalences.

(Note: for conciseness of presentation, some instruction ordering has already been carried out, as foreshadowed in this subsection. Also, allocation pseudo-operators have been introduced - they have so far been ignored in the present subsection, as they make no theoretical difference, but would have confused the presentation. The forms shown below are therefore very much those that the fabricator would configure into the secondary code-generator.)

IL        PDP-8

(Figure 6-10 body: for each of SF c, ADD, LI c, LFZ and LG c, the alternative PDP-8 gross trees - the STORED/STOREI forms for SF, the four ADDD/ADDI forms for ADD, the LIT c - LOADD - ADDD/ADDI - STORED sequences for LI c, LOADI f for LFZ, and LOADD c for LG c - with grafting points of the (m), (ac) and ("m) forms.)


Clearly, a larger IL or a larger machine code will yield many more equivalences, and may need target 'classes' incorporating operators other than indirection.

The fabricator would analyse all the desirable equivalences and configure them into the secondary code generator, which would then apply them in the manner of the method of 6.7.1, except with the rather more complex forms of "required destinations". Much of the work will have been done at fabrication time, including the elimination of a substantial proportion of non-optimal target machine alternatives.

There are two major differences from the method of 6.7.1.

The first is that the method can (and must) be combined with the parsing of the IL. This yields a set of target-machine gross trees which can be register-allocated, costed, and chosen either in that same parsing/building pass, or in separate passes: this is a trade-off between programming complexity and program speed.

The second major difference is that the unit of building is no longer necessarily a single node, but often a combination of nodes. This doesn't change the theory, but does cause programming complexity, and greater difficulty in representing the equivalences efficiently in the secondary code-generator. To some extent, however, this last can be ameliorated by having the fabricator do as much instruction-ordering as possible, as discussed in the next section.

6.8 INSTRUCTION ORDERING FROM THE TARGET-MACHINE GROSS TREE.

There are three kinds of case in the target-machine gross tree to be considered:

* pseudo-instructions that convert IL leaves into target-machine leaves, and are to be incorporated into "real" instructions;

* sequences of nodes with the out-branch of each being the in-branch of the next, and with no side-branches (call these "linear sections"); and

* nodes with multiple in-branches (call these "fork points").

These are dealt with, in order, in the following three subsections. There is also the need to assemble into the nodes on either side the "decorations" on grafting points on target-machine branches. This is very much akin to the first case above, which is also really an assembly matter, rather than an instruction-ordering matter. Assembling the decorations is therefore also dealt with in the first subsection below.

At the end of each subsection, the case of the subsection is shown for the example PDP-8 gross tree.

6.8.1 ASSEMBLY CASES OF INSTRUCTION ORDERING.

It is simpler and more convenient to program the "assembly" cases before the true instruction-ordering cases of the next two subsections, because they would otherwise confuse which instruction an in-branch leads to (and, for the "decorate" case, the instruction an out-branch leads to).

If instructions like the "LIT" instruction introduced for the PDP-8 are specially marked in their descriptions, they should be "assembled inline" as follows. At a configuration such as:

LIT x --(m)-- LOADD

the "LIT x" is really an operand to the "LOADD", and so should be made part of the LOADD instruction:

LOADD LIT x

It doesn't really affect this transformation if the node superior to the pseudo-instruction has other in-branches: for example

LIT x --(m)-- ADDD, with another in-branch ...(ac)--

can be converted directly to:

...(ac)-- ADDD LIT x

This has the side-advantage of linearising many fork points near the leaves of the tree, so that the simple case in the next subsection applies, rather than the more complex case in the subsection after.
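A sketch of this inline assembly, on an assumed minimal node shape (not the pilot system's representation):

from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    instr: str
    children: List["Node"] = field(default_factory=list)

PSEUDO = ("LIT",)     # instructions marked "assemble inline" in their descriptions

def fold_pseudo_operands(node: Node) -> Node:
    # Depth-first: absorb pseudo-instruction leaves into their superior
    # node, e.g. ADDD with a LIT x in-branch becomes "ADDD LIT x".
    kept = []
    for child in node.children:
        fold_pseudo_operands(child)
        if not child.children and child.instr.split()[0] in PSEUDO:
            node.instr += " " + child.instr
        else:
            kept.append(child)
    node.children = kept
    return node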

Other branches than those from pseudo-instructions should be decorated, unless the branch represents a memory class with one location only (e.g. the "ac" class of the PDP-8). (18)

(18) Pseudo-instruction out-branches can be regarded as decorated: for example, LIT x --(m)--... could be regarded as being equivalent to ...--(m LIT x)--... This turns the pseudo-instruction case into a "decorated leaf". Thus the two cases dealt with in this section are essentially similar.

For a decorated in-branch or out-branch, the decoration is the operand of the instruction at the node, or one of the operands of the instruction at the node. For example, the PDP-10 instruction "ADD r,m" should be assembled in that format, with the number of the register to be used replacing "r", and the name of the memory location (or its address) replacing the "m". A node such as

...(r5)-- ADD --(r5)...
   (mX)--'

should therefore be converted into:

...(r)-- ADD 5,X --(r)...

There are two points to note here. The decorations are not needed again, and so can be discarded (from here on we will ignore them, except in using the decoration in the PDP-8 example below). The second point is that if the MDL description of the instruction gives as result a decoration identical to that on an operand, then the decoration on the out-branch of the node will be identical to that on the appropriate in-branch. For example, the MDL of the PDP-10 ADD instruction is "r1+"m -> r1: allocation has replaced the "r1" by (r5) on both the in- and out-branch above.

It is possible for an instruction node to have a decorated out-branch that is not also a (decorated) in-branch. An example is the PDP-10 instruction node:

...(mX)-- MOVE --(r5)...

which will be assembled as:

...(m)-- MOVE 5,X --(r)...

If "ALLOC" and "DEALLOC" pseudo-instructions have been introduced by the allocation phase to separate it from the register assignment, then the above is modified in the obvious and trivial ways. If this is not the case, then the tree can contain leaves that are not nodes, but just decorated grafting-points. An example of this occurs in the PDP-8 translation of the I:=J+K; example, at the translation of the LFZ, where the PDP-8 gross tree is of form:

. . .(ac (Note the absence of ellipsis before "m^", ADDI ac) . .. denoting that it is a (rn grafting-point leaf.) which must be linearised, as normal, into:

. . .(ac )— ADDI f ac) . ..

This "assembly" phase of the instruction-ordering is needed only once, whereas the phases for linear sections and fork points can each reveal cases in which the other can be applied, and so must iterate, one then the other, and repeat until only one node is left.

Not using "ALLOC" or "DEALLOC" instructions, this phase would

"assemble" the PDP-8 example gross tree into the form:-

LOADD j (ac ADDI t J mY* LOADD LIT k ac ADDI fl—(ac STORED t Y (ac T^s- STOREI u

LOADD f—(ac ADDD LIT i ac STORED u

6.8.2 THE "LINEAR SECTION" CASE OF INSTRUCTION ORDERING

For most of the next two subsections, the important feature of a branch is not the memory class it represents, but the scarceness of the memory class it represents. From here on, except at one time, we can therefore stop decorating branches. Instead we will draw a thick line for a branch representing a scarce memory class, and we will call it a STRONG BRANCH. We will draw a thin line for a branch representing a plentiful class of memory, and will call it a WEAK BRANCH.

Re-drawn according to this convention, the PDP-8 example gross

tree becomes:

There are two subcases of linear sections: two nodes with a strong branch between them; and two nodes with a weak branch between them. In both subcases, the node of which the branch is an in-branch has no other in-branches (or the section is not linear).

For a strong branch, the instruction-ordering packs the two nodes into one, with the two instructions associated with the one node. Because the strong branch was part of a data-flow tree, the node that had the branch as an out-branch produces a result, and the result is consumed by the node that has the strong branch as an in-branch. Therefore, in the new node, the instruction from what was the "out-branch" node must come before the instruction from what was the "in-branch" node. Diagrammatically, FIRST joined to SECOND by a strong branch must be converted into the single node FIRST; SECOND.

If there are several strong branches in sequence, with no side-branches, then this packing operation can be performed repeatedly to yield a single node. In the PDP-8 example, there are two instances of three nodes and two strong branches in line. Packing them leaves the PDP-8 example tree as follows:

(diagram: the packed tree. The chains LOADD LIT k - ADDI f - STORED t and LOADD f - ADDD LIT i - STORED u have each become single nodes, weakly branched to ADDI t and STOREI u respectively; LOADD j remains strongly branched to ADDI t, and ADDI t strongly branched to STOREI u.)

The important feature of a strong branch is that, since it represents a result in a scarce memory class, if any other instruction is obeyed between the instructions represented by the nodes at either end of the strong branch, it may use a location of that same scarce class, and so will violate the assumptions of the register allocation. For linear sections, there is no need to interrupt them in this way, and so nodes in linear sections linked by strong branches can be packed.

A weak branch, on the other hand, represents a result in a plentiful memory class. In the target-machine sequence representing one MDL instruction, these plentiful locations need not be re-assigned, and so clashes violating allocation cannot occur. Thus the instructions at either end of a weak branch can be separated by intermediate instructions in the final instruction ordering: that is, INSTRUCTION1 joined to INSTRUCTION2 by a weak branch can be finally ordered, without problems, as:

INSTRUCTION1 (other instructions) INSTRUCTION2

Thus, a weak branch can be BROKEN, whereas a strong branch must not be broken.

In linear sections linked by weak branches, there is thus no need to pack adjacent nodes before proceeding to deal with fork points. However, it does no harm, and it reduces the number of times that the more costly "twisting" operation (described below) is needed; so it will usually be more efficient to do all the packing possible - of weak and strong branches in linear sections - before proceeding to twist forks. A sketch of the packing operation follows.

There are no weak-branch linear sections in the PDP-8 example after the strong-branch packing, so the example of packing weak branches must be postponed until examples of twisting introduce weak-branch linear sections into the tree.
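A sketch of packing, on an assumed node shape in which a node carries its instruction sequence and a list of labelled in-branches (an illustration only, not the pilot system):

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ONode:
    instrs: List[str]                                    # instruction sequence at this node
    inbranches: List[Tuple[str, "ONode"]] = field(default_factory=list)
    # each in-branch is ("strong" | "weak", source node)

def pack(consumer: ONode, producer: ONode) -> ONode:
    # Merge the source of a linear in-branch into its consumer:
    # the producer's instructions come first, since it yields the result.
    consumer.instrs = producer.instrs + consumer.instrs
    consumer.inbranches = producer.inbranches + [
        b for b in consumer.inbranches if b[1] is not producer]
    return consumer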

6.8.3 THE "FORK POINT" CASE OF INSTRUCTION ORDERING.

This subsection discusses the instruction-ordering operations around a node with more than one in-branch. Initially, the subcases with two in-branches are dealt with; then these subcases are generalised to deal with more than two in-branches.

With two in-branches, there are three possibilities: both in-branches are weak; both are strong; and one is weak, the other strong.

In the case of one weak in-branch and one strong in-branch, the instructions representing one of the in-branch source nodes must break the other in-branch: thus there are two possibilities.

For the tree section in which SOURCE1 reaches the FORK by a weak branch and SOURCE2 reaches it by a strong branch, one of the two following sequences must be created:

SOURCE1; SOURCE2; FORK    or    SOURCE2; SOURCE1; FORK

with the subordinates of the SOURCE1 and SOURCE2 nodes inserted (somewhere) before the matching instructions. The second case breaks the strong branch of the tree section, and so cannot be permitted. The first sequence is needed, breaking the weak branch: it is the linear packing of the tree in which the SOURCE1 subtree has been re-attached as an in-branch of SOURCE2.

In practice, it is simpler to convert the tree section into this last form: call this tree-manipulation TWISTING the SOURCE1 branch back up the SOURCE2 branch. Repeated packings and twistings then look after the subordinate subtrees of the sources, according to the strengths of the branches from them.

At a fork where the strengths of the branches are the reverse of those illustrated above, the twisting is correspondingly reversed: the weak branch is again the one twisted back up the strong one.

Clearly, both the twisted trees contain a one-branch linear section that can be packed, immediately or (if it simplifies the programming) during a separate subsequent packing phase.

The example PDP-8 gross tree, after packing, contained two nodes each with one strong in-branch and one weak in-branch. Twisting first at the ADDI t gives:

(diagram: the tree after the first twist. The packed node LOADD LIT k - ADDI f - STORED t now feeds LOADD j, the source of ADDI t's strong branch, by a weak branch; the packed node LOADD f - ADDD LIT i - STORED u still feeds STOREI u weakly.)

The three-node linear section could now be packed, but it makes a more interesting example to twist the other fork back twice, so that the gross tree becomes:

We now have a second form of fork, with two weak in-branches. This can be resolved by breaking either weak in-branch with a twisting operation. This gives one long linear sequence, which can be packed into the single node:

LOADD f; ADDD LIT i; STORED u; LOADD LIT k; ADDI f; STORED t; LOADD j; ADDI t; STOREI u

The alternative twisting would have interchanged the first three PDP-8 instructions with the second three. Earlier

packings would have produced the same effects: one of the two same sequences would have been produced.

Forks with two strong in-branches are more difficult. One of the strong branches must be broken. A strong branch may be broken if the memory class that it represents is not fully allocated in the subtree connected to the other strong branch. This does not violate the presumptions of the allocation algorithm; however, it is possible that the scarce memory location assigned to the branch which is to be broken is also assigned in the other subtree, so that one of the assignments needs to be changed. Alternatively (and much simpler), the allocation should assign 'virtual' scarce locations as if they were plentiful, and introduce ALLOC and DEALLOC information: then, after a linear sequence has been created by the instruction-ordering code, a simple linear scan can use the ALLOC and DEALLOC information to direct a translation from 'virtual' scarce locations to assigned 'real' scarce locations.

When the instruction-ordering is started, there cannot be a gross tree node with two unbreakable strong in-branches: if there were, then the two subtrees attached to the branches would both be fully allocated in the same class of memory as both the branches, so that the node itself would be more-than-allocated in that class, which the prior allocation would not have permitted. (The allocator would have introduced a transfer sequence via a plentiful class of memory, thereby lowering the allocation on that branch to one. This suffices on any machine that does not have an instruction with two operands of the same class that may refer to different locations while the machine has only one location of the class - an impossible machine.) It is trivially easy to see that none of the instruction-ordering operations (packing, either form of twisting, or breaking one of a pair of strong in-branches) can introduce an unbreakable pair of strong in-branches. Thus all forks with two strong in-branches must have at least one of them breakable.

The following paragraphs deal with the cases of more than two in-branches to a gross tree node.

If there are two or more strong in-branches, then by extension of the above argument, all but one must be breakable. An algorithm for this is to identify one that is breakable with pairwise respect to all the others, and then to break it; and then to repeat until one unbroken strong in-branch remains. The broken strong branches are then treated as weak branches.

If there is a strong in-branch and two or more weak in-branches, then the weak in-branches are all twisted back up the strong branch. This leaves a packable strong in-branch.

If there are multiple weak in-branches and no strong in-branch, then one of the weak branches is selected arbitrarily, the others twisted back up it, and it is packed. This is equivalent to taking the in-branches and their weakly-connected subtrees in any order. A driver combining these cases is sketched below.
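The ordering loop can be sketched as follows, building on the pack and twist sketches above; breakability checking and the final virtual-to-real location translation are elided, as the ALLOC/DEALLOC alternative in the text permits:

def order(node):
    # Reduce a gross (sub)tree to a single node bearing the final
    # instruction sequence, using pack() and twist() from the sketches
    # above.  Assumes allocation used 'virtual' scarce locations, so
    # any strong branch but one may be broken (re-labelled weak).
    for _, child in node.inbranches:
        order(child)                               # leaf-to-root
    while node.inbranches:
        strongs = [b for b in node.inbranches if b[0] == "strong"]
        if len(strongs) > 1:                       # break all but one strong branch
            keep = strongs[0]
            node.inbranches = [keep] + [
                ("weak", b[1]) for b in node.inbranches if b is not keep]
        elif strongs and len(node.inbranches) > 1:
            twist(node)                            # weak branches go up the strong one
        else:
            pack(node, node.inbranches[0][1])      # linear: absorb the producer
    return node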

6.9 REMAINING TOPICS IN SECONDARY CODE GENERATION.

The techniques proposed in this chapter are not complete. In particular, target machine anomalies may be difficult or impossible to describe in IL, and so need handling by other methods.

The PDP-8 offers several examples of this. There is no LOAD instruction: one has to use the "add-to-accumulator" instruction after ensuring that the accumulator is zero; but LOAD instructions normally only occur in two kinds of place:

* immediately after a "deposit-and-clear accumulator" instruction, which is a STORE instruction with the side effect of clearing the accumulator - which is what is needed; or

* immediately after a jump or a label - in which case the LOAD must be translated as a "clear accumulator" instruction followed by the add.

Another PDP-8 anomaly is that the memory is structured into 128-word sections, and that any instruction may only refer to its own section, and to the section with addresses 0 to 127 (decimal): all other memory locations must be reached by indirection. Both the above anomalies can be dealt with by the better PDP-8 assemblers. An anomaly that is beyond an assembler is that eight of the locations in the first section of the PDP-8 memory are automatically incremented whenever their contents are used indirectly: this could be used in the proposed secondary code-generation method if special IL and MDL instructions were introduced to use these locations, and to

have the primary code generator do the assignment of these locations. This might be useful for "tuning" a particular compiler for a particular machine, but would defeat the portability objective for that primary code-generator.

Some decisions must be taken by the implementor for each combination of language and target machine: especially those concerned with the calling of subroutines, and return from and parameter passing to subroutines. Support libraries will need at least customising for particular combinations of language and machine, but endeavouring to keep them all in one high-level language in a source-preprocessable form could much reduce the customisation cost. To achieve this reduction for input-output and other very "machine-intimate" library routines would require an interface at IL or MDL level (19), plus a minimal machine-specific library that implements the interface; or alternatively would require machine-specific preprocessed source assembly code inserts in the high-level source of the library.

(19) A minimal such interface might be:
* open file, with machine-dependent parameters specifying which file, and its properties;
* read one binary unit (of the "natural" size of the machine) on a specified open file;
* write one binary unit;
* read one character;
* write one character;
* close a file;
* heap-allocate a fragment of space according to a parameter that points to a format specification block; and
* routines for the primary code-generator to create heap format specification blocks in a machine-independent manner.
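Rendered as code purely for illustration (every name and signature below is invented for the sketch, not taken from the thesis), such an interface might look like:

from abc import ABC, abstractmethod

class MachineIntimateLibrary(ABC):
    # Hypothetical rendering of footnote (19); one machine-specific
    # implementation of this interface would support the whole library.
    @abstractmethod
    def open_file(self, which, **properties): ...        # machine-dependent parameters
    @abstractmethod
    def read_unit(self, f): ...                          # one natural-size binary unit
    @abstractmethod
    def write_unit(self, f, unit): ...
    @abstractmethod
    def read_char(self, f): ...
    @abstractmethod
    def write_char(self, f, ch): ...
    @abstractmethod
    def close_file(self, f): ...
    @abstractmethod
    def heap_alloc(self, format_block): ...              # per a format specification block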

It is not clear how side effects of target machine instructions can be handled. In some cases the instructions concerned can be ignored: for example, the PDP-10 has an instruction that loads the value of a memory location to a register, and skips the next instruction if the value was zero: the instruction would simply not be described. Many machines have "block move" instructions, that copy one region of memory to another: not being able to describe these, and so being unable to generate them, loses a powerful machine facility. Most serious is when side effects cannot be ignored: for example, the setting of an "overflow" flag when arithmetic fails. High-level languages which impose requirements as to when such flags should be inspected are unlikely to be able to have a machine-portable IL.

The techniques proposed would need extension to produce binary relocatable output, as opposed to assembler-format output. This can be automated straightforwardly /WIC75/. Another input to the fabricator would be a description of the formatting rules for the relocatable form of the target machine, and in the descriptions of the individual instructions there would be statements of the binary values to be produced, relative to the format. Especially because of the need to have multiple descriptions of some instructions, this might be factored out from the MDL descriptions, to reduce repetition and to improve the legibility of the separate descriptions.

6.10 REVIEW OF THE PROPOSED TECHNIQUES.

Of the remaining topics of the last section, only the "overflow" problem raises new fundamental difficulties. The other remaining topics can all be dealt with fairly easily (if sometimes inelegantly), or ignored. Overall, the techniques proposed in this chapter suggest that it is viable to create cheap secondary code generators for orthodox languages and orthodox machines. THIS MEETS THE PRIMARY REQUIREMENTS OF CHAPTER 1.

In addition, the proposed techniques yield generators that are moderately fast, and produce fairly good local code. THE TECHNIQUES THEREFORE MAY BE SUITABLE FOR CHEAPER PRODUCTION-QUALITY COMPILERS, not merely for pilot compilers: this is an advantage that was not actively looked for, but was observed as a by-product.

Many details remain to be determined, some being quite major. For example, it seems likely that, because of the need for a subgoal, the translation should start from the root. This in turn suggests that the ideal organisation of the IL should be prefix polish, rather than the postfix polish of the example IL of the chapter. It is, for example, not clear when an IL can be treated as a simple macro-expansion, and whether this can be made more straightforward by a different order of the ILs (e.g. postfix as opposed to prefix).

The main requirement now is therefore to complete a pilot implementation of the full proposal. With experience gained from that, the next phase of the project can be planned.

CHAPTER 7

A PILOT SECONDARY CODE-GENERATOR SYSTEM

7.1 INTRODUCTION

This chapter describes a partial implementation of the methods proposed in the last chapter, sufficient to demonstrate the viability of the novel essential ideas.

7.2 The System

The pilot implementation is in the language POP-2, a language particularly suitable for pilot programs that use data structures.

The following description does not depend heavily on the features of POP-2, except in two respects: the POP-2 MACRO and COMPILE facilities. A POP-2 macro is a parameter-less function to be obeyed at the time the POP-2 program is compiled. The macro is called when the compiler reaches the name of the macro in its program input stream, and then discards that instance of the name. The macro function usually calls for input, which comes from the program input stream, and usually creates a stream of lexical tokens which it passes to the compiler. When the macro function returns to the compiler, the compiler takes the stream of tokens and prepends it to the remaining part of the input stream: that is to say, it continues by compiling first the returned tokens before resuming compilation from the input stream. This facility was used to create a macro called MI (machine instruction), which was used to incorporate an individual machine instruction description by doing the necessary fabrication, the description of the instruction following the token MI in the input stream to the

compiler. The COMPILE facility of POP-2 enables parts of a program source to be incorporated in files other than the principal input file. When the instruction COMPILE(filename) is encountered, the compiler continues by taking its input from the named file until that file is exhausted, when the input reverts to just after the COMPILE instruction. The MI instructions were put in a file that is COMPILE'd into the input stream: a listing of that file is given later. The following listing shows the POP-2 system being started and applied to a series of files. The lines starting with a colon are the main input stream of the POP-2 compiler, taken from the controlling terminal; the lines starting with two hyphens are outputs from COMPILE'd files, indicating progress, and naming the principal program components in the file.

.DO CF2V52 PDP8V6 TESTS
.R POP2

CITY POLY POP2 V3
setpop
: COMPILE([DUMP.POP]);
--DUMP COMPILING
--DUMP COMPILED
: COMPILE([CFUTIL]);
--CF UTILITIES COMPILING
: COMPILE([CFMD04]);
--COMPILING MACHINE DESCRIPTION CODE: MDINFIX, MDPR(MD), MDTOP(MD)
: COMPILE([CFMI12]);
--COMPILING .MIPR AND MACROS "MIBEGIN", "MI MNEM ARGS COST MD." AND "MIEND"
: COMPILE([CFCG52]);
--COMPILING .CGSPR .CGPRINT AND CGENERATE
--COMPILED .CGSPR CGPRINT AND CGENERATE
: COMPILE([CFWK52]);
--COMPILING .WALK AND MACRO "CG MDL."
--COMPILED .WALK AND MACRO "CG MDL."
: COMPILE([MDV2.MDS]);
: MIBEGIN
: COMPILE([PDP8V6.MIS]);
--PDP8.MIS COMPILING
--PDP8.MIS COMPILED
: MIEND

CF is an abbreviation of "compiler factory". DUMP is a debugging facility. CFUTIL is assorted utility routines. CFMD is routines to read, manipulate and print MDL trees. CFMI is the MI macro definition and its subroutines: that is, the fabricator. CFCG is the code generator. CFWK is the routines for a series of post-walks that complete the code generation. MD is a file containing calls to macros that specify the MDL operators to be used by CFMD, and whether they are monadic or dyadic. PDP8 is a file containing a description of some of the instructions of the PDP-8. (The subscript number on each file name indicates its version.)

The MIBEGIN and MIEND macros initialise and finalise the fabrication phase. Down to and including CFWK is the code of the system: the following files initialise it for a particular compilation.

MIEND outputs a listing shown later: then a final COMPILE proceeds to take calls on a further macro, CG, from another file. Each call of CG reads a following MDL to be code-generated.

For simplicity, the fabrication stage is very simple, and is incorporated in the same program as the secondary code-generator. The analysed machine description can thus be passed as a data structure: there is no need for the complexity of outputting it, and then inputting it. The production of the MDL tree from the IL is straightforward, and so is not described here, a simple parser of infix MDL expressions being used.

All the practical problems of introducing indirections to be subsequently grafted out have been removed by eliding out the indirection at a grafting. Instead of writing addresses in lower case, preceded by an indirection, the corresponding upper case has been used to signify the contents of the location.

The assignment operator is re-interpreted to ignore the implied indirection of a following capitalised simple name. Thus the PDP-8 ADDI instruction is subsequently described as:

AC + "M -> AC

whereas the last chapter describes it as:

"ac + ""m -> ac

Constants are handled by assuming that they exist in locations apart: so that in the same way that M represents the contents of a certain location, K represents the constant contents of a fictional location.

For simplicity of the MD routines for manipulating the MDL input trees and listings, all names are decorated. The notation of a name is a letter, a backslash character, and a decoration. The decoration is either a letter or a number. For names in which the decoration is unimportant, it is the same as the letter at the start of the name: the printing routine just prints the letter once, and no backslash. Thus A\A prints simply as A. For decorations introduced by the code generator, the decoration is a negative number. Thus M\4 refers to a location name input to the code generator, but M\-4 is a location introduced by the code generator to carry a transient result.

The examples use notations for the PDP-8 mnemonics, tabulated below, that are more faithful to the simpler PDP-8 assemblers.

Chapter 6    Chapter 7
STORED       DCA       (deposit and clear accumulator)
STOREI       DCAI
ADDD         TAD       (twos-complement add)
ADDI         TADI
LOADD        LOAD
LOADI        LOADI

As observed in chapter 6, LOAD and LOADI are really TAD and TADI, used where the accumulator will contain zero. The following listing is of the input file used to describe the PDP-8 for the subsequent examples.

In the examples of this chapter, the fabricator and code generator are configured by the MDL file for the MDL monadic operators ", $, - and +, and for the MDL dyadic operators &&, ++ and --. ($ signifies "not"; + and ++ signify "plus"; - and -- signify "minus"; and && signifies "and".) Note that, with the indirection operator no longer having a special grafting role, it is now an ordinary monadic operator. The dyadic assignment operator is now the only special operator.

The COMPILE(PDP8...) instruction reads the following file.

.TYPE PDP8V6.MIS
COMMENT PDP8V6 -- MI NOT ML, AND COSTS -- GENERIC/SPECIFIC NAMES;

--PDP8.MIS COMPILING

MI LOAD A\A M\M; 2 ( M\M -> A\A ).
MI DCA A\A M\M; 2 ( A\A -> M\M ).
MI TAD A\A M\M; 2 ( (A\A ++ M\M) -> A\A ).
MI TAD A\A M\M; 2 ( (M\M ++ A\A) -> A\A ).
MI AND A\A M\M; 2 ( (A\A && M\M) -> A\A ).
MI AND A\A M\M; 2 ( (M\M && A\A) -> A\A ).

MI LOADI A\A M\M; 3 ( "M\M -> A\A ).
MI DCAI A\A M\M; 3 ( A\A -> "M\M ).
MI TADI A\A M\M; 3 ( (A\A ++ "M\M) -> A\A ).
MI TADI A\A M\M; 3 ( ("M\M ++ A\A) -> A\A ).
MI ANDI A\A M\M; 3 ( (A\A && "M\M) -> A\A ).
MI ANDI A\A M\M; 3 ( ("M\M && A\A) -> A\A ).

MI LITERAL M\M K\K; 0 ( K\K -> M\M ).
MI ZERO Z\0 K\K; 0 ( Z\0 -> K\K ).
MI ONE U\1 K\K; 0 ( U\1 -> K\K ).

MI CLA A\A Z\0; 1 ( Z\0 -> A\A ).
MI CMA A\A; 1 ( $A\A -> A\A ).
MI IAC A\A U\1; 1 ( (U\1 ++ A\A) -> A\A ).
MI IAC A\A U\1; 1 ( (A\A ++ U\1) -> A\A ).

--PDP8.MIS COMPILED

Note the LITERAL instruction; also the special instructions ZERO and ONE for two special values which can appear as special cases of constants, or (by application of the instruction) as an ordinary literal. This was a trick to prevent the code generator introducing "other" constants of value one and zero with different decorations. In a production system, constants would need special handling. Each instance of MI is a macro call to read the assembler format of the instruction (for listing purposes), a semi-colon, a number, the MDL description of the instruction, and a full stop. The number is the cost of the instruction: in the above example it is the number of clock cycles taken to execute the PDP-8 instruction.

The fabricating done in the pilot system is as follows. The MI call identifies whether the instruction is a transfer, or another assignment to a simple address, or an assignment to a compound address.

If it is a transfer, it is first tried as a prefix and as a suffix to all transfers and transfer sequences; if the names match, the appropriate new transfer sequence is added to the list of transfer sequences. Finally the transfer itself is added to the list of transfer sequences. Thus after the last MI, all possible non-cyclic transfer sequences will have been constructed. The transfer sequences are one-level indexed by the combination of source and destination memory classes.

If an MI instruction input is an assignment to a simple address, it is two-level-indexed, first by the destination memory class, then by the principal operator of the source.

If an MI instruction input is an assignment to a compound address, it is two-level-indexed, first as an assignment to VOID, then by the assignment operator. Such instructions can only be used at roots, so efficiency of indexing is less important.

In a production system, all the indexing would be more sophisticated, to achieve lookup speed, especially for the assignments to simple addresses, which would also be separated from the assignments to compound addresses.

The MIEND macro finalises the fabrication, and lists the indexed instructions, as illustrated below. First the two-level-indexed items are shown, prefixed by the keys; then the transfer sequences are shown, prefixed by a key of form S»D, where S is the source class, and D the destination class. Transfer sequences and subsequences are listed in angle brackets.

MACHINE DEFINITION:

VOID
  ->  DCAI[A M]; 3  ( A -> "M )

A
  $   CMA[A]; 1  $A » A
  "   LOADI[A M]; 3  "M » A
  &&  ANDI[A M]; 3  ( "M && A ) » A
      ANDI[A M]; 3  ( A && "M ) » A
      AND[A M]; 2  ( M && A ) » A
      AND[A M]; 2  ( A && M ) » A
  ++  IAC[A U]; 1  ( U ++ A ) » A
      IAC[A U]; 1  ( A ++ U ) » A
      TADI[A M]; 3  ( "M ++ A ) » A
      TADI[A M]; 3  ( A ++ "M ) » A
      TAD[A M]; 2  ( M ++ A ) » A
      TAD[A M]; 2  ( A ++ M ) » A

Z » A  1; CLA[A Z]
U » K  0; ONE[U K]
Z » K  0; ZERO[Z K]
K » M  0; LITERAL[M K]
A\D » M\D  2; DCA[A\D M\D]
M » A  2; LOAD[A M]
Z » M\D  3; <<DCA[A\D M\D]<>CLA[A Z]>>[Z M\D]
U » M  0; <<LITERAL[M K]<>ONE[U K]>>[U M]
Z » M  0; <<LITERAL[M K]<>ZERO[Z K]>>[Z M]
K » A  2; <<LOAD[A M]<>LITERAL[M K]>>[K A]
U » A  2; <<LOAD[A M]<><<LITERAL[M K]<>ONE[U K]>>[U M]>>[U A]
A\D » A  4; <<LOAD[A M]<>DCA[A\D M\D]>>[A\D A]
M » M\D  4; <<DCA[A\D M\D]<>LOAD[A M]>>[M M\D]

7.3 A structured notation for trees which include alternatives.

This section describes the format of the listings in which the produced gross trees are listed in the later sections. Since the root of a subtree must be executed after all its branches (i.e. operands), it is listed after the operands. The root nodes of the operands of a node are indented 3 characters more than the node itself. For example:

   LOAD A\1= M\2= ( 2)
   DCA A\1= M\-1= ( 2)
AND A\1 M\-1 ( 2)

represents the gross tree in which the LOAD and DCA nodes are the operand subtrees of the AND node:

LOAD A\1 M\2  and  DCA A\1 M\-1  both feeding  AND A\1 M\-1

The equals signs mark operands which stand for themselves, that is, do not have subtrees. Since instruction ordering is not illustrated in this pilot system, the significance of the order of the operands of a node is only that they are presented in the order of the operand-names in the node that do not stand for themselves: that is, in the above example, in the AND, the A\1 stands for the LOAD subtree, and the M\-1 stands for the DCA subtree.

Alternatives are listed with the same indentation, but with a line of 3 hyphens (at the same indentation also) between them.

For example:

      LOAD A\-5= M\2= ( 2)
      DCA A\3= M\-8= ( 2)
      TAD A\-5 M\-8* ( 2)
   DCA A\-5 M\-1= ( 2)
   ---
      TAD A\3= M\2= ( 2)
   DCA A\3 M\-1= ( 2)
DCAI A\1= M\-1* ( 3)

represents the gross tree in which the M\-1 operand of DCAI A\1 M\-1 is produced either by the four-instruction subtree (LOAD A\-5 M\2 and DCA A\3 M\-8 feeding TAD A\-5 M\-8, feeding DCA A\-5 M\-1) or by the two-instruction subtree (TAD A\3 M\2 feeding DCA A\3 M\-1).

In the listings, the negative decorations represent introduced memory locations: the asterisks mark the point of introduction. If the output stream is to include ALLOC and DEALLOC pseudo-instructions, then before walking a subtree based on an asterisked name an ALLOC is to be output for that name, and after the walk a DEALLOC is to be output for that name.

The numbers in brackets represent costs: before the cost pass, of the node itself; after the cost pass, of the subtree of which it is the root.
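A sketch of a printer for this notation, assuming a simple dictionary representation in which each node holds its preformatted text line and, for each operand position, a list of alternative subtrees (an illustration only, not the pilot code):

def show(node, indent=0):
    # Operands are listed before their node, indented 3 more characters;
    # rival alternatives are separated by '---' at the same indentation.
    for alternatives in node["subs"]:
        for i, alt in enumerate(alternatives):
            if i:
                print(" " * (indent + 3) + "---")
            show(alt, indent + 3)
    print(" " * indent + node["line"])

# e.g. the first example of this section:
load = {"line": "LOAD A\\1= M\\2= ( 2)", "subs": []}
dca = {"line": "DCA A\\1= M\\-1= ( 2)", "subs": []}
root = {"line": "AND A\\1 M\\-1 ( 2)", "subs": [[load], [dca]]}
show(root)          # reproduces the indented listing shown above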

7.4 A simple example code generation.

This example is to generate from the MDL of the PDP-8 AND instruction, which is listed first. That is followed by a trace of the code generation, in which indentation represents the level of recursion.

( ( A\1 && M\2 ) -> A\1 )

GENERATE **DEVOID TO**
GENERATING: ( A\1 && M\2 ) FOR A\1
\ TRY: AND[A M]; 2 ( M && A ) » A CONFORMING A=>M\2 M=>A\1
\ \ XFERTRY GIVES: LOAD A\1, M\2
\ \ XFERTRY GIVES: DCA A\1, M\-1
\ TRY GIVES: AND A\1[LOAD A\1, M\2], M\-1*[DCA A\1, M\-1]
\ TRY: AND[A M]; 2 ( A && M ) » A CONFORMING M=>M\2 A=>A\1
\ \ GENERATION EQUIV: A\1 AS A\1
\ \ GENERATION EQUIV: M\2 AS M\2
\ TRY GIVES: AND A\1, M\2
GENERATED: AND A\1[LOAD A\1, M\2], M\-1*[DCA A\1, M\-1]; AND A\1, M\2

The initial objective is to translate the whole of the MDL for the target VOID: the first level of recursion detects that assignment is the principal operator and thus gets rid of the VOID target ("devoids") by altering its parameters to generate A\1 && M\2 for the target A\1. The GENERATING line then states what is to be generated and for what target; the matching GENERATED line lists what was generated (in a linear form). Within a "generation" level, a "try" is made for each candidate instruction with the correct target and principal operator: if the tree of the instruction does not conform to the MDL tree (or subtree) to be generated, it is not listed; if it does, then the candidate instruction is listed together with the desired conformances at the leaves. For example, the conformance A=>M\2 means that M\2 is to be generated for chosen target A. At the end of a try, the code for the try is listed, or the try is reported as having failed due to a conformance failing.

The conformances are generated by recursing on the GENERATING/GENERATED procedure. Two special cases of that procedure are: that it is a transfer that is requested - listed above as an XFERTRY - which simply looks up the table of transfer sequences and so either succeeds or fails quickly; and that a generation is requested from a class of memory to the same class of memory - listed as GENERATION EQUIV. If an EQUIV has identical decorations for the source and target, it generates a null subtree, as in the example above.

The listing of the code generation is followed by a repeat of the MDL being generated, and a structured listing of the code generated. For the above example, that was:

( ( A\1 && M\2 ) -> A\1 )

   LOAD A\1= M\2= ( 2)
   DCA A\1= M\-1* ( 2)
AND A\1 M\-1 ( 2)
---
AND A\1= M\2= ( 2)

This is followed by two more walks, discussed in the next section, and a third walk which adds the costs of the nodes in a subtree to give the new cost of the root of the subtree, and at alternatives eliminates all but the cheapest. After each of these three walks, if the walk modified the tree then the revised tree is listed in structured form. For the above example, the three-instruction alternative is eliminated by the costing, so the final listing is the following:

AND A\1= M\2= ( 2)

An important point to be drawn from the preceding example is that, as formulated in the last chapter, the optimisation methods would eliminate the three-instruction gross tree at fabricate time: it would not then be generated at secondary code-generation time.

7.5 Examples involving transfer sequences.

The following two examples are straightforward applications of transfer sequences at top level. The first is mainly of interest as compared to the first example of 7.6.

( M\1 -> A\2 )

GENERATE **DEVOID TO**
XFERTRY GIVES: LOAD A\2, M\1

( M\1 -> A\2 )

LOAD A\2= M\1= ( 2)

The second example illustrates a three-instruction prefabricated sequence.

( U\1 -> A\9 )

GENERATE **DEVOID TO**
XFERTRY GIVES: LOAD A\9, M\-1*[LITERAL M\-1, K\-2*[ONE U\1, K\-2]]

( U\1 -> A\9 )

      ONE U\1= K\-2= ( 0)
   LITERAL M\-1= K\-2* ( 0)
LOAD A\9= M\-1* ( 2)

7.6 Retargeting

In the example of section 7.4, transfers from a memory class and decoration to the same class and decoration were observed to be null. However, it can arise that there is a source and target of the same memory class, but with differing decorations. In the pilot system, this generates a node with operator -> followed by the two operands, 'from' then 'to'. These nodes are eliminated in the two following walks (which were separated out to enable distinct before-and-after listings of the walks, thus facilitating study of the effects of the walks).

The first of the walks deals with a special case; the second walk only applies at leaves, at which the destination of the assignment differs from the "natural" destination of

the assigned. This is illustrated in the following example.

( A\1 -> A\2 )

GENERATE **DEVOID TO**
GENERATION EQUIV: A\2 AS A\1
*** RESPECIFIED *** A\1 TO A\2

( A\1 -> A\2 )

-> A\1= A\2= ( 0)

CGS WALKING ON ->, A=>**UNDEF**
ASSOC: A\D » A\1
ASSOC: A » A\2
CGOUT: LOAD A\2, M\-1*[DCA A\1, M\-1]

   DCA A\1= M\-1= ( 2)
LOAD A\2= M\-1* ( 2)

The zero costs disappear before the costing walk. The RESPECIFIED line draws attention to the transfer-to-same-class. The lines between the two tree listings trace the development of the association of the formal and actual parameters of the transfer sequence:

formal/actual    formal/actual
A\D  UNDEF       A  UNDEF      in the fabricated instruction
A\D  A\1         A  UNDEF      after one association
A\D  A\1         A  A\2        after two associations
A\1  A\1         A\2  A\2      after renaming

The renaming of the formals is to obtain the desired formal names, rather than the formal names appearing in the prefabricated instruction. (A similar association process is applied for the conformances desired for a try.)

A subtlety is that in cases such as the above, the "natural destination" must not be corrupted: this is why the target must be used as a temporary, rather than some conflicting source leaf. The retargeting must therefore have no effect at the root, but be pushed back in the tree, as a renaming process, to the conflicting leaf (or leaves), and the prefabricated sequence for transferring that memory class to itself applied there.

The special case is one in which conflict can be avoided: a source or target given in the original input MDL conflicts with an introduced transient-result location of the same class. This can only happen in the case of the generation of a subtree failing, so that a transfer sequence is tried to find another memory class for which the subtree can be generated. In the following listing this is indicated by the SPLITTRY lines: a TRY having failed (and so not been listed), the source->target objective is split into source->intermediate->target.

There are two new failure messages in this example. CANT SPLIT... GENERATED NOWT means that a generation failed because no TRY succeeded, and there were no possible intermediates with which to achieve a split. GENERATION MISMATCH means a conformance from one memory class to another had no prefabricated transfer sequence.

( A\1 -> "( M\2 ++ A\3 ) )

GENERATING: ( A\1 -> "( M\2 ++ A\3 ) ) FOR VOID
\ TRY: DCAI[A M]; 3 ( A -> "M ) CONFORMING M=>( M\2 ++ A\3 ) A=>A\1
\ \ GENERATION EQUIV: A\1 AS A\1
\ \ GENERATING: ( M\2 ++ A\3 ) FOR M\-1
\ \ \ SPLITTRY: Z » M 0; <<LITERAL[M K]<>ZERO[Z K]>>[Z M]
\ \ \ \ GENERATING: ( M\2 ++ A\3 ) FOR Z\-2 ***CANT SPLIT
\ \ \ \ GENERATED: **NOWT**
\ \ \ SPLITTRY RETARGET FAILED
\ \ \ SPLITTRY: U » M 0; <<LITERAL[M K]<>ONE[U K]>>[U M]
\ \ \ \ GENERATING: ( M\2 ++ A\3 ) FOR U\-3 ***CANT SPLIT
\ \ \ \ GENERATED: **NOWT**
\ \ \ SPLITTRY RETARGET FAILED
\ \ \ SPLITTRY: Z » M\D 3; <<DCA[A\D M\D]<>CLA[A Z]>>[Z M\D]
\ \ \ \ GENERATING: ( M\2 ++ A\3 ) FOR Z\-4 ***CANT SPLIT
\ \ \ \ GENERATED: **NOWT**
\ \ \ SPLITTRY RETARGET FAILED
\ \ \ SPLITTRY: A\D » M\D 2; DCA[A\D M\D]
\ \ \ \ GENERATING: ( M\2 ++ A\3 ) FOR A\-5
\ \ \ \ TRY: IAC[A U]; 1 ( U ++ A ) » A CONFORMING A=>A\3 U=>M\2
\ \ \ \ \ YEILDER OF A\3 AT A=>A\3
\ \ \ \ \ ***** YEILDER ***** A\3
\ \ \ \ \ GENERATION EQUIV: A\3 AS A\3
\ \ \ \ \ GENERATION MISMATCH: M\2 AGAINST U\-6
\ \ \ \ TRY GAVE NOWT
\ \ \ \ TRY: IAC[A U]; 1 ( A ++ U ) » A CONFORMING U=>A\3 A=>M\2
\ \ \ \ \ XFERTRY GIVES: LOAD A\-5, M\2
\ \ \ \ \ GENERATION MISMATCH: A\3 AGAINST U\-7
\ \ \ \ TRY GAVE NOWT
\ \ \ \ TRY: TAD[A M]; 2 ( M ++ A ) » A CONFORMING A=>A\3 M=>M\2
\ \ \ \ \ YEILDER OF A\3 AT A=>A\3
\ \ \ \ \ ***** YEILDER ***** A\3
\ \ \ \ \ GENERATION EQUIV: A\3 AS A\3
\ \ \ \ \ GENERATION EQUIV: M\2 AS M\2
\ \ \ \ \ *** RESPECIFIED *** A\3 TO A\-5
\ \ \ \ TRY GIVES: -> A\3[TAD A\3, M\2], A\-5
\ \ \ \ TRY: TAD[A M]; 2 ( A ++ M ) » A CONFORMING M=>A\3 A=>M\2
\ \ \ \ \ XFERTRY GIVES: LOAD A\-5, M\2
\ \ \ \ \ XFERTRY GIVES: DCA A\3, M\-8
\ \ \ \ TRY GIVES: TAD A\-5[LOAD A\-5, M\2], M\-8*[DCA A\3, M\-8]
\ \ \ \ GENERATED: -> A\3[TAD A\3, M\2], A\-5; TAD A\-5[LOAD A\-5, M\2], M\-8*[DCA A\3, M\-8]
\ \ \ SPLITTRY OK
\ \ \ SPLITTRY: K » M 0; LITERAL[M K]
\ \ \ \ GENERATING: ( M\2 ++ A\3 ) FOR K\-9 ***CANT SPLIT
\ \ \ \ GENERATED: **NOWT**
\ \ \ SPLITTRY RETARGET FAILED
\ \ GENERATED: DCA A\-5*[ -> A\3[TAD A\3, M\2], A\-5; TAD A\-5[LOAD A\-5, M\2], M\-8*[DCA A\3, M\-8] ]
\ TRY GIVES: DCAI A\1, M\-1*[DCA A\-5*[ -> A\3[TAD A\3, M\2], A\-5; TAD A\-5[LOAD A\-5, M\2], M\-8*[DCA A\3, M\-8] ]]
GENERATED: DCAI A\1, M\-1*[DCA A\-5*[ -> A\3[TAD A\3, M\2], A\-5; TAD A\-5[LOAD A\-5, M\2], M\-8*[DCA A\3, M\-8] ]]

( A\1 -> "( M\2 ++ A\3 ) )

         TAD A\3= M\2= ( 2)
      -> A\3 A\-5= ( 0)
      ---
         LOAD A\-5= M\2= ( 2)
         DCA A\3= M\-8= ( 2)
      TAD A\-5 M\-8* ( 2)
   DCA A\-5* M\-1= ( 2)
DCAI A\1= M\-1* ( 3)


The problem causing the retargeting in this example is that, having introduced a temporary location A\-5 (not knowing any better) at the second-level DCA SPLITTRY, then at the TAD where the memory classes are EQUIV it is discovered that in that alternative A\-5 is to be identical to A\3 - but not in the other alternative. The walk that renames the A\-5 to A\3 must therefore (as it returns from recursion) split the DCA into two alternative DCAs, one with operand A\-5, the other with operand A\3:

      LOAD A\-5= M\2= ( 2)
      DCA A\3= M\-8= ( 2)
      TAD A\-5 M\-8* ( 2)
   DCA A\-5 M\-1= ( 2)
   ---
      TAD A\3= M\2= ( 2)
   DCA A\3 M\-1= ( 2)
DCAI A\1= M\-1* ( 3)

(Note: in this example, the renaming walk has reversed the order of the two alternatives.) The final costing walk eliminates the 5-instruction sequence:

      TAD A\3= M\2= ( 2)
   DCA A\3 M\-1= ( 4)
DCAI A\1= M\-1* ( 7)

7.7 Two realistic examples.

The two examples in this section both yield optimal code for the input MDLs: the first does so quickly, and with no need for alternatives or retargeting.

( ~a\? -> "h\9 >

generating: ( ~a\? -> -m\8 > for void \ try : dcaica m3» 3 < a -> "h ) conforming m=>m\8 a—>"*a\9 \ \ generating: ~a\? for a\-i \ \ \ try: loadica mil 3 "h » a conforming m= >a\9 \ \ \ \ generation equiv: a\-l as a\-l \ \ \ \ xfertry gives: dca a\9, m\-2 \ \ \ try gives: loadi a\-l» m\-2»cdca a\9» m\-23 \ \ generated: loadi a\-l» m\-2*c0ca a\9 r mn-23 \ \ generation equivj m\3 as m\8 xb \ try gives: dcai a\-1*cl0adi a\-l» m\-2*cdca a\9» m\-233» M\3 generated: dcai a\-1*cl0adi a\-l» m\-2*cbca a\9» m\-233» m\3 ,292

( ~AN9 -> "M\8 )

DCA A\9= M\-2- ( 2) LCADI A\-1- MN-2* ( 3) DCAI AN-l* M\3 = C 3)

In the above example there were no failures, either in TRY,

XFERTRY, or SPLITTRY. Code generating a larger realistic

MDL instruction:

((~M\l++("M\2++M\3)) "(M\4++M\5))

produces no retargeting, and 9 alternatives of which 7 are

eliminated at the costing stage:

Before costing: After costing:

LOAD A\-1= M\3= ( 2) TADI A\-l MN2= ( 3) LOAD A\-l= M\3= ( TADI AN-l MN2 = ( 5) LOAD A\-1= MN3 = ( 2) LOADI AN-1= MN2= LOAD I A\-19- M\2 = < 3) TAD AN-l MN3= ( 5) DCA AN-l?* M\-15= ( 2) TADI AN-l MN1* ( 8) TAD A\-l MN -15* < 2) LOAD AN-52= MN5= TAD AN-02 MN4= ( 4) LOADI A\-l= MN2= ( 3) TAD A\-l M\3= ( 2) LOAD AN-52= MN4= TADI A\-I MN1 = (3) TAD AN-02 M\D= ( 4) DCA AN-52* MN-2 = < 6) LCAD A\-l= MN3= ( 2) TADI A\-l MN2= ( 3) DCAI AN-i* MN-2* ( 17)

LOAD AN-1 = M\3= ( 2) LOADI A\-28= M\2= ( 3) DCA AN-23* M\-Z4= ( 2) TAD A\-l MN-24* ( 2)

LOADI A\-l= MN2= ( 3) TAD A\-1 MN3= ( 2) LOADI A\-33= MN1 = ( 3) DCA AN-33* MN-21= < 2) TAD A\-l MN-21* < 2)

LOADI AN-1 = MN 1 = ( 3) LOAD AN-39= M\3= < 2) TADI A\-3? M\2* ( 3)

LOAD AN-3?= M\3- ( 2) LOADI AN-46- MN2= ( 3) DCA AN-46* MN-42= < 2) TAD AN-39 MN-42* < 2)

LOADI AN-39- MN2= ( 3> TAD AN-39 H\3 = < 2) DCA AN-39* M\-35= < 2) TAD AN-l M\-35t ( 2) LCAD AN -02= M\3= ( 2) TAD AN-52 MN4* ( 2)

LOAD A\-52~ MN4= ( 2) TAD AN-52 MN3= ( 2) DCA AN-5Z* MN-2- < 2) DCAI AN-l* MN-2* < 3) The listing of the code generation trace is so long that it is relegated to appendix S. It is interesting to note in it that only about one in three trys fail - but in failing often take successful subordinate trys with them - but that about

4 out of 5 of the splittrys fail. This suggests that two- level indexing suffices for trys, but not for splittrys.

On the other hand, when a splittry fails, it fails quickly, whereas a failed try might be very expensive. It is not clear what improvements in the code (e.g. higher levels of indexing) or re-orderings of trys (so that failures tend to come before siblings that would be expensive) would make the secondary-code-generator faster. However, figures such as the above suggest that the method can be made very efficient, provided the optimisation method can avoid sufficient alternatives that the combinatorial explosion of the number of remaining alternatives is contained. CHAPTER 8

OTHER USES OF -VfflE TOOLS

8.1 INTRODUCTION

This chapter briefly lists some interesting potential uses of the CF1 and CF2 software tools. For lack of time, none of these potentials have yet been investigated more than superficially.

In general, CF1 could be easily used for any parse-dependent text transformation that is not too intelligent* For example, the transform notation is far to primitive to describe transformations to programs that may significantly change their algoritms, as does the system of /DAR 8_1/. However, even with elementary transforms, the possiblities are mainly limited by the imagination, as evidenced by the humber of possibilities listed in 8.2 and 8.3.

CF2 is less likely to be so versatile, but 8.4 lists some possiblities. Section 8.5 presents another possible use for CF2 and CF1 in (reversed) combination, and 8.6 comes full circle to the origins of CF1.

All of the uses discussed in section 8.2 to 8.6 are oriented to programming languages. Sections 8.7 and 8.8 mention two other application domains. ,296

8.2 SOURCE PROGRAM TRANSFORMATION

There are two primary reasons to transform a program into

another in the same language: either to y^C^ld a faster

program (8.2.1) or to yt&ld a more understandable program

(8.2.2). Usually these objectives conflict.

There are also some applications and motivations for

translating into a simpler underlying language, such as

'language contraction'.

Sometimes there are hybrids of these motives: for example

/BOY727 describes a system for translating into a contracted

language for re translation back into a more understandable

full-language program, or into an optimised contracted-

language program for eventual execution. Such a combined

system would be a natural application for CF1.

One oddity of the literature on program transformation is

that it tends to be uuv*.. ' as to whether it is oriented

to speed improvement or legibility improvement. One project

close in technique to CF1 was the Irvine Program Transformation

Project /KIB77/, /STA76,76• ,76^7 in these documents they address the technology in detail, but limit the motivation

to isolated sentences using the word 'improvement',

(Perhaps the problem is that they wished to capture all forms of improvement: perhaps it is the difficulty of separating the two forms of improvement.) ,297

Apart from 'improvement', source-to-source transformation could be used for purposes of language extension: this is discussed in section 8.2.3.

8.2.1 Program transformation for simplification

Some optimising transformations also y\2ld a more readable program: for example the re-arrangement of disordered blocks to remove logically redundant goto's described in section

2.2.2.1 (page 26). This is an area on which very little has' been published, and yet would seem attractive. (Perhaps it is the case that the environments in which such work might be considered are precisely those in which it is less likely to be needed.) The work that does appear to have been done is often specific to one technique, and motivated by other considerations: e.g. /BIR77/ takes an approach to the introduction of recursion into a program apparently motivated more by intellectual curiosity than for the resulting program and whether it is more legible.

The one major form of simplification that has an extensive literature is Algebraic Simplification: _/M0S7_l/ is a good entry to the topic, if slightly dated.

To summarise: THIS IS AN AREA MERITING INVESTIGATION: CF1 would be of assistance. The major difficulty would probably be that of locating a source of suitable 'bad' programs. ,298

8.2.2 Program transformation for optimisation

A variety of systems have been proposed: some specific to

a single optimisation (e.g. /AUS787, /BIR77I,797, /D0N797;

some to a small group of optimisations (e.g. /BAG70/);

some more general in their capabilities (e.g. /SCH74/,

/L0V77/, /RUT77/). The two last cited systems are

particularly relevant to the present work, both being

systems to which transformations are described, rather

than being built-in.

8.2.3 Program transformation for language extension

Often it is the case that in a pre-existing language, it

is possible to write the program desired, but there is an

obvious syntactic extension that would render the program

more comprehensible. For example, in both CF1 and CF2

there was a heed for a programming language construct

equivalent to the following pattern in Simula.

begin index x; comment the following loop uses a control variable of some type 'index' - often a pointer type, sometimes an integer; x := initialisation; while not found(x) do begin if last(x) then goto failed; x:=next(x); end; comment if we get here, we have x being the first value such that'found' is true of it; success(x); goto exit;

failure; exit: end of searching loop; ,299

The construct makes a search from an initial value until some test is'found to be true in which case 'success' code is executed- or until the search fails, in which case

'failure' code is executed. A suitable language extension might be:

SE^ftCH FROM x:= initialisation STEP x:= next(x) UNTIL last(x) IF found(x) THEN success(x) ELSE failure;

In CF2, written in POP-2, this was possible, POP-2 supporting source language macros, but in CF1 this could have been achieved by a pre-processor written using CF1.

Various aspects of source language enrichment have been suggested in the literature, for a variety of motives, varying from new constructs specific to a type of problem

(as above), to the concept of simplifying the programming language problem by providing a core language designed to be used only as the target of syntactic extensions as above. Notable contributions are /BEN64/ , /BIT74.75/,

/CAM78/, /SHA747.

8.3. MISCELLANEOUS PROGRAMMERS AIDS

This section presents a variety of applications for syntax- based transformation from program source that are of assistance in the writing and documenting of programs. The first two yeild programs equivalent to the original: third for preparing programs, the final three for extracting useful information about the program. ,300

8.3.1 Prettyprinting

For structured language, organisation of the program text

so that each line is a logical unit, with indentation such

that the left margin reflects the program structure has been

found to improve the legibility. A program that automates

this indentation is called a prettyprinter.

Many "prettyprinters" have been written for various languages,

especially in the last decade: e.g. /CLX78/, /C0N70/.

The best prettyprinters use extensive syntactic knowledge.

CF1 could be a tool for supplying that knowledge cheaply.

8.3.2 Program profiling

For a variety of purposes - notably identifying which parts

of a program most require "improvement" (in some sense)-

it is desirable to have estimates of the fraction of runtime

spent in each fragment (say, a line) of a program: or

alternatively the number of times the line is executed

during a 'typical' run (if 'typical' can be defined).

/CHE78 / , /ELS787, /ING727/R.UT777, /SAL757, /WEG757-

This information is of use also to some optimisers/BAL 79/.

These statistics are often post-processed onto a listing

of the program, as a column of figures in one of the margins.

THE "width" of the column gives a logarithmic profile of the

frequency of use of the adjacent code, so the technique has

become known as 'program profiling'. Few compilers support this facility. It is therefore commonly.acheived by preprocessing the program to insert

"counting" instructions and count-output code, by running

the program to get the raw statistics, and finally post-

processing the raw statistics onto a listing of the source.

Both the pre- and post-processing requires syntactic

knowledge, and so are natural applications of CF1.

8.3.3 Language specific editing

Editors that are specific to a single programming language

are now becoming available. They have several advantages,

including automatic pretty-printing (pretty-editing?) of

the source.. They make it difficult to create program sources

containing syntax errors.

Such a tool based on CF1 or another facility for syntax definition would have the additional advantages of programmer-portability and cheapness of implementation of related editors for a series of languages. A good example is _/MED8_l /.

8.3.4 Flowcharting

The best automatic program-flowcharters and related tools depend heavily on syntactic knowledge: again CF1 could be used to implement related flow charters for a series of languages.

8.3.5 Cross-referencing

The best automatic program-cross-references depend heavily ,302

on syntactic knowledge: again CF1 could be used to implement

related cross-references for a series of languages.

8.3.6 Portability aids

A variety of tools have been devised to assist in various

portability problems.

Validators examine a program to discover non-portable and/or

non-standard features /J0H777, /TRI787-

i

Bridgeware converts a program source from one language to

another, or one machine-dialect of a language to another /BR07l7.

Validators and bridgeware of high level languages depend

on syntactic knowledge, and so could be implemented using CF1.

8.4 OTHER APPLICATIONS OF CF2

CF2 does not have such a range of alternative uses as CF1,

mainly because it is more specific and more generative

J(whereas CF1 is a generalised compiler of a flexible language) .

Consequently, whereas the applications suggested for CF1

tend to require use of an unmodified CF1 with the output

combined with hand-written code, the following suggestions

tend to imply modified versions of CF2 itself, and to be

more nebulous in nature.

8-4.1 Verification of programs and translations

There are several possibilities for taking a translated

program and converting back to a form better capable of being understood as the "effect" of the code. This could be

an editedgross tree (or a combination of gross trees), or a

symbolic execution^or a hybrid form. In any case the

"translation back" would have features akin to the inversion

of a machine-code-to-MDL translation, as discussed in section

3.2.2. (pages 70-71).

/"30Y77J, £RUT77J, all suggest variations on this : concept, either for validating the program against some requirements, or for comparison of its effect against that of the original source , in order to validate the translator

"in the case of a certain execution only.

8.4.2 Assembler-level bridgeware

When an installation converts from one machine to another, a major problem is often the conversion of assembler level programs. Tools to assist in this can be devised, and often convert around 95% of the programs. The remaining 5% - and particularly locating the remaining 5% - requires human skill.

Using a variant of CF2 for this purpose has the advantage that the tool would be configurable to the pair of machines, and so could be created more cheaply. However, CF2 is designed for translation from an IL, therIL in turn being oriented to such use: a machine language does not have such an orientation. An IL design would avoid features that are diffrcult to translate from , such as the condition codes found in many machine languages. Thus, using CF2 from ,304

machine 'code1 will cause new problems. In purpose-written

bridgeware there will be specific code to deal with such

problems. Hence a CF2 variant will either need dedicated

modifications, or will fail on maybe 10% of the conversion.

8.4.3 Verification of architecture descriptions

Variations on the techniques of the above two subsections

could be used to construct a tool to emulate programs in

the machine codes of hypothetical machines.

/BAR78/ gives a good description of such a method and its

usefulness.

8.5, DE-COMPILATION

Using CF2 to translate from machine code to an IL (as in

subsections 8.4.1, 8.4.2) with the IL organised to appear

structured, then using CF1 to transform the IL to nicer

syntax, could de-compile programs. The main applications

of this are for 'rescuing* programs of which the source

has been lost, or of doing the bulk of the work of converting

a program written in assembler to a higher level language.

There are problems in this technique. Again, 5% of the

code will not be properly translated to IL. Also, in

compilation, the compiler discards much information,

such as symbolic identifiers: a de-compiling system cannot

create intelligent names, and so needs them to be manually

supplied. (One could say that compilation is an entropic

process, so that de-compilation requires external work ,305

to be sacrificed in order to apparently reverse that entropy.)

The topic is further discussed by /H0P78/, /SCH74/.

8.6 LANGUAGE DEFINITION

The CF1 transform notation was conceived in 1970 as a means

of defining BCPL and extensions to BCPL by means of transforms

ultimately into a "well understood" pseudo-machine language

_/LES70/. If the target is still not sufficiently basic, the

CF2 notation is a means of describing it in MDL (which is

very basic in nature). Thus, the concept of the CF1 notation

as a definitional tool stands, and has implicitly been

extended by the CF2 notation.

This is similar in principle to the Vienna Definition Language

(VDL) for PL/1 /LUC697, /WAL697, /WEG727, which defines PL/1 by translation to the language of an interpreting machine,

and by a clear statement of the behaviour of the interpreting machine.

8.7. OTHER POSSIBLE CF TOOLS

This section lists some tools to extend the compiler factory, which could be more easily written using CF1 itself.

If a syntax is presented to CF1 in a non-LL form, CF1 will fail. There is a need for a tool to read CF1 lexic and syntactic divisions and to check that they contain no left recursions and require no left factorisation. The tool' could easily remove some recursions and factorisations, but others ,306

require re-writing of the transform case patterns (which

is probably not mechanisable-, involving quite sophisticated

judgements.)

For large CF1 inputs a structured layout editor and a

crossreferencer would be a considerable development and

maintanance aid. Cross-referencing a transform division

would output a much more elaborate document than

orthodox program cross-references.

As discussed in subsection 4.4.10 (pages 111-113), there

is a need for tools which examine a set of transforms

forr termination. Many subsets can be shown to terminate

using methods such as those of subsection 4.4.10, so the

tool would either certify the whole set, or draw attention

to cyles of cases that either cannot terminate or might

not terminate.

Note that new tools operating on the transform division

would be most simply written to read the output of the

interpretive parser, rather than reading the "raw"

transforms, thus avoiding any requirement for an inbuilt

interpretive parser in the new tools.

As discussed in section 5.5 (page 155), the output of

CF1 can contain major inefficiencies, such as duplicated

tests arising from commonalities in sibling matching-patterns,

the commonality not being present in the parent. A Simula-toSimula optimiser specific to these inefficiencies could accellecate some CFl-output programs to around double their speed.

8.8. OTHER APPLICATION DOMAINS

Other applications of the tools are limited mainly by the imagination of the implementor. CF1 in particular could find use in any domain that requires transformations from a parsed input, e.g. the symbolic solution of Ordinary

Differential Equations /WIL71/. CHAPTER 9

SUMMARY

Come, let us go down, and there confound their language, that they may not under- stand one anothers speech. Genesis XI, 7

9.1 INTRODUCTION AND OBJECTIVES

The modern Babel of programming languages arises because we

do not know quite what we want from a programming language.

Each language needs a compiler for each machine on which it

is to be used, and writing a compiler takes man -years.

To even study the problem by language experimentation requires

many pilot compilers, which cannot be afforded for such

experiment. Most languages are therefore designed on hope

and guesswork, which cannot even be amended by early experience.

Study of the previous work on methods for cheapening the

creation of compilers suggested that the major difficulty

is that we have not fully identified the problems, especially

in the area of code generation. On this, we do have a mass

of experience - most of it little above the level of folklore.

We have phrases like 'Compiler compiler' and 'Compiler generator', representing ideals that have not even been clearly described, let alone acheived.

The initial literature survey gave little guidance, except to

suggest pitfalls to avoid. The only positive leads were a

few very isolated pieces of work started in the 1970's, and

addressing the problem in a very fragmented manner. ,310

This work has therefore aimed at being a radical departure:

to attempt an analysis of the whole problem, and to first-

draft a solution to it.

9.2 THE ANALYSIS

The key conclusion was that three input descriptions were

identified as the characterising property of the desired

toolset:

(1) a description of the lexis and syntax; of

the language to be compiled;

(2) a description of the meaning of each syntactic

construct; and

(3) a description of the machine code.

The detailed design attempted implements (3) as a description

of the effect of each instruction in a "machine description

language" (MDL), and of (2) as a two-part description - first

of the effect of each syntactic construct in an intermediate

language (IL), then of each IL instruction in MDL. The double

use of MDL gives the relationship of the language description

to the machine description.

Another key aspect is that none of the descriptions are

programs in an orthodox sense, being composed of many un-related

fragments, and with no concept of "program control". The

relations required between the fragments are deduced by the

tools to which they are input, from any inter-fragment

dependencies. This high degree of separation permits much

simpler detailed implementations. The key analysis is that of section 2.2.2.4 (page 29) which

separates the phases of the desired stereotype of a compiler.

The first-draft tools cover the essential parts of that

stereotype, so far as it is clear. At the points at which

the stereotype becomes unclear, there has been no attempt to

force a stereotype: the old-fashioned hand-written code methods will be used until they suggest a stereotype. Thus, experience has been used where possible, but elsewhere more experience is persued.

In general, whereas parsing is well understood, code generation is not. Therefore, most of the emphasis has been placed on code generation. The approach has been very pragmatic, searching for safe foundations on which theory might later be based.

9.3 THE DRAFT SOLUTION

The final sections of chapters three to seven draw many detailed conclusions: this section addresses issues covering more than one of those chapters.

The final sections of chapter two gives a detailed view of a monolithic Compiler Fabricator. Relating the compiler stereotype of 2.2.2.4 to the various descriptions suggests two. tools:. one...' to create compiler "front-ends" which translate from source to intermediate language, the other to create compiler "back-ends" which translate from intermediate language to machine code.

The key observation was that some language-knowledge and most machine-knowledge is required during one pass: all other language-knowledge can be separated out. Intermediate language ,312

programs should therefore embody all the knowledge of the

original source to be considered together with the machine

code description.

Putting the ideas to the test of implementation and initial

uses of the implementation has not shown any fundamental

problems, save one: the tools themselves being a kind of

compiler, they have been expensive to write - about 1V4,

man years. This merely emphasises the problem, rather than

suggesting anything about the attempted solution. The parts

of the tools analysed but not yet implemented are concerned

with efficiency, rather than feasibility.

Thus, so far, it seems that the tools are potentially useful:

considerable experience will be needed to determine the extent

of that potential.

9./J. BEYOND THE OBJECTIVES

Creating tools to meet the original objectives of simplifying

the creation of pilot compilers has in some respect led to

passing those objectives. In particular, major possibilities

for the optimisation of secondary code-generation were observed,

and are quite unlike any known published work. These possibilities

may prove viable as the basis for compiler back-ends of production

quality, with advantages of consistency and cheapness across

machines and languages. 9.FUTURE WORK

The current tools were intended as first drafts. Use will be

needed to clarify the requirements for their enhancement or

replacement. The only clearly established deficiency so far

is that LL(1) is an impractical restriction on the freedom of

syntax-writing: for most current languages LL(2) would be much

more desirable, for example to disambiguate procedure call

statements from assignment statements without left-factoring

"name" out of the syntax.

There are also theoretic problems to be addressed, notably

termination in sets of CF1 transforms, should that prove to be

a practical problem. The "level" to be desired in an inter-

mediate language needs clarification: set too high, it may

restrict possibilities for writing optimisations in CF1 notation

and may lead to explosion of the number of possible trees to be

examined by the fabricator (or it may lead to faster optimal

compilation of special cases); set too low it would make it very

difficult to use "long" machine instructions such as block-moves,

or block-searches.

9.6 CONCLUSION

Many shall run to and fro, and knowledge shall be increased.

Daniel XIII

Only the two projects described in appendix A sections 1 and 2 are known to have led to tools akin to those of the present

thesis. The three projects overlapped In time, all proceeded

from radical attempts to clarify the problem, and have been limited successes serving more to demonstrate the magnitude ,908

of the problem than its inherent difficulty. That all three

started' in the mid-seventies suggests, perhaps, that the time

was ripe for such efforts. The key concept in all three is

that compilers are too complex to program cheaply; that the

complexity is of the number of inter-actions to be analysed

and that such analysis can be automated; hence, that the

implementor ought to describe the requirements - not the

solution.

The instruction choice method of this thesis and of the PQCC

project are both related on that of Wadilew jfahS72j (see also

appendix A, section A.l). Apart from this, the three solutions

differ greatly in detail.

The common conclusion of the three works is that the automation

of the production of the common parts of compilers is now a

probability, whereas before it was only a possibility.

The thing can be done, " said the Butbher," I think. The- thing must be done,. I am sure. The thing shall be done! Bring me paper and ink, The best there is time to procure." (Lewis Carroll, "The Hunting of the Snark", Fit the fifth, stanza 13, MacMillan & Co, London 1876.) APPENDIX A

ALLIED WORK

This appendix consists of three sections.

The first two sections each review a project akin to that ~ • reported in this thesis: that is with the object of creating a compiler fabricator which ultimately would produce a complete compiler from descriptions of the machine and language.

Both of these projects have a philosophy for compiler compilation, though not as extensive as that of chapter 2

(mainly because they are both more based on prior works which at most only groped towards such a philosophy). The documenta- tion, of each suggests the conclusion that they suffer traditional problems by having missed two major points: that register allocation for temporaries and instruction choice are best handled together (appendix C); and that it is desirable to have a fabricator phase that considers the IL and the machine instructions together (section 2.3.1). They both separate choice and allocation, IL and machine.

The third section reviews other work allied to this' project," but'- of more fragmentary nature. ,316

A. 1 PQCC AND WAS I LEV;

The Production Quality Compiler Compiler (PQCC) project at

Carnegie-Mellon University has the aim of automating the

production of Production Quality Compilers (PQC's). There

are three fundamental concepts:

* To describe the language and machine;

* To fabricate a compiler based on the BLISS-11 compiler

/WUL75/ as stereotype; and

* To use an 'Expert System ' to produce each phase of

the PQC.

Section A.1.1 discusses 'expert systems';. Section A.1.2

overviews the stereotype and the system as a whole. Apart

from overviews, most of the published literature on PQCC is

on register allocation and instruction choice, so section

A.1.3 reviews that in more detail. The instruction choice

method is based on that of Wasilew, and both are related to

that of this thesis. Therefore section A.1.3 also discusses

Wasilew's method, and the relationship between the three methods.

A.1.1 Expert Systems

Wulf _/WlTL80/ observes that techniques originating in Artificial

Intelligence (Al) are now appearing in Systems and applications

programming; (For example; the tree matching of chapter 6

operates by starting with a goal, and generating subgoals to

be met to reach the goal, then recursing. This is a typical

example of 'means-end analysis', first used for theorem-proving

programs, game-playing programs, and the like.) Wulfw observes that a program that uses a generalised AT technique for a specific class of problems is impractically inefficient, but that building highly specific knowledge and ability into it makes it viable - and an 'expert' in its class of problems.

He cites three phases of a PQC with such 'expert' aspects.

This concept is interesting insofaras other work on compiler fabrication, carried out independently shows similar tendencies to be 'expert'. An outstanding example is _/FRA77,77_V, reviewed in section A.3 of this appendix.

A. 1.2 QVERVI5'-' OF PQCC

This section is a precis of /LEV79/. The design is for a series expert programs,each of which generates one phase of a PQC (1).

The project is strongly founded on the BLISS-11 compiler _/WUL75/ as a model of what "Production Quality" compilers should be.

The fundamental idea derived from BLISS-11 is that the produced compiler consists of the following phases:

* Parse the input into a tree (PARSE phase) ;

* Perform Constant fold and Constant propagate

optimisations (this phase is not named);

* Construct a control flow graph (FLOWAN phase);

(1) Here, a.'phase' could be a pass over the whole of a pass over an internal form of the translated program, or it could be one of several phases in a pass, or it could even be spread over several passes. The key point is that one phase carries out one kind of action over the whole translated program, however it is invoked: they do not say if they use one pass per phase. ,318

* Perform source-to-source transforms on the tree

prior to register allocation and code generation

(desirability analysis of (e.g) constant subexpression

removal, unary complement operator propagation,

evaluation order and targetting, and-others which

are machine independent tree transforms and only

involve a small amount of context - really a group

of phases defined in a tree-transform language)

(DELAY phase - an explained misnomer);

* Access mode determination of storage references in the

light of target machine architecture (AMDphase);

*" Register allocation (not assignment) which in PQCC

/LEV8_1/ includes allocation of programmer-defined

memory to particular kinds of machine memory, subject

to size limits on the various kinds of machine memory

(TNBIND phase).

* Code generation (in the sense that this thesis calls

'instruction choice') from the tree /CAT78/ for a

double-linked list of instructions and labels which

might (e.g) use more registers than the machine has,

overlook certain side-effects of some instructions

(CODE phase);

* Peephole optimise and Finalise including register

assignment and resolving the other omissions and

comissions of CODE pass (FINAL phase).

A'/UL80/ mentions an additional phase called UCOMP for linearising arithmetic and boolean expressions and presumably minimises the

consequences of not having feedback from instruction choice to the allocation. (This may be a subphase of DELAY.)

A key notion in PQCC is that of having a generalised tree notation used by all PQC phases. A phase can be fabricated either to take a tree in main memory, or to input it from a

"dense" binary form (using standard routines) or to read it from a human-understandable form (using other standard routines): similarly for output. Given the multi-phase nature of the

PQC, this would have the advantage that a fabricated phase could be tested in isolation, on test data written by hand. A

A second advantage is presumably that for languages demanding compiler phases not provided by PQCC, such a language specific

phase could be hand-written; e.g. the "middle end" of the Charette

ADA Compiler /BR0807.

A curious omission throughout the PQCC literature is that

"systems" aspects are not discussed, apart from the inter-phase data format. There is no mention of whether a fabricated phase consists of code specific to the language/ target combination, or of fabricated tables that parameterise a handwritten common driver program (2).

A.1.3 Instruction Choice and Register Allocation

In a PQC these are two separate phases.

/WUL80/ mentions an additional phase, called UCOMP ("unary complement optimisation") which is a algebraic simplifier for

(2) Or a hybrid: e.g. fabricating parameterising tables, plus SYSGEN lines that cause code redundant in this case to be compiled out of a handwritten driver program. ,320

arithmetic aid boolean expressions, for example converting

(-y) > (x*(a-b)) into (b-a)*x>y which, in the sense of

appendix C.2.2, has the ideal bandwidth of 1 (whereas the

original has bandwidth 2 at two points). This linearisation

is a pre-emption of some of the task of instruction choice

(CODE), and presumably is needed before the register allocation phase because in"a PQC register allocation comes before

instruction choice, but needs information from instruction

choice. This is the opposite problem of that observed in

C.3.2: there, register allocation after instruction choice

cannot influence the instruction choice (but should). These

are both in constrast to C.2.2 in which the instruction choice

and register allocation are simultaneous, and to this thesis

in which alternative instruction choices are presented to the

register allocator for it to pick the cheapest after insertion

of any required store/restore sequences in a manner that can be interleaved with the instruction choice. (3)

The above suggests that in a PQC, register allocation and

instruction choice are not merely separate phases, but separate passes . (This is consistent with the BLISS-11 compiler.)

A.1.3.1 Register Allecation /LEV8l7

Leverett's thesis covers all memory allocation, not merely register allocation. His system takes a machine description and analyses the kinds of class of memory, the number of memory units in each class, and overlaps between classes and their usage. The algorithm is essentially to associate with

(3) This paragraph describes what section 2.2.2.2 (pages 27&28) calls a "negative phase ordering problem". each variable of the source program and with each likely

temporary a cost (4) for each memory class to which it might

plausibly be allocated, to analyse when the variable is alive

as a graph in which the nodes each represent a variable and

the arcs that the two variables represented by the end-nodes

are both alive at some point. The problem is then to find a

"map colouring" of this graph in which a "colour" is a memory

class. There must not exist any subset of the nodes of the

same colour (memory class) with every node in the subset

connected to every other node in the subset such that the size

of the subset exceeds that of the memory class. This colouring

determines which cost applies to each node (variable): the «

objective is to find a colouring that minimises the sum of

costs. (See also /CHA 80?) •

There are complications. If the target machine has two classes

of memory that overlap (i.e. some locations of class A, some of

class B, and some of both class A and B) then the size limit

on each mutually-connected monochromatic subgraph must be replaced with a treble limit on each mutually-connected dichromatic subgraph. Many machines in fact have complex class overlaps: for example, the ICL 1900 in "15.AM" address mode has a memory of 32K locations, all accessible via modified indexing as secondary operands, of which 4K are also accessible directly as secondary, of which 8 are available directly as primary operands (registers), of which 3 are usable as index-modifiers

to access the 32K. Thus there are four main memory classes, (4) Le-verett mentions that the cost may be in time, code space, data space or a hybrid. He does not mention execution profiles with respect to time costs, so presumably he would calculate these statically, on the assumption that every line of a program is executed once. 322 i

each a subset of the last. There are also instructions with double-length operands, all of which require the operands be in adjacent locations, and some of which require the lower location of the pair to be at an odd address. Using graph colouring, and avoiding unlikely allocations, one needs a 12-chromatic subset limit!

The simplest case of graph colouring is that in which no two adjacent nodes are of same colour. This is an NP-complete problem.

Finding a maximal size connected n-chromatic subgraph is

?iP-complete, though cheap so long as the largest allowed subgraph is small (e.g. representing three or eight registers, 4 but not 4K!). Minimisation problems over multivalued domains are a variation of the knapsack problem, and so are NP-complete.

Thus, for at least three aspects of the problem, suboptimal heuristics are needed. Leverett does this mainly by ranking, starting with scarcest memory classes, and preferentially allocating those variables for which that class much reduces the cost, until the chromatic limit is reached (see also section

6.6.1, page 203).

/

Leverett also gives extensive consideration to the problem of'" allocation within stacks, trying where possible to utilise stack instructions provided by the hardware.

A.1.3.2 Instruction Choice /WAS727 /CAT78/

Wasilew's technique for instruction choice is essentially similar to that of chapter 6 except that he starts from a fine tree, and his algorithms do not take advantage of assignment for an efficiency heuristic. Instead he uses two

other efficiency heuristics: he expects the tree to be in a

canonical form (e.g. constants "left" of variables) so that

patterns of other forms need not be tried. Also, given a

choice of instructions at a node, he prefers the "most

complex" (e.g. the one with most interior nodes) to generate

subgoals: it is only if subgoals fail that he tries simpler

instructions at the node. (Chapter 6, by contrast, tries

all possibilities at the node, that is it regards compile- time optimisation as secondary to run-time optimisation.)

Cattell uses essentially the same method, also with the

preference for a maximal choice top-down in the tree, but

with no mention of other heuristics, He places more attention

on the generation of the patterns for which to search. This

is separated into a fabricate-time program which inputs the

machine description and also logic axioms such as "not not X=X".

His fabricator has two main parts: SELECT proposes patterns

likely to arise to be matched, SEARCH tries to find combinations

of instructions to match them. The fabricator outputs patterns

and matching instruction sequences as a table that parameterises

a hand-written instruction choice program.

From examples, it is clear that Cattell !-s system embodies Al-like

abilities: examples show correct use of "skip" instructions,

including one that skips over a no-operation so as to obtain

the side-effect (only) of the skip. ,324

Presumably, one leaves the SEARCH/SELECT fabricator running

so long as it is producing pattern-sequence output pairs at an acceptable rate.

Cattell has considered the possiblity of using the machine description to help choose 'interesting' patterns:

"It is intuitively unsatisfying that this approximation does well for existing machines. We would like a scheme to automatically determine what special cases... to insure optimal code."

Unfortunately he does not go further in this direction: he has nothing like algorithm of 6.7.4 which alternately reacts depending on machine instruction and IL instruction descriptions.

(Indeed, since he does not have an IL description in the sense of thepresent thesis, he would be limited to the machine instructions only.) A.2 TUM-CC, MUGI, MUG2, AND MILLERS DMACS.

In the 1975-1977 period a group at the Technischen Universistat

Munchen produced a family of compiler-writing systems. The

first was called TUM-CC:/RIP75/ reports on it as if it were

partly finished, without mention of whether the produced

compilers were to be one-pass or multi-pass. The essential

philosophy seems to have been to search for flexibility by

having a large number of phases and internal forms, some phases

being "generated" code, some being standard code parameterised

by preprocessed tables. Some phases may have alternatives:

e.g. for the syntax phase it is said that the compiler-implement

has a choice of LL or LR methods. There is a hint that in some

phases, the compiler-implementor has a choice between generation

or parameterisation.

Perhaps this became too large a task, even for a group. By 1976

they were reporting on a descendant of TUM-CC called MUGI,

limited to the production of single-pass compilers /WIL76/,

_/C-AN76/, and a year later again on a descendant for multi-pass

compilers called MUG2/GAN777-

This appendix describes first TUM-CC in section A.2.1: most of

this description applies to MUGI and MUG2 also. All three are

founded heavily on an implementation of Millers DMACS system

_/MIL'7_l/ for describing instruction choice, so that is the

subject of A.2.2. Finally sections A.2.3 and A.2.4 respectively describe the differences and advances in MUGI and MUC-2. ,326

A.2.1 TUM-CC as the base of MUG1 and MUG2

7UM-CC _/RIP7_5/ is described as if it is composed of four phases, without mention of how they are grouped into passes:

(1) A "string to tree transduction grammar"

_/DER73/ specifies the parsing, tree building,

and machine-independent optimisation of the

tree. The parsing is by an attributed grammar

_/KNU68/ with semantic functions associated

with productions, and operating on attributes.

The tree—building description- is checked for

being feedback-free /MCK7^y>, The optimisations

are given in the form of an attributed trans-

formational grammar.

(2) The abstract program tree (APT) from the last

phase is read in an operator-order linear form,

annotated with attributes, by a "tree walking

push-down transducer (TPDT)", i.e. a tree

automaton /AH071/ whose effects depend on local

properties (only) of the tree. It£ behaviour

is specified by '.'code templates" (indexed by

APT operator, for efficiency). It outputs a

"zero address virtual machine code " which, has ...

usually three stacks: of operands, of labels,

and of temporaries. The 'lifetimes' of stack

entries encode longer-term information about

the represented entities. (3) The zero-address code is simulated (see section

6.5, page 198) to give one n-address instruction

per zero-address instruction: that is to make

the information implicit in the stack explicit

as addresses. This phase is "driven by a

de scription of the IL" (that is of the output

form) .

(4) A system based on Millers DMACS . • converts

the IL to final target machine code.

IL design is by the compiler-implementor, so it can be machine independent (for portability, but poor optimisation) or machine dependent. In the portable case, the implementor would need to write a new DMACS description per language/machine combination, not one per language plus one per machine (section 2.3.1, pages

30,31) .

The separation of the zero-address and n-address forms makes the allocation of temporaries relatively simple. Presumably the description of the zero-address form is just that of the n-address form, stripped of explicit addresses.

The separation of translation to a tree and of optimisation in the tree is interesting and unusual. The authors claim this is a major aid to legibility. It is not clear if the two description notations are the same. Note that each phase operates on a highly suitable form of

the program. The price of this is the variety of intermediate

forms.

A.2.2. Millers DMACS /MIL7l7

The feature of these systems that raises them above the common

run of "syntax-directed translators" (i.e. parser generators

with semantic inserts) is the code-generator generator for

producing translators CAPABLE OF OPTIMISATION from intermediate

macro-code to machine-level code and which are derived from

descriptions which INCLUDE MACHINE DETAILS. Unfortunately

_/GAM7_6/ dismisses this in only the following paragraph:

"Generation of a machine-depende>nt code generator

The Descriptive Macro System proposed by Miller ...

has been implemented for use in MUG1. Its inputs are

machine code macros and a target machine description

in terms of storage resources and transport paths

between them."

No examples are given: nor have I been able to yet locate and

translate from German any better description than the above.

This brief dismissal is quite tantalising because Miller's

thesis only deals with two subproblems of code generation - viz expressions and data access - and then does not in either

achfCve full generality - e.g. in data access he allows only

simple address calculation (index, displacement, base, but-not in-

definitely long indirection loops), assumes .all memory Is directly

addressible (despite the DEC PDP-11 and Ferranti Hermes),

and that registers have certain properties such as that they

are the first main memory locations (which the one accumulator of the DEC PDP-8 is not). Miller's scheme is not actuate

to describe a stack-based machine.

The following description is based on Miller's thesis: I

have not speculated on how it was realised in TUM-CC/MUG1.

(Page numbers in brackets refer to Miller's thesis.)

Miller commences with a target that sets him far beyond

the "compiler-compiler"s of the 1960's: "The present research

represents an attempt to isolate some of the problems involved

in code generation and to show how a code generator can be

automatically created from a description of the computer on

which the code is to be run" (p7) ... " There are two steps

in creating a code generator using DMACS. The first step

is to define a set of procedural macros in a machine independent

somewhat skeletal form. The second step is to supply information

describing the computer for which the code is to be generated.

DMACS uses this information to flesh out the macro definitions.

The two steps are quite independent, so that once the first

step is done for a language, the second step can then be done

for a variety of object machines"(pll) Miller differs from

MUG1 in that he does not separate semantic expansion-and-analysi

into the intermediate macro language from final expansion

into machine code, also in that he has his parser create his

symbol table for use by the instruction choice phase; Source program

PARSER

Symbol Macros Table /

^Code Generator

I

• Machine Code

the following example (pl9) illustrates both 'data description"

(array reference) and "computation" (arithmetic expression):- A( I) =B+C(J)*D

105 SS C,J SubScript 106 MUL 105,D MULtiply 107 ADD 106,B 108 SS A, 105 109 ASSG 108,107 ASSiGn

The language implementor might write a machine independent ADD

macro with logic as follows (p36) :-

macro ADD X,Y if the types of X and Y are integer then IADD X,Y else if the types of X and Y are floating then FADD X,Y else error

To describe machine-dependent IADD and FADD it is necessary

to describe the registers and paths between them (p36):-

rclass R: r2,r3,r4,r5,r6,r7,r8,r9,rlO,rll rpath R->S: ST R,S rpath S-^R: L R,S

Then the machine-dependent operations can be defined (p37) )(.$)

IADD al,a2 (commutative) from R(al),R(a2) emit AR al,a2 result R(al) from "R(a2),S(a2) emit A al,a2 result R(al)

"This defines two permitted states, code to be emitted from each

state, and the location of the macro result"... two other states are:

S(al),R(a2)

(6) Millers examples are mostly in IBM 3G0 assembler

V in which case the qualifier 'commutative' after 'IADD al,a2' specifies 'A a2,al' is to be emitted with 'result R(a2)'; and:-

S(al),S(a2) in which case the register path S—>R must be invoked to emit:-

L ri,al A ri,a2 with result R(ri) , or:-

L ri,a2 A ri, al also with result R(ri). An arbitrary easy choice of these two last must be made: much harder is the choice of which register is to be used as ri because this may be affected by prior or subsequent register usage (which in turn may depend on this choice). Such choices build on one another: some means needs to be found to contain the consequent combina- tional explosion. Miller seems to content himself with arbitrary heuristics designed by instinct,

Miller's major contribution was to observe and detail the problem of code-generator generation: it marks the distinction between compiler-compilers of the 1960s • and what must be demanded of compiler-compilers of the 1970s. However the details need ironing out: that is why the unpublished details of MUG1 are potentially so interesting.

A.2.3 MUG1

It is not clear whether all the differences described below were genuinely changes from TUM-CC, or merely apparently so due to better documentation. ,332

The main goals were a usable system that inputs compiler

descriptions in readable and modifiable form to yf2ld

efficient one-pass compilers.

Secondarily, the system is designed to detect errors in

compiler design where possible, to., facilitate exchangeability,

of compiler modules, and to be portable in the sense of

being able to be used to create compilers for many machines

(though MUC-1 itself is tied to theTELEFUNKEN TR440, being

programmed in PS440, a PL-360-like language).

The 'usability' appears to have been a revolutionary concept:

prior compiler writing systems tended to overlook that point!

The portability-of-output requirement prompted a "generalised

operating system interface".

MUG1 separates lexical analysis from syntactic analysis, having

a distinct lexical analyser generator. (It is not clear if

this fabricates code or if it preprocesses its input to tables

that parameterise a standard scanner.) The inputs are BNF-like

rules with right hand sides based on regular expression notation,

extended by an ALLBUT operator analagous to the - exclusion- operator'*

of section 4.2 (page 77). Each rule may be marked as having

special properties: two properties mentioned are 'unique identi- cation' (e.g. for identifiers) and 'being ignored' (e.g. for comments). This lexical analyser was one of the first designed

to be part of a system: prior lexical analysers were seldom well-integrated into a system. Keywords (e.g. reserved words) are not derived from the syntax: they must appear in the lexic description, if need be as a singleton lexic class.

There are two parser-generators: for LL(k) and LR(k) parsers.

(It is implied that they are two separate programs,) It is stated that "error-recovery scheme independent of parsing- strategy is being implemented" which makes it sound as if the scheme does not call for extra input to or in the grammar: an objective which my literature search suggests no-one else has attempted with satisfactory results, especially given that most parsing errors are due to lexic erors (e.g. mis-spelling of keywords). Eroor-recovery handlers in parsers are of generally one of two types:-

* Confinement of consequences of an error in a fragment of

code to that section so that parsing of previous and

subsequent code is unaffected; and

* Correction (by substituting a legal likely alternative)

which allows _a_ parse (though maybe not the intended

parse) of the section of code containing the error.

It is not clear which of these applies to MUG1.

The attribute-handler generator produces semantic analysers that satisfy well-formedness conditions /K0S7JL/ which allow evaluation of the attributes in a single left-to-right top-down pass: this avoids feed-back problems. The well-formedness is checked (the test being only of linear complexity) and then converted into a "head grammar" /WAT74/. The left-to-rightness allows the attribute evaluation to be . inter-leaved ,334

with the syntax analysis in a single pass over the program:

this has the advantage that a parse tree does not need to be

explicitly built to carry information from a syntax pass to

one or more separate semantics passes, because at any time

the path from the current position in the conceptual parse-

• tree to its root will be explicit in the stack of the parser.

The top-downness of the evaluation requires the bottom-up

parser to simulate top-downness where inherited attributes

are transferred: the transform is performed by "the generator"

(sic) (parser-generator? attribute-handler generator?).

The descriptions of the semantic actions for a syntactic

production are embedded in that production and are written

in PS440:

conditional_statement: ifsymbol, call generate_label (labell), call generate_label (label2), condition, call generate_jump_on_false(labell), thensymbol, truepart, call generate_jump(label2), elsesymbol, call define-label(labell), f alsepart, call define_label(label2), fi symbol.

Presumably the "call"ed routines are written in PS440: so down

to an intermediate-language level the produced compiler is a

generated compiler plus hand-written semantic generators, which

will simply be copied and called from appropriate points in the

generated parser. (This part of MUG1 is a good example of the

"parser generator with facilities for inserting hand-written

semantic generators" that I rejected as not being a compiler-

compiler in section 2.3.6 (page 43). X cannot imagine how one could remove handwritten inserts totally from the semantic

generators: at a very deep level it seems one must write

fundamental routines in something resembling either a machine

language or orthodox systems programming language.)

/GAN76/ gives a table listing the kinds of changes possible to

the inputs:

Character sets and symbol definitions attribution and underlying CF grammar attribution without changes in the CF grammar semantic actions machine description

Hopefully this gives us a complete list of the inputs. The

stress on the kind of change is that for each kind of change

it is only necessary to re-generate one or two modules of

the output compiler. Presumably this implies the cost of

generating a module was high. There are other hints that a

very slow machine (even for the time) was being used.

A/IL76/ gives an extensive analysis of the kinds of advantages

a compiler compiler system can have, and who for. He then

analyses how they relate to MUG1, and also the relationships

and tradeoffs between possible advantages.

A.2.4 MUG2

MUG2 /GAN77/ appears to differ from MUG1 in that a new

attribute-handler generator has been written which is

functionally equivalent save that the produced attribute- handler is now multi-pass: this-(presumably) because of the

inherent inabilities of one-pass code generation due to ,336

being unable to modify what is being generated now according

to what comes later (for a discussion of such problems see

/GIE78? ).

The HUG2 description adds a few words on "describing code generation":-

"First code functions are ... associated with the operators in the tree part of the string-^to-tree grammar" (viz. in the syntax augmented with descrip- tions of tree-nodes to be generated, one per production) ... "They specify the case analysis which, depending on the local attribute environment, decides on the sequence of instructions to be emitted. These instruc- tions, being zero-address instructions for a special 3-stack machine, are then mapped into the desired intermediate code of a desired structure. The mapping and the specification of the structure form the second part of the description"... "In the future MUC-2 will include a module that generates an 'optimising' machine code generator from normal descriptions of target machine and source language elements."

Here the features of interest are that:—

(1) having obtained a tree, MUG2 can be given a description of transforms of the attributed parse-tree that achieve optimisations (including look-ahead optimisations) and decorate the tree with information for use in later optimisations. This is a major step beyond the "semantic inserts" technique, permitting clearer, easier-to-write descriptions of code generation. Regrettably the language of these transforms is rather idiosyncratic:-

    cvar of id    := (dn of id);
    cvar of binop := union(cvar of binop.1, cvar of binop.2);
    cvar of while := cvar of end condition;
    cvar of if    := cvar of condition;

though it is hard to see how it could be much improved. (2) They appear to have set another ambitious objective: that MUG2 will supply methods for optimisation on the target machine (presumably the required abstract knowledge - e.g. "anything times zero yields zero" - will be derived from corresponding abstract knowledge in-built into the system relating to their 3-stack machine).

A.3 Other allied work

Most of the remaining known work strongly allied to that of this thesis is already reviewed in /CAT77/: Donegan's CGGL /DON73,79/, /NOO79/; systems of macro-expansion followed by symbolic execution to optimise /CAR77/, /ELS70/, /MAR77/; Newcomer's Template System /NEW75/; Snyder's system of generation driven by tables representing many machine properties /SNY75/, /JOH77,78/; and Weingart's system, which resembles that of Wasilew and Cattell /WEI73/. (The above list updates Cattell's bibliography.) For all of these, there is little point in duplicating Cattell's excellent summary.

Accordingly, the following subsections are limited to major allied works not covered in A.1, A.2 or in Cattell. The bibliography includes other works of weaker alliance to this thesis.

A.3.1 Aho/Johnson: configurable optimal code generation

/AHO76/ generalises the Sethi-Ullman method of section C.2 to a much wider class of machines, using an algorithm that configures itself from a description of the instructions of the target machine. The essential restriction of the Sethi-Ullman method - that there be only one instruction that can change main memory - is retained. The method remains time-linear in the length of the expression, but becomes expensive per operator. See also section A.3.6.

A.3.2 Frailey: Optimisation in an IL framework

/FRA71,79/ describe a system for conveying certain aspects of an intermediate language to an IL-to-IL optimiser. Each

IL instruction is identified by an operator, and the number

and kinds of its operands: the kinds of operands include

numbers, identifiers of data locations, identifiers of IL

instructions, etc. Each operand is described by its

"properties": e.g. that it always causes a jump, that it

is an associative operator, that memory operands must be in

registers rather than main memory (or vice-versa) etc.

First the optimiser converts an IL program into a canonical

form (to avoid the subsequent inefficiency in searching for

optimisable patterns), and then optimises the IL. Both the canonicalisation and optimisation depend only on the properties of the operands. The optimiser also gathers some information which it inserts into fields in IL instructions. Especially when the code generator uses this collected information, Frailey claims local optimisation comparable with that of dedicated optimisers specific to both machine and language.
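A minimal sketch (with an invented property table, not Frailey's notation) of canonicalisation driven purely by declared properties: a commutative operator has its operands put into one fixed order, so that the later pattern-search need look for only one form:

    PROPS = {"+": {"commutative"}, "*": {"commutative"}, "-": set()}

    def canonicalise(instr):
        """Put the operands of a commutative operator in one fixed order."""
        op, a, b = instr
        if "commutative" in PROPS[op] and str(b) < str(a):
            a, b = b, a
        return (op, a, b)

    print(canonicalise(("+", "x", 3)))   # ('+', 3, 'x')
    print(canonicalise(("-", "x", 3)))   # ('-', 'x', 3): not commutative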

A.3.3 Fraser: A knowledge-based meta-code generator

/FRA77,77'/ describe a system called XGEN which accepts a machine description and "code generation knowledge", and configures itself into a (saveable) code generator for a predefined ALGOL-like language.

The system is basically a general AI production system, pre-provided with routines to input a machine description akin to that of this thesis and of Cattell, and with generalised productions relating to code generation. New knowledge needed when moving to a new machine is encoded as new productions. For example, moving to a machine with index registers, XGEN would need a rule encoding:

"If compiling some operation with a subscript that is not in an index register, then allocate one, load it with the subscript, and change the original instruction to reference it."

provided a rule had already been given to identify "an index

register".

Fraser observes that each new machine will generally be able to

take advantage of rules from prior machines. His first 3

implementations took 30 rules, 10 more rules, 6 more rules:

in each implementation the rules took an "average" of around

two man-days to write.

The pilot system took 55K words on a PDP-10 (big) and compiled

one output instruction a second (too slow). Fraser believes

that by sacrificing unused flexibility this can be made practical.

If this is so, it would be a revolutionary advance.

A.3.4 Glanville: instruction choice by parsing an ambiguous language

/GLA77/, /GRA78/ describe a system in which a parse tree is passed to the code generator (in lieu of an IL) as a forward polish expression in an absolute notation (i.e., as mentioned in section 6.5, pages 198 and 199, in a linearised form of MDL). Each instruction of the target machine is regarded as a "production" of a grammar, with the result as the "goal" of the "production". The set of instructions thus defines a grammar, which for most machines is highly ambiguous.

The productions applied to parse a forward polish text, and the order of application, give the corresponding sequence of instructions: this target machine instruction sequence is a translation of the forward polish.

The ambiguity of the language leads to alternative parses for most forward polish texts: these correspond to alternative translations. To produce the optimal translation (or a good one) the parser must pick the "best" parse.

Glanville uses the method of /AHO75/ to pre-process the instructions to give an order in which they are to be tried, i.e. "best" instructions first. (This roughly corresponds to the Wasilew/Cattell method preferring "biggest" instructions.) The tables of productions are separated according to the "main operator" of the production (rather like the two-level indexing of section 7.2, page 281) to increase efficiency.

This gives a deterministic backtrack-free parse which yields a good translation if any translation exists (but not necessarily the best translation).

Glanville's algorithm will give incorrect parses under circumstances which could arise for certain rather strange instruction sets. He therefore gives an algorithm to test for those circumstances.

He finally extends this method in practical ad-hoc ways to permit common subexpressions to be specified by the preceding pass, and also to perform register allocation for temporaries needed by the chosen instructions.

In summary, because of the use of assignment to induce the productions, Glanville's method is more like the method of chapters 6 & 7 than like the Cattell/Wasilew method.
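A minimal sketch of the idea, under strong assumptions: a hypothetical three-instruction machine, productions ordered "best first", and a recursive matcher standing in for Glanville's table-driven parser (which is genuinely backtrack-free):

    # productions, best first: (operator or None, rhs symbols, instruction)
    PRODUCTIONS = [
        ("+", ("r", "m"), "ADD  R,{m}"),   # r := r + m
        ("*", ("r", "m"), "MUL  R,{m}"),   # r := r * m
        (None, ("m",),    "LOAD R,{m}"),   # r := m
    ]

    def parse_r(toks, i, code):
        """Derive nonterminal r from toks[i:]; return next index or None."""
        for op, rhs, instr in PRODUCTIONS:
            j, m, mark = i, None, len(code)
            if op is not None:
                if j >= len(toks) or toks[j] != op:
                    continue
                j += 1
            for sym in rhs:
                if sym == "r":
                    j = parse_r(toks, j, code)
                else:                      # "m": any memory operand
                    if j is not None and j < len(toks) and toks[j].isalpha():
                        m, j = toks[j], j + 1
                    else:
                        j = None
                if j is None:
                    break
            if j is None:
                del code[mark:]            # undo a failed alternative
                continue
            code.append(instr.format(m=m))
            return j
        return None

    code = []
    parse_r("+ * a b c".split(), 0, code)  # (a*b)+c in forward polish
    print(code)   # ['LOAD R,a', 'MUL  R,b', 'ADD  R,c']

For the forward polish "+ * a b c" the first matching productions give LOAD, MUL, ADD; the instructions come out in execution order because each sub-derivation completes before its superior's instruction is appended.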

A.3.5 Goguen: analysis of IL to implement Wilcox's method

/WIL71/ gives a method for creating a code generator based on piece-wise macro expansion from IL. A set of translation proformae are provided: each consists of a source IL pattern and a target machine code pattern: how an instance of the source pattern meets the allowed variation in the pattern determines how the target pattern will be filled in.

Translation then consists of searching for instances of source patterns, and transforming them. (A transformed pattern, or part of it, can be re-transformed if it satisfies or part-satisfies another source pattern.)

There are three problems here: efficiency of the searching; guaranteeing completeness (that is, that every valid IL sequence will ultimately transform to machine code, no matter what order candidate transforms are applied at any time); and creating the transforms from descriptions of the IL and of the target machine instructions.

Goguen attempts the last problem, and to a lesser extent the second. In many respects his system is a symbolic-execution form of Cattell's SELECT/SEARCH, or the method of section 6.7.4 (page 250) /GOG78/.

Much more importantly, Goguen is one of the few authors to

consider what an IL must be and can be for a particular

application. He makes many observations, but comes to no

clear conclusions. This work could be most valuable as a

starting point for other researches into ILs.

A.3.6 Kameda and Abdel-Wahab: beyond Sethi-Ullman register allocation

The Sethi-Ullman algorithm (section C.2) assumes that the only instruction that modifies memory is the STORE instruction.

/KAM77/ generalises the results to allow multiple register-to-memory instructions, and also memory-to-memory instructions.

The results are much less decisive than those of Sethi-Ullman.

A.3.7 Sites: Machine-independent storage allocation

Universal P-code /SIT79/ is the stack-based IL of the UCSD Pascal compiler. /SIT79/ discusses storage allocation (not merely register allocation) for P-code in a manner determined by a description of the target machine's memory hierarchy. The method is not original, but getting it to work in this machine-independent manner is.

The allocator accepts a set of variable names, their live

ranges, expected frequencies of use, and remaps these logical

names into a smaller set of physical names such that most-

used logical names fall into the physical names corresponding

to fastest memory classes. This is done before final instruction

choice: this requires that the fast registers be pre-partitioned,

some being reserved for possible temporaries introduced by

instruction choice.

The method is essentially one of ranking. The paper gives a good illustration of the kinds of policy decisions that

must be made by an implementor with respect to various

trade-offs, for example on machines with addresses of differing

lengths, of size against speed, etc. The special problems

due to prior name-equivalences (e.g. due to FORTRAN EQUIVALENCE

statements, or Pascal variant records) are briefly discussed.
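A minimal sketch of the ranking idea (hypothetical names, frequencies and memory classes; the live-range analysis that lets names with disjoint ranges share one physical name is omitted):

    # (logical name, expected frequency of use)
    logicals = [("i", 900), ("sum", 850), ("tmp", 40), ("buf", 5)]
    # physical name classes, fastest first: a few registers, then main memory
    classes = [("reg", 2), ("mem", 10**6)]

    def allocate(logicals, classes):
        """Map the most-used logical names onto the fastest physical names."""
        ranked = sorted(logicals, key=lambda nf: -nf[1])
        slots = ((cls, k) for cls, n in classes for k in range(n))
        return {name: "%s%d" % next(slots) for name, _ in ranked}

    print(allocate(logicals, classes))
    # {'i': 'reg0', 'sum': 'reg1', 'tmp': 'mem0', 'buf': 'mem1'}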

A.3.8 Terman: a configurable secondary code generator

/TER78/ proposes a secondary code generator that inputs

descriptions of the IL and target machine instructions each

annotated by attributes: each must have concepts such as

flow of control with conditional transfers, typed value

storage, values that may or may not signify addresses, and there must be literals. The translation is essentially a minimisation process - preserving the attributes, using macro-to-macro transformations subject to conditions. Conceptually, the system is similar to that of Goguen & Wilcox, except that the "framework" of the IL and machine code descriptions is much more predefined, so that more detailed abstract knowledge on code generation can be built into the meta-code-generator, thus making it more powerful (cf. Fraser).

This, together with the separation of the descriptions of intermediate language and machine (as opposed to their use together), was the major goal.

This is an important paper, and deserves to be better known.

APPENDIX C

TWO OPTIMAL CODE GENERATION ALGORITHMS

C.l INTRODUCTION

There are two long-known optimal code generation algorithms for restricted parts of the whole problem that had major influence on

the design of the method presented in chapters 6 and 7. They are given here in sufficient detail to support the reading of those chapters. Each is followed by some remarks.

C.2 THE SETHI-ULLMAN METHOD /SET70/

This method is for optimal code generation of arithmetic expressions from the parse tree of the expression, for a simple register-based target machine. The method is fast in the sense that it takes time proportional to the length of the expression. Follow-up work tried to make the method more realistic, by introducing common sub-expressions /AHO76/ and more complex instruction sets: both of these follow-ups demonstrated exponentially-bounded search methods backed by proofs that the problems were at least NP-complete. This led ultimately to a paper entitled "How hard is Compiler Code Generation?" /AHO77/ - the conclusion being that it is, in general, very hard, if you demand optimal code (rather than moderately optimised code) be produced. Apart from detail lessons from the Sethi-Ullman method, the net effect on the present work was to motivate attempting to push as much of the hardness as possible from code-generation time (many times: expensive) to code-generator fabrication time (once: relatively cheap).

C.2.1 The method.

The method derives ultimately from an observation by Floyd /FLO61/

that if a code generator is short of registers, it is better to

compile the right hand side of an expression before the left hand

side.

The method compiles for a target machine with four kinds of

instruction, described here in the chapter 7 variant of MDL:

    M          R     (LOAD)
    R          M     (STORE)
    R ° M      R     (MEMORY OPERATION)
    R1 ° R2    R1    (REGISTER OPERATION)

Main memory (M) is plentiful: registers (R, R1, R2) are scarce. The

degree sign (°) in the MDL represents 'any diadic operator': the

expression to be compiled contains only operators for which the machine has both a memory operation and a register operation. No

algebraic property is assumed of any operator: no common sub-

expressions are optimised: within those restrictions the method is

optimal.

The following kinds of instructions are not assumed:

    M ° R      R     (REVERSE MEMORY OPERATION)
    R1 ° R2    R2    (REVERSE REGISTER OPERATION)

(It is the absence of the reverse operations that leads to the asymmetry observed by Floyd.)

The original paper presents a two-pass algorithm: the first pass labels each node with the number of registers needed to evaluate the sub-tree of which the node is the root without any store

instructions; the second pass generates the evaluating code.

The labelling pass needs the labels of the subordinate nodes to

determine the label of a node. It must therefore proceed from

leaves to root.

Notate the label of the current node as Lnode, and of its subordinates Lleft and Lright (if it is not a leaf). Notate the number of registers as NR. The five following paragraphs derive the labelling rules for the five cases of node, when there are sufficient registers, from the code to be produced in each case.

To evaluate a LEFT LEAF (a leaf that is the left subordinate of its

superior) needs a single load instruction to a register. This needs

one register, so the label of a left leaf is one.

For a BALANCED NODE, Lleft = Lright: the evaluation is first to evaluate the left subtree to a register Rleft (needing Lleft registers), then, leaving the left subresult there, to evaluate the right subtree to a register Rright (needing Lright registers, plus the one holding the left subresult), then to obey the register operation Rleft ° Rright giving Rleft. The peak register usage is that during the evaluation of the right subtree: so the label of a balanced node is Lnode = Lleft + 1 = Lright + 1.

For an UNBALANCED NODE with Lleft > Lright, evaluation is as for a balanced node, except that during the respective subtree evaluations the peak register usages are Lleft and Lright+1. Since Lleft > Lright, Lleft >= Lright+1: so the label of the node is Lleft.

For an UNBALANCED NODE with Lleft < Lright, the evaluation of the subtrees is reversed. During the subtree evaluations the peak register usages are Lright and Lleft+1, and as Lright > Lleft, the peak usage is Lright. The two cases of unbalanced node need a label Lnode = the higher of Lleft and Lright.

A RIGHT LEAF does not need to be evaluated: the evaluation of its superior is to evaluate the left subtree (needing at least one register) into a register Rsuperior, then to give the memory operation Rsuperior ° Mrightleaf giving Rsuperior. A label of zero at the right leaf preserves the above rule for labelling the superior node (which must now be unbalanced, as its Lleft must be at least one) but introduces a third case of evaluation at an unbalanced node.

The above evaluation rules need one operation instruction at each

non-leaf, and one load at each left leaf. The above evaluations

fail only in one case: a node of which both subordinates have label

greater than or equal to the number of registers, NR. Sethi and

Ullman call such a node MAJOR. (If one subordinate has label

greater than NR and the other subordinate has label less than NR, then the normal rules can be applied at the node as for an unbalanced node,

but not at the major node - or nodes - in the subtree of the sub-

ordinate with label exceeding NR.)

At a MAJOR NODE the generated evaluation code evaluates the right subordinate subtree to a register and stores that to a temporary location Mtemp in main memory, evaluates the left subtree to a register Rleft, then obeys the memory operation Rleft ° Mtemp giving Rleft. Thus, for a major node, there must be the operation and also one STORE.

Thus the whole method takes one LOAD per left leaf, one operation

instruction per non-leaf node, and one STORE per major node. Sethi

and Ullman show that in the simple target machine these are the

minimal numbers of each of the instructions.
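A minimal sketch of the labelling pass (in modern Python, with trees as nested tuples rather than any structure from the original paper); the rules are exactly the five derived above:

    def label(node, is_right_leaf=False):
        """Registers needed to evaluate node without any STOREs.

        A node is a leaf name (str) or a triple (op, left, right).
        Left leaves are labelled 1 and right leaves 0; a balanced
        inner node needs one register more than its subordinates,
        an unbalanced one the higher of the two labels.
        """
        if isinstance(node, str):
            return 0 if is_right_leaf else 1
        _, left, right = node
        l, r = label(left), label(right, is_right_leaf=True)
        return l + 1 if l == r else max(l, r)

    t2 = ("+", ("+", "a", "b"), ("*", "c", "d"))
    t3 = ("+", t2, ("*", ("+", "e", "f"), ("-", "g", "h")))
    print(label(t2), label(t3))    # 2 3

A node is major exactly when both subordinate labels reach NR; counting such nodes gives the number of STOREs the method will emit.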

C.2.2 Some remarks on the method.

The important feature of the parse tree of an arithmetic

expression is that it is also the data-flow tree. The operator represented by ° in the last subsection does not need to be arithmetic: it could be "is assigned to" if the destination of the assignment involves calculating an address, and that demands registers.

The method as presented by Sethi and Ullman is oriented to the facility of the optimality proof. They do not discuss practical matters of its implementation. The method does, however, show that instruction choice and register allocation are inextricably intertwined. (On the other hand, register assignment is not: the above description avoids register assignment. Sethi and Ullman attempt a stack-like register allocation and so introduce a bug, in that they need a reverse register operate. One simple register assignment that suffices for their instruction choice/register allocate method is "pick any register that is free".)

One implementation matter is how to avoid two passes. This can be simply achieved by replacing each subtree by its compilation and recording how many registers that requires; at inner nodes the label at the node is evaluated, and the subsequences of the subtrees placed in order with the instructions of the node itself (the operate and, at major nodes, the STORE) only after the kind of node has been determined.

Sethi and Ullman also omit to point out that their method generates an optimal instruction sequence, but not the only one for many trees.

For example, consider the simplest tree with label 3 at its root, for two registers. The STORE can be used to "lop" the branch to the right subordinate node of the major node (the root), as Sethi and Ullman suggest: but a lop of the second-level rightmost branch also introduces only one STORE. The two diagrams below illustrate these two loppings of the tree. In each of the diagrams only the label is given at a node: for this purpose the operation is irrelevant. In the lopped trees, the stub is labelled zero, and the cut branches from the stubs to the nodes lopped off at those points are dotted.

[Diagram: the two loppings of the label-3 tree, with only the labels shown at the nodes.]

Turning these trees on their sides so that they are read as left-to- right data-flow trees, and omitting the right leaves, gives the following two diagrams, in which a solid line represents a register in use.

[Diagram: the two lopped trees turned on their sides as left-to-right data-flow bands; a solid line represents a register in use.]

Within each connected part of the tree: (1) the nodes appear left-to-right in the order they must be evaluated or (for trees representing more general situations than expression evaluation) in which they must be executed; and (2) a vertical section through the connected part

gives the number of registers in use at that point. Thinking of

a connected instruction-ordered part of the tree as a "band", the

observation can be rephrased as "the maximum bandwidth is the

number of registers". An algorithm which gives either of the above

two divisions of the tree just splits it into the fewest possible

bands of maximum width. Under this re-presentation, the Sethi-Ullman method is much more "obvious".(1)

Instruction ordering within a band is prescribed by the shape

of the band only: for example

[Diagram: three differently-shaped bands, all of width 2.]

Instruction ordering between bands is limited only by the required data-flow. For example, cutting the label-3 tree into bands of width one for a one-register machine can give only:

[Diagram: four width-one bands a, b, c and d cut from the label-3 tree.]

in which d must come before c before a, and b before a, so the three possible orders are bdca, dbca, and dcba.

(1) The two different loppings of the last page correspond to starting from the leaves (left hand diagrams) and starting from the root until the bandwidth is exceeded (right hand diagrams). Either variant, or hybrids of the two, will produce optimal target machine code.
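The three orders can be checked mechanically; a minimal sketch (modern Python, with the band names a to d as in the example):

    from itertools import permutations

    constraints = [("d", "c"), ("c", "a"), ("b", "a")]   # (earlier, later)
    orders = ["".join(p) for p in permutations("abcd")
              if all(p.index(x) < p.index(y) for x, y in constraints)]
    print(orders)    # ['bdca', 'dbca', 'dcba']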

C.3 THE HORWITZ-KARP-MILLER-WINOGRAD METHOD /HOR66/

This method is for allocation of registers in a sequence of already-

chosen machine instructions, the order of which is not to be

changed. Where the number of registers exceeds that allowed on the

machine, store/load sequences must be introduced: a rather different

model than that assumed by Sethi and Ullman. Register assignment

is a side-effect: with Sethi and Ullman it was a separate issue.

The method (from here on, called the "HKMW method") starts by

considering all possible allocation/assignment combinations - the

number of them usually being exponential in the length of the

instruction sequence - and then applies rules to eliminate non-optimal combinations, or to eliminate a combination if there is another of the same cost, until only a few equal-cost options remain: one is then arbitrarily chosen. The example used in the original HKMW

paper will be used: with four virtual registers, it is assumed that

they are used in the order 1* 2* 3 1 2 2* 4, and this must be

reduced to two real registers. The asterisks denote modifying use,

so that if a virtual register is moved out of real registers after

a modification, it must be stored: otherwise the real register can

be overwritten, and later uses re-load from a main memory back-up

location.

C.3.1 The method.

The method starts by assuming that every virtual register contains

a required value, and is held in a main memory backup location.

The first action, therefore, is to load the first-needed virtual

register (number 1) to a real register. It is then used and modified,

so if it ceases to be in a register later, it must first be stored,

at a cost of one STORE instruction. The next instruction uses

register 2: there are two possibilities - to store virtual register one out and re-use its real register, or to use the other real register.

The following diagram illustrates the possibilities. At each node there are three digits: the virtual register in the first real register; the virtual in the second real; and the cost, by the cheapest route to that node, in LOAD and STORE instructions. When a real register does not contain a virtual register it is marked by a hyphen. Virtual registers that have been modified are underlined. A branch to a node that is not on an optimal route to the node is dotted; optimal routes are solid.

[Diagram: the net of possible combinations for the example, one column of nodes per step, each node showing the two resident virtual registers and the cheapest cost so far; a final row adds the stores of still-modified virtuals.]

(HKMW neglect the final storing out, which makes routes to the step-7 node 246 the optimal ones).

HKMW provide six rules to be applied as the above is built step-by-step, which either prevent nodes being built, or detect that a node at the prior step cannot be on an optimal route, and so can be back-pruned out, together with its predecessors if they only lead to the back-pruned dead end.

The rules are as follows.

RULE 1. If a node at step i includes the register used in the next step, the only descendant of the node is to be identical to the node. Otherwise, generate one descendant per real register, these being successively the replacements of each register of the node at step i by the register next used. For example, in two real registers, from a node of virtual registers 12 with the next step using virtual register 3, generate 32 and 13.

This rule, of generating only minimal-change branches, has already

been incorporated in the diagram on the last page. (In that diagram

permutations of the virtuals in the reals have also been regarded

as giving the same node.)

Notate the cost of getting from the origin to a node n as W(n), and the cost of getting from node n1 to node n2 by a minimal set of STORE and LOAD instructions as w(n1,n2).

RULE 2. If n1 and n2 are nodes at the same step, and W(n1)+w(n1,n2) <= W(n2), then eliminate n2 and branches to it. The example becomes:

[Diagram: the net of combinations after applying rules 1 and 2.]

RULE 3 eliminates only nodes that would be eliminated by rule 4

one step earlier, and so rule 3 can be omitted.

Rules 3 and 4 are both phrased in terms of two DISTANCE FUNCTIONS

defined as follows:

    d(i,x)  = j-i, where step j is the first after step i in which either x or x* appears in the program.
    d*(i,x) = j-i, where step j is the first after step i in which x* appears.

In both definitions, it is assumed that there is a step later on

in which all registers appear as "to be cleared", so that j is

limited to at most the number of that step.
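A minimal sketch of the two distance functions (with the program represented as a list of (virtual register, modifies?) pairs, an assumed encoding; indices are 0-based where the text counts steps from 1):

    STEPS = [(1, True), (2, True), (3, False), (1, False),
             (2, False), (2, True), (4, False)]     # 1* 2* 3 1 2 2* 4

    def d(i, x, steps=STEPS):
        """Distance from step i to the next use of x (x or x*)."""
        for j in range(i + 1, len(steps)):
            if steps[j][0] == x:
                return j - i
        return len(steps) - i        # the assumed final clear-all step

    def d_star(i, x, steps=STEPS):
        """Distance from step i to the next modifying use x*."""
        for j in range(i + 1, len(steps)):
            if steps[j] == (x, True):
                return j - i
        return len(steps) - i

    print(d(2, 1), d_star(2, 1))     # 1 5: after step 3, virtual 1 is
                                     # used at the very next step, but
                                     # is never modified again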

RULE 4 forbids the formation of a node at step i+1 which uses x or x* if there was a node n at step i which contains neither x nor x*, and if any of the following conditions holds:

1) n contains x1 and x2, x1 is replaced by x or x* at step i+1, and either d(i,x1) < d(i,x2), or d(i,x1) = d(i,x2) and x1 < x2; or

2) n contains x1* and x2, x1 is replaced by x or x* at step i+1, and d*(i,x1) <= d(i,x2); or

3) n contains x1* and x2*, x1 is replaced by x or x* at step i+1, and either d*(i,x1) < d*(i,x2), or d(i,x1) = d(i,x2) and x1 < x2.

The equality subcases for all cases are in fact tests that neither x1 nor x2 occurs again, so one can be arbitrarily stored out, and that should be indicated by not forming one of the two nodes. The x1 < x2 tests could be reversed to prevent formation of the other node, and so to force storing of the other unneeded register.

Observe the presumption that only one register should be stored

out at each step, even when it is known that others could also be stored out now, cutting combinatorial explosions of the numbers

of nodes later. Finally: observe that the HKMW definitions of the

distance functions in terms of "j-i" could equally well have

been in terms of j only, since in each of the five relations the

i on either side of the relation is the same: and so it is in

turn clear that each of the five relations is a test of which

of x1 (or x1*) and x2 occurs sooner in the program, if at all.)

In the example, rule 4 prevents the 1-6 and 136 nodes at step 4, and the 236 node (and thus its descendant) at step 5. This means that the set of combinations formed on a forward pass, with look-ahead on the program, is as follows.

[Diagram: the combinations formed on the forward pass with look-ahead, one column per step, with costs as before.]

Rules 5 and 6 are the two obvious ones: they are delayed until last

because rules 1, 2, (3) and 4 reveal more opportunities for them (and

rule 5 can reveal opportunities for rule 6). To cut costs in

the other rules, an implementation could try rules 5 and 6 after

each application of the other rules.

RULE 5. If there are two branches into a node with the same cost

at the node, then eliminate one of the two branches.

RULE 6. If a node has no descendants, eliminate it.

Rule 6 is the only one that goes backwards in the net of possible combinations. Rule 5 makes the arbitrary choices remaining after the two arbitrary subcases of rule 4.

The HKMW paper proves that if the rules are applied in order at each step, or alternatively if the whole net is built according to rule 1 and the other rules are then applied in order over the whole net, then the one series of combinations that remains is optimal.
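Since the rules only ever discard provably non-cheaper routes, an exhaustive version of the net gives the same optimal cost. A minimal sketch (modern Python, an assumed encoding of the example; a cheapest-cost-per-node table does duty for rules 2 and 5, and the look-ahead rules 3 and 4 are omitted):

    USES = [(1, True), (2, True), (3, False), (1, False),
            (2, False), (2, True), (4, False)]      # 1* 2* 3 1 2 2* 4
    NREGS = 2

    def step(states, use):
        """states: {frozenset of (virtual, modified?): cheapest cost}."""
        v, modifies = use
        nxt = {}
        for held, cost in states.items():
            if v in {r for r, _ in held}:            # already resident
                succs = [(frozenset((r, m or (modifies and r == v))
                                    for r, m in held), cost)]
            elif len(held) < NREGS:                  # a free register: LOAD
                succs = [(held | {(v, modifies)}, cost + 1)]
            else:                                    # rule 1: evict each in turn
                succs = [((held - {(victim, vmod)}) | {(v, modifies)},
                          cost + 1 + (1 if vmod else 0))  # LOAD, + STORE if dirty
                         for victim, vmod in held]
            for h, c in succs:
                if c < nxt.get(h, 10**9):
                    nxt[h] = c
        return nxt

    states = {frozenset(): 0}
    for use in USES:
        states = step(states, use)
    # the final clear-all step stores any still-modified virtuals
    print(min(c + sum(m for _, m in h) for h, c in states.items()))   # 7

For the example this prints 7, agreeing with the optimal routes noted above once the final storing out is included.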

C.3.2 Some remarks on the method.

The observation made after rule 4 above, that the HKMW method loads registers as late as possible, and then STORES THEM OUT ONLY WHEN FORCED - i.e. also as late as possible - contravenes the usual compiler-writers' heuristic that loads should be as late as possible but STORES SHOULD BE AS SOON AS POSSIBLE. This heuristic applied to the HKMW example is summarised in the table below: an "L" between steps signifies a load; a single line represents that presence is desirable but there has been no modification, so no store is needed; a letter "U" signifies a non-modifying use, an "M" a modifying use; after a modification a double line signifies that before the register ceases to be in use a store is needed, and an "S" signifies a store. The diagram depicts the case where there are sufficient real registers.

[Table: the weak form of the heuristic applied to the example; one row per virtual register, one column per step 1-7, showing loads (L), uses (U), modifications (M) and stores (S).]

In the diagram, it is clear that the only problem in reducing the

actual number of registers to two is to decide whether to cut into the use of register 1 with a store and a load, or to do the same with register 2, in order to make real-room for virtual register 3. Looking ahead for register 1, it can be seen that a store and a load around step 3 will avoid the need for the store after step 4: that is, it will cost one extra instruction. With a cut of

the allocation of virtual register 2 around step 3, due to the

later modification of register 2 the second store is still needed,

so the cost of the cut is two extra instructions. Thus register

1 should be cut, being the cheaper. The combinations of real

registers in store should be:

[Table: the step-by-step combinations of virtual registers in the real registers under the heuristic, with the cumulative cost in extra loads and stores.]

If the heuristic is carried to its logical conclusion, and stores are

brought forward to just after the last modification, rather than the

last use, the diagram becomes the following:

[Table: the strong form of the heuristic, with stores brought forward to just after the last modification.]

The difference between the two forms of the heuristic, from the implementation point of view, is that in the weak form, for the section of code being considered, one knows only the last appearance of a register; whereas in the strong form one knows both the last use and the last modifying use. Both forms of the heuristic can be achieved in two passes, so the strong form costs little more than the weak form:

PASS 1. Search ahead for the end of the linear section

of code. At all times, for each virtual register appearing, keep the step number at which it most recently appeared, and the step number at which it was last modified.

PASS 2. Pass over the program using the information

from the first pass to decide which allocations to break when

there is a conflict.
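A minimal sketch of pass 1 (same assumed encoding as before); pass 2 would consult these tables at each conflict:

    def pass1(steps):
        """Last appearance and last modifying appearance per virtual."""
        last_use, last_mod = {}, {}
        for i, (v, modifies) in enumerate(steps):
            last_use[v] = i
            if modifies:
                last_mod[v] = i
        return last_use, last_mod

    steps = [(1, True), (2, True), (3, False), (1, False),
             (2, False), (2, True), (4, False)]
    last_use, last_mod = pass1(steps)
    # breaking virtual 1 at step 3 costs one load only, since its last
    # modification (step 1) is already past; breaking virtual 2 there
    # costs a store and a load, since it is modified again at step 6.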

Unfortunately, the obvious criteria for the allocation-breaking choices - to break U—U or U—M or M—U (at cost one load) rather than M==M (at cost a store and a load), and otherwise to make arbitrary

choices, are not optimal in cases such as the following programs.

[Diagrams: an 11-step program using registers 1-7 and a 9-step program using registers 1-3, with modified spans (M======M) and non-modifying uses (U) laid out per step.]

In either of these examples, breaking the allocation of register

2 at step 3 ultimately costs 3 extra instructions (at least) but

breaking the allocation of register 1 only costs two extra

instructions. Using the diagrams, the HKMW rule 4 can be given

a simple statement forbidding bifurcations of the HKMW net:

for example case 1 of rule 4 relates to the following form of allocation diagram:

    1 ----?
    2 ----?      <- there is a shortage of real registers at this point

and the rule then says that at a node in the HKMW net which contains

both virtual register 1 and virtual register 2 a descendant should

not be formed in which virtual register 1 is replaced as register

1 is used or modified before register 2. Using a question mark

again to denote "U" or "M", and also using "==(U)==M" to denote "modified, and maybe used in the meantime", cases 2 and 3 of HKMW rule 4 respectively say that, at the forms of allocation diagram:

    1 ======(U)====M
    2 ----?                  <- shortage

    1 ======(U)====M
    2 ======?                <- shortage

that a bifurcation from a node containing registers 1 and 2 is

forbidden at the shortage of real registers, in that a node in

which 1 is replaced is not to be generated.

From the above diagrams it can now be seen that there are only two

forms of allocation diagram corresponding to cases at which HKMW

rule 4 allows bifurcations of the net of possible combinations:

    1 ======(U)====M        1 ======(U)====M
    2 ----?...        and   2 ======U==...


This being so, from the allocation diagrams of the example 9-step and 11-step programs (two pages back) the HKMW diagrams using rules

1 and 4 are the following.

[Diagrams: the HKMW nets for the 9-step and 11-step example programs, built using rules 1 and 4 only; dotted branches and arrows are explained below.]

In the above diagrams, the dotted lines are branches that would be eliminated by rule 5 (expensive ways of getting to their destinations), and the arrows (pointing at the 13 combination at step 5 of the 9-step program and the 14 combination at step 7 of the 11-step program) indicate the two nodes that are eliminated by rule 2 (so that their successors are not formed, and rule 6 eliminates their predecessors back to the preceding bifurcation).

The important feature of the foregoing development of the HKMW method is that the method becomes practicable: even if the instruction-order cannot be changed (unlike in the Sethi-Ullman problem), combinatorial explosion of the possibilities is much constrained. The only information used beyond that of the HKMW method is cheaply available compared to the cost of computing the HKMW distance functions. The method can therefore be practicable when applied to "real" programs

containing non-linear structures (loops and procedure calls) for

analysis of the usage of pages in a virtual memory system (i.e.

reading 'page' instead of 'register' in all the above) or for

shadowing variables with real registers (reading 'variable' in

place of 'virtual register').

Such looping programs need a whole new theory. Present heuristics

are to find an allocation that has the same end and start condition.

The simplest such heuristic is to take the program of the loop

and lay it repeatedly end-on to itself, and allocate this "unrolled"

program: if at the end-of-loop points an allocation condition

appears every time, it is used: starting and end conditions are then

handled separately. Such heuristics presume that most loops are

executed enough times that inefficiency initialising for, and

finalising after, a loop is less important than efficiency in the

loop.

APPENDIX D

EXTENDED EXAMPLES OF PRIMARY CODE GENERATION.

This appendix presents two CF1-generated programs, which

are together an early and incomplete version of a translator

from Simula to a high-level intermediate language (HLIL).

For this first major application of CF1 an attempt has

been made to avoid hand-written code in the CF1-generated

programs (apart from routines from the standard library).

The intent is that a program entirely written by hand will

perform all the processing of names, scopes, etc. needed for

translation of Simula to (normal-level) IL. To simplify that

program as much as possible, it is highly desirable that the

input to it be in a linear form with an elementary syntax,

as opposed to the nested complex syntax of Simula. This simple

linear form is the HLIL. To process a sequence of declarations

in Simula requires two passes to avoid references-ahead: this

double passing has already been performed prior to HLIL by the

CF1-generated programs. Thus the hand-written HLIL-to-(normal)-IL program need only perform a single forward pass. Various other devices are incorporated in the CF1-generated programs to make the HLIL easier to process: for

example duplicating names at the end of the HLIL of a Simula

construct whereas the name only appears once at the start of

the original Simula construct.

It is possible to write a valid syntax for Simula in the CF1

extended-LL(1) notation. However, the possibility of having

"empty statements" in Simula much complicates the syntax, and

makes the resulting parses hostile to transform-based trans-

lation. (This would not be the case in LL(2).) The first

CF1 program presented here just removes empty statements; the

second is then the translator for a variant of Simula without

this syntactic problem.

There are two cases of empty statement to be dealt with: between

two semicolons; and after a semicolon and before an end symbol.

The preprocessor (which is called SIMPRE) defines an OB (object)

to be any valid Simula lexeme other than begin, end or semicolon,

or to be a begin...end enclosure of an optional serial. It defines

a UNIT to be a list of OBs: that is, of lexemes other than

semicolon and of begin...end enclosures. Every unit is

re-output identically, followed by a semicolon: if there was an

input semicolon after the UNIT then it is ignored. Other

input semicolons are ignored.

There is another case of an empty statement: a begin...end

enclosure which encloses nothing (other than the empty statement).

This does not cause the main Simula-to-HLIL translator trouble,

but as an easy optimisation at this stage, if such an empty

enclosure is the only statement in another enclosure, then it is removed.
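A minimal sketch of SIMPRE's effect on semicolons (modern Python over a flat token list; the recursion into begin...end enclosures, which SIMPRE treats as single OBs, is omitted):

    def simpre(tokens):
        """Re-output each non-empty unit followed by one semicolon."""
        out, unit = [], []
        for tok in tokens:
            if tok == ";":
                if unit:                  # end of a real unit
                    out += unit + [";"]
                    unit = []
                # else: an empty statement - the semicolon is ignored
            else:
                unit.append(tok)
        if unit:
            out += unit + [";"]
        return out

    print(" ".join(simpre("x := 1 ; ; y := 2 ; ; ;".split())))
    # x := 1 ; y := 2 ;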

Apart from the escapes (which are of little interest) the following listing of the lexic syntax and transform divisions specifying

SIMPRE should be fairly self-explanatory.

CIMT * : CCHAR «GFT **ZCCHAR» « POFFR PON0 ANY pOFFa ••• : CTEXT ®GET »*ZCTEXTa a floFFfl ROnr < <*ANY-•"•-EOF> 90FF0 * BONA >

• LETTER <»( LETTFR'niGIT!• B •!•)> : ENDFILE s EOF.

®GET *•IREENUULa apUT ••CALL»*STR« apiIT ••CALL*»0T1B »PUT **CALL»*OTL» aptJT **CALL*«G_ERRA apiir **CALL**C'lEKa

PROGRAMaAPnr INIT_IO? INIT.XC.'R I "BF.GIN" ) , [PSERIAL) , CEND'l , {")"), ENDFILE aPHT E'l .CLOSE; FX.CLOSE;® t

PSERIAL = < fFNIP apiIT A_UNIP. T„Z; P :-NONE? OTL(NL ) ; f» «> :

SERIALs < U NIP «> : UNIPS ( UNIT,[";") • »?• ) : units j

ORB C ORX I •(•

! •(• 1 •)"

• "a/a* J "sa" I ">• ) :

ORX' ( "BEGIN", [SERiAL), "END" ! TINT ! CCHAR ! CTEXT • ID ) .

Z(SERIAL!SERIAL UNIP) *> Z(SERIAL) Z(UNIP) : Z( SERIAL • i'NIp) ' «> Z(UNIP) : Z(UN IP J UN IT) B> Z(UNIT) ";?" : ZUINlPiUNIT"*") k> Z(UNIT) •»?• | Z(UNIP!";") B> " • ; Z(UNIT!UNIT OB) s> ZCUNIT) Z(OB) t Z(UN IT!OB) a> Z(OB) t 7.(OB'08X) r> 7(ORX) : Z(OB!"C") e> "I " : Z(OB!"J") «>"!": ZC OB!"(") »>•(": 368 •

ZCOB *> ") " : Z(OB a> "NOT " : ZCOB • *» ) *> • $ « . ZCOB "•") s> " I ZCOB » ) *> mm m . n m « ZCDB t> • . ZCOB *> ZCOB e> m. » . ZCOB t> ": — " : ZCOB ": * ") «> ":* " : ZCOB "<") e> "LT " : ZCOB •as" ) => "EO " : ZCOB •«/a») a> •*/« * j ZCOB 3> »3B M . ZCOB S> •GT • : ZCORXICINT) «> ZXB(CINT) J ZCOHX'CCHAR) => ZCHAR CCCHAR) I ZCOBX'CTEXT) => ZTEXTCCTEXT) : Z(OBX!10) => ZXB(ID) S Z(0BX1"BEGIN END") *> "BEGIN END " t Z(OBX!"BEGIN BEGIN END ENO") => ZCOBX!"BEGIN END"): Z(OBXI"BEGIN BEGIN ENDyEND") s> ZCOBX!"BEGIN ENO"): Z(OBX!"BEGIN"SERIAL"END") s> "BEGIN?" Z(SERIAL) "END " ; Z(OBX!"BEGIN BEGIN"SERIAL"END END") => Z(OBX!"BEGIN"SERIAL"END"): Z(ORXi"BEGIM BEGIN"SERIAL"END;END") => Z(OBX!"BEGIN"StRIAL"END"): Z(SERIAL) = > " * * *BAD SERIAL*** ZCMNIP) = > • *»«raD UN IP *** " Z(UNIT) s> « «$»rad UNIT*** " ZCDR) = > • ***RAD OR*** " : Z(OBX) s> • * * *pAD ORX * * * " !

Used after SIMPRE, the main translator now only has to deal

with a syntax in which statements in lists of statements are

each followed by a semicolon, and other statements are not

followed by a semicolon. The following listing shows a few pages of the full translator listing of an input preprocessed by SIMPRE. The original Simula program is a version of the TEVAL CF1-generated program given as an example in chapter 7.

.EXECUTE SIM24.SIM/CREF<-U) SIMULA: SIM24 ' errors detected: LINK. tm„.? typLoadirrse w hbs* s CLNKXCT SIM24- Execution! I iSfcfile?TTY INPUT FILEJTINY.ISO OUTPUT FILE?TINr.ISl REF ( Outfile ) lisbef f

Svsout . Oufctext < *listfil«?» ) ; ,369

Svsout . Breakouiimaae f

Inimade J

IF Sasin . ImeSe . Strip EQ "TTY" THEN lister Sysout ELSE BEGIN li6ter :- NEW Outfile < Sasin . Image > % lister . Open ( Blanks < 200 > > » END ;

PROCEDURE tree ( b ) i

BOOLEAN b S

BEGIN END i

CLASS C-Prostram f

be6in REF ( c.exps ) a.exps . REF ( c-eF > *.«F i PROCEDURE t.out f BEGIN 0t2 ( "SEARCH TEVAL .NA< a.exps . t-.ou.fc f otZ <, '.STOP END GO * END f REF ( c.prosram ) PROCEDURE p_Prosram }

INSPECT NEW e-prcyfranr DO BEGIN P_proSram I- THIS c.program r loit-io i ini b_xc J e.exps .'— p.exp* ; chek < U3 ) ; t.oub ; en . Clo*« > ex • Close < a.ef r-ef > end ;

CLASS c-6xps ( Ihfl ) r

REF < C-exPS > lhs f

BEGIN REF ( c.exp ) a.exp f PROCEDURE t.eut » BEGIN IF lhs NONE THEN BEGIN a.exp . t.out i ok2 C ".PRINT M » ) I END ELSE IF lhs NONE THEN BEGIN lhs . t.out f a-exp . t.out r ot2 t *.PRINT ** " ' ** > * END ELSE BEGIN sj.err ( "EXPS NOT COVEREJ BY ANY OUT* ) » tree ( FALSE > » END r END ( END r

REF ( c.exps ) PROCEDURE p.exes 1 ,370

BEGIN REF ( c_exj»s ) p ; INSPECT NEW c-exps ( JVONE ) DO BEGIN P «- THIS c_e*ps ) a_exp p.bxp ; END * WHILE xno EQ 1«4 DO BEGIN INSPECT NEW e.exp's ( p ) DO BEGIN P J- THIS c-exps f xirt » juexp r END ! END ? p_exp0 J- p | END >

CLASS c_exp ( lha ) J

REF < c_exp ) lbs ,

The main program is called SIM24 (first part of Simula, edit 24).

The following is a listing of the lexic syntax and transform

divisions of SIM24 as input to CF1: for interest it is followed by

the few small pieces of handwritten code that support it.

cm! PGET ••ZCINTP a ( CCHAR PGET »*ZCCHARP a POFFP ••• PON* ANY POFFP t CTEXT PGET »*ZCTEXTP a POFFP PONP < <*ANY-•••-E0F> POFFP * PONP > :

ID ®GET *»ZIDp a LETTER <•( LETTERIDIGITl • P • I • PM'«.• >> »

ENDFILE a EOF,

PPUT ••CALL»*INTP PPUT **CALL**STRP PPUT ••CALL**0T3P PPUT **CALL**0TLP PPUT **CALL**G—ERRP PPUT **CALL**CHEKP

PROGRAK«PPUT INIT.IO; INIT.XCjP SERIALD, ENDF1LEM, •PUT EN.CLOSE; EN.OPEN(BLANKS(200));P PPUT Kta« ') KIN; XNOt*0; XIN; § SERIALC, ENDFILE*2 PPUT EN.CLOSE; EX.CLOSE;P :

SERIALDa< UNIT PPUT A.UNIT.TJ);PX-NONE;OTL(NL);P *> i

SERIALC»< UNIT PPUT A„UNIT.T_C»PS-NONE;OTL(NL);P *> : 371 •

serial* uhit" C ST I DC ), "J" I st* t("LOOP","J•i"TEST",":")), STL : stl* C "IF", EXP, "THEN", ( STU-THEN, ["ELSE", [SfELSEl] 1 ! STZ*THEN ! "ELSE",[ST*ELSE2] ) • STZ* NONIF ) : stz* ( STJ • STD 1 STI ) : stu» ( STX J STB • "GOTO",("LOOP"!"TEST") ) I std* C "FOR", ID, ":*", EXP*l, "STEP", EXP*2, "UNTIL" • "WHILE" ), EXP*3, "DO", ST J

STI* "INSPECT", EXP, "DO", ST, ["OTHERWISE", ST*OJ t

STX* EXPP, [(":-"!*:*"),EXPJ I STB* "BEGIN", [SERIAL); "END" :

DC* ( "PROCEDURE", DCP ! [("XC"I"XC")1, "CLASS", DCC ! TYPE, { "PROCEDURE",DCP-T ! "ARRAY",DCAS ! IDS ) ) I

HCP* ID, ( ";" I "(",IDS,")",";•,(DCVNSJ,0CSPECS ), ST : DCC* IDXC, ( "?" i "(",IDS,")",";",[DCVLSI,DCSPECS ), DCVTSST X DCVNS* < DCVN *> J DCVN* ("VALUE"!"NAME"), IDS, ">• : DCVLS* < DCVL * > J DCVL* "VALUE", [IDS], "*• * DCSPECS*< OCSPEC * > : DCSPEC* TYPE, (("ARRAY"!"PROCEDURE")], IDS, ")" I DCVTSST*("VIRTUAL","J",DCVTSl, ST : DCVTS* < DCVT •> s DCVT* (TYPE), "PROCEDURE", IDS, ";" t

DCAS* < DCA •",">: DCA* IDS, "[",DCADS,"J" X DCABS «GET **NDCARS« • * < DCAB * "," > s DCAB* EXP"L,":",EXP*U t

TYPE* ("INTEGER"!"CMARACTFR"•"BOOLEAN"!"TEXT"!"REF","(",IDXC,")"):

EXP* ( EXPZ ! "IF", EXP*R, "THEN", EXPZ*T, "ELSE", EXP*E ): EXPZ* < [ "NOT"],EXPR * ("AND"!"OR"!"EQV") >: EXPR* EXPA, ( ( "EQ" ! "NE" ! "I,T"! "GT" ! "LE" ! "GE" I"**"!"*/*" 1 "IS" ! "QUA" ) , EXPA*R )l

EXPA* < [("•"•"-")],EXPH * ("••!"-") > : EXPH* < EXPP • •*• > :

EXPP* EXPPR,(EXPSUBS): EXPPR* ( CONST ! ID ! EXPPX ) * EXPPX* ( "NEW",ID*N ! "THIS",TDXC ! "(",EXP,")" )S EXPSUR5*

CONST* ( "TRUE" ! "FALSE" ! "NOTEXT" • "NONE" ! CINT I CCHAR ! CTEXT ) X

EXPS RGET *«NEXPSP « < EXP •",">: IDS «GET »*Ninsi> * < io * ",• >: IDXC* (ID!"XC"!"XC") V

c(st!stl) *> c(stl) : c(stlistz) *> ccstz) » c(stz!stu) z*> C(stu) : c(stz!std) c(std) i c(stzisti) *> c(stl) » c(stu!stb) *> c(stb) : c(stuistx) c(stx) i ,372

C(STi"LOOP:"STL)»> "TLAB LOOP" C(STL) I C(STUi"GOTO LOOP")=> "?JUMP LOOP" J

C(ST1"IEST:"STL)*> "TLAB TEST" C(STL) : C(STU{"GOTO TEST")»> "TJUMP TEST" I

C(3TLi"IF"EXP"THEN"STU"ELSE"ST) a> "TBEGIN IF" L(EXP) "7JFALSE ELSE" C(STU) "'JUMP EXIT" •LAB ELSE" C(ST) •LAB EXIT" •TEND IF" I C(STLi"IF"EXP"THEN"STZ) «> C(STLi"IF"EXP"THEN BEGIN'STZ"?END ELSE BEGIN END") I C(STL!"IF"EXP"THEN"STU) «> C(STLl"IF"EXP"THEN"STU"ELSE BEGIN END") t C(STL'"IF"EXP"THEN ELSE")=> 00 J CCSTLi"IF"EXP"THEN"STU"ELSE")«> C(STLl"IF"EXP"THEN"STU"ELSE BEGIN END") I

ClSTDi"NHILE"EXP"DO"ST)a> "TBEGIN" C(STU!"GOTO TEST") C ( ST I"LOOP I BEGIN" ST "JEND") C(ST!"TEST* IF" EXP "THEN GOTO LOOP") •TEND" ;

C(STDl"F0R"id"j3"exp"l"STEP"exp*2"UNTIL"exp*3"d0"ST) a> "tbegin" C(STXJID"X «"EXP*1) C(DC1"INTEGER STEEP,STOOP") Cf DC I"BOOLEAN STEEB,STOOB") C(STX!"STF.EPX«"EXP*2) t(STX!"STEEBjsSTEEP GT 0") C(ST'I • "GOTO TEST") C(ST!"LOOP : BEGIN" ST "J END") C(STX!ID"I«"ID"fSTEEP") C(ST!"TEST S STOOP!»"EXP*3) C(STX!"STOOBX»"lD"GE STOOP") CCSTXI"STOOBS*STOOB EQV STOOP") CCSTLJ"IF STOOB THEN GOTO LOOP") "TEND" J

C(STII"INSPECT"EXP?D0"ST*I"0THERWISE"ST*2) »> "TBEGIN INSPECT" L(EXP) "TJNONE OTHER" C(ST*1) •TJUMP EXIT" •LAB OTHER" C(ST*2) "LArB EXIT" "TEND INSPECT" t C C STI1"1NSPECT"EXP"D0"ST) »> C(STII"INSPECT"EXP"DO"ST"OTHERWISE BEGIN END") S

C(STX!EXPP":-"EXP)«> L(EXP) SR(EXPP) I C(STXIEXPP":«"EXP)«> L(EXP) SO(FXPP) J C(STXiEXPP) »> L(EXPP) t

CCSTB!"BEGIN"SERIAL"END") «> "TBEGIN " D(SEPIAL) CCSERIAL) "TEND " X C(STB!"BEGIN END") *> e« : C(SERIAL J SERIAL UNIT)»> C(SERIAL) C(UNIT) * C(SERIAL!UNIT)»> C(HNIT) *

C(UNIT!ST";") «> C(ST) I C(UNIT!DC";") »> C(PC) I

D(SERIAL!SFRIAL UNIT)«> DC SERIAL) DCUNIT) X D(SERIAL!UNIT)»> D(UNlT) I

DCUNIT!ST";") *> I D(UNIT!DC";") «> DCDC) 1

CCDC!"PROCEDURE"DCP)®> "7PROCBODY " CCDCP) "?PROCEND " S C(DC!TYPE"PROCEDURE"DCP)•> "7PROCBODY • C(DCP) "7PROCEND • X

CCDCP1 ID";"ST) »> ZCID) CCST) X C(DCP!ID"("IDS")/"DCSPECS ST) »> Z(ID) CCST) X

CCDCP!ID"("IDS")J"DCVNS UCSPECS ST) «> ZCID) CCST) X

D(DCi"PROCEDURE"DCP) »> "7PROCSPEC VOID" DCDCP) "7PROCEND " X D(DC I TYPE"PROCEDURE"DCP)«> "7PROCSPEC " Z(TYPE) D(DCP) "7PROCEND

DCDCP!IO";"ST)s> ZCID) "0" x

DCDCPJID"("IDS");"DCSPECS ST) «> ZCID) N(IDS) ONCDCP!ID"("IDS")|"DCSPECS ST) X

DNCDCPiID"("IOS","ID-0");" DCSPECS ST) »> DN(DCP!ID"C"IDS")|" DCSPECS ST) DN(DCP!ID"("ID-0")/" DCSPECS ST) X

DNCDCP!ID"("ID"0");" DCSPECS ST) *> "7PARAM • ZCID-0) " SKIP • DS(DCP!ID"C"ID*0")|" DCSPECS ST) X

D(DCP!IO"("IDS");"DCVNS DCSPECS ST) »> ZCID) N(IDS) DN(DCP!ID"("IDS");"DCVNS DCSPECS ST) X

DN(DCPJID"("IDS","ID-0");"DCVNS DCSPECS ST) «> DN(DCP!1D"("IDS");"DCVNS DCSPECS ST) DN(DCP!ID"("ID*0*);"DCVNS DCSPECS ST) J

DN(DCP!ID"("10*0")|"DCVNS DCSPECS ST) «> "7PARAM" ZCID-0) DVCDCP!ID"("ID-0");"DCVNS DCSPECS ST) DS(DCP!ID"("ID"0");"DCSPECS ST) X

DVC DCP! ID" ( " ID-.U" ) ; "DCVNS DC VN DCSPECS ST) »> : DV(DCP!ID"C"ID-0");"DCVNS DCSPECS ST) DVCDCP!ID"("ID*0");"DCVN DCSPECS ST) X OVCDCP! ID"("ID*.W") »VALUE" IDS" ,"ID-1";"DCSPECS ST) «> DV(DCP!ID"("1D*0");VALUE"IDS"X"DCSPECS ST) " DVCDCP!ID"("ID-0");VALUE"ID-1";"DCSPECS ST) X dv(dcp!ID"("io-0");value"id-0";"dcspecs st) •> • value • x DVC DCP!ID"("ID"0");VALUE"ID-1";"DCSPECS ST) «> BR X DVCDCP! ID"("ID-R");NAM{i:"IDS","in-l"; "DCSPECS ST) »> '!.<• DVCDCP!ID"C"ID-0");NAME"IDS";"DCSPECS ST) DV(DCP!ID"C"ID-0");NANE"ID-1";"DCSPECS ST) X DV(DCP!ID"(" IO-.0" ) ; N A.'4£" ID-0" ; "DCSPECS ST) »> "NAME" X DV(DCP!ID"C"I0-J");NA»!E"ID-1";"DCSPECS ST) •> RB x ,374 \

DS(DCP!ID"("ID*0")/"DCSPECS DCSPEC ST) => DS(DCP!ID"("ID*0")/"OCSPECS ST) DS(DCPiID"("ID*0")/"DCSPEC ST) : DS(DCPUO"("tD*0")/"TYPE IDS","ID*1"/"ST) => , DS(DCP!ID"("ID*0")/"TYPE IDS"?"ST) DS(DCP'ID"("ID*0")/"TYPE ID*l"?"ST) J OS(DCP!ID"("ID*0")/ "TYPE ID*0"/"ST) «> Z(TYPE) : nS(DCPiIO"("IO"0")/"TYPE ID*1"/"ST) a> pp J

C(DCi"CLASS"DCC) »> "7CLASBODY C(DCC) "7CLASEND • J C(DC!"XC CLASS"DCC) «> "7CLASBOD* C(DCC) "7CLASEND • t C(DC!"XC CLASS'DCC) •> "7CI.ASBODY C(DCC) "7CLASEND * t C(DCClID";"ST) •> Z(ID) C(ST) I C(DCC1 ID"("IDS")/"DCSPECS ST) *> Z(ID) C(ST) t C(DCC!ID"("IDS")/"DCVLS DCSPECS ST) •> Z(ID) C(ST) I

D(DC! "CLASS'DCO «> "7CLASSPEC D(DCC) "7CLASEND • J D(DCI"XC CLASS"DCC) »> "7CI.ASSPEC XC " n(DCC) "7CLASEND " S D(DCl"XC CLASS"DCC) «> "7CLASSPEC XC • D(DCC) "7CLASEND » : 0(DCC!ID";"ST) »> Z(ID) "0" DC(ST)

D(DCC!ID"("IDS")/"DCSPECS ST) «> Z(1D) N(IDS) DN(DCC!ID"("IDS")/"DCSPECS ST) DC(ST) t DC(STI"BEGIN"SERIAL"END") •> "7CLASATT • D(SERIAL) : DC(ST) »> BP :

DM(DCC!ID"("IDS","ID*0")?" DCSPECS ST) *> DN(DCCiID"("IDS")/" DCSPECS ST) DN(DCC!ID"("ID*0")/" DCSPECS ST) : DN(DCClID"("10*0")I" DCSPECS ST) •> "7PARAM " Z(ID*0) " SKIP " DSCDCCiID"("ID*0")/• DCSPECS ST) : D(DCCIIO"("IDS")/"OCVLS DCSPECS ST) » Z(ID) N(IDS) DN(DCC!ID"("IDS")/"DCVLS DCSPECS ST) t

DN(DCC110"("IDS*,"ID*0")/"DCVLS DCSPECS ST) «> DN(DCCUD"("IDS")/"DCVLS OCSPECS ST) DN(DCCiID"("ID*0")f"DCVLS DCSPECS ST) : DN(DCC!ID"("10*0")/"DCVLS DCSPECS ST) *> "7PARAM" Z(1D*0) DV(DCC JID"("ID*0")l"DCVLS DCSPECS ST) DS(DCC!ID"("ID*0")/"DCSPECS ST) S

DV(DCC!ID"("1D*0")/"DCVLS PCVL DCSPECS ST) a> DV(DCCiID"("ID*0")|•DCVLS DCSPECS ST)

DV(nCC!ID"("ID*0*);"DCVL DCSPECS ST) t DV(DCC!ID"("tO*0")I VALUE"tDS","ID*l"|"DCSPECS ST) s> DV(DCC!ID"("ID*0")/VALUE"IDS"/"DCSPECS ST) ' DV(DCCIID"("ID*0")/VALUE"ID*1"/"DCSPECS ST) * DV(DCClID"("TD*0")/VALUE"tD*0"/"DCSPECS ST) •> " VALUE " t DV(DCC!ID"("ID*0")/VALUE"I0*1"/"DCSPECS ST) «> PP : ,375

ns(OCC!I0"("10-3");"DCSPECS DCSPEC ST) s> ns(dcciid"("id*0");"ocspecs st) hs(dcc!id"("id*0");"dcspec st) t ds(dcciid"("10-3");"type ids","id"1";"st) s> DS(OCC:iD"("ID-0");"TYPE IDS";"ST) DS(DCC!ID"t"ID-0")X"TYPE ID-1";"ST) » OS ( OCC ! ID" I " I D-.3" ) # "TYPE I0-3";"ST) x> Z(TYPE) i DS(DCCJI0"("ID-3")J"TYPE ID-|";"ST) x> P? X

D(DCITYPE"ARRAY"DCAS","DCA) *> D(DC!TYPE"ARRAY"DCAS) D(DC!TYPE"ARRAY"DCA) I 0(DC I TYPE"ARRAY"IDS"["DCARS" ] ") •> "7ARRD1MS " Z(TYPE) N(DCABS) OM(DCABS) "7ARRNAMES " N(IDS) DM(IDS) "7ARREND " I

DM(DCABS!DCABS","DCAB) •> DM(DCABS) DH(DCAB) I DH(DCABSIDCAB) X> dm(dcab) x DM(DCABI EXP*1"I"EXP*2) •> "7DIML0 * L(EXP*1) "7D1MHI • LCEXP-2)

DM(IDS!IDS","ID) x> dm(ids) dm(io) t DM(IOSIIO) a> DM(ID) J OM(IDIIO) »> "7" z(id) 1

C(DC!TYPE IDS) •> PP J D(DCITYPE IDS","ID) •> D(DCI TYPE IDS) DIDCiTYPE ID) I DIDCITYPE 10) «> "70BJ " Z(TYPE) Z(ID) I

ZCTYPE "INTEGER") »> " INT " I ZCTYPE "CHARACTER") •> " char • s ZITYPE "BOOLEAN") «> " BOOL " X ZITYPE "TEXT") »> " TEXT " X ZITYPE "REEIXO") •> •REFI XC)" X ZCTYPE •REE(XC)")a> •REFtXC)" X ZITYPE "REEI"ID")")•> "REFt" ZIID) ')" I

"TSTOREOBtl " SOIEXPP) LAIEXPP) "7ST0REND "X

x> "7STOREREF " SRIEXPP) LAIEXPP) "7ST0RF.ND "X

SOIEXPP1ID) x> "7STDRE0BJT0 " ZtID) X SRIEXPP!ID) *> "7ST0REPEFT0 " ZtID) X

LAIEXPP1EXPPR EXPSUBS) «> LIEXPPR) LAIEXPSUBS) X LAIEXPSUBS!EXPSUQS EXPSUB) x> L(EXPSUBS) LAIEXPSUBSIEXPSUB) X LA I F.XPSUBS!","ID) »> "7FIELDA" ZtID) I LAtEXPSUBS!"("EXPS") ^ ATTEMPT TO STORE TO A PROCEDURE CALL ***" X LAIEXPSUBS!"("EXPS"]") x> "7B0UNDSA" NIEXPS) I.SIEXPS) "7BOENOA" X ,376

L(EXP'EXPZ) s> LCEXPZ) t

LCEXP!"IF"EXP"THEN"FXPZ*t "EI.SE"EXP*2) a> "7L0ADIF" L(EXP) "7JFALSE ELSE" HFXPZ-1) "7JUMP EXIT" "7LAP ELSE" L(EXP*2) "7LAH EXIT" "7LENDIF" J

LIEXPZIGXPZ"AND NOT"EXPR) ~> L(FXPZ) LIEXPZ!"NOT"EXPR) "70PAND " LCEXPZ!EX"Z"OR NOT"EXPR) a> L(LXPZ) LCEXPZ!"N0T"EXPR) "70P0R " : L(EXPZ1"N0T"EXPR) => L(F.XPR) "?OPNOT" J L(EXP7.!EXPZ"AND"EXPR) a> L(F.XPZ) LCEXPRV "70PAND " : T,(EXPZ!EXPZ"OR"EXPP) => L(FXPZ) L(EXPR) "70POR " J L(EXPZIEXPR) «> L(EXPR) :

LCEXPR!£XPA"t"EQ"EXPA*2) > LCEXPA*1 ) LCEXPA-2) "70PEQ"I L(EXPR!EXPA"1"NE"EXPA*2) > LCEXPA-1 ) LCEXPA-2) "70PNE": L(EXPRJEXPA"1"LT"EXPA*2) > LCEXPA*1 ) LCEXPA-2) "70PLT": L(EXPR!EXPA*1"LE"EXPA"2) > LCEXPA*1 ) LCEXPA-2) "70PLE"! > L(EXPR!EXPA"1"GT"EXPA"2) > LCFXPA*1 ) LCEXPA-2) "?OPGT"S L(EXPR!EXPA"l"GE"EXPA-2) > L(FXPA-1 ) LCEXPA-2) "70PGE": L(EXPR!EXPA"1"««"EXPA"2) => LCFXPAM ) LCEXPA-2) "70PEQREF"I L(EXPR!EXPA"1"a/*"EXPA"2) f.CFXPAM ) LCEXPA-2) "70PNEKEF"S L(EXPR!EXPA"IS"IU) a L(EXPA) "70PIS" ZCID) | LCEXPR!EXPA"(JUA"ID) * L(EXPA) "70PQUA" ZCID) | LCEXPR1EXPA) a L(EXPA) :

LCEXPA!EXPA"* •"EXPM) a L(EXPA) LCEXPM) "70PPLUS L(EXPA!EXPA"- ••EXPM) a L(FXPA) LCEXPM) "TOPSUbT LCEXPA!"••EXPM) a L(EXPM) I LCEXPAi 6XPA"t -"EXPM) a LCEXPA) LCEXPM) "TOPSUBT LIEXPA1EXPA"- -"EXPM) a I.(FXPA) LCEXPM) "70PPLUS LCEXPA!"-"FXPM) a L(EXPM) "70PMINUS" * L(EXPA!EXPA"*"EXPM) a L(EXPA) LCEXPM) "70PPLUS L(EXPAiEXPA"-"EXPM) a LCEXPA) LCEXPM) "70PSUBT L(FXPAiEXPM) a L(EXPM) I

LCEXPM!EXPM"*"EXPP) a> LCEXPM) LCEXPP) "70PMULT " I L(EXPM'EXPP) a> LCEXPP) I

LCEXPP!EXPPR EXPSUBS) a> "7COMPOUND " LC EXPPR) LCEXPSUBS) "7COMPEND " t LCEXPP!EXPPR) a> LC EXPPR) I L(EXPSUBS!EXPSUBS EXPSUB) a> LCEXPSUBS) LC EXPSUBS!EXPSUB) LCFXPSUBS!"."I0) a> "7FIELD • ZCID) : LCEXPSUBS!" C'EXlPS") • ) a> "7BOUNDS " NCEXPS) LSCEXPS) •7BOEND • I LCEXPSUBS!*C"EXPS")") a> "7PARAMS " N(EXPS) LSCEXPS) "7PAREND • I

LSCEXPS!EXPS","EXP) a> LSCEXPS) LSCEXPS!EXP) : L3CEXP31EXP) a> "7SUB " L(EXP) I

LCEXPPR!CONST) a> LCCONST) I LCEXPPR!10) a> "7I.0A0ID " ZCIO) : LCEXPPR!"NEW"ID) a> "7L0ADNEW " Z(1D) I LCEXPPR!"THIS"id) xi "7L0ADTHTS " ZCIO) : LCEXPPR!"THIS XC") a> "7L0A0THIS XC" J L(EXPPR!"THIS XC") a> • "7L0A0THIS XC" I LCEXPPR!"C"EXP")") => t.CEXP) : ,377

L(C0Nsr: •TRUE") a> "TLOADT • X L( CONST!"FALSE " ) => "7LOAOF a • a L(CONST! "NOTEXT ") a> •7LOADX X L(CONSTi •NONE") a> "7LOADN a X L(CONST! C IN T ) a> "7LOAUI a Z(CINT) X L(CONST! CCUAR) a> "7LOADC • Z(CCHAR) X L(CONST! CTEXT) a> "7LOADT a Z(CTEXT) X

C(STB) •> "«•« BAD stb • »•" t C(SERIAL)«> BAD serial »••" | D(SERTAL)a> •»»• BAD serial *••• t C(UN IT)a> "*»« BAD unit *»*" X D('IHIT) a> •»»« HAD unit i C(ST) »> •«** BAD st »•*• X C(STL) »> "•«• BAD stl • x C(STZ) »> "•»• BAD stz •»»" x- C(STU) •> ' "••• BAD stm «»«" x C(STD) a> "••• BAD std • **" X C(STI) •> "•** bAD sti ••»" X C(STX) •> BAD stx *»•• X C(DC) «> "*«• HAD dc • *«" X D(DC) •> "••• BAD dc •««• X C(DCP) •> BAD dcp »»«" X D(DCP) »> "*«* BAD dcp • »•• DS(DCP) •> "«•» BAD dcp •»*" DN(DCP) «> "•«» BAD dcp *»«" DV(DCP) «> "«*• BAD dcp *•«" C(DCC) «> "«*« BAD dcc • «•" D(DCC) «> •*** BAD dcc DN(DCC) «> BAD dcc »»•• DS(DCC) «> BAD dcc DV(DCC) •> "«*» bAD dcc X(DCVNS)a> -*** BAO dcvns «*•" x X(DCVN)•> •»•• BAD dcvn X X(DCVLS)»> '*** BAD dcvls »»*" X X(DCVL)»> BAD dcvl «•»" X X(DCSPECS)a> BAD dcspecs »»»• X X(DCSPEC)•> "«»• BAD dcspec »••" x DM(DCABS)«> '*** BAD dcabs • •«" x Z(TYPE)a> BAD type **»• X L(EXP) •> BAD exp *••• X L(EXPZ)«> BAD expz X L(EXPR)•> "*«» BAD expr »**• X L(EXPA)a> "*•« BAD expa «••" : L(EXPN)«> ••»• BAD expm : L(EXPP)•> BAD expp X LA(EXPP)«> •»*• BAD expp «*•• : L(EXPPR)•> »•** BAD exppr »*«" X L(EXPSUBS)a> "••« BAD expsubs »*•" X LA(EXPSURS)a> "•»» BAD expsubs »••" X X(EXPSUB)a> "»»» BAD expsub «»»" x L(CONST)a> "*»• BAD const »«*" i LS(EXPS)a> "»•• BAD exps *•*• X DH(IDS) «> BAD ids «*»" X

••NDCABS

INTEGER PROCEDURE T.Nl T.NjsIF LHSsa'IDME THEN 1 EI.SE 1+LHS.T.N;

••MEXPS

INTEGER PROCEDURE T.N; T.N s«IE l.HSasU'jMF THEN 1 ELSE t+LHS.T.N;

•*NTOS

INTEGER PROCEDURE T.N; • T.N:«IF LHSaaNOME THEN 1 El,SE 1+LHS.T.N;

*»ZCINT

PROCEDURE T.Z; OT3("•",IX,• "); ,378

••ZCCHAR

procedure t_z: ot3(",",ix," ")*

••zctext procedure t.z; ot2(strcix)");

**zid procedure t_zj 0t3("s",ix,"

The main first names of transforms are:-

C - compile statements, second pass over declarations

D - first pass over declarations

L - load an expression

S - store to a variable (which may be a class attribute).

Second letters are variants on the first, and are mostly

obvious from the original call: for example S (store) calls

LA (load address) when it needs an address prior to a store-

indirect (page 375): DN and DV respectively declare name and

value parameters (page 374). The Z transforms are simple

copies of reserved lexemes or common groups of reserved lexemes.

The Z transforms called but not defined are predefined for every lexic class, and have the effect of

outputting the buffered characters of the lexeme. The hand-

written transforms N count the length of a syntactic-class

repeat.

The 396 CF1 input lines generate a Simula source of about 3200

lines (not counting blank lines): as will be seen from the

excerpts of generated Simula given later, the CF1 input is

much simpler to read and (because of the case analysis tests)

much, much easier to write.

To avoid having to hold a tree of the input to SIM24 wholly in main store, and yet to make the two passes at outermost level, the input is parsed twice. The CLOSE/OPEN sequence in the escape in the PROGRAM syntax rule (on page 370) effects the resetting of the input to start-of-file. SERIALD is then the outermost declaration-pass, SERIALC is the outermost compilation-pass. (SERIAL is both declaration and compilation passes over inner lists of declarations and statements.) This allows both

SERIALD and SERIALC to parse a top-level UNIT, and then enter an escape that calls for the output of the tree and erases the pointer to the tree (which allows the tree to be garbage-collected).
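
(The arrangement can be sketched as follows; this is an illustrative
Python rendering, and the helper names read_units, parse_unit and the two
pass procedures are invented stand-ins, not part of CF1 or SIM24.)

    def read_units(src):
        # Stand-in for the parser's UNIT recogniser: here, one unit per line.
        for line in src:
            if line.strip():
                yield line.strip()

    def parse_unit(text):
        return ('UNIT', text)              # stand-in parse tree

    def declaration_pass(tree): print('D:', tree[1])
    def compilation_pass(tree): print('C:', tree[1])

    def run_pass(path, transform):
        with open(path) as src:            # OPEN after CLOSE: back to start-of-file
            for unit_text in read_units(src):
                tree = parse_unit(unit_text)
                transform(tree)            # output for this UNIT
                del tree                   # erase the only pointer: tree reclaimable

    def compile_file(path):
        run_pass(path, declaration_pass)   # the SERIALD pass
        run_pass(path, compilation_pass)   # the SERIALC pass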

The run of CF1 of which the start is shown on pages 368-370 gave an output file of which the following is the start (of the SERIALD output).

.TYPE TINY IS1

OBJ REF($outfile)$lister

PROCSPEC VOID$tree PARAM $b SKIP BOOL PROCEND

CLASSPEC $c_program () CLASATT OBJ REF($c_exps)$a_exps OBJ REF($c_ef)$a_ef PROCSPEC VOID$t_out () PROCEND CLASEND

PROCSPEC REF($c_program)$p_program () PROCEND

CLASSPEC $c_exps PARAM $lhs SKIP REF($c_exps) CLASATT OBJ REF($c_exp)$a_exp PROCSPEC VOID$t_out () PROCEND CLASEND

PROCSPEC REF($c_exps)$p_exps () PROCEND

CLASSPEC $c_exp PARAM $lhs SKIP REF($c_exp) CLASATT OBJ INT $a3 OBJ BOOL $a1 OBJ INT $a2 OBJ REF($c_prim)$a_prim PROCSPEC VOID$t_out () PROCEND CLASEND

PROCSPEC REF($c_exp)$p_exp () PROCEND

CLASSPEC $c_prim () CLASATT OBJ INT $a1 OBJ REF($c_unsignedno)$a_unsignedno OBJ REF($c_exp)$a_exp PROCSPEC VOID$t_out () PROCEND CLASEND

PROCSPEC REF($c_prim)$p_prim () PROCEND

OBJ REF(XC)$r_id

CLASSPEC XC $c_id () CLASATT PROCSPEC REF(XC)$getr () PROCEND PROCSPEC VOID$setr PARAM $r SKIP REF(XC) PROCEND CLASEND

PROCSPEC REF($c_id)$p_id () PROCEND

OBJ REF(XC)$r_unsignedno

CLASSPEC XC $c_unsignedno () CLASATT PROCSPEC REF(XC)$getr () PROCEND PROCSPEC VOID$setr PARAM $r SKIP REF(XC) PROCEND CLASEND

PROCSPEC REF($c_unsignedno)$p_unsignedno () PROCEND

OBJ REF(XC)$r_ef

CLASSPEC XC $c_ef () CLASATT PROCSPEC REF(XC)$getr () PROCEND PROCSPEC VOID$setr PARAM $r SKIP REF(XC) PROCEND CLASEND

PROCSPEC REF($c_ef)$p_ef () PROCEND

The remainder of this appendix is excerpts from the fabricated

SIM24 Simula source. The first is the class for PROGRAM: a

simple syntactic list with escapes, and no transforms.

SIMULA %4A(310)  9-JUL-1981 20:09  PAGE    9-JUL-1981 20:02    PROGRAM

CLASS C_PROGRAM; BEGIN
   REF(C_SERIALD)A_SERIALD;
   REF(C_ENDFILE)A1_ENDFILE;
   REF(C_SERIALC)A_SERIALC;
   REF(C_ENDFILE)A2_ENDFILE;
END OF CLASS;

REF(C_PROGRAM)PROCEDURE P_PROGRAM; INSPECT NEW C_PROGRAM DO BEGIN
   P_PROGRAM:-THIS C_PROGRAM;
   INIT_IO; INIT_XC;
   A_SERIALD:-P_SERIALD;
   A1_ENDFILE:-P_ENDFILE;
   EN_CLOSE; FN_OPEN(BLANKS(200));
   K:=' '; KIN; XNO:=0; XIN;
   A_SERIALC:-P_SERIALC;
   A2_ENDFILE:-P_ENDFILE;
   EN_CLOSE; FX_CLOSE;
END OF PARSER;

The second excerpt is the class and parser of SERIAL (page 371):

a repeat with a fairly long test, and a three-case transform of

which the first is the end of a repeat, the second is earlier

stages of the repeat, and the third gives a better error-message

than the default message provided by CF1. Note that the two

tests LHS==NONE and LHS=/=NONE are logically complementary:

this is a simple case where a post-CFl Simula-to-Simula optim-

iser could delete the if LHS=/=NONE then and the matching else

part - that is, the "error message" case would be removed. As

observed in 8.7, this optimiser could itself be written in

CF1 (but would be best preceded by SIMPRE).
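
(The deletion can be sketched as follows; the test representation is
invented for illustration - the real optimiser would work on Simula text.)

    def complement(a, b):
        # ('==', x, y) and ('=/=', x, y) are syntactic complements.
        ops = {'==': '=/=', '=/=': '=='}
        return a[1:] == b[1:] and ops.get(a[0]) == b[0]

    def simplify_chain(chain):
        # chain is a list of (test, body); test None is the final ELSE.
        if (len(chain) >= 3 and chain[0][0] is not None
                and chain[1][0] is not None
                and complement(chain[0][0], chain[1][0])):
            # The second test always holds when reached, so it becomes the
            # ELSE part and the remaining branches are unreachable.
            return [chain[0], (None, chain[1][1])]
        return chain

    chain = [(('==', 'LHS', 'NONE'), 'A_UNIT.T_C'),
             (('=/=', 'LHS', 'NONE'), 'LHS.T_C; A_UNIT.T_C'),
             (None, 'OT2("*** BAD SERIAL ***","")')]
    print(simplify_chain(chain))           # the error-message case disappears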

SIMULA %4A(310)  9-JUL-1981 20:09  PAGE 5   9-JUL-1981 20:02    SERIAL

CLASS C_SERIAL(LHS); REF(C_SERIAL)LHS; BEGIN
   REF(C_UNIT)A_UNIT;

   PROCEDURE T_C; BEGIN
   IF LHS==NONE THEN BEGIN
      A_UNIT.T_C;
   END
   ELSE IF LHS=/=NONE THEN BEGIN
      LHS.T_C;
      A_UNIT.T_C;
   END
   ELSE BEGIN
      OT2("*** BAD SERIAL ***","");
   END
   END;

   PROCEDURE T_D; BEGIN
   IF LHS==NONE THEN BEGIN
      A_UNIT.T_D;
   END
   ELSE IF LHS=/=NONE THEN BEGIN
      LHS.T_D;
      A_UNIT.T_D;
   END
   ELSE BEGIN
      OT2("*** BAD SERIAL ***","");
   END
   END;

END OF CLASS;

REF(C_SERIAL)PROCEDURE P_SERIAL; BEGIN REF(C_SERIAL)P;
   INSPECT NEW C_SERIAL(NONE) DO BEGIN P:-THIS C_SERIAL;
      A_UNIT:-P_UNIT;
   END;
   WHILE XNO=37 OR XNO=39 OR XNO=41 OR XNO=131 OR XNO=132 OR XNO=133 OR
   XNO=134 OR XNO=1 OR XNO=3 OR XNO=8 OR XNO=11 OR XNO=126 OR XNO=127 OR
   XNO=74 OR XNO=62 OR XNO=51 OR XNO=52 OR XNO=56 OR XNO=58 OR XNO=64 OR
   XNO=66 OR XNO=67 OR XNO=68 OR XNO=96 OR XNO=97 OR XNO=98 OR XNO=99 OR XNO=100 DO BEGIN
      INSPECT NEW C_SERIAL(P) DO BEGIN P:-THIS C_SERIAL;
         A_UNIT:-P_UNIT;
      END END;
   P_SERIAL:-P;
END OF PARSER;

The third excerpt is the class and parser of STU (page 371): a

choice within a choice, with a five-case transform with the last

a redundant error message. Here the source optimiser would have

to treat the third and fourth tests together as A1=3 and

(A2=1 or A2=2): it would need to know that the only valid values

of A2 are 1 and 2 so that together the third and fourth tests

are equivalent to A1=3; that the only valid values of A1 are

1, 2 and 3, so the fifth case is redundant, the third test can

be reduced to only A2=1, and the fourth test is redundant.
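
(That range reasoning can be checked mechanically. The sketch below - an
illustration, not the proposed optimiser - enumerates the assumed ranges
of A1 and A2 over the five STU cases.)

    from itertools import product

    RANGES = {'A1': (1, 2, 3), 'A2': (1, 2)}     # the assumed valid values

    GUARDS = [
        lambda e: e['A1'] == 1,
        lambda e: e['A1'] == 2,
        lambda e: e['A1'] == 3 and e['A2'] == 1,
        lambda e: e['A1'] == 3 and e['A2'] == 2,
        lambda e: True,                          # the "*** BAD STU ***" default
    ]

    def reaching(i):
        # Valuations on which control falls through to guard i.
        envs = [dict(zip(RANGES, v)) for v in product(*RANGES.values())]
        return [e for e in envs if all(not g(e) for g in GUARDS[:i])]

    for i, g in enumerate(GUARDS):
        r = reaching(i)
        print(i, len(r), sum(1 for e in r if g(e)))
    # Guard 3 is taken by everything that reaches it (its test is redundant),
    # and nothing reaches guard 4: the error case is dead.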

SIMULA %4A(310)  9-JUL-1981 20:09  PAGE 10   9-JUL-1981 20:02    STU

CLASS C_STU; BEGIN
   INTEGER A1;
   REF(C_STX)A_STX;
   REF(C_STB)A_STB;
   INTEGER A2;

   PROCEDURE T_C; BEGIN
   IF A1=1 THEN BEGIN
      A_STX.T_C;
   END
   ELSE IF A1=2 THEN BEGIN
      A_STB.T_C;
   END
   ELSE IF (IF A1=3 THEN A2=1 ELSE FALSE) THEN BEGIN
      OT2("?JUMP LOOP","");
   END
   ELSE IF (IF A1=3 THEN A2=2 ELSE FALSE) THEN BEGIN
      OT2("?JUMP TEST","");
   END
   ELSE BEGIN
      OT2("*** BAD STU ***","");
   END
   END;

END OF CLASS;

REF(C_STU)PROCEDURE P_STU; INSPECT NEW C_STU DO BEGIN
   P_STU:-THIS C_STU;
   IF XNO=131 OR XNO=132 OR XNO=133 OR XNO=134 OR XNO=1 OR XNO=3 OR XNO=8 OR
   XNO=11 OR XNO=126 OR XNO=127 OR XNO=74 THEN BEGIN
      A1:=1;
      A_STX:-P_STX;
   END
   ELSE IF XNO=62 THEN BEGIN
      A1:=2;
      A_STB:-P_STB;
   END
   ELSE IF XNO=51 THEN BEGIN
      A1:=3;
      XIN;
      IF XNO=37 THEN BEGIN
         A2:=1;
         XIN;
      END
      ELSE IF XNO=39 THEN BEGIN
         A2:=2;
         XIN;
      END
      ELSE EXPECT("'LOOP' OR 'TEST'");
   END
   ELSE EXPECT("STX OR STB OR 'GOTO'");
END OF PARSER;

The fourth excerpt is the class and parser of STL (page 371):

nested options and choices with a multi-case transform.

The source optimiser would need the same strength as for STU,

but with a higher degree of nesting. The right hand sides of

the cases call for transforms of faked trees: the functions

with names of the form F_... are the fakers for the syntactic

class whose name is included in that of the faker.

SIMULA %4A(310)  9-JUL-1981 20:09  PAGE 8   9-JUL-1981 20:02    STL

CLASS C_STL; BEGIN
   INTEGER A1;
   REF(C_EXP)A_EXP;
   INTEGER A2;
   REF(C_STU)ATHEN_STU;
   BOOLEAN A3;
   BOOLEAN A4;
   REF(C_ST)AELSE1_ST;
   REF(C_STZ)ATHEN_STZ;
   BOOLEAN A5;
   REF(C_ST)AELSE2_ST;
   REF(C_STZ)ANONIF_STZ;

   PROCEDURE T_C; BEGIN
   IF (IF A1=1 THEN (IF A2=1 THEN (IF A3 THEN A4 ELSE FALSE) ELSE FALSE) ELSE FALSE) THEN BEGIN
      OT2("?BEGIN IF","");
      A_EXP.T_L;
      OT2("?JFALSE ELSE","");
      ATHEN_STU.T_C;
      OT2("?JUMP EXIT","");
      OT2("LAB ELSE","");
      AELSE1_ST.T_C;
      OT2("LAB EXIT","");
      OT2("?END IF","");
   END
   ELSE IF (IF A1=1 THEN (IF A2=1 THEN (IF A3 THEN NOT A4 ELSE FALSE) ELSE FALSE) ELSE FALSE) THEN BEGIN
      F_STL(1,A_EXP,1,ATHEN_STU,TRUE,TRUE,F_ST(FALSE,0,F_STL(2,NONE,0,NONE,
      FALSE,FALSE,NONE,NONE,FALSE,NONE,F_STZ(1,F_STU(2,NONE,F_STB(FALSE,NONE),0),
      NONE,NONE))),NONE,FALSE,NONE,NONE).T_C;
   END
   ELSE IF (IF A1=1 THEN (IF A2=1 THEN NOT A3 ELSE FALSE) ELSE FALSE) THEN BEGIN
      F_STL(1,A_EXP,1,ATHEN_STU,TRUE,TRUE,F_ST(FALSE,0,F_STL(2,NONE,0,NONE,
      FALSE,FALSE,NONE,NONE,FALSE,NONE,F_STZ(1,F_STU(2,NONE,F_STB(FALSE,NONE),0),
      NONE,NONE))),NONE,FALSE,NONE,NONE).T_C;
   END
   ELSE IF (IF A1=1 THEN A2=2 ELSE FALSE) THEN BEGIN
      F_STL(1,A_EXP,1,F_STU(2,NONE,F_STB(TRUE,F_SERIAL(NONE,F_UNIT(1,F_ST(FALSE,
      0,F_STL(2,NONE,0,NONE,FALSE,FALSE,NONE,NONE,FALSE,NONE,ATHEN_STZ)),NONE))),
      0),TRUE,TRUE,F_ST(FALSE,0,F_STL(2,NONE,0,NONE,FALSE,FALSE,NONE,NONE,
      FALSE,NONE,F_STZ(1,F_STU(2,NONE,F_STB(FALSE,NONE),0),NONE,NONE))),NONE,FALSE,NONE,NONE).T_C;
   END
   ELSE IF (IF A1=1 THEN (IF A2=3 THEN NOT A5 ELSE FALSE) ELSE FALSE) THEN BEGIN
   END
   ELSE IF A1=2 THEN BEGIN
      ANONIF_STZ.T_C;
   END
   ELSE BEGIN
      OT2("*** BAD STL ***","");
   END
   END;

SIMULA %4A(310)  9-JUL-1981 20:09  PAGE 8-1   9-JUL-1981 20:02    STL

END OF CLASS;

REF(C_STL)PROCEDURE P_STL; INSPECT NEW C_STL DO BEGIN
   P_STL:-THIS C_STL;
   IF XNO=41 THEN BEGIN
      A1:=1;
      XIN;
      A_EXP:-P_EXP;
      CHEK(43);
      IF XNO=131 OR XNO=132 OR XNO=133 OR XNO=134 OR XNO=1 OR XNO=3 OR XNO=8 OR
      XNO=11 OR XNO=126 OR XNO=127 OR XNO=74 OR XNO=52 OR XNO=51 THEN BEGIN
         A2:=1;
         ATHEN_STU:-P_STU;
         IF XNO=45 THEN BEGIN
            A3:=TRUE;
            XIN;
            IF XNO=37 OR XNO=39 OR XNO=41 OR XNO=131 OR XNO=132 OR XNO=133 OR XNO=134
            OR XNO=1 OR XNO=3 OR XNO=8 OR XNO=11 OR XNO=126 OR XNO=127 OR XNO=74 OR
            XNO=62 OR XNO=51 OR XNO=52 OR XNO=56 OR XNO=58 THEN BEGIN
               A4:=TRUE;
               AELSE1_ST:-P_ST;
            END;
         END;
      END
      ELSE IF XNO=131 OR XNO=132 OR XNO=133 OR XNO=134 OR XNO=1 OR XNO=3 OR
      XNO=8 OR XNO=11 OR XNO=126 OR XNO=127 OR XNO=74 OR XNO=62 OR XNO=51 OR
      XNO=52 OR XNO=56 OR XNO=58 THEN BEGIN
         A2:=2;
         ATHEN_STZ:-P_STZ;
      END
      ELSE IF XNO=45 THEN BEGIN
         A2:=3;
         XIN;
         IF XNO=37 OR XNO=39 OR XNO=41 OR XNO=131 OR XNO=132 OR XNO=133 OR XNO=134
         OR XNO=1 OR XNO=3 OR XNO=8 OR XNO=11 OR XNO=126 OR XNO=127 OR XNO=74 OR
         XNO=62 OR XNO=51 OR XNO=52 OR XNO=56 OR XNO=58 THEN BEGIN
            A5:=TRUE;
            AELSE2_ST:-P_ST;
         END;
      END
      ELSE EXPECT("STU OR STZ OR 'ELSE'");
   END
   ELSE IF XNO=131 OR XNO=132 OR XNO=133 OR XNO=134 OR XNO=1 OR XNO=3 OR
   XNO=8 OR XNO=11 OR XNO=126 OR XNO=127 OR XNO=74 OR XNO=62 OR XNO=51 OR
   XNO=52 OR XNO=56 OR XNO=58 THEN BEGIN
      A1:=2;
      ANONIF_STZ:-P_STZ;
   END
   ELSE EXPECT("'IF' OR STZ");
END OF PARSER;

As an example of a faker, the following is that of STL itself.

REF(C_STL)PROCEDURE F_STL(P1,P2,P3,P4,P5,P6,P7,P8,P9,P10,P11);
   INTEGER P1; REF(C_EXP)P2; INTEGER P3; REF(C_STU)P4; BOOLEAN P5; BOOLEAN P6;
   REF(C_ST)P7; REF(C_STZ)P8; BOOLEAN P9; REF(C_ST)P10; REF(C_STZ)P11;
   INSPECT NEW C_STL DO BEGIN F_STL:-THIS C_STL;
   A1:=P1; A_EXP:-P2; A2:=P3; ATHEN_STU:-P4; A3:=P5; A4:=P6; AELSE1_ST:-P7;
   ATHEN_STZ:-P8; A5:=P9; AELSE2_ST:-P10; ANONIF_STZ:-P11
   END;
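
(The mechanism can be sketched as follows; the node shapes and the tiny
transform are invented for illustration. A faker builds a parse-tree node
directly, without parsing, so one transform case can rewrite itself into a
tree that another case already handles, and then transform the faked tree.)

    class Node:
        def __init__(self, kind, **fields):
            self.kind, self.fields = kind, fields

    def f_stl(a1, a_exp, a2, athen, anonif):
        # Faker: construct an STL node directly, no parsing involved.
        return Node('STL', a1=a1, a_exp=a_exp, a2=a2, athen=athen, anonif=anonif)

    def t_c(n):
        f = n.fields
        if n.kind == 'EXP':
            print('?LOAD', f['name'])
        elif n.kind == 'STL' and f['a1'] == 1 and f['a2'] == 1:
            print('?BEGIN IF'); t_c(f['a_exp'])
            t_c(f['athen']); print('?END IF')
        elif n.kind == 'STL' and f['a1'] == 1 and f['a2'] == 2:
            # Re-express this case as a faked tree handled by the case above.
            t_c(f_stl(1, f['a_exp'], 1, f['athen'], None))

    t_c(f_stl(1, Node('EXP', name='X'), 2, Node('EXP', name='Y'), None))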

The final excerpt shows the two-line main program (the value 15

for Max_XC_ord is needed by an intelligent error routine,

which distinguishes between lexemes with XNo less than 15 as

due to lexic classes, or greater than 15 as due to reserved

lexemes - the error routine is in the pre-written library,

and so needs this value from the fabricated code). The

excerpt also shows two fakers for lexic classes, and another

kind of "oddment": a test for the equality of the buffers of

two class objects representing IDs - the test is safe in that

either or both of the class-object-pointer parameters may be

NONE (that is, a Simula null pointer).
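
(In Python terms, with None standing for NONE and IX, IY as the two lexeme
buffers, the test amounts to the following sketch.)

    def same_id(w, x):
        # Null-safe comparison: two NONEs are equal; NONE equals no object.
        if w is None:
            return x is None
        if x is None:
            return False
        return w.ix == x.ix and w.iy == x.iy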

SIMULA %4A(310)  9-JUL-1981 20:09  PAGE 48   9-JUL-1981 20:02    --LEXIC INVERSE ROUTINES

REF(C_CINT)PROCEDURE F_CINT(LL,AA); VALUE LL,AA; TEXT LL,AA;
INSPECT NEW C_CINT DO BEGIN F_CINT:-THIS C_CINT; IX:-COPY(LL); IY:-COPY(AA) END;

BOOLEAN PROCEDURE SAME_ID(W,X); REF(C_ID)W,X; INSPECT W
DO SAME_ID:=IF (X==NONE) THEN FALSE ELSE (IX=X.IX AND IY=X.IY)
OTHERWISE SAME_ID:=(X==NONE);

REF(C_ID)PROCEDURE F_ID(LL,AA); VALUE LL,AA; TEXT LL,AA;
INSPECT NEW C_ID DO BEGIN F_ID:-THIS C_ID; IX:-COPY(LL); IY:-COPY(AA) END;

MAX_XC_ORD:=15;

P_PROGRAM; OUT1N;
END OF PROGRAM;

APPENDIX S

EXTENDED EXAMPLE OF SECONDARY CODE GENERATION.

The following is the listing of the secondary code generation

trace which was postponed to here from section 7.7.

( ( "MM +4- < "MN2 44 M\3 >>->"< M\4 44 MN5 > >

GENERATING: ( ( "MM 44 ( "M\2 +4 M\3 )>->"< MN4 44 M\5 ) ) FGR VOID \ TRY : dcAIC A M3 i 3 ( A -> "M ) CONFORMING M=>< M\4 4+ M\5 ) A( "MM 4+ ( "M\2 \ \ generating: ( " i\l 44 ( "M\2 44 M\3 ) ) FOR AN-l \ \ \ try: IACCA J J » 1 ( U 44- A > M A CONFORMING A=>( "M\2 44 M\3 ) U^>~M\1 \ \ \ \ generating: ( ~M\2 44 M\3 ) FOR A\-l \ \ \ \ TRY : iaCC A 1 ( U 44 A ) » A CONFORMING A= M\3 U-V"M\2 \ \ \ \ \ XFZ RTRY GIVES: LOAD AN-1» M\3 \ \ \ \ \ ."EN ERATING: "M\2 FGR U\-4 \ \ \ \ \ generated: **ngut*« \ \ \ \ TRY gave nout \ \ \ TRY : IACCA J3t 1 i A H U ) » A CONFORMING U-=: M\3 A->"MN2 \ \ \ \ \ GEN ERATING: "M\2 for AN-l \ \ \ \ \ \ try: LGADICA MJI 3 ~M » A CONFORMING M-"- M\2 \ \ \ V \ \ \ GENERATION EQUIV: A\-l AS AN-l \ \ \ \ \ \ \ GENERATION EOUIV: M\2 AS M\2 \ \ \ \ \ \ TRY GIVES: LOADI AN-l» M\2 \ \ \ \ \ generated: LOADI A\-l» M\2 \ \ \ \ \ GENERATION MISMATCH: M\3 AGAINST U\-5 \ \ \ \ TRY GAVE NOUT \ \ \ \ TRY : TADICA MJ* 3 < "M 44 A > » A CONFORMING A=>M\3 M=>M\2 \ \ \ \ \ XFERTRY GIVES: LOAD AN-1» M\3 \ \ \ \ \ GENERATION EQUIV: M\2 AS M\2 \ \ \ \ TRY GIVES: TADI A\-1CL0AD AN-1» MX33» M\2 \ \ \ \ TRY : taDC A MIf 2 < M 44 A ) » A CONFORMING A=;M\3 M=>"M\2 \ \ \ \ \ XFERTRY GIVES: LOAD A\-lr M\3 \ \ \ \ \ GENERATING: "M\2 for mn-A \ \ \ \ \ \ 3PLITTRY: 2 » M Oi < » A CONFORMING M-! M\3 A->"MN2 \ \ \ \ \ ge; ERATING: "M\2 FOR AN-l \ \ \ \ \ \ TRY: LQADICA MI* 3 "M » A CONFORMING M-?MN2 \ \ \ \ \ \ \ GENERATION EOUIV: AN-l AS AN-l \ \ \ \ \ \ N GENERATION EQUIV1 MN2 AS MN2 \ \ \ \ \ \ TRY GIVES.' LOADI AN-l» MN2 ,388

\ \ \ \ \ generated: loadi a\-l» m\2 v \ \ \ \ generation eguiv: m\3 as m\3 \ \ \ \ try gives: tad a\-iclgadi ax-i» m\2j» m\3 * \ \ \ generated: thdi a\-1cl0ad a\-l» m\33» m\2» tad ax-1cload a\-l» h\33, mx g*c d a\ 1clgadi a\-•1, m\23, h\3 \ \ \ generating: "mm for u\-3 \ \ \ generated: «*nout*« \ \ try gave ngut \ \ try: i acta jji 1 i a f+ u ) a conforming u=> < ~mn2 f+ m\3 ) a^>~mx1 \ \ \ generating: -mm for a\-i \ \ \ x try: loadica m3» 3 "n ;•> a conforming mamxi \ \ \ \ \ generation eguiv: a\-i as a\-i \ \ \ \ \ generation equiv: mm as mm \ \ \ \ try gives: loadi a\—1, m\1 \ \ \ generated: lqadi a\-i, mm \ \ \ generating: c "mx2 •• mx3 > for ux-12 \ \ \ generated: **nowt«* v v try gave nomt \ v tryt tadica n3i 3 ( ++ A > » A conforming A->C ~h\2 •• h\3 > h->hx1 \ \ v generating: < ~hx2 •• «\3 > for ax-1 \ \ \ \ try iacca u3 f 1 ( u ff a ) >> a conforming a=>m\3 u=>~mx2 \ \ \ \ \ xfertry gives: load ax-l, m\3 \ \ \ \ \ generating: ~m\2 for u\-13 \ \ \ \ \ generated: **nout** \ \ \ \ try gave nowt \ \ \ \ try : iacca u3» 1 < a fp u > » a conforming u^>n\3 a=>~mx2 \ \ \ \ \ generating: ~m\2 for a\-l \ \ \ \ \ \ try: loadica m39 3 ~h » a conforming m»>m\2 \ \ \ \ \ \ \ generation equiv: ax-1 as a\-l \ \ \ \ \ \ \ generation equiv: m\2 as m\2 \ \ \ \ \ \ try gives: loadi a\-l» m\2 \ \ \ \ \ generated: loadi a\-lr m\2 \ \ \ \ \ generation mismatch: m\3 against ux-14 \ \ \ \ try gave nowt \ \ \ \ try : tadica m3* 3 < -m ++ a ) » a conforming a=>m\3 m=>m\2 \ \ \ \ \ xfertry gives! load ax-lr m\3 \ \ \ \ \ generation equiv: m\2 as n\2 \ \ \ \ try gives: tadi ax-icload ax-1, mx33 , n\2 \ \ \ \ try : tadca m3» 2 < m ff a > » a conforming a=>h\3 m=>~mx2 \ \ \ \ \ xfertry gives: load ax-1, m\3 \ \ \ \ \ generating: 'm\2 for m\-15 \ \ \ \ \ \ 3plittry: z » m o: <n\2 \ \ \ \ \ \ \ \ \ generation equiv: a\-19 as ax-19 \ \ \ \ \ \ \ \ \ generation equiv: m\2 as m\2 \ \ \ \ \ x x \ try gives: loadi ax-19, m\2 \ \ v- \ \ \ \ generated: loadi ax-19, m\2 \ \ \ \ \ \ splittry ok \ \ \ \ \ \ splittry: k » m 0» literalcm k3 \ \ \ \ \ \ \ generating: ~mx2 for kx-20 ***cant split \ \ \ \ \ \ \ generated: »*nout»* \ \ \ \ \ \ splittry retarget failed ! \ \ \ \ \ generated: dca ax-1?*cl0adi ax-19f hx23* mv-13 \ \ \ \ try gives: tad ax-icload ax-1» mx33, mx-15*cdca ax-19*cl0adi ax-19 » m\23a \ \ \ \ try : tadca m3i 2 < a fp m > » a conforming m=>m\3 a»>"m\2 \ \ \ \ \ generating: ~m\2 for a\-l \ \ \ \ \ \ try: loadica m3> 3 "h » a conforming m->m\2 \ N \ \ \ \ x generation equivj ax-l as ax-1 \ \ \ \ \ \ \ generation equiv: m\2 as m\2 \ \ \ \ \ \ try gives: loadi a\-l» m\2 \ \ \ \ \ generated: loadi ax-l, mx2 \ \ \ \ \ generation equiv! m\3 as m\3 \ \ \ \ try gives: tad ax-1cloadi a\-l» m\23, mx3 \ \ \ generated: tadi a\-1cl0ad a\-lt mx33, mx2j tad A\-1cl0ad A\-i» mx33» n\-13«l tad \-lcloadi a\- i, m\ 23, m\3 \ \ \ gemeration eguiv: mm as nm \ \ try gi ves: tadi .•a-1ctadi a\-1cl0ad ax-1, mx33, m\2i tad ax-icload ax-lt m\33» 153? tad a\-1cloadi a\ -t» m\23» m\33, mm ,389

N \ N TRY I T ADC A M3 I 2 < H f+ A ) » A CONFORMING A*>C ~MN2 F-f MN3 ) M*>~MN1 N \ \ N generating: < ~m\2 ++ mn3 ) for A\-l N \ N N \ \ N N \ TRY: IACCA U3» 1 ( U A > » A CONFORMING A=>MN3 U->~MN2 \ N N N \ \ XFER TRY GIVES: LOAD AN-l. MN3 N N N N \ \ GENERATING: -M\2 FOR UN-22 \ N N N \ generated: **nout** \ N \ N \ try cave nout \ \ \ N \ try: iacca u3 j 1 ( a u ) » a conforming u->m\3 A=>~M\2 \ N N N \ \ generating: ~mn2 for A\-I N N \ \ \ try: loadica ml) 3 "M » a conforming m=>M\2 N N \ \ \ \ generation eouivj A\-l as a\-l \ N N N N N N \ \ \ \ generation equivi m\2 as m\2 N \ \ \ try gives: loadi an-l. M\2 N N N N \ \ generated: loadi an-l. m\2 N N N N \ \ generation mismatch? M\3 against u\-23 N N N N \ try gave nowt \ N \ \ \ TRY: TADICA MI? 3 < ~M A > » A CONFORMING A=! MN3 M=>M\2 N N \ N \ \ XFERTRY GIVES? LOAD AN-l. M\3 \ N N N \ \ GENERATION EQUIV? M\2 AS M\2 " N N \ N \ TRY GIVES: TADI AN-lCLOAD AN-l. MN33» M\2 \ N \ N \ TRY: TALCA M3J 2 ( M A ) » A CONFORMING A=>MN3 M=>~MN2 N N N N \ \ XFERTRY GIVES: LOAD AN-l. M\3 N N N N \ \ GENERATING: ~MN2 FOR M\-24 N N N N \ \ \ splittry: z » m o; <;-cz m3 N N N N \ \ \ \ generating: "mn2 for zn-25 ***cant split N N N \ \ \ \ \ generated: «*ngut«* N N N N \ \ \ splittry retarget failed \ N N N \ \ \ sflittry: u » m Of < .literalcm n30 0necu k3>: cu h3 N N N N \ \ \ \ generating: ~mn2 for un-26 ***cant split N N N N \ \ \ \ generated: **nout*« N N N N \ \ \ SPLITTRY RETARGET FAILED N N N N \ \ \ splittry: z » mnd 3. < .dcacand mnd3cclaca z3»cz mnd3 N N N N \ \ \ \ generating: ~mn2 for zn-27 ***cant split N N N N \ \ \ \ generated: **NCJT** N N N N \ \ \ sflittry retarget failed N \ N N \ \ \ splittry: and ;•> m\d 2f dcacand mnd3 N N N N \ \ \ n generating: "mn2 for an-23 N N N N \ \ \ \ \ TRY: LGADICA M3» 3 "M » A CONFORMING M=>MN2 N N N N \ \ \ \ \ \ GENERATION EGUIV: AN-2S AS AN-2S N N N N \ \ \ N N \ GENERATION EOUIV: M\2 AS MN2 N N N N \ \ \ \ \ TRY GIVES? LOADI AN-23. MN2 N N N N \ \ \ N GENERATED: LOADI A\-23» MN2 N N N N \ \ \ SFLITTRY OK N N N N \ \ \ splittry: k » m o; literalcm k3 N N N N \ \ \ \ generating: ~mn2 for kn-29 **«cant split N N V N \ \ n n generated: **nout»* N N N N \ \ \ splittry retarget failed N N N N \ \ GENERATED: DCA AN-23*CL0ADI AN-28. MN231 MN-24 N N N N \ TRY GIVES? TAD AN-lCLOAD AN-l. MN33» M\-24«CDCA AN-23*CL0ADI A\-23» MN23* i N N N N \ TRY: TADCA M3; 2 ( A f+ M ) » A CONFORMING M=>MN3 A=::"MN2 N N N N \ \ generating: -:i\2 for an-i N N N N \ \ N TRY: LOADICA Mi: 3 ~M » A CONFORMING M=>M\2 N N N N \ \ \ \ GENERATION EOUIVJ AN-l AS A\-l N N N N \ \ \ N GENERATION EQUIVJ MN2 AS M\2 N N N N \ \ \ TRY GIVES? LOADI A\-l» M\2 \ N N N \ \ GENERATED: LOADI A\-l» M\2 \ \ N N \ \ GENERATION EQUIVJ MN3 AS M\3 \ N N N \ TRY GIVES: TAD AN-1CLOADI AN-l» MN23» M\3 N N N N GENERATED: TADI A\-1CL0AD A\-l» M\33> M\2» TAD AN-lCLOAD A\-l» MN33. 
M\-24«CD' TAD AN- 1CLOADI AN-lt M\23» MN3 N N N N GENERATING: "MN1 FOR MN-21 N N \ N \ splittry: z » m 01 vxiteralcm kiczerocz k3 ncz m3 N N \ \ \ N generating: -MNI for ZN-30 ***cant split N N \ N \ n generated: **nowtm N N \ N n sflittry retarget failed \ N N N n SPLITTRY: J M Of : LITERALCM KIsUONECU K3> CU M3" N N N N n \ GENERATING: ~MN1 FOR un-31 ***CANT SPLIT N N N N n \ GENERATED: **mcutit* \ \ N \ n SPLITTRY RETARGET FAILED N N N N n sflittry: z >;• m\d 3; .dcacand MNDTOCLACA z3>;: cz mnd3 \ N \ N n n generating: ~m\1 for zn-32 ***cant split N N N N \ n generated: **nout*« N \ N \ \ splittry retarget failed N N N N \ splittry: and mnd 2» dcacand mndi N N N N \ n generating: ~m\l for an-33 N N N N \ \ \ TR i': LOAD I LA M3» 3 "*M » A CONFORMING M=>MN1 N N N \ N N \ \ GENERATION EQUIV: AN-33 AS AN-33 N N N N \ N N GENERATION EGUIV: MN1 AS MN1 N N N \ N N N TRY GIVES: LOADI AN-33. MN1 \ N N N \ N GENERATED: LOADI AN-33. MN1 N N N N N SFLITTRY OK N N N N N SPLITTRY: K » M Of LITERALCM K3 N N N N \ N GENERATING: ~MN1 FOR KN-34 ***CANT SPLIT \ \ N N \ n generated: **nout«* ,390

\ \ \ \ \ SPLITTRY RETARGET FAILED \ \ \ \ GENERATED: DCA AX-33*CL0ADI AX-33, MX1I, M\-21 \ \ \ TRY GIVES: TAD AX-i CTADI AX-1CL0AD A\-lt MX3I, MX2* TAD AX-1CL0AD AX-l, HX33, M\ 41* TAD AX-iCLOADI AX-l, MX2I, MX3I, MX-214CDCA A\-33*CL0ADI AX-33, MX1I, MX-213 \ \ \ TRYJ TADCA MI* 2 ( A ++ M ) » A CONFORMING M=>< ~MX2 ++ M\3 > A=>~M\1 \ \ \ \ generating: ~mxi for a\-i \ \ \ \ \ TRY: LOADICA Mi: 3 ~M » A CONFORMING N=>M\1 \ \ \ \ \ \ generation equiv1 A\-l AS A\-l \ \ \ \ \ \ generation equiv: M\1 AS M\1 \ \ \ \ \ try gives: LOADI A\-l» MM \ \ \ \ generated: loadi ax-i, mm \ \ \ \ generating: < "MX2 ++ M\3 ) FOR MX-35 \ \ \ \ \ sflittry: z » m o: < for an-39 N N N N N N N TRY: IACCA Ui: 1 < U 44 A > » A CONFORMING A=>MN3 U=>~M\2 NNNNNNNN XFERTRY GIVES? LOAD AX-39, MN3 NNNNNNNN GENERATING: "MN2 FOR UN-40 N N N N N X N X GENERATED*. **NOUT«* N N \ N N N N TRY GAVE NOUT N N N N N N N TRY.* IACCA UI* 1 ( A 44 U > » A CONFORMING U=>MN3 A=>~M\2 X X \ \ NX X X GENERATING: "MX2 FOR AX-39 X X v \ X X X X x TRY: LOADICA MI* 3 ~M » A CONFORMING M=>M\2 XXX XXXXXXX GENERATION EQUIV: AX-39 AS AX-39 XXXXXXXXXX GENERATION EQUIV: MX2 AS MX2 XXXXXXXXX TRY GIVES: LOADI AX-39, MX2 XXXXXXXX GENERATED: LOADI AX-39, MX2 XXXXXXXX GENERATION MISMATCH: MX3 AGAINST UX-41 X X X X X \ X TRY GAVE NCUT x \ \ x x x x try: tadica mi: 3 < ~M 44 a > » a conforming a=>mx3 m^mx2 XXXXXXXX XFERTRY GIVES* LOAD AX-39, MX3 XXXXXXXX GENERATION EQUIV? MX2 AS MX2 X X X X X X X TRY GIVES? TADI AX-39CL0AD AX-39, MX3I, MX2 X X X X X X X TRY: TADCA MI* 2 < M 44 A > >> A CONFORMING A»>M\3 M=>~MX2 XXXXXXXX XFERTRY GIVES: LOAD AX-39, MX3 XXXXXXXX GENERATING: ~MX2 FOR MX-42 XXX XXXXXX SPLITTRr: Z » M O» < > > > ^ \ \ \ \ \ GENERATING: ~M\2 FOR ZX-43 **«CANT SPLIT ^XXXXXXX GENERATED: **NOUT*t \\\X X\\XX SPLIT TRY RETARGET FAILED XXXXXXXXX 3PLITTRY: A\D » MXD 2* DCACAXD MXD3 \N\\\\\\\\ GENERATING: ~M\2 FOR AX-46 \\v\\\\\xx\ try: loadica mi* 3 ~m » a conforming m-i- mn2 \\\\\\\\\\\\ generation equivj a\-46 as ax-46 \\\x\\x\\\\\ generation eguivj mx2 as mx2 \\\\\x\x\x\try gives: l3adi a\-46»mn2; " \\\x\\\\\\ generated: loadi a\-4o, mx2 xxxx xxxxx splittry ok XXXXXXXXX 3FLITTRYJ K » M 0* LITERALCM K3 XXXXXXXXXX GENERATING: "MX2 FOR KX-47 ***CANT SPLIT xxxxxxxxxx generated: **nout*« xxxxxxxxx splittry retarget failed XXXXXXXX GENERATED.* DCA AX-464CL0ADI AX-46, MX2I, MX-42 X X \ \ X \ X TRY GIVES: TAD AX-39CL0AD AX-39, MX3I, M\-42*CDCA A\-46»CL0ADI AX X X X X X X X TRY: TADCA MI* 2 < A 44 M > » A CONFORMING M=0 HX3 A»>~MX2 XXXXXXXX GENERATING: "MX2 FOR AX-39 X XXXXXXXX TRY: LOADICA MI? 3 "M » A CONFORMING M->MX2 XXXXXXXXXX GENERATION EQUIV? AX-39 AS AX-39 \\\\\\\\\\ GENERATION EOUIV? MX2 AS MX2 \\XX\\XXX TRY GIVES: LOADI AX-39, M\2 XXXXXXXX GENERATED: LOADI AX-39, MX2 \\\XXXXX GENERATION EOUIVJ M\3 AS MX3 \ \ \ x X X X TRY GIVES: TAD AX-39CLCADI AX-39, MX2I, M\3 \ \ x X X \ GENERATED: TADI AX-39CL0AD AX-39, MX3I, MX2* TAD AX-39CLSAD AX-39, M\ I, MX-42I; TAD AX-39CL0ADI AX-39, MX2I, MX3 ,391

\ \ \ \ \ SPLITTRY OK \ \ \ \ \ splittry: k » m of literalcm nj \ \ \ \ \ \ generating: ( ~M\2 H- M\3 ) FOR N\-43 *«»CANT SPLIT \ \ \ \ \ \ generated: **nout** \ \ \ \ \ splittry retarget failed \ \ \ \ generated: sca ax-3?*ctadi ax-37cl0ad a\-39» M\33» MX2» tad ax-37cl0ad ax-39» /I mx23 » mx-423t tad ax-39cl0adi a\-39f m\23, mx33» mx-35 \ \ \ try gives? tad a\-1cloagi a\-l» mx13» M\-35*cdca a\-39*ctadi AX-37cl0ad a\-39» .1X33* \-42tcdca a\-t6»clcadi a\-4af mx23» mx-423? tad ax-39cl0adi a\-39f M\23» m\33» mx-353 \ \ generated: tadi ax-1ctadi ax-icload a\-l» mx33 » m\2» tad ax-icload a\-l» mx33» MX-134cd. tad A\-1 cloadi a\-l» m\23» mx3if mxl? tad ax-1ctadi ax-iclcad a\-lr mx33» MX2» tad AX-icload a\-28» mx231 mx-243j tad ax-1cloadi a\-l, mx23» m\33r mx-21*cdca ax-334clcadi ax-33* M\13» MX- 35»cdca ax-374c tadi ax-39clgad a\-39» m\33f mx2? tad ax-39cl0ad a\-39» m\33r MX-424cdca ax-464ci 9clgadi a\-39» mx23» m\33r mx-353 \ \ generating: < m\4 f+ mx5 ) fgr mx-2 V \ \ SPLITTRY: Z » M Or < XITERALCM K3CZEROCZ K3»CZ M3 \ V \ \ GENERATING: C M\4 ++ M\5 ) FOR Z\-49 •••CANT SPLIT \ \ \ \ GENERATED: **NOUT»* ' \ \ \ SPLITTRY RETARGET FAILED \ \ \ SPLITTRY: U » M Of < XITERALCN K3 OONECU K3>>CU M3 \ \ \ \ GENERATING: ( M\4 I-+ M\5 ) FOR U\-50 X**CANT SPLIT \ \ \ \ GENERATED: **NOUT*t \ \ \ SPLITTRY RETARGET FAILED \ \ \ splittry: z :> mxd 3i <M\4 \ \ \ \ \ \ XFERTRY GIVES: LOAD A\-52» M\5 \ \ \ X X \ GENERATION MISMATCH: N\4 AGAINST U\-53 \ \ \ \ \ TRY GAVE NOUT \ \ \ \ \ TRY: IACCA U3r 1 ( A • + u ) » A CONFORMING U^ MX5 A-OMX4 \ \ \ \ \ \ XFERTRY GIVES.' LOAD AX-52. MX4 \ X X X X X GENERATION MISMATCH? MX3 AGAINST UX-34 \ \ \ \ \ TRY GAVE NOUT \ \ \ \ \ TRY: TADCA M3» 2 < M A ) » A CONFORMING A=>MX3 M=;M\4 \ \ \ \ \ \ XFERTRY GIVES? LOAD A\-52» M\3 \ \ \ \ \ \ GENERATION EQUIV: M\4 AS M\4 \ \ \ \ \ TRY GIVES? TAD AX-52CL0AD A\-52» M\33» M\4 \ \ \ \ \ TRY? TADCA Hit 2 ( A H N ) » A CONFORMING M^>M\3 A»>M\4 \ \ \ \ \ \ XFERTRY GIVES? LOAD A\-32» M\4 \ \ \ \ \ \ GENERATION EQUIVJ M\5 AS M\5 \ \ X X \ TRY GIVES? TAD AX-52CL0AD AX-52* MX43» M\3 \ \ \ \ GENERATED: TAD A\-52CLOAD A\-52» M\53» MX4I TAD AX-52CL0AD A\-52» M\43» M\3 \ \ \ SPLITTRY OK \ \ \ sflittry: k » m o; literalcm k3 \ \ \ \ generating: < M\4 f + M\5 ) FOR K\-33 •••CANT SPLIT \ \ \ \ generated: ««nout»* \ \ \ SPLITTRY RETARGET FAILED \ \ GENERATED: DCA A\-52»CTAD AX-52CL0AD A\-52» MX53» MX4J TAD A\-52CL0AD A\-32t M\43» MX33 \ TRY GIVES? DCAI A\-1»CTADI A\-lCTADI AX-ICLOAD A\-l» M\33» M\2J TAD AX-ICLOAD A\-l» H\33t M* MX-153 ? TAD A\-1 CLOADI A\-l» M\23» M\33» M\l» TAD AX-1CTADI AWCLOAD A\-l» M\33» M\2I TAD A\ • CLOADI A\-2s» M\23» M\-243? TAD AX-1CLOADI A\-l# M\23» M\33r M\-21«CDCA A\-33«CL0ADI A\-33» M\ lip M\-35fCDCA A\-3?>CTADI AX-39CL0AD A\-39» M\33» M\2> TAD AX-39CL0AD A\-39» MX33» MX-42TCDCA AD AX-39CL0ADI AX-39» M\23» MX33» M\-3333F MX-2ICDCA A\-32«CTAD AX-52CL0AD A\-32» MX33» MX4I TA 3 GENERATED: DCAI AX-ltCTADI A\-1CTADI AX-ICLOAD A\-l» MX331 MX2J TAD AX-ICLOAD AX-1» M\33» MN-13 33» TAD AX-1CLOADI AX-1» M\23» M\33» MX1J TAD AX-1CTADI AX-ICLOAD A\-l» M\33» HX2I TAD A\-lCLi ASI AX-231 MX23» MX-2431 TAD AX-1CLOADI AX-1» MX23» MX33» MX-21«CDCA AX-334CL0ADI A\-33» M\13» f MX-354CDCA A\-39«CTADI AX-37CL0AD A\-39» M\33» MX2I TAD AX-37CL0AD A\-39» M\33# HX-42»CDCA A\-4. X-37CLGADI AX-39» M\23» MX33. MX-3533» MX-24CDCA AX-32»CTAD AX-32CL0AD AX-52» M\33» MX4# TAD A\ 392

( ( ~M\1 ++ ( ~M\2 ++ M\3 ) ) -> ~( M\4 ++ M\5 ) )

LOAD A\-1= M\3= ( 2)
TADI A\-1 M\2= ( 3)

LOAD A\-1= M\3= ( 2)
LOADI A\-19= M\2= ( 3)
DCA A\-19* M\-15= ( 2)
TAD A\-1 M\-15* ( 2)
LOADI A\-1= M\2= ( 3)
TAD A\-1 M\3= ( 2)
TADI A\-1 M\1= ( 3)

LOAD A\-1= M\3= ( 2)
TADI A\-1 M\2= ( 3)

LOAD A\-1= M\3= ( 2)
LOADI A\-28= M\2= ( 3)
DCA A\-28* M\-24= ( 2)
TAD A\-1 M\-24* ( 2)
LOADI A\-1= M\2= ( 3)
TAD A\-1 M\3= ( 2)
LOADI A\-33= M\1= ( 3)
DCA A\-33* M\-21= ( 2)
TAD A\-1 M\-21* ( 2)

LOADI A\-1= M\1= ( 3)
LOAD A\-39= M\3= ( 2)
TADI A\-39 M\2= ( 3)

LOAD A\-39= M\3= ( 2)
LOADI A\-46= M\2= ( 3)
DCA A\-46* M\-42= ( 2)
TAD A\-39 M\-42* ( 2)
LOADI A\-39= M\2= ( 3)
TAD A\-39 M\3= ( 2)
DCA A\-39* M\-35= ( 2)
TAD A\-1 M\-35* ( 2)
LOAD A\-52= M\5= ( 2)
TAD A\-52 M\4= ( 2)
APPENDIX U

UNCOL DIAGRAMS

UNCOL diagrams (also called T-diagrams) provide a notation for illustrating the inter-relationships within sets of translations.

Other diagrammatic notations have been devised: all have defects:

I use UNCOL notation because it appears to be the most widely-known.

For the purposes of this appendix a "translator" is a generator,

compiler, or assembler. In examples I name programs "Z", programming languages "P", machines and machine codes "M"; when the nature of a language does not matter I will use a number in place of a name.

A program other than a translator is drawn as a vertical bar inside which is written:-

* at the top: the name of the program; and
* at the bottom: the language it is in.

A machine is represented as an equilateral triangle with one vertex pointing downwards, inside which is written the name of the machine.

A program runnable on a machine is illustrated by placing the bottom of the bar representing the program on top of the triangle representing the machine: the name of the language in the bottom of the bar must match the name of the machine.

A translator is represented by a T-shape:

inside it is written:

* at the left of the cross-piece: the name

of the language translated from;

* at the right of the cross-piece: the name of the language

translated to; and

* at the bottom of the vertical: the name of the language the

translator is in.

The diagram illustrates a translator from 1 to 2 in 3.

To illustrate the translation of a program, the program-from is drawn with the bottom of its vertical touching the left side of the cross-piece of the T of the translator; the program-to is drawn similarly on the right. The languages-from, languages-to and program-names must match. The diagram illustrates a compilation of a program Z from programming language P to machine code M using a compiler which runs on the machine M.

Translators must themselves be programmed, and so will need translation. The diagram illustrates the most general single translation-of-a-translator: starting with a translator from 1 to 2 in 3, to obtain a translator from 1 to 2 in 4, using a directly runnable translator from 3 to 4 in M.


If a compiler from P to M were written in a compiler-writing language C, and we had a compiler from C to M on M, then we could produce a runnable compiler for P to M in M.

Composing more than three T's we can illustrate quite complex sets of translations and cross-translations (U), as shown in section 2.5 of this thesis.
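
(The matching rule these compositions obey can be stated executably; the
types and names in this sketch are illustrative only.)

    from collections import namedtuple

    Translator = namedtuple('Translator', 'src dst lang')   # from, to, written-in

    def translate(t, via, machine):
        # Run translator `via` on `machine` to translate translator `t`.
        assert via.lang == machine, 'the applying translator must run directly'
        assert t.lang == via.src, 'language-from must match the text translated'
        return Translator(t.src, t.dst, via.dst)

    # The bootstrap above: a P->M compiler in C, put through a C->M compiler
    # running on M, yields a runnable P->M compiler in M.
    p_to_m_in_c = Translator('P', 'M', 'C')
    c_to_m_on_m = Translator('C', 'M', 'M')
    print(translate(p_to_m_in_c, c_to_m_on_m, 'M'))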

For other information on T-diagrams and related notations for

translator interactions see [BRA61], [EARL70], [SKLA68] and [ROS76].

(U) A cross-translation is a translation that is run on one machine (M1 in the diagram) to produce output runnable on another machine (M2 in the diagram).

BIBLIOGRAPHY

Many of the entries in this bibliography are annotated with one or more short remarks on the entry. Each entry is also annotated with the numbers of sections to which the entry is relevant: if the entry is cited in the text, then the page number of the citation is also given.

The bibliography is followed by a cross-index from section number to the bibliography entries relevant to it.

The following abbreviations are commonly used in the bibliography entries:
-- CA -- Computer Abstracts.
-- CACM -- Communications of the ACM.
-- CMU -- Carnegie-Mellon University, Department of Computer Science.
-- Comp J -- BCS Computer Journal.
-- CR -- ACM Computer Reviews.
-- DAI -- Dissertation Abstracts International.
-- IEEE SE -- IEEE transactions on Software Engineering.
-- JACM -- Journal of the ACM.
-- SIGPLAN -- ACM Sigplan notices.
-- SWP&E -- Software Practice and Experience.
-- TOPLAS -- ACM transactions on programming languages and systems.

[ABR79] Abrahams. The map coloring problem. SIGPLAN 14:4 10-11. -- programming exercise relevant to register allocation. -- see sections 6.7.2 and A.1.3.1.

[AHO70] Aho, Sethi, & Ullman. A formal approach to code optimisation. SIGPLAN 5:7 86-100. -- see section C.2.

[AHO71] Aho A.V., & Ullman J.D. Translations on a context free grammar. Information & Control 19 439-475. -- see section A.2.1 (page 326).

[AHO75] Aho A.V., Johnson S.C., & Ullman J.D. Deterministic parsing of ambiguous grammars. CACM 18:8 441-452. -- see section A.3.4 (page 341).

[AHO76] Aho A.V., & Johnson S.C. Optimal code generation for expression trees. JACM 23 488-501. -- see section A.3.1 (page 338).

[AHO76'] Aho, Johnson, & Ullman. Code generation for expressions with common sub-expressions. Proc 3rd ACM Symp Princ Prog Langs 19-31. -- see section C.2 (page 347).

[AHO77] Aho, & Sethi. How hard is compiler code generation? Automata Languages and Programming, 4th colloq. (eds Salomaa & Steinby; Springer Lecture Notes in Comp. Sci. number 52). -- see section C.2 (page 347).

[AHO77'] Aho, Johnson, & Ullman. Code generation for machines with multi-register operations. Proc 4th ACM symp princ prog langs 21-28. -- see section C.2 (page 347).

[ARS79] Arsac J.J. Syntactic source to source transforms and program manipulation. CACM 22:1 43-54. -- see section 4.4.10 (page 113).

[AUS78] Auslander, & Strong. Systematic recursion removal. CACM 21:2 127-134. -- see section 8.2.2 (page 298).

[BAG70] Bagwell. Local optimisations. SIGPLAN proc symp compiler optimisation. -- see section 8.2.2 (page 298).

[BAL79] Ball J.E. Predicting the effects of optimisation on a procedure body. SIGPLAN 14:8 214-220. -- deciding when to inline: needs an execution profile. -- see section 8.3.2 (page 300).

[BAR77] Barbacci M., Barnes G., Cattell R.G.G., & Siewiorek D. ISPS reference manual. CMU-CS-TR-79-137. -- see section 8.4.3.

[BAR78] Barbacci M.R., & Parker A. Using emulation to verify formal architecture descriptions. IEEE "Computer" magazine, May 1978, 51-56. -- see section 2.3.6 (page 45), and section 8.4.3 (page 304).

[BEN64] Bennett, & Neuman. Extension of existing compilers by sophisticated use of macros. CACM 7:9 541-542. -- see section 8.2.3 (page 299).

[BIR77] Bird R.S. Notes on recursion elimination. CACM 20:6 434-441. -- see section 8.2.2 (page 298).

[BIR77'] Bird R.S. Improving programs by the introduction of recursion. CACM 20:11 856-863. -- see section 8.2.1 (page 297).

[BIR79] Bird R.S. Recursion elimination with variable parameters. Comp J 22:2 151-154. -- see section 8.2.2 (page 298).

[BIT74] Bitner J.R. Use of macros in backtrack programming. Illinois U UIUCDCS-R-74-637. -- see section 8.2.5 (page 299).

[BIT75] Bitner J.R., & Reingold E.M. Backtrack-programming techniques. CACM 18:11 651-656. -- macros for backtracking: examples of depth-first-search, 8 queens. -- see section 8.2.5 (page 299).

[BOY72] Boyce R.F. Topological reorganisation as an aid to program simplification. DAI 72-50860, and Purdue U. PhD thesis. -- translates NELIAC source to Aikin dynamic algebra (goto free) and transforms that: implementation shows marginal benefits. -- see [NOO71]. -- see sections 6.7.1 (page 237) and 8.2 (page 296).

[BRA61] Bratman. An alternative form of the UNCOL diagram. CACM 4:3 142-. -- see section U (page 395).

[BRE79] Brelaz. New methods to color the vertices of a graph. CACM 22:4 251-. -- relevant to register allocation. -- map colouring as an NP-complete problem: comparison of methods. -- see sections 6.7.2 and A.1.3.1.

[BRO71] Brown, Calderbank, & Poole. Some comments on the portability of a large Algol program. SWP&E 1 367-371. -- SID on KDF9. -- see section 8.3.6 (page 302).

[BRO81] Brosgol B.M. TCOL/ADA and the "middle end" of the PQCC Ada compiler. SIGPLAN 15:11 101-112 (Proc ACM-SIGPLAN symp on the Ada prog lang). -- see section A.1.2 (page 319).

[BROO62] Brooker R.A., & Morris D. A general translation program for phrase structure languages. JACM 9:1 1-10. -- see section 2.3.6 (page 45).

[BROO63] Brooker R.A., Morris D., & Rohl J.S. The Compiler-compiler. Annual Reviews in Automatic Programming, Vol 3 p229-. -- see section 2.3.6 (page 43).

[BROO67] Brooker R.A., Morris D., & Rohl J.S. Experience with the Compiler-Compiler. Comp J 9:4 345-349. -- see section 2.3.6 (page 43).

[BROW67] Brown P.J. The ML/I Macro Processor. CACM 10:10 618-623. -- see section 2.3.5 (page 39).

[BROW69] Brown P.J. Using a macroprocessor to aid software implementation. Comp J 12 327-331. -- see section 2.3.5 (page 39).

[CAM78] Campbell. A compiler definition facility based on the syntactic macro. Comp J 21:1. -- CR33972. -- macros within LL(1). -- see section 8.2.3 (page 299).

[CAMP73] Campbell-Kelly. An introduction to macros. Macdonald Computer Science Monographs. -- CR26828. -- UNCOL. -- see section 2.3.1 (page 32).

[CAR77] Carter J.L. A case study of a new code generation technique for compilers. CACM 20:12 914-920. -- CR32363. -- see also [HAR77]. -- see section A.3 (page 338).

[CAT77] Cattell R.G.G. A survey and critique of some models of code generation. CMU-CS-TR (unnumbered). -- see section A.3 (page 338).

[CAT78] Cattell R.G.G. Formalisation and automatic derivation of code generators. CMU-CS-78-115 & AD-A058-372/3UC. -- review: SIGPLAN 14:4, 2. -- see sections A.1.2 (page 318) and A.1.3.2 (page 322).

[CAT79] Cattell R.G.G., Newcomer J.M., & Leverett B.W. Code-generation in a machine-independent compiler. SIGPLAN 14:8 65-75 (symposium on compiler construction). -- see section A.1.3.2.

[CAT80] Cattell R.G.G. Automatic derivation of code generators from machine descriptions. TOPLAS 2:2 173-190. -- see section A.1.3.2.

[CHA80] Chaitin G.J., Auslander M.A., Chandra A.K., Cocke J., Hopkins M.E., & Markstein P.W. Register allocation via coloring. IBM Research report RC3395. -- see sections 6.7.2 (page 242) and A.1.3.1 (page 321).

[CHE70] Cheatham T.E., & Standish T.A. Optimisation aspects of Compiler-compilers. SIGPLAN 5:10 10-17. -- see section 6.7.1 (page 237).

[CHE78] Chevance, & Heidet. Static profiles and dynamic behaviour of Cobol programs. SIGPLAN 13:4 44-57. -- see section 8.3.2 (page 300).

[CLI78] Clifton. A technique for making structured programs more readable. SIGPLAN 13:4 53-. -- prettyprinting with emboldening. -- flowcharting. -- see section 8.3.1 (page 300).

[COLE74] Coleman S.S., & Waite W.M. The mobile programming system JANUS. SWP&E 4:1 5-23. -- see also [HAD78]. -- see section 2.3.1 (page 32).

[COLE74'] Coleman S.S. JANUS: a universal Intermediate Language. DAI 74-22327 and U. Colorado Dept of Electrical Eng. -- see section 2.3.1 (page 32).

[CON58] Conway M.E. Proposal for an UNCOL. CACM 1:10 5-8. -- see section 2.3.1.

[CON70] Conrow, & Smith. NEATER2: a PL/1 source statement re-formatter. CACM 13:11 669-675. -- prettyprinting with elementary source simplification. -- see section 8.3.1 (page 300).

[DAR81] Darlington J. An experimental program transformation and synthesis system. Artif Intell 16:1 1-46. -- see section 8.1 (page 295).

[DAV80] Davidson J.W., & Fraser C.W. The design and implementation of a retargetable peephole optimiser. TOPLAS 2:2 191-202; corrigendum TOPLAS 3:1 110. -- see also [FRA79]. -- see section 6.7.1 (page 237).

[DEM77] Demers A.J. Generalised left corner parsing. Fourth ACM symp princ prog langs. -- see section 5.3.2 (page 144).

[DER69] DeRemer F.L. Practical translators for LR(k) languages. MIT project MAC TR-65. -- parsing only. -- see section 2.3.6 (page 42).

[DER73] DeRemer F.L. Transformational grammars for languages and compilers. U Newcastle on Tyne, Dept Comp Sci TR-50. -- see section A.2.1 (page 326).

[DON73] Donegan M.K. An approach to the automatic generation of code generators. Rice U., Comp Sci Eng. & DAI 73-21548. -- see [NOO79]. -- see section A.3 (page 338).

[DON79] Donegan M.K., et al. A code generator generator language. SIGPLAN 14:8 53-64 (Proc SIGPLAN symp compiler construction). -- see section A.3 (page 338).

[DON79'] Dongarra J.J., & Hinds A.R. Unrolling loops in Fortran. SWP&E 9 219-226. -- see section 8.2.2 (page 298).

[EAR70] Earley J. An efficient context free parsing algorithm. CACM 13:2 94-102 (also CMU 1968). -- see section 2.3.6 (page 42).

[EARL70] Earley, & Sturgis. A formalism for translator interactions. CACM 13:10 607-617. -- T-diagrams. -- see section U (page 395).

[ELS70] Elson M., & Rake S.T. Code-generation techniques for large-language compilers. IBM Syst J 9:3 166-188. -- see section A.3 (page 338).

[ELS78] Elshoff. Investigation into the effects of the counting method used in Software Science measurements. SIGPLAN 13:2 30-. -- see section 8.3.2 (page 300).

[EVA78] Evans R.V., Lockington G.S., & Reid T.N. Compiler Compiler methodology for problem oriented language compiler implementation. Comp J 21:2 117-121. -- see section 3.2.1 (page 63).

[FEL64] Feldman J.A. A formal semantics for computer oriented languages. Carnegie Inst of Tech, Pittsburgh, Pa. -- see section 2.3.6 (page 44).

[FLA79] Flajolet, Raoult, & Vuillemin. The number of registers required for evaluating arithmetic expressions. Theoretical Comp. Sci. 9:1. -- see section C.2.

[FLO61] Floyd R.W. An algorithm for coding efficient arithmetic operations. CACM 4:1 42-51. -- see section C.2.1 (page 348).

[FOR81] Forgy C.L. OPS5 Users manual. CMU-CS-81-135. -- see section 4.4 (page 91).

[FOS68] Foster J.M. A syntax improving program. Comp J 11:1 31-34. -- see section 2.3.6 (page 42).

[FRA71] Frailey. A study of code optimisation using a general purpose compiler. Purdue U & DAI 71-20451. -- see section A.3.1 (page 339).

[FRA77] Fraser C.W. A knowledge-based code generator generator. SIGPLAN 12:8 126-129. -- see sections A.1.1 (page 317) and A.3.3 (page 339).

[FRA77'] Fraser C.W. Automatic generation of code generators. Yale Univ, Dept Comp Sci. -- see sections A.1.1 (page 317) and A.3.3 (page 339).

[FRA79'] Frailey D.J. An intermediate language for source and target independent optimisation. SIGPLAN 14:8 188-200. -- see section A.3.2 (page 339).

[FRA79] Fraser C.W. A compact machine-independent peephole optimiser. Proc 6th ACM symp on princ prog langs 1-6. -- see also [DAV80]. -- see section 6.7.1.

[GAN76] Ganzinger H., Ripken K., & Wilhelm R. MUG1 - an incremental compiler-compiler. In: Proc ACM 76 415-418. -- see sections A.2 (page 325), A.2.2 (page 328) and A.2.3 (page 335).

[GAN77] Ganzinger H., Ripken K., & Wilhelm R. Automatic generation of optimising multipass compilers. In: IFIP 77 (ed Gilchrist) 535-540. North Holland. -- no mention of feedback problems. -- see sections A.2 (page 325) and A.2.4 (page 335).

[GIE78] Giegerich, & Wilhelm. Counter-one-pass features in one-pass compilation: a formalisation using attribute grammars. Inf proc lett 7:6 279-. -- see section A.2.4 (page 336).

[GLA77] Glanville R.S. A machine independent algorithm for code generation and its use in retargetable compilers. U Calif at Berkeley UCB-CS-78-01. -- see sections 3.2.2 (page 70), 6.5 (pages 198 & 199) and A.3.4 (page 340).

[GLA77'] Glanville R.S. A new method for Compiler Code Generation. U Calif at Berkeley: see also [GRA78]. -- see sections 3.2.2 (page 70), 6.5 (pages 198 & 199) and A.3.4 (page 340).

[GOG78] Goguen. On the automatic translation of intermediate-language programs to machine language. U. Toronto. -- see section A.3.5 (page 343).

[GRA78] Graham S., & Glanville R.S. A new method for Compiler Code Generation. Proc 5th ACM symp on principles of prog langs 231-240. -- is a review of [GLA77]. -- see sections 3.2.2 (page 70), 6.5 (pages 198 & 199) and A.3.4 (page 340).

[GRA80] Graham S.L., Harrison M.A., & Ruzzo W.L. An improved context free recogniser. TOPLAS 2:3 415-462. -- see section 2.3.6 (page 42).

[HAD78] Haddon, & Waite. Experience with the universal intermediate language JANUS. SWP&E 8 601-616. -- CR35231. -- see also [COLE74]. -- see section 2.3.1 (page 32).

[HAL68] Halpern M.I. Towards a general processor for programming languages. CACM 11:1 15-25. -- see section 2.3.5 (page 40).

[HAR75] Harrison W. A class of register allocation algorithms. IBM research report RC-5342. -- optimal colouring methods. -- see section 6.7.2.

[HAR77] Harrison W. A new strategy for code generation - the general purpose optimising compiler. Proc 4th ACM symp on Principles of Prog Langs 29-37. -- see also [CAR77]. -- IL schema. -- see section A.3 (page 338).

[HOP78] Hopwood G.L. Decompilation. University of California at Irvine, and DAI 78-11860. -- discusses possible classes of source and target, costs (human & machine), how completely it can be done, and what the translator would look like. -- implemented in Lisp for Varian/620/i - decompiles 5000-line programs. -- see section 8.5 (page 305).

[HOR66] Horwitz L.P., Karp R.M., Miller R.E., & Winograd S. Index Register Allocation. JACM 13:1 43-61. -- see section C.3 (page 354).

[ING72] Ingalls. The execution time profile as a programming tool. In: Design & optimisation of compilers (Rustin, ed) Prentice Hall, 107-128. -- see section 8.3.2 (page 300).

[JOH73] Johnsson R.K. A survey of Register Allocation. CMU (unnumbered technical report). -- see sections 6.7.2 and C.

[JOH75] Johnsson R.K. An approach to global register allocation. CMU. -- optimal colouring methods. -- see section 6.7.2.

[JOH77] Johnson S.C. A tour through the portable C compiler. Bell Telephone Labs. -- see also [SNY75]. -- see section A.3 (page 338).

[JOH77'] Johnson. LINT: a C program checker. TR #65. -- detects non-portabilities, dead variables, dead functions, dead code, use of variable before assignment, mismatch of function use & definition, type checking. -- see section 8.3.6 (page 302).

[JOH78] Johnson S.C. A portable compiler - theory and practice. Proc 5th ACM symp princ prog langs 97-104. -- see also [SNY75]. -- see section A.3 (page 338).

[JOY77] Joyner, Carter, & Brand. Using machine descriptions in program verification. IBM RC 6922. -- see section 8.4.1 (page 303).

[KAM77] Kameda T., & Abdel-Wahab H.M. Optimal code generation for machines with different types of instructions. In: IFIP 77 (ed Gilchrist) 129-132. -- see section A.3.6 (page 343).

[KAT78] Katz S. Program optimisation using invariants. IEEE Trans SE-4:5 373-389. -- see section 6.7.1 (page 237).

[KIB77] Kibler D.F., Neighbors J.M., & Standish T.A. Program manipulation via an efficient production system. SIGPLAN 12:8 163-173. -- see also [STA76,76']. -- see sections 4.4 (page 92) and 8.2 (page 296).

[KIM79] Kim J., & Tan C.J. Register assignment algorithms for optimising micro-code compilers - Part 1. IBM Research report RC7639. -- see section 6.7.2 (page 242).

[KIN81] King J.C. Program reduction using symbolic execution. IBM Research report RJ3051. -- see section 6.7.1 (page 237).

[KNU68] Knuth D. Semantics of context free languages. Math Syst Theory 2 127-145. -- see section A.2.1 (page 326).

[KOR80] Kornerup P., Kristensen B.B., & Madsen O.L. Interpretation and code generation based on intermediate languages. SWP&E 10 635-658. -- see section 2.3.2 (page 34).

[KOS71] Koster C.H.A. Affix grammars. In: Algol68 implementation (ed Peck) 95-109. North Holland. -- see section A.2.3 (page 333).

[LAM81] Lamb D.A. Construction of a peephole optimiser. SWP&E 11 639-647 & CMU-CS-80-141. -- see section 6.7.1 (page 237).

[LAV80] Lavington S. Early British Computers. Manchester University Press: ISBN 0-7190-0803-4 (hard), 0-7190-0810-7 (paper). -- CR38386. -- see section 1 (page 14).

[LEA66] Leavenworth. Syntax macros and extended translation. CACM 9 790-793. -- macros with alternative call structures, conditional expansion, and positional parameters. -- see section 8.2.3.

[LEC78] Lecarme O., & Peyrolle-Thomas M-C. Self-compiling compilers: an appraisal of their implementation and portability. SWP&E 8:2 149-170. -- see section 2.3.4 (page 37).

[LES70] Lester C. A consideration of the extensibility of the BCPL language-and-compiler system - appendix 2: "A description of BCPL". Essex U. Dept Comp Sci. -- see sections 3.2.1 (page 63) and 8.6 (page 305).

[LEV79] Leverett B., Cattell R.G.G., Hobbs S., Newcomer J., Reiner A., Schatz B., & Wulf W. An overview of the production quality compiler-compiler project. CMU-CS-TR-79-105. -- see section A.1.2 (page 317).

[LEV81] Leverett B.W. Register allocation in optimising compilers. CMU-CS-81-103. -- a ranking method. -- see sections 6.7.2 (page 242), A.1.3.1 (page 320) and A.1.2 (page 318).

[LOV76] Loveman D.B. Program improvement by source to source transformation. Proc 3rd ACM symp princ prog langs 140-152. -- see section 8.2.2.

[LOV77] Loveman. Program improvement by source-to-source transformation. JACM 24:1 121-145. -- optimisation meant by 'improvement'. -- partial model of compilation - code generation specified by transforms. -- see section 8.2.2 (page 298).

[LUC69] Lucas P., & Walk K. On the formal description of PL/1. In: Ann rev of autom prog 6:3. -- see [WAL69], [WEG72]. -- see section 8.6 (page 305).

[MAH81] Maher B., & Sleeman D.H. A data driven system for syntactic transformations. SIGPLAN 16:10 50-52. -- see section 4.4 (page 92).

[MCC80] McCaig J.M. The design of a portable translator of systems programming languages. Imperial College, CCD. -- see section 2.3.5 (page 40).

[MCK65] McKeeman W.M. Peephole optimisation. CACM 8:7 443-444. -- see section 6.7.1.

[MCK74] McKeeman W.M., & DeRemer F.L. Feedback-free modularisation of compilers. Lecture notes in Comp Sci #7 77-88. Springer Verlag. -- see section A.2.1 (page 326).

[MCK74'] McKeeman W.M. Programming language design. Chapter 5c of 'Compiler construction', editors Bauer & Eickel, Springer-Verlag. -- conflict: user-oriented input vs machine oriented output. -- see section 2.3 (page 30).

[MED81] Medina-Mora R., & Notkin D.S. ALOE users and implementors guide. CMU-CS-81-145. -- A Language Oriented Editor - configurable to a specified programming language. -- see section 8.3.3 (page 301).

[MIL71] Miller P.L. Automatic creation of a code generator from a machine description. MIT MAC-TR-85. -- see sections A.2 (page 325) and A.2.2 (page 328).

[MOS71] Moses J. Algebraic simplification: a guide for the perplexed. CACM 14:8 527-537. -- see section 8.2.1 (page 297).

[MOS76] Mosses. Compiler generation using denotational semantics. Springer-Verlag lecture notes in Comp Sci 45. -- see section 2.3.6 (page 45).

[NEW72] Newey M.C., Poole P.C., & Waite W.M. Abstract machine modelling to produce portable software - a review and evaluation. SWP&E 2:2 107-136. -- see section 2.3.1 (page 32).

[NEW75] Newcomer J.M. Machine independent generation of optimal local code. CMU technical report (unnumbered) & DAI 75-21781. -- see section A.3 (page 338).

[NOO71] Noonan R.E. Computer programming in a Dynamic Algebra. Purdue U. -- canonical form for a class of sorting algorithms. -- Aikin dynamic algebra. -- see [BOY72]. -- see section 6.7.1.

[NOO79] Noonan R.E. The design of relatively machine-independent code generators. College of William and Mary, Williamsburg VA & DAI 79-13628. -- CA22367. -- uses the idea of CGGL from Donegan. -- CGGL found convenient for expressing the complex case analysis needed for good local IL code. -- see [DON73]. -- see section A.3 (page 338).

[PEI77] Peirce, & Rowell. Transformation-directed compiling system. Comp J 20:2 109-115. -- see section 3.2.1 (page 63).

[PER79] Perkins D.R., & Sites R.L. Machine independent PASCAL code optimisation. SIGPLAN 14:8 201-207. -- see section 6.7.1 (page 237).

[RICH69] Richards M. BCPL: a tool for compiler writing and systems programming. AFIPS SJCC 34 557-567. -- see section 2.3.2 (page 33).

[RICH71] Richards M. The portability of the BCPL compiler. SWP&E 1:2 135-146. -- see section 2.3.2 (page 33).

[RICH74] Richards M. Bootstrapping the BCPL compiler using INTCODE. In "Machine oriented high level languages", editors van der Poel and Maarssen, Publ. N.Holland. -- see section 2.3.2 (page 33).

[RICH78] Ritchie D.M., Johnson S.C., Lesk M.E., & Kernighan B.W. The C programming language. Bell system tech J 57:6 1991-2019. -- see section 2.3.4 (page 37).

[RICH80] Richards M., & Whitby-Strevens C. BCPL - the language and its compiler. Cambridge University Press: ISBN 0 521 21965 5. -- see section 2.3.2.

[RIP75] Ripken K. Generating an intermediate-code generator in a compiler writing system. In: International Computing Symposium (eds Gelenbe & Potier) 121-127. North Holland. -- see sections A.2 (page 325) and A.2.1 (page 326).

[ROS76] Rosin. A graphical notation for describing system implementation. Iowa State U Dept of C.S. TR 76-8. -- see section U (page 395).

[RUD75] Rudmik. On the generation of optimising compilers. U Toronto, dept of Electrical Engineering & Canadian theses on microfiche 335120. -- a proposal - apparently not implemented. -- program represented as graphs of which the control graph is a sub-graph. -- transforms between graphs describing optimising transforms for code motion. -- intended more as a descriptional tool than as an implementation method. -- see section 8.2.

[RUD79] Rudmik A., & Lee E.S. Compiler design for efficient code generation and program optimisation. SIGPLAN 14:8 127-138. -- see section 6.7.1 (page 237).

[RUT77] Rutter. Improving programs by source-to-source transformation. DAI 77-26774. -- reviewed in SIGPLAN 13:1. -- system to manipulate Lisp programs. -- reads the program and test-data for the program. -- proposes improvements based on the program text and on execution statistics. -- verifies correctness of the improvements (for the test data) and measures the improvement. -- typical improvement of 10% to 50%. -- see sections 8.2.2 (page 298), 8.3.2 (page 300) and 8.4.1 (page 303).

[SAL75] Salvadori, Gordon, & Capstick. Static profile of Cobol programs. SIGPLAN 10:3 20-33. -- see section 8.3.2 (page 300).

[SAM75] Samet. Proving the correctness of translations involving optimised code. Stanford U & DAI 75-25601. -- see section 8.4.1 (page 303).

[SCH74] Schneck, & Angel. A Fortran to Fortran optimising compiler. Comp J 16:4 322-330. -- machine-independent optimisations. -- see section 8.2.2 (page 298).

[SCH74'] Schneider, & Winograd. Translation grammars for compilation and decompilation. BIT 14:1 78-. -- CR27450. -- see section 8.5 (page 305).

TSET7Q1 Sethi R., 8 tJllman J.D. The generation of optimal code for arithmetic °xoressions. J A C M 17:4 715-728. — see section C.2 (Daae 347).

CSHA531 SHARE ad-hoc committee on Universal languaaes. The oroblem of Droqram communication with changing machines: a proposed solution. C ACM 1, 12-. ~ UNCOL. — see section 2.3.1 (page 32).

FSHA741 Shaw M. Reduction of compilation costs through language contract ion. CACM 17:5 245-. — CR27037. — Fitting the lanquaqe to the application. — see section 8.2.3 (oage 299). rsiT79'l Sites R.L, ft Perkins D.R. Universal P-code definition. U calif at San Dieqo CS-79/029 and NTIS P0292082. — see section A.3.7 (page 343). 419. rSIT791 Sites R.L. Machine independent reqister allocation. SIGPLAN 14:8 221-225. — register allocation Pefore instruction choice by rankinq the variables. — see sectipns 6.7.2 (oage 242) and A.3.7 (oaqe 343).

CSKLA681 Sklansky, Finkelstein, 3 Russell. A formalism for translator interactions. J ACM 15:2 165-. — see section U (page 395).

CSNY753 Snyder A. A portable comoiler for the language C. MIT MAC-TR-149. — see also rJ0H77,73"l. — see section A.3 (naqe 333). nSTA76"l Standish. An examnle of program improvement using source-to-source transformations. U Calif at Irvine, Dept of Inf 8 Como Sci. — see sections 4.4 (oaqe 92) and 3.2 (oage 296).

CSTA76H Standish T.A., Harriman D.C., Kibler F., 8 Neighbors J.M. The Irvine program transformation catalogue. U Calif at Irvine. — See also CKIB771, CSTA761. — see sections 4.4 (page 92) and 8.2 (nage 296).

CSTA761 Standish T.A., Kibler D.F., 3 Neighbors J.M. Improving and refining programs by program manipulation. Proc national ACM conf 509-516. — See also TKI8771, CSTA76'1. — see sections 4.4 (page 92) and 3.2 (page 296).

CSTE611 Steele T.B. A first version of UNCOL. Proc WJCC 19, 371-373. — see section 2.3.1 (oage 32). 420.

CSTE63] Steele T.3. UNCOL - the myth and the fact. Ann Rev Autom Proq 2 325-344. — see section 2.3.1 (paqe 52).

CSTR58] Stronq. The problem of program communication with changing machines: a proposed solution. CACM 1:8 12-18 ft 1 :9 9-15. — UNCOL. — see sect ion 2.3 .1 . rSTRA65I Strachey C. Towards a general ourpose macroDrocessor. COMP J 3 225-241. — see section 2.3.5 (Dane 39).

CTAN771 Tan C.J. Reqister assi nnment algorithms for ODtimisinq conpilers. IBM res°arch reoort RC 6481. — see sections 6.7.2 and C. rTAM821 Tanenbaum A.S., van Staver^n H., ft Stevenson J.IJ. Usinq peeohole optimisation on intermediate code. TOPLAS 4:1 21-36. — a common IL for PASCAL, C, FORTRAN. — attempts to be an UNCOL for a limited class of lanquanes. — oeephole optimisation of their IL, table driven: about 15% improvement. — see sections 2.3.1 and 6.7.1.

[TER78] Terman C.J. The specification of code generation algorithms. MIT Project MAC. — see section A.3.8 (page 344).

[TRI78] Triance. A study of Cobol portability. Comp J 21:3 278-. — portability within the ANS74 Cobol standard. — see section 8.3.6 (page 302).

[WAI67] Waite W.M. A language independent macro processor. CACM 10, 433-440. — see section 2.3.5 (page 40).

[WAI68] Waite W.M. The STAGE2 macro processor. U Colorado, Dept of Electrical Eng. — see section 2.3.5 (page 40).

[WAI69] Waite W.M. Building a mobile programming system. Comp J 13:1 28-31, and U Colorado Graduate School Computer Centre. — macro based. — see section 2.3.5 (page 40).

[WAI70] Waite W.M. The mobile programming system STAGE2. CACM 12, 507-510. — macro based. — see section 2.3.5.

[WAL69] Walk K., & al. Abstract syntax and interpretation of PL/1. IBM tech report TR 25.098, IBM Labs Vienna. — see also [WEG72], [LUC69]. — see section 8.6 (page 305).

[WAS72] Wasilew. A compiler writing system with optimisation capabilities for complex order structures. DAI 72-32604 and N.W. Univ. — see section A.1.3.2 (page 322).

[WAT74] Watt D. LR(k)-parsing of affix grammars. U Glasgow, Dept of Comp Sci, TR-7. — see section A.2.3 (page 333).

[WEB80] Weber M., & Bernstein S.L. Global register allocation for non-equivalent register sets. SIGPLAN 15:2 74-81. — see section 6.7.2 (page 242).

[WEG72] Wegner P. The Vienna definition language. Comp Surv 4:1 5-63. — review paper. — see [WAL69], [LUC69]. — see section 8.6 (page 305).

[WEG75] Wegbreit. Mechanical program analysis. CACM 18, 528-. — program profiles. — see section 8.3.2 (page 300).

[WEI73] Weingart S.W. An efficient and systematic method of compiler code generation. Yale U., Dept Comp Sci & DAI 73-29501. — see section A.3 (page 338).

[WHI73] White. JOSSLE: a language for specifying and structuring the semantic phase of compilers. DAI 74-14349, and U. Cal. at Santa Barbara. — procedural description of code generators. — open ended universal IL.

[WIC75] Wick J.D. Automatic generation of assemblers. Yale University, Dept of Comp Sci. — see section 6.9 (page 273).

[WIL71] Wilcox T.R. Generating machine code for high-level programming languages. Cornell U., Dept Comp Sci TR 71-103.4 & DAI 72-9959. — piecewise code generation. — see section A.3.5 (page 342).

[WIL71'] Willers. Syntax-directed graph algorithm for input of equations to a Taylor-series system for solving O.D.E.s. Comp J 19:4 344-347. — see section 8.8 (page 307).

[WIL76] Wilhelm R., Ripken K., Ciesinger J., Ganzinger H., Lahner W., & Nollmann R. Design evaluation of the compiler generating system MUG1. Proc 2nd International Conf on Software Engineering 571-576. — see sections A.2 (page 325) and A.2.3 (page 335).

[WUL71] Wulf W., Russell D.B., & Habermann A.N. BLISS: a language for systems programming. CACM 14:12 780-790. — see section 2.3.4 (page 37).

[WUL75] Wulf W., Johnsson R., Weinstock C., Hobbs S., & Geschke C. The design of an optimising compiler. American Elsevier. — see sections 2.3.4 (page 37), A.1 (page 316) and A.1.2 (page 317).

[WUL80] Wulf W.A. PQCC: a machine-relative compiler technology. CMU-CS-80-144. — expert systems. — see sections 2.2.3 (page 30), A.1.1 (page 316), A.1.2 (page 318) and A.1.3 (page 319).

[YOU74] Young. The Coder: a program module for code generation in high-level language compilers. U Illinois Dept Comp Sci UIUCDCS-R-74-666 & DAI vol 35 p1731B. — procedural description of code generation.

CROSS-REFERENCE TO THE BIBLIOGRAPHY

1 (page 14) — LAV80
2.2.3 (page 30) — WUL80
2.3 (page 30) — MCK74
2.3.1 — CON58, STR58, TAN82
2.3.1 (page 32) — CAMP73, COLE74, MAD73, NEW72, SHA58, STE61, STE63
2.3.2 — RICH80
2.3.2 (page 33) — RICH69, RICH71, RICH74
2.3.2 (page 34) — .OR80
2.3.4 (page 37) — LEC78, RICH78, WUL71, WUL75
2.3.5 — WAI70
2.3.5 (page 39) — BROW67, BROW69, STRA65
2.3.5 (page 40) — HAL63, nccon, WAI67, WAI68, WAI69
2.3.6 (page 42) — DER69, EAR70, FOS68, GRA80
2.3.6 (page 43) — BROO62, BROO63, BROO67
2.3.6 (page 44) — FEL64
2.3.6 (page 45) — BAR78, MOS76
3.2.1 (page 68) — EVA73, LES70, PEI77
3.2.2 (page 70) — GLA77, GLA77', GRA78
4.4 (page 91) — FOR81
4.4 (page 92) — KIB77, MAH81, STA76, STA76', STA76''
5.3.2 (page 144) — DEM77
6.5 (pages 198 & 199) — GLA77, GLA77', GRA78
6.7.1 — FRA79, MCK65, NOO71, TAN82
6.7.1 (page 237) — BOY72, CHE70, DAV80, KAT73, KIN81, LAO81, PER79, RUD79
6.7.2 — ABR79, BRE79, HAR75, JOH75, TAN77
6.7.2 (page 242) — CHA80, KIM79, LEV81, SIT79', WEB80
6.9 (page 273) — WIC75
8.1 (page 295) — DAR81
8.2 — RUD75
8.2 (page 296) — BOY72, KIB77, STA76, STA76', STA76''
8.2.1 (page 297) — BIR77, MOS71
8.2.2 — LOV76
8.2.2 (page 298) — AUS78, BAG70, BIR77', BIR79, DON79, LOV77, RUT77, SCH74
8.2.3 — LEA66
8.2.3 (page 299) — BEN64, BIT74, BIT75, CAM73, SHA74
8.3.1 (page 300) — CLI78, CON70
8.3.2 (page 300) — BAL79, CHE78, ELS78, ING72, RUT77, SAL75, WEG75
8.3.5 (page 301) — IED81
8.3.6 (page 302) — BRO71, JOH77, TRI78
8.4.1 (page 303) — JOY77, RUT77, SAM75
8.4.3 — DAR77
8.4.3 (page 304) — DAR78
8.5 (page 305) — HOP73, SCH74'
8.6 (page 305) — LES70, LUC69, WAL69, WEG72
8.8 (page 307) — WIL71'
A.1 (page 316) — WUL75
A.1.1 (page 316) — WUL80
A.1.1 (page 317) — FRA77
A.1.2 (page 317) — LEV79, WUL75
A.1.2 (page 318) — CAT78, LEV81, WUL80
A.1.2 (page 319) — ORO81
A.1.3 (page 319) — WUL80
A.1.3.1 — ALBR79, BRE79
A.1.3.1 (page 320) — LEV81
A.1.3.1 (page 321) — CHA80
A.1.3.2 — CAT79, CAT80
A.1.3.2 (page 322) — CAT78, WAS72
A.2 (page 325) — GAN76, GAN77, MIL71, RIP75, WIL76
A.2.1 (page 326) — AHO71, DER73, KNU63, MCK74, RIP75
A.2.2 (page 328) — GAN76, MIL71
A.2.3 (page 333) — KOS71, WAT74
A.2.3 (page 335) — GAN76, WIL76
A.2.4 (page 335) — GAN77
A.2.4 (page 336) — GIE78
A.3 (page 338) — CAR77, CAT77, DON78, DON79, ELS70, HAR77, JOH77, JOH78, NEW75, NOO79, SNY75, WEI73
A.3.1 (page 339) — AHO76, FRA71
A.3.2 (page 339) — FRA79'
A.3.3 (page 339) — FRA77
A.3.4 (page 340) — GLA77, GLA77', GRA78
A.3.4 (page 341) — AHO75
A.3.5 (page 342) — WIL71
A.3.5 (page 343) — GOG78
A.3.6 (page 343) — KAM77
A.3.7 (page 343) — SIT79, SIT79'
A.3.8 (page 344) — TER78
C — JOH75, TAN77
C.2 — AHO70, FLA79
C.2 (page 347) — AHO76, AHO77, AHO77', SET70
C.2.1 (page 348) — FLO61
C.3 (page 354) — HOR66
U (page 395) — BRA61, EARL70, ROS76, SKLA68