Incremental Code Generation in a Distributed Integrated Programming Environment
Total Page:16
File Type:pdf, Size:1020Kb
g^>-- q Incremental Code Generation in a Distributed Integrated Programming Environment Michael James McCarthy Thesis submitted for the degree of Doctor of Philosophy ln The University of Adelaide (Department of Computer Science) December 1995 To Julia, who does nothing incrementølly. Table of Contents 1. Introduction ... I 1.1. Contributions of this thesis 3 1.2. Outline of this thesis 5 2. Incremental compilation ..... 7 2.1. Basic definitions 8 2.2. Goals and issues 10 2.3. Incremental code generation l4 2.4. Previous systems 29 2.5. Conclusions 45 3. A distributed architecture for an integrated programming environment 48 3.1. Integrated programming environments .. 49 3.2. A client-server architecture . 59 3.3. Conclusions 69 4. Abstract syntax trees 7r 4.1. Abstract syntax 72 4.2. Tree domains and tree addresses 74 4.3. Tree patterns 77 4.4. Attribution of abstract syntax trees 78 4.5. A simple model of editing 82 4.6. A language specification formalism . .. 83 4.7. Conclusions .... 85 5. Incremental instruction selection .. 86 5.1. Retargetable code generation 87 5.2. BURS-based incremental instruction selection r07 5.3. Transforming abstract syntax trees Lt4 5.4. The incremental instruction selection algorithm .r25 5.5. Variations t42 5.6. Conclusions .t44 6. Parallel incremental compilation L45 6.1. Parallel greedy incremental code generation 146 6.2. Interaction with static semantic analysis .. 152 6.3. Parallel BURS-based incremental instruction selection 159 6.4. Conclusions ...r72 7. A prototype implementation ...t74 7.1. The MultiView prototype ... r75 7.2. Architecture independence . .... .. 186 7.3. The incremental code generator 189 7.4. Timing trials 191 7.5. Conclusions 195 8. Conclusions and future work 196 8.1. Conclusions .. 196 8.2. Future work .. 198 Appendix A. Glossary of Symbols ..202 Appendix B. Extract from a language specification 204 Appendix C. Extract from an architecture description 209 Appendix D. A dispatcher specification 214 Bibliography . ... 215 Table of Figures 2.r Granularity for incremental compilation. .. I 2.2 Inconsistent incremental compilation. t2 2.3 Skeletons from Earley's method 29 2.4 Summary of Earley's method. 30 2.5 The structure of the DICE system. .. 31 2.6 The DICE incremental compiler... 32 2.7 The DICE incremental code generator. 33 2.8 Summary of incremental compilation in DICE... 34 2.9 The LOIPE environment.. 35 2.t0 Summary of LOIPE. 37 2.tL Summary of the Magpie environment. ... 38 2.r2 Operator-operand trees in SEP 39 2.r3 Summary of SEP. .. 40 2.L4 Summary of Bivens' incremental register reallocation 42 2.I5 MFAD representation of redundant store elimination 44 2.16 Summary of Pollock's incremental optimisation... 45 3.1 Cornell Program Synthesizer derivation trees. 52 3.2 Attribution of a tree with threaded code. 54 3.3 Integration of a compiler into the Field environment. 57 3.4 The MultiView architecture... 60 ó.Ð The high-level specification of the LOAD-UNIT q 63 3.6 Communication between the MultiView database and vtews. 63 3.7 Broadcast of an edit notification 64 3.8 Structure of a MultiView view 65 3.9 A MultiView code generator view. 66 4.r Evaluating instances of attributes. .81 4.2 Declaration of a sort, an operator and a generator set ,83 4.3 Productions from the concrete syntax .84 nt 4.4 Attribute declaration and definition. .. 85 5.1 Extract from a TMDL descriPtion 90 5.2 Reduction by a Graham-Glanville code generator 91 5.3 Extract from a tree-pattern based architecture description. 93 5.4 Covering a subject tree with a set of tree patterns. 94 5.5 Two covers for an intermediate code tree. 95 5.6 Twig tree-rewriting rules. 98 5.7 Rewrite rules from a BURS architecture description. '. r02 5.8 Two rewrite sequences of a subject tree. .. 103 5.9 Subtree replacement. .. 108 5.10 Inferring goal symbols for subnodes. 110 5.11 Normalisation of a rewrite rule.. TT2 5.12 Some operators from PCC-IR. 114 5.13 Loading an integer constant into the SPARC register %10. .116 5.r4 Identifier resolution in an abstract syntax tree. 116 5.15 Semantic relabelling of an abstract syntax tree' 118 5.16 Semantic transformation after an edit. r23 5.17 Recompilation after subtree replacement. .. ..126 5.18 The BURS pattern matching automaton ..L27 5.19 Expansion of the extent of recompilation. 128 5.20 Top-down reduction and generation of object code., I29 5.2L Template specifications for rewrite rules... .130 5.22 Implicit representation of a local rewrite assignment. ..132 5.23 The size and ofset attribution . .135 5.24 Evaluation of size and offset in Red'uce. .136 5.25 Object code management in Recompile..... .. L37 5.26 Code file adjustment 138 5.27 Branching in a conditional statement 140 5.28 Simple branch updating. .. 141 6.1 Structure of the PSEP system. .. r47 tv 6.2 Part of the simplified PSEP algorithm. 149 6.3 Structure of the MultiView code generator view 151 6.4 Effect of inserting a declaration on an activation record (AR). 153 6.5 Subtree replacement corresponding to Figure 6.4 L54 6.6 Semantic relabelling and code templates for assignment 154 6.7 Change of relabelling after a source code update. r57 6.8 Outline of the greedy pa,rallel incremental code generation algorithm. .160 6.9 Naive greedy parallel incremental recompilation 161 6.10 Filtering of the update queue. 163 6.11 Worklist-based variant of Recompile derived from Figure 5.25. 165 6.t2 The parallet BURS-based incremental code generator.. 167 6.13 Process-relabelling.. .169 6.L4 Process-matching ...170 6.15 Processing structural changes in the parallel incremental code generator. .. L7L 6.16 Processing detail changes in the parallel incremental code generator. L72 7.1 Basic MultiView abstract syntax tree nodes t75 7.2 Typical sizes of abstract syntax trees. ...t77 7.3 Generation of language specifrc components. ...179 7.4 Specification of the type of generator instance representations. 180 7.5 Transmission of a query to the database. r82 7.6 Generation of CSS components from the protocol specifications 183 7.7 Extract of the dispatcher specification for TextView. .. 184 7.8 Generation of view components from the message specifications. .185 7.9 Extract from an architecture descriptlon. .. ... 187 7.L0 Generation of the code generator from formal specifications.. 188 7.tr Structure of the code generator view. 189 7.r2 Results from a series of trials of duration 25 seconds L92 7.L3 Average recompilation time plotted against the size of recompilation... 193 v Abstract An incremental compiler recompiles a program, after an edit, in time proportional to the size of the edit; this incremental recompilation will include incremental generation of object code for the program. This thesis presents a new method for performing incremen- tal code generation in a distributed integrated programming environment. A prototype implementation of such an incremental code generator is also described. Retargetable instruction selectors are generated from a formal specification of the architecture of the target computer. This thesis derives a ner\¡ retargetable incremental instruction selection algorithm from a non-incremental instruction selection technique in the framework of a precise model of the underlying program representation. The resulting algorithm incrementally regenerates locally optimal object code after the replacement of a subtree in an abstract syntax tree program representation. A greedy incremental code generator recompiles the updated program immediately after an edit. The impact on the response time of the environment is reduced if the incremental code generator executes in parallel with editing. An efficient greedy paral- lel incremental code generation algorithm is derived from the retargetable incremental instruction selection algorithm; the derivation of the parallel algorithm is also based on an analysis of the propagation of static semantic information after the replacement of an arbitrary subtree in the abstract syntax tree. The integration of a greedy incremental code generator into the MultiView distributed integrated programming environment is demonstrated via a prototype implementation. An instance of this environment for a particular programming language is generated from a formal specification of the language and the target computer architecture on which the programs in the language are to be executed. Experimental data gathered from the proto type verifies that the recompilation time after a program update depends approximately linearly on the number of recompiled nodes. vl Acknowledgements I am indebted to my supervisor Chris Marlin who has encouraged and advised me on the tortuous road to the completion of this work. Without his support the research would never have been done. The quality of this thesis has been immeasurably improved by his red pen. Thanks are also owed to Paul Calder for his advice and encouragement, to Jenny Harvey for suggesting that I rewrite Chapter Three, to Todd Rockoff for keeping me honest, and to David Jacobs for the loan of his dictionary. I am grateful to the Department of Computer Science at The Flinders University of South Australia for providing the research facilities that I have used over the la.st few years. vlu Chapter 1 Introduction The power of computer systems has increased dramatically over the last three or four decades, while the cost of these systems has dropped sharply. Over the same period of time, the range and complexity of applications for these computer systems has expanded considerably. However, the development and maintenance of the software that is critical to all computer systems remains a difrcult and expensive task. Many approaches to software development have been proposed in an attempt to rec- oncile the difficulty of software development with the demand for the timely delivery of safe and reliable software.