Abstract Data Type, 5 Abstract Syntax, See Syntax, Abstract Access Link

Total Pages: 16

File Type: PDF, Size: 1020 KB

Modern Compiler Implementation in Java, Second Edition, by Andrew W. Appel. Cambridge University Press, ISBN 052182060X. Index (excerpt; the original two-column layout has been restored to a single list):

abstract data type, 5
abstract syntax, see syntax, abstract
access link, see static link
activation record, 6, 116–125
Ada, 336, 348, 349
addressing mode, 183, 188
ADT, see abstract data type
Aiken-Nicolau algorithm, 444–448, 459
alias
  analysis, 357, 369–374, 392
  in coalescing register allocation, 234
alignment, see cache alignment
alloca, 197
allocation
  of activation records, 116, 118, 156
  of arrays and records, 151
  of heap data, 275
  register, see register allocation
alphabet, 18
ambiguous grammar, see grammar, ambiguous
analysis
  dataflow, see dataflow analysis
  liveness, see liveness
antidependence, see dependence, write-after-read
approximation
  dataflow analysis, 209, 212, 352
  in garbage collection, 257
  of spill effect, 220
  of strictness, 331
argument, see parameter
array, 144, 146, 151
  bounds check, 148, 391–395
Assem module, 191
associativity, see right-associative, left-associative, nonassociative
attribute grammar, 12
available expressions, 356
Baker's algorithm, 274
basic block, 170, 172, 361, 365, 382
beta reduction, see inline expansion
binding, 103–110, see also precedence
  in type environment, 111
blacklist, 281
block structure, see function, nested
blocking, 477–480, 482
branch prediction, 456–459
buffered input, 33
bypass datapaths, 442, 444
C programming language
  linking to, 153
  writing compiler for, 18, 90, 116, 117, 122, 130, 139, 144–146, 150, 151, 197, 322, 369, 371, 374, 377
C++, 291, 336, 369
cache, 464–467
  alignment, 468–470
  and garbage collection, 267, 480–481
cache alignment, 481
CALL, 162, 163, 168
call
  by name, 322
  by need, 323
  by reference, 123, 124
callee-save, see register, callee-save
caller-save, see register, caller-save
Canon module, 163
canonical tree, see intermediate representation, canonical
card marking, 270
CISC, 180, 187–190, 459
class descriptor, 285–289, 292–297
classless language, 293
cloning, 293
closure
  conversion, 316–317, 320
  ε, 27, 28, 33
  function, 301, 303, 331
  Kleene, 19, 39
  of LR state, 60, 63
coalescing, 223–240, 245, 320, 360
  conservative, 223
  of SSA variables, 428
code generation, see instruction selection
code-generator generator, 185
Codegen module, 196
coercion, 290
coloring, see graph coloring
comma operator, see expression sequence
common-subexpression elimination, 356, 359
commute, 164, 166–174
complex instruction set, see CISC
computer, see CISC and RISC
conditional jump, 140, 149, 162, 169
conditional move, 454
conflict
  in predictive parser, 46
  reduce-reduce, 68, 75
  resolution of, 72–75
  shift-reduce, 62, 67, 68, 72, 74
conservative approximation, see approximation
constant folding, 419
constant propagation, 356, 418–419
  conditional, 419–422
constraint, functional-unit, 441, 443
constructor, 8
continuation, 304, 332
continuation-passing style, 435
control dependence, 425–426
  graph, 426
control flow, 170, see also flow graph
control-flow graph, see flow graph
coordinated induction variable, 388–392
copy propagation, 359, 419, see also coalescing
dangling else, 68
dangling reference, 122
data type, abstract, see abstract data type
dataflow, see also liveness, reaching definitions, available expressions, etc.
  analysis, 6
  bit vector, 361
  equations, 205–210, 352, 354, 356, 368, 372, 379
  iteration, see iteration algorithms
  work-list algorithms, 363
dead code, 312, 326, 360, 364, 365, 368, 369, 389, 394, 417, 426–428
dead state, 23
def (of variable), 205
def-use chain, 399, 438
deforestation, 327–328
dependence
  control, see control dependence
  data, 423, 442, 476
  loop-carried, 445
  memory and array, 423–425, 445
  read-after-write, see dependence, data
  write-after-read, 423, 441, 476
  write-after-write, 423, 441, 476
depth-first search
  for dataflow analysis, 207, 209, 362, 363
  garbage collection, 257, 268, 279
  spanning tree, 410–411
derivation, 41
descriptor
  class, 276, 285–289, 292–297
  level, 155
  record, 276, 278
DFA, see finite automaton
display, 134
  class hierarchy, 290, 295, 296
dominance frontier, 404, 436
dominance property, see static single-assignment form
dominator, 379–382, 384, 392–395, 436
  efficient calculation of, 410–416, 434
dynamic programming
  for instruction selection, 182–186
  for register allocation, 241–244
dynamic scheduling, see out-of-order execution
edge splitting, 408
edge-split SSA, see static single-assignment form
else, dangling, 68
emission
  in instruction selection phase, 183, 185
  of assembly code, 5, 6, 198, 244
end-of-file marker, 45
environment, 11, 103–111, 115, 284, 301, 317
  functional, 107
  imperative, 106
  multiple, 105
equational reasoning, 298–302, 306, 321, 430
error message, 91
error recovery, 53
escape, 124, 302, 321, 332, see also FindEscape
ESEQ, 162–169
expression sequence, see also ESEQ
finite automaton, 18, 21–30
  deterministic, 22
  minimization, 36
  nondeterministic, 24
    converting to DFA, 27
FIRST set, 47–52, 63
fixed point, 48, 206, 357, 374
  least, 209, 218, 368, 419
Flex, 34
flow graph, 203
  reducible, 377
flow, data, see dataflow
FlowGraph module, 215
FOLLOW set, 48–50, 52, 54, 62
forward reference, see recursion, mutual
forwarding, 265–268
fragmentation, 261
frame, see activation record
Frame module, 127, 251
frame pointer, 118–120, 134, 143, 155, 197–198
  on Pentium, 188
freeze, 224, 233, 239
function
  dead, 312
  higher-order, 117, 298
  integration, see inline expansion
  leaf, 122
  nested, 117–118, 124–126, 131, 135, 155, 298, 301–302, 369
functional intermediate form, 430–435
functional programming, 12, 104, 298–334, see also side effect
  impure, 299–301
  pure, 302–308
  symbol tables, 107–108
functional unit, 441, 442
  multiple, 442
garbage collection, 151, 257–282, 321, 333
  and cache, 267, 480–481
  Baker's algorithm, 274
  compiler interface, 275–278
  concurrent, 272
  conservative, 281
  copying, 264–269
  cost, 259, 264, 268, 271, 275
  flip, 274
  generational, 269–271, 480
  incremental, 272–275
  mark-sweep, 257–262
  reference counts, 262–264
generic, 336, 348
Generic Java, 336
GJ, 336
grammar, 5, 40–45, see also syntax
  ambiguous, 42, 50, 51, 67–68, 90, 185
  attribute, 12
  factoring, 53
  for intermediate representation, 7–9
  for parser generator, 89
  hierarchy of classes, 66
  LALR, 66, 67
  LL(1), 51
  of straight-line programs, 7
  to specify instruction set, 183–186
  transformations, 51, 88, 90
  unambiguous, 51
graph
  coloring, 219–223, 250, 286, 360
    optimistic, 221
    with coalescing, 223–240, 245, 320
    work-list algorithm, 232–240
  interference, see interference graph
Graph module, 214
graph, flow, see flow graph
halting problem, 351, 374
hash table, 106, 114
hazard, 441, see also constraint, functional-unit
IBM 360/91, 455
induction variable, 385–391
  coordinated, 388, 390, 392
  linear, 387
inheritance, 283, 284
  multiple, 286
  single, 285, 294, 295
inline expansion, 276, 308–316, 332, 431
instanceof, 12, 93
instantiation of variable, 116
instruction
  fetch, 456, 470
  Instr representation of, 191
  pipeline, see pipeline
  resource usage of, 442
  selection of, 6, 176–202
  side effect of, 188, 196
  three-address, 188
  two-address, 188, 193
  variable-length, 188
instruction set, see CISC and RISC
instruction-level parallelism, 440
Intel, see Pentium
interfaces, 5
interference graph, 212–232, 244
  construction of, 213, 216–217, 236
  for SSA form, 429
  from SSA form, 429, 438
intermediate representation, 6, 137–139, see also Tree
  canonical, 162–169
  functional, 430–435
interpreter, 91
invariant, see loop invariant
IR, see intermediate representation
item
  LR(0), 59
  LR(1), 63
iteration algorithms
  alias analysis, 372
  dominators, 379
  ε-closure, 28
  efficient, 360–364
  first and follow sets, 48
  invention of, 374
  liveness analysis, 206–207
  LR parser construction, 60
  minimization of finite automata, 36
  reaching definitions, 355
iterative modulo scheduling, see modulo scheduling
Java, 336
  writing compiler for, 18, 90, 105, 145, 276, 282, 289, 290, 292, 297, 322, 370, 371, 392, 393
  writing compiler in, 3, 9–11, 91
JavaCC, 7, 68–89
JavaCC parser generator, 89
Jouette, 176–180, 192–195
  Schizo, 184
Kleene closure, 19, 39
label, 131
lambda calculus, 430
lambda-calculus, 331
landing pad, 435
lattice, 420
lazy evaluation, 321–327, 435
leaf function, 122
left-associative operator, 73, 74
left-factoring, 53
left-recursion, 51
Lengauer-Tarjan algorithm, 410–416, 434, see also dominator
Lex, 6, 33
lexical analyzer, 6, 16–37, 93
lexical scope, see function, nested
Lisp, 348
live range, 203, 213
live-in, 205
live-out, 205
liveness, 6, 203–218, 236, 358, 360, 363, 365, 367, 368
  in SSA form, 429
  of heap data, 257
LL(k), see parser, LL(k)
local variable, 116
locality of reference, see cache
lookahead, 37
loop, 376
  header, 376, 381–382
  inner, 381
  interchange, 476–477
  invariant, 314, 326, 382, 384–389, 398
  natural, 381–382
  nested, 382
  postbody, see postbody
parallel processing, instruction-level, 440
parameter, see also view shift
  actual, 194, 312, 319
Recommended publications
  • Modern Compiler Implementation in Java. Second Edition
    Modern Compiler Implementation in Java, Second Edition, by Andrew W. Appel and Jens Palsberg. ISBN 052182060X. Cambridge University Press © 2002 (501 pages). This textbook describes all phases of a compiler, with thorough coverage of current techniques in code generation and register allocation and of the compilation of functional and object-oriented languages.

    Table of Contents: Preface. Part One, Fundamentals of Compilation: Chapter 1, Introduction; Chapter 2, Lexical Analysis; Chapter 3, Parsing; Chapter 4, Abstract Syntax; Chapter 5, Semantic Analysis; Chapter 6, Activation Records; Chapter 7, Translation to Intermediate Code; Chapter 8, Basic Blocks and Traces; Chapter 9, Instruction Selection; Chapter 10, Liveness Analysis; Chapter 11, Register Allocation; Chapter 12, Putting It All Together. Part Two, Advanced Topics: Chapter 13, Garbage Collection; Chapter 14, Object-Oriented Languages; Chapter 15, Functional Programming Languages; Chapter 16, Polymorphic Types; Chapter 17, Dataflow Analysis; Chapter 18, Loop Optimizations; Chapter 19, Static Single-Assignment Form; Chapter 20, Pipelining and Scheduling; Chapter 21, The Memory Hierarchy. Appendix A, MiniJava Language Reference Manual. Bibliography, Index, List of Figures, List of Tables, List of Examples.

    Back cover: This textbook describes all phases of a compiler: lexical analysis, parsing, abstract syntax, semantic actions, intermediate representations, instruction selection via tree matching, dataflow analysis, graph-coloring register allocation, and runtime systems.
  • Maximal-Munch” Tokenization in Linear Time Tom Reps [TOPLAS 1998]
    Fall 2016-2017 Compiler Principles, Lecture 1: Lexical Analysis. Roman Manevich, Ben-Gurion University of the Negev.

    Agenda: understand the role of lexical analysis in a compiler; regular languages reminder; lexical analysis algorithms; scanner generation.

    JavaScript example: can you identify some basic units in this code?

        var currOption = 0; // Choose content to display in lower pane.
        function choose(id) {
            var menu = ["about-me", "publications", "teaching", "software", "activities"];
            for (i = 0; i < menu.length; i++) {
                currOption = menu[i];
                var elt = document.getElementById(currOption);
                if (currOption == id && elt.style.display == "none") {
                    elt.style.display = "block";
                } else {
                    elt.style.display = "none";
                }
            }
        }

    The basic units include keywords (var, function, for, if, else), identifiers (currOption, choose, menu, elt), operators, numeric literals (0), string literals ("about-me", "none", "block"), punctuation, comments, and whitespace.

    Role of lexical analysis: it is the first part of the compiler front-end, feeding syntax analysis (the slides show the pipeline Lexical Analysis, Syntax Analysis, AST, Symbol Table, Intermediate Rep., code generation).
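    To make the scanning step concrete, here is a minimal maximal-munch tokenizer sketch in Java. It is illustrative only, not the course's scanner: the token kinds and the longest-run rules for identifiers and numbers are my own simplifying assumptions.

        import java.util.ArrayList;
        import java.util.List;
        import java.util.Set;

        /** Illustrative maximal-munch scanner: at each position, consume the
         *  longest prefix that forms a token, then classify it. */
        public class TinyLexer {
            enum Kind { KEYWORD, IDENT, NUMBER, PUNCT, EOF }
            record Token(Kind kind, String image) {}

            static final Set<String> KEYWORDS = Set.of("var", "function", "for", "if", "else");

            static List<Token> scan(String src) {
                List<Token> out = new ArrayList<>();
                int i = 0;
                while (i < src.length()) {
                    char c = src.charAt(i);
                    if (Character.isWhitespace(c)) { i++; continue; }   // skip whitespace
                    int j = i + 1;
                    if (Character.isLetter(c)) {                        // identifier or keyword
                        while (j < src.length() && Character.isLetterOrDigit(src.charAt(j))) j++;
                        String s = src.substring(i, j);
                        out.add(new Token(KEYWORDS.contains(s) ? Kind.KEYWORD : Kind.IDENT, s));
                    } else if (Character.isDigit(c)) {                  // numeric literal
                        while (j < src.length() && Character.isDigit(src.charAt(j))) j++;
                        out.add(new Token(Kind.NUMBER, src.substring(i, j)));
                    } else {                                            // single-char punctuation
                        out.add(new Token(Kind.PUNCT, String.valueOf(c)));
                    }
                    i = j;
                }
                out.add(new Token(Kind.EOF, ""));
                return out;
            }

            public static void main(String[] args) {
                scan("var currOption = 0;")
                    .forEach(t -> System.out.println(t.kind() + " '" + t.image() + "'"));
            }
        }

    On the first line of the JavaScript example this prints KEYWORD 'var', IDENT 'currOption', PUNCT '=', NUMBER '0', PUNCT ';', EOF.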
  • CS 444: Compiler Construction
    CS 444: Compiler Construction. Gabriel Wong. Notes taken from lectures by Ondřej Lhoták in Winter 2016.

    Contents:
    Introduction: Front-end Analysis; Formal Languages.
    Lexical Analysis / Scanning: Regular Expressions; Regex to DFA; Definition of NFA and DFA; Computing ε-closure; NFA to DFA Conversion; Scanning; Constructing Scanning DFA.
    Parsing: Context-Free Grammar; Top-down / Bottom-up Parsing; LL(1) Parsing; Augmented Grammar; LL(1) Algorithm; LR(0) Parsing; LR(1) Parsing; SLR(1) Parser; Comparison Between LR Parsers; LR(1) NFA; LALR(1) Parser; Abstract Syntax Tree.
    Semantic Analysis: Implementing Environments; Namespaces; Java Name Resolution (1. Create class environment; 2. Resolve type names; 3. Check class hierarchy; 4. Disambiguate ambiguous names; 5. Resolve expressions (variables, fields); 6. Type checking; 7. Resolve methods and instance fields).
    Static Analysis: Java Reachability Analysis (JLS 14.20); Java Definite Assignment Analysis (JLS 16); Live Variable Analysis.
    Code Generation: IA-32 Assembly; Assembler Directives; Strategy for Code Generation; Data Layout; Constants; Local Variables; Method Calls; Object Layout; Vtables; Dispatching Interface Methods; Subtype Testing; Arrays.

    Introduction: A compiler translates from a source language (eg. Java) to a target machine language (eg. x86). Scanning, parsing (A1) and context sensitive analysis (A234) are part of the compiler front end.
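    The "Computing ε-closure" step in the outline above is a small fixed-point computation over the NFA's ε-edges. Below is a hedged sketch in Java; the Map-based NFA encoding is an assumption for illustration, not the course's data structure.

        import java.util.ArrayDeque;
        import java.util.Deque;
        import java.util.HashSet;
        import java.util.List;
        import java.util.Map;
        import java.util.Set;

        /** epsilon-closure: all NFA states reachable from the start set via
         *  epsilon-edges only, computed with a standard worklist fixed point.
         *  This is the core step of the subset (NFA-to-DFA) construction. */
        public class EpsilonClosure {
            static Set<Integer> closure(Set<Integer> start, Map<Integer, List<Integer>> epsEdges) {
                Set<Integer> result = new HashSet<>(start);
                Deque<Integer> work = new ArrayDeque<>(start);
                while (!work.isEmpty()) {
                    int s = work.pop();
                    for (int t : epsEdges.getOrDefault(s, List.of())) {
                        if (result.add(t)) work.push(t);   // newly reached state: explore it too
                    }
                }
                return result;
            }

            public static void main(String[] args) {
                // epsilon-edges: 0 -> 1 and 1 -> 2; a state with no entry has none
                Map<Integer, List<Integer>> eps = Map.of(0, List.of(1), 1, List.of(2));
                System.out.println(closure(Set.of(0), eps));   // contains 0, 1 and 2
            }
        }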
  • Context-Aware Scanning and Determinism-Preserving Grammar Composition, in Theory and Practice
    Context-Aware Scanning and Determinism-Preserving Grammar Composition, in Theory and Practice. A thesis submitted to the faculty of the graduate school of the University of Minnesota by August Schwerdfeger, in partial fulfillment of the requirements for the degree of Doctor of Philosophy. July, 2010. © August Schwerdfeger 2010. All rights reserved.

    Acknowledgments: I would like to thank all my colleagues in the MELT group for all their input and assistance through the course of the project, and especially for their help in providing ready-made tests and applications for my software. Thanks specifically to Derek Bodin and Ted Kaminski for their efforts in integrating the Copper parser generator into MELT's attribute grammar tools; to Lijesh Krishnan for developing the ableJ system of Java extensions and assisting me extensively with its further development; and to Yogesh Mali for implementing and testing the Promela grammar with Copper. I also thank my advisor, Eric Van Wyk, for his continuous guidance throughout my time as a student, and also for going well above and beyond the call of duty in helping me to edit and prepare this thesis for presentation. I thank the members of my thesis committee, Mats Heimdahl, Gopalan Nadathur, and Wayne Richter, for their time and efforts in serving and reviewing. I especially wish to thank Prof. Nadathur for his many detailed questions and helpful suggestions for improving this thesis. Work on this thesis has been partially funded by National Science Foundation Grants 0347860 and 0905581. Support has also been received from funds provided by the Institute of Technology (soon to be renamed the College of Science and Engineering) and the Department of Computer Science and Engineering at the University of Minnesota.
  • Rewriting Strategies for Instruction Selection
    Rewriting Strategies for Instruction Selection. Martin Bravenboer, Eelco Visser. www.stratego-language.org. Technical Report UU-CS-2002-021, Institute of Information and Computing Sciences, Utrecht University, April 2002. This technical report is a preprint of: M. Bravenboer and E. Visser. Rewriting Strategies for Instruction Selection. To appear in S. Tison (editor), Rewriting Techniques and Applications (RTA'02), Lecture Notes in Computer Science, Springer-Verlag, Copenhagen, Denmark, June 2002. (© Springer-Verlag) Copyright © 2002 Martin Bravenboer and Eelco Visser. Address: Institute of Information and Computing Sciences, Universiteit Utrecht, P.O. Box 80089, 3508 TB Utrecht. Email: [email protected], [email protected]. http://www.cs.uu.nl/~visser/

    Abstract. Instruction selection (mapping IR trees to machine instructions) can be expressed by means of rewrite rules. Typically, such sets of rewrite rules are highly ambiguous. Therefore, standard rewriting engines based on fixed, exhaustive strategies are not appropriate for the execution of instruction selection. Code generator generators use special purpose implementations employing dynamic programming. In this paper we show how rewriting strategies for instruction selection can be encoded concisely in Stratego, a language for program transformation based on the paradigm of programmable rewriting strategies. This embedding obviates the need for a language dedicated to code generation, and makes it easy to combine code generation with other optimizations.

    1 Introduction. Code generation is the phase in the compilation of high-level programs in which an intermediate representation (IR) of a program is translated to a list of machine instructions.
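    The abstract's core idea, instruction selection as rewrite rules over IR trees, can be illustrated in a few lines of Java. This sketch applies the rules in a fixed order (first match wins), which is exactly the kind of fixed strategy the paper argues is too rigid, but it shows the rule-per-pattern shape; the IR encoding and instruction names are my own assumptions.

        import java.util.ArrayList;
        import java.util.List;

        /** Toy tree-rewriting instruction selector: each rule maps an IR pattern
         *  to an instruction; rules are tried in a fixed order (first match wins),
         *  unlike Stratego's programmable strategies. */
        public class RewriteSelect {
            interface IR {}
            record Plus(IR l, IR r) implements IR {}
            record Const(int v) implements IR {}
            record Temp(String name) implements IR {}

            static final List<String> code = new ArrayList<>();
            static int fresh = 0;

            /** Emits instructions for e; returns the temp holding its value. */
            static String select(IR e) {
                if (e instanceof Plus p && p.r() instanceof Const c) {   // rule: reg + const => ADDI
                    String t = "t" + fresh++;
                    code.add("ADDI " + t + " <- " + select(p.l()) + " + " + c.v());
                    return t;
                }
                if (e instanceof Plus p) {                               // rule: reg + reg => ADD
                    String t = "t" + fresh++;
                    code.add("ADD " + t + " <- " + select(p.l()) + " + " + select(p.r()));
                    return t;
                }
                if (e instanceof Const c) {                              // rule: constant => LI
                    String t = "t" + fresh++;
                    code.add("LI " + t + " <- " + c.v());
                    return t;
                }
                return ((Temp) e).name();                                // already in a temp
            }

            public static void main(String[] args) {
                select(new Plus(new Temp("a"), new Const(4)));   // matches the ADDI rule
                code.forEach(System.out::println);               // prints: ADDI t0 <- a + 4
            }
        }

    The ambiguity the paper discusses is visible even here: the first two rules overlap, and which one fires is decided only by rule order.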
  • Lecture 9 Code Generation Instruction Selection Instruction Selection
    Lecture 9: Code Generation, Instruction Selection.

    Instruction selection: mapping IR trees to assembly instructions; tree patterns; retain virtual registers (temps); algorithms for instruction selection (maximal munch, dynamic programming, tree grammars); instruction selection for the MiniJava compiler; generic assembly instructions (Asm.instr).

    Instructions and tree patterns (the slide's tree diagrams are omitted in this extract): ADD ri <- rj + rk; MUL ri <- rj * rk; LOAD ri <- M[rj + c], matching FETCH-of-MEM patterns with CONST offsets; STORE M[ri + c] <- rj, matching MOVE-to-MEM patterns.

    Tiling with instruction patterns: the running example is a[i] := x, where i = TEMPi, a = M[fp + ca] = ca($fp), and x = M[fp + cx] = cx($fp). The slides tile the MOVE/MEM/FETCH tree for this statement, including a memory-to-memory move pattern MOVEM M[ri] <- M[rj].

    Optimal, optimum tilings: A tiling of an IR tree with instruction patterns is optimum if the set of patterns has the least "cost" by some measure, e.g. 1. the smallest number of patterns (producing the shortest instruction sequence, thus smallest code size), or 2. the total number of cycles to execute, thus fastest code. A tiling is optimal if no two adjacent tiles can be combined into a single tile of lower cost. Optimum implies optimal.

    Maximal munch: 1. Choose the "largest" tile that matches at the root of the IR tree (this is the munch). 2. Recursively apply maximal munch at each subtree of this munch.
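    A hedged Java sketch of the maximal-munch algorithm just described: munch the largest tile matching at the root, then recur on the subtrees of that tile. The tile set and instruction names are illustrative, not the lecture's.

        /** Maximal munch: choose the largest tile matching at the root of the IR
         *  tree, emit its instruction, then recursively munch each subtree. */
        public class MaximalMunch {
            interface Tree {}
            record MEM(Tree addr) implements Tree {}
            record PLUS(Tree l, Tree r) implements Tree {}
            record CONST(int v) implements Tree {}
            record TEMP(String n) implements Tree {}

            static int fresh = 0;

            static String munch(Tree t) {
                // Largest tile first: LOAD r <- M[e + c] covers MEM(PLUS(e, CONST)) in one step
                if (t instanceof MEM m && m.addr() instanceof PLUS p && p.r() instanceof CONST c) {
                    String r = "r" + fresh++;
                    System.out.println("LOAD " + r + " <- M[" + munch(p.l()) + " + " + c.v() + "]");
                    return r;
                }
                if (t instanceof MEM m) {                     // smaller tile: plain load
                    String r = "r" + fresh++;
                    System.out.println("LOAD " + r + " <- M[" + munch(m.addr()) + "]");
                    return r;
                }
                if (t instanceof PLUS p) {
                    String r = "r" + fresh++;
                    System.out.println("ADD " + r + " <- " + munch(p.l()) + " + " + munch(p.r()));
                    return r;
                }
                if (t instanceof CONST c) {
                    String r = "r" + fresh++;
                    System.out.println("LI " + r + " <- " + c.v());
                    return r;
                }
                return ((TEMP) t).n();
            }

            public static void main(String[] args) {
                // M[fp + 8] becomes one LOAD tile instead of separate ADD and LOAD
                munch(new MEM(new PLUS(new TEMP("fp"), new CONST(8))));
            }
        }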
  • One Parser to Rule Them All
    One Parser to Rule Them All. Ali Afroozeh, Anastasia Izmaylova. Centrum Wiskunde & Informatica, Amsterdam, The Netherlands. {ali.afroozeh, anastasia.izmaylova}@cwi.nl

    Abstract. Despite the long history of research in parsing, constructing parsers for real programming languages remains a difficult and painful task. In the last decades, different parser generators emerged to allow the construction of parsers from a BNF-like specification. However, still today, many parsers are handwritten, or are only partly generated, and include various hacks to deal with different peculiarities in programming languages. The main problem is that current declarative syntax definition techniques are based on pure context-free grammars, while many constructs found in programming languages require context information. In this paper we propose a parsing framework that embraces context information in its core.

    1. Introduction. Parsing is a well-researched topic in computer science, and it is common to hear from fellow researchers in the field of programming languages that parsing is a solved problem. This statement mostly originates from the success of Yacc [18] and its underlying theory that has been developed in the 70s. Since Knuth's seminal paper on LR parsing [25], and DeRemer's work on practical LR parsing (LALR) [6], there is a linear parsing technique that covers most syntactic constructs in programming languages. Yacc, and its various ports to other languages, enabled the generation of efficient parsers from a BNF specification. Still, research papers and tools on parsing in the last four decades show an ongoing effort to develop new parsing techniques.
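    A concrete instance of the context problem the abstract describes, sketched in Java (this example is not taken from the paper): in Java generics, the lexer produces ">>" as one right-shift token, but inside List<List<Integer>> the parser needs two ">" tokens, so tokenization must consult parser context. The splitting mechanism below is a hypothetical illustration.

        import java.util.Iterator;
        import java.util.List;

        /** Sketch of lexer-parser feedback: when the parser is inside type
         *  arguments and the raw lexer produced ">>", split it into two ">". */
        public class AngleBracketSplit {
            private String pending = null;   // second half of a split ">>"

            String next(Iterator<String> raw, boolean inTypeArgs) {
                if (pending != null) { String t = pending; pending = null; return t; }
                String t = raw.next();
                if (inTypeArgs && t.equals(">>")) { pending = ">"; return ">"; }  // context-driven split
                return t;
            }

            public static void main(String[] args) {
                AngleBracketSplit lexer = new AngleBracketSplit();
                Iterator<String> raw =
                    List.of("List", "<", "List", "<", "Integer", ">>", "x").iterator();
                // Pretend the parser has told us we are inside type arguments:
                for (int i = 0; i < 8; i++) System.out.print(lexer.next(raw, true) + " ");
                System.out.println();   // prints: List < List < Integer > > x
            }
        }

    A pure context-free tokenizer cannot make this decision, which is exactly why such constructs end up as hand-written hacks in conventional generated parsers.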
  • Instruction Selection
    Lecture Notes on Instruction Selection. 15-411: Compiler Design. Frank Pfenning (edited by André Platzer). Lecture 2.

    1 Introduction. In this lecture we discuss the process of instruction selection, which typically turns some form of intermediate code into a pseudo-assembly language in which we assume to have infinitely many registers called "temps". We next apply register allocation to the result to assign machine registers and stack slots to the temps before emitting the actual assembly code. Additional material regarding instruction selection can be found in the textbook [App98, Chapter 9].

    2 A Simple Source Language. We use a very simple source language where a program is just a sequence of assignments terminated by a return statement. The right-hand side of each assignment is a simple arithmetic expression. Later in the course we describe how the input text is parsed and translated into some intermediate form. Here we assume we have arrived at an intermediate representation where expressions are still in the form of trees and we have to generate instructions in pseudo-assembly. We call this form IR Trees (for "Intermediate Representation Trees"). We describe the possible IR trees in a kind of pseudo-grammar, which should not be read as a description of the concrete syntax, but the recursive structure of the data.

        Programs     s̄ ::= s1; ...; sn      sequence of statements
        Statements   s ::= t = e            assignment
                         | return e         return, always last
        Expressions  e ::= c                integer constant
                         | t                temp (variable)
                         | e1 ⊕ e2          binary operation
        Binops       ⊕ ::= + | − | * | / | ...

    3 Abstract Assembly Code Target. For our very simple source, we use an equally simple target.
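    The pseudo-grammar above maps directly onto a small class hierarchy for IR trees. A minimal sketch in Java; the class names are my own choice, not the course's starter code.

        import java.util.List;

        /** IR trees for the lecture's source language, one class per production. */
        public class IRTrees {
            interface Exp {}
            record Const(int value) implements Exp {}                      // integer constant c
            record Temp(String name) implements Exp {}                     // temp (variable) t
            record BinOp(char op, Exp left, Exp right) implements Exp {}   // e1 (+) e2

            interface Stm {}
            record Assign(Temp target, Exp rhs) implements Stm {}          // t = e
            record Return(Exp result) implements Stm {}                    // return e, always last

            public static void main(String[] args) {
                // t1 = 2 * 3; return t1 + 4
                Temp t1 = new Temp("t1");
                List<Stm> program = List.of(
                    new Assign(t1, new BinOp('*', new Const(2), new Const(3))),
                    new Return(new BinOp('+', t1, new Const(4))));
                program.forEach(System.out::println);
            }
        }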
  • Compilers & Programming Systems
    CS502: Compilers & Programming Systems. Low IR and Assembly Code Generation. Zhiyuan Li, Department of Computer Science, Purdue University, USA.

    We first discuss the generation of low-level intermediate code for a program. Inside the compiler such code has its low-level representation, called low-level internal representation, or low IR. We then discuss the generation of assembly code by walking through the low-level IR.

    Motivation for a machine-independent representation: From the MIPS code example shown previously, we see that the assembly code is heavily dependent on the specific instruction format of a particular processor. We want to perform optimizations on a machine-independent representation so that we do not implement a new set of optimizations for each particular processor.

    Differences between low IR and assembly code. In low IR, unlimited user-level registers are assumed; hence symbolic registers, instead of hardware registers, are named as operands. Operators on different data types are overloaded: for example, "+" represents both integer add and float add. Instructions selected for special cases are abstracted away: for example, a simple load instruction represents loading of a value, instead of two instructions (loading the high half and then the low half) as on RISC processors such as MIPS and SPARC. Another example is that we simply write R4 * 2 when it could be done by a potentially less expensive left-shift instruction.

    Intermediate code: A well-known form of intermediate code is called three-address code (3AC). In the "quadruples" form, 3AC for a function is represented by a list of quadruples (dest, op1, op2, op), with "op" being the operation. An alternative form is a list of "triples" (op1, op2, op), where the index of (or the pointer to) a specific triple implies the location of the intermediate result. When we plug such locations (indices or pointers) into the operand fields of an operation, we in effect obtain a forest, i.e.
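    The quadruple form of 3AC described above is straightforward to mirror in Java. A hedged sketch; the field and opcode names are illustrative assumptions, not the course's.

        import java.util.List;

        /** Three-address code as quadruples (dest, op1, op2, op). In the triple
         *  form, dest is dropped and results are named by their list index. */
        public class Quadruples {
            record Quad(String dest, String op1, String op2, String op) {}

            public static void main(String[] args) {
                // Low IR for: x = (a + b) * 2, using unlimited symbolic registers
                List<Quad> code = List.of(
                    new Quad("r1", "a", "b", "+"),      // r1 = a + b
                    new Quad("r2", "r1", "2", "*"),     // r2 = r1 * 2 ("*" overloaded by type)
                    new Quad("x", "r2", null, "mov"));  // x = r2
                code.forEach(System.out::println);
            }
        }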
  • 7. Code Generation
    7. Code Generation. Oscar Nierstrasz. Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/

    Roadmap: runtime storage organization; procedure call conventions; instruction selection; register allocation; example: generating Java bytecode. See Modern Compiler Implementation in Java (Second edition), chapters 6 & 9.

    Typical run-time storage organization: Heap grows "up", stack grows "down". This allows both stack and heap maximal freedom; code and static data may be separate or intermingled. In a 32-bit architecture, memory addresses range from 0 to 4GB, broken into pages, of which only the low and the high pages are actually allocated. The low pages hold the compiled program code, static data, and heap data. The heap "grows" upwards as needed. High memory addresses refer to the run-time stack. They "grow" downward with each procedure call and "shrink" upward with each return. Certain memory pages (e.g., holding compiled code) may possibly be protected against modification.

    The procedure abstraction supports separate compilation: build large programs, keep compile times reasonable, independent procedures. The linkage convention (calling convention) is a social contract: procedures inherit a valid run-time environment and restore one for their parents. It is platform dependent, with code generated at compile time.

    Procedures as abstractions:

        function foo() {
            int a, b;
            ...
            bar(a);
            ...
        }

        function bar(int a) {
            int x;
            ...
            bar(x);
            ...
        }

    bar() must preserve foo()'s state while executing.
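    To ground the frame discussion, here is a toy local-slot allocator in Java that hands out frame-pointer-relative offsets for a downward-growing stack. The 4-byte word size and the offset direction are assumptions for illustration, not part of the lecture.

        import java.util.LinkedHashMap;
        import java.util.Map;

        /** Hands out frame-pointer-relative slots for locals in a frame that
         *  grows downward, as in the run-time stack described above. */
        public class FrameLayout {
            private final Map<String, Integer> offsets = new LinkedHashMap<>();
            private int next = 0;

            int allocLocal(String name) {
                next -= 4;                    // assume 4-byte words; locals sit below fp
                offsets.put(name, next);
                return next;
            }

            public static void main(String[] args) {
                FrameLayout foo = new FrameLayout();
                foo.allocLocal("a");          // a at -4(fp)
                foo.allocLocal("b");          // b at -8(fp)
                foo.offsets.forEach((n, off) -> System.out.println(n + " at " + off + "(fp)"));
            }
        }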
  • Javacc Tutorial
    Chapter 1: Introduction to JavaCC. 1.1 JavaCC and Parser Generation. JavaCC is a parser generator and a lexical analyzer generator. Parsers and lexical analysers are software components for dealing with input of character sequences. Compilers and interpreters incorporate lexical analysers and parsers to decipher files containing programs; however, lexical analysers and parsers can be used in a wide variety of other applications, as I hope the examples in this book will illustrate. So what are lexical analysers and parsers? Lexical analysers can break a sequence of characters into subsequences called tokens, and they also classify the tokens. Consider a short program in the C programming language:

        int main() { return 0 ; }

    The lexical analyser of a C compiler would break this into the following sequence of tokens: "int", " ", "main", "(", ")", " ", "{", "\n", "\t", "return", " ", "0", " ", ";", "\n", "}", "\n", "". The lexical analyser also identifies the kind of each token; in our example the sequence of token kinds might be KWINT, SPACE, ID, OPAR, CPAR, SPACE, OBRACE, SPACE, SPACE, KWRETURN, SPACE, OCTALCONST, SPACE, SEMICOLON, SPACE, CBRACE, SPACE, EOF. The token of kind EOF represents the end of the original file. The sequence of tokens is then passed on to the parser. In the case of C, the parser does not need all the tokens; in our example, those classified as SPACE are not passed on to the parser. The parser then analyses the sequence of tokens to determine the structure of the program. Often in compilers, the parser outputs a tree representing the structure of the program. This tree then serves as an input to components of the compiler responsible for analysis and code generation.
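    The chapter's token sequence is just a list of (kind, image) pairs. Here is a hedged Java rendering of the example, shown as the parser would see it with the SPACE tokens filtered out; the Token class shape is mine, though the kind names follow the chapter.

        import java.util.List;

        /** The chapter's token sequence for `int main() { return 0 ; }`,
         *  with SPACE tokens already removed, as delivered to the parser. */
        public class TokenStream {
            enum Kind { KWINT, ID, OPAR, CPAR, OBRACE, KWRETURN, OCTALCONST, SEMICOLON, CBRACE, EOF }
            record Token(Kind kind, String image) {}

            public static void main(String[] args) {
                List<Token> tokens = List.of(
                    new Token(Kind.KWINT, "int"), new Token(Kind.ID, "main"),
                    new Token(Kind.OPAR, "("), new Token(Kind.CPAR, ")"),
                    new Token(Kind.OBRACE, "{"), new Token(Kind.KWRETURN, "return"),
                    new Token(Kind.OCTALCONST, "0"), new Token(Kind.SEMICOLON, ";"),
                    new Token(Kind.CBRACE, "}"), new Token(Kind.EOF, ""));
                tokens.forEach(t -> System.out.println(t.kind() + " \"" + t.image() + "\""));
            }
        }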
  • Instruction Selection Compiler Backend Intermediate Representations
    Instruction Selection. 15-745 Optimizing Compilers, Spring 2007. School of Computer Science.

    Compiler backend (pipeline diagram in the slides): Source Code -> Front End -> IR -> Middle End -> IR -> Back End -> Target Code; inside the back end, an instruction selector produces Assem, a register allocator produces a TempMap, and an instruction scheduler reorders the Assem. The front end produces an intermediate representation (IR); the middle end transforms the IR into an equivalent IR that runs more efficiently; the back end transforms the IR into native code. The IR encodes the compiler's knowledge of the program, and the middle end usually consists of several passes.

    Intermediate representations: Decisions in IR design affect the speed and efficiency of the compiler. Some important IR properties: ease of generation, ease of manipulation, procedure size, freedom of expression, level of abstraction. The importance of different properties varies between compilers; selecting an appropriate IR for a compiler is critical.

    Types of intermediate representations. Structural: graphically oriented, heavily used in source-to-source translators, tend to be large; examples: trees, DAGs. Linear: pseudo-code for an abstract machine, level of abstraction varies, simple compact data structures, easier to rearrange; examples: three-address code, stack machine code. Hybrid: combination of graphs and linear code; example: control-flow graph.

    Level of abstraction: the level of detail exposed in an IR
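    The structural-versus-linear distinction is easy to see on one statement. The sketch below (my own illustration, with 4-byte array elements assumed) shows an array store once as a single high-level tree node and once as lower-level linear three-address code.

        /** One statement, two IRs: a structural (tree) form and a linear
         *  three-address form at a lower level of abstraction. */
        public class IRLevels {
            public static void main(String[] args) {
                // High-level tree IR: the array store a[i] := x is a single node
                String tree = "ArrayStore(a, i, x)";

                // Low-level linear IR: address arithmetic made explicit
                String[] linear = {
                    "t1 = i * 4",       // element offset (4-byte elements assumed)
                    "t2 = a + t1",      // element address
                    "M[t2] = x"         // the store itself
                };
                System.out.println("tree:   " + tree);
                for (String ins : linear) System.out.println("linear: " + ins);
            }
        }

    The tree form is easy to generate and pattern-match against; the linear form is easier to rearrange, which is why the slides note that the right choice varies between compilers.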