Seminarslargely Meet

Total Page:16

File Type:pdf, Size:1020Kb

Seminarslargely Meet ■■■"""" " ■ "■'* " ' ummmmmimmmmmmmmtmamttmmm^^^mtmm.a^-^-~-^-, mmmv Acta Informatica 2, 12—39 (1973) © by Springer-Verlag 1973 to the left canonical deri; formal notions embodie paper, Knuth defined a which are precisely the LR(\) be obtained determinisi Efficient Parsers decision already made) T. Anderson, J. Eve and J. J. Horning looking at most k svml decision. Similarly Lew Received April 7, 1972 the left canonical deri constraints. Summary. Knuth's LR(\) parsing algorithm is sufficiently general to handle the Both classes of lang parsing of most programminglanguages,with the additionalbenefit of earlierdetection 0 c ienefh nf th of syntax errors than in other formal methodsused in compilers. The major obstacle . impeding the use of this algorithm is the large space requirement for parsing tables. rP 'y contained in tr In addition, certain optimisations described here, which increase parsing speed, greater practical inti exacerbate the space problem. at°* any point in a parse 01 Two approachesfor overcoming this problem are explored. need to be matched wit 1. Using parsers related to the LR(\) parser which parse a large subset of the was used in the derivat; LR(\) grammars, but which have much smaller parsing tables. general, to determine wli 2. Devising compact encodings and performing space optimisations on these beyond the last generate tables The LR(k) property Time and space efficiency is evaluated using the Stanford University Algol W virtue of there existing compiler as a base for comparison since the parsing routine in this compileris among has given two procedures the more efficient presently used. a grammar js LR{k] j^ " In the event that the gr; 1. Introduction ning these tables, in ie ,an The widespread use of context free grammars to model the syntax of program- guage. Knuth has ming languages has resulted in the parsing problem for such languages receiving I sentences of the lain a great deal of attention both theoretically and practically. appear elsewhere in .1 s or It is known that, for parsing an arbitrary context free language, an amount . una*ely, exploitation of time proportional to the cube and an amountof space proportional to the square involves generating an /. of of the length of the input string is sufficient (see, for example, Earley, 1970). the orig«nal grammar i For various subclasses of context free grammars these bounds can be considerably *n the case of prograi improved. Programming languages are generally amenable to analysis by parsing loo'< ahead arises prirr methods with time and space requirements which are linearly related to the available on many curre length of the input string. Several such linear parsing algorithms are described Algol 60 is a well-known in the survey article of Feldman and Gries (1968). lexical analysis is separa Historically, these parsers were classified into top downand bottom up categories eal vv'dh look ahead pn s according to the primary strategy adopted. In the former, starting from the other tasks. It is also 1 ni)' principal nonterminal of a grammar, an attempt is made to produce a derivation ng language grammars of the input string. Bottom up methods start with the input string and try to resulting in grammars\vh produce a parse directly. Problems have been treai It was pointed out by Knuth (1965) that the bottom up strategy results in a Seminars largely meet thi parse which has the property, when viewed (in reverse) as a derivation, that the Ranges to a grammar to rightmost nonterminal in each sentential form is replaced in each step. The top ' mending a grammar to . down strategy results in a derivation in which the leftmost nonterminal in each , vocated in this paper i ls a sentential form is replaced at each step (Lewis and Steams, 1968); i.e. the bottom small additional burdi " up strategy leads to a right canonical parse while the top down strategy leads ' eliminating genuine am *■ \ Efficient l.R(i) Parsers 13 case that theseare the to the left canonical derivation. It is probably the appropriate formal notions embodied in the intuitive ideas of bottom up and top down. In his paper, Knuth defined a subclass of thecontext free languages, the __./_(/.) languages, which are precisely the set of languages for which the right canonical parse can he obtained deterministically (i.e. without backing up and changing any parsing decision already made) in a single pass over the input string from left to right, looking at most k symbols ahead on the input string to determine each parsing decision. Similarly Lewis and Steams consider the LL(k) languages, in which the left canonical derivation can be obtained deterministically under the same constraints. in linearly proportional eral to handle the Both classes of language can be parsed time and space of enrher detection (f) the length of the input string, but the fact that the LL(k) languages are the major obstacle „roperly contained in the LR(k) languages implies that the latter are potentially for parsing tables. greater practical interest. (It is true, however, that in the former category at any point in a parse only thefirst k terminal symbols generated by a production need to be matched with the input string to determine whether that production derivation. In the case of LR(k) languages it is not possible, in bs-t of the was used in the general, to determine whether a production has been used until k terminal symbols nidations on these beyond the last generated by that production have been inspected.) The LR{k) property is associated with a grammar; a language is LR(k) by it. Knuth University Algol W virtue of there existing at least one such grammar which generates . compiler is among iias given two procedures which decide, when given aparticular value of k, whether a grammar is LR(k). The second of these can be modified to provide state tables. In the event that the grammaris LR(k), a relatively simple interpreter for scan- ning these tables, in conjunction with a stack, provides a parsing routine for language. Knuth has also shown that any language which is LR(k) is LR(\). s\nta\^Pprogram the sentences of the language terminate with a unique symbol which does not languages receiving elsewhere in a sentence, this result can be sharpened to LR(0).) Un- fortunately, exploitation of this result in compilers founders on the fact that it an amount nguage, generating an LR(i) or LR(0) grammarin which the phrase structure tional to the square involves of the original grammar is corrupted. iple, Farley, 1970). In the case of programming languages, the need for more than one character ran be considerably look primarily at the lexical level due to the paucity of symbols analysis by parsing of ahead arises available on many current input devices. The occurrence of ":" and ": in arlv related to the =" Algol is a well-known example. Typically,1 ypically, forlor efficiencyetticiency ifn forior no ornerotherreason, 1 d cribed 60 is well-known lexical analysis is separated from syntactic analysis, and the lexical scan can to ategories deal with look ahead problems at this level of analysis as a minor addition "_" .r,o its other tasks. It is also the case that construction of other than trivial program- starting from. Z the . min lang" grammars is extremely error prone process almost invariably .ro uce d ivation S 2^ an resulting in grammars which are syntactically ambiguous. Assuming that lexical problems have been treated separately, it appears that programming language _ a grammars largely meet the constraints which allow the use of theLR(\ ) algorithm ; strategy resultsi. inir. a o ■> , _ , . , * . from the constraints. are slight.. that the changes to a grammar to accomodate deviations more algorithm top Amending a grammar to enable the use of the restrictive SLR(\) h The worst . , - j. advocated in this paper is not always necessary (see Section 6.1) and at L„ x„_, is a small additional burden which can be treated in conduction with the problem i.e. the bottom+ 968 ....eliminating . ambiguities... ... down strategy leads "< genume ; " 14 T. Anderson et al. : , too The state tables of the LR{k) algorithm, evenfor k= 1 are large for use if Q — (vs> y p directly in a compiler or compiler writing system; e.g. a program implementing (written a =>/5) if thei theLR(\) algorithm was applied to an Algol 60 grammar, therun was terminated with the insertion of the 10000th state-input entry into the state table, at which X point 1 237 states had been created (see also Korenjak, 1969). The work reported Also a derives B ( here addresses the problems raised by Knuth, of developing algorithms that that accept LR(k) grammars or special classes of them and of mechanically producing efficient parsing programs. If co0, v>x , ..., &.„<_ producti 2. Definitions and Notation The p-th The p-th producti Let 0 denote the empty set and A denote the null string. recursive. If X and V are sets of strings and xy denotes the concatenation of x and y, A nonterminal sy: then XV ={xy\ xdX and y € V] If 4eFv and a 6 V If X° ={A} then sets X{ are defined by first (<x)={; +1 =X{ X for X' t^O follow (A) ={: and X* is defined by X* = x\ ('—vo V In this section algc A context free grammar(with an endmarker) is a quadrupleG = (VN, T, P, S J_ ), where form of state tables. T discussion presumes fa " V is a finite set of nonterminal symbols, N algorithms derived fro VT is a finite set of terminal symbols, In Knuth's algoritl is defined as a set VN nVT = Q and V denotes VNu VT, of o a state corresponds to start or nonterminal of the grammar, Se VN is the symbol principal a point consistent with J_ft is the endmarker, V (a) production nun SJ_ is the start string, sentence scanned thus (b) if P is a finite subset of VN X V* called the set of productions.
Recommended publications
  • Validating LR(1) Parsers
    Validating LR(1) Parsers Jacques-Henri Jourdan1;2, Fran¸coisPottier2, and Xavier Leroy2 1 Ecole´ Normale Sup´erieure 2 INRIA Paris-Rocquencourt Abstract. An LR(1) parser is a finite-state automaton, equipped with a stack, which uses a combination of its current state and one lookahead symbol in order to determine which action to perform next. We present a validator which, when applied to a context-free grammar G and an automaton A, checks that A and G agree. Validating the parser pro- vides the correctness guarantees required by verified compilers and other high-assurance software that involves parsing. The validation process is independent of which technique was used to construct A. The validator is implemented and proved correct using the Coq proof assistant. As an application, we build a formally-verified parser for the C99 language. 1 Introduction Parsing remains an essential component of compilers and other programs that input textual representations of structured data. Its theoretical foundations are well understood today, and mature technology, ranging from parser combinator libraries to sophisticated parser generators, is readily available to help imple- menting parsers. The issue we focus on in this paper is that of parser correctness: how to obtain formal evidence that a parser is correct with respect to its specification? Here, following established practice, we choose to specify parsers via context-free grammars enriched with semantic actions. One application area where the parser correctness issue naturally arises is formally-verified compilers such as the CompCert verified C compiler [14]. In- deed, in the current state of CompCert, the passes that have been formally ver- ified start at abstract syntax trees (AST) for the CompCert C subset of C and extend to ASTs for three assembly languages.
    [Show full text]
  • Efficient Computation of LALR(1) Look-Ahead Sets
    Efficient Computation of LALR(1) Look-Ahead Sets FRANK DeREMER and THOMAS PENNELLO University of California, Santa Cruz, and MetaWare TM Incorporated Two relations that capture the essential structure of the problem of computing LALR(1) look-ahead sets are defined, and an efficient algorithm is presented to compute the sets in time linear in the size of the relations. In particular, for a PASCAL grammar, the algorithm performs fewer than 15 percent of the set unions performed by the popular compiler-compiler YACC. When a grammar is not LALR(1), the relations, represented explicitly, provide for printing user- oriented error messages that specifically indicate how the look-ahead problem arose. In addition, certain loops in the digraphs induced by these relations indicate that the grammar is not LR(k) for any k. Finally, an oft-discovered and used but incorrect look-ahead set algorithm is similarly based on two other relations defined for the fwst time here. The formal presentation of this algorithm should help prevent its rediscovery. Categories and Subject Descriptors: D.3.1 [Programming Languages]: Formal Definitions and Theory--syntax; D.3.4 [Programming Languages]: Processors--translator writing systems and compiler generators; F.4.2 [Mathematical Logic and Formal Languages]: Grammars and Other Rewriting Systems--parsing General Terms: Algorithms, Languages, Theory Additional Key Words and Phrases: Context-free grammar, Backus-Naur form, strongly connected component, LALR(1), LR(k), grammar debugging, 1. INTRODUCTION Since the invention of LALR(1) grammars by DeRemer [9], LALR grammar analysis and parsing techniques have been popular as a component of translator writing systems and compiler-compilers.
    [Show full text]
  • SAKI: a Fortran Program for Generalized Linear Inversion of Gravity and Magnetic Profiles
    UNITED STATES DEPARTMENT OF THE INTERIOR GEOLOGICAL SURVEY SAKI: A Fortran program for generalized linear Inversion of gravity and magnetic profiles by Michael Webring Open File Report 85-122 1985 This report is preliminary and has not been reviewed for conformity with U.S. Geological Survey editorial standards. Table of Contents Abstract ------------------------------ 1 Introduction ---------------------------- 1 Theory ------------------------------- 2 Users Guide ---------------------------- 7 Model File -------------------------- 7 Observed Data Files ---------------------- 9 Command File ------------------------- 10 Data File Parameters ------------------- 10 Magnetic Parameters ------------------- 10 General Parameters -------------------- H Plotting Parameters ------------------- n Annotation Parameters ------------------ 12 Program Interactive Commands ----------------- 13 Inversion Function ---------------------- 16 Examples ------------------------------ 20 References ----------------------------- 26 Appendix A - plot system description ---------------- 27 Appendix B - program listing -------------------- 30 Abstract A gravity and magnetic inversion program is presented which models a structural cross-section as an ensemble of 2-d prisms formed by linking vertices into a network that may be deformed by a modified Marquardt algorithm. The nonuniqueness of this general model is constrained by the user in an interactive mode. Introduction This program fits in a least-squares sense the theoretical gravity and magnetic response of
    [Show full text]
  • Michigan Terminal System (MTS) Distribution 3.0 Documentation
    D3.NOTES 1 M T S D I S T R I B U T I O N 3.0 August 1973 General Notes Distribution 3.0 consists of 6 tapes for the 1600 bpi copy or 9 tapes for the 800 bpi copy. All tapes are unlabeled to save space. The last tape of each set is a dump/restore tape, designed to be used to get started (for new installations) or for conversion (for old installations). Instructions for these procedures are given in items 1001 and 1002 of the documentation list (later in this writeup). The remaining tapes are "Distribution Utility" tapes containing source, object, command, data, and print files, designed to be read by *FS or by the Distribution program, which is a superset of *FS. A writeup for this distribution version of *FS is enclosed (item 1003). Throughout the documentation for this distribution, reference is made to the components of the distribution. Generally these references are made by the 3-digit component number, sometimes followed by a slash and a 1- or 2-digit sub- component number. For example, the G assembler (*ASMG) has been assigned component number 066. However, the assembler actually has many "pieces" and so the assembler consists of components 066/1 through 066/60. Component numbers 001 through 481 are compatible with the numbers used in distribution 2.0; numbers above 481 are new components, or in some cases, old components which have been grouped under a new number. Two versions of the distribution driver file are provided: 461/11 for the 1600 bpi tapes, and 461/16 for the 800 bpi tapes.
    [Show full text]
  • Pearl: Diagrams for Composing Compilers
    Pearl: Diagrams for Composing Compilers John Wickerson Imperial College London, UK [email protected] Paul Brunet University College London, UK [email protected] Abstract T-diagrams (or ‘tombstone diagrams’) are widely used in teaching for explaining how compilers and interpreters can be composed together to build and execute software. In this Pearl, we revisit these diagrams, and show how they can be redesigned for better readability. We demonstrate how they can be applied to explain compiler concepts including bootstrapping and cross-compilation. We provide a formal semantics for our redesigned diagrams, based on binary trees. Finally, we suggest how our diagrams could be used to analyse the performance of a compilation system. 2012 ACM Subject Classification Software and its engineering → Compilers Keywords and phrases compiler networks, graphical languages, pedagogy Digital Object Identifier 10.4230/LIPIcs.ECOOP.2020.1 1 Introduction In introductory courses on compilers across the globe, students are taught about the inter- actions between compilers using ‘tombstone diagrams’ or ‘T-diagrams’. In this formalism, a compiler is represented as a ‘T-piece’. A T-piece, as shown in Fig. 1, is characterised by three labels: the source language of the compiler, the target language of the compiler, and the language in which the compiler is implemented. Complex networks of compilers can be represented by composing these basic pieces in two ways. Horizontal composition means that the output of the first compiler is fed in as the input to the second. Diagonal composition means that the first compiler is itself compiled using the second compiler.
    [Show full text]
  • MTS on Wikipedia Snapshot Taken 9 January 2011
    MTS on Wikipedia Snapshot taken 9 January 2011 PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sun, 09 Jan 2011 13:08:01 UTC Contents Articles Michigan Terminal System 1 MTS system architecture 17 IBM System/360 Model 67 40 MAD programming language 46 UBC PLUS 55 Micro DBMS 57 Bruce Arden 58 Bernard Galler 59 TSS/360 60 References Article Sources and Contributors 64 Image Sources, Licenses and Contributors 65 Article Licenses License 66 Michigan Terminal System 1 Michigan Terminal System The MTS welcome screen as seen through a 3270 terminal emulator. Company / developer University of Michigan and 7 other universities in the U.S., Canada, and the UK Programmed in various languages, mostly 360/370 Assembler Working state Historic Initial release 1967 Latest stable release 6.0 / 1988 (final) Available language(s) English Available programming Assembler, FORTRAN, PL/I, PLUS, ALGOL W, Pascal, C, LISP, SNOBOL4, COBOL, PL360, languages(s) MAD/I, GOM (Good Old Mad), APL, and many more Supported platforms IBM S/360-67, IBM S/370 and successors History of IBM mainframe operating systems On early mainframe computers: • GM OS & GM-NAA I/O 1955 • BESYS 1957 • UMES 1958 • SOS 1959 • IBSYS 1960 • CTSS 1961 On S/360 and successors: • BOS/360 1965 • TOS/360 1965 • TSS/360 1967 • MTS 1967 • ORVYL 1967 • MUSIC 1972 • MUSIC/SP 1985 • DOS/360 and successors 1966 • DOS/VS 1972 • DOS/VSE 1980s • VSE/SP late 1980s • VSE/ESA 1991 • z/VSE 2005 Michigan Terminal System 2 • OS/360 and successors
    [Show full text]
  • Die Gruncllehren Cler Mathematischen Wissenschaften
    Die Gruncllehren cler mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Beriicksichtigung der Anwendungsgebiete Band 135 Herausgegeben von J. L. Doob . E. Heinz· F. Hirzebruch . E. Hopf . H. Hopf W. Maak . S. Mac Lane • W. Magnus. D. Mumford M. M. Postnikov . F. K. Schmidt· D. S. Scott· K. Stein Geschiiftsfiihrende Herausgeber B. Eckmann und B. L. van der Waerden Handbook for Automatic Computation Edited by F. L. Bauer· A. S. Householder· F. W. J. Olver H. Rutishauser . K. Samelson· E. Stiefel Volume I . Part a Heinz Rutishauser Description of ALGOL 60 Springer-Verlag New York Inc. 1967 Prof. Dr. H. Rutishauser Eidgenossische Technische Hochschule Zurich Geschaftsfuhrende Herausgeber: Prof. Dr. B. Eckmann Eidgenossische Tecbnische Hocbscbule Zurich Prof. Dr. B. L. van der Waerden Matbematisches Institut der Universitat ZUrich Aile Rechte, insbesondere das der Obersetzung in fremde Spracben, vorbebalten Ohne ausdriickliche Genehmigung des Verlages ist es auch nicht gestattet, dieses Buch oder Teile daraus auf photomechaniscbem Wege (Photokopie, Mikrokopie) oder auf andere Art zu vervielfaltigen ISBN-13: 978-3-642-86936-5 e-ISBN-13: 978-3-642-86934-1 DOl: 10.1007/978-3-642-86934-1 © by Springer-Verlag Berlin· Heidelberg 1967 Softcover reprint of the hardcover 1st edition 1%7 Library of Congress Catalog Card Number 67-13537 Titel-Nr. 5l1B Preface Automatic computing has undergone drastic changes since the pioneering days of the early Fifties, one of the most obvious being that today the majority of computer programs are no longer written in machine code but in some programming language like FORTRAN or ALGOL. However, as desirable as the time-saving achieved in this way may be, still a high proportion of the preparatory work must be attributed to activities such as error estimates, stability investigations and the like, and for these no programming aid whatsoever can be of help.
    [Show full text]
  • CLR(1) and LALR(1) Parsers
    CLR(1) and LALR(1) Parsers C.Naga Raju B.Tech(CSE),M.Tech(CSE),PhD(CSE),MIEEE,MCSI,MISTE Professor Department of CSE YSR Engineering College of YVU Proddatur 6/21/2020 Prof.C.NagaRaju YSREC of YVU 9949218570 Contents • Limitations of SLR(1) Parser • Introduction to CLR(1) Parser • Animated example • Limitation of CLR(1) • LALR(1) Parser Animated example • GATE Problems and Solutions Prof.C.NagaRaju YSREC of YVU 9949218570 Drawbacks of SLR(1) ❖ The SLR Parser discussed in the earlier class has certain flaws. ❖ 1.On single input, State may be included a Final Item and a Non- Final Item. This may result in a Shift-Reduce Conflict . ❖ 2.A State may be included Two Different Final Items. This might result in a Reduce-Reduce Conflict Prof.C.NagaRaju YSREC of YVU 6/21/2020 9949218570 ❖ 3.SLR(1) Parser reduces only when the next token is in Follow of the left-hand side of the production. ❖ 4.SLR(1) can reduce shift-reduce conflicts but not reduce-reduce conflicts ❖ These two conflicts are reduced by CLR(1) Parser by keeping track of lookahead information in the states of the parser. ❖ This is also called as LR(1) grammar Prof.C.NagaRaju YSREC of YVU 6/21/2020 9949218570 CLR(1) Parser ❖ LR(1) Parser greatly increases the strength of the parser, but also the size of its parse tables. ❖ The LR(1) techniques does not rely on FOLLOW sets, but it keeps the Specific Look-ahead with each item. Prof.C.NagaRaju YSREC of YVU 6/21/2020 9949218570 CLR(1) Parser ❖ LR(1) Parsing configurations have the general form: A –> X1...Xi • Xi+1...Xj , a ❖ The Look Ahead
    [Show full text]
  • Parser Generation Bottom-Up Parsing LR Parsing Constructing LR Parser
    CS421 COMPILERS AND INTERPRETERS CS421 COMPILERS AND INTERPRETERS Parser Generation Bottom-Up Parsing • Main Problem: given a grammar G, how to build a top-down parser or a • Construct parse tree “bottom-up” --- from leaves to the root bottom-up parser for it ? • Bottom-up parsing always constructs right-most derivation • parser : a program that, given a sentence, reconstructs a derivation for that • Important parsing algorithms: shift-reduce, LR parsing sentence ---- if done sucessfully, it “recognize” the sentence • LR parser components: input, stack (strings of grammar symbols and states), • all parsers read their input left-to-right, but construct parse tree differently. driver routine, parsing tables. • bottom-up parsers --- construct the tree from leaves to root input: a1 a2 a3 a4 ...... an $ shift-reduce, LR, SLR, LALR, operator precedence stack s m LR Parsing output • top-down parsers --- construct the tree from root to leaves Xm .. recursive descent, predictive parsing, LL(1) s1 X1 s0 parsing Copyright 1994 - 2015 Zhong Shao, Yale University Parser Generation: Page 1 of 27 Copyright 1994 - 2015 Zhong Shao, Yale University Parser Generation: Page 2 of 27 CS421 COMPILERS AND INTERPRETERS CS421 COMPILERS AND INTERPRETERS LR Parsing Constructing LR Parser How to construct the parsing table and ? • A sequence of new state symbols s0,s1,s2,..., sm ----- each state ACTION GOTO sumarize the information contained in the stack below it. • basic idea: first construct DFA to recognize handles, then use DFA to construct the parsing tables ! different parsing table yield different LR parsers SLR(1), • Parsing configurations: (stack, remaining input) written as LR(1), or LALR(1) (s0X1s1X2s2...Xmsm , aiai+1ai+2...an$) • augmented grammar for context-free grammar G = G(T,N,P,S) is defined as G’ next “move” is determined by sm and ai = G’(T, N { S’}, P { S’ -> S}, S’) ------ adding non-terminal S’ and the • Parsing tables: ACTION[s,a] and GOTO[s,X] production S’ -> S, and S’ is the new start symbol.
    [Show full text]
  • Elkhound: a Fast, Practical GLR Parser Generator
    Published in Proc. of Conference on Compiler Construction, 2004, pp. 73–88. Elkhound: A Fast, Practical GLR Parser Generator Scott McPeak and George C. Necula ? University of California, Berkeley {smcpeak,necula}@cs.berkeley.edu Abstract. The Generalized LR (GLR) parsing algorithm is attractive for use in parsing programming languages because it is asymptotically efficient for typical grammars, and can parse with any context-free gram- mar, including ambiguous grammars. However, adoption of GLR has been slowed by high constant-factor overheads and the lack of a general, user-defined action interface. In this paper we present algorithmic and implementation enhancements to GLR to solve these problems. First, we present a hybrid algorithm that chooses between GLR and ordinary LR on a token-by-token ba- sis, thus achieving competitive performance for determinstic input frag- ments. Second, we describe a design for an action interface and a new worklist algorithm that can guarantee bottom-up execution of actions for acyclic grammars. These ideas are implemented in the Elkhound GLR parser generator. To demonstrate the effectiveness of these techniques, we describe our ex- perience using Elkhound to write a parser for C++, a language notorious for being difficult to parse. Our C++ parser is small (3500 lines), efficient and maintainable, employing a range of disambiguation strategies. 1 Introduction The state of the practice in automated parsing has changed little since the intro- duction of YACC (Yet Another Compiler-Compiler), an LALR(1) parser genera- tor, in 1975 [1]. An LALR(1) parser is deterministic: at every point in the input, it must be able to decide which grammar rule to use, if any, utilizing only one token of lookahead [2].
    [Show full text]
  • LR Parsing, LALR Parser Generators Lecture 10 Outline • Review Of
    Outline • Review of bottom-up parsing LR Parsing, LALR Parser Generators • Computing the parsing DFA • Using parser generators Lecture 10 Prof. Fateman CS 164 Lecture 10 1 Prof. Fateman CS 164 Lecture 10 2 Bottom-up Parsing (Review) The Shift and Reduce Actions (Review) • A bottom-up parser rewrites the input string • Recall the CFG: E → int | E + (E) to the start symbol • A bottom-up parser uses two kinds of actions: • The state of the parser is described as •Shiftpushes a terminal from input on the stack α I γ – α is a stack of terminals and non-terminals E + (I int ) ⇒ E + (int I ) – γ is the string of terminals not yet examined •Reducepops 0 or more symbols off the stack (the rule’s rhs) and pushes a non-terminal on • Initially: I x1x2 . xn the stack (the rule’s lhs) E + (E + ( E ) I ) ⇒ E +(E I ) Prof. Fateman CS 164 Lecture 10 3 Prof. Fateman CS 164 Lecture 10 4 LR(1) Parsing. An Example Key Issue: When to Shift or Reduce? int int + (int) + (int)$ shift 0 1 I int + (int) + (int)$ E → int • Idea: use a finite automaton (DFA) to decide E E → int I ( on $, + E I + (int) + (int)$ shift(x3) when to shift or reduce 2 + 3 4 E + (int I ) + (int)$ E → int – The input is the stack accept E int E + (E ) + (int)$ shift – The language consists of terminals and non-terminals on $ I ) E → int E + (E) I + (int)$ E → E+(E) 7 6 5 on ), + E I + (int)$ shift (x3) • We run the DFA on the stack and we examine E → E + (E) + E + (int )$ E → int the resulting state X and the token tok after I on $, + int I E + (E )$ shift –If X has a transition labeled tok then shift ( I 8 9 β E + (E) I $ E → E+(E) –If X is labeled with “A → on tok”then reduce E + E I $ accept 10 11 E → E + (E) Prof.
    [Show full text]
  • 19710010121.Pdf
    General Disclaimer One or more of the Following Statements may affect this Document This document has been reproduced from the best copy furnished by the organizational source. It is being released in the interest of making available as much information as possible. This document may contain data, which exceeds the sheet parameters. It was furnished in this condition by the organizational source and is the best copy available. This document may contain tone-on-tone or color graphs, charts and/or pictures, which have been reproduced in black and white. This document is paginated as submitted by the original source. Portions of this document are not fully legible due to the historical nature of some of the material. However, it is the best reproduction available from the original submission. Produced by the NASA Center for Aerospace Information (CASI) ^i A PROPOSFD TRANSLATOR-WRITING-SYSTEM LANGUAGE FRANKLIN L. DE RLMER CEP REPORT, VOLUME 3, No. 1 September 1970 C4 19 5 G °o ACCESSION NUPBER) (T U) c^`r :E 3 (PAG 5 (CODE) ^y1P`C{ o0 LL (NASA CR OR T X OR AD NUME ►_R) (CATEGORY) 0^0\G^^^ v Z The Computer Evolution Project Applied Sciences The University of California at Santa Cruz, California 95060 A PROPOSED TRANSLATOR-WRITING-SYSTEM LANGUAGE ABSTRACT We propose herein a language for an advanced translator writing system (TWS), a system for use by langwige designers, I d as opposed to compiler writers, which will accept as input a formal specification of a language and which will generate as output a translator for the specified language.
    [Show full text]