OMCCp: A MetaModelica Based Parser Generator Applied to Modelica


Institutionen för datavetenskap
Department of Computer and Information Science

Master's thesis

by Edgar Alonso Lopez-Rojas

LIU-IDA/LITH-EX-A--11/019--SE
2011-05-31

Linköpings universitet, SE-581 83 Linköping, Sweden

Supervisors: Martin Sjölund and Mohsen Torabzadeh-Tari, Dept. of Computer and Information Science
Examiner: Prof. Peter Fritzson, Dept. of Computer and Information Science

Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances.

The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/

© Edgar Alonso Lopez-Rojas

To Isabella, my new project in life

Abstract

The OpenModelica Compiler-Compiler parser generator (OMCCp) is an LALR(1) parser generator implemented in the MetaModelica language with parsing tables generated by the tools Flex and GNU Bison. The code generated for the parser is in MetaModelica 2.0 language which is the OpenModelica compiler implementation language and is an extension of the Modelica 3.2 language. OMCCp uses as input an LALR(1) grammar that specifies the Modelica language. The generated parser can be used inside the OpenModelica Compiler (OMC) as a replacement for the current parser generated by the tool ANTLR from an LL(k) Modelica grammar. This report explains the design and implementation of this novel Lexer and Parser Generator called OMCCp.
Modelica and its extension MetaModelica are both languages used in the OpenModelica environment. Modelica is an Object-Oriented Equation-Based language for Modeling and Simulation.

Acknowledgements

It is an honor for me to be able to culminate this work with the guidance of remarkable computer scientists. This thesis would not have been possible without the clear vision of my examiner, Professor Peter Fritzson. As the director of the Open Source Modelica Consortium (OSMC) he presented this great opportunity to me. Together with him, I have to thank my supervisors Martin Sjölund and Mohsen Torabzadeh-Tari. Martin has made available his support and guidance in a number of ways that I cannot count, and Mohsen has always been keeping track of my progress and helping me with the difficulties I found.

I am pleased to be part of, learn from, and contribute to this great open source project called OpenModelica. I also thank IDA (Department of Computer and Information Science) for offering its premises and resources for my daily work.

I cannot forget to thank my family: my parents Jesus and Soledad, for supporting me since the beginning of this project to become a Master in Computer Science, and my fiancée Helena, who has always encouraged me to give my best in every step of this journey. I am delighted to include my future daughter Isabella here, who has been my biggest motivation to complete this work before the day she steps into this world for the first time.

Last but not least, I thank my financial sponsors from Colombia: Fundacion Colfuturo (http://www.colfuturo.org/) and EAFIT University (http://www.eafit.edu.co/). They believed in my talent and provided the financial resources to achieve this goal.

Contents

1 Introduction
    1.1 Background
    1.2 Project Goal
    1.3 Methodology
    1.4 Intended Readers
    1.5 Thesis Outline
2 Theoretical Background
    2.1 Compilers
        2.1.1 Fundamentals
        2.1.2 Lexical Analysis
        2.1.3 Syntax Analysis
        2.1.4 Parser LALR(1)
    2.2 Error Handling in Syntax Analysis
        2.2.1 Error Recovery
        2.2.2 Error Messages
    2.3 The OpenModelica Project
        2.3.1 The Modelica Language
        2.3.2 MetaModelica extension
        2.3.3 Abstract Syntax Tree - AST
3 Existing Technologies
    3.1 OpenModelica Compiler (OMC)
        3.1.1 Architecture and Components
        3.1.2 ANTLR
        3.1.3 Current state
    3.2 Flex
        3.2.1 Input file lexer.l
        3.2.2 Output file lexer.c
    3.3 GNU Bison
        3.3.1 Input file parser.y
        3.3.2 Output file parser.c
4 Implementation
    4.1 Proposed Solution
    4.2 OMCCp Design
        4.2.1 Lexical Analyser
        4.2.2 Syntax Analyser
    4.3 OpenModelica Compiler-Compiler Parser (OMCCp)
        4.3.1 Lexer Generator
        4.3.2 Parser Generator
    4.4 Error handling
        4.4.1 Error recovery
        4.4.2 Error messages
    4.5 Integration OMC
5 Discussion
    5.1 Analysis of Results
        5.1.1 Lexer and Parser
        5.1.2 OMCCp Construction
        5.1.3 Implementation of a subset of Modelica and MetaModelica grammar
    5.2 OpenModelica Compiler
    5.3 Limitations
6 Related Work
    6.1 OpenModelica Development
    6.2 Compiler-Compiler Construction
7 Conclusions
    7.1 Accomplishments
    7.2 Future Work
Bibliography
Appendices
A OMC Compiler Commands
    A.1 Parameters - MetaModelica Parser Generator
        A.1.1 Generate compilerName
        A.1.2 Run compilerName, fileName
    A.2 OMC Commands
B Lexer Generator
    B.1 Lexer.mo
    B.2 LexerGenerator.mo
    B.3 LexerCode.tmo
    B.4 Types.mo
C Parser Generator
    C.1 Parser.mo
    C.2 ParserGenerator.mo
    C.3 ParseCode.tmo
D Sample Input
    D.1 lexer10.l
    D.2 parser10.y
E Sample Output
    E.1 ParseTable10.mo
    E.2 ParseCode10.mo
    E.3 Token10.mo
    E.4 LexTable10.mo
    E.5 LexerCode10.mo
F Modelica Grammar
    F.1 lexerModelica.l
    F.2 parserModelica.y
G Additional Files
    G.1 SCRIPT.mos
    G.2 Main.mo
Glossary
Acronyms

List of Figures

2.1 Compiler Phases
2.2 Compiler Front-End
2.3 Parser components
2.4 OpenModelica Environment [Fritzson et al., 2009]
3.1 OMC simplified overall structure [Fritzson et al., 2009]
3.2 OMC Language Grammars
4.1 OMCCp (OpenModelica Compiler - Compiler) Lexer and Parser Generator
4.2 OMCCp Lexer and Parser Generator Architecture Design
4.3 OMC-Lexer design
4.4 OMC-Parser design
4.5 OMC-Parser LALR(1)
5.1 OMCCp - Time Parsing

List of Tables

2.1 LR(1) parsing table [Aho et al., 2006]
2.2 LR(1) parsing table rearranged [Aho et al., 2006]
2.3 LALR(1) parsing table [Aho et al., 2006]
5.1 OMCCp Files Implementation
5.2 Test Suite - Compiler
5.3 OMCCp - Time Parsing

Listings

2.1 MetaModelica uniontype
2.2 MetaModelica matchcontinue
2.3 MetaModelica list
3.1 ANTLR grammar file structure
3.2 Flex file structure
3.3 Bison file structure
4.1 Lexer.mo function scan
4.2 Parser.mo function parse
4.3 MultiTypedStack AstStack
4.4 ParseCode.mo case reduce action
4.5 ParseCode.mo function getAST
4.6 Modifications in the Bison Epilogue
4.7 Modifications in the Rules section in Bison
4.8 List of semantic values of tokens