Syntax (Pre Lecture)


Dr. Neil T. Dantam
CSCI-400, Colorado School of Mines
Spring 2021

Introduction

- Syntax: what programs we can write / what the language "looks like"
- Semantics: what these programs mean / what the language does (more later in the course)
- concrete syntax: human-readable
- abstract syntax: encoded for use by interpreter/compiler
- Formal language: mathematical basis to represent and analyze syntax

Outcomes

- Know basic definitions of formal language theory
- Understand parse trees and abstract syntax trees
- Design grammars for common programming language constructs

Phases of Interpretation/Compilation

- Analysis: Front-end
  - Lexical: convert text to terminals, aka lexing, scanning
  - Syntax: convert terminals to syntax tree, aka parsing
  - Semantic: check or infer types, aka type checking, type inference
- Synthesis: Back-end
  - Compiler: Construct machine code
  - Interpreter: Execute the program

Figure (pipeline): text → Lexical Analysis → terminal sequence → Syntax Analysis → abstract syntax tree → Semantic Analysis → annotated syntax tree → Back-end → machine code

Phases of Analysis

Text: "foo+bar*bif"

Lexical Analysis produces the terminal sequence:

    [foo, +, bar, *, bif]

Syntax Analysis produces the syntax tree:

    +
      foo
      *
        bar
        bif

Semantic Analysis produces the annotated syntax tree:

    + : float
      foo : float
      * : int
        bar : int
        bif : int

Automatically Generating Code for Analysis

Compiler Compilers

- Describe terminals/syntax using formal language theory
  - Scanner: Regular Expressions
  - Parser: Grammar
- Automatically generate code

Figure: Regular Expressions → Scanner Generator → Scanner (Lexical Analysis); Grammar → Parser Generator → Parser (Syntax Analysis)

Example

- Scanner Generators: Lex / Flex, Ragel
- Parser Generators: YACC / Bison
- Combined: JavaCC, ANTLR
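
The lexical analysis phase above, with terminals described by regular expressions, can be sketched by hand in a few lines. This is only an illustration, not the course's scanner: the token classes `IDENT`, `OP`, and `SKIP` and the function name `lex` are assumptions made for this sketch, and a scanner generator such as Lex or Ragel would produce comparable code from a similar description.

```python
import re

# Hypothetical token classes for this sketch; the slides do not fix a token set.
TOKEN_SPEC = [
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),  # names such as foo, bar, bif
    ("OP", r"[+*]"),                       # the two operators used in the example
    ("SKIP", r"\s+"),                      # whitespace separates terminals
]
# One combined pattern with a named group per token class.
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(text):
    """Convert program text to a list of terminals, rejecting unknown characters."""
    terminals = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace is dropped, not emitted
            terminals.append(match.group())
        pos = match.end()
    return terminals


print(lex("foo+bar*bif"))  # ['foo', '+', 'bar', '*', 'bif']
```

The output is the terminal sequence that the syntax analysis phase would then turn into a tree.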

Outline

- Formal Language Theory
- Grammars
  - Definition
  - Grammars for the Functional Programs
  - Ambiguity and Precedence
- Abstract Syntax

Formal Language Theory

Why use formal language?

Overview

- Some program text is "valid"
- And some is "invalid"
- Formal language lets us:
  - Precisely define the program text that is valid/invalid
  - Automatically recognize (parse) program text
  - (Also, profound implications on what computers can do (CSCI-561))

Example

- Valid
  - if true then false else true
  - 1 + 2 * 3
- Invalid
  - if true else then false true
  - 1 + * 3

Sets

Definition (Set): An unordered collection of objects without repetition.

Review Notation

- $S = \{s_0, s_1, s_2, \ldots, s_n\}$
- Empty Set: $\{\} = \emptyset$
- Set Membership: $x \in S$ (x in S), $x \notin S$ (x not in S)

Sequences

Definition (Sequence): An ordered list of objects.

Example: $(1, 2, 3, 5, 8, \ldots)$

Definition (Tuple): A sequence of finite length.

- k-tuple: a tuple of length k
- pair: a 2-tuple

Example

- 3-tuple: $(2, 4, 8)$
- pair: $(a, b)$

Strings

Definition (Symbol): An abstract, primitive, atomic "thing".

Example (Symbols): 0, 1, a, x, foo, bar, +, -, if, match

Definition (Alphabet): A non-empty, finite set of symbols.

Example (Alphabets)

- $\Sigma_B = \{0, 1\}$
- $\Sigma_E = \{a, b, c, d\}$
- $\Sigma_C = \{\text{if}, \text{match}, \text{case}, +, -\}$

Definition (String): A sequence over some alphabet.

Example (Strings)

- $\Gamma_B = (1, 0, 1, 0, 1, 0)$
- $\Gamma_E = (h, e, l, l, o)$
- $\Gamma_C = (3, +, x)$

Formal Languages

Definition (Formal Language): A formal language is a set of strings.
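
The definitions of alphabet, string, and formal language map directly onto ordinary data structures. The sketch below is one possible encoding, not anything prescribed by the slides: symbols are Python strings, a string in the formal sense is a tuple of symbols, and the names `is_string_over`, `EVEN_LENGTH_UP_TO_2`, and `in_alternating` are made up for illustration.

```python
from itertools import product

# Alphabet from the slides: a non-empty, finite set of symbols.
SIGMA_B = frozenset({"0", "1"})


def is_string_over(sequence, alphabet):
    """A string over an alphabet is a finite sequence whose elements all come from it."""
    return all(symbol in alphabet for symbol in sequence)


# The example string Gamma_B from the slides, written as a tuple of symbols.
gamma_b = ("1", "0", "1", "0", "1", "0")

# A finite formal language can be stored directly as a set of strings.
# This made-up language holds the empty string and every length-2 string over SIGMA_B.
EVEN_LENGTH_UP_TO_2 = {()} | set(product(SIGMA_B, repeat=2))


def in_alternating(sequence):
    """Membership test for a made-up infinite language:
    strings over SIGMA_B in which no two adjacent symbols are equal."""
    return is_string_over(sequence, SIGMA_B) and all(
        left != right for left, right in zip(sequence, sequence[1:])
    )


print(is_string_over(gamma_b, SIGMA_B))    # True
print(("1", "0") in EVEN_LENGTH_UP_TO_2)   # True
print(in_alternating(gamma_b))             # True
print(in_alternating(("1", "1")))          # False
```

Storing a language as an explicit set only works when the language is finite. The membership predicate handles an infinite language, but each new language needs its own hand-written predicate, which is the gap that grammars fill.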

Representation

- How would you represent:
  - The language (set) of arithmetic expressions?
  - The language (set) of well-formed XML documents?
  - The language (set) of valid variable names in C?
  - The language (set) of C programs?

Grammars

Overview of Grammars

Overview

- Programs are written as text
- There is a structure to the program
- Grammars represent this structure

Example (Conditional Expression)

    if e1 then e2 else e3

A conditional consists of the following sequence:

1. the keyword "if"
2. an expression
3. the keyword "then"
4. an expression
5. the keyword "else"
6. an expression

Grammar

    cond → "if" exp "then" exp "else" exp

Terminal and Nonterminal Symbols

Terminals and Nonterminals

- Terminals: The alphabet of the language. Atomic.
- Nonterminals: Decompose into multiple terminals and nonterminals. Non-atomic.

Example

Grammar

    cond → "if" exp "then" exp "else" exp
    exp → "true" | "false"

- Terminals: if, then, else, true, false
- Nonterminals: cond, exp

Context-Free Grammars

Definition

A context-free grammar $G$ is the tuple $G = (V, T, P, S)$, where:

- $V$ is a finite set of nonterminals
- $T$ is a finite set of terminals
- $P$ is a finite set of productions of the form $A \to X_1 \ldots X_n$, where $A \in V$ and each $X_i \in V \cup T$
- $S \in V$ is the start symbol

Example

Grammar

    cond → "if" exp "then" exp "else" exp
    exp → "true" | "false"

Elements

- $V = \{\text{cond}, \text{exp}\}$
- $T = \{\text{if}, \text{then}, \text{else}, \text{true}, \text{false}\}$
- $P = \{\text{cond} \to \text{"if" exp "then" exp "else" exp},\ \text{exp} \to \text{"true"},\ \text{exp} \to \text{"false"}\}$
- $S = \text{cond}$

What's "context-free?"

$\underbrace{v}_{\text{left-hand side}} \to \underbrace{x_0\, x_1 \ldots x_n}_{\text{right-hand side}}$

The left-hand side is a single nonterminal with no surrounding symbols. A nonterminal's expansion is independent of the surrounding context.

What's not "context-free?"

Non-context-free languages

- Most programming language syntax is context-free (or close)
- C/C++ are almost context-free
- In practice: integrate parsing and scanning to distinguish type and variable names

Counterexample (C/C++)

    /* Context:
     * Is x a type or variable?
     */

    x * y;         // declaration or
                   // multiplication?

    f((x) * y);    // multiplication or
                   // deref. and cast?

Backus-Naur Form (BNF)

Example

LaTeX

    ⟨cond⟩ → "if" ⟨exp⟩ "then" ⟨exp⟩ "else" ⟨exp⟩
    ⟨exp⟩ → "true" | "false"

Plain Text

    <cond> ::= "if" <exp> "then" <exp> "else" <exp>
    <exp> ::= "true" | "false"

What can we do with a grammar?

Definition (Generation): Generation uses a grammar to produce a string in its language through a sequence of substitutions or rewrites called a derivation.

Definition (Recognition): Recognition or parsing determines if an input string is in the language of a grammar. Equivalently, parsing determines whether a derivation exists from the grammar's start symbol to the input string.

Derivation

Overview

- Input: Grammar
- Output: String of terminals in the grammar's language
- Approach: Rewriting
  1. Begin with the start symbol
  2. Find a nonterminal in the current string and rewrite it with a right-hand side of one of its productions
  3. Repeat 2 until no nonterminals remain.

Example

Grammar

    ⟨cond⟩ → "if" ⟨exp⟩ "then" ⟨exp⟩ "else" ⟨exp⟩
    ⟨exp⟩ → "true" | "false"

Derivation

    ⟨cond⟩
    "if" ⟨exp⟩ "then" ⟨exp⟩ "else" ⟨exp⟩
    "if" "true" "then" ⟨exp⟩ "else" ⟨exp⟩
    "if" "true" "then" "false" "else" ⟨exp⟩
    "if" "true" "then" "false" "else" "true"

Parsing

Overview

- Input: Grammar, String
- Output: Is the string in the grammar's language?
- Approach: Construct a derivation for the string, corresponding to a parse tree

Parse Tree

Parse Trees

- Leaves: Terminals
- Nodes: Nonterminals
- Edges: Productions

Grammar

    cond → "if" exp "then" exp "else" exp
    exp → "true" | "false"

Text

    if true then false else true

Parse Tree

    cond
      "if"
      exp
        "true"
      "then"
      exp
        "false"
      "else"
      exp
        "true"

Grammars for the Functional Programs

Example: Lambda Calculus Grammar

Expression: (λa. a) b

Grammar

    ⟨exp⟩ → ⟨sym⟩
          | "λ" ⟨sym⟩ "." ⟨exp⟩
          | ⟨exp⟩ ⟨exp⟩
          | "(" ⟨exp⟩ ")"
    ⟨sym⟩ → "a" | "b" | "c" | ...

Derivation

    ⟨exp⟩
    ⟨exp⟩ ⟨exp⟩
    "(" ⟨exp⟩ ")" ⟨exp⟩
    "(" "λ" ⟨sym⟩ "." ⟨exp⟩ ")" ⟨exp⟩
    "(" "λ" "a" "." ⟨exp⟩ ")" ⟨exp⟩
    "(" "λ" "a" "." ⟨sym⟩ ")" ⟨exp⟩
    "(" "λ" "a" "." "a" ")" ⟨exp⟩
    "(" "λ" "a" "." "a" ")" ⟨sym⟩
    "(" "λ" "a" "." "a" ")" "b"

Parse Tree

    ⟨exp⟩
      ⟨exp⟩
        "("
        ⟨exp⟩
          "λ"
          ⟨sym⟩
            "a"
          "."
          ⟨exp⟩
            ⟨sym⟩
              "a"
        ")"
      ⟨exp⟩
        ⟨sym⟩
          "b"

Exercise: Lambda Calculus Grammar

Expression: a λb. (b c)

Grammar

    ⟨exp⟩ → ⟨sym⟩ | "λ" ⟨sym⟩ "." ⟨exp⟩ | ⟨exp⟩ ⟨exp⟩

Parse Tree

    ⟨exp⟩
      ⟨exp⟩
        ⟨sym⟩
          "a"
      ⟨exp⟩
        "λ"
        ⟨sym⟩
          "b"
        "."
        ⟨exp⟩
          "("
          ⟨exp⟩
            ⟨exp⟩
              ⟨sym⟩
                "b"
            ⟨exp⟩
              ⟨sym⟩
                "c"
          ")"
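
The tuple definition $G = (V, T, P, S)$, the rewriting procedure for derivations, and recognition can all be tried out on the conditional grammar. The following is a minimal sketch under stated assumptions: the dictionary encoding of `P` and the function names `derive`, `ends`, and `recognize` are choices made here, not part of the slides, and the backtracking recognizer only terminates for grammars without left recursion, so it would not work as written for the lambda calculus rule ⟨exp⟩ → ⟨exp⟩ ⟨exp⟩.

```python
# Context-free grammar G = (V, T, P, S) for the conditional example in the slides.
V = {"cond", "exp"}
T = {"if", "then", "else", "true", "false"}
# Productions, keyed by left-hand side; each value lists the alternative right-hand sides.
P = {
    "cond": [["if", "exp", "then", "exp", "else", "exp"]],
    "exp": [["true"], ["false"]],
}
S = "cond"


def derive(choices):
    """Generation: rewrite the leftmost nonterminal until none remain.

    `choices` gives, in order, the index of the alternative to use at each rewrite.
    Returns every intermediate string of the derivation, starting from S.
    """
    current = [S]
    steps = [list(current)]
    choices = iter(choices)
    while any(symbol in V for symbol in current):
        index = next(i for i, symbol in enumerate(current) if symbol in V)
        rhs = P[current[index]][next(choices)]
        current = current[:index] + rhs + current[index + 1:]
        steps.append(list(current))
    return steps


def ends(symbol, tokens, start):
    """Return every position where a derivation of `symbol` begun at `start` can end."""
    if symbol in T:
        return {start + 1} if tokens[start:start + 1] == [symbol] else set()
    results = set()
    for rhs in P[symbol]:
        positions = {start}
        for part in rhs:  # thread the reachable positions through the right-hand side
            positions = {end for pos in positions for end in ends(part, tokens, pos)}
        results |= positions
    return results


def recognize(tokens):
    """Recognition: True when some derivation from S yields exactly `tokens`."""
    return len(tokens) in ends(S, tokens, 0)


for step in derive([0, 0, 1, 0]):
    print(" ".join(step))

print(recognize("if true then false else true".split()))   # True
print(recognize("if true else then false true".split()))   # False
```

The loop prints the same five-step derivation shown on the Derivation slide, and the two `recognize` calls reproduce the valid and invalid examples from the start of the lecture. Parser generators such as YACC or Bison use table-driven algorithms rather than this kind of backtracking search.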