A PEG-Based Macro System for Java

Total Page:16

File Type:pdf, Size:1020Kb

A PEG-Based Macro System for Java Université Catholique de Louvain Ecole Polytechnique de Louvain INGI Department A PEG-Based Macro System for Java A thesis submitted to the Université Catholique de Louvain in accordance with the requirements of the degree of Master in Computer Science (120 credits). Author: Nicolas Laurent Supervisor: Pr. Peter Van Roy Readers: Pr. Kim Mens & Yves Jaradin Louvain-la-Neuve, academic year 2012 - 2013 - 2 - Abstract Programming is abstracting. As such, programming languages supply us with abstraction facilities such as functions and classes. One such facility is the macro. A macro defines new language constructs, to be translated at compile-time into existing constructs. Macros permit the abstraction of patterns that cannot be abstracted by the usual abstraction facilities. We introduce caxap, a framework that adds a Lisp-like macro system on top of Java, a programming language with a complex grammar and lacking the property of homoiconicity. Our aim is to be as general as possible, allowing arbitrary syntax to be added to the language, and arbitrary computation to be done to translate the new syntax to the old. To that end, we use parsing expression grammars (PEGs) as tools to reason about and manipulate syntax. - 2 - Acknowledgements I would first like to thank my friends, which have helped me keep my sanity and my motivation during all these months. I’m thinking of Arnaud Beelen, Adrien Bibal, Etienne Martel, Anthony Popolo Cagnisi, Renaud Proye, Valentin Vansteenberghe and Romain Van Vooren. My thanks also extend to my family, which has always supported me. I would also like to thank my promoter, Peter Van Roy, for the freedom with which he has entrusted me. I hope I used it well. I’m also grateful to all the professors and teaching assistants which imparted their knowledge to me during these five years. - 2 - It is impossible to dissociate language from science or science from language, because every natural science always involves three things: the sequence of phenomena on which the science is based, the abstract concepts which call these phenomena to mind, and the words in which the concepts are expressed. To call forth a concept, a word is needed; to portray a phenomenon, a concept is needed. All three mirror one and the same reality. —Antoine Lavoisier, 1789 - 4 - Contents 1 Macros 9 1.1 Generalities......................................9 1.2 Examples....................................... 10 1.3 Matching Input and Defining Context....................... 11 1.4 Mapping an Input Sequence to its Replacement.................. 12 2 Motivation 13 2.1 Macros in the Wild.................................. 13 2.2 The Expressiveness of Macros............................ 14 2.3 Abstracting More................................... 14 2.4 Meta-Programming and Source Rewriting..................... 16 2.5 caxap: Source Rewriting Made Practical...................... 16 2.6 Targeting the Java Programming Language.................... 17 2.7 Meta-Abstraction................................... 17 2.8 Examples....................................... 18 3 Parsing Expression Grammars 19 3.1 Context Free Grammars (CFGs).......................... 19 3.1.1 CFG Basics.................................. 19 3.1.2 Extended Notations............................. 20 3.1.3 Generated Language & Nonterminal Expansion.............. 21 3.1.4 Parsing CFGs................................. 21 3.2 PEG as a CFG Extension.............................. 22 3.2.1 Ordered Choice and Single Parse Rule................... 23 3.2.2 Lookahead Operators............................ 23 3.2.3 Philosophical Differences........................... 23 3.3 Practical Implication of PEGs’ Specificities.................... 24 3.3.1 Ambiguity & Language Hiding....................... 24 3.3.2 Left-Recursion................................ 24 3.3.3 Backtracking & Greed............................ 24 3.3.4 Scannerless Parsing............................. 25 3.3.5 Memoization................................. 26 3.4 Further Comparison of PEGs and CFGs...................... 26 3.4.1 Reasoning about Grammars......................... 26 3.4.2 Expressivity.................................. 27 3.4.3 Grammar Composition............................ 28 3.5 Definition of the PEG Formalism.......................... 28 5 3.6 Extended Notation.................................. 30 3.6.1 Matching Characters............................. 30 3.6.2 New Operators................................ 31 3.6.3 Recapitulation & Precedence........................ 32 3.7 Parsing PEGs..................................... 33 3.7.1 Naive Parser................................. 33 3.7.2 Packrat Parsing................................ 33 3.7.3 Left-Recursion................................ 35 3.8 Why PEGs...................................... 36 4 User Manual 39 4.1 Introduction...................................... 39 4.1.1 What is caxap ?............................... 39 4.1.2 What is a macro?.............................. 40 4.1.3 How does caxap deal with files?....................... 40 4.1.4 How does caxap recognize macros?..................... 40 4.1.5 How does caxap expand macros?...................... 41 4.1.6 Where can macro calls appear?....................... 41 4.2 Installation...................................... 41 4.3 Invocation....................................... 42 4.4 A Simple Example.................................. 42 4.5 The require Statement: Importing Macros and Specifying Dependencies.... 45 4.5.1 import and require ............................. 45 4.5.2 General Form................................. 46 4.5.3 Package-Wide and Class-Wide Requires.................. 46 4.5.4 Syntactic Sugars............................... 46 4.5.5 Static Requires................................ 47 4.5.6 Macro Requires................................ 47 4.5.7 When to use import or require?...................... 48 4.6 The Match Class: Representing the Syntax Tree.................. 48 4.7 Macro Definition................................... 49 4.7.1 Spacing.................................... 49 4.7.2 Compile-Time Environement........................ 50 4.7.3 Captures................................... 51 4.7.4 Expansion Strategies............................. 51 4.7.5 Prioritary Macros.............................. 52 4.8 Working with Matches................................ 53 4.8.1 Match Information.............................. 53 4.8.2 MatchSpec ................................... 54 4.8.3 Finding Submatches............................. 55 4.8.4 Checking for the Presence of Submatches................. 57 4.8.5 Examples................................... 57 4.9 (Quasi)quotation................................... 57 4.9.1 Quotation Basics............................... 58 4.9.2 Nested Quotations.............................. 60 4.9.3 Non-Recursive Unquotation......................... 61 4.9.4 Splicing.................................... 61 - 6 - 4.9.5 Quotation & Escaped Sequences...................... 62 4.10 Macro Composition.................................. 63 4.10.1 Composing Unrelated Macros........................ 63 4.10.2 Macro Expansion Order & Raw Macros.................. 64 4.10.3 Macros Referencing Macros......................... 64 4.11 Error Reporting.................................... 64 4.12 Miscellaneous Concerns............................... 65 5 Implementation & Internals 67 5.1 Overview of the Architecture............................ 67 5.1.1 driver..................................... 67 5.1.2 files...................................... 68 5.1.3 grammar................................... 68 5.1.4 source..................................... 69 5.1.5 grammar.java................................. 69 5.1.6 parser..................................... 70 5.1.7 trees...................................... 71 5.1.8 compiler.................................... 71 5.1.9 compiler.java................................. 72 5.1.10 compiler.util................................. 72 5.1.11 compiler.macros............................... 72 5.1.12 util and util.apache.............................. 73 5.2 The require Statement............................... 73 5.2.1 Dependencies and the Java Compiler.................... 73 5.2.2 Problems with the import Statements................... 74 5.2.3 Why do we need compile time dependencies?............... 75 5.3 Finding Submatches................................. 76 5.4 Quotations...................................... 76 5.5 The Java Grammar.................................. 77 6 Evaluation 79 6.1 Testing........................................ 79 6.2 Performance...................................... 80 6.2.1 Setup & Preliminary Considerations.................... 80 6.2.2 Profiling.................................... 80 6.2.3 Memory Consumption............................ 81 6.2.4 Parser Performance............................. 81 6.2.5 Parser vs Back-End............................. 83 6.2.6 Complexity Bounds............................. 83 6.2.7 Improving Performances........................... 84 7 Related Work 85 7.1 More Macro Concepts................................ 85 7.1.1 Homoiconicity................................ 85 7.1.2 Hygiene.................................... 86 7.2 Java Macro Frameworks............................... 86 7.2.1 java-macros.................................. 87 7.2.2 Jatha..................................... 87 - 7 - 7.2.3 The Jakarta Toolset (JTS)......................... 88 7.2.4 The Java Syntactic Extender (JSE).................... 89 7.3
Recommended publications
  • Reflection, Zips, and Generalised Casts Abstract
    Scrap More Boilerplate: Reflection, Zips, and Generalised Casts Ralf L a¨mmel Vrije Universiteit & CWI, Amsterdam Abstract Writing boilerplate code is a royal pain. Generic programming promises to alleviate this pain b y allowing the programmer to write a generic #recipe$ for boilerplate code, and use that recipe in many places. In earlier work we introduced the #Scrap y our boilerplate$ approach to generic p rogramming, which exploits Haskell%s exist- ing type-class mechanism to support generic transformations and queries. This paper completes the picture. We add a few extra #introspec- tive$ or #reflective$ facilities, that together support a r ich variety of serialisation and de-serialisation. We also show how to perform generic #zips$, which at first appear to be somewhat tricky in our framework. Lastly, we generalise the ability to over-ride a generic function with a type-specific one. All of this can be supported in Haskell with independently-useful extensions: higher-rank types and type-safe cast. The GHC imple- mentation of Haskell readily derives the required type classes for user-defined data types. Categories and Subject Descriptors D.2. 13 [Software Engineering]: Reusable Software; D.1.1 [Programming Techniques]: Functional Programming; D.3. 1 [Programming Languages]: Formal Definitions and Theory General Terms Design, Languages Keywords Generic p rogramming, reflection, zippers, type cast 1 Introduction It is common to find that large slabs of a program consist of #boil- erplate$ code, which conceals b y its b ulk a smaller amount of #in- teresting$ code. So-called generic p rogramming techniques allow Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or c ommercial advantage and that copies b ear this notice and the full citation on the first page.
    [Show full text]
  • Administering Unidata on UNIX Platforms
    C:\Program Files\Adobe\FrameMaker8\UniData 7.2\7.2rebranded\ADMINUNIX\ADMINUNIXTITLE.fm March 5, 2010 1:34 pm Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta UniData Administering UniData on UNIX Platforms UDT-720-ADMU-1 C:\Program Files\Adobe\FrameMaker8\UniData 7.2\7.2rebranded\ADMINUNIX\ADMINUNIXTITLE.fm March 5, 2010 1:34 pm Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta Beta Notices Edition Publication date: July, 2008 Book number: UDT-720-ADMU-1 Product version: UniData 7.2 Copyright © Rocket Software, Inc. 1988-2010. All Rights Reserved. Trademarks The following trademarks appear in this publication: Trademark Trademark Owner Rocket Software™ Rocket Software, Inc. Dynamic Connect® Rocket Software, Inc. RedBack® Rocket Software, Inc. SystemBuilder™ Rocket Software, Inc. UniData® Rocket Software, Inc. UniVerse™ Rocket Software, Inc. U2™ Rocket Software, Inc. U2.NET™ Rocket Software, Inc. U2 Web Development Environment™ Rocket Software, Inc. wIntegrate® Rocket Software, Inc. Microsoft® .NET Microsoft Corporation Microsoft® Office Excel®, Outlook®, Word Microsoft Corporation Windows® Microsoft Corporation Windows® 7 Microsoft Corporation Windows Vista® Microsoft Corporation Java™ and all Java-based trademarks and logos Sun Microsystems, Inc. UNIX® X/Open Company Limited ii SB/XA Getting Started The above trademarks are property of the specified companies in the United States, other countries, or both. All other products or services mentioned in this document may be covered by the trademarks, service marks, or product names as designated by the companies who own or market them. License agreement This software and the associated documentation are proprietary and confidential to Rocket Software, Inc., are furnished under license, and may be used and copied only in accordance with the terms of such license and with the inclusion of the copyright notice.
    [Show full text]
  • Template Meta-Programming for Haskell
    Template Meta-programming for Haskell Tim Sheard Simon Peyton Jones OGI School of Science & Engineering Microsoft Research Ltd Oregon Health & Science University [email protected] [email protected] Abstract C++, albeit imperfectly... [It is] obvious to functional programmers what the committee did not realize until We propose a new extension to the purely functional programming later: [C++] templates are a functional language evalu- language Haskell that supports compile-time meta-programming. ated at compile time...” [12]. The purpose of the system is to support the algorithmic construction of programs at compile-time. Robinson’s provocative paper identifies C++ templates as a ma- The ability to generate code at compile time allows the program- jor, albeit accidental, success of the C++ language design. De- mer to implement such features as polytypic programs, macro-like spite the extremely baroque nature of template meta-programming, expansion, user directed optimization (such as inlining), and the templates are used in fascinating ways that extend beyond the generation of supporting data structures and functions from exist- wildest dreams of the language designers [1]. Perhaps surprisingly, ing data structures and functions. in view of the fact that templates are functional programs, func- tional programmers have been slow to capitalize on C++’s success; Our design is being implemented in the Glasgow Haskell Compiler, while there has been a recent flurry of work on run-time meta- ghc. programming, much less has been done on compile-time meta- This version is very slightly modified from the Haskell Workshop programming. The Scheme community is a notable exception, as 2002 publication; a couple of typographical errors are fixed in Fig- we discuss in Section 10.
    [Show full text]
  • Types and Programming Languages by Benjamin C
    < Free Open Study > . .Types and Programming Languages by Benjamin C. Pierce ISBN:0262162091 The MIT Press © 2002 (623 pages) This thorough type-systems reference examines theory, pragmatics, implementation, and more Table of Contents Types and Programming Languages Preface Chapter 1 - Introduction Chapter 2 - Mathematical Preliminaries Part I - Untyped Systems Chapter 3 - Untyped Arithmetic Expressions Chapter 4 - An ML Implementation of Arithmetic Expressions Chapter 5 - The Untyped Lambda-Calculus Chapter 6 - Nameless Representation of Terms Chapter 7 - An ML Implementation of the Lambda-Calculus Part II - Simple Types Chapter 8 - Typed Arithmetic Expressions Chapter 9 - Simply Typed Lambda-Calculus Chapter 10 - An ML Implementation of Simple Types Chapter 11 - Simple Extensions Chapter 12 - Normalization Chapter 13 - References Chapter 14 - Exceptions Part III - Subtyping Chapter 15 - Subtyping Chapter 16 - Metatheory of Subtyping Chapter 17 - An ML Implementation of Subtyping Chapter 18 - Case Study: Imperative Objects Chapter 19 - Case Study: Featherweight Java Part IV - Recursive Types Chapter 20 - Recursive Types Chapter 21 - Metatheory of Recursive Types Part V - Polymorphism Chapter 22 - Type Reconstruction Chapter 23 - Universal Types Chapter 24 - Existential Types Chapter 25 - An ML Implementation of System F Chapter 26 - Bounded Quantification Chapter 27 - Case Study: Imperative Objects, Redux Chapter 28 - Metatheory of Bounded Quantification Part VI - Higher-Order Systems Chapter 29 - Type Operators and Kinding Chapter 30 - Higher-Order Polymorphism Chapter 31 - Higher-Order Subtyping Chapter 32 - Case Study: Purely Functional Objects Part VII - Appendices Appendix A - Solutions to Selected Exercises Appendix B - Notational Conventions References Index List of Figures < Free Open Study > < Free Open Study > Back Cover A type system is a syntactic method for automatically checking the absence of certain erroneous behaviors by classifying program phrases according to the kinds of values they compute.
    [Show full text]
  • Ironpython in Action
    IronPytho IN ACTION Michael J. Foord Christian Muirhead FOREWORD BY JIM HUGUNIN MANNING IronPython in Action Download at Boykma.Com Licensed to Deborah Christiansen <[email protected]> Download at Boykma.Com Licensed to Deborah Christiansen <[email protected]> IronPython in Action MICHAEL J. FOORD CHRISTIAN MUIRHEAD MANNING Greenwich (74° w. long.) Download at Boykma.Com Licensed to Deborah Christiansen <[email protected]> For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact Special Sales Department Manning Publications Co. Sound View Court 3B fax: (609) 877-8256 Greenwich, CT 06830 email: [email protected] ©2009 by Manning Publications Co. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps. Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15% recycled and processed without the use of elemental chlorine.
    [Show full text]
  • UNIX Workshop Series: Quick-Start Objectives
    Part I UNIX Workshop Series: Quick-Start Objectives Overview – Connecting with ssh Command Window Anatomy Command Structure Command Examples Getting Help Files and Directories Wildcards, Redirection and Pipe Create and edit files Overview Connecting with ssh Open a Terminal program Mac: Applications > Utilities > Terminal ssh –Y [email protected] Linux: In local shell ssh –Y [email protected] Windows: Start Xming and PuTTY Create a saved session for the remote host name centos.css.udel.edu using username Connecting with ssh First time you connect Unix Basics Multi-user Case-sensitive Bash shell, command-line Commands Command Window Anatomy Title bar Click in the title bar to bring the window to the front and make it active. Command Window Anatomy Login banner Appears as the first line of a login shell. Command Window Anatomy Prompts Appears at the beginning of a line and usually ends in $. Command Window Anatomy Command input Place to type commands, which may have options and/or arguments. Command Window Anatomy Command output Place for command response, which may be many lines long. Command Window Anatomy Input cursor Typed text will appear at the cursor location. Command Window Anatomy Scroll Bar Will appear as needed when there are more lines than fit in the window. Command Window Anatomy Resize Handle Use the mouse to change the window size from the default 80x24. Command Structure command [arguments] Commands are made up of the actual command and its arguments. command -options [arguments] The arguments are further broken down into the command options which are single letters prefixed by a “-” and other arguments that identify data for the command.
    [Show full text]
  • Metaprogramming Concepts of Programming Languages
    Metaprogramming Concepts of Programming Languages Alexander Schramm Institut für Softwaretechnik und Programmiersprachen 2. November 2015 A. Schramm 2. November 2015 1/39 Table of Contents Introduction Runtime Reflection in Java Runtime Metaprogramming in Ruby C++ Templates Haskell Templates Lisp Macros Conclusion A. Schramm 2. November 2015 2/39 Outline Introduction Runtime Reflection in Java Runtime Metaprogramming in Ruby C++ Templates Haskell Templates Lisp Macros Conclusion A. Schramm 2. November 2015 3/39 Motivation Which issues do we want to tackle? I Avoid writing boilerplate code I Write code that shows our intention I Expand the syntax of languages I Write type independent code in strongly typed languages A. Schramm 2. November 2015 4/39 What is Metaprogramming Definition: Metaprogramming Metaprograming describes different ways to generate and manipulate code Differentations: I Compile time vs. runtime metaprogramming I Domain language vs. host language A. Schramm 2. November 2015 5/39 Differentations of Metaprograms Compile time Metaprogramming Ways to manipulate or generate code during compilation, e.g: Macros, Templates Runtime Metaprogramming Ways to manipulate or generate code, while it is executed, e.g: dynamic methods, Reflections A. Schramm 2. November 2015 6/39 Differentations of Metaprograms Domain Language The Programming Language, in which the metaprogram is written Host Language The Programming Language of the generated code I Can be different (YACC, Compilers) I Domain language can be a subset of the host language (C++ Templates) I Domain language can be an integral part of the host language (Ruby) A. Schramm 2. November 2015 7/39 Outline Introduction Runtime Reflection in Java Runtime Metaprogramming in Ruby C++ Templates Haskell Templates Lisp Macros Conclusion A.
    [Show full text]
  • Processing an Intermediate Representation Written in Lisp
    Processing an Intermediate Representation Written in Lisp Ricardo Pena,˜ Santiago Saavedra, Jaime Sanchez-Hern´ andez´ Complutense University of Madrid, Spain∗ [email protected],[email protected],[email protected] In the context of designing the verification platform CAVI-ART, we arrived to the need of deciding a textual format for our intermediate representation of programs. After considering several options, we finally decided to use S-expressions for that textual representation, and Common Lisp for processing it in order to obtain the verification conditions. In this paper, we discuss the benefits of this decision. S-expressions are homoiconic, i.e. they can be both considered as data and as code. We exploit this duality, and extensively use the facilities of the Common Lisp environment to make different processing with these textual representations. In particular, using a common compilation scheme we show that program execution, and verification condition generation, can be seen as two instantiations of the same generic process. 1 Introduction Building a verification platform such as CAVI-ART [7] is a challenging endeavour from many points of view. Some of the tasks require intensive research which can make advance the state of the art, as it is for instance the case of automatic invariant synthesis. Some others are not so challenging, but require a lot of pondering and discussion before arriving at a decision, in order to ensure that all the requirements have been taken into account. A bad or a too quick decision may have a profound influence in the rest of the activities, by doing them either more difficult, or longer, or more inefficient.
    [Show full text]
  • The Machine That Builds Itself: How the Strengths of Lisp Family
    Khomtchouk et al. OPINION NOTE The Machine that Builds Itself: How the Strengths of Lisp Family Languages Facilitate Building Complex and Flexible Bioinformatic Models Bohdan B. Khomtchouk1*, Edmund Weitz2 and Claes Wahlestedt1 *Correspondence: [email protected] Abstract 1Center for Therapeutic Innovation and Department of We address the need for expanding the presence of the Lisp family of Psychiatry and Behavioral programming languages in bioinformatics and computational biology research. Sciences, University of Miami Languages of this family, like Common Lisp, Scheme, or Clojure, facilitate the Miller School of Medicine, 1120 NW 14th ST, Miami, FL, USA creation of powerful and flexible software models that are required for complex 33136 and rapidly evolving domains like biology. We will point out several important key Full list of author information is features that distinguish languages of the Lisp family from other programming available at the end of the article languages and we will explain how these features can aid researchers in becoming more productive and creating better code. We will also show how these features make these languages ideal tools for artificial intelligence and machine learning applications. We will specifically stress the advantages of domain-specific languages (DSL): languages which are specialized to a particular area and thus not only facilitate easier research problem formulation, but also aid in the establishment of standards and best programming practices as applied to the specific research field at hand. DSLs are particularly easy to build in Common Lisp, the most comprehensive Lisp dialect, which is commonly referred to as the “programmable programming language.” We are convinced that Lisp grants programmers unprecedented power to build increasingly sophisticated artificial intelligence systems that may ultimately transform machine learning and AI research in bioinformatics and computational biology.
    [Show full text]
  • An Optimized R5RS Macro Expander
    An Optimized R5RS Macro Expander Sean Reque A thesis submitted to the faculty of Brigham Young University in partial fulfillment of the requirements for the degree of Master of Science Jay McCarthy, Chair Eric Mercer Quinn Snell Department of Computer Science Brigham Young University February 2013 Copyright c 2013 Sean Reque All Rights Reserved ABSTRACT An Optimized R5RS Macro Expander Sean Reque Department of Computer Science, BYU Master of Science Macro systems allow programmers abstractions over the syntax of a programming language. This gives the programmer some of the same power posessed by a programming language designer, namely, the ability to extend the programming language to fit the needs of the programmer. The value of such systems has been demonstrated by their continued adoption in more languages and platforms. However, several barriers to widespread adoption of macro systems still exist. The language Racket [6] defines a small core of primitive language constructs, including a powerful macro system, upon which all other features are built. Because of this design, many features of other programming languages can be implemented through libraries, keeping the core language simple without sacrificing power or flexibility. However, slow macro expansion remains a lingering problem in the language's primary implementation, and in fact macro expansion currently dominates compile times for Racket modules and programs. Besides the typical problems associated with slow compile times, such as slower testing feedback, increased mental disruption during the programming process, and unscalable build times for large projects, slow macro expansion carries its own unique problems, such as poorer performance for IDEs and other software analysis tools.
    [Show full text]
  • Expressive and Strongly Type-Safe Code Generation
    Expressive and Strongly Type-Safe Code Generation Thomas Winant, Jesper Cockx, and Dominique Devriese imec-DistriNet, KU Leuven [email protected] Meta-programs are useful to avoid writing boilerplate code, for polytypic programming, etc. However, when a meta-program passes the type checker, it does not necessarily mean that the programs it generates will be free of type errors, only that generating object programs will proceed without type errors. For instance, this well-typed Template Haskell [5] meta-program generates the ill-typed object program not 'X'. notX :: Q Exp notX = [j not 'X' j] Fortunately, Template Haskell will type-check the generated program after generation, and detect the type error. We call such meta-programming systems weakly type-safe. Even though weakly type-safe meta-programming suffices for guaranteeing that the resulting program is type-safe, it has important downsides. Type errors in the generated code are presented to the application developer, who may simply be a user of the meta-program. If the meta-program is part of a library, the developer has little recourse other than contacting the meta-program authors about the bug in their code. Moreover, composing weakly type-safe meta-programs can be brittle because the type of generated programs is not specified in a machine-checked way. We are interested in what we call strongly type-safe meta-programming, which offers a stronger guarantee: when a meta-program is strongly type-safe, all generated programs are guaranteed to be type-safe too. As such, bugs in meta-programs are detected as early as possible.
    [Show full text]
  • Adaptive LL(*) Parsing: the Power of Dynamic Analysis
    Adaptive LL(*) Parsing: The Power of Dynamic Analysis Terence Parr Sam Harwell Kathleen Fisher University of San Francisco University of Texas at Austin Tufts University [email protected] [email protected] kfi[email protected] Abstract PEGs are unambiguous by definition but have a quirk where Despite the advances made by modern parsing strategies such rule A ! a j ab (meaning “A matches either a or ab”) can never as PEG, LL(*), GLR, and GLL, parsing is not a solved prob- match ab since PEGs choose the first alternative that matches lem. Existing approaches suffer from a number of weaknesses, a prefix of the remaining input. Nested backtracking makes de- including difficulties supporting side-effecting embedded ac- bugging PEGs difficult. tions, slow and/or unpredictable performance, and counter- Second, side-effecting programmer-supplied actions (muta- intuitive matching strategies. This paper introduces the ALL(*) tors) like print statements should be avoided in any strategy that parsing strategy that combines the simplicity, efficiency, and continuously speculates (PEG) or supports multiple interpreta- predictability of conventional top-down LL(k) parsers with the tions of the input (GLL and GLR) because such actions may power of a GLR-like mechanism to make parsing decisions. never really take place [17]. (Though DParser [24] supports The critical innovation is to move grammar analysis to parse- “final” actions when the programmer is certain a reduction is time, which lets ALL(*) handle any non-left-recursive context- part of an unambiguous final parse.) Without side effects, ac- free grammar. ALL(*) is O(n4) in theory but consistently per- tions must buffer data for all interpretations in immutable data forms linearly on grammars used in practice, outperforming structures or provide undo actions.
    [Show full text]