Parsing for Agile Modeling

Inaugural dissertation of the Faculty of Science of the University of Bern

presented by Jan Kurš from the Czech Republic

Supervisor: Prof. Dr. Oscar Nierstrasz, Institute of Computer Science

Accepted by the Faculty of Science.
Bern, 25.10.2016
The Dean: Prof. Dr. Gilberto Colangelo

Source: https://doi.org/10.24442/boristheses.891 | downloaded: 7.10.2021

This dissertation can be downloaded from scg.unibe.ch.

Copyright © 2016 by Jan Kurš

This work is licensed under a Creative Commons Attribution-Non-Commercial-No derivative works 2.5 Switzerland license. To see the license go to http://creativecommons.org/licenses/by-sa/2.5/ch/ (Attribution-ShareAlike).

Abstract

Agile modeling refers to a set of methods that allow for a quick initial development of an importer and its further refinement. These requirements are not met simultaneously by the current parsing technology. Problems with parsing became a bottleneck in our research of agile modeling.

In this thesis we introduce a novel approach to specify and build parsers. Our approach allows for expressive, tolerant and composable parsers without sacrificing performance. The approach is based on a context-sensitive extension of parsing expression grammars that allows a grammar engineer to specify complex language restrictions. To ensure high parsing performance we automatically analyze a grammar definition and choose different parsing strategies for different parts of the grammar.

We show that context-sensitive parsing expression grammars allow for highly composable, tolerant and variable-grained parsers that can be easily refined. Different parsing strategies ensure high performance of parsers without sacrificing the expressiveness of the underlying grammars.

Contents

1 Introduction
  1.1 Agile Modeling
  1.2 Parsing Obstacles of Agile Modeling
  1.3 Thesis
  1.4 Our Contribution
2 Overview of Parsing Technologies
  2.1 Parsing in the Wild
    2.1.1 Expressive Power
    2.1.2 Composability
    2.1.3 Tolerant Grammars and Semi-Parsing
    2.1.4 Performance
    2.1.5 Parsing Frameworks
  2.2 Existing Limitations
  2.3 Our Solution
3 Parsing Expression Grammars and PetitParser
  3.1 Parsing Expression Grammars
    3.1.1 PEG Analysis
    3.1.2 Parser Combinators
  3.2 PetitParser
4 Context Sensitivity in Parsing Expression Grammars
  4.1 Motivating Example
  4.2 Parsing Contexts
    4.2.1 Context-Sensitive Extension
    4.2.2 Indentation Stack
  4.3 Parsing Contexts in Parsing Expression Grammars
    4.3.1 Parser Combinators
    4.3.2 CS-PEG analysis
  4.4 Implementation
    4.4.1 Performance
  4.5 Case Studies
    4.5.1 Python
    4.5.2 Markdown
  4.6 Related Work
  4.7 Conclusion
5 Semi-Parsing with Bounded Seas
  5.1 Motivating Example
    5.1.1 Why not use Regular Expressions?
    5.1.2 A Naïve Island Grammar
    5.1.3 An Advanced Island Grammar
  5.2 Bounded Seas
    5.2.1 The Sea Boundary
    5.2.2 The Context Sensitivity of Bounded Seas
  5.3 Bounded Seas in Parsing Expression Grammars
    5.3.1 The Water Operator
    5.3.2 The NEXT function
    5.3.3 BS-PEG analysis
  5.4 Implementation
    5.4.1 Performance
  5.5 Java Parser Case Study
    5.5.1 Without Nested Classes
    5.5.2 With Nested Classes
    5.5.3 With Return Types
    5.5.4 Performance
  5.6 Related Work
  5.7 Conclusion
6 Adaptable Parsing Strategies
  6.1 Motivating Example
    6.1.1 Composition Overhead
    6.1.2 Superfluous Intermediate Objects
    6.1.3 Backtracking Overhead
    6.1.4 Context-Sensitivity Overhead
  6.2 A Parser Combinator Compiler
    6.2.1 Adaptable Strategies
  6.3 Parser Optimizations
    6.3.1 Regular Optimizations
    6.3.2 Context-Free Optimizations
    6.3.3 Context-Sensitive Optimizations
  6.4 Performance analysis
    6.4.1 PetitParser compiler
    6.4.2 Benchmarks
    6.4.3 Parsing Strategies Impact
    6.4.4 Scanner Impact
    6.4.5 Memoization Impact
    6.4.6 Java Parsers Comparison
    6.4.7 Smalltalk Parsers Comparison
  6.5 Related Work
  6.6 Conclusion
7 Ruby Case study
  7.1 Ruby Structure
    7.1.1 The Dangling End Problem
    7.1.2 Measurements
  7.2 Ruby Method Calls
    7.2.1 Measurements
  7.3 Performance
  7.4 Conclusion
8 Conclusion
A Formal development of PEGs
B Bounded Seas Examples
  B.1 Example of Dynamic NEXT computation
  B.2 Example of Static NEXT computation
  B.3 Overlapping Seas Example
C Implementation
  C.1 Bounded seas
D Layout Sensitivity in the Wild
  D.1 Haskell
  D.2 Python
  D.3 F#
  D.4 YAML
  D.5 OCaml
  D.6 CoffeeScript
  D.7 Grace
  D.8 SRFI 49: Indentation-Sensitive Scheme
  D.9 Elastic Tabstops
E Scanner
  E.1 Scanners in PEG-based parsers
    E.1.1 Tokens and Scannable Parsing Expressions
    E.1.2 Scannable Choices
    E.1.3 Scanner
  E.2 Regular Parsing Expressions
  E.3 Regular Parsing Expression Languages
  E.4 Finite State Automata
    E.4.1 Construction of finite state automata from regular parsing expressions (FSA)
    E.4.2 Determinization of the automata with epsilons and priorities (D)
F Measurements
  F.1 Summary
  F.2 Strategies Details
    F.2.1 Expressions
    F.2.2 IS Expressions
    F.2.3 CF Python
    F.2.4 Python
    F.2.5 Smalltalk
    F.2.6 Java
    F.2.7 Java Sea
  F.3 Scanner Impact
    F.3.1 Expressions
    F.3.2 Smalltalk
  F.4 Memoization Details
  F.5 Smalltalk Parsers
  F.6 Java Parsers
  ...

Details: PDF, English, 215 pages.
