
Ambiguity Detection for Programming Language Grammars

Bas Basten

To cite this version:
Bas Basten. Ambiguity Detection for Programming Language Grammars. Computation and Language [cs.CL]. Universiteit van Amsterdam, 2011. English. tel-00644079

HAL Id: tel-00644079
https://tel.archives-ouvertes.fr/tel-00644079
Submitted on 23 Nov 2011

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

ACADEMIC DISSERTATION

to obtain the degree of doctor at the Universiteit van Amsterdam, on the authority of the Rector Magnificus, prof. dr. D.C. van den Boom, to be defended in public before a committee appointed by the Doctorate Board, in the Agnietenkapel on Thursday 15 December 2011 at 14:00, by Hendrikus Joseph Sebastiaan Basten, born in Boxmeer.

Doctoral committee

Supervisor (promotor): Prof. dr. P. Klint
Co-supervisor (copromotor): Dr. J.J. Vinju
Other members: Prof. dr. J.A. Bergstra, Prof. dr. R. Lämmel, Prof. dr. M. de Rijke, Dr. S. Schmitz, Dr. E. Visser

Faculteit der Natuurwetenschappen, Wiskunde en Informatica

The work in this thesis has been carried out at Centrum Wiskunde & Informatica (CWI) in Amsterdam, under the auspices of the research school IPA (Institute for Programming research and Algorithmics).

Contents

Acknowledgements

1 Introduction
  1.1 Introduction
    1.1.1 Context-Free Grammars and Parsing
    1.1.2 Ambiguity
    1.1.3 Ambiguity Detection
    1.1.4 Goal of the Thesis
  1.2 Motivation
    1.2.1 Deterministic Parsers
    1.2.2 Backtracking Parsers
    1.2.3 Generalized Parsers
    1.2.4 Scannerless Generalized Parsers
  1.3 Research Questions
    1.3.1 Measuring the Practical Usability of Ambiguity Detection
    1.3.2 Improving the Practical Usability of Ambiguity Detection
  1.4 Overview of the Chapters and Contributions
  1.5 Origins of the Chapters

2 The Usability of Ambiguity Detection Methods for Context-Free Grammars
  2.1 Introduction
  2.2 Comparison Framework
    2.2.1 Criteria for Practical Usability
    2.2.2 Measurements
    2.2.3 Analysis
  2.3 AMBER
    2.3.1 Measurements and Analysis
  2.4 LR(k) Test
    2.4.1 Measurements and Analysis
  2.5 Noncanonical Unambiguity Test
    2.5.1 Measurements and Analysis
  2.6 Comparison
  2.7 Evaluation
  2.8 Conclusions
    2.8.1 Discussion

3 Faster Ambiguity Detection by Grammar Filtering
  3.1 Introduction
  3.2 Filtering Unambiguous Productions
    3.2.1 Preliminaries
    3.2.2 The Noncanonical Unambiguity Test
    3.2.3 LR(0) Approximation
    3.2.4 Finding Ambiguity in an item0 Position Graph
    3.2.5 Filtering Harmless Production Rules
    3.2.6 Grammar Reconstruction
  3.3 Experimental Validation
    3.3.1 Experiment Setup
    3.3.2 Experimental Results
    3.3.3 Analysis and Conclusions
    3.3.4 Threats to validity
  3.4 Conclusions
  3.5 Appendix: Updated Measurement Results
    3.5.1 Improved Implementation
    3.5.2 Analysis
    3.5.3 Effects on Sentence Generation Times

4 Tracking Down the Origins of Ambiguity in Context-Free Grammars
  4.1 Introduction
    4.1.1 Related Work
    4.1.2 Overview
  4.2 Preliminaries
    4.2.1 Context-Free Grammars
    4.2.2 Bracketed Grammars
    4.2.3 Parse Trees
    4.2.4 Ambiguous Core
    4.2.5 Positions
    4.2.6 Automata
  4.3 Regular Unambiguity Test
    4.3.1 Position Automaton
    4.3.2 Approximated Position Automaton
    4.3.3 The item0 Equivalence Relation
    4.3.4 Position Pair Automaton
  4.4 Finding Parse Trees of Unambiguous Strings
    4.4.1 Unused Positions
    4.4.2 Computation
  4.5 Harmless Production Rules
    4.5.1 Finding Harmless Production Rules
    4.5.2 Complexity
    4.5.3 Grammar Reconstruction
  4.6 Noncanonical Unambiguity Test
    4.6.1 Improving the Regular Unambiguity Test
    4.6.2 Noncanonical Position Pair Automaton
    4.6.3 Effects on Identifying Harmless Production Rules
  4.7 Excluding Parse Trees Iteratively
  4.8 Conclusions

5 Scaling to Scannerless
  5.1 Introduction
    5.1.1 Background
    5.1.2 Contributions and Roadmap
  5.2 The Ambiguity Detection Framework
    5.2.1 The Framework
    5.2.2 Notational Preliminaries
  5.3 Character-Level Grammars
    5.3.1 Example
    5.3.2 Definition
  5.4 Baseline Algorithm
    5.4.1 Step 1: NFA Construction
    5.4.2 Step 2: Construct and Traverse Pair Graph
    5.4.3 Steps 3–4: NFA Filtering and Harmless Rules Identification
    5.4.4 Steps 5–7: NFA Reconstruction and Sentence Generation
  5.5 Ambiguity Detection for Character-level Grammars
    5.5.1 Application of Baseline Algorithm on Example Grammar
    5.5.2 Changes to the Baseline Algorithm
    5.5.3 NFA Reconstruction
  5.6 Grammar Unfolding
  5.7 Experimental Results
    5.7.1 Experiment Setup
    5.7.2 Results and Analysis
    5.7.3 Validation
  5.8 Conclusion

6 Implementing AMBIDEXTER
  6.1 Introduction
  6.2 Grammar Filter
    6.2.1 Requirements
    6.2.2 Architecture and Design
    6.2.3 Implementation Details
  6.3 Sentence Generator
    6.3.1 Requirements
    6.3.2 Architecture and Design
    6.3.3 Implementation Details
  6.4 Usage ...