Generalized LR Parsing in Haskell Jo~ao Fernandes
[email protected] Techn. Report DI-PURe-04.11.01 2004, October PURe Program Understanding and Re-engineering: Calculi and Applications (Project POSI/ICHS/44304/2002) Departamento de Informatica´ da Universidade do Minho Campus de Gualtar | Braga | Portugal DI-PURe-04.11.01 Generalized LR Parsing in Haskell by Jo~ao Fernandes Abstract Parser combinators elegantly and concisely model generalised LL parsers in a purely functional language. They nicely illustrate the concepts of higher- order functions, polymorphic functions and lazy evaluation. Indeed, parser combinators are often presented as a motivating example for functional pro- gramming. Generalised LL, however, has an important drawback: it does not handle (direct nor indirect) left recursive context-free grammars. In a different context, the (non-functional) parsing community has been doing a considerable amount of work on generalised LR parsing. Such parsers handle virtually any context-free grammar. Surprisingly, little work has been done on generalised LR by the functional programming community ( [9] is a good exception). In this report, we present a concise and elegant implementation of an incremental generalised LR parser generator and interpreter in Haskell. For good computational complexity, such parsers rely heavily on lazy evaluation. Incremental evaluation is obtained via function memoisation. An implementation of our generalised LR parser generator is available as the HaGLR tool. We assess the performance of this tool with some benchmark examples. 1 Motivation The Generalised LR parsing algorithm was first introduced by Tomita [14] in the context of natural language processing. Several improvements have been proposed in the literature, among others to handle context-free gram- mars with hidden left-recursion [11, 7].