algorithms Article LR Parsing for LCFRS Laura Kallmeyer * and Wolfgang Maier Department for Computational Linguistics, Institute for Language and Information, Heinrich-Heine Universität Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany;
[email protected] * Correspondence:
[email protected]; Tel.: +49-211-8113-899 Academic Editor: Henning Fernau Received: 21 March 2016; Accepted: 18 August 2016; Published: 27 August 2016 Abstract: LR parsing is a popular parsing strategy for variants of Context-Free Grammar (CFG). It has also been used for mildly context-sensitive formalisms, such as Tree-Adjoining Grammar. In this paper, we present the first LR-style parsing algorithm for Linear Context-Free Rewriting Systems (LCFRS), a mildly context-sensitive extension of CFG which has received considerable attention in the last years in the context of natural language processing. Keywords: parsing; automata; LCFRS 1. Introduction In computational linguistics, the modeling of discontinuous structures in natural language, i.e., of structures that span two or more non-adjacent portions of the input string, is an important issue. In recent years, Linear Context-Free Rewriting System (LCFRS) [1] has emerged as a formalism which is useful for this task. LCFRS is a mildly context-sensitive [2] extension of CFG in which a single non-terminal can cover k N continuous blocks of terminals. CFG is a special case of LCFRS where 2 k = 1. In CFG, only embedded structures can be modelled. In LCFRS, in contrast, yields can be intertwined. The schematically depicted derivation trees in Figure1 illustrate the different domains of locality of CFG and LCFRS. As can be seen, in the schematic CFG derivation, X contributes a continuous component b to the full yield.