TuLiPA: Towards a Multi-Formalism Parsing Environment for Grammar Engineering Laura Kallmeyer Timm Lichte Wolfgang Maier SFB 441 SFB 441 SFB 441 Universit¨at T¨ubingen Universit¨at T¨ubingen Universit¨at T¨ubingen D-72074, T¨ubingen, Germany D-72074, T¨ubingen, Germany D-72074, T¨ubingen, Germany [email protected] [email protected] [email protected]

Yannick Parmentier Johannes Dellert Kilian Evang CNRS - LORIA SFB 441 - SfS SFB 441 - SfS Nancy Universit´e Universit¨at T¨ubingen Universit¨at T¨ubingen F-54506, Vandœuvre, France D-72074, T¨ubingen, Germany D-72074, T¨ubingen, Germany [email protected] {jdellert,kevang}@sfs.uni-tuebingen.de

Abstract sharing of resources (e.g., having a common lex- icon, from which different features would be ex- In this paper, we present an open-source tracted depending on the target formalism). parsing environment (T¨ubingen Linguistic In this context, we present a parsing environ- Parsing Architecture, TuLiPA) which uses ment relying on a general architecture that can Range Concatenation Grammar (RCG) be used for parsing with mildly context-sensitive as a pivot formalism, thus opening the (MCS) formalisms1 (Joshi, 1987). Its underly- way to the parsing of several mildly ing idea is to use Range Concatenation Grammar context-sensitive formalisms. This en- (RCG) as a pivot formalism, for RCG has been vironment currently supports tree-based shown to strictly include MCS languages while be- grammars (namely Tree-Adjoining Gram- ing parsable in polynomial time (Boullier, 2000). mars (TAG) and Multi-Component Tree- Currently, this architecture supports tree-based Adjoining Grammars with Tree Tuples grammars (Tree-Adjoining Grammars and Multi- (TT-MCTAG)) and allows computation Component Tree-Adjoining Grammars with Tree not only of syntactic structures, but also Tuples (Lichte, 2007)). More precisely, tree- of the corresponding semantic representa- based grammars are first converted into equivalent tions. It is used for the development of a RCGs, which are then used for parsing. The result tree-based grammar for German. of RCG parsing is finally interpreted to extract a derivation structure for the input grammar, as well 1 Introduction as to perform additional processings (e.g., seman- tic calculus, extraction of dependency views). Grammars and lexicons represent important lin- guistic resources for many NLP applications, The paper is structured as follows. In section 2, among which one may cite dialog systems, auto- we present the architecture of the TuLiPA pars- matic summarization or machine translation. De- ing environment and show how the use of RCG veloping such resources is known to be a complex as a pivot formalism makes it easier to design a task that needs useful tools such as parsers and modular system that can be extended to support generators (Erbach, 1992). several dimensions (syntax, semantics) and/or for- malisms. In section 3, we give some desiderata for Furthermore, there is a lack of a common frame- grammar engineering and present TuLiPA’s cur- work allowing for multi-formalism grammar engi- neering. Thus, many formalisms have been pro- 1A formalism is said to be mildly context sensitive (MCS) posed to model natural language, each coming iff (i) it generates limited cross-serial dependencies, (ii) it is with specific implementations. Having a com- polynomially parsable, and (iii) the string languages gener- ated by the formalism have the constant growth property (e.g., mon framework would facilitate the comparison n {a2 |n ≥ 0} does not have this property). Examples of MCS between formalisms (e.g., in terms of parsing com- formalisms include Tree-Adjoining Grammars, Combinatory plexity in practice), and would allow for a better Categorial Grammars and Linear Indexed Grammars. rent state with respect to these. In section 4, we tures labelling nodes. As a result of these unifica- compare this system with existing approaches for tions, the arguments of the semantic formulas are parsing and more generally for grammar engineer- unified (see Fig. 1). ing. Finally, in section 5, we conclude by present- ing future work. S NP↓x VP 2 Range Concatenation Grammar as a ↓y pivot formalism NPj VNP NPm John loves Mary The main idea underlying TuLiPA is to use RCG as a pivot formalism for RCG has appealing formal name(j,john) love(x,y) name(m,mary) properties (e.g., a generative capacity lying be- ; love(j,m),name(j,john),name(m,mary) yond Linear Context Free Rewriting Systems and a polynomial parsing complexity) and there ex- Figure 1: Semantic calculus in Feature-Based ist efficient algorithms, for RCG parsing (Boullier, TAG. 2000) and for grammar transformation into RCG (Boullier, 1998; Boullier, 1999). In our system, the semantic support has been in- Parsing with TuLiPA is thus a 3-step process: tegrated by (i) extending the internal tree objects to include semantic formulas (the RCG-conversion is 1. The input tree-based grammar is converted kept unchanged), and (ii) extending the construc- into an RCG (using the algorithm of tion of the derived tree (step 3) so that during the Kallmeyer and Parmentier (2008) when deal- interpretation of the RCG derivation in terms of ing with TT-MCTAG). tree combinations, the semantic formulas are car- ried and updated with respect to the feature unifi- 2. The resulting RCG is used for parsing the in- cations performed. put string using an extension of the parsing Secondly, let us consider lexical disambigua- algorithm of Boullier (2000). tion. Because of the high redundancy lying within 3. The RCG derivation structure is interpreted to lexicalized formalisms such as lexicalized TAG, extract the derivation and derived trees with it is common to consider tree schemata having a respect to the input grammar. frontier node marked for anchoring (i.e., lexical- ization). At parsing time, the tree schemata are The use of RCG as a pivot formalism, and thus anchored according to the input string. This an- of an RCG parser as a core component of the sys- choring selects a subgrammar supposed to cover tem, leads to a modular architecture. In turns, this the input string. Unfortunately, this subgrammar makes TuLiPA more easily extensible, either in may contain many trees that either do not lead to terms of functionalities, or in terms of formalisms. a parse or for which we know apriorithat they cannot be combined within the same derivation 2.1 Adding functionalities to the parsing (so we should not predict a derivation from one environment of these trees to another during parsing). As a re- As an illustration of TuLiPA’s extensibility, one sult, the parser could have poor performance be- may consider two extensions applied to the system cause of the many derivation paths that have to be recently. explored. Bonfante et al. (2004) proposed to polar- First, a semantic calculus using the syn- ize the structures of the grammar, and to apply an tax/semantics interface for TAG proposed by Gar- automaton-based filtering of the compatible struc- dent and Kallmeyer (2003) has been added. This tures. The idea is the following. One compute po- interface associates each tree with flat semantic larities representing the needs/resources brought formulas. The arguments of these formulas are by a given tree (or tree tuple for TT-MCTAG). unification variables, which are co-indexed with A substitution or foot node with category NP re- features labelling the nodes of the syntactic tree. flects a need for an NP (written NP-). In the same During classical TAG derivation, trees are com- way, an NP root node reflects a resource of type bined, triggering unifications of the feature struc- NP (written NP+). Then you build an automaton whose edges correspond to trees, and states to po- To sum up, the idea would be to keep the core larities brought by trees along the path. The au- RCG parser, and to extend TuLiPA with a spe- tomaton is then traversed to extract all paths lead- cific conversion module for each targeted formal- ing to a final state with a neutral polarity for each ism. On top of these conversion modules, one category and +1 for the axiom (see Fig. 2, the state should also provide interpretation modules allow- 7 is the only valid state and {proper., trans., det., ing to decode the RCG derivation forest in terms noun.} the only compatible set of trees). of the input formalism (see Fig. 3).

0 John 11eats 22a 33cake 4 det. noun. 2 4 6 intrans. S+ S+ S+ NP+ proper. 0 1 trans. NP+ det. noun. 3 5 7 S+ NP- S+ NP- S+ Figure 3: Towards a multi-formalism parsing en- vironment. Figure 2: Polarity-based lexical disambiguation. An important point remains to be discussed. It In our context, this polarity filtering has been concerns the role of lexicalization with respect to added before step 1, leaving untouched the core the formalism used. Indeed, the tree-based gram- RCG conversion and parsing steps. The idea is mar formalisms currently supported (TAG and TT- to compute the sets of compatible trees (or tree MCTAG) both share the same lexicalization pro- tuples for TT-MCTAG) and to convert these sets cess (i.e., tree anchoring). Thus the lexicon format separately. Indeed the RCG has to encode only is common to these formalisms. As we will see valid adjunctions/substitutions. Thanks to this below, it corresponds to a 2-layer lexicon made of automaton-based “clustering” of the compatible inflected forms and lemma respectively, the latter tree (or tree tuples), we avoid predicting incompat- selecting specific grammatical structures. When ible derivations. Note that the time saved by using parsing other formalisms, it is still unclear whether a polarity-based filter is not negligible, especially one can use the same lexicon format, and if not 2 when parsing long sentences. what kind of general lexicon management module should be added to the parser (in particular to deal 2.2 Adding formalisms to the parsing with morphology). environment Of course, the two extensions introduced in the 3 Towards a complete grammar previous section may have been added to other engineering environment modular architectures as well. The main gain brought by RCG is the possibility to parse not So far, we have seen how to use a generic parsing only tree-based grammars, but other formalisms architecture relying on RCG to parse different for- provided they can be encoded into RCG. In our malisms. In this section, we adopt a broader view system, only TAG and TT-MCTAG have been and enumerate some requirements for a linguistic considered so far. Nonetheless, Boullier (1998) resource development environment. We also see and Søgaard (2007) have defined transformations to what extent these requirements are fulfilled (or into RCG for other mildly context-sensitive for- partially fulfilled) within the TuLiPA system. malisms.3 3.1 Grammar engineering with TuLiPA 2An evaluation of the gain brought by this technique when using Interaction Grammar is given by Bonfante et al. (2004). As advocated by Erbach (1992), grammar en- 3These include Multi-Component Tree-Adjoining Gram- gineering needs “tools for testing the grammar mar, Linear , Head Grammar, Coupled Context Free Grammar, Right Linear Unification Grammar with respect to consistency, coverage, overgener- and Synchronous Unification Grammar. ation and accuracy”. These characteristics may be taken into account by different interacting soft- Morphological specification: ware. Thus, consistency can be checked by a semi- vergisst vergessen [pos=v,num=sg,per=3] automatic grammar production device, such as the XMG system of Duchier et al. (2004). Overgen- Lemma specification: ∗ eration is mainly checked by a generator (or by ENTRY: vergessen ∗ a parser with adequate test suites), and coverage CAT: v ∗ and accuracy by a parser. In our case, the TuLiPA SEM: BinaryRel[pred=vergessen] ∗ system provides an entry point for using a gram- ACC: 1 ∗ mar production system (and a lexicon conversion FAM: Vnp2 ∗ tool introduced below), while including a parser. FILTERS: [] ∗ Note that TuLiPA does not include any generator, EX: ∗ nonetheless it uses the same lexicon format as the EQUATIONS: → GenI surface realizer for TAG4. NParg1 cas = nom → TuLiPA’s input grammar is designed using NParg2 cas = acc ∗ XMG, which is a metagrammar compiler for tree- COANCHORS: based formalisms. In other terms, the linguist de- Figure 4: Morphological and lemma specification fines a factorized description of the grammar (the of vergisst. so-called metagrammar) in the XMG language. Briefly, an XMG metagrammar consists of (i) el- ementary tree fragments represented as tree de- information allowing for semantic instantiation, scription logic formulas, and (ii) conjunctive and the ∗FAM field, which contains the name of the disjunctive combinations of these tree fragments tree family to be anchored, the ∗FILTERS field to describe actual TAG tree schemata.5 This meta- which consists of a feature structure constraining grammar is then compiled by the XMG system to by unification the trees of a given family that can produce a tree grammar in an XML format. Note be anchored by the given lemma (used for instance that the resulting grammar contains tree schemata for non-passivable verbs), the ∗EQUATIONS field (i.e., unlexicalized trees). To lexicalize these, the allowing for the definition of equations targeting linguist defines a lexicon mapping words with cor- named nodes of the trees, and the ∗COANCHORS responding sets of trees. Following XTAG (2001), field, which allows for the specification of co- this lexicon is a 2-layer lexicon made of morpho- anchors (such as by in the verb to come by). logical and lemma specifications. The motivation From these XML resources, TuLiPA parses a of this 2-layer format is (i) to express linguistic string, corresponding either to a sentence or a con- generalizations at the lexicon level, and (ii) to al- stituent (noun phrase, prepositional phrase, etc.), low the parser to only select a subgrammar accord- and computes several output pieces of informa- ing to a given sentence, thus reducing parsing com- tion, namely (for TAG and TT-MCTAG): deriva- plexity. TuLiPA comes with a lexicon conversion tion/derived trees, semantic representations (com- tool (namely lexConverter) allowing to write a lex- puted from underspecified representations using icon in a user-friendly text format and to convert it the utool software6, or dependency views of the into XML. An example of an entry of such a lexi- derivation trees (using the DTool software7). con is given in Fig. 4. The morphological specification consists of a 3.2 Grammar debugging word, the corresponding lemma and morphologi- The engineering process introduced in the preced- cal features. The main pieces of information con- ing section belongs to a development cycle, where ∗ tained in the lemma specification are the ENTRY one first designs a grammar and corresponding ∗ field, which refers to the lemma, the CAT field lexicons using XMG, then checks these with the referring to the syntactic category of the anchor parser, fixes them, parses again, and so on. node, the ∗SEM field containing some semantic 6See http://www.coli.uni-saarland.de/ 4http://trac.loria.fr/˜geni projects/chorus/utool/, with courtesy of Alexan- 5See (Crabb´e, 2005) for a presentation on how to use the der Koller. XMG formalism for describing a core TAG for French. 7With courtesy of Marco Kuhlmann. To facilitate grammar debugging, TuLiPA in- cludes both a verbose and a robust mode allow- ing respectively to (i) produce a log of the RCG- conversion, RCG-parsing and RCG-derivation in- terpretation, and (ii) display mismatching features leading to incomplete derivations. More precisely, in robust mode, the parser displays derivations step by step, highlighting feature unification failures. TuLiPA’s options can be activated via an intu- itive Graphical User Interface (see Fig. 5).

Figure 6: TuLiPA’s eclipse plug-in.

development platform allows for reusing several components inherited from the software develop- ment community, such as plug-ins for version con- trol, editors coupled with explorers, etc. Eventually, one point worth considering in the context of grammar development concerns data encoding. To our knowledge, only few environ- ments provide support for UTF-8 encoding, thus guarantying the coverage of a wide set of charsets Figure 5: TuLiPA’s Graphical User Interface. and languages. In TuLiPA, we added an UTF- 8 support for linguistic resources (in the lexCon- verter), thus allowing to design a TAG for Korean 3.3 Towards a functional common interface (work in progress). Unfortunately, as mentioned above, the linguist 3.4 Usability of the TuLiPA system has to move back-and-forth from the gram- mar/lexicon descriptions to the parser, i.e., each As mentioned above, the TuLiPA system is made time the parser reports grammar errors, the lin- of several interacting components, that one cur- guist fixes these and then recomputes the XML rently has to install separately. Nonetheless, much files and then parses again. To avoid this tedious attention has been paid to make this installation task of resources re-compilation, we started devel- process as easy as possible and compatible with oping an Eclipse8 plug-in for the TuLiPA system. all major platforms.9 Thus, the linguist will be able to manage all these XMG and lexConverter can be installed by com- resources, and to call the parser, the metagrammar piling their sources (using a make command). compiler, and the lexConverter from a common in- TuLiPA is developed in Java and released as an terface (see Fig. 6). executable jar. No compilation is needed for it, The motivation for this plug-in comes from the only requirement is the Gecode/GecodeJ li- the observation that designing electronic gram- brary10 (available as a binary package for many mars is a task comparable to designing source platforms). Finally, the TuLiPA eclipse plug-in code. A powerful grammar engineering environ- can be installed easily from eclipse itself. All these ment should thus come with development facili- tools are released under Free software licenses (ei- ties such as precise debugging information, syntax ther GNU GPL or Eclipse Public License). highlighting, etc. Using the Eclipse open-source 9See http://sourcesup.cru.fr/tulipa. 8See http://www.eclipse.org 10See http://www.gecode.org/gecodej. This environment is being used (i) at the Univer- syntactic structures thanks to its intuitive parsing sity of T¨ubingen, in the context of the development environment. Once the grammar is stable, one of a TT-MCTAG for German describing both syn- may use SemTAG in batch processing to parse tax and semantics, and (ii) at LORIA Nancy, in the corpuses and build semantic representations using development of an XTAG-based metagrammar for large grammars. This combination of these 2 sys- English. The German grammar, called GerTT (for tems is made easier by the fact that both use the German Tree Tuples), is released under a LGPL li- same input formats (a metagrammar in the XMG cense for Linguistic Resources11 and is presented language and a text-based lexicon). This approach in (Kallmeyer et al., 2008). The test-suite cur- is the one being adopted for the development of a rently used to check the grammar is hand-crafted. French TAG equipped with semantics. A more systematic evaluation of the grammar is in For Interaction Grammar (Perrier, 2000), there preparation, using the Test Suite for Natural Lan- exists an engineering environment gathering the guage Processing (Lehmann et al., 1996). XMG metagrammar compiler and an eLEtrOstatic PARser (LEOPAR).14 This environment is be- 4 Comparison with existing approaches ing used to develop an Interaction Grammar for 4.1 Engineering environments for tree-based French. TuLiPA’s lexical disambiguation module grammar formalisms reuses techniques introduced by LEOPAR. Unlike TuLiPA, LEOPAR does not currently support se- To our knowledge, there is currently no available mantic information. parsing environment for multi-component TAG. Existing grammar engineering environments for 4.2 Engineering environments for other TAG include the DyALog system12 described in grammar formalisms Villemonte de la Clergerie (2005). DyALog is a For other formalisms, there exist state-of-the-art compiler for a logic programming language using grammar engineering environments that have been tabulation and dynamic programming techniques. used for many years to design large deep gram- This compiler has been used to implement efficient mars for several languages. parsing algorithms for several formalisms, includ- For Lexical Functional Grammar, one may cite ing TAG and RCG. Unfortunately, it does not in- the Xerox Linguistic Environment (XLE).15 For clude any built-in GUI and requires a good know- Head-driven Phrase Structure Grammar, the main ledge of the GNU build tools to compile parsers. available systems are the Linguistic Knowledge This makes it relatively difficult to use. DyALog’s Base (LKB)16 and the TRALE system.17 For main quality lies in its efficiency in terms of pars- Combinatory Categorial Grammar, one may cite ing time and its capacity to handle very large re- the OpenCCG library18 and the C&C parser.19 sources. Unlike TuLiPA, it does not compute se- mantic representations. These environments have been used to develop The closest approach to TuLiPA corresponds broad-coverage resources equipped with seman- to the SemTAG system13, which extends TAG tics and include both a generator and a parser. Un- parsers compiled with DyALog with a semantic like TuLiPA, they represent advanced projects, that calculus module (Gardent and Parmentier, 2007). have been used for dialog and machine translation applications. They are mainly tailored for a spe- Unlike TuLiPA, this system only supports TAG, 20 and does not provide any graphical output allow- cific formalism. ing to easily check the result of parsing. 14See http://www.loria.fr/equipes/ Note that, for grammar designers mainly inter- calligramme/leopar/ 15See http://www2.parc.com/isl/groups/ ested in TAG, SemTAG and TuLiPA can be seen nltt/xle/ as complementary tools. Indeed, one may use 16See http://wiki.delph-in.net/moin TuLiPA to develop the grammar and check specific 17See http://milca.sfs.uni-tuebingen.de/ A4/Course/trale/ 11See http://infolingu.univ-mlv. 18See http://openccg.sourceforge.net/ fr/DonneesLinguistiques/ 19See http://svn.ask.it.usyd.edu.au/ Lexiques-Grammaires/lgpllr.html trac/candc/wiki 12See http://dyalog.gforge.inria.fr 20Nonetheless, Beavers (2002) encoded a CCG in the 13See http://trac.loria.fr/˜semconst LKB’s Type Description Language. 5 Future work minutes with a TAG having 6000 trees, but the re- sulting parser can parse sentences within a second. In this section, we give some prospective views In TuLiPA’s approach, the grammar is compiled concerning engineering environments in general, into an RCG on-line. While giving satisfactory re- and TuLiPA in particular. We first distinguish be- sults on reduced resources21, it may lead to trou- tween 2 main usages of grammar engineering en- bles when scaling up. This is especially true for vironments, namely a pedagogical usage and an TAG (the TT-MCTAG formalism is by definition a application-oriented usage, and finally give some factorized formalism compared with TAG). In the comments about multi-formalism. future, it would be useful to look for a way to pre- Pedagogical usage Developing grammars in a compile a TAG into an RCG off-line, thus saving pedagogical context needs facilities allowing for the conversion time. inspection of the structures of the grammar, step- Another important feature of grammar engi- by-step parsing (or generation), along with an in- neering environments consists of its debugging tuitive interface. The idea is to abstract away from functionalities. Among these, one may cite unit technical aspects related to implementation (inter- and integration testing. It would be useful to ex- mediate data structures, optimizations, etc.). tend the TuLiPA system to provide a module for The question whether to provide graphical or generating test-suites for a given grammar. The text-based editors can be discussed. As advo- idea would be to record the coverage and analyses cated by Baldridge et al. (2007), a low-level text- of a grammar at a given time. Once the grammar based specification can offer more flexibility and is further developed, these snapshots would allow bring less frustration to the grammar designer, es- for regression testing. pecially when such a specification can be graph- About multi-formalism We already mentioned ically interpreted. This is the approach chosen that TuLiPA was opening a way towards multi- by XMG, where the grammar is defined via an formalism by relying on an RCG core. It is worth (advanced or not) editor such as gedit or emacs. noticing that the XMG system was also designed Within TuLiPA, we chose to go further by using to be further extensible. Indeed, a metagrammar the Eclipse platform. Currently, it allows for dis- in XMG corresponds to the combination of ele- playing a summary of the content of a metagram- mentary structures. One may think of designing mar or lexicon on a side panel, while editing these a library of such structures, these would be depen- on a middle panel. These two panels are linked dent on the target grammar formalism. The combi- via a jump functionality. The next steps concern nations may represent general linguistic concepts (i) the plugging of a graphical viewer to display and would be shared by different grammar imple- the (meta)grammar structures independently from mentations, following ideas presented by Bender a given parse, and (ii) the extension of the eclipse et al. (2005). plug-in so that one can easily consistently modify entries of the metagrammar or lexicon (especially 6Conclusion when these are split over several files). In this paper, we have presented a multi-formalism Application-oriented usage When dealing with parsing architecture using RCG as a pivot formal- applications, one may demand more from the ism to parse mildly context-sensitive formalisms grammar engineering environment, especially in (currently TAG and TT-MCTAG). This system has terms of efficiency and robustness (support for been designed to facilitate grammar development larger resources, partial parsing, etc.). by providing user-friendly interfaces, along with Efficiency needs optimizations in the parsing several functionalities (e.g., dependency extrac- engine making it possible to support grammars tion, derivation/derived tree display and semantic containing several thousands of structures. One calculus). It is currently used for developing a core interesting question concerns the compilation of a grammar for German. grammar either off-line or on-line. In DyALog’s approach, the grammar is compiled off-line into 21For a TT-MCTAG counting about 300 sets of trees and an and-crafted lexicon made of about 300 of words, a 10-word a logical automaton encoding all possible deriva- sentence is parsed (and a semantic representation computed) tions. This off-line compilation can take some within seconds. At the moment, we are working on the extension Erbach, Gregor. 1992. Tools for grammar engineer- of this architecture to include a fully functional ing. In 3rd Conference on Applied Natural Lan- Eclipse plug-in. Other current tasks concern op- guage Processing, pages 243–244, Trento, Italy. timizations to support large scale parsing and the Gardent, Claire and Laura Kallmeyer. 2003. Seman- extension of the syntactic and semantic coverage tic Construction in FTAG. In Proceedings of EACL 2003, pages 123–130, Budapest, Hungary. of the German grammar under development. Gardent, Claire and Yannick Parmentier. 2007. Sem- Acknowledgments tag: a platform for specifying tree adjoining gram- mars and performing tag-based semantic construc- This work has been supported by the Deutsche tion. In Proceedings of the ACL 2007, Companion Forschungsgemeinschaft (DFG) and the Deutscher Volume Proceedings of the Demo and Poster Ses- sions, pages 13–16, Prague, Czech Republic. Akademischer Austausch Dienst (DAAD, grant A/06/71039). We are grateful to three anonymous Joshi, Aravind K. 1987. An introduction to Tree Ad- reviewers for valuable comments on this work. joining Grammars. In Manaster-Ramer, A., editor, Mathematics of Language, pages 87–114. John Ben- jamins, Amsterdam. References Kallmeyer, Laura and Yannick Parmentier. 2008. On the relation between Multicomponent Tree Adjoin- Baldridge, Jason, Sudipta Chatterjee, Alexis Palmer, ing Grammars with Tree Tuples (TT-MCTAG) and and Ben Wing. 2007. DotCCG and VisCCG: Wiki Range Concatenation Grammars (RCG). In Pro- and programming paradigms for improved grammar ceedings of LATA 2008, pages 277–288, Tarragona, engineering with OpenCCG. In King, Tracy Hol- Spain. loway and Emily M. Bender, editors, Proceedings of GEAF07, pages 5–25, Stanford, CA. CSLI. Kallmeyer, Laura, Timm Lichte, Wolfgang Maier, Yan- nick Parmentier, and Johannes Dellert. 2008. Devel- Beavers, John. 2002. Documentation: A CCG Imple- opping an MCTAG for German with an RCG-based mentation for the LKB. LinGO Working Paper No. Parser. In Proceedings of LREC 2008, Marrakech, 2002-08, CSLI, Stanford University, Stanford, CA. Morocco. Lehmann, Sabine, Stephan Oepen, Sylvie Regnier- Bender, Emily, Dan Flickinger, Frederik Fouvry, and Prost, Klaus Netter, Veronika Lux, Judith Klein, Melanie Siegel. 2005. Shared representation in mul- Kirsten Falkedal, Frederik Fouvry, Dominique Esti- tilingual grammar engineering. Research on Lan- val, Eva Dauphin, Herv´e Compagnion, Judith Baur, guage & Computation, 3(2):131–138. Lorna Balkan, and Doug Arnold. 1996. TSNLP — Test Suites for Natural Language Processing. In Bonfante, Guillaume, Bruno Guillaume, and Guy Per- Proceedings of CoLing 1996, volume 2, pages 711– rier. 2004. Polarization and abstraction of grammat- 716, Copenhagen, Denmark. ical formalisms as methods for lexical disambigua- tion. In Proceedings of CoLing 2004, pages 303– Lichte, Timm. 2007. An MCTAG with tuples for co- 309, Geneva, Switzerland. herent constructions in German. In Proceedings of the 12th Conference on , Dublin, Boullier, Pierre. 1998. Proposal for a natural lan- Ireland. guage processing syntactic backbone. Rapport de Recherche 3342, INRIA. Perrier, Guy. 2000. Interaction grammars. In Proceed- ings of CoLing 2000, pages 600–606, Saarbruecken, Boullier, Pierre. 1999. On TAG and Multicomponent Germany. TAG Parsing. Rapport de Recherche 3668, INRIA. Søgaard, Anders. 2007. Complexity, expressivity and logic of linguistic theories. Ph.D. thesis, University Boullier, Pierre. 2000. Range concatenation gram- of Copenhagen, Copenhagen, Denmark. mars. In Proceedings of IWPT 2000, pages 53–64, Trento, Italy. Villemonte de la Clergerie, Eric.´ 2005. DyALog: a tabular logic programming based environment for Crabb´e, Benoit. 2005. Grammatical development with NLP. In Proceedings of the workshop on Constraint XMG. In Proceedings of LACL 05, pages 84–100, Satisfaction for Language Processing (CSLP 2005), Bordeaux, France. pages 18–33, Barcelona, Spain. Duchier, Denys, Joseph Le Roux, and Yannick Par- XTAG-Research-Group. 2001. A lexicalized tree mentier. 2004. The Metagrammar Compiler: An adjoining grammar for english. Technical Re- NLP Application with a Multi-paradigm Architec- port IRCS-01-03, IRCS, University of Pennsylva- ture. In Proceedings of MOZ’2004, pages 175–187, nia. Available at http://www.cis.upenn. Charleroi, Belgium. edu/˜xtag/gramrelease.html.