Constraint Grammar parsing with left and right sequential finite transducers

Mans Hulden
University of Helsinki
[email protected]

Proceedings of the 9th International Workshop on Finite State Methods and Natural Language Processing, pages 39–47, Blois (France), July 12-15, 2011. © 2011 Association for Computational Linguistics

Abstract

We propose an approach to parsing Constraint Grammars using finite-state transducers and report on a compiler that converts Constraint Grammar rules into transducer representations. The resulting transducers are further optimized by conversion to left and right sequential transducers. Using the method, we show that we can improve on the worst-case asymptotic bound of Constraint Grammar parsing from cubic to quadratic in the length of input sentences.

1 Introduction

The Constraint Grammar (CG) paradigm (Karlsson, 1990) is a popular formalism for performing part-of-speech disambiguation, surface syntactic tagging, and certain forms of dependency analysis. A CG is a collection of hand-written disambiguation rules for part-of-speech or syntactic functions. The popularity of CGs is explained by a few factors. They typically achieve quite high F-measures on unrestricted text, especially for free word-order languages (Chanod and Tapanainen, 1995; Samuelsson and Voutilainen, 1997). Constraint Grammars can also be developed by linguists rather quickly, even for languages that have only meager resources available as regards tagged or parsed corpora, although it is hard to come by exact measures of how much effort development requires. One drawback to using CG, however, is that applying one to disambiguate input text tends to be very slow: for example, the Apertium project (Forcada et al., 2009), which offers the option of using both n-gram models and CG (by way of the vislcg3 compiler (Bick, 2000)), reports that using n-gram models currently results in ten times faster operation, although at the cost of a loss in accuracy.

In this paper, we describe a process of compiling individual CG rules into finite-state transducers (FSTs) that perform the corresponding disambiguation task on an ambiguous input sentence. Using this approach, we can improve the worst-case running time of a CG parser to quadratic in the length of a sentence, down from the cubic time requirement reported earlier (Tapanainen, 1999). The method presented here implements faithfully all the operations allowed in the CG-2 system documented in Tapanainen (1996). The same approach can be used for various extensions and variants of the Constraint Grammar paradigm.

The idea of representing CG rules as FSTs has been suggested before (Karttunen, 1998), but to our knowledge this implementation represents the first time the idea has been tried in practice.¹ We also show that after compiling a collection of CG rules into their equivalent FSTs, the individual transducers can further be converted into left and right sequential transducers, which greatly improves the speed of application of a rule.

In the following, we give a brief overview of the CG formalism, discuss previous work and CG parsers, provide an account of our method, and finally report on some practical experiments in compiling large-scale grammars into FSTs with our CG-rule-to-transducer compiler.

¹ However, Peltonen (2011) has recently implemented a subset of CG-2 as FSTs using a different method.

2 Constraint Grammar parsers

A Constraint Grammar parser occupies the central role of a system in the CG framework. A CG system is usually intended to produce part-of-speech tagging and surface syntactic tagging from unrestricted text. Generally, the text to be processed is first tokenized and subjected to morphological analysis, possibly by external tools, producing an output where words are marked with ambiguous, alternative readings. This output is then passed as input to a CG parser component. Figure 1 (left) shows an example of intended input to a CG parser where each indented line following a word-form represents an alternative morphological and surface syntactic reading of that word-form; an entire group of alternative readings, such as the five readings for the word people in the figure, is called a cohort. Figure 1 (right) shows the desired output of a CG disambiguator: each cohort has been reduced to contain only one reading.

Input (left):

    "<Business>"
        "business" <*> N NOM SG
    "<people>"
        "people" N NOM SG/PL
        "people" V PRES -SG3 VFIN
        "people" V IMP VFIN
        "people" V SUBJUNCTIVE VFIN
        "people" V INF
    "<can>"
        "can" V AUXMOD Pres VFIN
    "<play>"
        "play" N NOM SG
        "play" V PRES -SG3 VFIN
        "play" V IMP VFIN
        "play" V SUBJUNCTIVE VFIN
        "play" V INF
    "<a>"
        "a" <Indef> DET CENTRAL ART SG
    "<role>"
        "role" <Count> N NOM SG
    "<as>"
        "as" ADV AD-A>
        "as" <**CLB> CS
        "as" PREP
    "<counselors>"
        "counselor" <DER:or> <Count> N NOM PL
    "<and>"
        "and" CC
    "<teachers>"
        "teacher" <DER:er> <Count> N NOM PL
    "<.>"
        "." PUNCT Pun

Output (right):

    "<Business>"
        "business" <*> N NOM SG
    "<people>"
        "people" N NOM SG/PL
    "<can>"
        "can" V AUXMOD Pres VFIN
    "<play>"
        "play" V INF
    "<a>"
        "a" <Indef> DET CENTRAL ART SG
    "<role>"
        "role" <Count> N NOM SG
    "<as>"
        "as" PREP
    "<counselors>"
        "counselor" <DER:or> <Count> N NOM PL
    "<and>"
        "and" CC
    "<teachers>"
        "teacher" <DER:er> <Count> N NOM PL
    "<.>"
        "." PUNCT Pun

Figure 1: Example input (left) and output (right) from a Constraint Grammar disambiguator.

2.1 Constraint grammar rules

A CG parser operates by removing readings, or by selecting readings (removing the others) according to a set of CG rules. In its standard form there exist only these two types of rules (SELECT and REMOVE). How the rules operate is further conditioned by constraints that dictate in which environment a rule is triggered. A simple CG rule such as:

    REMOVE (V) IF (NOT *-1 sub-cl-mark) (1C (VFIN)) ;

would remove all readings that contain the tag V, if (a) there is no subordinate clause mark anywhere to the left (indicated by the rule scope NOT *-1), and (b) the next cohort to the right contains the tag VFIN in all its readings (signaled by 1C (VFIN)). Such a rule would, for instance, disambiguate the word people in the example sentence in Figure 1, removing all other readings except the noun reading. Rules can also refer to the word-forms or the lemmas in their environments. Traditionally, the word-forms are enclosed in angle brackets and quotation marks while the lemmas are only quoted (as in "<counselors>" vs. "counselor" in Figure 1).

In the example above, only morphological tags are being used, but the same formalism of constraints is often used to disambiguate additional, syntactically motivated tags as well, including tags that mark phrases and dependencies (Tapanainen, 1999; Bick, 2000). Additional features in the rule formalism include LINK contexts, Boolean operations, and BARRIER specifications. For example, a more complete rule such as:

    "<word>" REMOVE (X) IF (*1 A BARRIER C LINK *1 B BARRIER C);

would remove the tag X for the word-form word if the first tag A were followed by a B somewhere to the right, and there was no C before the B, except if the first A-tagged reading also contained C.

It is also possible to add and modify tags to cohorts using ADD and MAP operations, which work exactly as the SELECT and REMOVE operations as regards the contextual target specification.

2.2 Parser operation

Given a collection of CG rules, the job of the parser is to apply each rule to the set of input cohorts representing an ambiguous sentence as in Figure 1, and remove or select readings as the rule dictates. The formalism specifies no particular rule ordering per se, and different implementations of the CG formalism apply rules in varying orders (Bick, 2000). In this respect, it is up to the grammar writer to design the rules so that they operate correctly no matter in what order they are called upon. The parser iterates rule application and removes readings until no rule can perform any further disambiguation, or until each cohort contains only one reading. Naturally, since no rule order is explicit, most parser implementations (Tapanainen, 1996; Bick, 2000) tend to use complex techniques to predict if a certain rule can apply at all to avoid the costly process of checking each reading and its respective contexts in an input sentence against a rule for possible removal or selection.

2.3 Computational complexity

Tapanainen (1999) gives the following complexity analysis for his CG-2 parsing system. Assume that a sentence of length $n$ contains maximally $k$ different readings of a token, and is to be disambiguated by a grammar consisting of $G$ rules. Then, testing whether to keep or discard a reading with respect to a single rule can be done in $O(nk)$, with respect to all rules, in $O(Gnk)$, and with respect to all rules and all tokens in $O(n^2Gk)$. Now, in the worst case, applying all rules to all alternative readings only results in the discarding of a single reading. Hence, the process must in some cases be repeated $n(k-1)$ times, yielding a total complexity of $O(n^3Gk^2)$. As mentioned above there are various heuristics one can use to avoid blindly testing rules against readings where they cannot apply, but none that guarantee a lower complexity.

3 Related work

... linking constraints; secondly, TBL rules target tags, not words, while CG allows for rules to target any mix of both; thirdly, TBL rules only replace single tags with other single tags and do not remove tags from sets of alternative tags.² Additionally, Koskenniemi (1990); Koskenniemi et al.
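
As an illustration of the baseline parser operation described in Section 2.2 (repeated rule application until nothing changes), the following is a minimal Python sketch. It is not the FST-based method of this paper, and it is not how CG-2 or vislcg3 are implemented. The names RemoveRule, example_context, apply_rule and disambiguate are hypothetical, the context test is a simplified reading of (NOT *-1 sub-cl-mark) (1C (VFIN)), and the sketch assumes that a rule never deletes the last remaining reading of a cohort.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List

Reading = FrozenSet[str]   # one reading = a set of tags
Cohort = List[Reading]     # all remaining readings of one word-form
Sentence = List[Cohort]


@dataclass
class RemoveRule:
    """Hypothetical stand-in for a rule of the form REMOVE (target) IF (context)."""
    target: str
    context: Callable[[Sentence, int], bool]


def example_context(sentence: Sentence, i: int) -> bool:
    """Simplified reading of (NOT *-1 sub-cl-mark) (1C (VFIN)) at cohort i."""
    # (a) no reading anywhere to the left carries the tag sub-cl-mark
    no_mark_left = all("sub-cl-mark" not in r for c in sentence[:i] for r in c)
    # (b) a next cohort exists and every one of its readings carries VFIN
    next_all_vfin = i + 1 < len(sentence) and all(
        "VFIN" in r for r in sentence[i + 1]
    )
    return no_mark_left and next_all_vfin


def apply_rule(sentence: Sentence, rule: RemoveRule) -> bool:
    """Apply one rule to every ambiguous cohort; report whether anything changed."""
    changed = False
    for i, cohort in enumerate(sentence):
        if len(cohort) > 1 and rule.context(sentence, i):
            kept = [r for r in cohort if rule.target not in r]
            # Assumption: never delete the last remaining reading of a cohort.
            if kept and len(kept) < len(cohort):
                sentence[i] = kept
                changed = True
    return changed


def disambiguate(sentence: Sentence, rules: List[RemoveRule]) -> int:
    """Iterate over all rules until a full pass removes nothing; return pass count."""
    passes = 0
    while True:
        passes += 1
        changed = False
        for rule in rules:
            changed = apply_rule(sentence, rule) or changed
        if not changed:
            return passes


def r(*tags: str) -> Reading:
    return frozenset(tags)


# First three cohorts of Figure 1: Business, people, can.
sentence: Sentence = [
    [r("N", "NOM", "SG")],
    [
        r("N", "NOM", "SG/PL"),
        r("V", "PRES", "-SG3", "VFIN"),
        r("V", "IMP", "VFIN"),
        r("V", "SUBJUNCTIVE", "VFIN"),
        r("V", "INF"),
    ],
    [r("V", "AUXMOD", "Pres", "VFIN")],
]
passes = disambiguate(sentence, [RemoveRule("V", example_context)])
```

In this small example the cohort for people is reduced to its noun reading, as the text of Section 2.1 describes. The outer loop in disambiguate is the repetition that Section 2.3 bounds by $n(k-1)$ passes in the worst case, since a full pass over all rules and tokens may discard only a single reading.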
