Multiset-Valued Linear Index Grammars: Imposing Dominance Constraints on Derivations
Total Page:16
File Type:pdf, Size:1020Kb
Multiset-Valued Linear Index Grammars: Imposing Dominance Constraints on Derivations Owen Rambow Univesit@ Paris 7 UFR Linguistique, TALANA Case 7003, 2, Place Jussieu F-75251 Paris Cedex 05, France rambow©linguist, j ussieu, fr Abstract While LIG-equivalent formalisms have been shown to This paper defines multiset-valued linear index gram- provide adequate formal power for a wide range of lin- mar and unordered vector grammar with dominance guistic phenomena (including the aforementioned Swiss links. The former models certain uses of multiset- German construction), the need for other mathemati- valued feature structures in unification-based for- cal formalisms has arisen in several unrelated areas. In malisms, while the latter is motivated by word order this paper, we discuss three such cases. First, captur- variation and by "quasi-trees", a generalization of trees. ing several semantic and syntactic issues in unification- The two formalisms are weakly equivalent, and an im- based formalisms leads to the use of multiset-valued portant subset is at most context-sensitive and polyno- feature structures. Second, word order facts from lan- mially parsable. guages such as German, Russian, or Turkish cannot be derived by LIG-equivalent formalisms. Third, a gener- alization of trees to "quasi-trees" (Vijay-Shanker, 1992) Introduction in the spirit of D-Theory (Marcus et al., 1983) leads Early attempts to use context-free grammars (CFGs) as to the definition of a new formal system. In this pa- a mathematical model for natural language syntax have per, we introduce two new equivalent mathematical for- largely been abandoned; it has been shown that (un- malisms which provide adequate descriptions for these der standard assumptions concerning the recursive na- three phenomena. ture of clausal embedding) the cross-serial dependencies The paper is structured as follows. First, we present found in Swiss German cannot be generated by a CFG the three phenomena in more detail. We then introduce (Shieber, 1985). Several mathematical models have multiset-valued LIG and present some formal proper- been proposed which extend the formal power of CFGs, ties. Thereafter, we introduce a second rewriting sys- while still maintaining the formal properties that make tem and show that it is weakly equivalent to the LIG CFGs attractive formalisms for formal and computa- variant. We then briefly mention some related for- tional linguists, in particular, polynomial parsability malisms. We conclude with a brief summary. and restricted weak generative capacity. These mathe- matical models include tree adjoining grammar (TAG) Three Problems for LIG-Equivalent (Joshi et al., 1975; Joshi, 1985), head grammar (Pollard, 1984), combinatory categorial grammar (CCG) (Steed- Formalisms man, 1985), and linear index grammar (LIG) (Gaz- The three problems we present are of a rather differ- dar, 1988). These formalisms have been shown to be ent nature. The first arises from the way a linguis- weakly equivalent to each other (Vijay-Shanker et al., tic problem is treated in a specific type of framework 1987; Vijay-Shanker and Weir, 1994); we will refer to (unification-based formalisms). The second problem them as "LIG-equivalent formalisms". LIG is a vari- derives directly from linguistic data. The third prob- ant of index grammar (IG) (Aho, 1968). Like CFG, IG lem is a formalism which has been motivated on in- is a context-free string rewriting system, except that dependent, methodological grounds, but whose formal the nonterminal symbols in a CFG are augmented with properties are unknown. stacks of index symbols. The rewrite rules push or pop indices from the index stack. In an IG, the index stack Multiset-Valued Feature Structures is copied to all nonterminal symbols on the right-hand HPSG (Pollard and Sag, 1987; Pollard and Sag, 1994) side of a rule. In a LIG, the stack is copied to exactly uses typed feature structures as its formal basis, which one right-hand side nonterminal. 1 are Turing-equivalent. However, it is not necessarily 1Note that a LIG is not an IG that is linear (i.e., whose side), but rather, it is a context-free grammar with linear productions have at most one nonterminal on the right-hand indices (i.e., the indices are never copied). 263 the case that the full power of the system is used in (2) ... dab [dem Kunden]i [den Kfihlschrank]j the linguistic analyses that are expressed in it. HPSG ... that the clientDAW the refrigeratorAcc analyses include information about constituent struc- bisher noch niemand ti [[tj ZU reparieren] ture which can be represented as a context-free phrase- so far yet no-oneNoM to repair structure tree. In addition, various mechanisms have zu versuchen] versprochen hat been proposed to handle certain linguistic phenomena to try promised has that relate two nodes within this tree. One of these ... that so-far, no-one yet has promised the client to is a multiset-valued feature that is passed along the repair the refrigerator phrase-structure tree from daughter node to mother node. Multiset-valued features have been proposed for Similar data has been observed in the literature for the SLASH feature which handles wh-dependencies (Pol- other languages, for example for Finnish by Karttunen lard and Sag, 1994, Chapter 4), and for certain semantic (1989). Becker et al. (1991) argue that a simple TAG purposes, including the representation of stored quan- (and the other LIG-equivalent formalisms) cannot de- tifiers in a mechanism similar to Cooper-storage. An- rive the full range of scrambled sentences. Rambow and other use may be the representation of anti-coreference Satta (1994) propose the use of unordered vector gram- constraints arising from Principle C of Binding Theory mar (UVG) to model the data. In UVG (Cremers and (be it that of (Chomsky, 1981) or of Pollard and Sag Mayer, 1973), several context-free string rewriting rules (1992)). are grouped into vectors, as for verspricht 'promises': It is desirable to be able to assess the formal power (3) ((S --+ NPnom VP), (VP -4 NPdat VP), of such a system, for both theoretical and practical (VP --~ Sinf V), (V ~ verspricht) ) reasons. Theoretically, it would be interesting if it turned out that the linguistic principles formulated in During a derivation, rules from a vector can be ap- HPSG naturally lead to certain restricted uses of the plied in any order, and rules from different vectors call unification-based formalism. Clearly this would repre- be interleaved, but at the end, all rules from an instance sent an important insight into the nature of grammat- of a vector must have been used in the derivation. By ical competence. On the practical side, formal equiv- varying the order in which rules from different vectors alences can guide the building of applications such as are applied, we can derive different word orders. Ob- parsers for existing HPSG grammars. For example, it serve that the vector in (3) contains exactly one ter- has been proposed that HPSG grammars can be "com- minal symbol (the verb); grammars in which every el- piled" into TAGs in order to obtain a computationally ementary structure (vector in UVG, tree in TAG, rule more tractable system (Kasper, 1992), thus sidestep- in CFG) contains at least one terminal symbol we will ping the issue of building parsers for HPSG directly. call lexicalized. However, LIG-equivalent formalisms cannot serve as Languages generated by UVG are known to be targets for compilations in cases in which HPSG uses context-sensitive and semilinear (Cremers and Mayer, multiset-valued feature structures. 1974) and polynomially parsable (Satta, 1993). How- ever, they are not adequate for modeling natural lan- guage syntax. In the following example, (4a) is out since Word Order Variation there is no analysis in which the moved NP c-commands Becket et al. (1991) discuss scrambling, which is the its governing verb, as is the case in (4b). permutation of verbal arguments in languages such as (4) a. *... dab niemand [dem Kundeu] [ti German, Korean, Japanese, Hindi, Russian, and Turk- ... that no-onesoM the clientDAT ish. If there are embedded clauses, scrambling in many ZU versuchen] [den Kiihlschrank]j versprochen languages can affect arguments of more than one verb to try the refrigeratorAcc promised ("long-distance" scrambling). hat [tj zu reparieren]i (1) ... dab [den Kiihlschrank]i bisher noch has to repair ... that the refrigeratorAcc so far yet b. ? ... daft niemand [dem Kunden] [den niemand [ti zu reparieren] versprochen hat Kiihlschrank]j [ti zu versuchen] versprochen hat no-onesoM to repair promised has [tjzu reparieren]i ... that so far, no-one has promised to repair the re- What is needed is an additional mechanism that en- frigerator forces a dominance relation between the sister node of Scrambling in German is "doubly unbounded" in the an argument and its governing verb. sense that there is no bound on the number of clause Quasi-Trees boundaries over which an element can scramble, and an element scrambled (long-distance or not) from one Vijay-Shanker (1992) introduces "quasi-trees" as a gen- clause does not preclude the scrambling of an element eralization of trees. He starts from the observation from another clause: that the traditional definition of tree adjoining gram- 264 $ mar (TAG) is incompatible with a unification-based ap- proach because the trees of a TAG start out as fully specified objects, which are later modified; in particu- NP S lar, immediate dominance relations in a tree need not I hold after another tree is adjoined into it. In order to ar- s rive at a definition that is compatible with a unification- based approach, he makes three minimal assumptions NP VP 1 about the nature of the objects used for the representa- I tion of natural language syntax. The first assumption vP (left implicit) is that these objects represent phrase- structure.