<<

The Combinatory Morphemic

Cem Bozsahin∗ Middle East Technical University

Grammars that expect from the lexicon may be at odds with the transparent projection of syntactic and semantic scope relations of smaller units. We propose a morphosyntactic framework based on Combinatory Categorial that provides flexible constituency, flexible category consistency, and lexical projection of morphosyntactic properties and attachment to grammar in order to establish a morphemic grammar-lexicon. These mechanisms provide enough expressive power in the lexicon to formulate semantically transparent specifications without the necessity to confine structure forming to words and phrases. For instance, bound as lexical items can have phrasal scope or scope, independent of their attachment characteristics but consistent with their . The controls can be attuned in the lexicon to -particular properties. The result is a transparent interface of inflectional , , and semantics. We present a computational system and show the application of the framework to English and Turkish.

1. Introduction

The study presented in this article is concerned with the integrated representation and processing of inflectional morphology, syntax, and semantics in a unified grammar ar- chitecture. An important issue in such integration is mismatches in morphological, syntactic, and semantic bracketings. The problem was first noted in derivational mor- phology. Williams (1981) provided examples from English; the semantic bracketings in (1a–2a) are in conflict with the morphological bracketings in (1b–2b).

(1) a. -ity b. hydro hydro electric electric -ity

(2) a. -ing b. G¨odel G¨odel number number -ing

If the problem were confined to derivational morphology, we could avoid it by making derivational morphology part of the lexicon that does not interact with gram- mar. But this is not the case. Mismatches in morphosyntactic and semantic bracketing

∗ Computer Engineering and Cognitive Science, Middle East Technical University, 06531 Ankara, Turkey. E-mail: [email protected].

c 2002 Association for Computational Computational Linguistics Volume 28, Number 2 also abound. This article addresses such problems and their resolution in a computa- tional system.1 M¨uller (1999, page 401) exemplifies the scope problem in German prefixes. (3a) is in conflict with the bracketing required for the semantics of the conjunct (3b).

(3) a. Wenn [Ihr Lust] und [noch nichts anderes vor-]habt, if you pleasure and yet nothing else intend

k¨onnen wir sie ja vom Flughafen abholen can we them PARTICLE from.the airport pick up ’If you feel like it and have nothing else planned, we can pick them up at the airport.’ b. Ihr Lust habt UND noch nichts anderes vorhabt

Similar problems can be observed in Turkish inflectional suffixes. In the coordi- nation of tensed clauses, the tense attaches to the of the rightmost conjunct (4a) but applies to all conjuncts (4b). Delayed affixation appears to apply to all nominal inflections (4c–e).

(4) a. Zorunlu deprem sigortası [y¨ur¨url¨u˘ge girmi¸s] ama mandatory earthquake insurance effect enter-ASP but

[tam anlamıyla uygulanamamı¸s]-tı exactly apply-NEG-ASP-TENSE ’Mandatory earthquake insurance had gone into effect, but it had not been enforced properly.’ b. y¨ur¨url¨u˘ge girmi¸s-ti ama tam anlamıyla uygulanamamı¸s-tı

c. Adam-ın [araba ve ev]-i man-GEN car and house-POSS ’the man’s house and car’

d. Araba-yı [adam vecocuk ¸ ]-lar-a g¨oster-di-m Car-ACC man and child-PLU-DAT show-TENSE-PERS1 ’(I) showed the car to the men and the children.’

e. Araba-yı sen-in [dost ve tanıdık]-lar-ın-a g¨oster-di-m Car-ACC you-GEN friend and acq.-PLU-POSS-DAT showed ’(I) showed the car to the your friends and acquaintances.’

1 Our use of the term morphosyntax needs some clarification. Some authors, (e.g., Jackendoff 1997), take it to mean the syntax of words, in contrast to the syntax of phrases. By morphosyntax we mean those aspects of morphology and syntax that collectively contribute to grammatical meaning composition. This is more in line with the inflectional-morphology-is-syntax view. In this respect, we will not address problems related to derivational morphology; its semantics is notoriously noncompositional and does not interact with grammatical meaning. Moreover, without a semantically powerful lexicon such as Pustejovsky’s (1991), even the most productive fragment of derivational morphology is hard to deal with (Sehitoglu and Bozsahin 1999).

146 Bozsahin The Combinatory Morphemic Lexicon

Phrasal scope of inflection can be seen in subordination and relativization as well. In (5a), the entire nominalized clause marked with the accusative case is the object of want. In (5b), the relative participle applies to the relative clause, which lacks an object. The object’s case is governed by the subordinate verb, whose case requirements might differ from that of the matrix verb (5c). As we show later in this section, the coindexing mechanisms in word-based unification accounts of unbounded extraction face a conflict between the local and the nonlocal behavior of the relativized , mainly due to applying the relative participle -di˘g-i to the verbal stem ver rather than the entire relative clause. A lexical entry for -di˘g-i would resolve the conflict and capture the fact that it applies to nonsubjects uniformly.

(5) a. Can [Ay¸se’nin kitab-ı oku-ma-sı]-nı iste-di C.NOM A.-GEN book-ACC read-INF-AGR-ACC want-TENSE ’Can wanted Ay¸se to read the book.’ lit. ’Can wanted Ay¸se’s-reading-the-book.’ b. Ben [ Mehmet’incocu˘ ¸ g-a/*-u ver ]-di˘g-i kitab-ı oku-du-m I.NOM M-GEN child-DAT/*ACC give-REL.OP book-ACC read-TENSE-PERS1 ’I read the book that Mehmet gave to the child.’ c. Ben [ Mehmet’in kitab-ı ver ]-di˘g-icocu˘ ¸ g-u/*-a g¨or-d¨u-m I.NOM M-GEN book-ACC give-REL.OP child-ACC/*DAT see-TENSE-PERS1 ’I saw the child to whom Mehmet gave the book.’

The morphological/phrasal scope conflict of affixes is not particular to morpho- logically rich . Semantic composition of affixes in morphologically simpler languages poses problems with word (narrow) scope of inflections. For instance, fake trucks needs the semantics (plu(fake truck)), which corresponds to the surface brack- eting [fake truck]-s, because it denotes the nonempty nonsingleton sets of things that are not trucks but fake trucks (Carpenter 1997). Four trucks, on the other hand, has the semantics (four(plu truck)), which corresponds to four [truck]-s, because it denotes the subset of nonempty nonsingleton sets of trucks with four members. The status of inflectional morphology among theories of grammar is far from settled, but, starting with Chomsky (1970), there seems to be an agreement that deriva- tional morphology is internal to the lexicon. Lexical Functional Grammar (LFG) (Bresnan 1995) and earlier Government and Binding (GB) proposals e.g. (Anderson 1982) consider inflectional morphology to be part of syntax, but it has been del- egated to the lexicon in Head-Driven Phase Structure Grammar (HPSG) (Pollard and Sag 1994, page 35) and in the (Chomsky 1995, page 195). The representational status of the is even less clear. Parallel develop- ments in computational studies of HPSG propose lexical rules to model inflectional morphology (Carpenter and Penn 1994). Computational models of LFG (Tomita 1988) and GB (Johnson 1988; Fong 1991), on the other hand, have been noncommittal re- garding inflectional morphology. Finally, morphosyntactic aspects have always been a concern in Categorial Grammar (CG) (e.g., Bach 1983; Carpenter 1992; Dowty 1979; Heylen 1997; Hoeksema 1985; Karttunen 1989; Moortgat 1988b; Whitelock 1988), but the issues of constraining the morphosyntactic derivations and re- solving the apparent mismatches have been relatively untouched in computational studies. We briefly look at Phrase Structure (PSGs), HPSG, and Multimodal CGs (MCGs) to see how word-based alternatives for morphosyntax would deal with

147 Computational Linguistics Volume 28, Number 2 the issues raised so far. For convenience, we call a grammar that expects words from the lexicon a lexemic grammar and a grammar that expects morphemes a morphemic grammar. A lexemic PSG provides a lexical interface for inflected words (X0s) such 2 that a regular grammar subcomponent handles lexical insertion at X0. In (4d), the right conjunct cocuk-lar-a¸ is analyzed as N0 → cocuk-PLU-DAT¸ (or N0 → N0 -DAT, N0 → N0 -PLU, N0 → Stem, as a regular grammar). Assuming a syncategorematic coordination schema, that is, X → X and X, the N0 in the left and right conjuncts of this example would not be of the same type. Revising the coordination schema such that only the root features coordinate would not be a solution either. In (4e), the relation of possession that is marked on the right conjunct must be carried over to the left conjunct as well. What is required for these examples is that the syntac- tic constituent X in the schema be analyzed as X-PLU(-POSS)-DAT, after N0 and N0 coordination. What we need then is not a lexemic but a morphemic organization in which brack- eting of free and bound morphemes is regulated in syntax. The lexicon, of course, must now supply the ingredients of a morphosyntactic calculus. This leads to a the- ory in which semantic composition parallels morphosyntactic combination by virtue of bound morphemes’ being able to pick their domains just like words (above X0, if needed). A comparison of English and Turkish in this regard is noteworthy. The English relative that/whom and the Turkish relative participle -di˘g-i would have exactly the same semantics when the latter is granted a representational status in the lexicon (see Section 6). Furthermore, rule-based PSGs project a rigid notion of surface constituency. Steed- man (2000) argued, however, that syntactic processes such as identical element dele- tion under coordination call for flexible constituency, such as SO (subject-object) in the SVO & SO gapping pattern of English and SV (subject-verb) constituency in the OSV & SV pattern of Turkish. Nontraditional constituents are also needed in specifying semantically transparent constituency of words, affixes, clitics, and phrases. Constraint-based PSGs such as HPSG appeal to coindexation and feature passing via unification, rather than movement, to deal with such processes. HPSG also makes the commitment that inflectional morphology is internal to the lexicon, handled either by lexical rules (Pollard and Sag 1994) or by lexical inheritance (Miller and Sag 1997). We look at (5c) to highlight a problem with the stem-and-inflections view. As words en- ter syntax fully inflected, the sign of the verb ver-di˘g-i in the relative clause (5c) would be as in (6a), in which the SUBCAT list of the verb stem is, as specified in the lexi- cal entry for ver, unsaturated. The participle adds coindexation in MOD|···|INDEX. The HPSG analysis of this example would be as in Figure 1. Although passing the agreement features of the head separately (Sehitoglu 1996) solves the case problem alluded to in (5c), however, structure sharing of the NP dat with the SLASH, INDEX, and CONTENT features of ver-di˘g-i is needed for semantics (GIVEE), but this conflicts with the head features of the topmost NP acc in the tree. 
The relative participle as a lexical entry (e.g., (6b)) would resolve the problem with because its SUBCAT list is empty (like the relative that in English), hence there would be no indirect dependence of the nonlocal SLASH feature and the local SUBCAT feature via semantics (CONTENT). Such morphemic alternatives are not considered in HPSG, however, and require a significant revision in the theory. Furthermore, HPSG’s lexical

2 But see Creider, Hankamer, and Wood (1995), which argues that the morphotactics of human languages is not regular but linear context free.

148 Bozsahin The Combinatory Morphemic Lexicon assignment for trace introduces phonologically null elements into the lexicon, which, as we show later, is not necessary.

    (6) a. ver-di˘g-i :=    PERSON third   AGR    HEAD NUMBER sing       CAT  CASE dat        SUBCAT < 3 NP[gen], 2 NP[acc], 1 NP[dat]>    | | | | 1  LOCAL  MOD MODSYN LOCAL CONT INDEX       RELN give    GIVER 3    CONTENT     1   GIVEE   GIFT 2  NONLOCAL | TO-BIND | SLASH 1    b. -di˘g-i := HEAD noun acc or dat  CAT  LOCAL  SUBCAT <>     1  CONTENT npro INDEX NONLOCAL | INHER | SLASH 1

MCGs (Hepple 1990a; Morrill 1994; Moortgat and Oehrle 1994) allow different modes of combination in the grammar. In addition to binary modes such as wrapping and commutative operations, unary modalities provide finer control over the cate- gories. Heylen (1997, 1999) uses unary modalities as a way of regulating morphosyn- tactic features such as case, number, and person for economy in lexical assignments. ✷ ✷ ✷ ✷ ✷ For instance, Frau has the category case fem sg 3p declN, which underspecifies it for case and declension. Underspecification is dealt with in the grammar using inclusion postulates (e.g., (7)). The interaction of different modalities is regulated by distribution postulates.

(7) ✷caseΓ  X ✷caseΓ  X

✷nomΓ  X ✷accΓ  X

Lexical assignments to inflected words carry unary modalities: boys has the type ✷ ✷ plN, in contrast to sgN for boy. Although such regulation of inflectional features successfully mediates, for example, subject-verb agreement or NP-internal case agree- ment (as in German), it is essentially word-based, because type assignments are to inflected forms; morphemes do not carry types. This reliance on word types neces- sitates a lexical rule–based approach to some morphosyntactic processes that create indefinitely long words, such as ki-relativization in Turkish (see Section 6.5). But lexical rules for such processes risk nontermination (Sehitoglu and Bozsahin 1999). Our main point of departure from MCG accounts is the morphemic versus lexemic nature of the lexicon: The morphosyntactic and attachment modalities originate from the lexicon; they are not properties of the grammar (we elaborate more on this later). This paves the way to the morphemic lexicon by licensing type assignments to units smaller than words. Besides problems with lexical rules, the automata-theoretic power of MCGs is problematic: Unrestricted use of structural modalities and postulates leads to Tur- ing completeness (Carpenter 1999). Indeed, one of the identifiable fragments of Mul-

149 Computational Linguistics Volume 28, Number 2

Figure 1 HPSG analysis of (5c).

timodal languages that is computationally tractable is Combinatory Categorial lan- guages (Kruijff and Baldridge 2000), which we adopt as the basis for the framework presented here. We propose a morphosyntactic Combinatory Categorial Grammar (CCG) in which the grammar and the morphemic lexicon refer to morphosyntactic types rather than syntactic types. We first introduce the syntactic CCG in Section 2. Morphosyntactic CCG is described in Section 3. In Section 4, we look at the compu- tational aspects of the framework. We then show its realization for some aspects of English (Section 5) and Turkish (Section 6).

150 Bozsahin The Combinatory Morphemic Lexicon

2. Syntactic Types

CG is a theory of grammar in which the form-meaning relation is conceived as a transparent correspondence between the surface-syntactic and semantic combinatorics (Jacobson 1996). A CCG sign can be represented as a triplet π − σ: µ, where π is the prosodic element, σ is its syntactic type, and µ its semantic type. For instance, the lexical assignment for read is (8).3

(8) read := read − (S\NP)/NP: λx.λy.read xy

Definition (Syntactic Types)

• The set of basic syntactic categories: As = {N,NP,S,S−t,S+t}

• The set of complex syntactic categories: Bs

— As ⊆Bs — If X ∈Bs and Y ∈Bs, then X\Y and X/Y ∈Bs

The classical Ajdukiewicz/Bar-Hillel (AB) CG is weakly equivalent to Context- Free Grammars (Bar-Hillel, Gaifman, and Shamir 1960). It has function application rules, defined originally in a nondirectional fashion. The directional variants and their associated semantics are as follows:

(9) Forward Application (>):4 X/Y: fY: a ⇒ X: fa Backward Application (<): Y: aX\Y: f ⇒ X: fa

CCG (Steedman 1985, 1987, 1988; Szabolcsi 1983, 1987) is an extended version of AB that includes function composition (10), substitution, and type raising (11). These extensions make CCGs mildly context sensitive.

(10) Forward Composition (>B): X/Y: fY/Z: g ⇒ X/Z: λx.f (gx) Backward Composition (

(11) Forward Type Raising (>T):5 X: a ⇒ T/(T\X): λf .f [a] Backward Type Raising (

Type raising is an order-preserving operation. For instance, Lambek’s (1958) cat- egory S/(S\NP) is a positional encoding of the grammatical subject as a function

3 We take π to be the surface string for simplicity. We use the “result-first” convention for CG. For instance, transitive of English are written as (S\NP)/NP, which translates to (NP\S)/NP in the “result-on-top” convention. 4 We omit the prosodic element for ease of exposition. For instance, the complete definition of forward − − ⇒ • − • application is s1 X/Y: fs2 Y: a s1 s2 X: fa, where is prosodic combination and fa is the application of f to a. The • will play a crucial role in the of attachment later on. 5 The lambda term f [a] denotes internal one-step β-reduction of f on a. In parsing, we achieve the same effect by partial execution (Pereira and Shieber 1987). λf .f [a] is encoded as (a^F)^F in Prolog, where ˆ is lambda abstraction. We opted for the explicit f [a] notation mainly for ease of exposition (cf. the semantics of raising verbs, relative participles, etc. in Section 6). Moreover, as Pereira and Shieber noted, (a^F)^F is not a lambda term in the strict sense because a is not a variable.

151 Computational Linguistics Volume 28, Number 2 looking for a VP (= S\NP) to the right to become S. The reversal of directionality such as topicalization (e.g., This book, I recommend) requires another schema. The reversal is with respect to the position of the verb, which we shall call contraposition and formulate as in (12).6 (XP) is rightward extraction of a left constituent, both of which are marked constructions. Directionally insensitive types such as T|(T|X) cause the collapse of directionality in surface grammar (Moortgat 1988a).

(12) Leftward Contraposition (XP): X: a ⇒ S−t\(S\X): λf .f [a] S−t\(S−t\X): λf .f [a]

The semantics of contraposition depends on discourse properties as well. We leave this issue aside by (1) noting that it is related to type raising in changing the function- argument relation and (2) categorizing the sentence as S+t (topicalized) or S−t (detopi- calized), which are not discourse equivalent to S. Syntactic characterization as such also helps a discourse component do its work on syntactic derivations. CCG’s notion of interpretation is represented in the Predicate-Argument Structure (PAS). Its organization is crucial for our purposes, since the bracketing in the PAS is the arbitrator for reconciling the bracketings in morphology and syntax via proper lexical type assignments. It is the sole level of representation in CCG (Steedman 1996, page 89).7 It is the level at which the conditions on objects of interpretation, such as binding and control, are formulated. For instance, Steedman (1996) defines c-command and binding conditions A, B, and C over the PAS. The PAS also reflects the obliqueness order of the arguments:

Predicate ... Tertiary-Term Secondary-Term Primary-Term

Assuming left associativity for juxtaposition, this representation yields the brack- eting in (13) for the PAS. Having the primary argument as the outermost term is motivated by the observations on binding asymmetries between subjects and comple- ments in many languages (e.g., *Himself saw John, *heself).

(13)

3. Morphosyntactic Types

A syntactic type such as N does not discriminate morphosyntactically. A finer dis- tinction can be made as singular , plural nouns, case-marked nouns, etc. For

6 In fact, topicalization of nonperipheral arguments (This book, I would give to Mary) requires that (12) be finitely schematized over valencies, such as S, S/NP, S/PP (Steedman 1985). 7 We will not elaborate on the theoretical consequences of having this level of representation; see, for instance, Dowty (1991) and Steedman (1996).

152 Bozsahin The Combinatory Morphemic Lexicon

free (f)

n-case (c) s-person (s)

s-modal (m) s-tense (t)

s-abil (a) n-comp (m) n-poss(o) free s-neg (g) (f)

s-imp (i) n-num (n) s-tense (t) n-num (n) s-pass (p) n-base (b) n-base(b) s-base (v)

s-caus (u)

s-reflex(x) s-recip (c) s-base (v) n-relbase(l) n-root (r)

(a) (b) Figure 2 The lattice of diacritics for (a) Turkish and (b) English.

n instance, the set of number-marked nouns can be represented as ✶ N, where ✶ is a morphosyntactic modality (“equals”) and n is a diacritic (for number). Books is of type n ✶ N, but book is not. The type for books can be obtained morphosyntactically by as- n b signing -s (-PLU) the functor type ✶ N\ ✶ N, where b stands for base. A syntactic type such as N\N overgenerates. Another modality, < (“up to and equals”), allows wider domains in morphosyn- n tactic typing. For instance, < N represents the set of nouns marked on number or any other diacritic that is lower than number in a partial order (e.g., Figure 2). The inflectional paradigm of a language can be represented as a partial ordering us- ing the modalities.8 For instance, if the paradigm is Base-Number-Case, we have b n c υ( < N) ⊆ υ( < N) ⊆ υ( < N), where υ(τ) is the valuation function from the mor- phosyntactic type τ to the set of strings that have the type τ. The ✶ modality is more n c n c strict than < to provide finer control; although υ( < N) ⊆ υ( < N), υ( ✶ N) ⊆ υ( ✶ N), because a noun can be number marked but not case marked or vice versa. Also, i i υ( ✶ N) ⊆ υ( < N) for any diacritic i since, for instance, the set of nouns marked up to and including case includes case-marked, number-marked, and unmarked nouns. The lattice consistency condition is imposed on the set of diacritics to ensure category unity.9 In other words, the syntactic type X can be viewed as an abbreviation  for the morphosyntactic type < X where  is the universal upper bound. It is the

8 See Heylen (1997) on use of unary modalities for a similar purpose in lexemic MCG. 9 In a lattice L, x ≤ y (morphosyntactically, x < y) is equivalent to the consistency properties x ∧ y = x and x ∨ y = y. We use the join operator for this check, thus it suffices to have a join semilattice.

153 Computational Linguistics Volume 28, Number 2 most underspecified category of X which subsumes all morphosyntactically decorated versions of X. Figure 2 shows the lattice for English and Turkish.

Definition (Morphosyntactic Types)

•D= finite set of diacritics • Join semilattice L =(D, ≤, =)

• The set of basic morphosyntactic types: Ams.

i i — < X ∈Ams and ✶ X ∈Ams if i ∈Dand X ∈As (see definition of syntactic types for As) — (✶ corresponds to lattice condition =) — (< corresponds to lattice condition ≤)

• The set of complex morphosyntactic types: Bms

— Ams ⊆Bms — If X ∈Bms and Y ∈Bms, then X\Y and X/Y ∈Bms

For instance, the infinitive marker -ma in (14a) can be lexically specified to look a for untensed VPs—functions onto < S—to yield a complex noun base (14b), which, as a consequence of nominalization (result type N), receives case to become an argument of the matrix verb. The adjective in fake trucks can be restricted to modify unmarked Ns to get the bracketing [fake truck]-s (14c).

(14) a. Mehmet [[kitab-ı oku]-ma]-yı istiyor M.NOM book-ACC read-INF-ACC wants ’Mehmet wants to read the book.’

b a f b. -INF := ma − < N\( < S\ < NPnom): λf .f

b b c. fake := fake − < N/ < N: λx.fake x

Different attachment characteristics of words, affixes, and clitics must be factored into the prosodic domain as a counterpart of refining the morphosyntactic description. In Montague Grammar, every syntactic rule is associated with a certain mode of at- tachment, and this tradition is followed in MCG; attachment types are related with the 10 slash (e.g., /w for wrapping), which is a grammatical modality. In the present frame- work, however, attachment is projected from the lexicon to the grammar as a prosodic property of the lexical items.11 The grammar is unimodal in the sense that / and \ simply indicate the function-argument distinction in adjacent prosodic elements. The lexical projection of attachment further complements the notion of morphemic lexicon so that bound morphemes are no longer parasitic on words but have an independent

10 See Dowty (1996) and Steedman (1996) for a discussion of bringing nonconcatenative combination into grammar. 11 There is a precedent of associating attachment characteristics with the prosodic element rather than the slash in CG (Hoeksema and Janda 1988). In Hoeksema and Janda’s notation, arguments can be constrained on phonological properties and attachment. For instance, the English article a has its NP/N category spelled out as , indicating a consonantal first segment for the noun argument and concatenation to the left.

154 Bozsahin The Combinatory Morphemic Lexicon

Table 1 Attachment properties of some Turkish morphemes.

s b b uzun (long) := ◦ uzun − < N/ < N uzun yol long road ’long road’ s v f f oku (read) := ◦ oku − < S\ < NPnom\ < NPacc adam kitab-ı oku-du man book-ACC read-TENSE ’the man read the book.’ c -EMPH := ◦ de − X\X Ben de yaz-ar-ım I too write-TENSE-PERS ’I write too.’ a c o -LOC := ◦ de − < N\ < N Ben-de kalem var I-LOC pen exist ’I have a pen.’

i representational status of their own. We write ◦ s to denote the attachment modality i (affixation, syntactic concatenation, cliticization) of the prosodic element s. a Table 1 shows some lexical assignments for Turkish (e.g., the sign ◦ s − X\Y: µ characterizes a suffix). The morphosyntactic calculus of CCG is defined with the ad- dition of morphosyntactic types and attachment modalities as follows (similarly, for other combinatory rules):

i α j α ◦ − ✷1 ◦ − ✷2 (15) Forward Application (>): s1 X/ 1 Y: f s2 2 Y: a

> k k ◦ (s1 • s2) − X: fa

if α2✷1α1 in lattice L, for: ✷1, ✷2 ∈{✶, <}, α1, α2 ∈Din L, i, j, k ∈{a, s, c}, i j k ◦ ◦a ◦ i α j α ◦ − ✷1 ◦ − ✷2 Forward Composition (>B): s1 X/ 1 Y: f s2 2 Y/Z: g

>B k k ◦ (s1 • s2) − X/Z: λx.f (gx)

if α2✷1α1 in lattice L, for: ✷1, ✷2 ∈{✶, <}, α1, α2 ∈Din L, i, j, k ∈{a, s, c}, i j k ◦ ◦a ◦

α ✷ ✷1 The main functor’s argument specification ( 1 of 1 Y in (15)) determines the lattice condition in derivations.12 Hence the morphosyntactic decoration in lexical as- signments propagates its lattice condition to grammar as in α2✷1α1 (cf. Heylen [1997], in which the grammar rule imposes a fixed partial order, e.g., X/Y combines with Z if

12 This coincides with Steedman’s (1991b) observation that directionality of the main functor’s slash is also a property of the same argument. The main functor is the one whose result type determines the overall result type (i.e., X/Y in (15)).

155 Computational Linguistics Volume 28, Number 2

Z ≤ Y). This is another prerequisite that must be fulfilled for the morphemic lexicon to project the lexical specification of scope. The grammar is not fixed on the attachment modality either (unlike a lexemic grammar, which is fixed on combination of words). Hence another requirement is the m propagation of attachment to grammar. This is facilitated by the lexical types ◦ s−σ: µ, i j k where m is an attachment type. The attachment calculus ◦ ◦a ◦ in (15), which reads “attachment types i and j yield type k,” relates attachment to prosodic combination in the grammar.13 It can be attuned to language-particular properties. We can specify some prosodic properties of the attachment calculus for Turkish as follows (x´ indicates on the prosodic element x): s syntactic concatenation x´ • y´ = x´y´ a affixation x´ • y = xy´ c cliticization x´ • y = x´y´

4. Morpheme-Based Parsing

To contrast lexemic and morphemic processing, consider the Turkish example in (16a). We show some stages of the derivation to highlight prosodic combination (•) as well. Every item in the top row is a lexical entry. Allomorphs, such as that of tense, have the same category in the lexicon (16b). Vowel harmony, voicing, and other phonological restrictions are handled as constraints on the prosodic element. Constraint checking can be switched off during parsing to obtain purely morphosyntactic derivations.

(16) a. Can Ay¸se nin kitab ı oku ması nı iste di C.NOM -GEN(agr) book -ACC read -SUB1G -ACC want -TENSE

b b c o b c o v f o f c o t f  N  N  NPgen\  N  N  Nacc\  N  S\  NPnom  N\  NPgen  N\  NTV(  S\  NP) f a f a f \  NPacc \(  S\  NPnom) \(  S\  NP) . . . a iste • di− t f  S\  NPnom

a s v f f kitab • ı • oku−  S\  NPnom \  NPacc

a s a o f (kitab • ı • oku) • ması−  N\  NPgen . . .

a s a s a a t f t f f ((ay¸se • nin) • (kitab • ı • oku) • ması) • nı− (  S\  NPnom)/(  S\  NPnom\  NPacc) . . .

s a s a s a a s a t can • (ay¸se • nin • kitab • ı • oku • ması • nı) • (iste • di)−  S : want(read book ay¸se)can ’Can wanted Ay¸se to read the book.’

a t f a f b. -TENSE := ◦ dı|di|du|d¨u|tı|ti|tu|t¨u − ( < S\ < NP)\( < S\ < NP): λf .f

13 Clearly, much more needs to be done to incorporate intonation into the system. The motive for attachment types is to provide the representational ingredients on behalf of the morphemic lexicon. As one reviewer noted, CCG formulation of the syntax- interface moved from autonomous prosodic types (Steedman 1991a) to syntax-directed prosodic features (Steedman 2000b). The present proposal for attachment modality is computationally compatible with both accounts: Combinatory prosody can match prosodic types with morphosyntactic types. Prosodic features are associated with the basic categories of a syntactic type in the latter formulation, hence they become part of the featural inference that goes along with the matching of categories in the application of combinatory rules.

156 Bozsahin The Combinatory Morphemic Lexicon

The lexicalization of attachment modality helps to determine the prosodic domain of postconditions. For instance, for Turkish, vowel harmony does not apply over word a c boundaries, which can be enforced by applying it when the modality is ◦ and ◦ , but s a s c not ◦ . Voicing applies to ◦ and ◦ , but not to ◦ . The basic categories N, NP, S, S+t, and S−t carry agreement features of fixed arity (e.g., tense and person for S, S+t, and S−t, and case, number, person, and gender for N and NP). Positional encoding of such information as in Pulman (1996) allows efficient term unification for the propagation of these features.14 Term unification also handles α α α ✷1 ✷2 \ ✷3 the matching of complex categories in the CCG schema. For instance, 1 A/( 2 B 3 C) β β ✷2 \ ✷3 ∈A ✷ ✷ ✷ ∈{ ✶} combines with 4 B 5 C via (>) for B, C s,ifβ2 2 α2, β3 3 α3 ( i <, ). Apart from the matching of syntactic types and agreement, unification does no linguistic work in this framework, in contrast to structure-sharing in HPSG and slash passing in Unification CG (Calder, Klein, and Zeevat 1988). CCG is worst-case polynomially parsable (Vijay-Shanker and Weir 1993). This re- sult depends on the finite schematization of type raising and bounded composition. Assuming a maximum valence of four in the lexicon (Steedman 2000a), composition (Bn) is bounded by n ≤ 3. The refinement of the type raising schema (11) for finite schematization is shown in (17).

(17) a. Revised Forward Type Raising (>T): NP : a ⇒ T/(T\NP ): λf .f [a] b. Revised Backward Type Raising (

The finite schematization of type raising suggests that it can be delegated to the lexicon, for example, by a lexical rule that value-raises all functions onto NP to their type-raised , such as NP/N to (S/(S\NP))/N. But this move presupposes the presence of such functions in the lexicon, that is, a language with . To be transparent with respect to the lexicon, we make type raising and other unary schema (contraposition) available in the grammar. Since both are finite schemas in the revised formulation, the complexity result of Vijay-Shanker and Weir still holds. Checking the lattice condition as in (15) incurs a constant factor with a finite lattice. Type raising and composition cause the so-called spurious-ambiguity problem (Wittenburg 1987): Multiple analyses of semantically equivalent derivations are pos- sible in parsing. This is shown to be desirable from the perspective of prosody; for example, different bracketings are needed to match intonational phrasing with syn- tactic structure (Steedman 1991). From the parsing perspective, the redundancy of analyses can be controlled by (1) grammar rewriting (Wittenburg 1987), (2) checking the chart for PAS equivalence (Karttunen 1989; Komagata 1997), (3) making the proces- sor parsimonious on using long-distance compositions (Pareschi and Steedman 1987), or (4) parsing into normal forms (Eisner 1996; Hepple 1990b; Hepple and Morrill 1989; K¨onig 1989; Morrill 1999). We adopt Eisner’s method, which eliminates chains of com- positions in O(1) time via tags in the grammar, before derivations are licensed. There is a switch that can be turned off during parsing to obtain all surface bracketings.

14 Mediating agreement via unification, type subsumption, or set-valued indeterminacy has important consequences on underspecification, the domain of agreement, and the notion of “like categories” in coordination (see Johnson and Bayer 1995; Dalrymple and Kaplan 2000; Wechsler and Zlati´c 2000). Rather than providing an elaborate agreement system, we note that Pulman’s techniques provide the mechanism for implementing agreement as atomic unification, subsumption hierarchies represented as lattices, or set-valued features. The categorial ingredient of phrase-internal agreement can be provided by endotypic functors when necessary (see Sections 5 and 6).

157 Computational Linguistics Volume 28, Number 2

There is also a switch for checking the PAS equivalence, with the warning that the equivalence of two lambda expressions is undecidable. The parser is an adaptation of the Cocke-Kasami-Younger (CKY) algorithm (Aho and Ullman 1972, page 315), modified to handle unary rules as well: In the kth iteration of the CKY algorithm to build constituents of length k, the unary rules apply to the CKY − table entries T[αi, αi+k ], i = 0, 1, ..., n k; that is, k-length results of binary rules are input to potential unary constituents of length k. In practice, this allows, for instance, a nominalized clause to be type-raised after it is derived as a category of type N. The remaining combinatory schema is already in Chomsky Normal Form, as required by CKY. The finite schematization of CCG rules and constant costs incurred by the normal form and lattice checking provide a straightforward extension of CKY-style context-free parsing for CCG. Komagata (1997) claims that the average complexity of CCG parsing is O(n3) even without the finite schematization of type raising (based on the parsing of 22 sentences consisting of around 20 words, with a lexicon of 200 entries and no derivation of semantics in the grammar; a morphological analyzer provided five analyses per second to the parser). Statistical techniques developed for lexicalized grammars (e.g., Collins 1997), readily apply to CCG to improve the average parsing performance in large-scale practical applications (Hockenmaier, Bierner, and Baldridge 2000). Both Collins and Hockenmeier, Bierner, and Baldridge used section 02-21 of the Wall Street Journal Corpus of Penn Treebank for training, which contains 40,886 words (70,151 lexical entries). A recent initiative (Oflazer, et al. 2001) aims to provide such a resource of around one million words for Turkish. It encodes in the Treebank surface- syntactic relations and the morphological breakdown of words. The latter is invaluable for training morphemic grammars and . In morpheme-based parsing, lattice conditions help eliminate the permutation problem in endotypic categories. Such categories are typical of inflectional morphemes. For instance, assume that three morphemes m1, m2, and m3 have endotypic categories (say N\N), that they can appear only in this order, and that they are all optional. The  κ κ i \ i  ≤  ≤ categorization of mi as < N < N such that κi κi for all i, and κj−1 κj for j = 1, 2, 3 allows omissions (18a–b) but rules out the permutations (18c–d).15

(18) a. stem m1 m2 m3

   κ0 κ1 κ1 κ2 κ2 κ3 κ3 < N < N\ < N < N\ < N < N\ < N

<  κ1 < N because κ0 ≤ κ1

< κ 2  ≤ < N because κ1 κ2 < κ 3  ≤ < N because κ2 κ3

b. stem m3

<  κ3 < N because κ0 ≤ κ3

15 Three asterisks in the line indicate that the derivation is not licensed.

158 Bozsahin The Combinatory Morphemic Lexicon

c. *stem m2 m1 m3

<  κ2 < N because κ0 ≤ κ2 *** <  ≤  ≤  κ2 κ1 because κ1 <κ1 κ2 <κ2

d. *stem m1 m3 m2

<  κ1 < N because κ0 ≤ κ1

< κ 3  ≤ < N because κ1 κ3 *** <  ≤  ≤  κ3 κ2 because κ2 <κ2 κ3 <κ3

The lattice and its consistency condition on derivability offer varying degrees of flexibility. A lattice with only  and the relation ≤ would undo all the effects of parameterization; it would be equivalent to a syntactic grammar in which every basic  category X stands for < X. To enforce a completely lexemic syntax, a lattice with  and free would define all functional categories as functions over free forms. Morphological processing seems inevitable for languages like Turkish, and mor- phological and lexical ambiguity such as that shown in (19) must be passed on to syntax irrespective of how inflectional morphology is processed (isolated from or in- tegrated with syntax). For the verbal paradigm, Jurafsky and Martin (2000) reports Oflazer’s estimation that inflectional suffixes alone create around 40,000 word forms per root. In the nominal paradigm, iterative processes such as ki-relativization (Sec- tion 6.5) can create millions of word forms per nominal root (Hankamer 1989).

(19) a. kazma-ları pickaxe-POSS3p ’their pickaxe’

b. kazma-lar-ı pickaxe-PLU-POSS3p ’their pickaxes’

c. kazma-lar-ı pickaxe-PLU-POSS3s ’his/her pickaxes’

d. kaz-ma-ları dig-SUB-AGR ’their digging’

The questions that need to be answered related to processing are (1) What should a (super)linear fragment of processing for morphology deliver to (morpho)syntax? and (2) Is the syntax lexemic or morphemic? The problems with lexemic syntax, which stem from mismatches with semantics, were highlighted in the introduction. In other

159 Computational Linguistics Volume 28, Number 2

root lexicon morphological syntax and PF−LF kazma kazma−POSS3p parsing interpretation kaz kazma−PLU−POSS3p pairs kazma−PLU−POSS3s kaz−SUB−AGR

(a) Lexemic syntax and lexicon

morpheme− morphological syntax and PF−LF root lexicon semantics kazma parsing kazma−POSS3p interpretation kaz kazma−PLU−POSS3p matching pairs kazma−PLU−POSS3s kaz−SUB−AGR

lexicon −ma −lar −i −lari

(b) Morphemic syntax and split lexicon

Phonological Form (PF)

Logical Form (LF) root and affix kaz syntax and PF−LF kazma interpretation lexicon pairs −lar −i −lari −ma

(c) Morphemic syntax and lexicon Figure 3 The processing of kazmaları in three different architectures (see Example (19) for glosses).

words, a lexemic grammar (e.g., Figure 3a) is computationally nontransparent when interpretation is a component of an NLP system. Regarding the first question, let us consider two architectures from the perspective of the lexicon for the purpose of morphology, morphemic syntax, and semantics inter- face. The architecture in Figure 3b incorporates the current proposal as an interpretive front end to a morphological analyzer such as Oflazer’s (1994), which delivers the anal- yses of words as a stream of morphemes out of which the bound morphemes have to be matched with their semantics from the affix lexicon to be interpretable in grammar. The advantage of this model is its efficiency; morphological parsing of words is—in principle—linear context free; hence, finite-state techniques and their computational advantages readily apply. But the uninterpretable surface forms of bound morphemes must match with those of the affix lexicon, and this is not necessarily a one-to-one mapping because of multiple lexical assignments for capturing syntactic–semantic dis- tinctions (e.g., dative case as a direct object, indirect object, or adjunct marker or -i as a possessive and/or marker). Surface form–semantics pairing is not a trivial task, particularly in the case of lexically composite affixes, which require se- mantic composition as well as tokenization. The matching process needs to be aware of all the syntactic contexts in which certain affix sequences act as a unit, for exam- ple, relative participles and agreement markers (-di˘g-i relative participle as -OP-POSS or -OP-AGR), possessive and compound markers, etc., for Turkish. The factorization of syntactic issues into a morphological analyzer would also make the separate mor- phological component nonmodular or expand its number of states to factor in these concerns (e.g., treating the -OP-POSS sequence as a state different from -OP followed

160 Bozsahin The Combinatory Morphemic Lexicon

Table 2 Parsing performance. Average number Sample text Number of items of parses/grammatical Average CPU time type in text input per test (milliseconds) Normal Normal PAS form PAS form tests words morphs check parse Unrestr. check parse and 58 216 384 1.26 3.68 39 39 30 case Subordination 14 70 137 3.00 5.09 267 270 180 Relativization 23 130 232 2.04 2.32 796 783 266 Control verbs 33 147 291 1.42 3.34 166 163 137 Possessives and 26 109 200 1.23 2.47 137 135 98 compounds Adjuncts 14 57 100 1.12 4.87 89 88 72 -ki relatives 24 66 179 1.07 1.54 36 36 35

Note: CPU times are for a Sun UltraSparc-4 running SICStus Prolog; lexical items include stems and inflec- tional affixes. by -POSS, in which -POSS is not interpreted with the semantics of possession but that of agreement marking). Not knowing how many of the syntactic distinctions are han- dled by the morphological analyzer, a subsequent interpreter may need to reconsult the grammar if scoping problems arise. The architecture in Figure 3c describes the current implementation of the pro- posal. Bound morphemes are fed to the parser along with their interpretation. This model is preferred over that presented in Figure 3b for its simplicity in design and extendibility.16 The price is lesser efficiency due to context-free processing of inflec- tional morphology. By one estimate (Oflazer, Gocmen, and Bozsahin 1994), Turkish has 59 inflectional morphemes out of a total of 166 bound morphemes, and Oflazer (personal communication) notes that the average number of bound morphemes per word in unrestricted corpora is around 2.8, including derivational affixes. In a news corpus of 850,000 words, the average number of inflections per word is less than two (Oflazer et al. 2001). This is tolerable for sentences of moderate length in terms of the extra burden it puts on the context-free parser. Table 2 shows the results of our tests with a Prolog implementation of the system on different kinds of constructions. The test cases included 10 lexical items on average, with an average parsing time of 0.32 seconds per sentence. A relatively long sentence (12 words, 21 morphemes) took 2.9 seconds to parse. The longest sentence (20 words, 37 morphemes) took 40 seconds. The lexicon for the experiment included 700 entries; 139 were free morphemes and 561 were bound morphemes compiled out of 105 allomorphic representations (includ- ing all the ambiguous interpretations of bound morphemes and the results of lexical rules). For a rough comparison with an existing NLP system with no disambiguation

16 The morphological analyzer would be in no better position to handle morpheme–semantics pairing if the architecture in Figure 3b were implemented with an integrated lexicon of roots and affixes. For instance, -POSS would still require distinct states because of the difference in the semantics of possession and agreement marking coming from the lexicon.

161 Computational Linguistics Volume 28, Number 2 aids, Gung¨ ¨ ordu ¨ and Oflazer (1995) reported average parsing times of around 10 sec- onds per sentence for a lexicon of 24,000 free morphemes, and their morphological analyzer delivered around two analyses per second to a lexemic grammar. Oflazer’s later (1996) morphological analyzer contained an abstract morphotactic component of around 50 states for inflections, which resulted in compilation to 30,000 states and 100,000 transitions when the morphophonemic rules were added to the system. In conclusion, we note that the current proposal for a morphemic lexicon and grammar is compatible with both a separate morphological component (Figure 3b) and syntax-integrated inflectional morphology (Figure 3c). The architecture in Figure 3b may in fact be more suitable for inflecting languages (e.g., Russian) in which the surface forms of bound morphemes are difficult to isolate (e.g., m´este, locative singular of m´esto) but can be delivered as a sequence of morpheme labels by a morphological analyzer (e.g. m´esto-SING-LOC) to be matched with the lexical type assignments to -SING and -LOC for grammatical interpretation. It might be argued that in computational models of the type in Figure 3b, the lattice is not necessary, because the morphological analyzer embodies the tactical component. But not only tactical problems (cf. Example (18) and its discussion) but also transparent scoping in syntax and semantics is regulated by the use of lattice in type assignments, and that is our main concern. We show examples of such cases in the remainder of the article. Thus the nonredundant role of the lattice decouples the morphemic grammar– lexicon from the kind of morphological analysis performed in the back end.

5. Case Study: The English Plural

In this section, we present a morphosyntactic treatment of the English plural mor- pheme. The lattice for English is shown in Figure 2b. We follow Carpenter (1997) in categorizing numerical modifiers and intersective adjectives as plural noun modifiers: four boys is interpreted as four(plu boy) and green boxes as green(plu box). This bracketing reflects the “set of sets” interpretation of the plural noun; four(plu boy) denotes the set of nonempty nonsingleton sets of boys with four members. The type assignments in (20) correctly interpret the interaction of the plural and these modifiers (cf. 21a–b). The endotypic category of the plural also allows phrase-internal number agreement for languages that require it; the agreement can be regulated over the category N before the specifier is applied to the noun group to obtain NP.

a n b (20) -PLU := ◦ s − < N\ < N: λx.plu x s n n four := ◦ four − < N/ ✶ N: λx.four x s n n green := ◦ green − < N/ < N: λx.green x

(21) a. four boy -s

n n b n b < N/ ✶ N < N < N\ < N

< n < N: plu boy

> n < N : four(plu boy)

162 Bozsahin The Combinatory Morphemic Lexicon

b. four boy -s *** > n n b < N: four boy < N\ < N because n-base = n-num

n *** < N: * plu(four boy)

Carpenter (1997) points out that nonintersective adjectives (e.g, toy, fake, alleged)are unlike numerical modifiers and intersective adjectives in that their semantics requires phrasal (wide) scope for -PLU, corresponding to the “set of things” interpretation of the plural noun. Thus, toy guns is interpreted as plu(toy gun) because the plural outscopes the modification. It denotes a nonempty nonsingleton set of things that are not really guns but toy guns. *toy(plu gun) would interpret plu over guns. The situation is precisely the of (21); we need the second derivational pattern to go through and the first one to fail. The following category for nonintersective adjectives derives the wide scope for -PLU but not the narrow scope:

s b b (22) toy := ◦ toy − < N/ < N: λx.toy x

(23) a. toy gun -s

< b b n < N/ < N < N: plu gun

n *** < N : *toy(plu gun) because n-num ≤ n-base b. toy gun -s

b b b n b < N/ < N < N < N\ < N

> b < N: toy gun

< n < N : plu(toy gun)

Carpenter (1997) avoided rebracketing because of the plural through lexical type assignments to plural nouns and a phonologically null lexical entry to obtain differ- ent semantic effects of the plural. In our formulation, there is no lexical entry for inflected forms and no phonologically null type assignment to account for the dis- tinction in different types of plural modification; there is only one (phonologically realized) category for -PLU.17 The modifiers differ only in the kind and degree of mor- phosyntactic control. Strict control (✶)onfour disallows four boy, and flexible control (<)ongreen also handles green box. Four green boxes is interpreted as four(green(plu box)),

17 This is not to say that there is only one model-theoretic interpretation of plu. “Sets of sets” and “set of individuals” valuations of plu can be carried over the PAS.

163 Computational Linguistics Volume 28, Number 2 not as *four(plu(green box)), and four toy guns is interpreted as four(plu(toy gun)), not as *plu(four(toy gun)). These derivations preserve the domain of the modifiers and the plural without rebracketing.

6. Case Study: Turkish Morphosyntax

There have been several computational studies to model morphology–syntax inter- action in Turkish. These unification-based approaches represent varying degrees of integration. Gung¨ ¨ ordu ¨ and Oflazer (1995) isolates morphology from syntax by having separate modules (a finite-state transducer for the former, and an LFG component for the latter), that is, the syntax is lexemic. The morphological component is expected to handle all aspects of morphology, including inflections and derivations. In Sehitoglu and Bozsahin (1999), lexical rules implement inflectional morphology, and derivations are assumed to take place in the lexicon. Hoffman’s (1995) categorial analysis of Turk- ish is also lexemic; all lexical entries are fully inflected. Interpretive components of these systems face the aforementioned difficulties because of their commitment to lex- emic syntax. Inflectional morphology is incorporated into syntax in another categorial approach (Bozsahin and G¨o¸cmen 1995), but morphotactic constraints are modeled with nonmonotonic unification, such as nonexistence checks for features and overrides. The system cannot make finer distinctions in morphosyntactic types either. The result is an overgenerating and nontransparent integration of morphology and syntax because of the possibility of rebracketing and the unresolved representational basis of the lexicon. In this section, we outline the application of the proposed framework to Turkish. We analyze a large fragment of the language, without any claims for a comprehensive grammar. The phenomena modeled here exhibit particular morphosyntactic problems described in the preceding sections. We assume the binding theory in Steedman (1996), which is predicated over the PAS. In each section, we provide a brief empirical observa- tion about the phenomenon, propose lexical type assignments, exemplify derivations of the parser, and briefly discuss the constraints imposed by morphosyntactic types. Because of space considerations, we sometimes use abbreviated forms in derivations o o o o such as the genitive affix’s (N/(N\N))\N category for ( < N/( ✶ Npn\ ✶ Npn))\ < Npn, but the parser operates on full morphosyntactic representations.

6.1 Case Marking and Word Order Turkish is regarded as a free constituent order language; all permutations of the predicate and its arguments are grammatical in main clauses, being subject to con- straints on discourse and semantic properties such as definiteness and referentiality of the argument and topic–focus distinctions. The mapping of surface functions to grammatical relations is mediated by case marking. Word order variation has lesser functionality in embedded clauses because embedded arguments are less accessible to surface discourse functions like topic and focus. Embedded clauses are verb fi- nal.

6.1.1 Lexical Types. We start with the lexical type assignments for the verbs. We use the abbreviations in (24a) when no confusion arises about the arguments’ case or mor- phosyntactic type. Verb-final orders are regarded as basic, which suggests the category S\NP\NP for transitive verbs. But Janeway (1990) argued that such underspecification for verb-peripheral languages causes undesirable ambiguity. Grammatical relations of

164 Bozsahin The Combinatory Morphemic Lexicon the arguments are determined not by directionality but by case in such languages. The category S\NPnom\NPacc resolves the ambiguity (24b–c).

(24) a. IV = S\NP TV = S\NP\NP DV = S\NP\NP\NP

s v f f b. sev (like) := ◦ sev − < S\ < NPnom\ < NPacc: λx.λy.like xy

f f f ◦s − v \ \ \ c. ver (give) := ver < S < NPnom < NPdat < NPacc: λx.λy.λz.give yxz

v f f d. < S\ < NPnom\ < NPacc : λx.λy.like xy ⇒ v f +ref f < S\ < NPacc \ < NPnom : λy.λx.like xy

a c o e. -ACC := ◦ i|ı|u|u¨|yi|yı|yu|y¨u − < Nacc\ < N: λf .f

a α α o f. -LOC := ◦ de|da|te|ta − ( < S/ < S)\ < N: λx.λf .at fx

Gapping behavior seems to indicate that Turkish is verb final, not just SOV. SO and OS syntactic types must be distinguished to account for SO & SOV, OS & OSV, *SO & OSV and *OS & SOV. The OS & OSV pattern requires the lexical category S\NPacc\NPnom for the verb (Bozsahin 2000b). SOV and OSV base orders can be cap- tured uniquely in the lexicon in set-CCG notation as S\{NPacc,NPnom}. Set-CCG is strongly equivalent to CCG (Baldridge 1999). We distinguish SOV and OSV lexically, however, because OSV requires referential objects (25a–b). OSV is generated from SOV by a lexical rule (24d). This is genuine lexical ambiguity, because the two related entries differ in semantics (referentiality).

(25) a. Kitab-ı adam oku-du Book-ACC man.NOM read-TENSE ’The man read the book.’ b. *Kitap adam oku-du Book man.NOM read-TENSE

Regarding the relationship between case and the specifiers, it is questionable whether Turkish has a discernible syntactic category for determiners. There is no lex- ical functor that takes an N and yields an NP. The only article, the indefinite bir (’a’), makes a distinction in discourse properties (26). Specifying case as a (e.g., NP\N) does not alleviate the problem, either. Ignoring the problem of case stacking for a moment, zero marking of the surface subject and the indefinite object takes us back to where we started.

(26) Cocuk¸ ye¸sil bir elma/elma/elma-yı ye-mi¸s child.NOM green an apple/apple/apple-ACC eat-TENSE ’The child ate a green apple.’ (indefinite but referential apple) ’The child ate green apple.’ (indefinite and nonreferential apple) ’The child ate the green apple.’ (definite and referential apple)

165 Computational Linguistics Volume 28, Number 2

Making the nouns lexically ambiguous (N or NP) would also require that all func- tions onto nouns be ambiguous (N\N and NP\NP for inflections, N/N and NP/NP for adjectives, etc.). Redundancy of this kind in the lexicon is not desirable, since it is in- troduced purely for formal reasons with no distinction in meaning. We accommodate these concerns by positing a special case of type raising for Turkish (27). Similarly, contraposition turns Ns into functors looking for NPs.

f (27) Type Raising for Turkish: Nagr : a ⇒ T/(T\ < NPagr): λf .f [a] f ⇒ T\(T/ < NPagr): λf .f [a]

T ∈{S,S\NP,S\NP\NP, S\NP\NP\NP}

The noun that is type raised can be a syntactically derived noun (28). SO (and OS) constituency required for gapping is provided by >T and >B.

(28) Mehmet k¨u¸c¨uk ye¸sil kitab-ı,cocuk ¸ da yeni gelen dergi-yi oku-du M.NOM little green book-ACC child-COORD new come mag.-ACC read-TENSE

Nnom Nacc Nnom Nacc S\NPnom\NPacc

>T >T

S/(S\NPnom)(S\NP)

/(S\NP\NPacc) . . .

>B >B

S/(S\NPnom\NPacc)S/(S\NPnom\NPacc)

&

S/(S\NPnom\NPacc)

> S ’Mehmet read the little green book, and the child, the newly arrived magazine’.

Our lexical type assignment to case morphemes (24e–f) departs from other CCG analyses of case (e.g., Steedman 1985, 1991a; Bozsahin 1998). These studies correlate morphological case with type raising of arguments; Bozsahin (1998) does so via a value-raised category assignment to case morphemes. Evidence from NP-internal case agreement and case stacking (Kracht 1999) challenges the type-raising approach. Agreement phenomena require that case (which can be marked on articles, adjectives, and nouns) be regulated as an agreement feature within the category N before the case-marked argument looks for the verb via type raising. Kracht observes that, in case stacking, there may be other morphemes between two case morphemes. Thus, treating the two cases as composite affixes for the purpose of type raising is not feasible. If the first case type-raises the noun to, say, T/(T\NP), the second case would require the category (T/(T\NP))\(T/(T\NP)); that is, it is endotypic. Hence, an endotypic category for case (like other inflections in the paradigm) subsumes the type-raising analysis of case, provided that type raising is available in the grammar and not necessarily anchored to case. We analyze case as an endotypic functor of type N\N (24e), which allows for phrase-internal agreement in languages that require it, and we provide type raising in the grammar as in (27). Abandoning the type-raising analysis of case does not necessitate taking liberties in the directionality of the categories, such as the use of the nondirectional slash (|) in multiset-CCG (Hoffman 1995). Contraposition and type raising in grammar can account for free word order and gapping facts with fully directional syntactic types (Bozsahin 2000a).

6.1.2 Derivations. The wide scope of case is captured by treating its argument type as non-case-marked N (o<N) and the type of noun modifiers as functions onto non-case-marked nouns of a particular domain, for example, b<N for nonintersective adjectives and n<N for intersective adjectives (29a). The same strategy in type assignments to other nominal inflections allows them to outscope nominal modification, for example, (29b).

(29) a. Mehmet [[oyuncak araba]-lar]-ı sev-er
        M.NOM toy car -PLU -ACC like-TENSE
        'Mehmet likes toy cars.'

        Mehmet type-raises (>T) to S/(S\f<NPnom): λf.f[mehmet]. Oyuncak araba yields b<N: toy car; -lar gives n<N: plu(toy car); -ı gives c<Nacc: plu(toy car). The case-marked noun type-raises (>T) to (S\NP)/(S\NP\f<NPacc): λg.g[plu(toy car)] and applies (>) to sev-er: λx.λy.like xy, giving S\f<NPnom: λy.like(plu(toy car))y; the raised subject then applies (>) to give S: like(plu(toy car))mehmet.

b. Adam-ın [küçük kırmızı araba]-sı
   man-GEN little red car-POSS
   'the man's little red car' = poss(little(red car))man

A word-based alternative for reconciling the semantic (wide) scope of inflections and their morphological (narrow) attachment to stems runs into difficulties even if we assume that morphemes carry type assignments—and hence have representational status—but that they always combine with stems first. We use syntactic types to show the problem. If -PLU and -ACC in (29a) combine with the stem first, only the narrow-scope reading of the plural and case is possible (30a); plu(toy car) is not derivable with word-based modification. The morphosyntactic categories, however, are transparent to the scope of nominal modification (cf. (29a) and (30b)).

(30) a. oyuncak [[araba]-lar]-ı
        toy car -PLU -ACC

        oyuncak: N/N: λx.toy x; araba: N: car; -lar: N\N: λx.plu x; -ı: Nacc\N: λf.f
        araba-lar reduces (<) to N: plu car, and -ı gives (<) Nacc: plu car; oyuncak then applies (>) to give N: *toy(plu car).

b. [yeşil [araba]-lar]-ı
   green car -PLU -ACC

   yeşil: n<N/n<N: λx.green x; araba: b<N: car; -lar: n<N\b<N: λx.plu x; -ı: c<Nacc\o<N: λf.f
   araba-lar reduces (<) to n<N: plu car; yeşil applies (>) to give n<N: green(plu car); -ı then applies (<) to give c<Nacc: green(plu car).
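The scope contrast between (29a) and (30a–b) is easy to see at the level of the PAS alone. The following sketch simply treats plu and the modifiers as term constructors over strings; the function names are illustrative, not the paper's notation for lexical entries.

```python
# PAS terms as nested strings; plu, toy, and green stand in for the lambda
# bodies of -PLU, the nonintersective modifier, and the intersective modifier.
def plu(x):   return f"plu({x})"
def toy(x):   return f"toy({x})"
def green(x): return f"green({x})"

# Morphemic, phrasal scope as in (29a): the plural ranges over the noun group.
print(plu(toy("car")))       # plu(toy(car))

# Word-based attachment as in (30a): -PLU combines with the bare stem first,
# so only the unwanted narrow-scope reading comes out.
print(toy(plu("car")))       # *toy(plu(car))

# The modality-typed categories of (30b) still let the intersective adjective
# outscope the plural: green(plu(car)).
print(green(plu("car")))
```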

Surface case annotations on categories enable the grammar to capture the correct PAS in all permutations of S, O, and V while maintaining the discourse-relevant distinctions (31). Verb-final subordinate clauses are enforced by the directionality of the subordination morphemes in the lexicon.

(31) a. S O V: S/(S\f<NPnom) [>T]   (S\NP)/(S\NP\f<NPacc) [>T]   S\NPnom\NPacc;
        the raised object applies (>) to the verb to give S\NPnom, and the raised subject applies (>) to give S.

     b. O S V: S/(S\f<NPacc) [>T]   (S\NP)/(S\NP\f<NPnom) [>T]   S\NPacc\NPnom;
        the same two forward applications give S.

     c. O V S: (S\NP)/(S\NP\f<NPacc) [>T]   S\NPnom\NPacc   S−t\(S\NPnom) [>XP];
        the raised object applies (>) to the verb to give S\NPnom, and the contraposed subject applies (<) to give S−t.

     d. S V O: (S\NP)/(S\NP\f<NPnom) [>T]   S\NPacc\NPnom   S−t\(S\NPacc) [>XP];
        forward application gives S\NPacc, and the contraposed object applies (<) to give S−t.

     e. V S O: S\NPnom\NPacc   S−t\(S\NPnom) [>XP]   S−t\(S−t\NPacc) [>XP];
        successive backward applications (<) give S−t.

     f. V O S: S\NPacc\NPnom   S−t\(S\NPacc) [>XP]   S−t\(S−t\NPnom) [>XP];
        successive backward applications (<) give S−t.

6.2 Subordination

Subordinate clauses can be classified as unmarked clauses (32a), infinitival clauses (32b), verbal nouns (32c), and nominalizations (32d). The latter two types require a genitive embedded subject, which agrees with the subordinate verb.

(32) a. Mehmet [ çocuk ev-e git-ti ] san-dı
        M.NOM child.NOM house-DAT go-TENSE assume-TENSE
        'Mehmet assumed that the child went home.'

b. Çocuk [ kız-a kalem-i ver-me ]-yi unut-tu
   child.NOM girl-DAT pen-ACC give-SUB1i -ACC forget-TENSE
   'The child forgot to give the pen to the girl.'

c. [ Çocuğ-un araba-da uyu-ma-sı ] Mehmet'i kız-dır-dı
   child-GEN car-LOC sleep-SUB1g-POSS M-ACC anger-CAUS-TENSE
   'Child's sleeping in the car made Mehmet angry.'


d. Deniz [ çocuğ-un uyu-duğ-u ]-na inan-m-ıyor
   D.NOM child-GEN sleep-SUB2g-POSS -DAT believe-NEG-TENSE
   'Deniz does not believe the child's sleeping.'

(33) a. Deniz_i [kendisi-nin_i uyu-ma-dığ-ı]-nı söyle-di
        D.NOM self-GEN sleep-NEG-SUB2g-POSS-ACC2 say-TENSE
        'Deniz_i said that he_i did not sleep.'

     b. *kendisi_i [Deniz'in_i uyu-ma-dığ-ı]-nı söyle-di

     c. Deniz_i adam-ı_j [ kendi_i/j arkadaş-ı-nın gör-düğ-ü ]-ne inan-ıyor
        D.NOM man-ACC self friend-POSS see-SUB2g-POSS-DAT2 believe-TENSE
        'Deniz_i believes that his_i/j friend saw the man_j.'

     d. Deniz_i adam-a_j [ kendi_i/*j kitab-ı-nı oku-duğ-u ]-nu söyle-di
        D.NOM man-DAT self book-POSS-ACC2 read-SUB2g-POSS-ACC2 say-TENSE
        'Deniz_i told the man_j that he read his_i/*j book.'

6.2.1 Lexical Types. The asymmetries in (33) show that the obliqueness order in binding relations is preserved in subordination. This suggests the following bracketing, in which the embedded clause's position in the PAS of the matrix predicate is determined by its grammatical function.

Matrix-Pred ... Matrix-Argument ... Embedded-Clause ... Matrix-Argument

(34) -SUB1i (-ma)   := ◦a ma − b<N\(a<S\f<NPnom): λf.f                    (infinitive)
     -SUB1g (-ması) := ◦a ması − o<N\f<NPagr\(a<S\f<NPnom): λf.f          (verbal noun)
     -SUB2g (-dığı) := ◦a dığı − o<Ncase=obl\f<NPagr\(a<S\f<NPnom): λf.f  (nominalization)

The wide scope of case markers on subordinate clauses implies that the subordinate markers themselves must have phrasal scope as well. Since case is a nominal inflection, the category of a subordinate marker must be a function onto N. Its argument is IV for infinitives and NPagr\IV for the others, which require genitive subjects (34). This yields two families of functors for subordination. The verb-final character of the embedded clauses is ensured by the backward-looking main functor of the subordinate marker.

For morphosyntactic modality, the resulting nominalized predicate can receive only case, hence it has o<N control. Verbal nouns refer to actions, and nominalizations refer to facts. Subordinate markers for the former are tenseless. A subordinate marker replaces the tense of the subordinate verb in nominalizations, yielding a<S control on the verb. For subject raising, the result may undergo any nominal inflection (b<N). Word order variation within the subordinate clause is constrained by the subject on the left and the verb on the right. This constraint is achieved by categorizing the embedded subjects as NPagr and having a result category of N for all subordinate markers. If there were any contraposed element NP in the embedded clause, the category of the clause would be S\NP, and the clause could not combine with a contraposed category such as S−t\(S\NP) on the right, because the extraction category combines with a subordinate marker first, which is onto N, not S\NP; hence composition fails.
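Before turning to the derivations, here is a sketch of how a morphemic lexical entry of the kind in (34) could be stored as a record of attachment modality, form, modally typed syntactic category, and logical form. The field names and the string encoding of categories are illustrative assumptions, not the representation used by the distributed system.

```python
# Illustrative record type for a morphemic lexical entry.
from dataclasses import dataclass

@dataclass
class LexEntry:
    morpheme: str    # gloss of the morpheme, e.g. '-SUB1i'
    form: str        # citation form, e.g. 'ma'
    attachment: str  # attachment modality: 'a' marks an affix
    syn: str         # syntactic type, with morphosyntactic modalities
    sem: str         # logical form (identity functions in (34))

LEXICON = [
    LexEntry('-SUB1i', 'ma',   'a', r'b<N\(a<S\f<NPnom)',                 'λf.f'),
    LexEntry('-SUB1g', 'ması', 'a', r'o<N\f<NPagr\(a<S\f<NPnom)',         'λf.f'),
    LexEntry('-SUB2g', 'dığı', 'a', r'o<Ncase=obl\f<NPagr\(a<S\f<NPnom)', 'λf.f'),
]

for e in LEXICON:
    print(f"{e.morpheme:8} := ◦{e.attachment} {e.form:5} − {e.syn} : {e.sem}")
```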

6.2.2 Derivations. Example (35a) is the derivation of subject raising (we use N↑ as an abbreviation for a type-raised N when space is limited). We use Steedman's (1996) ana function to denote the binding of the embedded subject. Infinitive -SUB1i has phrasal scope in this example; the DV must be reduced to an IV before the infinitive can apply. Hence the subordination of intransitive clauses is only a special case in which the morphological scope of the infinitive works without rebracketing. Subject raising and coindexation with the matrix subject are made explicit in the raising category of unut. The systematic relationship between the raised and nonraised category of such verbs can be captured by a lexical rule, for example, TV: λx.λy.forget xy ⇒ TV: λf.λy.forget(f[ana y])y. (35b–c) contrast subject and nonsubject nominalizations. The difference is captured with the case distinction of the result type (o<N) for -SUB1g and -SUB2g. These examples also show the possibility of affix composition in the lexicon. For instance, we write -ması in (35b), which marks subordination and agreement together, instead of -ma-sı. Otherwise, -ma (SUB1g) would have to look to the right as a functor to enforce agreement, and the verb-final property of subordination could not be assured.

(35) a. Çocuk kız-a kalem-i ver -me -yi unut-tu
        child.NOM girl-DAT pen-ACC give -SUB1i -ACC forgot
        'The child forgot to give the pen to the girl.'

        The type-raised (>T) arguments kalem-i and kız-a reduce the DV ver to the IV v<S\f<NPnom; -me applies (<) to give b<N, and -yi gives (<) c<Nacc. The case-marked infinitive complement type-raises (>T) to (S\NP)/(S\NP\f<NPacc) and applies (>) to the raising category of unut-tu, giving S\f<NPnom; the raised subject then applies (>) to give S: forget(give girl pen(ana child))child.


b. Çocuğ-un uyu -ması Mehmet'i kızdır-dı
   child-GEN sleep -SUB1g M-ACC anger-TENSE
   'The child's sleeping angered Mehmet.'

   -ması applies (<) to the IV uyu to give N\NPagr, and the type-raised (>T) genitive subject yields N; this type-raises (>T) to S/IV and applies (>) to the IV Mehmet'i kızdır-dı (formed by the type-raised object and the verb), giving S: anger(sleep child)mehmet.

c. *Çocuğ-un uyu-duğu Mehmet'i kızdır-dı
   (uyu-duğu: sleep-SUB2g)

6.3 The Morphosyntax of Control

The control verb's controlled argument is marked by the infinitive -ma, and the resulting nominalized embedded clause can undergo nominal inflections (36a–b). The infinitive -ma has the lexical type in (34). A potential conflict between an object control verb's subcategorization and PAS is resolved by case decoration: zorla 'force' and tavsiye et 'recommend' differ in their case requirements and in what is controlled (36b–c). Tavsiye et's infinitive complement is accusative, whereas zorla's is dative.

(36) a. Çocuk [kitab-ı oku-ma]-ya çalış-tı
        child.NOM book-ACC read-SUB1i-DAT try-TENSE
        'The child_i tried [to_i read the book].'

     b. Mehmet çocuğ-u [kitab-ı oku-ma]-ya zorla-dı
        M.NOM child-ACC book-ACC read-SUB1i-DAT force-TENSE
        'Mehmet_j forced the child_i [to_i/*j read the book].'

     c. Mehmet çocuğ-a/*-u [kitab-ı oku-ma]-yı/*-ya tavsiye et-ti
        M.NOM child-DAT/ACC book-ACC read-SUB1i-ACC/DAT recommend-TENSE
        'Mehmet recommended the child_i [to_i read the book].'

6.3.1 Lexical Types. Subject control verbs (e.g., çalış 'try'; söz ver 'promise') and object control verbs (e.g., zorla; tavsiye et) have the control property indicated in their PAS (37). The nonraising variety of these verbs is obtained via a lexical rule.

(37) çalış      := ◦s çalış − TV: λq.λz.try(q[ana z])z
     söz ver    := ◦s söz ver − DV: λq.λz.λw.promise z(q[ana w])w
     zorla      := ◦s zorla − DV: λz.λq.λw.force(q[ana z])zw
     tavsiye et := ◦s tavsiye et − DV: λz.λq.λw.recommend(q[ana z])zw
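How the ana-bound PAS terms in (37) come out after application can be mimicked with ordinary higher-order functions. The sketch below builds the terms as strings; ana is modeled simply as a wrapper on the controller, and the constructor names are illustrative rather than the paper's code.

```python
# Control PAS built with higher-order functions; q is the infinitive
# complement, a property of its (controlled) subject.
def ana(z):
    return f"ana {z}"

def try_(q, z):                       # çalış: λq.λz.try(q[ana z])z
    return f"try({q(ana(z))}){z}"

def force(z, q, w):                   # zorla: λz.λq.λw.force(q[ana z])zw
    return f"force({q(ana(z))}){z} {w}"

read_book = lambda subj: f"read book({subj})"

print(try_(read_book, "child"))             # try(read book(ana child))child
print(force("child", read_book, "mehmet"))  # force(read book(ana child))child mehmet
```

The printed terms match the PAS shown in the derivations (38a–b) below.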


6.3.2 Derivations. The types in (37), coupled with the raising category of the infinitive, yield the derivations in (38). These examples compose the infinitive complement before a case can be applied on the nominalized predicate. This is possible because of the phrasal scope of -ma and the case markers. (38b) shows that although there may be two accusative-marked NPs, the arguments of the infinitive complement are identifiable; the IV scope of -ma implies that any (di)transitive subordinate verb must find its nonsubject arguments before the matrix verb gets its arguments. This type assignment strategy handles word order variations inside the infinitive complement and the matrix clause transparently.

(38) a. Çocuk kitab-ı oku -ma -ya çalış-tı
        child.NOM book-ACC read -SUB -DAT try-TENSE
        'The child tried to read the book.'

        The type-raised (>T) kitab-ı applies (>) to oku to give IV; -ma gives (<) N, and -ya gives (<) Ndat; the dative complement type-raises (>T) to IV/TV and applies (>) to çalış-tı, giving IV; the type-raised subject then yields S: try(read book(ana child))child.

b. Mehmet çocuğ-u kitab-ı oku -ma -ya zorla-dı
   M.NOM child-ACC book-ACC read -SUB -DAT force-TENSE
   'Mehmet forced the child to read the book.'

   Again kitab-ı (>T) applies (>) to oku to give IV, -ma gives (<) N, and -ya gives (<) Ndat; the dative complement type-raises (>T) to TV/DV and applies (>) to zorla-dı, giving TV; the type-raised çocuğ-u reduces this to IV, and the subject yields S: force(read book(ana child))child mehmet.


6.4 Relativization

There are two strategies for forming relative clauses: the subject participle strategy (SP) and the nonsubject participle strategy (OP). SP is realized by the affixes -(y)An, -(y)AcAk, and -mIş, and OP by -dIk- and -(y)AcAk-. OP triggers agreement similar to that of possessive constructions between the subject and the predicate of the relative clause (39b).

(39) a. kitab-ı oku-yan adam
        book-ACC read-SP man
        'the man that read/reads the book'

     b. adam-ın oku-duğ-u kitap
        man-GEN(AGR) read-OP-POSS(AGR) book
        'the book that the man read'

6.4.1 Lexical Types. The categories in (40) make explicit the unbounded nature of relativization; type raising and composition can combine an indefinitely large sequence of constituents onto S\NP.

(40) -SP     := ◦a yan − (N↑/f<N)\(a<S\f<NPnom): λP.λx.λQ.and(Q[x])(P[x])
     -OP.AGR := ◦a dığı − (N↑/f<N)\(a<S\f<NPcase=obl): λP.λx.λQ.and(Q[x])(P[x])   (argument)
     -OP.AGR := ◦a dığı − (N↑/f<N)\a<S: λP.λx.λQ.and(Q[x])(at(P[x])x)             (adjunct)

We present a formulation of relativization without any use of empty categories, traces, or movement. We follow the Montagovian treatment of relative clauses as noun restrictors of the semantic type λP.λx.λQ.and(Q[x])(P[x]), where P is the semantics of the relative clause and Q is the semantics of the predicate taking the relativized noun (x) as the argument. Montagovian analysis assumes a generalized quantifier (GQ) category for the determiner; that is, NP is the functor and VP is the argument. The determiner takes the relativized noun (and its semantically type-raised category) as an argument as well. In a language with determiners, the functor category of the overall NP can be made explicit by lexically value-raising the determiner with GQ semantics from, for example, NP/N to (S/(S\NP))/N = (S/VP)/N. To achieve the same effect in a language that lacks determiners, we make NP the functor by lexically value-raising the relative participle from (N/N)\(S\NP) to (N↑/N)\(S\NP), in which N↑/N denotes a value-raised noun, since N↑ is a type-raised category. The category of the relative participle unfolds to ((S/(S\NP))/N)\(S\NP) and (((S\NP)/(S\NP\NP))/N)\(S\NP). Relativization is strictly head final in Turkish. This implies that all relative participles are backward-looking functors that differ only in case requirements (cf. English relatives, which require different directionality, e.g., (N\N)/(S\NP) for subjects and (N\N)/(S/NP) for nonsubjects). For morphosyntactic modality, the head noun has flexible control (f<N), because any further grammatical marking on the head must be shared (41).

(41) Adam-ın gör-düğ-ü çocuk-lar uyu-du
     man-GEN see-OP-POSS child-PLU sleep-TENSE
     'The children that the man saw slept.'
     = and(sleep(plu child))(see(plu child)man)
     = *and(sleep(plu child))(see child man)

Morphologically, the agreement marker -POSS in the OP strategy is a function over the -OP morpheme, but syntactically, the -OP morpheme triggers the agreement in the relative clause. Hence -OP-POSS can be treated as a lexically composite affix and glossed as -OP.AGR. This also ensures the verb-final property of the relativized clause by not positing a rightward-looking functor for -OP. As for attachment modality, relative participles are bound morphemes that are affixed to the predicate.
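The noun-restrictor semantics shared by the participles in (40), λP.λx.λQ.and(Q[x])(P[x]), can be checked with a small functional sketch before looking at the derivations in (42); the term constructors here are illustrative strings, not the system's representation.

```python
# The relative participle as a noun restrictor: it takes the relative-clause
# property P, the head noun x, and the matrix property Q, and conjoins them.
def participle(P):
    return lambda x: lambda Q: f"and({Q(x)})({P(x)})"

read_book = lambda y: f"read book {y}"   # property of the gap in 'kitab-ı oku-yan'
sleep     = lambda y: f"sleep {y}"       # matrix predicate 'uyu-du'

rel_clause = participle(read_book)       # the -yan-marked relative clause
print(rel_clause("man")(sleep))          # and(sleep man)(read book man), cf. (42a)
```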

6.4.2 Derivations. (42a–d) show example derivations for subject, object, indirect object, and adjunct relativization. All nonsubject arguments are handled by a single -OP type (42b–c). Relativizing the specifier of an argument uses the same strategy as the argument. This phenomenon calls for another well-regulated lexical assignment schema, for example, (N↑/N)\(N\N)\IV for the relativized specifier of the subject. (42e) is an example of relativizing the subject's specifier. Configurationality within the noun group is maintained by backward directionality of the categories.

(42) a. kitab-ı oku -yan adam uyu-du
        book-ACC read -SP man sleep-TENSE
        'The man who read the book slept.'

        The type-raised kitab-ı applies (>) to oku, giving IV: λy.read book y; -yan applies (<) to give N↑/N: λx.λQ.and(Q[x])(read book x); application (>) to adam gives N↑ = S/(S\NP): λQ.and(Q[man])(read book man), and application (>) to uyu-du gives S: and(sleep man)(read book man).

     b. adam-ın gör -düğü çocuk uyu-du
        man-GEN see -OP.AGR child sleep-TENSE
        'The child whom the man saw slept.'

        The genitive subject adam-ın and gör reduce (<) to IVagr; -düğü applies (<) to give N↑/N; application (>) to çocuk gives N↑ = S/IV, and application (>) to uyu-du gives S: and(sleep child)(see child man).


c. çocuğ-un kitab-ı ver -diği adam uyu-du
   child-GEN book-ACC give -OP.AGR man sleep-TENSE
   'The man to whom the child gave the book slept.'

   The type-raised kitab-ı applies (>) to ver to give TV; the genitive subject çocuğ-un reduces it (<) to IVagr; -diği applies (<) to give N↑/N; application (>) to adam gives N↑ = S/IV, and application (>) to uyu-du gives S: and(sleep man)(give man book child).

d. çocuğ-un uyu -duğu araba bozul-du
   child-GEN sleep -OP.AGR car break-TENSE
   'The car that the child slept in broke.'

   Çocuğ-un uyu reduces (<) to S; the adjunct -duğu applies (<) to give N↑/N; application (>) to araba gives N↑ = S/IV, and application (>) to bozul-du gives S: and(break car)(at(sleep child)car).

e. çocuğ -u uyu -yan adam kız-dı
   child -POSS sleep -SP man anger-TENSE
   'The man whose child slept got angry.'

   Çocuğ-u is N\N; uyu with the specifier-relativizing -yan, (N↑/N)\(N\N)\IV, reduces (<) to (N↑/N)\(N\N); çocuğ-u applies (<) to give N↑/N; application (>) to adam gives N↑ = S/IV, and application (>) to kız-dı gives S: and(sleep(poss child man))(anger man).

As these examples indicate, -SP and -OP do not range over the verb stem in semantic scope; they cover the entire relative clause. The wide scope of -SP and -OP resolves the inconsistency pointed out in the introduction (5b–c), which was mainly due to coindexation in unification accounts and the lexemic nature of the lexicon. Isolating the relative participle inflections in a morphological component undermines the transparency of derivations. Note also that -OP is categorially transparent to the arity of the verb; a DV must be reduced to an IV before -OP applies to the verb complex (42c). This is possible only when -OP has phrasal scope.


6.5 Ki-relativization

Ki-relativization is a morphosyntactic process that can generate indefinitely long words functioning as relative pronouns and relative adjectives. -ki can be attached to case-marked nouns whose case relation is one of possession, time, or place (i.e., the genitive and the locative). Its effect is to create a nominal stem on which all inflections can start again (43a–b). It produces relative pronouns (43c) and relative adjectives (43d) with the locative, and relative pronouns with the genitive.

(43) a. araba-da-ki
        car-LOC-REL
        'the one in the car'

     b. çocuğ-un ev-i-nde-ki-ler-in-ki
        child-GEN house-POSS-LOC2-REL-PLU-GEN-REL
        lit. 'the one that belongs to the ones that are in the child's house'

     c. Ben ev-de-ki-ni hiç kullan-ma-dı-m
        I.NOM house-LOC-REL-ACC2 never use-NEG-TENSE-PERS.1s
        'I never used the one at home.'

     d. ev-de-ki hediye
        house-LOC-REL present
        'the present_i, the one_i at home'

6.5.1 Lexical Types.

(44) a. -PROki  := ◦a ki − l✶N\c✶Nloc: λx.λf.and(at PRO x)(f[PRO])        (locative)
     b. -ADJki  := ◦a ki − (l✶N/n<N)\c✶Nloc: λx.λy.λf.and(at xy)(f[y])
     c. -PROki  := ◦a ki − l✶N\N↑gen: λx.λf.and(poss PRO x)(f[PRO])        (genitive)
     d. sabahki := ◦s sabahki − l✶N/n<N: λx.λf.and(at morning x)(f[x])
     e. ki (that) := ◦c ki − (N↑\f<N)/(v<S\f<NPnom): λP.λx.λQ.and(Q[x])(P[x])

N↑gen is a shorthand for the N/(N\N) category of a type-raised genitive. In (43c), pronominal one (PRO) cannot be bound to ev (44a). Adjectival interpretation (43d) associates the relative adjective with the relativized noun (44b). For morphosyntactic modality, ki-marked nouns behave like possessive-marked nouns in case marking, which requires strict control over the possessive (o✶N). This presents a dilemma: Morphologically, -ki creates a nominal stem that can undergo all nominal inflections again, but, as (45a) indicates, the stem does not take the CASE (ACC, DAT, etc.) that is common to nouns unmarked on the possessive. Thus CASE2 in (45c) must refer to another diacritic (n-relbase, or l✶) to eliminate (45b). This diacritic controls the result category of -ki. The value-raised varieties of (44a–c) are assigned a type similar to the type of relative participles. Inherently temporal nouns such as sabah ('morning') can take -ki without the locative. They can be lexicalized without overgeneration with the help of the morphosyntactic modality l✶ (44d).

(45) a. *ev-de-ki-yi (house-LOC-REL-ACC)        b. *ev-ni (house-ACC2)
     c. ev-de-ki-ni (house-LOC-REL-ACC2)        d. ev-i (house-ACC)

6.5.2 Derivations. -ki ranges over the case-marked noun, which, as (46a–b) indicate, can be lexical or phrasal. In a lexemic analysis, the entire ki-marked noun would have to be rebracketed before the adjective küçük can apply to its right scope (which is ev, not çocuk).

(46) a. ev -de -ki
        house -LOC -PROki
        'the one that is in the house'

        ev: b<N; -de: c<Nloc\o<N; -ki: l✶N\c✶Nloc
        ev-de reduces (<) to c<Nloc, and -ki applies (<) to give l✶N: λf.and(at PRO house)(f[PRO]).

b. küçük ev -de -ki çocuk
   little house -LOC -ADJki child
   'the child_i, the one_i at the little house'

   küçük: b<N/b<N; ev: b<N; -de: c<Nloc\o<N; -ki: (l✶N/n<N)\c✶Nloc; çocuk: b<N
   küçük ev reduces (>) to b<N; -de gives (<) c<Nloc; -ki gives (<) l✶N/n<N, which applies (>) to çocuk to give l✶N: λf.and(at(little house)child)(f[child]).
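The difference between the pronominal -PROki of (44a) and the adjectival -ADJki of (44b) is visible in the PAS alone and can be mimicked with a small functional sketch; PRO is kept as a symbol, a matrix predicate is supplied only to reduce the terms, and the names below are illustrative.

```python
# Pronominal vs. adjectival -ki, following the LFs in (44a-b).
PRO = "PRO"

def pro_ki(x):                      # (44a): λx.λf.and(at PRO x)(f[PRO])
    return lambda f: f"and(at {PRO} {x})({f(PRO)})"

def adj_ki(x):                      # (44b): λx.λy.λf.and(at x y)(f[y])
    return lambda y: lambda f: f"and(at {x} {y})({f(y)})"

use = lambda z: f"use {z}"          # a matrix predicate, as in (43c)

print(pro_ki("house")(use))                    # and(at PRO house)(use PRO), cf. (46a)
print(adj_ki("(little house)")("child")(use))  # and(at (little house) child)(use child), cf. (46b)
```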

There is another ki in Turkish that forms nonrestrictive relative clauses as postmodifiers. It is a Persian borrowing and follows the Indo-European pattern of relative clause formation (47). It can be distinguished from the bound morpheme -ki lexically. Its attachment characteristic is also different from that of -ki (44e).

(47) Adam ki hep uyur
     man that always sleep-TENSE
     'the man, who always sleeps'


6.6 Possessive Constructions and Syntactic Compounds

The grammatical marking of possession is realized through the genitive case on the possessor (Ngen) and the possessive marker on the possessee (Nposs). Ngen and Nposs must agree in person and number (48a), and the resulting noun group is configurational. Possessives can be nested (48c).

(48) a. ev-in kapı-sı
        house-GEN3 door-POSS3s
        'the door of the house'

     b. *ev-in kapı / *ev-in kapı-lar (door-PLU)

     c. ben-im arkadaş-ım-ın ev-i-nin kapı-lar-ı
        I-GEN1 friend-POSS1s-GEN3 house-POSS3s-GEN3 door-PLU-POSS3s
        'my friend's house's doors'

     d. ben-im arkadaş-ım-ın_i dost-u-nun_j kendisi_*i/j
        I-GEN1 friend-POSS1s-GEN3 buddy-POSS3s-GEN3 self
        'my friend's buddy himself'

     e. Her çalışan-ın bazı hak-lar-ı vardır
        every worker-GEN3 some right-PLU-POSS3s exists
        ∀x∃y((worker(x) ∧ right(y)) → has(x, y)) but not ∃y∀x(right(y) ∧ (worker(x) → has(x, y)))

6.6.1 Lexical Types for Possessives. Type assignments for the genitive and the possessive can be schematized over person (p) and number (n) features, as in (49).

(49) -GENpn  := ◦a s − (o<N/(o✶Npn\o✶Npn))\o<Npn: λx.λy.poss yx
     -POSSpn := ◦a s − (o✶Npn\o✶Npn)\n<Npn: λf.f

The possessive marker's result category is a functor because it enforces agreement with the type-raised specifier.18 (48d–e) indicate that the genitive marker is a type raiser; the possessor scopes over the possessee. For morphosyntactic modality, the genitive marker can be attached to nouns that are inflected up to and including a possessive marker (o<N). Moreover, nesting in possessives implies that the specifier may be a genitive. Hence, the stem's category must be o<N. But there is a finer control over the possessee argument's category, because it must be inflected with the possessive marker to signify the relation of possession (cf. (48a–b)). Semantically, the possessive must outscope nominal modification. For instance, (50a) has the PAS as indicated, hence both markers must range over a noun group, not just the stem. Binding relations require an organization of the type (poss possessee possessor) (50b–c).

18 An "inert" category such as N may be motivated by the prodrop phenomenon, in which the specifier may be dropped under pragmatically conditioned circumstances. But this analysis disregards the point that binding relations (hence semantics) still require the coindexation of the specifier with some overt referent, which can be inferred from the discourse. Such an interface phenomenon seems to be better suited for handling by interactions in the components of a multidimensional grammar, rather than as a purely syntactic phenomenon.

(50) a. yaşlı adam-ın küçük kız-ı
        old man-GEN3 little daughter-POSS3s
        'old man's little daughter' = poss(little daughter)(old man)

     b. adam_i-ın kendi_i-si
        man-GEN self-POSS
        'the man himself'

     c. *kendi_i adam_i-ı

6.6.2 Derivation of Possessive Constructions. Example (51) shows the wide scope of the genitive (51a) and nested genitives (51b).

(51) a. yaşlı adam -ın küçük kız -ı
        old man -GEN little daughter -POSS
        'old man's little daughter'

        yaşlı: b<N/b<N; adam: b<N; -ın: (o<N/(o✶N\o✶N))\o<N; küçük: b<N/b<N; kız: b<N; -ı: o✶N\o✶N\n<N
        yaşlı adam and küçük kız each reduce (>) to b<N; -ın applies (<) to give o<N/(o✶N\o✶N), and -ı applies (<) to give o✶N\o✶N; forward application then gives o<N: poss(little daughter)(old man).

     b. ben -im arkadaş -ım -ın ev -i
        I -GEN friend -POSS -GEN house -POSS
        'my friend's house'

        ben: N; -im: N/(N\N)\N; arkadaş: N; -ım: N\N\N; -ın: N/(N\N)\N; ev: N; -i: N\N\N
        ben-im reduces (<) to N/(N\N) and arkadaş-ım to N\N, so ben-im arkadaş-ım gives (>) N; -ın applies (<) to give N/(N\N); ev-i gives (<) N\N; the final application (>) yields N: poss house(poss friend i).
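The possessor-over-possessee scope imposed by the genitive LF λx.λy.poss yx in (49), and the nesting seen in (51b), can be reproduced with a two-line functional sketch; the application notation is simplified and the constructor names are illustrative.

```python
# The genitive of (49) as a curried function: it takes the possessor first,
# then the possessee, and builds poss(possessee)(possessor).
def gen(possessor):
    return lambda possessee: f"poss({possessee})({possessor})"

# (51a): 'yaşlı adam-ın küçük kız-ı'
print(gen("old man")("little daughter"))    # poss(little daughter)(old man)

# (51b): nesting follows from ordinary application.
my_friend = gen("i")("friend")              # poss(friend)(i)
print(gen(my_friend)("house"))              # poss(house)(poss(friend)(i))
```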


6.6.3 Lexical Types for Compounds. Syntactic compounds exhibit syntactic patterns similar to possessive constructions, but they signify semantic relations of a different kind. In what follows, we use the function comp to signify that the arguments in the PAS form a compound but say nothing about the range of productivity of this function. The semantics of the arguments and a qualia structure (Pustejovsky 1991) may indicate the function's range of applicability. Lexical type assignments for compound markers are as in (52).

(52) -COMP  := ◦a s − m✶N\n<N\b✶N: λx.λy.comp xy
     -COMP2 := ◦a s − m✶N\m✶N\n<N\b✶N: λx.λy.λz.comp(comp xy)z   (nested comp)

Syntactic compounds are formed by means of compound markers that are attached to the head of the compound. For morphosyntactic modality, nonreferentiality of the head implies no inflection (b✶N) or modification (53a–b). The left component can be a noun group (53c), in which there is ambiguity in the scope of modification. This is regulated by typing, for example, the intersective adjectives ambiguously as noun modifiers (n<N/n<N) and compound modifiers (m✶N/m✶N).19 The overall compound may be inflected only for case (see, e.g., (53d) and (53e)).

(53) a. otobüs bilet-i (bus ticket-COMP) 'bus ticket'
     b. *otobüs yeşil bilet-i (bus green ticket-COMP)
     c. yeşil otobüs bilet-i (green bus ticket-COMP): green(comp ticket bus) or comp ticket(green bus)
     d. otobüs bilet-i-ni (bus ticket-COMP-ACC2)
     e. *otobüs bilet-i-si (bus ticket-COMP-POSS)

Compound markers serve the dual function of compounding and agreement in possessive constructions; double marking of the possessive is suppressed (cf. 54a–b). The -COMP2 type assignment in (52) handles nested compounds.

(54) a. banka-nın faiz oran-ı (bank-GEN interest rate-COMP.POSS) 'interest rate of the bank'
     b. *banka-nın faiz oran-ı-sı (bank-GEN interest rate-COMP-POSS)

We claim that plural compounds are lexically composite functions in a similar vein. This claim has some empirical support from the lexicalization of -leri as a third person plural possessive marker; see (55b–c). It follows that -leri has the lexical types of -COMP and -COMP2 with plural and possessive composition: λx.λy.plu(comp xy).

19 I am grateful to the anonymous reviewer who proposed this alternative.

(55) a. otobüs bilet-leri (bus ticket-COMP.PLU) 'bus tickets'
     b. onlar-ın ev-leri (they-GEN3 house-POSS3p) 'their house'
     c. onlar-ın ev-ler-i (they-GEN3 house-PLU-POSS3s) 'their houses'
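A compact way to check what the LFs in (52) deliver is to code comp and the lexically composite -COMP2 as functions over strings and reproduce the nested compound of (56e) below; the names are illustrative term constructors, not the system's code.

```python
# comp for -COMP and the nested variant for -COMP2, per the LFs in (52).
def comp(x, y):
    return f"comp {x} {y}"

def comp2(x, y, z):                       # λx.λy.λz.comp(comp x y)z
    return f"comp ({comp(x, y)}) {z}"

print(comp("ticket", "bus"))              # comp ticket bus, cf. (53a)

# (56e) 'kredi kart-ı faiz oran-ı': credit card interest rate
credit_card = f"({comp('card', 'credit')})"
print(comp2("rate", "interest", credit_card))
# comp (comp rate interest) (comp card credit)
```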

6.6.4 Derivation of Compounds. (56) exemplifies derivations with the type assignments in (52). (56a–b) show that both the narrow and the wide scope of the modifier can be accounted for. (56c–d) show that the compound marker interacts with the possessive. Hence, it must carry both poss and comp in possessive constructions involving compounds. (56e–f) are examples of nested compounds. (56f–g) show the effect of strict control (b✶N) over the compound's head.

(56) a. yeşil otobüs bilet -i
        green bus ticket -COMP

        yeşil: n<N/n<N; otobüs: b<N; bilet: b<N; -i: m✶N\n<N\b✶N
        yeşil otobüs reduces (>) to n<N, bilet-i gives (<) m✶N\n<N, and backward application gives m✶N: comp ticket(green bus).

     b. yeşil otobüs bilet -i

        yeşil: m✶N/m✶N; otobüs: b<N; bilet-i: m✶N\n<N
        otobüs bilet-i reduces (<) to m✶N, and yeşil applies (>) to give m✶N: green(comp ticket bus).

c. banka -nın faiz oran -ı
   bank -GEN interest rate -COMP.POSS
   'interest rate of the bank'

   banka: N; -nın: N/(N\N)\N; faiz: N; oran: N; -ı: N\N\N\N
   banka-nın reduces (<) to N/(N\N); oran-ı gives (<) N\N\N, and faiz gives (<) N\N; forward application then gives N: poss(comp rate interest)bank.


d. banka -nın faiz oran -ları
   bank -GEN interest rate -COMP.POSS.PLU
   'interest rates of the bank'

   The derivation is as in (56c), with -ları contributing the plural as well; the final step gives N: poss(plu(comp rate interest))bank.

e. kredi kart -ı faiz oran -ı
   credit card -COMP interest rate -COMP2
   'credit card interest rate'

   kredi: N; kart: N; -ı: N\N\N; faiz: N; oran: N; -ı: N\N\N\N
   kart-ı reduces (<) to N\N and oran-ı to N\N\N; kredi kart-ı then gives (<) N and faiz oran-ı gives (<) N\N; the final backward application gives N: comp(comp rate interest)(comp card credit).

f. kredi kart-ı yıllık faiz oran-ı (yıllık 'annual') 'credit card annual interest rate'
g. *kredi kart-ı faiz yıllık oran-ı

7. Conclusion

Theoretical and computational commitment to word-based grammar—and to regarding inflectional morphology as a word-internal process—puts artificial limits on specifying the syntactic and semantic domains of all meaning-bearing elements and on the transparent projection of scope from the lexicon. Designating words as minimal units of the lexicon is too constraining for many languages. This traditional notion is also challenged in current linguistic theorizing (e.g., Jackendoff 1997 and Keenan and Stabler 1997). Marslen-Wilson (1999) argues on psycholinguistic grounds that the lexicon must be morphemic even for morphologically simpler languages such as English. We have argued in this article that the key to the integration of inflectional morphology and syntax is granting representational status to morphemes, which, in a computational system, requires certain precautions. What we propose is enriching the expressive power of the combinatory morphemic lexicon to factor in morphosyntactic types and attachment modalities. Coupled with flexible constituency in the grammar and directionality information coming from the lexicon, these extensions provide the grammar with the information it requires to compute the transparent semantics of morphosyntactic phenomena. This flexibility causes neither inefficiency in parsing nor uncontrolled expressivity. The extensions do not affect the polynomial worst-case complexity results, and category unity is preserved by lattice consistency. The result is a morphemic grammar–lexicon with computationally desirable features such as modularity and transparency. The system is available at ftp://ftp.lcsl.metu.edu.tr/pub/tools/msccg.


Acknowledgments
I am very grateful to four anonymous CL reviewers for extensive commentary and suggestions. Thanks to Wolf König and Stefan Müller for the data; and to Jason Baldridge, Gann Bierner, Aysenur Birtürk, Ruken Çakici, Nissim Francez, Stasinos Konstantopoulos, Markus Kracht, Geert-Jan Kruijff, Alan Libert, Mark Steedman, Ümit Turan, and Deniz Zeyrek for comments, advice, and criticism.

References
Aho, Alfred V. and Jeffrey D. Ullman. 1972. The Theory of Parsing, Translation, and Compiling. Vol. 1. Prentice-Hall.
Anderson, Stephen R. 1982. Where's morphology? Linguistic Inquiry, 13(4):571–612.
Bach, Emmon. 1983. On the relationship between word-grammar and phrase-grammar. Natural Language and Linguistic Theory, 1:65–89.
Baldridge, Jason. 1999. Strong equivalence of CCG and Set-CCG. Unpublished manuscript, University of Edinburgh.
Bar-Hillel, Yehoshua, Chaim Gaifman, and Eliyahu Shamir. 1960. On categorial and phrase structure grammars. Bulletin of the Research Council of Israel, 9F:1–16.
Bozsahin, Cem. 1998. Deriving the predicate-argument structure for a free word order language. In Proceedings of COLING-ACL 1998, Montreal, pages 167–173.
Bozsahin, Cem. 2000a. Directionality and the lexicon: Evidence from gapping. Unpublished manuscript, Middle East Technical University, Ankara.
Bozsahin, Cem. 2000b. Gapping and word order in Turkish. In Proceedings of the 10th International Conference on Turkish Linguistics, Istanbul.
Bozsahin, Cem and Elvan Göçmen. 1995. A categorial framework for composition in multiple linguistic domains. In Proceedings of the 4th International Conference on Cognitive Science of NLP, Dublin.
Bresnan, Joan. 1995. Lexical-Functional syntax. Course notes, Seventh European Summer School in Logic, Language, and Information, Barcelona.
Calder, Jonathan, Ewan Klein, and Henk Zeevat. 1988. Unification categorial grammar. In Proceedings of the 12th International Conference on Computational Linguistics, Budapest, pages 83–86.
Carpenter, Bob. 1992. Categorial Grammar, lexical rules, and the English predicative. In R. Levine, editor, Formal Grammar: Theory and Application. Oxford University Press, pages 168–242.
Carpenter, Bob. 1997. Type-Logical Semantics. MIT Press.
Carpenter, Bob. 1999. The Turing-completeness of Multimodal Categorial Grammars. In Papers Presented to Johan van Benthem in Honor of his 50th Birthday. ESSLLI, Utrecht.
Carpenter, Bob and Gerald Penn. 1994. The Attribute Logic Engine User's Guide, Version 2.0. Carnegie Mellon University.
Chomsky, Noam. 1970. Remarks on nominalization. In R. Jacobs and P. Rosenbaum, editors, Readings in English Transformational Grammar. Ginn, Waltham, MA, pages 184–221.
Chomsky, Noam. 1995. The Minimalist Program. MIT Press.
Collins, Michael. 1997. Three generative, lexicalised models for statistical parsing. In Proceedings of the 35th Annual Meeting of the ACL.
Creider, Chet, Jorge Hankamer, and Derick Wood. 1995. Preset two-head automata and natural language morphology. International Journal of Computer Mathematics, 58:1–18.
Dalrymple, Mary and Ronald M. Kaplan. 2000. Feature indeterminacy and feature resolution. Language, 76:759–798.
Dowty, David. 1979. Word Meaning and Montague Grammar. Kluwer, Dordrecht.
Dowty, David. 1991. Toward a minimalist theory of syntactic structure. In Tilburg Conference on Discontinuous Constituency, January 1989.
Dowty, David. 1996. Non-constituent coordination, wrapping, and Multimodal Categorial Grammars. In International Congress of Logic, Methodology, and Philosophy, Florence, August.
Eisner, Jason. 1996. Efficient normal-form parsing for Combinatory Categorial Grammar. In Proceedings of the 34th Annual Meeting of the ACL, pages 79–86.
Fong, Sandiway. 1991. Computational Properties of Principle-Based Grammatical Theories. Ph.D. dissertation, MIT.
Güngördü, Zelal and Kemal Oflazer. 1995. Parsing Turkish using the Lexical-Functional Grammar formalism. Machine Translation, 10:293–319.
Hankamer, Jorge. 1989. Morphological parsing and the lexicon. In W. Marslen-Wilson, editor, Lexical Representation and Process. MIT Press.
Hepple, Mark. 1990a. The Grammar and Processing of Order and Dependency: A Categorial Approach. Ph.D. dissertation, University of Edinburgh.
Hepple, Mark. 1990b. Normal form theorem proving for the Lambek Calculus. In Proceedings of COLING 1990.
Hepple, Mark and Glyn Morrill. 1989. Parsing and derivational equivalence. In Proceedings of the 4th EACL, Manchester.
Heylen, Dirk. 1997. Underspecification in Type-Logical Grammars. In Logical Aspects of Computational Linguistics (LACL), Nancy.
Heylen, Dirk. 1999. Types and Sorts: Resource Logic for Feature Checking. Ph.D. dissertation, Utrecht University.
Hockenmaier, Julia, Gann Bierner, and Jason Baldridge. 2000. Providing robustness for a CCG system. Unpublished manuscript, University of Edinburgh.
Hoeksema, Jack. 1985. Categorial Morphology. Garland, New York.
Hoeksema, Jack and Richard D. Janda. 1988. Implications of process-morphology for Categorial Grammar. In Richard T. Oehrle, Emmon Bach, and Deirdre Wheeler, editors, Categorial Grammars and Natural Language Structures. D. Reidel, Dordrecht, pages 199–247.
Hoffman, Beryl. 1995. The Computational Analysis of the Syntax and Interpretation of "Free" Word Order in Turkish. Ph.D. dissertation, University of Pennsylvania.
Jackendoff, Ray. 1997. The Architecture of the Language Faculty. MIT Press.
Jacobson, Pauline. 1996. The syntax/semantics interface in Categorial Grammar. In Shalom Lappin, editor, The Handbook of Contemporary Semantic Theory. Blackwell, pages 89–116.
Janeway, Roger. 1990. Unacceptable ambiguity in Categorial Grammar. In Proceedings of the Ninth West Coast Conference on Formal Linguistics, pages 305–316.
Johnson, Mark. 1988. Deductive parsing with multiple levels of representation. In Proceedings of the 26th Annual Meeting of the ACL, pages 241–248.
Johnson, Mark and Sam Bayer. 1995. Features and agreement in Lambek Categorial Grammar. In Proceedings of the 1995 ESSLLI Formal Grammar Workshop, pages 123–137.
Jurafsky, Daniel and James H. Martin. 2000. Speech and Language Processing. Prentice-Hall.
Karttunen, Lauri. 1989. Radical lexicalism. In Mark Baltin and Anthony Kroch, editors, Alternative Conceptions of Phrase Structure. University of Chicago Press, pages 43–65.
Keenan, Edward L. and Edward Stabler. 1997. Bare grammar. Course notes, Ninth European Summer School on Logic, Language, and Information, Aix-en-Provence.
Komagata, Nobo. 1997. Efficient parsing for CCGs with generalized type-raised categories. In Proceedings of the Fifth International Workshop on Parsing Technologies, pages 135–146.
König, Esther. 1989. Parsing as natural deduction. In Proceedings of the 27th Annual Meeting of the ACL, pages 272–279.
Kracht, Markus. 1999. Referent systems, argument structure, and syntax. ESSLLI lecture notes, Utrecht.
Kruijff, Geert-Jan M. and Jason M. Baldridge. 2000. Relating categorial type logics and CCG through simulation. Unpublished manuscript, University of Edinburgh.
Lambek, Joachim. 1958. The mathematics of sentence structure. American Mathematical Monthly, 65:154–170.
Marslen-Wilson, William. 1999. Abstractness and combination: The morphemic lexicon. In Simon Garrod and Martin J. Pickering, editors, Language Processing. Psychology Press, East Sussex, UK, pages 101–119.
Miller, Philip H. and Ivan A. Sag. 1997. French clitic movement without clitics or movement. Natural Language and Linguistic Theory, 15:573–639.
Moortgat, Michael. 1988a. Categorial Investigations: Logical and Linguistic Aspects of the Lambek Calculus. Foris, Dordrecht.
Moortgat, Michael. 1988b. Mixed composition and discontinuous dependencies. In Richard T. Oehrle, Emmon Bach, and Deirdre Wheeler, editors, Categorial Grammars and Natural Language Structures. D. Reidel, Dordrecht, pages 319–348.
Moortgat, Michael and Richard T. Oehrle. 1994. Adjacency, dependency and order. In Proceedings of the Ninth Amsterdam Colloquium.
Morrill, Glyn V. 1994. Type Logical Grammar: Categorial Logic of Signs. Kluwer.
Morrill, Glyn V. 1999. Geometry of lexico-syntactic interaction. In Proceedings of the Ninth EACL, Bergen.
Müller, Stefan. 1999. Deutsche Syntax deklarativ. Head-Driven Phrase Structure Grammar für das Deutsche. Linguistische Arbeiten 394. Max Niemeyer Verlag, Tübingen.
Oflazer, Kemal. 1994. Two-level description of Turkish morphology. Literary and Linguistic Computing, 9(2).
Oflazer, Kemal. 1996. Error-tolerant finite-state recognition with applications to morphological analysis and spelling correction. Computational Linguistics, 22:73–89.
Oflazer, Kemal, Elvan Göçmen, and Cem Bozsahin. 1994. An outline of Turkish morphology. Report to NATO Science Division SfS III (TU-LANGUAGE), Brussels.
Oflazer, Kemal, Bilge Say, Dilek Zeynep Hakkani-Tür, and Gökhan Tür. 2001. Building a Turkish treebank. In Anne Abeillé, editor, Building and Exploiting Syntactically-Annotated Corpora. Kluwer.
Pareschi, Remo and Mark Steedman. 1987. A lazy way to chart-parse with Categorial Grammars. In Proceedings of the 25th Annual Meeting of the ACL, pages 81–88.
Pereira, Fernando C. N. and Stuart M. Shieber. 1987. Prolog and Natural-Language Analysis. CSLI, Stanford, CA.
Pollard, Carl and Ivan A. Sag. 1994. Head-Driven Phrase Structure Grammar. University of Chicago Press.
Pulman, Stephen G. 1996. Unification encodings of grammatical notations. Computational Linguistics, 22:295–327.
Pustejovsky, James. 1991. The generative lexicon. Computational Linguistics, 17(4):409–441.
Sehitoglu, Onur. 1996. A sign-based phrase structure grammar for Turkish. Master's thesis, Middle East Technical University.
Sehitoglu, Onur and Cem Bozsahin. 1999. Lexical rules and lexical organization: Productivity in the lexicon. In Evelyne Viegas, editor, Breadth and Depth of Semantic Lexicons. Kluwer, pages 39–57.
Steedman, Mark. 1985. Dependency and coordination in the grammar of Dutch and English. Language, 61(3):523–568.
Steedman, Mark. 1987. Combinatory grammars and parasitic gaps. Natural Language and Linguistic Theory, 5:403–439.
Steedman, Mark. 1988. Combinators and grammars. In Richard T. Oehrle, Emmon Bach, and Deirdre Wheeler, editors, Categorial Grammars and Natural Language Structures. D. Reidel, Dordrecht, pages 417–442.
Steedman, Mark. 1991a. Structure and intonation. Language, 67:260–298.
Steedman, Mark. 1991b. Type raising and directionality in Combinatory Grammar. In Proceedings of the 29th Annual Meeting of the ACL, pages 71–78.
Steedman, Mark. 1996. Surface Structure and Interpretation. MIT Press.
Steedman, Mark. 2000a. The Syntactic Process. MIT Press.
Steedman, Mark. 2000b. Information structure and the syntax-phonology interface. Linguistic Inquiry, 31(4):649–689.
Szabolcsi, Anna. 1983. ECP in categorial grammar. Unpublished manuscript, Max-Planck Institute.
Szabolcsi, Anna. 1987. Bound variables in syntax: Are there any? In Proceedings of the 6th Amsterdam Colloquium, pages 331–350.
Tomita, Masaru. 1988. The Generalized LR Parser/Compiler. Technical report, Center for Machine Translation, Carnegie Mellon University.
Vijay-Shanker, K. and David J. Weir. 1993. Parsing some constrained grammar formalisms. Computational Linguistics, 19:591–636.
Wechsler, Stephen and Larisa Zlatić. 2000. A theory of agreement and its application to Serbo-Croatian. Language, 76:799–832.
Whitelock, Pete J. 1988. A feature-based categorial morpho-syntax for Japanese. In Uwe Reyle and C. Rohrer, editors, Natural Language Parsing and Linguistic Theories. D. Reidel, pages 230–261.
Williams, Edwin. 1981. On the notions "lexically related" and "head of a word." Linguistic Inquiry, 12(2):245–274.
Wittenburg, Kent. 1987. Predictive combinators. In Proceedings of the 25th Annual Meeting of the ACL, pages 73–79.
