
The Computational Lexical Semantics of Syntagmatic Relations

Evelyne Viegas, Stephen Beale and Sergei Nirenburg
New Mexico State University, Computing Research Lab, Las Cruces, NM 88003, USA
viegas, sb, sergei@crl.nmsu.edu

Abstract

In this paper, we address the issue of syntagmatic expressions from a computational lexical semantic perspective. From a representational viewpoint, we argue for a hybrid approach combining linguistic and conceptual paradigms, in order to account for the continuum we find in natural languages from free-combining words to frozen expressions. In particular, we focus on the place of lexical and semantic restricted co-occurrences. From a processing viewpoint, we show how to generate/analyze syntagmatic expressions by using an efficient constraint-based processor, well fitted for a knowledge-driven approach.

1 Introduction

"You can take advantage of the chambermaid" [1] is not a collocation one would like to generate in the context of a hotel to mean "use the services of." This is why collocations should constitute an important part in the design of Machine Translation or Multilingual Generation systems.

In this paper, we address the issue of syntagmatic expressions from a computational lexical semantic perspective. From a representational viewpoint, we argue for a hybrid approach combining linguistic and conceptual paradigms, in order to account for the continuum we find in natural languages from free-combining words to frozen expressions (such as the idiom kick the (proverbial) bucket). In particular, we focus on the representation of restricted semantic and lexical co-occurrences, such as heavy smoker and professor ... students respectively, which we define later. From a processing viewpoint, we show how to generate/analyze syntagmatic expressions by using an efficient constraint-based processor, well fitted for a knowledge-driven approach. In the following, we first compare different approaches to collocations. Second, we present our approach in terms of representation and processing. Finally, we show how to facilitate the acquisition of co-occurrences by using 1) the notion of lexical rules (LRs), and 2) an inheritance hierarchy of Lexical Semantic Functions (LSFs).

[1] Lederer, R. 1990. Anguished English. A Laurel Book, Dell Publishing.

2 Approaches to Syntagmatic Relations

Syntagmatic relations, also known as collocations, are used differently by lexicographers, linguists and statisticians, denoting almost similar but not identical classes of expressions.

The traditional approach to collocations has been lexicographic. Here dictionaries provide information about what is unpredictable or idiosyncratic. Benson (1989) synthesizes Hausmann's studies on collocations, calling expressions such as commit murder, compile a dictionary, inflict a wound, etc. "fixed combinations, recurrent combinations" or "collocations". In Hausmann's terms (1979) a collocation is composed of two elements, a base ("Basis") and a collocate ("Kollokator"); the base is semantically autonomous whereas the collocate cannot be semantically interpreted in isolation. In other words, the set of lexical collocates which can combine with a given base is not predictable, and therefore collocations must be listed in dictionaries.

It is hard to say that there has been a real focus on collocations from a linguistic perspective. The topic has been broadly sacrificed by both English-speaking schools and continental European schools. The scientific agenda of the former has been largely dominated by syntactic issues until recently, whereas the latter was more concerned with pragmatic aspects of natural languages. The focus has been on grammatical collocations such as adapt to, aim at, look for. Lakoff (1970) distinguishes a class of expressions which cannot undergo certain operations, such as nominalization and causativization: the problem is hard; *the hardness of the problem; *the problem hardened. The restriction on the application of certain syntactic operations can help define collocations such as hard problem, for example. Mel'čuk's treatment of collocations will be detailed below.

In recent years, there has been a resurgence of statistical approaches applied to the study of natural languages.
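Statistical approaches of this kind typically begin by counting words that appear within a short window of one another. The following is a minimal sketch of such a counter; the window size and corpus are illustrative assumptions, not details from this paper:

```python
from collections import Counter

def window_pairs(tokens, window=5):
    """Count unordered word pairs co-occurring within `window` tokens.

    A crude approximation of statistical collocation extraction; real
    systems weight raw counts, e.g. by mutual information (Church and
    Hanks, 1989), rather than using frequencies directly.
    """
    pairs = Counter()
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + window]:
            pairs[tuple(sorted((w, v)))] += 1
    return pairs
```

Raw window counts alone also surface frequent free-combining pairs such as lawyer-client, which is exactly the weakness the text goes on to note.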

Sinclair (1991) states that a word "which occurs in close proximity to a word under investigation is called a collocate of it ... Collocation is the occurrence of two or more words within a short space of each other in a text". The problem is that with such a definition of collocations, even when improved, [2] one identifies not only collocations but also free-combining pairs frequently appearing together, such as lawyer-client; doctor-hospital. However, nowadays researchers seem to agree that combining statistical with symbolic approaches leads to quantifiable improvements (Klavans and Resnik, 1996).

[2] Church and Hanks (1989) and Smadja (1993) use statistics in their algorithms to extract collocations from texts.

The Meaning Text Theory Approach. The Meaning Text Theory (MTT) is a generator-oriented lexical grammatical formalism. Lexical knowledge is encoded in an entry of the Explanatory Combinatorial Dictionary (ECD), each entry being divided into three zones: the semantic zone (a semantic network representing the meaning of the entry in terms of more primitive words), the syntactic zone (the grammatical properties of the entry) and the lexical combinatorics zone (containing the values of the Lexical Functions (LFs) [3]). LFs are central to the study of collocations:

  A lexical function F is a correspondence which associates a lexical item L, called the key word of F, with a set of lexical items F(L), the value of F. (Mel'čuk, 1988) [4]

We focus here on syntagmatic LFs describing co-occurrence relations such as pay attention, legitimate complaint, from a distance. [5]

[3] See (Iordanskaja et al., 1991) and (Ramos et al., 1994) for their use of LFs in MTT and NLG respectively.
[4] (Heid, 1989) contrasts Hausmann's base and collocate with Mel'čuk's keyword and LF values.
[5] There are about 60 LFs listed, said to be universal; the lexicographic approach of Mel'čuk and Zolkovsky has been applied, among other languages, to Russian, French, German and English.

Heylen et al. (1993) have worked out some cases which help license a starting point for assigning LFs. They distinguish four types of syntagmatic LFs:

• evaluative qualifier: Magn(bleed) = profusely
• distributional qualifier: Mult(sheep) = flock
• co-occurrence: Loc-in(distance) = at a distance
• verbal operator: Oper1(attention) = pay

The MTT approach is very interesting, as it provides a model of production well suited for generation, with its different strata, and also a lot of lexical-semantic information. It seems nevertheless that all the collocational information is listed in a static way. We believe that one of the main drawbacks of the approach is the lack of any predictable calculi on the possible expressions which can collocate with each other semantically.

3 The Computational Lexical Semantic Approach

In order to account for the continuum we find in natural languages, we argue for a continuum perspective, spanning the range from free-combining words to idioms, with semantic collocations and idiosyncrasies in between, as defined in (Viegas and Bouillon, 1994):

• free-combining words (the girl ate candies)
• semantic collocations (fast car; long book) [6]
• idiosyncrasies (large coke; green jealousy)
• idioms (to kick the (proverbial) bucket)

[6] See (Pustejovsky, 1995) for his account of such expressions using a [...] operator.

Formally, we go from a purely compositional approach in "free-combining words" to a non-compositional approach in idioms. In between, a (semi-)compositional approach is still possible. (Viegas and Bouillon, 1994) showed that we can reduce the set of what are conventionally considered as idiosyncrasies by differentiating "true" idiosyncrasies (difficult to derive or calculate) from expressions which have well-defined calculi, being compositional in nature, and that have been called semantic collocations. In this paper, we further distinguish idiosyncrasies into:

• restricted semantic co-occurrence, where the meaning of the co-occurrence is semi-compositional between the base and the collocate (strong coffee, pay attention, heavy smoker, ...)
• restricted lexical co-occurrence, where the meaning of the collocate is compositional but has a lexically idiosyncratic behavior (lecture ... student; rancid butter; sour milk).

We provide below examples of restricted semantic co-occurrences in (1), and restricted lexical co-occurrences in (2).
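As a rough illustration (our own sketch, not the ECD machinery), syntagmatic lexical functions of the kind listed above can be modeled as a finite mapping from an (LF, keyword) pair to the set of values F(L); the table repeats the four examples just given:

```python
# Minimal sketch of syntagmatic lexical functions as a lookup table.
# LF names follow the examples in the text; the data is illustrative.
LF_TABLE = {
    ("Magn", "bleed"): {"profusely"},           # evaluative qualifier
    ("Mult", "sheep"): {"flock"},               # distributional qualifier
    ("Loc-in", "distance"): {"at a distance"},  # co-occurrence
    ("Oper1", "attention"): {"pay"},            # verbal operator
}

def lf(name, keyword):
    """F(L): the set of values of lexical function F for keyword L."""
    return LF_TABLE.get((name, keyword), set())
```

An unlisted pair simply yields the empty set, reflecting the static, enumerated character of the ECD's combinatorics zone.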

Restricted semantic co-occurrence. The semantics of the combination of the entries is semi-compositional. In other words, there is an entry in the lexicon for the base (the semantic collocate is encoded inside the base), whereas we cannot directly refer to the sense of the semantic collocate in the lexicon, as it is not part of its senses. We assign the co-occurrence a new semi-compositional sense, where the sense of the base is composed with a new sense for the collocate.

(1a) #0=[key: "smoker",
       rel: [syntagmatic: LSFIntensity
         [base: #0, collocate:
           [key: "heavy",
            gram: [subCat: Attributive],
            freq: [value: 8]]]] ...]

(1b) #0=[key: "attention",
       rel: [syntagmatic: LSFOper
         [base: #0, collocate:
           [key: "pay",
            gram: [subCat: SupportVerb],
            freq: [value: 5]]]] ...]

In examples (1), the LSFs (LSFIntensity, LSFOper, ...) are equivalent (and some identical) to the LFs provided in the ECD. The notion of LSF is the same as that of LFs. However, LSFs and LFs are different in two ways: i) conceptually, LSFs are organized into an inheritance hierarchy; ii) formally, they are rules, and produce a new entry composed of two entries, the base with the collocate. As such, the new composed entry is ready for processing. These LSFs signal a compositional syntax and a semi-compositional semantics. For instance, in (1a), a heavy smoker is somebody who smokes a lot, and not a "fat" person. It has been shown that one cannot code in the lexicon all uses of heavy, as in heavy smoker, heavy drinker, .... Therefore, we do not have in our lexicon for heavy a sense for "a lot", or a sense for "strong" to be composed with wine, etc. It is well known that such co-occurrences are lexically marked; if we allowed in our lexicon a proliferation of senses, multiplying ambiguities in analysis and choices in generation, then there would be no limit to what could be combined, and we could end up generating *heavy coffee with the sense of "strong" for heavy.

The left-hand side of the rule LSFIntensity specifies an "Intensity-Attribute" applied to an event which accepts aspectual features of duration. In (1a), the event is smoke. The LSFIntensity rule also provides the syntax-semantics interface, allowing an Adj-Noun construction to be either predicative (the car is red) or attributive (the red car). We therefore need to restrict the co-occurrence to the attributive use only, as the predicative use is not allowed: the smoker is heavy has a literal or figurative meaning, but not a collocational one.

In (1b) again, there is no sense in the dictionary for pay which would mean concentrate. The rule LSFOper makes the verb pay a verbal operator.

Restricted lexical co-occurrence. The semantics of the combination of the entries is compositional. In other words, there are entries in the lexicon for the base and the collocate, with the same senses as in the co-occurrence. Therefore, we can directly refer to the senses of the co-occurring words. What we are capturing here is a lexical idiosyncrasy; in other words, we specify that we should prefer this particular combination of words. This is useful for analysis, where it can help disambiguate a sense, and is most relevant for generation; it can be viewed as a preference among the paradigmatic family of the co-occurrence.

(2a) #0=[key: "truth",
       rel: [syntagmatic: LSFSyn
         [base: #0, collocate:
           [key: "plain", sense: adj2,
            lr: [comp: no, superl: no]]]] ...]

(2b) #0=[key: "pupil",
       rel: [syntagmatic: LSFSyn
         [base: #0, collocate:
           [key: "teacher", sense: n2,
            freq: [value: 5]]]] ...]

(2c) #0=[key: "conference",
       rel: [syntagmatic: LSFSyn
         [base: #0, collocate:
           [key: "student", sense: n1,
            freq: [value: 9]]]] ...]

In examples (2), the LSFSyn rule produces a new entry composed of two or more entries. As such, the new entry is ready for processing. LSFSyn signals a compositional syntax and a compositional semantics, and restricts the senses to be used in the composition. We can directly refer to the sense of the collocate, as it is part of the lexicon.

In (2a) the entry for truth specifies one co-occurrence (plain truth), where the sense of plain here is adj2 (obvious), and not, say, adj3 (flat). The syntagmatic expression inherits all the zones of the entry for "plain", sense adj2; we only code here the irregularities. For instance, "plain" can be used as "plainer", "plainest" in its adj2 sense, but not as such within the lexical co-occurrence: "*plainer truth", "*plainest truth". We therefore must block these forms in the collocate, as expressed in (comp: no, superl: no). In other words, we will not generate "plainer/plainest truth". Examples (2b) and (2c) illustrate complex entries, as there is no direct grammatical dependency between the base and the collocate. In (2b) for instance, we prefer to associate teacher with the context of a pupil rather than any other element belonging to the paradigmatic family of teacher, such as professor, instructor. No further restriction is required.
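Entries like (2a) can be approximated as nested dictionaries. The sketch below mirrors the field names of the examples, but it is an illustrative encoding of ours, not the system's actual formalism:

```python
# Simplified encoding of entry (2a): the base "truth" carries its
# restricted lexical co-occurrence with "plain" (sense adj2), and the
# lr zone blocks comparative/superlative forms inside the collocation.
TRUTH = {
    "key": "truth",
    "rel": {"syntagmatic": {
        "lsf": "LSFSyn",
        "collocate": {"key": "plain", "sense": "adj2",
                      "lr": {"comp": False, "superl": False}},
    }},
}

def collocate_key(entry):
    """Return the key of the collocate recorded in the base's entry."""
    return entry["rel"]["syntagmatic"]["collocate"]["key"]

def allows_superlative(entry):
    """True unless the collocation blocks it (e.g. *plainest truth)."""
    lr = entry["rel"]["syntagmatic"]["collocate"].get("lr", {})
    return lr.get("superl", True)
```

Everything not listed under lr is inherited from the collocate's own entry, so only the irregularities need to be coded, as the text describes.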

Formally, there is no difference between the two types of co-occurrences. In both cases, we specify the base (which is the word described in the entry itself), the collocate, the frequency of the co-occurrence in some corpus, and the LSF which links the base with the collocate. Using the formalism of typed feature structures, both cases are of type Co-occurrence, as defined below:

Co-occurrence = [base: Entry,
                 collocate: Entry,
                 freq: Frequency];

3.1 Processing of Syntagmatic Relations

We utilize an efficient constraint-based control mechanism called Hunter-Gatherer (HG) (Beale, 1997). HG allows us to mark certain compositions as being dependent on each other and then forget about them. Thus, once we have two lexicon entries that we know go together, HG will ensure that they do. HG also gives preference to co-occurring compositions. In analysis, meaning representations constructed using co-occurrences are preferred over those that are not, and, in generation, realizations involving co-occurrences are preferred over equally correct, but non-co-occurring realizations. [7] The real work in processing is making sure that we have the correct two entries to put together.

[7] The selection of co-occurrences is part of the lexical process; in other words, if there are reasons not to choose a co-occurrence, because of the presence of modifiers or for other reasons, the generator will not generate the co-occurrence.

In restricted semantic co-occurrences, the co-occurrence does not have the correct sense in the lexicon. For example, when the phrase heavy smoker is encountered, the lexicon entry for heavy would not contain the correct sense. (1a) could be used to create the correct entry. In (1a), the entry for smoker contains the key, or trigger, heavy. This signals the analyzer to produce another sense for heavy in heavy smoker. This sense will contain the same syntactic information present in the "old" heavy, except for any modifications listed in the "gram" section (see (1a)). The semantics of the new sense comes directly from the LSF. Generation works the same, except that the trigger is different. The input to generation will be a SMOKE event along with an Intensity-Attribute. (1a), which would be used to realize the SMOKE event, would trigger LSFIntensity, which has the Intensity-Attribute in its left-hand side, thus confirming the production of heavy.

Restricted lexical co-occurrences are easier, in the sense that the correct entry already exists in the lexicon. The analyzer/generator simply needs to detect the co-occurrence and add the constraint that the corresponding senses be used together. In examples like (2b), there is no direct grammatical or semantic relationship between the words that co-occur. Thus, the entire clause, or even the entire text, may have to be searched for the co-occurrence. In practice, we limit such searches to the sentence level.

3.2 Acquisition of Syntagmatic Relations

The acquisition of syntagmatic relations is knowledge intensive, as it requires human intervention. In order to minimize this cost we rely on conceptual tools such as lexical rules and the LSF inheritance hierarchy.

Lexical Rules in Acquisition. The acquisition of restricted semantic co-occurrences can be minimized by detecting rules between different classes of co-occurrences (modulo the presence of derived forms in the lexicon with the same or subsumed semantics). Looking at the following example,

A + N              <=>  V + Adv
bitter resentment       resent bitterly
heavy smoker            smoke heavily
big eater               eat *bigly

V + Adv            <=>  Adv + Adj-ed
oppose strongly         strongly opposed
oblige morally          morally obliged

we see that after having acquired, with human intervention, co-occurrences belonging to the A + N class, we can use lexical rules to derive the V + Adv class and also the Adv + Adj-ed class.

Lexical rules are a useful conceptual tool to extend a dictionary. (Viegas et al., 1996) used derivational lexical rules to extend a Spanish lexicon. We apply their approach to the production of restricted semantic co-occurrences. Note that eat bigly will be produced but then rejected, as the form bigly does not exist in a dictionary. The rules overgenerate co-occurrences. This is less of a problem for analysis than for generation. To use these derived restricted co-occurrences in generation, the output of the lexical rule processor must be checked. This can be done in different ways: dictionary check, corpus check, and ultimately human check.

Other classes, such as the ones below, can be extracted using lexico-statistical tools, such as in (Smadja, 1993), and then checked by a human.

V + N: pay attention, meet an obligation, commit an offence, ...
N + N: dance marathon, marriage ceremony, ... of derision, ...

LSFs and Inheritance. We take advantage of 1) the semantics encoded in the lexemes, and 2) an inheritance hierarchy of LSFs. We illustrate briefly this notion of an LSF inheritance hierarchy. For instance, the left-hand side of LSFChangeState specifies that it applies to foods (solid or liquid) which are human processed, and produces the collocates rancid, rancio (Spanish). Therefore it could apply to milk, butter, or wine.
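The A + N <=> V + Adv derivation with its dictionary check, described earlier in this section, can be sketched as follows. The morphology rule and word list are illustrative assumptions; a real system would consult its full lexicon:

```python
# Sketch of a derivational lexical rule: from an A + N restricted
# semantic co-occurrence, propose the corresponding V + Adv pair,
# then filter overgenerated forms against a (toy) dictionary.
DICTIONARY = {"resent", "bitterly", "smoke", "heavily", "eat"}  # no "bigly"

def adj_to_adv(adj):
    """Naive -ly adverb formation (illustrative only)."""
    return adj[:-1] + "ily" if adj.endswith("y") else adj + "ly"

def derive_v_adv(adj, verb):
    """Propose (verb, adverb); reject it if either form is unattested."""
    candidate = (verb, adj_to_adv(adj))
    return candidate if all(w in DICTIONARY for w in candidate) else None
```

Here eat bigly is proposed and then rejected because bigly is unattested, mirroring the overgenerate-then-filter strategy described above.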

The LSFChangeState rule would thus end up producing rancid milk, rancid butter, or vino rancio (rancid wine), which is fine in Spanish. We therefore need to further distinguish LSFChangeState into LSFChangeStateSolid and LSFChangeStateLiquid. This restricts the application of the rule to produce rancid butter, by going down the hierarchy. This enables us to factor out information common to several entries, and can be applied to both types of co-occurrences. We only have to code in the co-occurrence the information relevant to the combination; the rest is inherited from its entry in the dictionary.

4 Conclusion

In this paper, we built on a knowledge-based continuum perspective, spanning the range from free-combining words to idioms. We further distinguished the notion of idiosyncrasies, as defined in (Viegas and Bouillon, 1994), into restricted semantic co-occurrences and restricted lexical co-occurrences. We showed that they are formally equivalent, thus facilitating the processing of strictly compositional and semi-compositional expressions. Moreover, by considering the information in the lexicon as constraints, the linguistic difference between compositionality and semi-compositionality becomes a virtual difference for Hunter-Gatherer. We showed ways of minimizing the acquisition costs by 1) using lexical rules as a way of expanding co-occurrences, and 2) taking advantage of the LSF inheritance hierarchy. The main advantage of our approach over the ECD approach is to use the semantics coded in the lexemes, along with the language-independent LSF inheritance hierarchy, to propagate restricted semantic co-occurrences. The work presented here is complete concerning representational aspects and processing aspects (analysis and generation): it has been tested on the translation of on-line unrestricted texts. The large-scale acquisition of restricted co-occurrences is in progress.

5 Acknowledgements

This work has been supported in part by DoD under contract number MDA-904-92-C-5189. We would like to thank Pierrette Bouillon, Léo Wanner and Rémi Zajac for helpful discussions, and the anonymous reviewers for their useful comments.

References

S. Beale. 1997. HUNTER-GATHERER: Applying Constraint Satisfaction, Branch-and-Bound and Solution Synthesis to [...]. Ph.D. Diss., Carnegie Mellon University.
M. Benson. 1989. The Structure of the Collocational Dictionary. International Journal of Lexicography.
K.W. Church and P. Hanks. 1989. Word Association Norms, Mutual Information and Lexicography. In Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics.
F.J. Hausmann. 1979. Un dictionnaire des collocations est-il possible ? Travaux de Linguistique et de Littérature XVII, 1.
U. Heid. 1979. Décrire les collocations : deux approches lexicographiques et leur application dans un outil informatisé. Internal Report, Stuttgart University.
D. Heylen. 1993. Collocations and the Lexicalisation of Semantic Information. In Collocations, TR ET-10/75, Taaltechnologie, Utrecht.
L. Iordanskaja, R. Kittredge and A. Polguère. 1991. Lexical Selection and Paraphrase in a Meaning-Text Generation Model. In C. L. Paris, W. Swartout and W. Mann (eds.), NLG in AI and CL. Kluwer Academic Publishers.
J. Klavans and P. Resnik. 1996. The Balancing Act: Combining Symbolic and Statistical Approaches to Language. MIT Press, Cambridge, Mass., London, England.
G. Lakoff. 1970. Irregularities in Syntax. New York: Holt, Rinehart and Winston, Inc.
I. Mel'čuk. 1988. Paraphrase et lexique dans la théorie Sens-Texte. In Bes & Fuchs (eds.), Lexique 6.
S. Nirenburg and I. Nirenburg. 1988. A Framework for Lexical Selection in NLG. In Proceedings of COLING 88.
J. Pustejovsky. 1995. The Generative Lexicon. MIT Press.
M. Ramos, A. Tutin and G. Lapalme. 1994. Lexical Functions of the Explanatory Combinatorial Dictionary for Lexicalization in Text Generation. In P. St-Dizier & E. Viegas (eds.), Computational Lexical Semantics. CUP.
J. Sinclair. 1991. Corpus, Concordance, Collocation. Oxford University Press.
F. Smadja. 1993. Retrieving Collocations from Text: Xtract. Computational Linguistics, 19(1).
E. Viegas and P. Bouillon. 1994. Semantic Lexicons: the Cornerstone for Lexical Choice in Natural Language Generation. In Proceedings of the 7th INLG, Kennebunkport.
E. Viegas, B. Onyshkevych, V. Raskin and S. Nirenburg. 1996. From Submit to Submitted via Submission: on Lexical Rules in Large-scale Lexicon Acquisition. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics.
L. Wanner (ed.). 1996. Lexical Functions in Lexicography and Natural Language Processing. John Benjamins Publishing Company.
