Identifying Participation of Individual Verbs Or Verbnet Classes in The

Proceedings of the Society for Computation in Linguistics Volume 2 Article 16 2019 Identifying Participation of Individual Verbs or VerbNet Classes in the Causative Alternation Esther Seyffarth Heinrich Heine University Düsseldorf, [email protected] Follow this and additional works at: https://scholarworks.umass.edu/scil Part of the Computational Linguistics Commons Recommended Citation Seyffarth, Esther (2019) "Identifying Participation of Individual Verbs or VerbNet Classes in the Causative Alternation," Proceedings of the Society for Computation in Linguistics: Vol. 2 , Article 16. DOI: https://doi.org/10.7275/efvz-jy59 Available at: https://scholarworks.umass.edu/scil/vol2/iss1/16 This Paper is brought to you for free and open access by ScholarWorks@UMass Amherst. It has been accepted for inclusion in Proceedings of the Society for Computation in Linguistics by an authorized editor of ScholarWorks@UMass Amherst. For more information, please contact [email protected]. Identifying Participation of Individual Verbs or VerbNet Classes in the Causative Alternation Esther Seyffarth Heinrich Heine University Düsseldorf Düsseldorf, Germany [email protected] Abstract meaning. The choice of a transitive or intransitive syntactic frame has an impact on the semantic roles Verbs that participate in diathesis alternations that are part of the meaning of the verb. Consider have different semantics in their different syn- sentences (1) and (2). tactic environments, which need to be distin- guished in order to process these verbs and (1) The number of students has decreased. their contexts correctly. We design and imple- ment 8 approaches to the automatic identifica- (2) We have decreased the number of students. tion of the causative alternation in English (3 based on VerbNet classes, 5 based on individ- Both sentences make a statement about the number ual verbs). For verbs in this alternation, the se- of students having changed. In (1), the only semantic roles that contribute to the meaning of mantic role that is explicitly encoded is the THEME the verb can be associated with different syn- (the thing that has decreased). In addition to this tactic slots. Our most successful approaches role, sentence (2) also specifies an AGENT that has use distributional vectors and achieve an F1 score of up to 79% on a balanced test set. We causative control over the event. also apply our approaches to the distinction be- The automatic identification of verbs that par- tween the causative alternation and the unex- ticipate in the causative alternation is not trivial, pressed object alternation. Our best system for because the semantic roles involved in the events this is based on syntactic information, with an described by the different uses of these verbs are F1 score of 75% on a balanced test set. encoded in the syntactic frames in different ways. In the sentences above, the THEME is located in the 1 Introduction syntactic subject position in (1), but in the syntactic English verbs impose syntactic and semantic re- direct object position in (2). strictions on their arguments, but some verbs are It is insufficient to use the presence or absence more flexible than others. A number of verbs in of syntactic arguments, such as the direct object, English have different syntactic frames (subcate- as indicators for or against a verb’s participation gorization frames, SCFs) that are associated with in the alternation, since verbs can occur with dif- different semantics. This behavior of a subset of ferent sets of arguments for other reasons. Many the verbs in a language is known as diathesis alter- verbs in English have optional direct objects; in the nations, or verb alternations (Levin, 1993). terminology of Levin (1993), they participate in Verbs that participate in one or more alternations the unexpressed object alternation. Distinguishing are potentially problematic in the context of natural between different alternations is essential for tasks language processing tasks. In order to be able to that rely on correct semantic analyses of a verb and process an instance of an alternating verb in a text, its arguments. Examples for applications where it is necessary to distinguish between the different this is important are question answering, informa- possible uses, so that the correct meaning can be tion extraction, or summarization. assigned to the given instance. This paper describes, compares and evaluates a The causative alternation is one of the regular total of 8 approaches to the automatic identification verb alternations in English. Verbs in this alter- of verbs in the causative alternation. Of particular nation can be used intransitively, with an inchoat- interest to us is the comparison of different evalua- ive meaning, or transitively, leading to a causative tion conditions and different test sets, which give 146 Proceedings of the Society for Computation in Linguistics (SCiL) 2019, pages 146-155. New York City, New York, January 3-6, 2019 a wide range of accuracy scores depending on the decide whether each verb participates in the alterna- setup. tion. The author describes several different setups; Our results show that some of our setups are the highest accuracy on her test set for the causative more robust than others against different evaluation alternation is 73%. conditions. The different evaluation conditions are Merlo and Stevenson (2001) distinguish three useful because they can expose problems of indi- types of optionally intransitive verbs using a num- vidual setups, such as a tendency to assign false ber of features. Their distinction is between unerga- positive labels. tive, unaccusative, and object-drop verbs. Their We also evaluate the performance of our systems features include semantic features like animacy or on the distinction between verbs in the causative causativity, as well as syntactic features like pas- alternation and verbs in the unexpressed-object al- sive voice or presence of a VBN tag. The com- ternation. Although these alternations resemble bination of all features leads to a classification each other in the SCFs they allow, our results show accuracy of 69.8% on a test set of 20 unergative that some systems perform well on both classifica- verbs, 19 unaccusative verbs, and 20 object-drop tion tasks. verbs. From their experiments with human anno- While English alternating verbs can be identi- tators, they derive an “expert-based upper bound” fied by looking them up in one of the resources accuracy around 86.5% for the task. that exist for such purposes, we find it desirable Sun et al. (2013) present an unsupervised ap- to create dynamic systems for the identification of proach to the semantic classification of verbs that alternating verbs, for two main reasons. First, the uses approximations of diathesis alternations as phenomenon may be productive to a certain degree, features. While they evaluate their system on verb which means that resources can become outdated; class induction tasks, their method for approximat- and second, there are other languages with similar ing diathesis alternations is also of interest in isola- phenomena for which no (large) resources like this tion from the clustering results. The approximation exist. A system that does well on English alterna- takes into account the subcategorization frames that tion identification can be helpful in building this are observed for each verb in a corpus and the like- type of resource for other languages. lihood of individual verbs and individual SCFs. As each diathesis alternation gives rise to several SCFs, 2 Related Work the joint probability can be used to predict whether While diathesis alternations have been a topic of the SCFs of a verb are observed by chance or due linguistic discussion for some time, the idea of to the verb’s participation in an alternation. identifying these alternations automatically has An area of research that is related to, but distinct mostly been discussed after the publication of from the task we discuss here is the clustering of Levin (1993). Her lists of verb alternations and verbs into Levin classes, VerbNet classes, or other of verb classes made it possible to design systems semantic classes. Examples for this are presented in that cluster verbs automatically and to evaluate the Lapata and Brew (1999); Schulte Im Walde (2000); outputs of those systems against Levin’s data. Joanis (2002); Joanis and Stevenson (2003); Steven- The creation of resources like WordNet (Miller, son and Joanis (2003); Schulte Im Walde (2006); 1998) and VerbNet (Kipper et al., 2000) made this Joanis et al. (2008); Sun et al. (2008); Korhonen even easier. For instance, approaches like the one (2009). While verb classes are associated with by McCarthy (2000, 2001) use the WordNet hi- diathesis alternations, there is no one-to-one rela- erarchy to predict whether a verb participates in tion between a verb class and an alternation. Thus, the causative or conative alternation. Using a sub- strategies that are useful for verb class prediction categorization lexicon derived from the BNC, she cannot always be used in the same way for the calculates the similarity of the role fillers for each prediction of verb alternations. In this paper, we position in the verb’s syntactic frame to identify develop systems for the identification of diathesis cases where different slots have a systematic over- alternations because we are specifically interested lap in their semantic preferences, which is seen as in properties of verbs at the syntax-semantics inter- an indicator for the alternation. McCarthy (2000) face. hand-picks a set of 46 positive and 53 negative One of the contributions of our work that has verbs that are being classified. Human annotators not been discussed in depth in the literature is the 147 comparison of the performance of different classifi- Our strategy for collecting verbs outside the cation methods on the lists by Levin (1993) and on causative alternation creates a negative set that is VerbNet classes, and on different subsets of verbs not too large and contains only verbs that exhibit from those lists.

Load more