
Classification of telicity using cross-linguistic annotation projection

Annemarie Friedrich¹  Damyana Gateva²
¹Center for Information and Language Processing, LMU Munich
²Department of Computational Linguistics, Saarland University
[email protected]  [email protected]

Abstract

This paper addresses the automatic recognition of telicity, an aspectual notion. A telic event includes a natural endpoint (she walked home), while an atelic event does not (she walked around). Recognizing this difference is a prerequisite for temporal natural language understanding. In English, this classification task is difficult, as telicity is a covert linguistic category. In contrast, in Slavic languages, aspect is part of a verb's meaning and even available in machine-readable dictionaries. Our contributions are as follows. We successfully leverage additional silver standard training data in the form of projected annotations from parallel English-Czech data as well as context information, improving automatic telicity classification for English significantly compared to previous work. We also create a new data set of English texts manually annotated with telicity.

1 Introduction

This paper addresses the computational modeling of telicity, a linguistic category which represents whether the event type evoked by a sentence's verb constellation (i.e., the verb and its arguments and modifiers) has a natural endpoint or not (Comrie, 1976; Smith, 1997), see (1a) and (1b).

(1) (a) Mary ate an apple. (telic)
    (b) I gazed at the sunset. (atelic)

Automatic recognition of telicity is a necessary step for natural language understanding tasks that require reasoning about time, e.g., natural language generation, summarization, question answering, information extraction or machine translation (Moens and Steedman, 1988; Siegel and McKeown, 2000). For example, there is an entailment relation between English Progressive and Perfect constructions (as shown in (2)), but only for atelic constellations.

(2) (a) He was swimming in the lake. (atelic)
        ⊨ He has swum in the lake.
    (b) He was swimming across the lake. (telic)
        ⊭ He has swum across the lake.

We model telicity at the word-sense level, corresponding to the fundamental aspectual class of Siegel and McKeown (2000), i.e., we take into account the verb and its arguments and modifiers, but no additional aspectual markers (such as the Progressive). In (2) we classify whether the event types "swim in the lake" and "swim across the lake" have natural endpoints. This is defined on a linguistic level rather than by world knowledge requiring inference. "Swimming in the lake" has no natural endpoint, as also shown by the linguistic test presented in (2). In contrast, "swimming across the lake" will necessarily be finished once the other side is reached.

In English, the aspectual notion of telicity is a covert category, i.e., a semantic distinction that is not expressed overtly by lexical or morphological means. As illustrated by (2) and (3), the same verb type (lemma) can introduce telic and atelic events to the discourse depending on the context in which it occurs.

(3) (a) John drank coffee. (atelic)
    (b) John drank a cup of coffee. (telic)

In Slavic languages, aspect is a component of verb meaning. Most verb types are either perfective or imperfective (and are marked as such in dictionaries). For example, the two occurrences of "drink" in (3) are translated into Czech using the imperfective verb "pil" and the perfective verb "vypil," respectively (Filip, 1994):[1]


(4) (a) Pil kávu. (imperfective)
        He was drinking (some) coffee.
    (b) Vypil kávu. (perfective)
        He drank up (all) the coffee.

[1] In Czech, aspectual verb pairs may be related by affixes as in this example, but this is not always the case. They may even use different lexemes (Vintr, 2001).

Our contributions are as follows: (1) using the English-Czech part of InterCorp (Čermák and Rosen, 2012) and a valency lexicon for Czech verbs (Žabokrtský and Lopatková, 2007), we create a large silver standard with automatically derived annotations and validate our approach by comparing the labels given by humans versus the projected labels; (2) we provide a freely available data set of English texts taken from MASC (Ide et al., 2010) manually annotated for telicity; (3) we show that using contextual features and the silver standard as additional training data improves computational modeling of telicity for English in terms of F1 compared to previous work.

2 Related work

Siegel and McKeown (2000, henceforth SMK2000) present the first machine-learning based approach to identifying completedness, i.e., telicity, determining whether an event reaches a culmination or completion point at which a new state is introduced. Their approach describes each verb occurrence exclusively using features reflecting corpus-based statistics of the corresponding verb type. For each verb type, they collect the co-occurrence frequencies with 14 linguistic markers (e.g., present tense, perfect, combination with temporal adverbs) in an automatically parsed background corpus. They call these features linguistic indicators and train a variety of machine learning models based on 300 clauses, of which roughly 2/3 are culminated, i.e., telic. Their test set also contains about 300 clauses, corresponding to 204 distinct non-stative verbs. Their data sets are not available, but as this work is the most closely related to ours, we reimplement their approach and compare to it in Section 5.
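To make the linguistic-indicator idea concrete, the sketch below computes such verb-type features from a parsed corpus. It is an illustration rather than SMK2000's implementation: the marker inventory is cut down to five of the 14 indicators, and the per-clause input format (`parsed_clauses`, a list of dicts) is an assumption made for this example.

```python
from collections import Counter, defaultdict

# Reduced marker inventory; SMK2000 use 14 such linguistic indicators.
MARKERS = ["past", "present", "perfect", "progressive", "temporal_adverb"]

def linguistic_indicators(parsed_clauses):
    """Relative co-occurrence frequency of each marker per verb type.

    parsed_clauses: iterable of dicts such as
      {"lemma": "swim", "tense": "past", "perfect": False,
       "progressive": True, "temporal_adverb": False}
    (a hypothetical output format of a parsed background corpus).
    """
    counts = defaultdict(Counter)
    totals = Counter()
    for clause in parsed_clauses:
        lemma = clause["lemma"]
        totals[lemma] += 1
        for marker in MARKERS:
            # Tense markers are encoded in "tense"; the rest are booleans.
            if clause.get("tense") == marker or clause.get(marker) is True:
                counts[lemma][marker] += 1
    # One feature vector per verb type, normalized by its total frequency.
    return {lemma: {m: counts[lemma][m] / totals[lemma] for m in MARKERS}
            for lemma in totals}
```

Note that such features describe the verb type as a whole, not the individual occurrence, which is why this representation cannot distinguish the two readings of "drink" in (3).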
Samardžić and Merlo (2016) create a model for the real-world duration of events (as short or long) of English verbs as annotated in TimeBank (Pustejovsky et al., 2003). The model is informed by temporal boundedness information collected from parallel English-Serbian data. Their only features are how often the respective verb type was aligned to Serbian verbs carrying certain affixes that indicate perfectiveness or imperfectiveness. Their usage of "verb type" differs from ours as they do not lemmatize, i.e., they always predict that "falling" is a long event, while "fall" is short. Our approach shares the idea of projecting aspectual information from Slavic languages to English, but in contrast to classifying verb types, we classify whether an event type introduced by the verb constellation of a clause is telic or atelic, making use of a machine-readable dictionary for Czech instead of relying on affix information.

Loáiciga and Grisot (2016) create an automatic classifier for boundedness, defined as whether the endpoint of an event has occurred or not, and show that this is useful for picking the correct tense in French translations of the English Simple Past. Their classifier employs a similar but smaller feature set compared to ours. Other related work on predicting aspect includes systems aiming at identifying aspectual class (Siegel and McKeown, 2000; Friedrich and Palmer, 2014) or habituals (Mathew and Katz, 2009; Friedrich and Pinkal, 2015).

Cross-linguistic annotation projection approaches mostly make use of existing manually created annotations in the source language; similar to our approach, Diab and Resnik (2002) and Marasović et al. (2016) leverage properties of the source language to automatically induce annotations on the target side.

3 Data sets and annotation projection

We conduct our experiments based on two data sets: (a) English texts from MASC manually annotated for telicity, on which we train and test our computational models, and (b) a silver standard automatically extracted via annotation projection from the Czech-English part of the parallel corpus InterCorp, which we use as additional training data in order to improve our models.[2]

[2] Annotations, guidelines and code available from https://github.com/annefried/telicity

3.1 Gold standard: MASC (EN)

We create a new data set consisting of 10 English texts taken from MASC (Ide et al., 2010), annotated for telicity. Texts include two essays, a journal article, two blog texts, two history texts from travel guides, and three texts from the fiction genre.

                               MASC (gold standard)   InterCorp (silver standard)   intersection
clauses (instances)            1863                   457,000                       -
% telic                        82                     55                            -
% atelic                       18                     45                            -
distinct verb types (lemmas)   567                    2262                          510
ambiguous verb types           70                     1130                          69

Table 1: Corpus statistics.

Annotation was performed using the web-based SWAN system (Gühring et al., 2016). Annotators were given a short written manual with instructions. We model telicity for dynamic (eventive) verb occurrences because stative verbs (e.g., "like") do not have built-in endpoints by definition. Annotators choose one of the labels telic and atelic, or they skip clauses that they consider to be stative. In a first round, each verb occurrence was labeled by three annotators (the second author of this paper plus two paid student assistants). They unanimously agreed on telicity labels for 1166 verb occurrences; these are directly used for the gold standard. Cases in which only two annotators agreed on a telicity label (the third annotator may have either disagreed or skipped the clause) are labeled by a fourth independent annotator (the first author), who did not have access to the labels of the first round. This second annotation round resulted in 697 further cases in which three annotators gave the same telicity label. Statistics for our final gold standard, which consists of all instances for which at least three out of the four annotators agreed, are shown in Table 1; "ambiguous" verb types are those for which the gold standard contains both telic and atelic instances. 510 of the 567 verb types also occur in the InterCorp silver standard, which provides training instances for 69 out of the 70 ambiguous verb types.

Finally, there are 446 cases for which no three annotators supplied the same label. Disagreement and skipping was mainly observed for verbs indicating attributions ("critics claim" or "the film uses"), which can be perceived either as statives or as instances of historic present. Other difficult cases include degree verbs ("increase"), aspectual verbs ("begin"), perception verbs ("hear"), iteratives ("flash") and the verb "do." For these cases, decisions on how to treat them may have to be made depending on the concrete application; for now, they are excluded from our gold standard. Another source of error is that despite the training, annotators sometimes conflate their world knowledge (i.e., that some events necessarily come to an end eventually, such as the "swimming in the lake" in (2)) with the annotation task of determining telicity at a linguistic level.

3.2 Silver standard: InterCorp (EN-CZ)

We create a silver standard of approximately 457,000 labeled English verb occurrences (i.e., clauses) extracted from the InterCorp parallel corpus project (Čermák and Rosen, 2012). We leverage the sentence alignments, as well as the part-of-speech and lemma information, provided by InterCorp. We use the data from 151 sentence-aligned books (novels) of the Czech-English part of the corpus and further align the verbs of all 1:1-aligned sentence pairs to each other using the verbs' lemmas, achieving high precision by making sure that the translation of the verbs is licensed by the free online dictionary Glosbe.[3] We then look up the aspect of the Czech verb in Vallex 2.8.3 (Žabokrtský and Lopatková, 2007), a valency lexicon for Czech verbs, and project the label telic to English verb occurrences corresponding to a perfective Czech verb and the label atelic to instances translated using imperfective verbs.

[3] https://glosbe.com

Our annotation projection approach leverages the fact that most perfective Czech verbs will be translated into English using verb constellations that induce a telic event structure, as they describe one-time finished actions. Imperfective verbs, in contrast, are used for actions that are presented as unfinished, repeated or extending in time (Vintr, 2001). They are often, but not always, translated using atelic verb constellations. A notable exception is the English Progressive: "John was reading a book" signals an ongoing event in the past, which is telic at the word-sense level but would require translation using the imperfective Czech verb "četl." The initial corpus contained 4% sentences in the Progressive, out of which 89% were translated using imperfectives.[4] Due to this mismatch, we remove all English Progressive sentences from our silver standard.

[4] For comparison, in the manually annotated validation sample only 66% of Progressives received the label atelic.
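A minimal sketch of the resulting projection step for a single 1:1-aligned sentence pair is given below. The two lookup structures, vallex_aspect (Czech lemma to perfective/imperfective, as would be extracted from Vallex) and licensed_pairs (Czech-English lemma pairs licensed by Glosbe), are hypothetical stand-ins for the resources named above, and the function itself is an illustration, not the released code.

```python
def project_telicity(en_verbs, cz_verbs, licensed_pairs, vallex_aspect):
    """Project aspect labels for one 1:1-aligned sentence pair.

    en_verbs, cz_verbs: lists of (token_id, lemma) tuples for the verbs
    of the English and the Czech sentence. Returns a dict mapping English
    token ids to "telic" or "atelic".
    """
    labels = {}
    for en_id, en_lemma in en_verbs:
        for _, cz_lemma in cz_verbs:
            # Keep precision high: only accept verb alignments whose
            # translation is licensed by the dictionary.
            if (cz_lemma, en_lemma) not in licensed_pairs:
                continue
            aspect = vallex_aspect.get(cz_lemma)  # "pf" or "impf"
            if aspect == "pf":
                labels[en_id] = "telic"    # perfective -> telic
            elif aspect == "impf":
                labels[en_id] = "atelic"   # imperfective -> atelic
    return labels
```

In the full pipeline, English Progressive sentences are filtered out before this step, for the reason discussed above.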

Statistics for the final automatically created silver standard are shown in Table 1.

For validation, we sample 2402 instances from the silver standard created above and have our three annotators from the first annotation round mark them in the same way as the MASC data. Sampling picked one instance per verb type but was otherwise random. A majority among the three annotators can be reached in 2126 cases (due to allowing skipping).[5] In this sample, 77.8% of the instances received the label telic from the human annotators, while 61.5% received the label telic from the projection method. The accuracy of our projection method can be estimated as about 78%; F1 for the telic class is 0.84, F1 for atelic is 0.65. Errors made by the projection include for instance habituals, which use the imperfective in Czech but are not necessarily atelic at the event type level, as in "John cycles to work every day."

[5] Of the 2402 cases, annotators completely agreed on 1577 cases (1114 telic, 203 atelic, 260 skipped). 85 cases were 2x atelic + 1x skipped, 219 cases were 2x telic + 1x skipped.

4 Computational modeling

In this section, we describe the computational models for telicity classification, which we test on the MASC data and which we improve by adding the InterCorp silver standard data.

Features. We model each instance by means of a variety of syntactic-semantic features, using the toolkit provided by Friedrich et al. (2016).[6] Preprocessing is done using Stanford CoreNLP (Chen and Manning, 2014) based on dkpro (Eckart de Castilho and Gurevych, 2014). For the verb's lemma, the features include the WordNet (Fellbaum, 1998) sense and supersense and linguistic indicators (Siegel and McKeown, 2000) extracted from GigaWord (Graff et al., 2003). Using only the latter as features corresponds to the system by SMK2000 as described in Section 2. The feature set also describes the verb's subject and objects, among others their number, person, countability,[7] their most frequent WordNet sense and the respective supersenses, and dependency relations between the verb and its governor(s). In addition, tense and whether the clause is in the Perfect or Progressive aspect are reflected, as well as the presence of clausal (e.g., temporal) modifiers. For replicability we make the configuration files for the feature set available.

[6] https://github.com/annefried/sitent
[7] http://celex.mpi.nl

Classifier. We train L1-regularized multi-class logistic regression models using LIBLINEAR (Fan et al., 2008) with parameter settings ε = 0.01 and bias = 1. For each instance described by feature vector $\vec{x}$, the probability of each possible label $y$ (here telic or atelic) is computed according to

$$P(y \mid \vec{x}) = \frac{1}{Z(\vec{x})} \exp\left( \sum_{i=1}^{m} \lambda_i f_i(y, \vec{x}) \right),$$

where $f_i$ are the feature functions, $\lambda_i$ are the weights learned for each feature function, and $Z(\vec{x})$ is a normalization constant (Klinger and Tomanek, 2007). The feature functions $f_i$ indicate whether a particular feature is present, e.g., whether the tense of the verb is "past."
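This setup can be approximated with scikit-learn's LIBLINEAR backend. The snippet below is such an approximation, not the exact configuration used in the experiments; the two toy instances and their three features are invented for illustration, the real feature set being the one described above.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Invented toy instances standing in for the full feature set.
train_feats = [
    {"lemma": "eat", "object_countable": True, "tense": "past"},
    {"lemma": "gaze", "object_countable": False, "tense": "past"},
]
train_labels = ["telic", "atelic"]

vec = DictVectorizer()            # one indicator feature f_i per value
X = vec.fit_transform(train_feats)

# L1-regularized logistic regression via LIBLINEAR; tol=0.01 mirrors the
# epsilon=0.01 stopping criterion, intercept_scaling=1.0 the bias=1 setting.
clf = LogisticRegression(penalty="l1", solver="liblinear",
                         tol=0.01, intercept_scaling=1.0)
clf.fit(X, train_labels)

x_test = vec.transform([{"lemma": "eat", "object_countable": False,
                         "tense": "present"}])
print(dict(zip(clf.classes_, clf.predict_proba(x_test)[0])))  # P(y|x)
```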
5 Experiments

Experimental settings. We evaluate our models via 10-fold cross validation (CV) on the MASC data set. We split the data into folds by document in order to make sure that no training data from the same document is available for any test instance, which would introduce an unfair bias. We report results in terms of accuracy, F1 per class and macro-average F1 (the harmonic mean of macro-average precision and recall). We test significance of differences in F1 (for each class) using approximate randomization (Yeh, 2000; Padó, 2006) with p < 0.1, and significance of differences in accuracy using McNemar's test (McNemar, 1947) with p < 0.01. Table 2 shows our results; significantly different scores are marked with the same symbol where relevant (per column).
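For reference, the following is a minimal sketch of the paired approximate randomization test; the experiments use the sigf implementation (Padó, 2006), so this illustrates the procedure rather than reproducing that tool.

```python
import random

def approx_randomization(preds_a, preds_b, gold, metric, trials=10000):
    """Two-sided paired approximate randomization test (Yeh, 2000).

    metric: function mapping (predictions, gold) to a score, e.g. F1 of
    the atelic class. Returns an estimate of the probability that the
    observed score difference arises by chance.
    """
    observed = abs(metric(preds_a, gold) - metric(preds_b, gold))
    extreme = 0
    for _ in range(trials):
        swapped_a, swapped_b = [], []
        for a, b in zip(preds_a, preds_b):
            # Swap the two systems' outputs for this instance with p=0.5.
            if random.random() < 0.5:
                a, b = b, a
            swapped_a.append(a)
            swapped_b.append(b)
        if abs(metric(swapped_a, gold) - metric(swapped_b, gold)) >= observed:
            extreme += 1
    return (extreme + 1) / (trials + 1)  # add-one smoothing
```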

Results. A simple baseline of labeling each instance with the overall majority class (telic) has a very high accuracy, but the output of this baseline is uninformative and results in a low F1. Rows titled "verb type" use the verb's lemma as their single feature and thus correspond to the informed baseline of using the training set majority class for each verb type. Rows labeled "+IC" indicate that the full set of instances with projected labels extracted from InterCorp has been added as additional training data in each fold; in rows titled "+ICs," the telic instances in InterCorp have been upsampled to match the 80:20 distribution in MASC. Our model using the full set of features significantly outperforms the verb type baseline as well as SMK2000 (see † and ‡ in Table 2). Using the additional training data from InterCorp results in a large improvement in the case of the difficult (because infrequent) atelic class (see ★), leading to the best overall results in terms of F1. The best results regarding accuracy and F1 are reached using the sampled version of the silver standard; the differences compared to the respective best scores in each column are not significant.

                Acc.    F1      F1(telic)   F1(atelic)
maj. class      82.0    45.0    90.1         0.0
SMK2000         83.0    63.9†   90.4†       26.8†
SMK2000+IC      78.6    65.6    86.8        44.2
SMK2000+ICs     81.8    58.2    89.9        12.4
verb type       83.8∗   66.7‡   91.0‡       24.9‡
verb type+IC    82.4    73.5    89.0        57.1
verb type+ICs   85.1    72.2    91.2        51.9
our model       86.7∗   74.5†‡  92.2†‡      53.7†‡★
our model+IC    82.3    76.4    88.6        61.4
our model+ICs   86.2    76.2    91.6        60.6★

Table 2: Results for telicity classification on MASC data (1863 instances), 10-fold CV.

Ablation experiments on the MASC data show that features describing the clause's main verb are most important: when ablating the part-of-speech tag as well as tense and aspect (Progressive or Perfect), performance deteriorates by 1.8% in accuracy and 5% in F1, hinting at a correlation between telicity and the choice of tense-aspect form. Whether this is due to an actual correlation of how telic and atelic verbs are used in context or merely due to annotation errors remains to be investigated in future work.

In sum, our experiments show that using annotations projected onto English text from parallel Czech text as cheap additional training data is a step toward creating better models for the task of classifying telicity of verb occurrences.

6 Conclusion

Our model using a diverse set of features representing both verb-type relevant information and the context in which a verb occurs strongly outperformed previous work on predicting telicity (Siegel and McKeown, 2000). We have shown that silver standard data induced from parallel Czech-English data is useful for creating computational models for recognizing telicity in English. Our new manually annotated MASC data set is freely available; the projected annotations for InterCorp are published in a stand-off format due to license restrictions.

7 Future work

Aspectual distinctions made by one language rarely completely correspond to a linguistic phenomenon observed in another language. As we have discussed in Section 3.2, telicity in English and perfectiveness in Czech are closely related. As shown by our experiments, the projected labels carry useful information for the telicity classification task. One idea for future work is thus to leverage additional projected annotations from similar phenomena in additional languages, possibly improving overall performance by combining complementary information. Clustering more than two languages may also enable us to induce clusters corresponding to the different usages of imperfective verbs in Czech.

The presence of endpoints has consequences for the temporal interpretation of a discourse (Smith, 1997; Smith and Erbaugh, 2005), as endpoints introduce new states and therefore signal an advancement of time. In English, boundedness, i.e., whether an endpoint of an event has actually occurred, is primarily signaled by the choice of tense and Progressive or Perfect aspect. In tense-less languages such as Mandarin Chinese, boundedness is a covert category and closely related to telicity. We plan to leverage similar ideas as presented in this paper to create temporal discourse parsing models for such languages.

When translating, telic and atelic constructions also require different lexical choices and appropriate selection of aspectual markers. Hence, telicity recognition is also relevant for machine translation research and could be a useful component in computer-aided language learning systems, helping learners to select appropriate aspectual forms.

Acknowledgments

We thank Klára Jagrová, Irina Stenger and Andrea Fischer for their help with Czech-specific questions and with finding appropriate corpora, and Lucie Poláková for pointing us to Vallex. Melissa Peate Sørensen and Christine Bocionek helped with the annotation. Finally, thanks to the anonymous reviewers, Alexis Palmer, Manfred Pinkal, Andrea Horbach, Benjamin Roth, Katharina Kann and Heike Adel for their helpful comments. This research was supported in part by the MMCI Cluster of Excellence of the DFG.

References

František Čermák and Alexandr Rosen. 2012. The case of InterCorp, a multilingual parallel corpus. International Journal of Corpus Linguistics 17(3):411–427.

Danqi Chen and Christopher D. Manning. 2014. A fast and accurate dependency parser using neural networks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 740–750.

Bernard Comrie. 1976. Aspect: An introduction to the study of verbal aspect and related problems, volume 2 of Cambridge Textbooks in Linguistics. Cambridge University Press.

Mona Diab and Philip Resnik. 2002. An unsupervised method for word sense tagging using parallel corpora. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 255–262.

Richard Eckart de Castilho and Iryna Gurevych. 2014. A broad-coverage collection of portable NLP components for building shareable analysis pipelines. In Proceedings of the Workshop on Open Infrastructures and Analysis Frameworks for HLT. Dublin, Ireland, pages 1–11.

Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research 9:1871–1874.

Christiane Fellbaum, editor. 1998. WordNet: An Electronic Lexical Database. MIT Press, Cambridge, Massachusetts.

Hana Filip. 1994. Aspect and the semantics of noun phrases. Tense and aspect in discourse 75:227.

Annemarie Friedrich and Alexis Palmer. 2014. Automatic prediction of aspectual class of verbs in context. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL). Baltimore, USA.

Annemarie Friedrich, Alexis Palmer, and Manfred Pinkal. 2016. Situation entity types: automatic classification of clause-level aspect. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL). Berlin, Germany.

Annemarie Friedrich and Manfred Pinkal. 2015. Automatic recognition of habituals: a three-way classification of clausal aspect. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). Lisbon, Portugal.

David Graff, Junbo Kong, Ke Chen, and Kazuaki Maeda. 2003. English Gigaword. Linguistic Data Consortium, Philadelphia.

Timo Gühring, Nicklas Linz, Rafael Theis, and Annemarie Friedrich. 2016. SWAN: an easy-to-use web-based annotation system. In Proceedings of the Konferenz zur Verarbeitung natürlicher Sprache (KONVENS). Bochum, Germany.

Nancy Ide, Christiane Fellbaum, Collin Baker, and Rebecca Passonneau. 2010. The manually annotated sub-corpus: A community resource for and by the people. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL). Uppsala, Sweden, pages 68–73.

Roman Klinger and Katrin Tomanek. 2007. Classical probabilistic models and conditional random fields. TU Dortmund Algorithm Engineering Report.

Sharid Loáiciga and Cristina Grisot. 2016. Predicting and using a pragmatic component of lexical aspect. Linguistic Issues in Language Technology, Special issue on Modality in Natural Language Understanding 13.

Ana Marasović, Mengfei Zhou, Alexis Palmer, and Anette Frank. 2016. Modal sense classification at large: Paraphrase-driven sense projection, semantically enriched classification models and cross-genre evaluations. Linguistic Issues in Language Technology, Special issue on Modality in Natural Language Understanding 14(2). CSLI Publications, Stanford, CA.

Thomas A. Mathew and E. Graham Katz. 2009. Supervised categorization of habitual and episodic sentences. In Sixth Midwest Computational Linguistics Colloquium. Bloomington, Indiana: Indiana University.

Quinn McNemar. 1947. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12(2):153–157.

Marc Moens and Mark Steedman. 1988. Temporal ontology and temporal reference. Computational Linguistics 14(2):15–28.

Sebastian Padó. 2006. User's guide to sigf: Significance testing by approximate randomisation.

James Pustejovsky, Patrick Hanks, Roser Saurí, Andrew See, David Day, Lisa Ferro, Robert Gaizauskas, Marcia Lazo, Andrea Setzer, and Beth Sundheim. 2003. The TimeBank corpus. In Corpus Linguistics, page 40.

Tanja Samardžić and Paola Merlo. 2016. Aspect-based learning of event duration using parallel corpora. In Essays in Lexical Semantics and Computational Lexicography – In Honor of Adam Kilgarriff, Springer Series Text, Speech, and Language Technology.

Eric V. Siegel and Kathleen R. McKeown. 2000. Learning methods to combine linguistic indicators: Improving aspectual classification and revealing linguistic insights. Computational Linguistics 26(4):595–628.

Carlota S. Smith. 1997. The Parameter of Aspect, volume 43. Springer Science & Business Media.

Carlota S. Smith and Mary S. Erbaugh. 2005. Temporal interpretation in Mandarin Chinese. Linguistics 43(4):713–756.

Josef Vintr. 2001. Das Tschechische: Hauptzüge seiner Sprachstruktur in Gegenwart und Geschichte. Sagner.

Alexander Yeh. 2000. More accurate tests for the statistical significance of result differences. In Proceedings of the 18th Conference on Computational Linguistics (COLING), Volume 2, pages 947–953.

Zdeněk Žabokrtský and Markéta Lopatková. 2007. Valency information in VALLEX 2.0: Logical structure of the lexicon. The Prague Bulletin of Mathematical Linguistics (87):41–60.
