Removing Gender and Number Cues for Difficult Pronominal Anaphora


The KNOWREF Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution

Ali Emami*¹, Paul Trichelair*¹, Adam Trischler², Kaheer Suleman², Hannes Schulz², and Jackie Chi Kit Cheung¹
¹School of Computer Science, Mila/McGill University
²Microsoft Research Montreal
fali.emami, [email protected] fadam.trischler, kasulema, [email protected] [email protected]
*equal contribution

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3952–3961, Florence, Italy, July 28 - August 2, 2019. © 2019 Association for Computational Linguistics

Abstract

We introduce a new benchmark for coreference resolution and NLI, KNOWREF, that targets common-sense understanding and world knowledge. Previous coreference resolution tasks can largely be solved by exploiting the number and gender of the antecedents, or have been handcrafted and do not reflect the diversity of naturally occurring text. We present a corpus of over 8,000 annotated text passages with ambiguous pronominal anaphora. These instances are both challenging and realistic. We show that various coreference systems, whether rule-based, feature-rich, or neural, perform significantly worse on the task than humans, who display high inter-annotator agreement. To explain this performance gap, we show empirically that state-of-the-art models often fail to capture context, instead relying on the gender or number of candidate antecedents to make a decision. We then use problem-specific insights to propose a data-augmentation trick called antecedent switching to alleviate this tendency in models. Finally, we show that antecedent switching yields promising results on other tasks as well: we use it to achieve state-of-the-art results on the GAP coreference task.

1 Introduction

Coreference resolution is one of the best known tasks in Natural Language Processing (NLP). Despite a large body of work in the area over the last few decades (Morton, 2000; Bean and Riloff, 2004; McCallum and Wellner, 2005; Rahman and Ng, 2009), the task remains challenging. Many resolution decisions require extensive world knowledge and understanding common points of reference (Pradhan et al., 2011). In the case of pronominal anaphora resolution, these forms of “common sense” become much more important when cues like gender and number do not by themselves indicate the correct resolution (Trichelair et al., 2018).

To date, most existing methods for coreference resolution (Raghunathan et al., 2010; Lee et al., 2011; Durrett et al., 2013; Lee et al., 2017, 2018) have been evaluated on a few popular datasets, including the CoNLL 2011 and 2012 shared coreference resolution tasks (Pradhan et al., 2011, 2012). These datasets were proposed as the first comprehensively tagged and large-scale corpora for coreference resolution, to spur progress in state-of-the-art techniques. According to Durrett and Klein (2013), this progress would contribute in the “uphill battle” of modelling not just syntax and discourse, but also semantic compatibility based on world knowledge and context.

Despite improvements in benchmark dataset performance, the question of what exactly current systems learn or exploit remains open, particularly with recent neural coreference resolution models. Lee et al. (2017) note that their model does “little in the uphill battle of making coreference decisions that require world knowledge,” and highlight a few examples in the CoNLL 2012 task that rely on more complex understanding or inference. Because these cases are infrequent in the data, systems can perform very well on the CoNLL tasks according to standard metrics by exploiting surface cues. High-performing models have also been observed to rely on social stereotypes present in the data, which could unfairly impact their decisions for some demographics (Zhao et al., 2018).

There is a recent trend, therefore, to develop more challenging and diverse coreference tasks. Perhaps the most popular of these is the Winograd Schema Challenge (WSC), which has emerged as an alternative to the Turing test (Levesque et al., 2011). The WSC task is carefully controlled such that heuristics involving syntactic salience, the number and gender of the antecedents, or other obvious syntactic/semantic cues are ineffective. Previous approaches to common sense reasoning, based on logical formalisms (Bailey et al., 2015) or deep neural models (Liu et al., 2016), have solved only restricted subsets of the WSC with high precision. These shortcomings can in part be attributed to the limited size of the corpus (273 instances), which is a side effect of its hand-crafted nature.

Webster et al. (2018) recently presented a corpus called GAP that consists of about 4,000 unique binary coreference instances from English Wikipedia. This corpus is intended to address gender bias and the mentioned size limitations of the WSC. We believe that gender bias in coreference resolution is part and parcel of a more general problem: current models are unable to abstract away from the entities in the sentence to take advantage of the wider context to make a coreference decision.

To tackle this issue, we present a coreference resolution corpus called KNOWREF that specifically targets the ability of systems to reason about a situation described in the context.¹ We designed this task to be challenging, large-scale, and based on natural text. The main contributions of this paper are as follows:

1. We develop mechanisms by which we construct a human-labeled corpus of 8,724 Winograd-like text samples whose resolution requires significant common sense and background knowledge. As an example:
   Marcus is undoubtedly faster than Jarrett right now but in [his] prime the gap wasn’t all that big. (answer: Jarrett)

2. We propose a task-specific metric called consistency that measures the extent to which a model uses the full context (as opposed to a surface cue) to make a coreference decision. We use this metric to analyze the behavior of state-of-the-art methods and demonstrate that they generally under-utilize context information.

3. We find that a fine-tuned version of the recent large-scale language model, BERT (Devlin et al., 2018), performs significantly better than other methods on KNOWREF, although with substantial room for improvement to match human performance.

4. We demonstrate the benefits of a data-augmentation technique called antecedent switching in expanding our corpus, further deterring models from exploiting surface cues, as well as in transferring to models trained on other co-reference tasks like GAP, leading to state-of-the-art results.

¹ The corpus, the code to scrape the sentences from the source texts, as well as the code to reproduce all of our experimental results are available at https://github.com/aemami1/KnowRef.

2 Related Work

2.1 General coreference resolution

Automated techniques for standard coreference resolution (that is, the task of correctly partitioning the entities and events that occur in a document into resolution classes) date back to decision trees and hand-written rules (Hobbs, 1977; McCarthy, 1995). The earliest evaluation corpora were the Message Understanding Conferences (MUC) (Grishman and Sundheim, 1996) and the ACE (Doddington et al., 2004). These focused on noun phrases tagged with coreference information, but were limited in either size or annotation coverage.

The datasets of Pradhan et al. (2011, 2012) from the CoNLL-2011 and CoNLL-2012 Shared Tasks were proposed as large-scale corpora with high inter-annotator agreement. They were constructed by restricting the data to coreference phenomena with highly consistent annotations, and were packaged with a standard evaluation framework to facilitate performance comparisons.

The quality of these tasks led to their widespread use and the emergence of many resolution systems, ranging from hand-engineered methods to deep-learning approaches. The multi-pass sieve system of Raghunathan et al. (2010) is fully deterministic and makes use of mention attributes like gender and number; it maintained the best results on the CoNLL 2011 task for a number of years (Lee et al., 2011). Later, lexical learning approaches emerged as the new state of the art (Durrett and Klein, 2013), followed more recently by neural models (Wiseman et al., 2016; Clark and Manning, 2016). The current state-of-the-art result on the CoNLL 2012 task is by an end-to-end neural model from Lee et al. (2018) that does not rely on a syntactic parser or a hand-engineered mention detector.

2.2 Gender bias in general coreference resolution

Zhao et al. (2018) observed that state-of-the-art methods for coreference resolution become gender-biased, exploiting various stereotypes that leak from society into data. They devise a dataset of 3,160 manually written sentences called WinoBias that serves both as a gender-bias test for coreference resolution models and as a training set to counter stereotypes in existing corpora (i.e., the two CoNLL tasks). The following example is representative:

(1) The physician hired the secretary because he was overwhelmed with clients.

(2) The physician hired the secretary because she was overwhelmed with clients.

[...] Challenge. The goal was that any successful system would necessarily use common-sense knowledge.

Although the WSC is an important step in evaluating systems en route to human-like language understanding, its size and other characteristics are a bottleneck for progress in pronoun disambiguation (Trichelair et al., 2018). A Winograd-like expanded corpus was proposed by Rahman and Ng (2012) to address the WSC’s size limitations; however, systems that perform well on the expanded dataset do not transfer successfully to the original WSC (Rahman and Ng, 2012; Peng et al., 2015), likely due to loosened constraints in the former.
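
The antecedent switching and consistency ideas named in the contributions can be illustrated with a small sketch. This excerpt does not give the formal definitions, so the code below rests on assumptions: antecedent switching is taken to mean swapping the two candidate names in the text and flipping the label, and consistency is taken to be the share of instances whose prediction changes after the switch. The names `Instance`, `switch_antecedents`, and `consistency` are hypothetical and are not taken from the authors' released code.

```python
import re
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Instance:
    """Hypothetical container for one binary pronoun-resolution example."""
    text: str                    # sentence with the pronoun marked, e.g. "[his]"
    candidates: Tuple[str, str]  # the two candidate antecedents
    answer: str                  # the correct antecedent (one of the candidates)


def switch_antecedents(inst: Instance) -> Instance:
    """Swap both candidate names everywhere in the text and flip the label.

    Assumption: candidates are plain names matched on word boundaries.
    """
    a, b = inst.candidates
    pattern = re.compile(r"\b(%s|%s)\b" % (re.escape(a), re.escape(b)))
    swapped = pattern.sub(lambda m: b if m.group(0) == a else a, inst.text)
    new_answer = b if inst.answer == a else a
    return Instance(swapped, (b, a), new_answer)


def consistency(model: Callable[[Instance], str],
                instances: Sequence[Instance]) -> float:
    """Fraction of instances whose prediction changes after the switch.

    Assumed reading of the metric: with two candidates, a model that follows
    the context should pick the other name once the names trade places.
    """
    if not instances:
        raise ValueError("need at least one instance")
    changed = sum(model(i) != model(switch_antecedents(i)) for i in instances)
    return changed / len(instances)


# The example sentence from the list of contributions.
example = Instance(
    "Marcus is undoubtedly faster than Jarrett right now "
    "but in [his] prime the gap wasn't all that big.",
    ("Marcus", "Jarrett"),
    "Jarrett",
)
switched = switch_antecedents(example)  # answer becomes "Marcus"
```

Under these assumptions, a toy model that always answers "Marcus" regardless of context scores 0.0, because its prediction never moves when the names trade places, while a model that always returns the gold answer scores 1.0.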