Coreference: Theory, Annotation, Resolution and Evaluation

Total Page:16

File Type:pdf, Size:1020Kb

Coreference: Theory, Annotation, Resolution and Evaluation Coreference: Theory, Annotation, Resolution and Evaluation by Marta Recasens Potau A Dissertation Presented to the Doctoral Program in Linguistics and Communication, Department of Linguistics, University of Barcelona, in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy under the supervision of Dr. M. Antonia` Mart´ı Anton´ın University of Barcelona Dr. Eduard Hovy ISI – University of Southern California University of Barcelona September 2010 Coreferencia:` Teoria, anotacio,´ resolucio´ i avaluacio´ per Marta Recasens Potau Memoria` presentada dins del Programa de Doctorat Lingu¨´ıstica i Comunicacio´, Bienni 2006–2008, Departament de Lingu¨´ıstica General, Universitat de Barcelona, per optar al grau de Doctor sota la direccio´ de Dra. M. Antonia` Mart´ı Anton´ın Universitat de Barcelona Dr. Eduard Hovy ISI – University of Southern California Universitat de Barcelona Setembre 2010 Counting words isn’t very revealing if you aren’t listening to them, too. – Geoffrey Nunberg “The I’s Don’t Have It”, Fresh Air (November 17, 2009) Usage factors reveal language as a natural, organic social instrument, not an abstract logical one. – Joan Bybee Language, Usage and Cognition (2010:193) iii iv To my loved ones, Merce,` Eduard, Elm and Mark, for being there. v vi Abstract Coreference relations, as commonly defined, occur between linguistic expressions that refer to the same person, object or event. Resolving them is an integral part of discourse comprehension by allowing language users to connect the pieces of discourse information concerning the same entity. Consequently, coreference res- olution has become a major focus of attention in natural language processing as its own task. Despite the wealth of existing research, current performance of corefer- ence resolution systems has not reached a satisfactory level. The thesis is broadly divided into two parts. In the first part, I examine three separate but closely related aspects of the coreference resolution task, namely (i) the encoding of coreference relations in large electronic corpora, (ii) the de- velopment of learning-based coreference resolution systems, and (iii) the scoring and evaluation of coreference systems. Throughout this research, insight is gained into foundational problems in the coreference resolution task that pose obstacles to its feasibility. Hence, my main contribution resides in a critical but constructive analysis of various aspects of the coreference task that, in the second part of the thesis, leads to rethink the concept of coreference itself. First, the annotation of the Spanish and Catalan AnCora corpora (totaling nearly 800k words) with coreference information reveals that the concept of referentiality is not a clear-cut one, and that some relations encountered in real data do not fit the prevailing either-or view of coreference. Degrees of referentiality as well as relations that do not fall neatly into either coreference or non-coreference—or that accept both interpretations—are a major reason for the lack of inter-coder agree- ment in coreference annotation. Second, experiments on the contribution of over forty-five learning features to coreference resolution show that, although the extended set of linguistically mo- tivated features results in an overall significant improvement, this is smaller than vii expected. In contrast, the simple head-match feature alone succeeds in obtaining a quite satisfactory score. It emerges that head match is one of the few features suffi- ciently represented for machine learning to work. The complex interplay between factors, and the fact that pragmatics and world knowledge do not lend themselves to be captured systematically in the form of pairwise learning features, are indi- cators that the way machine learning is currently applied may not be well suited to the coreference task. I advocate for entity-based systems like the one presented in this thesis, CISTELL, as the model best suited to address the coreference prob- lem. CISTELL allows not only the accumulation and carrying of information from “inside” the text, but also the storing of background and world knowledge from “outside” the text. Third, further experiments, as well as the SemEval shared task, demonstrate that the current evaluation of coreference resolution systems is obscured by a num- ber of factors including variations in the task definition, the use of gold-standard or automatically predicted mention boundaries, and the disagreement between the system rankings produced by the widely-used evaluation metrics (MUC, B3, CEAF). The imbalance between the number of singletons and multi-mention en- tities in the data accounts for measurement biases toward either over- or under- clustering. The BLANC measure that I propose, which is a modified implementa- tion of the Rand index, addresses this imbalance by dividing the score into coref- erence and non-coreference links. Finally, the second part of the thesis concludes that abandoning the traditional categorical understanding of coreference is the first step to further the state of the art. To this end, the notion of near-identity is introduced within a continuum model of coreference. From a cognitive perspective, I argue for the variable granular- ity level at which discourse entities can be conceived. It is posited that three different categorization operations—specification, refocusing and neutralization— govern the shifts that discourse entities undergo as a discourse evolves and so ac- count for (near-)coreference relations. This new continuum model provides sound theoretical foundations to the coreference problem, both for the linguistic and com- putational fields. viii Resum Les relacions de coreferencia,` segons la definicio´ mes´ comuna, s’estableixen entre expressions lingu¨´ıstiques que es refereixen a una mateixa persona, objecte o es- deveniment. Resoldre-les es´ una part integral de la comprensio´ del discurs ja que permet als usuaris de la llengua connectar les parts del discurs que contenen infor- macio´ sobre una mateixa entitat. En consequ¨encia,` la resolucio´ de la coreferencia` ha estat un focus d’atencio´ destacat del processament del llenguatge natural, on te´ una tasca propia.` Tanmateix, malgrat la gran quantitat de recerca existent, els re- sultats dels sistemes actuals de resolucio´ de la coreferencia` no han assolit un nivell satisfactori. La tesi es divideix en dos grans blocs. En el primer, examino tres aspectes diferents pero` estretament relacionats de la tasca de resolucio´ de la coreferencia:` (i) l’anotacio´ de relacions de coreferencia` en grans corpus electronics,` (ii) el de- senvolupament de sistemes de resolucio´ de la coreferencia` basats en aprenentatge automatic` i (iii) la qualificacio´ i avaluacio´ dels sistemes de coreferencia.` En el transcurs d’aquesta investigacio,´ es fa evident que la tasca de coreferencia` presenta una serie` de problemes de base que constitueixen veritables obstacles per a la seva correcta resolucio.´ Per aixo,` la meva aportacio´ principal es´ una analisi` cr´ıtica i alhora constructiva de diferents aspectes de la tasca de coreferencia` que finalment condueix, en el segon bloc de la tesi, al replantejament del concepte mateix de coreferencia` . En primer lloc, l’anotacio´ amb coreferencia` dels corpus AnCora del castella` i el catala` (un total de 800.000 paraules) posa al descobert, d’una banda, que el concepte de referencialitat no esta` clarament delimitat i, d’una altra, que algunes relacions observades en dades d’us´ real no encaixen dins la visio´ de la coreferencia` entesa en termes dicotomics.` Tant els graus de referencialitat com les relacions ix que no son´ ni coreferencials ni no coreferencials (o que accepten totes dues inter- pretacions) son´ una de les raons principals que dificulten assolir un alt grau d’acord entre els anotadors d’aquesta tasca. En segon lloc, els experiments realitzats sobre la contribucio´ de mes´ de quaranta- cinc trets d’aprenentage automatic` a la resolucio´ de la coreferencia` mostren que, tot i que el ventall de trets motivats lingu¨´ısticament porta a una millora significativa general, aquesta es´ mes´ petita que l’esperada. En canvi, el senzill tret de mateix- nucli (head match) aconsegueix tot sol resultats prou satisfactoris. D’aixo` se’n despren` que es tracta d’un dels pocs trets suficientment representats per al bon fun- cionament de l’aprenentatge automatic.` La interaccio´ complexa que es dona´ entre els diversos factors aix´ı com el fet que el coneixement pragmatic` i del mon´ no es deixa representar sistematicament` en forma de trets d’aprenentatge de parells de mencions son´ indicadors que la manera en que` actualment s’aplica l’aprenentatge automatic` pot no ser especialment idonia` per a la tasca de coreferencia.` Per aixo,` considero que el millor model per adrec¸ar el problema de la coreferencia` corre- spon als sistemes basats en entitats com CISTELL, que presento a la tesi. Aquest sistema permet no nomes´ emmagatzemar informacio´ de “dins” del text sino´ tambe´ recollir coneixement general i del mon´ de “fora” del text. En tercer lloc, altres experiments aix´ı com la tasca compartida del SemEval demostren l’existencia` de diversos factors que questionen¨ la manera en que` actual- ment s’avaluen els sistemes de resolucio´ de la coreferencia.` Es tracta de varia- cions en la definicio´ de la tasca, l’extraccio´ de mencions a partir de l’estandard` de referencia` o predites automaticament,` i el desacord entre els ranquings` de sis- temes donats per les metriques` d’avaluacio´ mes´ utilitzades (MUC, B3, CEAF). La desigualtat entre el nombre d’entitats unaries` i el nombre d’entitats de multiples´ mencions explica el biaix de les mesures o be´ cap a un deficit` o be´ cap a un exces´ de clusters. La mesura BLANC que proposo, una implementacio´ modificada de l’´ındex de Rand, corregeix aquest desequilibri dividint la puntuacio´ final entre rela- cions de coreferencia` i de no coreferencia.` Finalment, la segona part de la tesi arriba a la conclusio´ que l’abando´ de la visio´ tradicional i dicotomica` de la coreferencia` es´ el primer pas per anar mes´ enlla` de l’estat de l’art.
Recommended publications
  • Questions Under Discussion: from Sentence to Discourse∗
    Questions under Discussion: From sentence to discourse∗ Anton Benz Katja Jasinskaja July 14, 2016 Abstract Questions under discussion (QUD) is an analytic tool that has recently become more and more popular among linguists and language philosophers as a way to characterize how a sentence fits in its context. The idea is that each sentence in discourse addresses a (of- ten implicit) QUD either by answering it, or by bringing up another question that can help answering that QUD. The linguistic form and the interpretation of a sentence, in turn, may depend on the QUD it addresses. The first proponents of the QUD approach (von Stut- terheim & Klein, 1989; van Kuppevelt, 1995) thought of it as a general approach to the analysis of discourse structure where structural relations between sentences in a coherent discourse are understood in terms of relations between questions they address. However, the idea did not find broad application in discourse structure analysis, where theories aiming at logical discourse representations are predominantly based on the notion of coherence re- lations (Mann & Thompson, 1988; Hobbs, 1985; Asher & Lascarides, 2003). The problem of finding overt markers for QUDs means that they are also difficult to utilize for quanti- tative measures of discourse coherence (McNamara et al., 2010). At the same time QUD proved useful in the analysis of a wide range of linguistic phenomena including accentua- tion, discourse particles, presuppositions and implicatures. The goal of this special issue is to bring together cutting-edge research that demonstrates the success of QUD in linguistics and the need for the development of a comprehensive model of discourse coherence that incorporates the notion of QUD as its integral part.
    [Show full text]
  • Coreference Resolution for Spoken French Loïc Grobol
    Coreference resolution for spoken French Loïc Grobol To cite this version: Loïc Grobol. Coreference resolution for spoken French. Computation and Language [cs.CL]. Université Sorbonne Nouvelle - Paris 3, 2020. English. tel-02928209v2 HAL Id: tel-02928209 https://hal.archives-ouvertes.fr/tel-02928209v2 Submitted on 9 Mar 2021 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution| 4.0 International License École Doctorale 622 — Sciences du langage Lattice, Inria Thèse de doctorat en sciences du langage de l’Université Sorbonne Nouvelle Coreference resolution for spoken French présentée et soutenue publiquement par Loïc Grobol le 15 juillet 2020 Sous la direction d’Isabelle Tellier† et de Frédéric Landragin Et co-encadrée par Éric Villemonte de la Clergerie et Marco Dinarelli Jury : Massimo Poesio, Professor (Queen Mary University), Rapporteur, Sophie Rosset, Directrice de recherche (LIMSI), Rapportrice, Béatrice Daille, Professeur (Université de Nantes), Examinatrice, Pascal Amsili, Professeur (Université Sorbonne Nouvelle), Examinateur, Frédéric Landragin, Directeur de recherche (Lattice), Directeur, Éric Villemonte de la Clergerie, Chargé de recherche (Inria), Co-encadrant, Marco Dinarelli, Chargé de recherche (LIG), Co-encadrant. Reconnaissance automatique de chaînes de coréférences en français parlé Résumé Une chaîne de coréférences est l’ensemble des expressions linguistiques — ou mentions — qui font référence à une même entité ou un même objet du discours.
    [Show full text]
  • Mind the GAP: a Balanced Corpus of Gendered Ambiguous Pronouns
    Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns Kellie Webster and Marta Recasens and Vera Axelrod and Jason Baldridge Google AI Language fwebsterk|recasens|vaxelrod|[email protected] Abstract (1) In May, Fujisawa joined Mari Motohashi’s rink as the team’s skip, moving back from Karuizawa to Ki- Coreference resolution is an important task tami where she had spent her junior days. for natural language understanding, and the resolution of ambiguous pronouns a long- With this scope, we make three key contributions: standing challenge. Nonetheless, existing • We design an extensible, language- corpora do not capture ambiguous pronouns in sufficient volume or diversity to accu- independent mechanism for extracting rately indicate the practical utility of mod- challenging ambiguous pronouns from text. els. Furthermore, we find gender bias in • We build and release GAP, a human-labeled existing corpora and systems favoring mas- corpus of 8,908 ambiguous pronoun-name culine entities. To address this, we present pairs derived from Wikipedia.2 This dataset and release GAP, a gender-balanced labeled targets the challenges of resolving naturally- corpus of 8,908 ambiguous pronoun-name occurring ambiguous pronouns and rewards pairs sampled to provide diverse coverage systems which are gender-fair. of challenges posed by real-world text. We explore a range of baselines which demon- • We run four state-of-the-art coreference re- strate the complexity of the challenge, the solvers and several competitive simple base- best achieving just 66.9% F1. We show lines on GAP to understand limitations in that syntactic structure and continuous neu- current modeling, including gender bias.
    [Show full text]
  • Constraints on Donkey Pronouns
    Constraints on Donkey Pronouns The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation Grosz, P. G., P. Patel-Grosz, E. Fedorenko, and E. Gibson. “Constraints on Donkey Pronouns.” Journal of Semantics 32, no. 4 (July 15, 2014): 619–648. As Published http://dx.doi.org/10.1093/jos/ffu009 Publisher Oxford University Press Version Author's final manuscript Citable link http://hdl.handle.net/1721.1/102962 Terms of Use Creative Commons Attribution-Noncommercial-Share Alike Detailed Terms http://creativecommons.org/licenses/by-nc-sa/4.0/ CONSTRAINTS ON DONKEY PRONOUNS Patrick Grosz, Pritty Patel-Grosz, Evelina Fedorenko, Edward Gibson Abstract This paper reports on an experimental study of donkey pronouns, pronouns (e.g. it) whose meaning covaries with that of a non-pronominal noun phrase (e.g. a donkey) even though they are not in a structural relationship that is suitable for quantifier-variable binding. We investigate three constraints, (i) the preference for the presence of an overt NP antecedent that is not part of another word, (ii) the salience of the position of an antecedent that is part of another word, and (iii) the uniqueness of an intended antecedent (in terms of world knowledge). We compare constructions in which intended antecedents occur in a context such as who owns an N / who is an N-owner with constructions of the type who was without an N / who was N-less. Our findings corroborate the existence of the overt NP antecedent constraint, and also show that the salience of an unsuitable antecedent’s position matters.
    [Show full text]
  • Arxiv:1805.11824V1 [Cs.CL] 30 May 2018
    Artificial Intelligence Review manuscript No. (will be inserted by the editor) Anaphora and Coreference Resolution: A Review Rhea Sukthanker · Soujanya Poria · Erik Cambria · Ramkumar Thirunavukarasu Received: date / Accepted: date Abstract Entity resolution aims at resolving repeated references to an entity in a document and forms a core component of natural language processing (NLP) research. This field possesses immense potential to improve the performance of other NLP fields like machine translation, sentiment analysis, paraphrase detection, summarization, etc. The area of entity resolution in NLP has seen proliferation of research in two separate sub-areas namely: anaphora resolution and coreference resolution. Through this review article, we aim at clarifying the scope of these two tasks in entity resolution. We also carry out a detailed analysis of the datasets, evaluation metrics and research methods that have been adopted to tackle this NLP problem. This survey is motivated with the aim of providing the reader with a clear understanding of what constitutes this NLP problem and the issues that require attention. Keywords Entity Resolution · Coreference Resolution · Anaphora Resolution · Natural Language Processing · Sentiment Analysis · Deep Learning 1 Introduction A discourse is a collocated group of sentences which convey a clear understanding only when read together. The etymology of anaphora is ana (Greek for back) and pheri (Greek for to bear), which in simple terms means repetition. In computational linguistics, anaphora is typically defined as references to items mentioned earlier in the discourse or \pointing back" reference as described by (Mitkov, 1999). The most prevalent type of anaphora in natural language is the pronominal anaphora (Lappin and Leass, 1994).
    [Show full text]
  • Event Coreference Resolution: a Survey of Two Decades of Research
    Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18) Event Coreference Resolution: A Survey of Two Decades of Research Jing Lu and Vincent Ng Human Language Technology Research Institute University of Texas at Dallas Richardson, TX 75083-0688 fljwinnie,[email protected] Abstract Recent years have seen a gradual shift of focus from entity-based tasks to event-based tasks in informa- tion extraction research. Being a core event-based task, event coreference resolution is less studied but arguably more challenging than entity corefer- ence resolution. This paper provides an overview of the major milestones made in event coreference research since its inception two decades ago. Figure 1: The standard information extraction pipeline 1 Introduction It should be easy to see from this example that to per- Compared to entity coreference resolution, event coreference form end-to-end event coreference resolution, one has to resolution is less studied but arguably more challenging. To build an information extraction (IE) pipeline (cf. Figure 1) see its difficulty, consider the following example on within- that involves (1) extracting the entity mentions from a given document event coreference resolution, whose goal is to de- document (the entity extraction component) and determining termine which event mentions in a document refer to the same which of them are coreferent (the entity coreference compo- real-world event: nent); (2) extracting the event mentions by identifying their trigger words/phrases and determining which entity mentions Georges Cipriani fleftg a prison in Ensisheim ev1 are their arguments (the event extraction component); and (3) in northern France on parole on Wednesday.
    [Show full text]
  • Binding and Coreference in Vietnamese
    University of Massachusetts Amherst ScholarWorks@UMass Amherst Doctoral Dissertations Dissertations and Theses October 2019 Binding and Coreference in Vietnamese Thuy Bui University of Massachusetts Amherst Follow this and additional works at: https://scholarworks.umass.edu/dissertations_2 Part of the Psycholinguistics and Neurolinguistics Commons, Semantics and Pragmatics Commons, and the Syntax Commons Recommended Citation Bui, Thuy, "Binding and Coreference in Vietnamese" (2019). Doctoral Dissertations. 1694. https://doi.org/10.7275/ncqc-n685 https://scholarworks.umass.edu/dissertations_2/1694 This Open Access Dissertation is brought to you for free and open access by the Dissertations and Theses at ScholarWorks@UMass Amherst. It has been accepted for inclusion in Doctoral Dissertations by an authorized administrator of ScholarWorks@UMass Amherst. For more information, please contact [email protected]. BINDING AND COREFERENCE IN VIETNAMESE A Dissertation Presented by THUY BUI Submitted to the Graduate School of the University of Massachusetts Amherst in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY September 2019 Linguistics c Copyright by Thuy Bui 2019 All Rights Reserved BINDING AND COREFERENCE IN VIETNAMESE A Dissertation Presented by THUY BUI Approved as to style and content by: ————————————————– Kyle Johnson, Co-Chair ————————————————– Brian Dillon, Co-Chair ————————————————– Rajesh Bhatt, Member ————————————————– Adrian Staub, Member ————————————————– Joe Pater, Department Chair Department of Linguistics ACKNOWLEDGMENTS I now realize how terrible I really am at feelings, words, and expressing feelings with words, as I spent the past three weeks trying to make these acknowledge- ments sound right, and they just never did. With this acute awareness and abso- lutely no compensating skills, I am going to include a series of inadequate thank yous to the many amazing people whose overflowing acts of kindness towards me exceeds what words can possibly describe anyway.
    [Show full text]
  • Long-Distance Reflexivization and Logophoricity in the Dargin Language Muminat Kerimova Florida International University
    Florida International University FIU Digital Commons MA in Linguistics Final Projects College of Arts, Sciences & Education 2017 Long-Distance Reflexivization and Logophoricity in the Dargin Language Muminat Kerimova Florida International University Follow this and additional works at: https://digitalcommons.fiu.edu/linguistics_ma Part of the Linguistics Commons Recommended Citation Kerimova, Muminat, "Long-Distance Reflexivization and Logophoricity in the Dargin Language" (2017). MA in Linguistics Final Projects. 3. https://digitalcommons.fiu.edu/linguistics_ma/3 This work is brought to you for free and open access by the College of Arts, Sciences & Education at FIU Digital Commons. It has been accepted for inclusion in MA in Linguistics Final Projects by an authorized administrator of FIU Digital Commons. For more information, please contact [email protected]. FLORIDA INTERNATIONAL UNIVERSITY Miami, Florida LONG-DISTANCE REFLEXIVIZATION AND LOGOPHORICITY IN THE DARGIN LANGUAGE A thesis submitted in partial fulfillment of the requirements for the degree of MASTER OF ARTS in LINGUISTICS by Muminat Kerimova 2017 ABSTRACT OF THE THESIS LONG-DISTANCE REFLEXIVIZATION AND LOGOPHORICITY IN THE DARGIN LANGUAGE by Muminat Kerimova Florida International University, 2017 Miami, Florida Professor Ellen Thompson, Major Professor The study of anaphora challenges us to determine the conditions under which the pronouns of a language are associated with possible antecedents. One of the theoretical questions is whether the distribution of pronominal forms is best explained by a syntactic, semantic or discourse level analysis. A more practical question is how we distinguish between anaphoric elements, e.g. what are the borders between the notions of pronouns, locally bound reflexives and long-distance reflexives? The study analyzes the anaphora device saj in Dargin that is traditionally considered to be a long-distance reflexivization language.
    [Show full text]
  • Modeling the Lifespan of Discourse Entities with Application to Coreference Resolution
    Journal of Artificial Intelligence Research 52 (2015) 445-475 Submitted 09/14; published 03/15 Modeling the Lifespan of Discourse Entities with Application to Coreference Resolution Marie-Catherine de Marneffe [email protected] Linguistics Department The Ohio State University Columbus, OH 43210 USA Marta Recasens [email protected] Google Inc. Mountain View, CA 94043 USA Christopher Potts [email protected] Linguistics Department Stanford University Stanford, CA 94035 USA Abstract A discourse typically involves numerous entities, but few are mentioned more than once. Dis- tinguishing those that die out after just one mention (singleton) from those that lead longer lives (coreferent) would dramatically simplify the hypothesis space for coreference resolution models, leading to increased performance. To realize these gains, we build a classifier for predicting the singleton/coreferent distinction. The model’s feature representations synthesize linguistic insights about the factors affecting discourse entity lifespans (especially negation, modality, and attitude predication) with existing results about the benefits of “surface” (part-of-speech and n-gram-based) features for coreference resolution. The model is effective in its own right, and the feature represen- tations help to identify the anchor phrases in bridging anaphora as well. Furthermore, incorporating the model into two very different state-of-the-art coreference resolution systems, one rule-based and the other learning-based, yields significant performance improvements. 1. Introduction Karttunen imagined a text interpreting system designed to keep track of “all the individuals, that is, events, objects, etc., mentioned in the text and, for each individual, record whatever is said about it” (Karttunen, 1976, p. 364).
    [Show full text]
  • 1 Discourse and Logical Form: Pronouns, Attention and Coherence
    Discourse and Logical Form: Pronouns, Attention and Coherence 1 Introduction The meaning of the pronoun ‘he’ does not seem ambiguous in the way that, say, the noun ‘bank’ is. Yet what we express in uttering ‘He is happy,’ pointing at Bill, differs from what we express when uttering it, pointing at a different person, Sam. ‘he’ in the first case refers to Bill, in the second to Sam. And further with an utterance of ‘A man always drives the car he owns,’ or ‘A man came in. He sat down,’ without an accompanying pointing gesture, ‘he’ refers not at all; its interpretation, instead, co- varies with possible instances in the range of the quantifier ‘A man’. Those uses in which the pronoun is interpreted referentially are usually called ‘demonstrative,’ and those in which it is interpreted non-referentially are usually called ‘bound’. We immediately face two questions: what is the semantic value of a pronoun in a context, given that it can have both bound and demonstrative uses? And, since this value can vary with context, how does context fix it? The first question concerns the semantics of pronouns, and the second their meta-semantics. The received view answers the first question by positing an ambiguity: bound uses of ‘he’ function as bound variables; demonstrative ones, by contrast, are referential. The received view’s answer to the second question is that the semantic value of a bound use co-varies with possible instances in the range of a binding expression, and for demonstrative uses, linguistic meaning constrains, but does not fully determine, its semantic value in context.
    [Show full text]
  • Discourse-Givenness of Noun Phrases Theoretical and Computational Models
    Discourse-Givenness of Noun Phrases Theoretical and Computational Models Dissertation zur Erlangung des akademischen Grades Doktor der Philosophie (Dr. phil.) der Humanwissenschaftlichen Fakult¨at der Universit¨atPotsdam vorgelegt von Julia Ritz Stuttgart, Mai 2013 Published online at the Institutional Repository of the University of Potsdam: URL http://opus.kobv.de/ubp/volltexte/2014/7081/ URN urn:nbn:de:kobv:517-opus-70818 http://nbn-resolving.de/urn:nbn:de:kobv:517-opus-70818 Erkl¨arung(Declaration of Authenticity of Work) Hiermit erkl¨areich, dass ich die beigef¨ugteDissertation selbstst¨andigund ohne die un- zul¨assigeHilfe Dritter verfasst habe. Ich habe keine anderen als die angegebenen Quellen und Hilfsmittel genutzt. Alle w¨ortlich oder inhaltlich ¨ubernommenen Stellen habe ich als solche gekennzeichnet. Ich versichere außerdem, dass ich die Dissertation in der gegenw¨artigenoder einer an- deren Fassung nur in diesem und keinem anderen Promotionsverfahren eingereicht habe, und dass diesem Promotionsverfahren keine endg¨ultiggescheiterten Promotionsverfahren vorausgegangen sind. Im Falle des erfolgreichen Abschlusses des Promotionsverfahrens erlaube ich der Fakult¨at, die angef¨ugtenZusammenfassungen zu ver¨offentlichen. Ort, Datum Unterschrift iii iv iv Contents Preface xiii 1 Introduction 1 1.1 Discourse-Givenness and Related Concepts . .3 1.2 Goal . .4 1.3 Motivation . .6 1.4 The Field of Discourse-Givenness Classification . 10 1.5 Structure of this Document . 11 2 Concepts and Definitions 13 2.1 Basic Concepts and Representation . 13 2.1.1 Basic Concepts in Discourse . 14 2.1.2 Formal Representation . 15 2.2 Reference . 18 2.2.1 Identifying Basic Units . 20 2.2.2 Non-Referring Use of Expressions .
    [Show full text]
  • Lecture 5. Introduction to Issues in Anaphora 1. What Is Anaphora? 2
    Formal Semantics and Current Problems of Semantics, Lecture 5 Formal Semantics and Current Problems of Semantics, Lecture 5 Barbara H. Partee, RGGU, March 4, 2008 p. 1 Barbara H. Partee, RGGU, March 4, 2008 p. 2 Lecture 5. Introduction to Issues in Anaphora in (1a) and what is sometimes called a deictic use1 of he in (2), as it might be uttered while looking at someone who just walked by, where there is no linguistic antecedent. 1. What is Anaphora? ..................................................................................................................................... 1 (2) He looks lost. 2. Coreference vs. variable-binding................................................................................................................ 2 3. Syntactic aspects of anaphora, and syntax-semantics interface issues ....................................................... 4 Two notes on terminology. In the broad sense of the term anaphora, all of the examples 4. Definite NPs as anaphoric expressions: debates......................................................................................... 7 above as well as the ones in (3) below involve anaphora. There are two narrower senses 5. Issues and puzzles in anaphora. Preview of coming lectures ..................................................................... 7 5.1. Donkey anaphora and discourse anaphora with indefinite antecedents and the rise of Dynamic in which anaphora or anaphor are distinguished from other terms. Semantics. Weeks 6-8...............................................................................................................................
    [Show full text]