11003544.Pdf
Total Page:16
File Type:pdf, Size:1020Kb
The Value of an Annotated Corpus in The Investigation of Anaphoric Pronouns, with Particular Reference to Backwards Anaphora in English. Izumi Tanaka Thesis Submitted for the Degree of Ph.D., Department of Linguistics and Modem English Language, The University of Lancaster April, 2000 ProQuest Number: 11003544 All rights reserved INFORMATION TO ALL USERS The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a com plete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. uest ProQuest 11003544 Published by ProQuest LLC(2018). Copyright of the Dissertation is held by the Author. All rights reserved. This work is protected against unauthorized copying under Title 17, United States C ode Microform Edition © ProQuest LLC. ProQuest LLC. 789 East Eisenhower Parkway P.O. Box 1346 Ann Arbor, Ml 48106- 1346 Abstract This thesis investigates English personal pronoun reference in particular focusing on cataphora (backwards anaphora),using the Anaphoric Treebank (AT), which is a written English corpus with discourse annotation, and other corpus data. The analysis of corpus data reveals certain coreferential cataphora patterns (in particular in the initial adverbial or initial direct speech constructions). On the basis of the corpus data, the claims on cataphora made by generative approaches and cognitive discourse theories are tested. The points which became clear in testing generative approaches are: (1) The result of an informant test indicates the inadequacy of narrowly restricting data to invented examples. (2) The lack of understanding of the scope of the application of the theory can be observed in Reinhart (1984) and Binding theories. (3) A sentential-level approach to investigate pronoun references is inadequate. The points which became clear in testing cognitive discourse theories are: (i) First mention cataphora and Overriding-type of cataphora clearly show that there is such a phenomenon that can be called cataphora. contrary’ to the sceptical view on cataphora. (ii) These cataphora data indicate that a pronoun reference can be made not only for an already-mentioned entity but also for a new discourse entity. Also the analysis of the borderline cases (anaphora/cataphora) reveals that in certain conditions, the reference-direction (anaphora/cataphora) judgement tends to involve a triple-choice (anaphoric / cataphoric /indeterminate), or it can involve a cline between the anaphoric pole and the cataphoric pole. This indicates that the reference-direction judgement of a pronoun (anaphora/cataphora) tends to reflect the way in which a reader perceive focus of attention (highlight) when the reader completes to resolve the pronoun. In order to account for the cataphora data, a suggestion is made that it is necessary, from readers’ perspective, to assume that a pronoun creates a temporal information gathering point in some kind of the reader’s short term memory. ii Dedication This thesis is dedicated to Tomoji Tanaka, Mine Tanaka, and the founder of my alma mater, Dr. Daisaku Ikeda. Acknowledgements In addition to the three people to whom this thesis is dedicated, there are so many people to whom I must express gratitude for the completion of this thesis, and unfortunately I could not include all of their names here. First of all, I would like to thank my supervisor Professor. Geoffrey, N. Leech for his excellent supervision, patience and help during the course of writing this thesis. Without his invaluable comments and suggestions, I would never have been able to complete this research. Thanks are also due to another supervisor, Dr. Anthony M. McEnery for his support in particular in the early stage of this research. I must also express my gratitude to Professor. Anna Siewierska and Professor. Ruslan Mitkov for their invaluable comments, which enable me to improve this thesis significantly. I must express my gratitude to Nicholas Smith for his great help in the last stage of writing up the thesis as well as giving me useful advice in making questionnaire forms. Also I would like to express my gratitude to Banti Zelalem for his invaluable comments on my draft. I must thank the members of the research team who create the Anaphoric Treebank. which is the main data of this research. Thanks in particular go to Steve Fligelstone for providing me the invaluable documents about the coding and compilation of the Anaphoric Treebank. I would like to express my gratitude to Michael Oakes for his support in accommodation, Marjorie Wood and Dorothy Barber for the administrative support, Reiko Ikeo and David Lee for their comments on the final draft, and for all the native speakers of English who kindly offered their intuition for my informant test. I would like to express my sincere gratitude to Professor. Akira Yanabu, whose works have been always the source of inspiration for me, and Peter Davis for his encouragement and friendship since I came to Lancaster. V Table of Contents (Page) Chapter 1: Introduction: Background of the thesis 1 1-1 Aims of This Thesis 1 1-2 A Discussion of Corpus-Based Approach 3 1-2-1 A Discussion of M ethodological Aspect o f the Use o f 3 Corpus 1-2-2 A Source of Non-Corpus Based Approach 8 1-3 Issues Related to Cataphora (Backwards Anaphora) 12 1-4 What follows in This Thesis 16 Chapter 2: An overview of Personal pronoun reference in New reportage 19 2-1 Some Features of News Reportage Texts Relevant to Personal 20 Pronoun Reference 2-2 Deictic Dependency of Interpretation of Personal Pronoun 21 Reference 2-3 Situational Reference in Speech and Writing 25 2-3-1 Situational Reference in Speech 25 2-3-2 Situational Reference in Writing 26 2-4 Interpretation of Personal Pronouns in Direct Speech 29 2-4-1 Difficulties involved in Interpretation of Personal 29 Pronouns in Direct Speech 2-4-2 Mutually Exclusive Relations of Person and Cross- 31 Deictic Coreference 2-5 Interpretation of Personal Pronouns in Narrative 35 2-6 Signalling highlight (focus) maintenance: a Contrastive 36 function of Third person pronoun (3PP) in relation to full NP 2-7 Re-focusing / Re-highlighting out-of-focused entity 39 VI 2-8 Factors triggering highlight 43 2-9 Highlight maintenance at different levels in news Reportage 45 2-10 Conceptual issues related to highlighting 48 2-10-1 Proper Name in text and Name to be remembered 48 2-10-2 Notion of antecedent and anaphor 50 2-11 Summary of chapter 2 55 Chapter 3: Description and Illustration of the Anaphoric Treebank 56 3-1 A Profile of the AT 57 3-2 An illustration of the annotation process 58 3-3 An Overview of the mark-up of Anaphoric Treebank 61 3-4 Illustration of the AT coding scheme relevant to pronoun 63 reference 3-5 Alternative interpretations of the AT notions 73 3-6 The AT as a record of ‘off-line’ analysis 76 3-7 Corpus Data retrievable from the AT 81 Table illustrating how discourse entities are mentioned as the 81 discourse unfolds 3-8 Size of the Anaphoric Treebank (AT) used in this thesis 82 3-9 Conclusion 83 Chapter 4: Corpus data of cataphora (backwards anaphora): Structural 84 patterns of occurrences 4-0 Introduction 84 4-1 Corpus data of cataphora presented in research reports 85 4-1-1 C orpus data of cataphora presented in Carden (1982) 85 Table 4.1.1: Carden (1982) ’s 20 samples in terms o f the 87 grammatical position o f the pronoun-antecedent pattern vi i 4-1-2 Corpus data of cataphora presented in Kanzaki (1997) 88 Table 4.1.2: Kanzaki (1997) ’s examples in terms o f the 88 grammatical position o f the pronoun-antecedent pattern 4-1-3 Corpus data of cataphora presented in Van Hoek (1997) 89 Table 4.1.3. a: Summary o f Van Hoek (1997) ‘s cataphora 89 data according to the structural pattern o f the pronoun- antecedent occurrences Table 4.1.3. b: Grammatical position o f antecedent NTs in 90 cataphora data (Van Hoek, 1997). 4-2 Corpus data of cataphora retrieved from the Anaphoric 92 Treebank 4-2-0 Introduction 92 4-2-1 First-mention cataphora in the AT 96 4-2-2 Cataphorically marked cases in AT according to structural 98 pattern of pronoun-antecedent occurrences Table (4.2.2): Cataphorically marked cases in the AT 98 according to the frequent structural pattern o f the pronoun-antecedent occurrences 4-3 Other cataphora Constructions 105 4-4 Summary and Conclusion 111 Chapter 5: Generative Approaches 112 5-1 C-command coreferential constraint by Reinhart (1976, 1981, 112 1984)• 5-1-0 Introduction 112 5-1-1 Overview of the coreferential constraints by C-command 115 domain by Reinhart (1976. 1981, 1984) Figure (5.1.1): Illustration o f General Rule (I) 115 (Reinhart, 1984) Figure (5.1.2): Illustration o f C-command 116 (Reinhart, 1984) v i i i Figure (5.1.3): Illustration o f C-command and 'Domain 117 o f a node ’ Figure (5.1.4): Illustration o f C-command in initial-ADV 118 construction Figure (5.1.5): Illustration o f C-command coreferential 120 constraints with Example (45), (46), (47) Table 5.1.6: All possible patterns o f relation between two 121 NPs in terms o f C-command domain Table 5.1.7: Patterns o f relation between two NPs in 122 terms of C-command domain relevant to initial-ADV cataphora Initial empirical problems of Reinhart’s C-command 124 model Figure 5.1.8: Tree diagram for [AO13 42] assuming that 125 initial (preposed) PPs are positioned at COMP Figure 5.1.9: Tree diagram for [AO15 54] assuming that 126 initial (preposed) PP are positioned at COMP.