DENOTATION in DISCOURSE: Analysis and Algorithm
PDF hosted at the Radboud Repository of the Radboud University Nijmegen. The following full text is a publisher's version. For additional information about this publication, see http://hdl.handle.net/2066/113693. Please be advised that this information was generated on 2021-09-23 and may be subject to change.

A. Weijters
ISBN: 90-9003057-3

A scholarly essay in the field of philosophy

Doctoral thesis submitted to obtain the degree of doctor at the Catholic University of Nijmegen (Katholieke Universiteit te Nijmegen), in accordance with the decision of the Board of Deans, to be defended in public on Monday 30 October 1989 at 13.30 by Antonius Jozef Martha Maria Weijters, born on 11 August 1950 in Goirle.

Supervisor (promotor): Prof. dr. P.A.M. Seuren

Preface

This study is the result of a research project entitled Text Representation and Text Understanding, which was initiated by my supervisor Pieter Seuren, together with Leo Noordman and Wietske Vonk. The first three years of the project were supported by the University of Nijmegen (grant CI 1/81). The project has a strong interdisciplinary character involving philosophy, linguistics, psychology and artificial intelligence. Hopefully, the tone I have tried to achieve and the terminology chosen are not too far off the mark for interested readers from these disciplines.

There are several people whom I wish to thank for their help and support as I prepared this study. First of all, I must mention Geer Hoppenbrouwers. Without his effort and generous dedication I doubt if I could have written up the text at all. How can I repay the many pleasant hours we spent discussing all sorts of problems that cropped up in the course of the project? And how could I have paid for the many, no doubt less pleasant, hours he spent on the translation of the text into bearable English?
Paul Gretton further polished and improved this text from the point of view of English idiom, in the same spirit of unselfish collegiality. I also wish to thank Edward Hoenkamp of Nijmegen University for his help as I was trying to acquire greater skill in using LISP, and for his skill in unobtrusively teaching me certain important techniques in AI. Jaap van den Herik, my Head of Department, has won my lasting gratitude for insisting that I should finish this thesis, and allowing me time off for that purpose, during my first year in my new job. I cannot refrain from mentioning my lovely wife Len, and my children Maaike and Tomas. Their presence made me aware, again and again, that proper names like Maaike, Tomas, and Len not only denote but also refer. (The uncomprehending reader will have to consult the text.) Finally, a word of thanks is due to my colleagues, friends and wider family, for their support and interest in my work. Inevitably, my parents have become "wider family" now, but it is difficult for me to express what their trust and silent support means to me. July 1989, A. 
Weijters

List of Abbreviations

DD        Discourse Domain
dd        definite description
DEN(dd)   The denotatum of the definite description dd
DR        Discourse Representation
DRAFT     Denotation Resolution Algorithm For Texts
f-p       female pronoun
LSA       Language of Semantic Analyses
m-p       male pronoun
n-p       neutral pronoun
NP        Noun Phrase
r-p       reflexive pronoun
S         Sentence
SA        Semantic Analysis
SS        Surface Structure
SSK       Surface Structure Key
T         Term
T1        Subject or subject clause
T2        Object or object clause
T3        Indirect object
x-n       A mnemonic for new (indefinite NPs)
x-o       A mnemonic for old (definite NPs)

Contents

Preface
List of Abbreviations
I Introduction
II Introducing DRAFT (Denotation Resolution Algorithm For Texts)
III Exploring the Denotation Process
  III.1 Existentially quantified NPs
  III.2 Definite descriptions
  III.3 Anaphorical definite descriptions
  III.4 Proper names
  III.5 Pronouns
    III.5.1 Introduction
    III.5.2 Syntactic constraints on possible coreferential relations
      III.5.2.1 Background
      III.5.2.2 The Precede-Command Constraint
      III.5.2.3 The C-Command Constraint
      III.5.2.4 Lakoff's universal constraints
    III.5.3 Relevant theories and views from psycholinguistics
  III.6 Preliminaries to a heuristic denotation resolution algorithm
    III.6.1 Background
    III.6.2 The heuristic rules for a denotation resolution algorithm
    III.6.3 Parallel processing
  III.7 A global characterization of the underlying algorithm of DRAFT
IV The Underlying Denotation Resolution Algorithm of DRAFT
  IV.1 The relation between LSA and DRAFT
  IV.2 The Strong-First Strategy and the Vague-Specific Constraint
    IV.2.1 The Strong-First Strategy
    IV.2.2 The Vague-Specific Constraint
  IV.3 A further elaboration of the denotation resolution algorithm
    IV.3.1 The function DEN-M-P (non-reflexive pronouns)
    IV.3.2 The function DEN-X-N (indefinite descriptions)
    IV.3.3 The function DEN-X-O (definite descriptions)
    IV.3.4 The function DEN-PROPER-NAME (proper names)
    IV.3.5 Prominence and disambiguation within DRAFT
V Evaluation and Conclusions
  V.1 Conclusions
  V.2 Evaluation of DRAFT
  V.3 The status of the model implemented in DRAFT
  V.4 Future research
Appendices:
  A The source code of DRAFT
  B The source code of a simple English-SA translator
  C Some demonstration sessions with DRAFT
Index of DRAFT functions
References
Index of Names
Abstract-Samenvatting
Curriculum Vitae

Chapter I

INTRODUCTION

The basic question underlying the research presented here is "How does a reader, starting with the separate components of a text (words, sentences), come to understand these components as a coherent whole?". For instance, after reading (1), practically every reader will answer the question "Who had drunk too much again?" with "John".

(1) Yesterday Mary met John. He had drunk too much again.

The information "John had drunk too much again" is not given explicitly: the text says "He had drunk too much". Apparently a comprehending reader can interpret the sentences of a text as a coherent whole, and is thus able to draw more information from the text than is explicitly stated.

The opposite is also possible. A reader who has just read a text and is then asked whether a number of given sentences appeared literally in it, will often become confused [Schank and Abelson, 1977] and [Kintsch, 1977].
This is especially true in the case of sentences which differ from the original text mainly in wording or sentence structure but not much in meaning. Readers will usually recognize differences of meaning, in particular if they concern essentials rather than details. Apparently the reader remembers a kind of representation of the text rather than the literal text. This representation seems to abstract from the sentence level; the sentences of a text are represented in correlation with each other. The intriguing question is: how does a reader, starting from the linguistic object "text", arrive at the above-mentioned representation?

To answer this question, we use the concept of Discourse Representation (DR)¹, as suggested by - among others - Seuren [1972; 1985], Karttunen [1976], Stenning [1977], Grosz [1979], Fauconnier [1979], Johnson-Laird and Garnham [1980], Kamp [1981] and Garrod and Sanford [1982]. We basically follow [Seuren, 1985]. The construction of a DR is an incremental process in the semantic framework proposed by Seuren. On the one hand, every new sentence of a text may cause a change in the DR built up so far. On the other hand, the change in the DR brought about by a sentence depends on the DR built up so far. We will illustrate this process in the next chapter.

In the framework proposed by Seuren, the meaning of a sentence token is equated with the change in a DR as a result of the incrementation of that token. The meaning of a sentence token therefore depends both on the context and on the listener's world knowledge. The meaning of a sentence type is equated with the systematic change in a discourse representation resulting from the incrementation of a sentence. Two sentence types have the same meaning if their incrementation in any possible discourse representation brings about the same change.

Within this same framework we can also say something about the coherence and acceptability of sentences in a text.
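As an aside, the incremental view set out above can be made concrete with a small Python sketch. It is only an illustration under strong assumptions and is not the DRAFT algorithm: a DR is modelled as a flat set of facts, a sentence type as a function from the DR built up so far to a new DR, and the pronoun "he" is resolved by a deliberately naive rule (the unique male individual in the DR, if there is one). All names (`increment`, `change`, `same_change_on`, the fact labels) are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# Illustrative assumption: a DR is an immutable set of simple facts.
Fact = Tuple[str, ...]
DR = FrozenSet[Fact]
# Illustrative assumption: a sentence type maps the DR built up so far to a new DR.
SentenceType = Callable[[DR], DR]

EMPTY_DR: DR = frozenset()


def increment(dr: DR, sentence: SentenceType) -> DR:
    """Add a sentence to the DR built up so far and return the new DR."""
    return frozenset(sentence(dr))


def change(dr: DR, sentence: SentenceType) -> Tuple[DR, DR]:
    """The change a sentence token brings about in a given DR: (added, removed)."""
    after = increment(dr, sentence)
    return after - dr, dr - after


def mary_met_john(dr: DR) -> DR:
    return dr | {("female", "Mary"), ("male", "John"), ("met", "Mary", "John")}


def john_was_met_by_mary(dr: DR) -> DR:
    # Different wording, same facts added.
    return dr | {("met", "Mary", "John"), ("male", "John"), ("female", "Mary")}


def he_had_drunk_too_much(dr: DR) -> DR:
    # Naive stand-in for pronoun resolution: pick the unique male in the DR.
    males = [fact[1] for fact in dr if fact[0] == "male"]
    referent = males[0] if len(males) == 1 else "?he"  # "?he" marks an unresolved pronoun
    return dr | {("drunk-too-much", referent)}


def same_change_on(a: SentenceType, b: SentenceType, sample: Iterable[DR]) -> bool:
    """Check that two sentence types bring about the same change on a sample of DRs.

    The text quantifies over any possible DR; a finite sample can only approximate that.
    """
    return all(change(dr, a) == change(dr, b) for dr in sample)


dr1 = increment(EMPTY_DR, mary_met_john)
dr2 = increment(dr1, he_had_drunk_too_much)
print(sorted(dr2 - dr1))
print(same_change_on(mary_met_john, john_was_met_by_mary, [EMPTY_DR, dr1]))
```

In this toy model the same sentence contributes a different change depending on the DR it is added to: after sentence (1)'s first sentence the pronoun is linked to John, whereas in an empty DR it stays unresolved.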
Every new utterance must have an informative value in relation to preceding utterances. If the addition of a particular sentence to the discourse representation built up so far is practically identical to its addition to an empty discourse representation and if this sentence is practically or completely irrelevant to the incrementation of following sentences, then there will be little coherence between this sentence and the rest of the text. If the above applies to a majority of the sentences in a text, that text can be called incoherent.

¹ In this study we will restrict ourselves to written texts.

One central methodological problem hanging over the whole project is the question of psychological plausibility or reality. The description of the process of building up a DR is meant to resemble the description of psychological processes. But it remains to be seen whether and to what extent a DR can be regarded as a mental model of the person who reads the text [Noordman, 1987]. In order to assess whether the theory presented here can be interpreted as a description of a psychological process, we will first have to carry out many psycholinguistic