Understanding Complex Constructions: A Quantitative Corpus-Linguistic Approach to the Processing of English Relative Clauses
Understanding complex sentences

Understanding Complex Constructions: A Quantitative Corpus-Linguistic Approach to the Processing of English Relative Clauses

Dissertation submitted for the academic degree of Doctor philosophiae (Dr. phil.) to the Council of the Faculty of Philosophy of the Friedrich-Schiller-Universität Jena by Daniel Wiechmann, MA, born 27 November 1974 in Itzehoe

A quantitative corpus-linguistic approach to the processing of English relative clauses

Reviewers
1. Prof. Dr. Holger Diessel (FSU Jena)
2. Prof. Dr. Volker Gast (FSU Jena)
3. Prof. Dr. Anatol Stefanowitsch (Universität Bremen)

Date of oral examination: 9 April 2010

TABLE OF CONTENTS

1 Introduction
  1.1 Aims of this study
    1.1.1 Why relative clause constructions?
    1.1.2 Characterizing English relative clause constructions
  1.2 Overture: Some precursors and some prerequisites
    1.2.1 Symbolization and mental states
    1.2.2 Linguistic units as processing instructions I: form to meaning
    1.2.3 Linguistic units as processing instructions II: form to form
    1.2.4 Conventional patterns as routinized instructions
  1.3 Chapter summary
2 Towards a theoretical framework of the right kind
  2.1 The merits of being sign-based
    2.1.1 Regularity in language: rules and schemas
    2.1.2 Constructions and the uniform representation of linguistic knowledge
  2.2 The merits of being usage-based
    2.2.1 Effects of frequency
  2.3 Construction-driven memory-based language processing
    2.3.1 Memory-based language processing
    2.3.2 Categorizing complex constructions
  2.4 Chapter summary
3 Describing English RCCs: Methods, data, and beyond
  3.1 Corpus and data used in the analysis
    3.1.1 A roadmap for the analysis of the corpus data
    3.1.2 Variables investigated in this study
  3.2 Head features
    3.2.1 Morphosyntactic realization of the head
    3.2.2 Definiteness of the head
    3.2.3 Contentfulness of the head
    3.2.4 Animacy of the head
    3.2.5 Concreteness of the head
  3.3 Features of the relative clause
    3.3.1 Grammatical features of RC: Finiteness
    3.3.2 Grammatical features of RC: Transitivity
    3.3.3 Grammatical features of RC: Relativized role
    3.3.4 Grammatical features of RC: Corpus comparison
  3.4 Features of the main clause
    3.4.1 Grammatical features of MC: Transitivity
    3.4.2 Grammatical features of MC: External role and type of embedding
  3.5 Cross-clausal features
    3.5.1 Cross-clausal features: Transitivity configurations
    3.5.2 Cross-clausal features: Syntactic parallelism
    3.5.3 Cross-clausal features: Thematic parallelism
    3.5.4 Head versus RC-Subject: Interference and discourse-function
4 Expanding horizons: RCC in ambient configuration space
  4.1 Non-finite RCCs (bivariate prelude)
  4.2 A configural perspective on non-finite RCCs
    4.2.1 Association rule mining: k-optimal patterns analysis
    4.2.2 Configural frequency analysis
    4.2.3 Identifying exemplar clusters: RCC-similarity in configural space
  4.3 Finite RCC
    4.3.1 Finite subject RCCs
    4.3.2 Finite non-subject RCCs
    4.3.3 Constructional schemas and relativizer omission
  4.4 General discussion and concluding remarks
5 References

1 Introduction

1.1 Aims of this study

The present study will scrutinize English relative clause constructions to further explore the relationship between the shapes of grammars (and, by implication, any given particular grammar) and the cognitive processes that motivate the forms licensed by a grammar. In line with the central tenets of cognitive linguistics, it is assumed that the human linguistic system develops under strong constraints from general cognition, most notably constraints from mechanisms of categorization, (symbolic) representation and on-line processing.
In an attempt to contribute to the further development of the cognitive approach to language, the present thesis draws on ideas from cognitive psychology and computational linguistics/AI research and tries to connect these more firmly to linguistic theorizing as envisaged in cognitive construction grammar (Langacker 1987, 2008; Goldberg 1995, 2006). Following research into exemplar-based language processing (Bod 1998, Daelemans 2002, Daelemans & van den Bosch 2005, Skousen 1989, Skousen et al. 2002), special emphasis is put on the role of memory and analogical processes. It is argued that in order to understand the nature of linguistic knowledge, it is advantageous to entertain a unified conception of linguistic representation and processing. In the exemplar-based view, language learning is simply the storage of linguistic experience in memory, and language processing is the use of established memory structures.

Following these avenues of research, the present account assumes that language experience can be represented by a corpus of parsed utterances, so that a distributional analysis of such a corpus can yield meaningful insights into the way humans behave linguistically, e.g. which structures are likely to cause more processing difficulties than others. In order to connect these ideas to ideas developed in cognitive construction grammar, special attention will be paid to the notion of a schema or a schematic construction, which is viewed as the theoretical construct in linguistics corresponding to coherent classes of exemplars in the psychological models of linguistic knowledge embraced here. Throughout this work, relative clause constructions (henceforth RCCs) will be portrayed as multi-clause constructions in the sense of construction grammar and so are conceived of as complex signs, i.e.
pairings of form and meaning/function (Langacker 1987, Lakoff 1987, Goldberg 1995, 2006). The formal pole of signs in a linguistic system can