
A Comparative Grammar of British English Dialects

Topics in English Linguistics 50.2

Editors Bernd Kortmann Elizabeth Closs Traugott

De Gruyter Mouton A Comparative Grammar of British English Dialects

Modals, Pronouns and Complement Clauses

by Nuria Hernández, Daniela Kolbe, Monika Edith Schulz

De Gruyter Mouton ISBN 978-3-11-024028-3 e-ISBN 978-3-11-024029-0 ISSN 1434-3452

Library of Congress Cataloging-in-Publication Data

Hernández, Nuria. A comparative grammar of British English dialects : modals, pronouns and complement clauses / by Nuria Hernández, Daniela Kolbe, Monika Schulz. p. cm. – (Topics in English linguistics; 50.2) Includes bibliographical references and index. ISBN 978-3-11-024028-3 (alk. paper) 1. English language – Dialects – Great Britain. 2. English language – Great Britain – Grammar. 3. English language – Modality. 4. English language – Pronoun. 5. English language – Clauses. I. Kolbe, Daniela. II. Schulz, Monika Edith. III. Title. PE1721.C663 2011 427′.941–dc23 2011037895

Bibliographic information published by the Deutsche Nationalbibliothek. The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

© 2011 Walter de Gruyter GmbH & Co. KG, 10785 Berlin/Boston
Cover image: Brian Stablyk/Photographer’s Choice RF/Getty Images
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
∞ Printed on acid-free paper
Printed in Germany
www.degruyter.com

Preface

In 2005, A Comparative Grammar of British English Dialects: Agreement, Gender, Relative Clauses (Kortmann et al. 2005) appeared as the first publication in this series concerned with the study of English dialect grammar. It consisted of three comprehensive studies on relative clauses (by Tanja Herrmann), verbal concord (by Lukas Pietsch), and gender in English pronouns (by Susanne Wagner). The book was designed to fill a noticeable gap in English dialectology, at a time when systematic comparisons of individual grammatical phenomena across different dialects were virtually non-existent. It set an example as to how this gap could be filled by studies taking a mixed qualitative and quantitative approach to variation in morphology and syntax.

The three studies presented in the first volume were informed by a functional typological approach to dialect grammar and were all based on data from the Freiburg Corpus of English Dialects. They were all written by members of the research group on English dialect syntax which was initiated by Prof. Bernd Kortmann at the University of Freiburg, Germany, in the late 1990s. Following the publication of these studies, the project continued with a second generation of researchers passionate about the dialects of British English, based on the same corpus. Three new studies are now presented in this book: the study of past possession and obligation by Monika Edith Schulz, the study of personal pronouns by Nuria Hernández, and the study of complement clauses by Daniela Kolbe.

This second volume of A Comparative Grammar of British English Dialects is designed to provide new insights into grammatical variation, and to support the growing interest in the corpus-based study of dialects.

Essen, Trier and Hamburg, November 2011
Nuria Hernández y Siebold, Daniela Kolbe-Hanna and Monika Edith Schulz

Acknowledgements

All authors most gratefully acknowledge the generous support of the Deutsche Forschungsgemeinschaft. The funding of the projects KO 1181/1-1,2,3 over a five-year period (2000–2005) facilitated the compilation of the Freiburg Corpus of English Dialects, FRED, on which all studies in the present volume are based.

This book is dedicated to Bernd Kortmann, whose research at the crossroads of dialectology and typology has been a great inspiration to all of us.

Table of contents

Preface
Acknowledgements

General introduction
Nuria Hernández, Daniela Kolbe, Monika Edith Schulz
1. Dialect syntax
2. Dialectology and corpus linguistics
3. The Freiburg Corpus of English Dialects
4. Aims and Outline
Notes
References

Possession and obligation
Monika Edith Schulz
1. Introduction
2. Possession and obligation
3. Grammaticalization
4. HAD and GOT: Disambiguating past possession and past obligation
5. Past possession and past obligation in the Midlands and the North
6. Past possession and negation
7. Past obligation and negation
8. Present tense obligation and the trend towards monosemy
9. Summary
References

Personal pronouns
Nuria Hernández
1. Introduction
2. Two hierarchies
3. Variation in number and person
4. Variation in gender
5. Pronoun exchange
6. Case variation in prepositional phrases
7. Qualified pronouns
8. Synopsis and discussion
Appendix
Notes
References

Complement clauses
Daniela Kolbe
1. Introduction
2. Data and methods
3. Embedded inversion
4. The complementizer as
5. For to clauses
6. Conclusion
Appendix
Notes
References

Index

General introduction

Nuria Hernández, Daniela Kolbe, Monika Edith Schulz

1. Dialect syntax

The study of dialect syntax in its present form is a relatively young field in terms of the combination of variety type and linguistic phenomenon under investigation. Non-standard, rural varieties of a language have been investigated within the framework of dialect geography and dialectology since the late 19th century, with work focusing mainly on phonology and the lexicon (Chambers and Trudgill 1998: 13–44). While traditional dialectology has often been associated with a lack of theoretical foundation and a “butterfly collecting mentality” (Filppula et al. 2005: vii), input from microparametric syntax, variationist sociolinguistics and typology has transformed the field over the past thirty years.

Microparametric syntax and typology provided a variety of theoretical frameworks against which linguistic variation could be discussed in a principled way. In addition, typological expertise from the study of cross-linguistic variation brought a fresh perspective on language-internal variation. Both paradigms have shifted the focus of investigation from phonological and lexical to morphosyntactic variation, which had been largely neglected in traditional dialectology. Variationist sociolinguistics, similar to dialectology in its focus on language-internal variation, provided a sophisticated methodological toolkit and the crucial link between synchronic variation and diachronic change.

The utilization and amalgamation of the strengths of dialectology, microparametric syntax and typology have resulted in an impressive and ever-growing body of research since the late 1980s (cf. Corrigan and Cornips 2005, among many others). Microparametric syntactic dialect atlases like the ASIS (Syntactic Atlas of Northern Italy), SAND (Syntactic Atlas of Dutch Dialects) or ScanDiaSyn (Scandinavian Dialect Syntax) have been compiled.
Since 2005 the Edisyn (European Dialect Syntax) project has taken a guiding role in syntactic dialect research in Europe, developing and testing standards of data collection, data storage and annotation, data retrieval and cartography (www.dialectsyntax.org).

In the realm of typology, morphosyntactic properties of varieties of English have been studied from a cross-linguistic perspective and integrated into a large-scale typological survey of the patterning of 76 non-standard morphosyntactic features in 46 varieties of English (Kortmann 2004; Kortmann and Schneider 2004a; Kortmann and Schneider 2004b; Kortmann 2006). A state-of-the-art overview of developments in the field of dialect syntax within typological and microparametric frameworks can be found in Kortmann (2009).

While syntactic atlases and typological surveys are largely based on questionnaire data, a recent trend has seen the compilation of dialect corpora of naturalistic spoken material which allow the quantitative modeling of syntactic variation in the tradition of urban dialectology and sociolinguistics (cf. Anderwald and Szmrecsanyi 2009 for an overview). The present volume contributes three studies in this line of research. It is situated within the FRED project at Freiburg University, which investigates morphosyntactic variation in traditional, spoken varieties of British English using the Freiburg Corpus of English Dialects (Kortmann and Wagner 2005; Anderwald and Wagner 2007).

FRED-based studies already completed cover a wide range of topics such as negation, reflexives, agreement and the Northern Subject Rule, pronominal gender, complement clauses, relative clauses, verb-formation, grammaticalization and grammar-based dialectometry. They include Anderwald (2002; 2005; 2006; 2007; 2009; 2011a; 2011b), Anderwald and Wagner (2007), Herrmann (2003), Hernández (2002; 2006; 2008; forthcoming), Kolbe (2008; 2010), Kortmann et al. (2005), Kortmann and Szmrecsanyi (2011), Kortmann and Wagner (2007; 2010), Pietsch (2005), Schulz (2010a; 2010b), Szmrecsanyi (2005; 2006; 2008; 2009; 2010a; 2010b; 2011a; 2011b; in press a, b), Szmrecsanyi and Hernández (2007), Szmrecsanyi and Hinrichs (2008), Szmrecsanyi and Kortmann (2009), Szmrecsanyi and Wolk (2011) and Wagner (2003; 2007; 2008; forthcoming a, b). The present volume investigates aspects of complementation, pronoun systems and pronominal usage, possession, obligation and past habituality.

In the remainder of this introduction, section 2 briefly discusses issues at the interface of dialectology and corpus linguistics. Section 3 introduces the database common to all three studies, the Freiburg Corpus of English Dialects (FRED). Section 4 will introduce the phenomena under investigation. The samples of FRED used for the individual studies and the methodologies employed are relegated to the introductions of the individual chapters.

2. Dialectology and corpus linguistics

The use of corpora in dialectology marks a significant shift in methodology. Traditional dialectology, as conducted since the nineteenth century, focused on dialect lexis – specific words, their geographic distribution and their pronunciation by individuals deemed “good dialect speakers”. Areas of linguistics other than lexis and pronunciation, e.g., grammar, were widely neglected. The typical speakers chosen for dialectological studies were so-called NORMs, that is, non-mobile older rural males who ideally had spent their whole life in the village where they were born (Chambers and Trudgill 1980: 33).

The general aim was to create atlases such as the Survey of English Dialects (Orton and Dieth 1962–71) that showed, for example, where speakers reported using the Scottish word lassie instead of the word girl.

Since the 1970s, modern dialectology has turned towards the description of actual speech and started to take into account dialect grammar. Modern dialectological studies draw on a more diverse selection of speakers, including younger and female speakers from urban areas. This change in methods occurred in the wake of sociolinguistic studies such as Labov (1966), which revealed that many linguistic expressions are variable: speakers make use of more than one realization of a linguistic form. A famous example of such a linguistic variable is a speaker’s choice between pronouncing or not pronouncing the postvocalic “r” sounds in fourth floor. Labov’s studies show that hardly any speaker uses the same variant at all times and that speakers’ choices of variants are strongly influenced by language-external, sociological factors, such as social class, age, gender, and ethnic background. The focus on linguistic variables and their variants means that instead of reporting only in which locations speakers used, e.g., lassie instead of girl, studies focus on the percentages of each variant in different social groups.
Most of the dialectological studies of the 1970s and 80s such as Trudgill (1974) focused on phonological or accent differences, mainly due to the fact that phoneme inventories are small, closed sets whose members recur frequently. RP, for example, has only 45 phonemes, which can all be found in a reasonably short span of text (Upton 2008: 240, 248). By contrast, grammatical variables often consist of several words and are thus less frequent than phonemes. It is also more difficult to elicit individual syntactic structures such as an infinitive or a relative clause in an interview than a particular word or sound. Consequently, quantitative studies of syntax require much larger databases than studies of phonology and have only been possible since the compilation of large digitized corpora in the 1990s, such as the Newcastle Electronic Corpus of Tyneside English (NECTE).

In the analysis of dialect grammar, researchers also face the problem that non-standard patterns, such as double modals as illustrated in (1), are often heavily stigmatized and regarded as instances of “wrong grammar”.

(1) I would might move it now straight away (International Corpus of English, Great Britain (ICE-GB), file s1b-025)

Speakers try to avoid them in interviews and do not report their use in questionnaires (Labov 1966: 455). Thus, researchers face the so-called “observer’s paradox”: their presence alone changes the situation they want to observe.

If dialect speakers are interviewed in a non-linguistic setting, however, they are not aware that their language is destined for later analysis and are hence less likely to control the way they speak. That way the observer’s paradox is considerably reduced (Labov 1972a: 209). This type of data is therefore highly useful for the analysis of linguistic features of dialects. Section 3 will comment on the FRED data with respect to this issue.

Future research on dialect usage would benefit from the availability of even more corpora drawing on this kind of resource. In addition, any corpora which contain spontaneous speech from different regions, such as the British National Corpus (BNC) or the various components of the International Corpus of English (ICE), can serve to analyze regional differences. The required size of a corpus is inversely proportional to the frequency of the grammatical pattern under investigation, whether standard or non-standard: the rarer the pattern, the larger the corpus has to be. In ICE-GB, for instance, only two out of over 7,000 modal verb phrases contain a double modal. Only a corpus of approximately 10 million words would constitute a sufficiently large database for a pattern of this frequency.

Nevertheless, as traditional dialects are on the decline, the future interest of research on British dialects probably lies in a redefinition of dialects (probably less regionally confined) and their features (cf. Trudgill 2000: 82–83, 132–135).
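The corpus-size argument made in this section for double modals can be illustrated with a back-of-the-envelope calculation. The following Python sketch is illustrative only. It assumes that ICE-GB comprises roughly one million words (a figure not stated above), that the observed rate of a pattern carries over unchanged to a larger corpus, and that about twenty tokens are wanted as a bare minimum for quantitative work. The function names are hypothetical.

```python
import math

def rate_per_million(hits: int, corpus_words: int) -> float:
    """Observed frequency of a pattern, normalised per million words."""
    return hits * 1_000_000 / corpus_words

def words_needed(hits: int, corpus_words: int, target_hits: int) -> int:
    """Corpus size at which `target_hits` tokens are expected,
    assuming the observed rate stays constant (a strong assumption)."""
    return math.ceil(target_hits * corpus_words / hits)

ICE_GB_WORDS = 1_000_000  # assumption: approximate size of ICE-GB
DOUBLE_MODALS = 2         # from the text: two double modals in ICE-GB
TARGET = 20               # assumption: minimum number of tokens wanted

print(rate_per_million(DOUBLE_MODALS, ICE_GB_WORDS))
print(words_needed(DOUBLE_MODALS, ICE_GB_WORDS, TARGET))
```

Under these assumptions the estimate comes out at ten million words, which matches the order of magnitude given above. A different target number of tokens would scale the estimate proportionally.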

3. The Freiburg Corpus of English Dialects

The main empirical basis for all three studies in this volume is the Freiburg Corpus of English Dialects, generally known as FRED. Since the corpus was already discussed in some detail in the first volume of this series (Kortmann et al. 2005), the current chapter will only give a brief introduction in order to convey a general idea of the type of data we are dealing with. For a more comprehensive description of the corpus design and textual markup, the reader is referred to the corpus manual (Hernández 2006; Szmrecsanyi and Hernández 2007) and the project’s website (http://www2.anglistik.uni-freiburg.de/institut/lskortmann/FRED/).

The previous volume contains a description of the research design of FRED and an in-depth discussion of advantages of FRED for both qualitative and quantitative analyses of dialect phenomena (Kortmann and Wagner 2005). It also addresses some other issues that are worth keeping in mind when reading the current book, such as the use of oral history material for linguistic research, the problem of normalisation in interview transcripts, and the role of non-standard data as a corrective for typological research.¹

The FRED corpus was compiled at the University of Freiburg, Germany, under the guidance of Professor Bernd Kortmann between 2000 and 2005. The primary incentive for this new collection of language data was the research team’s interest in morphosyntactic variation in spoken British English and the need for a geographically well-balanced and machine-readable database geared towards the systematic collection of morphosyntactic information. The end result is an extensive 2.5 million word corpus of spontaneous speech that can be used for the investigation of specific dialects as well as cross-dialectal comparisons.

3.1. Data and speakers

FRED was designed as a spoken-language dialect corpus. It contains full-length interviews with native speakers from 9 major dialect areas in England (Southeast, Southwest, Midlands, North), Scotland (Lowlands, Highlands), Wales, the Hebrides and the Isle of Man, roughly following the geographical division of modern dialect areas outlined in Trudgill (1999). The texts represent the ‘traditional’ varieties of British English spoken during the second half of the 20th century. All conversations were recorded between 1968 and 2000, the majority during the 1970s and 1980s.

Overall, there are 372 interviews with male and female speakers from 43 different counties. Researchers working with FRED have at their disposal speech data from 431 speakers of English who all grew up in Britain (interviewers excluded). Most of them were born before World War II (89% even before 1920), left school at age fourteen or younger, spent their life in one specific geographic area without leaving for any considerable amount of time, and were aged 60 or over at the time of the interview.

Younger speakers included in the corpus only contribute a small percentage of the textual material, the mean age being 75.2 years at recording date. The criteria affecting the choice of informants, i.e. their age, their strong affiliation with their native region, the relatively low education level and little mobility, have resulted in a largely homogeneous speaker profile which – despite restricting the use of FRED for sociolinguistic studies – facilitates the comparison with former surveys such as the Survey of English Dialects (SED, cf. http://www.leeds.ac.uk/library/spcoll/lavc/).

One of the perhaps most interesting details about FRED is its innovative use of a resource which had hardly ever been contemplated by linguists before: oral history interviews. These interviews, which are usually aimed at preserving people’s memories of the past (cf. http://www.oralhistory.org.uk/), provide the quantity of speech data that is needed to investigate both frequent and rare linguistic phenomena.

Since oral history interviews originally aim at the elicitation of historical information and folklore, they provide an ideal basis for spontaneous face-to-face conversations where the interviewee stays distracted from his or her own linguistic behaviour. According to Labov (1997: 395), the elicitation of narratives of personal experience is the most effective method in making people forget to pay attention to the way they speak and hence switch to a more casual register. Well-known difficulties for the elicitation of spontaneous speech, such as the observer’s paradox, the tape recorder effect (Labov 1972b: 113), and effects of hyperadaptation (Trudgill 2004: 62), are thus minimised.

3.2. Technical details

The corpus consists of sound recordings with orthographic transcripts, totalling more than 2.5 million words and 300 hours of recorded speech. The fact that the original recordings are available allows researchers to countercheck dubious occurrences, which benefits the general credibility of analytical results. The whole corpus is digitised and stored on DVD. The transcripts are pure ASCII texts with markup separated from the running text by brackets. Interviewer utterances are marked by curly brackets – an additional feature that allows researchers to exclude these utterances in common concordance programmes such as WordSmith (see http://www.lexically.net/wordsmith/).

Table 1. FRED dialect areas and county codes

Dialect area               County              Code
England Southeast (SE)     Kent                KEN
                           London              LND
                           Middlesex           MDX
                           Suffolk             SFK
England Southwest (SW)     Cornwall            CON
                           Devon               DEV
                           Oxfordshire         OXF
                           Somerset            SOM
                           Wiltshire           WIL
England Midlands (Mid)     Leicestershire      LEI
                           Nottinghamshire     NTT
                           Shropshire          SAL
                           Warwickshire        WAR
England North (N)          Durham              DUR
                           Lancashire          LAN
                           Northumberland      NBL
                           Westmorland         WES
                           Yorkshire           YKS
Scotland Lowlands (ScL)    Angus               ANS
                           Banffshire          BAN
                           Dumfriesshire       DFS
                           East Lothian        ELN
                           Fife                FIF
                           Kincardineshire     KCD
                           Kinross-shire       KRS
                           Lanarkshire         LKS
                           Midlothian          MLN
                           Peeblesshire        PEE
                           Perthshire          PER
                           Selkirkshire        SEL
                           West Lothian        WLN
Scotland Highlands (ScH)   Inverness-shire     INV
                           Ross and Cromarty   ROC
                           Sutherland          SUT
Wales (Wal)                Denbighshire        DEN
                           Glamorgan           GLA
Isle of Man                —                   MAN
Hebrides                   —                   HEB
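The curly-bracket convention for interviewer utterances described in this section makes it straightforward to restrict a search to informant speech, even outside a concordance programme. The following Python sketch shows one way this might be done. It assumes that an interviewer turn is simply any stretch of text enclosed in a single pair of curly brackets, without nesting. The sample string is invented for illustration and is not taken from FRED, and the function name is hypothetical.

```python
import re

# Assumption: an interviewer turn is any stretch of text enclosed in one
# pair of curly brackets, with no nested brackets inside it.
INTERVIEWER_TURN = re.compile(r"\{[^{}]*\}")

def strip_interviewer(transcript: str) -> str:
    """Remove interviewer turns and tidy the whitespace left behind."""
    without_turns = INTERVIEWER_TURN.sub(" ", transcript)
    return re.sub(r"\s+", " ", without_turns).strip()

# Hypothetical transcript fragment, invented for illustration.
sample = (
    "{And where did you work?} I worked at the mill. "
    "{Did you?} Aye, forty year."
)
print(strip_interviewer(sample))
```

Other bracketed markup in the transcripts is left untouched by this sketch, since its exact form is documented in the corpus manual rather than here.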

Figure 1. Header and first lines of a FRED transcript (text LAN_010)

Each interview in the corpus has its own text identification number consisting of a three-letter county code followed by an underscore and a running number, for instance the Yorkshire interviews YKS_001, YKS_002, and so forth. The text ID is the same for the recording and the corresponding transcript. Longer interviews may have more than one soundfile. A complete list of dialect areas and county codes is given in Table 1. For an overview of the different texts and speakers, the reader is again referred to the corpus manual.
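Because the text IDs follow a fixed pattern, they can be split mechanically into county code and running number, for instance to group search hits by county. The Python sketch below is a minimal illustration. The zero-padded three-digit form is taken from the examples above, but whether every ID uses exactly three digits is an assumption, so any run of digits is accepted. The names TextID and parse_text_id are hypothetical.

```python
import re
from typing import NamedTuple

class TextID(NamedTuple):
    county: str  # three-letter county code, e.g. "YKS"
    number: int  # running number within the county

# Three capital letters, an underscore, then one or more digits.
TEXT_ID = re.compile(r"([A-Z]{3})_(\d+)")

def parse_text_id(text_id: str) -> TextID:
    """Split a FRED-style text ID into county code and running number."""
    match = TEXT_ID.fullmatch(text_id)
    if match is None:
        raise ValueError(f"not a FRED-style text ID: {text_id!r}")
    return TextID(match.group(1), int(match.group(2)))

print(parse_text_id("YKS_001"))
print(parse_text_id("LAN_010").county)
```

The county code returned here can then be looked up in Table 1 to recover the dialect area.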

Figure 1 shows the beginning of a typical FRED transcript. Every transcript in the corpus is preceded by a text header. The first part of this header, in square brackets, contains the most important editorial information about the text, such as copyright information, source, place and date of the interview, some brief information about the fieldworker, and in some cases a short description of the speaker’s accent and dialect features.

The second part consists of a set of standardised tags in angle brackets (<>) which specify the text ID, the dialect area, county and location, the recording date and decade, the speaker ID as well as a limited set of speaker variables which allow the data to be evaluated with respect to regional variation and gender and age differences. In interviews with more than one informant, the respective values are separated by commas. Information is given where available; otherwise, the tag value is empty.

4. Aims and Outline

4.1. Past possession and obligation

The study by Monika Edith Schulz investigates variable past possession and past obligation marking with HAD, HAD GOT, HAD TO and HAD GOT TO in two traditional British English dialects. Starting from the observation that the English auxiliary verb system is subject to a “wholesale reorganization” (Bolinger 1980: 6), the study investigates whether past possession and past obligation marking show dialectal differences and thus provide snapshots of different stages or different versions of reorganization in different dialect areas.

Empirically, the study is based on two subcorpora of the Freiburg Corpus of English Dialects, which comprise 180,000 words each and represent speakers from the Midlands and the North born between 1894 and 1910. The choice of dialect areas is motivated by their different statuses as a transitional and a relic dialect area respectively and allows a comparison of the well-established patterns of the spread of phonological innovations to those of the spread of morphosyntactic innovations.

The study finds categorical variation in past possession and past obligation marking. The two domains are marked invariably by HAD and HAD TO, respectively, in the relic area of the North but show stable variation between HAD and HAD GOT for possession and between HAD TO and HAD GOT TO for obligation in the transitional area of the Midlands, as illustrated in examples (2)–(4).

(2) Macarthy had the butcher’s shop at the bottom of the lane. (FRED, Mid, LEI_001)
(3) You know, if you, if you’d got plenty of time, they’d have you on to get it ready yourself. (FRED, Mid, SAL_011)
(4) And you had to leave the put the leaves in in a book and press them you know, the, when you when you finished with them, you’d got to paint them the best you could. (FRED, Mid, LEI_001)

Variation between different markers receives an interpretation in terms of grammaticalization, where it is argued that past possession and past obligation marking are more grammaticalized in the transitional area of the Midlands and less grammaticalized in the North. These findings are in line with findings from the spread of phonological innovations, which identify the North as a relic area only partly affected by phonological changes (cf. Trudgill 1999: 52–84).

4.2. Pronouns

The study by Nuria Hernández presents a comprehensive account of personal pronoun behaviour from a variationist perspective, focusing on non-standard uses. A great variety of such uses can be found in spontaneous conversations, which is why the linguistic data of the Freiburg Corpus of English Dialects (FRED) are ideal for this type of investigation. The corpus data exhibit extensive functional diversity in the pronominal paradigm, including instances of pronoun exchange as illustrated in (5)–(8), independent self-forms as in (9) and (10), gendered pronouns as in (11), singular us as in (12), generic question tags like innit and wunnit as in (13) and (14), and various other phenomena.

(5) Him and I ain’t been fishing for these last six weeks. (FRED, SE, MDX_001)
(6) Her says, I can get two loaves of bread, her says, for sixpence halfpenny. (FRED, N, LAN_020)
(7) ... so he told I he’d give I the sack ... (FRED, SW, WIL_009)
(8) So he said, I could do with he for a fortnight. (FRED, SW, CON_009)
(9) And there’s a story to this. Might interest yourself. (FRED, N, LAN_012)
(10) ... and I side-stepped the manager because he was a bit apprehensive about miself ... (FRED, SW, WIL_015)
(11) This old toilet, he was still there when we were doing the garden. (FRED, SW, SOM_027)
(12) He used to just open his coat out, Here’s a couple of rabbits, Fred, give us a couple of pints. (FRED, N, YKS_006)
(13) I used to walk to Woodstock at many a time with eh Woodstock when they had Woodstock gloves, innit, yes and made them. (FRED, SW, OXF_001)
(14) ... it had been a shop sometime or other, and down the bottom was a saw mill, wunnit? (FRED, SE, LND_004)

The empirical basis for this study is the England component of FRED, comprising the four major dialect areas North, Midlands, Southeast and Southwest, with a total of 1.5 million words and 180 hours of recorded speech. By applying a mixed quantitative-qualitative approach to the entire component, the study gains new insights into the geographical distribution of non-standard pronouns as well as their underlying functions in spontaneous discourse. A holistic approach is used to identify determinants of variation that are common to apparently unconnected phenomena. In addition, the applied methodology facilitates the comparison of different morphosyntactic categories such as person, number, gender and case, and establishes the relative importance of each category for the correct processing of pronominal expressions.

4.3. Complement clauses

The study by Daniela Kolbe analyses dialect features in complement clauses. The individual features under investigation include the use of subject-verb inversion in embedded interrogative clauses, as illustrated in (15), the use of as instead of that as complementizer in that-clauses, as illustrated in (16), and the use of for to instead of to as an infinitive marker or complementizer, as illustrated in (17).

(15) ... she wanted to see was there any pigs there, and she saw two. (Northern Ireland Transcribed Corpus of Speech (NITCS), A56.1b, AW23)
(16) Don’t think as you can make a lot of money out of them [pigeons], because you can’t. (FRED, Mid, SAL_009)
(17) He would try for to tell her. (NITCS, text A32.3, LM7)

Some of these dialect features, however, occur in almost all dialects of British English (Kortmann and Szmrecsanyi 2004: 1142–1148). This raises the question of whether they are actually dialectal, i.e., restricted to certain regions in the UK, or whether they should be considered as non-standard features of a general British English vernacular.

In order to answer this question, the study considers the speakers’ geographical origin (determining their dialect) as one factor that influences the use of the dialect features mentioned above, drawing on data from FRED and the NITCS (Northern Ireland Transcribed Corpus of Speech). Other factors besides geographical origin that influence language use are the speaker’s age and gender, and language-internal factors. An example of the latter is the preference for embedded inversion in clauses expressing a lack of information as in (18), rather than in structurally similar clauses not expressing such a lack as in (19) (cf. Ohlander 1986; Huddleston and Pullum 2002: 981).

(18) She wanted to know what your name is / what is your name.
(19) She knew what your name is / ?what is your name.

For each dialect feature in complement clauses listed above, all factor weights are calculated and compared in logistic regression analyses, which thus show whether a speaker’s dialect is a strong reason for her or him to use a non-standard feature.
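In the variable-rule tradition, a factor weight is commonly obtained from a logistic regression coefficient by applying the inverse logit, so that a centred log-odds coefficient of 0 corresponds to a neutral weight of 0.5. The Python sketch below illustrates this conversion only. It is not a reconstruction of the study's own procedure, and the coefficients are invented placeholders rather than results. The level names and the function name are hypothetical.

```python
import math

def factor_weight(log_odds: float) -> float:
    """Inverse logit: maps a log-odds coefficient to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical sum-coded coefficients for one factor group (dialect area).
# These values are placeholders for illustration, not findings.
coefficients = {"area A": 0.8, "area B": 0.0, "area C": -0.8}

for level, beta in coefficients.items():
    print(f"{level}: {factor_weight(beta):.2f}")
```

On this reading, weights above 0.5 favour the non-standard variant and weights below 0.5 disfavour it, while the spread of weights within a factor group gives a rough indication of how strongly that factor conditions the choice.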

Notes

1. The wider theoretical framework of the project was based on functional typology, hence the project’s name ‘English Dialect Syntax from a Typological Perspective’. For different motivations of linking dialectology and typology, see e.g., Anderwald and Kortmann (2002), Kortmann (2002) and Kortmann (2004).

References

Anderwald, Lieselotte. 2002. Negation in Non-Standard British English: Gaps, Regularizations, Asymmetries. London and New York: Routledge.
Anderwald, Lieselotte. 2005. Negative concord in British English dialects. In: Yoko Iyeiri (ed.), Aspects of Negation, 113–137. Amsterdam and Philadelphia: John Benjamins.
Anderwald, Lieselotte. 2006. Non-standard verb paradigms in traditional British English dialects: Morphological naturalness and comparative dialect grammar. Freiburg University, English Department: Post-doctoral dissertation (Habilitationsschrift).
Anderwald, Lieselotte. 2007. ‘He rung the bell’ and ‘she drunk ale’ – non-standard past tense forms in traditional British dialects and on the internet. In: Marianne Hundt, Nadja Nesselhauf, and Carolin Biewer (eds.), Corpus Linguistics and the Web, 271–285. Amsterdam: Rodopi.
Anderwald, Lieselotte. 2009. The Morphology of English Dialects: Verb-Formation in Non-Standard English. Cambridge: Cambridge University Press.
Anderwald, Lieselotte. 2011a. Are non-standard dialects more ‘natural’ than the standard? A test case from English verb morphology. Journal of Linguistics 47: 251–274.
Anderwald, Lieselotte. 2011b. Norm vs variation in British English irregular verbs: the case of past tense sang vs sung. English Language and Linguistics 15: 85–112.
Anderwald, Lieselotte and Bernd Kortmann. 2002. Typology and dialectology: a programmatic sketch. In: Jaap van Marle and Jan Berns (eds.), Present-Day Dialectology, Vol. I: Problems and Discussions, 159–171. Berlin and New York: Mouton de Gruyter.
Anderwald, Lieselotte and Benedikt Szmrecsanyi. 2009. Corpus linguistics and dialectology. In: Anke Lüdeling and Merja Kytö (eds.), Corpus Linguistics: An International Handbook, vol. 2, 1126–1140. Berlin and New York: Mouton de Gruyter.
Anderwald, Lieselotte and Susanne Wagner. 2007. FRED – the Freiburg English Dialect Corpus. In: Joan Beal, Karen Corrigan, and Hermann Moisl (eds.), Creating and Digitizing Language Corpora: Synchronic Databases, vol. 1, 35–53. London: Palgrave Macmillan.
Bolinger, Dwight. 1980. WANNA and the gradience of auxiliaries. In: Gunter Brettschneider and Christian Lehmann (eds.), Wege zur Universalienforschung. Sprachwissenschaftliche Beiträge zum 60. Geburtstag von Hansjakob Seiler, 292–299. Tübingen: Narr.
Chambers, J.K. and Peter Trudgill. 1980. Dialectology. Cambridge: Cambridge University Press.
Chambers, J.K. and Peter Trudgill. 1998. Dialectology. 2nd edn. Cambridge: Cambridge University Press.

Corrigan, Karen and Leonie Cornips (eds.). 2005. Syntax and Variation: Recon- ciling the Biological and the Social. Amsterdam and Philadelphia: John Ben- jamins. Filppula, Markku, Juhani Klemola, Marjatta Palander, and Esa Penttilä (eds.). 2005. Dialects Across Borders: Selected Papers from the 11th International Confer- ence on Methods in Dialectology (Methods XI), Joensuu, August 2002. Ams- terdam and Philadelphia: John Benjamins. Hernández, Nuria. 2002. A context hierarchy of untriggered self-forms in English. Zeitschrift für Anglistik und Amerikanistik 50: 269–284. Hernández, Nuria. 2006. User’s Guide to FRED: Freiburg Corpus of English Dia- lects. Freiburg: Freiburg University. http://www.freidok.uni-freiburg.de/ voll- texte/2489. Hernández, Nuria. 2008. FRED corpus, database entry. Helsinki University: Cor- pus Resource Database. http://www.helsinki.fi/varieng/CoRD/corpora/ FRED. Hernández, Nuria. forthcoming. Personal Pronouns in the Dialects of England - A Corpus Study of Grammatical Variation in Spontaneous Speech. Herrmann, Tanja. 2003. Relative Clauses in Dialects of English: A Typological Approach. Ph. D. diss. Http://www. freidok.uni-freiburg.de//volltexte/830. Huddleston, Rodney and Geoffrey Pullum. 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press. Kolbe, Daniela. 2008. Complement clauses in British Englishes. Ph. D. diss. Kolbe, Daniela. 2010. The semantic and grammatical overlap of as and that: Ev- idence from non-standard English. In: Ute Römer and Rainer Schulze (eds.), Exploring the Lexis-Grammar Interface. Amsterdam: John Benjamins. Kortmann, Bernd. 2002. New prospects for the study of English dialect syntax: impetus from syntactic theory and language typology. In: L. Cornips S. Barbiers and S. van der Kleij (eds.), Syntactic Microvariation, 185–213. Amsterdam: Meertens Institute Electronic Publications in Linguistics. Kortmann, Bernd. 2004. Why dialect grammar matters. 
The European English Messenger XIII: 24–29. Kortmann, Bernd. 2006. Syntactic variation in English: A global perspective. In: Bas Aarts and April McMahon (eds.), Handbook of English Linguistics, 603– 624. Oxford: Blackwell. Kortmann, Bernd. 2009. Areal variation in syntax. In: Peter Auer and Jürgen Schmidt (eds.), Language and Space: Theory and Methods, 837–864. Berlin and New York: Walter de Gruyter. Kortmann, Bernd and Edgar Schneider (eds.). In collaboration with Kate Burridge, Rajend Mesthrie, and Clive Upton. 2004a. A Handbook of Varieties of English, vol. 1: Phonology. Berlin and New York: Mouton de Gruyter. Kortmann, Bernd and Edgar Schneider (eds.). In collaboration with Kate Burridge, Rajend Mesthrie, and Clive Upton. 2004b. A Handbook of Varieties of English, vol. 2: Morphology and Syntax. Berlin and New York: Mouton de Gruyter. General introduction 15

Kortmann, Bernd and Benedikt Szmrecsanyi. 2004. Global synopsis: Morpholog- ical and syntactic variation in English. In: Kortmann and Schneider (2004b), 1142-1202. Kortmann, Bernd and Benedikt Szmrecsanyi. 2011. Parameters of morphosyntactic variation in World Englishes: prospects and limitations of searching for univer- sals. In: Peter Siemund (ed.), Linguistic Universals and Language Variation, 264–290. Berlin and New York: Mouton de Gruyter. Kortmann, Bernd and Susanne Wagner. 2005. The Freiburg English Dialect Project and Corpus. In: Kortmann et al. (2005), 1–20. Kortmann, Bernd and Susanne Wagner. 2007. A Fresh Look at Late Dialect Syntax. In: Javier Pérez-Guerra, Dolores González-Álvarez, Jorge L. Bueno-Alonso, and Esperanza Rama-Martínez (eds.), Of Varying Lan- guage and Opposing Creed. New Insights into Late Modern English, 279–300. Bern and Frankfurt: Peter Lang. Kortmann, Bernd and Susanne Wagner. 2010. Changes and continuities in dialect grammar. In: Raymond Hickey (ed.), Eighteenth Century English. Ideology and Change, 269–292. Cambridge: Cambridge University Press. Kortmann, Bernd, Tanja Herrmann, Lukas Pietsch, and Susanne Wagner. 2005. A Comparative Grammar of British English Dialects: Agreement, Gender, Rela- tive Clauses. Berlin and New York: Mouton de Gruyter. Labov, William. 1966. The Social Stratification of English in New York City. Wash- ington, DC: Center for Applied Linguistics. Labov, William. 1972a. Sociolinguistic Patterns. Oxford: Blackwell. Labov, William. 1972b. Some principles of linguistic methodology. Language in Society I: 97–120. Labov, William. 1997. Some further steps in narrative analysis. Journal of Narra- tive and Life History 7: 395–415. Ohlander, Sölve. 1986. Question-orientation versus answer-orientation in English interrogative clauses. In: Dieter Kastovsky and Aleksander Szwedek (eds.), Linguistics across historical and geographical boundaries, 963–982. Berlin and New York: Mouton de Gruyter. 
Orton, Harold and Eugen Dieth. 1962-71. Survey of English dialects: Basic mate- rials. Leeds: E. J. Arnold and Son. Pietsch, Lukas. 2005. Variable Grammars: Verbal Agreement in Northern Dialects of English. Tübingen: Niemeyer. Schulz, Monika. 2010a. Morphosyntactic Variation in British English Dialects: Evidence from Possession, Obligation and Past Habituality. Ph. D. diss. Schulz, Monika. 2010b. Past habituality in British English dialects: the distribu- tion of WOULD and USED TO in Westmoreland and Nottinghamshire. In: Clive Upton and Barry Heselwood (eds.), Proceedings of the 13th International Conference on Methods in Dialectology, Leeds, 4th-8th August 2008. Frankfurt: Peter Lang. 16 Nuria Hernández, Daniela Kolbe, Monika Edith Schulz

Szmrecsanyi, Benedikt. 2005. Language users as creatures of habit: a corpus- linguistic analysis of persistence in spoken English. Corpus Linguistics and Linguistic Theory 1: 113–150. Szmrecsanyi, Benedikt. 2006. Morphosyntactic Persistence in Spoken English: A Corpus Study at the Intersection of Variationist Sociolinguistics, Psycholinguis- tics, and Discourse Analysis. Berlin and New York: Mouton de Gruyter. Szmrecsanyi, Benedikt. 2008. Corpus-based dialectometry: Aggregate morpho- syntactic variability in British English dialects. International Journal of Hu- manities and Arts Computing 2: 279–296. Szmrecsanyi, Benedikt. 2009. Typological parameters of intralingual variability: grammatical analyticity versus syntheticity in varieties of English. Language Variation and Change 21: 319–353. Szmrecsanyi, Benedikt. 2010a. The English genitive alternation in a cognitive sociolinguistics perspective. In: Dirk Geeraerts, Gitte Kristiansen, and Yves Peirsman (eds.), Advances in Cognitive Sociolinguistics, 141–166. Berlin and New York: Mouton de Gruyter. Szmrecsanyi, Benedikt. 2010b. The morphosyntax of BrE dialects in a corpus- based dialectometrical perspective: Feature extraction, coding protocols, pro- jections to geography, summary statistics. Freiburg: Freiburg University. http://www.freidok.uni-freiburg.de/volltexte/7320. Szmrecsanyi, Benedikt. 2011a. Corpus-based dialectometry – a methodological sketch. Corpora 6: 45–76. Szmrecsanyi, Benedikt. 2011b. The geolinguistics of grammatical variability in traditional British English dialects: A large-scale frequency-based study. Post- doctoral dissertation (Habilitationsschrift), University of Freiburg. Szmrecsanyi, Benedikt. in press a. Analyzing aggregated linguistic data. In: Man- fred Krug and Julia Schlüter (eds.), Research methods in language variation and change. Cambridge: Cambridge University Press. Szmrecsanyi, Benedikt. in press b. Geography is overrated. 
In: Sandra Hansen, Christian Schwarz, Philipp Stoeckle, and Tobias Streck (eds.), Dialectological and folk dialectological concepts of space. Berlin and New York: Walter de Gruyter. Szmrecsanyi, Benedikt and Nuria Hernández. 2007. Manual of Information to Ac- company the Freiburg Corpus of English Dialects Sampler. Freiburg: Freiburg University. http://www.freidok.uni-freiburg.de//volltexte/2859. Szmrecsanyi, Benedikt and Lars Hinrichs. 2008. Probabilistic determinants of gen- itive variation in spoken and written English: A multivariate comparison across time, space, and genres. In: Terttu Nevalainen, Irma Taavitsainen, Päivi Pahta, and Minna Korhonen (eds.), The Dynamics of Linguistic Variation: Corpus Ev- idence on English Past and Present, 291–309. Amsterdam and Philadelphia: John Benjamins. General introduction 17

Szmrecsanyi, Benedikt and Bernd Kortmann. 2009. Between simplification and complexification: non-standard varieties of English around the world. In: Ge- offrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 64–79. Oxford: Oxford University Press. Szmrecsanyi, Benedikt and Christoph Wolk. 2011. Holistic corpus-based dialectol- ogy. Brazilian Journal of Applied Linguistics/ Revista Brasileira de Linguística Aplicada (special issue “Corpus studies: future directions”, ed. by Stefan Th. Gries) 11: 561–592. Trudgill, Peter. 1974. The social differentiation of English in Norwich. Cambridge: Cambridge University Press. Trudgill, Peter. 1999. The Dialects of England. Oxford: Blackwell. Trudgill, Peter. 2000. The Dialects of England. Oxford: Blackwell. Trudgill, Peter. 2004. Dialects. London: Routledge. Upton, Clive. 2008. . In: Varieties of English: The British Isles, 237–252. Berlin and New York: Mouton de Gruyter. Wagner, Susanne. 2003. Gender in English Pronouns: Myth and Reality. PhD diss., Albert-Ludwigs-Universität Freiburg. http://www.freidok.uni-freiburg. de/volltexte/1412. Wagner, Susanne. 2007. Unstressed periphrastic do – from Southwest England to Newfoundland? English World-Wide 28: 249–278. Wagner, Susanne. 2008. English dialects in the Southwest: Morphology and syntax. In: Bernd Kortmann and Clive Upton (eds.), Varieties of English - The British Isles, 417–439. Berlin and New York: Mouton de Gruyter. Wagner, Susanne. forthcoming a. Southwest England. In: Bernd Kortmann and Kerstin Lunkenheimer (eds.), The Electronic World Atlas of Variation in Eng- lish: Grammar. München and Berlin: Max Planck Digital Library in coopera- tion with De Gruyter Mouton. Wagner, Susanne. forthcoming b. South West English. In: Tometro Hopkins, John McKenny, and Kendall Decker (eds.), World Englishes, Vol. 1: The British Isles. London: Continuum. Wagner, Susanne. forthcoming c. Pronominal systems. 
In: Raymond Hickey (ed.), Areal Features of the Anglophone World. Berlin and New York: Mouton de Gruyter.

Possession and obligation

Monika Edith Schulz

Well, ’cos I got wet a couple a’ times, he went and bought me a big police cape. Well, this caape ud wrap round me three or four times, but Mrs Jones ud put a tuck in it, and wrap it round me neck. I ’d got this basket, and I ’d got to goo about one and a half, two miles o’er the mount to Ma Baugh’s – goin’ towards Owd Park – draggin’ it – with this caape. (Midlands, Shropshire, SAL_038)

1. Introduction

The present chapter will be concerned with past possession and past obligation marking in the Midlands and the North, supplemented by a discussion of the few present tense obligation contexts found in the data. The focus will be on differences in the distribution of past possessive HAD and HAD GOT and past obligation HAD TO and HAD GOT TO in the two dialect areas. The remainder of the present section lays out the data used for the present study. Section 2 briefly addresses the notions of predicative possession and obligation and comments on the comparative rarity of past tense HAD GOT and HAD GOT TO. Section 3 lays out a few basics of grammaticalization with a particular focus on layering in the sense of Hopper (1991) and variation in the verbalization of experience in the sense of Croft (2010). Section 4 details the delineation and identification of all past possession, present obligation and past obligation contexts in the data. Section 5 presents the results and discusses the patterning of HAD, HAD GOT, HAD TO and HAD GOT TO in the two dialect areas against the backdrop of grammaticalization and different degrees of grammaticalization. Section 6 is concerned with the availability of wide and narrow scope negation with the markers under discussion. Section 7, finally, provides a discussion of the few instances of present tense obligation found in the data and relates them to recent findings from other varieties of British English.

Table 1. The Midlands subcorpus

county            data point        total speakers   total words
Leicestershire    Caldwell                       2         7,862
Nottinghamshire   Nottingham                     6        53,010
                  Southwell                      1        11,091
                  (county total)                 7        64,101
Shropshire        Coalbrookdale                  1         4,672
                  Coalport                       4        13,455
                  Craigside                      2        10,086
                  Dawley                         2         7,366
                  Farley                         2         9,456
                  Ironbridge                     3        12,271
                  Lawley                         1         5,986
                  Madeley                        3        14,108
                  Oakengates                     1        15,178
                  […]                            2         8,168
                  Wellington                     2        10,664
                  (county total)                23       111,419
Midlands                                        32       183,382

For the present study two subcorpora of roughly 180,000 words each were sampled from FRED to represent the Midlands and the North. All of the FRED speakers used for the present study were born between 1884 and 1910, grew up before the Second World War and were initially recorded during the 1970s and 1980s. An overview of data points, number of speakers per data point and number of words contributed per data point is provided in Tables 1 and 2. Full information including each speaker’s date of birth, longitude and latitude of the data point they represent, the number of words they contribute and the FRED text code of their interview can be found in Schulz (2011). As dialect boundaries cut across county boundaries in a number of cases, the location of the individual data points with respect to the modern accent regions of England as postulated on the basis of phonological differences in Trudgill (1999: 52–84), and their classification in terms of relic and transition areas, will be provided here.

Table 2. The North subcorpus

county               data point        total speakers   total words
Lancashire           Prescott                       1        26,807
                     Preston                        3        26,286
                     Wigan                          2        22,708
                     (county total)                 6        75,801
Westmorland          Ambleside                      7        44,676
Yorkshire / Durham   Guisborough                    1         2,520
                     Hartlepool                     1         7,274
                     Hinderwell                     1         6,891
                     Middlesbrough                  4        33,388
                     (county total)                 7        50,073
North                                              20       170,550

Trudgill (1999) distinguishes between Traditional and Modern British English dialects. For both Traditional and Modern British English dialects the North has been identified as a more traditional, conservative area which has retained older phonological features of the language and has not been reached by some of the phonological innovations which spread from the South of England (Trudgill 1999: 24, 67). While the FRED material is certainly very conservative, with speakers of the two subcorpora investigated here born between 1884 and 1910, it has been argued to pattern with the Modern rather than the Traditional dialect boundaries (Kortmann and Wagner 2005: 11). The data points in Westmorland, North Yorkshire, Durham and Lancashire represent the most traditional dialect areas of the Lower North (Trudgill 1999: 67), splitting up into the Central North (Westmorland, North Yorkshire and Durham) and Central Lancashire at the border to the Northwest Midlands. The most prominent indicator of the conservative nature of these areas is the retention of a monophthongal pronunciation of words like gate and boat, which are pronounced /geɪt/ and /bəʊt/ in southern dialects but /geːt/ and /boːt/ in northern dialects (Trudgill 1999: 70, Beal 2004: 123).

The data points in Shropshire, Leicestershire and Nottinghamshire represent the less conservative dialect areas of the Midlands, splitting up into the Northwest Midlands (Shropshire) and the Central Midlands (Leicestershire and Nottinghamshire). In contrast to the northern dialect areas, both the Northwest and the Central Midlands show the diphthongal pronunciation of words like gate and boat and pattern with the innovative southern dialect areas in that respect (Trudgill 1999: 73–74). On the basis of these phonological indicators the northern subcorpus will be taken to represent a relic area, while the Midlands subcorpus will be taken to represent a transition area as laid out above. An analysis of the systems of past possession, past obligation and past habitual marking and their degree of grammaticalization will establish whether this classification based on phonological indicators is confirmed by the patterning of morphosyntactic phenomena.

2. Possession and obligation

Cross-linguistically as well as conceptually, possession and obligation have repeatedly been shown to be intricately linked, with expressions of possession a frequent source for expressions of obligation (Bybee et al. 1994: 182–183; Heine 1997: 193–195; Bhatt 1997; Heine and Kuteva 2002: 243–245, among others). For the English language in particular, the diachronic link between HAVE TO and HAVEposs is well-documented (cf. among others van der Gaaf 1931; Visser 1969; Brinton 1991; Fischer 1994; Fischer et al. 2000). The diachronic relationship between HAVE GOT and HAVE GOT TO has taken a back seat, with Gronemeyer (1998) and Krug (2000) as notable exceptions, who argue that the rise of HAVE GOT TO can be understood as a “success story” of grammaticalization (Krug 2000: 63). A full discussion of the intricate diachronic relationships between these four markers can be found in Schulz (2011). Past possession HAD GOT and past obligation HAD GOT TO do not receive much attention in the literature. Early treatments of present tense HAVE TO and HAVE GOT TO mention the past tense uses in passing. The Oxford English Dictionary only records present tense HAVE TO and HAVE GOT TO, while Whitney’s 1889 Century Dictionary makes reference to both present and past tense uses (Whitney 1889: 2503).

Both Jespersen (1931: 47–53) and Visser (1973: 2206) discuss HAD GOT and HAD GOT TO as possible but rather rare forms, while Crowell (1959: 286) identifies them as a British English rather than an American English feature. Huddleston and Pullum (2002: 112) also classify HAD GOT and HAD GOT TO as possible but infrequent forms. Tagliamonte (2003), who investigates the distribution of HAVE and HAVE GOT in three Northern English dialects, discusses present tense uses only and does not indicate the status of past possession HAD GOT. Past obligation HAD GOT TO has been identified as an exceedingly rare to non-existent marker restricted to indirect speech contexts in numerous corpus-based studies (Coates 1983: 54; Krug 2000: 108; Tagliamonte and Smith 2006: 353; Tagliamonte and D’Arcy 2007: 61), or ruled out altogether as a possible form (Myhill 1996: 247). The dialect data from the North pattern along the same lines, with past possession and past obligation overwhelmingly marked by HADposs and HAD TO. The Midlands, however, show stable variation between HADposs and HAD GOT for past possession and between HAD TO and HAD GOT TO for past obligation. The situation in the two dialect areas will be discussed in section 5.

3. Grammaticalization

Grammaticalization has received numerous definitions since it was first used as a term by the French linguist Antoine Meillet, who described it as “the attribution of grammatical character to an erstwhile autonomous word” (“l’attribution du caractère grammatical à un mot jadis autonome” (Meillet 1912: 131)). He was drawing on a rich research tradition including the work of scholars like Wilhelm von Humboldt (1767–1835) and Georg von der Gabelentz (1840–1893), who anticipated many of the basic principles of grammaticalization as it is understood today (Hopper and Traugott 2003²: 19–25).

One of the cornerstones of grammaticalization research during the late 1990s was the increasing interest in framing grammaticalization within what is commonly referred to as the usage-based approach to grammatical structure (Croft 2000, Hopper and Traugott 2003²: 35). Several factors related to language use are hypothesized to play an important role, including ritualization (Haiman 1994), string-frequency (Krug 2000), and the localized nature of elements in their incipient stages of grammaticalization (cf. Bybee and Hopper 2001, among others).

This trend led to more rigorous conceptualizations and definitions of grammaticalization and is probably best expressed in the following quote:

[Grammaticalization is] the process whereby lexical material in highly constrained pragmatic and morphosyntactic contexts is assigned grammatical function, and once grammatical, is assigned increasingly grammatical, operator-like function (Traugott 2003: 645).

Indicators of the progressive grammaticalization of an expression have received a lot of attention in the literature, starting with the set of three syntagmatic and three paradigmatic phonological, morphological and syntactic parameters postulated in Lehmann (1982), which were intended to provide criteria against which the degree of grammaticalization of an item could be measured.

While these parameters have come under attack in the literature (Hopper and Traugott 2003²: 31–32), the usefulness of having indicators of grammaticalization at one’s disposal has never been seriously questioned. The focus here will be on the indicator of layering. The term layering was coined by Hopper (1991: 22), who argues that

within a broad functional domain, new layers are continually emerging. As this happens, the older layers are not necessarily discarded, but remain to coexist with and interact with the newer layers. (Hopper 1991: 22).

Layering receives a usage-based interpretation in the recent work of Croft (2010: 44), who links variation in the verbalization of experience, or “first-order variation”, to morphosyntactic variation in the sense of layering, or “second-order variation”.

Croft (2010) draws on the distinction introduced by Chafe (1977a,b) between high- and low-codability experiential domains, where low-codability domains show higher levels of variation in encoding than high-codability domains.
Low-codability domains are then linked to grammaticalization, where a high degree of first-order variation favors the development of variation in grammatical encoding and thus leads to the layering of grammatical markers in a functional domain:

Differences in codability may have significance for morphosyntactic change. It is possible that morphosyntactic change originates more frequently in the verbalization of lower-codability experiences than in higher-codability experiences. That is, low-codability experiences will have more variants which may increase the likelihood that one will be propagated and replace the original highest-frequency variant. (Croft 2010: 44).

Building on these notions, it is hypothesized here that the degree of grammaticalization of a functional domain can be assessed in terms of how much first-order variation has been grammaticalized into second-order variation. The more layered variants a functional domain has, the more grammaticalized it is as a whole.

4. HAD and GOT: Disambiguating past possession and past obligation

In order to identify all instances of HADposs, HAD GOT, HAVE GOT TO, HAD TO and HAD GOT TO, general searches with WordSmith were conducted for all expressions containing any one of the items have to, has to, ’ve to, ’s to, had* and got. Results for these searches will be discussed in turn. The search terms have to, has to, ’ve to and ’s to allow a fairly unproblematic identification of present tense obligation HAVE TO as illustrated in (1). Examples of didn’t have to also identified by these search terms are not included in the counts for present tense obligation here and will be discussed as instances of past obligation below.

(1) You have to do your repairs yourself. (North, Westmorland, WES_011)
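The searches described above can be approximated outside WordSmith. The following Python sketch is purely illustrative and is not the procedure used for the present study: the regular expressions are an assumed translation of the six search terms (in particular, had* is taken to cover contracted negatives such as hadn’t), and the name find_hits is hypothetical.

```python
import re

# Assumed regex equivalents of the search terms named in the text.
# Both straight and typographic apostrophes are accepted.
SEARCH_PATTERNS = {
    "have to": r"\bhave to\b",
    "has to": r"\bhas to\b",
    "'ve to": r"['’]ve to\b",
    "'s to": r"['’]s to\b",
    "had*": r"\bhad\w*(?:['’]\w+)?",   # had, hadn't, ... (wildcard is an assumption)
    "got": r"\bgot\b",
}


def find_hits(text):
    """Return (search term, matched string, start offset) for every hit, in text order."""
    hits = []
    for term, pattern in SEARCH_PATTERNS.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((term, match.group(0), match.start()))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    for hit in find_hits("You have to do your repairs yourself."):
        print(hit)
```

Every hit returned in this way would still have to be classified by hand, since the search terms retrieve possessive, obligational and other uses alike.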

So-called syntactic uses of HAVE TO such as (2) should be distinguished from nonsyntactic uses in a comparison of HAVE TO with HAVE GOT TO, as syntactic contexts are available to HAVE TO only (Myhill 1996). Table 3 presents the results for present tense obligation HAVE TO in the Midlands and the North, with raw frequencies first and frequencies normalized per 10,000 words in brackets.

(2) She wouldn’t even bring you one. If her husband wanted a drink he’d have to go for it, she wouldn’t bring it. (North, Lancashire, LAN_008)

The search terms had* and got yield sets of related expressions whose disambiguation and classification are discussed in the remainder of this section. Classification of the different uses for the present study was guided by two rationales. The main rationale was to identify all instances of past possession and past obligation. In addition, some of the classification choices were motivated by an exclusion of those contexts in which the markers under discussion do not compete.

Table 3. Occurrences of present tense obligation HAVE TO in the Midlands and the North as raw frequencies, with frequencies normalized per 10,000 words in brackets

                                                                 Midlands      North
present tense HAVE TO (syntactic uses)                          120 (6.5)   61 (3.6)
present tense HAVE TO (nonsyntactic uses, including negation)    12 (0.7)   15 (0.9)
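The bracketed figures follow from the raw counts and the subcorpus sizes given in Tables 1 and 2 (183,382 words for the Midlands, 170,550 words for the North). A minimal sketch of the calculation is given below; the helper name per_10k is hypothetical, and rounding to one decimal place is assumed from the way the figures are reported.

```python
# Subcorpus sizes as reported in Tables 1 and 2.
MIDLANDS_WORDS = 183_382
NORTH_WORDS = 170_550


def per_10k(raw_count, corpus_words):
    """Normalize a raw frequency to occurrences per 10,000 words (one decimal place)."""
    return round(raw_count / corpus_words * 10_000, 1)


if __name__ == "__main__":
    # Raw counts for present tense HAVE TO taken from Table 3.
    print(per_10k(120, MIDLANDS_WORDS), per_10k(61, NORTH_WORDS))
    print(per_10k(12, MIDLANDS_WORDS), per_10k(15, NORTH_WORDS))
```

Normalization of this kind makes the two subcorpora comparable despite their slightly different sizes.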

The form had* can be used both as an auxiliary for marking the past perfect and as a main verb in the simple past with the latter use splitting up into possessive and what has been termed dynamic or eventive uses (cf. Quirk et al. 1985: 132). Eventive uses include interpretations in the sense of ‘receive’ as illustrated in (3), idioms with an eventive object as in (4), causative uses as in (5) and experiencer uses as illustrated in (6).

(3) Then I could always recite a little poem about a rabbit. If I didna’ do it, then I had a poke in the back to do it. (Midlands, Shropshire, SAL_023)
(4) We stopped in Marseilles for about a month, had a good time there and then of course went onto the boat. (Midlands, Nottinghamshire, NTT_002)
(5) Christmas time we had a card printed and we used to hand it to the person who we delivered it to. (North, Lancashire, LAN_023)
(6) The time of the accident. I had mi pay stopped from that time. (North, Yorkshire, YKS_001)

Possessive uses cover several subtypes of possession. Core types of permanent and physical possession as illustrated in examples (7) and (8) are included as well as peripheral types of abstract, inanimate and inalienable possession as illustrated in (9), (10) and (11). No attempt was made to assign separate labels to different types of possession, though, to avoid a fragmentation of the data set.

(7) Macarthy had the butcher’s shop at the bottom of the lane. (Midlands, Leicestershire, LEI_001)
(8) So when I got to t’ front of t’ class, he had this little pointer and he had this relief map. (North, Lancashire, LAN_012)
(9) Grandfather had a reputation for being a first class butcher. (Midlands, Shropshire, SAL_023)
(10) It was unusual because the majority of houses only had two bedrooms. (North, Durham, DUR_002)
(11) Of course this poor old gaffer of mine, he only had one eye. (North, Yorkshire, YKS_001)

Affirmative and negated instances of past possession were included in the counts. Both direct negation and negation with do-support are available for HADposs, and occur with simple and double negation as illustrated in examples (12) - (15). Differences between the dialects with respect to the two basic negation patterns will be discussed in section 6.

(12) Sometimes they hadn’t enough to pay the men. (Midlands, Nottinghamshire, NTT_008)
(13) I can remember up at Colwick...where they couldn’t go to school because they hadn’t no boots. (Midlands, Nottinghamshire, NTT_008)
(14) While we didn’t have the toys a modern child has, I had one day a magic lantern. (Midlands, Shropshire, SAL_023)
(15) I mean it was all bayonet fighting and all that you see, but eh, I didn’t have no rifle or bayonet. (Midlands, Nottinghamshire, NTT_008)
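The four negation patterns in (12) - (15) can be told apart mechanically once a clause has been identified as a past possession context. The sketch below is a rough heuristic under that assumption and not the coding procedure of the present study: the patterns, the short list of second negative elements and the function name negation_pattern are all hypothetical, and dialectal spellings such as didna’ are not covered.

```python
import re

# Assumed surface patterns for the two basic negation types of HAD.
DO_SUPPORT = re.compile(r"\bdid(?:n['’]t| not) have\b", re.IGNORECASE)
DIRECT = re.compile(r"\bhad(?:n['’]t| not)\b", re.IGNORECASE)
# A second negative element after the negated verb is taken to signal double negation.
SECOND_NEGATIVE = re.compile(r"\b(?:no|none|nothing|nobody|never)\b", re.IGNORECASE)


def negation_pattern(clause):
    """Return (negation type, 'simple' or 'double'), or None if no negated HAD is found."""
    match = DO_SUPPORT.search(clause)
    kind = "do-support"
    if match is None:
        match = DIRECT.search(clause)
        kind = "direct"
    if match is None:
        return None
    # Only material following the negated verb is inspected.
    rest = clause[match.end():]
    polarity = "double" if SECOND_NEGATIVE.search(rest) else "simple"
    return kind, polarity


if __name__ == "__main__":
    print(negation_pattern("they hadn't no boots"))
    print(negation_pattern("While we didn't have the toys a modern child has"))
```

A heuristic of this kind cannot separate possessive from obligational uses (hadn’t to), so it presupposes the manual disambiguation described in this section.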

Finally, the different eventive and possessive uses of HAD discussed above can not only occur in the simple past but also in the present and the past perfect preceded by had, ’d, have, ’ve, has or ’s as illustrated for the past perfect in (16) and (17).

(16) This one now, this eighteen-year old, she’d had a brain haemorrhage. (North, Lancashire, LAN_007)
(17) I’d never had a sheepdog till I came to Tarn Foot. (North, Westmorland, WES_008)

These types of uses are not available to HAD GOT and were thus grouped together as contexts excluding competition between the two past possession markers.

Critical contexts for the development of obligational meaning in HAVE TO in the form of a possessive expression followed by a to-infinitive as discussed by Fischer (1994) occur in the past tense as well. Cases like (18) and (19), which are ambiguous between a clear possessive and a clear obligational reading, were coded separately.

(18) Well, mi mother always had a family to look after. (North, Durham, LAN_002)
(19) I had all the rest of the stuff to shove up there in a handcart. (North, Yorkshire, YKS_001)

The uninterrupted sequence had to usually signals past obligation, as illustrated in (20). Both affirmative and negated instances of past obligation were included in the counts. HAVE TO has both direct negation and negation with do-support available to it. Only instances of negation with do-support as illustrated in (21) were included as instances of past obligation, though. Direct negation can have both wide scope as in (22) and narrow scope as in (23) and will thus be discussed separately in more detail in section 6.

(20) We always had to go to school until we were fourteen. (Midlands, Shropshire, SAL_030)
(21) They didn’t have to be bedded in like the biscuit kiln, they’d stand in the saggers. (Midlands, Shropshire, SAL_016)
(22) Q: When you had to go to these camps for a fortnights training, did the firm you worked for have to keep your job open for you? A: Well they used to do but they hadn’t to do. There was no such a thing as them having to do in them days. (North, Lancashire, LAN_020)
(23) A: And at prayer time, what they call assembly now, prayer times they had to go into a classroom on their own or out into the porch while we sang our hymns or whatever they were, you see. Catholic children hadn’t to be contaminated with us, or us with them. B: Yes, whichever way round you want to put it. A: ’Cause it was a Church of England school, you see. (North, Yorkshire, YKS_006)

Table 4 presents the results for the distribution of different uses of had, with raw frequencies and frequencies normalized per 10,000 words, in order of their frequency of occurrence in the Midlands.

Table 4. Distribution of had across different uses in the Midlands and the North with raw frequencies and normalized per 10,000 words in brackets

                                                                            Midlands        North
simple past possession had                                                449 (24.5)   797 (46.7)
past perfect had/’d had                                                   442 (24.1)   357 (21.0)
simple past obligation had to                                             324 (17.7)   471 (27.6)
simple past eventive had                                                  266 (14.5)   241 (14.1)
contexts excluding competition with HAD GOT had/’d/have/’ve/has/’s had,     63 (3.4)     32 (1.9)
bridging context had /’d NP + to-infinitive                                 19 (1.0)     22 (1.3)

The most obvious differences between the Midlands and the North can be observed for past possession and past obligation, both of which are much more frequent in the North than in the Midlands. The search for got yielded a set of different expressions which fall into the broad classes of lexical, possession, and obligational uses. “Lexical” here serves as a label for any use which cannot be classified as possession or obli- gation. There are prototypically lexical examples such as past, present perfect and past perfect uses of got with the meanings ‘receive’ and ‘acquire’. Prototypi- cally lexical is also the use of got in expressions of movement as illustrated in examples (24) - (26). More grammaticalized uses such as the get-passive as illustrated in (27) - (29) were included in the “lexical” category as well. 30 Monika Edith Schulz

(24) And this ’orse got into the sump and my father got down there and put some rope round ’im. (Midlands, Shropshire, SAL_035)
(25) If I’d ’ve got to Trent bridge only a year sooner that’d never have gone that wouldn’t, no. (Midlands, Nottinghamshire, NTT_003)
(26) He’d walked a considerable distance and he opened his eyes and he found he’d turned round in his sleep and got exactly back to Beckbury. (Midlands, Shropshire, SAL_034)
(27) He was, he got blacklisted at Clifton they couldn’t stand him, ’cause he was one of them sort. (Midlands, Nottinghamshire, NTT_002)
(28) Of course they’ve got buried over the passing of the years, you know, and there was part of the old canal landing. (Midlands, Shropshire, SAL_023)
(29) I expect he’d slipped off the duck boards and got sucked up, or he’d got killed, he never got down. (Midlands, Shropshire, SAL_025)

A detailed overview of those uses of get and got which do not signal possession or obligation can be found in Gronemeyer (1998). Hundt (2001) discusses the rise of the get-passive and its relative frequency in relation to the other uses of get and got.

It is sometimes difficult to distinguish between instances of present perfect HAVE got(ten) ‘have acquired/received’ carrying the implicature ‘stative possession’ and instances of possessive HAVE GOT. Full semanticization of ‘stative possession’ can be postulated as soon as there are examples of inalienable possession, which by default rule out the interpretation ‘have acquired/received’ (cf. Schulz 2012). The same situation holds for past perfect HAD got(ten) and past possession HAD GOT.

Full semanticization of past possessive HAD GOT in the Midlands data is indicated by examples of abstract, inanimate and inalienable possession as illustrated in examples (30)-(35) below.

(30) You know, if you, if you’d got plenty of time, they’d have you on to get it ready yourself. (Midlands, Shropshire, SAL_011)
(31) This chap Bill Griffiths came in and asked old Fred if he’d got any ideas, because he’d been painting and that all his lifetime. (Midlands, Shropshire, SAL_039)
(32) The Stoney Bridge was a proper stone bridge, it had got stone walls, big stoney walls and all on. (Midlands, Shropshire, SAL_009)
(33) Lots of cottages had got these old American organs, some had got harmoniums, one or two had got pianos. (Midlands, Shropshire, SAL_023)
(34) And then there was another little chap called Jackson, the strange thing, on one hand he’d got four fingers and a thumb, but his one finger was so small, he must have had three parts of it off. (Midlands, Shropshire, SAL_023)
(35) There was a major he’d only got one eye, ...he had us all lined up after we’d been riding around. (Midlands, Nottinghamshire, SAL_002)

The examples provided above indicate that past possession HAD GOT is fully established in the Midlands data. While past perfect HAD got(ten) cannot be established with any certainty as the source expression for HAD GOT, quite a few examples are ambiguous between a past perfect ‘had acquired/received’ and a past possession reading. Examples (36) and (37) are clear examples of past perfect HAD got(ten) in the sense of ‘had acquired’ and ‘had received’ - the focus is very clearly on the acquisition of the shoes and the reception of a treat, respectively.

(36) All she’d payed for the shoes was about two and eightpence, when she’s finished with the man; she used to keep bartering him down for a halfpenny and another halfpenny until she’d got ’em for about two and eight. (Midlands, Shropshire, SAL_017) (37) Well, that was when the carnival was on and all that, and the regatta and such things as that. They used to have all the lighted boats on the river, all lit up in lanterns and things and it looked beautiful, all on the river at night, all in the dark ...It was grand in them days, anything like that and we thought we’d got a treat, you know. (Midlands, Shropshire, SAL_017)

Things are not as clear-cut, however, in examples (38)-(40) below. (38) can be interpreted both as a comment on the child having acquired a big bag of marbles or as a comment on him coming home in the possession of a bag full of marbles. Similarly, (39) is ambiguous between a reading which stresses the fact that the animals all have names and a reading which highlights the fact that the animals received their names from a lady who is being discussed as a noted character in the village.

(38) He was, he was more like a girl than a lad and I made him this here and he went to school with it and when he come home he’d got a great big bag of marbles. (Midlands, Shropshire, SAL_016)
(39) There wasn’t a chair you could sit on - nearly every one was occupied by a cat or kittens. Now that was the atmosphere of that one-roomed cottage, and the washhouse was a bit of a lean-to next to it. Her was a most happy person, and all these animals had got a name and ’er seemed to worship them. (Midlands, Shropshire, SAL_023)
(40) Well, the youth as was getting the hammer slack up at the back. He used the iron barrow and a long iron tail shovel, and he used to get it up and shove it in the barrow and when he’d got about half a barrow load he’d take it to the furnace. (Midlands, Shropshire, SAL_015)

(40), finally, is ambiguous between a reading which focuses on the process of the young man filling half the barrow with slack and a reading which focuses on the result of him having filled the barrow. These types of ambiguous examples were not counted as instances of past possession but as instances of a lexical use of past perfect HAD got(ten).

Another source of ambiguity is found in instances of HAD GOT where HAD is elided, as in (41) and (42). These examples are ambiguous between a past possession reading and an eventive simple past reading.

(41) One of the grand sounds at night was you’d hear the night men come to empty them toilets. Now, how they did it - they got a bowl dish on a long pole and scooped the sediment out of this hole and down the bottom of the garden stood the cart, and there they’d put all the refuse out of the toilets into that cart. (Midlands, Shropshire, SAL_023)
(42) You could just go and get it and saw it up, eh, and eh we had the the grates you see then was eh over at one end and the other was a boiler you see you got hot water all day you see and the oven it was far better than these gas ovens today you know. (Midlands, Nottinghamshire, NTT_005)

(41) is ambiguous between an eventive reading in which the night men fetched a bowl dish on a pole to empty the toilets and a stative possessive reading which simply indicates that the night men were in the possession of such an instrument. Similarly, (42) is ambiguous between an eventive reading, which frames the boiler as the distributor and the speaker as a recipient of hot water, and a stative possessive meaning which focuses on the fact that hot water was generally available for use all day. Again, these types of ambiguous examples were not counted as instances of past possession but as instances of a lexical use of past tense got.

Only examples like (43) and (44), where an eventive reading is not available or clearly secondary, were included as instances of past possession. (43) is an instance of inalienable possession which describes the component parts of a certain type of house and does not admit an eventive interpretation. In (44) a possessive reading that foregrounds what the speaker’s mother had at her disposal in terms of household remedies is certainly stronger than any eventive reading focusing on her actually procuring the remedy at the event of a possible illness.

(43) So I should say at some time or other, they [the houses] must have had some hoisting gear to have wound sack bags up. That was the evidence of that, and the other house had got just the same, and they got doors at the bottom and the remains of some steps which led down to the canal when I was a lad. (Midlands, Shropshire, SAL_023) (44) No, we never had no doctor, mi mother I ’ll tell you what she got – glycerine, bottled glycerine, and licorice powder and if that didn’t cure us well..., we’d had it. (Midlands, Nottinghamshire, NTT_005)

In sum, only very clear and unambiguous examples were classified as past possessive HAD GOT. (45)-(49) illustrate a selection of those clear examples, including the subcategory of locative HAD GOT with a strong possessive meaning component in (48) and (49). Negated instances of past possession as illustrated in (50) were included as well. Unlike HADposs, which has both direct negation and negation with do-support available to it, HAD GOT only negates directly. Double negation is available though, as illustrated in (51).

(45) And, I think the band consisted of a drum, and one bloke had got a trumpet I think if I remember right it it looked big enough for me to sleep in as a kid. (Midlands, Nottinghamshire, NTT_016)
(46) But sometimes if they were loading two or three boats - that’d probably be nine wagons of sand you see, they’d put it into the boats there. Because they’d got a crane, where they could pick a carriage up and take it onto the boat. (Midlands, Shropshire, SAL_009)
(47) Well, I was out of, out of work, and mi eldest brother said, Why don’t us, why don’t you start on your own? So I says, Well, I says, I ain’t got enough capital to carry on on mi own. I says, I’ve only one, (unclear) I didn’t tell him what I can (/unclear), but I’d only got a hundred pounds. (Midlands, Nottinghamshire, NTT_001)
(48) The wheels had got them flats on them and they were big iron wheels – this height! The reason for the flats was for when they were coming downhill they’d got a locker on them, else they’d over-run the horse. (Midlands, Shropshire, SAL_033)
(49) And we’d got plenty of patches on our trousers which is fashionable today, but we were ashamed of them in that day. (Midlands, Shropshire, SAL_039)
(50) They hadn’t got the money to spend, only those as was in work, them as was working at Celanese or Players and places like that, they had money and they’d come and have boats out at weekend. (Midlands, Nottinghamshire, NTT_003)
(51) We couldn’t afford, we hadn’t got not one to go in. (Midlands, Shropshire, SAL_018)

The identification of past obligation HAD GOT TO is less problematic. The string had/’d got to receives two possible interpretations alongside the past obligation one. The first possibility is a combination of past perfect HAD got(ten) with the directional preposition to as illustrated in (52), which is the only instance of this use in the dialect data. The second possibility allows for a possessive reading, as illustrated in (53).

(52) He said, right. I’m off. And he bounced out, he said, and I was goin’ downstairs and I’d just got to t’ bottom o’ t’ stairs, and the girl come runnin’ after me. (North, Lancashire, LAN_012)

(53) “Well now, master Giles, what is it you have got to say to me? If I can do you any service, this company will give you leave to speak.” (Isaac Bickerstaff, The Maid of the Mill, 1765)

There are seven instances in the Midlands data of this type of context, illustrated in (54)-(60). A possessive interpretation is possible in (56)-(60). It seems rather forced in (54) and (55) though, where there is no sense in which the barber could already possess the lather he is about to make or the pilot could possess the quarry he is about to drop his bombs into.

(54) Now, a feature of the barber’s shop was that which adorned the walls - nearly every man had his own cup and his own soap where the barber knew what sort of lather he’d got to make. (Midlands, Shropshire, SAL_023)
(55) He dropped ’em [the bombs] in there - it had been working and he thought that was the quarry he’d got to drop it in and he dropped them there. (Midlands, Shropshire, SAL_017)
(56) There was nothing, no, no amusement of no sort. All as we had got to think about was having the stick when we got in [into the school] again afore we come home. (Midlands, Shropshire, SAL_018)
(57) Well, a raker meant you put a big lump of on, on the back of that put ashes and little bits of coal and slack, and if you didna’ – well, that meant you’d have no hot water next morning for use, and when your mother got up the first job her’d got to do was to make sure the fire was in. (Midlands, Shropshire, SAL_023)
(58) But in any case I was never interested much in my school days, more’s the pity. I thought about work – what I’d got to do when I got home. (Midlands, Shropshire, SAL_018)
(59) And then it were very eh, more or less up to the teacher then the class your teacher would tell you what you got to do, it was either reading or writing or arithmetic or geography. (Midlands, Leicestershire, LEI_001)
(60) They just took me into the pit and showed me the two men that I’d got to tram for, and I’d got to give the one his tub to fill. (Midlands, Shropshire, SAL_038)

Even in (56)-(60), which allow a possessive interpretation of HAD GOT, the sense of obligation is fairly strong. In (56) the speaker describes the children’s compulsive dread of being beaten in school. (57), (58) and (59) all involve the infinitival complement to do and describe the past necessity for the speaker or a third party to perform a certain task or job assigned to them.

In (60), finally, ’d got may be interpreted in the sense of ‘had been assigned’ in connection with a to-infinitive or in terms of an obligation to perform a certain job for the benefit of the two quarry workers. These five examples were excluded from the count of both past possession and past obligation HAD GOT TO and assigned their own label “critical context”.

In sum, only clear-cut instances such as those illustrated in (61) and (62) were included.

(61) They got the pay they paid them, the disabled, for six months, and I’d got to keep them six months half pay, I had to pay them half pay and the government paid them the other half pay. (Midlands, Nottinghamshire, NTT_001) (62) Now we had one called Arthur Foulkes, well he rampaged up and down that chapel, crying, shouting, screaming, calling us all out to be saved like he was saved, until at last you’d got to physically stop him, he was so belligerent. (Midlands, Shropshire, SAL_023)

Unlike HAD TO, which has both direct negation and negation with do-support available to it, HAD GOT TO only negates directly. Negated instances of HAD GOT TO do share scope ambiguities with directly negated instances of HAD TO though, as illustrated in (63) and (64). These examples will be discussed together with directly negated instances of HAD TO in section 7.

(63) Forty acres of limestone had been worked there. But it was easy to get it as it was on top of the ground. They hadn’t got to pull it out of the earth. (Midlands, Shropshire, SAL_033)
(64) I were the machine gunner, sergeant machine gun. And we done a lot of overhead fighting you see, so you got to be able to read a map you see and and eh and and what and set your gun with a clinometer to get your range and everything so you hadn’t got to make any errors you know. (Midlands, Nottinghamshire, NTT_005)

Summing up, lexical instances of got in the data occur as got in the simple past as in (65), as have/has/’ve/’s got in the present perfect as in (66) and as had/’d/Ø got in the past perfect as in (67). Possessive instances occur as have/has/’ve/’s/Ø got in the simple present as in (68) and as had/’d/Ø got in the simple past as in (69). Obligational instances, finally, occur as have/has/’ve/’s/Ø got to in the present tense as in (70) and as had/’d/Ø got to in the past tense as in (71).

(65) And this ’orse got into the sump and my father got down there and put some rope round ’im. (Midlands, Shropshire, SAL_035)
(66) If I’d ’ve got to Trent bridge only a year sooner that’d never have gone that wouldn’t, no. (Midlands, Nottinghamshire, NTT_003)
(67) He’d walked a considerable distance and he opened his eyes and he found he’d turned round in his sleep and got exactly back to Beckbury. (Midlands, Shropshire, SAL_034)
(68) You would get the whole family and let me tell you the photographs in those days, after a hundred years have proved excellent quality. I’ve got some to prove that. (Midlands, Shropshire, SAL_023)
(69) Yes, we had to resort to the candle when we hadn’t got a penny, and that’d be a light. (Midlands, Shropshire, SAL_017)
(70) Well, on this one occasion, I left at six-fifty-three, and a fellow asked me, he said, Do your best, he said, We’ve got to get down as soon as we can, right time, else we shall lose time. (Midlands, Shropshire, SAL_011)
(71) They got the pay they paid them, the disabled, for six months, and I’d got to keep them six months half pay, I had to pay them half pay and the government paid them the other half pay. (Midlands, Nottinghamshire, NTT_001)

The distribution of got across these different uses in the Midlands and the North is illustrated in Table 5, ordered according to their frequency of occurrence in the Midlands. Raw frequencies are reported first with normalized frequencies per 10,000 words in brackets. Again, differences between the dialect areas are most conspicuous with regard to past possession and past obligation marking.

Table 5. Distribution of got across different uses in the Midlands and the North, with raw frequencies and frequencies normalized per 10,000 words in brackets

Use                         Form                                       Midlands      North
past tense lexical          got                                        553 (30.2)    568 (33.3)
past possession             had/’d/Ø got                               202 (11.0)    24 (1.4)
past obligation             had/’d/Ø got to                            123 (6.7)     4 (0.2)
present possession          have/has/’ve/’s/Ø got                      91 (5.0)      53 (3.1)
lexical past perfect        had/’d/Ø got                               66 (3.6)      23 (1.3)
present obligation          have/has/’ve/’s/Ø got to                   29 (1.6)      21 (1.2)
lexical present perfect     have/has/’ve/’s got                        16 (0.9)      19 (1.1)
bridging context            have/has/’ve/’s got + to-infinitive        5 (0.3)       0 (0)

5. Past possession and past obligation in the Midlands and the North

Past possession and past obligation are cut up differently in the Midlands and the North. Figures 1 and 2 present the distribution of HADposs, HAD GOT, HAD TO and HAD GOT TO in the two dialect areas, drawing on the frequency counts presented in Tables 4 and 5 in section 4.

In the Midlands there is stable variation between HADposs and HAD GOT for past possession and between HAD TO and HAD GOT TO for past obligation, while in the North HAD GOT and HAD GOT TO are marginalized in comparison to HADposs and HAD TO.

Figure 1. Distribution of possessive HAD and HAD GOT across past possession contexts in the North and the Midlands

Figure 2. Distribution of HAD TO and HAD GOT TO across past obligation contexts in the North and the Midlands

The ratios of HAD and HAD GOT as well as HAD TO and HAD GOT TO are consistent across counties in the Midlands, as illustrated in Figures 3 and 4. Similarly, the predominance of HAD and HAD TO is consistent across the counties in the North, as illustrated in Figures 5 and 6. This indicates that the difference in past possession and past obligation marking is a supra-local feature which holds for larger dialect areas rather than for individual counties.

Figure 3. Distribution of possessive HAD and HAD GOT across past possession contexts in the Midlands counties

Figure 4. Distribution of HAD TO and HAD GOT TO across past obligation contexts in the Midlands counties

Figure 5. Distribution of possessive HAD and HAD GOT across past possession contexts in the North counties

Figure 6. Distribution of HAD TO and HAD GOT TO across past obligation contexts in the North counties

The different patternings of past possession and past obligation markers summarized in Tables 6 and 7 can now be interpreted drawing on the notions of layering and the relationship between variation and degree of grammaticalization as outlined in section 3. The system of obligation marking can be argued to be less grammaticalized in the North than in the Midlands, where the rise of present tense HAVE GOT TO, dubbed a “success story” of grammaticalization by Krug (2000: 63), is replicated in the past tense.

Table 6. Past possession and past obligation in the Midlands

Past possession      Past obligation
HADposs (69 %)       HAD TO (72 %)
HAD GOT (31 %)       HAD GOT TO (28 %)

Table 7. Past possession and past obligation in the North

Past possession      Past obligation
HADposs (97 %)       HAD TO (99 %)
HAD GOT (3 %)        HAD GOT TO (1 %)
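The percentages in Tables 6 and 7 can be recovered from the raw counts in Tables 4 and 5 by treating each functional domain as a two-way choice between the variant without GOT and the variant with GOT. The following sketch shows that calculation; the names `RAW`, `DOMAINS` and `domain_shares` are illustrative and not part of the original study.

```python
# Raw counts copied from Table 4 (had, had to) and Table 5 (had got, had got to).
RAW = {
    "Midlands": {"HADposs": 449, "HAD GOT": 202, "HAD TO": 324, "HAD GOT TO": 123},
    "North": {"HADposs": 797, "HAD GOT": 24, "HAD TO": 471, "HAD GOT TO": 4},
}

# Each functional domain is modelled as a choice between two competing markers.
DOMAINS = {
    "past possession": ("HADposs", "HAD GOT"),
    "past obligation": ("HAD TO", "HAD GOT TO"),
}


def domain_shares(area: str) -> dict:
    """Return the percentage share of each marker within its domain."""
    shares = {}
    for markers in DOMAINS.values():
        total = sum(RAW[area][m] for m in markers)
        for m in markers:
            shares[m] = round(100 * RAW[area][m] / total)
    return shares


for area in RAW:
    print(area, domain_shares(area))
```

Computed this way, the Midlands show roughly a 70:30 split in both domains, whereas in the North the variants with GOT stay at or below 3 %.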

In terms of morphosyntactic variation and its relation to high- and low-codability experiences in the sense of Croft (2010), the Midlands have grammaticalized more of the “first-order variation” in the notional domains of both past possession and past obligation. The near-absence of HAD GOT TO in the North is consistent with the findings of Tagliamonte and Smith (2006), who work with data from Scotland, Northern Ireland and the North of England. As no comparative studies on present or past tense obligation in contemporary Midlands dialects have been conducted so far, it is not possible to establish with any certainty whether HAD GOT TO has dropped out of use or not. This will be left for further research.

6. Past possession and negation

As pointed out in the discussion of examples (13) and (15) above, repeated here as (72) and (73), HADposs has both direct negation and negation with do-support available to it. HAD GOT, on the other hand, only has direct negation available, as illustrated in (50) above, repeated here as (74). Table 8 provides an overview of negative past possession contexts in the dialect data.

(72) I can remember up at Colwick...where they couldn’t go to school because they hadn’t no boots. (Midlands, Nottinghamshire, NTT_008)
(73) It was in the air, the second world war, I mean it was all bayonet fighting and all that you see, but eh, I didn’t have no rifle or bayonet. (Midlands, Nottinghamshire, NTT_008)
(74) They hadn’t got the money to spend, only those as was in work, them as was working at Celanese or Players and places like that, they had money and they’d come and have boats out at weekend. (Midlands, Nottinghamshire, NTT_003)

Negation of simple past stative possession in the Midlands mirrors the trend observed for the negation of present tense stative possession in British English towards haven’t got rather than didn’t have or haven’t. Biber et al. (1999: 161-162) find haven’t to be stable only in British fiction and classify it as a “conservative choice”. In the North, however, where hadn’t got occurs only once, the “conservative” marker hadn’t is the default choice.

Table 8. Distribution of didn’t have, hadn’t and hadn’t got in the Midlands and the North

Negation type                        Form           Midlands     North
negation with do-support HADposs     didn’t have    7 (0.4)      7 (0.4)
direct negation HADposs              hadn’t         6 (0.3)      32 (1.9)
direct negation HAD GOT              hadn’t got     23 (1.3)     1 (0.1)
ALL                                                 36 (2.0)     40 (2.3)
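The notion of a "default" negator can be made explicit by asking which form accounts for the largest share of negated past possession contexts in each area. The sketch below does this for the raw counts in Table 8; the helper name `default_negator` is an assumption introduced for illustration only.

```python
# Raw counts of negated past possession, copied from Table 8.
TABLE_8 = {
    "Midlands": {"didn't have": 7, "hadn't": 6, "hadn't got": 23},
    "North": {"didn't have": 7, "hadn't": 32, "hadn't got": 1},
}


def default_negator(counts: dict) -> tuple:
    """Return the most frequent form and its share (in %) of all negated contexts."""
    total = sum(counts.values())
    form = max(counts, key=counts.get)
    return form, round(100 * counts[form] / total)


summary = {area: default_negator(counts) for area, counts in TABLE_8.items()}
for area, (form, pct) in summary.items():
    print(f"{area}: {form} ({pct} % of negated contexts)")
```

With fewer than fifty tokens per area, these shares are best read as indicating a direction of preference rather than a precise rate.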

While didn’t have can be found in both the Midlands and the North, it is not the default marker of negated past possession in either of the two dialect areas. This situation provides support for the hypothesis put forward in Biber et al. (1999: 161-162) that the negation of stative possession HAD GOT with do-support is a feature of American rather than British English.

7. Past obligation and negation

The forms didn’t have to, hadn’t to and hadn’t got to are very rare but exhibit variation with respect to do-support and scope. Variation between direct negation and negation with do-support can be found for HAD TO, as illustrated in (75) and (76). Variation between wide and narrow scope is illustrated in (76) and (77) for hadn’t to and in (78) and (79) for hadn’t got to.

(75) They didn’t have to be bedded in like the biscuit kiln, they’d stand in the saggers. (Midlands, Shropshire, SAL_016)
(76) Q: When you had to go to these camps...did the firm you worked for have to keep your job open for you? A: Well they used to do but they hadn’t to do. There was no such a thing as them having to do in them days. (North, Lancashire, LAN_020)

(77) Dad used to go out and pull the tray out and take all the used carbide out, the lamp, take it away, and if there was little odd pieces left, he’d put them back, before he put any new in, you, but of course, you hadn’t to put too much in, in the beginning, as it got all wet, the damp on the top, it wouldn’t it wouldn’t allow the gas to come from the underneath. (North, Yorkshire, YKS_006)
(78) They didn’t work it up from the floor level. They’d run up these stubs and loose them back and it’d be built that high where they run into, and work it up at that level. They hadn’t got to pick it up, and they used to lever it onto these carriages with a bar and break it up when they got down to the kilns. (Midlands, Shropshire, SAL_033)
(79) But if you made a complaint about anything like after you were discharged you eh you got sent home, eh got sent back to your unit, eh done you out of any leave at all. You hadn’t got to complain. (Midlands, Nottinghamshire, NTT_005)

Tables 9 and 10 show the distribution of markers in the Midlands and the North for wide and narrow scope negation respectively. The extremely low frequency of the items in question allows only a very tentative interpretation. In both the Midlands and the North, didn’t have to signals wide scope negation only.

Table 9. Wide scope negation of HAD TO and HAD GOT TO in the Midlands and the North

Form               Reading                   Midlands    North
hadn’t to          ‘not necessary that’      1           1
didn’t have to     ‘not necessary that’      8           3
hadn’t got to      ‘not necessary that’      6           0
ALL                                          15          4

In the North, where hadn’t got to is completely absent, hadn’t to has developed into a marker of predominantly narrow scope negation with 4 out of 5 instances receiving a narrow scope interpretation.

Table 10. Narrow scope interpretations of hadn’t to and hadn’t got to

Form              Reading                   Midlands    North
hadn’t to         ‘necessary that not’      0           4
hadn’t got to     ‘necessary that not’      2           0
ALL                                         2           4
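Tables 9 and 10 can be read together by computing, for each form and area, the proportion of tokens that receive a narrow scope reading. The sketch below combines the two tables in this way; the function name `narrow_share` is hypothetical, and a form with no tokens in an area yields `None` rather than a proportion.

```python
# Wide scope readings ('not necessary that'), copied from Table 9.
WIDE = {
    "Midlands": {"hadn't to": 1, "didn't have to": 8, "hadn't got to": 6},
    "North": {"hadn't to": 1, "didn't have to": 3, "hadn't got to": 0},
}

# Narrow scope readings ('necessary that not'), copied from Table 10.
NARROW = {
    "Midlands": {"hadn't to": 0, "hadn't got to": 2},
    "North": {"hadn't to": 4, "hadn't got to": 0},
}


def narrow_share(area: str, form: str):
    """Return the proportion of narrow scope readings, or None if the form is unattested."""
    wide = WIDE[area].get(form, 0)
    narrow = NARROW[area].get(form, 0)
    total = wide + narrow
    return None if total == 0 else narrow / total


for area in WIDE:
    for form in WIDE[area]:
        print(area, form, narrow_share(area, form))
```

The proportions rest on a handful of tokens per cell, so they illustrate the direction of the preference without supporting any statistical claim.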

In the Midlands, hadn’t to is a marginal marker with just one instance of occurrence. Hadn’t got to is used variably as a marker of either wide or narrow scope negation with a clear preference for wide scope negation.

The complete absence of hadn’t got to from the North and its relative strength in the Midlands are in line with the results for affirmative past obligation and also negated past possession. The patterning of markers here indicates that HAD GOT TO has pushed into negative contexts as well, a situation which has been observed for present tense HAVE GOT TO in British English conversation in general (Biber et al. 1999: 163).

8. Present tense obligation and the trend towards monosemy

This section will present an overview of MUST, HAVE TO and HAVE GOT TO in nonsyntactic contexts, that is those contexts in which all three markers are free to appear (cf. Myhill 1996). Nonsyntactic contexts have been identified as present tense affirmative contexts where there is no other auxiliary present. Combinations with other auxiliaries like will, would, could or used to are highly frequent in the dialect data and account for over 90% of all the uses of the form have to. Negation is very rare though. There is one instance of don’t have to and no negated form of HAVE GOT TO.

A brief review of previous findings in this area of research will be followed by a discussion of the distribution of MUST, HAVE TO and HAVE GOT TO across present tense affirmative contexts in the dialect data.

Most studies on the deontic and epistemic uses of MUST show a trend towards monosemy. While deontic uses are on the decline, the epistemic use is holding its ground (Palmer 1979; Coates 1983; Myhill 1995; Biber et al. 1999: 495; Trousdale 2003; Tagliamonte 2004; Tagliamonte and Smith 2006; Collins 2009). Notable exceptions are presented in Leech (2003) and Close and Aarts (2008). In a comparative study of LOB and FLOB for the written and the SEU and ICE-GB for the spoken register, Leech (2003: 233-234) finds both epistemic and root uses of MUST in decline. Similarly, a trend of MUST towards monosemy is challenged by Close and Aarts (2008), who investigate the use of MUST, HAVE TO and HAVE GOT TO in the DCPSE. In a comparison of spoken data from the London-Lund Corpus and from ICE-GB they find that MUST declines in overall frequency but still holds its ground as a marker of both obligation and epistemic necessity.

The study by Close and Aarts (2008) is particularly interesting here, as the London-Lund Corpus speakers are roughly the same age group as the FRED speakers. A comparison of the distribution of MUST, HAVE TO and HAVE GOT TO in the London-Lund component of the DCPSE and in FRED reveals interesting differences. Present tense obligation contexts are naturally rare in FRED but nevertheless allow a tentative discussion and evaluation. Table 11 shows the distribution of MUST, HAVE TO and HAVE GOT TO in the Midlands, the North and the London-Lund component of the DCPSE.

Table 11. Distribution of present tense obligation markers MUST, HAVE TO and HAVE GOT TO in the Midlands, the North and the London-Lund component of the DCPSE.

                Midlands       North          London-Lund
HAVE TO         11 (23 %)      15 (31 %)      188 (23 %)
HAVE GOT TO     29 (62 %)      21 (43 %)      187 (23 %)
MUST            8 (17 %)       13 (26 %)      427 (54 %)
ALL             48 (100 %)     49 (100 %)     802 (100 %)
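One way to see how the ordering of markers differs between the dialect data and the London-Lund data is to rank the three markers by raw frequency within each column of Table 11. The sketch below does so and also recomputes each marker's share of its column total; the helper names are illustrative, and shares recomputed by simple rounding need not agree with the printed percentages to the last digit.

```python
# Raw counts copied from Table 11.
TABLE_11 = {
    "Midlands": {"HAVE TO": 11, "HAVE GOT TO": 29, "MUST": 8},
    "North": {"HAVE TO": 15, "HAVE GOT TO": 21, "MUST": 13},
    "London-Lund": {"HAVE TO": 188, "HAVE GOT TO": 187, "MUST": 427},
}


def ranking(counts: dict) -> list:
    """Return the markers ordered from most to least frequent."""
    return sorted(counts, key=counts.get, reverse=True)


def shares(counts: dict) -> dict:
    """Return each marker's share of the column total in percent (one decimal)."""
    total = sum(counts.values())
    return {marker: round(100 * n / total, 1) for marker, n in counts.items()}


for variety, counts in TABLE_11.items():
    print(variety, ranking(counts), shares(counts))
```

In both dialect areas the ranking runs HAVE GOT TO, HAVE TO, MUST, while in the London-Lund column MUST comes first and HAVE GOT TO last.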

While the relative frequencies of HAVE TO are roughly equal in the dialect and the London-Lund data, the distribution of MUST and HAVE GOT TO in the London-Lund data is turned upside down in the dialect data. MUST, the default obligation marker in the London-Lund material, has been ousted by HAVE GOT TO in the dialect data, where MUST is the least frequent marker of present obligation by far. The raw frequencies are very low and thus have to be interpreted with extreme caution. The overall picture nevertheless shows MUST on its way out of the system of present tense obligation and towards monosemy in the dialect data. In the Midlands HAVE GOT TO accounts for the vast majority of present tense obligation contexts. The distribution in the North is slightly less radical with a lower rate of HAVE GOT TO and a slightly higher rate of MUST.

As a marker of epistemic necessity, however, MUST is unchallenged in the dialect data as there is not a single instance of either epistemic HAVE TO or HAVE GOT TO. Some degree of competition between MUST, HAVE TO and HAVE GOT TO in the area of epistemic necessity has been reported for contemporary varieties of English in Tyneside (Trousdale 2003: 277) and Toronto (Tagliamonte and D’Arcy 2007), while MUST is the sole marker of epistemic necessity in York (Tagliamonte 2004). Finally, rates of MUST as a marker of epistemic necessity have been shown to vary between 87.5% and 100% across 8 varieties of contemporary Irish and British English (Tagliamonte forthcoming).

Epistemic uses of HAVE TO and HAVE GOT TO developed much later than their deontic uses and have been described as a phenomenon mainly of American English (Coates 1983: 57). The competition of HAVE TO and HAVE GOT TO in the realm of epistemic necessity is thus a fairly recent development in British varieties of English which can be detected in some contemporary varieties but is absent from the traditional dialect material. The trend of MUST towards monosemy in the dialect data is not accompanied by an extension of HAVE TO and HAVE GOT TO into the realm of epistemic necessity.

9. Summary

The main concern of the present chapter was to investigate whether dialectal variation in British English can show us different versions or different stages of the “wholesale reorganization” of the English auxiliary verb system in the subsystems under investigation here (Bolinger 1980: 6). 48 Monika Edith Schulz

The patterning of past possession and past obligation markers does indeed show us different versions of the development of the respective functional domains. There is stable variation in past possession and past obligation marking in the Midlands, where HAD GOT accounts for roughly 30% of all past possession contexts and HAD GOT TO accounts for approximately 30% of all past obligation contexts. In the North, on the other hand, only a very low rate of occurrence of HAD GOT at just 3% of all past possession contexts and an even lower rate of occurrence of HAD GOT TO at just 1% of all past obligation contexts can be observed. The presence of variation between grammatical markers in the Midlands can be interpreted as an indicator of a higher degree of grammaticalization of the functional domains of past possession and past obligation in direct comparison to the North, where possible variation between markers has not been grammaticalized. In the words of Krug (2000: 63), another paragraph is added to the “success story” of HAVE GOT TO in the Midlands, but omitted from the North.

The relative strength of the possessive and obligation markers containing GOT in the Midlands is complemented by negation patterns for past possession and past obligation. The forms hadn’t got and hadn’t got to are virtually absent from the North but figure prominently in the Midlands, where hadn’t got is the default marker for negative past possession and hadn’t got to is roughly on a par with didn’t have to in marking wide scope negation. The preference of traditional dialects for direct negation of present tense possessive HAVEposs is confirmed in the North, where direct negation of HADposs in the form of hadn’t is the default case.

The patterning of past possession and past obligation markers can be argued to provide evidence for the hypothesis that phonology and morphosyntax pattern alike with respect to relic and transition areas.
The North is confirmed as a relic area which preserves an older stage of at least some linguistic subsystems, while the Midlands exhibit more advanced stages of the same subsystems.

References

Beal, Joan. 2004. English dialects in the North of England: Phonology. In: Kortmann and Schneider (2004a), 113–133.
Bhatt, Rajesh. 1997. Obligation and Possession. In: Heidi Harley (ed.), The Proceedings of the MIT Roundtable on Argument Structure and Aspect, MIT Working Papers in Linguistics 32, 21–40. Cambridge, MA: MITWPL.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad, and Edward Finegan. 1999. Longman Grammar of Spoken and Written English. London: Longman.
Bolinger, Dwight. 1980. WANNA and the gradience of auxiliaries. In: Gunter Brettschneider and Christian Lehmann (eds.), Wege zur Universalienforschung. Sprachwissenschaftliche Beiträge zum 60. Geburtstag von Hansjakob Seiler, 292–299. Tübingen: Narr.
Brinton, Laurel. 1991. The origin and development of quasi-modal HAVE TO in English. Paper presented at the Workshop on “The Origin and Development of Verbal Periphrasis” at the 10th International Conference of Historical Linguistics (ICHL) in Amsterdam, August 16, 1991. http://faculty.arts.ubc.ca/lbrinton.
Bybee, Joan and Paul Hopper (eds.). 2001. Frequency and the Emergence of Linguistic Structure. Amsterdam and Philadelphia: John Benjamins.
Bybee, Joan, Revere Perkins, and William Pagliuca. 1994. The Evolution of Grammar: Tense, Aspect and Modality in the Languages of the World. Chicago and London: The University of Chicago Press.
Chafe, Wallace. 1977a. Creativity in verbalization and its implications for the nature of stored knowledge. In: Roy Freedle (ed.), Discourse Production and Comprehension, 41–55. Norwood, NJ: Ablex.
Chafe, Wallace. 1977b. The recall and verbalization of past experience. In: Peter Cole (ed.), Current Issues in Linguistic Theory, 215–246. Bloomington, IN: Indiana University Press.
Close, Jo and Bas Aarts. 2008. Changes in the use of the modals HAVE TO, HAVE GOT TO and MUST. Paper presented at the 15th International Conference on English Historical Linguistics, Munich, August 24–30, 2008.
Coates, Jennifer. 1983. The Semantics of the Modal Auxiliaries. London and Canberra: Croom Helm.
Collins, Peter. 2009. Modals and Quasi-modals in English. Amsterdam: Rodopi.
Croft, William. 2000. Explaining Language Change: An Evolutionary Approach. Harlow: Longman.
Croft, William. 2010. The origins of grammaticalization in the verbalization of experience. Linguistics 48: 1–48.
Crowell, Thomas. 1959. Have Got, a Pattern Preserver. American Speech 34: 280–286.
Fischer, Olga. 1994. The development of quasi-auxiliaries in English and changes in word order. Neophilologus 78: 137–164.
Fischer, Olga, Ans van Kemenade, Willem Koopman, and Wim van der Wurff. 2000. Early English Syntax. Cambridge: Cambridge University Press.
van der Gaaf, Willem. 1931. Beon and habban connected with an inflected infinitive. English Studies 13: 176–188.
Gronemeyer, Claire. 1998. On deriving complex polysemy: The grammaticalization of get. English Language and Linguistics 3: 1–39.
Haiman, John. 1994. Ritualization and the development of language. In: William Pagliuca (ed.), Perspectives on Grammaticalization, 3–28. Amsterdam and Philadelphia: John Benjamins.
Heine, Bernd. 1997. Possession. Cognitive Sources, Forces and Grammaticalization. Cambridge: Cambridge University Press.
Heine, Bernd and Tanja Kuteva. 2002. World Lexicon of Grammaticalization. Cambridge: Cambridge University Press.
Hopper, Paul. 1991. On some principles of grammaticalization. In: Elizabeth Traugott and Bernd Heine (eds.), Approaches to Grammaticalization, vol. 1, 17–35. Amsterdam and Philadelphia: John Benjamins.
Hopper, Paul and Elizabeth Traugott. 2003. Grammaticalization. 2nd edn. Cambridge: Cambridge University Press.
Huddleston, Rodney and Geoffrey Pullum. 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.
Hundt, Marianne. 2001. What corpora tell us about the grammaticalization of voice in get-constructions. Studies in Language 25: 49–88.
Jespersen, Otto. 1931. A Modern English Grammar on Historical Principles, vol. 4. London: Allen & Unwin.
Kortmann, Bernd and Susanne Wagner. 2005. The Freiburg English Dialect Project and Corpus. In: Bernd Kortmann, Tanja Herrmann, Lukas Pietsch, and Susanne Wagner (eds.), A Comparative Grammar of British English Dialects: Agreement, Gender, Relative Clauses, 1–20. Berlin and New York: Mouton de Gruyter.
Krug, Manfred. 2000. Emerging English Modals. A Corpus-based Study of Grammaticalization. Berlin and New York: Mouton de Gruyter.
Leech, Geoffrey. 2003. Modality on the move: The English modal auxiliaries 1961–1992. In: Manfred Krug, Frank Palmer, and Roberta Facchinetti (eds.), Modality in contemporary English, 223–240. Berlin and New York: Mouton de Gruyter.
Lehmann, Christian. 1982. Thoughts on Grammaticalization: A Programmatic Sketch, Arbeiten des Kölner Universalienprojekts, 48, vol. 1. Köln: Institut für Sprachwissenschaft der Universität.
Meillet, Antoine. 1912. L’évolution des formes grammaticales. Scientia 12: 384–400.
Myhill, John. 1995. Change and continuity in the functions of the American English modals. Linguistics 33: 157–211.
Myhill, John. 1996. The development of the strong obligation system in American English. American Speech 71: 337–388.
Palmer, Frank. 1979. Modality and the English Modals. London: Longman.
Quirk, Randolph, Geoffrey Leech, Sidney Greenbaum, and Jan Svartvik. 1985. A Comprehensive Grammar of the English Language. London: Longman.
Schulz, Monika Edith. 2011. Morphosyntactic variation in British English dialects. Evidence from possession, obligation and past habituality. Ph.D. thesis, Albert-Ludwigs-Universität Freiburg.
Schulz, Monika Edith. 2012. The development of possessive HAVE GOT. The path (not) taken. Journal of Historical Pragmatics.
Tagliamonte, Sali. 2003. ‘Every place has a different toll’: Determinants of grammatical variation in cross-variety perspective. In: Günther Rohdenburg and Britta Mondorf (eds.), Determinants of Grammatical Variation in English, 531–554. Berlin and New York: Mouton de Gruyter.
Tagliamonte, Sali. 2004. Have to, gotta, must. In: Hans Lindquist and Christian Mair (eds.), Corpus approaches to grammaticalization, 33–55. Amsterdam and Philadelphia: John Benjamins.
Tagliamonte, Sali. forthcoming. Roots of English: Exploring the History of Dialects. Cambridge: Cambridge University Press.
Tagliamonte, Sali and Alexandra D’Arcy. 2007. The modals of obligation and necessity in Canadian perspective. English World-Wide 28: 47–87.
Tagliamonte, Sali and Jennifer Smith. 2006. Layering, competition and a twist of fate. Deontic modality in dialects of English. Diachronica 23: 341–380.
Traugott, Elizabeth. 2003. Constructions in grammaticalization. In: Brian D. Joseph and Richard D. Janda (eds.), The Handbook of Historical Linguistics, 624–647. Oxford: Blackwell.
Trousdale, Graeme. 2003. Simplification and redistribution: An account of modal verb usage in Tyneside English. English World-Wide 24: 271–284.
Trudgill, Peter. 1999. The Dialects of England. Oxford: Blackwell.
Visser, Theodorus Fredericus. 1969. An Historical Syntax of the English Language, vol. 3.1. Leiden: Brill.
Visser, Theodorus Fredericus. 1973. An Historical Syntax of the English Language, vol. 3.2. Leiden: Brill.
Whitney, William Dwight. 1889. The Century Dictionary: An Encyclopedic Lexicon of the English Language, vol. 3. New York: Century Company.

Personal pronouns

Nuria Hernández

Personal pronouns such as I, me, myself, she, her, herself, . . . are one of the most widely discussed lexical categories in present-day linguistics. Their use in spontaneous conversation continues to attract attention for transgressing the pre-determined grammatical boundaries of Standard English. This study is an empirical investigation of the functional diversity of personal pronouns in 20th century spoken British English based on FRED, the Freiburg Corpus of English Dialects.

1. Introduction

1.1. Aims and objectives

The main objectives in this study are to offer a comprehensive account of personal pronoun behaviour from a variationist perspective, focusing on non-standard uses; to identify the prevailing distributional tendencies and patterns of variation; and to identify and account for the most influential determinants of variation in individual phenomena as well as the pronominal paradigm as a whole.

In order to achieve these aims, a variety of phenomena will be analysed which revolve around the interchangeability of pronominal expressions, always against the backdrop of a prescriptive written standard. It will be shown that, while the individual phenomena differ in their geographical distribution and frequency, they all point in the same direction: pronoun variation is a regular feature of spontaneous speech.

The following examples give a foretaste of the phenomena to be discussed. In the analysis, these phenomena fall under three different headings: variation in number and person, variation in gender, and variation in case (the individual county codes are listed in Table 1 on page 7).1

(1) He used to just open his coat out, Here ’s a couple of rabbits, Fred, give us a couple of pints. (singular us, FRED, YKS_006)

(2) I used to walk to Woodstock at many a time with eh Woodstock when they had Woodstock gloves, innit, yes and made them. (generic q-tag, FRED, OXF_001)

(3) This old toilet, he was still there when we were doing the garden. (gendered pronoun, FRED, SOM_027)

(4) I did give she a ’and and she did give I a ’and and we did ’elp one another. (pronoun exchange: she and I in object function, FRED, WIL_011)

(5) Mind you I was a bit on the safe side, I put a rope round me just, to tension up . . . (simplex pronoun used as reflexive, FRED, YKS_001)

(6) My brother Jack, he was the eldest. And there was miself, so . . . (independent self-form, FRED, NBL_007)

(7) And when you ’d done that perhaps you ’d have three in the night to clean between you, and you ’d settle the work up between yourselves using cleaning oil . . . (prepositional compl. you and yourselves in same sentence, FRED, SAL_011)

(8) . . . but if it was a job it were needed it were, was carried out, we, us bairns used to carry it out to them . . . (post-qualified us in subject NP, FRED, NBL_006)

The approach underlying the results in this chapter is descriptive, data-based and not restricted to any particular theoretical framework. All of the empirical observations will be supported by quantitative and qualitative analyses, including the revision of linguistic principles and hierarchies proposed in the literature. Also included are typological parallels and historical precursors of the phenomena under investigation. This additional information will not only help position our observations on the language timeline, it will also help to distinguish traditional dialect phenomena from linguistic innovations.

All analyses are based on a special subcorpus compiled from the England component of FRED as shown in Table 1 (all references to corpus data will refer to this selection). The large size of this database allows for an analysis of both frequent and rare features relating to pronoun variation. In order to facilitate the detection of regional differences, the England data are divided into four dialect areas: Southeast (22 texts), Southwest (57 texts), Midlands (47 texts) and North (50 texts). These areas roughly correspond to the traditional dialect areas shown in Figure 1. The country’s division into discrete dialect areas is necessary for practical purposes, but it should be kept in mind that this is still a construct used for orientation.

Table 1. FRED subcorpus used for the pronoun study

corpus coverage        FRED subcorpus    FRED entire corpus
words                  ≈ 1.5 m           ≈ 2.5 m
recorded hours         ≈ 180             ≈ 300
texts                  176               372
speakers               210               431
dialect areas          4                 9
counties               17                43
locations              83                163
wordcounts
  Southeast (SE)       302,336           624,431
  Southwest (SW)       406,626           588,931
  Midlands (Mid)       293,219           351,284
  North (N)            425,691           487,281

In the text selection process only those texts were included where information was available on both the county and location of the interview. In addition, only those texts were included where information was available on the birth decade of the main interviewee (i.e., the speaker producing the largest amount of text in the interview), and where the main interviewee was born during the period 1890 to 1920. As a result, the data used for this study has a largely homogeneous speaker profile which is representative of the traditional dialects typically associated with a rural working-class population (cf. Ihalainen 1994: 252). The oldest speaker in the corpus was born in 1877, aged 102 at the time of the interview, and the mean age at recording date is 79. Most speakers in this study spent their life in one particular geographic area, so that their linguistic behaviour can be considered characteristic of that area.2

Figure 1. England’s traditional dialect areas (Trudgill 1999: 34)

Table 2 shows a breakdown of the text production by dialect area and decade of birth (the only speaker born in 1920 is included in the 1910 category). Because some speakers were interviewed twice, and some interviews have more than one speaker, there is a discrepancy between the total number of speakers (210) and the total number of texts (176). Table 3 shows an additional breakdown of the text production by speaker sex. FRED consists to a large extent of interviews with so-called NORMs, the non-mobile older rural male speakers typically selected for traditional dialect studies (cf. Chambers and Trudgill 1998). Hence, the ratio of male to female speakers in the selected texts is roughly 2:1; the ratio of text produced by the two groups is roughly 3.5:1.

Table 2. Speaker distribution and text production by birth decade and dialect area

speakers born in . . .    SE    SW    Mid    N     TOTAL    text production in words
1890s                     10    15    11     15    51       404,331 (28.3%)
1900s                     9     23    25     21    78       618,251 (43.3%)
1910s                     3     20    11     16    50       386,129 (27.0%)
unknown                   5     18    0      8     31       19,161 (1.3%)
TOTAL                     27    76    47     60    210      1,427,872 (100%)

Table 3. Speaker distribution and text production by speaker sex

speaker sex    SE    SW    Mid    N     TOTAL    text production in words
male           20    50    34     33    137      1,114,293 (78.0%)
female         6     20    13     26    65       311,533 (21.8%)
unknown        1     6     0      1     8        2,046 (0.1%)
TOTAL          27    76    47     60    210      1,427,872 (100%)
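
The male-to-female ratios quoted in the text can be rechecked from the figures in Table 3. The following Python sketch is only an illustration: the dictionary layout and the variable names are assumptions made here, while the numbers are copied from the table.

```python
# Speaker and word counts copied from Table 3.
# The dictionary layout and all names are illustrative assumptions.
TABLE_3 = {
    "male": {"speakers": 137, "words": 1_114_293},
    "female": {"speakers": 65, "words": 311_533},
    "unknown": {"speakers": 8, "words": 2_046},
}

total_speakers = sum(row["speakers"] for row in TABLE_3.values())
total_words = sum(row["words"] for row in TABLE_3.values())

# Ratio of male to female speakers, and of the text they produce.
speaker_ratio = TABLE_3["male"]["speakers"] / TABLE_3["female"]["speakers"]
text_ratio = TABLE_3["male"]["words"] / TABLE_3["female"]["words"]

# Share of the total word count produced by each group, in percent.
word_share = {
    sex: 100 * row["words"] / total_words for sex, row in TABLE_3.items()
}

print(f"speakers: {total_speakers}, words: {total_words}")
print(f"male:female speakers = {speaker_ratio:.1f}:1")
print(f"male:female text     = {text_ratio:.1f}:1")
for sex, share in word_share.items():
    print(f"{sex:8s}{share:5.1f}%")
```

With these figures the speaker ratio comes out at about 2.1:1 and the text ratio at about 3.6:1, which is in line with the rounded values given in the text.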

1.2. Why another pronoun study?

Variation in pronoun use has been a point of debate in the study of English ever since Chomsky (1981) postulated Binding Principles A and B for the delimitation of pronouns against anaphors.3 Over the last three decades, many proposals have been concerned with the use of pronominal forms in specific syntactic contexts, and varied approaches have tried to explain the variation found in actual performance data.

The ongoing discussion has led to an extensive exchange between formalist and functionalist linguistics, but it is also an example of the sometimes tedious attempt to justify why language users behave the way they do, and not the way they ‘should’. In his cross-linguistic study on anaphora, Huang (2000: 22) concludes that the “distributional complementarity between anaphora and pronominals . . . seems to be a generative syntactician’s fantasy world,” giving credence to the view that structural principles alone cannot account for pronoun use in all its variety.

Research on pronoun variation has reached a complex state, making it all the more important to define what a specific approach and selection of data can or cannot contribute to the discussion. The present study situates itself within the current trend of contributions that aim at a descriptive, cross-dialectal perspective (e.g., Rohdenburg and Mondorf 2003, or Kortmann and Schneider 2004). The study uses an empirically-founded approach which allows for a detailed description of distributional tendencies in quantitative and qualitative terms. It brings together a variety of non-standard phenomena and the different frameworks in which they have so far been discussed. This integrative approach allows us to expand on linguistic claims and hypotheses that are compatible with the empirical observations in an unbiased way, and it permits the identification of determinants of variation which are common to apparently unconnected phenomena.

One important incentive is the fact that studies on pronoun variation have so far restricted themselves to smaller geographic areas (e.g., Trudgill 1974, 2004; Ihalainen 1985, 1991; Wakelin 1986; Coupland 1988; Harris 1993; Beal 1993, 2004; Henry 1995; Pietsch 2005; but consider Szmrecsanyi’s recent work in English dialectometry, Szmrecsanyi 2011). Regionally restricted studies naturally run the risk of misinterpreting locally substantiated phenomena as regional phenomena if their presence in other varieties is ignored. The present study, therefore, takes a cross-dialectal perspective on pronoun use in the different parts of England: the Southeast, the Southwest, the Midlands and the North. Regionally restricted dialect features can thus be distinguished from supraregional features of spoken English.

Finally, the comprehensive approach used in this study not only facilitates the comparison of different phenomena, but also the comparison of different morphosyntactic categories such as person, number, gender and case, establishing the relative importance of each category for the correct processing of pronominal expressions (see 8.4).

1.3. Grammatical categorisation and terminology

William Labov once remarked: “It is sometimes said that man is a categorizing animal; it is equally appropriate to say that language is a categorizing activity.” (Labov 1966: 20). This insight is easily transferred to the use of pronouns in the standard varieties of different languages. In modern Standard English, for instance, pronominal expressions have clearly delimited functions. Each function is encoded by a specific type of pronoun, generally with a telling name such as interrogative, possessive or demonstrative. Regarding the use of personal pronouns and reflexives, the standard paradigm is straightforward and unambiguous: subject functions are encoded by subject forms (I, he, etc.), object functions are encoded by object forms (me, him, etc.), and reflexivity is encoded by self-forms (myself, himself, etc.).

This neat classification is convenient, and sometimes indispensable (for example for didactic purposes). However, it leaves no room for the variation observed in other varieties of English where pronoun forms regularly cross categorial boundaries. Interrogative forms, for example, may appear in relative function (the man what I saw), possessive forms are replaced by definite articles (I told the wife), simplex personal pronouns can take on reflexive function (I’ll buy me a car), subject forms can have object function and vice versa (her told me; I gave it to he).

Different technical terms have been proposed to describe these phenomena, including ‘refunctionalization’ (Lass 1990), ‘functional reinterpretation’ (Howe 1996) and ‘transcategorization’ (Ježek and Ramat 2009). In our study, however, it seems more appropriate to speak of functional variability, especially considering the fact that historical evidence exists for almost all of the phenomena under investigation (cf. Hernández forthcoming).
In order to assess the correlation between formal and semantic properties of pronouns in our database, a strict terminological distinction will be maintained between pronoun form and pronoun function. The former term will be used to refer to word forms irrespective of their syntactic function. There are three main form types: subject forms or S-forms I, he, she, we, they, object forms or O-forms me, him, her, us, them, and self-forms myself, yourself, himself, herself, itself, ourselves, yourselves, themselves. Note that analyses comparing different case forms only include pronouns with overt case distinction, which is why you and it are excluded most of the time.

Pronoun function will be used to refer to the syntactic function of a pronoun in a specific clause or sentence irrespective of its form. The focus in the analyses will be on three functions: subject, object and reflexive. The first one denotes the sentence subject, irrespective of the sentence type (statement, question, answer), the noun phrase structure (simplex or coordinated) and the presence or absence of the corresponding verb. The second denotes the sentence object or objects, both direct and indirect. Reflexive function denotes pronouns which encode a concept of self-relatedness, or which redirect the action or effect described by the sentence verb towards the agentive subject (cf. Wierzbicka 1996: 415). A broad definition of reflexivity is applied to prepositional phrases to include all cases where the pronoun is coreferential with the preceding subject and bears a beneficiary or recipient relation.

Last but not least a few words are in order on the notions standard and non-standard. In the following, pronominal occurrences which conform to the norms and regulations of Standard British English will be referred to as standard. The standard paradigm shown in Table 4 will serve as the tertium comparationis.
It represents the prescriptive norms, in their strictest sense, for the correct use of personal pronouns in written Standard English. All occurrences which depart from this paradigm will be interpreted as non-standard or dialectal. However, it is important to note that these terms are used for descriptive purposes only and carry no connotations of incorrect language use or ungrammaticality.

Table 4. Standard English paradigm of personal pronouns and reflexives

Standard English    subjects             objects              self-forms
                    (form = function)    (form = function)    (reflexive/intensifier function)
SG   1              I                    me                   myself
     2              you                  you                  yourself
     3 m            he                   him                  himself
     3 f            she                  her                  herself
     3 n            it                   it                   itself
PL   1              we                   us                   ourselves
     2              you                  you                  yourselves
     3              they                 them                 themselves
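
Since every departure from the paradigm in Table 4 counts as non-standard, the classification can be pictured as a simple lookup. The following Python sketch is a minimal illustration, not the coding procedure of the study: the function labels (`subject`, `object`, `reflexive`, `intensifier`) and the helper `is_standard` are hypothetical, and only the functions named in Table 4 are covered.

```python
# Standard paradigm from Table 4, keyed by syntactic function.
# Labels and helper name are illustrative assumptions, not the study's scheme.
SELF_FORMS = {
    "myself", "yourself", "himself", "herself", "itself",
    "ourselves", "yourselves", "themselves",
}
STANDARD_PARADIGM = {
    "subject": {"i", "you", "he", "she", "it", "we", "they"},
    "object": {"me", "you", "him", "her", "it", "us", "them"},
    "reflexive": SELF_FORMS,
    "intensifier": SELF_FORMS,
}


def is_standard(form: str, function: str) -> bool:
    """Return True if `form` is a Table 4 form for `function`."""
    if function not in STANDARD_PARADIGM:
        raise ValueError(f"function not covered by Table 4: {function!r}")
    return form.lower() in STANDARD_PARADIGM[function]


# Cases modelled on examples (4) and (5) above.
print(is_standard("she", "object"))    # False: S-form in object function
print(is_standard("me", "reflexive"))  # False: simplex pronoun as reflexive
print(is_standard("him", "object"))    # True: form matches function
```

A lookup of this kind only labels an occurrence as standard or non-standard in the descriptive sense used here; it says nothing about grammaticality.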

1.4. Data extraction and coding

Preparing the data for analysis involved two main steps. First, all personal pronouns and self-forms were extracted from the corpus, then each case was function-coded. More than 132,500 individual occurrences were extracted overall (excluding interviewer utterances). A random sampling procedure was applied for you and it, the two pronouns with missing case distinction. In the case of you, a sample of 2000 cases was extracted for each dialect area and additional searches were later conducted in connection with different phenomena involving this form. Samples for it were analysed in connection with the phenomenon of gendered pronouns in section 4.

Two types of irregularities had to be taken into account: pronunciation variants reflected in different spellings, and non-standard sentence structures and truncations. The former issue was easily solved by using the asterisk option in WordSmith Concord. A simple word search for m*se*, for example, returns all instances of myself, miself, meself and m’self. At the same time, the results obtained by this method indicate the absence of other variants known from the dialect literature, such as mesell.

Regarding the delimitation of syntactic units, spontaneous speech is not always straightforward. In our data, for instance, we find a fair amount of truncations, anacolutha and non-standard sentence structures (e.g., inversion of verb arguments). The identification of pronominal antecedents is complicated by the ability of personal pronouns to refer to chunks of discourse that vary in size and textual distance. Referential chains can be interrupted and eventually picked up again, by both speaker and interviewer. In many cases, the wider context was therefore taken into consideration. In the case of anacolutha, it was decided to include those occurrences where a subject-verb sequence was recognisable in the clause containing the pronoun.
In the case of verb elision, the syntactic role of the pronoun was inferred from the overall context. Furthermore, a distinction had to be made in coordinated examples between pronouns forming part of a coordinated NP, as seen in (9), and pronouns which are the subject of one of two main clauses coordinated by and, as seen in (10).

(9) That bed was ours. Eh, only us eh well four of us laid in that, see. [...] Me and Lil down the bottom and Elsie and eh Edie up the top. (FRED, LND_003)

(10) Tom was on the buoy and us others – that left seven, didn’t it? – were in this little boat. (FRED, SFK_003)

After the extraction process, all occurrences were coded for their syntactic function. A complete list of these functions is shown in Table 21 in the Appendix, with the functions discussed in this study printed in bold type.
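
The wildcard search described in this section can be imitated outside WordSmith. The sketch below is not the procedure used in the study: it relies on Python's `fnmatch` module, on two hypothetical helpers (`tokenize`, `wildcard_hits`), and on an invented sample sentence, and merely shows what a pattern such as m*se* picks up.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text):
    """Split text into lower-cased word tokens, keeping word-internal apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def wildcard_hits(tokens, pattern):
    """Return the tokens matching a glob-style pattern, in text order."""
    return [token for token in tokens if fnmatchcase(token, pattern)]


# Invented sample sentence (not corpus data) containing three spelling variants.
SAMPLE = "There was miself and I done it meself, m'self, with the mouse."

hits = wildcard_hits(tokenize(SAMPLE), "m*se*")
print(hits)  # ['miself', 'meself', "m'self", 'mouse']
```

In this sketch the pattern also matches an unrelated word (mouse), so the hits of a wildcard search of this kind still need to be screened before they are coded.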

1.5. Structure of the study

The present study is divided into three major parts: an introduction (section 1), the corpus study (sections 2–7) and the synopsis and discussion (section 8). The corpus study forms the empirical core of this investigation. It starts with a general overview and some preliminary observations on the overall versatility of the different pronoun forms in section 2. It then continues with a quantitative and qualitative analysis of different non-standard phenomena, starting with variation in number and person in section 3, followed by variation in gender in section 4. Variation in case occupies centre stage in sections 5 through 7, where three different scenarios will be investigated: the well-known phenomenon of pronoun exchange, case variation in prepositional phrases, and the use of different case forms in qualified pronouns. The study ends with a synopsis and discussion of the major empirical results as well as a brief outlook towards possible areas of interest for future research in pronoun use in section 8.

2. Two hierarchies

The first empirical results to be presented in this chapter refer to the overall functional versatility of S-, O- and self-forms in the data. Besides frequency criteria, behavioural criteria are “the primary source of evidence for markedness within language structure” (Croft 1990: 77). In syntax, evidence that one variant is grammatically or functionally more versatile than another variant indicates the former’s unmarked status (‘syntactic criterion’); in morphology, behavioural evidence pertains to the number of morphological distinctions that a particular grammatical category possesses (‘inflection criterion’). In the words of Croft (1990: 81), “[t]he element which occurs in a larger number of constructions is the less marked one.”

One of the most basic empirical results in this respect is that the least marked personal pronoun forms in English are object forms. Before we start with the investigation of more specific phenomena, let us take a look at the function matrices provided in Tables 22 and 23. These matrices give a first impression of the functional range of each pronoun, and a quick look suffices to see that almost all of the investigated pronouns occur in standard as well as non-standard uses (except itself and yourselves, which are both extremely rare).

The greatest functional diversity can be observed among 1SG forms, me being the overall most versatile personal pronoun form, and myself the most versatile self-form. In the data at hand, me is used in 35 out of 50 syntactic functions, more than half of which are non-standard uses. In various functions, myself is the only self-form to appear in the data, a fact that explains the frequent references to non-standard myself in the literature.

A closer look at the matrices shows that all S-forms are occasionally used as objects and prepositional complements, and all O-forms are occasionally used in subject function. Furthermore, all self-forms are used as independent personal pronouns at least somewhere in the interviews, both in object and subject functions (except the very rare yourselves). In line with the historical development of the English pronominal paradigm, we find O-forms in reflexive function but no comparable S-forms (cf. Hernández forthcoming; König and Siemund 1997; Faltz 1985; Mitchell 1985).4

Based on these insights, two general hierarchies can be established for the use of personal pronouns and self-forms in spoken English: the Functional Diversity Hierarchy presented in (11) and the Non-standard Frequencies Hierarchy depicted in (12). The first hierarchy reflects the fact that, among all pronouns with distinct S- and O-forms such as I–me or he–him, the latter form can be expected to appear in a greater variety of contexts.

(11) Functional Diversity Hierarchy

O-forms > S-forms > self-forms

The Non-standard Frequencies Hierarchy, on the other hand, describes the comparative likelihood of the three form types to appear in non-standard functions. It illustrates the higher overall frequency of O-form and self-form occurrences in non-standard functions, as compared to the lower number of S-form occurrences in non-standard functions.

(12) Non-standard Frequencies Hierarchy

O-forms/ self-forms > S-forms

The two hierarchies are partially correlated, since the variety of functions in which a specific pronoun occurs is not necessarily identical with its overall non-standard frequency. In this study, the two hierarchies are, for example, correlated regarding the use of myself – the functionally most diverse self-form (20 functions) with the highest non-standard proportion of all self-forms (> 22 %). Other pronouns have a high functional diversity index but do not occur in non-standard uses very often, e.g. 1SG I, and yet other pronouns are functionally less versatile but exhibit high non-standard proportions in one specific syntactic context (e.g., demonstrative determiner them).

Another aspect to keep in mind is that, whilst the two hierarchies in (11) and (12) provide a good first impression of behavioural tendencies, their strength lies in their simplicity as generalised descriptions. In the analyses it will become clear that there are exceptions to these generalisations. For example, it is true that O-forms are generally more versatile than S-forms, yet we will also see that 1SG pronouns are comparatively more versatile than other pronouns. In combination, these two tendencies explain why I occurs in a similar number of functions as some O-forms (compare her and us). In any discussion on pronoun variation, the two hierarchies presented in (11) and (12) should hence be treated as a point of departure for more specific observations.
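
The difference between the two measures behind (11) and (12) can be made concrete with a small calculation. The Python sketch below uses invented toy data, not figures from Tables 22 and 23, and hypothetical names throughout. For each form it counts the number of distinct functions (functional diversity) and the share of non-standard occurrences (non-standard frequency).

```python
from collections import defaultdict

# Invented coded occurrences: (form, syntactic function, standard use?).
# These are toy values for illustration only, not corpus results.
OCCURRENCES = [
    ("me", "direct object", True),
    ("me", "subject in coordinated NP", False),
    ("me", "reflexive", False),
    ("I", "subject", True),
    ("I", "subject", True),
    ("I", "direct object", False),
    ("myself", "reflexive", True),
    ("myself", "subject", False),
]


def summarise(occurrences):
    """Return, per form, the number of distinct functions and the non-standard share."""
    functions = defaultdict(set)
    totals = defaultdict(int)
    nonstandard = defaultdict(int)
    for form, function, standard in occurrences:
        functions[form].add(function)
        totals[form] += 1
        if not standard:
            nonstandard[form] += 1
    return {
        form: {
            "functions": len(functions[form]),
            "nonstandard_share": nonstandard[form] / totals[form],
        }
        for form in totals
    }


summary = summarise(OCCURRENCES)
for form, values in summary.items():
    print(f"{form:8s}{values['functions']} functions, "
          f"{values['nonstandard_share']:.0%} non-standard")
```

Keeping the two values separate reflects the point made above: a form can occur in many functions without being frequent in non-standard uses, and vice versa.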

3. Variation in number and person

In this first more detailed analysis we will be looking at the formal realisation of number and person features in the corpus. In terms of pronoun variability, number and person are more robust categories than gender and case, which is why they will be discussed more briefly.

The low degree of variation can be attributed to the fact that number and person have high priority in the correct processing of pronominal expressions. However, the data show that some variation within these categories is possible without causing any negative effect on pronoun attribution, referent identification, or the overall understandability of the respective utterances. In the following, we will take a closer look at two particular phenomena which are known to most people familiar with British English vernacular speech: the use of singular us and the use of generic question tags. Other features which cannot be discussed here are the use of third person plural forms with singular referent NPs (see (13) and (14)), the use of dummy object pronouns as seen in (15), and the use of existential it as in (16) (cf. Hernández (forthcoming) for a detailed analysis of these phenomena).

In addition, unintentional shifts in person may also occur in spontaneous speech. Such shifts are extremely rare and restricted to compatible expressions. In our data, for example, some speakers can be observed to switch between 1PL and 3PL pronouns, as seen in (17), or between 1SG and 1PL, as seen in (18). Such switches can be attributed to a spontaneous change in perspective in informal dialogue. In both examples, coreference relations remain traceable and the meaning of the individual pronouns remains clear.

(13) The Duke of York’s, they were a round potato. (FRED, CON_010)

(14) I think each house_i had its wash-house to themselves_i. (FRED, SAL_021)

(15) And mi father then was livin’ in Prescott, you see. Got a house here. And he had to walk back, to eh, walked it, no thought about jumpin’ on a, well there was no, on a bus or aught like that, to Victoria Hotel . . . And mi father made excellent job at these, this work. And he walked it to Warrington. Told us often about this. And he walked it to Warrington . . . (FRED, LAN_012)

(16) But uh, in my opinion, it’s, you know, so much chemical on the, on the ground today. (FRED, KEN_006)

(17) {Who looked after the garden? Did your dad or was he was it –} No, he was very good at that, my father was. He never bodged to bother us kids. We kids_i used to have a little bit on their_i own, little plot on their_i own, just for ourselves_i but eh only tiny plots, otherwise he done all the main. (FRED, KEN_009)

(18) They were better days then than they are today, I ’ll tell you. I_i ’m used to enjoy ourselves_i with threepence more than they do now with thirty pound. (FRED, LAN_002)

3.1. Singular us

It is widely known that British English speakers occasionally use us instead of me to refer to themselves in informal conversations, as seen in (19) (repeated for convenience from (1)).

(19) He used to just open his coat out, Here ’s a couple of rabbits, Fred, give us a couple of pints. (singular us, FRED, YKS_006)

In earlier stages of English, the use of 1PL pronouns with singular reference usually carried connotations of authorship and majesty (cf. Mitchell 1985: 107),5 or modesty (Mitchell 1985: 251). The first colloquial uses, for example in requests such as (20), seem to have appeared much later, in the 19th century. Around the same time, singular us started being noticed in the dialectological literature (e.g., Lowsley 1888: 6 and Wright 1905: 271). In present-day English, the use of singular us in sentences like Give us a kiss is known all over the country (cf. Beal 1993: 205; Beal 2004: 117; Trudgill 1999: 88; Miller 1993: 108; Petyt 1985: 233; Upton et al. 1987, map 102 ‘with me’). Its function seems obvious: the plural form makes requests, in particular, sound more friendly and familiar. On the phonological level, the de-stressed pronoun, which is often pronounced /əs/, has weakened to the point of becoming enclitic, supporting the alleged modesty or reservedness of the speaker. (Note that this discourse-licensed usage is not captured by feature-geometric approaches; cf. Harley and Ritter 2002a: 507.)

(20) Tell us something more about the pea-shooting. (Thomas Hughes, Tom Brown’s School Days I, iv, year 1857, quoted in OED online, http://www.oed.com)

While oral history interviews are not the type of data where we would expect to find many direct requests, they occasionally appear where speakers repeat conversations from the past. In addition to the extraction of pronouns described above, supplementary word searches were therefore conducted for all 1PL pronouns in combination with typical verbs of request (will you / can you / ... bring / get / give / ...). The searches returned 11 cases, mostly requests, where the speaker used us to refer to himself or herself alone, as shown in examples (19), (21) and (22). Only one example contains we, seen in (23), and the corpus search returned no examples for (royal) ourself with singular reference.

(21) . . . and my mother said, Will you go down and get us a pound o’ thin arrowroot biscuits? (FRED, YKS_003)

(22) Yes, bless us what did they call him. (FRED, WES_009)

(23) When I come back there ’s a big Yankee walking up and down. (v ‘laughs’) And we, Can I help you? Said, uh, I ’m the new manager. I said, You what? (FRED, LND_001)

In distributional terms, it is interesting to note that – even if there are not enough cases in the corpus for a statistically valid analysis – the majority of utterances containing singular us appear in data from the North of England (2 SE, 0 SW, 1 Mid, 9 N).

3.2. Generic question tags – innit?

Person is the most robust morphosyntactic category of all, with even less variation than number. In our data, first person pronouns always refer to the speaker (plus one or more others in the plural), and second person pronouns always refer to the hearer (plus one or more others in the plural, except for indefinite you). Nevertheless, one of the best-known dialect features of spoken English falls under the heading of variation in person: the use of non-standard question tags, and the use of invariant innit in particular (see examples (2) and (24)–(26)). Innit, together with wunnit (‘wasn’t it’, ‘weren’t it’ or ‘wouldn’t it’), forms part of a larger group of generic question tags which can but do not have to be contracted.

(24) I used to walk around the churchyard and read the headstones. (v ‘laughs’) Queer occupation innit? But it used to pass the time . . . (FRED, DEV_009)

(25) Well, when I started I had eighty-four, when I started. ’t was eighty-four all here, innit. (‘wasn’t it’/ ‘wasn’t there’, FRED, SOM_029)

(26) Someone or the other, they did get some of these names, innit? (‘didn’t they’, FRED, OXF_001)

The most basic characteristic of generic q-tags – alternatively ‘independent’ or ‘autonomous’ q-tags – is that they do not need to be in agreement with the verb and noun of the preceding sentence. This independence makes them comparable to invariant tags in other languages. While Standard English, with its variable q-tags, differs from other European languages, generic tags like innit resemble the q-tags found in German (nicht ‘not’, nicht wahr ‘not true’, oder ‘or’, gell or gelt in Swiss and Southern German), Spanish (verdad ‘true’, no ‘no’), or French (n’est-ce pas ‘is it not’).

Functionally and formally, the genericness affects both parts of the tag. The verb no longer echoes the main sentence verb (e.g., didn’t in (27)), and, at the same time, no longer reflects its own original meaning. Generic is or isn’t, for instance, no longer reflect the original meaning of be. Furthermore, the pronoun it acts as an impersonal question-tag particle, showing loss of coreferentiality with the preceding NP, and even loss of referentiality altogether. In contractions such as innit, the change in meaning is accompanied by a change in morphology and pronunciation – through lexicalisation and phonological erosion – which strengthens the perception of the tag as one unit. The fact that the same changes can also be observed in wunnit indicates that we are dealing with a wider linguistic phenomenon that could theoretically spread to other ‘V + pronoun’ combinations.

(27) ... he was a baker you see, made his own bread and cakes and all that, and same with this one along here, didn’t it? In the village they made their own bread and that, we used to make our own bread, didn’t it, as well. (‘wasn’t it/ he’, ‘didn’t we’, FRED, OXF_001)

In our data, 24 instances of innit were found in different dialect areas (5 SE, 17 SW, 2 N), showing with certainty that the phenomenon, which is often associated with slang, is not restricted to the Southeast.

In terms of agreement, innit is not always generic. In many cases, innit represents a simple contraction of isn’t it, as seen in example (24). In its generic function, however, innit can be used irrespective of the preceding verb (as previously seen in (2)) and verb tense (see (25)), and irrespective of the preceding subject NP, as shown in (28).

(28) That young lady that was with you the other day said that she was dead before they put her in there because she said her lungs eh { There was no water in them. No.} There was no water in them. And had she been drowned they would have been full of water, her lungs would have been full of water, wouldn’t it? (‘wouldn’t they’, FRED, DEV_001)

As the examples show, innit is not the only generic q-tag in the corpus. Other utterances include non-contracted forms such as isn’t it, wasn’t it or weren’t it, as well as positive is it and was it. Overall, generic q-tags with be are certainly the most common, but other auxiliaries also feature in the data, such as wouldn’t and didn’t. Table 5 shows the proportion of generic cases for each of these verbs.6 If we look at the figures more closely, they suggest a lexicalisation process from isn’t it to innit which may be attributed to frequency factors. This has also been suggested by Krug (1998: 304), who found in his data that “[b]oth isn’t and it clearly dominate their respective operator and subject categories of the negated English tag paradigm. Most importantly, however, the tag question isn’t it? alone accounts for more than one third of all negated tag questions.” Krug’s frequencies are surprisingly similar to the results obtained in FRED. In addition, the argument that high string frequency is an important cognitive motivation in language change, especially coalescence, can be extended to the second most frequent invariant q-tag, wunnit. Consider the high string frequency of wasn’t it, which comes very close to that of isn’t it, even more if we include the other non-contracted q-tags starting in w-. Overall, the relatively high generic proportion in non-contracted tags shows that formal contraction and genericness are not co-dependent. It appears, however, that “the load of informational content is inversely proportional to the likelihood of undergoing contraction” (Krug 1998: 310).

Table 5. All 3SGn question tags in the corpus

tag             generic / total   (generic %)
innit / inni’     9 / 24          (37.5%)
isn’t it         13 / 138         (9.4%)
is it             3 / 20          (15.0%)
wunnit            1 / 7           (14.3%)
weren’t it        1 / 11          (9.1%)
wasn’t it        16 / 121         (13.2%)
was it           10 / 34          (29.4%)
wouldn’t it       2 / 26          (7.7%)
didn’t it         3 / 3           (100%)
TOTAL            58 / 384         (15.1%)

Based on the above-mentioned typological parallels between invariant q-tags in spoken English and invariant q-tags in languages such as German, Spanish or French, it is to be expected that missing agreement between sentence and q-tag affects neither the understandability of the sentence, nor the function of the tag itself. The underlying reason is that q-tags are discourse-structuring devices which are void of propositional meaning. Dieter Stein refers to this fact in the following passage:

Finally, tags express meanings that are associated with oral language. They invite or enlist the hearer’s presence and participation in endorsing the propositional content expressed. ... In fact, to the exclusion of aspectual – and grammaticalized – meanings they often do carry discourse-structuring meanings and act as ploys to move constituents into preferred information positions. To the latter extent they are indeed propositionally empty. (Stein 1997: 43)

The fact that q-tags have no propositional content also explains why it is possible for speakers in spontaneous conversations to use positive–negative sequences that are considered incorrect in Standard English, i.e. it is ..., is it? The respective utterances remain perfectly understandable, indicating that the Standard English requirement that a positive verb must be followed by a negative tag, and vice versa, serves as a mere formality. Various empirical examples from the corpus contravene the standard rule. Interestingly, however, they all contain positive–positive sequences (17 cases), as seen in examples (29) and (30). Negative–negative sequences do not seem to occur, but a logical explanation for this restriction remains to be found.

(29) . . . copper pulls him up, says, Oh ... he says, It’s your brother who got away with that case last year, is it? (FRED, LND_001)

(30) I remembers when they first started round here, you know, they – that was just after the, I believe that was just after the War, was it? (FRED, OXF_001)

Generic q-tags as well as positive q-tags which follow a positive verb both violate the standard grammatical rules, but they retain the intrinsic function of utterance-final signals which is common to all q-tags in all languages. Just like standard q-tags, they question the proposition of the preceding sentence. In doing so, they can be used to express doubt or uncertainty, elicit a response from the hearer, or for other, more subtle pragmatic functions, such as the softening or intensifying of potentially face-threatening statements (cf. Holmes 1995: 82).

The additional importance of pauses is illustrated in example (31), based on a sentence from FRED, text DEV_008. Here, the ambiguous tag in (31-a) is disambiguated by the insertion of speech pauses in (31-b) (generic wasn’t it instead of didn’t I) and (31-c) (standard wasn’t it after verb elision; as found in the corpus). The difference in meaning between (31-b) and (31-c) is not encoded in the sentence-final tag but in the prosody, by means of pauses and intonation.

(31) a. Then I got a job in Totnes after I met you wasn’t it?
     b. Then I got a job in Totnes after I met you (pause) wasn’t it?
     c. Then I got a job in Totnes (pause) after I met you wasn’t it?

Together with the findings of Holmes (1995) and the typological evidence for invariant q-tags in other European languages, similar examples suggest that prosodic aspects have a higher importance in the correct processing of q-tags than the formal agreement stipulated by the standard prescriptive rules. If ambiguity arises, for example due to unclear clause boundaries, prosodic means such as intonation and pausing facilitate the correct assignment. Since the same means also apply to standard q-tags, we can maintain that, from a functional point of view, non-standard q-tags work exactly the same way as ‘correct’ q-tags. Formal variation in the verb or pronoun, or in the overall morphology of the tag, has no negative effect on its interpretability or the understandability of the sentence as a whole.

The future development of generic q-tags is uncertain. It has been suggested that innit, the most widely studied tag in British English varieties, could one day become part of the standard grammar (e.g., Krug 1998: 304). However, innit is still heavily stigmatised as a marker of ‘uneducated’ speech or slang, which may prevent its spreading to the more formal registers.
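The generic proportions reported in Table 5 can be recomputed from the raw counts. The following Python sketch is an illustration only, not part of the original analysis. It assumes that the second figure in each cell of Table 5 is the total number of occurrences of the tag, a reading which the printed percentages bear out. The names TABLE_5 and generic_share are hypothetical.

```python
# Counts transcribed from Table 5: tag -> (generic uses, all uses of the tag).
# Assumption: the second figure is the tag's total, as the printed
# percentages imply.
TABLE_5 = {
    "innit/inni'": (9, 24),
    "isn't it": (13, 138),
    "is it": (3, 20),
    "wunnit": (1, 7),
    "weren't it": (1, 11),
    "wasn't it": (16, 121),
    "was it": (10, 34),
    "wouldn't it": (2, 26),
    "didn't it": (3, 3),
}


def generic_share(generic: int, total: int) -> float:
    """Percentage of generic uses among all uses of a tag, to one decimal."""
    return round(100 * generic / total, 1)


shares = {tag: generic_share(g, t) for tag, (g, t) in TABLE_5.items()}

# Pooled figures corresponding to the TOTAL row of the table.
all_generic = sum(g for g, _ in TABLE_5.values())
all_tags = sum(t for _, t in TABLE_5.values())
overall = generic_share(all_generic, all_tags)

for tag, pct in shares.items():
    print(f"{tag:<12} {pct:5.1f}%")
print(f"{'TOTAL':<12} {overall:5.1f}%  ({all_generic}/{all_tags})")
```

Pooling the counts before dividing, rather than averaging the nine percentages, gives frequent tags such as isn’t it and wasn’t it their proper weight in the overall figure.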

3.3. Summary: Number and person

The two morphosyntactic categories of number and person have high priority in the correct processing of pronominal expressions. In the corpus data, this is reflected in a relatively low degree of variation, especially as compared to variation in case. Nevertheless, the data show that when variation occurs in these categories, it has no negative effect on pronoun attribution or referent identification. Variation in number has different motivations which are mostly pragmatic. The use of us instead of me, for example, carries connotations of modesty and closeness which can be used to make a request sound less demanding, even affectionate. Variation in person is restricted to impersonal, non-referential uses of it. In this section, we took a closer look at the use of non-standard question tags including the well-known contracted tag innit. Here too, non-standard uses have no negative impact on the understandability of the respective utterances.

In non-standard question tags we are faced with the loss of (co-)referentiality. With its variable tags, Standard English sets itself apart from other European languages whereas invariant tags like innit show convergence with the predominant typological trend. It can be argued that question tags, both standard and non-standard, are propositionally empty, and that they are hence not affected by variation in either pronoun or verb. Even contracted tags like innit and wunnit retain the intrinsic discourse functions of all question tags. The exact function of a specific tag in a specific utterance is independent of its formal realisation. It has to be derived from the overall meaning of the sentence and the conversational context, sometimes with the additional help of prosodic cues.

4. Variation in gender

Modern Standard English is a natural gender language (cf. Corbett 1991; Curzan 2003; Greenbaum and Quirk 1990). Male referents are referred to by the masculine pronouns he and him, female referents are referred to by the feminine pronouns she and her, and referents without an assigned gender are referred to by neuter it. In this section we will investigate occurrences that violate these rules by containing masculine and feminine pronoun forms with neuter reference. Pronouns of this type are generally known as gendered pronouns.

(32) Well, if you picked one [an apple] and cooked it early he isn’t same as when he ’s picked and kept, is he? { Yes. Exactly.} So that’s what we did do. We made un late. (gendered reference for inanimate ‘apple’, FRED, SOM_013)

Natural gender systems are typologically widespread and go back as far as Indo-European (cf. Szemerényi 1996: 155–156). Tieken-Boon van Ostade (1994: 236) has observed that “grammatical gender is a concept that has to be learnt, and that is acquired considerable time after children have learnt to distinguish between the sexes.” This explains why, even in languages with grammatical gender assignment, natural gender may emerge in linguistic contexts where the speaker has to cope with an increased processing load. Mühlhäusler and Harré (1990: 72), for example, have noted the transition in German “from grammatical to natural marking as the distance between a pronoun and its antecedent widens.”

But how can we explain the opposite phenomenon, i.e. the persistence of gendered reference in a natural gender language? In pursuit of this question, the corpus data will be used to shed light on distributional tendencies in quantitative, qualitative and geographical terms. It will be shown that spontaneous conversations do not always follow the standard attribution rules, and that the observed deviations follow some regularities of their own.

The investigation will be guided by two questions: are gendered pronouns a frequent or rare phenomenon, and what are the determinants of variation between gendered forms and it? In order to answer these questions, a variety of different aspects will be taken into account, including the role of referent categories, the importance of the speaker viewpoint, the relationship between gendered pronouns and topicality, the distribution of gendered pronouns in different syntactic positions, regional differences, and differences between male and female speakers.

4.1. Gendered pronouns in the literature

English gendered pronouns are usually discussed in connection with one particular phenomenon, the traditional count–mass system found in the Southwest of England. Instead of following the rules of natural gender assignment, this system uses gendered pronouns (usually masculine) to refer to count nouns or ‘personal class’, and it to refer to mass nouns or ‘impersonal class’ (cf. Hughes and Trudgill 1996: 32; Trudgill 1999: 95; Upton et al. 1994: 486). Even though systems of this type are typologically very rare (cf. Kortmann 2002: 202–203; Wagner 2004c: 482), a similar development has recently been observed for Modern Dutch. Audring (2006) has pointed out the resemantisation of Dutch pronouns along a count–mass line, with count nouns being increasingly referred to by masculine hij and hem while neuter het is used for mass reference.7

The traditional English West Country phenomenon has been investigated in great detail by Wagner (2004a,c), and more recently by Siemund (2008), but accounts go back as far as the 18th century (e.g., Marshall 1789; Barnes 1886; Ellis 1869–1889; Kruisinga 1905; Ihalainen 1985; Wakelin 1986; Upton et al. 1994). In his Provincialisms of the Vale of Glocester, Marshall (1789: 56) describes how he was used “almost invariably for it; all things inanimate being of masculine gender.” Almost a century later, Halliwell (1887: Vol. I, xvii) states that “[i]t is a common saying, that in Hampshire every thing is called he except a tom-cat which is called she.” And Lowsley (1888: 6) mentions the use of 3SGn subject forms ut, he and a, and object forms ‘e, ‘in and un in the dialect of Berkshire. Around the same time, Frederic T. Elworthy makes the following observation in his Outline of the Grammar of the Dialect of West Somerset:

Every class or definite noun, i.e. the name of a thing or object which has a shape of its own, whether alive or dead, is either masculine or feminine, but nearly always the former; indeed, the feminine pronouns may be taken as used only with respect to persons. For instance, in chaffering for a sow, it would be said, Wuul, neef tez·u zuw, ee ul git au·n, ‘Well, if it is a sow, he will get on’, i.e. get fat. . . . It is simply an impersonal or abstract pronoun, used to express either an action or a noun of the undefined sort, as cloth in the quantity, water, snow, air, etc. (Elworthy 1875–1886: 32–33)

Towards the end of the 20th century, however, the situation already presents itself very differently. Ihalainen (1985: 158) finds that “the correct generalization today seems to be that it can be used for ‘thing’ and ‘mass’ referents, whereas the personal forms do not occur with ‘mass’ referents at all.” It is of course true, as Wagner (2004a: 16, fn.8) has pointed out, that the comparability of early accounts such as Marshall or Elworthy with modern, statistically analysable datasets is by no means guaranteed. This includes the SED fieldworker notebooks, where we mostly find notes on exceptional uses. Nevertheless, a certain development is recognisable. In search of the traditional West Country phenomenon, the most recent studies by Wagner and Siemund reach the same conclusion: the old categorical count–mass system is disappearing. Wagner describes her findings as follows:

Yes, the traditional system is slowly dying out, witnessed by fewer and fewer masculine forms in those domains which at one time were their exclusive territory. The formerly obligatory system has developed into an optional system. (Wagner 2004a: 292, based on data from Newfoundland and the Southwest of England)

... and Siemund observes:

The results yielded by my search through these sub-corpora are rather sobering. The handful of examples showing an animate pronoun used for picking out an inanimate object is a far cry from the highly regular usage of masculine pronouns for the domain of inanimates that was widespread use only a century before. (Siemund 2008: 60, based on the BNC spoken component)

4.2. Gendered pronouns in FRED

The following subsections present a qualitative and quantitative analysis of gendered reference in the FRED corpus. It will be shown that, while most cases correspond to Ihalainen’s above-mentioned generalisation, gendered pronouns occasionally still have mass reference, and that the semantic field of the gendered referent plays a crucial role. It will be argued that explanations based on a count–mass distinction are insufficient for the distributional tendencies observed in the data. Rather, most gendered pronouns point towards a pragmatic system revolving around the speaker viewpoint. This includes the topicality of the referent, as well as the speaker’s personal involvement and emotional attachment to the referent. The emotive aspect explains why certain groups of referents such as ‘pets’ have a stronger influence on gendered pronouns than other groups of referents. The emotive aspect also explains why male and female speakers differ in their use of gendered pronouns for certain referents, for example ‘ships’.

The referent’s degree of individuation will also be taken into account. Over the last two decades, individuation scales have been attributed a prominent role in the description of gendered pronouns. In this study, however, it will be argued that individuation is not a referent property but that it is epiphenomenal to the speaker viewpoint. The degree to which we perceive something as an identifiable individual entity depends on a variety of subjective factors. These factors may be correlated with certain referent properties, but they form part of the speaker perspective.

Finally, it will be shown that gendered pronouns are a supraregional phenomenon which is by no means confined to the Southwest of England. The distribution in the data suggests a South–North continuum, the southern end of which bears resemblance to – but is not identical with – the traditional West Country system described above.

4.2.1. Quantitative distribution

Early references in the literature give the impression that gendered forms, and masculine forms in particular, were once used categorically with all count nouns, whereas more recent studies such as Wagner (2004a) and Siemund (2008) suggest that the phenomenon has become optional and rather rare, especially with inanimate referents. The question whether gendered pronouns are frequent or rare can be answered in two different ways. Compared to early descriptions from the 18th and 19th centuries, gendered pronouns today seem rather infrequent. Putting our results into perspective, however, we find that their frequency ranges from medium to high compared to other non-standard phenomena in the same dataset. Tables 6 and 7 show that the average proportion of gendered forms in the corpus exceeds 60% for non-human animate referents, and 9% for inanimate objects. In addition, the results show that, while masculine forms predominate as expected, a substantial number of feminine forms also appear in the data. A restriction to masculine forms is not supported, limiting the comparability with earlier accounts.

Table 6. Gendered pronoun frequencies: animate referents (m = masculine forms, f = feminine forms, n = neuter forms)

dialect   animate human                      animate non-human
area      m       f       n      TOTAL       m       f       n       TOTAL
SE        2694    708     0      3402        39      5       97      141
          79.2%   20.8%   0.0%   100%        27.7%   3.5%    68.8%   100%
SW        1226    228     11     1465        28      14      57      99
          83.7%   15.6%   0.8%   100%        28.3%   14.1%   57.6%   100%
Mid       1514    316     16     1846        6       0       32      38
          82.0%   17.1%   0.9%   100%        15.8%   0.0%    84.2%   100%
N         1739    583     0      2322        7       2       134     143
          74.9%   25.1%   0.0%   100%        4.9%    1.4%    93.7%   100%
TOTAL     7173    1835    27     9035        80      21      320     421
          79.4%   20.3%   0.3%   100%        19.0%   5.0%    76.0%   100%

Table 7. Gendered pronoun frequencies: inanimate referents (m = masculine forms, f = feminine forms, n = neuter forms)

dialect   inanimate count                    inanimate mass
area      m       f       n      TOTAL       m       f       n       TOTAL
SE        53      82      1797   1932        0       0       534     534
          2.7%    4.2%    93.0%  100%        0.0%    0.0%    100%    100%
SW        46      1       838    885         4       0       226     230
          5.2%    0.1%    94.7%  100%        1.7%    0.0%    98.3%   100%
Mid       42      2       1228   1272        0       0       339     339
          3.3%    0.2%    96.5%  100%        0.0%    0.0%    100%    100%
N         5       4       1805   1814        0       0       290     290
          0.3%    0.2%    99.5%  100%        0.0%    0.0%    100%    100%
TOTAL     146     89      5668   5903        4       0       1389    1393
          2.5%    1.5%    96.0%  100%        0.3%    0.0%    99.7%   100%

Note that the figures in these two tables are based on a representative text sample consisting of 10 randomly selected interviews from each dialect area. Each occurrence in this sample was assigned to one of the four referent categories presented in individual blocks: ‘animate human’, ‘animate non-human’, ‘inanimate count’ and ‘inanimate mass/abstract’. The results do not include gendered possessives as in keep the butter to his shape (CON_011), or gender-neutral pronouns as in You get a working class person . . . works himself up, and he gets to that position, he ’ll do anything to keep himself there . . . (LAN_012). Results for it were extrapolated from random samples of 100 per dialect area, excluding impersonal it (it’s the truth), general it all, and set expressions like damn it.
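The regional pattern in Tables 6 and 7 becomes easier to see when the masculine and feminine counts are added up and expressed as a share of all forms in each dialect area. The following Python sketch is an illustration only. It assumes that the counts are transcribed as printed in the two tables, and it should be kept in mind that the neuter counts are extrapolated. The names ANIMATE_NON_HUMAN, INANIMATE_COUNT, gendered_share and by_area are hypothetical.

```python
# Counts transcribed from Tables 6 and 7: (masculine, feminine, neuter)
# per dialect area, for two of the four referent categories.
ANIMATE_NON_HUMAN = {
    "SE": (39, 5, 97),
    "SW": (28, 14, 57),
    "Mid": (6, 0, 32),
    "N": (7, 2, 134),
}
INANIMATE_COUNT = {
    "SE": (53, 82, 1797),
    "SW": (46, 1, 838),
    "Mid": (42, 2, 1228),
    "N": (5, 4, 1805),
}


def gendered_share(m: int, f: int, n: int) -> float:
    """Percentage of gendered (masculine + feminine) forms among all forms."""
    return round(100 * (m + f) / (m + f + n), 1)


def by_area(table: dict) -> dict:
    """Gendered share for every dialect area in one referent category."""
    return {area: gendered_share(*counts) for area, counts in table.items()}


animals = by_area(ANIMATE_NON_HUMAN)
objects = by_area(INANIMATE_COUNT)

for area in ANIMATE_NON_HUMAN:
    print(f"{area:<4} non-human animate {animals[area]:5.1f}%   "
          f"inanimate count {objects[area]:5.1f}%")
```

Computed this way, the shares in both categories are lowest in the North, which is the kind of regional gradient that a South–North continuum would predict.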

4.2.2. Referent categories, topicality, and the speaker viewpoint

Previous studies such as Wagner (2005) have shown that gendered referents pertain to certain semantic categories, which means that gendered frequencies will to a large extent depend on the conversation topic. Some referents or groups of referents have been associated with the phenomenon for a long time. In his English Grammar from 1640, for example, Ben Jonson classified ships as feminine, and dogs and horses as generally masculine. Jonson described the neuter gender as “feigned gender: whose notion conceives neither Sexe; under which are compriz’d all inanimate things, a ship excepted” (p. 57, quoted in Nevalainen and Raumolin-Brunberg 1994: 183).

More recently, it has been observed that gender choice is moving away from intra-linguistic assignment rules and towards extra-linguistic conditioning factors, especially the speaker’s attitude towards the referent (Wagner 2004a: 294). This view is clearly supported by occurrences from FRED. Here, it seems to be the viewpoint and attitude of the speaker, rather than the referent properties themselves, which favour the use of gendered reference. Even if “cats are more likely to be shes generically, based on the biological-semantic pattern (dog = neuter or +male, vs. bitch; cat = neuter or +female, vs. tom-cat)” (Wagner 2004a: 124), it is the individual speaker’s perception that ‘cats are female’ which will lead him or her to use a feminine pronoun when referring to the animal in question. Similar to the individuation aspect, the biological-semantic classification of the referent can be epiphenomenal to the speaker viewpoint. The distinction between objectively verifiable referent properties and the subjective perception of these properties is subtle but crucial. While the former can be used to identify common gendered referents such as ships, it is the latter that will tip the scales in the concrete speech situation and decide whether a specific ship is it or she.

Referent categories – masculine forms

In FRED, gendered reference covers a surprisingly great variety of non-human animate and inanimate entities which are by no means restricted to the typical ‘horses, dogs and ships’. The different semantic categories can be summarised as follows:

– animals and plants
– vessels and vehicles
– machines and tools
– buildings
– food and drink
– recipients
– other objects

Very rarely do we find examples from the semantic field of ‘events’ (e.g., war in DEV_005). Most gendered forms in the conversations refer to animals: bears and bullocks, calves, colts, cows, dogs, donkeys, fish and fox, elephants, heifers and horses, lambs, mares, moles, pigs, sows, ponies, rabbits, sheep, sharks, squirrels. It is primarily masculine forms which account for the high frequency of gendered reference in the ‘animate non-human’ group in Table 6. The corpus results support Wagner’s findings:

Although most grammars of modern and earlier stages of English tell us that the appropriate pronoun to use when referring to an animal is it, except for cases where the sex of the animal is known, actual language use could not be further removed from this prescriptive statement. (Wagner 2004a: 121)

Familiarity with the animal in question is not mandatory, but gendered forms can be used to mark the one referent which represents the central topic or focus of a discourse segment, as seen in the following examples:

(33) I can also remember ehr the bear that used to be (v ‘short laughter’) kept at the Queen’s (v ‘laughter’) Head, which used to sit (v ‘short laughter’) used to be chained to a tree at the back and they used to bring ’im out sometimes and chain ’im to the horse trough in the in the front of the Queen’s Head and eh my old uncle used to give him (v ‘laughter’) pints of , to drink. He used to sit up on there, and eh he got a bit nasty, he got a bit spiteful, they had to have him destroyed. (gendered reference ‘bear’, FRED, MDX_002)

(34) If you had a sow with a litter, that would have three meals a day, quite often. He ’d be fed in the morning then after while he ’d be left out, and then he ’d have a little bit more food and left into the young ones again, and out again, and that. (gendered reference, non-human animate ‘sow’, FRED, CON_011)8

Among the inanimate referents of he and him (also himself outside the text sample), we find various vessels and vehicles (58%) such as barges, boats, coaches, engines, planes, ships, submarines, tractors, trains and vans. There is also a large variety of other referents inside and outside the text sample, including all sorts of machines and tools (reapers, sewing machines, drills, ploughs, weaving looms, irons, ovens, wooden churns, and even the tape recorder used in one of the interviews), buildings (farm houses, cottages, churches, mills, pubs, sheds, shops), recipients (bags, bottles, boxes, barrels, buckets, mugs, pots, pails, a washing tub), and various other objects, such as an accordion, a ditch, a generating plant, clock, hill, hay stack, ladder, lamp post, letter, pram, button, painting, pond, rope, quilt, stove, teeth, bricks and walls, the moon, the apple seen in (32), or the toilet seen in (3), repeated for convenience in (35). Similar to the animals mentioned above, most of these inanimate objects are the central topic in the respective discourse segment. Masculine reference can thus even be used if the referent is assigned inanimate status by other expressions in the same sentence. In example (36), for instance, the rake is first referred to as a terrible, dangerous thing, but it becomes he as soon as the speaker starts to describe it in more detail. This passage is a good example for the distinction stressed above, between objectively verifiable referent properties and the speaker’s subjective perception of the referent. From the first perspective, the rake is an inanimate tool; from the second perspective it is dangerous. The shift in perspective is, however, not to be confused with personalisation.

(35) The toilet was right down the bottom of the garden and that was still there when we took over the garden. { I think there used to be two mud huts there at one time, I don’t know. . . . } Was there? This old toilet, he was still there when we were doing the garden. (gendered reference, inanimate ‘toilet’, FRED, SOM_027)

(36) They used to be – they call it a turnover, a arrish rake. And it was a terrible dangerous thing. He had spikes each end . . . (gendered reference, inanimate ‘rake’, FRED, CON_011)

A different mechanism which has not received much attention in the literature manifests itself in several interviews. Let us call it the ‘animation of the topical referent’. In the two text passages in (37) and (38) the gendered forms accentuate the agent status of the referent – a coach in one example, a plant in the other – making it appear more vivid and animated than could be achieved by using it. In both passages, the use of gendered reference makes the described event livelier, more colourful and ultimately more story-like.

(37) But this one occasion, I don’t know what happened, but it is so that he went straight through, this coach did, down the main line, and they sent the shunting engine out of Wellington goods yard to follow him down the line and got on to him and catch him, which they did by means of getting the coal pricker over a lamp bracket and gradually brought him to a stand like that and brought him back to Wellington with, a shunting engine. (gendered reference with animation effect, ‘coach’, FRED, SAL_011)

(38) Well you cut them off with a spitter see, about three inches long, the root tap root, and he ’d never grow no more he ’d just shoot out some more roots out each side, but he ’d never grow in himself again . . . (gendered reference with animation effect, ‘teasel’, FRED, SOM_018)

Last but not least, we need to mention mass nouns. Most of the factors which further the use of gendered pronouns for count referents do not seem to apply in this category. Mass referents are usually attributed a very low degree of individuation, and speakers are less likely to feel emotionally attached to them. Nevertheless, masculine reference for mass nouns can be observed in the data. Within the selected text sample, occurrences are rare and they only appear in a few interviews from the SW area (see Table 7), but they provide evidence against the categorical exclusion of gendered reference for non-count nouns (contrary to, e.g., Elworthy 1875–1886: 328, or Trudgill 1999: 95).

All mass referents in the corpus belong to the semantic field of food and drink, such as juice, milk, butter, flour, salt, or cow feed (“kek”). The FRED examples are comparable to masculine references recorded in the SED for mass nouns such as gravy, porridge and broth. Some examples allow different interpretations: a specific sort of liquid, for instance, may be perceived as a mass referent, or it may be perceived as a count referent if contained in some sort of recipient. In example (41), the referent could be classified as a mass noun (rum) or count noun (this specific drink) – a distinction which is grammatically relevant in other languages.9 In the present study, this distinction is not made but it should be kept in mind.

(39) And then it [the butter] was put on the, whether he used to put the salt in then, and then, of course, it was ready to go on. In them days, nearly every dairy had the big slate benches all around. . . . And that used to be put on that, and beaten up from there. And then he would keep good shape. If it ’s summer time and warm weather, there wasn’t any fridge or anything, that it used to be awkward to keep the butter to his shape. He would, you know, go soft, but the slate would retain the coolness for a lot longer. (mixed reference, mass noun ‘butter’, FRED, CON_011)

(40) . . . you know kek for cows like. . . . ‘Cause if he ’s all mixed up, then you got to bag en up to tip in, you know . . . (gendered reference, mass noun ‘kek’/ ‘cow feed’, FRED, CON_005)

(41) I said, You ’d better have a drop of hot water and a bit of sugar in (reg sic=em) him (/reg) [the rum]. He said, I think I will. . . . He used t’ say, It isn’t too much trouble? I said, Trouble, no, ain’t no trouble at all putting hot water and a bit of sugar in (reg sic=em) him (/reg) and the old man used to sit down, he ’s happy as a king. (gendered reference, mass noun ‘rum’, FRED, CON_006)

Referent categories – feminine forms

In the data at hand, gendered reference is by no means restricted to masculine forms, even if gendered she and her are less frequent than their masculine counterparts (compare Tables 6 and 7). This result is in conflict with earlier accounts that have suggested such a restriction for Modern English (e.g., Ihalainen 1985: 155).

In addition, the referent range in feminine examples is wider than might be expected, even if it is more restricted than for masculine forms. Similar to gendered he and him, gendered she and her refer to both non-human animates and inanimate objects, encompassing the following semantic categories:

– animals (cows, dogs, horses in general, mares in particular)
– vessels and vehicles (mainly boats, ships, submarines, engines, trains)
– other objects (e.g., beam, cross piece, road, envelope, stove)

Unlike he and him, all gendered she and her in the data refer to countable entities, never to mass nouns. Occurrences include the story of Fan the pony in example (42). Regarding the well-known use of feminine reference for ships and other vessels, we can agree with Wales (2002) and Wagner (2004a) that ‘personification’ is not an adequate label – the label being too general, and the phenomenon too frequent. The respective occurrences can, however, be interpreted as a “subtle form of gender symbolism” (Wales 2002: 333) which is typically used by male speakers with a maritime background. Since different types of vessels are also referred to by masculine forms, gendered reference is to a certain extent unpredictable, as seen in (43). In this particular example, the speaker uses feminine reference for the one large ship which has its own name, distinguishing it from the other, smaller vessels in the same passage.

(42) And, this same bloke Gregory . . . he had the pony. And eh he used to go out at night, (v ‘whispers’) and used to get as drunk as a newt. . . . And when he come out of The Ship at Eastcote, they used to just open the trap, and they used to put him in the trap, and used to tie the reins, on, right, on the horse’s collar, . . . and lead the pony out into the middle of the road, and say, Go on, Fan, home. And she used to go up that road, they reckoned, like the wind. And if there was anything coming down the road, she used to pull out, pull up dead, and draw right into the side until they gone by, and as soon as they gone by, away she ’d go again. She used to go straight across Potter Street, . . . up the hill, see, turn round, come into the farm – actually about two or three o’clock in the morning – right up to the back door, start neighing and hollering like hell, she would. . . . There was Jim drunk as a newt in the back, they used to have to take him indoors and unharness Fan . . . (gendered reference, ‘pony’, FRED, MDX_002)

(43) No, well, they had a brig first – he got square sails on two masts. But if you had a (trunc) b- (/trunc) , if you had a (trunc) bri- (/trunc), a, a barque, he (e ‘clock chimes’) got square sails on two masts – if he was three master – and four-and-a-half sails on the mizzen mast. But if was a four-mast barque he ’d have square sails on three masts. But a (trunc) sh- (/trunc), forward ship had square sails on all masts. Even the five-master what went ashore before the first war, (trunc) i- (/trunc), just up to the side of Dover, she been in collision, the Prussian. She was a five-master, but square sails on all masts, that ’s a ship. (mixed gendered reference, different vessels, FRED, KEN_008)

(44) . . . and I saw this cow, and I spoke to her, went over and made a fuss of her . . . (gendered reference, ‘cow’, FRED, DEV_002)

(45) . . . and the driver said to me, he said, Thank God I got her round Coalbrookdale. (gendered reference, ‘train’, FRED, SAL_020)

4.2.3. Graphic representation on the Animacy Hierarchy

Gendered pronouns are frequently perceived as a signal for the speaker’s emotional involvement with or attachment to the referent in question. Yet, the lack of a direct means for evaluating the speaker’s point of view presents a major difficulty in the quantification and interpretation of gendered forms. Descriptions of gendered reference therefore usually revert to other, objectively verifiable factors. Siemund (2008), for example, defines the scope of gendered forms as compared to it on the Animacy Hierarchy (Croft 1990) and the Continuum of Individuation (Sasse 1993). Siemund’s findings indicate that the difference between non-standard varieties and Standard English regarding the use of masculine and feminine pronouns “concerns the location of the cut-off point” in these hierarchies (Siemund 2008: 142).

It was argued above that individuation is not really a referent property. Rather, the “degree to which we see something as a clearly delimited and identifiable individual entity” (Yamamoto 1999: 3) depends on the speaker’s perspective. If this perspective is not directly accessible, the degree of individuation of each referent is very difficult to determine. Aspects like humanness and animacy offer more reliable means of classification (cf. Croft 1990; Kuno 1987).10

In our data, systematic variation was observed within each pronominal gender category (masculine, feminine, neuter) and each animacy category (human animate, non-human animate, inanimate). In Figure 2, these results are projected onto the well-known Animacy Hierarchy formulated by Croft (1990), visualising the referential overlap between gendered pronouns and it. In accordance with earlier accounts, masculine and feminine forms encroach on the hierarchy from left to right, with linear decline towards the right. At the same time, it shows linear decline to the left, predominating in the inanimate class and only rarely referring to humans (in the data only neonates). In addition, the Continuum of Individuation presented by Sasse (1993: 659) is included in Figure 2 in order to illustrate the overall correlation between animacy and individuation. In contradiction to earlier accounts,11 the use of gendered pronouns, specifically masculine forms, can be verified for the entire continuum including referents of the inanimate mass category.

               


Figure 2. Gendered pronouns vs. it on the Animacy Hierarchy

4.2.4. Switches between gendered and non-gendered forms

It occasionally happens that speakers use gendered forms and it for the same referent in a conversation. Such switches can even occur within the same paragraph or sentence, as seen in (46)–(48). In some utterances, the two variants appear to be in free variation (apparently unmotivated switches), whereas examples such as (36) show how a change in focus, or a change in the speaker’s perception of the referent, can provoke a switch to gendered reference. Switches of this kind can be called functional switches.

(46) If it was well built, he would keep fairly dry. (mixed reference, inanimate ‘hayrick’, FRED, CON_011)

(47) D’ you know we ’ve got one calf before it was twelve month old and he won’t turn without the dummy. (mixed reference, non-human animate ‘calf’, FRED, SOM_013)

(48) Oh yes, I remember when he was an old pub then, cor wasn’t it hellish low, yeh you had to go down a step first to go in, yeh. (mixed reference, inanimate ‘pub’, FRED, DEV_008)

One subtype of functional switches is what we will here call demarcation switches. These are switches between gendered reference and it which are used to demarcate different referents in the discourse. In (43), for example, the alternation between masculine and feminine gendered forms sets apart different ships. Remember that in this particular example, feminine reference was used for the one large ship which had its own name, while other vessels were masculine. A similar effect can be observed in the following two examples:

(49) And the first horse, I bought a horse, or father did for me, I gave eighty-five pound for it. I bought it in September, and by Christmas it has lockjaw and died. And then I had to buy another one. But in the meantime, a friend of mine, down Ludgvan, he seen me. I used to buy feeding stuff off his brother, and he said, I got a horse home, Gilbert, he said, that ’ll tide you over for a few months. He ’s old, but he ’ll work. And I was very grateful for that man for that. He helped me out a lot. And after a few months, I got over the shock. I went up to Mithian, and I bought a horse for fifty pounds. She wasn’t a perfect horse, but she could do the job. And that kept me going. (switching between gendered and non-gendered reference, different horses, FRED, CON_009)

(50) And well, we pulled this thing, it [horse1] leapt, and it was quite a thing, because it was fairly strong at that age. And we brought it in, and we hadn’t had any experience much, of breaking in. But, you know, it came in all right. With the other, we had an old mare, was a big Shire one, that was a bit slow really, but we put her [the mare] in beside of that one, and he [horse1] couldn’t do much. You know, he had to go on with her, because she ’d drag him along too, and then she settled down to do her part. (switching between gendered and non-gendered reference, different horses, FRED, CON_011)

In juxtaposition, gendered pronouns outweigh it in speaker closeness, since they naturally indicate greater familiarity and intimacy in the speaker–referent relationship. In (49), for example, the speaker uses it for the first horse in the first few sentences – an animal he only had for a short time. The second horse in the middle of the paragraph, probably a stallion or gelding, is referred to as he; and the third horse, probably a mare, is referred to as she. Similarly, the speaker in (50) describes the untamed wild horse (thing) as it, in contrast to his old mare which is referred to as she. Once the untamed horse joins the mare and slowly becomes domesticated, it becomes he. The different referential forms in this example not only set apart the two referents – the old mare and the new horse – but also upgrade the latter from it to he.12 Similar examples leave no doubt that emotive aspects play an important role in the choice of gendered reference.

4.2.5. In-text distribution: semantic priming and semantic differentiation

In addition to the possible determinants of gendered reference described above – the semantic properties of the referent, the speaker perspective, as well as functional switches – it is important to consider the influence of priming effects. In their investigation of complex noun phrases, for instance, Cleland and Pickering (2003) found that production priming is stronger if the head nouns in the prime and target match. Accordingly, production priming can be expected to be strongest in pronominal expressions if the referents of the prime and target match. This explains why we find some texts which have no gendered pronouns at all whereas others feature numerous occurrences, often accumulated in the same text passage. Two such examples are the story of the bear in (33) and the story of Fan the pony in (42).

In spontaneous conversation, turn-taking can of course interrupt a chain of gendered references and lead to pronoun switches, as shown in this example:

(51) Now, oh, you ain’t got en on, have ye? { Yeah. Ah, I can take it off again.} (referring to the recorded conversation, FRED, WIL_010)

A different finding from Cleland and Pickering (2003) which is not supported is that production priming is stronger when the prime and target head nouns are semantically related. According to the authors, it can be assumed that the production of a sentence ‘the sheep that is red’ (instead of ‘the red sheep’) increases the likelihood of producing ‘the goat that is red’ (instead of ‘the red goat’; cf. Cleland and Pickering 2003: 218). Examples from FRED show that this mechanism rarely applies to gendered pronouns. On the contrary, text passages like (43), (49) and (50) show how demarcation switches can help distinguish between different semantically related referents.

Taking into account (i) the use of demarcation switches, (ii) the complexity of relationships between the speaker and different referents in the same text, (iii) possible shifts in topicality and perspective during the interview, as well as (iv) the individual speaker’s general predisposition to use gendered forms, and (v) a certain amount of free variation, the issue of semantic priming becomes exceedingly complex. Overall, sequences of gendered forms appear to be strongest where the prime and target referents are identical, but even then a variety of factors may diminish the priming effect.

4.2.6. Syntactic positions

In most examples, the use of gendered reference appears to be intuitively functional or pragmatic, but a look at structural factors yields some interesting results. Most importantly, it can be shown that the syntactic position of the pronoun has an influence on gender choice. This can best be seen in the two referent categories with the largest overlap between gendered forms and it, i.e. non-human animate referents and count objects (Table 6).

The numeric results are shown in Table 8. In both categories – animate non-human and inanimate count – the numbers show a significantly higher proportion of gendered forms in subject function as compared to object function (p < .05), and a subject–object ratio of approximately 2:1. From a wider perspective, these results link up with the general tendency of traditional non-standard phenomena to survive in the most frequent grammatical relations (cf. Krug 2003: 18). Should it prove true that gendered pronouns are taking the usual path of morphological change, future stages of English may witness their disappearance, starting in object function.

Table 8. Non-human references in text sample, breakdown by syntactic function

syntactic  animate non-human                               inanimate count
function   gendered forms  it              TOTAL           gendered forms  it              TOTAL
           (m+f)                                           (m+f)
subject    65 (33.2%)      131 (66.8%)     196 (100%)      137 (6.5%)      1978 (93.5%)    2115 (100%)
object     33 (20.1%)      131 (79.9%)     164 (100%)      73 (3.6%)       1933 (96.4%)    2006 (100%)
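The subject–object contrast can be checked directly from the counts in Table 8. The following is a minimal sketch, assuming a Pearson chi-square test without continuity correction on each 2×2 table (gendered forms vs. it, subject vs. object). The source reports p < .05 but does not name its test, so the choice of test, the helper names `chi_square_2x2` and `gendered_share`, and the critical value 3.841 (df = 1, α = .05) are assumptions made here for illustration only.

```python
# Counts taken from Table 8: (gendered forms, it) per syntactic function.
TABLE_8 = {
    "animate non-human": {"subject": (65, 131), "object": (33, 131)},
    "inanimate count": {"subject": (137, 1978), "object": (73, 1933)},
}

# Assumed threshold: chi-square critical value for df = 1, alpha = .05.
CRITICAL_VALUE_05 = 3.841


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def gendered_share(gendered, neuter):
    """Proportion of gendered forms among all pronouns in one table cell."""
    return gendered / (gendered + neuter)


results = {}
for category, rows in TABLE_8.items():
    (subj_g, subj_it), (obj_g, obj_it) = rows["subject"], rows["object"]
    results[category] = {
        "subject_share": gendered_share(subj_g, subj_it),
        "object_share": gendered_share(obj_g, obj_it),
        "subject_object_ratio": subj_g / obj_g,
        "chi_square": chi_square_2x2(subj_g, subj_it, obj_g, obj_it),
    }

for category, r in results.items():
    print(
        f"{category}: subject {r['subject_share']:.1%}, "
        f"object {r['object_share']:.1%}, "
        f"ratio {r['subject_object_ratio']:.2f}:1, "
        f"chi2 = {r['chi_square']:.2f}, "
        f"above critical value: {r['chi_square'] > CRITICAL_VALUE_05}"
    )
```

Under these assumptions both statistics exceed the critical value, which is in line with the significance level reported in the text, and the ratio of gendered subject forms to gendered object forms comes out close to 2:1 in both categories.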

4.2.7. Gendered pronouns and speaker sex

Among the 48 speakers in the text sample used for this analysis, 26 used gendered forms at least occasionally. In the whole dataset, 72 speakers were found to use gendered him, the pronoun with the highest absolute number of gendered occurrences (385). This is over a third of all speakers in the corpus. Gendered reference can hence be described as a generalised linguistic feature which is used frequently and by a substantial number of speakers (cf. Hernández forthcoming).

A different aspect not addressed so far is the linguistic behaviour of male speakers as compared to female speakers. If we assume that gendered reference is largely influenced by emotive aspects and the speaker viewpoint, and if we also assume that men and women differ in their perception of, and attachment to, certain referents, as well as in the way that they express this attachment, differences are to be expected. Table 9 shows a breakdown of gendered pronouns by speaker sex, including 11 female and 36 male speakers (1 unknown; neuter values extrapolated from same random sample as above). The empirical results can be summarised as follows:

1. In our texts, male speakers use gendered reference more frequently than female speakers.

2. Both sexes generally prefer gendered masculine forms over gendered feminine forms (compare the results in section 4.2.1 and Figure 2).

3. Male speakers also prefer masculine forms over feminine forms for inanimate referents, while female speakers use basically no gendered reference in this category; the use of gendered pronouns for inanimate referents is hence almost exclusively attributable to male speakers.

Table 9. Gendered pronouns and speaker sex

referent category           speaker sex       masculine reference  feminine reference  neuter reference
non-human animate (N=421)   male speakers     71 (25.4%)           20 (7.2%)           188 (67.4%)
                            female speakers   9 (6.3%)             1 (0.7%)            132 (93.0%)
inanimate count (N=5903)    male speakers     145 (2.7%)           89 (1.7%)           5147 (95.7%)
                            female speakers   1 (0.2%)             0 (0.0%)            521 (99.8%)
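The counts in Table 9 can be turned into a gendered share per speaker group and a standardised male–female difference. The sketch below is an illustration only: the source does not say how the σ values it reports for this table were computed, and the unpooled two-proportion z statistic used here is an assumption. Under that assumption the results come out close to the reported 7.24 σ and 12.32 σ; the sketch does not attempt to reproduce the reported p-values. The helper names `gendered_proportion` and `unpooled_z` are hypothetical.

```python
import math

# Counts from Table 9: (masculine, feminine, neuter) per speaker group.
TABLE_9 = {
    "non-human animate": {"male": (71, 20, 188), "female": (9, 1, 132)},
    "inanimate count": {"male": (145, 89, 5147), "female": (1, 0, 521)},
}


def gendered_proportion(masculine, feminine, neuter):
    """Return (share of gendered forms, total number of pronouns)."""
    total = masculine + feminine + neuter
    return (masculine + feminine) / total, total


def unpooled_z(p1, n1, p2, n2):
    """Two-proportion z statistic with an unpooled standard error (assumed test)."""
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / standard_error


z_scores = {}
for category, groups in TABLE_9.items():
    p_male, n_male = gendered_proportion(*groups["male"])
    p_female, n_female = gendered_proportion(*groups["female"])
    z_scores[category] = unpooled_z(p_male, n_male, p_female, n_female)
    print(
        f"{category}: male {p_male:.1%} of {n_male}, "
        f"female {p_female:.1%} of {n_female}, "
        f"z = {z_scores[category]:.2f}"
    )
```

A pooled standard error gives noticeably smaller values for the same counts, so the unpooled variant was chosen here only because it agrees with the figures in the text.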

Based on these results, it is also important to note that no correlation was found between speaker sex and pronominal gender of the type suggested by Mathiot and Roberts (1979) for American English, i.e. no preference of male speakers for feminine forms, or female speakers for masculine forms.

The difference between male and female speakers in Table 9 is very pronounced in both referent categories, but it is most noticeable for inanimate count objects (‘non-human animate’: 7.24 σ, p = 3 × 10⁻⁴; ‘inanimate count’: 12.32 σ, p = 7 × 10⁻¹⁰). Considering the fact that female speakers used basically no gendered forms for inanimate referents in these texts, future analyses of sex-specific language behaviour should consider the referent category as an additional determinant.

An explanation for the observed differences is once more found in the speaker viewpoint and the speaker’s attachment to the referent in question. There is no doubt that, on average, the male speakers in the corpus talk about objects like vehicles, ships, machines and tools more frequently and in more detail than the female speakers. If we, therefore, assume that male speakers feel on average more familiar with these objects, the occurrence of gendered forms comes as no surprise. It also stands to reason that a speaker with a marine background (among the traditional dialect speakers in FRED only men) will refer to boats and ships as she, using so-called symbolic gender or gender of affection. Taking into account these considerations, we can argue that, even if speaker sex and viewpoint occasionally correlate, the two aspects need to be treated separately.

4.2.8. Areal distribution

The areal distribution observed in this study is in opposition to the frequent assumption that gendered pronouns are an exclusively Southwestern phenomenon. At the same time, the results explain why occurrences in other parts of the country are occasionally ignored: gendered pronouns are significantly more frequent in the South than in the Midlands and North.13 The observed frequencies mark a South–North continuum which is most noticeable among the more frequent masculine forms.

Figure 3 shows the relative frequencies of gendered forms in the ‘non-human animate’ category; Figure 4 shows the corresponding distribution for the ‘inanimate count’ category. Feminine, masculine and all gendered forms are shown in different shades of grey with the respective exponential trendlines. In both referent categories we get significant differences in frequencies between the SW and N areas, and between the SE and N areas, for both masculine forms and gendered forms in general. The only noticeable exception within this South–North continuum is the relatively high proportion of feminine forms used with inanimate count referents in the SE area.

A closer look at the data reveals that the exceptionally high frequency of 4.2% (as compared to 0.1% and 0.2% in the rest of the country) can be attributed to two speakers who consistently use feminine reference for different ships, boats and engines throughout their stories. Together, speaker JRS (male, 44 occurrences, SFK_011) and speaker EF1 (male, 27 occurrences, SFK_003) are responsible for 71 out of 82 she and her with inanimate count reference in the SE area. In terms of speaker–case ratio, these two speakers represent outliers who strongly bias the observed frequencies. The overall South–North cline remains visible.
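The weight of the two outlying speakers can be made explicit with a short calculation. This is only a sketch using the three counts given in the text (44, 27 and 82); the variable names are hypothetical, and no per-speaker totals beyond these figures are assumed.

```python
# Feminine forms (she, her) with inanimate count reference in the SE area,
# as reported in the text.
SE_FEMININE_TOTAL = 82
OUTLIER_COUNTS = {"JRS (SFK_011)": 44, "EF1 (SFK_003)": 27}

# Occurrences contributed by the two speakers together, and what is left
# for all other SE speakers combined.
combined = sum(OUTLIER_COUNTS.values())
remaining = SE_FEMININE_TOTAL - combined
combined_share = combined / SE_FEMININE_TOTAL

for speaker, count in OUTLIER_COUNTS.items():
    print(f"{speaker}: {count} occurrences ({count / SE_FEMININE_TOTAL:.1%})")
print(f"both speakers: {combined} of {SE_FEMININE_TOTAL} ({combined_share:.1%})")
print(f"all other SE speakers: {remaining}")
```

On these figures the two speakers account for well over four fifths of the SE occurrences, which is why the feminine SE value stands out from the rest of the cline.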

[Bar chart: relative frequencies (%) of all gendered, masculine and feminine forms by area (SW, SE, MID, N), with an exponential trendline for each series.]

Figure 3. Geographic continuum: Gendered pronouns, ‘non-human animate’

[Bar chart: relative frequencies (%) of all gendered, masculine and feminine forms by area (SW, SE, MID, N), with an exponential trendline for each series.]

Figure 4. Geographic continuum: Gendered pronouns, ‘inanimate count’

4.3. Summary: Gendered pronouns

In this section, variation in pronoun gender was investigated by means of a cross-regional analysis of gendered pronouns in England. Compared to other non-standard features in the same dataset, gendered pronouns represent a medium to high frequency phenomenon. Compared to earlier accounts, however, a significant reduction in frequency seems to have taken place. The well-known traditional West Country system based on count–mass distinctions is no longer in place. Instead, the results mark a South–North continuum where the highest gendered proportions are attributable to masculine forms with non-human animate referents.

The corpus results show a general predominance of masculine forms over feminine forms. Unlike in previous studies, empirical evidence was also found for masculine pronouns with mass reference. Yet the number of examples is limited, and it can be suspected that masculine reference will not persist in this referent category for much longer.

It was argued that gendered pronouns encode pragmatic language use in close connection with the speaker viewpoint and with emotive aspects. In a variety of different functions, gendered reference is used for the topicalisation and animation of focal referents, as well as the demarcation of different referents in the conversation by functionally motivated switches between gendered forms and it.

Gendered forms can be used to express the speaker’s personal involvement or attachment to the referent in question. This is also reflected in differing preferences between the male and female speakers in the corpus: the well-known gender symbolism of vessels, for example, is typically found in male speakers with a marine background. Consequently, the results obtained in this study have serious implications for predictions about sex-related language use. Most importantly, they call for a specification of generalised claims, such as the alleged avoidance of vernacular features in female speech (see Coates 2004 for a more detailed discussion).

Regarding the two aspects of individuation and animacy, an attempt was made to illustrate the subtle but important distinction between objectively verifiable referent properties and the subjective attribution of referent properties. Especially individuation – defined as the degree to which we see something as a clearly delimited individual entity – is epiphenomenal to the speaker viewpoint and can therefore not be attributed determinant status. Animacy, on the contrary, provides a more objective means of classification. In Figure 2, the distribution of gendered pronouns vs. it was projected onto the Animacy Hierarchy. The biggest overlap is recognisable in the ‘non-human animate’ category.

Last but not least, gendered pronouns appear to be following the general tendency of traditional non-standard phenomena to survive in more frequent syntactic roles. This is indicated by the significantly lower proportion of gendered forms in object function, as compared to subject function. Based on the empirical results, it is to be expected that, if gendered pronouns were to disappear from the English language, they would first disappear from the inanimate referent category (i.e., right to left on the Animacy Hierarchy), and the less prominent syntactic positions (i.e., first object, then subject).

No matter where I roam I will return to my English rose for no bonds can ever tempt me from she. (The Jam “English Rose”)

5. Pronoun exchange

When it comes to variation in pronoun use, case is certainly the most widely discussed grammatical category. In spontaneous spoken English, the morphological realisation of syntactic relations allows for substantial variation in many different syntactic functions. The aim of the remaining analyses is to describe variation in pronoun case as it can be observed in the FRED data, starting with the well-known phenomenon of pronoun exchange.

Pronoun exchange (henceforth: PE) is probably the best-known vernacular feature regarding variation in case. Although not language-specific, the phenomenon also known as pronoun substitution (Trudgill 2004: 147) has certainly made its mark in the linguistics of English.14 From a typological view, “pronoun exchange in dialects of English contradicts the prototypical case-marking of subjects and objects in ‘accusative’ languages” (Kortmann 1999: 6). Nevertheless, the acceptability of PE in everyday conversation is readily explainable by the fact that speakers of English are used to a language that has already lost its morphological case in nouns, and which heavily relies on word order for the identification of syntactic roles (cf. Koktova 1999).

5.1. The traditional definition

Pronoun exchange was first documented in the dialectological literature of the 18th and 19th centuries. Early definitions can, for example, be found in Elworthy (1875–1886: 35), Kruisinga (1905: 35) and Wright (1905: 271). In its traditional definition, the term describes the exchangeability of subject and object forms in subject and object functions. Historically, the use of me, him, her, etc. in subject function (her’s not in) has been known since Old English whereas the use of I, he, she, etc. in object function (she told I) emerged during the period. The great attention paid to irregular uses of subject and object case forms is probably due to the fact that the distinction between subject and object forms and functions is generally considered essential for the proper command of the language. In the early 1920s, Edward Sapir stated:

Surely the distinction between subjective I and objective me, between subjective he and objective him, and correspondingly for other personal pronouns, belongs to the very core of the language. We can throw whom to the dogs, somehow make shift to do without an its, but to level I and me to a single case – would that not be to un-English our language beyond recognition? There is no drift towards such horrors as Me see him or I see he. (Sapir 1921: 177)

Geographically, PE is often associated with dialects of the English Midlands and the Southwest of England (cf. Burchfield 1994; Hughes and Trudgill 1996; Wagner 2004b). In one of the earliest accounts, for instance, Halliwell (1887) mentioned occurrences of PE in the Midland counties of Worcestershire, Warwickshire and Gloucestershire, and the SED identified two main areas for subject her in the Southwest and western Midlands (Orton et al. 1978, maps M68 and M69). The phenomenon has, however, been observed elsewhere, including East Anglia (Trudgill 2004, only S-form objects), the North of England (Beal 1993, 2004), Irish English (Filppula 1999), as well as (Wagner 2002).

Despite the many references in the literature, descriptions of PE are often anecdotal and restricted to individual examples from specific dialect areas. According to Trudgill (2004: 147), the exchange of different case forms in English “has not yet been subjected to any definitive analysis.” The present study fills this gap with a systematic empirical investigation which takes into account the cross-regional distribution of PE occurrences and the different PE sub-features.

An additional incentive for this particular analysis is the fact that, side by side with the literature on PE, an ongoing debate can be observed on so-called ‘unbound’, ‘untriggered’ or ‘independent’ reflexives which occur instead of personal pronouns in certain syntactic and pragmatic functions. Given that the three case forms – S-, O- and self-forms – have a long history of co-existence and exchangeability, it is surprising that studies on either phenomenon have so far not elaborated on a possible connection.

In this analysis, we will consider both phenomena: the exchangeability of S- and O-forms as well as the exchangeability of personal pronouns and self-forms. We will see that, if one manages to break free from the traditional definition of PE, the resulting picture is a more complex phenomenon which includes both personal pronouns and self-forms in a variety of syntactic contexts. The empirical results will lead to a modified definition of PE in 5.6.

5.2. Subject forms in object function

The phenomenon known as pronoun exchange consists of several sub-features which will be discussed separately in the following sections. Let us start by looking at subject forms in object function.

(52) ... so he told I he ’d gid I the sack, I and my father . . . (simplex and coord. I in object function, FRED, WIL_009)

The most common explanation for this feature is emphasis. It has been argued that S-forms tend to be used in object function when the respective pronoun carries additional stress. Wakelin (1986: 34), for example, mentions the use of subject case forms as emphatic grammatical objects in the Southwest of England, and Trudgill (2004: 147) notes that “it seems possible that what happens is that the Standard English subject pronouns occur as objects when the pronoun is emphasised, and object pronouns as subjects when the pronoun is not emphasised.” In an early account of the Berkshire dialect, Lowsley (1888: 6) lists she as an emphatic object in opposition to non-emphatic her. Lowsley also includes two syntactic rules which state that “active verbs govern the nominative case,” as in they love we or he hates they (p. 13), and that “[p]repositions sometimes govern the nominative case,” as in from they as hate you expect malice or from he as is cunning expect deceit (p. 14).

Other studies express a different view, finding that non-standard S-forms are not restricted to emphatic uses. Wagner (2002: 25) observes that in her SW data “there is no reason whatsoever to claim an ‘emphatic’ status for PE forms. Neither do speakers put particular stress on the forms, nor is there anything in the immediate environment that would cause a need for emphasis.” Similarly, Ihalainen observes that most nominative objects attested in his recordings are “not ‘emphatic’ at all” (Ihalainen 1991: 107) and that “at this stage, no more can be safely said about nominative objects other than that they are by no means restricted to emphatic contexts” (Ihalainen 1985: 160). In the present study, this view is supported by different occurrences where S-form objects can, but do not need to, function as emphatic pronouns.

The first quantifiable result regarding non-standard S-forms in FRED is that all of them – I, he, she, we, they – are occasionally used as single objects, as seen in the function matrix in Table 22. Furthermore, all S-forms also occur in prepositional complement function, another syntactic context where Standard English requires object case (also listed in the matrix). The areal distribution of non-standard frequencies in the two syntactic functions is very similar, with non-standard S-forms clustering in the SW (more details below). In addition, S-forms occasionally appear in for–to and ECM constructions (not analysed in this study), showing that the grammatical range of PE features is wider than usually described.

Remember that you and it were not included in the analyses due to the missing case distinction. Also note that, according to Cardinaletti and Starke (1994), it falls into a universal category of [-human] pronouns which cannot be coordinated: *It and the other one are nice. It was therefore excluded from the analysis of both PE and the influence of coordination on PE.

5.2.1. Object and prepositional complement function

Table 10 shows the S-form proportions for all pronominal objects and prepositional complements in the corpus (limited significance where the overall number of occurrences is too small). The overall most frequent S-form object is he. With 41 occurrences, he covers 2.1% of all 3SGm pronominal objects. Two aspects are particularly striking if we compare the two syntactic functions. First of all, the range of non-standard frequencies in non-coordinated cases is quite narrow, reflecting the behavioural similarity of the different pronouns I, he, etc.: S-form objects range between 0.4% and 2.1%, S-form prepositional complements between 0.2% and 1.7%.15 Secondly, a comparison of the two functions shows similar median and average values which are slightly higher for prepositional complement function (1.3) as compared to object function (1.0). The distributional similarities of non-standard S-forms in both functions support the inclusion of ‘prepositional complement’ in the definition of PE. Similar to object function, prepositional complements are another syntactic context where actual production data violate the Standard English requirement for the exclusive use of O-forms (compare the SED results for Question VIII.7.5 ‘burglars steal they’, and the Computer Developed Linguistic Atlas of England, Viereck 1991, map M15 ‘with me’).

Table 10. S-form proportions (%) in object and prepositional complement function (compare functions in Table 21)

| syntactic function | I (of 1SG) | he (of 3SGm) | she (of 3SGf) | we (of 1PL) | they (of 3PL) | TOTAL % |
| --- | --- | --- | --- | --- | --- | --- |
| object | 1.3 (25/1915) | 2.1 (41/1949) | 0.4 (3/754) | 1.0 (9/945) | 0.5 (29/5406) | 1.0 (107/10969) |
| object, co. initial | 0.0 (0/9) | 0.0 (0/1) | 0.0 (0/0) | 0.0 (0/1) | 0.0 (0/1) | 0.0 (0/12) |
| object, co. final | 50.0 (2/4) | 0.0 (0/0) | 0.0 (0/1) | 0.0 (0/1) | 0.0 (0/3) | 22.2 (2/9) |
| object, co. middle | 0.0 (0/3) | 0.0 (0/0) | 0.0 (0/0) | 0.0 (0/0) | 0.0 (0/0) | 0.0 (0/3) |
| prep. complement | 1.1 (14/1221) | 1.7 (16/941) | 0.2 (1/425) | 1.5 (10/665) | 1.3 (18/1418) | 1.3 (59/4670) |
| prep. c., co. initial | 6.7 (1/15) | 0.0 (0/2) | 0.0 (0/1) | 0.0 (0/1) | 0.0 (0/0) | 5.3 (1/19) |
| prep. c., co. final | 33.3 (2/6) | 0.0 (0/2) | 0.0 (0/0) | 0.0 (0/0) | 66.7 (2/3) | 36.4 (4/11) |
| TOTAL | 1.4 (44/3173) | 2.0 (57/2895) | 0.3 (4/1181) | 1.2 (19/1613) | 0.7 (49/6831) | 1.1 (173/15693) |
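The percentages for the two simplex rows of Table 10 can be recomputed from the absolute counts given in parentheses. The following Python sketch is an illustration only: the names `COUNTS`, `percentage` and `summarise` are hypothetical, and rounding to one decimal place is an assumption based on how the table is printed.

```python
# (S-form tokens, all pronominal tokens), copied from the simplex rows of Table 10.
COUNTS = {
    "object": {
        "I": (25, 1915), "he": (41, 1949), "she": (3, 754),
        "we": (9, 945), "they": (29, 5406),
    },
    "prep. complement": {
        "I": (14, 1221), "he": (16, 941), "she": (1, 425),
        "we": (10, 665), "they": (18, 1418),
    },
}


def percentage(hits, total):
    """Return hits as a percentage of total, rounded to one decimal place."""
    return round(100 * hits / total, 1)


def summarise(row):
    """Return per-pronoun percentages and the pooled percentage for one row."""
    per_pronoun = {p: percentage(h, n) for p, (h, n) in row.items()}
    hits = sum(h for h, _ in row.values())
    total = sum(n for _, n in row.values())
    return per_pronoun, percentage(hits, total)


for function, row in COUNTS.items():
    per_pronoun, pooled = summarise(row)
    print(function, per_pronoun, "TOTAL", pooled)
```

Pooling the counts in this way gives the TOTAL column (1.0 for objects, 1.3 for prepositional complements), which is also the pair of values compared in the text.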

Here are some examples with different pronouns:

(53) ... but he never interfered with I, but anybody else who came down here, he ’d go for. (prep. compl. I, FRED, SOM_020)

(54) There was ten years between my sister and I see, she died last year, she was eighty-nine. (coord. prep. compl. I, final position, FRED, WIL_023)

(55) We snapped he off like a damn carrot! (object he, referent ‘anchor’, FRED, SOM_028)

(56) I did give she a ’and and she did give I a ’and and we did ’elp one another. (she and I objects, FRED, WIL_011)

(57) Work didn’t frighten we, we knew we had to do it, we had to get on with it. (object we, FRED, SOM_005)

(58) Of course the Plymouth buses, then there was the DMT’s Devon Motor Transport that was the green buses. I remember they, you can’t remember they I suppose now. (object they, FRED, DEV_008)

5.2.2. Areal distribution

The areal distribution of S-form objects and prepositional complements in FRED is visualised in Figure 5 based on relative weighted frequencies.16 If we take uniformity between dialects as the null hypothesis (cf. Dahl 2001: 1458), we can observe that the two sub-features investigated here represent predominantly Southwestern features. They cluster heavily in the counties of Cornwall, Devon, Somerset and Wiltshire, the non-standard frequencies not being correlated with the overall number of pronominal objects and prepositional complements in the individual dialect areas. In the following, it will be shown that this distribution of non-standard S-forms is exceptional compared to other sub-features of PE.

Figure 5. Areal distribution of non-standard S-forms (I, he, she, we, they): (a) object function, (b) prep. compl. function

5.3. Object forms in subject function

The use of object forms in subject function, especially in coordinated NPs like him and me or us and them, has been described as “arguably widespread enough among educated speakers in [present-day English] to be called standard” (Denison 1998: 109). O-form subjects feature, for example, in several SED questions, and they were also noted in the incidental material of the survey.17 Historically, O-form subjects have been known for a long time, with early occurrences found in Old English examples such as (60).18

English is not the only language where object forms appear in the subject domain. In spoken Swedish, for instance, dem and dom ‘them’ have almost completely replaced de ‘they’ (cf. Kjellmer 1986: 446, fn 4). PE is, however, not to be confused with the regular use of so-called quirky subjects in Icelandic, i.e. subjects with lexically selected case (cf. Sigurðsson 1992; Fanselow 2002; Woolford 2006).

(59) I once dropped in a muck, me did there. (me in subject function, FRED, LAN_006)

(60) him com færlice to micel leoht, and hine astrehte to eorðan, and he gehyrde stemne
3SGm-DAT came suddenly into great light, and 3SGm-ACC/REFL threw to earth, and 3SGm-NOM heard voices
‘. . . and he threw himself to the ground and he heard voices’
‘. . . and it threw him to the ground and he heard voices’
(Ælfric Catholic Homilies i.386.6, taken from Mitchell 1985)

Contrary to O-form subjects, the use of O-forms after linking verb be is not always mentioned in the literature. Due to their extremely high frequency in Modern English sentences such as it is me or it was him, O-forms in this syntactic function receive much less attention. Quirk et al. (1972: 210), for example, state that “although the prescriptive grammar tradition stipulates the subjective case form, the objective case form is normally felt to be the natural one, particularly in informal style” (also cf. Erdmann 1978; Harris 1981). Historically, the diffusion of O-forms in subject complement position appears to have resulted from the reanalysis of impersonal it as described by Riley and Parker (1998: 38)19 for the following periods:

| period | construction | analysis |
| --- | --- | --- |
| 500–1000 | He it is | he is subject, i.e. ‘He is it’ = ‘He is the one’ |
| 1300–1400 | It is he | word order change, but he is still subject |
| 1400–1500 | It is he | it is reanalyzed as subject, so nominative case apparently follows be |
| 1500 | It is I | nominative case after be extends to first person |
| 1600 | It is me | object case follows be in analogy with other verbs |

According to this brief outline, O-forms started to replace S-forms after be around 1600, following the general tendency of post-verbal pronouns to be in the objective case. It appears that It is I was already being perceived as stilted and unduly formal in 18th century informal correspondence (cf. Tieken-Boon van Ostade 1994: 233).20 Nevertheless, advocates of Standard English have continued to support It is I well into the 20th century, even when aware that at least part of the English-speaking population feels uncomfortable using it. In his Guide to Correct English, Stratton (1949: 146) holds that, although to some people “these nominative pronoun forms after verbs seem strange,” they should “become comfortable with them. Learn to say over the phone ‘Is that you, Ethel?’ ‘Yes, this is I.’ ‘Is Mrs. Martin there?’ ‘This is she.’ ”

Stratton’s attitude contrasts with newer, descriptive grammars such as the Cambridge Grammar of the English Language (Huddleston and Pullum 2002). Geoffrey Pullum, one of the authors, takes a humorous stand towards subject complement I in this interview with the University of California, Santa Cruz:

The forms with nominative pronouns sound ridiculously stuffy today. In present-day English, the copular verb takes accusative pronoun complements and so does ‘than.’ My advice would be this: If someone knocks at your door, and you say ‘Who’s there?’ and what you hear in response is ‘It is I,’ don’t let them in. It’s no one you want to know. (April 15, 2002, http://www.ucsc.edu/currents/01-02/04-15/rules.html)

The discrepancy between different opinions concerning grammatical correctness can, of course, cause uncertainty among language users and possibly lead to hypercorrection.

Probably most people who mistakenly prefer I to me have at the back of their mind the glow of satisfaction they had, especially in school days, when they thought (just in time) to say ‘It is I’ instead of ‘It’s me.’ (Vallins 1952: 27)

The pedagogical emphasis on the quite un-English nominative after be ... has led to a common hypercorrection: the use of nominatives after prepositions (as for my wife and I, and so on). The origin is simple: people have been taught that me is ‘bad’, so they avoid it except where they can’t possibly (nobody yet says *give it to I). (Lass 1987: 152, my italics)

5.3.1. Subject and subject complement function

All O-forms in the corpus – me, him, her, us, them21 – appear in subject and subject complement function (examples will be given below). Table 11 lists all tensed occurrences in the data. Not included are no-verb utterances and question tags (discussed separately below) and disjunctive pronouns (cf. Hernández forthcoming). Also not included are occurrences of subject thee. Among those examples with thee which do not form part of a biblical citation, prayer, poem or direct speech produced at some point in the past, there are 43 simplex subjects, as opposed to 24 objects and 6 prepositional complements. Thee is obviously no longer exclusively objective.22 Examples for thee in different functions are also presented below.

The results obtained show that O-forms are not used as simplex, non-coordinated subjects very frequently (mostly < 1%), but they contribute substantially to the number of coordinated subjects in the corpus. In coordination, each O-form accounts for at least 50% of all pronominal occurrences in its category (except us, due to the small overall number of 1PL cases). There is no doubt that coordination has a positive influence on non-standard case forms. But at the same time, the results also show that coordination does not block standard case invariably,23 lending no support to the assumption that “ ‘double bound’ prevents constituents from being related by a transformational rule, so pronouns in conjoined subjects cannot be subjective in form” (Emonds 1986: 99, tree structure 14).

Looking at the figures in more detail, another interesting result is that coordinated me in final position has a relatively low frequency compared to other pronouns. Examples from the corpus indicate that speakers tend to prefer and I over and me, whereas in initial position me is largely preferred over I. One possible explanation for this tendency is that and I is being treated like a frozen expression (cf. Householder 1987; Redfern 1994; Honey 1995). Still, the strong preference for I in second-conjunct position does not constitute an “X and I constraint” as suggested elsewhere (Grano 2006: 36).

(61) And many a time me and the boys have picked him up, sitting on the doorstep somewhere . . . (coord. subject me, initial, FRED, LND_004)

(62) ... and him what owned that wood yard he used to own the cottage ... (subject him followed by relative clause, FRED, NTT_009)

(63) Him and I ain’t been fishing for these last six weeks. (coord. subject him, initial, and final I, FRED, MDX_001)

(64) Her says, I can get two loaves of bread, her says, for sixpence halfpenny. (subject her, FRED, LAN_020)

(65) . . . well us used to be shoved out there Saturday afternoons and go pictures and when us come out of there first place us went was to the Island because the pictures we saw was cowboys . . . (subject us, FRED, DEV_008)

(66) I said, I ’ll have an echo-sounder. But Abey and them had wireless, which was better really . . . (coord. subject them, FRED, SFK_010)

(67) And, course when you turn them out while thee, while you want to see that there (reg sic=idn) isn’t (/reg) no production there at all. (subject thee > you, FRED, SOM_007)

(68) . . . ’cos if these hills wun steep, thee ’d got to get thee feet in one o’ these steps, and if thee missed one, thee could’st have the tub on top on thee if thee wasna strong enough to ’old it. (thee in different syntactic functions, FRED, SAL_038)

(69) In the little pits – they wouldna listen to thee. (prep. compl. thee, FRED, SAL_038)

In subject complement function, the data show a clear preference for O-forms which ties in nicely with the historical development described above. The Modern English data in FRED show how rare S-forms have become in this context, making the prescriptivist call for S-forms untenable. The situation brings to mind Labov’s ‘principle of validity’: “When the use of language is shown to be more consistent than introspective judgments, a valid description of the language will agree with that use rather than with intuitions” (Labov 1996: 84).

The absolute number of occurrences in Table 11 may be on the low side for some pronouns, but the results nevertheless show that O-form subject complements are highly frequent in both coordinated and simplex NPs, leaving little space for variation. It is interesting to see that S-forms, despite their preferential treatment in prescriptive grammars, are an exception. On the other hand, their occasional appearance also contradicts claims which expect the invariable use of O-forms to the right of an inflected verb (e.g., Emonds 1986: 97).

(70) And there was me in mi cut-down khaki trousers . . . (subj. compl. me, FRED, DUR_002)

(71) Any-rate, uh, it was me and my two brothers . . . (coord. subj. compl. me, initial, FRED, LND_001)

(72) . . . and there was me and him, and we were at this guard meeting . . . (coord. subj. compl. me, initial, and him, final, FRED, WES_008)

(73) And there ’s her (v ‘interviewer clears throat’) and Mr Simpson, they ’d be drinking beer all the time . . . (coord. subj. compl. her, initial, FRED, LND_001)

(74) It wasn’t us, we was in such a place we didn’t bother. (subj. compl. us, FRED, LAN_017)

Table 11. O-form proportions (%) in subject and subject complement function (compare functions in Table 21)

| syntactic function | me (of 1SG) | him (of 3SGm) | her (of 3SGf) | us (of 1PL) | them (of 3PL) | TOTAL % |
| --- | --- | --- | --- | --- | --- | --- |
| subject | 0.02 (6/38472) | 0.03 (6/18640) | 2.1 (123/5913) | 0.6 (107/16859) | 0.2 (51/21722) | 0.3 (293/101606) |
| subject, co. initial | 78.9 (45/57) | 50.0 (11/22) | 55.6 (5/9) | 0.0 (0/2) | 50.0 (1/2) | 67.4 (62/92) |
| subject, co. final | 20.8 (11/53) | 85.7 (6/7) | 100.0 (4/4) | 0.0 (0/1) | 57.1 (4/7) | 34.7 (25/72) |
| subject, co. middle | 0.0 (0/2) | 0.0 (0/0) | 0.0 (0/0) | 0.0 (0/0) | 0.0 (0/0) | 0.0 (0/2) |
| subject complement | 85.1 (40/47) | 80.0 (16/20) | 83.3 (5/6) | 70.0 (7/10) | 100.0 (5/5) | 83.0 (73/88) |
| subj. c., co. initial | 78.6 (22/28) | 100.0 (3/3) | 100.0 (1/1) | 100.0 (2/2) | 100.0 (1/1) | 82.9 (29/35) |
| subj. c., co. final | 14.3 (1/7) | 50.0 (1/2) | 100.0 (1/1) | 0.0 (0/0) | 100.0 (1/1) | 36.4 (4/11) |
| subj. c., co. middle | 62.5 (5/8) | 100.0 (1/1) | 0.0 (0/0) | 0.0 (0/0) | 0.0 (0/0) | 66.7 (6/9) |

5.3.2. Coordination

The general relaxation of case assignment in coordinate structures has, for example, been observed by Frank Parker, Kathryn Riley and Charles Meyer (Parker et al. 1988, 1990). The authors suggest that because government, and thus case assignment, is blocked by the NP dominating the coordinate structure, any form of personal pronoun can occur (compare Figure 6).

Figure 6. Tree structure with co-ordinated NPs

Interestingly, case assignment is not the only mechanism to relax in coordination. In her study on Belfast English, Alison Henry (1995: 18) found that coordination has an influence on both pronoun choice and verbal agreement. Henry’s results show that “singular concord is impossible if the subject is a simple personal pronoun,” but “pronouns which are part of a co-ordination can have a singular verb, provided that they are not nominative.” In Henry’s study (pp. 23–24), the case and verb combinations shown in sentences (75-a)–(75-c) are hence acceptable whereas the coordinated subject forms in (75-d) are not.

(75) a. He and I are going.
b. Him and me are going.
c. Him and me is going.
d. *He and I is going.

In the future, this apparent correlation between coordination, O-form subjects and singular concord will certainly be worth investigating in more detail.

5.3.3. Question tags

In the relevant literature, examples used to illustrate O-form subjects usually consist of tensed S–V structures, whereas in the FRED corpus non-standard O-forms also appear in a variety of other structures, including no-verb utterances and question tags. In order to detect any systematic differences between these structures, they will be analysed separately. The results will show that O-form frequencies do, indeed, vary significantly depending on the syntactic context.

In question tags, S-forms are the clear default option (512 cases, 95%), while O-forms are only used occasionally (27 cases, 5%; excluding you; no self-forms).24 The non-standard proportion here is higher than for simplex subjects in tensed clauses, but significantly lower than in subject complements or coordinated subject NPs. The fact that not all O-forms appear in this context may be coincidental. Examples were found for him, us and them; should there exist a lexical constraint, it is not clear what that constraint should be.

(76) . . . the first lad as I seen down on this green was in was in with pony, and he – used to have a pony and run round, you know, didn’t him? (him as question tag subject, FRED, OXF_001)

(77) ... We had acres of eggs. Wheeling and dealing, weren’t us? (v ‘laughter’) (us as question tag subject, FRED, SOM_030)

(78) . . . ’course they had to get these carrots for a lot of troops, hadn’t ’em? (them as question tag subject, FRED, WIL_010)

5.3.4. No-verb utterances (1)

The distribution of S- and O-forms in question tags differs very clearly from their distribution in no-verb utterances.25 While pronouns in question tags are mostly S-forms and only rarely O-forms, no-verb utterances show the opposite tendency. In our data, we find O-forms in single-pronoun questions,

(79) {What, what politics did you take up, when you . . . } Me? Oh, I, I didn’t used to trouble about anything much. (me in no-verb question, FRED, KEN_005)

in no-verb answers with simplex or coordinated pronominal NPs,26

(80) {How many of you were working there?} Ehm me and a man, a horseman and me. {Did the farmer’s wife do all the cooking and –} Oh yes, her and t’ girl, yes, we used to bake in a a brick oven. (coord. me and her in no-verb answers, FRED, WES_008)

and in no-verb utterances with just, only, not or too which express some sort of contrast:

(81) {So it it was just you and her, actually?} Yeah. Just me and her. (no-verb answer with just, FRED, LAN_004)

(82) {Were there some other lads living in the farm too?} No only me, and this here labourer. (no-verb answer containing only, FRED, NTT_008)

(83) I don’t know who your mother is! I says, You do, says, The woman with the long plaits. Oh, says, Her. I says, That ’s right. She says, Is she a fortune teller? I says, No, not her, that ’s my mother. She had such long plaits she would sit on. (no-verb subject her after not, FRED, NTT_013)

(84) Yes, me too. (no-verb answer with too, FRED, WIL_005)

In some texts, we find O-form subjects in no-verb statements such as (85), or independent self-forms in picture pronouns such as (86). Finally, pronoun variation is observed in pronouns followed by locative adjuncts, as seen in sentence (87).

(85) In fact I camped at Ayton with Norman when he was a young lad of fourteen years old. Him and a load more of us in tents. (no-verb coord. subject him, FRED, YKS_001)

(86) {Can you tell me anything about W.M.S.52/9 please? [a photograph]} Now, this is a group of drivers in their winter uniform I don’t know many on but the fitter in the centre was Jack Smith the driver behind Bill Roscoe. [...] Harry Butler which in time became the Chief Inspector. {Back row first right.} Myself. {Middle row first right. That was in the thirties. Can you tell me anything about W.M.S.52/10 please? [another photograph]} Now this is the summer uniform. And this was taken in front of the depot. Lionel Lee he was a driver. {First left.} Myself at the back . . . (myself picture pronouns, no-verb, FRED, LAN_023)

(87) We we was we was slitted, so was eh Wall End, you know, and him at eh over at eh, where ’s that eh Grasmere eh he was slitted both ears. (no-verb subject him with locative adjunct, FRED, WES_008)

(88) {And who ’s the driver on the other engine, was it –} Mr Fred Moores, the one that ’s on that there little traction, he on that one there. (no-verb subject he with locative adjunct, picture pronoun, FRED, SOM_014)

Overall, O-form subjects are the default in no-verb contexts. They are used in 78.3% of all pronominal cases whereas S-forms only cover 4.3%. In addition, self-forms account for 17.4% of all no-verb pronouns, as seen in Table 12. In combination, O-forms and self-forms in no-verb utterances account for the highest non-standard proportion of all syntactic functions analysed in this study, with 97.4% in simplex cases and 93.5% in coordination.

Based on the underlying speaker distribution, O-form subjects in no-verb utterances classify as a generalised feature which is used frequently and by a substantial number of speakers (32 speakers in the corpus, mean speaker-case ratio of 1.7). With respect to the individual pronouns, him is the only 3SGm form in no-verb utterances except for two examples with he; her is the only 3SGf form; us the only 1PL form; and them the only 3PL form. In 1SG cases, the speakers used either me or myself, the only exception being the one example with coordinated I in (89).

(89) {Who would milk the cows?} Oh, I and the wife (v ‘laughter’). (coord. I in no-verb answer, FRED, SOM_031)

A comparison of no-verb utterances and tensed clauses provides additional insights on the influence of verb absence and coordination on pronoun choice. While the difference in frequencies between coordinated no-verb vs. tensed cases is not significant (1.2 σ, p = .21), it is highly significant for non-coordinated cases (5.9 σ, p = 3 × 10⁻⁹). The influence of verb absence is thus only noticeable in the absence of coordination. From a wider perspective, this has far-reaching implications for a theory of case assignment in default-case environments, indicating that case in coordination is primarily influenced by coordination, irrespective of the presence or absence of the verb.
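The relation between a difference expressed in standard deviations (σ) and a p-value can be illustrated with a short Python sketch. It assumes a two-tailed test on a standard normal distribution; the study does not state which test was used, so both this choice and the function name `two_tailed_p` are assumptions made for illustration.

```python
import math


def two_tailed_p(z):
    """Two-tailed p-value for a standard normal deviate z (assumed test)."""
    return math.erfc(abs(z) / math.sqrt(2))


# The two sigma values reported in the text for coordinated and
# non-coordinated cases respectively.
for z in (1.2, 5.9):
    print(f"{z} sigma -> p = {two_tailed_p(z):.2g}")
```

Under this assumption the two σ values give p-values of roughly 0.23 and 4 × 10⁻⁹, the same order of magnitude as the figures reported above; the small discrepancies may be due to rounding of the σ values or to a different test.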

Table 12. Case form distribution in no-verb statements and questions (including comparative constructions with verb elision, as in No, him I was going to have, not her (LAN_004), or She ’s told me that, not him (YKS_011); excluding you and yourself, all them, and demonstrative or qualified them as in Just them that were no use (WES_011))

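| no-verb | simplex S-form | simplex O-form | simplex self-form | simplex TOTAL | coord. S-form | coord. O-form | coord. self-form | coord. TOTAL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1SG | 0 | 21 | 7 | 28 | 1 | 19 | 5 | 25 |
| 3SGm | 1 | 3 | 0 | 4 | 1 | 1 | 0 | 2 |
| 3SGf | 0 | 4 | 0 | 4 | 0 | 4 | 0 | 4 |
| 1PL | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 3PL | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| TOTAL | 1 | 30 | 7 | 38 | 2 | 24 | 5 | 31 |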
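The percentages quoted for no-verb utterances follow directly from the TOTAL row of Table 12 (simplex: 1 S-form, 30 O-forms, 7 self-forms; coordinated: 2 S-forms, 24 O-forms, 5 self-forms). The Python sketch below recomputes them; the names `NO_VERB_TOTALS`, `non_standard_share` and `overall_shares` are hypothetical, and one-decimal rounding is an assumption.

```python
# (S-form, O-form, self-form) counts from the TOTAL row of Table 12.
NO_VERB_TOTALS = {
    "simplex": (1, 30, 7),
    "coordinated": (2, 24, 5),
}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def non_standard_share(s_forms, o_forms, self_forms):
    """Share of O-forms plus self-forms among all no-verb pronouns."""
    return pct(o_forms + self_forms, s_forms + o_forms + self_forms)


def overall_shares(rows):
    """Pool simplex and coordinated counts; return the share of each form type."""
    s, o, self_ = (sum(col) for col in zip(*rows.values()))
    total = s + o + self_
    return {
        "S-form": pct(s, total),
        "O-form": pct(o, total),
        "self-form": pct(self_, total),
    }


for structure, counts in NO_VERB_TOTALS.items():
    print(structure, non_standard_share(*counts))
print(overall_shares(NO_VERB_TOTALS))
```

The pooled shares correspond to the 78.3% (O-forms), 4.3% (S-forms) and 17.4% (self-forms) given in the text, and the combined non-standard shares to 97.4% (simplex) and 93.5% (coordinated).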

5.3.5. Relative clauses and clefts

The last syntactic context analysed in this section with regard to variation in case is that of pronouns which are followed by a relative clause. This context, too, has long been discussed in the literature. Wright’s English Dialect Grammar (1905), for example, already included the use of O-forms in cleft sentences such as it was her that did it, and the use of O-forms before relative pronouns, as in him that did that ought to . . . . Among the more recent studies, Beal (2004: 119) mentions the use of third-person non-standard subjects in emphatic utterances such as You know, her that’s always late in the Newcastle Electronic Corpus of Tyneside English (NECTE, http://www.ncl.ac.uk/necte/).

In FRED, speakers generally prefer O-forms before relative pronouns such as that, which, what, who, whom and as. In non-cleft occurrences, we find only 1 S-form, shown in (90), as compared to 13 O-forms, 8 of which have subject function. In cleft sentences, we get a similar distribution, with 2 S-forms (he) as compared to 10 O-forms. Unfortunately, no areal patterns could be detected due to the small number of occurrences.

(90) But, we that thought we were going to have to . . . (subject we before relative clause, FRED, SOM_029)

(91) The only ones as needed water was the fire beater, him as used have to look after t’ furnace, he needed his water . . . (subject him before relative clause, FRED, LAN_020)

(92) Of course us what could see a bit when it was time to go in, we used to get uh anybody with your hand . . . (subject us before relative clause, FRED, NTT_006)

(93) And it was he who also kept the Park in front of the Castle . . . (he in cleft sentence, FRED, WES_010)

(94) . . . she ’s supposed to be living with Gertie, that ’s her that was forewoman at eh Standfast . . . (her in cleft sentence, FRED, LAN_010)

5.3.6. Context hierarchy

Based on the O-form frequencies in the different subject functions analysed above, we can establish a graded hierarchy of syntactic contexts. This hierarchy goes from the highest frequencies in subject complement function (left) to the lowest frequencies in non-coordinated subjects (right). The empirically derived order lends itself to future comparisons with other datasets.

(95) Context Hierarchy of Non-standard Object Forms

subject compl. > no-verb > coord. subject > q-tag > simplex subject

5.3.7. Areal distribution

Object forms are the supraregional default in subject complement function and no-verb utterances. In the latter context, O-forms are distributed across the four dialect areas as follows: 10 SE (100%; O-forms in all no-verb cases), 8 SW (50%; plus 3 S-forms and 5 myself), 8 Mid (88.9%; plus 1 myself), 28 N (82.4%; plus 6 myself). In subject complement function, O-forms clearly dominate in all dialect areas, representing a showcase general feature of spoken English. Their areal distribution is visualised in Figure 7 based on relative weighted frequencies. The additional information to the right side of the chart shows the number of standard deviations between the non-standard frequencies of different dialect areas, where Δ(SE,SW) is simply the difference between the proportion of O-forms in subject complement function in the Southeast as compared to the Southwest. At a significance threshold of p < .05, the differences in frequency between the four areas are not significant.

Figure 7. Areal distribution: O-form subject complements (me, him, her, us, them)
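The areal percentages quoted above for no-verb utterances can be retraced from the counts given in the text. The following Python sketch is purely illustrative: the dictionary layout and the function name are assumptions made for this example, and the zero counts for S-forms and self-forms are inferred from the wording ("O-forms in all no-verb cases", "plus 1 myself") rather than stated explicitly.

```python
# No-verb pronoun forms per dialect area, as quoted in the running text.
# Zeros are inferred from the wording, not given explicitly in the source.
NO_VERB_COUNTS = {
    "SE":  {"O": 10, "S": 0, "self": 0},
    "SW":  {"O": 8,  "S": 3, "self": 5},
    "Mid": {"O": 8,  "S": 0, "self": 1},
    "N":   {"O": 28, "S": 0, "self": 6},
}


def o_form_share(counts):
    """Return the O-form share of all no-verb pronouns, in percent."""
    total = sum(counts.values())
    return round(100 * counts["O"] / total, 1)


shares = {area: o_form_share(c) for area, c in NO_VERB_COUNTS.items()}

for area, share in shares.items():
    print(f"{area}: {share}% O-forms")
```

Summed over the four areas, these counts give 54 O-forms, 3 S-forms and 12 self-forms in no-verb utterances.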

Unlike the distribution of S-form objects and prepositional complements in Figures 5a and 5b, O-form subjects are distributed evenly across the four dialect areas as a supraregional phenomenon. The results obtained in FRED contradict other sources, for instance Trudgill (2004: 148) who states that “[t]he evidence of [Charles Benham’s Essex Ballads] and of the SED records suggests that in southern East Anglia the phenomenon was more restricted than in the southwest. The southwestern usage of him, her, us as subjects does not seem to have been a possibility; we witness merely the use of he, she, we, they as objects.”

Figure 8 shows the individual charts for all O-forms, as well as the overall distribution of O-form subjects across the four areas in the bottom right graph. The values are based on relative weighted frequencies for all O-form proportions in subject function, since the absolute number of pronominal subjects varies for each pronoun and dialect area. The small graphs illustrate the general supraregionality of the individual pronouns. We can, for example, see that the distribution of her in FRED tallies roughly with the corresponding SED findings. In both datasets, her subjects are characteristic of the SW and Mid areas. However, her in FRED is not confined to these areas, whereas the SED suggests its use west of a line running from Portsmouth to south (SED Question IX.7.7.3 ‘she is’).27

A clear predominance of SW cases could only be confirmed for us. This can be attributed to an underlying speaker distribution pattern with two outliers. Two of the SW speakers have exceptionally high O-form proportions in an almost 50–50 distribution of S- and O-form subjects: speaker TCA_EA used 69 we vs. 63 us, and speaker TCA_FK used 24 we vs. 21 us. If we take these two speakers out of the equation, the overall proportion of us in the SW area drops from over 77% to under 1%, so that the differences between the SW and the other areas become statistically non-significant. The influence of these two outliers should also be kept in mind regarding the overall distribution depicted in the bottom right graph.

Figure 8. Areal distribution: O-form subjects (me, him, her, us, them)
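The "almost 50–50" distribution reported for the two outlier speakers can be checked directly from their we/us counts. The short Python sketch below is only an illustration; the data structure and the function name are assumptions introduced here, and only the four counts are taken from the text.

```python
# 1PL subject forms used by the two SW outlier speakers (counts from the text).
OUTLIERS = {
    "TCA_EA": {"we": 69, "us": 63},
    "TCA_FK": {"we": 24, "us": 21},
}


def us_share(counts):
    """Return the proportion of us among a speaker's 1PL subjects, in percent."""
    return round(100 * counts["us"] / (counts["we"] + counts["us"]), 1)


us_shares = {speaker: us_share(c) for speaker, c in OUTLIERS.items()}

for speaker, share in us_shares.items():
    print(f"{speaker}: {share}% us")
```

Both speakers use us in just under half of their 1PL subjects, which is what makes them stand out against the rest of the SW data.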

The most surprising result in this particular analysis is perhaps the geographic distribution of O-form subjects in question tags. As observed above, this syntactic context has default S-forms, but O-forms are used occasionally. Now, a comparison of the four dialect areas shows that all O-forms in question tags appear in data from the SW. There are 27 O-form cases in the SW, which equals 12.8% of all SW cases. Even the most common O-form, them (18 cases, over 17.6% of all 3PL cases in the SW), appears exclusively in this one area. Hence, the distribution of non-standard them in question tags is very different from the distribution of non-standard them in other syntactic contexts, for instance in tensed S–V structures where it is distributed quite evenly across the country.

5.4. Personal pronouns in reflexive function

As mentioned at the beginning of this section, PE is traditionally described as a binary feature involving subject and object case forms. However, occurrences of personal pronouns in reflexive function, on the one hand, and the use of self-forms in different subject and object functions, on the other hand, show that the ‘exchange’ involves three, not two, pronominal categories. The previous analyses showed that the exchange between S- and O-forms is bi-directional. In the following, we will see that this also applies to O- and self-forms. With S-forms, on the contrary, the exchange is uni-directional, given that no occurrences were found of reflexive I, he, she, we or they. The situation in FRED is reminiscent of earlier stages of English when reflexivity could be expressed by O-forms but not S-forms. Typologically, the corpus findings tally with the fact that identical reflexive and object forms are attested in various languages28 whereas “no nominative anaphors” appears to be “a pretty robust tendency” (Kiparsky 2008: 31).

(96) Mind you I was a bit on the safe side, I put a rope round me just, to tension up . . . (reflexive me, FRED, YKS_001)

(97) Well, Boss lived in the houses at the back. He used to use this towel the one week and the next week he had the tail of a shirt for drying him on. (reflexive him, FRED, SAL_013)

(98) No, he said, That isn’t me. Now he couldn’t recognise him [his own voice], he wouldn’t have it! (reflexive him, FRED, CON_006)

(99) But Harold, he was a first bowler, and he got him a trial with fourteen. (reflexive him, FRED, LAN_012)

(100) . . . she ’d, she ’d got hold – to try to save her, she ’d, she ’d got hold of this rail and it broke. (reflexive her, FRED, LAN_006)

Among the O-form reflexives in the corpus are 6 examples with me, 3 with him, 2 each with her and you, and 1 each with us and them, as shown in Table 13. Even if these numbers seem low, O-forms account for almost 4% of all reflexively used pronouns in the data. In about half of all non-standard examples, the pronoun follows a verb which is intransitive in Standard English, such as wash, change or turn. In terms of areal distribution, not enough occurrences were found for any definite conclusion, but at least one example was found in each dialect area (1 SE, 3 SW, 6 Mid, 5 N).

Table 13. O-form reflexives: absolute and relative frequencies

O-form reflexives                  frequency    %
me (of all 1SG reflexives)           6/116     5.2
you (of all 2SG/PL reflexives)       2/87      2.3
him (of all 3SGm reflexives)         3/56      5.4
her (of all 3SGf reflexives)         2/29      6.9
it (of all 3SGn reflexives)          0/16      0.0
us (of all 1PL reflexives)           1/46      2.2
them (of all 3PL reflexives)         1/53      1.9
TOTAL (of all reflexives)           15/403     3.7
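The relative frequencies in Table 13 follow directly from the absolute counts. As a minimal sketch, the Python snippet below recomputes each percentage and the overall total; the variable and function names are assumptions chosen for this illustration, while the counts are those of the table.

```python
# (pronoun, O-form reflexives, all reflexives of that person) from Table 13.
TABLE_13 = [
    ("me", 6, 116),
    ("you", 2, 87),
    ("him", 3, 56),
    ("her", 2, 29),
    ("it", 0, 16),
    ("us", 1, 46),
    ("them", 1, 53),
]


def percent(hits, total):
    """Return hits as a percentage of total, rounded to one decimal place."""
    return round(100 * hits / total, 1)


row_percentages = {form: percent(hits, total) for form, hits, total in TABLE_13}

# Column sums reproduce the TOTAL row of the table.
total_hits = sum(hits for _, hits, _ in TABLE_13)
total_reflexives = sum(total for _, _, total in TABLE_13)
total_percentage = percent(total_hits, total_reflexives)

for form, pct in row_percentages.items():
    print(f"{form}: {pct}%")
print(f"TOTAL: {total_hits}/{total_reflexives} = {total_percentage}%")
```

The recomputed total of 15 out of 403 corresponds to the figure of "almost 4%" cited in the running text.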

This is ourselves, under pressure (Queen and David Bowie, “Under Pressure”)

5.5. Independent self-forms

Self-forms such as myself or yourself are the third formal variant involved in pronoun exchange. This section focuses on their use in different subject and object functions, including subject and prepositional complement positions, as well as no-verb utterances. Self-forms which fill one of these functions are neither reflexive nor should they be confused with standard intensifiers. Different terms have been proposed in the literature, including ‘absolute’, ‘unbound’ or ‘untriggered reflexives’. We will here avoid the term ‘reflexive’ altogether and refer to these occurrences as independent self-forms.29

In the literature, independent self-forms have been described as characteristic of Irish English (e.g., Filppula 1999, 2004; Harris 1993), Scottish English (Miller 1993), and American English vernaculars (Evans and Evans 1957; Heacock 2008). The present study describes their distribution in England, where they represent a general spoken feature.

The occurrences in FRED are historically linked to earlier stages of the language. It appears that, since Old English times, writers have never stopped using independent self-forms. They even appear in texts from the 18th century, a time when the ground was laid for many normative rules still respected today. Tieken-Boon van Ostade (1994: 220) describes the common use of ‘non-reflexive -self pronominals’ in the personal letters of various men and women of the time, both in subject and object functions, and especially in coordination. Similarly, Filppula (2004: 93) observes for Irish English that “although this feature is mainly found in vernacular and colloquial styles, occurrences can be spotted even in ‘educated’ varieties, including written language.”

(101) swa   þu       self      talast
      like  you-NOM  self-NOM  say
      (Beowulf 594, quoted in Mitchell 1985)

(102) Himself drank water of the wel/ As dide the knight Sire Percivel (Canterbury Tales, The Tale of Sir Thopas, quoted in König and Siemund 1997)

In independent uses, self-forms are best understood in pragmatic terms, since they elude explanations of rule-based sentence grammar. Among the different explanations proposed in the literature are emphasis and contrast (e.g., Kuno 1987; Zribi-Hertz 1989), and politeness and modesty (Filppula 1999; Parker et al. 1988; Baker 1995). Another possible explanation seems to be that speakers who feel uncertain about the correct choice of pronoun case, for example in coordinated NPs such as me and him, choose self-forms as a less stigmatised alternative (cf. Emonds 1986: 119).

In discourse grammar there is a shared conviction that “a grammatical theory of English reflexive pronouns cannot be complete without a discourse component” (Zribi-Hertz 1989: 703). Most explanations revolve around the concepts of speaker viewpoint (cf. Cantrall 1974: 94, ‘theory of incorporated identification’), empathy (Kuno 1987: 29, “syntactic manifestations of the speaker’s camera angles”), and logophoricity (Milroy and Milroy 1993: 147, “the reference draws on the shared knowledge of the speaker and hearer”). Filppula (2004: 93) mentions ‘topic reading’, a related concept which is applicable to most FRED examples, especially when an emphatic interpretation is not indicated by stress in the recording: “an absolute reflexive is often used with reference to that person or those persons who constitute the ‘topic’ of the conversation in some way or another.”

Most independent self-forms, including most occurrences in FRED, conform to one of these explanations. They are therefore not to be regarded as mere shortenings of underlying ‘personal pronoun + self-form’ combinations (you yourself → yourself), nor do they automatically correspond in meaning to complex intensified NPs (you yourself = yourself).30

Table 14 shows the absolute and relative frequencies of independent self-forms in different syntactic functions. Taking a closer look, one can detect patterns of context-specific behaviour, as well as individual high non-standard frequencies which are not apparent from the overall total obtained for each pronoun. The results will be interpreted in the following pages.

5.5.1. Subject function

The exchange between self- and S-forms has probably always been uni-directional. In the data, self-forms appear in both subject and subject complement roles, although their use seems restricted to coordinated NPs. The complete absence of simplex cases tallies with observations found, for example, in the Oxford English Dictionary, where the use of non-coordinated independent self-forms is referred to as archaic or poetic (Murray et al. 1961/ 1989/ online edition). The latest OED examples date from the 19th century. They include the poetic use of myself after be, as in What am myself (year 1864); subject himself, archaic, as in The dagger which himself Gave Edith (1864); subject herself in Welsh and Gaelic varieties, as in Herself would . . . seat her down upon some linden’s root (1814; also to ridicule these varieties); subject ourselves, as in Ourself learnt this craft of healing. Were you sick, ourself would tend upon you. (1847); and themselves, archaic or poetic, as in People’s timorousness shows how insecurely grounded themselves are (1853). Even the Dictionary of Contemporary American Usage (Evans and Evans 1957) – according to which self-forms are established beyond question in speech and literature, both in absolute constructions and after be – deems their use as simplex subjects artificial. And in his study of reflexivisation strategies in American English, Saha (1987: 232) even claims that “a reflexive [i.e. 
self-form] can never be the subject of a tensed verb.”

Table 14. Self-form proportions (%) in subject and object functions

myself (of 1SG)   your-f/-ves* (of 2SG/ 2PL)   himself (of 3SGm)   herself (of 3SGf)   ourselves (of 1PL)   themselves (of 3PL)   TOT.

syntactic function: subject; subject, coord.; subject complement; subj. compl., coord.; no-verb subject; no-verb subj., coord.; object; object, coord.; prepositional compl.; prep. compl., coord.; TOTAL

∗ extrapolated from sample

% 0.0 0.2 0.0 0.0 0.1 0.0 0.0 1.0 0.0 0.0 0.4 0.0 0.0 0.1 0.0 4.2 0.4 0.0 0.0 0.0 0.0 0.0 3.4 0.0 0.0 0.0 0.0 16.7 0.0 0.0 0.0 33.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.1 0.0 0.0 5.3 0.0 21.8 0.0 16.1 0.04 0.0 0.1 0.01 0.1 0.01 0.02 0.07 0.02 0.04 0.01 0.04 1.1 0.0 6.3 0.0 6.3 2.1 0.1 0.1 25.6 25.0 20.0 100.0 0.0 0.0 0.0 0.0 20.5

(1/16) (0/21) (0/0) (0/1) (0/0) (0/1) (0/4) (0/2) (0/1) (0/4) (0/1) (1/24) (0/3) (0/30) (1/47) (7/28) (0/4) (5/25) (0/20) (1/1) (0/6) (0/0) (0/4) (0/10) (0/2) (0/4) (0/5) (0/4) (0/1) (1/92) (0/0) (0/1) (0/0) (8/39) (5/31) (7/112) (11/43) (0/4) (1/29) (0/0) (0/13) (1/6) (1/3) (0/2) (0/9) (0/2) (9/170) (0/2) (12/55) (1/1915) (1/2278) (0/1949) (1/754) (1/945) (1/5406) (5/13247) (13/1210) (2/821) (1/941) (4/407) (2/565) (2/1418) (24/5362) (0/38472) (0/29977) (0/18640) (0/5913) (0/16859) (0/21722) (0/131583) (46/41848) (4/33085) (3/21590) (5/7097) (4/18387) (3/28569) (65/150576)

In other varieties such as Irish English, independent self-forms are more readily accepted in subject function. This is, for example, reflected in their regular use in Irish poetry as seen in the Songs of the Glens of Antrim by Moira O’Neill (1933):

(103) Och, when we lived in ould Glenann, Meself could lift a song! (“A Song of Glennan”)

(104) Meself began the night ye went, An’ hasn’t done it yet . . . (“Forgetting”)

(105) Herself ‘ud take the rush-dip an’ light it for us all . . . Himself ‘ud put his pipe down, an’ say the good word more . . . (“Grace for Light”)

Let us take a look at the distribution in FRED. As mentioned above, independent self-form subjects only appear in coordination (Table 14). Even if examples are rare, there are some tendencies to observe. First of all, myself seems more likely to occur as a coordinated subject than any other self-form. Secondly, the component structure of the coordinated NPs shows that all self-form subjects stand in non-initial position. And last but not least, occurrences can be found in all four dialect areas.

(106) I expect Mr Boobyer and myself have been in every rhyne there is in the moor that belongs to the Drainage Board. (coord. subj. myself, FRED, SOM_004)

(107) I know as a boy I ’ve seen them around here, my granny and himself would get a big chunk of beef . . . (coord. subj. himself, FRED, CON_006)

(108) . . . there was a place put on one side walled in where all the neighbours and ourselves used to put all our ashes . . . (coord. subject ourselves, FRED, SAL_005)

Emphasis seems to be a possible explanation for occurrences like (106)–(108), but in a slightly different sense than suggested in the literature. Let us consider the following idea. It is commonly accepted that conjunction naturally emphasises the difference between two or more entities involved, whereas plurality emphasises the similarities between them. Hence:

(109) a. John, Bill and Tom have gone home. (conjunction: different boys)

b. The boys have gone home. (plurality: all boys)

These two examples were taken from Bhat (2004: 95), according to whom “[o]ne cannot use conjunction if no difference is indicated . . . whereas one cannot use plurality if differences cannot be disregarded.” If conjunction, by its very nature, emphasises the differences between conjunct elements, and if the use of self-forms is generally associated with emphasis, this explains the speaker’s predilection for self-forms in coordination. It also explains why, in many cases, the self-form itself does not carry extra stress: it is a variant that simply fits into the natural semantics of conjunction.

5.5.2. Third person cases

Special attention has been given to third person independent self-forms in the literature. Levinson (1997), for instance, has deemed them unacceptable, similar to Jespersen before him. An explanation is given in Levinson’s ‘Performative Hypothesis’ which states that

every sentence has as its highest clause in deep or underlying syntactic structure a clause of the form [‘I (hereby) V_p you (that) S’] – i.e. a structure that corresponds to the overt prefix in the explicit performative, whether or not it is an overt or explicit performative in the surface structure. (Levinson 1997: 247)

Thus, sentence (110-a) is judged acceptable, since it derives from (110-c), whereas (110-b) is not.

(110) a. Solar energy was invented by God and myself.

b. *Solar energy was invented by God and herself.

c. [I tell you that] solar energy was invented by God and myself.

Levinson is not alone in this line of argument. In the linguistic literature it is common practice to distinguish first/ second person pronouns from third person pronouns, based on the fact that only the former denote speech act participant roles. Some linguists even exclude third person from the semantics of person altogether by regarding it as the absence of person (e.g., Harley and Ritter 2002b: 26; Richards 2008: 140). Another alleged difference between first/ second person and third person is the stronger need for reference disambiguation with third person pronouns, i.e. the need to distinguish co-reference in sentences such as John_i admires himself_i from disjoint reference in sentences such as John_i admires him_j. According to König and Siemund (1997: 102), this distinction is a highly likely motivation for a distinctive reflexive paradigm.

As logical as all of these arguments sound, the data in the present study show that the claimed differences between first/ second and third person pronouns are not necessarily reflected in the degree of variation which the different pronouns allow for in actual conversations.31 Rather, the overall results support the following statement by Newmeyer:

The amount of formal ambiguity that one finds in language is enormous and ‘usefulness’ is such a vague concept that it seems inherently undesirable to base an explanation on it. In any event, it is worth asking how much ambiguity is reduced by a 3rd person reflexive anyway. (Newmeyer 2004: 535)

5.5.3. Subject complement function

More frequent than independent self-forms in subject function are independent self-forms in subject complement function (see Table 14). Here, too, almost all occurrences were found in coordinated NPs, almost all have myself, and most of them are non-initial. In quantitative terms, the difference in frequency between simplex and coordinated cases is highly significant (χ²: 3.65σ, p = .0003). Similar to self-form subjects, examples were found in all four dialect areas.

(111) . . . we have the cowman and two two more men and then there ’s John and Michael and miself and a girl. (coord. subj. compl. myself, FRED, NTT_015)

(112) Let me see, there was Mr Scott isn’t it, suppose there was about eight men in the accounts and then there was two other girls and miself. (coord. subj. compl. myself, FRED, WIL_019)

5.5.4. No-verb utterances (2)

Independent self-forms appear most frequently as subjects of no-verb statements and questions (only one object, see (116)). In the corpus, this context alone accounts for over 20% of all independent self-forms. Despite the fact that the absolute numbers are not very high, absence of V appears to play a significant role. Similar to the previous functions, most occurrences have myself, and their use is supraregional (SW, Mid, N). Among the corpus examples are several picture pronouns, a phenomenon which cannot be discussed separately in this study.32 One example where the speaker was asked to comment on different photographs during the interview was previously shown in (86).

(113) {When you started, what age did you leave school then?} Myself, fourteen. (subject myself in no-verb answer, FRED, WES_003)

(114) {And they heard that you bred pigeons?} Yes, oh, George Coleman and myself, George used to do the sending off. (coord. subject myself in no-verb answer, FRED, DEV_005)

(115) {Would you yourself hunt them back or did you get somebody?} No, yourself, after you buy them it ’s you it ’s up to. (subject yourself in no-verb answer, FRED, SOM_030)

(116) Very good little school. We had three teachers, Miss Routledge the head, Miss Thomas the second, and Miss Thackeray the third one who taught the small ones. Myself to start with. (no-verb object myself, FRED, WES_016)

5.5.5. Object function

In the literature, the use of self-forms in object function is less frequently mentioned, and therefore less frequently condemned, than in subject function. In coordination, especially in final position, self-form objects are usually considered acceptable (e.g., Murray et al. 1961/ 1989/ online edition; Evans and Evans 1957).

Independent self-form objects are rare in the corpus. We find two cases with myself and one case each with yourself, herself, ourselves and themselves. Although the number of examples is small, it suffices to confirm the status of self-form objects as possible linguistic variants, especially because the different occurrences were produced by different speakers. A null hypothesis defined as self-forms never being used as independent objects can be rejected at a significance of 2.45σ (p = .014). Once more, examples can be found throughout the country.

(117) Well, alas believe meself the wrongs we ’ve done. (object myself, FRED, NTT_013)

(118) And there ’s a story to this. Might interest yourself. (object yourself, FRED, LAN_012)

(119) If there ’d ha’ been any sea at all he ’d ha’ drowned herself. (object herself, referent ‘ship’, FRED, SFK_005)

(120) . . . she did keep ourselves, and she did do all that in addition to her ordinary, you know . . . (object ourselves, FRED, SOM_010)

5.5.6. Prepositional complement function

In absolute terms, independent self-forms are most frequent in prepositional complement function. Compared to the large overall number of pronominal prepositional complements, however, they represent a relatively rare phenomenon (0.4%, see Table 14). The examples from our data have historical precursors in earlier stages of the language as seen in example (121).

(121) Sittan  læte  ic  hine     wiþ   me      sylfne
      remain  let   I   him-ACC  with  me-ACC  self-ACC
      ‘I let him stay with me’
      (Junius, Genesis 438, quoted in Van Gelderen 2000a)

Overall, 24 occurrences were found in the data. This number only includes occurrences which do not comply with the broad definition of ‘reflexive’ given above (pronouns coreferential with the preceding subject and bearing a beneficiary or recipient thematic relation). Despite this relatively small number, a null hypothesis defined as self-forms never being used in prepositional complement function can be rejected at a significance of 2.4σ (p = .02). Similar to the other functions described above, we find various instances of myself (13), but also himself (1), herself (4), yourselves (2), ourselves (2) and themselves (2). In prepositional complement function, too, independent self-forms appear throughout the country.

Perhaps surprisingly, given the general rejection of non-coordinated independent self-forms in the literature, all examples in the corpus are non-coordinated. Once more, emphasis and contrast present possible explanations. In particular, the fact that self-forms appear to be favoured in comparative or contrastive utterances indicates that they can be used to emphasise a specific verb argument as opposed to other arguments in the same sentence. This is of course most obvious after prepositions such as as, besides, but, like and than. One of the few non-contrastive examples from the corpus is shown in sentence (123).

(122) . . . I’ll take you around different parts of the island. Which was very nice of him ‘course he ’s like miself now, he must be getting on in years. (prep. compl. myself after like, FRED, DEV_005)

(123) . . . and I side-stepped the manager because he was a bit apprehensive about miself and asked if I could, you know, get the job. (prep. compl. myself, FRED, WIL_015)

(124) Yes, one brother and then myself and three sisters, younger than myself. (prep. compl. myself after than, FRED, WES_005)

(125) . . . and she used to have to go in there at six o’clock and nobody else in but herself. (prep. compl. herself, FRED, NBL_006)

(126) They ’d tell the Lord everything that was wrong, not only with theirselves, not only with the church, . . . (prep. compl. themselves after not, FRED, SAL_023)

(127) She make a bit of butter sometimes, she ’d save enough make a pound or a couple, you know, just for ourselves. (prep. compl. ourselves after just, FRED, SOM_010)
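The significance statements in sections 5.5.3, 5.5.5 and 5.5.6 pair a number of standard deviations with a p-value (3.65σ with p = .0003, 2.45σ with p = .014, 2.4σ with p = .02). The underlying test procedure is not spelled out here, so the following Python sketch rests on an assumption: that each σ value is a standard normal score and that the p-value is the corresponding two-sided tail probability. Under that assumption the reported pairs are mutually consistent. The function name is chosen for this illustration only.

```python
import math


def two_sided_p(z):
    """Two-sided tail probability of a standard normal score z.

    Assumes the reported sigma values are z-scores; this is an
    interpretation, not a procedure documented in the text.
    """
    return math.erfc(z / math.sqrt(2))


# (sigma, reported p, decimal places used in the report)
REPORTED = [
    (3.65, 0.0003, 4),  # simplex vs. coordinated subject complements
    (2.45, 0.014, 3),   # independent self-form objects
    (2.4, 0.02, 2),     # independent self-form prepositional complements
]

for z, reported, digits in REPORTED:
    computed = round(two_sided_p(z), digits)
    print(f"{z} sigma: computed p = {computed}, reported p = {reported}")
```

A one-sided reading would halve each value and would no longer match the figures given, which is why the two-sided interpretation is adopted in this sketch.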

5.6. A revised definition of pronoun exchange

Pronoun exchange is a multi-faceted phenomenon which consists of different sub-features with different distributional tendencies. Overall, it appears that variation in case does not impinge on the functional interpretability of personal pronouns or self-forms as long as they maintain an identifiable position within the sentence structure.

Nevertheless, the distribution of formal variants is influenced by a variety of different determinants. First and foremost, the degree of variation observed between S-forms and O-forms, on the one hand, and S-/ O-forms and self-forms, on the other hand, varies in different syntactic functions as summarised in the Context Hierarchy of Non-standard Object Forms in (95). One highly influential factor is verb absence: the highest non-standard proportions of all syntactic functions analysed in this study were found in no-verb utterances. The influence of verb absence becomes clear in the absence of coordination, another important determinant.

Coordination can be attributed special importance in pronoun variation. In coordinated NPs, the non-standard frequencies are demonstrably higher than in simplex NPs. This applies to O-form subjects, S-form objects and prepositional complements, and, in particular, self-form subjects and subject complements. The pronoun’s position within the coordinated structure can be decisive, too, as was shown by the preference for independent self-forms in non-initial position and the contrast between ‘me and’ and ‘and I’. Two pronouns with particularly high non-standard frequencies in coordinated NPs are me and myself.

In addition to syntactic determinants, pronoun variation can be triggered by pragmatic and discourse-structuring factors. Common explanations for the use of independent self-forms, in particular, are emphasis and contrast.

Considering the many different influential factors which make PE such a complex phenomenon, it is difficult to sustain any general statements, either in terms of explanations or in terms of distributional tendencies. It can, however, be observed that all of the individual sub-features appear throughout the country, i.e. in the four dialect areas specified in the England component of FRED (compare Figures 5, 7 and 8). This makes PE a supraregional phenomenon of spoken English, in contrast to the descriptions found in earlier accounts.

A classification of PE as a Southwestern phenomenon is only justified for non-standard S-forms, but even here examples can be found in other parts of the country. The particular areal pattern of non-standard S-forms, as opposed to the much more level distribution of non-standard O-forms and self-forms, explains why the former sub-feature is generally perceived as more dialectal and why it is usually given more attention in the literature. In the present study, the detailed analysis of different pronominal variants and syntactic functions showed that characteristic properties of one sub-feature of PE do not necessarily apply to the phenomenon as a whole.

After a careful re-evaluation of the different PE sub-features, a revised definition of PE is proposed which extends the traditional binary system visualised in Figure 9 to the ternary system shown in Figure 10. This revised definition includes self-forms as a third formal variant. It extends the functional range of PE into further syntactic contexts, including no-verb utterances, reflexive occurrences and also for–to and ECM constructions (not discussed here). Furthermore, this revised definition acknowledges the supraregional status of PE as a general feature of spoken English.

(128) Revised definition of Pronoun Exchange

Pronoun exchange is a multi-faceted phenomenon which includes different non-standard uses of subject, object and self-forms, in subject, object and reflexive functions. Variation can be observed between S-forms and O-forms, as well as between S-/ O-forms and self-forms. The exchange is bi-directional between all variants except between S-forms and self-forms, due to the absence of S-form reflexives. Pronoun variation can be triggered by syntactic factors such as coordination and verb absence, as well as pragmatic and discourse-structuring factors such as viewpoint, topicality, emphasis and contrast. While individual sub-features may be regionally restricted, PE as a whole represents a supraregional phenomenon of spoken English.

Figure 9. Binary PE system, traditional definition

Figure 10. Ternary PE system based on FRED results

A secret, kept from all the rest between yourself and me. (Lewis Carroll, Alice’s Adventures in Wonderland, ch. 12)

6. Case variation in prepositional phrases

In the previous section on pronoun exchange, prepositional phrases (PPs) were described as one of various syntactic environments where variation can be observed between S-forms, O-forms and self-forms. This section now looks at pronominal PPs in more detail, giving special attention to factors which are known to favour non-standard case. After a general description summarising the findings for S-forms and independent self-forms presented in section 5, the focus will be on comparative PPs with as, like and than, as well as on so-called snake sentences.

6.1. Subject and self-forms in pronominal PPs

As might be expected, the vast majority of complements in pronominal PPs are O-forms, especially in non-coordinated cases. The two other morphological variants, S- and self-forms, show some interesting differences. To begin with, S-forms in prepositional complement function emerged during the Early Modern English period, representing a more recent development than the use of independent self-forms in the same function since Old English (compare section 5).

In the FRED data, the use of S-forms after prepositions is most noticeable in the Southwestern counties and affects all pronouns to a similar extent (Table 10, Figure 5b), whereas the use of self-forms shows no regional clustering and is largely restricted to myself (the self-form which is also most likely to occur in other non-standard functions; Table 14).

Case variation in pronominal PPs is a low-frequency phenomenon, yet this particular syntactic context appears to favour variation: in non-coordinated pronouns, S-form proportions are slightly higher after prepositions than in object function, and self-forms also appear more frequently after prepositions than in object or subject function. Here are some examples repeated from section 5:

(129) ... but he never interfered with I, but anybody else who came down here, he ’d go for. (prep. compl. I, FRED, SOM_020)

(130) There was ten years between my sister and I see, she died last year, she was eighty-nine. (coord. prep. compl. I, FRED, WIL_023)

(131) . . . and I side-stepped the manager because he was a bit apprehensive about miself and asked if I could, you know, get the job. (prep. compl. myself, FRED, WIL_015)

(132) They ’d tell the Lord everything that was wrong, not only with theirselves, not only with the church, . . . (prep. compl. themselves, FRED, SAL_023)

As mentioned above, the use of S-forms in prepositional complement function is often associated with emphasis or is described as restricted to coordination.33 Even if counter-examples do not abound in the data, evidence from FRED contradicts any categorical descriptions of this kind. Regarding the role of specific prepositions, the corpus examples also fail to support the common perception that “the most common manifestation of this error [i.e. S-forms in prepositional complement function] contains the preposition between” (Riley and Parker 1998: 41). While this specific point still needs to be investigated in more detail, the present analysis did not return more instances with between than, for example, with for. The possibility for speakers to vacillate between different variants, even within the same sentence or paragraph, is illustrated by the consecutive use of between you and between yourselves in the following example:

(133) And when you ’d done that perhaps you ’d have three in the night to clean between you, and you ’d settle the work up between yourselves using cleaning oil . . . (prep. compl. you and yourselves in same sentence, FRED, SAL_011)

6.2. ‘He’s as tall as me’ – as, like, than: prepositions or conjunctions?

Grammarians generally agree that prepositional complements should always be encoded by object forms (come with me/ give it to him). Opinions vary, however, with regard to comparative constructions involving as, like and than, since these expressions are sometimes classified as prepositions and sometimes as conjunctions. In the first classification, the pronoun following the preposition is a complement within the PP. It hence requires object case as shown in (134-a). In the alternative classification, the pronoun following the conjunction is the subject of a VP with verb ellipsis, requiring subject case as shown in (134-b).34 For the purpose of this study, contrastive expressions such as as, like and than will be treated as prepositions.

(134) a. You are taller [PP than [NP him]].
      b. You are taller than [VP he (is)].

This dispute dates back as far as the 18th century, when comparative PPs were already one of the contexts discussed with regard to pronoun case (cf. Leonard 1929 (1962, reissued): 263–264). According to Jespersen (1949: 227–228), “the existence of such two-sided words as but, etc, is one of the primary causes of mistakes of me for I or vice versa, and careless uses of the cases generally.” Jespersen also mentions that “the conjunctions as and than, used in comparisons, give rise to similar phenomena . . . the feeling for the correct use of the cases is here easily obscured, and he is used where the rules of grammar would lead us to expect him and inversely” (Jespersen 1949: 231).

Part of the problem is the speakers’ uncertainty with regard to pronoun choice, caused by contradictory rules and recommendations. Even today, the natural use of me in sentences like (134-a) is still branded incorrect in prescriptive grammars and educational institutions. At the same time, the use of I without a verb to follow continues to be promoted, leading speakers who feel uneasy about this rule to “add a superfluous verb more frequently than people of other nations in such sentences as ‘He is older than I am.’ ” (Jespersen 1949: 264).

For many decades, speakers striving to use ‘good English’ have been confronted with conflicting views. In his guide to Good English, for example, Vallins (1952: 73) stresses that than is a conjunction, not a preposition. He attributes the popularity of sentence-final me after comparative expressions to the fact that me, as he sees it, carries stress better than I. Since stress in English usually falls on the last syllable of a sentence, Vallins argues, speakers tend to put sentence-final pronouns in the objective case. In a different Guide to Correct English, Stratton (1949: 160) insists that “[b]etween is a preposition; pronouns after prepositions must be in the objective case.” Stratton therefore concludes that “no matter who says it ‘between you and I’ is always wrong. So are ‘between him and I,’ ‘between them and I,’ ‘between them and we.’ Only correct are ‘between you and me,’ ‘between him and me,’ ‘between them and me,’ ‘between them and us.’ ”

Although the present study is not interested in aspects of ‘correctness’ or ‘well-formedness’, it is important to note just how much confusion such controversial statements can cause, and to be aware that extra-linguistic factors may influence speech production to a considerable extent. In addition, the same uncertainty may also be responsible for the appearance of self-forms as an avoidance strategy, i.e. in order to avoid having to choose between subject and object case. According to Jespersen (1949: 172), “[s]ometimes the self-pronoun stands by itself; this is especially found in groups with and, and after as, like and than.”

A different explanation for the appearance of self-forms after comparative expressions like as and than is emphasis, or rather contrast. Emphatic pronouns can be used to direct the hearer’s attention towards one specific referent in the discourse, or they can be used to stress a particular difference between two or more referents. Zribi-Hertz (1989: 699), for instance, found that emphatic uses may trigger an alternation between pronouns and anaphors (self-forms) in violation of binding principle A as described in section 1.2. This can happen in multiple-foci structures or closed sets where a certain NP expresses a dominant role. Zribi-Hertz distinguishes between three types of structures: conjunctive (i.e. coordinated), disjunctive, and comparative, as shown in the following three examples:

(135) a. John believes that letter was sent to both him and Mary/ Mary and himself. (conjunctive)
      b. John believes that letter was sent to no one but him/ himself. (disjunctive)
      c. John thinks that Mary is taller than him/ himself. (comparative)

In our data, more than half of all independent self-forms in pronominal PPs form part of comparisons. They either appear after obviously comparative expressions such as as, besides, but, like or than, or in comparative sentences after prepositions like about, for, of or with. Before we take a look at the quantitative distribution of the different case forms, consider the following examples containing different comparative expressions. Other examples with just, only, not, too and but were shown in sections 5.3.4 and 5.5.6.

(136) Well mi brother and I, Fred, he was older than I ... (prep. compl. I after than, FRED, NTT_011)

(137) I mean mi sisters knew, they were about four years older than me, both of them . . . (prep. compl. me after than, FRED, YKS_006)

(138) I might be in the same boat as him. (prep. compl. him after as, FRED, LND_001)

(139) {So your father would have had his team of two helpers, two under-gardeners} The two gardeners, much, nearly the same age as himself ... (prep. compl. himself after as, FRED, WES_012)

(140) ... my mother would take me with her, now she had, she had as I said nine of family besides herself ... (prep. compl. herself after besides, FRED, YKS_007)

(141) . . . small places like you didn’t have a lot cattle to sell. No not like we here one, last year, year before we sold forty up Camborne in one day. (prep. compl. we after like, FRED, CON_005)

(142) Well, one of them didn’t marry a Catholic like us, so she got married in the registrar office . . . (prep. compl. us after like, FRED, LAN_009)

(143) Mother and Dad, they used to play draughts quite a lot. He was very keen on draughts, and Mother was quite good. She could play quite well, too. And that used to be uh, and I used to play, but, you know, I couldn’t play as well as they. But uh that used to put through a lot of the evenings. (prep. compl. they after as, FRED, CON_011)

(144) ... if Mrs Thatcher gets Prime Minister, she gets same pay as them. (prep. compl. them after as, LAN_012)

Compared to other prepositions in the corpus, the distribution of pronoun case after as, like and than indicates that comparative contexts favour variation. Table 15 shows a breakdown of all pronouns after as, like and than by pronoun case, as well as all non-comparative occurrences. The totals in the last two rows show a comparatively higher proportion of non-object case forms in the first group (S-forms and self-forms together account for 10.7% of the 197 comparative tokens, as against 1.3% of the 4374 non-comparative tokens). In addition, a closer look at the individual pronouns contradicts those linguistic authorities which assume a fundamental behavioural difference between me and third-person pronouns him or her. Merriam Webster’s Online Dictionary, for example, states in the dictionary entry for ‘than (preposition)’ that “me is more common after the preposition [than] than the third-person objective pronouns.” No such tendency is supported by our own findings.35

Table 15. Case form distribution after as, like and than (non-coordinated)

pronoun case after . . .    S-forms      O-forms         self-forms    TOTAL
as                          3 (7.7%)     34 (87.2%)      2 (5.1%)      39 (100%)
like                        3 (9.1%)     24 (72.7%)      6 (18.2%)     33 (100%)
than                        6 (4.8%)     118 (94.4%)     1 (0.8%)      125 (100%)
TOTAL {as, like, than}      12 (6.1%)    176 (89.3%)     9 (4.6%)      197 (100%)
TOTAL, non-comparative      46 (1.1%)    4318 (98.7%)    10 (0.2%)     4374 (100%)
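The proportions in Table 15 can be recomputed from the raw counts. The following is a minimal sketch in Python rather than part of the original analysis: the dictionary name TABLE_15 and the helper functions are illustrative assumptions, and only the counts themselves are taken from the table.

```python
# Counts copied from Table 15 (non-coordinated pronouns after prepositions).
# Order of the counts in each tuple: S-forms, O-forms, self-forms.
# The names TABLE_15, shares, pooled and non_object_share are illustrative only.
TABLE_15 = {
    "as": (3, 34, 2),
    "like": (3, 24, 6),
    "than": (6, 118, 1),
    "non-comparative": (46, 4318, 10),
}


def shares(counts):
    """Return each count as a percentage of the row total (one decimal place)."""
    total = sum(counts)
    return tuple(round(100 * c / total, 1) for c in counts)


def pooled(rows):
    """Add up the counts of several rows column by column."""
    return tuple(sum(col) for col in zip(*(TABLE_15[r] for r in rows)))


def non_object_share(counts):
    """Percentage of S-forms and self-forms taken together."""
    s_forms, _o_forms, self_forms = counts
    return round(100 * (s_forms + self_forms) / sum(counts), 1)


comparative = pooled(["as", "like", "than"])

for label, counts in [("as/like/than", comparative),
                      ("non-comparative", TABLE_15["non-comparative"])]:
    print(label, counts, shares(counts), non_object_share(counts))
```

Pooling S-forms and self-forms into one non-object share is a simplification made here for the comparison of the two bottom rows; it says nothing about statistical significance, which would require a separate test.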

From a wider perspective, comparative PPs match the default case environments described by Schütze (2001). These are syntactic environments which are characterised by a general predominance of O-forms (in Table 15, 89.3%), accompanied by a relatively high proportion of other case forms, i.e. S-forms (here 6.1%) and independent self-forms (4.6%). As with other elliptical constructions, it could be argued that pronoun case in sentences with understood predicates is not determined by the syntax, and that speakers therefore choose default case as seen in most examples in the corpus.36

In the generative account of Emonds (1986), subjects with understood predicates are not immediate constituents of a sentence containing an inflected verbal element, i.e. they are not governed by Infl, thus permitting the use of object case. The corresponding sentence structure is exemplified in Figure 11.

Figure 11. Tree structure for subjects with understood predicates (Emonds 1986: 99)

Based on our own results, we can say that as, like and than favour the use of non-object case more than other prepositions. However, the difference is effectively not very strong, given the speakers’ predilection for O-forms in both environments. It thus makes sense to argue that as, like and than are generally treated like prepositions, and that additional variation in case is induced by emphasis and contrast, but also uncertainty and hypercorrection. These are the same factors which explain non-standard occurrences in various other elliptical environments, including no-verb utterances (sections 5.3.4, 5.5.4) and coordination (section 5.3.2).

6.3. Snake sentences

A different environment which appears to be conducive to variation in case is that of locative PPs. Here, variation mainly concerns the use of O-forms and self-forms after prepositions indicating a location relative to the subject, such as near or behind. An extensive list of such prepositions can be found in Jespersen (1949: 165–167), including above, beneath, over, about, around, before, in front of, behind, after, within, upon and with, directional for, to and towards, as well as reciprocal between and among. In the following, constructions with these prepositions will be referred to as snake sentences, a term which owes its popularity to examples like Mary saw a snake near her or Bill found a snake near him (e.g., Koktova 1999: 252; Huang 2000: 23; Haspelmath 2008: 55).

Snake sentences are usually discussed in connection with anaphor binding. Consider example (145) where the sentence subject is coreferential with the pronoun inside a locative PP. The acceptability of prepositional complements which are not marked for coreference in similar examples tallies with the typological tendency that direct objects are more likely to be specially marked than adjuncts or indirect objects (cf. Van Gelderen 2000b). English, in this respect, behaves similarly to languages with optional coreferentiality-marking, such as Dutch (see (147)), but differently from languages where the use of a reflexive form is required. One such language is German, as seen in (146). According to Faltz (1985: 100) languages like German have a Strict Clause Condition (SC) specifying that “a reflexivization rule must apply whenever the distance between the [coreferential] NPs is smaller than a clause (i.e. whenever the NPs are in the same clause).” Hence, the German coreferential pronoun in (146) is an overt reflexive sich.

(145) a. Mary_i saw a snake near her_i.

b. Mary_i saw a snake near herself_i.

c. Mary_i saw a snake [which was] near her_i.

(146) Sie_i sah eine Schlange neben sich_i/*ihr_i.
She saw a snake besides REFL/DAT

(147) Zij zag een slang naast zich_i/haar_i.
She saw a snake next to REFL/OBJ

According to Faltz, snake sentences in English

may have arisen through a confusion between the interpretation of near him as a locative in the main clause and the interpretation of it as a reduced relative clause on the direct object. In the second interpretation (which is still possible) the coreferent nonreflexive pronoun really is in a different clause, hence it is to be expected because of the Clause Mate condition. (Faltz 1985: 102)

This Clause Mate Condition (CM) states that a reflexivisation rule can only apply if the two coreferential NPs are in the same clause – which in many cases coincides with the Governing Category. Reflexivisation is therefore prevented if we assume that the locative PP is a reduced relative clause of the type shown in (145-c) (similar, in a way, to the elliptical construction in (134-b)). Unfortunately, the CM condition makes sense in some examples but is difficult to implement in others, as can be seen in (148) and (149) (*They placed the guns which were in front of them./*John hid the book which was behind him).

(148) a. They_i placed the guns in front of them_i.

b. They_i placed the guns in front of themselves_i.

(149) a. John_i hid the book behind him_i.

b. John_i hid the book behind himself_i.

Based on the two conditions identified by Faltz, the observed variability in English could be attributed to rivalling interpretations: the subject and anaphor are in the same clause (she saw a snake near herself), or the subject and anaphor are in different clauses (she saw a snake [which was near her]). Either way, the problem remains that an underlying relative-clause structure can only be argued for in some utterances.

6.3.1. Discourse perspective

An entirely different analytical perspective focuses on the discourse function of morphological variants in sentences like (145), (148) and (149). Cantrall has argued that pronoun case can be used to indicate a change in the speaker viewpoint. It could, for example, be argued that Mary is described as seeing the snake from an outside perspective in (145-a), but from an inside perspective in (145-b). A subtle change in meaning is also illustrated in (148-a) and (148-b) (cf. Cantrall 1974). Here, it could be argued that the act of placing the guns in the first sentence is described from the viewpoint of somebody who does not belong to the group referred to by they, as opposed to somebody who forms part of they in the second sentence. In Cantrall’s words, the antecedents in (145-b) and (148-b) are “being asserted to be involved in the recognition of the co-reference” (Cantrall 1973: 46–47).37

Subsequent studies have proposed approaches similar to Cantrall’s key concept of viewpoint. Kuno (1987: 206), for instance, bases his explanations on the concept of empathy, which he defines as “the speaker’s identification, which may vary in degree, with a person/ thing that participates in the event or state that he describes in a sentence.” For Zribi-Hertz, it is the Subject of Consciousness, not the syntactic subject, which has the strongest influence on pronoun choice. This semantic property is assigned “to a referent whose thoughts or feelings, optionally expressed in speech, are conveyed by a portion of the discourse” (Zribi-Hertz 1989: 711).38 In (149), for example, the difference in meaning is attributed to the fact that John is the subject of consciousness in (149-b) but not in (149-a). While sentence (149-b) can be understood as the book being very close to John, (149-a) simply states that the book is behind John. Thus, “it is the structural properties of pronouns that are, in a sense, derived from their discourse properties” (Zribi-Hertz 1989: 705).

Discourse grammar has changed the perception and understanding of snake sentences by breaking free from syntactic justifications solely based on the antecedent–anaphor relation. Instead, the focus has shifted to the extra-linguistic referent and the speaker–referent relation. Based on discussions of self-forms in snake PPs, inter alia, the binding domain of anaphora has been extended indefinitely “by assuming that NPs can be bound by NPs in any Ss provided that a certain ‘command’ relationship is met and that a set of semantico-syntactic conditions is fulfilled” (Kuno 1987: 74).39

6.3.2. Self-directed vs. other-directed verbs

An additional aspect in the discussion of snake sentences is the directedness of the verb, i.e. whether a certain verb is interpreted by default as self-directed or other-directed. This concept probably goes back to Jespersen (1949). König and Siemund (1997: 103) have shown that the distinction between self-directedness and other-directedness can be used to explain the morphological case marking in typical snake sentences. Considering the meaning of put in sentences (150) and (151), for example, it can be argued that ‘to put something somewhere’ usually involves an other-directed change of location, and that self-directedness, therefore, needs to be marked in (150). In (151), on the other hand, the action described by the verb is naturally self-directed. No re-directing is required and the anaphor can stay unmarked.

(150) Mary put the book behind herself.

(151) Mary put all problems behind her.

Similarly, Kiparsky (2008: 42) assumes that morphological variants such as those in (152) and (153) mark a structural difference. In his view, the pronoun needs to be marked as a reflexive if the PP is a locative argument as in the first sentence, and “if and only if a referential expression can be substituted for it” (e.g., John aimed the gun at Bill). The pronoun stays unmarked if the PP is part of the predicate, as seen in the second sentence (compare *John brought the gun with Bill).

(152) John_i aimed the gun at *him_i/himself_i.

(153) John_i brought the gun with him_i/*himself_i.

6.3.3. Corpus results

The corpus results corroborate the outlined approaches only in part, and only where coreferential pronouns are unmarked. In accordance with Cantrall’s concept of viewpoint, for example, third-person referents not involved in the recognition of the co-reference are always encoded by object forms. This can be seen in sentences like (154)–(156) where the respective referents and their actions are described from an outside perspective.

(154) . . . sleep anywhere, oh Christ aye, I seen them many a time sleeping under that bloody bridge, with papers all over them, and sacking and one thing and another, like that. (O-form in locative PP, FRED, NTT_013)

(155) . . . and halfway through the match it started to rain and mi father always got a tarpaulin sheet and they were sitting in the back with a tarpaulin sheet over them. (O-form in locative PP, FRED, SAL_026)

(156) And Fred he ’d they ’d always got a mechanic round them, he says ... (O-form in locative PP, FRED, NTT_014)

Other cases such as (157)–(160) correspond to the descriptions of Kiparsky and of König and Siemund, showing naturally self-directed verbs as well as instances where the PP is part of the predicate with unmarked forms. Also note the colloquial use of overt anaphora with verbs like look behind and leave behind.

(157) And from there onwards he never looked behind him, because he sold it to an American. (FRED, WES_017)

(158) And the girls ran out when this bang came. And they carried their knitting with them. And they carried their knitting with them, and left the wool behind them, and it was all a network of wool. (FRED, WES_019)

(159) Well, we had a cow dog with us, called Sharpy, he used to round them up if they start strolling off, you know . . . (FRED, SOM_011)

(160) Some might take Old Dr Watson’s Tonic Stout with them, that we used to make. (FRED, WIL_009)

The approaches outlined are not in accordance with those utterances where we would perhaps expect to find a self-form anaphor, but where coreferentiality and self-directedness are not overtly marked. Let us consider the role of viewpoint and empathy first. Events or actions in which the speaker has a central role, for example as the agent or benefactive of the verb, naturally describe an inside perspective and the highest possible degree of speaker–referent identification. In examples (161)–(163) and in (96), repeated here for convenience as (164), however, none of the speakers marked the coreferential pronoun as reflexive, despite the inside perspective and the high degree of identification in each utterance. This is also the case with third-person examples such as (165).

(161) . . . you used to throw so much coal off the [coal] face, take that bit of coal off, and you used to timber up in front of you, and you used to go a bit further in and take a bit more coal out, then timber again ... (FRED, DUR_001)

(162) ... I had crowds around me, I couldn’t half belt them. (FRED, DUR_003)

(163) When we came out, you couldn’t see your hand in front of ‘ee ‘cause ‘t was so dark. (FRED, SOM_032)

(164) Mind you I was a bit on the safe side, I put a rope round me just, to tension up . . . (reflexive me, FRED, YKS_001)

(165) And the eel comes up and bites at the eels, at the worms, and then they ’ve got their big tank beside them that they just flicks it out and the eel drops off into the bath. (FRED, SOM_004)

Reflexive marking is also missing from some locative PPs which require it according to the above-mentioned arguments by Kiparsky. In sentence (164), for instance, a different referential expression could be substituted for reflexive me, as in I put a rope around him/ John. The verb put is typically other-directed (compare (150)). The only difference between the FRED example in (164) and Kiparsky’s example in (152) lies in the person: first-person me in (164) can only refer to the speaker, whereas the absence of a reflexivity marker in (152) could potentially cause ambiguity. In Kiparsky’s sentence the reflexivity marker therefore also functions as a disambiguation device.

Probably the most interesting point in this discussion is the observable gap between, on the one hand, examples given in the literature to illustrate the need for overt reflexive marking, and, on the other hand, the rare need for disambiguation in actual conversations. In the FRED interviews reflexive marking in potentially ambiguous sentences is much less common than one would intuitively expect. While the distinction between self- and other-directed verbs presents a logical solution for otherwise ambiguous cases (she_i put the book behind her_i/j), it appears that its use in spontaneous conversation is limited, especially if coreferentiality can be inferred from the wider context. Instead, speakers seem to follow a very simple least-effort principle which we can call the ‘Avoid Ambiguity’ Principle. This principle implies that no special case marking is required in sentences where no disambiguation is required, including sentences where relations of coreferentiality can be inferred from the context.40 Although, at present, the existence of such a principle still rests on conjecture, a tentative formulation is presented in (166). Special tests will be needed to ascertain whether such a principle really exists and in order to measure its influence on linguistic choices.

(166) ‘Avoid Ambiguity’ Principle

Use unambiguous case if disambiguation is required; if no disambiguation is required mark case as usual.
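Read as a decision rule, the tentative formulation in (166) can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions, not a tested model: the three boolean features and the names LocativeContext, needs_disambiguation and choose_form are hypothetical, and "case as usual" is equated here with the O-form that predominates in the corpus.

```python
from dataclasses import dataclass


@dataclass
class LocativeContext:
    """Hypothetical features of a pronoun inside a locative PP."""
    coreferent_with_subject: bool  # does the pronoun refer back to the subject?
    competing_referent: bool       # could an O-form also pick out somebody else?
    resolved_by_context: bool      # does the wider context settle the reference?


def needs_disambiguation(ctx: LocativeContext) -> bool:
    """Disambiguation is needed only if a competing reading survives the context."""
    return (ctx.coreferent_with_subject
            and ctx.competing_referent
            and not ctx.resolved_by_context)


def choose_form(ctx: LocativeContext) -> str:
    """Sketch of (166): unambiguous self-form if needed, otherwise the usual O-form."""
    return "self-form" if needs_disambiguation(ctx) else "O-form"


# First-person 'I put a rope round me' (164): no competing referent.
print(choose_form(LocativeContext(True, False, False)))
# Third-person 'John aimed the gun at himself' (152), taken out of context.
print(choose_form(LocativeContext(True, True, False)))
```

The sketch deliberately leaves out emphasis, viewpoint and verb directedness, which the preceding sections discuss as further possible influences on pronoun choice.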

6.4. Summary: Prepositional phrases

In pronominal PPs, variation in case can be triggered by a variety of factors. Emphasis, for instance, suggests itself as an explanation for the use of non-object forms after comparative expressions such as like, as and than, but also in other PPs where one referent is delimited against the other entities in the conversation. At the same time, external factors should not be underestimated. This concerns, in particular, the speakers’ uncertainty regarding some syntactic constructions. The ongoing debate about ‘correct’ case assignment after comparative as, like and than is reflected in many contradictory rules and recommendations in official grammars and educational institutions. A certain degree of hypercorrection can hence be expected in sentences containing these expressions.

The debate also concerns so-called snake sentences. It has been argued that the morphological distinction between personal pronouns and self-forms in such sentences is used to mark underlying structural differences or differences in viewpoint. Based on the corpus results, however, it appears that speakers rarely make use of this option in spontaneous speech. The findings fully confirm Faltz’s statement that “English speakers agree that the nonreflexive pronouns are perfectly acceptable even when coreference with the subject is intended” (Faltz 1985: 101). In the FRED interviews, O-forms are found in all contexts where, according to theories proposed in the literature, one would at least expect a strong predominance of self-form reflexives. The absence of overt reflexivity marking does not appear to cause any misunderstandings, even in supposedly ambiguous cases.

Nevertheless, it is important to note that most examples that are used in the literature to illustrate links between pronoun case and viewpoint, or between pronoun case and underlying structural differences, are taken from formal discourse, often narratives. Such examples differ fundamentally from the data at hand, the latter being characterised by spontaneity and a face-to-face setting. It can hence be argued that the tendencies observed in this study do not so much contradict the existing theories but rather reflect the morphosyntactic variability and diversity of different registers and media. For the interpretation of examples from the corpus this means that the observed variation is not, or not entirely, predictable by the available theories. In the contrastive and locative PPs in our data, viewpoint, empathy, and even coreferentiality, are rarely deducible from pronoun case. Ambiguity is usually absorbed by contextual information and logical connections between the sentence components (e.g., cause and effect). Overall, the results show a general preference for O-forms, irrespective of the semantic and structural differences that could potentially be marked by pronoun case.

7. Qualified pronouns

In the previous sections we looked at personal pronouns standing on their own in argument positions. We will now focus on pronouns that combine with qualifiers in complex NPs such as us girls. We will refer to these pronouns as qualified pronouns; in the literature they are also known as modified pronouns.

The analysis presented in this section includes qualifiers from different grammatical categories: nouns as in us girls, numerals as in us two, names as in we Garbutts, and combinations thereof, such as us two girls or us Outton boys. Also included are the three quantifiers all, both and each, as in they all, they both and they each. Excluded are appositional cases such as It’s me, John (cf. Bhat 2004: 44–45).41

The analysis will be presented in three parts. Section 7.1 gives a general overview of post-pronominal qualification as observed in FRED, including the distribution of case forms and a comparison of synthetically vs. analytically qualified occurrences (us two vs. the two of us). In section 7.2 special attention will be paid to post-pronominal qualification in second person plural NPs. Section 7.3 finally will focus on pre-pronominal qualification. Areal distribution patterns will not be analysed in detail since pronoun qualification represents a supraregional phenomenon. No significant differences were detected between the four dialect areas in either synthetically or analytically qualified occurrences.

7.1. Constructions with qualifier nouns and quantifiers

Pronoun qualification occurs with both singular and plural referents. The best-known singular examples are probably second-person vocatives of the you bastard type (cf. Dunkling 1990) and pre-pronominal qualifiers as in poor me or soft you. In the plural, pronoun qualification is used to specify groups with more individuals than just the speaker and hearer, as in We teachers must resolve this problem or You students need to wait until four (Bhat 2004: 45). Here are some first examples from the corpus with qualified singular you:

(167) And I come home and he said to me, Who are you boy? (in the recording ‘you boy’ forms a prosodic unit, FRED, CON_006)

(168) But he said, I ’ll get hold of you, you bastard! (FRED, LND_001)

(169) Go on, do it you bastard, she said, . . . (FRED, LND_001)

Among the post-qualified pronouns in the corpus we find singular and plural you, 1PL we and us, and 3PL they and them. Table 16 shows a breakdown of the absolute frequencies by qualifier type, including qualifier nouns and personal names, qualifier numerals, and the three quantifiers all, both and each. In Table 16 the numbers for the second person include generic you and you all, as in Everybody started, well you all started (YKS_007). The two 3PL pronouns they and them can be classified as demonstratives when followed by a noun or numeral, as in them two or they days. Due to the many demonstrative occurrences, 3PL pronouns account for the largest share (2095) of the 2708 examples in the corpus.

Occurrences with all, both and each were divided into quantificational uses (listed as all–Q, both–Q, each–Q) and universal uses (all–U, both–U, each–U). This distinction is, for example, made in the Cambridge Grammar (Huddleston and Pullum 2002: 427), where inseparable ‘universal pronouns’, as in She likes you all (She had liked [you all]), are distinguished from separable ‘quantificational adjuncts’, as in We all enjoyed it ([We] had [all] enjoyed it). The question whether composite NPs like we all should be treated as lexical units cannot be discussed here. On the one hand, the relatively high frequencies of such composite NPs, and the contiguous pronunciation of the two component parts (y’all /jɔːl/) speak in favour of such an interpretation; on the other hand, the fact that they can be separated by verbs and other parts of speech speaks against it.

Table 16. Qualified pronouns, breakdown by qualifier type (synthetic)

qualified pronoun             noun/name  numeral  all–U  all–Q  both–U  both–Q  each–U  each–Q  TOTAL
you–SG                               25        –      –      –       –       –       –       –     25
we                                   63        4      1    300       0      19       0      18    405
  thereof obj./prep.compl.            9        1      0      0       0       0       0       0     10
us                                   72        8     24      0       2       0       0       4    110
  thereof subj./subj.compl.          29        4      0      0       0       0       0       0     33
you–PL                               24        6      4     36       0       2       0       1     73
they                                144        5      0    385       0      36       0      10    580
  thereof obj./prep.compl.          124        3      0      0       0       0       0       0    127
them                               1297       19    180      2      10       0       0       7   1515
  thereof subj./subj.compl.         143       10      0      2       0       0       0       0    155
TOTAL                              1625       42    209    723      12      57       0      40   2708
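The row and column totals of Table 16 can be checked mechanically. The sketch below is an illustration only and not part of the original study: the names COLUMNS and TABLE_16 are assumptions, the counts are copied from the main rows of the table (the 'thereof' sub-rows are left out because they are subsets of the rows above them), and the dashes in the you–SG row are entered as zeros.

```python
# Qualifier types in the column order of Table 16 (U = universal, Q = quantificational).
COLUMNS = ["noun/name", "numeral", "all-U", "all-Q",
           "both-U", "both-Q", "each-U", "each-Q"]

# Main rows of Table 16; dashes in the you-SG row are entered as 0 (an assumption).
TABLE_16 = {
    "you-SG": [25, 0, 0, 0, 0, 0, 0, 0],
    "we":     [63, 4, 1, 300, 0, 19, 0, 18],
    "us":     [72, 8, 24, 0, 2, 0, 0, 4],
    "you-PL": [24, 6, 4, 36, 0, 2, 0, 1],
    "they":   [144, 5, 0, 385, 0, 36, 0, 10],
    "them":   [1297, 19, 180, 2, 10, 0, 0, 7],
}

# Totals per pronoun (rows) and per qualifier type (columns).
row_totals = {pronoun: sum(counts) for pronoun, counts in TABLE_16.items()}
column_totals = dict(zip(COLUMNS, (sum(col) for col in zip(*TABLE_16.values()))))
grand_total = sum(row_totals.values())

# Combined frequency of the two 3PL pronouns and their share of all examples.
third_plural = row_totals["they"] + row_totals["them"]
third_plural_share = 100 * third_plural / grand_total

print(row_totals)
print(column_totals)
print(grand_total, third_plural, round(third_plural_share, 1))
```

The same dictionary could be extended with the 'thereof' sub-rows if the proportion of pronouns in non-standard function (for example us in subject function) is of interest.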

Based on the corpus examples, the most salient properties of pronoun qualification can be described as follows:

– The primary function of pronoun qualification is referent specification. Pronoun qualification provides additional information used to specify individual referents or groups of two or more referents.

– Pronoun qualification can be used to distinguish number in the second person, where the distinction is no longer morphologically marked in English.

– Pronoun qualification is a means for quantifying plural referents and for indicating distinctions which are not marked in the Modern English paradigm. Among the different qualifiers, the quantifier both and the numeral two encode duality – a concept which in earlier stages of English was expressed by the dual forms wit (‘we two’) and ġit (‘you two’). While there is theoretically no limit to the number of referents that can be encoded by numeric quantifiers, it is interesting to see that speakers seem to prefer analytic constructions for multi-morphemic numbers, e.g. us two but thirty-two of us. Synthetic equivalents with large numbers such as us thirty-two or us two hundred rank intuitively low in acceptability, indicating that the productivity of numeric qualifiers in synthetically qualified NPs is not unlimited.

– Pronoun qualification is not used to mark inclusive–exclusive distinctions. Similar to other European languages, English has no such distinctions in its pronominal paradigm, i.e. no morphological or lexical distinction between pronouns which do or do not include the addressee(s): we can mean either ‘I + other(s)’ or ‘I + you (+ other(s))’. Covert clusivity can only be inferred from the situational context: if a speaker A welcomes a hearer B with the words I’m glad we two could meet, it can be assumed that we two means ‘A and B’.

– A certain degree of persistence can be observed in some interviewees who show personal preferences for specific pronoun–qualifier combinations. For instance, speaker IBME is a frequent user of ‘we + qualifier noun’, whereas speaker WesDT is a frequent user of we all.

In the following, the phenomenon will be approached from two perspectives:

1. Does pronoun qualification have a positive impact on non-standard case, and does it affect subject as well as object functions? (The distribution of case forms will be investigated in synthetically qualified NPs as compared to non-qualified NPs, both non-coordinated.)

2. Are certain qualifiers preferred in specific syntactic contexts? (Synthetically qualified pronouns will be compared against equivalent analytic occurrences.)

In order to answer these questions, we need to distinguish between synthetic and analytic occurrences. Pronouns followed by an adjacent qualifier noun, numeral or quantifier will be referred to as synthetic (we two); pronouns in complement position as analytic (two of us).

7.1.1. Qualified vs. non-qualified NPs

In order to answer the first question, i.e. whether pronoun qualification is effectively conducive to non-standard case, we first take a look at the distribution of case forms in synthetically qualified NPs such as we boys or us girls.

Similar to other areas of grammar, variation in qualified pronouns is perceived as critical by some language-related institutions. The Queen’s English Society, for instance, refers to qualified pronouns as an “area of confusion” on its website, similar to pronouns in coordination. Both phenomena have been registered as imprecise and error-prone by these guardians of ‘good English’ (http://www.queens-english-society.com).

Variationist studies, of course, take a different view. Although non-standard case in qualified NPs, for instance subject NPs like us kids, has not been analysed in great detail, it has not gone unnoticed in the literature (e.g., Anderwald 2004: 178; Emonds 1986: 100).42 Qualified pronouns have, for example, been described by Carson T. Schütze as one of various default case environments where pronoun case can vary due to missing or ambivalent case assignment. From a structural perspective, the pronoun in utterances such as (170) can be argued to occupy the position of head in absence of a determiner, in which case it would require subject case. If we assume, however, that case in similar occurrences is not assigned by syntactic rules, the appearance of object case can be explained as default case (Schütze 2001: 215–216). Along the same lines, the pre-qualified pronoun in (171) can not head the subject, either because the preceding material fills the corresponding position (e.g. the determiner in the real me), or because it can only occur to the right of the determiner position, as is the case with soft me. It follows that the pronoun in the subject NP is not assigned subject case and therefore receives its case by default (O-form). Schütze’s own examples include qualifier nouns and numerals in subject position, as in We/ Us linguists are a crazy bunch or We/ Us three have to be leaving now. With other adnominal adjuncts variation seems less acceptable, for example How much would us/ ?we with insurance have to pay?

(170) ... but if it was a job it were needed it were, was carried out, we, us bairns used to carry it out to them . . . (post-qualified us in subject function, FRED, NBL_006)
(171) So ’course, soft me, I took them. (pre-qualified me with subject function, FRED, WES_006)

If Schütze’s claims are true, ambivalence in case assignment should not only affect qualified NPs in subject function but also in object function. Nevertheless, the view that non-standard O-forms are more acceptable than non-standard S-forms is quite common. In his article “Us Anglos are a cut above the field”, for instance, Kjellmer (1986) focuses on ‘us + noun’ phrases in subject function in both American and British English, including written examples from newspapers such as The Observer (BrE) and the LOB and Brown corpora. In Kjellmer’s opinion, ‘O-form + noun’ phrases in subject function are “reasonably acceptable in colloquial English” (p. 445) whereas ‘S-form + noun’ phrases in object function are described as hypercorrections. As an explanation, Kjellmer mentions the general drift towards O-forms in English:

When by the conjunction of a pronoun and a noun phrase in standard English the subject can behave in the normal noun-phrase fashion, viz. retain its objective form and stay invariable, and the reason for preserving a special distinctive subjective form of the pronoun is thus removed, the drift [towards default object forms] need not be checked any more but is free to assert itself. As a result, the type Us Anglos are . . . arises. (Kjellmer 1986: 448)

Let us take a look at the case distribution in FRED:

The results in Table 17 show that non-standard case in qualified NPs is a frequent phenomenon in both subject and object function. Perhaps unexpectedly, qualified us in subject function is relatively less frequent than qualified we in object function (7.7% vs. 11.5%; the frequencies for they and them can not be taken at face value due to the large proportion of prepositional in they days/ them days).

Table 17. Case form distribution in qualified vs. non-qualified pronouns (synthetic)

qualified          subject/ subject compl.        object/ prep.compl.
pronouns           qualified   non-qualified      qualified   non-qualified
1PL   we                 395           16757             10              19
      us                  33             116             77            1491
      %ns*              7.7%            0.7%          11.5%            1.3%
3PL   they               453           21675            127              49
      them               155              63           1360            6779
      %ns*             25.5%            0.3%           8.5%            0.7%
*non-standard

The following examples show different occurrences with post-qualified we, us, they and them in different syntactic functions:

(172) And mi father used to have home-brewed beer and bread and cheese and onion, and we lads had dripping cake and a mug of milk. (qualified we subject, FRED, SAL_019)
(173) A Frenchman had this farm before we Garbutts come . . . (qualified we subject, FRED, YKS_009)
(174) ... it might be raining hard tomorrow, oh, yes, that ’s what she told we boys. (qualified we object, FRED, SOM_005)
(175) ... he didn’t touch this pottery stuff, you know left it all to we lads. (qualified we in prepositional compl. function, FRED, SOM_009)
(176) . . . and she ’d go up there by, ehr, entrance Highbridge, she ’d come down in between we two, throwing out ropes each side . . . (qualified we in prepositional compl. function, FRED, SOM_028)
(177) Us Outton boys Sunday dinnertimes plenty o’ times ... we used t’ walk t’ the Maid . . . (qualified us subject, FRED, SFK_011)
(178) That was all right amongst us small people ... (qualified us in prepositional compl. function, FRED, SFK_007)
(179) Willie Anning, see, because they couldn’t call us both Charlie, because we wouldn’t know which one they ’d be calling! (us both object, FRED, SOM_009)

(180) They both came from Bath. (they both subject, FRED, WIL_019)
(181) Then after a bit we bought the New Hotel and we ran them both for one year. (them both object, FRED, WES_001)
(182) All my frocks, or, or owt, that was torn or too small, them all used to go in. (them all subject, FRED, LAN_005)
(183) That was a herrin’ voyage, when ol’ Oscar Pipes and them all had motor boats, little motor boats out there. (them all subject, FRED, SFK_005)

As mentioned above, non-standard case in qualified NPs occurs in subject as well as object function. However, in order to test whether qualification really has an impact on case selection, the obtained frequencies need to be compared against those in non-qualified pronouns (previously discussed under pronoun exchange). These additional frequencies are also included in Table 17 for both syntactic functions. The numbers strongly indicate that qualification has a positive effect on non-standard case, irrespective of the NP’s syntactic function. Among the different pronouns, the results for we and us are the most reliable, since the figures for they and them are heavily affected by their use as demonstrative modifiers. In subject function, we get 7.7% us in qualified NPs, as compared to only 0.7% non-qualified us. A similar result is obtained in object and prepositional complement function, where 11.5% qualified we stands in opposition to only 1.3% non-qualified we.

Among the different qualifiers, qualifier nouns and numerals have the strongest impact on non-standard case. The quantifiers all, both and each, on the other hand, do not appear to have any particular influence on pronoun choice: all combinations consisting of ‘S-form + all/ both/ each’ in the corpus have subject or subject complement function, and almost all combinations consisting of ‘O-form + all/ both/ each’ have object or prepositional complement function. The only two exceptions were shown in (182) and (183).

Furthermore, the tendency for non-standard case to appear with qualification is also illustrated by sentence-internal switches. Take a look once more at (170) where simple we changes to qualified us in the same syntactic function.
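The percentages quoted from Table 17 follow from simple proportions: the number of non-standard forms divided by all forms in the same function and qualification status. The Python sketch below recomputes them from the table's counts; the data layout and the names TABLE_17 and nonstandard_percentage are illustrative assumptions rather than the procedure used in the study.

```python
# Counts transcribed from Table 17 as (standard form, non-standard form).
# In subject function the S-forms (we, they) count as standard;
# in object function the O-forms (us, them) count as standard.
TABLE_17 = {
    ("1PL", "subject", "qualified"): (395, 33),
    ("1PL", "subject", "non-qualified"): (16757, 116),
    ("1PL", "object", "qualified"): (77, 10),
    ("1PL", "object", "non-qualified"): (1491, 19),
    ("3PL", "subject", "qualified"): (453, 155),
    ("3PL", "subject", "non-qualified"): (21675, 63),
    ("3PL", "object", "qualified"): (1360, 127),
    ("3PL", "object", "non-qualified"): (6779, 49),
}


def nonstandard_percentage(standard, nonstandard):
    """Return the non-standard share of all tokens, in percent."""
    return 100 * nonstandard / (standard + nonstandard)


percentages = {
    key: round(nonstandard_percentage(*counts), 1)
    for key, counts in TABLE_17.items()
}

for (person, function, status), value in percentages.items():
    print(f"{person} {function:<8} {status:<14} {value:.1f}% non-standard")
```

Rounded to one decimal place, the values agree with the %ns rows of Table 17, for example 7.7% versus 0.7% for 1PL subjects and 11.5% versus 1.3% for 1PL objects.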

7.1.2. Synthetically vs. analytically qualified NPs

The second question asked at the beginning of this section was whether certain qualifiers are preferred in specific syntactic contexts. For this purpose, synthetically qualified pronouns need to be compared against the equivalent analytic occurrences which are often perceived as more formal: compare us all vs. all of us. But how are the two structural variants distributed in the spontaneous conversations in FRED?

Table 18. Qualified pronouns: synthetic vs. analytic variants (the statistical values indicate how significant the difference in frequency is between the synthetic and analytic occurrences of each qualifier; the additional figures in brackets indicate non-standard uses)

qualified    1PL (we + us)                  2PL (you)***        3PL (they + them)
pronouns     s          a          σ        s     a     σ       s           a          σ
numeral      12         207        **40.9   6     24    **5.8   24          122        **15.5
             (4+8)      (4+203)                                 (5+19)      (2+120)
all          325        27         **42.2   40    1     **27.9  563         49         **54.1
             (301+24)   (1+26)                                  (385+178)   (1+48)
both         21         12         2.3      2     0     —       46          17         **5.8
             (19+2)     (0+12)                                  (36+10)     (0+17)
each         22         4          **6.9    1     0     —       17          2          **7.9
             (18+4)     (0+4)                                   (10+7)      (1+1)
TOTAL        380        250        **7.5    49    25    *4.2    650         190        **26.8
             (342+38)   (5+245)                                 (436+214)   (4+186)

*significant at p < .05, **significant at p < .005, ***sampled; s = synthetic, a = analytic

The numeric results in Table 18 include all instances of proper interchangeability, i.e. all cases where the speaker could have used either of the two structural variants (e.g., us all or all of us). Not included are (a) qualifier nouns (we boys, but *the boys of us); (b) other qualifiers and quantifiers restricted to synthetic occurrences (e.g., us others, but *(the) others of us); (c) quantifiers restricted to analytic occurrences, such as one/ none/ half/ dozens/ several/ any/ a load/ lot/ few/ crowd of; (d) cases with potential of-elision (e.g., all they/ all them); and (e) obviously demonstrative or adverbial uses, as in two of them that was up in the draughtsmen department (WIL_019), or he used to live out here Broomborough, they days (DEV_007). When combined with a numeric qualifier, they and them are mostly demonstrative, but they were included for the sake of completeness.

In Table 18 the numbers in brackets show that syntheticity (s) and analyticity (a) are variably correlated with pronoun case. Both structures appear with S- and O-form pronominals, but S-forms are generally preferred in synthetically qualified NPs while O-forms are generally preferred in analytically qualified NPs. In addition, the data show frequent alternation between the two structural variants, sometimes even within the same text passage or sentence. Two examples are shown in (184) and (204).

Overall, the figures reflect a strong preference for quantifiers all, both and each to be used in synthetically qualified NPs. Especially all – as in we all, you all, they all – is much more common as a post-qualifying element. The differences in frequencies between the two structural variants are highly significant for all three persons (1PL, 2PL, 3PL). Here are some examples:

(184) They took all of them into t’ yard and sorted them, catched them all. (all of them and them all objects in same sentence, FRED, WES_008)
(185) Oh, how we all nearly died, we all screamed, . . . (we all subjects, FRED, NTT_012)
(186) . . . well father was a navy man, they made us all do a job at home. (us all object, FRED, LAN_001)
(187) They all ran it as a a rest place . . . (they all in subject function, FRED, MDX_001)
(188) She died, she was the last of them all ... (them all in prepositional complement function, FRED, LAN_012)
(189) . . . ’cause she had to do for all o’ we, see, when we come in the world. (all of we, prepositional complement, FRED, SOM_005)
(190) I, ah, oh, all of they were down there, see they (unclear) used to be good chapel people. (all of they subject, FRED, WIL_010)
(191) ... and each of they used to go in the pitcher. (each of they subject, FRED, SOM_028)
(192) I couldn’t keep the both of them, mind, . . . (the both of them object, FRED, DEV_001)

Unlike all, both and each, numeric qualifiers are more commonly seen in analytic constructions. The corpus speakers generally prefer analytic variants such as the two of us or two of us over synthetic variants such as us two (also remember the above-mentioned restrictions regarding larger multi-morphemic numbers). Once again, the differences in frequencies are highly significant for all three persons. In the 3PL, they are even underestimated since the pronouns’ frequent use as demonstratives favours their appearance in synthetic constructions.

(193) There was ten of we! (numeric qualifier in analytically qualified NP, FRED, SOM_030)
(194) Your mother ’d send you t’ chip shop for t’ family of us and there were thirteen of us altogether in my family, with sixpence . . . (numeric qualifier in analytically qualified NP, FRED, LAN_020)
(195) There was us two went through. (numeric qualifier in synthetically qualified NP, FRED, SFK_013)
(196) And we three used to saw, go and do the timber sawing . . . (numeric qualifier in synthetically qualified NP, FRED, SOM_014)
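The opposite preferences of quantifiers and numerals can be made concrete by expressing the Table 18 counts as the proportion of synthetic tokens per qualifier. The sketch below is only an illustration of that arithmetic: the names TABLE_18, synthetic_share and pooled_share are assumptions, pooling across the three persons is a simplification introduced here, and the σ values of Table 18 are not reproduced because the underlying test is not specified in this section.

```python
# (synthetic, analytic) token counts transcribed from Table 18.
TABLE_18 = {
    "1PL": {"numeral": (12, 207), "all": (325, 27), "both": (21, 12), "each": (22, 4)},
    "2PL": {"numeral": (6, 24), "all": (40, 1), "both": (2, 0), "each": (1, 0)},
    "3PL": {"numeral": (24, 122), "all": (563, 49), "both": (46, 17), "each": (17, 2)},
}


def synthetic_share(synthetic, analytic):
    """Return the proportion of synthetic tokens (NaN if there are no tokens)."""
    total = synthetic + analytic
    return synthetic / total if total else float("nan")


def pooled_share(table, qualifier):
    """Pool one qualifier's counts over all persons and return the synthetic share."""
    synthetic = sum(row[qualifier][0] for row in table.values())
    analytic = sum(row[qualifier][1] for row in table.values())
    return synthetic_share(synthetic, analytic)


for qualifier in ("numeral", "all", "both", "each"):
    print(f"{qualifier:<8} synthetic share: {pooled_share(TABLE_18, qualifier):.1%}")
```

On these counts, all, both and each are synthetic in the clear majority of cases, whereas numerals are synthetic in only a small minority, which mirrors the pattern described in the text.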

The findings presented in this study have major implications for the classification of qualified pronouns in variational linguistics. In particular, the common use of all in synthetically qualified NPs shows that this quantifier is by no means restricted to the second person, and that the increasing use of you all in different varieties of English can not be discussed out of context. It is very likely that you all, including contracted y’all, forms part of a broader development which consists in the increasing use of ‘pronoun + all’ NPs. This of course implies that the common argument for the spreading of you all – as a compensatory strategy for the missing 2SG–2PL distinction in English – only explains part of the phenomenon.

Regarding the alleged preference for analytic constructions in non-standard varieties, qualified pronouns show that such a tendency, if present at all, can not be generalised indiscriminately.

7.2. Second person plurals

In the second person, quantifying qualifiers such as all, two or both serve the additional purpose of marking plural number on otherwise unspecified pronominal heads. While other Germanic languages have distinct 2PL forms which can also be qualified (e.g., German ihr alle ‘you–2PL all–PL’), the loss of an overt 2SG–2PL distinction in English has resulted in one unspecified second person form you for both singular and plural addressees. In addition, the loss of this distinction was accompanied by a loss of social deixis, making English “the most weakly socially encoded European language” of modern times (Mühlhäusler and Harré 1990: 134; also compare Figure 12).

In recent decades, the underspecification of you has led to the diffusion of alternative 2PL forms in different varieties of the language, the official reintroduction of a distinct 2PL pronoun being among the most likely candidates for a future grammatical change in English (cf. Kortmann 2001; Wright 1997). Should this change ever be accomplished, English would change from the current 5 person system, which is typologically rare, to a 6 person system, which is the typologically most common system (cf. Forchheimer 1953).

Figure 12. Politeness distinctions in pronouns (World Atlas of Language Structures, interactive reference tool; map by Johannes Helmbrecht, Helmbrecht (2005); my placemark)

Dialectal 2PL forms are one of the most prominent pronoun features in English worldwide (cf. Kortmann and Szmrecsanyi 2004; Ingram 1978). Among them are the dialectal homophones yous and youse, which are well-known in Irish English and British English varieties influenced by Irish English,43 as well as in American English and other varieties (see Hickey 2003 for more details). The usefulness of a you (SG) vs. youse (PL) distinction is illustrated in the following utterance from a female speaker from Belfast:

(197) So I said to our Jill and our Mary: ‘Youse wash the dishes.’ I might as well have said: ‘You wash the dishes,’ for our Jill just got up and put her coat on and went out. (Harris 1993: 146)

Two other variants, yunz and yinz (also yinz guys), are commonly regarded as possible contractions of Scottish English/ Irish English you ones (cf. Miller 2004: 49). These 2PL forms are so characteristic of Pittsburgh English that Pittsburgh natives with a heavy accent are often referred to as yinzers.

Besides the different variants in -s – so-called ‘motivated forms’ according to Brown and Levinson (1987: 23) – you can be combined with the quantifiers all, both and each as described above, and it is highly productive with qualifier nouns. The large variety of strategies used to accomplish a 2SG–2PL distinction in English points towards a concrete need for marking an opposition which is no longer marked in the standard variety. In other words:

It is the marking of the distinction that is important to speakers, rather than the form that the distinction takes: this can be seen from the fact that the particular pronoun that develops in a new variety of English is not necessarily dependent on there being distinct singular and plural second person forms in the older varieties from which they have developed. (Cheshire and Stein 1997: 8)

In our data, the two main strategies for marking plurality on you are the use of quantifiers (you all, you two, etc.; 49 cases), and the overt marking of number on the qualifier noun (e.g., you bastards; 24 cases). The examples also include generic occurrences of the type shown in (203). It is interesting to note that none of the distinct 2PL forms known from other studies were found in the England data.44 Two instances of yous which were found elsewhere in the corpus are shown in (198) and (199), but the corresponding recordings are unclear. One instance of you guys was found in the Scotland component of the corpus; it is shown in example (200).45

(198) Then we used to have a game of kick back waters. {How do you play that, Arthur?} Well, yous eh get a ... (2PL yous, FRED, LEI_001, not in dataset)
(199) I didnae recognise her at all, mind you it was hard to recognise any o’ yous. (2PL yous, FRED, ELN_011, East Lothian, Scotland, not in dataset)
(200) ... I thought I was gonna stop it and start rugby and carry on playing rugby, and then I didn’t want to get trampled by you. ... I didn’t want to get trampled by you guys, right. (2PL you guys, FRED, ELN_015, East Lothian, Scotland, not in dataset)

Unlike plural qualified you, singular qualified you is usually either pejorative or has an obviously informal quality. Out of 25 cases in the corpus, 21 have ‘you + bloody/ stupid/ old/ daft + bastard/ bugger/ devil/ rascal/ villain/ bitch/ fool’; the rest have you boy. This clearly distinguishes these occurrences from plural cases where qualifiers are used for number distinction and referent specification. Compare the following examples (also (168) and (169) above):

(201) I says, No, get out, get out! (v ‘laughs’) I (trunc) sa- (/trunc) I says, You bloody local preacher, you bloody local shithouse! I says, You ’re not fit for your job, I says, And you go around preaching! I says, You filthy, flatfooted, codfish-eyed swine, you ’re nothing else. (qualified pejorative you, singular, FRED, LAN_009)
(202) Here you are Mrs Bestwick – that ’s for mi mother – Here you are Mrs Bestwick, here ’s a lovely cabbage ... it ’ll serve you all today. (you all, universal use, FRED, NTT_007)
(203) Uh, no, not a trap, a brake, you know, where you all sat around. (you all, generic, FRED, DEV_005)
(204) Well, and then the other three, you chucked it in and you all three took, in the summertime, you all three shared out what was left; you used to share out the, the three of you used to have the same. (you all three, FRED, KEN_003)
(205) I ’m going t’ buy you crew somethin’ special, when you go away t’ Padstow. (qualified plural you, FRED, SFK_010)
(206) . . . you couldn’t throw a ball until you had all gone round lamp, and then you threw it . . . (you all, quantificational use, FRED, LAN_003)
(207) {Where did they have the swimming races?} Oh just off the island, you used to have to, for the swimming races, you all went out in the boat and dived off the side, to do your swimming. (you all, FRED, DEV_005)

Finally, a brief note is in order regarding the well-known plural you all. In the corpus, you all occurs in its universal use (compare (202)) as well as in its quantificational use (compare (206)). Even if the latter is more common, a development towards a lexicalisation of you all is indicated by three empirically substantiated observations. First of all, a comparison of you all against all of you shows how popular the synthetic variant is in the corpus (29 you all, plus 11 ‘you V all’, vs. only 1 all of you). Secondly, an attempt to paraphrase occurrences with ‘you all + numeral’, like those in (204), shows just how much of a unit you and all represent. Following the acceptability judgement in (208), a qualified NP like you all three is more likely to be analysed as ‘[you all][three]’, i.e. ‘complex pronoun + numeric qualifier’, than ‘[you][all three]’:

(208) all of you > you all
      all three of you > you all three/ *you three all

Thirdly, and perhaps most importantly, the contiguous pronunciation of you all shows the fusion of two components which have grown very close on the phonetic level.46 The effect of lexicalisation on internal word boundaries is most noticeable in contracted y’all, which has been reported to be spreading in Southern American English (cf. Bernstein 2003).47 In FRED, y’all is still outnumbered by non-contracted you all. The situation may of course be different among younger speakers.

7.3. ‘Oh deary me’: delexicalised exclamations

This last empirical section considers the use of pre-qualification by a qualifier adjective. Even if there are not enough examples in the corpus for a profound analysis, some interesting observations can be made regarding exclamations such as Dear me or Good gracious me (more frequent are occurrences with ‘poor/ poor old/ lucky/ bloody/ silly/ etc. + noun’). The only non-exclamatory case found in the data is shown in (171), repeated here for convenience as (209). All other examples consist of interjections or exclamations with me as the pronominal head (15 tokens).

(209) I said, Well would you be happier if you were all together? And they said oh yes, they would. So ’course, soft me, I took them. (pre-qualified me, FRED, WES_006)

The grammatical status of exclamations such as Dear me is unclear. They have been described as informal abbreviations of longer phrases such as Dear God, save me (cf. Mayhew 1908), but at the same time they are also being perceived as rather exalted. In the FRED corpus, similar exclamations are rare despite the relatively high emotionality in some interviews. Nevertheless, exclamations of the Dear me or Good gracious me type were used by a variety of speakers, both male and female. A regional preference appears to be indicated for dear me type exclamations in the North: all dear/ dearie me were found in data from N, except two cases with good gracious/ goodness me in the SW.

(210) Oh dearie me, I just, you know, Sally knows him better than me. (FRED, WES_008)
(211) Nowadays you ’d think, Good gracious me, that ’s a waste of time, you know. (FRED, CON_011)
(212) What the deary me did they call the blacksmith? Yes, bless us what did they call him. (FRED, WES_009)

Even if the use of me in these examples seems so natural that it appears insignificant, the exclusive use of object case corroborates Schütze’s claim that pronoun form variation does not occur with pre-nominal modifiers (Schütze 2001: 215). Schütze’s own examples include The real me/ *I is finally emerging; Lucky me/ *I gets to clean the toilets; Dear me/ *I!; Lucky us/ *we!; and Poor them/ *they!

One of the above examples deserves special attention. In sentence (212), the exclamatory NP not only has the colloquial form dearie me (one of five dearie me in the corpus), but it is also preceded by a definite article. A possible explanation, of course, is analogy to expressions such as what the hell or what the heck. Nevertheless, this example presents a particularly clear case of delexicalisation – a process which has been defined by Tognini-Bonelli as:

the process through which a lexical item loses at least some of its original lexical value and often acquires other meanings and other functions within a larger unit. This kind of semantic impoverishment is again triggered when the dividing line between an item and its environment becomes blurred; then, strong collocational and/ or colligational patterns combine to create multi-word units where the function of the whole is of course different from the function of the individual parts. (Tognini-Bonelli 2001: 116)

It is well known that lexical items in collocation and components of phraseological chunks have a tendency to become delexicalised (cf. Sinclair 1991). This is certainly the case in exclamations like the ones presented above, where both components – dear/gracious/goodness as well as me – have undergone semantic bleaching. From a syntactic point of view, we can also speak of grammatical impoverishment, regarding both the grammatical function of each component and the grammatical relation between the two components.

In the particular case of (212), deary me even loses some of its exclamatory power by being embedded into the larger what the . . . phrase, which is exclamatory in itself. Similar examples are extremely hard to find, but it can be shown that the construction in (212) is not unique:

(213) “The deary me!” said John Jr., mimicking his sister’s manner, “how much lower is her origin than yours?” (Lena Rivers, 1856 novel by Mary J. Holmes, ch. XXXV)
(214) In our little village there has been a tragedy Oh the deary me, what a terrible tragedy, Mary Ellen Bottomley, today should have been wed It’s a good job that she didn’t now ’cause everybody said . . . (“Mary Ellen at the church turned up”, old music hall song)
(215) oh the deary me the tension is brewing (online forum, http://www.planetcricket.net/forums/cricket-games-stories-strategies/heros-xi-return-cricket-07-a-28766-10.html, accessed 25 August 2011)

7.4. Summary: Qualified pronouns

Simple personal pronouns such as you, we, us, they and them are frequently accompanied by a noun or numeral, or by quantifiers such as all, both and each. Pronoun qualification has a variety of different functions, most importantly the specification of the pronominal referent and the quantification of plural referents. In English, it can be used to mark grammatical distinctions which are not, or no longer, marked in the standard pronominal paradigm, such as duality or a 2SG–2PL opposition.

In particular conversational situations, the drawbacks of under-specification are obviously felt by the speech community, leading to the introduction, or re-introduction, of overt morphological distinctions. Among the best-known examples in English are distinct 2PL forms such as yous/ youse, which are currently spreading in different varieties. The phenomenon ranges among the most prominent dialectal pronoun features of English worldwide and is one of the most likely candidates for a future grammatical change. Among the different markers of plurality, qualifier all in ‘pronoun + all’ NPs appears to be on its way to grammaticalisation, especially in combination with you in you all (y’all).

Regarding variation in case, qualified pronouns are frequently mentioned in connection with non-standard O-forms in subject function, for example in sentences such as us boys used to go there. It was shown in the analysis that qualification has a strong positive impact on the use of non-standard case in subject and, less frequently, object functions.

When comparing the synthetically qualified pronominals in the corpus to their analytic equivalents in the same dataset (us all vs. all of us, etc.), a strong preference was observed for speakers to use the quantifiers all, both and each in synthetic constructions, and numeric qualifiers such as two, three, etc. in analytic constructions. This result identifies ‘qualifier type’ as an important determinant of structural variation.

8. Synopsis and discussion

In this study a variety of phenomena were analysed which revolve around the interchangeability of personal pronoun forms in contrast to the prescriptive rules of written Standard English. While the individual phenomena differ regarding their distribution and variation frequencies, they all point in the same direction: pronoun variation is a regular characteristic of spontaneous speech. The empirical results are no doubt incompatible with the prescriptivist concept of discrete and invariant linguistic units.

The results obtained for different phenomena will now be summarised for the sake of a few more general observations. This synopsis will bring to light some of the most salient patterns of pronoun variation and will help establish connections between empirically substantiated tendencies and general linguistic principles.

8.1. Areal distribution patterns

Some of the phenomena discussed in this study are among the most prominent dialect features in English worldwide, such as the use of subject me in coordination or the use of generic question tags. Others are less common in English in general and exhibit low relative frequencies in the data at hand. Nevertheless, there are tendencies which apply to both the frequent and rare phenomena in the corpus.

One of these tendencies is the general supraregionality of the phenomena discussed, which characterises most of them as features of spontaneous spoken English rather than a specific dialect. This supraregionality is partly in conflict with previous accounts. Even the very few phenomena which show regional clustering are characterised by gradient differences between the four dialect areas. The results, therefore, provide strong empirical support for a definition of regional varieties in terms of qualitative and quantitative distinctions. In the words of Chambers and Trudgill:

Lects may differ quantitatively when a variable is involved. That is, lects may be distinguished not only by the presence or absence of a variable, but also by the frequency with which a particular variant occurs. (Chambers and Trudgill 1998: 129)

For the majority of phenomena, the ubiquitous use indicates a natural structural development that may have happened independently in different places. Diffusion by a certain group of speakers seems implausible in the case of ubiquitous phenomena because of the geographic spread (cf. Chambers 2004: 128). In addition, pronoun variation has been shown to play a role not only in most non-standard varieties of English but also in language acquisition and in other languages (see 8.4).

The supraregional appearance of non-standard occurrences also distinguishes the phenomena under investigation from phonological phenomena, in line with the well-known fact that “grammatical and syntactic features . . . may unite areas showing a great deal of phonological differentiation” (Ihalainen 1994: 248). The corpus results show that a geographical demarcation of dialects which is based on phonological criteria – such as the traditional dialect areas identified by Trudgill in Figure 1 – can differ considerably from the grammatical variation observed in the same space.

A summary of the phenomena investigated in this study and their areal distribution in FRED is presented in the feature catalogue in Table 19. Most phenomena are distributed relatively evenly throughout the country, with very few exceptions. Non-standard O-forms, for example, appear in all four dialect areas irrespective of their function or the surrounding syntactic structure.

Table 19. Non-standard pronouns and their areal distribution in FRED

| phenomenon | areal distribution | section |
|---|---|---|
| singular us | supraregional* | 3.1 |
| generic question tags | supraregional | 3.2 |
| | innit: supraregional* | |
| gendered pronouns | South–North continuum | 4 |
| pronoun exchange | O-form subjects: supraregional | 5 |
| | O-forms in question tags: SW | |
| | S-form objects/prep. compl.: heavy clustering in SW | |
| | O-form reflexives: supraregional | |
| | independent self-forms: supraregional (mostly myself) | |
| case variation after as, like, than | supraregional | 6.2 |
| qualified pronouns | supraregional | 7 |
| ‘pronoun + all’ plurals | supraregional | 7.1 |
| second person plurals | supraregional | 7.2 |

* rare feature

The only two phenomena that stand out among all others as truly regional are the use of O-forms in question tags and the use of non-standard S-forms, both found in the Southwest of the country (sections 5.3.7 and 5.2.2).48 Non-standard S-forms in object and prepositional complement function constitute the only regionally marked component of the complex phenomenon known as pronoun exchange.

Taking into account the low mobility of the interviewees, the regional restriction of non-standard S-forms seems to agree with an exemplar-based conception of persistence, i.e. language acquisition by exposure (cf. Skousen 1989; Tomasello 2003). Together with the observed variation in gender (see South–North continuum), non-standard S-forms identify the Southwest of England as a special area: it shares all of the features found elsewhere but also has some grammatical idiosyncrasies of its own. This may be the reason why this area has long been notorious for its use of non-standard pronouns.49

A different tendency observed in the analyses is the fact that most of the investigated phenomena represent generalised features, meaning features that are used frequently and by a substantial number of speakers. The vast majority of speakers in the FRED corpus use at least some non-standard pronouns. Despite the largest possible reduction of potentially influential sociolinguistic and stylistic factors (see section 1), the current dataset still exhibits a considerable amount of unpredictable variation across and within the different interviews, down as far as the same sentence.50 In other words, we are faced with inherent variability and internal variability in the Labovian sense (inter- and intra-personal; cf. Labov 1971; Adger 2006). Consistent micro-parametric variation is not given, since individual speakers make inconsistent morphosyntactic choices.

8.2. Frequency distribution

Considering the prevailing supraregionality illustrated in Table 19, the main difference between the individual phenomena in quantitative terms lies in differing degrees of variation. Overall, the analyses showed that semantically equivalent variants are mostly used variably. In spontaneous speech, in other words, the complementary distribution of semantically equivalent or near-equivalent expressions is the exception rather than the rule.

In many cases, the data show straightforward ‘complementarity failures’ (Kiparsky 2002: 182). The corpus results hence serve to identify those grammatical rules which are not functionally relevant in actual conversations but represent a mere formality, such as positive–negative verb sequences in question tags, or question tag agreement in general.

If we reduce the results to concrete figures, we get non-standard frequencies ranging from 0.01% to almost 100%. The frequency range goes from rare phenomena such as S-form objects (0.01%), O-form subjects (0.3%) or gendered pronouns with inanimate mass reference (0.3%), to almost-categorical occurrences which still contradict the prescriptive standard, such as non-S subjects in no-verb utterances (97.9%).

The two syntactic contexts that are most resistant to non-standard inclusions are by-subjects and ‘subject before finite V’. In syntactic functions with an extremely high non-standard proportion, on the other hand, prescriptive claims are hard to maintain. This is the case in subject complement function (84.1% non-S forms in the data), or in pronouns after comparative like, as and than (89.3% O-forms).

Table 20 shows a breakdown of non-standard phenomena into frequency ranges. Although non-standard frequencies in the corpus cover an extremely wide range, most phenomena are either rare (featuring in the < 10% and < 20% groups), or highly frequent, with relative frequencies > 90%.
In this regard, it is interesting to observe that the perceived salience of certain phenomena, and the attention they receive in the dialectological literature, are not necessarily correlated with their empirical frequency in actual performance data. Indeed, some of the most widely discussed phenomena, such as pronoun exchange (PE) or gendered pronouns, range among the least frequent features in the corpus. S-form objects, in particular, have the lowest relative frequency of all non-standard pronouns in the data, but they are still being discussed in the literature as a highly salient grammatical feature of dialects spoken in the Southwest of England (salience is of course given due to the exceptional regional restriction of this feature).

Based on the insight that all of the investigated phenomena are gradient phenomena, several categorical claims can be refuted. This includes the categorical rejection of gendered reference for mass referents (see (40): Cause if he’s all mixed up, then you got to bag en (cowfeed)) as well as of S-forms in prepositional complement function (see (53): he never interfered with I). At the same time, claims predicting the categorical use of non-standard forms also have to be refuted.
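The grouping of phenomena into frequency ranges can be illustrated with a minimal Python sketch. It takes the six relative frequencies quoted in this section and assigns each to one of the ranges used in Table 20. The function names, the variable names and the treatment of boundary values are assumptions made for illustration only; they are not part of the original analysis.

```python
from __future__ import annotations

from bisect import bisect_left

# Upper bounds (in percent) of the frequency ranges in Table 20.
# Everything above the last bound falls into the open-ended "> 90%" range.
UPPER_BOUNDS = [5, 10, 20, 50, 80, 90]
RANGE_LABELS = ["0%–5%", "5%–10%", "10%–20%", "20%–50%", "50%–80%", "80%–90%", "> 90%"]

# Relative non-standard frequencies quoted in section 8.2 (in percent).
CITED_FREQUENCIES = {
    "S-form objects": 0.01,
    "O-form subjects": 0.3,
    "gendered pronouns, inanimate mass reference": 0.3,
    "non-S subject complements": 84.1,
    "O-forms after comparative like, as, than": 89.3,
    "non-S subjects, no-verb utterances": 97.9,
}


def frequency_range(percent: float) -> str:
    """Return the Table 20 range label for a relative frequency in percent."""
    if not 0 <= percent <= 100:
        raise ValueError(f"not a percentage: {percent}")
    # Assumption: a value exactly on a boundary (e.g. 5.0) counts towards the
    # lower range; the source does not say how boundary values were assigned.
    return RANGE_LABELS[bisect_left(UPPER_BOUNDS, percent)]


def group_by_range(frequencies: dict[str, float]) -> dict[str, list[str]]:
    """Group phenomena by frequency range, keeping the order of Table 20."""
    groups: dict[str, list[str]] = {label: [] for label in RANGE_LABELS}
    for phenomenon, percent in frequencies.items():
        groups[frequency_range(percent)].append(phenomenon)
    return groups


if __name__ == "__main__":
    for label, phenomena in group_by_range(CITED_FREQUENCIES).items():
        print(f"{label:>8}: {', '.join(phenomena) or '-'}")
```

With these six figures, the middle ranges stay empty, which mirrors the observation that most phenomena are either rare or highly frequent.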

8.3. Case assignment

Based on the corpus results, different systematic tendencies could be identified and translated into preferential hierarchies. Some of the most important feature-specific findings were summarised in the Context Hierarchy of Non-standard Object Forms (95) and the projection of gendered pronouns onto the Animacy Hierarchy (Figure 2). However, it appears that the strongest overarching link between the micro-level variation in specific syntactic functions and the macro-level variation of pronoun forms in the corpus is the great functional versatility of O-forms, as summarised in the Functional Diversity Hierarchy in (11).

Table 20. Breakdown of non-standard phenomena by relative frequency range

| relative non-standard frequency | phenomena |
|---|---|
| 0%–5% | non-O objects |
| | non-S subjects |
| | independent self-forms (obj., prep. compl., subj. compl.) |
| | self-forms after comparative like, as, than |
| | O-form reflexives |
| | non-O analytically qualified pronouns |
| | non-O prepositional complements |
| | gendered pronouns, inanimate mass and count referents |
| 5%–10% | independent self-form subjects in coordination |
| | S-forms after comparative like, as, than |
| | S-forms in qualified object NPs |
| 10%–20% | independent self-form subjects, no-V, simplex and in coordination |
| | non-O objects and prepositional complements in coordination |
| | O-forms in qualified subject NPs |
| | generic question tags |
| 20%–50% | independent self-form subject complements in coordination |
| | non-standard case in ‘all + pronoun’ NPs |
| | gendered pronouns, animate non-human referents |
| 50%–80% | O-form subjects in coordination |
| | non-standard case in ‘all + pronoun’ NPs in coordination |
| 80%–90% | non-S subject complements |
| | O-forms after comparative like, as, than |
| > 90% | non-S subject complements in coordination, with and without V |
| | non-S subjects/subject complements, no-V |

All in all, the results do not go so far as to suggest the complete abolition of pronoun agreement in English, or the absence of structural case assignment.

Even if it cannot be ruled out that English could, in the far future, develop into a different language type (neutral, no case distinctions), there is still enough evidence that certain syntactic positions require morphological realisations in line with a nominative–accusative system. Object forms, for instance, clearly dominate in direct object position, and locally-bound reflexives are usually overtly marked by self-forms. The most stable position for S-forms is as subject before a tensed verb. Currently, structural case assignment is under no threat in this position. If pronoun agreement – hypothetically speaking – were ever to be abolished in favour of an O-form paradigm, the ‘subject before tensed V’ position could well become the last S-form resort.51

Among the grammatical functions investigated in this study, there are some syntactic environments which stand out with respect to two very specific properties. Firstly, these environments exhibit an above-average degree of variation in pronoun case; secondly, the quantitative distribution in these environments points towards the use of default case, which is usually object case. In terms of markedness, “the opposition between two or more categories is suppressed, and it is the unmarked member which appears” (Greenberg 1966: 29) – the unmarked variant being O-form pronouns. In the analyses, such environments were repeatedly referred to as default case environments, following the approach of Schütze (2001). It is here that the discrepancy between empirical variation, on the one hand, and prescriptive complementarity in case assignment, on the other hand, becomes particularly noticeable, up to a point where it seriously challenges studies which refuse to consider variation.

The use of default case has been attributed to ambiguity in case assignment, which goes hand in hand with linguistic uncertainty on the part of the speaker. In the different linguistic features investigated above, the following syntactic functions were affected most strongly:

– subjects in sentences with V-elision (section 5.3.4)
– subject complements (section 5.3.1)
– qualified pronouns (section 7)
– pronouns in comparative PPs (section 6.2)

In addition, coordination – the “great bugbear of prescriptivists” (Denison 1998: 109) – is renowned for causing ambiguity in case assignment. The positive impact of coordination on non-standard case was confirmed in section 5.3.2.

Based on the results obtained, the concept of default case requires specification in two respects. On a qualitative level, it was shown that variation in case is not restricted to subject–object alternation, since it also involves the use of independent self-forms as a third pronominal category or formal variant. In default environments, we usually find both S-forms and self-forms as the less frequent variants. The exact reasons for the appearance of these secondary forms remain to be investigated. Emphasis alone is not an intuitively satisfying explanation.

The second specification concerns the quantitative distribution of morphological variants in default case environments. While it is not common in the literature to include concrete frequencies in the description of default case, the results obtained for different syntactic functions in this study lie so clearly within the same range that frequency itself can be regarded as an integral part of default use. The results show that O-form proportions can be expected to be > 60% (mostly 70%–80%), and that the remaining occurrences are filled by either S-forms or independent self-forms, often in equal proportions. Considering this striking regularity, it is conceivable that information acquired during language acquisition not only includes the syntactic environments that allow for variation in case, but also the probability with which the different case forms can be expected to appear (along the same line of thought, cf. Adger 2006: 509).

8.4. Prioritisation of morphosyntactic categories

The phenomena investigated in this study fall into four groups, depending on the morphosyntactic category affected by the variation in pronoun choice: person, number, gender and case. Based on the overall results, a simple but clear prioritisation emerges which is summarised in the Hierarchy of Morphosyntactic Categories in (216). It shows the relative amount of variation found in each category and at the same time reflects the overall importance of each category for the correct processing of personal pronouns in English. (Note that the status of gender in (216) corresponds to the current findings. In other sets of data – present and future – it will depend on the persistence, or disappearance, of the only gender-related phenomenon: gendered pronouns.)

Our results, in combination with concurrent observations from the literature, allow the assumption that (216) can be expected to apply to other speech data and probably to written documents, too. When faced with English language data, we can generally expect to find less variation in person and number than in case. From a diachronic perspective, the strong potential for variation in case in Modern English presents a natural continuation of the progressive levelling of case distinctions in the language, the only present-day survivors being those pronouns that have retained distinctive subject and object forms.

(216) Hierarchy of Morphosyntactic Categories in Personal Pronoun Processing
      person > number > gender > case

From a wider perspective it is interesting to see that the Hierarchy of Morphosyntactic Categories corresponds to different proposals found in the linguistic literature of the last two or three decades, which in the modelling of morphosyntactic feature hierarchies have attributed to different phi-features a different grammatical status (e.g., Noyer 1992; Harley and Ritter 2002a; Carminati 2005).

At the same time, the observed prioritisation correlates with earlier observations from sociolinguistic and language acquisition studies which attest the early acquisition of person distinctions as compared to the later acquisition of case distinctions among English-speaking children (e.g., Emonds 1986).52 During his investigations of the Verbal Deprivation Hypothesis, for instance, Labov (1970: 172) found that the use of non-standard case among African-American children in urban ghettos (e.g., Me got juice) showed “only that [the child had] not learned the formal rules for the use of the subjective form I and oblique form me.” Labov’s description continues as follows:

We have in fact encountered many children who do not have these formal rules in order at the ages of four, five, six, or even eight. It is extremely difficult to construct a minimal pair to show that the difference between he and him, or she and her carries cognitive meaning. In almost every case, it is the context which tells us who is the agent and who is acted upon. ... it is evident that the children concerned do understand the difference in meaning between she and her when another person uses the forms. All that remains, then, is that the children themselves do not use the two forms. (Labov 1970: 172–173)

Furthermore, the hierarchy in (216) correlates with some major typological trends in the realisation of pronominal features described in (i) to (iv). The obvious explanation for (216), as well as for the overarching typological trends, lies in the role that the different morphosyntactic categories play in the expression and interpretation of referential and coreferential relations.

(i) Person has the status of a typologically omnipresent feature.
According to Greenberg’s Universal 42, “all languages have pronominal categories involving at least three persons and two numbers” (Greenberg 1963: 90); compare Siewierska (2004); Trudgill (2009); Mühlhäusler and Harré (1990), for exceptions; Forchheimer (1953), for a typology of personal pronoun systems; Ingram (1978), for a categorisation of person systems on the basis of abstract semantic and deictic information.

(ii) The overt marking of number distinctions is widespread, with some exceptions.
Compare Corbett (2000); Greenberg’s Universal 36 states that “[i]f a language has the category of gender, it always has the category of number” (Greenberg 1963: 90); in the feature geometry of Harley and Ritter (2002a), languages without number distinctions lack an ‘individuation’ node.

(iii) Gender varies more widely than either person or number.
According to Harley and Ritter (2002a: 514), “gender (or class) features vary more widely in the world’s languages than either person or number. For example, while all languages seem to have at most four persons and four numbers, the set of gender/class systems seems much less constrained. Some languages have no gender marking whatsoever . . . Other languages have two or three genders. The limiting case is probably presented by the Bantu languages, which have upwards of ten distinct genders or classes of nouns (see Corbett 1991).” In her contribution to the World Atlas of Language Structures, Siewierska (2005) finds that (i) about 30% of the languages in her sample mark gender in independent pronouns (378 languages, predominantly Africa, Eurasia and Europe), (ii) most gender contrasts in personal pronouns are sex-based, (iii) gender oppositions are characteristic of the third rather than the first or second person (similarly Mühlhäusler and Harré 1990), and (iv) gender is more typical of the singular than non-singular. Note that in Ido, an auxiliary language specifically designed as a universal alternative to English, the gender-neutral pronoun lu can mean ‘he’, ‘she’ or ‘it’.

(iv) Case varies most widely in the world’s languages.
The overt morphological marking of case distinctions has been described as “the most highly variable grammatical phenomenon associated with the grammatical relations of ‘subject’ and ‘object’ ” (Croft 1990: 152). Compare Blake (2001) for a global perspective on case systems and marking; Sapir (1921), for the levelling of the seven cases in Indo-European nouns, as well as the instability of the who/whom distinction in relative pronouns; Ingram (1978), for a case-based classification of language types. Note that some constructed international languages do not mark case distinctions. For instance, Otto Jespersen’s Novial relies on word order: me observa vu ‘I observe you’ vs. vu observa me ‘you observe me’ (http://interlanguages.net/AIL.html). However, Jespersen (1928: Part II, Case) concedes: “Still, many people would prefer a mark of the accusative to be used in those rare cases in which ambiguity might be feared, and the best ending seems to be -m.”

Reflexivity: In many languages reflexivity stays unmarked. While the concept itself appears to be universal, not all languages have special reflexive pronouns. Some languages “simply use personal pronouns in their place” (Kiparsky 2002: 203); others only have reflexive markers in the 3rd person. The use of polysemous reflexive pronouns is well-known from other languages: Asian ziji, zibun and caki, for instance, are devoid of phi-features. These pronouns can be used as subjects or objects (direct and indirect) and their antecedents are usually co-arguments of the predicate of the matrix clause, similar to so-called long-distance reflexives in English (cf. Huang 2000: 191).

Besides its descriptive and predictive potential, the Hierarchy of Morphosyntactic Categories in Personal Pronoun Processing has fundamental implications for a conjunctive analysis of pronominal expressions. In English, the specification of person and number features is considered vital for a correct resolution of both independent and anaphoric pronouns, whereas the importance of gender and case features is radically reduced by the availability of supplementary information in the discourse. For a correct interpretation of personal pronouns in actual conversations such as the interviews used in this study, overt information is usually required, first and foremost, about the speech act role of the respective antecedent or extra-linguistic referent. Person, therefore, turns out to be the most robust category of all, given its vital importance for the marking of speech act roles: speaker (1) vs. hearer (2) vs. non-participant (3) (in Harley and Ritter 2002a, [±participant] and [±addressee]).

Case, by contrast, represents the least robust category. In English, its original purpose of establishing syntactic dependencies has become redundant to a large extent due to the fixed word order. In spontaneous discourse, syntactic dependencies can also be inferred from the verb’s semantics (cf. König and Siemund 1997), mechanisms for the track-keeping of referents (cf. Bosch 1985), the information structure, prosodic elements and the overall context. According to Greenbaum and Quirk (1990: 108), all pronouns have one thing in common: “their referential meaning is determined purely by the grammar of English and the linguistic or situational context in which they occur.” Overt pronoun–antecedent or pronoun–referent agreement therefore becomes redundant in many cases. In a sentence such as I saw me, the use of a 1SG pronominal object plainly satisfies the morphological requirements for a reflexive interpretation. The same may or may not apply to the same sentence in the 3SG, He saw him, depending on the contextual information available to the interlocutors. This explains why, against initial expectations, the present study returned no significant differences in the treatment of first/second vs. third person pronouns regarding variation in case.

8.5. Outlook

Notwithstanding various interesting aspects that remain to be investigated, the present study will hopefully bring some clarity to the much discussed and multi-layered topic of non-standard pronouns in English.

The analyses presented in this investigation provide strong empirical evidence against a structuralist ‘one form – one meaning’ principle. The variation reflected in different non-standard phenomena shows that a purely formalist approach cannot explain the actual behaviour of personal pronouns and self-forms in Modern English, especially in the spoken medium. Instead, a complex array of syntactic and extra-syntactic determinants is at work.

Overall, pronoun behaviour in spontaneous speech does not conform to expectations based on more formal data and minimal pair examples, the simple reason being that what appears ambiguous out of context usually becomes clear within context. In face-to-face conversations such as the interviews used in this study, part of the necessity for morphology “to regulate expressive traffic” is lost, as compared to registers where “the absence of nonlinguistic, gestural and situational context information necessitates the support of the conventionalized and socially controlled organization principles to ensure its functioning” (Stein 1994: 6).

Immediate practical implications of these observations concern language teaching. The demand expressed by Emonds (1986: 124) that “the real emphasis in grammar teaching for native English speakers should be re-directed to an explicit linguistic formulation and appreciation of the differences in natural language class and ethnic group dialects” seems boldly optimistic. It is, however, feasible for language teachers to try and convey a sense of linguistic diversity and respect for vernaculars and the people who use them.
If we venture a look into the future of the English language, the development of case assignment will certainly be exciting to follow, especially given the current status of English as a high-contact language. Particularly interesting from a more general perspective are those determinants of variation which affect pronouns in different syntactic environments and in different varieties of English. Aspects that are especially worth exploring are the role of emphasis, including the extent to which correlations can be observed between non-standard usage and stress; the avoidance of ambiguity, a linguistic principle for which appropriate tests still need to be developed; and finally, the apparent connection between the overall frequency of specific syntactic functions and the frequency of non-standard occurrences in these functions.

Appendix

Table 21. Syntactic functions with corpus examples (all syntactic functions filled by personal pronoun forms in the FRED interviews, with functions investigated in this study printed in bold type)

| Syntactic function | FRED example |
|---|---|
| subject | although her knows I ain’t a hundred percent |
| subject, coord. initial | well, me and mi old chap was up there |
| subject, coord. final | my dad and him went to join the army |
| subject, coord. middle | mother and I and mi sister came to the farm |
| question, no verb | {What school did you go to?} Me? |
| subject/subj. compl., no verb | {Did you know the books?} No, not me, no. |
| sub./subj. compl., no verb, coord. | {Who would milk the cows?} Oh, I and the wife. |
| by-subject | Everything else was done by hand, by us. |
| direct address | . . . and I said to my old man, You, What d’ you hit me? |
| interjection/exclamation | Oh goodness me!/ Oh, that’s gone, dearie me. |
| object | ... so the farmer didn’t want I, did he? |
| object, coord. initial | they prayed me and mi brother to make them |
| object, coord. final | they put Peggy and I together |
| object, coord. middle | she only asked her own sisters and me and my husband |
| object, no verb | {Your mum pushed you up there in a bassinet?} Not me. |
| subject complement | and there was I by there trying to light it |
| subject compl., coord. initial | there’s just I and Brendan left |
| subject compl., coord. final | that were father and I on the binders |
| subject compl., coord. middle | and then there’s John and Michael and miself and a girl |
| prepositional complement | he gave it to myself |
| prep. compl., coord. initial | they’d all pitched onto I and Albee |
| prep. compl., coord. final | my uncles used to drive the cart for Joshua and they |
| by + pronoun ‘alone’ | it was a bit of a bore if you was all by yourself all night |
| ‘pronoun + V-ing’ (object) | all for a copper to keep us going |
| for–to construction | Don’t suppose you would like for I to draw your ship |
| ECM construction | I said, They want I to go to Charlton. |
| ‘pronoun + V-ing’ (adverbial) | Me being light, I used to jump right at the top. |
| ‘pronoun + V-ing’, coord. (adv.) | me or mi mum or whoever it is being that bit of a boss |
| ‘pronoun + V-ing’ (gerund-part.) | I said, excuse me asking your age, but . . . |
| ‘pronoun + V-ing’ (for zero) | instead o’ me gettin’ my ninety pound, I go seventeen |
| intensifier | He ought to pull that out himself. |
| reflexive | he had the tail of a shirt for drying him on |
| | go and get you back into bed mi lass |
| non-standard benefactive | I was standing there having me a drink . . . |
| | up jumped the pike, have him a meal if he could get one |
| pleonastic in imperative | don’t you ask me a question like that |
| mind you | I had a cup of tea, mind you |
| pronoun + qualifier (synthetic) | all us kids used to go up there |
| | and we three used to saw, go and do the timber sawing |
| | so you each got a ration o’ herrin’ |
| | we all got jobs/ we got all paid |
| qualifier + pronoun (analytic) | There was ten of we!/ ‘cause she had to do for all o’ we |
| ‘all + pronoun’ | pretty nigh all them had Friday night . . . |
| ‘all + pronoun’, coord. | the sailors and all them used to come out |
| qualifier adjective + pronoun | So ’course, soft me, I took them. |
| (generic) question tag | good idea though, innit |
| | we used to make our own bread, didn’t it |
| | your brother got away with that case last year, is it |
| disjunctive, post-positioned | I love flowers, me. |
| disjunctive, pre-positioned | Me, I go twice every Sunday. |
| disjunctive, post-pos., coord. | they used to go fishing, him and a man named Wood |
| disjunctive, pre-pos., coord. | one of our mates and misel’, we were dare-devils |
| resumptive | Like a rubber collar thing that you could wash en. |
| possessive determiner | and we have done all us lives |
| | the doctor used to make him own bottles |
| of -genitive | the owner of her/ clever to get out of t’ road of you |
| demonstrative determiner | when we had all them fish |
| | the Frenchmen used to come over in they days with onions |

Table 22. Function matrix I: S- and O-forms in FRED

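Columns: S-forms (I, he, she, we, they), O-forms (me, him, her, us, them), and you.

| syntactic function (only confirmed functions) | I | he | she | we | they | me | him | her | us | them | you |
|---|---|---|---|---|---|---|---|---|---|---|---|
| subject | x | x | x | x | x | x | x | x | x | x | x |
| subject, coord. initial | x | x | x | x | x | x | x | x | — | x | x |
| subject, coord. middle | x | — | — | — | — | — | — | — | — | — | — |
| subject, coord. final | x | — | — | — | x | x | x | x | — | x | — |
| subject complement | x | x | x | x | — | x | x | x | x | x | x |
| subject compl., coord. initial | x | — | — | — | — | x | x | x | x | x | — |
| subject compl., coord. middle | — | — | — | — | — | x | x | — | — | — | — |
| subject compl., coord. final | x | — | — | — | — | x | x | x | — | x | — |
| subject/subject compl., no verb | — | x | — | — | — | x | x | x | x | x | — |
| subj./subj. compl., coord., no verb | x | x | — | — | — | x | x | x | — | — | x |
| question, no verb | — | — | — | — | — | x | — | — | — | — | — |
| (generic) question tag | x | x | x | x | x | — | x | — | x | x | x |
| by-subject | — | — | — | — | — | x | x | — | x | — | — |
| object | x | x | x | x | x | x | x | x | x | x | x |
| object, coord. initial | — | — | — | — | — | x | x | — | x | x | — |
| object, coord. middle | — | — | — | — | — | x | — | — | — | — | — |
| object, coord. final | x | — | — | — | — | x | — | x | x | x | — |
| object, no verb | — | — | — | — | — | x | — | — | — | — | — |
| prepositional complement | x | x | x | x | x | x | x | x | x | x | x |
| prep. compl., coord. initial | x | — | — | — | — | x | x | x | x | — | — |
| prep. compl., coord. final | x | — | — | — | x | x | x | — | — | x | — |
| ‘pronoun + V-ing’ (object) | — | — | — | — | — | x | x | x | x | x | x |
| ‘pronoun + V-ing’, coord. (object) | x | — | — | — | — | x | — | — | — | — | — |
| ‘pronoun + V-ing’ (adverbial) | — | — | — | x | x | x | x | x | x | x | x |
| ‘pronoun + V-ing’, coord. (adv.) | — | — | — | — | — | x | — | — | — | — | — |
| ‘pronoun + V-ing’ (gerund-part.) | — | x | — | x | — | x | x | x | x | x | x |
| ‘pronoun + V-ing’ (for zero) | — | — | — | — | — | x | — | — | — | — | x |
| for–to construction | x | — | — | x | x | x | x | x | x | x | x |
| ECM construction | x | — | — | x | — | x | — | x | — | x | — |
| interjection/exclamation | — | — | — | — | — | x | — | — | — | — | — |
| pleonastic in imperative | — | — | — | — | — | — | — | — | — | — | x |
| mind you | — | — | — | — | — | — | — | — | — | — | x |
| pronoun + qualifier noun/numeral | — | — | — | x | x | — | — | — | x | x | x |
| ‘all + pronoun’ | — | — | — | — | x | — | — | — | — | x | — |
| ‘all + pronoun’, coord. | — | — | — | — | — | — | — | — | — | x | — |
| direct address | — | — | — | — | — | — | — | — | — | — | x |
| qualifier adjective + pronoun | — | — | — | — | — | x | — | — | — | — | — |
| disjunctive, post-positioned | x | x | — | — | x | x | x | — | x | x | — |
| disjunctive, pre-positioned | — | — | — | — | — | x | x | — | x | x | — |
| disjunctive, post-pos., coord. | x | x | x | — | — | x | x | x | x | x | — |
| disjunctive, pre-pos., coord. | x | x | — | — | — | x | x | x | — | x | x |
| resumptive | — | x | x | — | x | — | x | — | — | x | — |
| reflexive (incl. usually intransitive V) | — | — | — | — | — | x | x | x | x | x | x |
| non-standard benefactive | — | — | — | — | — | x | x | — | — | — | x |
| of -genitive | x | — | — | — | x | — | x | x | x | x | x |
| possessive | — | — | — | x | — | — | x | — | x | x | x |
| demonstrative | — | — | — | — | x | — | — | — | — | x | — |
| total number of contexts | 21 | 13 | 8 | 12 | 15 | 35 | 29 | 21 | 22 | 30 | 21 |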

Table 23. Function matrix II: Self-forms in FRED

Column group: self-forms.

| syntactic function | my- | your-f | him- | her- | it- | our- | your-ves | them- |
|---|---|---|---|---|---|---|---|---|
| subject | — | — | — | — | — | — | — | — |
| subject, coord. initial | — | — | — | — | — | — | — | — |
| subject, coord. middle | — | — | — | — | — | — | — | — |
| subject, coord. final | x | — | x | — | — | x | — | — |
| subject complement | x | — | — | — | — | — | — | — |
| subject compl., coord. initial | x | — | — | — | — | — | — | — |
| subject compl., coord. middle | x | — | — | — | — | — | — | — |
| subject compl., coord. final | x | — | x | — | — | — | — | — |
| subj./subj. compl., no verb | x | x | — | — | — | — | — | — |
| subj./subj. compl., coord., no verb | x | — | — | — | — | — | — | — |
| question, no verb | — | — | — | — | — | — | — | — |
| (generic) question tag | — | — | — | — | — | — | — | — |
| by-subject | — | — | — | — | — | — | — | — |
| object | x | x | — | x | — | x | — | x |
| object, coord. initial | — | — | — | — | — | — | — | — |
| object, coord. middle | x | — | — | — | — | — | — | — |
| object, coord. final | — | — | — | — | — | — | — | — |
| object, no verb | x | — | — | — | — | — | — | — |
| prepositional complement | x | — | x | x | — | x | — | x |
| prep. compl., coord. initial | — | — | — | — | — | — | — | — |
| prep. compl., coord. final | — | — | — | — | — | — | — | — |
| ‘by + pronoun’ meaning ‘alone’ | x | x | x | x | x | x | — | x |
| ‘pronoun + V-ing’ (object) | — | — | — | — | — | — | — | — |
| ‘pronoun + V-ing’, coord. (object) | x | — | — | — | — | — | — | — |
| disjunctive, post-positioned | x | — | — | — | — | — | — | — |
| disjunctive, pre-positioned | x | — | x | — | — | x | — | x |
| disjunctive, post-pos., coord. | x | — | — | — | — | — | — | — |
| disjunctive, pre-pos., coord. | x | — | — | — | — | — | — | — |
| intensifier | x | x | x | x | x | x | — | x |
| reflexive (incl. intransitive V) | x | x | x | x | x | x | x | x |
| benefactive with monotrans. V | x | x | — | — | — | — | — | — |
| total number of contexts | 20 | 6 | 7 | 5 | 3 | 7 | 1 | 6 |

Notes

1. A wider range of phenomena is discussed in Hernández (forthcoming).
2. Other speaker variables such as language proficiency, occupation, and socio-economic status were not considered in the selection process due to known uncertainty factors (cf. Sankoff and Laberge 1978).
3. Principles which constrain relations of pronominal expressions to possible antecedents, based on government and binding relationships. Principle A states that “An anaphor is bound in its governing category.”; Principle B states that “A pronominal is free in its governing category.” (Chomsky 1981: 188). Note that ‘anaphor’ refers to reflexive and reciprocal pronouns only. Binding is defined as: α binds β iff α is in an argument position, α c-commands β, and α and β are coindexed; α c-commands β iff the first branching node dominating α dominates β, α does not dominate β and β does not dominate α. According to Principle A, anaphora must not only be bound, but must be bound in a local domain. In the simplest case, the anaphor and antecedent are clause-mates, as in John_i hurt himself_i.
4. The historical evidence stands in contrast to the complementary use of personal pronouns and reflexives in standard Modern English, but it is surprisingly similar to the situation in FRED. It may therefore well be that present-day vernaculars are preserving an inherited constraint (cf. Kiparsky 2008: 32).
5. “OE poets used both ic and we. In most of those with we, the poet could be including his audience; there are similar examples in the prose. But we are left with a few sentences in both prose and poetry where we may have the ‘plural of authorship’. . . . There are also places where a single individual other than an author seems to use the first person plural. . . . the so-called ‘plural of majesty’.”
6. For question tags in Scottish English, compare Miller (2004).
7. In addition, Vogelaer and Sutter (2011) have observed the development of Dutch het into a default pronoun “that can be used whenever grammatical gender agreement fails.”
8. Despite the occasional use of masculine forms with female referents, the corpus data show no regular masculinisation of anaphoric pronouns. Masculine references to female referents are the exception rather than the rule. Compare Curzan (2003: 93) for similar results in Old English data.
9. Compare Kimball (1991: 451) for Koasati, a Native American language spoken in Louisiana.
10. Of course one could argue that the concepts of animacy and humanness are also subjective to some extent. Language users may regard a specific object as animate based on their cultural or religious background, for instance ‘the water’ or ‘the moon’, or personify an object based on their personal attitude.
11. “The use of animate pronouns for the entire continuum is not known, except for some Creoles (Jamaican creole im; cf. Patrick 2004: 428). He and she are normally not found for reference to abstracts and non-individuated matter.” (Siemund 2008: 4)
12. Referents can be upgraded from it to he/ she, or downgraded from he/ she to it. According to Wagner (2004a: 124), this “intimate pattern” can also involve switches between masculine and feminine forms.
13. The different case forms behave very similarly in this respect, which is why form-specific properties (e.g., phonological aspects) can be ruled out as a major determinant. Note that the SED questionnaires and incidental material show gendered pronouns mainly in the SW, but also, for example, in Yorkshire. For the geographic distribution of gendered proportions among the different Southwestern counties, compare Wagner (2004a).
14. Azevedo (1989: 863–864), for example, reports the use of S-form objects in Brazilian Portuguese, both in the ‘substandard’ varieties and the colloquial speech of educated speakers. For example, Ela chamou eu ‘she called I’ (Standard Brazilian Portuguese chamou-me), or Eu chamei ela mas ela não respondeu ‘I called she but she did not answer’ (StBP chamei-a). Unlike English PE, the Portuguese phenomenon appears to be uni-directional.
15. Second person cases are not included because of the missing case distinction in you. Among the 38 occurrences of thou there are two objects: he said, Look here, Frank, if I take thou to court, thou will have to pay the value of the cow (SAL_013), and my mum ’d say, Oh, What ’s eh, what ’s he give thou? (LND_003). Note that all thou in the corpus occur in quotations from the Bible or quotations of direct speech produced at some point in the past. Occasionally, thou occurs where a speaker produces lists of dialect words from a specific region, e.g. lookst tha was lookest thou, and seest tha was seest thou, and harkest tha was harkedst thou (WES_005). In one case, the speaker used thou to address the interviewer: Then her ’d shuffle herself, thou know’st, stand up straight (SAL_038).
16. Weighted frequencies were used to facilitate a comparison between the four dialect areas. The weighted frequencies are based on the relative frequencies for each area. For subject me, for example, we get frequencies of 0.24 SE, 0.07 SW, 0.17 MID, 0.18 N; multiplied by 100, this yields values of 24 SE, 7 SW, 17 MID, 18 N (total 66), and weighted percentages of 36.4% SE, 10.6% SW, 25.8% MID, 27.3% N.
17. SED questions IX.7.2. ‘is her married/ is them (two) wedded’; IX.7.7. ‘her is, them is’; IX.7.9. ‘her is’; IX.7.10. ‘her’s not’; IX.7.3. ‘aren’t them’; IX.7.4.+ 5. ‘isn’t them’; IX.7.6. ‘wasn’t/ weren’t them’. Incidental material: 15,2 VIII.5. ‘didn’t them’; 15,4 VI.7.4 ‘them’s acome’; 37,1 IX.7.2. ‘does thee think them married?’ Also compare Chambers and Trudgill (1998: 130/ 133–134/ 136).
18. In the special case of 2SG, the diffusion of object you into the domain of subject ye came to completion during Early Modern English, and although 2SG th-forms continue to exist in some varieties, they, too, show a similar development. According to Stratton (1949: 148), the singular th-forms were still used by Quakers in the mid 20th century, but in Quaker English “they have replaced the older correct nominative thou by the objective thee.” In the SED, a subject thou area was identified in the North of England, a subject thee area in the western Midlands and the Southwest, and a small subject ye area in the Northeast. For the use of thyself / theeself in Southwestern dialects, see Wakelin (1986: 33). In present-day English, the use of subject me in coordination ranks among the three most prominent non-standard pronoun features listed in the Handbook of Varieties of English (Kortmann and Szmrecsanyi 2004: 1198), next to demonstrative them and distinct 2PL forms.
19. Based on Traugott (1972: 129).
20. Emonds’ assumption that the use of prescriptive It is I “is not part of a dialect spoken (and hence acquired) as a native language by any natural language speech community” (Emonds 1986: 93) would imply that the historical developments up to 1600 represent ‘unnatural’ developments.
21. Some studies have ruled out the use of specific O-forms in subject function. According to Ihalainen (1991), for example, him and us subjects do not appear in the Somerset dialect.
22. Compare Wakelin (1986: 33) for the use of thou subjects, and thee subjects and objects, in the Southwest of England. The SED reports the use of thee as a stressed nominative in Somerset, Wiltshire, Cornwall, Devon and Dorset, the use of thou as an unstressed objective in Wiltshire, and the use of thee as an unstressed objective in the southwestern counties, except Cornwall.
23. Unless we attribute all S-forms in coordination to learned ‘Prestige Usage’ (cf. Emonds 1986).
24. Question tags with do, can, should, would, have, be, and corresponding negative and inflected forms, incl. ain’t, won’t, and variants like wun’t, din’t, hent. No instances were found of third person enclitic O-form subjects of the type Don’ ’er? ‘Doesn’t she/ he?’ (cf. Ihalainen 1994: 216).
25. Excluding direct-speech and turn-taking markers such as And she, Well, . . . (LAN_002).
26. No-verb answers such as Who did that? – Her. have long been noticed as one of different disjunctive O-form environments, for example in Wright (1905).
27. The results presented in this study give no indication that the occurrence of her subjects is historically motivated. Wagner (2002), for example, found that her and us were the two most frequently used O-forms in her data. According to Wagner (ibid., fn. 21), “[a]t least for the feminine form, this can very well be historically motivated. While regions under strong Scandinavian influence adopted the new form she quite readily to have two clearly distinct forms for the subject and oblique cases, the Southwest could have kept the old h-form, modern PE thus representing a kind of relict form/ historical retention/ conservatism.” The idea that the Southwest of England, as a conservative area, may still be preserving a preference for the older h-form (OE hēo) does, however, not tally with the fact that all occurrences of object she in FRED were also found in this particular area.
28. For example Frisian. According to Kiparsky (2002: 203), “[m]any languages lack reflexive pronouns entirely and simply use personal pronouns in their place.”
29. Compare Kiss (2001) for ‘exempt anaphora’, i.e. anaphora which are exempt from binding principle A like personal pronouns. Compare Huddleston and Pullum (2002: 1484–1485) for ‘override reflexives’.
30. I argue against the view that self-forms in argument positions are really intensifiers attached to covert or incorporated pronominal heads (cf. König and Siemund 2000; Baker 1995). The problem of such a definition lies with utterances where the independent self-form does not function as an intensifier or contrastive expression (this problem has been recognised by König and Siemund).
31. Note that the development from variant to invariant reflexive markers can currently be observed in Brazilian Portuguese, se replacing me, te, nos and vos (cf. Azevedo 1989: 865).
32. Picture NPs are a phenomenon where the situational context plays a decisive role in pronoun resolution, due to the simultaneous presence of speaker, hearer and referent (e.g., if the picture lies before the conversation participants), but also paralinguistic means such as pointing (at the referent in the picture). The term was probably first used by Warshawsky (1965). For discourse-analytic approaches to picture pronouns see Cantrall (1973: 46), Cantrall (1974: 22) and Kuno (1987: 166). For an experimental study of third-person picture NPs see Goldwater and Runner (2006). Compare Stern (2004: 276–277) for an explanation based on the Role Conflict Hypothesis: “the referent of a pronoun used for a logophoric message plays at least two roles in an utterance: the role for which the referent is mentioned, and as the cognizer of the situation. Zribi-Hertz calls this role the Subject of Consciousness (SC).”
33. Compare Riley and Parker (1998: 41), who state that “the prohibited nominative case occurs after a preposition only in coordinate constructions.” Contrary to Riley and Parker’s categorical statement, other surveys have shown that subject case is occasionally used in non-coordinated sentences such as come with me/ I, especially by speakers from the SW (e.g., Viereck 1991, map M15).
34. The fact that “there is no independent evidence that true conjunctions can be case assigners” (Schütze 2001: 213) has so far been ignored in this classification.
35. For 3SG pronouns after like, see the ‘Constraint on NP Like X-self’ in Kuno (1987: 123–124).
36. Schütze himself, however, describes sentences with understood predicates as “one further construction that superficially looks like an elliptical default environment” (2001: 212, fn. 8), treating comparative expressions like as or than as regular prepositions.

37. In this argumentation, of course, the antecedent has to be animate and physically present in the event.
38. Subjectivity has also been recognised to play a crucial role in the semantics of substitutional relations, for example regarding relative clause restrictiveness and verbal aspect (cf. Bache 1985).
39. Conditions such as the so-called ‘Empathy Constraint on Reflexives’ (Kuno 1987: 158).
40. Compare the description of German reflexivisation strategies by Faltz (1985: 118): “In the case of languages like German, the reflexive is not used when it does not contribute information that cannot be carried by ordinary pronouns. ... We might describe a setup like this as being ‘functionally streamlined’; use a reflexive pronoun only when an ordinary pronoun will not do.”
41. Qualifier nouns have elsewhere been classified as appositions (cf. Kjellmer 1971: 44–45).
42. Emonds (1986: 98) proposes a generative explanation based on the so-called Adjacent Head Condition: “two heads of phrases can be related by a transformational rule only if one governs the other.” In qualified NPs like us girls, this condition is not satisfied. The categorical conclusion that the pronoun in such an NP must therefore have object case is not supported by the distribution found in FRED.
43. Especially areas of Irish immigration such as the metropolitan areas of Liverpool, Newcastle and Manchester, also Glasgow, New York, and urban Australia; compare Trudgill (1999: 92); Beal (1993: 205); Filppula (1999: 55); Cheshire et al. (1993: 81). For Scots, see Miller (1993: 108); for Irish English, see Harris (1993: 139), Hickey (2003).
44. No instances were found of you ... together, as in Where are you together? or Come you on together!; see Trudgill (2004) for examples from East Anglia.
45. You guys is especially common in American English. The increasing gender-neutrality of guy has been considered as a motivating factor for the spreading of you guys (cf. McLennan 2004).
46. Compare the Spanish 2PL pronoun vosotros (‘you others’), and the polite forms of address usted/ ustedes (Catalan vusted), originally vuestra merced ‘your Grace/ Highness’.
47. For the use of y’all with singular reference, see Hyman (2006). For an early reference to you all as a polite expression in Southern American English, see Morrison (1926).
48. In an investigation of personal pronoun forms in demonstrative function, Hernández (forthcoming) finds a similar distribution of southwestern demonstrative they as compared to supraregional demonstrative them.
49. Other geographic areas are associated with other grammatical phenomena. In the North of England, for example, we find the Northern Subject Rule and a regularised pattern of reflexive pronoun forms (cf. Beal 2004; Pietsch 2005).

50. In his investigation of persistence phenomena, Szmrecsanyi (2006: 197) notes: “More often than not, FRED exhibited the lowest level of persistence ... It is likely that this has less to do with the data sampled (dialect speech), but rather with the much higher mean age of speakers in the corpus.” Szmrecsanyi also finds that “[o]n aggregate, age seems to have a weakening effect on persistence.” In addition, there appears to be a reduced chance for allo-repetition, or comprehension-to-production priming, in longer stretches of monologue (cf. Tannen 2007; Cleland and Pickering 2003).
51. In the Noun Phrase Accessibility Hierarchy of Keenan and Comrie (1977), the subject position is the most frequently relativised grammatical function. It is the entrance gate for innovations and the last resort for relic forms. Harris (1981: 19) predicts that “I will in due course be reserved solely and exclusively to contexts where it is directly bound to a main or auxiliary verb form within a finite verb phrase as its subject (exactly like je in French); in all other contexts, me will be appropriate (cf. French moi).”
52. Similar in Dutch. Tieken-Boon van Ostade (1994: 226), for example, describes the use of O-form subjects and S-form prepositional complements among Dutch children as follows: “It is only after continued exposure to standard adult usage and as the result of persistent correction that the grammatically correct forms begin to appear more regularly. This seems true for gender distinctions in pronominals, too. The set of pronominals in which case, number and gender are distinguished therefore has to be actively acquired as part of the process of first language acquisition.”
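The weighting step described in note 16 can be retraced with a short calculation. The following Python sketch assumes only the four relative frequencies for subject me quoted in that note; the helper name weighted_percentages and the dictionary layout are illustrative assumptions and are not taken from the original study.

```python
# Relative frequencies of subject "me" per dialect area, as quoted in note 16.
relative_freq = {"SE": 0.24, "SW": 0.07, "MID": 0.17, "N": 0.18}


def weighted_percentages(freqs):
    """Return each area's share of the summed relative frequencies, in percent.

    Hypothetical helper for illustration; shares are rounded to one decimal.
    """
    total = sum(freqs.values())
    return {area: round(100 * f / total, 1) for area, f in freqs.items()}


# Multiplying by 100 gives the intermediate values quoted in the note.
scaled = {area: round(f * 100) for area, f in relative_freq.items()}
weighted = weighted_percentages(relative_freq)

print(scaled, sum(scaled.values()))
print(weighted)
```

Because each share is rounded separately, the four percentages need not add up to exactly 100.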


References

Adger, David. 2006. Combinatorial variability. Journal of Linguistics 42: 503–530.
Anderwald, Lieselotte. 2004. The varieties of English spoken in the Southeast of England: Morphology and syntax. In: Bernd Kortmann and Edgar Schneider (eds.), A Handbook of Varieties of English, 175–195. Berlin and New York: Mouton de Gruyter.
Audring, Jenny. 2006. Pronominal gender in spoken Dutch. Journal of Germanic Linguistics 18: 85–116.
Azevedo, Milton M. 1989. Vernacular features in educated speech in Brazilian Portuguese. Hispania 72: 862–872.
Bache, Carl. 1985. The semantics of grammatical categories: A dialectical approach. Journal of Linguistics 21: 51–77.
Baker, Carl L. 1995. Contrast, discourse prominence, and intensification, with special reference to locally free reflexives in British English. Language 71: 63–101.
Barnes, William. 1886. A Glossary of the . London: Trübner.
Beal, Joan. 1993. The grammar of Tyneside and Northumbrian English. In: James Milroy and Lesley Milroy (eds.), Real English: The Grammar of English Dialects in the British Isles, 187–213. London: Longman.
Beal, Joan. 2004. English dialects in the North of England: Morphology and syntax. In: Bernd Kortmann and Edgar Schneider (eds.), A Handbook of Varieties of English, 114–141. Berlin and New York: Mouton de Gruyter.
Bernstein, Cynthia. 2003. Grammatical features of southern speech: Yall, might could, and fixin to. In: Stephen J. Nagle and Sara L. Sanders (eds.), English in the Southern United States, 106–118. Cambridge: Cambridge University Press.
Bhat, Darbhe N. S. 2004. Pronouns. Oxford: Oxford University Press.
Blake, Barry J. 2001. Case. Cambridge: Cambridge University Press.
Bosch, Peter. 1985. Constraints, coherence, comprehension. Reflections on anaphora. In: Emel Sözer (ed.), Text Connexity, Text Coherence, 299–319. Hamburg: Buske.
Brown, Penelope and Stephen C. Levinson. 1987. Politeness: Some Universals in Language Usage. Cambridge: Cambridge University Press.
Burchfield, Robert W. (ed.). 1994. The Cambridge History of the English Language, Vol. V: English in Britain and Overseas. Origins and Developments. Cambridge: Cambridge University Press.
Cantrall, William R. 1973. Reflexive pronouns and viewpoint. Linguistische Berichte 28: 42–50.
Cantrall, William R. 1974. Viewpoint, Reflexives, and the Nature of Noun Phrases. The Hague: Mouton.
Cardinaletti, Anna and Michael Starke. 1994. The typology of structural deficiency: On the three grammatical classes. University of Venice Working Papers in Linguistics 4: 41–109.

Carminati, Maria Nella. 2005. Processing reflexes of the feature hierarchy (person > number > gender) and implications for linguistic theory. Lingua 115: 259–285.
Chambers, Jack and Peter Trudgill. 1998. Dialectology. Cambridge: Cambridge University Press.
Chambers, Jack K. 2004. Dynamic typology and vernacular universals. In: Bernd Kortmann and Edgar Schneider (eds.), A Handbook of Varieties of English, 127–145. Berlin and New York: Mouton de Gruyter.
Cheshire, Jenny and Dieter Stein. 1997. The syntax of spoken language. In: Jenny Cheshire and Dieter Stein (eds.), Taming the Vernacular – From Dialect to Written Standard Language, 1–12. London: Longman.
Cheshire, Jenny, Viv Edwards, and Pamela Whittle. 1993. Non-standard English and dialect levelling. In: James Milroy and Lesley Milroy (eds.), Real English: The Grammar of English Dialects in the British Isles, 53–96. London: Longman.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Cleland, Alexandra A. and Martin J. Pickering. 2003. The use of lexical and syntactic information in language production: Evidence from the priming of noun-phrase structure. Journal of Memory and Language 49: 214–230.
Coates, Jennifer. 2004. Women, Men and Language: A Sociolinguistic Account of Gender Differences in Language. London: Longman.
Corbett, Greville. 2000. Number. Cambridge: Cambridge University Press.
Corbett, Greville G. 1991. Gender. Cambridge: Cambridge University Press.
Coupland, Nikolas. 1988. Dialect in Use: Sociolinguistic Variation in Cardiff English. Cardiff: University of Wales Press.
Croft, William. 1990. Typology and Universals. Cambridge: Cambridge University Press.
Curzan, Anne L. 2003. Gender Shifts in the History of English. Cambridge: Cambridge University Press.
Dahl, Östen. 2001. Principles of areal typology. In: Martin Haspelmath, Ekkehard König, Wulf Oesterreicher, and Wolfgang Raible (eds.), Language Typology and Language Universals: An International Handbook, Vol. 1, 1456–1470. Berlin and New York: De Gruyter.
Denison, David. 1998. Syntax. In: Suzanne Romaine (ed.), The Cambridge History of the English Language, Vol. IV: 1776–1997, 92–329. Cambridge: Cambridge University Press.
Dunkling, Leslie. 1990. A Dictionary of Epithets and Terms of Address. London: Routledge.
Ellis, Alexander John. 1869–1889. On Early English Pronunciation: With Especial Reference to Shakespeare and Chaucer. London: Trübner.
Elworthy, Frederic Thomas. 1875–1886. The Grammar of the Dialect of West Somerset (containing The Dialect of West Somerset, Grammar of West Somerset Dialect, West Somerset Word-Book, or Glossary). London: Trübner.

Emonds, Joseph. 1986. Grammatically deviant prestige constructions. In: Michael Brame, Heles Contreras, and Frederick J. Newmeyer (eds.), A Festschrift for Sol Saporta, 93–129. Seattle: Noit Amrofer.
Erdmann, Peter. 1978. It’s I, It’s me: A case for syntax. Studia Anglica Posnaniensia 10: 67–80.
Evans, Bergen and Cornelia Evans. 1957. A Dictionary of Contemporary American Usage. New York: Random House.
Faltz, Leonard. 1985. Reflexivization: A Study in Universal Syntax. New York and London: Garland.
Fanselow, Gisbert. 2002. Quirky ‘subjects’ and other specifiers. In: Ingrid Kaufmann and Barbara Stiebels (eds.), More than Words: A Festschrift for Dieter Wunderlich, 227–250. Berlin: Akademie Verlag.
Filppula, Markku. 1999. The Grammar of Irish English – Language in Hibernian Style. London: Routledge.
Filppula, Markku. 2004. Irish English: Morphology and syntax. In: Bernd Kortmann and Edgar Schneider (eds.), A Handbook of Varieties of English, 73–101. Berlin and New York: Mouton de Gruyter.
Forchheimer, Paul. 1953. The Category of Person in Language. Berlin: Walter de Gruyter.
Goldwater, Micah B. and Jeffrey T. Runner. 2006. Coreferential interpretations of reflexives in picture noun phrases: An experimental approach. In: Pascal Denis, Eric McCready, Alexis Palmer, and Brian Reese (eds.), Proceedings of the 2004 Texas Linguistics Society Conference: Issues at the Semantics-Pragmatics Interface, 28–34. Somerville, MA: Cascadilla Proceedings Project.
Grano, Thomas. 2006. ‘Me and Her’ Meets ‘He and I’: Case, Person, and Linear Ordering in English Coordinated Pronouns. Unpublished BA thesis, Stanford University. http://home.uchicago.edu/ tgrano/uht.pdf.
Greenbaum, Sidney and Randolph Quirk. 1990. A Student’s Grammar of the English Language. Harlow: Longman.
Greenberg, Joseph H. 1963. Some universals of grammar with particular reference to the order of meaningful elements. In: Joseph H. Greenberg (ed.), Universals of Language: Report of a Conference Held at Dobbs Ferry, New York, April 13–15, 73–113. Cambridge, MA: MIT Press.
Greenberg, Joseph H. 1966. Language Universals: With Special Reference to Feature Hierarchies. The Hague: Mouton.
Halliwell, James O. 1887. A Dictionary of Archaic and Provincial Words, Obsolete Phrases, Proverbs, and Ancient Customs, from the Fourteenth Century: In Two Volumes. London: John Russell Smith.
Harley, Heidi and Elizabeth Ritter. 2002a. Person and number in pronouns: A feature-geometric analysis. Language 78: 482–526.
Harley, Heidi and Elizabeth Ritter. 2002b. Structuring the bundle: A universal morphosyntactic feature geometry. In: Horst J. Simon and Heike Wiese (eds.), Pronouns: Grammar and Representation, 23–39. Amsterdam and Philadelphia: John Benjamins.
Harris, John. 1993. The grammar of Irish English. In: James Milroy and Lesley Milroy (eds.), Real English: The Grammar of English Dialects in the British Isles, 139–186. London: Longman.
Harris, Martin. 1981. It’s I, It’s me: Further reflections. Studia Anglica Posnaniensia 13: 17–20.
Haspelmath, Martin. 2008. A frequentist explanation of some universals of reflexive marking. Linguistic Discovery 6.1: 40–63.
Heacock, Paul (ed.). 2008. Cambridge Dictionary of American English. Cambridge: Cambridge University Press.
Helmbrecht, Johannes. 2005. Politeness distinctions in pronouns. In: Martin Haspelmath, Matthew S. Dryer, David Gil, and Bernard Comrie (eds.), The World Atlas of Language Structures, 186–189. Oxford: Oxford University Press.
Henry, Alison. 1995. Belfast English and Standard English: Dialect Variation and Parameter Setting. Oxford: Oxford University Press.
Hernández, Nuria. forthcoming. Personal Pronouns in the Dialects of England – A Corpus Study of Grammatical Variation in Spontaneous Speech. PhD thesis, Albert-Ludwigs-Universität Freiburg.
Hickey, Raymond. 2003. Rectifying a standard deficiency: Second person pronominal distinctions in varieties of English. In: Irma Taavitsainen and Andreas H. Jucker (eds.), Diachronic perspectives on address term systems, 345–74. Amsterdam: John Benjamins.
Holmes, Janet. 1995. Men, Women and Politeness. London: Longman.
Honey, John. 1995. A new rule for the Queen and I? English Today 11: 3–8.
Householder, Fred W. 1987. Some facts about me and I. Language Research 23: 163–184.
Howe, Stephen. 1996. The Personal Pronouns in the Germanic Languages: A Study of Personal Pronoun Morphology and Change in the Germanic Languages from the First Records to the Present Day. Berlin and New York: De Gruyter.
Huang, Yan. 2000. Anaphora – A Cross-linguistic Study. New York: Oxford University Press.
Huddleston, Rodney and Geoffrey K. Pullum. 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.
Hughes, Arthur and Peter Trudgill. 1996. English Accents and Dialects: An Introduction to Social and Regional Varieties of English in the British Isles. London: Arnold.
Hyman, Eric. 2006. The all of you-all. American Speech 81: 325–331.
Ihalainen, Ossi. 1985. ‘He took the bottle and put ’n in his pocket’: The object pronoun it in present-day Somerset. In: Wolfgang Viereck (ed.), Focus on: England and Wales, 153–161. Amsterdam and Philadelphia: John Benjamins.

Ihalainen, Ossi. 1991. On grammatical diffusion in Somerset folk speech. In: Peter Trudgill and Jack Chambers (eds.), Dialects of English: Studies in Grammatical Variation, 104–119. London: Longman.
Ihalainen, Ossi. 1994. The dialects of England since 1776. In: Robert W. Burchfield (ed.), The Cambridge History of the English Language, Vol. V: English in Britain and Overseas. Origins and Developments, 197–274. Cambridge: Cambridge University Press.
Ingram, David. 1978. Typology and universals of personal pronouns. In: Joseph H. Greenberg (ed.), Universals of Human Language, Vol. 3, 213–247. Stanford, CA: Stanford University Press.
Jespersen, Otto. 1928. An International Language. London: Allen & Unwin.
Jespersen, Otto. 1949. A Modern English Grammar on Historical Principles, Vol. VII: Syntax. Copenhagen: Ejnar Munksgaard.
Ježek, Elisabetta and Paolo Ramat. 2009. On parts-of-speech transcategorization. Folia Linguistica 43: 391–416.
Keenan, Edward L. and Bernard Comrie. 1977. Noun phrase accessibility and universal grammar. Linguistic Inquiry 8: 63–99.
Kimball, Geoffrey D. 1991. Koasati Grammar. Lincoln: University of Nebraska Press.
Kiparsky, Paul. 2002. Disjoint reference and the typology of pronouns. In: Ingrid Kaufmann and Barbara Stiebels (eds.), More than Words: A Festschrift for Dieter Wunderlich, 179–226. Berlin: Akademie Verlag.
Kiparsky, Paul. 2008. Universals constrain change; change results in typological generalizations. In: Jeff Good (ed.), Language Universals and Language Change, 23–53. Oxford: Oxford University Press.
Kiss, Tibor. 2001. Anaphora and exemptness: A comparative treatment of anaphoric binding in German and English. In: Dan Flickinger and Andreas Kathol (eds.), Proceedings of the 7th International HPSG Conference, UC Berkeley, 22–23 July, 2000. Stanford, CA: Center for the Study of Language and Information.
Kjellmer, Göran. 1971. Context and Meaning. Stockholm: Almqvist & Wiksell.
Kjellmer, Göran. 1986. ‘Us anglos are a cut above the field’: On objective pronouns in nominative contexts. English Studies 67: 445–449.
Koktova, Eva. 1999. Word-Order Based Grammar. Berlin and New York: Mouton de Gruyter.
König, Ekkehard and Peter Siemund. 1997. On the development of reflexive pronouns in English: A case study in grammaticalization. In: Uwe Böker and Hans Sauer (eds.), Anglistentag 1996 Dresden – Proceedings of the Conference of the German Association of University Teachers of English, Vol. XVIII, 95–108. Trier: Wissenschaftlicher Verlag.
König, Ekkehard and Peter Siemund. 2000. Intensifiers and reflexives: A typological perspective. In: Zygmunt Frajzyngier and Traci S. Curl (eds.), Reflexives: Forms and Functions, 41–74. Amsterdam and Philadelphia: John Benjamins.

Kortmann, Bernd. 1999. Typology and dialectology. In: Bernard Caron (ed.), Proceedings of the 16th International Congress of Linguists, Paris 1997. City: Elsevier Science: CD-ROM article 0060.
Kortmann, Bernd. 2001. In the year 2525... Reflections on the future shape of English. Anglistik: International Journal of English Studies 12: 97–114.
Kortmann, Bernd. 2002. New prospects for the study of English dialect syntax: Impetus from syntactic theory and language typology. In: L. Cornips, S. Barbiers and S. van der Kleij (eds.), Syntactic Microvariation, 185–213. Amsterdam: Meertens Institute Electronic Publications in Linguistics.
Kortmann, Bernd and Edgar Schneider (eds.). 2004. A Handbook of Varieties of English, vol. 2: Morphology and Syntax. Berlin and New York: Mouton de Gruyter.
Kortmann, Bernd and Benedikt Szmrecsanyi. 2004. Global synopsis: Morphological and syntactic variation in English. In: Bernd Kortmann and Edgar Schneider (eds.), A Handbook of Varieties of English, 1142–1202. Berlin and New York: Mouton de Gruyter.
Krug, Manfred. 1998. String frequency: A cognitive motivating factor in coalescence, language processing, and linguistic change. Journal of English Linguistics 26: 286–320.
Krug, Manfred. 2003. Frequency as a determinant in grammatical variation and change. In: Günter Rohdenburg and Britta Mondorf (eds.), Determinants of Grammatical Variation in English, 7–67. Berlin and New York: Mouton de Gruyter.
Kruisinga, Etsko. 1905. A Grammar of the Dialect of West Somerset: Descriptive and Historical. Bonn: Hanstein.
Kuno, Susumu. 1987. Functional Syntax: Anaphora, Discourse, and Empathy. Chicago: Chicago University Press.
Labov, William. 1966. The linguistic variable as a structural unit. Washington Linguistic Review 3: 4–22.
Labov, William. 1970. The logic of non-standard English. In: Frederick Williams (ed.), Language and Poverty: Perspectives on a Theme, 153–187. Chicago: Markham.
Labov, William. 1971. The Notion of System in Creole Studies. In: Dell Hymes (ed.), Pidginization and Creolization of Languages – Proceedings of a Conference Held at the University of the West Indies at Mona, Jamaica, 447–472. Cambridge: Cambridge University Press.
Labov, William. 1996. When intuitions fail. In: L. McNair, K. Singer, L. Dolbrin, and M. Aucon (eds.), Papers from the Parasession on Theory and Data in Linguistics, 77–106. Chicago Linguistic Society 32.
Lass, Roger. 1987. The Shape of English: Structure and History. London: Dent.
Lass, Roger. 1990. How to do things with junk: Exaptation and language evolution. Journal of Linguistics 26: 79–102.

Leonard, Sterling Andrus. 1929 (1962, reissued). The Doctrine of Correctness in English Usage, 1700–1800. Madison: University of Wisconsin Press.
Levinson, Stephen C. 1997. Pragmatics. Cambridge: Cambridge University Press.
Lowsley, Barzillai. 1888. A Glossary of Berkshire Words and Phrases. London: Trübner.
Marshall, William. 1789. Provincialisms of East Norfolk (1787)/ East Yorkshire (1788)/ the Vale of Glocester (1789)/ the Midland Counties (1790)/ West Devonshire (1796). Reprinted Glossaries, 1873. London: Trübner.
Mathiot, Madeleine and Marjorie Roberts. 1979. Sex roles as revealed through referential gender in American English. In: Madeleine Mathiot (ed.), Ethnolinguistics: Boas, Sapir and Whorf Revisited, 1–47. The Hague: Mouton.
Mayhew, Anthony L. 1908. ‘Dear’: ‘O Dear No!’. Notes and Queries s10-X, 257: 434–435.
McLennan, Sean. 2004. Guy, guys, and gender neutrality. Proceedings from the Annual Meeting of the Chicago Linguistic Society 40: 211–219.
Miller, Jim. 1993. The grammar of Scottish English. In: James Milroy and Lesley Milroy (eds.), Real English: The Grammar of English Dialects in the British Isles, 99–138. London: Longman.
Miller, Jim. 2004. Scottish English: Morphology and syntax. In: Bernd Kortmann and Edgar Schneider (eds.), A Handbook of Varieties of English, 47–72. Berlin and New York: Mouton de Gruyter.
Milroy, James and Lesley Milroy. 1993. Real English: The Grammar of English Dialects in the British Isles. London: Longman.
Mitchell, Bruce. 1985. Old English Syntax, Vol. I: Concord, the Parts of Speech, and the Sentence. Oxford: Clarendon.
Morrison, Estelle Rees. 1926. ‘You all’ and ‘we all’. American Speech 2: 133.
Mühlhäusler, Peter and Rom Harré. 1990. Pronouns and People: The Linguistic Construction of Social and Personal Identity. Oxford: Blackwell.
Murray, James A. H., Henry Bradley, W. A. Craigie, and C. T. Onions (eds.). 1961/ 1989/ online edition. The Oxford English Dictionary. Oxford: Clarendon Press.
Nevalainen, Terttu and Helena Raumolin-Brunberg. 1994. Its strength and the beauty of It: The standardization of the third person neuter possessive in Early Modern English. In: Dieter Stein and Ingrid Tieken-Boon van Ostade (eds.), Towards a Standard English 1600–1800, 171–216. Berlin and New York: Mouton de Gruyter.
Newmeyer, Frederick J. 2004. Typological evidence and universal grammar. Studies in Language 28: 527–548.
Noyer, Robert Rolf. 1992. Features, Positions and Affixes in Autonomous Morphological Structure. PhD thesis, Massachusetts Institute of Technology. http://hdl.handle.net/1721.1/12895.
O’Neill, Moira. 1933. Collected Poems of Moira O’Neill. Edinburgh and London: William Blackwood & Sons; xtf.lib.virginia.edu/xtf/view?docId=chadwyck_ep/uvaGenText/tei/chep_3.2381.xml.

Orton, Harold, Stewart Sanderson, and John Widdowson. 1978. The Linguistic Atlas of England. London: Croom Helm.
Parker, Frank, Kathryn Riley, and Charles Meyer. 1988. Case assignment and the ordering of constituents in coordinate constructions. American Speech 63: 214–233.
Parker, Frank, Kathryn Riley, and Charles Meyer. 1990. Untriggered reflexive pronouns in English. American Speech 65: 50–69.
Patrick, Peter L. 2004. Jamaican creole: Morphology and syntax. In: Bernd Kortmann and Edgar Schneider (eds.), A Handbook of Varieties of English, 407–438. Berlin and New York: Mouton de Gruyter.
Petyt, Ken. 1985. Dialect and Accent in Industrial West Yorkshire. Amsterdam and Philadelphia: John Benjamins.
Pietsch, Lukas. 2005. Variable Grammars: Verbal Agreement in Northern Dialects of English. Tübingen: Niemeyer.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik. 1972. A Grammar of Contemporary English. London: Longman.
Redfern, Richard K. 1994. Is between you and I good English? In: Greta D. Little and Michael Montgomery (eds.), Centennial Usage Studies, 187–193. Tuscaloosa and London: Alabama Press.
Richards, Marc. 2008. Defective agree, case alternations, and the prominence of person. In: Marc Richards and Andrej L. Malchukov (eds.), Linguistische Arbeitsberichte 86, 137–161. Leipzig: University of Leipzig Press.
Riley, Kathryn and Frank Parker. 1998. English Grammar: Prescriptive, Descriptive, Generative, Performance. Boston: Allyn and Bacon.
Rohdenburg, Günter and Britta Mondorf (eds.). 2003. Determinants of Grammatical Variation in English. Berlin and New York: Mouton de Gruyter.
Saha, Prosanta K. 1987. Strategies of Reflexivization in American English. American Speech 62: 211–234.
Sankoff, David and Suzanne Laberge. 1978. The linguistic market and the statistical explanation of variability. In: David Sankoff (ed.), Linguistic Variation: Models and Methods, 239–250. New York and London: Academic Press.
Sapir, Edward. 1921. Language: An Introduction to the Study of Speech. New York: Harcourt, Brace and World.
Sasse, Hans-Jürgen. 1993. Syntactic categories and subcategories. In: Joachim Jacobs, Arnim von Stechow, Wolfgang Sternefeld, and Theo Vennemann (eds.), Syntax: An International Handbook of Contemporary Research, 646–686. Berlin: De Gruyter.
Schütze, Carson T. 2001. On the nature of default case. Syntax 4: 205–238.
Siemund, Peter. 2008. Pronominal Gender in English – A Study of English Varieties from a Cross-Linguistic Perspective. London: Routledge.
Siewierska, Anna. 2004. Person. Cambridge: Cambridge University Press.

Siewierska, Anna. 2005. Gender distinctions in independent personal pronouns. In: Martin Haspelmath, Matthew S. Dryer, David Gil, and Bernard Comrie (eds.), The World Atlas of Language Structures, 182–185. Oxford: Oxford University Press.
Sigurðsson, Halldór Ármann. 1992. The case of quirky subjects. Working Papers in Scandinavian Syntax 49: 1–26.
Sinclair, John. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Skousen, Royal. 1989. Analogical Modeling of Language. Dordrecht: Kluwer Academic Publishers.
Stein, Dieter. 1994. Sorting out the variants: Standardization and social factors in the English language 1600–1800. In: Dieter Stein and Ingrid Tieken-Boon van Ostade (eds.), Towards a Standard English 1600–1800, 1–17. Berlin and New York: Mouton de Gruyter.
Stein, Dieter. 1997. Syntax and Varieties. In: Jenny Cheshire and Dieter Stein (eds.), Taming the Vernacular – From Dialect to Written Standard Language, 35–50. London: Longman.
Stern, Nancy. 2004. The semantic unity of reflexive, emphatic, and other -self pronouns. American Speech 79: 270–280.
Stratton, Clarence. 1949. Guide to Correct English. New York: Whittlesey House, McGraw-Hill Book Company.
Szemerényi, Oswald J. L. 1996. Introduction to Indo-European Linguistics. Oxford: Clarendon Press.
Szmrecsanyi, Benedikt. 2006. Morphosyntactic Persistence in Spoken English. A Corpus Study at the Intersection of Variationist Sociolinguistics, Psycholinguistics, and Discourse Analysis. Berlin and New York: Mouton de Gruyter.
Szmrecsanyi, Benedikt. 2011. The geolinguistics of grammatical variability in traditional British English dialects: A large-scale frequency-based study. Postdoctoral habilitation thesis, University of Freiburg.
Tannen, Deborah. 2007. Talking Voices – Repetition, Dialogue, and Imagery in Conversational Discourse. Cambridge: Cambridge University Press.
Tieken-Boon van Ostade, Ingrid. 1994. Standard and non-standard pronominal usage in English, with special reference to the eighteenth century. In: Dieter Stein and Ingrid Tieken-Boon van Ostade (eds.), Towards a Standard English 1600–1800, 217–242. Berlin and New York: Mouton de Gruyter.
Tognini-Bonelli, Elena. 2001. Corpus Linguistics at Work. Amsterdam and Philadelphia: John Benjamins.
Tomasello, Michael. 2003. Constructing a Language: A Usage-Based Theory of Language Acquisition. Cambridge, Mass.: Harvard University Press.
Traugott, Elizabeth Closs. 1972. A History of English Syntax. New York: Holt, Rinehart, and Winston.
Trudgill, Peter. 1974. The Social Differentiation of English in Norwich. Cambridge: Cambridge University Press.

Trudgill, Peter. 1999. The Dialects of England. Oxford: Blackwell.
Trudgill, Peter. 2004. The dialect of East Anglia: Morphology and syntax. In: Bernd Kortmann and Edgar Schneider (eds.), A Handbook of Varieties of English, 142–153. Berlin and New York: Mouton de Gruyter.
Trudgill, Peter. 2009. Sociolinguistic typology and complexification. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 98–109. Oxford: Oxford University Press.
Upton, Clive, Stewart Sanderson, and John Widdowson. 1987. Word Maps: A Dialect Atlas of England. London: Croom Helm.
Upton, Clive, David Parry, and J. D. A. Widdowson. 1994. Survey of English Dialects: The Dictionary and Grammar. London: Routledge.
Vallins, George Henry. 1952. Good English – How to Write It. London: André Deutsch.
Van Gelderen, Elly. 2000a. A History of English Reflexive Pronouns: Person, Self, and Interpretability. Amsterdam and Philadelphia: John Benjamins.
Van Gelderen, Elly. 2000b. Bound pronouns and non-local anaphors: The case of earlier English. In: Zygmunt Frajzyngier and Traci S. Curl (eds.), Reflexives. Forms and Functions, 187–226. Amsterdam and Philadelphia: John Benjamins.
Viereck, Wolfgang. 1991. The Computer Developed Linguistic Atlas of England. In collaboration with Heinrich Ramisch; computational production Harald Händler et al. Tübingen: Niemeyer.
Vogelaer, Gunther De and Gert De Sutter. 2011. The geography of gender change: Pronominal and adnominal gender in Flemish dialects of Dutch. Language Sciences 33: 192–205.
Wagner, Susanne. 2002. We don’ say she, do us? Pronoun Exchange – A feature of English dialects? Unpublished manuscript, Albert-Ludwigs-Universität Freiburg. http://www.tu-chemnitz.de/phil/english/ling/download/Wagner2002PronounExchange.pdf.
Wagner, Susanne. 2004a. Gender in English Pronouns: Myth and Reality. PhD thesis, Albert-Ludwigs-Universität Freiburg. http://www.freidok.uni-freiburg.de/volltexte/1412.
Wagner, Susanne. 2004b. English Dialects in the Southwest: Morphology and Syntax. In: Bernd Kortmann and Edgar Schneider (eds.), A Handbook of Varieties of English, 154–174. Berlin and New York: Mouton de Gruyter.
Wagner, Susanne. 2004c. ‘Gendered’ pronouns in English dialects – A typological perspective. In: Bernd Kortmann (ed.), Dialectology Meets Typology: Dialect Grammar from a Cross-Linguistic Perspective, 479–496. Berlin and New York: Mouton de Gruyter.
Wagner, Susanne. 2005. Gender in English pronouns: Southwest England. In: Bernd Kortmann, Tanja Herrmann, Lukas Pietsch, and Susanne Wagner (eds.), A Comparative Grammar of British English Dialects: Agreement, Gender, Relative Clauses, 211–367. Berlin and New York: Mouton de Gruyter.

Wakelin, Martyn. 1986. The Southwest of England. Amsterdam and Philadelphia: John Benjamins.
Wales, Katie. 2002. ‘Everyday verbal mythology’: Pronouns and personification in non-literary English. In: Sybil Scholz, Monika Klages, Evelyn Hantson, and Ute Römer (eds.), Language, Context and Cognition: Papers in Honour of Wolf-Dietrich Bald’s 60th Birthday, 327–335. München: Langenscheidt-Longman.
Warshawsky, Florence. 1965. Reflexivization. Unpublished manuscript. Mimeographed, MIT.
Wierzbicka, Anna. 1996. Semantics – Primes and Universals. Oxford: Oxford University Press.
Woolford, Ellen. 2006. Lexical case, inherent case, and argument structure. Linguistic Inquiry 37: 111–130.
Wright, Joseph. 1905. The English Dialect Grammar: Comprising the Dialects of England, of the Shetland and Orkney Islands, and of Those Parts of Scotland, Ireland and Wales Where English Is Habitually Spoken. Oxford: Frowde.
Wright, Susan. 1997. ‘Ah’m going for to give youse a story today’: Remarks on second person plural pronouns in Englishes. In: Jenny Cheshire and Dieter Stein (eds.), Taming the Vernacular – From Dialect to Written Standard Language, 170–184. London: Longman.
Yamamoto, Mutsumi. 1999. Animacy and Reference: A Cognitive Approach to Corpus Linguistics. Amsterdam and Philadelphia: John Benjamins.
Zribi-Hertz, Anne. 1989. Anaphor binding and narrative point of view: English reflexive pronouns in sentence and discourse. Language 65: 695–727.

Complement clauses

Daniela Kolbe

1. Introduction

In the research on the syntax of British dialects of English, there are numerous accounts of non-standard features in particular dialects. Two frequently described dialectal variants of complement clauses (see Kortmann 2004: 1095) are the use of for to instead of to in infinitival clauses, as in (1), and the inversion of verb and subject (i.e., the use of direct question word order) in indirect questions, as in (2).

(1) ... he would try for to tell her ... (Northern Ireland Transcribed Corpus of Speech (NITCS), text A32.3, LM7)
(2) ... But the crab always does – sort of – eats the net away I don’t know is it with his claws or with its teeth or whatever he uses he somehow manages to get out (FRED, HEB_024)

According to the overview of dialectal variation in morphology and syntax in Kortmann et al. (2004), both features occur in Irish English (Filppula 2004: 85–86, 93–94), in Scottish English (Miller 2004: 58, 64), in Welsh English (Penhallurick 2004: 104, 108) and in dialects in the North of England (Beal 2004: 129, 134–135). For to is also reported for the English spoken in Southwest England (Wagner 2004: 168). Neither of these features, however, is mentioned in the description of dialects in Southeast England (Anderwald 2004).

Three of the four dialects in which the inversion of verb and subject in indirect questions occurs – Irish English, Scottish English and Welsh English – are spoken in historically Celtic regions and are therefore referred to as “Celtic Englishes” (though not unanimously, see Tristram 1997: 4–17). It would therefore seem natural to consider this non-standard complementation pattern as typical of Celtic Englishes. Its occurrence in Northern English dialects could be traced back to Northern England’s geographical proximity to Celtic areas. Social contact might have transferred these linguistic features across the border.

However, the inversion of subject and verb in indirect questions is also listed as a feature of colloquial, not dialectal, English in the Longman Grammar of Spoken and Written English (Biber et al. 1999: 920–921). Does this mean that it is widespread enough to be considered a “supra-regional” non-standard phenomenon and not typically “Celtic”? A similar question is posed by research on for to: how typical of a certain dialect (or of a group of dialects) can a feature be that occurs in nearly all mainland dialects of British English except in those of the Southeast and maybe the English Midlands (the latter of which are not covered in Kortmann et al. 2004)? Are the Celtic roots of Cornwall in Southwest England (see Wakelin 1984) important for sharing this feature with other Celtic Englishes?
Another dialect feature in complement clauses, however, does not occur in overviews of British English dialect grammar (e.g., Kortmann et al. 2004; Milroy and Milroy 1993; Hughes and Trudgill 1996): the use of as instead of that in complement clauses, e.g.,

(3) I wouldn’t like to say as there weren’t any poachers in Radford but I can’t think of one (FRED, NTT_016)

Thus, while two dialect features occur in the descriptions of nearly all British English dialects, another one is hardly noticed. The most obvious explanation for this discrepancy would be that often-mentioned features are used in more locations, by more speakers and/or in dialects that have been well studied, while neglected features are regionally more restricted, used less frequently and/or found in understudied dialects.

Consequently, the central questions the present study aims to answer are: Firstly, how frequent are these three syntactic dialect features, and in which regions of the UK? Secondly, are they “typical” of a particular language variety – or of a fairly homogeneous group of varieties? And thirdly, when is a dialect feature not a dialect feature anymore, but a vernacular feature of the (colloquial) spoken language?

The analysis seeks to establish whether the three features word order in indirect questions, complementizer as vs. that and (for) to are typical of particular varieties of English. Potential determinants of the non-standard variants are language contact (with Celtic substratum or Scandinavian languages), sociolinguistic factors such as age or sex, and semantic differentiation. The influence of these factors on syntactic variation is compared to the influence of regional preferences.

This comparison of different factors is essential, as, for instance, semantic differentiation could override regional preferences, so that higher frequencies of a semantically determined variant in a particular region would lead to superficially higher scores of the syntactic variant in this region. The same is true for the variation according to age or sex: variants that are more frequent in younger people’s or women’s speech could appear as regionally distinct in those regions whose data contain more women or younger speakers.
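The point about speaker composition can be made concrete with a small numerical sketch. The counts below are invented purely for illustration and are not taken from FRED or NITCS; the region labels, the age groups and the function names are likewise assumptions. The sketch shows how two regions with identical rates within each age group can still differ in their pooled rates when one sample contains more younger speakers.

```python
# Hypothetical token counts, invented for illustration only (not FRED or NITCS data).
# (region, age group) -> (non-standard tokens, all tokens of the variable)
COUNTS = {
    ("Region A", "older"): (10, 100),
    ("Region A", "younger"): (30, 100),
    ("Region B", "older"): (5, 50),
    ("Region B", "younger"): (90, 300),
}


def stratum_rate(region, group):
    """Share of non-standard tokens for one age group in one region."""
    nonstandard, total = COUNTS[(region, group)]
    return nonstandard / total


def pooled_rate(region):
    """Share of non-standard tokens in a region, ignoring age."""
    cells = [cell for (reg, _), cell in COUNTS.items() if reg == region]
    nonstandard = sum(n for n, _ in cells)
    total = sum(t for _, t in cells)
    return nonstandard / total


if __name__ == "__main__":
    for region in ("Region A", "Region B"):
        print(region,
              "older:", round(stratum_rate(region, "older"), 2),
              "younger:", round(stratum_rate(region, "younger"), 2),
              "pooled:", round(pooled_rate(region), 2))
```

In this invented data set both regions show the same rate within each age group (0.10 for older and 0.30 for younger speakers), yet the pooled rate is higher for Region B (about 0.27 as against 0.20) simply because its sample contains more tokens from younger speakers. A purely regional comparison would therefore suggest a difference that disappears once age is taken into account.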
The interplay of these determinants of variation is analysed in inferential and predictive statistical analyses, which are based on data from the FReiburg English Dialect corpus (FRED) and from the Northern Ireland Transcribed Corpus of Speech (NITCS).

The remainder of this section provides an overview of relevant terminology (1.1) and a brief sketch of previous research on the topic (1.2). Section 2 introduces the data and methods underlying the analysis of complement clauses. It is followed by three case studies of the dialect features exemplified above: section 3 examines the use of subject-verb inversion in interrogative complement clauses, section 4 is concerned with the use of the complementizer as in that clauses, which has received little attention so far, and section 5 analyses the use of for to in complement, postmodifying and adverbial clauses. Section 6 offers a concluding discussion.

1.1. Terminology

Grammars typically assign a higher value to clauses than to sentences in grammatical analysis. Clauses are considered to be “a more clearly-defined” (Quirk et al. 1985: 47), or “a more basic unit than the sentence” (Huddleston and Pullum 2002: 45).1 While phrases combine to form clauses, clauses do not have to combine to form sentences – a single clause can already constitute a sentence, e.g., He won (Huddleston and Pullum 2002: 45). The general definition of “clause” is that it typically consists of a noun phrase and a verb phrase, which are often also referred to as subject and predicate respectively (Quirk et al. 1985: 42–43, 50, Biber et al. 1999: 120, Huddleston and Pullum 2002: 22, Carter and McCarthy 2006: 486). The verb (phrase) “is the most ‘central’ element” (Quirk et al. 1985: 50) in a clause, as “it largely determines what else must or may occur in the clause” (Carter and McCarthy 2006: 486). In contrast, noun phrases are not obligatory in all non-finite clauses, as these often do not require an explicit subject, an example of which is (try for) to tell her.

The term “complement clause” for a clausal complement (and not nominal clause) is based on Biber et al. (1999: 194). Their use of “complement” is one of the few cases in which they do not adopt Quirk et al.’s (1985) terminology:

In the few areas where we have departed from CGEL [i.e., Quirk et al. (1985)], it has been for a specific reason. For example, the term ‘complement’ ... is used in LGSWE [i.e., Biber et al. (1999)] in a broad sense that is well-entrenched in American tradition, and roughly equivalent to ‘complementation’ in CGEL. To avoid confusion, we have adopted ‘complement’ in this American sense, and have replaced the terms ‘subject complement’ and ‘object complement’ in CGEL with ‘subject predicative’ and ‘object predicative’ in LGSWE (Biber et al. 1999: 7).

There are two parallel definitions of “complement”. In functional linguistics, a complement is a grammatical relation similar to subject, object and verb. There are subject complements or object complements, which provide information on the subject or object. In (4), very pretty would be the subject complement and it would be the only complement. In formal linguistics, however, complement is an umbrella term for constituents of a phrase: in (4), very pretty would be a complement in the verb phrase, the old a complement in the noun phrase etc.

(4) The old town is very pretty

The terminological confusion around “complement” is increased by the related term “complementation”. “Complementation”, as pointed out in the quote above, is used to encompass all arguments (or complements) a certain word class (the heads) can select (Quirk et al. 1985: 1150–1232, Carter and McCarthy 2006: 504–529). Thus, an analysis of the complements of the verb phrases try to tell her and embedded to tell her in (1) shows that to tell her is a clause functioning as a complement of try, i.e., a complement clause, whereas her is a complement of tell consisting of a noun phrase. In terms of verb complementation both verbs in the sentence he would try to tell her, try and tell, are transitive. In the case of try, the required object is the (infinitive) clause to tell her. Her has the same relation to tell as to tell her has to try.

For a study on syntactic variation the difference between a clausal and a non-clausal object or complement is crucial. Non-clausal, i.e., nominal, objects or complements cannot be expected to show much syntactic variation as regards word order or non-standard complementizers. Therefore, it was more appropriate to select a framework that offers a distinct term for clauses functioning as complements or objects. “Complement clause” as used by Biber et al. (1999) is the most unambiguous term and also both wide and narrow enough to delimit the exact phenomenon under analysis.2

As all complement clauses are dependent clauses controlled by a matrix clause, it makes more sense to search for the matrix verbs, although this also yields non-clausal material, e.g., object nouns. Consequently, the possible search strings had to be limited to the best manageable ratio between the amount of extracted material and the number of complement clauses it actually contained.
This was achieved by selecting verbs that most frequently control the relevant types of complement clauses (see section 2.2 for a detailed account of the extraction method and analyses).

Biber et al. (1999) offer the best framework for a corpus-linguistic study as their grammar was compiled in a corpus-based approach. They offer an overview of how frequently complement clauses occur with which matrix verbs and in which noun phrases (1999: 644–759) and thus present a useful basis for selecting appropriate search strings.

Consequently, Biber et al.’s categorisation of complement clauses is adopted. In addition, main clauses that are superordinate to complement clauses in one syntactic unit are referred to as “matrix clauses” as defined in Quirk et al. (1985: 991). Biber et al. (1999) distinguish four major types, the names of which are based on the lexical characteristics of the clause types. The types relevant for the analyses in sections 3–5 are (complement clauses in bold): that clause, as illustrated in (5), wh- clause, as illustrated in (6) and to infinitive clause, as illustrated in (7).

(5) I honestly believe that he never told you anything wrong (FRED, CON_010)
(6) I don’t know why I didn’t go out that Sunday night (FRED, WIL_004)
(7) No, they wouldn’t want to be with us (FRED, DEN_004)
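The trade-off described above, between the amount of material a matrix-verb search string extracts and the number of complement clauses that material actually contains, can be sketched as a simple ratio. The following is only an illustrative sketch and not the extraction procedure of section 2.2: the two regular expressions, the toy utterances, the manual judgements and the function name are all hypothetical.

```python
import re

# Hypothetical search strings for two matrix verbs (illustrative only).
SEARCH_STRINGS = {
    "know": r"\bkn(?:ow|ows|ew|own)\b",
    "say": r"\b(?:say|says|said)\b",
}

# Invented toy utterances with a manual judgement: does the matrix verb
# control a complement clause (True) or only non-clausal material (False)?
TOY_HITS = [
    ("I don't know why he left", True),
    ("I know the man", False),
    ("we knew as he was poorly", True),
    ("she said that it was late", True),
    ("he said a prayer", False),
]


def yield_ratio(pattern, judged_utterances):
    """Return (hits, complement clauses among the hits, ratio) for one search string."""
    judgements = [is_cc for text, is_cc in judged_utterances
                  if re.search(pattern, text)]
    hits = len(judgements)
    clauses = sum(judgements)
    ratio = clauses / hits if hits else 0.0
    return hits, clauses, ratio


if __name__ == "__main__":
    for verb, pattern in SEARCH_STRINGS.items():
        print(verb, yield_ratio(pattern, TOY_HITS))
```

On such a measure, a verb whose hits mostly contain complement clauses would be a more economical search string than one that mainly returns object nouns; which verbs were in fact chosen is a matter for the account in section 2.2.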

1.2. Previous research on complement clauses and British English dialect syntax

British dialects have been the subject of a considerable amount of linguistic research since the 17th century (McArthur 1992: 297). Until the 1980s, however, interest in dialect phonology and lexis was more pronounced than that in morphology and syntax, or dialect grammar. Since then research has begun to deal with syntactic variation in British dialects (see sections 1 and 2 of the general introduction to this volume).

Research on complement clauses, on the other hand, usually focuses on a specific type of complement clause, variation within this type of complement clause or the selection of different types of complement clauses by the same verbs or groups of verbs (see, e.g., Mair 1990; Rudanko 1998, 1999, 2005; Rudanko and Luodes 2005; Granath 1997). These studies do not take into account regional differences.

More theoretical accounts of complement clauses are offered by Noonan (1985) and Dixon (1995). Dixon (1995) points out the importance of semantic distinctions in the choice between different complement clause types, and Noonan (1985) describes English complement clauses from a typological perspective.

The syntactic phenomena relevant for the present study are subject-verb inversion in indirect questions, the infinitive marker for to and the replacement of the complementizer in that clauses (I think (that) it’s true) by as. As these loci of variation are concerned with individual types of complement clauses, the relevant previous research on these types of complement clauses is discussed in the respective sections of the present study.

Early transformation rules for English complement clauses are described in Rosenbaum (1967). However, more current generative theory deals with different types of complement clauses in different generative frameworks. For to clauses are investigated in Government and Binding theory (Henry 1992, 1995: 81–104) and in X-bar theory (Chomsky and Lasnik 1977).
Embedded inversion has been discussed in more detail after the publication of the Minimalist Program (McCloskey 1991, 2006; Henry 1995: 105–123), which does not aim to determine transformations from deep to surface structure such as the wh- transformations necessary for generating English relative clauses (see Chomsky 1992: 1–8). The respective constituent structures of the individual types of complement clauses will be discussed in detail in the corresponding sections.

The tradition of descriptive studies of English dialects also continues in studies such as Millar (2007) or Hughes and Trudgill (1996), both of which focus on providing an inventory of dialect features and place emphasis on phonology and vocabulary, while the descriptions of dialect grammar remain rather short.

From the 1980s onwards, several publications collect descriptions of British dialect grammar to provide a comparative overview: Trudgill (1984, 2000), Milroy and Milroy (1993) and Kortmann et al. (2004). All of them have two things in common: firstly, several features appear in the descriptions of more than one variety, the most extreme example of which is multiple negation, as in The young ones don’t know nothing about it today (FRED, WIL_012) (see, e.g., Wagner 2004: 170; Anderwald 2004: 187–188; Edwards 1993: 226; Trudgill 2004: 151; Filppula 2004: 82), and secondly, none of them contains a section on the grammar of the dialect spoken in the English Midlands.

Earlier dialectological studies examined English Midlands dialects rather frequently (e.g., Bonaparte 1875; Darlington 1887; Campion 1976). The (West) Midlands dialects are discussed in Kortmann and Schneider’s (2004) first volume on phonology (Clark 2004), but do not occur in the second volume on morphology and syntax (Kortmann et al. 2004). Consequently, the bibliography on research on the English Midlands dialects provided by Kortmann and Schneider (2004: CD-ROM) mainly cites work on Midlands phonology and quite a number of non-specialist accounts such as private websites, most of which are exclusively concerned with the dialect spoken in the “Black Country” (a conurbation north and west of Birmingham). It seems as if the grammar of English Midlands dialects contained no dialectal variants that are considered worth analysing.

Edwards and Welten discuss the fact that there are great differences in the amount of research published on individual regions in the United Kingdom.
A lot of research published before their review describes the varieties of English spoken in Scotland, Ireland, Southwest England, Northern England (especially Yorkshire and Tyneside) and London. The dialects of East Anglia, the English Midlands and Southeast England (except London), however, had received considerably less attention at that time (1985: 100–105, 123). Consequently, McArthur’s encyclopaedic article on the Midlands dialect (1992: 660) does not list any grammatical dialect features, in contrast to his article on dialect in Scotland (1992: 299).

The two features analysed in sections 3 and 5 are the most frequently mentioned dialect features in complement clauses in Kortmann et al.’s (2004) overview. Penhallurick (2004: 108), Beal (2004: 134), Miller (2004: 64), Filppula (2004: 85–86) and Wagner (2004: 168) mention the use of for to instead of to, and subject-verb inversion in indirect questions (embedded inversion), as in I don’t know have you seen this (FRED, LAN_012), is described in Miller (2004: 58), Filppula (2004: 93–95), Penhallurick (2004: 104–105) and Beal (2004: 128–129).

That many syntactic dialect features are not exclusive to a particular region is frequently commented on in the literature. Harris (1984: 131) notes that “[m]any features of northern Hiberno-English morphology and syntax are to be found in other non-standard dialects throughout the English-speaking world ... or in records of earlier forms of the standard language”. To Thomas (1984: 189) it is no surprise that “frequently, of course, non-standard forms are not specifically confined to varieties of Welsh English, but are more generally features of non-standard usage”. Macafee notes that the infinitive marker to becomes for to “as in other non-standard varieties” (1992: 14).

Particularly widespread features are often considered to be the results of dialect-levelling, by which the differences between individual dialects disappear.
However, Edwards and Welten (1985: 121–122) challenge this assumption and suggest that what appears to be dialect levelling might be a feature that used to be shared by all varieties of English, but was lost in standard English (as is the case with multiple negation).

Comparative studies of the syntax of different dialects are still rare; exceptions are Filppula’s (1994) comparison of different Irish varieties of English and Tagliamonte et al. (2005). The aim of the latter is, however, to establish baseline features for a comparison with American dialects and not for the analysis of differences between these dialects. The data represent six locations in the : Northern Ireland, the Scottish Lowlands, Northeast Scotland, Northwest England and Northeast England (2005: 91). The analysis in Tagliamonte and Smith (2005) is based on data from four locations in Northern Ireland, the Scottish Lowlands and Northwest England (2005: 297). However, it does not take into account differences between these dialects with regard to the frequency of the omission or retention of that.

Consequently, the present study aims to contribute to filling the gaps identified by Edwards and Welten (1985: 123):

four requirements ... should be met if we want to have a sufficiently clear and valid description of British dialects. First, there is a great need for more sociolinguistic research into dialect variation. Although we have descriptions of a large number of dialects, it is in most cases unclear who uses these dialects when, where and to what degree. Second, certain geographical areas need to be studied in more detail. As we have already argued, some areas like Scotland and the North of England are very well documented; others, such as Wales and the central South coast of England, are relatively neglected. Third, certain grammatical areas deserve more attention and there are some very interesting pointers as to where we might start when looking for grammatical variation, for example, the use of modal auxiliaries and the complex verb phrase in general. Fourth, research should start to deal with syntactic variation. (1985: 123).

This study meets these requirements by the analysis of syntactic variation in British dialects through sociolinguistic methods. It is based on data from a wide range of regional varieties, with the inclusion of the English Midlands and Southeast English rural areas; it takes into account social factors like the age and the sex of speakers and it examines the frequencies of dialectal syntactic variation. Similarly to studies such as Tagliamonte and Smith (2005: 397) and studies published in Rohdenburg and Mondorf (2003), it aims to identify the determinants of syntactic variation with the assumption that region is one of these determinants.

Auer (2004: 70–72) distinguishes between three types of syntactic variation: syntactic features of the standard spoken language, syntactic dialect features and supra-regional non-standard features which occur in all dialects, but not in the spoken standard. Multiple negation is one of the latter features. He observes that there is a continuum between dialect features and pervasive [supra-regional] non-standard features (2004: 79). The present study tries to locate the variation in complement clauses in this continuum.

2. Data and methods

The core objective of this study is to determine whether a speaker’s regional origin determines her or his choice between the available options for the expression of the same syntactic structure. If that is the case, a dialectal variant is identified. In order to achieve this objective, the present study draws on statistical analyses of spoken data. The basic assumption is that when there is a choice between two or more linguistic options, this choice depends on both internal, i.e., linguistic, and external factors. The linguistic options in one linguistic structure, e.g., the use of for to versus to, or subject-verb versus verb-subject word order in I don’t know where they went versus where did they go, are considered to be two different options of the same syntactic structure, or two variants of a linguistic variable.

2.1. Data

Besides FRED (see section 3 in the general introduction to this volume), the analyses draw on The Northern Ireland Transcribed Corpus of Speech (NITCS), which also consists of transcribed interview recordings. It contains roughly 240,000 words and was collected in Northern Ireland. In contrast to FRED, this corpus contains a higher percentage of younger speakers and women (Barry 1981; Kirk 1992), which will be discussed below. For examples that could not be obtained in these two corpora, the British National Corpus (BNC) (Burnard 2000; see http://www.natcorp.ox.ac.uk/) was also used. Altogether, FRED and the NITCS contain 2.74 million words.

In general, FRED and the NITCS are similar: they are not tagged for morphosyntax and the transcriptions in both corpora are orthographic, although they represent pronunciations such as wanna for want to and other contractions such as don’t. Nevertheless, there are important differences between FRED in general and the NITCS on the one hand and within different regional subsets of FRED on the other hand. These will be pointed out below.

This section continues with a description of the database in sections 2.1.1 and 2.1.2, which is followed by a discussion of speech style in the corpora in section 2.1.3 and a short comment on the representation of the corpus texts (section 2.1.4). Section 2.1.5 explains the extraction of relevant data from the corpora and the exclusion of “noise” from the data samples.

2.1.1. Locations

Table 24 lists the nine regional data subsets and the number of words contained in each; it also states typical abbreviations used for each region (Kirk 1992: 65; Hernández 2006: 4). All regions but Northern Ireland and the Scottish Highlands and Islands are subsets in FRED (see section 3 in the general introduction to this volume). As the average number of words from each region in FRED (approximately 250,000 words) is similar to the total number of words in the NITCS (just over 240,000), the Northern Irish data were not divided according to county, but subsumed in the region “Northern Ireland”.

Table 24. Regional data subsets in the present study with typical abbreviations and number of words contained

Region                          Abbreviations               Number of words
Northern Ireland                North Irl., N. Irl.                 240,332
Scottish Highlands & Islands    (Scot.) High& Isl.                  166,554
Scottish Lowlands               Sc./Scot. Lowl.                     175,819
Isle of Man                     Man, I. Man, IoMan                   10,461
Northern England                North, N. Engl./England             487,477
English Midlands                Midlands, E. Midl.                  351,284
Wales                           Wal                                  88,755
Southwest England               Southwest, SWest Engl.              588,864
Southeast England               Southeast, SEast Engl.              624,431

As FRED contains few data from the Scottish Highlands (23,872 words; see section 3.2 in the general introduction to this volume), these data were subsumed with the data from the Hebrides in the region “Scottish Highlands and Islands”, which is in line with linguistic research on this area (see, e.g., Shuken 1984). According to Macafee and Ó Baoill (1997: 245), the Highlands and Western Isles (comprising the Inner and Outer Hebrides) constitute one of three cultural areas in Scotland because of their common Gaelic language background. The Celtic language spoken on the Isle of Man is Manx and thus differs from Scottish Gaelic (see Thomson 1992; Barry 1984). Therefore, the data from this region were not merged with any other data set, although they contain rather few words.

Although most of the data in FRED stem from oral history projects, this is not the case for the data from the Hebrides and parts of the Scottish Lowlands data, which were collected and compiled for linguistic purposes (Benedikt Szmrecsanyi: personal communication),3 as was the NITCS (Barry 1981: 23). This affects the social composition of the respective data subsets (see section 2.1.2) and possibly also the style of speech used by the informants (see section 2 in the general introduction to this volume and the discussion below in section 2.1.3). This aspect must be considered in the interpretation of the analyses.

Table 25. Age means in FRED and the NITCS

Region                          Mean age
English Midlands                   81.02
Isle of Man                        81.00
Northern England                   73.85
Northern Ireland                   39.70
Scottish Highlands & Islands       44.49
Scottish Lowlands                  57.74
Southeast England                  75.15
Southwest England                  78.16
Wales                              82.43

2.1.2. Social composition

The vast majority of speakers in both FRED and the NITCS are from a rural working-class background. However, for about 200 speakers in FRED, the information on their occupation is not obtainable. The corpora thus provide data for geographic, rather than social class differentiation (see section 2.2.1). Interviewers and their speech are not included in the present study.

Whereas most of the speakers in FRED are NORMs (Hernández 2006: 6), the NITCS data approximate a more representative sample of the general population by including speakers from three age-groups in most of the locations (Kirk 1992: 65). The average age of all speakers in both corpora is 63.1 years, with a range from 6 to 102 years, which is the age range in FRED. In the NITCS, the youngest speaker is 9 years and the oldest is 91 years. The individual regions differ significantly with regard to the mean age of their speakers. Table 25 lists the mean age in each region in the corpora. For missing information on the exact age of a speaker, the corpora provide different kinds of approximations: FRED offers the decade of birth (e.g., 192 for the 1920s), while the NITCS lists indications such as 60+ for a speaker who is older than 60 years.

Figure 13 illustrates the differences in the distribution of speakers’ age in the corpora. The black boxes represent the mean age in each region. The upper and lower limits of a 95% confidence interval are shown by the two horizontal lines above and below each black box, connected by a vertical line, the error bar. These confidence intervals indicate the range of mean age for which there is a 95% probability in a second data sample collected under the same conditions (in the same region). Thus there is a 5% probability of error that the mean age of a second data sample would be outside this range. This probability of error corresponds to the p-value that indicates a statistically significant observation (Zoefel 2002: 73–74): means of age outside the confidence interval are statistically significant. Consequently, regions for which error bars do not overlap differ significantly from each other with respect to their mean age.

Figure 13. Age means and confidence intervals (CI) across the regions included in the corpora

The error bar for the age of speakers from the Isle of Man overlaps with all other regions. FRED has only two speakers from the Isle of Man, which is too small a number to be representative. Consequently, the mean age of a second sample could be nearly anything within the range of the rest of the corpora. Since the mean age of speakers in the corpora is rather high, the programme impressively assumes that the mean age of another sample from the Isle of Man could be up to 130 years. Disregarding the Isle of Man, however, two clearly separated age groups of speakers emerge: Northern Ireland, the Scottish Highlands & Islands and the Scottish Lowlands on the one hand and the rest of FRED on the other hand. The first group comprises exactly those regions in which the data were collected for linguistic purposes (as mentioned above).

The differences in the percentage of men and of women in the individual regions are not as clearly separable into different groups, and, more importantly, do not differ according to the compilation methods of the respective data subsets. Nevertheless, in a couple of regions, the percentage of women and men differs significantly from others, as shown in Table 26.

Table 26. Percentage of women in the regions of FRED and the NITCS; statistically significant deviations from the average percentage of all regions are marked with an asterisk; statistical significance calculated in cross tabulations according to adjusted residuals (see appendix); overall chi2 p < 0.001

Region                          % Female speakers    Total n of speakers
English Midlands                             24.6                     57
Isle of Man                                   0                        2
Northern England                             45.2*                    73
Northern Ireland                             41.7                    115
Scottish Highlands & Islands                 49.0*                    51
Scottish Lowlands                            34.0                     53
Southeast England                            16.7*                    54
Southwest England                            30.2                    106
Wales                                        15.4                     13

Total                                        34.5                    524

In total, 34.5% of the speakers in the corpora are women. There are significantly fewer women in the data from Southeast England and significantly more women in the data from Northern England and the Scottish Highlands and Islands. The total numbers refer to only those speakers whose sex is stated in the supplementary material to the corpora. Although there are lower percentages of women in the data from Wales and the Isle of Man than in the data from Southeast England, there are too few speakers included in the data from the two former regions to make their difference from the average statistically significant.

To sum up, the different purposes (and thus possibly methods) of compilation of the data from the three regions Northern Ireland, Scottish Lowlands and Scottish Highlands and Islands, as opposed to the purpose of the remaining FRED data, have an impact on the age of speakers whose speech was recorded, but less so on the sex of the informants.
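
The reading of the error bars in Figure 13 can be retraced with a small calculation. The following is a minimal sketch, not the procedure of the statistics programme used in the study: it computes a 95% confidence interval for a mean from Student's t and checks whether two intervals overlap. All ages in it are invented for illustration (the corpus metadata are not reproduced here), and the names `mean_ci` and `overlap` are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t for very small samples
# (degrees of freedom -> t), taken from a standard t table.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}


def mean_ci(ages):
    """Return (mean, lower, upper) of a 95% confidence interval for the mean."""
    n = len(ages)
    mean = statistics.mean(ages)
    standard_error = statistics.stdev(ages) / math.sqrt(n)
    half_width = T_CRIT_95[n - 1] * standard_error
    return mean, mean - half_width, mean + half_width


def overlap(ci_a, ci_b):
    """True if two (mean, lower, upper) intervals share at least one value."""
    return ci_a[1] <= ci_b[2] and ci_b[1] <= ci_a[2]


# Hypothetical ages, invented for illustration only.
two_speakers = mean_ci([77, 85])          # a region with only two speakers
younger = mean_ci([38, 40, 41, 39, 42])   # a region with younger speakers
older = mean_ci([72, 75, 74, 73, 76])     # a region with older speakers

for label, ci in [("two speakers", two_speakers),
                  ("younger", younger),
                  ("older", older)]:
    print(f"{label}: mean {ci[0]:.1f}, 95% CI {ci[1]:.1f} to {ci[2]:.1f}")
```

With only two data points the interval becomes so wide that it overlaps with both other groups and reaches beyond any plausible age, which is the effect described for the Isle of Man; the two five-speaker groups, by contrast, do not overlap.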

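The adjusted residuals on which the significance statements about Table 26 rest can be approximated from the published figures. The sketch below is an illustration under stated assumptions, not the original cross tabulation: the number of women per region is not given in the table, so it is reconstructed here by rounding percentage times total, and a residual is treated as significant if its absolute value exceeds 1.96. The function name `adjusted_residuals` is hypothetical.

```python
import math

# (percentage of female speakers, total number of speakers) as in Table 26
TABLE_26 = {
    "English Midlands": (24.6, 57),
    "Isle of Man": (0.0, 2),
    "Northern England": (45.2, 73),
    "Northern Ireland": (41.7, 115),
    "Scottish Highlands & Islands": (49.0, 51),
    "Scottish Lowlands": (34.0, 53),
    "Southeast England": (16.7, 54),
    "Southwest England": (30.2, 106),
    "Wales": (15.4, 13),
}


def adjusted_residuals(table):
    """Adjusted residual of the 'women' cell for each region (region x sex table)."""
    # Assumption: raw counts are recovered by rounding percentage * n / 100.
    women = {region: round(pct * n / 100) for region, (pct, n) in table.items()}
    total = sum(n for _, n in table.values())
    share_women = sum(women.values()) / total
    residuals = {}
    for region, (_, n) in table.items():
        expected = n * share_women
        # Variance of the cell under independence of region and sex.
        variance = expected * (1 - share_women) * (1 - n / total)
        residuals[region] = (women[region] - expected) / math.sqrt(variance)
    return residuals


residuals = adjusted_residuals(TABLE_26)
significant = sorted(r for r, z in residuals.items() if abs(z) > 1.96)
print(significant)
```

Under these assumptions the regions exceeding the threshold are Northern England and the Scottish Highlands & Islands (positive residuals) and Southeast England (negative residual), in line with the description of the table. Wales and the Isle of Man stay below the threshold despite their low percentages because their speaker totals are small.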
2.1.3. Speech style

Due to the interview situation, the language style in the database is rather formal, although the speech is spontaneous (Labov 1972b: 79–80). The awareness of being recorded is sometimes directly addressed in the interviews, as in (8). The brackets contain the original transcription showing pronunciation variants (gorn, t’, f’) and are given here as an example (Hernández 2006: 2, 10–11, 35). In all subsequent examples, these original transcriptions are omitted.4

(8) Course, I aren’t (reg sic=gorn) going (reg sic=t’) to swear. That wun’t do (reg sic=f’) for the mike, what he said. (FRED, SFK_037)

Nevertheless, some interviews also contain features that are likely to trigger casual speech: in Labov’s terms, these are the contexts “speech not in direct response to questions”, “childhood rhymes and customs” and “the danger of death” (1972b: 90–94). All life memories collected in the oral history projects are also generally instances of a context corresponding to speech styles in which “the emotional state or attitude of the speaker overrides any formal restrictions, and spontaneous speech emerges” (Labov 1972b: 91).

The central aim of oral history interviews is to elicit the informant’s personal, emotional experience, for which “it is essential to give the person you are recording plenty of space to tell you what they think matters” (Oral History Society 2007), i.e., starting a narrative is not considered a digression from the answer to a question, but exactly what the interviewer intends to get. Oral history interviewers are advised to “try to explore motives and feelings with questions like ‘Why?’ and ‘How did you feel?’ ” (Oral History Society 2007) in order to elicit personal experiences. Consequently, although the interview situation is one that represents a formal setting, the topics of oral history are likely to evoke emotional stories, in which more casual speech may emerge and in which the observer’s paradox can be at least partly overcome.

Due to the observer’s paradox, it may have been more difficult to elicit casual speech in the data subsets that are compiled for linguistic purposes (cf. Kirk 1992: 68–69), even though the interviewers’ aim was to elicit spontaneous and rather casual speech in which non-standard forms are more likely to occur. Too much encouragement by the linguist to use dialect features, however, might also cause an overuse of non-standard grammar and violate the principles of the observation of natural language use set forth in Labov (1972b: 208–209).
In conclusion, although the aims of the interviews differ in some parts of the data subsets, both types aim to elicit a similar style of speech and the corpora provide similar material. Nevertheless, the objectives may have had an effect on the speech used by the informants: a specialist in language is perhaps more intimidating than somebody who is interested in the informant’s life memories.

2.1.4. Representation of the data

Both FRED and the NITCS consist of orthographic transcriptions in plain text files, so no general changes of the examples are necessary. In the following, tags and brackets are removed from the examples, so that (8), repeated here as (9), appears as in (10). All spellings are retained and marked “[sic]” if they are unusual. Words or phrases that are possibly difficult to understand are explained in square brackets, as in example (11). Only spacing is adjusted: spaces in contractions and between words and following commas are deleted, so that, e.g., both don ’t and don’ t are quoted as don’t and Course , becomes Course,.

(9) Course, I aren’t (reg sic=gorn) going (reg sic=t’) to swear. That wun’t do (reg sic=f’) for the mike, what he said. (FRED, SFK_037)
(10) Course, I aren’t going to swear. That wun’t do for the mike, what he said. (FRED, SFK_037)
(11) I’ve heerd [heard] me dad say that he goo [sic] to Grimes’s at the pub ond [sic] he’d get blind drunk. (FRED, SAL_038)
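
The adjustments that turn (9) into (10) can be expressed as a few substitutions. The following is only a sketch of the conventions described in this section, not the procedure actually used for the study: it assumes that pronunciation tags have the bracketed shape shown in (9), and it re-joins only the n’t contraction. The function name `clean_example` and the patterns are hypothetical.

```python
import re

# Assumption: pronunciation tags look like "(reg sic=gorn)", as in example (9).
TAG = re.compile(r"\(reg sic=[^)]*\)\s*")
# Assumption: only the n't contraction is re-joined; both ’ and ' are accepted.
SPLIT_NT = re.compile(r"n\s*([’'])\s*t\b")
# A space (or several) before a comma is deleted.
SPACE_BEFORE_COMMA = re.compile(r"\s+,")


def clean_example(text: str) -> str:
    """Remove transcription tags and adjust spacing; spellings are left untouched."""
    text = TAG.sub("", text)
    text = SPLIT_NT.sub(r"n\1t", text)
    return SPACE_BEFORE_COMMA.sub(",", text)


EXAMPLE_9 = ("Course, I aren’t (reg sic=gorn) going (reg sic=t’) to swear. "
             "That wun’t do (reg sic=f’) for the mike, what he said.")

print(clean_example(EXAMPLE_9))
print(clean_example("I don ’t know"), "|", clean_example("Course , I said"))
```

Non-standard spellings such as wun’t pass through unchanged, since only tags and spacing are addressed.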

2.1.5. Data extraction, categorisation and problem cases

The database for the analyses of finite clauses in sections 3 and 4 was collected through concordance searches of the most frequent matrix verbs of wh- and that clauses in British English: think, say, know and see.5 Think and say are the most frequent verbs controlling that clauses in conversation (Biber et al. 1999: 668). Know and see are the two most frequent verbs controlling wh- clauses and they are also the third and fourth most frequent verbs controlling that clauses in conversation (Biber et al. 1999: 668–669). As finite complement clauses had to be identified in the concordances of know and see for section 3 anyway, the that clauses controlled by know and see were also extracted.

Certain uses of the selected matrix verbs posed obstacles in data extraction, as they had to be distinguished from their function as matrix verbs: you know and you see are discourse markers and say introduces direct and indirect speech reports. These functions and the distinction from their uses as matrix verbs of complement clauses and their pragmatic functions are illustrated below. The resulting coding schemes for the case studies of embedded inversion and of the complementizer as are summarised in the respective methodology sections 3.2 and 4.2.1. The database for the analysis of infinitive clauses introduced by (for) to could be compiled directly by concordance searches of infinitives. This compilation is discussed in section 5.3.

Data extraction

Most finite complement clauses cannot be identified automatically in the data. That, which is a complementizer in finite clauses, can also be a relativizer (the film that you watched) or a demonstrative pronoun (that house). Both functions are pervasive in spoken language. The same is true of wh- words introducing complement clauses (e.g., what in [you] found out what stalls are vacant (FRED, WAR_001)), as they are multi-functional and also occur as interrogative words and relative pronouns (Quirk et al. 1985: 366, 817, 1006).

As all complement clauses are controlled by a matrix expression, they can be extracted via these expressions. A “matrix” clause differs from a “superordinate” clause in that it does not contain the subordinate clause and it differs from a “main” clause in that it does not occur independently without the subordinate clause (Quirk et al. 1985: 991).

Based on Biber et al. (1999: 668, 689, 711, 749) the most frequent matrix verbs of finite and non-finite complement clauses in conversation were selected as search words for concordance searches with WordSmith. All finite complement clauses included in the data sample are controlled by verbs (rather than adjectives) and occur in post-predicate position. The resultant data sample thus contains finite complement clauses in their most frequent syntactic position and with the most frequent type of matrix expression (Biber et al. 1999: 660, 674).

Extracting complement clauses via their matrix verbs also has an advantage for calculating percentages of a linguistic variable: as percentages can only be calculated from a total number of observed cases, these cases need to be elicited from a clearly delimited linguistic structure.
The most frequent matrix verbs thus serve as the delimitation in which a representative distribution can occur; they represent the point at which a speaker most frequently has to choose one of several variants of a complement clause, e.g., a clause introduced by as or by its standard equivalent that.

The search strings in the concordance searches consisted of the verb lemmata, i.e., all inflectional forms, including dialectal variants, such as regularised past tense forms thinked or knowed, or dialect vocabulary, e.g., Scots ken ‘know’. In the concordances of each of these verbs their uses as matrix verbs of finite clauses were identified manually. The respective complement clauses of these matrix verbs were then coded for the relevant variables listed in the respective methodology sections.
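
A lemma-based concordance search of this kind can be approximated with a regular expression. The sketch below is not a reconstruction of the WordSmith searches: the form lists are illustrative assumptions that combine standard inflections with the dialectal forms mentioned above (thinked, knowed, ken), and the names `LEMMA_FORMS`, `build_pattern` and `concordance` are hypothetical. The sample lines are taken from examples quoted in this chapter.

```python
import re

# Assumption: illustrative form lists, not the study's actual search strings.
LEMMA_FORMS = {
    "think": ["think", "thinks", "thinking", "thought", "thinked"],
    "say": ["say", "says", "saying", "said"],
    "know": ["know", "knows", "knowing", "knew", "known", "knowed",
             "ken", "kens", "kent", "kenned"],
    "see": ["see", "sees", "seeing", "saw", "seen"],
}


def build_pattern(forms):
    """Compile a whole-word, case-insensitive pattern matching any listed form."""
    alternatives = "|".join(sorted(map(re.escape, forms), key=len, reverse=True))
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def concordance(lines, lemma, width=30):
    """Return (left context, keyword, right context) for every hit of the lemma."""
    pattern = build_pattern(LEMMA_FORMS[lemma])
    hits = []
    for line in lines:
        for match in pattern.finditer(line):
            left = line[max(0, match.start() - width):match.start()]
            right = line[match.end():match.end() + width]
            hits.append((left, match.group(0), right))
    return hits


SAMPLE = [
    "I don’t know have you seen this",
    "So I said I would take two.",
    "...they said that they would win and they were beat.",
]

for left, keyword, right in concordance(SAMPLE, "say"):
    print(f"{left:>30} [{keyword}] {right}")
```

Such a search only retrieves candidate lines. It would also return noise (a form list containing kent matches the place name Kent, for instance), so the identification of matrix-verb uses remains a manual step.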

Direct versus indirect speech

A central question in extracting complement clauses from the concordances of the matrix verbs for the present study was how to distinguish direct from indirect speech, in particular for the concordances of say and for the differentiation of direct questions from indirect questions with embedded inversion. Indirect speech is generally rendered in a subordinate clause, as in (12-a) and (13-a). Their possible direct speech equivalents are shown in (12-b) and (13-b).

(12) a. So I went along to Bessey and Palmer’s man and asked what he’d give me for the new lot of coal. (FRED, SFK_006)
     b. I asked, “What will you give me for the new lot of coal?”
(13) a. ...they said that they would win and they were beat. (NITCS, A28.1, CM17)
     b. They said, “We will win.”

They said and (I) asked are the matrix or reporting clauses (Quirk et al. 1985: 1020); what he’d give me for the new lot of coal and that they would win are their subordinate or complement clauses. Reported direct speech may contain direct questions, as in (12-b), imperatives, e.g., I said, “Come here”, and may reproduce ellipsis, e.g., Ø You want to leave right now?. In direct speech, deixis locates the utterances in the contexts of the speaker and of time and place of the reported event. In contrast, indirect speech reports are superficially marked by punctuation changes (e.g., absence of question or quote marks) and also feature deictic shifts that locate the reported clauses in the contexts of the speaker, and of time and place of the reporting event. The consequent shifts mainly include tense, pronouns and adverbials of time and space (Quirk et al. 1985: 1025–1030).

Pronoun shifts in the examples above can be observed in the shift from you, the addressee in direct speech in (12-b), to he in indirect speech in (12-a) and from we in direct speech in (13-b) to indirect they in (13-a). The presumed underlying auxiliary will in both (12-b) and (13-b) has to be shifted to past tense would in (13-a) or contracted ’d in (12-a). Direct questions become dependent wh-clauses in indirect speech, as in (12-a), and, in standard English, assume subject-verb word order. If (13-b) were extended by an adverbial such as tomorrow (We’ll win tomorrow), the appropriate shift in an indirect speech report in past tense would result in the change to the next day in they said that they would win the next day. The illocutionary force of directives is mainly expressed by the verb in the reporting clause, so that, e.g., I said, “Come here” would become I told you to come to me in indirect speech, including a shift of place reference from here to to me.

The syntactic and semantic relationship between reported and reporting clause is freer with direct speech than with indirect speech (Biber et al. 1999: 794; Quirk et al. 1985: 1022–1024), which is illustrated in the following modifications of (12) and (13).

(14) a. “What,” I asked, “will you give me for the new lot of coal?”
     b. ?What I asked would he give me for the new lot of coal.
(15) a. “We,” they said, “will win.”
     b. ?That they would win they said.

(14) and (15) show medial and final positions of the reporting clause in the sentences. Moving the reporting clause does not cause any shift in thematic focus or any departure from general question intonation in direct speech. In indirect speech, however, these positions of the matrix clauses are marked. Firstly, it is unusual for a subordinator (here: what) not to appear immediately before the subordinate clause as in (14-b).6 Secondly, pre-predicate that clauses as in (15-b) are quite rare and are mostly reserved for certain styles and functions as they create ‘heavy’ subjects, which are “virtually non-existent in conversation” (Biber et al. 1999: 676, 677). Their main epistemic function is to create verisimilitude, and they are typically used in complex sentences before multiple phrases or clauses and by sports writers (Biber et al. 1999: 676–679).7 Only sentence-initial reporting clauses are possible matrix clauses.

Reported direct speech is in general more independent than complement clauses and also occurs without a reporting clause. Firstly, in contrast to dependent clauses, it can comprise several sentences (Huddleston and Pullum 2002: 1026–1027), as in (16), whereas in indirect speech reported and reporting clause usually make up one sentence.

(16) I say, “I’m all right. I think I’ll finish my days in the Warrior.” (FRED, SFK_004 [quotation marks added, punctuation given])

Secondly, reporting and reported clause are also semantically more independent in direct speech reports than in indirect speech reports (Quirk et al. 1985: 1023), so that say can introduce a direct question, but not its indirect equivalent, as exemplified in (17) and (18). When say controls a wh-clause, it is not an indirect question but an answer-oriented wh-clause (see Ohlander 1986). Ask is typically used as a matrix verb for the reports of indirect questions.

(17) a. ... mother asked me, What happened, what happened? (FRED, HEB_032)
     b. Mother asked me what had happened.
(18) a. ... he said What happened? (FRED, SOM_019)
     b. He asked (me) what had happened?
     c. He said what had happened.

FRED contains an abundance of direct speech reports. They belong to a speech style that aims to keep listeners interested and involved in the conversation. Involvement is also created by the rhythmic intervals produced by the frequent insertion of I say(s), he say(s). Together with the inclusion of a lot of details, the speakers in the corpora show a “high involvement” strategy (Tannen 1989: 9–35). Example (19) is a typical excerpt (FRED, SFK_002).

(19) So one of the tea blokes who was hauling the herrings ashore say, Tea! Tea!, he say, You’ll be lucky to get no bloody tea here, he say. And even if we did bring it ashore, I bet you wun’t drink it. I say, Why? He say, We’re got the dirtiest bugger a-going. Do you know what?, he say. He made a blooming beef pudding the other day, he say, And he wrapped it round with the bloody Daily Mirror!, he say. When we undone it, he say, There was a photograph of Hitler on the pudding!

According to Biber et al. (1999: 1119), the repetition of reporting clauses is also a feature speakers use “to clarify that they are reporting quoted speech”.

Reported speech in the corpora is often not distinctly direct or indirect. In theory, there is a clear boundary: indirect speech is syntactically subordinated to its reporting clause as a subordinate clause, while direct speech is syntactically more independent and is usually set in quote marks. However, in actual spoken language, some of the features of indirect speech may not always be present. Tense backshift, for example, is often ignored if actions in the past are narrated in present tense (historical present) or if they refer to a current situation, to ongoing or recurring action. An example from FRED for present tense used instead of past tense in an indirect speech report is given in (11), repeated here as (20). Pronoun shift does not take place if a speaker reports his or her own speech, or reports a dialogue between two other persons (21). Most of the interviews in FRED contain I-narratives.

(20) I’ve heerd [heard] me dad say that he goo [sic] to Grimes’s at the pub ond [sic] he’d get blind drunk. (FRED, SAL_038)
(21) So I said I would take two. (FRED, WES_006)

The possible omission of that as in (21) further erodes the difference between direct and indirect speech. Without any change in word order or deictic expressions, (21) could be a direct speech report (So I said, “I would take two”). Moreover, the corpora contain speech reports that contain features of both indirect and direct speech, as (22):

(22) And so he come along and said, Would I like to give a lift to putting the army hut up? (FRED, LAN_007)

The punctuation and orthography with comma, capital letter and question mark is the typical representation of a direct speech report in both corpora under analysis. The word order is that of a direct question, as in embedded inversion (see section 3; Quirk et al. 1985: 1052 note; Biber et al. 1999: 920–921). The verb in the reporting clause is said which, in contrast to ask, does not usually introduce an indirect question. These aspects mark the reported clause as a direct speech report. However, the pronoun is shifted to the reporting event and is in first person as the referent is the interviewee. This is not a singular error, but one of many instances in the data that exemplify “a gradient from direct speech that is clearly independent to direct speech that is clearly integrated into the clause structure” to a “mixture of indirect and direct speech” (Quirk et al. 1985: 1023–1024).

Nevertheless, for the data analysis indirect speech reports had to be clearly distinguished from direct speech to be coded as complement clauses, especially when they followed a say-reporting clause. Although the examined corpora aim at a rather orthographic style of punctuation, this style may well be influenced by the personal style of individual transcribers and different guidelines for each of the corpora. Hence punctuation alone does not seem to be a reliable source for syntactic categorisation.8 Several complement clauses with retained that are separated from their matrix clauses by a comma, e.g.:

(23) I would say, that uhm, to me they’re just ordinary people, they were the people I’d played with, so I don’t expect anything special of them, you see. (FRED, LAN_011)

Consequently, criteria other than punctuation had to be defined for the categorisation of reported speech. In general, clausal material is considered to be an indirect speech report unless it is unambiguously direct speech.

Directives as in (24-a) and direct questions following say as in (25-a) are always considered to be direct speech, even if deictic shifts such as the I occur in the reported clause. Standard indirect speech reports of both directives and questions would employ a different reporting verb than say (e.g., told ... to, asked) and the syntactic structure of the reported clause would be changed. Simple imperative put would require the infinitive marker to after tell me in (24), and would I in direct speech would need to change to indirect I would and to be introduced by if or whether. Bold print in the following examples shows the locations of typical changes from direct (a.) to indirect speech (b.).

(24) a. And then he would say, put those two together (NITCS, A32.3, LM7)
     b. And then he would tell me to put those two together.

(25) a. And so he come along and said, Would I like to give a lift to putting the army hut up? (FRED, LAN_007)
     b. He came along and asked (me) if I would like to give a lift to putting the army hut up.

Since direct question word order is a non-standard variant of indirect questions, the use of the reporting verb say instead of ask was crucial for the decision to categorise (25-a) as a direct speech report. If the following features occurred in an utterance, it was consequently regarded as a direct speech report:

Questions: verb-subject word order after the reporting verb say.

Directives: directives expressed by the base form of the verb as in (24-a) are generally considered to occur in direct speech only.

In order to distinguish between direct and indirect speech reports in more ambiguous utterances, a list of further characteristic features of direct speech was compiled. If two instances of these features applied to a clause controlled by the selected matrix verbs, the clause was considered a direct speech report. Because of the possible mixture between direct and indirect speech, the presence of only one of these features was not considered to be sufficient evidence of a direct speech report.

Absence of deictic shifts: Deictic expressions and the tense of the reported clause refer to speaker, time and place of the reported, rather than the reporting event. Each unshifted deictic expression in a reported clause counted as one instance. If backshift occurred, the distinction between simple past and past perfect was not considered necessary. Past perfect rarely occurs in conversation (see Biber et al. 1999: 461; 468–469).

Address terms and greetings: The speech report contains a name, title, etc., a greeting such as hello or a good-bye.

Commas and capital letters: The reported clause is introduced by a comma and/or a proper noun or a capitalised word other than I. The occurrence of both comma and capital letter counts as just one instance. Commas that separate hedges (e.g., , eh,) or discourse markers (, you know,) from reporting or reported clause are not included.

Question and exclamation marks: The reported clause ends with an exclamation or question mark.

Utterance-launchers: The reported clause is introduced by oh, well, alright, aye, yes, no etc. (cf. Biber et al. 1999: 1118–1119).

Non-embedded reported speech: The speech report continues after the reported clause with one or more independent reported clause(s) uttered by the same speaker (Huddleston and Pullum 2002: 1026–1027).
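
The decision rule described above can be summarised in a short sketch. It only illustrates the coding logic and is not the procedure actually used for the study: the class `ReportedClauseCues`, its field names and both function names are hypothetical, the cues are assumed to have been annotated by hand, and "two instances" is read here as "at least two".

```python
from dataclasses import dataclass


@dataclass
class ReportedClauseCues:
    """Hand-annotated cues for one reported clause (hypothetical field names)."""
    inverted_question_after_say: bool = False  # verb-subject order after "say"
    base_form_directive: bool = False          # e.g. "put those two together"
    unshifted_deictics: int = 0                # each unshifted expression counts once
    address_term_or_greeting: bool = False
    introduced_by_comma: bool = False          # hedge/discourse-marker commas excluded
    introduced_by_capital: bool = False        # proper noun or capitalised word other than "I"
    ends_in_question_or_exclamation_mark: bool = False
    utterance_launcher: bool = False           # oh, well, alright, aye, yes, no ...
    continues_unembedded: bool = False         # further independent reported clause(s)


def count_direct_speech_instances(cues: ReportedClauseCues) -> int:
    """Count instances of the further characteristic features of direct speech."""
    count = cues.unshifted_deictics
    count += int(cues.address_term_or_greeting)
    # A comma and a capital letter together count as just one instance.
    count += int(cues.introduced_by_comma or cues.introduced_by_capital)
    count += int(cues.ends_in_question_or_exclamation_mark)
    count += int(cues.utterance_launcher)
    count += int(cues.continues_unembedded)
    return count


def is_direct_speech_report(cues: ReportedClauseCues) -> bool:
    """Unambiguous cases first, then the threshold of two instances (assumed: >= 2)."""
    if cues.inverted_question_after_say or cues.base_form_directive:
        return True
    return count_direct_speech_instances(cues) >= 2


# Example (23): a comma alone is a single instance, so the clause stays indirect.
example_23 = ReportedClauseCues(introduced_by_comma=True)
# Example (25-a): an inverted question after "said" is always direct speech.
example_25a = ReportedClauseCues(
    inverted_question_after_say=True,
    introduced_by_comma=True,
    introduced_by_capital=True,
    ends_in_question_or_exclamation_mark=True,
)

print(is_direct_speech_report(example_23))
print(is_direct_speech_report(example_25a))
```

Keeping the unambiguous cases separate from the counted features mirrors the two-step procedure in the text: a shifted pronoun as in (25-a) does not prevent the classification as direct speech.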

You know and you see: Discourse markers or matrix verbs?

According to Biber et al. (1999: 668, 689), know and see should control that- and wh-clauses less frequently than think and say. In their corpus, think and say together control a that clause over 3,200 times per one million words (1999: 668), while know and see control a wh- or that clause about 2,100 times per one million words. In FRED and the NITCS, however, the concordance searches yielded over 25,000 instances of the think and say lemmata, while the concordance searches of the know and see lemmata returned nearly 38,000 hits.

This higher frequency of know and see in the corpora mainly results from their use in the discourse markers you know and you see (Schiffrin 1987: 267–330; Schourup 1999). The strings you know and you see occur 21,774 times in the corpora, i.e., nearly 60% of the concordances contained you know and you see. By comparison, the strings you think and you say together occur 323 times in the two corpora under analysis, which makes up about one per cent of the concordances.

In order to verify that you know and you see predominantly occur as discourse markers rather than as matrix clauses, two samples of a hundred random instances each of you know and of you see were selected from the concordances (by means of WordSmith’s “Reduce to N” function). In all four samples, the percentage of complement clauses was seven or eight per cent. Seven or eight per cent of all 21,774 instances of you know and you see in the corpora would correspond to 1,600 or 1,700 instances functioning as matrix clauses. The effort of manually analysing 20,000 concordance lines for potentially fewer than 2,000 additional complement clauses in the database was not considered to increase the reliability of the data sample to an extent that would have justified the inclusion of 20,000 concordance lines without matrix clauses. Therefore, all instances of you know and you see were deleted from the concordances.
The resultant database of that and wh- clauses consists of 10,577 clauses (extracted from 41,849 concordance lines).
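
The proportions behind the decision to drop you know and you see can be retraced with a few lines of arithmetic. This is a rough check rather than a reproduction of the original counts: the totals 38,000 and 25,000 are rounded assumptions (the text only gives "nearly 38,000" and "over 25,000"), and the variable and function names are hypothetical.

```python
# Figures taken from the text; the two rounded totals are assumptions.
YOU_KNOW_YOU_SEE = 21_774   # hits for the strings "you know" and "you see"
KNOW_SEE_HITS = 38_000      # "nearly 38,000" know/see concordance lines (rounded)
YOU_THINK_YOU_SAY = 323     # hits for the strings "you think" and "you say"
THINK_SAY_HITS = 25_000     # "over 25,000" think/say concordance lines (rounded)


def share(part: int, whole: int) -> float:
    """Proportion of `part` in `whole`."""
    return part / whole


def projected_matrix_uses(total: int, rate: float) -> int:
    """Project a sample rate of matrix-clause uses onto all hits."""
    return round(total * rate)


discourse_marker_share = share(YOU_KNOW_YOU_SEE, KNOW_SEE_HITS)
you_think_say_share = share(YOU_THINK_YOU_SAY, THINK_SAY_HITS)
low = projected_matrix_uses(YOU_KNOW_YOU_SEE, 0.07)
high = projected_matrix_uses(YOU_KNOW_YOU_SEE, 0.08)

print(f"you know / you see share of know/see hits: {discourse_marker_share:.1%}")
print(f"you think / you say share of think/say hits: {you_think_say_share:.1%}")
print(f"projected matrix-clause uses: {low} to {high}")
```

With these rounded inputs the projection comes out at roughly 1,500 to 1,750 potential matrix clauses, the same order of magnitude as the estimate given above and in any case fewer than 2,000.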

2.2. Methods

The question the present study seeks to answer is: How dialectal is syntactic variation in complement clauses? Since non-standard syntactic features occur in a wide range of dialects, it seems worthwhile to differentiate between dialectal and supra-regional variation. However, “[v]ariation in language is conditioned by a complex range of factors, several of which are likely to be relevant for any given instance of a variant”, as Cheshire (1996: 15) rightly observes. Consequently, no variant can be expected to be a clear-cut case of either dialectal or supra-regional variation a priori, as “there is likely to be a multiple conditioning effect” (Cheshire 1996: 15): a number of factors compete with the regional origin of a speaker, for instance his or her age or sex (see, e.g., Labov 2001; Chambers and Trudgill 1980: 71–74; 88–100). The influence of dialects thus has to be compared with the influence of other factors in order to determine which factors exert the strongest influence on the choice of variants.

Counting the frequencies or percentages of one variant is not sufficient for this purpose. Section 2.1.2 has shown that, for instance, in certain regions in the corpora speakers are significantly younger than in others. If variant A of a complement clause is more common in the language of younger speakers than variant B, it will be more frequent in the regions that have younger speakers. If, however, age were not considered in the analysis, it would seem that variant A is a regional feature.

One basic assumption is that since all data included in the corpora are dialectal data, no region is more standard than the others a priori. The analyses of syntactic variables below do not compare the absence of one variant in some regions (e.g., for to) with the presence of the same variant in other regions. Instead, the deviation from the average frequency of the syntactic pattern in all regions is measured.
Consequently, regions in which a dialect feature such as for to is less frequent than in others appear as less non-standard than the average. This procedure makes it possible to detect the spread of each dialect feature.

Research on syntactic variation has identified a number of linguistic factors that determine the choice of a particular variant (see all contributions to Rohdenburg and Mondorf 2003). These influential factors are not necessarily rival phenomena, but interact in the choice of one or the other variant. In order to conduct quantitative empirical studies, statistical analyses are essential (Chambers and Trudgill 1998: 127–148; Woods et al. 1986: 1–7).

Szmrecsanyi (2006), Tagliamonte and Smith (2005) and Jones (1985) exemplify the usefulness of statistical procedures for the study of syntactic variation. Gries (2003) demonstrates that research on syntactic variation benefits from the identification of a multitude of variables. He shows that an analysis that takes into account the interplay of influential factors, in short, a multifactorial analysis, is a useful tool for the description of variation in that it is able to determine the strength of individual variables. The present study adopts this empirical approach. The statistical procedures and concepts applied in the following sections are listed and explained in the appendix to this chapter. The word “significant(ly)” is only used when it refers to statistical significance.

In particular, this study examines the influence of the region a speaker is from on her or his selection of a linguistic variant in complement clauses. This examination consists of a comparison of the influence of region with social factors such as age and sex as well as with linguistic factors such as the semantics of the clause. It seeks to determine whether, e.g., the for in he would try for to tell her (NITCS, A32.3, LM7) occurs because it is uttered by someone from Northern Ireland, because the speaker is an older woman, because it is introduced by a matrix verb preceded by a modal or because it expresses a purpose. The analyses of variation in complement clauses therefore distinguish between external and internal factors that determine variation. The external factors in this study are the region a speaker is from and his or her age and sex; the internal factors are linguistic properties of the complement clause or its matrix clause. These factors are “independent variables” (see section 2.2.1 below).
Possible internal variables are identified in each relevant section on the basis of the literature on the variation between different types of complement clauses.

This study assumes, firstly, that at least three external independent variables may influence the choice of complement clause type: REGION, SEX and AGE. Secondly, it aims to determine the strength of the influence of individual variables – with the central question of whether REGION has a statistically significant influence on the choice of a linguistic variant. To answer this question, the strength of the influence of each variable has to be compared with the strength of the influence of each further variable involved in the analyses, which yields a so-called multivariate or multifactorial analysis (cf. Chambers and Trudgill 1980: 136–137, 140–144).

The presentations of the findings start with a general overview of the regional distribution of the different types of complement clauses, which shows in which regions the non-standard feature is significantly more or less frequent than on average. They continue by examining whether this regional distribution shows genuine regional preference for or avoidance of a certain type of complement clause or whether it is the regional distribution of a more influential factor that causes the regional distribution of the complement clause type. This is achieved by means of logistic regression analyses.9 Additional analyses concern the (cor)relations between clause types and individual independent variables (see section 2.2.1) other than REGION. The appendix contains a detailed description of the applied statistical analyses and relevant concepts. The following sections introduce and exemplify the variables and coding scheme employed in the present study.
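
The idea of measuring each region against the average, rather than against a presumed standard region, can be sketched as follows. The counts are invented purely for illustration, the function name is hypothetical, and "average frequency" is taken here to mean the pooled proportion across all regions, which is an assumption; the study itself relies on the statistical procedures described in the appendix.

```python
from typing import Dict, Tuple

# Hypothetical counts per region code: (non-standard clauses, all clauses).
# The region codes follow the study; the numbers do not come from the corpora.
example_counts: Dict[str, Tuple[int, int]] = {
    "h": (10, 100),
    "ni": (30, 100),
    "se": (5, 100),
}


def deviation_from_average(counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Return each region's non-standard rate minus the pooled rate of all regions."""
    nonstandard_total = sum(nonstandard for nonstandard, _ in counts.values())
    clause_total = sum(total for _, total in counts.values())
    pooled_rate = nonstandard_total / clause_total
    return {
        region: nonstandard / total - pooled_rate
        for region, (nonstandard, total) in counts.items()
    }


for region, deviation in deviation_from_average(example_counts).items():
    direction = "above" if deviation > 0 else "below"
    print(f"{region}: {abs(deviation):.2f} {direction} the pooled rate")
```

A positive deviation marks a region as using the feature more than average, a negative one as using it less; whether such a deviation is significant, or is due to another factor such as age, is a separate question that this sketch does not address.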

2.2.1. Variables

The basic distinction between different types of variables in statistics is that between dependent and independent variables (see the appendix for more information and explanations). Variables are set in SMALL CAPS. When a specific variant of a variable is referred to, it is added in parentheses to the name of the variable, VARIABLE NAME(value), e.g., FORTO(yes), REGION(Northern England). The linguistic variables that the present study deals with are:

Embedded inversion: word order in embedded interrogative clauses
Complementizer in that clauses: as vs. that
Infinitive clause marker: for to vs. to

These correspond to the dependent variables EI in section 3, ASTHAT in section 4 and FORTO in section 5. Dependent variables represent the observed variation and correspond to the linguistic variables under analysis. They distinguish between the standard variant and the non-standard variant of complement clauses by binary numeric codes: “0” is used for the standard form and “1” is the code for the non-standard form.

As mentioned above, there are two kinds of independent variables assumed to determine syntactic variation: internal, i.e., linguistic, and language-external, i.e., social and regional, variables. The internal factors are identified in the research on the variation in each type of complement clause and are discussed in the respective sections.

The external variables, however, are the same throughout the study: AGE, SEX and REGION. The age and the sex of a speaker have a strong influence on his or her linguistic choices (see, e.g., Labov 2001; Francis 1983: 42–45; Chambers and Trudgill 1980: 71–74; 88–100; Mondorf 2002). Women are generally expected to use more standard forms than men (Coates 1986: 77–78) and ongoing language change emerges in the difference in language use between younger and older speakers.

AGE is measured as the speaker’s age in years. AGE was only coded if the data exhibited an exact number. Cues such as “65+” for speakers in the NITCS as well as decades of birth as in FRED (e.g., 192) were ignored, as the available groupings in the corpora – decades of birth in FRED and three age groups in the NITCS – are not compatible. A speaker who is older than sixty-five years of age could be sixty-nine or eighty-two. This would classify him or her in the age-group “3” in the NITCS but in different decades in FRED.

SEX refers to the sex of speakers, coded “0” if the speaker is a woman and “1” if the speaker is a man. For a certain number of speakers, the information on age or sex is not retrievable.
The values are then “missing”, which statistical analyses take into account.

REGION is the essential variable in the present study. It measures the frequency of syntactic variants in each regional subset of data from the corpora. Each region is coded by a one- or two-letter code, e.g., “h” for the Scottish Highlands and Islands, “ni” for Northern Ireland, “se” for Southeast England.

In only 222 FRED files, i.e., for not even two thirds of the speakers, is the occupation of the informants retrievable. In FRED there is also no indication of occupation for most women, so that an inclusion of occupational data would skew the data in terms of the interrelation of sex and occupation. In addition, most adults in FRED and the NITCS are working class. Therefore, on the one hand, the data do not render a social stratification and, on the other hand, the existing groups are difficult to compare. Occupation or social class background is therefore not taken into account in the present study.

Table 27 summarises the variables and their values. All variables except AGE are nominal (categorical, discrete) variables: they assign data to different categories which cannot be ordered in a meaningful way (see the appendix for a more detailed explanation). AGE is a ratio-scaled variable: coded as the actual years of age, it has an absolute zero, and each year of age can not only be ordered as more or less than any other year, but the meaning of the unit year is also constant: the difference between sixty-four and sixty-three represents the same unit as the difference between six and seven. In contrast, a for to clause does not have a higher or lower value than a to clause – it is or is not used, but there is no unit of measure that differentiates one variant from the other. The nine regions covered in the corpora can also not be ordered from one to nine in a way that would represent one more unit in one region than in another, where a unit refers to a fixed scale of measurement (Howell 2003: 17–21).

Table 27. List of variables

Variable              Dependent or independent   Values
EI                    dependent                  0, 1
FORTO                 dependent                  0, 1
ASTHAT                dependent                  0, 1
SEX                   independent                f, m
AGE                   independent                ‘years of age’
REGION                independent                h, i, l, m, n, ni, se, sw, w
REGION?, e.g., MID?   independent                0, 1
MATRIX                independent                know, say, see, think

The values for binary variables are always 0 and 1, which – except in SEX – express ‘not true’ and ‘true’ respectively. The values of the non-binary categorical variables are either one- or two-letter/symbol abbreviations or the matrix lexemes. Table 28 illustrates how a clause is coded in the database.

Table 28. Coding example, dependent clauses in italics, TYPE represents EI, ASTHAT and FORTO respectively

Text                                                 Type   Sex   Age   Region
I’m tryin’ to see where’s ’is photo                  1      f     –     sw
I would say that real love was ruling                0      m     50    h
I was run home like billy-o for to get this rabbit   1      m     71    n
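
The coding scheme of Tables 27 and 28 can be pictured as one record per clause. The following is a minimal sketch, not the format of the actual database: the class `CodedClause` and its field names are hypothetical, and a missing value is represented by `None` as an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Region codes as listed in Table 27.
REGION_CODES = {"h", "i", "l", "m", "n", "ni", "se", "sw", "w"}
DEPENDENT_VARIABLES = {"EI", "ASTHAT", "FORTO"}


@dataclass(frozen=True)
class CodedClause:
    """One coded clause (hypothetical record layout)."""
    text: str
    variable: str        # dependent variable that TYPE stands for
    type_code: int       # 0 = standard form, 1 = non-standard form
    sex: Optional[str]   # "f" or "m"; None if not retrievable
    age: Optional[int]   # exact age in years; None if not retrievable
    region: str          # one- or two-letter region code

    def __post_init__(self) -> None:
        # Reject values outside the coding scheme.
        if self.variable not in DEPENDENT_VARIABLES:
            raise ValueError(f"unknown dependent variable: {self.variable}")
        if self.type_code not in (0, 1):
            raise ValueError("type_code must be 0 or 1")
        if self.sex not in ("f", "m", None):
            raise ValueError("sex must be 'f', 'm' or None")
        if self.region not in REGION_CODES:
            raise ValueError(f"unknown region code: {self.region}")


# The three rows of Table 28.
table_28 = [
    CodedClause("I’m tryin’ to see where’s ’is photo", "EI", 1, "f", None, "sw"),
    CodedClause("I would say that real love was ruling", "ASTHAT", 0, "m", 50, "h"),
    CodedClause("I was run home like billy-o for to get this rabbit", "FORTO", 1, "m", 71, "n"),
]

for clause in table_28:
    print(clause.variable, clause.type_code, clause.sex, clause.age, clause.region)
```

Keeping the missing age as an explicit empty value, rather than as a placeholder number, avoids treating it as a real age in later calculations.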

3. Embedded inversion

Embedded inversion is the inversion of subject and verb in embedded or subordinate wh- clauses, as in (26).

(26) I need to, I says to go down and see him, to see what is he going to do with it. (FRED, HEB_018)

In contrast to standard English word order in which the subject always precedes the verb in declarative clauses, in the complement clause controlled by see in (26), the auxiliary verb is has undergone inversion with the subject he. This inversion of verb and subject in the interrogative complement, i.e., embedded, clause has come to be referred to by the more concise (and more memorable) term “embedded inversion” (Filppula 2000). The word order is he corresponds to that of a direct question, e.g., in What is he going to do with it? However, in the actual situation that is reported in (26), the speaker asked somebody else and thus probably said What are you going to do with it? In the report in (26), the pronoun you is shifted to third person he, because the original addressee is not present. This shift of person in pronoun use is a feature of indirect speech (Quirk et al. 1985: 1026–1027). Consequently, the analysis of embedded inversion is connected to the difference between direct and indirect speech as discussed in section 2.1.5.

Interrogative complement clauses are typically introduced by words beginning with wh-, such as who, which, when or whether and thus are referred to as wh- clauses. These are the second major structural type of English finite complement clauses (Biber et al. 1999: 658). Exceptions are how and if, which are also used as interrogative subordinators in wh-clauses. Wh- words appear in different functions, e.g., introducing direct questions, as in (27), or as relative pronouns, as in (28), and when they are used to introduce a finite dependent clause, as in (29), they function as a subordinator similarly to that (1999: 85).

(27) When was the road tarmacked then? (FRED, WES_005)
(28) that’s the people who (reg sic=git) get the fish out (FRED, SFK_036)
(29) Daddy doesn’t even know what the length of a centimetre is, he always sticks to the inches. (NITCS A44.1, DF112)

Which wh- word is chosen depends on its referent. Most wh- clauses express an adverbial relation. Causes, for instance, are referred to by why (30) etc.

(30) I don’t really know why it [New Year]’s celebrated so much! (FRED, HEB_013)

Hence, variation in the choice of wh- words is content-specific and not dialectal and consequently not an object of investigation in this study. Although if expresses the adverbial relation of condition in conditional clauses and is thus content-specific (Biber et al. 1999: 85; Quirk et al. 1985: 1086–1089), it also functions as an alternative to the complementizer whether in wh-clauses, as illustrated in (31) and (32).

(31) ... I don’t know if you know them . . . (FRED, DEN_004)
(32) I don’t know whether you know him, . . . (FRED, KEN_005)

When wh- complement clauses are introduced by if or whether as in (31) and (32) respectively, they correspond to a direct yes/no question – in both (31) and (32) this would be Do you know them/him? – which evokes a yes or a no as an answer. Subject-verb inversion occurs in embedded wh- questions such as (26), cited here for convenience as (33). When subject-verb inversion occurs in embedded yes/no questions such as (34), there is no subordinator; the inversion seems to make it redundant. The non-standard word order is also used when do-support is needed in the inversion, as in (35):

(33) I need to, I says to go down and see him, to see what is he going to do with it. (FRED, HEB_018)
(34) I don’t know did they find it [‘whether they found it’] too expensive or what. (FRED, HEB_027)
(35) But he was telling me he didn’t know how did he manage it. (FRED, HEB_025)

Since there is no subordinator in embedded yes/no questions with subject-verb inversion, any instances of this dialect feature would not be identified by a search string such as wh-. Consequently, extracting wh-clauses from the corpora by means of their most frequent matrix verbs is a more appropriate method to obtain a fairly representative data sample of language use regarding the choice between inverted, non-standard, and standard word order.

The literature on embedded inversion proposes several hypotheses regarding its source. These are presented and discussed in section 3.1. The most frequent matrix verbs of wh- clauses occur in discourse markers (you know, you see) and the differences between direct and indirect questions are related to the differences between direct and indirect speech (see section 2.1.5 above). Section 3.2 summarises the consequences of these overlaps for data extraction before presenting the variables applied in the analysis of embedded inversion in the database and the consequent coding scheme. Section 3.3 provides the analysis of the distribution of embedded inversion in the data sample. The results of the analysis are summarised and discussed with regard to the research on embedded inversion in section 3.4.

3.1. Previous research on embedded inversion

As mentioned above, embedded inversion is reported in the descriptions of four varieties in Kortmann et al. (2004): the Englishes spoken in Scotland, Ireland, Wales and Northern England. It is also attested to occur, if rarely, in ManxE (Kewley Draskau to appear 2012). This makes it one of “the two most pervasive [morphosyntactic] features in the British Isles” in the domain of complement clauses in the British Englishes (Kortmann 2004: 1095).

In the literature, embedded inversion is not unanimously defined as a dialect feature, but it is very often also called a feature of colloquial speech in general. It is included in the discussion of standard English dependent interrogative clauses in Quirk et al. (1985: 1051–1052), Biber et al. (1999: 920) and Huddleston and Pullum (2002: 983). However, it is also generally considered to occur frequently in dialects of English, especially in Irish English (see, e.g., Quirk et al. 1985: 1052; Hayden and Hartog 1909: 938; Curme 1931: 247–248; Edwards and Welten 1985: 920; Filppula 1999: 167–179) and in Ulster Scots (Montgomery 2006: 325).

The Englishes spoken in Scotland, Ireland and Wales show influence from the Celtic languages spoken in these areas (see, e.g., Filppula 1999; Thomas 1997; Edwards and Welten 1985: 121), yet in Scotland this is true of the Highlands and Hebrides rather than the Lowlands, although Lowland Scots has some Gaelic features (Macafee and Ó Baoill 1997). Consequently, embedded inversion is frequently traced back to Celtic substratal influence (see, e.g., Filppula 2000). The varieties of English spoken in the Celtic regions of the United Kingdom developed in language contact situations between the Celtic languages and English in which the Celtic languages functioned as substrate languages (cf. Thomason and Kaufman 1988: 38–39).
For the English spoken in Northern England, Beal points out that embedded inversion is reported only in “dialects of the north-east” (2004: 128) and mentions the possible influence of Irish immigrants (1993: 189).

A further possible explanation of embedded inversion is the retention of Old English V2 word order. As a general vernacular feature, embedded inversion is also regarded as a pragmatic or a narrative device or simply an inability to construct syntactically “correct” indirect questions. Before turning to these explanations in more detail, the formal structure of embedded inversion is discussed.

As in direct questions, the inversion of auxiliary and subject represents a fronting of the tense marker (INFL) to complementizer position (COMP or C, in CP). In an interrogative whether or if clause, the complementizers occur in the position COMP in the complementizer phrase CP, which itself consists of COMP and the inflection phrase (IP): CP → COMP+IP (Chomsky 1986: 4–7; Van Valin 2001: 194). Thus, in I don’t know if you have seen this, if you have seen this is the CP with the COMP if and the IP you have seen this. The inflection phrase consists of the tense-marking constituent INFL (for inflection) and I’: have is INFL, a tense marker (IP → INFL+I’). In if you saw this INFL consists of a zero constituent.

With embedded inversion, INFL is fronted and appears in COMP position, as in (36) and in standard English direct questions. This analysis is supported by the fact that embedded inversion never occurs together with a complementizer (Emonds 1976: 23–25; McCloskey 1991: 294–295; Henry 1995: 110, 116–117), which is illustrated in (37) and holds for the data sample under analysis as well.

(36) I don’t know [CP [COMP have] [IP [∅] [I’ you seen this or not]]] (FRED, LAN_012)
(37) *I don’t know if have you seen this or not.

However, when embedded inversion occurs in embedded wh- questions, e.g., he didn’t know how did he manage it (FRED, HEB_025), formal linguists hesitate to assume the same structure as in a direct wh- question such as How does he manage it? The direct question has the constituent structure

(38) [CP [SPEC how] [C’ [COMP does] [IP [NP he] [I’ [INFL ∅] [VP manage it]]]]]

This complementizer phrase has X’ structure because of the optional specifier how (see Chomsky 1986: 3–4): CP → SPEC+C’; C’ → COMP+IP.

Since embedded inversion “is considered by most speakers to be slightly better with yes-no questions than with wh-questions, and for a not insubstantial number of speakers, inversion is only grammatical in embedded yes-no questions” (Henry 1995: 106) in Belfast English, Henry proposes a different derivation for inversion in wh- than in yes/no questions. When the wh- complementizer slot in embedded yes/no questions is not lexically filled by whether or if, the verb is raised to complementizer position as exemplified in (36). In embedded wh- questions, however, the wh- element occurs in SPEC and creates agreement with the (phonologically zero) complementizer even if the head verb (e.g., think) usually selects a non-wh- complementizer ([-wh]). The underlying complementizer thus agrees with the wh- word in SPEC in that it also becomes a wh- element ([+wh]). Only because of the created [+wh]-agreement can the verb be raised to COMP position, as in the direct question in example (38) and in embedded yes/no questions (Henry 1995: 117–120). The agreement of complementizer and specifier (SPEC) is necessary, as raising of the verb does not occur if there is no wh- specifier. Thus, embedded inversion with wh- specifier is possible even with verbs such as claim and think within direct questions in which the wh- element occurs as pushdown-element in the matrix clause: Who did you claim did he see? and Who did you think did John convince that Mary went? (Henry 1995: 118).
She rejects the analysis presented in McCloskey (2006), for which it is a prerequisite that embedded inversion only occurs with verbs that also allow adverbial adjuncts before their complements.10 McCloskey (2006: 98–99) illustrates this distinction by the examples (39) and (40), among others. Predicates that behave like ask are, e.g., wonder, negated know and negated be sure; predicates that behave like be amazing are, e.g., establish, depend and astonished (2006: 98–100).

(39) a. He asked me when I got home if I would cook dinner. (McCloskey 2006: 98)
     b. Ask your father when he gets home does he want his dinner. (McCloskey 2006: 99)
(40) a. *It was amazing while they were out who had got into their home. (McCloskey 2006: 98)
     b. *It was amazing who did they invite. (McCloskey 2006: 99)

According to Henry (1995: 117–120), embedded inversion is possible with predicates that do not allow adverbial adjuncts in Belfast English, in contrast to the varieties of Irish English examined by McCloskey.

Because of the restriction on the kind of complements the matrix verb may select, McCloskey establishes a parallel to the verb-second languages Dutch and Spanish (McCloskey 2006: esp. 101–105; cf. Rizzi and Roberts 1989: 21–22 and Baković 1998). Verb-second word order refers to the inversion of verb and subject if, e.g., an adverbial occurs at the beginning of a clause (Trask 1993: 2998). Generally, in verb-second languages as well as in English, T-to-C movement, or the movement of INFL to COMP, is not possible when the complement clause is lexically selected, as in complement clauses of wonder. Dutch and Spanish clauses, however, have syntactic structures that represent exceptions to this selectional constraint. McCloskey transfers these exceptional structures to Irish English. When a verb apparently selects a complement clause with inversion, this clause is not immediately dominated by what appears to be the matrix, e.g., wonder, but by an inserted complementizer phrase that consists of COMP and a further complementizer phrase (CP → COMP+CP). The resultant constituent structure is presented in (41) (adopted from McCloskey 2006: 101):

(41) wonder [CP [COMP ∅][CP [DP what [nolabelnode] [COMP2 should][CP we do]]]

Nevertheless, some points in this analysis seem to remain unresolved, at least from a more functional perspective. As embedded inversion represents a transformation (from INFL to COMP) that is ungrammatical in standard English, McCloskey claims that it constitutes a different transformation, based on similar analyses of the verb-second languages Dutch and Spanish. However, it does not appear to be self-explanatory that the violation of syntactic rules of Irish English, which is not a verb-second language, should be best explained by the violation of a syntactic rule in verb-second languages. It seems unusual that the Irish English varieties examined in McCloskey (2006) should have more in common with Dutch or Spanish than with Belfast English as examined by Henry (1995: 105–123).

In Old English, which exhibits verb-second word order in main clauses, the verb occurs in final position in embedded clauses (Quirk and Wrenn 1957: 92–94; see also Stockwell 1984: 579–583). Consequently, an Old English or common Germanic origin of embedded inversion in verb-second word order (Visser 1966: 780) also seems debatable. In addition, diachronic examinations generally detect an increase in the use of embedded inversion in later periods of English that takes place simultaneously with the decline of verb-second word order (e.g., Jespersen 1927: 44; Filppula 1999: 177, 2000: 443).

The ungrammaticality of inversion in embedded wh- questions for some speakers observed by Henry might be due to “negative overreporting”, as she herself notes with regard to comments on the grammaticality of for-to clauses (Henry 1992: 282). “Negative overreporting” is basically “underreporting” the non-standard feature. Speculatively, one could argue that in embedded yes/no questions, embedded inversion “replaces” the complementizers if or whether and thus might be considered more legitimate than subject-verb inversion in embedded wh-clauses.
However, that speakers consider a syntactic structure to be incorrect does not mean that they do not use it. Using native speaker intuition as a source for the grammaticality of syntactic structures has been a cause for much debate (see Labov 1972b: 191–192). Based on a questionnaire administered to schools in England as part of the FRED project, grammaticality judgements on a small number of examples of embedded inversion are available. 40 questionnaires were evaluated and although this number is not large enough to postulate regional differences, the answers show that embedded inversion is frequently regarded as correct usage by native speakers of English from England. The sentence He asked me would it be alright if he came with me was considered to be correct by 52% of the respondents and regarded as potential dialect usage by another 24%.

In sum, the exact syntactic structure of embedded inversion in embedded wh- questions remains unresolved. The corpora exhibit instances of subject-verb inversion in embedded yes/no as well as in wh- questions. Hilbert (2008) observes that in embedded wh- questions inversion is more frequent with the verb be, which often results in fusions such as what’s. Theories that consider embedded inversion as a general vernacular feature or to be caused by substratal influence from Celtic languages seem to have more explanatory power than the claim that it originates in verb-second varieties or languages.

Although embedded inversion is reported as a syntactic feature of many regional varieties, it is often considered a feature of the vernacular without any regional bias, or a grammatical error. As the word order of embedded inversion can be regarded as displaying a lack of distinction between main and subordinate clauses (Huddleston and Pullum 2002: 983), the occurrence of subject-verb inversion has often been considered to show a lack of the ability to construe grammatically correct indirect questions.
While Visser (1966: 780, 831) only bemoans the lack of differentiation between direct and indirect questions, McDavid and Card (1972: 105) and Miller and Weinert (1998: 83) describe embedded inversion as a grammatical error that occurs in the complex task of constructing an indirect question, which consists of applying all necessary shifts and word order changes. This seems quite a bold claim, since the deictic expressions in clauses with embedded inversion are usually shifted according to the general rules of indirect speech formation. Example (35), cited here as (42), for instance, shows pronoun shift (I > he) and example (43) illustrates backshift of tense (are > was).

(42) But he was telling me he didn’t know how did he manage it. (FRED, HEB_025)
(43) Then it [hay] had to go to be passed, and all, to see was it right shade (NITCS, A 53.3 TK21)

In this context it is important to note that there is no evidence that the inverted word order is easier (Filppula 2000: 173). In the language of children (as well as in creoles), declarative word order is often retained in direct questions (Bickerton 1981: 70, 187), which suggests that the declarative and not the inverted word order might also be the easier option.

Jespersen (1927: 44–45, 2.4.5) regards embedded inversion, which is “extremely frequent in colloquial speech” (1927: 44), as a device to avoid “stiff” whether and confusion with conditional if. In Curme (1931: 247), Erdmann (1979: 8–9) and Biber et al. (1999: 920) it is also described as a general colloquial feature. And although Quirk et al. (1985: 1051–1052) mention that it is frequent in Irish dialects of English, they also mention its use in other dialects in certain syntactic environments, e.g., with heavy subjects and be, as in She told us how strong was her motivation to engage in research (1985: 1052 note, their example).

The retention of inverted word order in reported questions establishes a compromise or blend between direct and indirect speech and thus “can be related to the differences, similarities and blends between main and subordinate interrogative clause” (Trotta 1998: 86; see also Baker 1970: 198). The discussion of coding strategies in section 2.1.5 above has shown that there is no clear distinction between direct and indirect speech and that the livelier use of features of direct speech creates greater immediacy and thus more involvement of the listener. The use of deictic shifts, however, establishes a syntactic integration in the superordinate clause structure. Strictly speaking, embedded inversion is not a narrative blend of direct and indirect speech, but of free indirect and direct speech that “can also be used in very rhetorical discourse, ... a fact that does not allow an exclusive ‘colloquialism’ reading” (Fludernik 1993: 152–153).
In free indirect speech, direct speech word order is retained, but tense backshift and pronoun shift occur, as is commonly the case in the literary reports of streams of thought, e.g., “So that was their plan, was it? He well knew their tricks . . . ” (Quirk et al. 1985: 1032, their example). Furthermore, the reports are not embedded in a superordinate clause structure and are therefore “free”. Fludernik’s (1993) example sentence could be uttered as direct speech with the same word order and “unshifted” deictic expressions (So that is your plan, is it? I well know your tricks). It is the backshift of tense and further deictic shifts that render the report indirect (see Sabban 1982: 462).

As mentioned above, embedded inversion is regularly mentioned as a dialectal feature in the Celtic Englishes Scottish (e.g., Catford 1957: 110; Sabban 1982: 460–483; Miller 1993: 126), Irish (e.g., Harris 1993: 168) and Welsh English (e.g., Penhallurick 1991: 209–210; Parry 1999: 119; Thomas 1997: 79). Although Scots, or the English spoken in the Scottish Lowlands, is not a Celtic English, unlike the English spoken in the Scottish Highlands and Hebrides (Macafee and Ó Baoill 1997), embedded inversion is often included in descriptions of “Scottish English” in general or even referred to as a feature of Scots (Miller 2004: 56).

Already in the earliest accounts of embedded inversion it was therefore traced back to similar structures in the Celtic languages Irish Gaelic, Scottish Gaelic, Manx Gaelic (spoken on the Isle of Man) and Welsh Brythonic (Hayden and Hartog 1909: 938; Van Hamel 1912: 279–280). This hypothesis is also supported by, e.g., Thomas (1994: 138, 1997: 79), Penhallurick (1991: 210) and Filppula (2000). In general, Celtic languages are VSO languages and there is no difference in word order between main and subordinate clauses (MacAulay 1992a: 6–7).
Direct as well as indirect questions are introduced by an interrogative particle or a wh-element (which is sometimes optional in Manx), which is then followed by verb and subject. Thus, the word order in direct and indirect questions is identical (Ó Siadhail 1989: 321; Gillies 1993: 211–212, 217; MacAulay 1992b: 172–174, 187; Thomson 1992: 103–104, 107; Thomas 1992b: 272, 282). The general word order in Cornish is also VSO (Thomas 1992a: 346, 348; George 1993: 414), although in later Cornish several SVO structures occurred (George 1993: 455, 458–459). (44) and (45) from Irish and Welsh exemplify the structures of direct and indirect questions in Celtic languages. Interrogative particles are glossed by INTER and question words by Q according to the source texts.

(44) Irish
     a. An    raibh   tú  sásta?
        INTER be-PAST you content
        ‘Were you content?’ (Ó Siadhail 1989: 321)
     b. Chur sé ceist    ort    an    raibh   tú  sásta
        put  he question on-you INTER be-PAST you content
        ‘He asked you if you were content’ (Ó Siadhail 1989: 321)

(45) Welsh
     a. A     welodd ef y   gath?
        INTER saw    he the cat
        ‘Did he see the cat?’ (Thomas 1992b: 272)
     b. Gofynnodd Mair a oedd hi  am  fynd
        asked     Mair Q was  she for go
        ‘Mair asked whether she wanted to go’ (Thomas 1992b: 282)

Filppula (1999, 2000: 167–179) concludes that substratal influence is more likely to be the source of embedded inversion than Old English, due to the parallel linguistic constructions in Celtic languages and the temporal correlation of increasing occurrence of embedded inversion and decreasing verb-second word order over time. This is supported by findings in Kolbe (2001), a pilot for this case study, which shows that although embedded inversion is common in non-Celtic regions (the Scottish Lowlands and Northern England), in Celtic regions (Northern Ireland, Scottish Highlands and Islands) it is not only more frequent in general, but also distributed more equally across different matrix verbs. Interrogative clauses controlled by ask and wonder contain a higher percentage of subject-verb inversion in all examined regions, whereas the inversion is less frequent with know and see in general. In the Celtic regions, however, the distribution of embedded inversion across all matrix verbs is more balanced (2001: 68). Similarly, Davydova et al. (2011: 298–309) show that the use of embedded inversion is more equally distributed across different linguistic structures in Irish English than in other L2 varieties.

Ask is the most typical reporting verb of direct questions (see section 2.1.5) and wonder also occurs in this function, as exemplified in (46). Know and see, however, are not typically used with direct questions, even when the reporting clause is rephrased to be less assertive, as shown in (47).

(46) a. I wonder can I have a chat one time (FRED, CON_006)
     b. I wonder, “Can I have a chat one time?”
(47) a. ?I don’t know, “Can I have a chat one time?”
     b. ?I came here to see, “Can I have a chat one time”

The use of embedded inversion as a narrative device to make reported speech more lively and to create more immediacy (Erdmann 1979: 8–9; Biber et al. 1999: 920) is thus more frequent and has a wider regional distribution with the typical reporting verbs of direct questions, ask and wonder. With the matrix verbs know and see, which do not appear in reporting clauses of direct speech, embedded inversion occurs less frequently in general, but less distinctly so in Celtic varieties of English (Kolbe 2001: 58–74). The distinction between wonder and ask on the one hand and know and see on the other hand is related to the distinction between indirect questions and other wh-clauses.

(48) He wondered where I’d come from.
(49) He didn’t know where I’d come from.
(50) He knew where I’d come from.

In (48), the verb wonder means ‘to ask oneself’ when followed by a complement clause (OED online, wonder, v. 2.). It refers to a question that is not necessarily uttered but at least exists in a speaker’s mind. With negated know as in (49) there is still a lack of information, but this utterance does not imply that the referent of he is actually interested in an answer. In (50), the information that would represent an answer to the underlying question is already provided. Thus, only in (46-a) and (48) are the complement clauses actually indirect questions, while (49) and (50) are interrogative, but not reports of actual questions (see Quirk et al. 1985: 1051; Ohlander 1986: 972–973; Trotta 1998: 53–54).

A similar distinction can be made between question orientation in (46-a), (48) and (49) on the one hand and answer orientation in (50) on the other hand (see Ohlander 1986; Huddleston and Pullum 2002: 981). Question orientation refers to a “lack of knowledge”, whereas answer orientation refers to the “possession” of knowledge (Ohlander 1986: 967). It is also reflected in the different proportions of embedded inversion with different matrix verbs: ask and wonder have a stronger question orientation per se than know and see. Within the matrix clauses with know, only wanted to know displays strong, or “active” (Ohlander 1986: 971), question orientation, while negated know expresses passive question orientation and assertive know implies answer orientation. Consequently, embedded inversion should not occur with negated or assertive know (Ohlander 1986: 972–973). Huddleston and Pullum (2002: 983) also mention the dialectal distribution of embedded inversion (especially in the USA) but claim that only a clause with stronger question orientation “allows inversion”.
Their example of weaker question orientation is He didn’t know was she ill, which they mark as ungrammatical even in dialects in which He wanted to know was she ill is grammatical, thus replicating the assertions presented in Ohlander (1986: 971–972). The corpora under analysis, however, frequently contain instances of embedded inversion after a matrix clause with negated know.

In sum, based on previous research, embedded inversion can be expected to be more frequent in Celtic Englishes and in utterances displaying question orientation. It might also be more restricted in wh-questions. These contexts will be taken into account in the analysis in section 3.3.

3.2. Methods: data, coding and variables

The data sample for the analysis of the distribution of embedded inversion consists of all wh-clauses controlled by know and see, except those after the possible discourse markers you know and you see (see section 2.1.5). The restriction of matrix verbs to know and see thus reduces the number of extracted clauses in which embedded inversion is employed as a narrative device in reported speech, as after ask and wonder (see section 3.1 and Kolbe 2001). The database of wh-clauses controlled by know and see consists of 2,477 clauses. However, in embedded wh-questions, embedded inversion can only occur in those clauses that would require subject-verb inversion in the direct question. If the wh-word functions as the subject of the complement clause, as in (51), embedded inversion does not occur, but if the wh-word represents a complement, an adverbial or an object, as in (52), embedded inversion is possible.

(51) I don’t know what happened to the funds (FRED, WES_019)
     – What happened to the funds?
(52) I don’t know what they did (FRED, ROC_001)
     – What did they do?

Consequently, 256 wh-clauses in which the wh-element functioned as subject were eliminated from the data sample. The resultant database contains 2,221 complement clauses.

As the word order in clauses with embedded inversion is the same as in direct questions, instances of direct speech, i.e., direct questions, had to be distinguished from indirect questions. This distinction was based on the features of direct speech identified in section 2.1.5. Direct question word order in complement clauses controlled by know and see alone was not considered to represent a direct speech report but a non-standard variant of word order in indirect questions. Two features of direct speech as defined in section 2.1.5, e.g., capitalisation of the first word in the clause, utterance launchers, or lack of tense and pronoun shift, were necessary for a clause to be classified as a direct speech report.

Dependent variable

EI yields the general distribution of embedded inversion. If a clause shows inversion of verb and subject, it is coded “1”; all other clauses are coded “0”. What was/is/’s the matter was not coded as embedded inversion, as the non-inverted variant what the matter was/is does not occur in FRED, the NITCS or the BNC and was thus considered unlikely.

Independent variables

Based on the research on embedded inversion, the following linguistic factors were chosen as potential determinants of embedded inversion.

YESNO distinguishes between embedded yes/no questions, coded “1”, and embedded wh-clauses, coded “0”.

ORIENTATION distinguishes between question orientation, coded “q”, and answer orientation, coded “a”. According to Ohlander (1986: 971–972) and Huddleston and Pullum (2002: 983), individual verbs express different degrees of question and answer orientation. Know expresses question orientation only in imperatives or negated clauses. Embedded inversion is regarded as exclusive to sentences with strong, or “active”, question orientation, i.e., it is said to occur with the matrix want to know, but not in clauses with passive or weak question orientation with negated know, e.g., never knew, didn’t know. However, the majority of clauses with inverted word order in the database (28 of 44) are controlled by a matrix that expresses weak question orientation (negated know or see). Hence, ORIENTATION does not distinguish between strong/active and weak/passive question orientation, but between question and answer orientation in general. It captures whether question orientation significantly increases the probability of inverted word order. When the interrogative clause was an embedded yes/no question, it was always coded as question-oriented. Question orientation can occur with imperatives, e.g., Tell me if you lied to me, but not in past tense ?He told me whether he lied to me (Ohlander 1986: 974–975). When it is possible to insert an “impatience marker” such as on earth, the devil/hell, the sentence is question-oriented; compare He asked where the hell I’d come from with ?He knew where the hell I’d come from (Ohlander 1986: 975–977). All negated and imperative matrix verbs were coded as showing question orientation. Assertive know denotes the possession of knowledge and was coded for answer orientation, and so was see in its literal sense ‘visual perception’ or when it means ‘understand’. When, however, see is used synonymously with ‘find out’, meaning “[t]o ascertain by inspection, inquiry, experiment, or consideration” (OED online, see v. I 6.a), it expresses a lack of knowledge and was consequently coded as question-oriented, in clauses such as (53).

(53) I’ll have a go, see what it is (FRED, CON_006)

When a matrix clause forms a direct question, as in (54), the sentence was coded as question oriented.

(54) Caerphilly, know what that is? (FRED, SOM_031)

Clauses introduced by conditional if, e.g., (55), are coded as answer oriented although they basically refer to a potential event about which there still exists a lack of knowledge. The condition, however, can only be true if some knowledge exists about the proposition of the wh- clause.

(55) And mind if you saw where they’d taken it off from you’d be surprised (FRED, WES_014)

SUBJECTLENGTH measures the length of the subject of the complement clause. According to Quirk et al. (1985: 1052), longer or heavier subjects are more likely to undergo inversion. All word classes were treated equally. If the subject noun phrase consisted of one word, it was coded “1”, if it consisted of two words, it was coded “2”, and if it consisted of more than two words, it was coded “3”.

BE distinguishes between complement clauses whose verb phrase is headed by be, coded “1”, and complement clauses whose verb phrase is headed by any other verb, coded “0”. In interaction with long subjects it results in a syntactic structure that allows embedded inversion even in standard English (Quirk et al. 1985: 1051–1052).
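The coding scheme for the linguistic variables can be summarised as a small decision procedure. The following Python sketch is only an illustration of the rules stated above, not the procedure actually used in the study: the record type `Clause`, its field names and the order in which the orientation rules are applied are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    """Hypothetical record for one complement clause of know or see."""
    matrix_verb: str            # "know" or "see"
    negated: bool = False       # e.g. didn't know, never knew
    imperative: bool = False    # e.g. Tell me ..., see what it is
    want_to: bool = False       # matrix "want(ed) to know"
    matrix_is_question: bool = False   # e.g. "know what that is?"
    conditional_if: bool = False       # clause introduced by conditional if
    see_means_find_out: bool = False   # see in the sense 'find out'
    yes_no: bool = False        # embedded yes/no question
    subject_words: int = 1      # number of words in the clause subject
    verb_is_be: bool = False    # verb phrase headed by be
    inverted: bool = False      # verb precedes subject


def code_orientation(c: Clause) -> str:
    """Return "q" (question orientation) or "a" (answer orientation).

    The precedence of the rules is an assumption of this sketch.
    """
    if c.yes_no:
        return "q"
    if c.conditional_if:
        return "a"
    if c.matrix_is_question or c.negated or c.imperative or c.want_to:
        return "q"
    if c.matrix_verb == "see" and c.see_means_find_out:
        return "q"
    return "a"  # assertive know; see as 'perceive' or 'understand'


def code_clause(c: Clause) -> dict:
    """Code one clause for the dependent and the linguistic variables."""
    return {
        "EI": 1 if c.inverted else 0,
        "YESNO": 1 if c.yes_no else 0,
        "ORIENTATION": code_orientation(c),
        "SUBJECTLENGTH": min(c.subject_words, 3),  # 3 = more than two words
        "BE": 1 if c.verb_is_be else 0,
    }


# Example (52): I don't know what they did
example_52 = code_clause(Clause(matrix_verb="know", negated=True))
```

Under these assumptions, example (52) is coded as a non-inverted, question-oriented wh-clause with a one-word subject and a verb other than be.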

As in all analyses in the present study, the variables AGE, SEX and REGION are included as independent variables. REGION has two sub-variables, AREA and CELTIC:

AREA In order to distinguish between Cornwall and the other Southwestern counties without possible Celtic influence, the category “con” is used instead of “sw” as area code for the files from Cornwall. Thus, the analysis below differentiates between ten “areas” in contrast to the usual nine regions.

CELTIC distinguishes between the Celtic regions Northern Ireland, the Scottish Highlands and Islands, the Isle of Man, Wales and Cornwall, coded “1”, and the non-Celtic regions, coded “0”.

3.3. Analysis

The overall regional distribution of embedded inversion is highly statistically significant (chi2, p < 0.001) and is displayed in Table 29. The null hypothesis that embedded inversion is equally frequent in all regional varieties included in the database can therefore be rejected.

Table 29. Regional distribution of embedded inversion (EI) and non-inverted word order (SV); AR = adjusted residuals, significant residuals marked *, overall chi2: p < 0.001

REGION/AREA                     EI % (n)     AR      SV % (n)        AR      Total n
Northern Ireland                 6.7 (10)     4.3*    93.3 (140)     -4.3*     150
Scottish Highlands & Islands    13.2 (17)     9.4*    86.8 (112)     -9.4*     129
Isle of Man                      9.1 (1)      1.7     90.9 (10)      -1.7       11
Scottish Lowlands                0.0 (0)     -1.7    100.0 (132)      1.7      132
Northern England                 1.0 (4)     -1.4     99.0 (377)      1.4      381
English Midlands                 0.4 (1)     -2.0*    99.6 (265)      2.0*     266
Wales                            0.0 (0)     -1.3    100.0 (78)       1.3       78
Southwest England Total          1.9 (10)    -0.1     98.1 (503)      0.1      513
  Cornwall                       5.5 (5)      2.5*    94.5 (86)      -2.5*      91
  rest                           1.2 (5)     -1.3     98.8 (417)      1.3      422
Southeast England                0.2 (1)     -3.5*    99.8 (560)      3.5*     561
Total                            2.0 (44)             98.0 (2177)             2221

As Cornwall, in contrast to the other Southwestern counties, has a Celtic background, a subdivision of the Southwestern data distinguishes the distribution of embedded inversion in the Cornish data from that of the other Southwestern counties (provided by the variable AREA). Remarkably, the data from Cornwall are significantly different from the data from all other Southwest counties. This is shown in the adjusted residuals (AR), which indicate the deviation from the frequency that would be expected if the distribution of embedded inversion were the same as in the overall data. In binary variables, the value of the adjusted residual for one category corresponds to the same value with opposite polarity for the other category: while embedded inversion has an adjusted residual of (+) 2.5 in Cornwall, subject-verb word order has an adjusted residual of -2.5.

On average, embedded inversion occurs in two per cent of the clauses. In the data from Cornwall, it occurs in 5.5% of the clauses. In the remaining Southwestern counties (Wiltshire, Somerset, Devon and Oxfordshire) it occurs in only 1.2%, which is less frequent than expected, though not significantly so. Embedded inversion is also significantly more frequent in the data from the Scottish Highlands and Islands and in the data from Northern Ireland. It is more frequent than expected, though not significantly so, in the data from the Isle of Man. In all other regions, embedded inversion is less frequent than expected, although this deviation is only significant in the data from the English Midlands and in the data from Southeast England.

In general, this distribution provides evidence for a substratal origin of embedded inversion: it is significantly more frequent than expected only in areas with a Celtic background: the Scottish Highlands and Islands, Cornwall and Northern Ireland. Although Ulster Scots has had a noteworthy influence on the English spoken in Northern Ireland, the influence of Irish Gaelic on Northern Irish English is also noticeable (see Harris 1984: 115–118; Filppula 1999: 32–34). The predominance of the English spoken by the indigenous Irish population may also have influenced the English settlers’ native language (Thomason and Kaufman 1988: 79).
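The adjusted residuals reported in Table 29 can be recomputed from the cell counts. The sketch below assumes the usual definition of the adjusted (standardised) residual for a contingency table, i.e. observed minus expected frequency, divided by the square root of the expected frequency weighted by the complementary row and column proportions; the names `COUNTS` and `adjusted_residuals` are chosen for this illustration only.

```python
from math import sqrt

# (EI, SV) counts per area, copied from Table 29 (ten areas, Southwest split).
COUNTS = {
    "Northern Ireland": (10, 140),
    "Scottish Highlands & Islands": (17, 112),
    "Isle of Man": (1, 10),
    "Scottish Lowlands": (0, 132),
    "Northern England": (4, 377),
    "English Midlands": (1, 265),
    "Wales": (0, 78),
    "Cornwall": (5, 86),
    "Southwest England (rest)": (5, 417),
    "Southeast England": (1, 560),
}


def adjusted_residuals(counts):
    """Adjusted residual of the EI cell for every area."""
    n = sum(ei + sv for ei, sv in counts.values())   # all clauses
    ei_total = sum(ei for ei, _ in counts.values())  # all inverted clauses
    residuals = {}
    for area, (ei, sv) in counts.items():
        row = ei + sv
        expected = row * ei_total / n
        variance = expected * (1 - row / n) * (1 - ei_total / n)
        residuals[area] = (ei - expected) / sqrt(variance)
    return residuals


AR = adjusted_residuals(COUNTS)
for area, value in AR.items():
    print(f"{area:30s} {value:5.1f}")
```

Because the variable is binary, the residual of the SV cell in each area is the same value with the opposite sign.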
In the two remaining Celtic regions represented in the database – Wales and the Isle of Man – embedded inversion does not occur significantly more frequently than expected. While it still is more frequent than expected in the data from the Isle of Man, it is less frequent than expected in the data from Wales, though in both cases, this deviation is not statistically significant. The data subsets from these two regions contain the fewest words, so differences in these regions are unlikely to reach a statistically significant level. In comparison with the 5.5% in Cornwall and the 6.7% in Northern Ireland, the 9.1% of clauses with embedded inversion in the data from the Isle of Man would most probably be significantly more frequent than expected, if they had occurred in a larger database. The correlation between CELTIC and EI is highly significant but not very strong (Spearman rho = 0.156, p < 0.001). Whether the regional distribution is actually determined by regional preferences rather than by the regional distribution of other factors that lead to embedded inversion, such as subject length or question orientation, was tested in a logistic regression analysis (see appendix 6).

In logistic regression analysis, all cases (i.e., complement clauses with all associated coded variables) lacking information for any variable (“missing cases”) are excluded. Therefore, the overall percentage of clauses with subject-verb word order increases from 98.0% to 98.3%. Without the consideration of any factor, the prediction of the absence or presence of embedded inversion would be correct in 98.3% of all cases if one guessed that embedded inversion did not occur. By the inclusion of the variable AREA and the interaction between AREA and SUBJECTLENGTH this high percentage of correctly predicted cases can be slightly increased to 98.4%. This model explains 29.9% of the variability of the data (Nagelkerke R2 = 0.299).
The significant factors that influence the odds of embedded inversion are presented in Table 30. The odds ratio (OR) displays the factor by which the odds of embedded inversion are increased by the respective variable or interaction, in comparison with the odds of a reference category of the same variable.

Table 30. Predictors of embedded inversion, significant values marked *

Factor                                         Exp(b) (OR)    p
AREA                                           –              <0.001*
AREA(Cornwall)                                 3.037          0.133
AREA(Scottish Highlands & Islands)             10.58          <0.001*
AREA(Isle of Man)                              0.000          0.999
AREA(Scottish Lowlands)                        0.000          0.996
AREA(English Midlands)                         0.000          0.995
AREA(Northern England)                         0.745          0.689
AREA(Northern Ireland)                         3.73           0.040*
AREA(Southeast England)                        0.164          0.099
AREA(Wales)                                    0.000          0.997
AREA*SUBJECTLENGTH                             –              0.959
AREA(Cornwall)*SUBJECTLENGTH(2 words)          18.00          0.008*
AREA(Cornwall)*SUBJECTLENGTH(>2 words)         0.000          0.999
AREA(Scott. H&I)*SUBJECTLENGTH(2 words)        1.550          0.599
AREA(Scott. H&I)*SUBJECTLENGTH(>2 words)       1.3E+010       0.999
AREA(Isle of Man)*SUBJECTLENGTH(2 words)       2.6E+018       0.999
AREA(Engl. Midl.)*SUBJECTLENGTH(2 words)       1.1E+008       0.994
AREA(Engl. Midl.)*SUBJECTLENGTH(>2 words)      1.000          1.000
AREA(North Engl.)*SUBJECTLENGTH(2 words)       0.000          0.998
AREA(North Engl.)*SUBJECTLENGTH(>2 words)      0.000          0.999
AREA(North Ireld.)*SUBJECTLENGTH(2 words)      0.000          0.999
AREA(North Ireld.)*SUBJECTLENGTH(>2 words)     3.143          0.325
AREA(SE Engl.)*SUBJECTLENGTH(2 words)          0.000          0.998
AREA(SE Engl.)*SUBJECTLENGTH(>2 words)         0.000          0.999
AREA(Wales)*SUBJECTLENGTH(2 words)             1.000          1.000
AREA(North Engl.)*SUBJECTLENGTH(>2 words)      1.000          1.000
Constant                                       0.012          <0.001*

If a complement clause of know or see occurs in the data from the Scottish Highlands and Islands (OR 10.58) or from Northern Ireland (OR 3.73), it is more likely to exhibit inverted verb-subject word order than in the reference category of AREA, Southwest England, i.e., without Cornwall. This area was selected as reference category because the distribution of clauses with and without subject-verb inversion was least deviant in this area (see Table 29), with an adjusted residual of +/-1.3. The odds of embedded inversion in the data from the Scottish Highlands and Islands versus the data from Southwest England are 10.58 : 1.
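To make the odds ratios more tangible, they can be converted into predicted probabilities. The following sketch assumes the standard reading of a logistic regression output, in which Exp(b) of the constant gives the odds of embedded inversion in the reference categories (here Southwest England without Cornwall, one-word subject) and the odds ratio of an area multiplies these odds. This reading is an assumption of the illustration and is not spelled out in the analysis itself; the function names are hypothetical.

```python
# Values copied from Table 30.
BASELINE_ODDS = 0.012  # Exp(b) of the constant
ODDS_RATIOS = {
    "Scottish Highlands & Islands": 10.58,
    "Northern Ireland": 3.73,
}


def odds_to_probability(odds: float) -> float:
    """Convert odds p / (1 - p) back into the probability p."""
    return odds / (1 + odds)


def predicted_probability(area=None) -> float:
    """Predicted probability of embedded inversion with a one-word subject.

    With area=None the reference category of AREA is used (odds ratio 1).
    """
    odds = BASELINE_ODDS * ODDS_RATIOS.get(area, 1.0)
    return odds_to_probability(odds)


p_reference = predicted_probability()
p_highlands = predicted_probability("Scottish Highlands & Islands")
p_n_ireland = predicted_probability("Northern Ireland")
```

On this reading, the model predicts embedded inversion in roughly 1.2% of the clauses in the reference area, about 11% in the Scottish Highlands and Islands and about 4% in Northern Ireland, in each case for clauses with a one-word subject.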

The influence of the area Southeast is not statistically significant at the usual threshold of p ≤ 0.05 (see appendix). However, its p-level of 0.099 is below 0.1, so the result can be read as a tendency: speakers from Southeast England appear less likely to use embedded inversion (OR 0.164) than speakers from Southwest England. As shown in Table 29, there is only one instance of embedded inversion in the data from Southeast England, as opposed to 560 clauses with standard word order.

The reference category for SUBJECTLENGTH is one word (1), because of its predominance: 1,976 out of the total of 2,221 clauses have a subject that is one word long. The interaction between AREA and SUBJECTLENGTH therefore means that the odds of embedded inversion in the data from Cornwall increase by a factor of 18 when the subject of the clause consists of two words, as compared to the odds of embedded inversion in the data from Cornwall when the subject consists of just one word. (56), the only clause in the Cornwall data that contains a subject with more than two words, exhibits subject-verb word order, and therefore the interaction between this region and this value of SUBJECTLENGTH is taken to disfavour embedded inversion (OR = 0.000), but its influence is far from statistically significant (p = 0.999).

(56) I don’t know what the name of him was now, a good dog he was, you know. (FRED, CON_006)

The inclusion of any other variable or interaction term does not increase the explanatory power of the model. Although all significant regional predictors identified are varieties with a Celtic background, the inclusion of the variable CELTIC decreases the fit of the model. This reflects the fact that in two Celtic areas (Isle of Man and Wales), embedded inversion is not significantly more frequent than expected, and in the data from Wales even less frequent than expected.

The question or answer orientation of the complement clause does not correlate with embedded inversion: although embedded inversion is more frequent in question-oriented clauses, this difference is not significant (see Table 31). In order to measure correlations, ORIENTATION was automatically recoded into a numeric variable with the values “1” for answer orientation and “2” for question orientation. There is hardly any correlation between ORIENTATION and word order (Spearman rho = 0.015), and it is not even significant, nor does the overall distribution differ significantly (continuity corrected p = 0.131).

Table 31. Distribution of embedded inversion (EI) and non-inverted word order (SV) in question- and answer-orientation; AR = adjusted residuals; chi2 continuity corrected p = 0.131

Orientation             SV % (n)       AR      EI % (n)    AR      Total n
Answer-orientation      99.2 (358)      1.3    0.8 (3)     -1.3     361
Question-orientation    97.8 (1817)    -1.3    2.2 (41)     1.3    1858

The answer-oriented clauses with embedded inversion are the following:

(57) He’d been up and seen how was I doing mi job, see. (FRED, KEN_002)
(58) Can’t tell you, just to know what time would be your . . . (FRED, NBL_003)
(59) I know, what, what would they do, they would just die . . . (NITCS, A20.2, TD93)

In sum, embedded inversion is significantly more frequent in the data from Northern Ireland, from the Scottish Highlands and Islands and from Cornwall. In the latter region, the higher frequency of embedded inversion is associated with subjects that consist of two words.
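The continuity-corrected chi-square test reported for Table 31 can be retraced with a few lines of Python. The sketch assumes that the reported value is Yates’s correction for a 2×2 table with one degree of freedom; the function name `yates_chi2_2x2` is hypothetical, and the p-value is obtained from the complementary error function in the standard library rather than from a statistics package.

```python
from math import erfc, sqrt


def yates_chi2_2x2(a: int, b: int, c: int, d: int):
    """Yates-corrected chi-square statistic and p-value for the table

        a  b
        c  d
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with one degree of freedom.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Counts from Table 31: rows = answer/question orientation, columns = SV/EI.
chi2, p = yates_chi2_2x2(358, 3, 1817, 41)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

With the counts from Table 31, the resulting p-value rounds to the reported 0.131.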

3.4. Discussion and summary

The analysis has confirmed the findings from Kolbe (2001) in that embedded inversion is more frequent in Northern Ireland and the Scottish Highlands and Islands than elsewhere, even if all of the FRED data are taken into account. In the data from Cornwall, embedded inversion is also frequent with two-word subjects. The semantic distinction between question and answer orientation does not influence its use.

The composition of the NITCS data from Northern Ireland and the data from the Scottish Highlands and Islands differs from most of FRED, in that they also contain the speech of younger speakers and of more female speakers. However, the data from the Scottish Lowlands have a similar composition in terms of age and sex of the speakers as these two regions (see section 2.1.2). If these factors alone influenced the use of embedded inversion, it should be more frequent in the Scottish Lowlands as well. Conversely, embedded inversion is also more frequent in the data from Cornwall, whose composition is typical of FRED with mostly male speakers (11 of 15) over sixty years of age (age range: 59–86, average age: 70.5 years). The method of data collection, formal interviews, is the same in all data sets in both corpora.

As embedded inversion is rare overall and all speakers predominantly produce “grammatically correct” interrogative clauses, embedded inversion can hardly be the result of speakers’ inability to produce grammatically correct indirect questions, as claimed by McDavid and Card (1972: 105) and by Miller and Weinert (1998: 83). Whether the complexity of the production of a dependent wh-clause increases the probability of embedded inversion could be determined in a psycholinguistic study. The variables in this study associated with processing factors are SUBJECTLENGTH and BE, neither of which has a significant main effect on embedded inversion.
Only in the data from Cornwall does the length of the complement clause subject increase the odds of embedded inversion.

The frequency of embedded inversion is only significantly lower in the data from two regions – Southeast England and the English Midlands – where it occurs once each. However, it is not a feature that is restricted to dialects with a Celtic background, either. It is employed in most regional varieties in Great Britain and in Northern Ireland. Thus the data also prove its use as a general vernacular feature, even in complement clauses controlled by verbs that typically do not occur in reporting clauses of direct questions (see section 3.1 above and Kolbe 2001).

If verb-second word order emerged in embedded inversion, it would seem remarkable not only that it should do so in embedded instead of main clauses, but also that it should be more frequent in individual regional varieties instead of being spread more equally throughout the United Kingdom. The three Englishes in which embedded inversion is significantly more frequent in the present study all have a Celtic background. These findings consequently provide further evidence for the hypothesis that inverted word order is a Celtic substratal feature, which is confirmed also for Southern Irish English by Kolbe and Sand (2010, cf. also Davydova et al. 2011: 298–309).

Although embedded inversion does not occur in the data from Wales, its lower frequency there is not statistically significant. The database contains fewer wh-clauses from Wales (n=78) than from Cornwall (n=91). Most probably, embedded inversion does occur in Welsh English, just not in this data sample, as FRED does not contain enough words from this area. Zero occurrence in a corpus, and especially in a small data sample, does not constitute negative evidence (see Stefanowitsch 2006: 67–68).
Consequently, although embedded inversion is a feature of Scottish English, it does not seem to be a feature of Scots as such (cf. Miller 2004: 58), but rather a supra-regional vernacular feature that is supported by Celtic substratal influence and thus more frequent in Celtic Englishes. Although Irish immigrants in the Northeast of England may have contributed to the spread of embedded inversion in Northern England, it is not restricted to the Northeastern counties (in FRED, Northumberland or Durham), but also occurs further south and west in Lancashire (cf. Beal 2004: 128–129): of four instances of embedded inversion in the Northern English counties in FRED, one is from Northumberland (NBL_003) and three are from Lancashire (LAN_003, 005 and 012). These instances are further evidence of the general use of embedded inversion in vernacular and colloquial British English.

Embedded inversion is not a feature that is restricted to the British Isles. In the English spoken in the Americas, it is one of the most widespread features in general (Kortmann and Szmrecsanyi 2004: 1165–1166). “Once believed to be a characteristic of only African American Vernacular English (AAVE; cf. Labov 1972a: 62–64, 228), this has since been shown to occur throughout the United States” (Murray and Simon 2004: 224). Figure 14 shows that embedded inversion occurs not only in American and British, but also in African, Asian and Australian/Oceanian varieties of English. This map renders answers to a survey on morphosyntactic features. In this survey, authors often reported the use of embedded inversion in the variety they described, although they did not include it in their descriptions in the printed edition (Kortmann et al. 2004).
To sum up, although embedded inversion is a feature of non-standard syntax that occurs in varieties of English around the world, Celtic substratal influence seems to increase its use especially in the Scottish Highlands and Islands, Ireland and, in certain syntactic environments, in the English spoken in Cornwall. That Celtic substratal influence should surface in the Englishes spoken in countries such as India or Kenya, in which lively contact with many more indigenous languages exists, seems unlikely (see Sand 2005). The findings presented here are thus in line with the concluding observation in Davydova et al. (2011) that even seemingly universal features may have different characteristics in individual (groups of) varieties.

Figure 14. Embedded inversion worldwide (Kortmann and Schneider 2004: CD-ROM, Interactive Map, Complementation); A (black dots): ‘pervasive feature’, B (dark grey dots): ‘exists but is infrequent’, C (light grey dots): ‘does not exist’ (Kortmann and Szmrecsanyi 2004: 1142)

4. The complementizer as

This section explores regional variation in that clauses, which are the most common finite complement clauses in English (Biber et al. 1999: 658). In standard English, they are introduced by the complementizer that, as in (60), which can be omitted, as in (61). The dialectal variant is as, illustrated in (62).

(60) And I think that we were poor but happy in them days. (FRED, CON_011)
(61) And I think Ø we were poor but happy in them days. (FRED, CON_011)
(62) Don’t think as you can make a lot of money out of them, because you can’t (FRED, SAL_009)
     ‘Don’t think (that) you can make a lot of money out of them ...’

Clauses with retained complementizer, as in (60), will be referred to as explicit-that clauses and clauses in which that is replaced by as are called as clauses. If clauses with omitted complementizer, as in (61), are referred to, they will be called zero-that clauses. The use of the complementizer as instead of that was observed in the concordances of the most frequent matrix verbs of all that clauses in Kolbe (2008). After providing an overview of research on non-standard as in section 4.1, section 4.2 presents the methodological issues underlying the analysis of the variation between the complementizers as and that. In particular, it presents the reasons for refraining from the extraction of all instances of that and as from the corpora. The analysis of the choice between that and as is presented in section 4.3, on the basis of which section 4.4 offers a concluding discussion and summary.

4.1. Previous research on the variation in that clauses

The most striking variation in that clauses is the omission of the complementizer. What determines the retention and/or omission of that has evoked interest in scholars from various linguistic disciplines, such as diachronic linguistics, sociolinguistics, psycholinguistics, corpus linguistics and generative linguistics (see the overview in Kolbe 2008: 91–105). In contrast, the use of as instead of that as complementizer has not received much attention, although the relativizer as is reported frequently (Upton et al. 1994: 489; Edwards 1993: 228–229; Anderwald 2004: 190; Wagner 2004: 155; Kortmann 2004: 1095). Kolbe (2010) compares the historical development of that and as into standard and non-standard complementizers and relativizers. Both functions occur in (63):

(63) When I started to repair them I thought, Well, to think as we drink water as is pumped through these pipes. (FRED, SAL_013) ‘to think that(COMP) we drink water that/which(REL) is pumped through these pipes’

Descriptions of British dialect syntax mention the use of as only sporadically, e.g., Edwards and Welten (1985: 118) and Peitsara (1996: 300). It is also observed in Appalachian English (Montgomery 2004: 276), which is influenced by a wide range of British English dialects (Montgomery 2004: 245–246). The OED notes that as is “a merely subordinating conjunction” when it

[introduces] a noun sentence, after say, know, think, etc. Sometimes expanded into as that. Obs. and replaced by that; but still common in southern dialect speech, [bold face added] (OED online: as, adv. conj., and rel. pron. ...BVII. 28.)

A similar complementizer occurs in Yiddish: as occurs as the temporal conjunction ‘when, as’ and as the complementizer ‘that’ (Timm 2005: 161–162; Weissberg 1988: 121).

4.2. Methods: data, categorising and variables

This section introduces the database and the coded variables defined for the analysis of that clauses. The concordances of think, say, know and see in FRED and the NITCS (originally compiled for the analysis of that clauses in Kolbe [2008]) contained 8,278 that clauses. Due to the remarkable frequency of the discourse markers you know and you see in the corpora (see section 2.1.5), these strings were excluded from the concordances. Consequently, matrix clause and discourse marker use did not have to be distinguished in the extraction of complement clauses from the concordances of know and see (see section 2.1.5 for general information on data extraction). Say frequently occurs in reporting clauses of direct speech, so direct speech following say had to be distinguished from indirect speech, which occurs as complement clauses. This issue is addressed in section 4.2.2 below. In the large majority of the 8,278 clauses, however, the complementizer is omitted, as is to be expected in conversational data (see Biber et al. 1999: 680). In order to focus on the choice between that and as, the database was reduced to the 839 that clauses with retained complementizer that or as. Each clause was coded manually for the variables presented in section 4.2.3.

4.2.1. Database

Although as and that are lexical items, the database for the analysis of speakers’ choice between these items was extracted from the corpora through concordance searches of the lemmata of the matrix verbs of that clauses rather than through concordances of as and that themselves. Both words are extremely frequent and multi-functional: in standard English, as is an adverbial of time, place, reason and purpose/result and thus also functions as a subordinating conjunction (OED Online 2006: as). It also functions as (at least one) part of many fixed expressions, e.g., such as, as well (as), as regards, as of, so / as ... as (Quirk et al. 1985: 1137). Similarly, that also functions as demonstrative pronoun or adjective and occurs in compound subordinators (e.g., so that). In addition, and parallel to their use in complement clauses, as is a non-standard variant of that in relative pronouns, as mentioned in section 4.1 above.

Concordance searches of that and as returned over 50,000 and roughly 11,000 hits respectively, only a fraction of which can be expected to be complementizers – 801 instances of that introduce a finite complement clause controlled by the four most frequent matrix verbs. While it would have been possible not only to capture all instances of the complementizers in the corpora but also to include an analysis of the parallel use as relativizer, this was refrained from for two reasons. Firstly, even the largest corpus still represents only a sample of language, and the database identified through concordances of the matrix verbs of that clauses was considered a reliable sample. As outlined in section 2.1.5, matrix verbs provide valuable access to variation in their complement clauses in representing the syntagmatic locus at which speakers make a choice: when they plan a complement clause after this verb, which complementizer are they going to choose? Secondly, including relativizers would have meant dealing not only with the choice between that and as, but also with the choice between these two and wh- relative pronouns, as well as their omission, on which much has been published already (see Herrmann 2005 for an account of relative pronouns in the FRED project), so it would have shifted the analysis too far away from the original focus on complement clauses.

In order to examine whether the complementizer as is avoided after the most frequent matrix verbs, the search was extended to concordance searches of the next most frequent matrix verbs of that clauses in conversation. These verbs are find, believe, feel, suggest and show (in descending order; see Biber et al. 1999: 668); they do not control any as clauses in FRED or the NITCS.

The following collocations were deleted from the concordances immediately:

– as I/you/they say, like I/you/they say
– think of/about
– search strings in sentence-final position

These collocations represent complete units or units that are never followed by clausal material. In addition, the concordances of say regularly contain speech reports, either direct or indirect. Direct speech reports were also deleted from the concordances on the basis of the criteria outlined in the following section.
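The three deletion criteria listed above can be approximated in a short filter. The following is only a minimal sketch, not the procedure actually used for FRED and the NITCS: it assumes that each concordance hit is available as a hypothetical triple of left context, keyword and right context, and the function name keep_hit and the three patterns are illustrative inventions.

```python
import re

# Hypothetical patterns modelled on the three deletion criteria.
LEFT_FIXED = re.compile(r"\b(?:as|like)\s+(?:I|you|they)\s*$", re.IGNORECASE)
RIGHT_PREP = re.compile(r"^\s*(?:of|about)\b", re.IGNORECASE)
RIGHT_FINAL = re.compile(r"^\s*(?:[.?!]|$)")


def keep_hit(left: str, keyword: str, right: str) -> bool:
    """Return False for hits that can never be followed by a complement clause."""
    kw = keyword.lower()
    # "as I/you/they say", "like I/you/they say": fixed, complete units
    if kw == "say" and LEFT_FIXED.search(left):
        return False
    # "think of/about": followed by a phrase, not by clausal material
    if kw == "think" and RIGHT_PREP.search(right):
        return False
    # search string in sentence-final position: nothing follows the verb
    if RIGHT_FINAL.search(right):
        return False
    return True


# Illustrative hits; only the first is taken from example (60).
hits = [
    ("And I", "think", "that we were poor but happy in them days."),
    ("Well, as I", "say", ", it was hard work"),
    ("I never", "think", "about it"),
    ("That is all I", "know", "."),
]
for left, keyword, right in hits:
    print(keyword, keep_hit(left, keyword, right))
```

A filter of this kind only removes the mechanical cases; the distinction between direct and indirect speech after say still requires the manual criteria described in the following section.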

4.2.2. Identifying indirect speech after say

The list of characteristic features of direct speech (as defined in section 2.1.5) subsumes six features, five of which can only occur once with each reported clause: (i) address terms and greetings; (ii) commas and capital letters; (iii) question or exclamation marks; (iv) utterance-launchers; (v) non-embedded reported speech. A clause either contains or does not contain markers of direct address such as greetings and goodbyes, address terms like sir, madam, names or titles (doctor etc.). It is either introduced by a comma and/or a capitalised letter or it is not, and it either ends with a question or exclamation mark or it does not. The clause following the reported clause either belongs to the same speech report and also the same reporting clause, as in (64), or it is part of a new unit – of a sentence without a speech report, as in (65), or of a new speech report with a new reporting clause, as in (66). As an utterance can be launched only once, a series of three items that could each act as utterance-launchers by themselves – such as oh, well, you see in (67) – is considered to function as one utterance-launcher.

(64) I say, I’m all right. I think I’ll finish my days in the Warrior. I like her. (FRED, SFK_004)
(65) I hate people saying you know, I’m going to heaven. It’s not so easy as all that at all ... (FRED, HEB_015)
(66) And, go up to the house, he says, and, you get your breakfast. Oh I had my breakfast already he says. (FRED, HEB_025)
(67) He said, Oh well you see, we have public graves, . . . (FRED, LAN_002)
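Some of the clause-level features can be read off the surface of a reported clause. The sketch below is an illustration under stated assumptions, not the coding procedure of this study: the word lists are hypothetical and contain only items mentioned in the text and in example (67), and the functions leading_launchers and surface_cues are invented names. It shows in particular how a run of several launchers is counted as a single cue.

```python
import re

# Hypothetical word lists; multi-word launchers are listed first.
LAUNCHERS = ("you know", "you see", "oh", "well")
ADDRESS_TERMS = {"sir", "madam", "doctor"}


def leading_launchers(clause: str) -> list:
    """Return the uninterrupted run of utterance-launchers at the clause onset."""
    rest = clause.strip().lower()
    found = []
    while True:
        for item in LAUNCHERS:
            if re.match(rf"{re.escape(item)}\b", rest):
                found.append(item)
                rest = rest[len(item):].lstrip(" ,")
                break
        else:
            # no launcher matched at the current position: the run has ended
            return found


def surface_cues(clause: str) -> set:
    """Collect surface cues of direct speech; each cue counts once per clause."""
    cues = set()
    if leading_launchers(clause):
        cues.add("utterance-launcher")
    if clause.rstrip().endswith(("?", "!")):
        cues.add("question or exclamation mark")
    if set(re.findall(r"[a-z]+", clause.lower())) & ADDRESS_TERMS:
        cues.add("address term")
    return cues


# Reported clause of example (67): three launchers, but only one cue.
print(leading_launchers("Oh well you see, we have public graves"))
print(surface_cues("Oh well you see, we have public graves"))
```

Cues such as non-embedded reported speech or the absence of deictic shifts depend on the surrounding context and are not covered by this sketch.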

Whereas these five features describe characteristics of a clause as a whole, the sixth feature, absence of deictic shifts, can occur several times in the same clause. It is connected to different constituents of the clause that each can be shifted, e.g., personal pronouns, verb tense and adverbials. Thus, two deictic shifts or their absence in a clause can mark this clause as a direct or indirect speech report.

In (68-a) neither the pronoun I nor the tense of the verbs understand and is (’s) are shifted to locate the reported clause in the reporting event. They remain as in the reported event, marked by inserted commas and quote marks in (68-b). (68-c) is a possible indirect speech report of the same event.

(68) a. She said I understand that there’s a... (FRED, ELN_009)
     b. She said, “I understand that there’s a ...”
     c. She said (that) she understood that there was a . . .

Say, says and said are used interchangeably in reporting clauses as exemplified in (69) and can be considered to be historic present (see, e.g., Johnstone 1987; Schiffrin 1981).11 Hence, the morphologically present tense forms are expected to trigger backshift of tense in the reported clause as well, as shown in example (70). Consequently, example (71) and similar speech reports are considered to be instances of direct speech because of the absence of deictic shift from I to he and of the backshift of tense to past in I am (’m).

(69) Up he come. He say, We’re [sic] just had a little row, he say, Over it. He said, He don’t see my point, and I don’t see his. He says, All right, pull her [a ship] round. Ring her on full and tairke [take] her back. (FRED, SFK_037)
(70) I said, I was told to bring this horse here for you, he says that I was to stop and see it shot. (FRED, KEN_010)
(71) . . . and told him to phone to the doctor, the stone is there. Euclid says I, I’m suffering with an awful pain (FRED, HEB_018)

Consequently, utterances such as (72) and (73) are considered to consist of matrix and complement clause, especially since the complement clause subject is I and thus is always spelt with a capital letter and the only possible marker of a direct speech report is the comma. The reporting clauses would be identical in direct and indirect speech, so (72) and (73) were considered to be instances of indirect speech and included in the database.

(72) I’d say, I never had four pints,... (FRED, NTT_013)
(73) I said, I got Georgia [name of a farm]. (FRED, CON_009)

4.2.3. Variables

ASTHAT is the dependent variable in the analysis of complementizer choice in that clauses. It distinguishes between as clauses, coded “1”, and explicit that clauses, coded “0”. As the complementizer use of as is not widely reported even in secondary literature on dialects, it could not be expected to be more or less frequent in particular syntactic environments, so no language-internal factors have been identified to trigger its use.

As throughout the present study, the independent variables are REGION, AGE and SEX (see section 2.2.1). REGION was recoded to represent the difference between clauses from one region and clauses from all other regions, in particular the English Midlands, represented by MIDLANDS?. Linguistic factors that influence the omission or retention of that (see Kolbe 2008: 90–129) proved not to be influential.
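The binary coding of ASTHAT and MIDLANDS? can be illustrated with a short sketch. The function names are hypothetical and the code is not part of the original analysis; the four cell counts are derived from the frequencies reported in Table 32 (23 as clauses and 62 explicit that clauses from the English Midlands, 15 and 739 from all other regions). For two variables coded 0/1, Spearman's rho coincides with the phi coefficient, because ranking a binary variable is only a linear rescaling of its codes, so the coefficient can be computed directly from the cell counts.

```python
from math import sqrt


def code_asthat(complementizer: str) -> int:
    """ASTHAT: 1 for an as clause, 0 for an explicit that clause."""
    return 1 if complementizer == "as" else 0


def code_midlands(region: str) -> int:
    """MIDLANDS?: 1 for the English Midlands, 0 for any other region."""
    return 1 if region == "English Midlands" else 0


def phi(n11: int, n10: int, n01: int, n00: int) -> float:
    """Correlation of two 0/1 variables, computed from the four cell counts."""
    numerator = n11 * n00 - n10 * n01
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return numerator / denominator


# n11: Midlands and as, n10: Midlands and that,
# n01: other regions and as, n00: other regions and that (from Table 32)
print(round(phi(n11=23, n10=62, n01=15, n00=739), 3))  # 0.364
```

The result matches the coefficient reported for MIDLANDS? and ASTHAT in section 4.3.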

4.3. Analysis

Table 32 shows the regional distribution of explicit that clauses and as clauses. There are significantly more as clauses than expected in the data from the English Midlands (AR 10.5 – for the interpretation of adjusted residuals see the appendix) and significantly fewer as clauses than expected in the data from Southwest England (AR -2.5). The null hypothesis that as clauses are equally frequent in the data from all examined regions can be rejected. There are (nearly significantly) fewer as clauses in the data from Northern Ireland and from the Scottish Highlands and Islands, but in exactly these two regions, speakers are the youngest in the data sample (see section 2.1.2, Figure 13).

In order to examine the explanatory power of the correlation between the use of as and the English Midlands, REGION was recoded into MIDLANDS?, distinguishing between data from this region and data from all other regions. The correlation between MIDLANDS? and ASTHAT explains 36.4% of the variance (Spearman rho = 0.364, p <.001).

Table 32. Regional distribution of explicit that and as clauses; AR = adjusted residuals, significant residuals marked with *; overall significance: p < 0.001

                                    as                 explicit that
REGION                          % (n)       AR        % (n)         AR
Northern Ireland                0.0 (0)     -1.8      100 (65)      1.8
Scottish Highlands & Islands    1.0 (1)     -1.8      99 (99)       1.8
Isle of Man                     0.0 (0)     -0.3      100 (2)       0.3
Scottish Lowlands               1.2 (1)     -1.6      98.8 (84)     1.6
Northern England                6.5 (7)     -1.0      93.5 (101)    1.0
English Midlands                27.1 (23)   10.5*     72.9 (62)     -10.5*
Wales                           0.0 (0)     -1.5      100 (44)      1.5
Southwest England               1.4 (3)     -2.5*     98.6 (211)    2.5*
Southeast England               2.2 (3)     -1.4      97.8 (133)    1.4
Total                           4.5 (38)              95.5 (801)

The strength of the influence of the independent variables can be compared in logistic regression analysis (see the appendix), which makes it possible to ascertain which factors determine the choice of complementizer in that clauses. Although overall 95.5% of these clauses are explicit that clauses and 4.5% are as clauses, the baseline percentage of that clauses in the logistic regression model is 94.9%, as data with missing values in any variable are excluded from the model. The correctly predicted percentage of cases can be increased to 95.2% in a model that explains nearly 38% of the variability in the data (Nagelkerke R² = 0.378). Table 33 shows the variables that are included in this logistic regression model.

The selected reference categories are MATRIX(think) and REGION(Isle of Man) and they are subsumed in the constant against which the other variables are compared. In these categories the distribution of as and that complementizers is most similar to their overall distribution. The odds ratios refer to the influence of the respective factor on a speaker’s choice of as.

The only linguistic variable that significantly influences the choice of as instead of that is MATRIX (p = 0.006). However, no individual matrix verb has a significant effect on as or explicit that. REGION is not a predictor of as on its own, because it does not significantly influence the choice of the explicit complementizer. Nevertheless, in interaction with AGE it has a significant effect on the choice of as: the odds that a complement clause is introduced by an as-complementizer increase with each additional year of age in the data from the English Midlands.
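How a per-year odds ratio accumulates can be shown with a small worked example. This is a sketch for illustration only: it uses the odds ratio of 1.054 reported for AGE*REGION(Midlands) in Table 33 and the overall counts of 38 as clauses and 801 explicit that clauses, and the helper names are hypothetical. It does not reproduce the full regression model, whose predictions also depend on the constant and on the matrix verb.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds (as : that) into a probability of as."""
    return odds / (1 + odds)


def compounded_odds_ratio(per_unit_or: float, units: int) -> float:
    """Odds ratios multiply: the effect of n units is the ratio to the power n."""
    return per_unit_or ** units


# Overall odds of as versus explicit that in the database (38 : 801)
overall_odds = 38 / 801
print(round(odds_to_probability(overall_odds) * 100, 1))  # 4.5 (per cent)

# Odds ratio per additional year of age in the English Midlands (Table 33)
OR_PER_YEAR = 1.054
print(round(compounded_odds_ratio(OR_PER_YEAR, 10), 2))  # 1.69
```

Under these figures, a Midlands speaker who is ten years older than another would be associated with odds of as that are roughly 1.7 times higher, all else being equal.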
The influence of the variable MATRIX is also significant, although no individual verb has a significant effect at the usual level of p ≤ 0.05. However, considering the fact that at p = 0.1, there is still a 90% probability that the given distribution is not the result of chance alone, the influence of the matrix verbs say (p = 0.082) and see (p = 0.109) is still noteworthy. Say reduces the odds of as (OR 0.295) and see increases the likelihood of as (OR 2.612).

Table 33. Predictors increasing the odds of as clauses; significant values (p ≤ 0.05) marked with *

Factor                                      exp(b) (OR)   p
MATRIX                                      –             0.006*
MATRIX(know)                                1.768         0.315
MATRIX(say)                                 0.295         0.082
MATRIX(see)                                 2.612         0.109
AGE*REGION                                  –             0.000*
AGE*REGION(Scottish Highlands & Islands)    1.014         0.607
AGE*REGION(Scottish Lowlands)               0.996         0.459
AGE*REGION(Midlands)                        1.054         0.009*
AGE*REGION(Northern England)                1.035         0.114
AGE*REGION(Northern Ireland)                0.995         0.326
AGE*REGION(Southeast England)               1.018         0.396
AGE*REGION(Southwest England)               1.011         0.632
AGE*REGION(Wales)                           0.998         0.817
Constant                                    0.002         0.006*

This is also reflected in the proportions of both complementizers with different matrix verbs as illustrated in Table 34, where see and say differ from the overall percentage in the same manner. The frequency of as clauses across the different matrix verbs in FRED (there are no as clauses in the NITCS) does not correspond to the frequency of that clauses across these verbs.

Table 34. Distribution of explicit that and as clauses across matrix verbs; AR = adjusted residuals, significant residuals marked with *; overall significance: p < 0.001

              as                  explicit that         Total
MATRIX    % (n)       AR        % (n)         AR        n
know      6.3 (11)    1.2       93.8 (165)    -1.2      176
say       1.4 (5)     -3.7*     98.6 (346)    3.7*      351
see       16.1 (15)   5.7*      83.9 (78)     -5.7*     93
think     3.2 (7)     -1.1      96.8 (212)    1.1       219
Total     4.5 (38)              95.5 (801)              839
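The adjusted residuals in Table 34 can be recomputed from the cell counts alone. The sketch below assumes the usual definition of the adjusted standardised residual, that is, observed minus expected frequency divided by the square root of the expected frequency times one minus the row proportion times one minus the column proportion; the function name and the dictionary layout are illustrative choices, not part of the original analysis.

```python
from math import sqrt


def adjusted_residuals(table: dict) -> dict:
    """Adjusted residuals for a table given as {row label: (count, count, ...)}."""
    n_cols = len(next(iter(table.values())))
    col_totals = [sum(row[j] for row in table.values()) for j in range(n_cols)]
    grand_total = sum(col_totals)
    result = {}
    for label, row in table.items():
        row_total = sum(row)
        residuals = []
        for j, observed in enumerate(row):
            expected = row_total * col_totals[j] / grand_total
            std_error = sqrt(
                expected
                * (1 - row_total / grand_total)
                * (1 - col_totals[j] / grand_total)
            )
            residuals.append((observed - expected) / std_error)
        result[label] = tuple(residuals)
    return result


# Counts from Table 34: (as clauses, explicit that clauses) per matrix verb
TABLE_34 = {
    "know": (11, 165),
    "say": (5, 346),
    "see": (15, 78),
    "think": (7, 212),
}
for verb, (ar_as, ar_that) in adjusted_residuals(TABLE_34).items():
    print(verb, round(ar_as, 1), round(ar_that, 1))
```

Rounded to one decimal, the values agree with the AR columns of Table 34; applied to the regional counts, the same function yields the residuals of Table 32, e.g., 10.5 for as clauses in the English Midlands.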

[Figure 15: plot of mean speaker age (y-axis: Age, 60–85) by complementizer (x-axis: that, as); labelled values 69 and 79; error bars: 95% CI]

Figure 15. Age means and confidence intervals in as and that clauses

Whereas think and say control more that clauses than know and see, know and see control more as clauses than think and say. Most likely, the higher frequency of think and say with the standard complementizers zero and explicit that becomes a collocational constraint for as clauses after these verbs.12 In addition, seeing as forms a colloquial variant of the quasi-conjunction seeing that in the sense of ‘since, considering the fact that’ (OED Online: seeing, quasi-conj.). This collocation could be either the cause or the result of the higher frequency of as with see.

Although, like REGION, AGE is not a predictor on its own, it helps predict the choice between as and that. Speakers using as are significantly older than speakers using that, as shown in Figure 15. The black circle indicates the mean age of the speakers that utter the respective complementizer and the bars above and below this circle show the range of the age of the speakers within a 95% “confidence interval” (CI). This area comprises the range of mean age for which there is a 95% probability to occur in a second sample compiled under the same conditions as the reported sample. Since this corresponds to the 0.05 p threshold that holds for significant deviations, one can be “confident” that the mean age of speakers from this region is within this interval (Zoefel 2002: 73–74). As the error bars of mean age of speakers of either complement clause type do not overlap with the other error bars, the differences between mean age of users of the complementizers can be regarded as statistically significant.13

The literature on complementizer choice has also not identified a significant influence of the age of a speaker on complementizer choice, so these differences in AGE could not be expected.

In sum, as clauses are more likely to be selected the older the speakers are, especially in the English Midlands, and, less reliably so, after the matrix verb see, whereas say does not select as clauses as readily.
The association between as clauses in the English Midlands and older age hints at the possibility that they are a receding dialectal feature. However, speakers’ age and the region they are from individually do not determine the choice of as. The prediction is complicated by the fact that the matrix verb see is more frequent in the data from the English Midlands than in other regions, but an inclusion of this interaction did not lead to a better model in logistic regression: it results in a decrease of correctly predicted cases and the factors as such are not significant, except speakers’ age in the English Midlands.
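The comparison of error bars described for Figure 15 can be sketched as follows. The ages in the example are invented for illustration and are NOT the ages of the FRED speakers; the interval uses the normal approximation (z = 1.96), which is an assumption of this sketch, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_ci(values, z: float = 1.96):
    """Approximate 95% confidence interval of the mean (normal approximation)."""
    centre = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return centre - half_width, centre + half_width


def intervals_overlap(a, b) -> bool:
    """True if two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Invented ages for illustration only, not corpus data
ages_as = [74, 78, 80, 81, 83]
ages_that = [62, 66, 69, 71, 75]

ci_as = mean_ci(ages_as)
ci_that = mean_ci(ages_that)
print([round(x, 1) for x in ci_as], [round(x, 1) for x in ci_that])
print(intervals_overlap(ci_as, ci_that))
```

With such small samples a t-based interval would be wider than the normal approximation used here; the sketch only illustrates the logic of comparing the two error bars.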

4.4. Discussion and summary

In contrast to the OED information, which states that the complementizer as is more frequent in the South of England (OED Online 2006: as, B VI), in the compiled database of that clauses, the complementizer as occurs more often in the data from the English Midlands than in the data from the South of England. The as clauses occur only in the data from FRED. A concordance search of all tokens of as did not reveal any as clauses in the NITCS (see Kolbe 2010).

The complementizer as is most frequent in the data from the English Midlands and significantly more frequent in the speech of older speakers. These two factors – increase in speakers’ age and the region English Midlands – determine the probability of as clauses only when they interact. Although there is a strong correlation between MIDLANDS? and ASTHAT, this association is caused by differences in language use according to the age of speakers from this region. In the database, speakers from the English Midlands are the oldest on average (see section 2.1.2, Figure 13). The youngest speakers in the database from Northern Ireland and the Scottish Highlands and Islands hardly use as. This non-standard complementizer therefore seems to be a receding feature.

Due to the dearth of research on dialect syntax in earlier centuries (see sections 1 and 2 in the general introduction to this volume), it is difficult to verify whether this complementizer was more frequent in earlier periods of the English language. It is mentioned in 19th century descriptions of Midlands dialect, e.g., Darlington (1887: 95), but dialectologists then did not quantify their data. In more recent research, the morphosyntax of English Midlands dialects appears to be uncharted territory – it is the only mainland dialect region that is not represented in Kortmann et al. (2004).

According to these findings it seems fair to consider the complementizer as to be a feature of English Midlands dialect, probably more so of the traditional than of the modern dialect. It is also preferred after see.

5. For to clauses

In for to clauses, the infinitive marker to is preceded by for, as in (1), cited here as (74),

(74) ... he would try for to tell her . . . (NITCS, A32.3, LM7)

For to was previously a standard infinitive marker in Early English (Freeborn 2006: 185, 193; Wagner 2004: 168); in Present Day Standard English only to remains. In non-standard English, for to is still moderately frequent.

For to clauses are closely connected to similar structures in other languages. The German um zu and the Dutch/Afrikaans om te (Chomsky and Lasnik 1977: 455; van Gelderen 1998: 66–68) are of exactly the same structure as for to. Similarly to for, um and om indicate purpose. Even closer lexically, für NP zu occurs as a dialectal variant of German um zu, and Swedish has för att to introduce purpose clauses. Att, however, is not an infinitive marker, but a subordinator used in finite as well as in non-finite clauses; thus, för att may mean both ‘in order to’ and ‘so that’, depending on the following clause (Byrman and Holm 1992: 72; Gommer and Huber 1992: 169).

For to not only occurs in complement clauses, but also, for instance, in adverbial clauses, such as (75), in which it could also be replaced by standard English in order to, as in (76).

(75) I was only about eight, nine, when I used to go to and fro to all the pubs, for to fetch beer. (FRED, KEN_003)
(76) I was only about eight, nine, when I used to go to and fro to all the pubs, (in order) to fetch beer.

Therefore the analysis of the use of for to goes beyond the study’s general focus on complement clauses and includes for to clauses in all their functions. Consequently, in contrast to all other data in this study, the data were extracted from the corpora by searches of (for) to rather than of their matrix expressions, as adverbial clauses do not have particular matrices.

This chapter is structured as follows: section 5.1 introduces the grammar and functions of for to clauses in general, including a more detailed terminological distinction of different functions of for to clauses. Section 5.2 discusses previous research on for to clauses, and in section 5.3 the extraction and selection of the data as well as variables and coding issues are described. Section 5.4 examines the data in general and in particular the regional distribution of for to clauses. All results are summarised in section 5.5.

5.1. Grammar and functions of for to clauses

This section provides an overview of the different functions of for to clauses and their differentiation. It also addresses the connection between for to and for NP to clauses (e.g., at first there even weren’t any sweets for childrens [sic] to get; FRED, NTT_006). For to clauses are a non-standard variant of to clauses. Consequently, when examples below show general concepts, they contain the more familiar to clauses rather than their non-standard variant.

A basic difference between adverbial clauses on the one hand and postmodifying and complement clauses on the other hand is that an adverbial clause provides additional, optional information about the main clause and does not complete the phrase. It expresses an adverbial relation to the main clause (Biber et al. 1999: 194), e.g., condition, reason, time, purpose, etc., which is often made explicit by a subordinator, e.g., if, because, while, in order, etc. With to-infinitive clauses, this subordinator is mostly in order or so as, but it is frequently omitted. Their position in the sentence is not fixed: they occur in sentence-initial, sentence-medial or sentence-final position. Non-finite adverbial clauses, however, usually do not occur sentence-medially (Biber et al. 1999: 842–844). Hence, example (75) above contains the purpose clause for to fetch beer, which conveys the purpose of the action described in the main clause (going to and fro to all the pubs).

In noun phrases, Biber et al. distinguish between postmodifying clauses and complement clauses (1999: 574–575). Relative clauses constitute a frequent kind of postmodifying clause. Whereas a complement clause completes the meaning of the head, a postmodifying clause serves to describe, or “specify” it (Carter and McCarthy 2006: 323).

It is not always easy to tell the difference between complement clauses and postmodifying clauses. The typical definition of complement clauses, ‘completing the meaning’ of the head, is somewhat ambiguous. A relative clause, which is not a complement clause, also helps contribute to the meaning of its referent. Why is this contribution not a syntactic completion?

The essential distinction between postmodifying and complement clauses is that “postmodifying clauses are not complete (i.e., they have a gap), and they could not stand on their own as independent sentences”, whereas “complement clauses ... do not have a gap corresponding in meaning to the head noun” (Biber et al. 1999: 644–645). However, the first criterion only applies to finite clauses. Non-finite clauses per se can never stand on their own as fully fledged sentences, as they lack a finite verb. In addition, non-finite complement clauses have a gap in subject position when the subject is the same as that of the matrix clause. Thus, in (77), we in the main clause is the understood subject of to get a job again. Without a subject, however, to get a job again is an incomplete sentence, whereas you might get a decent fee in (78) (without complementizer that) is a complete sentence.

(77) ... we were ... given the chance to get a job again . . . (FRED, NTT_002)
(78) You’ve a chance that you might get a de-decent fee. (FRED, PER_003)

Consequently, the distinction between non-finite complement and postmodifying clauses has to rely on whether they have a gap that can be filled by their head noun. (77) contains a complement clause to get a job again in which the gap can be filled by the subject of the matrix clause, we (we get a job again), but not by the head noun of the phrase it belongs to (?the chance gets a job again).

Postmodifying to clauses are illustrated in (79) and (80). They contain gaps which correspond to their head nouns, money and water respectively. As they do not contain explicit subjects, they also have additional subject gaps. The examples are followed by transformations of the non-finite clauses into finite clauses with filled (subject) and head gaps.

(79) But you’d no money, no money to spend (NITCS, A54.3, SB7)
     (you) spent money
(80) You have to get Father Walsh the water and the wine. And the water to wash his hands. (NITCS, A21.1, CR36)
     (he) washes his hands with the water

For NP to clauses, as in (81), are similar to for to clauses. (82) shows that for can be optional, usually after to like and other verbs of “desire” (Biber et al. 1999: 698).14 Very often, however, as in (83), for is obligatory, so that (84) would be ungrammatical. Although their grammatical relation in the non-finite clause is that of subject, personal pronouns occur in their object form, because they are then objects of the preposition for, e.g., him in (83) (Quirk et al. 1985: 1004, 1061).

(81) You wouldn’t like for the horses to step down the corn. (FRED, CON_007)
(82) In them days they didn’t like [∅] the wind to get at ’em. (FRED, KEN_004)
(83) ... these people they’d lock their doors not for him to come in. (FRED, SOM_013)
(84) *these people they’d lock their doors not he/him to come in.

For is therefore sometimes regarded as a subordinator in to-infinitive clauses, albeit with some limitations: “Since for may be combined with the subordinator in order to [see (85)], it seems to be a device for introducing the subject rather than to be a true subordinator” (Quirk et al. 1985: 1004).15

(85) other things have been altered in order for people to save their jobs (BNC, KLY)16

In non-standard grammar, however, for seems to function as subordinator, as it is used even when there is no subject, which results in the sequence for to. Considering that for to used to be a standard infinitive marker which was lost in standardisation (van Gelderen 1998), one might argue that in standard English for NP to, for marks the beginning of a to clause when it contains a subject. If the for in both for to and for NP to is the same subordinator, it cannot occur twice in the same clause, i.e., introducing both the subject and the to infinitive. This would result in a sentence such as (86), a modified version of (83) that contains a for to clause with an additional subject introduced by for (him).

(86) ?these people they’d lock their doors not for him for to come in.

Nevertheless, there is one utterance in the database, (87), which seems to contain a subject-marking as well as an infinitive-marking for. Could this possibly be a for to clause in which the subject is fuel? As burn is followed by a direct object (them), for fuel for to keep theirself warm cannot be a complement of this verb, only an adverbial clause. A finite purpose clause – in which subjects can be integrated more easily – corresponding to the supposed adverbial clause would be similar to (87-a), which seems awkward. Two more likely interpretations would be (87-b) and (87-c), of which the first contains a postmodifying clause and the second an adverbial clause.

(87) All the gates were burnt, and all the, in a lot of cases the posts as well, they had them, they burnt them for fuel for to keep theirself warm and that. (FRED, DEV_001)
     a. ?‘they burnt them so that fuel would keep them(selves) warm’
     b. ‘they burnt them for fuel [with which] to keep themselves warm’
     c. ‘they burnt them for fuel in order to keep themselves warm’

In both (87-b) and (87-c), however, for fuel for to keep theirself warm is considered to consist of two constituents – for fuel as adverbial and for to keep themselves warm as separate clause – and neither of them contains a clause in which the infinitive marker for to occurs together with the subject marker for. For fuel for to keep theirself warm is hence most likely an adverbial for to clause because it is not fuel with which one can keep oneself warm, but more plausibly, when posts are burnt ‘as’ fuel, this action is performed with the purpose of keeping oneself warm. Hence, the data suggest that there is no for NP for to construction, which has also been noticed by Henry (1992: 286–287).

In FRED, there are two sentences in which for NP to and NP for to are used interchangeably in structurally and semantically very similar sentences, (88) and (89), which indicates that the same for can occur in different positions. Accordingly, nothing in the data points to a different grammatical status of for in for to than that of for when it introduces a subject in to clauses.

(88) ... they begged and prayed for me to take the Dun Cow over (FRED, SAL_008)
(89) ... they begged and prayed us for to make, me and mi brother, ... them [broaches] (FRED, LAN_012)

The analysis of for as subordinator rather than as preposition is also supported in the literature on for NP to clauses. Wagner (2000) and Mair (1990: 41–44) come to the conclusion that for NP is often not a typical prepositional phrase. Mair also observes that all for NP to clauses in his corpus can at least be analysed as one constituent (introduced by for), although he also points out that, in the end, their syntactic status is indeterminate (1990: 52–54). In generative grammar (“Government and Binding”), however, for is a complementizer in for NP to clauses that is deleted in the students want Bill to visit Paris and becomes explicit in, e.g., what the students want is for Bill to visit Paris (Chomsky 1981: 19). In contrast to dialects in which contiguous for to occurs, standard English has a filter that prohibits contiguous for to (Chomsky 1981: 248–249, 299–300; see also Chomsky and Lasnik 1977).

Even in postmodifying and complement clauses, however, for may convey the meaning of purpose. It is therefore quite common in the literature to consider for NP to clauses after, e.g., enough and too, as adverbial clauses (Wagner 2000: 195;17 Erdmann 1997: 40–41). The replacement of for to by in order to, or the insertion of in order in the following examples from the corpora would not be possible if these had no adverbial meaning, even though their syntactic role is that of a complement (see Biber et al. 1999: 527).

(90) I was just getting ready for to go around (FRED, DUR_003)
     ready in order to go around

(91) Too far back for me to remember, I think. (FRED, CON_007)
     too far back in order for me to remember

However, this replacement or insertion is not always possible:

(92) they was only too glad for me to slide out (FRED, DEV_004)
     ?too glad in order for me to slide out

Phrases containing enough are a special case when it comes to (for) to clauses. The distinction between postmodifying and complement clauses means that to clauses in noun phrases containing enough are postmodifying clauses, whereas they are complement clauses in adjective phrases postmodified by enough. This is illustrated in (93)–(96).

(93) She hadn’t got enough money to bury my father. (FRED, SAL_013)
(94) She hadn’t got money to bury my father.
(95) I wasn’t strong enough to lift the bloody thing off, oh dear. (FRED, NTT_008)
(96) *I wasn’t strong to lift the thing off

Although, from a semantic point of view, just the same kind of information is missing in She hadn’t got enough money as in I wasn’t strong enough, the syntax of these two constructions is different. Enough may be omitted before a noun phrase, as in (93), because the to clause modifies the head noun money and not the premodifying adjective enough. It cannot, however, be omitted from the adjective phrase in (95), because the to-infinitive clause complements enough rather than strong (cf. Quirk et al. 1985: 66/67).

5.2. Previous research on for to clauses

Research on for to clauses deals with three important issues: firstly, their regional distribution; secondly, whether for to clauses are always adverbial clauses; and thirdly, the syntactic status of for. For to clauses have caught the attention of functional and generative linguists, both of whom take into account standard and non-standard varieties of English.

Kortmann et al. (2004) provides a valuable overview of the use of for to clauses in the British Isles. Table 35 summarises for which varieties the use of for to clauses is reported. They occur in the description of five of these nine varieties: Scottish, Irish, Welsh English and the English spoken in Northern and Southwest England. They are not mentioned in the sections on the English spoken in Orkney and Shetland, East Anglia and Southeast England, nor in the section on British Creole. In addition, Kewley Draskau (to appear 2012) reports its occurrence in the English of the Isle of Man.

Table 35. For to clauses in varieties of British English (Kortmann et al. 2004)

Variety                                   for to?   p(p).
Orkney & Shetland (Melchers: 34–46)       no        –
Scottish English (Miller: 47–72)          yes       64
Irish English (Filppula: 73–101)          yes       85–86
Welsh English (Penhallurick: 102–113)     yes       108
Northern England (Beal: 114–141)          yes       134
East Anglia (Trudgill: 142–153)           no        –
Southwest England (Wagner: 154–174)       yes       168
Southeast England (Anderwald: 175–195)    no        –
British Creole (Sebba: 196–208)           no        –

Although Anderwald (2004) and Melchers (2004) do not include contiguous for to in their descriptions of the English spoken in Southeast England and in Orkney and Shetland respectively (Table 35), in the survey that accompanied the handbook they reported that for to is used in these varieties (Kortmann 2004). In this survey each author was asked to comment on the frequency of 76 morphosyntactic features in the variety they described (Kortmann and Szmrecsanyi 2004: 1142–1148). The dialect of East Anglia is thus the only British regional variety described in the handbook in which for to apparently does not occur at all (Kortmann 2004: 1095). That Anderwald and Melchers assert the use of contiguous for to but do not include it in their sections may have one of two reasons: either they acknowledge its use, but consider its frequency to be too low, or they consider it a non-standard feature without regional bias and correspondingly not a feature of the dialects they described.

In earlier publications that aim at an overview of British dialect grammar (Hughes and Trudgill 1996, Edwards and Welten 1985, Milroy and Milroy 1993, Upton et al. 1994), the accounts of the regional spread of for to vary considerably. If one adds monographs on individual regional varieties to these overviews, for to clauses are documented in the English spoken in every larger area in mainland Great Britain and Ireland, as shown in the following summary.

Irish English
    Harris (1993: 167); Hughes and Trudgill (1996: 116 note); Edwards and Welten (1985: 111–112): Northern Ireland
Scottish English
    Edwards and Welten (1985: 111–112); Hughes and Trudgill (1996: 116 note); Millar (2007: 76): Scots
English spoken in the North of England
    Beal (1993: 200); Hughes and Trudgill (1996: 114); Upton et al. (1994: 504); Shorrocks (1999: 248–249): Bolton area in Lancashire
Isle of Man
    Upton et al. (1994: 504); Kewley Draskau (to appear 2012)
English Midlands
    Upton et al. (1994: 504)
Wales
    Parry (1999: 118)
East Anglia
    Upton et al. (1994: 504)
Southern England
    Upton et al. (1994: 504); Edwards and Welten (1985: 111–112): “parts of” Southern England; Wakelin (1986: 48): Southwest England

In sum, for to clauses seem to be “widespread throughout the country” (Wakelin 1972: 118), but the literature also suggests that they are receding, archaic or at least infrequent (Wakelin 1972: 118; Harris 1993: 167; Miller 2004: 64; Millar 2007: 76; Tagliamonte et al. 2005: 100–101). A decrease in use would explain why for to was observed in East Anglia during the collection of the Survey of English Dialects (SED) in the 1960s (Upton et al. 1994: 504), but not listed as a feature of this dialect recently (e.g., Trudgill 2004).

A variant of for to is discrete for as infinitive marker, which was a moderately frequent infinitive marker in Old and Middle English (van Gelderen 1998). Judging from the literature, discrete for has survived only in Southwest England. According to Wakelin, for to clauses are “widespread in dialect” in general, but in Southern (not only Southwest) English to may be “completely absorbed” (1986: 38; examples: 73, 145). In Upton et al. (1994), for see is attested in the data from the Midlands as well as from Cheshire (Northwest England), which corresponds to Edwards and Welten’s account of for + infinitive (1985: 112). The fact that for infinitives are not mentioned in more recent reports of other varieties of British English suggests that they have become obsolete in the British Isles.

The distribution of the different functional types of for to clauses is another aspect of interest in previous research. Due to the meaning of the preposition for, which includes “purpose or destination” (OED Online 2006: for, prep. and conj., A. IV. 8a.; see also van Gelderen 1998: 45–48), for to clauses are mostly considered to be adverbial clauses of purpose. Thus, it is common to translate for to into standard ‘in order to’ (Parry 1999: 118). In Kortmann et al. (2004), for to clauses are generally classified as adverbial clauses of purpose. The catalogue of morphosyntactic features contains the description “unsplit for to in infinitival purpose clauses” (Kortmann and Szmrecsanyi 2004: 1163). Only Filppula (2004: 86) and Beal (2004: 134–135) explicitly note that the use of for to in Irish English and in Northern English dialects respectively is not restricted to adverbial clauses. They cite examples in which for to clauses are complements, e.g., try for to... or glad for to.... A couple of non-purpose for to clauses are documented in other sources, e.g., it begun for to rain (Upton et al. 1994: 504) or It is difficult for to see that (Henry 1992: 279).18 Section 5.1 has also shown that not all for to clauses are adverbial clauses. Montgomery (2006: 324–325) concludes that “[t]he construction for tae (less often for til) is used to introduce infinitive phrases in Ulster Scots, most often to express purpose (‘in order to’) ... However, speakers in this study accepted for tae as a general equivalent to the infinitive marker tae/til (110a-b)”.
Henry (1995: 98) states that this broad use of for to as infinitive marker “is now confined to Belfast and a few other areas such as parts of County Armagh”, an assessment that would seem to need revision, on the basis of the findings in Montgomery (2006) for Ulster Scots and Macafee (1992: 10.2) for Lowlands Scots.

The differentiation between functional types of clauses in which for to occurs is also relevant for the analysis in Henry (1992, 1995: 81–104) and Carroll (1983). Henry (1992; 1995: 81–104) distinguishes between weak and strong for to dialects: weak for to dialects use for to only in purpose clauses; strong for to dialects use for to in non-purposive contexts, e.g., in complement clauses controlled by want. According to her data, Belfast English is a strong for to dialect, although Milroy (1981: 15) claims that for to is ungrammatical after want and like in Belfast English. This incongruity is most probably related to differences in data collection. Henry’s examples are “compared with” (1992: 282) actual speech data and not based on grammaticality judgements by native speakers, because such judgements do not always reflect actual language use. Dialect speakers may well notice that a sentence they are asked to judge is not grammatical in standard English and hence report it as ungrammatical although they use the same construction in spontaneous speech – be it consciously or unconsciously. Henry refers to this behaviour as “negative overreporting” (1992: 282).

Besides the inclusion of for to clauses (adverbial or not) in the description of certain dialects of English, research has also dealt with the syntactic function of the for in for to and for NP to and proposes the following terms: preposition (Carroll 1983), complementizer (Chomsky and Lasnik 1977), clitic (Henry 1992; 1995: 81–104), indeterminable status (Mair 1990: 52–54) and “a clause-introducing device” (Wagner 2000: 208). An important difference arises between functional analyses (Mair 1990; Wagner 2000) and generative analyses of for (Carroll 1983; Chomsky and Lasnik 1977; see also Chomsky 1981). Functional analyses focus solely on for NP to clauses and do not consider contiguous for to clauses to be the same – or a closely related – type of clause. Generative analyses, however, presuppose that for is a complementizer (COMP) in for to and in for NP to clauses and any differentiation between the two has to be justified.

Generative analyses usually hearken back to Chomsky and Lasnik (1977: 442) who claim that standard English has a filter for contiguous for to. Nevertheless, Rosenbaum (1967: 7) formulated a rule of “obligatory complementizer deletion” that, inter alia, prescribes the deletion of for when it immediately precedes to. For is obligatorily deleted when it would be contiguous to to (Rosenbaum 1967: e.g., 53–58).
Chomsky and Lasnik (1977) observe similar rules in standard English, but also discuss dialects that do not have a filter preventing contiguous for to, e.g., Ozark English. Their analysis is integrated in Chomsky’s Government and Binding theory, according to which for is COMP and to is INFL (short for “inflection”).

The following excursion into constituent structure according to Government and Binding theory helps to elucidate this analysis. The S of simpler tree diagrams, consisting of a noun phrase and a verb phrase (S → NP + VP), has no positions for complementizers or clause-initial wh- words. Therefore, clauses which are introduced by wh- words and/or complementizers are described as complementizer phrases (CPs). A complementizer phrase has an X′ (X-bar) structure: CP → SPEC + C′. This means that complementizer phrases consist of a specifier (SPEC, e.g., a wh- word) and C′. C′ consists of a complementizer and an inflection phrase (IP; C′ → COMP + IP). Thus, in a direct question, e.g., When did you leave?, when is in the specifier position and did you leave is in C′, whereas did is the complementizer (COMP) and you leave is the inflection phrase (IP) (Van Valin 2001: 200). Inflection phrases also have a “bar”-structure: IP → NP + I′. The subject of a non-finite clause, e.g., you in I don’t know where for you to go, is the noun phrase, and is followed by the I′: to go. I′ has the following structure: I′ → INFL + VP. INFL assigns ±TENSE to the following verb, i.e., it determines its tense (Van Valin 2001: 194). In a for NP to clause, to is INFL and assigns –TENSE: in to go, to determines the selection of the tenseless infinitive go (Chomsky 1981: 18–21). For you to go consequently has the following structure: [C′ [COMP for] [IP [NP you] [I′ [INFL to] [VP go]]]]. In contiguous for to clauses, NP is empty, but for is in COMP and to remains in INFL (Chomsky 1981: 9, 248; Chomsky and Lasnik 1977: 454).
The contrast between contiguous for to clauses and for NP to clauses lies merely in NP, which is realised when there is a new subject, but which remains empty (PRO, see Chomsky 1981: 20) when it is the same subject as in the matrix clause.

For to is not a Modern English innovation. Early English had three infinitive markers: for, for to and to. According to van Gelderen (1998), for to functioned as a preposition in Old and Middle English, as both of its constituents in separation still do. Infinitives in Old English had case endings and therefore functioned as nouns. With the loss of case endings in Middle English, contiguous for to became a complementizer of infinitive clauses in the same position and was used frequently. In Late Middle English, however, for and to were reanalysed as two units and for could be placed before the subject of a non-finite clause in complementizer position (COMP), whereas to remained in verb-preceding position (INFL). This uncoupling of for from to led to a decrease in use of the for to complementizer, because discrete for had come to fulfill a separate function (van Gelderen 1998: 60, 65).

Based on Government and Binding theory, Henry (1992; 1995: 81–104) discusses “for to dialects”. She postulates that for in for to is the clitic of to in Belfast English: it is moved out of the complementizer position in the CP and moves to INFL in the inflection phrase. Corrigan observes that South Armagh English is also a strong for to dialect in which the complementizer for cliticises to the infinitive marker to. Corrigan furthermore considers substratal influence from Irish (Gaelic) to be a “potential explanation” (2003: 335) for the fact that two (Northern) Irish varieties of English differ from other Englishes in this respect. In Irish, complementizers may also cliticise to the verb of the clause, but this “is documented for finite clauses only” (Corrigan 2003: 335, endnote), and it is a moot point whether a finite clause structure of the substrate language may actually influence non-finite clause structure in the superstrate language. However, as clitics only occur attached to other lexemes (Crystal 1997: 64), but for may appear in INFL even without to as an infinitive marker in other varieties of English (see Wagner 2004: 168), this analysis seems to apply only to individual varieties, such as Belfast English and South Armagh English.

The fact that the dialects analysed by Chomsky and Lasnik (1977) and Carroll (1983) are North American shows that for to clauses also occur in North American varieties of English. In the corresponding sections in Kortmann et al. (2004), they are mentioned in Colloquial American English (Murray and Simon 2004: 236), Appalachian English (Montgomery 2004: 257), rural and ethnic varieties in the Southeast of the USA (Wolfram 2004: 298) and Newfoundland English (Clarke 2004: 315). In sum, they appear to be a fairly widespread phenomenon in the United Kingdom and in North America.

Figure 16. For to clauses worldwide (Kortmann and Schneider 2004: CD-ROM, Interactive Map, Complementation)

Figure 16 illustrates the reports of the frequency of for to according to the survey that accompanied Kortmann et al. (2004). The black dots represent category A, ‘the feature is pervasive’, the dark grey dots represent category B ‘the feature exists but is infrequent’ and the light grey dots represent category C ‘the feature does not exist’ (for a more detailed explanation of the degrees of frequency [A,B,C], see Kortmann and Szmrecsanyi 2004: 1142). For to clauses (“unsplit for to”) have been reported to be “pervasive” in only five varieties of English (Kortmann and Szmrecsanyi 2004: 1142): Irish, Cameroon, Newfoundland, Isolated Se.
American19 and Ozark English (black dots, from east to west). Although contiguous for to has apparently been reported to be pervasive in Cameroon English on the CD-ROM – as the only African variety of English – it is not mentioned in the respective chapter in the handbook (Mbangwana 2004). In most regional varieties in Great Britain contiguous for to has been reported to be a moderately frequent feature (dark grey dots); only in East Anglia (light grey dot, C) does it seem to be obsolete. The North American dark grey dots are, from right to left: Appalachian, Earlier and Urban African American Vernacular and Chicano English. Contiguous for to is not mentioned in the handbook’s section on Chicano English (Bayley and Santa Ana 2004), but was reported in the survey (Schneider 2004: 1111), similarly to the Englishes in Orkney and Shetland and the Southeast of England (see Table 35 above and the following discussion). The question underlying any research concerning the regional distribution of for to clauses should consequently include the question in which varieties they do not occur. For the present study, this concerns the frequency of for to in East Anglia, or more precisely, Suffolk, as FRED does not contain any data from Norfolk.

In conclusion, the literature on for to clauses addresses their synchronic worldwide distribution as well as their historical development and includes theoretical discussions of the syntactic status of for. As a historical infinitive marker of English, for to continues to be used in non-standard English, especially in the varieties of English spoken in the Northern hemisphere.

In contrast to the for in for NP to clauses, the syntactic function of for in contiguous for to has remained undiscussed in functional linguistics. For is generally assumed to express the adverbial relation ‘purpose’ and for to can be replaced by in order to. This seems to explain the function of for sufficiently.
However, as section 5.1 has indicated and as is observed by Filppula (2004: 86), Beal (2004: 134–135) and Henry (1992; 1995: 81–104), for to clauses cannot inevitably be presumed to be adverbial clauses.

In line with generative analyses (Chomsky and Lasnik 1977; Chomsky 1981; van Gelderen 1998) for is considered here to be a complementizer or subordinator both in for to and in for NP to. This terminology also corresponds with the findings of Mair (1990) and Wagner (2000) on for NP to clauses, who regard for as a subordinator rather than a preposition (see also the discussion of (88) and (89) in section 5.1).

Despite the amount of literature published on for to clauses, the following points demand verification in empirical research: Are for to clauses a receding feature? How frequent are non-adverbial for to clauses? Which factors inhibit or facilitate the use of for to clauses? So far, no study in the literature accessible has examined the distribution of for to clauses across British varieties of English on the basis of comparable corpora. The analysis in section 5.4 aims to provide answers to the questions above whilst basing its investigation on the analysis of actual speech.

5.3. Methods: data, coding and variables

This section presents the methods applied in the analysis of for to clauses. In contrast to the analysis of finite complement clauses in sections 3 and 4, the analysis was extended to adverbial and postmodifying clauses. Consequently, data were not merely extracted via their most frequent matrices.

5.3.1. Database

As mentioned in section 5.1, no matrix expressions had to be defined in order to extract for to clauses. Instead, the first search string chosen for concordance searches was for, followed by to within up to five words. The limit of four words between both search items was chosen in order to allow interjections, discourse markers, etc. or transcribers’ remarks to appear in-between. Both the discourse marker you know and the FRED correction tag “(reg sic= ‘corrected item’),” for instance, consist of two words; if they appeared in combination ((reg sic=y’) you know) between for and to, the defined search string was still able to capture the to. Besides to, its abbreviation t’ was also used as a context search word. This concordance search returned 224 for to clauses out of altogether 1,643 hits.

As for to clauses are a variant of to clauses, it is most useful to examine their frequency in relation to the frequency of to clauses to show how often speakers select the for to option when they have a choice. Therefore, the most reliable comparative database would probably contain all to clauses of the corpora because all for to clauses had been extracted. To, however, is not only an infinitive marker but also a common preposition (of which the infinitive marker is a grammaticalised use, see OED Online: to, prep., conj., adv., B.; van Gelderen 1998) and occurs 76,528 times in the corpora. Fewer than 20,000 of these items were clearly not infinitives and could be deleted directly (sentence-final to, to before capitalised words [proper nouns], pronouns, ordinals, finite verbs etc.).

In order to construct a large but manageable database of to clauses, the four most frequent verbs in the corpora were selected. Be, get, come and do are the only verbs in the corpus that occur over 2,000 times within two words after to. The respective absolute frequencies within two words after to are: be: n=5090, get: n=3134, come: n=2116, do: n=2873. After want and going, to is reduced and contracted to wanna and gonna. Since these contracted forms are not consistently transcribed as, e.g., want to and going to in the corpora, a search of be, get, come and do after wanna and gonna was added. Altogether, this amounts to 13,078 to clauses in the two corpora, which was considered a large enough and yet manageable sample of to clauses. This sample should not be skewed with regard to lexical preferences, as all words are flexible semantically and therefore not restricted to specific contexts in use.

The database of to clauses and for to clauses together comprises 13,302 clauses. However, in order to examine variation between to and for to, the database of for to clauses was also reduced to infinitives of be, get, come and do. The variable for the comparison of for to and to is thus choice of infinitive marker with the four most frequent verbs.

Dependent variable

The aim of the following analysis is to determine whether the occurrence of for to clauses depends on the region a speaker comes from. The correspond- ing variable is

Forto. All for to clauses are coded as “1”, all other to clauses as “0”.
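As a rough illustration of this coding, the following Python sketch codes each clause as 1 or 0 and computes the share of for to among infinitives of the four most frequent verbs. The record format (marker, verb), the function names and the toy data are assumptions made for the example; they are not taken from the study's database.

```python
# The four most frequent verbs named in the database description.
FREQUENT_VERBS = {"be", "get", "come", "do"}


def code_forto(marker):
    """FORTO: 1 for a for to clause, 0 for any other to clause."""
    return 1 if marker == "for to" else 0


def forto_share(clauses):
    """Share of for to among infinitives of the four frequent verbs.

    `clauses` is a list of (marker, verb) pairs, a made-up record format.
    """
    coded = [code_forto(marker) for marker, verb in clauses
             if verb in FREQUENT_VERBS]
    return sum(coded) / len(coded) if coded else 0.0


if __name__ == "__main__":
    # Invented toy data; 'feed' is dropped because it is not one of the four verbs.
    toy = [("for to", "get"), ("to", "be"), ("to", "do"),
           ("to", "come"), ("for to", "feed")]
    print(forto_share(toy))
```

Because the variable is binary, its mean over any subset of clauses is simply the proportion of for to in that subset, which is what makes it convenient for comparisons across regions.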

First of all it was necessary to distinguish between actual for to clauses and clauses that are similar superficially, but in fact belong to a different type. In contrast to the coding of other clauses, commas were considered decisive and could not be overridden by grammatical / semantic factors. As for is an infinitive marker just like to, commas between for and to are always possible indications of a repair strategy, as they very likely indicate a pause or break in speech. This would occur when a speaker starts constructing a clause introduced by for, but then realises that this for is superfluous, because there is no new subject, or “wrong”, i.e., a non-standard construction, and then decides to use a simple to clause. It is impossible to determine whether the speaker actually meant to say for to in, e.g., this sentence:

(97) It was rare for, to have some-one in the class that had boots on. (FRED, WES_005)

Another criterion for deciding whether for and to actually belong to the same clause is if for could be easily omitted or replaced by to without a change in meaning. Again, that for to is in effect a double infinitive marker is crucial for this criterion: in a for to clause as in (98), for is always optional, as illustrated in (98-a), and it can often be replaced by in order, due to the frequent ‘purpose’ connotation of to clauses which is shown in (98-b).

(98) That was about what they needed for to feed a cow. (FRED, SEL_002)
     a. ?That was about what they needed to feed a cow.
     b. That was about what they needed in order to feed a cow.

If, however, for has to be retained, and in order can easily be inserted between for and to, for marks the end of the preceding clause (usually as stranded preposition), even if that clause is not separated by a comma from the following to clause, as in (99), in which for cannot be omitted, which is illustrated in (99-a) versus (99-b).

(99) ... that’s what they dip them for to prevent all they things, you see. (FRED, PEE_002)
     a. ?that’s what they dip them (in order) to prevent all their things
     b. that’s what they dip them for, (in order) to prevent all their things

5.3.1.1. Independent variables

The external independent variables are, as usual, AGE, SEX and REGION. In order to examine the salience of a particular region, it is also interesting to examine its differences from the rest of the data. A set of variables which can be subsumed under the heading REGION? permits these examinations. The individual categories of REGION? consist of the (short) name of the region, followed by a question mark: N? (Northern England), S.EAST? (Southeast England) etc. They capture whether the clause was from this regional data set or not.

The next distinction that had to be made for each clause was its type, represented by the variable TYPE, with the values postmodifying, complement, adverbial and unclear. As all types of for to clauses were to be analysed and not only complement clauses, these codes allowed the analyses of variation according to different types of clauses.
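The REGION? variables amount to a one-against-the-rest coding of REGION. The Python sketch below shows one way this could look; it is an assumption-laden illustration rather than the coding scheme actually implemented. The function names, the row format (region, forto) and the toy rows are invented, and of the region labels only N and S.EAST come from the description above (SW is a made-up label).

```python
def region_indicators(region, all_regions):
    """One indicator per region: 1 if the clause is from that region, else 0.

    Variable names follow the 'short name + question mark' pattern.
    """
    return {name + "?": int(region == name) for name in all_regions}


def forto_rate_in_vs_rest(rows, region):
    """Compare the for to rate inside one region with the rest of the data.

    `rows` is a list of (region, forto) pairs with forto coded 1 or 0.
    """
    inside = [forto for r, forto in rows if r == region]
    rest = [forto for r, forto in rows if r != region]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(inside), mean(rest)


if __name__ == "__main__":
    # Invented toy rows, not figures from the corpora.
    rows = [("N", 1), ("N", 0), ("S.EAST", 0),
            ("S.EAST", 0), ("SW", 1), ("SW", 0)]
    print(region_indicators("N", ["N", "S.EAST", "SW"]))
    print(forto_rate_in_vs_rest(rows, "N"))
```

Coding each region against all remaining data, instead of against a single reference region, is what allows the salience of one particular region to be examined on its own.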

Type: For to clauses occur in three functions of to clauses: adverbial, complement and postmodifying clauses, as discussed in section 5.1, coded a, c, p respectively. Complement clauses complete the meaning of their head. Postmodifying clauses in noun phrases specify the head noun rather than complete it and can often be rephrased in a relative clause. Many postmodifying and complement clauses also convey an adverbial relation, which, with to clauses, is mostly that of ‘purpose’ or ‘result’. Structurally, however, adverbial clauses do not complete or specify the meaning of the head of the phrase they belong to. Instead, they add optional information to a clause, usually the purpose of the action or event described in the main clause, so the for always had to be replaceable by in order or so as. Syntactically, adverbial clauses are freer than complement and postmodifying clauses and also occur sentence-initially, e.g., And for [/in order] to launch him they used to take him over and go down the slipway (FRED, SOM_036). Clauses were only coded as adverbial clauses if they did not qualify as postmodifying or complement clauses. This was especially important for the distinction between postmodifying clauses with adverbial gaps and adverbial clauses as such.

Matrix: This variable represents the matrix expression (i.e., matrix verb or head of the noun phrase) and thus only applies to postmodifying and complement clauses. The values are mostly simply the lexical items that fulfill this function, e.g., chance, people, decide, etc. In some cases, though, they were simplified or extended. The variable code “x”, which was chosen for extraposed clauses, also applies to raising constructions (Quirk et al. 1985: 1202–1203). Head nouns preceded by enough are coded as “enough NP”, which also subsumes cases in which enough itself is the head of the noun phrase, e.g., we had enough to see and to do (FRED, HEB_033). Have/had got, their contractions (’d /’ve got) and just got (they got to come back up) are coded as got.
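The simplifications described for the variable Matrix can be pictured as a normalisation step applied to each matrix expression. The sketch below is an illustration only: the function name code_matrix and the regular expression are assumptions, plain ASCII apostrophes are assumed in the input, and the code “x” for extraposed and raising constructions is not handled because it depends on the syntactic analysis rather than on the string.

```python
import re

# Optional have/had, their contractions, or 'just', followed by 'got'.
GOT_PATTERN = re.compile(r"(?:(?:have|had|'ve|'d|just)\s+)?got")


def code_matrix(expression: str) -> str:
    """Map a matrix expression to a simplified MATRIX value."""
    text = expression.lower().strip()
    if GOT_PATTERN.fullmatch(text):
        return "got"
    # 'enough' on its own or before a head noun is subsumed under one value.
    if text == "enough" or text.startswith("enough "):
        return "enough NP"
    return text


for item in ["had got", "'ve got", "just got", "enough time", "chance"]:
    print(item, "->", code_matrix(item))
```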

Because of unclear passages in the transcriptions, five for to clauses could not be classified as a certain type. These are:

(100) . . . that was all the examination you got for to get a higher school, only a standard. (FRED, NBL_008)
(101) . . . they say that if he had, if they collected dinner for to have he [sic] brought home, he’d have the biggest funeral of the lot. (FRED, CON_001)
(102) And I said to Joan and Ba – have a hard job put en up, but I couldn’t go back for no note for that, . . . but you know, I call that mind, for to write, way to have a note wrought/brought [transcriber undecided], it’s two minutes late, you been – well, couldn’t ask Ma to write a note in the last five minutes. (FRED, CON_005)
(103) So one man was selected for to. . . {Go and do . . .} Go, go and do that. (NITCS, A28.2, PM12, 13)
(104) . . . you’d found a moulder and signed on, then they would take you back again. Then for to go for another summer and you get a, a jo-, job in the brickfield (FRED, KEN_003)

Although the rule was to code clauses as postmodifying and not as adverbial clauses if this was feasible, in (100) and (101) both options are unsatisfactory. Most likely, for to get a higher school in (100) means to get [into] a higher school or to get a higher school degree, which as a postmodifying clause would be introduced by [examination] through which and as an adverbial clause by in order. Although the postmodifying clause is consequently possible, the adverbial clause seems more likely if one compares That was all the examination you got through which you could get to a higher school with That was all the examination you got in order to get to a higher school. So firstly, the postmodifying interpretation is possible, but slightly awkward, and secondly, this is all based on the “most likely” meaning of the utterance. Both points together make the identification of a certain clause type problematic. In (101), even when one replaces for to have he brought home with to have him brought home or to have brought home to him, it remains unclear why a dinner should be collected that is brought home to this person or why it should be collected so that he is brought home. It thus remains impossible to assign a certain type of clause to to have he brought home. Similarly, the meaning of the whole passage in (102) is incomprehensible despite repeated efforts, so for to write cannot be included in any category. The initially incomplete for to clause in (103) was not classified because the interviewer provides the verbs. (104) contains a peculiar instance of a for to clause, the underlying meaning of which is that of a finite clause: you went.
One might consider a finite adverbial clause an appropriate equivalent, from which, however, and would have to be deleted (Then when you went for another summer you got a job in the brickfield), but this involves too many changes to code for to go as an adverbial (non-finite) clause.

5.4. Analysis

The overall aim of the analysis is to determine whether for to clauses are equally frequent in all regions included in FRED and the NITCS or whether they are particularly frequent or infrequent in individual dialects. The underlying null hypothesis is therefore: For to clauses as non-standard variants of to clauses are equally frequent in the data from all examined regions.

The analysis will be presented in the three following sections. The first general results show that a further reduction of the data sample was advisable, because to has come to form a fixed unit with some verbs, e.g., want to (wanna). These verbs were eventually removed from the database, because they function similarly to (modal) auxiliary verbs and the insertion of a for into a grammaticalised expression is virtually impossible (section 5.4.1). Section 5.4.2 explores the regional distribution of for to clauses and further influencing factors. The discussion in section 5.5 comments on the results of the analyses with reference to the literature on for to clauses.

5.4.1. Preliminary results and consequences

All for to clauses and the to clauses with be, come, do and get together make up a database of 13,302 clauses. Of these 13,302 clauses, 224 clauses are for to clauses. The for to clauses are distributed across the regions as shown in Table 36. The Isle of Man is the only region in the corpora in which no for to clauses occur, but this absence means only that for to clauses are not extremely more frequent in this region than in others. The lack of for to clauses in the nearly 10,500 words from the Isle of Man is in line with the overall frequency of for to clauses of less than once in 10,000 words. Compared with an assumed value of zero, since no for to is to be expected in standard English, the absolute frequencies of four regions stand out from the rest and there is a 95% probability that another data sample would yield similar results: for to occurs significantly less often in the data from the English Midlands and the Isle of Man, while it occurs significantly more often in the data from Southwest England and Northern Ireland.

Table 36. Absolute frequencies of for to clauses in regions, one sample t-test (test value 0) p <0.013, 95% CI 1.37–8.49, significant values marked with *

REGION                          n      per 10,000 words
Northern Ireland                37*    1.5
Scottish Highlands & Islands    8      0.5
Isle of Man                     0*     0
Scottish Lowlands               13     0.7
Northern England                30     0.6
English Midlands                5*     0.14
Wales                           2      0.2
Southwest England               83*    1.4
Southeast England               46     0.7
Total                           224    0.8

However, these differences are not reliable as linguistic variables, as the higher or lower frequencies can be due to the use of more or fewer infinitives. If, for some reason, there are fewer infinitives in the English Midlands in general, there will be fewer for to clauses in these data, even if the percentage of for to in all infinitives is the same as in other regions. In order to be able to calculate percentages of a linguistic structure, the data sample of for to clauses was reduced to infinitives of be, come, do and get to match the database of to clauses. The database for the analysis below thus consists of 13,352 clauses altogether, of which 174 are for to clauses.

Table 37 shows the ten most frequent matrix expressions of infinitive clauses in the database. The six most frequent matrices (used, have, going, got, want and supposed) do not control any for to clauses.

Table 37. Most frequent matrices of (for) to clauses

MATRIX        n of tokens    n of controlled for to clauses
used          6027           0
have          1779           0
going         388            0
got           352            0
want          337            0
supposed      159            0
extraposed    150            3
try           145            0
seem          128            0
nothing       109            1

However, it is generally problematic to regard them as matrix verbs of a to clause, as they have properties of auxiliary verbs when followed by to (Quirk et al. 1985: 140, 143–146). Thus, used to, have to, going to, got to, want to and supposed to each form a unit that is followed by a bare infinitive rather than a matrix unit consisting of a head verb that controls an embedded to clause. Have to do and going to do, for instance, should be divided into have to + do and going to + do (comparable to must + do, will + do), rather than have + to do and going + to do. With the loss of the subjunctive in the history of English, to infinitives have been used to express potential events or actions when used with modal verbs, as after hope (van Gelderen 2006: 212). Although used to semantically does not belong to the core modal verbs, syntactically it has more typical modal properties than going to, have to and want to, which express prototypical modality (see Quirk et al. 1985: 137, 140, 143–146). In the corpora, used is negated similarly to an auxiliary verb several times, without do-support and with post-positioned not: used not (n=2) or usedn’t (n=4). Consequently, to is a part of these grammatical and semantic units and therefore not likely to be extended to for to, which is evidenced in the database. In his discussion of the Bolton dialect, Shorrocks (1999) lists “ought (for) to” (157) and “used (for) to” (179), but all his cited examples show simple to. Whether the addition of (for) in the lists refers to documented usage or to mere possibility remains unclear.

Four of the most frequent matrix verbs in the data sample often fuse with to in contracted forms, which are usually rendered in the following spellings in the corpora: hefta for have to, hetta for its past tense had to, gonna for going to, gotta for got to and wanna for want to. The existence of these contracted forms makes the insertion of for virtually impossible. Krug stresses the interrelation between frequency, grammaticalisation and “bondedness” of forms (2000: 176–177).
The relation between the bondedness of multi-word units and their frequency is circular: on the one hand, “frequency is an indicator of increased bondedness between the constituents of constructions” (2000: 176); that is, the more bonded constituents are, the more frequently they occur. On the other hand, “the frequency of word-form sequences is a largely independent factor that determines native speakers’ intuitions about how closely connected two adjacent words are” (1998: 305); that is, the more frequent multi-word strings are, the more native speakers perceive them as one unit.

The verbs that most frequently control a to clause can thus be intuitively considered to form a unit with the following to which furthermore acts as a modal verb. This fusion impedes the insertion of for between the two constituents of this unit and their frequency as matrix verbs of discrete to clauses consequently prevents these verbs from being used as matrix verbs of for to clauses in the corpora. Used, had, have, got, supposed, wanted and want are among the ten most frequent words that immediately precede to in the concordance of be, get, come and do (WordSmith patterns and collocates). They control nearly 68% of all clauses in the database. Want (n=128) and wanted (n=108) together immediately precede to 236 times, almost twice as often as like (n=110) and liked (n=17), which also never control a for to clause.

In order to measure the influential strength of different variables, it therefore makes more sense to eliminate the quasi-auxiliaries from the database, as their frequency and “auxiliary-hood” preclude the insertion of for. Used, have, going, got and want all occur over 300 times. This reduces the total number of (for) to clauses in the database to 4,419. Although be supposed to is also one of the “semi-auxiliaries” (Quirk et al. 1985: 137), it was retained as matrix verb in the data sample as its complement clauses are approximately as frequent as extraposed clauses or clauses controlled by try.
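The arithmetic behind the reduction can be retraced from the token counts in Table 37. The short sketch below assumes that the figure of nearly 68% refers to the six most frequent matrices taken together and that the database size is the 13,302 clauses named at the beginning of this section; the variable names are illustrative.

```python
# Token counts of the six most frequent matrices, as given in Table 37.
MATRIX_TOKENS = {
    "used": 6027,
    "have": 1779,
    "going": 388,
    "got": 352,
    "want": 337,
    "supposed": 159,
}
DATABASE_SIZE = 13302  # clauses in the database before the reduction

# Share of all clauses controlled by the six most frequent matrices.
quasi_aux_share = sum(MATRIX_TOKENS.values()) / DATABASE_SIZE

# 'supposed' is retained, so only the five matrices above 300 tokens are removed.
removed = sum(n for verb, n in MATRIX_TOKENS.items() if verb != "supposed")
remaining = DATABASE_SIZE - removed

print(f"share of quasi-auxiliaries: {quasi_aux_share:.1%}")
print(f"clauses left after removal: {remaining}")
```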

5.4.2. The regional distribution of for to clauses

The reduced data sample consists of 4,246 (for) to clauses containing the infinitives of the four most frequent verbs in the corpus, be, come, do and get. As illustrated in Table 38, the overall percentage of for to clauses ranges from 0% in the data sets from the Isle of Man, the English Midlands and Wales to around 2% in the data from Southwest England (1.8%) and Southeast England (2.0%).

Table 38. Regional distribution of for to clauses in reduced data sample; AR = adjusted residuals, significant values marked with *; overall chi2 p = 0.007

                                for to               to
REGION                          % (n)       AR       % (n)         AR
Northern Ireland                1.4 (6)     0.3      98.6 (432)    -0.3
Scottish Highlands & Islands    0.3 (1)     -1.5     99.7 (323)    1.5
Isle of Man                     0.0 (0)     -0.5     100 (22)      0.5
Scottish Lowlands               0.3 (1)     -1.4     99.7 (297)    1.4
Northern England                1.5 (11)    0.7      98.5 (744)    -0.7
English Midlands                0.0 (0)     -2.9*    100 (582)     2.9*
Wales                           0.0 (0)     -1.4     100 (157)     1.4
Southwest England               1.8 (17)    2.0*     98.2 (907)    -2.0*
Southeast England               2.0 (15)    2.2*     98.0 (731)    -2.2*
Total                           1.2 (51)             98.8 (4195)
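Adjusted residuals of the kind reported in Table 38 can be recomputed from the cell counts alone. The sketch below assumes the usual formula for adjusted standardised residuals in a contingency table, (observed - expected) / sqrt(expected * (1 - row share) * (1 - column share)); the function name and the dictionary layout are illustrative and not taken from the study.

```python
from math import sqrt

# (for to, to) counts per region, taken from the n columns of Table 38.
COUNTS = {
    "Northern Ireland": (6, 432),
    "Scottish Highlands & Islands": (1, 323),
    "Isle of Man": (0, 22),
    "Scottish Lowlands": (1, 297),
    "Northern England": (11, 744),
    "English Midlands": (0, 582),
    "Wales": (0, 157),
    "Southwest England": (17, 907),
    "Southeast England": (15, 731),
}


def adjusted_residuals(counts):
    """Adjusted residual of the 'for to' cell for every region."""
    total = sum(for_to + to for for_to, to in counts.values())
    for_to_share = sum(for_to for for_to, _ in counts.values()) / total
    residuals = {}
    for region, (for_to, to) in counts.items():
        row_total = for_to + to
        expected = row_total * for_to_share
        variance = expected * (1 - row_total / total) * (1 - for_to_share)
        residuals[region] = (for_to - expected) / sqrt(variance)
    return residuals


for region, value in adjusted_residuals(COUNTS).items():
    flag = "*" if abs(value) > 1.96 else ""
    print(f"{region:30s} {value:5.1f}{flag}")
```

In a table with only two columns, the residual of the to cell in each row is the same value with the opposite sign, so only the for to column needs to be computed.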

Adjusted residuals (see the appendix) show more clearly whether any regions deviate from what would be expected if there were no regional differences. The expected percentage of for to clauses in a region is calculated on the basis of the average percentage of for to clauses in the sample, 1.2%. This is the percentage that is expected in each region, and the closer the actual percentage of for to clauses in a region is to this value, the closer the residuals are to 0. They are displayed for the frequency of for to clauses in Table 38. The statistically significant adjusted residuals (beyond ±1.96) mark those three regions which can actually be regarded as differing from the average in the use of for to clauses. There are more for to clauses in the data from Southwest England (AR 2.0) and Southeast England (AR 2.2) and fewer for to clauses in the data from the English Midlands (AR -2.9). The null hypothesis that for to clauses as non-standard variants of to clauses are equally frequent in data from all examined regions can be rejected. The regional distribution of for to in percentages of infinitive clauses partly differs from the absolute frequencies discussed above (in Table 36). Whereas for to is especially frequent in absolute numbers in the data from Northern Ireland, speakers from this region do not use it more frequently as a non-standard infinitive marker with the most frequent verbs than speakers from other regions. The percentage of for to clauses in Northern Ireland is closest to the overall percentage (AR 0.3). Similarly, while the absence of for to in the Isle of Man in absolute numbers stood out, the comparison of percentages shows that this fact is to be expected, considering the small amount of data from this region (AR -0.5, not significant).
In contrast, the percentage of for to in infinitive clauses in the data from Southeast England is higher than can be expected from the overall percentage of for to (AR 2.2), although the absolute frequency of for to is not different from the rest of the corpus. In two regions, the absolute frequencies of for to as well as its percentages in infinitive clauses are significantly higher or lower than in the other regions. These are the English Midlands, where for to occurs less frequently than on average in the data sample (AR -2.9), and Southwest England, where for to is significantly more frequent (AR 2.0).

Nevertheless, the absence or presence of a for to clause can be caused by an unusually high or low frequency of a factor other than region that co-occurs with for to in general. The following analyses aim to verify whether a speaker’s choice between to and for to is determined by their dialect or rather by their age or sex or by the type of clause they used. Table 39 illustrates the distribution of for to in different types of clauses. Again, the null hypothesis that to and for to clauses are equally frequent in each function can be rejected.

Table 39. Co-occurrence of different clause types with for to; significant residuals marked with *; overall chi2 p <0.001

TYPE             % for to in type (n)    Adjusted residuals
unclear          10 (1)                  2.6*
adverbial        3.1 (32)                6.4*
complement       0.3 (7)                 -5.6*
postmodifying    1.2 (11)                -0.1

In sum, for to clauses cannot be assumed to be a phenomenon of individual dialects: they are uncommon, but they occur in virtually all British dialects. Nevertheless, there are differences in their distribution, especially when calculated in percentages of (for) to clauses: the data show more for to clauses than expected in the data from Southern (Southwest and Southeast) England, as well as in adverbial clauses, and fewer for to clauses than expected are complement clauses or occur in the data from the English Midlands. In the data from six regions (Northern Ireland, the Isle of Man, Northern England, Wales, Scottish Highlands and Islands as well as Lowlands), the distribution is as expected, even though that means a very low percentage. Yet how do the variables TYPE and REGION interrelate? Are there, for instance, more for to clauses in the data from Southern England because the data sample contains more adverbial clauses? Do the external variables AGE and SEX have an impact on the use of for to?

The interplay of all independent variables and their influence on the use of for to is identified in a logistic regression analysis (which is explained in more detail in the appendix). Logistic regression identifies those factors that significantly increase or decrease the probability of a for to clause. In the data sample, 98.8% of all clauses are to clauses. They represent the baseline percentage. Thus, in predicting that a clause in the data sample is a to clause, one would be correct in nearly 99% of all cases already. Logistic regression tries to identify factors that help increase the percentage of correctly predicted cases, which is nearly impossible with such a high baseline.
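The baseline percentage mentioned above is what a model would achieve by always predicting the more frequent outcome. A minimal sketch follows, assuming that the baseline is computed from the column totals of the per-region counts in Table 38 (51 for to clauses and 4,195 to clauses); the function name is illustrative.

```python
FOR_TO_CLAUSES = 51   # sum of the per-region for to counts in Table 38
TO_CLAUSES = 4195     # sum of the per-region to counts in Table 38


def majority_baseline(*class_counts: int) -> float:
    """Accuracy obtained by always predicting the most frequent class."""
    return max(class_counts) / sum(class_counts)


baseline = majority_baseline(FOR_TO_CLAUSES, TO_CLAUSES)
print(f"baseline accuracy: {baseline:.1%}")

# Any predictor has to classify more than this share of clauses correctly
# before it improves on the trivial 'always to' prediction.
errors_at_baseline = FOR_TO_CLAUSES
print(f"misclassified clauses at baseline: {errors_at_baseline}")
```

With so few for to clauses, a model can only improve on the baseline by correctly identifying some of these rare cases without misclassifying more to clauses in exchange.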
Logistic regression also identifies factors that increase the explained variance of the data, which is expressed by Nagelkerke R2. The logistic regression model for the probability of for to clauses presented here explains 19% of the variance of the data (Nagelkerke R2 = 0.191). It is the best conceivable model, but its explanatory and especially predictive powers are thus rather low. The percentage of correctly predicted cases could not be increased. This is on the one hand due to the extremely high baseline percentage of 98.8%. On the other hand, no restriction of the database to render a sample that contains a higher percentage of for to leads to an increase in the predictive strength of the model.20

Nevertheless, Table 40 shows the factors that decrease or increase the odds of a for to clause, usually called predictors (see the appendix), as they help to predict what the outcome of another sample compiled under the same conditions would be. Due to the weak predictive effect of the model, however, they must rather be taken to reflect variation only in the data sample under analysis.

Table 40. Predictors increasing the odds of for to clauses, significant values marked with *

Factor                                        Exp(b) (OR)    p
REGION                                        –              0.529
REGION(Scottish Highlands & Islands)          0.222          0.650
REGION(Isle of Man)                           0.000          1.000
REGION(Scottish Lowlands)                     1.585          0.858
REGION(English Midlands)                      0.000          0.999
REGION(Northern England)                      1.576          0.864
REGION(Southeast England)                     0.000          <0.010*
REGION(Southwest England)                     0.582          0.854
REGION(Wales)                                 0.000          1.000
AGE*TYPE                                      –              <0.001*
AGE*TYPE(unclear)                             0.565          0.998
AGE*TYPE(adverbial)                           1.014          0.007*
AGE*TYPE(complement)                          0.981          0.013*
AGE*REGION                                    –              0.464
AGE*REGION(Scottish Highlands & Islands)      1.003          0.957
AGE*REGION(Isle of Man)                       0.932          1.000
AGE*REGION(Scottish Lowlands)                 0.973          0.521
AGE*REGION(English Midlands)                  0.965          1.000
AGE*REGION(Northern England)                  0.987          0.736
AGE*REGION(Southeast England)                 1.193          <0.011*
AGE*REGION(Southwest England)                 1.000          0.999
AGE*REGION(Wales)                             1.001          1.000
SEX(female)                                   0.552          0.215
AGE                                           1.012          0.571
Constant                                      0.012          0.000*

The selected reference category for REGION is Northern Ireland, where the percentage of for to is closest to its average (AR 0.3). The other reference categories are male in SEX and postmodifying clauses in TYPE, i.e., SEX(male) and TYPE(postmodifying), and were selected automatically.
AGE*REGION describes the interaction of the variables AGE and REGION and compares the influence of age on the use of for to in each region in the database (as compared with Northern Ireland). The constant comprises the reference categories. The predictors of for to clauses occur in the interactions between AGE and the variables TYPE and REGION. Each additional year of age of a speaker increases the odds that she or he chooses for to in an adverbial clause by a factor (OR) of 1.014. The likelihood that a speaker in Southeast England chooses for to also rises with each additional year (OR 1.193). Although the odds ratio at less than 2 seems low, it is important to remember that it refers to an increase with each additional year of age. The factors that decrease the probability of for to clauses are the region Southeast England (OR 0.000) and the interaction of age with complement clauses (OR 0.981). The opposed effects of Southeast England in general and in interaction with age show that it is the age of speakers in the data from Southeast England rather than the effect of the regional dialect itself that causes the higher percentage of for to in infinitive clauses in this region. The fact that the observed distribution of for to across regions is different from the influence of the region is most likely related to the interplay of all independent variables. This interplay could also be another cause, besides the high frequency of to clauses, why no better logistic regression model could be computed. The distribution of for to and to according to speakers’ sex, however, is not significant (chi2 continuity corrected p = 0.6). The correlation between AGE and FORTO (p = 0.085, Pearson’s r = 0.03) is weak and only nearly significant at p ≤0.05. AGE is therefore not a predictor and influences the choice between to and for to only in some interactions.
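Because the odds ratios for the interactions with AGE apply per additional year, their effect compounds over larger age differences. The sketch below illustrates this with the two odds ratios named in the text (1.014 and 1.193); it assumes the standard reading of a logistic regression coefficient for a continuous predictor, and the function name and the age differences of 10 and 30 years are illustrative choices.

```python
def compounded_odds_ratio(or_per_year: float, years: int) -> float:
    """Odds ratio implied by a per-year odds ratio over a span of years."""
    return or_per_year ** years


# AGE*TYPE(adverbial): OR 1.014 per additional year of age.
print(round(compounded_odds_ratio(1.014, 10), 2))   # roughly 1.15 over ten years
print(round(compounded_odds_ratio(1.014, 30), 2))

# AGE*REGION(Southeast England): OR 1.193 per additional year of age.
print(round(compounded_odds_ratio(1.193, 10), 2))   # roughly 5.8 over ten years
```

The calculation shows why a per-year odds ratio that looks small can still correspond to a noticeable difference between speakers who are decades apart in age.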
Another question is why the observed significant differences in the percentages of for to in the English Midlands and in Southwest England are not predictors and why the Southeast English data contain significantly more for to clauses, but as a factor in logistic regression decrease its likelihood. Table 41 displays the interplay of TYPE of clause and FORTO for these three regions.

Table 41. Regional distribution of for to clauses in different types; AR = adjusted residuals, significant values marked with *; overall chi2 p ≤ 0.001

                                Adverbial        Complement       Postmodifying
REGION                          %       AR       %       AR       %       AR
English Midlands     Total      30.1    3.4*     48.5    -2.3*    21.5    -0.5
                     for to     0.0     n.a.     0.0     n.a.     0.0     n.a.
                     to         30.1    n.a.     48.5    n.a.     21.5    n.a.
Southwest England    Total      27.7    2.5*     49.8    -2.2*    22.4    0.1
                     for to     76.5    4.5*     5.9     -3.7*    17.6    -0.5
                     to         26.8    -4.5*    50.6    3.7*     22.5    0.5
Southeast England    Total      25.6    0.8      49.8    -2.3*    25.1    2.0*
                     for to     66.7    3.7*     13.3    -2.8*    20.0    -0.5
                     to         24.8    -3.7*    49.8    2.8*     25.2    0.5

As there are no for to clauses in the data from the English Midlands, no percentages of for to and to are expected in the distribution across different types, so there are no residuals to show deviation from this expectation. Overall, however, there are more adverbial clauses (AR 3.4) and fewer complement clauses (AR -2.3) in this data set than on average. The absence of for to in this data sample despite the higher frequency of adverbial clauses corresponds to the fact that for to is less frequent in this region than expected. As for to is infrequent in general, this difference is probably not large enough to become a predictor of the absence of for to.

The distribution of adverbial clauses (AR 2.5) and complement clauses (AR -2.2) in the data from Southwest England in general is similar to that in the English Midlands. However, for to is more frequent in the adverbial clauses from this region than expected (AR 4.5) and less frequent in the complement clauses than expected (AR -3.7). This corresponds to the overall distribution of infinitive markers in different clause types. The overall effect of clause type thus appears to override the influence of the region in the logistic regression model, so that REGION(Southwest England) is not a factor with significant influence on the use of for to.

In the data from Southeast England, there are significantly fewer complement clauses (AR -2.3) and significantly more postmodifying clauses (in which the distribution of (for) to is closest to its average; AR 2.0). The adverbial clauses in this region are more often than expected introduced by for to (AR 3.7), as is the case in Southwest England. However, in contrast to the Southwest England data, the higher frequency of for to in adverbial clauses is not accompanied by a higher frequency of adverbial clauses as such. This probably causes the higher percentage of for to in the observed distribution.
When other factors are also considered, however, the distribution of for to and to in adverbial and complement clauses is in line with the overall distribution of the infinitive markers in these clause types. In addition, the effects of both region and clause type alone are overridden by their interactions with the AGE of speakers.

In sum, the independent variables are interrelated in a way that counterbalances their individual effects. Together with a very high percentage of the standard English infinitive marker to, these interrelations appear to make a good predictive model inconceivable. For to is rare but widespread and no region can be determined in which it is more frequent than elsewhere. The interactions between AGE and REGION and between AGE and TYPE seem important for the prediction of for to clauses. They indicate that the higher frequency of for to clauses in the data from Southern England results from the influence of the age of the speakers in these regions as well as from the clause type that they use. The influence of the data from Southeast England is also noteworthy. On the one hand, it decreases the probability of for to clauses by the strongest possible factor (OR 0.000). On the other hand, an increase of the age of a speaker in this region is also the strongest factor that increases the probability of a for to clause.

5.5. Discussion and summary

This section summarises the findings concerned with the choice between for to and to and it aims at an evaluation of these results in relation to previous research. It also takes into account noteworthy aspects involving for to clauses which cannot necessarily be verified statistically or are beyond the scope of the present study.

Only the decision not to restrict the data sample to complement clauses enabled a thorough analysis of the distribution of for to clauses in British Englishes. The twenty-seven complement for to clauses exhibited by the corpora under analysis would not have supported any conclusions. The incorporation of adverbial and postmodifying clauses allowed the examination of the interrelation of different types of clauses, which are strongly connected with the use of for. It is the semantics of the preposition that add an adverbial relation to the infinitive that is also made use of in complement clauses. Nevertheless, for has to be considered to function syntactically as a subordinator rather than as a preposition even in a for NP to clause, which is demonstrated in Chomsky and Lasnik (1977), Chomsky (1981), Mair (1990), van Gelderen (1998) and Wagner (2000).

However, the non-standard infinitive marker is so infrequent that it was not possible to construct a valuable predictive logistic regression model. It is questionable to what extent the findings of this study would apply to a different data sample. Any conclusion drawn therefore has to remain more descriptive and applies to the data under analysis only. This summary will take into account both the total sample of for to of 224 instances and the instances of for to as variants of to in the four most frequent infinitives in the corpora.
The accounts of for to clauses in the literature raise the question of whether there is a connection between the semantics of for and the functional type of for to clauses: does the for inevitably express the adverbial relation ‘purpose’ or, less frequently, ‘result’ (Biber et al. 1999: 828)? As mentioned in section 5.4.2, for to occurs more often than can be expected in adverbial clauses and less often than can be expected in complement clauses. 138 of the total of 224 for to clauses are adverbial clauses and merely twenty-seven are complement clauses. The adverbial connotation of purpose is present in almost all postmodifying and complement clauses in the corpora that are introduced by contiguous for to, as illustrated in examples (105) – (108).

(105) . . . the ship was ready for to go to eh the engine works. . . (FRED, DUR_003)
(106) Yes, sickles for to cut the grain. (FRED, SEL_001)
(107) . . . they hadn’t got no pension for to carry on. (FRED, KEN_003)
(108) I used to take these loaves of bread down for to make the sausages, . . . (FRED, WIL_005)

(105) contains a for to clause that is a complement to ready and thus also strongly conveys the adverbial relation ‘purpose’ because the state of being ready always has a purpose for which this state is required. In examples (106) – (108) the for to clauses postmodify nouns that have the purpose of what is described in this postmodifying clause (cutting grain, carrying on, making sausage) and each of the for to clauses consequently has an adverbial gap that can be filled by with which (. . . the grain was cut, . . . they could carry on, . . . we made the sausage).

Merely eight for to clauses in the corpora are complement or postmodifying clauses introduced by for to that cannot be considered as expressing an adverbial relation to the matrix clause or head noun:

(109) I don’t think they knew which school for to send him to, . . . (FRED, LAN_012)
(110) But it was a thrill for to get your good clothes on, . . . (FRED, NBL_006)
(111) Because it were really, all stress for to keep the tub full of milk you know, . . . (FRED, SOM_029)
(112) ’Tweren’t no good for to say you weren’t going to do anything. (FRED, SOM_036)
(113) It’s very important, you know, for to have such a man like him, to speak out, you know. (NITCS, A19.2, PL23)
(114) And that was a, a treat then, for to get that. (NITCS, A28.2, PM14)
(115) Well, that’s how I learned for to bake bread. (NITCS, A32.3, LM26)
(116) And the father, he would try for to tell her, like, . . . (NITCS, A32.3, LM7)

Example (109) contains a postmodifying for to clause with prepositional gap (to which school), examples (110) – (114) make up a larger subgroup of extraposed for to clauses, and both (115) and (116) show for to clauses that are complements of verbs. Although it is impossible to turn these last two for to clauses into adverbial clauses by inserting in order or so as (?I learned in order to bake bread, ?he would try so as to tell her), it is possible to argue that the verb semantics also connote a purpose: when one learns, or tries, these actions have a specific result, e.g., to be able to bake bread, or a purpose, e.g., to tell someone something.

In sum, for to clauses in the data sample typically, though not exclusively, express an adverbial relation. The number of clauses is far too low to determine regional differences in this respect, but the data confirm that non-adverbial clauses are used in the English spoken in Northern England (Beal 2004: 134–135) and in Northern Ireland (Filppula 2004: 85–86).

An interplay of TYPE, AGE and REGION determines the choice between to and for to. The age of a speaker in the data from Southeast England is the strongest factor that increases the probability that for to is chosen. However, for to is avoided in Southeast England in general. Due to its adverbial connotation of purpose and result, for is more likely to be used in adverbial clauses and less so in complement clauses the older a speaker gets.

Even though a speaker’s age is not a main predictor of for to, it plays an important role. In FRED, the youngest for to speaker is fifty-seven years of age; eight (out of 55) speakers in the youngest age group of the NITCS, from nine to forty-seven years of age, use contiguous for to. None of the thirty-five speakers in FRED that are forty-seven years or younger uses for to. These findings indicate that for to clauses are indeed a receding feature, as claimed by, e.g., Miller (2004: 167) or Harris (1993: 167).

Nevertheless, the moderating influence of age on the effects of its interaction terms also suggests that the cognitive processing of complement clauses might be different with older speakers than with younger speakers (see Szmrecsanyi 2006: 196–198). Since there are great differences in the average age and range of age of the speakers in individual regions in the corpora (see section 2.1.2), these findings need to be confirmed in studies that use more balanced data samples for a more reliable apparent-time analysis.

The data from East Anglia do not contain a single for to clause although this data set comprises half the words from the Southeast of England, the region in which there is the strongest preference for for to clauses in the speech of older people. This may be caused by the fact that Norfolk, the only county in East Anglia where contiguous for to has been reported (Upton et al. 1994: 504), is not represented in the FRED data from East Anglia: they were exclusively collected in Suffolk.
Nevertheless, as mentioned above, East Anglia is also the only British region in which the use of for to clauses is not attested in Kortmann et al. (2004).

In conclusion, for to clauses are not only widespread across British varieties of English; related structures also occur in other Germanic languages. Although for to used to be a neutral infinitive marker in Earlier English, it has come to express the adverbial relation of ‘purpose’ nearly exclusively, a relation that is derived from the semantics of the preposition for. In sum, as for expresses an adverbial relation, it appears most often in adverbial clauses, which stresses the importance of semantics for syntax (see Duffley 1992: 7). Most observations on for to clauses in the literature could be confirmed: they occur in virtually every British dialect with the possible exception of East Anglia (at least of Suffolk), they are probably in decline, and – even if there are counter-examples – they are mostly adverbial clauses. Even if they are receding and perhaps “on the brink of extinction” (Tagliamonte et al. 2005: 102), for to clauses have to be considered a supra-regional non-standard feature (cf. Auer 2004: 70–72), since they occur in nearly all British and also in North American varieties of English. Whether their usage decreases further or whether it is revived, and in which individual varieties of English, could be examined in future research.

6. Conclusion

This study’s aim was to determine the actual distribution of syntactic dialect features in complement clauses. The findings show that embedded inversion and the complementizer as are regionally restricted dialect features. They are both significantly more likely to occur in specific regions. For to is more widespread throughout the UK and it is significantly less likely to occur only in a few regions. Since infinitive clauses introduced by (for) to function not only as complements, but also as adverbials, the latter are included in the analysis.

On the basis of the results in logistic regression, the following overview summarises which regions significantly influence the choice of syntactic variants, and which type of non-standard feature according to Auer (2004: 70–72) they are.

Embedded inversion is more likely in Northern Ireland, the Scottish Highlands and Islands and, when the subject consists of two words, also in Cornwall. It is a syntactic dialect feature.

The complementizer as is more likely in the English Midlands the older the speaker is and thus a syntactic dialect feature.

For to is less likely in Southeast England in general, but more likely the older the speaker is. It is a supra-regional non-standard feature.

Embedded inversion is the most strongly regionally determined syntactic variant in the present study. The best predictors are regions and the interaction between region and the length of the subject of the complement clause. Although the complementizer as is more likely in the English Midlands only in interaction with age, it is considered to be regionally restricted, because in the observed distribution it is also significantly more frequent there than anywhere else and there is a strong correlation between this region and its use.

Table 42. Overview of the percentages of syntactic dialect features across regions

REGION                          % as    % Embedded inversion    % for to
Northern Ireland                 0.0             6.7               1.4
Scottish Highlands & Islands     1.0            13.2               0.3
Isle of Man                      0.0             9.1               0.0
Scottish Lowlands                1.2             0.0               0.3
Northern England                 6.5             1.0               1.5
English Midlands                27.1             0.4               0.0
Wales                            0.0             0.0               0.0
Southwest England                1.4             1.9               1.8
Southeast England                2.2             0.2               2.0
Total                            2.0             4.5               1.2

With for to clauses, however, the interaction between age and region counterbalances the main effect of one region. Overall, for to is more frequent in the data from Southern England, but this is connected to increasing age of speakers and the type of clause in which it is used. It is infrequent and most likely a receding feature, but it still appears virtually everywhere in the UK.

In general, all non-standard features are less frequent than their standard equivalents. The infinitive marker for to is the feature least frequently chosen by speakers in the UK. As a variant of to it is so infrequent (1.2%) that no reliable factors determining its likelihood could be identified. The highest percentage of a dialect variant selected by speakers instead of the standard form is as in the English Midlands with nearly 30%. The second highest percentage is embedded inversion in the Scottish Highlands and Islands with 13.2%. All remaining percentages of dialect features in individual regions are less than 10%.

All dialect features – for to, the complementizer as and embedded inversion – are widespread. The fact that they hardly appear in the data from Wales and the Isle of Man is most probably due to the scarcity of data from these two regions (fewer than 100,000 words from each).
Disregarding the low overall frequency, as is used in six of the nine regions, embedded inversion occurs in the data from seven regions and for to is the most widely distributed feature and exhibited by eight of the nine regional data subsets – it does not occur in the data from the Isle of Man. These features are so widespread that the statistical software expected them to occur nearly everywhere. For a study on non-standard grammar it is an important observation that in some regions the non-standard clauses are less frequent than expected. The extent of the spread of the selected dialect features explains why for to and embedded inversion occur in the descriptions of several varieties (see, e.g., the respective sections in Kortmann et al. 2004, as shown in section 3 for embedded inversion and in section 5 for for to). From a global perspective, the British varieties of English can therefore be regarded as a fairly homogeneous group (see Szmrecsanyi and Kortmann 2009). The complementizer as is not very widespread in general and may be missed more easily. It occurs mainly in the English Midlands dialects, which are strangely neglected in research on dialect syntax.

The age of a speaker has a statistically significant predictive effect (if in interaction with another variable) on both non-standard complementizers: the older a speaker is, the more likely she or he is to use for to and as. This suggests that for to and as are receding dialectal features that were more frequent in traditional dialects and cease to be used in modern dialects (see Trudgill 2000 for the distinction between traditional and modern dialects). Embedded inversion seems to be more stable, as it is most frequent in the regions with the youngest mean age of speakers.
The increasing preference for the complementizer as with increasing age in the English Midlands is surprising, because the youngest speaker from this region using as is 66 years old, the youngest speaker using as from the Midlands is 72, the oldest 91. Considering that this is the smallest age range in the regions in which as actually occurs, as well as the region in this study with the highest mean age and the smallest age range (see section 2.1.2, 13), it is remarkable that the likelihood of as increases with each additional year of age. Yet these (significant) differences in speakers’ age between regional data subsets skew the data, so the findings are perhaps unreliable and need to be confirmed in future research.

The present analysis also provides further evidence for the claim that despite its worldwide spread, embedded inversion is more frequent in areas with a Celtic language background in the British Isles, also in Cornwall. Subsequent research comparing data from ICE (Kolbe and Sand 2010) has shown that although embedded inversion also occurs in many New Englishes, it is significantly more frequent in Irish English. Thus it seems that in Celtic regions in the British Isles substratal influence combines with universal colloquial usage (see also Davydova et al. 2011: 298–209).

None of the three selected features has unequivocally proved to be typical of a British English dialect or of a group of dialects. In any case, it is difficult to define and delimit a specific “dialect”, as language use may vary even within one location, while, as this study has shown, certain features occur nearly throughout a country. In the regional subsets of FRED, the complementizer as, for instance, occurs in the English Midlands and in Northern England. This subsumes the counties Lancashire and Westmorland in Northern England on the one hand and Nottinghamshire and Shropshire in the Midlands on the other hand.
In the Midlands counties Warwickshire and Leicestershire as is not used. The complementizer as therefore seems to be used in north-ish Western England, but it does not seem beneficial for dialect studies to create dialect “areas” for each individual feature (cf. Labov 1972b: 191–192). More insight into the regional distribution of a multitude of dialect features is provided by dialectometrical studies in general and on dialectal morphosyntax in particular (Szmrecsanyi 2008, 2011).

A multitude of factors determines syntactic variation (see, e.g., Jaeger 2006: 51–95; Szmrecsanyi 2006), which is confirmed by the present study. Not only the external factors age and sex of a speaker, but also linguistic features of a complement clause or its matrix clause determine which variant is chosen. For to is more frequent in adverbial clauses of purpose, embedded inversion also depends on the length of the subject and as introduces more complement clauses after see, but is more likely to be avoided after say. Regional variation is one of the determinants of variation and should be taken into account in future research.

This study also contributes to filling the research gap on the more contemporary grammar of English Midlands dialects. It is precisely in the dialects of the English Midlands that one of two regionally restricted features has been identified: the use of the as-complementizer. The second regionally restricted dialect feature is embedded inversion in Celtic Englishes.

Although the complementizer as appears to be more frequent than for to, it has been easy to overlook, since it occurs in one of the few areas neglected by research on British English dialects and it is also very similar to the relativizer as. For to, however, is more widely spread, though less frequent overall, justified by adverbial connotations and examined in diachronic studies.
Data-driven studies are therefore indispensable to avoid circular research that focuses on those features that have already been observed.

Another point that seems to reward more investigation is the influence of age on syntactic variation in general and on complementizer choice in particular. Corpora that are balanced in terms of age, such as the NITCS, could serve as a database in order to track the development of as and for to and to determine whether it is language change or cognitive processing that leads to the predictive power of age.

Acknowledgements

Jill Schneller, Christiane Maaß, Benedikt Szmrecsanyi, Clare Fielder and Janet Duke have read substantial parts of this study and have helped to improve it considerably (or should I say significantly?). Special thanks are due to Benedikt Szmrecsanyi for help in statistical and LaTeX matters (and lots of patience, I believe). Any errors and omissions, however, are entirely my own. Most of all I’d like to thank Adel for his invaluable support and Elias for being the blessing that he is.

Appendix

Statistical concepts

This section explains important concepts for the statistical analyses in the present study. It also describes the procedures that are employed in the analyses. It is meant to function as a rather elaborate glossary for readers to look up concepts they are not familiar with. The concepts and procedures presented are, in this order:

– variables
– null hypotheses
– statistical significance
– cross tabulations
– correlations
– logistic regression

Variables

The basic distinction between different types of variables in statistics is that between dependent and independent variables. Independent variables are factors that influence the value of the dependent variable. In apparent time studies, the age of the speaker is the independent variable that influences which linguistic variant a speaker uses. The use of the linguistic variable thus depends on the age of the speaker: it is the dependent variable. Further types of variables depend on the kind of coding applied and the number of possible values for each variable. Which statistical procedures can be used in an analysis depends on the types of variables involved (Zoefel 2002: 85–89).

The first distinction refers to the scale that is used in the variable. There are three possible scales: nominal, ordinal, and ratio-scale variables. Variable names are conventionally set in small capital letters, e.g., AGE. Individual categories or values of these variables are expressed as VARIABLE NAME(value), e.g., REGION(Northern England).

Nominal variables are also referred to as categorical variables: their codes are names and cannot be ranked in a meaningful way. An example of a nominal variable in the present study is REGION. All regions have the same weight; there is no reason to sort them in any significant order. Wales does not have more of a certain property than Northern Ireland, nor does Northern Ireland have more of it than the English Midlands, so one could not meaningfully assign the value 1 to Wales, 2 to Northern Ireland, 3 to the English Midlands and so forth. All categories of REGION have the same weight. Nominal variables can be coded by numbers or by letters.

Ordinal or ranked variables can be ordered in a meaningful way. Their categories express notions of more or less, better or worse, etc. A widely known example is the ranking of athletes in the Olympics in winners of gold, silver and bronze medals. There is a clear order: gold is better than silver and bronze, but it does not matter by which distance this rank has been reached. The difference between the gold and the silver medal in a race can be just a tenth of a second, while the difference between silver and bronze medal is 5 seconds. An ordinal variable in this study is SUBJECTLENGTH, which distinguishes between subjects containing one word, two words and more than two words.

The values of ratio-scaled variables can not only be ranked; the distance between them can also be measured in fixed units. The only ratio-scaled variable in the present study is AGE. Speakers are not just older or younger than other speakers; the unit of one year of age is also a stable unit, so that one can measure the difference of age between two speakers by counting the years. A difference of three years of age is comparable in all values of this variable: 12 years is three years younger than 15 years and 65 is three years older than 62.

Null hypothesis

An answer to the question “how dialectal is syntactic variation in complement clauses?” can range from “none at all” to “a speaker’s dialect is the most important factor in his or her choice of syntactic variants”. Hence, it is most important to determine whether there is dialectal variation in syntax. Syntactic variants are either equally distributed across the regional data subsets or their distribution is unequal. These options are expressed in two hypotheses in statistics:

Null hypothesis: Syntactic variants are equally distributed across all regions represented in the corpora.

Alternative hypothesis: Syntactic variants are not equally distributed across all regions represented in the corpora.

Statistical analyses determine whether the null hypothesis (or, in short, H0) can be rejected (Howell 2003: 142–149). This is the case when there are statistically significant differences. For each analysis in the study, the appropriate null hypothesis is stated.

Statistical significance

Statistical significance depends on the probability of error in rejecting the null hypothesis. Probability is expressed by p-values. A p-value of 1 equals a probability of 100%, a p-value of 0.5 equals a probability of 50%. A p-value of 0.05 thus means that there is a 5% probability that rejecting the null hypothesis and accepting the alternative hypothesis is wrong. In the analyses below, p-values of 0.05 or less thus indicate that there is a 95% probability that a syntactic variant is not equally distributed across all regions. This p-level is generally accepted as the upper limit for statistical significance (Zoefel 2002: 63–64). There are three different levels of significance, which are also represented by asterisks: p≤0.05 is significant (*), p≤0.01 is very significant (**), p≤0.001 is highly significant (***) (Zoefel 2002: 64). Statistical procedures differ according to the type of variables, but the interpretation of p-values remains the same.

Cross tabulations

Cross tabulations are calculated in contingency tables, in which the values of two variables are compared. They are useful for the analysis of the relation between two nominal variables and, as most variables in the present study are nominal variables, they are one of the two most frequent analyses used here. The p-values are calculated on the basis of a chi-square (χ²) test of independence. A contingency table can be calculated, e.g., for the comparison of sex of the speaker (woman vs. man) and the use of for to (for to vs. to). This comparison could show that for to is more frequent in the speech of men. Consequently, if p≤0.05, we know that men use for to more frequently than women.

However, in the comparison of the use of for to across the nine different regions of the corpora, the result is not as clear. If, e.g., there are 2% for to clauses in the data from Wales, 2.5% for to clauses in the data from Southeast England and 3.2% for to clauses in the data from Northern England and this distribution is statistically significant, we do not know from a chi-square test which region(s) cause(s) the distribution to be significant. Thus, it is impossible to determine where for to clauses are significantly more frequent and where they are significantly less frequent.

The deviation of individual regions from the overall percentage can be measured via adjusted residuals. In the example above, they would calculate for each region whether it has more or fewer for to clauses than expected. This expectation is based on the overall percentage, which is taken as a reference point in a normal distribution. Adjusted residuals yield the deviation from this reference point in standard deviation units. 95% of the data in a normal distribution occur from -1.96 to +1.96 standard deviation units, when the reference point equals zero.
Consequently, adjusted residuals that are smaller than -1.96 or larger than 1.96 are statistically significant, because they correspond to the 5% probability level generally accepted to indicate statistical significance (Howell 2003: 110–114).
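The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the region names and counts are hypothetical and not taken from FRED or the NITCS, the function name is invented for this sketch, and the adjusted residuals use the common textbook formula rather than the exact SPSS output used in the study.

```python
import math

# Hypothetical counts, NOT taken from the corpora.
# Each row is a region; the columns are [for to, to].
observed = {
    "Region A": [30, 970],
    "Region B": [10, 990],
    "Region C": [20, 980],
}


def chi_square_with_adjusted_residuals(table):
    """Return the chi-square statistic and the adjusted residual of every cell."""
    rows = list(table)
    n_cols = len(table[rows[0]])
    row_totals = {r: sum(table[r]) for r in rows}
    col_totals = [sum(table[r][j] for r in rows) for j in range(n_cols)]
    n = sum(row_totals.values())

    chi2 = 0.0
    residuals = {r: [] for r in rows}
    for r in rows:
        for j in range(n_cols):
            # Frequency expected if the variant were equally distributed.
            expected = row_totals[r] * col_totals[j] / n
            chi2 += (table[r][j] - expected) ** 2 / expected
            # Standard error used to scale the adjusted residual.
            se = math.sqrt(
                expected * (1 - row_totals[r] / n) * (1 - col_totals[j] / n)
            )
            residuals[r].append((table[r][j] - expected) / se)
    return chi2, residuals


chi2, residuals = chi_square_with_adjusted_residuals(observed)
# A 3 x 2 table has 2 degrees of freedom; only in that case does the
# p-value have the simple closed form exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
for region, cells in residuals.items():
    flag = "significant" if abs(cells[0]) > 1.96 else "not significant"
    print(f"{region}: adjusted residual for 'for to' = {cells[0]:+.2f} ({flag})")
```

In this made-up sample the overall test is significant, but only the residuals show that one region lies above and one below the expected frequency, while the third does not deviate at all.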

Correlations

A correlation between two variables exists when an increase in the value of one variable goes along with an increase or decrease in the value of another variable. Hence, nominal variables with more than two values, such as REGION, cannot correlate with another variable. The fact that the mean age of speakers from Northern Ireland is lower than the mean age of speakers from Southwest England, for instance, is not a correlation, because the difference between Northern Ireland and Southwest England does not represent a decrease or increase of anything (Zoefel 2002: 118–119). Binary nominal variables such as FORTO, which indicates whether an infinitive marker is to or for to, however, can correlate with a ratio-scaled variable such as AGE. The difference between a presence or an absence of for to is comparable to an increase or decrease in the value of a variable (Zoefel 2002: 137–138).

Depending on the types of the variables involved in a correlation analysis, different kinds of measurements apply – they are called correlation coefficients (r). The correlation coefficients applied for the correlations in this study are Spearman ρ (rho) and the Pearson product-moment correlation coefficient (Pearson r) (cf. Zoefel 2002: 124–141). Correlation coefficients range from -1 to +1. The polarity (+/-) indicates whether the correlation is positive (an increase in one variable correlates with an increase in another variable) or negative (an increase in one variable correlates with a decrease in another variable). The strength of the correlation is expressed by the value between 0 and 1. A value of 1 is a perfect, strong correlation, a value of 0 equals no correlation. Consequently a correlation coefficient r = -0.75 indicates a strong negative correlation, while r = 0.2 refers to a weak positive correlation (Zoefel 2002: 120–121). The square of correlation coefficients (R²) shows how much of the variance of the data is explained. 100% variance explained equals R² = 1; 50% explained variance equals R² = 0.5 (Zoefel 2002: 126, 219).
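As a rough illustration of the two coefficients named above, the following Python sketch computes Pearson r and Spearman ρ between a ratio-scaled variable and a binary one. The six speakers and their values are hypothetical, and the helper functions are written out by hand for this sketch; they are not the SPSS routines used in the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def ranks(values):
    """Rank values from 1 upwards; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(xs, ys):
    """Spearman rho: Pearson r computed on the ranks of the values."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical speakers: AGE in years, FORTO coded 1 (for to) or 0 (to).
ages = [25, 40, 58, 66, 72, 85]
uses_for_to = [0, 0, 0, 1, 0, 1]

r = pearson_r(ages, uses_for_to)
rho = spearman_rho(ages, uses_for_to)
print(f"Pearson r = {r:.2f}, explained variance R^2 = {r * r:.2f}")
print(f"Spearman rho = {rho:.2f}")
```

Both coefficients come out positive for this invented sample, i.e. for to goes along with higher age; squaring r gives the share of variance explained.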

Logistic regression

There are different computer programmes that provide different kinds of multivariate analyses. The procedure chosen here is logistic regression. Linear regression displays the correlation between two variables in a straight line (see Zoefel 2002: 140–144), but correlations can also be non-linear, when the value of the dependent variable does not increase or decrease at the same rate. This is the case with dichotomous variables in which all values are, e.g., either 0 or 1. For linear regression, this variable lacks values between 0 and 1 that could determine the exact incline of a line that illustrates the correlation. Logistic regression thus assumes a logistic (S-shaped) curve that illustrates the correlation, with curves at the upper and lower ends for the 1 and 0 values, respectively (Menard 2002: 7–11). Its advantages over other statistical procedures are its ability to predict outcomes rather than to describe observed frequencies and to balance the effects of different variables within one model. In contrast to cross tabulations, which identify the differences between observed and expected frequencies, logistic regression assesses the strength of the influence of each of the independent variables. It thus shows whether statistically significant regional distributions of the syntactic variant under analysis result from the regional distribution of another variable, e.g., AGE. If for to clauses are more frequent in the English Midlands, but also in older people’s speech, the higher frequency of for to in the English Midlands could result from the higher average age of speakers from the English Midlands in the database (see section 2.1.2). This would show in logistic regression by more strength of influence of AGE than of REGION.
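The S-shaped curve mentioned above can be made concrete with a short Python sketch. The intercept and slope below are hypothetical values chosen only for illustration; they are not estimates from the present study.

```python
import math


def logistic(z):
    """Map any real number onto the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients: the linear predictor is b0 + b_age * age.
b0, b_age = -8.0, 0.08

for age in (20, 50, 80, 100, 120):
    z = b0 + b_age * age   # a straight line, unbounded in both directions
    p = logistic(z)        # flattens out towards 0 and 1 at the ends
    print(f"age {age:3d}: linear predictor {z:+.1f}, probability of for to {p:.3f}")
```

The linear predictor grows by the same amount for every additional year, whereas the predicted probability changes slowly near 0 and 1 and fastest around 0.5, which is the S-shape the text refers to.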
The aim of logistic regression is to infer from the given data sample a model that detects which factors determine the given distribution in a way that they would exert the same influence in another data sample of the same “population”. In other words, the model identifies factors, or (values of) variables, that help predict the outcome of the dependent variable. It thus shows, e.g., whether knowing the value of an independent variable, e.g., REGION, facilitates the prediction of the outcome of the dependent variable, e.g., FORTO (Menard 2002: 12). This is measured by the percentage of correctly predicted cases. If 90% of the infinitive markers in the database are to, in trying to predict whether the next infinitive marker was to or for to, one would be right in 90% of the cases by guessing that it would be to. These 90% would be the baseline percentage which a logistic regression analysis would try to increase by identifying when one should predict a for to. If region was a determinant in the choice of for to, knowing the region an infinitive is from would help to increase the percentage of correctly predicted cases. If one knows that, e.g., for to is especially frequent in the English Midlands and does not occur in the data from Southeast England, this knowledge can help to be correct in more than 90% of all cases. The independent variables that significantly help to increase the number of correctly predicted cases therefore are variables that increase the odds of the syntactic variant in the dependent variable, as in this example, the use of for to, or the variable FORTO. Although odds are closely related to probability, for stochastic reasons, odds is the more appropriate term (see Menard 2002: 12–14). Another measure that is important for the evaluation of a logistic regression model is the explained variance, as discussed above.
The R² in logistic regression in SPSS, which is used for the present study, is Nagelkerke R² (see Pampel 2000: 50; Menard 2002: 25). As R² values in general, Nagelkerke R² renders the percentage of explained variance in a decimal number: a value of 0.35 equals 35% explained variance.²¹ A Nagelkerke R² value of 0.2 to 0.4 represents a sufficient level of explained variance (Ludwig-Mayerhofer 2007: Pseudo-R2).

The influence of an independent variable on the dependent variable in logistic regression is the variable’s predictive power, since it influences the outcome of the prediction of syntactic variants. The increase of correctly predicted percentage of clauses as well as the explained variance of the logistic regression model depends on the strength of the identified predictors. The independent variables are therefore often referred to as predictors, or factors. While ordinal and ratio-scaled variables only appear as one predictor in logistic regression, categories of categorical variables are included in the model as individual predictors.

The analysis calculates whether an independent variable increases or decreases the probability of one value of the dependent variable, e.g., the probability that an infinitive clause is introduced by for to instead of to. Hence, an increase of one unit in the value of a ratio-scaled variable such as AGE could increase or decrease the probability that for to instead of to was selected. An association like this is not possible with categorical variables such as REGION, since they do not express an increase. Instead, they are transformed into dummy variables with a reference category for the analysis and their influence on the dependent variable is compared with the influence of the reference category. If the reference category of REGION is, e.g., Southeast England, the predictive effect of this region would not be calculated.
Instead, the analysis calculates whether an infinitive clause in the Scottish Lowlands, or in the English Midlands, etc. is more or less likely to be introduced by for to instead of to than in Southeast England (see, e.g., Menard 2002: 43–56). The selected reference is usually the regional data subset in which the distribution of the respective complement clause types is closest to their total distribution in percentages and based on their adjusted residuals. Reference categories can be chosen automatically (based on alphabetical order) or set manually. For the purpose of the present study it sometimes seemed most appropriate to select the reference categories manually.

The predictive power of a model in logistic regression is indicated by the odds ratios of the independent variables, expressed by exp(b) in SPSS (Menard 2002: 56). Predicting the value of the dependent variable is related to betting, which is where the calculation of odds is also quite frequent. For instance, when a football team plays at home this is generally considered an effect that increases the odds of this team winning the game. An odds ratio of 1 indicates that the independent variable has no effect on the dependent variable; an odds ratio that is smaller than 1 means that the independent variable decreases the odds of the selected variant of the dependent variable; an odds ratio greater than 1 shows that the independent variable increases the odds of the selected variant. Other than that, their interpretation is, however, not intuitive. Odds cannot be negative, so the lower limit of an odds ratio is 0, but they do not have an upper limit. This illustrates that they cannot be interpreted as probabilities, which can generally be expressed by percentages from 0 to 100%. An odds ratio of 20 for for to, in, e.g., REGION(Northern England) also does not indicate that for to is twenty times more likely in Northern England than in the reference category, e.g., the Isle of Man.
It means that the ratio of the odds that for to occurs in the data from Northern England to the odds that for to occurs in the Isle of Man is 20:1 (Pampel 2000: 21–23). Odds ratios are interpreted as factors in the evaluation of the analysis (the odds of for to are increased by a factor of 20 if the clause occurs in the data from Northern England).

The basic procedure in logistic regression is computing the influence of the independent variables (e.g., REGION, AGE, SEX) on the dependent variable (e.g., FORTO). However, the overall statistical power of the model and its overall outcome often vary considerably depending on which independent variables are chosen. The logistic regression models used for the analysis of embedded inversion and the complementizer as are based on stepwise forward regression, in which the software identifies the best model by starting with the inclusion of one variable and then adding further variables (one in each step) until the inclusion of no other variable results in a statistically significant increase in the fit of the model (Menard 2002: 63–67). What Menard (2002: 66) regards as this procedure’s disadvantage, “[t]his is a search for predictors, not a convincing test of any theory”, is an advantage for the present study. The analyses do not aim to test the theory that syntactic variation is regionally determined; they aim to identify the predictors of syntactic variation and to determine whether region is one of these predictive independent variables and, if so, in which regions. The present study is thus the kind of exploratory research for which stepwise regression is appropriate (see Menard 2002: 63).

In the analysis of for to clauses, however, this procedure results in models that have a low percentage of explained variance (Nagelkerke R2 ≈10), because it eliminates those variables from the model that do not have a significant effect.
As this can sometimes be considered too harsh a decision (see Jaccard 2001: 66–67), variables and interactions were selected manually. Nevertheless, a model with a good fit could not be obtained.

The interaction between two variables can be an important predictor in logistic regression. For instance, the general effect of AGE on FORTO might be such that an increase in age also increases the odds of the choice of for to instead of to. In the data subsets from individual regions, however, the effect of AGE might be different. An increase in age could, e.g., lead to a decrease of the odds of for to in the data from Southwest England, but to an increase of the odds of for to in the data from the Scottish Highlands and Islands, and could have no effect on the odds of for to in the data from Wales (as opposed to the reference category) (Jaccard 2001: 18–37). These interaction effects are indicated by an asterisk between the variables whose interaction is computed. The overall interaction is shown by AGE*REGION, and categories of categorical variables are enclosed in parentheses.

Table 43 exemplifies how the effects of independent variables are represented: it displays the independent variables included in a model, or interactions between them (“factors”), first by stating whether the variable or interaction as a whole has a significant effect (here: AGE*REGION, no OR, p = 0.000) and then by listing the odds ratios (exp[b]) for all values (or interactions of values) of nominal or ordinal variables and their significance levels (in column “p”). Significant values are set in bold print.
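
How an interaction term modifies a main effect can be made concrete with a little arithmetic. The following is a minimal sketch, assuming indicator (dummy) coding of REGION against a reference category and a continuous AGE variable measured in years; under these assumptions the odds ratio of AGE within a region is the product of the main-effect odds ratio and the interaction odds ratio. The function names, the region labels and all numerical values are hypothetical illustrations, not results of the present study.

```python
import math


def age_effect_in_region(or_age, or_interaction=1.0, years=1):
    """Odds ratio for `years` additional years of age within one region.

    or_age:         exp(b) of the AGE main effect (its effect in the reference category)
    or_interaction: exp(b) of AGE*REGION(x); 1.0 for the reference category
    """
    return (or_age * or_interaction) ** years


def direction(odds_ratio):
    """Describe in words what an odds ratio does to the odds."""
    if math.isclose(odds_ratio, 1.0):
        return "does not change"
    return "raises" if odds_ratio > 1 else "lowers"


# Hypothetical main effect: each year of age raises the odds of for to by 2%.
OR_AGE = 1.02

# Hypothetical interaction odds ratios mirroring the three cases in the text.
interaction_ors = {
    "reference category": 1.0,  # no interaction term
    "region A": 0.95,           # interaction reverses the general effect
    "region B": 1.05,           # interaction strengthens the general effect
    "region C": 1 / OR_AGE,     # interaction cancels the general effect
}

for region, or_int in interaction_ors.items():
    combined = age_effect_in_region(OR_AGE, or_int)
    print(f"{region}: odds ratio per year = {combined:.3f} "
          f"({direction(combined)} the odds of for to)")
```

Because the combined effect is a product, an interaction odds ratio below 1 can turn a generally positive age effect into a negative one in a single region, which is the situation described above.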

Table 43. Examples of predictive effects of different regions and their interaction with age; significant values in bold

Factor                                      exp(b) (OR)   p
REGION(Scott. High. & Islds.)               0.209         0.008
REGION(Isle of Man)                         9.1E+008      1.000
REGION(Scott. Lowlands)                     0.615         0.355
REGION(English Midlands)                    1.202         0.605
REGION(Northern England)                    2.095         0.024
AGE*REGION                                  –             0.000
AGE*REGION(Scottish Highlands & Islands)    0.607         1014
AGE*REGION(Scottish Lowlands)               0.996         0.459
AGE*REGION(Midlands)                        1.054         0.009
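
Why an odds ratio cannot be read as "so many times more likely" can be checked with a short calculation. The sketch below converts a probability in the reference category into odds, applies an odds ratio, and converts the result back into a probability. The baseline probabilities are hypothetical (the appendix does not report them); the odds ratios 20 and 2.095 are taken from the discussion above and from Table 43. The function names are illustrative only.

```python
def odds(p):
    """Convert a probability (0 <= p < 1) into odds."""
    return p / (1.0 - p)


def probability(o):
    """Convert odds (o >= 0) back into a probability."""
    return o / (1.0 + o)


def apply_odds_ratio(p_reference, odds_ratio):
    """Probability in a category whose odds are `odds_ratio` times the reference odds."""
    return probability(odds(p_reference) * odds_ratio)


# Hypothetical probabilities of for to in the reference category.
for p_ref in (0.05, 0.50):
    for odds_ratio in (20, 2.095):
        p_new = apply_odds_ratio(p_ref, odds_ratio)
        print(f"p_ref = {p_ref:.2f}, OR = {odds_ratio}: p = {p_new:.3f}, "
              f"i.e. {p_new / p_ref:.2f} times as likely")
```

With an odds ratio of 20, a baseline probability of 5% rises to about 51% (roughly ten times as likely), whereas a baseline of 50% rises to about 95% (less than twice as likely). The odds, by contrast, are multiplied by 20 in both cases, which is why odds ratios are read as factors on the odds and not on the probability.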

Notes

1. Author-date citations are used instead of alphabetisms to refer to grammars in order to avoid confusion: both A Comprehensive Grammar of the English Language (Quirk et al. 1985) and The Cambridge Grammar of the English Language (Huddleston and Pullum 2002) would become CGEL, differing only slightly from CGE for Cambridge Grammar of English (Carter and McCarthy 2006). Only Biber et al.’s Longman Grammar of Spoken and Written English would be easily identifiable by LGSWE.
2. In section 5, the analysis of (for) to is extended beyond complement clauses.
3. This concerns the ELN (East Lothian) and MLN (Midlothian) files, which are parts of the ECOSSE corpus (see the relevant FRED files and Miller 2004: 47).
4. Wun’t was retained because it cannot be identified definitely as either wouldn’t or won’t (Benedikt Szmrecsanyi: personal communication).
5. Guess is frequent only in American English (Biber et al. 1999: 668).
6. Pre-predicate wh-clauses or elements are not mentioned in Quirk et al. (1985), Huddleston and Pullum (2002) or Biber et al. (1999).
7. An example of a pre-predicate that clause in sports news is That they [Aston Villa] are already struggling clearly troubles Graham Taylor. (BNC-Baby, News, a1n).
8. This is compounded by the fact that in the FRED user guide the words indirect and direct are confused: “no quote marks are used in FRED. Indirect speech is indicated by a comma followed by a capital letter” (Hernández 2006: 10; personal communication).
9. All logistic regression analyses in the present study are binary logistic regression analyses, in which the influence of independent variables on a binary dependent variable (e.g., FORTO) is calculated. This calculation creates a model and there are several measures which evaluate the quality of the model. First of all, the model has to be overall statistically significant, i.e., the probability that the identified distribution of the variants results from chance alone has to be 5% or less (in technical terms: model chi2 p ≤ 0.05) (Pampel 2000: 30–31; Menard 2002: 20–21). As all logistic regression models presented in the present study are statistically significant, the model chi2 values are not reported in the analyses. Another statistical measure that assesses the “goodness of fit” of a logistic regression model is the Hosmer-Lemeshow statistic, which should not be significant (Ryan 1997: 278–281). All logistic regression models included in the analyses below have non-significant Hosmer-Lemeshow results. In addition, the independent variables included in the logistic regression models of the present study are always tested for collinearity, so that no collinear variables are included in the analyses.
10. She actually refers to the unpublished 1992 manuscript (see McCloskey 2006: 126), which is the “ancestor” of the 2006 paper (McCloskey 2006: 89). The association of embedded inversion and adverbial adjuncts is maintained in McCloskey (2006: 92–97).
11. The third person singular present form without -s is a regular morphological feature in Suffolk, East Anglia (see Trudgill 2004), but she / he say occurs in all regions represented in FRED except Wales and the Isle of Man.
12. For an account of the interrelation between frequency and “bondedness” of forms see Krug (2000: 176–177). Jaeger (2006: 88–89) also stresses the influence of the frequency of the matrix verb on the choice of complementizer.
13. Even at a confidence level of 99.9%, the difference remains significant.
14. According to Quirk et al. (1985: 1061), however, for is “generally” absent if the clause is a direct object, as in (82).
15. Further subordinators of to-infinitive clauses are as if, as though, in order, so as, whether (. . . or), with and without (Quirk et al. 1985: 1004).
16. There is no clear example of this phenomenon in either the NITCS or FRED.
The only instance, it was all in order for me to do it (FRED, SAL_018), is rather to be taken to mean ‘it was okay for me to do it’. The sentence follows a description of working conditions which were rather difficult since the speaker was too young at that time and continues with everyone else was doing it the same you see in those days.
17. In her study, however, adverbial clauses are treated as a subgroup of complement clauses.
18. This is not an actual utterance, but one of the examples “checked against actual speech data” (Henry 1992: 282). As generative grammar is more theory-oriented than functional approaches, using actual speech data as examples is not of central interest. Competence rather than actual performance is important for theory building.
19. Most probably this is Isolated “Southeastern” American English, but this is not a term used for a variety explicitly discussed in either volume of the handbook, or explained on the CD-Rom, nor is “Se” listed as an abbreviation in Kortmann et al. (2004: xvi-xvii).
20. The restrictions tried in order to reduce the data were limiting them to certain types of clauses, to speakers older than 59 years of age, or to infinitives of get, and excluding those data sets in which for to does not occur as a variant of to, i.e., the Isle of Man and Wales. Only the inclusion of the variable MATRIX caused an increase of correctly predicted cases (to 99.1%), but only in a non-significant model. None of the matrices had a significant effect; all p-values are 1.0 or 0.999, meaning that with the same matrices there would be a very good chance to predict the use of to or for to, but it is unlikely that a second sample would contain the same matrices. Even with the next most frequent verbs occurring with for to – keep and go – the percentage of for to in infinitive markers could not be increased.
21. In contrast to other R2 measures for logistic regression (e.g., Cox and Snell R2), Nagelkerke R2 is rescaled so that it has a maximum of 1 (100% explained variance). Like these measures, it only approximates the explained variance of linear regression and is therefore referred to as a pseudo-R2 measure.

References

Anderwald, Lieselotte. 2004. The varieties of English spoken in the Southeast of England: Morphology and syntax. In: Bernd Kortmann and Edgar W. Schneider (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 175–195. Berlin / New York: Mouton de Gruyter.
Auer, Peter. 2004. Non-standard evidence in syntactic typology – Methodological remarks on the use of dialect data vs spoken language data. In: Bernd Kortmann (ed.), Dialectology meets typology: Dialect grammar from a cross-linguistic perspective, 69–92. Berlin / New York: Mouton de Gruyter.
Baker, Carl Lee. 1970. Notes on the description of English questions: The role of an abstract question morpheme. Foundations of Language 6: 197–219.
Baković, Eric. 1998. Optimality and inversion in Spanish. In: Pilar Barbosa, Danny Fox, Paul Hagstrom, Martha McGinnis, and David Pesetsky (eds.), Is the Best Good Enough? Optimality and Competition in Syntax, 35–58. Cambridge, MA: MIT Press.
Barry, Michael V. 1981. The methodology of the tape-recorded survey of Hiberno-English speech. In: Michael V. Barry (ed.), Aspects of English Dialects in Ireland, vol. 1: Papers arising from the tape-recorded survey of Hiberno-English speech, 18–46. Belfast: Institute of Irish Studies, Queen’s University of Belfast.
Barry, Michael V. 1984. Manx English. In: Peter Trudgill (ed.), Language in the British Isles, 167–177. Cambridge: Cambridge University Press.
Bayley, Robert and Otto Santa Ana. 2004. Chicano English: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 374–390. Berlin / New York: Mouton de Gruyter.
Beal, Joan. 1993. The grammar of Tyneside and Northumbrian English. In: James Milroy and Lesley Milroy (eds.), Real English: The Grammar of English Dialects in the British Isles, 187–213. London / New York: Longman.
Beal, Joan. 2004. English dialects in the North of England: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 114–141. Berlin / New York: Mouton de Gruyter.
Biber, Douglas, Susan Conrad, and Geoffrey Leech. 1999. Longman Grammar of Spoken and Written English. Harlow: Longman.
Bickerton, Derek. 1981. Roots of Language. Ann Arbor: Karoma.
Bonaparte, Prince Louis Lucien. 1875. On the dialects of Monmouthshire, Herefordshire, , Gloucestershire, Berkshire, Oxfordshire, South Warwickshire, South Northamptonshire, Buckinghamshire, Hertfordshire, Middlesex, and Surrey, with a new classification of the English dialects. Transactions of the Philological Society 570–581.
Burnard, Lou (ed.). 2000. Reference guide to the British National Corpus (World Edition). http://www.natcorp.ox.ac.uk/archive/worldURG/urg.pdf.

Byrman, Gunilla and Britta Holm. 1992. Svenska utifran: Schemagrammatik – Svenska Strukturer och Vardagsfraser. Uddevalla: Svenska institutet.
Campion, G. Edward. 1976. Lincolnshire Dialects. Boston: Guardian Press.
Carroll, Susanne. 1983. Remarks on FOR-TO infinitives. Linguistic Analysis 12: 415–451.
Carter, Ronald and Michael McCarthy. 2006. Cambridge Grammar of English. Cambridge: Cambridge University Press.
Catford, J. C. 1957. The linguistic survey of Scotland. Orbis 6: 105–121.
Chambers, J. K. and Peter Trudgill. 1980. Dialectology. Cambridge: Cambridge University Press.
Chambers, J. K. and Peter Trudgill. 1998. Dialectology. Cambridge: Cambridge University Press, 2nd edn.
Cheshire, Jenny. 1996. Syntactic variation and the concept of prominence. In: Juhani Klemola, Merja Kytö, and Matti Rissanen (eds.), Speech Past and Present: Studies in English Dialectology in Memory of Ossi Ihalainen, 1–17. Frankfurt am Main: Peter Lang.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht / Cinnaminson: Foris.
Chomsky, Noam. 1986. Barriers. Cambridge, MA: The MIT Press.
Chomsky, Noam. 1992. A Minimalist Program for Linguistic Theory. Cambridge, MA: MIT Press.
Chomsky, Noam and Howard Lasnik. 1977. Filters and control. Linguistic Inquiry 8: 425–504.
Clark, Urszula. 2004. The English West Midlands: Phonology. In: Edgar W. Schneider, Kate Burridge, Bernd Kortmann, Rajend Mesthrie, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 1: Phonology, 134–162. Berlin / New York: Mouton de Gruyter.
Clarke, Sandra. 2004. Newfoundland English: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 303–318. Berlin / New York: Mouton de Gruyter.
Coates, Jennifer. 1986. Women, Men and Language: A Sociolinguistic Account of Sex Differences in Language. London / New York: Longman.
Corrigan, Karen. 2003. For-to infinitives and beyond: Interdisciplinary approaches to non-finite complementation in a rural Celtic English. In: Hildegard L. C. Tristram (ed.), The Celtic Englishes III, 318–338. Heidelberg: C. Winter.
Crystal, David. 1997. A Dictionary of Linguistics and Phonetics. Oxford: Blackwell, 4th edn.
Curme, George O. 1931. A Grammar of the English Language, vol. 2: Syntax. Boston: Heath.
Darlington, Thomas. 1887. The Folk-Speech of South Cheshire. London: Trübner.

Davydova, Julia, Michaela Hilbert, Lukas Pietsch, and Peter Siemund. 2011. Comparing varieties of English: Problems and perspectives. In: Peter Siemund (ed.), Linguistic Universals and Language Variation, 291–316. Berlin: Mouton de Gruyter.
Dixon, Robert M. W. 1995. Complement clauses. In: Frank Robert Palmer (ed.), Grammar and Meaning: Essays in Honour of Sir John Lyons, 175–220. Cambridge: Cambridge University Press.
Duffley, Patrick J. 1992. The English Infinitive. London / New York: Longman.
Edwards, Viv. 1993. The grammar of Southern British English. In: James Milroy and Lesley Milroy (eds.), Real English: The Grammar of English Dialects in the British Isles, 214–238. London / New York: Longman.
Edwards, Viv and Bert Welten. 1985. Research on non-standard dialects of British English: Progress and prospect. In: Wolfgang Viereck (ed.), Focus on: England and Wales. Amsterdam / Philadelphia: John Benjamins.
Emonds, Joseph E. 1976. A Transformational Approach to English Syntax: Root, Structure-Preserving, and Local Transformations. New York: Academic Press.
Erdmann, Peter. 1979. Inversion im heutigen Englisch. Heidelberg: C. Winter.
Erdmann, Peter. 1997. The for ... to Construction in English. Frankfurt am Main: Peter Lang.
Filppula, Markku. 1994. From Anglo-Irish to Hiberno-English: Divergence and convergence in the Irish dialects of English. In: Wolfgang Viereck (ed.), Regional Variation, Colloquial and Standard languages: Proceedings of the International Congress of Dialectologists, 180–196. Stuttgart: Franz Steiner.
Filppula, Markku. 1999. The Grammar of Irish English: Language in Hibernian style. London / New York: Routledge.
Filppula, Markku. 2000. Inversion in embedded questions in some regional varieties of English. In: Ricardo Bermúdez-Otero, David Denison, Richard M. Hogg, and C.B. McCully (eds.), Generative Theory and Corpus Studies: A Dialogue from 10 ICEHL, 439–453. Berlin: Mouton de Gruyter.
Filppula, Markku. 2004. Irish English: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 73–101. Berlin / New York: Mouton de Gruyter.
Fludernik, Monika. 1993. The Fictions of Language and the Languages of Fiction: The Linguistic Representation of Speech and Consciousness. London / New York: Routledge.
Francis, W. Nelson. 1983. Dialectology: An Introduction. London / New York: Longman.
Freeborn, Dennis. 2006. From Old English to Standard English. Houndmills: Palgrave Macmillan, 3rd edn.
van Gelderen, Elly. 1998. The future of for to. American Journal of Germanic Linguistics and Literatures 10: 45–71.

van Gelderen, Elly. 2006. A History of the English Language. Amsterdam / Philadelphia: John Benjamins.
George, Ken. 1993. Cornish. In: Martin J. Ball and James Fife (eds.), The Celtic Languages, 410–468. London / New York: Routledge.
Gillies, William. 1993. Scottish Gaelic. In: Martin J. Ball and James Fife (eds.), The Celtic Languages, 145–227. London / New York: Routledge.
Gommer, Eva and Josef Huber. 1992. Prismas Tyska Ordbok. Stockholm: Rabén Prisma.
Granath, Solveig. 1997. Verb Complementation in English: Omission of Prepositions before THAT-clauses and TO-infinitives. Göteborg: Acta Universitatis Gothoburgensis.
Gries, Stefan Th. 2003. Grammatical variation in English: A question of ‘structure vs. function’? In: Günter Rohdenburg and Britta Mondorf (eds.), Determinants of Grammatical Variation in English, 155–173. Berlin / New York: Mouton de Gruyter.
Harris, John. 1984. English in the North of Ireland. In: Peter Trudgill (ed.), Language in the British Isles, 115–134. Cambridge: Cambridge University Press.
Harris, John. 1993. The grammar of Irish English. In: James Milroy and Lesley Milroy (eds.), Real English: The Grammar of English Dialects in the British Isles, 139–186. London / New York: Longman.
Hayden, Mary and Marcus Hartog. 1909. The Irish dialect of English. The Fortnightly Review 85: 774–785; 938–947.
Henry, Alison. 1992. Infinitives in a for-to dialect. Natural Language and Linguistic Theory 10: 279–301.
Henry, Alison. 1995. Belfast English and Standard English. New York / Oxford: Oxford University Press.
Hernández, Nuria. 2006. User’s guide to FRED: Freiburg Corpus of English Dialects.
Herrmann, Tanja. 2005. Relative clauses in English dialects of the British Isles. In: Bernd Kortmann, Tanja Herrmann, Lukas Pietsch, and Susanne Wagner (eds.), A Comparative Grammar of British English Dialects, vol. 1: Agreement, Gender, Relative clauses, 21–124. Mouton de Gruyter.
Hilbert, Michaela. 2008. Interrogative inversion in non-standard varieties of English. In: Peter Siemund and Noemi Kintana (eds.), Language Contact and Contact Languages, 261–289. Amsterdam: John Benjamins.
Howell, David C. 2003. Fundamental Statistics for the Behavioural Sciences. Belmont, CA: Brooks / Cole, 5th edn.
Huddleston, Rodney and Geoffrey K. Pullum. 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.
Hughes, Arthur and Peter Trudgill. 1996. English Accents and Dialects: An Introduction to Social and Regional Varieties of English in the British Isles. London / New York: Arnold.

Jaccard, James. 2001. Interaction Effects in Logistic Regression. Thousand Oaks, CA: Sage.
Jaeger, Tim Florian. 2006. Redundancy and syntactic reduction in spontaneous speech. Ph.D. thesis, Stanford University.
Jespersen, Otto. 1927. Modern English Grammar on Historical Principles: Part Three: Syntax. London: George Allen & Unwin.
Johnstone, Barbara. 1987. ‘He says ... so I said’: Verb tense alternation and narrative depictions of authority in American English. Linguistics 25: 33–52.
Jones, Val. 1985. Tyneside syntax: A presentation of some data from the Tyneside Linguistic Survey. In: Wolfgang Viereck (ed.), Focus on: England and Wales, 163–177. Amsterdam / Philadelphia: John Benjamins.
Kewley Draskau, Jennifer. to appear 2012. Manx English. In: Bernd Kortmann (ed.), World Atlas of Varieties of English. Berlin / New York: De Gruyter Mouton.
Kirk, John M. 1992. The Northern Ireland Transcribed Corpus of Speech. In: Gerhard Leitner (ed.), New Directions in English Language Corpora. Berlin / New York: Mouton de Gruyter.
Kolbe, Daniela. 2001. Embedded inversion in the North of the British Isles. Master’s thesis, Albert-Ludwigs-Universität.
Kolbe, Daniela. 2008. Complement clauses in British Englishes. Ph.D. thesis, Trier University, Trier. Unpublished.
Kolbe, Daniela. 2010. The semantic and grammatical overlap of as and that: Evidence from non-standard English. In: Ute Römer and Rainer Schulze (eds.), Exploring the Lexis-Grammar Interface. Amsterdam: John Benjamins.
Kolbe, Daniela and Andrea Sand. 2010. Embedded inversion worldwide. Linguaculture 2 (1): 25–42.
Kortmann, Bernd. 2004. Synopsis: morphological and syntactic variation in the British Isles. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 1089–1103. Berlin / New York: Mouton de Gruyter.
Kortmann, Bernd and Edgar W. Schneider (eds.). 2004. A Handbook of Varieties of English. Berlin / New York: Mouton de Gruyter.
Kortmann, Bernd and Benedikt Szmrecsanyi. 2004. Global synopsis: Morphological and syntactic variation in English. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 1142–1202. Berlin / New York: Mouton de Gruyter.
Kortmann, Bernd, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.). 2004. A Handbook of Varieties of English, vol. 2: Morphology and Syntax. Berlin / New York: Mouton de Gruyter.
Krug, Manfred. 1998. String frequency: A cognitive motivating factor in coalescence, language processing, and linguistic change. Journal of English Linguistics 26: 286–320.

Krug, Manfred. 2000. Emerging English Modals. Berlin / New York: Mouton de Gruyter.
Labov, William. 1972a. Language in the Inner City. Philadelphia: University of Pennsylvania Press.
Labov, William. 1972b. Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press.
Labov, William. 2001. Principles of Linguistic Change, vol. 2: Social factors. Malden, MA / Oxford: Blackwell.
Ludwig-Mayerhofer, Wolfgang. 2007. ILMES: Internet-Lexikon der Methoden der empirischen Sozialforschung. http://www.lrz-muenchen.de/wlm/ilmes.htm.
Macafee, Caroline. 1992. Characteristics of non-standard grammar in Scotland.
Macafee, Caroline I. and Colm Ó Baoill. 1997. Why Scots is not a Celtic English. In: Hildegard L. C. Tristram (ed.), The Celtic Englishes, 245–286. Heidelberg: C. Winter.
MacAulay, Donald. 1992a. The Celtic Languages: An overview. In: Donald MacAulay (ed.), The Celtic Languages, 1–8. Cambridge: Cambridge University Press.
MacAulay, Donald. 1992b. The Scottish Gaelic language. In: Donald MacAulay (ed.), The Celtic Languages, 137–248. Cambridge: Cambridge University Press.
Mair, Christian. 1990. Infinitival Complement Clauses in English: A Study of Syntax in Discourse. Cambridge: Cambridge University Press.
Mbangwana, Paul. 2004. Cameroon English: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 898–908. Berlin / New York: Mouton de Gruyter.
McArthur, Tom. 1992. The Oxford Companion to the English Language. London: BCA.
McCloskey, James. 1991. Clause structure, ellipsis and proper government in Irish. Lingua 85: 259–302.
McCloskey, James. 2006. Questions and questioning in a local English. In: Raffaella Zanuttini, Héctor Campos, Elena Herburger, and Paul H. Portner (eds.), Crosslinguistic Research in Syntax and Semantics: Negation, Tense, and Clausal Architecture, 87–126. Washington, DC: Georgetown University Press.
McDavid, Virginia Glenn and William Card. 1972. Problem areas in grammar. In: A. L. Davis (ed.), Culture, Class and Language Variety: A Resource Book for Teachers, 89–132. Chicago: Center for American English, Illinois Institute of Technology.
Melchers, Gunnel. 2004. English spoken in Orkney and Shetland: Morphology, syntax and lexicon. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 34–46. Berlin / New York: Mouton de Gruyter.

Menard, Scott. 2002. Applied Logistic Regression Analysis. Thousand Oaks, CA: Sage, 2nd edn.
Millar, Robert McColl. 2007. Northern and Insular Scots. Edinburgh: Edinburgh University Press.
Miller, Jim. 1993. The grammar of Scottish English. In: James Milroy and Lesley Milroy (eds.), Real English: The Grammar of English Dialects in the British Isles, 99–138. London / New York: Longman.
Miller, Jim. 2004. Scottish English: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 47–72. Berlin / New York: Mouton de Gruyter.
Miller, Jim and Regina Weinert. 1998. Spontaneous Spoken Language: Syntax and Discourse. Oxford: Clarendon.
Milroy, James. 1981. Regional Accents of English: Belfast. Belfast: Blackstaff Press.
Milroy, James and Lesley Milroy (eds.). 1993. Real English: The grammar of English dialects in the British Isles. London: Longman.
Mondorf, Britta. 2002. Gender differences in English syntax. Journal of English Linguistics 30: 158–180.
Montgomery, Michael. 2006. The morphology and syntax of Ulster Scots. English World-Wide 27: 295–329.
Montgomery, Michael B. 2004. Appalachian English: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 245–281. Berlin / New York: Mouton de Gruyter.
Murray, Thomas E. and Beth Lee Simon. 2004. Colloquial American English: Grammatical features. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 221–244. Berlin / New York: Mouton de Gruyter.
Noonan, Michael. 1985. Complementation. In: Timothy Shopen (ed.), Language Typology and Syntactic Description, vol. 2: Complex constructions, 42–140. Cambridge: Cambridge University Press.
Ó Siadhail, Micheal. 1989. Modern Irish: Grammatical Structure and Dialectal variation. Cambridge: Cambridge University Press.
OED Online, CUP. 2006. http://www.oed.com/.
Ohlander, Sölve. 1986. Question-orientation versus answer-orientation in English interrogative clauses. In: Dieter Kastovsky and Aleksander Szwedek (eds.), Linguistics Across Historical and Geographical Boundaries, 963–982. Berlin / New York: Mouton de Gruyter.
Oral History Society, OHS. 2007. Practical Advice: Getting started. http://www.ohs.org.uk/advice/.
Pampel, Fred C. 2000. Logistic Regression: A Primer. Thousand Oaks, CA: Sage.

Parry, David. 1999. A Grammar and Glossary of the Conservative Anglo-Welsh Dialects of Rural Wales. Sheffield: The National Centre for English Cultural Tradition.
Peitsara, Kirsti. 1996. Studies on the structure of the Suffolk dialect. In: Juhani Klemola, Merja Kytö, and Matti Rissanen (eds.), Speech Past and Present: Studies in English Dialectology in Memory of Ossi Ihalainen, 284–307. Frankfurt am Main: Peter Lang.
Penhallurick, Robert. 1991. The Anglo-Welsh Dialects of North Wales. Frankfurt am Main: Peter Lang.
Penhallurick, Robert. 2004. Welsh English: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 102–113. Berlin / New York: Mouton de Gruyter.
Quirk, Randolph and C. L. Wrenn. 1957. An Old English Grammar. London: Routledge.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik. 1985. A Comprehensive Grammar of the English Language. London: Longman.
Rizzi, Luigi and Ian G. Roberts. 1989. Complex inversion in French. Probus 1: 1–30.
Rohdenburg, Günter and Britta Mondorf (eds.). 2003. Determinants of Grammatical Variation in English. Berlin / New York: Mouton de Gruyter.
Rosenbaum, Peter S. 1967. The Grammar of English Predicate Complement Constructions. Cambridge, MA: The MIT Press.
Rudanko, Juhani. 1998. Change and Continuity in the English Language: Studies on Complementation over the Past Three Hundred Years. Lanham / New York / Oxford: University Press of America.
Rudanko, Juhani. 1999. Diachronic Studies of English Complementation Patterns: 18th Century Evidence in Tracing the Development of Verbs and Adjectives Selecting Prepositions and Complement Clauses. Lanham, NY: University Press of America.
Rudanko, Juhani. 2005. Watching English Grammar Change: A Case Study on Complement Selection in British and American English. English Language and Linguistics 10: 31–48.
Rudanko, Juhani and Lea Luodes. 2005. Corpus-Based Studies on Prepositions and Complement Clauses in British and American English. Lanham, Md: University Press of America.
Ryan, Thomas. 1997. Modern Regression Methods. New York: Wiley.
Sabban, Annette. 1982. Gälisch-Englischer Sprachkontakt: Zur Variabilität des Englischen im gälischsprachigen Gebiet Schottlands. Eine empirische Studie. Heidelberg: Julius Groos Verlag.
Sand, Andrea. 2005. Angloversals? Shared morpho-syntactic features in contact varieties of English. Ph.D. thesis, Albert-Ludwigs-Universität Freiburg.
Schiffrin, Deborah. 1981. Tense variation in narrative. Language 57: 45–62.

Schiffrin, Deborah. 1987. Discourse Markers. Cambridge: Cambridge University Press.
Schneider, Edgar W. 2004. Synopsis: Morphological and syntactic variation in the Americas and the Caribbean. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 1104–1115. Berlin / New York: Mouton de Gruyter.
Schourup, Lawrence. 1999. Discourse markers. Lingua 107: 227–265.
Sebba, Mark. 2004. British Creole: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 196–208. Berlin / New York: Mouton de Gruyter.
Shorrocks, Graham. 1999. A Grammar of the Dialect of the Bolton Area. Frankfurt am Main: Peter Lang.
Shuken, Cynthia. 1984. Highland and Island English. In: Peter Trudgill (ed.), Language in the British Isles, 152–166. Cambridge: Cambridge University Press.
Stefanowitsch, Anatol. 2006. Negative evidence and the raw frequency fallacy. Corpus Linguistics and Linguistic Theory 2: 61–77.
Stockwell, Robert P. 1984. On the history of the verb-second rule in English. In: Jacek Fisiak (ed.), Historical Syntax, 575–592. Berlin / New York: Mouton.
Szmrecsanyi, Benedikt. 2006. Morphosyntactic Persistence in Spoken English. Berlin / New York: Mouton de Gruyter.
Szmrecsanyi, Benedikt. 2008. Corpus-based Dialectometry: Aggregate Morphosyntactic Variability in British English Dialects. International Journal of Humanities and Arts Computing 2 (1-2): 279–296. http://www.euppublishing.com/doi/abs/10.3366/E1753854809000433.
Szmrecsanyi, Benedikt. 2011. Corpus-based dialectometry: a methodological sketch. Corpora 6 (1): 45–76. http://www.euppublishing.com/doi/abs/10.3366/cor.2011.0004.
Szmrecsanyi, Benedikt and Bernd Kortmann. 2009. The morphosyntax of varieties of English worldwide: A quantitative perspective. Lingua 119 (11): 1643–1663. Special issue “The Forests behind the Trees”, ed. by John Nerbonne & Franz Manni.
Tagliamonte, Sali and Jennifer Smith. 2005. No Momentary Fancy! The Zero ‘Complementizer’ in English Dialects. English Language and Linguistics 9: 289–309.
Tagliamonte, Sali, Jennifer Smith, and Helen Lawrence. 2005. English dialects in the British Isles in cross-variety perspective: A base-line for future research. In: Markku Filppula, Juhani Klemola, Marjatta Palander, and Esa Penttilä (eds.), Dialects across Borders: Selected Papers from the 11th International Conference on Methods in Dialectology (Methods XI), Joensuu, August 2002, 87–117. Amsterdam / Philadelphia: John Benjamins.

Tannen, Deborah. 1989. Talking Voices: Repetition, Dialogue, and Imagery in Conversational Discourse. Cambridge: Cambridge University Press.
Thomas, Alan R. 1984. Welsh English. In: Peter Trudgill (ed.), Language in the British Isles, 178–194. Cambridge: Cambridge University Press.
Thomas, Alan R. 1992a. The Cornish language. In: Donald MacAulay (ed.), The Celtic Languages, 346–370. Cambridge: Cambridge University Press.
Thomas, Alan R. 1992b. The Welsh language. In: Donald MacAulay (ed.), The Celtic Languages, 251–345. Cambridge: Cambridge University Press.
Thomas, Alan R. 1994. English in Wales. In: Robert W. Burchfield (ed.), The Cambridge History of the English Language, vol. 5, 94–147. Cambridge: Cambridge University Press.
Thomas, Alan R. 1997. The Welshness of Welsh English: A survey paper. In: Hildegard L. C. Tristram (ed.), The Celtic Englishes, 55–85. Heidelberg: C. Winter.
Thomason, Sarah Grey and Terence Kaufman. 1988. Language Contact, Creolization and Genetic Linguistics. Berkeley: University of California Press.
Thomson, Robert L. 1992. The Manx language. In: Donald MacAulay (ed.), The Celtic Languages, 100–136. Cambridge: Cambridge University Press.
Timm, Erika. 2005. Historische jiddische Semantik. Tübingen: Max Niemeyer.
Trask, Robert Lawrence. 1993. A Dictionary of Grammatical Terms in Linguistics. London / New York: Routledge.
Tristram, Hildegard L. C. 1997. Introduction. In: Hildegard L. C. Tristram (ed.), The Celtic Englishes, 1–26. Heidelberg: C. Winter.
Trotta, Joe. 1998. Wh-clauses in English: Aspects of theory and description. Ph.D. thesis, Göteborg Universitet.
Trudgill, Peter (ed.). 1984. Language in the British Isles. Cambridge: Cambridge University Press.
Trudgill, Peter. 2000. The Dialects of England. Malden, MA / Oxford: Blackwell, 2nd edn.
Trudgill, Peter. 2004. The dialect of East Anglia: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 142–153. Berlin / New York: Mouton de Gruyter.
Upton, Clive, David Parry, and J.D.A. Widdowson. 1994. Survey of English Dialects: The Dictionary and Grammar. London: Routledge.
Van Hamel, A. G. 1912. On Anglo-Irish syntax. Englische Studien 45: 272–292.
Van Valin, Robert D. 2001. An Introduction to Syntax. Cambridge: Cambridge University Press.
Visser, Fredericus Th. 1966. An Historical Syntax of the English Language, Part Two: Syntactical Units with One Verb (Continued). Leiden: Brill.
Wagner, Susanne. 2000. Depends how long you want for it to take: For/to clauses in present-day spoken British English. Arbeiten aus Anglistik und Amerikanistik 25: 191–211.

Wagner, Susanne. 2004. English dialects in the Southwest: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 154–174. Berlin / New York: Mouton de Gruyter.
Wakelin, Martyn F. 1972. English Dialects: An Introduction. London: Athlone.
Wakelin, Martyn F. 1984. Cornish English. In: Peter Trudgill (ed.), Language in the British Isles, 195–198. Cambridge: Cambridge University Press.
Wakelin, Martyn F. 1986. The Southwest of England. Amsterdam / Philadelphia: John Benjamins.
Weissberg, Josef. 1988. Jiddisch: Eine Einführung. Bern: Peter Lang.
Wolfram, Walt. 2004. Rural and ethnic varieties in the Southeast: Morphology and syntax. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider, and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 281–302. Berlin / New York: Mouton de Gruyter.
Woods, Anthony, Paul Fletcher, and Arthur Hughes. 1986. Statistics in Language Studies. Cambridge: Cambridge University Press.
Zoefel, Peter. 2002. Statistik verstehen. München: Addison-Wesley.
