The Empirical Base of Linguistics: Grammaticality Judgments and Linguistic Methodology

UCLA

UCLA Previously Published Works

Title

The empirical base of linguistics: Grammaticality judgments and linguistic methodology

Permalink

https://escholarship.org/uc/item/05b2s4wg

ISBN

978-3946234043

Author

Schütze, Carson T

Publication Date

2016-02-01

DOI

10.17169/langsci.b89.101

Data Availability

The data associated with this publication are managed by: Language Science Press, Berlin

Peer reviewed


The empirical base of linguistics

Grammaticality judgments and linguistic methodology

Carson T. Schütze

Language Science Press

Classics in Linguistics 2

Classics in Linguistics

Chief Editors: Martin Haspelmath, Stefan Müller

In this series:
1. Lehmann, Christian. Thoughts on grammaticalization
2. Schütze, Carson T. The empirical base of linguistics: Grammaticality judgments and linguistic methodology
3. Bickerton, Derek. Roots of language

ISSN: 2366-374X


Carson T. Schütze. 2019. The empirical base of linguistics: Grammaticality judgments and linguistic methodology (Classics in Linguistics 2). Berlin: Language Science Press.

This title can be downloaded at: http://langsci-press.org/catalog/book/89
© 2019, Carson T. Schütze
Published under the Creative Commons Attribution 4.0 Licence (CC BY 4.0): http://creativecommons.org/licenses/by/4.0/
ISBN: 978-3-946234-02-9 (Digital)
978-3-946234-03-6 (Hardcover)
978-3-946234-04-3 (Softcover)
978-1-523743-32-2 (Softcover US)
ISSN: 2366-374X
DOI: 10.5281/zenodo.3362734

Cover and concept of design: Ulrike Harbort
Typesetting: Felix Kopecky, Sebastian Nordhoff, Carson T. Schütze
Fonts: Linux Libertine, Arimo, DejaVu Sans Mono
Typesetting software: XeLaTeX

Language Science Press
Habelschwerdter Allee 45
14195 Berlin, Germany
langsci-press.org

Storage and cataloguing done by FU Berlin

Language Science Press has no responsibility for the persistence or accuracy of URLs for external or third-party Internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

For my mother, Dorly Schütze, and the memory of my father, Ted Schütze

It is simultaneously the greatest virtue and failing of linguistic theory that sequence acceptability judgments are used as the basic data.

(Bever 1970b)

Contents

Preface (2016)
Preface (1996)
Acknowledgments (1996)

1 Introduction
  1.1 Goals
  1.2 Approach
  1.3 Motivation: Whither Linguistics?
  1.4 A Working Hypothesis
  1.5 Scope and Organization

2 Definitions and Historical Background
  2.1 Introduction
  2.2 A Short History of Grammaticality
  2.3 The Use of Judgment Data in Linguistic Theory
    2.3.1 Introduction
    2.3.2 The Dangers of Unsystematic Data Collection
    2.3.3 A Case Study in the Use of Subtle Judgments
    2.3.4 The Interpretation of the Annotations and Degrees of Badness
  2.4 Introspection, Intuition, and Judgment
  2.5 Conclusion

3 Judging Grammaticality
  3.1 Introduction
  3.2 Tasks that Access Grammaticality
  3.3 The Nature of Graded Judgments
    3.3.1 Is Grammaticality Dichotomous?
    3.3.2 Experiments on Chomsky’s Three Levels of Deviance
    3.3.3 Other Experiments
    3.3.4 Ratings, Rankings, and Consistency
  3.4 The Judgment Process
  3.5 The Interpretation of Judgments with Respect to Competence
  3.6 Conclusion

4 Subject-Related Factors in Grammaticality Judgments
  4.1 Introduction
  4.2 Individual Differences: Three Representative Studies
  4.3 Organismic Factors
    4.3.1 Field Dependence
    4.3.2 Handedness
    4.3.3 Other Organismic Factors
  4.4 Experiential Factors
    4.4.1 Linguistic Training
    4.4.2 Literacy and Education
    4.4.3 Other Experiential Factors
  4.5 Conclusion

5 Task-Related Factors in Grammaticality Judgments
  5.1 Introduction
  5.2 Procedural Factors
    5.2.1 Instructions
    5.2.2 Order of Presentation
    5.2.3 Repetition
    5.2.4 Mental State
    5.2.5 Judgment Strategy
    5.2.6 Modality and Register
    5.2.7 Speed of Judgment
  5.3 Stimulus Factors
    5.3.1 Context
    5.3.2 Meaning
    5.3.3 Parsability
    5.3.4 Frequency
    5.3.5 Lexical Content
    5.3.6 Morphology and Spelling
    5.3.7 Rhetorical Structure
  5.4 Conclusion

6 Theoretical and Methodological Implications
  6.1 Introduction
  6.2 Modeling Grammaticality Judgments
    6.2.1 Previous Work
    6.2.2 The Outlines of a Preliminary Model
    6.2.3 Applications of the Model
  6.3 Methodological Proposals
    6.3.1 Materials
    6.3.2 Procedure
    6.3.3 Analysis and Interpretation of Results
  6.4 Conclusion

7 Looking Back and Looking Ahead
  7.1 Introduction
  7.2 Directions for Further Research
  7.3 The Future in Linguistics

References

Indexes
  Name Index
  Subject Index

Preface (2016)

Since the original version of this book (University of Chicago Press, 1996) went out of print in the 2000s, I have continued to receive inquiries from people asking how they can obtain a copy. I am therefore thrilled that Language Science Press has offered to make the title available again, as part of their Classics in Linguistics series. I would like to thank series editors Stefan Müller and Martin Haspelmath, as well as Sebastian Nordhoff and Felix Kopecky, for their help in making this happen.
The content of this new printing is identical to the first printing, with the following exceptions:

• I have altered the wording in a few places where I found it insufficiently clear or terminologically outdated;

• my uses of the term informant(s) have been replaced with consultant(s) or speaker(s), in keeping with current practice (of course, the former term still appears in some quoted passages);

• I have updated the reference information for a couple of works that had not been published at the time of the original printing, particularly Cowart (1997);

• the original index has been split into name and subject indexes, and both are now more comprehensive.

In terms of presentation, the following things have changed:
• the format of citations and references has been adapted to LangSci house style, as have other minor typographical choices;

• full given names have been added to references whenever available;

• since the text has been freshly typeset, the page numbers do not match those of the original printing; however, the (sub)section numbers are unchanged: I suggest using those if it is necessary to specify a location within a chapter. Example numbers are also unchanged.
Importantly, I have not attempted to update the content in light of subsequent relevant research, since this would undoubtedly have compelled me to try to write a whole new book. Of course, linguistics and psycholinguistics have changed a great deal in the 20 years since I completed the original manuscript; e.g., “theoretical” linguistics has notably become more “experimental.” Also, some of my own views on the issues have evolved over those two decades. There are passages in the book that I would have omitted or altered, if I had allowed myself to make any substantive revisions. Instead, I have chosen to restrict all followup discussion to this preface. In what follows I try to point readers to works that should allow them to “get up to speed” on intervening developments.
For collections that are comprised mainly of papers on topics that are important in the book, see McNair et al. (1996), Penke & Rosenbach (2004), Kepser & Reis (2005), Borsley (2005), Featherston (2007) and replies in the same journal issue, Featherston & Sternefeld (2007), Featherston & Winkler (2009), and Winkler & Featherston (2009). My more recent views can be found in the following surveys: Schütze (2006; 2011) and Schütze & Sprouse (2013).
There have been (at least) four major developments involving the empirical base of linguistics that anyone interested in the topic should be aware of.

1. The adaptation of the magnitude estimation task from psychophysics to judgment collection (Bard, Robertson & Sorace 1996). This was touted as having numerous potential advantages over the traditional Likert scale task, most or all of which have been subsequently refuted (see Weskott & Fanselow 2011 and Sprouse, Schütze & Almeida 2013).

2. The use of World Wide Web searches to establish attestation, and infer acceptability, of certain sentence/construction types. I discuss the limitations of this approach in Schütze (2009).

3. The use of Amazon Mechanical Turk (AMT) and potentially other crowdsourcing platforms as sources of subjects for acceptability judgment and many other psycholinguistic experiments (so far, in only a handful of languages). For an empirical investigation of how AMT results compare with judgments collected in the lab (on a small range of constructions in English), see Sprouse (2011).

4. Detailed empirical challenges to – and defenses of – the proposal, advocated in Section 7.2 in the book, that Subjacency effects could be reduced to processing factors. See Yoshida et al. (2014) as well as the Stanford/Maryland debate (Hofmeister & Sag 2010; Hofmeister, Staum Casasanto & Sag 2012a,b; Sprouse, Wagers & Phillips 2012a,b; and many of the contributions in Sprouse & Hornstein 2014).

Finally, there is a statement by Chomsky, which I attribute in the book (p. 195) to a popular press source, about which I have often been questioned, wherein Chomsky calls it a truism that genetically based Universal Grammar (UG) is subject to some individual variation. For those who have asked whether Chomsky’s position can be confirmed in any academic publications, I offer the following quotes:

Putting aside genetic variation (an interesting but marginal phenomenon in the case of language) and conceivable but unknown epigenetic effects, the principles of UG, whatever they are, are invariant. (Chomsky 2013: 35)

It is hardly controversial that [the faculty of language] is a common human possession apart from pathology, to an approximation so close that we can ignore variation. (Chomsky 2008: 138)

I am aware of no empirical evidence that would indicate how much UG can vary across individuals.

Carson T. Schütze

December 2015

References

Bard, Ellen Gurman, Dan Robertson & Antonella Sorace. 1996. Magnitude estimation of linguistic acceptability. Language 72. 32–68.
Borsley, Robert D. (ed.). 2005. Data in theoretical linguistics. Special issue 115(11).
Chomsky, Noam. 2008. On phases. In Robert Freidin, Carlos P. Otero & Maria Luisa Zubizarreta (eds.), Foundational issues in linguistic theory: Essays in honor of Jean-Roger Vergnaud, 133–166. Cambridge, MA: MIT Press.
Chomsky, Noam. 2013. Problems of projection. Lingua 130. 33–49.
Cowart, Wayne. 1997. Experimental syntax: Applying objective methods to sentence judgments. Thousand Oaks, CA: SAGE Publications.
Featherston, Sam. 2007. Data in generative grammar: The stick and the carrot. Theoretical Linguistics 33(3). 269–318.
Featherston, Sam & Wolfgang Sternefeld (eds.). 2007. Roots: Linguistics in search of its evidential base. Berlin: Mouton de Gruyter.
Featherston, Sam & Susanne Winkler (eds.). 2009. The fruits of empirical linguistics. Volume 1: Process. Berlin: Mouton de Gruyter.
Hofmeister, Philip & Ivan A. Sag. 2010. Cognitive constraints and island effects. Language 86. 366–415.
Hofmeister, Philip, Laura Staum Casasanto & Ivan A. Sag. 2012a. How do individual cognitive differences relate to acceptability judgments? A reply to Sprouse, Wagers, and Phillips. Language 88. 390–400.
Hofmeister, Philip, Laura Staum Casasanto & Ivan A. Sag. 2012b. Misapplying working-memory tests: A reductio ad absurdum. Language 88. 408–409.
Kepser, Stephan & Marga Reis (eds.). 2005. Linguistic evidence: Empirical, theoretical, and computational perspectives. Berlin: Mouton de Gruyter.
McNair, Lisa, Kora Singer, Lise M. Dobrin & Michelle M. AuCoin (eds.). 1996. Papers from the parasession on theory and data in linguistics (CLS 32/2). Chicago: Chicago Linguistic Society.
Penke, Martina & Anette Rosenbach (eds.). 2004. What counts as evidence in linguistics? Special issue 28(3).
Phillips, Colin. 2006. The real-time status of island phenomena. Language 82. 795–823.
Schütze, Carson T. 2006. Data and evidence. In Keith Brown (ed.), Encyclopedia of language and linguistics, 2nd edn., vol. 3, 356–363. Oxford: Elsevier.
Schütze, Carson T. 2009. Web searches should supplement judgements, not supplant them. Zeitschrift für Sprachwissenschaft 28. 151–156.
Schütze, Carson T. 2011. Linguistic evidence and grammatical theory. Wiley Interdisciplinary Reviews: Cognitive Science 2. 206–221.
Schütze, Carson T. & Jon Sprouse. 2013. Judgment data. In Robert J. Podesva & Devyani Sharma (eds.), Research methods in linguistics, 27–50. New York: Cambridge University Press.
Sprouse, Jon. 2011. A validation of Amazon Mechanical Turk for the collection of acceptability judgments in linguistic theory. Behavior Research Methods 43(1). 155–167.
Sprouse, Jon & Norbert Hornstein (eds.). 2014. Experimental syntax and island effects. Cambridge: Cambridge University Press.
Sprouse, Jon, Carson T. Schütze & Diogo Almeida. 2013. A comparison of informal and formal acceptability judgments using a random sample from Linguistic Inquiry 2001–2010. Lingua 134. 219–248.
Sprouse, Jon, Matt Wagers & Colin Phillips. 2012a. A test of the relation between working memory and syntactic island effects. Language 88. 82–123.
Sprouse, Jon, Matt Wagers & Colin Phillips. 2012b. Working-memory capacity and island effects: A reminder of the issues and the facts. Language 88. 401–407.
Weskott, Thomas & Gisbert Fanselow. 2011. On the informativity of different measures of linguistic acceptability. Language 87. 249–273.
Winkler, Susanne & Sam Featherston (eds.). 2009. The fruits of empirical linguistics. Volume 2: Product. Berlin: Mouton de Gruyter.
Yoshida, Masaya, Nina Kazanina, Leticia Pablos & Patrick Sturt. 2014. On the origin of islands. Language, Cognition and Neuroscience 29(7). 761–770.


Preface (1996)

The goal of this book is to demonstrate that the absence of methodology of grammaticality judgments in linguistics constitutes a serious obstacle to meaningful research, and to begin to propose suitable remedies for this problem. Throughout much of the history of linguistics, judgments of the grammaticality/acceptability of sentences (and other linguistic intuitions) have been the major source of evidence in constructing grammars. While this seems to have been an exceedingly fruitful approach, some skeptics have worried that theoretical linguists are in fact constructing grammars of intuition, which might not have much to do with the competence that underlies everyday production or comprehension of language. Also, in the pseudoexperimental procedure of judgment elicitation there is typically no attempt to impose any of the standard experimental controls, and often the only subject is the theorist himself or herself. Should we linguists be worried? I think so. I survey the way grammaticality judgments are currently used in theoretical syntax, and argue that such uses, together with the problems of intuition and experimental design, demand a careful examination of judgments, not as pure sources of data, but as instances of metalinguistic performance.
Several important issues arise when this view of grammaticality judgments is pursued, including which tasks one should use to elicit them, what people are doing when they give them, and what they can really tell us about linguistic competence. On the assumption that grammaticality judgments result from interactions among primary language faculties of the mind and general cognitive processes, I try to understand the process by identifying and analyzing its component parts. I review the psycholinguistic research that has examined ways in which the judgment process can vary with differences among subjects, experimental manipulations, and spurious features of the stimulus. Parallels with other cognitive behaviors are pointed out. After drawing together the substantive and methodological findings into a schematic picture of what the overall process of giving linguistic intuitions might look like, I propose strategies for collecting these intuitions that avoid the pitfalls of previous work and take account of the conditions that have been shown to influence such judgments. I suggest that we can actually strengthen the case for linguistic universals by giving empirical ar-