
Formal Grammar

This volume draws together fourteen previously published papers which explore the nature of mental grammar through a formal, generative approach. The book begins by outlining the development of formal grammar over the last fifty years, with a particular focus on the work of Noam Chomsky, and moves into an examination of a diverse set of phenomena in various languages that shed light on theory and model construction. Many of the papers focus on comparisons between English and Norwegian, highlighting the importance of comparative approaches to the study of language. With a comprehensive collection of papers that demonstrate the richness of formal approaches, this volume is key reading for students and scholars interested in the study of grammar.

Terje Lohndal is Full Professor of English linguistics in the Department of Language and Literature at the Norwegian University of Science and Technology, where he also serves as Deputy Head of Research and Director of the PhD program in Language and Linguistics. He also holds an Adjunct Professor position at UiT The Arctic University of Norway.

Routledge Leading Linguists
Edited by Carlos P. Otero
University of California, Los Angeles, USA

For a full list of titles in this series, please visit www.routledge.com

15 Regimes of Derivation in Syntax and Morphology
Edwin Williams

16 Typological Studies: Word Order and Relative Clauses
Guglielmo Cinque

17 Case, Argument Structure, and Word Order
Shigeru Miyagawa

18 The Equilibrium of Human Syntax: Symmetries in the Brain
Andrea Moro

19 On Shell Structure
Richard K. Larson

20 Primitive Elements of Grammatical Theory: Papers by Jean-Roger Vergnaud and His Collaborators
Edited by Katherine McKinney-Bock and Maria Luisa Zubizarreta

21 Pronouns, Presuppositions, and Hierarchies: The Work of Eloise Jelinek in Context
Edited by Andrew Carnie and Heidi Harley

22 Explorations in Maximizing Syntactic Minimization
Samuel D. Epstein, Hisatsugu Kitahara, and T. Daniel Seely

23 Merge in the Mind-Brain: Essays on Theoretical Linguistics and the Neuroscience of Language
Naoki Fukui

24 Formal Grammar: Theory and Variation across English and Norwegian
Terje Lohndal

Formal Grammar
Theory and Variation across English and Norwegian

Terje Lohndal

First published 2018 by Routledge, 711 Third Avenue, New York, NY 10017, and by Routledge, 2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN.

Routledge is an imprint of the Taylor & Francis Group, an informa business.

© 2018 Taylor & Francis

The right of Terje Lohndal to be identified as author of this work has been asserted by him in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book has been requested

ISBN: 978-1-138-28969-7 (hbk)
ISBN: 978-1-315-26705-0 (ebk)

Typeset in Sabon by Apex CoVantage, LLC

Contents

Acknowledgments vii
Original Publication Details ix

Introduction 1

PART A Transformational Constraints 17
1 Brief Overview of the History of Generative Syntax 19
2 Noam Chomsky: A Selected Annotated Bibliography 61
3 Comp-t Effects: Variation in the Position and Features of C 85
4 Freezing Effects and Objects 113
5 Medial-wh Phenomena, Parallel Movement, and Parameters 149
6 Sentential Subjects in English and Norwegian 175
7 Be Careful How You Use the Left Periphery 203

PART B The Syntax–Semantics Interface 229
8 Negative Concord and (Multiple) Agree: A Case Study of West Flemish 231
9 Medial Adjunct PPs in English: Implications for the Syntax of Sentential Negation 265
10 Neo-Davidsonianism in Semantics and Syntax 287
11 Interrogatives, Instructions, and I-Languages: An I-Semantics for Questions 319

PART C Multilingualism and Formal Grammar 369
12 Generative Grammar and Language Mixing 371
13 Language Mixing and Exoskeletal Theory: A Case Study of Word-Internal Mixing in American Norwegian 381
14 Grammatical Gender in American Norwegian Heritage Language: Stability or Attrition? 413

Index 443

Acknowledgments

I am grateful to many scholars for commenting on the ideas and drafts behind the papers included in the present volume. In particular, I am grateful to Tor A. Åfarli, Artemis Alexiadou, Elly van Gelderen, Liliane Haegeman, Norbert Hornstein, Howard Lasnik, Paul Pietroski, and Marit Westergaard for all their comments and support over the years. I would also like to thank Carlos Otero for publishing this book in his esteemed series, and my co-authors for allowing me to include our joint work in this volume.

Original Publication Details

1 Lasnik, Howard and Terje Lohndal. 2013. Brief Overview of the History of Generative Grammar. In The Cambridge Handbook of Generative Syntax, Marcel den Dikken (ed.), 26–60. Cambridge: Cambridge University Press.
2 Lohndal, Terje and Howard Lasnik. 2013. Noam Chomsky. Oxford Bibliographies.
3 Lohndal, Terje. 2009. Comp-t Effects: Variation in the Position and Features of C. Studia Linguistica 63: 204–232.
4 Lohndal, Terje. 2011. Freezing Effects and Objects. Journal of Linguistics 47: 163–199.
5 Lohndal, Terje. 2010. Medial-wh Phenomena, Parallel Movement, and Parameters. Linguistic Analysis 34: 215–244.
6 Lohndal, Terje. 2014. Sentential Subjects in English and Norwegian. Syntaxe et Sémantique 15: 81–113.
7 Haegeman, Liliane and Terje Lohndal. 2015. Be Careful How You Use the Left Periphery. In Structures, Strategies and Beyond: Studies in Honour of Adriana Belletti, Elisa Di Domenico, Cornelia Hamann, & Simona Matteini (eds.), 135–162. Amsterdam: John Benjamins.
8 Haegeman, Liliane and Terje Lohndal. 2010. Negative Concord and (Multiple) Agree: A Case Study of West Flemish. Linguistic Inquiry 41: 181–211.
9 De Clercq, Karen, Liliane Haegeman and Terje Lohndal. 2012. Medial Adjunct PPs in English: Implications for the Syntax of Sentential Negation. Nordic Journal of Linguistics 35: 5–26.
10 Lohndal, Terje and Paul Pietroski. 2011. Interrogatives, Instructions, and I-languages: An I-Semantics for Questions. Linguistic Analysis 37: 458–515.
11 Lohndal, Terje. In press. Neo-Davidsonianism in Semantics and Syntax. In The Oxford Handbook of Event Structure, Robert Truswell (ed.). Oxford: Oxford University Press.
12 Lohndal, Terje. 2013. Generative Grammar and Language Mixing. Theoretical Linguistics 39: 215–224.
13 Berg Grimstad, Maren, Terje Lohndal and Tor A. Åfarli. 2014. Language Mixing and Exoskeletal Theory: A Case Study of Word-Internal Mixing in American Norwegian. Nordlyd 41: 213–237.
14 Lohndal, Terje and Marit Westergaard. 2016. Grammatical Gender in American Norwegian Heritage Language: Stability or Attrition? Frontiers in Psychology. doi:10.3389/fpsyg.2016.00344

Introduction

Human languages are inextricably a part of our mind/brain. No other animal has a comparable ability with the same complexity and richness that humans do. An important research goal is to better understand this ability for language: What is it that enables humans to acquire and use language the way we do? One way of answering this is to argue that there are aspects of our biology that enable us to acquire and use language. This has been the answer that in modern times has been advocated by generative grammar, in particular in approaches developed based on work by Noam Chomsky (1965, 1986, 2009), although its origins are much older. This approach holds that there are universal aspects of language that all humans share. However, it is at the same time evident that languages also differ: A child growing up in Japan will acquire Japanese whereas a child growing up in Norway will acquire Norwegian. An adequate theory of human language needs to be able to account for both possible universals and language variation. However, a core question is what such an adequate theory may look like. This volume consists of essays that adopt a formal approach to linguistic variation and apply it in different areas: syntactic variation in synchronic grammars, the interface between syntax and semantics, and aspects of the grammar of multilingual individuals. In this introduction, these general themes are discussed, albeit briefly, before a summary of the individual chapters is provided.

A Formal Approach to Grammar

Formal and generative linguists are concerned with developing formal descriptions of the structures of human language. In some of his earliest work, Chomsky (1955, 1957), drawing on Harris (1951), developed phrase-structural analyses for a fragment of English. For example, the grammar in (2) can be utilized to generate the derivation in (3), yielding the sentence in (1).

(1) Linda sings.

(2) a. Designated initial symbol (Σ): S
    b. Rewrite rules (F):
       S → NP VP
       NP → N
       VP → V
       N → Linda
       V → sings

(3) a. Line 1: S
    b. Line 2: NP VP
    c. Line 3: N VP
    d. Line 4: N V
    e. Line 5: N sings
    f. Line 6: Linda sings

Importantly, Chomsky introduced a level of abstract structure that was not present in earlier work. We see that in rewrite rules that utilize structure independently of the words (e.g., NP → N). In all modern work on formal grammars, an important research question has been the number of levels of representation and their nature. The formal details of this abstractness have changed as new approaches have emerged, but the existence of abstraction has always been a staple of formal approaches to grammar. Questions soon emerged regarding what phrase-structural grammars are describing. Chomsky (1959, 1965) argues that a formal grammar should describe the competence of the native speaker; that is, it should characterize the mental grammars that each of us have internalized. In order to develop this line of reasoning, Chomsky (1965) distinguishes between descriptive and explanatory adequacy. A descriptively adequate grammar is a grammar that correctly describes the set of sentences that are grammatical, while also ruling out those sentences that are ungrammatical. Explanatory adequacy is characterized as follows:

To the extent that a linguistic theory succeeds in selecting a descriptively adequate grammar on the basis of primary linguistic data, we can say that it meets the condition of explanatory adequacy. That is, to this extent, it offers an explanation for the intuition of the native speaker on the basis of an empirical hypothesis concerning the innate predisposition of the child to develop a certain kind of theory to deal with the evidence presented to him. (Chomsky 1965: 25–26)

This displays a clear mental perspective on language whereby language is an aspect of the mind/brain.
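Returning to (1)–(3): the rewrite-rule grammar in (2) is simple enough to simulate mechanically. The following is a toy sketch added for illustration (not part of the original text); it applies rules to the leftmost nonterminal, so the order of rewriting differs slightly from the line numbering in (3), though the generated sentence is the same.

```python
# Toy version of the grammar in (2): starting from the designated initial
# symbol S, repeatedly rewrite the leftmost nonterminal until only
# terminal words remain, recording each line of the derivation as in (3).
RULES = {
    "S": ["NP", "VP"],
    "NP": ["N"],
    "VP": ["V"],
    "N": ["Linda"],
    "V": ["sings"],
}

def derive(start="S"):
    """Return the successive lines of a leftmost derivation."""
    lines = [[start]]
    current = [start]
    while any(sym in RULES for sym in current):
        # Find the leftmost nonterminal and apply its (single) rewrite rule.
        i = next(i for i, sym in enumerate(current) if sym in RULES)
        current = current[:i] + RULES[current[i]] + current[i + 1:]
        lines.append(current)
    return lines

for n, line in enumerate(derive(), 1):
    print(f"Line {n}: {' '.join(line)}")
```

The derivation begins with S and terminates in the string "Linda sings", mirroring the abstract structure (NP, VP, N, V) that exists independently of the words themselves.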
Formal linguistics, then, needs to develop models of grammars that both respond to the descriptive generalizations and show how such a grammar can be selected based on available data and prior structure in a human being. A crucial concept in the present tradition is the notion of an I-language. The notion is due to Chomsky (1986) but also goes back to Church's (1941) formalization of the lambda calculus. Chomsky (1986) distinguishes I-language from E-language, whereby the latter describes language use and aspects of language that are external to the mind and to the speaker. Notions such as "English" and "Norwegian" are typical examples of E-language, and so are corpora and other data collections. E-language constitutes the data from which I-language can be distilled. The "I" connotes "individual", "internal" and "intensional". The first two notions make it clear that language has a psychological or mental existence in each and every one of us. The last notion, that it is intensional, is more complex. This is the notion that relates to Church (1941) and his formalizations of functions. Church distinguishes between a function in extension and a function in intension. Roughly put, these can be thought of as "output" and "input," respectively. A simplified example may serve as an illustration. Consider the following two functions in (4):

(4) a. f(x) = x + 2
    b. f(y) = 10 − y

For a given value of x and y, say 4, both functions yield the same result:

(5) a. f(4) = 4 + 2 = 6
    b. f(4) = 10 − 4 = 6
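The point can be made concrete in a few lines of code (an illustrative sketch, not from the original text): the two procedures agree on the argument 4, yet they are clearly not the same procedure.

```python
# Two intensions (procedures) corresponding to (4). At the argument 4
# their extensions (outputs) coincide, even though one adds and the
# other subtracts.
def f_add(x):
    return x + 2   # the function in (4a): addition

def f_sub(y):
    return 10 - y  # the function in (4b): subtraction

print(f_add(4), f_sub(4))  # both yield 6: same extension at 4
print(f_add(5), f_sub(5))  # 7 vs 5: the intensions differ
```

Checking other arguments immediately reveals that identity of output at one point does not entail identity of the underlying function.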

With Church, we can say that the extension of the functions is the same in (5). However, the intensions of the functions are not the same: In one case there is addition, in another case subtraction. The goal of an I-language approach to the study of language is to determine what the intensional function is, not just the extensional output. Consider the sentence in (6):

(6) Daniel likes cookies.

This particular sentence can be considered the extension of a function. There is a range of different analyses that could be given for this sentence. (7) provides three different analyses:

[(7a–c): three candidate tree structures, not reproduced here]

Formal generative work has uncovered that among the alternatives in (7), the structural relations depicted in (7c) are the most accurate ones (though see Borer 2005a, b and Lohndal 2014, among many others, for yet another alternative). Determining what the intensional function is, that is, what the accurate formal structural analysis is for natural language, is a crucial part of the generative enterprise as developed by Chomsky and many others.

Language Variation

In Chomsky and Lasnik (1977) and Chomsky (1981), a specific theory of I-language was developed which became known as the Principles and Parameters theory. On this view, there are universal principles that hold across all languages and limited variation that is encoded by way of parameters. In the words of Chomsky (1981: 6), "[i]deally we hope to find that complexes of properties [. . .] are reducible to a single parameter, fixed in one or another way." Such a model is substantially different from earlier approaches, where universal grammar was a specification of an infinite array of possible grammars. On that view, explanatory adequacy required a presumably unfeasible search procedure to find the highest-valued grammar based on the relevant input (primary linguistic data). The Principles and Parameters approach eliminated the necessity of such a search procedure. Since then there has been a lot of discussion concerning the nature and structure of parameters (see, among many, Borer 1984, Chomsky 1995, Baker 1996, 2001, 2008, Kayne 2005, Newmeyer 2005, Fukui 2006, Biberauer 2008, Biberauer and Roberts 2012, 2016, Westergaard 2013, Alexiadou 2014 for expositions based on different empirical domains). Two main proposals have emerged: the macroparametric and the microparametric view. The macroparametric view holds that there are major parameters that distinguish languages from each other and that parametric properties are linked by way of implications (see, e.g., Baker 1996 on polysynthesis and Hale 1982 on the non-configurationality parameter). Consider, e.g., the proposal in Rizzi (1982) for the null subject parameter. Rizzi argues that the following properties correlate: thematic null subjects, null expletives, free inversion, and that-trace effects. Since then it has become clear that clustering effects are not as strong as originally thought (see, e.g., Newmeyer 2005 and Biberauer 2008).
However, a different way of developing the intuitions behind a macroparametric approach is provided in Biberauer and Roberts (2012, 2016). They suggest that parameters come in different sizes and that there is a cross-linguistic taxonomy for parameters. (8) shows this based on Biberauer and Roberts (2016: 260).

(8) For a given value vi of a parametrically variant feature F:
    a. Macroparameters: all functional heads of the relevant type share vi;
    b. Mesoparameters: all functional heads of a given naturally definable class, e.g. [+V], share vi;
    c. Microparameters: a small subclass of functional heads (e.g. modal auxiliaries) shows vi;
    d. Nanoparameters: one or more individual lexical items is/are specified for vi.

This view fits better with the cross-linguistic generalizations, and it also makes more accurate predictions concerning the structure of linguistic variation. A different take on linguistic variation is the Lexical Parametrization Hypothesis (Borer 1984, since adopted and developed by many others), which locates variation in the features of particular items. Since then, functional heads have become a very important locus for parametric variation (e.g., Kayne 2005). The major appeal of this view is that it puts the acquisition of variation on a similar footing as the acquisition of lexical items and, furthermore, that this view would be sensitive to fine-grained differences between languages. Yet another view is that of Westergaard (2013, 2014), where parameters are replaced by micro-cues (originally a development of the cue-based approach in Lightfoot 1999; see also Fodor 1998 and Dresher 1999). Micro-cues are small pieces of abstract structure that emerge from children parsing the input. Universal grammar is the ability to parse the input, whereas the specific micro-cues emerge through parsing and input together. The size of a micro-cue is a relevant question, but work done by Westergaard and others already suggests that these cues come in different sizes. Despite the various approaches and proposals, there is a consensus that the basic idea still holds: Certain aspects of language are universal, and a range of other properties vary in limited ways. This idea is also what distinguishes Chomskyan generative grammar from all other approaches to language and grammar, as most other approaches hold that there is no universality related to language per se. In order to further our understanding of the space of variation, it is necessary to compare both languages that are typologically very different and languages that are not very different.
The present volume draws together papers that primarily scrutinize differences between two closely related languages, namely, English and Norwegian. By comparing languages that are closely related both structurally and in terms of heritage, it is possible to more easily isolate fine-grained properties that differ and, thereby, also to understand exactly where grammars vary and where they do not. Several chapters scrutinize various empirical puzzles and demonstrate how these illuminate theory and model construction.

The Syntax–Semantics Interface

Ever since the first work within generative grammar, a major concern has been the relationship between syntactic representations and the meaning of these representations. Chomsky outlines the importance already in Syntactic Structures:

In proposing that syntactic structure can provide a certain insight into problems of meaning and understanding we have entered onto dangerous ground. There is no aspect of linguistic study more subject to confusion and more in need of clear and careful formulation than that which deals with the points of connection between syntax and semantics. (Chomsky 1957: 93)

A leading intuition in much work has been that the semantic component "reads off" the syntactic representations. Put differently, semantic interpretation takes the syntactic structure as its input and respects its relations. This view is often called "interpretive semantics" as opposed to a semantics with its own principles and rules, often called "generative semantics" (see, e.g., Chomsky 1965 vs. Lakoff 1971). The details of how the semantic interpretation takes place have been the subject of much debate. A major view is the approach that was first implemented by Montague (1974) for fragments of English (see also Partee 1975). This was a model-theoretic approach to semantics building on the foundational work by Frege and Tarski. The main textbook version of this approach is the one outlined in Heim and Kratzer (1998), where syntactic structures are interpreted by way of a formal semantics model with its own and independent principles. In modern developments of this approach, the syntax–semantics interface is not entirely transparent, as both the syntax and the semantics allow one to adjust the relevant representations. An alternative approach is the one outlined in Davidson (1967), which is motivated based on sentences such as (9), based on Davidson (1967: 82):

(9) Jones buttered the toast slowly in the bathroom with a knife at midnight.

Davidson argues that the verb and all the adverbial modifiers share an event variable, which derives the fact that (9) entails each of the sentences in (10):

(10) a. Jones buttered the toast slowly in the bathroom with a knife.
     b. Jones buttered the toast slowly in the bathroom.
     c. Jones buttered the toast slowly.
     d. Jones buttered the toast.
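The entailment pattern in (10) becomes explicit once (9) is given a Davidsonian logical form; the following is a textbook-style rendering, with predicate names chosen here for illustration:

```latex
\exists e\,[\mathit{butter}(e, \mathit{Jones}, \mathit{the\ toast})
  \wedge \mathit{slow}(e)
  \wedge \mathit{in}(e, \mathit{the\ bathroom})
  \wedge \mathit{with}(e, \mathit{a\ knife})
  \wedge \mathit{at}(e, \mathit{midnight})]
```

Each sentence in (10) corresponds to dropping one or more conjuncts from this formula, and since conjunction elimination is valid under the existential quantifier, the entailments follow immediately.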

Davidson argues that the logical form consists of conjuncts of event predicates. Later on, scholars have developed this further into a general principle of semantic composition, by arguing that concatenation signifies conjunction (see Pietroski 2005 for much discussion). Most of this work has not addressed the question of what the syntax underlying conjunctive logical forms looks like, although recent work has started to address this issue (see, e.g., Borer 2005a, b, Lohndal 2014). In addition to much discussion of the correct characterization of meaning, there has also been a lot of work on argument structure which has argued that certain semantic relations are encoded in the syntax. Since Chomsky (1995), Harley (1995) and Kratzer (1996), researchers have argued that the Agent is introduced by a dedicated functional projection, VoiceP or vP (Alexiadou, Anagnostopoulou, and Schäfer 2006, 2015, Folli and Harley 2007, Merchant 2013), distinguishing between the external and all the internal arguments (Williams 1981, Marantz 1984). Much work has since extended this to hold of all arguments, meaning that every argument is introduced by a dedicated projection (Borer 2005a, b, 2013, Ramchand 2008, Bowers 2010, Lohndal 2012, 2014). Put differently, syntactic structure is essential for determining argument structure. Marantz (2013: 153) summarizes recent developments as follows:

[current developments in linguistic theory] have shifted discussion away from verb classes and verb-centered argument structure to the detailed analysis of the way that structure is used to convey meaning in lan- guage, with verbs being integrated into the structure/meaning relations by contributing semantic content, mainly associated with their roots, to subparts of a structured meaning representation.

In this view, the syntax transparently provides the correct relations for semantic interpretation to take place. In the current volume, several chapters discuss issues relevant for modeling the syntax–semantics interface. They are especially concerned with negation, interrogatives, and argument structure.

Formal Grammar and Multilingualism

Chomsky (1965) makes a much-cited idealization concerning the object of study:

Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the lan- guage in actual performance. (Chomsky 1965: 3)

This idealization has been very helpful for uncovering a range of important generalizations which in turn have contributed to a better theoretical understanding of human language in general. However, we know that probably the majority of speakers in the world are multilingual in some way or other. There has been a lot of generative work on second-language acquisition (see, e.g., Hawkins 2001, White 2003, Slabakova 2016), although less general work on the grammatical representations in multilingual individuals. Formal models and theories should also be able to account for multiple mental grammars within the same individual, as the way in which these emerge, are put to use, and possibly interact also constitutes possible human grammatical patterns and rules. That is, multiple I-languages within the same individual are just as important as a single I-language within an individual. In addition to formal work on second-language acquisition, there has also been continuous work on code-switching or language mixing from a Chomskyan generative point of view (see, e.g., Sankoff and Poplack 1981, Woolford 1983, Di Sciullo, Muysken and Singh 1986, Belazi, Rubin and Toribio 1994, MacSwan 1999, 2000, 2005, Muysken 2000, van Gelderen and MacSwan 2008, González-Vilbazo and López 2011, Alexiadou et al. 2015). This work has proved very interesting because mixing data address the nature of interacting grammars and what their possible restrictions are. However, one question which has so far not been scrutinized is the question of what the smallest units that can be mixed are. Data such as (11) show that mixing clearly goes beyond the word level:

(11) det andre crew-et (Haugen 1953: 571)
     the.n other crew-def.sg.n
     "the other crew"

In (11), the noun crew has received the Norwegian inflection for neuter nouns, meaning that (1) the noun has been assigned grammatical gender, and (2) there is language mixing within the word level (pace MacSwan 1999, 2000, 2005). However, is it a category-less root that is mixed, or is it a stem that is already categorized as a noun? That is an important and yet unresolved question, though one chapter in the present volume addresses it. See also Alexiadou et al. (2015) for a different perspective. Chapters in this book address four different issues: why multilingual data are important for formal models, what multilingual data can say about modeling that monolingual data do not address, how a formal model can account for various patterns of word-internal language mixing, and how grammatical gender systems may change in heritage populations.

Three Parts: Transformations, Interface, and Multilingualism

The present volume is organized into three parts, which are reviewed and contextualized below.

Part A: Transformational Constraints

Since Harris (1951) and Chomsky (1955, 1957), transformations have been a crucial part of formal grammar. Often scholars think of transformations as denoting syntactic movement processes, but the headline here has been chosen because originally transformations were richer and denoted what we today would label "syntactic operations". This part of the volume is therefore concerned with constraints on syntactic operations, although the majority of the case studies focus on restrictions on what and when you can move a constituent in a syntactic structure. The first chapter (coauthored with Howard Lasnik), despite its title, provides a rather lengthy and comprehensive discussion of the history of generative grammar, mainly in the tradition emerging from Chomsky's work. It expands on the issues discussed all too briefly in this discussion and attempts to show the major lines of development from the 1950s until today. The major topics are covered: how generative grammar emerged and developed in the beginning, how theories of phrase structure have developed, in addition to other core notions such as the syntax–semantics interface, filters, derivations versus representations, and economy. An important point argued for in this chapter is that there is a lot of continuity within Chomskyan generative grammar despite its changing formal apparatus. For that reason, the chapter focuses on the general and overall theoretical ideas and principles rather than in-depth discussion of specific empirical analyses. There is no denying that Noam Chomsky (1928–) has been a pivotal figure in modern linguistics and arguably the most influential scholar within generative grammar. In part because of this, his ideas, position and influence are subject to often intense debate. Chapter 2 (again coauthored with Howard Lasnik) provides an annotated bibliography of Chomsky's work, both within linguistics and beyond.
It also includes work that is critical of Chomsky's ideas. Although by no means complete, the bibliography is hopefully a useful entry into the massive literature produced by and about Chomsky. After these two more general chapters, a series of chapters follows which study restrictions on movement and syntactic dependencies more generally. A crucial question that is addressed is when movement is possible, when it is not possible, and, in turn, why certain movements are not possible in certain languages. Chapter 3 considers the famous that-trace effect, which goes back at least to Perlmutter (1971). It begins by reviewing a specific proposal for how to analyze English and then extends this analysis to the Scandinavian languages. However, the chapter seeks to go beyond that-trace effects and deal with complementizer-trace effects more generally. This also includes an analysis of why relative clauses show a reversed that-trace effect. In Chapter 4, aspects of the ideas developed in Chapter 3 are developed much further and into a new empirical domain. The topic of Chapter 4 is freezing effects as they relate to objects, both direct and indirect objects. A freezing effect is an instance where further movement is prohibited; the constituent is frozen in a specific position. Importantly, the analysis argued for crucially relies on a specific analysis of what we could call freezing effects for subjects, of which complementizer-trace effects would be one example. In Chapter 5, a very different set of phenomena is studied, namely, instances where it seems like multiple members of a chain are pronounced. The focus of the chapter is instances of wh-movement where both various dialects/languages and developmental child language exhibit instances where the intermediate wh-constituent is pronounced.
The chapter looks at restrictions governing these data and provides an analysis which argues that the grammatical structures of developing children and adults should be analyzed differently. Chapter 6 returns to the topic of subjects, although this time it focuses exclusively on subjects that are sentential. The two main questions addressed are where sentential subjects are located in the sentential structure, and whether or not sentential subjects have the same structural position across languages. A detailed comparison of English and Norwegian illustrates that the answer to the latter question is negative: Sentential subjects occupy the canonical subject position in some languages whereas they occupy a special topic position in others. The last chapter in Part A, Chapter 7 (coauthored with Liliane Haegeman), considers gapping in English and previously published analyses of this phenomenon. The chapter critically discusses the role of the left periphery of the clause in analyzing gapping, the main general concern being what restrictions there are on movement and how these should be formally implemented.

Part B: The Syntax–Semantics Interface

Some of the chapters in this part are mostly concerned with the structures underlying semantic interpretation, whereas two of the chapters develop specific mapping hypotheses for the syntax–semantics interface. The first two chapters in Part B are concerned with negation and its structural representation. Chapter 8 (coauthored with Liliane Haegeman) scrutinizes what the correct representation for negative concord in West Flemish is. West Flemish is important because it shows that there are restrictions on which negative words can go together, with implications for the general theoretical mechanism of Multiple Agree. The chapter argues against Multiple Agree on the grounds that it does not predict or derive the correct empirical generalizations. Negation is also the topic of Chapter 9 (coauthored with Karen De Clercq and Liliane Haegeman), this time based on corpus data from English. The chapter demonstrates that medial nonnegative adjunct PPs are attested in both American and British English, contrary to claims often made in the literature. Furthermore, the data show that medial negative adjunct PPs strongly outnumber postverbal negative adjunct PPs. In addition, the chapter develops a syntactic analysis that relies on a polarity head in the left periphery. Chapter 10 discusses the impact and development of Donald Davidson's original proposal that there is an event variable in the logical forms that encode meaning in natural languages (Davidson 1967). Originally, Davidson was concerned with adjuncts and their entailments, but this chapter demonstrates how these insights were extended to apply to thematic arguments. An important point is that there is a family of proposals that all share a commitment to logical forms that are neo-Davidsonian in nature. Aspects of the formal semantics used in Chapter 10 are also the topic of Chapter 11.
Chapter 11 (coauthored with Paul Pietroski) is an extended discussion of what an I-language semantics would look like for questions. The chapter relies on a different semantic formalism than the standard formal semantic literature (e.g., Heim and Kratzer 1998) and combines this with the syntax for questions provided in Cable (2010). In many ways, the chapter can be conceived of as an initial case study of some phenomena and how these can be captured.

Part C: Multilingualism and Formal Grammar

This part of the book contains three chapters that explore formal accounts of aspects of multilingualism. All three focus on heritage languages, which can be defined as follows:

A language qualifies as a heritage language if it is a language spoken at home or otherwise readily available for young children, and crucially this language is not a dominant language of the larger (national) society. [. . .] From a purely linguistic point of view, we assume that an individual qualifies as a heritage speaker, if and only if he or she has some command of the heritage language acquired naturalistically.
(Rothman 2009: 156)

The heritage language in question is American Norwegian, which is a heritage variety of Norwegian spoken in the US since the 1850s. Chapter 12 is an epistemological paper that seeks to justify why formal models should be able to account for individuals with multiple mental grammars. The chapter was originally a commentary on Benmamoun, Montrul and Polinsky (2013), and the goal was to show how multilingual data, notably data from heritage languages, can shed light on theoretical issues in syntax and morphology.

In Chapter 13 (coauthored with Maren Berg Grimstad and Tor A. Åfarli), we argue that aspects of language mixing can be analyzed in a formal model that combines two different theories: an exoskeletal approach to grammar (e.g., Borer 2005a, b, 2013, Lohndal 2014) and Distributed Morphology's notion of late insertion (Halle and Marantz 1993, Embick and Noyer 2007). This combined model can be straightforwardly extended to cover multilingual situations. The main empirical focus is on language mixing within verbs and nouns in the heritage language American Norwegian, where we show how the model captures the main empirical mixing pattern: "Norwegian" functional morphology combined with "English" roots/stems.

The last chapter in the book, Chapter 14 (coauthored with Marit Westergaard), investigates grammatical gender in the heritage language American Norwegian. Norwegian has three genders—masculine, feminine, and neuter—and this chapter shows that for many of the speakers of American Norwegian, this gender system has changed quite significantly: There is overgeneralization of masculine forms to both the feminine and the neuter. The chapter also proposes a way to distinguish between incomplete acquisition and attrition: The former should lead to systematic differences between the heritage variety and the nonheritage variety, whereas attrition will lead to general erosion and eventually complete loss.

In Conclusion

Hopefully, this volume demonstrates the utility of comparative work on closely related varieties such as English and Norwegian. Taken together, the chapters also exemplify the richness of formal approaches and what they cover empirically. They illuminate both theoretical and formal models of grammar, how language variation speaks to such models, and, last, how these models have developed over time.

Note

1 I am grateful to Artemis Alexiadou for helpful comments on this introduction.

References

Alexiadou, A. 2014. Multiple Determiners and the Structure of DPs. Amsterdam: John Benjamins.
Alexiadou, A., Anagnostopoulou, E. and Schäfer, F. 2006. The properties of anticausatives crosslinguistically. In Phases of Interpretation, M. Frascarelli (ed.), 187–212. Berlin: Mouton de Gruyter.
Alexiadou, A., Anagnostopoulou, E. and Schäfer, F. 2015. External Arguments in Transitivity Alternations: A Layering Approach. Oxford: Oxford University Press.
Alexiadou, A., Lohndal, T., Åfarli, T. A. and Grimstad, M. B. 2015. Language mixing: A distributed morphology approach. In Proceedings of the Forty-Fifth Annual Meeting of the North East Linguistic Society, T. Bui and D. Özyildiz (eds.), 25–38. CreateSpace.
Baker, M. C. 1996. The Polysynthesis Parameter. Oxford: Oxford University Press.
Baker, M. C. 2001. The Atoms of Language. New York: Basic Books.
Baker, M. C. 2008. The macroparameter in a microparametric world. In The Limits of Syntactic Variation, M. T. Biberauer (ed.), 351–374. Amsterdam: John Benjamins.
Belazi, H. M., Rubin, E. J. and Toribio, A. J. 1994. Code switching and X-Bar theory. Linguistic Inquiry 25: 221–237.
Benmamoun, E., Montrul, S. and Polinsky, M. 2013. Heritage languages and their speakers: Opportunities and challenges for linguistics. Theoretical Linguistics 39: 129–181.
Biberauer, M. T. 2008. Introduction. In The Limits of Syntactic Variation, M. T. Biberauer (ed.), 1–74. Amsterdam: John Benjamins.
Biberauer, M. T. and Roberts, I. 2012. Towards a parameter hierarchy for auxiliaries: Diachronic considerations. Cambridge Occasional Papers in Linguistics 6: 267–294.
Biberauer, M. T. and Roberts, I. 2016. Parameter typology from a diachronic perspective. In Theoretical Approaches to Linguistic Variation, E. Bidese, F. Cognola and M. C. Moroni (eds.), 259–291. Amsterdam: John Benjamins.
Borer, H. 1984. Parametric Syntax. Dordrecht: Foris.
Borer, H. 2005a. Structuring Sense I: In Name Only. Oxford: Oxford University Press.
Borer, H. 2005b. Structuring Sense II: The Normal Course of Events. Oxford: Oxford University Press.
Borer, H. 2013. Structuring Sense III: Taking Form. Oxford: Oxford University Press.
Bowers, J. 2010. Arguments as Relations. Cambridge, MA: MIT Press.
Cable, S. 2010. The Grammar of Q. Oxford: Oxford University Press.
Chomsky, N. 1955. The Logical Structure of Linguistic Theory. Ms., Harvard University. [Revised version published in part by Plenum, New York, 1975].
Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, N. 1959. Review of Verbal Behavior by B. F. Skinner. Language 35: 26–58.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1986. Knowledge of Language. New York: Praeger.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 2009. Cartesian Linguistics: A Chapter in the History of Rationalist Thought. 3rd edition. Cambridge: Cambridge University Press.
Chomsky, N. and Lasnik, H. 1977. Filters and control. Linguistic Inquiry 8: 425–504.
Church, A. 1941. The Calculi of Lambda-Conversion. Princeton, NJ: Princeton University Press.
Davidson, D. 1967. The logical form of action sentences. In The Logic of Decision and Action, N. Rescher (ed.), 81–95. Pittsburgh: University of Pittsburgh Press.
Di Sciullo, A-M., Muysken, P. and Singh, R. 1986. Government and code-mixing. Journal of Linguistics 22: 1–24.
Dresher, E. 1999. Charting the learning path: Cues to parameter setting. Linguistic Inquiry 30: 27–67.
Embick, D. and Noyer, R. 2007. Distributed morphology and the syntax–morphology interface. In The Oxford Handbook of Linguistic Interfaces, G. Ramchand and C. Reiss (eds.), 289–324. Oxford: Oxford University Press.
Fodor, J. D. 1998. Unambiguous triggers. Linguistic Inquiry 29: 1–36.
Folli, R. and Harley, H. 2007. Causation, obligation, and argument structure: On the nature of little v. Linguistic Inquiry 38: 197–238.
Fukui, N. 2006. Theoretical Comparative Syntax. London: Routledge.
González-Vilbazo, K. and López, L. 2011. Some properties of light verbs in code-switching. Lingua 121: 832–850.
Hale, K. 1982. Warlpiri and the grammar of non-configurational languages. Natural Language and Linguistic Theory 1: 5–47.
Halle, M. and Marantz, A. 1993. Distributed morphology and the pieces of inflection. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, K. Hale and S. J. Keyser (eds.), 111–176. Cambridge, MA: MIT Press.
Harley, H. 1995. Subjects, Events, and Licensing. Doctoral dissertation, MIT.
Harris, Z. 1951. Methods in Structural Linguistics. Chicago, IL: The University of Chicago Press.
Haugen, E. 1953. The Norwegian Language in America. Philadelphia: University of Pennsylvania Press.
Hawkins, R. 2001. Second Language Syntax. Malden: Blackwell.
Heim, I. and Kratzer, A. 1998. Semantics in Generative Grammar. Malden: Blackwell.
Kayne, R. 2005. Some notes on comparative syntax, with special reference to English and French. In The Oxford Handbook of Comparative Syntax, G. Cinque and R. Kayne (eds.), 3–69. Oxford: Oxford University Press.
Kratzer, A. 1996. Severing the external argument from the verb. In Phrase Structure and the Lexicon, J. Rooryck and L. Zaring (eds.), 109–137. Dordrecht: Kluwer.
Lakoff, G. 1971. On generative semantics. In Semantics: An Interdisciplinary Reader in Philosophy, Linguistics and Psychology, D. D. Steinberg and L. A. Jakobovits (eds.), 232–296. Cambridge: Cambridge University Press.
Lightfoot, D. 1999. The Development of Language: Acquisition, Change and Evolution. Malden: Blackwell.
Lohndal, T. 2012. Without Specifiers: Phrase Structure and Events. Doctoral dissertation, University of Maryland.
Lohndal, T. 2014. Phrase Structure and Argument Structure: A Case-Study of the Syntax–Semantics Interface. Oxford: Oxford University Press.
MacSwan, J. 1999. A Minimalist Approach to Intra-Sentential Code-Switching. New York: Garland.
MacSwan, J. 2000. The architecture of the bilingual faculty: Evidence from intrasentential code switching. Bilingualism 3: 37–54.
MacSwan, J. 2005. Codeswitching and generative grammar: A critique of the MLF model and some remarks on "modified minimalism". Bilingualism: Language and Cognition 8: 1–22.
Marantz, A. 1984. On the Nature of Grammatical Relations. Cambridge, MA: MIT Press.
Marantz, A. 2013. Verbal argument structure: Events and participants. Lingua 130: 152–168.
Merchant, J. 2013. Voice and ellipsis. Linguistic Inquiry 44: 77–108.
Montague, R. 1974. Formal Philosophy. New Haven: Yale University Press.
Muysken, P. 2000. Bilingual Speech: A Typology of Code-Mixing. Cambridge: Cambridge University Press.
Newmeyer, F. J. 2005. Possible and Probable Languages. Oxford: Oxford University Press.
Partee, B. H. 1975. Montague grammar and transformational grammar. Linguistic Inquiry 6: 203–300.
Perlmutter, D. 1971. Deep and Surface Structure Constraints in Syntax. New York: Holt.
Pietroski, P. 2005. Events and Semantic Architecture. Oxford: Oxford University Press.
Ramchand, G. 2008. Verb Meaning and the Lexicon: A First Phase Syntax. Cambridge: Cambridge University Press.
Rizzi, L. 1982. Issues in Italian Syntax. Dordrecht: Foris.
Rothman, J. 2009. Understanding the nature and outcomes of early bilingualism: Romance languages as heritage languages. International Journal of Bilingualism 13: 155–163.
Sankoff, D. and Poplack, S. 1981. A formal grammar for code-switching. Research on Language and Social Interaction 14: 3–45.
Slabakova, R. 2016. Second Language Acquisition. Oxford: Oxford University Press.
van Gelderen, E. and MacSwan, J. 2008. Interface conditions and code-switching: Pronouns, lexical DPs, and checking theory. Lingua 118: 765–776.
Westergaard, M. 2013. The acquisition of linguistic variation: Parameters vs. micro-cues. In In Search of Universal Grammar: From Old Norse to Zoque, T. Lohndal (ed.), 275–298. Amsterdam: John Benjamins.
Westergaard, M. 2014. Linguistic variation and micro-cues in first language acquisition. Linguistic Variation 14: 26–45.
White, L. 2003. Second Language Acquisition and Universal Grammar. Cambridge: Cambridge University Press.
Williams, E. 1981. Argument structure and morphology. The Linguistic Review 1: 81–114.
Woolford, E. 1983. Bilingual code-switching and syntactic theory. Linguistic Inquiry 14: 520–536.

Part A Transformational Constraints

1 Brief Overview of the History of Generative Syntax*

with Howard Lasnik

1.1 Background and Outline

Scientific grammar goes back to the Middle Ages, and specifically to the study, by Modistic philosophers, of language as a phenomenon independent of thought. In a sense, the tradition is even older, dating back to Classical Antiquity, and spanning several cultures—after all, every traditional writing system presupposes some serious linguistic theorizing. In the humanistic Renaissance, philosophers started worrying, also, about the relation between language and thought, and as the Ages of Exploration and Reason came to be, about the problem of creativity and what it reveals about the natural world—where according to Descartes it effectively constituted a "second substance". By the time Darwin began to revolutionize our thinking about human nature, philology was a profession in its own right, so much so that the discovery of the Indo-European ancestry and how it gave rise to hundreds of different languages served as a central inspiration in Darwin's evolutionary theory. Many of the theoretical insights of linguistics in the twentieth century date back to this modern tradition, particularly as coupled together with late nineteenth and early twentieth century developments in mathematical logic and philosophy more generally.

Saussure (1916) initiated contemporary structural linguistics by emphasizing how language should be conceived as separate from what it is used for and by concentrating on how language is, not how it changes. Bloomfield (1933), Wells (1947) and Harris (1951) developed structuralism further, and Noam Chomsky's work developed, in particular, in immediate reaction to Harris's program. A fundamental difference between structuralism and generative grammar stems from the fact that Chomsky focused on those aspects of structure that make the system recursive, whereas structuralism left those for the realm of what we nowadays call performance. Structuralism in fact focused on finite levels of language, such as morphophonemics, where notions like "linguistic feature" or the paradigmatic inventory underlying phonemics came to be understood (see again especially Harris 1951). But it was the syntax put to the side at the time that especially interested Chomsky, particularly since it was taken to address a key element in the problem of linguistic creativity. For this purpose, Chomsky borrowed from the axiomatic-deductive method in mathematical logic, developed a generation earlier in its computational formulation—more concretely via Davis (1958; which had circulated as a draft much prior to its publication date). Chomsky systematized and generalized Emil Post's version of "recursive function theory" (see Post 1944), and eventually came to propose formal devices of his own ("transformations"; see the following).

Aside from these theoretical considerations pertaining to the precise structure of language and its implications, generative grammar from Chomsky's perspective always had a conceptual angle that informs the enterprise to this day: Syntax is seen as a natural system, somehow rooted in human psychology and biology. This point of view constituted the bulk of Chomsky's reaction to behaviorism, his later exploration of complex forms of biology, and, more generally, his insistence over six decades on approaching linguistic structure with the same sorts of tools and attitudes that one should assume for an intricate biological phenomenon, like adaptive immunity. All of Chomsky's work has centered on two fundamental questions:

(1) What is the correct characterization of someone who speaks a language? What kind of capacity is "knowledge of language"?
(2) How does this capacity arise in the individual? What aspects of it are acquired by exposure to relevant information ("learned"), and what aspects are present in advance of any experience ("wired in")?

Chomsky’s earliest work, in the 1950s, raised and focused on question (1), since explicit and comprehensive answers to that question had never been provided before. Chomsky’s answer posited a computational system in the human mind that provides statements of the basic phrase structure patterns of languages (phrase structure rules) and more complex operations for manipu- lating these basic phrase structures (transformations). This framework, and its direct descendants, fall under the general title Transformational Genera- tive Grammar (generative meaning explicit, in the sense of mathematics). In the 1960s, the research began to shift more toward question (2), and Chomsky called the theory that was developed the Standard Theory. Chomsky coined the term explanatory adequacy for putative answers to that question. A theory of language, regarded as one component of a theory of the human mind, must make available grammars for all possible human languages. To attain explanatory adequacy, the theory must in addition show how the learner selects the correct grammar from among all the available ones, based on restricted data. The theories of the 1950s and early 1960s made an infinite number of grammars available, so the explanatory problem was severe. Through the late 1960s and 1970s, to enhance explanatory adequacy, theorists proposed more and more constraints on the notion “possible human grammar”. Ross (1967) was a particularly influential and pioneer- ing study looking at locality restrictions (den Dikken and Lahne 2013). These moves were explicitly motivated by considerations of explanatory A Brief History of Generative Syntax 21 adequacy, though general considerations of simplicity also played a role. This culminated in the Principles and Parameters framework (Bošković 2013) and, more specifically, in the Government and Binding approach that Chomsky (1981) proposes. 
The latter led to a wide range of cross-linguistic research, since a core part of the program involved comparative syntax and used comparative data to help refine theoretical definitions of terms like government and binding.

At the same time as these developments took place, a number of researchers departed from Chomsky's specific approach. Generative Semantics, in particular, was a very prominent theory in the late 1960s; today some Generative Semantics ideas have returned, as we discuss in the following. In the early 1980s, nontransformational theories such as Lexical-Functional Grammar (Kaplan and Bresnan 1982; Sells 2013), Generalized Phrase Structure Grammar (Gazdar et al. 1985; Blevins and Sag 2013) and Tree Adjoining Grammar (Joshi, Levy and Takahashi 1975, Joshi 1985; see Frank 2013) were also developed. We say a bit more about these in the following and contextualize them to make it clear why they emerged and what the main differences are between these theories and the more mainstream Chomskyan theories.

In the late 1980s, Chomsky started to explore what has become known as the Minimalist Program, with its emphasis on simplicity in theorizing and on moving beyond explanatory adequacy in the sense of asking why the language faculty has the properties it does. This approach is most explicitly outlined in Chomsky (1995b). Recent and ongoing work by Chomsky (2000, 2001, 2004, 2007, 2008) and many others continues to develop this framework.

This chapter is organized as follows. Section 1.2 discusses the earliest generative approaches, namely, those explicated in Syntactic Structures (1957) and Aspects of the Theory of Syntax (1965). We examine some relevant differences between these two theories, and we discuss some general properties of transformations.
Section 1.3 discusses the syntax/semantics interface in early generative grammar and beyond, whereas Section 1.4 is an overview of how phrase structure has developed from the early days of generative grammar until today. In Section 1.5, we discuss the role of rules and filters versus principles in the evolving theories. Section 1.6 is concerned with derivations and the derivation versus representation issue. In Principles and Parameters theory, Chomsky explicitly introduced economy principles for the first time, and we give a summary of some of these in Section 1.7. A few concluding remarks are provided in Section 1.8.

1.2 The Earliest Generative Approaches: Syntactic Structures and Aspects

Chomsky's earliest work developed in reaction to the structuralist work mentioned in Section 1.1. As a student of Zellig Harris, Chomsky was very familiar with Harris's program, and he developed his own work in reaction to Harris (1951). Harris's transformations related one (surface) sentence to another. This approach was therefore not able to give any systematic explanation for the more abstract kind of phenomena Chomsky started to deal with in The Logical Structure of Linguistic Theory (LSLT, 1955) and Syntactic Structures. In order to deal with these phenomena, it is necessary to relate abstract structures to abstract structures. Let us now look at some of the characteristics of Chomsky's earliest work.

Infinity and structure are the fundamental characteristics of human language, and they can both be captured, in part, by way of a context-free phrase structure (PS) grammar. One such device (a Σ, F grammar in Chomsky's terminology) consists of

(3) a. A designated initial symbol (or a set thereof) (Σ);
    b. Rewrite rules (F), which consist of a single symbol on the left, followed by an arrow, followed by at least one symbol.

A derivation consists of a series of lines such that the first line is one of the designated initial symbols, and to proceed from one line to the next we replace one symbol by the sequence of symbols it can be rewritten as, until there are no more symbols that can be rewritten. For instance, given

(4) a. Designated initial symbol (Σ): S
    b. Rewrite Rules (F):
       S → NP VP
       NP → N
       VP → V
       N → John
       V → laughs

we can obtain a derivation as in (5):

(5) Line 1: S
    Line 2: NP VP
    Line 3: N VP
    Line 4: N V
    Line 5: John V
    Line 6: John laughs

Chomsky (1965) called rules like the last two in (4), which rewrite a particular nonterminal symbol as a single terminal symbol, lexical insertion rules—a distinction not made in the theories of Chomsky (1955, 1957). PS grammars capture constituent structure by introducing nonterminal (unpronounced) symbols. Given (5), we can connect each symbol with the symbol(s) it rewrites as. In this way we can trace back units of structure. After joining the symbols we can represent the derivation in the standard form of a tree as in (6a). Getting rid of symbols that are mere repetitions, we end up with the collapsed tree in (6b):

More technically, a phrase marker for a terminal string is the set of all strings occurring in any of the equivalent derivations of that string, where two PS derivations are equivalent if and only if they involve the same rules the same number of times (not necessarily in the same order). This is a result that Chomsky (1955) proved by showing that for two PS derivations to be equivalent, they have to collapse down to the same PS tree. See Section 1.4.1 for further discussion.
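The Σ, F derivation in (4)–(5) is mechanical enough to sketch in code. The following toy implementation is our own illustration, not anything from the text: the rule table and the leftmost-rewrite strategy are simplifying assumptions. Rewriting the leftmost rewritable symbol at each step yields a derivation that is equivalent to (5) in Chomsky's sense just discussed (same rules, applied the same number of times, in a different order), so both collapse to the tree in (6b).

```python
# A toy Sigma,F grammar in the spirit of (3)-(5). The rule table and the
# leftmost-rewrite strategy are illustrative assumptions, not the text's.
RULES = {
    "S": ["NP", "VP"],
    "NP": ["N"],
    "VP": ["V"],
    "N": ["John"],     # lexical insertion rules in the sense of Chomsky (1965)
    "V": ["laughs"],
}

def derive(line, log):
    """Rewrite the leftmost rewritable symbol until only terminals remain."""
    log.append(" ".join(line))
    for i, sym in enumerate(line):
        if sym in RULES:
            return derive(line[:i] + RULES[sym] + line[i + 1:], log)
    return line, log  # no nonterminal left: a terminal string

string, lines = derive(["S"], [])
for n, l in enumerate(lines, 1):
    print(f"Line {n}: {l}")
```

The six lines this prints differ from (5) only in the middle of the derivation (N is rewritten before VP here), which illustrates the equivalence notion: both derivations use exactly the same rules once each.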

1.2.1 Transformations and Generalized Transformations

Finite-state machines can easily capture infinity, one of the two fundamental characteristics of human language (see Lasnik 2000 for much discussion), and if we move one level up on the Chomsky hierarchy (Chomsky 1956), we can avail ourselves of PS grammars. These grammars are more powerful devices that capture both infinity and structure.

Interestingly, the theory in both Syntactic Structures and The Logical Structure of Linguistic Theory (Chomsky 1955, henceforth LSLT) did not have recursion in the base, that is, PS rules, or sequences of them, that allow self-embedding. Instead, complicated structures, hence infinity, were created by special operations, called generalized transformations, which put together the simple structures generated by the PS rules. For example, to derive John knew that Mary understood the theory, first the separate structures underlying John knew it and Mary understood the theory were generated by the method described above; then a generalized transformation inserted the second of these structures into the first. Metaphorically, a generalized transformation grafts one tree onto another. Put differently, in this theory recursion was in the "transformational component".1 In more recent times, Tree Adjoining Grammar (TAG) developed this approach further (Joshi, Levy and Takahashi 1975, Joshi 1985; see Frank 2013) by arguing for a system of tree rewriting. In this theory, a derivation works on a set of predefined pieces of tree structure. These pieces are called elementary trees, and they are expanded and combined with one another so that structures are built through generalized transformations. Still more recently, Frank (2002) suggested a way to integrate the Minimalist approach to grammar suggested by Chomsky with TAG.
The structures created by phrase structure rules and generalized transformations could be altered by singulary transformations.2 Singulary transformations apply to single P-markers and derived P-markers, which is to say that they apply to one tree. Chomsky showed how singulary transformations can explain the relatedness between, for example, statements and corresponding questions:

(7) a. Susan will solve the problem. → Will Susan solve the problem?
    b. John is visiting Rome. → Is John visiting Rome?

The members of each pair come from the same initial P-marker, with singulary transformations producing the divergent surface shapes. One of the great triumphs of the analysis of such pairs in LSLT is that it was able to use the same singulary transformation for the interrogative sentences in (7) and the superficially very different one in (8).

(8) Susan solved the problem. → Did Susan solve the problem?

This was a significant achievement since the relations are felt by native speakers to be parallel, an otherwise mysterious fact. Chomsky also showed how, in numerous situations, even properties of individual sentences cannot be adequately characterized without recourse to the descriptive power of singulary transformations. One major example involved the sequences of English auxiliary verbs and the inflectional suffixes associated with them. The revolutionary insight here (and also in the analysis of (7)-(8)) was that these bound morphemes, especially the one carrying tense and agreement, are autonomous items as far as the syntax is concerned, capable of undergoing syntactic operations independently until eventually uniting with a verbal element (a process that came to be called "Affix Hopping"). The Affix Hopping transformation rises above the limitations of phrase structure (which at best can simply list the possible sequences) and simultaneously captures the generalizations about linear ordering of the elements, their morphological dependencies, the location of finite tense, the form of inversion and sentence negation, and the distribution of auxiliary do.3 There was, thus, considerable motivation for this new device relating more abstract underlying structures to more superficial surface representations. In fact, one of the major conceptual innovations in the entire theory is the proposal that a sentence has not just one structure, closely related to the way it is pronounced, but an additional abstract structure (potentially very different from the superficial one), and intermediate structures between these two. This is fundamental to all the analyses in the Chomskyan system.

The organization of the syntactic portion of the grammar is as follows: Application of the phrase structure rules creates a P-marker, or, in the case of a complex sentence, a set of P-markers.
Then successive application of transformations (singulary and generalized) creates successive phrase structure representations (derived P-markers), culminating in a final surface representation. The syntactic levels in this theory are that of phrase structure and that of transformations, the latter giving a history of the transformational derivation (the successive transformational steps creating and affecting the structure). The representations at these levels are the P-marker and the T-marker respectively. The final derived P-marker is the input to phonological interpretation, and the T-marker is the input to semantic interpretation.4

Let us consider some of the formal properties of transformations as they are stated in Syntactic Structures. Each transformation has a structural analysis (SA) and a structural change (SC). The SA characterizes the class of structures to which the transformation applies. The SC specifies the alterations that the process carries out. An SA is a sequence of terms or a set of sequences of terms. Elements that can constitute a term are listed in a general fashion in (9):

(9) a. any sequence of symbols (terminals, nonterminals, and variables) or
    b. a set of sequences of symbols or
    c. a Boolean combination of these

SCs are able to carry out the following elementary operations:

(10) a. adjunction of one term to another (to the right or the left)
     b. deletion of a term or sequence of terms
     c. adjunction of new material that was not in the structure before to a term
     d. permutation
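As a concrete illustration of how an SA and SC might be formalized, here is a small sketch of the question transformation in (7a), whose SC consists of the single elementary operation of permutation (10d). The encoding is entirely our own simplification: terms are hand-segmented (category, string) pairs rather than pieces of a derived P-marker, and the function names are invented for the example.

```python
# Hypothetical encoding, not Chomsky's formalism: a term is a (category,
# string) pair, an SA is a sequence of category labels, and the SC permutes
# terms 1 and 2 -- the elementary operation in (10d).
def fits(sa, terms):
    """The SA characterizes the class of structures the rule applies to."""
    return [cat for cat, _ in terms] == sa

def question_sc(terms):
    """SC for (7): permute the first two terms (subject NP and auxiliary)."""
    return [terms[1], terms[0]] + terms[2:]

statement = [("NP", "Susan"), ("Aux", "will"), ("VP", "solve the problem")]
if fits(["NP", "Aux", "VP"], statement):
    print(" ".join(word for _, word in question_sc(statement)) + "?")
```

Restricting the rule to structures matched by the SA, rather than to listed strings, is what lets one transformation cover the whole class of pairs in (7).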

An SC for Chomsky was a set of elementary operations. Other properties of transformations are that they are ordered and that they are specified as being optional or obligatory. For some transformations it is crucial that we be allowed but not required to apply them; for others it is necessary that we be required to apply them. Last, the transformations in Syntactic Structures also occasionally had a global dependency: They can refer back to any other stage of a derivation. We do not go through an example of an early generative syntactic analysis here but instead refer the reader to Lasnik (2000: 53ff.) for a thorough illustration of several early transformations.

1.2.2 Chomsky (1965)

Chomsky (1965), henceforth Aspects, presented a revised conception of the grammar, based on an alternative way of constructing complex sentences, one that Chomsky argued was an advance in terms of simplicity and explanatory adequacy over the one in LSLT. In the LSLT framework, as discussed earlier, the phrase structure rules produce simple monoclausal structures, which can then be merged together by generalized transformations. Generalized transformations were thus the recursive component of the grammar, the one responsible for the infinitude of language. In the alternative view, the phrase structure rule component itself has a recursive character. Consider the complex sentences in (11):

(11) a. Mary reads books.
     b. John thinks that Mary reads books.
     c. Susan said John thinks Mary reads books.

By adding a recursive “loop” to a standard set of phrase structure rules, we can directly create the possibility of ever longer sentences. Such a rule is given in (12).

(12) VP → V S
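To see how the recursive loop in (12) suffices for unbounded embedding, here is a small sketch of our own (with a toy, hand-picked lexicon) that generates the deepest sentence in (11) by expanding VP → V S until a base clause is reached:

```python
# Illustrative only: each clause is S -> NP VP; the recursive loop VP -> V S
# of (12) embeds a further clause until the innermost VP bottoms out.
CLAUSES = [("Susan", "said"), ("John", "thinks"), ("Mary", "reads books")]

def sentence(depth=0):
    subj, pred = CLAUSES[depth]
    if depth == len(CLAUSES) - 1:
        return f"{subj} {pred}"                    # innermost clause: (11a)
    return f"{subj} {pred} {sentence(depth + 1)}"  # recursive loop: VP -> V S

print(sentence())
```

Extending CLAUSES extends the output without any change to the rules, which is the sense in which recursion now resides in the base rather than in generalized transformations.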

Under this approach to sentence embedding, unlike that in LSLT, there is one unified structure underlying a sentence prior to the operation of any syntactic transformations. This structure is the result of application of the phrase structure rules and lexical insertion transformations which insert items from the lexicon into the skeletal structure.5 Chomsky argued in Aspects that this underlying structure, which he there named "Deep Structure", is the locus of important generalizations and constitutes a coherent level of representation. Let us say a bit more about the latter concept before we move on. Levels of representation were introduced into the theory in the following way in LSLT:

We define, in general linguistic theory, a system of levels of representation. A level of representation consists of elementary units (primes), an operation of concatenation by which strings of primes can be constructed, and various relations defined on primes, strings of primes, and sets and sequences of these strings. Among the abstract objects constructed on the level L are L-markers that are associated with sentences. The L-marker of a sentence S is the representation of S on the level L. A grammar of a language, then, will characterize the set of L-markers for each level L and will determine the assignment of L-markers to sentences.
(Chomsky 1955/1975: 6)

The child learning a language is assumed to bring knowledge of the levels to bear on the task of learning. That is, the child must learn properties of the language at each level but knows the levels in advance; hence, he or she knows what to look for. The levels are part of Universal Grammar. Of course, the linguist does not know in advance of research what the levels are. Determining them is a scientific question, one of biological psychology. Throughout the years, Chomsky and others have devoted considerable attention to determining just what the levels of representation are in the human language faculty. In LSLT, the levels were considered to be phonetics, phonemics, word, syntactic category, morphemics, morphophonemics, phrase structure, and transformations. Throughout the years, the levels have changed in important and interesting ways.

Chomsky's major arguments for the new level, Deep Structure, in Aspects were that it resulted in a simpler overall theory, and at the same time it explained the absence of certain kinds of derivations that seemed not to occur (or at least seemed not to be needed in the description of sentences of human languages).
Taking the second of these points first, Chomsky argued that while there is extensive ordering among singulary transformations (situations where a derivation produces an unacceptable sentence if two transformations are applied in reverse order), “there are no known cases of ordering among generalized embedding transformations although such ordering is permitted by the theory of Transformation-markers” (Chomsky 1965: 133; see also Fillmore 1963, Lees 1963). Furthermore, while there are many cases of singulary transformations that must apply to a constituent sentence before it is embedded or that must apply to a matrix sentence after another sentence is embedded in it, “there are no really convincing cases of singulary transformations that must apply to a matrix sentence before a sentence transform is embedded in it” (Chomsky 1965: 133).

As for the first argument, Chomsky claimed that the theory of transformational grammar is simplified by this change, since the notions “generalized transformation” and “Transformation-marker” are eliminated entirely. The P-markers in the revised theory contain all of the information of those in the LSLT version, but they also indicate explicitly how the clauses are embedded in one another, that is, information that had been provided by the embedding transformations and T-markers.

This change in the theory of phrase structure, which has the effect of eliminating generalized transformations, also has consequences for the theory of singulary transformations. As indicated previously, in the Aspects theory, as in LSLT, there is extensive ordering among singulary transformations. In both frameworks, the set of singulary transformations was seen as a linear sequence: an ordered list.
Given the Aspects modification, this list of rules applies cyclically, first operating on the most deeply embedded clause, then the next most deeply embedded, and so on, working up the tree until they apply on the highest clause, the entire generalized P-marker. Thus, singulary transformations apply to constituent sentences “before” they are embedded and to matrix sentences “after” embedding has taken place. “The ordering possibilities that are permitted by the theory of Transformational-markers but apparently never put to use are now excluded in principle” (Chomsky 1965: 135).
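The cyclic regime can be sketched in a few lines of code. This is our own toy illustration, not a formalization from Aspects: clauses are nested tuples, transformations are functions over a list of parts, and the hypothetical invert rule stands in for question inversion triggered by a Q marker.

```python
# Toy sketch (ours, not from Aspects) of cyclic rule application:
# the ordered list of singulary transformations runs on the most deeply
# embedded clause first, then on each successively higher clause.

def cyclic_apply(clause, transformations):
    """clause is ("S", [parts]); a part is a word or an embedded clause."""
    label, parts = clause
    # deepest cycle first: transform embedded clauses before their matrix
    parts = [cyclic_apply(p, transformations) if isinstance(p, tuple) else p
             for p in parts]
    # then run the ordered list of transformations on the current cycle
    for t in transformations:
        parts = t(parts)
    return (label, parts)

def invert(parts):
    """Hypothetical question inversion: a Q marker fronts the auxiliary."""
    if parts[:1] == ["Q"]:
        subject, aux, *rest = parts[1:]
        return [aux, subject] + rest
    return parts

print(cyclic_apply(("S", ["Q", "John", "will", "leave"]), [invert]))
# → ('S', ['will', 'John', 'leave'])
```

Because each embedded clause is transformed on its own cycle before the matrix clause is touched, singulary rules automatically apply to constituent sentences “before” embedding and to matrix sentences “after” it, which is the ordering pattern described above.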

1.3 The Syntax/Semantics Interface in Early Generative Grammar and Beyond

An important question for any syntactic theory is how syntax relates to semantics: what the precise connection is between form and meaning. In LSLT, the T-marker contains all of the structural information relevant to semantic interpretation. Katz and Postal (1964) proposed a severe restriction on just how this structural information could be accessed. In particular, they postulated that the only contribution of transformations to semantic interpretation is that they interrelate P-markers. The slogan at the time was that “transformations do not change meaning”. As Chomsky put it, (generalized) transformations combine semantic interpretation of already interpreted P-markers in a fixed way. In the revised theory, which Chomsky called the Standard Theory, the initial P-marker, now a Deep Structure, then contains just the information relevant to semantic interpretation. To summarize the model,

the syntactic component consists of a base that generates Deep Structures and a transformational part that maps them into Surface Structures. The Deep Structure of a sentence is submitted to the semantic component for semantic interpretation, and its Surface Structure enters the phonological component and undergoes phonetic interpretation. The final effect of a grammar, then, is to relate a semantic interpretation to a phonetic representation—that is, to state how a sentence is interpreted. (Chomsky 1965: 135–136)

To carry out this program, Chomsky (1965) adopted the proposal of Katz and Postal (1964) that many seemingly “meaning-changing” optional transformations of LSLT be replaced by obligatory transformations triggered by a marker in the Deep Structure. To take one example, earlier we noted that in LSLT, simple questions and the corresponding statements are derived from the same initial P-marker. In the revision, those initial P-markers would be very similar but not identical. The former would contain a marker of interrogation that would both signal the difference in meaning and trigger the inversion that results in the auxiliary verb appearing at the front of the sentence. Katz and Postal also noted that there are languages such as Japanese in which the Q-marker is spelled out as a separate morpheme.

At this point in the development of the theory, the model can be graphically represented as follows, with Deep Structure doing the semantic work formerly done by the T-marker:

(13)  Deep Structure ⇒ Semantic Interpretation
          |
      Transformations (operating cyclically)
          |
      Surface Structure ⇒ Phonetic Interpretation (via the “sound-related” levels of morphophonemics, phonemics, and phonetics)

Some researchers soon challenged this framework. Generative Semantics built on the work by Katz and Postal (1964), and especially the claim that Deep Structure determines meaning (Lakoff 1971). For Generative Semantics, syntax is not the primary generative component. Rather, each meaning is represented by a different deepest representation (much more abstract than Chomsky’s Deep Structure). On this view, transformations can, and often must, be far more complex and powerful than those in the Aspects model. There was intense debate about these issues in the late 1960s and into the 1970s before Generative Semantics largely disappeared from the scene, partly because the main practitioners came to develop different interests. However, central aspects of Generative Semantics have survived in different contemporary frameworks such as Cognitive Linguistics, Construction Grammar, and generative grammar including Chomskyan approaches. For example, Generative Semantics assumed that causative structures have a cause morpheme in the syntax, which is an approach that is found in recent work (see, e.g., Harley 1995). Baker’s (1988) Uniformity of Theta Assignment Hypothesis (UTAH), which states that identical thematic relationships are represented by identical structural relationships, is, in essence, another example of a proposal from Generative Semantics that has returned. Yet another, which we discuss later, is the elimination of Deep Structure as a level of representation.

Let us now return to the chronological history. By the time Aspects was published, there were already questions about initial structure as the sole locus of semantic interpretation. To take just one example, Chomsky (1957) observed that in sentences with quantifiers (see Dayal 2013), the derived structure has truth conditional consequences.
(14a) may be true while (14b) is false, for instance if one person in the room knows only French and German, and another only Spanish and Italian (see also Newmeyer 2013, ex. (13)):

(14) a. Everyone in the room knows at least two languages.
     b. At least two languages are known by everyone in the room.

In the theory of Chomsky (1957), this is not problematic since semantic interpretation is based on the T-marker. However, in the Aspects framework, there is a problem, as Chomsky acknowledges. He speculates that the interpretive difference between (14a) and (14b) might follow from discourse properties rather than grammatical ones. The general problem, though, came to loom larger and larger, leading to a theory in which both Deep Structure and Surface Structure contribute to semantic interpretation. The core idea was introduced by Jackendoff (1969) and then elaborated by Chomsky (1970a; see also, e.g., Bach 1964, McCawley 1968), and it is clearly different from the view held by Generative Semantics. In this so-called Extended Standard Theory the contribution of Deep Structure concerns “grammatical relations” such as understood subject and object of (cf. fn. 5). The contribution of Surface Structure concerns virtually all other aspects of meaning, including scope, as in the examples mentioned just above, anaphora, focus and presupposition.

Alongside these questions about Deep Structure as the sole locus of semantic interpretation, there were also challenges to its very existence. Postal (1972) argued that the best theory is the simplest, which, by his reasoning, included a uniform set of rules from semantic structure all the way to surface form, with no significant level (i.e., Deep Structure) between. And McCawley (1968) explicitly formulated an argument against Deep Structure on the model of Morris Halle’s (1959) famous argument against a level of taxonomic phonemics. McCawley’s argument is based on the interpretation of sentences with respectively, such as (15):

(15) Those men love Mary and Alice respectively.

McCawley argues that a respectively transformation relates (16) to (15).

(16) That man (x) loves Mary and that man (y) loves Alice.

For McCawley, this is a syntactic operation since it involves conjunction reduction. McCawley then notes that there is a corresponding semantic rela- tion between (17) and (18).

(17) ∀x:x∈M [x loves x’s wife]
(18) These men love their respective wives.

For generative semanticists, such as McCawley, since there is no syntactic level of Deep Structure, there is no a priori need to separate the two operations involved in (15) and (16) and in (17) and (18). The deepest level of representation is a semantic representation. But in a theory with Deep Structure, the syntactic operation involved in (15) and (16) would necessarily be post–Deep Structure, while the operation implicated in (17) and (18) would necessarily be in a different module, one linking a syntactic representation with a semantic representation. Purportedly, then, a generalization is missed, as in Halle’s classic argument.

Chomsky (1970b) considers this argument but rejects it, claiming that it rests on an equivocation about exactly what the relevant rule(s) would be in the theories in question. Chomsky points out that it is possible to give a more abstract characterization of the transformations such that one is not syntactic and the other is not semantic. Therefore, there is no argument against Deep Structure here. Chomsky does, however, accept McCawley’s contention that it is necessary to provide justification for the postulation of Deep Structure. But he observes that the same is true of Surface Structure or phonetic representation, or, in fact, any theoretical construct. How can such justification be provided?

There is only one way to provide some justification for a concept that is defined in terms of some general theory, namely, to show that the theory provides revealing explanations for an interesting range of phenomena and that the concept in question plays a role in these explanations. (Chomsky 1970b: 64)

As far as Chomsky was concerned, this burden had been met, especially by the Aspects analysis of the transformational ordering constraints discussed earlier.6

One small simplification in the Extended Standard Theory model was the result of a technical revision concerning how movement transformations operate (Wasow 1972, Chomsky 1973, Fiengo 1974, 1977). Trace theory proposed that when an item moves, it leaves behind a “trace”, a silent placeholder marking the position from which movement took place. The motivation for this was that in important respects, movement gaps behave like positions that are lexically filled, an argument first made in Wasow (1972) and Chomsky (1973). Under trace theory, the importance of Deep Structure (D-structure) for semantic interpretation is reduced, and ultimately eliminated. Once Surface Structure (S-structure) is enriched with traces, even grammatical relations can be determined at that derived level of representation. Using the terms LF (Logical Form) for the syntactic representation that relates most directly to the interpretation of meaning and PF (Phonetic Form) for the one relating most directly to how sentences sound, we have the so-called T-model in (19; also called the [inverted] Y-model), which was at the core of Government and Binding theory.

(19)  D-structure
           |
     Transformations
           |
      S-structure
         /    \
       PF      LF

The precise nature of the connection between the syntactic derivation and semantic and phonological interfaces has been a central research question throughout the history of generative grammar. In the earliest generative model, the interface is the T-marker, which includes all the syntactic structures created in the course of the derivation. Subsequent models had the following interfaces with semantics: The Standard Theory had D-structure and the Extended Standard Theory had D-structure and S-structure, whereas Government and Binding and early Minimalism had LF. Chomsky’s most recent model even dispenses with LF as a level in the technical sense (Chomsky 2004). The Minimalist approach to structure building, where Merge is the basic operation, is much more similar to that of the 1950s than to any of the intervening models, which is to say that interpretation in the Minimalist model also could be more like that in the early model, distributed over many structures. In the late 1960s and early 1970s, there were already occasional arguments for such a model from phonological interpretation, as well as semantic interpretation. For example, Bresnan (1971) argued that the phonological rule responsible for assigning English sentences their intonation contour (see Büring 2013) applies cyclically, following each cycle of transformations, rather than applying at the end of the entire syntactic derivation. There were similar proposals for semantic phenomena involving scope and anaphora put forward by Jackendoff (1972). Chomsky (2000, 2001, 2004) argued for a general instantiation of this distributed approach to phonological and semantic interpretation, based on ideas of Epstein (1999) and Uriagereka (1999), who called the approach “multiple Spell-Out”.
Simplifying somewhat, at the end of each cycle (or “phase” as it has been called for the past ten years) the syntactic structure created thus far is encapsulated and sent off to the interface components for phonological and semantic interpretation. Thus, although there are still what might be called PF and LF components, there are no syntactic levels of PF and LF. Epstein argued that such a move represents a conceptual simplification, and both Uriagereka and Chomsky provided some empirical justification.

We can view this conceptual simplification similarly to the elimination of D-structure and S-structure. Chomsky (1993) argued that both D-structure and S-structure should be dispensed with. Both levels are theory-internal, highly abstract, and they are not motivated by conceptual necessity, as the semantic and phonological interfaces to a much greater extent are. Another way to put this is to say that the motivation for D-structure and S-structure is empirical. Chomsky argued that, contrary to appearances, it is possible to cover the same or even more empirical ground without postulating either S-structure or D-structure.7 The role of syntactic derivation becomes even more central on this view because there are no levels of representation at all. The syntax interfaces directly with sound and meaning.

1.4 The Development of Phrase Structure

In this section, we provide a history of the development of phrase structure (see also Fukui 2001, and Bošković 2013, Sells 2013, Blevins and Sag 2013, and Frank 2013). We start with a brief recap of PS grammars and then move on to different versions of X-bar theory. Lastly, we discuss the approach to phrase structure within the Minimalist Program: Bare Phrase Structure. Our focus throughout is mainly on the Chomskyan versions of phrase structure, but we also note where and why other theories developed.

1.4.1 Phrase Structure Grammars

Chomsky (1955, 1957) developed a theory of phrase structure which made use of context-free PS grammars ([Σ, F] grammars). In addition, the theory was based on derivations and equivalence classes of such derivations. Chomsky (1957: 27–29, 87) defines phrase structure set-theoretically as in (20):

(20) Given a particular [Σ, F] grammar and a particular terminal string (i.e., string of terminal symbols):
     a. Construct all of the equivalent PS derivations of the terminal string.
     b. Collect all of the lines occurring in any of those equivalent derivations into a set. This set is the phrase marker (PM), a representation of the phrase structure of the terminal string.

The purpose of a PM is to tell us for each portion of the terminal string whether that portion comprises a constituent or not, and, when it comprises a constituent, what the “name” of that constituent is. Chomsky makes the following empirical claim: All and only what we need a PM to do is to tell us the “is a” relations between portions of the terminal strings and nonterminal symbols. Anything that tells us those and only those is a perfectly adequate PM; anything that does not is inadequate as a PM. The PS rules can generate a graph-theoretic representation like the one in (21; see Lasnik 2000: 29ff. for an illustration of how this works):

(21)      S
         /  \
       NP    VP
        |     |
      John    V
              |
            left

The tree tells us everything we have established concerning the “is a” relations. Note, however, that the tree encodes information that goes beyond the “is a” relations. The tree tells us that a VP is rewritten as V and that the V is rewritten as left. It is an empirical question whether we need this additional information or not, say, for phonological, semantic, or further syntactic operations. If we do, then this particular set-theoretic model has to be rejected. If we do not, then the model is accepted since we would like the minimal theory that does what has to be done. We will see later that the field has typically assumed that the set-theoretic model needs to be enriched in various ways.

Lasnik and Kupin (1977) showed that the algorithm for computing “is a” relations needs recourse only to the terminal string and the other members of the PM that consist of exactly one nonterminal symbol surrounded by any number of terminal symbols (what Lasnik and Kupin called monostrings). Hence, Lasnik and Kupin proposed a construct called a reduced phrase marker, which includes only the terminal strings and the monostrings. See Lasnik (2000: section 1.2.6.1) for more discussion.
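Lasnik and Kupin’s idea lends itself to direct computation. The following Python sketch is our own illustration (the function name and the “John left” example are our assumptions, not from their paper): each monostring u·X·v asserts that the portion of the terminal string between prefix u and suffix v “is a” X.

```python
# Sketch (ours) of reading "is a" relations off a reduced phrase marker.
# Strings are represented as lists of symbols; a monostring contains exactly
# one nonterminal symbol surrounded by terminal symbols.

def is_a_relations(terminal, monostrings, nonterminals):
    relations = []
    for m in monostrings:
        # locate the unique nonterminal in the monostring
        idx = next(i for i, sym in enumerate(m) if sym in nonterminals)
        prefix, x, suffix = m[:idx], m[idx], m[idx + 1:]
        start, end = len(prefix), len(terminal) - len(suffix)
        # the spanned portion of the terminal string "is a(n)" x
        if terminal[:start] == prefix and terminal[end:] == suffix:
            relations.append((tuple(terminal[start:end]), x))
    return relations

pm = [["S"], ["NP", "left"], ["John", "VP"], ["John", "V"]]
print(is_a_relations(["John", "left"], pm, {"S", "NP", "VP", "V"}))
# → [(('John', 'left'), 'S'), (('John',), 'NP'), (('left',), 'VP'), (('left',), 'V')]
```

Note that nothing in the reduced phrase marker says that VP is rewritten as V: only the “is a” relations are recoverable, which is precisely the empirical question raised above.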

1.4.2 X-bar Theory

One problem in LSLT and Syntactic Structures is that the theory developed there allows PS rules like (23) alongside ones like (22) (Lyons 1968):

(22) NP → . . . N . . .
(23) VP → . . . N . . .

But there do not seem to be rules like (23). Why is this? The formalism allows both rules, and the evaluation metric (Chomsky 1965) judges them equally costly. Chomsky (1970a) was an attempt to come to grips with this problem. There it is proposed that there are no individual PS rules of the sort that did so much work in Syntactic Structures and even in Aspects. Rather, there is what is now known as the X-bar schema (see also Corver 2013). X is a variable, ranging over category names such as V, N, and so on. Here is the version of X-bar theory that Chomsky (1970a) presented (see also Emonds 1976 and Jackendoff 1977 for much relevant discussion).

(24) X′ → . . . X . . .
     X′′ → . . . X′ . . .

X′ and X′′ are true complex symbols. Keep in mind that in Syntactic Structures NP looked like it had something to do with N, but in that system it really did not. NP was just one symbol that was written for mnemonic purposes with two letters. In X-bar theory, a category label is a letter plus a number of bars (originally written as overbars—e.g., X̅—but later written as primes—e.g., X′—for typographical convenience). It can be thought of as an ordered pair: X is ⟨X, 0⟩, X′ is ⟨X, 1⟩, and X′′ is ⟨X, 2⟩. X-bar theory immediately explains why there are no rules like (23). This is because phrases have heads; that is, they are endocentric, which is to say that phrases are projections of heads.

Chomsky also introduced the relational notions complement and specifier. A complement is a sister to a head. He argued that the notion complement does not play any role in transformations (Chomsky 1970a: 210), that is, complements cannot be the target qua complements of any transformational operations. At this point, there were general rules like (29) that subsumed rules like the ones in (26) through (28):

(26) NP → N Comp
(27) VP → V Comp
(28) AP → A Comp
(29) Comp → NP, S, NP S, NP Prep-P, Prep-P Prep-P, etc.

The rules in (29) should instead be replaced with the rule in (30):

(30) X′ → . . . X . . .

The dots in (30) indicate that there are no restrictions on what can be a complement and where the complement is placed vis-à-vis the head. Chomsky then proposes that in order to “introduce further terminological uniformity, let us refer to the phrase associated with N′, A′, V′ in the base structure as the ‘specifier’ of these elements” (Chomsky 1970a: 210):

(31) X′′ → [Spec, X′] X′

On this view, a specifier encompasses a heterogeneous set as it contains a variety of pre-head elements like auxiliaries in SpecV′, determiners in SpecN′, adverbials in SpecV′ and degree modifiers in SpecA′. As Jackendoff (1977: 14) points out, it is not clear whether Chomsky considers the specifier to be a constituent or an abbreviation for a sequence of constituents, like Comp. The diagrams in Chomsky (1970a) show specifiers as constituents. Jackendoff (1977) argues against specifiers being constituents whereas Hornstein (1977) defends the claim that they are. However, beyond being a constituent and bearing a geometrical relation to a head, it is not clear what the defining characteristics of a specifier are (see also George 1980: 17).

Later a biconditional version of X-bar theory was developed, namely, that phrases have heads, and heads project. Whenever a structure has an XP, it has an X (this is what Chomsky 1970a proposed), and whenever a structure has an X, it has an XP. In Chomsky (1970a), the initial rule of the base grammar is as in (32):

(32) S → N′′ V′′

This means that X-bar theory is not fully general: S and S′ (the latter the larger clause including a sentence-introducing complementizer like that) do not fit into the theory in any neat way.8 These labels are not projections of heads, unlike the other labels in the system. However, it is worth bearing in mind that Bresnan (1970) suggests that complementizers are essentially specifiers of sentences through the rule in (33):

(33) S′ → Comp S

This is in line with the general approach to specifiers during the 1970s, as complementizers here are analyzed on a par with auxiliaries, which were also specifiers.

It may be worth pausing to reflect on what pushed Chomsky to create X-bar theory.

The development of X′ theory in the late 1960s was an early stage in the effort to resolve the tension between explanatory and descriptive adequacy. A first step was to separate the lexicon from the computations, thus eliminating a serious redundancy between lexical properties and phrase structure rules and allowing the latter to be reduced to the simplest (context-)free form. X′ theory sought to eliminate such rules altogether, leaving only the general X′ theoretic format of UG. The problem addressed in subsequent work was to determine that format, but it was assumed that phrase structure rules themselves should be eliminable. (Chomsky 1995a: 61)

The attempt was to do away with redundancies in favor of larger generalizations. Another way to say this is that when we impose strict constraints, the PS rules themselves vanish. It is possible to view the change from phrase structure rules to X-bar theory in the same way as Chomsky’s (1973) generalization of some of Ross’s (1967) locality “island” constraints on movement (see den Dikken and Lahne 2013). In both cases, instead of more or less idiosyncratic properties, we get general properties that hold across categories. Baltin (1982: 2) puts the general development this way:

The history of transformational generative grammar can be divided into two periods, which can be called expansion and retrenchment. During the early “expansion” period, a primary concern was the description of grammatical phenomena. [. . .] The theory was correspondingly loose, and consequently failed to provide an adequate solution to the projection problem.9 [. . .] During the retrenchment period [. . .] the focus of attention shifted from the construction of relatively complex [. . .] statements to the construction of a general theory of grammar, restricted as to the devices it employed, which could be ascribed to universal grammar.

Chomsky (1970a) only discusses NPs, VPs and APs, not PPs. One goal of Jackendoff (1977) is to bring PPs under the X-bar theoretic fold. So at the end of the 1970s, a quite general picture of phrase structure had started to emerge.

Before we move on to the early Principles and Parameters view of phrase structure, it is worth considering a general problem that both Chomsky (1970a) and Jackendoff (1977) face. The problem has been brought up most clearly by Stuurman (1985). Stuurman’s goal is to defend what he calls “the single-projection-type hypothesis”. Multiple projection types (X, X′, X′′, Xn), as assumed in Chomsky’s and Jackendoff’s works, are banned. Stuurman’s thesis is that only one distinction is made internal to projections: the distinction between X0 and X1, or put differently, between a head and everything else. Stuurman argues that this provides a more restrictive phrase structure theory and a theory that is more easily learnable. Here is an example that he uses to make his claim.

In English, only the first hierarchical level projected from X0 can dominate an NP.

(34) a. he [[met his wife] in Italy]
     b. *he [[met in Italy] his wife]

Stuurman (1985: 8) points out that if we assume multiple projection-types, the facts in (34) can easily be captured directly at the level of PS as follows:

(35) a. Vi → . . . Vj . . ., where . . . ≠ NP, i > j ≥ 1
     b. V1 → . . . V0 . . ., where . . . = NP, . . .

These restrictions are descriptively adequate, but as Stuurman stresses, they do not explain how a child can learn the distribution of NPs. Put differently, Universal Grammar (UG) does not provide a rationale for why the constraints are the way they are: Why should UG not allow NP under Vi and exclude NP under V1? Unless the rules in (35) are universal, children need access to negative data (i.e., that (34b) is bad), which they by assumption do not have access to.10

Stuurman presents a different analysis where there is only one projection type. His theory, which we do not flesh out here, allows for both the structure in (36a) and (36b):

Here one needs an independent principle that filters out the structure in (36b). This structure has an NP that is not dominated by the first X1 up from X0. Stuurman argues that this filtering condition can be associated with an adjacency condition on Case Theory, following Stowell (1981) (see Polinsky 2013 for more discussion). That is, being a Case assigner is a lexical property, thus a property of X0, not of X1. (36b) is therefore ruled out independently of PS rules, as in Stowell’s work.11 Stuurman presents additional arguments for the single projection hypothesis. The point is that the view emerging in the late 1970s had important flaws, as it was too flexible and not principled enough. In the early 1980s, these flaws were addressed.

As research developed during the 1970s and 1980s, more and more of the elements that Chomsky and Jackendoff had analyzed as specifiers came to be analyzed as heads of particular functional projections (see also Abney 1987). As Chametzky (2000) points out, a notion of specifier emerged with the following characteristics: (1) typically an NP, (2) it bears a certain relationship with the head. Stowell (1981: 70) summarizes the general characteristics of X-bar theory as follows:

(37) a. Every phrase is endocentric.
     b. Specifiers appear at the XP-level; subcategorized complements appear within X′.
     c. The head always appears adjacent to one boundary of X′.
     d. The head term is one bar-level lower than the immediately dominating phrasal node.
     e. Only maximal projections may appear as non-head terms within a phrase.

These were further developed during the Government and Binding era in the 1980s. Here we focus on Chomsky (1986) since that work presents X-bar theory as it is best known. Chomsky (1986, henceforth Barriers) provides a generalization of X-bar structure, though attempts had already been made in Chomsky (1981), Stowell (1981) and den Besten (1983), to mention the most important works. As we have seen, prior to Barriers, the maximal projections were VP, NP, AP and PP. In addition, there was S, which gets rewritten as NP Infl VP, and S′, which gets rewritten as Comp S. Comp includes at least C and wh-expressions. The problem is that S does not conform to X-bar theory. It is not endocentric since it has no head, which means that there is no projection line from a head to a maximal projection. S′ is also not uniformly endocentric since when Comp is filled by phrasal material, it is not the head of S′. Because of these problems, Stowell (1981: chapter 6) suggests that the head of S is Infl, as illustrated in (38). This is very similar to Williams (1981: 251), who suggests that S is headed by Tense:

Once IP replaces S, a natural step is to reconsider S′. Stowell (1981: chapter 6) proposes that C is the head of S′. The optional specifier then becomes the target of wh-movement. We then have the structure in (39) (see also Chomsky 1986, and Corver 2013, sect. 5).

With this in place, it is possible to formulate restrictions on movement based on what can appear in a head position and what can appear in a specifier position; compare with Travis (1984) and Rizzi (1990).

The reanalysis of S and S′ paves the way for a generalization of X-bar theory. Chomsky (1986: 3) proposes that X-bar theory has the general structure in (40), where X* stands for zero or more occurrences of some maximal projection and X = X0.12

(40) a. X′ = X X′′*
     b. X′′ = X′′* X′

Koizumi (1995: 137) argues that the traditional X-bar schema can be seen as expressing three claims, as given in (41):

(41) a. Asymmetry: A node is projected from only one of its daughters.
     b. Binarity: A node may have at most two daughters.
     c. Maximality: A head may project (at most) two non-minimal projections.

It should be mentioned that (40) does not force binarity, since a node may have more than two daughters. One can either restrict X-bar theory so that it does observe binarity by hardwiring it into the X-bar theory, or, for example, follow the proposal of Kayne (1984, 1994) that independent grammatical constraints require all branches in a tree to be binary (see the following discussion). Chomsky (1986: 4) points out that specifiers are optional whereas the choice of complements is determined by the Projection Principle. The latter is a principle that says that representations at each syntactic level are projected from the lexicon.

Following up on the theory in Barriers, many researchers developed somewhat different versions of X-bar theory. Fukui and Speas (1986) claim that there are significant differences between lexical and functional projections, for example, VP and IP. They argue that lexical categories may iterate specifiers as long as all these positions are fully licensed and can be interpreted at LF. Functional categories, on the other hand, only have one unique specifier position.13

Hoekstra (1991; see also Hoekstra 1994) argues that specifiers are stipulated in X-bar theory. Rather, Hoekstra argues, specifiers should be defined through agreement: A specifier always agrees with its head (see also Baker 2013). Hoekstra also eliminates the phrase structural distinction between adjuncts and specifiers and argues that an adjunct can be defined as an element that does not agree with the head of the projection it is adjoined to. Recently, several researchers have argued that specifiers are problematic and should not be part of phrase structure (Hoekstra 1991, Kayne 1994, Cormack 1999, Starke 2004, Jayaseelan 2008).
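The three claims in (41) can be stated as a mechanical check on small labeled trees. The sketch below is our own illustration, and the encoding of nodes as (label, bar level, daughters) triples is an expository assumption, not part of any of the cited proposals.

```python
# Sketch (ours) checking the three X-bar claims in (41) on a labeled tree.
# A node is (label, bar_level, daughters); bar levels: 0 = X, 1 = X', 2 = X''.

def check_xbar(node):
    label, bar, daughters = node
    if bar > 2:                              # Maximality: at most X''
        return False
    if len(daughters) > 2:                   # Binarity: at most two daughters
        return False
    if daughters:                            # Asymmetry: exactly one daughter
        projecting = [d for d in daughters   # projects to this node
                      if d[0] == label and d[1] == bar - 1]
        if len(projecting) != 1:
            return False
    return all(check_xbar(d) for d in daughters)

n0 = ("N", 0, []); np = ("N", 2, [("N", 1, [n0])])
v0 = ("V", 0, []); vp = ("V", 2, [np, ("V", 1, [v0, np])])
print(check_xbar(vp))                        # well-formed X-bar tree
print(check_xbar(("V", 2, [v0, np, np])))    # ternary branching fails Binarity
```

Bare Phrase Structure, discussed below, removes bar levels as primitives, so under that view a check like this would have to compute “minimal” and “maximal” relationally rather than read them off the labels.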
Kayne (1994) puts forward a novel theory of phrase structure. He suggests that there is one universal order and that this order is as in (42):

(42) specifier > head > complement

Throughout the history of generative grammar, it had generally been assumed that languages vary in their base structure. PS rules encode this variation directly, as in (43) for an English VP and (44) for a Japanese VP:

(43) VP → V NP
(44) VP → NP V
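To make the directionality contrast concrete, here is a minimal computational sketch of how such rewrite rules generate the two orders. The `expand` function and the rule encoding are invented for this illustration and are not part of any framework discussed in the text:

```python
# PS rules as rewrite instructions: a nonterminal maps to its daughters.
english_rules = {"VP": ["V", "NP"]}   # (43): head-initial
japanese_rules = {"VP": ["NP", "V"]}  # (44): head-final

def expand(symbol, rules):
    """Recursively rewrite a symbol; anything without a rule is terminal."""
    if symbol not in rules:
        return [symbol]
    result = []
    for daughter in rules[symbol]:
        result.extend(expand(daughter, rules))
    return result

print(expand("VP", english_rules))   # ['V', 'NP']
print(expand("VP", japanese_rules))  # ['NP', 'V']
```

The only difference between the two toy grammars is the order of daughters in the single rule, which is exactly what the head parameter, discussed below, was meant to capture.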

In the Government and Binding era, a common analysis of this variation was given in terms of the head parameter. Contrary to these analyses, Kayne claims that linear and hierarchical order are much more tightly connected. He argues that the property of antisymmetry that the linear precedence ordering has is inherited by the hierarchical structure.14 The Linear Correspondence Axiom is the basic property of phrase structure, and familiar X-bar theoretic properties follow from it.

(45) Linear Correspondence Axiom
     d(A) is a linear ordering of T. (Kayne 1994: 6)

The nonterminal-to-terminal dominance relation is represented by d. This relation d is a many-to-many mapping from nonterminals to terminals. For a given nonterminal X, d(X) is the set of terminals that X dominates. A is a set of ordered pairs ⟨Xj, Yj⟩ such that for each j, Xj asymmetrically c-commands Yj. A contains all pairs of nonterminals such that the first asymmetrically c-commands the second; thus, it is a maximal set. T stands for the set of terminals.

At this point, we will turn to a brief description of Bare Phrase Structure, which partly incorporates Kayne’s ideas, since this is the current approach to phrase structure in Chomskyan generative grammar.
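The definitions just given can be illustrated with a small computational sketch. The toy tree (a specifier ZP, a head X, and a complement WP), the simplified sister-based definition of c-command, and all function names are our own assumptions, not Kayne's formulations; the point is only that collecting the asymmetric c-command pairs A and mapping them through d yields a total ordering of the terminals:

```python
# Toy LCA computation. A nonterminal is (label, [children]); a terminal
# is a bare string. Tree: [XP [ZP z] [X' [X x] [WP [W w]]]]
tree = ("XP", [("ZP", ["z"]),
               ("X'", [("X", ["x"]),
                       ("WP", [("W", ["w"])])])])

def walk(node, parent=None):
    """Yield (nonterminal, parent) pairs for the whole tree."""
    if isinstance(node, str):
        return
    yield node, parent
    for child in node[1]:
        yield from walk(child, node)

parent = {id(n): p for n, p in walk(tree)}
nts = [n for n, _ in walk(tree)]

def dominates(a, b):
    """True if nonterminal a properly dominates node b."""
    return not isinstance(a, str) and any(
        c is b or dominates(c, b) for c in a[1])

def c_commands(a, b):
    """Simplified c-command: a's sister is, or dominates, b."""
    p = parent[id(a)]
    if p is None or a is b or dominates(a, b) or dominates(b, a):
        return False
    return any(s is b or dominates(s, b) for s in p[1] if s is not a)

def d(x):
    """d(X): the set of terminals that X dominates."""
    return {x} if isinstance(x, str) else set().union(*(d(c) for c in x[1]))

# A: all pairs of nonterminals where the first asymmetrically
# c-commands the second.
A = [(x, y) for x in nts for y in nts
     if c_commands(x, y) and not c_commands(y, x)]
# d(A): the induced precedence relation on terminals.
d_A = {(s, t) for x, y in A for s in d(x) for t in d(y)}
print(sorted(d_A))  # [('x', 'w'), ('z', 'w'), ('z', 'x')], i.e. z < x < w
```

For this tree, d(A) is a total, antisymmetric ordering of the terminals (z before x before w), which is what the LCA requires of every well-formed phrase marker.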

1.4.3 Bare Phrase Structure and Cartography

Kayne’s theory forces the elimination of the distinction between X′ and XP, since his linearization algorithm does not make this distinction. Chomsky (1995a, b) went further and argued that X-bar levels should be eliminated altogether. This is the theory of Bare Phrase Structure (BPS; see also section 2.4.2 of Bošković 2013).

The gist of BPS is summarized in the following quote: “Minimal and maximal projections must be determined from the structure in which they appear without any specific marking; as proposed by Muysken (1982) they are relational properties of categories, not inherent to them” (Chomsky 1995a: 61). Compare Muysken’s own formulation: “What I will propose is that bar level is not a primitive of the grammar at all, rather ‘maximal projection’ and ‘minimal projection’ are defined terms, and intermediate projections are simply the elsewhere case” (Muysken 1982).15 Chomsky (1995b: 242) tied this to the Inclusiveness Condition, which bans any marking of maximal and minimal projections.16

(46) Inclusiveness Condition
     Any structure formed by the computation is constituted of elements already present in the lexical items. No new objects are added in the course of computation apart from rearrangements of lexical properties. (Chomsky 1995b: 228)

Another way to look at BPS is to say that phrase structure consists solely of lexical items. No extrinsic marking is necessary. This means that instead of a phrase like (47), phrases look like (48). Here we are setting aside how verbs get their inflection and where the arguments really belong in the structure; the important point at hand is the difference between the two structures.

These lexical items are accessed at the LF interface. No units apart from the lexical items can be part of the computation. Thus, bar levels have no existence within BPS. For a critical discussion of some problems with BPS, see Starke (2004) and Jayaseelan (2008). Shortly after BPS had been developed in Chomsky (1995a, b), Rizzi (1997) initiated what has become known as the cartographic approach. This approach assumes an expansion of functional structure, an expansion that is claimed to be necessary on empirical grounds. See Rizzi (2013) for discussion.

This concludes our rather brief overview of the history of phrase structure. A common thread has been the reduction and generalization that started with Chomsky (1955). X-bar theory was a generalization of PS grammars but at the same time a reduction, in that the core primitives of the theory were fewer. Chomsky (1986) also made significant generalizations of the X-bar theory of Chomsky (1970a). Finally, BPS provides the most recent reduction we have seen so far, one on which even the existence of bar levels is denied.

1.5 Rules and Filters versus Principles

Most of the early work in generative syntax was done on English. A few important exceptions were Kuroda (1965), Matthews (1965), Ross (1967), Perlmutter (1968) and Kayne (1969). However, especially with the publication of Kayne (1975), it became more and more common to investigate different languages.17 Kayne gave a range of different language-particular rules for French and in many cases compared them to the syntax of English. Slightly later, Jaeggli (1980) and Rizzi (1982) conducted in-depth studies of other Romance languages. Crucially, though, this enterprise centered on formulating language-specific and construction-specific rules, and what may be universal across languages was not given as much attention.

Chomsky and Lasnik (1977) pointed out that early work in pursuit of descriptive adequacy led to an extremely rich theory of transformational grammar. For a formalization that encompasses much descriptive practice, see Peters and Ritchie (1973). Even this extremely rich theory does not encompass such devices as structure-building rules, global rules, transderivational constraints, and others that had often been proposed.

Let us take a quick look at global rules and transderivational constraints. A global rule is a rule that states conditions on “configurations of corresponding nodes in non-adjacent trees in a derivation” (Lakoff 1970: 628). Thus, global rules go far beyond the usual Markovian property of transformational derivations. An example of a global rule is provided by Ross (1969). Ross observed that the island constraints on movement he proposed in Ross (1967) only hold if the island-forming node is present in Surface Structure. The constraints do not hold, however, if a transformation (“sluicing” in this case; see van Craenenbroeck and Merchant 2013) subsequently deletes that node. An example illustrating this is given in (49) and (50):

(49) *Irv and someone were dancing, but I don’t know who Irv and were dancing.
(50) Irv and someone were dancing, but I don’t know who.

The conclusion drawn from this is that island constraints cannot just mention the point in the derivation at which the movement rule applies, nor just the Surface Structure. The constraints must mention both.

As for transderivational constraints, these are constraints that depend on properties of derivations other than the one currently being constructed. Hankamer (1973) argues for transderivational constraints based on a detailed analysis of gapping (see van Craenenbroeck and Merchant 2013). Among others, he considers the data in (51) through (54) (Hankamer 1973: 26–27):

(51) Max wanted Ted to persuade Alex to get lost, and Walt, Ira.
(52) . . . and Walt *[wanted] Ira [to persuade Alex to get lost]
(53) . . . and Walt *[wanted Ted to persuade] Ira [to get lost]
(54) . . . and [Max wanted] Walt [to persuade] Ira [to get lost]

In order to block gapping in (52) and (53), Hankamer argues that a constraint is needed that makes reference to other structures that might have been created, even from different Deep Structures. In particular, the reason (51) cannot be derived from (52) or (53) is that it can be derived from (54). Space considerations prevent us from elaborating further, though we should acknowledge that Hankamer suggests that the constraint at issue here is universal, thus raising no learnability concerns.

Returning to our main discussion, any enrichment of linguistic theory that extends the class of possible grammars requires strong empirical motivation. This, Chomsky and Lasnik (1977) argued, is generally missing in the case of devices that exceed the framework of Chomsky (1955), Peters and Ritchie (1973), and comparable work; compare with Dougherty (1973), Chomsky (1973), and Brame (1976). Note that the work of Chomsky and many others has consistently tried to reduce the descriptive power of the transformational component. The framework in Aspects is more restricted than the one in LSLT, and Chomsky (1973) is much more restricted than Aspects. In the 1980s, many researchers argued that we should make transformations as general as Move α, or even Affect α, as in Lasnik and Saito (1984, 1992).

Chomsky and Lasnik (1977) contributed to these developments by proposing a framework that attempted to restrict the options that are available in this narrower, but still overly permissive, framework, so that it is possible to approach one of the basic goals of linguistic theory: to provide, in the sense of Aspects, explanations rather than descriptions and thus to account for the attainment of grammatical competence. They assumed that Universal Grammar is not an “undifferentiated” system but, rather, a system that incorporates something analogous to a theory of markedness.
Specifically, there is a theory of core grammar with highly restricted options, limited expressive power, and a few parameters. Systems that fall within core grammar constitute the unmarked case; one can think of them as optimal in terms of the evaluation metric. An actual language is determined by fixing the parameters of core grammar and then adding rules or rule conditions, using much richer resources, perhaps resources as rich as those contemplated in the earlier theories of transformational grammar noted earlier.

Filters were supposed to bear the burden of accounting for constraints which, in the earlier and far richer theory, were expressed in statements of ordering and obligatoriness, as well as contextual dependencies that cannot be formulated in the narrower framework of core grammar. The hypothesis in Chomsky and Lasnik (1977) was that the consequences of ordering, obligatoriness, and contextual dependency could be captured in terms of surface filters. Furthermore, they argued that the previously mentioned properties could be expressed in a natural way as surface filters that are universal or else the unmarked case.

We see that the idea of a distinction between parameters and principles is already present in Chomsky and Lasnik (1977). However, in this framework, there are only a few parameters that affect the core grammar. Besides these parameters, there are a number of language-specific rules. An example is the filter in (55) that blocks for-to constructions in Standard English, as in (56):

(55) *[for-to]
(56) *We want for to win.

As Chomsky and Lasnik (1977: 442) point out, this filter is a “dialect” filter, meaning that it is not a principle of Universal Grammar. They discuss a range of filters, and some of them are like (55) in being outside of core grammar, whereas others, like the Stranded Affix filter of Lasnik (1981), are argued to be part of Universal Grammar.

With Chomsky (1981), the conception of rules and filters changed somewhat. The part related to rules stayed intact, since there is no distinction between rules and principles. Both are assumed to be universal and part of Universal Grammar. But instead of filters that can be both language- and construction-specific, Chomsky suggested that we should conceive of variation in terms of parameters (hence the name Principles and Parameters Theory; see Bošković 2013). The following quote brings out the main difference:

If these parameters are embedded in a theory of UG that is sufficiently rich in structure, then the languages that are determined by fixing their values one way or another will appear to be quite diverse. (Chomsky 1981: 4)

The parameters are assumed to be part of UG, and together they should yield both the variation we observe and an answer to Plato’s problem: How do we know so much given the limited evidence available to us? In the realm of language, the question is how the child can arrive so rapidly at its target grammar given the input it gets. An important part of the theory was that parameters were supposed to represent clusters of properties: “[I]deally we hope to find that complexes of properties [. . .] are reducible to a single parameter, fixed in one or another way” (Chomsky 1981: 6). Rizzi (1982) gave a nice example of this when he argued that there are correlations between thematic null subjects, null expletives, free inversion, and that-trace effects (*Who do you think that __ won the race).

This model was therefore a sharp break from earlier approaches, under which universal grammar specified an infinite array of possible grammars, and explanatory adequacy required a presumably unfeasible search procedure to find the highest-valued one, given primary linguistic data. The Principles and Parameters approach eliminated all this. There is no enumeration of the array of possible grammars. There are only finitely many targets for acquisition, and no search procedure apart from valuing parameters. This cut through an impasse: Descriptive adequacy requires rich and varied grammars, hence unfeasible search; explanatory adequacy requires feasible search. See Bošković (2013), Barbiers (2013), and Thornton and Crain (2013) for further discussion of parameters.

1.6 Derivations

The general issue of derivational versus representational approaches to syntax has received considerable attention throughout the history of generative grammar. A derivational approach argues that there are constraints on the processes by which well-formed expressions are generated, whereas a representational approach argues that there is a system of well-formedness constraints that apply to structured expressions (see Frank 2002 for more discussion of this general issue). Internally to the major derivational approach, transformational grammar, a related issue arises: Are well-formedness conditions imposed specifically at the particular levels of representation made available in the theory, or are they imposed “internal” to the derivation leading to those levels?18 Like the first question, concerning whether derivations exist, this is a subtle one, perhaps even subtler, but since Chomsky (1973) there has been increasing investigation of it, and important arguments and evidence have been brought to bear (see Freidin 1978 and Koster 1978 for illuminating early discussion).

However, generative theories disagree on whether derivations actually exist or not. Typically this disagreement emerges when the question of whether there are transformations is considered, since this is the main case where one can impose derivational constraints. Any phrase structure representation has to be generated somehow, and one can arguably claim that the generation of such a tree is derivational. This is not where the disagreement lies; rather, it concerns whether one can impose constraints on derivations or not. Chomskyan generative grammar, especially since the very important work of Ross (1967), has always assumed that this is possible and that it is a virtue of the theory.
However, let us consider some nontransformational theories (see also Frank 2002 for useful discussion, and Harman 1963 for a very early formulation of a nontransformational generative theory). Most of these developed in the wake of Chomsky’s (1973, 1977) theorizing based on the important discoveries in Ross (1967).

Lexical-Functional Grammar (LFG; Kaplan and Bresnan 1982, Bresnan 2001) eliminates transformations and increases the role of structural composition. This is a theory where the lexical expressions are of crucial importance. LFG argues that lexical representations have a richer hierarchical structure than in the Chomskyan theory. The theory also assumes parallel levels of representation: constituent structure, functional structure, and argument structure all constitute independent levels of representation. Since the theory does not have transformations, dependencies are established by interaction between the different levels and by lexical entries that have been transformed by lexical rules. For example, an analysis of the passive assumes that there are two lexical entries of the verb in the lexicon and that there are linkages that determine the appropriate thematic dependencies. See Sells (2013) for more discussion of LFG.

Generalized Phrase Structure Grammar (GPSG; Gazdar et al. 1985) eliminates transformations in a different way. In this theory, a derivation consists of context-free phrase structure rules. Metarules that modify the phrase structure rules are used to establish dependencies in a way reminiscent of Harman (1963). This is to say that wh-movement, for example, is captured through additional phrase structure rules. Blevins and Sag (2013) discuss GPSG in detail.

As Frank (2002: 8) points out, all these nontransformational theories share with transformational theories the property that there are no privileged intermediate levels of syntactic structure.
This has been the case since Chomsky (1965), but it was not true of Chomsky (1955, 1957), where kernel structures constituted such intermediate structures. Put differently, something needs to prevent nonlocal dependencies from being created. However, a nontransformational theory that returns to a theory closer to that of Chomsky (1955) is Tree Adjoining Grammar (Joshi, Levy and Takahashi 1975, Joshi 1985). We briefly described this theory in section 1.2.1; see also Frank (2013).

In theories of the Chomskyan sort, based on transformational movement operations, a question arises: What determines whether movement occurs? In the Move α framework, all such processes were completely free (see, e.g., Lasnik and Saito 1992 for a detailed version of this theory). There were no triggers; rather, there were representational constraints that had to be satisfied for a structure to be convergent. Even though representationalist approaches have been developed in recent years (see, in particular, Brody 1995, 2002, 2003), Chomsky and most researchers within Chomskyan generative grammar have defended a derivationalist approach where movement is triggered.19

Chomsky (1995b) argues on conceptual and, to some extent, empirical grounds that movement is always morphologically driven. That is, there is some formal feature that needs to be checked, and movement provides the configuration in which the checking can take place. Chomsky also provides reasons that, all else being equal, covert movement (movement in the LF component) is preferred to overt movement, a preference that Chomsky calls “Procrastinate”. When movement is overt, rather than covert, then, it must have been forced to operate early by some special requirement. The major phenomenon that Chomsky considers in these terms is verb raising, following the influential work of Pollock (1989). He also hints at a contrast in object shift, overt in some languages and covert in others.
Chomsky (1993, 1995a, 1995b) codes the driving force for overt movement into strong features, and presents three successive distinct theories of precisely how strong features drive overt movement. These three theories, which we summarize immediately, are of interest to our question, since the first two of them are explicitly representational in the relevant sense, while the third is derivational:

(57) a. A strong feature that is not checked in overt syntax causes a derivation to crash at PF. (Chomsky 1993)
     b. A strong feature that is not checked (and eliminated) in overt syntax causes a derivation to crash at LF. (Chomsky 1995a)
     c. A strong feature must be eliminated (almost) immediately upon its introduction into the phrase marker. (Chomsky 1995b)

All three of these proposals are designed to force overt movement in the relevant instances (e.g., verb raising in French, where a strong V feature of Infl will cause a violation in one of the three ways listed in (57) if overt movement does not take place), and all are framed within a Minimalist conception of grammar. The work of building structure is done by generalized transformations, as it was before recursion in the base was introduced in Chomsky (1965). This return to an earlier approach replaces a partly representational view with a strongly derivational one.

Chomsky (1993) argues that the treatment in (57a) follows from the fact that parametric differences in movement, like other parametric differences, must be based on morphological properties reflected at PF. (57a) makes this explicit. Chomsky suggests two possible implementations of the approach:

“[S]trong” features are visible at PF and “weak” features invisible at PF. These features are not legitimate objects at PF; they are not proper components of phonetic matrixes. Therefore, if a strong feature remains after Spell-out, the derivation crashes [. . .] Alternatively, weak features are deleted in the PF component so that PF rules can apply to the phonological matrix that remains; strong features are not deleted so that PF rules do not apply, causing the derivation to crash at PF. (Chomsky 1993: 198)

There is presumably only one other possible type of representational approach, given Minimalist assumptions: one that involves LF, rather than PF. Chomsky (1995a) proposes such an analysis, (57b), based on an empirical shortcoming of (57a). What is at issue is the unacceptability of sentences like (58):

(58) *John read what?

Assuming that the strong feature forcing overt wh-movement in English resides in interrogative C,20 the potential concern is that C might be introduced in the LF component, where, checked or not, it could not possibly cause a PF crash: since C has no phonetic features, as far as PF knows the item does not exist at all. Yet (58) is bad as a non-echo question, so such a derivation must be blocked. This problem arises in the general context of fitting lexical insertion into the grammar. In most circumstances, there is no need for a specific prohibition against accessing the lexicon in the PF or the LF component. (58) represents a rare problem for the assumption that lexical insertion is free to apply anywhere. Chomsky (1995a: 60–61) suggests that the root C head has a feature that requires overt wh-movement. Unless this feature is checked prior to Spell-Out, the derivation will crash at LF. Chomsky proposes to implement this basic idea in the following way: “Slightly adjusting the account in Chomsky (1993), we now say that a checked strong feature will be stripped away by Spell-Out, but is otherwise ineliminable” (Chomsky 1995a: 61).

Chomsky (1995b) rejects the representational approach in (57a), and the conceptual argument he gives evidently applies equally to the alternative representational approach in (57b). He discounts such an account as an evasion and proposes what he claims is a more straightforward statement of the phenomenon:

[F]ormulation of strength in terms of PF convergence is a restatement of the basic property, not a true explanation. In fact, there seems to be no way to improve upon the bare statement of the properties of strength. Suppose, then, that we put an end to evasion and simply define a strong feature as one that a derivation “cannot tolerate”: a derivation D → Σ is canceled if Σ contains a strong feature. (Chomsky 1995b: 233)

In summary, strong features trigger a rule that eliminates them. This approach is strongly derivational. There are problems with this account (see Lasnik 2001 for detailed discussion), but the goal here has merely been to outline the ways one can think of the trigger question in either derivational or representational terms.

Since Chomsky (1995b), the assumption is that movement is triggered by feature checking. But while feature checking was originally thought to be possible only in specific derived configurations (the Spec-head relation and head-adjunction configurations, in particular), in more recent work it is contingent merely on the establishment of an Agree relationship between a c-commanding Probe and a Goal. The introduction of the Agree mechanism divorces the movement trigger from agreement, contrary to the framework in Chomsky (1993), where elements moved to specifiers to undergo agreement with a head (see Baker 2013 for discussion). However, even if features have to be checked, it is not clear that the approach is fully derivational. The typical assumption is that a derivation crashes unless all features are checked prior to the interfaces, which in effect is a representational condition based on features. However, the operations defined on features are derivational, as they unfold as the structure is being built and are limited by grammatical principles (e.g., intervention effects or the Phase Impenetrability Condition; see Chomsky 2001, Bošković 2013, and den Dikken and Lahne 2013 for discussion). Therefore, it seems valid to say that there are both derivational and representational aspects and that both play important roles in grammar in this model.

1.7 The Advent of Economy Principles in Principles and Parameters Theory

As we have seen, a major Minimalist concern involves the driving force for syntactic movement. From its inception in the early 1990s, Minimalism has insisted on the last-resort nature of movement: In line with the leading idea of economy, movement must happen for a reason and, in particular, a formal reason. The Case Filter, which was a central component of the Government and Binding Theory, was thought to provide one such driving force. Baker (2013) illustrates this at length, so we will not discuss it here. Instead we will offer two other examples of economy principles: Relativized Minimality and the Extension Condition.

An important instance of economy is what Luigi Rizzi (1990) called Relativized Minimality (see den Dikken and Lahne 2013 for more discussion). Chomsky and Lasnik (1993) reinterpreted Rizzi’s groundbreaking work in terms of least effort. Let us illustrate that here by way of a phenomenon called Superiority, which has often been analyzed as a Relativized Minimality effect. Consider the following examples:

(59) Guess who bought what?
(60) *Guess what who bought?

In this situation, there might seem to be an option: One could front either who or what. As (59) and (60) show, only the former is licit. In such a situation, you always have to pick the element closest to the position where the moved element ends up, as first observed in something like these terms by Chomsky (1973). Put differently, one should minimize the distance traveled by the moving element, an instance of “economy” of derivation.

Another potential example of an economy condition relates to the Extension Condition. This condition requires that a transformational operation extend the tree upwards. In Chomsky (1965), the requirement that derivations work their way up the tree monotonically was introduced, alongside D-structure. Generally this is known as the requirement of cyclicity. Chomsky used this to explain the absence of certain kinds of derivations, but also as an argument against generalized transformations and for D-structure. But it was cyclicity, rather than D-structure, that was crucial in the account. As we have discussed earlier, Minimalism rejects D-structure and reinstates generalized transformations, but it still preserves cyclicity, thus ruling out the anticyclic derivations that were the original concern. The Minimalist Extension Condition demands that both the movement of material already in the structure (internal merge = singulary transformation) and the merger of a lexical item not yet in the structure (external merge = generalized transformation) target the top of the existing tree. Consider in this context the structures in (61) through (63).

(61) [X [Z B C] A]
(62) [X β [X [Z B C] A]]
(63) [X [Z B [C β C]] A]

(61) is the original tree. (62) shows a derivation that obeys the Extension Condition: Here the new element β is merged at the top of the tree. The last derivation, (63), does not obey the Extension Condition, because β is merged at the bottom of the tree. Importantly, there is a deep idea behind cyclicity, which again was present in Chomsky’s earliest work in the late 1950s. The idea, called the No Tampering Condition in current parlance, seems like a rather natural economy condition. (62) involves no tampering, since the old tree in (61) still exists as a subtree of (62), whereas (63) involves tampering with the original structure. That is, it is more economical to expand a structure than to go back and change a structure that has already been built. This becomes particularly clear if parts of the structure are shipped off to the interfaces (e.g., phase by phase, as in much recent Minimalist work), where the earlier structure effectively is not available. Were one to tamper with that structure, it would require bringing the structure back into the main structure again, which seems hugely uneconomical.
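The contrast between extension and tampering can be sketched computationally, with trees as immutable nested tuples. The labels follow (61) through (63), with β written as "b"; the `merge` helper is our own simplification, not a definition from the literature:

```python
# Merge builds a new node on top of its inputs and never alters them.
def merge(a, b, label):
    return (label, a, b)

t61 = ("X", ("Z", "B", "C"), "A")   # (61), the original tree

# (62): merging b at the root extends the tree upwards; the old tree
# survives untouched as a shared subtree -- no tampering.
t62 = merge("b", t61, "X")
assert t62[2] is t61

# (63): merging b at the bottom cannot reuse (61); every node above the
# merge site must be rebuilt, i.e. the original structure is tampered with.
t63 = ("X", ("Z", "B", merge("b", "C", "C")), "A")
assert t63[1] != t61[1]
```

The identity check in the (62) case makes the economy point concrete: root-merge can share the existing structure wholesale, whereas the counter-cyclic derivation in (63) forces reconstruction of material that had already been built.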

1.8 Concluding Remarks

The history of generative grammar is not very long. Despite this, considerable progress has been made in our understanding of the human language faculty. Numerous problems and questions remain, but it is interesting to observe that certain questions have remained at the center of theoretical development since the early beginning. For example, whereas generalized transformations were eliminated in the mid-1960s, they returned in the Minimalist Program when D-structure was eliminated (though see Uriagereka 2008 for critical discussion). Questions of how structure is generated are still at the forefront of current research. Another major issue, since Ross (1967), is locality (see den Dikken and Lahne 2013). Since Chomsky (1973), locality issues have occupied a central role in linguistic theorizing. We still lack a complete theory of islands, so this is certainly another issue that will be on the front burner for quite some time. Phrase structure has been central since LSLT, though the theory of phrase structure has undergone substantial changes over the years.

These are just a few examples of recurrent themes during the brief sixty-year history of our field. In this chapter we have in particular emphasized the early period, since that is often the period that is not as well known. We believe it is important to know the history of the field in order to fully understand current developments. For example, understanding the change from Government and Binding to the Minimalist Program necessitates a good understanding of the former framework. But in order to understand Government and Binding, it is also necessary to understand the Extended Standard Theory and, in turn, also the framework in LSLT and Syntactic Structures and the one in Aspects. We hope that this chapter serves as a useful entry point into this history.

Notes

* We are grateful to Juan Uriagereka for extensive help on an earlier draft and to Marcel den Dikken for his patience and encouragement, and to Marcel and an anonymous reviewer for very helpful comments that led to substantial improvements in the presentation.
1 In languages like Hungarian, sentential complementation is typically “mediated” by a pronoun, as shown in (i).
  (i) Janos azt tudja, hogy S
      Janos it.ACC knows that S
  This property may provide retroactive support for the LSLT way of generating sentential complementation. Thanks to Marcel den Dikken (p.c.) for pointing out this fact to us.
2 It should be noted that singulary transformations and generalized transformations could be interspersed. There were no constraints on when either could apply.
3 For extensive discussion of this analysis, see Lasnik (2000, chapter 2).
4 We need the T-marker as interface with semantics because the final derived P-marker typically lacks information relevant to meaning, for example grammatical relations if Passive has applied.
5 In the theory in Aspects, grammatical relations like subject and object are read off the syntactic structure. However, the relations themselves are semantic, so subject means understood subject of. A decade later, the theory of Relational Grammar (Perlmutter 1980) turned this view on its head, maintaining that grammatical relations are the primitives of the grammar. In Relational Grammar, grammatical relations are purely structural relations. This means that grammatical relations can be altered by transformations, and the major Relational Grammar syntactic processes had just this purpose.
6 It is worth noting that in Aspects, cyclicity and Deep Structure were intertwined. Later on, they were distinguished, which means that one has to reconsider the previous evidence for Deep Structure.
7 See section 1.2.7 for discussion of an approach to cyclicity that, unlike that in Aspects, does not require recursion in the base (hence D-structure). In effect, this addresses one of the major Aspects arguments for D-structure.

8 Though see Jackendoff (1977) for a way to in part solve this problem by identifying S with V′′ in his system. See also Hornstein (1977), who argues that S should be excluded from the X-bar convention.

9 That is, the problem of "projecting" the correct grammar from limited input data.

10 See also Stowell (1981: 71–75) for criticism based on arguments from acquisition.

11 In fact, Stowell (1981) argued for the general elimination of phrase structure rules, thus providing empirical motivation for the formalization of Lasnik and Kupin (1977).

12 This is what Chomsky said, but it is obviously not exactly what he meant. (40a) should read X′ = X Y′′* because otherwise a verb, for example, can only take a VP complement, and similarly for (40b) and specifiers.

13 See also Stuurman (1985: 182) for a similar claim, though Stuurman claims that this also holds for lexical categories.

14 We speculate that Kayne intended «asymmetry» rather than «antisymmetry». An antisymmetric relation R is one where if (a, b) ∈ R and (b, a) ∈ R, then a = b. Asymmetry is a stronger property: (a, b) ∈ R ⇒ (b, a) ∉ R. Since items evidently do not precede themselves, the weakening loophole of antisymmetry is not needed.

15 This way of looking at phrase structure is closely related to Speas (1990: 35).

16 This condition is an obvious extension of an idea in Katz and Postal (1964: 44–45), further developed in Chomsky (1965: 132) when he suggests that transformations cannot introduce meaning-bearing elements.

17 This also happened as textbooks on the Syntactic Structures and Aspects frameworks were written.

18 There are of course hybrid theories as well.
Chomsky (1981), for example, proposes well-formedness conditions on Deep Structure, on Surface Structure, and on the application of transformations between grammatical levels.

19 Almost from the earliest days of generative grammar, there were qualms about optional transformations: "An obvious decision is to consider minimization of the optional part of the grammar to be the major factor in reducing complexity" (Chomsky 1958/1962: 154).

20 Notice that in English, the relevant strong feature could not reside in the wh-phrase, since in multiple interrogation, all but one of the whs remain in situ, hence unchecked, in overt syntax:
  (i) Who gave what to who?

References

Abney, S. 1987. The English Noun Phrase in its Sentential Aspect. Doctoral dissertation, MIT.
Bach, E. 1964. An Introduction to Transformational Grammars. New York: Holt, Rinehart and Winston.
Baker, M. C. 1988. Incorporation: A Theory of Grammatical Function Changing. Chicago, IL: University of Chicago Press.
Baker, M. C. 2013. Agreement and case. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 607–654. Cambridge: Cambridge University Press.
Baltin, M. 1982. A landing-site theory of movement rules. Linguistic Inquiry 13: 1–38.
Barbiers, S. 2013. Microsyntactic variation. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 899–926. Cambridge: Cambridge University Press.
Blevins, J. P. and Sag, I. A. 2013. Phrase structure grammar. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 202–225. Cambridge: Cambridge University Press.
Bloomfield, L. 1933. Language. New York: Henry Holt.
Bošković, Ž. 2013. Principles and parameters theory and minimalism. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 95–121. Cambridge: Cambridge University Press.
Brame, M. K. 1976. Conjectures and Refutations in Syntax and Semantics. New York: Elsevier.
Bresnan, J. 1970. On complementizers: Toward a syntactic theory of complement types. Foundations of Language 6: 297–321.
Bresnan, J. 1971. Sentence stress and syntactic transformations. Language 47: 257–281.
Bresnan, J. 2001. Lexical-Functional Syntax. Malden: Blackwell.
Brody, M. 1995. Lexico-Logical Form: A Radically Minimalist Theory. Cambridge, MA: MIT Press.
Brody, M. 2002. On the status of representations and derivations. In Derivation and Explanation in the Minimalist Program, S. D. Epstein and T. D. Seely (eds.), 19–41. Malden: Blackwell.
Brody, M. 2003. Towards an Elegant Syntax. London: Routledge.
Büring, D. 2013. Syntax, information structure, and prosody. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 95–121. Cambridge: Cambridge University Press.
Chametzky, R. A. 2000. Phrase Structure: From GB to Minimalism. Malden: Blackwell.
Chomsky, N. 1955. The Logical Structure of Linguistic Theory. Ms., Harvard University and MIT. [Revised version published in part by Plenum, New York, 1975].
Chomsky, N. 1956. Three models for the description of language. IRE Transactions on Information Theory 2: 113–124.
Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, N. 1958/1962. A transformational approach to syntax. In Proceedings of the Third Texas Conference on Problems of Linguistic Analysis in English, A. A. Hill (ed.), 124–158. Austin: University of Texas Press.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1970a. Remarks on nominalization. In Readings in English Transformational Grammar, R. A. Jacobs and P. S. Rosenbaum (eds.), 184–221. Waltham, MA: Ginn.
Chomsky, N. 1970b. Deep structure, surface structure, and semantic interpretation. In Studies in General and Oriental Linguistics Presented to Shirô Hattori on the Occasion of His Sixtieth Birthday, R. Jakobson and S. Kawamoto (eds.), 52–91. Tokyo: TEC Company, Ltd.
Chomsky, N. 1973. Conditions on transformations. In A Festschrift for Morris Halle, S. Anderson and P. Kiparsky (eds.), 232–286. New York: Holt, Rinehart and Winston.
Chomsky, N. 1977. On wh-movement. In Formal Syntax, P. Culicover, T. Wasow and A. Akmajian (eds.), 71–132. New York: Academic Press.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1986. Barriers. Cambridge, MA: MIT Press.
Chomsky, N. 1993. A minimalist program for linguistic theory. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, K. Hale and S. J. Keyser (eds.), 1–52. Cambridge, MA: MIT Press.
Chomsky, N. 1995a. Bare phrase structure. In Evolution and Revolution in Linguistic Theory: A Festschrift in Honor of Carlos Otero, H. Campos and P. Kempchinsky (eds.), 51–109. Washington, DC: Georgetown University Press.
Chomsky, N. 1995b. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 2000. Minimalist inquiries: The framework. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, R. Martin, D. Michaels and J. Uriagereka (eds.), 89–155. Cambridge, MA: MIT Press.
Chomsky, N. 2001. Derivation by phase. In Ken Hale: A Life in Language, M. Kenstowicz (ed.), 1–52. Cambridge, MA: MIT Press.
Chomsky, N. 2004. Beyond explanatory adequacy. In Structures and Beyond: The Cartography of Syntactic Structures, A. Belletti (ed.), 104–131. Oxford: Oxford University Press.
Chomsky, N. 2007. Approaching UG from below. In Interfaces + Recursion = Language? Chomsky's Minimalism and the View from Syntax-Semantics, H.-M. Gärtner and U. Sauerland (eds.), 1–30. Berlin: Mouton de Gruyter.
Chomsky, N. 2008. On phases. In Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, C. Otero, R. Freidin and M.-L. Zubizarreta (eds.), 133–166. Cambridge, MA: MIT Press.
Chomsky, N. and Lasnik, H. 1977. Filters and control. Linguistic Inquiry 8: 425–504.
Chomsky, N. and Lasnik, H. 1993. The theory of principles and parameters. In Syntax: An International Handbook of Contemporary Research, J. Jacobs, A. von Stechow, W. Sternefeld and T. Venneman (eds.), 506–569. New York: Walter de Gruyter.
Cormack, A. 1999. Without specifiers. In Specifiers: Minimalist Approaches, D. Adger, S. Pintzuk, B. Plunkett and G. Tsoulas (eds.), 46–68. Oxford: Oxford University Press.
Corver, N. 2013. Lexical categories and (extended) projection. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 353–424. Cambridge: Cambridge University Press.
Davis, M. 1958. Computability and Unsolvability. New York: McGraw-Hill Book Company.
Dayal, V. 2013. The syntax of scope and quantification. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 827–859. Cambridge: Cambridge University Press.
den Besten, H. 1983. On the interaction of root transformations and lexical deletive rules. In On the Formal Syntax of the Westgermania, W. Abraham (ed.), 47–131. Amsterdam: John Benjamins.
den Dikken, M. and Lahne, A. 2013. The locality of syntactic dependencies. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 655–698. Cambridge: Cambridge University Press.
Dougherty, R. C. 1973. A survey of linguistic methods and arguments. Foundations of Language 10: 432–490.
Emonds, J. 1976. A Transformational Approach to English Syntax. New York: Academic Press.
Epstein, S. D. 1999. Un-principled syntax: The derivation of syntactic relations. In Working Minimalism, S. D. Epstein and N. Hornstein (eds.), 317–345. Cambridge, MA: MIT Press.
Fiengo, R. 1974. Semantic Conditions on Surface Structure. Doctoral dissertation, MIT.
Fiengo, R. 1977. On trace theory. Linguistic Inquiry 8: 35–62.
Fillmore, C. J. 1963. The position of embedding transformations in a grammar. Word 19: 208–231.
Frank, R. 2002. Phrase Structure Composition and Syntactic Dependencies. Cambridge, MA: MIT Press.
Frank, R. 2013. Tree adjoining grammar. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 225–262. Cambridge: Cambridge University Press.
Freidin, R. 1978. Cyclicity and the theory of grammar. Linguistic Inquiry 9: 519–549.
Fukui, N. 2001. Phrase structure. In The Handbook of Contemporary Syntactic Theory, M. Baltin and C. Collins (eds.), 374–406. Malden: Blackwell.
Fukui, N. and Speas, M. 1986. Specifiers and projection. MIT Working Papers in Linguistics: Papers in Theoretical Linguistics 8: 128–172.
Gazdar, G., Klein, E., Pullum, G. K. and Sag, I. 1985. Generalized Phrase Structure Grammar. Oxford: Basil Blackwell.
George, L. M. 1980. Analogical Generalization in Natural Language Syntax. Doctoral dissertation, MIT.
Halle, M. 1959. The Sound Pattern of Russian: A Linguistic and Acoustical Investigation. The Hague: Mouton.
Hankamer, J. 1973. Unacceptable ambiguity. Linguistic Inquiry 4: 17–68.
Harley, H. 1995. Subjects, Events and Licensing. Doctoral dissertation, MIT.
Harman, G. H. 1963. Generative grammars without transformation rules: A defense of phrase structure. Language 39: 597–616.
Harris, Z. 1951. Methods in Structural Linguistics. Chicago, IL: University of Chicago Press.
Hoekstra, E. 1991. Licensing Conditions on Phrase Structure. Doctoral dissertation, University of Groningen.
Hoekstra, E. 1994. Agreement and the nature of specifiers. In Minimalism and Kayne's Antisymmetry Hypothesis, C. J.-W. Zwart (ed.), 159–168. Groningen: Groninger Arbeiten zur Germanistischen Linguistik, Volume 37.
Hornstein, N. 1977. S and X-bar convention. Linguistic Analysis 3: 137–176.
Jackendoff, R. 1969. Some Rules of Semantic Interpretation for English. Doctoral dissertation, MIT.
Jackendoff, R. 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: MIT Press.
Jackendoff, R. 1977. X-bar Theory. Cambridge, MA: MIT Press.
Jaeggli, O. 1980. On Some Phonologically Null Elements in Syntax. Doctoral dissertation, MIT.
Jayaseelan, K. A. 2008. Bare phrase structure and specifier-less syntax. Biolinguistics 2: 87–106.
Joshi, A. 1985. How much context-sensitivity is necessary for characterizing structural descriptions? In Natural Language Processing: Theoretical, Computational, and Psychological Perspectives, D. Dowty, L. Karttunen and A. Zwicky (eds.), 206–250. Cambridge: Cambridge University Press.
Joshi, A., Levy, L. S. and Takahashi, M. 1975. Tree adjunct grammar. Journal of Computer and System Science 10: 136–163.
Kaplan, R. M. and Bresnan, J. 1982. Lexical-functional grammar: A formal system for grammatical representation. In The Mental Representation of Grammatical Relations, J. Bresnan (ed.), 173–281. Cambridge, MA: MIT Press.
Katz, J. J. and Postal, P. 1964. An Integrated Theory of Linguistic Descriptions. Cambridge, MA: MIT Press.
Kayne, R. S. 1969. The Transformational Cycle in French Syntax. Doctoral dissertation, MIT.
Kayne, R. S. 1975. French Syntax: The Transformational Cycle. Cambridge, MA: MIT Press.
Kayne, R. S. 1984. Connectedness and Binary Branching. Dordrecht: Foris.
Kayne, R. S. 1994. The Antisymmetry of Syntax. Cambridge, MA: MIT Press.
Koizumi, M. 1995. Phrase Structure in Minimalist Syntax. Doctoral dissertation, MIT.
Koster, J. 1978. Locality Principles in Syntax. Dordrecht: Foris.
Kuroda, S.-Y. 1965. Generative Grammatical Studies in the Japanese Language. Doctoral dissertation, MIT.
Lakoff, G. 1970. Global rules. Language 46: 627–639.
Lakoff, G. 1971. On generative semantics. In Semantics: An Interdisciplinary Reader in Philosophy, Linguistics and Psychology, D. D. Steinberg and L. A. Jakobovits (eds.), 232–296. Cambridge: Cambridge University Press.
Lasnik, H. 1981. Restricting the theory of transformations. In Explanation in Linguistics, N. Hornstein and D. Lightfoot (eds.), 152–173. London: Longmans.
Lasnik, H. 2000. Syntactic Structures Revisited: Contemporary Lectures on Classic Transformational Theory. Cambridge, MA: MIT Press.
Lasnik, H. 2001. Derivation and representation in modern transformational syntax. In The Handbook of Contemporary Syntactic Theory, M. Baltin and C. Collins (eds.), 62–88. Malden: Blackwell.
Lasnik, H. and Kupin, J. J. 1977. A restrictive theory of transformational grammar. Theoretical Linguistics 4: 173–196.
Lasnik, H. and Saito, M. 1984. On the nature of proper government. Linguistic Inquiry 15: 235–289.
Lasnik, H. and Saito, M. 1992. Move Alpha. Cambridge, MA: MIT Press.
Lees, R. B. 1963. The Grammar of English Nominalizations. The Hague: Mouton.
Lyons, J. 1968. Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
Matthews, G. H. 1965. Hidatsa Syntax. The Hague: Mouton.
McCawley, J. D. 1968. The role of semantics in a grammar. In Universals in Linguistic Theory, E. Bach and R. T. Harms (eds.), 124–169. New York: Holt, Rinehart and Winston.
Muysken, P. 1982. Parametrizing the notion "Head". Journal of Linguistic Research 2: 57–75.
Newmeyer, F. J. 2013. Goals and methods of generative syntax. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 61–92. Cambridge: Cambridge University Press.
Perlmutter, D. 1968. Deep and Surface Constraints in Syntax. Doctoral dissertation, MIT.
Perlmutter, D. 1980. Relational grammar. In Syntax and Semantics: Current Approaches to Syntax, E. A. Moravcsik and J. R. Wirth (eds.), 195–229. New York: Academic Press.
Peters, S. and Ritchie, R. W. 1973. On the generative power of transformational grammars. Information Sciences 6: 49–83.
Polinsky, M. 2013. Raising and control. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 577–606. Cambridge: Cambridge University Press.
Pollock, J.-Y. 1989. Verb movement, universal grammar, and the structure of IP. Linguistic Inquiry 20: 365–424.
Post, E. 1944. Recursively enumerable sets of positive integers and their decision problems. Bulletin of the American Mathematical Society 50: 284–316.
Postal, P. 1972. The best theory. In Goals of Linguistic Theory, S. Peters (ed.), 131–170. Englewood Cliffs, NJ: Prentice-Hall.
Rizzi, L. 1982. Issues in Italian Syntax. Dordrecht: Foris.
Rizzi, L. 1990. Relativized Minimality. Cambridge, MA: MIT Press.
Rizzi, L. 1997. The fine structure of the left periphery. In Elements of Grammar, L. Haegeman (ed.), 281–337. Dordrecht: Kluwer.
Rizzi, L. 2013. The functional structure of the sentence, and cartography. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 425–457. Cambridge: Cambridge University Press.
Ross, J. R. 1967. Constraints on Variables in Syntax. Doctoral dissertation, MIT. [Published as Infinite Syntax! Norwood, NJ: Ablex, 1986].
Ross, J. R. 1969. Guess who? In Papers from the Fifth Regional Meeting of the Chicago Linguistic Society, R. I. Binnick, A. Davison, G. M. Green and J. L. Morgan (eds.), 252–286. Chicago, IL: Chicago Linguistic Society, University of Chicago.
Saussure, F. de. 1916. Cours de linguistique générale. Paris: Payot.
Sells, P. 2013. Lexical-functional grammar. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 162–201. Cambridge: Cambridge University Press.
Speas, M. J. 1990. Phrase Structure in Natural Language. Dordrecht: Kluwer.
Starke, M. 2004. On the inexistence of specifiers and the nature of heads. In Structures and Beyond: The Cartography of Syntactic Structures, A. Belletti (ed.), 252–268. Oxford: Oxford University Press.
Stowell, T. 1981. Origins of Phrase Structure. Doctoral dissertation, MIT.
Stuurman, F. 1985. Phrase Structure Theory in Generative Grammar. Dordrecht: Foris.
Thornton, R. and Crain, S. 2013. Parameters: The pluses and minuses. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 927–970. Cambridge: Cambridge University Press.
Travis, L. 1984. Parameters and Effects of Word Order Variation. Doctoral dissertation, MIT.
Uriagereka, J. 1999. Multiple spell-out. In Working Minimalism, S. D. Epstein and N. Hornstein (eds.), 251–282. Cambridge, MA: MIT Press.
Uriagereka, J. 2008. Syntactic Anchors. Cambridge: Cambridge University Press.
van Craenenbroeck, J. and Merchant, J. 2013. Ellipsis phenomena. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 701–745. Cambridge: Cambridge University Press.
Wasow, T. A. 1972. Anaphoric Relations in English. Doctoral dissertation, MIT.
Wells, R. S. 1947. Immediate constituents. Language 23: 81–117.
Williams, E. 1981. On the notions "lexically related" and "head of a word". Linguistic Inquiry 12: 245–274.

2 Noam Chomsky: A Selected Annotated Bibliography

with Howard Lasnik

2.1 Introduction

Avram Noam Chomsky was born in Philadelphia on 7 December 1928. His father, William Chomsky, was a noted Hebrew scholar. Chomsky came to the University of Pennsylvania to study, and there he met Zellig S. Harris through their common political interests. Chomsky's first encounter with Harris's work was when he proofread Harris's 1951 book Methods in Structural Linguistics (Chicago: Univ. of Chicago Press). The independent work that Chomsky then started to do resulted in a serious revision of Harris's approach, including the proposal that syntax, in part, is a matter of abstract representation. This led to a number of highly influential papers and books, which together have defined modern linguistics. After Chomsky spent 1951 through 1955 as a junior fellow of the Society of Fellows at Harvard University, he joined the faculty at the Massachusetts Institute of Technology under the sponsorship of Morris Halle. Chomsky was promoted to full professor in the Department of Foreign Languages and Linguistics in 1961, appointed to the Ferrari P. Ward Professorship of Modern Languages and Linguistics in 1966, and made Institute Professor in 1976. In 1967 both the University of Chicago and the University of London awarded him honorary degrees, and since then he has been the recipient of scores of honors and awards. In 1988 he was awarded the Kyoto Prize in Basic Sciences, created in 1984 (along with prizes in two other categories) in order to recognize work in areas not included among the Nobel Prizes. These honors are all a testament to Chomsky's influence and impact on linguistics, and on cognitive science more generally, since the mid-twentieth century. He has continually revised and updated his technical analyses: from phrase structure grammars to the standard theory in the 1960s; to the extended standard theory and X-bar theory in the 1970s; to the principles and parameters theory and its variant, the Minimalist Program.
Over the years the technical details have changed, sometimes dramatically, but many of the core assumptions, as laid out in his foundational work, have remained essentially the same. His work has been both applauded and criticized but remains central to investigations of language.

2.2 Foundational Work

As Zellig S. Harris's student, Chomsky was deeply immersed in structural linguistics, and his first works were attempts to extend the method in Harris's 1951 book Methods in Structural Linguistics, as in Chomsky 1951. Harris had one sentence transform into another, and Chomsky soon discovered data that could not be captured using such a method, as discussed in Chomsky 1957 and Chomsky 1962. Instead, Chomsky had to appeal to abstract structures, and this is what he did in two of his most famous, and groundbreaking, works: The Logical Structure of Linguistic Theory (LSLT; Chomsky 1975) and Syntactic Structures (Chomsky 1957). Chomsky 1975 was written while Chomsky was a junior fellow of the Society of Fellows at Harvard University and completed in 1955. It was only published in 1975, with a comprehensive introduction that outlines the development of the manuscript. Whereas both of these texts are concerned with formal details, Chomsky (1959), a review of B. F. Skinner's book Verbal Behavior, focused on questions of language use and creativity. This review quickly gained fame for demonstrating the fundamental problems of behaviorism. Chomsky (1965) outlines a theory of language embedded in the human mind (see also Chomsky 1964). The first chapter of this book is essential reading for anyone who wants to attain a basic understanding of Chomsky's ideas. In this chapter, he attempts to define a distinct, scientific project for linguistics: "scientific" because it aims to explain what underlies individual linguistic abilities and "distinct" because the properties of human language appear to be special.
Chomsky (1957), Chomsky (1959), and Chomsky (1965) are quite accessible and still relevant to contemporary debates.

Chomsky, Noam. 1951. Morphophonemics of modern Hebrew. MA thesis, Univ. of Pennsylvania.
In this thesis, Chomsky discusses certain morphophonemic alternations in modern Hebrew. He is particularly concerned with the simplicity of this grammar and with how to design other such grammars.

Chomsky, Noam. 1955. Transformational analysis. PhD dissertation, Univ. of Pennsylvania.
This doctoral dissertation was based on one chapter from Chomsky 1975.

Chomsky, Noam. 1957. Syntactic structures. Janua linguarum. The Hague: Mouton.
Chomsky's first published book, introducing transformational syntax. The book also contains the important discoveries and insights regarding the English auxiliary system that were used to motivate abstract structures.

Chomsky, Noam. 1959. Verbal behavior by B. F. Skinner. Language 35.1: 26–58.
This famous review of B. F. Skinner's Verbal Behavior dealt behaviorism a decisive blow and laid the ground for modern cognitive science.

Chomsky, Noam. 1962. A transformational approach to syntax. In Proceedings of the Third Texas Conference on Problems of Linguistic Analysis in English, May 9–12, 1958. Edited by Archibald A. Hill, 124–148. Austin: Univ. of Texas Press.
An outline of a transformational approach to syntax, including a comparison with the work of Zellig S. Harris.

Chomsky, Noam. 1964. Current issues in linguistic theory. Paper presented at the Ninth International Congress of Linguists, Cambridge, Massachusetts, 1962. Janua linguarum. The Hague: Mouton.
This short book details the goals of linguistic theory and the nature of structural descriptions for both syntax and phonology.

Chomsky, Noam. 1965. Aspects of the theory of syntax. MIT Research Laboratory of Electronics special technical report. Cambridge, MA: Massachusetts Institute of Technology Press.
One of Chomsky's most important publications. The first chapter (pp. 3–62) defines his way of approaching the study of language as a component of the human mind and emphasizes the goal that theory should account for how a child can acquire a language. The theory described here is known as the standard theory.

Chomsky, Noam. 1975. The logical structure of linguistic theory. New York: Plenum.
Chomsky's monumental work, completed in 1955 and published in 1975. Lays out the formal basis for a complete theory of linguistic structure. The concepts and technical notions (level of representation and syntactic transformation, among many others) that became central to linguistic theorizing were introduced in this text.

2.3 Formal Grammars

In the 1950s, Chomsky pursued the idea that a sentence is the result of a computation that produces a "derivation". This computation starts with an abstract structural representation that is sequentially altered by structure-dependent operations. These operations quickly became known as transformations. Building on work on recursive function theory from the late 1930s, Chomsky was able to refine this idea, and he developed algebraic linguistics, a branch of abstract algebra (in the early twenty-first century, part of the field of computer science). Chomsky wrote several important papers, including Chomsky (1956), Chomsky (1959), Chomsky (1963), and Chomsky and Schützenberger (1963), in which he introduced what would later be referred to as the Chomsky hierarchy (also called the Chomsky–Schützenberger hierarchy). Together with the renowned cognitive scientist George Miller, he also wrote an influential paper (Chomsky and Miller 1963) in which the distinction between competence and performance first emerged. Peters and Ritchie (1973) is a well-known formalization of the theory developed in Chomsky (1965, cited under Foundational Work). Jäger and Rogers (2012) is an overview and assessment of Chomsky's work on formal grammars.
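As an illustrative aside (this sketch is ours, not from the works cited; the function names are hypothetical), the key cut in the Chomsky hierarchy can be seen with the language {aⁿbⁿ : n ≥ 1}: a context-free grammar with the rules S → a S b | a b generates it, but no finite-state (regular) device can recognize it, because doing so requires unbounded counting.

```python
# A minimal sketch of why {a^n b^n} separates regular from context-free
# languages. cfg_generate derives a string with the context-free rules
# S -> a S b | a b; recognize checks membership by counting, a resource
# a finite automaton (with its fixed number of states) does not have.

def cfg_generate(n):
    """Derive a^n b^n (n >= 1) by applying S -> a S b, then S -> a b."""
    s = "S"
    for _ in range(n - 1):
        s = s.replace("S", "aSb")   # apply the recursive rule
    return s.replace("S", "ab")     # terminate the derivation

def recognize(string):
    """Return True iff string is in {a^n b^n : n >= 1}."""
    half = len(string) // 2
    return (len(string) % 2 == 0 and half >= 1
            and string[:half] == "a" * half
            and string[half:] == "b" * half)

print(cfg_generate(3))        # aaabbb
print(recognize("aaabbb"))    # True
print(recognize("aabbb"))     # False
```

The point of the sketch is only the asymmetry: generation needs a self-embedding rule (S inside S), and recognition needs memory that grows with the input, which is exactly what the finite-state level of the hierarchy lacks.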

Chomsky, Noam. 1956. Three models for the description of language. IRE Transactions on Information Theory 2.3: 113–124.
A formal statement of three increasingly complex models of language structure: finite state, phrase structure, and transformational. This classification has since become known as the Chomsky hierarchy.

Chomsky, Noam. 1959. On certain formal properties of grammars. Information and Control 2.2: 137–167.
Presents the classification of formal grammars along the Chomsky–Schützenberger hierarchy: recursively enumerable, context-sensitive, context-free, and regular languages.

Chomsky, Noam. 1963. Formal properties of grammars. In Handbook of mathematical psychology. Vol. 2. Edited by R. Duncan Luce, Robert R. Bush, and Eugene Galanter, 323–418. New York: Wiley.
Substantial discussion of abstract automata, context-sensitive grammars, and context-free grammars.

Chomsky, Noam, and George A. Miller. 1963. Introduction to the formal analysis of natural languages. In Handbook of mathematical psychology. Vol. 2. Edited by R. Duncan Luce, Robert R. Bush, and Eugene Galanter, 269–321. New York: Wiley.
This paper contains several classical psycholinguistic studies and introduces the distinction between competence and performance.

Chomsky, Noam, and Marcel-Paul Schützenberger. 1963. The algebraic theory of context-free languages. In Computer programming and formal systems. Edited by P. Braffort and D. Hirschberg, 118–161. Studies in Logic and the Foundations of Mathematics. Amsterdam: North-Holland.
Extensive examination of the link between formal language theory and abstract algebra.

Jäger, Gerhard, and James Rogers. 2012. Formal language theory: Refining the Chomsky hierarchy. Philosophical Transactions of the Royal Society B 367.1598: 1956–1970.
An overview and reassessment of Chomsky's work on formal languages.

Peters, P. Stanley, Jr., and R. W. Ritchie. 1973. On the generative power of transformational grammars. Information Sciences 6.1: 49–83.
A well-known formalization of the theory of transformational grammar in Chomsky 1965 (cited under Foundational Work) and a study of its expressive power.

2.4 Introductions and Biographies

There are many books that present Noam Chomsky's work, both his linguistic research and his political activities. Few books, however, go into great detail about Chomsky's life; the best one is Barsky (1997). Barsky (2011) mainly concerns Chomsky's teacher, Zellig S. Harris, but it contains a great deal of valuable material on the environment in which Chomsky grew up, as well as on the relationship between Harris and Chomsky. Lyons (1970), McGilvray (1999), McGilvray (2005), and Smith (2004) all provide more in-depth examinations of Chomsky's ideas and work. Bracken (1984) contains one of the best discussions of the relation between Chomsky and Descartes. Bricmont and Franck (2010) is a collection of essays that introduce numerous topics from a Chomskyan perspective.

Barsky, Robert F. 1997. Noam Chomsky: A life of dissent. Cambridge, MA: MIT Press.
The most detailed book about Chomsky. The author relies heavily on personal correspondence with Chomsky and traces the intellectual and political environments that contributed to shaping Chomsky's life. The book covers both Chomsky's linguistic research and his political work.

Barsky, Robert F. 2011. Zellig Harris: From American linguistics to socialist Zionism. Cambridge, MA: MIT Press.
This book is primarily about Chomsky's teacher, but it also describes in detail the environment in which Chomsky grew up. The relationship between Chomsky and Harris, a matter of some debate in the literature, is also explored in detail.

Bracken, Harry M. 1984. Mind and language: Essays on Descartes and Chomsky. Publications in Language Sciences. Dordrecht, The Netherlands, and Cinnaminson, NJ: Foris.
This collection contains several informative essays on Chomsky's approach to the study of language, including the relation between Chomsky and Descartes and a defense of the views that Chomsky put forward in Cartesian Linguistics (see Chomsky 2009, cited under Philosophy of Language: Early Contributions).

Bricmont, Jean, and Julie Franck, eds. 2010. Chomsky notebook. Columbia Themes in Philosophy. New York: Columbia Univ. Press.
This book addresses the wide range of topics to which Chomsky has devoted his career, including naturalism, the evolution of linguistic theory, and truth. The book also contains two essays by Chomsky.

Lyons, John. 1970. Chomsky. Fontana modern masters. London: Fontana.
The first book-length introduction to Chomsky's views on language and mind.

McGilvray, James. 1999. Chomsky: Language, mind, and politics. Key contemporary thinkers. Cambridge, UK, and Malden, MA: Polity.
An introduction to Chomsky's views on language, mind, and politics.

McGilvray, James, ed. 2005. The Cambridge companion to Chomsky. Cambridge, UK, and New York: Cambridge Univ. Press.
This collection of essays by prominent scholars deals with various aspects of Chomsky's work: human language, the human mind, and some of his political activities.

Smith, Neil. 2004. Chomsky: Ideas and ideals. 2nd ed. Cambridge, UK, and New York: Cambridge Univ. Press.
An examination of Chomsky's work and ideas for nonspecialists. Smith explains how these ideas have shaped modern linguistics and cognitive science, and he also looks at the controversies concerning Chomsky's work.

2.5 Interviews

There have been countless interviews with Chomsky over the years. Chomsky (1982), Chomsky (2004b), and Chomsky (2012) are book-length interviews. Chomsky (2004a) contains several interviews about language and mind, although the main focus is Chomsky's political views. Beckwith and Rispoli (1986) is an interview concentrating on language, learning, and psychology. Cheng and Sybesma (1995) and Cela-Conde and Marty (1998) are devoted to the Minimalist Program.

Beckwith, Richard, and Matthew Rispoli. 1986. Aspects of a theory of mind: An interview with Noam Chomsky. New Ideas in Psychology 4.2: 187–202.
Discusses psychology and language, innateness, and learning.

Cela-Conde, Camilo J., and Gisèle Marty. 1998. Noam Chomsky’s Minimalist Program and the philosophy of mind. Syntax 1.1: 19–36.
An interview about the minimalist program and its place within Chomsky’s philosophy of language and mind.

Cheng, Lisa, and Rint Sybesma. 1995. “Language is the perfect solution”: Interview with Noam Chomsky. Glot International 1.9–10: 31–34.
An interview on the minimalist program.

Chomsky, Noam. 1982. Noam Chomsky on The Generative Enterprise: A discussion with Riny Huybregts and Henk van Riemsdijk. Dordrecht, The Netherlands, and Cinnaminson, NJ: Foris.
An interview based on discussions in November 1979 and March 1980.

Chomsky, Noam. 2004a. The Generative Enterprise revisited: Discussions with Riny Huybregts, Henk van Riemsdijk, Naoki Fukui and Mihoko Zushi. Berlin and New York: Mouton de Gruyter.
A republication of Chomsky (1982), with the addition of an interview updating what happened in the field in the subsequent twenty years.

Chomsky, Noam. 2004b. Language and politics. 2nd ed. Edited by C. P. Otero. Oakland, CA: AK.
This comprehensive volume contains more than fifty interviews, conducted between 1968 and 2002. Many of them deal with Chomsky’s political views, but several are devoted to issues related to language.

Chomsky, Noam. 2012. The science of language: Interviews with James McGilvray. Cambridge, UK, and New York: Cambridge Univ. Press.
A book-length collection of interviews on the ties between Chomsky’s linguistics and his conception of the human mind. The book also includes a number of explanatory texts by McGilvray.

2.6 Assessments

There are a number of assessments of Chomsky’s ideas and work. They all cite much relevant literature that should be explored further. Lees (1957) is an influential review of Chomsky’s first book, Syntactic Structures (see Chomsky 1957, cited under Foundational Work). Harman (1974) is an early collection of essays on Chomsky’s approach to language. Hymes (2001) reprints a 1971 critique of Chomsky’s distinction between competence and performance. Cowie (1999) and Sampson (2005) are more recent critical discussions of Chomsky’s work on language. Otero (1994) is a very comprehensive assessment of Chomsky’s work, including reprints of previously published papers that can be hard to access. Antony and Hornstein (2003) is a collection of advanced essays on Chomsky’s linguistic work. Piattelli-Palmarini (1980) contains a classic discussion of a meeting at which Chomsky and Jean Piaget debated their different views on language and cognition.

Antony, Louise M., and Norbert Hornstein, eds. 2003. Chomsky and his critics. Philosophers and Their Critics. Malden, MA: Blackwell.
A collection of critical essays on Chomsky’s linguistic work, with replies from Chomsky.

Cowie, Fiona. 1999. What’s within? Nativism reconsidered. Philosophy of Mind. New York: Oxford Univ. Press.
This book is a critique of nativism and of Chomsky’s work.

Harman, Gilbert, ed. 1974. On Noam Chomsky: Critical essays. Modern Studies in Philosophy. Garden City, NY: Anchor.
An anthology examining Chomsky’s work on linguistics, philosophy, and psychology.

Hymes, Dell. 2001. On communicative competence. In Linguistic anthropology: A reader. Edited by Alessandro Duranti, 53–73. Malden, MA: Blackwell.
A critique of Chomsky’s distinction between competence and performance.

Lees, Robert B. 1957. “Syntactic structures by Noam Chomsky”. Language 33.3: 375–408.
A very influential review of Chomsky’s first major publication. This review, which is very positive, also foreshadows the focus on psychology that Chomsky adopted soon after.

Otero, Carlos P., ed. 1994. Noam Chomsky: Critical assessments. 4 vols. London and New York: Routledge.
A four-volume collection of essays on Chomsky’s work and ideas.

Piattelli-Palmarini, Massimo, ed. 1980. Language and learning: The debate between Jean Piaget and Noam Chomsky. Papers presented at a colloquium, Paris, October 1975. Cambridge, MA: Harvard Univ. Press.
A conference report centered on a discussion between Chomsky and Piaget.

Sampson, Geoffrey. 2005. The language instinct debate. Rev. ed. London and New York: Continuum.
A book-length criticism of Chomsky’s views and claims regarding innate structure.

2.7 Textbooks

There are many textbooks on generative syntax, as developed by Chomsky. Included here are a few that will provide a useful introduction to various periods and aspects of Chomsky’s work, both technical and more general. Adger (2003) is an accessible introduction to syntax from a Minimalist Program perspective, whereas Hornstein et al. (2005) gives a more in-depth introduction, based on what government and binding had accomplished. Radford (2009) is yet another introduction to the Minimalist Program, mostly stressing English data. Boeckx (2006) presents the Minimalist Program and offers the best discussion of its conceptual and historical origins and motivations. Haegeman (1994) is the most authoritative introduction to government and binding. Lasnik (2000) goes through Chomsky’s early work and then looks at connections between that work and more contemporary theories, such as the Minimalist Program. Jenkins (2000) focuses especially on the biological orientation to grammar, as found in Chomsky’s work. For introductions to his philosophy of language, see Introductions and Biographies.

Adger, David. 2003. Core syntax: A minimalist approach. Core Linguistics. Oxford and New York: Oxford Univ. Press.
This is an introduction to the minimalist way of doing syntax, concentrating in particular on how syntax is driven by features.

Boeckx, Cedric. 2006. Linguistic minimalism: Origins, concepts, methods, and aims. Oxford Linguistics. Oxford and New York: Oxford Univ. Press.
A useful introduction to the minimalist program, discussing the motivation and conceptual aspects of the program.

Haegeman, Liliane. 1994. Introduction to government and binding theory. 2nd ed. Blackwell Textbooks in Linguistics. Oxford and Cambridge, MA: Blackwell.
The most authoritative introduction to government and binding. Essential reading for anyone who wants to attain an understanding of this approach to syntax.

Hornstein, Norbert, Jairo Nunes, and Kleanthes K. Grohmann. 2005. Understanding minimalism. Cambridge Textbooks in Linguistics. Cambridge, UK, and New York: Cambridge Univ. Press.
This is an introduction to the minimalist program that also explains why minimalism succeeded government and binding. The best book on the market for those who want to understand that transition.

Jenkins, Lyle. 2000. Biolinguistics: Exploring the biology of language. Cambridge, UK, and New York: Cambridge Univ. Press.
This book covers the biolinguistic approach to the study of language and focuses in particular on outlining Chomsky’s version of this program.

Lasnik, Howard. 2000. Syntactic structures revisited: Contemporary lectures on classic transformational theory. Current Studies in Linguistics 33. Cambridge, MA: MIT Press.
A thorough introduction to the theory presented in Chomsky 1957 (cited under Foundational Work). Lasnik also shows how the problems in that book are still very relevant to early-twenty-first-century theorizing.

Radford, Andrew. 2009. Analysing English sentences: A minimalist approach. Cambridge Textbooks in Linguistics. Cambridge, UK, and New York: Cambridge Univ. Press.
This textbook introduces the minimalist program through an in-depth study of the syntax of English.

2.9 Extended Standard Theory

In the 1970s the extended standard theory grew out of the standard theory, as presented in Chomsky (1965, cited under Foundational Work). In particular, the theory of semantic interpretation is significantly changed, with Surface Structure playing an increasingly significant role (Chomsky 1970b, Chomsky 1972, Chomsky 1975). It is also during this period that X-bar theory is proposed for the first time (Chomsky 1970a; although the conception of this theory changes quite a bit in the early 1980s). During this period, Chomsky published a series of highly influential papers. Chomsky (1973) and Chomsky (1977) are especially important for their attempts at generalizing the locality constraints on movement that John Robert Ross discovered in his 1967 Massachusetts Institute of Technology dissertation “Constraints on Variables in Syntax” (see also Chomsky 1976). Chomsky and Lasnik (1977) is also significant as a predecessor of principles and parameters theory.

Chomsky, Noam. 1970a. Remarks on nominalization. In Readings in English transformational grammar. Edited by Roderick A. Jacobs and Peter S. Rosenbaum, 184–221. Waltham, MA: Ginn.
A paper that provides a strikingly nontransformational view of how the derivation of complex words fits into the grammar. Also suggests the X-bar theory of phrase structure, whereby every phrase is a projection of a head.

Chomsky, Noam. 1970b. Deep Structure, Surface Structure and semantic interpretation. In Studies in general and Oriental linguistics, presented to Shirô Hattori on the occasion of his sixtieth birthday. Edited by Roman Jakobson and Shigeo Kawamoto, 52–91. Tokyo: Tokyo English Center.
This paper addresses the inadequacies of the standard theory. Chomsky proposes a revised theory of semantic interpretation, lessening the role of Deep Structure.

Chomsky, Noam. 1972. Studies on semantics in generative grammar. Janua Linguarum. The Hague: Mouton.
A collection of three essays defining extended standard theory. The status of Deep Structure is a central concern.

Chomsky, Noam. 1973. Conditions on transformations. In A festschrift for Morris Halle. Edited by Stephen R. Anderson and Paul Kiparsky, 232–286. New York: Holt, Rinehart and Winston.
Chomsky’s first far-reaching attempt at replacing conditions on specific transformations with general constraints on transformations that would capture restrictions on movement (e.g., subjacency) and relations more generally (e.g., the tensed sentence condition).

Chomsky, Noam. 1975. Questions of form and interpretation. Linguistic Analysis 1.1: 75–109.
A general paper on questions related to interpretation and grammatical levels.

Chomsky, Noam. 1976. Conditions on rules of grammar. Linguistic Analysis 2.4: 303–351.
This paper develops and refines the theory in “Conditions on Transformations” (see Chomsky 1973), whereby all movement rules leave behind a trace. Certain previous constraints on movement now become constraints on the relation between a trace and its antecedent.

Chomsky, Noam. 1977. On wh-movement. Paper presented at the Mathematical Social Science Board—UC Irvine Conference on the Formal Syntax of Natural Language, Newport Beach, CA, 1–11 June 1976. In Formal syntax. Edited by Peter W. Culicover, Thomas Wasow, and Adrian Akmajian, 71–132. New York: Academic Press.
Chomsky argues that what had been considered a range of different transformations should all be captured as instantiations of wh-movement.

Chomsky, Noam, and Howard Lasnik. 1977. Filters and control. Linguistic Inquiry 8.3: 425–504.
Focusing on explanatory adequacy, this paper suggests that transformational rules are very general and that the output of these rules is filtered in order to yield only grammatical representations.

2.10 Principles and Parameters Theory

Toward the end of the 1970s, in part through the work done together with Howard Lasnik in their 1977 paper (see Chomsky and Lasnik 1977, cited under Extended Standard Theory), Chomsky started developing a new approach, whereby language- and construction-specific rules are replaced by very general operations. Certain operations and rules are universal, and they constitute the principles. There is limited variation among the world’s languages, and this variation is considered to be captured by parameters. If true, these principles and parameters would provide a solution to the fundamental problem of language acquisition. Chomsky (1981) outlines this program; more details can be found in Lectures on Government and Binding (see Chomsky 1981, cited under Government and Binding). Chomsky and Lasnik (1993) is a synthesis of work that happened throughout the 1980s, and it laid the ground for the Minimalist Program. More recently the logic and empirical validity behind principles and parameters have been criticized (Newmeyer 2005, Boeckx 2011). Principles and parameters theory comes in two different guises: One guise is government and binding, the approach that Chomsky developed circa 1980. The other guise is the Minimalist Program, which began to develop in the late 1980s and which continues its evolution in the early twenty-first century. These are superficially very different, but there is also a sense of continuity between the two; the Minimalist Program can be seen as a rationalization of the principles and generalizations that were discovered during government and binding.

Boeckx, Cedric. 2011. Approaching parameters from below. In The biolinguistic enterprise: New perspectives on the evolution and nature of the human language faculty. Edited by Anna Maria Di Sciullo and Cedric Boeckx, 205–221. Oxford Linguistics. Oxford and New York: Oxford Univ. Press.
A critique of the principles and parameters framework from a minimalist program perspective.

Chomsky, Noam. 1981. Principles and parameters in syntactic theory. In Explanation in linguistics: The logical problem of language acquisition. Edited by Norbert Hornstein and David Lightfoot, 32–75. Longman Linguistics Library. London and New York: Longman.
An outline of principles and parameters theory.

Chomsky, Noam, and Howard Lasnik. 1993. The theory of principles and parameters. In Syntax: An international handbook of contemporary research. Vol. 1. Edited by Joachim Jacobs, Arnim von Stechow, Wolfgang Sternefeld, and Theo Vennemann, 506–569. Handbooks of Linguistics and Communication Science. Berlin and New York: de Gruyter.
An overview of the principles and parameters theory, as developed in government and binding and early aspects of the minimalist program.

Newmeyer, Frederick J. 2005. Possible and probable languages: A generative perspective on linguistic typology. Oxford and New York: Oxford Univ. Press.
A critique of the principles and parameters approach, arguing that the crosslinguistic generalizations that it relies on are less solid than previously assumed.

2.10.1 Government and Binding

This approach is based on the interaction of several modules (subtheories), government and binding being two of the most important ones. Principles such as the theta criterion and the extended projection principle were formulated and explored, and, in one form or another, they are still an important part of early-twenty-first-century theories. Chomsky (1980) examines indexing and case theory in particular, whereas Chomsky (1981) offers a comprehensive theory of syntax. Chomsky (1982) and Chomsky (1986a) develop this further by studying parasitic gaps, a topic that plays a prominent role in government and binding. Lasnik (1994) is included in this section because it provides a useful overview of the development of Chomsky’s proposals concerning anaphora and binding theory, from 1973 to 1986. Chomsky (1986b) introduces the distinction between I-language and E-language, explaining that I-language refers to the study of the individual and internal language of a speaker, whereas E-language is a broad label for language use.

Chomsky, Noam. 1980. On binding. Linguistic Inquiry 11.1: 1–46.
This highly technical paper created two of the pillars of government and binding: binding theory and case theory. Indexing is the core theoretical principle in this essay, and many of the filters in “Filters and Control” (see Chomsky and Lasnik 1977, cited under Extended Standard Theory) are reinterpreted as effects of case theory.

Chomsky, Noam. 1981. Lectures on government and binding. Papers presented at the Generative Linguistics in the Old World conference, Pisa, April 1979. Dordrecht, The Netherlands, and Cinnaminson, NJ: Foris.
Based on lectures given in Pisa in 1979, this is the first extensive survey of syntactic theory since Chomsky 1965 (cited under Foundational Work). The theory is based on a small number of modules.

Chomsky, Noam. 1982. Some concepts and consequences of the theory of government and binding. Linguistic Inquiry Monographs 6. Cambridge, MA: MIT Press.
This monograph extends government and binding theory to empirical domains, most notably, properties relating to empty categories and parasitic gaps.

Chomsky, Noam. 1986a. Barriers. Linguistic Inquiry Monographs 13. Cambridge, MA: MIT Press.
An extensive analysis of locality, based on bounding nodes, subjacency, and barriers.

Chomsky, Noam. 1986b. Knowledge of language: Its nature, origin, and use. Convergence. Westport, CT: Praeger.
The first half of this book presents Chomsky’s views on how to study language, introducing the distinction between I-language and E-language. The second half extends case theory and analyzes expletive constructions in detail. It also introduces the economy principle of full interpretation.

Lasnik, Howard. 1994. Noam Chomsky on anaphora. In Noam Chomsky: Critical assessments. Vol. 1, Linguistics. Edited by Carlos P. Otero, 574–606. London and New York: Routledge.
An overview of Chomsky’s analyses of anaphora from 1973 to 1986.

2.10.2 The Minimalist Program

The principles and parameters theory developed in the late 1980s into the Minimalist Program. Four papers are collected in Chomsky (1995). Chapter 2 was written and circulated in 1988, based on lectures in 1986 and 1987. It was originally published in 1991 (see Principles and Parameters in Comparative Grammar, edited by Robert Freidin [Cambridge, MA: MIT Press]). Chapter 1 is the essay Chomsky and Lasnik (1993, cited under Principles and Parameters Theory), on principles and parameters, whereas chapters 3 and 4 offer more detailed presentations of the Minimalist Program. The goal of the Minimalist Program is to rationalize the principles of government and binding, that is, to provide a deeper understanding of the core syntactic mechanisms and operations. Since the 1990s the program has continued to develop, and Chomsky (2000), Chomsky (2001), Chomsky (2004), Chomsky (2007), and Chomsky (2008) all further the technical and conceptual details. In particular, these works have developed the notion of a phase, a specific domain of syntactic computation. Berwick et al. (2011) revisits poverty of stimulus arguments, that is, arguments postulating the existence of innate, tacit knowledge of language.

Berwick, Robert C., Paul Pietroski, Beracah Yankama, and Noam Chomsky. 2011. Poverty of the stimulus revisited. Cognitive Science 35.7: 1207–1242.
A reply to several publications on the poverty of the stimulus, clarifying the logic behind the concept.

Chomsky, Noam. 1995. The minimalist program. Current Studies in Linguistics 28. Cambridge, MA: MIT Press.
A collection of four essays that illustrate the development of the minimalist program as well as presenting many of its technicalities.

Chomsky, Noam. 2000. Minimalist inquiries: The framework. In Step by step: Essays on minimalist syntax in honor of Howard Lasnik. Edited by Roger Martin, David Michaels, and Juan Uriagereka, 89–155. Cambridge, MA: MIT Press.
In this paper, Chomsky introduces the concept of a phase (encompassing some aspects of the earlier “cyclic node” and “barrier”), which has become an important part of the minimalist program.

Chomsky, Noam. 2001. Derivation by phase. In Ken Hale: A life in language. Edited by Michael Kenstowicz, 1–52. Current Studies in Linguistics 36. Cambridge, MA: MIT Press.
This paper further develops the notion of phase.

Chomsky, Noam. 2004. Beyond explanatory adequacy. Paper presented at a workshop, Siena, 1999. In The cartography of syntactic structures. Vol. 3, Structures and beyond. Edited by Adriana Belletti, 104–131. Oxford Studies in Comparative Syntax. New York: Oxford Univ. Press.
Here, Chomsky suggests that the study of human language can move beyond explanatory adequacy and asks why language is structured just the way it is.

Chomsky, Noam. 2007. Approaching UG from below. In Interfaces + recursion = language? Chomsky’s minimalism and the view from syntax-semantics. Edited by Uli Sauerland and Hans-Martin Gärtner, 1–29. Studies in Generative Grammar. Berlin: Mouton de Gruyter.
This paper focuses on making universal grammar as small as possible.

Chomsky, Noam. 2008. On phases. In Foundational issues in linguistic theory: Essays in honor of Jean-Roger Vergnaud. Edited by Robert Freidin, Carlos P. Otero, and María-Luísa Zubizarreta, 133–166. Current Studies in Linguistics 45. Cambridge, MA: MIT Press.
This paper further develops the technology for phase-based derivations.

2.11 Biolinguistics

Since the early 2000s it has become increasingly common to use the label “biolinguistics” instead of “Minimalist Program”. In the literature there is disagreement as to whether biolinguistics is just another name for Chomskyan generative grammar or whether it is a new approach, with a different focus, compared with the Minimalist Program. With Hauser, Chomsky, and Fitch (2002) and Fitch, Hauser, and Chomsky (2005), a whole subfield has been devoted to studying language evolution and trying to figure out which parts of our genetic endowment are unique and language-specific. Chomsky (2005) provides a framework in which one can study the different factors that enter into the design of language, and this is further developed in Chomsky (2010) and in Berwick and Chomsky (2011). Uriagereka (1998) is an early textbook attempt at exploring the biology of language from essentially the perspective later presented in Chomsky (2005).

Berwick, Robert C., and Noam Chomsky. 2011. The biolinguistic program: The current state of its development. In The biolinguistic enterprise: New perspectives on the evolution and nature of the human language faculty. Edited by Anna Maria Di Sciullo and Cedric Boeckx, 19–41. Oxford and New York: Oxford Univ. Press.
A review of where biolinguistics stands in 2011, focusing on the origin of language and language variation.

Chomsky, Noam. 2005. Three factors in language design. Linguistic Inquiry 36.1: 1–22.
This paper outlines the three factors involved in studying language: genetic endowment, experience, and general principles of computation and physical laws.

Chomsky, Noam. 2010. Some simple evo devo theses: How true might they be for language? In The evolution of human language: Biolinguistic perspectives. Edited by Richard K. Larson, Viviane Déprez, and Hiroko Yamakido, 45–62. Approaches to the Evolution of Language. Cambridge, UK, and New York: Cambridge Univ. Press.
This paper is concerned with some fundamental issues that arise when pursuing the study of language from a biological perspective. Chomsky places particular emphasis on the nature of merge and of the interfaces.

Fitch, W. Tecumseh, Marc D. Hauser, and Noam Chomsky. 2005. The evolution of the language faculty: Clarifications and implications. Cognition 97.2: 179–210.
This is a detailed reply to Ray Jackendoff and Steven Pinker, who, in turn, had replied to Hauser, Chomsky, and Fitch (2002). The paper concentrates especially on evolution.

Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science 298.5598: 1569–1579.
This seminal paper discusses the evolution of language and distinguishes between the faculty of language in the narrow sense (FLN) and the faculty of language in the broad sense (FLB).

Uriagereka, Juan. 1998. Rhyme and reason: An introduction to minimalist syntax. Cambridge, MA: MIT Press.
An introduction to the minimalist program from the perspectives of biology and physics.

2.12 Phonology

Although most of Chomsky’s linguistic work has been on syntax and the philosophy of language, he also did groundbreaking work in phonology (Chomsky 1967). Much of this research was in collaboration with Morris Halle, at the Massachusetts Institute of Technology. Chomsky and Halle (1968) is a momentous book, studying the phonology of English in great detail (see also Chomsky and Halle 1965). Here, the authors also introduced a theory of markedness, which came to play a significant role in syntax in the late 1970s and early 1980s, as in the paper “Filters and Control,” written together with Howard Lasnik (see Chomsky and Lasnik 1977, cited under Extended Standard Theory). Another influential paper was coauthored with Halle and Fred Lukoff (Chomsky et al. 1956). In this paper the concept of a phonological cycle was presented, and the notion of a cycle has been a cornerstone of all generative work on syntax since the mid-1960s.

Chomsky, Noam. 1967. Some general properties of phonological rules. Language 43.1: 102–128.
A paper that discusses some general principles and rules that govern the phonological component of the grammar.

Chomsky, Noam, and Morris Halle. 1965. Some controversial questions in phonological theory. Journal of Linguistics 1.2: 97–138.
A reply to a criticism by Fred W. Householder of the generative program, with particular emphasis on phonological questions.

Chomsky, Noam, and Morris Halle. 1968. The sound pattern of English. Studies in Language. New York: Harper and Row.
The fundamental work on phonology from a generative perspective. The text contains several groundbreaking analyses of the phonology of English.

Chomsky, Noam, Morris Halle, and Fred Lukoff. 1956. On accent and juncture in English. In For Roman Jakobson: Essays on the occasion of his sixtieth birthday, 11 October 1956. Edited by Morris Halle, Horace G. Lunt, Hugh McLean, and Cornelis H. van Schooneveld, 65–80. The Hague: Mouton.
This paper presents the phonological cycle, at the time a new concept within generative theory, later extended to syntax in Chomsky 1965 (cited under Foundational Work).

2.13 Philosophy of Language

Chomsky has made profound contributions to the philosophy of language and philosophy of mind. In addition to developing his own approach, he has been at pains to situate his work in relation to more mainstream work within the philosophy of language. In particular, he has discussed and criticized philosophers such as Willard Quine, Hilary Putnam, John Searle, Paul Grice, and Michael Dummett for their work based on externalism and referentiality. Chomsky has also argued that formal semantics, as traditionally developed by Gottlob Frege, Alfred Tarski, and Richard Montague, is fundamentally misguided when it comes to natural language because of its emphasis on denotation and truth. Instead, Chomsky has advocated an approach based on internalism about meaning, and he has focused more on meaning than on semantics. The relevant publications are divided into Early Contributions and Later Contributions, although there are no sharp differences between these two categories.

2.13.1 Early Contributions

The first significant publication in which Chomsky presents his ideas on the philosophy of language and philosophy of mind is Chomsky (2009). Chomsky (1967) is a very accessible and useful summary of the leading ideas in Chomsky (2009). Chomsky (1975) is a very useful collection that includes a look at innateness. Chomsky (1977) and Chomsky (2005) are further developments of the author’s ideas, in part aimed at a general audience. Chomsky (1986) is well known for introducing the distinction between I-language and E-language, and Chomsky (1988) further develops these ideas. Searle (1972) is a critical discussion of Chomsky’s views.

Chomsky, Noam. 1967. Recent contributions to the theory of innate ideas: Summary of oral presentation. In Special issue: Including a symposium on innate ideas. Synthese 17.1: 2–11.
A very clear and brief summary of the core thesis that there is a relationship between the classical doctrine of innate ideas and a theory of psychological a priori principles.

Chomsky, Noam. 1975. Reflections on language. New York: Pantheon.
An early and relatively nontechnical look at the issues concerning the study of language and mind. The book contains a lengthy defense of innateness and covers the central features of Chomsky’s approach to studying language.

Chomsky, Noam. 1977. Essays on form and interpretation. Studies in Linguistic Analysis. New York: North-Holland.
A collection of essays that mainly focus on semantic interpretation.

Chomsky, Noam. 1986. Knowledge of language: Its nature, origin, and use. Convergence. Westport, CT: Praeger.
In this influential book, Chomsky outlines his conception of knowledge and the distinction between I-language and E-language. This is core reading for attaining an understanding of Chomsky’s approach to studying language.

Chomsky, Noam. 1988. Language and problems of knowledge: The Managua lectures. Current Studies in Linguistics 16. Cambridge, MA: MIT Press.
This book develops many of the same concepts as Chomsky (1986), although with a different emphasis. Most of the examples in this book are drawn from Spanish.

Chomsky, Noam. 2005. Rules and representations. Columbia Classics in Philosophy. New York: Columbia Univ. Press.
Originally published in 1980. This book elucidates the Chomskyan approach to linguistics. Chomsky examines various principles that belong to universal grammar and considers some of the philosophical consequences.

Chomsky, Noam. 2009. Cartesian linguistics: A chapter in the history of rationalist thought. 3rd ed. Edited by James McGilvray. Studies in Language. Cambridge, UK, and New York: Cambridge Univ. Press.
This short but challenging book provides the foundation for modern rationalist approaches to language, and Chomsky reviews the history since the sixteenth century. The book presents the principles of Chomsky’s philosophy of language. Originally published in 1966 (New York: Harper & Row).

Searle, John R. 1972. Chomsky’s revolution in linguistics. New York Review of Books 18.12 (29 June): 16–24.
A much-cited critical discussion of Chomsky’s theories of language.

2.13.2 Later Contributions

In Chomsky (1975, cited under Early Contributions), Chomsky introduces a distinction between problems and mysteries. Problems are things that can generally be solved, whereas mysteries may be outside the reach of our intellectual abilities. He develops this idea further in Chomsky (1991). Chomsky (1993) is a more general introduction to his ideas about the relationship between language and thought; Chomsky (2002) and Chomsky (2006) provide further details. Chomsky (2000) is the best collection of essays on the philosophy of language and mind but arguably is also the most challenging one for the reader. Chomsky (2009) is a discussion of general issues in the philosophy of mind.

Chomsky, Noam. 1991. Linguistics and cognitive science: Problems and mysteries. Paper presented at the international workshop “The Chomskyan Turn: Generative Linguistics, Philosophy, Mathematics, and Psychology,” Tel Aviv, 11–14 April 1988. In The Chomskyan turn. Edited by Asa Kasher, 26–53. Cambridge, MA: Blackwell.
This chapter looks at the relationship between linguistics and, in particular, the philosophy of language. Chomsky criticizes well-known approaches, such as those of Quine and Dummett. He also elaborates on his distinction between “problems” and “mysteries”.

Chomsky, Noam. 1993. Language and thought. Anshen Transdisciplinary Lectureships in Art, Science, and the Philosophy of Culture. Wakefield, RI: Moyer Bell.
Introduces Chomsky’s views on the study of language and considers its influence on other disciplines.

Chomsky, Noam. 2000. New horizons in the study of language and mind. Cambridge, UK, and New York: Cambridge Univ. Press.
The most comprehensive and advanced examination of language and mind and of issues that are prominent in the philosophy of language literature. This book requires good background knowledge of the latter literature.

Chomsky, Noam. 2002. On nature and language. Edited by Adriana Belletti and Luigi Rizzi. Cambridge, UK, and New York: Cambridge Univ. Press.
This book contains two chapters by Chomsky: one that traces the history of modern linguistics and cognitive science and a second that focuses on linguistics and the brain sciences. There is also an extensive interview with Chomsky on the minimalist program.

Chomsky, Noam. 2006. Language and mind. 3rd ed. Cambridge, UK, and New York: Cambridge Univ. Press.
Originally published in 1968. This collection of essays aimed at university audiences explores Chomsky’s views on language and mind. A very accessible introduction, without many technicalities.

Chomsky, Noam. 2009. The mysteries of nature: How deeply hidden? In Special issue: Our knowledge of nature and number: Grounds and limits. Edited by Carol Rovane. Journal of Philosophy 106.4: 167–200.
A philosophical paper discussing the mental, the mind–body problem, and physicalism.

2.14 Controversies

There have been a number of controversies concerning Chomsky's work, and the focus in this section is on one of them. Harris (1993) calls this controversy "the linguistics wars", referring to the debate between Chomsky and those scholars who developed generative semantics, a model proposing much more abstract underlying syntactic forms and, concomitantly, much more powerful transformational operations. Huck and Goldsmith (1995) provides another take on the same issue, though concentrating more on external reasons for the breakdown of generative semantics than Harris does. Newmeyer (1996) argues that external factors are not important and that the generative semantics enterprise ended because the theory was falsified. Seuren (1998) offers yet another perspective, in addition to discussing the history of generative linguistics.

Harris, Randy Allen. 1993. The linguistics wars. New York: Oxford Univ. Press. This is an in-depth study of the rift between generative semantics and Chomskyan theorists.

Huck, Geoffrey J., and John A. Goldsmith. 1995. Ideology and linguistic theory: Noam Chomsky and the Deep Structure debates. History of Linguistic Thought. London and New York: Routledge. Provides another view on the breakdown of generative semantics.

Newmeyer, Frederick J. 1996. Generative linguistics: A historical perspective. Routledge History of Linguistic Thought. London and New York: Routledge. Contains several chapters on the history of the field, including extensive discussion of why generative semantics did not work.

Seuren, Pieter A. M. 1998. Western linguistics: An historical introduction. Oxford and Malden, MA: Blackwell. A very comprehensive discussion of the history of Western linguistics, including generative linguistics.

2.15 Applications of Chomsky's Work

Chomsky's ideas have been applied in a number of different areas. A distinction is made between those scholars who have stayed close to Chomsky's ideas and those who adopt a generative approach with significant modifications.

2.15.1 Chomskyan Perspectives

Most of Chomsky's work has concentrated on synchronic syntax, phonology, and the philosophy of language. There are several significant extensions of this work and of Chomsky's ideas, far too many to do justice to in this article. Therefore, a very small selection is given and includes Lightfoot (1979) and Roberts (2007), on diachronic syntax; Crain and Thornton (1998), on acquisition; Larson and Segal (1995), McGilvray (1998), and Pietroski (2005), on meaning; and Hale and Reiss (2008) and Samuels (2011), on phonology.

Crain, Stephen, and Rosalind Thornton. 1998. Investigations in Universal Grammar: A guide to experiments on the acquisition of syntax and semantics. Language, Speech, and Communication. Cambridge, MA: MIT Press. Outlines work on language acquisition, including extensive discussion of how to conduct experiments on children, from a Chomskyan perspective.

Hale, Mark, and Charles Reiss. 2008. The phonological enterprise. Oxford Linguistics. Oxford and New York: Oxford Univ. Press. This book defends the claim that phonology should be studied as a mental computational system, and it critiques optimality theory.

Larson, Richard, and Gabriel Segal. 1995. Knowledge of meaning: An introduction to semantic theory. Cambridge, MA: MIT Press. Provides a way of formalizing a theory of meaning, based on a Chomskyan approach.

Lightfoot, David W. 1979. Principles of diachronic syntax. Cambridge Studies in Linguistics. Cambridge, UK, and New York: Cambridge Univ. Press. The first comprehensive attempt to account for language change from a generative perspective.

McGilvray, James. 1998. Meanings are syntactically individuated and found in the head. Mind and Language 13.2: 225–280. This paper offers a theory of meaning internal to the speaker's mind and develops several of Chomsky's ideas on meaning.

Pietroski, Paul M. 2005. Events and semantic architecture. Oxford and New York: Oxford Univ. Press. A Chomskyan theory of meaning, combined with Davidsonian event structures.

Roberts, Ian. 2007. Diachronic syntax. Oxford Textbooks in Linguistics. Oxford and New York: Oxford Univ. Press. A textbook overview of generative work on diachronic syntax.

Samuels, Bridget D. 2011. Phonological architecture: A biolinguistic perspective. Oxford Studies in Biolinguistics. Oxford and New York: Oxford Univ. Press. A theory of phonology and the syntax–phonology interface from a biological perspective.

2.15.2 Other Generative Approaches

Common to these generative approaches is that they share with Chomskyan approaches an overall commitment to providing a computational theory of syntax and other components of the grammar, but their technical assumptions differ significantly from what is found in Chomskyan work. For example, many of these approaches to syntax do not assume transformations. Gazdar, Klein, Pullum, and Sag (1985) introduces generalized phrase structure grammar; Joshi (1985) and Frank (2002), tree-adjoining grammar; Kaplan and Bresnan (1982), lexical-functional grammar; and Pollard and Sag (1994), head-driven phrase structure grammar. Optimality theory is included in this group because it is not derivational in the traditional generative way. Prince and Smolensky (2004) details this framework.

Frank, Robert. 2002. Phrase structure composition and syntactic dependencies. Current Studies in Linguistics 38. Cambridge, MA: MIT Press. A proposal for how to integrate the minimalist approach with tree-adjoining grammar.

Gazdar, Gerald, Ewan Klein, Geoffrey K. Pullum, and Ivan A. Sag. 1985. Generalized Phrase Structure Grammar. Oxford: Blackwell. Outlines and defends generalized phrase structure grammar, a model that increases the power of phrase structure rules to do the work that transformations do in Chomskyan models.

Joshi, Aravind K. 1985. Tree adjoining grammars: How much context-sensitivity is required to provide reasonable structural descriptions? In Natural language parsing: Psychological, computational, and theoretical perspectives. Edited by David R. Dowty, Lauri Karttunen, and Arnold M. Zwicky, 206–250. Studies in Natural Language Processing. Cambridge, UK: Cambridge Univ. Press. Outlines and defends tree-adjoining grammar.

Kaplan, Ronald M., and Joan Bresnan. 1982. Lexical-functional grammar: A formal system for grammatical representation. In The mental representation of grammatical relations. Edited by Joan Bresnan, 173–281. MIT Press Series on Cognitive Theory and Mental Representation. Cambridge, MA: MIT Press. Outlines and defends lexical-functional grammar.

Pollard, Carl, and Ivan A. Sag. 1994. Head-driven phrase structure grammar. Studies in Contemporary Linguistics. Chicago: Univ. of Chicago Press. Outlines and defends head-driven phrase structure grammar.

Prince, Alan, and Paul Smolensky. 2004. Optimality Theory: Constraint interaction in generative grammar. Malden, MA: Blackwell. Details the contemporary mainstream approach to phonology.

3 Comp-t Effects
Variation in the Position and Features of C

3.1 Introduction*

Much variation among natural languages is attributed to the complementizer area. In many ways, this started with the study of that-trace (that-t) effects over thirty years ago (Perlmutter 1971). Recently, phenomena like that-t have gained renewed interest within the Minimalist Program. In particular, that-t effects have been claimed to provide a window into one of the strategies languages resort to in order to extract displaced subjects (Rizzi 2006, Rizzi and Shlonsky 2007, Boeckx 2008), which is particularly interesting since it is generally assumed that subjects are frozen when they have reached the canonical subject position (labeled SpecTP in the present paper), as argued by, for example, Takahashi (1994), Stepanov (2001), Boeckx (2003, 2008) and Rizzi (2006) (though see Chomsky 2008 for a different view). Rizzi (2006) argues that there are mainly two ways to escape being frozen in SpecTP. One way is to be a null subject language, since such languages have the ability to insert an expletive pro in SpecTP; the other way is to insert an overt expletive.1 The first alternative is shown in (1) for Italian and the second in (2) for English (Rizzi 2006: 124):

(1) a. Chi credi che [pro verrà]?
       who do you think.2S that pro will come.FUT
    b. Credo che verrà Gianni.
       I think.1S that will come.3S Gianni

(2) a. *What do you think that is in the box?
    b. What do you think that there is in the box?

Both of these strategies derive the desired result: SpecTP is filled, which makes it possible to extract the subject from a lower position. However, there are also cases where no such material is merged in SpecTP. Cases in point are the variation in that-t effects in English and the fact that varieties of Norwegian do not have any that-t effects. Boeckx (2008) presents an explanation of the variation in English by way of variable positions in a split-CP framework (Rizzi 1997). The goal of the present chapter is to extend his approach to the Scandinavian languages, and to other instances of what could generally be called Comp-trace (C-t) effects. I argue that C-t effects can be derived from different features on Fin° and by positing variable positions in the complementizer domain. The chapter is organized as follows: Section 3.2 presents Boeckx's (2008) account of that-t effects in English. Section 3.3 extends this theory to the Scandinavian languages and includes a short discussion of extraction from embedded V2 clauses. I also show that his theory is able to deal with other C-t phenomena in section 3.4. Section 3.5 deals with the fact that relative clauses show a reversed C-t effect, and also makes some remarks on the kind of parameter we are dealing with. Section 3.6 concludes the paper.

3.2 A Theory of That-t in English

That-t effects have been an important object of study in generative grammar for over thirty years. It was first thought that it was possible to formulate a universal filter (Chomsky and Lasnik 1977), but Dutch and Scandinavian data were rapidly presented that argued against any universality (Perlmutter 1971: 116, Maling and Zaenen 1978). These languages were particularly important as they are not null-subject languages. Today everyone agrees that we cannot speak about any universal that-t filter, for both empirical and theoretical reasons. Since we are not dealing with a universal phenomenon, we need some kind of parametrization. In this section I first discuss how we can explain the Standard English facts based on the theory in Boeckx (2008). Then I extend Boeckx's theory to the Scandinavian languages, before going on, in sections 3.4 and 3.5, to discuss other C-t phenomena. The basic facts related to that-t in Standard English are well known. When a subject is extracted from an embedded clause, that cannot be present. For objects, there is no such requirement. These properties are illustrated in (3)-(4), though in the remainder of this paper, I only focus on the asymmetry in (3).

(3) a. What did Janet say had happened?
    b. *What did Janet say that had happened?

(4) What did Janet say that Paul had done?

Boeckx (2008) attempts to derive the contrast between (3a) and (3b). Of course, there are many other recent proposals in the literature that aim at the same goal (see, e.g., Rizzi 1990, 1997: 310–314, Déprez 1991, 1994, Lasnik and Saito 1992: 16, 58, 177–183, Szczegielniak 1999, Pesetsky and Torrego 2001, Richards 2001: 168–171, 177–178, Roussou 2002, 2007, Ishii 2004, Branigan 1996, 2005, Manzini 2007b, Mizuguchi 2008), but none of them aims at a comprehensive picture of the variation we find in, for example, Germanic, contrary to the present paper. Crucial to Boeckx's approach is the understanding of Case. He follows Pesetsky and Torrego (2001: 361), who define nominative Case as follows:

(5) Nominative Case
    Nominative case is uT on D

uT stands for an uninterpretable [T] feature, and "on D" means that it is present on a nominal phrase. Based on this, Boeckx formulates nominative Case as follows:

(6) [Fin° [+T] [__ T° [-φ]]]

(6) incorporates Chomsky's (2007, 2008) claim that assignment of nominative Case is dependent on a finite C and assumes Rizzi's (1997: 288) cartography of the CP:

(7) ForceP . . . (TopicP) . . . (FocusP) . . . FinP TP

The φ-features on the functional head in (6) are present to indicate the presence of a Probe-Goal relation with a DP (Boeckx 2008: 172). Furthermore, Fin°, merged as [±T], obtains a [+T] feature by agreeing with a valued T° ([+T]). The important issue from the present point of view is the assumption that an item is frozen when it has checked its Case. This follows from what Boeckx (2003) calls the Principle of Unambiguous Chain (PUC; see also Richards 2001, Rizzi 2006): the fact that a chain can contain at most one strong position/occurrence. A strong position/occurrence is defined in terms of checking of a feature associated with an Extended Projection Principle (EPP) property (Boeckx 2008: 165). However, the picture is slightly more complicated. The present theory suggests that an element can only move to a single feature-checking site, that is, remerge once (Boeckx 2008: 167; see also Boeckx 2003, Bošković 2007). How does one define the latter notion? Boeckx suggests the following (Boeckx 2008: 171):

(8) A chain can be defined by two domains, one associated with the External Merge position, the other with the Internal Merge position.

(9)      checking domain
    [H1°[–F] [__ [H2°[+F]]]]
             checking site

The configuration in (9) is very important, as Boeckx points out, because it counts as an unambiguous checking site for the interfaces. The checking domain is unambiguous because it identifies a checking relation tied to movement, which commonly is difficult to identify because Agree (feature checking at a distance) and non-feature-driven movement (movement to intermediate landing sites) exist. Movement is necessary because if an element with an identical feature is not remerged in "__", there will be an intervention effect because of H2 (as H1, H2 and this element would stand in an asymmetric relation and share a feature). Consequently, an element remerged in a configuration like (9) cannot be remerged further. This is commonly the case with subjects, and it should be clear that (6) is an instance of (9). The theory furthermore explains, under the current theory of Case, why an element can only receive one Case feature. Crucially, it is also predicted that there should be no freezing effect even when objects get their Case checked (in a Pesetsky-Torrego way). The direct object would be externally merged, and thus the checking site is not unambiguous. I do not discuss objects in the following, but the fact that (4) is grammatical shows that the prediction is borne out. We then need a way to avoid the subject being stuck in cases like (3a). Boeckx (2008: 178) suggests that this is done through deactivation of Fin°'s [T]-feature. When Fin° no longer has a valued [T]-feature, it cannot value the Case feature of the subject DP. That is, we have two configurations, as shown in (10):

(10) a. [Fin° [+T] [__ T° [−φ]]] subject extraction disallowed

b. [Fin° [T] [__ T° [−φ]]] subject extraction allowed
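Purely as an illustrative aid, the featural logic behind (10) can be rendered as a toy program. The class and function names below are my own invention, not part of Boeckx's formalism; the sketch only encodes what the text states: a Fin° with a valued [+T]-feature values the subject's Case in SpecTP, using up the chain's single strong occurrence and freezing the subject, whereas an unvalued Fin°[T] leaves the subject's Case unvalued, so the subject may move on and be valued by the matrix C°.

```python
# A hedged toy model of (10a) vs (10b); `Subject` and
# `subject_extraction_allowed` are hypothetical names for exposition.
from dataclasses import dataclass

@dataclass
class Subject:
    case_valued: bool = False   # uT on D, per (5): unvalued until checked

def subject_extraction_allowed(fin_has_valued_T: bool) -> bool:
    """(10a): Fin°[+T] values the subject's Case in SpecTP, so the subject
    is frozen there and further extraction is out.
    (10b): Fin°[T] is unvalued, the subject's Case survives unvalued, and
    the subject can move on to be valued in the main clause."""
    subj = Subject()
    if fin_has_valued_T:
        subj.case_valued = True       # Case checked in the embedded clause
    return not subj.case_valued       # frozen iff Case is already valued

# (3b) *What did Janet say that had happened?  -- overt that = Fin°[+T]
assert subject_extraction_allowed(True) is False
# (3a)  What did Janet say had happened?       -- silent Fin°[T]
assert subject_extraction_allowed(False) is True
```

The two boolean states simply mirror the two configurations in (10); nothing in the sketch goes beyond that correspondence.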

Later I argue that (10) is an important source for parametric variation. What is important now is that when that is missing, Fin° does not have a valued [T]-feature (10b). Thus, the DP does not receive Case and can freely move to the main clause where its Case feature is valued.2 The Case feature will be valued by the relevant C°, the same head that values the Case feature on the subject in the main clause. These two Case valuations do not happen simultaneously, as the Case feature of the extracted subject is valued after the subject of the main clause's Case feature is valued. Roussou (2007) discusses that-t in Standard English, and says that "a lexical subject seems to be a precondition for the presence of that". She illustrates this, among other ways, with the data in (11) through (14):

(11) a. *Who do you think that left?
     b. Who do you think left?

(12) a. The student that the teacher predicted (*that) will be outstanding.
     b. The student that the teacher predicted (that) everyone will admire.

(13) a. It was Peter that the teacher told us (*that) had been outstanding.
     b. It was Peter that the teacher told us (that) everyone admired.

(14) a. John ate more cookies than he estimated (*that) would be eaten.
     b. John ate more cookies than he estimated (that) his friends would eat.

The present account derives these cases straightforwardly. In (11a) there is a Fin°[+T], which entails that the subject will be frozen because the Case feature will be valued. (11b), on the other hand, is grammatical because there is a Fin°[T]. I assume that a null complementizer normally corresponds to an unvalued [T]-feature, simply [T]; compare with (10b). This is also the case for (12a) through (14a), whereas in (12b) through (14b) there is a Fin°[+T]. In the following I will return to instances where pronunciation of that is optional and show that these do not relate specifically to the that-t effect. There is a well-known fact that merits attention in the present context, namely, what Culicover (1992) dubbed the "adverb effect" (originally noted by Bresnan 1977: 194). Representative data from English are provided in (15) through (18) (from Browning 1996: 237–238, building on Culicover 1992):

(15) a. Robin met the man Leslie said that for all intents and purposes was the mayor of the city.
     b. *Robin met the man Leslie said that was the mayor of the city.

(16) a. This is the tree that I said that just yesterday had resisted my shovel.
     b. *This is the tree that I said that had resisted my shovel.

(17) a. I asked what Leslie said that in her opinion had made Robin give a book to Lee.
     b. *I asked what Leslie said that had made Robin give a book to Lee.

(18) a. Lee forgot which dishes Leslie had said that under normal circumstances should be put on the table.
     b. *Lee forgot which dishes Leslie had said that should be put on the table.

There have been a number of analyses to account for these facts, either by suggesting a PolP as in Culicover (1992) or by assuming CP recursion (Branigan 1996, 2005, Browning 1996; the latter building on Watanabe 1992). Boeckx (2008) discusses these cases and argues that they fit the present theory nicely (see also Rizzi 1997: 310–313 for a similar analysis). Specifically, he argues that these adverbials are located in SpecTopicP, that is, between Force° and Fin°. That means that that must be in Force° in these constructions, and since there is no [+T]-feature on Fin° (Fin° is silent; i.e., it only has a [T]-feature), extraction is licit. Further corroborating this view is the fact that English has several instances of high complementizers, as Elly van Gelderen (p.c.) brought to my attention.

(19) a. I think that frankly, Mr. Leary was Nixon's best friend.
     b. Because I think that, frankly, there is no easy way to solve this.

These data lend themselves to a similar analysis as the adverb effect. In both cases the complementizer is forced into Force° because of the TopicP; hence, the A-chain can be extended beyond the mono-clausal domain. However, in (19) there still is a Fin°[+T] because there is no movement out of the embedded clause. In other words, we see that that in Force° actually can co-occur with both a silent Fin°[+T] and a silent Fin°[T], depending on whether there is movement out of the embedded clause or not. This is entirely in line with the present theory, where Fin°[T] is what we could call a result of complementizer "manipulation", where the moved subject manipulates the feature content on Fin°. As we will see in the following, such an alternation is morphologically reflected in other languages. We have seen how the that-t effect can be explained in Standard English. However, there are also varieties of English that lack the that-t effect, as has been carefully documented by Sobin (1987, 2002). I use the term varieties because, as an anonymous reviewer points out, although the expectation that the that-t effect or the lack of it is dialectal was set up in early work (Chomsky and Lasnik 1977: 450ff.), this expectation is not borne out in later work. That is, there do not appear to be dialects qua dialects that either have or do not have the that-t effect. In particular, Cowart (2007) argues in favor of this latter point of view, and Sobin (2002) mentions studies that seem to show that the variation is, to a lesser extent, related to dialects. Thus, it seems misconceived to speak about dialects when discussing the variation in that-t effects. Instead, I use the more neutral variety ("grammar"), meaning just that more than one grammar exists, but with no implication of a delineated geographical area. The important fact that needs to be captured is that many speakers of English find both (20a) and (20b) equally acceptable:

(20) a. Who did Peter say that left?
     b. Who did Peter say left?

Recall that it was just argued that the adverb effect implies that the complementizer that is merged in Force°; that is, that in Force° makes it possible to extract the subject from the embedded clause. To make sense of (20a), we extend this line of reasoning and argue that in grammars that allow both (20a) and (20b), that lexicalizes Force° (cf. Rizzi 1997, 2001: 289 on Italian che, which fits the present picture as Italian does not have a that-t effect, cf. (1)). Unlike Fin°, Force° is not involved in agreement; thus, we do not expect any agreement with the subject in SpecTP. Furthermore, when that is in Force° in English, the silent Fin° corresponds to Fin°[T], as described earlier. Importantly, there is an empirical argument in favor of this picture. Consider the co-occurring complementizers in Polish (21) and Dutch (22):

(21) On myślał, że Janowi żeś dał książkę.
     he thought that John that.2.sg gave book
     "He thought that you gave the book to John." (Szczegielniak 1999)

(22) a. Piet denkt (*of) dat Jan het gedaan heeft.
        Piet thinks if that Jan it done has
        "Piet thinks that Jan did it."
     b. Wat denkt Piet (of) dat Jan gedaan heeft?
        what thinks Piet if that Jan done has
        "What does Piet think that Jan has done?" (Zwart 1996: 321)

As the data show, Polish has two morphologically identical complementizers, separated by a topic. The lower complementizer hosts the subject agreement marker/clitic. Dutch, on the other hand, can only realize the highest complementizer in the context of wh-movement. These two examples suggest that low complementizers are associated with subject A-properties like agreement, whereas higher complementizers can be used to extend A-bar chains beyond the mono-clausal domain. This is supported by the fact that several languages exhibit what we often refer to as complementizer agreement (see, among others, Bayer 1984, Haegeman 1992, Zwart 1997, Carstens 2003). There are varieties, like West Flemish, that have a full agreement paradigm and no that-t effects (Haegeman 1992), but these are not problematic from the present point of view. Since the ability to move a displaced subject is crucially related to whether Fin° has a valued [T]-feature, a complementizer without an agreeing [T]-feature can nevertheless agree in φ-features. I do not discuss the specific technical implementation of this in this chapter. The different complementizer heads are simply a reflection of a strategy that languages employ to stretch the subject chain (Boeckx 2008: 189). The peculiar thing about English is that we do not see any morphological reflections of this strategy; that is, there is no morphological trait showing whether Fin°[+T] or Fin°[T] is present. However, they abound in other languages. Witness, for example, the well-known complementizer alternation in Modern Irish.

(23) a. Dúirt sé gur bhuail tú é?
        said he that struck you him
        "He said that you struck him."
     b. An t-ainm a hinsíodh dúinn a bhí ar an áit.
        the name that was.told to.us that was on the place
        "The name that we were told was on the place."
     (Boeckx 2008: 186)

We see that a special complementizer a is used for extraction (in general). Similar data are found in creoles like Gullah:

(24) a. This da young man Ø/*say come yah yesiday.
        this the young man comp came here yesterday
        "This is the young man that came here yesterday."
     b. Faye answer say/*Ø Robert coming.
        Faye answered comp Robert coming
        "Faye answered that Robert was coming."
     (Boeckx 2008: 186)

Thus, the poor English morphology acts as a veil, but based on the comparative data, the account of English is less stipulative than it may look at first glance. There is another well-known fact regarding that which necessitates a brief discussion. Consider the data given in (25) and (26):

(25) a. John thinks (that) Mary left.
     b. Who do you think (that) John saw?

(26) Why do you think (that) he left early?
     (Lasnik and Saito 1992: 46)

Kayne (1981), Stowell (1981) and Pesetsky and Torrego (2001) argue that these cases are instances of optional pronunciation of that. However, notice that none of these cases involves movement of the subject. As such they do not seem to be related to the that-t effect since there is always a Fin°[+T] present. Thus, I do not provide any exhaustive discussion of this but point to the papers by Pesetsky and Torrego, Poletto (2001) and Cocchi and Poletto (2002). The latter two papers aim at providing an account of the pronunciation of optional complementizers in terms of parallel checking. It is argued that either the verb or the complementizer checks the same feature in the complementizer domain. This account can quite easily be extended to the English (and Scandinavian) data, as mentioned by the cited authors.3 The overall idea of the present proposal is by no means novel; see, for example, Authier (1992), Hoekstra (1993), Szczegielniak (1999), Roussou (2000). Another way to implement this is suggested by Lightfoot (2006), where complementizers differ as to whether they constitute a host for cliticization (see his work for details). However, none of these works makes the specific link between a complementizer and the subject-chain properties, which is the gist of the proposals by Boeckx (2008), Rizzi (2006) and Rizzi and Shlonsky (2007) that I have adopted in the present chapter. We are now able to formulate a general that-t parameter that accounts for the cases discussed so far.

(27) The that-trace parameter
     a. Grammars exhibiting the that-trace effect:
        The complementizer lexicalizes Fin°. Fin° has a [+T]-feature.
     b. Grammars lacking the that-trace effect:
        The complementizer lexicalizes Force°. Fin° has a [T]-feature.

This parameter restates what I have argued earlier. When a complementizer lexicalizes Fin°, there is a Case feature which makes the subject chain terminate in SpecTP. When a complementizer lexicalizes Force°, Fin° does not have such a feature, and the subject chain does not terminate. We see that Fin° is crucial in deriving these differences, and I have argued that the two different positions Fin° and Force° correspond to two different properties: Fin° shows agreement with a subject, whereas Force° is used to extend chains. However, as David Lightfoot (p.c.) points out, there is an important question related to (27): What is the trigger experience; that is, what expresses whether a complementizer lexicalizes Fin° or Force°? One solution might be that the child is able to infer this from the island data. That is, when the child realizes that there either is or is not a that-t effect, it will set the parameter for the complementizer correctly. However, this account is arguably circular. At the moment I do not have a clearly noncircular proposal as to what an independent trigger experience might be. A further important question, raised by an anonymous reviewer, is whether the claim that that can occupy Force° in addition to Fin° is ad hoc. This question is dealt with in Boeckx (2008: 189; see already Rizzi 1997: 312, although the implementation is slightly different), which I would like to quote in full:

The fact that that-trace effects are notoriously variable should not come as a surprise. It is just another case of lexical variation. The fact that its effects are found in domains that go beyond simple, unembedded clauses (Lightfoot’s 1989 Degree-0) makes it even harder for the learner to fix the right parametric value. The task is made even more difficult by the presence of two complementizer-heads (Force° and Fin°), both of which appear to be lexicalized as that in various circumstances.

This squares well with what we have seen regarding the fact that the that-t effect itself is highly variable. It seems that people can lexicalize that in Force° overtly in cases where there is a topicalized phrase, the latter being the trigger for this high realization. The fact that this all happens in embedded contexts is easily explained on Lightfoot's theory, as the variation is indeed expected. An argument supporting this is that not all child languages exhibit the that-t effect (McDaniel, Chiu and Maxfield 1995: 716, quoting Thornton 1990), as seen in (28). This can be taken to mean that the child has not yet set the accurate parameter for the complementizer.

(28) Who do you think that’s under the can?

In sum, the variation in English is generally attributed to differences in lexicalization, which in general is a welcome result from the perspective of the Minimalist Program, where parametric differences mainly are reduced to lexical differences (Borer 1984, Chomsky 1995). Thus, the parameter in (27) is not of the kind of "strict" parameters like, for example, a pro-drop parameter would be (either you have pro drop or you do not). I return to this issue at the end of section 3.5. We have seen how to account for the variation in that-t effects in English. In the next section, I extend the empirical coverage of the theory and show that (27) is able to account for the variation among the Scandinavian languages. Then I go on to generalize the parameter in (27) to general instances of C-t effects.
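The decision procedure encoded by the parameter in (27) can also be sketched, again purely for illustration, as a small lookup-plus-check. The dictionary and function names are hypothetical placeholders; the only substantive content is what (27) states: the lexicalization site of the overt complementizer determines whether Fin° carries a valued [+T]-feature, and hence whether subject extraction across that is licit.

```python
# A hedged sketch of the that-trace parameter in (27); "liberal variety"
# is an invented label for an English grammar lacking the that-t effect.
LEXICALIZATION = {
    "Standard English": "Fin",    # that in Fin° -> Fin°[+T] -> that-t effect
    "liberal variety": "Force",   # that in Force° -> Fin°[T] -> no effect
}

def allows_subject_extraction_across_that(grammar: str) -> bool:
    """(27b): only when the complementizer lexicalizes Force° does silent
    Fin° retain an unvalued [T]-feature, so the subject chain need not
    terminate in SpecTP and extraction across overt that is possible."""
    return LEXICALIZATION[grammar] == "Force"

# (3b) *Who did Janet say that had happened?  -- out in Standard English
assert not allows_subject_extraction_across_that("Standard English")
# (20a) Who did Peter say that left?          -- fine in the liberal variety
assert allows_subject_extraction_across_that("liberal variety")
```

On this sketch, extending the account to further grammars, as the next section does for Scandinavian, amounts to adding entries to the lexicalization table, which is exactly the sense in which (27) reduces the variation to lexical differences.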

3.3 That-t in Scandinavian

The Scandinavian languages are by no means identical when it comes to movement out of a that-clause. Varieties of Norwegian are arguably the most liberal ones regarding the availability of subject extraction. Most of the literature that discusses Norwegian fails to acknowledge this, and one can even get the impression that Norwegian is like English. An important aim of the present chapter is to show that this is incorrect. Danish and Swedish pattern mostly like English, whereas Icelandic is special in requiring the complementizer to be present (see, e.g., Hellan and Christensen 1986 for a brief summary). In this section I provide a general overview of the data and suggest how they can be analyzed according to the account provided earlier for English. I first look at Norwegian in 3.3.1, at Danish and Swedish in 3.3.2, and, last, at Icelandic in 3.3.3. Then I briefly discuss how embedded V2 squares with the data in sections 3.3.1 through 3.3.3. Section 3.3.5 contains a brief summary.

3.3.1 Norwegian

It is very difficult to use the notion "Norwegian" in this context. The reason is that this language exhibits a lot of variation among its varieties. In this section I restrict myself to those varieties that do not show any that-t effect. Lohndal (2007) attempts to show the great variation in Norwegian dialects, but unfortunately this was only done on the basis of an informal and small study. Currently a large survey on, among other things, that-t effects in Norwegian dialects is being conducted by the Scandinavian Dialect Syntax project (see http://uit.no/scandiasyn/?Language=en). At present, the survey is not finished, and in order to say anything about the status of this phenomenon across dialects, it is necessary to have a complete survey. Such a systematic and quantifiable study is obviously of great value, not to say necessary, and therefore I do not discuss the informal survey in this chapter. The reader should, however, keep in mind that there is variation, and that we find grammars in Norwegian with a strong that-t effect (see also Taraldsen 1979, Fiva 1985, 1991, and Lohndal 2007). The examples in (29) through (31) illustrate that varieties of Norwegian do not have any that-t effect:

(29) a. Hvem tror hun at vil komme?
        who thinks she that will come
     b. Hvem tror hun vil komme?
        "Who does she think will come?"

(30) a. Hva tror han at er i boksen?
        what thinks he that is in box.def
     b. Hva tror han er i boksen?
        "What does he think is in the box?"

(31) a. Hvem tror hun at unngår å lese boken?
        who thinks she that avoids to read book.def
     b. Hvem tror hun unngår å lese boken?
        "Who does she think avoids reading the book?"

As we shall see, this is arguably the most radical variant found among the Scandinavian languages. The data are accounted for in the same way as the English cases where there is no that-t effect, namely, by at lexicalizing Force°. Force° does not agree with the subject, and since silent Fin° only has an unvalued [T]-feature, no agreement obtains. It should also be mentioned that speakers who have a strong that-t effect have at in Fin°, as in English. But as argued earlier, it is difficult to speculate further as long as the dialect survey of Norwegian is not yet completed.

3.3.2 Danish and Swedish

Danish and Swedish are not as liberal as the varieties of Norwegian presented in 3.3.1. Both languages have the that-t effect, just like Standard English. This is illustrated in (32) and (33) for Danish (taken from Vikner 1995: 12; see also Vikner 1991) and (34) for Swedish.4

(32) a. Hvilken kok tror du har kogt de her grønsager?
        which cook think you has cooked these here vegetables
        "Which cook do you think has cooked these vegetables?"
     b. *Hvilken kok tror du at har kogt de her grønsager?

(33) a. Hvem tror du ofte tager til Paris?
        who think you often goes to Paris
        "Who do you think often goes to Paris?"
     b. *Hvem tror du at ofte tager til Paris?5

(34) a. Vilken elev trodde ingen skulle fuska?
        which pupil thought nobody would cheat
        "Which pupil didn't anyone think would cheat?"
     b. *Vilken elev trodde ingen att skulle fuska?
     (Engdahl 1982: 166)

A fact that distinguishes Danish from all the other languages is that it often uses a dummy element der, “there”, when the subject of a that-clause has been extracted. (35) and (36) illustrate this:

(35) Vennen (som) han påstod at der havde lånt bogen var forsvundet.
     friend.def that he claimed that there had borrowed book.def was disappeared
     "The friend that he claimed had borrowed the book had disappeared."
     (Engdahl 1985: 21)

(36) Hvem tror du, at der har gjort det?
     who think you that there has done it
     "Who do you think has done it?"
     (Engdahl 1986: 123)

This is impossible in all the other Scandinavian languages, but resembles what we have seen for English above. Recall (2b), repeated as (37):

(37) What do you think that there is in the box?

It seems valid to say that der in (35) and (36) is an expletive-like element (Engdahl 1985). Danish has an expletive der which corresponds to English there (Vikner 1991), and although the use of der in (35) and (36) does not mirror (37), it appears to be similar. It is hardly a relative der (38), which Vikner (1991) correctly concludes is a complementizer.

(38) Vi kender de lingvister, der vil købe denne bog.
     we know the linguists there will buy this book
     "We know the linguists that will buy this book."
     (Vikner 1991: 109)

Another argument in favor of der being an expletive-like element is that der follows the complementizer at and precedes the auxiliary in (36), which seems to indicate that der stands in SpecTP. On this account, it is no surprise that extraction (from the base position) is possible, as the expletive receives Case. Thus, Danish at lexicalizes Fin°, on a par with English that. In Swedish, there is an alternative to inserting an expletive in order to avoid the that-t effect, namely, to make use of a resumptive pronoun (Engdahl 1985). Resumptive pronouns are commonly known to rescue potential islands (see, e.g., Boeckx 2003 for an overview), and therefore it comes as no surprise that we do not find any that-t effect in these cases. This is shown in (39) through (41), where a complementizer is present. We also find similar instances with embedded interrogative clauses (41). (39c) illustrates that a resumptive pronoun is impossible if the complementizer is absent.

(39) a. Vilken elev trodde ingen att han skulle fuska?
        which pupil thought nobody that he would cheat
        "Which pupil didn't anyone think would cheat?"
     b. *Vilken elev trodde ingen att skulle fuska?
     c. *Vilken elev trodde ingen han skulle fuska?
     (Engdahl 1982: 166)

(40) a. Vilka elever var det oklart in i det sista om de skulle klara sig?
        which pupils was it unclear in in the last if they should succeed
        "Which pupils was it unclear until the last minute whether they were going to make it?"
     b. *Vilka elever var det oklart in i det sista om skulle klara sig?
     (Engdahl 1985: 22)

(41) a. Vilken film kunde ingen minnas hur den slutade?
        what film could no.one remember how it ended
        "What film could no one remember how ended?"
     b. *Vilken film kunde ingen minnas hur slutade?
     (Engdahl 1986: 121)

In this case, resumptive pronouns in SpecTP get their Case from Fin°. Thus, the Swedish complementizer att lexicalizes Fin°, as in English. The difference between English and Swedish is that Swedish may use a resumptive pronoun to rescue a potential that-t effect. However, it is also important to note that we find varieties of Swedish that pattern like Norwegian. In Finland-Swedish, the sentences in (42) and (43) are grammatical.

(42) Vi har forsökt täcka sådana fall som vi tänkte att skulle vara intressanta.
     we have tried cover such cases that we thought that should be interesting
     "We have tried to cover such cases as we thought would be interesting."
     (Engdahl 1985: 22)

(43) Vem sa du att hade sett oss?
     who said you that had seen us
     "Who did you say had seen us?"
     (Holmberg 1986: 192)

Finland-Swedish, then, patterns like Norwegian, and the complementizer att lexicalizes Force°. An interesting fact is that there was no that-t effect in earlier stages of Swedish. Specifically, Platzack (1987) shows that there was no that-t effect in Medieval Swedish. An example is provided in (44):

(44) Och thenne Elden mena en part att förorsakas af . . .
     and this fire believe some that is.caused by
     "Some believe that this fire is caused by . . ."
     (Platzack 1987: 397)

The change from Medieval Swedish to Modern Swedish can easily be accounted for on the present theory. There has been a lexical change: att lexicalized Force° in Medieval Swedish but lexicalizes Fin° in Modern Swedish. It is in my opinion a favorable aspect of the present theory that it can account for this change in such a simple manner.

3.3.3 Icelandic

Icelandic has a pattern of its own, which is illustrated in (45) (Maling and Zaenen 1978: 478–479; see also Kitahara 1994).

(45) a. Hver sagðir þú að væri kominn til Reykjavíkur?
        who said you that was come to Reykjavik
        "Who did you say that had come to Reykjavik?"
     b. Þetta er maðurinn sem þeir halda að sé of heimskur til að vinna verkið.
        this is man.def that they think that is too dumb in order to.do job.def
        "This is the man that they think is too dumb to do the job."
     c. Það er Ólafur sem þeir segja að muni koma.
        it is Olaf that they say that would come
        "It is Olaf who they say would come."

Regarding að, Maling and Zaenen (1978: 480) note that "[d]eletion of the complementizer að is only marginally possible in Icelandic, and is certainly no better in these examples of extracted subjects".6 This makes Icelandic special, as it has a strict "anti" that-t effect: The complementizer always has to be present. The fact that extraction is possible is derived by arguing that að lexicalizes Force°. As Howard Lasnik (p.c.) points out, it does not seem necessary to ensure that a silent complementizer does not license extraction, since Icelandic does not allow for silent complementizers in these configurations. Once again, the differences between the languages piggyback on the feature content of Fin° and the structural position of the complementizer.

3.3.4 A Note on Embedded V2

Recently a lot of attention has been devoted to embedded V2 in the Scandinavian languages (see, in particular, Vikner 1995, Thráinsson 2003, Bentzen et al. 2007a, 2007b, Julien 2007, Wiklund et al. 2007). One interesting question from the present point of view is whether embedded V2 clauses show any properties different from non-V2 embedded clauses. Interestingly, Icelandic, Norwegian, and Swedish (and Danish) show similar properties in both cases. This is illustrated in (46) through (48):

(46) Hver sagði hann að gæti ekki sungið þetta lag? (Icelandic)
     who said he that could not sing this song

(47) Hvem sa han at kunne ikke synge denne sangen? (Norwegian)
     who said he that could not sing this song.def

(48) *Vem sa han (att) kunde inte sjunga den här sången? (Swedish)
     who said he (that) could not sing this here song.def
(Bentzen et al. 2007a: 123–124)

We see that whereas Icelandic and Norwegian allow for extraction of the subject out of the embedded clause, Swedish does not. This follows from the account given here, where Icelandic and Norwegian have the complementizer in Force° whereas Swedish has the complementizer in Fin°. Swedish seems not to allow extraction even in the absence of the complementizer, which is a difference compared to non-V2 embedded clauses. I have nothing to say about this special property here.

3.3.5 Summary

We can summarize the variation across the Scandinavian languages as follows. Danish and Swedish pattern more or less like English, and each has its own strategy to mitigate subject extraction. Icelandic is different in that the complementizer is always present when the subject is extracted. The most liberal variety is found within Norwegian. There, movement is allowed regardless of whether the complementizer is present or not. It is very interesting to note this great variation between these closely related languages, and I have, following Boeckx (2008), shown how to account for it by arguing that there are various complementizers that are merged in different positions within the left periphery. In addition, the feature content of Fin° varies. We have also seen that the same properties obtain for embedded V2 clauses. In the next section, I generalize the that-t parameter to a few other complementizer effects.

3.4 A General C-t Effect

Starting with Chomsky and Lasnik (1977), there has been much discussion concerning the possible range of effects like that-t. The question is whether there can be a unified analysis of C-t effects in general. In this short section, I show that this might indeed be possible by extending the theory in Boeckx (2008) (pace Kandybowicz 2006). Chomsky and Lasnik (1977) pointed out that there is a similar effect with for as with that. This is illustrated in (49) and (50):

(49) a. Who do you want to see Bill?
     b. *Who do you want for to see Bill?
     (Chomsky and Lasnik 1977: 454)

(50) *Who would you prefer for to win?
     (Chomsky and Lasnik 1977: 500)

However, Chomsky and Lasnik also point out that there are certain dialects, for example, Ozark English, that permit for-to sequences (like Middle English). The examples in (51) from Ozark English are all acceptable, but require deletion of for in Standard English (52):

(51) a. Are you here for to fish?
     b. It is wrong for to do that.
     c. These sheep are for to sell.
     (Chomsky and Lasnik 1977: 454)

(52) a. *Are you here for to fish?
     b. *It is wrong for to do that.
     c. *These sheep are for to sell.

This difference between what Chomsky and Lasnik (1977) call dialects seems to square well with the proposed that-t parameter. That is, in Standard English for is merged in Fin° and carries a [+T]-feature. This makes the subject unable to move out of the infinitival clause. In varieties like Ozark English, Fin° has a [T]-feature, and correspondingly for is merged in Force°. Notice also that for shows the same effects in Standard English and in those varieties of English that have no that-t effect. This speaks in favor of a lexical treatment of C-t effects, and not an overall syntactic parameter that applies to all tokens of C. The account of for can also be extended to yet another C-t effect, namely, that found with whether (given that whether is an interrogative complementizer). Moving an object across whether is grammatical, though not entirely perfect, but movement of a subject across whether is ungrammatical.

(53) a. ?Who did you wonder whether Bill saw?
     b. *Who did you wonder whether saw Bill?
     (Chomsky and Lasnik 1977: 455)

However, it is not possible to remove whether, as shown in (54) and (55):

(54) *Who did you wonder saw Bill?

(55) *Who did you wonder wrote the book?
     (Marc Richards, p.c.)

This is an important difference compared with that and for. (53b) shows that whether lexicalizes Fin°[+T]. Notice further that whether has semantic content; that is, it is not semantically empty like that. I propose that this accounts for why deletion is impossible in (54) and (55). Contentful items are not generally available for deletion, which would induce a crash at the interface. I leave it to future research to develop the consequences of this suggestion. Turning to Norwegian, it is interesting to note that the Norwegian complementizer om, "if", apparently behaves like at. This is illustrated in (56) and (57):

(56) a. Jeg vet om han kommer.
        I know if he comes
        "I know if he is coming."
     b. Han vet jeg om kommer.
        he know I if comes
        "He, I know is coming."
     c. Han vet jeg kommer.
        he know I comes
        "He, I know is coming."

(57) a. Hun husket jeg om hadde låst døren.
        she remembered I if had closed door.def
        "I remembered if she had closed the door."
     b. Hun husket jeg hadde låst døren.
        she remembered I had closed door.def
        "She I remembered had closed the door."

However, this is only apparently the case. In these sentences, om has semantic content; that is, sentences introduced by om have hypothetical or irrealis semantics. In other words, om seems to behave on a par with whether, with the exception that the subject is allowed to move out of the embedded clause. Thus, om lexicalizes Force°. It also has to be noted that there are very few cases of nominal embedded clauses with complementizers other than that in Norwegian. In (56) and (57), om has a nominal function, though in most cases om does not have this function (Faarlund, Lie and Vannebo 1997: 976, 1043–1044). Another complementizer is the infinitive marker å, "to". The interesting fact is that this marker is subject to great variation among the Scandinavian languages. This variation is illustrated in (58) through (61):

(58) Hann lofaði að lesa ekki bókina. (Icelandic)
     he promised to read not book.def

(59) Han lovade att inte läsa boken. (Swedish)
     he promised to not read book.def

(60) Han lovede ikke at lese bogen. (Danish)
     he promised not to read book.def

(61) Han lovet ikke å lese boken. (Norwegian)
     he promised not to read book.def
(Beukema and den Dikken 1989: 65)

These data show that the infinitive marker does not have the same position across the Scandinavian languages (Christensen 2003, 2005, 2007). An anonymous reviewer asks why it is not equally conceivable that it is the position of negation that varies. However, virtually all recent analyses of negation in Scandinavian assume a fixed position corresponding to the NegP (see Christensen 2005, 2007, Lindstad 2007), and I see no compelling reason to question these analyses. In fact, a closer look at Norwegian provides further evidence against a variable position for negation. Norwegian exhibits variation or optionality when it comes to the order of negation and the infinitival marker. This optionality is illustrated in (62):

(62) a. Du må love å ikke drikke.
        you must promise to not drink
     b. Du må love ikke å drikke.
        you must promise not to drink
        "You must promise not to drink."

Interestingly, there is a change going on here: (62a) is not attested in earlier varieties of Norwegian. However, there are no empirical reasons that I am aware of to assume that the position of negation in the middle field has changed (cf. Christensen 2003 and Faarlund 2007). In my opinion, these data can be analyzed according to the general idea of this chapter: There is a lexicalization difference regarding the infinitive marker: It can lexicalize either Force° or a lower head. The peculiar thing that sets the infinitive marker apart from other complementizers is that it does not limit its variation to the CP. It can be merged lower, as in Danish (60) or earlier stages of Norwegian (62b) (Faarlund 2007: 70). Interestingly, the lower position does not prohibit movement out of the clause, which also sets the infinitive marker apart from the other complementizers we have discussed above. Notice also that there is no issue of agreement in terms of Case (i.e., there is no [+T]-feature on the infinitive marker). The question is whether we can propose a unified analysis of C-t effects in general. That is, is it possible to extend the proposed that-t parameter to a general C-t parameter? Such a parameter could look like (63):

(63) The C-t parameter
     a. The C-t effect: A complementizer lexicalizes Fin°. Fin° has a [+T]-feature.
     b. Lack of the C-t effect: A complementizer lexicalizes Force°. Fin° has a [T]-feature.

At first sight (63) may seem to mask an important difference between that-clauses and infinitive clauses: that lexicalizes Fin° whereas to lexicalizes Force°. However, this is allowed under (63). A complementizer can lexicalize Force° even though other complementizers show a C-t effect. So Standard English that is subject to (63a), and to is subject to (63b). The parameter is a highly lexical one: Its setting can vary from complementizer to complementizer. Infinitive clauses clearly speak against an overall C-t parameter where all complementizers behave alike. Another phenomenon lends further support to this argument, namely, relative clauses. In the next section I discuss extraction from relative clauses in relation to C-t effects, and I argue that the relevant facts are derived from (63).

3.5 Relative Clauses and C-t Effects

So far I have discussed that-t effects in English and Scandinavian, and I have also discussed various extensions of that-t effects as a reflection of more general C-t effects. In this section I discuss relative clauses, and I show that they speak in favor of the C-t parameter suggested in section 3.4. One reason why relative clauses are interesting from the present point of view is that they differ from complement clauses in a significant way. Specifically, they show a reversed that-t effect.

(64) a. The book that impressed everyone is on the table.
     b. *The book impressed everyone is on the table.

We see that that needs to be retained in these configurations. This is obviously something in need of an explanation, and in the following I show that the present theory is able to derive these facts straightforwardly. Again we see in (64) that there is a link between the subject and the complementizer, and I would once more like to link this to the feature content of Fin°. But first it is necessary to say a few words about relative clauses. I follow the raising analysis of relative clauses as argued by, for example, Vergnaud (1974), Kayne (1994) and Bianchi (1999). Furthermore, I assume that the complementizers in relative clauses are true complementizers. It is not possible to justify this assumption in this chapter. Recently several researchers have argued that these complementizers are really demonstratives, for example, Kayne (2007) and Manzini (2007a, 2007b), but I have to set aside any discussion of the merits of each of these approaches. On my approach, a relative clause like (65a) has the simplified structure in (65b).

(65) a. The book that impressed everyone.

b. [DP the [CP book [CP that impressed everyone]]]

The choice of analysis is important for the present chapter because the raising analysis assumes that there actually is movement to the specifier of the complementizer. Let us now go on and see how the preceding facts can be derived. We have noted that there is a contrast regarding that between complement clauses and relative clauses. The asymmetry is repeated for convenience in (66):

(66) a. The man I know (*that) left early.
     b. The book *(that) impresses everyone.

There are two important differences: that in complement clauses must be deleted when a subject moves across it, whereas that in relative clauses has to be retained when the local subject is relativized. I will argue that the complementizer that in relative clauses lexicalizes Fin°. It is necessary to select Fin° because the subject commonly gets Case marked by this head. At the same time, the subject still moves over the complementizer, and the complementizer needs to be retained. This actually reminds us of Icelandic complement clauses, but with one important difference: The relativized subject cannot move further when it has reached the left periphery in the embedded clause. Therefore, I argue that Fin° in relative clauses has an EPP feature in the sense of Collins (1997) and Chomsky (2000, 2001; alternatively an EPP property in the sense of Pesetsky and Torrego 2001). This feature kicks in immediately upon Agree, which means that the Case valuation of the subject happens simultaneously with the subject's movement to SpecFinP (see, e.g., Gallego 2006 for a different proposal). It is possible to see this EPP feature as yet another reflection of the "complementizer manipulations" seen in section 3.2. That is, it is a strategy that specifically applies to subjects and which we predict to be morphologically marked in other languages. Data from the Niger-Congo language Bùlì in (67) and (68) show that this prediction is borne out:

(67) nùrù-wā:y áhī/*àtì dà mángò-kù lá
     man-rel c bought mango-d dem
     "the man who bought the mango"

(68) mángò-kū:y *áhī/àtì Àtìm dà lă
     mango-rel c Àtìm bought dem
     "the mango that Àtìm bought"
     (Hiraiwa 2005b: 268)

These data show that the complementizer is different depending on whether a subject or a nonsubject is relativized. This supports the claim that there are different processes going on for subjects and nonsubjects, viz. a special EPP feature on the complementizer in case of subject relativization (see McCloskey 2002 and especially Hiraiwa 2005a, b for similar proposals). Hiraiwa (2005a, b) also argues that we have a different EPP feature when nonsubjects are relativized. Since I am primarily dealing with subjects in this chapter, I do not discuss this further but only assume that Hiraiwa is correct about this. Notice also that I have said nothing about why the complementizer cannot be deleted when the subject is relativized. Tentatively, I suggest that this is related to the EPP feature in question, viz. the fact that the configuration is highly local. Obviously, this proposal requires extensive further study, which would go beyond all reasonable limits for this chapter. I have now accounted for the asymmetry between relative clauses and complement clauses and suggested that in English relative clauses, Fin° is lexicalized by that. The conclusions so far have been based on English data, and I now turn to relative clauses in Norwegian. Because of the important work of Taraldsen (1978, 1979, 1981, 1986, 2001), many properties of Norwegian relatives are pretty well known. Norwegian relatives and English relatives both require that the complementizer be present when subjects are relativized. Similarly, the asymmetry between subjects and objects exists, as shown in (69) and (70):

(69) a. Jeg kjenner han som jobber i butikken.
        I know he that works in shop.def
        "I know he that works in the shop."
     b. *Jeg kjenner han jobber i butikken.
        I know him works in shop.def

(70) a. Per kjenner mannen som de arresterte.
        Per knows man.def that they arrested
        "Per knows the man that they arrested."
     b. Per kjenner mannen de arresterte.
        Per knows man.def they arrested
        "Per knows the man they arrested."

However, the picture is apparently more complicated when a wh-word is relativized. Consider the data and judgments in (71), based on Taraldsen (2001: 165):

(71) a. Vi vet ikke hvem *(som) oppfant ostehøvelen.
        we know not who that invented cheese.slicer.def
        "We don't know who invented the cheese slicer."
     b. Vi vet ikke hvem (*som) de har ansatt.
        we know not who that they have hired
        "We don't know who they have hired."

Taraldsen (1986, 2001: 168) argues that som is an expletive and that this explains why its presence in (71b) is ungrammatical. Since Norwegian disallows transitive expletive constructions, an expletive in addition to a subject is not licit. Regarding the data in (70), Taraldsen argues that som is an argument and that som doubles the DP in SpecCP. This analysis seems to work on the basis of the data Taraldsen considers. However, there is an important issue here: Most Norwegian speakers that I have consulted find sentences like (71b) acceptable. Such sentences are apparently unacceptable to Taraldsen, but it is clear that many native speakers do not share his judgments. This is important because these other native speakers' judgments speak in favor of the analysis I have just given for English, and not the analysis Taraldsen (2001) argues in favor of. The following sentences all show that som is perfectly capable of occurring when a non-subject wh-item is relativized.7

(72) a. Kva sko (som) han kjøpte?
        what shoes that he bought
        "What shoes did he buy?"
        (Åfarli 1994: 84)
     b. Per visste [[kva for nokre av kakene sine] som Ola likte best]
        Per knew what for ones of cakes.def his that Ola liked best
        "Per knew which ones of his cakes Ola liked best."
        (Åfarli 1994: 97)
     c. Per beundra [[dei av kakene sine] som Ola likte best]
        Per admired those of cakes.def his that Ola liked best
        "Per admired those of his cakes that Ola liked best."
        (Åfarli 1994: 97)

Åfarli and Eide (2003: 257) argue that the wh-phrase needs to be complex in order for som to appear. That is consistent with the data in (72). However, a quick Google search reveals that som indeed occurs with just a single wh-phrase. A few examples are provided in (73):

(73) a. ? . . . hva som de synes burde forbedres . . .8
        what that they think should be.improved
     b. ? . . . hva som de regner . . .9
        what that they think
     c. ? . . . hva som de fleste fagfolk på det aktuelle felt vil . . .10
        what that the most specialists on the specific field would
     d. ? . . . hvem som de helst ser . . .11
        who that they rather see
     e. ? . . . hvem som de mener . . .12
        who that they think

These are but a few of the examples that turn up, and I and most of my informants find all of them acceptable, both with and without som. Thus, it is very hard to argue that som is an expletive in these cases. In fact, the data lend themselves quite naturally to treating som as a complementizer. There is nothing about the Norwegian data that sets them apart from the English data. Thus, they can be analyzed exactly like the English relative clauses earlier.13 Fin°[+T] has an EPP feature when the subject is relativized, which forces som to be pronounced. When the object is relativized, Fin°[+T] does not have a similar EPP feature (Hiraiwa 2005a, b), and som is optional. We also see that at and som in Norwegian do not lexicalize the same head: at lexicalizes Force° whereas som lexicalizes Fin°.14 In this section we have seen that relative clauses have their own behavior that distinguishes them from complement clauses in terms of C-t effects. I have argued that the complementizers in relative clauses are different from the ones in complement clauses. In particular, I have suggested that Fin° has an EPP feature in relative clauses. The differences between various clause types and various C-t effects have all been connected to whether the complementizer lexicalizes Force° or Fin°, a difference related to lexical items. What it all boils down to is that the variation in question is best analyzed in terms of lexical microparameters, in the sense of Borer (1984), Chomsky (1995) and Kayne (2005). Furthermore, we have seen that C-t effects can be given a strictly syntactic analysis, which is a good argument in favor of treating C-t effects in the syntax.

3.6 Conclusion

In this chapter I have argued that there are two factors that determine whether a complementizer will allow subject extraction out of an embedded clause: the kind of features on Fin° and whether the complementizer lexicalizes Force° or Fin° in the left periphery. Case has been claimed to be related to Fin°, and either Fin° has a valued [T]-feature ([+T]) or an unvalued [T]-feature ([T]). The latter makes it possible to extract the subject. I have also argued that topicalization triggers that to appear in Force°, accompanied by a null Fin°[T] in English, which accounts for the fact that topicalization ameliorates the that-t effect. Furthermore, Force° has been related to the extension of a chain beyond the mono-clausal domain, viz. the embedded clause, whereas Fin° has been related to Case agreement with SpecTP. This theory has been extended to the Scandinavian languages and, furthermore, to general C-t effects both in English and Norwegian. I have also provided an account of relative clauses and the fact that they show an opposite that-t effect. The latter has been explained as a consequence of the fact that Fin° has an EPP feature. Last, I have argued that C-t effects speak in favor of lexical parameters, where each complementizer may have its own parametric value.

Notes

* I am grateful to an audience at Harvard in March 2008 and to Marcel den Dikken, Marit Julien, Howard Lasnik, David Lightfoot, and Christer Platzack, and especially to Cedric Boeckx and Elly van Gelderen for useful comments and suggestions. Many thanks to two anonymous Studia Linguistica reviewers whose many constructive comments have helped improve this chapter significantly.
1 As Boeckx (2008: 185) points out, if Taraldsen's (2001) decomposition of French qui as que + il-expletive is correct, then the French data can be analyzed similarly to English. However, see Rizzi and Shlonsky (2007) for counterarguments to Taraldsen's proposal.
2 There are also languages that allow quirky Case on subjects. For these I assume Chomsky's (2000) theory to be correct, namely, that quirky subjects also have a structural Case feature. The welcome consequence of this is that such subjects behave similarly to subjects that receive nominative Case.
3 Although, as Poletto (2001) and Cocchi and Poletto (2002) admit, their account cannot explain the fact that the complementizer is obligatory in subject clauses (i):
  (i) a. [That Mary will buy the car] was expected by everyone.
      b. *[Mary will buy the car] was expected by everyone.
4 When it comes to Swedish, we also find the adverb effect seen in English, that is, the presence of an adverb makes extraction possible (i). Notice, however, that there also is an "anti" that-t effect in these constructions, since the complementizer needs to be present.
  (i) a. Vem är du glad att inte kunde komma?
         who are you glad that not could come
         "Who are you glad couldn't come?"
      b. *Vem är du glad inte kunde komma?
      (Holmberg 1986: 193)
5 Notice also that Danish is more restrictive than Swedish in not allowing an adverbial in higher positions (Haeberli 2002: 207, 236–238).
6 Kjartan Ottosson (p.c.) informs me that the only systematic case where að is absent is when the að-clause has a light subject pronoun, as in (i).
  (i) Ég held 'ann komi
      I think he come.subj
      "I think he comes."
7 Norwegian has two written standards, Bokmål and Nynorsk. Although these examples are given in Nynorsk, this does not affect the judgment. The same judgment obtains in my opinion in Bokmål too.
8 http://sentrum.blogspot.com/ [Accessed January 26, 2008]
9 www.europanytt.no/index.html?id=205 [Accessed January 26, 2008]
10 http://no.wikipedia.org/wiki/Wikipedia:Oppsettsveiledning [Accessed January 26, 2008]
11 www.offshore.no/nyheter/sak.asp?Id=11613 [Accessed January 26, 2008]
12 www.nettavisen.no/innenriks/valg07/article1338569.ece [Accessed January 26, 2008]

13 I hasten to add that we would also need a proposal for a grammar like the one Taraldsen represents. However, doing so would require a larger empirical investigation of this grammar. Since I have restricted myself to the main variety of Norwegian, I do not discuss this grammar any further in this chapter.

14 Notice that this theory also derives the fact that objects can be moved out of relative clauses (i) whereas subjects cannot (ii):
(i) Bøker kjenner jeg noen som skriver.
    books know I someone who writes
    "I know someone who writes books."
(ii) *En nabo liker jeg bøker som skriver.
     a neighbor like I books that write
The subject has checked a strong position (it has been checked in an unambiguous checking site), and it is thus impossible to move it further. No such checking has taken place regarding the direct object.

References

Åfarli, T. A. 1994. A promotion analysis of restrictive relative clauses. The Linguistic Review 11: 81–100.
Åfarli, T. A. and Eide, K. M. 2003. Norsk generativ syntaks. Oslo: Novus.
Authier, J-M. 1992. Iterative CPs and embedded topicalization. Linguistic Inquiry 23: 329–336.
Bayer, J. 1984. COMP in Bavarian. The Linguistic Review 3: 209–274.
Bentzen, K., Hrafnbjargarson, G. H., Hróarsdóttir, Þ. and Wiklund, A-L. 2007a. Extracting from V2. Working Papers in Scandinavian Syntax 79: 119–128.
Bentzen, K., Hrafnbjargarson, G. H., Hróarsdóttir, Þ. and Wiklund, A-L. 2007b. The Tromsø guide to the force behind V2. Working Papers in Scandinavian Syntax 79: 93–118.
Beukema, F. and Dikken, M. D. 1989. The position of the infinitival marker in the Germanic languages. In Sentential Complementation and the Lexicon: Studies in Honour of Wim de Geest, D. Jaspers, W. Klooster, Y. Putseys and P. Seuren (eds.), 57–75. Dordrecht: Foris.
Bianchi, V. 1999. Consequences of Antisymmetry: Headed Relative Clauses. Berlin: Mouton de Gruyter.
Boeckx, C. 2003. Islands and Chains. Amsterdam: John Benjamins.
Boeckx, C. 2008. Bare Syntax. Oxford: Oxford University Press.
Borer, H. 1984. Parametric Syntax. Dordrecht: Foris.
Bošković, Ž. 2007. On the locality and motivation of Move and Agree: An even more minimal theory. Linguistic Inquiry 38: 589–644.
Branigan, P. 1996. Tracing that-trace variation. In Microparametric Syntax and Dialect Variation, J. R. Black and V. Motapanyane (eds.), 25–39. Amsterdam: John Benjamins.
Branigan, P. 2005. The Trace-Fin Effect. Ms., Memorial University.
Bresnan, J. 1977. Variables in the theory of transformations. In Formal Syntax, P. W. Culicover, T. Wasow and A. Akmajian (eds.), 157–196. New York: Academic Press.
Browning, M. A. 1996. CP recursion and that-t effects. Linguistic Inquiry 27: 237–255.
Carstens, V. 2003. Rethinking complementizer agreement: Agree with a case-checked goal. Linguistic Inquiry 34: 393–412.
Chomsky, N.
1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 2000. Minimalist inquiries. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, R. Martin, D. Michaels and J. Uriagereka (eds.), 89–155. Cambridge, MA: MIT Press.
Chomsky, N. 2001. Derivation by phase. In Ken Hale: A Life in Language, M. Kenstowicz (ed.), 1–52. Cambridge, MA: MIT Press.
Chomsky, N. 2007. Approaching UG from below. In Interfaces + Recursion = Language? Chomsky's Minimalism and the View from Syntax-Semantics, H. M. Gärtner and U. Sauerland (eds.), 1–30. Berlin: Mouton de Gruyter.
Chomsky, N. 2008. On phases. In Foundational Issues in Linguistic Theory, R. Freidin, C. P. Otero and M. L. Zubizarreta (eds.), 133–166. Cambridge, MA: MIT Press.
Chomsky, N. and Lasnik, H. 1977. Filters and control. Linguistic Inquiry 8: 425–504.
Christensen, K. R. 2003. On the synchronic and diachronic status of the negative adverbial ikke/not. Working Papers in Scandinavian Syntax 72: 1–53.
Christensen, K. R. 2005. Interfaces: Negation—Syntax—Brain. Doctoral dissertation, University of Aarhus.
Christensen, K. R. 2007. The infinitive marker across Scandinavian. Nordlyd 34: 147–165.
Cocchi, G. and Poletto, C. 2002. Complementizer deletion in Florentine: The interaction between merge and move. In Romance Languages and Linguistic Theory 2000, C. Beyssade, R. Bok-Bennema, F. Drijkoningen and P. Monachesi (eds.), 57–76. Amsterdam: John Benjamins.
Collins, C. 1997. Local Economy. Cambridge, MA: MIT Press.
Cowart, W. 2007. Detecting Syntactic Dialects: The that-Trace Phenomenon. Ms., University of Southern Maine.
Culicover, P. 1992. The adverb effect: Evidence against ECP accounts of the that-trace effect. Proceedings of NELS 23: 97–111. GLSA, University of Massachusetts, Amherst.
Déprez, V. 1991. Economy and the that-t effect. Proceedings of the Western Conference on Linguistics, 74–87. California State University.
Déprez, V. 1994. A minimal account of the that-t effect.
In Paths Towards Universal Grammar: Studies in Honor of Richard S. Kayne, G. Cinque, J. Koster, J-Y. Pollock, L. Rizzi and R. Zanuttini (eds.), 121–135. Washington, DC: Georgetown University Press.
Engdahl, E. 1982. Restrictions on unbounded dependencies in Swedish. In Readings on Unbounded Dependencies in Scandinavian Languages, E. Engdahl and E. Ejerhed (eds.), 151–174. Umeå: Almqvist and Wiksell.
Engdahl, E. 1985. Parasitic gaps, resumptive pronouns and subject extractions. Linguistics 23: 3–44.
Engdahl, E. 1986. Constituent Questions: The Syntax and Semantics of Questions with Special Reference to Swedish. Dordrecht: Reidel.
Faarlund, J. T. 2007. Parametrization and change in non-finite complementation. Diachronica 24: 57–80.
Faarlund, J. T., Lie, S. and Vannebo, K. I. 1997. Norsk referansegrammatikk. Oslo: Universitetsforlaget.
Fiva, T. 1985. Resumptive pronomen i nordnorske dialekter. In Heidersskrift til Kåre Elstad, T. Bull and A. Fjeldstad (eds.), 48–68. Tromsø: Institutt for språk og litteratur. [Reprinted in Tromsø Linguistics in the Eighties, 134–160. Oslo: Novus].
Fiva, T. 1991. Resumptive pronouns and binding theory. In Papers from the 12th Scandinavian Conference in Linguistics, H. Á. Sigurðsson (ed.), 66–77. Reykjavík: University of Iceland.
Gallego, Á. J. 2006. T-to-C movement in relative clauses. In Romance Languages and Linguistic Theory 2004: Selected Papers from "Going Romance" 2004, J. Doetjes and P. Gonzáles (eds.), 143–170. Amsterdam: John Benjamins.
Haeberli, E. 2002. Features, Categories and the Syntax of A-Positions: Cross-Linguistic Variation in the Germanic Languages. Dordrecht: Kluwer.
Haegeman, L. 1992. Theory and Description in Generative Syntax: A Case Study in West Flemish. Cambridge: Cambridge University Press.
Hellan, L. and Christensen, K. K. 1986. Introduction. In Topics in Scandinavian Syntax, L. Hellan and K. K. Christensen (eds.), 1–29. Dordrecht: Reidel.
Hiraiwa, K. 2005a.
Dimensions of Symmetry in Syntax: Agreement and Clausal Architecture. Doctoral dissertation, MIT.
Hiraiwa, K. 2005b. The morphosyntax of the EPP and C in Bùlì. Proceedings of NELS 35, 267–278. GLSA, University of Massachusetts, Amherst.
Hoekstra, E. 1993. Dialectal variation inside CP as parametric variation. Linguistische Berichte 5: 161–179.
Holmberg, A. 1986. Word Order and Syntactic Features in the Scandinavian Languages and English. Doctoral dissertation, University of Stockholm.
Ishii, T. 2004. The phase impenetrability condition, the vacuous movement hypothesis, and that-t effects. Lingua 114: 183–215.
Julien, M. 2007. Embedded V2 in Norwegian and Swedish. Working Papers in Scandinavian Syntax 80: 103–161.
Kandybowicz, J. 2006. Comp-trace effects explained away. Proceedings of WCCFL 25: 220–228. Somerville, MA: Cascadilla Proceedings Project.
Kayne, R. S. 1981. ECP extensions. Linguistic Inquiry 12: 93–133.
Kayne, R. S. 1994. The Antisymmetry of Syntax. Cambridge, MA: MIT Press.
Kayne, R. S. 2005. Some notes on comparative syntax, with special reference to English and French. In The Oxford Handbook of Comparative Syntax, G. Cinque and R. S. Kayne (eds.), 3–69. Oxford: Oxford University Press.
Kayne, R. S. 2007. Some thoughts on grammaticalization: The case of that. Talk given at the International Conference of Historical Linguistics, Montreal, August 8.
Kitahara, H. 1994. A minimalist analysis of cross-linguistically variant CED phenomena. Proceedings of NELS 24: 241–253. GLSA, University of Massachusetts, Amherst.
Lasnik, H. and Saito, M. 1992. Move α. Cambridge, MA: MIT Press.
Lightfoot, D. 1989. The child's trigger experience: Degree-0 learnability. Behavioral and Brain Sciences 12: 321–375.
Lightfoot, D. 2006. Minimizing government: Deletion as cliticization. The Linguistic Review 23: 97–126.
Lindstad, A. M. 2007. Analysis of Negation: Structure and Interpretation. Doctoral dissertation, University of Oslo.
Lohndal, T.
2007. That-t effects: Variation in the position of C. Working Papers in Scandinavian Syntax 79: 47–73.
Maling, J. and Zaenen, A. 1978. The nonuniversality of a surface filter. Linguistic Inquiry 9: 475–497.
Manzini, R. 2007a. The Structure and Interpretation of (Romance) Complementizers. Ms., University of Florence.
Manzini, R. 2007b. The Romance k- Complementizers. Ms., University of Florence.
McCloskey, J. 2002. Resumption, successive cyclicity, and the locality of operations. In Derivation and Explanation in the Minimalist Program, S. Epstein and T. D. Seely (eds.), 184–226. Malden: Blackwell.
McDaniel, D., Chiu, B. and Maxfield, T. L. 1995. Parameters for wh-movement types: Evidence from child English. Natural Language and Linguistic Theory 13: 709–753.
Mizuguchi, M. 2008. Derivation, minimalism, and that-trace effects. English Linguistics 25: 56–92.
Perlmutter, D. 1971. Deep and Surface Structure Constraints in Syntax. New York: Holt.
Pesetsky, D. and Torrego, E. 2001. T-to-C movement: Causes and consequences. In Ken Hale: A Life in Language, M. Kenstowicz (ed.), 355–426. Cambridge, MA: MIT Press.
Platzack, C. 1987. The Scandinavian languages and the null subject parameter. Natural Language and Linguistic Theory 5: 377–401.
Poletto, C. 2001. Complementizer deletion and verb movement in standard Italian. In Current Studies in Italian Syntax: Essays Offered to Lorenzo Renzi, G. Cinque and G. Salvi (eds.), 265–286. London: Elsevier.
Richards, N. 2001. Movement in Language: Interactions and Architectures. New York: Oxford University Press.
Rizzi, L. 1990. Relativized Minimality. Cambridge, MA: MIT Press.
Rizzi, L. 1997. The fine structure of the left periphery. In Elements of Grammar: A Handbook of Generative Syntax, L. Haegeman (ed.), 281–337. Dordrecht: Kluwer.
Rizzi, L. 2001. On the position "Int(errogative)" in the left periphery of the clause. In Current Studies in Italian Syntax: Essays Offered to Lorenzo Renzi, G. Cinque and G.
Salvi (eds.), 289–296. London: Elsevier.
Rizzi, L. 2006. On the form of chains: Criterial positions and ECP effects. In Wh-Movement: Moving On, L. Lai-Shen Cheng and N. Corver (eds.), 97–133. Cambridge, MA: MIT Press.
Rizzi, L. and Shlonsky, U. 2007. Strategies of subject extraction. In Interfaces + Recursion = Language? Chomsky's Minimalism and the View from Syntax-Semantics, H. M. Gärtner and U. Sauerland (eds.), 115–160. Berlin: Mouton de Gruyter.
Roussou, A. 2000. On the left periphery: Modal particles and complementisers. Journal of Greek Linguistics 1: 65–94.
Roussou, A. 2002. C, T, and the subject: That-t phenomena revisited. Lingua 112: 13–52.
Roussou, A. 2007. Subjects on the Edge. Ms., University of Patras.
Sobin, N. 1987. The variable status of comp-trace phenomena. Natural Language and Linguistic Theory 5: 33–60.
Sobin, N. 2002. The comp-trace effect, the adverb effect and minimal CP. Journal of Linguistics 38: 527–560.
Stepanov, A. 2001. Cyclic Domains in Syntactic Theory. Doctoral dissertation, University of Connecticut, Storrs.
Stowell, T. 1981. Origins of Phrase Structure. Doctoral dissertation, MIT.
Szczegielniak, A. 1999. 'That-trace effects' cross-linguistically and successive cyclic movement. MIT Working Papers in Linguistics 33: 369–393.
Takahashi, D. 1994. Minimality of Movement. Doctoral dissertation, University of Connecticut, Storrs.
Taraldsen, K. T. 1978. On the scope of wh-movement in Norwegian. Linguistic Inquiry 9: 623–640.
Taraldsen, K. T. 1979. On the Nominative Island Condition, Vacuous Application and the that-Trace Filter. Ms., University of Oslo. [Reproduced by the Indiana University Linguistics Club, 1980].
Taraldsen, K. T. 1981. The theoretical interpretation of a class of 'marked' extractions. In Theory of Markedness in Generative Grammar, A. Belletti, L. Brandi and L. Rizzi (eds.), 475–516. Pisa: Scuola Normale Superiore di Pisa.
Taraldsen, K. T. 1986. Som and the binding theory.
In Topics in Scandinavian Syntax, L. Hellan and K. K. Christensen (eds.), 149–184. Dordrecht: Reidel.
Taraldsen, K. T. 2001. Subject extraction, the distribution of expletives and stylistic inversion. In Subject Inversion in Romance and the Theory of Universal Grammar, A. Hulk and J-Y. Pollock (eds.), 163–182. New York: Oxford University Press.
Thornton, R. 1990. Adventures in Long-Distance Moving: The Acquisition of Complex Wh-Questions. Doctoral dissertation, University of Connecticut, Storrs.
Thráinsson, H. 2003. Syntactic variation, historical development and minimalism. In Minimalist Syntax, R. Hendrick (ed.), 152–191. Malden: Blackwell.
Vergnaud, J. R. 1974. French Relative Clauses. Doctoral dissertation, MIT.
Vikner, S. 1991. Relative der and other C° elements in Danish. Lingua 84: 109–136.
Vikner, S. 1995. Verb Movement and Expletive Subjects in the Germanic Languages. Oxford: Oxford University Press.
Watanabe, A. 1992. Larsonian CP recursion, factive complements, and selection. Proceedings of NELS 23: 523–537. GLSA, University of Massachusetts, Amherst.
Wiklund, A-L., Hrafnbjargarson, G. H., Bentzen, K. and Hróarsdóttir, Þ. 2007. Rethinking Scandinavian verb movement. Journal of Comparative Germanic Linguistics 10: 203–233.
Zwart, C. J-W. 1996. "Shortest Move" versus "Fewest Steps". In Minimal Ideas: Syntactic Studies in the Minimalist Framework, W. Abraham, S. D. Epstein, H. Thráinsson and C. J-W. Zwart (eds.), 305–327. Amsterdam: John Benjamins.
Zwart, C. J-W. 1997. Morphosyntax of Verb Movement: A Minimalist Approach to the Syntax of Dutch. Dordrecht: Kluwer.

4 Freezing Effects and Objects1

4.1 Introduction

Since the 1960s, syntactic islands have occupied a central role within generative grammar. In recent years, some islands have been studied in terms of the freezing approach. This approach tries to elucidate the conditions that prohibit an element from taking part in any further syntactic operations after it has moved from its base position (see Wexler and Culicover 1980). Much attention has focused on the upper layers of the clause, in particular the TP-CP domain. However, in this chapter I take a closer look at the lower area of the clause, namely, the area where arguments are externally merged. Specifically, I provide, among other things, an account of the Norwegian contrasts shown in (1) through (3):2

(1) Extraction from/of subjects

(a) [CP [Hva for en fyr]i gav [IP ti barna pakker til bursdagen]]? what for a guy gave children.def gifts for birthday.def “What guy gave children gifts for their birthday?”

(b) *[CP Hvai gav [IP [ti for en fyr] barna pakker til bursdagen]]? what gave for a guy children.def gifts for birthday.def

(c) [CP Hvemi gav [IP ti barna pakker til bursdagen]]? who gave children.def gifts for birthday.def "Who gave the children gifts for their birthday?"

(2) Extraction from/of indirect objects

(a) [CP [Hva for en fyr]i gav [IP du [vP ti pakker til bursdagen]]]? what for a guy gave you gifts for birthday.def “Which guy did you give gifts for their birthday?”

(b) *[CP Hvai gav [IP du [vP [ti for en fyr] pakker til bursdagen]]]? what gave you for a guy gifts for birthday.def

(c) [CP Hvemi gav [IP du [vP ti pakker til bursdagen]]]? who gave you gifts for birthday.def "Who did you give gifts for their birthday?"

(3) Extraction from/of direct object3

(a) [CP [Hva for noe]i gav [IP du [vP barna ti til bursdagen]]]? what for something gave you children.def for birthday.def “What (kind of thing(s)) did you give the children for their birthday?”

(b) [CP Hvai gav [IP du [vP barna [ti for noe] til bursdagen]]]? what gave you children.def for something for birthday.def

(c) [CP Hvai gav [IP du [vP barna ti til bursdagen]]]? what gave you children.def for birthday.def “What did you give the children for their birthday?”

Examples (1b) and (2b) show that subextraction is impossible from subjects and indirect objects, whereas extraction of the entire constituent is possible ((1a, 1c) and (2a, 2c)). However, this difference is not found with direct objects, as (3) shows. The goal of the present chapter is to show how to derive these data by adopting and developing the theory of locality proposed in Boeckx (2008b). We will see that as it currently stands the theory covers quite a range of data, but we need to make modifications in order to deal with some problematic examples and to account for a wide array of subextraction facts. The theory provides no unified account of why subextraction is bad—at least three different accounts are shown to be necessary, which suggests that the theory does not provide a satisfactory explanation of subextraction. This issue recurs throughout the chapter.

It will also be shown that there is a crucial difference between Norwegian and English, manifested by the fact that English does not have direct counterparts of the Norwegian (2a, 2c). An entire section is devoted to indirect objects and the A-bar properties of such structures since, as it turns out, this is a poorly studied area. A comparative study of such closely related languages as English and Norwegian appears to be particularly well suited to getting a better understanding of what enables and what prevents A-bar movement of indirect objects, since English and Norwegian constitute a minimal pair in this regard. Despite this difference, English and Norwegian passives are similar in that both allow the passive version of (2b). For the active cases, I suggest that there is a difference between inherent and structural Case, where the latter makes it impossible for any syntactic object to move further after Case is checked in a particular checking domain. This does not hold when an indirect object bears inherent Case only, since inherent Case does not require a checking domain.
Passives require additional assumptions that I turn to at the end of the chapter.

This chapter is structured as follows: Section 4.2 presents the background and reviews examples of extraction and subextraction from subjects based on Boeckx (2008b). It also deals with a problem for the theory related to movement of subjects out of finite clauses, which has not been discussed before, and includes a discussion of what to do about local subject questions. The core theoretical assumptions adopted in this chapter are also presented in the course of this section. Section 4.3 shows how the analysis developed in section 4.2 can be extended and modified so that it can deal with direct and indirect objects, and it also discusses object shift and scrambling. Section 4.4 is a comparative study of English and Norwegian indirect objects and their A and A-bar properties (such as extraction from active and passive sentences and subextraction from indirect objects). Section 4.5 discusses some general issues before section 4.6 concludes the chapter.

4.2 Subjects and Extraction

In this section I present various data involving extractions related to subjects. This is mainly background to the main body of the chapter, which centers on the properties of freezing related to direct and indirect objects. First I look at subextraction from subjects and then briefly at extraction of subjects related to that-trace effects. Here I also address an empirical problem for the approach adopted in this chapter that has not been addressed earlier. Finally, data on local subject questions are discussed.

4.2.1 Subextraction From Subjects

Let us first look at English. Subextraction from the canonical subject position (labeled SpecTP in this chapter) is impossible in English, as shown by the examples in (4):

(4) (a)  *[CP [Which Marx brother]i did [IP she say [CP [IP [a biography of ti] is going to be published/will appear this year]]]]?

(b) *[CP [Which Marx brother]i did [IP she say [CP that [IP [a biographer of ti] interviewed her/worked for her]]]]? (Lasnik and Park 2003: 650, from Merchant 2001: 185)

This fact has commonly been derived from Huang's (1982) Condition on Extraction Domains, and more specifically, the Subject Condition (Chomsky 1973), which says that extraction from the canonical subject position is prohibited.4 This is further confirmed by the data in (5), where extraction is possible from a position below the canonical subject position.

(5) (a)  [CP [Which candidate]i were [IP there [posters of ti] all over the town]]?

(b) *[CP [Which candidate]i were [IP [posters of ti] all over the town]]? (Lasnik and Park 2003: 651, from Merchant 2001: 187)

A number of authors have argued that the canonical subject position is gen- erally a freezing position (cf. Ormazabal, Uriagereka, and Uribe-Etxebarria 116 Transformational Constraints 1994; Takahashi 1994; Stepanov 2001; Boeckx 2003, 2008b; Lasnik and Park 2003; Rizzi 2006, Rizzi and Shlonsky 2007).5 There is much cross-linguistic evidence for this view: (6) shows this for Spanish, (7) for Norwegian, (8) for Icelandic, (9) for Danish, and (10) for Swedish.6

(6) [CP [De qué conferenciantes]i te parece [CP que . . . (Spanish) of what speakers cl.to.you seem.3.sg that . . .

(a) [IP me van a impresionar [las propuestas ti]]]]? cl.to.me go.3.pl to to.impress the proposals

(b) *[IP [las propuestas ti] me van a impresionar]]]? the proposals cl.to.me go.3.pl to to.impress "Which speakers does it seem to you that the proposals by will impress me?" (Uriagereka 1988: 118)

(7) *[CP Hvemi tror du [CP at [IP [brev fra ti] kommer i morgen]]]? (Norwegian) who think you that letters from come in tomorrow “Who do you think that letters from come tomorrow?”

(8) *[CP Hverjumi telur Jens [CP að [IP [mynd af ti] hangi á who believes Jens that photograph of hangs on skrifstofu Willochs]]]? (Icelandic) office Willoch’s “Who does Jens believe that a photograph of hangs in Willoch’s office?”

(9) *[CP [Hvilken forfatter]i blev [IP bøgerne af ti] hurtig udsolgt]]? (Danish) which author were books.def of soon sold.out “Which author were books by soon sold out?”

(10) *[CP [Vilken kung]i hänger [IP [många porträtt av ti] på Gripsholm]]? (Swedish) which king hang many portraits of at Gripsholm “Which king hang many portraits of at Gripsholm?” (Engdahl 1982: 164)

The Spanish examples in (6) are especially revealing as the contrast shows that extraction is possible from SpecvP, (6a), but not from SpecTP, (6b), assuming that in Spanish postverbal subjects stay in SpecvP (see, e.g., Gallego 2007). This is not a peculiar feature of Spanish, however; we have already seen in (5) that this is allowed in English ((5a) is repeated in (11) for convenience):7

(11) [CP [Which candidate]i were [IP there [posters of ti] all over the town]]?

So far I have illustrated that SpecTP is a freezing position. The question is why this is the case. Here I rely on the theory put forward in Boeckx (2008b: Chapter 5). By way of introducing his theory, I focus on (pure) extraction and then return to subextraction once we have an understanding of how extraction works. Crucial to Boeckx's approach is the understanding of Case. He follows Pesetsky and Torrego (2001: 361), who define nominative Case as follows:

(12) Nominative Case
     Nominative case is uT on D.

uT stands for an uninterpretable [T(ense)]-feature, and “on D” means that it is present on a nominal phrase (i.e., a DP). Based on this, Boeckx (2008b: 172) formulates the domain of nominative Case as in (13). That is, (13) is the position where nominative Case is licensed:

(13) [Fin° [+T] [__ T° [−φ]]]

The structure in (13) incorporates Chomsky’s (2007, 2008) claim that assignment of nominative Case is dependent on a finite C and assumes Rizzi’s (1997: 288) cartography of the CP:

(14) ForceP . . . (TopicP) . . . (FocusP) . . . FinP TP

The φ-feature bundle on the functional head in (13) is present to indicate the presence of a Probe-Goal relation between the functional head and a DP (Boeckx 2008b: 172). Phi-features are unvalued (hence the "minus" sign), and they get valued through subject–verb agreement. Fin° comes with a [+T]-feature from the numeration, just like T° does (not shown in (13)). Here Boeckx follows Chomsky's (2007, 2008) claim that Fin° is the relevant Case licenser and not T°.8

An assumption that is significant from the present point of view, and which will become clearer as we go along, is that an item is frozen when it has checked a feature in a special position.9 This follows from what Boeckx (2003) calls the Principle of Unambiguous Chain (PUC; see also Richards 2001 and Rizzi 2006; cf. Müller 1998 on the Principle of Unambiguous Binding). The PUC says that a chain can contain at most one "strong" position/occurrence; that is, an item can only move to one strong position. A strong position/occurrence is defined in terms of checking of a feature associated with an EPP property (Boeckx 2008b: 165). Put slightly differently, movement to the canonical subject position is movement to a strong position. Obviously, there might be other positions that bear an EPP property, and the idea is that each such position counts as a strong position.10 If an item moves to such a position, it cannot move any further. Richards (2001) has suggested a quite similar idea. He suggests that the EPP property of an element α gives a signal to PF that α should be pronounced in the relevant context, and he then formulates a condition on what constitutes a legitimate chain at the interface. In short, Richards argues that PF must receive unambiguous instructions about which part of a chain to pronounce. The EPP property is by hypothesis one such signal that gives an unambiguous instruction to the interfaces.
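The PUC lends itself to a simple computational paraphrase. The following Python sketch is my own illustration and is not part of the original text: the Occurrence/chain representation and function names are invented for exposition, on the assumption that "strong" simply marks a position where a feature with an EPP property is checked.

```python
# Toy model (illustrative only) of the Principle of Unambiguous Chain:
# a chain is legitimate iff it contains at most one "strong" occurrence,
# so movement to a further strong position is blocked (freezing).

from dataclasses import dataclass

@dataclass
class Occurrence:
    position: str   # e.g. "SpecvP", "SpecTP", "SpecCP" (labels are hypothetical)
    strong: bool    # does this position check a feature with an EPP property?

def satisfies_puc(chain):
    """A chain may contain at most one strong occurrence."""
    return sum(occ.strong for occ in chain) <= 1

def can_extend(chain, target):
    """Further movement is allowed only if adding the target position
    still yields an unambiguous chain."""
    return satisfies_puc(chain + [target])

# A subject chain: base position in SpecvP, then the canonical subject
# position SpecTP, which is strong.
subject_chain = [Occurrence("SpecvP", False), Occurrence("SpecTP", True)]
print(satisfies_puc(subject_chain))                           # True
print(can_extend(subject_chain, Occurrence("SpecCP", True)))  # False: frozen
```

On this toy picture, a wh-subject that has already checked its feature in SpecTP cannot remerge in a second strong position, which is the freezing effect the text describes.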
In summary, we see that for both Boeckx and Richards, there can only be one strong position/EPP position per chain and that this is central to understanding freezing effects. However, it should be pointed out that it is not clear that the system has to be organized exactly like this: the claim that there is only one strong position per chain is obviously axiomatic.

The picture is slightly more complicated than we have seen so far, though. The present theory suggests that an element can only move to a single feature-checking site, that is, remerge once (Boeckx 2008b: 167; see also Boeckx 2003 and Bošković 2007; Ormazabal, Uriagereka, and Uribe-Etxebarria 1994 contains an earlier and quite similar proposal). How does one define this single feature-checking site? On the assumption that a chain is defined by two domains, one associated with the External First-Merge position, the other with the Internal Merge position, Boeckx suggests the following characterization of what he calls a "checking domain" (the diagram in (15) is adapted from Boeckx 2008b: 171):11

(15) checking domain

[H1°[–F] [__ [H2°[+F]]] … [E[+F]]]

checking site

Here we have two functional heads, H1° and H2°, and there is a specifier, marked "__", where an item can undergo checking for the feature F. The original position of the element E that will remerge is also included. Crucially, this element will have to move into "__". Unless an element E is externally merged in the "__" position in (15), movement to this position is required for E to establish a Probe-Goal relation with H1°. If movement does not happen, a relativized minimality violation will occur: H1, H2 and E all share a feature, and they stand in an asymmetric c-command relation.

Boeckx assumes that H1 is the Probe and that it has to probe E, and given the intervening H2, we have a standard case of intervention or relativized minimality. The only way E can escape the minimality effect is to move above H2, that is, to remerge in "__"; if no movement happens, the derivation crashes. Upon remerge, an unambiguous checking site (of which only one can be established per chain) has been created, as discussed earlier. Although I have kept the discussion of checking domains fairly general, the reader will not fail to recognize that the configuration in (15) is similar to the canonical subject position. (16) shows a schema for the assignment of nominative Case, where Fin° is assumed to be the bearer of nominative Case in English (Boeckx 2008b, Chomsky 2008, Lohndal 2009):12

(16) (a) [Fin°[+T, −φ] [__ [T°[+T, −φ]]] . . . [E[−T, +φ]]] before valuation

(b) [Fin°[+T, −φ] [[E[−T, +φ]] [T°[+T, −φ]]] . . . [E]] movement of E

(c) [Fin°[+T, +φ] [[E[+T, +φ]] [T°[+T, +φ]]] . . . [E]] valuation

Boeckx (2008b) assumes that Fin° values nominative Case on the relevant DP (E in the preceding structures) without saying explicitly how Fin° actually qualifies as a Probe. Here I interpret this to mean that Fin° also needs to have unvalued φ-features in order to be a Probe (see also Fortuny 2008: Chapter 2 for extensive discussion of the link between Fin° and T°).13 This is supported by complementizer agreement facts of the sort we find in West Flemish (Haegeman 1992); see also Richards (2007) and Chomsky (2008) for a more abstract version of this idea that generalizes to all languages.14
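The valuation steps in (16) can also be mimicked as a toy feature-valuation procedure. The Python sketch below is my own illustration, not part of the original text: the dictionaries and the probe function are invented stand-ins for Fin°, T° and the DP E, and defective intervention is modeled simply as a check on whether an intervening head still has unvalued φ.

```python
# Toy model (illustrative only) of the derivation in (16):
# Fin° and T° carry [+T] and unvalued φ; the DP E carries valued φ and
# an unvalued T-feature (nominative Case, following Pesetsky and Torrego).

fin = {"T": "+", "phi": None}   # None = unvalued
t   = {"T": "+", "phi": None}
e   = {"T": None, "phi": "+"}   # the subject DP E

def probe(probe_head, goal, intervener=None):
    """Value features between probe and goal, unless a head with unvalued
    phi intervenes (a defective intervention effect)."""
    if intervener is not None and intervener["phi"] is None:
        raise RuntimeError("defective intervention: probing blocked")
    probe_head["phi"] = goal["phi"]   # Fin° gets its phi-features valued
    goal["T"] = probe_head["T"]       # E gets its uT (nominative Case) valued

# Without movement of E, T° intervenes between Fin° and E, cf. (16a):
try:
    probe(fin, e, intervener=t)
except RuntimeError as err:
    print(err)   # the derivation crashes

# E remerges in SpecTP above T°, removing the intervention (16b);
# valuation then goes through, and T° is valued as well (16c):
probe(fin, e)
t["phi"] = e["phi"]
print(fin, t, e)   # all T- and phi-features valued
```

The sketch simply makes concrete why movement of E is forced: until E is above T°, the unvalued φ on T° blocks Fin° from probing, exactly the intervention configuration described for (15).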

E[−F] moves to SpecTP, and then the unvalued Case feature on E is valued. Since the elements Fin°, T° and E share a [T]-feature, movement is required. If there is no movement, the φ-features on Fin° will not get valued since the unvalued φ-features on T° will create a defective intervention effect. This means that (16) is an instance of a checking domain as defined, which means that the subject is frozen in the canonical subject position.

Interestingly, this theory also predicts that there should be no freezing effects when direct objects get their Case checked (in a Pesetsky and Torrego 2001, 2004 way). Direct objects are externally first-merged in their checking configuration as a complement to V°, where v° bears the relevant Case feature, on standard assumptions. (17) shows the relevant parts of the structure, though I will return to this in greater detail in section 4.3.1.15

(17) [v°[+T] [V° [DP[–T]]]]

Although direct objects are externally merged, they of course enter into checking, but not in a configuration of the kind given in (15). The latter assumption is motivated by the fact that raising of objects does not seem to be required in overt syntax in English (Lasnik 1999, 2003). Within Boeckx's theory, direct objects are eligible to move further. In the following we will see that this fits with the data. However, note that the present account encompasses a disjunction: DPs can check Case in two different ways—via External First-Merge as in the case of direct objects and via Internal Merge in the case of subjects moving to SpecTP. We will also see that direct objects can in fact be checked in a derived position, in which case a freezing effect occurs. At the moment, I have no good reason for why there is a disjunction when it comes to checking positions for direct objects, so it remains a stipulation, as in most other theories.

This line of thought can be extended to the subextraction cases discussed above. Boeckx (2008b: 196) says that "in the illicit cases, the point of origin
That is, since the constituent has already moved into a checking domain, we would get an ambiguous chain if parts of the moved constituent moved further, since these parts would eventually have to reach another checking domain. This accounts for why subextraction out of in situ subjects is better than subextraction out of displaced subjects, as illustrated earlier (see, e.g., example (5)).17 Thus, we see that Boeckx assumes that the accounts of extraction and subextraction are very similar. This is not a logical necessity in his framework, but it is an assumption that he adopts. However, when we look at various data in the following, we will see that there are clear differences between the two types of extraction and that more theoretical power is required to prevent subextraction than to prevent extraction. In particular, I argue that Case is required in understanding why subextraction is more restricted than extraction if we want to treat extraction and subextraction in similar ways, as Boeckx does. This leads me to address the difference between extraction and subextraction more extensively, which I do in section 4.4. The upshot of the discussion will be that the theory in Boeckx (2008b) does not offer a satisfactory explanation of why subextraction is prohibited, given that it makes use of at least three ways to rule out the bad cases.
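The freezing generalization reviewed above can be summarized schematically. The following toy Python sketch is purely illustrative; the function and parameter names (`can_extract`, `case_checked`, `via_internal_merge`) are my own labels for the notions in the text, not part of the formal theory:

```python
# Toy model of the freezing generalization: an element is frozen
# (cannot move further) once its Case is checked in a checking
# domain, i.e. a position reached by Internal Merge (movement).

def can_extract(case_checked: bool, via_internal_merge: bool) -> bool:
    """Return True if the element may still move further."""
    frozen = case_checked and via_internal_merge
    return not frozen

# A subject Case-checked in SpecTP after movement is frozen.
assert can_extract(case_checked=True, via_internal_merge=True) is False

# A direct object externally first-merged in its checking
# configuration, as in (17), remains mobile.
assert can_extract(case_checked=True, via_internal_merge=False) is True
```

The sketch encodes nothing beyond the disjunction discussed in the text: checking alone does not freeze an element; checking in a derived position does.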

4.2.2 Extraction of Subjects

Subextraction from subjects was shown to relate to the assumption that the subject moves into a checking domain, i.e. an unambiguous feature-checking site. We expect the same to be true of subjects being extracted, but a complication enters the picture at this point. It is a well-established fact that languages are subject to variation when it comes to subject extraction, and this is commonly related to that-trace effects (see Lohndal 2009 for discussion). In this section, I briefly show how the theory in Boeckx (2008b) and Lohndal (2009) derives that-trace effects, though I discuss some examples that motivate changing some of the assumptions that Boeckx and Lohndal made. Since the main purpose of this section is to introduce the details of the framework, I only focus on English; see Lohndal (2009) for an extensive discussion of how to accommodate the variation we find across dialects and languages. The standard case of that-trace effects in English is given in (18):18

(18) (a) *What do you think that is in the file?
     (b) What do you think is in the file?19

Boeckx (2008b: 178) suggests that movement is allowed out of the embedded finite clause in (18b) because the [T]-feature on Fin° is deactivated. When Fin° no longer has a valued [T]-feature, it cannot value the Case feature of the subject DP. That is, we have two configurations, as shown in (19):

(19) (a) [Fin° [+T] [__ T° [−φ]]] subject extraction disallowed

(b) [Fin° [T] [__ T° [−φ]]] subject extraction allowed
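The contrast between (19a) and (19b) can be rendered as a toy predicate. This is a sketch only, with hypothetical names of my own choosing; it simply states that subject extraction is possible just in case the [T]-feature on Fin° is deactivated:

```python
def subject_extraction_allowed(fin_T_valued: bool) -> bool:
    """(19a): Fin° bears a valued [+T], SpecTP is a checking domain,
    and the subject is frozen (the that-trace effect).
    (19b): the [T]-feature on Fin° is deactivated, the subject's Case
    is not valued there, and the subject may move on."""
    return not fin_T_valued

# (18a), with 'that': subject extraction disallowed.
assert subject_extraction_allowed(fin_T_valued=True) is False
# (18b), without 'that': subject extraction allowed.
assert subject_extraction_allowed(fin_T_valued=False) is True
```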

The Standard English that-trace effect illustrated in (18a) exhibits the configuration in (19a). When that is missing, as in (18b), Fin° does not have a valued [T]-feature, as shown in (19b). Thus, the DP does not receive Case and can freely move to the main clause, where its Case feature is valued. I adopt the standard assumption that the subject moves through SpecTP even if there is no checking in this position, as long as the subject still needs to check some feature (see Boeckx 2007 and Bošković 2007 for a defense of the latter claim). However, Boeckx (2008b) does not tell us where the Case feature of what is valued. One answer is provided by Lohndal (2009), who argues that what receives Case from the matrix Fin°. Unfortunately, this solution predicts (20) to be grammatical, contrary to fact:

(20) *Who was it expected t to kiss Mary?

Since the theory in Lohndal (2009) cannot account for this case, in this chapter I argue that who actually receives Case from the matrix verb expected, similarly to Kayne (1980), who argued that a matrix verb can assign Case to an element in COMP. On this assumption, I return to cases like (18b), here using the example in (21a). The relevant parts of a derivation where there is no that-trace effect are shown in (21b–e), where I ignore any other feature checking except Case:

(21) (a) Who do you think left?

(b) [Fin°[+T] do [TP [vP you[−T] [v°[+T] think [FinP[T] [TP who[−T] left]]]]]]

(c) [Fin°[+T] do [TP you[+T] [vP you [v°[+T] think [FinP who[−T] [FinP[T] who[−T] [TP left]]]]]]]

(d) [Fin°[+T] do [TP you[+T] [vP you [v°[+T] think [FinP who[+T] [FinP[T] who [TP left]]]]]]]

(e) [FinP who[+T] [Fin° do [TP you[+T] [vP you [v°[+T] think [FinP who[+T] [FinP[T] who [TP left]]]]]]]]

The derivation stages in (21b–c) show how the main-clause subject you gets Case, whereas (21d–e) show how the moved embedded subject who gets Case. The embedded SpecTP is not a checking domain because the [T]-feature on Fin° is deactivated; thus, the relevant relativized minimality situation will not emerge.20 This means that who is allowed to move further, potentially triggered by an EPP-feature. Note that this gives us a way to analyze (20) as well: on the standard assumption that accusative Case is absorbed in passives, there will be no Case checker for who if it is merged as the subject. Thus, the sentence is predicted to be unacceptable. Regarding dialects of English that do not exhibit that-trace effects (Sobin 1987, 2002), these show that the only possible configuration is (19b). Subject extraction is possible regardless of whether the complementizer is present or not. I am assuming that who in (22a) gets Case in the same way as in (21) (pace Lohndal 2009):

(22) (a) Who do you think that left?

(b) [Fin°[+T] do [TP[−φ] [vP you[−T] [v°[+T] think [Fin°[T] that [TP who[−T] left]]]]]]

(c) [Fin°[+T] do [TP[−φ] you[+T] [vP you [v°[+T] think [FinP who[−T] [Fin°[T] that [TP who[−T] [TP left]]]]]]]]

(d) [Fin°[+T] do [TP you[+T] [vP you [v°[+T] think [FinP who[+T] [Fin°[T] that [TP who [TP left]]]]]]]]

(e) [FinP who[+T] [Fin°[+T] do [TP you[+T] [vP you [v°[+T] think [FinP who [Fin°[T] that [TP who [TP left]]]]]]]]]

As we see, the only difference between the English derivations is whether the [T]-feature on Fin° is activated or not.21 This crucially determines whether subject extraction is possible.22 Before moving on, we need to pause to look at a different case, namely, (23), brought to my attention by Howard Lasnik (p.c.).23

(23) (a) I know who to say [t solved the problem]
     (b) It is unclear who to say [t solved the problem]

In (23), there does not seem to be a Case assigner for who, which seems to be a problem for the theory I have adopted here. However, the solution I offered for the cases where who is allowed to move out of an embedded finite clause seems to work for (23) as well. That is, I argue that say assigns Case to who in SpecCP in the examples in (23).24 Specifically, I assume that say has a valued [T]-feature that can check the unvalued [T]-feature on who. Since the C head of the CP that say embeds has a deactivated [T]-feature, we do not have a checking domain in the sense of (15). Hence, who can move further. One may wonder why who cannot remain in this lower position (*I know to say who solved the problem); this is presumably due to the fact that the matrix predicate selects for a wh-complement.25 This account does not derive the unacceptability of (24a), though the latter seems to be ruled out on independent grounds given that (24b) is also bad:

(24) (a) *I know who to say to solve the problem.
     (b) *I know why to say to solve the problem.

Note, however, that this account does handle the following data accurately:

(25) (a) *I don’t know who to tell Bill to solve the problem.
     (b) I don’t know why to tell Bill to solve the problem.
     (c) I don’t know who to tell Bill to have t solve the problem.

In (25c), who has its Case checked by know. (25b) should be good since it involves an adjunct that does not need to be checked for Case, and (25a) should be bad because who has not moved and hence there is no trace. Since there is no trace, there is no meaningful interpretation. The claim that say can assign Case to who in (23) may be complicated by the following data:

(26) I know who to say very clearly [t solved the problem].

Here the adjunct very clearly intervenes such that there is no adjacency between the verb and who. However, we know that adjacency matters in other cases, as is shown in (27a–c).

(27) (a) I sincerely believe John.
     (b) *I believe sincerely John.
     (c) *I believe very sincerely [John to be smart]
     (d) Who do you believe very sincerely [t to be smart]?

As Chomsky and Lasnik (1977) pointed out, sentences like (27c) improve if the subject is an extracted wh-element, as in (27d), which lends support to the preceding analysis. The generalization seems to be that adjacency matters for licensing elements in situ and in ECM cases, but not when a constituent is in the specifier of CP. I do not have a good answer to why adjacency does not matter in the latter case and have to set that aside for future research. There is a lot of empirical evidence in favor of this approach to that-trace effects and Comp-trace effects more generally. For instance, Lohndal (2007b, 2009) argues that this approach can derive the variation among the Scandinavian languages when it comes to that-trace effects and that it can also deal with relative clauses and extraction out of these. For reasons of space, I have to refer the reader to Boeckx (2008b) and in particular Lohndal (2007b, 2009) for details. In the next section, I discuss local subject questions.

4.2.3 Local Subject Questions

So far I have discussed extraction from embedded questions and briefly touched on subextraction from subjects. I now argue that this approach can also account for local subject questions like (28):

(28) Who told the story?

There is a lot of discussion in the literature as to whether the subject in (28) is in SpecCP or in SpecTP (see, among others, Travis 1984, Vikner 1995, Zwart 1997). Rizzi and Shlonsky (2007) discuss this question thoroughly and conclude that it does reside in SpecCP. They advance an argument in favor of this position based on languages with overt question morphemes, like Imbabura Quechua:

(29) pi-taj shamu-rka
     who-q left-agr
     ‘Who left?’

Rizzi and Shlonsky (2007) argue that there is a covert expletive (called there in (30)) directly merged in SpecTP, making it possible for the wh-phrase to circumvent being frozen in SpecTP. The theory I have adopted so far is neutral on whether the subject is in its Case position or in an A-bar position (Boeckx 2008b: 193). I will follow Rizzi and Shlonsky’s proposal and assume that there is a silent there in SpecTP. According to Koopman (2006), there is evidence for a silent there in other languages (see also Rizzi and Shlonsky 2007), so (28) and (29) may be other such cases. What would falsify my claim is a case where the sequence ‘who there V’ is bad, as it would show that there cannot be a silent there in a matrix clause with wh-movement across the board in English. Bearing this point in mind, I need to clarify how the wh-phrase will get Case in an example like (30a). The relevant parts of the derivation are given in (30b–e):

(30) (a) Who left?

(b) [Fin°[+T] [TP[−φ] there[−T] [vP who[−T] left]]]

(c) [Fin°[+T] [TP[−φ] there[+T] [vP who[−T] left]]]

(d) [FinP who[−T] [Fin°[+T] [TP[−φ] there[+T] [vP who left]]]]

(e) [FinP who[+T] [Fin°[+T] [TP[−φ] there[+T] [vP who left]]]]

I assume that there is no intervention effect here because Case checking of there happens prior to A-bar movement of who, which may also be required to prevent a Multiple Agree configuration if the arguments in Haegeman and Lohndal (2010) extend to the present case. As for the details of the derivation, I assume that the wh-phrase bears a [−T] feature, c-commands Fin° and is in a possible Agree configuration. Fin° is therefore responsible for valuing the Case feature on both the expletive and the wh-phrase. This account is fully in line with the overall theory of the present paper, namely, that extraction/further movement is only possible when an item does not enter into a checking domain. For local subject questions, the silent expletive ensures that the first checking domain that the wh-phrase is able to enter is SpecFinP. The reader may at this point wonder how I would analyze V2 languages like Norwegian, given that most theories assume that the subject is in the A-bar domain in declarative sentences like (31):

(31) Han fortalte historien.
     he told story.def
     “He told the story.”

Here I have to assume that the Case feature [+T] is on the Force head in declarative main clauses (irrespective of whether there is a that-trace effect or not; see the variation in the Scandinavian languages discussed in Lohndal 2009), which will then trigger movement to the C-domain. V2 is arguably the cue responsible for this fact since V2 generally only holds in main clauses. In V2 languages, the verb has been argued to target a position higher than Fin° in the left periphery (Westergaard and Vangsnes 2005, Bentzen 2007, Julien 2007), and it is thus reasonable to assume that V2 activates an articulated C-area (it ensures that Fin° and Force° are split, cf. Rizzi 1997). Notice that this split ensures that the subject has to move into the C-domain. This movement is required for Case reasons, and it is made possible since SpecTP is no longer a checking domain due to the lack of a Case feature. This account also fits with the established fact that most Norwegian dialects do not have embedded V2 (though see Bentzen 2007 for complications and some dialectal variation), which means that we do not expect subjects in embedded clauses to move to the C-area since there is no V2 that may trigger this movement. In this section we have seen how local subject questions can be analyzed. In the next section I turn my attention to objects and some extraction phenomena from a comparative perspective.

4.3 Objects and Extraction

The previous section discussed subjects and some properties related to extraction. This section shows that direct objects and indirect objects fit the theory, which crucially relies on the notion of a checking domain. Section 4.3.1 discusses direct objects and notes a peculiar asymmetry found in Basque. This asymmetry is important for what follows in section 4.3.2, where I propose a way to analyze indirect objects. Section 4.3.3 shows how the present theory receives further support from object shift and scrambling.

4.3.1 Direct Objects

In section 4.2 I argued that there is a special freezing position for subjects, namely, SpecTP. According to Rizzi (2006: 124), there is no corresponding freezing position for objects. The goal of this section is to show how this can be implemented to account for the Norwegian data in (32) and (33):

(32) (a) Hva tror du Peter vet t?
     what think you Peter knows
     “What do you think Peter knows?”
     (b) Hva tror du Peter maler et bilde av t?26
     what think you Peter draws a painting of
     “What do you think Peter draws a painting of?”

(33) (a) Hva gav du barna t for noe til bursdagen?
     what gave you children.def for something to birthday.def
     (b) Hva gav du barna t til bursdagen?
     what gave you children.def to birthday.def
     “What did you give the children for their birthday?”

As these data show, there are no freezing effects related to direct objects. This follows straightforwardly from the theory advocated here, as direct objects do not check their Case in a checking domain (see also section 4.2) but normally get their Case checked in situ. The Case of the direct object will be assigned by v°, as in (34). I am assuming that the direct object is merged as a complement to V° and, following Pesetsky and Torrego (2004) and Boeckx (2008b), I take it that the abstract representation of accusative Case is a [+T]-feature, and that the feature is morphologically realized either as nominative or accusative depending on the environment in which it occurs.

(34) [v°[+T] [V° [DP[−T]]]]

The structure in (34) is an ambiguous checking site; the direct object DP is externally first-merged in its position (see also the discussion earlier related to (15)). Thus, this checking site does not prevent the object from moving further since it is not a checking domain as defined above. Nor does it prevent subextraction from the object. As (32) and (33) show, this account gives us the correct facts. Interestingly, there is a peculiar asymmetry in the data discussed so far. We have seen that subextraction from subjects is generally banned, but that both subextraction from an argument and subject extraction are possible in a postverbal position, as in the Spanish and English examples in (6) and (4). A prediction seems to be that we do not expect the same to hold true of direct objects since they do not enter into a checking domain (I discuss object shift in section 4.3.3). However, this prediction fails. Consider Basque, where subextraction from direct objects is banned but extraction of the entire direct object is grammatical. This is illustrated in (35):

(35) (a) *Norii buruzko sortu zituzten aurreko asteko istiluek [zurrumurruak ti]?
     who about.of create aux last week scandals rumors
     “Whoi have last week’s scandals caused [rumors about ti]?”
     (Uriagereka 1998: 395)
     (b) Zer1 egiten duzu zuk t1 hemen?
     what do aux you.erg here
     “What are you doing here?”
     (Etxepare and de Urbina 2003: 464)

We see here that in Basque subextraction from direct objects is impossible, while as we have seen both Norwegian and English allow such subextraction. The question is how we are to account for this in terms of the framework we are developing. Following Laka and Uriagereka (1987), Uriagereka (1999a) and Boeckx (2003, 2008a, b, c), I argue that this extraction fact is related to properties of Basque agreement. Basque shows object φ-feature agreement with the verb, contrary to languages like English and Norwegian, and it is reasonable to assume that this agreement makes it impossible to subextract from the object in Basque. This account is supported by a vast number of examples where the absence of agreement makes extraction possible (see, in particular, Boeckx 2003, 2008a, b, c, Baker and Collins 2006; see also the analysis of that-trace effects above). Verb–direct object agreement yields the same effect as moving to a checking domain does, namely, that subextraction becomes impossible. Boeckx (2003: 104) argues that the target for subextraction would be φ-complete, in which case subextraction is not licit. For this to be true in the present case, it would mean that verb–direct object agreement somehow makes the entire direct object φ-complete. It is not entirely clear how this can be done technically, and I leave the implementation of this for future research. For our purposes, it suffices to note the strong correlation between agreement and lack of subextraction. Notice, though, that since the direct object itself has not entered a checking domain, movement of the entire object is still possible. Thus, there are now two ways in which subextraction of a DP becomes impossible: either by entering a checking domain (the case of subjects) or through agreement in φ-features between a verb and the DP (the case of objects). We now have an asymmetry between extraction and subextraction. Extraction is ruled out whenever a constituent enters a checking domain.
Subextraction is either ruled out by a constituent entering a checking domain or by φ-feature agreement. This already suggests that subextraction is more restricted than extraction since the environment where full extraction is impossible is a subset of the environments where subextraction is impossible. This is something that the theory in Boeckx (2008b) cannot capture as it stands. I have argued that a disjunction is necessary in order to capture the data, which suggests that Boeckx’s theory does not offer an explanation of why subextraction is prohibited even if it looks like we are roughly dealing with the same type of structure each time subextraction is bad (i.e., extraction out of a complex phrase). In the next section I argue that this conclusion is strengthened by the fact that in order to account for indirect objects in Norwegian, we need yet a third mechanism to rule out subextraction.

4.3.2 Indirect Objects

The data that this section seeks to account for are provided in (36):

(36) (a) *Hva gav du [t for en fyr] pakker til bursdagen?
     what gave you for a guy gifts for birthday.def
     (b) Hvem gav du t pakker til bursdagen?
     who gave you gifts for birthday.def
     “Who did you give gifts for the birthday?”

The example in (36a) shows that it is impossible to subextract from an indirect object, whereas (36b) shows that it is possible to move the entire indirect object. As we have seen, Basque shows the same contrast with direct objects, but this section argues that the two cases are to be accounted for differently. We know that indirect objects are special. They are claimed to be inherently theta-marked (and thus linked to theta-selection; see Chomsky 1986) and often get dative case across languages. My account of the observed contrasts found with indirect objects is that the inherent Case assigned to them makes them opaque; thus, subextraction is impossible (cf. Řezáč 2008 on Basque and other languages, though see Uriagereka 2008: 157–158 for complicating factors for Basque). This holds because the Probe would not be able to locate any element within the indirect object that it could extract. In other words, inherent Case acts like a “shield”, preventing any probing into its inner structure. This shield might plausibly be a PP shell, as suggested by Kayne (1984), Bittner and Hale (1996) and more recently Řezáč (2008), although admittedly this in itself does not guarantee nonprobing; however, see Řezáč for a specific proposal using phase theory. Indirect objects are therefore movable as units, but it is impossible to subextract from them. Arguably, the reason they are movable as units is that they do not enter a checking domain; thus, they can freely move to the left periphery of the clause. However, in section 4.4 we will see that this picture is cross-linguistically more complicated in interesting and nontrivial ways. Before we move on to briefly discuss object shift and scrambling, we should pause to reflect on what it means that we have three different factors that prevent subextraction: entering a checking domain, triggering agreement, or being assigned inherent Case.
These three factors equip us to deal with the empirical generalization that subextraction is more restricted than full extraction, though they also mean that we are either missing a generalization or that we cannot really explain subextraction. This chapter suggests that the latter is more likely the case. Intuitively, it also makes sense that we need more factors to rule out a quite diverse set of environments where subextraction is banned, compared to the rather homogeneous environments where extraction is banned (the latter being by and large confined to checking domains, if Boeckx 2008b and the present paper are correct). This is nontrivial from a theoretical perspective since it can be taken to mean that the prospects for developing a theory of subextraction are scant.
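The asymmetry can be stated compactly. In the following illustrative sketch (the function names and booleans are mine, not the theory’s), extraction is blocked by a single condition while subextraction is blocked by the disjunction of three, so the extraction-blocking environments form a subset of the subextraction-blocking ones:

```python
from itertools import product

def extraction_blocked(in_checking_domain: bool) -> bool:
    # Full extraction fails only in checking domains.
    return in_checking_domain

def subextraction_blocked(in_checking_domain: bool,
                          phi_agreement: bool,
                          inherent_case: bool) -> bool:
    # Any of the three factors suffices: checking domain (subjects),
    # verb-object φ-agreement (Basque objects), inherent Case
    # (Norwegian indirect objects).
    return in_checking_domain or phi_agreement or inherent_case

# Whenever extraction is blocked, subextraction is blocked too,
# but not vice versa:
for cd, phi, inh in product([True, False], repeat=3):
    if extraction_blocked(cd):
        assert subextraction_blocked(cd, phi, inh)
```

The subset relation the loop verifies is exactly the point made in the text: subextraction is banned in a strictly larger class of environments than full extraction.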

4.3.3 Some Remarks on Object Shift and Scrambling

The theory of Boeckx (2008b) makes an interesting prediction, which I have indirectly touched on already: when we have scrambling or object shift, the internal parts of the moved constituent should be frozen or immovable if the moved constituent enters a checking domain (earlier versions of this prediction are the Frozen Structure Constraint (Ross 1967/1986: 173), the Freezing Principle (Wexler and Culicover 1980: 179) and the Revised Extraction Constraint (Diesing 1992: 128)). This prediction follows because a shifted object would have its Case checked in a checking domain. I am assuming that in cases of object shift and scrambling, Case is not checked until the shifted position is reached because the absence of checking in the base position is what makes object shift and scrambling possible in the first place. An abstract derivation is shown in (37), where the starting point looks something like (37a):

(37) (a) [Agr°[+T] [DPsubject v°[T] [V° DP[−T]]]]

(b) [Agr°[+T] DP[−T] [DPsubject [v°[T] [V° DP]]]]

(c) [Agr°[+T] DP[+T] [DPsubject [v°[T] [V° DP]]]]

What happens here is that v° is deactivated and the Case feature is on a higher head, (37b), say, Agr°, to use a familiar label (although the label should not be taken to have any theoretical significance). The deactivation is of the familiar sort that we saw in cases of subject extraction from finite embedded clauses without a complementizer (recall, e.g., the analysis of English sentences without that as in (18b)). The object has to move in this case because otherwise the defective [T]-feature on v° will create an intervention effect which will block Case assignment. We can test that the derivation in (37) gives the right empirical results by looking at data related to these phenomena. I use Icelandic and German data for illustration. Icelandic does not have a general ban on floating quantifiers, as shown in (38):

(38) (a) Hún þekkir ekki [öll börnin].
     she knows not all children.def
     “She doesn’t know all the children.”
     (b) ?Börnini þekkir hún ekki [öll ti].
     children.def knows she not all
     (Halldór Á. Sigurðsson, p.c.)

It is, however, impossible to strand a quantifier in a shifted position:

(39) *Börnini þekkir hún [öll ti] ekki.
     children.def knows she all not
     (Halldór Á. Sigurðsson, p.c.)

These data seem to confirm our prediction: shifted objects do not allow subextraction. However, this claim is somewhat complicated by the standard treatment of quantifier floating as in Sportiche (1988), where it is assumed that quantifiers can be floated in shifted positions. The data from Icelandic suggest that one cannot trivially equate quantifier float and subextraction, but that in this language, the lack of quantifier float in shifted positions may follow from conditions on subextraction. Scrambling data from German provide the same picture as Icelandic, though without the additional complication. (40) and (41) illustrate that scrambled objects do not allow subextraction (for some complications, see Fanselow 2001).

(40) (a) Was hat Otto immer [t für Romane] gelesen?
     what has Otto always for novels read
     “What kind of novels has Otto always read?”
     (b) *Was hat Otto [t für Romane] immer gelesen?
     what has Otto for novels always read
     “What kind of novels has Otto always read?”
     (Diesing 1992: 129)

(41) (a) Über wen hat der Fritz letztes Jahr [ein Buch t] geschrieben?
     about whom has the Fritz last year a book written
     “Who did Fritz write a book about last year?”
     (b) *Über wen hat der Fritz [ein Buch t] letztes Jahr geschrieben?
     about whom has the Fritz a book last year written
     “Who did Fritz write a book about last year?”
     (Müller 1998: 125)

(40a) and (41a) show that subextraction from the base position is entirely licit, whereas (40b) and (41b) show that subextraction from the scrambled position is banned. This is in line with our prediction. English provides further support. Here I draw on the observation in Lasnik (2001) that an object is only an island for extraction when it has raised overtly; the contrast between (42a) and (42b) illustrates that. (See the discussion of whether or not English has object shift in a series of papers in Lasnik 1999, 2003.)

(42) (a) Whoi did Mary call up [friends of ti]?

(b) ?*Whoi did Mary call [friends of ti] up? (Lasnik 2001: 111)

In (42b) we have exactly the configuration shown in (37). The Case feature is on what I have labeled the Agr head in (42b), whereas it is on the v head in (42a). Thus, the data from object shift and scrambling can be accounted for by the present theory. In the next section, I look at cross-linguistic differences regarding indirect objects and their freezing properties.
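The object-shift pattern in (38)–(42) can be folded into the same toy model used above (again with hypothetical names of my own): an object checked in situ by v° stays transparent, while a shifted or scrambled object, checked on the higher Agr° head, sits in a checking domain and is frozen:

```python
def object_allows_subextraction(shifted: bool) -> bool:
    """In situ, as in (42a): Case checked by v° in the base position,
    not a checking domain, so subextraction is licit.
    Shifted, as in (42b), or scrambled, as in (40b)/(41b): Case is
    checked on Agr° in the derived position, a checking domain,
    so the object is frozen for subextraction."""
    return not shifted

# 'Who did Mary call up [friends of t]?'  -- object in situ
assert object_allows_subextraction(shifted=False) is True
# '*Who did Mary call [friends of t] up?' -- object shifted
assert object_allows_subextraction(shifted=True) is False
```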

4.4 Variation in Indirect Objects: English and Norwegian Compared

In this section I provide a quite extensive study of variation in indirect objects between English and Norwegian. I show that Case is a central ingredient in deriving the differences. Before I can turn to that, it is necessary to briefly discuss recent studies on indirect objects.27 This section is structured as follows: First I mention the English data showing that indirect objects cannot A-bar move, in contrast to the Norwegian data we saw in section 4.3. The analysis is an extension of the account I have developed so far based on freezing, and we will see that subextraction again continues to demand “special” solutions in order for us to account for the differences between English and Norwegian. I argue that English indirect objects have to check a structural Case feature whereas Norwegian indirect objects only have an inherent Case feature. This is argued to derive the differences between the two languages. Last, I turn to passives and give a short analysis of some differences between English and Norwegian. Passives will not be discussed in great detail, but I hope to show how the framework adopted in this chapter can be used to analyze passive as well as active sentences. Although Norwegian allows for both the direct object and the indirect object to A-bar move (see (32a) and (36b)), this, as (43c) shows, is not true for English. The fact was first discussed by Chomsky (1955/1975: 492–493) and subsequently by Fillmore (1965), Jackendoff and Culicover (1971) and Oehrle (1976).

(43) (a) Joanie gave a bouquet of flowers to who.
     (b) Joanie gave who a bouquet of flowers.
     (c) *Who did Joanie give a bouquet of flowers?28
     (Whitney 1982: 307)

The examples have traditionally been taken to show that it is possible to move a dative once, as in (43b), but not twice, as in (43c). However, Oehrle (1976) was the first to argue that there is no movement and that both (43a) and (43b) are base-generated. I do not discuss issues related to whether sentences like (43a) and (43b) are derived from one common structure (Czepluch 1982, Whitney 1982, Baker 1998, 1996, 1997, Marantz 1993, den Dikken 1995, Pesetsky 1995) or have different base structures (Oehrle 1976, Harley 1995, 2002) (see Rappaport Hovav and Levin 2008 for a critical discussion of the latter position regarding the semantics). The consensus in recent years has clearly been to assume two base structures, and that is what I assume here too (see McGinnis 2001, Pylkkänen 2002/2008 and Jeong 2007; McGinnis 2008 provides a good overview). All recent studies are by and large confined to A-movements and, in particular, passives. However, it seems worthwhile to try to unify the research on various types of indirect objects and their behavior under A-bar movements. The main goal of this section is to do that and at the same time briefly look at passives to make sure that the theory can handle the basic facts.29 As we have seen, one fact in need of an explanation is why (43c) is bad. (44) gives a different example, where the judgment diacritic indicates that speakers vary somewhat in how bad they find such examples:

(44) ?*Who did Mary give a book?

As we have seen in detail earlier, Boeckx’s (2008b) theory says that when an item enters a checking domain, it becomes frozen. Thus, it is natural to conclude that the indirect object in (44) is frozen and is not able to A-bar move. Jeong (2007) argues that Case is a crucial ingredient in deriving various properties of high and low applicatives/indirect objects, and I would like to follow this insight. A natural way to account for the data in (44) is to argue that English indirect objects need to check a structural Case feature. This is in effect the suggestion by Baker (1997), and it has also been adopted in the literature subsequently. Baker suggests an AspP merged directly below vP, which is compatible with the theory advocated here.30 Let us look more closely at the derivation of (44). Consider the structures in (45), where I follow Pylkkänen (2002/2008) and others in assuming that English has a low ApplP where both the indirect and the direct object are merged.

(45) (a) [vP SU [v°[+T] [AspP [Asp°[+T] [VP [V° [LApplP IO[−T] [LAppl° DO[−T]]]]]]]]]

(b) [vP SU [v°[+T] [AspP IO[+T] [Asp°[+T] [VP [V° [LApplP IO [LAppl° DO[+T]]]]]]]]]

Here I am only focusing on the indirect object. I continue to simplify by marking the abstract Case feature as a [T]-feature. I assume that the Asp° head and the v° head both have a [+T] feature, where the former licenses accusative Case on the direct object and the latter licenses dative Case on the indirect object.31 Since v° needs to enter into a relation with the indirect object, the indirect object has to move to SpecvP in order to circumvent a potential relativized minimality effect due to the [+T] feature on Asp°. The specifier of Asp° will then be a checking domain; thus, nothing will be able to move further from this position. As with subjects, subextraction is also ruled out because the constituent has entered a checking domain. We thus see that indirect objects in English behave quite similarly to subjects regarding extraction and subextraction, with the exception that subject movement can be ameliorated in certain environments (that-trace effects). Interestingly, the present account also accounts for why (46) is grammatical:

(46) Who did Mary send a present to?

In (46) the dative is realized in a prepositional phrase, and the dative does not enter a checking domain; rather, it gets its Case in its base position. Since this is not a checking domain, no freezing occurs. The structure is shown in (47):

(47) (a) did Mary send a present [to[+T] who[−T]]

(b) did Mary send a present [to[+T] who[+T]]

(c) who[+T] did Mary send a present to who

Again, a Case feature is realized as a [T]-feature, which Pesetsky and Torrego (2001) argue is the case also for prepositions. Case checking takes place in situ, which is not a checking domain. Thus, we expect movement to be licit, and as (47) shows, the dative can move freely to the left periphery of the clause. Essentially, the present proposal captures the same effects that Anagnostopoulou (2003: 146) does. She argues that both the PP and the DP are in the same minimal domain (the VP), so movement of either is grammatical. However, no assumption about elements being in the same minimal domain is necessary here. This is arguably a welcome consequence: it yields a more restrictive theory of phrase structure, since we do not have to assume the kind of parameter that Anagnostopoulou suggests (see Boeckx 2008b). Crucially, the current theory also offers a way to account for why Norwegian allows the indirect object to A-bar move.

(48) [CP Hvemi gav [IP Marie [vP ti en bok]]]? who gave Mary a book “Who did Mary give a book?”

In Norwegian, the shifted position is not a checking domain, which suggests that Norwegian indirect objects bear inherent Case, as suggested in section 4.3.2.32 Since there is no structural Case feature, indirect objects in Norwegian do not freeze if they move through SpecAspP since SpecAspP will not be a checking domain as no features on the indirect objects have to be checked. Thus, indirect objects are able to move freely as units. I argued earlier (following, e.g., Řezáč 2008) that inherent Case typically renders the internal structure of the item opaque. This accounts for why subextraction from indirect objects is not possible in Norwegian, as we saw in section 4.3.2. Interestingly, the same data as (36a) (noted in Culicover and Wexler 1973: 26) occur in English; see (49):

(49) (a) I sent a friend of Peter a book. (b) *Who did you send a friend of a book?

The explanation for the ungrammaticality of (49b) cannot be the same as for subextraction from indirect objects in Norwegian. For Norwegian, I argued that subextraction is impossible because of inherent Case. As discussed in section 4.3.2, inherent Case renders the internal structure of an item opaque for subextraction, which means that a Probe–Goal relation cannot be established between a Probe and a Goal within the indirect object. For English, on the other hand, subextraction is ruled out because the indirect object moves to a checking domain. Again we see how two of the three ways to rule out subextraction come into play: inherent Case-marking and entering checking domains (the third being agreement, e.g., in the case of Basque). Thus, there are differences between languages and between structures as to which option they choose. In other words, although the English and Norwegian data on subextraction from indirect objects are similar, I argue that they should be accounted for differently if we want to bring subextraction under the freezing fold as Boeckx (2008b) wants to. This again shows that it is necessary to rule out subextraction in more ways than extraction itself, a theoretical claim for which we have seen considerable empirical evidence. However, it also means that from a freezing perspective, there does not seem to be an explanation of why subextraction is bad. If there are three ways of deriving the ban on subextraction, we are either missing a generalization or there is no generalization to be obtained, despite the fact that the structures that are frozen in cases of subextraction seem to be very similar. Insofar as the present chapter is on the right track, it suggests that the latter is true. In sum, we see that Case provides a natural way to understand the differences between languages like English and Norwegian. 
The main difference concerns whether indirect objects have a structural Case feature that they have to move into a checking domain to check or whether they are inherently Case-marked. Let me now discuss how we can also derive the contrasts in passives from the present framework. Several facts need to be explained. First, we have to explain why indirect objects freely A-move in passives, whereas direct objects do not in (American) English. The relevant data are given in (50):

(50) (a) John was given a book by Mary. (b) *A book was given John by Mary.33

I have already argued that English indirect objects have structural Case. This means that the Case feature has to be checked. From the perspective I have been defending in this chapter, the natural way to account for (50) would be that for some reason (to be discussed immediately below) the Case feature of the indirect object cannot be checked in the shifted position. Anagnostopoulou (2003) suggests that the difference between the English examples in (50) can be accommodated through locality. Specifically, in passives, the goal blocks movement of the theme. However, if this were the case, it would remain mysterious why Norwegian goals do not block this movement. I think the theory proposed by Boeckx (2008b) provides a better solution to this puzzle. Namely, in passives there is by hypothesis no Case feature that can check the structural Case feature of indirect objects. I assume that part of what passivization does is to remove/absorb this structural Case feature on v°, on a par with transitive verbs that are passivized, as in (51c) (cf. the analysis in, e.g., Jaeggli 1986, Baker, Johnson and Roberts 1989). This will allow (51b) as the passive counterpart of (51a). On this proposal, it follows that nothing will be able to check the structural Case feature on indirect objects, though it offers no explanation of why passive does exactly this and not, say, remove the Case checker of accusative Case of the direct object.

(51) (a) John read a book. (b) A book was read by John. (c) *It was read a book.

One may thus think that there is some sort of Case hierarchy here where the structural Case feature that is highest in the argument domain is removed. For ditransitives, this will be the v head, and for transitives, it will be the v head as well since the v head bears the accusative Case feature in the latter case. Put differently, this could lead one to speculate that the Case feature on the v head is always the feature that is removed in passives. I leave this issue and its possible implications for future research. The indirect object will not get its Case checked since there is nothing that can check the Case feature within the argument domain of the clause, and it therefore has to move to the subject position to get its Case checked. Interestingly, there is also another fact that seems to support this theory. Consider the contrast in (52):

(52) (a) *?A letter was sent Mary (by John). (b) A letter was sent to Mary (by John).

I take the grammaticality of (52b) to show that movement of the direct object is not excluded per se in English. What is impossible is for the Case feature of the indirect object to be checked in passives, and this is what renders (50b) and (52a) ungrammatical. The Norwegian data introduce some further complications. In Norwegian, an indirect object can remain in SpecAspP in passives (53a), but it can also move to the subject position (53b):

(53) (a) Bokeni ble gitt [AspP John [vP ti av Knut]]. book.def was given John by Knut

(b) Johni ble gitt [AspP ti [vP boken av Knut]]. John was given book.def by Knut

Again, indirect objects in Norwegian have inherent Case without a structural Case feature, so there will be no Case checking in a derived position.34 Thus, on this approach, they can move freely to the left periphery of the clause.35 Direct objects are eligible for movement since they do not enter any checking domain in their base position. Given the data in (53), one might wonder what happens with the accusative Case feature in (53a), since the direct object bears accusative Case in (53b). Åfarli (1992: 79), in part building on Baker (1988: 340f.) and Baker, Johnson, and Roberts (1989: 239f.), provides a solution to this as he argues that the passive morpheme in Norwegian can bear Case, but it does not have to. That is, it does not bear Case in (53b) but it does in (53a).36 Here I adopt Åfarli’s analysis; see his work for further details. We have seen how Boeckx’s (2008b) theory makes it possible to account for the various restrictions on both A-bar and A-movement. The main hypothesis is that whenever an element enters into a checking domain, it cannot move any further. This is particularly clear for subjects when they check their EPP-property, which is derived on Boeckx’s (2008b) proposal, and I have suggested ways of extending this property to other cases of freezing. In particular, I have argued that this approach gives us a way to analyze the differences between English and Norwegian indirect objects. The present proposal also lends support to Jeong’s (2007) and Citko’s (2009, 2011) claim that Case is a major ingredient in deriving locality constraints for indirect objects. However, we have also seen that the theory is not able to explain subextraction. Agreement, inherent Case, and structural checking domains all converge on the same effects. It is not clear that the principles we use to achieve this are significantly fewer in number than the phenomena to be explained. In that regard, there is no explanation of subextraction. 
Rather, the fact that we need a disjunction in our theory to derive the data suggests that extraction and subextraction are different phenomena, pace Boeckx (2008b). Needless to say, further studies are needed to say something about other languages that might differ from English and Norwegian, but I also think that the extraction contrasts displayed by these two closely related languages offer a good window into the underlying processes related to movements of indirect objects within Universal Grammar. In the next section, I discuss some remaining issues before concluding the chapter in section 4.6.

4.5 Discussion

In this section, I discuss some more general issues that arise based on the arguments that I have given in this chapter. I mainly look at possible problems and questions that the chapter raises. The theory of Boeckx (2008b), which I have adopted and extended, works well for pure extraction, where the notion of a checking domain is able to do a lot of work. However, in the course of the chapter it became clear that the theory faces difficulties in accounting for the variation we find for subextraction. The chapter ended up arguing that we need three different ways to preclude subextraction if we want to bring subextraction into the freezing fold. Thus, there is no explanation of why subextraction is bad. This raises important questions about how a child would go about figuring out which language he or she is acquiring. The child does not have access to the cross-linguistic correlations, and by assumption the child does not have enough relevant input of subextraction cases. For subjects, the child would presumably get the ban on subextraction for free given that this is a case where the subject would be in a checking domain, which is something that the child has to know. 
For direct objects, presumably the explicit object–verb agreement marking in the case of Basque would be a cue, but then again it is not clear how this correlation gets implemented grammatically. For indirect objects, the issues are even less clear. Both English and Norwegian disallow subextraction out of indirect objects, but I have argued that the freezing story forces us to assume that this is for different reasons. For Norwegian, the lack of subextraction has been argued to relate to inherent Case whereas in English it follows from the claim that indirect objects move to a checking domain. Since there are no overt morphological differences between Norwegian and English, it is not clear what the child would rely on. Thus, it is hard to see how the child would figure out exactly how to rule out subextraction. The acquisition argument is another argument that speaks against a theory that does not try to unify instances of illicit subextraction. It is also worthwhile to consider some facts that the present chapter predicts should not occur. It predicts that we should not find a language in which extraction of in situ elements that have structural Case is impossible. For example, direct objects that bear structural Case should always be extractable themselves. The theory also predicts that there should be no difference between subextraction from a subject in situ and subextraction from a direct object in situ, as long as both of these bear structural Case, and there is otherwise no overt agreement as in Basque. Here it seems that the theory is challenged by the recent results in Jurka (2009), who shows that there is a difference in acceptability ratings depending on whether subextraction is from an in situ subject or an in situ object in a language like German. 
Presumably such differences in acceptability point to grammatical differences, though it is not clear exactly how to correlate acceptability judgments with grammatical differences. However, it is clear that the present theory as it stands cannot easily account for these facts. Furthermore, the theory predicts that an element can only move into one Case-checking domain. That is, an element cannot move into two Case positions as long as both of these are checking domains. Last, subextraction out of an inherently Case-marked element should never be possible. Before concluding, it is worth considering whether the parameters that have been invoked in this chapter are the best ones, or at least why they are better than other plausible parameters. Put differently, one wonders why Case should be such a central ingredient as opposed to φ-features, for example, an issue that Boeckx (2008b) does not discuss. I think there are at least some empirical reasons for thinking that Case is a central mechanism in narrow syntax, and that it is more central than φ-features. One is that we find some instantiation of Case in all the world’s described languages. Even for languages like Mandarin Chinese it has been argued persuasively that abstract Case plays a crucial role even though there is no morphological marking of Case (Li 1985, 1990). That is not the case for φ-features. They vary a lot both in their syntactic role and their morphological realization, and it is not clear that we want to say that they are universal, as in particular Fukui (2006) has argued convincingly (though see Sigurðsson 2004 for a different view). So even if there is verb–direct object φ-feature agreement in Basque, that does not mean that English or Norwegian has such agreement. There are also differences between the φ-features: some play a more privileged role than others. 
It has been argued, for example, that Person is special (Baker 2008, Richards 2008). It is not clear how this variation would go together with an approach that tries to get freezing effects to fall out from φ-feature agreement. However, it has been argued that agreement is responsible for why it is impossible to subextract from direct objects in Basque (see references in section 4.3.1), and if that is on the right track, φ-features play some role. Their role, though, seems to be heavily dependent on overt morphological realization, that is, on the features being externalized at PF (Fukui 2006). If Fukui and others are right, that is a strong argument against deriving freezing effects from φ-features. Instead, relying on Case seems like a better option given that its presence is more or less universally attested. Beyond Case and φ-features, there are hardly any other features that play a role throughout the clausal spine and that seem to approach some status of universality. Although this is a quite strong independent empirical argument, conceptually it is not clear that Case has to be privileged the way it is in the present story. If one instead argued that φ-features were privileged, or some other feature, that could just as well have turned out to be the right story, conceptually speaking. Case is not a priori the best parameter, but it seems to be one that is well supported empirically.
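The freezing generalization that this chapter relies on, namely that an element that checks structural Case in a checking domain can move no further, can be rendered as a toy computational sketch. This is purely an illustrative model and not the author's formalism; the class name, method names, and feature labels are my own assumptions, and the sketch encodes only the English/Norwegian contrast for indirect objects discussed in (44) and (48).

```python
class Constituent:
    """Toy model of a nominal with a Case feature (illustrative only)."""

    def __init__(self, label, case):
        self.label = label
        self.case = case          # "structural" or "inherent"
        self.frozen = False

    def enter_checking_domain(self):
        # Checking a structural Case feature in a checking domain
        # freezes the item; inherently Case-marked items check nothing
        # there, so they remain mobile.
        if self.case == "structural":
            self.frozen = True

    def a_bar_move(self):
        # The freezing generalization: frozen items cannot move on.
        return "*" if self.frozen else "ok"


# English indirect object: structural Case checked in the shifted
# position, hence frozen for further A-bar movement.
io_english = Constituent("who", case="structural")
io_english.enter_checking_domain()

# Norwegian indirect object: inherent Case, no checking domain entered,
# hence free to A-bar move.
io_norwegian = Constituent("hvem", case="inherent")
io_norwegian.enter_checking_domain()

print(io_english.a_bar_move())    # "*"  (cf. (44) *Who did Mary give a book?)
print(io_norwegian.a_bar_move())  # "ok" (cf. (48) Hvem gav Marie en bok?)
```

The sketch makes the asymmetry mechanical: the same movement operation succeeds or fails depending solely on whether the item's Case feature forced it through a checking domain, which is the intended content of the generalization.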

4.6 Conclusion

In this chapter I have made an attempt to explore the theory of Boeckx (2008b) and to see to what extent it can account for the freezing effects that subjects, direct objects, and indirect objects exhibit. Boeckx argues that when an item is checked in a special checking domain, it is impossible for this item to move further or to subextract from the item. This makes it possible to account for the comparative variation in that-trace effects, some of which have been discussed here, and for why subextraction from subjects is bad. Furthermore, it allows us to account for why direct objects commonly allow both extraction and subextraction, though I have also looked at a case from Basque where the latter is not allowed. The chapter also gave a comparative analysis of indirect objects in English and Norwegian, where the crucial contrast is that indirect objects can undergo A-bar movement in Norwegian but not in English. This difference was accounted for by arguing that English indirect objects have a structural Case feature whereas Norwegian indirect objects have an inherent Case feature. The structural Case feature needs to be checked in a checking domain; thus, a freezing effect occurs. Norwegian indirect objects, on the other hand, do not have a structural Case feature, hence no checking in a checking domain and consequently no freezing effect. I have also discussed how this account explains why indirect objects are able to passivize in both English and Norwegian by arguing that in passives, the structural Case feature that licenses indirect objects is absorbed. Future work would have to address subextraction more extensively. The present chapter suggests that a freezing approach does not offer a way to explain subextraction. Ideally we would like to be able to explain subextraction since it seems to involve similar complex structures. 
It will also be important to explain subextraction to address the acquisition concern raised in the previous section. However, despite this, the theory advocated in the present chapter appears to be a fruitful tool for investigating a number of freezing phenomena across natural languages.

Notes

1 Parts of this chapter were presented at the NORMS Grand Meeting in Iceland in August 2007 and at the Syntax Brown Bag at New York University in December 2009, and I am grateful to the audiences for valuable comments. Thanks also to Željko Bošković, Noam Chomsky, Alex Drummond, Ángel Gallego, Elly van Gelderen, Marc Richards, Luigi Rizzi, Bridget Samuels, and two anonymous JL referees. Thanks are also due to Cedric Boeckx for his support and constructive suggestions, and especially to Caroline Heycock, Norbert Hornstein, Howard Lasnik, and Juan Uriagereka for all their valuable comments and for ensuring that my writing is much clearer than it would otherwise have been.
2 This is a construction similar to the was für construction in many Germanic languages.
3 It should be noted that in Norwegian one can also extract hvem out of a direct object, as shown in (i). (i) Hvem kjente han [t for noen]? who knew he for someone “Who did he know?”
4 See Müller (2010) for a discussion of how to capture the CED generally within a phase-based framework. Uriagereka (1999b, to appear) pursues a different Multiple Spell-Out approach.
5 However, Chomsky (2008) presents a radical reanalysis of the data. He argues that the determining property is the underlying base position and that the verb’s argument structure determines whether extraction is licit (due to properties of his phase analysis). I do not discuss this analysis, as several authors have illustrated that it does not work cross-linguistically (Broekhuis 2005, Gallego 2007, Gallego and Uriagereka 2007, Lohndal 2007a, Mayr 2007). Furthermore, Starke (2001) and Abels (2008) report examples from various languages where extraction from SpecTP is possible. Consider (i) from French and (ii) from German:

(i) [CP [De quel film] est-ce que [IP tu crois [CP que [IP [la première partie t] va créer un scandale]]]]?
of which film is-it that you think that the first part goes create a scandal
“Which movie do you think that the first part of would create a scandal?” (Starke 2001: 36)

(ii) [CP [Von diesem Film] hat [IP [der erste Teil t] doch letztes Jahr einen großen Skandal ausgelöst]].
of this film has the first part prt last year a big scandal caused
“The first part of this film caused a big scandal last year.” (Abels 2008: 76)
(i) shows extraction from SpecTP whereas (ii) shows topicalization from a subject. Interestingly, both of these are impossible in Norwegian, and (i) has often also been reported to be unacceptable in English. A more comprehensive discussion of these cases seems to be required, which would go beyond the scope of the present chapter.
6 Thanks to my informants Ken Ramshøj Christensen (Danish), and Kjartan Ottosson and Halldór Ármann Sigurðsson (Icelandic).
7 The same data obtain for Norwegian, as seen in (i):

(i) [CP [Hvilken kandidat] var [IP det [plakater av t] over hele byen]]?
which candidate were there posters of over whole city.def
“Which candidate were there posters of all over the town?”
However, as argued by Lødrup (1999), the constituent plakater av t is arguably a direct object in Norwegian. See his paper for arguments.
8 See also Bošković (2007) for a different theory where Case also is the ultimate ingredient. See Lasnik (2008), Legate (2008) and Sigurðsson (2008) for much discussion on general properties of Case and case (I use capital C to differentiate abstract from morphological case).
9 This does not make the item itself an island, though Boeckx (2008b) basically claims that it does. I return to this later.
10 This encompasses a view of successive cyclic movement which says that successive cyclic movement is not driven directly by feature checking, as in Boeckx (2007) and Bošković (2007).
11 It should be noted that the curly bracket above the checking domain in (15) is non-trivial. The only worked-out definition of a checking domain in the literature is set-theoretic (Chomsky 1995), suggesting that there is something missing in the characterization in the text since the characterization in (15) is not set-theoretic. See also Nunes and Thompson (1998) for discussion.
12 I assume that this is not an instance of feature inheritance in the technical sense of Richards (2007) and Chomsky (2008). Feature inheritance assumes that the inheriting element (which would be T°) is merged without the relevant feature and then inherits the feature, so that the originator of the feature (which would be Fin°) presumably loses it. Here T° already has the relevant [T]-feature and the feature is valued. Thus, it is hard to see exactly how feature valuation would work here for a [T]-feature, which in turn raises issues about what feature inheritance ultimately is. 
Since I am not saying anything about phases in this chapter, I set this issue aside. See also Haegeman and van Koppen (2009) for further complications for feature inheritance.
13 Boeckx (2008b: 174) adopts this solution for A-bar chains but seemingly not for A-chains.
14 Whether the φ-features on Fin° are actually inherited by T° upon probing (Richards 2007, Chomsky 2008) is an issue I will not go into here since I am not adopting a phase-based framework. If feature inheritance occurs, that means that there should not be φ-features on T° when T° enters the derivation.
15 See note 12 for a discussion of feature inheritance that is also relevant to the present case.
16 I do not discuss the specific details as it requires an extensive discussion of issues that are not necessary for the remainder of the chapter. Instead, I urge the reader to consult Boeckx’s (2008b) original work for a detailed discussion.
17 Subextraction from adjuncts is universally bad regardless of where it occurs. I do not deal with adjuncts in this chapter, but see Boeckx (2008b) for ideas on how to analyze adjuncts within the present framework.
18 It is well known that topics ameliorate the that-trace effect, as in (i) (Bresnan 1977, Culicover 1992): (i) John met the man Mary said that for all intents and purposes was the president of the university. See Boeckx (2008b) and Lohndal (2009) for an analysis of these data that is compatible with the present chapter.
19 A case like (i), which is different from (18b) in that the subject has not moved out of the embedded clause, is accounted for by assuming that a silent complementizer can either have or not have a Case feature, or that there are two different null complementizers. As far as I can tell, there are no real differences between these two approaches: (i) I think he left.
20 Boeckx (2008b: 174) would disallow such a solution because “[a]n element can only move to a single Tense checking site or Force checking site”. 
21 There are languages, like West Flemish, that have a full agreement paradigm and no that-t effect (Haegeman 1992), but they are not problematic from the present point of view. Since the ability to move a displaced subject is crucially related to whether Fin° has a valued [T]-feature, a complementizer without an agreeing [T]-feature can nevertheless agree in φ-features. I do not discuss the specific technical implementation of this in this chapter.
22 The present proposal also accounts for (i): (i) Who do you believe [t that Mary said [t [t left early]]]? This example is good because there is no that in the lowest clause, which means that who can move out of the embedded clause and into the left periphery of the next embedded clause, where Case is checked, as suggested in the main text. Last, who moves to the left periphery of the matrix clause.
23 Thanks to Richard Kayne, Howard Lasnik, and Juan Uriagereka for useful discussions of these data.
24 Independent support for this claim may come from Basque, where the comparable relative pronoun gets absolutive Case, as shown in (i): (i) Badakit nor esan konpondu zuela arazoa know.I who.abs to.say solved has.Comp problem.the.abs “I know who to say solved the problem” (Ricardo Etxepare, p.c.)

In (i), nor “who.abs” would normally have been nork “who.erg”, since it is the underlying subject of konpondu “solved”. For Basque, it is not straightforward to argue that badakit, “know”, is what is assigning the Case since this is a verb that takes two arguments, and the lower argument is the entire clause. How exactly Case “transmission” works here is not trivial, though see San Martin and Uriagereka (2002) and Uriagereka (2008) for some suggestions. Thanks to Juan Uriagereka for clarifying the Basque data.
25 The issue is somewhat more complicated, as witnessed by the following data: (i) (a) I know who left. (b) I know that John left. (ii) (a) I said who left. (b) I said that John left. These data show that both know and say can take both wh-complements and non-wh-complements. Interestingly, the following case is good according to most speakers: (iii) I know who it was said t solved the problem. This aligns well with the discussion in the text.
26 There is an interesting definiteness issue at work here. The following sentence, where the direct object is a definite noun phrase, is not acceptable: (i) *Hva tror du Peter maler [bildet av t]? what think you Peter paints picture.def of I do not attempt to analyze this fact in this chapter as it requires an extensive treatment of definiteness effects. Thanks to Norbert Hornstein (p.c.) for raising this question.
27 Throughout this chapter, I use the term indirect object rather than applicative, except in cases where I refer to literature that explicitly uses “applicative”. This is an extension of the term applicative used to refer to a construction where a verb bears a special morpheme which licenses an oblique or non-core argument (see Marantz 1993). In the extended usage, all indirect-object constructions are called applicatives (see McGinnis 2001, Pylkkänen 2002/2008, Jeong 2007). Given this terminology, (ia) is an applicative and (ib) is not. (i) (a) I read John a letter. 
(b) I read a letter to John. Since I discuss both constructions in (i), ‘indirect objects’ is a better term. Another distinction that I will take for granted is the distinction between low and high indirect objects/applicatives (McGinnis 2001, Pylkkänen 2002). The basic idea is that cross-linguistically double object constructions are divided into two types: high and low phrases within the argument domain. Whether an indirect object is “high” or “low” is correlated with semantic distinctions. A high indirect object (merged above the VP) denotes a relation between an individual and an event whereas a low indirect object (a complement to the V head) denotes a relation between two individuals. In the latter case, this relation is a relation of possession between the indirect object and the direct object.
28 Interestingly, Baltin (2001: 251–252, fn. 2) points out that this picture is more complicated. He shows that verbs like teach and feed do allow for what he takes to be an indirect object to A-bar-move: (i) (a) John taught Sally (French). (b) Who did John teach? (ii) (a) John fed Sally (steak). (b) Who did John feed? Baltin does not offer a solution to this puzzle. The puzzle is even more serious because of (iii): (iii) (a) Who did John teach French? (b) Who did John feed steak? Both of the examples in (iii) are good. This seems to be idiosyncratic to the verbs teach and feed, which pattern more like Norwegian in this regard by allowing the indirect object to move. I assume that this is a lexical idiosyncrasy in the sense that there is no AspP (see the following discussion) available for these verbs, which means that the structural Case feature can be checked either in situ or in SpecCP (by T’s Case feature). 
29 A question that will not be dealt with here is why some languages, like French, do not allow the indirect object construction (Kayne 1984: 193) and why Latinate verbs like donate do not allow indirect objects (see Harley 2008 for a recent proposal).
30 One might wonder whether this proposal instead speaks in favor of a transformational account of the dative alternation, since an ‘extra’ position is needed on my proposal as well. However, as Oehrle (1976) and Harley (1995, 2002) have argued, there are several important independent problems with a transformational account on which the indirect object originates in the complement of a preposition (which then becomes silent after movement). I therefore assume that there are two base structures, in line with the recent research on applicatives quoted above. In other words, AspP is independent of whether one assumes a transformational or a base-generation account. What is crucial is that we need a checking domain for the checking of the structural Case feature on (English) indirect objects.
31 This [+T] feature on v° may plausibly be inherited by V; see Richards (2007) and Chomsky (2008). This would circumvent a possible locality problem for Agree between v° and the direct object (since (the copy of) the indirect object intervenes; this assumes that the lower copy for some reason does not intervene); thus, the argument in Citko (2011) that such an Agree relation is not impossible may not apply here. Since I have not discussed a phase-based approach in the present chapter, I set this detail aside.
32 Further support for this comes from Icelandic, on the assumption that dative case is inherent in both of these languages. Although it is commonly assumed that Norwegian indirect objects bear inherent Case, it is even clearer that Icelandic indirect objects do, since they are morphologically marked.
As (i) shows, Icelandic patterns with Norwegian in allowing indirect objects to A-bar move (cf. (48)).
(i) (a) Jón sendi Maríu bréf.
        Jón sent Marie letter
        “Jón sent Marie a letter.”
    (b) Hverjum sendi Jón bréf?
        who sent Jón letter
(Halldór Á. Sigurðsson, p.c.)
33 Oehrle (1976), Larson (1988: 364) and Anagnostopoulou (2003: 39) discuss an issue showing that the picture is not as simple as I tacitly assume. Importantly, promotion of the direct object improves slightly if the indirect object is a pronoun (i), but if it is a reduced pronoun, the sentence is fully acceptable (ii):
(i) ??A letter was given me by Mary.
(ii) A letter was given ‘im/*HIM by Mary.
This is an effect comparable to that of clitics in Greek. Anagnostopoulou (2003) discusses these cases in depth, and I refer to her study for details on this phenomenon.
34 Citko (2009) discusses some very interesting data from Polish (an asymmetric language regarding passivization), suggesting that Polish datives need to check a structural Case feature, and that this checking induces a freezing effect such that the dative indirect object cannot move to the subject position. Whereas this may be the right analysis for Polish, it is not clear how it derives the contrast between English and Norwegian.
35 I assume that this analysis can be extended to British English as well. That is, Norwegian and British English seem to be very similar concerning passivization and extraction of the indirect object. If I am correct, this relates to whether or not there is a structural Case feature on the indirect object.
36 Notably, Åfarli uses this analysis to give a comprehensive account of the differences between English and Norwegian personal and impersonal passives. To review this complex material would take this chapter beyond all reasonable limits, so I will have to refer the interested reader to Åfarli’s (1992) original work.
See also Citko (2011) for a different analysis of these data where crucially all indirect objects have a structural Case feature.

References

Abels, K. 2008. Towards a restrictive theory of (remnant) movement. Linguistic Variation Yearbook 7: 53–120.
Åfarli, T. A. 1992. The Syntax of Norwegian Passive Constructions. Amsterdam: John Benjamins.
Anagnostopoulou, E. 2003. The Syntax of Ditransitives: Evidence from Clitics. Berlin: Mouton de Gruyter.
Baker, M. C. 1988. Incorporation: A Theory of Grammatical Function Changing. Chicago, IL: University of Chicago Press.
Baker, M. C. 1996. On the structural position of themes and goals. In Phrase Structure and the Lexicon, J. Rooryck and L. Zaring (eds.), 7–34. Dordrecht: Kluwer.
Baker, M. C. 1997. Thematic roles and syntactic structure. In Elements of Grammar, L. Haegeman (ed.), 73–137. Dordrecht: Kluwer.
Baker, M. C. 2008. The Syntax of Agreement and Concord. Cambridge: Cambridge University Press.
Baker, M. C. and Collins, C. 2006. Linkers and the internal structure of vP. Natural Language and Linguistic Theory 24: 307–354.
Baker, M. C., Johnson, K. and Roberts, I. 1989. Passive arguments raised. Linguistic Inquiry 20: 219–251.
Baltin, M. 2001. A-movement. In The Handbook of Contemporary Syntactic Theory, M. Baltin and C. Collins (eds.), 226–254. Oxford: Blackwell.
Bentzen, K. 2007. Order and Structure in Embedded Clauses in Northern Norwegian. Doctoral dissertation, University of Tromsø.
Bittner, M. and Hale, K. 1996. The structural determination of case and agreement. Linguistic Inquiry 27: 1–68.
Boeckx, C. 2003. Islands and Chains. Amsterdam: John Benjamins.
Boeckx, C. 2007. Understanding Minimalist Syntax. Malden, MA: Blackwell.
Boeckx, C. 2008a. Aspects of the Syntax of Agreement. London: Routledge.
Boeckx, C. 2008b. Bare Syntax. Oxford: Oxford University Press.
Boeckx, C. 2008c. Islands. Language and Linguistics Compass 2: 151–167.
Bošković, Ž. 2007. On the locality and motivation of Move and Agree: An even more minimal theory. Linguistic Inquiry 38: 589–644.
Bresnan, J. 1977. Variables in the theory of transformation. In Formal Syntax, P. Culicover, T. Wasow and A. Akmajian (eds.), 157–196. New York: Academic Press.
Broekhuis, H. 2005. Extraction from subjects: Some remarks on Chomsky’s On phases. In Organizing Grammar: Studies in Honor of Henk van Riemsdijk, H. Broekhuis, N. Corver, R. Huybregts, U. Kleinhenz and J. Koster (eds.), 59–68. Berlin: Mouton de Gruyter.
Chomsky, N. 1955/1975. The Logical Structure of Linguistic Theory. Ms., Harvard University. [Published 1975, Plenum].
Chomsky, N. 1973. Conditions on transformations. In A Festschrift for Morris Halle, S. R. Anderson and P. Kiparsky (eds.), 232–286. New York: Holt, Rinehart and Winston.
Chomsky, N. 1986. Knowledge of Language. New York: Praeger.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 2007. Approaching UG from below. In Interfaces + Recursion = Language? Chomsky’s Minimalism and the View from Syntax-Semantics, H.-M. Gärtner and U. Sauerland (eds.), 1–30. Berlin: Mouton de Gruyter.
Chomsky, N. 2008. On phases. In Foundational Issues in Linguistic Theory, R. Freidin, C. P. Otero and M.-L. Zubizarreta (eds.), 133–166. Cambridge, MA: MIT Press.
Chomsky, N. and Lasnik, H. 1977. Filters and control. Linguistic Inquiry 8: 425–504.
Citko, B. 2009. A (new) look at symmetric and asymmetric passives. Proceedings of NELS 39.
Citko, B. 2011. Symmetry in Syntax: Merge, Move, and Labels. Cambridge: Cambridge University Press.
Culicover, P. 1992. The adverb effect: Evidence against ECP accounts of the that-trace effect. Proceedings of NELS 23: 97–111. GLSA, University of Massachusetts, Amherst.
Culicover, P. and Wexler, K. 1973. An application of the freezing principle to the dative in English. Social Sciences Working Papers 39: 1–29. University of California, Irvine.
Czepluch, H. 1982. Case theory and the dative construction. The Linguistic Review 2: 1–38.
den Dikken, M. 1995. Particles: On the Syntax of Verb-Particle, Triadic and Causative. Oxford: Oxford University Press.
Diesing, M. 1992. Indefinites. Cambridge, MA: MIT Press.
Engdahl, E. 1982. Restrictions on unbounded dependencies in Scandinavian. In Readings on Unbounded Dependencies in Scandinavian Languages, E. Engdahl and E. Ejerhed (eds.), 151–174. Umeå: Almqvist & Wiksell.
Etxepare, R. and de Urbina, J. 2003. In A Grammar of Basque, J. I. Hualde and J. O. de Urbina (eds.), 494–516. Berlin: Mouton de Gruyter.
Fanselow, G. 2001. Features, θ-roles, and free constituent order. Linguistic Inquiry 32: 405–437.
Fillmore, C. 1965. Indirect Object Construction in English and the Ordering of Transformations. The Hague: Mouton.
Fortuny, J. 2008. The Emergence of Order in Syntax. Amsterdam: John Benjamins.
Fukui, N. 2006. Theoretical Comparative Syntax: Studies in Macroparameters. London: Routledge.
Gallego, Á. J. 2007. Phase Theory and Parametric Variation. Doctoral dissertation, Universitat Autònoma de Barcelona.
Gallego, Á. J. and Uriagereka, J. 2007. Conditions on sub-extraction. In Coreference, Modality, and Focus, L. Eguren and O. F. Soriano (eds.), 45–70. Amsterdam: John Benjamins.
Haegeman, L. 1992. Theory and Description in Generative Syntax: A Case Study in West-Flemish. Cambridge: Cambridge University Press.
Haegeman, L. and Lohndal, T. 2010. Negative concord and (multiple) Agree: A case study of West Flemish. Linguistic Inquiry 41: 181–211.
Haegeman, L. and van Koppen, M. 2009. The non-existence of a φ-feature dependency between C and T. Talk given at NELS 40, Cambridge, MA: MIT.
Harley, H. 1995. Subjects, Events and Licensing. Doctoral dissertation, MIT.
Harley, H. 2002. Possession and the double object construction. Linguistic Variation Yearbook 2: 31–70.
Harley, H. 2008. The ‘Latinate’ ban on dative shift in English: A morphosyntactic explanation. Plenary talk given at the 14th Germanic Linguistics Annual Conference, Madison, May 3.
Huang, J. C.-T. 1982. Logical Relations in Chinese and the Theory of Grammar. Doctoral dissertation, MIT.
Jackendoff, R. and Culicover, P. 1971. A reconsideration of dative movements. Foundations of Language 7: 397–412.
Jaeggli, O. 1986. Passive. Linguistic Inquiry 17: 593–599.
Jeong, Y. 2007. Applicatives: Structure and Interpretation from a Minimalist Perspective. Amsterdam: John Benjamins.
Julien, M. 2007. Embedded V2 in Norwegian and Swedish. Working Papers in Scandinavian Syntax 80: 103–161.
Jurka, J. 2009. Gradient Acceptability and Subject Islands in German. Ms., University of Maryland.
Kayne, R. S. 1980. Extensions of binding and case-marking. Linguistic Inquiry 11: 75–96.
Kayne, R. S. 1984. Connectedness and Binary Branching. Dordrecht: Foris.
Koopman, H. 2006. Agreement configurations: In defense of ‘Spec Head’. In Agreement Systems, C. Boeckx (ed.), 159–199. Amsterdam: John Benjamins.
Laka, I. and Uriagereka, J. 1987. Barriers for Basque and vice-versa. Proceedings of NELS 17: 394–408.
Larson, R. K. 1988. On the double object construction. Linguistic Inquiry 19: 335–391.
Lasnik, H. 1999. Minimalist Analysis. Oxford: Blackwell.
Lasnik, H. 2001. Subjects, objects, and the EPP. In Objects and Other Subjects: Grammatical Functions, Functional Categories, and Configurationality, W. D. Davies and S. Dubinsky (eds.), 103–121. Dordrecht: Kluwer.
Lasnik, H. 2003. Minimalist Investigations in Syntactic Theory. London: Routledge.
Lasnik, H. 2008. On the development of Case theory: Triumphs and challenges. In Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, R. Freidin, C. P. Otero and M.-L. Zubizarreta (eds.), 17–41. Cambridge, MA: MIT Press.
Lasnik, H. and Park, M.-K. 2003. The EPP and the subject condition under sluicing. Linguistic Inquiry 34: 649–660.
Legate, J. A. 2008. Morphological and abstract case. Linguistic Inquiry 39: 55–101.
Li, Y.-H. A. 1985. Abstract Case in Mandarin Chinese. Doctoral dissertation, University of Southern California.
Li, Y.-H. A. 1990. Order and Constituency in Mandarin Chinese. Dordrecht: Kluwer.
Lødrup, H. 1999. Linking and optimality in the Norwegian presentational focus construction. Nordic Journal of Linguistics 22: 205–229.
Lohndal, T. 2007a. Sub-Extraction and the Freezing Effect: A Case Study of Scandinavian. Ms., University of Oslo.
Lohndal, T. 2007b. That-t in Scandinavian and elsewhere: Variation in the position of C. Working Papers in Scandinavian Syntax 79: 47–73.
Lohndal, T. 2009. Comp-t effects: Variation in the position and features of C. Studia Linguistica 63: 204–232.
Marantz, A. 1993. Implications of asymmetries in double object constructions. In Theoretical Aspects of Bantu Grammar, S. Mchombo (ed.), 113–150. Stanford: CSLI Publications.
Mayr, C. 2007. Subject-object asymmetries and the relation between internal merge and pied-piping. Paper presented at the Penn Linguistics Colloquium, February 25.
McGinnis, M. 2001. Variation in the syntax of applicatives. Linguistic Variation Yearbook 1: 105–146.
McGinnis, M. 2008. Applicatives. Language and Linguistics Compass 2: 1225–1245.
Merchant, J. 2001. The Syntax of Silence. Oxford: Oxford University Press.
Müller, G. 1998. Incomplete Category Fronting. Dordrecht: Kluwer.
Müller, G. 2010. On deriving CED effects from the PIC. Linguistic Inquiry 41: 35–82.
Nunes, J. and Thompson, E. 1998. Appendix. In Rhyme and Reason, J. Uriagereka (ed.), 497–521. Cambridge, MA: MIT Press.
Oehrle, R. 1976. The Grammatical Status of the English Dative Alternations. Doctoral dissertation, MIT.
Ormazabal, J., Uriagereka, J. and Uribe-Etxebarria, M. 1994. Word-order and wh-movement: Towards a parametric account. Presented at GLOW 17, Vienna.
Pesetsky, D. 1995. Zero Syntax. Cambridge, MA: MIT Press.
Pesetsky, D. and Torrego, E. 2001. T-to-C movement: Causes and consequences. In Ken Hale: A Life in Language, M. Kenstowicz (ed.), 355–426. Cambridge, MA: MIT Press.
Pesetsky, D. and Torrego, E. 2004. Tense, case, and the nature of syntactic categories. In The Syntax of Time, J. Guéron and J. Lecarme (eds.), 495–538. Cambridge, MA: MIT Press.
Pylkkänen, L. 2002/2008. Introducing Arguments. Doctoral dissertation, MIT. [Published 2008, MIT Press].
Rappaport Hovav, M. and Levin, B. 2008. The English dative alternation: The case for verb sensitivity. Journal of Linguistics 44: 129–167.
Řezáč, M. 2008. Phi-agree and theta-related case. In Phi-Theory: Phi-Features Across Modules and Interfaces, D. Harbour, D. Adger and S. Béjar (eds.), 83–129. Oxford: Oxford University Press.
Richards, M. 2007. On feature inheritance: An argument from the phase impenetrability condition. Linguistic Inquiry 38: 563–572.
Richards, M. 2008. Defective Agree, case alternations, and the prominence of person. In Scales, M. Richards and A. L. Malchukov (eds.), 137–161. Universität Leipzig: Linguistische Arbeitsberichte.
Richards, N. 2001. Movement in Language. New York: Oxford University Press.
Rizzi, L. 1997. The fine structure of the left periphery. In Elements of Grammar: A Handbook of Generative Syntax, L. Haegeman (ed.), 281–337. Dordrecht: Kluwer.
Rizzi, L. 2006. On the form of chains: Criterial positions and ECP effects. In Wh-Movement: Moving On, L. L.-S. Cheng and N. Corver (eds.), 97–133. Cambridge, MA: MIT Press.
Rizzi, L. and Shlonsky, U. 2007. Strategies of subject extraction. In Interfaces + Recursion = Language? Chomsky’s Minimalism and the View from Syntax-Semantics, H.-M. Gärtner and U. Sauerland (eds.), 115–160. Berlin: Mouton de Gruyter.
Ross, J. R. 1967. Constraints on Variables in Syntax. Doctoral dissertation, MIT. [Published 1986 as Infinite Syntax! Norwood, NJ: Ablex].
San Martin, I. and Uriagereka, J. 2002. Infinitival complements in Basque. In Erramu Boneta: Festschrift for Rudolf P. G. de Rijk, X. Artiagoitia, P. Goenaga and J. Lakarra (eds.). Bilbao: Universidad del País Vasco, Servicio Editorial.
Sigurðsson, H. Á. 2004. Meaningful silence, meaningless sounds. Linguistic Variation Yearbook 4: 235–259.
Sigurðsson, H. Á. 2008. Externalization: The case of C/Case. Ms., Lund University.
Sobin, N. 1987. The variable status of comp-trace phenomena. Natural Language and Linguistic Theory 5: 33–60.
Sobin, N. 2002. The comp-trace effect, the adverb effect and minimal CP. Journal of Linguistics 38: 527–560.
Sportiche, D. 1988. A theory of floating quantifiers and its corollaries for constituent structure. Linguistic Inquiry 19: 425–451.
Starke, M. 2001. Move Dissolves into Merge: A Theory of Locality. Doctoral dissertation, University of Geneva.
Stepanov, A. 2001. Cyclic Domains in Syntactic Theory. Doctoral dissertation, University of Connecticut, Storrs.
Takahashi, D. 1994. Minimality of Movement. Doctoral dissertation, University of Connecticut.
Travis, L. 1984. Parameters and the Effects of Word Order Variation. Doctoral dissertation, MIT.
Uriagereka, J. 1988. On Government. Doctoral dissertation, University of Connecticut.
Uriagereka, J. 1998. Rhyme and Reason. Cambridge, MA: MIT Press.
Uriagereka, J. 1999a. Minimal restrictions on Basque movements. Natural Language and Linguistic Theory 17: 403–444.
Uriagereka, J. 1999b. Multiple spell-out. In Working Minimalism, S. D. Epstein and N. Hornstein (eds.), 251–282. Cambridge, MA: MIT Press.
Uriagereka, J. 2008. Syntactic Anchors: On Semantic Structuring. Cambridge: Cambridge University Press.
Vikner, S. 1995. Verb Movement and Expletive Subjects in the Germanic Languages. Oxford: Oxford University Press.
Westergaard, M. and Vangsnes, Ø. A. 2005. Wh-questions, V2, and the left periphery in three Norwegian dialects. Journal of Comparative Germanic Linguistics 8: 117–158.
Wexler, K. and Culicover, P. 1980. Formal Principles of Language Acquisition. Cambridge, MA: MIT Press.
Whitney, R. 1982. The syntactic unity of Wh-movement and complex NP-shift. Linguistic Analysis 10: 299–319.
Zwart, C. J.-W. 1997. Morphosyntax of Verb Movement: A Minimalist Approach to the Syntax of Dutch. Dordrecht: Kluwer.

5 Medial-wh Phenomena, Parallel Movement, and Parameters

5.1 Introduction*

This chapter analyzes the fact that some children produce intermediate copies when they form long-distance questions (hereafter “medial-whs”), as seen in (1):

(1) a. Who do you think who is in the box?
    b. What do you think what Cookie Monster likes?

These data have been confirmed in several different studies: Thornton (1990, 1995), McDaniel, Chiu and Maxfield (1995), Crain and Thornton (1998).1 They also seem to be cross-linguistically robust; they are attested in Dutch (van Kampen 1997, 2010), French (Strik 2006), Basque, and Spanish (Gutiérrez Mangado 2006). In none of these languages are such sentences in line with the “target” grammar, and they are also not part of the input to the children who produce them. Interestingly, the data in (1) are very similar to data from adults speaking German dialects and Romani, as shown in (2):

(2) a. Wen glaubt Hans wen Jakob gesehen hat? (German dialect)
       whom thinks Hans whom Jakob seen has
       “Who does Hans think Jakob saw?” (McDaniel 1986: 183)
    b. Kas misline kas o Demìri dikhlâ? (Romani)
       whom you.think whom Demir saw
       “Who do you think Demir saw?” (McDaniel 1986: 182)

On the surface, English-speaking children and the adult speakers of German dialects and Romani appear to be very similar. In this chapter I argue that this is not the case and that the derivations underlying (1) and (2) differ in important ways. In particular, I argue that (1) follows from children’s analysis of null complementizers in English (cf. Jeong 2004), an analysis that can plausibly be extended to similar data from children speaking other languages. The data in (2), on the other hand, are argued to follow from parallel chain formation, which is an extension of the notion of parallel movement in Chomsky (2008). Last, I argue that the analysis aligns well with recent developments in how to think about parametric variation within the Minimalist Program.

5.2 Multiple Pronunciations in English-Speaking Children

In this section I discuss medial-whs in English-speaking children. Before I go into the details of the phenomenon, I first present some arguments as to why medial-whs are traits of competence rather than performance. I also discuss arguments showing that medial-whs are really derived through successive-cyclic movement. In the following sections I then discuss ways to account for multiple Spell-Out of copies, focusing in particular on Nunes (2004) and Jeong (2004). It was first discovered by Thornton (1990) that children are able to pronounce medial-whs. Some representative data are given in (3):

(3) a. Who do you think who’s in the box?
    b. What do you think who’s in that can?
    c. Which animal do you think what really says “woof woof”?
    d. Why do you think why Cookie Monster likes cookies?

In Thornton’s (1990) study, nine out of twenty children (2;1–5;5) frequently produced medial-whs during an elicited production task. McDaniel, Chiu and Maxfield (1995) also found that several children judged such sentences grammatical. Notice that the latter researchers asked the children for acceptability judgments whereas Thornton used an elicitation technique. I return to this difference later. Before we look closer at the restrictions children seem to obey, let us first ask whether medial-whs are a reflection of competence or a trait of performance. Thornton (1990: 331–333) discusses this question, and here I just mention some of the issues that she brings up, which favor the view that medial-whs are competence phenomena. We know that performance effects often result from memory overload and that this subsequently leads to deletion of material. This is as expected if memory limitations prevent us from keeping a certain amount of material in working memory: deleting items would take away some of the burden imposed on working memory. In the cases we are discussing, however, material is inserted rather than deleted. It is thus less plausible that we are dealing with performance effects in cases involving medial-wh structures. It has also been shown that performance errors are typically found with object extraction. However, children that produce medial-whs do so more often in subject extraction than in object extraction. In fact, there is an important developmental trajectory here. Children often start out producing medial-whs in both subject and object extraction, but at a later stage they only produce medial-whs in subject extraction, until they converge on the target grammar. If performance errors are mostly found with object extraction, it is clear that these data cannot be analyzed as such errors.
A final argument against medial-wh structures being performance effects comes from parsing considerations. We know that resumptive pronouns are more likely to occur with depth of embedding. This is for memory reasons: resumptive pronouns make it easier to recover the dependencies, so that we do not need to store a gap in working memory. The prediction that emerges is that the lower clause is more likely to be filled, as it comes late in the parse. Consider (4):

(4) Who do you really think who Grover wants to hug?

If parsing considerations determine the Spell-Out of copies, we would expect there to be a copy in the infinitival clause, as in (5).

(5) *Who do you really think who Grover wants who to hug?

Such data are not attested. As I discuss in the following, Thornton (1990, 1995) has argued that children never produce a medial-wh in infinitival clauses. Even if that were true, it would not affect the argument I am making here. Parsing considerations are generally thought to work outside of grammar proper;2 thus, the parser would presumably not know that medial-whs cannot appear in infinitivals, if the goal of the parser is to aid memory retrieval. Taken together, these arguments make it hard to justify the claim that medial-whs are performance effects. There is another issue that has to be discussed before turning to the similarities between English child language and other languages that display medial-whs. The issue is whether the medial-whs really are reflexes of long-distance movement.3 Thornton (1990) already asked that question, though she only discussed one piece of evidence. She asked whether (6a) is better represented as (6b):4

(6) a. Who do you think who is in the box?
    b. Who do you think? Who is in the box?

However, as will become evident momentarily, this piece of data is not well suited to settle this question. Consider instead (7):

(7) What do you think what pigs eat?

This sentence would have the bi-clausal structure in (8) if (6b) is the correct representation for (6a):

(8) What do you think? What do pigs eat?

As we see, do-support is an ideal diagnostic here. Whereas we obligatorily get do-support in the main clause, we should not get do-support in the embedded clause of (7). This is indeed what we find: do-support in each clause, as in (8), does not occur in children’s speech. To my mind, this is a good argument against analyzing these structures as bi-clausal. However, there are still some other alternatives to consider before we can move on. One is whether (9b) is an adequate representation for (9a):

(9) a. Who do you think who left?
    b. Who do you think is the one who left?

That is, is (9a) really a reduced relative of some kind where who is a relative pronoun? There are several issues that render an analysis such as (9b) untenable. First, we have to bear in mind the context where these medial-whs were produced, namely, a context highly favorable to the elicitation of questions. Since children are otherwise very good at producing questions in this context, it seems odd that they should suddenly start producing relative clauses. Second, one wonders what would trigger the change from the relative pronoun structure to a long-distance question structure. The latter is arguably the correct structure for the target grammar that the child eventually converges on. It seems hard to come up with anything reasonable to say about this question. However, there is an empirical prediction that a reduced relative clause approach makes. In (9) we have an animate wh-element. If we instead have an inanimate wh-element, we would predict “that” to occur instead of a wh-element. That is, (10a) would be something like (10b) instead of (10c), contrary to fact:5

(10) a. What do you think is in the box?
     b. What do you think is the thing that is in the box?
     c. What do you think is the thing what is in the box?

This is a serious problem for any analysis claiming that medial-whs are reduced relatives. Of course, additional questions would emerge as to how one would go about analyzing a reduced relative of this kind, but I take it that the analysis is already too dubious to merit further discussion. So far I have presented a few arguments that medial-whs are competence effects, and I have argued that the best analysis is one that views these medial-whs as reflexes of successive-cyclic movement. This is entirely uncontroversial within the generative literature. However, so far nothing has been said about how exactly medial-whs should be analyzed. I now turn to that question by reviewing two proposals that have been put forward in the context of the Minimalist Program.

5.3 Nunes on Spell-Out of Multiple Copies

Nunes (2004) attempts to give a comprehensive theory of Spell-Out at the PF interface. He argues in favor of a Chain Reduction operation that ensures that only one copy is pronounced, based on Kayne’s (1994) Linear Correspondence Axiom (LCA). Nunes derives the ban on chains where multiple members of a chain are phonetically realized from linearization requirements. Syntactic items intervening between two phonetically realized chain links must both precede and follow the same element, resulting in a contradiction. As Nunes also points out, structures with two phonetically realized chain members violate the irreflexivity condition on linear order; that is, if A precedes B, then A ≠ B. Furthermore, the fact that it is usually the highest copy that is pronounced is derived from the claim that lower copies usually have checked fewer features than higher copies; it is therefore more economical to delete these copies. Notice that one cannot delete all copies because, according to Nunes, that would involve too many applications of the operation Chain Reduction.
However, structures like those mentioned earlier are attested, and we need to account for them. Nunes does so by saying that in cases where medial-whs are allowed, movement proceeds through adjunction to an intermediate C head. A simplified representation is given in (11):

In this case of adjunction to another head, Nunes (2004: 40) follows Chomsky (1995: 337), who says that “[the structure’s] internal structure is irrelevant; perhaps [the structure] is converted by Morphology to a ‘phonological word’ not subject internally to the LCA, assuming that the LCA is an operation that applies after Morphology”. Put differently, Morphology converts

[C wh [C C]] into a single terminal element through a process of morphological reanalysis akin to the fusion operation of Distributed Morphology (Halle and Marantz 1993). This morphological reanalysis makes the big structure invisible to the LCA. This solution seems to work quite well, but there are problems. One problem relates to children who produce medial-whs. For these children, successive-cyclic movement has to be able to proceed along the preceding lines, namely, involving adjunction to the intermediate head. There is no evidence in the input that this is the case, and furthermore, the child will have to learn one way or another that successive-cyclic movement in English happens by way of phrasal movement through specifiers. It is not clear what the relevant cue would look like here. However, there are more serious empirical problems. According to Nunes, adjunction can only happen if the wh-phrase is a head. He used data such as the following from German dialects to support this view:

(12) *Wessen Buch glaubst du wessen Buch Hans liest?
      whose book think you whose book Hans reads
      “Whose book do you think Hans is reading?” (McDaniel 1986: 183)

Whereas this seems to be true for German, Felser (2004) observes that data such as the following from Afrikaans are problematic for Nunes (2004).

(13) Met wie het jy nou weer gesê met wie het Sarie gedog met wie gaan Jan trou?
     with who did you now again said with who did Sarie thought with who go Jan marry
     “Whom did you say (again) did Sarie think Jan is going to marry?”
     (du Plessis 1977: 725)

In this case we have PPs that are doubled in intermediate positions. If these were the only kind of complex constituents that were allowed in intermediate position, one could perhaps say that these PPs are reanalyzed somehow. The data given in (14) through (16), however, suggest that this is implausible.6

(14) Van watter vrou het jy gedink van watter vrou het hulle gister gepraat?
     of which woman have you thought of which woman have they yesterday talked
     “Which woman do you think they talked about yesterday?”
(15) Met watter meisie het jy gesê met watter meisie wil Jan trou?
     with which girl have you said with which girl wants John marry
     “Which girl do you say John wants to marry?”
(16) a. Watter meisie sê hy watter meisie kom vanaand kuier?
        which girl say he which girl come tonight visit
        “Which girl did he say is coming to visit tonight?”
     b. Watter mooi meisie sê hy watter mooi meisie kom vanaand kuier?
        which beautiful girl say he which beautiful girl come tonight visit
        “Which beautiful girl did he say is coming to visit tonight?”

(14) and (15) are cases where a complex PP occurs in a medial position, whereas (16) shows two examples of complex DPs. All of these are fine in (colloquial) Afrikaans. Together, these data clearly show that narrow syntax needs to allow even complex DPs and PPs to be generated in such a way that we can get multiple pronounced copies. Of course, this also raises the question of how we deal with the difference between German and Afrikaans, a topic that I return to later. But as a general analysis of medial-wh phenomena, it should be clear that Nunes’s account is inadequate. Let me now turn to a different account that only tries to derive the child data, namely, that of Jeong (2004).

5.4 Jeong and Null Complementizers

Jeong (2004) suggests a very interesting minimalist analysis of the medial-wh data produced by English-speaking children. In this section, I adopt Jeong's proposal, and I also show how it can derive the absence of medial-whs in infinitives. Last, I discuss an asymmetry between production and comprehension regarding the appearance of medial-whs in infinitives.
Jeong's point of departure is that phonetically null complementizers in English (and possibly other languages; cf. Richards 2001) are affixes. In particular, these null complementizers are affixes that need to attach to the immediately dominating verb. This assumption is able to account for the contrast in (17), as shown by Pesetsky (1992) and Bošković and Lasnik (2003; see also Stowell 1981 and Kayne 1984):

(17) a. John expected that/Ø Mary would come.
     b. That/*Ø Mary would come was expected.

In (17a), the null complementizer affix can attach to the verb, whereas in (17b), the affix is not close enough to the verb. Jeong argues that children don't know the exact specification of null complementizers. Whereas they know that null complementizers are affixes, they don't know which elements these affixes can be attached to. In English, null complementizer affixes cannot attach to nouns, as shown in (18):

(18) a. The claim that Mary was ill upset me.
     b. *The claim Ø Mary was ill upset me.

Children that produce medial-whs, however, need to allow affixes to combine with wh-phrases.7 Jeong assumes that children know that affixes cannot be stranded (Lasnik's 1981 Stranded Affix Filter) and that null complementizers attach to Vs. What they also do, then, is to entertain the possibility that wh-phrases can attach to the null C. As Jeong argues, the fact that the null C can attach to the wh-phrase plausibly forces pronunciation of the medial-wh because an affix cannot attach to something that is not pronounced. Interestingly, this also accounts for why children only pronounce copies of wh-phrases in SpecCP and not in the original landing site or any other intermediate positions, as in (19a). A simplified representation of the derivation is given in (19b) and (19c), where comp illustrates the affixal null complementizer:

(19) a. Who do you think [CP who [IP the cat [vP who chased who]]]

b. [who do you think [CP who [C comp [IP the cat [vP who chased who]]]]]

c. [who do you think [CP who + comp [IP the cat [vP who chased who]]]]

It is clear why the underlined copies cannot be pronounced: the affix that forces pronunciation of a wh-copy is located in C and not in v or V. Notice that Jeong's approach makes two predictions. One is that medial-wh elements are not expected to show up in children's production in languages with embedded inversion, assuming C is filled, as in Belfast English. Or more precisely, medial-wh elements should not show up at the time when embedded inversion is acquired. The second prediction is that medial-wh should not appear in languages with overt complementizers coexisting with wh-elements, as in Basque and other languages. As far as I know, the first prediction is borne out. The second one is a bit trickier. Presenting data from a Spanish child, Gutiérres Mangado (2006) shows that this child produces medial-whs with co-occurring complementizers. An example is given in (20):

(20) Dónde crees que dónde ha ido el señor?
     where think.2sg that where has gone the man
     Target: "Where do you think the man went?"
     (Gutiérres Mangado 2006: 269)

However, it's not clear that Jeong's analysis cannot be applied to such data. It may be that the wh-element is in a different functional projection, say SpecFocP (Rizzi 1990), from the complementizer que, which may be lexicalizing the Force head.8 If that is the case, there could be a silent complementizer that would force pronunciation of the medial-wh. Such an analysis does not seem to be far-fetched, and insofar as it can be maintained, the second prediction goes through.
In what follows, I basically adopt and extend Jeong's proposal. In particular, I adopt the idea that what triggers the pronunciation of medial-whs in English-speaking children is the null complementizer affix. I also suggest that children assume that only simplex wh-elements can be pronounced as medial-whs, maybe for phonological reasons along the lines of the adult grammars (see the following discussion). This also accounts for the fact that children never produce structures like (21):

(21) *Which boy do you think which boy the cat chased?

Instead, a few children would produce structures like (22):

(22) Which boy do you think who the cat chased?9

The problem, though, as Jeong (2004) points out, is how which boy can leave a copy like who, given that copies are identical. Jeong suggests a Distributed Morphology solution to this problem. Distributed Morphology assumes that there is a distinction between syntactic features, which are the only features that the syntax operates on, and morphophonological features that are added to syntactic feature bundles on Transfer. Late insertion of this kind makes a discrepancy between syntactic and morphophonological features possible. Jeong suggests that in which boy, the relevant syntactic features are [+wh, +singular, +human, +masculine], which are also features shared by who. Who can thereby function as an exponent of the syntactic features that characterize which boy.
Structures like (22) are less frequent than medial-whs with simplex wh-phrases (Thornton 1990, 1995). Jeong suggests that this is because D-linked wh-phrases involve an extra morphological operation, namely, that of turning a complex wh-phrase into a headlike element in the morphophonology. The flipside of (22), which also occurs in children's production, is what looks like partial movement (23):

(23) What do you think who jumped over the fence? (Thornton 1990: 213)

In partial movement constructions, the topmost wh-phrase (what in (23)) acts like a scope marker for the downstairs wh-phrase (who in (23)). Similar constructions are found in German dialects, Romani and other languages, and I briefly return to them in section 5.3. Felser (2004) argues convincingly that these structures should be analyzed in a different way than medial-wh structures (see also Bruening 2006 for a similar and strong argument from Passamaquoddy). Therefore, I do not have anything to say about them in this chapter. Whatever one's favorite analysis of partial movement may be, it will be compatible with the data in (23).
There is one fact that Jeong (2004) does not discuss. Thornton (1990: 213) says that no child ever produced a medial-wh in infinitival clauses (24). Instead, children produced adult forms as in (25):

(24) a. *What do you want what to eat?
     b. *Who do you want who to eat the pizza?

(25) a. What do you wanna eat?
     b. Who do you want to eat the pizza?

Thornton marks (24) as ungrammatical. It is straightforward to account for this by extending Jeong's proposal. Infinitivals don't have a null complementizer in English (though see Kayne 1984 and Pesetsky 1992 for complications), so there is nothing for the medial-wh to attach to. Consequently, the medial-wh cannot be pronounced.

Table 5.1 Percentage of Acceptance for Medial-whs in Infinitives

Infinitive type   Session 1   Session 2   Session 3   Session 4
                  (N = 32)    (N = 32)    (N = 24)    (N = 15)
Subject           28%         22%         21%         13%
Object            22%         13%         13%         20%

It should be added that other experiments have shown that some children judge these sentences grammatical. McDaniel, Chiu, and Maxfield (1995) tested the judgments of children aged 2 years 11 months to 5 years 7 months on the two sentences in (26):

(26) a. Who do you want who to cook dinner?
     b. Who do you want who to kiss?

I have excerpted the relevant results from McDaniel, Chiu, and Maxfield (1995: 724) in Table 5.1.10 As McDaniel, Chiu, and Maxfield (1995: 732, fn. 25) point out, there is no direct contradiction between their data and Thornton's (1990). It is perfectly possible that subjects find a particular sentence acceptable yet do not produce it in a production task. It is not clear why there is such an asymmetry, though production–comprehension asymmetries are quite frequently attested in child language. They likely have a variety of reasons, ranging from performance to competence. I do not have anything new to add here. What is important for present purposes is that Jeong's account can easily be adapted to account for the lack of medial-whs in infinitives.
We have now seen how we can analyze medial-whs in English child language. In the next section I discuss how medial-whs in adult grammars can be analyzed.

5.5 Medial-whs in Adult Grammars

The aim of this section is to account for the data in (2), repeated here as (27):

(27) a. Wen glaubt Hans wen Jakob gesehen hat?     German dialect
        whom thinks Hans whom Jakob seen has
        "Who does Hans think Jakob saw?"
        (McDaniel 1986: 183)
     b. Kas misline kas o Demìri dikhlâ?           Romani
        whom you.think whom Demir saw
        "Who do you think Demir saw?"
        (McDaniel 1986: 182)

The first task is to explore whether Jeong's (2004) account of the child English data can be extended to the German dialects and Romani. Recall that on her hypothesis the child overgeneralizes the licensing requirements on null complementizers so that they can combine with wh-phrases. This even goes beyond just nouns, since we have seen that adjunct wh-phrases can be pronounced medially. If we were to use this analysis for adults who have medial-whs as part of their I-language, that would mean that these adults somehow figured out that only verbs (cf. the data in (17)) and wh-phrases can attach to a null complementizer affix. This is clearly not a natural class; that is, it is not clear why this should be the relevant class as opposed to some other class.
There is a more serious problem, though. First, some German dialects allow medial-whs with an overt complementizer, as shown in (28):

(28) Wen denkst du wen dass du eingeladen hast?
     who think you who that you invite have
     "Who do you think that you have invited?"
     (modeled on Fanselow and Ćavar 2001: 127)

This is similar to the Spanish case in (20). However, those data come from one single child, and it is unwise to build too much of an analysis on one case study. More important, though, the medial-wh structures for adults are typically optional. This sets the structures apart from the ones that children produce. Children often produce the medial-whs for a very limited time, and when they do so, they often do it frequently (Thornton 1990). To the extent that this is a general property across languages, it seems that medial-whs in child language are less optional than for adults.
Another potential problem faces us if we want to use Jeong's theory for the adult structures. We would have to say that there are two silent complementizers in these grammars. One is able to attach to wh-elements and the other is not. That would give us the facts, but at the cost of giving an account that basically just redescribes them. In that sense, it would not be a very deep account.
Although these problems are not in and of themselves lethal for Jeong's analysis, they seem to suggest that the pronunciation of medial-whs is not due to properties of the complementizers in the grammar of the adults that produce these structures. In the following I explore a different analysis, which says that a quite different derivation takes place in grammars that allow medial-wh structures. Specifically, I suggest an extension of Chomsky's (2008) parallel movement (cf. independent work by Kandybowicz 2008 and Aboh and Dyakonova 2009), and I argue that this accounts for the adult medial-wh data. Let me first present Chomsky's concept of parallel movement and its motivation before I go on to extend it.
Chomsky (2008) is concerned with how structures like (29) should be analyzed:

(29) Who saw John?

He suggests the following representations:

(30) a. C [T [who [v* [see John]]]]

b. Whoi [C [whoj [T [whok v* [see John]]]]] (Chomsky 2008: 149)

Here there is parallel movement from Specv*P to SpecTP and SpecCP. This means that who moves simultaneously to SpecTP and SpecCP; thus, we have two chains, namely, (whoi, whok) and (whoj, whok). The motivation for introducing parallel chains is that it gives us the distinction between A-chains and A-bar chains, and it is triggered by two kinds of features. The movement to SpecTP is related to phi-features that are inherited from C whereas the movement to SpecCP is driven by an Edge Feature on C. However, if there are two chains, and if, as is a common assumption, only one element of a chain is pronounced, how come (29) is not pronounced as (31)?11

(31) *Who who saw John?

Chomsky (2008: 150) says that "[b]y the usual demand of minimal computation, the A-chains contain no pronounced copy". Instead, I assume that (31) is ruled out by a syntactic version of the Obligatory Contour Principle (*XX), which is independently motivated; compare with Grimshaw (1997), Ackema (2001), van Riemsdijk (2008) and Ott (2009).
In what follows, I suggest an extension of this notion of parallel movement. Chomsky uses it to derive the distinction between A- and A-bar movement, and I suggest that we extend this analysis to also involve parallel movement of two A-bar chains. There does not seem to be anything that would bar this extension; rather, preventing it would require a motivation.12 I also follow Nunes (2004) in assuming that the highest member of a chain is pronounced, presumably because this member has checked more features than the lower chain members. It is not clear how to reconcile this with a view where all copies are identical (Chomsky 2008), but I set that issue aside for present purposes.
It is easiest to see how this would work by considering a specific derivation. I first work through two concrete examples, and then I consider how the difference between German dialects, on one hand, and Afrikaans, on the other, can be derived, namely, why only Afrikaans allows D-linked wh-phrases to occur as medial-whs. Let us look at a case where there is one medial-wh. In the tree in (33), I use bold to indicate which wh-phrase is spelled out, and I have numbered the movement operations so that it is easier to keep track of them. Indices are also used to aid exposition, but they have no theoretical importance. Last, I do not show possible intermediate landing sites (e.g., the left edge of vP) in order to simplify the structures.
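The pronunciation logic just described can be sketched as a toy procedure. This sketch is purely illustrative and not part of the chapter's formal apparatus: it assumes a flat list of positions and hand-coded chains, spells out only the highest copy of each chain (following Nunes 2004), and then lets a syntactic analogue of the Obligatory Contour Principle collapse immediately adjacent identical copies, which is what rules out *Who who saw John.

```python
# Toy sketch (an illustrative assumption, not the chapter's formalism):
# spell out the highest copy of each movement chain, then apply a
# syntactic OCP (*XX) that collapses adjacent identical copies.

def pronounce(positions, chains):
    """positions: (slot, word) pairs in linear order;
    chains: lists of slot indices, each ordered highest copy first."""
    spelled = {}
    for chain in chains:
        spelled[chain[0]] = True        # highest copy is pronounced
        for lower in chain[1:]:
            spelled[lower] = False      # lower copies are deleted
    out = [word for slot, word in positions if spelled.get(slot, True)]
    result = []
    for word in out:                    # syntactic OCP: *XX
        if not (result and result[-1] == word):
            result.append(word)
    return " ".join(result)

# (29): parallel movement of "who" to SpecTP (A-chain) and SpecCP
# (A-bar chain); the two chain heads are adjacent, so the OCP
# collapses them and we never get *Who who saw John.
who_saw_john = pronounce(
    [(0, "who"), (1, "who"), (2, "who"), (3, "saw"), (4, "John")],
    [[0, 2], [1, 2]])   # both chains share the base copy in slot 2
```

Running the sketch on this configuration yields "who saw John"; feeding it the chains of a medial-wh sentence such as (32), where the two chain heads sit in non-adjacent SpecFocP positions, yields both pronounced copies, as desired.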

(32) Wen glaubt Hans wen Jakob gesehen hat? German dialects whom thinks Hans whom Jakob seen has “Who does Hans think Jakob saw?” (McDaniel 1986: 183)

(33) [FocP whoi thinks [TP Hans [vP Hans [v′ thinks [ForceP whoi [FocP whoj [TP Jakob [vP Jakob [v′ saw whok ]]]]]]]]]
     (The original presents (33) as a tree diagram; boldface there marks the spelled-out copies, the matrix whoi and the embedded whoj, and the numbered arrows mark movements 1 and 2.)
Let us go through what happens in (33) step by step.13 Parallel movement applies to whok, creating whoi and whoj, which then each move to the left periphery of the embedded clause (movement 1). The two wh-phrases target different projections in the left periphery, namely, SpecFocP and SpecForceP. Rizzi (1990) argues convincingly that interrogatives and focus compete for the same position, namely, SpecFocP (see also Stoyanova 2008 for further and interesting developments of this idea in a cross-linguistic perspective). I follow Nunes (2004), who argued that whenever an element moves to SpecFoc(us)P, it has to be pronounced because of the focus properties associated with it.14 Finally, the topmost wh-phrase in SpecForceP moves to the left periphery in the main clause (movement 2).15 The fact that it is the highest wh-element that moves is derived by a principle such as Attract Closest, which says that the closest target is always attracted. Assuming that the Force head is a phase head (cf. Rizzi 2005, Julien 2007), one also derives this result because only the highest wh-phrase is available for further operations since the lower wh-phrase is within the Spell-Out domain of the Force head. As a result, we see that the heads of both chains are in SpecFocP, which is a position that requires pronunciation.
There are some potential questions that we should consider before moving on to a more complex case. One issue is whether anything would change if parallel movement took place from an adjoined vP position. The answer is no (though see Grewendorf and Kremers 2009 for a different answer). It makes no difference whether parallel movement happens from the base position or from an intermediate landing site. I have only shown one derivation, but parallel movement from an intermediate site yields another convergent derivation. Another question is, "What happens if there is parallel movement of whoi in (33)?
Wouldn't that predict two occurrences of wh-elements in the main clause left periphery?" Such a derivation would be ruled out by a syntactic version of the Obligatory Contour Principle, on a par with *Who who saw John. Another issue concerns the semantics. Usually it is assumed that LF cares about chain membership, but in this case we only want to say that the wh-phrase is interpreted once, despite there being two chains in the syntax. There are various ways one can get this result, either by stipulating that the semantics only sees one copy (Hornstein 1995), or by invoking some notion of chain collapse (Martin and Uriagereka 2008). The exact details do not matter much for present purposes; it is only important that there is a way to ensure that the semantics recognizes that there is one wh-phrase despite there being multiple chains in the syntax.
Let us now move on to consider a case with two medial-whs, that is, a case like (34):

(34) met wie het jy nou weer gesê met wie het Sarie gedog met wie
     with who did you now again said with who did Sarie thought with who
     gaan Jan trou?
     go Jan marry
     "Whom did you say (again) did Sarie think Jan is going to marry?"
     (du Plessis 1977: 725)

In order to simplify the tree, I have only put in the wh-phrases. This makes it easier to see exactly how the derivation works. The conventions for illustrating the derivation are the same as for (33): boldface is used for spelled-out wh-phrases, the movement steps are numbered, and possible intermediate landing sites are not shown.

(35) [FocP whl . . . [ForceP whl [FocP whm . . . [ForceP whi [FocP whj . . . whk ]]]]]
     (The original presents (35) as a tree diagram spanning three clauses; boldface there marks the spelled-out copies whl, whm, and whj.)

The difference between (33) and (35) is that in (35) there is parallel movement from SpecForceP in the most embedded clause. There are no restrictions on where parallel movement can occur, so it can just as easily occur from SpecCP, Specv*P and from the base position. Otherwise, the derivation is very similar to (33) and no new technology is introduced.
A question that we need to deal with is how we can account for the difference between German dialects and Romani, on one hand, and Afrikaans, on the other (later I deal with the difference between medial-wh languages and English). Recall that only the latter allows complex wh-phrases to be pronounced in medial positions. It is likely that language-particular rules determine the size of the medial-wh element (cf. Benincà and Poletto 2005 and Poletto and Pollock 2009 for Romance; see also, more generally, Bošković 2001, Landau 2006 and Grohmann 2008). It is also plausible that these are phonological rules, that is, that there are restrictions on how many syllables the wh-word may have. These rules will then apply after the syntactic rules.16 I leave the specific implementation of this for future work.
We saw earlier that English-speaking children have structures that look very much like partial wh-movement of the kind one finds in German (36):

(36) Was glaubt Hans mit wem Jakob jetzt spricht?     German
     what believes Hans with whom Jakob now talks
     "What does Hans believe with whom Jakob is now talking?"
     (McDaniel 1986: 111)

Again, I do not present any analysis of these, for the same reason as mentioned earlier, namely, that partial wh-movement probably should be analyzed differently than medial-whs (Felser 2004). There is, however, a fact that is common to both partial wh-movement and medial-whs, namely, that negation cannot intervene. Long extraction when only one copy is pronounced is acceptable.

(37) a. Wie denk je niet dat zij uitgenodigd heeft?     Dutch
        who think you not that she invited has
        "Who don't you think she has invited?"
     b. *Wat denk je niet wie zij uitgenodigd heeft?
        what think you not who she invited has
     c. *Wie denk je niet wie zij uitgenodigd heeft?
        who think you not who she invited has
     (Barbiers, Koeneman, and Lekakou 2009)

As Barbiers, Koeneman, and Lekakou (2009) point out, it is not clear how these data should be analyzed, in particular because other operator structures (like pure quantifiers) do not show similar patterns. Rett (2006) also argues that the restriction is not syntactic but, rather, semantic. One could add that if analyses that do not take the wat in (37b) to originate together with wie are correct, then that adds another argument against (37b, c) being the result of a syntactic violation. My goal in this chapter is not to provide an account of the restriction seen in (37), and given the lack of consensus in the literature, it seems that a larger investigation into these restrictions and their nature is required. Such an investigation clearly goes beyond the scope of the present chapter.
There are some remaining questions that we need to consider. Earlier I presented some reasons why the analysis of children's medial-whs does not carry over to adult medial-whs. Here I want to consider the reverse question; namely, if the parallel movement analysis is the correct one for adult medial-wh grammars, why is this not the correct analysis of children's grammar? First it should be made explicit that the parallel movement analysis would give us the same result for the cases in English child language as for the adults. However, given that children only produce medial-whs for a limited amount of time (obviously setting aside children who acquire medial-wh languages), it is not at all clear what would make them entertain the parallel movement derivation and then later discard it when they discover that the derivation yields sentences that are not part of what we call English.
Jeong's (2004) analysis clearly states that children are misinterpreting properties of the null complementizer, which is a likely misinterpretation/overgeneralization given that children nevertheless need to figure out the restrictions on affixation to null complementizers. However, if children are at first entertaining a non-medial-wh derivation, what would make them entertain a medial-wh derivation if the latter is derived through parallel movement? There is no obvious trigger for them to change their hypothesis about the grammar. However, on the null-complementizer analysis, there is an indirect trigger in that the children have to figure out what the licensing properties of null complementizers are. The latter can come closer to providing an explanation for why children do what they do, whereas the parallel movement analysis would not do that. Instead, the parallel movement hypothesis would just state that children are exploring possibilities that are provided by Universal Grammar. Although that is a perfectly valid hypothesis, we should prefer a theory that can come closer to saying something about why things happen the way they do. As far as I can see, only the null complementizer analysis does that.
What I have just argued amounts to saying that the parallel movement derivation needs an overt trigger. When children are acquiring Romani or German dialects, they presumably get medial-whs as part of their input. Therefore, it should come as no surprise that these children will grow up producing these structures using parallel movement. English children presumably never hear medial-whs as part of their relevant input, and although they may produce them for a short amount of time, they never end up doing so permanently. Therefore, wh-movement does not proceed in parallel in English, with the exception of the structures that Chomsky (2008) claims are derived through parallel movement.
This is a natural place to pause and consider a bigger issue that relates to the topic of this chapter, namely, how we think about linguistic variation. In the next section I discuss some implications of the analysis I have suggested in sections 5.4 and 5.5.

5.6 Consequences for Parametric Theory

I have argued that when English-speaking children produce medial-whs, they do it in a different way than adult speakers of German dialects. Most of the previous literature has argued that the English children are doing roughly the same as German adults (Thornton 1990, McDaniel, Chiu and Maxfield 1995, Gutiérres Mangado 2006), though they disagree on the details. McDaniel, Chiu and Maxfield (1995) argue that there is something like a wh-parameter distinguishing medial-wh languages from non-medial-wh languages, whereas Thornton (1990) argued that there is no such parameter. Instead, children are using the medial-wh as a complementizer, signaling that Spec-head agreement has taken place, which is necessary to satisfy the Empty Category Principle in the framework she is assuming (roughly that of Rizzi 1990). Contrary to these proposals, I am arguing that the derivation underlying English children's medial-whs and the derivation underlying German adults' medial-whs are different. The former involves overgeneralizing the licensing requirements for null complementizer affixes, whereas the latter involves parallel movement. If this is on the right track, it has implications for how we conceive of parametric variation. In the remainder of this chapter, I discuss this issue.
In the late 1970s, the question of variation among languages became more and more pressing within Chomskyan generative grammar (see Lasnik and Lohndal 2010 for some discussion of the history). A theory that distinguished between principles and parameters was developed. The principles were assumed to be universal and part of UG. Concerning parameters, Chomsky (1981: 4) said that "[i]f [. . .] parameters are embedded in a theory of UG that is sufficiently rich in structure, then the languages that are determined by fixing their values one way or another will appear to be quite diverse".
The idea was that the parametric space is finite and innately specified through UG. Put in the words of Baker (2001: 19), "we may think of parameters as the atoms of linguistic diversity". On this approach, a fundamental assumption was that UG should reflect typological generalizations.
However, recent work has questioned this assumption. In particular, Newmeyer (2005) has questioned the empirical foundation and argued that parameters cannot capture typology the way we used to think. I do not review all his evidence in favor of this but just add that several other researchers have come to the same conclusion (for useful discussion, see Baker 2008, Gallego 2008, Richards 2008, Hornstein 2009, Kandybowicz 2009, Boeckx 2011, and van Gelderen 2011). Besides the empirical issues, there are also theoretical reasons why one should be suspicious toward encoding typological variation in UG, especially if one is wearing one's minimalist hat. Since the first minimalist papers in Chomsky (1995), Chomsky has argued that there should be no variation among languages at the level of Logical Form, known as the Uniformity Principle: "In the absence of compelling evidence to the contrary, assume languages to be uniform, with variety restricted to easily detectable properties of utterances". Recently, Boeckx (2011) has strengthened this to what he calls the Strong Uniformity Thesis:

(38) Principles of narrow syntax are not subject to parametrization; nor are they affected by lexical parameters.

Put differently, narrow syntax itself is entirely universal; thus, it does not correspond to any given language. The question then arises where we put the variation we know exists. A prominent proposal is to locate the variation in the lexicon, as differences related to lexical elements (Borer 1984: 3; also cf. Chomsky 1995, Kayne 2000, 2005). In addition, we have word order variation, among other things, which can be related to lexical elements (e.g., as strong and weak features as in Chomsky 1995). The strongest hypothesis would say that all variation is related to the externalization component, namely, PF (Chomsky 2005, Boeckx 2011, Berwick and Chomsky 2008). On this approach, narrow syntax is universal and not the locus of variation.
Going back to the medial-wh cases, it is clear that the analysis I have given earlier is only consistent with the latter view of parameters. The overgeneralization that I have argued children who produce medial-whs are exhibiting is a property of the null complementizer, that is, a lexical element. I have also argued that a similar analysis cannot be used for the adults. Instead, I have suggested a specific implementation of parallel movement. This computation is "input driven" in that it only seems to appear when there is positive evidence; if I am right, it is not something that the child (or an adult English speaker) just decides to do at random. Thus, English children have the same computational system as German children, but because of different input, they end up doing what looks like similar things in different ways.
There is no contradiction between this and the continuity hypothesis, which maintains that child language can differ from the language of the linguistic community only in ways that adult languages can differ from each other (Crain and Thornton 1998).
The empirical arguments against encoding typological variation in UG appear to be solid, as Newmeyer (2005) convincingly argues. But notice that there is also a deeper motivation for why it is better to put the variation outside of UG. To see this, let us look at what Chomsky (2005) calls the three factors in the design of language. Chomsky says that "[a]ssuming that the faculty of language has the general properties of other biological systems, we should, therefore, be seeking three factors that enter into the growth of language in the individual" (Chomsky 2005: 6). These three factors are given in (39):

(39) a. Genetic endowment
     b. Experience
     c. Principles not specific to the faculty of language

It is possible to say that whereas the Government and Binding view was that parametric variation is located in (39a), the more minimalist view is that the variation is related to experience and possibly to third factors, as (39c) is commonly called (see in particular Richards 2008 on the latter, and Roberts and Holmberg 2010). If it is possible to show that all variation belongs to (39b) and (39c), then we would be close to a view of UG as underspecified, contra the view that UG is overspecified (Yang 2002, Baker 2001). Such a view of UG as underspecified was actually explored in the 1980s by Richard Kayne and Juan Uriagereka, both inspired by Changeux (1981), but this view has not been developed in any great detail since (though see Holmberg 2010).
Summarizing, I have pointed out that the view of parameters within Chomskyan generative syntax is changing, and I have argued that the cases discussed in this chapter lend further support to this change. On the new view, the range of variation among languages is no longer innately specified, though it is still innately given. Parameters are trivialized as being part of what a child needs to learn, which can be taken to imply that they are no longer necessarily binary. UG is the core universal part that is common to all languages, and as such it does not encode variation.

5.7 Conclusion

The goal of this chapter has been to analyze medial-wh structures in both child grammars where medial-wh occurs despite not being present in the input and adult grammars where medial-whs occur regularly. I have argued that children are overgeneralizing the licensing requirement on null-complementizer affixes (following Jeong 2004) and that this is different from what adults are doing in languages that have medial-whs. In the latter case, I have argued that medial-whs are derived through parallel movements combined with the assumption that only one element per chain is spelled out. I have also claimed that the current analysis is compatible with a minimalist view of parametric variation.

Notes

* Parts of this chapter have been presented at the Workshop on Universal Grammar, Language Acquisition and Change (University of Oslo, August 2008), Syntax lunch (University of Maryland, October 2008), the Workshop on Typology and Parameters (University of Arizona, February 2009), and Harvard University (February 2009). I’m grateful to the audiences for their helpful comments and to Jason Kandybowicz, Jeff Lidz, Jairo Nunes, and Bridget Samuels. For their encouragement along the way and insightful comments, I’m indebted to Cedric Boeckx, Noam Chomsky, Elly van Gelderen, Rozz Thornton, and Juan Uriagereka.
1 See also de Villiers, Roeper, and Vainikka (1990) for a comprehension study that looks at somewhat similar data.
2 For an important exception, see Phillips (1996).
3 I am assuming that successive cyclic movement exists in the traditional sense. See den Dikken (2009) for arguments to the contrary. See also Koster (2009) for critical discussion.
4 One question not raised by Thornton is whether the first part of (6b) really is grammatical: Who do you think? At least in adults’ English grammar, this structure is not well formed. This is another reason to be suspicious toward an analysis of (6a) along the lines of (6b).
5 This is true for subject and object reduced relatives. Thornton (1990) shows that some English-speaking children show that-trace violations, which means that there are some cases of that appearing but only for subject extraction.
6 I am grateful to Theresa Biberauer (p.c.) for the Afrikaans data. Note that these structures are best with emphatic intonation on all the wh-words. However, given that the structures are still possible without such an intonation, it does not seem plausible to argue that emphasis somehow creates a unit of sorts, which could be sufficient for Nunes’s purposes to count as a “word”.
7 Jeong (2004: 14) actually says “nouns”, but this cannot be right since we have seen cases where adjuncts are pronounced as medial-whs (cf. (3d)).
8 This raises the question of how we know where affixal null complementizers are located in a split-CP. I assume that languages can differ, just as they may differ concerning where overt complementizers are merged. Independent evidence that may bear on this issue could be that-trace effects, cf. Lohndal (2009).
9 Similar cases can be found in German dialects:
  (i) Welchen Mann denkst du wen er kennt?
      which man think you who he knows
      “Which man do you think he knows?” (Fanselow and Ćavar 2001: 18)
10 The study consisted of four sessions, which were separated from each other by a period of three to four months. Not all children took part in all sessions, which can be seen in Table 5.1. There was some attrition, and any child who manifested adult knowledge of the constructions investigated three sessions in a row and was at least five years old as of the third session was not seen for a fourth session.
11 Cf. Chomsky (2005: 13) on why only one element is pronounced: “If language is optimized for satisfaction of interface conditions, with minimal computation, then only one will be spelled out, sharply reducing phonological computation”.
12 This is in particular true if one believes that the A/A-bar distinction should be eliminated.
13 Boeckx (2007) argues persuasively that movement to intermediate landing sites is not triggered by a feature (see also Bošković 2007), which I adopt here.
14 An interesting question arises concerning cases discussed by Chomsky (1971) where focus can be assigned to various constituents in a structure. An example is shown in (i), where capital letters indicate focus:
  (i) a. JOHN eats a banana.
      b. John eats a BANANA.
  As Chomsky discusses, this is related to the intonation contour. See Frascarelli (2000) for an analysis of cases like (ib). Thanks to Mark Baker (p.c.) for raising this point.
15 Grewendorf and Kremers (2009) argue that minimality does not hold on Chomsky’s (2008) assumptions. Since this potential problem is irrelevant for present purposes given that all the wh-phrases are identical, I set it aside.
16 An alternative is to say that D-linked wh-phrases have a different syntax, that is, that they involve a referentiality phrase that is located at the top of the tree (cf. Thornton 1995 and references therein). In the present context, one would have to say something like the following: Whether an embedded referential phrase can be pronounced or not is a matter of variation. This seems to be nothing but a redescription of the facts.

References

Aboh, E. O. and Dyakonova, M. 2009. Predicate doubling and parallel chains. Lingua 119: 1035–1065.
Ackema, P. 2001. Colliding complementizers in Dutch: Another OCP effect. Linguistic Inquiry 32: 717–727.
Baker, M. C. 2001. The Atoms of Language. New York: Basic Books.
Baker, M. C. 2008. The Syntax of Agreement and Concord. Cambridge: Cambridge University Press.
Barbiers, S., Koeneman, O. and Lekakou, M. 2009. Syntactic doubling and the structure of wh-chains. Journal of Linguistics 46: 1–46.
Benincà, P. and Poletto, C. 2005. On some descriptive generalizations in Romance. In The Oxford Handbook of Comparative Syntax, G. Cinque and R. S. Kayne (eds.), Oxford: Oxford University Press.
Berwick, R. C. and Chomsky, N. 2008. The Biolinguistic Program: The Current State of its Evolution and Development. Ms., MIT [Forthcoming in Biolinguistic Investigations, A-M. Di Sciullo and C. Aguero (eds.). Cambridge, MA: MIT Press].
Boeckx, C. 2007. Understanding Minimalist Syntax. Malden: Blackwell.
Boeckx, C. 2011. Approaching parameters from below. In Biolinguistics: Language Evolution and Variation, A-M. Di Sciullo and C. Boeckx (eds.), 205–221. Oxford: Oxford University Press.
Borer, H. 1984. Parametric Syntax. Dordrecht: Foris.
Bošković, Ž. 2001. On the Nature of the Syntax-Phonology Interface: Cliticization and Related Phenomena. London: Elsevier.
Bošković, Ž. 2007. On the locality and motivation of move and agree: An even more minimal theory. Linguistic Inquiry 38: 589–644.
Bošković, Ž. and Lasnik, H. 2003. On the distribution of null complementizers. Linguistic Inquiry 34: 527–546.
Bruening, B. 2006. Differences between the Wh-Scope-Marking and Wh Copy Constructions in Passamaquoddy. Linguistic Inquiry 37: 25–49.
Changeux, J-P. 1981. Genetic determinism and epigenesis of the neuronal network: Is there a biological compromise between Chomsky and Piaget? In Language and Learning: The Debate between Jean Piaget and Noam Chomsky, M. Piattelli-Palmarini (ed.), 184–202, Cambridge, MA: Harvard University Press.
Chomsky, N. 1971. Deep structure, surface structure, and semantic interpretation. In Semantics: An Interdisciplinary Reader in Philosophy, Linguistics, and Psychology, D. Steinberg and L. Jakobovits (eds.), 232–296, Cambridge: Cambridge University Press.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 2005. Three factors in language design. Linguistic Inquiry 36: 1–22.
Chomsky, N. 2008. On phases. In Foundational Issues in Linguistic Theory, R. Freidin, C. P. Otero and M. L. Zubizarreta (eds.), 133–166, Cambridge, MA: MIT Press.
Crain, S. and Thornton, R. 1998. Investigations in Universal Grammar: A Guide to Experiments on the Acquisition of Syntax and Semantics. Cambridge, MA: MIT Press.
De Villiers, J., Roeper, T. and Vainikka, A. 1990. The acquisition of long distance rules. In Language Processing and Language Acquisition, L. Frazier and J. de Villiers (eds.), 257–297, Dordrecht: Kluwer.
den Dikken, M. 2009. On the Nature and Distribution of Successive Cyclicity. Ms., The Graduate Center of the City University of New York.
du Plessis, H. 1977. Wh movement in Afrikaans. Linguistic Inquiry 8: 723–726.
Fanselow, G. and Ćavar, D. 2001. Remarks on the economy of pronunciation. In Competition in Syntax, G. Müller and W. Sternefeld (eds.), 107–150, Berlin: Mouton de Gruyter.
Felser, C. 2004. Wh-copying, phases, and successive cyclicity. Lingua 114: 543–574.
Frascarelli, M. 2000. The Syntax-Phonology Interface in Focus and Topic Constructions in Italian. Dordrecht: Kluwer.
Gallego, Á. J. 2008. The Second Factor and Phase Theory. Ms., Universitat Autònoma de Barcelona.
Grewendorf, G. and Kremers, J. 2009. Phases and cyclicity: Some problems with phase theory. The Linguistic Review 26: 385–430.
Grimshaw, J. 1997. The best clitic: Constraint conflict in morphosyntax. In Elements of Grammar, L. Haegeman (ed.), 169–196, Dordrecht: Kluwer.
Grohmann, K. K. 2008. Copy Modification and the Architecture of the Grammar. Paper presented at the LAGB, University of Essex, September 10–14.
Gutiérrez Mangado, M. J. 2006. Acquiring long-distance wh-questions in L1 Spanish. In The Acquisition of Syntax in Romance Languages, V. Torrens and L. Escobar (eds.), 251–287, Amsterdam: John Benjamins.
Halle, M. and Marantz, A. 1993. Distributed morphology and the pieces of inflection. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, K. Hale and S. J. Keyser (eds.), Cambridge, MA: MIT Press.
Holmberg, A. 2010. Parameters in minimalist theory: The case of Scandinavian. Theoretical Linguistics 36: 1–48.
Hornstein, N. 1995. Logical Form. Malden: Blackwell.
Hornstein, N. 2009. A Theory of Syntax. Cambridge: Cambridge University Press.
Jeong, Y. 2004. Children’s Question Formations from a Minimalist Perspective. Ms., University of Maryland.
Julien, M. 2007. Embedded V2 in Norwegian and Swedish. Working Papers in Scandinavian Syntax 80: 103–161.
Kandybowicz, J. 2008. The Grammar of Repetition: Nupe Grammar at the Syntax-Phonology Interface. Amsterdam: John Benjamins.
Kandybowicz, J. 2009. Externalization and emergence: On the status of parameters in the minimalist program. Biolinguistics 3: 93–88.
Kayne, R. S. 1984. Connectedness and Binary Branching. Dordrecht: Foris.
Kayne, R. S. 1994. The Antisymmetry of Syntax. Cambridge, MA: MIT Press.
Kayne, R. S. 2000. Parameters and Universals. Oxford: Oxford University Press.
Kayne, R. S. 2005. Movement and Silence. Oxford: Oxford University Press.
Koster, J. 2009. IM not Perfect: The Case Against Copying. Ms., University of Groningen.
Landau, I. 2006. Chain resolution in Hebrew V(P)-fronting. Syntax 9: 32–66.
Lasnik, H. 1981. Restricting the theory of transformations. In Explanations in Linguistics, D. Lightfoot and N. Hornstein (eds.), 152–173, London: Longmans.
Lasnik, H. and Lohndal, T. 2010. Government-binding/principles and parameters theory. Wiley Interdisciplinary Reviews: Cognitive Science 1: 40–50.
Lohndal, T. 2009. Comp-t effects: Variation in the position and features of C. Studia Linguistica 63: 204–232.
Martin, R. and Uriagereka, J. 2008. Uniformity and collapse. Paper presented at Ways of Structure Building, University of the Basque Country, November 13.
McDaniel, D. 1986. Conditions on Wh-Chains. Doctoral dissertation, City University of New York.
McDaniel, D., Chiu, B. and Maxfield, T. L. 1995. Parameters for Wh movement types: Evidence from child English. Natural Language and Linguistic Theory 13: 709–753.
Newmeyer, F. J. 2005. Possible and Probable Languages: A Generative Perspective on Linguistic Typology. Oxford: Oxford University Press.
Nunes, J. 2004. Linearization of Chains and Sideward Movement. Cambridge, MA: MIT Press.
Ott, D. 2009. Stylistic fronting as remnant movement. Working Papers in Scandinavian Syntax 83: 141–178.
Pesetsky, D. 1992. Zero Syntax, vol. 2. Ms., MIT.
Phillips, C. 1996. Order and Structure. Doctoral dissertation, MIT.
Poletto, C. and Pollock, J.-Y. 2009. Another look at wh-questions in Romance: The case of Mendrisiotto and its consequences for the analysis of French wh-in situ and embedded interrogatives. In Romance Languages and Linguistic Theory 2006, D. Torck and W. L. Wetzels (eds.), 199–258, Amsterdam: John Benjamins.
Rett, J. 2006. Pronominal vs. determiner wh-words: Evidence from the copy construction. In Empirical Issues in Syntax and Semantics 6, O. Bonami and P. Cabredo Hofherr (eds.), 355–375, Paris: Colloque de Syntaxe et Sémantique à Paris.
Richards, M. 2008. Two kinds of variation in a minimalist system. In Varieties of Competition, F. Heck, G. Müller and J. Trommer (eds.), 133–162. University of Leipzig: Linguistische Arbeitsberichte 87.
Richards, N. 2001. Movement in Language. Oxford: Oxford University Press.
Rizzi, L. 1990. Relativized Minimality. Cambridge, MA: MIT Press.
Rizzi, L. 1997. The fine structure of the left periphery. In Elements of Grammar: A Handbook of Generative Syntax, L. Haegeman (ed.), 281–337, Dordrecht: Kluwer.
Rizzi, L. 2005. Phase theory and the privilege of the root. In Organizing Grammar: Studies in Honor of Henk van Riemsdijk, H. Broekhuis, N. Corver, R. Huybregts, U. Kleinhenz and J. Koster (eds.), 529–537, Berlin: Mouton de Gruyter.
Roberts, I. and Holmberg, A. 2010. Introduction: Parameters in minimalist theory. In Null Subjects: The Structure of Parametric Variation, T. Biberauer, A. Holmberg, I. Roberts and M. Sheehan (eds.), 1–57, Cambridge: Cambridge University Press.
Stowell, T. 1981. Origins of Phrase Structure. Doctoral dissertation, MIT.
Stoyanova, M. 2008. Unique Focus: Languages Without Multiple wh-Questions. Amsterdam: John Benjamins.
Strik, N. 2006. L’acquisition des phrases interrogatives chez les enfants francophones. Psychologie Française 52: 27–39.
Thornton, R. 1990. Adventures in Long-Distance Moving: The Acquisition of Complex Wh-Questions. Doctoral dissertation, University of Connecticut.
Thornton, R. 1995. Referentiality and wh-movement in child English: Juvenile D-Linkuency. Language Acquisition 4: 139–175.
van Gelderen, E. 2011. The Linguistic Cycle: Language Change and the Language Faculty. Oxford: Oxford University Press.
van Kampen, J. 1997. First Steps in Wh-movement. Delft: Eburon.
van Kampen, J. 2010. The learnability of A-bar chains. In The Linguistic Enterprise: From Knowledge of Language to Knowledge in Linguistics, M. Everaert, T. Lentz, H. De Mulder, Ø. Nilsen and A. Zondervan (eds.), 115–140, Amsterdam: John Benjamins.
van Riemsdijk, H. 2008. Identity avoidance: OCP effects in Swiss relatives. In Foundational Issues in Linguistic Theory, R. Freidin, C. P. Otero and M. L. Zubizarreta (eds.), 227–250, Cambridge, MA: MIT Press.
Yang, C. 2002. Knowledge and Learning in Natural Languages. New York: Oxford University Press.

6 Sentential Subjects in English and Norwegian1

6.1 Introduction

Across languages, subjects tend to be nominal phrases. However, many languages also allow for what appear to be sentential subjects, that is, subjects that would ordinarily be analyzed as sentences. A couple of examples from English and Norwegian are provided in (1) and (2), respectively:2

(1) [That Mary is late] annoys John.

(2) [At Marie er sent ute,] irriterer Jon.
    that Mary is late out annoys John
    “That Mary is late annoys John.”

Several questions emerge based on data like (1) and (2): (a) Where are sentential subjects in the clausal structure; (b) what is the category of sentential subjects; (c) do sentential subjects have the same structural positions across languages, for example, in English and Norwegian? Questions (a) and (b) converge if one were to argue that only a specific category could serve as subjects; that is, if one argues that only nominal phrases can satisfy the subject requirement in English (see Chomsky 1981, Lasnik 1999, and Alexiadou and Anagnostopoulou 1998 for much discussion). In this chapter I focus on questions (a) and (c), discussing (b) only in passing.

Over the years, the status of sentential subjects has been debated. This chapter considers the status of sentential subjects in English and Norwegian. It reviews the literature on sentential subjects in English, demonstrating variation and gradience in judgments offered by native speakers. Essentially, the chapter argues that the variation among speakers suggests two possible analyses: For some speakers, sentential subjects are structurally subjects, whereas for other speakers sentential subjects are structurally topics. In contrast, in Norwegian, sentential subjects are structurally topics, as they cannot appear in the subject position if preceded by a verb and a nonsubject. That is, using Verb Second (V2) as a test, we will see that sentential subjects do not occupy the canonical subject position in Norwegian.

The chapter is organized as follows: Section 6.2 provides some relevant background discussion regarding clausal architecture and sentential subjects. Section 6.3 surveys the relevant literature regarding sentential subjects in English, presenting evidence for and against both the subject and the topic analysis. Section 6.4 presents new data from Norwegian, arguing that sentential subjects in Norwegian cannot sit in the canonical subject position.
Section 6.5 offers a brief general discussion before section 6.6 concludes the chapter.

6.2 Background

In early work in generative grammar, rules such as (3) contributed to building syntactic structure for sentences (Chomsky 1957: 26):

(3) Sentence → NP + VP

A standard syntactic representation for a sentence like (4) is provided in (5):

(4) Katie likes dogs.

(5) [Sentence [NP Katie] [VP [V likes] [NP dogs]]]

More recently, the structure is assumed to consist of three layers: a complementizer layer, an inflectional layer, and a lexical layer. The sentence in (4) will then have the structure in (6) (Chomsky 1986a), where the subject moves from its position in the VP to the inflectional domain (Koopman and Sportiche 1991).3

(6) [CP C [IP [NP Katie] I [VP ⟨Katie⟩ likes dogs]]]

The subject is a relational notion in this theory (Chomsky 1965), which is to say that it is defined structurally in the tree structure. In (6), the subject is said to occupy SpecIP. This position is often referred to as the canonical subject position (McCloskey 1997), that is, the position in which subjects occur most of the time. As the rule in (3) implies, the subject is generally assumed to be a nominal phrase (Emonds 1972, Chomsky 1973, Koster 1978, Stowell 1981, Grimshaw 1982, Iatridou and Embick 1997, Alexiadou and Anagnostopoulou 1998, Davies and Dubinsky 1998, 2009, Lasnik 1999, Hartman 2012, Lohndal 2012, Stowell 2013).

However, even though nominal phrases canonically occupy the subject position, there are instances where it looks like other phrasal categories appear in the subject position. (7) provides examples of PPs. This phenomenon is known as locative inversion.

(7) a. [Among the guests] was sitting my friend Rose.
    b. [In the corner] was a lamp.
    c. [Back to the village] came the tax collector. (Bresnan 1994: 75)

There are also other nonnominal phrases that may seem to occur in the subject position, such as the cases in (8):

(8) a. [Under the bed] is a good place to hide.
    b. [In August] is too late to have the party.
    c. [Cheat on my wife] is something I would never do.
    d. [Strong] is how I like my coffee.
    e. [Afraid of spiders] is what you are. (Hartman 2012: 32)

These have a far more limited distribution than sentential subjects: They generally occur with copula verbs. There are also instances of what appears to be a finite CP occupying the subject position. I label these CPs sentential subjects in what follows. Consider the following examples.4

(9) a. [That Mary left early] disappointed us.
    b. [That the Giants lost the World Series] really sucks.
    c. [That the Giants lost the World Series] surprised me.
    d. [That the Giants would lose] was expected (by most columnists). (Alrenga 2005: 177)

As (9d) illustrates, sentential subjects occur even in passive sentences. Sentential subjects can also be very long, as the following example from Miller (2001: 688) demonstrates (italics in the original):

(10) [. . .] But we must never forget, most of the appropriate heroes and their legends were created overnight, to answer immediate needs. [. . .] Most of the legends that are created to fan the fires of patriotism are essentially propagandistic and are not folk legends at all. [. . .] Naturally, such scholarly facts are of little concern to the man trying to make money or fan patriotism by means of folklore. That much of what he calls folklore is the result of beliefs carefully sown among the people with the conscious aim of producing a desired mass emotional reaction to a particular situation or set of situations is irrelevant (Brown F19 0490–0870).

Furthermore, sentential subjects do not have to be headed by the complementizer that: for, whether and a wh-phrase are allowed too, as shown in (11) through (13):

(11) [For the Giants to lose the World Series] would be terrible. (Alrenga 2005: 177)
(12) [Whether we do it now or later] is immaterial. (Huddleston 2002b: 977)
(13) [What a blunder it was] didn’t emerge till later. (Huddleston 2002b: 992)

It seems plausible to argue that these CPs are ordinary CPs, since they allow for a range of different elements to appear as the head (complementizer) of the finite sentence (CP). However, there are two important questions that we need to separate: (a) What is the phrasal nature of these subjects, and (b) what is the syntactic position of these subjects? Different scholars have taken a stand on these issues, as Table 6.1 illustrates. The table is based on Hartman (2012: 35), but he claims that he is not aware of proposals that say that sentential subjects are DPs and topics. As far as I can tell, that is exactly the proposal in Takahashi (2010), and by and large Moulton (2013), although the latter does not discuss the syntactic details of his analysis at length. There is a lot of cross-linguistic support for a DP shell analysis, that is, an analysis where there is a potentially null D head that embeds the CP of the sentential subject. We can illustrate it as follows:

(14) [IP [DP D [CP that Mary left early]] I [VP disappointed us]]

Table 6.1 The Analysis of Sentential Subjects

                 True CPs                          Actually DPs

True subjects    Consistent with the analyses in   Rosenbaum (1967), Davies and
                 Holmberg (2000), Bailyn (2004)    Dubinsky (1998, 2009), Han (2005)

Actually topics  Koster (1978), Alrenga (2005)     Takahashi (2010), Moulton (2013)

Both Takahashi (2010) and Hartman (2012) present a range of evidence in favor of such a structure. For example, Takahashi (2010: 353) points at the fact that many languages realize this determiner overtly. This is illustrated in (15) for Modern Greek; see Picallo (2002) for Spanish data and section 6.4 for Norwegian data.

(15) [DP to [CP oti ehis filus]] simeni pola (Modern Greek)
     the.nom that have.2sg friends.acc mean.3sg much
     “That you have friends means a lot.” (Roussou 1991: 78)

Since the DP surfaces in a range of languages, this shows that this is a possible structure for human language. Although it does not demonstrate conclusively that English has the same underlying structure, together with other evidence in favor of sentential subjects acting like nominal phrases (Alrenga 2005, Takahashi 2010, Hartman 2012, Moulton 2013), we can conclude that the DP shell analysis is plausible for English as well. In the rest of this chapter, I therefore assume the DP shell analysis of sentential subjects.

6.3 Sentential Subjects in English

There is a long-standing debate regarding sentential subjects and their syntactic position in English. Emonds (1976), Koster (1978), Stowell (1981), Safir (1985), Postal (1998), Haegeman and Guéron (1999), Adger (2003), Alrenga (2005), Takahashi (2010) and Moulton (2013) all argue that sentential subjects are topics and that something else occupies the canonical subject position SpecIP. On the other side, Rosenbaum (1967), Emonds (1972), Delahunty (1983), Miller (2001), Davies and Dubinsky (2009) and Hartman (2012) argue that sentential subjects are real subjects that sit in SpecIP. In this section, I review this issue. I go through a range of tests to try to determine where in the structure sentential subjects sit. Subsection 6.3.1 discusses subject–verb agreement. Subject–auxiliary inversion is the topic of subsection 6.3.2. Subsection 6.3.3 deals with whether or not sentential subjects pattern with topics. Subsection 6.3.4 discusses whether or not sentential subjects are an instance of a Main Clause Phenomenon. A summary is provided in subsection 6.3.5.

6.3.1 Subject–Verb Agreement

A typical characteristic of real subjects is that they trigger agreement on the verb:

(16) a. Mary likes/*like cookies.
     b. Travis and David live/*lives in Washington D.C.

Topics cannot do the same in English, as (17b) illustrates:

(17) a. John and Mary, Paul likes.
     b. *John and Mary, Paul like them.

Assuming that subject–verb agreement is a clue to identifying the subject of the sentence, we can note that sentential subjects do trigger subject–verb agreement:5

(18) a. [[That the march should go ahead] and [that it should be canceled]] have been argued by the same people at different times. (McCloskey 1991: 564)
     b. [[That he’ll resign] and [that he’ll stay in office]] seem at this point equally possible. (McCloskey 1991: 564)
     c. [[That the project has not been properly costed] and [that the manager is quite inexperienced]] are just two of my objections to your proposal. (Huddleston 2002b: 957)
     d. [That John is mean] is well-known.

Davies and Dubinsky (2009: 124), discussing the data in (18) showing that sentential subjects display subject–verb agreement, point out that all non-NP subjects exhibit properties similar to those of sentential subjects.

(19) a. [Under the bed] appears [to be a good place to hide].
     b. [Very tall] appears [to be just how he likes his bodyguards].

(20) a. Under the bed and in the fireplace are not the best (combination of) places to leave your toys. (Levine 1989: 1015)
     b. Very brawny and very studious are what Cindy aspires to be.

(21) a. Under the bed and in the closet equally reminded me of that game of hide-and-seek we played.
     b. Very tall and quaintly studious equally bring to mind my sixth-grade science teacher.

As these examples illustrate, PP and AP subjects also undergo obligatory raising; they can trigger verb agreement and license equally. However, an important question is whether these facts really show that sentential subjects sit in SpecIP. Given current theoretical tools such as agreement at a distance, or “Agree” as in Chomsky (2000, 2001), agreement does not generally tell us much about the position of a phrase (though see Polinsky and Potsdam 2001). Therefore, subject–verb agreement is not a reliable diagnostic when it comes to the position of sentential subjects. In the next subsection, I discuss subject–auxiliary inversion.

6.3.2 Subject–Auxiliary Inversion

Ordinary nominal subjects invert with the auxiliary in interrogatives.

(22) a. David likes pasta.
     b. Does David like pasta?

(23) a. Peter will read the book.
     b. What will Peter read?
     c. *What Peter will read?

Applying this test to sentential subjects, Koster (1978) presents data such as the following:

(24) a. *Did [that John showed up] please you?
     b. *What does [that he will come] prove? (Koster 1978: 53)

The following two additional examples are from Adger (2003: 299):

(25) a. *Did [that Medea killed her children] upset Jason?
     b. *Has [that we have arrived back at our starting point] proved that the world is round?

These examples illustrate a claim that has been repeated frequently in the literature, namely, that sentential subjects are incompatible with subject–auxiliary inversion. Delahunty (1983) takes issue with this claim and provides a series of what he claims are acceptable sentences. Consider the following data (Delahunty 1983: 387):

(26) a. Does [that Fred lied to them] bother all of the people who bought stock in his company?
     b. Does [that the world is round] bother as many people now as it did 500 years ago?
     c. Does [that quarks have wings] explain their odd behavior?
     d. Does [that quarks have wings] explain anything at all?

More recently, Hartman (2012: 77) provides the following judgments where the sentences are not fully acceptable:

(27) a. ?Does [that your brother earns more than you] bother you?
     b. ?Is [that I like you] so obvious?
     c. ?When did [that I earn more than you] become an issue?

These examples show that there are cases of subject–auxiliary inversion where a sentential subject appears to occupy SpecIP, assuming that the auxiliary moves to C or a low head in the C domain in a cartographic approach (Rizzi 1997). In addition, Delahunty (1983: 382–385) provides the following examples where a wh-item and an auxiliary precede a sentential subject:

(28) a. To what extent did [that Fred failed to show up] anger those of his devoted fans who had waited by the stage door since dawn of the previous day?
     b. Why does [that Fred wants to marry her] so upset Mary’s mother, father, brothers, sisters and four grandparents that they haven’t ceased to harangue her about it since they discovered the proposal?
     c. Who does [that Fred left early] bother so greatly that he refuses to visit us any more?
     d. Who does [that the world is ending] upset so terribly that they have decided to abandon the planet?
     e. To whom is [that quarks are green] so well known that he cannot conceive of people who have not heard of the notion?
     f. Amongst which people is [that the Earth was once flooded] so often recalled that they refuse to leave their mountain homes for fear they will be trapped in the lowlands if the flood should ever occur again?

From a contemporary perspective, one could argue that the wh-item sits in SpecCP and the auxiliary is in C, followed by a sentential subject in what is arguably SpecIP. A couple of additional examples that are claimed to be acceptable are provided in (29), from Davies and Dubinsky (2009: 115):

(29) a. To whom is [that pigs can fly] most surprising?
     b. Is [that I am done with this homework] really amazing?

Davies and Dubinsky (2009) add some parsing considerations in support of Delahunty’s analysis. They argue that prosody and phrasal weight play an important role: In Koster’s example (24a), the sentential subject is twice the length (in syllables) of the matrix predicate. In Delahunty’s example (28a), a six-word sentential subject is followed by a nineteen-word matrix predicate. They conclude that length issues are causing unacceptability in Koster’s examples. In addition, the complementizer that may also be misparsed as a demonstrative, requiring the parser to reanalyze the structure. For current purposes, the important point is that the grammar does not filter out the preceding data for Davies and Dubinsky; rather, other mechanisms come into play in determining acceptability and unacceptability.

A problem with this line of argumentation is that native speakers notoriously disagree about the preceding judgments. Some speakers agree with Delahunty’s data; others disagree. Despite asking more than ten speakers, I have not been able to establish any patterns. It may be a question of individual variation, and I doubt that a large-scale study will inform this question, since the variability will probably just be scaled up accordingly (see Phillips 2010). We might, then, be dealing with two different grammars among native speakers of English: one that allows sentential subjects in SpecIP and one that treats them as topics. The fact that there are speakers who conform to the patterns established for each of the analyses demonstrates that both analyses exist. From the point of view of figuring out what a possible I-language is (Chomsky 1986b), that is the more interesting question. In the next subsection, I look at a major topic when it comes to sentential subjects, whether or not they should be analyzed as topics structurally speaking.

6.3.3 Sentential Subjects as Topics

Sentential subjects are not prototypical subjects. This is reflected in the following two quotes:

(30) “Subordinate clauses can also function as subject, as in That he was guilty was obvious to everyone; such subjects are, however, non-prototypical, as is reflected in the existence of a more frequent (non-canonical) alternant in which the subject function is assumed by the dummy NP it and the subordinate clause is extraposed: It was obvious to everyone that he was guilty. Other categories appear as subject under very restrictive conditions” (Huddleston 2002a: 236).
(31) “Nevertheless, clauses have enough of the distinctive subject properties to make their analysis as subject unproblematic” (Huddleston 2002b: 957).

The latter quote indicates that sentential subjects in English sit in the canonical subject position. However, as Koster (1978) observes, sentential subjects have a more restricted distribution than nominal subjects (cf. Ross 1967, Emonds 1972, 1976, Hooper and Thompson 1973, Kuno 1973). We will now look at some of the facts that have been used to make this claim and, in addition, consider whether or not sentential subjects are structurally topics. Let us first compare nominal subjects to sentential subjects. The following examples from Alrenga (2005: 177) demonstrate an important asymmetry:

(32) a. *John, that the Giants lost the World Series shouldn’t have bothered. b. John, the story shouldn’t have bothered.

The examples in (32) show that, at the root level, nominal subjects can appear after sentence-initial topics: the story can appear after John, but not that the Giants lost the World Series. As we saw in section 6.3.2, a similar asymmetry has been argued to hold for subject–auxiliary inversion too, though the issue is complicated because of variation in judgments among native speakers. Koster also points at parallels between sentence-initial topics and sentential subjects. The following examples from Alrenga (2005: 177–179) demonstrate that topic phrases and sentential subjects cannot occur after other topic phrases:6

(33) a. *John, the book, I gave to. b. *John, that the Giants lost the World Series shouldn’t have bothered.

In (33), John is the topic phrase. However, Kuno (1973: 368, fn. 5) presents the following example:

(34) To me, [that the world is round] is obvious.

Whether to me is a free adjunct or a fronted complement (see Miller 2001: 696–697 for discussion), it appears in front of the sentential subject. Another example is the following:

(35) Descartes claimed that the two lines in figure C were parallel and provided a proof based on his second theorem. This proof was in fact mistaken. From his first theorem, on the other hand, [that two lines are parallel] certainly does follow, but remarkably, Descartes apparently never noticed this. (Miller 2001: 697)

Miller (2001) argues that discourse conditions determine whether sentential subjects are available in these cases. The following examples illustrate that fronting of one PP or a sentential subject is possible, but fronting of both is not possible. The examples are from Miller (2001: 697).

(36) a. Through a detailed observation of gulls, Lorenz thought he had shown that the image of the mother was acquired. This conclusion turned out to be based on a series of misinterpretations. *On the other hand, from his observations of ducklings, that the image of the mother is innate, we have since learned, though Lorenz himself never noticed this. b. Through a detailed observation of gulls, Lorenz thought he had shown that the image of the mother was acquired. This conclusion turned out to be based on a series of misinterpretations. On the other hand, from his observations of ducklings, we have since learned that the image of the mother is innate, though Lorenz himself never noticed this. c. Through a detailed observation of gulls, Lorenz thought he had shown that the image of the mother was acquired. This conclusion turned out to be based on a series of misinterpretations. On the other hand, that the image of the mother is innate, we have since learned from his observations of ducklings, though Lorenz himself never noticed this.

This can be explained, Miller argues, if sentential subjects actually are real subjects and not structurally topics. More evidence supporting this conclusion comes from Davies and Dubinsky (2009: 122). They start with Koster’s (1978) sentence:

(37) *Such things, that he reads so much doesn’t prove.

Then they provide the following examples illustrating that sentential subjects are not responsible for the incompatibility with topics:

(38) a. *Such things, the fact that he reads so much doesn’t prove. b. *Such things, it doesn’t prove that he reads so much.

Both of these examples contain nominal phrases. Davies and Dubinsky (2009) instead advance a parsing explanation; see their paper for details. Let us look at more data which are problematic for the view that assimilates sentential subjects and topics. Delahunty (1983: 384–385) points out that topics and sentential subjects differ in important ways: Wh-movement to the right of a topic is possible but not to the left. The pattern is the opposite for sentential subjects. The following examples illustrate this:

(39) a. To Bill, what will you give for Christmas? b. And to Cynthia, what do you think you will send?

In these examples, there is a topic in the left periphery and a wh-item to the right of the topic. The wh-item cannot occur to the left of the topic:

(40) a. *On which shelf, the pots will you put? b. *For whom, a fur coat will you buy?

For sentential subjects, Delahunty argues that the pattern is the opposite, and he provides the following data:

(41) a. *[That Fred always leaves early], who does bother? b. *[That the Earth is coming to an end], who does upset? (42) a. Who does [that Fred left early] bother so greatly that he refuses to visit us any more? b. Who does [that the world is ending] upset so terribly that they have decided to abandon the planet? c. To whom is [that quarks are green] so well known that he cannot conceive of people who have not heard of the notion?

The data involving subject–auxiliary inversion are also discussed in subsection 6.3.2, but for speakers who accept these data, it is clear that topics and sentential subjects do not occupy the same structural position. A further argument provided by Delahunty (1983) is the following. Topics may be moved to a clause-internal topic position:

(43) a. Bill says that he will give a raise to Fred. b. Bill says that to Fred he will give a raise.

However, a phrase cannot be topicalized in an infinitival sentence:

(44) a. Bill wants to give a raise to Fred. b. *Bill wants to Fred to give a raise.

Importantly, if sentential subjects are topics, we would not expect them to be possible in an internal position, that is, inside the infinitival clause. This prediction is not borne out:

(45) Bill wants [that Fred lied] to be obvious to everyone. (Delahunty 1983: 389)

For that reason, Delahunty concludes that sentential subjects are not topics but, rather, regular subjects. Another argument is presented in Hartman (2012), drawing on Lasnik and Saito (1992). Lasnik and Saito observe that topicalization of root subjects is ruled out by the Empty Category Principle (Chomsky 1981, 1986a). They claim that this is verified by the following contrast (Lasnik and Saito 1992: 110–111):

(46) a. *John thinks that Mary likes himself. b. John thinks that himself, Mary likes. (47) a. *John thinks that himself likes Mary. b. *John thinks that himself, likes Mary.

In (46b), the anaphor himself is topicalized and moved to a position in which John can bind the anaphor. Lasnik and Saito argue that if vacuous movement of a subject is possible, this should achieve the same effect as in (46b). (47b) shows that this is not the case. Hartman argues that there therefore cannot be topicalization of sentential subjects either. This review of sentential subjects and their possible status as topics has demonstrated that the arguments in favor of these subjects being topics are not particularly strong. The next argument I look at is partly more theory-internal: I show that if one adopts a topic analysis of the kind Koster (1978) suggested, it is easy to capture a range of data. Koster gave the following analysis of sentential subjects, where I am using the updated structure in Alrenga (2005: 180).

Koster dubs sentential subjects “satellites” because they are outside the sentence proper. These subjects are linked to the subject position by way of a silent nominal phrase which, in current terminology, has moved to SpecCP (cf. Chomsky 1977 on topic constructions, of which Koster argues sentential subjects are an instance).7 Alrenga (2005) provides an updated and extended analysis of Koster (1978). Alrenga’s analysis says that sentential subjects are only possible when a verb subcategorizes for a DP. This relies on an important generalization from Webelhuth (1992: 94).

(49) The Sentence Trace Universal
Sentences can only bind DP-traces, that is, traces with the categorial specification [+N, −V].

This is in part necessary in order to account for the following asymmetry (Alrenga 2005: 175–176), which was in part already noted by van Gelderen (1985: 139; see also Webelhuth 1992: 95–96):

(50) a. It really {sucks/blows/bites/stinks} that the Giants lost the World Series. b. That the Giants lost the World Series really {sucks/blows/bites/stinks}. (51) a. It {seems/happens/appears/turns out} that the Giants lost the World Series. b. *That the Giants lost the World Series {seems/happens/appears/turns out}.

Alrenga (2005: 197) also notes these data:

(52) a. *{This/the Giants’ loss} (really) seems. b. {This/the Giants’ loss} (really) sucks.

The account offered by Alrenga works as follows: The verb seem only subcategorizes for a CP complement. This makes it impossible for a null DP to be base generated as a complement and then raise to SpecIP. Since sentential subjects must be linked to a null DP, seem cannot have a sentential subject. For suck the situation is different: This verb subcategorizes for a DP as well, which makes it possible to link a null DP to the sentential subject. Given the similarity between sentential subjects and topics, Koster (1978), Alrenga (2005: 182) and Moulton (2013) equate these structures in a way where both have roughly the following representations:8

(53) a. [[That he is silly] Op [IP John knows tOP]].

b. [[That he is silly] Op [IP tOP is well known]].

In both cases, there is operator movement to SpecCP. Alrenga points out that seem can occur with a sentential subject in raising constructions:

(54) a. That the Giants lost the World Series seemed to bother him. b. That the Giants would lose the World Series seemed obvious.

These are not counterexamples to Alrenga’s analysis: “In these examples, the null DP argument is base generated within the infinitival or small clause complement of seem; it then raises out of this complement to the matrix Spec, IP position and finally moves to an A’-position” (Alrenga 2005: 197). If true, this predicts that if a DP cannot be base generated, the sentence should be bad. The following data confirm this prediction:

(55) That the Giants would win the World Series seems to have been {hoped *(for)/felt/wished*(for)/insisted/*reasoned} (by most baseball fans).

The analysis extends to the following case as well, not discussed in the literature:9

(56) a. That the Giants lost the World Series seems unlikely. b. *That the Giants lost the World Series seems.

As such, this analysis covers a range of facts. The topic analysis may also have another virtue. One reason why sentential subjects have been treated as topics in the literature is that these subjects appear to be “topical”, pragmatically speaking.10 However, note that this is the case also for regular nominal subjects.

(57) Travis likes pasta.

In (57), Travis is the topic. This is very typical: “[. . .] the correlation between topic and subject is extremely strong on the level of discourse and has important grammatical consequences, in English as well as in other languages” (Lambrecht 1994: 131).11 Lambrecht further argues that subjects are the unmarked topics. Reinhart (1981) argues that topichood is a pragmatic notion and that it cannot be accounted for solely by way of syntactic position (see also Gundel 1988). Given that no one argues that Travis in (57) sits in a topic position in the syntax, it is not entirely clear that pragmatic topichood as such is an argument in favor of the syntactic topic analysis.

6.3.4 Sentential Subjects and Embedded Clauses

In this subsection, I discuss the claim that sentential subjects cannot generally occur in embedded clauses (Alrenga 2005), even for speakers who allow sentential subjects in SpecIP in main clauses.12 We will see that the situation is rather complicated and that a more extensive investigation of this issue is in order. To begin with, it is well-known that nominal subjects can easily occur as subjects of embedded sentences:

(58) a. John knows that [Amy] will leave late. b. Sue fears that [all the students] will fail the exam.

Koster (1978) already pointed out that sentential subjects are much less acceptable in the subject position of embedded clauses than in main clauses.13

(59) a. ?*Mary is unhappy because for her to travel to Tahiti is no longer necessary.14 b. Mary is unhappy because her trip to Tahiti is no longer necessary. (60) a. ?*That for us to smoke would bother her, I didn’t expect. b. That our smoking would bother her, I didn’t expect.

Alrenga (2005: 178) notes significant lexical sensitivity (see also Hooper and Thompson 1973, Kuno 1973):

(61) a. I {think/said/believe} that for us to smoke really bothers her. b. ?*I regret that for us to smoke bothers her so much. c. ?*Mary wishes that for us to smoke bothered her more than it did.

Alrenga (2005: 194) argues that bridge verbs have CP recursion and thus enough structure to host sentential subjects (as topic), whereas other verbs do not have enough projections. This is also consistent with the Penthouse Principle (Ross 1973), which says that more syntactic operations are allowed in main clauses than in embedded clauses, on the assumption that topics are generally licensed in main clauses. Sentential subjects cannot appear as subjects of infinitival complements:15

(62) a. John believes [IP that to be obvious].

b. That John believes [IP t to be obvious]. (63) *I {planned/intended/expected/hoped/prayed} for that the cult members cloned a human baby to be discovered. (Alrenga 2005: 178)

Takahashi (2010: 360) argues that sentential subjects have to move to the specifier of a topic phrase and that a silent determiner, which sits on top of the sentential subject, requires the topic projection to be present. This is ensured by using features, which I won’t go into here. Alrenga (2005: 195) discusses the absence of sentential subjects in the subject position (SpecIP) of an Exceptional Case Marking (ECM) structure. He argues that since clauses that allow sentential subjects are CPs, a sentential subject is not licit in an ECM context. As noted in Webelhuth (1992: 101), movement of the sentential subject fixes the problem. Compare (64) with (62):

(64) a. *John believes [IP [that Bill is sick] to be obvious].

b. [That Bill is sick] John believes [IP t to be obvious].

Again, this is in accordance with what the Penthouse Principle predicts. However, there are some problematic examples. Kastner (2013: 32) provides the following examples (from Haegeman 2010):

(65) a. I found [that no one left such a boring party early] remarkable. b. I thought [that no one would leave such a boring party early] unlikely.

Some speakers do not like these examples, but they seem to improve if certain small changes are made. The following examples were accepted by two informants:

(66) a. I thought [that no one would leave such an entertaining party early] to be unlikely. b. I thought [that no one would leave such an entertaining party early] unlikely to have happened. c. I thought [that no one would leave such an entertaining party early] very unlikely.

Note also that the examples in (65), as well as (66b) and (66c), are small clauses, lacking to be. This may point at a difference between small clauses and ECM infinitives. A more extensive investigation would be required in order to determine whether or not sentential subjects can occur in small clauses but not in ECM infinitives. Last, Zaenen and Pinkham (1976) and Iwakura (1976) observe that embedded sentential subjects block A-bar movement from within their c-command domain. The following examples are taken from Alrenga (2005: 191):

(67) a. John said that {this/for you to stop smoking} would please Sandy. b. I wonder who {this/*for you to stop smoking} would please. (68) a. I can’t think of anyone that {this/*for you to stop smoking} would please. b. Who did you expect John to say that {this/*for you to stop smoking} would please? (69) a. John thinks that {this/*for her to say such things} shows that Kim wants a raise. b. What does John think that {this/?*for her to say such things} shows that Kim wants?

Alrenga argues that the failure of A-bar movement is due to the additional phrasal projections that host topics. That is, since the sentential subject is in an A-bar position, the wh-phrase would have to move across another A-bar expression, essentially creating an island configuration. For speakers who accept the Delahunty (1983: 385) data discussed earlier, A-bar movement across sentential subjects is acceptable in main clauses:

(70) a. Who does that Fred left early bother so greatly that he refused to visit us any more? b. Who does that the world is ending upset so terribly that they have decided to abandon the planet?

For these speakers, it seems that there is an asymmetry between main clauses and embedded clauses when it comes to the availability of A-bar movement, in line with the Penthouse Principle. For speakers who do not accept the data in (70), the analyses proposed by Alrenga and Takahashi work. All of the preceding examples involve the complementizer for. Kastner (2013: 32) argues that we should also look at examples containing that. He provides the following two sentences as examples:

(71) a. I wonder who [that the Mayor resigned his post yesterday] surprised. b. I wonder who [that the Mayor resigned his post] caught off guard.

He provides these sentences with two question marks. Some of my informants find both of them bad, whereas others find that, although they are not acceptable, they are a bit better than (67b). This suggests that A-bar movement across sentential subjects is not accepted in general, except in matrix clauses for speakers who allow the sentential subject to appear in subject position. In summary, it is not clear that sentential subjects can only appear in main clause environments. Several sentential subjects do occur in embedded clauses, though this is contingent on the matrix predicate. As for ECM environments, again verbs seem to differ, though a more complete investigation of ECM verbs is in order. The unavailability of A-bar movement across sentential subjects in embedded environments, due to a sentential subject in an A-bar position, is the strongest argument in favor of sentential subjects being an instance of Main Clause Phenomena (MCP).

6.3.5 Summary

This section has discussed the status of sentential subjects in English, applying a range of tests. Table 6.2 summarizes the tests. As for the last two tests, the overall evidence suggests that the comparison with topics favors treating sentential subjects as syntactic subjects, whereas the topic analysis more easily accounts for the lack of sentential subjects in embedded environments. I have argued that there is interspeaker variation as to whether the subject can occur in the canonical subject position or in a topic position in the left periphery of the clause. This variation is especially pronounced when it comes to subject–auxiliary inversion and embedded environments.

6.4 Sentential Subjects in Norwegian

In this section, I want to discuss sentential subjects in Norwegian.16 I argue that the data show that sentential subjects are not allowed to occur in SpecIP in Norwegian, but that they are amenable to a topic analysis. Norwegian is a Verb Second (V2) language. I assume that subjects sit in SpecCP in subject-initial clauses (Schwartz and Vikner 1989, 1996; pace Travis 1984). This means that a sentence with a sentential subject occurring in subject position will, by default, be consistent with a topic analysis as proposed by Koster, Alrenga, and others. A typical example of such a sentence with a sentential subject is given in (72):

(72) At han kom så sent, ødela festen. that he came so late ruined the.party “That he arrived so late ruined the party.”

Table 6.2 Summary of Tests of Sentential Subjects

Test                               Syntactically subject   Syntactically topic

Subject–verb agreement                       √                      √
Subject–auxiliary inversion                  √                      √
Comparison with syntactic topics             √
Embedded clauses                                                    √

However, given that this movement would be string-vacuous, we do not have empirical evidence regarding the structural position of the sentential subject. We only have theoretical arguments involving subject-initial clauses in V2 languages. It is therefore important to develop other tests to determine where exactly a sentential subject sits in Norwegian. As Koster (1978) observes for Dutch, it is possible to insert an expletive in these cases, showing that the sentential subject occurs in SpecCP. The Dutch example is provided in (73) (Koster 1978: 59) and a Norwegian example in (74):

(73)  Dat hij komt (dat) is duidelijk. that he comes (that) is clear “That he will come is clear.” (74) At han vil komme, (det) er klart. that he will come, (that) is clear “That he will come is clear.”

In sentences with ordinary nominal subjects, the V2 property ensures that the subject has to appear in SpecIP when a nonsubject is located in SpecCP. In (75), the direct object has been moved to SpecCP:

(75) Bøkene leser John hver dag. the.books reads John every day “The books, John reads every day.”

This sentence has the following standard syntactic analysis (cf., e.g., Eide and Åfarli 2003), again setting aside further decompositions (cf. note 3). The V2 property can thus be used to test whether sentential subjects can occur in SpecIP in Norwegian. If a nonsubject occurs in SpecCP, the sentential subject should follow the verb and thereby sit in SpecIP. Let us first use a nominal subject as a baseline:

(77) a. John overrasket alle deltakerne i fjor. baseline John surprised all the.participants in last.year “John surprised all the participants last year.” b. Alle deltakerne overrasket John i fjor. fronted object all the.participants surprised John in last.year “All the participants, John surprised last year.”17 c. I fjor overrasket John alle deltakerne. fronted PP in last.year surprised John all the.participants “Last year, John surprised all the participants.”

We can now insert a sentential subject in the same position as John. The examples and judgments are as follows.

(78) a. At John vant prisen, overrasket alle deltakerne i fjor. that John won the.prize surprised all the.participants in last.year “That John won the prize surprised all the participants last year.” b. *Alle deltakerne overrasket at John vant prisen, i fjor. all the.participants surprised that John won the.prize in last.year Intended: “It surprised all the participants that John won the prize last year.” c. *I fjor overrasket at John vant prisen, alle deltakerne. in last.year surprised that John won the.prize all the.participants Intended: “Last year, it surprised all the participants that John won the prize.”

We can also note that wh-movement across the sentential subject is not possible:

(79) *Hvem overrasket at John vant prisen, i fjor? who surprised that John won the.prize in last.year Intended: “Who did it surprise that John won the prize last year.”

All the unacceptable examples can be “rescued” if a nominal determiner det “it” is inserted so that the that-clause modifies this determiner (Faarlund, Lie and Vannebo 1997: 678):

(80) a. Det at John vant prisen, overrasket alle deltakerne i fjor. it that John won the.prize surprised all the.participants in last.year “The fact that John won the prize surprised all the participants last year.” b. Alle deltakerne overrasket det at John vant prisen, i fjor. all the.participants surprised it that John won the.prize in last.year “The fact that John won the prize last year surprised all the participants.” c. I fjor overrasket det at John vant prisen, alle deltakerne. in last.year surprised it that John won the.prize all the.participants “Last year, the fact that John won the prize surprised all the participants.” d. Hvem overrasket det at John vant prisen, i fjor? who surprised it that John won the.prize in last.year “Who did the fact that John won the prize last year surprise?”

This is similar to the effect we get in English if a that-clause is embedded within a the fact-phrase (see Haegeman and Ürögdi 2010 for discussion of how to analyze the fact that-phrases):

(81) a. *Did [that John showed up] please you? b. Did [the fact [that John showed up]] please you?

These data from both Norwegian and English show that there are distributional differences between subjects that are clearly nominal and sentential subjects.18 As for the availability of sentential subjects in embedded clauses in Norwegian, they are generally not available, even with bridge verbs:19

(82) a. ?? Jeg tror [CP [at John kom for sent] vil irritere mange]. I think that John came too late will annoy many “I think that the fact that John arrived too late will annoy many.”

b. ?? Peter forteller [CP [at John kom for sent] vil irritere mange]. Peter says that John came too late will annoy many “Peter says that the fact that John came too late will annoy many.”

If the sentential subject is embedded within a nominal phrase, the structures become acceptable.

(83) a. Jeg tror [CP (at) [det [at John kom for sent]] vil irritere mange]. I think (that) it that John came too late will annoy many “I think that the fact that John arrived too late will annoy many.”

b. Peter forteller [CP at [det at John kom for sent]] vil irritere mange]. Peter says that it that John came too late will annoy many “Peter says that the fact that John came too late will annoy many.”

These data confirm that sentential subjects cannot occur in the canonical subject position SpecIP in Norwegian. I have not detected the same kind of speaker variability in Norwegian as I detected in English, which indicates that sentential subjects can never sit in SpecIP for native speakers of Norwegian. Rather, they appear to be topics, sitting in SpecCP or in a dedicated topic phrase (cf. Rizzi 1997).

6.5 General Discussion

English and Norwegian are different. In Norwegian, sentential subjects are structural topics sitting in a topic phrase in the left periphery of the clause. In English, they can either be structural topics or structural subjects. I now discuss a couple of more general issues concerning sentential subjects. If sentential subjects are embedded within a covert determiner, as assumed in this chapter, why are there distributional differences between sentential subjects and ordinary noun phrases? That is, we have seen contrasts such as the following for English (84) and Norwegian (85):

(84) a. *Did [that John showed up] please you? b. Did [the fact [that John showed up]] please you?

(85) a. ??Jeg tror [CP [at John kom for sent] vil irritere mange]. I think that John came too late will annoy many “I think that the fact that John arrived too late will annoy many.”

b. Jeg tror [CP (at) [det [at John kom for sent]] vil irritere mange]. I think (that) it that John came too late will annoy many “I think that the fact that John arrived too late will annoy many.”

If the a-examples in (84) and (85) contain a covert determiner, why are these examples unacceptable? There has to be a difference between real nominal phrases and sentential subjects. Note also that although a DP subject can appear inside an initial CP, an initial CP cannot appear inside an initial CP (Adger 2003: 299):

(86) a. [That [the answer] is obvious] upset Hermes. b. *[That [that the world is round] is obvious] upset Hermes. c. *[That [whether the world is round] is unknown] bothered Athena.

Again, this points at a difference between “normal” nominal subjects and sentential subjects. Takahashi (2010) suggests a feature-based analysis whereby the silent determiner is only licensed by a topic head. The answer could also be more semantic in nature, viz. the proposal in Moulton (2013). Since the question of what category sentential subjects are has not occupied us in this chapter, I will not discuss these alternatives further. Even though many analyses claim that sentential subjects cannot move, it is not the case that constituents of category CP cannot move in general. Moulton (2013) cites Stowell (1987), who shows that the clausal pro-form so is one item that seems to move. This movement occurs even with verbs that do not select a DP, such as seem:

(87) a. It seems so. b. *That seems. c. So it seems.

Moulton claims that the correct generalization is that CPs with internal structure do not move. Although he hints at the presence of a complementizer, he does not present a way to implement this generalization. In the current chapter, I have not said much about whether sentential subjects move or not. The structural topic analysis claims that they are base-generated in the topic position, whereas the structural subject analysis argues that the subject does move. However, both analyses have a constituent moving from within the verbal domain to the canonical subject position SpecIP: either the sentential subject itself or an empty category. Reconstruction data provided in Moulton (2013) also show that this kind of movement is required. In sum, sentential subjects are most likely of category D, movement is definitely involved, and there is cross-linguistic evidence bearing on the syntactic position of sentential subjects. I have also argued that speakers of English differ in terms of where the sentential subject is structurally located.

6.6 Conclusion

This chapter has discussed the structural position of sentential subjects in English and Norwegian. It has been assumed that sentential subjects are introduced by a DP shell; that is, that they have a nominal property. The chapter argues that there is variation among English speakers: the sentential subject sits in the canonical subject position for some speakers, whereas it sits in a topic position in the left periphery of the clause for other speakers. In Norwegian, sentential subjects cannot sit in the canonical subject position, something that was tested using the V2 property of Norwegian. Thus sentential subjects occupy different positions across languages, and only in-depth analyses of each language can reveal what the structural position is.

Notes

1 I am grateful to Artemis Alexiadou, Brad Larson, Ian Roberts, Bridget Samuels, and audiences in Tromsø and at WCCFL 2013 for valuable comments on this material. Special thanks go to Elly van Gelderen, Liliane Haegeman, Hans Petter Helland, and two anonymous reviewers for their feedback.
2 Norwegian punctuation requires a comma after a sentence-initial finite embedded clause. I have chosen to adhere to this rule throughout the chapter.
3 Since I am concerned with the final landing site of the subject, I am setting aside more recent developments where there is a functional vP-layer between IP and VP (cf. Chomsky 1995, Kratzer 1996, and many others).
4 I do not discuss the relationship between sentential subjects and expletive constructions: (i) a. That Mary left early disappointed us. b. It disappointed us that Mary left early. See Stroik (1996) and Miller (2001) for a relevant discussion.
5 Davies and Dubinsky (2009) argue that a further indication that these are subjects is provided by the fact that whereas conjoined CPs in subject position can license equally (18b), conjoined CPs in nonsubject position cannot: (i) Dale thought that Dana left and that Terry wouldn’t come (*equally) (Davies and Dubinsky 2009: 124) The problem with this reasoning is that “equally ADV” seems to be licensed also by conjoined object clauses, as shown in the following example provided by an anonymous reviewer: (ii) Dale believed that Dana left and that Terry wouldn’t come equally strongly. Thus, this may not be a good test.
6 In (33a), the original sentence is (i). (i) I gave the book to John. In Alrenga’s example, the preposition is stranded. However, the example is equally bad without stranding: (ii) *To John, the book I gave.
7 Recent work has illustrated that moved CPs display connectivity effects (see especially Moulton 2013). Consider the following example:

(i) [That a student from hisi class cheated on the exam] doesn’t seem to [any professor]i to be captured by this document. (Takahashi 2010: 350) For reasons of space, I cannot discuss this here, but see Ott (2014: fn. 32) for an alternative that is compatible with the present approach.
8 The base position of the null DP argument as a complement of the verb has been ignored in these representations.
9 Stowell (1981: 165) discusses related cases involving raising adjectives: (i) a. [That John likes Susan] is certain. b. [That the war is over] is hardly likely.
10 Miller (2001) argues that the sentential subject has to be discourse-old and that this is the relevant pragmatic notion. See his paper for arguments in favor of this claim.
11 Hartman (2012: 73–74) argues that sentential subjects lack the information-structural properties that topic phrases have. He provides the following data, showing that if the discourse requirement on topic phrases is not met, topic phrases are not licensed. In contrast, sentential subjects are. (i) a. A: Have you ever been to Paris? b. B: Paris, I visited last year. (ii) a. A: What did you do last year? b. B: #Paris, I visited last year. (iii) a. A: What’s bothering you? b. B: That John’s not here is bothering me. When checking these judgments, there are speakers who do not find (iii) well-formed, although some agree with Hartman. Furthermore, Hartman would predict that the pattern would be the same as in (ii) for a sentential subject embedded in a ‘the fact’-phrase: (iv) a. A: What’s bothering you? b. B: The fact that John’s not here is bothering me. As the data show, (iv) is fine, contrary to (ii). Arguably this shows that the issue is more complicated and that the argument that sentential subjects lack the relevant information-structural properties is not entirely watertight.
12 There has been a lot of work on Main Clause Phenomena in recent years. 
See Heycock (2006), Haegeman and Ürögdi (2010), Aelbrecht, Haegeman and Nye (2012) and Haegeman (2012) for discussion.
13 Koster also argues that neither topics nor sentential subjects appear in embedded clauses. However, this issue is complicated and there are counterexamples. Cf., among others, Authier (1992) and Bianchi and Frascarelli (2010). See also the following discussion.
14 Kastner (2013) claims that five native speakers judged this sentence as “not all that bad”. Some of my informants also find this sentence acceptable.
15 This is arguably why sentential subjects also cannot occur in for . . . to . . . constructions in English: (i) a. He arranged for her to leave early. b. *He arranged for [that she could leave early] to be easy. Since sentential subjects are bad in embedded environments, we expect them to be bad in these cases as well.
16 The author is a native speaker of Norwegian. All examples have been checked with at least two other native speakers.
17 This example requires that alle deltakerne (all the participants) is focused in order not to yield the interpretation that it was the participants who surprised John last year.
18 Kastner (2013) shows that Hebrew allows the patterns that Norwegian disallows. In that sense, Hebrew may be more similar to one of the English varieties discussed in this chapter.
19 “??” indicates 2 on a scale from 1 to 5, where 1 is unacceptable and 5 is acceptable.

References
Adger, D. 2003. Core Syntax. Oxford: Oxford University Press.
Aelbrecht, L., Haegeman, L. and Nye, R. (eds.) 2012. Main Clause Phenomena: New Horizons. Amsterdam: John Benjamins.
Alexiadou, A. and Anagnostopoulou, E. 1998. Parametrizing Agr: Word order, verb-movement and EPP-checking. Natural Language & Linguistic Theory 16: 491–539.
Alrenga, P. 2005. A sentential subject asymmetry in English and its implications for complement selection. Syntax 8: 175–207.
Authier, J-M. 1992. Iterated CPs and embedded topicalization. Linguistic Inquiry 23: 329–336.
Bailyn, J. F. 2004. Generalized inversion. Natural Language and Linguistic Theory 22: 1–49.
Bianchi, V. and Frascarelli, M. 2010. Is topic a root phenomenon? Iberia 2: 43–88.
Bresnan, J. 1994. Locative inversion and the architecture of universal grammar. Language 70: 72–131.
Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1973. Conditions on transformations. In A Festschrift for Morris Halle, S. R. Anderson and P. Kiparsky (eds.), 232–286. New York: Holt, Rinehart and Winston.
Chomsky, N. 1977. On WH-movement. In Formal Syntax, P. Culicover, T. Wasow and A. Akmajian (eds.), 71–132. New York: Academic Press.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1986a. Barriers. Cambridge, MA: MIT Press.
Chomsky, N. 1986b. Knowledge of Language: Its Nature, Origin, and Use. New York: Praeger.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 2000. Minimalist inquiries: The framework. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, R. Martin, D. Michaels and J. Uriagereka (eds.), 89–155. Cambridge, MA: MIT Press.
Chomsky, N. 2001. Derivation by phase. In Ken Hale: A Life in Language, M. Kenstowicz (ed.), 1–52. Cambridge, MA: MIT Press.
Davies, W. D. and Dubinsky, S. 1998. 
Sentential subjects as complex NPs: New reasons for an old account of subjacency. In Proceedings of the Chicago Linguistic Society 34. Part 1: Papers from the Main Session, M. C. Gruber, D. Higgins, K. S. Olson and T. Wysocki (eds.), 83–94. Chicago, IL: Chicago Linguistics Society, University of Chicago.
Davies, W. D. and Dubinsky, S. 2009. On the existence (and distribution) of sentential subjects. In Hypothesis A/Hypothesis B: Linguistic Explorations in Honor of David M. Perlmutter, D. B. Gerdts, J. C. Moore and M. Polinsky (eds.), 111–128. Cambridge, MA: MIT Press.
Delahunty, G. P. 1983. But sentential subjects do exist. Linguistic Analysis 12: 379–398.
Eide, K. M. and Åfarli, T. A. 2003. Norsk generativ syntaks. Oslo: Novus.
Emonds, J. 1972. A reformulation of certain syntactic transformations. In Goals of Linguistic Theory, S. Peters (ed.), 21–62. Englewood Cliffs, NJ: Prentice-Hall.
Emonds, J. 1976. A Transformational Approach to English Syntax: Root, Structure-Preserving, and Local Transformations. New York: Academic Press.
Faarlund, J. T., Lie, S. and Vannebo, K. I. 1997. Norsk Referansegrammatikk. Oslo: Universitetsforlaget.
Grimshaw, J. 1982. Subcategorization and grammatical relations. In Subjects and Other Subjects: Proceedings of the Harvard Conference on the Representation of Grammatical Relations, A. Zaenen (ed.), 35–55. Bloomington: Indiana University Linguistics Club.
Gundel, J. K. 1988. Universals of topic-comment structure. In Studies in Syntactic Typology, M. Hammond, E. Moravcsik and J. Wirth (eds.), 209–339. Amsterdam: John Benjamins.
Haegeman, L. 2010. Locality and the Distribution of Main Clause Phenomena. Ms., Ghent University/FWO.
Haegeman, L. 2012. Adverbial Clauses, Main Clause Phenomena, and Composition of the Left Periphery. Oxford: Oxford University Press.
Haegeman, L. and Guéron, J. 1999. English Grammar: A Generative Perspective. Malden: Blackwell.
Haegeman, L. and Ürögdi, B. 2010. 
Referential CPs and DPs: An operator movement account. Theoretical Linguistics 36: 111–152.
Han, H. J. 2005. A DP/NP-shell for subject CPs. In Proceedings of the 31st Annual Meeting of the Berkeley Linguistics Society: General Session and Parasession on Prosodic Variation and Change, R. T. Cover and Y. Kim (eds.), 133–143. Berkeley: Berkeley Linguistics Society, University of California.
Hartman, J. 2012. Varieties of Clausal Complementation. Doctoral dissertation, MIT.
Heycock, C. 2006. Embedded root phenomena. In The Blackwell Companion to Syntax, M. Everaert and H. van Riemsdijk (eds.), 174–209. Malden: Blackwell.
Holmberg, A. 2000. Scandinavian stylistic fronting: How any category can become an expletive. Linguistic Inquiry 31: 445–483.
Hooper, J. B. and Thompson, S. A. 1973. On the applicability of root transformations. Linguistic Inquiry 4: 465–497.
Huddleston, R. 2002a. The clause: Complements. In The Cambridge Grammar of the English Language, R. Huddleston and G. K. Pullum (eds.), 213–321. Cambridge: Cambridge University Press.
Huddleston, R. 2002b. Content clauses and reported speech. In The Cambridge Grammar of the English Language, R. Huddleston and G. K. Pullum (eds.), 947–1030. Cambridge: Cambridge University Press.
Iatridou, S. and Embick, D. 1997. Apropos pro. Language 73: 58–78.
Iwakura, K. 1976. Another constraint on sentential subjects. Linguistic Inquiry 7: 646–652.
Kastner, I. 2013. Selection: Factivity and Interpretation. Ms., New York University.
Koopman, H. and Sportiche, D. 1991. The position of subjects. Lingua 85: 211–258.
Koster, J. 1978. Why subject sentences don’t exist. In Recent Transformational Studies in European Languages, S. J. Keyser (ed.), 53–64. Cambridge, MA: MIT Press.
Kratzer, A. 1996. Severing the external argument from its verb. In Phrase Structure and the Lexicon, J. Rooryck and L. Zaring (eds.), 109–137. Dordrecht: Kluwer.
Kuno, S. 1973. 
Constraints on internal clauses and sentential subjects. Linguistic Inquiry 4: 363–385.
Lambrecht, K. 1994. Information Structure and Sentence Form: Topic, Focus, and the Mental Representation of Discourse Referents. Cambridge: Cambridge University Press.
Lasnik, H. 1999. Minimalist Analysis. Malden: Blackwell.
Lasnik, H. 2003. On the extended projection principle. Studies in Modern Grammar 31: 1–23.
Lasnik, H. and Saito, M. 1992. Move α. Cambridge, MA: MIT Press.
Levine, R. 1989. On focus inversion: Syntactic valence and the role of a SUBCAT list. Linguistics 17: 1013–1055.
Lohndal, T. 2012. Without Specifiers: Phrase Structure and Events. Doctoral dissertation, University of Maryland.
McCloskey, J. 1991. There, it, and agreement. Linguistic Inquiry 22: 563–567.
McCloskey, J. 1997. Subjecthood and subject positions. In Elements of Grammar, L. Haegeman (ed.), 197–235. Dordrecht: Kluwer.
Miller, P. H. 2001. Discourse constraints on (non)extraposition from subject in English. Linguistics 39: 683–701.
Moulton, K. 2013. Not moving clauses: Connectivity in clausal arguments. Syntax 16: 250–291.
Ott, D. 2014. An ellipsis approach to contrastive left-dislocation. Linguistic Inquiry 45: 269–303.
Phillips, C. 2010. Should we impeach armchair linguists? In Japanese-Korean Linguistics 17, S. Iwasaki, H. Hoji, P. Clancy and S.-O. Sohn (eds.), 49–64. Stanford: CSLI Publications.
Picallo, M. C. 2002. Abstract agreement and clausal arguments. Syntax 5: 116–147.
Polinsky, M. and Potsdam, E. 2001. Long-distance agreement and topic in Tsez. Natural Language and Linguistic Theory 19: 583–646.
Postal, P. M. 1998. Three Investigations of Extraction. Cambridge, MA: MIT Press.
Reinhart, T. 1981. Pragmatics and linguistics: An analysis of sentence topics. Philosophica 27: 53–94.
Rizzi, L. 1997. The fine structure of the left periphery. In Elements of Grammar, L. Haegeman (ed.), 281–337. Dordrecht: Kluwer.
Rosenbaum, P. S. 1967. 
The Grammar of English Predicate Complement Constructions. Cambridge, MA: MIT Press.
Ross, J. R. 1967. Constraints on Variables in Syntax. Doctoral dissertation, MIT.
Ross, J. R. 1973. The penthouse principle and the order of constituents. In You Take the High Node and I’ll Take the Low Node, C. Corum, T. C. Smith-Stark and A. Weiser (eds.), 397–422. Chicago, IL: Chicago Linguistic Society.
Roussou, A. 1991. Nominalized clauses in the syntax of modern Greek. In Vol. 3 of UCL Working Papers in Linguistics, H. van de Koot (ed.), 77–100. London: University of London.
Safir, K. 1985. Syntactic Chains. Cambridge: Cambridge University Press.
Schwartz, B. D. and Vikner, S. 1989. All verb second clauses are CPs. Working Papers in Scandinavian Syntax 43: 27–49.
Schwartz, B. D. and Vikner, S. 1996. The verb always leaves IP in V2 clauses. In Parameters and Functional Heads, A. Belletti and L. Rizzi (eds.), 11–62. Oxford: Oxford University Press.
Stowell, T. 1981. Origins of Phrase Structure. Doctoral dissertation, MIT.
Stowell, T. 1987. As so not so as. Ms., University of California, Los Angeles.
Stowell, T. 2013. Changing the subject: Shifting notions about subjecthood in generative grammar. In The Bloomsbury Companion to Syntax, S. Luraghi and C. Parodi (eds.), 194–217. London: Bloomsbury.
Stroik, T. S. 1996. Extraposition and expletive-movement: A minimalist account. Lingua 99: 237–251.
Takahashi, S. 2010. The hidden side of clausal complements. Natural Language and Linguistic Theory 28: 343–380.
Travis, L. 1984. Parameters and Effects of Word Order Variation. Doctoral dissertation, MIT.
van Gelderen, E. 1985. S-bar: Its Character, Behavior and Relationship to (i)t. Doctoral dissertation, McGill University.
Webelhuth, G. 1992. Principles and Parameters of Syntactic Saturation. Oxford: Oxford University Press.
Zaenen, A. and Pinkham, J. 1976. The discovery of another island. Linguistic Inquiry 7: 652–664.

7 Be Careful How You Use the Left Periphery*

with Liliane Haegeman

7.1 Information Structure and the Left Periphery
The goal of this chapter is restricted: We focus on the left-peripheral analysis of gapping in English, according to which gapping is movement of the gapping remnants to the left periphery followed by ellipsis of the TP they have vacated. At first sight, this approach seems to align the movement of remnants with that independently observed in relation to the encoding of information-structural properties of TP constituents. We mainly focus on the cartographic implementation of this approach, though much of what we say also carries over to a noncartographic implementation. We show that in spite of the initial attraction of this approach, it is fraught with problems. Since the publication of Cinque’s (1999) and Rizzi’s (1997) seminal work in the cartographic tradition, a line of work in formal syntax ties information-structural notions to precise syntactic positions, in line with the cartographic remit as described by Cinque and Rizzi (2010):

The cartographic studies can be seen as an attempt to ‘syntacticize’ as much as possible the interpretive domains, tracing back interpretive algorithms for such properties as argument structure (Hale and Keyser 1993 and much related work), scope, and informational structure (the ‘criterial’ approach defended in Rizzi 1997 and much related work) to the familiar ingredients uncovered and refined in half a century of formal syntax. To the extent to which these efforts are empirically supported, they may shed light not only on syntax proper, but also on the structure and functioning of the cognitive systems at the interface with the syntactic module. (Cinque and Rizzi 2010: 63, our italics)

Topic and focus figure most prominently among the information-structural concepts taken to be “syntacticized”. Since a full characterization would lead us too far, let us just adopt Rizzi’s own informal definitions from the following two quotations:

The topic is a preposed element characteristically set off from the rest of the clause by ‘comma intonation’ and normally expressing old information, somehow available and salient in previous discourse; the comment is a kind of complex predicate, an open sentence predicated of the topic and introducing new information. The preposed element, bearing focal stress, introduces new information, whereas the open sentence expresses contextually given information, knowledge that the speaker presupposes to be shared with the hearer. (Rizzi 1997: 285)

In Rizzi’s own work, what was originally the CP layer of the clause was recast in terms of an articulated “split CP”, as in (1a). The examples in (1b–g) illustrate various instantiations of the left-peripheral space:

(1) a. ForceP TopP FocP TopP FinP TP (Rizzi 1997)

b. [FocP Fido [FinP they named their dog]]. (Vallduví and Engdahl 1996, Molnár and Winkler 2010)

c. [FocP Il tuo libro [FinP ho letto (, non il suo)]]. (Rizzi 1997: 286) the your book have-1sg read-part (, not the his) “Your book I have read (, not his)”

d. [TopP A Gianni, [FocP questo, [TopP domani, [FinP gli dovrete dire]]]]. to Gianni, this, tomorrow him must-fut-2pl say “This you should tell tomorrow to Gianni”

e. [TopP This dog, [FinP they’ll name Fido]].

f. [FocP Which book did [FinP you prefer?]]

g. He said [ForceP that [FocP at no point had [FinP he been aware of the problem]]].

Like overt movement to the CP area, overt movement to the articulated left periphery is generally considered to be A’-movement, that is, movement driven by interpretive considerations which interacts, among other things, with wh-movement, and which does not interact with A-movement. Hence, focalization or topicalization of a direct object DP (A’-movement), for instance, can cross a subject position (an A-position) without any problem.1 In parallel with the proposal that the CP be reanalyzed as an articulated left periphery, a specialized domain for the encoding of information-structural relations, it has also been proposed that a parallel left periphery must be postulated lower in the clause. (2a) is a schematic representation; proposals along these lines are made by Kayne (1998), Jayaseelan (1999, 2001, 2010), Butler (2004), and Belletti (2001, 2004, 2009), among others. Belletti (2001, 2004, 2009) argues, for instance, that the postverbal subject Gianni in Italian (2b,c) is located in the vP-related focus position. For a discussion of the interpretive properties of these two “peripheries”, see among others Drübig (2006).

(2) a. [CP ...... [TP ...... [TopP ...[FocP Foc [TopP ...... vP]]]]] b. É arrivato Gianni be-3sg arrive-part-msg Gianni “Gianni has arrived” c. Ha parlato Gianni have-3sg speak-part Gianni “Gianni has spoken”

d. [CP .. [TP pro ...è arrivato/ha parlato... [FocP Gianni [vP ...... ]]]]

The focus of this chapter is the syntax of gapping. For general discussion of the phenomenon and a survey of the literature, see Johnson (2014). Our focus is much narrower than his: We examine some analyses of gapping according to which the constituents that survive gapping have been moved to the left periphery of the clause. These analyses are usually motivated on the basis of island effects that can be detected in gapping (see Neijt 1979, Johnson 2014: 18). At first sight, the attraction of such analyses is that the movement postulated is arguably driven by information structure requirements (see Kuno 1976 for an early discussion) and thus seems analogous to other well-established information structure–driven movements such as focus fronting and topicalization. Indeed, the interpretive parallelism with such overt movement can be considered further support for analyses of gapping in terms of movement of remnants. Though implementations diverge, there are problems for these analyses that have, to the best of our knowledge, not been addressed. The problems we point out all relate to the conclusion that, while initially conceived as being parallel to well-established information structure–driven movements, the movements required to derive the gapping patterns consistently diverge markedly from what would be their analogues; thus, the movement required to derive gapping is sui generis. This considerably weakens the attraction of the left-peripheral movement analyses. The chapter is organized as follows: In section 7.2 we outline the main properties of gapping in English, and we present two left-peripheral analyses, one deployed in full cartographic terms and another that simply aligns the left-peripheral movement of gapping with fronting for contrastive effects. 
In section 7.3 we list the problems for these analyses, focusing in particular on the fact that the left-peripheral movements postulated for gapping diverge quite strongly from other well-established information structure–driven movements to the left periphery. In section 7.4 we briefly show how an analysis according to which the movement deriving gapping remnants targets a vP-related periphery may overcome at least some of the problems we raise. Section 7.5 provides our conclusion.

7.2 Making Most (Too Much?) of the CP Periphery: The Movement Derivation of Gapping

7.2.1 The Pattern
Given the assumption that the articulated CP encodes information-structural properties of the clause, it is not surprising that authors have sought to maximize its potential and expand it beyond the empirical domains at the basis of the first cartographic work. Two likely candidates for an analysis in terms of the left-peripheral articulation of information structure were it-clefts (3a) and gapping (3b):

(3) a. It was the potatoes that Harry didn’t like. b. Harry cooked the beans and Henry the potatoes.

In this chapter we concentrate on the derivation of gapping. For arguments against a left periphery analysis of clefts, see Haegeman, Meinunger and Vercauteren (2014). Since Neijt’s seminal work (1979), gapping has been of continued interest in the generative literature. For recent surveys of the properties and analyses of gapping see, among others, López and Winkler (2002), Repp (2007: 16–38), Vanden Wyngaerd (2009), Toosarvandani (in press) and especially Johnson (2014). In (4) and (5), two strings are coordinated. The first conjunct is a clause; in the second conjunct, some material matching that in the first clause has been deleted or “gapped”. We pair each example with the fully spelled-out string in which the effects of gapping have been undone. In (4), gapping is “minimal”: the second conjunct corresponds to the first conjunct minus the finite verb. Observe that verb gapping is available regardless of whether the object is in its canonical position (4a) or has been fronted (4b). In the second conjuncts in (5), additional material is missing: In (5a–c) gapping seems to have affected the subject and the finite verb. In (5d), gapping deletes the verb and the direct object:

(4) a. Harry cooked the beans and Henry the potatoes. (López and Winkler 2003: 241) a’. Harry cooked the beans and Henry cooked the potatoes. b. The beans, Harry cooked, and the potatoes, Henry. b’. The beans, Harry cooked, and the potatoes, Henry cooked. (5) a. At our house, we play poker, and at Betsy’s house, bridge. (Sag 1976: 265) a’. At our house, we play poker, and at Betsy’s house, we play bridge. b. During dinner, my father talked to his colleagues from Stuttgart and at lunch time to his boss. (Molnár and Winkler 2010: 1405: (34)) b’. During dinner, my father talked to his colleagues from Stuttgart and at lunch time my father talked to his boss.

c. Fido they named their dog and Archie their cat. (Molnár and Winkler 2010: 1405: (35)) c’. Fido they named their dog and Archie they named their cat. d. My brother visited Japan in 1960, and my sister in 1961. (Kuno 1976: 306) d’. My brother visited Japan in 1960, and my sister visited Japan in 1961.

Gapping is dependent on coordination. Moreover, the “antecedent” and the gapped clause must be structurally parallel. For instance, (4c), in which the antecedent conjunct displays object fronting while in the second conjunct the object follows the subject, violates the parallelism constraint and is not a licit context for gapping. Similarly, (4d) with the object in its canonical position in the first conjunct and what seems like a reflex of fronting in the second is also unacceptable:

(4) c. *[The beans Harry cooked] and [Henry cooked the potatoes]. d. *[Harry cooked the beans] and [the potatoes Henry cooked].

At first sight, gapping might seem to illustrate nonconstituent coordination: In (6a), for instance, the first conjunct would be the bracketed clause and the string Henry the potatoes, consisting of just the subject and the object, would be the second conjunct. There is no direct way in which these two constituents can be seen as one constituent. The same observation applies to the other examples in (6): In (6b), the second conjunct would have to be the potatoes, Henry, that is, a constituent consisting of the direct object followed by the subject, and in (6c), the second conjunct consists of a place adjunct at Mary’s house followed by the complement bridge. As the bracketed strings that make up the second conjuncts in these examples do not seem to be clauses either, the coordinations involved in gapping would also prima facie not really be affecting “like constituents”.

(6) a. [Harry cooked the beans] and [Henry the potatoes]. b. [The beans, Harry cooked], and [the potatoes, Henry]. c. [At our home we play poker] and [at Mary’s house bridge].

As already suggested by the primed examples in (4) and (5), the problem posed by the coordination of what seem to be nonconstituents is eliminated by accounts, starting from Ross (1970), which analyze gapping in terms of clausal coordination with ellipsis in the second conjunct (see López and Winkler 2003 for discussion):

(7) a. [Harry cooked the beans] and [Henry cooked the potatoes]. b. [The beans, Harry cooked], and [the potatoes, Henry cooked].

In the spirit of the ellipsis analysis, we refer to the constituents that survive ellipsis in gapping as the “gapping remnants”. The gapping remnants have a contrastive interpretation with respect to the matching constituents in the antecedent conjunct: In (7a), for instance, Henry contrasts with Harry and the potatoes contrasts with the beans. While an analysis in terms of coordinated clauses with ellipsis as in the primed examples in (4) and (5) and the examples in (7) entails that coordination affects like constituents, these derivations are not without problems. First, as already discussed, the ellipsis seems to affect quite different entities: In (4) the ellipsis deletes just the (tensed) verb, in (5a–c) the subject and the verb are deleted; in (5d) the verb and the direct object are deleted. Moreover, in the derivations sketched in (5) and in (7), ellipsis, at first sight, targets nonconstituents.

7.2.2 A Left Periphery Derivation of Gapping: Implementations
In this section, we look at a number of implementations of derivations of gapping which make crucial use of the left periphery.

7.2.2.1 Left-Peripheral Movement and Ellipsis
The currently accepted account of gapping that overcomes the constituency problem posed for the ellipsis analysis by data such as (5) is one that decomposes gapping into a two-step process: (1) the constituents that are to survive gapping, that is, what will become the gapping remnants, evacuate TP by moving to the left periphery of the clause, and (2) subsequently, the TP they have evacuated is deleted. The relevant derivations are schematically represented in (8) and (9): (8) is inspired by Aelbrecht (2007), by Frazier, Potter and Yoshida (2012), and by Sailor and Thoms (2013), with what seems to be a recursive CP and no specialized landing sites for the moved constituents. Representation (9), from Vanden Wyngaerd (2009: 11, (26)), implements the articulated CP structure: In line with the focal and contrastive nature of the gapped constituents, the landing sites of the gapped constituents can straightforwardly be identified with Rizzi’s FocP and TopP, the latter in this case hosting a contrastive topic. The discussion mostly focuses on the latter representation because, thanks to Vanden Wyngaerd’s detailed explication of the derivation, it allows for a more precise evaluation. However, as far as we can see, most of the points we are making carry over to the left periphery analyses in (8):

(8) a. At our home we play poker and [CP at Mary’s house [CP bridge [TP we play bridge at Mary’s house]]].

b. and [CP at Mary’s house [CP bridge [TP we play bridge at Mary’s house]]]

(9) a. At our home we play poker and [TopP at Mary’s house [FocP bridge [TP we play bridge at Mary’s house]]].

b. and [TopP at Mary’s house [FocP bridge [TP we play bridge at Mary’s house]]].

The assumption that gapping involves movement of the remnants out of a constituent which itself is subsequently deleted has been widely accepted (cf. Pesetsky 1982, Jayaseelan 1990, Lasnik 1995, Richards 2001: 134–136, Johnson 2014, etc.). Richards (2001) provides an overview of some of the arguments in favor of this type of analysis. One well-established argument for a movement + ellipsis analysis comes from an observation originally due to Neijt (1979) that the relation between the two gapping remnants is subject to locality conditions: (10a,b) are from Richards (2001), his (80) and (81). While the string tried to cook dinner in (10a) can be gapped, the string wondered what to cook in (10b) cannot. The latter string contains a wh-island. On the movement + ellipsis analysis, (10b) would involve extraction of tomorrow from within the wh-island. (10c) is a sketch of the derivation that would be required:

(10) a. John tried to cook dinner today, and Peter tried to cook dinner yesterday. b. *John wondered what to cook today and Peter wondered what to cook tomorrow. c. and Peter tomorrow [Peter wondered [what to cook tomorrow]]. and Peter tomorrow [Peter wondered [what to cook tomorrow]].

Along similar lines, Pesetsky (1982: 645) notes the subject/object asymmetry in (11) (from Richards 2001: 136, his (85) and (86)), which again is a well-known property of wh-movement. To derive (11b), the subject salmon would have to be first extracted across the complementizer that:

(11) a. This doctor thinks that I should buy tunafish, and that doctor thinks that I should buy salmon. b. *This doctor thinks that tunafish will harm me, and that doctor thinks that salmon will harm me. c. and that doctor salmon [that doctor thinks [that salmon will harm me]]. and that doctor salmon [that doctor thinks [that salmon will harm me]].

Observe that the unacceptability of (10b) and of (11b) implies that in such cases there is apparently no “repair by ellipsis”, according to which the deletion of a potential intervener rescues the derivation: deleting the offending structure containing the island does not salvage the sentence (for repair by ellipsis see Chomsky 1972 and Bošković 2011, among many others). A full discussion of repair by ellipsis would lead us too far, and we forgo it here. For examples such as (12a), Vanden Wyngaerd (2009: 33–34) provides the derivation summarized in (12b–e):

(12) a. I tried to read Aspects, and John tried to read LGB. (his (88a))

b. [FocP Foc° [IP John [VP tried to read LGB]]]

c. Attraction to Top°: . . .[TopP Johni [FocP Foc° [IP ti [VP tried to read LGB]]]]

d. Attraction to Foc°: . . . [TopP Johni [FocP LGBj Foc° [IP ti [VP tried to read tj]]]]2

e. Gapping: . . . [TopP Johni [FocP LGBj Foc° [IP ti [VP tried to read tj]]]] (Vanden Wyngaerd 2009: 34, his (89))

7.2.2.2 The Nature of the Left-Peripheral Movement

7.2.2.2.1 THE ARTICULATED CP
Given that the movement of the object to SpecFocP in (12d) is driven by information structure (from now on abbreviated as IS) requirements, it would at first sight appear to be an instantiation of the regular A’-movement already illustrated in (1b,c, etc.). However, its status in Vanden Wyngaerd’s (2009) analysis is not clear. On the one hand, in Note 29 on page 33 he comments on some Dutch and German examples as follows:

Movement into Spec,Foc° differs from wh-movement in not being able to use Spec,CP as an escape hatch. This property puts movement to Spec,Foc° in class with the A-like movement sometimes called Object Shift or Scrambling (see Vanden Wyngaerd 1989 for discussion).

A number of questions arise in relation to this point. In Vanden Wyngaerd’s derivation of the English example in (9a), repeated here as (13a) for the reader’s convenience, the focus fronting of the object bridge would have to cross the subject DP. If this focus fronting instantiates A-movement, then we note that the movement crosses the subject, by assumption also an A-position, and that it should give rise to an intervention effect.

(13) a. At our home we play poker and [TopP at Mary’s house [FocP bridge [TP we play bridge at Mary’s house]]].

b. and [TopP at Mary’s house [FocP bridge [TP we play bridge at Mary’s house]]].

However, since the intervening subject is subsequently deleted as a result of gapping, this might be accounted for if in (13) the intervention effect is removed thanks to repair by ellipsis along the lines of Chomsky (1972) and much later work; compare with, a.o., Bošković (2011), in which the deletion of a potential intervener rescues the derivation. As mentioned, though, not all extraction violations are repaired by ellipsis (cf. (10b) and (11b)). However, in the discussion of English data in an earlier section of his paper, Vanden Wyngaerd seems to provide arguments to the effect that there is “a parallel between raising-to-Foc and wh-movement, rather than with NP-movement” (Vanden Wyngaerd 2009: 28, Footnote 24). His argumentation is based on the asymmetries in the examples in (14): a direct object/indirect object asymmetry in (14a,b) and a DP/PP asymmetry in (14b,c). Subject and direct object remnants are unproblematic (14a); indirect object remnants realized as DPs are degraded (14b), while PP indirect object remnants are fine (14c):

(14) a. Grandpa gave her a new bicycle, and grandma a watch.
b. ?Grandpa gave Sally a birthday present, and grandma Susan.
c. Grandpa gave a birthday present to Sally and grandma to Susan.

If gapping involves A’-extraction, the direct object/indirect object asymmetry in (14a,b) and the DP/PP asymmetry in (14b,c) follow. Specifically, the degradation of (14b), with the indirect object DP Sally as a remnant, would be expected: it is known that in the double object pattern in (British) English, DP indirect objects are not easily A’-moved (14d), while both direct objects (14e) and PP indirect objects (14f) pose no particular problems:

(14) d. ?Whom did grandma give a watch?
e. What did grandma give (to) Sally?
f. To whom did grandma give a watch?

It is therefore not clear how Vanden Wyngaerd can argue later in his discussion that gapping displays properties related to A-movement, and to what extent he assumes this to be a general property of the movement of the gapping remnant to FocP. The status of the movement to TopP is also not entirely clear from Vanden Wyngaerd’s discussion. For Richards (2001: 135–137), who does not adopt a left periphery analysis, both movements of the gapping remnants are more like A-movement. We refer to his work for discussion. While (14b) with the indirect object as the focus remnant is degraded, (15) with the indirect object DP as the topical remnant is fine. If in (15), following Vanden Wyngaerd, on Tuesday is in FocP and, thus, the indirect object Mary is moved to TopP, then under an A’-movement analysis of the latter the fact that there is no degradation at all is puzzling. One might conclude that this is evidence that the movement to the left-peripheral TopP is an instantiation of A-movement. Of course, such movement would cross the subject, a potential intervener, but the subsequent ellipsis of TP would rescue the derivation (Chomsky 1972, Bošković 2011):

(15) Harry gave Susan a watch on Monday and Mary on Tuesday.

It would remain puzzling, though, that while overt IS-driven movement to the articulated left periphery is standardly assumed to be an instantiation of A’-movement, movement of the gapping remnant to TopP would have to be an instantiation of A-movement.

7.2.2.2.2 MULTIPLE SPECIFIERS IN THE LEFT PERIPHERY

Aelbrecht’s (2007) left-peripheral analysis does not deploy the cartographic left periphery. Differently from Vanden Wyngaerd, she assumes that all gapped constituents are moved to the specifier positions of a single left-peripheral C-head, with the observed order preservation effect ascribed to the fact that the movement targets multiple specifiers—rather than specifiers of different heads—resulting in “tucking in” (Richards 2001). The movements required create crossing dependencies, which is also typical of middle field A-movement (Haegeman 1993a,b, 1994). The following extract is taken from Aelbrecht (2007):

Movement and ellipsis analysis: gapping remnants are all attracted to multiple specifier positions of the same head (Richards 2001): crossing paths → same word order as before movement. [contrast]-feature on C probes down and attracts 1st contrasted phrase it encounters; then the 2nd one is tucked in below the 1st one and so on.

This hypothesis correctly derives (14a): the [contrast]-feature in the C probe will first attract the subject Grandma, which is closest to the probe, and then the object a watch, which will tuck into the lower position. However, it is not immediately clear how tucking in also derives (4b), repeated here as (16a). If gapping is consistently derived by left-peripheral movement followed by TP-ellipsis, both the gapping remnants, the potatoes and Henry, have to be external to TP and hence have to be specifiers of C[contrast]. In Aelbrecht’s approach, the [contrast]-feature should first attract the (closer) subject Henry and then the object the potatoes, leading to the opposite order to that in (16a). For completeness’ sake we add that the predicted order, reproduced in (16b), is of course also grammatical, and follows from the tucking-in account:

(16) a. The beans Harry cooked and the potatoes Henry

b. and [CP Henry [CP the potatoes [C] [TP Henry cooked the potatoes]]].

The next section shows that the distribution of gapping phenomena in embedded domains brings to light additional problems. To summarize our argumentation: we show that gapping is available in a number of domains which are not standardly taken to be compatible with left-peripheral A’-movement. As already anticipated in some of the discussion above, in order to maintain a rigid left-peripheral analysis of gapping one would have to assume that in the problematic cases at least, and perhaps in general, the movement of the gapping remnants instantiates A-movement (the position taken in Richards 2001). Such an analysis effectively sets apart the left-peripheral IS-driven (A) movements that derive gapping from established left-peripheral IS-driven (A’) movements.

7.3 The Distribution of Gapping

7.3.1 Introduction

The focus of most of the current literature is on the relation of the gapping remnants with their “source” clause, but less attention is paid to the “external” distribution of the gapping remnants (but see some remarks in Sailor and Thoms (2013: section 5)). Vanden Wyngaerd (2009) does pay some attention to the issue and says,

The approach just sketched might also give us a handle on the otherwise unexplained property of gapping, which is that it applies only in coordinations, not subordinations, [as observed by Hankamer (1979), LH/TL]. The reason for this restriction would be the absence of the functional superstructure devoted to topic and focus in the left periphery of subordinate clauses. It would also explain why gapping cannot reach into an embedded clause, as in the following example:

(17) a. *Max plays blues, and Mick claims that Suzy plays funk.

If the remnants must be in the left periphery of the clause, and if gapping deletes IP, there is no way to derive this sentence. (Vanden Wyngaerd 2009: 12, his (27))

It is not clear what is intended here. Obviously, some embedded clauses do have a left periphery, but nevertheless, gapping is not always available, regardless of whether the conjunction is realized or not:

(17) b. *Max plays blues and Mick says (that) Suzy funk.

In fact, the claim that gapping is excluded from embedded clauses is empirically incorrect: an embedded clause coordinated with another embedded clause under one conjunction is compatible with gapping: the first conjunct is then the antecedent for the gapping in the second one. This is shown in (17c). For discussion see also Johnson (2014). Following Vanden Wyngaerd’s analysis, we would assign the second conjunct in (17c) the partial representation in (17d). Crucially, the second conjunct does not include the projection hosting the conjunction, so that the coordinated constituents are structurally parallel and both are embedded under one C head:

(17) c. He said that at his house they play poker and at Betsy’s house bridge.

d. [TopP at Betsy’s housei [FocP bridgej [IP they play tj ti]]].

If gapping is a left-peripheral phenomenon (be it seen in terms of an articulated TopP and FocP, as in Vanden Wyngaerd, or in terms of Aelbrecht’s contrastive C), the prediction is that gapping will only be possible in second conjuncts with a left-peripheral space. In addition, the parallelism constraint on gapping implies that for the second conjunct to have the left-peripheral space needed to host the gapping remnants, the first conjunct must also have one. If, for some reason (see Haegeman 2012 for various accounts), a left-peripheral space is not available in the first conjunct, then by parallelism the second conjunct will also lack the relevant space and, according to the left-peripheral analysis, gapping should be unavailable. In what follows we show that this prediction is incorrect. A number of clausal domains are incompatible with left-peripheral fronting, while gapping remains available. In section 7.3.2 we consider nonfinite clauses that are usually considered to lack a left-peripheral space altogether. In sections 7.3.3 and 7.3.4 we consider a set of finite clauses which, though not lacking a left periphery entirely, have been argued to disallow a range of left-peripheral fronting operations that encode information structure. If gapping is derived by these operations, then again the incorrect prediction is that the relevant finite clauses are incompatible with gapping. In section 7.3.5, we turn to an additional problem of implementation for the generalized left-peripheral analysis of gapping.

7.3.2 Nonfinite Domains

It is usually assumed that nonfinite clauses have a reduced left periphery: this will account for the observation that both in English for to clauses and in ECM clauses argument fronting is unacceptable. On the generalized left periphery accounts of gapping, as in Vanden Wyngaerd (2009) or Aelbrecht (2007), such domains should not be compatible with gapping:

(18) a. *The idea is for the first year scholarship the local council to fund.
b. *They expect the first year scholarship the local council to fund.

Yet, gapping remains available in a second nonfinite conjunct, as shown in (19a) and (19b). On the left-peripheral analysis of gapping, the remnants in (19a) and (19b) would have to be moved to left-peripheral positions that are otherwise unavailable:

(19) a. The idea was for universities to be financed by state funding and primary schools through private investment.
b. They intend universities to be financed by state funding and primary schools through private investment.

c. [CP schoolsi [CP through private investmentj [ti to be financed tj]]].

d. [TopP schoolsi [FocP through private investmentj [ti to be financed tj]]].

One way out for the generalized left periphery accounts of gapping could be to assume that gapping is derived by a sui generis type of IS-driven movement. This would, however, still entail that, contrary to what is assumed, for to infinitival clauses and ECM clauses must have a left-peripheral space. By the same reasoning, one would have to assume that absolute -ing clauses as in (20) have a left-peripheral structure to host the gapping remnants Mary and the apartment:3

(20) a. John having sold the house and Mary the apartment, they had nowhere to go.

b. [CP Maryi [CP the apartmentj [ti having sold tj]]].

c. [TopP Maryi [FocP the apartmentj [ti having sold tj]]].

(21a) illustrates an adjectival small clause complement to with. On a generalized left-peripheral analysis of gapping one has to assume that such small clauses also have a left-peripheral space to host IS-driven gapping movement.

(21) a. With Jill intent on resigning and Pat ___ on following her example, we look like losing our two best designers. (Huddleston and Pullum 2002: 1339, their (11))

b. with [CP Pati [CP on following her examplej [sc ti intent on tj]]].

c. with [TopP Pati [FocP on following her examplej [sc ti intent on tj]]].

7.3.3  Finite Clauses

7.3.3.1 Adverbial Clauses

Central adverbial clauses (Haegeman 2012 for the term) are not compatible with argument fronting to the left periphery (22a). However, the same environment is fully compatible with gapping (22b). The generalized left-peripheral analyses of gapping would entail that, though a temporal clause resists argument fronting, the movements required by gapping must be licit in the second conjunct, leading to either the derivation in (22c) or (22d) for the gapped conjuncts. Put differently, a movement which would be unavailable in the antecedent conjunct clause would be required in the second conjunct.

(22) a. *After the beans Harry had cooked we could start to eat.
b. After Harry had cooked the beans and Henry the potatoes, we could start to eat.

c. and [CP Henry [CP the potatoes [TP Henry had cooked the potatoes]]].

d. and [TopP Henry [FocP the potatoes [TP Henry had cooked the potatoes]]].

To salvage the generalized left-peripheral analyses of gapping in (22b), one might again say that the relevant movements required to extract Henry and the potatoes from TP are both A-movements. As discussed already, though this is of course a possible move, it makes the movement that derives gapping sui generis; this type of A-movement to the left periphery would only be available in ellipsis contexts (cf. Richards 2001).4 As before, the implication then is that IS-driven movements to the left periphery are not unified: overt IS-driven left-peripheral movement is standardly considered A’-movement and has a restricted distribution; in the case of gapping, IS-driven left-peripheral movement is—at least in some cases—to be analyzed as A-movement. The implications of this proposal, in particular in terms of the articulation of the left periphery and the syntacticization of IS, would need closer scrutiny. On economy grounds, though, it would be preferable that all IS-related movements to the left periphery could be treated uniformly.

However, a generalized left-peripheral A-movement account for gapping is also empirically problematic. As mentioned, on the basis of the direct object/indirect object asymmetry in (14a,b) and the PP/DP asymmetry in (14b,c), Vanden Wyngaerd (2009: 28, Note 24) concludes that, in English (14a), repeated here as (23a), the left-peripheral movement of the object a watch that derives the gapping configuration must be A’-movement. The pattern in (23a) is compatible with adverbial clauses. This means that the gapped pattern in (23b) would have to be derived by A’-movement of a watch to the left periphery, a movement that is otherwise unavailable in adverbial clauses (23c).5

(23) a. Grandma gave her a new bicycle, and Grandpa gave her a watch. (his (74a))
b. When Grandma gave her a new bicycle, and Grandpa a watch, . . .
c. *When a watch Grandpa gave her . . .

Consider (24), in which the first conjunct displays argument fronting, standardly assumed to be A’-movement. In the second conjunct, the fronted constituent the potatoes is parallel to that fronted in the first conjunct.

(24) a. The beans, Harry cooked and the potatoes, Henry.

We have seen that the fronting required to derive the first conjunct in (24a) is incompatible with temporal clauses. Given the parallelism constraint, gapping of the type illustrated in (24a) also becomes unavailable in temporal adverbial clauses. This can be ascribed to the fact that the antecedent conjunct in the gapping pattern is itself ungrammatical.

(24) b. *When the beans, Harry cooked . . .
c. *When the beans, Harry cooked and the potatoes, Henry, . . .

For completeness’ sake, we also add that when the left-peripheral movement in the first conjunct is independently possible, then gapping is available in the second conjunct. This is shown in English (25) and in French (26). In English, sentence-initial adjuncts—unlike fronted arguments—are compatible with adverbial clauses, and in such cases, a continuation with gapping is unproblematic:6

(25) a. If in January you finish the first chapter, you’ll have some time left for the revisions.
b. If in January you write the first chapter and in February the second, you’ll have some time left for the revisions.
c. When in Flanders they issued the French version and in Wallonia the English one, there was a lot of protest from politicians.
d. When in Paris people were buying the French version and in London the English one, we knew that it had been worth issuing both versions simultaneously.

French CLLD, unlike English argument fronting, is compatible with adverbial clauses (26a), and a gapping continuation is unproblematic in the same context (26b):

(26) a. Si à ton frère tu lui donnes le ipad, il sera tout content.7
    if to your brother you him give the ipad, he will-be all happy
b. Si à ton frère tu lui donnes le ipad et à ta soeur le portable, ils seront contents les deux.
    if to your brother you him give the ipad and to your sister the laptop, they will-be happy both

7.3.3.2 Complement Clauses of Factive Verbs

In English, complement clauses of factive verbs are incompatible with left-peripheral argument fronting. Again, there are various accounts in the literature. In cartographic terms it has been claimed that such clauses lack the relevant left-peripheral space altogether (see Haegeman and Ürögdi 2010 for arguments against this) or, alternatively, that while in se they would allow for the space, the relevant movements are inhibited by the movement of the factive operator to the left periphery. Basse (2008) assumes that the left periphery of factive clauses lacks edge features. Regardless of which account one adopts, it remains true that conjoined factive complements are again fully compatible with gapping:

(27) a. She resents that Grandma gave him a new bicycle and Grandpa a watch.

The problem is like that sketched for adverbial clauses. Derivations deploying the left periphery, as in (27b,c), imply that while in the regular case IS-driven argument fronting to the left periphery is incompatible with this clause type, the parallel gapping movement is possible:

(27) b. and [CP Grandpa [CP a watch [TP Grandpa gave him a watch]]]

c. and [TopP Grandpa [FocP a watch [TP Grandpa gave him a watch]]]

So once again, an operation that would be impossible in the antecedent conjunct clause would become possible in the second conjunct. One might again say that the movements of the gapping remnants in (27b,c) are A-movements. As before, this again entails that the left periphery of complement clauses of factive predicates must be available for A-movement and that the relevant left-peripheral A-movements have a similar role with respect to IS as what are usually analyzed as left-peripheral A’-movements.

Note also that assuming the generalized left-peripheral movement analysis for gapping also entails that Basse’s hypothesis that the left periphery of complements of factive verbs is incompatible with an edge feature must be abandoned, at least if edge features trigger the left-peripheral movements involved in gapping. As before, though, not all types of gapping are licit in this environment. Again, in (27d) the first conjunct with illicit A’-fronting is ruled out:

(27) d. *She resented that the beans, Harry cooked and the potatoes, Henry, . . .

Once again, as soon as an overt left-peripheral movement is independently allowed in the antecedent conjunct clause, it becomes available in the second conjunct too: (27e) illustrates adjunct fronting in English and (27f) illustrates CLLD in French:

(27) e. It is worrying that in his first year he published three papers and in the second only one.
f. Je suis contente qu’à ton frère tu lui aies donné l’ipad et à ta soeur le portable.
   I am happy-fem that to your brother you him have-subj given the ipad and to your sister the laptop

7.3.3.3 Other Finite Domains with a “Deficient” Left Periphery

A number of other finite domains are incompatible with left-peripheral A’-movement in English (see Haegeman 2012) while remaining fully compatible with gapping. We simply list and illustrate some of these here: subject clauses are illustrated in (28), complements to N in (29), clauses lacking an overt complementizer in (30), and embedded wh-interrogatives and embedded yes-no questions in (31). As can be seen, all remain compatible with gapping. The problems raised earlier and the various solutions suggested are identical.

(28) That Bill invited Mary and Peter Simon surprised everyone.

(29) a. In the assumption that John will talk to Mary and Bill to Susan, we may be confident this plan can go ahead.
b. Your assumption that Bill will invite Mary and Susan George is surprising.

(30) John believes Mary has bought the food and Bill the drinks.

(31) a. I wonder what Mary gave to Tom and Bill to Susan.
b. I wonder if Mary sent the message to Tom and Jane to Bill.

7.3.4 Gapping With wh-Remnants

(32a) is another interesting example of what looks like gapping: the first gapping remnant which records is a wh-phrase, and the second to John is a PP. Additional examples of the same type are provided in López and Winkler (2003: 240). Following the left periphery analysis in Vanden Wyngaerd (2009), the wh constituent which records would be occupying the specifier of the left-peripheral TopP and the PP to John would be in the focus position:

(32) a. Bill asked which books we gave to Mary and which records to John. (example from López and Winkler 2003: 240, their (29))

b. [TopP which recordsi [FocP to Johnj [ti we gave tj]]]

In (32b) wh-fronting would target the topic projection, which is normally associated with givenness. This may not be problematic as such, because the question format (“which”) is indeed “given” in the antecedent, but it does raise the question as to a uniform treatment of clause typing. Moreover, if the movement of the leftmost constituent is taken to be A-movement, this would be at least slightly unexpected when the relevant constituent is a wh-phrase.

A further problem arises for multiple sluicing (Richards 2001: 137–138) in (33). Under a left-peripheral analysis with clausal coordination, the gapping remnants which bones and to which dogs move to the left periphery and the vacated IP is deleted. Thus, (33) would instantiate multiple wh-movement to the left periphery, a pattern freely available in other languages such as Hungarian and Bulgarian (Rudin 1988, Bošković 2002). Again, the left-peripheral movement of the second wh-phrase would be one that is only manifested in English when associated with TP ellipsis.

(33) Bill asked which books we gave to which students and which bones to which dogs. (López and Winkler 2003: 240, their (29))

7.3.5 Intermediate Conclusions

If gapping is derived by generalized left-peripheral movement, this movement systematically has to have properties setting it apart from the familiar IS-driven movements that it would appear to be “modeled on”, since the movements required to derive gapping are available in contexts in which the regular left-peripheral IS-driven fronting operations are not. As discussed, one possibility would be that the movements undergone by the gapping remnants are identified as A-movement. However, this hypothesis raises problems. First, the movements required are then not uniform since, as pointed out by Vanden Wyngaerd (2009), certain patterns specifically require A’-movement underlying the derivation. Second, the hypothesis that the IS-driven movement required to derive gapping is A-movement implies that some IS-related operations are part of the A-system while others are part of the A’-system, without a principled account of the contrast being provided. In addition, what would be IS-driven A-movement to the left periphery would have to systematically apply in domains claimed to have a defective or reduced left periphery and in which “regular” A’-movement has so far not been manifested. Such domains would thus have to be argued to have a left periphery, contrary to what is often assumed, and one that can only be targeted by A-movement. Again, no account has been provided for why this should be.8 The consequences of the analyses described above can be overcome, but it must be clear that they require a number of additional specifications, which means that the original attractiveness of the movement analysis of gapping is reduced. In section 7.4 we briefly discuss an alternative proposal that exploits the low left periphery.

7.4 The Alternative

In this section, we discuss some alternative analyses that avoid some of the problems raised for the left-peripheral analysis. These analyses all make crucial use of a TP-internal domain to derive gapping and, thus, avoid the space problem that arises for the left-peripheral analysis. We are not able to discuss these in full, but we do highlight their main features. In an overview of gapping, Johnson (2014) suggests treating gapping as a combination of coordination and VP ellipsis. We briefly present his analysis first, and then we offer a cartographic reworking.

7.4.1 Gapping: Extraction and VP Ellipsis

Johnson (2014) proposes the following analysis of gapping:

(34) Gapping elides an XP from which the remnants have scrambled

(35) is derived as in (36): VP is elided after the object DP bourbon has been extracted and adjoined to the VP.

(35) Some have drunk whiskey and others have drunk bourbon.

(36) [IP others I [VP [VP have [VP drunk bourbon]] bourbon]]

Gapping may also elide a VP without any scrambling taking place, yield- ing sentences like (37), with the representation in (38):

(37) Mary left early, and Sally left early too.

(38) [IP Sally I [VP [VP left early] too]]

Johnson’s analysis fares better with regard to the problems discussed in sections 7.1 through 7.3: in a right adjunction analysis like that in (38), the space and locality problems identified will not arise, since adjunction is usually considered to be relatively freely available. There remain certain issues, though. We only highlight some here (see also Johnson 2014 for some discussion). First consider (39):

(39) [IP Jill ate rice yesterday] and [IP Jill ate porridge today].

(39) can be derived if, following a tradition started by Harley (1995) and Kratzer (1996), we adopt an articulated VP structure according to which the subject is merged first in a specifier position of vP, the verb moves from V to v, the object is extracted and adjoined to vP and it is vP (rather than VP) that is elided:

(40) [vP [vP [vP Jill ate [VP ate porridge]] porridge] today].

However, it is crucial for this hypothesis that in gapping examples such as (39) the subject actually remain in its merge position, that is, that it does not move to the canonical subject position. Put differently, if (39) involves coordination of TPs, then in the second TP, the subject has not exited VP. Depending on the motivation for the movement of the subject in nongapped clauses, this may be a problem.

Johnson’s analysis would also have to be extended to instances of gapping involving wh-items, as in (32) and in the multiple sluicing example in (33), repeated here in (41). In the analysis outlined here one would have to assume that the wh-constituents are scrambled, that is, right-adjoined to vP, a position not normally associated with the checking of a wh-feature:

(41) Bill asked which books we gave to which students and which bones to which dogs. (López and Winkler 2003: 240, their (29))

It is also not immediately obvious that a vP ellipsis approach can naturally capture examples in which gapping affects the auxiliary as well as the lexical verb, as in (42), because the relevant ellipsis would not affect the auxiliary, which is by assumption VP-external (see also Vanden Wyngaerd 2009):

(42) a. During dinner, my father had talked to his colleagues from Stuttgart and at lunch time to his boss. (based on Molnár and Winkler 2010: 1405, (34))
a’. During dinner, my father had talked to his colleagues from Stuttgart and at lunch time my father had talked to his boss.
b. Fido they had named their dog and Archie their cat. (Molnár and Winkler 2010: 1405, (35))
b’. Fido they had named their dog and Archie they had named their cat.

Alternatively, to capture such examples one might envisage that the relevant patterns in (42) are not in fact derived by clausal coordination but that the coordination is here restricted to a lower level, with the auxiliary as it were “shared” by both conjuncts.

7.4.2 A Cartographic Reworking: Exploring the Low Left Periphery

7.4.2.1 A vP Periphery

In this section we consider cartographic variants of Johnson’s analysis in which the gapping remnants are not vP-adjoined but are moved to designated positions in a low left periphery. In particular, in a series of papers Belletti (2001, 2004, 2008, 2009) has argued convincingly in favor of postulating a clause-internal left periphery composed of focus and topic projections situated right above the vP/VP. For similar proposals see also Jayaseelan (2001, 2011) and Butler (2004). Belletti also argues for a strict parallelism between the clause-internal periphery and the clause-external periphery (Rizzi 1997). (43) is the general template for the clause-internal periphery, based on Belletti (2004):

(43) [IP I [TopP Top [FocP Foc [TopP Top [vP v [VP V]]]]]]

A first implementation of this idea is, in fact, found in Vanden Wyngaerd (2009), and it is based on Kayne (1998). According to the latter, gapping is derived by a leftward IS-driven movement of the gapping remnants which target (or may target) what seems to correspond to Belletti’s low periphery in (43). (44) and (45) are from Vanden Wyngaerd (2009: 4–5, his (6)–(7)). In (44), the direct object pears, the contrastively focused remnant, moves to a focus position in the low periphery, and the VP itself moves to a higher TP-internal projection, WP, possibly to be equated with the low TopP, where it is deleted (see also Kayne 2000: 239 on P stranding). A similar analysis derives (45), in which the time adjunct in 1961 is the lower focus:

(44) Mary likes apples and Sally pears.

a. [FocP Foc° [VP likes pears]]
b. Attraction to Foc°: . . . [FocP pearsi Foc° [VP likes ti]]
c. Raising of Foc° to W: . . . [WP Foc°j+W [FocP pearsi tj [VP likes ti]]]
d. VP-preposing: . . . [WP [VP likes ti]k Foc°j+W [FocP pearsi tj tk]]

(45) My brother visited Japan in 1960, and my sister visited Japan in 1961.

a. [FocP Foc° [VP in 1961 visited Japan]]
b. Attraction to Foc°: . . . [FocP in 1961i Foc° [VP ti visited Japan]]
c. Raising of Foc° to W: . . . [WP Foc°j+W [FocP in 1961i tj [VP ti visited Japan]]]
d. VP-preposing: . . . [WP [VP ti visited Japan]k Foc°j+W [FocP in 1961i tj tk]]

On the basis of scope facts and the distribution of NPIs, López and Winkler (2003) also argue in favor of an approach according to which the moved remnants target a low vP-peripheral position. See also Coppock (2001), Johnson (2009, 2014) and Toosarvardani (in press) for discussion. Though the precise implementations of vP-related movements differ, it is clear that movements targeting Belletti's lower periphery will not give rise to the "space" problems identified with respect to "deficient" CP domains, since the vP periphery is intact in domains with a deficient LP. For instance, object shift or scrambling in the middle field of the Germanic languages might also be associated with movement to this type of low periphery; scrambling is not affected by the "size" of the left periphery and remains available in infinitival clauses. Johnson's (2014) analysis can be recast in terms of Belletti's low periphery. As we have seen, for Johnson remnants are scrambled, that is, right-adjoined to the VP. Reformulating his approach, it can be proposed that the remnants target SpecTopP and SpecFocP in the low periphery and that vP/VP ellipsis can apply as before. (46) shows the relevant part of the structure of (35):

(46) [TopP [DP others]i [FocP [DP bourbon]j [vP ti have drunk tj]]].

Recall the problem that arises for gapping patterns involving wh-remnants such as those illustrated in (32) and (33). Fox (1999), Nissenbaum (2000), Legate (2003) and den Dikken (2007) also provide evidence drawn from reconstruction that wh-movement must proceed by the vP phase edge; this edge could be taken to coincide with the low periphery, and thus the wh-remnants could arguably halt in their lower landing site. Observe that if the CP periphery and the vP periphery are indeed strongly parallel, then it might well be argued that both domains are available to provide landing sites for the derivation of gapping and that remnants may be stranded either in a low periphery or in a high periphery. Interestingly, exploring a movement analysis for VP ellipsis, Funakoshi (2012) has argued along similar lines that VP ellipsis involves movement to either the low or the high periphery. If VP ellipsis constitutes one component of the derivation of gapping, then it would only be natural that gapping can also use either periphery. We have to leave this for future work, but see Sailor and Thoms (2013) for additional arguments that both the low left periphery and the high periphery are relevant.

7.5 Conclusion

One of the merits of the cartographic perspective is that it offers a way of formalizing the relation between information structural properties and the syntax. In the first cartographic work the focus was on the decomposition of the CP area as an articulated left periphery hosting positions for focus and for topic constituents. Given that gapping involves focus, it was only natural to explore an analysis in which the remnants of gapping are stranded in the (articulated) CP area. However, on the basis of a closer examination of two left-peripheral analyses of gapping in English, we have shown that care must be taken in the implementation of the mapping between IS and syntax. In particular, we have demonstrated that if gapping is analyzed purely in terms of movement of the gapping remnants to the CP layer, the wide availability of the pattern in a range of clauses not normally compatible with left-peripheral fronting, including nonfinite domains, goes unexplained. Though we do not provide a full alternative analysis in this chapter, we suggest that deploying the low periphery as developed in crucial work by Belletti (2001, 2004, 2008, 2009) might allow for a way to overcome these problems. The material examined here has also revealed that there is as yet no consensus in the literature as to the nature of the movements implicated in deriving gapping; in particular, it is not clear whether the fronting of the gapped constituents lines up with A-movement or with A'-movement. This is an area which, we think, merits further research.

Notes

* We dedicate this work to Adriana Belletti, whose work throughout the years has been a leading example of empirical wealth combined with theoretical rigor. We are grateful to two anonymous reviewers for their comments. Liliane Haegeman's research was supported by FWO Odysseus 2009-Odysseus-Haegeman-G091409.
1 For a more careful statement, see Belletti (2009).
2 It is not clear to us why Vanden Wyngaerd orders the movements in this way.
3 Culicover and Levine (2001: 297, Note 14, their (i)) provide the following example of argument fronting with an absolute ing clause:
(i) That solution Robin having already explored t and rejected t, she decided to see if she could mate in six moves with just the rook and the two pawns (Culicover and Levine 2001: 297, Footnote 14, (i))
Such clauses can also be coordinated with a gapping pattern. Observe that in this case the remnant object can precede the remnant subject, in parallelism with the first conjunct:
(ii) This hypothesis Robin having rejected and that one Justin, they had no idea what to do next.
4 Richards accounts for the special status of the movement as follows:
The answer to the second question is that the features on this head which are responsible for attracting the remnants are weak in English, and thus cannot ordinarily be active in the overt syntax. VP ellipsis, however, makes these weak features capable of driving overt movement, as predicted by the theory developed here. The chains headed by the remnants have only a single copy outside the ellipsis site, and are therefore legitimate PF objects, since they give PF unambiguous instructions to which part of the chain to pronounce. (Richards 2001: 137)
It is unclear how Aelbrecht's analysis would fare here, since presumably she would assume that the contrast feature is also responsible for the overt movement of contrastive topics and foci to the left periphery in English. On Vanden Wyngaerd's account one would have to ensure that the features on Foc and Top may be strong (with overt movement) or weak.
5 This analysis also entails that the left periphery of adverbial clauses cannot be fully truncated, as is often assumed to account for the ungrammaticality of (23c).
6 Native speakers disagree about (25c–d): Some accept them; some do not. We do not have anything to say here about this variation.
7 We have chosen an instance with a CLLD PP to avoid the alternative Hanging Topic analysis (see Cinque 1990 for extensive discussion).
8 Observe that under Haegeman's (2012) intervention account of the distribution of main clause phenomena, assuming that gapping involves A-movement indeed allows us to predict that gapping remains available in domains incompatible with A'-fronting. Haegeman derives the unavailability of main clause phenomena in a subset of embedded clauses from A'-intervention effects. Such effects would not be triggered by A-movement of the gapping remnants.

References

Aelbrecht, L. 2007. A movement account of Dutch gapping. Talk presented at TIN-dag, Utrecht University, February 3.
Basse, G. 2008. Factive complements as defective phases. In Proceedings of WCCFL, N. Abner and J. Bishop (eds.), 27: 54–62.
Belletti, A. 2001. Inversion as focalization. In Subject Inversion in Romance and the Theory of Universal Grammar, A. Hulk and J. Y. Pollock (eds.), 60–90. New York: Oxford University Press.
Belletti, A. 2004. Aspects of the low IP area. In The Structure of CP and IP: The Cartography of Syntactic Structures, Volume 2, L. Rizzi (ed.), 16–51. Oxford: Oxford University Press.
Belletti, A. 2008. The CP of Clefts. Rivista di Grammatica Generativa 33: 191–204.
Belletti, A. 2009. Structures and Strategies. New York: Routledge.
Bošković, Ž. 2002. On multiple Wh-fronting. Linguistic Inquiry 33: 351–383.
Bošković, Ž. 2011. Rescue by PF deletion, traces as (non)interveners, and the that-trace effect. Linguistic Inquiry 42: 1–44.
Butler, J. 2004. Phase Structure, Phrase Structure and Quantification. Doctoral dissertation, University of York.
Chomsky, N. 1972. Some empirical issues in the theory of transformational grammar. In Goals of Linguistic Theory, S. Peters (ed.), 63–130. Englewood Cliffs, NJ: Prentice-Hall Inc.
Cinque, G. 1990. Types of A' Dependencies. Cambridge, MA: MIT Press.
Cinque, G. 1999. Adverbs and Functional Heads. New York: Oxford University Press.
Cinque, G. and Rizzi, L. 2010. The cartography of syntactic structures. In The Oxford Handbook of Grammatical Analysis, B. Heine and H. Narrog (eds.), 51–65. Oxford: Oxford University Press.
Coppock, E. 2001. Gapping: In defense of deletion. In Proceedings of the Chicago Linguistics Society 37, M. Andronis, C. Ball, H. Elston, and S. Neuvel (eds.), 133–147. Chicago, IL: University of Chicago.
Culicover, P. and Levine, R. D. 2001. Stylistic inversion in English: A reconsideration. Natural Language and Linguistic Theory 19: 283–310.
Dikken, M. den. 2007. Phase extension: Contours of a theory of the role of head movement in phrasal extraction. Theoretical Linguistics 33: 1–41.
Drübig, H. B. 2006. Phases and the typology of focus constructions. In On Information Structure: Meaning and Form, K. Schwabe and S. Winkler (eds.), 33–68. Amsterdam: John Benjamins.
Fox, D. 1999. Reconstruction, binding theory, and the interpretation of chains. Linguistic Inquiry 30: 157–196.
Frazier, M., Potter, D. and Yoshida, M. 2012. Pseudo noun phrase coordination. In Proceedings of WCCFL 30, N. Arnett and R. Bennet (eds.), 142–152. Somerville, MA: Cascadilla Proceedings Project.
Funakoshi, K. 2012. On headless XP-movement/ellipsis. Linguistic Inquiry 43: 519–562.
Haegeman, L. 1993a. Some speculations on argument shift, clitics and crossing in West Flemish. Linguistische Berichte, Sonderheft 5: 131–160.
Haegeman, L. 1993b. The morphology and distribution of object clitics in West Flemish. Studia Linguistica 47: 57–94.
Haegeman, L. 1994. The typology of syntactic positions: L-relatedness and the A/A' distinction. Groninger Arbeiten zur Germanistischen Linguistik (GAGL) 37: 115–157.
Haegeman, L. 2012. Adverbial Clauses, Main Clause Phenomena, and the Composition of the Left Periphery. Oxford: Oxford University Press.
Haegeman, L., Meinunger, A. and Vercauteren, A. 2014. The architecture of it-clefts. Journal of Linguistics 50: 269–296.
Haegeman, L. and Ürögdi, B. 2010. Referential CPs and DPs: An operator movement account. Theoretical Linguistics 36: 111–152.
Hale, K. and Keyser, S. J. 1993. On argument structure and the lexical expression of syntactic relations. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, K. Hale and S. J. Keyser (eds.), 53–109. Cambridge, MA: MIT Press.
Hankamer, J. 1979. Deletion in Coordinate Structures. New York: Garland.
Harley, H. 1995. Subjects, Events and Licensing. Doctoral dissertation, MIT.
Huddleston, R. and Pullum, G. K. 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.
Jayaseelan, K. A. 1990. Incomplete VP deletion and gapping. Linguistic Analysis 20: 64–81.
Jayaseelan, K. A. 1999. A focus phrase above vP. In Proceedings of the Nanzan GLOW, Y. Abe, H. Aoyagi, M. Arimoto, K. Murasugi, M. Saito and S. Tatsuya (eds.), 195–212. Nagoya, Japan: Nanzan University.
Jayaseelan, K. A. 2001. IP-internal topic and focus phrases. Studia Linguistica 55: 39–75.
Jayaseelan, K. A. 2010. Stacking, stranding, and pied-piping: A proposal about word order. Syntax 13: 298–330.
Johnson, K. 2009. Gapping is not VP ellipsis. Linguistic Inquiry 40: 289–328.
Johnson, K. 2014. Gapping. Ms., University of Massachusetts, Amherst.
Kayne, R. S. 1998. Overt versus covert movement. Syntax 1: 128–191.
Kayne, R. S. 2000. Parameters and Universals. Oxford and New York: Oxford University Press.
Kratzer, A. 1996. Severing the external argument from the verb. In Phrase Structure and the Lexicon, J. Rooryck and L. Zaring (eds.), 109–137. Dordrecht: Kluwer.
Kuno, S. 1976. Gapping: A functional analysis. Linguistic Inquiry 7: 300–318.
Lasnik, H. 1995. A note on pseudogapping. In MIT Working Papers in Linguistics 27, R. Pensalfini and H. Ura (eds.), 143–163. Cambridge, MA: MITWPL.
Legate, J. 2003. Some interface properties of the phase. Linguistic Inquiry 34: 506–516.
López, L. and Winkler, S. 2003. Variation at the syntax-semantics interface: Evidence from gapping. In The Interfaces: Deriving and Interpreting Omitted Structures, K. Schwabe and S. Winkler (eds.), 225–248. Amsterdam: John Benjamins.
Molnár, V. and Winkler, S. 2010. Edges and gaps: Contrast at the interfaces. Lingua 120: 1392–1415.
Neijt, A. 1979. Gapping. Dordrecht: Foris.
Nissenbaum, J. 2000. Investigations of Covert Phrase Movement. Doctoral dissertation, MIT.
Pesetsky, D. 1982. Paths and Categories. Doctoral dissertation, MIT.
Repp, S. 2007. Negation in Gapping. Oxford: Oxford University Press.
Richards, N. 2001. Movement in Language. Oxford: Oxford University Press.
Rizzi, L. 1997. The fine structure of the left periphery. In Elements of Grammar, L. Haegeman (ed.), 289–330. Dordrecht: Kluwer.
Ross, J. R. 1970. Gapping and the order of constituents. In Progress in Linguistics, M. Bierwisch and K. E. Heidolph (eds.), 249–259. The Hague: Mouton de Gruyter.
Rudin, C. 1988. On multiple questions and multiple wh-fronting. Natural Language and Linguistic Theory 6: 445–501.
Sag, I. 1976. Deletion and Logical Form. Doctoral dissertation, MIT.
Sailor, C. and Thoms, G. 2013. On the non-existence of non-constituent coordination and non-constituent ellipsis. In Proceedings of WCCFL 31, R. E. Santana-LaBarge (ed.), 361–370. Somerville, MA: Cascadilla Proceedings Project.
Toosarvardani, M. In press. Gapping is low coordination (plus (VP) ellipsis): A reply to Johnson. Linguistic Inquiry.
Vallduví, E. and Engdahl, E. 1996. The linguistic realization of information packaging. Linguistics 34: 459–519.
Wyngaerd, G. V. 1989. Object shift as an A-movement rule. MIT Working Papers in Linguistics 11: 256–271.
Wyngaerd, G. V. 2009. Gapping constituents. HUB Research Paper 2009/02: 1–53.

Part B
The Syntax–Semantics Interface

8 Negative Concord and (Multiple) Agree
A Case Study of West Flemish*

with Liliane Haegeman

8.1 Introduction

With the advent of the Agree model (Chomsky 2000, 2001, 2004, 2007, 2008), negative concord, in which there seems to be agreement between negative constituents, has garnered renewed interest, both from a synchronic (Watanabe 2004, Zeijlstra 2004, Lindstad 2007, Penka 2007a,b,c) and a diachronic (Roberts and Roussou 2003, Zeijlstra 2004, Roberts 2007, van Gelderen 2008) point of view. In this article, we focus exclusively on data such as West Flemish (WF) (1a).1 As the translation indicates, (1a) is interpreted as if it contained a single expression of sentential negation, even though it contains three negative expressions, nooit, "never"; niets, "nothing"; and niet, "not", each of which can express sentential negation all by itself:2

(1) a. K'(en)-een nooit niets niet gezien.
I (en)-have never nothing not seen
"I have never seen anything."
b. K'(en)-een niet gewerkt.
I (en)-have not worked
"I haven't worked."
c. K'(en)-een niets gezien.
I (en)-have nothing seen
"I haven't seen anything."
d. K'(en)-een nooit gewerkt.
I (en)-have never worked
"I have never worked."

The interest of (1a) for the concept Agree is that the three so-called n-words, nooit, niets, and niet, jointly convey a single (sentential) negation. (1a) suggests that such negative constituents are not semantically negative (i.e., that they do not themselves encode sentential negation); instead, they are uninterpretable "negative dependents" (see Borsley and Jones 2005, Willis 2006) of an interpretable (possibly null) negative constituent.3 Or, to put it differently, (1a) can be taken to display a form of syntactic agreement between a number of constituents that depend on/are in the scope of the constituent encoding semantic negation (Ladusaw 1992, Brown 1999, Zeijlstra 2004, 2008, Penka 2007a,b, Biberauer and Zeijlstra 2012). Formalizing this hypothesis, it has been argued (Roberts and Roussou 2003: 145, Zeijlstra 2004, 2008, Moscati 2006, Penka 2007a,b) that negative concord involves only one interpretable negative feature that values (possibly multiple) uninterpretable negative features.4 In this view, negative concord (hereafter NC) would be a case of Multiple Agree (Ura 1996, Hiraiwa 2001, 2005, Chomsky 2008). Although attractive, the Multiple Agree (MA) account raises questions. One is conceptual in nature: MA, in which many probes enter into an agree relation (henceforth, Agree) with one goal, leads to abandoning a strict locality condition on agreement. In addition, as we show, adopting MA to account for NC (as proposed by Zeijlstra 2004, Penka 2007b) leads to empirical problems for WF. We propose that a slightly revised formulation of binary Agree (much in the spirit of Pesetsky and Torrego 2007) makes it possible to handle the WF data. The article is structured as follows: Section 8.2 presents the core data of sentential negation in WF relevant for the issue of NC as an instantiation of MA.
Section 8.3 presents the MA account of NC proposed by Zeijlstra (2004, 2008) and discusses the conceptual and empirical problems raised by the proposal. Section 8.4 introduces the theoretical machinery that we adopt for our own analysis, and section 8.5 elaborates our analysis of WF NC in terms of binary Agree. Section 8.6 summarizes the chapter.

8.2 Sentential Negation in West Flemish

This section introduces the data regarding sentential negation in WF that are relevant for the analysis of NC as MA. Readers familiar with the WF data will not find much new here (see Haegeman and Zanuttini 1991, 1996, Haegeman 1995). For reasons of space, we omit issues that do not seem relevant for the present discussion.

8.2.1 Expressions of Negation: An Inventory

Three types of constituents are implicated in the expression of sentential negation in WF. One is the morpheme en, which cliticizes onto the finite verb (see Haegeman 1998a,b, 2000a,c, 2002b) and moves along with it (see (2d)). We assume that it spells out a head. En cannot express negation all by itself (2a); it must cooccur with a negative constituent (2b–c). Furthermore, en is never obligatory: In (2b–d), it may be left out without loss of grammaticality. As it is only tangential to our discussion, we do not discuss the properties of en in detail. Following Haegeman (1998a,b, 2000a,c, 2002b), we assume that en is a Spell-Out of the head Pol (see Willis 2006 for PolP in Welsh; see also Breitbarth and Haegeman 2008 for a slightly different implementation) rather than being associated with a [neg] feature. For reasons of space, we do not elaborate this point here, and we refer to the papers cited for arguments.

(2) a. *da Valère dienen boek en-kent
that Valère that book en-knows
b. da Valère dienen boek niet en-kent
that Valère that book not en-knows
"that Valère doesn't know that book"
c. da Valère niemand en-kent
that Valère no.one en-knows
"that Valère doesn't know anyone"
d. Valère en-kent dienen boek niet.
Valère en-knows that book not
"Valère doesn't know that book."

A second negative element is the marker of sentential negation, niet, "not", which is parallel to Germanic negative markers such as German nicht, Dutch niet, and Norwegian ikke. Niet is located in the middle field, in a position c-commanding vP. As (2d) shows, niet is not affected by the movement of the finite verb. We assume that niet has XP status (see Haegeman 1995, Zeijlstra 2004). Negative constituents, or n-words as they are usually called following Laka (1990),5 are the third type of negative expression. An n-word is a constituent that appears in the NC contexts we are interested in here. The relevant WF n-words are either simple one-word items such as niemand, "nobody"; niets, "nothing"; nooit, "never"; and nieverst, "nowhere" (these will be referred to jointly as simple n-constituents) or syntactically more complex constituents that contain the negative quantifier geen, "no", such as geen studenten, "no students", and geen geld, "no money" (which will be referred to as geen-NPs), or that contain the negative marker niet, as in niet dikkerst, "not often"; niet lange, "not long"; niet vele, "not much"; and so on. The use of n-words is illustrated in (1c), (1d), (2c) and in (3). As the parentheses indicate, en remains optional.

(3) a. da Valère dienen boek nieverst (en)-vindt
that Valère that book nowhere (en)-finds
"that Valère doesn't find that book anywhere"
b. da Valère geen geld (en)-eet
that Valère no money (en)-has
"that Valère doesn't have any money"
c. da Valère ier niet dikkerst geweest (en)-eet
that Valère here not often been (en)-has
"that Valère hasn't been here often"

Our article is concerned with the extent to which the n-constituents and niet enter into NC readings (see Vanacker 1975 for a first description [in Dutch] of some of the crucial data).

8.2.2 Negative Concord in WF

Haegeman (1995, 1997) argues that in WF an n-word with sentential scope must undergo leftward Neg-movement, as illustrated in (4) (see Haegeman's discussion for details and Christensen 1986, 1987 for similar proposals for Norwegian):

(4) a. da Valère van niemand ketent en-was
that Valère of no.one contented en-was
"that Valère was not pleased with anyone"
b. *da Valère ketent van niemand en-was6
that Valère contented of no.one en-was

When n-constituents with sentential scope cooccur with niet, they must move to the left of niet. Such moved constituents enter into an NC relation with each other and with niet (Haegeman 1995: 138–139), as in (5a). Failure to undergo Neg-movement leads to a double negation (DN) reading, as in (5b). Importantly, though, as also shown by (4), the obligatory leftward movement of the n-constituent(s) in (5a) cannot be motivated by their need to enter into NC with niet as such, because Neg-movement must also take place when niet is absent. Parallel with (5a), in which the n-constituents precede niet, in (5c) niet is absent. Once again the n-constituents have to undergo Neg-movement. If over niets, "about nothing", were to remain to the right of ketent, "contented", NC would be excluded (5d):

(5) a. dat ter niemand over niets niet ketent en-is (NC)
that there no.one about nothing not contented en-is
"that no one is satisfied with anything"
b. da ter niemand niet ketent over niets en-is (*NC/?DN)
that there no.one not contented about nothing en-is
"that no one isn't satisfied with anything"
c. dat ter niemand over niets ketent en-is (NC)
that there no.one about nothing contented en-is
"that no one is satisfied with anything"
d. da ter niemand ketent over niets en-is (*NC/?DN)
that there no.one contented about nothing en-is
"that no one isn't satisfied with anything"

Not only simple n-words such as niemand, "no one"; niets, "nothing"; nieverst, "nowhere"; and nooit, "never", enter into an NC relation. Other negated DPs with more complex structure can also enter into NC with clausemate n-constituents (Haegeman 2002b). For instance, in (6a) the DP geenen tyd, "no time", enters into an NC relation with nooit, "never".7 In (6b), niet, "not", negates a quantified nominal constituent (te) vele tyd, "too much time"; the negated constituent enters into NC with nooit, "never". In (6c), niet negates an adverb (lange, "long"; dikkerst, "often"), and the negated adverb enters into NC with niemand, "no one". On the basis of data such as (6a–c), Haegeman (2002b: 157) concluded that DPs containing negated quantifiers or negated adverbs are to all intents and purposes clausal negators.

(6) a. K'(en)-een nooit geenen tyd.
I (en)-have never no time
"I never have any time."
b. K'(en)-een nooit niet (te) vele tyd.
I (en)-have never not (too) much time
"I never have a lot of/too much time."
c. T'(en)-eet doa niemand niet lange/dikkerst geweest.
it (en)-has there no.one not long/often been
"No one has been there for a long time/often."

It is also possible for constituents containing a negative quantifier to have local scope. This is illustrated in (7): in geen tyd, "in no time", does not negate the clause; instead, it means something like "in very little time". Because the clause is not negative, en is not licensed, and there is no need for Neg-movement (7b). Any n-word present in the middle field of the clause will not enter into NC with in geenen tyd. In (7c), en is licensed by virtue of the presence of niet, but niet and in geen tyd do not enter into an NC relation. For reasons of space, we do not discuss n-words with local or constituent scope; we refer to, among others, Borkin (1971), Lawler (1971), Haegeman (1997, 2000b), Progovac (2000), Svenonius (2002), Moscati (2006), and the references cited there.

(7) a. In geen tyd (*en)-oan-ze da gedoan.
in no time (*en)-had-they that done
"They had finished that in no time."
b. dan-ze da gedoan (*en)-oan in geen tyd
that-they that done (*en)-had in no time
"that they had finished that in no time"
c. Z'(en)-oan da niet gedoan in geen tyd.
they (en)-had that not done in no time
"They did not finish that in no time."

8.2.3 DP-Internal Negative Concord

The bracketed negative constituent in (8) also expresses sentential negation.8 The string differs minimally from the quantified n-constituent in (6b) by the addition of geen, "no", but importantly, this does not lead to a change in meaning. For arguments that the bracketed string in (8) is a constituent, see Haegeman (2002a). Haegeman analyzes the niet Q geen N sequences as instantiations of DP-internal NC.

(8) K’(en)-een nooit [niet (te) vele geen tyd]. I (en)-have never not (too) much no time “I never have a lot of/too much time.”

8.3 Negative Concord as Multiple Agree (Zeijlstra 2004, 2008)

In this section, we first summarize Zeijlstra's (2004, 2008) proposal for analyzing NC in terms of MA (see also Penka 2007a,b). We then discuss the conceptual and empirical problems facing his account.

8.3.1 Zeijlstra (2004, 2008)

To account for the cooccurrence of what seem like multiple n-constituents conveying a single sentential negation, Zeijlstra (2004, 2008) proposes that such constituents are semantically non-negative indefinites with a [uneg] feature (2004: 245). The sentential negative marker (e.g., WF niet) is also taken to bear [uneg]. The very existence of [uneg] features triggers the projection of NegP. Sentential negation as such is encoded by a covert negative operator OP¬ in SpecNegP, associated with an [ineg] feature. According to Zeijlstra's definition, "OP¬ (i) introduces a negation at LF, and (ii) unselectively binds all free variables under existential closure" (2004: 247).9 In Zeijlstra's system, OP¬ [ineg] in SpecNegP c-commands the (multiple) [uneg] n-constituent(s) on the vP edge. This "reverse Agree" departs from the standard view according to which the probe with the uninterpretable feature c-commands the goal with the interpretable feature. (For some discussion of reverse Agree, see also Brown 1999: 29, Note 11, Adger 2003, Merchant 2004, von Stechow 2005, Bošković 2007, Baker 2008, Merchant and Sadock 2008, and von Stechow and Zeijlstra 2008.) In Zeijlstra's approach, NC is the result of MA (Hiraiwa 2001) between OP¬, on one hand, and the negative marker and n-words, on the other:

The central hypothesis behind the assumption that [NC] languages express (sentential) negation by means of syntactic negation is that negation in these languages exhibits syntactic agreement that, in principle, does not differ from (syntactic) person or Tense agreement . . . . n-words are non-negative indefinites that are syntactically marked for negation, i.e. they bear an uninterpretable [uneg] feature, that at some point during the derivation needs to be checked against an overt or covert element that carries an interpretable [ineg] feature. This feature checking is governed by the syntactic operation Agree. Thus [NC] is the realization of an agreement relation between a negative operator and an n-word. (2004: 244–245; our italics)

8.3.2 Application

Consider the Czech example (9a). Since Czech is an NC language, Zeijlstra assumes it has a NegP whose specifier hosts a covert operator with an [ineg] feature. In (9a), the verb vidi, "see", is associated with a negative morpheme ne, with a [uneg] feature, and so is the n-word nikoho, "no one". Through MA, the [uneg] features get checked and deleted (9b):10

(9) a. Milan nevidi nikoho. Milan neg.sees no.one

b. [NegP OP¬ [ineg] [vP nikoho [uneg] [vP Milan nevidi [uneg]]]] (Zeijlstra 2004: 250)

Zeijlstra also applies his analysis to WF (2004: 255–256). In (10), his analysis is applied to an example with a single negative marker niet, "not", and the negative morpheme en: Both carry a [uneg] feature, and the two uninterpretable features are checked via the interpretable feature on the negative operator in SpecNegP. Observe that en is optional here. In (11a), sentential negation is conveyed by means of an n-word, niemand, "no one", which may be accompanied by niet as well as by en. Zeijlstra provides the representations (10b) and (11b):

(10) a. da Valère niet (en)-klaapt that Valère not (en)-talks “that Valère doesn’t talk”

b. [NegP OP¬ [ineg] [vP niet [uneg] [vP Valère [v’ en-klaapt [uneg]]]]] (Zeijlstra 2004: 255)

(11) a. da Valère tegen niemand (niet) en-klaapt
that Valère against no.one (not) en-talks
"that Valère doesn't talk to anyone"

b. [NegP OP¬ [ineg] [vP [PP tegen niemand [uneg]] [vP (niet [uneg]) [vP Valère [v’ en-klaapt [uneg]]]]]] (Zeijlstra 2004: 255)

8.3.3 Negative Concord as Multiple Agree: Problems for the Account

A first problem for Zeijlstra's (2004, 2008) MA account of NC is conceptual: MA, in which many probes agree with one goal, leads to the abandonment of a strict locality condition on Agree, in that precisely in the context of MA a probe need not have a local relation with (at least one of) its goal(s). Not only does this raise general questions concerning the role of locality in syntax, but also, as we show, locality plays a crucial role in determining the conditions of NC in WF. There are two specific empirical problems for the MA account of WF NC. First, the across-the-board application of MA to derive NC gives rise to the wrong predictions. Second, the MA approach has difficulty in handling the DP-internal application of NC and its relation to NC at the sentential level.11

8.3.3.1 Multiple Agree and Locality

In Hiraiwa's (2001) original conception as well as in Zeijlstra's (2004, 2008) implementation, MA is a process whereby all uninterpretable features are "simultaneously" eliminated:

Multiple agree (multiple feature checking) with a single probe is a single simultaneous syntactic operation; Agree applies to all matched goals at the same derivational point derivationally simultaneously. (Hiraiwa 2001: 69, our italics)

The implementation of MA for the phenomenon of NC can be presented schematically as in (12). Following and adapting Hiraiwa's own formulation ("Agree applies to all matched features"), we assume that MA, like binary Agree, is a two-step process that first matches the features and then leads to checking. After Merge/Move of the individual n-constituents to the edge of vP, each with its [uneg] feature, the abstract negative operator, OP¬ with [ineg], is merged in SpecNegP. MA relates [ineg] "across-the-board" to each of the individual [uneg] features; crucially, there is no relation between the [uneg] constituents as such. MA thus implies that Agree can be nonlocal, since in (12c), for instance, [B uneg] and [C uneg] intervene between [OP ineg] and [D uneg]. We illustrate the application of the system to WF in (13). Here we apply Zeijlstra's approach to an example in which three n-words, nooit, "never"; niemand, "no one"; and niet vele, "not much", enter into an NC relation.

(13) a. dat er nooit niemand niet vele gewerkt eet          nooit niemand niet vele: NC
        that there never no.one not much worked has
        "that no one has ever worked a lot"
     b.

8.3.3.2 Empirical Problems I: NC and Binary Relations

In the following sections, we discuss conditions on the application of NC in WF (section 8.3.3.2.1) and implications these have for an MA analysis (section 8.3.3.2.2).

8.3.3.2.1 CONDITIONS ON NEGATIVE CONCORD IN WEST FLEMISH

According to the MA account, NC is a one-to-many relation in which the negative operator agrees with each n-word and in which there is no specific relation between the individual n-words. However, Haegeman and Zanuttini (1996) signal that in WF the nature of the negative element also plays a role in generating NC.12 To the best of our knowledge, the data they present have so far not been taken into account in the literature on NC. Consider (14): in (14a), niemand, "no one", enters into an NC relation with niet, "not"; in (14b), niemand enters into an NC relation with niet dikkerst, "not often"; and in (14c), the three n-constituents, niet dikkerst, niemand, and niet, enter into NC:

(14) a. dat er doa niemand niet gewerkt eet          niemand niet: NC
        that there there no.one not worked has
        "that no one has worked there"
     b. dat er doa niet dikkerst niemand gewerkt eet          niet dikkerst niemand: NC
        that there there not often no.one worked has
        "that not often did anyone work there"
     c. dat er doa niet dikkerst niemand niet gewerkt eet          niet dikkerst niemand niet: NC
        that there there not often no.one not worked has
        "that not often did anyone work there"

In terms of Zeijlstra's approach this means that niet dikkerst, "not often"; niemand, "no one"; and the marker of sentential negation niet, "not", all carry a [uneg] feature which is checked by the [ineg] feature on the sentential negative operator. Since niet dikkerst and niet are in an NC relation in (14c), one might expect that (14d), with the same three n-constituents, now in the sequence niemand, niet dikkerst, and niet, would also be grammatical with an NC reading. But this is not the case: (14d) is ungrammatical with an NC reading. It is marginal with an interpretation in which niemand and niet dikkerst enter into NC and in which (stressed) niet expresses an independent negation, resulting in a double negation (DN) reading.13 When niet is replaced by niet meer, "no more" (14e), the NC reading is again available:14

(14) d. *dat er doa niemand niet dikkerst niet gewerkt eet          niet dikkerst niet: ??DN/*NC
        that there there no.one not often not worked has
        DN: "that rarely did anyone not work there"
     e. dat er doa niemand niet dikkerst niet meer gewerkt eet          niet dikkerst niet meer: NC
        that there there no.one not often not more worked has
        "that rarely did anyone work there any more"

The ungrammaticality of the NC reading in (14d) cannot be due to a simple ban on the co-occurrence of niet dikkerst with niet, since (14c) also contains niet dikkerst and niet and is grammatical with the desired NC reading. The ungrammaticality of the NC reading in (14d) is also not due to an anti-adjacency condition on niet dikkerst and niet: In (14f), niet dikkerst and niet are separated by the PP in dat us, "in that house", but this in itself is not sufficient to rescue the sentence. Apparently, niet dikkerst must be separated from niet by a simple n-constituent such as niemand (see (14g), and also (14c)):

(14) f. *dat ter niemand niet dikkerst in dat us niet gewerkt eet          niet dikkerst niet: ??DN/*NC
        that there no.one not often in that house not worked has
        DN: "that not often did anyone not work in that house"
     g. dat er niet dikkerst niemand in dat us niet gewerkt eet          niet dikkerst niemand niet: NC
        that there not often no.one in that house not worked has
        "that not often has anyone worked in that house"

Furthermore, the problem with (14d) is also not directly due to the fact that niemand precedes niet dikkerst; this is shown by (14h), which only contains the sequence niet dikkerst niet and is ungrammatical with the NC reading. Once again, replacing niet by niet meer leads to a grammatical sentence with an NC reading (14i). (14h) again shows that it is not the adjacency of niet dikkerst and niet that blocks the NC reading: simply inserting a constituent between niet dikkerst and niet is not sufficient to save the NC reading (14j):

(14) h. *da Valère doa niet dikkerst niet gewerkt eet          niet dikkerst niet: ??DN/*NC
        that Valère there not often not worked has
        DN: "that Valère has not often not worked there"
     i. da Valère doa niet dikkerst niet meer gewerkt eet          niet dikkerst niet meer: NC
        that Valère there not often not more worked has
        "that Valère has not often worked there any more"
     j. da Valère niet dikkerst in Gent niet *(meer) gewerkt eet
        that Valère not often in Ghent not *(more) worked has
        "that Valère has not often worked in Ghent any more"

Data such as those in (14) can be multiplied. What emerges from (14) is that although a complex n-constituent such as niet dikkerst, "not often", can participate in NC readings, it cannot do so if it is the n-word that is closest to the sentential negator niet. Instead, such an n-constituent can only participate in an NC relation with niet if it is separated from niet by a simple n-constituent such as niemand. No such "antilocality" constraint applies to niemand (14a) or to the other simple n-words such as nooit, "never"; niets, "nothing"; and nieverst, "nowhere" (15a–c). (15d) shows that the presence of a geen-NP between niet dikkerst and niet does not suffice to yield an NC reading.

(15) a. da Valère nooit niet gewerkt eet          nooit niet: NC
        that Valère never not worked has
        "that Valère has never worked"
     b. da Valère niets niet gezeid eet          niets niet: NC
        that Valère nothing not said has
        "that Valère has not said anything"
     c. da Valère nieverst niet over geklaapt eet          nieverst niet: NC
        that Valère nowhere not about talked has
        "that Valère has never talked about anything"
     d. *da Valère niet dikkerst over geneenen student niet geklaapt eet          niet dikkerst geneenen student niet: *NC
        that Valère not often about no student not talked has

For completeness' sake, note that there is no adjacency requirement between the simple n-constituent and niet, as already shown by (14g), but see section 8.5.4.2 for further discussion. The restriction on the creation of NC readings for complex n-constituents such as niet dikkerst also applies to n-constituents containing the negative quantifier geen, "no". We illustrate this point in (16). As just shown in (15c), nieverst, "nowhere", and niet can enter into an NC relation. The n-constituent geneenen student cannot enter into an NC relation with niet (16a),15 but it can do so when it is separated from niet by nieverst; see (16b), in which geneenen student, nieverst, and niet enter into an NC relation. Alternatively, if niet is replaced by niet meer (16c), the sentence is also grammatical with an NC reading:

(16) a. *dat er geneenen student over zukken dingen niet klaapt          geneenen student niet: ??DN/*NC
        that there no student about such things not talks
        DN: "that no student does not talk about such things"
     b. dat er geneenen student nieverst niet over klaapt          geneenen student nieverst niet: NC
        that there no student nowhere not about talks
        "that no student talks about anything"
     c. dat er geneenen student over zukken dingen niet meer klaapt          geneenen student niet meer: NC
        that there no student about such things not more talks
        "that no student talks about such things any longer"

8.3.3.2.2 IMPLICATIONS FOR AN MA ANALYSIS

The preceding data show that WF NC is sensitive to the type of n-constituents involved and to their relative positions. Because all n-constituents (niemand, niet lange, niet dikkerst, niet, niet meer, geen-NP, etc.) can enter into an NC reading in some combinations, Zeijlstra's (2004, 2008) MA analysis would lead us to expect that they can always enter into an Agree relation with the relevant negative operator, and it is not clear how MA formulated as a one-time across-the-board procedure can "distinguish" acceptable combinations from unacceptable ones. In (17), we provide schematic representations of (14c) and (14d) to illustrate this point. On an MA approach, it is unclear why niemand, niet dikkerst, and niet can enter into an NC relation in (17a) while this is not possible in (17b).

(17) a. dat er doa [NegP [iNEG] niet dikkerst [uNEG] niemand [uNEG] niet [uNEG] .. eet]

     b. *dat er doa [NegP [iNEG] niemand [uNEG] niet dikkerst [uNEG] niet [uNEG] .. eet]

As (17) shows, WF NC is subject to a locality condition, a property that is crucially absent from the formulation of MA. It is therefore not clear that the MA account can handle these co-occurrence restrictions, which are not addressed in Zeijlstra's work (2004, 2008).16 In section 8.5, we develop our own proposal to derive NC readings in WF, using a modified version of Haegeman and Zanuttini's (1996) proposal cast in terms of binary Agree.

8.3.3.3 Empirical Problems II: DP-Internal Negative Concord

WF also displays DP-internal NC. This was illustrated in (8) and is also shown in (18a). We want to say that niet vele and geen enter into an Agree relation, because geen can only be present in the DP by virtue of the negative property of niet vele, as shown in (18b) (see Haegeman 2002a):

(18) a. niet vele geen boeken
        not many no books
        "not many books"
     b. *vele geen boeken

In (18a), niet negates the quantifier vele. Geen itself does not express a quantificational negation of the nominal constituent: niet vele geen boeken, literally "not many no books", can only mean "not many books"; it can never be interpreted as meaning "no books", nor can it mean "many books". One might propose that geen bears the [uneg] feature, that niet in niet vele bears the [ineg] feature, and that [uneg] is subject to DP-internal checking, as in (18c–d):

(18) c. niet vele [ineg]  geen [uneg]   ⇒ Agree
     d. niet vele [ineg]  geen [uneg]

However, the resulting complex n-constituent niet vele geen boeken will then carry an [ineg] feature (18d). Thus, following Zeijlstra's (2004) account, the n-constituent should contribute its own negative value to the clause.17 This has two consequences: (a) If Neg-movement of n-constituents is driven by [uneg], the resulting n-constituent (18a), with the feature content in (18d), should not be subject to Neg-movement, since it no longer contains an unchecked [uneg]. (b) The n-constituent (18a) should not enter into an NC relation with other n-constituents in the clause. Bearing [ineg], the n-constituent should give rise to a DN reading if it is c-commanded by the clausal negative operator with the [ineg] feature. These predictions, which follow from the standard assumption that when valuation has happened, the valued item is not able to enter into further Agree relations (Chomsky 2000 et seq.) or to undergo further movement (for extensive arguments, see Boeckx 2007, 2008 and Bošković 2007), are both incorrect. First, just like any other n-constituent, the constituent in (18a) must undergo Neg-movement (see also Footnote 6):

(18) e. *dan ze ketent van niet vele geen boeken zyn
        that they contented of not many no books are
     f. dan ze va niet vele geen boeken ketent zyn
        that they of not many no books contented are
        "that they are not pleased with many books"

Second, just like niemand, niet, and so on (for which we assume, following Zeijlstra 2004, 2008, that they bear [uneg]), niet vele geen boeken, "not many books", may enter into an NC relation with other n-constituents: In (19), niet vele geen boeken enters into an NC relation with nooit, "never", and with niemand, "no one".

(19) Ier en leest er nooit niemand niet vele geen boeken.
     here en reads there never no.one not many no books
     "No one ever reads many books around here."

Since niet vele geen boeken undergoes Neg-movement and is able to enter into an NC relation, (18c–d) cannot be correct. That is, [uneg] must remain active and must not have been valued and deleted DP-internally. An alternative would be to assume that both niet vele and geen bear [uneg], basically along the lines of Zeijlstra's proposals for NC at the clausal level. Under MA, then, they would enter into an Agree relation with the [ineg] feature of the clausal negative operator:18

(20) a. [OP [iNEG] [Neg [vP [niet [uNEG] vele] geen [uNEG] …]]] ⇒ Agree

     b. [OP [iNEG] [Neg [vP [niet [uNEG] vele] geen [uNEG] …]]]

(20) represents both geen and niet vele as being checked by (hence directly related to) the [ineg] feature of the negative operator, but it fails to capture their observed DP-internal interdependency. The MA analysis would provide (21a) with the representation in (21b), again with no dependency between the DP-internal [uneg] on niet vele and that on geen.

(21) a. T'eet ier niemand niet vele geen boeken.
        it has here no.one not many no books
        "No one has many books here."
     b. [NegP OP [iNEG] [niemand [uNEG] [niet vele [uNEG] [geen [uNEG] boeken]]]]

But the availability of geen does depend on that of niet vele: (21a) does not have a variant (21c) in which geen is directly dependent on the sentential negation, with MA applying as shown in (21d–e):

(21) c. *T'eet ier niemand vele geen boeken.
        it has here no.one many no books

     d. *[OP [ineg] [Neg niemand [uneg] [vP vele geen [uneg] …]]] ⇒ Agree
     e. *[OP [ineg] [Neg niemand [uneg] [vP vele geen [uneg] …]]]

As DP-internal geen in (21) is seen to depend on the presence of the DP-internal negative niet, what is required instead of (21e) is a representation like (21f), in which we can first establish an Agree relation between the [uneg] features on both geen and niet vele, prior to establishing the NC relation with the [neg] feature on niemand.

(21) f. [NegP OP [iNEG] … [vP niet [uNEG] vele [vP geen [uNEG] …]]]

If DP-internal NC is analyzed as a process relating two n-constituents each of which bears [uneg], this leads to the hypothesis that Agree can be established between items with [uneg].

8.3.4 Summary

We have shown in this section that apart from the conceptual issue concerning the role of locality, the MA approach to NC has two empirical shortcomings when applied to WF. Specifically,

1. It fails to predict the binary matching restrictions on NC;
2. It fails to provide a separate application for NC/MA in cases of DP-internal NC.

In what follows, we show how these two problems can be dealt with by an alternative approach to NC in terms of binary Agree.

8.4 Negative Concord Is Binary Agree (in West Flemish)

8.4.1 Agree

One outcome of our discussion in section 8.3 is that in order to capture the observed locality restrictions on WF NC in terms of Agree, we need to abandon Zeijlstra's (2004, 2008) "across-the-board" MA and to revert to binary Agree. Furthermore, to accommodate DP-internal NC we need to be able to establish an Agree relation between [uneg] features. In this section, we outline the conception of Agree that we implement in section 8.5.19 We propose the following informal definition (building in particular on Pesetsky and Torrego 2007: 268):20

(22) Agree
     α Agrees with β if α c-commands β, α and β both have a feature F, and there is no γ with the feature F such that α c-commands γ and γ c-commands β.21

The locality condition in the latter half of the definition ("and there is no . . .") enables us to account for cases in which NC is disallowed. We return to this point shortly. Before doing so, we note that, crucially for our purpose, our definition of Agree allows for agreement between two uninterpretable/unvalued features (see also López 2008 for a different implementation of the same idea).22 Pesetsky and Torrego (2007: 269) elaborate on this point as follows:

If value assignment is allowed to apply vacuously, the derivation on this view contains two unvalued occurrences of F before Agree, and contains exactly the same two unvalued occurrences of F after Agree. If the feature sharing view is correct, however, Agree between two unvalued occurrences of F [. . .] is far from vacuous, since its output will be a structure that contains only one occurrence of F with two instances.

In an Agree relation between uninterpretable features, it is difficult to say which is the probe and which is the goal, and whether there is a probe–goal relationship at all between the two features. Pesetsky and Torrego (2007: 269, fn. 9) acknowledge this, saying that "when Agree applies between two unvalued occurrences of a feature, inspection of the output cannot reveal whether the goal replaced the probe, or vice versa". We depart from their proposal in that we do not adopt a feature-sharing view and in that we assume that after Agree between two uninterpretable features, the uninterpretable feature survives on the higher element. An Agree relation that is allowed in principle by (22) but must be ruled out on independent grounds is a relation between two interpretable features. That should be excluded because if Agree reduces the agreeing features to one, interpretable features (information that has to be retained) would in effect be deleted (see Chomsky's 1995 notion of Full Interpretation). Schematically, our proposal can be illustrated as follows:

(23) a. [α [β [γ]]]
        iF  uF  uF    ⇒ Agree
     b. iF  uF        ⇒ Agree
     c. iF

In (23), β c-commands γ and, according to (22), by virtue of their shared feature (F), they are able to Agree, eliminating the lowest feature ([uF]). The topmost [uF] on β survives and, given that α c-commands β, it is able to Agree with [iF] on α. On this approach, Agree operates “stepwise” and locally.23

8.4.2 Negative Concord as Binary Agree

Returning to NC, in (24) we give a schematic representation of how binary Agree can derive NC. We use overstrikes to indicate that only one of the [uneg] features survives after Agree. As a result of stepwise Agree, just one [ineg] feature is left.

(24) a. [C [uneg]] [D [uneg]] ⇒ Agree
     b. [C [uneg]] [D [uneg]]   Merge [B [uneg]]
     c. [B [uneg]] [C [uneg]] [D [uneg]] ⇒ Agree
     d. [B [uneg]] [C [uneg]] [D [uneg]]   Merge [A [ineg]]
     e. [A [ineg]] [B [uneg]] [C [uneg]] [D [uneg]] ⇒ Agree
     f. [A [ineg]] [B [uneg]] [C [uneg]] [D [uneg]]
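The stepwise derivation in (24) can be rendered as a small procedural sketch. This is purely illustrative and not part of the formal proposal; the encoding of items as mappings from feature names to an interpretability flag is our own assumption:

```python
# Illustrative model of stepwise binary Agree as schematized in (24).
# Items are listed highest-first; each maps a feature name to True
# (interpretable) or False (uninterpretable). Agree between adjacent
# items sharing a feature deletes the feature on the lower item, so
# that only the topmost occurrence survives.

def binary_agree(chain):
    # Apply Agree pairwise and bottom-up, as in steps (24a-f).
    for i in range(len(chain) - 1, 0, -1):
        higher, lower = chain[i - 1], chain[i]
        for f in set(higher) & set(lower):
            del lower[f]  # the lower copy is checked and deleted
    return chain

# (24): A[ineg] > B[uneg] > C[uneg] > D[uneg]
chain = [{"neg": True}, {"neg": False}, {"neg": False}, {"neg": False}]
result = binary_agree(chain)
# After stepwise Agree, only the interpretable [ineg] on A remains.
```

Each Agree step relates exactly two items, which is what distinguishes this derivation from a one-step across-the-board checking procedure.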

8.5 Decomposing n-Words in West Flemish: Our Proposal

In this section, we propose an analysis of NC in WF based on a particular feature decomposition of the n-words. We should stress at the outset that our proposal is restricted to WF. Although we are convinced that our analysis can ultimately be extended to other NC languages, it is not clear to us at this point that it can capture all crosslinguistic variation in NC (see Giannakidou 2006 for discussion of variation across NC languages). We plan to return to the comparative aspect in future work.

8.5.1 A "Maximization" Requirement on Negative Concord

Haegeman and Zanuttini (1996: 143) describe the co-occurrence restrictions on NC in some detail. They classify WF n-constituents in terms of their internal structure and feature composition. We do not repeat their discussion, but simply provide Table 8.1, which shows their classification of the n-constituents with the associated features (from Haegeman and Zanuttini 1996: 145). The bare quantifiers such as niemand and niets correspond to our simple n-words. In Table 8.1, [q] is a quantificational feature; yes means that an NC reading is possible, no that it is not possible. Haegeman and Zanuttini (1996) derive NC by means of Neg-factorization, which extracts the negative component from all the items involved. Factorization operates in a stepwise binary fashion: rather than across-the-board factorization as in (25a), Haegeman and Zanuttini propose a pairwise factorization as in (25b):24

(25) a. [x¬][y¬][z¬] ⇒ [[x, y, z]¬]
     b. [x¬][y¬][z¬] ⇒ [x¬][[y, z]¬] ⇒ [[x, y, z]¬]

The precise conditions under which pairwise factorization operates are not clear, and how it could be implemented to derive the restrictions seen in Table 8.1 is not straightforward. The internal makeup of n-constituents plays a role in determining how they enter into NC relations. Starting from Haegeman and Zanuttini's classification, we propose here that WF n-words be composed featurally as in (26):25

(26) a. niet [uneg, uq] "not"
     b. niemand [uneg, iq] "no one"
     c. geen-NP [uneg] "no NP"
     d. niet meer [uneg] "no more"
     e. niet dikkerst [uneg] "not often"

Table 8.1 Head Features on Negative Elements and Co-Occurrence Restrictions

                     Bare Q [neg, q]        Geen-NP [q]            Niet [neg]
Bare Q [neg, q]      yes                    yes                    yes
                     niemand niets          niemand geen geld      niemand niet
                     no.one nothing         no.one no money        no.one not
Geen-NP [q]          yes                    yes                    no
                     geen mens niemand      geen mens geen tyd     *geen mens niet
                     no person no.one       no person no time      no person not
Niet meer [q]        yes                    yes                    no
                     niemand niet meer      geen mens niet meer    *niet meer niet
                     no.one no more         no person no more      no more not

These items enter into NC relations as follows:

(27) a. niemand + niet          [uneg, iq] + [uneg, uq]
     b. *niet dikkerst + niet   [uneg] + [uneg, uq]
     c. *geen-NP + niet         [uneg] + [uneg, uq]
     d. *niet meer + niet       [uneg] + [uneg, uq]
     e. niemand + geen-NP       [uneg, iq] + [uneg]
     f. niemand + niet meer     [uneg, iq] + [uneg]

Given the patterns displayed in (27), NC (and its formalization in terms of Agree) seems to be subject to a maximization requirement, in the sense that, having two uninterpretable features, niet can match and undergo Agree only with an item that carries both of them. A match between niet and the simple n-constituent niemand is possible, the latter combining a [uneg] feature with an [iq] feature, but a match between niet and a complex n-constituent is not possible because the latter lacks the quantificational feature. It looks as if, because of the lack of maximal matching, [uq] of niet remains unchecked in (27b–d). The same problem does not arise for NC between niemand, with its two features [uneg] and [iq], and the complex n-constituents with their one feature [uneg]: Even though niemand does have one additional feature, [iq], the latter is interpretable and hence need not be checked by Agree (27e–f). The feature composition in (26) gives us the right results to derive NC readings, but at this stage the feature sets are simply postulated in order to do just that. In part on the basis of Haegeman and Zanuttini (1996: 143–145), section 8.5.2 motivates the feature composition of the n-constituents in (27), using semantic, morphological, and syntactic criteria.
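The maximization requirement behind the patterns in (27) can be restated as a simple matching check. The sketch below is again purely illustrative; the set encoding of the feature bundles in (26) is our own:

```python
# Each n-constituent is encoded with the set of features it carries
# ("all") and the subset that is uninterpretable ("u"), following the
# decomposition in (26); niemand's interpretable [iq] appears in
# "all" but not in "u".

N_WORDS = {
    "niet":          {"u": {"neg", "q"}, "all": {"neg", "q"}},
    "niemand":       {"u": {"neg"},      "all": {"neg", "q"}},
    "geen-NP":       {"u": {"neg"},      "all": {"neg"}},
    "niet meer":     {"u": {"neg"},      "all": {"neg"}},
    "niet dikkerst": {"u": {"neg"},      "all": {"neg"}},
}

def can_concord(lower, higher):
    # Maximization: every uninterpretable feature of the lower item
    # must find a matching feature on the n-constituent it Agrees with.
    return N_WORDS[lower]["u"] <= N_WORDS[higher]["all"]

# (27a): niemand + niet: NC possible (both features of niet matched)
# (27b-d): *complex n-constituent + niet ([uq] on niet is unmatched)
# (27e-f): niemand + geen-NP / niet meer: NC possible
```

The subset test directly encodes the intuition that [uq] of niet is stranded in (27b–d), while the extra interpretable [iq] of niemand in (27e–f) is harmless.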

8.5.2 Motivation for the Decomposition

8.5.2.1 Simple n-Words

Simple n-words such as niemand, "no one", and niets, "nothing", are the negative counterparts of the (Standard Dutch) quantifiers iemand, "someone", and iets, "something".26

(28) quantifier            negative quantifier
     iemand "someone"      niemand "no one"
     iets "something"      niets "nothing"

We propose that in the quantifiers iemand, "someone", and iets, "something", ie- spells out the quantificational component and bears [iq]. We assume that iemand occupies a functional head in the nominal domain and moves to D. In simple n-words such as WF niemand, n- spells out [uneg]27 and is merged with iemand, "someone", through head movement, and this complex ends up in D.28 The syntactic structure is given in (29) (see Haegeman 2002a, Troseth 2009, and Aelbrecht 2012 for NegP within DPs):



Crucial for our account is Haegeman and Zanuttini’s (1996) hypothesis that the [iq] feature in niemand is available on the topmost layer of the DP and hence remains accessible at future derivational steps. Because [uneg] remains to be valued, the n-words are still visible for further operations after the derivation in (29b).

8.5.2.2 The Sentential Negator niet

Following Zeijlstra (2004, 2008) and Penka (2007a,b,c), we assume that sentential negation is encoded in an abstract operator associated with an [ineg] feature. With Zeijlstra (2004, 2008), we assume that the marker of sentential negation, niet, bears a [uneg] feature, which will be eliminated by Agree with the clausal [ineg] feature. For arguments, see Zeijlstra (2004, 2008) and Penka (2007a,b,c).
In addition, however, we propose that niet carries [uq]. The association of a quantificational feature with niet is morphologically motivated. Specifically, we suggest that niet is decomposed as n + iet, parallel to niets.29 Following up on the discussion in the preceding section, niet is part of the paradigm of simple n-words containing nie: niemand, "no one"; niets, "nothing"; and nieverst, "nowhere".
In stage II of Jespersen's cycle in Middle Dutch, a sentential negative marker niet developed from the negative indefinite niet, "nothing", which was used adverbially, and it became a reinforcer of sentential negation, "not at all" (sentential negation having originally been expressed solely by the preverbal negative marker; see, e.g., van der Auwera and Neuckermans 2004, Breitbarth and Haegeman 2008). We speculate that the development of the adverbial reinforcer into a marker of sentential negation involved a feature change: [iq] associated with ie- changes into [uq]. (For discussion of grammaticalization in relation to the diachronic development of negation, see van der Auwera and Neuckermans 2004 and in particular van Gelderen 2008.)

With its two features [uneg, uq], niet enters into an NC relation with negative quantifiers such as niemand and niets, which also display the two features. Postulating that niet carries the feature set [uneg, uq], however, has as a consequence that the clause must also contain a matching feature set [ineg, iq]. This means that the negative operator bears [ineg, iq]. In other words, only if both features are instantiated on the negative operator will the uninterpretable features of niet be able to be checked. We understand this to mean that what is labeled "sentential negation" is not merely a negative feature taking scope over the clause; rather, it involves negative quantification over events.

8.5.2.3 Complex n-Constituents

According to (26), the feature specification of complex n-constituents, such as niet dikkerst, "not often"; niet meer, "no more"; and niet vele, "not many", differs from that of niet. This might appear surprising since these n-constituents contain the formative niet, and we would a priori want niet as the marker of sentential negation and niet in complex n-constituents such as niet vele, "not many", to be the same formative, with [uneg, uq]. We will indeed assume that, like the marker of sentential negation, niet in complex n-constituents bears the features [uneg, uq]. In addition, however, we propose that these complex n-constituents contain a quantificational element. For instance, in niet vele we assume that vele, "many", has a quantificational feature that has to be interpretable because the ability to quantify is inherent to this item. Since niet negates vele in niet vele, we also assume that niet c-commands vele and is the specifier of a DP-internal NegP (Haegeman 2002a). On the basis of this decomposition, the [uq] feature on niet can be checked inside the n-constituent, as shown in the simplified structure in (30). We assume, following Haegeman (2002a), that niet moves from Neg to D.30

(30) b. niet        vele    ⇒ Agree
        [uneg, uq]  [iq]
     c. niet    vele
        [uneg]  [iq]

We assume that [iq] is too deeply embedded to take part in further Agree operations at the clausal stage. That is, at the next derivational step, only the feature [uneg] is visible. The precise implementation of this idea requires that we postulate that DPs are phases and that D is the relevant phase head (Svenonius 2004: 267, Chomsky 2007: 26). We assume that QPs in WF are merged below the D head and, following Haegeman (2002a), that NegP is merged at the top of the DP. Chomsky's (2001) Phase Impenetrability Condition (PIC) allows for Agree across a phase boundary until the next phase is merged. This means that when the verbal phase head is merged in the clause, a probe in the higher phase is unable to Agree with vele, which is not in the accessible domain of the lower phase.31 We assume that a similar derivation can extend to the complex negative adverbials niet dikkerst, "not often"; niet meer, "no more"; and so on.32 This assumption has important consequences for the internal makeup of such constituents, but for reasons of space we do not discuss this issue further here.

8.5.2.4 Geen-NPs

Like the n-constituents discussed in the preceding section, geen-NPs are both quantificational and negative. Once again, though, unlike simple negative quantifiers such as niemand, "no one", they do not enter into NC with niet. We assume, as was the case for niet vele, that geen-NPs differ from simple negative quantifiers in that their quantificational feature is not instantiated on the head of the phrase. Haegeman and Zanuttini (1996: 144) present some evidence in favor of this. First, in the singular the geen-NP has two variants, as shown in (31). In (31b), which is the emphatic variant of (31a), the negative element gen is distinguished morphologically from the quantificational element eenen (see also Kranendonk 2008):

(31) a. geenen boek
        no book
     b. gen-eenen boek
        no one book

The singular indefinite article eenen corresponds to a zero quantifier in the plural. (32) illustrates the decompositions:33

(32)                 Singular          Plural
     a. affirmative  eenen boek        Ø boeken
     b. negative     gen-eenen boek    geen Ø boeken

Second, WF has DP-internal NC (33), as seen earlier. In (33), the quantificational force of the phrase is expressed by the quantifier niet vele, and geen simply acts as a negative element entering into NC with the negative component of the negated quantifier niet vele.

(33) niet vele geen boeken
     not many no books
     "not many books"

We propose to align geen with niet, so that geen has both a [uneg] and a [uq] feature. The [uq] feature on geen would be valued DP-internally under Agree with [iq] on eenen or on a nonovert article.

(34) a. gen-        eenen boek
        no          one book
        [uneg, uq]  [iq]
     b. geen    Ø boeken
        no      books
        [uneg]  [iq]

8.5.3 Maximization and Intervention

In terms of the feature sets proposed in (26), the restrictions on NC in (27) suggest that NC is subject to a maximization condition (see Chomsky 2001), in that niet, with its [uneg] and [uq] features, can enter into an NC relation only with simple n-constituents also instantiating both an accessible [neg] feature and an accessible [q] feature. This section shows that this maximization requirement can be made to follow from intervention. Intervention occurs in a case where α and β share a feature F but in which there is a γ such that α c-commands γ, γ c-commands β, and γ shares the feature F with β (cf. (22)). In this case, γ will be an intervener and block the Agree relation between α and β. Consider (35), where the n-constituents will enter into NC. Our definition of Agree will allow both uninterpretable features on niet to be checked by the features on niemand; after Agree, [uneg] survives only on niemand. In turn, the surviving [uneg] will Agree with the [ineg] of sentential negation. (35) is a case in which the feature sets of niemand and niet, the agreeing items, are identical.

(35)  α            γ            β
      OP           niemand      niet
      [ineg, iq]   [uneg, iq]   [uneg, uq]   ⇒ Agree (γ, β)
      [ineg, iq]   [uneg, iq]                ⇒ Agree (α, γ)
      [ineg, iq]   [iq]

In (36), the feature sets of γ and β are not identical, and NC is not available.

(36)  α            γ            β
      OP           niet vele    niet
      [ineg, iq]   [uneg]       [uneg, uq]

The absence of NC in (36) can be derived as a result of intervention (Rizzi 1990).34 Agree can apply to γ (niet vele) and β (niet), resulting in a configuration that will only delete [uneg] on β (niet), stranding [uq] there. [uneg] survives on γ (niet vele), the c-commanding n-constituent. We use overstriking here to show the effect of Agree.

(37) a.  γ            β
         niet vele    niet
         [uNEG]       [uNEG] [uQ]   ⇒ Agree [uNEG]
         [uNEG]       [uQ]
         (feature checking)

The next step of the derivation involves the merger of sentential negation:

(37) b.  α            γ            β
         OP           niet vele    niet
         [iNEG, iQ]   [uNEG]       [uQ]   ⇒ Agree [uNEG]
                                          ⇒ *Agree [uQ]
         (feature checking)

In (37b), [uneg] on γ (niet vele) Agrees with [ineg] on α (OP). However, [uq] on β (niet) cannot be valued by [iq] on α (OP) because [uneg] on the c-commanding γ (niet vele) intervenes. We are assuming that [neg] and [q] belong to the same feature class (on feature classes, see Starke 2001, Rizzi 2004). (37b) instantiates a classic case of intervention: OP (α) c-commands niet vele (γ) and niet vele c-commands niet (β); niet vele is a closer goal sharing a feature of the relevant class with niet.

Thus, we have shown that the locality condition on Agree derives the maximization requirement on items entering into NC. We take this to be a welcome result because it means that we do not have to stipulate maximization.35
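The checking logic just described can also be stated procedurally. The following Python sketch is our own illustrative reconstruction, not part of the original proposal: the names (`agree_all`, `converges`, `FEATURE_CLASS`) and the encoding of features as strings are expository assumptions. It treats Agree as binary and bottom-up, with an intervener defined as the closest c-commander bearing any feature of the {NEG, Q} class.

```python
# Illustrative model of binary Agree with intervention (expository only).
# Items in a c-command chain carry feature sets; "iF" = interpretable,
# "uF" = uninterpretable. NEG and Q are assumed to form one feature class.

FEATURE_CLASS = {"NEG", "Q"}

def agree_all(chain):
    """Apply binary Agree bottom-up over a chain (chain[0] c-commands chain[1], ...).

    Each uninterpretable feature uF on a lower item probes the CLOSEST
    c-commander bearing any same-class feature: iF checks uF; a matching uF
    lets the feature survive on the higher item; any other same-class
    feature is an intervener and blocks the relation.
    """
    for b in range(len(chain) - 1, 0, -1):           # deepest goal first
        for f in sorted(chain[b]):
            if not f.startswith("u"):
                continue
            attr = f[1:]                             # e.g. "NEG", "Q"
            for g in range(b - 1, -1, -1):           # closest c-commander first
                higher = chain[g]
                if not any(x[1:] in FEATURE_CLASS for x in higher):
                    continue                         # inert for this class: keep looking
                if "i" + attr in higher or "u" + attr in higher:
                    chain[b].discard(f)              # checked, or survives on higher item
                break                                # matched or intervened: stop probing
    return chain

def converges(chain):
    """A derivation converges iff no uninterpretable feature survives."""
    return all(not x.startswith("u") for fs in chain for x in fs)

# (35): OP > niemand > niet -- identical feature sets on the agreeing items, NC derivable
print(converges(agree_all([{"iNEG", "iQ"}, {"uNEG", "iQ"}, {"uNEG", "uQ"}])))  # True
# (36): OP > niet vele > niet -- [uQ] on niet stranded by intervention, crash
print(converges(agree_all([{"iNEG", "iQ"}, {"uNEG"}, {"uNEG", "uQ"}])))        # False
```

On this toy representation the maximization effect falls out exactly as in the text: in (36) the [uNEG] on niet vele is a same-class intervener between OP and niet, so the stranded [uQ] causes a crash, while in (35) every uninterpretable feature is checked.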

8.5.4 Illustrations and Extensions

8.5.4.1 Some Examples

The application of binary Agree to derive NC in (38) shows that there is no adjacency requirement on NC: In this example, niet vele and niet meer enter into NC while being separated by the PP tegen Valère. We assume that the features of the latter constituent belong to a different feature class in the sense of Starke (2001) and Rizzi (2004) and will not give rise to intervention.

(38) a. da   Jan niet vele tegen Valère niet meer klaapt
        that Jan not  much to    Valère not  more talks
        “that Jan doesn’t talk to Valère much any more”
     b. da Jan [iNEG, iQ] niet vele [uNEG] tegen Valère niet meer [uNEG] klaapt ...
        (Agree relations: [iNEG, iQ]–niet vele; niet vele–niet meer)

As shown earlier and illustrated in (39a), NC also applies DP-internally in WF. The derivation of the NC reading of (39a) is given in (39b–d).

(39) a. niet vele geen boeken
        not  many no   books
        “not many books”
     b. niet        vele
        [uneg, uq]  [iq]         ⇒ Agree [uq] and [iq]
     c. niet vele
        [uneg, iq]
     d. niet vele   geen
        [uneg, iq]  [uneg, iq]   ⇒ Agree [uneg]
        niet vele   geen
        [uneg, iq]  [iq]

Niet vele geen boeken retains [uneg, iq] and can then enter into further NC relations in the clause; however, like geen boeken, it cannot enter into NC with niet. Recall that when the vP is merged in the clause, the complement of D is spelled out. This makes the [iq] features on vele and geen unavailable. Thus, the [uq] feature on the clausal niet will remain unchecked and the derivation will crash.

(40) Ier  en leest er    nooit niemand niet vele geen boeken niet *(meer).
     here en reads there never no.one  not  many no   books  not  *(more)
     “No one ever reads many books around here.”

Notice that this also explains why vele geen boeken (see (18)) is disallowed. In this case, geen will have an unchecked [uneg] feature, and since there is no other n-word within the DP, when the clausal vP is merged, this unchecked feature will be spelled out (because it is located in D’s complement), thus causing a crash.

8.5.4.2 Further Intervention Effects

Our approach correctly predicts that nonnegative quantifiers may also interfere with the various Agree relations between the n-constituents undergoing NC (see also Haegeman and Zanuttini 1996).36 While a definite DP does not give rise to intervention in (41a), the quantifier alles disrupts the NC relation between niemand and niet in (41b):

(41) a. dat  er    niemand die   boeken niet kent      niemand niet: NC
        that there no.one  those books  not  knows
        “that no one knows those books”
     b. dat  er    niemand alles      niet kent        niemand niet: *NC
        that there no.one  everything not  know
     c. OP           niemand      alles   niet
        [iNEG, iQ]   [uNEG, iQ]   [iQ]    [uNEG, uQ]
                                          Agree [uQ]
                                          *Agree [uNEG]

This follows straightforwardly. The quantifier alles, “all”, has [iq]. This feature will be able to check [uq] of niet. The stranded [uneg] feature on niet will then no longer be available for Agree (because of intervention), and thus, the NC reading cannot be derived.37

8.6 Conclusion

In this chapter, we have shown how a detailed analysis of negative concord in West Flemish questions the validity of Multiple Agree as a mechanism to derive negative concord. At a more general level, the data also challenge the validity of MA as an operation of narrow syntax. We have argued that the simpler and less powerful Agree mechanism, which is binary and strictly local, is superior to MA—an across-the-board phenomenon—for deriving the data in question. Agree in its original format as a binary operation offers a way of dealing with the various intervention effects found in WF NC.

Our proposal has conceptual and empirical consequences that we hope to return to in future work. In particular, on the conceptual side, we would like to examine whether other cases that have been accounted for in terms of MA can be reanalyzed in terms of our proposal. On the empirical side, it would be interesting to find out whether the crosslinguistic variation among NC patterns described in Giannakidou (2006) and the diachronic development and grammaticalization of n-constituents (“Jespersen’s cycle”) can be captured in relation to the feature content of n-constituents.

Notes

* Various aspects of this research were presented at the Departments of Linguistics in Barcelona, Cambridge, and Tromsø, at NELS 2008, and at the LSA Annual Meeting 2009. We thank the audiences for their comments. We also thank Klaus Abels, Cedric Boeckx, Željko Bošković, Norbert Hornstein, Damien Laflaquière, Amélie Rocquet, Michal Starke, Henriëtte de Swart, Raffaella Zanuttini, and two anonymous reviewers for very valuable comments and suggestions. Obviously the final responsibility for the article remains with the authors. Liliane Haegeman’s research has been funded by FWO through the 2009 Odysseus grant G091409.

1 The fact that the negative expressions nooit, “never”, and niets, “nothing”, express a single negation is often referred to as “negative spread”, with negative concord being reserved for the relation between en and niet and the n-constituents (see Den Besten 1989). We do not make this distinction; instead, we use the term negative concord to refer to any context in which multiple negative constituents express a single sentential negation.

2 (1) also contains the morpheme en, which, though related to the expression of sentential negation, is not able to express sentential negation all by itself. We discuss it briefly in section 8.2.1. Except when absolutely sentence final, when both [nit] and [ni] are found, niet is usually pronounced [ni]. This is why niet has often been given as nie in the literature. Here we stick to the spelling niet.

3 We remain agnostic here on whether there is a functional projection NegP. As far as we can see, this issue, though relevant in its own right, does not bear on the current discussion.

4 See Brown (1999: 29ff) for an earlier proposal that n-words carry a [uneg] feature. Brown’s discussion of WF (1999: 43–44), however, lacks detail and we cannot assess it here.

5 Giannakidou (2006: 328) defines n-words informally as in (i).
(i) N-word: An expression α is an n-word iff:
    a. α can be used in structures containing sentential negation or another α expression yielding a reading equivalent to one logical negation; and
    b. α can provide a negative fragment answer.

6 In reply to a question from an anonymous reviewer: (4b) is ungrammatical because it contains en, which requires the presence of an n-constituent with sentential scope. Not having undergone Neg-movement, van niemand, “of no one”, cannot take sentential scope. Without en the example would be possible with van niemand—with niemand stressed—expressing local negation, for instance in the following sequence:
(i) Kweten juste da Jan ketent is van Lieve,
    I.know only that Jan contented is of Lieve
    da José ketent is van Jan, en da Valère ketent is van niemand.
    that José contented is of Jan, and that Valère contented is with no.one.
    “What I know is that Jan is pleased with Lieve, that José is pleased with Jan, and that Valère is pleased with no one.”
See also Haegeman (1997) and Svenonius (2002) for local negation.

7 Once again, a negated constituent with clausal scope has to undergo leftward movement. For reasons that will become clear in section 8.3.3.2 (discussion of text examples in (17)), we cannot show this by means of the distribution of the relevant constituent with respect to niet, such negative constituents being incompatible with niet. However, as the contrast in (i) shows, a complex negative constituent which is the complement of an adjective (e.g., ketent, “contented”) must move to the left of that adjective. (See Haegeman 1997 for arguments that this is not simply because of the quantificational nature of the constituent.)
(i) a. *da Valère ketent van geen studenten en-was
       that Valère contented of no students en-was
    b. da Valère van geen studenten ketent en-was
       that Valère of no students contented en-was
       “that Valère wasn’t pleased with any students”

8 In the Lapscheure dialect, DP-internal NC is never possible with a DP-internal negated nonquantificational descriptive adjective: inside the bracketed DP in (ia), the negated attributive adjective goed/goej, “good”, does not allow doubling by geen (see Haegeman 2002a), as shown in (ib). The grammatical variant is (ic). Contrary to claims in Zeijlstra (2004: 111), the pattern we are concerned with cannot be described as niet A geen N, “not A no N”; instead, it must be described as niet Q geen N, “not Q no N”.
(i) a. Z’oan doa [goej eten].
       they had there good food
    b. *Z’(en)- een doa [niet goed geen eten].
       they (en)- have there not good no food
    c. Z’(en)- een doa [geen goej eten].
       they (en)- have there no good food
       “They didn’t have good food there.”

9 According to Zeijlstra, NC languages (i.e., languages with NegP) have “syntactic negation”; non-NC languages (i.e., languages without NegP) have “semantic negation”. In an NC language, overt n-constituents have [uneg], while the operator which carries [ineg] is covert. Zeijlstra ties the presence of NegP to the availability of [uneg] features in NC languages. Conversely, in a non-NC language the overt n-constituents have an [ineg] feature, there are no [uneg] constituents, there is no NegP, and there is no nonovert negation operator. Zeijlstra offers a functional explanation for the absence of an overt negative operator in NC languages (2004: 249). For the present discussion, we adopt Zeijlstra’s proposals, but see Penka (2007a,b) for a different implementation.

10 Zeijlstra (2004) assumes that the head of NegP is also associated with a [uneg] feature. This will not play a role in our discussion, so we leave it out of our representations for expository reasons.

11 A further problem arises with Zeijlstra’s analysis of WF en.
Zeijlstra (2004) assumes that en is associated with an uninterpretable feature [uneg], which is licensed under agreement with an interpretable feature on a nonovert negative operator (see below in the text for details). On Zeijlstra’s account, the question then arises why (i) is not acceptable:
(i) a. *Valère en-klaapt.
       Valère en-talks
    b. *[NegP OP¬ [ineg] [vP Valère [v′ en-klaapt [uneg]]]] (Zeijlstra 2004: 255)
See Section 8.2.1 for a different account that is compatible with the data.

12 The data discussed by Déprez (2000) are different in that they implicate a preverbal/postverbal asymmetry, which is not at issue here.

13 In general, DN readings are marked, and where an NC reading is available, that will be the default interpretation. For reasons of space, we do not present an analysis of DN readings, but we hope to return to the issue in future work.

14 The final consonant of meer, “more”, often remains unpronounced.

15 See section 8.5.2.4 on the alternation between geen and geneenen.

16 An approach in which NC is derived by unselective binding of the n-constituents by an operator (see, e.g., Ladusaw 1992, Acquaviva 1993, Piñar 1996, Giannakidou 1997) also does not seem to be able to derive the pairwise relations observed here without additional machinery. In their discussion of NC in Italian dialects, Manzini and Savoia (2008: 91) propose that the binding of several variables by the same quantifier requires that the variables be of the same semantic type, and they invoke a system with the features N(eg) and Q. This requirement is parametrized. Again, this account does not lead us to expect the particular pairwise relations displayed in WF.

17 For full discussion of Zeijlstra’s typology, see also Biberauer and Zeijlstra (2012).

18 As an anonymous reviewer observes, an MA analysis could also claim that the [uneg] of geen is too deeply embedded inside the DP phase for the negative operator to Agree with it. However, under an MA analysis it is not clear how this embedded [uneg] would be checked so that it does not cause a crash.
One could amend the MA analysis such that MA takes place within the DP, and then within the clause, though it is not clear what the MA operation within the DP would be in Zeijlstra’s framework since the DP contains two unvalued features and no interpretable one that can function as a probe. This would in fact be tantamount to reintroducing binary Agree.

19 Although we only deal with negation in this article, our definition of Agree is intended to be a general definition. We hope to return to this in future work.

20 We thank Norbert Hornstein (p.c.) for discussing the concept Agree with us.

21 Pesetsky and Torrego (2007: 268) give the definition in (i).
(i) Agree (feature-sharing version)
    (a) An unvalued feature F (a probe) on a head H at syntactic location α (Fα) scans its c-command domain for another instance of F (a goal) at location β (Fβ) with which to agree.
    (b) Replace Fα with Fβ, so that the same feature is present in both locations.

22 We depart from Pesetsky and Torrego (2007) and from Moscati (2006) in that we use interpretable/valued and uninterpretable/unvalued interchangeably.

23 The system we are advocating bears some resemblance to a proposal made by Frampton and Gutmann (2006), who pursue the following approach to agreement: “Agree induces feature sharing, with matching features coalescing into a single shared feature, which is valued if either of the coalescing features is valued” (p. 128). However, although their approach and ours seem to derive the same result, it is unclear what kind of operation “coalescing” is. Therefore, we do not use this terminology.

24 We have adjusted this representation in terms of our own article. In particular, we abandon the idea that n-words are universal quantifiers.

25 We are grateful to Michal Starke and Klaus Abels for very useful discussions regarding the feature content of these elements. Neither is responsible for the way we have used their comments. On the relevance of [neg] and [q] to NC, see also Manzini and Savoia (2008).

26 For reasons which are not clear, WF does not use iemand and iets; instead, it uses entwine, “someone”, and eentwa, “something”, both of which are composed of an indefinite article een and a wh-word. See Haegeman (1991) on these indefinite pronouns in WF.

27 For arguments that the [neg] feature on the n-constituent is uninterpretable, see the discussion in section 8.3.

28 Thanks to an anonymous reviewer for suggesting this implementation.

29 Some speakers, though not Liliane Haegeman, still use niet as an alternative to niets.

30 In our proposal, the [iq] feature on vele is not instantiated on niet, with which an Agree relation is established. This is not compatible with Pesetsky and Torrego’s (2007) proposal, according to which the output of Agree is a single feature shared by two locations. As mentioned, we do not adopt feature sharing here. Instead, we propose that the interpretable feature remains on the element where it is interpreted, as is standardly assumed. Observe that the issue is different for cases where two uninterpretable features Agree (see section 8.4.1). For such cases, we propose that the feature survives on the topmost element. This is required to ensure that the uninterpretable feature is not spelled out in a lower phase if the lowest n-word is in a phase other than the topmost one. As a reviewer points out, we therefore have to adopt two different algorithms for the two Agree relations. This is perhaps unfortunate. We intend to look into this in future work.

31 Bošković (2007) has argued that Agree should not be constrained by the PIC. However, Richards (2011) shows that when reanalyzed, the data Bošković discusses can, in fact, be analyzed in accordance with the PIC.
32 Consider also (i), in which the predicate niet ziek, “not sick”, enters into NC with niet meer, “not more”, but not with niet, “not”:
(i) da Valère niet ziek niet *(meer) is
    that Valère not sick not *(more) is
    “that Valère isn’t sick any more”
This suggests that niet ziek be treated like the complex n-constituents composed with niet, but at first sight it cannot be straightforwardly analyzed in terms of our system. Ziek by itself does not seem to be quantificational. We therefore suggest that there is a silent quantificational element, degree or quant, between niet and ziek (see Kayne’s 2005 approach to silent elements, and Corver 1997a,b on the internal syntax of adjectival phrases and the role of degree and quantification) and that this element bears [iq]. As a result of Agree, the [uq] feature on niet will duly be checked and only the [uneg] feature will be visible for further Agree operations. The silent degree could be said to introduce the default standard by which “sickness” is measured.

33 Our analysis differs from Kranendonk (2008), who assumes that geen is a quantificational element. An alternative would be to assume that geen-NPs are associated with the features [uneg] and [iq]. Geen spells out [uneg]; [iq] is located on the (possibly null) article, which we assume to be lower than DP (say, NumP).

34 As an anonymous reviewer points out, the structure in (36) is very reminiscent of a pattern that according to Starke (2001) and Rizzi (2004) creates no intervention effects. We cannot discuss this issue comprehensively here, nor how to reconcile the Starke–Rizzi approach with the way we are analyzing intervention. We intend to look into this in future work. See also Boeckx and Jeong (2004) on intervention.

35 An anonymous reviewer asks whether our proposal predicts a problem for φ-agreement between T and a wh-subject since the wh-subject has a [wh] feature that T does not have. We assume that no problems will arise because φ-features and [wh] features belong to different classes in the sense of Rizzi (2004).

36 Thanks to an anonymous reviewer for raising this question.

37 Zeijlstra (2004: 184–187) discusses the relation between sentential negation and universal quantifiers. We speculate that many of the issues he describes may be subject to an analysis in terms of the intervention effects we observe for WF. For reasons of space, we do not develop this point here.

References

Acquaviva, P. 1993. The Logical Form of Negation: A Study of Operator-Variable Structures in Syntax. Doctoral dissertation, Scuola Normale Superiore, Pisa.
Adger, D. 2003. Core Syntax. Oxford: Oxford University Press.
Aelbrecht, L. 2012. Ellipsis in negative fragment answers. International Journal of Basque Linguistics and Philology XLVI: 1–15.
Baker, M. C. 2008. The Syntax of Agreement and Concord. Cambridge: Cambridge University Press.
Biberauer, T. and Zeijlstra, H. 2012. Negative concord in Afrikaans: Filling a typological gap. Journal of Semantics 29: 345–371.
Boeckx, C. 2007. Understanding Minimalist Syntax: Lessons from Locality in Long Distance Dependencies. Malden, MA: Blackwell.
Boeckx, C. 2008. Bare Syntax. Oxford: Oxford University Press.
Boeckx, C. and Jeong, Y. 2004. The fine structure of syntactic intervention. In Issues in Current Linguistic Theory: A Festschrift for Hong Bae Lee, C. Kwon and W. Lee (eds.), 83–116. Seoul: Kyungchin.
Borkin, A. 1971. Polarity items in questions. In Papers from the Seventh Regional Meeting of the Chicago Linguistic Society, D. Adams, M. A. Campbell, V. Cohen, J. Lovins, E. Maxwell, C. Nygren and J. Reighard (eds.), 53–62. Chicago, IL: University of Chicago, Chicago Linguistic Society.
Borsley, R. D. and Jones, B. M. 2005. Welsh Negation and Grammatical Theory. Cardiff: University of Wales Press.
Bošković, Ž. 2007. On the locality and motivation of Move and Agree: An even more minimal theory. Linguistic Inquiry 38: 589–644.
Breitbarth, A. and Haegeman, L. 2008. Not Continuity, but Change: Stable Stage II in Jespersen’s Cycle. Ms., University of Cambridge & STL Lille III.
Brown, S. 1999. The Syntax of Negation in Russian. Stanford: CSLI Publications.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 2000. Minimalist inquiries. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, R. Martin, D. Michaels and J. Uriagereka (eds.), 89–155. Cambridge, MA: MIT Press.
Chomsky, N. 2001. Derivation by phase. In Ken Hale: A Life in Language, M. Kenstowicz (ed.), 1–52. Cambridge, MA: MIT Press.
Chomsky, N. 2004. Beyond explanatory adequacy. In Structures and Beyond—The Cartography of Syntactic Structure, Volume 3, A. Belletti (ed.), 104–131. Oxford: Oxford University Press.
Chomsky, N. 2007. Approaching UG from below. In Interfaces + Recursion = Language? Chomsky’s Minimalism and the View from Syntax-Semantics, H. M. Gärtner and U. Sauerland (eds.), 1–30. Berlin: Mouton de Gruyter.
Chomsky, N. 2008. On phases. In Foundational Issues in Linguistic Theory, R. Freidin, C. P. Otero and M.-L. Zubizarreta (eds.), 133–166. Cambridge, MA: MIT Press.
Christensen, K. K. 1986. Norwegian ingen: A case of post-syntactic lexicalization. In Scandinavian Syntax, Ö. Dahl and A. Holmberg (eds.), 21–35. Stockholm: Institute of Linguistics, University of Stockholm.
Christensen, K. K. 1987. Modern Norwegian ingen and the ghost of an Old Norse particle. In Proceedings of the Seventh Biennial Conference of Teachers of Scandinavian Studies in Great Britain and Northern Ireland, 1–17. London: University College London.
Corver, N. 1997. Much-support as a last resort. Linguistic Inquiry 28: 119–164.
den Besten, H. 1989. Studies in West Germanic Syntax. Doctoral dissertation, University of Tilburg.
Déprez, V. 2000. Parallel (a)symmetries and the internal structure of negative expressions. Natural Language and Linguistic Theory 18: 253–342.
Frampton, J. and Gutmann, S. 2006. How sentences grow in the mind: Agreement and selection in efficient minimalist syntax. In Agreement Systems, C. Boeckx (ed.), 121–157. Amsterdam: John Benjamins.
Gelderen, E. van. 2008. Negative cycles. Linguistic Typology 12: 195–243.
Giannakidou, A. 1997. The Landscape of Polarity Items. Doctoral dissertation, Groningen University.
Giannakidou, A. 2006. N-words and negative concord. In The Blackwell Companion to Syntax, Volume III, M. Everaert and H. van Riemsdijk (eds.), 327–391. Oxford: Blackwell.
Haegeman, L. 1991. Enkele opmerkingen over de analyse van eentwa en het Westvlaams van Guido Gezelle. Taal en Tongval 2: 159–168.
Haegeman, L. 1995. The Syntax of Negation. Cambridge: Cambridge University Press.
Haegeman, L. 1997. N-words, indefinites and the Neg criterion. In Negation and Polarity: Syntax and Semantics, D. Forget, P. Hirschbühler, F. Martineau and M.-L. Rivero (eds.), 115–137. Amsterdam: Benjamins.
Haegeman, L. 1998a. Verb movement in embedded clauses in West Flemish. Linguistic Inquiry 29: 631–656.
Haegeman, L. 1998b. V-positions and the middle field in West Flemish. Syntax: An Interdisciplinary Journal of Linguistics 1: 259–299.
Haegeman, L. 2000a. Antisymmetry and verb-final order in West Flemish. The Journal of Comparative Germanic Linguistics 3: 207–232.
Haegeman, L. 2000b. Negative preposing, negative inversion, and the split CP. In Negation and Polarity, L. R. Horn and Y. Kato (eds.), 21–61. Oxford: Oxford University Press.
Haegeman, L. 2000c. Remnant movement and OV order. In The Derivation of OV and VO, P. Svenonius (ed.), 69–96. New York: John Benjamins.
Haegeman, L. 2002a. Some notes on DP-internal negative doubling. In Syntactic Microvariation, S. Barbiers (ed.). Available at: www.meertens.nl/books/synmic
Haegeman, L. 2002b. West Flemish negation and the derivation of SOV order in West Germanic. Nordic Journal of Linguistics (Special issue on negation, A. Holmberg (ed.)) 25: 154–189.
Haegeman, L. and Zanuttini, R. 1991. Negative heads and the Neg criterion. The Linguistic Review 8: 233–251.
Haegeman, L. and Zanuttini, R. 1996. Negative concord in West Flemish. In Parameters and Functional Heads: Essays in Comparative Syntax, A. Belletti and L. Rizzi (eds.), 117–179. Oxford–New York: Oxford University Press.
Hiraiwa, K. 2001. Multiple Agree and the defective intervention constraint in Japanese. In Proceedings of the 1st HUMIT Student Conference in Language Research (HUMIT 2000), O. Matushansky et al. (eds.), 67–80. MIT Working Papers in Linguistics 40. Cambridge, MA: MIT Working Papers in Linguistics.
Hiraiwa, K. 2005. Dimensions of Symmetry in Syntax: Agreement and Clausal Architecture. Doctoral dissertation, MIT, Cambridge, MA.
Kayne, R. S. 2005. Movement and Silence. Oxford: Oxford University Press.
Kranendonk, H. 2008. Decomposing Negative Quantifiers: Evidence from Dutch Dialects. TIN-dag presentation. Ms., OTS, Utrecht University.
Ladusaw, W. 1992. Expressing negation. In Semantics and Linguistic Theory (SALT) II, C. Baker and D. Dowty (eds.), 237–259. Columbus: Ohio State University.
Laka, I. 1990. Negation in Syntax. Doctoral dissertation, MIT, Cambridge, MA.
Lawler, J. M. 1971. Any questions? In Papers from the Seventh Regional Meeting of the Chicago Linguistic Society, D. Adams, M. A. Campbell, V. Cohen, J. Lovins, E. Maxwell, C. Nygren and J. Reighard (eds.), 163–173. Chicago, IL: University of Chicago, Chicago Linguistic Society.
Lindstad, A. M. 2007. Analyses of Negation: Structure and Interpretation. Doctoral dissertation, University of Oslo.
López, L. 2008. The [person] restriction: Why? and, most specially, why not? In Agreement Restrictions, R. D’Alessandro, S. Fischer and G. H. Hrafnbjargarson (eds.), 129–158. Berlin: Mouton de Gruyter.
Manzini, R. and Savoia, L. 2008. Negative adverbs are neither Adv nor Neg. In Work Notes on Romance Morphosyntax, R. Manzini and L. Savoia (eds.), 79–97. Alessandria: Edizioni dell’Orso.
Merchant, J. 2004. Some working definitions (second version). Handout, Syntax 1, Fall.
Merchant, J. and Sadock, J. 2008. Case, agreement, and null arguments in Aleut. Paper presented at the 83rd Annual Meeting of the Linguistic Society of America, January 9.
Moscati, V. 2006. The Scope of Negation. Doctoral dissertation, University of Siena.
Penka, D. 2007a. Negative Indefinites. Doctoral dissertation, University of Tübingen.
Penka, D. 2007b. Uninterpretable negative features on negative indefinites. In Proceedings of the 16th Amsterdam Colloquium, M. Aloni, P. Dekker and F. Roelofsen (eds.), 19–22. Amsterdam: University of Amsterdam, ILLC/Department of Philosophy.
Penka, D. 2007c. A cross-linguistic perspective on n-words. Proceedings of BIDE05: International Journal of Basque Linguistics and Philology XLI: 267–283.
Pesetsky, D. and Torrego, E. 2007. The syntax of valuation and the interpretability of features. In Phrasal and Clausal Architecture: Syntactic Derivation and Interpretation. In Honor of Joseph E. Emonds, S. Karimi, V. Samiian and W. K. Wilkins (eds.), 262–294. Amsterdam: John Benjamins.
Piñar, L. P. 1996. Negative Polarity Licensing and Negative Concord in the Romance Languages. Doctoral dissertation, University of Arizona.
Progovac, L. 2000. Coordination, c-command, and ‘logophoric’ n-words. In Negation and Polarity, L. R. Horn and Y. Kato (eds.), 88–114. Oxford: Oxford University Press.
Richards, M. 2011. Probing the past: On reconciling long-distance agreement with the PIC. In Local Modelling of Non-Local Dependencies in Syntax, A. Alexiadou, T. Kiss and G. Müller (eds.), 135–154. Tübingen: Niemeyer.
Rizzi, L. 1990. Relativized Minimality. Cambridge, MA: MIT Press.
Rizzi, L. 2004. Locality and left periphery. In Structures and Beyond—The Cartography of Syntactic Structure, Volume 3, A. Belletti (ed.), 223–251. Oxford: Oxford University Press.
Roberts, I. 2007. Diachronic Syntax. Oxford: Oxford University Press.
Roberts, I. and Roussou, A. 2003. Syntactic Change. Cambridge: Cambridge University Press.
Starke, M. 2001. Move Dissolves into Merge: A Theory of Locality. Doctoral dissertation, University of Geneva.
Stechow, A. von. 2005. Semantisches und morphologisches Tempus: Zur temporalen Orientierung von Einstellungen und Modalen. Neue Beiträge zur Germanistik 4: 3–6.
Stechow, A. von and Zeijlstra, H. 2008. How to agree. Available at www.kubrussel.ac.be/onderwijs/onderzoekscentra/crissp/bcgl/2008/zeijlstravonstechow.pdf
Svenonius, P. 2002. Strains of negation in Norwegian. Working Papers in Scandinavian Syntax 69: 121–146.
Svenonius, P. 2004. On the edge. In Peripheries: Syntactic Edges and Their Effects, D. Adger, C. de Cat and G. Tsoulas (eds.), 261–287. Dordrecht: Kluwer.
Troseth, E. 2009. Degree inversion and negative intensifier inversion in the English DP. Studia Linguistica 26: 37–65.
Ura, H. 1996. Multiple Feature-Checking: A Theory of Grammatical Function Splitting. Doctoral dissertation, MIT, Cambridge, MA.
van der Auwera, J. and Neuckermans, A. 2004. Jespersen’s cycle and the interaction of predicate and quantifier negation in Flemish. In Dialectology Meets Typology: Dialect Grammar from a Cross-Linguistic Perspective, B. Kortmann (ed.), 453–478. Berlin: Mouton de Gruyter.
Vanacker, V. F. 1975. Substantiefgroepen met dubbele ontkenning in zuidwestelijke dialecten. Taal en Tongval 17: 41–50.
Watanabe, A. 2004. The genesis of negative concord: Syntax and morphology of negative doubling. Linguistic Inquiry 35: 559–612.
Willis, D. 2006. A Minimalist Approach to Jespersen’s Cycle in Welsh. Ms., University of Cambridge.
Zeijlstra, H. 2004. Sentential Negation and Negative Concord. Doctoral dissertation, University of Amsterdam.
Zeijlstra, H. 2008. Negative Concord is Syntactic Agreement. Ms., University of Amsterdam.

9 Medial Adjunct PPs in English
Implications for the Syntax of Sentential Negation*

with Karen De Clercq and Liliane Haegeman

9.1 Introduction: Aim and Organization of the Chapter

The starting point of this chapter is a fairly widespread claim in the generative literature to the effect that sentence-medial adjunct PPs are unacceptable. Our chapter makes two points: First, at the empirical level, we elaborate on Haegeman (2002), who showed that medial adjunct PPs are possible. We demonstrate on the basis of corpus data that sentence-medial adjunct PPs are not unacceptable and are attested. Our corpus data also reveal a sharp asymmetry between negative and nonnegative adjunct PPs, which was noted by De Clercq (2010a,b) but was not thoroughly discussed there. The analysis of the corpus reveals the following pattern: Nonnegative adjunct PPs such as at that time resist medial position and instead tend to be postverbal; negative adjunct PPs such as at no time appear medially rather than postverbally.

The second part of the chapter looks at some theoretical implications of our findings for the syntax of negative PPs. We broaden the empirical domain and include negative complement PPs in the discussion. It is shown that when it comes to the licensing of question tags, English negative complement PPs, which are postverbal, pattern differently from postverbal negative adjunct PPs. Put informally, sentences with a postverbal negative adjunct PP pattern with negative sentences in taking a positive question tag, while sentences containing a postverbal negative argument PP pattern with affirmative sentences in taking a negative tag. To account for the observed adjunct-argument asymmetry in the licensing of question tags, we will propose that clauses are typed for polarity and we explore the hypothesis that a polarity head in the left periphery of the clause is crucially involved in the licensing of sentential negation (Laka 1990, Progovac 1993, 1994, Moscati 2006, 2011, De Clercq 2011a,b, McCloskey 2011, and others).
The chapter is organized as follows: Section 9.2 considers the status of nonnegative medial adjunct PPs. Section 9.3 examines the distribution of negative adjunct PPs. Section 9.4 elaborates our account of the licensing of sentential negation, which relies on a clause-typing mechanism established by a polarity head in the left periphery of the clause. Section 9.5 is a brief summary of the chapter.

9.2 Medial Position for Circumstantial PPs in English

When realized by adverbs, English adjuncts are found in three positions: (1) initial (illustrated in (1a), (2a)), (2) medial ((1b), (2b,c)), and (3) postverbal ((1c), (2d)). The examples in (1) illustrate the patterns in a sentence with only a lexical verb and those in (2) the patterns in a sentence with an auxiliary and a lexical verb. The difference between the patterns in (2b) and (2c) is tangential to the discussion, and we group them under “medial position”.

(1) a. Recently he left for London.
    b. He recently left for London.
    c. He left for London recently.

(2) a. Recently he has left for London.
    b. He recently has left for London.
    c. He has recently left for London.
    d. He has left for London recently.

With respect to adjuncts realized by PPs, the literature has generally focused on initial ((3a), (4a)) or postverbal ((3c), (4c)) PPs, with little or no discussion of medial PPs ((3b), (4b)):

(3) a. At that time the actor lived in London.
    b. The actor at that time lived in London.
    c. The actor lived in London at that time.

(4) a. At that time the actor was living in London.
    b. The actor was at that time living in London.
    c. The actor was living in London at that time.

In this section, we discuss these data more carefully based on literature surveys and corpus studies.

9.2.1 Medial Position Adjunct PPs: The Literature

As pointed out by Haegeman (2002), there is a tendency in the generative tradition to consider medial adjunct PPs (such as (3b), (4b)) unacceptable in absolute terms, in contrast to medial adverbs. For instance, commenting on (5), Jackendoff (1977: 73) says, “First let us deal with the differences between AdvPs and PPs in V. The most salient difference is that AdvPs may appear preverbally as well as postverbally, whereas PPs may only be postverbal.”

(5) a. Bill dropped the bananas {quickly / with a crash}.
    b. Bill {quickly / *with a crash} dropped the bananas.
       (from Jackendoff 1977: 73, ex. (4.40))

This type of judgment is reiterated in the literature, for example, in Emonds (1976), who treats medial PPs such as those in (3b) and (4b) as parentheticals, and in Nakajima (1991), Rizzi (1997: 301), Frey and Pittner (1998: 517), Pittner (1999: 175, 2004: 272), Cinque (2004: 699–700), Haumann (2007), Belletti and Rizzi (2010), and elsewhere. Reproducing the judgment in (5), Cinque (1999: 28) writes,

Circumstantial adverbials also differ from AdvPs proper in that they are typically realized (with the partial exception of manner adverbials) in prepositional form (for three hours, in the kitchen, with great zeal, for your love, in a rude manner, with a bicycle, etc.) or in bare NP form (the day after, tomorrow, this way, here, etc. [. . .]). Furthermore, possibly as a consequence of this, they cannot appear in any of the pre-VP positions open to AdvPs proper (except for the absolute initial position of “adverbs of setting”, a topic-like position).

While we take no issue with the actual judgments of specific examples, the authors’ extrapolation that all medial PPs are ruled out does not correspond to the empirical data. As a matter of fact, there is no agreement amongst authors that medial adjunct PPs are unacceptable. For instance, on the basis of the judgments presented in (6), McCawley (1998: 207) does confirm the general tendency for adjunct PPs to resist medial position, but he also provides the examples in (7), with acceptable medial adjunct PPs. He comments, “I don’t know of any neat way to distinguish between the P’s in [6] and the ones in [7]” (McCawley 1998: 214, note 25).

(6) a. John was carefully/*with care slicing the bagels.
    b. ??We will for several hours be discussing linguistics.
    c. ??Ed in Atlanta was struck by a truck.
       (McCawley 1998: 207)

(7) a. John has for many years been a Republican.
    b. John has on many occasions voted for Republicans.
       (McCawley 1998: 214, note 25)

Focussing on journalistic prose, Haegeman (2002) shows that medial PPs are regularly attested. The following illustrate a medial adjunct PP in a finite clause without an auxiliary, in (8a), a finite clause with an auxiliary, in (8b), as well as a nonfinite clause, in (8c):

(8) a. Burton moved in with Speke and the collaboration within two months produced a 200,000 word book, which sold 5,700 copies in its first year and was translated all over Europe. (The Guardian, 13 August 2001, p. 8, col. 4)
    b. The strength and charm of his narratives have in the past relied to a considerable extent on the first person presence of Lewis himself. (The Observer, 22 July 2001, Review, p. 3, col. 2)
    c. It is fine, keep going, but then we have to after a day or two just leave this to the committee. (The Guardian, 20 August 2003, p. 4, col. 6)

Several authors (Quirk et al. 1985: 492, 514, 521, Ernst 2002a: 504, 2002b: 194, Mittwoch, Huddleston and Collins 2002: 780) signal that weight considerations play a part in restricting the availability of nonparenthetical medial PP adjuncts. For a discussion of a definition of weight in determining word order, see, for example, Ernst (2002b: 194) and the references cited there.

9.2.2 Medial Position Adjunct PPs Are Rare

While the claim that medial PPs are categorically unacceptable is definitely incorrect, medial adjunct PPs are not as frequent as medial adverbs. Quirk et al. (1985) provide an overview of the distribution of a range of adverbial expressions in the various positions in a sample of the Survey of English Usage corpus (see their description in Quirk et al. 1985: 489). Tables 9.1 and 9.2 are based on their table 8.23 and summarize the percentages of adjunct PPs and adjunct adverbs in initial, medial and postverbal position. While Quirk et al. distinguish a number of medial and postverbal positions, our tables simplify their table 8.23 in that we have grouped their distinct medial positions into one position and we have also collapsed their postverbal positions into one. Medial PPs are systematically outnumbered by postverbal PPs, both in writing and in speech. For adverbs, the opposite relation holds: medial adverbs are slightly more frequent than postverbal ones. That

Table 9.1 Distribution of PPs in the Survey of English Usage (Quirk et al. 1985: 501)

           % Initial   % Medial   % End   Total number
Spoken        6           1         93       2,063
Written      12           3         85       2,351
Average       9.5         2.5       88       4,456

Table 9.2 Distribution of Adverbs in the Survey of English Usage (Quirk et al. 1985: 501)

           % Initial   % Medial   % End   Total number
Spoken       17.5        44.5       38         608
Written      15          50         35         462
Average      16          47         37       1,063

medial PPs are rare is also occasionally signalled in pedagogically oriented grammars such as, for instance, the Collins COBUILD grammar (Sinclair 1990: 283) and Lambotte (1998).
In order to assess the status of medial adjunct PPs in present-day English, we undertook a pilot search of the Corpus of Contemporary American English (henceforth COCA; http://corpus.byu.edu/coca/, COCA 2010) and the British National Corpus (henceforth BNC; http://corpus.byu.edu/bnc/, BNC 2010) in which we examined the distribution of the following temporal adjunct PPs: on three occasions, on those occasions, at one time, at a time, at some time, at this time, at that time, on many occasions, and of the manner adjunct in this way. For adjunct PPs occurring at a very high frequency (at one time, at a time, at some time, at this time, at that time, on many occasions, in this way), we based our study on a sample of the first 100 entries. We present our results in Tables 9.3 and 9.4. Obviously, the numbers in these tables in no way represent the full and final picture

Table 9.3 Pilot Study: Distribution of PPs in Medial Position, COCA Sample

PP                   Total   Initial   Medial   Postverbal   Not relevant
On three occasions     86      18         2         63            3
On those occasions     95      49         1         42            3
At one time           100      27        13         36           24
At a time             100       9         0         42           49
At some time          100      13        13         74            0
At this time          100      24         6         67            3
At that time          100      35        10         54            1
On many occasions     100      28         5         64            3
In this way           100      52         3         39            6

Table 9.4 Pilot Study: Distribution of PPs in Medial Position, BNC Sample

PP                   Total   Initial   Medial   Postverbal   Not relevant
On three occasions     63      21         2         35            5
On those occasions     29       8         0         20            1
At a time             100      16         2         46           36
At one time           100      37        28         24           11
At some time          100      12        17         70            1
At this time          100      24         6         68            2
At that time          100      27        14         59            0
On many occasions     100      23         3         72            2
In this way           100      26         2         70            2

of the distribution of adjunct PPs, nor does our chapter offer a statistical analysis of such data, but our findings suffice to show (1) that sentence-medial adjunct PPs are certainly attested and (2) that, fully in line with the literature, such medial adjunct PPs are outnumbered by postverbal adjunct PPs. In section 9.3 we will see, however, that for a well-defined class of PP adjuncts, medial position is not just an option but is actually strongly preferred over postverbal position.

9.3 Sentential Negation and Adjunct PPs

9.3.1 Sentential Negation in English

In English, negation can be expressed in a number of different ways, the most common of which are illustrated in (9). For recent analyses and a survey of the literature we refer to Zeijlstra (2004), Christensen (2005, 2008), Moscati (2006, 2011) and Tubau (2008).

(9) a. The police did not talk to any witnesses.
    b. No one talked to the police about any crime.
    c. The police associated no one with any of these crimes.
    d. The police talked to no one about any of these crimes.
    e. The police never talked to any witnesses about the crime.
    f. Never had the police talked to any witnesses.

The canonical marker of negation is the particle not (or its contracted form n’t) adjacent to the finite auxiliary. Alternatively, an argument of the verb is realized as a negative nominal constituent, such as no one in (9b) or (9c), or as a PP containing a negative nominal as in (9d), which also conveys negation (but see section 9.4 for discussion). Finally, and most relevant for our purposes, in (9e) and (9f) a negative adjunct expresses sentential negation. In (9e) the adverb never is medial, and in (9f) it is initial, triggering subject–auxiliary inversion (henceforth SAI; see Rudanko 1987, Haegeman 2000, Sobin 2003).
Negative adjuncts with sentential scope can also be realized as PPs. In (10a) the negative quantifier no contained inside the initial temporal PP at no time has sentential scope, witness the fact that it triggers SAI and licenses the negative polarity item any in the complement of the verb.45 The negative PP differs from its nonnegative counterpart at that time, which does not, and cannot, trigger SAI, as is shown in (11).

(10) a. At no time had the police talked to any witnesses.
     b. *At no time the police had talked to any witnesses.

(11) a. At that time the police had interviewed the witnesses.
     b. *At that time had the police interviewed the witnesses.

Like negative adverbs, negative adjunct PPs with sentential scope can appear in sentence-medial position, as in (12). The availability of the polarity item any in (12a) confirms that at no time has sentential scope. Though we mainly focus on temporal PPs like (12a), other medial adjunct PPs can also express sentential negation, see (12b):

(12) a. The police had at no time talked to any of the witnesses.
     b. The FQ at no level forms a constituent with the DP it modifies. (Will Harwood, p.c.)

In relation to the discussion in section 9.2, the data in (12) obviously also challenge claims according to which medial adjunct PPs are categorically unacceptable. We go into these patterns in more detail here.

9.3.2 Negative Adjunct PPs and the Expression of Sentential Negation

Sentences with preposed negative constituents such as the pair in (13a,b) have been discussed extensively (see, among others, Rudanko 1987, Haegeman 2002, Sobin 2003, Radford 2004, Haumann 2007 and the references cited there). In (13a), without SAI, the negative quantifier no contained in the PP in no clothes encodes constituent negation (“without clothes”) and does not take sentential scope; in (13b), with SAI, the PP-internal negative quantifier has sentential scope (“there are no clothes such that . . .”).

(13) a. In no clothes Mary looks attractive.
     b. In no clothes does Mary look attractive.

Less attention has been paid to the distribution and interpretation of postverbal negative PPs. We briefly consider here some discussions in the literature.
Tottie (1983) studies the alternation between S[ynthetic] negation (he said nothing) vs. A[nalytic] negation (he did not say anything) in American English, using both informants’ questionnaires and corpus material. However, her data do not include many relevant examples of PPs. Summarizing her conclusions on the basis of the informants’ questionnaires, she writes:

An examination of the actual sentences from the sample reveals that those sentences that had S negation in PrepPhrases were to a large extent fairly fixed collocations. Compare ([14]), all be-sentences with PrepPhrases functioning as adverbials:

(14) a. In any case it is by no means clear that formally structured organs of participation are what is called for at all. A 35
     b. Mr Balaguer’s troubles are by no means over. B 05
     c. It is by no stretch of the imagination a happy choice. B 22
        (Tottie 1983: 52)

Observe that in the three examples in (14), the medial negative adjunct PP is not set off prosodically. Indeed, in spite of its relative weight, even the PP by no stretch of the imagination occupies medial position in (14c). Inserting commas in (14c) would entail that the negative PP cannot scope over the clause and would render the sentence unacceptable, as is shown in (14c’).

(14) c’. *It is, by no stretch of the imagination, a happy choice.

In their discussion of negative markers in English, Quirk et al. (1985: 783) systematically compare a positive sentence with its negative alternative. Their example set in (15) is of interest in the light of our discussion. While in the positive (15a) the adverb somehow is in postverbal position, the negative adjunct PP is placed medially in (15d). Quirk et al. do not comment on this shift in position.

(15) a. They’ll finish it somehow.
     b. They won’t in any way finish it.
     c. They won’t finish it at all.
     d. They will in no way finish it.
        (Quirk et al. 1985: 783, ex. (8))

Pullum and Huddleston (2002) distinguish “verbal” negation, expressed by medial not or n’t associated with an auxiliary, as in (9a) or (15b,c), from “nonverbal” negation, expressed by means of a negative constituent such as a negative quantifier (no, nothing, no one, etc.) or a negative adverb (never, no longer, no more). Relevantly, they provide (16a) as an instance of nonverbal sentential negation. In this example negation is encoded in a postverbal adjunct PP. Following Klima (1964), McCawley (1998), Horn (1989), Haegeman (2000), De Clercq (2010a), and others, the standard diagnostics to detect negativity (16b–e) show that the postverbal negative constituent in (16a) can take sentential scope.6

(16) a. We were friends at no time. (Pullum and Huddleston 2002: 788, ex. [5iia])
     b. We were friends at no time, not even when we were at school. (Pullum and Huddleston 2002: 789, ex. [10ia])
     c. We were friends at no time, and neither were our brothers.
     d. We were friends at no time, were we?
     e. At no time were we friends.

Along the same lines, Haumann (2007: 230) provides (17a), in which postverbal on no account negates the sentence, and Kato (2000) presents (17b) as an instance of sentential negation expressed by a postverbal negative PP (but see the later discussion concerning (22)):

(17) a. She will go there on no account, not even with John. (Haumann 2007: 230, ex. (130b))
     b. He will visit there on no account. (Kato 2000: 67, ex. (14a))

However, native speakers often consider sentences with postverbal negative adjunct PPs as less than perfect. And indeed, while they present (16a) without comment, Pullum and Huddleston (2002: 814) themselves signal that in fact postverbal negative PPs lead to lower acceptability. They illustrate this point by means of the (weak) contrasts in (18) and (19): The examples in (18), with a negative adjunct PP in postverbal position, are more marked than the corresponding sentences in (19), which contain a combination of the negative marker not with a postverbal adjunct PP containing a negative polarity item (NPI).

(18) a. ?I am satisfied with the proposal you have put to me in no way. (Pullum and Huddleston 2002: 814, ex. [24ib])
     b. ?As far as I can recall, I have purchased food at the drive-through window of a fast-food restaurant on no street in this city. (Pullum and Huddleston 2002: 814, ex. [24iib])

(19) a. I am not satisfied with the proposal you have put to me in any way. (Pullum and Huddleston 2002: 814, ex. [24ia])
     b. As far as I can recall, I have not purchased food at the drive-through window of a fast-food restaurant on any street in this city. (Pullum and Huddleston 2002: 814, ex. [24iia])

As shown in the following extract, the authors account for the preceding contrasts in terms of processing load, rather than in terms of grammaticality:

In principle, non-verbal negators marking clausal negation can appear in any position in the clause. However, as the position gets further from the beginning of the clause and/or more deeply embedded, the acceptability of the construction decreases, simply because more and more of the clause is available to be misinterpreted as a positive before the negator is finally encountered at a late stage in the processing of the sentence. (Pullum and Huddleston 2002: 814)

Though Pullum and Huddleston do not pursue this point, their account of the contrasts between (18) and (19) leads to the correct prediction that medial position will be preferred for the negative adjunct PP: (18a) and (18b) are definitely improved with the negative PP in medial position. Observe that even for the slightly longer PP on no street in this city in (20b), considerations of weight do not lead to a degradation:

(20) a. I am in no way satisfied with the proposal you have put to me.
     b. As far as I can recall, I have on no street in this city purchased food at the drive-through window of a fast-food restaurant.7

De Clercq (2010a,b) reports the judgments in (21) through (24). The examples in (21) show that while the nonnegative PP at that time is accepted both in medial (21a) and postverbal (21b) position, its negative analogue remains acceptable in medial position (21c) but postverbal position (21d) is rejected. In contrast with the judgment reported by Kato in (17b) above, postverbal on no account in (22b) is also considered unacceptable by De Clercq’s informants. The examples in (23) and (24) provide additional judgments along the same lines.

(21) a. The police had at that time interviewed the witnesses.
     b. The police had interviewed the witnesses at that time.
     c. The police had at no time talked to the witnesses.
     d. ?*The police had talked to the witnesses at no time.

(22) a. You should on no account move to Paris.
     b. ?*You should move to Paris on no account.

(23) a. She should at no time reveal the secret.
     b. ?*She should reveal the secret at no time.

(24) a. They would under no circumstances reveal the problem.
     b. *They would reveal the problem under no circumstances.

A fully acceptable alternative to a sentence with a postverbal negative adjunct PP is one in which sentential negation is expressed by the canonical marker of sentential negation not/n’t and in which the NPI any replaces the negative quantifier no in the postverbal PP. The contrast between the fully acceptable examples in (25) and the degraded (22b), (23b) and (24b) suggests that it is the negative component of the postverbal PPs that causes the degradation.

(25) a. She should not reveal the secret at any time. (De Clercq 2010b: 9)
     b. You should not move to Paris on any account.
     c. They would not reveal the problem under any circumstances.

9.3.3 The Distribution of Negative PP Adjuncts

In section 9.2.2, we saw that as far as nonnegative adjunct PPs are concerned, postverbal PPs outnumber medial PPs in the English corpora considered. To assess the distribution of their negative counterparts, we examined the distribution of the negative adjunct PPs at no time, on no account, by no stretch of the imagination, on no occasion, in no event, at no other N, and in no way (see Quirk et al.’s example in (15), shown earlier). Our pilot study reveals an asymmetry between negative PPs and nonnegative PPs. Medial nonnegative PPs are less frequently attested than postverbal nonnegative PPs. Medial negative PPs are far more frequent than postverbal negative PPs, which are, in fact, very rare indeed. These findings offer further support for Haegeman’s (2002) claim that medial adjunct PPs are not categorically excluded. On the other hand, while nonnegative adjunct PPs are easily available in postverbal position, postverbal negative PPs with sentential scope, while available, are the marked option.
Tables 9.5 and 9.6 summarize the results of our searches for the negative PPs at no time, on no account, by no stretch of the imagination, on no occasion, in no event, at no other N (see (26e, f, g)), and in no way. The lower frequency of postverbal negative adjunct PPs sets them off sharply from postverbal nonnegative adjunct PPs, which, as shown in Tables 9.3 and 9.4, are well-attested. To complete the picture, Tables 9.7 and 9.8 provide the relevant figures for medial and postverbal position of the corresponding adjunct PPs containing an NPI: at any time, under any circumstances, on any account, and on any occasion. For at any time and in any way, we have again used a reduced sample of 100 examples. As was the case for the nonnegative PPs discussed in section 9.2, postverbal position is more easily available. Some of the (rare) postverbal occurrences of negative PPs are illustrated in (26):

(26) a. I judge you in no way, Eunice. (COCA 2008, Fiction, Harriet Isabella)

Table 9.5 Distribution of Negative Adjunct PPs, COCA Sample

PP                                 Total   Initial (SAI)   Medial   Postverbal   Not relevant
At no time                          100        96             4        0              0
On no account                        21        21             0        0              0
By no stretch of the imagination     10         6             4        0              0
On no occasion                        3         2             0        0              1
In no event                           9         9             0        0              0
At no other N                        34        23             0        3              8
In no way                           100        14            84        2              0

SAI = subject–auxiliary inversion.

Table 9.6 Distribution of Negative Adjunct PPs, BNC Sample

PP                                 Total   Initial (SAI)   Medial   Postverbal   Not relevant
At no time                          100        86            13        0              1
On no account                        84        67            17        0              0
By no stretch of the imagination     14         9             5        0              0
On no occasion                        3         2             1        0              0
In no event                           0         0             0        0              0
At no other N                         9         5             0        3              1
In no way                           100         8            90        0              2

SAI = subject–auxiliary inversion.

Table 9.7 Distribution of NPIs: Medial and Postverbal Position, COCA Sample

PP                                 Total   Initial   Medial   Postverbal   Not relevant
On any occasion                      12       0         0         7              5
On any account                        8       0         4         3              1
By any stretch of the imagination   100       4         8        60             28
At any time                         100       9         1        86              4
In any way                          100       0        30        68              2

Table 9.8 Distribution of NPIs: Medial and Postverbal Position, BNC Sample

PP                                 Total   Initial   Medial   Postverbal   Not relevant
On any occasion                      11       3         4         1              3
On any account                       18       0        12         5              1
By any stretch of the imagination    21       0         6        10              5
At any time                         100      14        11        71              4
In any way                          100       0        45        53              2

     b. He really likes and appreciates a wide range of people who resemble him in no way whatsoever.8 (COCA 2001, news, The Washington Post)
     c. The fall also produced a strong smell of methylated spirits—something repeated at no other meteorite fall. (COCA 2006, MAG, Astronomy)
     d. For a kind of light and a sweep of possibility that comes at no other time. (COCA 1979, MAG, Skiing)
     e. It showed a flash of strategic prescience that he displayed at no other moment in his military career. (BNC CLXW, non-ac-humanities-arts)
     f. Such as has been available at no other period of British history (BNC EEW9, W-non acad, SocScience)
     g. The success of this unique element, which exists at no other German University (COCA 1990, Acad, Armed Forces).

In preparation for the next section we need to add one ingredient to the discussion, which we have not touched upon so far: whereas negative adjunct PPs resist postverbal position, the canonical position of negative complement PPs is postverbal (27a). Indeed, there is no medial position available for negative complement PPs, as is shown by (27b). However, the postverbal position of the negative complement PP is felt to be a marked option in comparison to encoding negation medially by means of the canonical marker of negation n’t/not, where the corresponding postverbal PP contains an NPI, as in (27c):

(27) a. Mary has talked to no one.
     b. *Mary has to no one talked.
     c. Mary hasn’t/not talked to anyone.

9.4 Ways of Expressing Sentential Negation

In this section we outline an account of the asymmetry in the distribution of negative adjunct PPs, and in particular of their strong preference for medial position. Our account explores proposals in De Clercq (2010a, 2011a,b). On one of the two derivations of postverbal adjunct PPs presented below, the processing complexity which Pullum and Huddleston (2002) associate with postverbal negative adjunct PPs can be argued to have a syntactic basis. In this chapter we do not discuss how to account for the distribution of nonnegative adjunct PPs.

9.4.1 Question Tags and Negative Clause-Typing

Ever since Klima (1964), reversal tags or question tags as illustrated in (28) have been used as a diagnostic to determine whether a sentence is affirmative or negative (Horn 1989, McCawley 1998):9

(28) a. John is working on a PhD, isn’t he?
     b. John isn’t working on a PhD, is he?

Standardly, it is proposed that a negative question tag identifies an affirmative sentence (28a) and that a positive question tag identifies a negative sentence (28b). Let us adopt the tag test as a diagnostic to determine the polarity of the clause, focusing on sentences containing a negative PP. Informally, we will say that clauses are typed for polarity as either negative or positive. Needless to say, clause-typing for polarity ([+/− negative]) is orthogonal to clause-typing for interrogative/declarative ([+/− wh]), since the value [+/− negative] may combine with the value [+/− wh]. Along these lines, a sentence negated by medial not/n’t is negative, and so is a sentence which contains medial never, for example, (29a). A sentence containing a medial negative adjunct PP is compatible with a positive question tag, for example, (29b), and hence is also “negative” in the intended sense.

(29) a. Mary has never talked to anyone, has she?
     b. She had at no point talked to anyone, had she?

As discussed earlier, postverbal negative adjunct PPs are rare, but to the extent that they are acceptable, such sentences are only compatible with positive tags. The example in (30a) is from Pullum and Huddleston (2002); (30b) is based on Pullum and Huddleston’s [24ii]. We conclude that postverbal negative adjunct PPs also type the clause as negative.

(30) a. We were friends at no time, were we?
     b. As far as I can recall, we have purchased food at the drive-through window of a fast-food restaurant on no street in this city, have we/*haven’t we? (based on Pullum and Huddleston 2002: 814, ex. [24ii])

When it comes to sentences containing negative complement PPs, though, the pattern of question tags is reversed for our informants. As can be seen in (31), while sentence-medial not induces a positive tag, the sentence with the postverbal negative complement PP to no one is compatible with a negative tag (see also Horn 1989: 185, citing Ross 1973 for a similar example with a negative nominal complement).10

(31) a. Mary has talked to no one, *has she?/hasn’t she?
     b. Mary hasn’t/not talked to anyone, has she?/*hasn’t she?

We conclude, then, that there is an argument–adjunct asymmetry: While postverbal negative adjunct PPs may be rare, to the extent that they are possible they type the clause as negative. On the other hand, postverbal negative complements do not type the clause as negative, since they are not compatible with a positive question tag.

9.4.2 Clause-Typing and Sentential Negation

Our hypothesis is that clauses are typed for polarity: They are either positive or negative. Polarity determines the choice of question tag. In line with the cartographic approach (Rizzi 1997, Moscati 2006), we assume that polarity typing is syntactically encoded on a head in the C-domain such as Laka’s (1990) ΣP or Progovac’s (1993, 1994) PolP. We propose that in the case of negative sentences, this head must establish a local checking relation with a negative constituent. From the distribution of the tags, we conclude that the medial negative marker not and the medial adverb never are able to license the clause-typing negative head in the C-domain and that postverbal negative PP complements cannot do so.

(32) a. Mary hasn’t talked to anyone, has she?
     b. Mary has never talked to anyone, has she?
     c. *Mary has talked to no one, has she?

We interpret the contrast in (32) as deriving from locality conditions on clause-typing. Putting this first at an intuitive level, the negation in (32c) is “too far” from the C-domain to be able to type the clause as negative and hence to license the positive tag. Various implementations can be envisaged to capture these locality restrictions. In terms of Phase theory (Chomsky 2001, 2008), for instance, one might say that, being contained within a lower phase (vP), the postverbal negative complement PPs cannot establish the required licensing relation with the relevant head in the C-domain.
To make this proposal more precise, let us propose that the polarity-related head in the C-domain contains an unvalued feature, [pol:__], which has to be assigned a value through a local checking relation. In (32a) and in (32b), with the medial negative markers not and never, the feature [pol:__] in the C-domain can be valued through an agree relation with the interpretable negative feature on not and never respectively.11 If the C-polarity head is typed as negative, then the clause will be compatible with a positive tag. In (32c), on the other hand, the negative quantifier no one in the VP-internal argument PP is contained in the vP phase and hence is too low to be able to value the clausal polarity head by an agree relation. We assume that in the absence of a negatively valued checker, the polarity feature of the clause is typed as positive by default and the clause will hence not be compatible with the positive reversal tag.

(33) a. [CP [C pol: neg] [TP Mary has not[neg] [vP talked to anyone]]]
     b. [CP [C pol: neg] [TP Mary has never[neg] [vP talked to anyone]]]
     c. [CP [C pol:__] [TP Mary has [vP talked to no one[neg]]]]

A final remark is in order here. Though it does not lead to a positive tag, (31a)/(33c) is still felt to be a “negative” sentence because of the presence of the negative DP. For instance, like (32a) and (32b), (32c) will combine with a neither tag rather than with a so tag.12 Klima (1964) considers neither tags also to be a diagnostic for negativity (see also (16c)):

(34) a. Mary has not talked to anyone, and neither/*so has Jane.
     b. Mary has never talked to anyone, and neither/*so has Jane.
     c. Mary has talked to no one, and neither/*so has Jane.

As discussed already by McCawley (1998: 604–612), the reversal-tag diagnostic which we used previously and the neither/so tag give different results. It is not clear to us at this point how to capture this in terms of our discussion. De Clercq (2011b) proposes that in examples such as (34c) the negation encoded in no one within the complement of V takes scope by virtue of its quantificational properties, in the same way that, for instance, the universal quantifier encoded in everyone can scope over the clause in (35). The precise implementation of this proposal would lead us too far afield, and it also depends on the assumptions regarding the syntactic encoding of scope; see De Clercq (2011b) for one proposal. Crucial for us is that, syntactically, the postverbal vP-internal argument cannot establish a local checking relation with the polarity feature, which by hypothesis is in the C-domain: polarity checking is different from the operation that determines the scope of the quantifier in (35).

(35) Mary has talked to everyone.

We tentatively assume that the neither tag is sensitive to the scopal/quantificational properties of the negative quantifier in a way that the reversal tags are not.

9.4.3 Clause-Typing and Adjunct PPs

Let us now return to the distribution of negative adjunct PPs. We have seen that the preferred position for such PPs is medial rather than postverbal. A sentence with a medial negative adjunct PP is compatible with a positive reversal tag, as shown in (36a), entailing that the negative PP must be able to type the clause. Pursuing our analysis, we will assume that, like the marker of negation not and like the medial negative adverb never, the medial negative adjunct PP is in a sufficiently local relation to the C-domain to value the polarity feature. We conclude from this that such PPs must not be contained within the vP phase. If they were, then we would not expect them to pattern with medial not and never. Depending on one’s assumptions about functional structure, the negative PP might be vP adjoined, as in (36b), or it might be taken to be the specifier of a medial functional projection, as in (36c), which we label FP.13

(36) a. She had at no point talked to anyone, had she?

b. [CP [C pol:neg] [TP She had [vP at no[neg] time [vP talked to anyone]]]]

c. [CP [C pol:neg] [TP She had [FP at no[neg] time [vP talked to anyone]]]]

Postverbal negative adjunct PPs are marginal, but to the extent that they are available they were shown to be compatible with positive tags (see (16d)), suggesting that they too type the clause. The analysis of such examples depends on one’s general assumptions about the syntax of postverbal PPs (see Cinque 2004 and Belletti and Rizzi 2010 for an overview of some options). If right adjunction is admitted in the theory (cf. Ernst 2002a, b), at no time in (37a) might be right-adjoined to vP. Hierarchically speaking, though postverbal, the PP in (37b) is outside vP and remains within the local checking domain of the polarity head in C. Given that, in terms of hierarchical relations, the relation between C and the postverbal adjunct in (37b) is identical to that between C and the medial adjunct PP in (36b, c), this approach does not offer any insight into the perceived degradation of negative adjunct PPs in postverbal position.

(37) a. She had talked to them at no time, had she?

b. [CP [C pol:neg] [TP she had [vP [vP talked to them] at no[neg] time]]]

On an antisymmetric/cartographic view in which right adjunction is not available (Cinque 2004), one might propose that the negative PP occupies the specifier position of a functional projection, FP (as in (37b′)), and that its postverbal position is derived by leftward movement of the vP to a higher position. The movement could arguably be triggered by the need for the negative PP to receive focal stress (see Jayaseelan 2008, 2010).

(37) b′. [CP [C pol:neg] [TP she had [[vP talked to them] [FP at no[neg] time [vP talked to them]]]]]

Assuming that the projection hosting the PP and the projection hosting the fronted vP do not themselves constitute phases, the polarity head in C can continue to establish a local checking relation with the postverbal negative PP in (37b′). On a more speculative note, we add here that the representation in (37b′) may contribute to explaining the observation that the postverbal position of the negative PP in (37a) is degraded: The fronting of the vP to a position c-commanding the negative PP might be argued to create a weak intervention effect for the relation between C and the negative PP.
A correct prediction of our account is that a negative DP in the canonical subject position always types the clause as negative: (38a) is only compatible with a positive tag. This is so because the negative feature on no one is in a local relation with the polarity feature in C:

(38) a. No one talked to the police about any crime, did they?

b. [CP [C pol:neg] [TP No one[neg] [vP talked to the police about any crime]]]

The proposal developed here, elaborating on De Clercq’s work, also has further implications for the representation of clause structure and in particular for the demarcation of phases. Passive sentences with a postverbal negative by phrase take a negative question tag (39):

(39) The book was adapted by no one, wasn’t it?

In terms of our account this entails that, as is the case for postverbal arguments, the negative component no one cannot value the polarity feature in the C-domain. This implies that, unlike postverbal adjuncts, the by phrase must be contained within a phase. We do not pursue this issue here as it hinges, among other things, on the analysis of passives (see Collins 2005 for a relevant analysis).

9.5 Conclusion

This chapter first challenged the empirical claim often made in the generative literature that medial adjunct PPs are ungrammatical in English. On the basis of a corpus study we showed that (1) medial nonnegative adjunct PPs are attested both in American and in British English, though with low frequency, and (2) that medial negative adjunct PPs strongly outnumber postverbal negative adjunct PPs. We conclude that any empirical generalizations to the effect that medial adjunct PPs are always unacceptable are ill founded.
In the second part of the chapter we explored the syntax of sentential negation. The distribution of question tags reveals that among negative PPs, postverbal argument PPs pattern differently from postverbal adjunct PPs. We account for this argument–adjunct asymmetry in terms of a clause-typing account of sentential polarity, which crucially postulates a licensing relation between a polarity head in the C-domain and a constituent which encodes negation, and we pursue some of the consequences of this account.

Notes
* Karen De Clercq and Liliane Haegeman’s research is part of the FWO project 2009-Odysseus-Haegeman-G091409. We thank Rachel Nye, Geoff Pullum, and Barbara Ürögdi for help with English judgments. Part of this material was presented at the LAGB 2011 meeting in Manchester. We thank David Adger, Doug Arnold, Joan Maling, and Gary Thoms for their comments. We thank three anonymous reviewers for the Nordic Journal of Linguistics, and Rachel Nye and Neil Smith for comments on an earlier version of this chapter. Needless to say, all the usual disclaimers hold.
1 The use of the term negative quantifier to refer to no is a simplification. We do not wish to commit ourselves here to its exact nature. See Haegeman and Lohndal (2010) for discussion of the nature of such negative items.
2 An anonymous reviewer claims that (10b) is acceptable as an example of constituent negation. We disagree: if at no time is intended to encode constituent negation and hence lacks sentential scope, the example will be ungrammatical because the negative polarity item any in the complement of the verb is not licensed. Our informants judge (10b) as unacceptable.
3 There is some speaker variation in the acceptance rate of (16a) and with respect to (18) and (30), but overall our informants’ judgements follow the tendencies reported in Pullum and Huddleston (2002).
4 Thanks to Geoff Pullum for generous help with these data.
5 Neil Smith (p.c.) and Barbara Ürögdi (p.c.) point out that focal stress makes postverbal PPs more acceptable. For discussion of focal stress see also the discussion of text example (36) in Section 9.4.
6 On the use of question tags see also the discussion in Horn (1989: 184–189). Observe that there are two kinds of tags: (i) question tags or reversal tags (McCawley 1998) and (ii) reduplicative tags or same-way tags (Swan 2005). Question tags reverse the polarity of the matrix clause and usually check for information. Reduplicative tags reduplicate the polarity of the matrix clause and signal the speaker’s conclusion by inference, or his sarcastic suspicion (Quirk et al. 1985: 182). Reduplicative tags are only possible with affirmative sentences. Sentences with reduplicative tags can typically be preceded by oh or so (Quirk et al. 1985: 810–813). It is important to keep the tags apart. In the literature, confusing these tags has led to the wrong conclusions about which polarity certain quantifiers give rise to (De Clercq 2011b: footnote 2). In our chapter, we only consider question tags.
7 An anonymous reviewer points out that neither the positive nor the negative tag is in fact fully grammatical with the “negative” argument PP. This may well be true, but the fact is that our informants consistently prefer the negative tag over the positive one. Nevertheless, speaker variation should indeed be taken into account. Experimental research would be useful to get a clearer picture of speakers’ preferences for certain tags. Crucial for the present analysis is the fact (1) that there is a clear distinction between negative PP-adjuncts, which always give rise to positive question tags, and negative PP-complements, which preferentially lead to negative question tags, and (2) that negative question tags are for many speakers definitely an option with negative objects (not only PP-objects), unlike with negative subjects, as also reported in McCawley (1998: 507):
  (i) Fred talked to no one, didn’t he? (McCawley 1998: 507)
8 We leave open the possibility that TP also contains a polarity-related projection such as NegP or PolP. See Haegeman and Zanuttini (1991, 1996), Haegeman (1995), Smith and Cormack (1998), Christensen (2005, 2008), Moscati (2006, 2011), Tubau (2008) and Haegeman and Lohndal (2010) for discussion of the representation of sentential negation.
9 Thanks to an anonymous reviewer for bringing this point to our attention.
10 We label this projection FP, leaving it intentionally open what its specific nature is. One option is to identify FP with NegP, bearing in mind that NegP contributes to, but is not the sole expression of, sentential negation, which is encoded at the CP level (see note 11 in this chapter). One might also label the projection PolP and assume then that the negative PP will determine a negative value for the Pol head. One important question that remains to be clarified before the identity of FP can be established is whether there is a unique position in the English middlefield that hosts negative PPs and negatively quantified adverbs (never) or whether more than one such projection should be envisaged (see Zanuttini 1997 on Italian and Cinque 1999: chapter 4 for the hypothesis that each adverbial projection may be associated with a negative layer). Relevant for this issue is the fact that middlefield constituents that encode negation do not all pattern alike. For instance, though both not and never occur in the middlefield, the former requires do-insertion and the latter does not. Similar contrasts are observed for French, where pas (“not”) patterns differently from plus (“no more”), as shown in Belletti (1990). For negative constituents in Italian see especially Zanuttini (1997).

References
Belletti, A. 1990. Generalized Verb Movement. Turin: Rosenberg and Sellier.
Belletti, A. and Rizzi, L. 2010. Moving verbal chunks in the low functional field. In Functional Heads: The Cartography of Syntactic Structures 7, L. Brugé, A. Cardinaletti, G. Giusti, N. Munaro and C. Poletto (eds.), 129–137. Oxford: Oxford University Press.
BNC. 2010. The British National Corpus online service. Mark Davies. November–December 2010. http://corpus.byu.edu/bnc/ (accessed 15 January 2011).
Chomsky, N. 2001. Derivation by phase. In Ken Hale: A Life in Language, M. Kenstowicz (ed.), 1–52. Cambridge, MA: MIT Press.
Chomsky, N. 2008. On phases. In Foundational Issues in Linguistic Theory, R. Freidin, C. P. Otero and M.-L. Zubizarreta (eds.), 133–166. Cambridge, MA: MIT Press.
Christensen, K. R. 2005. Interfaces: Negation-Syntax-Brain. Doctoral dissertation, University of Aarhus, Aarhus.
Christensen, K. R. 2008. NEG-shift, licensing, and repair strategies. Studia Linguistica 62(2): 182–223.
Cinque, G. 1999. Adverbs and Functional Heads. Oxford: Oxford University Press.
Cinque, G. 2004. Issues in adverbial syntax. Lingua 114: 683–710.
COCA. 2010. The Corpus of Contemporary American English online service. Mark Davies. November–December 2010. http://corpus.byu.edu/coca/ (accessed 15 January 2011).
Collins, C. 2005. A smuggling approach to the passive in English. Syntax 8: 81–120.
De Clercq, K. 2010a. Neg-shift in English: Evidence from PP-adjuncts. In Proceedings of the 12th Seoul International Conference on Generative Grammar: 2010 Movement in Minimalism, D.-H. An and S.-Y. Kim (eds.), 231–251. Seoul: Hankuk Publishing Company.
De Clercq, K. 2010b. No in PPs: Evidence for Neg-shift in English. Handout for The Fifth Newcastle-Upon-Tyne Postgraduate Conference in Linguistics, Newcastle, 23rd March 2010.
De Clercq, K. 2011a. Negative PP-adjuncts and scope. Paper presented at ConSOLE XIX, Groningen University, 5–8 January 2011.
De Clercq, K. 2011b. Squat, zero and no/nothing: Syntactic negation vs. semantic negation. In Linguistics in the Netherlands 2011, R. Nouwen and M. Elenbaas (eds.), 14–24. Amsterdam: John Benjamins.
Emonds, J. E. 1976. A Transformational Approach to English Syntax: Root, Structure-Preserving, and Local Transformations. New York: Academic Press.
Ernst, T. 2002a. The Syntax of Adjuncts. Cambridge: Cambridge University Press.
Ernst, T. 2002b. Adjuncts and word order asymmetries. In Asymmetry in Grammar: Volume I: Syntax and Semantics, A. M. Di Sciullo (ed.), 178–207. Amsterdam: John Benjamins.
Frey, W. and Pittner, K. 1998. Zur Positionierung von Adverbialen im deutschen Mittelfeld. Linguistische Berichte 176: 489–534.
Haegeman, L. 1995. The Syntax of Negation. Cambridge: Cambridge University Press.
Haegeman, L. 2000. Negative preposing, negative inversion and the split CP. In Negation and Polarity, L. Horn and Y. Kato (eds.), 29–69. Oxford: Oxford University Press.
Haegeman, L. 2002. Sentence-medial NP-adjuncts in English. Nordic Journal of Linguistics 25(1): 79–108.
Haegeman, L. and Lohndal, T. 2010. Negative concord and multiple agree: A case study of West Flemish. Linguistic Inquiry 41(2): 181–211.
Haegeman, L. and Zanuttini, R. 1991. Negative heads and the NEG-criterion. The Linguistic Review 8: 233–251.
Haegeman, L. and Zanuttini, R. 1996. Negative concord in West Flemish. In Parameters and Functional Heads: Essays in Comparative Syntax, A. Belletti and L. Rizzi (eds.), 117–180. Oxford: Oxford University Press.
Haumann, D. 2007. Adverb Licensing and Clause Structure in English. Amsterdam: John Benjamins.
Horn, L. C. 1989. A Natural History of Negation. Chicago, IL: University of Chicago Press.
Jackendoff, R. 1977. X′ Syntax: A Study of Phrase Structure. Cambridge, MA: MIT Press.
Jayaseelan, K. A. 2008. Topic, focus and adverb positions in clause structure. Nanzan Linguistics 4: 43–68.
Jayaseelan, K. A. 2010. Stacking, stranding, and pied-piping: A proposal about word order. Syntax 13: 298–330.
Kato, Y. 2000. Interpretive asymmetries of negation. In Negation and Polarity, L. Horn and Y. Kato (eds.), 62–87. Oxford: Oxford University Press.
Klima, E. 1964. Negation in English. In The Structure of Language, J. Fodor and J. Katz (eds.), 246–323. Englewood Cliffs, NJ: Prentice Hall.
Laka, I. 1990. Negation in Syntax: On the Nature of Functional Categories and Projections. Doctoral dissertation, MIT, Cambridge, MA.
Lambotte, P. 1998. Aspects of Modern English Usage. Paris and Brussels: De Boeck Université.
McCawley, J. D. 1998. The Syntactic Phenomena of English. 2nd edition. 2 vols. Chicago, IL: University of Chicago Press.
McCloskey, J. 2011. Polarity and case-licensing: The cartography of the inflectional layer in Irish. Paper presented at GIST 3: Cartographic Structures and Beyond, Ghent University, May 14–15 2011.
Mittwoch, A., Huddleston, R. and Collins, P. 2002. The clause: Adjuncts. In The Cambridge Grammar of the English Language, R. Huddleston and G. Pullum (eds.), 663–784. Cambridge: Cambridge University Press.
Moscati, V. 2006. The Scope of Negation. Doctoral dissertation, Università di Siena.
Moscati, V. 2011. Negation Raising: Logical Form and Linguistic Variation. Cambridge: Cambridge Scholars Publishing.
Nakajima, H. 1991. Transportability, scope ambiguity of adverbials, and the generalized binding theory. Journal of Linguistics 27: 337–374.
Pittner, K. 1999. Adverbiale im Deutschen: Untersuchungen zu ihrer Stellung und Interpretation. Tübingen: Stauffenburg.
Pittner, K. 2004. Adverbial positions in the German middle field. In Adverbials: The Interplay Between Meaning, Context and Syntactic Structure, J. R. Austin, S. Engelberg and G. Rauch (eds.), 253–287. Amsterdam: John Benjamins.
Progovac, L. 1993. Negative polarity: Entailment and binding. Linguistics and Philosophy 20: 149–180.
Progovac, L. 1994. Negative and Positive Polarity. Cambridge: Cambridge University Press.
Pullum, G. and Huddleston, R. 2002. Negation. In The Cambridge Grammar of the English Language, R. Huddleston and G. Pullum (eds.), 785–849. Cambridge: Cambridge University Press.
Quirk, R., Greenbaum, S., Leech, G. and Svartvik, J. 1985. A Comprehensive Grammar of the English Language. London: Longman.
Radford, A. 2004. English Syntax: An Introduction. Cambridge: Cambridge University Press.
Rizzi, L. 1997. The fine structure of the left periphery. In Elements of Grammar, L. Haegeman (ed.), 289–330. Dordrecht: Kluwer.
Ross, J. R. 1973. Slifting. In The Formal Analysis of Natural Languages, M. Gross, M. Halle and M. Schützenberger (eds.), 133–169. The Hague: Mouton.
Rudanko, J. 1987. Towards a description of negatively conditioned subject operator inversion in English. English Studies: A Journal of English Language and Literature 68(4): 348–352.
Sinclair, J. (ed.). 1990. COBUILD English Grammar. London: Collins.
Smith, N. and Cormack, A. 1998. Negation, polarity and V positions in English. UCL Working Papers in Linguistics 10: 285–322.
Sobin, N. 2003. Negative inversion as nonmovement. Syntax 6: 183–222.
Swan, M. 2005. Practical English Usage. Oxford: Oxford University Press.
Tottie, G. 1983. Much about not and nothing: A Study of the Variation Between Analytic and Synthetic Negation in Contemporary American English (Scripta Minora Regiae Societatis Humaniorum Litterarum Lundensis). Lund: CWK Gleerup.
Tubau, S. 2008. Negative Concord in English and Romance: Syntax-Morphology Interface Conditions on the Expression of Negation. Utrecht: LOT.
Zanuttini, R. 1997. Negation and Clausal Structure: A Comparative Study of Romance Languages. Oxford: Oxford University Press.
Zeijlstra, H. 2004. Sentential Negation and Negative Concord. Utrecht: LOT.

10 Neo-Davidsonianism in Semantics and Syntax*

10.1 Introduction

Ever since Davidson (1967), an important ingredient of verbal meaning has been the event variable. Davidson’s argument is that in a sentence like (1a), the verb has an event variable in addition to its argument variables, which yields the logical form in (1b) and the paraphrase in (1c):

(1) a. Jones buttered the toast.
    b. ∃e[buttering(e, Jones, the toast)]
    c. There is an event of buttering of which Jones is the agent and the toast is the object.

Davidson argues that these event representations are well-suited to capture important entailment relations. Consider the examples in (2a) through (2e):

(2) a. Jones buttered the toast.
    b. Jones buttered the toast slowly.
    c. Jones buttered the toast slowly in the bathroom.
    d. Jones buttered the toast slowly in the bathroom with a knife.
    e. Jones buttered the toast slowly in the bathroom with a knife at midnight.

In these examples, (2e) entails (2a), (2b), (2c), and (2d); (2d) entails (2a), (2b), and (2c); (2c) entails (2a) and (2b); (2b) entails (2a). This follows straightforwardly if there is an event variable common to all the modifiers. The modifiers can then be linked by conjunction, in which case the entailments would follow as a natural consequence of conjunction reduction.

(3) ∃e[buttering(e, Jones, the toast) & Slow(e) & In(e, the bathroom) & With(e, a knife) & At(e, midnight)]
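The entailment pattern in (2) can be mimicked computationally. The following sketch is my own illustration: each logical form is modeled as a set of conjunct strings, and entailment by conjunction reduction becomes the subset relation:

```python
# Davidsonian modification as conjunct accumulation: each added modifier
# adds one event-predicate conjunct to the logical form.
core = {"buttering(e, Jones, the toast)"}        # (2a)
slow = core | {"Slow(e)"}                        # (2b)
place = slow | {"In(e, the bathroom)"}           # (2c)
instr = place | {"With(e, a knife)"}             # (2d)
time = instr | {"At(e, midnight)"}               # (2e)

def entails(p, q):
    # Conjunction reduction: p entails q iff every conjunct of q
    # is also a conjunct of p.
    return q <= p

print(entails(time, core))   # (2e) entails (2a): True
print(entails(place, slow))  # (2c) entails (2b): True
print(entails(slow, place))  # but not vice versa: False
```

The one-way entailments in the text fall out directly: dropping conjuncts never falsifies the remainder, while adding them can.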

This is the core idea of the Davidsonian approach to semantics, namely, the conjunction of event predicates.
Immediately after Davidson presented his proposal for conjoining modifiers and predicates, Castañeda (1967) argued that the thematic arguments could be separated, or severed, from the verb. That is, (1b) could rather be represented as in (4), where thematic relations are independent two-place predicates.

(4) ∃e[buttering(e) & Agent(e, Jones) & Theme(e, the toast)]

Logical forms with this structure are called neo-Davidsonian (Parsons 1990). Dowty (1989) calls (1b) the “ordered-argument” method and (4) the “neo-Davidsonian” method.1 Observe that scholars such as Parsons (1990) would be happy if all decomposition is assigned to the lexicon. That is, we could stipulate the meaning postulate in (5) and this would suffice.2

(5) ‘V(e, F, G)’ is true ↔ ∀x(Agent(e, x) ↔ Fx) ∧ V∗e ∧ ∀x(Theme(e, x) ↔ Gx) (Schein 1993: 9)

Thus, it is crucial to distinguish decomposition from separation, where the latter assumes that thematic arguments are never part of the verb, either in logical forms or in the lexicon. Parsons mostly assumed decomposition rather than separation.3 In this chapter, I focus on arguments that require separation and where decomposition will not be sufficient. This will especially become clear in section 10.2 when I discuss semantic arguments for separation, as especially Schein (1993) makes clear.4
It is worth noticing that what both Davidson and Parsons call “logical form” is not the same as the notion of Logical Form (LF), which is a syntactic level of representation (cf. May 1977, 1985). As Hornstein (2002: 345) points out, the “conception of LF is analogous (not identical) to earlier conceptions of logical form (or logical syntax) [. . .] found in the work of philosophers like Frege, Russell, Carnap, and Strawson”. Kratzer (1996: 110) cites Parsons (1993) (see Parsons 1995: 650) saying that the theory in Parsons (1990) is a “proposal for the logical forms of sentences, unsupplemented by an account of how those forms originate by combining sentence parts”. One can, for example, argue that there is ordered argument association in the syntax and in conceptual structure, or one can argue that there is ordered argument association in the syntax but separation in conceptual structure. Yet another option is to argue that there is separation both in the syntax and conceptual structure. These three options are illustrated in (6) in the order in which they were just described.

(6) a. stab: λx.λy.λe.stab(e, y, x)
    b. stab: λx.λy.λe.stab(e) & Agent(e, y) & Theme(e, x)
    c. stab: λe.stab(e)

In the literature one finds the label neo-Davidsonianism applied to both (6b) and (6c). Parsons (1990) and Ramchand (2008) are representatives of (6b), whereas Schein (1993), Borer (2005a,b), Bowers (2010), and Lohndal (2014) are representatives of (6c). Kratzer (1996) and Pylkkänen (2008) argue for the in-between alternative where the Agent is separated but not the Theme, as discussed in section 10.2.5
The goal of this chapter is to discuss neo-Davidsonianism in semantics and syntax. Section 10.2 looks at neo-Davidsonianism in semantics by focusing on the evidence for conjoining thematic predicates. Particular attention is devoted to the arguments in Schein (1993) and Kratzer (1996), where it is argued that the Agent is not lexically represented on the verb. Section 10.3 considers examples of neo-Davidsonian approaches to the syntax–semantics interface. Section 10.4 concludes the chapter.
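The difference between the three options in (6) can be made concrete with a toy implementation. This is an informal sketch over dict-based events of my own devising, not a formal semantics; the point is only where the thematic predicates live:

```python
# A toy event record: the verbal predicate plus thematic relations.
# (6a) ordered-argument: one packaged three-place relation.
def stab_ordered(e, y, x):
    return (e["pred"], e["agent"], e["theme"]) == ("stab", y, x)

# Independent predicates, as presupposed by the severed representations.
def stab(e):             # (6c): the verb is a bare predicate of events;
    return e["pred"] == "stab"   # Agent/Theme are added by the structure

def agent(e, y):
    return e["agent"] == y

def theme(e, x):
    return e["theme"] == x

# (6b): the verb's entry itself conjoins the severed thematic predicates.
def stab_neo(e, y, x):
    return stab(e) and agent(e, y) and theme(e, x)

ev = {"pred": "stab", "agent": "Brutus", "theme": "Caesar"}
print(stab_ordered(ev, "Brutus", "Caesar"))   # True
print(stab_neo(ev, "Brutus", "Caesar"))       # True: same truth conditions
print(stab(ev))                               # True
```

On this rendering (6a) and (6b) are truth-conditionally equivalent; what differs is whether Agent and Theme are independent pieces, which is exactly what the separation arguments discussed in section 10.2 turn on.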

10.2 Neo-Davidsonianism in Semantics

Davidson’s original motivation was semantic in nature: He wanted to capture entailment relations. This is clearly conveyed in the following quote:

I would like to give an account of the logical or grammatical role of the parts or words of such sentences [simple sentences about actions] that is consistent with the entailment relations between such sentences and with what is known of the role of those same parts or words in other (non-action) sentences. I take this enterprise to be the same as showing how the meanings of action sentences depend on their structure. (Davidson 1967: 81)

A lot of work since has also focused on the semantic aspects, viz. the influential Higginbotham (1985) and much other work. In this section, I focus on some of the most influential and convincing semantic arguments for adopting the neo-Davidsonian approach. I mainly focus on arguments for severing the agent from the verb’s lexical representation but also, toward the end, present a couple of arguments concerning Themes.6

10.2.1 Severing the Agent From the Verb

In this section, I consider arguments in favor of severing the Agent from the verb’s grammatical representation. I first discuss Kratzer’s (1996) argument before I turn to Schein’s (1993) argument.7

10.2.1.1 Kratzer (1996)

Kratzer (1996) starts out by rephrasing the argument by Marantz (1984), which says that external arguments are not arguments of verbs. Marantz observes that there are many cases where the interpretation of the verb depends on the internal argument. Marantz (1984: 25) gives the following examples from English:

(7) a. throw a baseball
    b. throw support behind a candidate
    c. throw a boxing match (i.e., take a dive)
    d. throw a fit

(8) a. take a book from the shelf
    b. take a bus to New York
    c. take a nap
    d. take an aspirin for a cold
    e. take a letter in shorthand

(9) a. kill a cockroach
    b. kill a conversation
    c. kill an evening watching T.V.
    d. kill a bottle (i.e., empty it)
    e. kill an audience (i.e., wow them)

One could of course argue that these verbs are homophonous, but that seems like a cop-out, and it also seems to miss a generalization that one can make, namely, that the verb and its internal argument together determine the relevant interpretation (cf. Marantz 1984: 25). Furthermore, Marantz (1984: 26) notes that “. . . the choice of subject for the verbs does not determine the semantic role of their objects”. This is supported by the data in (10) and (11), where the subjects are different but the object could be the same.

(10) a. The policeman threw NP.
     b. The boxer threw NP.
     c. The social director threw NP.
     d. Throw NP!

(11) a. Everyone is always killing NP.
     b. The drunk always refused to kill NP.
     c. Silence can certainly kill NP.
     d. Cars kill NP.

These facts would all follow if external arguments are not true arguments of their verbs, Marantz argues. That is, by excluding the subject from the unit consisting of the verb and the object, we can capture this asymmetry between subjects and objects.8
Since Kratzer’s paper, there has been a lot of work on the syntax of external arguments; see, for example, Hale and Keyser (1993, 2002), Harley (1995), Kratzer (1996), Marantz (1997), Borer (2005a,b), Alexiadou, Anagnostopoulou and Schäfer (2006, 2015), Folli and Harley (2007), Jeong (2007), Pylkkänen (2008), Ramchand (2008), Schäfer (2008, 2012), and Merchant (2013). There is not necessarily a consensus as to the nature of the projection that introduces the external argument (either Spec,vP or Spec,VoiceP), but a lot of the literature is in agreement that a separate projection introduces the external argument. Thus, we typically get the following structure.

(12) [VoiceP/vP external argument [Voice′/v′ Voice/v [VP [V′ V internal argument]]]]

In this structure, the internal argument is illustrated in the complement position of the verb. An additional Applicative projection is typically added for the indirect object; compare McGinnis (2001), Jeong (2007), and Pylkkänen (2008). However, Kratzer’s argument only goes through if the specification of the verb’s meaning only refers to the internal argument and, furthermore, if idiomatic dependencies like these can be captured by defining the meaning of the verb. Kratzer discusses the first premise but not the second. She seems to assume that idiomatic dependencies must be specified over objects in the lexicon, that is, over the verb and its Theme. Marantz (1997) has a different view (see also Harley 2009), namely, that idiomatic dependencies can be defined over outputs of syntax, in which case Kratzer’s argument would not go through. This does not entail that the Agent should not be severed but that we need to investigate the relationship between the verb and the Theme more closely. I do not discuss these issues here; see Marantz (1997) and Lohndal (2014) for discussion.

10.2.1.2 Schein (1993)

Schein (1993) puts forward arguments showing that we need the neo-Davidsonian representation in the semantics, a representation that he refers to as “full thematic separation”. Schein makes the strong claim that the Agent relation, the Theme relation and the verb relation are independent of each other. Schein’s project is to argue that lexical decomposition, as seen earlier, is not sufficient and that separation is required. The way Schein implements this idea is to put a Theme in between the Agent and the verb, as illustrated in (13).

(13) Agent Theme V

If the Agent is not lexically represented on the verb but, rather, introduced by structure separate from the verb, the Agent can be the agent of an event that is not that of the verb. Schein introduces such a case involving a distributive quantifier as the Theme, as in (15). Such a Theme may induce a mereological partition relation between the event of the Agent and the event of the verb. Importantly, though, in this case no substantive verbal meaning is added. There is not a substantial semantic relation to the event of the verb, as, for example, a causative would contribute, but simply the mereological relation. In order to make this clearer, let us see how a mereology of events is motivated. Consider the data in (14), from Schein (1993: 7):9

(14) a. Unharmoniously, every organ student sustained a note on the Wurlitzer for sixteen measures.
     b. In slow progression, every organ student struck a note on the Wurlitzer.

Schein argues that the reading for (14a) is one where each student is related to a note on the Wurlitzer; that is, for each to have an event of his own, the quantifier must include a quantifier of events within its scope. Note that it is not the individual note that is unharmonious but the ensemble. Each of the students only plays a part in the larger action. There is no other way to get this reading, and the sentence would be false if, for example, one of the students keeps it going for eight measures and then another student does the other eight, as Schein observes. The same argument can be made for (14b). The solitary events performed by the students can only be related to the larger one as parts of the whole. Summarizing, the mereological relation is encoded through a quantifier which includes the condition that e′ is part of e (e′ ≤ e). Let us return to the need for lexical decomposition. Schein’s discussion centers around cases like (15) through (18). In what follows I concentrate on (15).

(15) Three video games taught every quarterback two new plays.
     Intended reading: "Between the three of them, the video games are responsible for the fact that each quarterback learned two new plays."
(16) Three agents sold (the) two buildings (each) to exactly two investors.
(17) Three letters of recommendation from influential figures earned the two new graduates (each) two offers.
(18) Three automatic tellers gave (the) two new members (each) exactly two passwords.

One may wonder why Schein adds the third NP two new plays in (15). The reason is that this eliminates the possibility that the universal every quarterback denotes a group, like the quarterbacks. If we were dealing with a group denotation, one could possibly analyze (15) as akin to The games taught the quarterbacks. That is, the group of games taught the group of quarterbacks. If that is the case, the particular reading that Schein has identified does not obtain. Therefore, in the example at hand, the universal has to denote a genuine quantifier since it has an indefinite that depends on it. That is, two new plays depends on every quarterback: for every quarterback there are two new plays that he learned. The claim is that the mereological, or part–whole, relation among events (e′ ≤ e) connects quantification over quarterbacks and their solitary events to the larger event where three video games are the teachers (Schein 1993: 8). So every quarterback and three video games are cumulatively related, but every quarterback also seems to behave like an ordinary distributive quantifier phrase in its relation to two new plays, as Kratzer (2000) makes clear. Note that in the logical form in (19), the Agent and the Theme are independent of each other and also of the verb. Schein (1993: 8, 57) suggests a corresponding logical form for (15), namely, (19), where INFL means the relation between the event and its agents:10

(19) ∃e(teach(e) ∧ [∃X : 3(X) ∧ ∀x(Xx → Gx)]∀x(INFL(e, x) ↔ Xx) ∧ [every y : Qy][∃e′ : e′ ≤ e](∀z(TO(e′, z) ↔ z = y) ∧ [∃W : 2(W) ∧ ∀w(Ww → Pw)]∀w(OF(e′, w) ↔ Ww)))11

We can spell this out in English as in (20). The lower case e and the use of singularity are just for simplicity. In real life these are second-order quantifiers.12

(20) There is an event e, and e is a teaching, and a three-membered plurality X comprising only video games, such that for every x, x is an agent of e just if it is among those three in X, and for every quarterback y, there is a part e′ of e, such that the target of that part of e is y, and there is a two-membered plurality W, comprising only plays, such that the content of the teaching e′ was all and only the plays of W.

We see that the part–whole relation among events (e′ ≤ e) connects quantification over quarterbacks and their solitary events to the larger event where three video games are the teachers (Schein 1993: 8). Notice that in the logical form above, the Agent and the Theme are scopally independent of each other and also of the verb. Here is what Schein says about the interpretation of (19).

It is [. . .] essential to the meaning of [(15)] that the θ-role bound into by the subject not occur within the scope of other quantifiers, as in [(19)], and that the action of the three video games be related mereologically to what happened to the individual quarterbacks (Schein 1993: 57).

Schein devotes a lot of time to showing that if teach is a polyadic predicate, we do not get the correct logical forms. That is, in (21), either the universal will be inside the scope of the plural, or the reverse, and all thematic relations will be within the scope of the quantifiers.13

(21) [∃X : 3(X) ∧ ∀x(Xx → Gx)][every y : Qy][∃Z : 2(Z) ∧ ∀z(Zz → Pz)]∃e teach(X, y, Z, e) (Schein 1993: 57)

As Schein points out, the problem for such polyadic logical forms is to find a meaning that relates individual objects to plural objects. From the point of view of entries such as (21), the difference between (15) and (22a) is only a matter of scope. The logical form is given in (22b).

(22) a. Every quarterback was taught two new plays by three video games.
     b. [every y : Qy][∃Z : 2(Z) ∧ ∀z(Zz → Pz)][∃X : 3(X) ∧ ∀x(Xx → Gx)]∃e teach(X, y, Z, e) (Schein 1993: 58)

But the meaning of (15) is crucially different in ways that scope does not reflect. In (22a), all the NPs related to plural objects occur in the scope of the quantifier over individual objects. This is different in (15) since one of these NPs has escaped, as Schein puts it. I do not go through all the other illustrations Schein provides of why polyadic predicates fail to give the correct meanings.

Kratzer (2000) shows that it is technically possible to get around Schein's (1993) argument for severing the Agent. Here I outline her argument and emphasize, as she does, what one has to buy in order to escape Schein's arguments. Kratzer uses the sentence in (23a), and the goal is to derive the logical representation in (23b).14 This logical form is simplified compared to the logical form Schein has, but the simplification does not matter for present purposes.

(23) a. Three copy editors caught every mistake (in the manuscript).
     b. ∃e∃x[3 copy editors(x) ∧ agent(x)(e) ∧ ∀y[mistake(y) → ∃e′[e′ ≤ e ∧ catch(y)(e′)]]]

Kratzer makes the following assumptions:

(24) a. Denotations are assigned to bracketed strings of lexical items in a type-driven fashion (Klein and Sag 1985)
     b. For any string α, T(α) is the denotation of α
     c. Types: e (individuals), s (events or states; eventualities as in Bach (1981)), and t (truth-values)
     d. Composition Principles: Functional Application and Existential Closure (for this example)

With these assumptions in hand, she provides the following derivation:

(25) a. T(every mistake) = λR(e(st))λe∀y[mistake(y) → ∃e′[e′ ≤ e ∧ R(y)(e′)]]

     b. T(catch) = λQ((e(st))(st))λxλe[agent(x)(e) ∧ Q(catch(e(st)))(e)]
     c. T(catch(every mistake)) = λxλe[agent(x)(e) ∧ T(every mistake)(catch)(e)]
        = λxλe[agent(x)(e) ∧ ∀y[mistake(y) → ∃e′[e′ ≤ e ∧ catch(y)(e′)]]]
        From (a), (b), by Functional Application.

     d. T(3 copy editors) = λR(e(st))λe∃x[3 copy editors(x) ∧ R(x)(e)]
     e. T(3 copy editors(catch(every mistake))) = T(3 copy editors)(λxλe[agent(x)(e) ∧ ∀y[mistake(y) → ∃e′[e′ ≤ e ∧ catch(y)(e′)]]])
        = λe∃x[3 copy editors(x) ∧ agent(x)(e) ∧ ∀y[mistake(y) → ∃e′[e′ ≤ e ∧ catch(y)(e′)]]]
        From (c), (d), by Functional Application.
     f. ∃e∃x[3 copy editors(x) ∧ agent(x)(e) ∧ ∀y[mistake(y) → ∃e′[e′ ≤ e ∧ catch(y)(e′)]]]
        From (e), by Existential Closure.
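As a sanity check on the derivation in (25), here is a small executable sketch over a toy model. The editors, mistakes, and atomic catching events are all invented for illustration, and the existential over "3 copy editors" is simplified to a direct application to the plural individual:

```python
from itertools import combinations

# Toy model (all names invented): three atomic catching events c1-c3,
# where ci is a catching of mistake mi; the big event is their sum.
MISTAKES = {"m1", "m2", "m3"}
EDITORS = frozenset({"ed1", "ed2", "ed3"})   # the plural agent
ATOMS = {"c1": "m1", "c2": "m2", "c3": "m3"}
BIG_EVENT = frozenset(ATOMS)

def events():
    """All events in the model: non-empty sets of atomic events."""
    return [frozenset(s) for r in range(1, len(ATOMS) + 1)
            for s in combinations(ATOMS, r)]

def catch(y):
    """catch(y)(e): e is an atomic catching of y."""
    return lambda e: len(e) == 1 and ATOMS[next(iter(e))] == y

def agent(x):
    """agent(x)(e): x is the (plural) agent of e."""
    return lambda e: e == BIG_EVENT and x == EDITORS

# (25a): T(every mistake) = lam R lam e. for every mistake y there is
#        a part e' of e such that R(y)(e')
def every_mistake(R):
    return lambda e: all(any(ep <= e and R(y)(ep) for ep in events())
                         for y in MISTAKES)

# (25b): T(catch) = lam Q lam x lam e. agent(x)(e) & Q(catch)(e)
def catch_trans(Q):
    return lambda x: lambda e: agent(x)(e) and Q(catch)(e)

# (25c)/(25e): Functional Application; (25f): Existential Closure.
vp = catch_trans(every_mistake)
result = any(vp(EDITORS)(e) for e in events())
print(result)  # True: the toy scenario verifies the reading in (23b)
```

The subevent relation e′ ≤ e is modeled as subset, so the universal over mistakes scopes over the event-part quantifier, exactly as in (25a).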

This derivation gets us the intended reading, without severing the Agent. Step (b) shows that all the arguments of catch are part of the lexical entry. Kratzer argues that there is a price to pay if we do this: (1) a complicated semantic type for the direct object position of catch is needed, and (2) it is necessary to posit different argument structures for catch and "catch"; that is, the object language word and the metalanguage word would have different denotations. Many semanticists, including Kratzer, argue that this is not a price we should be willing to pay, and she goes on to show that severing the Agent makes it possible to do without these two assumptions. Furthermore, a derivation of the sort that we have just seen does not preserve the intuition (as expressed by, for example, Levin and Rappaport Hovav 1995) that there is an "underlying" matching of semantic structure to argument structure.

In the semantics literature, there is no agreement on whether or not to sever the Agent from the verb. In the next subsection, I discuss whether Themes should be severed or not.

10.2.2 Severing the Theme from the Verb

In order for the semantics to be fully neo-Davidsonian in the domain of thematic arguments, Themes (or Patients) have to be severed from the lexical representation of the verb.15 Here I consider a couple of arguments in favor of severing the Theme (both are discussed in Lohndal 2014).

The first argument concerns the semantic interpretation of reciprocals (Schein 2003). Consider the sentence in (26):

(26) The cockroaches suffocated each other.

The sentence in (26) could be true "even where only the entire group sits at the cusp of catastrophe" (Schein 2003: 349). Put differently, had there been only one less cockroach, all cockroaches would have survived. Schein (2003: 350) observes that none of the following paraphrases accurately captures this reading.

(27) a. The cockroaches each suffocated the others.
     b. The cockroaches each suffocated some of the others.
     c. The cockroaches suffocated, each suffocating the others.
     d. The cockroaches suffocated, each suffocating some of the others.

The problem is that all the paraphrases assign each a scope that includes the verb. The main point here is that each cockroach is in a thematic relation to some event E that contributed to the mass suffocation. But E is not itself a suffocation of one cockroach by another. Schein concludes that the scope of each includes the thematic relation, but not the event predicate suffocate. He gives the logical form in (28a), which has the paraphrase in (28b) (Schein 2003: 350).

(28) a. ∃e([the X : cockroaches[X]](Agent[e, X] & suffocate[e] & Theme[e, X]) &
        [ιX : Agent[e, X]][Each x : Xx][ιe′ : Overlaps[e′, e] & Agent[e′, x]]
        [∃e″ : t(e″) ≤ t(e′)][ιY : Others[x, Y] & Agent[e″, Y]] Theme[e′, Y])
     b. 'The cockroaches suffocate themselves, (with) them each acting against the others that acted.'

Had there been only one less cockroach, they would all have made it. So each does something to some of the others that contributes to their mass suffocation, but that contribution is not a suffocation, as all the paraphrases in (27a–d) would suggest.

Some readers may object that there are many independent issues that need to be dealt with concerning reciprocity before the above argument can be accepted. Here I do not discuss reciprocity in detail but refer the reader to Dotlačil (2010) and LaTerza (2014) for further arguments that reciprocity requires a neo-Davidsonian semantics where no arguments are part of the verb's denotation. In particular, LaTerza develops a neo-Davidsonian view of distributivity first discussed by Taylor (1985) and Schein (1993) and uses this to account for why reciprocal sentences can be true in a constrained variety of different types of situations, and for reciprocals' ability to appear in a wide range of argument positions.

The second argument concerns the argument/adjunct distinction (Lohndal 2014). If the Theme is part of the lexical representation of the verb, that means that the obligatoriness of a Theme indicates "V(e, x)" rather than "V(e) & Theme(e, x)". Put differently, the Theme is obligatory. Consider the following data:

(29) a. *Barry stepped.
     b. *Barry stepped the path into the garden.
     c. Barry stepped into the garden.

These examples show that the verb step requires an obligatory PP. However, if that is indicative of the adicity of this verb, into the garden does not have a consistent Davidsonian semantics despite being a poster child for such a semantics, since it would have to be part of the verb's denotation. That is, according to Davidsonian and neo-Davidsonian approaches, PPs are always adjuncts. If we want to maintain the (neo-)Davidsonian semantics for into the garden, the preceding examples do not indicate that the Theme predicate is obligatory. Something else needs to account for this apparent obligatoriness of the PP associated with the verb step.

There are also cases of disjunctive obligatoriness. This is illustrated in the following examples:

(30) a. *Mary passed.
     b. *Mary crossed.
(31) a. Mary passed the garden.
     b. Mary crossed the garden.
(32) a. Mary passed into the garden.
     b. Mary crossed into the garden.

The argument just made applies to these sentences as well. The verbs pass and cross can either take a nominal complement or a PP adjunct. Neo-Davidsonians cannot conclude anything about obligatoriness based on such data since PPs are supposed to be optional and DPs obligatory. Therefore, the badness of (30) has to be due to something else. See Lohndal (2014) for a proposal where the badness of such data is associated with conceptual structure.

10.3 Neo-Davidsonianism at the Syntax–Semantics Interface

In the previous section, I presented arguments in favor of neo-Davidsonianism that are primarily semantic in nature. Independently of work on the semantics of argument structure, some work in syntax started to argue for the claim that arguments occupy separate functional projections. This move was taken part way in Chomsky (1993), where it was argued that all arguments move into a functional projection (see also Koopman and Sportiche 1991 on subjects). Instead of the traditional syntax in (33a), it was argued that the correct structural representation looks like (33b). EA and IA denote the external and internal argument, respectively.

(33) a. [CP [C′ C [TP EA [T′ T [VP tEA [V′ V IA]]]]]]
     b. [CP [C′ C [AgrSP EA [AgrS′ AgrS [TP tEA [T′ T [AgrOP IA [AgrO′ AgrO [VP tEA [V′ V tIA]]]]]]]]]]

In (33a), the external argument originates internally to the VP and moves to the canonical subject position, SpecTP (cf. McCloskey 1997). This movement has been generalized in (33b), where both the subject and the object move into dedicated abstract agreement positions. Later, (33b) was replaced by a little v projection introducing the external argument (Chomsky 1995). There was no dedicated projection for the direct object; it was usually analysed as V's sister.

The extension in Chomsky (1993) is only partial since theta-role relations are determined within the VP. That is, at the point where argument structure is determined, there is no neo-Davidsonian structure (all arguments are within the VP). A full-blown neo-Davidsonian syntax was first proposed in Borer (1994) and since argued for in great detail in Borer (2005a,b; see also Lin 2001). Ramchand (2008: 42) uses the term post-Davidsonian to "describe a syntacticized neo-Davidsonian view whereby verbal heads in the decomposition are eventuality descriptions with a single open position for a predicational subject". Although I see the merit of using a separate term for proposals where the logical forms are accompanied by a specific hierarchical syntax, I continue to use the term neo-Davidsonian in this chapter.

In this section, I look at a family of neo-Davidsonian approaches to the syntax–semantics interface. I start by looking at Borer, then Ramchand (2008), before I consider Pylkkänen (2008) and Bowers (2010). Last, I consider the proposal in Lohndal (2014). Common to all these approaches is that they rely on a certain syntactic hierarchy. They do not say much about what determines the order of this hierarchy. Presumably the order is universal (cf. Cinque 1999), raising several issues that I am not able to discuss here.

10.3.1 The Exoskeletal View

Borer (2005a,b) develops a constructional approach to the syntax–semantics interface.16 For her, there is no projection of argument properties from lexical items. Rather, lexical items are inserted into what she calls syntactic templates. These templates are independent of specific requirements on lexical items. Thus, there is no specification of argument structure properties in lexical items. Borer makes a redundancy argument, namely, that there is no reason for a property to be both lexically specified and syntactically represented, as is the case in approaches that rely on theta roles and the Theta Criterion (Chomsky 1981). Borer argues that lexical flexibility is so pervasive that argument structure should not be lexically specified.17 She discusses an illuminating case from Clark and Clark (1979), which involves the verb to siren:

(34) a. The factory horns sirened throughout the raid.
     b. The factory horns sirened midday and everyone broke for lunch.
     c. The police car sirened the Porsche to a stop.
     d. The police car sirened up to the accident site.
     e. The police car sirened the daylight out of me.

Even if native speakers of English have never heard siren used as a verb, they can easily interpret these sentences. The examples show that the new verb can appear with several subcategorization frames where the core meaning seems to be maintained (to produce a siren sound), though the specific meanings are augmented according to the syntactic environment. This strongly suggests that the meaning of siren cannot just come from the verb itself but that it depends on the syntactic construction. In this sense, Borer follows many other scholars and approaches in arguing that semantically distinct expressions cannot correspond to identical syntactic structures. She argues that there is a "making sense" component which relies on the encyclopedic meaning of lexical items and the structure in which they occur.

The general structure of the argument domain of a clause looks as follows (Borer 2005a: 30).

(35) [F-1max argument-1 [F-1 F-1min [F-2max argument-2 [F-2 F-2min L-D]]]]

The bottom part is the lexical domain (L-D), which emerges from the merger of some listeme from the conceptual array (Borer 2005a: 27). A listeme "is a unit of the conceptual system, however organized and conceived, and its meaning, part of an intricate web of layers, never directly interfaces with the computational system" (Borer 2005a: 11). Listemes are what Distributed Morphology calls roots (Borer 2005a: 20). Put differently, listemes do not have information that is accessible to the syntactic derivation. Listemes have great flexibility whereas functional vocabulary does not have the same flexibility. This gives the following dichotomy (Borer 2005a: 21):

(36) a. All aspects of the computation emerge from properties of structure, rather than properties of (substantive) listemes.
     b. The burden of the computation is shouldered by the properties of functional items, where by functional items here we refer both to functional vocabulary, including, in effect, all grammatical formatives and affixation, as well as to functional structure.

Note that the traditional distinction between “external” and “internal” arguments (Williams 1981) makes little sense in a system where arguments are severed from the verb and merged in dedicated functional projections. For that reason, among others, Borer uses different labels for subjects and different types of objects. An example of Borer’s system can be given based on the following examples:

(37) a. Kim stuffed the pillow with the feathers (in two hours).
     b. Kim stuffed the feathers into the pillow (in two hours).

(37a) means that the pillow was entirely stuffed, but there may still be feathers left. (37b) has the other interpretation, namely, that all the feathers are in the pillow, but the pillow might not be entirely stuffed. Borer (2005b) assigns two different syntactic structures to these sentences. They are provided in (38a) and (38b).18

(38) a. [EP Kim E [TP tKim T [AspQmax [Spec,# the pillow] [L-D stuffed [PP with the feathers]]]]]
     b. [EP Kim E [TP tKim T [AspQmax [Spec,# the feathers] [L-D stuffed [PP into the pillow]]]]]

The location in (38a) is the subject-of-quantity and sits in the specifier of Asp. The agent is the subject of an event phrase EP which also hosts the event variable. The PP, which is the understood subject matter, is merged with the L-head. In (38b), the subject matter is the subject-of-quantity, and structured change is measured with respect to the subject matter (Borer 2005b: 93). As the structures show, the specifier of the Asp phrase is what is measured out; compare with Tenny (1987, 1994). Borer (2005b: 94) provides the following neo-Davidsonian logical forms for the sentences in (37):

(39) a. ∃e[quantity(e) & originator(Kim, e) & subject-of-quantity(the pillow, e) & WITH(the feathers, e) & stuff(e)]
     b. ∃e[quantity(e) & originator(Kim, e) & subject-of-quantity(the feathers, e) & INTO(the pillow, e) & stuff(e)]

In this way, the meaning of stuff remains the same even though the syntactic structures are different.

The mapping between syntax and semantics in Borer's theory is not very explicit. That is, it is unclear how the system moves from the syntactic structure to the semantic interpretation of that structure. It is clear that various annotations in the syntactic structure have an impact on the meaning, but beyond that, Borer does not say much about the interface itself.

10.3.2 A First-Phase Syntax

Ramchand (2008) argues that syntax is crucial in determining many aspects of argument structure. She adopts a constructionist approach, in which structure is more important than lexical aspects when it comes to determining meaning, but she argues that verbs (actually roots) contain some information about syntactic selection. For approaches that assume that the lexicon contains roots, Ramchand (p. 11) presents the following two views:

The naked roots view
The root contains no syntactically relevant information, not even category features (cf. Marantz 1997, 2005, Borer 2005a,b).

The well-dressed roots view
The root may contain some syntactic information, ranging from category information to syntactic selectional information and degrees of argument-structure information, depending on the particular theory. This information is mapped in a systematic way onto the syntactic representation that directly encodes it.19 (Ramchand 2008: 11)

Ramchand opts for a theory that is closer to the well-dressed roots view, since she wants to "encode some notion of selectional information that constrains the way lexical items can be associated with syntactic structure" (Ramchand 2008: 3). The main reason for this is to account for the lack of flexibility in cases like (40):

(40) a. *John slept the baby.
     b. *John watched Mary bored/to boredom.

However, the main part of Ramchand's proposal is that the syntactic projection of arguments is based on event structure (cf. Borer 2005a,b, Ritter and Rosen 1998, Travis 2000) and that the syntactic structure has a specific semantic interpretation. She proposes the syntactic structure in (41).

(41) [initP DP3 [init′ init [procP DP2 [proc′ proc [resP DP1 [res′ res XP]]]]]]
     (initP = causing projection; procP = process projection; resP = result projection)

These projections have the following definitions:

(42) a. initP introduces the causation event and licenses the external argument ("subject" of cause = INITIATOR)
     b. procP specifies the nature of the change or process and licenses the entity undergoing change or process ("subject" of process = UNDERGOER)
     c. resP gives the "telos" or "result state" of the event and licenses the entity that comes to hold the result state ("subject" of result = RESULTEE)

For Ramchand, many arguments are specifiers of dedicated functional projections. These projections specify the subevental decompositions of events that are dynamic. There is one exception, though, namely, that rhemes are complements instead of specifiers. That is, they have the following syntactic structure (Ramchand 2008: 46).

(43) [initP init [procP proc DP(RHEME)]]

Rhemes, or “Rhematic Objects”, are objects of stative verbs, and they are not subjects of any subevents, hence not specifiers. Examples of Rhemes are provided in (44) (Ramchand 2008: 33–34):

(44) a. Kathrine fears nightmares.
     b. Alex weighs thirty pounds.
     c. Ariel is naughty.
     d. Ariel looks happy.
     e. The cat is on the mat.

Thus, arguments can be complements or specifiers, depending on their role in event structure. In terms of interpretation, Ramchand assumes one primitive rule of event composition:

(45) Event Composition Rule
     e = e1 → e2: e consists of two subevents, e1, e2, such that e1 causally implicates e2 (cf. Hale and Keyser 1993)

Two general primitive predicates over events correspond to the basic subevent types in the following way:

(46) a. State(e): e is a state
     b. Process(e): e is an eventuality that contains internal change

The syntactic structure will determine the specific interpretation. In the init position, the state introduced by the init head is interpreted as causally implicating the process. On the other hand, in the res position, the state introduced by that head is interpreted as being causally implicated by the process (Ramchand 2008: 44). Ramchand defines two derived predicates over events based on the event composition rules:

(47) If ∃e1, e2[State(e1) & Process(e2) & e1 → e2], then by definition Initiation(e1)

(48) If ∃e1, e2[State(e1) & Process(e2) & e2 → e1], then by definition Result(e1)

The specifiers in each predication relation are interpreted according to the primitive roles:

(49) a. Subject(x, e) and Initiation(e) entails that x is the INITIATOR of e.
     b. Subject(x, e) and Process(e) entails that x is the UNDERGOER of e.
     c. Subject(x, e) and Result(e) entails that x is the RESULTEE of e.

The three important heads in the structure have the following denotations (taken from Ramchand 2011: 458):

(50) [[res]] = λPλxλe[P(e) & State(e) & Subject(x, e)]

(51) [[proc]] = λPλxλe∃e1, e2[P(e2) & Process(e1) & e = (e1 → e2) & Subject(x, e1)]

(52) [[init]] = λPλxλe∃e1, e2[P(e2) & State(e1) & e = (e1 → e2) & Subject(x, e1)]

Importantly, these skeletal interpretations have to be filled by encyclopedic content, but they already contain important aspects of meaning simply by virtue of their structure.

Ramchand (2011) asks whether it is possible to make her proposal more austere in the sense of only making use of conjunction (cf. Pietroski 2005, 2011). One consequence of this is that the event composition rule would have to be replaced by specific relations such as RESULT and CAUSE. The following tree structure and semantics illustrate what this would look like (Ramchand 2011: 460).

(53) a. John split the coconut open.
     b. [initP John [init′ init(split) [procP the coconut [proc′ proc(split) [resP the coconut [res′ res(split) [AP open]]]]]]]
     c. [[resP]] = λe∃e1[Result-Part(e, e1) & open(e1) & split(e1) & State(e1) & Subject(e1, 'the coconut')]
        [[procP]] = λe∃e2[Proc-Part(e, e2) & splitting(e2) & Dyn(e2) & Subject(e2, 'the coconut')]
        [[initP]] = λe∃e3[Cause(e, e3) & splitting(e3) & Subject(e3, 'John')]
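The conjunctivist logical forms in (53c) can be given a toy executable rendering. The event tokens (e, e1, e2, e3), their part/cause relations, and the facts holding of them below are an invented model for illustration, not Ramchand's formalism itself:

```python
# Toy model (all tokens invented): one macro-event "e" with a result part,
# a process part, and a causing subevent, mirroring (53c).
E = {"e", "e1", "e2", "e3"}
RESULT_PART = {("e", "e1")}
PROC_PART = {("e", "e2")}
CAUSE = {("e", "e3")}
FACTS = {("open", "e1"), ("split", "e1"), ("State", "e1"),
         ("splitting", "e2"), ("Dyn", "e2"), ("splitting", "e3")}
SUBJECT = {("e1", "the coconut"), ("e2", "the coconut"), ("e3", "John")}

# [[resP]]: lam e. exists e1 [Result-Part(e, e1) & open(e1) & split(e1) & ...]
resP = lambda e: any((e, e1) in RESULT_PART
                     and {("open", e1), ("split", e1), ("State", e1)} <= FACTS
                     and (e1, "the coconut") in SUBJECT for e1 in E)
procP = lambda e: any((e, e2) in PROC_PART
                      and {("splitting", e2), ("Dyn", e2)} <= FACTS
                      and (e2, "the coconut") in SUBJECT for e2 in E)
initP = lambda e: any((e, e3) in CAUSE and ("splitting", e3) in FACTS
                      and (e3, "John") in SUBJECT for e3 in E)

# Conjunctive composition: (53a) is true of e iff all three conjuncts hold of e.
truth = all(p("e") for p in (resP, procP, initP))
print(truth)  # True in this toy model
```

The point of the sketch is that once RESULT and CAUSE replace the general "leads to" relation, the three subevent descriptions combine by plain conjunction over the same event variable.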

In the logical form, specific cognitive concepts are employed instead of the general "leads to" relation. Ramchand argues that it may be that the latter is a more general semantic notion that can be utilized for embedding more broadly. If so, the benefit of reducing the event composition rule to conjunction and an arbitrary set of relational concepts is "somewhat less pressing", as Ramchand (2011: 460) argues. She is also skeptical of reducing the predication relation (specifiers) and the event identification (complements) to instances of conjunction; see her paper for arguments.

Ramchand's system constitutes the first phase of the clause, namely, the argument domain. Her logical forms are clearly neo-Davidsonian, and she pairs them with a hierarchical syntax where generally each argument is introduced in a separate projection. The theory places great emphasis on the importance of structure instead of the nature of each lexical item that enters the structure. This is similar to the approach in Borer (2005a,b), even though Borer goes further in arguing that verbs (roots) have absolutely no information about their argument structure. As we have seen, Ramchand maintains that some syntactic constraints on argument structure are necessary.

10.3.3 Introducing Argument Relations

Pylkkänen (2008) and Bowers (2010) both make use of neo-Davidsonian logical forms, which they combine with a syntax where each argument is introduced in a separate projection. Pylkkänen mainly relies on the approach in Kratzer (1996), which she extends to applicatives and causatives (see also Jeong 2007). Here I focus on the system in Bowers (2010) because I think it clearly demonstrates an alternative to the approaches in this section, and an alternative that many semanticists will find appealing. I rely exclusively on the compositional semantics that Bowers provides in Appendix A on pages 197–200. Bowers uses the sentence in (54) as his example:

(54) Bill kisses Mary.

The book is, among other things, also devoted to defending a particular syntax, where the root is at the bottom of the structure and the Agent is merged after the root. The Theme is merged on top of the Agent. All arguments are specifiers of dedicated projections.

(55) [ThP Mary Th [AgP Bill Ag √kiss]]

I do not discuss Bowers’s arguments in favor of the particular syntactic structure. His semantic composition system is mainly based on Functional Application.

(56) Functional Application: If α is a branching node and {β, γ} is the set of α's daughters, then, for any assignment a, if [[β]]a is a function whose domain contains [[γ]]a, then [[α]]a = [[β]]a([[γ]]a) (Heim and Kratzer 1998).

The relevant denotations are provided in (57), where some of the notation has been slightly altered to fit the notation in the rest of the chapter:

(57) a. [[kiss]] = λe[kiss(e)]
     b. [[Ag]] = λPλyλe[P(e) & Agent(e, y)]
     c. [[Th]] = λPλxλe[P(e) & Theme(e, x)]

Based on this, Bowers outlines the derivation in (58):

(58) a. [[Ag]]([[kiss]]) = λPλyλe[P(e) & Agent(e, y)](λe[kiss(e)])
        = λyλe[λe[kiss(e)](e) & Agent(e, y)]
        = λyλe[kiss(e) & Agent(e, y)]
     b. ([[Ag]]([[kiss]]))(Bill) = λyλe[kiss(e) & Agent(e, y)](Bill)
        = λe[kiss(e) & Agent(e, Bill)]
     c. [[Th]](([[Ag]]([[kiss]]))(Bill)) = λPλxλe[P(e) & Theme(e, x)](λe[kiss(e) & Agent(e, Bill)])
        = λxλe[λe[kiss(e) & Agent(e, Bill)](e) & Theme(e, x)]
        = λxλe[kiss(e) & Agent(e, Bill) & Theme(e, x)]
     d. ([[Th]](([[Ag]]([[kiss]]))(Bill)))(Mary) = λxλe[kiss(e) & Agent(e, Bill) & Theme(e, x)](Mary)
        = λe[kiss(e) & Agent(e, Bill) & Theme(e, Mary)]

The only thing that remains to be done is to close the event variable off with an existential quantifier. Bowers argues that the category Pr does this, which is merged on top of the structure in (55) (Bowers 2010: 19). The denotation of Pr is given in (59). This is very similar to a run-of-the-mill existential closure assumed by many scholars (e.g., Heim 1982, Parsons 1990).

(59) [[Pr]] = λP[∃eP(e)]

Applying this denotation to the denotation of ThP yields:

(60) [[Pr]]([[ThP]]) = λP[∃eP(e)](λe[kiss(e) & Agent(e, Bill) & Theme(e, Mary)])
     = ∃e[λe[kiss(e) & Agent(e, Bill) & Theme(e, Mary)](e)]
     = ∃e[kiss(e) & Agent(e, Bill) & Theme(e, Mary)]

And this is the final logical form. This way of using Functional Application together with λ-conversion can be applied to any syntactic structure where each argument is introduced by a separate projection. Thus, one is not committed to Bowers's view on the order of the thematic arguments if one wants to use his compositional semantics. Note also that Functional Application can be utilized even though the verb is fully neo-Davidsonian in the sense that there is separation both in the syntax and in the semantics.
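The mechanics of (57) through (60) can be sketched executably. The toy event records below are invented for illustration; thematic relations are read off the records rather than a real model:

```python
# A minimal executable sketch of Bowers's derivation in (58)-(60).
# The toy event records (a kissing and a hugging) are invented.
EVENTS = [
    {"pred": "kiss", "agent": "Bill", "theme": "Mary"},
    {"pred": "hug",  "agent": "Sue",  "theme": "Bill"},
]

# (57a) [[kiss]] = lam e. kiss(e)
kiss = lambda e: e["pred"] == "kiss"

# (57b) [[Ag]] = lam P lam y lam e. P(e) & Agent(e, y)
Ag = lambda P: lambda y: lambda e: P(e) and e["agent"] == y

# (57c) [[Th]] = lam P lam x lam e. P(e) & Theme(e, x)
Th = lambda P: lambda x: lambda e: P(e) and e["theme"] == x

# (59) [[Pr]] = lam P. there is an e such that P(e)
Pr = lambda P: any(P(e) for e in EVENTS)

# (58): successive Functional Application builds
# lam e. kiss(e) & Agent(e, Bill) & Theme(e, Mary);
# (60) then closes off the event variable existentially.
thp = Th(Ag(kiss)("Bill"))("Mary")
closed = Pr(thp)
print(closed)  # True in this toy model
```

Because each thematic head is just a function from event predicates to event predicates, the same three-line composition works for any ordering of Ag and Th, which is the point made in the surrounding text.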

10.3.4 Syntactic and Semantic Domains

In the previous subsection, we saw a neo-Davidsonian view whereby Functional Application was used to derive the semantic representations by using a syntax where each argument is introduced in a separate projection. The approach in Lohndal (2014) attempts to make use of a different semantic composition operation, namely, conjunction (see Pietroski 2005, 2011 and also Carlson 1984). In essence, the approach attempts to combine a neo-Davidsonian syntax with a conjunctive neo-Davidsonian semantics.

Lohndal's core idea is that each application of Spell-Out corresponds to a conjunct in a logical form. Correspondingly, if we want full thematic separation in the logical forms, we need each argument and the predicate to be spelled out separately. Lohndal puts forward a view of syntax that achieves this, together with a specific model of the syntax–semantics interface. The syntax does not make a categorical distinction between specifiers and complements; compare with Hoekstra (1991), Jayaseelan (2008), and Chomsky (2010). The main syntactic relation, modulo adjuncts, is that of a merged head and a nonhead, and whether that is called a head–complement relation or a specifier–head relation does not really matter.

The model in Lohndal (2014: chapter 4) requires that the model of Spell-Out in minimalist approaches to syntax be rethought. Lohndal does this by proposing a constraint on the kinds of representations that can be generated. The constraint looks as follows (Lohndal 2014: 92):

(61) *[XP YP].

(61) is a derivational constraint that bans two phrasal elements from being merged. Lohndal takes no position on the specific nature of the constraint other than that it has to be derivational (pace Moro 2000); see Speas (1990: 48), Uriagereka (1999), Alexiadou and Anagnostopoulou (2001, 2007), Chomsky (2008, 2013), Richards (2010), and Adger (2013) for much discussion. Whenever the grammar is confronted with a configuration like (61), it will resolve the conflict by making sure that instead of two phrases merging, a head and a phrase are merged. Spell-Out enables this reduction in a specific way that will be outlined below. A similar logic has been used by Epstein (2009) and Epstein, Kitahara, and Seely (2012), where Spell-Out fixes an otherwise illicit representation. However, there is a difference: for them, the illicit representations can be generated and Spell-Out can then fix them; for Lohndal, the relevant representations cannot be generated at all. This is similar to Adger (2013), who changes the relationship between labeling and structure building, among other reasons to incorporate the constraint in (61). Lohndal assumes that Agents are introduced by Voice0; compare with Kratzer (1996); Alexiadou, Anagnostopoulou, and Schäfer (2006); and Alexiadou, Anagnostopoulou, and Schäfer (2015). Lohndal emphasizes that the nature of the label does not matter much; see Chomsky (1995), Harley (1995), Folli and Harley (2007), Pylkkänen (2008), Ramchand (2008), and Sailor and Ahn (2010) for discussion. Themes are also introduced by functional heads. Lohndal simply labels the relevant head F0, for lack of a better name, though it is quite likely that this head is more aspectual in nature; compare with Tenny (1994) and Borer (2005a,b). The verb is generally merged prior to all functional projections, as argued by Borer (2005a,b) and Bowers (2010).
This has to be the case in order to make sure that the verb is spelled out in a separate conjunct.20 Lohndal argues that the most transparent syntax–semantics mapping is one in which an application of Spell-Out corresponds to a conjunct at logical form. In order to see how a typical derivation would run, let us consider the following sentence:

(62) Three video games taught every quarterback.

Following are the three steps of the derivation. The arrows signal what the logical translation of the boxed syntactic structure (Spell-Out domain) is, assuming the approach in Schein (1993).

(63) a. [FP F [VP teach]]   (tree rendered as labeled bracketing; the boxed Spell-Out domain is the VP)
     b. ⇒ teach(e)

This is the first step of the derivation. The verb somehow becomes a phrase and merges with the F head.21 The next step is to merge the Theme every quarterback with the FP. When the Theme is to be merged into the structure, the complement of the F head has to be spelled out due to the constraint in (61). This complement is the VP, and it is in a box in the syntactic tree. This box corresponds to the logical form given in (63b). When the Theme is merged, the derivation continues as follows, with merger of the Voice head.

(64) a. [VoiceP Voice [FP [QP every quarterback] F . . . ]]   (tree rendered as labeled bracketing; the boxed Spell-Out domain is the FP)
     b. ⇒ [every y : Qy][∃e′ : e′ ≤ e](Theme(e′, y))

The FP will be interpreted as in (64b). Here the quantifier outscopes the mereological relation. There are two ways in which the mereological relation can enter the structure. The first option is to put it into the QP. In order to obtain the correct scope relation, the general structure of the QP would have to look roughly as follows.

(65) [every [quarterback [∃e′ : e′ ≤ e]]]

There are many complicated issues surrounding the internal architecture of QPs, which Lohndal does not discuss; he simply notes that this analysis is an alternative. Another alternative is to stipulate syncategorematicity and say that the QP is interpreted as “[every y : Qy][∃e′ : e′ ≤ e]”. Both these proposals leave every quarterback as a constituent and treat every as taking a covert event quantifier argument. Returning to the main derivation, when the Agent is to be merged, the complement of Voice has to be spelled out. This complement corresponds to the box in the tree structure, and it has the logical denotation in (64b). The derivation can then continue and the Agent can be merged.

(66) a. [TP T [VoiceP [QP three video games] Voice . . . ]]   (tree rendered as labeled bracketing; the boxed Spell-Out domain is the VoiceP)
     b. ⇒ [∃X : 3(X) ∧ ∀x(Xx → Gx)](Agent(e, x))

The T head is merged, and the next Spell-Out domain is the domain that is boxed in the tree structure. This domain arises when the subject moves to merge with T. The Agent predicate contains an e variable, since there is no information that indicates that any other event variable is required; compare with the earlier discussion of the Theme. Lohndal assumes that the Spell-Out domains are added to a stack so that at the end of the derivation, these domains are all conjoined by the semantic composition principle Conjunction. This gives us the following representation:

(67) [∃X : 3(X) ∧ ∀x(Xx → Gx)](Agent(e, x)) ∧ [every y : Qy][∃e′ : e′ ≤ e](Theme(e′, y)) ∧ teach(e)

At the end, existential closure is added, and we end up with the following final logical form:

(68) ∃e([∃X : 3(X) ∧ ∀x(Xx → Gx)](Agent(e, x)) ∧ [every y : Qy][∃e′ : e′ ≤ e](Theme(e′, y)) ∧ teach(e))
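The stack-and-conjoin procedure just described can be sketched as a short schematic program. This is my own schematization under the assumptions in the text, with ASCII stand-ins for the logical symbols (E for ∃, A for ∀, <= for ≤); the strings themselves simply transcribe the conjuncts in (63), (64), and (66).

```python
# A schematic model (my own, under the assumptions in the text) of the
# derivation: each application of Spell-Out pushes one conjunct onto a
# stack; at the end the domains are conjoined and existentially closed.
# ASCII stand-ins: E = the existential quantifier, A = the universal one.

stack = []

def spell_out(conjunct):
    """One application of Spell-Out = one conjunct in the logical form."""
    stack.append(conjunct)

spell_out("teach(e)")                                    # VP domain, cf. (63)
spell_out("[every y : Qy][Ee' : e' <= e](Theme(e',y))")  # FP domain, cf. (64)
spell_out("[EX : 3(X) & Ax(Xx -> Gx)](Agent(e,x))")      # VoiceP domain, cf. (66)

# Conjunction over the stacked domains, then existential closure, cf. (68).
logical_form = "Ee(" + " & ".join(reversed(stack)) + ")"
print(logical_form)
```

Reversing the stack merely reproduces the conjunct order shown in (67); since the conjuncts are conjoined, their order carries no semantic weight.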

Lohndal (2014) presents several arguments for why both conjunction and existential closure are needed, among others based on cases where existential closure takes place on only a subset of the conjuncts. In addition to conjunction and existential closure, Lohndal needs a mapping principle integrating the thematic arguments into the thematic predicates; compare with Carlson (1984). That is, somehow "Theme(e, )" has to become "Theme(e, John)", for example. Pietroski (2005) essentially appeals to a type-shifting operation to achieve this, whereas Higginbotham (1985) makes use of a different formalism. Lohndal suggests the mapping operation Thematic Integration. It is defined as in (69).

(69) Thematic Integration
     [H DP] →(Spell-Out)→ R(e, DP)

The operation takes a syntactic structure consisting of a head and a complement and provides a mapping into logical form. It relies on a given set of heads H and a given set of thematic predicates R:

(70) H = {Voice, F, App, . . .}
(71) R = {Agent, Theme, Experiencer, . . .}

These sets are important in order to constrain the power of Thematic Integration and to account for something like the Uniformity of Theta Assignment Hypothesis (UTAH; Baker 1988, 1997). This is a very simplified version of Lohndal's proposal. See Lohndal (2014) for an extensive discussion of the assumptions and claims made above.
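Thematic Integration as constrained by the sets in (70) and (71) can be pictured as a lookup. The specific head-to-predicate pairing below is my illustrative assumption, not a claim of the text; the text provides only the two sets, not the pairing.

```python
# A hedged sketch of Thematic Integration: the mapping from [H DP] to
# R(e, DP) is licensed only for heads in the given set H, each paired
# with a thematic predicate in R. The pairing itself is assumed here.

H_TO_R = {"Voice": "Agent", "F": "Theme", "App": "Experiencer"}

def thematic_integration(head, dp):
    """Map a [head DP] configuration to R(e, DP) at Spell-Out."""
    if head not in H_TO_R:
        raise ValueError(f"{head} is not in the licensed set of heads H")
    return f"{H_TO_R[head]}(e, {dp})"

print(thematic_integration("Voice", "John"))  # Agent(e, John)
print(thematic_integration("F", "Mary"))      # Theme(e, Mary)
```

Restricting the mapping to a fixed table is one way of modeling how the sets H and R constrain the operation's power, in the spirit of UTAH-like uniformity.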

10.4 Conclusion

Donald Davidson's original proposal that there is an event variable in logical forms has been immensely influential. This chapter has surveyed a range of approaches that rely on Davidson's insights concerning adjuncts, but that also extend the insights to apply to thematic arguments. We have seen that there is a family of neo-Davidsonian proposals. They all have in common that they adopt neo-Davidsonian logical forms. The syntax is different for each specific approach, and some are not very specific about what the syntax would be. Those that provide a hierarchical syntax nevertheless arrive at fairly similar logical forms. However, the way in which they arrive at the logical forms differs substantially: Many of them use standard mechanisms such as Functional Application, whereas others use a conjunctive semantics without Functional Application. In a sense, the latter is a natural consequence of the original Davidsonian insight, namely, that predicates are chained together by way of conjunction.

Notes

* I am grateful to Artemis Alexiadou, Elly van Gelderen, an anonymous reviewer, and Rob Truswell for their valuable comments on a previous version of this chapter.
1 Since Gruber (1965) and Jackendoff (1972), there has been a lot of discussion of what the appropriate thematic roles are. See Dowty (1991) for arguments that we can only define prototypical roles, though Schein (2002) argues against this. See also Zubizarreta (1987) and Ramchand (1998) for discussion.
2 The star in "V*e" marks that this 1-place predicate is different from the 3-place lexical entry, although they may have the same descriptive content. See Parsons (1990) for further discussion.
3 Parsons (1990: 96–9) does present an argument for why decomposition is required. Schein (1993: 94) provides further support for this argument, and Bayer (1996: 206) provides counterarguments. See also Bartsch (1976), Carlson (1984), Higginbotham (1985, 1986), Taylor (1985), and Krifka (1989, 1992).
4 There is a rich and important literature in lexical semantics that does not assume that arguments are severed. I cannot discuss this literature here, but see Jackendoff (1990), Levin and Rappaport Hovav (1995, 2005), Reinhart (2002), Reinhart and Siloni (2005), Horvath and Siloni (2011), and Everaert et al. (2012).
5 Due to space limitations, I only focus on Agents and Themes in this section. See McGinnis (2001), Jeong (2007), and Pylkkänen (2008) for much discussion of indirect objects and applicatives. In particular, Pylkkänen provides a compositional semantics that fits well with the discussion in section 10.3.3.
6 I do not discuss the proposal in Krifka (1989, 1992) for reasons of space, as it is quite complex. Essentially, Krifka suggests a theory of how the reference of nominals that bear thematic roles affects the aspectual understanding of the events they participate in.
Various patient relations are analyzed in terms of how they map the mereological structure of the object to the mereological structure of the event. See Bayer (1996) and Larson (2014) for more discussion.
7 This part is a slightly revised version of material that appears in Lohndal (2014).
8 This may not hold for all languages. Müller (2008: 47–50) and references therein argue that it does not hold for German.
9 See Ferreira (2005) for more discussion of this issue.
10 A brief note about Schein's take on plurals, which is important for understanding his logical forms: A plural like the As is a second-order description of a predicate: a predicate such that if it holds of x, x is an A. This means that the cats comes out as a definite second-order description: (i) ιY(∃y Yy ∧ ∀y(Yy ↔ cat(y)))
11 This representation is identical to one from Schein (1993) up to alphabetic variance. Brasoveanu (2010) and Champollion (2010) argue that event variables are not required in this particular logical form involving the quantifier every. See their papers for further details.
12 Schein (1993) observes that this formulation is actually not strong enough. See his book for more discussion.
13 Though see McKay (2006) for a different view.
14 I am following Kratzer in using boldface to distinguish the object language from the metalanguage. Boldface denotes the object language here.
15 I use the label Theme as a cover term for the internal argument; cf. Dowty's (1991) thematic proto-roles.
16 The exposition of Borer's theory is a revised version of the text in Lohndal (2014).
17 See Potts (2008) for a critical discussion.

18 # means that there is an open value in need of a range assignment from the specifier of Asp, and E means that there is an open value for events in need of a range assignment in order to establish a mapping from predicates to events (see Borer 2005b for much more discussion of this system). In AspQ, Q stands for quantity; cf. Verkuyl (1972, 1989, 1993).
19 Ramchand points out that this view is virtually indistinguishable from what she calls "the static lexicon view", which is the view that the lexicon contains argument-structure information that correlates in a systematic way with syntactic structure. See Baker (1988) for such a view.
20 Pylkkänen (2008: 84) suggests that all causative constructions involve a Cause head, which combines with noncausative predicates and introduces a causing event to their semantics. That proposal can easily be adopted in Lohndal's model.
21 The event variable belongs to the verb in the lexicon, or it is acquired through the merger of a root with a categorizer. See Lohndal (2014) for discussion.

References

Adger, D. 2013. A Syntax of Substance. Cambridge, MA: MIT Press.
Alexiadou, A. and Anagnostopoulou, E. 2001. The subject-in-situ generalization and the role of case in driving computations. Linguistic Inquiry 32: 193–231.
Alexiadou, A. and Anagnostopoulou, E. 2007. The subject-in-situ generalization revisited. In Interfaces + Recursion = Language? H-M. Gärtner and U. Sauerland (eds.), 31–60. Berlin: Mouton de Gruyter.
Alexiadou, A., Anagnostopoulou, E. and Schäfer, F. 2006. The properties of anticausatives crosslinguistically. In Phases of Interpretation, M. Frascarelli (ed.), 187–211. Berlin: Mouton de Gruyter.
Alexiadou, A., Anagnostopoulou, E. and Schäfer, F. 2015. External Arguments in Transitivity Alterations: A Layering Approach. Oxford: Oxford University Press.
Bach, E. 1981. On time, tense, and aspect: An essay in English metaphysics. In Radical Pragmatics, P. Cole (ed.), 63–81. New York: Academic Press.
Baker, M. 1988. Incorporation: A Theory of Grammatical Function Changing. Chicago, IL: University of Chicago Press.
Baker, M. 1997. Thematic roles and syntactic structures. In Elements of Grammar, L. Haegeman (ed.), 73–137. Dordrecht: Kluwer.
Bartsch, R. 1976. The Grammar of Adverbials. Amsterdam: North-Holland.
Bayer, S. 1996. Confession of a Lapsed Neo-Davidsonian. Doctoral Dissertation, Brown University.
Borer, H. 1994. The projection of arguments. In University of Massachusetts Occasional Papers in Linguistics 17: Functional Projections, E. Benedicto and J. Runner (eds.). Amherst, MA: GLSA.
Borer, H. 2005a. Structuring Sense. Volume I: In Name Only. Oxford: Oxford University Press.
Borer, H. 2005b. Structuring Sense. Volume II: The Normal Course of Events. Oxford: Oxford University Press.
Bowers, J. 2010. Arguments as Relations. Cambridge, MA: MIT Press.
Brasoveanu, A. 2010. Modified numerals as post-suppositions.
In Logic, Language and Meaning: 17th Amsterdam Colloquium, Amsterdam, The Netherlands, December 2009, Revised Selected Papers, M. Aloni, H. Bastiaanse, T. Jager and K. Schulz (eds.), 203–212. Berlin: Springer.
Carlson, G. 1984. Thematic roles and their role in semantic interpretation. Linguistics 22: 259–279.
Castañeda, H-N. 1967. Comments. In The Logic of Decision and Action, N. Rescher (ed.), 104–112. Pittsburgh, PA: University of Pittsburgh Press.
Champollion, L. 2010. Cumulative readings of every do not provide evidence for events and thematic roles. In Logic, Language and Meaning: 17th Amsterdam Colloquium, Amsterdam, The Netherlands, December 2009, Revised Selected Papers, M. Aloni, H. Bastiaanse, T. Jager and K. Schulz (eds.), 213–222. Berlin: Springer.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1993. A minimalist program for linguistic theory. In The View from Building 20: Essays in Honor of Sylvain Bromberger, K. Hale and S. J. Keyser (eds.), 1–52. Cambridge, MA: MIT Press.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 2008. On phases. In Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, C. Otero, R. Freidin and M-L. Zubizarreta (eds.), 133–166. Cambridge, MA: MIT Press.
Chomsky, N. 2010. Restricting Stipulations: Consequences and Challenges. Talk given at the University of Stuttgart.
Chomsky, N. 2013. Problems of projection. Lingua 130: 33–49.
Cinque, G. 1999. Adverbs and Functional Heads: A Cross-Linguistic Perspective. Oxford: Oxford University Press.
Clark, E. and Clark, H. 1979. When nouns surface as verbs. Language 55: 767–811.
Davidson, D. 1967. The logical form of action sentences. In The Logic of Decision and Action, N. Rescher (ed.), 81–95. Pittsburgh, PA: University of Pittsburgh Press.
Dotlačil, J. 2010. Anaphora and Distributivity: A Study of Same, Different, Reciprocals and Others. Doctoral Dissertation, Utrecht University.
Dowty, D. 1989.
On the semantic content of the notion of 'thematic role'. In Properties, Types and Meanings. Volume II: Semantic Issues, G. Chierchia, B. Partee and R. Turner (eds.), 69–130. Dordrecht: Kluwer.
Dowty, D. 1991. Thematic proto-roles and argument selection. Language 67: 547–619.
Epstein, S. D. 2009. The Unification of Theta Relations: How TRANSFER Renders SpecvP a Theta-Marked Complement. Ms., University of Michigan.
Epstein, S. D., Kitahara, H. and Seely, T. D. 2012. Structure building that can't be. In Ways of Structure Building, M. Uribe-Etxebarria and V. Valmala (eds.), 253–270. Oxford: Oxford University Press.
Everaert, M., Marelj, M. and Siloni, T. 2012. The Theta System: An introduction. In The Theta System: Argument Structure at the Interface, M. Everaert, M. Marelj and T. Siloni (eds.), 1–19. Oxford: Oxford University Press.
Ferreira, M. 2005. Event Quantification and Plurality. Doctoral Dissertation, MIT.
Folli, R. and Harley, H. 2007. Causation, obligation, and argument structure: On the nature of little v. Linguistic Inquiry 38: 197–238.
Gruber, J. 1965. Studies in Lexical Relations. Doctoral Dissertation, MIT.
Hale, K. and Keyser, S. J. 1993. On argument structure and the lexical expression of syntactic relations. In The View from Building 20: Essays in Honor of Sylvain Bromberger, K. Hale and S. J. Keyser (eds.), 53–109. Cambridge, MA: MIT Press.
Hale, K. and Keyser, S. J. 2002. Prolegomenon to a Theory of Argument Structure. Cambridge, MA: MIT Press.
Harley, H. 1995. Subjects, Events and Licensing. Doctoral Dissertation, Massachusetts Institute of Technology.
Harley, H. 2009. Roots and Locality. Talk given at the Roots workshop, University of Stuttgart.
Heim, I. 1982. The Semantics of Definite and Indefinite Noun Phrases. Doctoral Dissertation, University of Massachusetts, Amherst, MA.
Heim, I. and Kratzer, A. 1998. Semantics in Generative Grammar. Oxford: Blackwell.
Higginbotham, J. 1985. On semantics.
Linguistic Inquiry 16: 547–593.
Higginbotham, J. 1986. Linguistic theory and Davidson's program. In Inquiries into Truth and Interpretation, E. Lepore (ed.), 29–48. Oxford: Blackwell.
Hoekstra, E. 1991. Licensing Conditions on Phrase Structure. Doctoral Dissertation, Rijksuniversiteit Groningen.
Hornstein, N. 2002. A grammatical argument for a neo-Davidsonian semantics. In Logical Form and Language, G. Preyer and G. Peters (eds.), 345–364. Oxford: Oxford University Press.
Horvath, J. and Siloni, T. 2011. Causatives across components. Natural Language & Linguistic Theory 29: 657–704.
Jackendoff, R. 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: MIT Press.
Jackendoff, R. 1990. Semantic Structures. Cambridge, MA: MIT Press.
Jayaseelan, K. A. 2008. Bare phrase structure and specifier-less syntax. Biolinguistics 2: 87–106.
Jeong, Y. 2007. Applicatives: Structure and Interpretation from a Minimalist Perspective. Amsterdam: John Benjamins.
Klein, E. and Sag, I. 1985. Type-driven translation. Linguistics and Philosophy 8: 162–202.
Koopman, H. and Sportiche, D. 1991. The position of subjects. Lingua 85: 211–258.
Kratzer, A. 1996. Severing the external argument from the verb. In Phrase Structure and the Lexicon, J. Rooryck and L. Zaring (eds.), 109–137. Dordrecht: Kluwer.
Kratzer, A. 2000. The Event Argument and the Semantics of Verbs. Ms., University of Massachusetts.
Krifka, M. 1989. Nominal reference, temporal constitution and quantification in event semantics. In Semantics and Contextual Expression, R. Bartsch, J. van Benthem and P. van Emde Boas (eds.), 75–115. Dordrecht: Foris.
Krifka, M. 1992. Thematic relations as links between nominal reference and temporal constitution. In Lexical Matters, I. Sag and A. Szabolcsi (eds.), 29–53. Stanford, CA: CSLI.
Larson, R. 2014. On Shell Structure. London: Routledge.
LaTerza, C. 2014. Distributivity and Plural Anaphora. Doctoral Dissertation, University of Maryland.
Levin, B.
and Rappaport Hovav, M. 1995. Unaccusativity: At the Syntax–Lexical Semantics Interface. Cambridge, MA: MIT Press.
Levin, B. and Rappaport Hovav, M. 2005. Argument Realization. Cambridge: Cambridge University Press.
Lin, T-H. 2001. Light Verb Syntax and the Theory of Phrase Structure. Doctoral Dissertation, University of California, Irvine.
Lohndal, T. 2014. Phrase Structure and Argument Structure: A Case Study of the Syntax–Semantics Interface. Oxford: Oxford University Press.
Marantz, A. 1984. On the Nature of Grammatical Relations. Cambridge, MA: MIT Press.
Marantz, A. 1997. No escape from syntax: Don't try morphological analysis in the privacy of your own lexicon. In U. Penn Working Papers in Linguistics. Volume 4.2: Proceedings of the 21st Annual Penn Linguistics Colloquium, A. Dimitriadis, L. Siegel, C. Surek-Clark and A. Williams (eds.), 201–225. Philadelphia: University of Pennsylvania.
Marantz, A. 2005. Objects out of the Lexicon: Objects as Events. Ms., MIT.
May, R. 1977. The Grammar of Quantification. Doctoral Dissertation, MIT.
May, R. 1985. Logical Form: Its Structure and Derivation. Cambridge, MA: MIT Press.
McCloskey, J. 1997. Subjecthood and subject positions. In Elements of Grammar, L. Haegeman (ed.), 197–235. Dordrecht: Kluwer.
McGinnis, M. 2001. Variation in the phase structure of applicatives. In Linguistic Variation Yearbook 1, P. Pica and J. Rooryck (eds.), 105–146. Amsterdam: John Benjamins.
McKay, T. 2006. Plural Predication. Oxford: Clarendon Press.
Merchant, J. 2013. Voice and ellipsis. Linguistic Inquiry 44: 77–108.
Moro, A. 2000. Dynamic Antisymmetry. Cambridge, MA: MIT Press.
Müller, S. 2008. Head-Driven Phrase Structure Grammar: Eine Einführung. Tübingen: Stauffenburg.
Parsons, T. 1990. Events in the Semantics of English. Cambridge, MA: MIT Press.
Parsons, T. 1993. Thematic Relations and Arguments. Ms., University of California, Irvine.
Parsons, T. 1995. Thematic relations and arguments. Linguistic Inquiry 26: 635–662.
Pietroski, P. 2005.
Events and Semantic Architecture. Oxford: Oxford University Press.
Pietroski, P. 2011. Minimal semantic instructions. In The Oxford Handbook of Linguistic Minimalism, C. Boeckx (ed.), 472–498. Oxford: Oxford University Press.
Potts, C. 2008. Review article: Hagit Borer's Structuring Sense. Language 82: 348–369.
Pylkkänen, L. 2008. Introducing Arguments. Cambridge, MA: MIT Press.
Ramchand, G. 1998. Deconstructing the lexicon. In The Projection of Arguments, M. Butt and W. Geuder (eds.), 65–96. Stanford, CA: CSLI.
Ramchand, G. 2008. Verb Meaning and the Lexicon: A First Phase Syntax. Cambridge: Cambridge University Press.
Ramchand, G. 2011. Minimalist semantics. In The Oxford Handbook of Linguistic Minimalism, C. Boeckx (ed.), 449–471. Oxford: Oxford University Press.
Reinhart, T. 2002. The Theta System: An overview. Theoretical Linguistics 28: 229–290.
Reinhart, T. and Siloni, T. 2005. The lexicon–syntax parameter: Reflexivization and other arity operations. Linguistic Inquiry 36: 389–436.
Richards, N. 2010. Uttering Trees. Cambridge, MA: MIT Press.
Ritter, E. and Rosen, S. T. 1998. Delimiting events in syntax. In The Projection of Arguments, W. Geuder and M. Butt (eds.), 135–164. Stanford, CA: CSLI.
Sailor, C. and Ahn, B. 2010. The Voices in our Heads. Talk given at Morphological Voice and its Grammatical Interfaces.
Schäfer, F. 2008. The Syntax of (Anti-)causatives: External Arguments in Change-of-State Contexts. Amsterdam: John Benjamins.
Schäfer, F. 2012. Two types of external argument licensing: the case of causers. Studia Linguistica 66: 128–180.
Schein, B. 1993. Plurals and Events. Cambridge, MA: MIT Press.
Schein, B. 2002. Events and the semantic content of thematic relations. In Logical Form and Language, G. Preyer and G. Peters (eds.), 263–344. Oxford: Oxford University Press.
Schein, B. 2003. Adverbial, descriptive reciprocals. Philosophical Perspectives 17: 333–367.
Speas, M. 1990.
Phrase Structure in Natural Language. Dordrecht: Kluwer.
Taylor, B. 1985. Modes of Occurrence: Verbs, Adverbs and Events. Oxford: Blackwell.
Tenny, C. 1987. Grammaticalizing Aspect and Affectedness. Doctoral Dissertation, MIT.
Tenny, C. 1994. Aspectual Roles and the Syntax–Semantics Interface. Dordrecht: Kluwer.
Travis, L. 2000. Event structure in syntax. In Events as Grammatical Objects: The Converging Perspectives of Syntax and Lexical Semantics, C. Tenny and J. Pustejovsky (eds.), 145–185. Stanford, CA: CSLI.
Uriagereka, J. 1999. Multiple spell-out. In Working Minimalism, S. D. Epstein and N. Hornstein (eds.), 251–282. Cambridge, MA: MIT Press.
Verkuyl, H. J. 1972. On the Compositional Nature of the Aspects. Dordrecht: Reidel.
Verkuyl, H. J. 1989. Aspectual classes and aspectual composition. Linguistics and Philosophy 12: 39–94.
Verkuyl, H. J. 1993. A Theory of Aspectuality: The Interaction Between Temporal and Atemporal Structure. Cambridge: Cambridge University Press.
Williams, E. 1981. Argument structure and morphology. The Linguistic Review 1: 81–114.
Zubizarreta, M-L. 1987. Levels of Representation in the Lexicon and in the Syntax. Dordrecht: Foris.

11 Interrogatives, Instructions, and I-Languages
An I-Semantics for Questions1

with Paul Pietroski

11.1 Introduction

Prima facie a serious obstacle to the program of providing a truth-theoretic semantics for natural language is the fact that natural languages apparently contain an infinite number of sentences that do not appear to be truth evaluable even in use, specifically, imperatives [. . .] and interrogatives.
—Lepore and Ludwig (2007: 263)

In this chapter, we develop a simple idea in a minimalist setting: interrogative expressions are instructions for how to assemble mental representations that are apt for making queries. While this idea might seem trivial at first glance, it can be developed in a theoretically spare yet still empirically attractive way. We discuss wh-movement as a paradigm example of how "movement to the edge" of a sentence has a semantic effect that differs from merely adding information (say, by means of a new argument/adjunct) or raising a quantifier. In particular, we offer a minimalist version of an old thought: the leftmost edge of a sentence permits a kind of abstraction that makes it possible to use a sub-sentential (mood-neutral) expression to ask a question; and wh-interrogatives turn out to be especially interesting, with implications for relative clauses, which also provide examples of how movement to the edge of a cyclically generated expression has a distinctive semantic effect. From this perspective, the edge of a phrase is a locus for a "secondary" semantic instruction, concerning the use of a mental representation that can be assembled by executing the "primary" instruction encoded by the rest of the phrase; cp. Chomsky (2005: 14). What follows is an attempt to articulate this general idea, in some detail, for interrogative expressions.

Within formal semantics, it is widely held that understanding the declarative sentences of a natural language—knowing what these sentences mean—is a matter of knowing their truth conditions. Since children naturally acquire spoken/signed languages that have endlessly many declaratives, it seems that each such sentence must have a truth condition that can be somehow computed, given finitely many assumptions about (a) the semantic properties of lexical items, (b) the relevant syntax, and (c) how the semantic properties of complex expressions are determined by (a) and (b).
But the languages that children acquire also include endlessly many interrogative sentences that are understood just as well as their declarative counterparts. So if (1) has a computable truth condition,

(1) Jay saw Kay.

that raises a cluster of foundational questions about the corresponding yes/no-interrogative (2a) and the wh-interrogatives (2b) through (2d),

(2) a. Did Jay see Kay?
    b. Who did Jay see?
    c. Who saw Kay?
    d. When did Jay see Kay?

along with further questions about relative clauses like (3a) through (3c) and complex declaratives like (3d) and (3e):2

(3) a. . . . who Jay saw
    b. . . . who saw Kay
    c. . . . when Jay saw Kay
    d. Someone wondered/forgot/knew whether Jay saw Kay.
    e. Someone asked who Jay saw, and someone remembered when Jay saw Kay.

At the most basic level, one wants to know how the cognitive resources deployed in understanding interrogatives are related to the cognitive resources deployed in understanding declaratives and relative clauses. While the parallels between (2c) and (3b) seem especially vivid, there is presumably massive overlap in terms of the lexical knowledge and recursive capacities invoked to understand (1) through (3e). But does this "core" semantic competence consist in tacit knowledge of a Tarski-style theory of truth, which is supplemented in some way that accommodates interrogatives (cp. Dummett (1976)'s discussion of Davidson (1967b)), or is natural language semantics less truth-centric? In terms of the expressions themselves, do (2a) through (2d) have truth-evaluable constituents that are combined with special question-forming devices: if so, how are interrogatives understood compositionally; if not, does this tell against the familiar idea (reviewed in the following) that relative clauses have truth-evaluable constituents, and would this in turn tell against truth-theoretic accounts of declaratives?

Over the past thirty years or so, intensive study of interrogatives has led to many insights (Hamblin 1973, Karttunen 1977, Higginbotham and May 1981, Groenendijk and Stokhof 1982, 1984, Higginbotham 1993), especially with regard to details concerning the kinds of expressions that can be used (cross-linguistically) to ask questions, and how these expressions are related to others in terms of form and meaning. Semanticists have also developed useful frameworks for thinking about how interrogatives are related to information (see, e.g., Ginzburg and Sag (2001)). But central questions remain. In particular, as discussed in section 2.2, it is often said that an interrogative has a set of propositions—intuitively, a set of possible answers—as its semantic value.
But it is not obvious that a few word meanings, combined as in (2a) through (2d), can determine a set of propositions in accord with the natural principles governing lexical items and composition. We stress this point, elaborated on in the following: Any posited meaning of an interrogative expression must be determined, in accord with independently plausible composition principles, by independently plausible lexical meanings. From this perspective, one wants to know how expressions like (1) through (3e) systematically interface with relevant aspects of human cognition. If a declarative like (1) is something like an instruction for how to build a truth-evaluable thought, perhaps each of (2a) through (2d) is an instruction for how to build a certain class of thoughts. But another possibility is that while hearing an interrogative often leads one to represent possible answers, understanding an interrogative requires both less and more: less, because representing answers is a potential effect (not a constitutive part) of understanding; and more, because an interrogative differs from any non-interrogative device for representing a set of answers (McGinn 1977, Stainton 1999). Maybe (1) through (3e) are all instructions for how to construct concepts, with "sentential" concepts as special cases, and each grammatical mood is an instruction related to a certain kind of concept use.

One can invent a language in which certain sentences indicate propositions, and other sentences indicate sets of such abstracta, thereby offering an idealized model of certain speech acts. (Imagine a community in which a declarative is always used to assert the indicated proposition, and an interrogative is always used to request an assertion of some proposition in the indicated set.) Such invention can also suggest a grammar that at least accommodates (2a) through (2d) in a way that describes their interrogative character.
And for these purposes, one can abstract from many details concerning how the lexical constituents of (2a) through (2d) combine to form expressions whose meanings make them apt for use in requesting information.3 But we assume that children acquire I-languages in Chomsky (1986, 1995)’s sense: expression-generating procedures—intensions in Church (1941)’s sense (see also Frege 1892)—that are biologically implemented in ways that respect substantive constraints of Universal Grammar on lexical meanings and modes of composition, where these procedures, acquired in conditions of limited experience via the human language faculty, generate expressions that pair phonological instructions to “perceptual-articulatory” systems with semantic instructions to “conceptual-intentional” systems. 322 The Syntax–Semantics Interface In this respect, we adopt fundamental assumptions of the Minimalist Program, taking each I-language “to be a device that generates expressions Exp = <Phon, Sem>, where Phon provides the “instructions” for sensorimotor systems and Sem the “instructions” for systems of thought: information about sound and meaning, respectively, where “sound” and “meaning” are understood in internalist terms, “externalizable” for language use by the performance systems” (Chomsky 2000a: 91). There are many ways of spelling out this old idea. But one can hypothesize that each sentence is a two-part instruction: a core (or “radical”) component that directs construction of a sentential concept that can be used with different forces and a further instruction for how to make the sentential concept apt for a certain kind of use (Frege 1879).4 More specifically, drawing on Segal (1991) and McGinn (1977), we adopt an I-language version of a Tarski-style semantics that eschews truth values. On this view, there are no I-language expressions of type <t>.
Rather, each sentence is an instruction to build a concept that applies to everything or nothing in the relevant domain, relative to an assignment of values to variables. And such a concept can be used, in concert with other cognitive systems, to assert that—or query whether, or wish/pretend/joke that—it applies to something/everything. Given an independently plausible and spare syntax, this yields a simple procedure for generating interrogative interpretations. It also preserves descriptive adequacy with regard to a significant range of interrogatives and wh-expressions. But our goal here is to offer a promising account that speaks directly to some foundational challenges presented by non-declarative sentences. We cannot—and will not try to—deal with the many and varied empirical phenomena that have been analyzed and discussed in the rich descriptive literature on interrogatives, much less imperatives/exclamatives and so on. We focus instead on a few illustrative phenomena: argument and adjunct interrogatives, yes/no-interrogatives, and multiple wh-interrogatives. In our view, these basic cases already reveal difficult theoretical questions that are not answered by describing the facts in terms of sets of propositions.

11.2 I-Semantics and Concepts

In this section, we briefly review the I-language/E-language distinction (Chomsky 1986), and we endorse a version of the following idea (cp. Chomsky (1995)): human I-languages generate expressions that pair phonological instructions (PHONs) with semantic instructions (SEMs), where the latter can be described as (syntactic) instructions to build concepts, as in Chomsky (2000a) and Pietroski (2008, 2010). Given this overtly psychological perspective on semantics, positing denotations for expressions is a first step that raises further questions: What are the corresponding mental representations, and how is their compositional structure related to that of the corresponding I-language expressions? Focusing on these questions may lead one to conclude that the original claims, about denotations, mixed distinct aspects of linguistic competence (concerning knowledge of meaning and knowledge of how meaningful expressions can be used in communication).5

11.2.1 Implemented Intensions and Typology

Chomsky (1986) distinguished I-languages from E-languages, stressing the difference between expression-generating procedures (intensions) and sets (extensions) of generable expressions. The “I” also connoted “idiolect”, “individual”, and “internal”. As noted earlier, the expression-generating procedures that children naturally acquire must also be implemented by available human biology, and this is presumably a major source of constraint on which I-languages children can acquire, even if theorists do not know the details. One can invent E-languages that are “external” to any particular speaker, in the sense of being governed by public conventions that may violate principles respected by all natural languages. And such inventions may be useful for certain purposes. In particular, if an E-language has unboundedly many expressions, one might describe it in terms of a generative procedure that models certain aspects of the “human” I-languages that children can naturally acquire. But our inquiry is focused on these I-languages, the faculty that lets human children acquire them, and the mental representations with which generable expressions interface. At least to a first approximation, one can describe human I-languages as implemented procedures that pair PHONs with SEMs, where PHONs (or PFs) are the aspects of generable expressions that interface with human perceptual-articulatory systems and SEMs (or LFs) are the aspects of generable expressions that interface with human conceptual-intentional systems. This leaves room for many hypotheses about how SEMs are related to PHONs, syntax, and morphology. But the simplest idea, and hence an obvious starting point, is that expressions are <PHON, SEM> pairs (Chomsky 1995).
Familiar facts suggest that an expression’s PHON need not be isomorphic to the “logical form” of the thought expressed, taking logical forms to be structural aspects of mental representations with which SEMs naturally interface. So it seems that either (1) an expression’s PHON need not be isomorphic to its SEM or (2) an expression’s SEM need not be isomorphic to the corresponding logical form. We assume that (1) is correct, and that part of the goal is to specify a relatively transparent mapping from SEMs to logical forms, while keeping the posited mismatches between SEMs and PHONs explicable; compare May (1977), Chomsky (1981), Higginbotham and May (1981). The broader task is to specify a biologically implementable algorithm for generating complex SEMs that can be employed as executable instructions for how to build mental representations of some kind—and to specify these instructions in empirically plausible ways—while also specifying the elements and structural properties of SEMs, along with the elements and structural properties of the corresponding mental representations; compare Hornstein and Pietroski (2009). One can accept this task and still hypothesize that understanding an expression of a human I-language is a matter of recognizing (or perhaps assigning) its truth-theoretic properties in the right way; see e.g., Higginbotham (1986), Larson and Segal (1995), Heim and Kratzer (1998). For as Larson and Segal (1995) make explicit, and other authors suggest, one can offer a proposal about how the pronounceable expressions of a human I-language are related to expressions of a hypothesized language of thought (Fodor 1975, 2008) that makes it possible to represent Tarskian satisfaction conditions. Indeed, we do not see how else sentential SEMs could actually have truth conditions, as opposed to merely being “interpretable” in this way from an externalistic perspective.
To recognize that cow is true of all and only the cows—or more precisely, that when cow is linked to a variable v, the resulting expression is satisfied by an assignment A of values to variables iff A assigns a cow to v—a speaker presumably needs some way of representing the cows, along with some way of representing variables and truth/satisfaction. Likewise, for Larson and Segal, understanding brown cow is a matter of generating (in the right way) a representation according to which this phrase is true of things that are both brown and cows. If only for simplicity, we assume that humans have concepts like COW and BROWN, along with some logical concepts like AND, where concepts are mental representations, composable in their own terms.6 And we assume that speakers deploy such concepts in understanding. Ordinary speakers may also have semantic concepts like SATISFIES, which can be deployed to represent complex I-language expressions as having semantic properties that can be specified via concepts like AND, BROWN, and COW.7 Though once one grants that theorists must say something about how SEMs are related to concepts, even for simple cases like brown cow, various possibilities come into view. Instead of saying that SEMs have (and/or are represented as having) truth-theoretic properties, one can hypothesize that SEMs are instructions for how to fetch and combine mental representations and that executing SEMs leads to the assembly of representations that may or may not have truth-theoretic properties. To a first approximation, one might view the SEM of each morpheme as an executable instruction for how to fetch a concept from a certain lexical address. And one might view each complex SEM as an executable instruction for how to formally combine concepts obtained by executing constituent SEMs. This leaves it open which if any aspects of SEMs/instructions should be characterized in terms of traditional semantic notions.
For example, the SEM of brown cow might be described as CONJOIN[FETCH@“brown”, FETCH@“cow”], that is, conjoin concepts fetched from the lexical addresses (in the lexicon) associated with the PHONs of brown and cow. If we idealize away from polysemy, and assume that each lexical address indicates a single concept—and that there is only one available way to conjoin monadic concepts—there will only be one way of executing the semantic instruction. The resulting concept, AND[BROWN, COW], might have a satisfaction condition. But executing an instruction may lead to a product that has properties not specified in the instruction. So even if (some) concepts have Tarskian satisfiers, it is not obvious that SEMs are related to concepts via semantic concepts like SATISFIES, as on more traditional approaches. In any case, there may be many human concepts that cannot be fetched or assembled via SEMs: acquirable I-languages may not interface with all the concepts that humans enjoy. So let us say that human I-concepts are concepts that can be fetched or assembled via SEMs. Likewise, there may be ways of assembling concepts that SEMs cannot invoke. We assume that any plausible account will need to posit, one way or another: conjunction of monadic concepts; a restricted form of saturation, or perhaps “theta-binding”, corresponding to combination of a verb with an argument (Carlson 1984); and something like quantification over assignment variants, to accommodate the kind(s) of abstraction associated with relative clauses and the external arguments of quantificational determiners (cp. Higginbotham (1985)).
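For readers who find the instruction metaphor abstract, the fetch-and-conjoin idea can be rendered as a small computational sketch. This is our illustration only, not part of the proposal: the lexicon, the entity encoding, and the function names are invented, and concepts are crudely modeled as predicates.

```python
# A minimal sketch (ours) of SEMs as executable instructions: lexical SEMs
# fetch monadic concepts from lexical addresses, and CONJOIN builds AND[C, C'].

# Concepts are modeled as predicates over entities (here, simple dicts).
LEXICON = {
    "brown": lambda x: x.get("color") == "brown",  # the concept BROWN
    "cow":   lambda x: x.get("kind") == "cow",     # the concept COW
}

def fetch(address):
    """FETCH@address: retrieve the single concept stored at a lexical address."""
    return LEXICON[address]

def conjoin(c1, c2):
    """CONJOIN: the one available way of conjoining two monadic concepts."""
    return lambda x: c1(x) and c2(x)

# Executing the SEM of 'brown cow': CONJOIN[FETCH@"brown", FETCH@"cow"]
brown_cow = conjoin(fetch("brown"), fetch("cow"))

bessie = {"kind": "cow", "color": "brown"}
clyde = {"kind": "horse", "color": "brown"}
print(brown_cow(bessie))  # True: AND[BROWN, COW] applies to bessie
print(brown_cow(clyde))   # False: clyde is brown but not a cow
```

Note that, as in the text, the instruction itself mentions only fetching and conjoining; any satisfaction condition of the assembled concept is a property of the product, not something specified in the instruction.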
But whatever the details, let’s say human I-operations are those concept-combining operations that can be invoked via the syntax of SEMs.8 This invites a question characteristic of Chomsky (1995, 2000b)’s proposed Minimalist Program, applied to semantics: What is the sparest theoretical inventory, of I-concepts and I-operations, that allows for at least rough descriptive adequacy with regard to characterizing the concept-construction-instructions (or Begriffsplans) generated by human I-languages? In answering this question, one needs to distinguish Tarski (1935)’s technical notion of satisfaction from the intuitive sense in which instructions are satisfied when successfully executed. If Σ is a sentence of a language that has a Tarskian semantics—perhaps an idealized language of thought—one can say that Σ is “E-satisfied” by each sequence of entities (in the relevant domain of discourse) that meets a certain condition. But if Σ has E-satisfiers, these sequences need not and typically will not reflect the structure of Σ, and Σ need not be an instruction to build anything. By contrast, if Σ is a concept-construction-instruction, one can say that Σ is “I-satisfied” by fetching certain concepts and performing certain combinatorial operations. And Σ can be satisfied in this sense, requiring construction of a concept that at least partly reflects the structure of Σ, even if Σ has no E-satisfiers; compare with Davies (1987).9 Given this distinction, appeals to semantic typology must be motivated carefully. Let’s grant that humans enjoy concepts that exhibit at least some of the traditional Fregean hierarchy: singular concepts of type <e>, used to think about entities; truth-evaluable thoughts of type <t>; predicative concepts of type <e,t> that can combine with a concept of type <e> to form a thought of type <t>; and so on, for some nontrivial range of types. It does not follow that humans have I-concepts of these types.
Indeed, the allegedly basic types <e> and <t> are especially suspect. Many accounts of proper names eschew the idea that names are lexical items (that fetch concepts) of type <e>, in favor of the idea that Jay is more like “that person called Jay”—a complex predicative expression (used to assemble a complex monadic concept). More generally, expressions often said to be of type <e> may be better analyzed as devices for fetching monadic concepts that can be conjoined with others; see Pietroski (2011) and the references there. And it is worth stressing that Tarski (1935) did not appeal to truth values, or expressions of type <t>, when he characterized truth in terms of satisfaction. Tarski treated sentences (of his invented language) as devices that classify sequences. And since it will be important that I-language sentences need not be instructions to build concepts of truth values, we conclude this subsection by introducing some relevant notation. Consider a pair of operators, ↑ and ↓, that convert monadic concepts into monadic concepts as follows: for each thing in the domain, ↑C applies to it iff C applies to something, and ↓C applies to it iff C applies to nothing, where C ranges over monadic concepts. For example, ↑COW applies to you iff there is at least one cow. Correlatively, ↓COW applies to you iff nothing is a cow. So for each thing, either ↑COW or ↓COW applies to it. And nothing is such that both ↑COW and ↓COW apply to it. Given a suitable metalanguage, we can say ↑C ≡ ∃x[C(x)]; ↓C ≡ ¬∃x[C(x)]. But the idea is not that “↑C” abbreviates “∃x[C(x)]”. The possibility to consider is that sentential SEMs invoke an operation that creates a concept of “all or none” from an assembled monadic concept. We take no stand here on which aspect of sentential syntax invokes this operation. But one can hypothesize that for some grammatical label S, an expression with this label is an instruction to execute the labeled instruction and then prefix the resulting concept with ↑.
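The behavior of ↑ and ↓ can likewise be modeled concretely. In this sketch (ours, with a toy finite domain standing in for the relevant domain of discourse), applying ↑ or ↓ to a monadic concept yields a new monadic concept whose application to any given thing is insensitive to which thing it is:

```python
# A toy model (ours) of the ↑ and ↓ operators: concepts are predicates over a
# small finite domain, standing in for the relevant domain of discourse.

DOMAIN = ["bessie", "clyde", "daisy"]

def cow(x):
    return x in ("bessie", "daisy")

def unicorn(x):
    return False

def up(concept, domain=DOMAIN):
    """↑C applies to each thing iff C applies to something in the domain."""
    nonempty = any(concept(x) for x in domain)
    return lambda x: nonempty

def down(concept, domain=DOMAIN):
    """↓C applies to each thing iff C applies to nothing in the domain."""
    empty = not any(concept(x) for x in domain)
    return lambda x: empty

# For each thing, exactly one of ↑COW and ↓COW applies to it.
assert all(up(cow)(x) and not down(cow)(x) for x in DOMAIN)
# Nothing is a unicorn, so ↓UNICORN applies to everything.
assert all(down(unicorn)(x) for x in DOMAIN)
```

The point the sketch makes vivid is that ↑C and ↓C remain formally monadic: they classify individuals, even though which way they classify depends only on whether C has instances.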
Given concepts of events, like SEEING-OF-KAY and DONE-BY-JAY, “closing up” can yield concepts like ↑AND[DONE-BY-JAY, SEEING-OF-KAY]. This concept applies to all or to none, depending on whether or not there was an event of Jay seeing Kay.10 Let us say that any concept of the form “↑C” or “↓C” is a T-concept, with “T” connoting Tarski, totality, and truthy.11 There is no guarantee that human I-concepts include T-concepts. Even if humans enjoy such concepts, there is no guarantee that they can be assembled by executing semantic instructions. But that is true for concepts of any type. Still, one can imagine a procedure that generates instructions of the form CLOSE-UP:CONJOIN[. . ., . . .], where executing such instructions leads to assembling concepts of the form ↑AND[C, C′]. And appeal to T-concepts can do at least much of the work done by supposing that sentences (are used to assemble concepts that) denote truth values. So instead of saying that sentences exhibit a special type <t> that differs from the type <e,t> exhibited by “brown cow”, with expressions of type <t> as truth value denoters, one might offer an I-language semantics according to which sentences are special cases of predicates. Put another way, T-concepts are predicative concepts formed via distinctive operators, not concepts of distinctive things. So especially if it is unclear that human I-concepts include concepts of type <t>, the possibility of appealing to T-concepts should make theorists pause before assuming the traditional semantic typology in theories of I-languages.12 We do not deny that “postlinguistic” cognition often traffics in complete thoughts, with each monadic concept saturated or quantificationally bound. But human I-languages may interface with such cognition via formally monadic T-concepts, perhaps because I-languages do not themselves generate expressions of type <t>.
And this is not mere speculation, given that the notion of a sentence has always had an unstable place in grammatical theory. It is notoriously hard to say which SEMs exhibit the special type <t>, especially if each SEM is (qua generable expression) an instance of some grammatical type exhibited by lexical items. One can stipulate that sentences are projections of some functional category, perhaps associated with tense. But no such stipulation seems especially good. So perhaps theorists should drop the idea that human I-languages generate expressions of type <t>, in favor of a less type-driven conception of semantics. In any case, we do not want to rely on inessential typological assumptions when addressing foundational questions about how interrogatives and relative clauses are related to declaratives. It is hard to see how concepts of truth values can be used to ask questions. And we suspect this feeds the idea that concepts of propositions are required. So we do not assume that I-languages generate expressions of type <t>, much less that interrogatives and relative clauses have constituents that denote truth values.

11.2.2 I-Language Interrogatives and Question-Denoters

Following Hamblin (1958, 1973), many theorists have been attracted to some version of the idea that an interrogative denotes the corresponding set of possible answers—or the corresponding set of true answers (Karttunen 1977), or a partition of a suitable set of answers (Higginbotham and May 1981). Hamblin expressed the leading idea in terms of a point about how interrogatives are used in communication: “Pragmatically speaking a question sets up a choice-situation between a set of propositions, namely, those propositions that count as answers to it” (Hamblin 1973: 48). And as noted earlier, we grant the utility of this idealization concerning use. But it is often said that yes/no-interrogatives like (4a) denote the set indicated with (4b) or perhaps the corresponding singleton set that includes only the true proposition in question.13

(4) a. Did Jay see Kay?
    b. {the proposition that Jay saw Kay, the proposition that Jay did not see Kay}

Likewise, it is often said that a wh-question like (5a) denotes some set of propositions gestured at with (5b),

(5) a. Who did Jay see?
    b. {the proposition that Jay saw Kay, the proposition that Jay saw Larry, the proposition that Jay saw Mary, . . ., the proposition that Jay saw Jay}

or perhaps a partition of (5b)—that is, a set of conjunctions, each of which has each element of (5b) or its negation as a conjunct. Any such proposal raises questions about the unmentioned elements of (5b), indicated with the ellipsis. Do they include, for example, the following propositions (at least if true): that Jay saw the governor of Illinois; that Jay saw every governor convicted of a crime; that Jay saw every governor who saw him; that Jay saw every governor he saw? But set such issues aside, and assume that there is a determinate set of propositions corresponding to (5a). A more pressing question, from an I-language perspective, is how to translate the talk of expressions denoting abstracta into a plausible hypothesis about the SEMs of (4a) and (5a). At least sometimes, talk of denotation is abbreviated talk of what speakers can do with expressions in communicative contexts—with no pretense of any proposal about how SEMs are used to assemble mental representations. But it can be tempting to say that (6), by virtue of its meaning,

(6) Jay saw Kay

has a certain proposition as its denotation, where “denotation” is a technical term of semantics, on par with Frege’s term of art “Bedeutung,” except that Frege stipulated that sentences of his invented language denote/Bedeut truth values. It is tempting to think that “denotation” can also be used, without serious equivocation, to talk about the representations assembled by executing SEMs. But as noted earlier, appeal to (I-concepts of) truth values should already raise eyebrows if the task is to describe human I-languages in terms of the sparest descriptively adequate typology. Appeal to (I-concepts of) propositions, in order to accommodate (4a), should raise eyebrows high. Does the fact that (6) has an interrogative counterpart already show that (6) is not an instruction to assemble a mere T-concept, or that the constituents of (6) have denotations that can be combined to form a proposition that can be an/the answer to (4a)? If (6) denotes the proposition that Jay saw Kay, then it becomes very tempting to say that one way or another: The SEM of (4a) combines some formal element Q with a complex constituent S that shares its denotation with (6); and the complex expression Q^S denotes (4b), because Q denotes the requisite mapping function µ. From an I-language perspective, this would be to say that Q fetches a concept of µ, and hence that I-concepts include concepts of this sort. And if (7) is related to (6) by some process of abstraction, so that (7) denotes the proposition-part corresponding to “Jay saw _”,

(7) . . . who Jay saw

then it becomes very tempting to say that one way or another: The SEM of (5a) combines some formal element Q with a complex constituent R that shares its denotation with (7); and the complex expression Q^R denotes (5b), because Q denotes the requisite mapping function, with parallel consequences for the space of I-concepts. Many variants on these initial ideas have been proposed, in response to various facts: see among others Groenendijk and Stokhof (1984: chapter 1), Berman (1991: chapter 2), Higginbotham (1996) and Lahiri (2002) for summaries. We focus here on the Hamblin-Karttunen approach because it is simple, it is widely adopted (at least as an idealization), and it illustrates two foundational concerns that apply to at least many of the variants. The first concern has already been noted: Insofar as the approach suggests a specific hypothesis about human I-languages, it suggests a rich typology of I-concepts, even for sentences that seem quite simple. One wonders if the same descriptive work could be done with fewer theoretical distinctions. The issue here is not skepticism about abstracta. We suspect that at least often in the study of linguistic meaning, appeal to propositions is historical residue of an E-language perspective on I-languages that are used to assemble concepts. Populating the domain of denotations with propositions, while retaining the traditional (nonpsychological) notion of denotation, is no substitute for the idea that SEMs are instructions to assemble concepts. Failure to be clear about this is a recipe for positing more typology than needed. For many purposes, economy of typology is not a high priority. But if the goal is to say how human I-languages interface with other aspects of human cognition, then part of the task is to describe the space of possible human I-concepts and not merely to describe a space of possible concepts that might be employed by minds that can ask/answer questions.
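For concreteness, the Hamblin and Karttunen proposals under discussion can be given a toy rendering (ours, not an endorsement), with propositions crudely modeled as sets of possible worlds; the names and worlds are invented:

```python
# A toy rendering (ours) of Hamblin/Karttunen question denotations, with
# propositions modeled as sets (frozensets) of possible worlds.

import itertools

PEOPLE = ["Kay", "Larry", "Mary"]
# A "world" simply records who Jay saw there.
WORLDS = [frozenset(s) for r in range(len(PEOPLE) + 1)
          for s in itertools.combinations(PEOPLE, r)]

def saw(person):
    """The proposition that Jay saw `person`: the set of worlds where he did."""
    return frozenset(w for w in WORLDS if person in w)

# Hamblin-style denotation of 'Who did Jay see?': the set of possible answers.
who_jay_saw = {saw(p) for p in PEOPLE}

# Karttunen's variant: only the answers true at the actual world.
actual = frozenset(["Kay"])
true_answers = {p for p in who_jay_saw if actual in p}
print(len(who_jay_saw), len(true_answers))  # 3 1
```

The sketch also makes the first concern tangible: even this toy construction requires representing sets of sets of worlds, a typology far richer than anything needed for brown cow.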
The second concern is related. Even if one assumes that interrogatives denote questions, in some technical sense, this description of the facts concerning (4a) and (5a) is still not rich enough. For it does not yet distinguish an interrogative SEM from a noninterrogative SEM that has the same denotation. If questions are sets of propositions, then a speaker can label/describe a question without asking one.14 Frege (1879) stressed that one can label/describe a truth value without making (or asking for) an assertion. So for purposes of his invented language, Frege introduced a distinction between markers of force and representations of content—so that the same force marker (e.g., a judgment stroke) could be combined with different content representations, while different force markers could be combined with the same content representation. With regard to human I-languages, there are various analogous hypotheses concerning the left periphery of matrix sentences. One might speculate that declaratives/interrogatives are covert performatives along lines indicated in (8a) or (8b); cp. Ross (1970), Lewis (1970), Lakoff (1972):

(8) a. (I hereby assert that) Jay saw Kay.
    b. (I hereby ask whether) Jay did see Kay.

But in our view, while performatives raise further interesting questions, (8a) and (8b) still exhibit the same declarative mood as (6) or (9a) and (9b):

(9) a. Mary (thereby) asserted that Jay saw Kay.
    b. Mary (thereby) asked whether Jay saw Kay.

Indeed, we think the SEM of (8b) differs in kind from the SEM of any sentence that exhibits declarative mood. One can use (8b) to ask a question. But this just shows, if it needs showing, that the mood of an uttered sentence is (at best) an imperfect indicator of the corresponding speech act’s force; see Note 3 of this chapter. We assume that each grammatical mood—an aspect of certain naturally generable linguistic expressions—is a feature that makes sentences apt for certain uses, in a way that makes this feature neither necessary nor sufficient for the use in question; compare McGinn (1977), Segal (1991). Competent speakers know that (4a) and (5a) are well suited to the task of asking questions, while (6) and (9a) and (9b) are well suited to the task of making claims. A speaker can claim that someone asked a question, with special implications if the speaker performatively claims that he or she is asking a question. But our task here is not to offer any specific account of the complexities concerning the relation of mood to force in human communication. Rather, given the distinction between mood and force, we want to specify the semantic role of mood in a suitably neutral way. From an I-language perspective that aims to keep semantic typology spare, an obvious hypothesis is that a sentence is a bipartite instruction: the main part, whose execution leads to construction of a T-concept, which may be modified by a process corresponding to wh-extraction (see the following discussion), and a second part, associated with the sentential left periphery, whose execution makes an assembled concept fit for a certain kind of use (say, declaration or querying).15 We assume that a T-concept can be used to declare that it applies (to one or more things) or to query whether it applies.16 Likewise, a “wh-concept” can be used to classify, or to query which thing(s) it applies to.
But it may be that before any I-concept can be used in any such way, it must be “fitted” to the relevant cognitive/linguistic performance system. And the relevant systems may differ in ways that require diverse kinds of fitting. Humans can represent and perform speech acts of many sorts (Austin 1962), some of which correspond to “basic” interior actions like endorsing and wondering.17 But in using a T-concept to form a thought and endorse it, a thinker may need to adapt the T-concept in some formal way (leaving the content unchanged) that makes the concept accessible to the biological process of endorsement, whatever that is. We see no reason to assume that T-concepts are essentially tailored to endorsement (or “judgment”). Moreover, even if a T-concept can be directly endorsed, in using a T-concept to wonder if it applies—or put another way, to wonder whether it is to be endorsed—a thinker may need to make the T-concept formally accessible to the biological process of wondering. Given a system that can systematically combine fetchable I-concepts by means of I-operations, it would be amazing if the resulting products came directly off the assembly line in a ready-for-endorsing/wondering format. The relevant interfaces may not be uniform and transparent, as if endorsing/wondering were simply a matter of copying an assembled concept into a suitable declaration/query workspace. So even if some interior actions can be performed directly on concepts assembled by executing semantic instructions, it may be that for at least some of the action types imperfectly correlated with grammatical mood, performing actions of those types requires additional preparation of grammatically assembled concepts, in which case, grammatical mood may itself be an aspect of a complex sentential instruction for how to build a concept and prepare it for a certain kind of use; compare Segal (1991).
Though to repeat, a concept can be prepared (i.e., ready) for a certain use—say, by applying the I-operation associated with a given mood—yet not be so used, and a concept might be so used without being prepared in this moody way. Correlatively, our suggestion is that the edge of a sentence is important in preparing a concept for a given use, be it to utter a command, ask a question, or make a statement. An edge provides a locus for directing “adjustment” of a sub-sentential concept in an appropriate way. For simplicity, suppose that declarative mood is the instruction DECLARE, while interrogative mood is the instruction QUERY. We take no stand on the details of how DECLARE is related to acts of endorsing propositions—as opposed to (say) entertaining hypotheses, or stating the antecedent of a conditional—or how QUERY is related to acts of seeking information, as opposed to (say) asking rhetorical questions, or merely giving voice to uncertainty. Perhaps subdistinctions will be required. But as a potential analogy, a common view is that external arguments of verbs are associated with a relational concept, AGENT-OF, that co-classifies segregatable thematic participants: CAUSERS, EXPERIENCERS, and so on.18 It may be that DECLARE likewise munges segregatable speech acts. And for the moment, it is enough to envision a procedure that can generate instructions of the form shown in (10a) and (10b):

(10) a. DECLARE: CLOSE-UP: CONJOIN[. . ., . . .]
     b. QUERY: CLOSE-UP: CONJOIN[. . ., . . .]

where executing such instructions leads to assembling T-concepts, like ↑AND[DONE-BY-JAY, SEEING-OF-KAY], and then preparing such concepts for use in declaration or posing a yes/no-query.19 Given this way of setting up the issues, one can—and in the following sections, we do—go on to ask what further typology of instructions is required to accommodate a range of basic facts concerning interrogative SEMs. But we conclude this section with a few remarks about relative clauses, since part of our goal is to offer a syntax/semantics that captures the apparent commonalities across wh-questions and relative clauses. Unsurprisingly, the notion of a T-concept must be extended to include concepts that contain variables. And this extension to relative clauses will further illustrate our claims about the semantic role of movements to edges.
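The bipartite instructions in (10) can also be sketched computationally. This is our illustration under simplifying assumptions: the mood instructions merely tag an assembled T-concept for a kind of use (leaving its content unchanged), and the domain of events and the event concepts are invented:

```python
# A sketch (ours) of the bipartite instructions in (10): build a T-concept by
# CLOSE-UP:CONJOIN, then prepare it for declaration or yes/no-querying.

DOMAIN = ["e1", "e2"]  # a toy domain of events
DONE_BY_JAY = lambda e: e == "e1"
SEEING_OF_KAY = lambda e: e == "e1"

def conjoin(c1, c2):
    """CONJOIN: form AND[C, C'] from two monadic concepts."""
    return lambda x: c1(x) and c2(x)

def close_up(concept):
    """CLOSE-UP: build ↑C, a T-concept applying to all or to none."""
    applies = any(concept(x) for x in DOMAIN)
    return lambda x: applies

def declare(t_concept):
    """DECLARE: prepare the T-concept for use in making a claim."""
    return ("declare", t_concept)

def query(t_concept):
    """QUERY: prepare the same T-concept for use in posing a yes/no-query."""
    return ("query", t_concept)

# (10a) and (10b) share their core: the same T-concept, differently prepared.
t = close_up(conjoin(DONE_BY_JAY, SEEING_OF_KAY))
statement = declare(t)  # 'Jay saw Kay.'
question = query(t)     # 'Did Jay see Kay?'
print(statement[0], question[0], t("e2"))  # declare query True
```

The design choice the sketch highlights is the one argued for in the text: declarative and interrogative counterparts share one concept-building core, with mood contributing only the preparation step.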

11.2.3 Abstraction: T-concepts, Variables, and Relative Clauses

As noted earlier, ↑AND[DONE-BY-JAY, SEEING-OF-KAY] applies to everything or nothing, depending on whether or not something was both done by Jay and a seeing of Kay. Or more briefly, ↑AND[DONE-BY-JAY, SEEING-OF-KAY] applies to x iff (x is such that) Jay saw Kay. For simplicity (cp. Note 8 in this chapter), suppose the embedded conjuncts have singular constituents that can be replaced with variables like v' or v"—mental symbols that might be fetched via grammatical indices like "1" and "2"—as shown in (11a) through (11c):

(11) a. ↑AND[DONE-BY(JAY), SEEING-OF(v')]
     b. ↑AND[DONE-BY(v'), SEEING-OF(KAY)]
     c. ↑AND[DONE-BY(v'), SEEING-OF(v")]

In one sense, the concept SEEING-OF(v') is dyadic; it applies to a pair <e, e'> iff e was an event of seeing e'. But formally, SEEING-OF(v') is a concept of events, just like SEEING-OF-KAY. Likewise, there is a sense in which (11a) is dyadic; it applies to <x, e'> iff x is such that Jay saw e'. Though formally, (11a) is a T-concept, and hence a concept of all or none. Similar remarks apply to (11b). And of course, there is a sense in which (11c) is triadic; it applies to a triple <x, e', e"> iff x is such that e' saw e". Nonetheless, (11c) is a T-concept. So let's say that relative to any assignment A of values to variables: SEEING-OF(v') applies to e iff e was an event of seeing whatever A assigns to v'; (11a) applies to x iff (x was such that) Jay saw whatever A assigns to v'; (11b) applies to x iff whatever A assigns to v' saw Kay; and (11c) applies to x iff whatever A assigns to v' saw whatever A assigns to v". Assignments can also be described as mappings—from variables to entities—that satisfy T-concepts: SAT[A, (11a)] iff Jay saw A(v'); SAT[A, (11b)] iff A(v') saw Kay; SAT[A, (11c)] iff A(v') saw A(v"). And while T-concepts are not concepts of assignments, a capacity to represent assignment-relativization would be valuable for a cluster of reasons. Imagine a mind that can refer to some of its own concepts and form complex concepts like SAT[α, (11c)], where α is either a default (or randomly chosen) assignment, or an assignment that "fits" a given conversation in the sense of assigning the nth thing demonstrated to the nth variable (and assigning values to any speaker/place/time indices in the appropriate way). Such a mind might be able to form concepts like AND[SAT[α, (11c)], RELEVANT(α)]; cp. Kaplan (1978a,b). And it might not be a big leap to representing assignments as differing minimally, in the sense of differing at most with respect to the value of a single variable; cp. Tarski (1935).
Existential quantification over assignments could then be used to convert “variable T-concepts” into complex monadic concepts that apply to some but not all individuals. Consider (12a) and (12b):20

(12) a. ∃A[ASSIGNS(A, x, v') & MINDIF(A, α, v') & SAT(A, ↑AND[DONE-BY(JAY), SEEING-OF(v')])]
     b. ∃A[ASSIGNS(A, x, v') & MINDIF(A, α, v') & SAT(A, ↑AND[DONE-BY(v'), SEEING-OF(KAY)])]

Relative to any choice for α, concept (12a) applies to x iff some assignment A meets three conditions: A assigns x to the first variable; A is otherwise just like α; and A satisfies (11a). More briefly, (12a) applies to x iff Jay saw x. Likewise, (12b) applies to x iff x saw Kay. More interestingly, consider (13a) and (13b):

(13) a. ∃A[ASSIGNS(A, x, v') & MINDIF(A, α, v') & SAT(A, ↑AND[DONE-BY(v'), SEEING-OF(v")])]
     b. ∃A[ASSIGNS(A, x, v") & MINDIF(A, α, v") & SAT(A, ↑AND[DONE-BY(v'), SEEING-OF(v")])]

Relative to α: (13a) applies to x iff x saw α(v"); and (13b) applies to x iff α(v') saw x.21 This implements a limited kind of lambda abstraction, corresponding to extraction of a wh-expression. So let's abbreviate (13a) and (13b), respectively, as in (14a) and (14b):

(14) a. λv'.↑AND[DONE-BY(v'), SEEING-OF(v")]
     b. λv".↑AND[DONE-BY(v'), SEEING-OF(v")]
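The role of assignments and minimal difference in (12) through (14) can be pictured with a small computational sketch. This is our illustration only, not part of the proposal; the function names (mindif, sat_11c, abstract_v1) and the toy model are assumptions made here for concreteness.

```python
# Illustrative model of the assignment machinery in (12)-(14); all names
# and the toy model are ours, not part of the proposal in the text.

def mindif(a, b, var):
    """A and B differ at most on the value of VAR (cp. Tarski 1935)."""
    return all(a[v] == b[v] for v in a if v != var)

def sat_11c(a, saw):
    """SAT[A, (11c)]: A satisfies (11c) iff A(v') saw A(v'')."""
    return (a["v1"], a["v2"]) in saw

def abstract_v1(alpha, x, saw, domain):
    """(13a)/(14a): relative to alpha, applies to x iff x saw alpha(v'')."""
    return any(
        a["v1"] == x and mindif(a, alpha, "v1") and sat_11c(a, saw)
        for a in ({"v1": d, "v2": alpha["v2"]} for d in domain)
    )

domain = {"Jay", "Kay"}
saw = {("Jay", "Kay")}              # toy fact: Jay saw Kay
alpha = {"v1": "Kay", "v2": "Kay"}  # a default assignment

print(abstract_v1(alpha, "Jay", saw, domain))  # True: Jay saw alpha(v'') = Kay
print(abstract_v1(alpha, "Kay", saw, domain))  # False
```

The point of the sketch is that the derived concept applies to some but not all individuals, even though the satisfied T-concept itself applies to all or none.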

But note that such abstraction can be specified in terms of T-concepts, without appeal to truth values or concepts of type t, as Church (1941)'s own discussion makes clear. Concepts of individuals can be used to build concepts of all or none, which can be used to build concepts of assignments, which can be used to build concepts of individuals. So a relative clause like (15a) or (15b) can be a complex instruction, with the embedded sentence as an instruction for how to build a (variable) T-concept:

(15) a. [CP who1 [C [TP who1 saw her2]]]

b. [CP who2 [C [TP he1 saw who2]]]

The higher copy of who—traditionally described as occupying a specifier position of a covert complementizer—is then part of an indexed instruction for how to convert an assembled T-concept into a concept like (14a) or (14b). There are various ways of encoding wh-instructions. But for concreteness, given any index v, let ABSTRACT-v be an instruction for how to prefix a T-concept with the operator "λv". Then a wh-question can be treated as an instruction, of the form shown in (16), for how to build a concept like (14a) or (14b) and prepare it for querying.

(16) QUERY: ABSTRACT-v: CLOSE-UP: CONJOIN[. . ., . . .]
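Instruction sequences like (10) and (16) can be pictured, purely as a toy sketch, as composition of operations on concept representations. Every function name below is our illustrative choice, and concepts are modeled as strings only to make the order of operations visible.

```python
# Toy pipeline for instruction sequences like (10) and (16); concepts are
# modeled as strings, and every name here is our illustrative choice.

def conjoin(c1, c2):
    """CONJOIN: build a conjunctive concept from two conjuncts."""
    return f"AND[{c1}, {c2}]"

def close_up(c):
    """CLOSE-UP: existentially close the event variable, yielding a T-concept."""
    return f"UP-{c}"

def abstract(c, v):
    """ABSTRACT-v: prefix the T-concept with a lambda binder for index v."""
    return f"(lambda {v}.{c})"

def query(c):
    """QUERY: prepare the resulting concept for use in querying."""
    return f"QUERY:{c}"

# (16) read inside-out: CONJOIN, then CLOSE-UP, then ABSTRACT-v, then QUERY.
sem = query(abstract(close_up(conjoin("DONE-BY(v1)", "SEEING-OF-KAY")), "v1"))
print(sem)  # QUERY:(lambda v1.UP-AND[DONE-BY(v1), SEEING-OF-KAY])
```

Read inside-out, the pipeline makes explicit that QUERY applies last, to a concept already closed and abstracted.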

We want to stress that from any plausible I-language perspective, lambda abstraction has to be viewed as a formal operation on mental representations, as opposed to represented semantic values. Indeed, a standard type theory makes it especially clear that wh-movement to the edge is a special kind of concept construction instruction. So we illustrate the point with a more familiar proposal. On the treatment of relative clauses in Heim and Kratzer (1998), the SEM of (17a) has the form shown in (17b), with the relevant semantic types indicated with superscripts:

(17) a. Every dog chased some cat.
     b. [[every dog]^[1^[[some cat]^[2^[t1 chased t2]]]]]

The idea, which can be encoded in various ways, is that the indices on the raised quantifiers correspond to (ordered) lambda-abstraction on open sentences: "Every dog" and "some cat" are quantificational expressions of type <<e,t>, t>; "every dog" combines with an expression (of type <e,t>) that denotes the function determined by abstraction on the first variable applied to a sentence (of type t) that has one free variable; "some cat" combines with an expression that denotes a function (of type <e,t>) determined by abstraction on the second variable applied to a sentence (of type t) that has two free variables. Semantic values for the constituent expressions can be recursively specified as shown in (18), with the relevant semantic types indicated for explicitness.

(18) A standard semantic derivation of a relative clause

Expression                                        Type         Semantic value
[t1 chased t2]                                    t            T iff CHASED(A1, A2)
2^[t1 chased t2]                                  <e,t>        λx.T iff CHASED(A1, x)
[some cat]                                        <<e,t>,t>    λX.T iff ∃x:CAT(x)[Xx = T]
[some cat]^[2^[t1 chased t2]]                     t            T iff ∃z:CAT(z)[CHASED(A1, z)]
1^[[some cat]^[2^[t1 chased t2]]]                 <e,t>        λx.T iff ∃z:CAT(z)[CHASED(x, z)]
[every dog]                                       <<e,t>,t>    λX.T iff ∀y:DOG(y)[Xy = T]
[every dog]^[1^[[some cat]^[2^[t1 chased t2]]]]   t            T iff ∀y:DOG(y)[∃z:CAT(z)[CHASED(y, z)]]

Like all expressions, the expressions of type t have their semantic values relative to assignments of values to variables. But relative to any assignment, there are only two possible values: TRUTH and FALSITY. So, for example, "T iff CHASED(A1, A2)" is shorthand for TRUTH if the thing assigned to the first index chased the thing assigned to the second, and otherwise FALSITY. With this in mind, focus on the two crucial steps, which involve a shift from an expression of type t to an expression of type <e,t>.

(19) Abstraction 1

Expression            Type     Semantic value
[t1 chased t2]        t        T iff CHASED(A1, A2)
2^[t1 chased t2]      <e,t>    λx.T iff CHASED(A1, x)

(20) Abstraction 2

Expression                           Type         Semantic value
[some cat]                           <<e,t>,t>    λX.T iff ∃x:CAT(x)[Xx = T]
1^[[some cat]^[2^[t1 chased t2]]]    <e,t>        λx.T iff ∃z:CAT(z)[CHASED(x, z)]
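The abstraction steps in (19) and (20) can be mimicked computationally. Crucially, in the sketch below the abstraction operator applies to an assignment-relative denotation (a function from assignments), not to any particular truth value. The names and the toy extension are our illustrative assumptions, not Heim and Kratzer's own formalism.

```python
# Assignment-relative denotations in the style of Heim and Kratzer (1998);
# an illustrative sketch under our own simplifying assumptions.

CHASED = {("fido", "felix")}          # toy extension: Fido chased Felix

def t1_chased_t2(a):
    """Type t: relative to assignment a, T iff CHASED(A1, A2)."""
    return (a[1], a[2]) in CHASED

def abstract(n, phi):
    """Index n: shift type t to type <e,t> via modified assignments a[n -> x]."""
    def pred(a):
        return lambda x: phi({**a, n: x})
    return pred

a0 = {1: "fido", 2: "fido"}
# 2^[t1 chased t2]: relative to a0, applies to x iff CHASED(fido, x).
pred = abstract(2, t1_chased_t2)(a0)
print(pred("felix"))  # True
print(pred("fido"))   # False
```

Note that abstract needs access to the whole function t1_chased_t2; given only the truth value of that function at a0, no such shift could be defined.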

The truth values do not, relative to any assignment, determine the relevant functions. And the indices do not denote functions of type <t, <e,t>>; any such function would always map the same truth value to the same function. So replacing "T iff CHASED(A1, A2)" with "λx.T iff CHASED(A1, x)" would border on incoherence if "T iff CHASED(A1, A2)" was really serving as an assignment-relative specification of a truth value. A representation of a truth value is no basis for a representation of a function.22 By contrast, the following psychological hypothesis is perfectly sensible: the I-language expression "[t1 chased t2]" is a (complex) instruction for how to build a concept of type t; likewise "2^[t1 chased t2]" is an instruction for how to build a concept of type <e,t>—viz., by executing "[t1 chased t2]" and converting the resulting concept of type t into a concept of type <e,t>. One can imagine a Tarskian language of thought with sentences like "T iff CHASED(A1, A2)" or "CHASED(A1, A2)", and a psychological operation that converts such sentences into expressions like "λx.T iff CHASED(A1, x)" or "λx.CHASED(A1, x)", where such expressions of a Church-style mentalese have (functional) denotations that can be recursively specified by appealing to sequences and variants.

We have been at pains to avoid assuming this standard typology, in part because we do not see how concepts of truth values could be used to ask questions. (And we suspect that we are not alone, given the tendency to supplement the usual typology with appeals to propositions and sets thereof.) But even on the standard view, it seems that wh-extraction ends up being treated (at least from an I-language perspective) as an instruction to convert a sentential concept into a predicative concept. And given this conception of wh-extraction, one need not treat sentential concepts as concepts of truth values—much less concepts of propositions—in order to accommodate relative clauses and their wh-question counterparts. On the contrary, by treating sentential concepts as concepts of all or none, one is more or less forced into treating wh-extraction as an instruction for how to use a sentential concept to form a concept of individuals that can be used in predication or in querying. Again we see how movement to the edge can have a crucial semantic effect, as expected if "External Merge correlates with argument structure, internal Merge with edge properties, scopal or discourse-related [. . .]" (Chomsky 2005: 14).
Our approach encodes this "duality of semantics" in part via the idea that for both relative clauses and interrogatives, movement to the edge exploits internal Merge to create instructions for how to modify concepts assembled by executing instructions that are formed by external Merge.

11.3 The Syntax of SEMs

As noted earlier, for any given I-language, syntacticians and semanticists share the task of specifying both the implemented procedure that generates boundlessly many SEMs and the principles governing how those SEMs interface with mental representations. In attempting this joint task, one tries to construct the simplest overall account that does justice to the facts. But it is all too easy to simplify a syntactic theory of the generable SEMs by positing a sophisticated (and perhaps unimplementable) mapping from the posited grammatical forms to mental representations. Likewise, one can purchase simplicity in semantics by complicating the syntax in ways that require generative procedures that children cannot acquire. This invites a minimalist strategy urged by Hornstein and Pietroski (2009): start with the sparest remotely plausible conceptions of both syntax and semantics—where the relevant notion of sparsity concerns the posited procedures, not just the generated expressions/representations—and ask if relatively simple interface principles would still accommodate a significant range of "core" phenomena and, if so, try to describe more recalcitrant phenomena as interaction effects between the representations assembled via SEMs and other aspects of cognition, where these need not be limited to pragmatic effects as classically conceived. Having urged a reduction in semantic typology, we now turn to syntax. In our view, a rather simple procedure generates wh-expressions that are a little more complicated than is often acknowledged, with important ramifications for how interrogative SEMs can be interpreted.

11.3.1 Syntactic Assumptions

We adopt a minimalist approach to syntax, according to which the computational primitives should be as few as possible (Chomsky 2005, 2007, 2008, Boeckx 2008, Hornstein 2009, Hornstein and Pietroski 2009). Specifically, while it seems obvious that human I-languages employ a Merge operation to combine various expressions, we do not take this operation as basic. Rather, we assume that when two expressions can be Merged to form a third, this is because the constituents had certain characteristics—say, Edge Features (Chomsky 2008) or Labels (Hornstein 2009). We also assume that Merge manifests in two ways (Chomsky 2004): as External Merge, or "first-merge", of a lexical item with another generable expression; and as Internal Merge, or so-called movement, of an expression with (a copy of) one of its constituents. Empirical facts suggest that another operation establishes certain "agreement" dependencies between lexical items. Generating SEMs may require other basic operations. But we assume this much without further comment. In this section, we suggest a syntax of interrogatives that maps transparently onto logical forms of the sort envisioned in section 11.2, given some independently plausible assumptions about how SEMs are "spelled out".

11.3.1.1 The Syntax of Interrogatives (Cable 2010)

Going back to Baker (1970), most syntactic approaches to questions have assumed that one way or another, a Q(uestion)-morpheme is merged into the left periphery of the sentence—typically, in what would now be called the C(P) domain—with consequent effects, like auxiliary-fronting and/or characteristically interrogative intonation. This makes it tempting to blame characteristically interrogative meaning on the same left-peripheral morpheme. Our view differs, at least in the details. Following Cheng (1991) and Cable (2010), we argue that the crucial interrogative element is not the Q-morpheme itself. Rather, there is a distinct but semantically related "Q(uestion)-particle" (Hagstrom 1998, Kishimoto 2005). While this Q-particle is phonologically empty in many languages, including English, Cable presents intriguing evidence for an overt Q-particle in Tlingit—a Na-Dene language of Alaska, British Columbia and the Yukon. Consider (21), a typical example of a Tlingit wh-question:

(21) Waa sá tudinookw i éesh?
     how Q he.feels your father
     'How is your father feeling?' (Cable 2010: 3)

The following structure shows how questions generally are formed in Tlingit:

(22) [. . . [QP [. . . wh-word. . .] sá] (focus particle) . . . main predicate. . .] (Cable 2010: 4)

(22) illustrates that the wh-word has to precede the main predicate of the wh-question and that the wh-word is also typically initial in the clause. Next, the wh-word is followed by the Q-particle sá. Notice that this particle directly follows either the wh-word or a phrase containing the wh-word. Cable's representation of the syntax, shown in (23), captures the gist of his analysis (Cable 2010: 38).

(23) [CP [QP XP Q] [CP CQ [IP . . . [QP . . . wh-word . . .] . . .]]]

Cable offers evidence that the Q-particle is the real target of the rules/ operations governing question formation. When the wh-word is fronted, so is the entire QP, and Cable argues that nothing about the wh-word itself matters in this respect (Cable 2010). The examples that follow illustrate this claim. In particular, the locality of the wh-word itself is irrelevant: What matters is the locality of the QP to the left periphery, as suggested by (24a) through (24c) (Cable 2010: 33):

(24) a. [NP [CP Wáa kligéiyi] xáat] sá i tuwáa sigóo?
        how it.is.big.REL fish Q your spirit it.is.glad
        "How big a fish do you want?" (A fish that is how big do you want?)

b. *[NP [CP Wáa sá kligéiyi] xáat] i tuwáa sigóo?
   how Q it.is.big.REL fish your spirit it.is.glad

c. *[NP [CP Wáa kligéiyi] sá xáat] i tuwáa sigóo?
   how it.is.big.REL Q fish your spirit it.is.glad

Examples (24a) through (24c) show that wh-operators may be inside islands if and only if the Q-particle is outside the island. Thus, these examples show that it is only the features of the Q-particle that determine whether fronting is possible or not. A related point is that the Q-particle must always front in a wh-question, as in (25a) and (25b) (Cable 2010: 32):

(25) a. [Goodéi woogootx sá]i [has uwajée ti i shagóonich]?
        where.to he.went Q think your parents.ERG
        "Where do your parents think he went?"

b. *[Goodéii [has uwajée [woogootx sá] i shagóonich]]?
   where.to think he.went Q your parents.ERG

This suggests that the Q-particle is central to wh-fronting. For if the wh-fronting rule made reference only to the wh-word, we would expect (25b) to be acceptable. Moreover, a Q-particle must always appear at the right edge of whatever phrase one fronts in a wh-question (Cable 2010: 44–45).

(26) a. [Aadóo yaagú] sá ysiteen?
        who boat Q you.saw
        "Whose boat did you see?"
     b. *[Aadóo sá yaagú] ysiteen?
        who Q boat you.saw

The unacceptability of (26b) lends further support to the hypothesis depicted in (23). For these reasons, we follow Cable's suggestion that languages like Tlingit should inform analyses of languages with no overt Q-particle. So for English, we assume a silent Q-particle that is merged with the wh-word.23 In addition to allowing for a more language-invariant mapping from SEMs to logical forms, this will allow for an attractively simple conception of the underlying generative procedures, for the small cost of positing slightly more elaborate SEMs for languages like English.

11.3.1.2 Spell-Out and the Mapping to Logical Forms

For purposes of offering an explicit proposal about how Cable-style syntactic structures could be "read" as semantic instructions, we adopt a particular minimalist syntax that has independent virtues; see Lohndal (2012b). Various details will not be very important for the main points we are making. Given the facts discussed here, one could equally well adopt slightly different syntactic assumptions, including those of Cable (2010). The point is not that our general treatment of interrogatives and relative clauses requires the particular syntax adopted here but, rather, that a relatively spare syntax would suffice. Initially, we show how the proposed syntax (from Lohndal 2012b) works in some detail. Then we return to more traditional representations for ease of exposition. But as will become clear, our proposed logical forms will reflect the proposed Spell-Out system, which dovetails with a conception of SEMs as instructions to build concepts.

Every theory makes assumptions about how syntactic structures are mapped onto logical forms. It could be that syntactic structures are logical forms and that there effectively is no mapping. But we assume, standardly, that there is a mapping from syntactic structures to logical forms, and that it is an open question what this mapping is. On our view, SEMs are instructions to build concepts. SEMs need to be mapped onto logical forms, and we will call this point of transfer Spell-Out. A standard assumption within Minimalism is that transfer happens in chunks (Uriagereka 1999, Chomsky 2000a, 2001) of a certain size. There is disagreement about the size of the chunks, but the core idea in Lohndal (2012b) is that each application of Spell-Out corresponds to a conjunct in logical form.
One motivation is to enable a relatively transparent mapping to Logical Forms that manifest full "thematic separation" of arguments from predicates (Carlson (1984), Schein (1993), Pietroski (2005a)), by spelling out each argument and the predicate separately.24 Lohndal develops a syntax where there is no categorical distinction between specifiers and complements. The core syntactic relation is that of a head and a nonhead that are in a sisterhood relation. A derivational constraint is proposed that bans two elements that can only be phrases from being set-merged; compare with Moro (2000, 2008), Chomsky (2008), and Narita (2011).

(27) *[XP XP].

There are many views one can take on the nature of this constraint; see Speas (1990: 48), Uriagereka (1999), Moro (2000, 2008), Alexiadou and Anagnostopoulou (2001, 2007), Richards (2010) and Chomsky (2008, 2010) for much discussion. The present system is different in one crucial way from Uriagereka (1999) and Narita (2009, 2011, 2012). The former argues that only left branches can be spelled out separately, whereas the latter argues that there is optionality as to where Spell-Out applies. In the present system, optionality does not exist: Spell-Out always has to target the complement of the head that constitutes the spine of the relevant tree that is being built.

An assumption is that all arguments are introduced by functional projections above the verb, as in Lin (2001), Borer (2005), Bowers (2010). Agents are introduced by Voice0; compare with Kratzer (1996), and Alexiadou, Anagnostopoulou, and Schäfer (2006). It should be clarified that the nature of the label does not really matter; see Chomsky (1995), Harley (1995), Folli and Harley (2007), Pylkkänen (2008), Ramchand (2008), and Sailor and Ahn (2010) for much discussion. The importance of this assumption is that the Agent is introduced by a separate functional projection. Compare the earlier appeal to DONE-BY-JAY as a conjunct in T-concepts. Themes are also introduced by functional heads; compare with Baker (1996), Lin (2001), Borer (2005), and Bowers (2010). One can label the relevant head F0, for lack of a better name. Kratzer (1996) argues against thematic separation for internal arguments, but see Williams (2008), Lohndal (2012a) for replies.
So while we earlier appealed to conjuncts like SEEING-OF-KAY, since appeal to T-concepts is neutral on this aspect of thematic separation, we think such conjuncts should be elaborated as follows: AND[SEEING(e), ∃x[THETA(e, x) & KAY(x)]], where THETA(e, x) is the thematic concept associated with being the internal argument of "see". Likewise, DONE-BY-JAY should be elaborated as ∃x[+THETA(e, x) & JAY(x)], where +THETA(e, x) is the thematic concept associated with being the external argument of "see". Based on these assumptions, consider the structure in (28).25

(28) [VoiceP XPAgent [Voice' Voice [FP XPTheme [F' F [VP V]]]]]

The following discussion shows how this structure gets built and which structures get spelled out during the structure-building process.

Because of (27), when F has merged with VP and then XPTheme wants to merge with the FP phrase, such a merger cannot take place. Instead, for XPTheme to merge with FP, the complement of F needs to be spelled out. Because of the relational character of BPS, F is now a head and can merge with the phrase XPTheme. (29) shows the structure before Spell-Out.

(29) [FP F [VP V]]

(30) is the structure after Spell-Out and merger of the Theme.

(30) [FP XPTheme F]

The next element to be merged is the Voice head, as in (31).

(31) [VoiceP Voice [FP XPTheme F]]

Then the XPAgent wants to be merged into the structure. But VoiceP and XPAgent cannot be merged, so again, the complement of Voice needs to be spelled out. The resulting structure is given in (32).

(32) [VoiceP XPAgent Voice]
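The Spell-Out steps just walked through, from (29) to (32), can be traced with a toy implementation of the ban in (27). This is our illustrative sketch under simplifying assumptions, not Lohndal's actual formalism; the pair encoding of heads and all names are ours.

```python
# Toy derivation implementing the ban in (27): a phrase cannot merge with a
# phrase, so merging XP_Theme / XP_Agent forces Spell-Out of the current
# complement. Representations and names are illustrative only.

spelled_out = []   # transferred chunks, each a future conjunct at logical form

def merge(head, other):
    """Merge HEAD with OTHER, first spelling out HEAD's complement if any."""
    label, complement = head
    if complement is not None:
        spelled_out.append(complement)   # Spell-Out triggered by (27)
    return (label, other)

fp = merge(("F", None), "VP")            # (29): F merged with VP
fp = merge(fp, "XP_Theme")               # (30): Theme merger spells out VP
voicep = merge(("Voice", None), fp)      # (31): Voice merges with FP
voicep = merge(voicep, "XP_Agent")       # (32): Agent merger spells out FP

print(spelled_out)  # ['VP', ('F', 'XP_Theme')]
print(voicep)       # ('Voice', 'XP_Agent')
```

Each entry accumulated in spelled_out will later correspond to one conjunct at logical form.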

The VoiceP in (32) can now be merged with further heads, but as soon as a new phrase wants to be merged, Spell-Out will be triggered again. Let us for concreteness move the subject to the canonical subject position, which in English we take to be SpecTP. First T merges with VoiceP, creating a TP, as shown in (33).

(33) [TP T [VoiceP XPAgent Voice]]

When the subject, XPAgent, moves, Spell-Out is triggered again, so that we end up with (34).26

(34) [TP XPAgent T]

The present system guarantees that each application of Spell-Out corresponds to a conjunct at logical form. That is, the syntax will give us the simplified logical form in (35), where "A1" and "A2" indicate the contributions of arguments/variables.

(35) ∃e[Agent(e, A1) & Theme(e, A2) & verb(e)]
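On the assumption that each spelled-out chunk contributes exactly one conjunct, the assembly of a logical form like (35) can be sketched as follows. The chunk encoding and the function name are our illustrative choices.

```python
# Illustrative assembly of a logical form like (35): each spelled-out chunk
# contributes one conjunct, and the event variable is closed at the end.
# The chunk encoding is our own choice, made for concreteness.

def conjunct(chunk):
    kind, content = chunk
    if kind == "verb":
        return f"{content}(e)"           # verbal chunk: see(e)
    return f"{kind}(e, {content})"       # thematic chunks: Agent, Theme

chunks = [("Agent", "A1"), ("Theme", "A2"), ("verb", "see")]
lf = "∃e[" + " & ".join(conjunct(c) for c in chunks) + "]"
print(lf)  # ∃e[Agent(e, A1) & Theme(e, A2) & see(e)]
```

The one-chunk-one-conjunct mapping is what makes the resulting Logical Form fully thematically separated.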

Of course, this will require what we referred to as theta-binding in section 11.2.1; that is, the argument has to be integrated into a thematic predicate. There are various ways this can be done; see Carlson (1984), Lohndal (2012b) for two views. As we have shown, relative clauses also require an abstraction instruction in order to implement the idea behind lambda-abstraction. That is to say, the Spell-Out system proposed here does not itself suffice for all the semantic computation. But as Pietroski (2011) shows, an even more restricted version of the posited abstraction instruction can accommodate quantification. Furthermore, and this will be crucial below, the IP complement of C has to be spelled out before the wh-element can be merged with the C head. This means that the wh-element will introduce a conjunct at logical form. This raises further questions, to which we now turn, about the role of edges in our system.

Within traditional phase-based systems of the sort developed in Chomsky (2000a, 2001), the notion of an edge plays an important role. A phase head spells out its complement, but both the phase head and its specifier(s) are accessible for further computation (agreement, movement, etc.). On this approach, an edge is important especially for purposes of movement: unless a constituent moves to the edge of a phase head, this constituent will not be able to undergo further movement because of Spell-Out. From this perspective, one can think of edges as escape hatches. But on our approach, the issue is not about escaping. Rather, there are several Spell-Out domains, and these together create instructions. The left peripheral edge in interrogatives makes it possible to modify a sub-sentential concept such that this concept can be used for querying. So one can think of edges as "secondary" instructions. But there is no single instruction that all edges issue. The details depend on the location and relevant content/features of the edge in question.
11.3.2 Argument Interrogatives

Returning now to (36),

(36) Who did Jay see?

consider the syntactic representation in (37), where the wh-expression is indexed, and striking through means that the constituent is phonologically empty.

(37) [CP [QP who QUERY] [C' CQ [IP Jay [I' did [vP ⟨who⟩ [v' ⟨Jay⟩ [v [VP see ⟨who⟩]]]]]]]]
     (the phonologically empty, struck-through copies are shown here in angle brackets)

This structure is nearly ideal for purposes of describing (36) as an instruction for how to build a concept of things Jay saw, and then prepare this concept for use in querying. Recalling section 11.2: Think of the IP as a tensed instruction to build a T-concept that applies to x, relative to some assignment α, iff there was an event of Jay seeing α(1); and think of the CP, with who on the left edge, as an instruction for how to build a concept of things Jay saw by abstracting on the first variable of a concept assembled by executing the IP. The problem is that in (37), QUERY is combined with who, instead of being combined with the entire CP.

On our view, who is not an instruction for how to fetch a concept that gets modified by the operation triggered by QUERY. Rather, who is part of an "edge" instruction that directs abstraction on a T-concept. And we want QUERY to direct modification of the abstracted concept. Moreover, given the syntax offered in section 11.3, Q' cannot be a bipartite instruction for how to form a concept that combines with the concept formed by executing the C' or IP. But suppose that QUERY either raises again, as in (38a), or "reprojects" its grammatical label, as in (38b),

(38) a. [QP QUERY [CP [Q' who QUERY] [CP CQ [IP did Jay see who]]]]

     b. [QP/CP [Q' who QUERY] [CP CQ [IP did Jay see who]]]

where the slash at the top of (38b) reflects the derivational history, in which Q' projects its own label after moving into a specifier position of C. Either way, the idea is that the internal merge of [who QUERY] with the CP is an instruction, also a CP, whose execution (Spell-Out) leads to assembly of the abstracted concept. But (38b) reflects the hypothesis that once [wh QUERY] has internally merged with the CP, QUERY can reproject its own label. And one can say that this relabeling is itself the instruction to prepare the abstracted concept, of things Jay saw, for use in querying. In (38a), the order of operations is directly reflected in the branching structure. But such movement violates plausible constraints on extraction; and on the current proposal, moving QUERY—a head—would not trigger an additional Spell-Out domain. So we favor (38b). But however encoded, the idea is that—perhaps because of a constraint on Spell-Out, of the sort suggested earlier—QUERY is not executed as part of the instruction Q', which has who as a constituent. Rather, QP is an instruction to execute CP and prepare the resulting concept (of things Jay saw) for querying. If it helps, one can think of C as a Spell-Out instruction that permits a certain kind of parataxis: An assembled T-concept becomes available as a target for a certain kind of manipulation, as opposed to merely being available as a conjunct of a larger concept; compare Pietroski (1996)'s neo-Fregean version of Davidson (1968). From this perspective, Q' is an instruction to target the first variable—which would otherwise be treated as a name for whatever α(1) turns out to be—and treat it as the only free variable in a concept whose event variable has been closed; compare Heim and Kratzer (1998), discussed previously.
This does not yet explain how a structure like (38b) can be generated and used as an instruction for how to build a concept of things Jay saw, even given that the embedded IP can be generated and used as an instruction for how to build a relevant T-concept. While the covert first-merged occurrence of who corresponds to an indexed variable—like him1, but with even less predicative content—one wants to know why the overt internal-merged occurrence (at the left periphery) has a different interpretation.27 Put another way, the first-merged occurrence comports with an intuitive view that Beck (2006) defends in detail: by itself, the meaning of who is somehow deficient. This raises the question of why a raised wh-word is part of a more interesting instruction, instead of merely being spelled out (again) as an indexed variable. What role does movement to the edge play, in terms of determining the instruction generated? And posed this way, the question suggests the answer hinted at earlier: executing the instruction Q', thereby manipulating a T-concept in the relevant way, involves production of a second occurrence of the indexed variable (as in familiar formal languages where the quantification over assignment variants is explicit). In short, movement to the edge creates the relevant instruction, which would otherwise not have been generated. So let us say a bit more about the generated instruction. The idea is that spelling out see who—that is, executing this semantic instruction, with who as an internal argument of see—will yield a concept like the following: SEE(e) & ∃x[THEME(e, x) & 1(x)], where 1(x) is a concept that applies, relative to α, to whatever α assigns to the first index. The idea, defended in Pietroski (2011), is that the conceptual template ∃x[THEME(e, x) & φ(x)] is invoked by an aspect of phrasal syntax (viz., being the internal argument of a verb like see).
Correlatively, spelling out the embedded IP in (38a) will yield a T-concept like the following: ↑[∃x[EXPERIENCER(e, x) & JOHN(x)] & SEE(e) & ∃x[THEME(e, x) & 1(x)]]. The C head is then merged, and the QP is moved to SpecCP, yielding (37). So it remains to motivate the reprojection step to (38a) or (38b). In discussing this kind of relabeling operation, Hornstein and Uriagereka (2002) focus on the fact that quantificational determiners can be semantically asymmetric. For example, every cow is an animal neither implies nor is implied by every animal is a cow. As many authors have noted (see, e.g., Larson and Segal 1995), determiners seem like transitive verbs, in taking a pair of arguments in a certain order; compare Montague (1974). While see combines with a Theme-argument before combining with an Experiencer-argument, every apparently combines with a restrictor-argument before combining with a scope-argument. Indeed, Hornstein and Uriagereka speculate that quantifiers raise out of the VP shell because the determiner-plus-restrictor phrase must combine with an expression of the right sort—that is, an expression whose label marks it as a potential external/scope-argument of a determiner phrase (see also Higginbotham and May 1981). Note that in (39),

(39) [VP [DP John] [V’ [V saw] [DP every cow]]]

cow can be read as the internal argument of every, but every has no external argument. So if a SEM that includes every is executable only if this (asymmetric) determiner has an external argument, then (39) is not an executable SEM. In (40),

(40) [VP [DP every cow] [V’ [V saw] [DP John]]]

one might think that saw John can be interpreted as the external argument of every. Initially, this makes it tempting to think that displacement is not required in such cases. But one must not forget the event variable. In (40), the VP-internal every cow is marked as the external argument of saw, and that is presumably part of the explanation for why the cows are represented as experiencers of events of seeing John. So saw John cannot be interpreted as the external argument of every, unless a single generable SEM can be an ambiguous instruction that gets executed in more than one way in the construction of a single thought.28 Put another way, (40) is not an expression/instruction that marks saw John as the external argument of every. This is unsurprising if saw John is an instruction to build a concept of events. So if every cow needs to combine with a more sentential instruction—to build a T-concept, or a concept of truth values—then every cow must displace. And if the raised DP combines (i.e., internally merges) with a phrase headed by any functional element F, with the higher DP in a “specifier” position of F, then the resulting expression will still not be labeled as one in which every has an external argument. By contrast, suppose that every reprojects, yielding the structure shown in (41),

(41) [DP/IP [D’ every cow1] [IP I [VP [DP every cow] [V’ [V saw] [DP John]]]]]

where for simplicity, the index is shown on the determiner’s internal argument, suggesting that the indexed variable is restricted to the cows. The idea is that in (41), the IP is marked as the external argument of every.29 Our suggestion is not that in (42), QUERY is itself a quantifier taking the CP as its external argument.

(42) [QP/CP [Q’ who QUERY] [CP [CQ did] [IP Jay . . . who]]]

But one can, if one likes, think of QUERY as an element that can combine with an indexed wh-word to form a constituent that combines with CP to form a reprojected instruction for how to build a concept as follows: execute CP, thereby obtaining a T-concept that is ready for manipulation; abstract on the indexed variable; and prepare a resulting monadic concept for use in querying. Our “mood-as-instruction” conception of questions retains important aspects of an account that can at first seem very different. Following Karttunen (1977), many authors have argued that question words are—or at least are tightly associated with—existential quantifiers. For a recent interesting argument, involving data from acquisition, see Crain et al. (2009), though Caponigro (2003) argues against this view. In at least one sense, and perhaps two, we agree. For on our account, a raised wh-expression combines with an instruction to form a T-concept. With regard to (36), a concept of events of Jay seeing α(1) is used to build a concept that applies to x iff there was at least one such event; as noted in section 11.2, T-closure has the effect of existentially closing a variable. More important, we take a raised wh-expression to be an instruction to existentially quantify over assignment variants, so that executing the CP is a way of building a concept that applies to x iff there was an event of Jay seeing x. Recent proposals have suggested that wh-words are semantically deficient in the sense that wh-words in all languages have only a focus-semantic value and that their normal semantic value is undefined, as Beck (2006) and Cable (2010) argue. This yields an interesting account of so-called “LF”- or “Focus-intervention effects” across various languages, and it provides a rationale for why wh-words are focused in so many I-languages.
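The quantification over assignment variants can be sketched in the same toy style. In this Python model (our gloss; the domain, events, and helper names are invented), abstracting on index 1 amounts to existentially quantifying over assignments that differ from the given one at most at that index, in the spirit of the MINDIF relation defined in the notes:

```python
# Toy model of abstraction via assignment variants (our gloss, not the
# text's official definitions). variants(a, v) enumerates the assignments
# differing from a at most with respect to index v.

DOMAIN = {"Jay", "Kay", "Bo"}
EVENTS = [
    {"kind": "see", "experiencer": "Jay", "theme": "Kay"},
    {"kind": "see", "experiencer": "Jay", "theme": "Bo"},
]

def t_concept(a):
    # T-closure of "Jay see t1": true iff some event is a Jay-seeing of a(1).
    return any(e["kind"] == "see" and e["experiencer"] == "Jay"
               and e["theme"] == a[1] for e in EVENTS)

def variants(a, v):
    # All assignments differing from a at most with respect to index v.
    for d in DOMAIN:
        a2 = dict(a)
        a2[v] = d
        yield a2

def things_jay_saw(x, a):
    # Abstract on index 1: x falls under the concept iff some variant a2
    # of a with a2(1) = x verifies the T-concept.
    return any(a2[1] == x and t_concept(a2) for a2 in variants(a, 1))

a0 = {1: "Jay"}
assert {x for x in DOMAIN if things_jay_saw(x, a0)} == {"Kay", "Bo"}
```

The final line collects exactly the things Jay saw, which is the monadic concept that the relative-clause-like CP is said to build.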
Rizzi (1997) has also clearly demonstrated that there is a close syntactic relationship between wh-phrases and focus (see also Büring 1997 and Schwarzschild 1999). The examples in (43a) and (43b) (taken from Rizzi 1997: 298) show that a focalized constituent and an interrogative constituent are incompatible:

(43) a. *A chi IL PREMIO NOBEL dovrebbero dare?
        to whom the prize Nobel should.they give
        “To whom THE NOBEL PRIZE should they give?”
     b. *IL PREMIO NOBEL a chi dovrebbero dare?
        the prize Nobel to whom should.they give
        “THE NOBEL PRIZE to whom should they give?”

Rizzi takes the complementary distribution to suggest that wh-phrases and focused phrases move to the same projection in the left periphery. Relatedly, Hagstrom (1998), Yatsushiro (2001), Kratzer and Shimoyama (2002) and Beck (2006) have argued that Q-particles are operators over sets. Drawing on Hagstrom, Cable (2010) suggests that Q-particles are actually variables over choice functions, while on Hagstrom’s theory, Q-particles are existential quantifiers over choice function variables. But from an E-perspective, choice functions are intimately related to existential quantification over assignment variants; and we assume that (on anyone’s view) it takes work to turn talk of operators and choice functions into specific proposals about the procedures that generate SEMs and the procedures that use SEMs to build mental representations. So far from being at odds with existential treatments of wh-expressions, our proposal can be viewed as a way of encoding (via Cable’s syntax) an insight that motivates such treatments. In this context, it is worth noting that cross-linguistically, the interrogation particle is often the disjunction marker; see Kuroda (1965) and Jayaseelan (2001, 2008). Consider the following example, from Japanese:

(44) a. John-ka Bill-ga hon-o kat-ta.
        John-or Bill-NOM books-ACC buy-PAST
        “John or Bill bought books.” (Kuroda 1965: 85)
     b. John-ga hon-o kat-ta-ka?
        John-NOM books-ACC buy-PAST-Q
        “Did John buy books?” (Kuroda 1965: 87)

As Jayaseelan (2008: 4) stresses, this invites an interesting question:

If the question particle is a device of clausal typing, as is standardly assumed since Cheng (1991), any marker should be able to fill this function. Then why is it that in so many languages—with a regularity that is far greater than by chance—the question particle is also the disjunction marker?

Jayaseelan’s own proposal is that “a disjunction that takes a variable as its complement is interpreted as infinite disjunction. This is the meaning of an existential quantifier” (Jayaseelan 2001: 75). Our account captures this idea by treating wh-extraction as an instruction to existentially quantify over assignment variants, thereby accommodating the existential property.
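Schematically, and assuming a finite domain D purely for illustration, the equivalence Jayaseelan exploits can be stated as:

```latex
\[
\underbrace{\varphi(d_1)\vee\varphi(d_2)\vee\cdots\vee\varphi(d_n)}_{\text{disjunction over }D=\{d_1,\ldots,d_n\}}
\;\Longleftrightarrow\;
\exists x\,[\,x\in D\wedge\varphi(x)\,]
\]
```

A disjunction whose disjuncts range over the possible values of a variable is thus equivalent to existential quantification over that variable, which is why a disjunction marker can do duty as a question particle on this view.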

11.3.3 Yes/No-Interrogatives

If one takes interrogatives to be devices for denoting sets of propositions, then it is natural to start with yes/no-interrogatives. For as noted in section 11.1, wh-interrogatives immediately raise questions about which propositions are in the relevant sets. By contrast, we have stressed the parallel between wh-interrogatives and relative clauses. So if our proposal applies straightforwardly to (45),

(45) Did Jay see Kay?

that is a point in our favor. Again, we assume that English has a covert yes/no-operator that other languages express overtly, as in the Malayalam example (46):

(46) John wannu-(w)oo?
     John came-or
     “Did John come?” (Jayaseelan 2001: 67)

More specifically, consider the syntactic representation in (47).30

(47) [QP [Q’ QUERY] [CP [CQ did] [IP Jay [I’ [I did] [vP Jay [v’ v [VP see Kay]]]]]]]

If the IP is an instruction for how to build a T-concept, then the QP can simply be an instruction to execute the IP and prepare the resulting T-concept for use in querying. The relevant query can still concern which things fall under the assembled concept: all or none? In this sense, yes/no-queries are like wh-queries. However, for a yes/no-query, one can equally well ask if at least one thing falls under the assembled T-concept. In this sense, yes/no-queries are special cases of wh-queries, with the relation to existential quantification even more obvious. Given the absence of any wh-word in (47), there is no need to appeal to quantification over sequence variants. Correlatively, there is no need to appeal to reprojection. From this perspective, the need for reprojection—however encoded—arises when QUERY is combined with an abstraction instruction. And this suggests another sense in which yes/no-queries are special cases, as opposed to paradigmatic cases that should shape our general conception of interrogatives. Any T-concept, by virtue of its form, applies to all or none; and such a concept can be used, given suitable preparation, to ask a binary question. But to ask a more refined (nonbinary) question, one needs a concept whose form makes it possible for the concept to apply to some but not all things. If such a concept is formed by abstraction on a T-concept, then any preparation for querying is presumably delayed until after the abstraction operation has been performed. Since that is the only reason to appeal to reprojection in our account of wh-interrogatives, we can avoid such appeal in our account of yes/no-interrogatives. But it does not follow that the latter are somehow semantically basic.
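The "all or none" point can be made concrete with another toy Python sketch (again with invented names and a one-event model): a T-concept ignores its argument, so its extension over any domain is either the whole domain or empty, and querying it is effectively binary.

```python
# Toy illustration (our construal): a T-concept applies to everything or
# to nothing, so "which things fall under it?" collapses into a yes/no query.

DOMAIN = {"Jay", "Kay", "Bo"}
EVENTS = [{"kind": "see", "experiencer": "Jay", "theme": "Kay"}]

def t_did_jay_see_kay(x):
    # Up-closure: applies to x iff some event is a Jay-seeing of Kay;
    # note that x itself plays no role, as with any T-concept.
    return any(e["kind"] == "see" and e["experiencer"] == "Jay"
               and e["theme"] == "Kay" for e in EVENTS)

extension = {x for x in DOMAIN if t_did_jay_see_kay(x)}
# All or none: the extension is the whole domain or empty.
assert extension in (DOMAIN, set())
answer = "yes" if extension else "no"
```

Asking whether at least one thing falls under the concept, as the existential construal suggests, gives the same binary answer as asking whether everything does.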
So while it can initially seem attractive to say that each interrogative denotes a trivial set of answers, thus making it tempting to say that a yes/no-interrogative denotes a less trivial set, we think it is better (all things considered) to treat all interrogatives as instructions for how to assemble a concept and prepare it for use in asking which things fall under the concept. This reflects our suspicion that the parallels between interrogatives and relative clauses run deep, with yes/no-interrogatives being especially simple cases that do not make the parallel obvious.

11.3.4 Adjunct Interrogatives

Often, the semantics for adjunct interrogatives has seemed especially difficult.31 But given an event semantics that associates grammatical arguments with thematic concepts, both arguments and adjuncts correspond to conjuncts of assembled concepts. And this suggests a relatively simple theory of adjunct interrogatives like (48):

(48) Why/how/when did Jay see Kay?

In a language like Tlingit, we see that Q-particles are overt also in adjunct interrogatives (49):

(49) Waa sá sh tudinookw i éesh?
     how Q he.feels your father
     “How is your father feeling?” (Cable 2010: 3)

We assume, therefore, that (48) has the reprojected structure shown in (50).32

(50) [QP/CP [Q’ why QUERY] [CP [CQ did] [IP Jay [I’ [I did] [vP why [vP Jay [v’ v [VP see Kay]]]]]]]]

To a first approximation, the VP in (50) is an instruction for how to build a concept of Jay-saw-Kay events that has a further feature: α(1) is a cause/manner/time of their occurrence. Creating the corresponding T-concept, and then abstracting on the variable, yields a concept of causes/manners/times of Jay-saw-Kay events. Such a concept can then be prepared for the use of asking what falls under the concept. To be sure, pragmatics will play a role with regard to what a satisfactory answer requires. A given event of Jay seeing Kay might have many causes, and there might have been many such events, at different times, each done in a different manner. But likewise, if a speaker asks who Jay saw, pragmatics will play a role with regard to what a satisfactory answer requires. Jay may have seen many people, the vast majority of whom are irrelevant to the speaker’s question, which is not to be confused with the interrogative SEM used to assemble the concept with which the question is asked. Our aim is not to provide a theory of how speakers use concepts in contexts to ask questions that might well be modeled with partitions; compare Higginbotham and May (1981) and Higginbotham (1993). Our aim has been to explain how SEMs can be generated and used as instructions to build concepts that can then be used in many ways. And for these purposes, adjunct-interrogatives pose no special difficulties, given an “eventish” semantics of the sort adopted here.
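On this conjunctivist treatment, an adjunct wh-word simply contributes one more conjunct to the event description, just as arguments do. A minimal Python sketch of the "when" case, with invented event fields:

```python
# Toy eventish model (illustrative names only): the adjunct wh-word adds
# a TIME conjunct to the event description, parallel to argument conjuncts.

EVENTS = [
    {"kind": "see", "experiencer": "Jay", "theme": "Kay",
     "time": "noon", "manner": "furtively"},
]

def when_jay_saw_kay(x):
    # Abstraction over the TIME conjunct: x is a time of a Jay-saw-Kay event.
    return any(e["kind"] == "see" and e["experiencer"] == "Jay"
               and e["theme"] == "Kay" and e["time"] == x for e in EVENTS)

assert when_jay_saw_kay("noon")
assert not when_jay_saw_kay("midnight")
```

Swapping the `time` field for `manner` or a hypothetical `cause` field would model how- and why-interrogatives in the same way, which is the sense in which adjunct interrogatives pose no special difficulty here.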

11.3.5 Multiple wh-Interrogatives

One might, however, think that multiple wh-interrogatives do pose a special difficulty. Examples like (51) have been important in linguistic theory since Baker (1970) and Kuno and Robinson (1972):

(51) Who saw what?

And it may be that such examples present special complications to any account of wh-expressions. But we do not think they weigh against our proposal about the syntax/semantics interface. Initially, one might imagine the syntactic structure in (52).

(52) [QP [Q’ who QUERY] [CP [CQ did] [IP who [I’ [I did] [vP who [v’ [v chase] [QP what QUERY]]]]]]]

But given well-known syntactic arguments, what needs to move; see Lasnik and Uriagereka (1988: 102–104), building on Huang (1982), for a clear summary. While this movement is not triggered by a type/label mismatch—at least not of the sort illustrated by displacement of quantifiers—the familiar idea is that wh-movement corresponds to the creation of a variable. As noted earlier, wh-expressions seem to be semantically bleached in a way that invites a pure existential or free choice analysis. Certainly, what differs from her, in that the former cannot support a deictic interpretation. So let’s suppose that what does indeed raise, as in (53),

(53) [QP/CP [Q’ who QUERY] [CP [QP [Q’ what QUERY]] [C’ [CQ did] [IP who [I’ [I did] [vP who [v’ [v see] [QP what]]]]]]]]

with the following result: The embedded CP, formed by internally merging [what QUERY] with a C’, is an instruction for how to build a concept of things such that α(who) saw something; but this CP remains labeled as such, allowing for subsequent internal merger with [who QUERY]. That is, only the topmost QUERY reprojects, and only one wh-expression per cycle triggers genuine abstraction. On this view, the matrix QP/CP is an instruction for how to build a concept of those who saw something and then prepare this concept for use in querying. This does not yet predict a pair-list reading for (51). Indeed, it raises the question of why (51) differs in meaning from (54):

(54) Who saw something?

But we take it that answers to (54)—for example, “Jay saw something”—are at least partial answers to (51). And we note that a relative clause like (55)

(55) . . . student who saw what

does not have an interpretation according to which it describes pairs such that x is a student and x saw y. Moreover, a declarative like (56a) corresponds to an interrogative like (56b),

(56) a. Jay gave Kay a dollar yesterday.
     b. Who gave who what (and) when?

suggesting the possibility of “n-tuple list” interpretations. We see no reason for thinking that such interpretations can be generated directly by the core recursive operations that characterize I-languages. So we suspect an interaction between a concept with one free variable and pragmatic processing, prompted by occurrences of Q-constituents that did not direct preparation of a concept for querying at the initial interface. If (51) directs the construction of a concept of those who saw something, but the existential is still somehow associated with an unexecuted occurrence of QUERY, then answers like “Jay saw something” might well feel incomplete, compared with answers like “Jay saw a/the turnip”. Put another way, once a concept of those who saw something is formed and prepared for use in querying, it might be used (in combination with other cognitive capacities) to pose two questions: Who falls under that concept, and for whoever falls under it, what did they see? So far as we know, any account of multiple wh-interrogatives will require some such appeal to a cognitive apparatus that goes beyond basic I-operations in order to accommodate the phenomenon of “n-tuple list” interpretations. If this is correct, then theorists may as well retain a simple semantics according to which (51) and (54) need not direct the construction of different concepts, for these expressions differ manifestly in ways that can affect subsequent use of the concept constructed.33 Since May (1977) and Higginbotham and May (1981), much attention has been devoted to interrogatives that also have “regular” quantifiers, as in (57):

(57) Who said everything?

While it is often said that (57) is ambiguous, our own view is that (57) is—like its relative clause counterpart—a univocal instruction for how to build a concept that applies to individuals who said everything. The complication is that who is number neutral, as illustrated in (58a) and (58b),

(58) a. . . . student who said everything that needed to be said
     b. . . . students who said everything that needed to be said

where (58b) has not only a distributive reading but also a collective reading according to which it (directs construction of a concept that) applies to some students if they together said every relevant thing. So one might answer (57) by listing some people who together said everything or some people each of whom said everything. And if each thing got said, it was said by one or more people. But it does not follow that (57) is structurally homophonous, with one reading where everything takes scope over who. On the contrary, absent compelling reasons to the contrary, we assume that the QP/CP position to which [who QUERY] raises must be higher than the position occupied by a regular quantifier that was initially the internal argument of a verb. Finally, we note in passing that all the issues raised here can be raised again with regard to embedded interrogatives, as in (59a) through (59d):

(59) a. . . . asked/wondered/knows whether/if Jay saw Kay
     b. . . . asked/wondered/knows who Jay saw
     c. . . . knows/asked/wondered who saw what
     d. . . . knows/asked/wondered who said everything

A thorough treatment that accommodates the relevant varieties of verbs and clausal complements is well beyond the scope of this chapter; see Lahiri (2002) for discussion. But on our view, (matrix) interrogative mood is an instruction for how to prepare a concept for use in querying, regardless of what speech act is actually performed with that concept. And if interrogative sentences are (perhaps reprojected) QPs, then verbs like ask/wonder/know—words that are themselves instructions to fetch concepts of actions/states that have intentional “contents” (Pietroski 2005a)—can presumably take QP complements. From the perspective urged here, a SEM can be an instruction for how to build a concept of askings whose content is (given by) an interrogatively prepared T-concept, or an interrogatively prepared concept of things Jay saw. This requires a conception of speech acts/mental states whose contents are (given by) concepts as opposed to propositions. But if I-language sentences are not bound to truth values, much less propositions, we see no reason to insist that verbs like ask/wonder/know fetch concepts of actions/states whose contents must be propositional.

11.4 Conclusion

Interrogatives present a wide range of challenges to syntacticians and semanticists. We have argued that adopting an I-language perspective focuses attention on certain theoretical questions and raises others. If one sets aside talk of truth values and communication, one cannot assume that the meaning of an interrogative is the set of (possibly true) answers to the question at hand and that a theory of meaning should reveal how such sets are compositionally determined by interrogative expressions. Rather, we argue, I-languages generate semantic instructions (SEMs) for how to assemble concepts and prepare them for various uses. In particular, an interrogative SEM can be used to build a T-concept and prepare it for use in querying—perhaps with an intervening step of abstracting on some variable in the T-concept, as with relative clauses. In offering a syntax and semantics that conspire to yield these results, we have posited a QUERY element for English, as well as languages that have overt question particles. And we have argued that this element, together with an operation of reprojection in the left periphery, serves as the instruction for how to prepare a concept for use in asking what falls under that concept. On this view, the relation between mood and force is still pragmatically inflected. But interrogatives are indeed apt for the use of asking questions, as opposed to being devices that denote questions. We have illustrated how this proposal applies to yes/no-interrogatives, argument/adjunct-interrogatives, and multiple wh-interrogatives. We have also emphasized some implications of our proposal for the role of edges. We have argued that the left edge provides an instruction for how to assemble and prepare a concept for use, for example, in querying. Edges can be viewed as semantic instructions, and not primarily as escape hatches, as on other approaches.
In this sense, edges are not distinct from non-edges, though the “mode” of their semantic instructions turns out to be somewhat different, if the present paper is on the right track with regard to the “duality” of semantic instructions. Many other questions remain unanswered. In particular, we have offered only hints of how quantifiers, wh-elements, QUERY, and interrogative verbs like wonder interact. Moreover, by setting aside issues of communication, we have bracketed many empirical puzzles concerning the pragmatics of querying. In this sense, we have focused on a small subset of the issues that have animated the study of interrogatives. In compensation, we have emphasized the importance of considering both syntax and semantics in tandem. This is because we think the simplest overall account will posit an expression-generating procedure that employs its elementary operations to generate SEMs that may exhibit a little more structure than the SEMs that would be required given more powerful semantic operations. In comparing theories of I-languages, simplicity of operations—and not just generated structures—counts for a lot.

Notes

1 Thanks to audiences at Harvard University and the Center for the Study of Mind in Nature at the University of Oslo, and to Cedric Boeckx, Seth Cable, Hiroki Narita, Massimo Piattelli-Palmarini, Barry Schein, an anonymous reviewer, the editors, and especially Norbert Hornstein for providing useful feedback on the present ideas.
2 Question marks indicate characteristic interrogative intonation. The absence of final punctuation in (3a)-(3c) indicates the non-sentential status of these clauses, which are superficially like the embedded wh-clauses in (10). For simplicity, we set aside “echo-questions” like “Jay saw Kay?” and “Jay saw who?” that also involve focus of some kind. See Bolinger (1987) and especially Artstein (2002) for discussion, in addition to the canonical references on questions given in the main text.
3 Correlatively, one can introduce a notion of semantic value that is geared to what speakers can do with expressions, and then speak of propositions/questions as the semantic values of declarative/interrogative sentences.
4 While grammatical moods are correlated with certain kinds of speech act force (Austin 1962), we assume that “[. . .] mood is a matter of meaning, whereas force is a strictly pragmatic affair” (McGinn 1977: 304). In suitable contexts, one can use declaratives to issue commands, interrogatives to make assertions, and so on. So on the view urged below, moods are not instructions for how to use (or represent the use of) expressions. Rather, moods direct processes that make sentential concepts available for certain uses.
5 Cp. Chomsky (2000b), Pietroski (2005b). In terms of Marr’s (1982) levels of explanation, positing denotations can be useful in specifying a computable function, thereby raising the question of how that function is computed.
But part of the answer may be that the function initially described is computed in stages, perhaps starting with a “primal sketch” that serves as input to subsequent computations with a different character.
6 We follow the standard convention of using ALL CAPS for concepts.
7 Here and throughout, we take AND to be a concept that can combine with two monadic concepts to form a third. Making adicities explicit, the familiar idea is that COW( ) can combine with a singular concept like BESSIE to form a complete thought; a concept like ABOVE(,) can combine with two singular concepts; and SATISFIES(,) can combine with (1) a concept of an expression that may contain a variable and (2) a concept of an assignment of values to variables. Given a concept of biconditionality—and a capacity to form assignment-relative singular concepts like A[v], which applies to whatever A assigns to v—speakers could form complex concepts like those indicated in the following:
(i) IFF[SATISFIES(A, “cow”:v), COW(A[v])];
(ii) IFF[SATISFIES(A, “brown”:v), BROWN(A[v])];
(iii) IFF[SATISFIES(A, “brown cow”:v), AND[SATISFIES(A, “cow”:v), SATISFIES(A, “brown”:v)]];
(iv) IFF[SATISFIES(A, “brown cow”:v), AND[COW(A[v]), BROWN(A[v])]];
where the first two biconditionals encode (hypothesized) aspects of lexical knowledge, the third encodes an aspect of compositional knowledge, and the fourth encodes a derivable conclusion.
8 Complex expressions of Frege’s (1892) invented language can be viewed as instructions for how to create ideal concepts that are always formed by means of a saturating operation that accommodates concepts of (endlessly) higher types; see Horty (2007) for useful discussion of Frege on definition. But obviously, there is no guarantee that human I-languages can invoke such an operation; see Pietroski (2010, 2011).
9 Recall that Davidson (1967a,b) and Montague (1974) did not present their claims as psychological hypotheses about human I-languages; compare with Higginbotham (1986), Larson and Segal (1995), and Heim and Kratzer (1998). The conjecture that there are Tarski-style theories of truth for such languages, and that such theories can serve as theories of meaning for the I-languages that children acquire, is very bold indeed. It may be more plausible to say that expressions of an I-language can be I-satisfied.
10 Perhaps DONE-BY-JAY has a simple decomposition: DONE-BY(JAY). But event concepts constructed via I-language expressions may not have singular constituents, especially if names like Jay are indexed. Consider, for example, the complex concept ∃X:AND[FIRST(X), PERSON-CALLED-Jay(X)]DONE-BY(X), where FIRST(X) is a monadic concept that applies to whatever is indexed with the first index (cp. Pietroski (2011)). We return to some details concerning variables and assignments of values.
11 While no T-concept is a concept of a truth value, each monadic concept C has two T-closures—↑C and ↓C—that in turn have T-closures that exhibit the Boolean structure required for classical truth tables. For any such concept C and entity x, ↑↑C applies to x iff ↓↓C does; each of these doubly-closed concepts applies to x iff ↑C applies to x—that is, iff C applies to something. Likewise, ↑↓C applies to x iff ↓↑C does, since each of these concepts applies to x iff ↓C does—that is, iff C applies to nothing. Note also that ↑AND[C, C′] applies to x iff something falls under the concept AND[C, C′], which applies to x iff x falls under both C and C′. But AND[↑C, ↑C′] applies to x iff (x is such that) something falls under C and something falls under C′. So ↑AND[BROWN, COW] is a more restrictive concept than AND[↑BROWN, ↑COW].
12 See Partee (2006) for related discussion.
Perhaps the real empirical motivation for this typology lies with facts that invite appeal to (instructions to fetch concepts of) higher types like <<e,t>, <<e,t>, t>>. But if so, that is worth knowing. For at least in many cases, these facts can be accommodated without such appeal (and the consequent threat of overgeneration); see Pietroski (2011).
13 For Hamblin and Karttunen, propositions are individuated roughly as sentences are and hence more finely than sets of possible states of the represented world. This is not the case for Groenendijk and Stokhof, where questions refer to alternative states of the world. Thanks to Barry Schein (personal communication) for reminding us of this difference.
14 If a question is a set like (4b) or (5b), then by representing such a set, one thereby represents a question. One can stipulate that interrogatives present questions in a special way. But then one wants to know what this “way” is and whether the facts can be accommodated just as well by appealing to an equally interrogative way of presenting ordinary entities.
15 This is a modern implementation of ideas going back to Ross (1970), Lewis (1970) and Lakoff (1972). Chomsky (2004) refers to it as the “duality of semantics” thesis, and a lot of recent work in the cartographic tradition has sought to map out the fine details of the left periphery of the clause, cf. Rizzi (1997), Cinque (1999).
16 The notion of a “query” has also been used by Ginzburg (1992, 1996), but in a very different sense. For Ginzburg, a “query” is “a move per (conversational) turn”.
17 See, among many others, Searle (1965, 1969), Bach and Harnish (1982), and Portner (2007) for more on philosophical and linguistic aspects of speech acts.
18 Cf. Baker (1988), Dowty (1991), Pesetsky (1995), and Pylkkänen (2008).
19 From this perspective, one can view a matrix sentence as a tripartite instruction: a “lower” portion that directs construction of a concept of events/states of some kind, a “middle” portion that directs construction of a T-concept, and an “upper” portion that directs a more specific tailoring of the assembled concept to the demands of specific interfacing systems that are often described abstractly in terms of propositions. In expressions that involve quantifier raising, the middle portion may be an instruction for how to construct a series of increasingly complex T-concepts; see Pietroski (2011).
20 MINDIF(A, α, v′) is our way of writing A =v′ α; A differs from α at most with respect to what A assigns to v′.
21 For if MinDif(A, α, v′), then A(v″) = α(v″). And if MinDif(A, α, v″), then A(v′) = α(v′).
22 Though cp. Kobele (2006) on representations of sequences and functions.
23 Such a particle may also be present in sentences like “the person Jay saw”, generalizing from “the person who Jay saw”.
24 Lohndal (2012b: chapter 3) reviews a range of facts that tell in favor of thematically separated logical forms and offers a corresponding syntax, outlined here.
25 This structure has been slightly simplified for present purposes.
26 There are obviously a range of syntactic consequences of the present proposal. The reader may think of obvious challenges, involving basic cases of wh-movement, cases of VP-fronting and serial verb constructions. These can all be analyzed within the present syntax, as Lohndal (2012b) demonstrates.
27 We adopt the standard idealization that intermediate traces of displacement are interpretively inert, and so we focus exclusively on the “head” and “tail” of the “chain”, cf. Chomsky (1995), Fox (2002).
28 One can posit this more complicated mapping from SEMs to concepts, cf. Steedman (1996), Jacobson (1999).
But from an I-language perspective, this is a real cost, even if one can respond to overgeneration concerns by positing further constraints that exclude unattested interpretations. 29 Donati (2006) offers a similar argument for reprojection, in the context of Ital- ian free relatives. Barring additional assumptions (like a Chain Uniformity Con- dition; cf. Chomsky 1995), there is nothing that prohibits this reprojection. Our aim here is not to provide an “instructionist” semantics of quantification. But an obvious thought (see Pietroski (2011)) is that the entire DP is an instruction for how to build a complex concept that applies to some ordered pairs >x, y< iff: every value of “y” is value of “x” (i.e., every internal is an external); the values of “y” are the cows (i.e., the potential values of the restricted variable); and each value of “x” is such that John saw it (i.e., each such value meets the condition imposed by concept obtained by executing the IP). Some ordered pairs meet these conditions iff John saw every cow. 30 Cable (2010: 214, fn. 21) argues that yes/no-interrogatives have a separate par- ticle that may or may not be homophonous with the wh-question Q-particle. Though for Tlingit, he is skeptical that this is a true Q-particle (Seth Cable, personal communication). In any case, we will not assume that English has a separate Q-particle for yes/no-questions. But one can supplement our syntax/ semantics accordingly. 31 For example, Hintikka and Halonen (1995: 637) say, “The theory of [. . .] ‘nor- mal’ wh-questions is by this time firmly under control, unlike that of why-ques- tions, and the explanation of this discrepancy is thought to lie in the complexity of the semantics of why-questions”. 32 Chametzky (1996) argues that adjuncts are label-less, and we have some sym- pathy with his leading ideas in this respect; see Hornstein and Pietroski (2009). 
Though as Hornstein and Nunes (2008) argue, one can preserve Chametzky’s insights, while allowing that adjuncts can be optionally labeled, cp. Hunter (2011). For present purposes, however, it does not matter if the adjunct that combines with QUERY is labeled. For simplicity, we also abstract away from differences concerning location of why as opposed to other adjuncts, see Rizzi (2001), Thornton (2008) for discussion. 33 There are various syntactic issues that we do not address here. One concerns the difference between languages that have multiple-wh fronting and languages that do not; see Boškovic´ (2002) and Stoyanova (2008) for recent analyses. We are 362 The Syntax–Semantics Interface assuming that the logical form does not change depending on whether there is phonological multiple-wh fronting or not.

References

Alexiadou, A. and Anagnostopoulou, E. 2001. The subject-in-situ generalization and the role of case in driving computations. Linguistic Inquiry 32: 193–231.
Alexiadou, A. and Anagnostopoulou, E. 2007. The subject-in-situ generalization revisited. In Interfaces + Recursion = Language? H. M. Gärtner and U. Sauerland (eds.), 31–60. Berlin: Mouton de Gruyter.
Alexiadou, A., Anagnostopoulou, E. and Schäfer, F. 2006. The properties of anticausatives crosslinguistically. In Phases of Interpretation, M. Frascarelli (ed.), 187–211. Berlin: Mouton de Gruyter.
Artstein, R. 2002. Parts of Words: Compositional Semantics for Prosodic Constituents. Doctoral Dissertation, Rutgers University.
Austin, J. L. 1962. How to Do Things With Words. Oxford: Clarendon Press.
Bach, K. and Harnish, R. M. 1982. Linguistic Communication and Speech Acts. Cambridge, MA: MIT Press.
Baker, C. L. 1970. Notes on the description of English questions: The role of an abstract question morpheme. Foundations of Language 6: 197–219.
Baker, M. C. 1988. Incorporation. Chicago, IL: University of Chicago Press.
Baker, M. C. 1996. On the structural positions of themes and goals. In Phrase Structure and the Lexicon, J. Rooryck and L. Zaring (eds.), 7–34. Dordrecht: Kluwer.
Beck, S. 2006. Intervention effects follow from focus interpretation. Natural Language Semantics 14: 1–56.
Berman, S. 1991. On the Semantics and Logical Form of Wh-Clauses. Doctoral Dissertation, University of Massachusetts.
Boeckx, C. 2008. Bare Syntax. Oxford: Oxford University Press.
Bolinger, D. 1987. Echoes reechoed. American Speech 62: 262–279.
Borer, H. 2005. Structuring Sense. Oxford: Oxford University Press.
Bošković, Ž. 2002. On multiple Wh-fronting. Linguistic Inquiry 33: 351–383.
Bowers, J. 2010. Arguments as Relations. Cambridge, MA: MIT Press.
Büring, D. 1997. The Meaning of Topic and Focus—The 59th Street Bridge Accent. London: Routledge.
Cable, S. 2010. The Grammar of Q: Q-Particles, Wh-Movement, and Pied-Piping. Oxford: Oxford University Press.
Caponigro, I. 2003. Free Not to Ask: On the Semantics of Free Relatives and Wh-Words Cross-linguistically. Doctoral Dissertation, University of California, Los Angeles.
Carlson, G. 1984. Thematic roles and their role in semantic interpretation. Linguistics 22: 259–279.
Chametzky, R. A. 1996. A Theory of Phrase Markers and the Extended Base. Albany: State University of New York Press.
Cheng, L. 1991. On the Typology of wh-Questions. Doctoral Dissertation, MIT.
Chomsky, N. 1981. Lectures on Government and Binding: The Pisa Lectures. Dordrecht: Foris.
Chomsky, N. 1986. Knowledge of Language. New York: Praeger.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 2000a. Minimalist inquiries: The framework. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, R. Martin, D. Michaels, and J. Uriagereka (eds.), 89–155. Cambridge, MA: MIT Press.
Chomsky, N. 2000b. New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
Chomsky, N. 2001. Derivation by phase. In Ken Hale: A Life in Language, M. Kenstowicz (ed.), 1–52. Cambridge, MA: MIT Press.
Chomsky, N. 2004. Beyond explanatory adequacy. In Structures and Beyond: The Cartography of Syntactic Structures, A. Belletti (ed.), 104–131. Oxford: Oxford University Press.
Chomsky, N. 2005. Three factors in language design. Linguistic Inquiry 36: 1–22.
Chomsky, N. 2007. Approaching UG from below. In Interfaces + Recursion = Language? H. M. Gärtner and U. Sauerland (eds.), 1–29. Berlin: Mouton de Gruyter.
Chomsky, N. 2008. On phases. In Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, C. Otero, R. Freidin and M. L. Zubizarreta (eds.), 133–166. Cambridge, MA: MIT Press.
Chomsky, N. 2010. Restricting Stipulations: Consequences and Challenges. Talk given at the University of Stuttgart, March 24.
Church, A. 1941. The Calculi of Lambda-Conversion. Princeton, NJ: Princeton University Press.
Cinque, G. 1999. Adverbs and Functional Heads: A Cross-Linguistic Perspective. Oxford: Oxford University Press.
Crain, S., Khlentzos, D., Thornton, R. and Zhou, P. 2009. The Logic of Human Languages. Ms., Macquarie University.
Davidson, D. 1967a. The logical form of action sentences. In The Logic of Decision and Action, N. Rescher (ed.), 81–95. Pittsburgh: University of Pittsburgh Press.
Davidson, D. 1967b. Truth and meaning. Synthese 17: 304–323.
Davidson, D. 1968. On saying that. Synthese 19: 130–146.
Davies, M. 1987. Tacit knowledge and semantic theory: Can a five percent difference matter? Mind 96: 441–462.
Donati, C. 2006. On Wh-head movement. In Wh-Movement: Moving On, L. L. Cheng and N. Corver (eds.), 21–46. Cambridge, MA: MIT Press.
Dowty, D. 1991. Thematic proto-roles and argument selection. Language 67: 547–619.
Dummett, M. 1976. What is a theory of meaning? In Truth and Meaning, G. Evans and J. McDowell (eds.), 67–137. Oxford: Oxford University Press.
Fodor, J. 1975. The Language of Thought. New York: Crowell.
Fodor, J. 2008. LOT2: The Language of Thought Revisited. Oxford: Oxford University Press.
Folli, R. and Harley, H. 2007. Causation, obligation, and argument structure: On the nature of little v. Linguistic Inquiry 38: 197–238.
Fox, D. 2002. Antecedent-contained deletion and the copy theory of movement. Linguistic Inquiry 33: 63–96.
Frege, G. 1879. Begriffsschrift: eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle: Nebert.
Frege, G. 1892/1980. Function and concept. In Translations from the Philosophical Writings of Gottlob Frege, P. Geach and M. Black (eds.). Oxford: Blackwell.
Ginzburg, J. 1992. Questions, Queries and Facts: A Semantics and Pragmatics for Interrogatives. Doctoral Dissertation, Stanford University.
Ginzburg, J. 1996. The semantics of interrogatives. In The Handbook of Contemporary Semantic Theory, S. Lappin (ed.), 175–233. Oxford: Blackwell.
Ginzburg, J. and Sag, I. 2001. Interrogative Investigations. Stanford: CSLI.
Groenendijk, J. and Stokhof, M. 1982. Semantic analysis of wh-complements. Linguistics and Philosophy 5: 175–233.
Groenendijk, J. and Stokhof, M. 1984. Studies in the Semantics of Questions and the Pragmatics of Answers. Doctoral Dissertation, University of Amsterdam.
Hagstrom, P. 1998. Decomposing Questions. Doctoral Dissertation, MIT.
Hamblin, C. L. 1958. Questions. Australasian Journal of Philosophy 36: 159–168.
Hamblin, C. L. 1973. Questions in Montague English. Foundations of Language 10: 41–53.
Harley, H. 1995. Subjects, Events and Licensing. Doctoral Dissertation, MIT.
Heim, I. and Kratzer, A. 1998. Semantics in Generative Grammar. Malden, MA: Blackwell.
Higginbotham, J. 1985. On semantics. Linguistic Inquiry 16: 547–593.
Higginbotham, J. 1986. Linguistic theory and Davidson's program. In Inquiries into Truth and Interpretation, E. Lepore (ed.), 29–48. Oxford: Basil Blackwell.
Higginbotham, J. 1993. Interrogatives. In The View from Building 20: Essays in Honor of Sylvain Bromberger, K. Hale and S. J. Keyser (eds.), 195–227. Cambridge, MA: MIT Press.
Higginbotham, J. 1996. The semantics of questions. In The Handbook of Contemporary Semantic Theory, S. Lappin (ed.), 361–383. Oxford: Blackwell.
Higginbotham, J. and May, R. 1981. Questions, quantifiers, and crossing. The Linguistic Review 1: 41–80.
Hintikka, J. and Halonen, I. 1995. Semantics and pragmatics for why-questions. The Journal of Philosophy 92: 636–657.
Hornstein, N. 2009. A Theory of Syntax: Minimal Operations and Universal Grammar. Cambridge: Cambridge University Press.
Hornstein, N. and Nunes, J. 2008. Adjunction, labeling and bare phrase structure. Biolinguistics 2: 57–86.
Hornstein, N. and Pietroski, P. 2009. Basic operations: Minimal syntax-semantics. Catalan Journal of Linguistics 8: 113–139.
Hornstein, N. and Uriagereka, J. 2002. Reprojections. In Derivation and Representation in the Minimalist Program, S. D. Epstein and T. D. Seely (eds.), 107–132. Malden, MA: Blackwell.
Horty, J. 2007. Frege on Definitions. Oxford: Oxford University Press.
Huang, C.-T. J. 1982. Logical Relations in Chinese and the Theory of Grammar. Doctoral Dissertation, MIT.
Hunter, T. 2011. Syntactic Effects of Conjunctivist Semantics: Unifying Movement and Adjunction. Amsterdam: John Benjamins.
Jacobson, P. 1999. Towards a variable-free semantics. Linguistics and Philosophy 22: 117–184.
Jayaseelan, K. A. 2001. Questions and question-word incorporating quantifiers in Malayalam. Syntax 4: 63–93.
Jayaseelan, K. A. 2008. Question Particles and Disjunction. Ms., The English and Foreign Languages University.
Kaplan, D. 1978a. Dthat. In Syntax and Semantics, P. Cole (ed.), 221–243. New York: Academic Press.
Kaplan, D. 1978b. On the logic of demonstratives. Journal of Philosophical Logic 8: 81–98.
Karttunen, L. 1977. Syntax and semantics of questions. Linguistics and Philosophy 1: 3–44.
Kishimoto, H. 2005. Wh-in-situ and movement in Sinhala questions. Natural Language & Linguistic Theory 23: 1–51.
Kobele, G. 2006. Generating Copies: An Investigation into Structural Identity in Language and Grammar. Doctoral Dissertation, University of California, Los Angeles.
Kratzer, A. 1996. Severing the external argument from the verb. In Phrase Structure and the Lexicon, J. Rooryck and L. Zaring (eds.), 109–137. Dordrecht: Kluwer.
Kratzer, A. and Shimoyama, J. 2002. Indeterminate Pronouns: The View from Japanese. Ms., University of Massachusetts.
Kuno, S. and Robinson, J. T. 1972. Multiple Wh questions. Linguistic Inquiry 3: 463–487.
Kuroda, S.-Y. 1965. Generative Grammatical Studies in the Japanese Language. Doctoral Dissertation, MIT.
Lahiri, U. 2002. Questions and Answers in Embedded Contexts. Oxford: Oxford University Press.
Lakoff, G. 1972. Linguistics and natural logic. In Semantics of Natural Language, G. Harman and D. Davidson (eds.), 545–665. Dordrecht: D. Reidel.
Larson, R. K. and Segal, G. 1995. Knowledge of Meaning. Cambridge, MA: MIT Press.
Lasnik, H. and Uriagereka, J. 1988. A Course in GB Syntax. Cambridge, MA: MIT Press.
Lepore, E. and Ludwig, K. 2007. Donald Davidson's Truth-Theoretic Semantics. Oxford: Oxford University Press.
Lewis, D. 1970. General semantics. Synthese 22: 18–67.
Lin, T. 2001. Light Verb Syntax and the Theory of Phrase Structure. Doctoral Dissertation, University of California, Irvine.
Lohndal, T. 2012a. Towards the end of argument structure. In The End of Argument Structure? M. C. Cuervo and Y. Roberge (eds.), 155–184. Bingley: Emerald.
Lohndal, T. 2012b. Without Specifiers: Phrase Structure and Events. Doctoral Dissertation, University of Maryland.
Marr, D. 1982. Vision. New York: Freeman.
May, R. 1977. The Grammar of Quantification. Doctoral Dissertation, MIT.
McGinn, C. 1977. Semantics for nonindicative sentences. Philosophical Studies 32: 301–311.
Montague, R. 1974. Formal Philosophy. New Haven: Yale University Press.
Moro, A. 2000. Dynamic Antisymmetry. Cambridge, MA: MIT Press.
Moro, A. 2008. Rethinking Symmetry: A Note on Labelling and the EPP. Ms., Vita-Salute San Raffaele University.
Narita, H. 2009. Multiple transfer in service of recursive Merge. Paper presented at GLOW XXXII, Nantes.
Narita, H. 2011. Phasing in Full Interpretation. Doctoral Dissertation, Harvard University.
Narita, H. 2012. Phase cycles in service of projection-free syntax. In Phases: Developing the Framework, Á. J. Gallego (ed.), 125–172. Berlin: Mouton de Gruyter.
Partee, B. H. 2006. Do we need two basic types? In 40-60 Puzzles for Manfred Krifka, H.-M. Gärtner, S. Beck, R. Eckardt, R. Musan and B. Stiebels (eds.). Online: www.zas.gwz-berlin.de/publications/40-60-puzzles-for-krifka/.
Pesetsky, D. 1995. Zero Syntax. Cambridge, MA: MIT Press.
Pietroski, P. 1996. Fregean innocence. Mind & Language 11: 338–370.
Pietroski, P. 2005a. Events and Semantic Architecture. Oxford: Oxford University Press.
Pietroski, P. 2005b. Meaning before truth. In Contextualism in Philosophy, G. Preyer and G. Peters (eds.), 253–300. Oxford: Oxford University Press.
Pietroski, P. 2008. Minimalist meaning, internalist interpretation. Biolinguistics 2: 317–341.
Pietroski, P. 2010. Concepts, meanings, and truth: First nature, second nature, and hard work. Mind & Language 25: 247–278.
Pietroski, P. 2011. Minimal semantic instructions. In The Oxford Handbook of Linguistic Minimalism, C. Boeckx (ed.), 472–498. Oxford: Oxford University Press.
Portner, P. 2007. Instructions for interpretation as separate performatives. In On Information Structure, Meaning and Form, K. Schwabe and S. Winkler (eds.), 407–426. Amsterdam: John Benjamins.
Pylkkänen, L. 2008. Introducing Arguments. Cambridge, MA: MIT Press.
Ramchand, G. 2008. Verb Meaning and the Lexicon: A First Phase Syntax. Cambridge: Cambridge University Press.
Richards, N. 2010. Uttering Trees. Cambridge, MA: MIT Press.
Rizzi, L. 1997. The fine structure of the left periphery. In Elements of Grammar, L. Haegeman (ed.), 281–337. Dordrecht: Kluwer.
Rizzi, L. 2001. On the position "Int(errogative)" in the left periphery of the clause. In Current Studies in Italian Syntax, G. Cinque and G. Salvi (eds.), 287–296. Oxford: Elsevier.
Ross, J. R. 1970. On declarative sentences. In Readings in English Transformational Grammar, R. Jacobs and P. Rosenbaum (eds.), 222–272. Washington, DC: Georgetown University Press.
Sailor, C. and Ahn, B. 2010. The Voices in Our Heads. Talk given at Morphological Voice and its Grammatical Interfaces, June 25.
Schein, B. 1993. Plurals and Events. Cambridge, MA: MIT Press.
Schwarzschild, R. 1999. GIVENness, Avoid F and other constraints on the placement of focus. Natural Language Semantics 7: 141–177.
Searle, J. R. 1965. What is a speech act? In Philosophy in America, M. Black (ed.), 221–239. Ithaca: Cornell University Press.
Searle, J. R. 1969. Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge University Press.
Segal, G. 1991. In the mood for a semantic theory. Proceedings of the Aristotelian Society 91: 103–118.
Speas, M. J. 1990. Phrase Structure in Natural Language. Dordrecht: Kluwer.
Stainton, R. J. 1999. Interrogatives and sets of answers. Crítica, Revista Hispanoamericana de Filosofía 31: 75–90.
Steedman, M. 1996. Surface Structure and Interpretation. Cambridge, MA: MIT Press.
Stoyanova, M. 2008. Unique Focus: Languages Without Multiple wh-Questions. Amsterdam: John Benjamins.
Tarski, A. 1935. Der Wahrheitsbegriff in den formalisierten Sprachen. Studia Philosophica 1: 261–405.
Thornton, R. 2008. Why continuity. Natural Language & Linguistic Theory 26: 107–146.
Uriagereka, J. 1999. Multiple spell-out. In Working Minimalism, S. D. Epstein and N. Hornstein (eds.), 251–282. Cambridge, MA: MIT Press.
Williams, A. 2008. Patients in Igbo and Mandarin. In Event Structures in Linguistic Form and Interpretation, J. Dölling, T. H. Zybatow and M. Schäfer (eds.), 3–30. Berlin: Mouton de Gruyter.
Yatsushiro, K. 2001. The Distribution of mo and ka and its Implications. MIT Working Papers in Linguistics.

Part C Multilingualism and Formal Grammar

12 Generative Grammar and Language Mixing*

The paper by Benmamoun, Montrul, and Polinsky (BMP; 2013) clearly outlines the importance and relevance of heritage languages for linguistic theory. They make the point that "[. . .] additional perspectives and sources of data can also provide new critical evidence for our understanding of language structure". I completely agree with this. My goal in this brief chapter is to attempt to situate the BMP paper in a somewhat broader theoretical context. In that sense, what follows is much more of an extension than a critique of their paper.

12.1 Some History

BMP correctly point out that the monolingual speaker has been given primacy in the history of theoretical linguistics. An important locus for this priority can be found in the early pages of Aspects of the Theory of Syntax:

Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance. (Chomsky 1965: 3)

Put differently, there is no variation and there are no errors. When a speaker of the language is asked whether sentence X or Y is well formed, the answer will not be influenced by the fact that the speaker might know two other languages as well. Clearly this is an idealization that simply cannot be correct, but it has been retained because it made it easier to construct theories of complex empirical phenomena. It is worth pausing to reflect on this assumption. Here is another rationale for why it makes sense to focus on the monolingual, this time from an interview with Chomsky conducted by François Grosjean in the mid-1980s:

Why do chemists study H2O and not the stuff that you get out of the Charles River? [. . .] You assume that anything as complicated as what is in the Charles River will only be understandable, if at all, on the basis of discovery of the fundamental principles that determine the nature of all matter, and those you have to learn about by studying pure cases. (Cook and Newson 2007: 222)

In essence, then, anything but a monolingual speaker is argued to be too complicated as an object of study. When attempting to discover the underlying principles of the faculty of language, we need to study "pure cases" to ensure that what we discover has not been affected by other factors.

Much has happened since 1965. Today a multilingual perspective is increasingly the norm in linguistics. In addition to their native tongue, most speakers know at least one other language. Nonformal approaches to the study of language are keenly aware of this, and they have uncovered a lot of important data and generalizations, which have also come to play an important role in language instruction and in our understanding of the effects that knowing one or more languages has on language learning in general. Within theoretical linguistics, however, multilingualism has not been the norm. There has been some work uncovering formal constraints on language mixing1 (Sankoff and Poplack 1981; Woolford 1983; Di Sciullo, Muysken and Singh 1986; Belazi, Rubin, and Toribio 1994; MacSwan 1999, 2000, 2005; Muysken 2000; Myers-Scotton 2002; van Gelderen and MacSwan 2008; González-Vilbazo and López 2011; Åfarli and Subbārāo 2013), but in general, the field at large has not devoted much attention to this area. Much more work has been done on second language acquisition (see Hawkins 2001, White 2003, and Herschensohn and Young-Scholten 2013 for comprehensive overviews), where generative theories have been used to analyze the data since the late 1980s. Other work includes Roeper (1999), who argues for a certain kind of universal bilingualism within every language (see also Cook and Newson 2007), and Kroch (2001), who argues for competing grammars in the context of language change.
BMP contribute to this literature by extending the domain to heritage languages and by drawing on independent work by the three scholars (see references in their original paper).2 Despite the focus on the monolingual speaker in theoretical linguistics, I agree with BMP that there is a lot to learn from looking at other, more complex situations. In the next section, I elaborate on this claim.

12.2 The Importance of Multilingual Speakers

I believe the approach taken in Chomsky (1965) was a correct approach at the time. It made sense to start investigating the tacit competence of a native speaker on the assumption that the native speaker only masters one language. Given the vast number of theoretical and empirical questions that had to be addressed, the task would have been made much more difficult if more complex situations had been taken as the point of departure.

However, a lot of progress has been made since 1965 (see Freidin and Vergnaud 2001, Freidin and Lasnik 2011, Freidin 2012, and Lasnik and Lohndal 2013 for discussion of the historical development of generative grammar). Today we know a lot about the basic operations (see Hornstein and Pietroski 2008, Hornstein 2009 for discussion of this term) of the language faculty. Government and Binding (Chomsky 1981 and a lot of work in the 1980s) was instrumental in establishing a framework for comparative syntax, and in doing so, it basically uncovered a "body of doctrine" in the sense of a set of generalizations that appear to be more or less true. In the past twenty years, the focus has been on rationalizing these generalizations by providing more principled explanations for them (cf. the preceding references). Therefore, one can argue that our understanding of the faculty of language has reached a stage where it is possible to move further to more complex situations, as BMP also briefly mention. Returning to the water metaphor cited in the previous section, Cook and Newson (2007: 224) argue that

[. . .] water is a molecule, H2O, not an atom; if we break it into its constituent hydrogen and oxygen, we are no longer studying water. Purifying the mind into a single language means destroying the actual substance we are studying—the knowledge of language in the human mind.

Thus, they are arguing that we need to look at the actual, more complex situations that most speakers encounter. The core questions for most linguists are the following:

(1) a. What constitutes knowledge of language?
    b. How is knowledge of language acquired?
    c. How is knowledge of language put to use?
    (Chomsky 1986: 3)

A lot of generative work has focused on the first two questions, arguing for an innate biological capacity to acquire human languages. However, there are certain issues that a focus on monolingual speakers will not address. BMP mention questions such as "[. . .] what exactly is the role of input in the development and maintenance of a language during childhood and into adulthood? When language acquisition takes place under reduced input conditions or under pressure from another language in a bilingual environment, which areas of grammar are resilient and which ones are vulnerable? What underlies the common simplification patterns observed among different heritage languages?" BMP do a great job of illuminating how exactly research on heritage languages can provide at least partial answers to these questions.

Chomsky (1986) argues that the object of inquiry should be I-language, that is, our individual, internal and intensional tacit knowledge of language. From that perspective, it is important to study speakers who master multiple languages to varying degrees. They will provide crucial information about a central question in theoretical linguistics, namely, what a possible human I-language is. This includes identifying the boundaries of I-language, often through studying instances of poverty of stimulus. BMP also point to an important distinction between areas of the grammar that require significant input and use in order to be immune from attrition, and "areas of the grammar which are naturally resilient even without extensive input and use". Such questions cannot be raised unless scholars look at multilingual environments, and it is clear that a complete theory of I-language will need to capture this distinction that BMP emphasize.
The distinction, if real, shows how work on heritage languages and other instances of multilingualism can provide evidence that can in turn illuminate our theories of the faculty of language. For example, why is it the case that certain areas are so resilient whereas other areas are malleable and subject to change throughout the life of a speaker? Is it because resilient areas are part of Universal Grammar, or because of some other property? Future work will hopefully tell us more about this.

12.3 Theoretical Issues in Language Mixing

BMP show a number of cases where data from heritage languages bear on theoretical questions. In several cases, I think the theoretical implications are somewhat more significant than BMP make them sound. In the interest of future work, I want to discuss a couple of examples where I think BMP's work has far-reaching consequences.

One example in section 3 of BMP's paper involves the distinction between lexical and functional categories. They argue that in general, "functional categories are relatively more vulnerable than lexical categories, although there is significant variation among the latter as well". In a sense, this is not so surprising, since many scholars identify the functional domain with language variation (Borer 1984, Chomsky 1995). Various scholars have also put forward theories of the acquisition of functional phrases according to which such phrases are acquired based on input, and Hawkins (2001) also shows that bilingual speakers first produce the lexical argument structure domain of a sentence and then proceed to build additional structure based on evidence in the input. Heritage speakers illustrate the loss of functional structure, and this raises the question of what goes missing and how that happens. Are functional structures such that they require "maintenance" in order to be preserved, so that with too little input, they start to disappear? Or are the features on functional heads somehow different in nature from features on lexical heads? These are interesting questions that the work discussed in BMP raises. Other work by Tsimpli et al. (2004) argues that in first language attrition, semantic features are vulnerable whereas syntactic features stay intact. The results in BMP point in the direction of syntactic features being vulnerable as well, which may be a difference between attrition in heritage speakers and the kind of attrition discussed by Tsimpli et al.
More generally, functional structure has come to play a pivotal role in syntactic research. Recent work on the syntax of argument structure shows how functional structure is crucially part of the argument structure domain as well. Since Harley (1995) and Kratzer (1996), many scholars have argued that the Agent is introduced by a dedicated functional projection (VoiceP or vP) (Alexiadou, Anagnostopoulou and Schäfer 2006). Since then, other work has argued that all arguments are introduced by dedicated projections (Borer 2005, Bowers 2010, Lohndal 2014). Such dedicated projections serve to introduce argument structure templates (or frames, as in Åfarli 2007). They typically have specific meanings attached to them. However, they are different from the functional structure introduced above the Agent (following the standard assumption that the Agent is introduced after all other arguments have been introduced), where functional projections do not introduce any arguments. Rather, they introduce scope operators (such as negation), and there may be dedicated projections serving as landing sites for movement related to focus, topic, wh-movement, and so on (see, e.g., Rizzi 1997 for discussion from a syntactic point of view). If true, this complicates the traditional distinction between lexical and functional categories, especially in the sense that lexical projections are introduced prior to functional projections.

The dissociation between the argument structure domain of a sentence and the rest of the sentence provides a starting point for understanding other instances of language mixing. In particular, in one common variety of language mixing, one language seems to provide the lexical content morphemes whereas another language tends to provide the inflectional morphemes.
An example from Haugen's famous (1953) study is provided in (2), where Norwegian and American English are mixed even within words (assuming with Myers-Scotton 2002 that this is not a case of borrowing):

(2) Så play-de dom game-r
    then play-PAST they game-PL
    "Then, they played games"

In (2), there are content morphemes from English, but the inflectional morphemes are drawn from Norwegian. Thus, there is a clear separation between the two domains. In this sense, heritage languages appear to be similar to other cases of language mixing. A more systematic comparison may uncover further similarities, and probably a few differences. For example, in (2), the first language of the speaker is Norwegian, and the functional structure comes from that language. This is different from heritage speakers, since attrition most commonly affects functional structure, as discussed earlier. Data such as (2) (and there are plenty of similar instances attested in the literature; compare, for example, Muysken 2000) support the dissociation between lexical and functional structure, but they also show that we need to distinguish between different kinds of language mixing and attrition. Future work will hopefully be able to explore this further.

Another example I want to focus on relates to the following statement in BMP:

Syntactic knowledge, particularly the knowledge of phrase structure and word order, appears to be more resilient to incomplete acquisition under reduced input conditions than inflectional morphology is. There is a tendency for heritage language speakers to retain the basic, perhaps universal, core structural properties of their language.

I would argue that this shows that we need a clear dissociation between the structural properties (frames or templates in Åfarli/Borer) and the morphological properties. Distributed Morphology is a theory that has incorporated this with its notion of "late insertion" (see, e.g., Embick and Noyer 2006). In this theory, abstract syntactic structures are generated, and morphological material is inserted after the syntactic computation proper has been finished. BMP do not comment on this similarity, but the tendency in heritage languages supports a theory that distinguishes structures from their morphological realization. The question, then, is how exactly this should be implemented in a formal theory applied to heritage languages and language mixing more generally. Space prevents me from discussing this further here, but see Åfarli (2013) for a suggestion.

An issue that BMP do not explicitly discuss involves the nature of theories of multilingualism. Scholars such as Myers-Scotton (1993, 2002) argue that theories of syntactic code-switching need to assume special machinery in order to account for the relevant data. That is, the theory we have for monolinguals is not adequate, and additional assumptions are required to account for bilingual and multilingual phenomena. On the other hand, MacSwan (1999, 2000, 2005) argues that we should have the same theory of the language faculty regardless of whether the speaker is monolingual or multilingual. Åfarli (2013) supports MacSwan's argument and goes on to develop an I-language theory that accommodates the data discussed by Myers-Scotton without making special assumptions about the grammar of multilingual speakers.3 I believe a tacit assumption in BMP is that a theory of the grammar of heritage languages should not look different from a theory of a monolingual speaker of a language.
That is, the syntactic model that one uses for the monolingual speaker should be identical to the one used for the multilingual speaker. Of course, this is not to say that there are no differences between heritage speakers and other speakers. For example, BMP point to four factors that are important in shaping heritage grammars: differences in attainment, attrition over the life span, transfer from the dominant language, and incipient changes in parental/community input that get amplified in the heritage variety (their section 5). These differences do not entail that, say, the theory of syntax for heritage speakers requires fundamentally different or additional assumptions than does the theory of syntax for a monolingual speaker. However, a more pressing question relates to how attrition over the lifespan should be captured. BMP do not assume that these are performance effects; rather, they seem to view them as real competence effects, witnessed, for example, through the reduced ability of heritage speakers to make acceptability judgments similar to those of monolingual controls (see BMP's paper for more discussion). What exactly is it that goes missing, and how are certain aspects of their grammatical knowledge reanalyzed into a new system? Ideally, we would like a syntactic theory that could capture these effects in a straightforward way. Such a theory does not exist at present, which underscores BMP's claim that heritage languages indeed have a lot to offer theoretical linguistics.

12.4 Concluding Remarks

BMP make a convincing case that heritage languages can illuminate our theories of the faculty of language. In this brief commentary, I have attempted to justify why theoretical linguistics should take multilingual phenomena into account when constructing theories of grammar. I have also discussed some questions that BMP raise in their article, which I believe should play an important role in theoretical linguistics in the years to come.

Notes

* Thanks to Artemis Alexiadou and Elly van Gelderen for helpful comments on this essay.
1 I use language mixing, which includes code-switching, to describe a situation where a speaker produces linguistic outcomes constituted by a mixture of elements from two or more languages (see Gumperz 1982 among others).
2 See also Putnam and Salmons (2013) for a recent study of heritage German from a theoretical perspective.
3 See also González-Vilbazo and López (2011) for some discussion of Myers-Scotton's theoretical perspective.

References

Åfarli, T. A. 2007. Do verbs have argument structure? In Argument Structure, E. Reuland, T. Bhattacharya and G. Spathas (eds.), 1–16. Amsterdam: John Benjamins.
Åfarli, T. A. 2013. A Syntactic Frame Model for the Analysis of Code-Switching Phenomena. Ms., Norwegian University of Science and Technology.
Åfarli, T. A. and Subbārāo, K. V. 2013. Models for language contact: The Dakkhini challenge. Paper presented at Formal Approaches to South Asian Languages, University of Southern California, March 9–10.
Alexiadou, A., Anagnostopoulou, E. and Schäfer, F. 2006. The properties of anticausatives crosslinguistically. In Phases of Interpretation, M. Frascarelli (ed.), 187–212. Berlin: Mouton de Gruyter.
Belazi, H. M., Rubin, E. J. and Toribio, A. J. 1994. Code switching and X-bar theory. Linguistic Inquiry 25: 221–237.
Benmamoun, E., Montrul, S. and Polinsky, M. 2013. Heritage languages and their speakers: Opportunities and challenges for linguistics. Theoretical Linguistics 39: 129–181.
Borer, H. 1984. Parametric Syntax. Dordrecht: Foris.
Borer, H. 2005. Structuring Sense (vols. I & II). Oxford: Oxford University Press.
Bowers, J. 2010. Arguments as Relations. Cambridge, MA: MIT Press.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1986. Knowledge of Language: Its Nature, Origin, and Use. New York: Praeger.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Cook, V. J. and Newson, M. 2007. Chomsky's Universal Grammar: An Introduction. 3rd edition. Malden: Blackwell.
Di Sciullo, A-M., Muysken, P. and Singh, R. 1986. Government and code-mixing. Journal of Linguistics 22: 1–24.
Embick, D. and Noyer, R. 2006. Distributed morphology and the syntax-morphology interface. In The Oxford Handbook of Linguistic Interfaces, G. Ramchand and C. Reiss (eds.), 289–324. Oxford: Oxford University Press.
Freidin, R. 2012. A brief history of generative grammar. In The Routledge Companion to Philosophy of Language, G. Russell and D. Graff Fara (eds.), 895–916. London: Routledge.
Freidin, R. and Lasnik, H. 2011. Some roots of minimalism. In The Oxford Handbook of Linguistic Minimalism, C. Boeckx (ed.), 1–26. Oxford: Oxford University Press.
Freidin, R. and Vergnaud, J-R. 2001. Exquisite connections: Some remarks on the evolution of linguistic theory. Lingua 111: 639–666.
González-Vilbazo, K. and López, L. 2011. Some properties of light verbs in code-switching. Lingua 121: 832–850.
Gumperz, J. 1982. Discourse Strategies. Cambridge: Cambridge University Press.
Harley, H. 1995. Subjects, Events, and Licensing. Doctoral dissertation, MIT.
Haugen, E. 1953. The Norwegian Language in America. Philadelphia: University of Pennsylvania Press.
Hawkins, R. 2001. Second Language Syntax. Malden: Blackwell.
Herschensohn, J. and Young-Scholten, M. (eds.) 2013. The Cambridge Handbook of Second Language Acquisition. Cambridge: Cambridge University Press.
Hornstein, N. 2009. A Theory of Syntax. Cambridge: Cambridge University Press.
Hornstein, N. and Pietroski, P. 2008. Basic operations. Catalan Journal of Linguistics 8: 113–139.
Kratzer, A. 1996. Severing the external argument from the verb. In Phrase Structure and the Lexicon, J. Rooryck and L. Zaring (eds.), 109–137. Dordrecht: Kluwer.
Kroch, A. S. 2001. Syntactic change. In The Handbook of Contemporary Syntactic Theory, M. Baltin and C. Collins (eds.), 699–729. Malden: Blackwell.
Lasnik, H. and Lohndal, T. 2013. Brief overview of the history of generative grammar. In The Cambridge Handbook of Generative Syntax, M. den Dikken (ed.), 26–60. Cambridge: Cambridge University Press.
Lohndal, T. 2014. Phrase Structure and Argument Structure: A Case Study of the Syntax-Semantics Interface. Oxford: Oxford University Press.
MacSwan, J. 1999. A Minimalist Approach to Intra-Sentential Code-Switching. New York: Garland.
MacSwan, J. 2000. The architecture of the bilingual faculty: Evidence from intrasentential code switching. Bilingualism 3: 37–54.
MacSwan, J. 2005. Codeswitching and generative grammar: A critique of the MLF model and some remarks on "modified minimalism". Bilingualism: Language and Cognition 8: 1–22.
Muysken, P. 2000. Bilingual Speech: A Typology of Code-Mixing. Cambridge: Cambridge University Press.
Myers-Scotton, C. 1993. Duelling Languages: Grammatical Structure in Codeswitching. Oxford: Oxford University Press.
Myers-Scotton, C. 2002. Contact Linguistics: Bilingual Encounters and Grammatical Outcomes. Oxford: Oxford University Press.
Putnam, M. T. and Salmons, J. 2013. Losing their (passive) voice: Syntactic neutralization in heritage German. Linguistic Approaches to Bilingualism 3: 233–252.
Rizzi, L. 1997. The fine structure of the left periphery. In Elements of Grammar, L. Haegeman (ed.), 281–337. Dordrecht: Kluwer.
Roeper, T. 1999. Universal bilingualism. Bilingualism: Language and Cognition 2: 169–186.
Sankoff, D. and Poplack, S. 1981. A formal grammar for code-switching. Research on Language and Social Interaction 14: 3–45.
Tsimpli, I., Sorace, A., Heycock, C. and Filiaci, F. 2004. First language attrition and syntactic subjects: A study of Greek and Italian near-native speakers of English. International Journal of Bilingualism 8: 257–277.
van Gelderen, E. and MacSwan, J. 2008. Interface conditions and code-switching: Pronouns, lexical DPs, and checking theory. Lingua 118: 765–776.
White, L. 2003. Second Language Acquisition and Universal Grammar. Cambridge: Cambridge University Press.
Woolford, E. 1983. Bilingual code-switching and syntactic theory. Linguistic Inquiry 14: 520–536.

13 Language Mixing and Exoskeletal Theory
A Case Study of Word-Internal Mixing in American Norwegian*

with Maren Berg Grimstad and Tor A. Åfarli

13.1 Introduction

Most work on formal syntax takes the following assumption as its point of departure:

Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance. (Chomsky 1965: 3)

This idealization has proven a productive research strategy, as it has made it easier to construct theories of complex empirical phenomena (Lohndal 2013). Given the vast number of theoretical and empirical questions that had to be addressed, the task would have been much more difficult if more complex situations had been taken as the starting point.

Today, the situation is different. It has been well established that formal grammars are a very good characterization of the nature of grammatical representations that humans possess. These formal grammars have mostly been constructed on the basis of monolingual data. Nonformal theories since the 1970s have studied what appears to be closer to "real-life" situations, where one speaker knows multiple languages and mixes aspects of these languages to a greater or lesser extent. It is only in the past twenty-five years that a few formally oriented linguists have started to focus on multilingual data, except for the more specialized area of second-language acquisition.

The goal of this chapter is to combine current developments in formal grammar with data from situations where two languages are mixed. We argue that data from language mixing support a specific version of formal grammar, namely, a late-insertion exoskeletal model. This theory has previously only been motivated on the basis of monolingual data, and being able to cover both monolingual and multilingual phenomena would significantly strengthen the model in question.

Specifically, this chapter focuses on language mixing in the heritage language American Norwegian. This is the variety of Norwegian spoken by native speakers of Norwegian who immigrated to the US after the critical period (Lenneberg 1967) had passed, as well as their descendants.1 The immigration period in question lasted for about a century, starting in the 1850s.
These speakers were gradually influenced by English (see Haugen 1953), and the resulting language mixing is characterized by Norwegian structure and functional items paired with certain English content words. The question is how to model this mixing in a way that predicts the possible and impossible patterns.

This chapter is organized as follows: Section 13.2 introduces the empirical domain, American Norwegian, and some general issues when it comes to analyzing language mixing data. In section 13.3, we introduce and discuss our late-insertion exoskeletal model, and in section 13.4, we use this model to analyze mixing data from American Norwegian. Section 13.5 concludes the chapter.

13.2 The Empirical Domain: American Norwegian

This section presents relevant background on language mixing (section 13.2.1) before introducing American Norwegian and some relevant constructions we seek to analyze (section 13.2.2).

13.2.1 The Grammar of Language Mixing

In the literature, there is a controversy regarding how to account for language mixing. In general, there are two positions: one that claims that language mixing requires additional theoretical primitives, and another that claims that the same theory that accounts for monolingual data should account for language mixing as well.

One caveat is in order before we start. A lot of the literature we will rely on discusses code-switching specifically. Code-switching is but one instance of language mixing, and there is substantial debate in the literature regarding whether or not certain types of mixing are to be considered code-switching. The debate is especially heated when it comes to distinguishing between code-switches and loanwords (Poplack 2004, Poplack and Dion 2012). It is straightforward to state that established loanwords are available for monolinguals as well as bilinguals, whereas you have to have some competence in a second language (L2) in order to code-switch. What is not equally straightforward is how—if at all—you can look at a single mixed item spoken by a bilingual and decide for certain whether you are dealing with a loanword or an instance of code-switching. Some scholars argue that because of inappropriate methodology, attempts at distinguishing between the two must fail (Eliasson 1989, Eastman 1992, Johanson 1993, Thomason 2001, Winford 2003, Gardner-Chloros 2009). Others argue that the distinction is fuzzy or part of a continuum (Eliasson 1989, Heath 1989, Bentahila and Davies 1991, Boyd 1993, Myers-Scotton 1993, 2002, 2006, Field 2002, Boztepe 2003, Clyne 2003, Thomason 2003, Treffers-Daller 2005, Haspelmath 2009, Winford 2009).
In this chapter, we are concerned with the formal grammar of cases where one language provides the inflectional morphemes and the word order, whereas the other language at most contributes some of the lexical content morphemes. This specific type of language mixing is considered by several researchers, for example, Poplack and Dion (2012), not to be a form of code-switching. Poplack and Dion (2012) specifically claim that you can be certain you are not dealing with an instance of code-switching when coming upon what they call a lone other-language item, simply because such items are never code-switched. According to them, the process of code-switching only applies to multiword fragments, whereas lone other-language items are always borrowed, either for the nonce, something they dub nonce borrowings, or repeatedly, as established loanwords. They base this mostly on the observation that the integration of single other-language items into the recipient language occurs abruptly, whereas that is not the case for multiword fragments. By integration they mean the reproduction of variable recipient-language patterns. Within the model we propose here, however, the different levels of integration observed between lone other-language items and multiword fragments are explained without assuming that they are subject to two different language mixing processes, such as borrowing and code-switching (more on that in section 13.2.2). Also, as pointed out by, for example, Myers-Scotton (1993) and Haspelmath (2009), the term nonce borrowing is outright contradictory. Regardless of whether or not instances of code-switching can develop into loanwords over time, borrowings, as, for example, Haspelmath (2009) defines them, are completed processes of language change—in other words, by definition, established. Mixing that happens for the nonce, however, would in the same theory be referred to as code-switching.
Following such a distinction, what Poplack and Dion (2012) call nonce borrowings should really be seen as instances of code-switching. Haspelmath (2009) does acknowledge that one might broaden the definition of borrowing in such a way as to include what Poplack and Dion (2012) call nonce borrowing, but he stresses that he cannot see how they can do so without ending up with a definition of borrowing that encompasses all instances of code-switching—effectively making nonce borrowing the term for all types of code-switching. In other words, word-internal code-switching does exist, which means we need a model that can account for it.

If all established loanwords start out as code-switches, as, for example, Myers-Scotton (2002) has suggested, then even established loanwords have to be explained as if they were instances of code-switching, because diachronically, they once were. Even if it is correct, as, for example, Poplack and Dion (2012) claim, that established loanwords do not originate as code-switches, we still have to explain the word-internal language mixing that is the focus of this chapter. Since we cannot easily ascertain whether a lone other-language item we encounter in the data is a loanword or an instance of code-switching, we run the risk of analyzing a specific lone other-language item as an instance of code-switching when, in fact, it is an established loanword. However, since we can be certain that both established loanwords and code-switched lone other-language items exist, we know we need a model that can account for both. If we are dealing with an established loanword, the situation is essentially identical to dealing with a completely monolingual utterance with no language mixing, meaning any syntactic model can account for it. If, on the other hand, a lone other-language item is an instance of code-switching, the list of syntactic models capable of accounting for it grows shorter.
Since we cannot know whether our specific data are code-switches or loanwords, we set aside the discussion regarding labeling and continue to use the more general term language mixing throughout the chapter.

We now turn to another issue that is important for present purposes, namely, whether language mixing phenomena require special grammatical principles or not. Myers-Scotton (1993, 2002) argues that it is impossible to explain language-mixing phenomena without assuming an asymmetry between a matrix language and an embedded language (see also Joshi 1985, Jake, Myers-Scotton and Gross 2002).2 From this perspective, the notions of "matrix language" and "embedded language" are theoretical primitives. In any given utterance, the matrix language is the main language whereas the embedded language is a secondary language. This distinction is used to account for the fact that the matrix language enjoys a more privileged status: It is responsible for major word order phenomena and for providing the inflectional/functional morphemes, whereas the embedded language occasionally contributes lexical content items.

Another approach, which we can label the Null Theory account (Woolford 1983, Mahootian 1993, Belazi, Rubin and Toribio 1994, MacSwan 1999, 2000, 2005, Chan 2003, 2008, González-Vilbazo 2005, and González-Vilbazo and López 2011, 2012), argues that language mixing and unmixed languages are governed by the same principles. That is, there are no constraints or rules that are unique to code-switching that cannot be found in the two individual grammars. Furthermore, there is just one computational system, and this system does not recognize entities such as matrix language or embedded language. An advantage of this perspective is that language mixing is not something peripheral to the study of the language faculty; rather, data from language mixing can inform the study of this faculty (cf. Chan 2008).
However, González-Vilbazo and López (2011: 833) emphasize that the Null Theory assumption does not "necessarily entail that the I-language of code-switchers will be identical to the union of the two grammatical systems: code-switchers may include features drawn directly from Universal Grammar which are absent in the component grammars". We leave this issue open here, as our data do not provide evidence in either direction. Several generative studies of language mixing have appeared, viz. Woolford (1983); Di Sciullo, Muysken, and Singh (1986); Belazi, Rubin, and Toribio (1994); Mahootian and Santorini (1996); MacSwan (1999, 2000, 2005); Muysken (2000); King (2000); Toribio (2001); Chan (2003, 2008); González-Vilbazo (2005); Adger (2006); van Gelderen and MacSwan (2008); González-Vilbazo and López (2011, 2012); and Åfarli and Subbarao (2013).

In this chapter, we side with the scholars who have been arguing in favor of a Null Theory. However, we also attempt to combine the Null Theory with an intuition found in Myers-Scotton's work, namely, that of a matrix language (cf. Åfarli 2015). Although this may sound paradoxical, we demonstrate that recent work in syntactic theory provides us with the tools to merge the Null Theory with insights in Myers-Scotton's work.

13.2.2 Types of Language Mixing and American Norwegian

In the literature, there is reference to roughly three main types of sociolinguistic settings of language contact and mixing, given in (1):

(1) Types of language mixing
    a. Balanced Bilingual Mixing (BBM)
    b. Colonial Influx Mixing (CIM)
    c. Immigrant Community Mixing (ICM)

(1a) is exemplified by children or adults who master (at least) two languages more or less fluently and who mix those languages in their utterances (although possibly only in some situations). Speakers who exhibit (1a) are typically children who grow up as genuine bilinguals with parents who have different languages, or where the parents speak one language at home and the child acquires another language outside the home. An example of (1a) is the speech of a bilingual Chinese child growing up in Norway, who masters both Mandarin Chinese and Norwegian, reported in Åfarli and Jin (2014). (2a) is an example produced by this child, in which Chinese is the main language and Norwegian is the secondary one. The mixed verb phrase in (2a) has Chinese word order and Chinese grammatical properties. (2b) provides the Norwegian translation, and as can be seen, ball would have had a suffix denoting definiteness, gender, and number in Norwegian. In the Chinese-Norwegian sentence, however, there is no suffix, in accordance with Chinese grammar. Note that throughout this chapter, we only gloss the examples with features relevant to the point being made, for ease of exposition.

(2) a. Da na ge ball                    Chinese-Norwegian
       hit that GE ball
       "Hit that ball."
    b. Slå den ballen                   Norwegian
       hit that ball.DEF.M.SG
       "Hit that ball."

Type (1b) is exemplified by situations where the language of a minority colonial master, because of its status and power, influences the majority native language(s) of the colonized people. This is the typical situation during the long period of Western colonization of large parts of the world during the last five hundred years. The influence of English and French on many native languages of Africa can serve as an example; compare Myers-Scotton (1993) and Kamwangamalu (1997). Examples of (1b) are provided, for example, by Zulu–English mixing (data from Kamwangamalu 1997: 47). Zulu is the main language, and English is the secondary language. Clauses with object pronouns show OV order in Zulu, but regular VO order in English. The mixed example in (3) has the Zulu OV order, and the inflectional affixes are also from Zulu:

(3) No mngame zama uku-ba-respect-a     Zulu-English
    no my.friend try to-them-respect
    "No my friend, try to respect them."

Type (1c) is exemplified by situations where a group of people from one language community settles on the native soil of another, larger and more powerful language community, and where the language of the members of the immigrant minority community is influenced by the dominating majority language. The empirical basis of this chapter consists of exactly this situation, namely, American Norwegian. As previously stated, this is the variety spoken by Norwegian immigrants who settled in the US during a hundred-year period starting from the first half of the nineteenth century, as well as their descendants. A lot of material was collected by Einar Haugen in the 1930s (see Haugen 1953) and Arnstein Hjelde in the 1980s (Hjelde 1992). Currently, however, an electronic database called the Corpus of American Norwegian Speech (CANS) is being created at the Text Laboratory at the University of Oslo, including material that has been collected in recent times. It is this newer material that our data come from. This corpus is a rich source of American Norwegian mixing data that is excellent for our purposes. First, it comprises data collected in recent years and therefore contains considerably more instances of language mixing as compared to the earlier data, as the speakers are being ever more influenced by English. Moreover, it contains sound and video files together with transcriptions, which enables us to actually listen to the pronunciation of an inserted English item to determine whether it has a full-fledged American accent or not.

In American Norwegian, Norwegian is the main language, and English is the secondary language. Norwegian is a Verb Second (V2) language, whereas English is not. As expected, American Norwegian clauses show V2, as shown in (6). In addition, tense affixes are Norwegian, and noun phrases exhibit Norwegian syntax and affixes, even when the lexical content morphemes are borrowed from English.
This is shown in (4) through (6). We have altered the transcriptions and used English spelling for English words. These are marked in bold, and importantly, they were uttered with a distinct American accent as opposed to a Norwegian one. The information in brackets after each American Norwegian example is a reference to the speaker in the CANS corpus who uttered that specific phrase.

(4) a. Jeg teach-a # første # grad[e]-en            American Norwegian
       I teach-PAST # first # grade-DEF.M.SG        (coon_valley_WI_07gk)
       "I taught the first grade."
    b. Jeg underviste den første klassen            Norwegian
       I teach-PAST the-M first grade-DEF.M.SG
       "I taught the first grade."

(5) a. Å celebrat[e]-e birthday-en hennes           American Norwegian
       to celebrate-INF birthday-DEF.M.SG hers      (coon_valley_WI_06gm)
       "To celebrate her birthday."
    b. Å feire bursdagen hennes                     Norwegian
       to celebrate-INF birthday-DEF.M.SG hers
       "To celebrate her birthday."

(6) a. Så kan du mow-e litt lawn                    American Norwegian
       then can you mow-INF some lawn-INDEF.SG      (coon_valley_WI_07gk)
       "Then you can mow some lawn."
    b. Så kan du klippe litt plen                   Norwegian
       then can you cut-INF some lawn-INDEF.SG
       "Then you can mow some lawn."

As shown in the preceding examples, the English verbs teach, celebrate, and mow have all received Norwegian affixes. Note that the Norwegian translation of teach, "undervise", receives the inflectional affix -te in the past tense, not the -a used on the mixed verb in the American Norwegian clause. These are both past tense suffixes used in Norwegian, however, and there might well be phonological reasons for why the speaker chose teach-a over teach-te. Similarly, the English nouns grade, birthday, and lawn are all marked for definiteness/indefiniteness, and the noun phrases show Norwegian syntax. We will return to this in section 13.4.

The overall pattern that emerges from these three types of language contact and mixing is the following: In situations of language mixing, one of the languages involved is the main language while the other is the secondary language. The main language provides the overall grammatical structure of the utterances (e.g., as expressed through word order), as well as most of the lexical content morphemes and all the inflectional/functional morphemes. The secondary (or influencing) language occasionally provides lexical content morphemes but not inflectional or functional morphemes. We can display the pattern as in (7):3

(7) a.  L(SEC) + INFL(MAIN)
    b.  L(MAIN) + INFL(MAIN)
    c. *L(SEC) + INFL(SEC) (except in bigger mixed chunks)
    d. *L(MAIN) + INFL(SEC)

It is worth pausing at the exception in (7c), namely, that you do find lexical content morphemes from the secondary language with inflectional morphemes also from the secondary language in bigger mixed chunks. (8) and (9) are examples of this:

(8) a. Åssen det var der in the second world war                American Norwegian
       how it was there in the second world war                 (westby_WI_03gk)
       "How it was there in the second world war."
    b. Åssen det var der i den andre verdenskrigen              Norwegian
       how it was there in the-DEF.M.SG second worldwar-DEF.M.SG
       "How it was there in the second world war."

(9) a. Første fisken vi caught down in the creek                American Norwegian
       first fish-DEF.M.SG we caught down in the creek          (westby_WI_03gk)
       "The first fish we caught down in the creek."
    b. Den første fisken vi tok nede i bekken                   Norwegian
       the-DEF.M.SG first fish-DEF.M.SG we caught down in creek.DEF.M.SG
       "The first fish we caught down in the creek."

In (8), the entire PP in the second world war is in English, displaying English functional elements and lacking the Norwegian definiteness suffix -en. In (9), the entire VP caught down in the creek is in English, and in addition to the English functional elements we find, neither the verb nor the noun displays Norwegian suffixes for tense and definiteness, respectively. This is perfectly in accord with the model we propose, as it only requires the overall mixed phrase, not the internal structure of the phrase, to fit with the Norwegian structure. In other words, the model requires the PP in (8) and the VP in (9) to appear in positions where a Norwegian PP and VP could have appeared, but the internal structure of these phrases may very well be English. Thus, the observed integration discrepancy between single other-language items and multiword fragments reported in Poplack and Dion (2012) follows naturally from the model, leaving us with no reason to assume that they are subject to two different mixing processes. Leaving the bigger mixed chunks aside and coming back to the pattern for word-internal mixing, we look at American Norwegian and provide an account of why (7c) and (7d) do not exist.
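The generalization in (7), together with the bigger-chunk exception just discussed, can be stated as a small decision procedure. The following Python sketch is our own illustrative formalization, not part of the formal model proposed in this chapter; the function name `licit_mixing` and the labels "sec"/"main" are invented for exposition.

```python
# A minimal formalization of the generalization in (7): word-internal mixing
# is licit only when the inflectional morphology comes from the main language.
# The flag in_mixed_chunk marks bigger mixed chunks (whole phrases such as the
# PP in (8) or the VP in (9)), which escape the word-internal constraint.

def licit_mixing(stem_lang: str, infl_lang: str, main: str = "main",
                 in_mixed_chunk: bool = False) -> bool:
    """Return True if a stem/inflection pairing matches (7a-b),
    or falls under the bigger-chunk exception to (7c)."""
    if infl_lang == main:
        return True          # (7a) and (7b): main-language inflection is fine
    if in_mixed_chunk and stem_lang == infl_lang != main:
        return True          # exception to (7c): internally consistent chunk
    return False             # (7c) word-internally, and (7d)

# American Norwegian: English stem + Norwegian inflection, cf. teach-a in (4a)
assert licit_mixing("sec", "main")                      # (7a): attested
assert licit_mixing("main", "main")                     # (7b): unmixed
assert not licit_mixing("sec", "sec")                   # (7c): unattested word-internally
assert licit_mixing("sec", "sec", in_mixed_chunk=True)  # (8)-(9): bigger chunks
assert not licit_mixing("main", "sec")                  # (7d): unattested
```

The point of the sketch is only that the attested/unattested patterns form a simple asymmetric system keyed to the main language, which is the intuition the exoskeletal model of section 13.3 derives structurally.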

13.3 A Late-Insertion Exoskeletal Model

In this section, we outline a new approach to grammar, namely, a late-insertion exoskeletal model. This model will combine work on argument structure with work on the relationship between syntax, morphology, and phonology. In section 13.3.1, we briefly review the main transition from theta roles to structural constraints on argument structure. Section 13.3.2 proposes a specific model of language mixing, which we use to analyze data from American Norwegian in section 13.4.

13.3.1 Advances in Our Understanding of Argument Structure

It is commonly argued that, for example, verbs carry information about their surrounding syntactic structure. This is illustrated in (10) and (11), where each verb contains information about its number of arguments (subject, object, etc.). Underlining of the number denotes the subject.

(10) a. John kicked the ball.
     b. *John kicked        kick: 1, 2

(11) a. Kim gave Michael candy.
     b. *Kim gave Michael.  give: 1, 2, 3

This information is typically known as theta roles (Chomsky 1981) or thematic roles (Gruber 1965, Jackendoff 1990, Carlson 1984). The assumption is that theta roles account for syntactic constraints on argument structure. Since Chomsky (1995), Harley (1995) and Kratzer (1996), many scholars have argued that the Agent is introduced by a dedicated functional projection, VoiceP or vP (Alexiadou, Anagnostopoulou, and Schäfer 2006, Folli and Harley 2007, Merchant 2013), distinguishing between the external argument and all the internal arguments (Williams 1981, Marantz 1984). Since then, other work has argued that all arguments are introduced by dedicated projections (Borer 2005a, b, 2013, Ramchand 2008, Bowers 2010, Lohndal 2012, 2014).

In this chapter, we assume that instead of encoding properties of syntactic structures into the words themselves, the syntactic structures are generated independently of the words. This is inspired by nongenerative construction grammar work, as witnessed, for example, in Goldberg (1995, 2006) (and see Booij 2010 for morphology). A series of scholars have worked on developing a generative neo-constructivist model, for example, van Hout (1996), Borer (2005a, b, 2013), Åfarli (2007), Ramchand (2008), Lohndal (2012, 2014) and Marantz (1997, 2013), to mention some. The most "radical" approach can be called exoskeletal, as the syntactic structure is assumed to provide a skeleton (Borer: template; Åfarli: frame) in which lexical items can be inserted, much like in Chomsky (1957) and in Distributed Morphology (Embick and Noyer 2007). Of the researchers mentioned above in favor of a generative neo-constructivist model, Borer, Marantz, Åfarli, and Lohndal all support an exoskeletal view.

Let us briefly review two arguments in favor of an exoskeletal model. The first involves the variability that verbs display. Many verbs can occur in a range of sentential environments, viz. the examples in (12) from Clark and Clark (1979):

(12) a. The factory horns sirened throughout the raid.
     b. The factory horns sirened midday and everyone broke for lunch.
     c. The police car sirened the Porsche to a stop.
     d. The police car sirened up to the accident site.
     e. The police car sirened the daylight out of me.

It is natural to assume that even native speakers of English who have never heard siren used as a verb can easily interpret these sentences. The examples show that siren can appear with a varying number of arguments and that the core meaning (to produce a siren sound) seems to be maintained in all cases, even though the specific meanings are augmented according to the syntactic environment. This strongly suggests that the meaning of siren cannot come just from the verb itself but that it depends on the meaning of the syntactic construction. This is more in accord with an exoskeletal model than with the mainstream, endoskeletal one, where syntax is generated on the basis of features inherent to lexical heads (e.g., verbs).

Another supporting argument for the exoskeletal view is the flexibility many lexical items display as to what word class they belong to. One example is provided in (13):

(13) a. He ran out the door.
     b. My son outed me to his preschool.
     c. He was desperately looking for an out.

In (13), out surfaces as a preposition (a), a verb (b) and a noun (c). In an endoskeletal model of the grammar, where features inherent to a lexical head determine, amongst other things, what word class category it belongs to, there are two options for accommodating (13). Either out has to be stored in the lexicon as a preposition, a verb and a noun, or, alternatively, one has to assume that one word class is derived from the other. Both solutions are circular and therefore not explanatory: They do not capture the systematic relation between the different versions of out, meaning the verb out and the noun out are no more related in the grammar than, say, the verb out and the noun sofa. An exoskeletal theory fares better, in that one assumes that a category-less primitive, usually referred to as a root, receives a specific word class category (noun, verb, . . .) by virtue of being inserted into a particular syntactic position in a template/frame.

For reasons of space, we cannot review other arguments in favor of an exoskeletal view, but we refer to extensive discussions in the literature; compare with Parsons (1990); Schein (1993); Kratzer (1996); van Hout (1996); Borer (2005a, b, 2013); Levin and Rappaport Hovav (2005); Alexiadou, Anagnostopoulou, and Schäfer (2006); Pietroski (2007); Ramchand (2008); Lohndal (2012, 2014); and Adger (2013). It is important to emphasize that exoskeletal theories cover a family of approaches. The works we cite here differ in details, but they share the claim that syntactic structure is crucial in determining argument structure, a view that has gained traction in recent years. This is clearly conveyed in the following quote from Marantz (2013: 153), where he says that current developments in linguistic theory

. . . have shifted discussion away from verb classes and verb-centered argument structure to the detailed analysis of the way that structure is used to convey meaning in language, with verbs being integrated into the structure/meaning relations by contributing semantic content, mainly associated with their roots, to subparts of a structured meaning representation.

This is in contrast to what has become a hallmark of much work within the Minimalist Program (Chomsky 1995), namely, its lexical nature: Syntactic structure is generated based on features on lexical and functional elements (see Adger 2003 for a textbook illustration where this is pursued in great detail; see also Adger 2010 and Adger and Svenonius 2011). This feature-based approach has also been applied to intra-individual variation; see especially MacSwan (1999, 2000), King (2000), and Adger (2006).

However, features are not unproblematic. Let us briefly consider some issues that emerge. The first is that it is unclear what the nature of features is (see Chomsky 1995, Brody 1997, Pesetsky and Torrego 2004, Zeijlstra 2008, 2012, Adger 2010, Boeckx 2014, Adger and Svenonius 2011 for discussion). What kinds of features are there? Are they binary or privative? Are features the only building blocks in syntax? Despite a lot of work on features, there is no consensus on these issues. Another issue is that several of the syntactic features that are invoked appear to be rather pragmatic, semantic or phonological in nature. This seems to be true of features such as [TOPIC] and [FOCUS], and of EPP-features triggering movement only if there is a semantic effect of the movement (Reinhart 1997, Fox 1995, Chomsky 2001). If the features have a pragmatic, semantic or phonological basis, one could argue that rather than syntacticizing such features, the relevant effects should be analyzed in these components in order to avoid duplicating the analysis across grammatical components (Borer 2005a, b, 2013).

An important tenet of features in the Minimalist Program was to constrain derivations. Taken to its most radical conclusion, this means that grammar is "crash-proof" (Frampton and Gutmann 2002) in the sense that only grammatical structures are licit at the interface. If features do not constrain derivations, there have to be other ways of "filtering out" illicit representations. These can be derivational constraints like Relativized Minimality (Rizzi 1990, 2001, Starke 2001, etc.), or they can be interface constraints that are either phonological or semantic. In order to account for argument structure violations, the exoskeletal view typically relies on an interface account: A combination of language use and conceptual knowledge accounts for the absence of certain meanings (Borer 2005a, b, Åfarli 2007, Nygård 2013, Lohndal 2014). In this sense, the theory is more like Government and Binding (Chomsky and Lasnik 1977, Chomsky 1981, Lasnik and Saito 1984, 1992) than most approaches within the Minimalist Program.

In the exoskeletal model that we develop in the next section, the role of features in syntactic derivations is restricted to formal morphological features of functional nodes; we thus assume that it is desirable to adopt a restrictive view of the role played by features in a derivation. Instead, syntactic templates or frames take on an important role. Importantly, syntactic structures will contain features, but the role played by feature matrices is different for functional elements as compared to lexical content items. Put differently, we assume that the abstract building blocks of syntactic structures are functional features and functional feature matrices, and that functional elements instantiate feature matrices, whereas lexical content items are freely inserted into designated lexical slots.

13.3.2 A Specific Model of Language Mixing

We assume an exoskeletal model, which is a version of Distributed Morphology (DM; Halle and Marantz 1993, Marantz 1997, Harley and Noyer 1999, Embick and Noyer 2007). Rather than assuming a single lexicon that is accessed at the very beginning of the syntactic derivation, DM distributes the content of the lexicon throughout the derivation, comprising three separate lists. This is illustrated in (14):

(14) The Grammar (Embick & Noyer 2007: 301)

The syntactic terminals consist of two types of primitives, namely, roots and features or feature bundles. Roots are items like √TABLE, √CAT or √WALK. There is a discussion amongst the proponents of exoskeletal models as to what the nature of roots is (see, for instance, Harley 2014 and subsequent articles in the same issue of Theoretical Linguistics). We assume that roots are devoid of grammatical features and that they are underspecified for phonology and semantics, following Arad (2005), but the exact nature of roots is not of vital importance to this article. What is important is that we assume that all roots an individual has ever learned, whether that speaker is monolingual or multilingual, are stored together. In other words, roots do not belong to any particular language in the sense of being listed separately or having any sort of language features; rather, knowledge of what language a specific root usually appears in is stored in the encyclopedia, along with other idiosyncratic and idiomatic pieces of information.

Unlike the roots, we assume that the features and feature bundles, known collectively in the DM literature as abstract morphemes, are stored in language-specific lists. This means that someone competent in two languages or varieties will have one list for the abstract morphemes of the one language or variety, another list for those belonging to the other, and a third list encompassing all the roots. Importantly, the features that make up the abstract morphemes are drawn from a universal repository, and part of learning a language or variety is learning which features are "active" in that specific language, as well as how they bundle together, and then storing that information as specific abstract morphemes.
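As an illustration only, the storage architecture just described can be sketched as a simple data structure. The class and feature names below are our own hypothetical shorthand, not the authors' formalism:

```python
# Illustrative sketch (our assumptions, not the authors' formalism) of the
# three-way storage described above: one shared, open-ended list of roots,
# plus language-specific lists of abstract morphemes, i.e. feature bundles
# drawn from a universal repository. All feature names are hypothetical.

UNIVERSAL_FEATURES = {"DEF", "PL", "SG", "M", "F", "N", "PAST", "PRES"}

class Speaker:
    def __init__(self):
        self.roots = set()            # shared across all languages; no final list
        self.abstract_morphemes = {}  # language/variety -> list of feature bundles

    def learn_root(self, root):
        # Roots carry no language label; which language a root usually
        # appears in is encyclopedic knowledge, not grammatical.
        self.roots.add(root)

    def learn_morpheme(self, language, bundle):
        assert bundle <= UNIVERSAL_FEATURES, "features come from a universal repository"
        self.abstract_morphemes.setdefault(language, []).append(bundle)

s = Speaker()
for r in ["TABLE", "CAT", "WALK"]:
    s.learn_root(r)
s.learn_morpheme("Norwegian", {"DEF", "M", "SG"})
s.learn_morpheme("English", {"DEF"})
# The very same bundle can be stored in both language-specific lists:
s.learn_morpheme("Norwegian", {"PRES"})
s.learn_morpheme("English", {"PRES"})
```

Note that nothing prevents adding a new root, while the morpheme lists are constrained by the universal feature repository.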
Thus, if Norwegian makes use of the feature bundle [+X, +Y, +Z], and a particular speaker of Norwegian also speaks another language or variety that makes use of the exact same feature bundle, the same bundle will be stored in both lists of abstract morphemes. Roots, however, are not universal, and we can therefore always add new ones; there is no final list. This distinction between abstract morphemes and roots reflects the classic division between open and closed class items.

There are two options for the generation of syntactic structures (templates/frames): either they are generated by the functional features, or Merge operates freely (Chomsky 2004, Boeckx 2014). We do not take a stand on this particular question. Rather, we want to look into the consequences for language mixing of a model such as the one proposed here, where abstract syntactic frames or templates are generated prior to any lexical insertion. Let us look at an abstract and simplified representation of the argument structure domain of a clause, where [] denotes a feature matrix.

(15) 

This structure builds on Lohndal (2012, 2014), where the abstract verb slot is generated prior to the functional structure introducing arguments. Both the internal and the external arguments are introduced into the structure by way of dedicated functional projections. Other structures, such as those of Borer (2005a, b), Ramchand (2008), Bowers (2010), or Alexiadou, Anagnostopoulou, and Schäfer (2015), are also compatible with what follows; we are simply using (15) for ease of exposition.

We follow DM in assuming that both roots and abstract morphemes are abstract units, which do not get their phonological content until after Spell-Out to the interfaces. Another way of putting it is to say that the exponents of all lexical material (in the wide sense, comprising both functional and content items) are inserted late. This process is known in DM as Vocabulary insertion. The Vocabulary is the second type of list assumed in DM and consists of the phonological exponents of the different roots and abstract morphemes, also known as Vocabulary Items (VIs). This process will prove important when we account for the word-internal cases of language mixing in section 13.4. Another thing that naturally follows from DM is that syntax operates word-internally as well. By making this assumption, we make it easy for the theory to address word-internal language mixing, pace, for example, MacSwan (1999, 2000) and King (2000). In fact, it is impossible to prevent the theory from saying something about word-internal mixing.

The resulting picture is one in which we get an abstract syntactic structure where the exponents of roots and abstract morphemes can be inserted. Root insertion is free of syntactic constraints, as the syntactic slots in which roots are inserted make no featural demands regarding their content. This explains why we can get the pattern in (7a), repeated below as (16a):

(16) a. LSEC + INFLMAIN

b. LMAIN + INFLMAIN

c. *LSEC + INFLSEC (except in bigger mixed chunks)

d. *LMAIN + INFLSEC

As seen in (16c) and (16d), however, the exponents of abstract functional morphemes apparently always come from the main language, never from the secondary one. In the present context of American Norwegian, this amounts to saying that the functional vocabulary comes from Norwegian and not from English. This asymmetry needs an explanation, and this is where features really play a role in this model. We assume that the main or matrix language builds the structure and, thus, that the feature matrices that are part of the structure come from the Norwegian list of abstract morphemes. As mentioned, roots do not instantiate feature matrices but are, rather, inserted as modifiers in the appropriate lexical slots. Abstractly, this can be illustrated as in (17).

(17)

In (17), we have specified the feature matrices relevant to our illustration of language mixing, leaving the others as [FM] (for feature matrix) for ease of exposition. The functional exponent that will be subject to vocabulary insertion in any particular one of those slots has to match the features of the underlying abstract morpheme. This follows from the rules of exponence summarized in the Subset Principle (Halle 1997: 428), which reads as follows:

Subset Principle: The phonological exponent of a Vocabulary Item is inserted into a position if the item matches all or a subset of the features specified in that position. Insertion does not take place if the Vocabulary Item contains features not present in the morpheme. Where several Vocabulary Items meet the conditions of insertion, the item matching the greatest number of features specified in the terminal morpheme must be chosen.
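As a rough illustration, the selection procedure stated in the Subset Principle can be sketched in a few lines of code; the feature labels A/B/C/D are hypothetical placeholders, not features of any particular language:

```python
# A minimal sketch of the Subset Principle quoted above: a vocabulary item
# is insertable only if all of its features are present on the terminal
# morpheme, and among insertable items the one matching the most features
# must be chosen. Feature labels A/B/C/D are hypothetical placeholders.

def insert(terminal_features, vocabulary):
    candidates = [(exponent, feats) for exponent, feats in vocabulary
                  if feats <= terminal_features]  # no extra features allowed
    if not candidates:
        return None
    # "the item matching the greatest number of features ... must be chosen"
    return max(candidates, key=lambda item: len(item[1]))[0]

vocabulary = [
    ("x", {"A"}),
    ("y", {"A", "B"}),
    ("z", {"A", "B", "C", "D"}),  # blocked below: D is not on the terminal
]
print(insert({"A", "B", "C"}, vocabulary))  # -> y
```

Here "z" is blocked by the second clause (it carries a feature the terminal lacks), and "y" beats "x" by the maximality clause.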

What also follows from this particular formulation of the Subset Principle is that the most specified form will block the insertion of a less specified form, even if both are compatible. This specification will prove important when addressing verbal word-internal language mixing in section 13.4.

For the lexical categories, items can be inserted from any language. We have specified in the structure where you would get a root and where you would get a full nominal phrase, D (the internal structure of which we return to below). At least for English–Norwegian mixing, we assume that roots are never mixed on their own: the smallest element that is mixed is a categorized stem. In other words, the categorizing head has to come from the same language as the root. We can assume this because of data such as (18), based on intuitions from several native speakers of Norwegian:

(18) a. Han sprang ut døra
        he ran out door-DEF.F.SG
        "He ran out the door."
     b. Sønnen min out-a/*ut-a meg til førskolen sin
        son-DEF.M.SG my out-PAST/out-PAST me to preschool-DEF.M.SG his
        "My son outed me to his preschool."

We see that the Norwegian preposition ut roughly corresponds to the English preposition out, but whereas out can surface as a verb in English, ut is not attested in similar use in Norwegian. What is attested, however, is English out being used as a verb in Norwegian, as seen in (18b). If what was mixed into Norwegian were the uncategorized root, we would not expect outa in Norwegian to have the exact same idiosyncratic meaning as outed has in English, simply because that specific, verbal meaning is not present in the meaning content of the preposition, adverb, noun, or adjective built using the same root. In other words, Norwegian must have mixed in a structure involving at least the categorized root for the specialized verbal meaning of out to be attained as well.

This outlines the model we will be using, and we now turn to word-internal language mixing and how this specific model can account for the pattern we observe in American Norwegian.

13.4 Accounting for Word-Internal Mixing

Let us consider how we can employ the model developed in section 13.3 to analyze more data from American Norwegian. We first look at verbs and then at nouns.

The examples in (19) illustrate word-internal mixing within verbs. As in the previous examples, the index in brackets behind each example is a reference to the speaker in the CANS corpus who uttered that specific phrase. Again, CANS' transcriptions have been altered for ease of exposition: English words are here written using English spelling. This will be the norm for all subsequent examples.

(19) a. spend-e        spend-INF        "to spend"       (blair_WI_02gm)
     b. bother-e       bother-INF       "to bother"      (westby_WI_01gm)
     c. figur[e]-e ut  figure-INF out   "to figure out"  (blair_WI_01gm)
     d. harvest-e      harvest-INF      "to harvest"     (coon_valley_WI_02gm)
     e. cultivat[e]-e  cultivate-INF    "to cultivate"   (coon_valley_WI_04gm)
     f. shut-e         shut-INF         "to shut"        (coon_valley_WI_04gm)
     g. count-e        count-INF        "to count"       (sunburg_MN_03gm)
     h. rais[e]-er     raise-PRES       "raise(s)"       (blair_WI_01gm)
     i. rent-er        rent-PRES        "rent(s)"        (coon_valley_WI_02gm)
     j. pre-empt-er    pre-empt-PRES    "pre-empt(s)"    (harmony_MN_01gk)
     k. hunt-er        hunt-PRES        "hunt(s)"        (coon_valley_WI_04gm)
     l. feed-er        feed-PRES        "feed(s)"        (spring_grove_MN_05gm)
     m. retir[e]-a     retire-PAST      "retired"        (coon_valley_WI_06gm)
     n. visit-a        visit-PAST       "visited"        (blair_WI_01gm)
     o. telephon[e]-a  telephone-PAST   "telephoned"     (harmony_MN_01gk)
     p. car[e]-a       care-PAST        "cared"          (webster_SD_02gm)
     q. tight-a        tight-PAST       "tighted"        (westby_WI_01gm)
     r. catch-a        catch-PAST       "caught"         (sunburg_MN_03gm)
     s. watch-a        watch-PAST       "watched"        (sunburg_MN_03gm)
     t. walk-te        walk-PAST        "walked"         (rushford_MN_01gm)
     u. rais[e]-te     raise-PAST       "raised"         (blair_WI_01gm)

As illustrated, even though an English stem is used in the American Norwegian examples, the affixes are not English but, rather, the ones used in Norwegian. How can we account for this?

A structure for the example in (19i), renter, is given in (20), where the vocabulary item or exponent has been inserted to make the structure easier to read. In the syntax, importantly, there are only feature matrices and roots. Note that only the relevant features are shown.

(20)

The verb moves from the stem position/v through F and Voice until it picks up the inflectional morpheme in T. Not shown here is that the verb and the inflectional ending would then move on together to C, since American Norwegian conforms to the V2 rule. Importantly, the exponent is renter, with a Norwegian tense inflection, not rent or rents, with an English one. In order to explain why this is a pattern we observe for all mixed verbs in American Norwegian, rather than a random coincidence, we have to look at the corresponding English structure. (21a) shows the relevant abstract structure with feature matrices, whereas in (21b), we have inserted exponents.

(21) a.

b.

As we see, the English structure is identical to the Norwegian one, apart from the fact that the English T has unvalued features for number and person that have to be valued by features of the external argument. When the external argument has the features [num: sg, pers: 3], as in (21b), the exponent of T is rents, with an -s. Had the external argument had any other feature combination, however, the exponent would have been rent. In other words, English has subject–verb agreement. As we recall from the abstract Norwegian structure in (17), we do not assume that the feature bundle of T used in Norwegian includes unvalued features for number and person, simply because Norwegian does not display subject–verb agreement. This means that, following the Subset Principle, English rent and English rents are ruled out as possible phonological exponents of the feature bundle of the Norwegian T projection, seeing as they include features for number and person that are not called for in the structure.

It is worth noting that one could also assume that all languages have subject–verb agreement but that some languages, such as Norwegian, have identical exponents for all feature combinations. If that were the case, the Norwegian and English exponents would be equally well matched, meaning the syntax would pose no restrictions on the insertion of any of them. We claim that even in such situations, we can expect the exponent from the matrix language to be chosen over that from the embedded language. The reason is that the speaker is aware of what language constitutes the main language of any given utterance. When building an American Norwegian syntactic structure, for instance, the speaker gets his or her abstract morphemes from the Norwegian list of abstract morphemes, and that will likely influence what exponent they choose to use, even if there were an identical abstract morpheme in the American list.
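The blocking of the English exponents by the Subset Principle's second clause can be sketched as follows; the feature labels (PRES, NUM:SG, PERS:3, and the catch-all AGR) are our own hypothetical shorthand for the matrices under discussion:

```python
# Sketch of why rent/rents cannot realize the Norwegian T node: both carry
# person/number material absent from the Norwegian feature bundle, so the
# Subset Principle's second clause blocks them. Feature labels are our own
# hypothetical shorthand, not the authors' exact matrices.

def insertable(vi_features, terminal_features):
    # "Insertion does not take place if the Vocabulary Item contains
    #  features not present in the morpheme."
    return vi_features <= terminal_features

norwegian_T = {"T", "PRES"}  # no person/number: Norwegian lacks agreement

vis = {
    "rents": {"T", "PRES", "NUM:SG", "PERS:3"},  # English 3sg present
    "rent":  {"T", "PRES", "AGR"},               # English elsewhere form (hypothetical AGR feature)
    "-er":   {"T", "PRES"},                      # Norwegian present-tense suffix
}

for exponent, feats in vis.items():
    print(exponent, insertable(feats, norwegian_T))
# Only -er is insertable, yielding rent-er.
```

Only the Norwegian suffix survives the subset check, matching the attested renter.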
We do not even have to get very technical, as this really is a matter of communication strategies. If I am speaking a specific language or variety, say, Norwegian, I am likely to make use of mostly Norwegian exponents for both abstract morphemes and roots. If I choose to use an exponent associated with a different language instead of a Norwegian one, it will not be because both were compatible and I randomly chose one; I will be choosing that exponent for some purpose. In the case of categorized roots, there are many reasons why one might want to choose one from another language. It could be that the matrix language does not have an exponent with the specific semantic content the speaker wants to express, as is the case with the verb out, or there could be other psychosocial reasons (e.g., Poplack and Dion 2012 mention conspicuousness and attention seeking as two oft-cited motivations). It is more difficult to see what the motivation for choosing a functional exponent from another language could be, though.

Now, let us turn to word-internal mixing within nouns.

(22) a. road-en          road-DEF.M.SG          "the road"         (westby_WI_02gm)
     b. graveyard-en     graveyard-DEF.M.SG     "the graveyard"    (blair_WI_07gm)
     c. river-en         river-DEF.M.SG         "the river"        (chicago_IL_01gk)
     d. teacher-en       teacher-DEF.M.SG       "the teacher"      (rushford_MN_01gm)
     e. end loader-en    end loader-DEF.M.SG    "the end loader"   (westby_WI_01gm)
     f. track-en         track-DEF.M.SG         "the track"        (coon_valley_WI_04gm)
     g. squirrel-en      squirrel-DEF.M.SG      "the squirrel"     (coon_valley_WI_02gm)
     h. railroad-en      railroad-DEF.M.SG      "the railroad"     (harmony_MN_05gm)
     i. university-en    university-DEF.M.SG    "the university"   (harmony_MN_04gm)
     j. color-en         color-DEF.M.SG         "the color"        (coon_valley_WI_04gm)
     k. choir-en         choir-DEF.M.SG         "the choir"        (coon_valley_WI_07gk)
     l. cousin-a         cousin-DEF.F.PL        "the cousin"       (harmony_MN_01gk)
     m. fair-a           fair-DEF.F.SG          "the fair"         (coon_valley_WI_06gm)
     n. field-a          field-DEF.F.SG         "the field"        (westby_WI_02gm)
     o. field-et         field-DEF.N.SG         "the field"        (rushford_MN_01gm)
     p. pastur[e]-et     pasture-DEF.N.SG       "the pasture"      (coon_valley_WI_03gm)
     q. government-et    government-DEF.N.SG    "the government"   (harmony_MN_01gk)
     r. shed-et          shed-DEF.N.SG          "the shed"         (blair_WI_07gm)
     s. school board-et  school board-DEF.N.SG  "the school board" (westby_WI_01gm)
     t. stor[e]-et       store-DEF.N.SG         "the store"        (wanamingo_MN_04gk)
     u. fenc[e]-a        fence-DEF.N.PL         "the fences"       (coon_valley_WI_06gm)

As can be seen, even though an English stem is used in the American Norwegian examples, the definiteness morpheme is not the English prenominal free morpheme the but, rather, the Norwegian postnominal suffix. Just like in the case of the preceding verbal examples, our model readily explains this pattern. Since American Norwegian is a variety of Norwegian, the speaker employs a Norwegian syntactic structure, and the relevant Norwegian syntactic structure is sketched in (23) (cf. Riksem et al. 2014).

(23)

The structure builds on Julien (2005), with the exception of the gender projection. Whereas Julien argues that gender lacks a projection of its own and rather is a feature of the root or stem, we assume that it is an independent functional head; compare with Picallo (1991, 2008; though see Alexiadou 2004 and Kramer 2014 for a different analysis). Note that our data are also consistent with an analysis where gender is a feature of another syntactic head instead of being a projecting head itself. Definiteness, number and gender could, for instance, be features of the same functional head, as argued for in Riksem (in press). This is also compatible with the gender feature being located on different syntactic heads in different languages, as proposed in Ritter (1993). For the purposes of this chapter, however, we use GenP to implement our analysis. What we do argue to be incompatible with our data is gender being a feature of the root or n, contrary to what is argued in Julien (2005) as well as in Alexiadou (2004) and Kramer (2014).

The functional D head provides a feature matrix that must be made visible by the best matching available exponent, in accordance with the Subset Principle. If English made use of the same structure with the same feature matrices, one could insert exponents from both languages. However, gender is a nonexistent feature in English, and number is not expressed on the definite/indefinite article. This means that the English exponents are less specified than the Norwegian ones, seeing as the latter match all the features of the relevant feature matrices, such as [+DEF, +F, +SG] for -a in fielda, or [+DEF, +M, +SG] for -en in graveyarden. Consequently, only Norwegian exponents will do. The structure for (22a) is illustrated in (24).

(24)
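The competition for the D node in (24) can be sketched along the same lines; we assume, purely for illustration, the feature labels DEF/M/SG and treat English the as specified only for definiteness:

```python
# Sketch of the maximality clause applied to the D node of (24): English
# "the" matches only [+DEF], a proper subset of the Norwegian terminal
# [+DEF, +M, +SG], so Norwegian -en, matching all three features, wins.
# Feature labels are illustrative assumptions.

def best_exponent(terminal, vocabulary):
    candidates = {exp: feats for exp, feats in vocabulary.items()
                  if feats <= terminal}                    # subset condition
    return max(candidates, key=lambda e: len(candidates[e]))  # most features wins

D_terminal = {"DEF", "M", "SG"}
vocabulary = {
    "the": {"DEF"},             # English: no gender, no number on the article
    "-en": {"DEF", "M", "SG"},  # Norwegian masculine singular definite suffix
    "-a":  {"DEF", "F", "SG"},  # Norwegian feminine singular definite suffix
}
print(best_exponent(D_terminal, vocabulary))  # -> -en, as in road-en "the road"
```

Although the is a licit candidate under the subset condition, the more fully specified Norwegian suffix blocks it.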

The fact that features that are nonexistent in English but present in Norwegian are assigned to English nouns in American Norwegian is of particular interest. This is illustrated for gender in (22), where the nouns have suffixes denoting feminine, masculine, or neuter gender. It also shows up on the articles chosen to accompany singular, indefinite nouns borrowed from English, as illustrated in (25).

(25) a. en chainsaw           a-INDEF.M.SG chainsaw-INDEF.M.SG            "a chainsaw"                  (blair_WI_07gm)
     b. en strap              a-INDEF.M.SG strap-INDEF.M.SG               "a strap"                     (coon_valley_WI_04gm)
     c. en permit             a-INDEF.M.SG permit-INDEF.M.SG              "a permit"                    (westby_WI_06gm)
     d. en license            a-INDEF.M.SG license-INDEF.M.SG             "a license"                   (westby_WI_06gm)
     e. ei nurse              a-INDEF.F.SG nurse-INDEF.F.SG               "a nurse"                     (coon_valley_WI_02gm)
     f. ei field              a-INDEF.F.SG field-INDEF.F.SG               "a field"                     (westby_WI_01gm)
     g. ei stor family        a-INDEF.F.SG large family-INDEF.F.SG        "a large family"              (harmony_MN_02gk)
     h. ei slik turkey cooker a-INDEF.F.SG such turkey cooker-INDEF.F.SG  "one of those turkey cookers" (westby_WI_01gm)
     i. et shed               a-INDEF.N.SG shed-INDEF.N.SG                "a shed"                      (coon_valley_WI_02gm)
     j. et walnut             a-INDEF.N.SG walnut-INDEF.N.SG              "a walnut"                    (coon_valley_WI_04gm)
     k. et company            a-INDEF.N.SG company-INDEF.N.SG             "a company"                   (westby_WI_01gm)
     l. et crew               a-INDEF.N.SG crew-INDEF.N.SG                "a crew"                      (westby_WI_03gk)
     m. et grocery store      a-INDEF.N.SG grocery store-INDEF.N.SG       "a grocery store"             (westby_WI_03gk)
     n. et annet dialect      a-INDEF.N.SG other dialect-INDEF.N.SG       "another dialect"             (harmony_MN_01gk)

An analysis of this assignment is suggested in Nygård and Åfarli (2013), again making use of an exoskeletal model. Nygård and Åfarli take as their point of departure what they call the gender problem, that is, the problem of why gender seems to be an inherent property of the noun, whereas other functional properties, like number and definiteness, may vary from one occasion of use to another. American Norwegian is particularly interesting with respect to the gender problem, because this variety of Norwegian shows frequent mixing of nouns from a language without gender on nouns (English) into a language with a gender system (Norwegian). There are two theoretical possibilities for a noun taken from a nongender system into a gender system:

(26) a. the noun receives a default ("inactive") gender by virtue of being borrowed; that is, all borrowed nouns receive the same default gender; or
     b. the noun receives a particular ("active") gender in a systematic way by some assignment rule.

The American Norwegian data indicate that English nouns mixed into American Norwegian are assigned to different gender classes in a systematic way. For instance, Hjelde (1996) finds that of the English nouns borrowed into "Trønder" American Norwegian, 70.7% are masculine (m), 10.5% are feminine (f), and 15.7% are neuter (n), whereas the final 3.1% alternate. It has also been argued that gender assignment in Norwegian is "rule governed" (Trosterud 2001) and, similarly, that there are particular gender assignment rules in American Norwegian (Hjelde 1996: 297). Hjelde (1996: 299–300) states that English nouns mixed into American Norwegian seem to acquire a gender based on their conceptual and/or phonological properties. Nygård and Åfarli (2013) side with Hjelde and conclude that gender is, in fact, syntactically assigned to the English nouns borrowed into American Norwegian, and they explain this assignment as we have done, by assuming a gender projection for Norwegian DPs which is absent for English ones. The relevant structure for (25a) would be (27):

(27) [syntactic structure for (25a), not reproduced here]

Keep in mind that within a lexicalist or endoskeletal model where features are inherent properties of individual lexical items, one could not readily explain how an English lexical item, such as chainsaw, receives gender—as it can only project the features inherent to it, and English nouns are not assumed to have gender features. As shown, the exoskeletal model proposed here, on the other hand, explains both why and how this assignment of otherwise alien features takes place.

13.5 Conclusion

To conclude, we have argued that language mixing data provide important evidence for grammatical theory. More specifically, the data from language mixing in American Norwegian that we have been discussing support a late-insertion exoskeletal model of grammar whereby the functional part of the sentence constitutes a template or frame in which lexical content items are inserted.
The primary explanatory device in an exoskeletal analysis is the syntactic template or frame, and although we assume that the existence of features and feature matrices is important as explanatory devices, features still have a somewhat reduced role and scope in our analysis as compared to mainstream minimalist theory. More concretely, we claim that the syntactic functional structure is generated by way of bundles of abstract formal features, that is, feature matrices consisting of abstract morphemes. These features generate a syntactic template or frame. For the realization of the feature matrices, we assume a set of vocabulary insertion rules. Based on a specific version of the subset principle, we have argued that since the functional exponents of an embedded language will rarely match the functional feature matrices of the matrix language as well as, or better than, the exponents of the matrix language itself, there is a strong tendency for functional morphology to be provided by the matrix language. We have shown this to be the case in American Norwegian. As discussed, we assume that there will be a preference for functional exponents from the matrix language even when the exponent from the embedded language is equally well matched.
On the other hand, lexical content items are freely inserted into designated slots in the structure generated by the abstract feature matrices, and importantly, there is no feature-matching requirement pertaining to content items.
These content items can therefore be picked from any language, and, as we have shown, American Norwegian often contains content items (stems) from English.
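The competition among functional exponents just described can be illustrated schematically. The following is our own minimal sketch, not the authors' formalization: the feature sets, the scoring by matched features, and the vocabulary items (-en, the) are purely illustrative assumptions.

```python
# A schematic sketch (illustrative only) of subset-principle vocabulary
# insertion with a matrix-language preference on ties.

def insertable(exponent_feats, matrix_feats):
    # Subset principle: an exponent may realize a feature matrix only if
    # all of the exponent's features are contained in that matrix.
    return exponent_feats <= matrix_feats

def select_exponent(matrix_feats, candidates, matrix_language):
    # candidates: list of (form, features, language) vocabulary items.
    best = None
    for form, feats, lang in candidates:
        if not insertable(feats, matrix_feats):
            continue
        score = len(feats & matrix_feats)  # how many features are matched
        # Prefer the better match; on a tie, prefer the matrix language.
        if (best is None or score > best[0]
                or (score == best[0] and lang == matrix_language
                    and best[2] != matrix_language)):
            best = (score, form, lang)
    return best

# Hypothetical feature matrix for a Norwegian definite masculine singular noun.
matrix = {"def", "sg", "masc"}
candidates = [
    ("-en", {"def", "sg", "masc"}, "Norwegian"),  # Norwegian definite suffix
    ("the", {"def"}, "English"),                  # English definite article
]
print(select_exponent(matrix, candidates, "Norwegian"))  # -> (3, '-en', 'Norwegian')
```

On this sketch, the English exponent is insertable (its features are a subset of the matrix), but the Norwegian exponent matches more features and therefore wins, mirroring the tendency for functional morphology to come from the matrix language.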

Notes

* We are grateful to two anonymous reviewers, audiences at a number of venues, and members of the EXOGRAM research group at NTNU for valuable and helpful comments.
1 Whether the first generation immigrants should be referred to as heritage speakers along with their descendants or not is debatable (Åfarli 2015: 14–15). In either case, our data set does not comprise speech from first generation immigrants.
2 The approach in Poplack (1980, 1981) and Sankoff and Poplack (1981) also proposes constraints that are unique to language mixing. See also Gumperz (1982).
3 There are exceptions to this pattern, such as the occasional use of the English plural marker –s in an otherwise Norwegian noun phrase, as well as English nouns in an otherwise Norwegian, definite noun phrase lacking the Norwegian definiteness suffix. As this barely ever occurs in the earlier material documented in Haugen (1953), we are, for the most part, attributing this to attrition. An analysis of this phenomenon can be found in Riksem (in press) and Riksem et al. (2014).

References

Adger, D. 2003. Core Syntax. Oxford: Oxford University Press.
Adger, D. 2006. Combinatorial variability. Journal of Linguistics 42: 503–530.
Adger, D. 2010. A minimalist theory of feature structure. In Features: Perspectives on a Key Notion in Linguistics, A. Kibort and G. Corbett (eds.), 185–218. Oxford: Oxford University Press.
Adger, D. 2013. A Syntax of Substance. Cambridge, MA: MIT Press.
Adger, D. and Svenonius, P. 2011. Features in minimalist syntax. In The Oxford Handbook of Linguistic Minimalism, C. Boeckx (ed.), 27–51. Oxford: Oxford University Press.
Åfarli, T. A. 2007. Do verbs have argument structure? In Argument Structure, E. Reuland, T. Bhattacharya and G. Spathas (eds.), 1–16. Amsterdam: John Benjamins.
Åfarli, T. A. 2015. A syntactic model for the analysis of language mixing phenomena: American Norwegian and beyond. In Moribund Germanic Heritage Languages in North America, B. R. Page and M. T. Putnam (eds.), 12–33. Leiden: Brill.
Åfarli, T. A. and Jin, F. 2014. Syntactic frames and single-word code-switching: A case study of Norwegian-Mandarin Chinese. In The Sociolinguistics of Grammar, T. A. Åfarli and B. Mæhlum (eds.), 153–170. Amsterdam: John Benjamins.
Åfarli, T. A. and Subbārāo, K. V. 2013. Models for language contact: The Dakkhini challenge. Paper presented at Formal Approaches to South Asian Languages, University of Southern California, March 9–10.
Alexiadou, A. 2004. Inflection class, gender and DP-internal structure. In Explorations in Nominal Inflection, G. Müller, L. Gunkel and G. Zifonun (eds.), 21–50. Berlin: Mouton de Gruyter.
Alexiadou, A., Anagnostopoulou, E. and Schäfer, F. 2006. The properties of anticausatives crosslinguistically. In Phases of Interpretation, M. Frascarelli (ed.), 187–212. Berlin: Mouton de Gruyter.
Alexiadou, A., Anagnostopoulou, E. and Schäfer, F. 2015. External Arguments in Transitivity Alternations: A Layering Approach. Oxford: Oxford University Press.
Arad, M. 2005.
Roots and Patterns: Hebrew Morpho-Syntax. Dordrecht: Springer.
Belazi, H. M., Rubin, E. J. and Toribio, A. J. 1994. Code switching and X-bar theory. Linguistic Inquiry 25: 221–237.
Bentahila, A. and Davies, E. E. 1991. Constraints on code-switching: A look beyond grammar. In Papers from the Symposium on Code-Switching in Bilingual Studies: Theory, Significance and Perspectives, 369–404. Strasbourg: European Science Foundation.
Boeckx, C. 2014. Elementary Syntactic Structures: Prospects of a Feature-Free Syntax. Cambridge: Cambridge University Press.
Booij, G. 2010. Construction Morphology. Oxford: Oxford University Press.
Borer, H. 2005a. Structuring Sense I: In Name Only. Oxford: Oxford University Press.
Borer, H. 2005b. Structuring Sense II: The Normal Course of Events. Oxford: Oxford University Press.
Borer, H. 2013. Structuring Sense III: Taking Form. Oxford: Oxford University Press.
Bowers, J. 2010. Arguments as Relations. Cambridge, MA: MIT Press.
Boyd, S. 1993. Attrition or expansion? Changes in the lexicon of Finnish and American adult bilinguals in Sweden. In Progression and Regression in Language: Sociocultural, Neurophysical, and Linguistic Perspectives, K. Hyltenstam and Å. Viberg (eds.), 386–411. Cambridge: Cambridge University Press.
Boztepe, E. 2003. Issues in code-switching: Competing theories and models. Working Papers in TESOL and Applied Linguistics 3: 1–27.
Brody, M. 1997. Perfect chains. In Elements of Grammar, L. Haegeman (ed.), 139–167. Dordrecht: Kluwer.
Carlson, G. 1984. Thematic roles and their role in semantic interpretation. Linguistics 22: 259–279.
Chan, B. H-S. 2003. Aspects of the Syntax, Pragmatics and Production of Code-Switching—Cantonese and English. New York: Peter Lang.
Chan, B. H-S. 2008. Code-switching, word order and the lexical/functional category distinction. Lingua 118: 777–809.
Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, N. 1965.
Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 2001. Derivation by phase. In Kenneth Hale: A Life in Language, M. Kenstowicz (ed.), 1–52. Cambridge, MA: MIT Press.
Chomsky, N. 2004. Beyond explanatory adequacy. In Structures and Beyond: The Cartography of Syntactic Structures, A. Belletti (ed.), 104–131. Oxford: Oxford University Press.
Chomsky, N. and Lasnik, H. 1977. Filters and control. Linguistic Inquiry 8: 425–504.
Clark, E. V. and Clark, H. H. 1979. When nouns surface as verbs. Language 55: 767–811.
Clyne, M. 2003. Dynamics of Language Contact. Cambridge: Cambridge University Press.
Corpus of American Norwegian Speech (CANS); Text Laboratory, University of Oslo. URL: www.tekstlab.uio.no/nota/NorAmDiaSyn/index.html
Di Sciullo, A-M., Muysken, P. and Singh, R. 1986. Government and code-mixing. Journal of Linguistics 22: 1–24.
Eastman, C. M. 1992. Codeswitching as an urban language-contact phenomenon. Journal of Multilingual and Multicultural Development 12: 1–17.
Eliasson, S. 1989. English-Maori language contact: Code-switching and the free morpheme constraint. Reports from Uppsala University Department of Linguistics 18: 1–28.
Embick, D. and Noyer, R. 2007. Distributed morphology and the syntax-morphology interface. In The Oxford Handbook of Linguistic Interfaces, G. Ramchand and C. Reiss (eds.), 289–324. Oxford: Oxford University Press.
Field, F. 2002. Linguistic Borrowing in Bilingual Contexts. Amsterdam: John Benjamins.
Folli, R. and Harley, H. 2007. Causation, obligation, and argument structure: On the nature of little v. Linguistic Inquiry 38: 197–238.
Fox, D. 1995. Economy and scope. Natural Language Semantics 3: 283–341.
Frampton, J. and Gutmann, S. 2002. Crash-proof syntax. In Derivation and Explanation in the Minimalist Program, S. D.
Epstein and T. D. Seely (eds.), 90–105. Oxford: Blackwell.
Gardner-Chloros, P. 2009. Code-Switching. Cambridge: Cambridge University Press.
Goldberg, A. 1995. Constructions. Chicago, IL: The University of Chicago Press.
Goldberg, A. 2006. Constructions at Work. Oxford: Oxford University Press.
González-Vilbazo, K. 2005. Die Syntax des Code-Switching. Doctoral dissertation, University of Cologne.
González-Vilbazo, K. and López, L. 2011. Some properties of light verbs in code switching. Lingua 121: 832–850.
González-Vilbazo, K. and López, L. 2012. Little v and parametric variation. Natural Language and Linguistic Theory 30: 33–77.
Gruber, J. S. 1965. Studies in Lexical Relations. Doctoral dissertation, MIT.
Gumperz, J. 1982. Discourse Strategies. Cambridge: Cambridge University Press.
Halle, M. 1997. Distributed morphology: Impoverishment and fission. MITWPL 30: Papers at the Interface, B. Bruening, Y. Kang and M. McGinnis (eds.), 425–449. Cambridge: MITWPL.
Halle, M. and Marantz, A. 1993. Distributed morphology and the pieces of inflection. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, K. Hale and S. J. Keyser (eds.), 111–176. Cambridge, MA: MIT Press.
Harley, H. 1995. Subjects, Events, and Licensing. Doctoral dissertation, MIT.
Harley, H. 2014. On the identity of roots. Theoretical Linguistics 40: 225–276.
Harley, H. and Noyer, R. 1999. Distributed morphology. Glot International 4: 3–9.
Haspelmath, M. 2009. Lexical borrowing: Concepts and issues. In Loanwords in the World’s Languages: A Comparative Handbook, M. Haspelmath and U. Tadmor (eds.), 35–54. Berlin: Mouton de Gruyter.
Haugen, E. 1953. The Norwegian Language in America. Philadelphia: University of Pennsylvania Press.
Heath, J. 1989. From Code-Switching to Borrowing: A Case Study of Moroccan Arabic. London: Kegan Paul International.
Hjelde, A. 1992. Trøndsk talemål i Amerika. Trondheim: Tapir.
Hjelde, A. 1996. The gender of English nouns in American Norwegian.
In Language Contact Across the Atlantic, P. S. Ureland and I. Clarkson (eds.), 297–312. Tübingen: Max Niemeyer Verlag.
Jackendoff, R. 1990. Semantic Structures. Cambridge, MA: MIT Press.
Jake, J. L., Myers-Scotton, C. and Gross, S. 2002. Making a minimalist approach to codeswitching work: Adding the Matrix Language. Bilingualism: Language and Cognition 5: 69–91.
Johanson, L. 1993. Code-copying in Immigrant Turkish. In Immigrant Languages in Europe, G. Extra and L. Verhoeven (eds.), 197–221. Clevedon: Multilingual Matters.
Joshi, A. K. 1985. Processing of sentences with intrasentential code switching. In Natural Language Parsing: Psychological, Computational, and Theoretical Perspectives, D. Dowty, L. Karttunen and A. Zwicky (eds.), 190–205. Cambridge: Cambridge University Press.
Julien, M. 2005. Nominal Phrases from a Scandinavian Perspective. Amsterdam: John Benjamins.
Kamwangamalu, N. M. 1997. Language contact, code-switching, and I-languages: Evidence from Africa. South African Journal of Linguistics 15: 45–51.
King, R. 2000. The Lexical Basis of Grammatical Borrowing: A Prince Edward Island French Case Study. Amsterdam: John Benjamins.
Kramer, R. 2014. Gender in Amharic: A morphosyntactic approach to natural and grammatical gender. Language Sciences 43: 102–115.
Kratzer, A. 1996. Severing the external argument from the verb. In Phrase Structure and the Lexicon, J. Rooryck and L. Zaring (eds.), 109–137. Dordrecht: Kluwer.
Lasnik, H. and Saito, M. 1984. On the nature of proper government. Linguistic Inquiry 15: 235–289.
Lasnik, H. and Saito, M. 1992. Move α. Cambridge, MA: MIT Press.
Lenneberg, E. 1967. Biological Foundations of Language. New York: John Wiley and Sons.
Levin, B. and Hovav, M. R. 2005. Argument Realization. Cambridge: Cambridge University Press.
Lohndal, T. 2012. Without Specifiers: Phrase Structure and Argument Structure. Doctoral dissertation, University of Maryland.
Lohndal, T. 2013.
Generative grammar and language mixing. Theoretical Linguistics 39: 215–224.
Lohndal, T. 2014. Phrase Structure and Argument Structure: A Case-Study of the Syntax-Semantics Interface. Oxford: Oxford University Press.
MacSwan, J. 1999. A Minimalist Approach to Intra-Sentential Code-Switching. New York: Garland.
MacSwan, J. 2000. The architecture of the bilingual faculty: Evidence from intrasentential code switching. Bilingualism 3: 37–54.
MacSwan, J. 2005. Codeswitching and generative grammar: A critique of the MLF model and some remarks on “modified minimalism”. Bilingualism: Language and Cognition 8: 1–22.
Mahootian, S. 1993. A Null Theory of Code-Switching. Doctoral dissertation, Northwestern University.
Mahootian, S. and Santorini, B. 1996. Code switching and the complement/adjunct distinction. Linguistic Inquiry 27: 464–479.
Marantz, A. 1984. On the Nature of Grammatical Relations. Cambridge, MA: MIT Press.
Marantz, A. 1997. No escape from syntax: Don’t try morphological analysis in the privacy of your own Lexicon. Proceedings of the 21st Penn Linguistics Colloquium, A. Dimitriadis, L. Siegel, C. Surek-Clark and A. Williams (eds.), 201–225. University of Pennsylvania: UPenn Working Papers in Linguistics.
Marantz, A. 2013. Verbal argument structure: Events and participants. Lingua 130: 152–168.
Merchant, J. 2013. Voice and ellipsis. Linguistic Inquiry 44: 77–108.
Muysken, P. 2000. Bilingual Speech: A Typology of Code-Mixing. Cambridge: Cambridge University Press.
Myers-Scotton, C. 1993. Duelling Languages: Grammatical Structure in Code Switching. Oxford: Oxford University Press.
Myers-Scotton, C. 2002. Contact Linguistics: Bilingual Encounters and Grammatical Outcomes. Oxford: Oxford University Press.
Myers-Scotton, C. 2006. Multiple Voices: An Introduction to Bilingualism. Malden: Blackwell.
Nygård, M. 2013. Discourse Ellipses in Spontaneous Spoken Norwegian: Clausal Architecture and Licensing Conditions.
Doctoral dissertation, NTNU Trondheim.
Nygård, M. and Åfarli, T. A. 2013. The structure of gender assignment and American Norwegian. Paper presented at the 4th Annual Workshop on Immigrant Languages in the Americas, University of Iceland, September 19.
Parsons, T. 1990. Events in the Semantics of English. Cambridge, MA: MIT Press.
Pesetsky, D. and Torrego, E. 2004. Tense, case and the nature of syntactic categories. In The Syntax of Time, J. Guéron and J. Lecarme (eds.), 495–538. Cambridge, MA: MIT Press.
Picallo, M. C. 1991. Nominals and nominalizations in Catalan. Probus 3: 279–316.
Picallo, M. C. 2008. Gender and number in Romance. Lingue e Linguaggio VII: 47–66.
Pietroski, P. 2007. Systematicity via monadicity. Croatian Journal of Philosophy 7: 343–374.
Poplack, S. 1980. Sometimes I’ll start a sentence in Spanish Y TERMINO EN ESPAÑOL: Toward a typology of code switching. Linguistics 18: 581–616.
Poplack, S. 1981. Syntactic structure and social function of code-switching. In Latino Language and Communicative Behavior, R. P. Duran (ed.), 169–184. Norwood, NJ: Ablex Publishing Corporation.
Poplack, S. 2004. Code-switching. In Sociolinguistics: An International Handbook of the Science of Language, U. Ammon, N. Dittmar, K. J. Mattheier and P. Trudgill (eds.), 589–596. Berlin: Mouton de Gruyter.
Poplack, S. and Dion, N. 2012. Myths and facts about loanword development. Language Variation and Change 24: 279–315.
Ramchand, G. 2008. Verb Meaning and the Lexicon: A First Phase Syntax. Cambridge: Cambridge University Press.
Reinhart, T. 1997. Wh-in-situ in the framework of the minimalist program. Natural Language Semantics 6: 29–56.
Riksem, B. R., Grimstad, M. B., Åfarli, T. A. and Lohndal, T. 2014. The inadequacy of feature-based lexicalist theories: A case-study of American Norwegian. Paper presented at The Fifth Annual Workshop on Immigrant Languages in the Americas, UCLA, Los Angeles, October 17–19.
Riksem, B. R. In press.
Language mixing in American Norwegian noun phrases. Journal of Language Contact.
Ritter, E. 1993. Where’s gender? Linguistic Inquiry 24: 795–803.
Rizzi, L. 1990. Relativized Minimality. Cambridge, MA: MIT Press.
Rizzi, L. 2001. Relativized minimality effects. In The Handbook of Contemporary Syntactic Theory, M. Baltin and C. Collins (eds.), 89–110. Malden: Blackwell.
Sankoff, D. and Poplack, S. 1981. A formal grammar for code-switching. Research on Language and Social Interaction 14: 3–45.
Schein, B. 1993. Plurals and Events. Cambridge, MA: MIT Press.
Starke, M. 2001. Move Reduces to Merge: A Theory of Locality. Doctoral dissertation, University of Geneva.
Thomason, S. G. 2001. Language Contact: An Introduction. Edinburgh: Edinburgh University Press.
Thomason, S. G. 2003. Contact as a source of language change. In A Handbook of Historical Linguistics, R. D. Janda and B. D. Joseph (eds.), 687–712. Oxford: Blackwell.
Toribio, A. J. 2001. On the emergence of bilingual code-switching competence. Bilingualism: Language and Cognition 4: 203–231.
Treffers-Daller, J. 2005. Evidence for insertional codemixing: Mixed compounds and French nominal groups in Brussels Dutch. International Journal of Bilingualism 9: 477–508.
Trosterud, T. 2001. Genustilordning i norsk er regelstyrt. Norsk Lingvistisk Tidsskrift 19: 29–58.
van Gelderen, E. and MacSwan, J. 2008. Interface conditions and code-switching: Pronouns, lexical DPs, and checking theory. Lingua 118: 765–776.
van Hout, A. 1996. Event Semantics of Verb Frame Alternations. Doctoral dissertation, Tilburg University.
Williams, E. 1981. Argument structure and morphology. The Linguistic Review 1: 81–114.
Winford, D. 2003. An Introduction to Contact Linguistics. Malden: Blackwell.
Winford, D. 2009. On the unity of contact phenomena and their underlying mechanisms. In Multidisciplinary Approaches to Code Switching, L. Isurin, D. Winford and K. de Bot (eds.), 279–306.
Amsterdam: John Benjamins.
Woolford, E. 1983. Bilingual code-switching and syntactic theory. Linguistic Inquiry 14: 520–536.
Zeijlstra, H. 2008. On the syntactic flexibility of formal features. In The Limits of Syntactic Variation, T. Biberauer (ed.), 143–174. Amsterdam: John Benjamins.
Zeijlstra, H. 2012. There is only one way to Agree. The Linguistic Review 29: 491–539.

14 Grammatical Gender in American Norwegian Heritage Language: Stability or Attrition?*

with Marit Westergaard

14.1 Introduction

In his seminal study, Corbett (1991: 2) states that “[g]ender is the most puzzling of the grammatical categories”. It involves the interaction of several components: morphology, syntax, semantics, phonology, as well as knowledge about the real world. Languages also differ in terms of how many (if any) genders they have. This means that gender is a property of language which must be inferred from the input, to which both child and adult learners of a language have to be finely attuned. We follow Hockett (1958: 231) in defining gender as follows: “Genders are classes of nouns reflected in the behavior of associated words”. This means that gender is expressed as agreement between the noun and other elements in the noun phrase or in the clause and that affixes on the noun expressing, for example, case, number or definiteness are not exponents of gender (Corbett 1991: 146). We refer to the marking on the noun itself as an expression of declension class (cf. Enger 2004, Enger and Corbett 2012; see also Kürschner and Nübling 2011 for a general discussion of the difference between gender and declension class in the Germanic languages). This has an interesting consequence for the definite article in Norwegian, which is a suffix (more on this later). A distinction is also commonly made between gender assignment and gender agreement. Gender assignment is what is typically referred to as an inherent property of the noun, for example, bil(M),

“car”, and hus(N), “house”, while gender agreement refers to agreement on other targets that is dependent on the gender of the noun, for example, the indefinite articles and adjectives in en.m fin.m bil(M), “a nice car”, and et.n1 fint.n hus(N), “a nice house”. The literature also differentiates between lexical versus referential gender (Dahl 2000), or in the terminology of Corbett (1991), syntactic versus semantic gender. The former refers to the inherent and invariable gender of a noun, for example, papa, “daddy” in Russian, which is always masculine, whereas the latter refers to cases where gender depends on the referent, for example, vrač, “doctor”, which may take either feminine or masculine agreement.
In this article, we provide a case study of gender assignment in a population of heritage speakers of Norwegian who have lived their entire lives in America, often without ever visiting Norway. We follow Haugen (1953) in referring to this variety as American Norwegian, and here we study whether the use of gender differs in any way from the traditional use of gender in Norwegian dialects. We are also interested in the nature of possible discrepancies. This will provide important information on how gender systems may change over time, especially in contexts with reduced input and use, and we compare the situation in American Norwegian to heritage Russian spoken in the US. As Polinsky (2008: 40) emphasizes, “[s]ince very little is actually known about heritage language speakers, studying different aspects of language structure in this population is important”. The current chapter contributes to this end in that it provides an additional investigation into the linguistic structure of heritage languages.
The structure of the chapter is as follows: In the next section, we introduce gender and its manifestations within the Norwegian noun phrase.
Section 14.3 outlines some relevant background from acquisition and heritage contexts, and section 14.4 introduces our research questions, participants, and methodology. The results are presented in section 14.5, which is followed by a discussion in section 14.6 and some concluding remarks in section 14.7.

14.2 Gender and the Norwegian Noun Phrase

Norwegian dialects traditionally distinguish between three genders: masculine, feminine and neuter. While many languages with gender have reliable morphophonological gender cues, for example, Spanish or Italian (where a noun ending in –o marks masculine and –a marks feminine), gender assignment in Norwegian is nontransparent. That is, from just hearing a noun, for example, bil, “car”; bok, “book”; or hus, “house”, a learner cannot make out its gender. It is only when nouns appear with associated words that the gender can be identified, for example, the indefinite article, as in en.m bil(M), ei.f bok(F), and et.n hus(N). Nevertheless, Trosterud (2001) proposes forty-three different assignment rules and argues that they may account for 94% of all nouns in the language. These assignment rules include three general rules, nine morphological rules, three phonological rules, and twenty-eight semantic rules. However, each rule has numerous exceptions, making it less clear if or how this rule-based account could actually predict gender in acquisition situations. Thus, we follow Rodina and Westergaard (2013, 2015a, b) in assuming that gender assignment in Norwegian is opaque and that gender must be learned noun by noun. This makes Norwegian gender a challenging property to acquire in a heritage language situation, where there is typically reduced input (see O’Grady et al. 2011).
Norwegian has two written standards, Nynorsk and Bokmål, the latter being by far the dominant one (see Venås 1993 for more information about the Norwegian language situation). In Bokmål, all feminine nouns may take masculine agreement, which means that this written variety may use only two genders, common and neuter.
The historical reason for this is that Bokmål is a development of the Danish written standard, and in Danish (as well as in Swedish and Dutch) the gender system has been reduced from one that distinguished three genders to one that generally only has two. The three-gender system has generally been retained in spoken Norwegian, in virtually all dialects (except Bergen and parts of Oslo). However, some recent studies indicate that a change from a three-gender system to a two-gender system is underway in the Tromsø dialect (Rodina and Westergaard 2015b). More about this later.
Norwegian noun phrase syntax is relatively complex, and it has been extensively discussed in the literature; see Delsing (1993), Vangsnes (1999), and Julien (2005). Here we only discuss aspects of the noun phrase that are relevant for gender. Norwegian dialects also differ considerably with respect to the specific morphological marking on nouns. Table 14.1 provides an overview of the three-way gender system (based on the written Bokmål norm).
Gender in Norwegian is mainly expressed inside the noun phrase (and on predicative adjectives, not discussed in this article). Thus, gender is marked on the indefinite article, for example, en “a.m”, ei “a.f”, and et “a.n”, and on adjectives, where we find syncretism between M and F forms.2 As shown in Table 14.1, the definite article in Norwegian is a suffix, for example, hesten, “the horse”; senga, “the bed”; huset, “the house”. Some traditional grammars of Norwegian analyze the postnominal definite suffix as an expression of gender (e.g., Faarlund, Lie and Vannebo 1997), mainly because it is derived diachronically from postnominal demonstratives (separate words), which used to be marked for gender. Given our definition in section 14.1, however, these suffixes do not express gender but should be considered to be declension class markers.

Table 14.1 The Traditional Three-Gender System of Norwegian

Gender           Masculine                Feminine               Neuter

SG Indefinite    en hest                  ei seng                et hus
                 “a horse”                “a bed”                “a house”
Definite         hesten                   senga                  huset
                 “horse.def”              “bed.def”              “house.def”
Double           den hesten               den senga              det huset
definite         “that horse.def”         “that bed.def”         “that house.def”
Adjective        en fin hest              ei fin seng            et fint hus
                 “a nice horse”           “a nice bed”           “a nice house”
Possessive       min hest/hesten min      mi seng/senga mi       mitt hus/huset mitt
                 “my horse/horse.def my”  “my bed/bed.def my”    “my house/house.def my”

Since the definite suffix is sometimes considered to express gender, also in current work (e.g., Johannessen and Larson 2015), it is worth pausing to consider the evidence in favor of suffixes being declension class markers. This view is most prominently articulated by Lødrup (2011), based on a careful investigation of (a variety of) the Oslo dialect, where the feminine gender is argued to have been lost. The main piece of evidence is that despite the –a suffix (definite article) appearing on previously feminine nouns, all associated words are inflected as masculine in this dialect. Thus, the pattern is en bok, “a.m book”, but boka, “the book” (with the definite suffix for feminines). All adjectives and possessives are masculine, with the exception of certain instances of postnominal possessives. Together, these facts indicate that the gender of these nouns is M and that the suffix is indicating something that is not gender. Lødrup (2011), following Enger (2004), argues that the suffix expresses declension class, the inflection that is used for definite forms. As Alexiadou (2004: 25) points out, “[. . .] inflection class [. . .] is never relevant for the purposes of agreement. It merely groups nouns into classes, which do not determine any further properties”. In essence, then, the distinction between gender markers and declension class markers is based on different properties: The latter are always bound morphemes and appear on the noun itself, whereas the former do not appear on the noun.
Following Corbett and Fedden (2016), it could be argued that in systems where gender markers and declension class markers align, we have a canonical gender system, whereas the Oslo dialect exhibits a noncanonical gender system, where the definiteness suffix does not encode gender.
Gender is also marked on possessives, which may be either pre- or postnominal. Note that the noun is marked for definiteness when the possessor appears after the noun. In contrast, the definite suffix is impossible if the possessor is prenominal. According to Anderssen and Westergaard (2012), who have investigated both the NoTa corpus of adult speech (Oslo)3 as well as a corpus of child-directed speech recorded in Tromsø (Anderssen 2006), the frequency of the postnominal possessor construction is much higher than that of the prenominal one (attested approximately 75%). The proportion of the postnominal possessor construction has been found to be even higher in American Norwegian heritage language, as the majority of the speakers investigated (N = 34) produce virtually only this word order (Westergaard and Anderssen 2015). This is relevant for our investigation of gender, as it has been argued that the possessor is not an exponent of gender when it is placed postnominally (cf. Lødrup 2011). This means that it could be treated like a declension class marker just like the definite suffix, and as just mentioned, the postnominal possessive also retains the feminine form much more than the prenominal one. We return to this in section 14.4.
Finally, we should note that Norwegian exhibits a phenomenon called double definiteness, requiring that definiteness be marked twice in certain contexts, notably in demonstratives and in modified noun phrases. This means that definiteness is marked both on a prenominal determiner and on the suffix.
While double definiteness adds complexity to the Norwegian noun phrase, it is also worth noting that in the case of the prenominal determiner, there is again syncretism between M and F forms (cf. Table 14.1).

14.3 Grammatical Gender in Acquisition and Attrition

14.3.1 The Acquisition of Gender

Grammatical gender is a complex linguistic phenomenon. A child or a second-language learner acquiring a language with gender thus often has to internalize a range of different cues that contribute to determining the gender of a given noun. For the acquisition of grammatical gender in Norwegian, the lack of transparency of gender assignment has been shown to be a major challenge. While gender is typically acquired around the age of three in languages with a transparent gender system, such as Russian (e.g., Gvozdev 1961) or many Romance languages (e.g., Eichler, Jansen and Müller 2012, on various bilingual Romance-German combinations), gender has been shown to be in place relatively late in Norwegian. Based on corpora of two monolingual and two bilingual (Norwegian-English) children (age approximately 2–3), Rodina and Westergaard (2013) found considerable overgeneralization of masculine forms (by far the most frequent forms in the input) to both feminine and neuter nouns (63% and 71%, respectively). In a more recent experimental study of somewhat older children and adults, Rodina and Westergaard (2015b) find that neuter gender is not in place (at 90% accuracy; cf. Brown 1973) until the age of approximately 7. It is also shown that the feminine is even more vulnerable among the older children. Rodina and Westergaard argue that this latter finding is due to an ongoing change in the dialect (Tromsø) from a three-gender system to a two-gender system, common and neuter. In both studies, they also show that, while proper gender forms such as the indefinite article are acquired late, the corresponding declension class markers (e.g., the definite suffix) are target-consistently in place from early on. In fact, the acquisition patterns for indefinite and definite forms are the mirror image of one another at an early stage, with non-target-consistent production around 90% for the former category and only about 10% for the latter.
This means that young children typically produce the masculine form of the indefinite article with nouns of all three genders (e.g., en.m hest(M), "a horse"; en.m seng(F), "a bed"; en.m hus(N), "a house"; cf. Table 14.1), while the definite suffix is target-consistent (hesten, "the horse"; senga, "the bed"; huset, "the house"). Results confirming this pattern are also attested in an experimental study of bilingual Norwegian-Russian children (Rodina and Westergaard 2015a). These findings show that learners do not create an immediate link between the definite suffix and the agreement forms, indicating that the two belong to different systems, and thus support the distinction between gender and declension class in Lødrup (2011).

418 Multilingualism and Formal Grammar

14.3.2 Gender in Heritage Language Situations

Over the past twenty years, there has been an increasing focus on the language of heritage speakers. We adopt the following definition of a heritage language: "A language qualifies as a heritage language if it is a language spoken at home or otherwise readily available to young children, and crucially this language is not a dominant language of the larger (national) society" (Rothman 2009: 156; see also, e.g., Rothman 2007, Polinsky 2008, Benmamoun, Montrul and Polinsky 2013). One characteristic of heritage grammars is that they may differ from those of speakers acquiring the same language as a majority language, due to incomplete acquisition (e.g., Polinsky 1997, 2006, Montrul 2002, 2008, Sorace 2004, Tsimpli et al. 2004) or attrition (e.g., Pascual y Cabo and Rothman 2012, Putnam and Sánchez 2013). This means that a heritage language grammar may represent a change compared to the grammar of the previous generation as well as to the relevant nonheritage variety. The baseline language for a heritage speaker is the language of exposure during childhood.
This means that a heritage speaker of Russian in the US should not, strictly speaking, be compared to a speaker of Russian in Russia. This makes studying heritage languages quite challenging, given that it is often difficult to establish the relevant properties of the primary linguistic data that the learners have been exposed to. Because of this lack of data across generations, a comparison is often made between the heritage language grammar and the nonheritage variety—with the caveat that the latter does not necessarily represent the input to the generation of heritage speakers studied. This is what we have had to do in the current study. Heritage speakers also differ from non-heritage speakers of the same language with respect to the amount of variation attested in their production; while some speakers have a fairly stable grammar, others display a more variable grammar, not applying rules consistently (see Montrul 2008 for discussion). It is well known that for heritage speakers, the amount of input and use of the language during childhood varies (see Montrul, Foote, and Perpinan 2008, among many others). Given the complexity of gender, it is to be expected that heritage speakers face difficulties with this part of the grammar. This has been investigated for Russian heritage language in the US by Polinsky (2008). Like Norwegian, Russian has three genders: masculine, feminine, and neuter; see Corbett (1991: 34–43) and Comrie, Stone, and Polinsky (1996: 104–117) for further details and references. According to Corbett (1991: 78), the distribution of the three genders is M 46%, F 41%, and N 13%. Gender agreement is marked on adjectives, participles, demonstratives, possessive pronouns, past tense verbs and some numerals, and gender assignment is relatively transparent in that M nouns typically end in a consonant, F nouns in –a, and N nouns in –o. There are also some classes of nouns with nontransparent gender assignment.
Given somewhat reduced input, heritage speakers are typically exposed to fewer cues for gender assignment than children learning nonheritage Russian. Polinsky (1997, 2006) shows that less proficient American Russian speakers do not fully master the complex system of declension classes. In Polinsky (2008: 55), she demonstrates that two new gender systems have developed among the heritage speakers, both somewhat different from that of the nonheritage variety: (1) a three-gender system used by the more proficient speakers, differing from the nonheritage variety in that opaque N nouns ending in an unstressed –o are produced with F gender (i.e., they are pronounced with a schwa and therefore confused with the feminine ending –a), and (2) a two-gender system produced by the less proficient speakers in which all N nouns have migrated to F. It is speculated that the latter speakers do not master the complex system of declensional case endings, and in the absence of this knowledge, they rely on a purely phonological cue, that is, whether the noun in its base form (nominative singular) ends in a consonant or a vowel. The two systems are described in (1) and (2):

(1) More proficient speakers: Three-gender system
    a. nouns ending in a consonant are M
    b. nouns ending in a stressed –o are N
    c. all other nouns are F (i.e., including nouns ending in an unstressed –o, which are N in non-heritage Russian)

(2) Less proficient speakers: Two-gender system
    a. nouns ending in a consonant are M
    b. nouns ending in a vowel are F
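The two heuristic systems in (1) and (2) can be sketched as a small decision procedure. The following is our own illustration, not Polinsky's implementation: nouns are given as transliterated strings, and whether a final -o is stressed must be supplied separately, since transliteration alone does not encode stress.

```python
def heritage_gender(noun, stressed_final_o=False, proficient=True):
    """Assign gender to a (transliterated) Russian noun according to the
    restructured heritage systems in (1) and (2).

    proficient=True  -> the three-gender system in (1)
    proficient=False -> the two-gender system in (2)
    """
    vowels = "aeiou"
    if noun[-1] not in vowels:
        return "M"          # (1a)/(2a): consonant-final nouns are M
    if proficient and noun.endswith("o") and stressed_final_o:
        return "N"          # (1b): only stressed -o is retained as N
    return "F"              # (1c)/(2b): all other vowel-final nouns are F
```

For instance, okno, "window", ends in an unstressed -o and is N in nonheritage Russian, but comes out F under both heritage systems, matching the migration of opaque N nouns to F described above.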

In a recent study of Norwegian-Russian bilingual children growing up in Norway (age 4–8), Rodina and Westergaard (2015a) find an even more reduced gender system in some of the children. The amount of input is argued to be crucial: While children with two Russian-speaking parents are virtually identical to monolingual children growing up in Russia, the bilinguals with the least amount of input (only one Russian-speaking parent, who does not use Russian consistently with the children) have considerable problems with gender, not just with the opaque nouns but also with the transparent ones. In fact, some of these children produce almost exclusively masculine forms, overgeneralizing them to feminine nouns 77% of the time and to neuter nouns as much as 94% of the time, which means that they do not seem to have any gender distinctions at all. Since these children are only up to eight years of age, follow-up studies are necessary in order to find out whether they will eventually converge on the target, or whether they are developing a Russian heritage variety without gender.

14.3.3 Gender and Diachronic Change

It is well known that M and F genders have collapsed into common gender (C) in many Germanic languages and dialects. This change has taken place, for example, in Dutch, Danish, and the Bergen dialect of Norwegian (Jahr 1998, Nesse 2002, Trudgill 2013). Furthermore, Conzett, Johansen, and Sollid (2011) have attested a similar change in certain dialects in North Norway (Kåfjord and Nordreisa). This region has had extensive language contact with Saami and Kven/Finnish, languages which do not have grammatical gender. This language contact is argued to have caused a reduction of the gender system of the Norwegian spoken in this area from three genders to two (C and N). At the same time, the declension system is intact.
This means that while the feminine indefinite article ei “a.F” is virtually nonexistent in the data, the corresponding definite suffix still has the –a ending typical of F nouns. This is illustrated in (3):

(3) a. en bok - boka
       a.c book.c - book.f.def

This pattern is identical to what Lødrup (2011) found for Oslo speech (cf. section 14.2). The cause of the change in Oslo is generally argued to be sociolinguistic: The Bokmål written standard allows the use of only two genders, and a spoken version of this variety enjoys high social prestige in certain speaker groups. Thus, the three-gender system of the traditional dialects has gradually become associated with something rural and old-fashioned. The pattern attested means that a reduced gender system has developed in both areas (common and neuter), but at the same time a more complex declension system, in that the new common gender has two declension classes in the definite form, that is, en bil—bilen, "a car—the car", and en bok—boka, "a book—the book". Even more recent research is providing us with data on a real-time case of language change. Based on an experimental study, Rodina and Westergaard (2015b) demonstrate that F gender is rapidly disappearing from the speech of children and young adults in Tromsø: The F indefinite article is replaced by M, yielding common gender, but as in Oslo and Kåfjord/Nordreisa, the definite suffix is still preserved in its F form. Note that this pattern is also identical to what has been attested in early Norwegian child language (cf. section 14.3.1). While Rodina and Westergaard (2015b) also assume that the cause of this change is sociolinguistic, they argue that the nature of the change is due to acquisition: While the N forms are saliently different from the other two genders, there is considerable syncretism between M and F (e.g., on adjectives and prenominal determiners), making it more difficult to distinguish the two in the acquisition process (cf. Table 14.1).
Furthermore, while the real gender forms are acquired very late (around age 5–7), the declensional suffixes are target-consistently in place very early (around age 2); compare Anderssen (2006) and Rodina and Westergaard (2013). Thus, the late acquired forms are the ones that are vulnerable to change. The three studies briefly presented here demonstrate that F gender is disappearing or already lost from several Norwegian dialects. We would thus expect F gender to be vulnerable in an acquisition context where there is somewhat reduced input, for example, in a heritage language situation. In the following sections, we present our study of gender in American Norwegian.

14.4 Our Study: Participants, Hypotheses, and Methodology

14.4.1 Norwegian Heritage Language in America

According to Johannessen and Salmons (2015: 10), Norwegian immigration started in 1825, when the first Norwegians arrived in New York. By 1930, as many as 810,000 people had arrived in the US and an additional 40,000 in Canada. In the US, they settled mostly in the Midwest, predominantly in the Dakotas, Illinois, Iowa, Minnesota, and Wisconsin. The Norwegians built churches and schools and also had their own newspapers, Decorah-Posten and Nordisk Tidende. According to Johannessen and Salmons (2015: 6), 55,465 people reported Norwegian as their home language in the 2000 US Census. However, most of the current heritage speakers are older than seventy years of age. American Norwegian as a heritage language can thus be said to be in its final stages (cf. Johannessen and Salmons 2012). American Norwegian was first documented and studied by Haugen (1953), based on fieldwork in the late 1930s and 1940s, and this heritage language was subsequently studied by Hjelde (1992, 1996). More recently, extensive fieldwork has been conducted in connection with the NorAmDiaSyn project, and data have been collected from a number of second- through fourth-generation immigrants who learned Norwegian as their first language (L1) from parents and grandparents. According to Haugen (1953: 340), the first immigrants were from the west coast of Norway, but around 1850, large numbers came from rural eastern parts of Norway (Johannessen and Salmons 2015: 10). It is mainly these eastern varieties that are spoken today: Johannessen and Salmons (2015) remark that in 2010 it was difficult to find speakers of western dialects. For most of the immigrants, there was little or no support for the Norwegian language in the community. Consequently, these speakers have generally been bilingual since the age of five to six, and they have been dominant in English since this time.
The background information offered about the corpus participants is relatively sparse: year of birth, language of schooling and confirmation, literacy in Norwegian, and number of visits to Norway, as well as other contact with the country. In addition, we know which generation immigrant they report to be, and for some of them, the year their family arrived in the US. There is no information about the amount of use of Norwegian in adulthood. The language of schooling is English for all of them (except two informants for whom this information is missing), and the large majority (43/50) had their confirmation in English. Contact with Norway varies between "some" and "often", and many have never visited the country. Typically, these heritage speakers have never had any instruction in Norwegian, and most of them have no literacy skills in the language. The majority of the participants are between seventy and one hundred years old today, and as they have not passed on the language to the next generation, they do not have many people to communicate with in Norwegian. Thus, most of these heritage speakers hardly ever use Norwegian any more, and at the time of the CANS recordings, many of the participants had not uttered a word of Norwegian for years, one participant for as long as fifty years. The initial impression of their Norwegian proficiency is that it is quite rusty, but once these speakers warm up, many properties of the language turn out to be intact (Johannessen and Laake 2015). Given the language profile of these learners (monolingual Norwegian speakers until school age, predominantly English dominant in adult life, and hardly using Norwegian at all in old age), it is possible that any discrepancies between their language and the nonheritage variety are due to attrition rather than incomplete acquisition.
So far, data from fifty informants have been transcribed and now make up the Corpus of American Norwegian Speech (CANS; Johannessen 2015). This corpus consists of speech data collected through interviews (by an investigator from Norway) and conversations among pairs of heritage speakers. Each recording lasts approximately half an hour to an hour, meaning that there is relatively sparse data per informant.

14.4.2 Hypotheses and Predictions

Based on the properties of the gender system of Norwegian and previous research on gender in acquisition and change, we formulate the following hypotheses and predictions for American Norwegian:

(4) Hypotheses
    A. Gender is vulnerable in American Norwegian
    B. Gender forms and declensional suffixes behave differently
    C. F is more vulnerable than N due to syncretism with M

(5) Predictions
    A. Speakers will overgeneralize M gender forms
    B. Declensional suffixes will be retained
    C. F will be affected first; that is, (some speakers of) American Norwegian will have a two-gender system (common and neuter)

We expect gender to be vulnerable in a situation with reduced input such as Norwegian heritage language, especially given the nontransparency of the gender system and the relatively late acquisition attested by Rodina and Westergaard (2015b). We also expect to see a difference between forms that express gender proper (i.e., agreement) and the declensional endings, as has been attested in previous research on both acquisition and change (e.g., Lødrup 2011, Rodina and Westergaard 2013). Finally, as in Russian heritage language and in many Germanic varieties, we may also see reductions in the gender system, either from a three- to a two-gender system (common and neuter) or to a system where gender breaks down completely.

14.4.3 Methodology

We have used CANS to probe the usage of gender in American Norwegian. We have generally excluded English loan words appearing with gender marking (see Flom 1926, Hjelde 1996, Nygård and Åfarli 2013, Alexiadou et al. 2014 on this issue).4 Our main focus here is on gender assignment, and we have therefore also disregarded agreement between different gender forms within the nominal phrase. We have searched CANS for the following forms:

(6) a. the indefinite article followed by a noun (occasionally with an intervening adjective)
    b. possessives
    c. definite forms

We have also compared the data from the CANS corpus to a sample of the Nordic Dialect Corpus (Johannessen et al. 2009). This allows us to compare the gender system of American Norwegian to that of contemporary Norwegian. We would like to emphasize that we obviously do not assume that the heritage speakers recorded in the CANS corpus were exposed to a variety of Norwegian that is identical to the nonheritage variety spoken today. But we are interested in investigating possible changes in the heritage variety, possibly across several generations, and these are the data we have available to make the comparison. We have used the part of the Nordic Dialect Corpus which covers the dialects spoken in the eastern part of Norway (excluding the capital, Oslo), the area from which most of the ancestors of the heritage speakers originate. The Nordic Dialect Corpus consists of structured conversations between speakers of the same dialect, and as such, the two corpora are comparable with respect to the recording situations. In the Nordic Dialect Corpus, speakers are classified as either "old" (over 50) or "young" (under 30), where most of the informants in the two groups are in their sixties and twenties, respectively. The corpus was recorded between 2008 and 2011. Both corpora have been transcribed into a dialect version and a standardized Bokmål transcription. The corpora are tagged, and the transcriptions are directly linked to the recordings. In CANS, we found that in several cases, the Bokmål transcription had standardized the gender according to the official Bokmål dictionary, even when the informants actually used a different gender. Thus, we have had to check the recordings carefully in order to be sure that we had reliable transcriptions. We generally did not find errors in the dialect version (corresponding to the pronunciation), which made us trust that this transcription is sufficiently correct for our present purposes.
Furthermore, there are some instances where the F indefinite article has been transcribed simply as /e/. We have listened to all of these, and in all cases the informants seem to be saying the feminine form /ei/. They have therefore been counted as occurrences of the F indefinite article. Compound nouns (e.g., skolehus, "schoolhouse") have been counted separately. In Norwegian, the right-hand part of a compound is always the head noun and thus determines the gender. For several of the compound words in the corpus, the right-hand noun also occurs independently (e.g., hus, "house"). Instances in which the noun was not uttered completely were disregarded. In cases where speakers correct themselves, as in (7a), we only counted the latter form. Examples have also been counted if they occur in what would be considered an ungrammatical or unidiomatic structure in Norwegian, for example, (7b), which is presumably a direct translation of an English expression.

(7) a. ei # en familie (flom_MN_02gm)
       a.f # a.m family.m
    b. vi hadde en god tid (portland_ND_01gm)
       we had a.m good time.f
       Target form (intended meaning): vi hadde det morsomt [lit.: we had it fun]
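The counting conventions just described (discarding incompletely uttered nouns and keeping only the final article in a self-correction such as (7a)) can be made explicit in a short sketch. This is purely our illustration of the procedure, not the actual tooling used with CANS; the token tags and data format are invented for the example.

```python
def count_article_noun_pairs(tokens):
    """Count (article, noun) pairs from a tagged token stream.

    tokens: list of (tag, form) pairs, with tag in {"ART", "NOUN", "INC"},
    where INC marks a noun that was not uttered completely.
    In a self-correction (ART ART NOUN), only the latter article counts.
    """
    counts = {}
    current_article = None
    for tag, form in tokens:
        if tag == "ART":
            current_article = form      # a restarted article replaces the first
        elif tag == "NOUN" and current_article is not None:
            pair = (current_article, form)
            counts[pair] = counts.get(pair, 0) + 1
            current_article = None
        else:
            current_article = None      # incomplete noun: discard the pair
    return counts

# (7a) "ei # en familie": only the corrected pairing is counted
# count_article_noun_pairs([("ART", "ei"), ("ART", "en"), ("NOUN", "familie")])
# -> {("en", "familie"): 1}
```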

With these methodological considerations in mind, let us move on to the results of our study.

14.5 Results

14.5.1 Gender Marking on the Indefinite Article—Overall Results

Our search in CANS first of all revealed that all three gender forms are attested in the data. Examples illustrating the use of the three indefinite articles en, ei, and et (M, F, and N) are provided in (8) through (10). In these examples, the gender marking is entirely in line with what we would expect in present-day nonheritage Norwegian. It is also worth noticing that although there is some language mixing between English and Norwegian here, the sentences are predominantly Norwegian in structure and lexicon.

(8) vi kjøpte en butikk (blair_WI_04gk)
    we bought a.m store(m)
    "we bought a store"

(9) og ei uke sia så h- visita vi parken i Blair her (blair_WI_01gm)
    and a.f week(f) ago so visited we park.def in Blair here
    "a week ago we visited the park in Blair here"

(10) we got har bare et tre (coon_valley_WI_04gm)
     we got have only a.n tree(n)
     "we only got one tree"

In a study of the Nynorsk dictionary (Hovdenak et al. 1998), which is the written norm that is closest to the contemporary dialects, Trosterud (2001) found that out of the 31,500 nouns listed there, 52% are M, 32% are F, and 16% are N. These numbers are somewhat different from the distribution in the spoken language. Rodina and Westergaard (2015b) have investigated the proportions of the indefinite article in a corpus of child and child-directed speech recorded in the mid-1990s (Anderssen 2006) and found that M forms are even more frequent in the input than in the dictionary, 62.6%, while the F and N forms are more or less equally represented, 18.9% and 18.5%, respectively (N = 2,980). We have investigated the occurrences of the three indefinite articles in the Nordic Dialect Corpus, and we find that the distribution in the data of the "old" speakers is virtually identical to Rodina and Westergaard's (2015b) findings; see Table 14.2. In the data of the "young" speakers, on the other hand, the F indefinite article is attested only 5.4% of the time, while the proportion of M forms has increased to 74.9%. We believe that it is likely that these numbers reflect an ongoing change involving the loss of F forms in these dialects as well, just like in Oslo and Tromsø (cf. section 14.3.3). A careful study of the Nordic Dialect Corpus in order to confirm (or disconfirm) this hypothesis has to be left for future research. In Table 14.2, we have also provided the relevant counts from the CANS corpus. Overall, the figures for the heritage speakers indicate that gender is relatively stable in American Norwegian, as they are quite similar to those of the older speakers in the Nordic Dialect Corpus, except for a lack of neuter forms. However, a closer look reveals that the heritage speakers are overgeneralizing the M gender forms quite substantially to both F and N nouns. We now turn to a discussion of these discrepancies between the CANS corpus and forms found in present-day spoken Norwegian.

Table 14.2 Token Distributions of the Three Indefinite Articles, en (M), ei (F), and et (N), in CANS and in Eastern Norwegian Dialects (Nordic Dialect Corpus)

Gender   CANS (N = 50)   NorDiaCorp (old, N = 127)   NorDiaCorp (young, N = 66)
M        76.3% (753)     64.8% (1833)                74.9% (909)
F        16.9% (165)     18.2% (514)                 5.4% (66)
N        6.9% (67)       17.0% (481)                 19.7% (239)

14.5.2 Overgeneralization—Indefinite Articles

Although all gender forms are represented in the corpus, and gender thus appears to be relatively stable, there are several cases of what we will refer to as non-target-consistent forms, that is, forms that are different from what would be expected in nonheritage Norwegian. When determining the gender of nouns in nonheritage Norwegian, we have used the Nynorsk Dictionary with some adjustments for differences between the dictionary and the gender typically found in Eastern Norwegian dialects.5 In this section, we consider nouns with the indefinite article, either by itself or together with an adjective. We first consider all noun occurrences (tokens) and then the number of different nouns (types) appearing in the corpus. In the corpus, we find 236 occurrences of F nouns. As many as 39.0% (92/236) of these appear with M gender; see (11) through (13):

(11) og om in in en uke da så # kom han til byen igjen (rushford_MN_01gm)
     and about in in a.m week.f then so came he to city again

(12) og # tre brødre og r- en s- # en søster (blair_WI_04gk)
     and # three brothers and a.m a.m sister.f

(13) ja # em # har du # har du en ku enda? (coon_valley_WI_01gk)
     yes # em # have you have you a.m cow.f still?

We should note that there is considerable variation between M and F forms used with some F nouns in the corpus. For example, datter, "daughter", occurs with both F and M indefinite articles. Individual speakers appear to be consistent and typically do not alternate. However, given the sparse data in CANS, we very often find that a speaker only produces one or two instances of the same noun. For this reason, we cannot address the question of speaker consistency in any detail. Turning to the neuter, we find 164 occurrences of nouns that are N according to the Nynorsk dictionary and our Eastern Norwegian adjustments. Of these, as many as 48.8% (80/164) appear with the M indefinite article. Examples are provided in (14) through (16):

(14) # bestemor # var født # hun var på en fjell ## (chicago_IL_01gk)
     grandma was born she was on a.m mountain.n

(15) fire år # og en en år at to the university (wanamingo_MN_04gk)
     four years # and a.m a.m year.n at to the university

(16) fikk jeg en pass (coon_valley_WI_02gm)
     got I a.m passport.n

There are also occasional N nouns appearing with F gender forms, 10.4% (17/164); see the examples in (17) through (19). Considering the current trend in Norway, with F gender in the process of disappearing, it is rather surprising that there is overuse of feminine forms.

(17) det var ei menneske (westby_WI_05gm)
     it was a.f human.being.n

(18) han var her det var ei bryllup (harmony_MN_02gk)
     he was here it was a.f wedding.n

(19) jeg tror ikke jeg sa ei eneste norsk ord (harmony_MN_04gm)
     I think not I said a.f single Norwegian word.n

Finally, we found four examples of non-target-consistent gender on M nouns, in all cases produced with the F indefinite article. This amounts to only 0.7% (4/576). We now take a closer look at the number of actual nouns involved (types). Because of the very low number of non-target-consistent M nouns, we only consider F and N. The list in (20) provides all F nouns that occur with the target-consistent indefinite article (altogether 51 nouns), where the ones in bold are sometimes produced with M (10 nouns). In (21) we find twenty-one F nouns that always appear with M gender in the corpus. In total, there are seventy-two different F nouns, of which thirty-one are either always or sometimes produced with M gender forms. This means that overgeneralization across types is 43.1% (31/72), which is similar to the rate for noun tokens reported above, 39.0%.

(20) F=F: stund, "time"; søster, "sister"; kanne, "mug"; trå, "yearning"; side, "side"; kjerring, "hag"; seng, "bed"; uke, "week"; jente, "girl"; lefse, "lefse"; kiste, "coffin"; mølle, "mill"; øks, "ax"; tid, "time"; mjølking, "milking"; ku, "cow"; kvige, "heifer"; grøft, "trench"; brødpanne, "bread pan"; bok, "book"; trinse, "caster"; mil, "mile"; høstnatt, "fall night"; datter, "daughter"; dame, "lady"; bjelle, "bell"; tobakksseng, "tobacco bed"; ei [female name removed], "a female name"; bestemor, "grandmother"; hytte, "hut"; frilledatter, "daughter of a mistress"; gryte, "pot"; aure, "trout"; liste, "list"; skrøne, "tall tale"; rumpe, "butt"; stikke, "peg"; pakke, "package"; pike, "girl"; mor, "mother"; trønderskrøne, "tall tale from Trøndelag"; dør, "door"; platform, "platform"; himmelseng, "four-poster bed"; kirke, "church"; tante, "aunt"; hand, "hand"; matte, "mat"; lue, "cap"; bøtte, "bucket"; datter, "daughter" (41 + 10 = 51)

(21) F->M: blanding, "mixture"; mil, "mile"; flaske, "bottle"; tale, "speech"; stund, "while"; gruppe, "group"; ordbok, "dictionary"; hast, "haste"; rotte, "rat"; vogn, "wagon"; avis, "newspaper"; pipe, "pipe"; elv, "river"; stripe, "stripe"; kagge, "keg"; purke, "sow"; slekt, "family"; øy, "island"; dialect, "dialect"; klasse, "class"; lærerinne, "female teacher" (21)

Considering N nouns, (22) lists all the ones that occur with the target-consistent indefinite article (altogether 23 nouns). Nouns in bold also appear with the M indefinite article (11 nouns), while nouns that are underlined also appear with F (8 nouns). In (23) we find N nouns which only appear with the F indefinite article, and in (24) N nouns that consistently appear with the M indefinite article.

(22) N=N: hotell, "hotel"; par, "pair/couple"; år, "year"; fat, "plate"; brev, "letter"; lass, "load"; hus, "house"; lag, "layer"; hull, "hole"; skolehus, "school"; bilde, "picture"; sted, "place"; fjell, "mountain"; blad, "magazine"; ord, "word"; rom, "room"; leven, "noise"; stykke, "piece"; slag, "blow"; navn, "name"; minutt, "minute"; liv, "life"; problem, "problem" (12 + 11 = 23)

(23) N->F: menneske, "human being"; hjem, "home"; bryllup, "wedding"; barnebarn, "grandchild"; papir, "paper" (5)

(24) N->M: barnetog, "children's parade"; farmeår, "farm year"; program, "program"; pass, "passport"; tømmerhus, "log cabin"; tog, "train"; arbeid, "work"; patent, "patent"; dusin, "dozen"; bord, "table"; band, "band"; lys, "light"; oppstuss, "fuss"; eiketre, "oak"; utvandrermuseum, "emigration museum"; kort, "card"; mål, "measure"; måltid, "meal"; kupp, "bargain"; selvfirma, "independent company"; orkester, "orchestra" (21)

The total number of different N nouns is forty-nine. As many as thirty-four of them (always or sometimes) appear with an M indefinite article (69.4%), while thirteen (always or sometimes) appear with F gender (26.5%). This means that N nouns are quite unstable in the production of these heritage speakers. Table 14.3 summarizes our findings, considering both the total number of noun occurrences (tokens) in the data as well as the number of different nouns (types).
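The token and type rates just discussed follow directly from the raw counts reported in the text. As a simple arithmetic check (using only numbers given above), they can be recomputed as follows:

```python
def rate(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)

# tokens: occurrences with a non-target-consistent indefinite article
token_rates = {
    "F->M": rate(92, 236),   # F nouns produced with the M article
    "N->M": rate(80, 164),
    "N->F": rate(17, 164),
}

# types: different nouns affected (always or sometimes)
type_rates = {
    "F->M": rate(31, 72),
    "N->M": rate(34, 49),
    "N->F": rate(13, 49),
}
# token_rates -> {"F->M": 39.0, "N->M": 48.8, "N->F": 10.4}
# type_rates  -> {"F->M": 43.1, "N->M": 69.4, "N->F": 26.5}
```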

14.5.3 Gender versus Inflection Class

As we have seen, many of the F and N nouns in the corpus (always or sometimes) occur with an M indefinite article (31/72 and 34/49, respectively), as shown in (25) and (27). However, when we consider the definite suffixes on these same nouns, they are usually the feminine –a and neuter –et forms, not the masculine –en. This is shown in (26) and (28), where the numbers in parentheses indicate occurrences. In fact, for the neuter nouns, the masculine declensional suffix is unattested (cf. Johannessen and Larsson 2015).

Table 14.3 Summary of Noun Tokens and Noun Types Appearing With a Non-Target-Consistent Indefinite Article

Direction   Total number of examples (tokens)   Number of different nouns (types)
F -> M      39.0% (92/236)                      43.1% (31/72)
N -> M      48.8% (80/164)                      69.4% (34/49)
N -> F      10.4% (17/164)                      26.5% (13/49)

(25) en datter, "a daughter"; en tid, "a time"; en kirke, "a church"; en uke, "a week"

(26) dattera (24)—datteren (0), tida (206)—tiden (13), kirka (80)—kirken (3), uka (14)—uken (0)

(27) en år, "a year"; en tog, "a train"; en hus, "a house"; en lys, "a light"

(28) året (31)—åren (0), toget (9)—togen (0), huset (60)—husen (0), lyset (3)—lysen (0)

This mirrors findings from other studies, showing that when the feminine gender is lost, the definite suffix is retained (e.g., Lødrup 2011, Rodina and Westergaard 2015b). This demonstrates that the affixal definite article clearly behaves differently from the free gender morphemes that agree with the noun, for example, the indefinite article, not only in contexts of acquisition and change, as attested in previous research, but also in heritage language. Related to this is the result of our search for possessives in the corpus. Recall from section 14.2 that possessives in Norwegian may appear both in prenominal and postnominal position and that Westergaard and Anderssen (2015) reported that in Norwegian heritage language, the postnominal construction is the preferred one. First of all, our findings show that the possessives used in the corpus are mainly high-frequency kinship terms (more than 90%) of the type illustrated in (29) and (30); thus, they may be rote-learned or memorized and not necessarily the result of a productive system. We also find that the numbers are very low for all possessives except the first-person singular, and this is therefore the only result reported here (Table 14.4):

(29) a. mor mi (44)
        mother my
        "my mother"
     b. søstera mi (10)
        sister.def my
        "my sister"

Table 14.4 Distribution of Gender Marking for First-Person Possessives in CANS (N = 50)

Gender form     Prenominal       Postnominal       TOTAL

M (min 'my')    87/96 (90.6%)    251/414 (60.6%)   338/510 (66.3%)
F (mi 'my')     0 (–)            126/414 (30.4%)   126/510 (24.7%)
N (mitt 'my')   9/96 (9.4%)      37/414 (8.9%)     46/510 (9.0%)

430 Multilingualism and Formal Grammar

(29) c. bestemora mi (4)
        grandmother.def my
        "my grandmother"
(30) a. far min (102)
        father.def my
        "my father"
     b. bror min (36)
        brother.def my
        "my brother"
     c. mannen min (35)
        husband.def my
        "my husband"

Compared to the results in Table 14.2, where the proportion of F indefinite articles was only 16.9%, it is a bit surprising that the proportion of F forms is as high as 24.7%. However, as we mentioned earlier, the postnominal possessor has been argued to be a declension class marker and not an exponent of gender (Lødrup 2011). In this table, we also see that the prenominal possessives behave differently from the postnominal ones: the feminine form is attested relatively frequently postnominally, as a declension class marker (30.4%), but not at all prenominally, as a gender form. This difference becomes even clearer when we consider whether the gender forms have been used target-consistently: In Table 14.5, feminine nouns are always produced with M gender in prenominal position (the gender form), but the feminine forms are generally retained when occurring postnominally, where we only find occasional non-target forms (both M and N). The fact that the F form is retained postnominally fits well with Lødrup's (2011) analysis that postnominal possessors behave like declension markers on a par with the affixal F definite endings. Turning to N nouns, we see that they also tend to migrate to M, somewhat more in prenominal than postnominal position (30.8% vs. 19.2%). In comparison, the masculine is virtually always produced with target-consistent gender agreement.

Table 14.5 Distribution of Genders for First-Person Possessives in CANS (N = 50), Target and Non-Target-Consistent Forms

Gender   Prenominal                       Postnominal
         Target   Nontarget               Target    Nontarget
M        40/40    0                       226/228   2 (to N, 0.9%)
F        0        43 (to M, 100%)         126/137   7 (to M, 5.1%) + 4 (to N, 2.9%)
N        9/13     4 (to M, 30.8%)         21/26     5 (to M, 19.2%)
TOTAL    49       47                      373       18

14.5.4 Individual Results

The individual production results for each of the fifty participants in the corpus are provided in the Appendix, for the indefinite article only, as this is the most frequent form produced. As expected, there is a very limited amount of data per informant, so that it is impossible to provide complete profiles of the gender system of each of them. Nevertheless, the participants have been divided into four groups. In Group 1, there are four participants for whom no conclusions can be drawn, as their production is too limited (one participant produces no indefinite forms at all, and three participants only produce masculine forms, for masculine nouns). In Group 2, we find five participants who may possibly have an intact three-gender system, as they make no mistakes. However, each of them produces only a handful of examples (11, 13, 9, 6, and 6, respectively), so it is possible that this is simply the result of sheer luck in the recording situation. Furthermore, only two of these five produce nouns in all three genders, while the remaining three only produce masculine and feminine nouns, not a single neuter. At the other end of the scale, there are nine informants who may not have gender at all (Group 3). These speakers produce masculine forms only, either for nouns belonging to two of the genders (4 participants) or all three (5 participants). The final group (Group 4) thus contains the majority of informants (32), who produce a mixture of forms. For these, target-consistency varies considerably, from participants making only one mistake (e.g., decorah_IA_01gm), who are thus similar to Group 2, to those who produce only one form that is not masculine (e.g., portland_ND_02gk) and are thus similar to Group 3.
There is also variation with respect to which gender is more vulnerable: some seem to have more problems with feminine nouns (e.g., webster_SD_02gm), others with the neuter (e.g., coon_valley_WI_06gm), and others again with both (e.g., stillwater_MN_01gm). Eight informants produce no feminine forms, which at first sight could indicate that they have a two-gender system consisting of common and neuter. However, two of them do not produce any feminine nouns at all, and all of them also make a considerable number of mistakes with the neuter. Thus, not a single informant displays a clear two-gender system where the neuter is intact and the feminine has merged with the masculine into common gender.
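The four-way grouping of speakers described above can be expressed as a small decision procedure. The sketch below is our own illustration, not the procedure used in the study; the data shape (per-speaker counts of indefinite-article forms by noun gender) and the function name `classify` are hypothetical, and the conditions simply mirror the verbal criteria:

```python
# Classify a speaker's indefinite-article production into Groups 1-4.
# counts[noun_gender][article_form] = number of occurrences, both keyed "M"/"F"/"N".
# (Hypothetical data shape, for illustration only.)

def classify(counts):
    total = sum(sum(forms.values()) for forms in counts.values())
    if total == 0:
        return "Group 1"  # no indefinite forms at all: no conclusions possible
    only_masc_forms = all(
        form == "M"
        for forms in counts.values()
        for form, n in forms.items() if n > 0
    )
    has_non_masc_nouns = any(
        gender != "M" and sum(forms.values()) > 0
        for gender, forms in counts.items()
    )
    if only_masc_forms:
        # M forms for M nouns only is inconclusive (Group 1); M forms
        # across two or three noun genders suggests no gender (Group 3).
        return "Group 3" if has_non_masc_nouns else "Group 1"
    mistakes = sum(
        n for gender, forms in counts.items()
        for form, n in forms.items() if form != gender
    )
    return "Group 2" if mistakes == 0 else "Group 4"

# E.g., a speaker producing only masculine articles for nouns of all three genders:
print(classify({"M": {"M": 9}, "F": {"M": 2}, "N": {"M": 2}}))  # Group 3
```

Applied to per-speaker counts like those in Table 14.6, such a procedure would reproduce the grouping, modulo the judgment calls made for borderline speakers.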

14.6 Discussion

We now return to our hypotheses and predictions, repeated in (31) and (32) for expository convenience:

(31) Hypotheses
     A. Gender is vulnerable in American Norwegian
     B. Gender forms and declensional suffixes behave differently
     C. F is more vulnerable than N due to syncretism with M

(32) Predictions
     A. Speakers will overgeneralize M gender forms
     B. Declensional suffixes will be retained
     C. F will be affected first; that is, (some speakers of) American Norwegian will have a two-gender system (common and neuter)

In the results in section 14.5.1, we saw that all three genders are represented in the corpus, and the total numbers give the impression of a fairly stable system. However, when we considered the data in more detail (section 14.5.2), we saw that there is considerable overgeneralization of M forms of the indefinite article to both F and N nouns (cf. Table 14.3). The substantial overgeneralization of M to F is unsurprising, given the findings from previous studies. However, in the present study there is clearly more overgeneralization affecting neuter than feminine nouns, both when we consider the overall number of occurrences (tokens, 48.8% vs. 39.0%) and the number of different nouns affected (types, 69.4% vs. 43.1%); see Table 14.3. In the prenominal possessives, we find that the feminines are produced with masculine forms 100% of the time and the neuters approximately 31% of the time. Based on these results, we conclude that gender is in fact vulnerable in American Norwegian, and thus that our Hypothesis A has been confirmed. Likewise, we can confirm Prediction A: Although there are a number of cases where neuter nouns migrate to the feminine (10.4% of neuter tokens and 26.5% of neuter types; see Table 14.3), it is clear that the general pattern found for non-target-consistent forms is overgeneralization of the masculine.

Turning to Hypothesis and Prediction B, we saw in section 14.5.3 that the definiteness suffix behaves very differently from the indefinite article. While feminine and neuter indefinite articles are frequently produced with masculine forms, the definite suffix is always target-consistent in the neuter and mostly also in the feminine. This means that our findings confirm previous research both from acquisition and change (cf. sections 14.3.1 and 14.3.3), where the same distinction has been attested.
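The token and type rates cited in this section can be re-derived mechanically from the raw counts in Table 14.3. The following sketch is ours, added purely as an arithmetic check; the variable names are not from the original study:

```python
# Re-derive the overgeneralization rates in Table 14.3 from the raw counts:
# (direction, non-target tokens, total tokens, non-target types, total types).
table_14_3 = [
    ("F -> M", 92, 236, 31, 72),
    ("N -> M", 80, 164, 34, 49),
    ("N -> F", 17, 164, 13, 49),
]

for direction, tok, tok_total, typ, typ_total in table_14_3:
    print(f"{direction}: {100 * tok / tok_total:.1f}% of tokens "
          f"({tok}/{tok_total}), {100 * typ / typ_total:.1f}% of types "
          f"({typ}/{typ_total})")
# F -> M: 39.0% of tokens (92/236), 43.1% of types (31/72)
# N -> M: 48.8% of tokens (80/164), 69.4% of types (34/49)
# N -> F: 10.4% of tokens (17/164), 26.5% of types (13/49)
```

The printed figures match the percentages reported in Table 14.3, including the key asymmetry that neuter nouns are affected more than feminine nouns on both the token and the type count.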
As mentioned above, we consider the indefinite article to be an exponent of gender, whereas the affix is analyzed as a declension marker. The different behavior of these two elements also in this population of heritage speakers clearly shows that gender forms are much more prone to change than declension markers. The different behavior of the prenominal and postnominal possessives (at least for feminine nouns) also indicates that there is a distinction between the two that may be related to gender (cf. Lødrup 2011).

It should be noted here that our claim that gender is vulnerable in Norwegian heritage language runs counter to the conclusion reached by Johannessen and Larsson (2015). Based on an investigation of a selection of the 50 speakers in CANS, they argue that grammatical gender is not affected by attrition. The main reason for the two different conclusions is that, unlike us, Johannessen and Larsson (2015) do consider the definite suffix a gender marker. And since the form of the suffix is generally retained, they consider this evidence that gender is intact. Furthermore, they find that complex noun phrases (determiner-adjective-noun) are much more prone to errors than simple ones (adjective-noun), with 18% (20/113) vs. 2% (1/58) target-deviant agreement. They argue that this shows that gender is unaffected by attrition, since it is target-consistent in simple noun phrases, and they account for the target-deviance in the complex ones as a result of processing difficulties. In our view, another explanation is also possible: Given that the number of noun types in the corpus is quite low and mainly consists of high-frequency nouns, we could argue that the simple noun phrases are more likely to be rote-learned and memorized as chunks than the more complex ones, which require a productive system of gender agreement.
Since this system is in the process of breaking down, the complex noun phrases display more errors.

We then turn to our final hypothesis and prediction (C) and the issue of whether F gender is more vulnerable than N and whether we see changes or reductions in the gender system. As discussed earlier (section 14.3.3), this has been attested in Russian heritage language: both a reduction from a three- to a two-gender system (Polinsky 2008) and possibly a breakdown of gender altogether (Rodina and Westergaard 2015a). We also know that a reduction in the gender system has happened in many Germanic varieties and is currently taking place in certain Norwegian dialects (cf. section 14.3.3), that is, a reduction from a three-gender system to a system with just two genders, common and neuter. As noted previously, the disappearance of the neuter gender as well is not an unlikely scenario, given the nontransparency of the system and the late acquisition of this property of the Norwegian language. The gender system may be further weakened by the considerable lack of input and use in this heritage language situation. However, as shown in the previous section, we do not find any evidence of a two-gender system in the production of any of these fifty speakers. Instead, we see a general erosion across the whole gender system, with both feminine and neuter nouns migrating to the most frequent gender form, the masculine. In fact, the majority of the speakers (N = 32) behave in this way (Group 4). The end result of this will presumably be a complete breakdown of gender altogether, i.e., a system without gender distinctions. It is possible that this is already attested in the production of the nine speakers in Group 3, who produce only masculine forms.
We would like to speculate about the reasons for this development; that is, (1) why is grammatical gender vulnerable in heritage language, (2) why are declension class suffixes stable, and (3) why do we not see evidence of a two-gender system the way we predicted? Our findings partly correspond to what has been found in acquisition and change, that is, proper gender forms such as the indefinite article are late acquired and prone to change, while the declensional suffixes are early acquired and remarkably stable. But we do not find a two-gender system (common and neuter), which is attested in some children and which is also the result of changes that have taken place in certain varieties of Norwegian.

An obvious answer to the first question corresponds to the general account for the late acquisition of gender in Norwegian, viz. the nontransparency of gender assignment. A system where gender has to be learned noun by noun is crucially dependent on a considerable amount of input. Unfortunately, we do not know much about the input to these speakers in childhood, but it is not inconceivable that it was somewhat limited. Given that gender has been found not to be fully in place until around age six or seven (Rodina and Westergaard 2015b), which is the time when these speakers experienced a language shift, it is possible that this property is the result of incomplete acquisition (e.g., Montrul 2008). However, given the general profile of these heritage speakers mentioned above (monolingual Norwegian speakers until school age, English dominant in their adult lives, and hardly using Norwegian at all in old age), it is more likely that whatever discrepancies we find between their language and the nonheritage variety are due to attrition. This is further supported by the fact that there is considerable variation among these speakers.
If this is the case, then we may speculate on a possible difference between incomplete acquisition and attrition with respect to gender: While the former process typically results in a systematic reduction in the gender system (e.g., from three to two genders), the latter affects an existing system in terms of erosion across the board. That is, incomplete acquisition is the cause of a system that is different from the nonheritage variety (and typically reduced), while the result of attrition is an unsystematic breakdown of the system, eventually leading to total loss of grammatical gender. Some support for our speculation may be found in Schmid's (2002) important work on German Jews in the US, who had generally also experienced a severe reduction in the use of their L1 over an extended period: The occasional mistakes found in gender assignment in the data did not constitute any rule-based reduction in the gender system of their German.6

We then turn to the second question, why declensional suffixes are stable in heritage language. The early acquisition of declensional suffixes is generally accounted for by their high frequency and the fact that they are prosodically favored by young children (Anderssen 2006).7 They may also be initially learned as a unit together with the noun, even though they are not considered to be fully acquired until the relevant nouns also appear in appropriate contexts without the suffix. While prosody is unlikely to be a factor in heritage languages, the other two factors, frequency and chunking, may be responsible for the robustness of the definite forms. That is, highly frequent nouns (such as the ones typically used by our heritage speakers in the corpus) may be stored in memory as units together with the suffix, for example, hesten, "the horse"; senga, "the bed"; huset, "the house".
For this reason, they are easily retrieved, while the indefinite forms must be computed as part of a productive process, for example, en hest, "a horse"; ei seng, "a bed"; et hus, "a house". In any case, our heritage data provide further evidence that the definite suffix does not have a gender feature. If the suffix did carry gender, we would expect these speakers to make a direct link between this form and (other) gender forms: That is, knowing the definite form of a feminine or neuter noun (e.g., boka, "the book", or huset, "the house") should make it easy to produce the target-consistent indefinite forms ei bok, "a book", and et hus, "a house". But the data from these heritage speakers show that this is not the case. We therefore conclude that the evidence that we had from acquisition and change from previous studies is now supported by data from a new population.

Finally, we address the third question, why there is no systematic reduction from a three- to a two-gender system in the data of the heritage speakers. In several varieties of Norwegian that have undergone (or are undergoing) a change, the result has been the same: disappearance of the feminine and the development of a two-gender system with common and neuter gender. This has been argued to be partly because of sociolinguistic factors such as language contact or the prestige of the written form Bokmål and partly because of the syncretism between masculine and feminine, making it more difficult to distinguish the two in acquisition (e.g., Lødrup 2011, Trudgill 2013, Rodina and Westergaard 2015b). Following up on our speculation above, we would like to suggest that all of these historical developments are due to incomplete acquisition. What we see in our data from the Norwegian heritage speakers, on the other hand, is the result of attrition.
If this idea is on the right track, we might have a way to distinguish between the two processes: While incomplete acquisition typically results in a systematic difference between the heritage language and the nonheritage variety, attrition will result in general erosion and considerable variability.8

14.7 Conclusion

In this chapter, we have presented an investigation of grammatical gender in a corpus of heritage Norwegian spoken in America, the Corpus of American Norwegian Speech (CANS). The corpus consists of data from fifty speakers, whose linguistic profile is as follows: monolingual Norwegian until age five or six, English dominant throughout life, and virtually no use of Norwegian in old age. Because of the nontransparency of gender assignment, we expected gender to be vulnerable in this situation of reduced input and use. Based on previous research from acquisition and change, we also expected declensional suffixes to be robust and feminine forms to be more vulnerable than neuter. That is, we expected to find evidence of a reduction in the system, from three genders (masculine, feminine, neuter) to two (common and neuter). Focusing on indefinite articles and possessives, we demonstrated that all three gender forms, masculine, feminine, and neuter, are represented in the data. Nevertheless, there is considerable overgeneralization of masculine forms (the most frequent gender forms) to both feminine and neuter nouns in the production of the heritage speakers (as compared with gender in the relevant present-day Norwegian dialects). We also found a substantial difference between the indefinite article (an exponent of gender) and the definite suffixal article (which we consider a declension class marker): While the former is to a large extent affected by overgeneralization, the latter form is virtually always target-consistent. This confirms similar findings from previous research on both acquisition and change. However, we did not find any evidence of a two-gender system in the production of any of the speakers; instead there seems to be overgeneralization of masculine forms across the board.
Assuming that the Norwegian of our participants is somewhat attrited, we speculate that this finding reflects a distinction between (incomplete) acquisition and attrition: While the former process typically results in a systematic difference between the heritage language and the nonheritage variety, attrition will lead to general erosion of the system and eventually complete loss of gender.

Appendix

Table 14.6 Production of the Indefinite Article for Each of the Three Genders by All Speakers in CANS (N = 50).

Informant    M            F            N
             M  F  N      M  F  N      M  F  N

Group 1
harmony_MN_03gm 5
harmony_MN_05gm 1
north_battleford_SK_02gk
spring_grove_MN_09gm 3

Group 2
billings_MT_01gm 8 2 1
blair_WI_01gm 9 4
blair_WI_07gm 7 2
spring_grove_MN_05gm 4 2
zumbrota_MN_01gk 4 1 1

Group 3
blair_WI_02gm 9 2 2
blair_WI_04gk 16 4
coon_valley_WI_01gk 5 2 2
coon_valley_WI_12gm 3 2
decorah_IA_02gm 5 2 1
gary_MN_01gm 14 2
sunburg_MN_04gk 5 4 1
vancouver_WA_03uk 4 1 1
westby_WI_02gm 7 2

Group 4
albert_lea_MN_01gk 8 4 8 1 1
chicago_IL_01gk 17 9 6 5
coon_valley_WI_02gm 14 1 3 3 1 4
coon_valley_WI_03gm 29 1 8 1
coon_valley_WI_04gm 6 2 2 2

coon_valley_WI_06gm 36 1 5 1 3 7
coon_valley_WI_07gk 5 1 3 1
decorah_IA_01gm 15 4 1 2
fargo_ND_01gm 15 5 4 3
flom_MN_01gm 14 1 12 3 4
flom_MN_02gm 19 10 2 3
gary_MN_02gk 16 1 3 8 1 1 10
glasgow_MT_01gm 4 3 2 3
harmony_MN_01gk 16 2 2 1 1
harmony_MN_02gk 12 4 4 1
harmony_MN_04gm 5 1 1 1 1
north_battleford_SK_01gm 1 4 2
portland_ND_01gm 8 4 1 3 1
portland_ND_02gk 6 8 13 1
rushford_MN_01gm 1 1 1 1
stillwater_MN_01gm 75 13 25 9 20
sunburg_MN_03gm 13 3 3 4 2
sunburg_MN_12gk 1 1 1
vancouver_WA_01gm 14 8 2 2
wanamingo_MN_04gk 2 1 1 1 1
webster_SD_01gm 24 3 1 1
webster_SD_02gm 6 4 4 2
westby_WI_01gm 41 1 30 2 1 15
westby_WI_03gk 13 4 3 6
westby_WI_05gm 3 3 3
westby_WI_06gm 9 1 3 1 2
zumbrota_MN_02gm 4 1 4

The baseline is the Nynorsk dictionary, adjusted for some typical patterns in Eastern Norwegian dialects. Group 1: Gender system unclear; Group 2: Possibly a three-gender system; Group 3: Masculine forms only; Group 4: Mixture of gender forms.

Notes

* We are grateful to the two reviewers for detailed comments and very useful suggestions. We would also like to thank Alexander Pfaff for his help with the corpus data.

1 We indicate gender on the noun itself in parentheses and gender agreement on other targets after a period.

2 There is only one exception to this, the adjective liten/lita/lite 'small/little', which distinguishes between all three genders. This is illustrated in (i).

  (i) a. en liten gutt
         a.m small.m boy
         "a small boy"
      b. ei lita jente
         a.f small.f girl
         "a small girl"
      c. et lite hus
         a.n small.n house
         "a small house"

3 NoTa (Norsk talespråkskorpus—Oslodelen [Norwegian spoken corpus, the Oslo part]), The Text Lab, Department of Linguistics and Scandinavian Studies, University of Oslo. Available online at www.tekstlab.uio.no/nota/oslo/index.html

4 It is not always easy to distinguish loan words from English words that have become an integrated part of American Norwegian speech, for example, farmer or field. We have used the following criterion in our selection: All words that currently exist in English and that are pronounced with a clear American pronunciation have been discarded in this chapter.

5 We are grateful to Jan Terje Faarlund for valuable help and discussions concerning this issue.

6 An important difference between Schmid's (2002) study and ours (pointed out by a reviewer) is that she finds very few non-target-like examples in her data, while there is evidence for considerable erosion in the data of the Norwegian heritage speakers. We would like to suggest that a possible reason for this could be that Schmid's (2002) subjects are first-generation immigrants and thus had more robust input in their L1, while the attrition we see in our speakers could have accumulated over three or four generations. Furthermore, the German gender system could be said to be somewhat more transparent than the Norwegian one.
7 Adding a definite suffix to monosyllabic nouns in Norwegian results in a trochaic structure (strong–weak), which is known to be favored by young children (e.g., Gerken 1994).

8 A reviewer suggests that our findings could be the result of problems with lexical access in very old speakers rather than attrition. We agree that this could very well be the case, or at least an additional factor. This would predict that Norwegians living in Norway would also experience problems with gender assignment in their old age. Unfortunately, we know of no studies that have investigated this issue, and we therefore have to leave this suggestion to further research.

References

Alexiadou, A. 2004. Inflection class, gender and DP-internal structure. In Explorations in Nominal Inflection, G. Müller, L. Gunkel and G. Zifonun (eds.), 21–50. Berlin: Mouton de Gruyter.
Alexiadou, A., Lohndal, T., Åfarli, T. A. and Grimstad, M. B. 2015. Language mixing: A distributed morphology approach. In Proceedings of NELS 45, T. Bui and D. Özyildiz (eds.), 25–38. CreateSpace.
Anderssen, M. 2006. The Acquisition of Compositional Definiteness in Norwegian. Doctoral dissertation, University of Tromsø.
Anderssen, M. and Westergaard, M. 2012. Tospråklighet og ordstilling i norske possessivkonstruksjoner [Bilingualism and word order in Norwegian possessive constructions]. Norsk Lingvistisk Tidsskrift [Norwegian Journal of Linguistics] 30: 170–197.
Benmamoun, E., Montrul, S. and Polinsky, M. 2013. Heritage languages and their speakers: Opportunities and challenges for linguistics. Theoretical Linguistics 39: 129–181.
Brown, R. 1973. A First Language: The Early Stages. Cambridge, MA: Harvard University Press.
Comrie, B., Stone, G. and Polinsky, M. 1996. The Russian Language in the Twentieth Century. Oxford: Clarendon Press.
Conzett, P., Johansen, Å. M. and Sollid, H. 2011. Genus og substantivbøying i nordnorske språkkontaktområder [Grammatical gender and declension in language contact areas in North Norway]. Nordand Tidskrift for andrespråksforskning [Nordand Journal for Second Language Research] 6: 35–71.
Corbett, G. G. 1991. Gender. Cambridge: Cambridge University Press.
Corbett, G. G. and Fedden, S. 2016. Canonical gender. Journal of Linguistics 52: 495–531.
Dahl, Ö. 2000. Elementary gender distinctions. In Gender in Grammar and Cognition II: Manifestations of Gender, B. Unterbeck, M. Rissanen, T. Nevalainen and M. Saari (eds.), 577–593. Berlin: Mouton de Gruyter.
Delsing, L-O. 1993. The Internal Structure of Noun Phrases in the Scandinavian Languages. Doctoral dissertation, University of Lund.
Eichler, N., Jansen, V. and Müller, N. 2012. Gender acquisition in bilingual children: French-German, Italian-German, Spanish-German and Italian-French. International Journal of Bilingualism 17: 550–572.
Enger, H-O. 2004. On the relation between gender and declension: A diachronic perspective from Norwegian. Studies in Language 28: 51–82.
Enger, H-O. and Corbett, G. 2012. Definiteness, gender, and hybrids: Evidence from Norwegian dialects. Journal of Germanic Linguistics 24: 287–324. doi:10.1017/S1470542712000098
Faarlund, J. T., Lie, S. and Vannebo, K. I. 1997. Norsk referansegrammatikk [A Reference Grammar of Norwegian]. Oslo: Universitetsforlaget.
Flom, G. T. 1926. English loanwords in American Norwegian, as spoken in the Koshkonong settlement, Wisconsin. American Speech 1: 541–558.
Gerken, L. 1994. Young children's representation of prosodic phonology: Evidence from English-speakers' weak syllable productions. Journal of Memory and Language 33: 19–38.
Gvozdev, A. N. 1961. Formirovanie u rebenka grammatičeskogo stroja russkogo jazyka [Language development of a Russian child]. Moscow: APN RSFSR.
Haugen, E. 1953. The Norwegian Language in America. Cambridge, MA: Harvard University Press.
Hjelde, A. 1992. Trøndsk talemål i Amerika [The Troender Variety of Norwegian in America]. Trondheim: Tapir.
Hjelde, A. 1996. The gender of English nouns used in American Norwegian. In Language Contact Across the North Atlantic, P. S. Ureland and I. Clarkson (eds.), 297–312. Tübingen: Max Niemeyer Verlag.
Hockett, C. F. 1958. A Course in Modern Linguistics. New York: Macmillan.
Hovdenak, M., Killingbergtrø, L., Lauvhjell, A., Nordlie, S., Rommetveit, M. and Worren, D. 1998. Nynorskordboka [Nynorsk dictionary]. Oslo: Det norske samlaget.
Jahr, E. H. 1998. Sociolinguistics in historical language contact: The Scandinavian languages and Low German during the Hanseatic period. In Language Change: Advances in Historical Sociolinguistics, E. H. Jahr (ed.), 119–130. Berlin: Mouton de Gruyter.
Johannessen, J. B. 2015. The Corpus of American Norwegian Speech (CANS). In Proceedings of the 20th Nordic Conference of Computational Linguistics, NODALIDA 2015, B. Megyesi (ed.), 297–300. Frankfurt: Peter Lang.
Johannessen, J. B. and Laake, S. 2015. On two myths of the Norwegian language in America: Is it old-fashioned? Is it approaching the written Bokmål standard? In Germanic Heritage Languages in North America, J. B. Johannessen and J. Salmons (eds.), 299–322. Amsterdam: John Benjamins.
Johannessen, J. B. and Larsson, I. 2015. Complexity matters: On gender agreement in Heritage Scandinavian. Frontiers in Psychology 6: 1842. doi:10.3389/fpsyg.2015.01842
Johannessen, J. B., Priestley, J., Hagen, K., Åfarli, T. A. and Vangsnes, Ø. A. 2009. The Nordic dialect corpus—an advanced research tool. In Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009, K. Jokinen and E. Bick (eds.), NEALT Proceedings Series Volume 4: 73–80.
Johannessen, J. B. and Salmons, J. 2012. Innledning [Introduction]. Norsk Lingvistisk Tidsskrift [Norwegian Journal of Linguistics] 30: 139–148.
Johannessen, J. B. and Salmons, J. 2015. The study of Germanic heritage languages in the Americas. In Germanic Heritage Languages in North America, J. B. Johannessen and J. Salmons (eds.), 1–17. Amsterdam: John Benjamins.
Julien, M. 2005. Nominal Phrases from a Scandinavian Perspective. Amsterdam: John Benjamins.
Kürschner, S. and Nübling, D. 2011. The interaction of gender and declension in Germanic languages. Folia Linguistica 45: 355–388.
Lødrup, H. 2011. Hvor mange genus er det i Oslo-dialekten? [How many genders are there in the Oslo dialect?]. Maal og Minne 2: 120–136.
Montrul, S. 2002. Incomplete acquisition and attrition of Spanish tense/aspect distinctions in adult bilinguals. Bilingualism: Language and Cognition 5: 39–68.
Montrul, S. 2008. Incomplete Acquisition in Bilingualism: Re-Examining the Age Factor. Amsterdam: John Benjamins.
Montrul, S., Foote, R. and Perpiñán, S. 2008. Gender agreement in adult second language learners and Spanish heritage speakers: The effects of age and context of acquisition. Language Learning 58: 503–553.
Nesse, A. 2002. Språkkontakt mellom norsk og tysk i hansatidens Bergen [Language Contact between Norwegian and German in Bergen during the Hanseatic Time]. Oslo: Novus.
Nygård, M. and Åfarli, T. A. 2013. The Structure of Gender Assignment and American Norwegian. Paper presented at the 4th Annual Workshop on Immigrant Languages in the Americas, University of Iceland, September 19.
O'Grady, W., Kwak, H-Y., Lee, O-S. and Lee, M. 2011. An emergentist perspective on heritage language acquisition. Studies in Second Language Acquisition 33: 223–245.
Pascual y Cabo, D. and Rothman, J. 2012. The (Il)logical problem of heritage speaker bilingualism and incomplete acquisition. Applied Linguistics 33: 450–455.
Polinsky, M. 1997. American-Russian: Language loss meets language acquisition. In Formal Approaches to Slavic Linguistics, W. Browne, E. Dornisch, N. Kondrashova and D. Zec (eds.), 370–407. Ann Arbor: Michigan Slavic Publications.
Polinsky, M. 2006. Incomplete acquisition: American Russian. Journal of Slavic Linguistics 14: 191–262.
Polinsky, M. 2008. Gender under incomplete acquisition: Heritage speakers' knowledge of noun categorization. Heritage Language Journal 6: 40–71.
Putnam, M. and Sánchez, L. 2013. What's so incomplete about incomplete acquisition? A prolegomenon to modeling heritage language grammars. Linguistic Approaches to Bilingualism 3: 478–508.
Rodina, Y. and Westergaard, M. 2013. The acquisition of gender and declension class in a non-transparent system: Monolinguals and bilinguals. Studia Linguistica 67: 47–67.
Rodina, Y. and Westergaard, M. 2015a. Grammatical gender in bilingual Norwegian-Russian acquisition: The role of input and transparency. Bilingualism: Language and Cognition. doi:10.1017/S1366728915000668
Rodina, Y. and Westergaard, M. 2015b. Grammatical gender in Norwegian: Language acquisition and language change. Journal of Germanic Linguistics 27: 145–187.
Rothman, J. 2007. Heritage speaker competence differences, language change, and input type: Inflected infinitives in Heritage Brazilian Portuguese. International Journal of Bilingualism 11: 359–389.
Rothman, J. 2009. Understanding the nature and outcomes of early bilingualism: Romance languages as heritage languages. International Journal of Bilingualism 13: 155–163.
Schmid, M. 2002. First Language Attrition, Use and Maintenance: The Case of German Jews in Anglophone Countries. Philadelphia: John Benjamins.
Sorace, A. 2004. Native language attrition and developmental instability at the syntax-discourse interface: Data, interpretations and methods. Bilingualism: Language and Cognition 7: 143–145.
Trosterud, T. 2001. Genustilordning i norsk er regelstyrt [Assignment of gender in Norwegian is rule-based]. Norsk Lingvistisk Tidsskrift [Norwegian Journal of Linguistics] 19: 29–57.
Trudgill, P. 2013. Gender maintenance and loss in Totenmålet, English, and other major Germanic varieties. In In Search of Universal Grammar: From Old Norse to Zoque, T. Lohndal (ed.), 77–107. Amsterdam: John Benjamins.
Tsimpli, I. M., Sorace, A., Heycock, C. and Filiaci, F. 2004. First language attrition and syntactic subjects: A study of Greek and Italian near-native speakers of English. International Journal of Bilingualism 8: 257–277.
Vangsnes, Ø. A. 1999. The Identification of Functional Architecture. Doctoral dissertation, University of Bergen.
Venås, K. 1993. On the choice between two written standards in Norway. In Language Conflict and Language Planning, E. H. Jahr (ed.), 263–278. Berlin and New York: Mouton de Gruyter.
Westergaard, M. and Anderssen, M. 2015. Word order variation in Norwegian possessive constructions: Bilingual acquisition and attrition. In Germanic Heritage Languages in North America: Acquisition, Attrition and Change, J. B. Johannessen and J. Salmons (eds.), 21–45. Amsterdam: John Benjamins.

Index

abstraction 2, 319, 325, 329, 332–5, 343, 345, 352, 355
acquisition 5, 7, 12, 45, 71, 81, 137, 139, 349, 414, 417–18, 420–3, 429, 432–6; second language 8, 372–4, 381
Adger, D. 68, 179, 181, 196, 236, 309, 385, 391
adjunct 10, 40, 123, 159, 169n7, 184, 207, 217–18, 223, 265–78, 280–2, 283n7, 296–7, 309, 312, 319, 322, 352, 354, 358, 361n32
adjunction 25, 49, 141n17, 153–4, 221–2
Åfarli, T. A. 105–6, 136, 143–4n36, 193, 372, 375–6, 385, 390, 392, 404–5, 407n1, 423
Afrikaans 154–5, 160, 163, 169n6
Agent 7, 287–9, 291–6, 302, 307–8, 311, 313, 331, 341–3, 375, 389
Agree 10, 49, 87–8, 103, 124, 141n24, 143n31, 231–2, 236–7, 242–7, 249–50, 252–7, 259n18, 259n21, 260n30, 279
Alexiadou, A. 4, 7–8, 175, 177, 290, 309, 340–1, 375, 389, 391, 394, 402–3, 416, 423
Alrenga, P. 178–9, 183–4, 187–92, 198n5
American Norwegian 11–12, 382, 385–9, 395, 397–400, 402–3, 405–6, 414, 416, 421–3, 425, 431–2, 435, 439n4
Anagnostopoulou, E. 7, 133, 135, 143n33, 175, 177, 290, 309, 340–1, 375, 389, 391, 394
Anderssen, M. 416, 421, 425, 429, 434
applicative 132, 142n27, 143n30, 291, 307
argument 11, 140n5, 142n27, 216–18, 265, 270, 278, 281–2, 287–9, 295–7, 299–300, 302–3, 306–8, 313, 325, 340–1, 343, 347, 352, 375, 389–92, 394; external 7, 290, 299, 301, 304, 325, 331, 348, 394, 400; internal 7, 279, 290–2, 301, 314n15, 346, 348, 357, 394
attrition 12, 169n10, 374–7, 407n3, 418, 422, 432–6, 439n6
Austin, J. L. 330, 359n4
Baker, C. L. 337, 354
Baker, M. 4, 29, 40, 49–50, 127, 132, 135–6, 138, 140n9, 141n13, 166, 168, 236, 312, 314n19, 341
Bare Phrase Structure (BPS) 33, 41
Barsky, R. F. 65
basic operation 32, 337, 373
Basque 125, 127–8, 134, 137–9, 149, 156
Beck, S. 346, 349–50
behaviorism 20, 62
Belazi, H. M. 8, 372, 384–5
Belletti, A. 204, 223–5, 267, 280, 283n10
Benmamoun, E. 11, 371, 418
Berwick, R. C. 74–5, 167
Biberauer, M. T. 4, 232
bilingualism 372
Boeckx, C. 68, 71, 85–9, 91–3, 96, 99, 114, 116–21, 123–4, 126–9, 132–8, 166–7, 169n13, 243, 337, 391, 394
Borer, H. 4–7, 11, 93, 106, 167, 289–90, 299–303, 306, 309, 314n18, 341, 374–6, 389–92, 394
borrowing 375, 383; nonce- 383
Bowers, J. 7, 289, 299, 307–9, 341, 375, 389, 394
Bracken, H. 65
Bresnan, J. 21, 32, 36, 46, 82, 89, 141n18, 177
Bricmont, J. 65
Bùlì 104
Cable, S. 11, 337–40, 349–50, 361n30
Caponigro, I. 349
Carlson, G. 308, 312, 325, 340, 343, 389
cartography 41, 87, 117
Chan, B.H.S. 384
Cheng, L. 66, 337, 350
Chomsky hierarchy 23, 63
Church, A. 2–3, 321, 333, 336
Cinque, G. 203, 226n7, 267, 280–1, 283n10, 299, 360n15
code-switching see language mixing
complement 35, 38, 40, 53n12, 119, 126, 188, 217–18, 304, 309, 312, 340, 343
complementizer agreement 91, 119
conjunction 6, 30, 213–14, 287, 305–6, 308, 312–13, 325, 328
Cook, V. J. 372–3
Corbett, G. G. 413, 416, 418
Cowie, F. 67
Crain, S. 46, 81, 149, 167, 349
Culicover, P. 89, 113, 129, 131, 134, 141n18, 225n3
Czech 237
Dahl, Ö. 413
Danish 94–6, 98–9, 101–2, 107n4, 116, 415, 420
Davidson, D. 6, 10, 287–9, 312, 320, 346, 359n9
Davies, W. D. 177–80, 182, 185, 198n5
De Clercq, K. 265, 272–4, 277, 279, 281
Deep Structure 26–31, 44, 52n6, 53n18, 70
Delahunty, G. P. 179, 181–2, 185–6, 191
Delsing, L. O. 415
derivation 1, 9, 22–5, 27, 32–3, 43, 46–51, 63, 70, 118, 121–2, 124, 129, 132, 149, 156, 159–60, 162–3, 165–6, 206, 208–12, 215–16, 218, 220, 224, 237–8, 246, 250, 252, 254–6, 277, 295, 300, 307, 309–11, 335, 340, 345, 392
Descartes, R. 19, 65
Dion, N. 382–4, 389, 400
Distributed Morphology 11, 153, 157, 300, 376, 390, 392
Dotlačil, J. 296
double definiteness 416–17
Dubinsky, S. 177–80, 182, 185, 198n5
Dummett, M. 77, 320
Dutch 86, 90–1, 149, 164, 193, 210, 233, 249–50, 415, 420
economy 9, 50–1, 73, 329
E-language 3, 72–3, 77–8, 322–3, 329
ellipsis 203, 207–12, 216, 220, 222, 224, 225n4, 328
embedded language 384, 400, 406
Embick, D. 11, 177, 376, 390, 392–3
Enger, H. O. 413, 416
Ernst, T. 268, 280
event variable 6, 10, 287, 302, 308, 311–12, 313n11, 314n21, 346, 348
Exceptional Case Marking (ECM) 190
exoskeletal 11, 299, 381–3, 389–93, 395, 404, 406
Extended Standard Theory 30–2, 52, 61, 69–71, 73, 76
Extension Condition 50–1
Faarlund, J. T. 101–2, 194, 415
factive 217–18
filter 9, 38, 44–5, 50, 71, 73, 76, 86, 155, 392
Fitch, W. T. 75–6
Fodor, H. 324
Fodor, J. D. 5
frame 300, 375–6, 390–2, 394, 406
Franck, J. 65
Frank, R. 21, 24, 33, 46–7, 82
Frazier, M. 208
Frege, G. 6, 77, 288, 321–2, 325, 328–9, 346, 359n8
French 29, 43, 48, 107n1, 140n5, 143n29, 149, 217–18, 283n10, 386
Fukui, N. 4, 33, 40, 138
Functional Application 295, 307–8, 313
Gelderen, E. van 8, 166, 187, 231, 250, 372, 385
Generalized Phrase Structure Grammar 21, 47, 82
generative semantics 6, 21, 29–30, 80
German 29, 129–30, 137, 140n5, 149, 154–5, 157–61, 163–7, 169n9, 210, 233, 313n8, 377n2, 417, 420, 434, 439n6
Goldsmith, J. A. 80
González-Vilbazo, K. 8, 372, 377n3, 384–5
Government and Binding 21, 31–2, 38, 40, 50, 52, 68–9, 71–4, 167, 373, 392
Greek 143n33, 179
Grimstad, M. B. 11
Groenendijk, J. 320, 329, 360n13
Grohmann, K. K. 69, 164
Gullah 91
Haegeman, L. 68–9, 91, 119, 124, 141n21, 179, 190, 195, 198n12, 206, 212, 214–15, 217, 219, 226n8, 232–6, 239, 243, 248–52, 256, 258nn6–8, 260n26, 265–7, 270–2, 274, 282n1, 283n8
Hale, K. 4, 128, 203, 290, 305
Halle, M. 11, 30, 61, 76–7, 153, 392, 396
Hamblin, C. L. 320, 327, 329, 360n13
Hankamer, G. 43–4, 213
Harley, H. 6–7, 29, 132, 143n29, 222, 290–1, 309, 341, 375, 389, 392–3
Harman, G. 46–7, 67
Harris, R. A. 80
Harris, Z. 1, 8, 19, 21–2, 61–3, 65
Hartmann, J. 177–9, 181, 186, 198n11
Haspelmath, M. 383
Haugen, E. 375, 382, 386, 407n3, 414, 421
Haumann, D. 267, 271–2
Hauser, M. D. 75–6
Hawkins, R. 7, 372, 374
Heim, I. 6, 11, 307–8, 324, 334, 346, 359n9
heritage 5, 8, 12, 419; language 11–12, 371–7, 382, 414, 416, 418, 421–3, 429, 432–6
Higginbotham, J. 289, 312, 313n3, 320, 323–4, 327, 329, 347, 354, 356, 359n9
Hiraiwa, K. 104, 106, 232, 236, 238
Hjelde, A. 386, 405, 421, 423
Hockett, C. F. 413
Hoekstra, E. 40, 92, 309
Hornstein, N. 35, 53n8, 67–9, 162, 166, 288, 324, 336–7, 347, 349, 361n32, 373
Huang, C.T.J. 115, 355
Huck, G. J. 80
Huddleston, R. 178, 180, 183, 215, 268, 272–3, 277–8, 282n3
Hymes, D. 67
Icelandic 94, 98–9, 101, 103, 116, 129–30, 143n32
idiom 291, 393, 424
I-language 2–4, 8, 11, 72–3, 77–8, 159, 183, 321–30, 334–7, 349, 356–8, 359n8, 360n10, 361n28, 373–4, 376, 385
Imbabura Quechua 124
inclusiveness 41
inflection class 416, 428
information structure 205–6, 210, 214
Initiator 304
input 3–6, 25, 45, 53n9, 137, 149, 153, 165, 167–8, 359n5, 373–4, 376, 413–14, 417–19, 421, 423, 425, 433–5, 439n6
intervention 49, 88, 118–19, 124, 129, 210, 226n8, 253–7, 261n34, 281, 349
Irish 91
island 36, 43, 52, 93, 96, 113, 130, 140n9, 191, 205, 209, 339
Italian 29, 85, 90, 204, 259n16, 283n10, 361n29, 414
Iwakura, K. 191
Jackendoff, R. 30, 32, 34–5, 37–8, 53n8, 76, 131, 266, 313n1, 313n4, 389
Japanese 1, 28, 40, 350
Jayaseelan, K. A. 40, 42, 204, 209, 223, 281, 309, 350–1
Jenkins, L. 68–9
Jeong, Y. 132, 136, 142n27, 149–50, 155–9, 165, 168, 169n7, 261n34, 290–1, 307, 313n5
Johannessen, J. B. 416, 421–3, 429, 432–3
Johnson, K. 135–6, 205–6, 209, 136
Joshi, A. 21, 23–4, 47, 82, 384
Julien, M. 98, 125, 162, 402–3, 415
Kanwangamalu, N. M. 386
Kaplan, D. 333
Kaplan, R. M. 21, 46, 82
Karttunen, L. 320, 327, 329, 360n13
Kato, Y. 272, 274
Katz, J. J. 28–9, 53n16
Kayne, R. 4–5, 40–1, 43, 53n14, 92, 103, 106, 121, 128, 143n29, 153, 155, 157, 167–8, 204, 223, 260n32
King, R. 385, 391, 395
Klein, E. 82, 294
Klima, E. 272, 277, 279
Koster, J. 46, 168n3, 177–9, 181–9, 192–3, 198n13
Kramer, R. 402–3
Kratzer, A. 6, 11, 197n3, 222, 288–91, 293–5, 307, 309, 314n14, 324, 334, 341, 346, 350, 359n9, 375, 389, 391
Kupin, J. 34, 53n11
Kuroda, S.-Y. 42, 350
label 35–6, 85, 115, 129, 131, 177, 280, 283n10, 301, 309, 326, 337, 341, 345, 347–8, 355, 361n32
Lahiri, U. 329, 357
Lakoff, G. 6, 29, 43, 330, 360n15
Lambrecht, K. 189
language contact 385, 388, 420, 435
language of thought 323–5, 336
Larson, I. 416
Larson, R. 81, 143n33, 313n6, 324, 347, 359n9
Lasnik, H. 4, 9, 23, 25, 33–4, 43–5, 47, 49–50, 52n3, 53n11, 68–73, 76, 86, 90, 92, 99–100, 115–16, 119, 123, 130–1, 140n8, 155, 166, 175, 177, 186, 209, 355, 373, 392
late insertion 11, 157, 376, 381–2, 389, 406
LaTerza, C. 296
Lees, R. B. 27, 67–8
Lepore, E. 319
level of representation 26, 29–31, 63, 288
Levin, B. 132, 295, 313n4, 391
Lexical-Functional Grammar 21, 46, 82
lexical insertion 22, 26, 49, 394
Lie, S. 101, 194, 415
Lightfoot, D. 5, 81, 92–3
listeme 300–1
loanword 382–4
locality 20, 36, 51–2, 70, 73, 114, 135–6, 143n31, 209, 222, 232, 238, 241, 243, 245–6, 255, 278–9, 338
Lødrup, H. 140n7, 416–17, 420, 423, 429–30, 432, 435
Logical Form (LF) 31, 166, 288, 339
Lohndal, T. 4, 6–7, 11, 94, 119–25, 140n5, 141n18, 166, 169n8, 177, 282n1, 283n8, 289, 291, 295–7, 299, 308–9, 311–12, 314nn20–1, 340–1, 343, 361n24, 373, 375, 381, 389–92, 394
López, L. 8, 206–7, 219–20, 222, 224, 246, 372, 377n3, 384–5
Ludwig, K. 319
Lyons, J. 34, 65
MacSwan, J. 8, 372, 376, 384–5, 391, 395
Mahootian, S. 384–5
Malayalam 351
Mandarin Chinese 138, 385
Marantz, A. 7, 11, 132, 142n27, 153, 289–91, 303, 389–91, 392
matrix language 384–5, 395, 400, 406
May, R. 288, 320, 327, 347
McCawley, J. 30–1, 267, 272, 277, 279, 282n6, 283n7
McCloskey, J. 104, 177, 265, 299
McDaniel, D. 93, 149–50, 154, 158, 161, 164, 166
McGilvray, J. 65–7, 81
McGinn, C. 321–2, 359n4
McGinnis, M. 132, 142n27, 291, 313n5
Merchant, J. 7, 43, 115, 236, 290, 389
mereology 292
Merge 32, 51, 75, 87, 99, 118–19, 123, 140n12, 301, 307, 309, 336–7, 340, 345, 394
Miller, P. H. 177, 179, 184–5, 197n4, 198n10
Minimalist Program 21, 33, 51–2, 61, 66, 68–9, 71–6, 79, 85, 93, 150, 152, 322, 325, 391–2
Montague, R. 6, 77, 347, 359n9
Montrul, S. 11, 371, 418, 434
Moulton, K. 178–9, 188, 196–7, 198n7
Myers-Scotton, C. 375–6, 377n3, 383–6
native speaker 2, 24, 105, 175, 182–3, 195, 199n14, 226n6, 273, 300, 372, 382, 390, 396
Neijt, A. 205–6, 209
Neuckermans, A. 250
Newmeyer, F. J. 4, 29, 71–2, 80, 166–7
Newson, N. 372–3
Noyer, R. 11, 376, 390, 392–3
null subject 4, 45, 85–6
Null Theory 384–5
Nunes, J. 69, 140n11, 150, 153–5, 160, 162, 169n6, 361n31
Nygård, M. 392, 404–5, 423
Otero, C. P. 67–8
Parsons, T. 288–9, 308, 313n2, 391
particle 270, 337–9, 350, 352, 358, 361n23, 361n30
Penka, D. 231–2, 236, 250, 259n9
Penthouse Principle 190–1
Perlmutter, D. 9, 43, 52n5, 85–6
Pesetsky, D. 86–8, 92, 103
phase 32, 49, 51, 74–5, 128, 139n4, 140n5, 141n12, 143n31, 162, 224, 252, 259n18, 260n30, 279–81, 343
Phonetic Form (PF) 31
Piattelli-Palmarini, M. 67–8
Pietroski, P. 6, 11, 74, 81, 305, 308, 312, 322, 324, 326, 336–7, 340, 343, 346, 357, 359n5, 359n8, 360n10, 360n12, 360n19, 361n29, 361n32, 373, 391
Pinkham, J. 191
Plato's problem 45
plural 252, 293–4, 313n10, 407n3
polarity 10, 265, 270–1, 273, 277–82, 282n6, 283n8
Polinsky, M. 11, 38, 180, 371, 414, 418–19, 433
Polish 90–1, 143n34
Pollard, C. 82–3
polysemy 324
Poplack, S. 8, 372, 382–4, 389, 400, 407n2
Post, E. 20
Postal, P. 28–30, 53n16, 179
Potter, D. 208
Prince, A. 82–3
Principles and Parameters 4, 21, 37, 45, 50, 61, 70–3, 166
procedure 4, 45, 242, 321–3, 326, 331, 336–7, 339, 350, 358
processing 273, 277, 356, 433
proposition 321–2, 327–9, 331, 336, 351, 357, 359n3, 360n13, 360n19
Pullum, G. K. 82, 215, 272–3, 277–8, 282n3
Pylkkänen, L. 132, 142n27, 289–91, 299, 307, 309, 313n5, 314n20, 341, 360n18
Quirk, R. 268, 272, 274, 282n6
Radford, A. 68–9, 271
Ramchand, G. 7, 289–90, 299, 302–6, 309, 313n1, 314n19, 341, 389–91, 394
Rappaport Hovav, M. 132, 295, 313n4, 391
reciprocal 296
Reinhart, T. 189, 313n4, 392
Reiss, C. 81
relative clause 9, 86, 102–4, 106–7, 108n14, 123, 152, 319–20, 325, 327, 332, 334–6, 340, 343, 351–2, 356, 358
Relativized Minimality 50, 118, 122, 132, 392
reprojection 347, 352, 358, 361n29
Resultee 304–5
Řezáč, M. 128, 133
Rheme 304
Richards, M. 119, 138, 140n12, 141n14, 143n31, 166–7, 260n31
Richards, N. 86–7, 117–18, 155, 209, 211–13, 216, 219, 225n4, 309, 340
Riksem, B. R. 402–3, 407n3
Ritter, E. 303, 403
Rizzi, L. 4, 39, 42–3, 45, 50, 85–7, 89–90, 92–3, 107n1, 116–17, 124–6, 156, 161–2, 166, 182, 195, 203–4, 208, 223, 254–5, 261n34, 267, 278, 280, 349–50, 360n15, 361n32, 375, 392
Roberts, I. 4, 81, 135–6, 168, 231–2
Rodina, Y. 414–15, 417, 419–21, 423, 425, 429, 433–5
Romani 149, 157–9, 163, 165
root 7–8, 11, 300, 303, 306–7, 314n21, 391, 393–7, 399–400, 402–3
Ross, J. R. 129, 183, 190, 278
Rothman, J. 11, 418
Roussou, A. 86, 88, 92, 179, 231–2
Rubin, E. J. 8, 372, 384–5
Russian 413–14, 417–19, 423, 433
Sag, I. A. 21, 33, 47, 82, 206, 294, 321
Sailor, C. 208, 213, 224, 309, 341
Saito, M. 44, 47, 86, 92, 186, 392
Salmons, J. 377n2, 421
Sampson, G. 67–8
Samuels, B. D. 81–2
Satisfaction 169n11, 324–6
saturate 327
Saussure, F. de 19
Schäfer, F. 7, 290, 309, 341, 375, 389, 391, 394
Schein, B. 288–9, 291–4, 296, 310, 313n1, 313n3, 313nn10–11, 314n12, 340, 391
Segal, G. 81, 322, 324, 330–1, 347, 359n9
separation 288, 291, 308, 340–1
Seuren, P.A.M. 80
Skinner, B. F. 62
Smith, N. 65–6, 283n8
Smolensky, P. 82–3
Sobin, N. 90, 122, 270–1
Spanish 29, 78, 116, 126, 149, 156, 159, 179, 414
specifier 35–6, 38–40, 49, 53n12, 103, 118, 153, 212, 304–7, 309, 314n18, 343; multiple 212
speech act 321, 330–1, 357, 359n4, 360n17
Spell-Out 32, 48–9, 139n4, 150–1, 153, 162, 232, 308–12, 339–43, 345–6, 394
Stainton, R. J. 321
Standard Theory 20, 28, 30–2, 52, 61, 63, 69–71, 73, 76
Starke, M. 40, 42, 140n5, 254–5, 261n34, 392
Stokhof, M. 320, 329, 360n13
Stowell, T. 38–9, 53nn10–11, 92, 155, 177, 179, 196, 198n9
Stuurman, F. 37–8, 53n13
Subset Principle 396, 400, 403, 406
Surface Structure 28–31, 43, 53n18, 69–70
Swedish 94–9, 101, 107n4, 116, 415
Sybesma, R. 66
syncretism 415, 417, 420, 422, 431, 435
Takahashi, S. 178–9, 190–1, 196, 198n7
Taraldsen, K. T. 94, 104–5, 107n1, 108n13
Tarski, A. 6, 77, 320, 322, 324–6, 333, 336, 359n9
template 223, 299, 346, 375–6, 390–2, 394, 406; see also frame
Tenny, C. 302, 309
Thematic Integration 312
thematic role 313n6, 389
Theme 135, 288–9, 291–3, 295–7, 307–12, 313n5, 314n15, 341–3, 346–7
Thoms, G. 208, 213, 224
Thornton, R. 46, 81, 93, 149–51, 157–9, 166–7, 169n5, 169n16, 361n32
Tlingit 337–9, 352, 361n30
Toribio, A. J. 8, 372, 384–5
Torrego, E. 86–8, 92, 103, 117, 119, 126, 133, 232, 246, 259n21, 260n30, 391
Trace Theory 31
transformation: generalized 23–4, 26–8, 48, 50–1, 52n2
Travis, L. 39, 124, 192, 303
Tree Adjoining Grammar 21, 23, 47, 82
Trosterud, T. 405, 414, 425
type 294–6, 305, 312, 322, 325–7, 331, 333–6, 355, 359n8, 360n12
Undergoer 304–5
Uriagereka, J. 32, 51, 75–6, 115–16, 118, 127–8, 139n4, 140n5, 142n24, 162, 168, 309
Vanden Wyngaerd, G. 206, 208, 210–14, 216, 219–20, 222–3, 225n2, 226n4
van der Auwera, J. 250
Vangsnes, Ø. A. 125, 415
Vannebo, K. I. 101, 194, 415
Verb Second (V2) 175, 192–4, 197, 387
Wells, R. 19
Westergaard, M. 4–5, 12, 125, 414–17, 419–21, 423, 425, 429, 433–5
Wexler, K. 113, 129, 134
wh-movement 10, 39, 47–9, 71, 91, 164–5, 185, 194, 204, 209–11, 224, 319, 334, 355, 361n26, 375
Winkler, S. 204, 206–7, 219–20, 222, 224
Woolford, E. 8, 372, 384–5
X-bar theory 33–6, 38–40, 42, 61, 69–70
Yoshida, M. 208
Zaenen, A. 86, 98, 191
Zanuttini, R. 232, 239, 243, 248–50, 252, 256, 283n8, 283n10
Zeijlstra, H. 231–3, 236–9, 242–4, 246, 250, 258n8, 259n9, 259n10, 259n11, 259n18, 261n37, 270, 391
Zulu-English 386
Zwart, J.-W. 91, 124