An Introduction to Phase-Based Minimalist Grammars: Why Move Is Top-Down from Left-To-Right
Total Page:16
File Type:pdf, Size:1020Kb
Cristiano Chesi An introduction to Phase-based Minimalist Grammars: why move is Top-Down from Left-to-Right Cristiano Chesi CISCL - University of Siena chesi media.unisi.it This paper is an introduction to a grammatical formalism (Phase-based Minimalist Grammar, elaboration of Stabler’s 1997 Minimalist Grammar) that includes a revised version of the standard minimalist structure building operations merge, move and the notion of derivation by phase (Chomsky 1999-2005). The main difference with respect to the standard theory is that these devices strictly operate Top- Down and from Left-to-Right. In these pages I will argue that long distance dependencies, such as successive cyclic A'-movement, are better understood within this unconventional (at least within the Minimalist Program) phase-based directional perspective1. 1. Introduction The Government and Binding approach (Chomsky 1981) broke with the Transformational Grammar tradition (Chomsky 1957) by shifting the focus of inquiry from the generative procedure of phrase structure building to the configurational patterns apt to discard/license bad/well-formed sentences given their underlying Structural Description(s). Within the Minimalist Program (Chomsky 1995), structure building operations (such as merge and move) re-gain a prominent position but certain well know problems of directionality of application of these operations, with respect to specific formal tasks such as generation and recognition (Aho and Ullman 1972) and their performance counterpart, production and parsing (in the sense of Berwick and Weinberg 1986), are only marginally discussed within the standard framework (Phillips 1996, 2003). This is essentially because the definition of these structure building operations implicitly includes a directionality assumption without discussion. In this paper I will tackle this assumption and argue that the directionality issue is crucial for understanding certain aspects of the grammar, especially long distance dependencies such as A'-movement: if we assume that a long-distance dependency necessarily includes a unique base thematic position and a unique topmost criterial position (e.g. operator, subject or topic, Rizzi 2004a), from a purely formal viewpoint we might expect that the order of computation of these positions be irrelevant, that is, 1 This work is a distilled version of the ideas discussed in my Ph.D. thesis defended at the University of Siena (January 2005). I am especially grateful to Valentina Bianchi for the careful discussion of every aspect of this paper. Parts of this work have been discussed at CISCL (2001-07), MIT (2001, 2004), Harvard (February 2006), UPENN (February 2006), University of Geneva (March 2005, 2006), University of Nanzan (February 2007), University of Gerona (June 2007). Thanks to all these audiences for their precious suggestions and remarks. 38 An introduction to Phase-based Minimalist Grammars the computation might equally well start from the bottom position (foot) or from the topmost position of the syntactic chain. However, even though this representational approach correctly characterizes the structural constraints on the form of the chain, it does not guarantee that an algorithm exists to build such a chain and that it is computable. The problem lies on the fact that in between the basic/thematic position and the topmost criterial one, there might be a potentially infinite number of intermediate positions which are required to satisfy locality constraints, and which must be created by a recursive procedure. Computationally speaking, when we deal with recursive procedures we need some special care: the grammatical formalism, Phase-based Minimalist Grammar (PMG)2, outlined in these pages allows for an explicit formalization of a class of computationally tractable minimalist grammars3. In particular I will provide an algorithmic definition of the movement operation that is computationally tractable, mainly deterministic and potentially cognitively motivated. In particular, to achieve these results, I will focus on three important properties: 1. structure building operations can be included in the grammar but only if they apply Top-Down, from Left-to-Right; 2. using a Linearization Principle (inspired by Kayne 1994’s LCA) and fixing the functional structure by means of a universal hierarchy (cartographic approach4) makes the algorithm mostly deterministic; 3. the adoption of the notion of phase (Chomsky 1999) allows us to achieve computational tractability in the computation of a relevant subset of long distance relations. 2. Formalizing grammars In standard generative approaches, we are used to thinking of a sentence as a bidimensional structure bearing information on both precedence and dominance relations among lexical items and categorial features, where precedence is a total order among the pronounced elements (namely words, which are groups of morpho- phonological features) while dominance expresses the constituency/dependency relations among pronounced and other (abstract/categorial) elements. These two kinds of information are represented by tree structures5 such as the following one: (1)X dominance: X dominates A, Y, B, C A Y Y dominates B, C B C precedence: <A, B, C> A language can be extensionally characterized as an infinite set of grammatical expressions each associated with at least one structural description. We assume that an intentional procedure characterizes this set by productively restricting the theoretically 2 Based on Stabler 1997; see Chesi 2004 for a full preliminary discussion of this approach. 3 The notion of “grammatical formalism” is thus the formally explicit counterpart of the informal notion of “framework” which has wide currency in syntactic theorizing. 4 Belletti ed. 2002, Cinque 1999, ed. 2002, Rizzi 1997, ed. 2004b and related works. 5 Rooted, labeled, oriented trees, in fact (McCawley 1968). 39 Cristiano Chesi possible precedence/dominance relations that can co-occur within the same sentence/structural description. We refer to this intensional procedure as speaker’s competence (or I-language, Chomsky 1986). Formally speaking, this competence is represented by a grammar: this includes by standard hypothesis6, at least a specification of a lexicon (a finite set of words/morphological units, built from an alphabet, with associated specific abstract/semantic features) plus some universal properties (usually encoded as principles/rules) and language specific options (parameters) to derive the combinatorial/recursive potentiality of any natural language. Within this framework, the specification of the Structure Building Operations and their order of application is controversial: namely, it could be impossible7 or at least superfluous8 to specify once and for all one precise algorithm that recursively defines the procedure for assembling bigger and bigger meaningful units, starting from the lexicon and its properties. Recent minimalist inquiries pay serious attention to this problem, and provide interesting solutions but also raise puzzling problems that need heedful discussion. In order to do that, we need to make fully explicit assumptions: a complete formalization is a necessary step since it forces us to specify everything we need to describe a language. On the other hand, when choosing a specific formalization (§2.1) we must consider to which extent it transparently encodes linguistic intuitions (§2.2) and at which (computational) cost it does (§2.3, §3). 2.1. A simple formalism: Minimalist Grammars Stabler (1997) proposes a simple formalization of a Minimalist Grammar (MG) as outlined in Chomsky 1995. Following his work, a MG can be defined as a 4-tuple {V, Cat, Lex, F} such that: (2) Minimalist Grammar (MG, from Stabler 1997) V is a finite set of non-syntactic features, (P ∪ I) where P are phonetic features and I are semantic ones; Cat is a finite set of syntactic features, Cat = (base ∪ select ∪ licensors ∪ licensees) where base are standard categories {complementizer, tense, verb, noun ...}, select specify selection requirements {=x | x ∈ base} where =x means base-driven selection of an x phrase; licensees specify requirements forcing phrasal movement {-wh, -case ...}, -x triggers covert movement, -X triggers overt movement; licensors are the features that can satisfy licensee requirements {+wh, +case ...} Lex is a finite set of expressions built from V and Cat (the lexicon); F is a set of the two partial functions from tuples of expressions to expressions{merge, move}; 6 I will assume by “standard hypothesis” the Principle and Parameter framework presented in Chomsky 1981. 7 In a transformational grammar (Chomsky 1957), rewriting rules have to be applied in a precise order, crucially different depending on the sentence to be processed. 8 As shown by Fong (1991), principles can be applied as filters/generators in many orders and all of them will lead to the same structural representation. The difference would be made in terms of the number of states the algorithm should evaluate before reaching the correct result. 40 An introduction to Phase-based Minimalist Grammars The language defined by such a grammar is the closure of the lexicon (Lex) under the structure building operations (F). (3) is an example of MG able to deal with simple (i.e. not cyclic) wh- movement9: (3) MG example V = P = {/what/, /did/, /you/, /see/}, I = {[what], [did], [you], [see]} Cat = base = {D, N, V, T, C}, select = {=D, =N, =V, =T, =C}, licensors {+wh}, licensees {-wh} 10 Lex = [-wh D what], [=V T did], [D you],