Integration Complexity and the Order of Cosisters
William Dyer
Oracle Corp

Abstract

The cost of integrating dependent constituents to their heads is thought to involve the distance between dependent and head and the complexity of the integration (Gibson, 1998). The former has been convincingly addressed by Dependency Distance Minimization (DDM) (cf. Liu et al., 2017). The current study addresses the latter by proposing a novel theory of integration complexity derived from the entropy of the probability distribution of a dependent's heads. An analysis of Universal Dependency corpora provides empirical evidence regarding the preferred order of isomorphic cosisters (sister constituents of the same syntactic form on the same side of their head), such as the adjectives in pretty blue fish. Integration complexity, alongside DDM, allows for a general theory of constituent order based on integration cost.

1 Introduction

An open question in the field is why certain constituent orders are preferred to their reverse-order variants. For example, why do pretty blue fish or Toni went to the store after eating lunch seem more felicitous than blue pretty fish or Toni went after eating lunch to the store? In both sequences, two constituents of the same syntactic type depend on the same head: two 'stacked' adjectives modify fish and two prepositional phrases modify went. Yet despite their syntactic and truth-conditional equivalence, one order is preferred.

This order preference has often been treated with discrete models for each constituent type. For example, it has been proposed that stacked adjectives follow (1) a general hierarchy based on inherence (Whorf, 1945; that is, the adjective closest to the head is more inherent to the head), discrimination (Ziff, 1960), intrinsicness (Danks and Glucksberg, 1971), temporariness (Bolinger, 1967; Larson, 2000), or subjectivity (Scontras et al., 2017); (2) a binary hierarchy based on features such as relative/absolute (Sproat and Shih, 1991), stage-/individual-level (Larson, 1998), or direct/indirect (Cinque, 2010); or (3) a multi-category hierarchy of intensional/subsective/intersective (Kamp and Partee, 1995; Partee, 2007; Truswell, 2009), reinforcer/epithet/descriptor/classifier (Feist, 2012), and perhaps most famously, semantic features such as size/shape/color/nationality (Quirk et al., 1985; Scott, 2002). Similarly, prepositional phrases and adverbials have been held to follow a hierarchy based on manner/place/time (Boisson, 1981; Cinque, 2001) or thematic roles such as evidential/temporal/locative (Schweikert, 2004). While these models may be reasonably accurate (though see Hawkins, 2000; Truswell, 2009; Kotowski, 2016), they seem to lack external motivation (Cinque, 2010, pp. 122-3) and explanatory power outside their specific constituent types.

A more general approach starts from certain tendencies: constituents placed closer to their heads than their same-side sisters are more often complements than adjuncts (Culicover and Jackendoff, 2005) and are more likely to be shorter (Behaghel, 1930; Wasow and Arnold, 2003), less complex (Berlage, 2014), or have less grammatical weight (Osborne, 2007). It treats these tendencies as the result of larger motivations such as Head Proximity (Rijkhoff, 1986, 2000), Early Immediate Constituents (Hawkins, 2004), or Minimize Domains (Hawkins, 2014). This line of inquiry seeks to explain Behaghel's (1932) observation that syntactic proximity mirrors semantic closeness, either due to iconicity or, more recently, as an efficiency-based aid to cognitive processing.

The current study sits within this latter approach of appealing to a general principle to motivate a constituent-ordering pattern.

2 Dependency Distance & Isomorphic Cosisters

Dependency is a relation between words such that each word except the root depends on another word, forming a tree of dependents and heads (Tesnière, 1959; Mel'čuk, 2000). Dependency Distance Minimization¹ (DDM) holds that word orders which minimize the cumulative linear distance between dependents and their heads tend to be preferred to variants with longer total distances, where dependency distance is the count of words intervening between dependent and head (Liu et al., 2017). In Figure 1, for example, the two sentences may be semantically equivalent, but variant A1 yields a total dependency distance of 13, which is smaller than that of A2 at 18; thus A1 is preferred according to DDM. The variants in Figure 1 hinge on whether the particle up appears closer to the head looked than the longer noun phrase the date of the last eclipse. DDM has been shown to be quite widespread, if not universal (Futrell et al., 2015), and rests on solid theoretical and empirical foundations from linguistics (Hudson, 1995), psycholinguistics (Futrell et al., 2017), and mathematics (Ferrer-i Cancho, 2004).

[Figure 1: DDM variants. Variant A1, Avery looked up the date of the last eclipse: total dependency distance = 13. Variant A2, Avery looked the date of the last eclipse up: total dependency distance = 18. Labels: h (head), d1 and d2 (cosisters).]

The methodology underlying DDM effectively punishes certain structures, including those in which two sister constituents are placed on the same side of their head ('cosisters' after Osborne, 2007), where the longer cosister appears closest to the head. Variant A2 in Figure 1 shows such a case. One strategy for avoiding these structures is to alternate the placement of sister constituents on either side of the head (Temperley, 2008), as in many double-adjective noun phrases in Romance (the Spanish gran globo rojo [big balloon red] 'big red balloon') and single- and multi-word adjective phrases in English, as in the happy child / the child happy from playing outside.

Another strategy for minimizing dependency distance is to place shorter cosisters closer to the head, as in Figure 1 variant A1, in which the shorter dependent cosister d1 is placed closer to the head h than its longer cosister d2. Because the two cosisters are of differing length, DDM is able to predict that variant A1 is preferred to A2.

However, if the cosisters are of the same length, or more accurately if they have the same form, DDM is unable to explain the preference for one variant over another. Figure 2 shows two such structures, B and C, in which varying whether d1 or d2 appears closest to the head h does not yield a different total dependency distance. The cosisters di in B have the same structure, as do the cosisters di in C: the single-word it and up in B are single leaf-node dependents with no other internal structure, and the internal structure of very hard and all day in C is the same in that the first word depends on the second in both cases.

[Figure 2: Isomorphic cosisters. B1, Bo looks it up: dep. dist. = 4. B2, Bo looks up it: dep. dist. = 4. C1, Cam works very hard all day: total dep. dist. = 9. C2, Cam works all day very hard: total dep. dist. = 9. Labels: h (head), d1 and d2 (cosisters).]

These isomorphic cosisters, or same-side sister constituents that share the same internal syntactic form, are the focus of the current study. In order to motivate a preference for one linear order over another, as in Figure 2 B1 and C1 over B2 and C2,² we must appeal to a mechanism other than DDM.

¹ This approach is also called Dependency Length Minimization (DLM). Liu et al. (2017) suggest that because 'distance' connotes a dynamic state which may vary, while 'length' is a more static feature, 'distance' is preferred. Recent literature (e.g. Ferrer-i Cancho, 2017; Futrell et al., 2017; Ouyang and Jiang, 2017) is converging on 'distance.'

² B2 and C2 are not necessarily impossible, just disfavored. When asked Does Cam work very hard in the morning?, the response No, Cam works ALL DAY very hard, might be marginally acceptable, especially with focus stress (Rooth, 1992). Adjective order tendencies (BLUE pretty fish) are also violable under similar contexts (Matthews, 2014, p. 95).

3 Integration Complexity

The cost of integrating a dependent to its head "consists of two parts: (1) a cost dependent on the complexity of the integration [... and] (2) a distance-based cost" (Gibson, 1998, p. 13). If we accept DDM as the basis for the distance-based cost and a valid motivation for preferred orders among different-length constituents (Futrell et al., 2017), a definition of integration complexity may allow the ordering preference between variant orders of isomorphic cosisters to be addressed.

Many have wrestled with the notion of linguistic complexity (Newmeyer and Preston, 2014) or grammatical weight (Wasow, 1997; Osborne, 2007), though a consensus has yet to emerge. Suggestions often involve number of words or phrase-structure nodes (more words or nodes equate to higher complexity), yet counterexamples to this sort of reasoning are readily found: Chomsky (1975, p.

[...] dependents with lower integration complexity tend to be placed closer to heads than their cosisters.

A plausible feature of dependents, one which could form the basis of integration complexity, is their frequency. However, a simple example shows that this cannot be the case: in big chartreuse blanket, the less-frequent adjective chartreuse is placed closest to the head, while in minuscule white blanket the more-frequent white is placed closest to the head. Clearly frequency of dependent alone cannot be the force driving integration complexity.

A similar feature is the range of heads that a word can depend on. Ziff (1960) initially proposes that this 'privilege of occurrence' could be the mechanism underlying adjective order, giving the example of little white house, in which little can depend on a wider range of nouns than can white: little sonnet for example, but not
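As a check on the distance-based half of the cost, the totals reported for Figures 1 and 2 can be reproduced with a short sketch. The head assignments below are a reconstruction rather than the author's annotation: they assume that the preposition of heads its phrase and that adjacent words are at distance 1 (absolute difference of word positions), which is the convention under which the reported totals come out. The names total_dependency_distance and VARIANTS are hypothetical.

```python
def total_dependency_distance(sentence):
    """Sum |position - head position| over all non-root words.

    `sentence` is a list of (word, head) pairs; `head` is the 1-based
    position of the word's head, or 0 for the root.
    """
    return sum(
        abs(pos - head)
        for pos, (_, head) in enumerate(sentence, start=1)
        if head != 0
    )


# Reconstructed (assumed) head positions for the six variants.
VARIANTS = {
    "A1": [("Avery", 2), ("looked", 0), ("up", 2), ("the", 5), ("date", 2),
           ("of", 5), ("the", 9), ("last", 9), ("eclipse", 6)],
    "A2": [("Avery", 2), ("looked", 0), ("the", 4), ("date", 2), ("of", 4),
           ("the", 8), ("last", 8), ("eclipse", 5), ("up", 2)],
    "B1": [("Bo", 2), ("looks", 0), ("it", 2), ("up", 2)],
    "B2": [("Bo", 2), ("looks", 0), ("up", 2), ("it", 2)],
    "C1": [("Cam", 2), ("works", 0), ("very", 4), ("hard", 2),
           ("all", 6), ("day", 2)],
    "C2": [("Cam", 2), ("works", 0), ("all", 4), ("day", 2),
           ("very", 6), ("hard", 2)],
}

totals = {name: total_dependency_distance(s) for name, s in VARIANTS.items()}
print(totals)
```

Under these assumptions A1 and A2 differ (13 versus 18), while each isomorphic pair ties (B1 and B2 at 4, C1 and C2 at 9), which is why a distance-based cost alone cannot rank the B and C variants.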