<<

Journal of 6 (2013) 379–410 brill.com/jlc

Phylogenetic and areal models of Indo-European relatedness: The role of contact in reconstruction

Bridget Drinka University of Texas at San Antonio [email protected]

Abstract The Indo-European family has traditionally been viewed as textbook example of genetically related languages, easily fit onto a family . What is less often recognized, however, is that IE also provides considerable evidence for the operation of contact among these related languages, discernable in the layers of innovation that certain varieties share. In this paper, claim that the family tree model as it is usually depicted, discretely divided and unaffected by external influence, may a useful representation of language relatedness, but is inadequate as a model of change, especially in its inability to represent the crucial role of contact in linguistic innovation. The recognition of contact among Indo-European languages has implications not only for the geographical positioning of IE languages on the map of Eurasia, but also for general theoretical characterizations of change: the horizontal, areal of change implies a stratifica- tion of data, a layered distribution of archaic and innovative features, which can help us grasp where contact, and innovation, has or has not occurred.

Keywords phylogenetic; areal, Indo-European; stratification; language contact

1. Introduction

From the earliest days of Indo-European studies, scholars have used trees to represent the relationships among IE languages: Schleicher’s Stammbaum (1860) came to be the most frequent depiction of genetic relatedness for the IE languages soon after it first appeared. Although recognized as limited in scope virtually from the time of its inception (cf. Schmidt’s 1872 proposal for the replacement of the Stammbaumtheorie, the so-called Wellentheorie, or Wave Theory), the family tree model has persisted as the most familiar representation of linguistic filiation through the years.1 Recent advances in

1 See Brugmann’s (1883) point-by-point attempt to rebut Schmidt’s claims, such as Brugmann’s out-of-hand dismissal of Schmidt’s identification of the areal nature of the “satem”

© Koninklijke Brill NV, Leiden, 2013 DOI 10.1163/19552629-00602009

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

380 . Drinka / Journal of Language Contact 6 (2013) 379–410 computational modeling in genetics have fostered renewed interest among historical linguists in , the construction of phylogenetic trees to represent descendency from a single ancestor.2 An impressive array of computational approaches has developed, from character-based methods (Rexová et al., 2003; Gray and Atkinson, 2003) to distance-based methods (Serva and Petroni, 2008; Heggarty et al., 2010; Wichmann et al., 2011), from those focusing on the construction of phylogenetic trees (Gray and Atkinson, 2003; Warnow et al., 2006) to those concentrating on the construc- tion of networks (Forster and Toth, 2003;3 Bryant et al., 2005). Many authors attempt to reconstruct Indo-European genealogy and provenience using cladistic methods, such as those who seek to create a “perfect” or network of the IE languages based on lexical, phonological, or mor- phological features (Taylor et al., 2000; Ringe et al., 2002; Nakhleh et al., 2005a; 2005b) or those who use these data to locate a homeland for Proto- Indo-European (Gray and Atkinson, 2003; Atkinson and Gray, 2006; Bouckaert et al., 2012).4 The present study begins with a close examination of several of these cladistic models, then proceeds to an examination of alternative approaches. The genetic model is here recognized as a simplified model of the ancient relationship of Indo-European languages, but its usefulness as a definitive means of representing this relationship is questioned. Models focusing solely on language continua, as represented by waves, rivers, and other metaphors, may also be inadequate in isolation if they fail to acknowledge the essential role played by transmission from generation to generation (Labov, 2007). As argued in Heggarty et al. (2010: 3831), neither a tree model nor a can single-handedly represent the complex nature of relatedness within language families; what is required to reflect the dynamics of real-world rela- tionships is a “complex composite” of both approaches.

development (see §2 below) because it does not take into account the “unleugbar scharf trennende Sprachgrenzen” (‘undeniably sharply-dividing linguistic boundaries’) between Balto- and Indo-Iranian (1883: 228). It was this work which helped establish the family tree as the most popular model of IE relationship. See also Blažek (2007) for a synopsis of IE tree models. 2 The esemblancer between biological and linguistic processes of gradual differentiation was already noted by Darwin (1871: 89), who called these developments “curiously parallel” (Atkinson and Gray, 2005: 513). 3 Eska and Ringe (2004) present a strong critique of this work, both for its methodology and for its treatment of data. 4 All three of these works identify as the likely homeland of PIE. This claim will be examined in some detail below.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 381

2. Trees and the Representation of Genetic Relationship in Indo-European

An appropriate place to begin this analysis of recent computer-generated cladistic models is with the preliminary study by Taylor et al. (2000), which establishes the basic tenets of tree construction: for each group, specific features (“characters”)—phonological, morphological, or lexical5—are coded and entered into an algorithm which produces a tree. Parallel developments and borrowings are excluded, because they “do not reflect the genetic descent of the languages involved” (Taylor et al., 2000: 397).6 This exclusion of borrowed forms fosters a focus on phonological and morphological characters, since lexical items are in many cases easier to borrow than structural patterns.7 Two sound changes, the “satem” development8 and the “ruki” rule9 are given special prominence. The resulting trees characterize the “satem” languages of Vedic, Avestan, Old Slavonic, and Lithuanian as forming a genetic subgroup that the authors call the “satem core” (Taylor et al., 2000: 400-403). In the west, the resist easy positioning on the tree, in some ways aligning with Balto-Slavic, in other ways with Celtic and Italic. The research- ers propose that Germanic began as the “nearest sister” of Balto-Slavic, but switched affiliation through contact with Celtic and/or Italic. Germanic, then, is placed on the “non-tree edges,” as in Figure 1. Several difficulties with this analysis become evident immediately. For example, while lexical items are, indeed, more easily borrowed, phonological and morphological features can certainly also spread from language to language

5 For the lexical analysis, the researchers use the 208-word Swadesh list, consisting of the lexical meanings assumed by Swadesh (1952) to be least subject to borrowing in the languages of the world. See, however, Kessler’s persuasive argument (2001:104-108) that many of the words on these lists have, in fact, been borrowed. 6 As we shall see, this view was revised in later work by several of the authors (Warnow et al., 2006). 7 It should be noted, however, that in certain contact situations, such as language shift with normal transmission, substantial structural influence can occur without heavy lexical borrowing (Thomason and Kaufman, 1988: 115-146). 8 Indic, Iranian, Slavic, Baltic, Armenian, and Albanian developed a spirant (s or š) from - tal *k,̑ where other IE languages have maintained a reflex of *k, as found in the words for ‘hun- dred’: Skt. šatám, Aves. satəm, Russ. sto Lith. šimtas vs. Lat. centum, Gk. hekatón, Eng. hundred (Szemerényi, 1996: 59-61). 9 Indic, Iranian, Slavic, and, to some extent, Baltic changed s to š before r, , k, and i, appear- ing as retroflex ṣ in and x in Slavic before back , .g., *ters- ‘dry’: Gk. térsomai ‘I become thirsty’, Lat. tostus (< *torsitos), Eng. thirst vs. Skt. t ṛṣyati ‘thirsts’, Aves. taršna- ‘thirst’, Lith. tirštas. Similarly, *wers- ‘high place’: Lat. uerrūca ‘wart’ vs. Skt. varṣman- ‘height, peak’, Lith. viršùs ‘summit’, OCS vrĭxŭ ‘summit, height’ (Szemerényi, 1996: 51-52).

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

382 B. Drinka / Journal of Language Contact 6 (2013) 379–410

Figure 1. The complex position of ermanicG (Taylor et al., 2000: 407) (reprinted with permission of John Benjamins). or to dialect under conditions of intense contact, as abundantly illus- trated in the literature on languages in contact (Thomason and Kaufman, 1988; Thomason, 2001; Heine and Kuteva, 2005; 2006). In fact, extensive struc­ tural borrowing can occur with minimal lexical borrowing, as has occurred among varieties of the Vaupés region of Amazonia, where language distinction is highly valued (Epps, 2009). Besides the fact that structural borrowing should not be excluded, we should also note that the two sound changes that Taylor et al., (2000) focus upon for their analysis, the “satem” development and the “ruki” rule, are often considered as prime examples of language con- tact, showing broad areal distribution (Hock, 1999: 15). Taylor at al. (2000: 400-403), on the contrary, place the “satem” languages under a single node, implying that they form a genetic subgroup. This grouping together of Indo- Iranian and Balto-Slavic gives the false impression that these languages descended from a single ancestor, rather than adopting features which had

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 383 spread across the territory.10 In some iterations of the algorithm (especially when Germanic is excluded), an additional problem arises: and come to be linked under a single node, reflecting the phonological and morphological similarities that these languages share, but identifying this resemblance, again, as a sign of genetic affiliation rather than more plausible later contact.11 Finally, as represented in Figure 1 above, the authors depict the relationship that Germanic had with other Indo-European languages as sharing a node first with Balto-Slavic (as its “nearest sister”) and then with Italic and Celtic. In forcing the data into tree structures, they fail to grasp the areal nature of these connections which Porzig (1954) has firmly established.12 The analysis of Ringe et al. (2002) expands upon the claims of Taylor et al. (2000), but continues to give primacy to genetic representations over those involving contact. The authors assert that normal linguistic transmission does not lead to mixed grammars, and that such mixings can only arise from a discontinuity of transmission—a rare occurrence, in their estimation (Ringe et al., 2002: 63).13 In this model, contact continues to be virtually excluded from consideration. Several criteria are used to construct the IE family tree, among the most important of which is the distribution of verbal structures (M1, i.e., morpo- logical character 1) (Ringe et al., 2002: 117-119). The authors assign languages to the following categories according to the patterning of verbal structures:

10 In later work (e.g., Ringe et al., 2002: 109), the genetic affinity of the “satem core” lan- guages is downplayed, and the role of contact is recognized, following Hock (1986: 442-444). But these innovations still appear as evidence for genetic connection. 11 Several scholars have recently made a case for the validity of an Italo-Celtic node, among them Jasanoff (1997), Schrijver (2006), and Weiss (2012). Schrijver, for example, points to several phonological and morphological similarities which suggest a closer genetic relationship between these two families than is usually recognized in recent scholarship; at the same time, he also acknowledges the significant role played by contact, assuming, however, that it occurred at a later time. Even if some intriguing similarities have been assembled, one must still take into account the broad array of differences that the two families exhibit in their lexicon, , and syntax. See Clackson and Horrocks (2007 §1.5) for additional arguments against the Italo-Celtic reconstruction. 12 Cf., for example, the many shared by Celtic and Germanic (Porzig, 1954: 118-127) and by Germanic and Balto-Slavic (140-148); see also Porzig’s identification of the Italic, Germanic, Baltic, and Slavic contact zones (205-209), and his conclusions concerning the distribution of dialectal features in the IE languages (213-217). 13 “To be sure, the effects of imperfect second-language learning strongly resemble ‘borrowing of morphosyntax’ after the fact, if the imperfectly learned second language eventually becomes a com- munity norm. But in our view such a pattern in the data reveals a discontinuity of transmission which should exclude the language in question from any strict ‘family tree.” (Ringe et al., 2002: 107).

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

384 B. Drinka / Journal of Language Contact 6 (2013) 379–410

Table 1 Categories of IE verb system (Ringe et al. 2002: 117-119). 1. one stem per lexeme a. two conjugations [Hittite] . b single conjugation [Luvian, Lycian] 2. present / / perfect contrast [Armenian, Greek, Albanian, , Avestan, Old , Latin, Old Persian, Oscan, Umbrian] 3. present / subjunctive / preterite contrast, the former two largely parallel [Tocharian A, Tocharian B] 4. present / preterite / infinitive contrast [Lithuanian, OPrussian] 5. present / preterite contrast, the latter in two conjugations (“strong” vs. “weak”) [Old English, Gothic, , Old High German] 6. present / subjunctive / future / preterite contrast [Old Irish] 7. present / subjunctive / preterite contrast, the latter two usually sigmatic [Welsh]

Several questions arise concerning this method of assessing relationship: are all of these distributions to be seen as distinct, single instantiations of change, or do they represent separate innovations? For example, is the presence or absence of a subjunctive construction enough to separate languages that are otherwise parallel? Subjunctives are closely connected to other parts of the verb system, often developing from present or future tense verbs in subordinate clauses (Bybee et al., 1994: 230-236). A separate consideration of the development of the moods would seem to be in order, similar to the special consideration given to the thematic optative (M6). On a deeper level, the authors’ categorization of the verb system points to a more fundamental methodological problem. The proposed criteria do assign the languages to categories, but do not represent how these categories are related, or how they reflect the verb system of the proto-language. The aim of the authors is clearly to establish relationship among daughters, and not to discern which systems have retained archaic distinctions and which have inno- vated. One must wonder, for example, why the sigmatic nature of the sub- junctives and preterites of Welsh are singled out as decisive, and not those of Greek, where the sigmatic forms enjoyed great productivity. The choice of

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 385 criteria appears to be arbitrary, reflecting the authors’ proclivity towards neat categorization that can be reflected on a tree. In similar fashion, no attention is paid to the crucial fact that some verbal categories are undeniably built upon others. Root , for example, consti- tute an archaic layer in the IE verb system; many present stems are built upon these old root forms by means of infixation or suffixation, and the imperfects are, in turn, constructed upon these present stems:

(1) Sanskrit aorist vs. imperfect √kṛ ‘make’: root aorist: akaram imperfect: akṛṇavam a- kar – am a- k ṛṇav- am augment + aorist stem + pst.1sg. augment + present stem + pst.1sg (k ṛ + n-infix)

Ringe et al. do establish a nesting of branches, implying that certain stages of a language—represented by nodes—precede others in time. Archaism and innovation in the verb system, then, are not completely excluded from consid- eration (cf. the important inclusion of the thematic aorist (M3) as innovative [Ringe et al., 2002: 94]), but the crucial clues provided by stratified temporal- aspectual categories receive little attention. As a result of this underestimation of the value of such details, the authors fail to recognize important similarities, such as the remarkable resemblance of the Greek and Indo-Iranian verb morphologies—their identical formation of the long- subjunctives, for example, and their parallel expansion of in the perfect systems. The authors do note that these two branches alone, along with Armenian, have implemented the use of the augment (M2), but they do not recognize the far-reaching implications of this innovation: the addition of the augment made it possible for both Greek and Indo-Iranian to form an imperfect alongside their older aorists (see ex. (1) above). This new usage thus allowed these languages to develop a full-fledged aspectual system, in contrast to the older, simpler system retained in Hittite and, presumably, in Germanic (present vs. preterite). What the authors also fail to notice is that the innovative use of the augment is areally distributed, not unlike the “satem” development and the “ruki” rule, though applying to different languages and occurring at a different time. In the last analysis, the extension of the verb system shared by Indo-Iranian and Greek is ignored because the characters which mark it are thought to indicate genetic relation- ship rather than shared innovation. What we must conclude is that this late, areal relationship between Greek and Indo-Iranian should not be placed on a tree.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

386 B. Drinka / Journal of Language Contact 6 (2013) 379–410

How suitable, then, is the family tree model for representing relation­ ships within Indo-European or other language families? In the next section, arguments are presented which point to a need for a more comprehensive model.

3. Inadequacies of the Phylogenetic Model

At least four critiques of the phylogenetic model emerge, each of which merits further examination. In what follows, we will focus upon the non-explanatory nature of the phylogenetic model, the inaccuracy of its depiction of change and relationship, its inherent inability to recognize contact as an essential element of change, and its non-stratified nature.

3.1. Phylogenetic Trees Are Non-explanatory

While it is clearly essential that genetic relationships among languages be represented, it is also important to note that the family tree model, in seeking to identify archaism and to project that archaism back to a predecessor lan- guage, represents only one approach among several possibilities for organizing the data. The rationale for constructing a tree is not to determine why lan- guages developed the way they did. Rather, it is to utilize the evidence of later innovations to deconstruct change, to arrive at some point of unity in the past. An apt analog for the family tree is a skeleton, providing remnants of past relationship and shared ancestry, but offering no indication as to why some features change and others do not;14 stasis is thus viewed in this model as an unanalyzable entity, just as innovation is. The argument can, of course, be made that the family tree was not created to account for change. Trees are, by design, purely descriptive. However, if we hope to produce an accurate and dynamic characterization of Indo-European or other families, we need a more all-inclusive model, one which takes into account how and why innovations occur, and whether some similarities can be explained by reasons other than shared inheritance. What is essential to note is that, when speakers adopt change, they are doing so not because of their genetic background, but, above all, because of their interactions with others, and the new alignments they seek to establish (Croft, 2000: 178).

14 McMahon (1994: 248) refers to this issue as the “real actuation question”: “why some of these innovations die out and others catch on, spreading through the community, or why certain instances of variation become changes while others don’t.”

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 387

3.2. Phylogenetic Trees Are Inaccurate Depictions

As Ringe et al. themselves note (2002: 109), trees, by their nature, do not accurately represent the presence of dialect continua: when languages or on a disappear, analysts may mistakenly judge the survivors as representing “clean speciation.”15 A tree may then represent “not the original diversification of the languages but the end product of a complex series of events of differentiation and survival” (Ringe et al., 2002: 109). Only the survivors are specified—vestiges of social or political success rather than of purely linguistic development.16

3.3. Phylogenetic Trees Undervalue Contact as an Essential Element of Change

By forcing data onto trees, Ringe et al. diminish the possibility of innovations being viewed as moving across the territory in areal fashion. For example, even though they do admit (Ringe et al., 2002: 109), as mentioned above, that the “satem” innovation could have spread from Indo-Iranian into Balto-Slavic after a time of unity, they still use this change to support a closer genetic relationship between Indo-Iranian and Balto-Slavic than, say, between either of these and Greek (Fig. 2): In sum, there is no way for the family tree model to represent diffusion. Just as we acknowledged that the phylogenetic tree, as it stands, is not an explanatory model, so must we also admit that it is not presently capable of representing areal spread or contact phenomena, as McMahon & McMahon (2005: 18) note: We must acknowledge that family trees cannot tell the whole story, but equally that they do capture one important aspect of linguistic history; this does not mean we should castigate or reject the tree model for not incorporating contact, which it was never designed to do in the first place. One must still ask, however, whether the exclusion of data is licensed simply because it is not analyzable by a particular model. It would seem preferable to create a more comprehensive model which could depict other possible sce- narios of development in addition to the genetic one.17

15 The esearchersr also make the intriguing and credible observation that nodes may contain variation (Ringe et al. 2002: 108)—an implicit endorsement of the role of variation in change. 16 See Garrett (1999) and Heggarty et al. (2010) for insightful discussion of these issues. 17 Ringe et al. agree that trees cannot adequately address the issue of contact. When con- fronted with challenges such as the distributions of Germanic, Ringe et al. suggest that other models, such as network models of linguistic diversification, might be more suitable (Ringe et al., 2002: 108-112). See section 6 for further discussion.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

388 B. Drinka / Journal of Language Contact 6 (2013) 379–410

Tocharian A Hittite Tocharian B Lycian Luvian

Latin Old Irish Oscan Welsh Umbrian

Albanian Gothic Old Norse Armenian Old English Greek Old High German

Old Church Slavonic Vedic Old Prussian Lithuanian AvestanOld Persian Latvian Figure 2. The apparent best tree for the entire Indo-European dataset (Ringe et al., 2002: 87) (reprinted with permission of Blackwell).

3.4. Phylogenetic Tree Models Cannot Utilize Stratified Data

An important but infrequently mentioned fact about recent IE phylogenetic analyses is that they unknowingly replicate sophisticated work already done by scholars such as Porzig (1954), Birwé (1955), and Euler (1979). These researchers sifted through large amounts of data to discern which similarities signal retention of an archaic form and which represent, conversely, a shared innovation, which imply genetic relationship and which point, instead, to

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 389 areal spread. An example of a questionable conclusion being reached through reliance solely on cladistic methods without the benefit of insights from previ- ous in-depth analyses is evident in Nakhleh et al. (2005a: 189), who view the distribution of the mediopassive *-r, found in Anatolian, Tocharian, Italic, and Celtic, vs. *-/-i, found in Germanic, Greek, and Indo-Iranian, (M5), as resulting from a split of these languages into two branches. Porzig (1954: 85) presents a more appealing, more credible, view: following Meillet (1931: 5), he notes the “peripheral” vs. “central” distribution of the two categories, and views the distribution of the innovative *-y/-i form not as indicative of genetic grouping but rather of areal spread (see also Drinka 1999: 490). The best representation of this change would probably be not that of a bifurcating tree, but of a layered, three-dimensional map which accounts for both the older, widespread distribution of *-r, but also the introduction of the innova- tive *-y/-i across a limited territory.

4. A Phylogeographic Model of Indo-European Diffusion and the

A close examination of one of these cladistic models, the recently-developed phylogeographic model of Bouckaert et al. (2012), will illustrate what is appealing yet still lacking in this mode of representing the diffusion of the Indo-European languages.18 As its name implies, this model undertakes the commendable but daunting task of incorporating both genealogical and geo- graphical information into one analysis, combining phylogenetic data (a tally- ing of the presence or absence of in a Swadesh-based list for ancient and modern IE languages) with geographical projections (using a “relaxed ran- dom walk” model,19 assuming a continuous spatial diffusion). The researchers thus “jointly infer the Indo-European language phylogeny and the most probable geographic ranges” of these phylogenies (Bouckaert et al., 2012: 958). That is, as explained in the supplementary material to this (page 9), they assume that “languages disperse as they evolve through time such that we can model their dispersal in space along the branches of the language tree inferred together with

18 This work is built upon a number of previous influential studies, including Gray and Atkinson, 2003, Atkinson and Gray, 2006, and Pagel et al., 2007. 19 As explained and illustrated in Lemey et al., 2010, the relaxed random walk method allows researchers to calculate the rate of spatial movement of a viral epidemic according to the con- straints of the context in which the diffusion occurs, not according to a fixed molecular clock model. In this way, the diffusion is not viewed as homogeneous over its entire expanse, but variable.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

390 B. Drinka / Journal of Language Contact 6 (2013) 379–410 data”—a provocative but not uncontroversial claim. It is worth noting that the phylogenetic tree is a given in this model, used uncritically as an accurate representation of language relationship. At least two questionable implications emerge from the authors’ arguments: that speakers will necessar- ily move rather than remain stationary, and that, when they move, they will follow a trajectory that other speakers of the same or related languages have established.20 Responding to each of these implications in turn, we should note, first of all, that it is not necessarily the case that speakers of a given language will move at all; speakers may remain in the same locale for long periods of time. The amount of change that a language undergoes is not tied to its movement through space, but to the social dynamics within the community where it is spoken (see Heggarty et al., 2010 for similar arguments.). With regard to the establishment of paths of diffusion by speakers of related languages, it is a well-documented fact that family members will migrate where other family members have gone, a phenomenon referred to as “chain migration” (Anthony, 2007: 112-113); this tendency could entail the estab- lishment of paths of diffusion of related languages. What we must still won- der, however, is whether speakers of other languages could not follow these same well-worn paths of diffusion—the Huns, Altaic, and Mongolian peoples who successively moved across the steppe were not speakers of the same lan- guage, for example, nor were the Berber and Arabic populations who moved into Iberia, or the Magyar and Slavic peoples who moved into the . Many forces are at work alongside genetic affiliation in determining whether and where populations will move. Using the phylogeographical approach developed to study the origins of virus outbreaks mentioned above (Lemey et al., 2010), the authors carried out an analysis of the basic vocabulary of 103 IE languages, and concluded that the Anatolian peninsula is a more likely homeland for the Proto- Indo-Europeans than the steppe. They then constrained their analysis by using trees constructed with the phonological and morphological data presented in Ringe et al. (2002), mentioned above, and, again, arrived at the conclusion that Anatolia was the probable IE homeland, rather than the steppe. In both

20 The authors do not characterize their model as necessarily entailing a migration trajectory, focusing especially upon the diffusion of languages rather than populations. What ought to be recognized, however, is that the Indo-European languages were brought to new locations through the movement of speakers, whether in larger groups, as claimed by proponents of the steppe model (Gimbutas, 1970; 1977; Mallory, 1989) or through small-scale cultural transmis- sions, as maintained by advocates of the Anatolian model (Renfrew, 1987), so that migration behaviors are, in any case, implicitly germane to this discussion.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 391 conclusions, whether the lexically-based or the structurally-constrained, an important assumption is to be noted: the phylogenetic trees that were used to carry out the analyses (S1 and S12 in the supplementary materials) list the —Hittite, Luvian, and Lycian—as undergoing the earliest split from the other IE languages. Few would contest the archaism that the Anatolian languages exhibit; what is to be questioned, however, is the appar- ent effect of this archaism on the conclusions that are reached. In such a model of very slow diffusion, languages with more archaic features will be more easily characterized as the center of diffusion, since all other languages are demon- strably less archaic and more amenable to depiction as later (as in the map in S4). The archaism represented in the family trees appears to prejudice the model towards choosing the site of archaism, Anatolia, as central.21 A more comprehensive analysis of the languages, including more thorough morpho- logical analysis,22 would reveal, as mentioned above, that other languages, such as Tocharian and Germanic, also show distinct signs of archaism, while another set of languages, Indo-Iranian and Greek, shows clear signs of later contact. The trees based on cognate sets do not reveal these facts, and cannot make use of them. Contact, a force of undeniable consequence among the IE languages, is, as in earlier models, simply excluded from consideration. An additional critique of the model, related to those mentioned above, pertains to a more subtle yet extremely important consideration: this model does not recognize the inherent layering of features in the Indo-European languages. While archaism is recognized in the lexicon of the Anatolian languages as indicated by a lower level of cognacy with other IE languages, this archaism is viewed as indicative of non-movement from the homeland, rather than as pointing to a separation from the central core of IE languages, parallel to that of other archaic languages like Tocharian and Germanic. Those lan- guages which did not separate and remained in contact went on to undergo an array of phonological and morphological innovations which spread, to varying degrees, across territories—phonological changes such as the above-mentioned

21 The authors attempt to control for this recognized potential bias by separately analyzing the ancient languages (including the Anatolian varieties) and the contemporary languages (exclud- ing Anatolian). The ancient languages yield a very high Bayes factor for Anatolian vs. the steppes (e.g., 1582.6 with a number greater than 100 considered “decisive”) while the contemporary languages yield a considerably lower Bayes factor (e.g., 11.4, still considered “substantial sup- port”) (Bouckaert et al., 2012: 959, Table 1). While the authors view both findings as supporting the Anatolian hypothesis, their results are far more striking when Anatolian is brought in, imply- ing that the inclusion of Anatolian does, indeed, bias their findings. 22 Cf., for example, the increasingly important role that Warnow et al. (2006: 83) give to morphological and phonological characters in the construction of phylogenetic trees, in contrast to their earlier focus on the lexicon.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

392 B. Drinka / Journal of Language Contact 6 (2013) 379–410

“ruki” rule and “satem” development, which were both adopted in many Indo-Iranian and Balto-, and morphological innovations such as the introduction of the augment, with the resultant creation of the imper- fect aspect, found especially in Indo-Iranian, Greek, and Armenian.23 Mallory (1989) points to the inability of the Anatolian model as articulated by Renfrew (1987) to account for this late shared innovation: if Indo-European began to spread from Anatolia in the ninth millennium as posited by Renfrew, and reached around 6500 BC but did not reach the territory of the Indo- Iranians until the fourth millennium, there is no opportunity for the contact which is evident in the linguistic record to have occurred. As Mallory (1989: 180) states, “[t]his would indicate that, when we first encounter these languages, they have been separated for over 5,000 years despite the fact that they share what are generally regarded as numerous late isoglosses.” Mallory (1989: 178-181) goes on to note other inadequacies of the Anatolian hypothesis: its inability, for example, to account for the widespread distribution of words for ‘,’ ‘plow,’ ‘wool,’ etc., connected with the “Secondary Products Revolution” (Sherratt, 1981),24 and its failure to explain the existence of Indo-European and Uralic connections. One of the most important questions Mallory raises concerns the distribution of other, non-IE languages in Anatolia: if Proto- Indo-European sprang from Anatolia and began its slow diffusion from there, it seems unlikely that Hattic and Hurrian, both attested from the beginning

23 Notably, Renfrew (2000) later endorses the stratification of Indo-European data into early, middle, and late layers, but does not recognize the geographical implications of this claim, that varieties which developed shared innovations in the late period must have been in physical contact. This fact represents serious counterevidence to Renfrew’s Anatolian hypothesis. 24 Heggarty (2006; 2007) expresses concern about the reliability of this use of “linguistic palaeontology” to specify the location of languages in space and time. He claims, for example (Heggarty, 2006: 189), that when speakers adopt a new technology (e.g., the axle), they will usually extend the semantic value of a pre-existing word in their own language (e.g., ‘pole’) to cover this new meaning, based on the example of the model language. As the technology spread, so would have the semantic extension. When dialects or languages adopting this technology are closely related, he argues, the forms which undergo this semantic extension may be cognates, a fact which could give the false appearance of shared inheritance to the new, extended meaning. Based on this line of reasoning, Heggarty claims that the introduction of the axle itself and the semantic extension of the term for it in many Indo-European languages would not have had to occur at the same time, but could have arisen millennia apart. It would therefore not be neces- sary to assume that Indo-European unity extended to a time after the creation of the axle, since the unitary nature of the term, Heggarty argues, could have grown up later, and diffused with the technology. What is difficult to accept in this argument is the implication that cognate words will be chosen in such a unified manner as to resemble inherited forms. The root *aks- is excep- tionally well-represented across all the major IE language families except for Hittite and Tocharian (Buck, 1949: 725; Anthony, 2007: 63-65).

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 393 of the Hittite historical tradition, would have originated in the same space, but would have remained static, unaffected by the migrations. In sum, while the mathematical model used by Bouckaert et al. (2012) is sophisticated and impressive, it misses a number of important insights: • languages do not necessarily move, nor is it necessarily the case that languages will diffuse along the same paths as their congeners • the relationship among IE languages is not genealogical alone, and is not completely reducible to simple trees, which may distort the complex dynamics of relationship. A diffusion model based upon handy but under-informative phylogenetic trees provides an incomplete depiction of the operation of change • the stratification of data into archaic and innovative layers needs to be accounted for, as do the geographical implications of such stratification • the designation of Anatolia as the homeland of the Indo-Europeans remains highly questionable

5. Acladistic Models of Linguistic Change: Waves, Rivers, and Entangled Banks

Turning our attention to models outside the cladistic tradition, we begin with the proposal put forward by Johannes Schmidt as an alternative to Schleicher’s representation of linguistic relationship as a family tree, that a better metaphor would be that of a wave (Schmidt, 1872).25 In Schmidt’s view, change can be thought of as radiating from a central point, like waves emanating from a point of inception. Communities nearest to the place of origination are expected to adopt the change most completely; those at a more distant location may adopt the innovation later and only partially, or only in particular contexts. More recently, Trudgill (1983: 58-59) has provided an enhancement of the Wave Model in the form of the Gravity Model, in which innovation moves, not in steamroller-fashion across the landscape, but by hopping from one large city to the next largest (not necessarily contiguous) city and leaping over the intervening terrain. Figure 3, for example, illustrates the spread of the Parisian uvular r, extending first to the major urban centers of , Turin, and Copenhagen, but not to smaller Osnabrück, Bergamo, or Åbenrå. This Gravity Model, schematically represented in Figure 4, uses such factors as population size and distance from the point of origin to account for the movement of innovations across the map.

25 Schmidt (1872) does not actually characterize his proposal as a theoretical model, but refers to it as a sketch or a depiction.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

394 B. Drinka / Journal of Language Contact 6 (2013) 379–410

Bergen Oslo uvular /r/ not usual Larvik

only in some educated speech Kristiansand usual in educated speech general Copenhagen

0 400 km

0 300 miles

e Hauge Berlin

Cologne

Stuttgart Paris

Basle Zurich

Turin

Marseille

Figure 3. The distribution of uvularr (Trudgill, 1983: 58) (reprinted with permission of NYU Press).

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 395

Figure 4. The gravity model of linguistic diffusion (based onWolfram & Schilling- Estes, 2003:724).

Other scholars also attempt to characterize change as non-stemmatic: Terrell (1988) proposes that Darwin’s image of an “entangled bank”26 might serve as a more appropriate model of the spread of population and language across the Pacific Islands than the family tree model. Such a depiction of change as complex and non-linear would encourage researchers to focus on variation within populations rather than on “unchanging, idealized types”; it would promote the search for proximate rather than final causes, and would, ultimately, allow us to conceive of change within societies “in human terms, [as] an interlocking, expanding, sometimes contracting and ever-changing set of social, political and economic subfields” (Terrell, 1988: 647).

26 “It is interesting to contemplate an entangled bank, clothed with many plants of many kinds, with birds singing on the bushes, with various insects flitting about, and with worms crawling through the damp earth, and to reflect that these elaborately constructed forms, so different from each other, and dependent on each other in so complex a manner, have all been produced by laws acting around us.” (Darwin 1859: 489)

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

396 B. Drinka / Journal of Language Contact 6 (2013) 379–410

Mufwene (2001) likewise presents alternative models to the phylogenetic tree in his characterization of the spread of innovations across a population. For example, he compares language to a parasitic species which must rely upon its host (the speakers) and society in order to survive: “linguistic features are passed on primarily horizontally, more or less on the pattern of features of parasites, through speakers’ interactions with members of the same communicative network or of the same speech community” (Mufwene, 2001: 150). Agents of change transmit linguistic features from one speech community to the next, just as they would a disease (2001: 151). Mufwene (2001: 143) also presents, in more schematic fashion, another attractive model of linguistic change: the image of a flowing river, its coloration determined by the hues and shades of tributaries which flow into it. As the river arrives at its delta and divides into smaller streams, each preserves rem- nants of the color of the larger stream, but each also takes on the coloration of the riverbed beneath. We can, in like fashion, imagine language as it moves through space and time, interacting with its own ecological environment, undergoing contact with other languages, experiencing substratal or super- stratal influence, etc. If innovations do, indeed, move across populations of speakers in horizontal fashion, is it the case that the family tree model should be given up in some locations (e.g., Australia, as argued by Dixon 1997), or abandoned altogether, in favor of an acladistic model? Scholars, in general, recognize the value of maintaining information about genetic relatedness, when possible, but some insist that the tree as it is currently conceived does not correctly depict this relationship. Harrison (2003) maintains, for example, that, while kinship of languages can be established by the , the subgroupings represented in family trees do not accurately portray this relatedness. In his opinion, innovations “diffuse through the linguistic landscape, and give rise to the patchwork of isoglosses rather than the discreteness of trees” (2003: 239). With regard specifically to the use of trees for the analysis of Indo-European, Garrett (1999:152-153) proposes a rather similar acladistic approach to that of Harrison in his characterization of IE subgrouping and dispersal. He recon- structs a non-distinct predecessor to Celtic, Italic, and Greek, and claims that these languages developed into their present states through “local responses to areal and cultural connections,” that is, in the context of “secondary phenomena.” Garrett thus rejects the image of well-formed “Celtic” or “Italic” languages already existing in western and southern at an early time. He envisions the relationship of these varieties as a triangle (Fig 5), still sug- gesting the structure of a tree, but free of the usual lines and nodes depicting overt genetic filiation.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 397

(WESTERN & SOUTHERN) INDO-EUROPEAN

dialects ancestral to Celtic, Italic, and Greek dialects, and contiguous ‘minor’ languages Figure 5. Model of western and southern I-E sub-grouping (after Garrett, 1999: 152).

Additional arguments against the tree-like depiction of IE dispersal are presented in Garrett’s later work (2006: 143; 147) as well, where the “prun- ing” of intermediate dialects is seen as responsible for the apparent discrete- ness of branches. Garrett goes on to characterize the divergence of the IE languages by means of contact between the established indigenous cultures in Anatolia, Greece, and Bactria-Margiana and the incoming Indo-Europeans: he views the early borrowing of prestige lexical items (e.g., Hattic into Hittite: halmaššuitt- ‘throne’) as demonstrating adoption of local socio- cultural traditions by Indo-European speakers, and regards later phonological and morphological innovations as representing the development of ethnic identities in Hittite, Greek, and Indo-Iranian.

6. Hybridization of “Tree” and “Wave”

Calvert Watkins (2001: 63), in his analysis of areal influence in ancient Anatolia, states that, while the comparative method is usually used to detect genetic relation, it can also signal areal diffusion, as well:

The goal of genetic comparison is linguistic history, while that of typological comparison is often said to be linguistic universals. But one can and, I insist, must compare the components and manifestations of a linguistic area in order to draw historical conclusions. (emphasis in the original)

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

398 B. Drinka / Journal of Language Contact 6 (2013) 379–410

While Watkins does not provide precise details as to how linguistic areas should be analyzed using the comparative method, he points to the essential role that language contact plays in accounting for linguistic change, and the ability of comparative analysis to sort out those factors which are due to genetic inheritance, and those which can be explained by areal diffusion. Chappell (2001: 354) suggests that a more complex, synthesized approach is in order:

To reconstruct the history of a adequately, a model is needed which is significantly more sophisticated than the family tree based on the use of the comparative method. It needs to incorporate the diffusion and layering pro- cess as well as other language-contact phenomena such as convergence, metatypy and hybridization. The desideratum is a synthesis of all the processes that affect language formation and development. Although Campbell (2006: 19-20) questions this statement, which appears to him to be a challenge to the comparative method, he goes on to make a some- what similar assertion himself:

Mainstream historical linguists realize that it is not possible to understand diffu- sion fully without knowing the genetic affiliation of the languages involved, and vice versa, it is not possible to account fully for what is inherited without proper attention to what is diffused. That is, it is not two distinct, opposed and antago- nistic points of view that are involved, but rather both are needed and they work in concert (Campbell 2006: 18). While there is general recognition of the desirability of analyzing both genetic and contact factors simultaneously, such an amalgamation of the family tree and wave models has not yet been attained. Significant strides in this direction have been taken, however, by McMahon and McMahon (2005), in their recognition of the need for a more comprehensive analysis than that provided by the phylogenetic model. They endorse Kessler’s (2001) suggestion that, while cognate sets are useful for tree construction, researchers should also avail themselves of a fuller data set, including linguistic loans, when their goal is “historical connectedness,” i.e., to compare linguistic data with genetic or archeological data. McMahon and McMahon (2005: 137) express their support for the incorporation of contact data as follows: “If we are serious about rehabilitating contact-induced change and want to be able to account for both aspects of Kessler’s ‘historical connectedness’ (2001), then our concentration on trees is problematic.” They on to conclude that network models outperform family tree models in representing this contact, since the former can represent not only shared origin, but also differentiation due to contact (2005: 178). They point to NeighborNet, used in biology and genetics, as among the most

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 399 successful network models. This model will construct either trees or networks, as required, in order to represent the similarities and differences between lan- guages (2005: 158). When contact occurs, “reticulations” appear, that is, box- shaped structures representing ambiguities in the data which can be interpreted as signs of contact.27 These reticulations are evident, for example, in Figure 6,

Sardinia N Sardinia CSardinia L Portugu ST Italian Cataian Brazilian Spanish

Provencal Ladin French Walloon French C Welsh C

Breton Lst Lt BretonST Vlach

IrishA

Sranan

English ST Swedish LT Afrikaans DutchList Swedish UP Flemish Faroese Swedish VL Frisian Iceland ST German ST PennDutch Danish Riksmal Figure 6. Germanic, Romance, and Celtic lolo data on NeighborNet (McMahon and McMahon, 2005: 161) (reprinted with permission of Oxford University Press).

27 According to Wichmann et al. (2011: 208), what reticulations actually represent are deviations from phylogenetic trees, whether due to contact, homoplasy, or other factors. Reticulations thus reflect similar divergences by languages which are presumed to be genetically related. The authors go on to demonstrate that reticulations are found more frequently in the “direct offspring of the ultimate ancestor of the phylogeny” (Wichmann et al., 2011: 237), that is, among languages that are isolated in their phylogenies, and, presumably, vulnerable to external influence.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

400 B. Drinka / Journal of Language Contact 6 (2013) 379–410 a network produced by NeighborNet using the least conservative Germanic, Romance, and Celtic lexical data from the Dyen et al. (1992) database.28 Reticulations appear in key locations on this network. For instance, the three varieties of Swedish are tightly linked through dialect contact as repre- sented by their dense reticulations. Similarly, although it is an English creole, Sranan is closely tied through reticulations to Dutch, as well, representing extensive lexical borrowing. Figure 7, which employs the full data set of Dyen et al. (1992), likewise produces a number of reticulations, implying contact, especially in Balto-Slavic. While reticulations do not provide clues as

Singhales Bengali NepaliLst PersianLt PanjabiST Tadzik Afghan Lahnda Hindi Kashmiri Ossetic Gujarati Khaskura Baluchi Waziri Marathi Wakhi Albanian K GypsyGK Albanian C Albanian Tp

Lithuanian Albanian T Lithuanian ST ALBANIAN Latvian Albanian G

Greek K Greek D Greek ML Ukrainian BYELORUSSIAN Greek MD Polish UKRAINIAP Btelorussion Greek Med RUSSIAN P Armenian Mod POLISHP Russian Armenian Lst SLOVAKP CZECHP Lusatian L Irish A Lusatian U Irish B Slovak Czech Czech E SLOVENIAN P WelshC SERBOCROATION Serbecrecation WelshN MACEDONIAN Macedenian Vlach BULGARIAP BretonLst BretonST Bulgarian RumaniaLT Slovenian BretonSE Sardinian N IcelandST Sardinian N Farcese Sardinian C SwedishVL PennDutch French Creole SwedishUP GermanST French Creole D SwedishLt Provencal Danish French Riksmal DutchList Italin Wallon Frisian Catatan Afnkaans Flemish Ladin Sranan Brazilian Spanish English Pertuguese Figure 7. Indo-European relations using NeighborNet (McMahon and McMahon, 2005: 164) (reprinted with permission of Oxford University Press).

28 McMahon and McMahon (2005: 109) assemble this least conservative sub-list by selecting lexical items from the list of Dyen et al. (1992) which show “far less potential for cross-cultural generalizability” (e.g., ‘grass’, ‘year’, ‘think’), and which are thus assumed to be more subject to the effect of contact. They refer to this sub-list as the “lolo” list, “low in reconstructability and low in retentiveness, in other words, less universal, and more changeable.”

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 401 to how contact functions, they at least point to places where contact seems to have occurred, while also signaling genetic filiation. While networks are generally endorsed as possible representations of language contact, some researchers regard them as useful, above all, for the construction of genetic trees. Warnow et al. (2006:82), for example, reveal their position on the primacy of genetic relationship over areal relationship in statements like the following: “if the evolutionary history does not include too much borrowing (specifically, too many contact edges), we should still be able to infer the evolutionary history.” The fact that only one contact edge can be used at a time in this model (2006: 81) severely limits its verisimilitude and applicability to real-life situations, where contact may exist with several other varieties at once. In a similar vein, Nakhleh et al. (2005b) conclude that the need to recon- struct contact for the Indo-European languages is minimal. They claim that most correspondences among IE languages can be accounted for by means of a genetic model, and that the only language family in real need of a non- genetic, network explanation to account for its development is Germanic. While the authors make a commendable effort to use not just lexical but also phonological and morphological data for their analysis and while they attempt to address comprehensively the possible role of contact, they are hampered by the fact that they rely on the data of Ringe et al. (2002), described above, as the foundation of their analysis. They therefore mistakenly continue to view the satem and ruki rules as indicative of genetic rather than areal relationship among the Indo-Iranian and Balto-Slavic languages, and they likewise fail to recognize the strong possibility that the distribution of mediopassive markers could be due to contact rather than genetic relationship (Nakhleh et al., 2005b: 410-411). Even more significant, however, is the fact that the areal relationship that must exist between Greek and Indo-Iranian is missed altogether. The network approach presented by Heggarty et al. (2010: 3830), on the other hand, recognizes the value of areal diffusion, and identifies the causes for linguistic divergence as stemming from the “real-world context” in which speakers live, including socio-political, cultural, and other forces. The authors point to the ability of NeighborNet to reflect with accuracy the influence of these real-world factors in the Germanic languages, both in the identification of dialect continua like the one stretching from Flanders to the Alps, but also in the representation of English as isolated from the other Germanic languages (Heggarty et al., 2010: 3837-3838). They regard the interface of time and geographical distribution as crucial:

[the] degrees of divergence between language varieties are a function not just of separation time but also of the degree of cohesiveness of a speaker community, for

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

402 B. Drinka / Journal of Language Contact 6 (2013) 379–410

which geographical space is often a fairly close proxy, especially within a dialect continuum. (emphasis in the original) (Heggarty et al., 2010: 3838) They view change as moving across a speech community, fueled by sociolin- guistic motivation: [Changes] come to be shared not by randomness, but by the forces in the real- world context that determine the extent and nature of those communities, their degrees of coherence, and also their external boundaries that the propagation wave may not cross. (Heggarty et al., 2010: 3841) Finally, the impressive work of Jaeger et al. (2011) should be mentioned in this regard as taking an important step towards simultaneously assessing the role of genetic and areal factors. The authors propose innovative ways to model the combined effects of geography and genealogy from the perspective of continents, countries, families, and sub-families, while also recognizing the difficulty of disentangling their combined effects. Rather than viewing contact evidence as ancillary (as proposed by Gray and Atkinson, 2003: 436; 2006:107) or minimally explanatory (Nakhleh et al., 2005b), studies like these demonstrate that contact, whether at the level of language or dialect, is a crucial element of linguistic change, and should be characterized as such.

7. Stratified Models of Indo-European Relationship

An additional advantage of combining elements of the “family tree” and “wave” models is that it allows us to conceptualize a “stratified reconstruction” of Indo-European, one in which descendent languages would have split from a dynamic “proto-conglomerate” at different points in time. The validity of this layered approach, in contrast to the traditional, idealized image of PIE as a compact entity existing at one point in space and time, is supported by the emergence of several stratified linguistic systems in the IE verb system, such as the temporal, aspectual, and modal categories. As mentioned above, certain languages retain more archaic morphological features, and so are assumed in this model to have separated from the proto-language at an early time. Others show more signs of accretion, with innovations clearly built on the archaic layers but not adopted by the earlier-separating languages; these languages must have separated later. As one would predict based on our previous discus- sion, Hittite and Germanic would both belong to the first group, since both have very simple verbal morphologies, with no trace of ancient subjunctives or

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 403 futures, or of a systematic aspectual contrast in the past (Polomé, 1964; 1982; 1987). To the second group would belong Greek and Indo-Iranian, which must have remained in contact for a longer period of time. These “eastern” languages display, among other characteristics, a remarkably similar temporal- aspectual system, with a productive 3-way contrast of present–imperfect– aorist vs. the 2-way contrast seen elsewhere in IE; a lengthened theme-vowel subjunctive, not found in other languages; and an obligatory use of reduplica- tion for marking the perfect, found much less frequently in the west than in the east (Birwé, 1955; Drinka, 1995; 2003).29 This shared complexity of the Indo-Iranian and Greek verbal morphologies does not, then, represent archa- ism, as traditionally construed, but rather signals shared innovation due to late contact in the eastern area. Meid (1975) characterizes this stratification of languages across space and time as shown in Table 2. This model sets up a framework of early, middle, and late stages in the proto-language, and posits a separation of eastern and western areas in the last stage, representing the shared changes which occurred separately in the east and in the west. Anthony (2007: 55-56) uses a similar model of stratified chronology to assemble a compendium of archeological, geographical, and linguistic data in an attempt to explain the diffusion of the IE languages in comprehensive fashion. He charts the sequence of proposed splits that the IE languages would have undergone (Figure 8); like many others, he proposes a very early

Table 2 Model of a developing proto-language (Meid 1975).

I. early Indo-European (c. 5th millennium B.C.) represented by archaisms in both the eastern and western areas II. middle Indo-European (c. 5th-4th millennium B.C.) represented by more recent features found in both east and west III. late Indo-European (3rd-2nd millennium B.C.) represented by recent innova- tions in differentiated languages a) eastern group: esp. Greek, Indo-Iranian b) western group: esp. Italic, Celtic, Germanic

29 The striking similarity of the ndo-IranianI and Greek verbal morphologies was already identified by Schmidt (1872: 21) as a sign of areal relationship. Watkins (2001: 57) adds that, in addition to sharing the augment prefix and the whole“ structural organization of the verbal system,” Indo-Iranian and Greek also share more poetic features than any of the other IE languages. Based on these features, Watkins positions Indo-Iranian, Greek, Armenian, and to some extent Phrygian in a large subgroup, and sketches a branching diagram with these languages on the right edge.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

404 B. Drinka / Journal of Language Contact 6 (2013) 379–410 separation for Anatolian, at a time before wheeled vehicles would have existed in the steppes; Tocharian and Germanic are also seen as separating early. A later split is hypothesized for Greek, and even later for Indo-Iranian. Importantly, Anthony recognizes the crucial link between Greek and Indo- Iranian noted in Meid’s model, above.30 What the contact between Greek and Indo-Iranian implies is that Greek did not move out of the central region until fairly late. Since Greek did not par- ticipate in either the “satem” or the “ruki” innovations, however, these changes suggest a terminus ante quem: Greek and Indo-Iranian were in contact late, but evidently not as late as the period when these changes occurred. Because of

1000 BCE Greek Iranian Indic

1800 Italic Celtic Slavic Baltic 2200 2000 BCE

2500 Armenian 2800 od

3000 BCE ri Germanic 3300

3700 Tocharian 4000 BCE Anatolian 4200 maximal PIE 4500

5000 BCE st st ral Ea nt We Ce Steppe Regions

Key

largely isolated frequent a separation signicant borrowing

occasional but end of signicant borrwoig proto-language

Figure 8. A diagram of the sequence and approximate dates of splits in early I-E (Anthony, 2007: 100) (reprinted with permission of Princeton University Press).

30 In addition to morphological evidence, Anthony also collects cultural information, such as shared words for weapons, sacrifice, gods, and poetic terminology (2007: 55-56).

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 405 this connection with Indo-Iranian, Anthony (2007: 369; 379) positions Pre-Greek on the eastern border of southeastern Europe, potentially as part of the early western north of the Black Sea, 2800-2200 BCE (see Fig. 9). To the north of this Catacomb cultural territory lies the Abashevo culture, 2500-1900 BCE, assumed to be associated with Indo- Iranian speakers; as steppe customs spread to the forest-steppe in the north,

Figure 9. Culture groups of the Middle , 2800-2200 BCE (Anthony, 2007: 379) (reprinted with permission of Princeton University Press).

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

406 B. Drinka / Journal of Language Contact 6 (2013) 379–410 the ceramics and to some extent the metallurgy of this area tended to resemble more and more the artifacts of the Catacomb culture (2007: 382-383). Herders from this area and the Poltavka area began moving further north and east, near the marshes of the Sintashta area in the northern steppe east of the Ural Mountains. Anthony (2007: 408-411) provides copious evidence that the Sintashta settlement, dating between 2100 and 1800 BCE, was inhabited by pre-Indo-Iranians: in Sintashta were found the remnants of the earliest , widespread evidence of the use of metallurgy in homes, new types of weapons, and remains of unique funeral rituals, all of which match descrip- tions in the extraordinarily well.31 It was in this area, stretching from the Catacomb territory to the Sintashta area, that Anthony assumes that con- tact between the pre- and the pre-Indo-Iranians must have occurred. The stratification of languages represents the fact that proto-languages are not mere abstractions, but have temporal and spatial substance; Proto- Indo-European should not be regarded as a node, but as a language possessing variability and some geographical expansiveness, even in its early stages. The effect of contact is evident throughout the history of the IE language family. It is only with the help of more dynamic models of change, models which incorporate geographical factors with phylogenetic ones, that we can hope to successfully portray the complex development of the IE languages.

8. Conclusions

To sum up, the following related arguments have been presented here: 1) Family tree models are inadequate when used in isolation, and should be supplemented with more informative models which take contact into consideration. 2) The implication of constructing models which take “horizontal,” areal influence into consideration is that the stratification of data—innova- tive layers building on more archaic layers—emerges as significant. Languages which share only archaic elements, such as Hittite and Tocharian, are presumed to have separated from other IE languages at an earlier time; languages which share an array of morphological, lexi- cal, poetic, and other features, like Indo-Iranian and Greek, are assumed to have remained in contact for a longer period. These facts constitute

31 For example, in addition to a large number of horse sacrifices at this site, there exists substantial evidence of midwinter dog sacrifices at nearby sites, recalling the initiation rites of the New Year referred to in the Rigveda (Anthony 2007: 410).

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 407

strong counterevidence to the claim that Proto-Indo-European could have originated in Anatolia.

References

Anthony, David. 2007. The Horse, the Wheel, and Language. Princeton - Oxford: Princeton University Press. Atkinson, Quentin and Russell Gray. 2005. Curious parallels and curious connections— Phylogenetic thinking in biology and . Systematic Biology 54: 513–526. Atkinson, Quentin and Russell Gray. 2006. How old is the Indo-European language family? Illumination or more moths to the flame? In Peter Forster and (eds.), Phylogenetic Methods and the Prehistory of Languages, 91–109. Cambridge: McDonald Institute for Archaeological Research. Birwé, Robert. 1955. Griechisch-arische Sprachbeziehungen im Verbalsystem. Waldorf- Hessen: Verlag für Orientkunde. Blažek, Václav. 2007. From to Sergei Starostin: On the development of the tree-diagram models of the Indo-European languages. Journal of Indo-European Studies 35: 82–109. Bouckaert, Remco, Philippe Lemey, Michael Dunn, Simon Greenhill, Alexander Alekseyenko, Alexei Drummond, Russell Gray, Marc Suchard, and Quentin Atkinson. 2012. Mapping the origins and expansion of the Indo-European language family. Science. 24 Aug. 2012: Vol. 337, No. 6097: 957–960. Supplementary material http://www.sciencemag.org/ content/suppl/2012/08/22/337.6097.957.DC1/Bouckaert.SM.pdf (accessed January 10, 2013). Brugmann, Karl. 1883. Zur Frage der Verwandtschaftsverhältnisse der indogermanischen Sprachen. Internationale Zeitschrift für allgemeine Sprachwissenschaft 1: 226–256. Bryant, David, Flavia Filimon, and Russell Gray. 2005. Untangling our past: Languages, trees, splits and networks. In Ruth Mace, Clare Holden, and Stephen Shennan (eds.), The of Cultural Diversity: A Phylogenetic Approach, 67–84. London: UCL Press. Buck, Carl Darling. 1949. A Dictionary of Selected Synonyms in the Principal Indo- European Languages. Chicago: University of Chicago Press. Bybee, Joan, Revere Perkins, and William Pagliuca. 1994. The Evolution of Grammar. Chicago: University of Chicago Press. Campbell, Lyle. 2006. Areal linguistics: A closer scrutiny. In Yaron Matras, April McMahon, and Nigel Vincent (eds.), Linguistic Areas: Convergence in Historical and Typological Perspective, 1–31. Basingstoke, Hampshire - New York: Palgrave Macmillan. Chappell, Hilary. 2001. Language contact and areal diffusion in Sinitic languages. In and Robert M. W. Dixon (eds.), Areal Diffusion and Genetic Inheritance: Problems in , 328–357. Oxford: Oxford University Press. Clackson, James and Geoffrey Horrocks. 2007. The Blackwell History of the Latin Language. Chichester, West Sussex: Wiley-Blackwell. Croft, William. 2000. Explaining Language Change. Harlow, U.K. / New York: Longman. Darwin, Charles. 1859 [1967]. On the Origin of Species by Means of Natural Selection. New York: Atheneum. (Facsimile of 1st edition). Darwin, Charles. 1871. The Descent of . London: Murray. Dixon, R. M. W. 1997. The Rise and Fall of Languages. Cambridge: Cambridge University Press.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

408 B. Drinka / Journal of Language Contact 6 (2013) 379–410

Drinka, Bridget. 1993. The dispersion of Indo-European dialects: Clues from morphology. Word 44: 413–429. Drinka, Bridget. 1995. Areal linguistics in prehistory: Evidence from Indo-European aspect. In Henning Andersen (ed.), Historical Linguistics 1993, 143–158. Amsterdam - Philadelphia: Benjamins. Drinka, Bridget. 1999. Alignment in Early Proto-Indo-European. In Carol . Justus and Edgar C. Polomé (eds.), Language Change and Typological Variation: Papers in Honor of Winfred P. Lehmann on the Occasion of his 82nd Birthday. Vol. II: Grammatical Universals and Typology. Monograph 31 in the Journal of Indo-European Studies Monograph Series, 462–498. Washington, DC: Institute for the Study of Man. Drinka, Bridget. 2003. The development of the perfect in Indo-European: Stratigraphic evidence for prehistoric areal influence. In Henning Andersen (ed.),Language Contacts in Prehistory: Studies in Stratigraphy, 77–105. Amsterdam: Benjamins. Dyen, Isidore, Joseph Kruskal, and Paul Black 1992. An Indoeuropean classification: A lexicostatistical experiment. Transactions of the American Philosophical Society 82/5. Epps, Patience. 2009. Language classification, language contact, and Amazonian prehistory. Language and Linguistics Compass 3: 581–606. Eska, Joseph and Don Ringe. 2004. Recent work in computational linguistic phylogeny. Language 80: 569–582. Euler, Wolfram. 1979. Indoiranisch-griechische Gemeinsamkeiten der Nominalbildung und deren indogermanische Grundlagen. Innsbruck: Innsbrucker Beiträge zur Sprachwissenschaft. Forster, Peter and Alfred Toth. 2003. Toward a phylogenetic chronology of ancient Gaulish, Celtic, and Indo-European. Proceedings of the National Academy of Sciences of the United States of America 100: 9079–9084. Garrett, Andrew. 1999. A new model of Indo-European subgrouping and dispersal. Berkeley Linguistics Society 25: 146–156. Garrett, Andrew. 2006. Convergence in the formation of Indo-European subgroups: Phylogeny and chronology. In Peter Forster and Colin Renfrew (eds.), Phylogenetic Methods and the Prehistory of Languages, 139–151. Cambridge: McDonald Institute for Archaeological Research. Gray, Russell and Quentin Atkinson. 2003. Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426: 435–439. Gimbutas, Marija. 1970. Proto-Indo-European culture: The culture during the fifth, fourth, and third Millennia B.C. In George Cardona, Henry Hoenigswald and Alfred Senn (eds.), Indo-European and Indo-Europeans, 155–197. Philadelphia: University of Pennsyl­ vania Press. Gimbutas, Marija. 1977. The first wave of Eurasian steppe pastoralists into Age Europe. Journal of Indo-European Studies 5: 277–338. Harrison, S. P. 2003. On the limits of the Comparative Method. In Brian Joseph and Richard Janda (eds.), The Handbook of Historical Linguistics, 213–243. Malden, MA - Oxford - Melbourne - Berlin: Blackwell. Heggarty, Paul. 2006. Interdisciplinary indiscipline? Can phylogenetic methods meaningfully be applied to language data—and to dating language? In Colin Renfrew and Peter Forster (eds.), Phylogenetic methods and the prehistory of languages. (McDonald Institute Monographs), 183–194. Cambridge: McDonald Institute for Archaeological Research. Heggarty, Paul. 2007. Linguistics for archaeologists: Principles, methods, and the case of the Incas. Cambridge Archaeological Journal 17: 311–340. Heggarty, Paul, Warren Maguire, and April McMahon. 2010. Splits or waves? Trees or webs? How divergence measures and network analysis can unravel language histories. Philosophical Transactions of the Royal Society, B. 365: 3829–3843.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

B. Drinka / Journal of Language Contact 6 (2013) 379–410 409

Heine, Bernd and Tania Kuteva. 2005. Language Contact and Grammatical Change. Cambridge Approaches to Language Contact. Cambridge: Cambridge University Press. Heine, Bernd and Tania Kuteva. 2006. The Changing . Oxford: Oxford University Press. Hock, Hans Henrich. 1986. Principles of Historical Linguistics. Berlin /NewYork /Amsterdam: Mouton de Gruyter. Hock, Hans Henrich. 1999. Out of India? The linguistic evidence. In Johannes Bronkhorst and Madhav Deshpande (eds.), and Non-Aryan in South Asia: Evidence, Interpretation and Ideology, 1–18. Cambridge, MA: Harvard University Press. Jaeger, T. Florian, Peter Graff, William Croft, and Daniel Pontillo. 2011. Mixed effect models for genetic and areal dependencies in . Linguistic Typology 15: 281–320. Jasanoff, Jay. 1997. An Italo-Celtic : The 3 pl. mediopassive in ntro*- . In Douglas Q. Adams (ed.), Festschrift for P. Hamp I, 146–161. Journal of Indo-European Monograph 23. Washington, D.C.: Institute for the Study of Man. Kessler, Brett. 2001. The Significance of Word Lists. Stanford: CSLI Publications. Labov, William. 2007. Transmission and diffusion. Language 83: 344–387. Lemey, Philippe, Andrew Rambaut, John Welch, and Marc Suchard. 2010. Phylogeography takes a relaxed random walk in continuous space and time. Molecular Biology and Evolution 27: 1877–1885. Mallory, James. 1989. In Search of the Indo-Europeans. London: Thames & Hudson. McMahon, April. 1994. Understanding Language Change. Cambridge: Cambridge University Press. McMahon, April and Robert McMahon. 2005. Language Classification by Numbers. Oxford: Oxford University Press. Meid, Wolfgang. 1975. Probleme der räumlichen und zeitlichen Gliederung des Indoger­ manischen. In Helmut Rix (ed.), Flexion und Wortbildung, 204–219. Wiesbaden: Reichert. Meillet, Antoine. 1931. Essai de chronologie des langues indo-européennes. Bulletin de la Société de Linguistique de Paris 32: 1–28. Mufwene, Salikoko. 2001. The Ecology of Language Evolution. Cambridge Approaches to Language Contact. Cambridge: Cambridge University Press. Nakhleh, Luay, , Don Ringe, and Steven N. Evans. 2005a. A comparison of phylogenetic reconstruction methods on an Indo-European dataset. Transactions of the Philological Society 103: 171–192. Nakhleh, Luay, Don Ringe, and Tandy Warnow. 2005b. Perfect phylogenetic networks: A new methodology for reconstructing the evolutionary history of natural languages. Language 81: 382–420. Nichols, Johanna and Tandy Warnow. 2008. Tutorial on computational linguistic phylogeny. Language and Linguistics Compass 2: 760–820. Pagel, Mark, Quentin Atkinson, and Andrew Meade. 2007. Frequency of word-use predicts rates of lexical evolution throughout Indo-European history. Nature. Vol. 449. 11 Oct. 2007. Polomé, Edgar C. 1964. Diachronic development of structural patterns in the Germanic conjugation system. In Horace G. Lunt (ed.), Proceedings of the Ninth International Congress of Linguists, Cambridge, Mass., August 27–31, 1962, 870–880. The Hague: Mouton. Polomé, Edgar C. 1982. Germanic as an archaic Indo-European language. In Kurt R. Jankowsky and Ernst S. Dick (eds.), Festschrift für Karl Schneider. 51–59. Amsterdam/ Philadelphia: John Benjamins. Polomé, Edgar C. 1987. Who are the Germanic people? In Susan Nacev Skomal and Edgar C. Polomé (eds.), Proto-Indo-European: The of a Linguistic Problem, 216–244. Washington, D.C.: Institute for the Study of Man. Porzig, Walter. 1954. Die Gliederung des indogermanischen Sprachgebiets. Heidelberg: Winter.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access

410 B. Drinka / Journal of Language Contact 6 (2013) 379–410

Rexová, Kateřina, Daniel Frynta, and Jan Zrzavý. 2003. Cladistic analysis of languages: Indo-European classification based on lexicostatistical data. Cladistics 19: 120–127. Renfrew, Colin. 1987. Archaeology and Language: The Puzzle of Indo-European Origins. London: Pimlico. Renfrew, Colin. 2000. 10,000 or 5000 years ago? – Questions of time depth. In Colin Renfrew, April McMahon and Larry Trask (eds.), Time Depth in Historical Linguistics (2 volumes), 413–439. Cambridge: McDonald Institute for Archaeological Research. Ringe, Don, Tandy Warlow, and Ann Taylor. 2002. Indo-European and computational cladistics. Transactions of the Philological Society 100: 59–129. Schleicher, August. 1860. Die Deutsche Sprache. Stuttgart: Cotta. Schmidt, Johannes. 1872. Die Verwandtschaftverhältnisse der indogermanischen Sprachen. Weimar: Böhlau. Schrijver, Peter. 2006. Review of Gerhard Meiser (2003) Veni Vidi Vici. Die Vorgeschichte des lateinischen Perfektsystems. Kratylos 51: 46–64. Serva, Maurizio and Filippo Petroni. 2008. Indo-European languages tree by Levenshtein distance. Europhysics letters 81: 68005. Sherratt, A. 1981. and pastoralism: Aspects of the secondary products revolution. In Ian Hodder, Glynn Isaac and Norman Hammond (eds.), Pattern of the Past: Studies in Honour of David Clarke, 261–305. Cambridge: Cambridge University Press. Swadesh, Morris. 1952. Lexico-statistic dating of prehistoric ethnic contacts. Proceedings of the American Philosophical Society 96: 453–463. Szemerényi, Oswald. 1996. Introduction to Indo-European Linguistics. Translated from Einführung in die vergleichende Sprachwissenschaft. 4th edition, 1990, with additional notes and references. Oxford: Oxford University Press. Taylor, Ann, Tandy Warnow, and Don Ringe. 2000. Character-based reconstruction of a linguistic . In John Charles Smith and Delia Bentley (eds.), Historical Linguistics 1995. Vol. I: General Issues and Non-Germanic Languages, 393–408. Amsterdam - Philadelphia: Benjamins. Terrell, John. 1988. History as a family tree, history as an entangled bank: constructing images and interpretations of prehistory in the South Pacific. Antiquity 62: 642–657. Thomason, Sarah. 2001. Language Contact. Edinburgh: Edinburgh University Press. Thomason, Sarah G. and Terrence Kaufman. 1988. Language Contact, Creolization, and Genetic Linguistics. Berkeley, Los Angeles, London: University of California Press. Trudgill, Peter. 1983. On Dialect: Social and Geographical Perspectives. New York - London: New York University Press. Watkins, Calvert. 2001. An Indo-European linguistic area and its characteristics: Ancient Anatolia. Areal diffusion as a challenge to the Comparative Method? In Alexandra Aikhenvald and R. M. W. Dixon (eds.), Areal Diffusion and Genetic Inheritance: Problems in Comparative Linguistics, 44–63. Oxford: Oxford University Press. Warnow, Tandy, Steven Evans, Donald Ringe, and Luay Nakhleh. 2006. A stochastic model of language evolution that incorporates homoplasy and borrowing. In Peter Forster and Colin Renfrew (eds.), Phylogenetic Methods and the Prehistory of Languages, 75–87. Cambridge: McDonald Institute for Archaeological Research. Weiss, Michael. 2012. Italo-Celtica: Linguistic and cultural points of contact between Italic and Celtic. In Stephanie Jamison, H. Craig Melchert and Brent Vine (eds.), Proceedings of the 23rd Annual UCLA Indo-European Conference, 151–173. Bremen: Hempen. Wichmann, Søren, Eric Holman, Taraka Rama, and Robert Walker. 2011. Correlates of reticulation in linguistic phylogenies. Language Dynamics and Change 1: 205–240. Wolfram, Walt, and Natalie Schilling-Estes. 2003. Dialectology and linguistic diffusion. In Brian Joseph and Richard Janda (eds.), The Handbook of Historical Linguistics, 713–735. Malden, MA - Oxford - Melbourne - Berlin: Blackwell.

Downloaded from Brill.com09/29/2021 11:32:06PM via free access