Toward a Phylogenetic Chronology of Ancient Gaulish, Celtic, and Indo-European
Total Page:16
File Type:pdf, Size:1020Kb
Toward a phylogenetic chronology of ancient Gaulish, Celtic, and Indo-European Peter Forster*†‡ and Alfred Toth§ *McDonald Institute for Archaeological Research, University of Cambridge, Cambridge CB2 3ER, United Kingdom; †Junge Akademie, 10117 Berlin, Germany; and §Phonogrammarchiv der Universita¨t Zu¨ rich, CH-8032 Zu¨rich, Switzerland Edited by Henry C. Harpending, University of Utah, Salt Lake City, UT, and approved May 6, 2003 (received for review February 27, 2003) Indo-European is the largest and best-documented language family in the world, yet the reconstruction of the Indo-European tree, first proposed in 1863, has remained controversial. Complications may include ascertainment bias when choosing the linguistic data, and disregard for the wave model of 1872 when attempting to reconstruct the tree. Essentially analogous problems were solved in evolutionary genetics by DNA sequencing and phylogenetic network methods, respectively. We now adapt these tools to linguistics, and analyze Indo-European language data, focusing on Celtic and in particular on the ancient Celtic language of Gaul (modern France), by using bilin- gual Gaulish–Latin inscriptions. Our phylogenetic network reveals an early split of Celtic within Indo-European. Interestingly, the next branching event separates Gaulish (Continental Celtic) from the British (Insular Celtic) languages, with Insular Celtic subsequently splitting into Brythonic (Welsh, Breton) and Goidelic (Irish and Scot- tish Gaelic). Taken together, the network thus suggests that the Celtic language arrived in the British Isles as a single wave (and then differentiated locally), rather than in the traditional two-wave sce- nario (‘‘P-Celtic’’ to Britain and ‘‘Q-Celtic’’ to Ireland). The phylogenetic network furthermore permits the estimation of time in analogy to genetics, and we obtain tentative dates for Indo-European at 8100 ؎ BC ؎ 1,900 years, and for the arrival of Celtic in Britain at 3200 BC 1,500 years. The phylogenetic method is easily executed by hand and promises to be an informative approach for many problems in historical linguistics. Fig. 1. Living and extinct languages referred to in this study. ANTHROPOLOGY he quest for reconstructing the prehistory of the Indo- millennia, from ancient Greek to, for example, modern English. We TEuropean language family commenced in 1786 with the dis- shall see that the network infers a tree, and therefore a hypothetical covery by Sir William Jones of the remarkable similarities between ancestral Indo-European language for which we provide a tentative Sanskrit, Greek, Latin, Gothic, Celtic, and Persian, indicating a phylogenetic age estimate. ‘‘common source’’ for these languages (1). The next major step Our particular focus will be the Celtic languages, including occurred in 1863, when Schleicher proposed an evolutionary tree of ancient Gaulish, formerly spoken in what is today France and descent for the Indo-European language family (2), shortly after northern Italy (Fig. 1). In western Europe, Gaulish is the only Charles Darwin had introduced the evolutionary tree concept to pre-Roman language with a significant bilingual corpus, and knowl- the descent of species. Further insight into language evolution was edge of its time depth and relationship to other languages would supplied by Schmidt in 1872 (3), who published the wave model enable valuable comparisons with the time depth and landscape of according to which initially distinct languages increasingly acquire western European archaeology and genetics. In AD 98, Tacitus similarities through borrowing. More recently, we proposed a recorded that between Britain and Gaul ‘‘the language differs but method for uniting these two models into a single network diagram little’’ (Agricola 11). Nevertheless, classical sources excluded Britain of language evolution (4), which displays a tree if the analyzed from the ‘‘Celtic’’ designation bestowed on Gaul, compare Strabo languages have evolved in a strict branching process, but degener- in AD 18: ‘‘The men of Britain are taller than the Celti, and not so ates into a reticulate network if the data indicate borrowing and yellow-haired’’ (Strabo, Geography 4,5,2). Buchanan (5) and Lhuyd convergence. In Forster et al. (4), we tested this method on (6) proposed a relationship between Gaulish and the British vocabulary lists of Alpine Romance languages, producing a net- languages, hence British languages are now conventionally termed work that revealed language subclusters in close agreement with the ‘‘Insular Celtic,’’ as opposed to ‘‘Continental Celtic’’ formerly geographic locations of the Alpine valleys in which the languages spoken on the European mainland. Insular Celtic is subdivided into are spoken. Moreover, the hypothetical ancestral language pro- Brythonic (e.g., Welsh and Breton) and Goidelic (e.g., Irish and posed by the method was directly validated by comparison with Scots Gaelic). However, there are conflicting proposals on the Latin. Our Alpine analysis reconstructed the past from synchronic branching order and on relative and absolute dates of language data, in the sense that all of the used Romance languages either splits in the Celtic language tree, if it is a tree at all. The underlying were current or went extinct only very recently. But the network problems largely consist in the limited number and often uncertain method is equally applicable to reconstructing prehistoric evolu- translations of surviving Continental Celtic records (7). Extensive tionary relationships from diachronic data, i.e., from languages of quite different time levels. We now exploit this feature to tackle afresh the reconstruction of the prehistoric tree (or network?) of This paper was submitted directly (Track II) to the PNAS office. Indo-European languages, whose ages of attestation span several ‡To whom correspondence should be addressed. E-mail: [email protected]. www.pnas.org͞cgi͞doi͞10.1073͞pnas.1331158100 PNAS ͉ July 22, 2003 ͉ vol. 100 ͉ no. 15 ͉ 9079–9084 Downloaded by guest on September 24, 2021 Table 1. Minimal glossary of Gaulish translated into European languages Gaulish English Latin Classical Greek Old Irish Mod. Irish Mod. Scots Gaelic Syntax: SV (a)SV(a)SV(a)SV(a)VS(b)VS(b)VS(b) -OS (a) (nom.sg.masc.suffix) (b)-us(a) Ϫ (a) Absent (b) Absent (b) Absent (b) -I (a) (gen.sg.masc.suffix) -s (b)-i(a) Ϫ (a) ICM, vowel change (c) ICM, vowel change (c) ICM, vowel change (c) -V (a) (dat.sg.masc. suffix) (b)-o,-u(a) Ϫ (a) ICM, vowel change (c) Absent (b) Absent (b) -A (a) (nom.sg.fem.suffix) (b)-a(a) Ϫ, Ϫ␣ (a)ICM(c)ICM(c)ICM(c) -AS (a) (gen.sg.fem.suffix) -s (a)-ae(b) Ϫ, Ϫ␣ (a) Vowel change (c) Vowel change͞add. (c) Vowel change (c) parapsidiϾparaxidi (a) ps frequent (b) ps frequent (b) ps frequent (b) psrare(a) psrare(a)psrare(a) TEUO- (a) to gods (b) deis (a) ιι()(a)dode´ib(a) do dhe´ithe (a) do dhiadhan (a) -XTONION (a) and to men (b) et hominibus (c) ␣ι ␣πιι() ocus do daı´nib(a) agus do dhaoine (a) agus do dhaoinean (a) (d) IEVRV, IOVRVS...(a) has offered (b) obtulit (b) ␦ι␦␣ι,π␣ι (c)roı´r (a) ta´se´tare´isaı´obairt (a) thairgse (d) -IKNOS (a) (patronymic suffix) son fil. ϩ gen. (b) gen. (b) mac ϩ gen. (b) (mac) ϩ gen. (b) mac ϩ gen. (b) of (b) TARVOS (a) bull (b) taurus (a) ␣ (a)tarb(a) tarbh (a) tarbh (a) TRI- (a) three- (a) tri- (a) ι-(a) trı´-(a) trı´-(a) tri- (a) GARANVS (a) crane (a)grus(a) ␥␣ (a) corr (a) corr mho´na(a) absent (b) Tuos (a) oven (b) furnus (c) ιπ (d)sorn(e)sorn(e) abhan (b) LUXTODOS (a) loaded (a) oneratus (b) ␥ι (c)la´n(a)la´n(a) lionta (a) SUMMA UXSEDIA (a) grand total (b) summa summarum (c) π␣ ␣ι (d) Not determined an t-iomla´n(e) cunntas ( f) ETI (a) thing as well as thing (b) item (c) ␣ι (d) ocus (e) agus (e) agus (e) DUCI (a) person and person (b)et(c) ␣ι (d) ocus (e) agus (e) agus (e) ...DUCI...TONI...(a) person (and), p. and p.(b) p.etp.etp.(c) p. ␣ι p. ␣ι p.(d) p. ocus p.ocusp.(e) p., p. agus p.(e) p. agus p. agus p.(e) AVVOT, etc. (a) has made (b) fecit (c) ιι, ␦␣ (d) do-rigni (e) dhein se´, rinne se´(e) rinn (e) CINTUX (a) first (b) primus (c) π (c)ce´ tnae (a)ce´adϪ (a) ceud (a) ALLOS (a) second (b) secundus (b) ␦ (c)ta´ naise͞aile (a) dara (d) darna (d) TR[ ] (a) third (a) tertius (a) ι (a) triss (a) trı´u´(a) treas (a) PETUAR[ ] (a) fourth (b) quartus (c) ␣ (d) cethramad (c) ceathru´(c) ceithreamh (c) PINPETOS (a) fifth (b) quintus (c) ππ (a)co´ iced (c)cu´igiu´(c) coigeamh (c) SUEXOS (a) sixth (a) sextus (a) (a) seissed (a)se´u´(a) siathamh (a) SEXTAMETOS (a) seventh (a) septimus (a) ␦ (a) sechtmad (a) seachtu´(a) seachdamh (a) OXTUMETO[ ] (a) eighth (a) octauius (a) ␥␦ (a) ochtmad (a) ochtu´(a) ochdamh (a) NAMET[ ] (a) ninth (a) nonius (a) ␣ (a)no´ mad (a) naou´(a) naoidheamh (a) DECAMETOS (a) tenth (b) decimus (a) ␦␣ (a) dechmad (a) deichiu´(a) deicheamh (a) M, MID (a) month (a) mensis (a) (a)mı´(a)mı´(a)mios(a) LAT (a) day (b)dies(b) ␣ (c) laithe (a)la´(a) latha (a) MATIR (a) mother (a) mater (a) (a)ma´ thair (a)ma´ thair (a)ma` thair (a) DUXTIR (a) daughter (a) filia (b) ␥␣(a) ingen (c)inı´on (c) nighean (c) uncertainty in the primary data would seriously affect a phyloge- the current network: for each multistate character, that binary split netic analysis, so we decided to minimize this risk by consulting which partitions the largest group of closely linked nodes or taxa bilingual Gaulish–Latin inscriptions.