
Chapter 22 Linguistic Affiliation

In the preceding chapter i gave a very quick survey of some of the more common ways in which languages are known to change. Now we are going to take a closer look at what is known about the processes of language change and how we can use this knowledge to reconstruct language history.

Note my use of the word ‘reconstruct’ or ‘reconstruction’. Bear in mind that, when we talk about language history, to a great extent we are talking about history that isn't clearly recorded anywhere. For instance, if you look at European culture nowadays you will find a great many different languages (Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hungarian, Icelandic, Italian, Norwegian, Polish, Portuguese, Russian, Slovene, Spanish, Swedish, to mention only a few of the more important ones), most of which are nevertheless recognizably related to each other.¹ Pushing back as far as we can in the historical records, it becomes obvious that French, Italian, Portuguese, and Spanish are all descended from Latin, and we have plenty of records of Latin. Czech, Polish, Russian, and Slovene are clearly all related to an older language called Old Church Slavonic,² for which we have records. Danish, Icelandic, Norwegian, and Swedish are all descended from a language usually referred to as Old Norse, for which we have records from about 800–1000 years ago, while Dutch, English, and German are all somehow or other related to what is usually called Old High German, for which we have some records, but the details of the relationship are not quite as obvious as we might like.

Beyond that, it is very clear that Old Norse and Old High German are related to each other, but there is no historical record of any language from which they could both be derived; any such language (usually referred to as ‘Proto-Germanic’) obviously died out before any of its speakers learned to write. And the same is true, only much more so, for Greek, Latin, Old Church Slavonic, and Proto-Germanic; these languages are also known to be related to each other, but their common ancestor ceased to exist as a recognizable language thousands of years ago, long before any of its speakers had ever so much as heard of writing. And in all this i've left out Finnish and Hungarian. These languages are related too (that is, they're related to each other, but as far as we know are not related to any of the other languages i've mentioned), but their common ancestor, like the common ancestor of the Indo-European languages, has left no written records.

A major part of the work of historical linguistics is figuring out what we are able to learn about languages that no longer exist on the basis of their living descendants. Actually, it isn't strictly necessary that the descendants be living, only that they be attested. If there are adequate historical records of a word, an expression, or a language, that word, expression, or language is said to be ‘attested’.

¹ The relationships and relative status of these languages are all shown in Fig. 22.1.
² Though perhaps not as closely as the modern Romance languages are related to Latin.

[Tree diagram. Labels, grouped by tier:
  Neither living nor attested: Proto-Indo-European (PIE), Proto-Uralic, Proto-Germanic, Common Slavic, Proto-West Germanic
  Attested: Greek, Latin, Old Norse, Old High German (OHG), Old Church Slavonic (OCS)
  Living: Finnish, Hungarian, Dutch, English, German, French, Spanish, Italian, Portuguese, Czech, Slovene, Polish, Russian, Danish, Swedish, Icelandic, Norwegian]

Fig. 22.1 — Selected European Languages, Living, Dead, and Attested3

³ In this diagram, the languages whose names appear below the broken line are living now; the languages whose names appear between the solid line and the broken line are no longer living, but are attested. The languages whose names appear above the solid line are neither living nor attested.

Lexical vs. Sound Correspondences

In this chapter we will be looking almost entirely at phonological change, because ultimately that's where most of the action is. There are several reasons for this. One, related to one of the statements i made earlier, is simply that more research has been done on phonological change than on any other kind of change, with the result that we have more to say about it. But there's a much more important reason, which i'm going to spend a little bit of time on right now. In the popular mind, historical linguistics⁴ is tied up with the notion of etymology, the study of the history of words. When people want to know ‘what is the history of the word glamour?’ or ‘Why doesn't the word prevent mean what it did 400 years ago?’, it's ultimately to the historical linguist that they go. And well they should. And it's understandable that ordinary people should be more concerned about words, and the history of words, than about such abstractions as phonemes or segments; that's one of the reasons why i began this course by talking about morphology (about word structure) and only later talked about phonetics and phonology.

Because ordinary human beings are much more aware of words than of anything else about language, at least in the abstract, when they think about the work of historical linguistics with regard to the comparison of different languages to find connections between them and the reconstruction of ancient languages, they naturally think about this work in connection with lists of words: ‘Oh, they collect lots of words from different languages and see if the words that different languages use for the same thing are similar to each other.’ Consider the words for the concept ‘3’ from 25 European languages given in (1).⁵ I would venture that it would be virtually impossible for an intelligent human being to look at this list and not suspect that there's something very funny going on here, that these words must all be, in some sense, the same word, even though they come from so many different languages. And indeed such a suspicion would be very well founded. We know that all these words derive ultimately from one single word, whose form was very, very similar to that of the Rumanian word in the middle of the third line. Furthermore, if you look at the complete vocabularies of these two dozen or so European languages, you will find hundreds, even thousands, of similarities just like this.

(1) Breton, Bulgarian, Croatian, Irish, Norwegian, Serbian, Welsh tri
    Czech tři, Russian tryi, Polish tši, Lithuanian tryis, Greek & Latvian tris
    Spanish tres, Portuguese treš, Rumanian trei, Albanian, Danish, Italian, Swedish tre
    Dutch dri, German drai, Icelandic θrir, English θri, French trwa

But if you think that what historical linguists do is draw up long lists of words from different languages and compare them, you're in danger of becoming horribly confused sooner or later. Because it's absurdly easy to come up with lists like this. If you take any two languages, you can find dozens, hundreds, possibly even thousands of words that mean the same thing, or approximately the same thing, and that are pronounced somewhat alike, even though there is no historical connection between the words, or between the languages they come from. Mark Rosenfelder, a historical linguist who has a knack for coming up with lists like this and shows them to his students and colleagues to demonstrate how dangerous this game can be, has calculated that, on the average, it is quite possible to come up with over 200 matches between languages that are chosen totally at random and have no historical connection with each other that can be recognized by the scientific methods of historical linguistics.

⁴ And remember that i said earlier that, in the popular mind, historical linguistics is linguistics.
⁵ Note that in (1) i am presenting the words in a reasonable approximation of their pronunciations, not the way they are actually spelled in the standard orthographies of the languages in question. The German, English, and French words are, of course, customarily represented as ‘drei’, ‘three’, and ‘trois’ respectively, while the Russian word is customarily represented as ‘три’.

I'm going to give you a little taste, just a little taste, of how confusing this can be. Consider first the words in the first two columns in (2). Now we know that these languages are related to each other. But these pairs of words are not, in fact, related to each other, even though in each case they mean virtually the same thing and they are very similar in pronunciation. Pairs of words like these (in which we know the languages in question are related to each other, the two words sound alike and mean approximately the same thing, and yet the two words are not related) are called ‘false friends’; such pairs have indeed misled and deceived many a researcher. In both cases, i've included in the right-hand column the word from the language in the middle column that is actually related to the word in the left-hand column.⁶

(2)     ‘false friend’                                  ‘true friend’
    a.  Greek theos, ‘god’      Latin deus, ‘god’       (possibly) Latin festus, ‘festive’
    b.  Latin dies, ‘day’       English day             Tues- as in ‘Tuesday’

I can also show you the ‘matches’ or ‘correspondences’ in Fig. 22.2. These involve Chinese (Mandarin) and one or two other languages, in most cases English or Quechua (or both); Quechua is a South American language native to Peru.

The reason i'm going into this is partly that most people don't realize just how easy it is to come up with phony lists of ‘correspondences’ like these. All too often, historical linguists hear claims like ‘here's a list of 200, 500, 1000 words in these two different languages that are pronounced alike and mean approximately the same thing. Surely such a long list can't be due to chance!’ As Rosenfelder points out, it's people like that who keep the major gambling houses in business; by and large, human beings are just not very good at judging probabilities. As Rosenfelder has demonstrated mathematically, yes indeed it is very possible for such a long list to be due entirely to chance!
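The arithmetic behind a claim of this kind can be sketched in a few lines of Python. This is not Rosenfelder's actual calculation; it is a minimal illustration, and the function name and all three parameter values are assumptions chosen only to show how quickly chance matches pile up once the criteria for ‘similar meaning’ and ‘similar sound’ are loosened.

```python
def expected_chance_matches(n_words, candidates_per_word, p_sound_match):
    """Expected number of words in language A that find at least one
    chance look-alike in language B.

    All three parameters are illustrative assumptions:
      n_words             -- size of the word list taken from language A
      candidates_per_word -- words of B that count as 'close enough' in meaning
      p_sound_match       -- chance that one random word of B sounds 'close enough'
    """
    # Probability that none of the candidates happens to sound alike.
    p_no_match = (1.0 - p_sound_match) ** candidates_per_word
    # Probability of at least one look-alike, times the number of words tried.
    return n_words * (1.0 - p_no_match)


# Strict criteria versus loose criteria (both sets of numbers are made up).
strict = expected_chance_matches(2000, 10, 0.002)
loose = expected_chance_matches(2000, 20, 0.01)
print(round(strict), round(loose))
```

Under these assumed numbers, loosening the criteria only slightly multiplies the expected number of accidental matches several times over, which is the point of the warning above.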

But if it's possible to come up with lists of hundreds of words like this, that are pronounced some- what alike and mean approximately the same things, and yet not prove thereby that these words, or the languages they come from, are in any way related to each other, how then is it possible for histo- rical linguists to establish that two or more languages actually are related to each other?

⁶ There is some uncertainty about the cognation of Greek theos and Latin festus, which is why i qualify it. There is, however, no doubt about the relationship between Latin dies and Tiw, Tues-.

Mandarin   English               Quechua                     French                     Dutch
搬         ban                   -                           -                          -
板         pan                   -                           -                          -
唱         chant                 -                           -                          -
吃         chew                  -                           -                          -
旦         dawn                  -                           -                          -
賠         pay                   -                           -                          -
配         pair                  -                           -                          -
師傅       chief                 -                           -                          -
耳         ear                   -                           -                          -
是         sure⁷                 -                           -                          -
給         (Scots) gie ‘give’    qoy ‘give’                  -                          -
工         gung-ho               kunay ‘carry’               -                          -
虫         chigger               chinchi ‘kind of insect’    -                          -
人         -                     runa ‘person’               -                          -
難         -                     nanaq ‘difficult’           -                          -
吃飯       -                     chipay ‘close mouth’        chef ‘cook’                -
愛         -                     ayni ‘mutual help’          aime ‘love’                -
山         -                     senqa ‘peak’                chaine ‘mountain range’    -
水         -                     -                           suïr ‘sweat’               schuit ‘boat’

Fig. 22.2 — Misleading Lexical Correspondences

That's where the difference between sound correspondence and lexical correspondence becomes important. What we've been talking about so far is lexical correspondence: the existence in two different languages of words that are pronounced similarly and that mean approximately the same thing. And it's true that historical linguists work with long lists of such words. But what most people don't understand is that merely being able to draw up such a list proves nothing; such lists can exist on a purely chance basis. What's important is what one does with them. Drawing up the list is just the first part of the job; there's a lot more work to be done once the list is drawn up.

(3) a. Lexical Correspondence: Two words in two different languages that are similar in both meaning and pronunciation
    b. Sound Correspondence: A particular phoneme A in one language that consistently shows up in the same environment as a particular phoneme B in another language

What one does is study the list, looking for correspondences not between words (you've already got that on the list) but between segments, between phonemes. For instance, let's look at the list in (4) of lexical correspondences among some Polynesian languages. Notice the word ‘cognate’ in the heading, by the way; two words from two different languages are cognate if they're related to each other, derived from a common source. By using the word ‘cognate’ here i'm telling you right off the bat that these words really are related, they don't merely look like it the way the words in Fig. 22.2 do; in the terms used in (2), these are ‘true friends’, not ‘false friends’. But what i'm trying to do is explain to you how we are able to tell that in fact they are related.

(4) Some Polynesian Cognates
    Maori     Hawaiian   Samoan    Fijian    translation
    tapu      kapu       tapu      tabu      ‘forbidden’
    taŋi      kani       taŋi      taŋi      ‘cry out’
    takere    ʔele       taʔele    takele    ‘keel’
    kaho      ʔaho       ʔaso      kaso      ‘thatched roof’

⁷ I've included this pair in this list in respect of the common usage of the English word ‘sure’ as an expression of affirmation, a usage closely paralleled by the conversational use of the Chinese word ‘是’ as a general expression of similar force.

Now, the historical linguist looks at this list and says, ‘Aha! Every time Maori, Samoan, and Fijian have a /t/, Hawaiian has a /k/; but whenever Maori and Fijian have a /k/, Hawaiian and Samoan have a glottal stop (/ʔ/)!’ What’s critical is not that Hawaiian kapu corresponds to Maori or Samoan tapu or Fijian tabu, or that Samoan taʔele corresponds to Fijian takele or Maori takere (although in fact they do), but that a /t/ in Maori, Samoan, and Fijian corresponds to a /k/ in Hawaiian, while a /k/ in Maori and Fijian corresponds to a glottal stop in Hawaiian and Samoan. This list of lexical correspondences, interesting as it may be, is useful to the historical linguist only in that it makes it possible to recognize and draw up a list of sound correspondences.
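The step from a list of cognates to a list of sound correspondences can be sketched mechanically. The Python below is only an illustration, not a description of how comparative linguists actually work: it assumes the words have already been aligned by hand, segment against segment, and the ‘-’ padding in the Hawaiian word for ‘keel’ is my own alignment assumption. The names `COGNATE_SETS` and `sound_correspondences` are hypothetical.

```python
from collections import Counter

LANGUAGES = ("Maori", "Hawaiian", "Samoan", "Fijian")

# The four cognate sets from (4), one character per segment.
# The "-" padding in the Hawaiian form of 'keel' is a hand alignment
# (an assumption of this sketch): Hawaiian ʔele lacks the initial ta-.
COGNATE_SETS = [
    ("tapu", "kapu", "tapu", "tabu"),
    ("taŋi", "kani", "taŋi", "taŋi"),
    ("takere", "--ʔele", "taʔele", "takele"),
    ("kaho", "ʔaho", "ʔaso", "kaso"),
]


def sound_correspondences(cognate_sets):
    """Count how often each column of aligned segments occurs."""
    counts = Counter()
    for forms in cognate_sets:
        # zip(*forms) walks the aligned words position by position.
        for column in zip(*forms):
            counts[column] += 1
    return counts


counts = sound_correspondences(COGNATE_SETS)
for column, n in counts.most_common():
    # Show only correspondences that recur and are not simply identical.
    if n > 1 and len(set(column)) > 1:
        print(dict(zip(LANGUAGES, column)), n)
```

With only four cognate sets the counts are tiny, of course; the point is that a correspondence such as t : k : t : t earns its keep by recurring, and in real work one would want it to recur across dozens or hundreds of words.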

It's not that lexical correspondences are unimportant or uninteresting; they're certainly interesting, and they're very important. But they're important only in that they enable us to recognize and draw up a list of sound correspondences. This is why a list like that in Fig. 22.2 is worthless except as an entertaining pastime; it's just a bunch of words splashed all over the place; it doesn't allow us to draw up anything in the way of a set of regular sound correspondences. For instance, it doesn't suggest any explanation of why the Mandarin /k/ should correspond to a Quechua /k/ in the case of 工 but to /q/ in the case of 給; and, since Fig. 22.2 offers only one example apiece of /ān/, /ǎn/, and /àn/ in words with English correspondences, there isn't enough data to suggest any explanation as to why /ān/ and /ǎn/ both correspond to English /an/ but /àn/ corresponds to English /ɔn/, or whether the difference in tone between 搬 and 板 corresponds consistently with a distinction between voiced and voiceless initial consonant in English. What's important is being able to look at hundreds or thousands of lexical correspondences like these and see regular sound correspondences. If you can't find them no matter how hard you look, then they probably aren't there, and if they aren't there, there's no relationship to be recognized.

Let me give you another example of what i mean. In the spring of 1998 i had an extended discussion with a fellow who wanted to argue that English (and, apparently, though he was a little unclear about this, all Indo-European languages) is descended from Hebrew. Among the many, many lexical correspondences he offered were the two in (5a). I responded that the first of these suggests that an /uː/ in Hebrew should correspond to an /eɪ/ in English and the second that a /ts/ in Hebrew should correspond in English to /s/. If this is so, how come English doesn't have a word ‘sais’ meaning ‘horse’, corresponding to Hebrew /suːs/? And how come the closest English equivalent to Hebrew [ɛrɛts] is pronounced [ɝθ], and not [aris]? In summing up, i told him that he would need to provide ‘a list of all the phonemes of the Hebrew language and all the phonemes of the English language, showing how they correspond to each other…. If i offer you a specific phoneme or phonological string in Hebrew, you should be able confidently to tell me what the corresponding phoneme or phonological string in English would be, or vice versa.’

(5)     Hebrew               English           Hebrew               English
    a.  [muːm] ‘injury’      maim [meɪm]       [tsomɛt] ‘peak’      summit [sʌmɪt]
    b.  [suːs] ‘horse’       sais* [seɪs]*     [ɛrɛts] ‘earth’      aris* (not ‘earth’)
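The test i put to him can be written out as a tiny procedure: take the correspondences a proposal implies, apply them to other words, and see whether the predicted forms exist. The Python below is a sketch under stated assumptions. It encodes only the two correspondences named above (Hebrew /uː/ to English /eɪ/, and /ts/ to /s/), it assumes for the sake of illustration that every other segment carries over unchanged, and the names `PROPOSED_RULES` and `predict_english` are hypothetical.

```python
# Correspondences implied by the two pairs in (5a), as named in the text:
# Hebrew /uː/ -> English /eɪ/ and Hebrew /ts/ -> English /s/.
# Segments not listed are assumed (for this sketch only) to stay unchanged.
PROPOSED_RULES = {"uː": "eɪ", "ts": "s"}


def predict_english(hebrew_segments, rules=PROPOSED_RULES):
    """Apply the proposed correspondences segment by segment."""
    return [rules.get(segment, segment) for segment in hebrew_segments]


horse = predict_english(["s", "uː", "s"])        # Hebrew 'horse'
earth = predict_english(["ɛ", "r", "ɛ", "ts"])   # Hebrew 'earth'
print("".join(horse), "".join(earth))
```

The prediction for ‘horse’ comes out as the non-existent ‘sais’, and the prediction for ‘earth’ ends in /s/ where the real English word ends in /θ/. A proposal whose own correspondences generate forms that aren't there has not established anything.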

I hope that's clear; it's terribly important that you understand that drawing up lists of similar words is really only the first part of a very long and complicated process of establishing the possibility that two languages are related and examining the exact nature of that relationship. What's really important is recognizing regular sound correspondences.

↓↓↓ optional ↓↓↓

The Great Vowel Shift

I'm going to spend a few minutes talking about a particular sound change, and the resulting set of sound correspondences, that was very important in the history of the English language; indeed, in my course on the history of the English language i spend the better part of a whole lecture talking about this one change and its consequences. I'm talking about what is commonly known as the Great Vowel Shift (GVS). It's diagrammed in Fig. 22.3.

[Vowel diagram. Symbols shown: iː, uː, aɪ, aʊ, eː, eɪ, ɛː, ɔː, aː]

Fig. 22.3 — The Great Vowel Shift

What's most important to note about the GVS is that it affected only long vowels. Middle English had inherited seven long vowels from late Old English, and every one of them mutated around the 15th century. Every one of them, in every single word in which it occurs. I've given you some representative examples in (6). Bear in mind, these are representative examples; every one of them represents a large number of words, indeed every single word in which the appropriate long vowel occurred in the 14th century.

(6) Word     Middle English Pronunciation   Modern English Pronunciation
    mice     [miːsə]                        [maɪs]
    mouse    [muːsə]                        [maʊs]
    geese    [geːsə]                        [giːs]
    goose    [goːsə]                        [guːs]
    break    [brɛːk]                        [breɪk]
    broke    [brɔːkə]                       [broːk]
    name     [naːmə]                        [neɪm]
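Because the change is completely regular, the examples in (6) can be reproduced by a simple lookup. The Python below is a sketch, not a full account of the history: the mapping is read directly off the seven examples in (6), the loss of final schwa is a separate change that i include only so the output matches the modern forms shown, and the names `GREAT_VOWEL_SHIFT` and `modernize` are hypothetical.

```python
# Long-vowel changes read off the examples in (6).
GREAT_VOWEL_SHIFT = {
    "iː": "aɪ", "uː": "aʊ",
    "eː": "iː", "oː": "uː",
    "ɛː": "eɪ", "ɔː": "oː",
    "aː": "eɪ",
}


def modernize(middle_english_segments):
    """Shift every long vowel once; leave all other segments alone."""
    segments = list(middle_english_segments)
    # Loss of final schwa is a separate change, not part of the GVS;
    # it is included only so the output matches the forms in (6).
    if segments and segments[-1] == "ə":
        segments.pop()
    # Each vowel is looked up in the original form, so eː -> iː is not
    # then fed back through iː -> aɪ.
    return [GREAT_VOWEL_SHIFT.get(s, s) for s in segments]


for word in (["m", "iː", "s", "ə"], ["g", "eː", "s", "ə"], ["b", "r", "ɛː", "k"]):
    print("".join(word), "->", "".join(modernize(word)))
```

Note that the lookup is applied once to each vowel of the older form. That matters: the shift moved the whole system at the same time, so the new /iː/ of geese did not go on to become /aɪ/ the way the old /iː/ of mice did.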

I said that what's most important about the GVS is that it affected only long vowels. But Middle English had a great many words that were morphologically related to each other and which differed precisely in the length of their stem vowels. As a result of the GVS, these stem vowels came to differ from each other not so much in length as in quality. Hence the discrepancies between the stem vowels in the words in (7). The words in the left-hand column are morphologically related to those across from them in the right-hand column; but because of the GVS, these words differ from each other with regard to the quality of their vowels.

(7) Long stem vowel         Short stem vowel
    serene [səɹiːn]         serenity [səɹɛnɪti]
    divine [dɪvaɪn]         divinity [dɪvɪnɪti]
    profound [pɹofaʊnd]     profundity [pɹofʌndɪti]
    sane [seɪn]             sanity [sænɪti]
    fool [fuːl]             folly [fɔli]

As you can see from (7), in every case it is the shorter word that has the ‘long stem vowel’; this is because the suffixes involved in deriving the words in the right-hand column all had as part of their character, in Middle English, the effect of shortening the stem vowel. In Middle English, this effect would have been perceived only as an influence on the length of the vowel-sound itself. But once the GVS had taken place, it became a difference in vowel quality. The stem vowels in the words in the right-hand column in (7) are still pronounced, in Modern English, approximately the way they were pronounced in the 14th century. But the equivalent vowels in the words in the left-hand column are pronounced quite differently, due to the GVS.

It should be noted that the GVS is not unique by any means. In fact, vowel-shifts seem to be characteristic of English and of at least some of the other Germanic languages; the GVS is remarkable mainly because its effects were more thorough and far-reaching than any other single change in the pronunciation of vowels in the history of English. There have been vowel shifts in English before the GVS; there have been others since; there are some going on right now, at least in certain dialects. And Icelandic has sustained quite a few vowel-shifts too, with roughly the same result as those in English: The written form of the language has changed hardly at all in over 700 years, but the pronunciation is quite a bit different.

↑↑↑ optional ↑↑↑

What do we Mean by ‘Language Family’?

Probably the majority of educated people have at least a vague notion that languages can be related to each other, that they can be grouped into families, and they assume that it is the work of historical and comparative linguists to establish what these language-families are and which languages belong to which families. And to some extent they're right; although this is by no means the whole of what historical and comparative linguistics is all about, it is and always has been an important part of our job. I'm going to talk now about what we mean by the notion ‘language family’, what we mean when we say that two languages are ‘related’ to each other.

Let's say we have three languages that we know about; for the moment, we'll call them A, B, and C. And some historical linguist says that A, B, and C are ‘related’ to each other. What this means, basically, is that this historical linguist is claiming that at some point in the past there existed a single language, which we will call X, and that all three languages A, B, and C evolved from X by means of ordinary, typical processes of language change such as we looked at in the previous chapter. It means that we can, at least in theory, trace every stage in the historical evolution of each of these three known languages A, B, and C from X by saying things like, ‘Well, at this stage such-and-such an assimilation process took place, at this stage epenthesis occurred, at this stage there was a certain kind of analogical change in the morphology’: all the sorts of things that we know can happen in the course of linguistic evolution. The point is that each of these three languages can be traced back through a series of recognizable or plausible ancestor languages until the three lines of evolution meet at X, as shown in Fig. 22.4. That’s what the claim of language ‘relationship’ or affiliation technically means. And, just as we say we are ‘descended’ from our biological ancestors, so we say that languages A, B, and C are ‘descended’ from X, which may be called their glossogenetic ancestor.

X

A B C

Fig. 22.4 — Glossogenetic Descent

Of course, i'm assuming that the historical linguist who made this claim of affiliation is correct in making it. That's part of where the scientific aspect comes in. Just as when a biologist, a chemist, or a physicist makes a claim about some phenomenon in their area of expertise it's the job of other experts in the same and related fields to look into it and see if it holds up, so when a linguist makes a claim about language it's the job of other linguists to check up on it. This is true in the various areas of grammatical theory i talked about last semester, and it's true in historical and comparative linguistics as well. This is a scientific enterprise; we go about all this work very scientifically. That's one of the things i want to talk about in this chapter.

Now, assuming this linguist's claim is valid and is substantiated, we can say that languages A, B, and C are related to each other and that they are all members of a language family (it's an open question whether we might find other members of the same family). But now consider this possibility: Suppose another linguist comes along and agrees with the first one that A, B, and C are all descended from X, but suggests that A and B may be more closely related to each other than either is to C. What this would mean is that the second linguist thinks there is some other, intermediate language, which we will call Y, which is descended from X and from which A and B are descended, but which is not an ancestor to C. So the diagram in Fig. 22.4 would be replaced by Fig. 22.5.

X

Y

A B C

Fig. 22.5 — Glossogenetic Descent (Greater Detail)

Note that this claim does not challenge the earlier claim; it merely claims that the relationship between A, B, and C is more complicated than was diagrammed in Fig. 22.4. A, B, and C are still related according to Fig. 22.5, but there is an extra, complicating level of affiliation: A and B are related to each other through common descent from Y, and Y and C are related to each other through common descent from X.

To put this in a more specific perspective, where we’re referring not to hypothetical languages called ‘A’, ‘B’, and ‘C’ but to real ones, let’s rename ‘A’ ‘English’, ‘B’ ‘German’, and ‘C’ ‘Greek’. We know that English, German, and Greek are all related to each other. But we also know that English and German are more closely related to each other than either is to Greek. Put another way, English and German are related to each other through common descent from an ancestor language called ‘Proto-Germanic’; English, German, and Greek are related to each other through common descent from an ancestor language called ‘Proto-Indo-European’ (PIE). But Proto-Germanic is itself descended from PIE, so the common ancestor of English and German is more recent than the common ancestor of English, German, and Greek. So, in this case, we can replace the rather abstract diagram in Fig. 22.5 by the more particular one in Fig. 22.6.

Proto-Indo-European

Proto-Germanic

English German Greek

Fig. 22.6 — Glossogenetic Descent (Particular Example)

A diagram like that in Fig. 22.6 represents to some extent what we mean by the notion ‘language family’. All the languages mentioned in the diagram are members of a family called ‘Indo-European’; in addition, the languages Proto-Germanic, English, and German are members of a (smaller) family called ‘Germanic’, which is itself a member of the Indo-European family.
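The notion of ‘more closely related’ has a precise meaning in a diagram like Fig. 22.6: two languages are more closely related the lower down the tree their lines of descent meet. As a minimal sketch, the Python below stores the tree of Fig. 22.6 as parent links and finds the closest common ancestor of any two languages in it. The names `PARENT`, `ancestors`, and `closest_common_ancestor` are hypothetical, chosen for this illustration.

```python
# Parent links taken from Fig. 22.6; the root has no parent.
PARENT = {
    "English": "Proto-Germanic",
    "German": "Proto-Germanic",
    "Proto-Germanic": "Proto-Indo-European",
    "Greek": "Proto-Indo-European",
    "Proto-Indo-European": None,
}


def ancestors(language):
    """Return the language followed by its ancestors, nearest first."""
    chain = []
    while language is not None:
        chain.append(language)
        language = PARENT[language]
    return chain


def closest_common_ancestor(a, b):
    """First language on a's line of descent that also lies on b's."""
    b_chain = set(ancestors(b))
    for language in ancestors(a):
        if language in b_chain:
            return language
    return None


print(closest_common_ancestor("English", "German"))
print(closest_common_ancestor("English", "Greek"))
```

English and German meet at Proto-Germanic, while English and Greek meet only at Proto-Indo-European, one level further up; that difference in level is all that ‘more closely related’ claims.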

In (7) i've reproduced a portion of a very important text in the history of comparative linguistics, Sir William Jones' 1786 announcement to the Asiatic Society of Bengal that Sanskrit,⁸ Greek, and Latin must be affiliated languages, that is, they must somehow be related in the manner diagrammed in Fig. 22.4.

(7) The Sanscrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists: there is a similar reason, though not quite so forcible, for supposing that both the Gothick and the Celtick, though blended with a very different idiom, had the same origin with the Sanscrit; and the old Persian might be added to the same family, if this were the place for discussing any question concerning the antiquities of Persia.

In this passage Jones announces the results of his comparison of Sanskrit with the Greek and Latin which, as an educated upper-class British gentleman of the 18th century, he was already familiar with and expected his audience to be familiar with. On the basis of this comparison, he asserts ‘that no philologer’ (we would nowadays say ‘linguist’) ‘could examine them all three, without believing them to have sprung from some common source’, which is essentially the notion diagrammed in Fig. 22.4. He then goes on to suggest that the Germanic, Celtic, and Persian (or, as we would call them nowadays, ‘Iranian’) languages may also participate in this relationship, although he does not feel as confident about this notion as he does about the affiliation of Sanskrit, Greek, and Latin. Nowadays, we know that this more speculative hypothesis is indeed true; not only is Sanskrit related to Greek and Latin but also to Germanic and Celtic and particularly closely to Persian. We are now going to look at the content of this claim in a little further detail and discuss how it can be substantiated.

How do we Know that Languages are Affiliated?

To do so, let us return to the difference between lexical correspondences and sound correspondences, because this is the point at which it becomes overwhelmingly relevant.

First of all, there’s no question that lexical correspondences are important in comparative linguistics; you can’t do comparative linguistics without them. If two languages are related, they’re going to share vocabulary; they’re going to have words in common, almost certainly lots of words, there’s no two ways about it. And the way to start investigating the possibility that two languages are related is to compare their vocabularies and collect words that both sound alike and mean similar things. But that’s only the beginning of the process.

⁸ In (7) i’ve reproduced the spelling and punctuation as they are in Jones’ original. Nowadays in English, the name of the Sanskrit language is normally spelled with a ‘k’ rather than a ‘c’. And we don’t include a ‘k’ in the word ‘Celtic’. However, it should be noted that when Jones speaks of ‘Gothick’ he does not mean the single language we nowadays call by that name (and spell ‘Gothic’) but the entirety of what is nowadays called the ‘Germanic’ language family, including German, English, and the Scandinavian languages as well as Gothic.

For one thing, it’s not a good idea to just start picking words at random; assuming for the moment we’re talking about two modern languages, it’s not a good idea to just grab a dictionary of each language and start going through them looking for any words at all that sound alike and mean alike. It makes no sense at all to claim that two languages are related because they happen to have the same words for things like ‘syringe’, ‘telephone’, and ‘helicopter’; the responsible historical linguist is very likely to say, in situations like this, ‘Of course they've got the same words, you idiot: they borrowed them from English!’⁹ We will look at the phenomenon of borrowing in the next chapter, including the fact that some languages are more likely than others to borrow vocabulary. But if we're just starting to do comparative work on a given language, we're not likely to know ahead of time how likely it is to borrow words from other languages; that's the kind of thing we'll probably only find out later.

No; when collecting lexical correspondences between two languages, it's a good idea to start by concentrating on very basic vocabulary. Names of body parts: eyes, mouth, arms, legs, hands, feet, that sort of thing. Words for universal human behaviours, like eating and drinking and talking. Words for the most obvious, universal features of the environment: sky, sun, moon, stars. Every human language, no matter how primitive the society using that language, is going to have words for things like these. I would say this is the First Rule of Comparative Linguistics: When Compiling a List of Lexical Correspondences, Concentrate on Basic Vocabulary.¹⁰

Beyond that, remember what i said earlier: It is OVERWHELMINGLY LIKELY that, if you take any two languages, chosen totally at random, you can find HUNDREDS of words that sound alike and mean alike. It is VERY HARD for the untrained person to believe this, but it is nevertheless true. Every comparative linguist knows it's true, partly because we've all done it ourselves or watched others do it. And i mentioned that some individuals, with some training in statistics as well as linguistics, have actually done the math and proven mathematically that it's true. A long list of lexical correspondences, although no serious comparative work can be done without it, by itself proves nothing except that you've been very diligent at studying dictionaries.

↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓

On the other hand, it is very possible for cognate words in two languages that really are related to each other to be completely unobvious. Consider the Hindi and English words in (8). Each Hindi word on the left-hand side means the same thing as the matching English word on the right-hand side, but they don't seem to resemble each other particularly in pronunciation. And yet most (but not all) of these words really are cognates, really are related to each other, being derived from common sources, and we know that the languages Hindi and English are related to each other. But it's not at all easy to demonstrate that relationship directly, because it's quite distant; Hindi and English are not ‘sister languages’; they're more like distant cousins. The relationship is much more obvious when we consider, not these modern languages, but the oldest known representatives of their respective stocks, as in (9). This is another good idea: when you suspect that two languages, or families of languages, are related to each other, base your comparison on the oldest versions you can get reliable data for.

(8)  Hindi      English        Hindi      English
     ek         one            do         two
     tīn        three          chah       six
     he         is             jāntā      knows
     vo         he             ham        we
     behen      sister

(9)  Sanskrit   Old English    Sanskrit   Old English
     eka        ān             dvā        twā
     trī        þrī            şaţ        seox
     asti       is             veda       wāt
     sa         sē             vayam      wē
     svasar     sweostor

9 Once when i was a graduate student, one of my classmates was giving a classroom presentation on some detail of the grammar of Kannada, her native language. In the course of her presentation, she mentioned that the Kannada word for ‘spoon’ is spunu. Another classmate, quite facetiously, exclaimed, ‘Wow! Just like English! They must be related!’, to which the very proper response was, ‘No, silly, it’s a loan!’

10 One of my personal professional hobby-horses is the hypothesis that the Uralic languages of Northern Eurasia (Finnish, Hungarian, etc.) and the Dravidian languages of Southern India (Tamil, Telugu, Kannada, Malayalam, etc.) are affiliated. Nobody has yet proved this hypothesis, and i don't know if i'm going to be able to either. But it's not new. Linguists have been arguing about it off and on for over a century and a half now. In the late 1940s, Thomas Burrow put the whole hypothesis on a much more solid footing by providing precisely the sort of list i’m recommending here: a list of well over 100 lexical correspondences between Uralic and Dravidian languages, focussing on basic anatomical vocabulary. His list enabled him to propose a bunch of plausible sound correspondences, too.

↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑

I'll say it once again: A list of lexical correspondences is only useful if it provides you with evidence for regular sound correspondences. Ultimately, you want to be able to say that a specific phoneme A in one language corresponds consistently with a specific phoneme B in the other. Where the phonemes in one language differ from those in the other, they do so consistently, according to a certain pattern. If there are exceptions to this pattern, you want to be able to come up with some principled explanation for those. It's not reasonable to expect perfect consistency; there are various kinds of language change that tend to prevent perfect consistency. What you want is something like 80–90% consistency; if you can establish that degree of consistency in sound correspondences based on your list of lexical correspondences, you're building a very good case. So that's the Second Rule of Comparative Linguistics: Look for Regular Sound Correspondences, not just Lexical Correspondences.
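The idea of measuring how consistent a sound correspondence is can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the function name `correspondence_consistency` and the phoneme pairs are invented for the example, and it presupposes that the words have already been aligned by hand, phoneme by phoneme, which is the hard part of the real work.

```python
from collections import Counter, defaultdict

def correspondence_consistency(aligned_pairs):
    """For each phoneme of language A, find the phoneme of language B it
    most often matches, and the share of cases that follow that pattern."""
    partners = defaultdict(Counter)
    for phoneme_a, phoneme_b in aligned_pairs:
        partners[phoneme_a][phoneme_b] += 1
    report = {}
    for phoneme_a, counts in partners.items():
        best_partner, best_count = counts.most_common(1)[0]
        report[phoneme_a] = (best_partner, best_count / sum(counts.values()))
    return report

# Hypothetical, hand-aligned data (not taken from any real language pair):
# each tuple pairs one phoneme of language A with the phoneme found in the
# same position of the matching word in language B.
aligned = [("p", "f")] * 9 + [("p", "p")] + [("t", "th")] * 8 + [("t", "t")] * 2

for phoneme, (partner, share) in sorted(correspondence_consistency(aligned).items()):
    print(f"{phoneme} : {partner}  {share:.0%}")
```

With this made-up data the p : f pattern holds in 90% of the cases and the t : th pattern in 80%, which is the range described above as a very good case; the remaining cases are the exceptions that would still need a principled explanation.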

But a human language isn't just a bunch of words or a bunch of sounds; there's also grammar. You'd be amazed at how often people forget this. I mentioned earlier that i have had the experience of people presenting me with long, long lists of apparent lexical correspondences between English and Hebrew and saying, ‘Doesn't this prove that English and Hebrew are related?’ First of all, as i noted earlier, these people usually have no notion at all about regular sound correspondences. But secondly, when i start to point out to them that the grammars of English and Hebrew are very, very, very different, they look at me as though i just fell to Earth from Uranus. What the hell has grammar got to do with it, they ask. Well, it has a lot to do with it. Every human language has both a vocabulary and a grammar. You'll remember that last term we considered the fact that most of us, most of the time, are not consciously aware of our native language's grammar; but that doesn't change the fact that it's there. Every human language consists of a vocabulary, a collection of words, and a grammar, a set of organizational principles that enables us to use those words. And as a language changes and evolves, both vocabulary and grammar change and evolve more or less together. This means that, if two languages are related, are affiliated, they must resemble each other both in terms of their vocabularies and in terms of their grammars. If you can't establish resemblances in both vocabulary and grammar, it's virtually impossible that the two languages could be affiliated.

↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓

Can a Language Change its Affiliation?

Which brings us to another little issue that is often brought up by people who are often very well educated in a general sort of way but don’t know much about comparative linguistics. I have noted that English has been very willing to accept loans from other languages. As a result of this characteristic and of the history of the English nation, Modern English has just gobs and gobs of words borrowed from Romance sources, especially French. And i and many other historical linguists have frequently heard it suggested that, because of all these borrowings, Modern English is really more of a Romance language than a Germanic language.

A statement like this runs completely counter to the historical linguist's understanding of the concept of linguistic affiliation. We tend to assume that it's impossible for a language to change its affiliation over the course of its history. To refer to the obvious biological analogy, you don't cease being your parents' child merely because you've married somebody who grew up 500 km from where you did. If Modern English is descended from Middle English and Middle English is descended from Old English and Old English is basically a dialect of German (which it obviously is), then Modern English must be a Germanic language. If Modern English is not a Germanic language, then by definition Modern English is not descended from Old English, in spite of the common name and every other obvious indication that it is.

The claim that English has changed from being a Germanic language to being a Romance language raises this theoretical objection, but there are in addition at least two important empirical problems with it. The first has to do with all that borrowing. Yes, Modern English has a lot of Romance words in it, but they’re from all over the place, or at least from all over the Romance-speaking world. Modern English includes loans from French, Italian, Spanish, and Portuguese, not to mention Latin, which is by definition the common ancestor of all the Romance languages. If English were a Romance language, it ought to be possible somehow to fit it into a diagram like Fig. 22.7, which outlines the relationships within the Romance family. This diagram shows that Spanish and Portuguese are more closely related to each other than either is to any of the other Romance languages, and among the other Romance languages they’re more closely related to French than they are to, say, Italian. However, as you may remember from the map in Fig. 18.2 in Chapter 18, the language we normally refer to as ‘French’ is actually divided into two major groups of dialects, a northern group and a southern group. In many ways, the southern dialect-group, historically referred to as ‘langue d'oc’ but in Fig. 22.7 referred to by the names ‘Provençal’ and ‘Catalan’, forms a kind of transition between Spanish and the northern dialect-group, the so-called ‘langue d'oïl’. But, although English has borrowed very heavily from the northern, ‘d'oïl’ dialects of French (including both Norman and Parisian French) and also from Spanish, it has borrowed very little from the southern ‘d'oc’ dialects. If English really were a Romance language, it ought to be possible to fit it into this diagram somewhere. But given the variety of Romance languages English has borrowed from over the centuries, it's kinda difficult to make such an argument about English and any of them.

    Latin
        Western Romance
            Ibero-Romance
                Portuguese
                Spanish
            Provençal/Catalan
            French (‘Langue d'oïl’)
                Francien (Parisian French)
                Norman French
        Italian
        Rumanian

    English? (marked at three candidate positions in the figure)

Fig. 22.7: If English is a Romance Language, Where Does it Fit?

But a much more important problem has to do with grammar. Yes, Modern English has an awful lot of loans, and over 50% of its vocabulary comes from Romance sources. But its grammar is still about as Germanic as it can be. Some relatively astute folks trying to argue for the Romance-ness11 of English have pointed out that Modern English plurals look more like French plurals than German plurals. Cf. (10). In German, the most productive way of forming plurals of nouns is by suffixing -(e)n, while in both English and French the most productive plural-suffix is -s.12 On the face of it, this fact would seem to provide a grammatical argument for the affiliation of Modern English with the Romance languages rather than with German.

(10)          French     English    German
     sing.    notice     notice     Notiz
     plur.    notices    notices    Notizen

11 A more elegant term for this is ‘Latinity’, which i owe to the noted mid-20th-century Italian linguist Carlo Tagliavini, who once declared, ‘There's no such thing as “Latin races”; what there is, is “Latinity”.’ (‘Razze latine non esistono; esiste la latinità.’) By which he meant that the fact that the Portuguese, the Spanish, the French, the Italians, and the Rumanians all speak related languages does not in any way prove that these nations are ethnically related; all it proves is that their languages have something in common, in this case being descended from Latin.

12 Granted, in Modern French the -s suffix is almost purely orthographic; it’s rarely pronounced. But that's beside the point. We have every reason to believe that, at the time in the Middle Ages when English was borrowing all those words from French, the -s was pronounced in French as much as it is in Modern English.

But in fact it's German that's the odd one out here. In Old High German, the direct ancestor of Modern Standard German that is roughly contemporaneous with (late) Old English, we find a variety of different plural markers, including both the -en and the -s suffixes; and we find the same variety in Old English. This variety is common to all the early Germanic languages. In the history of English, the -s suffix was gradually generalized, being applied to a much wider range of nouns than it had applied to originally; and this process of generalization was well under way before English began being significantly influenced by French, which means we can't even claim that contact with French encouraged the process. German, on the other hand, generalized the -en suffix, and is almost the only Germanic language to do so.13 So this is really not a very good example.

More telling grammatical correspondences are the following. First of all, alongside its generalized -s plural (and the residual -en plurals in words like ‘children’ and ‘oxen’), English retains the ‘umlaut’-plural, as in (11). This plural-formation-strategy is both characteristic of and peculiar to the Germanic languages; all Germanic languages do it at least to some extent, and among the Indo-European languages only the Germanic languages do it. Granted, it's more common in German than in Modern English, and it's relatively productive in Modern German where it isn't at all in Modern English,14 but that’s beside the point. Modern English has the umlaut-plural, and since only Germanic languages have it that makes English a Germanic language.

(11) English          German
     mouse / mice     Maus / Mäuse
     man / men        Mann / Männer
     tooth / teeth    Zahn / Zähne
     foot / feet      Fuß / Füße

Secondly, English has the characteristic Germanic ablaut conjugation-paradigms, as in (12), in which tense-differences are represented not by suffixes but by changes in the stem-vowel. Now, ablaut is a characteristic feature of the very earliest Indo-European languages and is generally assumed to have been characteristic of Proto-Indo-European and therefore part of the common legacy of all Indo-European languages, but in fact among the modern Indo-European languages it survives only in the Germanic family. In this respect, too, English is very typical of its Germanic relatives.

(12) English               German
     sing / sang / sung    singt / sang / gesungen
     break / broke         bricht / brach / gebrochen
     bring / brought       bringt / brachte
     see / saw             sieht / sah / gesehen
     take / took           nimmt / nahm / genommen
     fall / fell           fällt / fiel / gefallen

13 The Scandinavian languages have generalized the suffix -r, a minor option among the West-Germanic languages.

14 Except for humorous coinages such as ‘spice’ as the plural for ‘spouse’.

The third point i will mention has to do with comparatives and superlatives. Modern English has two different strategies for comparison: the flexional, involving the suffixes -er and -est, and the periphrastic, involving the words ‘more’ and ‘most’. The first of these is characteristic of Germanic languages; actually, like ablaut, this pattern is inherited from Proto-Indo-European, and it can be found in Latin as well. But all the Romance languages lost it at a fairly early stage of their evolution from Latin. The modern Romance languages, including French, have only the periphrastic strategy for adjective comparison. As can be seen in (13), English has both, and tends to use the flexional strategy for shorter adjectives (most of which are inherited from Germanic) and the periphrastic strategy for longer ones (the overwhelming majority of which are borrowed from French anyway). So, it can be said that English borrowed the periphrastic strategy from French along with the adjectives it uses it with, but it continues to use it only with those Romance-derived adjectives.15 Indeed, with regard to relative productivity, it has been noted that, while both the constructions in (14) strike most English-speakers as a little weird or offbeat, we tend by and large to be more comfortable with (14b) than with (14a), which suggests that the inherited Germanic pattern is more productive and felt to be more characteristic of English. So even here, where we do have evidence of the borrowing of a grammatical pattern from French, it's really not as deeply-rooted a part of English grammar as it might appear merely from the fact that it's used with more adjectives; after all, it's just a matter of statistics that English is going to have more adjectives of 2, 3, or more syllables than of single syllables.

(13) German      English                 French
     lang        long                    long
     länger      longer                  plus long (‘more long’)
     längste     longest                 le plus long (‘the more long’)
     schön       beautiful               beau
     schöner     more beautiful          plus beau
     schönste    (the) most beautiful    le plus beau

(14) a. *more long
     b. *?curiouser, *?miserabler

↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑

Now let's go back to Sir William Jones' statement. He is being very responsible here. He says that Sanskrit is related to Greek and Latin ‘both in the roots of verbs and in the forms of grammar’. First he mentions vocabulary, lexical correspondences; but not just any kind of vocabulary; he speaks specifically of the ‘roots of verbs’, which is one type of very basic vocabulary, at least from the point of view of derivational morphology the most basic, fundamental vocabulary. But he also mentions ‘the forms of grammar’. His argument is based on both lexical correspondences and similarities in grammar. That's a big part of what makes it so convincing.

15 There is at least one exception to this rule. Although ‘open’ is a short adjective of Germanic origin, it doesn't admit of a flexional comparative or superlative; ‘opener’ exists in Modern English, but only as a noun (‘can opener’).

To sum up, the Third Rule of Comparative Linguistics is: Look for Similarities in Grammar as well as Vocabulary.

(15) The Basic Rules of Comparative Linguistics:
     a. When compiling a list of lexical correspondences, concentrate on basic vocabulary.
     b. Look for regular sound correspondences, not just lexical correspondences.
     c. Look for similarities in grammar as well as vocabulary.
     d. Reconstruct the hypothetical ancestor language.

The fourth and final rule, the one that puts the icing on the cake, as it were, is Reconstruct the Hypothetical Ancestor. As i said earlier, if you claim that two languages are related, that must mean that you believe they both evolved from a single common ancestor. That single common ancestor must have been a real human language at one time. And, given what we know about language change and linguistic evolution, it ought to be possible to reconstruct it, to say a certain amount about what it was like, both in terms of its vocabulary and in terms of its grammar. It's not necessary to put a name to it, to identify it with any previously known language; that was part of the genius of Sir William Jones' statement, when he said that the ‘common source’, as he called it, from which Sanskrit, Greek, and Latin ‘have sprung’ ‘perhaps no longer exists’. Most previous attempts at establishing linguistic affiliation had been focussed on arguing that all languages, or that certain languages, were descended from some specific language that could be independently identified: Latin, Greek, Hebrew, whatever. What Sir William Jones was saying was that Sanskrit, Greek, and Latin are descended from a common ancestral language, but that this language may no longer exist, this language may furthermore be one that has left no written records, this language may be one that nobody now living, or in the whole course of recorded history, has ever heard of. And of course this turned out to be true. We now know that, by the time any of the Indo-European languages entered the historical records, their common ancestor, Proto-Indo-European, was dead.

But identifying the hypothetical common ancestor with some language for which we have independent evidence is not necessary. All that’s necessary is being able to say, with confidence that is supported by the data when examined scientifically, that it had such-and-such characteristics, that it included such-and-such words and had such-and-such grammatical processes. If you can provide a plausible picture of this language16 and show that the actual languages you’ve been examining can be derived from it by plausible sequences of sound changes and other kinds of reasonable language change, then you have created an overwhelmingly strong case for affiliation and the burden of proof is very definitely on those who would disagree with you.
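Deriving the attested languages from a reconstructed ancestor ‘by plausible sequences of sound changes’ can be pictured as applying ordered rewrite rules to proto-forms. The sketch below is a toy illustration only: the proto-words, the two daughter languages, and the rules are all invented for the example, and real sound changes are usually conditioned by their phonetic environment in ways that simple substitutions like these do not capture.

```python
import re

def apply_changes(proto_form, ordered_changes):
    """Apply an ordered list of (pattern, replacement) sound changes."""
    form = proto_form
    for pattern, replacement in ordered_changes:
        form = re.sub(pattern, replacement, form)
    return form

# Hypothetical proto-forms and rules, invented purely for illustration.
PROTO = ["pata", "kapi", "tuka"]
CHANGES_A = [(r"p", "f"), (r"a$", "")]   # p > f everywhere, then final a is lost
CHANGES_B = [(r"k", "h"), (r"i$", "e")]  # k > h everywhere, then final i > e

daughter_a = [apply_changes(word, CHANGES_A) for word in PROTO]
daughter_b = [apply_changes(word, CHANGES_B) for word in PROTO]

for proto, a, b in zip(PROTO, daughter_a, daughter_b):
    print(f"*{proto} > A: {a}, B: {b}")
```

The comparative linguist works in the opposite direction, starting from forms like those of the daughters and asking which proto-forms and which ordered rules would account for all of them; the sketch only checks that a proposed reconstruction does derive the attested forms.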

16 By ‘plausible’ i mean that it looks like a plausible human language, as opposed to one whose grammar makes stipulations about what make of car you may own or what colour clothes you may wear while speaking it.

    Indo-European
        Tocharian
        Armenian
        Thracian
        Venetic
        Germanic
            West: English, German
            North (Scandinavian): Danish, Icelandic, Norwegian, Swedish
            East: Gothic
        Greek
        Albanian
        Phrygian
        Indo-Iranian
            Iranian: Persian, Avestan; Farsi, Kurdish, Pashto
            Indo-Aryan: Sanskrit, Prakrits; Urdu, Hindi, Bengali
        Celtic
            Continental: Gaulish
            Insular
                Brythonic (British): Welsh, Breton
                Goidelic (Gaelic): Irish
        Italic
            Oscan
            Latin
                Romance (Western, Central, Eastern): French, Spanish, Italian, Rhaetian (Engadine, Sursilvan, Friulan), Rumanian
            Umbrian
        Anatolian: Hittite, Lydian
        Slavic
            West: Czech, Polish
            South: Slovene, Croatian, Serbian, Macedonian, Bulgarian
            East: Russian, Ukrainian
        Baltic: Lithuanian, Latvian

Fig. 22.8: Indo-European Family Tree

(Languages and branches in italics are extinct)

    Austronesian
        Atayalic: Atayal, Taroko
        Paiwanic: Bunun, Saisiyat, Amis, Paiwan, Rukai
        Tsouic: Tsou
        Malayo-Polynesian
            Western: Malagasy, Malay, Indonesian, Tagalog
            Eastern: Fijian, Gilbertese, Samoan, Marquesan, Tahitian, Maori, Hawaiian

Fig. 22.9: Austronesian Family Tree

    Miao-Yao (broken line): Hmong, Mien
    Sino-Tibetan
        Chinese: Gan (贛), Mandarin, Southern Min (閩南), Northern Min (閩北), Xiang (湘), Wu (吳), Hakka (客), Yue (粵)
        Tibeto-Burman
            Baric: Lushai, Adi
            Burmese-Lolo
                Lolo: Lisu, Yi
                Burmese
            Tibetan
    Karen (broken line): Brek, Geba, Lahta, Manumanaw, Pwo, Padaung, Paku, Yinbaw, Yintale, Zayein
    Daic (broken line)
        Tai: Thai, Lao, Nung, Tho
        Kadai: Buyang, Gelo, Hlai, Lingao
        Kam-Sui: Aicham, Kam, Mulam, Sui

Fig. 22.10: Family Tree of the Sino-Tibetan and Related Languages

It's a lot of hard work; i won't say it isn't. In most cases, it can't be done by a single person. It took at least three generations for Proto-Indo-European to be reconstructed to everybody's satisfaction, and we're still arguing about some of the details. But by God it was worth it! The reconstruction of PIE is one of the great achievements of 19th-century science, right up there with Darwin's Theory of Evolution and Maxwell's Laws of Electromagnetics.

Diagramming Language Families

In Figs. 22.8–10 on the previous two pages i've included some linguistic family trees. Fig. 22.8 represents the Indo-European family. It's not a complete representation; a complete representation of the Indo-European family would almost certainly be impossible, because there are too many members.17 I've also tried to indicate that the Celtic and Italic languages are suspected of being more closely related to each other than they are to any of the other Indo-European languages, likewise the Baltic and Slavic languages.18
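What a family tree of this kind encodes can be sketched as a small data structure: each language is stored with the path of branches leading down to it, and two languages are as closely related as the lowest branch they share. The fragment below is a minimal sketch; the dictionary `PATHS` and the function `closest_common_branch` are names invented for the example, only a handful of languages from Fig. 22.8 are included, and the merely suspected closer link between Celtic and Italic is deliberately left out.

```python
# Each language is listed with the branches leading to it, from the top
# of the tree downward (a partial, simplified reading of Fig. 22.8).
PATHS = {
    "English": ["Indo-European", "Germanic", "West Germanic"],
    "German":  ["Indo-European", "Germanic", "West Germanic"],
    "Swedish": ["Indo-European", "Germanic", "North Germanic"],
    "French":  ["Indo-European", "Italic", "Latin", "Romance"],
    "Welsh":   ["Indo-European", "Celtic", "Insular", "Brythonic"],
}

def closest_common_branch(lang_a, lang_b):
    """Return the lowest branch of the tree that contains both languages."""
    common = None
    for node_a, node_b in zip(PATHS[lang_a], PATHS[lang_b]):
        if node_a != node_b:
            break
        common = node_a
    return common

print(closest_common_branch("English", "German"))   # West Germanic
print(closest_common_branch("English", "Swedish"))  # Germanic
print(closest_common_branch("English", "French"))   # Indo-European
```

On this representation the borrowing of French words into English changes nothing: affiliation is read off the path of descent, not off the vocabulary.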

As a bonus, in Fig. 22.9 i've included a diagram of the Austronesian language family, and in Fig. 22.10 of the Sino-Tibetan family and its hypothetical relatives. I say ‘hypothetical’; that's what the broken lines mean on this diagram. We know that the ‘Sinitic’ or Chinese languages are related to the Tibeto-Burman languages. We're not sure about their relationships with the Miao-Yao, Karen, and Daic families.19

And within the Daic family, we’re not sure about the relationship between the Tai languages properly so called and the Kadai and Kam-Sui languages.20 There’s a lot of work that needs to be done to resolve these questions.

With regard to the Austronesian family tree, i would like to point out that the Austronesian family is generally regarded by Austronesianists as consisting of four subfamilies, the Atayalic, Paiwanic, Tsouic, and Malayo-Polynesian languages. Of these four subfamilies, the first three are all indigenous to this island, the island of Taiwan. That's one of the main reasons why most Austronesianists believe, or at least suspect, that this island is the original home of the entire Austronesian language family, from which the ancestor of the Malayo-Polynesian family moved out into the other islands of the Pacific and Indian oceans. Note in particular that the Malayo-Polynesian family includes Malagasy, the native language of the island of Madagascar just off the coast of Africa, and Hawaiian; that's a range of several thousand kilometers. The Malayo-Polynesian speakers have been among the world's greatest explorers. Meanwhile, note that, back on Taiwan, the Tsouic family in particular is dying out; almost all the members of this family are dead or dying, and only the Tsou language itself seems to remain viable. How long this will remain the case is anybody’s guess.

17 When i was drawing up this diagram, i wanted to have the Celtic languages all the way over on the far left side and Tocharian all the way over on the right because they’re the farthest removed from each other geographically; after all, Tocharian was spoken in China, while Celtic is spoken in Ireland. But things just didn’t work out that way.

18 Please note that in Figs. 22.8–10 i have not by any means listed all the relevant languages, only the most important ones in terms of numbers of speakers.

19 The Miao-Yao languages are spoken in northern Vietnam and in the Yunnan and Guizhou provinces of southern China; the Karen languages are spoken in the extreme southern tip of Myanmar, along the Andaman coast.

20 The Kam-Sui languages are spoken around the northern part of Guangxi province and the neighbouring areas of Guizhou; the Kadai languages are sprinkled here and there in various parts of the southernmost part of China, including the northern part of Hainan Island.

↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓

Language Death

Can a language die? Obviously, we've been talking about such hypothetical languages as Proto-Indo-European, Proto-Germanic, etc., and if these languages ever existed as real, living human languages they don't exist now. So it seems we must assume that languages can die. How does this happen?

In linguistics textbooks you will often find statements like ‘A language dies when no children learn it’. This is, you will not be surprised to learn, somewhat of an oversimplification. In some respects it is true; it is certainly true that, when a language ceases to be the first language of any community, when there are no longer any children learning that language as their first, their ‘mother’ language, something rather serious has happened to that language, and in most cases such a language is definitely doomed.

But there are exceptions. Or rather, there are examples of languages that have had this experience, that have ceased to be passed on from parents to children in the natural way human language is normally passed on from generation to generation, but which seem nevertheless to remain ‘alive’; at least, you might find people who would disagree with you, and disagree with you rather passionately, if you were to claim that these languages are ‘dead’.21 I'm thinking in particular of such ancient languages as Latin and Sanskrit. These languages are ‘dead’ in the sense that no little children are learning them at their mothers' knees, as we say in English, and no children have learned them in that most natural way for many, many centuries. But they are still in use; in some communities, in very lively use; some communities even make an effort to keep these languages up-to-date, by providing them with new vocabulary items (based very ‘properly’ upon appropriate lexical material and morphological processes) to refer to new inventions, concepts, and technological developments. The result is that it is quite feasible for certain suitably educated people in Rome today to talk about buses, trains, telephones, televisions, etc. in Latin. And i am assured that they do. There aren’t many of them, but they exist.

21 Granted, a fair amount of such disagreement may be motivated by non-linguistic considerations which we may not consider very substantial or persuasive. For instance, you are apt to find some Hindus who will passionately deny that Sanskrit is, or can ever be, a ‘dead’ language on the grounds that it is the Language of the Gods, the language that is the Foundation of the entire Universe, etc. Such ultimately religious or otherwise non-linguistic considerations are of little relevance here except insofar as they can (as they often do) provide motivation for the maintenance, and possible ‘revival’, of an otherwise ‘dead’ language.

Another very important example of such a language is Hebrew. According to the usual definition, Hebrew was a ‘dead’ language for over 2000 years. But it’s very much a living language now. This is an important difference between linguistic and biological evolution. With respect to biology, it is often said that ‘extinction is forever’; once a species becomes extinct, there is no way we can bring it back to life (Jurassic Park and The Eyre Affair to the contrary). But we know that this is not true of language, and Hebrew is the proof; this language was dead for over 2000 years, but is clearly alive today, by any definition.

But it isn’t just any ‘dead’ language that can be ‘revived’. The Zionist movement was able to revive Hebrew because so much was known about the language. Keeping a language alive requires a community that will use it, use it for ordinary, day-to-day affairs, and that will teach it to their children. Reviving a dead language requires that (a community willing to treat it this way) and enough knowledge about what the language was like when it was alive to bring it back to life; knowledge about both its vocabulary and its grammar, the two things which i told you last term combine to characterise a language. Latin and Sanskrit can be maintained in a rather artificial life nowadays not only because there are (very small, granted, but viable) communities willing to use them at least for specialized uses but also because these communities know enough about the vocabulary and grammar of these languages to be able so to use them. Hebrew was brought back to life by people who knew enough about its vocabulary and grammar to know what it would be like if it were being used nowadays as a living language by a real human community.

There is a movement afoot in Britain to revive the Cornish language, a Celtic language that died in the 18th century.22 Again, this movement’s chances of success depend in part on the fact that enough is known about what the Cornish language was like when it was alive to be able to establish what it would be like if it were revived.

If a species becomes extinct it will never exist again. If a language dies it can, in theory, be revived, but only if we know enough about it before it dies. This is why many linguists are committed to finding out as much as they can about the endangered languages of the world today.23 We don't hope that we can save every one of them from death. But we hope that, if the descendants of the communities that are now using or have in the past used these languages should ever want to revive them, we will be able to provide them with the necessary information.

↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑隨便↑↑↑

Lumpers & Splitters

In Figs. 22.8–10 i gave you family trees of several language families, but there are many others that have been recognized or discussed by comparative linguists. In Fig. 22.11 on the next page i give you a list of some of the other languages of the world and the families they have been classified into. I would point out that some of these classifications are still rather controversial.

22 The Cornish language actually has its own gravestone; i've seen a picture of it. It marks the grave of Dorothy Pentreath, who died in 1777, ‘said to have been the last person who conversed in the ancient Cornish’.

23 There is, in fact, a professional organization of linguists, the Endangered Language Foundation, of which i am a charter member, devoted to this task.

Major Families      Sub-families          Representative Languages      Where Spoken?
Afro-Asiatic        Berber                Kabyle, Shluh                 NW Africa
                    Chadic                Hausa                         Nigeria
                    Cushitic              Oromo, Somali                 NE Africa
                    Semitic               Arabic, Hebrew                Middle East
Algonquian                                Cree, Ojibwa                  Eastern N. America
Andean-Equatorial                         Quechua, Guaraní              Central S. America
Altaic              Manchu-Tungus         Manchu, Evenki                NE Asia
                    Mongolian             Mongol, Buryat, Kalmyk        Northern Asia
                    Turkic                Turkish, Uzbek, Uighur        Turkey, Central Asia
                    Japanese-Korean       Japanese, Korean              Japan, Korea
Austro-Asiatic      Mon-Khmer             Vietnamese, Cambodian         SE Asia
                    Munda                 Mundari, Santali              Eastern India
                    Nicobarese                                          Indian Ocean
Aztec-Tanoan                              Aztec, Hopi                   SW North America
Caucasian           Abkhazo-Adyghian      Kabardian, Adyghe, Abkhaz     W-Central Asia
                    Kartvelian            Georgian                      W-Central Asia
                    Nakho-Dagestanian     Chechen, Avar, Lezghian       W-Central Asia
Dravidian           North                 Brahui, Malto                 Pakistan, India
                    Central               Telugu                        Southern India
                    South                 Tamil, Malayalam, Kannada     Southern India
Eskimo-Aleut                              Yupik, Inuit, Aleut           Extreme N. America
Hokan-Siouan        Hokan                 Tlapanec                      Eastern N. America
                    Iroquoian             Cherokee, Mohawk              Eastern N. America
                    Siouan                Crow, Lakota                  Central N. America
Khoisan                                   Kwadi, Sandawe, !Xu           Southern Africa
Na-Dené                                   Apache, Navaho                Western N. America
Niger-Congo         Benue-Congo           Swahili, Rwanda, Efik         Central Africa
                    Kwa                   Yoruba, Igbo                  Western Africa
                    Mande                 Bambara, Malinka              Western Africa
                    Voltaic/Gur           Mossi                         Western Africa
                    West Atlantic         Fulani                        Western Africa
Nilo-Saharan                              Dinka, Masai                  Eastern Africa
Penutian                                  Mayan, Quiché, Araucanian     South America
Uralic              Finno-Ugric: Finnic   Finnish, Estonian, Saame      NE Europe
                    Finno-Ugric: Ugric    Hungarian                     East-Central Europe
                    Samoyedic             Nenets                        Siberia

Fig. 22.11: Some Other Language Families

Fig. 22.11 mentions both the ‘Altaic’ and the ‘Uralic’ families. Around the middle of the 20th century it was commonly believed that these two families were actually members of a larger family, the ‘Ural-Altaic’ family, and you may find references to this family in older books. But i don't know of any responsible comparative linguists today who believe in the existence of such a family.

The Altaic family itself has faced some serious challenges within the community of comparative linguists. Many people have brought strong arguments suggesting that the three recognized sub-families that supposedly make up the Altaic family (Manchu-Tungus, Mongolian, and Turkic, about whose individual existence there is no doubt) resemble each other not because of descent from a common ancestor but rather because of extensive borrowing over long periods of time, the result of long-term language contact. The relationship of Japanese and Korean to any of these languages is much debated, the discussion being unfortunately obscured by nationalist chauvinism (沙文主義) on the part of even many well-educated Japanese, who apparently resent any suggestion that their nation or ethnic group has anything significant in common with any ethnic group on the mainland.

Many attempts have been made by many linguists, most notably Joseph Greenberg, to group all the Native American languages into a small number of large families. Such attempts have not so far proved very convincing. And there have been even more ambitious attempts to group together most or all of the languages of the world into one or at most a very few very large families; i shall be talking more about these in Chapter 25. It has become apparent that some linguists tend to be predisposed by temperament to try to develop large-scale classifications like these, while others tend by temperament to be very sceptical about any such thing. We sometimes call the first group ‘lumpers’ because they like to ‘lump’ languages together, often on evidence that strikes others as inadequate, and the second group ‘splitters’ because they tend to want to split up language families that other linguists are comfortable with.24

There are, however, languages which have resolutely resisted all such attempts, languages which nobody has been able to affiliate convincingly with any other languages at all. These languages are called isolates. One isolate is Ainu, spoken on Hokkaido, the northernmost of the major islands of the Japanese archipelago. Another is Burushaski, spoken in the Himalayan highlands. But the best-known example of a language isolate is Basque. Basque is spoken in the western Pyrenees, the mountain range separating France and Spain. It has been there for over 2000 years, since before the Latin language came to the area and evolved into the Romance languages that dominate the area today. And Basque has no known relatives. Many attempts have been made to affiliate it with some other language, either living today or known to have existed in the past; all such attempts have failed. You will probably hear of this from time to time: the affiliation of Basque is, in many respects, the Holy Grail of historical and comparative linguistics, and, as with the Holy Grail, a great many people claim to have found it but have failed to convince many others of their story. My teacher used to joke, whenever anybody mentioned a language isolate like, say, Burushaski or Ainu, that if such a language was not demonstrably related to any other language it must therefore be related to Basque.

24 Note that, before he started working seriously on Native American languages, Joseph Greenberg had worked hard for many years on the classification of African languages. Most Africanists find his macro-classification in that area impressive and, on the whole, reliable, and they are therefore frustrated at the poor reception his macro-classification of Native American languages has received among Americanists. It is not, however, simply the case that Africanists tend to be lumpers while Americanists tend to be splitters; it’s also true that Africa has been demographically fairly stable for a long time, while demographic stability is almost completely unknown in America. The result is that there has been plenty of opportunity for glossogenetic relationships among Native American languages to be obscured (possibly to the point of total irrecoverability) by various kinds of language contact over the millennia.
