Comparative Methodology for Distant Relationships in North and South American Languages1

Mary Ritchie Key, University of California, Irvine

ABSTRACT

Comparative linguistics has been an indispensable tool for classifying languages for the last couple of centuries. The method has been successfully used for grouping languages with obvious relationships; it is also used, though less successfully, for identifying more distant relationships. An example is the placing of Hittite with the Indo-European languages. Languages with close relationships show a good deal of regularity in their reflexes from former states. More distant relationships exhibit fewer and fewer regularities, and thus comparative methodology is less secure. I propose that there are other linguistic features that can be observed in the application of comparative methodology to distant relationships. For example, the phonetic variants of one language may show patterns of language change which operate at the phonemic level in other languages. Another feature to observe is the pattern of fluctuations which occur between phonemes, as well as between phonetic variants. Still another structural feature which can maintain parallels between related families is the distribution pattern - phonemes within morphemes and within words, and the potential consonant and vowel clusters. One might also study “loanwords” (not always easily identifiable), to see if they exhibit patterns that reflect borrowings between related languages. At the semantic level, one can study the various meanings within cognate sets - if the same set of meanings occurs in another group of languages, it may reflect common history. Finally, one can study the patterns of reflexes of the obviously related languages. Even if a proto form has been reconstructed, one must also study the actualizations in the various languages. Proto forms2 can obscure useful identification markers, especially if they have been wrongly reconstructed. I maintain caution in proposing proto forms for distantly related languages.

INTRODUCTION

The name of Jakob Grimm is famous beyond linguistics because of the breakthroughs in comparative linguistics in the nineteenth century. Regular sound laws and regularity between sound correspondences were unquestionably demonstrated by many scholars, although it was Grimm’s name that was popularized. For over a century now, the discipline of linguistics has been developing on the basis of this “scientific” approach to the study of languages. I am now proposing that comparative linguistics go beyond Grimm and deal with what appear to be irregularities - the “unsystematic” and “sporadic” “residue” - the “unique” examples. The aberrant expressions have resulted in an enormous variety of dialects and communities of languages among the peoples of the world. Studies of these phenomena bring together the theories of dialects (sociolinguistics) and historical linguistics.

COMPARATIVE METHODOLOGY

For purposes of discussion in the comparison of languages, we can review the steps of procedure. The first step is a hunch or a hypothesis set forth that results from noticing similarities between two or more languages. The next step involves gathering evidence - a substantial list of possible cognate forms. These must have both phonetic and semantic similarities. In addition to lexical resemblances, one also looks for similarities between the grammatical structures. The final step is the essence of comparative methodology. It is the most detailed and tedious to accomplish. This requires the accountability of every sound correspondence between two (or more) languages and how the sound system of one language corresponds to the sound system of the other language. For example /p/ in Language A corresponds to /b/ in Language B. We can write the symbols p : b for this formula. When the two systems correspond with predictable regularity, then the comparative method has produced the proof that the languages in question have evolved from the same proto language. Similarities between languages are not enough to establish genetic relationship. Any languages around the world might exhibit certain typological similarities, but typology is not history. The proof comes with the final step referred to above - a study which is painstaking and time-consuming. Because it is so demanding, there are actually few detailed comparative studies in languages of the world (other than Indo-European) in contrast to the number of good descriptive studies. In fact, to accomplish a comparative study, one must be in control of two disciplines: descriptive and comparative. Along with establishing the formulae of correspondences between the related languages, and showing their regularity under various conditions, the comparative linguist may proceed further and posit the sounds of the proto language from which the derived languages originated. 
This involves reconstructing a hypothetical proto sound system. Proto forms are starred, to indicate that they are hypothetical. Thus, we can say that *p became /b/ in such and such language. No one living has ever heard a proto language; it is an assumption based on known facts about languages. The proto product must be like a real language; it must follow the rules of natural phonology. At the turn of the century, Vendryes spoke of “general phonological tendencies,” “natural tendencies,” and a “perfectly natural way” (Vendryes [1902] 1972). One must have a good deal of information about the derived languages in order to reconstruct a proto system based on them. The proto forms are posited on the basis of the distribution patterns of the series of stops, fricatives, nasals, and so forth, and the relationship of the various series to each other. Proto forms should not be posited simply because the formulae are different, any more than one would posit a different phoneme for every different allophone in a descriptive study. For example, two languages might have the sets of correspondences p : p and p : b. One does not posit *p and *b simply because the formulae are different. By examining the data in which these sounds occur, one might find that p : p occurs initially in the examples, and p : b occurs medially in the examples. In such a case, one would posit a single proto form, *p, and indicate that in the first language the sound was maintained as originally, and in the second language, the proto sound split, under the circumstances: p initially, and b medially. Thus,

*p > Language A /p/ : Language B /p/ (initially), /b/ (medially)

I am dwelling on these rigorous procedures at some length, because there are some misunderstandings of the comparative method which cause confusion and bring it into disrepute. Thus, other scholars might question how scientific the method really is, and how reliable as a tool for reconstructing history. Unfortunately, and unscientifically, proto forms are often presented as accepted and verified conclusions. This is simply not so; they are hypotheses, to be challenged and tested with new information. Proto forms are suggestions as to how the former language might have looked; they are not facts; they are not the proof of common origin. It is the regularity of the correspondences which establishes the historical affinities. Further, proto forms should not be reconstructed on a one by one basis. This produces a grotesque and unreal bunch of sounds. A convincing proto system is reconstructed on the basis of how the sets of sounds (stops, fricatives, etc.) function within a dynamic system, and how they are distributed within that system. The methodology briefly outlined here can be applied with relative ease to languages which are closely related and which provide enough data for the investigator to find most of the rules which govern the sound correspondences - the reflexes of former states. The matter of applying comparative methodology to distantly related languages is more difficult. Because of the nature of distancing, there are fewer data. Nevertheless, I am suggesting in this presentation that one can use much of the information necessary for the reconstruction of the closely related languages, to extend this precise and reliable methodology, and to make tentative observations about distantly related languages. I will mention a few of these linguistic features that can be useful in such an application.

Detailed phonetic information may lead one to recognize phonological rules at different levels in other languages. The phonetic variants of one language may be governed by rules which operate at the phonemic level in related languages. The Tacanan languages have such an example (Key 1968a). In Tacanan-Cavineña, semivowel /w/ has two allophones: voiced fricative [b-] when it occurs before front vowels, and allophone [w] elsewhere. Tacanan-Tacana has these two sounds as separate phonemes, contrasting in these same circumstances, rather than in complementary distribution. But in the cognates reflected from the former proto language, Tacana /b/ occurs in cognate sets only before front vowels, and Tacana /w/ occurs in cognate sets before /a/. The rules parallel the rules found in Tacanan-Cavineña at the phonetic level (Key 1968a: 23, 38). Therefore, I have reconstructed only one phoneme for the proto system:

*w > Cavineña /w/ : Tacana /w/ (before /a/), /b/ (before front vowels)
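The reasoning behind positing a single proto phoneme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the observations are the schematic p : p and p : b formulae discussed above, not field data, and the function names (`environments`, `complementary`) are hypothetical. The check simply asks whether two formulae ever occur in the same environment; if they never do, they are candidates for one proto form with a conditioned split.

```python
from collections import defaultdict

def environments(observations):
    """Map each correspondence formula (a, b) to the environments where it occurs."""
    envs = defaultdict(set)
    for sound_a, sound_b, env in observations:
        envs[(sound_a, sound_b)].add(env)
    return dict(envs)

def complementary(envs, formula_1, formula_2):
    """True when the two formulae never share an environment."""
    return envs[formula_1].isdisjoint(envs[formula_2])

# Schematic observations (illustrative only): language A sound, language B sound, position.
observations = [
    ("p", "p", "initial"),
    ("p", "p", "initial"),
    ("p", "b", "medial"),
    ("p", "b", "medial"),
]

envs = environments(observations)
# p : p and p : b are in complementary distribution, so one proto form (*p) suffices.
one_proto_form = complementary(envs, ("p", "p"), ("p", "b"))
print(one_proto_form)  # True
```

A result of False would mean the two formulae contrast in at least one environment, in which case a single proto form could not be justified on distribution alone.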

Another feature to observe is the pattern of fluctuations (variant pronunciations) which occur between phonemes, as well as between phonetic variants. In Tacanan-Chama (unusual for its extraordinarily large number of fluctuations) the patterns of fluctuation follow the patterns of correspondences between the sister languages (Key 1968b). For example, /t/ fluctuates with /k/ (t ~ k), and the correspondence is t : k. Likewise, (s ~ h) and s : h.

Still another structural feature which can maintain parallels between related families is the distribution pattern - phonemes within morphemes and within words, and the conditioning factors. The following patterns are drawn from the Tacanan languages of South America, and Uto-Aztecan-Hopi (UAzHopi) of North America. They are suggestive of similarities of distribution patterns in the languages used in my studies, that is, Tacanan, Panoan, and Quechumaran. Tacanan velar stops are analyzed as follows (Key 1968a: 35-6):

        Cavineña    Tacana    Chama
*k      kw          k, kw     kw

Proto Tacanan *k has reflex kw in Cavineña and Chama in all environments. In Tacana it has reflex kw when it occurs before /a/ in all positions and before /i/ in initial position. It has reflex k elsewhere. Uto-Aztecan has two proto velar stops: *k and *kw, with reflexes in UAzHopi as follows (Voegelin, Voegelin, and Hale 1962: 51-2):

        Hopi
*k      k (before high vowels)
        q (before low vowels)
*kw     kw

The following examples3 include the related , and presumably related Quechumaran languages. Though the details are not yet worked out, these and other examples are suggestive of the patterns outlined above. The analysis is further complicated by the Quechumaran languages, which also have two positions, velar and back velar, and modification by aspiration and glottalization. Other examples are presented in the word list of over 500 sets, in Key (in preparation):

Uto-Aztecan *k > Hopi /k/:
Cortar ‘cut’ *UAz *siki ~ *sika; *Tacn *siki-.
Leña ‘firewood’ *UAz *ku; **Quechumaran **k’ulyu.
Perro ‘dog’ *UAz *puku; TacnReyesano pako.

Uto-Aztecan *k > Hopi /q/:
Dar ‘give’ *UAz *maka; *Q *qu-.
Oreja ‘ear’ *UAz *naka; *Pan *n'ika- (hear).
Pierna ‘leg, thigh’ *UAz *kasi; *Pan *kisSi.
Sentar ‘sit’ *UAz *kati; PanWariapano yakati.

Besides these structural characteristics, one might also study “loanwords” - not always easily identifiable - to see if they exhibit patterns that reflect borrowings between related languages. (To be sure, there are still unsolved problems in determining how close the relationship must be, in order to be designated “borrowings from related languages.”)

At the semantic level, one can study the various meanings within cognate sets. If the same set of meanings occurs in another group of languages, it may reflect common history. For example, both North and South American languages include combined meanings in cognate sets: ‘plant, bury’ and ‘point, nose’. Of course, it could also reflect common universals of human beings’ relationships with each other and the environment. But if these meaning sets also have phonetic similarities, there is less likelihood of chance.

In my data base for examining distant relationships I have filed data from over sixty languages from a limited group of language families. I have filed the data under a basic, core vocabulary and have, thus, grouped together morphemes with closely related meanings, even though the actual vocabulary in the published dictionaries and word lists may be glossed differently. I have filed without regard to grammatical class, so that nouns, verbs, and modifiers may occur in the same semantic set: ‘broom, sweep’; ‘spit, saliva’. I have filed together words with closely related meanings, such as: ‘ice, freeze, frost, snow’; and ‘to see, to look for, to observe’. To the dismay of biologists, I have put together the toads and the frogs, and the mice and the rats! In this way, languages can be quickly scanned to find phonetic similarities, with enough semantic similarity to be plausible in natural language development.

I believe that one of the major reasons that has impeded analysis of distantly related families among American Indian languages is that the data are too enormous and scattered. Having to hunt through dictionaries for words with related meanings becomes an impossible task. Just simply the mechanical difficulties have hindered and kept us from seeing that there are not just a few lexical resemblances, but hundreds of them. Limitations of time and limitations of language expertise in the distinctly different (but possibly related) families have kept us from amassing enough data to be able to analyze for sound correspondences. Therefore, I have spent considerable time organizing the files using a basic core vocabulary that brings together lexical items in close semantic relationships. It should also be recognized that while there might be errors in matching “resemblances,” thereby producing spurious sets, at the same time it is possible to pass over real cognates. Thus, the law of averages helps, in its own way.

Besides these features that can be used as heuristic devices for recognizing affinities across distance, one can scrutinize the patterns of reflexes of the obviously related languages. Even though a proto form has been posited, one should not limit oneself to observing only at that level. The proto reconstruction is a model that gives insight into the problem, but inflexible adherence to a model can be deceiving. Proto forms can obscure useful identification markers, especially if they have been wrongly reconstructed; therefore, the actualizations in the various languages should also be studied. It goes without saying, I must emphasize, that reflexes at a lower level of the phylum should not be used to reconstruct at a higher level, that is, one should not mix the levels in reconstructing. Observations at different levels, however, can be used for identifying structural similarities, and thereby be useful for gathering data for analysis. The following example of s : h in both North and South American languages illustrates this, where UAzPapago and Tacanan-Cavineña both reflect /h/ from a proto sibilant form:

Cortar ‘cut’
*UAz *siki ~ *sika; UAzPapago hi'ik
*Tacanan *siki-; Tacanan-Cavineña hikwi-

In South America, the languages show that /h/ derives from more than one proto form. A Uto-Aztecan example also suggests a multiple development. Note the correspondences of h : s, and s : s in:

Pierna ‘leg, thigh’
UAzVarohio kaahsi
*Panoan *kis£

To illustrate further, let us take a hypothetical situation dealing with languages which are familiar to us. In order to get the perspective of how it is to work with preliterate languages, let us assume only present knowledge of extant languages, and no previous knowledge of the history of Indo-European languages. Suppose a linguist from China goes to Spain and becomes fluent in Spanish. He or she then travels on for further study in Germany. With notebooks and word lists in hand, the linguist soon discovers that a Spanish /p/ corresponds to a German /f/ in such words as: pez : Fisch; pie : Fuss; pulga : Floh. He then dedicates himself to the task of reconstructing the proto system for these languages, and establishes a *p, from which derived Spanish /p/ and German /f/. (Remember that our hypothetical linguist does not have access to the historical records of Latin and the scholarly studies of Proto Germanic.) Then, our hypothetical linguist moves on to the New World - that exotic land across the waters where strange people wear cowboy hats and make movies. After a few days of field work, our Chinese linguist discovers that this exotic cowboy dialect has a sound /f/ which is similar to the sound he heard in Germany. He sits down in his study and makes a list of all the words he can find in his German and Spanish notebooks where /f/ and /p/ occur. He then returns to his field notes of the New World language to scrutinize the data. From the list of words with New World /f/, he tries to match vocabulary items from the Old World with either a /p/ or an /f/. If he finds a reasonable sample of lexical items in these three languages that have both phonetic (p or f) and semantic similarities, then he can presume that these people had been in contact at some time in the past. Our hypothetical linguist returns to China, and from his comfortable armchair, he hypothesizes about the origins of the light-skinned, blue-eyed, and the olive-skinned, dark-eyed people of the Old and the New Worlds.
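The filing scheme described earlier in this section (grouping differently glossed entries under one core-vocabulary heading, without regard to grammatical class) might be sketched as follows. Everything concrete here is an assumption made for illustration: the heading names in `CONCEPTS`, the language labels, and the placeholder forms are invented and do not come from the data base itself. Only the groupings ‘broom, sweep’, ‘spit, saliva’, toads with frogs, and mice with rats are taken from the text.

```python
from collections import defaultdict

# Gloss -> core-vocabulary heading. The heading names are hypothetical labels.
CONCEPTS = {
    "broom": "SWEEP", "sweep": "SWEEP",
    "spit": "SPIT", "saliva": "SPIT",
    "toad": "FROG", "frog": "FROG",
    "mouse": "RAT", "rat": "RAT",
}

def file_entries(entries):
    """File (language, gloss, form) entries under a shared semantic heading."""
    files = defaultdict(list)
    for language, gloss, form in entries:
        # Glosses with no listed heading are filed under their own name.
        heading = CONCEPTS.get(gloss, gloss.upper())
        files[heading].append((language, form))
    return dict(files)

def comparable_sets(files):
    """Keep only headings attested in at least two different languages."""
    return {
        heading: items
        for heading, items in files.items()
        if len({language for language, _ in items}) >= 2
    }

# Placeholder entries (invented): the forms are dummies, not real words.
entries = [
    ("LangA", "broom", "form-1"),
    ("LangB", "sweep", "form-2"),
    ("LangA", "toad", "form-3"),
    ("LangB", "frog", "form-4"),
    ("LangA", "dog", "form-5"),
]

files = file_entries(entries)
sets_to_scan = comparable_sets(files)
print(sorted(sets_to_scan))  # ['FROG', 'SWEEP']
```

The point of the design is only that entries glossed differently in the sources end up side by side, so that phonetic resemblances can be scanned within each semantic set.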

RESIDUE

Residue = ‘that which remains’. Even though it is universally accepted that language has structure, and comparative linguistics has proved that there is regularity in language change, natural language contains a baffling array of irregularities. Grimm called these (translated, of course) “jumbled relics” (Lehmann 1967: 56). In fact, at the beginning stages of a comparative study, it may appear that the residue is more prominent than the actual patterns and regularities that are so beautifully and awesomely organized into a system. The residue appears at every level: phonological, grammatical, lexical, and semantic; and the examples defy analysis. Examining the residue may be analogous to observing non-normal behavior in order to get more insight into normal behavior.

The beginning of a comparative study means identifying the formulae of sets of correspondences between the languages. In closely related languages, the sets of correspondences are usually found in dozens of words or morphemes, for example, English /t/ corresponds with German /s/ in foot : Fuss; water : Wasser; and passim. But other pairs of sounds may occur without such pleasant regularity. One, or even two examples of a pair of sounds, does not necessarily manifest a set of correspondences. A “critical mass” is required before one can be sure that the formula results from common history. There is no agreed upon criterion regarding the number of examples which are necessary for a formula to be accepted as part of the comparative structure. In the discipline of economics, observers indicate that they require at least three reports of the same thing happening before they will say that there is a “trend.” When I am looking for correspondences in making language comparisons, I require at least three examples before I consider the formula to be a possible set of correspondences. If I have three, then I look for more; related languages will probably have more. Only three or so might be coincidental, or an indication of borrowings. English, for example, has some lexical items that exhibit a /p/ instead of an /f/ for the morpheme meaning ‘foot’, such as in ‘pedal, pedestrian’. Because of our literature and written history, we know that these are borrowings from Latin.

When working with unwritten languages one does not have the luxury of recorded history to illuminate the residue. But still, we can gain a little insight into history by analyzing the irregular forms. Some regularity in the residue may be attributed to borrowings from related languages, such as the Latin example above. Patterns of fluctuation (optional shifts of pronunciation) may exhibit correspondences from sister languages. Unique examples give witness to casual migrations. In other words, the residue appears as a scattering across the language - evidential witnesses to the fact that “Kilroy was here!”
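The three-example working criterion described in this section can be expressed as a simple tally. The sketch below is illustrative only: the function names and the data layout are hypothetical, the pairs foot : Fuss and water : Wasser come from the text, and the third pair (eat : essen) and the pairing pedal : Fuss are added here solely to make the toy data complete. Each observation is assumed to have been aligned by hand beforehand.

```python
from collections import Counter

MIN_EXAMPLES = 3  # the working threshold stated in the text

def tally(observations):
    """Count how often each formula (sound_a : sound_b) is attested."""
    return Counter((a, b) for _, a, b in observations)

def candidate_formulae(observations, minimum=MIN_EXAMPLES):
    """Formulae with enough examples to be considered a possible set."""
    return {f: n for f, n in tally(observations).items() if n >= minimum}

def residue(observations, minimum=MIN_EXAMPLES):
    """Formulae below the threshold: possible chance, borrowing, or unique cases."""
    return {f: n for f, n in tally(observations).items() if n < minimum}

# (word pair, English sound, German sound); alignment is assumed to be done by hand.
observations = [
    ("foot : Fuss", "t", "s"),
    ("water : Wasser", "t", "s"),
    ("eat : essen", "t", "s"),   # added for illustration
    ("pedal : Fuss", "p", "f"),  # added for illustration; a Latin borrowing in English
]

print(candidate_formulae(observations))  # {('t', 's'): 3}
print(residue(observations))             # {('p', 'f'): 1}
```

Passing the threshold only marks a formula as worth pursuing; as noted above, three examples alone might still be coincidental or reflect borrowing.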

WORD FORMATION

A typical syllable in the American Indian languages that I have been examining in my comparative studies is made up of a consonant and vowel, CV (Key 1968 and following). This type of syllable may constitute a morpheme. It is possible that this is a pristine type in word-formation development. The single-syllable morpheme can be considered as a basic building block to make up one, two, and three syllable stems and words. Other canonical shapes can take form from this basic structure with additions and deletions of vowels and consonants, forming clusters of consonants and vowels. The following examples serve to illustrate the variety of CV patterns that result from language change. The comparisons given are from related languages. They represent very common patterns throughout the languages.

Estómago ‘stomach’ Quechua-S, Quechua-A wiksa; Quechua-T iksa.
Caña ‘cane’ Panoan-Amahuaca paka; Tacanan-Chama eka.
Sepultar ‘bury’ **Quechumaran **p’amp’a-; *Tacn *papa.
Sentar ‘sit’ *Tacn *ani-; *Panoan *kwina [n/mV].
Seis ‘six’ Quechua-T sukta; Quechua-Huaylas okta; TacnTac sokota.
Amontonar ‘pile up’ *Quechua *uqlya-; Aymara iqi-.
Borracho ‘intoxicated’ AzZacapoaxtla wiinti-; AzClassical iwintia.
Mujer ‘woman’ AzZac siwaa- (woman); AzZac swaapii- (girl); AzTetelcingo sowa-.

An illuminating way of observing comparisons of syllables is to line them up vertically:

‘widow’

Sanskrit             vi    dha   va
Old Church Slavic    vi    do    va
Gothic               wi    du    wo
Irish                fe    d     b
Latin                vi    du    a
Spanish              vi    ud    a
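A rough mechanical approximation of this vertical display can be sketched in Python. It is only a crude aid and not the hand alignment used in the table: it cuts each form after every vowel, treats a, e, i, o, u as the only vowels (an assumption), and will not reproduce segmentations that depend on knowing which pieces are cognate. The forms for ‘six’ are the ones cited in the examples above; the function names are hypothetical.

```python
import re

VOWELS = "aeiou"  # assumption: a plain five-vowel inventory
# One chunk = any consonants followed by one vowel, or trailing consonants.
CHUNK = re.compile(rf"[^{VOWELS}]*[{VOWELS}]|[^{VOWELS}]+$")

def cv_chunks(form):
    """Split a form into consonant(s)+vowel chunks, left to right."""
    return CHUNK.findall(form)

def line_up(rows):
    """Return one text line per (label, form), with chunks in padded columns."""
    label_width = max(len(label) for label, _ in rows) + 2
    lines = []
    for label, form in rows:
        cells = "".join(chunk.ljust(6) for chunk in cv_chunks(form))
        lines.append(label.ljust(label_width) + cells.rstrip())
    return lines

rows = [
    ("Quechua-T", "sukta"),
    ("Quechua-Huaylas", "okta"),
    ("TacnTac", "sokota"),
]

for line in line_up(rows):
    print(line)
```

The output puts su / kta, o / kta, and so / ko / ta in successive columns, which already makes the extra vowel in TacnTac sokota visible against the cluster in sukta and okta.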

The following examples are from the Uto-Aztecan and South American languages of my recent comparative studies (q.v. for abbreviations).

cocinar ‘cook’
**UAz *kwa si
Ay ka n ka
*Tacn *si na
TacnChm hi ha
TacnChm kwa kwa
Mosetene ka na ki
Map wai kwi

costurar ‘sew’
*UAz ma
*Q *si
Ay tu ku
Ay C9U ku
TacnCav to
TacnTac ro so
TacnChm so ko
*Pan (reordered) si *kP
*Pan (needle) su mu si
Mos so so
Mos co [?] mo

escupir ‘spit’
*UAz *tu
*UAz *ci
**UAz 9a
UAzComanche tu si
UAzTub tu hu 9a
UAzHopi to ha
*Q *thu qa
**QAy **thu Ca
Ay thu su n k*ea
Ay thu sa
TacnCav kwe di
TacnTac ki to a
TacnChm kwi to
PanCshb tu su ka
Pan na ka mo9
PanChac (reordered) so ko
Pan SC mi tu
Mosetene so ho
Map to f k i

estómago ‘stomach’
*UAz *po ka
V AzZac po s
UAzHopi po no
UAzSP sa P»
UAzTub sa pu s
*UAz Q pu su
Ay pu ra ka
TacnTac ma
TacnTac
TacnChm y see
TacnRey ma so mo
*Pan *po ko
Pan Yam po s to
PanAma pu wi
PanlC pu su
PanChan a to
Map P* tra
Map pu e
Yuracare su tu na
ChonSelk q’a t*
ChonOna ka t

[r:w]

In the processes of language development through the centuries there are subtle changes or loss of meanings. Meillet ([1925] 1967: 48) gives an example that is clarified by our knowledge of the past. It is possible that the Indian languages have developed along similar lines - losing syllables, sounds, and subtle meanings.

hiu - tagu ‘this day’
hiu - tu (Old High German)
heu - te (Mod German) ‘today’

Word formation exhibits tautology - a needless repetition of an idea. This redundancy is seen at the level of morphology, as well as at the phrase and sentence level: hound-dog; to go out the exit; a tiny, little bug; in today’s modern world. Skeat discussed the matter of affixes in this aspect of word formation in his etymological dictionary of English (Skeat [1882] 1980: 630-31, q.v. for examples):

One of the most remarkable points is that most Indogermanic languages delighted in adding suffix to suffix, so that words are not uncommon in which two or more suffixes occur, each repeating, it may be, the sense of that which preceded it.

Besides pleonasms within a language, cross-cultural migrations or visitations produce other kinds of repetitions: the hoi polloi, el algodón. These “true facts” result from languages in contact. This explanation may account for many baffling examples in the American Indian languages, which seem to exhibit morphemes of the same meaning in word construction (examples cited in Key 1981a). One can find numerous examples of this in place names, showing the coming together of two or more peoples and settling into a community: El Prado Meadow, Rio Grande River, La Brea Tar Pits, Sahara Desert.

DIFFICULTIES IN RECOGNIZING COGNATES

The reasons for the difficulties in identifying possible cognates, as against chance resemblances, are many and exceedingly complex. So complex, in fact, that even though suggestions for relationships across the continents have been set forth for at least a couple of centuries, none has been taken seriously. Before the advent of comparative studies in non-Western languages, it would have been impossible to do more than suggest lexical resemblances. With this generation, enough comparative studies have become available, so that it is now possible to test comparative methodology beyond closely related languages. And it is also possible to push the frontiers of discovery by extending the methodology to distantly related languages.

While some of the difficulties might not even be recognized so far, some are obvious. I will mention a few of these.

Vowel variation is common among the Indian languages that I have been studying. The following example is recorded from several dialects of Quechua. Ampolla ‘blister’ *Quechua *phuslyV; Q-B phuslyu; Q-C phusulyi/u; Q-T puslya.

Metathesis is also common among these languages. It occurs (or is reliably recorded) in varying degrees in the different languages, and not enough data are available to be assured of a correct analysis. This is an aspect that must be seriously dealt with before reliable statements can be made about correspondences. See examples in Key (1980-1981).

Taboo practices also contribute to the difficulties of finding forms to work with. This is a well-known difficulty which is cited as a problem in working with glottochronology or lexicostatistics; word taboos skew the counts of lexical resemblances. Suarez (1971) gives such an account of word taboos in the Chon languages.

Different sources of material (strangely enough) also obscure the picture in comparative studies. I noted this in summarizing the Arawakan materials (Key 1979: 75). Though I am not using in my studies, it is possible that a similar phenomenon could contribute to difficulties in finding cognates.

Orthography is undoubtedly part of the problem, when working with library materials versus field materials. Symbols can obscure, as well as enlighten. These artifacts of advanced civilizations at times slow our progress and, incidentally, are among the reasons that we have not been able to use computers to help in the tedious hunt for cognates. See Frantz’ account (1970). In my study of Mapuche (Araucanian) I noted the problems with orthography (Key 1978a: 281). I found three recordings for the word fajita ‘ribbon, band’: guton, nitrohue, and nguchrohue. The letters g, n, and ng were used for the sound [ŋ]; u = i; t = chr = tr (the retroflexed stop); o = o; the verbal suffix -n is deleted when another suffix is added. In the recording of this morpheme, then, only one orthographic symbol is the same in all three recordings.

It is an acknowledged fact, then, that finding cognates in distantly related languages is difficult, if not impossible at times. And I have explained a few of the reasons. What is not so easy to explain is why it is sometimes difficult to find cognates in closely related languages. Tacanan-Chama is a case in point (Key 1979: 16). Chama /t/ is a fairly common phoneme in the language with no obvious problems in its description. In the comparative studies, however, it does not derive from Proto Tacanan *t, which is reflected as /k/ in Chama, but it apparently derives from some proto sibilant(s) or assibilated sound(s). In the reconstruction, I had to leave it as tentative, because only a few cognates exhibit it (Key 1968a: 35, 37, 43). It occurs with only one morpheme (a possessive form) and in limited position (stem initial position). The cognate forms in which it occurs are all irregular for other reasons. It is possible that the following sketch might elucidate the derivations. I am using a capital letter to represent a “pre-form,” since I cannot reconstruct conclusively. The sketch shows that Chama /t/ has a relationship with a pre-form T (through fluctuation and forms borrowed from its sister languages), and at the same time it has a relationship with a pre-form S, sibilant or assibilated sound(s) of some kind, as yet unexplained in the analysis:

[Sketch. Legend: Cav = Cavineña, Chm = Chama. Surviving labels: pre-forms S and T; s (Cav); t, s, s, c (?); (Chm); s : t; t ~ k.]
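The orthographic problem noted earlier in this section for the three Mapuche recordings (guton, nitrohue, nguchrohue) can be illustrated with a small normalization sketch. It applies the equivalences exactly as stated in the text (g = n = ng; u = i; t = chr = tr). The rest is assumption: the output symbols are arbitrary labels, treating -hue as the other suffix is a guess made for this illustration, and mapping every n to the nasal is deliberately over-broad and would not be safe beyond these three words.

```python
import re

# Suffixes stripped before comparing stems. "-n" is the verbal suffix named in
# the text; treating "-hue" as a suffix is an assumption for this sketch.
SUFFIXES = ("hue", "n")

# Equivalences as stated in the text. "N" is an arbitrary label for the nasal.
EQUIVALENT = {
    "ng": "N", "g": "N", "n": "N",
    "u": "i", "i": "i",
    "chr": "tr", "tr": "tr", "t": "tr",
    "o": "o",
}

# Longest spellings first, so that "chr" and "ng" win over "t" and "n".
SPELLING = re.compile(
    "|".join(sorted(map(re.escape, EQUIVALENT), key=len, reverse=True))
)

def strip_suffix(word):
    """Remove one known suffix, if present."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word

def normalize(word):
    """Rewrite a recorded form into a single comparison spelling."""
    stem = strip_suffix(word)
    return "".join(EQUIVALENT[m.group(0)] for m in SPELLING.finditer(stem))

recordings = ["guton", "nitrohue", "nguchrohue"]
print({word: normalize(word) for word in recordings})
# {'guton': 'Nitro', 'nitrohue': 'Nitro', 'nguchrohue': 'Nitro'}
```

Once the three spellings collapse to one comparison form, the recordings can be filed together as a single morpheme rather than as three unrelated entries.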

BORROWINGS AND ONOMATOPOEIA

In the files that I have organized, there are multiple forms for some glosses in the basic vocabulary. This suggests borrowing between related languages, as is found between English and French, in the oft-cited examples of pork : pig; veal : calf; mutton : sheep. Other types of examples suggest borrowings that are historically less obvious - from languages which have no apparent relationship. A large percent of the entries consist of two or three, and sometimes even four, different morphemes with the same or similar meanings. In the following examples I am showing a rough equivalent of the phonetic possibility. The morphemes for agua ‘water’ are, roughly: pa, ko, ena/oma. Cortar ‘cut’ siki, teki, pa-, ri-/ra-. Hombre ‘man’ honi-, d/teka, kit/ca, bo. Maíz ‘corn’ sixe, ra/wa, sa/se.

The layerings of vocabulary could be explained, to some extent, by multiple migrations. Migrations or invasions that involved only males might explain some “male-female” languages, as Taylor (in correspondence) has suggested for the Island-Carib. And see Taylor and Hoff (1980) for a further suggestion of a pidgin language. The complex multiple derivations of presumably “borrowed” words across and up and down the Americas speak to many movements of peoples throughout the centuries - not just a single migration that formed a linguistic community. There is a very large vocabulary of debatable origin in the Indian languages, and in the Spanish of Latin America (Key 1966), and even in the English of the Americas. This deserves a study in itself - to show how widespread it is. Etymologies should be challenged, with new information on relationships.

Borrowings have always been a stumbling block to comparative linguists. Quite simply, no linguist knows enough languages in the world to correctly identify all the possible loanwords. Undoubtedly all of us who have reconstructed proto forms have unwittingly reconstructed borrowings; there is no way that this can be avoided - with the limitations of the human mind and the limitations of access to information. The following are a couple of examples from proto studies which I regard very highly; the scholarship of these studies is of very high caliber. Nevertheless, it would appear that these examples might be loanwords from Spanish, rather than reconstructions of indigenous languages. The first example is from Dyen’s study of the Proto Malayo-Polynesian laryngeals (1953: 25):

*kurapu, Tg. kulapo ‘film on liquid’, SeBs. kulapoq ‘vegetable film on water’, SLBs. kulapo ‘kind of seaweed’, Bk. kulapo ‘sea-slime’, Ml., Jv. kĕrapu ‘name of a fish’.

The Spanish word lapa means “vegetable film on surface of a liquid; barnacle; goose grass, cleavers.” The next example is from one of the Proto-Uto-Aztecan studies (Voegelin, Voegelin, and Hale 1962: 144): *cikuri ~ cikori ‘circular’: Papago sikol-k; Huichol ciktri. These forms are suspiciously close to the Spanish word for ‘circle’. These examples do not demonstrate that these studies are not meritorious; they do demonstrate that proto forms are not factual conclusions. They remind us very vividly that collaboration and cooperation are necessary between scholars all around the world - from all language families - in order to understand better the theory of comparative linguistics and language change.

Onomatopoeia is another pitfall for comparativists; nevertheless, I am here proposing that we could also learn something from studying this phenomenon. The following examples come from the Polynesian languages and are glossed as ‘temblar (tremble, shake, quiver)’ (many references - see Key, in press). One example is the proto form *nini. The other example has phonetically distinct actualizations, but is glossed the same: Hawaiian nauwe; Tahitian ueue. These sets, to my Western ear, both sound as though they could have resulted from onomatopoeia; nevertheless, they are different, and could very well indicate borrowings, or somehow historically significant affinities between languages. In other words, even forms which are not supposed to be useful for comparative purposes might be historically interesting.

EXTENDING THE METHODOLOGY TO DISTANT RELATIONSHIPS

The answer to “the great diversity of languages” in , and in pre-Columbian California (U.S.A.), and in Mexico, and Oceania, and, indeed, in the world, is simply that there are many more relationships than have been acknowledged, or discovered. From time to time, investigators have noted resemblances between the reputedly disparate languages, but suggestions for relationships have been disregarded - relegating any resemblances to “sheer coincidence,” or borrowings, or onomatopoeia. Rather than dispense with the persisting resemblances by the hackneyed response of “pure chance,” I prefer to take on the challenge of a theoretical explanation of why there are so many uncanny similarities between so-called “unrelated” languages. By attempting to extend the methodology of comparative linguistics to distant relationships, we will discover something else about language change. The puzzles (the residue) are a rich source of data at every level. How far back can comparative methodology be reliably used? How long do words keep their essential shape? The cockroach has kept its shape for millions of years. Fossil evidence exhibits examples that are recognizable today by anyone who is familiar with the cockroach. How many centuries, millennia, ages, do words keep their essential shape? For how many ages does language retain original structures? And, as languages separate, how long do they retain similarities? Are structural similarities more stable in grammar than in phonology? I propose that we can extend some basic tenets of comparative linguistic methodology to discover new relationships, as well as gain new information on rules of natural phonology, and better understand “drift.” I believe that we can observe something about distantly-related languages, even if we don’t have enough information to propose proto phonemes and proto systems. There will be situations where there is enough information to posit some proto forms, but not the whole system.
The degree of usefulness of the comparative method depends on how distant the relationship and how much data are available. In my recent comparative studies, I have used the concept of “phonological space.” This procedure groups together morphemes with same or similar meanings (as I described in COMPARATIVE METHODOLOGY above) that also have phonetic similarities which fall within a reasonable phonological space on the phonetic chart of consonants. Thus, a word with a consonant /p/ in Language A can be compared with a word in another language that has a bilabial sound. If the languages derive from the same origin, these bilabial sounds have derived from a proto bilabial sound. In the preliminary stages, one does not know whether it is a /p, b, f, m/ or /w/. It could even be connected, through a chain of events, to an /h/ or even zero. And in distantly-related languages, we may not be able to garner enough information to be able to posit a form. Therefore, we can call this a pre-form, or something of the sort that acknowledges that it is much less certain than a proto form. For purposes of discussion, I have devised a symbol for this pre-form: a capital letter enclosed in a box, for example [F]. Under this symbol, I list the possible reflexes which occur in actual examples from the languages in question. I am using this device in a study which hypothesizes connections between the Polynesian languages and the Indian languages of the Americas (Key, in press). Here, I will give an example from my study of Uto-Aztecan (UAz) in North America, and Tacanan (Tacn), Panoan (Pan), Quechumaran (QAy), and Mapuche (Map) of South America. There is possibly an alveolar sound which can be designated as pre-form [¥]. Under it are listed the sounds which occur in examples taken from the language files, which might prove to be related forms. Thus:

Fuego ‘fire’ *UAz *tahi; AzZac ti-t; *Tacn *-ti-; *Pan *ciʔi; Mos i\.
Aflojar ‘loosen’ *Az *toma; TacnTac tonati.
Caminar ‘walk’ *UAz *piti (llegar); *Tacn *po-ti-.
Corto ‘short’ UAzTar t^ri; PanAma tor69.
Corto ‘short’ *UAz *tup; TacnCav t^obo.
Apretar ‘squeeze’ UAzTet kii-triniia, kii-tiriniia, kii-tiliniia; Map kitri'n.

Examples from UAzTarahumara are especially significant. Uto-Aztecan *t is reflected as /l/ and /r/ in Tarahumara. These same reflexes occur in South America in lexical sets which are the same, or similar enough, in meaning to be plausible. Moreover, the resemblances extend to the bilabial stop counterpart, *UAz *p, which is reflected as /p/ and /b/ in Tarahumara. The following examples illustrate Proto Uto-Aztecan *p and its reflex /b/.

‘Hair’: Proto Uto-Aztecan *po; Tarahumara bo’wara (wool or [slang] human hair); Proto Panoan *boo.
‘To suck’: Proto Uto-Aztecan *pini; Panoan-Marinahua biii.
‘Sweet, bee, honey’: Proto Uto-Aztecan *pis, *pic; Proto Panoan *¥ata; Tacanan-Tacana bita-.
‘Enter’: Proto Uto-Aztecan *paki; Tarahumara baki; Tacanan-Cavineña nobi-.
‘Eye’: Proto Uto-Aztecan *pusi; Tarahumara busi; Tacanan-Chama -bosi (face); Proto Panoan *biro.
‘Heavy’: Proto Uto-Aztecan *pit*; Tarahumara bite; Proto Tacanan *bike-.
‘Bring’: Proto Uto-Aztecan *pin; Proto Panoan *bi-; Proto Tacanan *be-.

The following examples illustrate Proto Uto-Aztecan *t and its reflex /r/.

‘Lie down’: Proto Uto-Aztecan *tika; Tarahumara rika; Proto Panoan *raka-; Tacanan-Cavineña hara-; Quechua (Ecuador) siri-.
‘To find’: Proto Uto-Aztecan *tiwa; Tarahumara riwa; Proto Quechua *tari-; Mosetene rijbiti; Tacanan-Cavineña ^oro-; Panoan-Amahuaca ranan-.
‘Rock’: Proto Uto-Aztecan *te, *tem; Tarahumara riU, rimojaci; Proto Quechua *rumi; Panoan-Marinahua tokiri; Mapuche kura.
‘To see’: Proto Uto-Aztecan *tiwa; Tarahumara riwa; Proto Quechua *riku-; Panoan-Cashibo bari- (look for).
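The grouping by “phonological space” described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used with the language files: the place-of-articulation chart (`PLACE_CLASSES`), the vowel list, the decision to compare only the first consonant of each form, and all function names are hypothetical. The forms themselves are taken from the sets cited above.

```python
from collections import defaultdict

# Assumed, much-simplified chart of consonant classes (hypothetical).
PLACE_CLASSES = {
    "bilabial": set("pbfmw"),
    "alveolar": set("tdrlsn"),
    "velar": set("kgx"),
}
VOWELS = set("aeiou")


def initial_place(form):
    """Return the place class of the first consonant of `form`, or None."""
    for ch in form.lower():
        if not ch.isalpha() or ch in VOWELS:
            continue  # skip asterisks, hyphens, apostrophes, and vowels
        for place, members in PLACE_CLASSES.items():
            if ch in members:
                return place
        return None  # first consonant lies outside the assumed chart
    return None


def group_by_phonological_space(entries):
    """Group (gloss, language, form) triples by gloss and place class.

    Only groups attested in at least two languages are kept as candidates.
    """
    groups = defaultdict(list)
    for gloss, language, form in entries:
        place = initial_place(form)
        if place is not None:
            groups[(gloss, place)].append((language, form))
    return {
        key: members
        for key, members in groups.items()
        if len({language for language, _ in members}) >= 2
    }


# Forms cited in the text above.
EXAMPLES = [
    ("hair", "Proto Uto-Aztecan", "*po"),
    ("hair", "Tarahumara", "bo'wara"),
    ("hair", "Proto Panoan", "*boo"),
    ("lie down", "Proto Uto-Aztecan", "*tika"),
    ("lie down", "Tarahumara", "rika"),
    ("lie down", "Proto Panoan", "*raka-"),
    ("lie down", "Quechua (Ecuador)", "siri-"),
    ("rock", "Proto Quechua", "*rumi"),
    ("rock", "Mapuche", "kura"),
]

CANDIDATES = group_by_phonological_space(EXAMPLES)
for (gloss, place), members in CANDIDATES.items():
    print(gloss, place, members)
```

Because this sketch looks only at the first consonant, it drops the pair *rumi / kura, where the resemblance lies in a medial consonant; a fuller treatment would have to compare consonants position by position. The output is a list of candidate sets for inspection, not a claim of cognacy.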
Even with these impressive sets, I do not consider this enough information to posit proto forms - I maintain caution in proposing proto systems. With further input from authorities from these various language groups, we may accumulate enough examples (and other pertinent information about the phonetic, morphophonemic, and distributional patterns) to merit some tentative conclusions. We also have to accept the fact that solutions may not be found for all aspects of the system, where history has been irretrievably lost.

Now, it must be acknowledged that when one is working in phonological space, with a great degree of uncertainty, typology and the general limitations of language become a concern. With a sound system restricted to a dozen or two consonants, making up thousands of words, there are possibilities of spurious resemblances. A good scholar would not be dogmatic. We need to push out the frontiers to learn how far this methodology can safely take us. Of crucial importance to these explorations is a knowledge of actual sound correspondences in closely related languages that are accepted without doubt. I do not know of a handbook that gives us this information: What types of sound correspondences naturally occur in related languages of the world? Without these protocol statements to guide us in our observations, it is difficult to say when we are dealing with universals, and when we are dealing with structural similarities of a particular language group. In any case, I believe that comparative studies on a global basis have developed to the point where we can go beyond the tired, old responses of “sheer coincidence,” borrowings, and onomatopoeia.
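The concern about spurious resemblances can be made concrete with a small calculation. The sketch below is an illustration only, and every number in it is an assumption rather than a finding: it supposes that the consonant under comparison falls into one of four equally likely classes in each language, that the two languages are unrelated, and that two hundred glosses are compared. The function names are hypothetical.

```python
from fractions import Fraction


def chance_match_probability(freq_a, freq_b):
    """Probability that forms from two unrelated languages land in the same class.

    `freq_a` and `freq_b` map each class name to the share of forms in that
    language whose compared consonant belongs to the class.
    """
    classes = set(freq_a) | set(freq_b)
    return sum(freq_a.get(c, 0) * freq_b.get(c, 0) for c in classes)


def expected_chance_matches(n_glosses, freq_a, freq_b):
    """Expected number of same-class pairs among `n_glosses` compared meanings."""
    return n_glosses * chance_match_probability(freq_a, freq_b)


# Assumed distribution: four equally likely classes (hypothetical).
UNIFORM = {
    place: Fraction(1, 4)
    for place in ("bilabial", "alveolar", "velar", "glottal")
}

P_MATCH = chance_match_probability(UNIFORM, UNIFORM)
EXPECTED = expected_chance_matches(200, UNIFORM, UNIFORM)
print(P_MATCH, EXPECTED)
```

Under these assumptions one pair in four agrees in class by accident. This is why a single matching consonant carries little weight, and why recurrence of the same correspondence across many sets, and across more than one consonant in a form, matters more than any one resemblance.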

MISCHSPRACHE

Human nature demands exploration and migration. The persistent movement of people has made complex problems for linguists. As Meillet has pointed out ([1925] 1967: 90), “there is scarcely a people which has not changed its language at least once, and generally more than once.” We could also say that it is difficult to imagine any language being so “pure” that it did not show some evidence of being “in contact” at some time or other - to some degree or other. As the oft-quoted anthropologist E. A. Hooton has said, “When different races of men come into contact with each other they sometimes fight but they always breed.” (Quoted in Gladwin 1947: 92.) Nevertheless, even acknowledging this, it seems to me that we attempt to analyze a language as though it has a single system of phonology, syntax, grammar, and so forth. With all the evidence that English could be thought of as a Mischsprache, we still relate to it as English, rather than French-English, or English-French, or Latinized English. As a matter of fact, as Hodge (1979) points out, we don’t even have a good definition of Mischsprache. Nevertheless, we continue to use such expressions as “Spanglish” and “code-switching,” with examples that embarrass us with their existence, and our lack of theory. To get a feel for how languages might have been in contact throughout the thousands of years of migrations, one can draw little sketches depicting the coming together and the separating of people. The sketches must involve two concepts - those languages which are clearly genetically related, and those languages which can be described as Mixed Languages.

[Sketches not reproduced. Labels: Pre-A and B (genetically related); Pre-A, Pre-Z.]

From these simple, basic models, one can imagine any number of combinations. And probably languages around the world developed with such kinds of complexities.

[Sketch not reproduced. Labels: Pre-A and B; Pre-Y and Z.]

CONCLUSION

There is a time in preliminary studies, when one should be relatively uncommitted and not make dogmatic conclusions. Premature judgments can inject confusion and scrambling to the state-of-the-art. It should be made clear what is a certain finding, and what is a hypothetical conjecture. I also want to emphasize that I do not think that the languages used in my study are the only ones related across the continents. On the contrary, there is very good evidence that other major language families should be integrated. In fact, when I began compiling my data base in 1975, during a period of study in Chile, I had thought to file material from other families rather soon. I have not done it yet, because the ramifications of the sounds are too complex (as I illustrated above in DIFFICULTIES IN RECOGNIZING COGNATES), and I believe there is a better control against chance by working with a manageable number of correspondences. Every new study requires us to revise, add new information, and correct our notions (and publications) of previous times. One must be willing to tolerate a fair amount of “mistakes” to test a methodological approach. As a matter of fact, pushing a methodology beyond its previous limits may also uncover mistakes which have been assumed as facts. For example, the methodology which I have discussed here might uncover proto forms which have been wrongly reconstructed, and accepted etymologies that are incorrect. I hope my presentation opens up discussions of comparative methodology and pushes the frontiers of it to see how much more we can learn about language change and the history of human language. I will close with a quotation from a contemporary scientist. He recognizes that he does not have all the answers. In an imaginary conversation, he invents a character who implores:

You have told me more than once that science advances only by making all possible mistakes; that the main thing is to make the mistakes as fast as possible - and recognize them. You like to quote the motto of the engine inventor, John Kris: “Start her up and see why she don’t run.”

John A. Wheeler
Center for Theoretical Physics

NOTES

1. This paper was presented at the 44th International Congress of Americanists, Manchester, England, September 5-10, 1982. It was the basis of lectures at the University of California at Berkeley, and the University of California at Santa Barbara, in February 1983. Here I acknowledge the very helpful comments from several colleagues who carefully read my first draft and discussed with me the “pitfalls” of comparative linguistics. It is always appreciated when scholars respond to ideas with intelligence and sensitivity - whether or not they agree. I am grateful for comments from former professors Henry M. Hoenigswald and Stanley Newman - always wise and thoughtful. In addition, stimulating and helpful comments also came from Robert A. Hall, Jr., Carleton T. Hodge, and Alan Kaye. Through the years two other former professors, Winfred Lehmann and Andre Martinet, have always generously responded to queries and problems that I presented to them. To all I acknowledge my appreciation. I am pleased to publicly acknowledge the generosity of the following who helped substantially by checking and verifying data: Kenneth Hilton, Carolyn Orr, and Lila W. Robinson.

2. I am using “proto” as a free form, to eliminate punctuation that is superfluous. This is similar to the use of “emic,” which now occurs widely as a free form.

3. Examples in this presentation are drawn from my comparative files which incorporate material from over sixty languages of North and South America. Numerous references have been used; for these, please see my previous publications on these matters.

REFERENCES

Buck, Carl Darling 1949 A Dictionary of Selected Synonyms in the Principal Indo-European Languages, 1515 pages, Chicago: University of Chicago Press.

Dyen, Isidore 1953 “The Proto Malayo-Polynesian Laryngeals,” William Dwight Whitney Linguistic Series, 65 pages, Baltimore, Maryland: Linguistic Society of America.

Frantz, Donald G. 1970 “A PL/1 Program to Assist the Comparative Linguist,” Communications of the ACM 13. 353-56.

Gladwin, Harold Sterling 1947 Men out of Asia, 390 pages, New York: McGraw-Hill.

Hodge, Carleton T. 1979 “Egyptian and Mischsprachen,” in Linguistics and Literary Studies: in Honor of Archibald A. Hill, pp. 265-75. Mohammad Ali Jazayery, et al (eds.), Vol. IV, The Hague: Mouton.

Key, Mary Ritchie 1966 Vocabulario Castellano regional, Vocabularios Bolivianos No. 5, 62 pages, Cochabamba, Bolivia.
1968a Comparative Tacanan phonology: with Cavineña phonology and notes on Pano-Tacanan relationship, 107 pages, The Hague: Mouton.
1968b “Phonemic Pattern and Phoneme Fluctuation in Bolivian Chama (Tacanan),” La Linguistique 2. 35-48.
1978a “Araucanian Genetic Relationships,” International Journal of American Linguistics 44. 280-93.
1978b “The History and Distribution of the Indigenous Languages of Bolivia,” paper read at the American Anthropological Association, 77th Annual Meeting, November, Los Angeles, California.
1978c “Lingüística comparativa Araucana,” Vicus Cuadernos 2. 45-55. Amsterdam: John Benjamins.
1979 The Grouping of South American Indian Languages, 170 pages, Tübingen: Gunter Narr.
1981a “Intercontinental Linguistic Connections,” Humanities Inaugural Lecture Series, 30 pages, University of California, Irvine.
1981b “North and South American Linguistic Connections,” La Linguistique 17.1:3-18.
1981c “Quechumaran and Affinities,” Scripta Ethnológica 6. 93-97. (Festschrift for Marcelo Bórmida), Buenos Aires, Argentina.
1980-1981 “South American Relationships with North American Indian Languages,” Homenaje a Ambrosio Rabanales, Boletín de Filología 31. 331-50. Universidad de Chile, Santiago, Chile.
n. d. “Polynesian and American Linguistic Connections,” Forum Linguisticum. Supplement, in press.

Key, Mary Ritchie and Christos Clairis 1978 “Fuegian and Central South American Language Relationships,” Actes du XLIIe Congrès International des Américanistes 1976, Vol. 4, 635-45.

Lehmann, Winfred P. (ed.) 1967 A Reader in Nineteenth-century Historical Indo-European Linguistics, 266 pages, Bloomington: Indiana University Press.

Meillet, Antoine [1925] 1967 The Comparative Method in Historical Linguistics, Trans. by Gordon B. Ford, Jr., 138 pages, Paris: Librairie Honoré Champion.

Skeat, Walter W. [1882] 1980 A Concise Etymological Dictionary of the English Language, 656 pages, New York: G. P. Putnam’s Sons.

Suarez, Jorge A. 1971 “A case of absolute synonyms,” International Journal of American Linguistics 37. 192-95.

Taylor, Douglas R. and Berend J. Hoff 1980 “The Linguistic Repertory of the Island-Carib in the Seventeenth Century: the Men’s Language - a Carib pidgin?” International Journal of American Linguistics 46. 301-12.

Vendryes, J. [1902] 1972 “Some Thoughts on Sound Laws,” in A Reader in Historical and Comparative Linguistics, pp. 109-20, Allan R. Keiler (ed.), New York: Holt, Rinehart and Winston.

Voegelin, C. F., F. M. Voegelin, and Kenneth Hale 1962 “Typological and Comparative Grammar of Uto-Aztecan: I (phonology),” International Journal of American Linguistics 28.1, 144 pages, Supplement.

Wheeler, John A. 1981 [Quoted by John P. Wiley, Jr., From Frontiers of Time.] Smithsonian 12.5.22-6.