Chapter 5 Writing Systems

There are many differences between spoken and written language. To the linguist, the most important is: Spoken language is the birthright of all humans; every human being is born with the ability to learn a language and (except under extremely pathological circumstances) does so within a few years; we will talk about this issue in more detail in Chapter 13. Written language is an artifact peculiar to certain cultures. Vast numbers of people have lived out their lives as perfectly productive members of society without ever learning to read or write, or even imagining that such activities might be possible. All human beings everywhere acquire language; writing has only been invented a few times in the history of the world, in a few cultural environments.

When i speak of the ‘invention’ of writing, i mean the development of a writing system, a method for making a permanent representation of language, in a culture that has not previously been exposed to any such system from an outside source.1 By this definition, as far as we know writing has been invented only 5 times in the history of the world, in 5 places:2

Mesopotamia: the fertile valley between the Tigris and Euphrates Rivers in what is now Iraq, ca. 5000 years ago. More specifically, the writing system was developed by the Sumerians, the first urban civilization to occupy this region, and subsequently adopted by later residents and their neighbours.

Egypt: the fertile valley of the Nile River in Northeastern Africa, ca. 5000 years ago.

We do not know to what extent, and/or in what direction(s), these two may have influenced each other.

China: the writing system characteristic of modern Chinese is known to have existed in the time of the 商朝, a little over 3000 years ago; written inscriptions dating from that period but similar enough to modern 漢字 to be read have been found in the 黃河 valley near Anyang. Internal evidence strongly suggests that writing had already been practiced for some time before these inscriptions were made.3 More recent findings (reported April 2000) in 山東省 strongly indicate that the Chinese writing system was already beginning to evolve 4800 years ago, making it nearly as ancient as the Mesopotamian and Egyptian systems.

1 On the basis of this definition, the Europeans never invented writing, so far as we know; they got the idea from other cultures outside Europe, as i shall outline below.

2 In previous editions of this text i included, as a sixth instance of the invention of writing, the ‘Harappan’ inscriptions found in the Indus Valley in what is now Pakistan and dating over 4000 years ago. However, in 2004 Steve Farmer, Richard Sproat, and Michael Witzel argued convincingly that these inscriptions do not qualify as a real ‘written language’ in the strict sense of the term. Cf. their paper ‘The Collapse of the Indus-Script Thesis: The Myth of a Literate Harappan Civilization’ (Electronic Journal of Vedic Studies 11-2, pp. 19–57).

3 Nature of evidence: (1) The characters are already much stylized, noticeably removed from their (presumed) pictographic origins. (2) The inscriptions include the word 書, representing a bunch of bamboo strips tied together, implying that such collections of writing existed and the Chinese were accustomed to writing on perishable material rather than on bone, which is the medium of the earliest known inscriptions.

Central America: roughly 1000 years ago, the Aztecs and Mayans had something in the way of a serviceable writing system.

Easter Island: inscriptions have been found on the great statues for which this small island in the Pacific Ocean is justly famous. They are, however, so far completely undeciphered, so there's virtually nothing i or almost anybody else can say about them.

Writing Systems in Linguistic Science

During the past century or so, linguists have by and large ignored or avoided the subject of writing systems, although rather paradoxically most linguistic research is based on written, not spoken, records. Part of the reason for this is that, as i remarked earlier, writing is an artifact, something that human beings have had to deliberately invent, while Language itself, the subject matter of linguistic research, is the natural birthright of all humans. This would suggest that there is a significant difference between Language and writing, even though writing is in one sense merely the visual representation of Language. And this is certainly true; we are speaking animals, not writing animals. Language is a skill we all learn effortlessly in childhood; writing is a skill which, if we learn at all, we learn through arduous practice in a formal context. In that sense they are clearly two very different skills, and linguists have often used this fact to argue that the study of writing can teach us nothing about Language.

It should be noted that this attitude of traditional linguistic science is strikingly at variance with the attitude common to the general educated public. The average educated person tends to place extra emphasis on the written form of a language, to the point that there is an often unspoken or even unconscious assumption that written expressions that resemble each other visually must somehow resemble each other also in sound if not in meaning. Thus, i have on many occasions run into the assumption that, since the name of the capital of Russia is properly represented in writing as in (1a), it must be pronounced as in (1b) as opposed to the more correct pronunciation in (1c). And, at least until they've been properly educated, Westerners have a very strong tendency to assume that the symbols in (2a) must be pronounced [r] or [ɹ], and of course the one in (2b) must be pronounced [ɛp] or [ip], while the one in (2c) often looks to us Westerners like a rather elaborate letter ‘E’. Another common assumption i've run into is that, since written Japanese looks so much like written Chinese, the languages themselves must be related, although as we shall see in Chapter 22 there is no good reason to believe any such thing.

(1) a. москва   b. [mŠkbə]   c. [moskva]

(2) a. 尺, 民   b. 印   c. 佳

Outside of East Asia and its immediately surrounding islands (i'm thinking here primarily of Japan and Taiwan), almost all languages nowadays that have any written form at all make use of some kind of alphabet; outside of Asia, Eastern Europe, and Northern Africa, the overwhelming majority of written languages make use of some form of the Roman alphabet, the same alphabet i'm using here. This is due not to a common cultural inheritance shared by all these languages but to the history of imperialism and cultural domination of various nations (mostly but not all European) that happen to use alphabets of one kind or another (mostly Roman) for writing their own languages. Latin, English, and Fijian are all written with the Roman alphabet, but as we shall see later this does not mean that the Roman alphabet is particularly well suited to representing all three of these languages. And in India there are several different alphabets competing with each other, which often differ considerably in their outward, visual form but not necessarily in their underlying, structural organization, and at least one language, Sanskrit, is routinely written in all of them without any alteration in its content. It is an error to assume that the character of a language can be immediately deduced by a quick examination of its superficial written appearance.

Nevertheless, there is a sense in which the study of writing can be very relevant indeed to the study of Language. A full-fledged writing system, of the sort i'm talking about in this chapter, the sort that is capable of recording most if not all of the content of spoken language, represents to a great extent the way the community that uses it thinks about their language. And if we're interested in the psychology of language use (and this is one of the major concerns of modern linguistics) then how people think about their language is relevant to our study. Furthermore, there are different types of writing systems, and they seem to be appropriate for different types of languages. And so the study of writing systems becomes important for that recurring theme of mine, linguistic typology.

Types of Writing Systems

Students of writing systems, especially the few linguists who have been looking into this area, have coined the term ‘grapheme’ to refer to the units that make up a writing system. This word, analogous to the terms ‘morpheme’ and ‘phoneme’ that we have already met, refers to the most basic units, the ones that are treated by the system and by the people who are familiar with its use as distinct and indivisible entities. Thus, in an alphabetic system such as that used to write English, each individual letter is a grapheme; in the Chinese 漢字 system, each of the thousands of characters that can fill a square space is a grapheme.

The classification of writing systems into various types is based on what sorts of linguistic elements are typically represented by the individual graphemes of a given writing system. Students of writing systems most commonly recognize three basic types, though as we shall see these three types to some extent define a continuum and there are writing systems that are of ‘mixed’ type, combining characteristics of two of the three basic types or even all three. The labels most commonly used for the three basic types are logographic, syllabic, and alphabetic systems.

Logographic Writing

The term ‘logographic’ means literally ‘word writing’; in an ideal logographic system, each grapheme represents a whole word. But remember our discussion back in Section 1: What, exactly, is a word? How long, how complicated can a word be before we stop thinking of it as a single word? The best-known ‘logographic’ writing system in the world is the 漢字 system used to write Chinese. And yet it can easily be argued that many ‘words’ in Chinese are represented not by single graphemes but by two or more, or rather that the expressions in (3) represented by two or more graphemes are single words.

(3) a. 牙刷   b. 難關   c. 評審   d. 一點兒   e. 莫須有   f. 硬骨頭   g. 得過且過   h. 古往今來   i. 眼高手低

As explained at the end of Chapter 2, most linguists who have considered the matter at all, including myself, agree that what are really represented by the graphemes in 漢字 are morphemes, many of which are words in their own right but can nevertheless, like morphemes in other languages, be combined to form larger and more complex words. However, unlike the word ‘word’ the word ‘morpheme’ is not one with which most people are familiar, and so we continue to use the word ‘word’ to translate the Chinese 字 and the word ‘logographic’ as a convenient and relatively accurate term to describe 漢字 and similar writing systems.

I am going to take a moment here to point out something that i will want to come back to later; i really shouldn't need to point this out at all to people fluent & literate in Chinese, but the point deserves to be made. The Chinese 漢字 system is not purely logographic; in many cases, a significant portion of a particular 漢字 grapheme represents not the meaning but the pronunciation of the morpheme in question. In other words, many 漢字 graphemes include what are often called ‘phonetic cues’.4 I've given some examples in (4). These thirty 字 have little in common with each other with regard to meaning, but the fact that they all include as a prominent part the symbol 包 represents the fact that they are very similar to each other with regard to their pronunciations, at least in Mandarin. Since the 漢字 system includes these pronunciation cues, it should ideally be described as a ‘mixed’ system, though as we shall see it is nowhere near as mixed as some.

(4) a. 包 胞 苞 枹 笣 孢 雹 窇 飽 怉 抱 鮑 刨 鉋 菢
    b. 袍 咆 匏 庖 炰 齙 鞄 炮 瓟 跑 泡 砲 皰 麭 髱
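The role of 包 as a pronunciation cue can be checked mechanically for a handful of the characters in (4). What follows is only a rough sketch: the Hanyu Pinyin readings in the table are supplied here for illustration and are not part of the list above, and the helper names `strip_tone` and `split_syllable` are made up for the purpose. Each reading is stripped of its tone mark and split into an initial consonant and a rhyme.

```python
import unicodedata

# Mandarin readings (Hanyu Pinyin) for a few of the characters in (4).
# These readings are an illustrative assumption added for this sketch.
READINGS = {
    "包": "bāo", "胞": "bāo", "飽": "bǎo", "抱": "bào", "雹": "báo",
    "袍": "páo", "跑": "pǎo", "泡": "pào", "炮": "pào",
}


def strip_tone(pinyin: str) -> str:
    """Remove tone marks by decomposing and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def split_syllable(pinyin: str) -> tuple[str, str]:
    """Split a toneless syllable into (initial consonant, rhyme)."""
    base = strip_tone(pinyin)
    return base[0], base[1:]


initials = {split_syllable(r)[0] for r in READINGS.values()}
rhymes = {split_syllable(r)[1] for r in READINGS.values()}

print("initials:", sorted(initials))
print("rhymes:", sorted(rhymes))
```

For this sample the readings all share a single rhyme and differ only in tone and in whether the initial labial stop is aspirated, which is the kind of similarity the shared component signals.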

It is unfortunate, but perhaps not surprising, that the true nature of 漢字 has for so long been so little appreciated by Westerners who are accustomed to an alphabetic writing system. Many of us Westerners believe that 漢字 represent words, and, as i have said, while this may not be strictly accurate it is not too wide of the mark. But it has been relatively easy for many Westerners to jump from this reasonable approximation of the truth to the notion that what the 漢字 graphemes really represent are ideas; for this reason, you will occasionally come across the word ‘ideographic’ in Western descriptions of 漢字 and similar writing systems.

I'm pointing this out to you partly for this reason, because you're liable to run across this word sooner or later if you haven't already, and partly in order to explain why this idea is so misleading. It's obvious that there are a great many Chinese people who can't understand each other's spoken language; the languages spoken in 北京, 上海, and 香港 are quite noticeably different, to say nothing of Taiwanese! But, assuming they know how to read and write, these same people can understand each other reasonably well if they write down what they want to say. This wonderful ability of the 漢字 system to overcome language barriers has encouraged a lot of Westerners over the past few centuries to believe that what is represented in the 漢字 system is not language at all but thought itself; i occasionally run across Westerners (educated Westerners) who believe this even today. From this notion it is not hard to conclude that Chinese civilization, so ancient and so venerable in its antiquity, has somehow managed to do what Western philosophers have so far failed to do, identify the most basic units of human thought with such precision as to be able to represent them graphically; that each 漢字 represents a fundamental unit not of language but of thought itself; and anybody who masters the 漢字 system thereby automatically gains direct access not just to the Chinese language in its written form but to the pure essence of human psychology, if not fundamental philosophical truth.

You may find this attitude extremely flattering to your nation, and indeed it is, but i'm sure you all recognize that it is very far from the truth. And you can probably imagine that it has resulted in a lot of frustration over the centuries as Europeans and Americans of a philosophical bent have struggled to master 漢字 in the mistaken belief that this endeavour would lead them eventually to the Philosopher's Stone, the ultimate goal of philosophical enquiry.

4 Bernhard Karlgren (Analytic Dictionary of Chinese and Sino-Japanese, Geuthner: 1923) estimates that roughly 90% of all 漢字 consist of combinations of ‘radicals’ and pronunciation cues, which gives some indication of the extent to which this blending is characteristic of the system.

I would also mention that this same attitude was the cause of serious delay in the deciphering of Egyptian hieroglyphics. As can be seen from Fig. 5.1, the written form of the ancient Egyptian language is even more blatantly pictorial than 漢字, and this fact led Westerners to believe that Egyptian writing is in some sense logographic, as we would say nowadays. And it is logographic, up to a point.

Fig. 5.1 — Egyptian Hieroglyphic Text

Unfortunately, especially after Europeans began learning about Chinese, they concluded that Egyptian hieroglyphics were not merely logographic but ideographic, as they supposed 漢字 to be. Once they had reached this conclusion, they spent several generations labouring under the delusion that one didn't need to know anything at all about the Egyptian language in order to make sense out of Egyptian writing.5 And this was a huge mistake, because in fact the Egyptian hieroglyphic system is not purely logographic but highly mixed, and it was completely impossible to make any sense out of Egyptian writing until its mixed nature was understood, as it came to be only in the early 19th century.

When i say that the hieroglyphic system is mixed, i mean that in any given text some of the graphemes may represent whole words, as one would expect of a logographic system, while some may represent syllables or other parts of words. There is even such a thing as a hieroglyphic ‘alphabet’, a set of graphemes that represent single phonemes; i've done my best to reproduce this ‘hieroglyphic alphabet’ for you in Fig. 5.2. So the hieroglyphic writing system involved a combination of graphemes functioning at all three levels, logographic, syllabic, and alphabetic, and such a mixture might very easily be present in any given hieroglyphic inscription.

And it was even more complicated than i've so far suggested. In addition to logographic, syllabic, and alphabetic symbols, there were also what Egyptologists call ‘determinatives’, symbols that were not pronounced themselves but which served, rather like radicals (部首) in 漢字, to clarify the meaning of nearby symbols or combinations of symbols that otherwise might be ambiguous. For instance, one might have a combination of graphemes that together represented a word that was literally the name of a kind of flower; but the same word might also be used as a woman's name. In order to clarify this, someone might write the graphemes representing the actual word, and then next to them draw a stylized picture of a woman to make it clear this was a woman who was being talked about; or, alternatively, draw a stylized picture of a generic plant to indicate that what was being talked about was something in the nature of a plant, e.g. a flower.

5 After all, if Egyptian hieroglyphs like Chinese 漢字 completely bypass the medium of human spoken language, which as we have discussed previously is typically very arbitrary and also includes many details of grammar, etc. that are completely irrelevant to semantic content or philosophical inquiry, and represent human thought directly (as Chinese 漢字 were supposed to do), why should one need to know anything about the Egyptian language in order to make sense out of Egyptian writing? Under this assumption, it ought in theory to be possible for any sufficiently wise and intelligent human being to decipher Egyptian hieroglyphics purely on the basis of their pictorial content.

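Depicts               Sound value    Depicts               Sound value
vulture               ʔ              flowering reed        i
forearm               ʕ              two flowering reeds   j
quail chick           o, w           foot                  b
water                 n              stool                 p
mouth                 r              horned viper          f
loaf of bread         t              owl                   m
hand                  d              reed hut              h
tethering rope        ť              twisted flax          š
snake                 ď              placenta              x
door bolt             s              animal's belly        h
folded cloth          ʂ              slope                 q
pool                  ʃ              basket                k
lion                  l              jar stand             g

Fig. 5.2 — Hieroglyphic ‘Alphabet’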

The big difference between hieroglyphic determinatives and Chinese radicals is that in Chinese a radical is actually included as part of the symbol it's supposed to be disambiguating. A hieroglyphic determinative is a free-standing, independent grapheme that merely serves to modify (without itself being pronounced) the interpretation of other graphemes in its immediate vicinity. And this leads us to consider one of the two major sources of difficulty in interpreting hieroglyphics. Although as i have said there were four different levels at which a hieroglyphic grapheme could operate (logographic, syllabic, alphabetic, or determinative), graphemes of all four types are drawn/written in the same style and look very much alike, and without a fair amount of practice it's impossible to look at a bit of hieroglyphic text and tell what, precisely, a given grapheme represents in this particular context. This is particularly true in that individual graphemes might serve a variety of purposes, sometimes within the same text! Thus, it is perfectly possible for a picture of an owl to represent the word for ‘owl’ and then, a little bit farther on in the same text, for the same picture to represent merely the phoneme /m/. Or for a picture of a star to represent the word dewaʔ, meaning ‘star’, and elsewhere (in the same text) to serve as a determinative for the name of a particular star. Or (this in my opinion is one of the wackiest examples!) for a picture of a swallow to represent the word wer, meaning ‘swallow’, or the word wer, meaning ‘big’ (pronounced somewhat the same way) or to serve, rather like 顆 or 粒 in Chinese, as a determinative for something small!

The other major source of complication in interpreting hieroglyphics is redundancy. Nowadays, in modern civilized society, it is generally felt to be sufficient to represent a given word once. If i want to write ‘beautiful’ i can write the word ‘beautiful’ and then go on and do something else; i don't have to write the same word over again (in a different fashion) in order to get my point across. And the same goes for ‘美麗’ or anything else one might want to express in writing. But that's not how Egyptian scribes worked. In Classical Egyptian the word for ‘beautiful’ was nefer, and the conventional written representation of it was a picture of a lute. So this symbol would presumably represent the word nefer, ‘beautiful’, no problem. But i have seen an inscription in which the picture of a lute is accompanied by the symbols for the phonemes /n/, /f/, and /r/, as shown in Fig. 5.3. This is not to be read as ‘nefer nefer’, but only as ‘nefer’, even though the word is represented twice: once in its entirety and once spelled out in detail.

Fig. 5.3 — Redundancy in Hieroglyphic Writing

Syllabic Writing

The second major type of writing system is syllabic. This word is pretty self-explanatory; in a syllabic writing system, each grapheme represents a syllable, and the complete set of all the graphemes is called a syllabary. Possibly the best, and certainly the best-known, example of a syllabic writing system is the Japanese kana; as we will note again later, Japanese in fact has not one but two syllabaries, known respectively as Hiragana and Katakana.6 In an ideal syllabic system, each distinct phonetic or phonological syllable defined by the language would be represented by a single, distinct grapheme. This is not quite the case of Japanese kana. First of all, although like Mandarin Japanese allows syllables to end in nasals, neither kana has distinctive graphemes representing such syllables; instead each kana has a distinct symbol representing the syllable-final nasal itself. Furthermore, Katakana, at least, shows evidence of a certain amount of fairly sophisticated feature-analysis: the syllables involving a glide between the consonant and the vowel are represented by composite characters, not single graphemes, and the syllables beginning with the consonants [g], [z], [d], [b], and [p] do not have distinct representations of their own but are modelled upon the graphemes representing the syllables beginning with [k], [s], [t], and [h], with diacritics added to represent the distinctive features of voicing or, in the case of [p], the combined lack of voicing and continuance; cf. (5). I do not know, offhand, to what extent these simplifications are true of Hiragana.

6 See the appendix to this chapter for details on how the two Japanese syllabaries evolved from Chinese 漢字.

(5) キャ kya   キュ kyu   キョ kyo
    カ ka   ガ ga
    サ sa   ザ za
    タ ta   ダ da
    ハ ha   パ pa   バ ba
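The ‘base grapheme plus added mark’ analysis in (5) happens to be mirrored in the way Unicode encodes kana, which gives a quick way to see it at work. The following is a minimal sketch, not a description of how Japanese writers think about their script; the function name `analyse` is invented here. It relies on the fact that canonical decomposition (NFD) splits a voiced or [p]-initial kana into its plain base character followed by a combining sound mark.

```python
import unicodedata

DAKUTEN = "\u3099"      # combining voiced sound mark
HANDAKUTEN = "\u309A"   # combining semi-voiced sound mark

MARK_NAMES = {DAKUTEN: "voicing mark", HANDAKUTEN: "[p] mark", "": "no mark"}


def analyse(kana: str) -> tuple[str, str]:
    """Return (base grapheme, combining mark) for a single kana character."""
    decomposed = unicodedata.normalize("NFD", kana)
    return decomposed[0], decomposed[1:]


for kana in "カガサザタダハバパ":
    base, mark = analyse(kana)
    print(kana, "=", base, "+", MARK_NAMES[mark])
```

Calling `analyse` on the hiragana が likewise returns か together with the same voicing mark, so at least in this encoding the two syllabaries are treated in parallel.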

Alphabetic Writing

Fig. 5.4 — Who says alphabetic systems are easy?

The third type of writing system, of course, is an alphabetic system. And of course the best known alphabetic writing systems are those used nowadays for all European languages; Europeans, however, didn't invent the alphabet. I mentioned that writing had been invented only a half-dozen or so times in human history; well, the alphabet, or at any rate the notion of an alphabetic writing system, has in fact been invented only once, so far as we know, in the history of the human race. This is not perhaps as surprising as it might at first appear, when you consider what a major intellectual leap the alphabetic principle involves. In an alphabetic writing system, by definition, each grapheme represents not a word, not a morpheme, not a syllable, but a phoneme. Consider how abstract a phoneme is; consider the challenge you've had in trying to come to grips recently with this concept. For a culture (not an individual human being, but a whole community) to recognize the phoneme as a useful linguistic concept and then to encode it graphically, which is what an alphabetic writing system involves, is a major intellectual feat, and the honour of accomplishing it without learning the idea from outside belongs to one single nation, or more likely a cohesive group of nations sharing a common culture, living on the far Eastern coast of the Mediterranean Sea about 3000 years ago.

History

Which brings us to the systems. As far as we know, the earliest writing systems developed in Egypt and Mesopotamia. The systems that developed in these two locations were, at least in their original form, very logographic.7 Egyptian writing systems (‘hieroglyphic’, etc.) were used only to write Egyptian, while Mesopotamian cuneiform was used to write a great variety of languages, including not only Sumerian but various Semitic (Akkadian, Babylonian) and Indo-European (Persian, Hittite) languages as well. Both Egyptian hieroglyphs and Mesopotamian cuneiform evolved in the direction of syllabic writing.8 In Egypt, syllabic writing was already present in the earliest known samples, but the change from logographic to syllabic writing was never carried through to completion by the Egyptians. Mesopotamian cuneiform, on the other hand, was purely logographic in Sumerian but became strictly syllabic in some of its non-Sumerian versions.

7 The most basic difference between them has to do with medium, with the tools used to write. The Sumerians in ancient Mesopotamia wrote by making impressions in soft clay tablets, which would then dry and harden. Egyptians, like the Chinese, wrote on sheets made of vegetable fiber with brushes dipped in some sort of coloured fluid which we might as well call ‘ink’. And the entire difference in appearance between Mesopotamian cuneiform, as it's called, and Egyptian writing is due to this difference in medium.

8 The latter under the pressure of adaptation to non-Sumerian languages, the former under internal pressure.

Semitic peoples living on the Canaanite coast of the Mediterranean Sea adopted certain elements of Egyptian hieroglyphs and from them developed a ‘consonantal’ syllabic writing system adapted to Semitic language structure. Back in Chapter 1 i said that in the Semitic languages almost all of a word's meaning is carried by the consonants that make it up, the vowels contributing only subtle shades of meaning. This being the case, in writing a Semitic language it is of major importance to represent the consonants, while the vowels are definitely very secondary; in most traditional Semitic writing systems, there was no explicit representation of vowels as such, and even today in modern Israel street-signs and such like printed in Hebrew don't bother representing the vowels.9 In this respect, Semitic writing is in many ways syllabic: the individual grapheme represents the consonant at the beginning of a syllable, and the vowel following it is either not indicated at all or is indicated only by little marks above, below, or to either side of the grapheme.10

9 Tables of the Hebrew and Arabic alphabets, including in the case of Arabic details on variant forms of the letters as used in different parts of words, can be found in the Appendix to this chapter.

10 Such marks are in general called ‘diacritics’; their function is to qualify how the spoken equivalent of the grapheme itself is to be pronounced. Such diacritics show up in a lot of languages whose writing systems are not based on the kind of structure that underlies the Semitic system; the accent marks in French or Spanish, the umlauts in German, are diacritics.

In some respects, however, the Semitic writing system cannot be regarded as purely syllabic. For one thing, the basic Semitic grapheme does not unambiguously represent a syllable; it represents only the consonant at the beginning of the syllable, and is accepted as a representation of several different syllables, depending on how many vowels there are in the language. Thus, in Hebrew the grapheme מ, representing basically the consonant /m/, can in actual usage represent at least the five different syllables [ma], [mo], [mɛ], [mi], and [mə], as shown in (6).11

(6) a. מלל [malal] ‘rub’
    b. משה [moʃɛh] ‘Moses’
    c. מלל [mɛlɛl] ‘ear of grain’
    d. מעוט [miʕoʈ] ‘minority’
    e. מלל [məlal] ‘seam’

11 Please note that Hebrew is written from right to left!

Furthermore, if the syllable should also end in a consonant, then the Semitic writing system represents that consonant by a second, separate grapheme. Thus, while some Semitic syllables are represented by single graphemes others are represented by pairs; i've given a few Hebrew examples in (7). In this sense, the Semitic writing system is roughly halfway between being a purely syllabic system and a truly alphabetic one, and indeed it is, in its various forms, usually referred to as an alphabet. At any rate, all alphabetic writing systems in the world today are derived from it, either directly or indirectly, to some degree or other.12

(7) גן [gan] ‘garden’
    כל [kol] ‘all’
    מות [mo{…] ‘death’
    מלך [melex] ‘king’
    מלכת [malku{] ‘kingdom’
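The ambiguity described in connection with (6) can be imitated in a few lines. This is only a toy sketch: it works on romanized transcriptions rather than on Hebrew letters, the vowel inventory `VOWELS` is an assumption made for the illustration, and the function name `skeleton` is invented here. It removes the vowels from each transcription, as a consonantal script in effect does, and groups together the words that end up written identically.

```python
# Vowel symbols assumed for this illustration only.
VOWELS = set("aeiouɛə")


def skeleton(transcription: str) -> str:
    """Keep only the consonants, as a consonantal writing system would."""
    return "".join(ch for ch in transcription if ch not in VOWELS)


# Three romanized readings modelled on the forms in (6).
readings = ["malal", "mɛlɛl", "məlal"]

groups: dict[str, list[str]] = {}
for reading in readings:
    groups.setdefault(skeleton(reading), []).append(reading)

for written, spoken in groups.items():
    print(written, "<-", ", ".join(spoken))
```

All three readings collapse onto the single consonant sequence `mll`, so a reader has to recover the vowels from context, just as with the unpointed Hebrew spelling.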

In large part due to the wide-ranging travels of the Semitic Phoenicians, the consonantal writing system spread far throughout the surrounding regions, ultimately reaching all the shores of the Mediterranean Sea, as far south as Ethiopia, and as far east as India. In these last two places the ‘consonantal’ writing system of the Phoenicians was developed (independently) into full-fledged syllabic writing systems, in which each grapheme represents an initial consonant (if any) with a diacritic to represent the following vowel(s).13 The Ethiopian and Indian writing systems also developed distinctive graphemes to represent vowels by themselves, for those cases where a syllable doesn't begin with a consonant.14 As a result, the Ethiopian and Indian writing systems are able to represent every detail of the structure of a syllable unambiguously, and indeed are on the threshold of being true alphabetic systems. Tables showing the Ethiopian syllabary and the Devanāgarī alphabet, one of the versions of the Indian writing system, can be found in the Appendix to this chapter.

When the Greeks adopted the Phoenician writing system and tried to adapt it to their language, they were faced with the difficulty that the Semitic writing system allowed only consonants to be represented, but in their Indo-European language vowels were just as important as consonants and needed to be represented just as much. In solving this dilemma, the Greeks took advantage of the fact that the Semitic writing system had representations for consonant phonemes that didn't exist in Greek (/q/, /ʔ/, etc.). The graphemes used by the Phoenicians to represent phonemes which Greek didn't have were adopted to represent phonemes which the Phoenicians hadn't bothered trying to represent: vowels. Thus, the Greeks developed the Semitic writing system into a purely alphabetic system. The resulting system became the basis for all known European writing systems, including not only the Greek alphabet itself but the Etruscan, Roman, Glagolitic, Cyrillic, and Runic alphabets. Samples of many of these are shown in the Appendix to this chapter.

12 The most exotic developments known to me, the ones that derive the least from the Semitic source, are Korean Hangul and Irish Ogham. I shall have more to say about Hangul shortly. The letters used in the ancient Irish Ogham inscriptions seem to have been developed within a purely Celtic cultural context, although the idea for developing such letters and the principles for organizing them clearly owe something to the models of one or more of the alphabets used in continental Europe.

13 In the writing systems developed in India, consonants coming at the end of a syllable are typically represented by reduced forms of the relevant graphemes which are attached to the grapheme representing the consonant at the beginning of the next syllable.

14 The Semitic system makes use of a sort of all-purpose ‘dummy’ consonant letter for this purpose.

Devanāgarī and Hangul: Towards a Scientific Alphabetic Organization

The ‘hieroglyphic alphabet’ that i gave you in Fig. 5.2 should not be taken as representing the graphemes in any particular order. To the best of our knowledge, there was no fixed order for these graphemes as there is in modern alphabets. When the Semites developed an alphabet out of Egyptian sources, they established a (relatively) fixed order for the graphemes, but this order was totally random; in learning the Semitic alphabet, one simply has to memorize the sequence ‘ox’, ‘house’, ‘camel’, ‘door’ etc.;15 the order is not justified by any logical or phonological basis. This is still true in most alphabets, including all those customarily used in European culture.

There are, however, exceptions. Around the year 1000 B. C. E., there was a great flowering of phonetic and phonological research in northern India.16 On the basis of the knowledge gained through this research, Indian grammarians strove to organize the alphabetic writing system they had learned from Semitic sources on a scientific basis. The result is shown in Fig. 5.5, a representation of how the Devanāgarī alphabet is organized. (See chart on p. 128 for the symbols used to actually represent these sounds.)

Vowels:      a ā i ī u ū ŗ ŗ̄ ļ (ļ̄) e ai o au
Stops:       k   kh   g   gh   ŋ
             tʃ  tʃh  dʒ  dʒh  ɲ
             ʈ   ʈh   ɖ   ɖh   ɳ
             t   th   d   dh   n
             p   ph   b   bh   m
Glides:      j r l w
Fricatives:  ʃ ʂ s ɦ

Fig. 5.5 — Scientific Organization of the Devanāgarī Alphabet

First, we have all the vowels, organized among other things into pairs of short and long versions of the same basic vowel. Then come all the stops, followed by the glides and finally by the fricatives. Furthermore, the 25 stops are organized in a very rational way: first the five velars, then the five palatals, then the five retroflex stops, etc. And within each set of five the order is consistent: [-voice -aspirate] followed by [-voice +aspirate], [+voice -aspirate], [+voice +aspirate], and finally [nasal]. In short, phonemes sharing certain important, distinctive features are grouped together in alphabetical order in the Devanāgarī alphabet.
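The five-by-five arrangement of the stops just described can be made concrete with a short sketch. It is only an illustration, not part of the traditional description: the letters are given in a common scholarly romanization (IAST) rather than in the phonetic symbols of Fig. 5.5, the labels ‘dental’ and ‘labial’ are supplied here for the last two series, and the names PLACES, CLASSES, STOPS, and features are invented for the example.

```python
# Rows: places of articulation, in alphabetical order of the stop series.
# ("dental" and "labial" are labels assumed for the last two series.)
PLACES = ["velar", "palatal", "retroflex", "dental", "labial"]

# Columns: the fixed order of classes inside every series.
CLASSES = [
    ("-voice", "-aspirate"),
    ("-voice", "+aspirate"),
    ("+voice", "-aspirate"),
    ("+voice", "+aspirate"),
    ("nasal",),
]

# The 25 stops in traditional alphabetical order (IAST romanization,
# an assumption made for readability).
STOPS = "k kh g gh ṅ c ch j jh ñ ṭ ṭh ḍ ḍh ṇ t th d dh n p ph b bh m".split()


def features(letter):
    """Derive a stop's features purely from its position in the alphabet."""
    row, col = divmod(STOPS.index(letter), len(CLASSES))
    return (PLACES[row],) + CLASSES[col]


if __name__ == "__main__":
    # Print the grid, one series per line.
    for row, place in enumerate(PLACES):
        series = STOPS[row * 5:(row + 1) * 5]
        print(f"{place:<10}", " ".join(series))
    print(features("gh"))  # ('velar', '+voice', '+aspirate')
```

The point of the sketch is that nothing about a letter needs to be stored except its position: place and manner both fall out of a single division, which is exactly what makes the ordering ‘scientific’.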

However, the forms of the individual Devanāgarī letters are completely arbitrary, being traceable ultimately back to their Semitic source, and do not reflect this scientific organization. Thus, beginning students always have to be warned not to confuse the letters in (8), which may look alike but do not necessarily share any relevant features.

(8) a. घ gh    ध dh

b. प p    ष ʂ

c. स s    म m

d. इ i    ङ ŋ    ड ɖ

15 Referring to the literal meanings of the names of the various graphemes.
16 Indeed, early Indian phonetic science was so far ahead of anything achieved anywhere else in the world that, upon discovering this native tradition nearly 3000 years later, Western civilization could do little more than adopt its wealth of knowledge.

In this important respect Hangul, the Korean alphabet, differs from all other writing systems. The Koreans learned the alphabetic principle (the notion of representing individual phonemes by distinct graphemes) as they adopted the Buddhist religion, which traditionally expressed itself in Pali and other North Indian languages making use of the Devanāgarī alphabet. But when in the 15th century C. E. King Sejong called for a complete reform of the writing system, the scholars who developed the Hangul alphabet took advantage of previous centuries of research in articulatory phonetics and designed the individual graphemes as much as possible to depict how the phonemes they stood for were actually articulated. Thus, the velar stops /k/ and /g/ are represented, as shown in (9a), by graphemes deliberately designed to suggest the tongue touching the velum. The apical consonants /t/, /d/, /n/, and /ɾ/ are represented as in (9b) by graphemes designed to represent the tongue touching the teeth. And the palatal consonants /ʃ/, /dʒ/, and /tʃ/ are represented as in (9c) by graphemes suggesting the tongue touching the hard palate, between the teeth and the velum. For this reason, Hangul has been declared ‘the only writing system based totally on scientific principles’.

(9) a. ㄱ g    ㅋ k

b. ㄷ d    ㅌ t    ㄴ n    ㄹ ɾ

c. ㅅ ʃ    ㅈ dʒ    ㅊ tʃ

The genius of King Sejong's scholars didn't stop there, however. They recognized not only the importance of the alphabetic principle but also, right alongside it, the importance of the syllable as a phonological unit. The result is that the simple graphemes of the Hangul alphabet, each one representing a single phoneme, can be combined into complex graphemes representing whole syllables. As in Chinese and Japanese, the representation of a syllable has to be enclosed within a square space. Within that square space is squeezed the letter representing the initial consonant17 and the letter representing the vowel.18 If the syllable ends in a consonant, the letter representing that consonant is placed at the bottom of the square space representing the entire syllable.

(10) a. n + u + n + a = nuna 누나 ‘elder sister’ (姊姊)
n + a + m + u = namu 나무 ‘tree’ (樹)

b. ‘dummy’ + u = u 우 ‘right’ (右)
‘dummy’ + o + h + u = ohu 오후 ‘afternoon’ (下午)

c. m + o + g = mog 목 ‘neck’ (頸)
m + u + n = mun 문 ‘door’ (門)

17 Like the Semitic alphabets, Hangul makes use of a ‘dummy’ consonant, ㅇ, for syllables that begin with a vowel. In syllable-final position, the same symbol represents the velar nasal /ŋ/.
18 The vowel-letter may be either to the right or beneath the preceding consonant-letter, depending on its form: A vowel-letter whose most important characteristic is a vertical line is placed beside the preceding letter, while one consisting primarily of a horizontal line is placed beneath it. The basic CV-combinations in Hangul are laid out in a table in the Appendix to this chapter.

d. 훈민정음 Hun Min Jong Um ‘訓民正音’19

한글 Hangul    한글날 Hangul-lal20
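The way simple letters are packed into a syllable block, as in (10), happens to be mirrored in how modern computers encode Hangul. The sketch below is an illustration under that assumption: it uses the arithmetic of the Unicode Hangul Syllables block (which is of course no part of King Sejong's design), and the names LEADS, VOWELS, TAILS, and compose are invented for the example.

```python
# Letters in the order used by the Unicode Hangul Syllables block.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"        # 19 initial consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"    # 21 vowels
# 27 possible final consonants (or clusters), plus "" for "no final".
TAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

FIRST_SYLLABLE = 0xAC00  # code point of the first precomposed syllable


def compose(lead, vowel, tail=""):
    """Combine an initial, a vowel and an optional final into one block."""
    index = (LEADS.index(lead) * len(VOWELS) + VOWELS.index(vowel)) * len(TAILS)
    return chr(FIRST_SYLLABLE + index + TAILS.index(tail))


if __name__ == "__main__":
    print(compose("ㄴ", "ㅜ") + compose("ㄴ", "ㅏ"))  # nuna
    print(compose("ㅇ", "ㅜ"))                        # u, with the 'dummy' initial
    print(compose("ㅁ", "ㅗ", "ㄱ"))                  # mog
    print(compose("ㅁ", "ㅜ", "ㄴ"))                  # mun
```

Note that a vowel-initial syllable such as u still needs the ‘dummy’ consonant as its initial, just as footnote 17 describes.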

漢字 in China and Japan

The Chinese writing system was developed in the mists of pre-history; the graphemes in the earliest inscriptions from the 商朝 are already recognizable as precursors of modern 漢字. The Chinese writing system subsequently spread throughout East and Southeast Asia. While the 漢字 system within the Chinese cultural environment itself has remained resolutely logographic, adoptions in other nations have developed in other directions.

The Japanese have four complete writing systems used simultaneously side-by-side: (1) Chinese 漢字 (called kanji in Japan), used as a logographic system; (2–3) two complete syllabaries, each one representing the entire inventory of possible syllables in Japanese (with distinctive strategies for representing long vowels), one (hiragana) used to represent grammatical morphemes characteristic of Japanese for which Chinese has no equivalents (case markers, etc.), the other (katakana) used to represent loan words from foreign languages; (4) the Roman alphabet (called Romaji). Note that any one of the last three mentioned would be a perfectly adequate representation of the Japanese language all by itself; that the Japanese continue to maintain four complete writing systems side-by-side in daily usage is testimony to the willingness of human beings and cultures to be much more complicated than they need to be, a quality i sometimes refer to as ‘baroquity’.21

I've just mentioned that the Japanese adopted kanji as a logographic system, functioning much the same way as it does in Chinese. There is, however, a complication here, due to fundamental differences in linguistic structure between Chinese and Japanese. In Mandarin and, at least to some extent, other Chinese languages/方言, each 漢字 represents not only a single morpheme but a single syllable; indeed (barring the exceptions mentioned at the end of Chapter 2), in Chinese all morphemes are monosyllabic. This is not true in Japanese; in Japanese a given kanji may represent a single morpheme, but as shown in (11) that morpheme may include anywhere from one to four syllables.

(11) Mandarin Reading    Character(s)    Japanese Reading
     yuè                 月              tsuki
     shénfēng            神風            kami-kaze
     qiè                 妾              mekake
     máng                盲              mekura
     dì                  帝              mikado
     dié                 疊              tatami
     shòu                獸              kedamono

19 The title of the tract publicizing the Hangul writing system.
20 ‘Hangul Day’, the anniversary of the proclamation of the Hangul writing system, Oct. 9, proposed in 1992 as an international linguists' holiday.
21 Someone has commented that the Japanese writing system is ‘so complex it requires a whole extra writing system to explain it’; i'm not sure which is supposed to be the ‘whole extra’ writing system.

Some 漢字 are pronounced in different ways in Mandarin, depending on context; e.g., 了 may be pronounced either le or liǎo. However, these variations are few, and in general they share certain segments and it can be seen that they are probably historically related. In Japanese, however, as shown in (12), some kanji are pronounced quite differently depending on context, with pronunciations that may represent semantically related morphemes in Japanese but are not phonologically related at all.

(12) Character    Japanese pronunciation
                  in isolation    in certain compounds
     北           kita            hoku
     人           hito            jin
     神           kami            shin (e.g., Shintō)

Relative Usefulness of Various Writing Systems

So we've identified the three basic types of writing systems. What kinds of languages is each of the three types of writing system best suited for?

Basically, logographic systems are best suited for isolating languages, languages like Chinese with few bound morphs. Syllabic systems are best suited for languages with a very limited range of possible syllabic types. The total number of basic graphemes (leaving aside the complex characters and the diacritics exemplified in (5)) in the Japanese kana is about 50; the total number of syllables in the Ethiopic syllabary is 182. Syllabic systems may work well for agglutinative languages, if they satisfy what i've just said about a small number of distinct syllables and syllable types, and especially if every affix is a whole syllable.22

Synthetic languages seem to do best with alphabetic writing systems, especially when the range of possible syllabic types is as broad as it is in the early Indo-European languages.23 The same goes for polysynthetic languages. If in a given polysynthetic language all the morphemes consisted of whole syllables a syllabic writing system might serve, but i don't know of any such languages; the polysynthetic languages of North America that i know of all include some morphemes that consist of single consonants and others that consist of single vowels.

In considering the relative usefulness of different types of writing systems to different types of languages, i feel that, particularly here in China, i must address the fact that i don't know of any reasonable alternative to 漢字 as a representation of the Chinese language.24 Please bear in mind that everything i'm about to say in this part of the chapter is subject to correction from you, my students. In certain respects you know more about this subject than i do, since it's your language i'm going to be talking about. Please correct me if i make any mistakes.

22 There may be a question of how to indicate distinct binyanim, but then, binyanim are characteristic of Semitic languages which have always been quite satisfied with their consonantal-syllabic system.
23 The limited range of syllabic types represented by the Mesopotamian cuneiform writing system created serious problems in representing the Indo-European languages Hittite and Persian, and consequent problems for modern scholars trying to read these languages.
24 I will have more to say about this at the end of Chapter 20.

Minuses of 漢字: The first and most obvious problem with the 漢字 system is that it takes a lot of time and effort to learn. Common estimates place the total number of 漢字 at several tens of thousands (several 萬). Given the time it seems to take the typical American schoolchild to master just 26 letters, i don't want to think about how long it takes the typical Chinese schoolchild to master 1000 or so 漢字. I have to assume the training procedure is a lot more intensive than it is for us.

The large number of 漢字 not only greatly increases the time and effort needed to memorize them, in comparison to alphabetic systems with only a few dozen graphemes; it also greatly increases the risk of error, both in reading and in writing. In an alphabetic system, the total number of graphemes is so small that, typically, only a handful of them look enough alike to be confused with each other. The number of graphemes in the Chinese writing system, on the other hand, is so great, especially in comparison to the amount of space each is supposed to fill in a typical text, that the distinguishing characteristics are often very hard for the reader to identify or the writer to write/draw clearly.25

Pluses of 漢字: Having briefly discussed the problems with the 漢字 system, it is now time to discuss its virtues. (Oh yes, there are some!)

The one most often mentioned, the one i've already mentioned, is the fact that, because 漢字 bear so little relation to the pronunciation of the words and morphemes they represent, they can be used equally well for a variety of different languages. Not all languages, certainly. I mentioned earlier that for a long time it was commonly thought in Western culture that the Chinese writing system bypassed language entirely and constituted a direct representation of thought itself. Had this been true then 漢字 would presumably be equally serviceable as a system for representing any human language; and indeed there were some Europeans who advocated the adoption of a 漢字-like system for the representation of all languages, thereby overcoming all language barriers at least at the level of written language. That this is quite unfeasible is shown by the various adaptations of the Chinese writing system to other East- and Southeast-Asian languages such as Japanese. 漢字 serves superbly well any language that has the same grammatical structure as Chinese, any language that has a distinct morpheme equivalent to every morpheme in Chinese and in which the morphemes come in the same order as their Chinese equivalents. But the grammatical structure of Japanese, for instance, is quite different from that of Chinese, which is one important reason why Japanese, while adopting 漢字, has supplemented it with kana. But if two languages share the same grammatical structure, even if their speakers cannot understand each other's speech they can read each other's writing if they use 漢字. I've never come across this word in any discussion of the Chinese writing system, but i choose to call this feature of 漢字 intercommunicability.

25 As a personal note, i should mention that this is a matter of particular concern to me, struggling as i am to acquire not only some familiarity with spoken Mandarin but some ability at reading and writing Chinese. After i had already been studying Chinese for several weeks, i made what might have been the rather embarrassing discovery that i had quite happily been writing what i thought was 我 but which turned out to actually be 找. I've also had the experience of being confused by the radicals 門 and 鬥. You may laugh, but i really think this is a very easy mistake to make when you're just starting to learn this stuff. And the thing that scares me is that there are so many graphemes in Chinese that can trip one up in just this way!

In practice, this trait has long been important in the history of the Chinese nation. As it happens, the eight major languages of the 漢 nation and their various dialects, while they differ significantly from each other at the phonological level, have essentially the same grammatical structure, and so their written form is effectively identical as long as they use 漢字 or a 漢字-like system.26 This has enabled the Chinese nation to hold itself together, at least as a cultural unit if not a political one, through millennia, and as far as i can tell 漢字 is for this reason an essential part of Chinese national identity.

Not quite as important, but of interest and value from a scholar's point of view, is the ability of 漢字 to promote intercommunicability across time. Since the Chinese writing system has changed very little in over 3000 years, it is still possible for a reasonably well-educated Chinese person to read the classics of Chinese literature and philosophy without any kind of intermediary. I assume that if 孔子 or 老子 or the poet 杜甫 were to appear among us now and begin to speak, you wouldn't understand a word he said any more than you would a peasant from 湖南. But presumably you would have little trouble reading the 論語 or the 道德經 or the poems of 杜甫; even though none of you speaks the language of these great men from the distant past, you can still read their language as easily as you could read the works of modern poets and philosophers writing in Chinese. Contrast this with my situation, or rather the situation of English-speakers in general; i can to some extent read Beowulf and the writings of King Alfred, etc., but that's because of the specialized nature of my education and my peculiar interests in ancient languages. Most English-speakers, even quite well-educated ones, couldn't possibly read these texts in their original form, even though they're a lot younger than the 論語 or the 道德經 and even though they are, in a sense, written in English: that is, a language that was called English a little over 1000 years ago with as much justification as the language i'm using now has to the same name.

Another benefit of 漢字 has to do with the fact that, as a direct evolution of what is basically a pictographic system (a system of picture-writing), 漢字 directly connects each morpheme with a powerfully recognizable visual image. As a result, if you're looking at a page of printed text and trying to find a particular word, you have a clear mental image of what that word is supposed to look like, and it's relatively easy to find. Maybe you don't think so, but compare it to what an educated European like myself has to go through. If i'm looking at a page of text printed in English and i'm looking for a certain word, i have to bear in mind not a discrete visual image but a whole bunch of them, one for each letter. In practice, what i find myself doing usually is looking for the first letter of the word; but if that happens to be ‘s’ or one of the other common initial letters in English, i'm in trouble; there are likely to be several dozen words on that page that begin with that letter, and many of them will be about the same length as the word i'm looking for. In fact, i often end up having to juggle several different ‘search engines’ consciously in my mind all at the same time, which is not easy to do. Whereas an educated Chinese person scanning a Chinese text looking for a particular word presumably has in mind what is essentially a picture, a single grapheme, one single visual image which, although it may be rather complex and readily analyzable, tends nevertheless to be apprehended as a whole. As i've mentioned before, the skill necessary to do this kind of thing is difficult and time-consuming to acquire; but once it's acquired it makes a lot of tasks that the literate, educated person has to do a lot easier.

26 At least, that's the ideal. I am gradually learning that the reality does not quite live up to this idealized assumption. One of my students once informed me that she had experienced considerable difficulty trying to read a Cantonese newspaper in Hong Kong, not only because there were characters she was not familiar with but because those she was familiar with were often combined in ways that made no sense to her, a native speaker of Mandarin. Furthermore, i am becoming increasingly aware of the current inadequacy of the Mandarin-based 漢字 system to represent Taiwanese, a problem which i am convinced will have to be solved at some point not too far in the future.

Of all the pluses of 漢字 the most important, to my mind, is what i would call its function of disambiguation. Due to various historical processes, modern Mandarin is chock full of homonyms or, better, homophones: words that are pronounced the same way although differing substantially in meaning. All natural languages have some homophones; they seem to be a necessary consequence of the ways in which language evolves. But as far as i know, Mandarin and perhaps some of the other Chinese languages have a lot more homophones than any other language. In (13) i've listed all the words i can find on my computer representing just three syllables in Mandarin, complete with tone-specification.

(13) a. 代 帶 待 袋 戴 怠 殆 黛 貸 迨 玳 岱 逮 襶 埭 靆 紿 廗 瀻 軩 跢 艜 蹛 柋 酨

b. 一 壹 衣 依 醫 伊 揖 噫 漪 猗 咿 禕 繄 黟 曀 銥 泆 鷖 欹 郼 圪 溰 稦 燚 洢 陭 蛜 嫛 瑿 檹 毉 黳 嶬

c. 勿 物 務 惡 誤 悟 晤 霧 戊 鎢 塢 兀 騖 寤 軏 杌 婺 鶩 堊 沕 迕 遻 鋈 屼 扤 煟 卼 焐 靰 阢 粅 矹 芴 埡 逜 痦 齀 蘁 岉 噁 蓩

But the fact that i can represent this vast number of homophones in Mandarin merely by presenting the written forms of these morphemes makes my point: Thanks to 漢字, the amazing number of homophones characteristic of Mandarin is almost completely disambiguated in writing. I don't know of any alternative way of accomplishing this with any sort of alphabetic writing; as i shall discuss further in Chapter 20, the people who press to have the Chinese nation adopt an alphabetic writing system certainly haven't come up with any, so far as i know.
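The asymmetry at work here can be shown with a small lookup table. This is a minimal sketch, not a real dictionary: it takes only the first few characters from each list in (13), the romanized readings (dài, yī, wù, citation forms in pinyin) are supplied here as an assumption since (13) does not spell them out, and the names READINGS, by_sound, and candidates are invented for the example.

```python
from collections import defaultdict

# A handful of characters from each list in (13), with assumed pinyin
# citation readings. Each written form maps to exactly one syllable.
READINGS = {
    "代": "dài", "帶": "dài", "待": "dài", "袋": "dài", "戴": "dài",
    "一": "yī", "壹": "yī", "衣": "yī", "依": "yī", "醫": "yī",
    "勿": "wù", "物": "wù", "務": "wù", "誤": "wù", "悟": "wù",
}

# Invert the table: going from sound back to writing is one-to-many.
by_sound = defaultdict(list)
for character, syllable in READINGS.items():
    by_sound[syllable].append(character)


def candidates(syllable):
    """All characters in the sample that share this pronunciation."""
    return by_sound.get(syllable, [])


if __name__ == "__main__":
    for syllable, characters in by_sound.items():
        print(syllable, len(characters), " ".join(characters))
```

Going from character to sound is a plain lookup, while going from sound to character returns a list that the reader would have to narrow down from context; an alphabetic spelling stores only the ambiguous side of this table.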

↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓隨便↓↓↓

Vicissitudes of Alphabetic Systems

I'm going to close this chapter by talking a little bit about some of the problems that arise with languages making use of alphabetic writing systems. Remember that, by definition, an alphabetic writing system is one in which each grapheme represents a phoneme. I also mentioned that phonemes are very abstract entities. The result is that the definition i've given you of an alphabetic writing system (what is sometimes called the alphabetic principle, the association of graphemes specifically with phonemes) is often not understood very well, even by people growing up in Western cultures where alphabetic writing systems are the norm. A lot of people have the notion that the graphemes, the letters of the alphabet, are supposed to represent not phonemes but phones, the actual, superficial pronunciation of the language rather than the abstract phonemes on the basis of which the superficial phones are properly defined.

Spelling Reform

Beware of heard, a dreadful word
That looks like beard and sounds like bird,
And dead: It's said like bed, not bead –
For goodness' sake don't call it “deed”!
Watch out for meat and great and threat
(They rhyme with suite and straight and debt).

I'm sure you people, for whom the whole notion of an alphabet is a foreign cultural import and who, in addition to mastering the 漢字 system for your own language, have had to master English as well, will admit that it's a challenge.27 You may or may not be surprised or interested to know that it's just as much of a challenge to us who are brought up to it from childhood. Complaining about English spelling by native speakers of English is perennial; it goes on all the time.

There can be no question that the conventions of English spelling are not the clearest in the world. For starters, there's the problem that English has a lot of digraphs. A digraph is a pair of letters that together represent a single phoneme. Examples are given in (14). Note in particular that whereas the voiceless palatal affricate is represented by a digraph, ‘ch’, its voiced equivalent is represented by a single letter, ‘j’.

(14) ph [f]    th [θ/ð]    ch [tʃ]

Another complication of English spelling is that the same phoneme (or at least, the same phone) may be represented by different graphemes; witness (15). On the other hand, and probably even more confusing, especially to the foreigner, is that different phonemes, and certainly different phones, can be represented the same way graphically; there are some examples in (16). Vowels in English have even more of a problem this way. One of my teachers once gave me the phrase in (17) as an example.28 And then, of course, there is the problem of words that include letters that aren't pronounced at all, as in (18).

(15) a. George, Joe
b. cement, sent
c. cookie
d. foot, photo

(16) a. cement, candy
b. George, gorge
c. though, thought

(17) though the rough cough ploughed him thoroughly through and through
[ðoʊ ðə ɹʌf kɔf plaʊd hɪm θɪɹoʊli θɹu ænd θɹu]

(18) mnemonic  whole  resign  ghost  pterodactyl  write  hole  corps  psychology  sword  debt  gnaw  bough  lamb  island  knot

27 Frankly, i have to admit that, on the basis of what i've seen of my students' spelling in English so far, i'm quite impressed that, on the average, you people do as well as you do, given the added difficulties i've just alluded to; i would say that as far as spelling is concerned what i'm seeing in the way of English prose from my students here at SCU is, on average, almost as good as what i've seen from undergraduates in American universities.
28 In reading this phrase, it perhaps helps to note that ‘plough’ is the British spelling for the word that in American English is usually spelled ‘plow’.

Given all these complications (and others that i haven't mentioned), it's hardly surprising that people (very intelligent, very well-educated people) have struggled to promote spelling reform. One of the most vociferous, the most active promoters of a complete revision of the English spelling system was the Irish writer George Bernard Shaw (1856–1950), who pointed out that, given the superficial absurdities of English orthography, there was no apparent reason why the word ‘fish’ wasn't spelt as shown in (19).29 In his will, he left a sizeable sum of money as a prize for whoever came up with a perfectly rational alphabet and spelling system that would represent English phonetically. This contest was won in the 1960s by the invention of the ‘Shavian’ alphabet shown in Fig. 5.6.

(19) ghoti (‘fish’): gh as in ‘enough’, o as in ‘women’, ti as in ‘nation’
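The logic of (19), and the standard objection to it recorded in footnote 29, can be sketched in a few lines. This is a toy illustration only: the two positional conditions are simply the ones stated in that footnote (‘gh’ is [f] only at the end of a word, ‘ti’ is [ʃ] only before ‘o’), and the names PIECES, CONDITIONS, and violations are invented for the example.

```python
# Each piece of the joke spelling: (grapheme, example word, claimed sound).
PIECES = [("gh", "enough", "f"), ("o", "women", "ɪ"), ("ti", "nation", "ʃ")]

# Positional conditions under which a grapheme really has that sound,
# as stated in footnote 29. No condition is stated for 'o'.
CONDITIONS = {
    "gh": lambda word, start: start + 2 == len(word),          # word-final only
    "ti": lambda word, start: word[start + 2:start + 3] == "o",  # before 'o' only
    "o": lambda word, start: True,
}


def violations(word, graphemes):
    """Return the graphemes whose positional condition fails in `word`."""
    failed, start = [], 0
    for grapheme in graphemes:
        if not word.startswith(grapheme, start):
            raise ValueError(f"{grapheme!r} not found at position {start}")
        if not CONDITIONS[grapheme](word, start):
            failed.append(grapheme)
        start += len(grapheme)
    return failed


if __name__ == "__main__":
    spelling = "".join(g for g, _, _ in PIECES)
    sounds = "".join(s for _, _, s in PIECES)
    print(spelling, "->", sounds)                          # ghoti -> fɪʃ
    print(violations(spelling, [g for g, _, _ in PIECES]))  # ['gh', 'ti']
```

Read naively, letter by letter, the pieces do add up to ‘fish’; once position is taken into account, two of the three pieces fail, which is the footnote's point.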

Fig. 5.6 — Shavian Alphabet

Yet, there are reasons why most attempts at reforming English orthography usually fail, and reasons why this is, on the whole, a good thing. I'll mention a few of the ones that i think are most important.

One of the most important works of modern phonological scholarship is a book called The Sound Pattern of English, by Noam Chomsky and Morris Halle, two very noted linguists in Massachusetts.30 This book is a thorough study of the phonology of the English language, and it includes the rather surprising argument that, as irrational as it appears on the surface, the English spelling system is actually fairly well-motivated from a phonological point of view. Chomsky and Halle argue that what English orthography represents is not the superficial pronunciation of the language but its underlying phonological structure. Now, i've already said that this is in fact what an alphabetic writing system is supposed to do, but until the late 1960s very few people even among linguists really understood this.

29 It could be (and indeed has been) pointed out that Shaw's example is not really believable (though of course the fellow cannot be absolved of facetiousness): ‘gh’ is never pronounced [f] except at the ends of a few words like ‘rough’, and ‘ti’ is never pronounced [ʃ] except when followed by ‘o’. But this scholarly rebuke doesn't change the much more important fact that English spelling is quite wild, woolly, and wacky!
30 Noam Chomsky & Morris Halle (1968) The Sound Pattern of English. New York: Harper & Row.

I'm not going to take the time here to summarize Chomsky & Halle's argument. Instead, i'm going to mention a couple of salient points that have long struck me as particularly telling. Chomsky & Halle acknowledge that English orthography is not a perfect representation of the phonology of either the American or the British dialect of Modern Standard English, but that it deviates from such a perfect representation about as much as it deviates from being a perfect representation of the phonology of any other major English dialect.31 To try to push English orthography toward a more accurate representation of Modern Standard English (either the American or the British version) would almost automatically move it away from the other major dialects. And that just doesn't seem fair. There is a very strong sense among most native speakers of English that no one dialect should be privileged over others any more than any one social class or ethnic group should be privileged over others. In fact, to a great many of us the idea of granting such a privilege to one dialect of English smacks precisely of an undemocratic privileging of a social class or ethnic group.

Another reason why attempts to make English orthography more ‘phonetic’, and to spell words more the way they're pronounced, are misguided has to do with homophones. I mentioned this issue earlier in my explanation of why the 漢字 system serves Chinese so well. But as i said then, homophones exist in all natural languages, though perhaps not as commonly as in Mandarin, and as in Mandarin it helps to have a writing system that disambiguates them.

I'll mention just one example of this sort of thing in English; it happens to be my personal favourite. Many years ago i was reading to my wife from a novel set about 200 years ago a description of a rural scene, and this description included the phrase ‘clipped yews’. Well, i read that part, and had gone on several paragraphs before it became obvious that my wife was a little confused. It turned out that, although having seen the phrase in print i knew perfectly well it referred to a kind of tree or bush that had been trimmed for ornamental purposes, she was imagining that it referred to sheep who had been shorn of their wool. In the very rural, agricultural setting in the story either meaning was, in fact, plausible. Only the fact that the two words ‘yew’ and ‘ewe’, although pronounced exactly the same way, are spelled quite differently could make the meaning clear.32

(20) [klɪpt juːz]

‘clipped yews’ (樹枝被修剪過的紫杉)    ‘clipped ewes’ (毛被剪短的母羊)

French

I want to take a moment to make clear that English is not the only language with an alphabetic writing system that includes complications that are daunting (恫嚇的) to foreigners. As can be seen in (21), French, like English, has silent letters; indeed, French is notorious for ignoring the letters at the ends of words. And, like English, French has different ways of representing the same segment, as shown in (22), and again, these alternative spellings can be and are used to clarify homophones.

31 By ‘major English dialect’ here, i mean the major dialects of the British Isles, North America, Australia, and New Zealand: the half-dozen or so countries in which English is spoken as a native language by the overwhelming majority of the population. The dialects of English that are indigenous to places like India, where the language is of great importance but only a small minority speak it natively, represent a more complex situation.

32Another example: the English verbs ‘straighten’ (使直) and ‘straiten’ (變狹 — related to ‘strait’ 峽 as in ‘Taiwan Strait’, ‘Straits of Gibraltar’). It is possible for both to show up in similar contexts: ‘The path strai(gh)tened’, ‘The valley strai(gh)tened’.

(21) beaucoup [boku] ‘much’
yeux [jœ] ‘eyes’
guetter [gɛte] ‘watch, spy on’
temps [tɑ̃] ‘time, weather’
bruit [bʁɥi] ‘noise, rumour’
vingt [væ̃] ‘20’
bon [bɔ̃] ‘good’    bond [bɔ̃] ‘leap’
tous les jours [tu le ʒuʁ] ‘every day’
Je les veux tous [ʒə le və tus] ‘I want them all’

(22) os [o] ‘bone’    au [o] ‘to’    eau [o] ‘water’
père [pɛʁ] ‘father’    pair [pɛʁ] ‘peer’
verre [vɛʁ] ‘glass’    vair [vɛʁ] ‘squirrel-skin’33
pain [pæ̃] ‘bread’    tiens [tjæ̃] ‘Well!’    lapin [lapæ̃] ‘rabbit’    peintre [pæ̃tʁ] ‘painter’
six [sis] ‘6’    glisse [glis] ‘slide’    caprice [kapʁis] ‘whim’

Polish

The complications in English and French spelling can be accounted for by details of the histories of the two languages. The complications in the spelling of Polish are due at least in part to details of the political history of the Polish nation.34 For roughly 150 years, from the late 18th century to the early 20th century, Poland as a state, a political unit, did not exist; the country we now call Poland (波蘭) had been divided up between the Hapsburg, Prussian, and Russian Empires. To help them preserve a sense of their ethnic identity during this period, the Polish people were left with two institutions: their religion and their language, and so they have developed a fierce sense of loyalty towards those two institutions, and are rather unwilling to see either of them change.

The result is that Polish orthography enthusiastically (if not aggressively) maintains a lot of peculiarities that may at one time have had some historical justification but which have been gradually weeded out of the orthographies of other Western languages, most notoriously the facts that (1) the phoneme /ʐ/ is represented in Modern Polish in at least three different ways, ‘ż’, ‘ź’, and ‘rz’; (2) the letters ‘c’, ‘l’, and ‘o’ each represent at least two distinct phonemes;35 and (3) diacritics are used to label nasalized vowels, nasalization being phonemic in Polish. I've given in (23) some examples of these complications and a few others. A Polish-ethnic friend of mine has pointed out that the word in (23e), among other things the name of a major Polish city, isn't pronounced at all the way a Westerner would expect from its spelling!

(23) a. gaża [gaʐa] ‘salary’ (薪水)    doraźnie [doraʐnie] ‘immediately’ (立即)    futrza [futʐa] ‘hide’ (皮)

b. koci [kotsi] ‘of a cat’ (貓的)    kucać [kutsatɕ] ‘cower’ (畏縮)    krzyczeć [kʐɨtʃɛtɕ] ‘cry’ (哭)
kocz [kotʃ] ‘carriage’ (馬車)    kochać [koxatɕ] ‘love’ (愛)    korzyść [koʐɨɕtɕ] ‘benefit’ (利益)
kość [koɕtɕ] ‘bone’ (骨)    kościany [koɕtsjanɨ] ‘of bone’ (骨的)    kościół [koɕtsjuw] ‘church’ (教堂)
Konfucjusz [konfutsjuʃ] ‘Confucius’ (孔子)

c. gala [gala] ‘gala’ (盛會)    galowy [galovɨ] ‘of a gala’ (盛會的)    galówka [galuvka] ‘festival’ (節日)
gała [gawa] ‘globe’ (地球)    głowa [gwova] ‘head’ (頭)    głośnik [gwoɕnik] ‘loudspeaker’ (揚聲器)

d. gałąź [gawãʐ] ‘branch’ (樹枝)    przejechać [pʐejexatɕ] ‘cross’ (橫穿)    przejęczeć [pʐejẽtʃɛtɕ] ‘groan’ (哼)

e. łódź [wudʐ] ‘boat’ (船) (also name of a city)

33 The homophony between the French words for ‘glass’ and ‘squirrel-skin’ is supposed to be the origin of the peculiar references in the Cinderella story to a ‘glass slipper’; in the original, it's supposed to have been a slipper made (much more reasonably) of squirrel-skin.
34 This is an example of something we shall be discussing at much further length in Part II, the effects of general political and social developments on language.
35 Fortunately, the modern orthography supplies diacritical marks to distinguish between these various pronunciations of these letters.

Irish

One last European language whose spelling is very far from being an obvious representation of how it's supposed to be pronounced is Irish. The phonology of Irish, like that of other Celtic languages, is complex, and Irish spelling is filled with resulting complications, including lots of letters that aren't pronounced themselves but which modify slightly the pronunciation of other letters near them. I've given you a few examples in (24), including at the end the Irish name of the capital of Ireland, a name which looks to a native English-speaker like three words involving anywhere from four to six syllables, but which is actually only two syllables long.

(24) amhras [ãurəs] ‘doubt’
     leabhair [ljourj] ‘books’
     Aoine [iːnji] ‘Friday’
     Meán Fhónhair [maːn oːr] ‘September’
     Baile Átha Cliath [bljaː kljiəh] ‘Dublin’

Fijian

In closing, i'm going to talk for a few minutes about the attempt during the early 19th century to come up with a suitable writing system for the Fijian language. It presents some good examples of what we can learn about a language in the process of dealing with it in written form, and also of the confusions that can reign in the minds of people who, even though they have lived all their lives with an alphabetic writing system, have still not understood the alphabetic principle: that the individual graphemes of an alphabetic writing system are, ideally, supposed to represent phonemes, not phones.36 Let it be understood that the explorers and missionaries who were ultimately responsible for developing a written form of the Fijian language were all Westerners, and they naturally used the Roman alphabet as their basis; so it is not surprising that Fijian is in fact written using the same Roman alphabet that we use to write English. The interest is in the details of how that alphabet was adapted to the Fijian language.

First of all, it needs to be understood that Fijian, like many other languages, has no consonant clusters; every syllable ends with a vowel and begins with at most one consonant. This is a fact of the grammar, of the phonology of Fijian. This is not, however, the way a naïve Westerner experiences the language at first, because in Fijian many stops are ‘pre-nasalized’ in many if not all positions; that means that each voiced oral stop is apt to be preceded, in pronunciation, by the corresponding nasal. So the phonemes /b/, /d/, and /g/ are usually or always pronounced as [mb], [nd], and [ŋg].37 To us, both you Taiwanese and me as a Westerner, these are likely to sound like consonant clusters, and indeed that's how they were first recorded by explorers and missionaries trying to come up with a written representation of the language: as sequences of the letters ‘m’ or ‘n’ followed by ‘b’, ‘d’, or ‘g’. But this created serious problems when the missionaries tried to teach the Fijian people to read what was, after all, meant to be simply a written representation of their own language. The natives were quite confused by these written clusters, and in trying to read them tended to insert vowels into them, even though these vowels were not present in the actual words, nor were they present in the written form. Thus, as shown in (25), they tended to read the name ‘Lakemba’ as ‘la-ke-ma-mba’.

(25) “Lakemba” read as ‘la-ke-ma-mba’

36 My information on this particular subject comes mostly from Albert J. Schutz (1985) The Fijian Language. Honolulu: University of Hawaii Press, especially pp. 27–34.
37 There seems to be some dialectal variation on how consistent this pre-nasalization is.

[mb], [nd], etc. may sound like clusters to us, but they're single phonemes in Fijian. To be thoroughly consistent, therefore, the alphabetic principle requires that they be represented by a single grapheme apiece.
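The contrast between the two analyses can be checked mechanically. The following Python sketch is only an illustration: the function name `cv_syllables` and the use of ordinary letters to stand for phones are assumptions made here, not anything taken from the Fijian sources. It tries to split a word into syllables of the shape (C)V, and it shows that [lakemba] fits the Fijian syllable pattern only if [mb] counts as one consonant.

```python
import re

VOWELS = "aeiou"

def cv_syllables(word, single_units=()):
    """Split `word` into (C)V syllables; return None if that is impossible.

    `single_units` lists multi-letter strings that are to be treated as
    ONE consonant (e.g. the pre-nasalized stops). This is a hypothetical
    helper for illustration only.
    """
    # Try longer units first so that 'mb' is preferred over a bare 'm'.
    units = sorted(single_units, key=len, reverse=True)
    consonant = "|".join([re.escape(u) for u in units] + [f"[^{VOWELS}]"])
    # A syllable is an optional single consonant followed by one vowel.
    syllable = re.compile(f"(?:{consonant})?[{VOWELS}]")
    syllables, pos = [], 0
    while pos < len(word):
        match = syllable.match(word, pos)
        if match is None:
            return None  # a cluster or a final consonant was found
        syllables.append(match.group())
        pos = match.end()
    return syllables

# Treating 'm' and 'b' as two consonants, the word cannot be syllabified.
print(cv_syllables("lakemba"))
# Treating 'mb' as a single consonant, every syllable is (C)V.
print(cv_syllables("lakemba", single_units=("mb", "nd", "ŋg")))
```

Under the second analysis the word comes out as la-ke-mba, which is the shape a Fijian speaker's grammar expects.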

On the other hand, of the 26 letters in the English version of the Roman alphabet there are several for which Fijian has no use at all, such as ‘Q’, ‘C’, etc. In the 1830's, the missionaries hit upon the obvious solution: Use these ‘extra’ letters to represent those Fijian phonemes that are not phonemes in English,38 or at least are not represented in English by single graphemes. The end result is represented in (26), and some examples of the resulting spelling of some native Fijian names are given in (27).

(26) ‘B’ = [mb]
     ‘D’ = [nd]
     ‘G’ = [ŋ]
     ‘Q’ = [ŋg]
     ‘C’ = [ð]

(27) ‘Lakeba’ [lakemba]
     ‘Beqa’ [mbeŋga]
     ‘Bau’ [mbau]
     ‘Cakobau’ [ðakombau]
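The correspondences in (26) amount to a small lookup table from graphemes to sounds. The Python sketch below is a minimal illustration under two assumptions that are not in the source: the function name `transcribe` is made up here, and every letter not listed in (26) is simply taken at face value.

```python
# Grapheme values from (26). Letters not listed are assumed, for the
# purposes of this sketch only, to keep their face value.
FIJIAN_VALUES = {
    "b": "mb",
    "d": "nd",
    "g": "ŋ",
    "q": "ŋg",
    "c": "ð",
}

def transcribe(spelling):
    """Turn a Fijian spelling into a broad transcription, one letter at a time."""
    return "".join(FIJIAN_VALUES.get(letter, letter) for letter in spelling.lower())

# The names in (27):
for name in ["Lakeba", "Beqa", "Bau", "Cakobau"]:
    print(name, "[" + transcribe(name) + "]")
```

Because each grapheme stands for exactly one phoneme, the conversion needs no lookahead and no exceptions, which is the alphabetic principle at work.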

In the process of making this adaptation, the missionaries also noticed that the dental stops [t] and [d] had palatal allophones that occurred before [i], as shown in (28). So they reasonably figured: Why bother using distinct letters to represent these? We'll just represent them with the letters ‘t’ and ‘d’, and anybody learning to read Fijian will just learn to pronounce them appropriately before ‘i’. Thus, the name of the island nation is actually spelled ‘Viti’, the ‘t’ being palatalized.39

(28) dina [ɲdʒina] ‘true’
     tina [tʃina] ‘mother’
     Viti [vitʃi] ‘Fiji’
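Since the palatal variants are predictable from the following vowel, a reader (or a program) can supply them without any help from the spelling. The Python sketch below illustrates this for the two dental stops only. The function name `pronounce_dental_stops` is hypothetical, the phonetic symbols are those of (28) as far as they can be read, and all other letters are passed through unchanged as a simplifying assumption.

```python
# Allophones of the dental stops before [i], following (28).
PALATAL_BEFORE_I = {"t": "tʃ", "d": "ɲdʒ"}
# Elsewhere 't' is plain and 'd' is the pre-nasalized stop of (26).
ELSEWHERE = {"d": "nd"}

def pronounce_dental_stops(spelling):
    """Apply the 'palatal before i' rule to 't' and 'd' in a spelled word.

    Other letters are copied unchanged (a simplification of this sketch).
    """
    word = spelling.lower()
    phones = []
    for index, letter in enumerate(word):
        next_is_i = word[index + 1 : index + 2] == "i"
        if letter in PALATAL_BEFORE_I and next_is_i:
            phones.append(PALATAL_BEFORE_I[letter])
        else:
            phones.append(ELSEWHERE.get(letter, letter))
    return "".join(phones)

for word in ["dina", "tina", "Viti"]:
    print(word, "[" + pronounce_dental_stops(word) + "]")
```

The rule lives in the reader's head rather than in the alphabet, so one letter per phoneme is enough.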

Once these improvements in spelling had been made, the natives were, in the missionaries' words, ‘delighted’, and are reported to have said, ‘Finally you understand the nature of our language; finally we are able to read the books you have written.’ And that's the way it's been ever since. The Fijian people, to the best of my knowledge, are quite pleased with the written form of their language.

But some Westerners aren't. Early in the 20th century, there was much fuss raised by people who were very upset because the letters of the Roman alphabet, as used in Fiji, didn't represent the same sounds that they represent in Europe or America. And many of these people complained that Fijian ought to be spelled ‘phonetically’, not realizing that an alphabetic script is not ‘phonetic’ but rather ‘phonemic’. What they meant, of course, was that Fijian should be spelt so that Europeans could read it without having to alter long-ingrained habits, even though that would require some rather ugly-looking words (cf. (29)) and would resurrect all the problems the Fijians themselves had had with the earlier attempts to spell their language ‘phonetically’. The arguments got rather vicious, and some rather nasty things were said about the actually very intelligent and astute missionaries who had devised the Fijian orthography in the 1830's. In fact, there were some proposals that the Fijian language itself should be changed, to get rid of the pre-nasalized consonant ‘clusters’ (not really clusters at all, as i've already explained) so that Fijian could keep its established spelling but be pronounced more to a European's liking.

38 Note that, as mentioned earlier, this is the same strategy by which the ancient Greeks got some of the Phoenician letters to represent vowels.
39 The ‘v’ is due to the fact that the labial fricative is not distinctively voiced or unvoiced; in Fijian, the letter ‘f’ is used only in spelling foreign names.

(29) Standard Fijian Spelling    Ignorant Western Proposed ‘Reform’
     qaciqacia                   ng-gathing-gathia
     qaqa                        nggangga

A big part of the problem here was that the protesters, at some level, apparently believed that the sound values that we Western Europeans associate with the letters of the Roman alphabet are somehow inherent in the letters themselves, rather than being an arbitrary convention that each nation can decide for itself.40

Fortunately for the cause of humanity in general and the Fijian people in particular, wiser counsel prevailed. One person pointed out that ‘The Fijian language was and is written primarily (ca. 90%) by and for Fijians; and the present system is not at all difficult or confusing to them, but rather a model of simplicity and consistency.’ Another noted that English orthography departs in many ways from a ‘phonetic’ ideal, and concluded ‘I cannot see why there should be sarcastic remarks because our written language in the English script does not happen to be pronounced exactly as English is pronounced.’ The wise and rational point of view was summed up beautifully by one person who said, ‘Let those who want to correctly pronounce the Fijian language take the trouble of mastering the few peculiarities of its spelling.... The language belongs to the people and should not be tampered with to gratify the whim of globe-trotters lazily looking at a chart. It seems absurd to correct a language to suit people who will not go to the trouble of learning it.’

The final conclusion was that, in maps published primarily for the use of foreigners in far-away places, place-names would be spelled in a way that would facilitate their pronunciation by such foreigners, but that maps and everything else published for use in Fiji would use the standard Fijian spellings. One guidebook intended for use by both kinds of people, Fijians and tourists, opted to use the standard Fijian spellings throughout, and the author included the following advice: ‘The reader who desires accuracy of pronunciation may achieve it from the Fijian spelling merely by spending a few minutes in studying the conventions and an hour or so practicing them.... The Fijian people themselves ... are entitled to spelling which they can recognize.’


40 Indeed, as already noted, the Roman alphabet as used in Poland and Hungary is quite different, as far as pronunciation is concerned, from the way it's used in the West. An added problem, of course, was subtle racism: the Poles and Hungarians, after all, were civilized, European people who had been using the Roman alphabet for centuries; they were entitled to their idiosyncrasies. But the Fijians had only had a written language for 100 years; who were they to disagree with the mighty Europeans on the values of the letters of the venerable Roman alphabet? Of course, nobody, even in the 1920's and 30's, was so crude as to say this out loud.

Japanese Kana

Hiragana — and their derivation from 漢字.

Katakana — and their derivation from 漢字.

History

[Diagram: historical relationships among writing systems. Labels: Egyptian; Phoenician; Hebrew; Arabic; Ethiopian; Greek; Devanāgarī & other Indic alphabets; Cyrillic & other Eastern European alphabets; Roman; Hangul; Runic; Ogham]

Arabic

ا ’    ب b    ت t    ث ṯ    ج ğ    ح ḩ    خ ḫ
د d    ذ ḏ    ر r    ز z    س s    ش š    ص ş
ض ḑ    ط ţ    ظ ż    ع ʕ    غ ġ    ف f    ق q
ك k    ل l    م m    ن n    ﻩ h    و w    ى j

The 28 basic letter forms (read from right to left)

The variant forms used word-initially, word-internally, and word-finally, and the names of the letters

Hebrew (read from right to left)

Name (Meaning)               Form   Value
Aleph (ox)                   א      ʔ
Beth (house)                 ב      b
Gimel (camel)                ג      g
Daleth (door)                ד      d
He (window)                  ה      h
Waw (nail)                   ו      w
Zayin (spear)                ז      z
Xeth (fence)                 ח      x
Ţeth (snake)                 ט      ţ
Yodh (hand)                  י      j
Kaph (palm of the hand)      כ      k
Lamedh (goad)                ל      l
Mem (sea)                    מ      m
Nun (fish)                   נ      n
Samekh (support)             ס      s
Ayin (eye)                   ע      ʕ
Pe (mouth)                   פ      p
Tsade (fishhook)             צ      ts
Qoph (eye of a needle)       ק      q
Resh (head)                  ר      r
Shin (tooth)                 ש      ʃ
Taw (mark)                   ת      t, þ

The Ethiopian Syllabary

अ a    आ ā    इ i    ई ī    उ u    ऊ ū    ऋ ŗ    ॠ ŗ̄    ऌ ļ    (ॡ ļ̄)    ए e    ऐ ai    ओ o    औ au
क k    ख kh    ग g    घ gh    ङ ŋ
च tʃ    छ tʃh    ज dʒ    झ dʒh    ञ ɲ
ट ʈ    ठ ʈh    ड ɖ    ढ ɖh    ण ɳ
त t    थ th    द d    ध dh    न n
प p    फ ph    ब b    भ bh    म m
य j    र r    ल l    व w
श ʃ    ष ʂ    स s    ह ɦ

The Dēvanāgarī Alphabet (as used in Sanskrit)

Korean — the basic CV combinations of the Hangul writing system


Phoenician Models for some Greek Letters

Greek Alphabet — showing both upper-case and lower-case forms

Form:   Α α     Β β     Γ γ     Δ δ     Ε ε       Ζ ζ     Η η     Θ θ
Name:   alpha   beta    gamma   delta   epsilon   zeta    eta     theta
Value:  a       b, v    g, ɣ    d, ð    e, ɛ      ts      eː      th, θ

Form:   Ι ι     Κ κ     Λ λ      Μ μ     Ν ν     Ξ ξ     Ο ο       Π π
Name:   iota    kappa   lambda   mu      nu      xi      omikron   pi
Value:  i, j    k       l        m       n       ks      o, ɔ      p

Form:   Ρ ρ     Σ σ,ς   Τ τ     Υ υ       Φ φ     Χ χ     Ψ ψ     Ω ω
Name:   rho     sigma   tau     upsilon   phi     chi     psi     omega
Value:  r       s       t       y         ph, f   kh, x   ps      oː

The letter Σ has two lower-case forms, one used at the ends of words and the other used everywhere else. The letters Θ, Φ, and Χ represented aspirated stops in the Classical period, but later, in Byzantine and Modern Greek, they represent fricatives.

Cyrillic (as used in Russian)

Form:            А    Б    В    Г    Д    Е       Ж    З    И    К    Л
Phonetic value:  a    b    v    g    d    je/jo   ʒ    z    ji   k    l

Form:            М    Н    О    П    Р    С    Т    У    Ф    Х
Phonetic value:  m    n    o    p    r    s    t    u    f    x

Form:            Ц    Ч    Ш    Щ     Ъ    Ы    Ь    Э    Ю    Я
Phonetic value:  ts   tʃ   ʃ    ʃtʃ   '    ɨ    j    e    ju   ja

The letters Е, И, Ь, Ю, and Я have the property of palatalizing the preceding consonant; the letters А, О, У, Ы, and Э lack this property. The letter Ъ represents a syllable-boundary between two phonemes which might otherwise be included within the same syllable, e.g., between the prefix ob- and a stem beginning with a vowel. The letter Е represents a back vowel if it is in a stressed syllable, otherwise it represents a front vowel.

Common Germanic Fuþark

Rune             ᚠ        ᚢ         ᚦ         ᚨ        ᚱ         ᚲ
Phonemic Value   f        u         þ         a        r         k
Name             fehu     ūruz      þurisaz   ansuz    raiðō     kaunaz
Translation      wealth   aurochs   giant     god      journey   ulcer

Rune             ᚷ        ᚹ         ᚺ         ᚾ        ᛁ         ᛃ
Phonemic Value   g        w         h         n        i         j
Name             gebō     wunjō     hagalaz   nauþiz   īsa       jēra
Translation      gift     joy       hail      need     ice       year

Rune             ᛇ        ᛈ         ᛉ            ᛊ        ᛏ         ᛒ
Phonemic Value   ei       p         z            s        t         b
Name             eihwaz   perþ      algiz        sōwulō   teiwaz    berkana
Translation      yew      ?         protection   Sun      Tiw       birch

Rune             ᛖ        ᛗ         ᛚ         ᛜ        ᛟ             ᛞ
Phonemic Value   e        m         l         ŋ        o             d
Name             ehwaz    mannaz    laguz     inguz    ōðila         dagaz
Translation      horse    man       lake      Ing      inheritance   day

‘Tiw’ and ‘Ing’, the names of the runes ᛏ and ᛜ, are the names of gods. Most of the other Rune-names were common nouns in early Germanic, although there is much controversy over the proper interpretation of ‘perþ’, the name of ᛈ.
