Europaisches Patentamt European Patent Office © Publication number: 0 271 619 A1 Office europeen des brevets

© EUROPEAN PATENT APPLICATION

© Application number: 86309765.5 © int. CI* G06F 3/00 , B41J 3/00

© Date of filing: 15.12.86

© Date of publication of application: © Applicant: Yen, Victor Chang-ming 22.06.88 Bulletin 88/25 20 Nassau Street Princeton New Jersey(US) © Designated Contracting States: BE CH DE FR GR IT LI NL SE © Inventor: Yeh, Victor Chang-ming 20 Nassau Street Princeton New Jersey(US)

© Representative: Skone James, Robert Edmund etal GILL JENNINGS & EVERY 53-64 Chancery Lane London WC2A 1HN(GB)

© Phonetic encoding method for Chinese ideograms, and apparatus therefor.

© A method and apparatus for data processing and word processing in the Chinese language are dis- PHnNFTir ffllMFSf Al PHABTT (PfAl- 8* « P¥X*»< t+iZ 1***, (Ptt) is de- closed. A Phonetic Chinese Language (PCL) « 8f + $ n £ £ fined in which any ideogram can be unambiguously represented by a Phonetic Chinese Word (PCW) no more than four characters in length, each word being CONSONANTS — 25 « i77*i composed of letters selected from a defined set of 12 3 4 5678 9 10 11 letters that can each be uniquely represented by a rj * * a if h 7-bit digital code. Each PCW represents one and D p m f a t n 1 g It h only one ideogram and provides the full sound and tone information required to pronounce it. Ambigu- 12 13 14 15 16 17 18 19 20 21 22 83 84 85 ities caused by homonyms and homotones are LoogH-t* ±t>B 3 + * T ¥ ill f avoided. PCL words are translated into their cor- jqx zt>chstir zcse YWYii responding ideograms and vice versa by means of a stored monosyllabic dictionary. A method for unam- biguously separating a polysyllabic PCL character TM* VOWELT0NES " 60 « JlWtt string into separate words is also provided, which 23 27 31 35 39 43 47 51 55 59 63 67 71 75 79 ^ makes it unnecessary to employ a polysyllabic dic- ©tionary. Also disclosed is a method of forming an i l/- il o/i u il ii ou to en eng an ongang er/l/- J""aiphagrammic listing from PCL character strings by CO ^separating the strings into separate characters and 24 28 32 36 40 44 48 52 56 60 64 68 72 76 80 relisting them in alphabetical order, provided that (2) ARhfe5 ^Jub X2±4J; >J« homotones and identical ideograms are grouped to- CM 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 even if strict alphabetical ordering of the gether IfiE53$£JiiPx7:F$T 7' O string would have separated them. A keyboard (3) n adapted for efficiently entering PCL characters for 26 30 34 38 42 46 50 54 58 62 66 70 74 78 82 HI processing is also disclosed. (4) JftfefiS ft

Xerox Copy Centre 1 0 271 619 2

METHOD AND APPARATUS FOR DATA PROCESSING AND WORD PROCESSING IN CHINESE USING A 'HONETIC CHINESE LANGUAGE

This invention is directed towards a method such an output is sufficient for word processing and apparatus for data processing and word pro- purposes, it is insufficient for data processing pur- cessing in the Chinese language, and more particu- poses. Since ideograms cannot be alphabetized, it arly by the use of a defined Phonetic Chinese is impossi ble to place the ideogram output of any .anguage, which avoids ambiguities resulting from 5 data processing system into alphabetical form. This lomonyms and homotones. hinders the creation of efficient dictionaries, tele- Modern Chinese is primarily polysyllabic. Tra- phone directories, personnel directories and other ditionally, each written Chinese word is made up of sorted or alphabetical listings. Thus, there is a Dne or more ideograms, which are pictorial repre- need for a non-ideographic representation of Chi- sentations of a concept or thing. Each ideogram w nese that can be sorted, listed alphabetically, and nas a monosyllabic pronunciation. The use of mon- so forth. syllabic words is insufficient, however, in the spok- In an effort to overcome the foregoing prob- sn language, since Chinese includes a large num- lems, the Chinese government has developed an ber of homonyms, i.e., words (ideograms in this alphabetic representation of the Chinese ideo- :ase) that are written differently or have different 15 graphic language. This language, known as Hanyu meanings, but have the same sound. That is, a Pinyin, is representative of the pronunciation of single Chinese spoken can represent a Mandarin (Peking Dialect). The Peking Dialect has large number of different ideograms and therefore about 400 distinct monosyllabic sounds. Pinyin re- a large number of different meanings. This makes lies on 25 letters of the English (v is not it impractical to use monosyllabic words for oral 20 used) to phonetically represent all 400 of these communications. sounds. Pinyin is successful in achieving this result To overcome this problem, an oral language on a purely phonetic basis. There are 21 consonant has evolved which is primarily polysyllabic, wherein sounds and 16 sounds (the sound "i", "u" a plurality of ideograms are strung together to form and "u" may be added to the other vowel sounds a single polysyllabic word, which significantly nar- 25 to achieve an additional 18 compound vowel rows down the possible meanings of such word. As sounds) in the Chinese language. Each of these a result of the foregoing, oral Chinese is approxi- sounds can be uniquely represented by a combina- mately 80% polysyllabic (75% bisyllabic). Modem tion of one or more Pinyin letters. Thus, systems written Chinese has followed the oral language with employing Pinyin for both input and output have the result that in written Chinese, many ideogram so led to improvements in word processing efficiency compounds are used, which are polysyllabic. and convenience. Approximately 8,000 ideograms are used in the However, for generating ideogram output, a modern Chinese language. While the total number primary drawback of this system stems from the of ideograms is somewhat greater than 50,000 ned to differentiate the large number of homonyms most are rarely used and do not occur in the 35 in the Chinese ideographic language. Assuming a everyday language. In 1981, the People's Republic base dictionary of some 8,000 ideograms, every of China set up a standard set of 6,763 ideograms Chinese syllable (corresponding to a single ideo- which are to be used for telecommunications sys- gram) has an average of 20 homonyms (since tems in China. As a result, a base of about 8.000 there are about 400 distinct sound in ideograms will handle most practical applications of 40 Chinese) with the result that on the average, one Chinese language. Pinyin syllable indentifies 20 different ideograms. In The use of ideograms enjoys a strong cultural some cases, the number of homonyms for a given bias in China and serves as a unifying force within sound exceeds 150. the nation. For this reason, it is necessary that any Since the Chinese language is about 80 per- word processing or data processing system be 45 cent polysyllabic, and since only a limited number capable of generating Chinese ideograms as an of combinations of ideograms are employed to output. The use of ideograms as a direct input form polysyllabic words, this problem can partially medium is, however, impractical because of the be overcome in computer applications by storing a large number of ideograms (about 8,000) that polysyllabic Pinyin dictionary in computer memory. would be required on a keyboard. Also, since ideo- 50 When a polysyllabic Pinyin word is entered, a grams are not alphabetical, the task of processing limited number of possible corresponding combina- and ordering ideograms is difficult and cumber- tions of ideograms are identified, and often a single some. While it is important for data and word combination of ideograms can be uniquely iden- processing systems to output ideograms, and while tified by the polysyllabic word. However, the use of

2 3 0 271 619 4 a polysyllabic dictionary requires a substantially syllables are used by the Chinese language). larger storage capacity than if a purely monosyl- Recognizing the problem of homonyms, some labic (ideogram) dictionary were utilized and also prior art publications have suggested that a mean- significantly increases the processing time of ing indicating be added to each Pinyin syl- coverting from the Pinyin input to the ideograph 5 lable to indentify the specific ideogram desired. output. Even with the storage of a large polysyl- Since there are 25 characters in the Pinyin al- labic dictionary, the predominance of homonyms in phabet, 26 different ideograms can be identified by Chinese (approximately 40% of bisyllabic words adding one of the 25 characters (or by not adding have homonyms) prevents unique and unam- any character) to the end of a given syllable. This biguous mapping between Pinyin and ideograms. 70 system has not come into significant use, since in Since many ideographic words have the same the proposed systems the added letters have had pronunciation, and hence are mapped into a given no rational connection to the particular ideogram to phonetic Pinyin word, written Pinyin also has a be represented, and it is difficult, if not impossible, large number of homonyms. Systems utilizing to remember which specific letter corresponds to Pinyin as an input language generally require spe- 75 each specific ideogram. cial forms of spelling, or require that a character be The deficiencies of a sound-based language added at the end of a bisyllabic word to distinguish were recognized in 1928 by Y.R. Chao, who pro- between homonyms. Other phonetic conversion posed a phonetic system using the Roman al- systems require the operator to make manual se- phabet. This system used a tone-indicating letter lections from among a choice of displayed hom- 20 which was inserted in each sound syllable to in- onyms of individual ideograms or compound dicate the tone of the syllable. The primary prob- words. lem with this system is that the extraneous tone- Pinyin has additional major drawbacks, since it indicating letter prevents the establishment of a disregards the most fundamental characteristic of meaningful alphabetical listing of the resulting the Chinese language - the tone. Pinyin specifies 25 words. It is also much more difficult to read, and only distinct vowel or consonant sounds, i.e., pho- does not permit a unique identification between its nemes. Every Chinese syllable also has a tone, i.e., phonetic words and individual ideograms. an inflection or pitch pattern. The tone can have Summarizing the foregoing, Pinyin is deficient any one of the four pitch patterns illustrated in Fig. in two major respects: (1) it does not take tone into 1. As shown therein, the four tones are the first 30 consideration, and (2) it cannot distinguish between tone (1) which starts high and stays high, the homonyms. While modifying Pinyin or other prior second tone (2) which starts at an intermediate art systems to include tone and meaning-indicating level and rises high, the third tone (3) which starts letters would alleviate these problems to some de- at a medium level, dips low and then rises high, gree, this would create problems of its own since it and the fourth tone (4) which starts high and dips 35 would destroy the alphabetical nature of the lan- low. guage and make it very difficult to create a proper The combination of a sound syllable and the dictionary or other sorted listing. Yet another prob- tone associated therewith will be referred to here- lem with the modifications to Chinese proposed by after as a tone-syllable. Every ideogram of the the prior art is that the number of letters required to Chinese language, and therefore every syllable of 40 identify a par ticular ideogram would be signifi- the Chinese language, is pronounced as a tone- cantly increased, thereby reducing the readability syllable. of the language and making it very difficult to learn. Therefore, a tone-based system would have In any practical alphabetical system, each Chi- major advantages. Providing sound information nese word (consisting of one or more ideograms) alone is not sufficient, because it does not provide 45 must be typed as a single string of letters. Words the complete information required to properly pro- are separated by spaces. In the prior art systems, nounce an ideogram. Further, as explained above, there is no method for dividing single polysyllabic a sound-based system must deal with the full set words into their individual components, with the of homonyms for a given Chinese sound syllable, result that a polysyllabic dictionary must be stored, and can do so only unsatisfactorily, while a tone- 50 thereby increasing the memory requirements and based system need deal only with homotones processing time of the data processing or word (syllables which have the same tone as well as the processing system. Even if means were provided same sound). By resolving at the homotone level, for separating the polysyllabic words into their in- rather than the homonym level, the average num- dividual component syllables, the prior art alpha- ber of ambiguities caused by more than one ideo- 55 betical systems do not achieve a one-to-one cor- gram being represented by a given tone-syllable is respondence between the phonetic representations reduced significantly. The reduction is about three- of ideograms and the respective individual Chinese fold (only about three-fourths of the possible tone- ideograms themselves. Thus, the alphabetical re-

3 I 271 619 iresentation will often identify a plurality of ideo- distinguish between tst> nomotones (equivalent xo This jrams which must further be distinguished man- 340 homonyms) for all other tone-syllables. lally by the operator of the system. one-to-one correspondence between PCWs (which The present invention preferably utilises a Pho- contain all of the sound and tone information re- letic Chinese Language (PCL) which uses a Pho- 5 quired to pronounce a given tone-syllable) and letic Chinese Alphabet (PCA) to form Phonetic ideograms is not possible with prior art systems. Chinese Words (PCWs), each of which corre- A major advantage of the present invention is sponds to a single ideogram. The Phonetic Chi- the ability to write a Polysyllabic Phonetic Chinese lese Words are, in turn, strung together to form Word as an unbroken string of letters from the 3o!-. -yllabic Phonetic Chinese Words (PPCWs). 10 Phonetic Chinese Alphabet in a manner which per- Each PPCW corresponds to a single Chinese poly- mits a computer program to separate the PPCW syllabic compound word consisting of a plurality of string into individual PCWs without a pre-stored deograms. The Phonetic Chinese Language used polysyllabic dictionary. This aspect of the invention n the present invention preferably has the following is extremely important. As a result of this feature, jnique characteristics: 75 in combination with the one-to-one correspondence 1. It utilizes a truly tone-based alphabet in between PCWs and ideograms, it is not necessary /vhich a discrete set of letters provides all of the to store a polysyllabic dictionary in computer mem- 3honetic and tonal information to pronounce all ory. Rather, all PPCWs may be entered as continu- syllables of the Chinese language (Mandarin); ous chains of PCL letters, which are then subjected 2. It utilizes either a dominant-root principle 20 to a separation method which divides the PPCW Dr a semantic classifier principle to select an addi- into individual PCWs. The computer then refers to tional character to be added to some PCWs to a monosyllabic dictionary to convert each PCW to Drovide a unique one-to-one correspondence be- its corresponding ideogram. This significantly cuts tween PCWs and Chinese ideograms, such that down the storage requirements and processing 3ach PCW uniquely and unambiguously identifies a 25 time of any data processing or word processing single ideogram; and system utilizing the present invention. 3. It enables the use of separation logic to Another significant result of the use of the automatically divide a Polysyllabic Phonetic Chi- separation logic and unique one-to-one correspon- nese Word (PPCW) comprising an unbroken string dence between PCWs and ideograms is that a data Df PCL characters which together represent a poly- 30 processor can automatically produce an alphag- syllabic compound word (a Chinese word consist- rammic listing (AGL) from stored PPCWs in a man- ing of a plurality of ideograms), into individual ner that is not possible with prior art systems. An PCWs (which correspond to ideograms). alphagrammic listing is one which lists PCWs in In PCL a given sound syllable can be written in generally alphabetical order, but ensures that four different ways to indicate the four different 35 homotones and identical ideograms are grouped tones of the sound syllable. As a result of the tonal together even when the alphabetical order indicates nature of the alphabet, the language is highly they should be separated. A purely alphabetical readable and automatically provides three times PCL listing might result in words or phrases which greater resolution than a purely sound-based sys- have the same initial ideogram being separated tem. A data processor or word processor receiving 40 from each other due to the presence of a semantic a PCL input need deal with a average of only 6 classifier in some words and its absence in others. homotones rather than some 20 homonyms as in An alphagrammic listing avoids this possibility, and the prior art (assuming a set of about 8,000 ideo- groups all words having the same initial ideogram grams). together. The AGL is described in greater detail Since the tone-based alphabet provides three 45 below. times the degree of resolution of a sound-based As a result of the tone-based nature of the alphabet, and due to special characteristics of the PCL, and further as a result of the dominant-root PCA described below, it is possible to achieve one- and semantic classifier distinctions described be- to-one correspondence between PCWs and Chi- low, the PCL can uniquely identify all 50,000 + nese ideograms, even in those cases where a large 50 ideograms. Of the 8,000 ideograms in the primary number of homonyms exists. As will be shown in set, about 3,900 can be uniquely identified by of each greater detail below, the PCL of the present inven- using only three variations on the spelling tion can distinguish between 255 homotones (an PCW "roof following a defined "dominant-root" equivalent of 1 ,020 (255 X 4) homonyms) for tone- principle. These account for about 97 percent of syllables wherein the only vowel is the Pinyin 55 language usage in Chinese. Of the remaining ideo- sound "i", "u" or "ti"; can distinguish between 170 grams in the primary set, 80 percent can be iden- homotones (eqivalent of 680 homonyms) for tone- tified by using a semantic classifier which is similar syllables ending in the Pinyin sound "i"; and can or identical to the Chinese radical on which the 7 ) 271 619 deogram is based. Thus, the PCL is both concise coding of ideograms. Rather, eacn ideogram is ind has high readability. All other ideograms in the represented by tonally spelling the ideogram as a Chinese language can also be uniquely identified, PCW, which is coded as a unique combination of Dy using a single semantic classifier. Thus, the 7-bit PCA letter codes. Thus, each ideogram is 3CL can uniquely identify all Chinese ideograms. 5 coded as a combination of no more than 4 - and a Thus, the PCL uses a maximum of 4 letters frequency-weighted average of 2.4 - standardised and a frequency-weighted average of only 2.4 let- 7-bit PCA letter codes, which leads to a significant :ers per ideogram, compared to a maximum of 7 simplification of the hardware and software require- [possibly 8) and an estimated frequency-weighted ments for computerised Chinese text processing. average of 4 letters which would be required using

5 ) 271 619

Fig. 13 illustrates a 7-bit code for represent- the voweltones 27-30 or 79-82, whicn in tnis situ- ing the PCA in digital form. ation only add tone, but do not contribute a vowel sound. The short consonants, on the other hand, do not include a vowel sound and must be followed ^. Phonetic Chinese Language 5 in a PCW by a voweltone indicating both the vowel sound and the tone to be employed. The present invention is based on a tone- In addition to the 21 traditional consonants 1- based alphabet which is illustrated by way of ex- 21 , the PCA further includes a zero consonant 22 ample in Fig. 2. While this alphabet represents the and semi-consonants 83-85. The zero consonant inventor's presently preferred embodiment, other w 22 (indicated by the symbol 0) is silent, and is letter representations which carry the same or es- used as a syllable delimiter to separate individual sentially the same tone and sound information can syllables of polysyllabic words in certain specified be used. Whatever specific letter representations situations de scribed below. It is also used to are used, it is highly preferable that distinct, but distinguish between homotones using the related, letters be used to represent having 75 dominant-root principle discussed below. the same sound but different tones. The semi-consonants 83, 84 and 85 are pro- Also, as shown in Fig. 13, the PCA can be nounced with a vowel sound but act like con- encoded as a set of digital codes having only 7 sonants since they do not incorporate any tone. significant bits, which substantially simplifies hard- Rather, a tone must be added to them in a PCW. ware and software requirements over prior art sys- 20 The sounds of the semi-consonants 83, 84 and 85 tems. are identical to the sounds of the voweltones 27-30, As shown in Fig. 2, applicant's Phonetic Chi- 39-42, and 47-50, respectively, so each of the latter nese Alphabet includes 25 consonants and 60 vow- voweltones may be added to its respective semi- el tones (a voweltone is a letter indicating both a consonant to contribute a tone thereto. This adds vowel sound and the specific tone with which the 25 significant flexibility to the PCL, enabling resolution vowel is pronounced) for a total of 85 letters. Each between a higher number of homotones. More im- letter is assigned a sequential number which can portantly, the combination of one of 83, 84 or 85 readily be used for data processing purposes. In with another vowel forms the 18 Pinyin compound the chart of Fig. 2, the Pinyin sound equivalent to vowels. The inclusion of two separate sets of "i", the PCA letter, if such equivalent exists, is in- 30 "u" and "u" (83-85 versus 27-30, 39-42 and 47- dicated below the PCA letter. Pinyin does not al- 50) provides an important foundation from which ways distinguish between characters pronounced the separattion logic is eventually made possible. with the sounds "u" and "u" by including the The Chinese language includes 15 vowel umlaut, and this can lead to confusion as to how sounds, each of which can carry any one of the the character is to be pronounced. However, this 35 four tones illustrated in Fig. 1 , with the result that distinction is made clearly in the PCL to increase there are 60 distinct voweltones in the Chinese its readability. Since Pinyin letters do not include language. In the PCL, each of the vowels is broken tone information, the Pinyin equivalents to the PCA up into a family of four related voweltones, each voweltones are set forth only below the voweltones having the same sound but a different tone. that are pronounced with the first tone (see Fig. 1). 40 By way of example, the voweltones 23-26 all The same sound, but different tone, is utilized for have the same sound "a" but carry the first each of the related voweltones in the columns of through fourth tones (corresponding to the tones 1- Fig. 2. Thus, each of the voweltones 23-26 have 4 of Fig. 1) as indicated. Each letter of a voweltone the sound "a". Hereinafter, the letters of the Pho- family has the same base character but is distin- netic Chinese Alphabet will be referred to inter- 45 guished with the use of an additional line added changeably by their Pinyin equivalents, their as- somewhere within the base character to identify the signed numbers, or by the actual PCA letters them- second, third and fourth tones. With particular ref- selves. erence to the family of vow eltones 23-26, for The Chinese language includes 21 consonant example, a line is added to the bottom of the base sounds and 15 vowel sounds. The 21 consonant so character to identify the second tone; a line is sounds are listed in two rows corresponding to the added to the top of the base character to identify short consonant sounds and long consonant the third tone; and a line is added about onequarter sounds, respectively. Each long consonant sound of the way down from the top of the base character inherently has one of various basic vowel sounds to identify the fourth tone. Similar distinctions are built into it. Some Chinese ideograms correspond 55 made for each of the families of voweltones as to a long consonant sound; these must have a shown. tone-indicating character included in the corre- The voweltones 27-30 serve two purposes. sponding PCW. This is achieved by adding one of When they follow the short consonants, they are

6 11 0 271 619 12 pronounced "i" and include both sound and tone between the consonants 4 and 5 and between the information. When they follow the long consonants, zero consonant and the semi-consonant 83 under or the semi-consonant 83, they act as silent vowels the column for voweltones 35-38, to indicate that and carry tone only. In Fig. 2 this is indicated by a the different sounds assigned to the voweltones 35- dash. In the latter case, a default vowel sound is 5 38 depend on which consonant they follow. inherently contained in the long consonant or semi- The PCA is capable of representing about consonant itself. 3,000 tone-syllables. Many tone-syllables can be The voweltones 35-38 also serve a dual pur- written in more than one way using the PCA. This pose. When they follow the short consonants 1-4 or is shown in Figs. 4A-4J and is described further the semi-consonants 83-85, they are pronounced w below. The Chinese language incorporates only "o". When they follow the remaining letters, they 1,292 of these tone-syllables. The tone-syllables are pronounced "e". This dual use is made possi- which are not used in the Chinese language are ble by the fact that there are no tone-syllables in indicated in Figs. 4A-4J by the presence of a blank the Chinese language where the sound "e" follows space or a dash. the sounds "b", "p", "m", "f", "y", V and "Yu" 75 While the PCA can represent all 1,292 tone- and there are no tone-syllables in the Chinese syllables of the Chinese language, standard Pinyin language wherein the sound "o" follows the re- can only represent the 410 sound syllables of the maining consonant sounds. This efficient use of the Chinese language. The full sound table of Pinyin is voweltones 35-38 reduces by four the total number shown in Fig. 3. The increased resolution of the of letters required in the PCA. 20 Phonetic Chinese Language compared to Pinyin The voweltones 79-82 serve three purposes. will be readily apparent by comparing the tone and Whenever these voweltones are written alone or sound tables of Figs. 3 and 4A-4J. This additional following the zero consonant 22, they are pro- resolution of the PCL is achieved utilizing fewer nounced "er". Whenever they follow a short con- letters per syllable than the Pinyin system, thereby sonant, they are pronounced "i". Whenever they 25 increasing the readability of the Phonetic Chinese follow a long consonant or any of the semi-con- Language while providing more information than is sonants, they have no sound and provide tone possible using Pinyin. information only (the vowel sound being provided Employing the Phonetic Chinese Alphabet, it is by the long consonant or semi-consonant itself). possible to phonetically and tonally provide all of Each ideogram of the Chinese language is 30 the information required to pronounce a tone-syl- defined by a single tone-syllable which can take lable taking any of the possible forms CV, CSV, SV any one of the following forms: CV, CSV, SV'and and V. However, the sound and tone information V, wherein C is a consonant, S is a semi-consonant required to pronounce an ideogram does not in (a letter having a vowel sound but carrying no itself provide sufficient information to distinguish tone) and V is a voweltone (a letter having a vowel 35 between homotones. For this reason, if necessary, sound and a tone). Utilizing the letters illustrated in the PCL adds an additional classifying character to Fig. 2, the Phonetic Chinese Alphabet can provide the tone-syllable to distinguish between homo- all of the sound and tone (information required to tones. The particular character added to the tone- pronounce every tone-syllable (and therefore every syllable is determined either by a dominant-root ideogram) of the Chinese language. The manner in 40 system or by a semantic classifier system. which these letters may be combined to produce The dominant-root system is used to distin- the required information is illustrated in detail in guish between the three most commonly occurring Figs. 4A-4J, which is a tone table showing the PCL homotones (based on actual frequency of usage) representation of all the tone-syllables that occur in for each tone-syllable. In accordance with this sys- the Chinese language. In this table, the consonants 45 tern, a Phonetic Chinese Word (identifying a unique of the PCA are listed vertically and the voweltones ideogram) can be written in a primary form consist- horizontally. The Pinyin sound equivalent of each ing of the tone-syllable (TS) alone (if it is not PCA letter, as well as the number assigned to the necessary to distinguish homotones), in a secon- PCA letter, is indicated adjacent the PCA letter. dary form consisting of the tone syllable with its Figs. 4A-4D illustrate all of the tone-syllables so vowel repeated (TS + V), and in a tertiary form taking the form CV, SV and V. Figs. 4E-4J illustrate consisting of the tone-syllable followed by the zero all of the tone-syllables taking the form CSV. A consonant (TS + Z). For example, primary, secon- heavy horizontal line is drawn between consonants dary and tertiary forms for writing the tone-syllable 11 and 12 to separate the short consonants from "sha" are: ?A (primary), ? AA. (secondary) the long consonants, since the voweltones 27-30 55 and AT (tertiary). Utilizing this simple sys- . and 79-82 are pronounced differently depending on tern, each tone-syllable has achieved three addi- whether they follow a short or long consonant (see tional degrees of resolution, and each sound syl- below). Similarly, in Fig. 4A heavy lines are drawn lable has attained 12 (4 X 3) additional degrees of

7 £.1 I O 19 solution. The combined set or tone-synaDies wru- uuii^iuy a uui i iuihoiiv-'i i wi «.» v*w . semantic classifier system, each n in the primary, secondary or tertiary form is system and the between 85 homo- jfficient to represent approximately 97 percent of tone-syllable can distinguish to 340 homonyms). While this is e Chinese language in terms of frequency of tones (equivalent most tone-syllables, some xurrence. Thus, the PCA can be utilized to more than sufficient for than 85 homotones. Yiquely identify 97 percent of the ideograms occ- tone-syllables have more fall into two classes: (1 ) those jrring in the Chinese language based on fre- These tone-syllables wherein the only vowel is "i", "u" or uency of occurrence following the simple tone-syllables with the ominant-root rules alone. "u", and (2) those tone-syllables ending characteristics of Since it is relatively easy for an individual to ) vowel "i". By utilizing the unique Phonetic Chi- lemorize the three most frequent homotones for the Phonetic Chinese Alphabet, the 170 homo- ach tone-syllable, this provides a very practical nese Language is capable of resolving of 680 homonyms) for all tone- iput system. Even if the person entering the Pho- tones (equivalent "i" and 255 homo- etic Chinese Words (into a keyboard or other syllables ending in the vowel of 1,020 homonyms) for those iput device) does not remember which homotone 5 tones (equivalent the vowel sound is "i", ; the first, second or third most frequent in terms tone-syllables wherein only achieved in the following man- f occurrence, it is a simple and quick task to "u" or "u". This is lerely guess the appropriate PCW form, observe ner. in 2, the sound "i" can be ie corresponding ideogram shown on the display As shown Fig. the semi-consonant 83 or the creen and change the entry if the displayed ideo- o written utilizing either 27-30. the sound "u" can be ram does not correspond to the desired ideogram. voweltones Similarly, 84 the The homotones of the Chinese language which written utilizing the semi-consonant or sound "u" can be ccount for the remaining 3 percent of Chinese voweltones 39-42. Finally, the the semi-consonant 85 or the anguage usage are distinguished by use of a sys- written utilizing 47-50. While the semi-consonants 83- 3m of semantic classifiers. Each of the letters of 5 voweltones the above-mentioned he PCA can be used as a semantic classifier 85 do not include a tone, used to indicate tone when they epresenting a specific category of meaning (e.g. voweltones can be the same sound isects, mountains, trees), to provide a logical in- follow the semi-consonant having 79-82, men- lication of which homotone is desired. (This is information. Also, the voweltones as indicate tone when listinct from their use as indicators of sound and « tioned above, can be used to semi-consonants 83-85. one information.) The one exception is voweltone they follow the to write each of the '9, which is used only to identify a specific ideo- This makes it possible "u" and "u" in twelve different jraphic Chinese character called the "retroflex tone-syllables "i", 5A-5C. In the first row of deogram" as further discussed below. When used ways as shown in Figs. the semi-consonant is used is a semantic classifier, a PCA letter is attached at is each of these Figures, information and the voweltone he end of a tone syllable, where it conveys mean- to provide sound the same sound is used to provide tone ng to the reader, but not sound or tone. containing In the second row of Fig. 5, the semi- By way of example, the letters 72, 84, 68 and 3 information. used to sound information are identical or substantially identical to the tradi- consonant is provide used to provide ional ideographic radicals for: insects, worms (72); to while the silent vowels 79-82 are third of Fig. 5, the @nountains (84); earth, dirt (68) and trees, wood (3), tone information. In the row alone to both sound and •espectively. These letters are used as semantic voweltone is used provide of the PCA classifiers having these meanings. In the top row of tone information. This unique ability and the resolution power of Fig. 9A, these letters are added to the tone-syllable increases the flexibility to prior 11 i\x " to form four different PCWs. The asso- i5 the PCL to a substantial degree compared ciated Chinese ideograms (which incorporate sub- art systems. of the PCL for tone-syllables stantially the same radicals) are shown below the The resolution PCWs. ending in the sound "i" is also significantly greater of art This results Fig. 9B is another illustration of how semantic than the resolution prior systems. voweltones 27-30 and 79-82 classifiers can be used to distinguish between so from the fact that the "i" the homotones. This figure is a dictionary listing of can all be pronounced depending upon follow. When the vowel- PCWs in alphagrammic order from left to right, particular consonants they follow short consonant, they are along with their corresponding ideograms. Each tones 79-82 a "i". In fact, there are no tone-syllables ideogram incorporates the radical for "wood", and pronounced in which the sound "i" each PCW has charac ter (3), which is similar 55 in the Chinese language "f", "k", "h" or "r". thereto, at its end. Note further that the four entries follows the consonants "g", 79-82 used follow- in the dashed block marked 9b are homotones Thus, the voweltones are never 10, 11 or 18, so these which in Pinyin would not be distinguishable. ing the consonants 4, 9,

o 15 D271 619 16 combinations are available for distinguishing homo- described, it also possesses the third attribute. tones. Whenever the voweltones 79-82 follow a All Phonetic Chinese Words formed utilizing long consonant 12-21 or a semi-consonant 83-85 the Phonetic Chinese Alphabet take one of the (each of which has a vowel sound built into it by following two forms: default), they act as silent vowels which carry no 5 PCW = TS + G Eq. (1) sound but indicate the tone of the tone-syllable. PCW = TS Eq. (2) The voweltones 27-30 are also pronounced "i" wherein TS is a tone-syllable (taking one of the whenever they follow a short consonant. Whenever four forms CV, CSV, SV or V) and G is a single they follow a long consonant, they act as silent character of the PCA which is added to the tone- vowels which carry no vowel sound but indicate the 70 syllable to distinguish between homtones. This ad- tone of the tone-syllable. As a result of the fore- ditional letter is selected using either the dominant- going characteristics of the voweltones 27-30 and root principle or the semantic classifier principle as 79-82, the Phonetic Chinese Language has the described above. This letter, whether selected us- capability of writing 170 homotones ending in the ing the dominant-root or the semantic classifier sound "i": 85 wherein the base tone-syllable ends 75 principle, will be referred to as the generalized with one of the voweltones 27-30, and an additional semantic classifier G. 85 wherein the base tone-syllable ends with the Thus, the relationship of Equations (1) and (2) voweltones 79-82, with the result that the PCL can can be expressed more generally as uiniquely distinguish between 680 homonyms end- PCW = TS + Q Eq. (3) ing with this sound. 20 wherein Q is a generalized tone-syllable modifier Fig. 11 shows two examples of how the PCL which is defined to include both the generalized resolves ideograms having a large number of semantic classifier G and the null set 0 (i.e., the homotones and homonyms, in this case "sha", with omission of any letter). The generalized tone-syl- 24 homonyms, and "shi", having 86 homonyms. lable modifier Q can therefore represent either the Each row shows all the homotones of a given 25 absence of a letter or the presence of any of the tone-syllable. For example, the first row (marked letters of the PCA (except the voweltone 79 which, "14" on the right) shows the 14 homotones of the as discussed more fully below, is never used as a tone-syllable "sha" pronounced with the first tone. semantic classifier). Below each PCW is the corresponding ideogram. As described above, the tone-syllable can take The first three PCWs are the primary, secondary, 30 any of four forms: CV, CSV, SV and V. The gen- and tertiary PCWs according to the dominant-root eralized tone-syllable modifier Q may assume any system. In the remaining 11 PCWs, the third PCL one of the five forms 0, C, Z, V, or S (Z represent- is a semantic classifier. ing the zero consonant 22). Thus, PCWs may as- Referring now to the bottom section of Fig. 1 1 sume any one of the twenty distinct forms shown in (marked "40" on the right) there are seen the 40 35 Fig. 6. homotones of the tone-syllable "shi" pronounced When strung together, the forms of the first two with the fourth tone. In the first 33 homotones, the columns (CV, CSV) are totally distinguishable from vowel "i" is represented by voweltone 30. In the one another. The third and fourth columns last seven homotones, the vowel "i" is represented (disregarding the asterisks for the present) can, by voweltone 82. 40 however, be confused with the first and second columns if the PCWs of the third and fourth col- umns form part of a PPCW wherein the imme- B. Separation Logic diately preceding PCW ends in a consonant. More particularly, if a PCW of the third column follows a An ideal representation of the Chinese lan- 45 PCW taking the form CVC or CSVC, the PCW of guage has three attributes: the third column can be confused with a PCW of 1 . It provides all the sound and tone informa- the second column. Similarly, if a PCW of the tion required to phonetically and tonally pronounce fourth column follows a PCW taking the form CVC Chinese tone-syllables; or CSVC, it can be confused with the PCWs of the 2. It provides a simple and efficient method so first column. for distinguishing between homotones; and To avoid this possibility, the zero consonant 22 3. It provides a basis for separating a poly- is to be added by the writer of PCL text to the syllabic string into its individual components, each beginning of the PCWs of columns 3 and 4 when- of which corresponds to one ideogram, without ever one of these PCWs forms part of a PPCW resorting to a polysyllabic dictionary. 55 and the immediately preceding PCW takes the As described in detail above, the Phonetic Chi- form CVC or CSVC. This is indicated by the pres- nese Language of the present invention clearly ence of an asterisk in front of each PCW in the possesses the first two attributes. As will now be third and fourth columns. By following this simple

9 7 271 eia 3 ntry rule, it is possible to create a simple com- set equal to zero, i ne array o i niiNv^u/ is uoou iu PPCW. The first letter uter program which can unambiguously divide a store consecutive letters of a 'PCW into its individual PCW components and of the PPCW will be stored in element STRING(1), stored in len identify the specific Chinese ideogram cor- the second letter of the PPCW will be is ssponding to each separated PCW. i element STRING(2), etc. The array STRING(J) Another special technique is necessitated by dimensioned to have a sufficient number of ele- which the ie nature of the retrofiex ideogram. The retroflex ments to store the largest PPCW system 20 element deogram (also referred to as the retroflex vowel) is is designed to handle. In most cases, a the ie sole Chinese ideogram which modifies the array is of sufficient size. If desired, array that ound of a prior ideogram to make the prior ideo- o STRING(J) can be made very large in order a of PCA letters (comprising a plu- iram end in the sound "er". This is the only case continuous string /here two consecutive ideograms combine to form rality of PPCWs) can be entered without depress- i single syllable (ending in "er"). As a result, the ing a space bar to separate PPCWs (compound etroflex ideogram will always appear at the end of Chinese words). five-element i polysyllabic string and therefore at the end of a 5 The array SEG(M) is a array 3PCW. As described above, the voweltone 79 is which will temporarily store a portion of a PPCW how me of those that are pronounced "er" when they string which is examined to determine many :tand alone or follow the zero consonant 22. Since characters of that string define a PCW. The array he retroflex ideogram is pronounced "er" in Chi- PCW(X) is used to temporarily store a PCW so that be identified. lese, the voweltone 79 is defined to represent the •o its corresponding ideogram can etroflex ideogram. This designation is important in When the arrays STRING(J), SEG(M) and PCW(X) snabling the computer program to unambiguously are cleared, each of their elements is set to zero. and is iivide a PPCW into its individual PCW components The flag RV is the retrofiex vowel flag " ind then identify the specific Chinese ideogram set equal to "1 whenever the final character of a Whenever corresponding to each PCW. As will be described (5 PPCW represents the retroflex vowel 79. indicates that the celow, the program treats the retroflex ideogram the flag RV is set to zero, this the ret- differently than the remaining ideograms. The pro- last letter of a PPCW does not represent gram identifies it by looking for this ideogram be- roflex vowel. whether ore otherwise separating the PPCW into individual The zero consonant flag Z indicates If =CWs. io the first letter of a PCW is the zero consonant 22. A flow chart setting forth a method for separat- the first letter is the zero consonant, the flag Z is "1". ng PCWs is illustrated in Figs. 7A-7D. This method set equal to E is the and is set equal to nay be implemented as computer program, which The flag error flag " determines that can be carried out by any general purpose corn- "1 whenever the separation logic form. outer. The illustrated flow chart presents the man- ?5 a string of PCA letters takes an improper incremented with the ner in which entered PPCWs are converted to The variable JMAX is Chinese ideograms utilizing a separation logic and counter J as a PPCW is loaded into STRING(J), so PPCW. a monosyllabic dictionary which uniquely relates as to track the length of the the Bach PCW to a single ideogram. This program can Once the arrays have been cleared and carried be used in connection with a larger data or word to flags set to zero, the first operation to be is to identify a single processing program as desired. out by the separation logic This While one specific program is being illustrated, PPCW and to store it in the array STRING(J). the invention is not limited to this program, and a is achieved in logic blocks 12-23 of Fig. 7A. first instruction block 12, the programmer of ordinary skill will be able to design Proceeding to variable J to "1". The many other programs utilizing the same prin ciples 45 program sets the equal in and achieving the same result as in the present program then determines if there is a character embodiment of the invention. In addition, the de- an input data buffer register REG A (block 14). For it is assumed that scribed program identifies an ideogram and then the purpose of this disclosure, at time in displays it on an output device. A display of the input characters have been placed one a which is lower ideogram is not absolutely necessary and the PCL so the buffer register REG A at a speed that and separating logic can be used simply to identify than the processing speed of the computer so REG A at an ideogram without displaying it. Broadly, the in- only one character is in register any be vention can be considered to include the use of given instant. If desired, the program can re- stored includ- separation logic to separate a polysyllabic string. vised to accept a previously listing, Turning now to Figs. 7A-7D, the program be- 55 ing a plurality of PPCWs with or without spaces the gins at instruction block 10 wherein the arrays between them. In such a case, program may PPCWs and STRING (J), SEG(M), and PCW(X) are cleared and first divide the listing into separate described the flags RV, Z and E and the variable JMAX are then process each PPCW in the manner 19 0 271 619 20 below. it can unambiguously be determined that the Returning to decision block 14, the program voweltone 79 represents the retroflex vowel. The continues polling register REG A until the first program examines the second to last character in character of a PPCW string appears in the register. STRING(J) in decision block 26 to determine if that At that time, the program proceeds to decision 5 character is a vowel (V) or one of the consonants block 16 and determines if the character in register C = 1, 3, 4, 7-11 or 18. If it is not, the voweltone REG A is a space (as opposed to a letter of the 79 does not represent the retroflex vowel and the alphabet). If it is not, the program proceeds to program proceeds to decision block 32. If the sec- decision block 18 and sets the first element of the ond to last character in STRING(J) is a vowel (V) or array STRING(J) (J is originally set equal to 1) w one of the consonants C, the voweltone 79 does equal to the numerical value of the PCA letter in represent the retroflex vowel. In this case, the last REG A. The register REG A is then cleared character in STRING(J) is set equal to zero and the (instruction block 20) and the variable J is in- retroflex vowel flag RV is set equal to 1 (see blocks creased by 1 (instruction block 22). The variable 28 and 30). JMAX is also incremented so as to track the length 75 Having determined whether the final character of the PPCW that is ultimately loaded into in STRING(J) represents the retroflex vowel, the STRING(J). The program then returns to decision first PCW of the PPCW string stored in STRING(J) block 14 and waits for a second character to be must be identified. This is done in the subroutine placed in the register REG A. If this character is consisting of logic blocks 32-76 (Fig. 7B). not a space, it will be placed in the second element 20 As noted above, a PCW takes the generalized of STRING(J) since J has been increased to 2 in form TS + Q. A tone-syllable can take the form block 22. The program will continue looping CSV, CV, SV or V and therefore can be either 1 , 2 through elements 14-23 until the character in regis- or 3 letters long. Since the generalized tone-syl- ter REG A is a space. Once this occurs, an entire lable modifier Q is either zero or one letter long, PPCW will have been placed in STRING(J) with 25 the total PCW can be either 1 , 2, 3 or 4 characters each character of the PPCW being stored in a long. The actual length of the first tone-syllable in consecutive element of STRING(J). The value of STRING(J) is determined in accordance with the the variable JMAX, that is, the length of the PPCW, subroutine of logic blocks 32-42. is also stored. Having completed the entry of a Once this determination has been made, the single PPCW into STRING(J), the program pro- 30 length of the PCW can be unambiguously deter- ceeds to decision block 24. mined by examining the two characters immedi- Having placed a PPCW in array STRING(J), ately succeeding the tone-syllable. This is achieved the program must determine if the last character of in accordance with the subroutine of blocks 44-76. the PPCW represents the retroflex ideogram, also More particularly, these characters are examined to referred to as the retroflex vowel. This is done in 35 determine if they take any one of the forms CS, logic blocks 24-30. Proceeding to logic block 24, CV, SV or VP (P = 0, C, V, Z, or S) which the pro gram first determines if the final character corresponds to the first two letters of the permis- in STRING(J) is the voweltone 79. If it is not, the sible tone-syllable forms, CSV, CV, SV and VP. If final character in the PPCW does not represent the they do take on the forms CS, CV, SV or VP, then retroflex ideogram and the program can imme- 40 these two letters define the beginning of a second diately proceed to decision block 32. tone-syllable in STRING(J), Q is equal to the null If the final character in STRING(J) is the vowel- set, and the length of the PCW is equal to the tone 79, further investigation must be made to length of the tone-syllable. If they do not take one determine if it represents the retroflex ideogram. In of these forms, then Q is the generalized semantic accordance with the rules set forth above, vowel- 45 classifier G, and the length of the PCW is equal to tone 79 cannot be used as a semantic classifier. the length of the tone-syllable plus 1 . For this reason, it cannot follow a voweltone as part Turning to Fig. 7B, the subroutine for determin- of a tone-syllable. If the voweltone 79 follows an- ing the length of the first tone-syllable in STRING other voweltone, it must represent the retroflex (J) begins at decision block 32. The computer first vowel. Similarly, as shown in Fig. 4D, it cannot so determines if the first character of the PPCW lo- follow the consonants 1, 3, 4, 7-11 or 18 as part of cated in STRING(J) is a semi-consonant. If it is, the a tone-syllable. (While the combinations 3-79 and tone-syllable must take the form SV and therefore 8-79 do form tone-syllables which occur in the has two letters. For this reason, the program pro- Chinese language, to avoid ambiguity these are ceeds to block 34 and sets the variable n = 2. The specifically excluded from those letter combina- 55 variable n indicates the number of letters in the tions which form permissible tone-syllables. See tone-syllable. Fig. 4D.) Thus, if the voweltone 79 follows either a If the first element of STRING(J) is not a semi- vowel or one of the consonants 1, 3, 4, 7-11 or 18, consonant, the program procees to decision block

11 !1 1271 619

36 and determines if the first element of STRING- voweltone (it should be rememoerea tnat me vari- J) is a vowel. If it is, the tone-syllable consists of a able M has been increased to the value N in the /, and the variable n is set equal to 1 (block 38). If subroutine encompassing blocks 46-50). If the last he first element in STRING(J) is neither a semi- character in SEG(M) is a voweltone, a determina- consonant nor a voweltone, it must be a consonant, 5 tion is made as to whether the second to last n such case, the tone-syllable can take the form character in SEG(M) is a voweltone. If it is, an error 3SV or CV, depending upon whether the second condition exists (the entry rules of the PCL prevent character in STRING(J) is a semi-consonant or a second PCW of a string from beginning in a /oweltone. To make this determination, the pro- voweltone). If an error condition exists, the program gram proceeds to decision block 40 and deter- w proceeds to instruction block 56 and enables a bell nines if the second character in STRING(J) is a or other error indicator. The program then pro- semi-consonant. If it is, the tone-sylable takes the ceeds to instruction block 58 where the error flag E form CSV and the variable n is set equal to 3 is set equal to 1 and the variable p is set equal to [block 42). If the second element is not a semi- N. As will be described below, this will cause the consonant, the tone-syllable takes the form CV and 75 entire string stored in SEG(M) to be displayed on the variable n is set equal to 2 (block 34). the display screen sos that the individual entering Once the subroutine comprising blocks 32-42 the PCW can examine it and determine where the has determined the number of characters in the entry mistake was made. tone-syllable and set the variable n equal to that If the second to last character of SEG(M) is not number, a string n + 2 characters long must be 20 a voweltone (block 54), the program proceeds to s-xamined to determine whether the generalized decision block 62 and determines if it is a zero tone-syllable modifier Q is equal to the null set or consonant. If it is, the zero consonant flag Z is set squal to G. This is done in the subroutine including equal to 1 and the variable p is set equal to n block 44-76. (blocks 64 and 65). If the second to last character Beginning at instruction block 44, the program 25 is not a zero consonant, the program proceeds sets the variables N = n + 2, M = 1 and J = 1 . directly to instruction block 66 and the variable p is The variable N defines the number of characters set equal to n. In either case, a determination has which will be placed in the array SEG(M), the been made that the generalized tone-syllable modi- variable M defines the specific element of the array fier Q is equal to the null set and p has therefore SEG(M) being examined and the variable J deter- 30 been set equal to n. This identifies the PCW as mines the specific element of the array STRING(J) being equal to the tone-syllable alone. being examined. Before the two characters imme- Returning to decision block 52, if the last char- diately succeeding the tone-syllable can be exam- acter in SEG(M) is not a voweltone, the program ined, the first N characters of STRING(J) must be determines if it is a semi-consonant (decision block copied into the array SEG(M). This is done in 35 68). If it is, the program next determines if the accordance with logic blocks 46-50. second to last element in SEG(M) is a consonant Once this has been completed, the program (block 70). If it is, the second PCW begins with the proceeds to the subroutine including decision second to last character in SEG(M) and the first blocks 52-76 wherein a determination is made as PCW is therefore n characters long. For this rea- to whether the PCW includes n or n + 1 char- 40 son, PCW length variable p is set equal to n (block acters (i.e. whether the generalized tone-syllable Q 66). If the second to last character in SEG(M) is not is a letter or the null set). This is achieved by a consonant, the semi-consonant located in the last looking at the last two characters in the array SEG- position of SEG(M) is the beginning of the second (M) and determining whether the two characters PCW in STRING(J) and therefore the first PCW in take the form CS, CV, SV or VP and therefore 45 STRING(J) is n + 1 characters long. For this which of those two characters is the first character reason, the program proceeds to instruction block of a second PCW in STRING(J). If the second to 76 wherein the PCW length variable p is set equal last character in array SEG(M) is the first character to n + 1 . of a second piece of PCW in STRING(J), then it is Returning to decision block 68, if the last char- not a semantic classifier and the length of the PCW so acter in SEG(M) is neither a voweltone nor a semi- is equal to the length of the tone-syllable. If the last consonant, it must be either a consonant or the character of the array SEG(M) is the first character zero consonant. In such a case, the first PCW in of a second PCW in STRING(J), then the second STRING(J) is n + 1 characters long and the PCW to last character in SEG(M) is a semantic classifier. length variable p is set equal to n + 1 in instruc- In such a case, the first PCW in STRING(J) is one 55 tion block 76. Before proceeding to instruction character longer than the tone-syllable. block 76, the program proceeds to decision block Beginning at instruction block 52, the program 72 to determine if the last character in SEG(M) is a determines if the last character in SEG(M) is a zero consonant. If it is, the zero consonant variable

12 23 )271 619 24

Z is set equal to 1 . As will be shown below, this will set equal to 1 . If it is, the PCW length variable p is -esult in the zero consonant being removed from set equal to p + 2 (block 98) and the program 3TRING(J) later in the program. proceeds to instruction block 100. If the error flag At this point, the program has unambiguously is not set equal to 1 , the program proceeds directly determined how many characters are in the first 5 to instruction block 100. PCW in STRING(J). This PCW is then placed in In accordance with instruction block 100, the the array PCW(X) in accordance with the subrou- variable J is set equal to 1 and the program enters tine comprising block 78-84. the loop including blocks 102-106. Each of the Proceeding to decision block 86, the computer elements in STRING(J) is effectively moved to the determines whether the error flag E is equal to 1. If w left by p characters to insure that the first letter of it is, the program proceeds to instruction block 88 the second PCW in STRING(J) is located in the and displays information stored in SEG(M) on a first element position of STRING(J). At decision display to enable the keyboard operator to deter- block 104, this process is continued as long as J is mine what his or her entry error was. less than JMAX, which is set at block 23 to be the If the error flag is not equal to 1 , the program 75 value J at which a space is first detected at de- proceeds to instruction block 90. The computer will cision block 16 (see Fig. 7A). Once this has been have a monosyllabic dictionary which uniquely done, the program proceeds to instruction block equates each possible PCW to one and only one 108 and determines if array STRING(J) is empty. ideogram. The program looks at the ideogram At this point in the program, the first PCW in identified by the PCW in array PCW(X) and dis- 20 STRING(J) has been analyzed and displayed and plays this ideogram on the display. the letters in STRING(J) have been shifted to the At this point, the next procedure to be per- left to place the first letter of the second PCW in formed is to examine the next PCW in STRING(J) STRING(J) in the first element position of STRING- to identify its ideogram and display it on the dis- (J). If there are any additional PCWs in STRING(J) play. As described above, the subroutine consisting 25 (block 108), the program returns to decision block of blocks 32-90 analyzes the first PCW indicated in 32 (Fig. 7B) and analyzes the first PCW now lo- STRING(J) and assumes that the letter located in cated in STRING(J) following the procedures de- the first element position of STRING(J) is the be- scribed above. Once this PCW has been analyzed ginning of the first PCW in STRING(J). In order for and displayed, the characters in STRING(J) are the program to analyze the second PCW in 30 again be moved to the left to ensure that the first STRING(J), each of the characters in STRING(J) letter of the next PCW in STRING(J) is located in must be shifted over to the left by a sufficient the first element position of STRING(J). This pro- number of positions to ensure that the first letter of cess is continued until all of the PCWs in STRING- the second PCW in STRING(J) is located in the (J) have been evaluated and displayed (until first element position of STRING(J). This procedure 35 STRING(J) is empty). is carried out in blocks 92-104 of Fig. 7D. Once STRING(J) is empty, the program pro- As discussed above, a PCW length variable p ceeds to decision block 110 and determines if the is set in blocks 58-66 and 76 equal to the number retroflex vowel flag RV is set equal to 1 . If it is, the of letters in the first PCW in STRING(J). The letters retroflex ideogram is displayed on the display must be removed from STRING(J) in order for the 40 (block 112) and the program returns to instruction program to evaluate the second PCW in STRING- block 10 to await the first element of the next (J). One additional letter must be removed if the PPCW string. If the retroflex vowel flag RV is not zero consonant has been used as a syllable-sepa- set equal to 1 , the program proceeds immediately rating letter between the first and second PCWs in to block 10. STRING(J). Two additional letters must be re- 45 An important feature of the foregoing program moved if an error condition was found to exist (which is shown only by way of example) is that a since p + 2 letters have already been displayed on string of PCA characters (preferably, but not nec- the display to enable the keyboard operator to essarily, representing a PPCW), can automatically determine his error and correct the same. This be divided into individual PCWs and then con- result is achieved in the subroutine comprising so verted unambiguously into the appropraite ideo- blocks 92-104 (see Fig. 7D). grams utilizing a monosyllabic dictionary of PCWs Beginning with block 92, the program deter- to ideograms. This avoids the need for polysyllabic mines whether the zero consonant flag Z is set dictionaries and permits the PCL to follow the ideo- equal to 1 . If it is, the PCW length variable p is set graphic nature of the written Chinese language. equal to p + 1 and the program proceeds to 55 instruction block 100. If the zero consonant flag is not set equal to 1, the program proceeds to de- cision block 96 and determines if the error flag is

13 !5 I 271 619 :e>

Z. Alphagrammic Listing array and placed in mass-storage ror suDsequent sorting. When all of the strings to be sorted have Another major feature of the PCL is that it can been passed through the separation logic and ce used to simply and directly create alphagram- placed in mass-storage, they are sorted in alpha- nic listings of both monosyllabic and polysyllabic 5 betical order and the virtual space is treated as the vords. An alphagrammic listing is one which is letter preceding the letter 1 . This will automatically substantially in alphabetical order but also ensures generate the type of alphagrammic listing illus- hat polysyllabic words or phrases beginning with trated in Fig. 8. tie same ideograms are grouped together, even if As a further exception to purely alphabetical i straight alphabetic ordering would separate these 70 order, the alphagrammic listing should also keep common ideograms. This can best be understood together PCWs having the same tone. LFor exam- with reference to Fig. 8, which is an alphagrammic ple the word LMNV, where V is the silent vowel 27, dictionary listing created utilizing the PCL of the should be followed by the word LMNV, where V is Dresent invention. In Fig. 8, the left most column the silent vowel 79, since LMNV and LMNV are comprises PPCWs, and the next column comprises 75 homotones both being pronounced with the first the corresponding ideograms. tone. The next two words listed should be, for In any alphabetical representation of the Chi- example, LMNV and LMNV, where V" and V™ are nese language, the number of letters utilized to the vowels 28 and 80, respectively. The latter two represent a given tone-syllable will vary as a func- words have the same sound as LMNV and LMNV, tion of the form of the tone-syllable (CSV, CV, SV 20 but are pronounced with the second tone. This is 3r V). The use of a semantic classifier will also vary achieved as follows. the number of letters in a PCW. A purely alphabeti- Figs. 12A and 12B are flow diagrams illustrat- cal listing would cause some Chinese compounds ing a COMPARE routine for use in comparing pairs having the same first ideogram to be .separated. of PCL text lines, word-by-word or syllable-by- For example, in Fig. 8, the words /(.Hl^ A j% , 25 syllable, to determine which of the lines should be and would be moved down to the position placed first in alphagrammic order. COMPARE has indicated by the dashed line, since the letter J£_ is the further feature that English text lines are placed assigned number 83 and the letter h is assigned in normal alphabetical order. COMPARE is applied number 35. This would result in ideographic words within an overall sorting procedure referred to here- in the second column having the same first ideo- 30 in as SORT, which rearranges the text lines after gram being separated from one another. the COMPARE routine identifies the proper order. The present invention avoids such a separation Before applying COMPARE, an entire text file by utilizing a modified form of the separation logic is loaded into a working memory. Each field con- described above to insert a virtual space between taining a word, phrase, etc., to be ordered is placed PCWs of a PPCW before sorting the PPCWs to be 35 on a separate line. The SORT program builds up listed in alphagrammic order. The virtual space is an array of pointers to the beginning of each line of assigned the number "0" and is therefore treated text; that is, an array containing the address of the by the sorting routine as being a letter before the first character of each line. The end of each line is also marked with detectable character. letter 1 , and before all PCA letters. a Virtual spaces can be inserted by a modified 40 COMPARE receives the addresses of pairs of form of the separation logic of Fig. 7B-7D, particu- lines to be compared; that is, COMPARE has two larly blocks 32-84 and 92-108. To use the separa- arguments, ARRAYLINE1 and ARRAYLINE2, each tion logic for the purpose of inserting a virtual of these arguments being an address from the the space into a PPCW to enable an alphagrammic array created by SORT. COMPARE processes listing, the separation logic can be modified as 45 indicated lines at these addresses and returns a follows. Blocks 54-64 and blocks 72-74 are not value which indicates whether they are in the cor- required and can be removed. In lieu of the blocks rect order to form an alphagrammic listing. If the 82-90 of the flow chart of Fig. 7C, the PCW stored lines are found to be out of order, SORT preferably in string PCW(X) can be placed in a holding array switches the pointers of the two lines, rather than which holds the entire string of letters (this can be so the lines themselves. more than one PCW) into which a virtual space is In the following, the two lines to be compared being added. After the PCW is placed in the hold- by COMPARE will be referred to as Line 1 and ing array, a virtual space is placed in the next Line 2. At block 210, COMPARE sets two counters element of the holding array. Thereafter, the pro- 11 - 12 = 0. 11 and 12 are indexes to the current examined in gram returns to block 92 and keeps looping 55 character in the word or syllable being through the separation logic until all of the PCWs Line 1 and Line 2, respectively. In this algorithm, 11 of the string are placed in the holding array. At this is ordinarily equal to 12, as discussed further below. point the entire string is removed from the holding At blocks 220-230 it is determined whether

14 27 0 271 619 28 data remains to be compared in both line 1 and the current words are identical, and no switching is line 2. If not, then either previous processing has required. At 248 the routine advances to the next reached the ends of both lines without detecting word by setting 11 =-END1 and I2 = END2. Next, at any difference, or else for some reason neither line block 205 11 and 12 are each incremented by one contains any data. At decision block 220 it is deter- 5 to begin the examination of the next word. mined whether end-of-line characters are detected If CMP does not equal 0, then at block 250, for both of Lines 1 and 2. If so, then at instruction COMPARE returns CMP, that is, either -1 or +1, block 222 the COMPARE algorithm returns a value according to whether the Line 1 current word is of 0. A value of 0 indicates that no difference has less than or greater than that in Line 2. In the latter been detected between the two lines so as to w case the lines are to be switched. require switching of address pointers. If it is not If it is determined at decision block 242 that true that the ends of both Line 1 and Line 2 have both current words are PCL words, that is, PCWs been reached, then at decision block 224 it is or PPCWs, then they must be compared syllable- determined whether the end of Line 1 has been by-syllable (ideogram-by-ideogram). This is carried reached. If so, then at 226 the routine returns -1, 15 out in steps 260-284. since Line 1 is shorter than Line 2 but is otherwise Referring to Fig. 12B, at instruction block 260, the same, and thus no switching is to be per- the end of the first word, or the first syllable of the formed. If u.ne 1 has not ended, then at 228 it is current PPCW, is found using the separation logic determined whether Line 2 has ended. If so, a discussed previously. The SEPARATE subroutine value of + 1 is returned at 230. A returned value of 20 returns values ENDSYL1 and ENDSYL2. ENDSYL1 + 1 indicates that Line 1 and Line 2 are to be represents the index of the end of the first syllable switched, since Line 2 is shorter than Line 1 . in Line 1 that occurs between 11 and END1. Simi- If neither line is determined to be shorter, then larly, ENDSYL2 is the index of the end of the next COMPARE examines the next word or syllable in syllable in Line 2. Lines 1 and 2 to determine their proper alphagram- 25 After the syllable ends have been found, then mic order. at block 262 the current syllables are compared If it is not true that both words are PCL words, with respect to tone. This is performed by a sub- for example if one is an English word, then they routine referred to as TONECOMPARE. are placed in order in steps 240-250. At instruction TONECOMPARE is similar to COMPARETEXT, but block 240, pointers END1 and END2 are set at the 30 is modified according to the rule described addresses of the spaces following the current hereinabove that homotones must appear together words in Lines 1 and 2, respectively. By conven- in an alphagrammic listing, and further must be tion, words in the PCL are separated by spaces. placed in alphabetical order with respect to one Thus, spaces serve as convenient delimiters for a another. It also disregards final characters that word-by-word comparison of the contents of Line 1 35 could cause PPCWs having the same initial ideo- and Line 2. Multiple blanks, control codes, the zero gram to be separated. One advantageous feature of consonant used as a syllable delimiter, and other TONECOMPARE is that it transforms all homo- irrelevant characters are ignored. 11 or 12 can be tones of a given tone-syllable into a single pre- incremented to bypass such characters, in which determined form having the same particular pro- case these two counters might not remain equal. 40 nunciation, and then applies COMPARETEXT. At decision block 242 the current words are TONECOMPARE returns a value CMPT, which examined to determine whether they are both is 0 if the current syllables are homotones, and is words from the Phonetic Chinese Language. If not, -1 or + 1 according to whether the current syllable then at block 244 a function COMPARETEXT is of Line 1 is less than or greater than the current applied to the two current words. COMPARETEXT 45 syllable of Line 2. At block 264, if CMPT does not examines each character in the portion of Line 1 equal 0, then TONECOMPARE returns the value from the current position, indicated by 11, to the CMPT at instruction block 266. If, however, CMPT end position indicated by END1. Similarly, COM- is equal to 0, then the two current syllables are PARETEXT examines the content of Line 2 from 12 homotones and it must be determined whether to END2. These two words are compared strictly 50 they are in the correct order alphabetically. To alphabetically; for example, according to the stan- accomplish this, COMPARE then applies the COM- dard ASCII or CSCII (see Fig. 13) sorting order. PARETEXT subroutine, described above, to the COMPARETEXT returns a value CMP, which current syllables. In comparing PCL letters COM- equals 0, -1 , or +1 according to whether the word PARETEXT follows conventions similar to standard in array line 1 is equal to, less than, or greater than 55 ASCII or CSCII (see Fig. 13) sorting. The system the word in Line 2, according to the usual lexical assigns digital values to PCL characters that are conventions. above the values assigned to the ASCII character At 246 it is determined whether CMP = 0. If so, set, so SORT places PCL letters alphabetically

15 !9 ) 271 619 su after English letters. At instruction block 268, COM- data or word processing system. 3ARETEXT returns a value CMP in a manner simi- There are many published studies concerning ar to that described above. At 270 it is determined efficient keyboard layouts. Perhaps the most fam- vhether CMP is equal to 0. If not, then at 272 ous is entitled Typing Behavior, American Book COMPARE returns the value CMP, which is either 5 Company, New York, 1936, by A. Dvorak et al. This @1 or +1. study suggests that the placement of characters on If, however, CMP is equal to 0, then in addition a keyboard should be determined on a statistical :o being homotones the two current syllables are basis so that the typist moves his fingers from the alphabetically identical. At 274 the routine then home keys (the keys "a, s, d, f, j, k, I, ;" on the advances to the next current syllable by setting w QWERTY keyboard) as little as possible. To this I = ENDSYL1 and I2 = ENDSYL2. end, the most frequently used group of keys are At 276 the routine tests to determine whether located in the home row (the third row of Fig. 10), :he end of either word has been reached. That is, it the second most frequently used group of keys are s determined whether 11 is less than END1 as well located in the row immediately above the home as 12 being less than END2. If neither word is 75 row (the second row of Fig. 10), the third most 3nded, then the system returns to block 260 to frequently used group of keys are located in the determine the end of the next two syllables in row immediately below the home row (the fourth Lines 1 and 2 and to apply TONECOMPARE. row of Fig. 10), and the least frequently used group If, however, at decision block 276 the end of of keys are located two rows above the home row one word has been reached, it is then determined 20 (the top row of Fig. 10). Within each row, the most at block 278 whether the ends of both words have frequently used keys are the index finger keys, the oeen reached. If so, the routine passes to instruc- second most frequently used keys are the middle tion block 205, where 11 and 12 are both incre- finger keys, the third most frequently used keys mented by one and the comparison of the next two are the ring finger keys and the fourth most fre- current words is continued. 25 quently used keys are the little finger keys. If it is determined at decision block 278 that While the Dvorak system is usually the most the end of only one word has been reached, then efficient, it does not take into account the desirabil- at block 280 it is determined whether it is the end ity of alternately typing with the left and right hand of the current word of Line 1 that has been as much as possible. The keyboard of the present reached. If not, that is if 11 is less than END1, then 30 invention achieves this result by placing all of the the current word in Line 1 is longer than the current consonants, and preferably all of the semi-con- word in Line 2 and the two lines should ' be sonants, on the right side of the keyboard so that switched. Accordingly, at block 282 the routine they are typed by the right hand of the operator. returns a value of +1. If, on the other hand, The most frequently used voweltones are located side of the Some vowel- II = END1 , then it is the end of the current word in 35 on the left-hand keyboard. Line 1 that has been reached, so no switching is tones must be located on the right-hand side of the required. Accordingly, at block 284 a value of -1 is keyboard since there are more voweltones than returned. keys on the left-hand side of the keyboard. As used herein, the left-hand side of the keyboard 40 refers to those keys to the left side of the dark D. Keyboard lines in Fig. 10. These keys are struck with the left hand. The right-hand side of the keyboard refers to A keyboard which is particularly efficient in those keys of the keyboard located to the right of entering the PCA into a computer system, word the dark lines in Fig. 10. These keys are struck by processor, or the like, is illustrated in Fig. 10. The 45 the right hand. physical arrangement of the keyboard is identical The present invention also determines where to to a standard QWERTY keyboard and the standard place the letters of the keyboard as a function of QWERTY symbols are shown in the left portion of the uppercase and lowercase conditions of the each key position. The PCA letters which cor- keyboard. Since the PCA contains 85 letters, they respond to each key position are shown on the 50 cannot all be placed on the lowercase of the key- right side of each key. Two PCA letters are shown board. Only 43 can be placed on the lowercase of with respect to each key position. The upper right- the keyboard. By selecting the particular letters hand letter corresponds to the uppercase position shown in Fig. 10, 74% of the letters used based on of the keyboard (where the shift key has been frequency of usage are contained in the lowercase. depressed) and the lower right-hand letter of each 55 The keyboard of the present invention also key position. This keyboard arrangement maxi- determines the location of the letters on the keys mizes the efficiency with which a typist or key- as a function of the tones the voweltones carry. board operator can enter PCL information into a The most frequently occurring tone is tone 4, so

16 31 0 271 619 32 the voweltones carrying the fourth tone are all 4. A method as claimed in claim 2 or claim 3, located on the home row (row three of Fig. 10). characterised in that the consonants include The second most frequently used tone is tone 1, a) a plurality of short consonants, each of and all of the voweltones carrying this tone are which represents a respective consonant sound; located on the second row. The third most fre- 5 b) a plurality of long consonants, each of quently used tone is tone 2, and all of the vowel- which represents a respective consonant sound tones carrying tone 2 are located on the bottom pronounced with a respective vowel sound; and row of the keyboard. The least frequently used c) a silent zero consonant. tone is tone 3, and all tone-syllables carrying this 5. A method as claimed in any preceding claim tone are located in the top row of the keyboard. w characterised in that each such PCW has the form To make it easier to learn the location of the TS + Q, wherein letters of the keyboard, the keyboard of Fig. 10 a) TS is a tone-syllable having one of the also groups voweltones families together so that to forms CV, CSV, SV, and V; C being a consonant, S a substantial extent all voweltones of a given family being a semi-consonant, and V being a voweltone; are entered using the same finger. Referring to Fig. 75 and 10, the voweltone family 47-50 are all typed by the b) Q is a generalised tone-syllable modifier left little finger, the voweltone family 51-54 are all which indicates the meaning for distinguishing be- typed by the left ring finger, the voweltone family tween homotones. 71-74 are all typed by the left middle finger, and so 6. A method as claimed in claim 5, charac- on. 20 terised in that Q has one of the forms 0 and G, wherein a) 0 is the null set; and Claims b) G is a generalised semantic classifier comprising a PCA letter added to the tone-syllable 1. A method of digitally encoding and storing 25 TS to the extent necessary for distinguishing be- the ideographic Chinese language, characterised tween homotones. by 7. A method as claimed in claim 6, charac- a) selecting a set of Chinese ideograms to terised in that G has one of the forms C, V, S and be encoded and stored; Z, wherein Z is the zero consonant. b) selecting one and only one digital repre- 30 8. A method as claimed in claim 7, charac- sentation for each selected ideogram; terised by selecting a primary set of at least about c) selecting a set of letters for a phonetic 8000 ideograms which are those most frequently Chinese alphabet (PCA) which can be formed into used in the Chinese language; wherein at least phonetic Chinese words (PCWs) which fully identify about 3900 ideograms of the primary set, which the pronunciation of such selected ideograms; 35 account for at least about 97 percent of usage, are d) selecting one and only one digital repre- uniquely identified by PCWs having one of the sentation for each PCA letter; and forms TS + 0, TS + V*, and TS + Z, V* being the e) storing a monosyllabic dictionary which same voweltone as that in the tone-syllable TS; identifies a one-to-one relationship between the re- and wherein all of the remaining ideograms of the spective digital representations of each selected 40 Chinese language are uniquely identified by PCWs ideogram and its corresponding PCW. having the form TS + G, where G is a PCA letter 2. A method as claimed in claim 1, charac- other than V* or Z. terised in that the PCA letters represent the follow- 9. A method as claimed in any preceding ing language elements: claim, characterised in that each PCW comprises a) a plurality of tones; 45 no more than 4 PCA letters. b) a plurality of vowels; including 10. A method of laying out a keyboard for 1) a plurality of voweltones, each of which repre- processing a phonetic Chinese alphabet (PCA), the sents a given vowel sound pronounced with a given PCA comprising a plurality of voweltones each of tone, and which represents a vowel sound pronounced with a 2) a plurality of semi-consonants, each of which so respective one of four tones which occur in Chi- represents a given vowel sound irrespective of nese; characterised by tone; and a) laying out at least four rows of keys c) a plurality of consonants. defined sequentially as a top row, a second row, a 3. A method as claimed in claim 2, charac- home row, and a bottom row; terised in that each of the voweltones comprises a 55 b) determining the relative frequencies of base character and an indicia incorporated therein use of the four tones; and which indicates the tone. c) associating voweltones having the most frequently used tone with keys in the home row.

17 13 1271 619 >4

11. Method as claimed in claim 10, charac- each phonetic Chinese word representing one ana erised by only one ideogram and providing the sound and a) defining segments of the keyboard in tone information required to pronounce that ideo- vhich the keys are to be operated by the same gram; and inger; and 5 processing the continuous string so as to deter- b) associalting a plurality of voweltones that mine unambiguously the beginning and end of nave the same vowel sound, but have different each phonetic Chinese word in the string. ones, with the keys of one of such segments. 17. A method of creating an alphagrammic 12. A method as claimed in claim 11, wherein listing of a set of word strings, each word string :he PCA further comprises a plurality of consonants ro including a plurality of characters which combine to and semi-consonants; characterised by form one or more phonetic Chinese words, each a) determining the relative frequency of use phonetic Chinese word representing one and only Df the consonants, semi-consonants, and vowel- one Chinese ideogram and providing the sound tones; and tone information required to pronounce that b) associating frequently-used voweltones ?5 ideogram, said characters having a predetermined with keys to be operated by one hand; and alphabetical order; characterised by the steps of: c) associating frequently-used consonants sorting a set of word strings in alphagrammic order and semi-consonants with keys to be operated by wherein the word strings are listed in the alphabeti- the other hand. cal order of the characters in that word string, the 13. A keyboard for entering letters of a pho- 20 alphabetical order being overridden to the extent netic Chinese alphabet (PCA) into a computer sys- that (a) all strings whose corresponding first Chi- tem or the like, wherein the PCA comprises a nese ideograms are identical are listed together plurality of voweltones, each of which represents a and (b) all words pronounced with the same sound vowel sound pronounced with a respective one of and tone are considered as units for purposes of four tones which occur in Chinese; a plurality of 25 alphabetisation; all strings within the groups (a) and consonants; and a plurality of semi-consonants; (b) being listed in alphabetical order with respect to characterised by one another. a) a plurality of keys divided into left and 18. A method of processing character strings, right sections to be operated respectively by the characterised by left and right hands; 30 a) entering a string of letters of a phonetic b) keys of one section being adapted for Chinese alphabet (PCA); wherein entering frequently-used voweltones; and 1) the PCA includes respective pluralities of vowel- c) keys of the other section being adapted tones (V), semi-consonants (S), and consonants for entering frequently-used consonants and semi- (C), and including a zero consonant (Z); consonants. 35 2) the strisng of letters includes at least two sepa- 14. A keyboard as claimed in claim 13, charac- rate phonetic Chinese words (PCWs), each said terised in that PCW having the form TS + Q, wherein TS is a tone- a) the keys are divided into four rows, name- syllable having one of the forms CV, CSV, SV and ly a top row, a second row, a home row, and a V, and Q is a generalised meaning-indicating modi- bottom row; and in that 40 fier having one of two forms, namely a PCA letter b) keys in the home row are adapted for and the omission of any PCA letter; provided that entering voweltones having the tone that is most Q cannot take the form of one voweltone (RV) frequently used. which is employed to indicate the retroflex ideo- 15. A keyboard as claimed in claim 14, charac- gram when it occurs at the end of a character terised in that 45 string; a) the keys are divided into groups of keys 3) each of the PCWs represents one and only one designated to be operated by the same finger; and Chinese ideogram and provides the sound and b) the keys of at least one such group are tone information required to pronounce that ideo- adapted for entering voweltones having the same gram; and vowel sound but different tones. so 4) each non-initial PCW that has the form V + Q is 16. A text processing method, characterised by preceded in such string by the zero consonant, and the steps of: each non-initial PCW that has the form SV + Q is entering a string of phonetic Chinese language preceded in such string by the zero consonant characters, each character identifying a sound whenever such last-mentioned PCW follows a PCW and/or tone of the Chinese language, the string of 55 having one of the forms CVC and CSVC; and characters including at least two groups of char- b) separating the string unambiguously into acters, each group of characters defining a pho- the separate phonetic Chinese words included netic Chinese word of variable character length, therein.

18 35 0 271 619 36

19. A method as claimed in claim 18, charac- terised by a) defining a predetermined alphabetical or- der for the PCA letters; b) entering at least two of the strings of PCA 5 letters; and c) sorting the strings in alphagrammic order wherein the strings are listed in the alphabetical order of the letters in that string, the alphabetical order being overridden to the extent that (a) all w strings whose corresponding first Chinese ideo- grams are identical are listed together, and (b) all PCWs pronounced with the same sound and tone are considered as units for purposes of al- phabetisation; ail strings within the groups (a) and 15 (b) being listed in alphabetical order with respect to one another. 20. A method of encoding and storing Chinese ideograms, characterised by a) selecting a set of letters for a phonetic 20 Chinese alphabet (PCA) which is capable of uniquely identifying all phonetic tone-syllables in Chinese; b) selecting one and only one 7-bit digital representation for each PCA letter; 25 c) selecting a set of Chinese ideograms to be encoded and stored; d) selecting one and only one phonetic Chi- nese word (PCW) composed of PCA letters for uniquely identifying each selected ideogram; and 30 e) storing a monosyllabic dictionary which identifies a one-to-one relationship between the re- spective digital representations of each selected ideogram and its corresponding PCW. 35

40

45

50

55

19 0 271 619

\TS / Z 3 4 CV CSV SV V

/ i CV CSV *SV -XV

Z C CVC CSVC xsvc XVC

3 Z CVZ CSVZ XSVZ -*VZ

^ V cvv csvv *SVV xvv

£ S CVS C5VS *5VS *VS 0 271 619

PHONETIC CHINESE Al PHARFT (PCAl ** rS «« PYXYXX (pa)

« iif # + £ 11 £ *

CONSONANTS t.^rk

I 2 3 4 5678 9 10 1I

Short PP t * J] * ft HH

b p m r dtnl gkh

12 13 14 15 16 17 18 19 20 21 22 83 84 85 M-t? Long ±t>B ^ + £ T ¥lLf j q x zh ch sh r z c s 0 Y W Yu

Tone VOWELTONES ** 60 ** J3¥*« mm

23 27 31 35 39 43 47 51 55 59 63 67 71 75 79

in An.Lt"5$i1i4'X^ + 4'l- >J'

a 1/- el o/e u al u ou ao en eng an ongang er/t/-

24 28 32 36 40 44 48 52 56 60 64 68 72 76 80

(2) ARhfe? ^Jli XJfil ijc

25 29 33 37 41 45 49 53 57 61 65 69 73 77 81

(3) IfLES^SZJl^x^T^T 3"<

26 30 34 38 42 46 50 54 58 62 66 70 74 78 82

(4> Jftfefj^ ft i i i r

I

M M M Mc c c ~ 5 sag 5 « B E NO«! M Jrf J= * E § cBBciiae cccg * -§3E 358S-S-S-S2

s -ssgsggg-s-glE ails

= IS 11 1 1 II 1 5 I 1 1

i ill i s '§ j S SoSooooasSo o o o -

life ails _ | =5iiIe32i = s53iiiE a^ii 5 1 1 1 r | 1 *@ " « M 1 , , , 1 . Jl @E ? I 1 I s JT j? 1" ^ — .e S. H :a — e ss :=, a k >, @ ^ £ ? §* £ ? J « -I i i Is * -i -S 1 ih I « -3 J i I S g J = j J g n U " g "J .5 .2 a a a a B hj| •« 2 i 1 1 H j £1-3 & c« 2 i 1 . @*S J 2 @=g S o gSS o "Vl „ » -5. E --5 a £ 4 { 1 R If" - - 1 •= R |0 -Sal •@3 a -•=. "5 *B

1 i 1 1 r i n r TT i in

(ftiiljLiiniffi! n i f ? ? f s ? ? ? ? ? ? ? ? s I f ? * l* £ sJaEd-saaJsssl-s-is ails« @ J 1 1 J 1 1 I 1 1 J J 1 ills

- J i I i 1 ! § J n s § J ! J = lilg

§ I 1 I 1 ! I I § I § I J I I | | 1 § 1 S § J § § I 4 J 1 § Ills

U ll J 1 111 1 1 12 I i

* 2 i I n i i n i H 1 * a i ? * * T -R -B « i i 1 x • « £ i i S S i II I S & JS JS • • 2 & i £ o "J&ld4fl8«R8llll | t j j . u t>ia

:4A.

^"N. 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 A A X X K K H K LbEfe tiSSS

b IB PA PA PX PA PR PR Pfl Pft PL -- PE Pfc Pt pti P5 Pfi P 2 P PX PX -- px PJi PR p/i pft PL Pb -- Pfc Pt Pti P5 PS * m 3 *A *A *X *A *R *R *n, *ft -- *l *E *fe *fj *ti *5 *fi f A * *A *A « « $L #b *E *fc -- *t> -- --

d 5 n ju m nx ha im m m m he -- m as t 6 t %x u xx « *r *r *t -- -- #g 7 n 4- frA frA J jrA fcR «i frU *ft tE fet M> tfi -- irS 1 8 Jj nx M fix t)A MlJJDSffi *>L tib fiE Jfc ftfj as g 9 I IA IA JJ i* IE -- It It K 16 k 10 5 *A fX St fL *t *o *5 *s h 11 H HA HA HA HA HL Ht Hfi H5 HS j 12 M MA MA MX MA MR MR MJl Ma ML Mb ME Mt q 13 -fc -fcA -- « -bA -fcR -fcR -tJL -fcft -fcL -fcb -fcE tt x 14 9 9X 9X 91 S>K 9K 9R 9K 91 9h 9E 9t

* 15 i ±X ±X ±X ±X ±R ±R ±71 zft ±fe ±t ±t> 25 ±t :h 16 t 1>A tA « tt tJi I2R 1zft. 1st « tfi >h 17 > >A -- >X $,* $ji 5-R >/i >ft -- 5-b £t 5* £5 £4 18 r B Bft B5 B6

2 19 * *A U 5X 3Ji -- *JL 5ft 5b 36 - -JS c 20 + +A +X -Ml +R +ft +ft +s s 21 £ £A -- H kX £R -- fen. feft feS

0 22 f A A X A H. R n. ft LbEttfcSS

1 83 V YA YA YX Y* ¥R YR YJL Yft YL Yb YE Yt u 84 A iLA iLA J,X tLA JjL iLb iLE iLfe iLfc iL6 Ji5 iLfi U * 85 fl -- fE ft ft 0 271 619

@-4B.

73Y* a i 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 3 3 3^ * S 5 $ A & 2 £ ft ft R ft

b 1 B m B* B£ P5 BS p 2 P P3 P5 Pz PS P* PI PS PS Pft PX PR -- m 3 * -- *5 *3 ** *£ *5 *$ $ft *ft *R -- f 4 * #3 *3 $3 *3 *R --

d 5 73 733 735 733 T}% j» -- 135 ns 73ft -- 73R JJft t 6 * 23 25 23 23 2* -- 25 2$ 2ft 2ft 2fl 2ft n 7 * -- £3 « 45 ir* -- feZ feft 1 8 73 733 733 fc3 ft$ -- 732 -- 73$ 73* }3S 732 J3ft 73ft 73R J3ft

g 9 I 13 13 13 K I* " 15 IS Ift -- IR Ift k 10 * 53 *3 f* f*. " f5 f S fft " fJL fft h 11 H H3 HI H3 H$ H* MS H5 HS Hft Hft HR Hft

j 12 M MA M£ MI Mi Mft -- MR Mft q 13 -fc @fcA ±6 -tS -ti -Ul -fcft -fcR -- X 14 J *2 5»6 5>2 9k *R 9k

Zh 15 ± ±3 ±3 « a ±* ±fi ±5 a zft zft zR zft ch 16 t £3 « 123 « tfe re « feft trt, «. t* sh 17 > >3 £3 ^3 5* >$. -- >5 >£ £ft >ft >R Sft r 18 B -- B3 B3 B3 - Bft Bft

z 19 5 53 55 5S -- 3* -- 55 5£ 5ft. -- 5R 5ft c 20 + +3 +5 -- +5 +i +5 +S +ft s 21 £ £3 £3 £^ £* £$ £ft -- £R £ft

0 22 t 3 ? 3 * * ft 5 * 4 £ £ £ ft - R ft

i 83 + Hi Yft ¥R Yft u 84 J, iL3 iL3 Ji3 iL-S Ji^ ibft J.5 JiS u 85 J FA f 6 f 2 fiS 0 271 619

CZr-4C-

X. J3Y* <— a o — > < e n — > < — e n g — > < — a n — > 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 ^77 i i ¥ * X X x S 72 7$+±f*

b 1 B B+ Pi W B* PX -- Bx PS B? B2 B? P+ -- Bf Pf p 2 P Pi Pi FP P* PX P>< -- PS F? P2 P? P* P+ Pi -- P* m 3 * *i ** *X *x -- *S « *2 *+ *± *T ** f A $ *X *x #x *S *2 *7 *+ *± *f **

d 5 J] jw m iw ------us tt? -- nf -@ nf nt t 6 ^ 2} 22 2+ 2± 2? 2* n 7 4 4+ 4i 4

g 9 1 1+ I«P I* IX IX Ix IS I? -- 17 » 1+ -- If 1+ k 10 f 5+ -- f

j 12 M Mi Mi Mi" M* MX -- Mx MS M2 M2 M? M$ M+ -- Mf M* q 13 -t -ti -ti ^ -fc* -tX -fcx -fcx -fcS +.? -t2 -fc? -fc$ -b+ -fc± -fcf -fc* x 14 9 9* 5>i 9% 9* 9X 9* — 9$ 97 91 $>7 9% 9+ 9± 91 9* zh 15 z zi ±i zip z* zX -- zx zS ±? -- z? z$ z+ -- ±T ±+ Ch16 fe 1?Pl2i1?Pi2* l2Xl2Xl2Xi2S « « W W l2+l2±-fefl2+ sh 17 > >i >i >i >* >X >x >x >S >7 >2 >? >+ -- >T r 18 B -- Bi B£P B* -- BX Bx BS B2 B2 Bf Bf --

z 19 i ^^^J* 3x 3S 37 3$ 3+ 3± 3f 3+ c 20 + +i 4-i -MP 4-X +x *7 +2 -- +$ 44 +± +T 44 S 21 £i -- £P £* £X £7 £+ fcf £*

8 22 t i i i * X--* 7 2?$ + -- *

i 83 ¥ ¥i ¥*P ¥* ¥X ¥X Yx YS Y? Y2 Y? Y* Y+ Yi YT Y* u 84 Jj J,X JjX J,x JiS J.? iL2 iL? J.* tL+ J,± iLT iL+ u 85 f fX fx fx ft f+ fi ff ff 0 271 619

fl¥* < — o n g — > < — a n g — > < — e r > I < — i / 0 — > 71 72 73 74 75 76 77 78 79 80 81 82 i 79 80 81 82 \^ * 4 * $ h ± T t 'J< !li F ftj »> ll! .7 ft

b 1 B PI- -- PF Pt ; rt« Pft P 2 P PH P± PT Pt | P>J- PJj PiH Pft m 3 * *l- *1 *T -- ! *>J< td: *ft *ft f 4 * #h *± *T *t j

d 5 JD )# JS W Jlh -- DT JJt j J]'J« J3ilf j]-p jjft t 6 2 2* 2$ 2$ 2$ 21- 21 2T 2t | 2* 2d! -- 2ft n 7 t 44 it W ii tT tt ! -- 41' 4* irft 1 8 n ftp m ffi ffi J3h Jit JJF 7rt | J3'J< jit J>T« Jjft

g 9 1 1* -- if 1$ ih -- it it i k 10 f f* f$ fh fl -- ft ! h 11 H tttlBPH? HH-Itt 1 j 12 M M* -- M$ -- Ml- -- MT Mt ! M'J< Mi< M* Mft q 13 -t -- -fc* -fcl- +_t -fcT -fct j -fc'J- -fci< -fc* -fcft x 14 5> 5>4 -- $»$ $>fc 5>1 5>T $>+ J ?>)• $>d< ?ft zh 15 ± ±* -- z$ z$ zh -- zT zt j±>J< ±ih ±3: ift ch 1 6 t t* 12$ t* th kfc tT tt ! W tzt 1>T' 12ft sh 17 > >l- -- >T >t 1 >* >>Jj >ft r 18 B B4 B$ -- B1- B± BT Bt | z 19 J W @@ » » JT Jt j 3U< *L« 3ft 5ft c 20 4- 4-* 4-4 4-h 4-1 | 4-U* 4-ili 4-ft 4-ft S 21 fc fc* -- fe*P fc$ feh -- fcT fct j fc* -- fcj. fcft

0 22 t 1- 1 - t * ± ft ft |

1 83 ¥ ¥4» ¥4 ¥$ ¥$ Yh Yl YT ¥t ! ¥* Yd; ¥ft Yft u 84 ill iLI- JjI JiT Jit ! di'J1 iL'l' Jift Jift i85 f fijif4f$fi)i ! f •> fit 3HP fft U £.1 I O IS

7 i u UJiJ OJXH OJIO OD"£0 B351 8332 83-33 83-34 53-27 83-28 83-29 83-30 YA YA YI YA YL Yb YE Yfe ¥3i ¥*. YE Yft

PfL Pft Pft PYt p 2 P 3¥L --- PYE --- m 3 * m, *¥fc f 4 *

M+A JfL JJft t 6 2 f¥L --- *¥E *Yfe "i 7 4 rYL ir¥b --- *¥fc *TJl *¥*. feYR frYfc I 8 33 JJ+A •-- 33¥b 73YE 33¥fc tffll 33Y*. 33Y31 33YJI

3 9 I c 10 f i II ff

i 12 q 1 13 -fc ( 14 *

h 16 12 h 17 > r 18 B

I 3 J 20 4 21 £ 0 271 619

4R.

J3Y* i a o I n l ng 8355 8356 8357 8358 8359 8360 8361 8362 8363 8364 8365 8366 >?7 Y+ ¥4 ¥* ¥* ¥X ¥X ¥x ¥* ¥7 ¥2 ¥? ¥$

D 1 P B¥+ --- BYi BY* BYX BY* BY? --- BY? BY* P 2 P PYi P¥4 PY* PY* PYX PYX PYx P¥* PY7 P¥2 m 3 * *¥i *¥4 *¥¥ *¥* --- *¥X *¥x --- *¥2 *¥? *¥* f 4 *

d 5 n J3YY J3Y* J1Y? --- J]¥? )1Y? t 6 2 2Y4- *¥4 *¥¥ *¥$ 2¥? *¥2 *¥? *¥$ n 7 fr¥

¥?" fi¥$

g 9 I k 10 i h 11 H

] 12 H q 13 -fc x 14 5>

ZH 15 2 :h 16 1? 3h 17 :> r 18 B

i 19 -J : 20 + 3 21 £ 0 271 619

\. J3Y* < i a n > < — i a n g — » < u a » 8567 8368 8369 8370 8375 83-76 8377 83-78 84-23 84-24 8425 84-26 ¥+ ¥± ¥T ¥* Yh ¥1 ¥T ¥* iLA JjA J,! j,* b 1 B e¥+ --- PYf P¥f p 2 P P¥+ P¥± PYT P¥* m 3 * --- *¥± *¥f *¥* f 4 * d 5 J3 J1Y+ J3¥± J3¥? J3¥* t 6 * *¥f 2¥± *¥f *¥* n 7 fc fc¥+ fc¥± fc¥f fc¥t --- fc¥l --- fc¥1= 1 8 fc --- fc¥± fc¥T fc¥* --- fc¥i fc¥T fc¥t g 9 I LLA — LLX ILA k 10 f fJ,A --- fJ,X fJ,A h 1 1 H HiiiA fhLA --- HiLA j 12 M q 13 -fc X 14 9

Zh 15 z ZiLA --- zd.A ch 16 1? Sh 17 > ioLA --- >iLA --- r 18 B --- BiLA

2 19 ? c 20 4 S 21 £ 0 271 619

@-4H.

r— U 0 < — u a i > 84-.S1 84-32 84-33 84-34 84-35 84-36 84-37 8438 B443 8444 8445 8446 JiL Jib iLE iLt iLfi Jit? iLfJ iL6 Jix tL£ iL5 iLS

D P 2 P m 3 * f 4 * a 5 n JUL - JldiE Ud/d JUS JUti JUS t 6 * #J,L Ut *LE *«Lfc tfiLt tfiLS tfJifj tfdifi n 7 ir --- fciLS --- friLS 8 J3 JUt JiLS JU5 JUS y y l IiiiL --- IiLb IiLE LLti LLb LLtj LLS LL* --- LL£ LLS k 0 f i d.L f Job f JjE fife fj.fi f J£ f J,S h 1 H HJjL tut HJiE fkLfc HJit fhLtl HJifJ HJifi --- HJiS --- HiLS

J 12 M q 13 -fc x 14 $ zn lb 2 ZAL ZJ.S ZJit iLti ±JA --- iL3? ZJ,£ ch 16 1> feLL 1aLb feLfc --- feL* teLfc feL5 -feLS sh 17 > — >Jib >iLE >Jjfe JJ,t --- >Ji5 >Ji* --- >Jj5 >J,S r 18 B --- BJib BJjE BJit --- BAt) z iy * JJjL --- 3J.E 3J.t JiLt 3«Lfc *L5 *Ls : 20 4- UL --- 4-.LE Ut f-Jjfj 4nLt3 4-Jj5 4-Jj6 3 21 fc kLL fcJjb fcJ£ fcJjfe 6J,t --- fcJ,5 --- 0 271 619

J3¥* < uen > < — u a n > < u a n g — » X. 8459 8460 8461 846 2 8467 8468 8469 8470 8475 8476 8477 84-78 iLX JjX JjX J* Ji+ tL± JiT iL* Ah il J,T At

b 1 B p 2 P m 3 * f 4 *

d 5 73 73AX --- 73J,X 73JjS --- 73tL± JUiT 7U* t 6 * *iLX tfiLX *J,X tfiLS 2di+ *J,± *diT --- n 7 fc fcAX fcj,* fcLT --- 1 8 73 71LX 71LX --- TILS -"- 73iL± fcltf 7lL*

g 9 I LLx LL$ LL+ --- LLf LL* LLh — LLT LLt k 10 f fiLX --- fiLx fJA fd.+ --- fiLT --- iJih fiLi fiLT fiLt h 11 H HAX HJjX --- HAS fkL+ HJj± HiLf fhL* ULb thkfc HiLT HiLt

j 12 M q 13 -fc X 14 $ zh 15 i iLX --- ±J,x --- ±1+ --- ±J,f iL* iLh --- ±«LT id,* ch 16 t 1aLX iaLX fcLx --- -&L+ fekb tiLT 12J4 feLh feLl feLT teLt sh 17 > >J,x >JiS >Ji+ >iL* >Jih --- >.LT --- r 18 B BA* BAT ---

z 19 3 3AX --- 3iLx --- *L+ --- *LT 3Ji* c 20 4- UX UX Ux U* U+

T3¥* < u e > < u n » < u a n » 8551 85-32 8533 8534 8559 8560 8561 8562 8567 8568 8569 8570 \ ft ft fE ft fX fx fx f* f+ fi ff ff

b I P p 2 P m 3 * f 4 *

d 5 73 t 6 * n 7 4 fef t 1 8 ^

g 9 I k 10 i h 11 H

j 12 M MfL Mft MfE Mft 4fX Mf* Mf+ --- Mff Mff q 13 -t -fcfL -fcfb --- -tffe -fcf X -fcf >< -kf+ -tf± -fcJT -fcff x 14 J 5>fL *ft *fE $ft 9 fX 9fX --- *f* *f+ $f± *ff *f* zh 15 2 ch 16 1> sh 17 > r 18 B

z 19 3 c 20 1 s 21 £ 0 271 619

¥11 ¥R ¥n. ¥n

¥•> ¥* ¥* ¥ft bxoiSA.

U R n, n.

Jj"5 iL5 Ji^ AS

Jj'J1 Ji'Jj iL«H J,ft

3 5 3 %

f A f=? fZ f£

J* f4 fft 0 271 619

J7A.

pcw{x) } servos /?v, zg^=0 SjE7~ TA1/4X=0

SET S^/

T S£T T= J-f/

@/0 a r

Sfr 5T/?/A/G(J)--0\

(5) )271 619 0 271 619

L7£.

7dA S£T X =/

S£T M = X

&£r pcw(x) 32A = seo- (m)

y£s

A/0 Y£S

jy/CT/PA/AXYj £O0/< UP /£>£OG/?/\Af £)/S£>£Ay SEG-(M) C0*/?£5P0A/D/A/e- T0 OA/ D/S£>£AY AP/?Ay PCW(K) A/V0 £/SfLAy /A/ /V&XT WEOG-fiZ/IM P0S/T/OA/ 0£ D/S£>LAy 0 271 619

\7D-

S£T J= /

" 5T/?/A/G{J'+P) /

tD£0&?AA1 /A/ A/FX A >D£0&A?/\M />0S/r/0/A /a/ j?/?/*ZAy I

(2) 0 271 619

DICTIONARY OF PCL, MPOBRAMfi, AND FNfil T RH

PA fl eight PLt ft cup, glass PAP5 ;\ eight hundred PLL?!l' ft cup, glass PAPSL* AWE ei8ht million PE jfc north, northern PA-H- j\ ^ eight thousand PE# J(j £ northern part PA>R j\ -j- eighty TCWf Jfc fg north pole PA>JUrt= ;\ -j- eight hundred thou PEMArtf+ Jfc jg [§ arctic circle -PAY*} j\ jg eight hundred mill P5 ^ no, negate PAjL* j\ ^ eighty thousand PIP* ^ not necessary -PAJfe j\ % August (month) P5P¥+* ^ ^| inconvenient PAA n£ particle (final) PUJ* ^ |ij not quite J*Ad §§ particle (final) P5JJ* ^ •{§ not only PA $ pull out PiJJYS ^ uncertain $ ± Pul1 out P531ll,fe 'F If IBroin8> not so PA $| grip, handle PSIfc 7f i# not enough PA dad, father P5LL4 ^ , 7f only, merely PiPA £ £ dad, father P5HJ.fe ^ f- cannot Pir jj| finish PI4* ^ j| not see, not meet P*f*YP f| ^ dismiss P54*JJ* ^ j| f f don't appear to be PAflt g| JQ labor strike P54*in ^ j| V disappeared, missing Ptf*4f §g |5 school strike P55* ^ J| be not, is not PAt>At jj| t}5 shopkeepers'strike PI3$ ^ does not lie in PAA gj dictator P54J.4 ^ |* not bad (good) Ptiskt f$ £ feudal chief P5Y* ^ £ don't want PR 3| forced to P5Y* 7f need not PA g nose PS 7f no, negate ?R5-J- g nose P$*X ^ illegal Pft H writing instrum't PSJJfc f| cannot, unable W&X $ writing technique PSISPS :j| /j\ cannot help PSJJW H g writing style PSOfciY* ^ jf J disastrous Pft» $ £ writer ?«!*¥* /js $ B could «>t help PSYSft $ £ pen pal Pflfi 7f || cannot PA §£ wall, cliff PttSIft ^ gg 4$ cannot PAP* g $g wall newspaper P$fS ^ "HJ" may not, must not PAtlT g Jgf built-in shelf PS*1¥J ^ nj" may not, must not 3 g -J ^ « L| ♦ « H

1I i1 ' «<-KK mUi — 1 -w- /: , . — -n a_ itnno CD -J TZ . o v » W C _lz__lz._iz_ -l -P r*rx it- jQjD Cc ^ - — S J 0) DCDO ^..1 1| "*» *« >• — — — M /> -3 xi -p Q, cC kj «>u> *>+> -» — cci -I «s @ ^—-2—= ^2 < B- -* ^«— *« T - — h£tt: * * * ||J x . ^ g < " ~ K ,u — c 'S &: * * 1 ii 3k □ o E — izz;

1I 05 <->^ ——CD < CD 1 *» — -W-+U- -H- w ^ 1 SC 1 -B- -+- 4< a; ^ — ^4^ -te — O -fej — ^

w ^ u — ^ 0 271 619

Jjf-££ftS?Jl [*] = [*] ff^ftipt. S?Jl#*Sfp2:^?S*0

re* PA* Pi* *b* *t* lit* j** #fc* ♦8

~9/> i**1 #4* ?±* [in* is* iA* JJ** #1 3£ I

JJ±* J)T* b¥2* i¥** i¥i* jbditi* n* m

Idife* liih* *** *r* HJcfc* MS* MA* MX* if tJl tto ffi

ML* lb* Mi* 44* 42* MR* M+* -Out 1-S ftft m a m m

«t ** « *i * « # «

»* 2*1* 2X* 2r* e *s & ffi £ *@ ii ti

E4* tt* tt* t3* Bi* fi* >A* >+* B4* Ift ft if K 1 fa & #

ft* m m m «* tt* ibl* f 4* fi* W & *fi tt H i i M 0 271 619

-.21.

[1] >A >AA >Af >kt >AJ >A+ >Ai >Ad >AJl >A* >A* >A* >M $ $ til11 0 3 # % # £ 42 ft £

[2] !>A*!

[3]

[4] >A >Ai >At >*4 >Afi >*P >*i >*> >*A I "i $ * * « ft ^ «

[1] >Jl >JUl >JIt >JW >Jl* >Jln >J14 >Jlb >A+ >R* >Jtf> >Wi >H> (30) BP £ P ® ^ ft f ft ft g #

* 38 fe 41 # Pi §|

[2] >R >JW >Rf >R5 >R4 >JW >Ah >AL >Ai >R> >JUv (tfl) + * *

!A >±4 :Ar m 54± >45 «t ft £ < » «

[3] >S >SJi >Sf >fit >JW /o\ £ I K * *

$ * tt

[4] >A 5tt >ftf >fti >ft* >ftL >ftY >A4 >A2 >AA >** >*A (48)

>An >A* >A*> >tt >j& >Aft >A$ >** >A4 >frf

m >** >ft* >AJ. >ft± >nJl >$A yt& >X$ >tt

>*i >&> >*+ <86> I g It « 3? 0 271 619

{staxt)

If*I/+/ S£TZ/=0j 12=12+/ 12 ~0 '2/0

20S 222^

'OFLJAAES/^ « A?£T0/?A/ 0 £/V0

2Jt

y£s- 1 ^e z /a/e/> xsrax//-/ £A/0\

2J>0j yes -\/?£ra/?A/+/ £A/0\ 220, 2^0 Z_

£/V0/ = //V0£X Of NEXT SPACE //VL/A/EA EA/0 2= /A/0EX 0EA/£XT SP4CE/A/A /A/fA 244 J 0/VP= C0A4/VIA*£ T£XT A A? A? AY /.///£/, Sf, EAV0 / Y Z.//VE2, Z2, E/VP2

2 Y^S

2S0j If = E/V£>/ VETl/XA/ CMP Z2 = E/V02

( )

:1EA. 0 271 619

£A/DSVl /= S£PAfiA7Tf(/}/?/?fiyi/A/£/,I/, EA/O /) 2S0 £WDjyi2=Sf/Vl/?/)T£{/l/?/Myi//V£2,IZ, £#02)

CA7P£= T0A/E COMfA/?£ (/lAWy l/A/£ /,!/, ^202 £/V/>SY£ /, /j/?/?AY l/AV£2, J2, EA/DSy/. 2]

2j#. A/0 A?£Ti//?A/ CA7/>7\ L Y£5 ( YTAY& )

CA^P= £0A7 TEXT (s4A?A?YI Y Z/AY£ Yy I 26$ I/, £A7J?5yz /; A/?/&y/L//VE2J I2y£A^0jyz2\

S2JS */a ATErVATA/ CA7E\

( /FA/£> ) 1/ ~EAY0Syz /j Z2 =EAY0&Z2\

Y£S.

2P&'

220'

JEB- 0 271 619

_/"JC£3sZ3_ ENCODING OF PHONETIC CHINESE PLPHflBET (PCfl) fiS CSCU CSCII ~ Chinese Standard Code for Information Interchange [Hex 80-FF3

PCLtt PCfi DEC HEX PCL# PCfl DEC HEX PCL» PCfl DEC HEX

0 128 BO 16 t 144 90 32 b 160 fi0 1 ? 129 81 17 > 145 91 33 E 161 ft1 2 P 130 82 18 B 146 92 34 fe 162 fl2 3 * 131 83 19 ? 147 93 35 t 163 A3 4 * 132 84 20 + 148 94 36 t 164 A4 5 1 133 85 21 £ 149 95 37 5 165 R5 6 1 134 86 22 f 150 96 38 t 166 A6 7 k 135 B7 23 A 151 97 39 5 167 A7 8 i) 136 88 24 A 152 98 40 5 168 A8 9 I 137 89 25 X 153 99 41 f 169 A9 10 * 138 8fl 26 * 154 9fl 42 * 170 AA 11 tt 139 8B 27 ft 155 9B 43 * 171 AB 12 M 140 8C 28 ft 156 9C 44 * 172 AC 13 -fc 141 8D 29 ft 157 9D 45 5 173 AD 14 $ 142 8E 30 A 158 9E 46 £ 174 AE 15 2 1 43 8F 31 L 159 9F 47 L 175 AF

48 k 176 B0 64 2 192 G0 80 * 208 D0 49 S 177 B1 65 ? 193 C1 81 3= 209 D1 50 i 178 B2 66 $ 194 C2 82 S 210 D2 51 % 179 B3 67 + 195 C3 83 * 211 D3 52 180 B4 68 ± 196 C4 84 A 212 D4 53 ft 181 B5 69 f 197 C5 85 J 213 D5 54 % 182 B6 70 * 198 06 86 , 214 D6 55 + 183 B7 71 + 199 C7 87 . 215 D7 56 i 184 B8 72 $ 200 D8 88 x 216 D8 57 * 185 B9 73 ¥ 201 09 89 - 217 D9 58 * 186 BA 74 $ 202 OA 90 ? 218 DA 59 X 187 BB 75 r 203 OB 91 < 219 DB 60 *. 188 BC 76 J: 204 00 92 > 220 DO 61 x 189 BD 77 T 205 OD 93 @. 221 DD 62 * 190 BE 78 t 206 OE 94 222 DE 63 * 191 BF 79 il' 207 OF 95 223 DF J) European Patent Application number EUROPEAN SEARCH REPORT EP 86 30 971 Office

DOCUMENTS CONSIDERED TO BE RELEVANT Citation of document with indication, where appropriate, Relevant CLASSIFICATION OF THE Category of relevant passages to claim APPLICATION (Int. CI.4) X EP - A - 0 130 941 (E.S.P.) * Whole document * 1-3,20 G 06 F 3/00 Y 16,18 B 41 J 3/00 G 06 F 15/20 A 17

X DE - A - 3 142 138 (SIEMENS AG) * Page 8, line 5 - page 11, line 9 * 1-3,20

X DE - A - 3 520 002 (G. DILUCIA) * Abstract; page 12, line 10 - page 1-3,20 13, line 6 * A 16-18

X GB - A - 2 121 220 (T.D. HUANG) * Abstract * 1,20 A 10 TECHNICAL FIELDS X GB - A - 2 076 572 (E. BAROUCH) SEARCHED (Int. CI.4) * Page 1, line 99 - page 2, line 1,20 16 * A 5 G 06 F 3/00 B 41 J 3/00 A GB - A - 2 158 776 (CHANG-CHI CHEN) G 06 F 15/20 * Abstract * ~ 1,10, 20

WO - A - 82 00 442 (R.C. JOHNSON) * Whole document * 1,5,6, 20

COMPUTER, vol. 18, no. 1, January 1985, Los Alamitos, pages 27-34 IEEE, NEW York, US; J.D. BECKER: "Typing Chinese, iapanese. and korean" -2- The present search report has been drawn up for all claims Place of search Date of completion of the search Examiner THE HAGUE 07-03-1988 CERVANTES CATEGORY OF CITED DOCUMENTS T theory or principle underlying the invention E earlier patent document, but published on, or X : particularly relevant if taken alone after the filing date Y : particularly relevant if combined with another D document cited in the application document of the same category L document cited for other reasons A : technological background 0 : non-written disclosure & : member of the same patent family, corresponding P : intermediate document document turopean raxent Office

ULAIMS INCUHHINva FEES

i ne present turopean patent application comprised at the time of filing more than ten claims.

[ | Allnil claimsciaims feeslees havenave beenDeen paidpaia withinwunin thetne prescribedprescnc-ea time limit. TheTne present EuropeanEu search report has been drawn nnup fnrfor nilall rlflimcclaims.

| | Only part of the claims fees have been paid within the prescribed time limit. The present European search report has been drawn up for the first ten claims and for those claims for which claims fees have been paid, namely claims:

[~~] No claims fees have been paid within the prescribed time limit. The present European search report has been drawn up for the first ten claims.

LACK OF UNITY Or INVENTION •Jea' "ivioiun i-ansiaers mai me present turopean patent application does not comply with the requirement of unity of nvention and relates to several inventions or groups of inventions, lamely:

1. Claims 1-9,20: Phonetic encoding method for Chinese ideograms

2. Claims 10-15: Laying out of a Chinese encoding keybord 3. Claims 16-19: Text processing method for Chinese ideograms + alphagrammic listing

runner searcn tees nave Deen paid within the fixed time limit..The present European search report has been drawn up for all claims.

I J| 0n|y Par1 of tne further search fees have been paid within the fixed time limit. The present European search report has been drawn up for those parts of the European patent application which relate to the inventions in respect of which search fees have been paid, namely claims:

| None of tne further search fees has been paid within the fixed time limit. The present European search report has been drawn up for those parts of the European patent application which relate to the invention first mentioned in the claims, namely claims: European Patent Application number EUROPEAN SEARCH REPORT . Office EP 86 30 9765 -2-

DOCUMENTS CONSIDERED TO BE RELEVANT Citation of document with indication, where appropriate, Relevant CLASSIFICATION OF THE Category of relevant passages to claim APPLICATION (Int. CI 4)

* Pages 32-34 * 1,10-13, 20

PROCEEDINGS 3rd USA- JAPAN COMPUTER CONFERENCE, October 10-12, 1978 San Francisco, Session 6-2-1/6-2-5, W.K. KING et al.: "The phonetic encoding of word-components for the computer input of Chinese charac- ters" * Pages 122-126 * 1,20

NEC REESEARCH AND DEVELOPMENT, 1985, Special Issue, pages 77-89 Tokyo , ( JP ) H. ISHIGURO et al.: "Input means for workstations" * Pages 86-89; figure 3 * 10-13 TECHNICAL FIELDS SEARCHED AMERICAN FEDERATION OF INFORMATION (Int. CM) PROCESSING, Societies and informa- tion processing, Soc. of Japan, 1st USA - Japan computer conference proceedings, October 3-5, 1972 Tokyo, Japan, pages 291-295 Montvale, N. J. , US; Chan H. YEH: "A Chinese typewriter and teletypewriter system" * Whole document * 10-13

COMPUTER, vol. 18, no. 5, May 1985, pages 29-35 New York, US; M. MORITA: "Japanese text- input system" * Abstract * 10-13

The present search report has been drawn up for all claims -3- Place of search Date of completion of the search Examiner

CATEGORY OF CITED DOCUMENTS T : theory or principle underlying the invention E : earlier patent document, but published on, or X : particularly relevant if taken alone after the filing date Y : particularly relevant if combined with another D : document cited in the application document of the same category L : document cited for other reasons A : technological background 0 : non-written disclosure member of the same patent family, corresponding P : intermediate document document European Patent . Application number J) EUROPEAN SEARCH REPORT Dffice EP 86 30 9765 -3-

DOCUMENTS CONSIDERED TO BE RELEVANT Citation of document with indication, where appropriate, Heievant ULAootrluA I IUN Ur I lit Category of relevant passages to claim APPLICATION (Int. CI.*)

JS - A - 4 484 305 (HO) * Abstract; figure 3 * L0

INFORMATION PROCESSING 80, Proceedings of IFIP Congress 80, rokyo, JP - October 6-9, 1980 yielbourne, AU - October 14-17,1980 pages 139-143 EC. KODAMA et al.: "The japanese word processor JW-10"

* Whole document * L6,18

A 11

A LIS - A- 4 544 276 (HORODECK) * Abstract; column 6, line 23 - 16,18 column 13, line 6 * TECMNIUAL HfcLLKS IBM TECHNICAL DISCLOSURE BULLETIN, SEARCHED (Int. CI.*) vol. 26, no. 3A, August 1983, pages 1147-1148 New York, US; K. KAWAI: "Multiple Kana/Kanji conversion method" * Whole document * 16,18

DE - A - 3 505 291 (SIEMENS) * Abstract * 16-18

The present search report has been drawn up for all claims Place of search Date of completion ot tne search bxaminer

CATEGORY OF CITED DOCUMENTS T : theory or principle underlying tne invention E : earlier patent document, but published on, or X : particularly relevant if taken alone after the filing date Y : particularly relevant if combined with another D : document cited in the application document of the same category L : document cited for other reasons A : technological background O : non-written disclosure member ot tne same patent Tamny, corresponamg P : intermediate document document