Visually and Phonologically Similar Characters in Incorrect Simplified Chinese Words

Chao-Lin Liu† Min-Hua Lai‡ Yi-Hsuan ChuangĹ Chia-Ying Leeľ
†‡ĹDepartment of Computer Science; †ľCenter for Mind, Brain, and Learning, National Chengchi University
ľInstitute of Linguistics, Academia Sinica
{†chaolin, ‡g9523, Ĺg9804}@cs.nccu.edu.tw, ľ[email protected]

Coling 2010: Poster Volume, pages 739–747, Beijing, August 2010

Abstract

Visually and phonologically similar characters are major contributing factors for errors in Chinese text. By defining appropriate similarity measures that consider extended Cangjie codes, we can identify visually similar characters within a fraction of a second. Relying on the pronunciation information noted for individual characters in Chinese lexicons, we can compute a list of characters that are phonologically similar to a given character. We collected 621 incorrect Chinese words reported on the Internet, and analyzed the causes of these errors. 83% of these errors were related to phonological similarity, and 48% of them were related to visual similarity between the involved characters. By generating lists of phonologically and visually similar characters, our programs were able to cover more than 90% of the incorrect characters in the reported errors.

1 Introduction

In this paper, we report our experience of studying the errors in simplified Chinese words. Chinese words consist of individual characters. Some words contain just one character, but most words comprise two or more characters. For instance, “䟡” (mai4) has just one character, and “俟ق” (yu3 yan2) is formed by two characters.[1] The two most common causes of writing or typing incorrect Chinese words are visual and phonological similarity between the correct and the incorrect characters. For instance, one might use “ӊ” (hwa2) in the place of “㣣” (hwa4) in “څ㣣׎ຝ” (ke1 hwa4 xing2 xiang4) partially because of phonological similarity; one might replace “ܟ” (zhuo2) in “Ј㞇Κܟ” (xin1 lao2 li4 zhuo2) with “䶅” (chu4) partially due to visual similarity. (We do not claim that the visual or phonological similarity alone can explain the observed errors.)

Similar characters are important for understanding the errors in both traditional and simplified Chinese. Liu et al. (2009a-c) applied techniques for manipulating correctness of Chinese words to computer assisted test-item generation. Research in psycholinguistics has shown that the number of neighbor characters influences the timing of activating the mental lexicon during the process of understanding Chinese text (Kuo et al. 2004; Lee et al. 2006). Having a way to compute and find similar characters will facilitate the process of finding neighbor words, and so can be instrumental for related studies in psycholinguistics. Algorithms for optical character recognition for Chinese and for recognizing written Chinese try to guess the input characters based on confusing sets (Fan et al. 1995; Liu et al., 2004). The confusing sets happen to be hand-crafted clusters of visually similar characters.

It is relatively easy to judge whether two characters have similar pronunciations based on their records in a given Chinese lexicon. We will discuss more related issues shortly.

Determining whether two characters are visually similar is not as easy. Image processing techniques may be useful but are not perfectly feasible, given that there are more than fifty thousand Chinese characters (HanDict, 2010) and that many of them are similar to each other in special ways. Liu et al. (2008) extend the Cangjie codes (Cangjie, 2010; Chu, 2010) to encode the layouts and details about traditional Chinese characters for computing visually similar characters. Evidence observed in psycholinguistic studies offers cognition-based support for the design of Liu et al.’s approach (Yeh and Li, 2002). In addition, the proposed method proves to be effective in capturing incorrect traditional Chinese words (Liu et al., 2009a-c).

In this paper, we work on the errors in simplified Chinese words by extending the Cangjie codes for simplified Chinese. We obtain two lists of incorrect words that were reported on the Internet, analyze the major reasons that contribute to the observed errors, and evaluate how the new Cangjie codes help us spot the incorrect characters. Results of our analysis show that phonological and visual similarities contribute similar portions of errors in simplified and traditional Chinese. Experimental results also show that we can catch more than 90% of the reported errors.

We go over some issues about phonological similarity in Section 2, elaborate how we extend and apply Cangjie codes for simplified Chinese in Section 3, present details about our experiments and observations in Section 4, and discuss some technical issues in Section 5.

[1] We show simplified Chinese characters followed by their Hanyu pinyin. The digit that follows the symbols for the sound is the tone for the character.

2 Phonologically Similar Characters

The pronunciation of a Chinese character involves a sound, which consists of the nucleus and an optional onset, and a tone. In Mandarin Chinese, there are four tones. (Some researchers include the fifth tone.)

In our work, we consider four categories of phonological similarity between two characters: same sound and same tone (SS), same sound and different tone (SD), similar sound and same tone (MS), and similar sound and different tone (MD).

We rely on the information provided in a lexicon (Dict, 2010) to determine whether two characters have the same sound or the same tone. The judgment of whether two characters have similar sound should consider the language experience of an individual. Someone who lives in southern China and someone who lives in northern China may have quite different perceptions of “similar” sound. In this work, we resort to the confusion sets observed in a psycholinguistic study conducted at the Academia Sinica.

Some Chinese characters are heteronyms. Let C1 and C2 be two characters that have multiple pronunciations. If C1 and C2 share one of their pronunciations, we consider that C1 and C2 belong to the SS category. This principle applies when we consider phonological similarity in other categories.

One challenge in defining similarity between characters is that the pronunciations of a character can depend on its context. The most common example of tone sandhi in Chinese (Chen, 2000) is that the first third-tone character in words formed by two adjacent third-tone characters will be pronounced in the second tone. At present, we ignore the influences of context when determining whether two characters are phonologically similar.

Although we have confined our definition of phonological similarity to the context of Mandarin Chinese, it is important to note that sublanguages within the Chinese language family affect the perception of phonological similarity. Sublanguages used in different areas in China, e.g., Shanghai, Min, and Canton, share the same written forms with Mandarin Chinese, but have quite different though related pronunciation systems. Hence, people living in different areas in China may perceive phonological similarity in very different ways. The study in this direction is beyond the scope of the current study.

3 Visually Similar Characters

Figure 1 shows four groups of visually similar characters. Characters in group 1 and group 2 differ subtly at the stroke level. Characters in group 3 share the components on their right sides. The shared component of the characters in group 4 appears at different places within the characters.

Figure 1. Examples of visually similar characters

Radicals are used in Chinese dictionaries to organize characters, so they are useful for finding visually similar characters. The characters in group 1 and group 2 belong to the radicals “Җ” and “ ”, respectively. Notice that, although the radical for group 2 is clear, the radical for group 1 is not obvious because “Җ” is not a standalone component.

However, the shared components might not be the radicals of characters. The shared components in groups 3 and 4 are not the radicals. In many cases, radicals are semantic components of Chinese characters. In groups 3 and 4, the shared components carry information about the pronunciations of the characters. Hence, those characters are listed under different radicals, though they do look similar in some ways.

Hence, a mechanism other than just relying on information about characters in typical lexicons is necessary, and we will use the extended Cangjie codes for finding visually similar characters.

3.1 Cangjie Codes for Simplified Chinese

Table 1 shows the Cangjie codes for the 13 characters listed in Figure 1 and five other characters. The “ID” column shows the identification number for the characters, and we will refer to the i-th character by ci, where i is the ID. The “CC” column shows the Chinese characters, and the “Cangjie” column shows the Cangjie codes. Each symbol in the Cangjie codes corresponds to a key on the keyboard, e.g. [...]

Table 1. Examples of Cangjie codes
ID   CC   Cangjie
1    Җ    Җ
2    җ    ύҖ
3    Ҙ    Җύ
4    ҙ    ύҖύ
5    侪   ЉЉζΓΜ
6    侘   ЉЉζ΋Μ
7    侓   ЉЉζΜ
8    呆   ψψ΋ВВ
9    厇   ψψ΋Јα
10   偂   ДΓЈЈЉ
11   㟻   НЈЈЉ
12   ᫴   ЕЈЈЉ
13   䠑   αДΓ
14   䡿   ҖααДΓ
15   䟋   αΓεν
16   䟄   ψ΋εν
17   勹   ψ΋΋ДΓ
18   䶈   ζ΋ψΓ΋

[...] Cangjie codes for computing similar characters. The prefix “ψ΋” in c16 and c17 can represent “ ”, “吏” (e.g., c8), and “卺” (e.g., c9). Characters whose Cangjie codes include “ψ΋” may contain any of these three components, but they do not really look alike.

Therefore, we augment the original Cangjie codes by using the complete Cangjie codes and [...]