Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet

Total Page:16

File Type:pdf, Size:1020Kb

Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet Yafang Huang1;2, Hai Zhao1;2;∗, 1Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China 2Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China [email protected], [email protected] ∗ Abstract between pinyin syllables and Chinese characters. (Huang et al., 2018; Yang et al., 2012; Jia and Chinese pinyin input method engine (IME) Zhao, 2014; Chen et al., 2015) regarded the P2C as converts pinyin into character so that Chinese a translation between two languages and solved it characters can be conveniently inputted into computer through common keyboard. IMEs in statistical or neural machine translation frame- work relying on its core component, pinyin- work. The fundamental difference between (Chen to-character conversion (P2C). Usually Chi- et al., 2015) work and ours is that our work is a nese IMEs simply predict a list of character fully end-to-end neural IME model with extra at- sequences for user choice only according to tention enhancement, while the former still works user pinyin input at each turn. However, Chi- on traditional IME only with converted neural net- nese inputting is a multi-turn online procedure, work language model enhancement. (Zhang et al., which can be supposed to be exploited for fur- 2017) introduced an online algorithm to construct ther user experience promoting. This paper thus for the first time introduces a sequence- appropriate dictionary for P2C. All the above men- to-sequence model with gated-attention mech- tioned work, however, still rely on a complete in- anism for the core task in IMEs. The pro- put pattern, and IME users have to input very long posed neural P2C model is learned by en- pinyin sequence to guarantee the accuracy of P2C coding previous input utterance as extra con- module as longer pinyin sequence may receive less text to enable our IME capable of predicting decoding ambiguity. character sequence with incomplete pinyin in- The Chinese IME is supposed to let user in- put. Our model is evaluated in different bench- mark datasets showing great user experience put Chinese characters with least inputting cost, improvement compared to traditional models, i.e., keystroking, which indicates extra content which demonstrates the first engineering prac- predication from incomplete inputting will be ex- tice of building Chinese aided IME. tremely welcomed by all IME users. (Huang et al., 2015) partially realized such an extra predication 1 Introduction using a maximum suffix matching postprocess- ing in vocabulary after SMT based P2C to predict Pinyin is the official romanization representation longer words than the inputted pinyin. for Chinese and the P2C converting the inputted To facilitate the most convenience for such an pinyin sequence to Chinese character sequence is IME, in terms of a sequence to sequence model as the most basic module of all pinyin based IMEs. neural machine translation (NMT) between pinyin Most of the previous research (Chen, 2003; sequence and character sequence, we propose a Zhang et al., 2006; Lin and Zhang, 2008; Chen P2C model with the entire previous inputted ut- and Lee, 2000; Jiang et al., 2007; Cai et al., 2017a) terance confirmed by IME users being used as a for IME focused on the matching correspondence part of the source input. When learning the type ∗ Corresponding author. This paper was partially sup- of the previous utterance varies from the previous ported by National Key Research and Development Program of China (No. 2017YFB0304100), National Natural Science sentence in the same article to the previous turn of Foundation of China (No. 61672343 and No. 61733011), utterance in a conversation, the resulting IME will Key Project of National Society Science Foundation of China make amazing predication far more than what the (No. 15-ZDA041), The Art and Science Interdisciplinary Funds of Shanghai Jiao Tong University (No. 14JCRZ04), pinyin IME users actually input. and the joint research project with Youtu Lab of Tencent. In this paper, we adopt the attention-based NMT 2923 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2923–2929 Brussels, Belgium, October 31 - November 4, 2018. c 2018 Association for Computational Linguistics User Selection 天气 还 可以 的 EOS Gated Attention User choice t-2 (c: context) P2C P2C Pinyint-1 Pinyint-1 Cand. list t-1 User choice t-1 Cand. list t-1 User choice t-1 (c: context) 今天 的 天气 怎么样 呢 BC tianqi hai keyi de EOS (How's the weather) (Weather is not bad) (c) Simple C+ P2C Model: Simple context-enhanced pinyin-to-character model, straightforwardly concat the context to the source pinyin sequence. Pinyint Pinyint Source Emb. Gated Attention 天气 还 可以 的 EOS Cand. list t User choice t Cand. listt User choice t (c: context) Context Emb. BiLSTM Encoder Target Emb. RNN Decoder Pinyin Pinyin t+1 t+1 BiGRU Cand. list t+1 User choicet+1 Cand. listt+1 User choice t+1 (c: context) (a) The data flow of (b) The data flow of our 今天 的 天气 怎么样 呢 tianqi hai keyi de EOS traditional P2C gated-attention enhanced P2C (d) Gated C+ P2C Model: Gated Attention based Context-enhanced Pinyin-to-Character model. Figure 1: Architecture of the proposed model. framework in (Luong et al., 2015) for the P2C As introduced a hybrid source side input, our task. In contrast to related work that simply ex- model has to handle document-wide translation tended the source side with different sized context by considering discourse relationship between two window to improve of translation quality (Tiede- consecutive sentences. The most straightforward mann and Scherrer, 2017), we add the entire input modeling is to simply concatenate two types of utterance according to IME user choice at previous source inputs with a special token ’BC’ as sepa- time (referred to the context hereafter). Hence the rator. Such a model is in Figure1(c). However, resulting IME may effectively improve P2C qual- the significant drawback of the model is that there ity with the help of extra information offered by are a slew of unnecessary words in the extended context and support incomplete pinyin input but context (previous utterance) playing a noisy role predict complete, extra, and corrected character in the source side representation. output. The evaluation and analysis will be per- To alleviate the noise issue introduced by the formed on two Chinese datasets, include a Chi- extra part in the source side, inspired by the nese open domain conversations dataset for veri- work of (Dhingra et al., 2016; Pang et al., 2016; fying the effectiveness of the proposed method. Zhang et al., 2018c,a,b; Cai et al., 2017b), our model adopts a gated-attention (GA) mechanism 2 Model that performs multiple hops over the pinyin with the extended context as shown in Figure1(d). As illustrated in Figure 1, the core of our P2C In order to ensure the correlation between each is based the attention-based neural machine trans- other, we build a parallel bilingual training cor- lation model that converts at word level. Still, pus and use it to train the pinyin embeddings and we formulize P2C as a translation between pinyin the Chinese embeddings at once. We use two and character sequences as shown in a traditional Bidirectional gated recurrent unit (BiGRU) (Cho model in Figure1(a). However, there comes a et al., 2014) to get contextual representations of key difference from any previous work that our the source pinyin and context respectively, Hp = source language side includes two types of inputs, BiGRU(P );Hc = BiGRU(C), where the repre- the current source pinyin sequence (noted as P ) as sentation of each word is formed by concatenating usual, and the extended context, i.e., target charac- the forward and backward hidden states. ter sequence inputted by IME user last time (noted For each pinyin pi in Hp, the GA module as C). As IME works dynamically, every time forms a word-specific representation of the con- IME makes a predication according to a source text ci 2 Hc using soft attention, and then adopts pinyin input, user has to indicate the ’right answer’ element-wise product to multiply the context rep- to output target character sequence for P2C model resentation with the pinyin representation. αi = T learning. This online work mode of IMEs can be softmax(Hc pi); βi = Cαi; xi = pi βi, where fully exploited by our model whose work flow is is multiplication operator. shown in Figure1(b). The pinyin representation H~p = x1; x2; :::; xk 2924 is augmented by context representation and then user’s input does not end with typing enter, we can sent into the encoder-decoder framework. The regard the current input pinyin sequence is an in- encoder is a bi-directional long short-term mem- complete one. ory (LSTM) network (Hochreiter and Schmidhu- ber, 1997). The vectorized inputs are fed to for- 3.2 Datasets and Settings ward and backward LSTMs to obtain the internal representation of two directions. The output for PD DC each input is the concatenation of the two vec- Train Test Train Test tors from both directions. Our decoder based on # Sentence 5.0M 2.0K 1.0M 2.0K the global attentional models proposed by (Luong L < 10 % 88.7 89.5 43.0 54.0 et al., 2015) to consider the hidden states of the en- L < 50 % 11.3 10.5 47.0 42.0 coder when deriving the context vector. The prob- L > 50 % 0.0 0.0 4.0 2.0 ability is conditioned on a distinct context vector Relativity % 18.0 21.1 65.8 53.4 for each target word.
Recommended publications
  • China's Shanzhai Entrepreneurs Hooligans Or
    China’s Shanzhai Entrepreneurs Hooligans or Heroes? 《中國⼭寨企業家:流氓抑或是英雄》 Callum Smith ⾼林著 Submitted for Bachelor of Asia Pacific Studies (Honours) The Australian National University October 2015 2 Declaration of originality This thesis is my own work. All sources used have been acknowledGed. Callum Smith 30 October 2015 3 ACKNOWLEDGEMENTS I am indebted to the many people whose acquaintance I have had the fortune of makinG. In particular, I would like to express my thanks to my hiGh-school Chinese teacher Shabai Li 李莎白 for her years of guidance and cherished friendship. I am also grateful for the support of my friends in Beijing, particularly Li HuifanG 李慧芳. I am thankful for the companionship of my family and friends in Canberra, and in particular Sandy 翟思纯, who have all been there for me. I would like to thank Neil Thomas for his comments and suggestions on previous drafts. I am also Grateful to Geremie Barmé. Callum Smith 30 October 2015 4 CONTENTS ACKNOWLEDGEMENTS ................................................................................................................................ 3 ABSTRACT ........................................................................................................................................................ 5 INTRODUCTION .............................................................................................................................................. 6 THE EMERGENCE OF A SOCIOCULTURAL PHENOMENON ...................................................................................................
    [Show full text]
  • The Challenge of Chinese Character Acquisition
    University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Faculty Publications: Department of Teaching, Department of Teaching, Learning and Teacher Learning and Teacher Education Education 2017 The hC allenge of Chinese Character Acquisition: Leveraging Multimodality in Overcoming a Centuries-Old Problem Justin Olmanson University of Nebraska at Lincoln, [email protected] Xianquan Chrystal Liu University of Nebraska - Lincoln, [email protected] Follow this and additional works at: http://digitalcommons.unl.edu/teachlearnfacpub Part of the Bilingual, Multilingual, and Multicultural Education Commons, Chinese Studies Commons, Curriculum and Instruction Commons, Instructional Media Design Commons, Language and Literacy Education Commons, Online and Distance Education Commons, and the Teacher Education and Professional Development Commons Olmanson, Justin and Liu, Xianquan Chrystal, "The hC allenge of Chinese Character Acquisition: Leveraging Multimodality in Overcoming a Centuries-Old Problem" (2017). Faculty Publications: Department of Teaching, Learning and Teacher Education. 239. http://digitalcommons.unl.edu/teachlearnfacpub/239 This Article is brought to you for free and open access by the Department of Teaching, Learning and Teacher Education at DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in Faculty Publications: Department of Teaching, Learning and Teacher Education by an authorized administrator of DigitalCommons@University of Nebraska - Lincoln. Volume 4 (2017)
    [Show full text]
  • Strategic Social Media Tools and Increasing Engagement
    Strategic Social media tools and increasing engagement Soolcio D I ST RI C T CO UN C IL In this guide 1. How to make best use ofyour time and money 2. What's in your social media too/kit? 3. How can you increase engagement? Soolcio D I ST RI C T CO UN C IL 1. Investment should you focus your time and money? Soolcio D I ST RI C T CO UN C IL Let’s think about 1. Platforms 3. Photography 2. Planning 4. Promotions Soolcio D I ST RI C T CO UN C IL 1. Platforms Always better to do fewer platforms, brilliantly You don't have to be everywhere, all the time! Think: Where is my audience? Soolcio D I ST RI C T CO UN C IL ------------------- ----------- 9!.!Je! · little_aae_kitchen ti II ~• • • • little_acre_kitchen • Follow eny oc •, d • Origi al io little_acre_kitchen ; PY F D Y! ; Biscoff brownies o t e counter! I you vould like to book a house old table or the veeken t en ge i touch! 01480 46405 □ eve s@hallandco en desig .co.uk 3w simply_rose711 Are hese available for take away? 3 1 like Rep y (? 0 Y/ Lik by cambridgejuiceco an 75 others e Q ddacommen ... Huntingdonshire- DISTRICT COUNCIL 2. Planning Put time aside to think: • What type ofcontent you'll post • What you won't post about! • Who's going to do it • What will be sustainable Soolcio D I ST RI C T CO UN C IL SOCIAL MEDIA CHECKLIST Planning Maybe you're new to social media - or you want to take a fresh 2.
    [Show full text]
  • The Challenge of Chinese Character Acquisition: Leveraging Multimodality in Overcoming a Centuries-Old Problem
    The Emerging Learning Design Journal Volume 4 Issue 1 Article 1 February 2018 The Challenge of Chinese Character Acquisition: Leveraging Multimodality in Overcoming a Centuries-Old Problem Justin Olmanson University of Nebraska Lincoln Xianquan Liu University of Nebraska Lincoln Follow this and additional works at: https://digitalcommons.montclair.edu/eldj Part of the Instructional Media Design Commons Recommended Citation Olmanson, Justin and Liu, Xianquan (2018) "The Challenge of Chinese Character Acquisition: Leveraging Multimodality in Overcoming a Centuries-Old Problem," The Emerging Learning Design Journal: Vol. 4 : Iss. 1 , Article 1. Available at: https://digitalcommons.montclair.edu/eldj/vol4/iss1/1 This Article is brought to you for free and open access by Montclair State University Digital Commons. It has been accepted for inclusion in The Emerging Learning Design Journal by an authorized editor of Montclair State University Digital Commons. For more information, please contact [email protected]. Volume 4 (2017) pgs. 1-9 Emerging Learning http://eldj.montclair.edu eld.j ISSN 2474-8218 Design Journal The Challenge of Chinese Character Acquisition: Leveraging Multimodality in Overcoming a Centuries-Old Problem Justin Olmanson and Xianquan Liu University of Nebraska Lincoln 1400 R St, Lincoln, NE 68588 [email protected] May 23, 2017 ABSTRACT For learners unfamiliar with character-based or logosyllabic writing systems, the process of developing literacy in written Chinese poses significantly more obstacles than learning
    [Show full text]
  • NY IME Instructions
    Selecting Simplified Input Method To type in simplified characters, click on the arrow to the right of the selected input language at the top left corner of your screen. Then select "Chinese (Simplified)" from the drop-down list that appears, as shown below. Please note that after each time you select a language, you must use your mouse to reposition your cursor inside the response box before you can begin typing. Note: You may also toggle between your selected Chinese input method and English by using the key code CTRL+SPACEBAR. IMPORTANT: Once you have selected a Chinese input language, if you press SHIFT, your keyboard will type in English until you press SHIFT again to resume Chinese input. Text Entry and Character Selection 1. Use the keyboard to type the pronunciation of the desired Chinese character in Simplified Chinese. As you type, a long window will appear with numbered character options, as shown in the example below. If you do not see the desired character in the window, type additional letters or use the mouse to click on the black arrow at the end of the window to see another set of character options. 2. If the first character shown is the desired character, press SPACEBAR to select it. To select another character, use the mouse to click on the character in the window or type the number that corresponds to the desired character. 3. The character you selected will now appear in the response box. Press ENTER or SPACEBAR to confirm the character and move the cursor forward.
    [Show full text]
  • Chinats Language Input System in the Digital Age Affects Childrents Reading Development
    China’s language input system in the digital age affects children’s reading development Li Hai Tan1,2, Min Xu1, Chun Qi Chang, and Wai Ting Siok2 State Key Laboratory of Brain and Cognitive Sciences, Department of Linguistics, and Shenzhen Institute of Research and Innovation, University of Hong Kong, Hong Kong Edited by Dale Purves, Duke–National University of Singapore Graduate Medical School, Singapore, and approved December 5, 2012 (received for review August 7, 2012) Written Chinese as a logographic system was developed over 3,000 analysis of characters and to establish their representation in long- y ago. Historically, Chinese children have learned to read by learning term memory (21–25). to associate the visuo-graphic properties of Chinese characters with Severe reading difficulty arises when children fail to establish lexical meaning, typically through handwriting. In recent years, a cohesive reading circuit that links orthography, phonology, and however, many Chinese children have learned to use electronic meaning. Reading difficulty is defined as an unexpectedly low communication devices based on the pinyin input method, which reading ability in people who have adequate nonverbal intelligence, associates phonemes and English letters with characters. When have acquired typical schooling, and have experienced sufficient children use pinyin to key in letters, their spelling no longer depends sociocultural opportunities (1–12). Estimates of prevalence of se- on reproducing the visuo-graphic properties of characters that are vere reading difficulty in English range from 5% to 17% (3, 8, 11, indispensable to Chinese reading, and, thus, typing in pinyin may 13). Severe reading difficulty has also been found in Chinese conflict with the traditional learning processes for written Chinese.
    [Show full text]
  • Open Vocabulary Learning for Neural Chinese Pinyin IME
    Open Vocabulary Learning for Neural Chinese Pinyin IME Zhuosheng Zhang1;2, Yafang Huang1;2, Hai Zhao1;2;∗ 1Department of Computer Science and Engineering, Shanghai Jiao Tong University 2Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China fzhangzs, [email protected], [email protected] Abstract lation between two different languages, pinyin se- quences and Chinese character sequences (namely Pinyin-to-character (P2C) conversion is the Chinese sentence). Actually, such a translation in core component of pinyin-based Chinese in- put method engine (IME). However, the con- P2C procedure is even more straightforward and version is seriously compromised by the am- simple by considering that the target Chinese char- biguities of Chinese characters corresponding acter sequence keeps the same order as the source to pinyin as well as the predefined fixed vocab- pinyin sequence, which means that we can decode ularies. To alleviate such inconveniences, we the target sentence from left to right without any propose a neural P2C conversion model aug- reordering. mented by an online updated vocabulary with Meanwhile, there exists a well-known challenge a sampling mechanism to support open vocab- ulary learning during IME working. Our ex- in P2C procedure, too much ambiguity mapping periments show that the proposed method out- pinyin syllable to character. In fact, there are only performs commercial IMEs and state-of-the- about 500 pinyin syllables corresponding to ten art traditional models on standard corpus and thousands of Chinese characters, even though the true inputting history dataset in terms of multi- amount of the commonest characters is more than ple metrics and thus the online updated vocab- 6,000 (Jia and Zhao, 2014).
    [Show full text]
  • Elementary Mandarin Immersion Students Learning Alphabetic Pinyin and Using Pinyin to Learn Chinese Characters a DISSERTATION S
    Elementary Mandarin Immersion Students Learning Alphabetic Pinyin and Using Pinyin to Learn Chinese Characters A DISSERTATION SUBMITTED TO THE FACULTY OF THE UNIVERSITY OF MINNESOTA BY Zhongkui Ju IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Dr. Martha Bigelow August 2019 © Zhongkui Ju 2019 Acknowledgements First of all, I would like to thank the American Council on the Teaching of Foreign Languages (ACTFL), Language Learning, National Federation of Modern Languages Teaching Associations (NFMLTA)/National Council of Less Commonly Taught Languages (NCOLCTL), Pearson Education, and the Department of Curriculum and Instruction at the University of Minnesota for their generous support and recognition of this project. Second, I want to acknowledge those who have supported me, guided me, and helped me through every step of this process to bring me to this point. I am grateful to my advisor, Dr. Martha Bigelow, for her faith in me. She is more than an advisor but also a friend, a confidant, and a cheerleader during this PhD journey. I appreciate that she understands that earning a scholarship is not the most challenging hardship in graduate school. I am also grateful to Dr. Bob delMas for his advice, expertise, patience, and role- modeling for what it means to be a rigorous researcher. His purposeful questions, comments, and suggestions created valuable learning moments during the process of writing this dissertation. I have learned so much from him over the past six months. I am indebted to Dr. Yanling Zhou for her support, advice, sharing, and time commitment to assist me from Hong Kong, answering my questions online and attending my oral defense in Minneapolis.
    [Show full text]
  • Supporting the Chinese Language in Oracle Text
    Supporting the Chinese Language in Oracle® Text Masterarbeit im Fach Arbeits-, Lern- und Präsentationstechniken Masterstudiengang Informationswirtschaft der Fachhochschule Stuttgart – Hochschule der Medien Poh Choo Lai Erstprüfer: Prof. Dr.-Ing. Peter Lehmann Zweitprüferin: Frau Barbara Steinhanses Bearbeitungszeitraum: 20. Okt 2003 bis 31. Mär 2004 Stuttgart, März 2004 Erklärung ii Erklärung Hiermit erkläre ich, dass ich die vorliegende Masterarbeit selbständig angefertigt habe. Es wurden nur die in der Arbeit ausdrücklich benannten Quellen und Hilfsmittel benutzt. Wörtlich oder sinngemäß übernommenes Gedankengut habe ich als solches kenntlich gemacht. Ort, Datum Unterschrift Kurzfassung iii Kurzfassung Gegenstand dieser Arbeit sind die Problematik von chinesischem Information Retrieval (IR) sowie die Faktoren, die die Leistung eines chinesischen IR-System beeinflussen können. Experimente wurden im Rahmen des Bewertungsmodells von „TREC-5 Chinese Track“ und der Nutzung eines großen Korpusses von über 160.000 chinesischen Nachrichtenartikeln auf einer Oracle10g (Beta Version) Datenbank durchgeführt. Schließlich wurde die Leistung von Oracle® Text in einem so genannten „Benchmarking“ Prozess gegenüber den Ergebnissen der Teilnehmer von TREC-5 verglichen. Die Hauptergebnisse dieser Arbeit sind: (a) Die Wirksamkeit eines chinesischen IR Systems ist durch die Art und Weise der Formulierung einer Abfrage stark beeinflusst. Besonders sollte man während der Formulierung einer Anfrage die Vielzahl von Abkürzungen und die regionalen Unterschiede
    [Show full text]
  • Universidad De El Salvador Facutad Multidisciplinaria Oriental Departamento De Ciencias Y Humanidades Seccion De Educacion
    UNIVERSIDAD DE EL SALVADOR FACUTAD MULTIDISCIPLINARIA ORIENTAL DEPARTAMENTO DE CIENCIAS Y HUMANIDADES SECCION DE EDUCACION AÑO ACADEMICO: CICLO II, 2018 TEMA: LOS BUSCADORES (GOOGLE) DOCENTE: JORGE ENESTO PORTILLO. ESTUDINTE: GRANADOS VENTURA JOSE ISAAC. CIUDAD UNIVERSITARIA ORIENTAL, OCTUBRE 2018 INTRODUCCION En el presente trabajo daremos a conocer una gama de buscadores que son útiles para que las personas, busquen información de acuerdo a su intereses, cada buscador posee sus ventajas y desventajas en esta sección solo se presentara una síntesis por cada uno de los buscadores nos centraremos en aspectos importantes sobre el buscador google, su conceptos así como también sus aspectos generales, las características que este posee como material didáctico, además su clasificación, luego presentamos la importancia que este tiene para el procesos de enseñanza aprendizaje, así como también las ventas y desventajas que este buscador tiene, de igual manera se presentara el procesos detallado para la elaboración de la búsqueda de información de google, y el procesos como se usa en el aula. 2 INDICE OBJETIVOS.................................................................................................................................................... 5 OBJETIVO GENERAL. ..................................................................................................................................... 5 OBJETIVO ESPECIFICOS. ...........................................................................................................................
    [Show full text]
  • Using Google Pinyin IME
    Back Home Blog Tools Media Textbooks Using Google Pinyin IME Google Pinyin IME (谷歌拼音输入法 ) Free download here (http://www.google.com/intl/zh-CN/ime/pinyin/) 说到中文打字,谷歌拼音输入法比微软拼音输入法更快捷:中英转换不必换键,只需连着打,或按一下Shift 即 可。屏幕下方显示的控制台,可便于选择软键盘标注拼音调号,或繁简转换。此程序没有Mac电脑版本,不过 Mac使用者可以用QIM(新版称为 IMKQIM), 非常快捷。(详见关于QIM的说明)。 When it comes to typing Chinese on PC, Google Pinyin IME (freely available) is superior to Microsoft IME built into the Windows OS. The program is similar to MS IME in many ways, but it is much faster and easier. It can be freely downloaded and installed (and learned) in seconds! This user-friendly program has these features: 1. Easily switch between English and Chinese without changing keys, or just press Shift once. 2. The visible control center makes it easy to add pinyin tone marks, or switch between simplified and traditional character forms. 3. Type faster using the sentence mode. Just keep typing, and the system will automatically correct wrong characters. Download When you are on the download page (right), click on the yellow area to download the program, and then open the file and install it to your computer. Control Center After you have installed the Google pinyin IME program, click on the language bar (EN or CH) to choose Chinese (CH), then choose Google IME (谷歌 ) if you have more than one input methods. The Control Center should be displayed (normally at the bottom right of your screen). Click on each symbol to see how it works. Note that the moon shape should be kept as the following. If you change it to a full moon, it will affect your English text.
    [Show full text]
  • Six-Digit Stroke-Based Chinese Input Method Lai-Man Po, Chi-Kwan Wong, Yiu-Ki Au, Ka-Ho Ng and Ka-Man Wong
    Six-Digit Stroke-based Chinese Input Method Lai-Man Po, Chi-Kwan Wong, Yiu-Ki Au, Ka-Ho Ng and Ka-Man Wong Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong Email: [email protected] Abstract—During the last three decades, more than one B. Pronunciation-based Chinese Input Methods thousand Chinese input methods have been developed. However, The two most popular pronunciation-based Chinese input people are still looking for better input methods in terms of easy to use, easy to remember, high input speed and small keypad methods in mainland China and Taiwan are Pinyin (拼音) [3] implementation on handheld devices. The well-known stroke- and Zhuyin (注音), respectively. The Pinyin method simply based Chinese input method using only five basic stroke types uses the Pinyin table as its conversion dictionary. This method could achieve low learning curve and small numeric keypad is very easy to learn for people who are already familiar with implementation but its input speed is limited for complex Chinese Putonghua (or Mandarin). For inputting the Chinese character characters with a lot of strokes. To tackle this problem, of “汉“, we can enter its Pinyin letters of “han” into the simplified stroke-based Chinese character and phrase coding methods using (3+3) rules are proposed in this paper. The computer for starting the character input. However, we will proposed method only uses the first 3 stroke codes and the last 3 get a list of 30 or more candidates to choose from, as there are stroke codes to represent the first and last radical information of a lot of Chinese characters with identical Pinyin string.
    [Show full text]