Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet

Total Page:16

File Type:pdf, Size:1020Kb

Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet Yafang Huang1;2, Hai Zhao1;2;∗, 1Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China 2Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China [email protected], [email protected] ∗ Abstract between pinyin syllables and Chinese characters. (Huang et al., 2018; Yang et al., 2012; Jia and Chinese pinyin input method engine (IME) Zhao, 2014; Chen et al., 2015) regarded the P2C as converts pinyin into character so that Chinese a translation between two languages and solved it characters can be conveniently inputted into computer through common keyboard. IMEs in statistical or neural machine translation frame- work relying on its core component, pinyin- work. The fundamental difference between (Chen to-character conversion (P2C). Usually Chi- et al., 2015) work and ours is that our work is a nese IMEs simply predict a list of character fully end-to-end neural IME model with extra at- sequences for user choice only according to tention enhancement, while the former still works user pinyin input at each turn. However, Chi- on traditional IME only with converted neural net- nese inputting is a multi-turn online procedure, work language model enhancement. (Zhang et al., which can be supposed to be exploited for fur- 2017) introduced an online algorithm to construct ther user experience promoting. This paper thus for the first time introduces a sequence- appropriate dictionary for P2C. All the above men- to-sequence model with gated-attention mech- tioned work, however, still rely on a complete in- anism for the core task in IMEs. The pro- put pattern, and IME users have to input very long posed neural P2C model is learned by en- pinyin sequence to guarantee the accuracy of P2C coding previous input utterance as extra con- module as longer pinyin sequence may receive less text to enable our IME capable of predicting decoding ambiguity. character sequence with incomplete pinyin in- The Chinese IME is supposed to let user in- put. Our model is evaluated in different bench- mark datasets showing great user experience put Chinese characters with least inputting cost, improvement compared to traditional models, i.e., keystroking, which indicates extra content which demonstrates the first engineering prac- predication from incomplete inputting will be ex- tice of building Chinese aided IME. tremely welcomed by all IME users. (Huang et al., 2015) partially realized such an extra predication 1 Introduction using a maximum suffix matching postprocess- ing in vocabulary after SMT based P2C to predict Pinyin is the official romanization representation longer words than the inputted pinyin. for Chinese and the P2C converting the inputted To facilitate the most convenience for such an arXiv:1809.00329v1 [cs.CL] 2 Sep 2018 pinyin sequence to Chinese character sequence is IME, in terms of a sequence to sequence model as the most basic module of all pinyin based IMEs. neural machine translation (NMT) between pinyin Most of the previous research (Chen, 2003; sequence and character sequence, we propose a Zhang et al., 2006; Lin and Zhang, 2008; Chen P2C model with the entire previous inputted ut- and Lee, 2000; Jiang et al., 2007; Cai et al., 2017a) terance confirmed by IME users being used as a for IME focused on the matching correspondence part of the source input. When learning the type ∗ Corresponding author. This paper was partially sup- of the previous utterance varies from the previous ported by National Key Research and Development Program of China (No. 2017YFB0304100), National Natural Science sentence in the same article to the previous turn of Foundation of China (No. 61672343 and No. 61733011), utterance in a conversation, the resulting IME will Key Project of National Society Science Foundation of China make amazing predication far more than what the (No. 15-ZDA041), The Art and Science Interdisciplinary Funds of Shanghai Jiao Tong University (No. 14JCRZ04), pinyin IME users actually input. and the joint research project with Youtu Lab of Tencent. In this paper, we adopt the attention-based NMT User Selection 天气 还 可以 的 EOS Gated Attention User choice t-2 (c: context) P2C P2C Pinyint-1 Pinyint-1 Cand. list t-1 User choice t-1 Cand. list t-1 User choice t-1 (c: context) 今天 的 天气 怎么样 呢 BC tianqi hai keyi de EOS (How's the weather) (Weather is not bad) (c) Simple C+ P2C Model: Simple context-enhanced pinyin-to-character model, straightforwardly concat the context to the source pinyin sequence. Pinyint Pinyint Source Emb. Gated Attention 天气 还 可以 的 EOS Cand. list t User choice t Cand. listt User choice t (c: context) Context Emb. BiLSTM Encoder Target Emb. RNN Decoder Pinyin Pinyin t+1 t+1 BiGRU Cand. list t+1 User choicet+1 Cand. listt+1 User choice t+1 (c: context) (a) The data flow of (b) The data flow of our 今天 的 天气 怎么样 呢 tianqi hai keyi de EOS traditional P2C gated-attention enhanced P2C (d) Gated C+ P2C Model: Gated Attention based Context-enhanced Pinyin-to-Character model. Figure 1: Architecture of the proposed model. framework in (Luong et al., 2015) for the P2C As introduced a hybrid source side input, our task. In contrast to related work that simply ex- model has to handle document-wide translation tended the source side with different sized context by considering discourse relationship between two window to improve of translation quality (Tiede- consecutive sentences. The most straightforward mann and Scherrer, 2017), we add the entire input modeling is to simply concatenate two types of utterance according to IME user choice at previous source inputs with a special token ’BC’ as sepa- time (referred to the context hereafter). Hence the rator. Such a model is in Figure1(c). However, resulting IME may effectively improve P2C qual- the significant drawback of the model is that there ity with the help of extra information offered by are a slew of unnecessary words in the extended context and support incomplete pinyin input but context (previous utterance) playing a noisy role predict complete, extra, and corrected character in the source side representation. output. The evaluation and analysis will be per- To alleviate the noise issue introduced by the formed on two Chinese datasets, include a Chi- extra part in the source side, inspired by the nese open domain conversations dataset for veri- work of (Dhingra et al., 2016; Pang et al., 2016; fying the effectiveness of the proposed method. Zhang et al., 2018c,a,b; Cai et al., 2017b), our model adopts a gated-attention (GA) mechanism 2 Model that performs multiple hops over the pinyin with the extended context as shown in Figure1(d). As illustrated in Figure 1, the core of our P2C In order to ensure the correlation between each is based the attention-based neural machine trans- other, we build a parallel bilingual training cor- lation model that converts at word level. Still, pus and use it to train the pinyin embeddings and we formulize P2C as a translation between pinyin the Chinese embeddings at once. We use two and character sequences as shown in a traditional Bidirectional gated recurrent unit (BiGRU) (Cho model in Figure1(a). However, there comes a et al., 2014) to get contextual representations of key difference from any previous work that our the source pinyin and context respectively, Hp = source language side includes two types of inputs, BiGRU(P );Hc = BiGRU(C), where the repre- the current source pinyin sequence (noted as P ) as sentation of each word is formed by concatenating usual, and the extended context, i.e., target charac- the forward and backward hidden states. ter sequence inputted by IME user last time (noted For each pinyin pi in Hp, the GA module as C). As IME works dynamically, every time forms a word-specific representation of the con- IME makes a predication according to a source text ci 2 Hc using soft attention, and then adopts pinyin input, user has to indicate the ’right answer’ element-wise product to multiply the context rep- to output target character sequence for P2C model resentation with the pinyin representation. αi = T learning. This online work mode of IMEs can be softmax(Hc pi); βi = Cαi; xi = pi βi, where fully exploited by our model whose work flow is is multiplication operator. shown in Figure1(b). The pinyin representation H~p = x1; x2; :::; xk is augmented by context representation and then user’s input does not end with typing enter, we can sent into the encoder-decoder framework. The regard the current input pinyin sequence is an in- encoder is a bi-directional long short-term mem- complete one. ory (LSTM) network (Hochreiter and Schmidhu- ber, 1997). The vectorized inputs are fed to for- 3.2 Datasets and Settings ward and backward LSTMs to obtain the internal representation of two directions. The output for PD DC each input is the concatenation of the two vec- Train Test Train Test tors from both directions. Our decoder based on # Sentence 5.0M 2.0K 1.0M 2.0K the global attentional models proposed by (Luong L < 10 % 88.7 89.5 43.0 54.0 et al., 2015) to consider the hidden states of the en- L < 50 % 11.3 10.5 47.0 42.0 coder when deriving the context vector. The prob- L > 50 % 0.0 0.0 4.0 2.0 ability is conditioned on a distinct context vector Relativity % 18.0 21.1 65.8 53.4 for each target word.
Recommended publications
  • Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet
    Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet Yafang Huang1;2, Hai Zhao1;2;∗, 1Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China 2Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China [email protected], [email protected] ∗ Abstract between pinyin syllables and Chinese characters. (Huang et al., 2018; Yang et al., 2012; Jia and Chinese pinyin input method engine (IME) Zhao, 2014; Chen et al., 2015) regarded the P2C as converts pinyin into character so that Chinese a translation between two languages and solved it characters can be conveniently inputted into computer through common keyboard. IMEs in statistical or neural machine translation frame- work relying on its core component, pinyin- work. The fundamental difference between (Chen to-character conversion (P2C). Usually Chi- et al., 2015) work and ours is that our work is a nese IMEs simply predict a list of character fully end-to-end neural IME model with extra at- sequences for user choice only according to tention enhancement, while the former still works user pinyin input at each turn. However, Chi- on traditional IME only with converted neural net- nese inputting is a multi-turn online procedure, work language model enhancement. (Zhang et al., which can be supposed to be exploited for fur- 2017) introduced an online algorithm to construct ther user experience promoting. This paper thus for the first time introduces a sequence- appropriate dictionary for P2C. All the above men- to-sequence model with gated-attention mech- tioned work, however, still rely on a complete in- anism for the core task in IMEs.
    [Show full text]
  • Strategic Social Media Tools and Increasing Engagement
    Strategic Social media tools and increasing engagement Soolcio D I ST RI C T CO UN C IL In this guide 1. How to make best use ofyour time and money 2. What's in your social media too/kit? 3. How can you increase engagement? Soolcio D I ST RI C T CO UN C IL 1. Investment should you focus your time and money? Soolcio D I ST RI C T CO UN C IL Let’s think about 1. Platforms 3. Photography 2. Planning 4. Promotions Soolcio D I ST RI C T CO UN C IL 1. Platforms Always better to do fewer platforms, brilliantly You don't have to be everywhere, all the time! Think: Where is my audience? Soolcio D I ST RI C T CO UN C IL ------------------- ----------- 9!.!Je! · little_aae_kitchen ti II ~• • • • little_acre_kitchen • Follow eny oc •, d • Origi al io little_acre_kitchen ; PY F D Y! ; Biscoff brownies o t e counter! I you vould like to book a house old table or the veeken t en ge i touch! 01480 46405 □ eve s@hallandco en desig .co.uk 3w simply_rose711 Are hese available for take away? 3 1 like Rep y (? 0 Y/ Lik by cambridgejuiceco an 75 others e Q ddacommen ... Huntingdonshire- DISTRICT COUNCIL 2. Planning Put time aside to think: • What type ofcontent you'll post • What you won't post about! • Who's going to do it • What will be sustainable Soolcio D I ST RI C T CO UN C IL SOCIAL MEDIA CHECKLIST Planning Maybe you're new to social media - or you want to take a fresh 2.
    [Show full text]
  • Open Vocabulary Learning for Neural Chinese Pinyin IME
    Open Vocabulary Learning for Neural Chinese Pinyin IME Zhuosheng Zhang1;2, Yafang Huang1;2, Hai Zhao1;2;∗ 1Department of Computer Science and Engineering, Shanghai Jiao Tong University 2Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China fzhangzs, [email protected], [email protected] Abstract lation between two different languages, pinyin se- quences and Chinese character sequences (namely Pinyin-to-character (P2C) conversion is the Chinese sentence). Actually, such a translation in core component of pinyin-based Chinese in- put method engine (IME). However, the con- P2C procedure is even more straightforward and version is seriously compromised by the am- simple by considering that the target Chinese char- biguities of Chinese characters corresponding acter sequence keeps the same order as the source to pinyin as well as the predefined fixed vocab- pinyin sequence, which means that we can decode ularies. To alleviate such inconveniences, we the target sentence from left to right without any propose a neural P2C conversion model aug- reordering. mented by an online updated vocabulary with Meanwhile, there exists a well-known challenge a sampling mechanism to support open vocab- ulary learning during IME working. Our ex- in P2C procedure, too much ambiguity mapping periments show that the proposed method out- pinyin syllable to character. In fact, there are only performs commercial IMEs and state-of-the- about 500 pinyin syllables corresponding to ten art traditional models on standard corpus and thousands of Chinese characters, even though the true inputting history dataset in terms of multi- amount of the commonest characters is more than ple metrics and thus the online updated vocab- 6,000 (Jia and Zhao, 2014).
    [Show full text]
  • Universidad De El Salvador Facutad Multidisciplinaria Oriental Departamento De Ciencias Y Humanidades Seccion De Educacion
    UNIVERSIDAD DE EL SALVADOR FACUTAD MULTIDISCIPLINARIA ORIENTAL DEPARTAMENTO DE CIENCIAS Y HUMANIDADES SECCION DE EDUCACION AÑO ACADEMICO: CICLO II, 2018 TEMA: LOS BUSCADORES (GOOGLE) DOCENTE: JORGE ENESTO PORTILLO. ESTUDINTE: GRANADOS VENTURA JOSE ISAAC. CIUDAD UNIVERSITARIA ORIENTAL, OCTUBRE 2018 INTRODUCCION En el presente trabajo daremos a conocer una gama de buscadores que son útiles para que las personas, busquen información de acuerdo a su intereses, cada buscador posee sus ventajas y desventajas en esta sección solo se presentara una síntesis por cada uno de los buscadores nos centraremos en aspectos importantes sobre el buscador google, su conceptos así como también sus aspectos generales, las características que este posee como material didáctico, además su clasificación, luego presentamos la importancia que este tiene para el procesos de enseñanza aprendizaje, así como también las ventas y desventajas que este buscador tiene, de igual manera se presentara el procesos detallado para la elaboración de la búsqueda de información de google, y el procesos como se usa en el aula. 2 INDICE OBJETIVOS.................................................................................................................................................... 5 OBJETIVO GENERAL. ..................................................................................................................................... 5 OBJETIVO ESPECIFICOS. ...........................................................................................................................
    [Show full text]
  • Using Google Pinyin IME
    Back Home Blog Tools Media Textbooks Using Google Pinyin IME Google Pinyin IME (谷歌拼音输入法 ) Free download here (http://www.google.com/intl/zh-CN/ime/pinyin/) 说到中文打字,谷歌拼音输入法比微软拼音输入法更快捷:中英转换不必换键,只需连着打,或按一下Shift 即 可。屏幕下方显示的控制台,可便于选择软键盘标注拼音调号,或繁简转换。此程序没有Mac电脑版本,不过 Mac使用者可以用QIM(新版称为 IMKQIM), 非常快捷。(详见关于QIM的说明)。 When it comes to typing Chinese on PC, Google Pinyin IME (freely available) is superior to Microsoft IME built into the Windows OS. The program is similar to MS IME in many ways, but it is much faster and easier. It can be freely downloaded and installed (and learned) in seconds! This user-friendly program has these features: 1. Easily switch between English and Chinese without changing keys, or just press Shift once. 2. The visible control center makes it easy to add pinyin tone marks, or switch between simplified and traditional character forms. 3. Type faster using the sentence mode. Just keep typing, and the system will automatically correct wrong characters. Download When you are on the download page (right), click on the yellow area to download the program, and then open the file and install it to your computer. Control Center After you have installed the Google pinyin IME program, click on the language bar (EN or CH) to choose Chinese (CH), then choose Google IME (谷歌 ) if you have more than one input methods. The Control Center should be displayed (normally at the bottom right of your screen). Click on each symbol to see how it works. Note that the moon shape should be kept as the following. If you change it to a full moon, it will affect your English text.
    [Show full text]
  • Tracing a Loose Wordhood for Chinese Input Method Engine
    Tracing a Loose Wordhood for Chinese Input Method Engine Xihu Zhang, Chu Wei and Hai Zhao∗ Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China fxihuzhangcs, [email protected], [email protected] Abstract phonetic representation of Chinese language, into Chinese character sequence. For example, a user Chinese input methods are used to convert wants to type a word ”北¬”(Beijing). The cor- pinyin sequence or other Latin encoding responding pinyin beijing is inputted, then the systems into Chinese character sentences. pinyin IME provides a list of Chinese charac- For more effective pinyin-to-character ter candidates whose pinyins are all beijing, such conversion, typical Input Method Engines as ”北¬”(pinyin: beijing, the Beijing city), ”背 (IMEs) rely on a predefined vocabulary o”(pinyin: beijing, background) as presented in that demands manually maintenance on Figure 1. The user at last selects the word ”北¬” schedule. For the purpose of removing the as the result. inconvenient vocabulary setting, this work focuses on automatic wordhood acquisi- beijing tion by fully considering that Chinese in- 1.北¬ 2.背centero 3.背 hereY 4.« 5.倍 putting is a free human-computer interac- tion procedure. Instead of strictly defin- Figure 1: IME interface on one page ing words, a loose word likelihood is in- troduced for measuring how likely a char- Chinese IMEs often have to utilize word-based acter sequence can be a user-recognized language models since character-based Language word with respect to using IME. Then an Model does not produce satisfactory result (Yang online algorithm is proposed to adjust the et al., 1998), Chinese words are composed of word likelihood or generate new words multiple consecutive Chinese characters.
    [Show full text]
  • Usporedba Mobilnih Aplikacija Za Uređivanje Fotografija
    Primjena digitalne fotografije u reprodukcijskim medijima Katedra za grafički dizajn i slikovne informacije Grafički fakultet Sveučilišta u Zagrebu USPOREDBA MOBILNIH APLIKACIJA ZA UREĐIVANJE FOTOGRAFIJA SEMINARSKI RAD Nositelji kolegija i voditelj rada: Ime i prezime studenata: Dr. sc. Maja Strgar Kurečić, doc. Leon Burazin, Petar Mamić, Sara Žitvaj prosinac, 2019. Tablica sadržaja UVOD .......................................................................................................................................... 3 USPOREDBA REZULTATA ............................................................................................................ 4 Photoshop Express ................................................................................................................. 4 VSCO ..................................................................................................................................... 11 Facetune ............................................................................................................................... 14 Snapseed .............................................................................................................................. 17 Adobe Lightroom .................................................................................................................. 19 ZAKLJUČAK ............................................................................................................................... 21 LITERATURA ............................................................................................................................
    [Show full text]
  • G Ooooooooooooooo Gle
    80.000,000.000 $ 80.000,000.000 $ 60.000,000.000 $ 60.000,000.000 $ 40.000,000.000 $ 40.000,000.000 $ Brin in Page se spoznata Brin in Page sodelovati in začneta začne PageRank Algoritem splet iz 5 doniranih indeksirati Stanforda v omrežju računalnikov kapitala $100,000 zagonskega (via Andy Bechtolsheim), podjetja Google Inc.ustanovitev Dodatnih $25.000,000 zagonskega Caufield kapitala (via Kleiner Perkins in Sequoia& Byers Capital) AdWords, storitve Zagon na oglasni model prehod poslovni Blogger, platforme Prevzem vsebin za z namenom analize Google News algoritmov izboljšanje izide na borzi. Beta izdaja Podjetje Gmail z namenom analize platforme oglasnih izboljšanje vsebin za algoritmov Google Maps Izid platforme in analize z namenom zbiranja geografsko pogojenih vsebin YouTube platforme Predstavitev vsebina za z namenom analize algoritmov iskalnih izboljšanje sistema operacijskega Predstavitev leti po prevzemu (dve Android podjetja)istoimenskega lastnega brskalnika Predstavitev ki podjetju Chrome, omogoči storitev naprednih razvoj nemoten naložbe, za rezerv 50 milijard Najavijo 79 podjetij, kar prevzamejo skupno eno na teden. kot več Mobility Motorola prevzem Rekordni dolarjev. 12 milijard ceno za 20.000,000.000 $ 20.000,000.000 $ Likvidna sredstva 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 Likvidna sredstva G ooooo ooooo ooooo gle let “Sklepava, da bo iskalnik z oglaševalskim poslovnim modelom neizogibno pristranski v korist oglaševalcev in tako ne bo zadovoljeval dejanskih potreb njegovih uporabnikov.” — Priloga A k znanstvenemu članku, ki sta ga leta 1998 izdala doktorska študenta Univerze v Stanfordu Sergey Brin in Lawrence Page. Članek z naslovom Anatomija obsežnega hipertekstovnega spletnega iskalnika opisuje temelje danes najbolj uporabljane spletne platforme: Google.
    [Show full text]
  • Google Ime Marathi Typing Software Free Download
    Google Ime Marathi Typing Software Free Download Google Ime Marathi Typing Software Free Download 1 / 2 Download Google Input Marathi Setup - best software for Windows. Marathi Indic Input: With ... Free. Google Gujarati Input is part of the Google Transliteration IME. ... Free. Point your cursor in a text area and start typing in Marathi or Hindi. Google input tools Marathi is a tool that will help you easily type in Marathi. ... How to install google Marathi input tools in window ... download Software from.. Oct 9, 2020 — google IME tool you were using to write Marathi in Microsoft Word has been removed from your ... google ime offline installer | google ime setup hindi marathi | google marathi ime offline ... English; मराठी; Software Name; 4 ... increase typing speed ... Free Download Total video converter with serial key. Google Unicode Marathi is the best way of typing the Marathi language on the ... 10.2.0.0 is available to all software users as a free download for Windows.. Try type “我想食叉燒包” with “ngoshuengsikchasiubao”. While typing, you will see a list of candidates matching the sounds of your input. Choose a candidate by ... google marathi typing software google marathi typing software, google marathi typing software for windows 10, google marathi typing software free download for windows 7 32 bit, google marathi typing software free download, google marathi typing software for windows 7, google marathi typing software free download for windows 7 64 bit, google marathi typing software free download for windows 7, google marathi typing software free download for windows 8, google marathi typing software offline, google english to marathi typing software, google indic marathi typing software free download, google input tools marathi typing software free download Tutorial on how to install and use Google IME to type in Hindi, Marathi, Bangla, Tamil, Telugu etc.
    [Show full text]
  • HS-CIT Standards
    2018 HS-CIT Standards HS-CIT Standards Learning Topics in HS-CIT 1 OPERATING SYSTEM ...................................................................................................................................... 2 2 INTERNET ...................................................................................................................................................... 4 3 WORD PROCESSING .................................................................................................................................... 12 4 SPREADSHEET .............................................................................................................................................. 14 5 PRESENTATION GRAPHICS ....................................................................................................................... 15 6 PERSONAL INFORMATION MANAGER ............................................................................................ 16 7 ENGLISH AND DEVANAGARI (HINDI) TYPING ............................................................................. 16 Page 1 of 16 HS-CIT Standards 1 Operating System Sr. Topic Category HS-CIT Standard International No. Standard 1. Windows Basic Learner should able to start, restart, shutdown, lock, sleep, iCCCS Operations hibernate and log off a computer or laptop. 2. Learner should able to use mouse techniques such as Click, Northstar Digital Right Click, Double Click and Drag & Drop Literacy Standards 3. Learner should able to plug in headphones correctly and use Northstar Digital
    [Show full text]
  • Google Tamil Transliteration Software
    Google tamil transliteration software Available input tools include transliteration, IME, and on-screen keyboards. Marathi, Nepali, Oriya, Punjabi, Russian, Sanskrit, Serbian, Sinhala, Tamil, Telugu, ​Installation · ​Configuration · ​Features · ​Windows XP. Try Google Input Tools online. Google Input Tools makes it easy to type in the language you choose, anywhere on the web. Learn more. To try it out, choose. Google Input Tools for Windows is an input editor that helps you type text in 22 different langua Type using 22 different languages with this tool Marathi, Nepali, Oriya, Punjabi, Russian, Sanskrit, Serbian, Sinhala, Tamil, Telugu, Tigrinya. Portable Indian languages transliteration software Azhagi+ ( +). Android version exists too (with voice recognition). Transliterate or type in Hindi, Tamil, Sanskrit, Telugu, Kannada, Malayalam, Gujarati, Install from Google Play Store. Google Transliteration IME is an input method editor which allows users Nepali, Oriya, Punjabi, Russian, Sanskrit, Serbian, Sinhalese, Tamil. Google Transliteration IME(Input Method Editor) is an input method editor which Gujarati, Hindi, Kannada, Marathi, Nepali, Punjabi, Tamil, Telugu and Urdu. Available input tools include transliteration, IME, and on-screen keyboards. Google Input Tamil Virtual Online Tamil Keyboard. Alternative software. Gujarati Indic Input. It gives users a convenient way of entering text in Indian Languages. FREE. Google Gujarati Input. Part of the. The Google Tamil Input tool is an easy to use transliteration and language support IME application. It is very efficient in "phonetic transliteration". Download Google Transliteration IME now to start typing in your language. the user needs to download a very small piece of software for any if the 14 Tamil,. Telugu,. Urdu. You can also use bookmarklet to directly type in.
    [Show full text]
  • Open Vocabulary Learning for Neural Chinese Pinyin IME
    Open Vocabulary Learning for Neural Chinese Pinyin IME Zhuosheng Zhang1;2, Yafang Huang1;2, Hai Zhao1;2;∗ 1Department of Computer Science and Engineering, Shanghai Jiao Tong University 2Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China fzhangzs, [email protected], [email protected] Abstract lation between two different languages, pinyin se- quences and Chinese character sequences (namely Pinyin-to-character (P2C) conversion is the Chinese sentence). Actually, such a translation in core component of pinyin-based Chinese in- put method engine (IME). However, the con- P2C procedure is even more straightforward and version is seriously compromised by the am- simple by considering that the target Chinese char- biguities of Chinese characters corresponding acter sequence keeps the same order as the source to pinyin as well as the predefined fixed vocab- pinyin sequence, which means that we can decode ularies. To alleviate such inconveniences, we the target sentence from left to right without any propose a neural P2C conversion model aug- reordering. mented by an online updated vocabulary with Meanwhile, there exists a well-known challenge a sampling mechanism to support open vocab- ulary learning during IME working. Our ex- in P2C procedure, too much ambiguity mapping periments show that the proposed method out- pinyin syllable to character. In fact, there are only performs commercial IMEs and state-of-the- about 500 pinyin syllables corresponding to ten art traditional models on standard corpus and thousands of Chinese characters, even though the true inputting history dataset in terms of multi- amount of the commonest characters is more than ple metrics and thus the online updated vocab- 6,000 (Jia and Zhao, 2014).
    [Show full text]