
Korean-to-Chinese Machine Translation using Chinese Character as Pivot Clue

Jeonghyeok Park1,2,3 and Hai Zhao1,2,3, ∗

1 Department of Computer Science and Engineering, Shanghai Jiao Tong University
2 Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University
3 MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University
[email protected], [email protected]

Abstract

Korean-Chinese is a low-resource language pair, but Korean and Chinese have a lot in common in terms of vocabulary. Sino-Korean words, which can be converted into corresponding Chinese characters, account for more than fifty percent of the entire Korean vocabulary. Motivated by this, we propose a simple linguistically motivated solution to improve the performance of a Korean-to-Chinese neural machine translation model by using their common vocabulary. We adopt Chinese characters as a translation pivot by converting Sino-Korean words in Korean sentences to Chinese characters and then train the machine translation model with the converted Korean sentences as source sentences. The experimental results on Korean-to-Chinese translation demonstrate that models with the proposed method improve translation quality by up to 1.5 BLEU points in comparison to the baseline models.

1 Introduction

Neural machine translation (NMT) using the sequence-to-sequence structure has achieved remarkable performance for most language pairs (Bahdanau et al., 2014; Cho et al., 2014; Sutskever et al., 2014; Luong and Manning, 2015). Many studies on NMT have tried to improve translation performance by changing the structure of the network model or by adding new strategies (Wu and Zhao, 2018; Zhang et al., 2018; Xiao et al., 2019). Meanwhile, there are few attempts to improve the performance of NMT models using the linguistic characteristics of particular language pairs (Sennrich and Haddow, 2016). On the other hand, most earlier statistical machine translation (SMT) systems attempted to improve translation performance by using linguistic features including part-of-speech (POS) tags (Ueffing and Ney, 2013), syntax (Zhang et al., 2007), semantics (Rafael and Marta, 2011), reordering information (Zang et al., 2015; Zhang et al., 2016) and so on.

In this work, we focus on machine translation between Korean and Chinese, which have few parallel corpora but share a well-known cultural heritage, the Sino-Korean words. Chinese loanwords used in Korean are called Sino-Korean words, and they can also be written in Chinese characters that are still used by modern Chinese people. Such a shared vocabulary makes the two languages closer despite their huge linguistic differences and provides the possibility for better machine translation.

Because of its long history of contact with China, Korea used Chinese characters as its writing system, and even after adopting Hangul (한글 in Korean) as the standard script, Chinese characters have kept a considerable influence on Korean vocabulary. Currently, the writing system adopted by modern Korean is Hangul, but Chinese characters continue to be used in Korean, and the Chinese characters used in Korean are called "Hanja".

∗ Corresponding author. This paper was partially supported by National Key Research and Development Program of China (No. 2017YFB0304100) and Key Projects of National Natural Science Foundation of China (U1836222 and 61733011).

Korean vocabulary can be categorized into native Korean words, Sino-Korean words, and loanwords from other languages. The Sino-Korean vocabulary refers to Korean words of Chinese origin that can be converted into corresponding Chinese characters; it accounts for about 57% of the Korean vocabulary. Table 1 shows some sentence pairs of Korean and Chinese with the converted Sino-Korean words. In Table 1, some Chinese words are commonly observed between the converted Korean sentence and the Chinese sentence.

Systems      Sentences
Korean       명령은 아래와 같이 반포되었다.
HH-Convert   命令은 아래와 같이 颁布되었다.
Chinese      命令颁布如下。
English      The command was promulgated as follows.
Korean       양국은 광범한 영역에서의 공동 이익을 확인했다.
HH-Convert   两国은 广范한 领域에서의 共同 利益을 确认했다.
Chinese      两国在广泛的领域确认了共同利益。
English      The two countries have confirmed common interests in a wide range of areas.

Table 1: HH-Convert is the Korean sentence converted by the Hangul-Hanja conversion of Hanjaro. The underline denotes a Sino-Korean word and its corresponding Chinese characters in the Korean sentence and the HH-Convert sentence, respectively.

In this paper, we present a novel yet straightforward method for better Korean-to-Chinese MT that exploits the connection of the Sino-Korean vocabulary. We convert all Sino-Korean words in Korean sentences into Chinese characters and take the converted Korean sentences as the updated source data for later MT model training. Our method is applied to two types of NMT models, the recurrent neural network (RNN) and the Transformer, and shows significant translation performance improvement.

2 Related Work

There have been studies of linguistic annotation, such as dependency labels (Wu et al., 2018; Li et al., 2018a; Li et al., 2018b), semantic role labels (Guan et al., 2019; Li et al., 2019) and so on. Sennrich and Haddow (2016) proved that various linguistic features can be valuable for NMT. In this work, we focus on the linguistic connection between Korean and Chinese to improve Korean-to-Chinese NMT.

There are several studies on Korean-Chinese machine translation. For example, Kim et al. (2002) proposed a verb-pattern-based Korean-to-Chinese MT system that uses pattern-based knowledge and consistently manages linguistic peculiarities between the language pair to improve MT performance. Li et al. (2009) improved the translation quality of Chinese-to-Korean SMT by using Chinese syntactic reordering for an adequate generation of Korean verbal phrases.

Since Chinese and Korean belong to entirely different language families in terms of typology and genealogy, many studies also tried to analyze the sentence structure and word alignment of the two languages and then proposed specific methods for their concern (Huang and Choi, 2000; Kim et al., 2002; Li et al., 2008). Lu et al. (2015) proposed a method of translating Korean words into Chinese using Chinese character knowledge.

There are several attempts to exploit the connection between the source language and the target language in machine translation. Kuang et al. (2018) proposed methods to somewhat shorten the distance between the source and target words in the NMT model, and thus strengthen their association, through a technique bridging source and target word embeddings. For other low-resource language pairs, using a pivot language to overcome the limitation of an insufficient parallel corpus has been a common choice (Habash and Hu, 2009; Zahabi et al., 2013; Ahmadnia et al., 2017). Chu et al. (2013) built a Chinese character mapping table for Japanese, Traditional Chinese, and Simplified Chinese and verified the effectiveness of shared Chinese characters for Chinese-Japanese MT. Zhao et al. (2013) used the Chinese character, a common form of both languages, as a translation bridge in a Vietnamese-Chinese SMT model, and improved the translation quality by converting Vietnamese into Chinese characters with a pre-specified dictionary. Partially motivated by this work, we turn to Korean in terms of NMT models by fully exploiting the shared Sino-Korean vocabulary between Korean and Chinese.
3 Sino-Korean Words and Chinese Characters

Korea belongs to the Chinese cultural sphere, which means that China has historically influenced the regions and countries of East Asia. Before the creation of Hangul (the Korean alphabet), all documents were written in Chinese characters, and Chinese characters continued to be used even after the creation of Hangul.

Today, the standard script in Korea is Hangul, and the use of Chinese characters in Korean sentences is rare, but Chinese characters have left a significant influence on Korean vocabulary. About 290,000 (57%) of the 510,000 words in the Standard Korean Language Dictionary published by the National Institute of Korean Language belong to Sino-Korean words, which were originally written in Chinese characters. Some Sino-Korean words do not currently have corresponding Chinese words, and their meanings and usage have changed in the process of introduction, but most of them do have corresponding Chinese words. In Korean, Sino-Korean words are mainly used as literary or technical vocabulary and often appear in abstract concepts and technical terms. The names of people and places in Korea are mostly composed of Chinese characters, and newspapers and professional books occasionally use both Hangul and Chinese characters to clarify the meaning. Table 2 shows some news headlines that contain Chinese characters from the Korean news.

[Two Korean news headlines mixing Hangul with Chinese characters such as 北, 北美, 南北 and 原色]

Table 2: News headlines with Chinese characters. The underline denotes Chinese characters.

Since Korean uses an alphabetic writing system and is a language that does not have tones like Chinese, many homophones were created in its vocabulary in the process of borrowing Chinese words into the language. Around 35% of the Sino-Korean words registered in the Standard Korean Language Dictionary are homophones. Thus converting Sino-Korean words into (usually different) Chinese characters has a similar effect as semantic disambiguation. For example, the Korean word uisa (의사 in Korean) has many homophones and can have several meanings. To clarify the meaning of the word uisa in a Korean context, these words are occasionally written in Chinese characters as follows: 医师 (doctor), 意思 (mind), 义士 (martyr), 议事 (proceedings).

In addition, there is a difference between the Chinese characters (Hanja) used in Korea and the Chinese characters used in China. Chinese characters can be divided into two categories: Traditional Chinese and Simplified Chinese. The Chinese characters used in China and Korea are Simplified Chinese and Traditional Chinese, respectively.

4 The Proposed Approach

The proposed approach for Korean-to-Chinese MT has two phases: Hangul-Hanja conversion and NMT model training. We first convert the Sino-Korean words of the Korean input sentences into Chinese characters, and then convert the Traditional Chinese characters of the converted Korean input sentences into Simplified Chinese characters, so that common units are shared between the source and target vocabularies. Then we train NMT models with the converted Korean sentences as source data and the original Chinese sentences as target data.

For Hangul-Hanja conversion, we use the open toolkit Hanjaro, provided by the Institute of Traditional Culture (https://hanjaro.juntong.or.kr). Hanjaro can accurately convert Sino-Korean words into Chinese characters and is based on the open toolkit UTagger (Shin and Ock, 2012) developed by the Korean Language Processing Laboratory of the University of Ulsan. More specifically, Hanjaro first obtains tagging information about the morphemes, parts of speech (POS) and homophones of a Korean sentence through UTagger, and then converts Sino-Korean words into corresponding Chinese characters by using this tagging information and a pre-built dictionary, as the sketch below illustrates. UTagger is a Korean morphological tagging model with a recall of 99.05% on morpheme analysis and an accuracy of 96.76% on POS and homophone tagging. Nguyen et al. (2019) significantly improved the performance of a Korean-Vietnamese NMT system by building a lexical semantic network for the special characteristics of Korean, which uses the knowledge base of UTagger, and by applying UTagger to Korean tokenization.
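To make the two conversion phases concrete, the toy sketch below mimics them with a few hard-coded dictionary entries. The SINO_KOREAN and TRAD2SIMP tables and the naive string replacement are illustrative stand-ins only: the actual pipeline relies on Hanjaro and UTagger, which use morpheme, POS and homophone tags to pick the correct Hanja for ambiguous words.

```python
# Toy sketch of the two-phase preprocessing: (1) Sino-Korean word -> Hanja,
# (2) Traditional -> Simplified characters. Hypothetical mini-dictionaries
# stand in for Hanjaro/UTagger and a full conversion table.

# Hypothetical Hangul -> (Traditional) Hanja entries for Sino-Korean words.
SINO_KOREAN = {
    "명령": "命令",   # myeongnyeong: command
    "반포": "頒布",   # banpo: promulgation
    "양국": "兩國",   # yangguk: the two countries
}

# Hypothetical Traditional -> Simplified character mapping.
TRAD2SIMP = {"頒": "颁", "兩": "两", "國": "国"}

def hangul_hanja_convert(sentence: str) -> str:
    """Replace Sino-Korean words with Hanja, then simplify the Hanja."""
    for hangul, hanja in SINO_KOREAN.items():
        sentence = sentence.replace(hangul, hanja)
    return "".join(TRAD2SIMP.get(ch, ch) for ch in sentence)

print(hangul_hanja_convert("명령은 아래와 같이 반포되었다."))
# -> 命令은 아래와 같이 颁布되었다.
```

Note that simple string replacement cannot disambiguate homophones such as uisa above; that is precisely why the real pipeline consults UTagger's tagging information before choosing a Hanja form.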
For MT modeling, we use two types of NMT models: RNN-based NMT and Transformer NMT models. We train the NMT models on parallel corpora processed through the Hangul-Hanja conversion above.

5 Experiments

There have been many studies on how to segment Korean and Chinese text (Zhao and Kit, 2008a; Zhao and Kit, 2008b; Zhao et al., 2013; Cai and Zhao, 2016; Cai et al., 2017). To find out which segmentation method yields the highest translation performance, we tried multiple segmentation strategies such as byte-pair encoding (Sennrich et al., 2016), jieba (https://pypi.org/project/jieba/), KoNLPy (http://konlpy.org) and so on. Eventually, we found that character-based segmentation for both languages gives the best performance. Therefore, both Korean and Chinese sentences are segmented into characters for our NMT models, as the sketch below illustrates.
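The following minimal sketch illustrates this character-level segmentation. The "▁" space marker is only one possible convention for keeping the original word spacing recoverable after translation; it is an assumption of this sketch, not a detail fixed by our method.

```python
# Character-level segmentation: every character becomes a token, and
# whitespace is kept as an explicit marker token ("▁", an illustrative choice).
def char_segment(sentence: str) -> list[str]:
    return ["▁" if ch == " " else ch for ch in sentence]

print(" ".join(char_segment("命令은 아래와 같이 颁布되었다.")))
# -> 命 令 은 ▁ 아 래 와 ▁ 같 이 ▁ 颁 布 되 었 다 .
```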
5.1 Parallel Corpus

We use two parallel corpora in our experiments. The first corpus is a Chinese-Korean parallel corpus of casual conversation provided by the Semantic Web Research Center (SWRC, http://semanticweb.kaist.ac.kr). The SWRC corpus contained some incomplete data, so we removed the erroneous data manually. The resulting parallel corpus consists of 55,294 pairs of parallel sentences; 2,000 and 2,000 pairs were extracted from the parallel corpus as validation data and test data, respectively.

The second corpus (Dong-A) was collected by us from the online Dong-A newspaper (http://www.donga.com/ for Korean and http://chinese.donga.com/ for Chinese). We collected articles in four domains, Economy (81,278 sentences), Society (71,363), Global (68,073) and Politics (61,208), to build the two corpora shown in Table 3.

Domains   Train     Validation   Test
Society   67,363    2,000        2,000
All       258,386   5,000        5,000

Table 3: Statistics for the parallel corpus extracted from the Dong-A newspaper (number of sentences).

Since the sentences in the Dong-A newspaper are relatively long, the maximum sequence length used to train the NMT models on it is set to 200. On the other hand, the maximum sequence length for the SWRC corpus is set to 50 because each sentence in the SWRC corpus is short.

5.2 NMT Models

The Torch-based toolkit OpenNMT (Klein et al., 2018) is used to build our NMT models, either RNN-based or Transformer.

As for the RNN-based models, we further consider two types, one with a unidirectional LSTM encoder (uni-RNN) and the other with a bidirectional LSTM encoder (bi-RNN). For both RNN-based models, we use a 2-layer LSTM with 500 hidden units on both the encoder and decoder and use the global attention mechanism described in (Luong et al., 2015). We use the stochastic gradient descent (SGD) optimizer with an initial learning rate of 1 and a decay rate of 0.5. The mini-batch size is set to 64, and the dropout rate is set to 0.3.

For our Transformer model, both the encoder and decoder are composed of a stack of 6 uniform layers, each built of two sublayers as described in (Vaswani et al., 2017). The dimensionality of all input and output layers is set to 512, and that of the feed-forward network (FFN) layers is set to 2048. We set the source and target tokens per batch to 4096. For optimization, we use the Adam optimizer (Kingma and Ba, 2014) with β1 = 0.9, β2 = 0.98 to tune the model parameters, and the learning rate follows a warm-up strategy with 8,000 warm-up steps, decreasing proportionally as training progresses (a sketch of this schedule is given at the end of this subsection).

All of the NMT models are trained for 100,000 steps, and we check the performance on the validation set after every 5,000 training steps. We save the models every 5,000 training steps and evaluate them using a traditional machine translation evaluation metric.
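The warm-up schedule can be written compactly following the formula of Vaswani et al. (2017): the learning rate grows linearly for the first `warmup` steps and then decays proportionally to the inverse square root of the step number. The sketch below plugs in our settings (model dimension 512, 8,000 warm-up steps); it is illustrative rather than the exact OpenNMT implementation.

```python
# Warm-up learning-rate schedule in the style of Vaswani et al. (2017).
# d_model = 512 and warmup = 8000 match the settings reported above.
def noam_lr(step: int, d_model: int = 512, warmup: int = 8000) -> float:
    step = max(step, 1)  # avoid 0 ** -0.5 at the very first step
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

for s in (100, 8000, 100000):
    print(s, f"{noam_lr(s):.6f}")  # rises until step 8000, then decays
```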
5.3 Results

We use the BLEU score (Papineni et al., 2002) as our evaluation metric. Tables 4 and 5 show the experimental results for the SWRC corpus and the Dong-A corpus, respectively.

Systems       w/o HH-Conv.   w/ HH-Conv.
uni-RNN       33.14          34.44
bi-RNN        35.31          36.66
Transformer   35.47          37.84

Table 4: Experimental results (BLEU on the test set) on the SWRC corpus. HH-Conv refers to the Hangul-Hanja conversion function.

Systems       Domains   w/o HH-c   w/ HH-c
uni-RNN       Society   36.25      37.58
uni-RNN       All       39.84      40.70
bi-RNN        Society   39.08      40.00
bi-RNN        All       41.76      42.81
Transformer   Society   39.34      40.55
Transformer   All       44.70      44.88

Table 5: Experimental results (BLEU) on the Dong-A corpus.

All NMT models trained with Korean sentences converted through Hangul-Hanja conversion as source sentences improve the translation performance on all test sets in comparison to the NMT models trained on the original sentence pairs. When the Hangul-Hanja conversion is applied, the absolute BLEU improvement is about 1.57 on average for the SWRC corpus and 0.93 on average for the Dong-A corpus.

Our proposed method improves the translation performance of NMT models by converting only Sino-Korean words into their corresponding Chinese characters in Korean sentences using Hanjaro, and by sharing the source vocabulary and the target vocabulary. In this work, we do not convert the entire Korean sentence into Chinese characters using a pre-specified dictionary and maximum matching mechanism as described in (Zhao et al., 2013). Unlike Chinese, which does not use inflectional morphemes, Korean is an agglutinative language that tends to have a high rate of affixes or morphemes per word. Since some Korean syllables do not have corresponding Chinese characters, converting all Korean syllables of a Korean sentence into Chinese characters is impossible. In fact, we built a bilingual dictionary for Korean and Chinese, used a maximum matching mechanism to convert all the affixes and inflectional morphemes of Korean sentences into Chinese characters, and trained an RNN-based NMT model, but the performance was even lower.

In our implementation, we estimate that the main reason for the improved performance is that converting Sino-Korean words into Chinese characters makes the distinction between homophones clearer. Many Korean words written in the alphabetic writing system are homophones, which can confuse the meaning or context. In particular, as mentioned in Section 3, 35% of Sino-Korean words are homophones. Therefore, applying Hangul-Hanja conversion to Korean sentences can clarify the distinction between homophones, which leads to performance improvement in Korean-to-Chinese MT.

6 Analysis

6.1 Analysis of Sino-Korean Word Conversion

In this subsection, we analyze the conversion from Sino-Korean words to Chinese characters. To estimate how many of the Chinese characters converted from Sino-Korean words by the Hangul-Hanja conversion function are included in the corresponding reference sentence, we propose the ratio of including the same Chinese character between the converted Korean sentence and the Chinese (reference) sentence (ROIC):

    ROIC = ( Σ_{w_i} f(w_i) ) / |w|        (1)

where |w| is the number of Chinese words in the converted Korean sentence, and f(w_i) is 1 if the Chinese word w_i of the converted Korean sentence is included in the corresponding Chinese sentence, and 0 otherwise. For example, in the second example of Table 1, five Chinese words, 两国 (two countries), 领域 (area), 共同 (common), 利益 (interests) and 确认 (confirm), are commonly observed between the converted Korean sentence and the reference sentence, with the exception of 广范 (broad), so the ROIC of this converted Korean sentence is 5/6 (83.33%). We perform the analysis of Sino-Korean word conversion in two separate ways: ROIC for Chinese words and ROIC for Chinese characters.
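The sketch below computes ROIC for a single sentence pair. Extracting the converted Chinese words from the mixed Hangul/Hanja source is simplified here to collecting maximal runs of CJK ideographs, which is an approximation of the word units produced by the conversion.

```python
# Sketch of the ROIC metric in Eq. (1): the fraction of Chinese words
# produced by Hangul-Hanja conversion that also appear in the reference
# Chinese sentence.
import re

def roic(converted_korean: str, reference_chinese: str) -> float:
    # Maximal runs of CJK ideographs in the converted source = Chinese words.
    converted_words = re.findall(r"[\u4e00-\u9fff]+", converted_korean)
    hits = sum(1 for w in converted_words if w in reference_chinese)
    return hits / len(converted_words) if converted_words else 0.0

src = "两国은 广范한 领域에서의 共同 利益을 确认했다."
ref = "两国在广泛的领域确认了共同利益。"
print(f"{roic(src, ref):.4f}")  # 5 of 6 words matched -> 0.8333
```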

[Figure 1: ROIC of each corpus. Word and Char denote the ROIC for Chinese words and the ROIC for Chinese characters, respectively.]

Fig. 1 presents the ROIC of each corpus. It can be observed that for each corpus, more than 40% of the converted Chinese words, and more than 65% of the converted Chinese characters, are included in the reference sentence. We can thus see that the source vocabulary and the target vocabulary share many words after converting Sino-Korean words into Chinese characters. Sharing the source and target vocabularies is especially useful for languages with the same alphabet, or for domains where professional terms are written in English (Zhang et al., 2018). Therefore, we set our NMT models to share the source and target vocabularies, which leads to performance improvement.

6.2 Analysis of Translation Performance according to Different Sentence Lengths

Following Bahdanau et al. (2014), we group sentences of similar lengths together and compute BLEU scores, which are presented in Fig. 2. We conduct this analysis on the Society corpus.

[Figure 2: BLEU scores for the translation of sentences with different lengths.]

The results show that our method leads to better translation performance for all sentence lengths. Since we set the maximum sentence length to 200 for the Society corpus, we can also see that the performance continues to improve as the length of the input sentence increases.

6.3 Analysis of Homophone Translation

In this subsection, we translate several sentences that contain two homophones and analyze how the Sino-Korean word conversion makes the distinction between homophones more apparent. We translated the sentences using the Transformer model trained on the Dong-A corpus. Table 6 presents the translation results of sentences with two homophones. We can see that our NMT model clearly distinguishes between the homophones in all examples, whereas the baseline model fails to distinguish or even translate them. For example, in the first example, the baseline model does not translate 유지* (community leaders). In the second and third examples, the baseline model translates the homophones into the same words without distinguishing them. In the last example, 의사** (wishes) is improperly translated into 意向 (intention). Therefore, as mentioned in Section 5.3, these results indicate that our method helps distinguish homophones in Korean-to-Chinese machine translation.

Systems          Sentences
Korean           이 지역에 사는 유지*들이 이 마을을 유지**하고 관리해 나가고 있다.
HH-Convert       이 地域에 사는 有志*들이 이 마을을 维持**하고 管理해 나가고 있다.
Chinese          在这个区域生活的有志之士*在维护**和管理这个小区。
English          The community leaders* living in this area are maintaining** and managing this community.
Trans w/o HH-c   居住在该地区的维持**和管理村庄。
Trans w/ HH-c    居住在该地区的有志*们维持**这个村子，并进行管理。
Korean           이성* 간의 교제는 이성**에 따라야 한다.
HH-Convert       异性* 间의 交际는 理性**에 따라야 한다.
Chinese          异性*之间交往应该保持理性**。
English          A romantic relationship between the opposite sex* should be rational**.
Trans w/o HH-c   理性**间的交往应遵从理性**。
Trans w/ HH-c    异性*之间的交往应该根据理性**进行。
Korean           그는 천연자원*을 탐사하는 임무에 자원**했다.
HH-Convert       그는 天然资源*을 探查하는 任务에 自愿**했다.
Chinese          他自愿**参加勘探自然资源*的任务。
English          He volunteered** for the task of exploring natural resources*.
Trans w/o HH-c   他为探测天然资源*的任务提供了资源**。
Trans w/ HH-c    他自愿**担任探测天然资源*的任务。
Korean           의사*의 꿈은 포기했지만, 가족들은 그의 의사**를 존중해주었다.
HH-Convert       医师*의 꿈은 抛弃했지만, 家族들은 그의 意思**를 尊重해주었다.
Chinese          虽然放弃了医生*的梦想，但家人也尊重他的意愿**。
English          Although he gave up on his dream of becoming a doctor*, his family respected his wishes**.
Trans w/o HH-c   虽然医生*的梦想放弃了，但是家人却尊重了他的意向。
Trans w/ HH-c    虽然放弃了医生*的梦想，但家人却尊重了他的意愿**。

Table 6: Translation results of sentences with two homophones. HH-Convert is the Korean sentence converted by the Hangul-Hanja conversion of Hanjaro. Trans w/o HH-c and Trans w/ HH-c are the translation results of the Transformer baseline model and the Transformer using our method, respectively. The underline denotes a homophone, and the number of stars (*) distinguishes the meanings of the homophone in each example. In the Chinese and English sentences and the translation results, the stars mark words that are equivalent to the homophones in meaning.

7 Conclusion

This paper presents a simple novel method that exploits the shared vocabulary of a low-resource language pair for better machine translation. In detail, we convert Sino-Korean words in Korean sentences into Chinese characters and then train the machine translation model with the converted Korean sentences as source sentences. Our proposed improvement has been verified to be effective over both RNN-based and the latest Transformer NMT models. Besides, we regard this as the first attempt that takes a linguistically motivated solution for low-resource translation using NMT models. Although the proposed method seems suitable only for the language pair of Korean and Chinese, it has enormous potential to work for any language pair that shares a considerable vocabulary from a shared history.

References

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser and Illia Polosukhin. 2017. Attention Is All You Need. arXiv preprint arXiv:1706.03762.

Benyamin Ahmadnia, Javier Serrano and Gholamreza Haffari. 2017. Persian-Spanish Low-Resource Statistical Machine Translation Through English as Pivot Language. Proceedings of RANLP 2017, pages 24–30.

Changhyun Kim, Young Kil Kim, Munpyo Hong, Young Ae Seo, Sung Il Yang and Sung-Kwon Choi. 2002. Verb Pattern Based Korean-Chinese Machine Translation System. Proceedings of the First SIGHAN Workshop on Chinese Language Processing, pages 157–165.

Chenhui Chu, Toshiaki Nakazawa, Daisuke Kawahara and Sadao Kurohashi. 2013. Chinese-Japanese Machine Translation Exploiting Chinese Characters. ACM Transactions on Asian Language Information Processing, volume 12, pages 1–25.

Chaoyu Guan, Yuhao Cheng and Hai Zhao. 2019. Semantic Role Labeling with Associated Memory Network. Proceedings of NAACL 2019, pages 3361–3371.

Deng Cai and Hai Zhao. 2016. Neural Word Segmentation Learning for Chinese. Proceedings of ACL 2016, pages 409–420.

Deng Cai, Hai Zhao, Zhisong Zhang, Yuan Xin, Yongjian Wu and Feiyue Huang. 2017. Fast and Accurate Neural Word Segmentation for Chinese. Proceedings of ACL 2017, pages 608–615.

Hai Zhao and Chunyu Kit. 2008a. Exploiting Unlabeled Text with Different Unsupervised Segmentation Criteria for Chinese Word Segmentation. Proceedings of CICLing 2008, volume 33, pages 93–104.

Hai Zhao and Chunyu Kit. 2008b. An Empirical Comparison of Goodness Measures for Unsupervised Chinese Word Segmentation with a Unified Framework. Proceedings of IJCNLP 2008, volume 1, pages 9–16.

Hai Zhao, Masao Utiyama, Eiichro Sumita and Bao-Liang Lu. 2013. An Empirical Study on Word Segmentation for Chinese Machine Translation. Proceedings of CICLing 2013, pages 248–263.

Hai Zhao, Tianjiao Yin and Jingyi Zhang. 2013. Vietnamese to Chinese Machine Translation via Chinese Character as Pivot. Proceedings of PACLIC 27, pages 250–259.

Ilya Sutskever, Oriol Vinyals and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. Proceedings of NIPS 2014, pages 3104–3112.
Jingyi Zhang, Masao Utiyama, Eiichro Sumita, Hai Zhao, Graham Neubig and Satoshi Nakamura. 2016. Learning Local Word Reorderings for Hierarchical Phrase-based Statistical Machine Translation. Machine Translation, volume 30, pages 1–18.

Dongdong Zhang, Mu Li, Chi-Ho Li and Ming Zhou. 2007. Phrase Reordering Model Integrating Syntactic Knowledge for SMT. Proceedings of EMNLP-CoNLL 2007.

Jinji Li, Dong-Il Kim and Jong-Hyeok Lee. 2008. Annotation Guidelines for Chinese-Korean Word Alignment. Proceedings of LREC 2008.

Jinji Li, Jungi Kim, Dong-Il Kim and Jong-Hyeok Lee. 2009. Chinese Syntactic Reordering for Adequate Generation of Korean Verbal Phrases in Chinese-to-Korean SMT. Proceedings of the Fourth Workshop on Statistical Machine Translation, pages 190–196.

Dong-il Kim, Zheng Cui, Jinji Li and Jong-Hyeok Lee. 2002. A Knowledge Based Approach to Identification of Serial Verb Construction in Chinese-to-Korean Machine Translation System. Proceedings of the First SIGHAN Workshop on Chinese Language Processing, volume 18, pages 1–7.

Jin-Xia Huang and Key-Sun Choi. 2000. Using Bilingual Semantic Information in Chinese-Korean Word Alignment. Proceedings of PACLIC 14, pages 121–130.

Joon-Choul Shin and Cheol-Young Ock. 2012. A Korean Morphological Analyzer Using a Pre-analyzed Partial Word-phrase Dictionary. KIISE: Software and Applications, volume 39, pages 415–424. [in Korean]

Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.

Dzmitry Bahdanau, Kyunghyun Cho and Yoshua Bengio. 2014. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv preprint arXiv:1409.0473.

Fengshun Xiao, Jiangtong Li, Hai Zhao and Kehai Chen. 2019. Lattice-based Transformer Encoder for Neural Machine Translation. Proceedings of ACL 2019.

Kishore Papineni, Salim Roukos, Todd Ward and Wei-Jing Zhu. 2002. BLEU: A Method for Automatic Evaluation of Machine Translation. Proceedings of ACL 2002, pages 311–318.

Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart and Alexander M. Rush. 2018. OpenNMT: Neural Machine Translation Toolkit. arXiv preprint arXiv:1805.11462.

Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk and Yoshua Bengio. 2014. Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation. Proceedings of EMNLP 2014, pages 1724–1734.

Longtu Zhang and Mamoru Komachi. 2018. Neural Machine Translation of Logographic Language Using Sub-character Level Information. Proceedings of the Third Conference on Machine Translation, pages 17–25.

Shuo Zang, Hai Zhao, Chunyang Wu and Rui Wang. 2015. A Novel Word Reordering Method for Statistical Machine Translation. Proceedings of ICNC-FSKD 2015, pages 843–848.
Minh-Thang Luong and Christopher D. Manning. 2015. Stanford Neural Machine Translation Systems for Spoken Language Domains. Proceedings of the International Workshop on Spoken Language Translation.

Minh-Thang Luong, Hieu Pham and Christopher D. Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. Proceedings of EMNLP 2015, pages 1412–1421.

Nicola Ueffing and Hermann Ney. 2013. Using POS Information for SMT into Morphologically Rich Languages. Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics, pages 347–354.

Nizar Habash and Jun Hu. 2009. Improving Arabic-Chinese Statistical Machine Translation Using English as Pivot Language. Proceedings of the Fourth Workshop on Statistical Machine Translation, pages 173–181.

Quang-Phuoc Nguyen, Anh-Dung Vo, Joon-Choul Shin, Phuoc Tran and Cheol-Young Ock. 2019. Korean-Vietnamese Neural Machine Translation System with Korean Morphological Analysis and Word Sense Disambiguation. IEEE Access, volume 7, pages 32602–32614.

Rafael E. Banchs and Marta Ruiz Costa-jussà. 2011. A Semantic Feature for Statistical Machine Translation. Proceedings of the Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation, pages 126–134.

Rico Sennrich and Barry Haddow. 2016. Linguistic Input Features Improve Neural Machine Translation. Proceedings of the First Conference on Machine Translation, volume 1, pages 83–91.

Rico Sennrich, Barry Haddow and Alexandra Birch. 2016. Neural Machine Translation of Rare Words with Subword Units. Proceedings of ACL 2016.

Samira Tofighi Zahabi, Somayeh Bakhshaei and Shahram Khadivi. 2013. Using Context Vectors in Improving a Machine Translation System with Bridge Language. Proceedings of ACL 2013, volume 2, pages 318–322.

Yingting Wu and Hai Zhao. 2018. Finding Better Subword Segmentation for Neural Machine Translation. Proceedings of the Seventeenth China National Conference on Computational Linguistics, volume 11221, pages 53–64.

Yingting Wu, Hai Zhao and Jia-Jun Tong. 2018. Multilingual Universal Dependency Parsing from Raw Text with Low Resource Language Enhancement. Proceedings of CoNLL 2018, pages 74–80.

Yuanmei Lu, Toshiaki Nakazawa and Sadao Kurohashi. 2015. Korean-to-Chinese Word Translation Using Chinese Character Knowledge. Proceedings of MT Summit XV, pages 256–269.

Zhisong Zhang, Rui Wang, Masao Utiyama, Eiichiro Sumita and Hai Zhao. 2018. Exploring Recombination for Efficient Decoding of Neural Machine Translation. Proceedings of EMNLP 2018, pages 4785–4790.

Zuchao Li, Shexia He, Zhuosheng Zhang and Hai Zhao. 2018a. Joint Learning for Universal Dependency Parsing. Proceedings of CoNLL 2018, pages 65–73.

Zuchao Li, Jiaxun Cai, Shexia He and Hai Zhao. 2018b. Seq2seq Dependency Parsing. Proceedings of COLING 2018, pages 3203–3214.

Zuchao Li, Shexia He, Hai Zhao, Yiqing Zhang, Zhuosheng Zhang, Xi Zhou and Xiang Zhou. 2019. Dependency or Span, End-to-End Uniform Semantic Role Labeling. Proceedings of AAAI 2019.
Shaohui Kuang, Junhui Li, António Branco, Weihua Luo and Deyi Xiong. 2018. Attention Focusing for Neural Machine Translation by Bridging Source and Target Embeddings. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, volume 1, pages 1767–1776.