Learning to Represent Bilingual Dictionaries Muhao Chen1∗, Yingtao Tian2∗, Haochen Chen2, Kai-Wei Chang1, Steven Skiena2 & Carlo Zaniolo1 1University of California, Los Angeles, CA, USA 2The State University of New York, Stony Brook, NY, USA
[email protected]; fkwchang,
[email protected]; fyittian, haocchen,
[email protected] Abstract mantic transfers, these approaches critically sup- port many cross-lingual NLP tasks including neu- Bilingual word embeddings have been widely ral machine translations (NMT) (Devlin et al., used to capture the correspondence of lexi- 2014), bilingual document classification (Zhou cal semantics in different human languages. However, the cross-lingual correspondence et al., 2016), knowledge alignment (Chen et al., between sentences and words is less studied, 2018b) and entity linking (Upadhyay et al., 2018). despite that this correspondence can signifi- While many existing approaches have been pro- cantly benefit many applications such as cross- posed to associate lexical semantics between lan- lingual semantic search and textual inference. guages (Chandar et al., 2014; Gouws et al., 2015; To bridge this gap, we propose a neural em- Luong et al., 2015a), modeling the correspon- bedding model that leverages bilingual dictio- dence between lexical and sentential semantics naries1. The proposed model is trained to map the lexical definitions to the cross-lingual tar- across different languages is still an unresolved get words, for which we explore with differ- challenge. We argue that learning to represent ent sentence encoding techniques. To enhance such cross-lingual and multi-granular correspon- the learning process on limited resources, our dence is well desired and natural for multiple rea- model adopts several critical learning strate- sons.