Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach

Pratik Jawanpuria1, Arjun Balgovind2∗, Anoop Kunchukuttan1, Bamdev Mishra1
1Microsoft, India 2IIT Madras, India

Abstract

We propose a novel geometric approach for learning bilingual mappings given monolingual embeddings and a bilingual dictionary. Our approach decouples the source-to-target language transformation into (a) language-specific rotations on the original embeddings to align them in a common, latent space, and (b) a language-independent similarity metric in this common space to better model the similarity between the embeddings. Overall, we pose the bilingual mapping problem as a classification problem on smooth Riemannian manifolds. Empirically, our approach outperforms previous approaches on the bilingual lexicon induction and cross-lingual word similarity tasks.

We next generalize our framework to represent multiple languages in a common latent [...]

[...] different languages are mapped close to each other in a common embedding space. Hence, they are useful for joint/transfer learning and sharing annotated data across languages in different NLP applications such as machine translation (Gu et al., 2018), building bilingual dictionaries (Mikolov et al., 2013b), mining parallel corpora (Conneau et al., 2018), text classification (Klementiev et al., 2012), sentiment analysis (Zhou et al., 2015), and dependency parsing (Ammar et al., 2016).

Mikolov et al. (2013b) empirically show that a linear transformation of embeddings from one language to another preserves the geometric arrangement of word embeddings. In a supervised setting, the transformation matrix, W, is learned given a small bilingual dictionary and their corresponding monolingual embeddings.
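
As an illustration of the supervised setting described above, the following is a minimal sketch of learning a linear map W from dictionary pairs by ordinary least squares. It assumes a plain squared-error objective in the spirit of Mikolov et al. (2013b) and is not the geometric approach proposed in this paper. The function names `learn_linear_map` and `translate`, the embedding dimension, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np

def learn_linear_map(X_src, Z_tgt):
    """Return W minimizing ||X_src @ W.T - Z_tgt||_F^2 (assumed least-squares objective).

    X_src: (n, d) source embeddings of the dictionary words.
    Z_tgt: (n, d) target embeddings of their translations.
    """
    W_T, *_ = np.linalg.lstsq(X_src, Z_tgt, rcond=None)
    return W_T.T

def translate(W, x, Z_vocab):
    """Index of the target embedding with highest cosine similarity to W @ x."""
    mapped = W @ x
    sims = (Z_vocab @ mapped) / (
        np.linalg.norm(Z_vocab, axis=1) * np.linalg.norm(mapped)
    )
    return int(np.argmax(sims))

# Toy, noise-free data (hypothetical): targets are an exact linear image of sources.
rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.normal(size=(n, d))        # source embeddings for n dictionary entries
W_true = rng.normal(size=(d, d))   # hidden map used only to generate the toy targets
Z = X @ W_true.T                   # target embeddings

W = learn_linear_map(X, Z)
nearest = translate(W, X[3], Z)    # nearest target word for source word 3
```

On real embeddings the fit is only approximate, and retrieval is done over the full target vocabulary rather than over the dictionary entries alone.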