An Etymological Approach to Cross-Language Orthographic Similarity. Application on Romanian

Alina Maria Ciobanu, Liviu P. Dinu
Faculty of Mathematics and Computer Science, University of Bucharest
Center for Computational Linguistics, University of Bucharest
Abstract

In this paper we propose a computational method for determining the orthographic similarity between Romanian and related languages. We account for etymons and cognates and we investigate not only the number of related words, but also their forms, quantifying orthographic similarities. The method we propose is adaptable to any language, as long as resources are available.

1 Introduction

Language relatedness and language change across space and time are two of the main questions of historical and comparative linguistics (Rama and Borin, 2014). Many comparative methods have been used to establish relationships between languages, to determine language families and to reconstruct their proto-languages (Durie and Ross, 1996).

spondences are the most popular approaches employed for establishing relationships between languages. Barbançon et al. (2013) emphasize the variety of computational methods used in this field, and state that the differences in datasets and approaches cause difficulties in the evaluation of the results regarding the reconstruction of the phylogenetic tree of languages. Linguistic phylogeny reconstruction proves especially useful in historical and comparative linguistics, as it enables the analysis of language evolution. Ringe et al. (2002) propose a computational method for evolutionary tree reconstruction based on a “perfect phylogeny” algorithm; using a Bayesian phylogeographic approach, Alekseyenko et al. (2012), continuing the work of Atkinson et al. (2005), model the expansion of the Indo-European language family and find support for the hypothesis which places its homeland in Anatolia; Atkinson and Gray (2006) analyze language divergence dates and argue for