Machine Translation and Automated Analysis of the Sumerian Language

Émilie Pagé-Perron†, Maria Sukhareva‡, Ilya Khait¶, Christian Chiarcos‡

† University of Toronto
‡ University of Frankfurt
¶ University of Leipzig
Abstract

This paper presents a newly funded international project for machine translation and automated analysis of ancient cuneiform¹ languages where NLP specialists and Assyriologists collaborate to create an information retrieval system for Sumerian.²

This research is conceived in response to the need to translate large numbers of administrative texts that are only available in transcription, in order to make them accessible to a wider audience. The methodology includes creation of a specialized NLP pipeline and also the use of linguistic linked open data to increase access to [...]

[...] focuses on the application of NLP methods to Sumerian, a Mesopotamian language spoken in the 3rd millennium B.C. Assyriology, the study of ancient Mesopotamia, has benefited from early developments in NLP in the form of projects which digitally compile large amounts of transcriptions and metadata, using basic rule- and dictionary-based methodologies.⁴ However, the orthographic, morphological and syntactic complexities of the Mesopotamian cuneiform languages have hindered further development of automated treatment of the texts. Additionally, digital projects do not necessarily use the same standards and encoding schemes across the board, and this, coupled with closed or partial access to some projects' data, limits larger scale investigation of machine-assisted text processing.