=================================================================
Language in India www.languageinindia.com ISSN 1930-2940 Vol. 19:5 May 2019
India’s Higher Education Authority UGC Approved List of Journals Serial Number 49042
=================================================================

ENGLISH TO TAMIL MACHINE TRANSLATION SYSTEM USING PARALLEL CORPUS

Prof. Rajendran Sankaravelayuthan
[email protected]
Amrita University, Coimbatore

Dr. G. Vasuki
AVVM Sri Pushpam College, Poondi
[email protected]

Coimbatore 2019

A FEW WORDS

This research material, entitled “ENGLISH TO TAMIL MACHINE TRANSLATION SYSTEM USING PARALLEL CORPUS”, has been lying in my lap since 2013. I had planned to edit it and publish it in book form after making the necessary modifications. But as I have taken up academic responsibilities at Amrita University, Coimbatore, after my retirement from Tamil University, I could not find the time to fulfil that mission. So I am presenting it here in raw form. Let it see the light. Kindly bear with me. I am helpless.

Statistical machine translation (SMT) is a machine translation paradigm in which translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. The statistical approach contrasts with rule-based approaches to machine translation as well as with example-based machine translation. An SMT system learns how to translate by analyzing existing human translations (the bilingual text corpora). In contrast to rule-based machine translation (RBMT), which is usually word-based, most modern SMT systems are phrase-based and assemble translations from overlapping phrases.
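To make the idea of "parameters derived from the analysis of bilingual text corpora" concrete, the following is a minimal sketch of IBM Model 1, a classic word-alignment model that is commonly used as a first step before phrase pairs are extracted. It is offered only as an illustration and is not the system described in this work. The three romanized English-Tamil sentence pairs, the function names train_ibm_model1 and best_translation, and the choice of 20 EM iterations are all assumptions made for the example; the NULL source word of the full model is omitted for brevity.

```python
from collections import defaultdict

# Toy romanized English-Tamil sentence pairs, invented for illustration only.
PAIRS = [
    ("that house".split(), "anta viitu".split()),
    ("that book".split(), "anta puttakam".split()),
    ("a book".split(), "oru puttakam".split()),
]


def train_ibm_model1(pairs, iterations=20):
    """Estimate t[(f, e)] = P(target word f | source word e) with EM.

    Simplified IBM Model 1: no NULL word, no smoothing.
    """
    tgt_vocab = {f for _, tgt in pairs for f in tgt}
    uniform = 1.0 / len(tgt_vocab)
    # Start from a uniform table over word pairs that co-occur in a sentence pair.
    t = {(f, e): uniform for src, tgt in pairs for e in src for f in tgt}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e being aligned at all
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in pairs:
            for f in tgt:
                z = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / z
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalise the expected counts into probabilities.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t


def best_translation(t, e):
    """Return the target word with the highest P(f | e) for source word e."""
    candidates = {f: p for (f, src), p in t.items() if src == e}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    table = train_ibm_model1(PAIRS)
    for word in ("that", "house", "book", "a"):
        print(word, "->", best_translation(table, word))
```

No dictionary is supplied: the word correspondences emerge only from which words co-occur across the sentence pairs. A phrase-based system would go on to extract multi-word phrase pairs from such alignments and combine them with a language model, which this sketch does not attempt.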