
Machine Translation of Noun Phrases from English to Igala using the Rule-Based Approach

Sani Felix Ayegba¹, Osuagwu O.E.², Njoku Dominic Okechukwu³
¹Department of Computer Science, Federal Polytechnic Idah, Kogi State, Nigeria
²Department of Computer Science, Imo State University, Owerri
³Department of Electrical/Electronics Engineering, Imo State Polytechnic, Umuag

West African Journal of Industrial & Academic Research, Vol. 11 No. 1, June 2014

Abstract

We live in a multilingual society where large volumes of documents are produced in different languages. Translation is the means by which information generated in one language can be accessed by someone in a different language. Igala is one of the languages spoken in Nigeria. The Igala are the ninth largest ethnic group in Nigeria and the language is spoken by about 2.5 million people. The main objective of this research is to model a language processor that can accept noun phrases in English as input and translate them into Igala. The two core technologies for building machine translation systems, corpus-based and rule-based, were carefully studied. Due to the structural differences between English and Igala noun phrases, coupled with the non-availability of a large parallel aligned corpus for English and Igala, the rule-based technology was adopted to develop the model. The model was implemented using the VB.net programming language as front end and Microsoft Access as back end. The application was tested on 120 randomly selected English noun phrases using the Bilingual Evaluation Understudy (BLEU) method for evaluating machine translation systems. An accuracy of 90.9% was obtained.

Key words: Translation, Igala language, Language processor, corpus based technology, rule based technology, Bilingual Evaluation Understudy.

Introduction

Language is the medium of communication. Human language is used purposively to communicate ideas, emotions, feelings and desires, to co-operate among social groups, to exhibit habits, etc., which can be translated along a variety of channels [2]. There are over 6,800 living languages in the world, which reflects the scope of linguistic and cultural diversity. Access to information written in another language is of great interest, and the means of sharing information across languages is translation; creating tools for translating from one language to another is therefore a crucial contribution to human development. Without translation, there can be no communication except among those who share a common language, and many voices will not be heard without this critical function.

Translation is critical for addressing information inequalities. A study conducted by Common Sense Advisory on behalf of Translators without Borders finds that translation is critical for the public health, political stability, and social wellbeing of African nations [11]. [16] showed that, due to differences in culture and the multilingual environment in India, inter-language translation was necessary for the transfer of information and sharing of ideas. The need for translation is also very glaring in the business community. It has been observed that language barriers between companies and their global customers are stifling economic growth. Forty-nine percent of executives say a language barrier has stood in the way of a major international business deal, and nearly two-thirds (64 percent) of those same executives said language barriers are making it difficult to gain a foothold in international markets. Whether inside or outside a company, global audiences prefer to read in their native languages [18]; doing so speeds efficiency, increases receptivity, and allows for easier processing of concepts. This clearly shows that language translation is a matter of absolute necessity in the globally united and yet linguistically and culturally separated world in which we live.

The work of translation was originally carried out by human translators. At a point the supply of translation services could no longer keep pace with the demand for translated content. Moreover, human translation is costly, time-consuming and inadequate for addressing the real-time needs of businesses to serve multilingual prospects, partners and customers. The inherent limitations of human translation made the search for an alternative means of translation paramount. The search led to what is known today as machine translation or computer-assisted translation. Machine Translation is the use of computers to automate some or all of the process of translating from one language to another [1]. This need has prompted research organizations and government agencies to develop tools for automatic translation of text in an attempt to achieve wider outreach and bridge the gap of language diversity [17].

Igala is the language of the ethnic group located at the eastern flank of the confluence of the rivers Niger and Benue. They are the ninth largest linguistic group in Nigeria [12]. Geo-politically, they are described as belonging to the middle belt or north-central region of Nigeria. They are bordered on the north by Benue and Nassarawa States, on the west by the River Niger, on the east by Enugu State and on the south by Anambra State [5]. Igala land is 120 kilometres wide and 160 kilometres long. It is located approximately between latitudes 6°30′ and 8° North and longitudes 6°30′ and 7°40′ East and covers an area of about 13,665 square kilometres. The population of the Igala people was estimated at two million in the late 1990s [5]. Historically, they are said to be linked to the Yoruba, the Jukuns, the Binis (Edo) and the northern Ibos. Owing to their central location, they have mutually interacted and lived with the Idomas, Bassa-Nkomo, Nupe, Igbirra and Hausa people. The Igala ethnic group is densely populated in their settlements around major towns such as Idah, Ankpa and Anyigba. They are also found in Edo, Delta, Anambra, Enugu, Nassarawa, Adamawa and Benue States. However, the bulk of them are indisputably found in Idah, Ankpa, Dekina, Omala, Olamaboro, Ofu, Igalamela/Odolu, Ibaji, Bassa (and even Lokoja and Ajaokuta) Local Government Areas of Kogi State [5].

The aim of this research is to develop a system for translating noun phrases from English to Igala. The specific objectives are to carry out a computational analysis of English-to-Igala noun phrase translation processes and to model a language processor that can accept a noun phrase in English as input and translate it into Igala.

Information and Communications Technology (ICT) has not made any significant inroad in empowering Africans towards development because 90 percent of existing content and applications are in the English language [3]. Igala is not left out of this. The impact of ICT among the Igala people has not reached a level where it can be said to be significant. This is due to the fact that existing content and applications are in a foreign language.

The outcome of the research will result in greater access to information in the Igala language. It will also give the Igala language a public profile in the information technology world, provide a platform for people to appreciate the beauty of their indigenous language, and help to develop the Igala language and elevate it to the level of the languages of developed nations. This will lead to the preservation of Igala culture and values and also serve as the springboard to the much-needed development of Igala society and, by extension, the Nigerian nation, the African continent and perhaps the entire globe.

Machine Translation Technologies

Machine Translation (MT) systems can be classified according to their core methodology. Under this classification, two main paradigms can be found: the rule-based approach and the corpus-based approach. Within the rule-based paradigm three approaches can [...] multilingual and linguistic expertise. Therefore, rule-based systems require large initial investment and maintenance for every language pair [7]. Within the corpus-based paradigm, three other approaches can be further distinguished: example-based, statistical-based and context-based. Under the corpus-based approach the knowledge is automatically extracted by analyzing translation examples from a parallel corpus built by human experts. The advantage is that, once the required techniques have been developed for a given language pair, machine translation systems should (theoretically) be quickly developed for new language pairs using provided training data.

Although the rule-based system requires a significant amount of linguistic knowledge, the knowledge acquired for one natural language processing system may be reused to build the knowledge required for a similar task in another system. [9] posited that the rule-based approach is better than its corpus-based counterpart in two main cases:
1: for less-resourced languages, for which large corpora, possibly parallel or bilingual, with representative structures and entities are neither available nor easily affordable, and
2: for morphologically rich languages, which even with the availability of corpora suffer from data sparseness.

Both the rule-based and corpus-based technologies have their strengths and weaknesses. Due to the inherent weaknesses of these technologies
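The rule-based paradigm outlined in this section can be illustrated with a minimal sketch: look up each word of an English noun phrase in a bilingual lexicon, then reorder the words with a transfer rule. Everything specific here is an assumption made for illustration only. The lexicon entries are placeholder tokens rather than real Igala words, the `TARGET_ORDER` rule is hypothetical and is not claimed to describe Igala grammar, and the function name `translate_noun_phrase` does not come from the paper, whose implementation used VB.net and Microsoft Access.

```python
# Illustrative sketch of rule-based transfer for noun phrases.
# All lexicon entries and the reordering rule are hypothetical placeholders,
# not real Igala data and not the authors' implementation.

# Hypothetical bilingual lexicon: English word -> (category, placeholder target token)
LEXICON = {
    "the": ("DET", "<det:the>"),
    "big": ("ADJ", "<adj:big>"),
    "two": ("NUM", "<num:two>"),
    "house": ("NOUN", "<noun:house>"),
}

# Hypothetical transfer rule: order of categories in the target noun phrase.
TARGET_ORDER = ["NOUN", "ADJ", "NUM", "DET"]


def translate_noun_phrase(phrase):
    """Translate a noun phrase by lexicon lookup followed by one reordering rule."""
    tagged = []
    for word in phrase.lower().split():
        if word not in LEXICON:
            # A rule-based system can only handle words its lexicon covers.
            raise KeyError(f"no lexicon entry for {word!r}")
        tagged.append(LEXICON[word])
    # sorted() is stable, so words of the same category keep their source order.
    reordered = sorted(tagged, key=lambda entry: TARGET_ORDER.index(entry[0]))
    return " ".join(token for _, token in reordered)


if __name__ == "__main__":
    print(translate_noun_phrase("the big house"))
```

The sketch also makes the cost noted above concrete: every new word needs a hand-written lexicon entry and every structural difference needs a hand-written rule, which is why such systems require linguistic expertise for each language pair.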