Proceedings of the Language Technologies for All (LT4All), pages 215–218, Paris, UNESCO Headquarters, 5-6 December, 2019. © 2019 European Language Resources Association (ELRA), licenced under CC-BY-NC

The IRCAM Realizations for the Amazigh Preservation and Revitalization in

Fadoua ATAA ALLAH, Aicha BOUHJAR
CEISIC, DCOM
The Royal Institute of Amazigh Culture
Allal El Fassi Avenue, Madinat Al Irfane, Rabat, Morocco
{ataaallah, bouhjar}@ircam.ma

Abstract

The computerization of languages is a strategic issue, closely linked to the rise of information and communication technologies. The globalization of exchanges and the dematerialization of information brought about by the digital revolution have changed the status of several of the world's languages and have influenced their cultural, industrial and economic stakes. Aware of this, the Royal Institute of Amazigh Culture has put in place a vision for the digitalization of Amazigh. This vision consists in computerizing the language, from simple display on screens to the development of specialized tools. This paper presents the Institute's achievements that enable Amazigh to live in the "information society".

Keywords: Amazigh, Less Resourced Language, Revitalization

ⵜⴰⴳⴹⵡⵉⵜ

ⵜⴰⵙⵏⵎⴰⵍⴰⵢⵜ ⵏ ⵜⵓⵜⵍⴰⵢⵜ ⵜⴳⴰ ⵢⴰⵏ ⵓⵙⵇⵙⵉ ⴰⵙⵜⵕⴰⵜⵉⵊⵉ ⵉⵟⵟⴼⵏ ⵙ ⵡⴰⵍⴰⵢ ⵏ ⵜⵉⵜⵉⴽⵏⵓⵍⵓⵊⵉⵜⵉⵏ ⵏ ⵓⵙⵏⵖⵎⵙ ⴷ ⵓⵎⵢⴰⵡⴰⴹ. ⵜⴰⵙⵎⴰⴹⴰⵍⵜ ⵏ ⵉⵎⵙⴽⴰⵍⵏ ⴷ ⴳⴰⵔ ⴰⵎⴰⵜⵜⵉⵡ ⵏ ⵓⵙⵏⵖⵎⵉⵙ ⵉⵥⵍⵉⵏ ⵙ ⵜⵏⴽⵔⴰ ⵜⴰⵎⴰⵟⵟⵓⵏⵜ, ⵙⴰⵜⵜⵉⵏ ⴰⴷⴷⴰⴷ ⵏ ⵡⴰⵟⵟⴰⵚ ⵏ ⵜⵓⵜⵍⴰⵢⵉⵏ ⴳ ⵓⵎⴰⴹⴰⵍ, ⵢⵉⵍⵉ ⴷⴰⵔⵙⵏ ⵢⵉⴹⵉⵚ ⵅⴼ ⵉⵎⵙⴰⵔⴰⵙⵉⵏ ⵉⴷⵍⵙⴰⵏⵏ ⵉⵎⴳⵓⵔⴰⵏⵏ ⴷ ⵉⴷⴰⵎⵙⴰⵏⵏ. ⵉⴷⴷⵖ ⵉⴼⴰⴼⴰ ⵅⴼ ⵓⵢⴰⴷ, ⵉⵙⴱⴷⴷⴰ ⵓⵙⵉⵏⴰⴳ ⴰⴳⵍⴷⴰⵏ ⵏ ⵜⵓⵙⵙⵏⴰ ⵜⴰⵎⴰⵣⵉⵖⵜ ⵢⴰⵜ ⵜⴰⵏⵏⴰⵢⵜ ⵜⴰⵙⵏⵎⴰⵍⴰⵢⵜ ⵏ ⵜⵎⴰⵣⵉⵖⵜ. ⵜⴰⵏⵏⴰⵢⵜ ⴰⴷ ⵜⵥⵍⵉ ⵙ ⵓⵏⵙⵎⴰⵍⴰ ⵏ ⵜⵓⵜⵍⴰⵢⵜ : ⵣⴳ ⵓⵙⵎⴰⵍ ⵏ ⵓⵎⵉⵥⴰⵕ ⴰⵔ ⴰⵙⴱⵓⵖⵍⵓ ⵏ ⵉⵎⴰⵙⵙⵏ ⵉⵙⵜⵉⵏ. ⴰⵔⵔⴰ ⴰⴷ ⵉⵙⵎⵏⵉⴷ ⵉⵙⵓⴼⵖⵏ ⵏ ⵓⵙⵉⵏⴰⴳ ⵃⵎⴰ ⴰⴷ ⵜⵥⴹⴰⵕ ⵜⵓⵜⵍⴰⵢⵜ ⵜⴰⵎⴰⵣⵉⵖⵜ ⴰⴷ ⵜⵜⴷⵔ ⴳ ⵡⴰⵎⵓⵏ ⵏ ⵓⵙⵏⵖⵎⵙ.

1. Introduction

Throughout human history, minority languages have been replaced by politically, economically, or socio-culturally dominant ones. This is due to many factors that, every day, endanger certain languages and threaten their transmission. These factors can be grouped into three main causes:

- Language extinction caused by the decimation of its community.
- Eradication of minority languages by certain linguistic policies.
- Renunciation of transmitting a language to future generations, so that they are not disadvantaged by being socially excluded.

Some Moroccan languages have also suffered from these factors. According to the UNESCO project Atlas of the World's Languages in Danger, seven Moroccan languages are in danger, two of which have completely disappeared: Judeo-Berber and the Tamazight of Aït Rouadi¹. Unfortunately, current technological changes may, if no remedy is found, dramatically increase this linguistic impoverishment. In this context, the Royal Institute of Amazigh Culture (RIAC; IRCAM in French) is committed to the preservation and revitalization of the Amazigh language.

This paper aims to present IRCAM's achievements. The remainder of the paper is organized as follows: Section 2 introduces the Amazigh sociolinguistic context. Section 3 presents the Royal Institute of Amazigh Culture. Section 4 gives a brief overview of IRCAM's realizations, especially those related to language planning and language technology. Finally, Section 5 draws conclusions and outlines some future perspectives.

2. Sociolinguistic context

The Amazigh language (known as the Berber language in the literature) stretches, in a geographically discontinuous manner, over a vast area of North Africa, from the Canary Islands to the Siwa Oasis in Egypt in the north and from the Mediterranean coast to Niger, Mali and Burkina Faso in the south.

Figure 1: Amazigh language distribution over North Africa²
¹ http://www.unesco.org/languages-atlas/fr/atlasmap.html
² https://www.languagesoftheworld.info/language-contact-2/many--varieties-traces-roman-heritage-moroccan-arabic.html

Historically, the Amazigh language is autochthonous and is a member of the Hamito-Semitic or "Afro-Asiatic" family. On the linguistic side, the language is characterized by the proliferation of dialects due to historical, geographical and sociolinguistic factors. This language has its own script, called tifinagh, but it was neglected except in the Saharan region, where it was transmitted, mainly by women, without discontinuity, from antiquity until now. In the other areas, Amazigh was exclusively spoken and reserved for familial and informal domains. The old tifinagh script is found engraved in stones and tombs in some historical sites dating back 40 centuries. Its written form has continued to change, from the traditional Tuareg writing to the neo-tifinagh at the end of the sixties.

Figure 2: Tinzouline Inscriptions (Zagora, Morocco)

In Morocco, one may distinguish three major dialects on the basis of their geographical situation: Tamazight of the North (known as Tarifit or the Rifian variety by linguists); Tamazight of Central Morocco and the South-East; and Tachelhiyt in the South-West and the High Atlas. According to the latest demolinguistic data (2004), the Amazigh language is spoken by some 30% of the Moroccan population (around 10 million inhabitants).

Politically, the status of the Amazigh language varies depending on the country where it is spoken. In Morocco, the status of Amazigh has reached an advanced level since 2011: its officialization, alongside Arabic, in the new Constitution. But the outline of the preservation and promotion of the Amazigh language has been set out since 2001 in the text creating and organizing the Royal Institute of Amazigh Culture (Dahir or Royal Decree of 17th October 2001³).

3. The Royal Institute of Amazigh Culture

This institution came as an answer to a civil society request, made since the very beginning of the 1960s (after independence), for the recognition of the Amazigh language and culture by the State and its institutionalization in education, the media, culture and public administration. IRCAM is under royal tutelage and is dedicated to the promotion of the Amazigh language and culture.

It has the following missions:

- Advisory mission to the Royal Cabinet on measures meant to promote Amazigh.
- Partnership mission with the concerned institutions, in particular the Ministry of National Education, Ministry of Information and Communication, Ministry of Culture, Ministry of Justice, Ministry of the Interior and Ministry of Public Service.
- Academic missions: collection and transcription of various Amazigh cultural expressions with an eye to safeguarding, protecting and disseminating these expressions; studies and research on Amazigh culture; promotion of artistic creation; codifying the Amazigh graphic system for teaching ends, production of didactic tools, elaboration of general and specialized lexicons, elaboration of pedagogical plans of action; cooperation with universities in organizing research and Amazigh language and culture development centers, training trainers; development of methods meant to strengthen and encourage the place of Amazigh in communication and information spaces; cooperation with cultural and scientific institutions at the national and international levels.

During the last two decades, changes have occurred in the sociolinguistic field that are specifically related to the Amazigh language. What are the observed changes? How was the Amazigh language planning process managed in Morocco? What were the priorities, and what principles guided the decision making? These questions, among others, had to be answered by IRCAM in order to contribute to the new language policy of the country.

Indeed, motive 6 of the founding text of IRCAM explains that it has "to deepen the linguistic policy defined by the National Charter for Education and Training (CNEF), which stipulates the introduction of Amazigh in the educational system". In the National Charter, Amazigh was considered as an "opening", a language facilitating the learning of the Arabic language (COSEF, 2000: lever 9, paragraphs 115 and 116). With the creation of IRCAM, the ultimate goal is to introduce the Amazigh language not as a facilitator but as a language taught and a "national heritage" of all Moroccan pupils and students. The "codification of the Amazigh script will facilitate its teaching, learning and diffusion, guarantee equal opportunities for all the children of our nation in acquiring knowledge, and consolidate national unity" (Dahir, 2001). Since then, Amazigh has become, alongside Arabic, an official language included in the new Constitution.

As can be observed, within a decade, the status of Amazigh has gradually changed: from "opening" (CNEF, 1999) to official language (Constitution, 2011: Art. 5⁴) through a process of patrimonialization and language planning (Dahir, 2001).

IRCAM had to translate the general orientations stipulated in the Dahir into concrete action to enable the introduction of the Amazigh language and culture in the public sphere, but primarily in the educational and training system. Therefore, the questions of which Amazigh to teach and with which script arose as an urgent and pressing priority. In other words, IRCAM had to deal with the "corpus" of the Amazigh language in Morocco.

³ https://www.ircam.ma/?q=fr/node/4668
⁴ http://www.sgg.gov.ma/Portals/0/constitution/constitution_2011_Fr.pdf

4. IRCAM Realizations

The Amazigh language has a rich oral corpus of stories, songs and histories that was threatened. To preserve this heritage, it is important to revitalize the language (Boukous, 2012) and ensure its transmission. To this end, IRCAM piloted and drew up a strategy that has been translated into actions through steps involving the progressive planning of Amazigh, its insertion into the educational system and its integration into the digital world. The following points present some of these actions.

4.1 Informatization Roadmap

To ensure the integration of the Amazigh language in the information technology sphere and to guarantee its sustainability, IRCAM has elaborated a roadmap. This roadmap is organized in basic processes, designed to take place in three phases: short, medium and long term.

These processes are represented by a chain starting from elementary processing, passing through the constitution of language resources, and going towards generic applications. This chain ranges from adapting and improving tools based on new technologies to the development of applications (Ataa Allah and Boulaknadel, 2014a).

The main building blocks of the Amazigh informatization roadmap introduced below are graphic encoding, language tools and resources, and language learning materials.

4.2 Graphic Encoding

Before the creation of IRCAM, the Amazigh language was written, in Morocco, in the Latin alphabet or in Arabic script. In order to facilitate the decision-making process regarding the script choice, IRCAM produced a technical report analyzing the two scripts in use in addition to the Amazigh ancestral writing system (tifinagh script).

After the adoption of tifinagh as the official script for writing the Amazigh language in Morocco, on February 10th, 2003, the Unicode homologation of this script became a necessity (Andries, 2008). This encoding enabled the Amazigh language to have a prominent position in the digital world at both the national and international levels. Besides the Unicode standardization of tifinagh, Amazigh was also integrated into the international keyboard standard ISO/IEC 9995, to facilitate its keyboarding.

To ensure the conversion of all Amazigh writing into a single standard form, namely Unicode-based tifinagh, a transliterator was developed (Ataa Allah and Boulaknadel, 2011). Moreover, to help visually impaired persons learn the Amazigh language, IRCAM proposed a Braille-based tifinagh writing system and built a Tifinagh-Braille converter (Yakoubi et al., 2016).

4.3 Language Tools and Resources

Since the colonial period, many studies have been undertaken. They have contributed to the collection of the Amazigh lexicon and oral tradition, and have focused on linguistic features. However, most of these studies covered local dialects only.

The inclusion of the Amazigh language in public life, in Morocco, has resulted in the extension of its areas of use, which has generated the need for terminology and a reference grammar.

Thus, IRCAM has produced sectorial lexicons, including grammatical, media and administrative terminologies (Ameur et al., 2009ab; Ameur et al., 2013; Ameur et al., 2015), in addition to a general dictionary of the Amazigh language (Ameur et al., 2017). Moreover, it has published standard morphosyntactic reference books in which the spelling rules were set (Boukhris et al., 2008; Laabdelaoui et al., 2012).

In parallel, IRCAM has undertaken projects to build terminology and lexical databases. The terminology database contains terminological entries related to grammatical, media and administrative themes⁵. This database was also used in the production of the 'LEXAM' mobile application⁶ (Frain et al., 2014). The lexical database, in turn, includes the usual lexicon belonging to the Amazigh variants⁷ (Ataa Allah et al., 2019). This database is available through online open access, to meet the needs of actors working in the fields of linguistics, teaching, translation and communication.

Furthermore, it was also a question of endowing the Amazigh language with grammar tools. The first tool developed is the morphosyntactic tagger (Ataa Allah and Jaa, 2009), which helps linguists annotate the words of the Amazigh corpus (Boulaknadel and Ataa Allah, 2013) with their part of speech (Ataa Allah et al., 2013). The second tool is the Amazigh conjugator⁸ (Ataa Allah and Boulaknadel, 2014b), which gives online access to the conjugation of Amazigh verbs. Other tools are under study; they concern morphological analysis (Ataa Allah, 2014; Nejme et al., 2016), named entities (Talha et al., 2015), spell checking (Chaabi et al., 2019) and machine translation (Miftah et al., 2017; Taghbalout et al., 2018).

4.4 Language Learning Materials

Language learning plays an important role in the preservation of a language and the transmission of its culture. Aware of that, IRCAM has produced many digital teaching aids and educational materials⁹, in order to contribute to the discovery of the cultural heritage and the development of reading and lexical competences. The objectives of these materials are to familiarize the learner with tifinagh, to support oral expression, to prepare and facilitate the transition to writing, to develop a taste for reading, and to enrich the learner's vocabulary.

The elaborated materials are of three kinds. Those for supporting reading: the ⵜⵉⵣⵓⵣⴰⴼ [tizuzaf] 'Jewelry' e-book and the digital game ⴰⴷ ⵏⵍⵎⴷ ⵜⵉⴼⵉⵏⴰⵖ [Ad nlmd tifinaghe] 'Learning tifinagh'. Materials for developing vocabulary: the multimedia dictionaries ⵜⴰⵎⴰⵡⴰⵍⵜ ⵉⵏⵓ ⵜⴰⵡⵍⴰⴼⴰⵏⵜ [Tamawalt inu tawlafant] 'My illustrated vocabulary' and ⵜⴰⵎⴰⵡⴰⵍⵜ ⵏ ⵉⵎⵥⵥⵢⴰⵏⵏ [Tamawalt n imzzyann] 'Children dictionary' (Ataa Allah, 2011), the multimedia game ⵜⵉⵏⵎⵍ ⵏ ⵜⵎⴰⵣⵉⵖⵜ [Tinml n tmazight] 'Amazigh School', and the mobile games ⵉⵣⵡⵉⵍⵏ ⵙ ⵜⵎⴰⵣⵉⵖⵜ [Izwiln s tmazight] 'Numbers in Amazigh' and ⴰⵡⴰⵍ ⵉⵏⵓ ⴰⵎⵣⵡⴰⵔⵓ [Awal inu amzwaru] 'My first words'. And materials for developing communicative competence, including multimedia support: ⵉⵎⵓⴷⴰⵔ ⵉⵔⴰⵎⵢⴰⵔⴰⵏ ⴷ ⵉⵏⴰⵎⵢⴰⵔⵏ [Imudar iramyarn d inamyarn] 'Wild and Domestic Animals' and ⵉⴳⴹⴰⴹ ⴷ ⵉⴱⵓⵅⵅⴰ [IgDaD d ibukha] 'Birds and Insects'. In addition to these materials, IRCAM is currently working on a Massive Open Online Course (MOOC) project that aims to make Amazigh language courses available to all interested persons inside and outside Morocco (Chaabi et al., 2018).

⁵ http://tal.ircam.ma/talam/ref.php
⁶ http://tal.ircam.ma/talam/lexam.php
⁷ http://tal.ircam.ma/dglai/
⁸ http://tal.ircam.ma/conjugueur/
⁹ tal.ircam.ma/talam

5. Conclusion

IRCAM has risen to a number of challenges, particularly those related to the integration of the Amazigh language in the digital world. Nevertheless, the validation of the organic law relating to the implementation of Amazigh as an official language, which stipulates its insertion into education and the priority areas of public life, opens up new challenges for IRCAM. In order to face these challenges, some research projects are underway and others are planned.

6. Bibliographical References

Andries, P. (2008). Unicode 5.0 en pratique, codage des caractères et internationalisation des logiciels et des documents. Dunod, France.

Ameur, M., Boumalk, A., Iazzi, E.M., Souifi, H. and Ansar, K. (2009). Vocabulaire des médias, français-amazighe-anglais-arabe. IRCAM Publications, Rabat.

Ameur, M., Bouhjar, A., Boumalk, A., El Azrak, N. and Laabdelaoui, R. (2009). Vocabulaire grammatical. IRCAM Publications, Rabat.

Ameur, M., Ansar, K., Bouhjar, A. and El Azrak, N. (2013). Terminologie amazighe de l'audiovisuel. IRCAM Publications, Rabat.

Ameur, M., Ansar, K., Bouhjar, A. and El Azrak, N. (2015). Terminologie administrative. IRCAM Publications, Rabat.

Ameur, M., Ansar, K., Boumalk, A., El Azrak, N. and Laabdelaoui, R. (2017). Dictionnaire général de la langue amazighe. IRCAM Publications, Rabat.

Ataa Allah, F. (2011). Conception d'un dictionnaire imagier sonore en ligne de la langue amazighe. In proceedings of the 8th Multidisciplinary Symposium on Design and Evaluation of Digital Content for Education (SPDECE'11), pages 158-164, Ciudad Real, Spain.

Ataa Allah, F. (2014). Finite-state transducer for Amazigh verbal morphology. Literary & Linguistic Computing. Oxford University Press.

Ataa Allah, F. and Boulaknadel, S. (2011). Convertisseur pour la langue amazighe : script arabe-latin-tifinaghe. In proceedings of the 2nd Symposium International sur le Traitement Automatique de la Culture Amazighe (SITACAM'11), Agadir, Morocco.

Ataa Allah, F. and Boulaknadel, S. (2014a). La promotion de l'amazighe à la lumière des technologies de l'information et de communication. Asinag, 9:33-48, IRCAM Publications.

Ataa Allah, F. and Boulaknadel, S. (2014b). Amazigh verb conjugator. In proceedings of the 9th International Conference on Language Resources and Evaluation (LREC'14), pages 1051-1055, Reykjavik, Iceland.

Ataa Allah, F., Boulaknadel, S. and Frain, J. (2019). Dictionnaire général de la langue amazighe informatisé. In proceedings of the conférence sur la Diversité Linguistique et TAL (DiLiTAL'19), Oujda, Morocco.

Ataa Allah, F., Boulaknadel, S. and Souifi, H. (2014). Jeu d'étiquettes morphosyntaxiques de la langue amazighe. Asinag, 9:171-184, IRCAM Publications.

Ataa Allah, F. and Jaa, H. (2009). Etiquetage morphosyntaxique : outil d'assistance dédié à la langue amazighe. In proceedings of the 1st Symposium International sur le Traitement Automatique de la Culture Amazighe (SITACAM'09), pages 110-119, Agadir, Morocco.

Boukhris, F., Boumalk, A., Elmoujahid, E. and Souifi, H. (2008). La nouvelle grammaire de l'amazighe. IRCAM Publications, Rabat.

Boukous, A. (2012). Revitalizing the Amazigh language: stakes, challenges, and strategies. IRCAM Publications, Rabat.

Boulaknadel, S. and Ataa Allah, F. (2013). Building a standard Amazigh corpus. In Advances in intelligent systems and computing, 179:91-98.

Chaabi, Y., Boulaknadel, S. and Ataa Allah, F. (2018). MOOC-IRCAM pour l'apprentissage de la langue Amazighe - opportunités et défis. In proceedings of the 8th conférence internationale sur la Technologie de l'Information et de Communication pour l'Amazighe (TICAM'18), Rabat, Morocco.

Chaabi, Y., Outahajala, M. and Ataa Allah, F. (2019). Amazigh Spell Checker based on Naïve Bayes and N-Gram. In proceedings of the International Conference on Advanced Technologies and Humanitarian Sciences (ICATHS'19), Rabat, Morocco.

Commission Spéciale d'Education et de Formation (COSEF) (2000). La Charte Nationale d'Education et de Formation, Royaume du Maroc, Rabat, Morocco.

Miftah, N., Ataa Allah, F. and Taghbalout, I. (2017). Sentence-aligned parallel corpus Amazigh-English. In proceedings of the International Conference on Information and Communication Systems, Irbid, Jordan.

Nejme, F., Boulaknadel, S. and Aboutajdine, D. (2016). Amamorph: finite state morphological analyzer for Amazigh. Journal of computing and information technology, pages 91-110.

Talha, M., Boulaknadel, S. and Aboutajdine, D. (2015). Development of Amazigh named entity recognition system using hybrid approach. In proceedings of the 16th International Conference on Intelligent Text Processing and Computational Linguistics, pages 14-20, Cairo, Egypt.

Taghbalout, I., Ataa Allah, F. and El Marraki, M. (2018). Towards UNL based machine translation for Moroccan Amazigh language. International Journal of Computational Science and Engineering, 17(1):43-54.

Yakoubi, N., Frain, J. and Ataa Allah, F. (2016). Convertisseur Numérique : Tifinaghe-Braille. In proceedings of the 7th conférence internationale sur la Technologie de l'Information et de Communication (TICAM'16), Rabat, Morocco.
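
Supplementary sketch for Section 4.2. The conversion of Latin-transcribed Amazigh into Unicode-based tifinagh, which the paper attributes to a transliterator, can be illustrated with a small Python example. This is a minimal sketch under stated assumptions, not IRCAM's tool: the names `transliterate` and `LATIN_TO_TIFINAGH` and the digraph conventions (`gh`, `kh`) are hypothetical, and the table covers only letters that can be read off the transcribed titles in Section 4.4.

```python
# Illustrative subset of a Latin-to-Tifinagh mapping. This table is an
# assumption made for the sketch; it is NOT the table used by IRCAM's
# transliterator and it omits emphatic consonants and Arabic-script input.
LATIN_TO_TIFINAGH = {
    "gh": "ⵖ", "kh": "ⵅ",
    "a": "ⴰ", "b": "ⴱ", "d": "ⴷ", "f": "ⴼ", "g": "ⴳ",
    "i": "ⵉ", "l": "ⵍ", "m": "ⵎ", "n": "ⵏ", "r": "ⵔ",
    "s": "ⵙ", "t": "ⵜ", "u": "ⵓ", "w": "ⵡ", "y": "ⵢ", "z": "ⵣ",
}

# Try longer keys first so that "gh" wins over "g" followed by "h".
_KEYS_BY_LENGTH = sorted(LATIN_TO_TIFINAGH, key=len, reverse=True)


def transliterate(text: str) -> str:
    """Convert Latin-transcribed Amazigh text to Unicode Tifinagh."""
    lowered = text.lower()
    out = []
    i = 0
    while i < len(lowered):
        for key in _KEYS_BY_LENGTH:
            if lowered.startswith(key, i):
                out.append(LATIN_TO_TIFINAGH[key])
                i += len(key)
                break
        else:
            # No mapping known: copy the (lowercased) character through.
            out.append(lowered[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("Tamawalt inu tawlafant"))
```

With this table, `transliterate("Tamawalt inu tawlafant")` reproduces the dictionary title ⵜⴰⵎⴰⵡⴰⵍⵜ ⵉⵏⵓ ⵜⴰⵡⵍⴰⴼⴰⵏⵜ quoted in Section 4.4. Longest-match-first lookup is only one simple design choice; a full tool would also need the emphatic consonants and Arabic-script input that this sketch leaves out.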