LARA in the Service of Revivalistics and Documentary Linguistics: Community Engagement and Endangered Languages∗

Ghil‘ad Zuckermann, The University of Adelaide, Australia
Sigurður Vigfússon, The Communication Centre for the Deaf and Hard of Hearing, Iceland
Manny Rayner, FTI/TIM, University of Geneva, Switzerland
Neasa Ní Chiaráin, Trinity College, Dublin, Ireland
Nedelina Ivanova, The Communication Centre for the Deaf and Hard of Hearing, Iceland
Hanieh Habibi, FTI/TIM, University of Geneva, Switzerland
Branislav Bédi, The Árni Magnússon Institute for Icelandic Studies, Iceland

∗ Authors in reverse alphabetical order.

Proceedings of the 4th Workshop on the Use of Computational Methods in the Study of Endangered Languages: Vol. 1 Papers, pages 13–23, Online, March 2–3, 2021.

Abstract

We argue that LARA, a new web platform that supports easy conversion of text into an online multimedia form designed to support non-native readers, is a good match to the task of creating high-quality resources useful for languages in the revivalistics spectrum. We illustrate with initial case studies in three widely different endangered/revival languages: Irish (Gaelic); Icelandic Sign Language (ÍTM); and Barngarla, a reclaimed Australian Aboriginal language. The exposition is presented from a language community perspective. Links are given to examples of LARA resources constructed for each language.

1 Introduction

When talking about languages with small, shrinking or non-existent speaker bases, we can adopt a positive or a negative attitude. If we say “endangered” or “dead” languages, the terms predispose us towards a negative, deficit point of view: perhaps the most important thing is to try to document the language as well as we can while information about it is still available. In this paper, we will in contrast accentuate the positive. All languages, except those where 100% of the children in the relevant group grow up speaking the language, are under threat. Given a particular language we care about, what can we do to improve its prospects?

To underline our commitment to this positive perspective, we will situate the discussion within the emerging transdisciplinary field of Revivalistics (Zuckermann, 2020), a neologism which combines the notions of “revival” and “linguistics”. This in no way should be read as playing down the importance of language documentation, which for obvious reasons is an essential component of the enterprise. Rather, we want to view documentary linguistics through a revivalistic lens, documenting the language so that the material we produce can directly help community members who are trying to strengthen their linguistic abilities but may not be linguistically sophisticated. In this paper, we argue that LARA (Learning And Reading Assistant; Akhlaghi et al., 2019), an open source web platform that supports easy conversion of text into multimodal form, offers functionality that fits surprisingly well with the goals of revivalistics, and makes available a plethora of possibilities for rapid creation of useful online teaching materials.

In the rest of the paper, we start in §2 by giving a brief introduction to revivalistics, as here conceptualised, and to LARA. In §§3–5, we present case studies showing how we have used LARA to construct resources for three widely different endangered/Sleeping Beauty (“dead”) languages: Irish (Gaelic); Icelandic Sign Language (ÍTM); and Barngarla, a reclaimed Australian Aboriginal language. The final section concludes and outlines ongoing work.

2 Background: Revivalistics and LARA

2.1 Revivalistics

Revivalistics is defined by Zuckermann (2020) as “a new global, trans-disciplinary field of enquiry surrounding language reclamation, revitalization and reinvigoration”. Zuckermann considers these terms as different points on a “revival cline or spectrum”. Here, Reclamation is the revival of a no-longer spoken language (“Sleeping/Dreaming Beauty”), the best-known case being Hebrew; Revitalization is the revival of a severely endangered language, for example Adnyamathanha of the Flinders Ranges in Australia; and Reinvigoration is the revival of an endangered language that still has children speaking it, for example the Celtic languages Irish and Welsh.

Zuckermann argues at length that it is helpful to adopt a broad perspective, both when considering the above as instances of a single set of issues, and when considering revivalistics as an inherently trans-disciplinary field of inquiry, which by its nature involves not only linguistics and language technology but also anthropology, sociology, politics, law, mental health and other disciplines. To help a language that is under threat, it is necessary to consider why it is under threat, what the consequences are for the (current and potential) speakers of the language, what their motivation is, if any, for wanting to strengthen the language, and what in practice can be done.

In this paper, we will be most immediately concerned with language and language technology, but the other aspects are also implicitly present.

2.2 LARA

LARA¹ (Learning and Reading Assistant) is a collaborative open source² project, active since mid-2018, whose goal is to develop tools that support conversion of plain texts into an interactive multimedia form designed to support development of L2 language skills by reading online. The basic approach is in line with Krashen’s influential Theory of Input (Krashen, 1982), suggesting that language learning proceeds most successfully when learners are presented with interesting and comprehensible L2 material in a low-anxiety situation.

LARA implements this abstract programme by providing concrete assistance to L2 learners, making texts more comprehensible to help them develop their reading, vocabulary and listening skills. In particular, LARA texts include translations and human-recorded audio (video, in the case of sign languages) attached to words and sentences, and a personalised concordance constructed from the learner’s reading history. An important point is that the concordance is organised by lemma, rather than by surface form; this requires the LARA text to be marked up so that each word is annotated with its associated lemma, a process which for many languages can be performed semi-automatically with an integrated third-party tagger/lemmatizer doing most of the work (Akhlaghi et al., 2020).

From the user perspective, the consequence of the above is that the learner, just by clicking or hovering on a word, is always in a position to answer three questions: what does it mean, what does it sound like (look like, in the case of sign languages), and where have I seen some form of the word before. Figure 1 shows an example for an Irish text. Students can test their knowledge of a text using several kinds of automatically generated flashcards, with a new set of flashcards created on each run (Bédi et al., 2020).

Figure 1: Example of Irish LARA content, Fairceallach Fhinn Mhic Cumhaill (‘Fionn’s burly friend’). A ‘play all’ audio button function is included at the top of the page to enable the listener to hear the entire story in one go (1). The text and images are in the pane on the left-hand side. Clicking on a word displays information about it in the right-hand pane. Here, the user has clicked on bhí = “to be (past tense)” (2), showing an automatically generated concordance; the lemma bí; and every variation of bí that is in this text (3). Hovering the mouse over a word plays audio and shows a popup translation at word-level. Clicking on a loudspeaker plays audio for the entire sentence as well as showing a popup translation (4). The back-arrows (5) link each line in the concordance to its context of occurrence. A link to the document can be found on the LARA content page.

Related platforms, from which the project has adapted some ideas, include Learning With Text³ and Clilstore⁴. The LARA tools are accessed through a free portal, divided into two layers. The core LARA engine consists of a suite of Python modules, which can also be run stand-alone from the command-line; on top, there is a web layer implemented in PHP. Comprehensive online documentation is available (Rayner et al., 2020).

In the following sections, we describe how LARA is being used to create resources for the three languages which are our main focus in this paper.

¹ https://www.unige.ch/callector/lara/
² https://sourceforge.net/projects/callector-lara/
³ https://sourceforge.net/projects/lwt/
⁴ http://multidict.net/clilstore/

3 Irish (Gaelic)

3.1 The context

Irish belongs to the Goidelic branch of the Celtic languages. It is a community language spoken in relatively small regions (Gaeltacht regions), primarily in the West of Ireland, with daily speaker numbers of about 20,586, or 0.43% of the Irish population (CSO, 2016). Note, however, that there are no monolingual communities, and even in the Gaeltacht, English is increasingly dominant. Outside these rather small and scattered communities, Irish is spoken as a first language in individual households, mostly in urban areas.

… language teaching in this context poses numerous challenges. Some are discussed here, as a prelude to discussion of how LARA can help address them.

3.2 Challenges in the teaching of Irish

A major challenge is that the teachers are typically second language learners themselves, and their own command of the language (or their confidence in it) can be problematic. Teachers often feel overburdened with the major responsibility placed on them in the revitalisation and maintenance initiative, but report inadequate resources