Jejueo Talking Dictionary: a Collaborative Online Database for Language Revitalization
Moira Saltzman
University of Michigan

Abstract

This paper describes the ongoing development of the Jejueo Talking Dictionary, a free online multimedia database and Android application. Jejueo is a critically endangered language spoken by 5,000-10,000 people throughout Jeju Province, South Korea, and in a diasporic enclave in Osaka, Japan. Under contact pressure from Standard Korean, Jejueo is undergoing rapid attrition (Kang, 2005; Kang, 2007), and most fluent speakers of Jejueo are now over 75 years old (UNESCO, 2010). In recent years, talking dictionaries have proven to be valuable tools in language revitalization programs worldwide (Nathan, 2006; Harrison and Anderson, 2006). As a collaborative team including linguists from Jeju National University, members of the Jejueo Preservation Society, Jeju community members and outside linguists, we are currently building a web-based talking dictionary of Jejueo along with an application for Android devices. The Jejueo talking dictionary will compile existing annotated video corpora of Jejueo songs, conversational genres and regional mythology into a multimedia database, to be supplemented by original annotated video recordings of natural language use. Lexemes and definitions will be accompanied by audio files of their pronunciation and occasional photos, in the case of items native to Jeju. The audio and video data will be tagged in Jejueo, Korean, Japanese and English so that users may search or browse the dictionary in any of these languages. Videos showing a range of discourse types will have interlinear glossing, so that users may search Jejueo particles as well as lexemes and grammatical topics, and find the tools to construct original Jejueo speech. The Jejueo talking dictionary will serve as a tool for language acquisition in Jejueo immersion programs in schools, as well as a repository for oral history and ceremonial speech. The aim of this paper is to discuss how the interests of diverse user communities may be addressed by the methodology, organization and scope of talking dictionaries.

1 Introduction

The purpose of this paper is to present the ongoing development of the Jejueo Talking Dictionary as an example of applying interdisciplinary methodology to create an enduring, multipurpose record of an endangered language. In this paper I examine strategies for gathering extensive data to create a multimodal online platform aimed at a wide variety of uses and user groups. The Jejueo Talking Dictionary project is tailored to diverse user communities on Jeju Island, South Korea, where Jejueo, the indigenous language, is critically endangered and underdocumented, but where the population’s smartphone penetration rate is 75% (Lee, 2014) and semi-speakers are highly proficient users of technology (Song, 2012). The Jejueo Talking Dictionary is also intended for Jejueo speakers of varying degrees of fluency in Osaka, Japan, where up to 126,511 diasporic Jejuans reside (Southcott, 2013). A third aim of the Jejueo Talking Dictionary is to create extensive linguistic documentation of Jejueo that will be available to the wider scientific community, as the vast majority of existing documentary materials on Jejueo are published in Korean. The Jejueo Talking Dictionary will serve as an online open-access repository of over 200 hours of natural and ceremonial language use, with interlinear glossing in Jejueo, Korean, Japanese and English.

2 Background

2.1 Language context

Very closely related to Korean, Jejueo is the indigenous language of Jeju Island, South Korea.
Jejueo has 5,000-10,000 speakers located throughout the islands of Jeju Province and in a diasporic enclave in Osaka, Japan. With most fluent speakers over 75 years old, Jejueo was classified as critically endangered by UNESCO in 2010. The Koreanic language family consists of at least two languages, Jejueo and Korean. Several regional varieties of Korean are spoken across the Korean peninsula, divided loosely along provincial lines. Jejueo and Korean are not mutually intelligible, owing to Jejueo’s distinct lexicon and grammatical morphemes. Pilot research (Yang, 2013) estimates that 20-25% of the lexicons of Jejueo and Korean overlap, and a recent study (O’Grady, 2015) found that Jejueo is at most 12% intelligible to speakers of Korean on Korea’s mainland.¹ Jejueo conserves many Middle Korean phonological and lexical features lost to Modern Standard Korean (MSK), including the Middle Korean phoneme /ɔ/ and terms such as pɨzʌp : Jejueo pusʌp ‘charcoal burner’ (Stonham, 2011: 97).

Extensive lexical and morphological borrowing from Japanese, Mongolian and Manchurian is evident in Jejueo, owing to the Mongolian colonization of Jeju in the 13th and 14th centuries, Japan’s annexation of Korea and occupation of Jeju between 1910 and 1945, and centuries of trade with Manchuria and Japan (Martin, 1993; Lee and Ramsey, 2000). Several place names in Jeju are arguably Japonic in origin, e.g. Tamna, the first known name of Jeju Island (Kwen, 1994:167; Vovin, 2013). Moreover, several names for indigenous fruits and vegetables on Jeju are borrowed from Japanese, e.g. mik͈aŋ ‘orange’. Mongolic speakers left the lexical imprint of a robust inventory of terms describing horses and cows, e.g. mɔl ‘horse’. Jejueo borrowed grammatical morphemes from the Tungusic language Manchurian, e.g. the dative suffixal particle *de < ti ‘to’ (Kang, 2005).

¹ In a 2015 study O’Grady and Yang found that speakers of Korean from four provinces on the mainland had rates of 8-12% intelligibility for Jejueo based on a comprehension task of a one-minute recording of Jejueo connected speech.

Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages, pages 122–129, Honolulu, Hawai‘i, USA, March 6–7, 2017. © 2017 Association for Computational Linguistics

2.2 Current status of Jejueo

The present situation in Jeju is one of language shift, where fewer than 10,000 people out of a population of 600,000 are fluent in Jejueo, and features of Jejueo’s lexicon, morphosyntax and phonology are rapidly assimilating to Korean (Kang, 2005; Saltzman, 2014). Recent surveys on language ideologies of Jejueo speakers (Kim, 2011; Kim, 2013) show that a roughly diglossic situation is maintained by present-day language ideologies. In a series of qualitative interviews on language ideologies, Kim (2013:33) finds common themes suggesting that Korean is used as a means of showing respect to unfamiliar interlocutors, as Korean “...is perceived as the language of distance and rationality”. Likewise, Jejueo is considered appropriate to use whenever interpersonal boundaries, such as distinctions within social hierarchies, are perceived as less salient than the intimacy and mutual trust two or more people share (Kim, 2013).

Yang’s (2013) pilot survey on language attitudes finds that while community members recognize Jejueo as a marker of Jeju identity worth transmitting to future generations, few speakers feel empowered to reverse the pattern of language shift to Korean. There are no longer any monolingual speakers of Jejueo on Jeju or in Osaka. The examples below are samples of the same declarative construction produced by a fluent Jejueo speaker in (1), a typical younger Jejueo semi-speaker in (2), and the Korean translation (3). Jejueo morphemes in (2) are in boldface.

(1) harmang-jʌŋ sontɕi-jʌŋ mik͈aŋ-ɯl tʰa-m-su-ta
    grandmother-CONJ grandchild-CONJ orange-ACC pick-PRS[PROG]-FO-DECL
    “The grandmother and grandchild are picking oranges.”

(2) harmang-koa sontɕa-oa kjul-ɯl t͈a-ko i-su-ta
    grandmother-CONJ grandchild-CONJ orange-ACC pick-PROG-EXIST[PRS]-FO.DECL
    “The grandmother and grandchild are picking oranges.”

(3) harmʌni-oa sontɕa-oa kjul-ɯl t͈a-ko is͈-ʌjo
    grandmother-CONJ grandchild-CONJ orange-ACC pick-PROG EXIST[PRS]-FO.DECL
    “The grandmother and grandchild are picking oranges.”

While examples (1) and (3) have several cognate forms, the majority of grammatical particles are genetically unrelated. The accusative particle -ɯl is shared by Korean and Jejueo, although in Jejueo the nominative and accusative markers are most commonly dropped. In the construction in example (2), the Jejueo morphemes have been replaced by Korean morphemes, save ‘grandmother’ and the verbal ending, a pattern typical of non-fluent speakers of Jejueo (Saltzman, 2014).

3 Jejueo lexicography and sustainability

Because most Korean linguists view Jejueo as a conservative dialect of Korean (Sohn, 1999; Song, 2012), lexical documentation of Jejueo has not been a scientific priority. The few Jejueo lexicographic projects have been

publication of lexicographic materials may even help indigenous languages be perceived as ‘real languages’ in the sociolinguistic marketplace. The lexicographic materials alone, however, do not engender sufficient motivation for a speech community to maintain the use of their heritage language. Fishman (1991) warns against dictionary projects that become ‘monuments’ to a language rather than stimulating language use and intergenerational transmission.

A recent study by O’Grady (2015) found that the level of Jejueo transmission between generations shows a drastic decline. Given the task of answering content questions based on a one-minute recording of Jejueo connected speech, heritage speakers in the 50-60 age bracket demonstrated a comprehension level of 89%, while heritage speakers between 20 and 29 showed just 12% comprehension, equal to that of citizens