International Conference Language Technologies for All (LT4All): Enabling Linguistic Diversity and Multilingualism Worldwide
Recommended publications
A Sociolinguistic Profile of the Kyoli (Cori) [cry] Language of Kaduna State, Nigeria
DigitalResources Electronic Survey Report 2020-012. A Sociolinguistic Profile of the Kyoli (Cori) [cry] Language of Kaduna State, Nigeria. Ken Decker, John Muniru, Julius Dabet, Benard Abraham, Jonah Innocent. SIL International® 2020. SIL Electronic Survey Report 2020-012, October 2020. © 2020 SIL International®. All rights reserved. Data and materials collected by researchers in an era before documentation of permission was standardized may be included in this publication. SIL makes diligent efforts to identify and acknowledge sources and to obtain appropriate permissions wherever possible, acting in good faith and on the best information available at the time of publication. Abstract: This report describes a sociolinguistic survey conducted among the Kyoli-speaking communities in Jaba Local Government Area (LGA), Kaduna State, in central Nigeria. The Ethnologue (Eberhard et al. 2020a) classifies Kyoli [cry] as a Niger-Congo, Atlantic-Congo, Volta-Congo, Benue-Congo, Plateau, Western, Northwestern, Hyamic language. During the survey, it was learned that the speakers of the language prefer to spell the name of their language <Kyoli>, which is pronounced as [kjoli] or [çjoli]. They refer to speakers of the language as Kwoli. We estimate that there may be about 7,000 to 8,000 speakers of Kyoli, which is most if not all of the ethnic group. The goals of this research included gaining a better understanding of the role of Kyoli and other languages in the lives of the Kwoli people. Our data indicate that Kyoli is used at a sustainable level of orality, EGIDS 6a.
Final Study Report on CEF Automated Translation Value Proposition in the Context of the European LT Market/Ecosystem
Final study report on CEF Automated Translation value proposition in the context of the European LT market/ecosystem. FINAL REPORT. A study prepared for the European Commission DG Communications Networks, Content & Technology by: Digital Single Market. CEF AT value proposition in the context of the European LT market/ecosystem, Final Study Report. This study was carried out for the European Commission by Luc MEERTENS, Khalid CHOUKRI, Stefania AGUZZI, Andrejs VASILJEVS. Internal identification: Contract number: 2017/S 108-216374; SMART number: 2016/0103. DISCLAIMER: By the European Commission, Directorate-General of Communications Networks, Content & Technology. The information and views set out in this publication are those of the author(s) and do not necessarily reflect the official opinion of the Commission. The Commission does not guarantee the accuracy of the data included in this study. Neither the Commission nor any person acting on the Commission’s behalf may be held responsible for the use which may be made of the information contained therein. ISBN 978-92-76-00783-8, doi: 10.2759/142151. © European Union, 2019. All rights reserved. Certain parts are licensed under conditions to the EU. Reproduction is authorised provided the source is acknowledged. CONTENTS: Table of figures; List of tables
Annex H. Summary of the Early Grade Reading Materials Survey in Senegal
Annex H. Summary of the Early Grade Reading Materials Survey in Senegal. Geography and Demographics: Size: 196,722 square kilometers (km²). Population: 14 million (2015). Capital: Dakar. Urban: 44% (2015). Administrative Divisions: 14 regions. Religion: 95% Muslim, 4% Christian, 1% Traditional. Source: Central Intelligence Agency (2015). Note: Population and percentages are rounded. Literacy: Projected 2013 Primary School Age Population (aged 7–12 years) (a): 2.2 million. 2015 Literacy Rates (a): Adult (aged >15 years): 56% overall, 68% male, 44% female; Youth (aged 15–24 years): 70% overall, 76% male, 64% female. 2013 Primary School GER (a): 84%, up from 65% in 1999. 2013 Pre-primary School GER (a): 15%, up from 3% in 1999. Sample EGRA Results (b): Language: French. When: 2009. Where: 11 regions. Who: 687 P3 students. Oral Reading Fluency: mean 18.4 correct words per minute; standard deviation 20.6; 18% zero scores. Reading Comprehension: 11% reading with ≥60% comprehension; 52% zero scores. Note: EGRA = Early Grade Reading Assessment; GER = Gross Enrollment Rate; P3 = Primary Grade 3. Percentages are rounded. (a) Source: UNESCO (2015). (b) Source: Pouezevara et al. (2010). Language: Number of Living Languages (a): 210. Major Languages (b), Estimated Population (c), Government Recognized Status (d): French: 47,000 (L1) (2015), 3.9 million (L2) (2013); “Official” language. Wolof: 5.2 million (L1) (2015); “National” language, de facto largest LWC. Pulaar: 3.5 million (L1) (2015); “National” language. Serer-Sine: 1.4 million (L1) (2015); “National” language. Maninkakan (i.e., Malinké): 1.3 million (L1) (2015); “National” language. Soninke: 281,000 (L1) (2015); “National” language. Jola-Fonyi (i.e., Diola): 340,000 (L1); “National” language. Balant, Bayot, Guñuun, Hassanya, Jalunga, Kanjaad, Laalaa, Mandinka, Manjaaku, Mankaañ, Mënik, Ndut, Noon, Oniyan, Paloor, and Saafi-Saafi: __; “National” languages. Note: L1 = first language; L2 = second language; LWC = language of wider communication.
Language Classification in the Ethnologue and Its Consequences (Paper)
Sarah E. Cornwell, The University of Western Ontario, London, Ontario, Canada. Language Classification in The Ethnologue and its Consequences (Paper). Abstract: The Ethnologue is a widely used classificatory standard for the world’s 7000+ natural languages. However, the motives and processes used by The Ethnologue’s governing body, SIL International, have come under criticism by linguists. This paper investigates how The Ethnologue answers the question “What is a language?” through the theoretical lens presented by Bowker and Star in Sorting Things Out (1999) and presents some consequences of those classificatory decisions. 1. Introduction: Language is central to human culture and experience. It is the central core of our communicative capacity, running through all media of communication. Due to this essential role, there is a need to classify and count languages and their speakers. Governments need to communicate with citizens, libraries need to catalogue books by language, and search engines need to return webpages in the appropriate language for each searcher. For these reasons among others, there is a strong institutional need for a standardized classification structure and labelling system for languages. The most widespread current classificatory infrastructure for languages is based on The Ethnologue. This essay will critically evaluate The Ethnologue using the Foucauldian theory developed for investigating classification structures by Bowker and Star (1999) in Sorting Things Out: Classification and its Consequences. 2. The Ethnologue: The Ethnologue is a catalogue of all the world’s languages. First compiled in 1951, The Ethnologue is currently in its 20th edition and contains descriptions of 7099 living languages (2017). Since 1997, The Ethnologue has been available freely online and widely accessible (Simons & Fennig, 2017).
Mapping India's Language and Mother Tongue Diversity and Its
Mapping India’s Language and Mother Tongue Diversity and its Exclusion in the Indian Census. Dr. Shivakumar Jolad (FLAME University, Lavale, Pune, India) and Aayush Agarwal (Centre for Social and Behavioural Change, Ashoka University, New Delhi, India). Abstract: In this article, we critique the process of linguistic data enumeration and classification by the Census of India. We map out inclusion and exclusion under Scheduled and non-Scheduled languages and their mother tongues and their representation in state bureaucracies, the judiciary, and education. We highlight that Census classification leads to delegitimization of ‘mother tongues’ that deserve the status of language and official recognition by the state. We argue that the blanket exclusion of languages and mother tongues based on numerical thresholds disregards the languages of about 18.7 million speakers in India. We compute and map the Linguistic Diversity Index of India at the national and state levels and show that the exclusion of mother tongues undermines the linguistic diversity of states. We show that the Hindi belt shows the maximum divergence in Language and Mother Tongue Diversity. We stress the need for India to officially acknowledge the linguistic diversity of states and to make the Census classification and enumeration reflect the true linguistic diversity. Introduction: India and the Indian subcontinent have long been known for their rich diversity in languages and cultures which had baffled travelers, invaders, and colonizers. Amir Khusru, Sufi poet and scholar of the 13th century, wrote about the diversity of languages in Northern India from Sindhi, Punjabi, and Gujarati to Telugu and Bengali (Grierson, 1903-27, vol.
The Components of Sustainable Multilingual Education Programs
A Two-Way Bridge: The Components of Sustainable Multilingual Education Programs. Studies demonstrate that learning is most effective when the instruction is received in the language the learner knows best. This simple truth extends from basic reading and writing skills in the first language to second language acquisition. In multilingual education programs (MLE) that start with the mother tongue, learners use their own language for learning in the early grades, while also learning the official language as a classroom subject. As learners gain competence in understanding, speaking, reading and writing the language of education, teachers begin using it for instruction. This instructional bridge between the community language and the language of wider communication enables learners, children and adults alike, to meet their broader multilingual goals while retaining their local language and culture. This booklet addresses several important aspects of MLE: ■ The voices of ethnolinguistic minority communities are often not heard. Therefore, advocacy is appropriate for these communities to meet their MLE needs. ■ Conventional instructional methods are not adequate for MLE programs. Educators at both community and national levels need to develop their capacity to design and implement MLE programs. ■ Developing a writing system for a non-dominant language is a challenging but essential early step in developing an MLE program. ■ MLE not only requires the commitment and resources of the local community, but also the resources and expertise available from government agencies, NGOs or others. Resource linking brings the partners together so that each one contributes its own particular resources. ■ MLE gives children and adults a firm foundation for continuing to learn throughout their lives.
English Planning for Global Cooperation in Sign Language Development Through Networks, Finland, the US, Colombia and Japan Consultants and Resources
SIL International 2009 Partners in Language Development UPDATE YEARS 1934–2009. “No one has ever seen a book in our language before.” Until this year, the Koda community in Bangladesh did not have a written form of their language. Some Koda people thought it was impossible to write their language. A cooperative education and development program for this community was started by Food for the Hungry and SIL. In 2009, a group of eager Koda men met several times with SIL staff to make decisions about a writing system for their language. The men, some of whom had only attended school for two years, participated in an orthography workshop with SIL consultants and began to write in their own language. They worked together to identify the speech sounds that are the same as in Bangla (the national language), the several sounds that are different, and how to represent each sound in an alphabet, the first step towards uniform spelling. When workshop participants were asked to write stories about their daily lives and culture, they responded, “We have never written a story before.” But by the end of the workshop, each participant had written a short story and made his own book to take back to his village, an achievement that gave them confidence and increased self-esteem. [Photo caption: Each workshop participant wrote a story in Koda and made his own book. These men have started to fulfill their community’s desire to protect and preserve their language.] SIL grew out of one man’s concern for people speaking languages that lacked written
A Sketch of the Linguistic Geography of Signed Languages in the Caribbean
OCCASIONAL PAPER No. 38. A SKETCH OF THE LINGUISTIC GEOGRAPHY OF SIGNED LANGUAGES IN THE CARIBBEAN. Ben Braithwaite, The University of the West Indies, St. Augustine. June 2017. SCL OCCASIONAL PAPERS, PAPER NUMBER 38, JUNE 2017. Edited by Ronald Kephart (2014–2016) and Joseph T. Farquharson (2016–2018), SCL Publications Officers. Copy editing by Sally J. Delgado and Ronald Kephart. Proofreading by Paulson Skerritt and Sulare Telford. EDITORIAL BOARD: Joseph T. Farquharson, The University of the West Indies, Mona (Chair); Janet L. Donnelly, College of the Bahamas; David Frank, SIL International; Ronald Kephart, University of North Florida; Salikoko S. Mufwene, University of Chicago; Ian E. Robertson, The University of the West Indies, St. Augustine; Geraldine Skeete, The University of the West Indies, St. Augustine; Donald C. Winford, Ohio State University. PUBLISHED BY THE SOCIETY FOR CARIBBEAN LINGUISTICS (SCL), c/o Department of Language, Linguistics and Philosophy, The University of the West Indies, Mona campus, Kingston 7, Jamaica. <www.scl-online.net> © 2017 Ben Braithwaite. All rights reserved. Not to be reproduced in any form without the written permission of the author. ISSN 1726–2496. 1. Introduction: “THE Caribbean… is the location of almost every type of linguistic phenomenon, and of every type of language situation. For example, trade and contact jargons, creole languages and dialects, ethnic vernaculars, and regional and nonstandard dialects are all spoken. There are also ancestral languages used for religious purposes…, regional standards, and international standards.
Ruken Çakıcı
RUKEN ÇAKICI. Personal information: Official Name: Ruket Çakıcı. Born in Turkey, 23 June 1978. Email: [email protected]. Website: http://www.ceng.metu.edu.tr/~ruken. Phone: (H) +90 (312) 210 6968 · (M) +90 (532) 557 8035. Work experience: 2010-: Instructor, METU; research and teaching duties. 1999-2010: Research Assistant, METU, Ankara; teaching assistantship of various courses. Education: 2002-2008: University of Edinburgh, UK; Doctor of Philosophy; School: School of Informatics; Thesis: Wide-Coverage Parsing for Turkish; Advisors: Prof. Mark Steedman & Prof. Miles Osborne. 1999-2002: Middle East Technical University; Master of Science; School: Computer Engineering; Thesis: A Computational Interface for Syntax and Morphemic Lexicons; Advisor: Prof. Cem Bozşahin. 1995-1999: Middle East Technical University; Bachelor of Science; School: Computer Engineering. Projects: 1999-2001: AppTek/ Lernout & Hauspie Inc, Language Pairing on Functional Structure: Lexical-Functional Grammar Based Machine Translation for English – Turkish. 150000USD. · Consultant developer. 2007-2011: TUB¨ MEDID Turkish Discourse Treebank Project, TUBITAK 1001 program (107E156), 137183 TRY. · Researcher · (Now part of COST Action IS1312 (TextLink)). 2012-2015: Unsupervised Learning Methods for Turkish Natural Language Processing, METU BAP Project (BAP-08-11-2012-116), 30000 TRY. · Primary Investigator. 2013-2015: TwiTR: Türkçe için Sosyal Ağlarda Olay Bulma ve Bulunan Olaylar için Konu Tahmini (TwiTR: Event detection and Topic identification for events in social networks for Turkish language), TUBITAK 1001 program (112E275), 110750 TRY. · Researcher · (Now Part of ICT COST Action IC1203 (ENERGIC)). 2013-2016: Understanding Images and Visualizing Text: Semantic Inference and Retrieval by Integrating Computer Vision and Natural Language Processing, TUBITAK 1001 program (113E116), 318112 TRY.
CALLS for PAPERS to the INTERNATIONAL CONFERENCE /HUMBOLT KOLLEG on PROSODY and LANGUAGE TECHNOLOGY: CHALLENGES and PROSPECTS, ABIDJAN MAY 4th to MAY 9th 2014
CALLS FOR PAPERS TO THE INTERNATIONAL CONFERENCE /HUMBOLT KOLLEG ON PROSODY AND LANGUAGE TECHNOLOGY: CHALLENGES AND PROSPECTS, ABIDJAN MAY 4th to MAY 9th 2014. 1. DESCRIPTION OF CONTENT: PROSODY AS A CORE COMPONENT OF LANGUAGE AND SPEECH Prosodic systems in language range from the melodies and rhythms of intonation in sentences and dialogue on the one hand to intricate duration and tonal patterns of words. Physically, prosody is commonly defined as the variation of speech sounds and pitch patterns in time and has long been considered in traditional linguistics as peripheral to the core communicative functions of speech acts. Research into African tonal languages has played a highly significant but often unappreciated role in uncovering the significance and intricacies of prosody, revealing autosegments of tone, intonation, stress, accent, vowel harmony, nasality and other phonological features which pattern in a quasi-independent fashion in speech, and are represented as hierarchically organized parallel phonetic and phonological information streams or tiers. Prosody has been studied from the perspectives of many different disciplines: literature, (particularly poetry and drama rhetoric), musicology, cognitive and behavioral psychology, the study of language acquisition and language and speech pathologies, forensic linguistics, computational and mathematical linguistics, language typology and historical linguistic reconstruction, and engineering disciplines concerned with the speech synthesis and recognition interfaces associated with modern mobile devices. These interdisciplinary developments have also touched research on African languages, though mainly in separate universities and by individuals working without the extensive laboratory support of European, American and East Asian research groups. 
A strength of research on the distinctive prosodic attributes of African languages is that results have been based on extensive and detailed fieldwork on a very large variety of languages.
Tivoid Survey
Tivoid Survey. Alan Starr and Clark Regnier. SIL International 2008. SIL Electronic Survey Report 2008-022, December 2008. Copyright © 2008 Alan Starr, Clark Regnier, and SIL International. All rights reserved. CONTENTS: Abstract. 1 Background: 1.1 Classification; 1.2 Purposes; 1.3 Selection of Sites; 1.4 Description of the Area; 1.5 Team and Timing. 2 Procedures: 2.1 Wordlists; 2.2 Recorded Text Testing (RTT); 2.3 Sociolinguistic Questionnaires; 2.4 Bilingual Questionnaires; 2.5 Group Interviews. 3 Ipulo-Eman-Caka: 3.1 Lexicostatistics; 3.2 Ipulo and Eman; 3.3 Caka and Eman. 4 Icheve: 4.1 Linguistic Results and Analysis; 4.2 Sociolinguistic and Multilingual Profile; 4.3 Conclusions and Decisions. 5 Proposed Changes to Ethnologue and ALCAM. Appendix 1: Tivoid Languages and Their Neighbors in Cameroon [ALCAM p. 369]. Appendix 2: Group Interview Form. Appendix 3: Sociolinguistic Questionnaire. Appendix 4: 1. (Omitted) 2. Multilingualism 3. Development of the language. Appendix 5: Bilingual Proficiency Questionnaire. References. Abstract: This 1990 survey investigates two Tiv-related languages: Assumbo and Mesaka. The purpose is to recommend which varieties should be standardized, based on both linguistic data (shared lexical percentages and inherent intelligibility) and sociolinguistic data (language use and attitudes). In particular, since the Assumbo dialect, Ipulo, as spoken in the village of Tinta is planned for development, there was need to discover the extent to which a written form of this dialect could be used. We investigated the use of Pidgin in key domains to determine community use and proficiency. [This report has not been peer reviewed.] 1 Background. 1.1 Classification: The Tiv-related languages under study in this survey are identified by the Ethnologue as Assumbo (Asumbo, Badzumbo) and Mesaka (Messaga, Iyon, Banagere).
The OpenHaRT 2013 Evaluation Workshop
Welcome to the OpenHaRT 2013 Evaluation Workshop. Information Technology Laboratory nist.gov/itl. Information Access Division nist.gov/itl/iad. Mark Przybocki, Multimodal Information Group nist.gov/itl/iad/mig. August 23rd, 2013, Washington D.C., Omni Shoreham hotel. The Multimodal Information Group’s Project Areas: § Speech Recognition § Speaker Recognition § Dialog Management § Human Assisted Speaker Recognition § Topic Detection and Tracking § Speaker Segmentation § Spoken Document Retrieval § Language Recognition § Voice Biometrics § ANSI/NIST-ITL Standard Voice Record § Tracking (Person/Object) § Text-to-Text § Event Detection § Speech-to-Text § Event Recounting § Speech-to-Speech § Predictive Video Analytics § Image-to-Text § Metric Development § Named Entity Identification § (new) Data Analytics § Automatic Content Extraction. Definition: MIG’s Evaluation Cycle. [Diagram: Evaluation Driven Research cycle; recoverable labels: Planning, Data, Researchers, Core technology development, Performance Assessment, Analysis and Workshop, with several stages marked NIST.] NIST’s MT Program’s Legacy – Past 10 Years • 27 Evaluation Events -- tracking the state-of-the-art in performance – (4) technology types: text-text, speech-text, speech-speech, Handwritten_Images-text – (9) languages: Arabic-2, Chinese, Dari, Farsi, Hindi, Korean, Pashto, Urdu, English • 11 genres of structured and unstructured content (nwire, web, Bnews, Bconv, food, speeches, editorials, handwriting-2, blogs, SMS, dialogs) • 60 Evaluation Test Sets available to MT researchers (source, references, metrics, sample system output and official results for comparison) • Over 85 research groups, >400 Primary Systems Evaluated. [Pie chart: program labels OpenMT, TIDES, TRANSTAC; slice values 57%, 23%, 11%, 5%, 3%, 1%; the mapping of values to labels is not recoverable.] AFRL – American Univ. Cairo – Apptek – ARL – BBN – BYU – Cambridge – Chinese Acad. Sci. – CMU – Columbia Univ. – Fujitsu Research – Google – IBM – JHU – Kansas State – KCSL – Language Weaver – Microsoft Research – Ohio State – Oxford – Qatar – Queen Mary (London) – RWTH Aachen – SAIC - Sakhr – SRI – Stanford – Systran – UMD – USC ISI – Univ.