Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Compilation of a Swiss German Dialect Corpus and Its Application to Pos Tagging
Compilation of a Swiss German Dialect Corpus and its Application to PoS Tagging Nora Hollenstein Noemi¨ Aepli University of Zurich University of Zurich [email protected] [email protected] Abstract Swiss German is a dialect continuum whose dialects are very different from Standard German, the official language of the German part of Switzerland. However, dealing with Swiss German in natural language processing, usually the detour through Standard German is taken. As writing in Swiss German has become more and more popular in recent years, we would like to provide data to serve as a stepping stone to automatically process the dialects. We compiled NOAH’s Corpus of Swiss German Dialects consisting of various text genres, manually annotated with Part-of- Speech tags. Furthermore, we applied this corpus as training set to a statistical Part-of-Speech tagger and achieved an accuracy of 90.62%. 1 Introduction Swiss German is not an official language of Switzerland, rather it includes dialects of Standard German, which is one of the four official languages. However, it is different from Standard German in terms of phonetics, lexicon, morphology and syntax. Swiss German is not dividable into a few dialects, in fact it is a dialect continuum with a huge variety. Swiss German is not only a spoken dialect but increasingly used in written form, especially in less formal text types. Often, Swiss German speakers write text messages, emails and blogs in Swiss German. However, in recent years it has become more and more popular and authors are publishing in their own dialect. Nonetheless, there is neither a writing standard nor an official orthography, which increases the variations dramatically due to the fact that people write as they please with their own style. -
Syntactic Transformations for Swiss German Dialects
Syntactic transformations for Swiss German dialects Yves Scherrer LATL Universite´ de Geneve` Geneva, Switzerland [email protected] Abstract These transformations are accomplished by a set of hand-crafted rules, developed and evaluated on While most dialectological research so far fo- the basis of the dependency version of the Standard cuses on phonetic and lexical phenomena, we German TIGER treebank. Ultimately, the rule set use recent fieldwork in the domain of dia- lect syntax to guide the development of mul- can be used either as a tool for treebank transduction tidialectal natural language processing tools. (i.e. deriving Swiss German treebanks from Stan- In particular, we develop a set of rules that dard German ones), or as the syntactic transfer mod- transform Standard German sentence struc- ule of a transfer-based machine translation system. tures into syntactically valid Swiss German After the discussion of related work (Section 2), sentence structures. These rules are sensitive we present the major syntactic differences between to the dialect area, so that the dialects of more Standard German and Swiss German dialects (Sec- than 300 towns are covered. We evaluate the tion 3). We then show how these differences can transformation rules on a Standard German treebank and obtain accuracy figures of 85% be covered by a set of transformation rules that ap- and above for most rules. We analyze the most ply to syntactically annotated Standard German text, frequent errors and discuss the benefit of these such as found in treebanks (Section 4). In Section transformations for various natural language 5, we give some coverage figures and discuss the processing tasks. -
The Gruffalo Latin Edition Pdf Free Download
THE GRUFFALO LATIN EDITION Author: Julia Donaldson Number of Pages: 32 pages Published Date: 02 Aug 2012 Publisher: Pan MacMillan Publication Country: London, United Kingdom Language: English ISBN: 9780230759329 DOWNLOAD: THE GRUFFALO LATIN EDITION The Gruffalo Latin Edition PDF Book "What this power is I cannot say; all I know is that it exists. The Science of Being Great https: tsw. airline accidents from 1991-2000 in which the National Transportation Safety Board (NTSB) found crew error to be a causal factor. Park " " Shockingly fun. Sheep farming involves the housing, feeding, and guarding of the sheep, all detailed in the book. No previous background on ROS is required, as this book takes you from the ground up. " Abused as a child, Marisa Russo feared commitment and fell into a lifestyle of poor choices and negativity. This is a huge setback toward financial security and a massive drain on our economy. He was drowning in debt and completely fed up. The 20 contributions reflect the breadth and the depth of the work of Bernhard Thalheim in conceptual modeling and database theory during his scientific career spanning more than 35 years of active research. forgottenbooks. It clearly explains the principles of good practice in this area and provides training tools to help practitioners develop their knowledge and skills and embed these principles into their setting. In a time when tens of millions of people provide care for family members, older adults, and people with special needs, we should all be experts at it. His line grew into the container giant Sea-Land Services, and Cudahy charts its dramatic evolution into Maersk Sealand, the largest container line in the world. -
Language Contact at the Romance-Germanic Language Border
Language Contact at the Romance–Germanic Language Border Other Books of Interest from Multilingual Matters Beyond Bilingualism: Multilingualism and Multilingual Education Jasone Cenoz and Fred Genesee (eds) Beyond Boundaries: Language and Identity in Contemporary Europe Paul Gubbins and Mike Holt (eds) Bilingualism: Beyond Basic Principles Jean-Marc Dewaele, Alex Housen and Li wei (eds) Can Threatened Languages be Saved? Joshua Fishman (ed.) Chtimi: The Urban Vernaculars of Northern France Timothy Pooley Community and Communication Sue Wright A Dynamic Model of Multilingualism Philip Herdina and Ulrike Jessner Encyclopedia of Bilingual Education and Bilingualism Colin Baker and Sylvia Prys Jones Identity, Insecurity and Image: France and Language Dennis Ager Language, Culture and Communication in Contemporary Europe Charlotte Hoffman (ed.) Language and Society in a Changing Italy Arturo Tosi Language Planning in Malawi, Mozambique and the Philippines Robert B. Kaplan and Richard B. Baldauf, Jr. (eds) Language Planning in Nepal, Taiwan and Sweden Richard B. Baldauf, Jr. and Robert B. Kaplan (eds) Language Planning: From Practice to Theory Robert B. Kaplan and Richard B. Baldauf, Jr. (eds) Language Reclamation Hubisi Nwenmely Linguistic Minorities in Central and Eastern Europe Christina Bratt Paulston and Donald Peckham (eds) Motivation in Language Planning and Language Policy Dennis Ager Multilingualism in Spain M. Teresa Turell (ed.) The Other Languages of Europe Guus Extra and Durk Gorter (eds) A Reader in French Sociolinguistics Malcolm Offord (ed.) Please contact us for the latest book information: Multilingual Matters, Frankfurt Lodge, Clevedon Hall, Victoria Road, Clevedon, BS21 7HH, England http://www.multilingual-matters.com Language Contact at the Romance–Germanic Language Border Edited by Jeanine Treffers-Daller and Roland Willemyns MULTILINGUAL MATTERS LTD Clevedon • Buffalo • Toronto • Sydney Library of Congress Cataloging in Publication Data Language Contact at Romance-Germanic Language Border/Edited by Jeanine Treffers-Daller and Roland Willemyns. -
A Bavarian-Speaking Exception in Alemannic-Speaking Switzerland: the Case of Samnaun 48 Located
47 A Bavarian -speaking 1. Introduction1 data2 is missing, as the study by Gröger is the Exception in Alemannic- only detailed linguistic description of the speaking Switzerland: he municipality of Samnaun, situated German dialect in Samnaun. in the extreme east of Switzerland, In this paper, I present a new research The Case of Samnaun project which seeks to fill this gap. The T stands out from the rest of German- speaking Switzerland: Samnaun is usually project is dedicated to the current linguistic A Project Presentation described as the only place where not an situation in Samnaun. Its working title is Alemannic, but a Bavarian dialect is spoken Bairisch-alemannischer Sprachkontakt. Das Journal Article (cf., e.g., Sonderegger 2003: 2839; Wiesinger Spektrum der sprachlichen Variation in Susanne Oberholzer 1983: 817). This claim is based on a study by Samnaun (‘Bavarian-Alemannic Language Gröger (1924) and has found its way into Contact. The Range of Linguistic Variation in Samnaun has been described as the only Samnaun’). This paper describes Samnaun Bavarian-speaking municipality in Ale- many linguistic descriptions of (German- mannic-speaking Switzerland on the basis speaking) Switzerland. In the literature, this and summarises the available descriptions of of a study done in 1924. Hints in the viewpoint has been almost unchallenged for its language situation as well as the aims, literature about the presence of other over 90 years. Hints about the presence of research questions, and methods of the varieties for everyday communication – an other varieties (an Alemannic dialect as well planned project. intermediate variety on the dialect- In section 2, I sketch the municipality of standard -axis as well as an Alemannic as an intermediate variety on the dialect- dialect – have not resulted in more recent standard-axis) from the second half of the Samnaun. -
Morphological Analysis and Lemmatization for Swiss German Using Weighted Transducers
Proceedings of the 13th Conference on Natural Language Processing (KONVENS 2016) Morphological analysis and lemmatization for Swiss German using weighted transducers Reto Baumgartner University of Zurich [email protected] Abstract haar and Wyler, 1997, p. 37). Conversely it pos- sesses infinitive particles that are not known to StG. With written Swiss German becoming SwG consists of different local dialects that more popular in everyday use, it has be- mainly differ in phonology and to a lesser extent in come a target for text processing. The ab- vocabulary. There is no standard orthography, but sence of a standard orthography and the there are proposals for sound-character assignment variety of dialects, however, lead to a vast like Dieth-Schreibung (Dieth, 1986) or Bärndüt- variation in different spellings which makes schi Schrybwys (Marti, 1985a) that are, however, this task difficult. We built a system based not known to everyone. This results in a high vari- on weighted transducers that recognizes ability of spellings influenced by both dialects and over 90% of the tokens in certain texts. personal writing preferences. As an example for Weights ensure preferring the best analysis the StG word Jahr “year”, we found in our corpus for most words while at the same time al- Jahr and Jaar, Johr and Joor, even Joh for different lowing for very broad range of spelling pronunciations and spelling preferences. variations. Our morphological tagset that The lack of a standard orthography and the vast- we defined for this purpose and lemmas in ness of variants motivate the choices that have to Standard German open the possibility for be made to process these dialects. -
The Politics and Ideologies of Pluricentric German in L2 Teaching
Julia Ruck Webster Vienna Private University THE POLITICS AND IDEOLOGIES OF PLURICENTRIC GERMAN IN L2 TEACHING Abstract: Despite a history of rigorous linguistic research on the regional variation of German as well as professional initiatives to promote German, Austrian, and Swiss Standard German as equal varieties, there is still a lack of awareness and systematic incorporation of regional varieties in L2 German teaching. This essay follows two goals: First, it reviews the development of the pluricentric approach in the discourse on L2 German teaching as well as the political and ideological preconditions that form the backdrop of this discussion. Particular emphasis will be given to institutional tri-national collaborations and the standard language ideology. Second, by drawing on sociolinguistic insights on the use and speaker attitudes of (non-)standard varieties, this contribution argues that the pluricentric focus on national standard varieties in L2 German teaching falls short in capturing the complex socioculturally situated practices of language use in both (often dialectally-oriented) everyday and (often standard-oriented) formal and official domains of language use. I argue that the pluricentric approach forms an important step in overcoming the monocentric bias of one correct Standard German; however, for an approach to L2 German teaching that aims at representing linguistic and cultural diversity, it is necessary to incorporate both standard and non-standard varieties into L2 German teaching. Keywords: L2 German w language variation w language ideologies w language politics Ruck, Julia. “The Politics and Ideologies of Pluricentric German in L2 Teaching.” Critical Multilingualism Studies 8:1 (2020): pp. 17–50. ISSN 2325–2871. -
THE SWISS GERMAN LANGUAGE and IDENTITY: STEREOTYPING BETWEEN the AARGAU and the ZÜRICH DIALECTS by Jessica Rohr
Purdue University Purdue e-Pubs Open Access Theses Theses and Dissertations 12-2016 The wS iss German language and identity: Stereotyping between the Aargau and the Zurich dialects Jessica Rohr Purdue University Follow this and additional works at: https://docs.lib.purdue.edu/open_access_theses Part of the Anthropological Linguistics and Sociolinguistics Commons, and the Language Interpretation and Translation Commons Recommended Citation Rohr, Jessica, "The wS iss German language and identity: Stereotyping between the Aargau and the Zurich dialects" (2016). Open Access Theses. 892. https://docs.lib.purdue.edu/open_access_theses/892 This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact [email protected] for additional information. THE SWISS GERMAN LANGUAGE AND IDENTITY: STEREOTYPING BETWEEN THE AARGAU AND THE ZÜRICH DIALECTS by Jessica Rohr A Thesis Submitted to the Faculty of Purdue University In Partial Fulfillment of the Requirements for the degree of Master of Arts Department of Languages and Cultures West Lafayette, Indiana December 2016 ii THE PURDUE UNIVERSITY GRADUATE SCHOOL STATEMENT OF THESIS APPROVAL Dr. John Sundquist, Chair Department of German and Russian Dr. Daniel J. Olson Department of Spanish Dr. Myrdene Anderson Department of Anthropology Approved by: Dr. Madeleine M Henry Head of the Departmental Graduate Program iii To my Friends and Family iv ACKNOWLEDGMENTS I would like to thank my major professor, Dr. Sundquist, and my committee members, Dr. Olson, and Dr. Anderson, for their support and guidance during this process. Your guidance kept me motivated and helped me put the entire project together, and that is greatly appreciated. -
Semantic Approaches in Natural Language Processing
Cover:Layout 1 02.08.2010 13:38 Seite 1 10th Conference on Natural Language Processing (KONVENS) g n i s Semantic Approaches s e c o in Natural Language Processing r P e g a u Proceedings of the Conference g n a on Natural Language Processing 2010 L l a r u t a This book contains state-of-the-art contributions to the 10th N n Edited by conference on Natural Language Processing, KONVENS 2010 i (Konferenz zur Verarbeitung natürlicher Sprache), with a focus s e Manfred Pinkal on semantic processing. h c a o Ines Rehbein The KONVENS in general aims at offering a broad perspective r on current research and developments within the interdiscipli - p p nary field of natural language processing. The central theme A Sabine Schulte im Walde c draws specific attention towards addressing linguistic aspects i t of meaning, covering deep as well as shallow approaches to se - n Angelika Storrer mantic processing. The contributions address both knowledge- a m based and data-driven methods for modelling and acquiring e semantic information, and discuss the role of semantic infor - S mation in applications of language technology. The articles demonstrate the importance of semantic proces - sing, and present novel and creative approaches to natural language processing in general. Some contributions put their focus on developing and improving NLP systems for tasks like Named Entity Recognition or Word Sense Disambiguation, or focus on semantic knowledge acquisition and exploitation with respect to collaboratively built ressources, or harvesting se - mantic information in virtual games. Others are set within the context of real-world applications, such as Authoring Aids, Text Summarisation and Information Retrieval. -
Searching for the Bilingual Advantage in Executive Functions in Speakers of Hunsrückisch and Other Minority Languages: a Literature Review
Searching for the bilingual advantage in executive functions in speakers of Hunsrückisch and other minority languages: a literature review Elisabeth Abreu and Bernardo Limberger (Rio Grande do Sul) Abstract The issue of a bilingual advantage on executive functions has been a hotbed for research and debate. Brazilian studies with the minority language Hunsrückisch have failed to replicate the finding of a bilingual advantage found in international studies with majority languages. This raises the question of whether the reasons behind the discrepant results are related to the lan- guage’s minority status. The goal of this paper was to investigate the bilingual advantage on executive functions in studies with minority languages. This is a literature review focusing on state of the art literature on bilingualism and executive functions; as well as a qualitative anal- ysis of the selected corpus in order to tackle the following questions: (1) is there evidence of a bilingual advantage in empirical studies involving minority languages? (2) are there common underlying causes for the presence or absence of the bilingual advantage? (3) can factors per- taining to the language’s minority status be linked to the presence or absence of a bilingual advantage? The analysis revealed that studies form a highly variable group, with mixed results regarding the bilingual advantage, as well as inconsistent controlling of social, cognitive and linguistic factors, as well as different sample sizes. As such, it was not possible to isolate which factors are responsible for the inconsistent results across studies. It is hoped this study will provide an overview that can serve as common ground for future studies involving the issue of a bilingual advantage with speakers of minority languages. -
Pioneers of Modern Geography: Translations Pertaining to German Geographers of the Late Nineteenth and Early Twentieth Centuries Robert C
Wilfrid Laurier University Scholars Commons @ Laurier GreyPlace 1990 Pioneers of Modern Geography: Translations Pertaining to German Geographers of the Late Nineteenth and Early Twentieth Centuries Robert C. West Follow this and additional works at: https://scholars.wlu.ca/grey Part of the Earth Sciences Commons, and the Human Geography Commons Recommended Citation West, Robert C. (1990). Pioneers of Modern Geography: Translations Pertaining to German Geographers of the Late Nineteenth and Early Twentieth Centuries. Baton Rouge: Department of Geography & Anthropology, Louisiana State University. Geoscience and Man, Volume 28. This Book is brought to you for free and open access by Scholars Commons @ Laurier. It has been accepted for inclusion in GreyPlace by an authorized administrator of Scholars Commons @ Laurier. For more information, please contact [email protected]. Pioneers of Modern Geography Translations Pertaining to German Geographers of the Late Nineteenth and Early Twentieth Centuries Translated and Edited by Robert C. West GEOSCIENCE AND MAN-VOLUME 28-1990 LOUISIANA STATE UNIVERSITY s 62 P5213 iiiiiiiii 10438105 DATE DUE GEOSCIENCE AND MAN Volume 28 PIONEERS OF MODERN GEOGRAPHY Digitized by the Internet Archive in 2017 https://archive.org/details/pioneersofmodern28west GEOSCIENCE & MAN SYMPOSIA, MONOGRAPHS, AND COLLECTIONS OF PAPERS IN GEOGRAPHY, ANTHROPOLOGY AND GEOLOGY PUBLISHED BY GEOSCIENCE PUBLICATIONS DEPARTMENT OF GEOGRAPHY AND ANTHROPOLOGY LOUISIANA STATE UNIVERSITY VOLUME 28 PIONEERS OF MODERN GEOGRAPHY TRANSLATIONS PERTAINING TO GERMAN GEOGRAPHERS OF THE LATE NINETEENTH AND EARLY TWENTIETH CENTURIES Translated and Edited by Robert C. West BATON ROUGE 1990 Property of the LfhraTy Wilfrid Laurier University The Geoscience and Man series is published and distributed by Geoscience Publications, Department of Geography & Anthropology, Louisiana State University. -
What Better Way to Observe Mother Language Day in Liverpool Than to Celebrate the Amazing Tapestry of Languages Within Our City?
International Mother Language Day is observed each year on February 21st! This year's theme is 'Fostering multilingualism for inclusion in education and society'. If you would like to learn more visit: https://en.unesco.org/commemorations/motherlanguageday What better way to observe Mother Language Day in Liverpool than to celebrate the amazing tapestry of languages within our City? EMTAS and the Modern Foreign Languages Team in School Improvement Liverpool have planned an action packed week! Why not join us for a week of lunchtime story sessions delivered in lots of different languages! The story sessions are suitable for children age 3- 7 years. Children must be accompanied by a parent, carer or other supervising adult. We have over 100 languages spoken across our schools in Liverpool. Sadly, we won't be able to showcase all of our languages this time, but please look out for our 'Tapestry of Languages' event during the Summer term! Don't forget to check out our session at 13.00 on Thursday 11th February. Zi Lan, from Pagoda Arts, will be telling us all about Chinese New Year. This session will be delivered in English and is suitable for children age 5-9yrs! Sessions will be delivered online via zoom. Please use the links below to register in advance for any session you would like to attend. After registering, you will receive a confirmation email containing further information about joining the session. 'Let's Celebrate' | FREE Programme of Events Monday 8th February 12 noon Levi Tafari, Performance Poet and urban griot, will be launching