Reuse of Free Resources in Machine Translation between Nynorsk and Bokmål

Kevin Unhammer
Department of Linguistics
University of Bergen
Bergen, Norway
[email protected]

Trond Trosterud
Department of Linguistics
University of Tromsø
Tromsø, Norway
[email protected]

J.A. Pérez-Ortiz, F. Sánchez-Martínez, F.M. Tyers (eds.) Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation, p. 35–42. Alacant, Spain, November 2009

Abstract

We describe the development of a two-way shallow-transfer machine translation system between Norwegian Nynorsk and Norwegian Bokmål built on the Apertium platform, using the Free and Open Source resources Norsk Ordbank and the Oslo–Bergen Constraint Grammar tagger. We detail the integration of these and other resources in the system along with the construction of the lexical and structural transfer, and evaluate the translation quality in comparison with another system. Finally, some future work is suggested.

1 Introduction

The term Norwegian covers a variety of related spoken dialects. Up until the 1800’s, Danish was the only written standard used in Norway. Bokmål emerged through various reforms which brought the written language closer to the spoken; Nynorsk however, was created from the ground up with the purpose of representing all the spoken dialects of Norway. As it is, certain dialects (especially around the Oslo area) correspond more with Bokmål, while others are closer to Nynorsk. Nynorsk is “in a minority position in Norway, with approximately 12% of the users” (Everson and Trosterud, 2000), or around 450,000 people.

Although Nynorsk is in a minority position, there are quite good linguistic resources available under Free licences, compared to many languages with the same amount of speakers. We describe the creation of apertium-nn-nb, a machine translation (MT) system between Nynorsk and Bokmål¹ built using these resources with the Free and Open Source Apertium platform (Armentano-Oller et al., 2006). In the following section we give an overview of the Apertium platform and Constraint Grammar. Section 3 describes how the available resources were integrated into Apertium, and how we dealt with lexical and syntactic transfer (for which we did not have Free resources available). As Bokmål and Nynorsk are mutually intelligible, a ‘gisting’ system would not find much use, our aim is to make the translations acceptable for post-editing; in the last two sections we give an evaluation of the translation quality in light of this, and a discussion of the lessons learnt and how the system may be further improved.

¹ Available from http://apertium.org

2 Design

2.1 The Apertium Pipeline

The Nynorsk–Bokmål language pair follows the design of the Apertium system, a highly modular, shallow-transfer pipeline MT system. Dictionaries written in XML are compiled into finite state transducers, so that word-for-word translations are possible in both directions using only two monolingual dictionaries (morphological analysis/generation) and one translational (transfer) dictionary. Both dictionary types make use of paradigms to e.g. generalise over common suffix sets (and their analyses), and directional constraints, which state that a certain entry may be analysed, but not generated, or vice versa.

Hidden Markov models (HMM’s) are used after analysis for part-of-speech disambiguation. The transfer module is finite state based and handles three-stage chunking transfer, although we so far only use one-stage transfer, applying operations directly on patterns of morphological categories (described in further detail in section 3.5). Output from the transfer module is fed to morphological generation. De-/reformatters applied to the beginning and end of the pipeline let us preserve formatting of various document types.

2.2 Constraint Grammar

This language pair differs from most of the other Apertium pairs in using a Constraint Grammar² (CG) module as a pre-disambiguator (before the HMM). CG’s (Karlsson, 1990) are hand-written rules which, given ambiguously tagged input (e.g. the English word ‘read’ tagged both as a past and present tense verb), may SELECT one reading/analysis over all the others, or REMOVE a certain reading from the set of analyses. The last reading is never removed. We may end up with several readings if the input in fact was ambiguous or the grammar didn’t manage to remove what it should. CG’s may also MAP (add) new tags to readings, typically syntactic function labels. Rules may check in either direction for the existence of tags or even specific words, over absolute or undefined distances.

CG’s have been shown to be robust in handling unseen text, as well as reaching high accuracy levels. CG is also the only grammar-based method to give results comparable to statistical taggers. Where statistical taggers have been shown to have a ceiling under 97%³, Bick (2000, p. 187–188) cites 99% precision and recall when fully disambiguating with a CG tagger for Portuguese.

In an MT context the important point is that the good CG results have made it possible to present robust rule-based MT. Good examples are Bick and Hansen (2007).

In the next section we describe the development and use of the Apertium modules, including CG, in apertium-nn-nb.

² Using VISL CG-3, http://beta.visl.sdu.dk/cg3.html
³ Leech et al. (1994); Brants (2000); Brill and Pop (1997) all cite accuracy results between 96% and 97%. Both Chanod and Tapanainen (1995) and Samuelsson and Voutilainen (1997) compare statistical and CG taggers.

3 Development

3.1 Resources

As our basis for the morphological analysis and generation, we used Norsk Ordbank⁴, a GPL full form dictionary with over 100,000 lemmas. We also used the morphological disambiguator of the Oslo–Bergen Tagger (OBT), a high quality GPL Constraint Grammar (Hagen et al., 2000). Both of these use the same tag scheme. They were converted into Apertium formats and tag schemes, as described below. The bilingual dictionary and the transfer rules (for syntactic differences and agreement) were built from scratch. The following sections detail the process.

⁴ http://www.edd.uio.no/prosjekt/ordbanken/

3.2 Analysis and generation

Like most Apertium language pairs, we use lttoolbox for morphological analysis and generation, which compiles XML-formatted entries into fast finite state transducers and allows generalisations to be made across e.g. common suffix paradigms.

The full form dictionary entries (with morphological information like lemma, POS, inflection, etc.) in Norsk Ordbank were semi-automatically transformed into the lttoolbox format. First, one paradigm was created per lemma (always creating the longest possible suffix), then any duplicate paradigms were merged. Closed classes (e.g. pronouns, determiners) were added manually.

3.3 Disambiguation

The OBT and Norsk Ordbank use a different tagset from Apertium. We want the data from apertium-nn-nb to be useful in creating new Apertium language pairs, so we converted the tags to ones which conform as much as possible to other Apertium dictionaries. Most tags could be replaced one-to-one, although some were replaced with CG sets. To exemplify the latter: the OBT uses the tags subst.appell and subst.prop for common and proper nouns, where Apertium uses n and np respectively, so rules working on the single tag subst were changed to work on the set consisting of the tags n and np. Most of this conversion was done using simple shell scripts.

The Constraint Grammar runs as a pre-disambiguator, and does not always manage to remove all spurious analyses. We run Apertium’s statistical disambiguator module after this step to make a final choice. An unsupervised bigram model (Baum-Welch algorithm, 8 iterations) was trained on Wikipedia text using the apertium-tagger. Although the Apertium toolset allows for more advanced statistical models (Sheikh, 2009) and methods for parameter estimation (Sánchez-Martínez et al., 2008), so far we have instead worked on improving the CG where we spotted errors. Certain errors in the disambiguation might be easier to spot when working with MT, and improvements to the CG could be of benefit to others using the OBT. When disambiguating for MT, it is important to keep in mind that we always have to end up with only one analysis, thus our version of the OBT is slightly more aggressive in removing readings. E.g. we use the following “heuristic” rule:

    REMOVE (n) IF (0 adj)(-1 det)(1 subst);

[...] with this method though. One is that it may introduce a lot of false friends. For closely related languages with such high overlap in the lexicon, the benefit outweighs the risk (and lists of common false friends are not hard to come by in grammars). The other problem is that we add many “radical forms”, e.g. Bokmål words which exist in the Nynorsk dictionary but are far from being the most natural sounding Nynorsk translation. We can easily put restrictions on these forms (or on all forms with a certain substring) so that they are only analysed, but not generated; but finding all such pairs involves some work.

We also added entries where there were predictable changes, e.g. the Bokmål adjective suffix -lig will typically be -leg in Nynorsk, etc. This process, also used by Tyers et al. (2009, p. 4), consists of

1. finding Bokmål entries without translations
2. running string replacements on these for typical differences in substrings
3. checking whether the altered entries actually exist in the Nynorsk analyser

Running this method gave about 2500 nouns and verbs⁵.
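The three steps listed for harvesting predictable entries can be sketched in Python as follows. This is only an illustration under stated assumptions, not the scripts used for apertium-nn-nb: the function names and the toy word lists are hypothetical, the replacements are restricted to word-final substrings, and a plain set lookup stands in for querying the compiled Nynorsk analyser. The only replacement taken from the text is -lig to -leg.

```python
# Hypothetical replacement table; only ("lig", "leg") comes from the text.
REPLACEMENTS = [("lig", "leg")]


def propose_candidates(nb_lemma, replacements=REPLACEMENTS):
    """Step 2: rewrite a word-final substring of a Bokmål lemma."""
    return [
        nb_lemma[: -len(old)] + new
        for old, new in replacements
        if nb_lemma.endswith(old)
    ]


def find_new_pairs(nb_lemmas, translated, nn_lexicon, replacements=REPLACEMENTS):
    """Return (Bokmål, Nynorsk) candidate pairs for the bilingual dictionary.

    nb_lemmas  -- Bokmål lemmas from the monolingual dictionary
    translated -- Bokmål lemmas that already have a translation
    nn_lexicon -- stand-in for the Nynorsk analyser (assumption: a set of lemmas)
    """
    pairs = []
    for lemma in nb_lemmas:
        if lemma in translated:  # step 1: skip entries that are already covered
            continue
        for candidate in propose_candidates(lemma, replacements):  # step 2
            if candidate in nn_lexicon:  # step 3: keep only attested forms
                pairs.append((lemma, candidate))
    return pairs


# Toy data, for illustration only.
pairs = find_new_pairs(
    nb_lemmas=["vanlig", "farlig", "hus", "deilig"],
    translated={"hus"},
    nn_lexicon={"vanleg", "farleg", "hus"},
)
```

Step 3 is what keeps the method conservative: a rewritten form such as the hypothetical "deileg" is discarded because the stand-in lexicon does not contain it.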
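The effect of the heuristic rule quoted in section 3.3, REMOVE (n) IF (0 adj)(-1 det)(1 subst);, can be illustrated with a small Python model. This is a simplified sketch and not how VISL CG-3 is implemented: readings are modelled as sets of tags, `subst` is assumed to be the set {n, np} described in section 3.3, contexts are evaluated against the unmodified input, and a cohort is left untouched if the rule would otherwise remove its last reading. The function names and the example sentence are hypothetical.

```python
# Assumption: `subst` is the set that replaced the single OBT tag, i.e. n or np.
SUBST = {"n", "np"}


def has_tag(cohort, tags):
    """True if any reading in the cohort carries at least one of the tags."""
    return any(reading & tags for reading in cohort)


def remove_n_between_det_and_noun(sentence):
    """Model of: REMOVE (n) IF (0 adj)(-1 det)(1 subst);

    sentence -- list of cohorts; each cohort is a list of readings (sets of tags)
    """
    result = []
    for i, cohort in enumerate(sentence):
        in_context = (
            0 < i < len(sentence) - 1               # both neighbours must exist
            and has_tag(cohort, {"adj"})            # (0 adj)
            and has_tag(sentence[i - 1], {"det"})   # (-1 det)
            and has_tag(sentence[i + 1], SUBST)     # (1 subst)
        )
        kept = [r for r in cohort if not (in_context and "n" in r)]
        # The last reading is never removed: keep the cohort as it was
        # if nothing would survive.
        result.append(kept if kept else cohort)
    return result


# Toy input for "ei tysk bok": "tysk" is ambiguous between noun and adjective.
sentence = [[{"det"}], [{"n"}, {"adj"}], [{"n"}]]
disambiguated = remove_n_between_det_and_noun(sentence)
```

In the toy input only the middle cohort meets all three contextual tests, so its noun reading is dropped and the adjective reading survives; the final noun is untouched because it has no adjective reading.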