Corpora and Processing Tools for Non-Standard Contemporary and Diachronic Balkan Slavic

Total Page:16

File Type:pdf, Size:1020Kb

Corpora and Processing Tools for Non-Standard Contemporary and Diachronic Balkan Slavic Zurich Open Repository and Archive University of Zurich Main Library Strickhofstrasse 39 CH-8057 Zurich www.zora.uzh.ch Year: 2019 Corpora and Processing Tools for Non-Standard Contemporary and Diachronic Balkan Slavic Vukovic, Teodora ; Nora, Muheim ; Winistörfer, Olivier-Andreas ; Anastasia, Makarova ; Ivan, Šimko ; Sanja, Bradjan Abstract: The paper describes three corpora of different varieties of BS that are currently being developed with the goal of providing data for the analysis of the diatopic and diachronic variation in non-standard Balkan Slavic. The corpora includes spoken materials from Torlak, Macedonian dialects, as well as the manuscripts of pre-standardized Bulgarian. Apart from the texts, tools for PoS annotation and lemmatization for all varieties are being created, as well as syntactic parsing for Torlak and Bulgarian varieties. The corpora are built using a unified methodology, relying on the pest practices and state- of-the-art methods from the field. The uniform methodology allows the contrastive analysis of thedata from different varieties. The corpora under construction can be considered a crucial contribution tothe linguistic research on the languages in the Balkans as they provide the lacking data needed for the studies of linguistic variation in the Balkan Slavic, and enable the comparison of the said varieties with other neighbouring languages. Posted at the Zurich Open Repository and Archive, University of Zurich ZORA URL: https://doi.org/10.5167/uzh-175260 Conference or Workshop Item Published Version The following work is licensed under a Creative Commons: Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License. Originally published at: Vukovic, Teodora; Nora, Muheim; Winistörfer, Olivier-Andreas; Anastasia, Makarova; Ivan, Šimko; Sanja, Bradjan (2019). Corpora and Processing Tools for Non-Standard Contemporary and Diachronic Balkan Slavic. In: The 12th International Conference on Recent Advances in Natural Language Process- ing (RANLP 2019), Varna, Bulgaria, 2 September 2019 - 4 September 2019, 62-68. Corpora and Processing Tools for Non-Standard Contemporary and Diachronic Balkan Slavic Teodora Vukovic´ Nora Muheim Olivier Winistorfer¨ Ivan Simkoˇ Anastasia Makarova Sanja Bradjan Slavic Departement, University of Zurich, Plattenstrasse 43, Zurich [email protected], [email protected], [email protected], [email protected], [email protected], [email protected] Abstract inflection and fully or partially grammaticalized post-positive definite articles (Lindstedt, 2000). The paper describes three corpora of Many other differences lie in the nominal and ff di erent varieties of BS that are currently verbal morphosyntax, adjectival morphology, and being developed with the goal of providing tense and mood system, as well as in the lexical and data for the analysis of the diatopic phraseological domain. Nonetheless, the area itself and diachronic variation in non-standard is not linguistically uniform. On the contrary - the Balkan Slavic. The corpora includes diversification starts from the division into official spoken materials from Torlak, Macedonian standard languages, further separated by various dialects, as well as the manuscripts of phonological and structural isoglosses (Ivic´, 1985; pre-standardized Bulgarian. Apart from Institute for Bulgarian Language, 2018; Stojkov, the texts, tools for PoS annotation 2002; Vidoeski, 1999). Variation occurs even in and lemmatization for all varieties are very small regions, such as the Timok area in being created, as well as syntactic South East Serbia, where substantial differences parsing for Torlak and Bulgarian varieties. occur in the use of post-posed articles between The corpora are built using a unified villages in the mountains and valleys or urban methodology, relying on the pest practices areas (Vukovic´ and Samardziˇ c´, 2018). There is a and state-of-the-art methods from the significant variation not only from an areal, but field. The uniform methodology allows also from a diachronic point of view. Looking at the contrastive analysis of the data from older texts written in pre-standardized Bulgarian, ff di erent varieties. The corpora under phases of change in the language happening over construction can be considered a crucial time are noticeable. contribution to the linguistic research on This variation is best analyzed in non-standard the languages in the Balkans as they varieties, unaffected or minimally affected by the provide the lacking data needed for the prescriptive standard, and ideally in the form of studies of linguistic variation in the Balkan spoken language since it represents arguably a Slavic, and enable the comparison of the more "natural"state. In this paper, we present three said varieties with other neighbouring ongoing projects on South Slavic dialectology and languages. diachronic linguistics, which are currently being 1 Introduction elaborated at the Slavic linguistics department at the University of Zurich. In these projects, special Balkan Slavic (BS) languages are the eastern focus is on places in time and space where the branch of South Slavic languages, which are change happens and the underlying grammatical known for their affiliation to the so-called processes and structures. Apart from linguistic Balkansprachbund. The languages that belong to questions, the focus of our research is on the this group are Bulgarian and Macedonian varieties identification of potential geographical and social as well as the Torlak dialects spoken in South factors influencing changes in Bulgarian, Torlak East Serbia and West Bulgaria. These languages and Macedonian. express typological differences from other Slavic For the purpose of this research, we are languages. Some of the differentiating features are: aiming at three corpora (and also processing complete or almost complete loss of the noun case 62 Proceedings of the Student Research Workshop (RANLPStud 2019) associated with RANLP 2019, pages 62–68 Varna, Bulgaria, 2 – 4 September, 2019. tools) for modern and historical non-standard in size. Texts were transcribed into Latin and BS varieties that reflect diachronic and diatopic Cyrillic script in order to make the data available variation within the Balkan Slavic languages. The to a wider audience. The texts are annotated with varieties covered are pre-standardized Bulgarian, grammatical information, lemmas, English glosses Macedonian dialects, and Torlak varieties from for each token, and English translations for each Serbia and Bulgaria. Apart from the large line (see the annotated sentence in example 1, collection of texts, the resources are equipped presented as it is in the corpus). The MSD are with part-of-speech (PoS) and morphosyntactic easily readable, but do not fin the standards used description (MSD) annotation, while some corpora in the field. The database represents an impressive also include a syntactic tree-bank and a layer achievement, and a particularly valuable one given of normalization. In order to make the corpora that it is the sole resource for dialectal research on mutually comparable, we are developing and BS at the moment. applying a uniform methodology. Methodology, (1) hm kvo´ se corpora and tools are based on the existing disc what.Sg.N.Interr Acc.Refl.Clt standards used in the field as well as resources . какво се for the standard languages of the area, wherever kazva´ seˇ say.3Sg.Impf.I available. This enables inter-comparability and казвам reproducibility of the data. At the same time re- ‘Hmmmm. What is it called?’ using already existing tools makes the process easier for the creators and it offers well-established The corpora described in this paper are methods too, which are discussed below. created with the goal to be comparable with other existing related resources – be it corpora 2 Related Work of dialects or of the standard language. The automatic processing tools developed for the There is currently little data available in digital language varieties included in the collection are form that allows the analysis of the mentioned created based on existing ones whenever possible. variation and change in BS. Bulgarian is the For example, the morphosyntactic tagger and only language supplemented with a corpus of lemmatizer for Torlak is an adaptation of the tagger non-standard spoken varieties apart from the for standard Serbian (Miliceviˇ c´ and Ljubesiˇ c´, standard ones (Alexander, Ronelle and Zhobov, 2016a). Vladimir, 2016; Alexander, 2015). Serbian only An important tool for Serbian is the ReLDI has corpora of written standard and computer- tagger, which assigns morphosyntactic tags and mediated communication (CMC) (CCSL; Utvic´, lemmas to text (Ljubesiˇ c´, 2016). The tagger 2011; Ljubesiˇ c´ and Klubickaˇ , 2014; Miliceviˇ c´ and was developed for standard Serbian, Croatian, Ljubesiˇ c´, 2016b). The only available searchable and Slovene using a character-level machine resource for Macedonian is an unannotated corpus translation method. It assigns tags specified by the of movie subtitles (Steinberger et al., 2014; Lison MULTEXT-East standards (Krstev et al., 2004). and Tiedemann, 2016). The mentioned corpora The python package allows training for any other of standard language (or movie subtitles) are variety with an novel training set as an input extremely valuable resources in itself, but they (Ljubesiˇ c´, 2016). provide little-to-no insight into variation because MULTEXT-East resources represent an they are a sample of only one variety by definition;
Recommended publications
  • 109. Formation of the Standard Language: Macedonian
    1470 XVIII. Slavische Standardsprachen 109. Formation of the Standard Language: Macedonian 1. The Beginnings of Local Schreibsprachen 2. Attempts at Codifying a Literary Language on a Dialectal Basis 3. Formation of a Macedonian Standard Language as Part of “Nation-Building” 4. Recent Developments since Independence in 1991 5. Codification and Use of Macedonian in the Diaspora 6. Literature (selected) Abstract In the 19 th century a number of churchmen in Macedonia wrote sermons and primers in a form of Slavonic that was close to vernacular speech. These can be seen as the first texts of written Macedonian, although their authors did not intend to create a new literary language. The first document of explicit Macedonian linguistic and political separatism is Rečnik od tri jezika (1875). In his book Za makedonckite raboti Krste P. Misirkov (1874Ϫ1926) advocated a literary language based on the Prilep-Bitola dialect, but the book was never distributed. After the partition of Macedonia (1913) Macedonian literary activity was tolerated in Bulgaria and Yugoslavia as “dialect literature”. There were thriv- ing theatre groups in Skopje and Sofia and Kočo Racin from Veles wrote poetry that has retained its literary value. The communist partisans used the local dialects in their pamphlets in Vardar Macedonia during the Axis occupation and in 1944 they declared that Macedonian would be the official language of the new Macedonian republic. The decisive codificational steps were Koneski’s grammar (1952Ϫ1954) and the ‘academy dictionary’ (1961Ϫ1966). Independence has terminated the Serbian/Macedonian diglos- sia that was characteristic of Vardar Macedonia in the 20 th century. There have been various attempts to codify Macedonian in the diaspora.
    [Show full text]
  • Proper Language, Proper Citizen: Standard Practice and Linguistic Identity in Primary Education
    Proper Language, Proper Citizen: Standard Linguistic Practice and Identity in Macedonian Primary Education by Amanda Carroll Greber A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Department of Slavic Languages and Literatures University of Toronto © Copyright by Amanda Carroll Greber 2013 Abstract Proper Language, Proper Citizen: Standard Linguistic Practice and Identity in Macedonian Primary Education Doctor of Philosophy 2013 Amanda Carroll Greber Department of Slavic Languages and Literatures University of Toronto This dissertation analyzes how the concept of the ideal citizen is shaped linguistically and visually in Macedonian textbooks and how this concept changes over time and in concert with changes in society. It is focused particularly on the role of primary education in the transmission of language, identity, and culture as part of the nation-building process. It is concerned with how schools construct linguistic norms in association with the construction of citizenship. The linguistic practices represented in textbooks depict “good language” and thus index also “good citizen.” Textbooks function as part of the broader sets of resources and practices with which education sets out to make citizens and thus they have an important role in shaping young people’s knowledge and feelings about the nation and nation-state, as well as language ideologies and practices. By analyzing the “ideal” citizen represented in a textbook we can begin to discern the goals of the government and society. To this end, I conduct a diachronic analysis of the Macedonian language used in elementary readers at several points from 1945 to 2000 using a combination of qualitative and quantitative methods.
    [Show full text]
  • Eight Fragments Serbian, Croatian, Bosnian
    EIGHT FRAGMENTS FROM THE WORLD OF MONTENEGRIN LANGUAGES AND SERBIAN, CROATIAN, SERBIAN, CROATIAN, BOSNIAN SERBIAN, CROATIAN, BOSNIAN AND FROM THE WORLD OF MONTENEGRIN EIGHT FRAGMENTS LANGUAGES Pavel Krejčí PAVEL KREJČÍ PAVEL Masaryk University Brno 2018 EIGHT FRAGMENTS FROM THE WORLD OF SERBIAN, CROATIAN, BOSNIAN AND MONTENEGRIN LANGUAGES Selected South Slavonic Studies 1 Pavel Krejčí Masaryk University Brno 2018 All rights reserved. No part of this e-book may be reproduced or transmitted in any form or by any means without prior written permission of copyright administrator which can be contacted at Masaryk University Press, Žerotínovo náměstí 9, 601 77 Brno. Scientific reviewers: Ass. Prof. Boryan Yanev, Ph.D. (Plovdiv University “Paisii Hilendarski”) Roman Madecki, Ph.D. (Masaryk University, Brno) This book was written at Masaryk University as part of the project “Slavistika mezi generacemi: doktorská dílna” number MUNI/A/0956/2017 with the support of the Specific University Research Grant, as provided by the Ministry of Education, Youth and Sports of the Czech Republic in the year 2018. © 2018 Masarykova univerzita ISBN 978-80-210-8992-1 ISBN 978-80-210-8991-4 (paperback) CONTENT ABBREVIATIONS ................................................................................................. 5 INTRODUCTION ................................................................................................. 7 CHAPTER 1 SOUTH SLAVONIC LANGUAGES (GENERAL OVERVIEW) ............................... 9 CHAPTER 2 SELECTED CZECH HANDBOOKS OF SERBO-CROATIAN
    [Show full text]
  • The Dialect of Novo Selo, Vidin Region: a Contribution to the Study of Mixed Dialects
    University of Calgary PRISM: University of Calgary's Digital Repository Arts Arts Research & Publications 2003 The dialect of Novo Selo, Vidin Region: A contribution to the study of mixed dialects Младенова, Олга М.; Mladenova, Olga M.; Младенов, Максим Сл.; Mladenov, Maxim Sl. http://hdl.handle.net/1880/44632 Other Downloaded from PRISM: https://prism.ucalgary.ca Максим Сл. Младенов. Говорът на Ново село, Видинско. Принос към проблема за смесените говори. София: Издателство на БАН, 1969 [Трудове по българска диалектология, кн. 6]. The Dialect of Novo Selo, Vidin Region: A Contribution to the Study of Mixed Dialects English summary by Olga M. Mladenova Maxim Sl. Mladenov, the author of the 1969 monograph reprinted in this volume,1 was a native speaker of the Novo-Selo dialect who preserved his fluency in the dialect all his life. The material for the monograph was collected over a period of about fourteen or fifteen years in the 1950s and the1960s. His monograph is not the first description of this dialect. It follows Stefan Mladenov’s study of the language and the national identity of Novo Selo (Vidin Region) published in Sbornik za narodni umotvorenija, nauka i knižnina, vol. 18, 1901, 471-506. Having conducted his research at a later date, M. Sl. Mladenov had the opportunity to record any modifications that had taken place in the dialect in the conditions of rapid cultural change, thus adding an extra dimension to this later study of the Novo-Selo dialect. His is a considerably more detailed survey of this unique dialect, which is the outcome of the lengthy coexistence of speakers of different Bulgarian dialect backgrounds in a Romanian environment.
    [Show full text]
  • The Common Slavic Element in Russian Culture
    COLUMBIA UNIVERSITY DEPARTMENT OF SLAVIC LANGUAGES SLAVIC STUDIES Slavic Philology Series NIKOLAI TRUBETZKOY THE COMMON SLAVIC ELEMENT IN RUSSIAN CULTURE Edited by Leon Stilman Copyright 1949 by the Ikpartmmt of Slavic Languqp Columk univmity The preparation md publication of the aavsrml seriea of work. wder UyZC -1ES hmrm been madm paseible by m gt~t from the Rockefeller Qoundmtion to the Dapartmat of Slrrie Professof N. Trubetzkoy's study on The Cannon Slavic Eleaent in Russian Culture was included in a volume of his collected writings which appeared in 1927, in Paris, under the general title K #roblcme russkogo scwo#o~~anijo.Tbe article was trans- lated fm the Russian bg a group of graduate students of the Departant of Slavic Languages, Columbia Universi tr, including: Ime Barnsha, Hamball Berger, Tanja Cizevslra, Cawrence G, Jones, Barbara Laxtimer, Henry H. Hebel, Jr., Nora B. Sigerist- Beeson and Rita Slesser, The editor fobad it advisable to eli- atnate a number of passqes and footnotes dealing with minor facts; on the other bad, some additions (mainly chro~ologieal data) were made in a fen iwstances; these additions, ia most instances, were incorporated in tbe text in order to amid overburdening it with footnotes; they are purely factual in nature md affect In no the views and interpretations of tbe author. L. S. CONTENTS I Popular ad literarp lan@=ge.- Land11.de and d1abct.- Pxot+Slavic: itn dlalnte$ratlon: Bouthorn, Weatern and EwGern Slavi0.- Li torarr landuadem: thelr evolutiarr: their cnlatlon to apoken vernsaulam ..... 11 Old Church Slevonle: Its origiao and Its role.- The early reeensLma.- Old Bulgmrian Church Slavonlc and its progaget1on.- Church Blavoaie in Russia: sound changes; the Eastern and Wentern Russian trnditloa: the the second South Slavic influenca: the uakfled Ruseisn rocenaim ..........
    [Show full text]
  • UC Berkeley GAIA Books
    UC Berkeley GAIA Books Title Revitalizing Bulgarian Dialectology Permalink https://escholarship.org/uc/item/9hc6x8hp Authors Alexander, Ronelle Zhobov, Vladimir Publication Date 2004-07-28 Supplemental Material https://escholarship.org/uc/item/9hc6x8hp#supplemental Peer reviewed eScholarship.org Powered by the California Digital Library University of California Revitalizing Bulgarian Dialectology Edited by Ronelle Alexander & Vladimir Zhobov Published in association with University of California Press Introduction This volume summarizes the results of a joint North American - Bulgarian research project in dialectology, which culminated in a joint field expedition in the summer of 1996. The project was co-directed by Professor Ronelle Alexander of the University of California, Berkeley, and Professor Todor Bojadz¬iev and then Assistant Professor Vladimir Zµobov, both of Sofia University. The field research team was composed of three teachers (in which Professor Alexander represented the American side and Vladimir Zµobov and Georgi Kolev represented the Bulgarian side), three North Americans who were at the time graduate students in Slavic linguistics (Jonathan Barnes and Matthew Baerman at the University of California, Berkeley, and Elisabeth Elliott at the University of Toronto) and three Bulgarians who were at the time undergraduate students in Slavic philology at Sofia University (Tanya Delc¬eva, Kamen Petrov and Pet¿r Sµis¬kov); Krasimira Koleva of Sµumen University also joined the team during the first phase of the expedition. Field research was carried out in three different regions of Bulgaria: the villages of Kozic¬ino and Golica (often referred to together as the “Erkec¬” dialect, after the older name of the first of these villages) in eastern Bulgaria, the town of Trjavna (plus outlying villages) in north central Bulgaria, and the villages of Gela and Stik¿l (near Sµiroka L¿ka) in the Rhodope mountains.
    [Show full text]
  • Zive Naj Vsi Narodi K I Hrepene Docakat Dan D E Koder Sonce Hodi P Repir I Z Sveta Bo Pregnan D E Rojak P Rost Bo Vsak N E Vrag
    12. letno srečanje Združenja za slovansko jezikoslovje Povzetki prispevkov 12th Slavic Linguistics Society Annual Meeting Book of Abstracts XII ежегодная конференция Общества славянского языкознания Тезисы докладов Ljubljana, 21.–24. 9. 2017 v / / Zive naj vsi narodi / / / K i hrepene docakatv dan , / / / D e koder sonce hodi , / / P repir/ iz sveta bo/ pregnan , / D e rojak / P rost bo vsak , / / / N e vrag, le sosed bo mejak. 12. letno srečanje Združenja za slovansko jezikoslovje: Povzetki prispevkov 12th Slavic Linguistics Society Annual Meeting: Book of Abstracts XII ежегодная конференция Общества славянского языкознания: Тезисы докладов ISBN: 978-961-05-0027-8 Urednika / Editors / Редакторы: Luka Repanšek, Matej Šekli Recenzenti / Peer-reviewers / Рецензенты: Aleksandra Derganc, Marko Hladnik, Gašper Ilc, Zenaida Karavdić, Simona Kranjc, Domen Krvina, Nina Ledinek, Frančiška Lipovšek, Franc Marušič, Tatjana Marvin, Petra Mišmaš, Matic Pavlič, Анастасия Ильинична Плотникова / Anastasiia Plotnikova, Luka Repanšek, Михаил Николаевич Саенко / Mikhail Saenko, Дмитрий Владимирович Сичинава / Dmitri Sitchinava, Vera Smole, Mojca Smolej, Marko Snoj, Petra Stankovska, Andrej Stopar, Saška Štumberger, Hotimir Tivadar, Mitja Trojar, Mladen Uhlik, Mojca Žagar Karer, Rok Žaucer, Andreja Žele, Sašo Živanović Vodja konference / Conference leadership / Руководитель конференции: Matej Šekli Organizacijski odbor konference / Conference board / Оргкомитет конференции: Branka Kalenić Ramšak, Domen Krvina, Alenka Lap, Oto Luthar, Tatjana Marvin, Mojca
    [Show full text]
  • Peace Corps Bislama an Introductory Guide to the Bulgarian Language
    An Introductory Guide to the Bulgarian Language Hosted for free on livelingua.com AN INTRODUCTORY GUIDE TO THE BULGARIAN LANGUAGE For those who are interested in facts and information, the following is a short description of the Bulgarian Language. Bulgarian is a Slavonic language, like Russian, Czech, Polish, and Serbian. Its history is centuries old and it is the earliest written language. It is a phonetic language, i.e. you pronounce what is written. Of course, first you get familiar with the Cyrillic alphabet! Overall, it is easy to be mastered by English speakers: its structure is similar to English; besides so many Volunteers have learned it and speak it fluently. Naturally there are some distinctive features like the gender of nouns and the verb system. The Old Bulgarian period dates back to the creation of the alphabet, the Glagolitsa, (circa 862 - 863 AD) by the Thessaloniki monks, Cyril and Methodius. The invention of the Glagolitic alphabet, comprised of 41 letters, constituted an original creative act. It was a new graphic system and ingenious creation, exactly adapted to the phonetic peculiarities of the Old Bulgarian language. Between the ninth and eleventh centuries yet another script was created in Bulgaria - the Cyrillic alphabet, or Kirilitsa. It includes the 24 letters of the Greek titular code lettering to which several new signs have been added for the sounds peculiar to the Old Bulgarian tongue. The Cyrillic script is used by (among others) Bulgarians, Russians, Ukrainians, Byelorussians, Serbians, Macedonians and Mongolians, who adopted it from the Russians. The Modern Bulgarian period is the third phase in the historical evolution of the traditional language of the Old Bulgarian and Middle Bulgarian periods.
    [Show full text]
  • Macedonian: Genealogy, Typology and Sociolinguistics
    JEZIKOSLOVNI ZAPISKI 26 2020 2 43 MATEJ šEKLI MACEDONIAN: GENEALOGY, TYPOLOGY AND SOCIOLINGUISTICS COBISS: 1.01 HTTPS://DOI.ORG/10.3986/JZ.26.2.04 Makedonščina: geneo-, tipo- in sociolingvistična opredelitev V prispevku je makedonščina opredeljena geneo-, tipo- in sociolingvistično. S stališča genealoškega jezikoslovja je s pomočjo relativne kronologije in zemljepisne razširjeno- sti relevantnih jezikovnih sprememb prikazano oblikovanje makedonščine in bolgarščine kot geolektov znotraj vzhodne južne slovanščine. Z gledišča tipološkega jezikoslovja je umeščena v kontekst balkanske jezikovne zveze. Sociolingvistični pogled pa prikaže pro- ces standardizacije in pravni položaj sodobnega makedonskega knjižnega/standardnega jezika kot sociolekta. Ključne besede: makedonščina, bolgarščina, genealosko jezikoslovje, tipološko jeziko- slovje, sociolingvistika The article attempts to define Macedonian from the view-point of linguistic genealogy and typology as well as sociolinguistics. The genesis of Macedonian and Bulgarian as geolects within Eastern South Slavic is discussed from the vantage point of genealogical linguistics, using the relative chronology and the geographical distribution of the individ- ual linguistic changes. The typological part of the discussion then attempts to establish the position of Macedonian in the context of the so-called Balkan Sprachbund. Finally, the process of standardisation and the legal status of modern Macedonian literary/standard language as a sociolect are presented, thus shedding additional light on the
    [Show full text]
  • Language Ideologies of the Bunjevac Minority in Vojvodina: Historical Backgrounds and the Post-1991 Situation
    Language Ideologies of the Bunjevac Minority in Vojvodina: Historical Backgrounds and the Post-1991 Situation Masumi Kameda 1. Introduction 1-1. Overview Bunjevci (singular: Bunjevac) are South Slavic Catholic people sit- uated mainly in the autonomous province of Vojvodina (especially that of Bačka region1 in the northwest part of Serbia), southern Hungary, Cro- atian coastal area (Dalmatia and Lika), and in western Herzegovina. The Bunjevac dialect2 is a Štovakian dialect form of the western South Slavic languages and shows Ikavian reflexes of Common Slavic vowel jat’.3 The modern realizations of jat’ (e, ije/je, and i) are named Ekavian, (I) jekavian and Ikavian, and Ikavian variant is characteristic for the speech 1 The Bačka region is today divided into Hungarian and Serbian sections. 2 This paper mostly refers to the language of Bunjevci as the Bunjevac “dia- lect” following the official current denomination “Bunjevački govor” (literally meaning “Bunjevac speech”) in the Republic of Serbia. 3 The Serbo-Croatian speaking territory is divided into three major dialect areas, named after three forms of the interrogative pronoun “what”: Štokavian, Kajkavian, and Čakavian. Štokavian is the base of standard Serbian, Croatian, Bosnian, and Montenegrin languages, and on the other hand, Kajkavian and Čakavian is dialect forms of Croatian. The subdivisions of the dialectical varia- tions are based on the accentual system and reflexes of jat’. - 95 - MasuMi KaMeda of Istrian Dalmatian region of Croatia, while Ekavian is commonly as- sociated with standard Serbian, and (I)jekavian with standard Croatian. The main phonological features of Bunjevac dialect are as follows: (1) strong Ikavian, (2) loss of phoneme h or its replacement by v and j, (3) shortened form of ao / eo to o, and (4) loss of non-accented i.
    [Show full text]
  • Old Church Slavonic: an Elementary Grammar 1
    0. INTRODUCTION 0.1 What is Old Church Slavonic? The Slavonic languages belong to the Indo-European Group of languages, to which the Germanic languages (which include English) belong, as well as the Romance languages, which include Latin, the Celtic languages, the Iranian languages, Sanskrit, Greek and Armenian. The Slavonic languages are usually divided into three groups: West Slavonic, including Polish and Czech and Upper and Lower Sorbian; East Slavonic, comprising Russian, Byelorussian and Ukrainian, and South Slavonic; the South Slavonic group consists of Slovenian, Serbo-Croat, Macedonian and Bulgarian. The first Slavonic language to be committed to writing was a South Slavonic dialect of Bulgarian or Macedonian type, and is called Old Church Slavonic because of its function. (In some works it is called Old Bulgarian, but this is best avoided as Old Church Slavonic is not identical with the Bulgarian language of that time.) For the relationship of Old Church Slavonic with Indo- European, see G. Nandrif, Old Church Slavonic Grarmar^ London, 1959. In the year 862, Prince Rostislav of Moravia, which was then an important grouping of Slavonic peoples in Central Europe, sent a request to the Byzantine Emperor Michael III for Slavonic- speaking missionaries to spread Christianity in his lands and counter the influence of German clergy. Constantine and his brother Methodius were chosen because the former had already proved himself as a scholar and missionary, while the latter was an experienced administrator. Both were natives of Salonica where there was a Slavonic-speaking population. The brothers' work in Moravia was approved by the pope, and in 869 Constantine and Methodius travelled to Rome where Constantine died, having taken monastic vows and assumed the name of Cyril.
    [Show full text]
  • On Some Recent Pomak Writing Activities in Greece: Ethno-Cultural Context and Linguistic Peculiarities
    ESUKA – JEFUL 2011, 2 – 1: 261 – 272 ON SOME RECENT POMAK WRITING ACTIVITIES IN GREECE: ETHNO-CULTURAL CONTEXT AND LINGUISTIC PECULIARITIES Maria Manova Humboldt University of Berlin Abstract. Despite numerous attempts at codifying their language, the Pomaks in Greece, a linguistic as well as religious minority, do not generally put into writing this variety, which is considered to be a Bulgarian dialect. Up until about fifteen years ago, there was an absence of any kind of lexicographic tradition. The grammars, dic- tionaries etc. that appeared in Greece in the mid-1990’s can be classified as “external” codifications, since most of them were made by the majority. Over the last few years, however, an increasing minority-activism has changed the situation somewhat. Some writ- ing has begun to emerge from the community, but the variety is still far from fitting the criteria for micro-literacy, the codification of which is difficult due to the different idiolectal varieties of the language actors, which are far away from a uniform orthographical norm as well as an alphabet. However, the publications in the mi- nority language are seen as evidence of cultural emancipation and linguistic vitality. This article deals with the issues of language and literacy among the Pomaks in Greece and presents a case study of the ethno-linguistic orientation of the currently most productive Pomak language activist’s writings. Keywords: minority languages in the Balkans, identity, language planning, written use of minority languages, lexical modernization 1. Introduction The Pomaks converted to Islam during the period of Otto- man rule in the Balkans.
    [Show full text]