Slavic-Albanian Language Contact, Convergence, and Coexistence


Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By Matthew Cowan Curtis, M.A.
Graduate Program in Slavic Linguistics
The Ohio State University
2012

Dissertation Committee:
Brian D. Joseph, Advisor
Charles E. Gribble
Daniel E. Collins

Copyright by Matthew Cowan Curtis 2012

Abstract

As historical relationships of Slavs and Albanians in the western Balkans have been subject to a wide range of scholarly interpretations, this dissertation seeks to present the facts of linguistic evidence of Slavic-Albanian contact, and apply them to an informed understanding of Slavs’ and Albanians’ interactions historically. Although individual linguistic features are important for establishing the historical fact of language contact, only a systematic, comprehensive analysis of the several interrelated parts of language (vocabulary, phonology, and morphosyntax) can indicate how the languages, and the communities speaking them, have been affected by the long-standing contact. This study also considers the languages from the perspective of several language-contact theories, creating a multifaceted approach that reveals strengths and weaknesses of each theory, and also paints a multidimensional picture of the effects of language contact and socio-cultural reasons for the languages’ changes. This layered analysis demonstrates that contact between Slavs and Albanians has brought about many linguistic changes, particularly in dialects that have remained in contact with one another. While the most obvious effects are the plenteous lexical borrowings, language contact is also present in phonology and morphosyntax, thus affecting every aspect of the dialects in contact.
As the linguistic data shows, Albanian and Slavic communities have enriched one another linguistically and likely in other aspects of their cultural inheritances as well.

Acknowledgements

Writing this dissertation, and conducting the research analyzed in it, has been possible only because of the cooperation and collaboration of many, many individuals. And while I am solely responsible for this work, I surely could not have done this without the help of so many great colleagues, friends, and families. Although I know that I will unwittingly and unintentionally omit some of those who have helped me, I would like to list the principal helpers in my research. Even a list of those who have helped me may give some indication of the collaborative nature of my research and how my work has been enriched by the help of so many willing colleagues.

Particular thanks should be given to my advisor and mentor, Dr. Brian D. Joseph, as well as my other professors in the Department of Slavic and East European Languages and Literatures at the Ohio State University: Charles E. Gribble, Daniel E. Collins, Andrea Sims, Predrag Matejic, and Ludmila Isurin. I am also grateful to professors in the Linguistics Department I have worked with, especially Don Winford and Hope Dawson. Others who have helped refine my ideas include Cynthia Hallen, Henry Cooper, Victor Friedman, Robert Greenberg, Ronelle Alexander, Marc Greenberg, Olga Mladenova, Elisabeth Elliott, Andrej Sobolev, Motoki Nomachi, and Andrew Dombrowski.
These have been tremendous guides in my graduate education at the Ohio State University, yet of equal importance has been my fruitful and enriching collegiality with my fellow graduate students in this department: Maggie Harrison, Susan Vdovechenko, Spencer Robinson, Josh Pennington, Anastasia Smirnova, Lauren Ressue, Miriam Whiting, Marcela Michalkova, Martin Michalek, Larysa Stepanova, Maria Alley, Ljiljana Djuraskovic, Vedrana Mihalicek and many other good friends.

In addition, research for this dissertation was completed with the generous support of a Fulbright-Hays Doctoral Dissertation Research Abroad scholarship. Thanks are due to the administrators of this scholarship, particularly Joanna Kukielka-Blaser at the Ohio State University, as well as public affairs assistants to the United States Embassies: Svetlana Breca in Prishtina, Gazmend Ilazi in Skopje and Bernard Çobaj in Podgorica.

Thanks also to the many scholars who helped me obtain data for this study: Rexhep Ismajli of the Kosovo Academy of Arts and Sciences, Julie and Xhevair Kolgjini of the American University in Kosovo, Lindita Rugova at the University of Prishtina, Arburim Iseni of the State University of Tetovo, Arta Toçi of South East European University, Eleni Bužarovska of Kiril and Methodius University in Skopje, Macedonia and Gordana Đurković of the University of Montenegro, as well as their students. Also deserving of thanks are Marjan Markoviḱ, Zuzana Topolińska, and Liljana Mitkovska at the Research Center for Areal Linguistics at the Macedonian Academy of Arts and Sciences. Thanks also to the many librarians, both in the Balkans and throughout North America, who have helped in this research and allowed me to access materials via interlibrary loan.

Finally, I would like to thank my family for their courage, determination and patience in joining me in these adventures as well as the more prosaic aspects of completing this work and the degree attached to it.
Their loyalty to me has been fierce and constant; without their support this work would have been impossible. To them I also dedicate this dissertation.

Vita

1998: Brighton High School (Salt Lake City, UT)
2003: B.A. Linguistics (Computers and Languages Minor), Brigham Young University
2005: M.A. Russian and East European Studies, Indiana University
2005–2006: University Fellow, The Ohio State University
2006–2009, 2010–2011: Graduate Teaching Associate, Department of Slavic and East European Languages and Literatures, The Ohio State University
2009–2010: Fulbright–Hays Doctoral Dissertation Research Abroad Scholarship Recipient (Kosovo, Macedonia, Albania, Montenegro)
2011: Associate Faculty, Arizona State University, Critical Languages Institute, Intermediate Albanian
2011–2012: Presidential Fellow, The Ohio State University

Publications

Petar II Petrović Njegoš and Gjergj Fishta: Composers of National Epics, The Carl Beck Papers in Russian and East European Studies, 1808. Pittsburgh: University of Pittsburgh Press, 2007.

“Xhorxh, xhuxhmaxhuxh and the xhaxhallarë: The Xenophonemic Status of Albanian /xh/.” Balkanistica 23: 67–96. (2010).

“Explaining compelling similarities in Macedonian and Montenegrin dialects: Perfects and adjectival participles.” Zeitschrift für Balkanologie (46/2) 155–183. (2010).

“Serbo-Croatian and Albanian Attempts at Language Standardization: A Brief History and Comparison from an American Perspective.” State University of Tetovo Scientific Observer (1/2–3) 10–17. (2011).
Fields of Study

Major Field: Slavic Linguistics
Other Fields: Historical–Comparative Linguistics, Sociolinguistics, Balkan Linguistics, Albanian Linguistics, Eastern European Studies

Table of Contents

Abstract
Acknowledgments
Vita
List of Tables
List of Figures
List of Abbreviations
List of Place Names Used in Study
Introduction
Chapter 1: Sociolinguistic Setting
Chapter 2: Lexicon
Chapter 3: Lexicon: Chronology of Borrowings
Chapter 4: Phonology
Chapter 5: Morphosyntax
Chapter 6: Conclusion
Bibliography

List of Tables

Table 2.1. Borrowings from Slavic into Albanian (According to Svane 1992)
Table 2.2. Borrowings from Albanian into Slavic (According to Hoxha 2001)
Table 4.1. Summary of Processes Proposed for Analyzing Language Contact
Table 4.2. Changes Affecting -ur/-ull According to First Proposal