Linguistic Complexity: What Do Albanian Dialects Show?
Linguistic complexity and language (contact) history: The case of Albanian dialects
Maria S. Morozova, Alexander Yu. Rusakov, (Maria A. Ovsjannikova)
Institute for Linguistic Studies of the Russian Academy of Sciences (ILS RAS)
Saint Petersburg State University (SPbSU)
Saint Petersburg, Russia

Balkan Languages and Dialects: Corpus-based and Quantitative Studies
October 18–20, 2018
Institute for Linguistic Studies of the Russian Academy of Sciences (Saint Petersburg, Russia)

Roadmap
1. Introduction
• Goals of the paper
2. Phonetics and grammar: complexity of the Albanian varieties
• What is linguistic complexity?
• Data used in the study
• Complexity on the Albanian dialectal map
• Complexity and closeness of the Albanian varieties
3. Lexicon: comparing “grammatical” and “lexical” data on the Albanian varieties
• Lexicon and closeness of the Albanian varieties
• “Grammatical” and “lexical” closeness
• Complexity and lexical borrowings in the Albanian varieties
• Complexity and unique words in the Albanian varieties
4. Conclusions

Goals of the paper
The study has two interrelated goals.
1. To measure the level of complexity of Albanian varieties. To examine how the complexity level correlates with actual processes in the history of the Albanian language, such as language and ethnic contact situations of different types, isolation of some groups of varieties, population movements, etc., and to try to throw light on the “balkanization” processes in the Albanian-speaking area. Further, it would be interesting to address some issues relevant for the Balkan area as a whole, i.e. the degrees of balkanization of the other Balkan varieties.
2. To verify, deepen, and refine our knowledge of the character of the Albanian dialect division and its history, using the parameter of linguistic complexity and quantitative data on the Albanian dialect lexicon to measure and examine dialect variation and the degree of closeness.

What is linguistic complexity?
• “grammatical complexity – complexity of the strictly linguistic domains of phonology, morphosyntax, lexicon, etc. and their components. <…> [C]omplexity can be measured as follows.
– For each subsystem of the grammar, the number of elements it contains. <…>
– The number of paradigmatic variants, or degrees of freedom, of each such element or set of elements: allophones, allomorphs, declension or conjugation classes. <…>
– Syntagmatic phenomena <…>
– Constraints on elements, alloforms, and syntagmatic dependencies, including constraints on their combination” (Nichols 2009: 111-112).

Data used in the study
• Dialectological Atlas of the Albanian Language (DAAL 2007–2008): dialectal maps with 131 villages in the main area and 14 villages in the diaspora.
• Phonological and morphological data for the Atlas were collected in the 1970s–1980s using a questionnaire with 65 questions on phonology and 80 questions on grammar (DAAL 2007: 437-453).
• The lexical volume of the Atlas (DAAL 2008) maps the local terms for 260 lexical items.
• Albanian varieties for the current study are drawn from the main area:
– 93 located in the Republic of Albania and in the adjacent part of Greece (Çamëri)
– 25 located in Kosovo and in the Republic of Serbia (Preševo)
– 7 located in the Republic of Macedonia
– 6 located in the Republic of Montenegro
• Diaspora varieties spoken in Serbia (Pešter), Croatia (Zadar), Greece, and Italy were not taken into consideration in the study.
• Each of the 131 dialectal varieties was described in terms of 27 binary features (e.g. presence of /θ/ or presence of supercompound verb forms), based on the features represented in DAAL.
1 = presence of a feature
0 = absence of a feature
• All selected features fall into types 1 and 2 of Nichols (2009): the number of elements in phonology and morphology, and the number of paradigmatic variants.
• The complexity of each variety was then calculated as the simple sum of its feature values.
maxCompl = 23
minCompl = 9

Complexity on the Albanian dialectal map
• A color gradient from black to white shows linguistic complexity from 23 to 9.
Observations:
1. A strong decrease in linguistic complexity from north to south, i.e. from the Gheg to the Tosk area.
2. A less pronounced decrease from west to east, especially in the northern part of the map.
Observation 1:
• The complexity of most Gheg varieties ranges from 16 to 23 (except for two locations in Dibra, with total complexity scores of 13 and 14).
• The complexity of all Tosk varieties ranges from 9 to 14.
• This is consistent with the idea of a stronger balkanization of the Tosk area (because of its adjacency to the “center of balkanization” located in the region of Lakes Ohrid and Prespa, see Lindstedt 2000, etc.).
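The scoring scheme described under “Data used in the study” can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variety names, the feature labels, and all values below are invented placeholders rather than DAAL data, only four of the 27 features are shown, and the linear grey-level mapping is one possible reading of the black-to-white grading used on the map.

```python
# Hypothetical toy data: variety -> binary feature vector (1 = present, 0 = absent).
# Labels and values are placeholders for illustration, not taken from DAAL.
FEATURES = ["theta", "supercompound_forms", "feature_3", "feature_4"]

varieties = {
    "variety_A": [1, 1, 1, 1],
    "variety_B": [1, 0, 1, 0],
    "variety_C": [0, 0, 1, 0],
}


def complexity(vector):
    """Complexity of one variety: the simple sum of its binary feature values."""
    if any(v not in (0, 1) for v in vector):
        raise ValueError("features must be coded as 0 or 1")
    return sum(vector)


def grey_level(score, lo, hi):
    """Map a score linearly to a grey level (assumed mapping):
    0.0 = black (highest complexity), 1.0 = white (lowest complexity)."""
    if hi == lo:
        return 0.0
    return (hi - score) / (hi - lo)


scores = {name: complexity(vec) for name, vec in varieties.items()}
max_compl = max(scores.values())
min_compl = min(scores.values())
shades = {name: grey_level(s, min_compl, max_compl) for name, s in scores.items()}
```

With the full data set, `max_compl` and `min_compl` would correspond to the maxCompl and minCompl values reported in the slides.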
Complexity on the Albanian dialectal map
Observation 2:
• The decrease in complexity from west to east, visible especially in the northern part of the map, correlates to some extent with the actual contact situation. E.g. the Northeastern Gheg varieties spoken in Kosovo and the Central Gheg varieties spoken in Macedonia and on the border with it, where Albanian-Slavic contact is ongoing, are less complex than the Northern Gheg and Central Gheg varieties in the territory of Albania.
• Some correlation between relief and the level of linguistic complexity was observed.
• The Northwestern Gheg (NWG) varieties in mountainous Northern Albania (Albanian Alps) and in the isolated area around Lake Skadar are more likely to be complex than the Northeastern Gheg (NEG) varieties in the central part of Kosovo and the plateau of Dukagjin.
• In the central part of Albania, less complex varieties are spoken on the coast (Durrës, Kavajë), while more complex ones are found in the highlands.

Complexity and closeness of the Albanian varieties
• Multidimensional scaling (MDS) in R was applied to assess and visualize the closeness of the Albanian varieties. Cf. a similar study by Andrew Dombrowski (2014), a network analysis of the dialects of Macedonian.
– Formally, each variety is ideally associated with a 27-dimensional vector.
– Comparisons are calculated over the comparable features shared by a pair of varieties.
– Shorter/longer distances between points in the MDS plot correspond to a higher/lower degree of closeness of the varieties.

Closeness of the Albanian varieties
• The surprisingly large distance between (all) Gheg and (all) Tosk dialects may point to the secondary character of the old Gheg/Tosk border, which follows the river Shkumbin in the middle of Albania (i.e. the Gheg/Tosk separation did not arise in situ, see Rusakov 2013), may reflect subsequent ethnic and linguistic changes in the Tosk area (massive language shift of the Slavic and Aromanian population, see Desnitskaya 1976), or may result from a combination of the two.
[Figure: Complexity and closeness of the Albanian varieties; legend: COMPLEXITY, lower to higher]

Lexicon: comparing “grammatical” and “lexical” data on the Albanian varieties
• For this part of the study we analyzed 219 lexical maps from DAAL. The maps mainly reflect the semantic fields of trees and plants, wild and domestic animals, household terms, and names for objects of material culture.
• All lexemes were classified as “inherited words” (including Ancient Greek and Latin loanwords) or “borrowings”.
• Each of the 131 dialectal varieties was described in terms of 219 non-binary features.
• Multidimensional scaling (MDS) in R was applied to measure the closeness of the varieties, as in the grammatical part.
[Figures: Lexicon and closeness of the Albanian varieties]

“Grammatical” and “lexical” closeness
• Both sets of results correlate very well with the traditional dialect classification.
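The pairwise comparison that feeds the MDS step can be illustrated with a short Python sketch. The slides state only that comparisons are based on the comparable features shared by a pair of varieties and that MDS was run in R. The measure used here (the share of mismatches among features attested in both varieties), the use of `None` for a missing value, and all names and data are assumptions made for the illustration, not the authors' procedure. The same function handles binary grammatical features and non-binary lexical ones.

```python
from itertools import combinations


def distance(a, b):
    """Share of mismatching features among those attested in both varieties.
    None marks a missing value; such features are skipped (assumed convention)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no comparable features")
    return sum(x != y for x, y in shared) / len(shared)


def distance_matrix(data):
    """Symmetric distance matrix as nested dicts, with zeros on the diagonal."""
    names = sorted(data)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        matrix[n][m] = matrix[m][n] = distance(data[n], data[m])
    return names, matrix


# Hypothetical toy data, not taken from DAAL.
grammar = {  # binary features
    "variety_A": [1, 1, 0, 1],
    "variety_B": [1, 0, 0, None],
    "variety_C": [0, 0, 1, 0],
}
lexicon = {  # non-binary features: the local term chosen for each lexical item
    "variety_A": ["term_1", "term_5", None],
    "variety_B": ["term_1", "term_6", "term_9"],
    "variety_C": ["term_2", "term_6", "term_9"],
}

names, gram_dist = distance_matrix(grammar)
_, lex_dist = distance_matrix(lexicon)
```

A matrix of this kind is the usual input to an MDS routine such as R's `cmdscale`; in the resulting plot, a small distance between two points means the varieties are close.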
“Grammatical” and “lexical” closeness
• Lexical parameters support the previously expressed idea of a great distance between the Gheg and Tosk dialects. On the other hand, they show a degree of closeness between the Southern Gheg and Northern Tosk dialects, which may indicate relatively late contacts between Gheg and Tosk around the Shkumbin river.
[Figure panels: “grammatical” closeness, “lexical” closeness]
“Grammatical” and “lexical”