Jazykovedný Ústav Ľudovíta Štúra

JAZYKOVEDNÝ ÚSTAV ĽUDOVÍTA ŠTÚRA SLOVENSKEJ AKADÉMIE VIED

JAZYKOVEDNÝ ČASOPIS
Vedecký časopis pre otázky teórie jazyka
JOURNAL OF LINGUISTICS
Scientific Journal for the Theory of Language

Volume 70, 2019, No. 2

Editor-in-chief: doc. Mgr. Gabriela Múcsková, PhD.

Managing Editors: PhDr. Ingrid Hrubaničová, PhD., Mgr. Miroslav Zumrík, PhD.

Editorial Board: PhDr. Klára Buzássyová, CSc. (Bratislava); prof. PhDr. Juraj Dolník, DrSc. (Bratislava); PhDr. Ingrid Hrubaničová, PhD. (Bratislava); doc. Mgr. Martina Ivanová, PhD. (Prešov); Mgr. Nicol Janočková, PhD. (Bratislava); Mgr. Alexandra Jarošová, CSc. (Bratislava); prof. PaedDr. Jana Kesselová, CSc. (Prešov); PhDr. Ľubor Králik, CSc. (Bratislava); PhDr. Viktor Krupa, DrSc. (Bratislava); doc. Mgr. Gabriela Múcsková, PhD. (Bratislava); Univ. Prof. Mag. Dr. Stefan Michael Newerkla (Vienna, Austria); Associate Prof. Mark Richard Lauersdorf, Ph.D. (Kentucky, USA); prof. Mgr. Martin Ološtiak, PhD. (Prešov); prof. PhDr. Slavomír Ondrejovič, DrSc. (Bratislava); prof. PaedDr. Vladimír Patráš, CSc. (Banská Bystrica); prof. PhDr. Ján Sabol, DrSc. (Košice); prof. PhDr. Juraj Vaňko, CSc. (Nitra); Mgr. Miroslav Zumrík, PhD. (Bratislava); prof. PhDr. Pavol Žigo, CSc. (Bratislava).

Technical editor: Mgr. Vladimír Radik

Published by: Jazykovedný ústav Ľudovíta Štúra Slovenskej akadémie vied
- in print by SAP – Slovak Academic Press, s. r. o.
- electronically by Sciendo – De Gruyter: https://content.sciendo.com/view/journals/jazcas/jazcas-overview.xml

Editorial address: Jazykovedný ústav Ľ. Štúra SAV, Panská 26, 811 01 Bratislava
Contact: [email protected]

The electronic version of the journal is available at: http://www.juls.savba.sk/ediela/jc/

Published triannually
Publication date of the current issue (2019/70/2): October 2019

CiteScore 2018: 0.24
SCImago Journal Rank (SJR) 2018: 0.122
Source Normalized Impact per Paper (SNIP) 2018: 0.476

JAZYKOVEDNÝ ČASOPIS / JOURNAL OF LINGUISTICS is covered by the following services: Baidu Scholar; Cabell's Directory; CEJSH (The Central European Journal of Social Sciences and Humanities); CNKI Scholar (China National Knowledge Infrastructure); CNPIEC – cnpLINKer; Dimensions; DOAJ (Directory of Open Access Journals); EBSCO (relevant databases); EBSCO Discovery Service; ERIH PLUS (European Reference Index for the Humanities and Social Sciences); Genamics JournalSeek; Google Scholar; IBR (International Bibliography of Reviews of Scholarly Literature in the Humanities and Social Sciences); IBZ (International Bibliography of Periodical Literature in the Humanities and Social Sciences); International Medieval Bibliography; J-Gate; JournalGuide; JournalTOCs; KESLI-NDSL (Korean National Discovery for Science Leaders); Linguistic Bibliography; Linguistics Abstracts Online; Microsoft Academic; MLA International Bibliography; MyScienceWork; Naviga (Softweco); Primo Central (ExLibris); ProQuest (relevant databases); Publons; QOAM (Quality Open Access Market); ReadCube; SCImago (SJR); SCOPUS; Sherpa/RoMEO; Summon (ProQuest); TDNet; Ulrich's Periodicals Directory/ulrichsweb; WanFang Data; WorldCat (OCLC).
ISSN 0021-5597 (print) ISSN 1338-4287 (online) MIČ 49263

NLP, Corpus Linguistics, Language Dynamics and Change
SLOVKO 2019

Thematic issue of Jazykovedný časopis devoted to natural language processing, corpus linguistics, language dynamics and change.

Guest editors: Mgr. Jana Levická, PhD., Mgr. Miroslav Zumrík, PhD.

CONTENTS

Foreword
Predhovor

CORPUS-BASED/DRIVEN LINGUISTIC AND DH RESEARCH

Lucia Jasinská: Colloquial Lexemes in Journalistic Texts
Pavla Kochová: Frequency in Corpora as a Signal of Lexicalization (On the Absolute Usage of Comparative and Superlative Adjectives)
Jakub Sláma and Barbora Štěpánková: On the Valency of Various Types of Adverbs and Its Lexicographic Description
Mária Šimková: The Synchronic Dynamics of Words Ending in -ita/-osť
Jakob Horsch: Slovak Comparative Correlatives: New Insights
Marianna Hudcovičová: Analysis of Verbal Prepositional “of” Structures
Paula Kyselica and René M. Genis: Temporal ‘since’ in Slovak: Conjunction(s) and Aspect Choice – a Corpus Study
Zuzana Komrsková and Petra Poukarová: In Which Clause Do Subordinate Conjunctions Prosodically Belong?
Yulia Kuvshinskaya: Russian Indefinite Pronoun kakoi-libo: Non-standard Usage and Changes in the Semantics
Victor Zakharov: Ways of Automatic Identification of Words Belonging to Semantic Fields
Zuzana Černá and Radek Čech: Analysis of the Lemma mateřství (Motherhood)
Igor Boguslavsky and Leonid Iomdin: Corpus-supported Semantic Studies: Part/Whole Expressions in Russian
Radek Čech, Pavel Kosek, Olga Navrátilová and Ján Mačutek: Wackernagel’s Position and Contact Position of Pronominal Enclitics in Older Czech. Competition or Cooperation?
Oksana Nika and Svitlana Hrytsyna: Frequency Dictionary of 16th Century Cyrillic Written Monument
Jana Kobzová: Kinship Terminology in Western Slavic Languages Based on Corpora Analysis
Adrian Jan Zasina: Gender-specific Adjectives in Czech Newspapers and Magazines

CORPUS BUILDING

Maciej Ogrodniczuk, Rafał L. Górski, Marek Łaziński and Piotr Pęzik: From the National Corpus of Polish to the Polish Corpus Infrastructure
Marie Kopřivová, Zuzana Komrsková, Petra Poukarová and David Lukeš: Relevant Criteria for Selection of Spoken Data: Theory Meets Practice
Hana Goláňová and Martina Waclawičová: The DIALEKT Corpus and Its Possibilities
Michaela Mošaťová and Katarína Gajdošová: Annotation in the Corpus of Texts of Students Learning Slovak as a Foreign Language (ERRKORP)
Vladimír Petkevič, Jaroslava Hlaváčová, Klára Osolsobě, Martin Svášek and Josef Šimandl: Parts of Speech in NovaMorf, a New Morphological Annotation of Czech
Klára Osolsobě and Hana Žižková: Improving Nominalized Adjectives Tagging
Jaroslava Hlaváčová, Marie Mikulová, Barbora Štěpánková and Jan Hajič: Modifications of the Czech Morphological Dictionary for Consistent Corpus Annotation
Mija Bon and Polona Gantar: Levels of Annotation in the Slovene Training Corpus ssj500k 2.2

CREATION AND USE OF LANGUAGE RESOURCES

Zdeňka Urešová, Eva Fučíková, Eva Hajičová and Jan Hajič: Meaning and Semantic Roles in CzEngClass Lexicon
Magda Ševčíková and Lukáš Kyjánek: Introducing Semantic Labels into the DeriNet Network
Veronika Kolářová, Anna Vernerová and Jonathan Verner: Non-systemic Valency Behavior of Czech Deverbal Nouns Based on the NomVallex Lexicon
Václava Kettnerová and Markéta Lopatková: Towards Reciprocal Deverbal Nouns in Czech: From Reciprocal Verbs to Reciprocal Nouns
Erik Citterberg and Adriana Válková: Processing of Derivational Features for (Semi) Automatic Creation of Dictionary Definitions in the User Interface (CZEDD) for Learning Czech as a Second Language: Suffix -tel and -ista
Katharina Prochazka, Ludwig Maximilian Breuer and Agnes Kim: Conception and Development of an Open Database System on Historical Multilingualism in Austria

NATURAL LANGUAGE PROCESSING

Petr Pořízka: On Possibilities and Methods of Analysis of Thematic Expressions in Spoken Texts
Róbert Sabo, Peter Krammer, Ján Mojžiš and Marcel Kvassay: Identification of Spontaneous Spoken Texts in Slovak
Marek Grác, Markéta Masopustová and Marie Valíčková: Affordable Annotation of the Mobile App Reviews

FOREWORD

“There is one part of corpus linguistics that becomes more and more significant, namely, the one that is not primarily concerned only with corpus building or the issues of representativeness, possibilities of lemmatization and morphological or other types of tagging. It comprises methods of corpus mining, i.e., methods broadening language descriptions and enhancing their quality, revealing new relations and inventing new concepts for description of language reality.”
V. Cvrček and T. Kováříková

A reading of the papers in the anniversary SLOVKO 2019 proceedings shows that the 2011 quote by V. Cvrček and T. Kováříková from the journal Naše řeč still holds. The observation, however, does not diminish the relevance of either part of corpus-aided language research. Our ambition for this year's 10th anniversary international linguistic conference SLOVKO was therefore to structure the proceedings so that they reflect the rich thematic and methodological diversity of contemporary corpus linguistics and natural language processing.

The first of the four groups covers “corpus mining” research. Corpus-based and corpus-driven linguistic and DH-related papers use corpora as a well-established resource for a deeper understanding of language phenomena, e.g. specific grammatical, lexical, or semantic issues of individual Slavic languages from a synchronic or diachronic point of view.
The second group of papers is dedicated to issues of corpus and digital infrastructure building, especially the possibilities for improving corpus tagging. The third group of papers focuses on the creation and use of language resources (lexicons
