GOVOR / SPEECH Zagreb, Volume 37, 2020, Issue 2, (2021) Online version: ISSN 1849-2126 UDC 81'34(05)"540.6" CODEN GOVOEB Print version: ISSN 0352-7565
Publisher
Department of Phonetics, Croatian Philological Society (Odjel za fonetiku Hrvatskoga filološkog društva)

Editorial board
Gordana VAROŠANEC-ŠKARIĆ, editor-in-chief
Petra ACZÉL, Corvinus University, Budapest, Hungary
Dana BOATMAN, Johns Hopkins Hospital, Baltimore, USA
Almasa DEFTERDAREVIĆ, Faculty of Philosophy, Sarajevo, Bosnia and Herzegovina
Mária GÓSY, Hungarian Academy of Sciences, Budapest, Hungary
William J. HARDCASTLE, Queen Margaret University, Edinburgh, UK
Damir HORGA, Faculty of Humanities and Social Sciences, Zagreb, Croatia
Patricia KEATING, University of California, Los Angeles, USA
Nikolaj LAZIĆ, Faculty of Humanities and Social Sciences, Zagreb, Croatia
Marko LIKER, Faculty of Humanities and Social Sciences, Zagreb, Croatia
Vesna MILDNER, Faculty of Humanities and Social Sciences, Zagreb, Croatia
Elenmari PLETIKOS OLOF, Faculty of Humanities and Social Sciences, Zagreb, Croatia
Vesna POŽGAJ HADŽI, Faculty of Arts, Ljubljana, Slovenia
Ján SABOL, Faculty of Arts, Košice, Slovakia
Irena SAWICKA, Faculty of Philology, Toruń, Poland
Mirjana SOVILJ, Institute for Experimental Phonetics and Speech Pathology "Đorđe Kostić", Belgrade, Serbia
Jelena VLAŠIĆ DUIĆ, Faculty of Humanities and Social Sciences, Zagreb, Croatia

Secretary: Diana TOMIĆ
Language editor: Katarina VARENICA
Executive secretary: Ana VIDOVIĆ ZORIĆ
Proofreader: Marica ŽIVKO
Cover design: Zlatko ŠIMUNOVIĆ
Graphic design and layout: Jordan BIĆANIĆ, Department of Phonetics, Faculty of Humanities and Social Sciences, Zagreb

Contributions published in Govor are indexed in the following secondary sources: ERIH Plus, Scopus, Linguistics Bibliography Online, LLBA – Linguistics and Language Behavior Abstracts, MLA International Bibliography, FRANCIS, MLA Directory of Periodicals, Communication Source, ProQuest Linguistics Collection, Elsevier.

Editorial office address
Filozofski fakultet, Odsjek za fonetiku, I. Lučića 3, 10 000 Zagreb, Croatia
Phones: 385 (0)1 409 23 74, 385 (0)1 409 20 97, 385 (0)1 409 20 98
E-mail: [email protected], [email protected], [email protected]
The electronic version is available at: http://www.hfiloloskod.hr/index.php/casopisi/govor

This issue was printed with the financial support of the Ministry of Science and Education of the Republic of Croatia.
Printed by: Tiskara "Rotim i Market", Lukavec
Print run: 150 copies

SADRŽAJ / CONTENTS

Jelena KUVAČ KRALJEVIĆ, Gordana HRŽICA, Lana KOLOGRANIĆ BELIĆ
Croatian Corpus of Non-Professional Written Language – Typical speakers and speakers with language disorders
Hrvatski korpus neprofesionalnoga pisanog jezika osoba s jezičnim poremećajima i osoba bez jezičnih poremećaja
125-147

Philipp WASSERSCHEIDT, Marija MANDIĆ, Nadine VOLLSTÄDT, Ana JOVANOVIĆ, Ivana TANASIJEVIĆ, Teodora VUKOVIĆ, Ivana VUČINA SIMOVIĆ, Uliana YAZHINOVA, Anđelka ZEČEVIĆ
Corpus-based analysis of spoken narratives. Introducing a corpus and a search tool
Korpusna analiza govornoga pripovijedanja. Prikaz korpusa i alata za pretragu
149-178

Ana LEKO KRHEN, Gordana HRŽICA, Natalija KOKOT
Sintaktičke sposobnosti djece koja mucaju
Syntactic skills of children who stutter
179-204

Agnieszka KAŁDONEK-CRNJAKOVIĆ
Teaching an FL to students with ADHD
Poučavanje stranoga jezika učenicima s ADHD-om
205-222

Elenmari PLETIKOS OLOF
Review of the book Forenzična fonetika [Forensic Phonetics] by Gordana Varošanec-Škarić. Zagreb, IBIS grafika, 2019.
223-229

Mihaela MATEŠIĆ, Biljana STOJANOVSKA
XXXIV International Scientific Conference of HDPL, Jezično i izvanjezično u međudjelovanju [The Linguistic and the Extralinguistic in Interaction]. Split, Croatia, 24-26 September 2020.
231-238

Diana TOMIĆ
Report on the work of the Department of Phonetics from October 2016 to June 2020
239-245

Upute autorima (information for authors, in Croatian)
247-250

Information for authors
251-254

GOVOR 37, 2020, 2, (2021) 125

Original scientific paper
Manuscript received 6 November 2019. Accepted for publication 3 March 2021.
https://doi.org/10.22210/govor.2020.37.07

Jelena Kuvač Kraljević, Gordana Hržica
[email protected], [email protected]
Faculty of Education and Rehabilitation Sciences, University of Zagreb
Croatia

Lana Kologranić Belić
[email protected]
Polyclinic for the Rehabilitation of Listening and Speech SUVAG, Zagreb
Croatia

Croatian Corpus of Non-Professional Written Language – Typical speakers and speakers with language disorders

Summary

Corpora, as annotated archives of human communication, are objective, reliable resources for language analysis. Here we present the corpus of non-professional written Croatian, based on one year of sampling of writings by typical speakers and speakers with language disorders. This corpus provides a unique resource because it samples language used by non-professionals, in contrast to corpora based on texts by professional writers (such as journalists, scholars or novelists) sampled over more than a century. In addition, our corpus contains written language from typical and impaired speakers sampled under identical conditions, allowing detailed analyses of language use. This paper describes the language tasks (essay, story generation, non-formal and formal letter, and dictation) used to elicit text production, and the procedures for sampling and annotation used to generate the corpus. Its usefulness is illustrated through language productivity analyses of transcripts of different genres produced by writers of different ages and language status.
This corpus may prove useful for the analysis of writing skills in typical and language-impaired speakers of Croatian.

Keywords: Croatian Corpus of Non-Professional Written Language, written language, genres, language disorders

1. INTRODUCTION

A corpus is a body of written text or transcribed speech that can serve as an objective, reliable basis for linguistic analysis and description (Kennedy, 2014). The history of text analysis can be traced back to the 13th century, when the Christian Bible was manually indexed, and particularly impressive growth in the development of language corpora has occurred in the past 50 years. During this time, various types of corpora have been developed in different languages. They have been used in a range of areas, such as language teaching and learning, forensic linguistics, translation studies, sociolinguistics, and pragmatics (see McCarthy & O’Keeffe, 2010).

If a corpus is to serve as a source of evidence for linguistic descriptions and analyses of human communicative ability, it should linguistically describe a speaker’s language performance (Leech, 1992, p. 107). Linguistic competence and performance are too complex to be described adequately by introspection and elicitation alone (Svartvik, 1992). Therefore, corpus analysis should be seen as complementary to other methods of language analysis, including experiments. Indeed, a corpus is an empirical basis for testing principles of linguistic theories (Kennedy, 2014).

Corpora can be compiled for many different purposes, and the purpose helps determine corpus size, style and content. General or core corpora consist of a body of texts that enable linguists to address questions related to vocabulary, grammar or discourse structure. Examples are the British National Corpus (www.natcorp.ox.ac.uk/corpus/index.xml) and the Croatian National Corpus (Tadić, 2009).
Specialized corpora, in contrast, are designed with specific purposes in mind. Croatian examples are the Croatian Child Language Corpus (Kovačević, 2002), which provides information about the specificity of child language development; the Croatian Adult Spoken Language Corpus (HrAL; Kuvač Kraljević & Hržica, 2016), which provides information about spoken grammar and lexicon in adulthood; and the Croatian Discourse Corpus of Speakers with Aphasia (CroDA; Kuvač Kraljević, Hržica, & Lice, 2017), which supports analyses of spoken discourse skills and error production of adult speakers with aphasia. All three corpora are available within TalkBank (https://talkbank.org), a large database of spoken-language corpora covering different languages (MacWhinney, 2002; MacWhinney, Fromm, Forbes, & Holland, 2011; MacWhinney & Wagner, 2010).

Most corpora of written language are based on carefully selected texts produced largely by professional writers. Corpora of professional writing provide much useful information but cannot be representative of everyday written language use, such as in emails, letters, notes, essays, and business correspondence. Spoken corpora are much more likely to include non-professional speakers, but there is a great discrepancy in the size of written and spoken corpora. Raso and Mello (2014) warn that moving towards big data in corpus linguistics does not necessarily fill the gap in linguistic resources, i.e., it does not provide linguists with the means to study spoken language. The same can be said of non-professional writing. Such resources are rare and often restricted to a small number of words and a limited number of genres. For example, Schler, Koppel, Argamon, and Pennebaker (2006) collected a relatively large corpus of 140 million words, but it is restricted to blogs.