Glottometrics 40 2018
RAM-Verlag
ISSN 2625-8226

Glottometrics
Indexed in ESCI by Thomson Reuters and SCOPUS by Elsevier

Glottometrics is a scientific journal for the quantitative research on language and text published at irregular intervals (2-3 times a year).
Contributions in English or German written with a common text processing system (preferably WORD) should be sent to one of the editors.
Glottometrics can be downloaded from the Internet, obtained on CD-ROM (in PDF) or in the form of printed copies.

Editors
G. Altmann, Univ. Bochum (Germany)
K.-H. Best, Univ. Göttingen (Germany)
R. Čech, Univ. Ostrava (Czech Republic)
F. Fan, Univ. Dalian (China)
P. Grzybek, Univ. Graz (Austria)
E. Kelih, Univ. Vienna (Austria)
R. Köhler, Univ. Trier (Germany)
H. Liu, Univ. Zhejiang (China)
J. Mačutek, Univ. Bratislava (Slovakia)
A. Mehler, Univ. Frankfurt (Germany)
G. Wimmer, Univ. Bratislava (Slovakia)
P. Zörnig, Univ. Brasilia (Brazil)

External academic peers for Glottometrics
Prof. Dr. Haruko Sanada, Rissho University, Tokyo, Japan (http://www.ris.ac.jp/en/)
Link to Prof. Dr. Sanada: http://researchmap.jp/read0128740/?lang=english
Prof. Dr. Thorsten Roelcke, TU Berlin, Berlin, Germany (http://www.tu-berlin.de/)
Link to Prof. Dr. Roelcke: http://www.daf.tu-berlin.de/menue/deutsch_als_fremd- und_fachsprache/personal/professoren_und_pds/prof_dr_thorsten_roelcke/

Orders for CD-ROM or printed copies to RAM-Verlag
Downloading: http://www.ram-verlag.de

Die Deutsche Bibliothek – CIP-Einheitsaufnahme
Glottometrics. – 40 (2018). – Lüdenscheid: RAM-Verlag, 2018
Published at irregular intervals. Also available on the Internet as an electronic resource at http://www.ram-verlag.eu.
Bibliographic description based on 40 (2018)
ISSN 1617-8351

Contents

Alexander Mehler, Rüdiger Gleim, Andy Lücking, Tolga Uslu, Christian Stegbauer
On the Self-similarity of Wikipedia Talks: a Combined Discourse-analytical and Quantitative Approach, 1-45

Anastasia Gnatciuc, Hanna Gnatchuk
Linking Elements of German Compounds in the Texts of Technical Science, 46-50

Pavel Kosek, Radek Čech, Olga Navrátilová, Ján Mačutek
On the Development of Old Czech (En)clitics, 51-62

Sergej Andreev, Fengxiang Fan, Gabriel Altmann
Adnominal Aggregation, 63-76

Biyan Yu, Yue Jiang
Probability Distribution of Syntactic Divergences of Determiner his-(adjective)-Noun Structure in English-to-Chinese Translation, 77-90

Yu Yang, Se-Eun Jhang
A Menzerath-Altmann Model for NP length and Complexity in Maritime English, 91-103

Xinying Chen, Carlos Gómez-Rodríguez, Ramon Ferrer-i-Cancho
A Dependency Look at the Reality of Constituency, 104-106

Glottometrics 40, 2018, 1-45

On the Self-similarity of Wikipedia Talks: a Combined Discourse-analytical and Quantitative Approach¹

Alexander Mehler,² Rüdiger Gleim, Andy Lücking, Tolga Uslu and Christian Stegbauer

Abstract: Do the talk pages in Wikipedia, referred to as Wikicussions, exhibit effects of mass communication?
In order to provide an answer to this question, we assess Wikicussions from the point of view of dialog theory and identify characteristics specific to this webgenre. We then show that webgenres of this sort evolve into a state of multidimensional scale invariance that is simultaneously reflected on several syntactic and pragmatic dimensions – irrespective of the underlying topic being discussed and the composition of the underlying community of discussants. We also show that a system exhibiting multidimensional scale invariance interferes with thematic classification. The resulting confusability of the gestalt of Wikicussions in terms of their thematic provenance and their underlying participation structure is not just caused by the predominance of small units. Rather, it also concerns larger or even the largest Wikicussions. According to these findings, we distinguish two sorts of self-similarity of Wikipedia’s discussion space: horizontally, regarding thematically demarcated subparts of this space, and vertically, regarding the gestalt of top-level sections in relation to Wikicussions. Our analysis is exemplified by means of the discussion space of the German Wikipedia. The results suggest that a quantitative discourse analysis of big dialogical data as provided by Wikicussions is a promising way to explain the peculiarities of this medium: it can be a starting point for a corresponding theory formation.

Keywords: webgenre, Wikicussion, dialog theory, quantitative discourse analysis, multidimensional scale invariance, self-similarity

1. Introduction

Wikipedia is a genuine webgenre (Santini et al. 2010) that integrates several subgenres such as articles, portals, and so-called talk pages. Talk pages are the subject of this article. They manifest multiparty multi-threaded online conversations in which multitudes of discussants (i.e., prosumers in the sense of Tapscott and Williams (2008)) may participate.
Talk pages serve as forums for debating the content of collaboratively written articles in order to improve, for example, their quality – as in the case of task-oriented article talk pages (Gómez et al. 2011) – or to communicate self-expression – as in the case of user talk pages³ (Kittur et al. 2007b; Laniado et al. 2011; Laniado et al. 2012; Iosub et al. 2014).⁴ Article talk pages serve a wide range of functions in support of coordinating work on Wikipedia (Backstrom et al. 2013). This includes, for example, planning of editing activities, conflict resolution, communicating or negotiating Wikipedia’s goals, norms and policies, or extending Wikipedia as a knowledge base or even as a software system (Bryant et al. 2005; Arazy et al. 2011; Viégas et al. 2004; Viégas et al. 2007; Schneider et al. 2011; Schneider et al. 2012). In this way, talk pages transport social influence in social communities of online collaborating users in a way that never existed before the advent of this medium: Wikipedia’s prosumers build “online communities of practice” (Bryant et al. 2005; Hara et al. 2010) for knowledge sharing as well as for sharing practices of knowledge sharing. Whereas the shareability (Freyd 1983) of the former kind of knowledge is addressed by article talk pages, the shareability of the latter meta-knowledge is the topic of so-called Wikipedia talk pages (Hara et al. 2010).

¹ This article is dedicated to Reinhard Köhler on the occasion of his 65th anniversary.
² Text Technology Lab, Goethe University Frankfurt, Robert-Mayer-Straße 10, D-60325 Frankfurt am Main, Germany.
³ Note that self-expression, for example, by means of authority claims also concerns article talk pages (Bender et al. 2011; Oxley et al. 2010; Marin et al. 2011).

The status of Wikipedia as a novel webgenre and of talk pages as one of its subgenres is justified in several ways.
Researchers claim, for example, that wiki media have fundamentally changed the way people communicate since they affect fundamental processes such as opinion formation and collective problem solving (Wang et al. 2012). Others claim that Wikipedia has changed the status of collective work (Welser et al. 2011) outweighing the work on ancestor genres (e.g., offline encyclopedia). This qualitative innovation is said to be accompanied by a quantitative one regarding the “exponential growth of asynchronous online conversations” (Hoque and Carenini 2015) manifested by media such as Twitter, blogs and talk pages. Unlike face-to-face dialogs or multilogs, online discussions are open in terms of space, time (Kaltenbrunner and Laniado 2012), participation structure and subtopics under discussion though being restricted by the framing topic of the corresponding article.

Wikipedia establishes the largest encyclopedia that ever existed (Iosub et al. 2014) by means of the collaboration of a multitude of editors in a self-organized manner subject to a loose governance (Arazy et al. 2011). As a by-product of writing encyclopedias, this cooperation is also seen as a source for the formation of collective memories (Ferron and Massa 2014). In order to approach these and related goals, Wikipedia has to balance out (i) the needs of a wide range of users regarding (ii) a variety of subgoals subject to (iii) a diversity of boundary conditions thereby entering into fluent equilibria of all included variables (as exemplified by Kittur et al. (2007a) regarding Wikipedia’s participation structure):

1. The first range of variables comprises (readers in the role of) so-called free-riders (Antin and Cheshire 2010), lurkers (Preece et al. 2004), serendipitous editors (Antin and Cheshire 2010), legitimate peripheral participators (Bryant et al. 2005), low-edit users (Kittur