Wikipedia Cultural Diversity Dataset: a Complete Cartography for 300
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
1 Wikipedia: an Effective Anarchy Dariusz Jemielniak, Ph.D
Wikipedia: An Effective Anarchy Dariusz Jemielniak, Ph.D. Kozminski University [email protected] Paper presented at the Society for Applied Anthropology conference in Baltimore, MD (USA), 27-31 March, 2012 (work in progress) This paper is the first report from a virtual ethnographic study (Hine, 2000; Kozinets, 2010) of Wikipedia community conducted 2006-2012, by the use of participative methods, and relying on an narrative analysis of Wikipedia organization (Czarniawska, 2000; Boje, 2001; Jemielniak & Kostera, 2010). It serves as a general introduction to Wikipedia community, and is also a basis for a discussion of a book in progress, which is going to address the topic. Contrarily to a common misconception, Wikipedia was not the first “wiki” in the world. “Wiki” (originated from Hawaiian word for “quick” or “fast”, and named after “Wiki Wiki Shuttle” on Honolulu International Airport) is a website technology based on a philosophy of tracking changes added by the users, with a simplified markup language (allowing easy additions of, e.g. bold, italics, or tables, without the need to learn full HTML syntax), and was originally created and made public in 1995 by Ward Cunningam, as WikiWikiWeb. WikiWikiWeb was an attractive choice among enterprises and was used for communication, collaborative ideas development, documentation, intranet, knowledge management, etc. It grew steadily in popularity, when Jimmy “Jimbo” Wales, then the CEO of Bomis Inc., started up his encyclopedic project in 2000: Nupedia. Nupedia was meant to be an online encyclopedia, with free content, and written by experts. In an attempt to meet the standards set by professional encyclopedias, the creators of Nupedia based it on a peer-review process, and not a wiki-type software. -
Universality, Similarity, and Translation in the Wikipedia Inter-Language Link Network
In Search of the Ur-Wikipedia: Universality, Similarity, and Translation in the Wikipedia Inter-language Link Network Morten Warncke-Wang1, Anuradha Uduwage1, Zhenhua Dong2, John Riedl1 1GroupLens Research Dept. of Computer Science and Engineering 2Dept. of Information Technical Science University of Minnesota Nankai University Minneapolis, Minnesota Tianjin, China {morten,uduwage,riedl}@cs.umn.edu [email protected] ABSTRACT 1. INTRODUCTION Wikipedia has become one of the primary encyclopaedic in- The world: seven seas separating seven continents, seven formation repositories on the World Wide Web. It started billion people in 193 nations. The world's knowledge: 283 in 2001 with a single edition in the English language and has Wikipedias totalling more than 20 million articles. Some since expanded to more than 20 million articles in 283 lan- of the content that is contained within these Wikipedias is guages. Criss-crossing between the Wikipedias is an inter- probably shared between them; for instance it is likely that language link network, connecting the articles of one edition they will all have an article about Wikipedia itself. This of Wikipedia to another. We describe characteristics of ar- leads us to ask whether there exists some ur-Wikipedia, a ticles covered by nearly all Wikipedias and those covered by set of universal knowledge that any human encyclopaedia only a single language edition, we use the network to under- will contain, regardless of language, culture, etc? With such stand how we can judge the similarity between Wikipedias a large number of Wikipedia editions, what can we learn based on concept coverage, and we investigate the flow of about the knowledge in the ur-Wikipedia? translation between a selection of the larger Wikipedias. -
Attitudes Towards Linguistic Diversity in the Hebrew Bible
Many Peoples of Obscure Speech and Difficult Language: Attitudes towards Linguistic Diversity in the Hebrew Bible The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation Power, Cian Joseph. 2015. Many Peoples of Obscure Speech and Difficult Language: Attitudes towards Linguistic Diversity in the Hebrew Bible. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences. Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:23845462 Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http:// nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#LAA MANY PEOPLES OF OBSCURE SPEECH AND DIFFICULT LANGUAGE: ATTITUDES TOWARDS LINGUISTIC DIVERSITY IN THE HEBREW BIBLE A dissertation presented by Cian Joseph Power to The Department of Near Eastern Languages and Civilizations in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the subject of Near Eastern Languages and Civilizations Harvard University Cambridge, Massachusetts August 2015 © 2015 Cian Joseph Power All rights reserved. Dissertation Advisor: Professor Peter Machinist Cian Joseph Power MANY PEOPLES OF OBSCURE SPEECH AND DIFFICULT LANGUAGE: ATTITUDES TOWARDS LINGUISTIC DIVERSITY IN THE HEBREW BIBLE Abstract The subject of this dissertation is the awareness of linguistic diversity in the Hebrew Bible—that is, the recognition evident in certain biblical texts that the world’s languages differ from one another. Given the frequent role of language in conceptions of identity, the biblical authors’ reflections on language are important to examine. -
Volunteer Contributions to Wikipedia Increased During COVID-19 Mobility Restrictions
Volunteer contributions to Wikipedia increased during COVID-19 mobility restrictions Thorsten Ruprechter1,*, Manoel Horta Ribeiro2, Tiago Santos1, Florian Lemmerich3, Markus Strohmaier3,4, Robert West2, and Denis Helic1 1Graz University of Technology, 8010 Graz, Austria 2EPFL, 1015 Lausanne, Switzerland 3RWTH Aachen University, 52062 Aachen, Germany 4GESIS – Leibniz Institute for the Social Sciences, 50667 Cologne, Germany *Corresponding author ([email protected]) Wikipedia, the largest encyclopedia ever created, is a global initiative driven by volunteer contribu- tions. When the COVID-19 pandemic broke out and mobility restrictions ensued across the globe, it was unclear whether Wikipedia volunteers would become less active in the face of the pandemic, or whether they would rise to meet the increased demand for high-quality information despite the added stress inflicted by this crisis. Analyzing 223 million edits contributed from 2018 to 2020 across twelve Wikipedia language editions, we find that Wikipedia’s global volunteer community responded remarkably to the pandemic, substantially increasing both productivity and the number of newcom- ers who joined the community. For example, contributions to the English Wikipedia increased by over 20% compared to the expectation derived from pre-pandemic data. Our work sheds light on the response of a global volunteer population to the COVID-19 crisis, providing valuable insights into the behavior of critical online communities under stress. Wikipedia is the world’s largest encyclopedia, one of the most prominent volunteer-based information systems in existence [18, 29], and one of the most popular destinations on the Web [2]. On an average day in 2019, users from around the world visited Wikipedia about 530 million times and editors voluntarily contributed over 870 thousand edits to one of Wikipedia’s language editions (Supplementary Table 1). -
Europeanness in Aarhus 2017'S Programme of Events: Identity
Culture, Practice & Europeanization, 2020, Vol. 5, No. 1, 16-33 …………………………………………………………………………………………………………………………………………. Europeanness in Aarhus 2017’s programme of events: Identity constructions and narratives Antoinette Fage-Butler ([email protected]), Katja Gorbahn ([email protected]) Aarhus University, Denmark _________________________________________________________________________ According to EU cultural policy, European Capitals of Culture (ECOCs) should include a ‘Eu- ropean dimension’ that promotes cultural collaborations across EU countries and high- lights the diversity and similarity of European cultures. However, the European dimension has been underplayed in ECOC events (Lähdesmäki, 2014b) and has not been particularly visible in official communication about ECOC events (European Commission, 2010). The purpose of this study is to investigate narratives of Europeanness that provide templates for identification in the official programme of events for Aarhus 2017, using a qualitative discourse analytical approach and computational tools. The findings reveal that ‘Europe’ is linked to other spatial/geopolitical levels, and that narratives of Europeanness draw on discourses of categorical identity and relational identity. The various representations of Europeanness in Aarhus 2017’s programme of events are discussed with respect to exist- ing empirical studies and theories of European identity, as well as the evolving aims of ECOC. Keywords: European Capitals of Culture (ECOCs), Narratives, Europeanness, European identity, Discourse, Digital text analysis, Aarhus 2017, ECOC programme of events 1. Introduction The European Capitals of Culture (ECOC) project has been in existence since 1985. It has been described as “a flagship cultural initiative of the European Union” (Barroso, 2009, 1), which should further civic identification with the EU, and political integration (Shore, 2000) by winning over EU citizens’ “hearts and minds” (Patel, 2013, 2). -
A Topic-Aligned Multilingual Corpus of Wikipedia Articles for Studying Information Asymmetry in Low Resource Languages
Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pages 2373–2380 Marseille, 11–16 May 2020 c European Language Resources Association (ELRA), licensed under CC-BY-NC A Topic-Aligned Multilingual Corpus of Wikipedia Articles for Studying Information Asymmetry in Low Resource Languages Dwaipayan Roy, Sumit Bhatia, Prateek Jain GESIS - Cologne, IBM Research - Delhi, IIIT - Delhi [email protected], [email protected], [email protected] Abstract Wikipedia is the largest web-based open encyclopedia covering more than three hundred languages. However, different language editions of Wikipedia differ significantly in terms of their information coverage. We present a systematic comparison of information coverage in English Wikipedia (most exhaustive) and Wikipedias in eight other widely spoken languages (Arabic, German, Hindi, Korean, Portuguese, Russian, Spanish and Turkish). We analyze the content present in the respective Wikipedias in terms of the coverage of topics as well as the depth of coverage of topics included in these Wikipedias. Our analysis quantifies and provides useful insights about the information gap that exists between different language editions of Wikipedia and offers a roadmap for the Information Retrieval (IR) community to bridge this gap. Keywords: Wikipedia, Knowledge base, Information gap 1. Introduction other with respect to the coverage of topics as well as Wikipedia is the largest web-based encyclopedia covering the amount of information about overlapping topics. -
Modeling Popularity and Reliability of Sources in Multilingual Wikipedia
information Article Modeling Popularity and Reliability of Sources in Multilingual Wikipedia Włodzimierz Lewoniewski * , Krzysztof W˛ecel and Witold Abramowicz Department of Information Systems, Pozna´nUniversity of Economics and Business, 61-875 Pozna´n,Poland; [email protected] (K.W.); [email protected] (W.A.) * Correspondence: [email protected] Received: 31 March 2020; Accepted: 7 May 2020; Published: 13 May 2020 Abstract: One of the most important factors impacting quality of content in Wikipedia is presence of reliable sources. By following references, readers can verify facts or find more details about described topic. A Wikipedia article can be edited independently in any of over 300 languages, even by anonymous users, therefore information about the same topic may be inconsistent. This also applies to use of references in different language versions of a particular article, so the same statement can have different sources. In this paper we analyzed over 40 million articles from the 55 most developed language versions of Wikipedia to extract information about over 200 million references and find the most popular and reliable sources. We presented 10 models for the assessment of the popularity and reliability of the sources based on analysis of meta information about the references in Wikipedia articles, page views and authors of the articles. Using DBpedia and Wikidata we automatically identified the alignment of the sources to a specific domain. Additionally, we analyzed the changes of popularity and reliability in time and identified growth leaders in each of the considered months. The results can be used for quality improvements of the content in different languages versions of Wikipedia. -
The Heart of the Metropole: Urban Space and Interracial Relationships
This is a repository copy of The heart of the metropole: Urban space and interracial relationships in ‘Fidelidade’ by Vimala Devi, ‘Um Encontro Imprevisto’ by Henrique de Senna Fernandes, and ‘Nina’ by Orlanda Amarílis. White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/105264/ Version: Accepted Version Article: Melo e Castro, P (2017) The heart of the metropole: Urban space and interracial relationships in ‘Fidelidade’ by Vimala Devi, ‘Um Encontro Imprevisto’ by Henrique de Senna Fernandes, and ‘Nina’ by Orlanda Amarílis. Forum for Modern Language Studies, 53 (4). pp. 405-429. ISSN 0015-8518 https://doi.org/10.1093/fmls/cqx038 (c) The Author 2017. Published by Oxford University Press for the Court of the University of St Andrews. All rights reserved. This is a pre-copyedited, author-produced version of an article published in Modern Language Studies uploaded in accordance with the publisher's self-archiving policy. The version of record is available online at: https://doi.org/10.1093/fmls/cqx038 Reuse Items deposited in White Rose Research Online are protected by copyright, with all rights reserved unless indicated otherwise. They may be downloaded and/or printed for private study, or other acts as permitted by national copyright laws. The publisher or other rights holders may allow further reproduction and re-use of the full text version. This is indicated by the licence information on the White Rose Research Online record for the item. Takedown If you consider content in White Rose Research Online to be in breach of UK law, please notify us by emailing [email protected] including the URL of the record and the reason for the withdrawal request. -
Omnipedia: Bridging the Wikipedia Language
Omnipedia: Bridging the Wikipedia Language Gap Patti Bao*†, Brent Hecht†, Samuel Carton†, Mahmood Quaderi†, Michael Horn†§, Darren Gergle*† *Communication Studies, †Electrical Engineering & Computer Science, §Learning Sciences Northwestern University {patti,brent,sam.carton,quaderi}@u.northwestern.edu, {michael-horn,dgergle}@northwestern.edu ABSTRACT language edition contains its own cultural viewpoints on a We present Omnipedia, a system that allows Wikipedia large number of topics [7, 14, 15, 27]. On the other hand, readers to gain insight from up to 25 language editions of the language barrier serves to silo knowledge [2, 4, 33], Wikipedia simultaneously. Omnipedia highlights the slowing the transfer of less culturally imbued information similarities and differences that exist among Wikipedia between language editions and preventing Wikipedia’s 422 language editions, and makes salient information that is million monthly visitors [12] from accessing most of the unique to each language as well as that which is shared information on the site. more widely. We detail solutions to numerous front-end and algorithmic challenges inherent to providing users with In this paper, we present Omnipedia, a system that attempts a multilingual Wikipedia experience. These include to remedy this situation at a large scale. It reduces the silo visualizing content in a language-neutral way and aligning effect by providing users with structured access in their data in the face of diverse information organization native language to over 7.5 million concepts from up to 25 strategies. We present a study of Omnipedia that language editions of Wikipedia. At the same time, it characterizes how people interact with information using a highlights similarities and differences between each of the multilingual lens. -
Multilingual Ranking of Wikipedia Articles with Quality and Popularity Assessment in Different Topics
computers Article Multilingual Ranking of Wikipedia Articles with Quality and Popularity Assessment in Different Topics Włodzimierz Lewoniewski * , Krzysztof W˛ecel and Witold Abramowicz Department of Information Systems, Pozna´nUniversity of Economics and Business, 61-875 Pozna´n,Poland * Correspondence: [email protected]; Tel.: +48-(61)-639-27-93 Received: 10 May 2019; Accepted: 13 August 2019; Published: 14 August 2019 Abstract: On Wikipedia, articles about various topics can be created and edited independently in each language version. Therefore, the quality of information about the same topic depends on the language. Any interested user can improve an article and that improvement may depend on the popularity of the article. The goal of this study is to show what topics are best represented in different language versions of Wikipedia using results of quality assessment for over 39 million articles in 55 languages. In this paper, we also analyze how popular selected topics are among readers and authors in various languages. We used two approaches to assign articles to various topics. First, we selected 27 main multilingual categories and analyzed all their connections with sub-categories based on information extracted from over 10 million categories in 55 language versions. To classify the articles to one of the 27 main categories, we took into account over 400 million links from articles to over 10 million categories and over 26 million links between categories. In the second approach, we used data from DBpedia and Wikidata. We also showed how the results of the study can be used to build local and global rankings of the Wikipedia content. -
15Th-17Th Century) Essays on the Spread of Humanistic and Renaissance Literary (15Th-17Th Century) Edited by Giovanna Siedina
45 BIBLIOTECA DI STUDI SLAVISTICI Giovanna Siedina Giovanna Essays on the Spread of Humanistic and Renaissance Literary Civilization in the Slavic World Civilization in the Slavic World (15th-17th Century) Civilization in the Slavic World of Humanistic and Renaissance Literary Essays on the Spread (15th-17th Century) edited by Giovanna Siedina FUP FIRENZE PRESUNIVERSITYS BIBLIOTECA DI STUDI SLAVISTICI ISSN 2612-7687 (PRINT) - ISSN 2612-7679 (ONLINE) – 45 – BIBLIOTECA DI STUDI SLAVISTICI Editor-in-Chief Laura Salmon, University of Genoa, Italy Associate editor Maria Bidovec, University of Naples L’Orientale, Italy Scientific Board Rosanna Benacchio, University of Padua, Italy Maria Cristina Bragone, University of Pavia, Italy Claudia Olivieri, University of Catania, Italy Francesca Romoli, University of Pisa, Italy Laura Rossi, University of Milan, Italy Marco Sabbatini, University of Pisa, Italy International Scientific Board Giovanna Brogi Bercoff, University of Milan, Italy Maria Giovanna Di Salvo, University of Milan, Italy Alexander Etkind, European University Institute, Italy Lazar Fleishman, Stanford University, United States Marcello Garzaniti, University of Florence, Italy Harvey Goldblatt, Yale University, United States Mark Lipoveckij, University of Colorado-Boulder , United States Jordan Ljuckanov, Bulgarian Academy of Sciences, Bulgaria Roland Marti, Saarland University, Germany Michael Moser, University of Vienna, Austria Ivo Pospíšil, Masaryk University, Czech Republic Editorial Board Giuseppe Dell’Agata, University of Pisa, Italy Essays on the Spread of Humanistic and Renaissance Literary Civilization in the Slavic World (15th-17th Century) edited by Giovanna Siedina FIRENZE UNIVERSITY PRESS 2020 Essays on the Spread of Humanistic and Renaissance Literary Civilization in the Slavic World (15th- 17th Century) / edited by Giovanna Siedina. – Firenze : Firenze University Press, 2020. -
An Analysis of Contributions to Wikipedia from Tor
Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor Chau Tran Kaylea Champion Andrea Forte Department of Computer Science & Engineering Department of Communication College of Computing & Informatics New York University University of Washington Drexel University New York, USA Seatle, USA Philadelphia, USA [email protected] [email protected] [email protected] Benjamin Mako Hill Rachel Greenstadt Department of Communication Department of Computer Science & Engineering University of Washington New York University Seatle, USA New York, USA [email protected] [email protected] Abstract—User-generated content sites routinely block contri- butions from users of privacy-enhancing proxies like Tor because of a perception that proxies are a source of vandalism, spam, and abuse. Although these blocks might be effective, collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible. One of the largest and most important user-generated content sites, Wikipedia, has attempted to block contributions from Tor users since as early as 2005. We demonstrate that these blocks have been imperfect and that thousands of attempts to edit on Wikipedia through Tor have been successful. We draw upon several data sources and analytical techniques to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions from Tor users to those from other groups of Wikipedia users. Fig. 1. Screenshot of the page a user is shown when they attempt to edit the Our analysis suggests that although Tor users who slip through Wikipedia article on “Privacy” while using Tor. Wikipedia’s ban contribute content that is more likely to be reverted and to revert others, their contributions are otherwise similar in quality to those from other unregistered participants and to the initial contributions of registered users.