Gender Gap in Wikipedia Editing
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Cultural Anthropology Through the Lens of Wikipedia: Historical Leader Networks, Gender Bias, and News-Based Sentiment
Cultural Anthropology through the Lens of Wikipedia: Historical Leader Networks, Gender Bias, and News-based Sentiment Peter A. Gloor, Joao Marcos, Patrick M. de Boer, Hauke Fuehres, Wei Lo, Keiichi Nemoto [email protected] MIT Center for Collective Intelligence Abstract In this paper we study the differences in historical World View between Western and Eastern cultures, represented through the English, the Chinese, Japanese, and German Wikipedia. In particular, we analyze the historical networks of the World’s leaders since the beginning of written history, comparing them in the different Wikipedias and assessing cultural chauvinism. We also identify the most influential female leaders of all times in the English, German, Spanish, and Portuguese Wikipedia. As an additional lens into the soul of a culture we compare top terms, sentiment, emotionality, and complexity of the English, Portuguese, Spanish, and German Wikinews. 1 Introduction Over the last ten years the Web has become a mirror of the real world (Gloor et al. 2009). More recently, the Web has also begun to influence the real world: Societal events such as the Arab spring and the Chilean student unrest have drawn a large part of their impetus from the Internet and online social networks. In the meantime, Wikipedia has become one of the top ten Web sites1, occasionally beating daily newspapers in the actuality of most recent news. Be it the resignation of German national soccer team captain Philipp Lahm, or the downing of Malaysian Airlines flight 17 in the Ukraine by a guided missile, the corresponding Wikipedia page is updated as soon as the actual event happened (Becker 2012. -
Position Description Addenda
POSITION DESCRIPTION January 2014 Wikimedia Foundation Executive Director - Addenda The Wikimedia Foundation is a radically transparent organization, and much information can be found at www.wikimediafoundation.org . That said, certain information might be particularly useful to nominators and prospective candidates, including: Announcements pertaining to the Wikimedia Foundation Executive Director Search Kicking off the search for our next Executive Director by Former Wikimedia Foundation Board Chair Kat Walsh An announcement from Wikimedia Foundation ED Sue Gardner by Wikimedia Executive Director Sue Gardner Video Interviews on the Wikimedia Community and Foundation and Its History Some of the values and experiences of the Wikimedia Community are best described directly by those who have been intimately involved in the organization’s dramatic expansion. The following interviews are available for viewing though mOppenheim.TV . • 2013 Interview with Former Wikimedia Board Chair Kat Walsh • 2013 Interview with Wikimedia Executive Director Sue Gardner • 2009 Interview with Wikimedia Executive Director Sue Gardner Guiding Principles of the Wikimedia Foundation and the Wikimedia Community The following article by Sue Gardner, the current Executive Director of the Wikimedia Foundation, has received broad distribution and summarizes some of the core cultural values shared by Wikimedia’s staff, board and community. Topics covered include: • Freedom and open source • Serving every human being • Transparency • Accountability • Stewardship • Shared power • Internationalism • Free speech • Independence More information can be found at: https://meta.wikimedia.org/wiki/User:Sue_Gardner/Wikimedia_Foundation_Guiding_Principles Wikimedia Policies The Wikimedia Foundation has an extensive list of policies and procedures available online at: http://wikimediafoundation.org/wiki/Policies Wikimedia Projects All major projects of the Wikimedia Foundation are collaboratively developed by users around the world using the MediaWiki software. -
A Topic-Aligned Multilingual Corpus of Wikipedia Articles for Studying Information Asymmetry in Low Resource Languages
Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pages 2373–2380 Marseille, 11–16 May 2020 c European Language Resources Association (ELRA), licensed under CC-BY-NC A Topic-Aligned Multilingual Corpus of Wikipedia Articles for Studying Information Asymmetry in Low Resource Languages Dwaipayan Roy, Sumit Bhatia, Prateek Jain GESIS - Cologne, IBM Research - Delhi, IIIT - Delhi [email protected], [email protected], [email protected] Abstract Wikipedia is the largest web-based open encyclopedia covering more than three hundred languages. However, different language editions of Wikipedia differ significantly in terms of their information coverage. We present a systematic comparison of information coverage in English Wikipedia (most exhaustive) and Wikipedias in eight other widely spoken languages (Arabic, German, Hindi, Korean, Portuguese, Russian, Spanish and Turkish). We analyze the content present in the respective Wikipedias in terms of the coverage of topics as well as the depth of coverage of topics included in these Wikipedias. Our analysis quantifies and provides useful insights about the information gap that exists between different language editions of Wikipedia and offers a roadmap for the Information Retrieval (IR) community to bridge this gap. Keywords: Wikipedia, Knowledge base, Information gap 1. Introduction other with respect to the coverage of topics as well as Wikipedia is the largest web-based encyclopedia covering the amount of information about overlapping topics. -
Modeling Popularity and Reliability of Sources in Multilingual Wikipedia
information Article Modeling Popularity and Reliability of Sources in Multilingual Wikipedia Włodzimierz Lewoniewski * , Krzysztof W˛ecel and Witold Abramowicz Department of Information Systems, Pozna´nUniversity of Economics and Business, 61-875 Pozna´n,Poland; [email protected] (K.W.); [email protected] (W.A.) * Correspondence: [email protected] Received: 31 March 2020; Accepted: 7 May 2020; Published: 13 May 2020 Abstract: One of the most important factors impacting quality of content in Wikipedia is presence of reliable sources. By following references, readers can verify facts or find more details about described topic. A Wikipedia article can be edited independently in any of over 300 languages, even by anonymous users, therefore information about the same topic may be inconsistent. This also applies to use of references in different language versions of a particular article, so the same statement can have different sources. In this paper we analyzed over 40 million articles from the 55 most developed language versions of Wikipedia to extract information about over 200 million references and find the most popular and reliable sources. We presented 10 models for the assessment of the popularity and reliability of the sources based on analysis of meta information about the references in Wikipedia articles, page views and authors of the articles. Using DBpedia and Wikidata we automatically identified the alignment of the sources to a specific domain. Additionally, we analyzed the changes of popularity and reliability in time and identified growth leaders in each of the considered months. The results can be used for quality improvements of the content in different languages versions of Wikipedia. -
Title of Thesis: ABSTRACT CLASSIFYING BIAS
ABSTRACT Title of Thesis: CLASSIFYING BIAS IN LARGE MULTILINGUAL CORPORA VIA CROWDSOURCING AND TOPIC MODELING Team BIASES: Brianna Caljean, Katherine Calvert, Ashley Chang, Elliot Frank, Rosana Garay Jáuregui, Geoffrey Palo, Ryan Rinker, Gareth Weakly, Nicolette Wolfrey, William Zhang Thesis Directed By: Dr. David Zajic, Ph.D. Our project extends previous algorithmic approaches to finding bias in large text corpora. We used multilingual topic modeling to examine language-specific bias in the English, Spanish, and Russian versions of Wikipedia. In particular, we placed Spanish articles discussing the Cold War on a Russian-English viewpoint spectrum based on similarity in topic distribution. We then crowdsourced human annotations of Spanish Wikipedia articles for comparison to the topic model. Our hypothesis was that human annotators and topic modeling algorithms would provide correlated results for bias. However, that was not the case. Our annotators indicated that humans were more perceptive of sentiment in article text than topic distribution, which suggests that our classifier provides a different perspective on a text’s bias. CLASSIFYING BIAS IN LARGE MULTILINGUAL CORPORA VIA CROWDSOURCING AND TOPIC MODELING by Team BIASES: Brianna Caljean, Katherine Calvert, Ashley Chang, Elliot Frank, Rosana Garay Jáuregui, Geoffrey Palo, Ryan Rinker, Gareth Weakly, Nicolette Wolfrey, William Zhang Thesis submitted in partial fulfillment of the requirements of the Gemstone Honors Program, University of Maryland, 2018 Advisory Committee: Dr. David Zajic, Chair Dr. Brian Butler Dr. Marine Carpuat Dr. Melanie Kill Dr. Philip Resnik Mr. Ed Summers © Copyright by Team BIASES: Brianna Caljean, Katherine Calvert, Ashley Chang, Elliot Frank, Rosana Garay Jáuregui, Geoffrey Palo, Ryan Rinker, Gareth Weakly, Nicolette Wolfrey, William Zhang 2018 Acknowledgements We would like to express our sincerest gratitude to our mentor, Dr. -
The Culture of Wikipedia
Good Faith Collaboration: The Culture of Wikipedia Good Faith Collaboration The Culture of Wikipedia Joseph Michael Reagle Jr. Foreword by Lawrence Lessig The MIT Press, Cambridge, MA. Web edition, Copyright © 2011 by Joseph Michael Reagle Jr. CC-NC-SA 3.0 Purchase at Amazon.com | Barnes and Noble | IndieBound | MIT Press Wikipedia's style of collaborative production has been lauded, lambasted, and satirized. Despite unease over its implications for the character (and quality) of knowledge, Wikipedia has brought us closer than ever to a realization of the centuries-old Author Bio & Research Blog pursuit of a universal encyclopedia. Good Faith Collaboration: The Culture of Wikipedia is a rich ethnographic portrayal of Wikipedia's historical roots, collaborative culture, and much debated legacy. Foreword Preface to the Web Edition Praise for Good Faith Collaboration Preface Extended Table of Contents "Reagle offers a compelling case that Wikipedia's most fascinating and unprecedented aspect isn't the encyclopedia itself — rather, it's the collaborative culture that underpins it: brawling, self-reflexive, funny, serious, and full-tilt committed to the 1. Nazis and Norms project, even if it means setting aside personal differences. Reagle's position as a scholar and a member of the community 2. The Pursuit of the Universal makes him uniquely situated to describe this culture." —Cory Doctorow , Boing Boing Encyclopedia "Reagle provides ample data regarding the everyday practices and cultural norms of the community which collaborates to 3. Good Faith Collaboration produce Wikipedia. His rich research and nuanced appreciation of the complexities of cultural digital media research are 4. The Puzzle of Openness well presented. -
The Importance of Female Voices
The Importance of Female Voices Goals Collaborate to read and understand a text, and examine and use topic-specific vocabulary; Identify and define individual and collective knowledge, differentiating knowledge of girls and boys. Essential Questions Are women today participating in Canadian public life in percentages equal to that of men? Have efforts to increase women’s participation in public life succeeded in Canada? Does it matter if female voices are not heard in more equal proportions? Enduring Understandings Even after much progress in the past 30 years, women are vastly under-represented in many aspects of Canadian public life. This imbalance even showed up in Wikipedia, the open online resource encyclopedia, where only 15-20% of content contributors are women. Efforts to increase female participation in public life in the Canada have succeeded but growth is slow. The executives of the Wikipedia foundation viewed the small number of female contributors as a problem. Many believe that increasing the percentage of female voices will enrich diversity of opinions, and better reflect the user community. Materials Define Gender Gap? (2011) Making the edit: why we need more women in Wikipedia (2019) Wikipedia is a mirror of the world's gender biases (2018) Vocabulary intractable [ in-trak-tuh-buh ] (adjective) not easily controlled or directed; not docile or manageable; stubborn; obstinate. nuance [ noo-ahns ] (noun) a subtle difference or distinction in expression or meaning. disparity [ di-ˈsper-ə-tee ] (noun) difference; containing or made up of fundamentally different or unequal elements. egalitarian [ ih-gal-i-tair-ee-uh n ] (adjective) characterized by a belief in the equal status of all people in political, economic, or social life. -
An Analysis of Contributions to Wikipedia from Tor
Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor Chau Tran Kaylea Champion Andrea Forte Department of Computer Science & Engineering Department of Communication College of Computing & Informatics New York University University of Washington Drexel University New York, USA Seatle, USA Philadelphia, USA [email protected] [email protected] [email protected] Benjamin Mako Hill Rachel Greenstadt Department of Communication Department of Computer Science & Engineering University of Washington New York University Seatle, USA New York, USA [email protected] [email protected] Abstract—User-generated content sites routinely block contri- butions from users of privacy-enhancing proxies like Tor because of a perception that proxies are a source of vandalism, spam, and abuse. Although these blocks might be effective, collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible. One of the largest and most important user-generated content sites, Wikipedia, has attempted to block contributions from Tor users since as early as 2005. We demonstrate that these blocks have been imperfect and that thousands of attempts to edit on Wikipedia through Tor have been successful. We draw upon several data sources and analytical techniques to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions from Tor users to those from other groups of Wikipedia users. Fig. 1. Screenshot of the page a user is shown when they attempt to edit the Our analysis suggests that although Tor users who slip through Wikipedia article on “Privacy” while using Tor. Wikipedia’s ban contribute content that is more likely to be reverted and to revert others, their contributions are otherwise similar in quality to those from other unregistered participants and to the initial contributions of registered users. -
Wikimedia Foundation 2010-11 Plan
Wikimedia Foundation The Year Ahead, and the Year in Review SUE GARDNER WIKIMANIA – WASHINGTON DC – 14 JULY 2012 Purpose of this presentation is two things: Tell you what the WMF did in 2011-12 Tell you what the WMF plans to do in 2012-13 3 34.8 million dollars 4 You wrote the projects. You earned the goodwill. Your work is what donors are supporting. 5 6 7 Recapping 2011-12 8 Recapping 2011-12 Activities Supported 25% increase in readership from 400 to 500 million UVs; Visual Editor – released with save/edit capability & new parser, in restricted namespace on Mediawiki.org; Global Ed work is being done in 12 countries, and in eight the work is driven by volunteers – 19 million characters added to Wikipedia, with 50% female editors; The mobile platform has been redeveloped, Android app released, mobile PVs up 187%. Wikipedia Zero launched with new deals with two companies covering 28 countries, and more underway; Editor recruitment activity is underway in India and Brazil; Wikimedia Labs is launched and exceeding participation targets; Internationalization team has made significant improvements, particularly for Indic languages. It has also created new translation tools; New editor engagement – MoodBar/Feedback Dashboard, AFTv5, New Pages Feed tool, plus many research projects and small experiments (e.g., warnings & welcoming study, lapsed editor e-mail experiment,Teahouse). Nonetheless, editors are down to 85 from 89K; Upload wizard increased image uploads 27% over the year, with WLM causing a visible spike in September 2011. 9 Strategy Plan 2015 Goals (for reference) 1) Increase readership. 2) Increase the quantity of material we offer. -
World Literature According to Wikipedia: Introduction to a Dbpedia-Based Framework
World Literature According to Wikipedia: Introduction to a DBpedia-Based Framework Christoph Hube,1 Frank Fischer,2 Robert J¨aschke,1,3 Gerhard Lauer,4 Mads Rosendahl Thomsen5 1 L3S Research Center, Hannover, Germany 2 National Research University Higher School of Economics, Moscow, Russia 3 University of Sheffield, United Kingdom 4 G¨ottingenCentre for Digital Humanities, Germany 5 Aarhus University, Denmark Among the manifold takes on world literature, it is our goal to contribute to the discussion from a digital point of view by analyzing the representa- tion of world literature in Wikipedia with its millions of articles in hundreds of languages. As a preliminary, we introduce and compare three different approaches to identify writers on Wikipedia using data from DBpedia, a community project with the goal of extracting and providing structured in- formation from Wikipedia. Equipped with our basic set of writers, we analyze how they are represented throughout the 15 biggest Wikipedia language ver- sions. We combine intrinsic measures (mostly examining the connectedness of articles) with extrinsic ones (analyzing how often articles are frequented by readers) and develop methods to evaluate our results. The better part of our findings seems to convey a rather conservative, old-fashioned version of world literature, but a version derived from reproducible facts revealing an implicit literary canon based on the editing and reading behavior of millions of peo- ple. While still having to solve some known issues, the introduced methods arXiv:1701.00991v1 [cs.IR] 4 Jan 2017 will help us build an observatory of world literature to further investigate its representativeness and biases. -
How to Contribute Climate Change Information to Wikipedia : a Guide
HOW TO CONTRIBUTE CLIMATE CHANGE INFORMATION TO WIKIPEDIA Emma Baker, Lisa McNamara, Beth Mackay, Katharine Vincent; ; © 2021, CDKN This work is licensed under the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/legalcode), which permits unrestricted use, distribution, and reproduction, provided the original work is properly credited. Cette œuvre est mise à disposition selon les termes de la licence Creative Commons Attribution (https://creativecommons.org/licenses/by/4.0/legalcode), qui permet l’utilisation, la distribution et la reproduction sans restriction, pourvu que le mérite de la création originale soit adéquatement reconnu. IDRC Grant/ Subvention du CRDI: 108754-001-CDKN knowledge accelerator for climate compatible development How to contribute climate change information to Wikipedia A guide for researchers, practitioners and communicators Contents About this guide .................................................................................................................................................... 5 1 Why Wikipedia is an important tool to communicate climate change information .................................................................................................................................. 7 1.1 Enhancing the quality of online climate change information ............................................. 8 1.2 Sharing your work more widely ......................................................................................................8 1.3 Why researchers should -
Semantically Annotated Snapshot of the English Wikipedia
Semantically Annotated Snapshot of the English Wikipedia Jordi Atserias, Hugo Zaragoza, Massimiliano Ciaramita, Giuseppe Attardi Yahoo! Research Barcelona, U. Pisa, on sabbatical at Yahoo! Research C/Ocata 1 Barcelona 08003 Spain {jordi, hugoz, massi}@yahoo-inc.com, [email protected] Abstract This paper describes SW1, the first version of a semantically annotated snapshot of the English Wikipedia. In recent years Wikipedia has become a valuable resource for both the Natural Language Processing (NLP) community and the Information Retrieval (IR) community. Although NLP technology for processing Wikipedia already exists, not all researchers and developers have the computational resources to process such a volume of information. Moreover, the use of different versions of Wikipedia processed differently might make it difficult to compare results. The aim of this work is to provide easy access to syntactic and semantic annotations for researchers of both NLP and IR communities by building a reference corpus to homogenize experiments and make results comparable. These resources, a semantically annotated corpus and a “entity containment” derived graph, are licensed under the GNU Free Documentation License and available from http://www.yr-bcn.es/semanticWikipedia. 1. Introduction 2. Processing Wikipedia1, the largest electronic encyclopedia, has be- Starting from the XML Wikipedia source we carried out a come a widely used resource for different Natural Lan- number of data processing steps: guage Processing tasks, e.g. Word Sense Disambiguation (Mihalcea, 2007), Semantic Relatedness (Gabrilovich and • Basic preprocessing: Stripping the text from the Markovitch, 2007) or in the Multilingual Question Answer- XML tags and dividing the obtained text into sen- ing task at Cross-Language Evaluation Forum (CLEF)2.