Towards an On-Demand Simple Portuguese Wikipedia
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Cultural Anthropology Through the Lens of Wikipedia: Historical Leader Networks, Gender Bias, and News-Based Sentiment
Cultural Anthropology through the Lens of Wikipedia: Historical Leader Networks, Gender Bias, and News-based Sentiment Peter A. Gloor, Joao Marcos, Patrick M. de Boer, Hauke Fuehres, Wei Lo, Keiichi Nemoto [email protected] MIT Center for Collective Intelligence Abstract In this paper we study the differences in historical World View between Western and Eastern cultures, represented through the English, the Chinese, Japanese, and German Wikipedia. In particular, we analyze the historical networks of the World’s leaders since the beginning of written history, comparing them in the different Wikipedias and assessing cultural chauvinism. We also identify the most influential female leaders of all times in the English, German, Spanish, and Portuguese Wikipedia. As an additional lens into the soul of a culture we compare top terms, sentiment, emotionality, and complexity of the English, Portuguese, Spanish, and German Wikinews. 1 Introduction Over the last ten years the Web has become a mirror of the real world (Gloor et al. 2009). More recently, the Web has also begun to influence the real world: Societal events such as the Arab spring and the Chilean student unrest have drawn a large part of their impetus from the Internet and online social networks. In the meantime, Wikipedia has become one of the top ten Web sites1, occasionally beating daily newspapers in the actuality of most recent news. Be it the resignation of German national soccer team captain Philipp Lahm, or the downing of Malaysian Airlines flight 17 in the Ukraine by a guided missile, the corresponding Wikipedia page is updated as soon as the actual event happened (Becker 2012. -
Modeling Popularity and Reliability of Sources in Multilingual Wikipedia
information Article Modeling Popularity and Reliability of Sources in Multilingual Wikipedia Włodzimierz Lewoniewski * , Krzysztof W˛ecel and Witold Abramowicz Department of Information Systems, Pozna´nUniversity of Economics and Business, 61-875 Pozna´n,Poland; [email protected] (K.W.); [email protected] (W.A.) * Correspondence: [email protected] Received: 31 March 2020; Accepted: 7 May 2020; Published: 13 May 2020 Abstract: One of the most important factors impacting quality of content in Wikipedia is presence of reliable sources. By following references, readers can verify facts or find more details about described topic. A Wikipedia article can be edited independently in any of over 300 languages, even by anonymous users, therefore information about the same topic may be inconsistent. This also applies to use of references in different language versions of a particular article, so the same statement can have different sources. In this paper we analyzed over 40 million articles from the 55 most developed language versions of Wikipedia to extract information about over 200 million references and find the most popular and reliable sources. We presented 10 models for the assessment of the popularity and reliability of the sources based on analysis of meta information about the references in Wikipedia articles, page views and authors of the articles. Using DBpedia and Wikidata we automatically identified the alignment of the sources to a specific domain. Additionally, we analyzed the changes of popularity and reliability in time and identified growth leaders in each of the considered months. The results can be used for quality improvements of the content in different languages versions of Wikipedia. -
The Culture of Wikipedia
Good Faith Collaboration: The Culture of Wikipedia Good Faith Collaboration The Culture of Wikipedia Joseph Michael Reagle Jr. Foreword by Lawrence Lessig The MIT Press, Cambridge, MA. Web edition, Copyright © 2011 by Joseph Michael Reagle Jr. CC-NC-SA 3.0 Purchase at Amazon.com | Barnes and Noble | IndieBound | MIT Press Wikipedia's style of collaborative production has been lauded, lambasted, and satirized. Despite unease over its implications for the character (and quality) of knowledge, Wikipedia has brought us closer than ever to a realization of the centuries-old Author Bio & Research Blog pursuit of a universal encyclopedia. Good Faith Collaboration: The Culture of Wikipedia is a rich ethnographic portrayal of Wikipedia's historical roots, collaborative culture, and much debated legacy. Foreword Preface to the Web Edition Praise for Good Faith Collaboration Preface Extended Table of Contents "Reagle offers a compelling case that Wikipedia's most fascinating and unprecedented aspect isn't the encyclopedia itself — rather, it's the collaborative culture that underpins it: brawling, self-reflexive, funny, serious, and full-tilt committed to the 1. Nazis and Norms project, even if it means setting aside personal differences. Reagle's position as a scholar and a member of the community 2. The Pursuit of the Universal makes him uniquely situated to describe this culture." —Cory Doctorow , Boing Boing Encyclopedia "Reagle provides ample data regarding the everyday practices and cultural norms of the community which collaborates to 3. Good Faith Collaboration produce Wikipedia. His rich research and nuanced appreciation of the complexities of cultural digital media research are 4. The Puzzle of Openness well presented. -
Proceedings of the 3Rd Workshop on Building and Using Comparable
Workshop Programme Saturday, May 22, 2010 9:00-9:15 Opening Remarks Invited talk 9:15 Comparable Corpora Within and Across Languages, Word Frequency Lists and the KELLY Project Adam Kilgarriff 10:30 Break Session 1: Building Comparable Corpora 11:00 Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Ma- chine Translation Inguna Skadin¸a, Andrejs Vasil¸jevs, Raivis Skadin¸š, Robert Gaizauskas, Dan Tufi¸s and Tatiana Gornostay 11:30 Statistical Corpus and Language Comparison Using Comparable Corpora Thomas Eckart and Uwe Quasthoff 12:00 Wikipedia as Multilingual Source of Comparable Corpora Pablo Gamallo and Isaac González López 12:30 Trillions of Comparable Documents Pascale Fung, Emmanuel Prochasson and Simon Shi 13:00 Lunch break i Saturday, May 22, 2010 (continued) Session 2: Parallel and Comparable Corpora for Machine Translation 14:30 Improving Machine Translation Performance Using Comparable Corpora Andreas Eisele and Jia Xu 15:00 Building a Large English-Chinese Parallel Corpus from Comparable Patents and its Experimental Application to SMT Bin Lu, Tao Jiang, Kapo Chow and Benjamin K. Tsou 15:30 Automatic Terminologically-Rich Parallel Corpora Construction José João Almeida and Alberto Simões 16:00 Break Session 3: Contrastive Analysis 16:30 Foreign Language Examination Corpus for L2-Learning Studies Piotr Banski´ and Romuald Gozdawa-Goł˛ebiowski 17:00 Lexical Analysis of Pre and Post Revolution Discourse in Portugal Michel Généreux, Amália Mendes, L. Alice Santos Pereira and M. Fernanda Bace- lar do Nascimento -
ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia
ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia AARON HALFAKER∗, Microsoft, USA R. STUART GEIGER†, University of California, San Diego, USA Algorithmic systems—from rule-based bots to machine learning classifiers—have a long history of supporting the essential work of content moderation and other curation work in peer production projects. From counter- vandalism to task routing, basic machine prediction has allowed open knowledge projects like Wikipedia to scale to the largest encyclopedia in the world, while maintaining quality and consistency. However, conversations about how quality control should work and what role algorithms should play have generally been led by the expert engineers who have the skills and resources to develop and modify these complex algorithmic systems. In this paper, we describe ORES: an algorithmic scoring service that supports real-time scoring of wiki edits using multiple independent classifiers trained on different datasets. ORES decouples several activities that have typically all been performed by engineers: choosing or curating training data, building models to serve predictions, auditing predictions, and developing interfaces or automated agents that act on those predictions. This meta-algorithmic system was designed to open up socio-technical conversations about algorithms in Wikipedia to a broader set of participants. In this paper, we discuss the theoretical mechanisms of social change ORES enables and detail case studies in participatory machine learning around ORES from the 5 years since its deployment. CCS Concepts: • Networks → Online social networks; • Computing methodologies → Supervised 148 learning by classification; • Applied computing → Sociology; • Software and its engineering → Software design techniques; • Computer systems organization → Cloud computing; Additional Key Words and Phrases: Wikipedia; Reflection; Machine learning; Transparency; Fairness; Algo- rithms; Governance ACM Reference Format: Aaron Halfaker and R. -
Critical Point of View: a Wikipedia Reader
w ikipedia pedai p edia p Wiki CRITICAL POINT OF VIEW A Wikipedia Reader 2 CRITICAL POINT OF VIEW A Wikipedia Reader CRITICAL POINT OF VIEW 3 Critical Point of View: A Wikipedia Reader Editors: Geert Lovink and Nathaniel Tkacz Editorial Assistance: Ivy Roberts, Morgan Currie Copy-Editing: Cielo Lutino CRITICAL Design: Katja van Stiphout Cover Image: Ayumi Higuchi POINT OF VIEW Printer: Ten Klei Groep, Amsterdam Publisher: Institute of Network Cultures, Amsterdam 2011 A Wikipedia ISBN: 978-90-78146-13-1 Reader EDITED BY Contact GEERT LOVINK AND Institute of Network Cultures NATHANIEL TKACZ phone: +3120 5951866 INC READER #7 fax: +3120 5951840 email: [email protected] web: http://www.networkcultures.org Order a copy of this book by sending an email to: [email protected] A pdf of this publication can be downloaded freely at: http://www.networkcultures.org/publications Join the Critical Point of View mailing list at: http://www.listcultures.org Supported by: The School for Communication and Design at the Amsterdam University of Applied Sciences (Hogeschool van Amsterdam DMCI), the Centre for Internet and Society (CIS) in Bangalore and the Kusuma Trust. Thanks to Johanna Niesyto (University of Siegen), Nishant Shah and Sunil Abraham (CIS Bangalore) Sabine Niederer and Margreet Riphagen (INC Amsterdam) for their valuable input and editorial support. Thanks to Foundation Democracy and Media, Mondriaan Foundation and the Public Library Amsterdam (Openbare Bibliotheek Amsterdam) for supporting the CPOV events in Bangalore, Amsterdam and Leipzig. (http://networkcultures.org/wpmu/cpov/) Special thanks to all the authors for their contributions and to Cielo Lutino, Morgan Currie and Ivy Roberts for their careful copy-editing. -
A Wikipedia Reader
UvA-DARE (Digital Academic Repository) Critical point of view: a Wikipedia reader Lovink, G.; Tkacz, N. Publication date 2011 Document Version Final published version Link to publication Citation for published version (APA): Lovink, G., & Tkacz, N. (2011). Critical point of view: a Wikipedia reader. (INC reader; No. 7). Institute of Network Cultures. http://www.networkcultures.org/_uploads/%237reader_Wikipedia.pdf General rights It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons). Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible. UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl) Download date:05 Oct 2021 w ikipedia pedai p edia p Wiki CRITICAL POINT OF VIEW A Wikipedia Reader 2 CRITICAL POINT OF VIEW A Wikipedia Reader CRITICAL POINT OF VIEW 3 Critical Point of View: A Wikipedia Reader Editors: Geert Lovink -
Analyzing Wikidata Transclusion on English Wikipedia
Analyzing Wikidata Transclusion on English Wikipedia Isaac Johnson Wikimedia Foundation [email protected] Abstract. Wikidata is steadily becoming more central to Wikipedia, not just in maintaining interlanguage links, but in automated popula- tion of content within the articles themselves. It is not well understood, however, how widespread this transclusion of Wikidata content is within Wikipedia. This work presents a taxonomy of Wikidata transclusion from the perspective of its potential impact on readers and an associated in- depth analysis of Wikidata transclusion within English Wikipedia. It finds that Wikidata transclusion that impacts the content of Wikipedia articles happens at a much lower rate (5%) than previous statistics had suggested (61%). Recommendations are made for how to adjust current tracking mechanisms of Wikidata transclusion to better support metrics and patrollers in their evaluation of Wikidata transclusion. Keywords: Wikidata · Wikipedia · Patrolling 1 Introduction Wikidata is steadily becoming more central to Wikipedia, not just in maintaining interlanguage links, but in automated population of content within the articles themselves. This transclusion of Wikidata content within Wikipedia can help to reduce maintenance of certain facts and links by shifting the burden to main- tain up-to-date, referenced material from each individual Wikipedia to a single repository, Wikidata. Current best estimates suggest that, as of August 2020, 62% of Wikipedia ar- ticles across all languages transclude Wikidata content. This statistic ranges from Arabic Wikipedia (arwiki) and Basque Wikipedia (euwiki), where nearly 100% of articles transclude Wikidata content in some form, to Japanese Wikipedia (jawiki) at 38% of articles and many small wikis that lack any Wikidata tran- sclusion. -
Thanks for Stopping By: a Study of “Thanks
Thanks for Stopping By: A Study of “Thanks” Usage on Wikimedia Swati Goel Ashton Anderson Leila Zia [email protected] [email protected] [email protected] Henry M. Gunn High School University of Toronto Wikimedia Foundation ABSTRACT editors a better experience, can therefore increase editor activity. The Thanks feature on Wikipedia, also known as “Thanks”, is a tool A positive environment may actually be one of the most crucial with which editors can quickly and easily send one other positive elements for increasing engagement, as social factors tend to out- feedback [1]. The aim of this project is to better understand this fea- weigh even those surrounding usability with regards to positively ture: its scope, the characteristics of a typical “Thanks” interaction, affecting contribution [3]. The impact of these social factors could and the effects of receiving a thank on individual editors. Westudy be quite significant, as a community member’s internal value sys- the motivational impacts of “Thanks” because maintaining editor tems can be influenced by external rewards, thus making positive engagement is a central problem for crowdsourced repositories of feedback an extremely useful tool in building online communi- knowledge such as Wikimedia. Our main findings are that most ties [6]. The Thanks feature could therefore represent an important editors have not been exposed to the Thanks feature (meaning they resource for building a positive Wiki community. have never given nor received a thank), thanks are typically sent “Thanks” is no longer a new Wiki feature, having been imple- upwards (from less experienced to more experienced editors), and mented on English Wikipedia on May 30th, 2013 and introduced receiving a thank is correlated with having high levels of editor to all projects soon thereafter. -
Wikipedia Data Analysis
Wikipedia data analysis. Introduction and practical examples Felipe Ortega June 29, 2012 Abstract This document offers a gentle and accessible introduction to conduct quantitative analysis with Wikipedia data. The large size of many Wikipedia communities, the fact that (for many languages) the already account for more than a decade of activity and the precise level of detail of records accounting for this activity represent an unparalleled opportunity for researchers to conduct interesting studies for a wide variety of scientific disciplines. The focus of this introductory guide is on data retrieval and data preparation. Many re- search works have explained in detail numerous examples of Wikipedia data analysis, includ- ing numerical and graphical results. However, in many cases very little attention is paid to explain the data sources that were used in these studies, how these data were prepared before the analysis and additional information that is also available to extend the analysis in future studies. This is rather unfortunate since, in general, data retrieval and data preparation usu- ally consumes no less than 75% of the whole time devoted to quantitative analysis, specially in very large datasets. As a result, the main goal of this document is to fill in this gap, providing detailed de- scriptions of available data sources in Wikipedia, and practical methods and tools to prepare these data for the analysis. This is the focus of the first part, dealing with data retrieval and preparation. It also presents an overview of useful existing tools and frameworks to facilitate this process. The second part includes a description of a general methodology to undertake quantitative analysis with Wikipedia data, and open source tools that can serve as building blocks for researchers to implement their own analysis process. -
Wikimedia Foundation Metrics Meeting 5 November 2015 Agenda
Wikimedia Foundation metrics meeting 5 November 2015 Agenda Welcome Jimmy & Lila Strategy update Community update Metrics Showcase: Content translation (CX) Q&A Photo by Benh LIEU SONG, CC BY SA 4.0 Welcome! Requisition hires: Contractors, interns & volunteers: ● Zachary McCune - Comms - SF ● Gretchen Holtman - Advancement - CA ● Chris Schilling - CE - IL ● Natalie Cadranel - Advancement - SF ● Jan Drewniak - Engineering - Poland ● Mohammed Abdulai - CE - Ghana ● Morgan Jue - CE - SF Anniversaries Megan Hernandez (6 yrs) Winifred Olliff (5 yrs) Quim Gil (3 yrs) Amanda Bittaker (1 yr) Grace Gellerman (1 yr) Jacob Rogers (1 yr) Jerry Kim (1 yr) Juliet Barbara (1 yr) Niharika Kohli (1 yr) Sandra George (1 yr) Stas Malyshev (1 yr) Photo by Pink Sherbet Photography, CC BY SA 2.0 Jimmy & Lila Lourdes Cardenal Photo by Ruben Ortega, CC BY SA 4.0 Photo by Lourdes Photo by Lourdes Cardenal, CC BY SA 3.0 Sae Kitamura Photo by Kakidai, CC BY SA 3.0 Jeevan Jose Photo by Muhamed Sherif, CC BY SA 3.0 Photo by Jee Photo by Jeevan Jose, CC BY SA 3.0 Ravan Jaafar Altaie Photo by Adam Novak, CC BY SA 3.0 Manjai Lee Wikiplorer visualization tool Paulina Sánchez Photo by V Grigas, CC BY SA 3.0 Lionel Allorge Photo by V Grigas, CC BY SA 3.0 Photo by Lionel Photo by ToucanWings, CC BY SA 3.0 Culture & Community Thank you!!! 1. Communication? What communication? 2. HR at the C-level. 3. Culture survey. 4. Superprotect is gone! 5. Brown-bags: culture & strategy. Mark your calendars for MONDAY. -
Gender Gap in Wikipedia Editing
5 Gender Gap in Wikipedia Editing A Cross Language Comparison Paolo Massa and Asta Zelenkauskaite INTRODUCTION According to various surveys, the percentage of women editing Wikipedia barely reaches 10 percent.1, 2 The issue of gender distribution on Wikipedia was first brought to public attention by an article in the New York Times on January 2011. The article, titled “Define Gender Gap? Look Up Wikipedia’s Contributor List,”3 started by highlighting how, in just ten years, the Wikipedia community accomplished some remarkable goals, such as reaching more than 3.5 million articles in English and starting an online encyclopedia in more than 250 languages. Yet Wikipedia failed to reach at least a minimal gender balance: according to the United Nations and Maas- tricht University 2010 reported survey, less than 13 percent of contributors were female. A more recent survey carried out in 2011 by the Wikimedia Foundation, the nonprofit organization that coordinates the various Wikipedia projects,4 reported even a steeper gender gap: women account for just 9 percent of editors. The importance of Wikipedia editors’ diversity is relevant since Wikipedia is increasingly becoming one of the most accessed Web sources for information needs. Some 53 percent of American Internet users searched for information on Wikipedia as of May 2010; 88 percent of 2,318 university students used Wikipedia during a course-related research process, and finally, Wikipedia is the sixth most visited site on the entire Web.5 Thus, because so many people read the content of Wikipedia pages, it is important to become aware that these pages reflect the point of view of a pre- dominantly male population.