Wikimedia Foundation Metrics Meeting 5 November 2015 Agenda
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
Cultural Anthropology Through the Lens of Wikipedia: Historical Leader Networks, Gender Bias, and News-Based Sentiment
Cultural Anthropology through the Lens of Wikipedia: Historical Leader Networks, Gender Bias, and News-based Sentiment Peter A. Gloor, Joao Marcos, Patrick M. de Boer, Hauke Fuehres, Wei Lo, Keiichi Nemoto MIT Center for Collective Intelligence Abstract In this paper we study the differences in historical World View between Western and Eastern cultures, represented through the English, the Chinese, Japanese, and German Wikipedia. In particular, we analyze the historical networks of the World’s leaders since the beginning of written history, comparing them in the different Wikipedias and assessing cultural chauvinism. We also identify the most influential female leaders of all times in the English, German, Spanish, and Portuguese Wikipedia. As an additional lens into the soul of a culture we compare top terms, sentiment, emotionality, and complexity of the English, Portuguese, Spanish, and German Wikinews. 1 Introduction Over the last ten years the Web has become a mirror of the real world (Gloor et al. 2009). More recently, the Web has also begun to influence the real world: Societal events such as the Arab spring and the Chilean student unrest have drawn a large part of their impetus from the Internet and online social networks. In the meantime, Wikipedia has become one of the top ten Web sites, occasionally beating daily newspapers in the actuality of most recent news. Be it the resignation of German national soccer team captain Philipp Lahm, or the downing of Malaysian Airlines flight 17 in the Ukraine by a guided missile, the corresponding Wikipedia page is updated as soon as the actual event happened (Becker 2012.
Modeling Popularity and Reliability of Sources in Multilingual Wikipedia
information Article Modeling Popularity and Reliability of Sources in Multilingual Wikipedia Włodzimierz Lewoniewski, Krzysztof Węcel and Witold Abramowicz Department of Information Systems, Poznań University of Economics and Business, 61-875 Poznań, Poland Received: 31 March 2020; Accepted: 7 May 2020; Published: 13 May 2020 Abstract: One of the most important factors impacting quality of content in Wikipedia is presence of reliable sources. By following references, readers can verify facts or find more details about described topic. A Wikipedia article can be edited independently in any of over 300 languages, even by anonymous users, therefore information about the same topic may be inconsistent. This also applies to use of references in different language versions of a particular article, so the same statement can have different sources. In this paper we analyzed over 40 million articles from the 55 most developed language versions of Wikipedia to extract information about over 200 million references and find the most popular and reliable sources. We presented 10 models for the assessment of the popularity and reliability of the sources based on analysis of meta information about the references in Wikipedia articles, page views and authors of the articles. Using DBpedia and Wikidata we automatically identified the alignment of the sources to a specific domain. Additionally, we analyzed the changes of popularity and reliability in time and identified growth leaders in each of the considered months. The results can be used for quality improvements of the content in different language versions of Wikipedia.
Proceedings of the 3rd Workshop on Building and Using Comparable
Workshop Programme
Saturday, May 22, 2010
9:00-9:15 Opening Remarks
Invited talk
9:15 Comparable Corpora Within and Across Languages, Word Frequency Lists and the KELLY Project
    Adam Kilgarriff
10:30 Break
Session 1: Building Comparable Corpora
11:00 Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation
    Inguna Skadiņa, Andrejs Vasiļjevs, Raivis Skadiņš, Robert Gaizauskas, Dan Tufiş and Tatiana Gornostay
11:30 Statistical Corpus and Language Comparison Using Comparable Corpora
    Thomas Eckart and Uwe Quasthoff
12:00 Wikipedia as Multilingual Source of Comparable Corpora
    Pablo Gamallo and Isaac González López
12:30 Trillions of Comparable Documents
    Pascale Fung, Emmanuel Prochasson and Simon Shi
13:00 Lunch break
Session 2: Parallel and Comparable Corpora for Machine Translation
14:30 Improving Machine Translation Performance Using Comparable Corpora
    Andreas Eisele and Jia Xu
15:00 Building a Large English-Chinese Parallel Corpus from Comparable Patents and its Experimental Application to SMT
    Bin Lu, Tao Jiang, Kapo Chow and Benjamin K. Tsou
15:30 Automatic Terminologically-Rich Parallel Corpora Construction
    José João Almeida and Alberto Simões
16:00 Break
Session 3: Contrastive Analysis
16:30 Foreign Language Examination Corpus for L2-Learning Studies
    Piotr Bański and Romuald Gozdawa-Gołębiowski
17:00 Lexical Analysis of Pre and Post Revolution Discourse in Portugal
    Michel Généreux, Amália Mendes, L. Alice Santos Pereira and M. Fernanda Bacelar do Nascimento
ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia
ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia AARON HALFAKER, Microsoft, USA R. STUART GEIGER, University of California, San Diego, USA Algorithmic systems—from rule-based bots to machine learning classifiers—have a long history of supporting the essential work of content moderation and other curation work in peer production projects. From counter-vandalism to task routing, basic machine prediction has allowed open knowledge projects like Wikipedia to scale to the largest encyclopedia in the world, while maintaining quality and consistency. However, conversations about how quality control should work and what role algorithms should play have generally been led by the expert engineers who have the skills and resources to develop and modify these complex algorithmic systems. In this paper, we describe ORES: an algorithmic scoring service that supports real-time scoring of wiki edits using multiple independent classifiers trained on different datasets. ORES decouples several activities that have typically all been performed by engineers: choosing or curating training data, building models to serve predictions, auditing predictions, and developing interfaces or automated agents that act on those predictions. This meta-algorithmic system was designed to open up socio-technical conversations about algorithms in Wikipedia to a broader set of participants. In this paper, we discuss the theoretical mechanisms of social change ORES enables and detail case studies in participatory machine learning around ORES from the 5 years since its deployment.
CCS Concepts: • Networks → Online social networks; • Computing methodologies → Supervised learning by classification; • Applied computing → Sociology; • Software and its engineering → Software design techniques; • Computer systems organization → Cloud computing; Additional Key Words and Phrases: Wikipedia; Reflection; Machine learning; Transparency; Fairness; Algorithms; Governance ACM Reference Format: Aaron Halfaker and R.
Critical Point of View: a Wikipedia Reader
Critical Point of View: A Wikipedia Reader Editors: Geert Lovink and Nathaniel Tkacz Editorial Assistance: Ivy Roberts, Morgan Currie Copy-Editing: Cielo Lutino Design: Katja van Stiphout Cover Image: Ayumi Higuchi Printer: Ten Klei Groep, Amsterdam Publisher: Institute of Network Cultures, Amsterdam 2011 ISBN: 978-90-78146-13-1 INC Reader #7 Contact Institute of Network Cultures phone: +3120 5951866 fax: +3120 5951840 email: [email protected] web: http://www.networkcultures.org Order a copy of this book by sending an email to: [email protected] A pdf of this publication can be downloaded freely at: http://www.networkcultures.org/publications Join the Critical Point of View mailing list at: http://www.listcultures.org Supported by: The School for Communication and Design at the Amsterdam University of Applied Sciences (Hogeschool van Amsterdam DMCI), the Centre for Internet and Society (CIS) in Bangalore and the Kusuma Trust. Thanks to Johanna Niesyto (University of Siegen), Nishant Shah and Sunil Abraham (CIS Bangalore), Sabine Niederer and Margreet Riphagen (INC Amsterdam) for their valuable input and editorial support. Thanks to Foundation Democracy and Media, Mondriaan Foundation and the Public Library Amsterdam (Openbare Bibliotheek Amsterdam) for supporting the CPOV events in Bangalore, Amsterdam and Leipzig. (http://networkcultures.org/wpmu/cpov/) Special thanks to all the authors for their contributions and to Cielo Lutino, Morgan Currie and Ivy Roberts for their careful copy-editing.
A Wikipedia Reader
UvA-DARE (Digital Academic Repository) Critical point of view: a Wikipedia reader Lovink, G.; Tkacz, N. Publication date 2011 Document Version Final published version Link to publication Citation for published version (APA): Lovink, G., & Tkacz, N. (2011). Critical point of view: a Wikipedia reader. (INC reader; No. 7). Institute of Network Cultures. http://www.networkcultures.org/_uploads/%237reader_Wikipedia.pdf General rights It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons). Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible. UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl) Download date: 05 Oct 2021 Critical Point of View: A Wikipedia Reader Editors: Geert Lovink
Analyzing Wikidata Transclusion on English Wikipedia
Analyzing Wikidata Transclusion on English Wikipedia Isaac Johnson Wikimedia Foundation Abstract. Wikidata is steadily becoming more central to Wikipedia, not just in maintaining interlanguage links, but in automated population of content within the articles themselves. It is not well understood, however, how widespread this transclusion of Wikidata content is within Wikipedia. This work presents a taxonomy of Wikidata transclusion from the perspective of its potential impact on readers and an associated in-depth analysis of Wikidata transclusion within English Wikipedia. It finds that Wikidata transclusion that impacts the content of Wikipedia articles happens at a much lower rate (5%) than previous statistics had suggested (61%). Recommendations are made for how to adjust current tracking mechanisms of Wikidata transclusion to better support metrics and patrollers in their evaluation of Wikidata transclusion. Keywords: Wikidata · Wikipedia · Patrolling 1 Introduction Wikidata is steadily becoming more central to Wikipedia, not just in maintaining interlanguage links, but in automated population of content within the articles themselves. This transclusion of Wikidata content within Wikipedia can help to reduce maintenance of certain facts and links by shifting the burden to maintain up-to-date, referenced material from each individual Wikipedia to a single repository, Wikidata. Current best estimates suggest that, as of August 2020, 62% of Wikipedia articles across all languages transclude Wikidata content. This statistic ranges from Arabic Wikipedia (arwiki) and Basque Wikipedia (euwiki), where nearly 100% of articles transclude Wikidata content in some form, to Japanese Wikipedia (jawiki) at 38% of articles and many small wikis that lack any Wikidata transclusion.
DLDP Digital Language Survival Kit
The Digital Language Diversity Project Digital Language Survival Kit The DLDP Recommendations to Improve Digital Vitality Imprint The DLDP Digital Language Survival Kit Authors: Klara Ceberio Berger, Antton Gurrutxaga Hernaiz, Paola Baroni, Davyth Hicks, Eleonore Kruse, Valeria Quochi, Irene Russo, Tuomo Salonen, Anneli Sarhimaa, Claudia Soria This work has been carried out in the framework of The Digital Language Diversity Project (www.dldp.eu), funded by the European Union under the Erasmus+ Programme (Grant Agreement no. 2015-1-IT02-KA204-015090) © 2018 This work is licensed under a Creative Commons Attribution 4.0 International License. Cover design: Eleonore Kruse Disclaimer This publication reflects only the authors’ view and the Erasmus+ National Agency and the Commission are not responsible for any use that may be made of the information it contains. www.dldp.eu www.facebook.com/digitallanguagediversity www.twitter.com/dldproject Recommendations at a Glance Digital Capacity Recommendations Indicator Level Recommendations Digital Literacy 2,3 Increasing digital literacy among your native language-speaking community 2,3 Promote the upskilling of language mentors, activists or disseminators 2,3 Establish initiatives to inform and educate speakers about how to acquire and use particular communication and content creation skills 2 Teaching digital literacy to children in your language community through
Thanks for Stopping By: a Study of “Thanks
Thanks for Stopping By: A Study of “Thanks” Usage on Wikimedia Swati Goel (Henry M. Gunn High School), Ashton Anderson (University of Toronto), Leila Zia (Wikimedia Foundation)
ABSTRACT The Thanks feature on Wikipedia, also known as “Thanks”, is a tool with which editors can quickly and easily send one another positive feedback [1]. The aim of this project is to better understand this feature: its scope, the characteristics of a typical “Thanks” interaction, and the effects of receiving a thank on individual editors. We study the motivational impacts of “Thanks” because maintaining editor engagement is a central problem for crowdsourced repositories of knowledge such as Wikimedia. Our main findings are that most editors have not been exposed to the Thanks feature (meaning they have never given nor received a thank), thanks are typically sent upwards (from less experienced to more experienced editors), and receiving a thank is correlated with having high levels of editor ...
... editors a better experience, can therefore increase editor activity. A positive environment may actually be one of the most crucial elements for increasing engagement, as social factors tend to outweigh even those surrounding usability with regards to positively affecting contribution [3]. The impact of these social factors could be quite significant, as a community member’s internal value systems can be influenced by external rewards, thus making positive feedback an extremely useful tool in building online communities [6]. The Thanks feature could therefore represent an important resource for building a positive Wiki community. “Thanks” is no longer a new Wiki feature, having been implemented on English Wikipedia on May 30th, 2013 and introduced to all projects soon thereafter.
Gender Gap in Wikipedia Editing
5 Gender Gap in Wikipedia Editing: A Cross Language Comparison Paolo Massa and Asta Zelenkauskaite INTRODUCTION According to various surveys, the percentage of women editing Wikipedia barely reaches 10 percent. The issue of gender distribution on Wikipedia was first brought to public attention by an article in the New York Times in January 2011. The article, titled “Define Gender Gap? Look Up Wikipedia’s Contributor List,” started by highlighting how, in just ten years, the Wikipedia community accomplished some remarkable goals, such as reaching more than 3.5 million articles in English and starting an online encyclopedia in more than 250 languages. Yet Wikipedia failed to reach at least a minimal gender balance: according to the United Nations and Maastricht University 2010 reported survey, less than 13 percent of contributors were female. A more recent survey carried out in 2011 by the Wikimedia Foundation, the nonprofit organization that coordinates the various Wikipedia projects, reported an even steeper gender gap: women account for just 9 percent of editors. The importance of Wikipedia editors’ diversity is relevant since Wikipedia is increasingly becoming one of the most accessed Web sources for information needs. Some 53 percent of American Internet users searched for information on Wikipedia as of May 2010; 88 percent of 2,318 university students used Wikipedia during a course-related research process, and finally, Wikipedia is the sixth most visited site on the entire Web. Thus, because so many people read the content of Wikipedia pages, it is important to become aware that these pages reflect the point of view of a predominantly male population.
Linguistic and Cultural Diversity in Cyberspace
Ministry of Culture of the Russian Federation Federal Agency for Press and Mass Communications of the Russian Federation Government of the Republic of Sakha (Yakutia) Commission of the Russian Federation for UNESCO Russian Committee of the UNESCO Information for All Programme Ammosov North-Eastern Federal University Interregional Library Cooperation Centre Linguistic and Cultural Diversity in Cyberspace Proceedings of the 3rd International Conference (Yakutsk, Russian Federation, 30 June – 3 July 2014) Moscow 2015 Financial support for this publication is provided by the Government of the Republic of Sakha (Yakutia) and the Government of Khanty-Mansiysk Autonomous Okrug-Ugra Compilers: Evgeny Kuzmin, Anastasia Parshakova, Daria Ignatova Translators: Tatiana Butkova and Elena Malyavskaya English text edited by Anastasia Parshakova Editorial board: Evgeny Kuzmin, Sergey Bakeykin, Tatiana Murovana, Anastasia Parshakova, Nadezhda Zaikova Linguistic and Cultural Diversity in Cyberspace. Proceedings of the 3rd International Conference (Yakutsk, Russian Federation, 30 June – 3 July, 2014). – Moscow: Interregional Library Cooperation Centre, 2015. – 408 p. The book includes communications by the participants of the 3rd International Conference on Linguistic and Cultural Diversity in Cyberspace (Yakutsk, Russian Federation, 30 June – 3 July, 2014), where various aspects of topical political, philosophical and technological challenges of preserving multilingualism in the world and developing it in cyberspace were discussed. The authors share national vision and experience of supporting and promoting linguistic and cultural diversity, express their views on the role of education and ICTs in these processes. The authors are responsible for the choice and presentation of facts and for the opinions expressed, which are not necessarily those of the compilers.
ISBN 978-5-91515-063-0 © Interregional Library Cooperation Centre, 2015 Contents Preface
History Characteristics
Wiki
A wiki (/ˈwɪki/ WIK-ee) is a website that allows the creation and editing of any number of interlinked web pages via a web browser using a simplified markup language or a WYSIWYG text editor.[1][2][3] Wikis are typically powered by wiki software and are often used to create collaborative works. Examples include community websites, corporate intranets, knowledge management systems, and note services. The software can also be used for personal notetaking. Wikis serve different purposes. Some permit control over different functions (levels of access). For example, editing rights may permit changing, adding or removing material. Others may permit access without enforcing access control. Other rules can be imposed for organizing content. Ward Cunningham, the developer of the first wiki software, WikiWikiWeb, originally described it as "the simplest online database that could possibly work."[4] "Wiki" (Hawaiian pronunciation: [ˈwiti] or [ˈviti]) is a Hawaiian word for "fast".[5]
History
WikiWikiWeb was the first wiki.[6] Ward Cunningham started developing WikiWikiWeb in Portland, Oregon, in 1994, and installed it on the Internet domain c2.com[7][8] on March 25, 1995. It was named by Cunningham, who remembered a Honolulu International Airport counter employee telling him to take the "Wiki Wiki Shuttle" bus that runs between the airport's terminals. According to Cunningham, "I chose wiki-wiki as an alliterative substitute for 'quick' and thereby avoided naming this stuff quick-web."[9][10] Cunningham was in part inspired by Apple's HyperCard. Apple had designed a system allowing users to create virtual "card stacks" supporting links among the various cards.
[Figure: Wiki Wiki Shuttle at Honolulu International Airport]