Computational Social Roles

Total Page:16

File Type:pdf, Size:1020Kb

Computational Social Roles COMPUTATIONAL SOCIAL ROLES Diyi Yang CMU-LTI-19-001 Language Technologies Institute School of Computer Science Carnegie Mellon University 5000 Forbes Ave., Pittsburgh, PA 15213 www.lti.cs.cmu.edu Thesis Committee: Robert E. Kraut, Co-Chair (Carnegie Mellon University) Eduard Hovy, Co-Chair (Carnegie Mellon University) Brandy L. Aven (Carnegie Mellon University) Dan Jurafsky (Stanford University) Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Language and Information Technologies Copyright c 2019 Diyi Yang Keywords: Computational social science, conversational acts, computer-supported cooperative work, generative model, human-computer interaction, machine learning, natural language processing, neural network, online health communities, recommender system, social network, social support, semantics, self-disclosure, social roles, social science, well-being, Wikipedia Abstract Millions of people participate in online communities, exchange expertise and ideas, and collaborate to produce complex artifacts. They often enact a variety of social roles in the process of helping their communities and the public at large, which strongly influence the amount and types of work they do, and how they coordinate their activities. Better understanding members’ roles benefits members by clarifying how they should behave to participate effectively and also benefits the community overall by encouraging members to contribute in ways that best use their skills and interests. Social sciences have provided rich theoretic taxonomies of social roles within groups, while natural language processing techniques enable us to automate the identification of social roles in online communities. However, most social science work has focused on generic roles without accommo- dating the activities associated with tasks in specific contexts or automat- ing the process of role identification. While there has been work to date about automatic role inference, identification of social roles has not had a corresponding strong emphasis in the language technologies community. A variety of methods were developed to extract specific “roles” or patterns in different contexts, lacking generalized definitions about what are roles and systematic methods to extract roles. Moreover, how roles change over time and how the awareness of roles influence role holders’ performance and the group production, have not been adequately researched in both fields. This thesis advocates for both theories of social science and models of text analysis to better define roles, develop ways to extract roles and optimally recommend roles to users. Concretely, this work defines what are social roles, introduces five measurable facets associated with social roles, and pro- posed a generic methodology for role identification. It also demonstrated how to computationally model social role and its facets in two socially im- portant contexts - Wikipedia and Cancer Survivor Network. Via combining theories of social roles and computational models for role identification on those two large-scale contexts, this research reveals details about emergent, behavioral, and functioning roles, and a set of computational techniques to identify such roles via fine-grained operationalization of role holders’ behaviors. This work fills the longstanding gap in role theory and empirical modeling about emergent roles in online communities, and lays the foun- dation for future work to identify and analyze roles that people enacted in group processes both online and offline. iv Acknowledgments I’m always grateful to my adviser Bob Kraut — the best adviser one could ever have. He is a great scholar and acts as a tremendous role model for me. Over the last several years we work together, I learned from him what is research and how to do rigorous research. I was trained in computer science, but Bob equipped me well with knowledge and mindset from social science. Although the journey of PhD is like taking a roller coaster, Bob provides as much support as I needed no matter there were ups or downs. These years of working with Bob is the most valuable part of my experiences at CMU. I’m very lucky to have Eduard Hovy as my co-adviser. He always en- courages me to go bigger, deeper and broader of my research direction. He is knowledgeable, visionary, and teaches me on how to think critically. Our advise meetings were not only about specific research projects, but also discussions around research visions, career plans, personality, leadership, etc. He believes in me, sets the right goals for me although I felt lost and not confident about myself most of the times. Without his encouragement and support, I would have still been the shy and timid myself. I’m deeply honored to have Brandy Aven and Dan Jurafsky on my thesis committee. I benefit a lot from Brandy’s expertise, who always push me to think of the big picture around my research. Dan is supportive, insightful, brilliant, and one of the great leaders in our field. I’m grateful that he hosted my three-month visit to Stanford NLP group. It was such a joy working with Dan and he provided me a lot of guidance and help over our meetings. I thank LTI and HCI faculty Alan Black, Jamie Callan, Chris Dyer, Alex Hauptmann, Alon Lavie, Louis-Philippe Morency, Graham Neubig, Carolyn Rosé, Yulia Tsvetkov and Laura Dabbish for their advice and help to my research or job-seeking. A special thanks to the administrative stuff: Stacey Young, Christy Melucci, Laura Alford, and JaRon Pitts. I would like to thank Aaron Halfaker from Wikimedia Foundation, Bin Gao, Tie-Yan Liu, Scott Counts from Microsoft Research, Ryan Ritter and Mark Handel from Facebook Research, for hosting my internships. I’m extremely grateful for numerous friends and collaborators, including David Adamson, Waleed Ammar, Danqi Chen, Jiaxin Chen, Vivian Yung- Nung Chen, Zihang Dai, Sauvik Das, Pradeep Dasigi, Yulun Du, Manaal Faruqui, Fang Fei, Kartik Goyal, Liangke Gui, Anhong Guo, Hyeju Jang, He He, Ting-Hao Kenneth Huang, Junjie Hu, Liang He, Zhiting Hu, Dirk Hovy, Haojian Jin, Yohan Jo, Guokun Lai, Huiying Li, Mu Li, Jiwei Li, Bin Liu, Hanxiao Liu, Zhengzhong Liu, Zack Lipton, Anna Kasunic, Xuezhe Ma, Elijah Mayfield, Will Monroe, Felicia Ng, Joseph Seering, Chenxi Song, Lu Sun, Di Wang, Haohan Wang, Ling Wang, Xu Wang, William Yang Wang, Yi- Chia Wang, Yu-Xiang Wang, Fan Wei, Miaomiao Wen, Yuexin Wu, Pengtao Xie, Qizhe Xie, Zhang Yang, Zhilin Yang, Zi Yang, Zichao Yang, Zheng Yao, Pengcheng Yin, Zhou Yu, Hao Zhang, Shikun Zhang, Yang Zhang, Guoqing Zheng, Haiyi Zhu, and many others. I want to specifically thank Chenxi, Jiaxin, Joseph, Shikun, Xu, and Zheng for being a listening ear and helping me overcome stress and hard times in life and work. I would like to thank my undergraduate adviser Yong Yu (and our ACM Honored Class), mentors Tianqi Chen and Weinan Zhang who started me on my first several research projects. I thank my parents for raising me up, for their unconditional support, and for guiding me to where I am today. I would like to thank my boyfriend Peter for his patience, encouragement, trust, and love along the way. Finally, I thank Facebook PhD Fellowship, CMU Presidential Fellowship and funds from Google and NIH for generously supporting this research. vi Contents 1 Introduction1 1.1 Thesis Overview.................................4 1.1.1 Role Theory................................4 1.1.2 Methodology for Identifying Roles...................4 1.1.3 Case Studies of Role Identification...................4 1.2 Research Impact..................................6 2 Role Theory9 2.1 Theoretical Modeling............................... 10 2.2 Computational Modeling............................ 12 2.3 Role Framework.................................. 14 2.3.1 Agent.................................... 14 2.3.2 Interaction................................. 16 2.3.3 Goal.................................... 17 2.3.4 Expectation................................ 18 2.3.5 Context................................... 20 2.4 Relevant Processes about Roles......................... 21 2.5 Summary...................................... 23 3 Methodology for Identifying Roles 25 3.1 Generic Methodology............................... 25 3.1.1 Role Postulation............................. 26 3.1.2 Role Definition.............................. 27 3.1.3 Role Identification............................ 30 3.1.4 Role Evaluation.............................. 34 3.2 Iterative Process of Identification........................ 36 3.3 Reflection...................................... 39 vii I Role Identification on Wikipedia 41 4 Identifying Roles of Editors 43 4.1 Introduction.................................... 44 4.2 Related Work................................... 46 4.3 Research Question and Data........................... 48 4.4 Predicting Edit Categories............................ 48 4.4.1 Edit Categories Construction...................... 49 4.4.2 Feature Space Design........................... 50 4.5 Modeling Editor Roles.............................. 52 4.5.1 Role Identification Method....................... 53 4.5.2 Derived Roles Exploration and Validation............... 54 4.6 Improving Article Quality............................ 57 4.6.1 Model Design............................... 58 4.6.2 Result Discussion............................. 62 4.7 Discussion and Conclusion........................... 63 4.8 Reflection...................................... 64 5 Identifying Semantic Edit Intention 65 5.1 Introduction...................................
Recommended publications
  • Universality, Similarity, and Translation in the Wikipedia Inter-Language Link Network
    In Search of the Ur-Wikipedia: Universality, Similarity, and Translation in the Wikipedia Inter-language Link Network Morten Warncke-Wang1, Anuradha Uduwage1, Zhenhua Dong2, John Riedl1 1GroupLens Research Dept. of Computer Science and Engineering 2Dept. of Information Technical Science University of Minnesota Nankai University Minneapolis, Minnesota Tianjin, China {morten,uduwage,riedl}@cs.umn.edu [email protected] ABSTRACT 1. INTRODUCTION Wikipedia has become one of the primary encyclopaedic in- The world: seven seas separating seven continents, seven formation repositories on the World Wide Web. It started billion people in 193 nations. The world's knowledge: 283 in 2001 with a single edition in the English language and has Wikipedias totalling more than 20 million articles. Some since expanded to more than 20 million articles in 283 lan- of the content that is contained within these Wikipedias is guages. Criss-crossing between the Wikipedias is an inter- probably shared between them; for instance it is likely that language link network, connecting the articles of one edition they will all have an article about Wikipedia itself. This of Wikipedia to another. We describe characteristics of ar- leads us to ask whether there exists some ur-Wikipedia, a ticles covered by nearly all Wikipedias and those covered by set of universal knowledge that any human encyclopaedia only a single language edition, we use the network to under- will contain, regardless of language, culture, etc? With such stand how we can judge the similarity between Wikipedias a large number of Wikipedia editions, what can we learn based on concept coverage, and we investigate the flow of about the knowledge in the ur-Wikipedia? translation between a selection of the larger Wikipedias.
    [Show full text]
  • Volunteer Contributions to Wikipedia Increased During COVID-19 Mobility Restrictions
    Volunteer contributions to Wikipedia increased during COVID-19 mobility restrictions Thorsten Ruprechter1,*, Manoel Horta Ribeiro2, Tiago Santos1, Florian Lemmerich3, Markus Strohmaier3,4, Robert West2, and Denis Helic1 1Graz University of Technology, 8010 Graz, Austria 2EPFL, 1015 Lausanne, Switzerland 3RWTH Aachen University, 52062 Aachen, Germany 4GESIS – Leibniz Institute for the Social Sciences, 50667 Cologne, Germany *Corresponding author ([email protected]) Wikipedia, the largest encyclopedia ever created, is a global initiative driven by volunteer contribu- tions. When the COVID-19 pandemic broke out and mobility restrictions ensued across the globe, it was unclear whether Wikipedia volunteers would become less active in the face of the pandemic, or whether they would rise to meet the increased demand for high-quality information despite the added stress inflicted by this crisis. Analyzing 223 million edits contributed from 2018 to 2020 across twelve Wikipedia language editions, we find that Wikipedia’s global volunteer community responded remarkably to the pandemic, substantially increasing both productivity and the number of newcom- ers who joined the community. For example, contributions to the English Wikipedia increased by over 20% compared to the expectation derived from pre-pandemic data. Our work sheds light on the response of a global volunteer population to the COVID-19 crisis, providing valuable insights into the behavior of critical online communities under stress. Wikipedia is the world’s largest encyclopedia, one of the most prominent volunteer-based information systems in existence [18, 29], and one of the most popular destinations on the Web [2]. On an average day in 2019, users from around the world visited Wikipedia about 530 million times and editors voluntarily contributed over 870 thousand edits to one of Wikipedia’s language editions (Supplementary Table 1).
    [Show full text]
  • Multilingual Ranking of Wikipedia Articles with Quality and Popularity Assessment in Different Topics
    computers Article Multilingual Ranking of Wikipedia Articles with Quality and Popularity Assessment in Different Topics Włodzimierz Lewoniewski * , Krzysztof W˛ecel and Witold Abramowicz Department of Information Systems, Pozna´nUniversity of Economics and Business, 61-875 Pozna´n,Poland * Correspondence: [email protected]; Tel.: +48-(61)-639-27-93 Received: 10 May 2019; Accepted: 13 August 2019; Published: 14 August 2019 Abstract: On Wikipedia, articles about various topics can be created and edited independently in each language version. Therefore, the quality of information about the same topic depends on the language. Any interested user can improve an article and that improvement may depend on the popularity of the article. The goal of this study is to show what topics are best represented in different language versions of Wikipedia using results of quality assessment for over 39 million articles in 55 languages. In this paper, we also analyze how popular selected topics are among readers and authors in various languages. We used two approaches to assign articles to various topics. First, we selected 27 main multilingual categories and analyzed all their connections with sub-categories based on information extracted from over 10 million categories in 55 language versions. To classify the articles to one of the 27 main categories, we took into account over 400 million links from articles to over 10 million categories and over 26 million links between categories. In the second approach, we used data from DBpedia and Wikidata. We also showed how the results of the study can be used to build local and global rankings of the Wikipedia content.
    [Show full text]
  • Florida State University Libraries
    )ORULGD6WDWH8QLYHUVLW\/LEUDULHV 2020 Wiki-Donna: A Contribution to a More Gender-Balanced History of Italian Literature Online Zoe D'Alessandro Follow this and additional works at DigiNole: FSU's Digital Repository. For more information, please contact [email protected] THE FLORIDA STATE UNIVERSITY COLLEGE OF ARTS & SCIENCES WIKI-DONNA: A CONTRIBUTION TO A MORE GENDER-BALANCED HISTORY OF ITALIAN LITERATURE ONLINE By ZOE D’ALESSANDRO A Thesis submitted to the Department of Modern Languages and Linguistics in partial fulfillment of the requirements for graduation with Honors in the Major Degree Awarded: Spring, 2020 The members of the Defense Committee approve the thesis of Zoe D’Alessandro defended on April 20, 2020. Dr. Silvia Valisa Thesis Director Dr. Celia Caputi Outside Committee Member Dr. Elizabeth Coggeshall Committee Member Introduction Last year I was reading Una donna (1906) by Sibilla Aleramo, one of the most important ​ ​ works in Italian modern literature and among the very first explicitly feminist works in the Italian language. Wanting to know more about it, I looked it up on Wikipedia. Although there exists a full entry in the Italian Wikipedia (consisting of a plot summary, publishing information, and external links), the corresponding page in the English Wikipedia consisted only of a short quote derived from a book devoted to gender studies, but that did not address that specific work in great detail. As in-depth and ubiquitous as Wikipedia usually is, I had never thought a work as important as this wouldn’t have its own page. This discovery prompted the question: if this page hadn’t been translated, what else was missing? And was this true of every entry for books across languages, or more so for women writers? My work in expanding the entry for Una donna was the beginning of my exploration ​ ​ into the presence of Italian women writers in the Italian and English Wikipedias, and how it relates back to canon, Wikipedia, and gender studies.
    [Show full text]
  • Does Wikipedia Matter? the Effect of Wikipedia on Tourist Choices Marit Hinnosaar, Toomas Hinnosaar, Michael Kummer, and Olga Slivko Discus­­ Si­­ On­­ Paper No
    Dis cus si on Paper No. 15-089 Does Wikipedia Matter? The Effect of Wikipedia on Tourist Choices Marit Hinnosaar, Toomas Hinnosaar, Michael Kummer, and Olga Slivko Dis cus si on Paper No. 15-089 Does Wikipedia Matter? The Effect of Wikipedia on Tourist Choices Marit Hinnosaar, Toomas Hinnosaar, Michael Kummer, and Olga Slivko First version: December 2015 This version: September 2017 Download this ZEW Discussion Paper from our ftp server: http://ftp.zew.de/pub/zew-docs/dp/dp15089.pdf Die Dis cus si on Pape rs die nen einer mög lichst schnel len Ver brei tung von neue ren For schungs arbei ten des ZEW. Die Bei trä ge lie gen in allei ni ger Ver ant wor tung der Auto ren und stel len nicht not wen di ger wei se die Mei nung des ZEW dar. Dis cus si on Papers are inten ded to make results of ZEW research prompt ly avai la ble to other eco no mists in order to encou ra ge dis cus si on and sug gesti ons for revi si ons. The aut hors are sole ly respon si ble for the con tents which do not neces sa ri ly repre sent the opi ni on of the ZEW. Does Wikipedia Matter? The Effect of Wikipedia on Tourist Choices ∗ Marit Hinnosaar† Toomas Hinnosaar‡ Michael Kummer§ Olga Slivko¶ First version: December 2015 This version: September 2017 September 25, 2017 Abstract We document a causal influence of online user-generated information on real- world economic outcomes. In particular, we conduct a randomized field experiment to test whether additional information on Wikipedia about cities affects tourists’ choices of overnight visits.
    [Show full text]
  • Wikipedia Matters∗
    Wikipedia Matters∗ Marit Hinnosaar† Toomas Hinnosaar‡ Michael Kummer§ Olga Slivko¶ September 29, 2017 Abstract We document a causal impact of online user-generated information on real-world economic outcomes. In particular, we conduct a randomized field experiment to test whether additional content on Wikipedia pages about cities affects tourists’ choices of overnight visits. Our treatment of adding information to Wikipedia increases overnight visits by 9% during the tourist season. The impact comes mostly from improving the shorter and incomplete pages on Wikipedia. These findings highlight the value of content in digital public goods for informing individual choices. JEL: C93, H41, L17, L82, L83, L86 Keywords: field experiment, user-generated content, Wikipedia, tourism industry 1 Introduction Asymmetric information can hinder efficient economic activity. In recent decades, the Internet and new media have enabled greater access to information than ever before. However, the digital divide, language barriers, Internet censorship, and technological con- straints still create inequalities in the amount of accessible information. How much does it matter for economic outcomes? In this paper, we analyze the causal impact of online information on real-world eco- nomic outcomes. In particular, we measure the impact of information on one of the primary economic decisions—consumption. As the source of information, we focus on Wikipedia. It is one of the most important online sources of reference. It is the fifth most ∗We are grateful to Irene Bertschek, Avi Goldfarb, Shane Greenstein, Tobias Kretschmer, Thomas Niebel, Marianne Saam, Greg Veramendi, Joel Waldfogel, and Michael Zhang as well as seminar audiences at the Economics of Network Industries conference in Paris, ZEW Conference on the Economics of ICT, and Advances with Field Experiments 2017 Conference at the University of Chicago for valuable comments.
    [Show full text]
  • Wikipedia Matters∗
    Wikipedia Matters∗ Marit Hinnosaar† Toomas Hinnosaar‡ Michael Kummer§ Olga Slivko¶ July 14, 2019 Abstract We document a causal impact of online user-generated information on real-world economic outcomes. In particular, we conduct a randomized field experiment to test whether additional content on Wikipedia pages about cities affects tourists’ choices of overnight visits. Our treatment of adding information to Wikipedia in- creases overnight stays in treated cities compared to non-treated cities. The impact is largely driven by improvements to shorter and relatively incomplete pages on Wikipedia. Our findings highlight the value of content in digital public goods for informing individual choices. JEL: C93, H41, L17, L82, L83, L86 Keywords: field experiment, user-generated content, Wikipedia, tourism industry ∗We are grateful to Irene Bertschek, Avi Goldfarb, Shane Greenstein, Tobias Kretschmer, Michael Luca, Thomas Niebel, Marianne Saam, Greg Veramendi, Joel Waldfogel, and Michael Zhang as well as seminar audiences at the Economics of Network Industries conference (Paris), ZEW Conference on the Economics of ICT (Mannheim), Advances with Field Experiments 2017 Conference (University of Chicago), 15th Annual Media Economics Workshop (Barcelona), Conference on Digital Experimentation (MIT), and Digital Economics Conference (Toulouse) for valuable comments. Ruetger Egolf, David Neseer, and Andrii Pogorielov provided outstanding research assistance. Financial support from SEEK 2014 is gratefully acknowledged. †Collegio Carlo Alberto and CEPR, [email protected] ‡Collegio Carlo Alberto, [email protected] §Georgia Institute of Technology & University of East Anglia, [email protected] ¶ZEW, [email protected] 1 1 Introduction Asymmetric information can hinder efficient economic activity (Akerlof, 1970). In recent decades, the Internet and new media have enabled greater access to information than ever before.
    [Show full text]
  • Companies and Wikipedia: Friend Or Foe?
    Lundquist Wikipedia Research 2017. Since 2008 COMPANIES AND WIKIPEDIA: FRIEND OR FOE? In this report: European, Italian, German and Swiss editions (Updated in May 2017) Wikipedia is the fifth most visited website in the world, with articles about companies perennially well positioned on the first page of search results. Yet despite this visibility, the articles about the 100 largest companies in Europe, the 100 largest in Italy, the 30 German companies included in the DAX 30 index and the 48 largest in Switzerland often lack information and present critical issues. With the already small number of active Wikipedia editors decreasing, this situation is likely to worsen. Some companies think that by editing articles about themselves they have an easy workaround. However, any company that does so runs the risk of violating Wikipedia’s rules, therefore creating a hostile environment. The potential resulting reputational backlash would be enormous. Since the first edition in 2008 our research revealed the low quality of the vast majority of company Wikipedia pages. For this reason, Lundquist has refined a set of tested guidelines, in order to help companies to safely approach the free encyclopedia as well as engage with its vast online community in a constructive manner. This proposed alliance entails abiding by Wikipedia’s rules so as to ensure information is accurate. When done correctly, a Wikipedia article is a win for both the encyclopedia and companies. 1 - Lundquist Wikipedia Research 2015 LUNDQUIST WIKIPEDIA RESEARCH As part of its research into online corporate CONTENTS information, the Lundquist Wikipedia Research covers the article content of major corporations.
    [Show full text]
  • Quality Assessment Process in Wikipedia's Vetrina
    Observatorio (OBS*) Journal, 8 (2009), 148-161 1646-5954/ERC123483/2009 148 Quality assessment process in Wikipedia’s Vetrina: the role of the community’s policies and rules Sara Monaci, Università degli Studi di Torino Abstract The increasing growth of Wikipedia poses many questions about its organizational model and its development as a free-open knowledge repository. Yochai Benkler describes Wikipedia as a CBPP (commons-based peer production) system: a platform which enables users to easily generate knowledge contents and to manage them collaboratively and on free-voluntary basis. Quality is one of the main concerns related to such a system. How would a CBPP environment guarantee at the same time the openness of its organization and a good level of accreditation? The paper offers an overview of the quality assessment processes in it.wiki’s Vetrina section. It also suggests an explanation to quality assessment which questions Benkler’s hypothesis. Thanks to a qualitative analysis carried out through in-depth interviews to Wikipedia users and through a period of ethnographic observation, the paper outlines Vetrina’s organization and the factors related to the evaluation of quality contents. Copyright © 2009 (Sara Monaci). Licensed under the Creative Commons Attribution Noncommercial No Derivatives (by-nc-nd). Available at http://obs.obercom.pt. Observatorio (OBS*) Journal, 8 (2009) Sara Monaci 139 Introduction. Wikipedia as a tool for collective knowledge production Since its appearance in 2001, Wikipedia gained progressively the attention of media and Internet users as an exemplary project of on line collaboration. The biggest Web Encyclopaedia includes today 10 million articles in more than 250 languages and involves about 75.000 active users distributed worldwide (http://en.wikipedia.org/wiki/Wikipedia:About).
    [Show full text]
  • Wikipedia @ 20
    Wikipedia @ 20 Wikipedia @ 20 Stories of an Incomplete Revolution Edited by Joseph Reagle and Jackie Koerner The MIT Press Cambridge, Massachusetts London, England © 2020 Massachusetts Institute of Technology This work is subject to a Creative Commons CC BY- NC 4.0 license. Subject to such license, all rights are reserved. The open access edition of this book was made possible by generous funding from Knowledge Unlatched, Northeastern University Communication Studies Department, and Wikimedia Foundation. This book was set in Stone Serif and Stone Sans by Westchester Publishing Ser vices. Library of Congress Cataloging-in-Publication Data Names: Reagle, Joseph, editor. | Koerner, Jackie, editor. Title: Wikipedia @ 20 : stories of an incomplete revolution / edited by Joseph M. Reagle and Jackie Koerner. Other titles: Wikipedia at 20 Description: Cambridge, Massachusetts : The MIT Press, [2020] | Includes bibliographical references and index. Identifiers: LCCN 2020000804 | ISBN 9780262538176 (paperback) Subjects: LCSH: Wikipedia--History. Classification: LCC AE100 .W54 2020 | DDC 030--dc23 LC record available at https://lccn.loc.gov/2020000804 Contents Preface ix Introduction: Connections 1 Joseph Reagle and Jackie Koerner I Hindsight 1 The Many (Reported) Deaths of Wikipedia 9 Joseph Reagle 2 From Anarchy to Wikiality, Glaring Bias to Good Cop: Press Coverage of Wikipedia’s First Two Decades 21 Omer Benjakob and Stephen Harrison 3 From Utopia to Practice and Back 43 Yochai Benkler 4 An Encyclopedia with Breaking News 55 Brian Keegan 5 Paid with Interest: COI Editing and Its Discontents 71 William Beutler II Connection 6 Wikipedia and Libraries 89 Phoebe Ayers 7 Three Links: Be Bold, Assume Good Faith, and There Are No Firm Rules 107 Rebecca Thorndike- Breeze, Cecelia A.
    [Show full text]
  • A Systematic Review of Scholarly Research on Wikipedia
    Downloaded from orbit.dtu.dk on: Oct 06, 2021 The People’s Encyclopedia Under the Gaze of the Sages: A Systematic Review of Scholarly Research on Wikipedia Okoli, Chitu; Mehdi, Mohamad; Mesgari, Mostafa; Nielsen, Finn Årup; Lanamäki, Arto Link to article, DOI: 10.2139/ssrn.2021326 Publication date: 2012 Document Version Publisher's PDF, also known as Version of record Link back to DTU Orbit Citation (APA): Okoli, C., Mehdi, M., Mesgari, M., Nielsen, F. Å., & Lanamäki, A. (2012). The People’s Encyclopedia Under the Gaze of the Sages: A Systematic Review of Scholarly Research on Wikipedia. https://doi.org/10.2139/ssrn.2021326 General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. Users may download and print one copy of any publication from the public portal for the purpose of private study or research. You may not further distribute the material or use it for any profit-making activity or commercial gain You may freely distribute the URL identifying the publication in the public portal If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. The people’s encyclopedia under the gaze of the sages: A systematic review of scholarly research on Wikipedia Chitu Okoli John Molson School of Business,
    [Show full text]
  • Wikipedia Seeks to Speak Your Language Using Language Detection to Search the Right Wikipedia by Trey Jones, Software Engineer, Wikimedia Foundation
    Wikipedia seeks to speak your language Using language detection to search the right Wikipedia by Trey Jones, Software Engineer, Wikimedia Foundation Photo by Colin, CC BY­SA 4.0. ​ ​ ​ Wikipedia readers speak many languages. Sometimes they search for phrases not in the language of the current wiki. This unfortunately leads to poor search results. A recent survey of English Wikipedia queries identified searches in 40 different languages! ​ ​ ​ To better help people find what they are looking for, the Wikimedia Discovery department has ​ ​ ​ rolled out language identification software to detect and then redirect unsuccessful searches in other languages (if needed). Searches will be redirected to a Wikipedia that is more likely to provide results, and those cross­wiki results will be displayed along with the local­wiki results, if any. We’ve started by enabling the language identification and search redirection for English, French, German, Italian, and Spanish Wikipedias, with more to come. “Foreign language” searches on a given Wikipedia can work just fine—phrases and mottos ​ ​ (scientia est lux lucis, ἀγεωμέτρητος μηδεὶς εἰσίτω), places (さいたま市, ᏣᎳᎩᎯ ᎠᏰᎵ), people ​ ​ ​ ​ ​ ​ ​ ​ ​ ​ (จาพนม ยีรมย , Екатерина Гордеева), events (春节, ), and other things (한글, ܐܝܡܪܐ, Հայերեն ​ ั ์ ​ ​ ​ ​ ​ ​ ​होल ​ ​ ​ ​ ​ ​ Վիքիփեդիա)—but they don’t work quite so well in many cases. Hence the idea to try to detect ​ the language of unsuccessful searches—i.e., those that get fewer than three results—and redirect them to another Wikipedia when appropriate. The automatic language detection works similarly to the way you might identify a language you don’t know—by looking for bits of text that “look like” a particular language.
    [Show full text]