The Production and Consumption of Quality Content in Peer Production Communities

Total Page:16

File Type:pdf, Size:1020Kb

The Production and Consumption of Quality Content in Peer Production Communities The Production and Consumption of Quality Content in Peer Production Communities A Dissertation SUBMITTED TO THE FACULTY OF THE UNIVERSITY OF MINNESOTA BY Morten Warncke-Wang IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Loren Terveen and Brent Hecht December, 2016 Copyright c 2016 M. Warncke-Wang All rights reserved. Portions copyright c Association for Computing Machinery (ACM) and is used in accordance with the ACM Author Agreement. Portions copyright c Association for the Advancement of Artificial Intelligence (AAAI) and is used in accordance with the AAAI Distribution License. Map tiles used in illustrations are from Stamen Design and published under a CC BY 3.0 license, and from OpenStreetMap and published under the ODbL. Dedicated in memory of John. T. Riedl, Ph.D professor, advisor, mentor, squash player, friend i Acknowledgements This work would not be possible without the help and support of a great number of people. I would first like to thank my family, in particular my parents Kari and Hans who have had to see yet another of their offspring fly off to the United States for a prolonged period of time. They have supported me in pursuing my dreams even when it meant I would be away from them. I also thank my brother Hans and his fiancée Anna for helping me get settled in Minnesota, solving everyday problems, and for the good company whenever I stopped by. Likewise, I thank my brother Carl and his wife Bente whom I have only been able to see ever so often, but it has always been moments I have cherished. Thanks as well to my nieces and nephews who brighten up the day whenever I get to visit, and help me remember that discovering the world is always a lot of fun! John Riedl was a fantastic advisor and mentor to me, and Loren Terveen and Brent Hecht did an excellent job of helping me refocus and keep going after John’s unfortunate passing. Since then, Loren and Brent have been wonderful advisors who have kept me on track with producing solid research and I am grateful for everything I have learned from them. I would also like to thank the other two members of my committee, Ching Ren and John Carlis, who have both showed me how to be a better researcher. Thanks as well to everyone at GroupLens Research for creating and supporting an environment where science prospers. First of all, thank you to Joe Konstan, John Riedl, and Loren Terveen for seeing my potential when I did not myself and being instrumental in my move to Minnesota to pursue a Ph.D. Secondly, thank you to my senior students who led by excellent example and helped shape my work: Tony Lam, Aaron Halfaker, Katie Panciera, and Michael Ekstrand. Thanks also to the undergraduate students I got to work and publish papers with: Vlad Ayukaev, Vivek Ranjan, and Connor McMahon. I would also like to thank my fellow graduate stu- dents Jacob Thebault-Spieker and Isaac Johnson for helping me understand how geographic science works and helping me make sense of our findings. GroupLens has also had some great lab assistants and I thank them for helping out with ev- eryday issues, in particular I would like to thank Angela Brandt for welcoming me to GroupLens and helping me understand how things work in the Midwest. Lastly, ii as mentioned GroupLens has a fantastic environment where science prospers, and I would have liked to name every single person for being awesome and helping that happen! My fellow researchers also deserve many thanks. Dan Cosley helped make our Wikipedia machine learning work a reality and I have greatly benefited from his research experience and perspectives. He also created SuggestBot, which has now been serving recommendations to Wikipedia contributors for about ten years and keeps going strong. Zhenhua Dong was a wonderful collaborator on my first paper and his tireless efforts still inspire me. Last, but very much not the least, thank you to Anuradha Uduwage for defying categorisation by being a great collaborator, graduate student, and friend. I would also like to thank my fellow researchers at the Wikimedia Foundation for the opportunities they have given me as a Research Fellow there. Thanks also goes to students and faculty at the University of Washington for welcoming me to the Pacific Northwest. In particular I would like to thank professors Benjamin Mako Hill and David McDonald for providing me with places to contribute and study, as well as graduate students Amirah Majid and Amanda Menking for making the research lab a place that feels like home. I am also sending thanks to my friends on both sides of the Atlantic ocean who help keep my eyes open to fresh perspectives, have my back when I struggle, make sure I laugh, and bring a smile to my face whenever I think of them. I cannot thank my girlfriend Cameo enough for her support over the past three and a half years as I have worked on completing this thesis. She has helped me find alternative perspectives, asked challenging questions, but first and foremost made sure that I do not go through a single day without feeling appreciated nor without at least one good laugh. Finally, I would like to thank the National Science Foundation for financial support under a variety of grants: IIS 08-08692, IIS 08-45351, IIS 09-68483, and IIS 11-11201. iii Abstract Over the past 25 years, commons-based peer production [Ben02] has be- come a vital part of the information technology landscape. There are successful projects in different areas such as open source software (e.g. Apache and Fire- fox), encyclopedias (e.g. Wikipedia), and map data (e.g. OpenStreetMap). A common theme in all these communities is that they are mainly volunteer-driven and that contributors are able to self-select what they want to work on. Studies on contributor motivation in peer production have found “fun” and “appropriate challenges” to be strong factors [Nov07; LW05], both associated with the sen- sation of vital engagement often referred to as “flow” [NC03]. Peer production contributors also often refer to altruism, the desire to be helpful to others, as a motivating factor [BH13]. To what extent does this bottom-up, interest-driven, volunteer-based content production paradigm meet the needs of consumers of this content? This thesis presents our work on improving our understanding of how peer production communities produce quality content and whether said quality con- tent is produced in areas where there is demand for it. We study this from three perspectives and make contributions as follows: we investigate what textual features describe content quality in Wikipedia and develop a high-performance prediction model solely based on features contributors can easily improve (so called “actionable features”); we apply a coherent framework for describing and evaluating quality improvement projects in order to discover factors associated with the success and failure of these types of projects; we introduce an analyt- ical framework that allows is to identify the misalignment between supply of and demand for quality content in peer production communities and measure the impact this has on a community’s audience. iv The research presented in this thesis provides us with a deeper understand- ing of quality content in peer production communities. These communities have created software, encyclopedic content, and maps that in many ways improve our everyday lives as well as those of millions of others. At the same time we have identified areas where there is a shortage of quality content and discussed future work that can help reduce this problem. This thesis thus lays the foun- dation upon which we can build improved communities and positively impact a large part of the world’s population. v Contents List of Tables viii List of Figures x 1 Introduction 1 2 Background 5 2.1 Peer-produced Content Quality . 6 2.2 How is Content Quality Assessed? . 9 2.3 Can We Automate Quality Assessment? . 10 2.4 The Production of Quality Content . 13 2.5 Consumption of Quality Content . 18 2.6 Misalignment Between Production and Consumption . 19 3 Predicting Content Quality Using Actionable Features 22 3.1 An Actionable Quality Model . 24 3.2 Improving the Actionable Quality Model . 41 3.3 Usage and Impact . 48 4 Understanding Quality Improvement Projects 50 4.1 Unified Descriptive Framework . 50 vi Contents 4.2 The Quality Improvement Projects Studied . 51 4.3 Datasets . 55 4.4 Measuring Project Performance . 59 4.5 Results . 60 4.6 Limitations . 74 4.7 Conclusion . 75 5 Misalignment Between Supply and Demand 77 5.1 Research questions . 78 5.2 The Perfect Alignment Hypothesis . 79 5.3 Methods and Datasets . 81 5.4 Misalignment in Wikipedia . 89 5.5 Misalignment in OpenStreetMap . 101 5.6 Discussion . 118 5.7 Future Work and Limitations . 120 5.8 Conclusion . 122 6 Conclusion 124 6.1 Broader Implications . 126 6.2 Future Work . 129 6.3 In Conclusion . 132 Bibliography 133 vii List of Tables 3.1 Feature list of all four models . 30 3.2 Classification results of all four models . 30 3.3 Overall gain ratio evaluation for all 17 features . 34 3.4 Classification results for all classifiers on 2-class problem . 34 3.5 Classification results for all seven assessment classes . 39 3.6 Confusion matrix for classification of all seven assessment classes . 39 3.7 Quality Improvement Project classifier confusion matrix . 44 3.8 Quality Improvement Project classifier prediction error . 45 3.9 Comparison between OLR models for reassessments and predictions .
Recommended publications
  • A Grouplens Perspective
    From: AAAI Technical Report WS-98-08. Compilation copyright © 1998, AAAI (www.aaai.org). All rights reserved. RecommenderSystems: A GroupLensPerspective Joseph A. Konstan*t , John Riedl *t, AI Borchers,* and Jonathan L. Herlocker* *GroupLensResearch Project *Net Perceptions,Inc. Dept. of ComputerScience and Engineering 11200 West78th Street University of Minnesota Suite 300 Minneapolis, MN55455 Minneapolis, MN55344 http://www.cs.umn.edu/Research/GroupLens/ http://www.netperceptions.com/ ABSTRACT identifying sets of articles by keyworddoes not scale to a In this paper, wereview the history and research findings of situation in which there are thousands of articles that the GroupLensResearch project I and present the four broad contain any imaginable set of keywords. Taken together, research directions that we feel are most critical for these two weaknesses represented an opportunity for a new recommender systems. type of filtering, that would focus on finding which INTRODUCTION:A History of the GroupLensProject available articles matchhuman notions of quality and taste. The GroupLens Research project began at the Computer Such a system would be able to produce a list of articles Supported Cooperative Work (CSCW)Conference in 1992. that each user wouldlike, independentof their content. Oneof the keynote speakers at the conference lectured on a Wedecided to apply our ideas in the domain of Usenet his vision of an emerging information economy,in which news. Usenet screamsfor better information filtering, with most of the effort in the economywould revolve around hundreds of thousands of articles posted daily. Manyof the production, distribution, and consumptionof information, articles in each Usenet newsgroupare on the sametopic, so rather than physical goods and services.
    [Show full text]
  • An Analysis of Contributions to Wikipedia from Tor
    Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor Chau Tran Kaylea Champion Andrea Forte Department of Computer Science & Engineering Department of Communication College of Computing & Informatics New York University University of Washington Drexel University New York, USA Seatle, USA Philadelphia, USA [email protected] [email protected] [email protected] Benjamin Mako Hill Rachel Greenstadt Department of Communication Department of Computer Science & Engineering University of Washington New York University Seatle, USA New York, USA [email protected] [email protected] Abstract—User-generated content sites routinely block contri- butions from users of privacy-enhancing proxies like Tor because of a perception that proxies are a source of vandalism, spam, and abuse. Although these blocks might be effective, collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible. One of the largest and most important user-generated content sites, Wikipedia, has attempted to block contributions from Tor users since as early as 2005. We demonstrate that these blocks have been imperfect and that thousands of attempts to edit on Wikipedia through Tor have been successful. We draw upon several data sources and analytical techniques to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions from Tor users to those from other groups of Wikipedia users. Fig. 1. Screenshot of the page a user is shown when they attempt to edit the Our analysis suggests that although Tor users who slip through Wikipedia article on “Privacy” while using Tor. Wikipedia’s ban contribute content that is more likely to be reverted and to revert others, their contributions are otherwise similar in quality to those from other unregistered participants and to the initial contributions of registered users.
    [Show full text]
  • A Recommender System for Groups of Users
    PolyLens: A Recommender System for Groups of Users Mark O’Connor, Dan Cosley, Joseph A. Konstan, and John Riedl Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455 USA {oconnor;cosley;konstan;riedl}@cs.umn.edu Abstract. We present PolyLens, a new collaborative filtering recommender system designed to recommend items for groups of users, rather than for individuals. A group recommender is more appropriate and useful for domains in which several people participate in a single activity, as is often the case with movies and restaurants. We present an analysis of the primary design issues for group recommenders, including questions about the nature of groups, the rights of group members, social value functions for groups, and interfaces for displaying group recommendations. We then report on our PolyLens prototype and the lessons we learned from usage logs and surveys from a nine-month trial that included 819 users. We found that users not only valued group recommendations, but were willing to yield some privacy to get the benefits of group recommendations. Users valued an extension to the group recommender system that enabled them to invite non-members to participate, via email. Introduction Recommender systems (Resnick & Varian, 1997) help users faced with an overwhelming selection of items by identifying particular items that are likely to match each user’s tastes or preferences (Schafer et al., 1999). The most sophisticated systems learn each user’s tastes and provide personalized recommendations. Though several machine learning and personalization technologies can attempt to learn user preferences, automated collaborative filtering (Resnick et al., 1994; Shardanand & Maes, 1995) has become the preferred real-time technology for personal recommendations, in part because it leverages the experiences of an entire community of users to provide high quality recommendations without detailed models of either content or user tastes.
    [Show full text]
  • Community, Impact and Credit: Where Should I Submit My Papers?
    Panels February 23–27, 2013, San Antonio, Texas, USA Community, Impact and Credit: Where should I submit my papers? Abstract Aaron Halfaker R. Stuart Geiger We (the authors of CSCWs program) have finite time and GroupLens Research School of Information energy that can be invested into our publications and the University of Minnesota University of CA, Berkeley research communities we value. While we want our work [email protected] [email protected] to have the most impact possible, we also want to grow Cliff Lampe Loren Terveen and support productive research communities within School of Information GroupLens Research which to have this impact. This panel discussion explores Michigan State University University of Minnesota the costs and benefits of submitting papers to various [email protected] [email protected] tiers of conferences and journals surrounding CSCW and Amy Bruckman Brian Keegan reflects on the value of investing hours into building up a College of Computing Northeastern University research community. Georgia Inst. of Technology [email protected] [email protected] Author Keywords Aniket Kittur Geraldine Fitzpatrick community; credit; impact; publishing; peer review HCI Institute Vienna University of Carnegie Mellon University Technology ACM Classification Keywords [email protected] geraldine.fi[email protected] H.5.0. [Information Interfaces and Presentation (e.g. HCI)]: General Introduction We (the authors of CSCWs program) have finite time and energy that can be invested into our publications and the research communities we value. In order to allow our work to have an impact, we must also grow and maintain Copyright is held by the author/owner(s).
    [Show full text]
  • Improvements in Holistic Recommender System Research
    Improvements in Holistic Recommender System Research A DISSERTATION SUBMITTED TO THE FACULTY OF THE UNIVERSITY OF MINNESOTA BY Daniel Allen Kluver IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Joseph A. Konstan August, 2018 c Daniel Allen Kluver 2018 ALL RIGHTS RESERVED Dedication This dissertation is dedicated to my family, my friends, my advisers John Riedl and Joseph Konstan, my colleagues, both at GroupLens research and at Macalester College, and everyone else who believed in me and supported me along the way. Your support meant everything when I couldn’t support myself. Your belief meant everything when I couldn’t believe in myself. I couldn’t have done this without your help. i Abstract Since the mid 1990s, recommender systems have grown to be a major area of de- ployment in industry, and research in academia. A through-line in this research has been the pursuit, above all else, of the perfect algorithm. With this admirable focus has come a neglect of the full scope of building, maintaining, and improving recom- mender systems. In this work I outline a system deployment and a series of offline and online experiments dedicated to improving our holistic understanding of recommender systems. This work explores the design, algorithms, early performance, and interfaces of recommender systems within the scope of how they are interconnected with other aspects of the system. This work explores many indivisual aspects of a recommender system while keeping in mind how they are connected to other aspects of
    [Show full text]
  • Detecting Undisclosed Paid Editing in Wikipedia
    Detecting Undisclosed Paid Editing in Wikipedia Nikesh Joshi, Francesca Spezzano, Mayson Green, and Elijah Hill Computer Science Department, Boise State University Boise, Idaho, USA [email protected] {nikeshjoshi,maysongreen,elijahhill}@u.boisestate.edu ABSTRACT 1 INTRODUCTION Wikipedia, the free and open-collaboration based online ency- Wikipedia is the free online encyclopedia based on the principle clopedia, has millions of pages that are maintained by thousands of of open collaboration; for the people by the people. Anyone can volunteer editors. As per Wikipedia’s fundamental principles, pages add and edit almost any article or page. However, volunteers should on Wikipedia are written with a neutral point of view and main- follow a set of guidelines when editing Wikipedia. The purpose tained by volunteer editors for free with well-defned guidelines in of Wikipedia is “to provide the public with articles that summa- order to avoid or disclose any confict of interest. However, there rize accepted knowledge, written neutrally and sourced reliably” 1 have been several known incidents where editors intentionally vio- and the encyclopedia should not be considered as a platform for late such guidelines in order to get paid (or even extort money) for advertising and self-promotion. Wikipedia’s guidelines strongly maintaining promotional spam articles without disclosing such. discourage any form of confict-of-interest (COI) editing and require In this paper, we address for the frst time the problem of identi- editors to disclose any COI contribution. Paid editing is a form of fying undisclosed paid articles in Wikipedia. We propose a machine COI editing and refers to editing Wikipedia (in the majority of the learning-based framework using a set of features based on both cases for promotional purposes) in exchange for compensation.
    [Show full text]
  • Truth and Lies of What People Are Really Thinking
    Dedication For Lex and Stella Epigraph Everything we see is a perspective, not the truth —Falsely attributed to Marcus Aurelius (author unknown) Contents Cover Title Page Dedication Epigraph Introduction Part One Genuine Deceptions 1 Body Language Lies 2 Powerful Thinking 3 Mind Your Judgments 4 Scan for Truth Part Two Dating 5 They’re Totally Checking Me Out! 6 Playing Hard-to-Get 7 Just Feeling Sorry for Me 8 I’m Being Ghosted 9 What a Complete Psycho! 10 They Are Running the Show 11 I’m Going to Pay for That! 12 They Are So Mad at Me 13 A Lying Cheat? 14 Definitely into My Friend 15 A Match Made in Heaven? 16 They Are So Breaking Up! Part Three Friends and Family 17 Thick as Thieves 18 My New Bff? 19 Fomo 20 Control Freak 21 Too Close for Comfort 22 They’ll Never Fit in with My Family 23 House on Fire! 24 I Am Boring the Pants off Them 25 Lying Through Their Teeth 26 Persona Non Grata 27 Invisible Me Part Four Working Life 28 I Aced That Interview—So, Where’s the Job Offer? 29 They Hate My Work 30 Big Dog 31 Never Going to See Eye to Eye 32 Cold Fish 33 They’re Gonna Blow! 34 This Meeting Is a Waste of Time 35 Looks Like a Winning Team 36 So, You Think You’re the Boss 37 Hand in the Cookie Jar Summary Bonus Bluff Learn more Acknowledgments Notes About the Authors Praise for Truth & Lies Also by Mark Bowden Copyright About the Publisher Introduction WE CAN ALL RECALL EXAMPLES when our sense of what another person was thinking turned out to be the truth.
    [Show full text]
  • User Session Identification Based on Strong Regularities in Inter-Activity
    User Session Identification Based on Strong Regularities in Inter-activity Time Aaron Halfaker Os Keyes Daniel Kluver Wikimedia Foundation Wikimedia Foundation GroupLens Research [email protected] [email protected] University of Minnesota [email protected] Jacob Thebault-Spieker Tien Nguyen Kenneth Shores GroupLens Research GroupLens Research GroupLens Research University of Minnesota University of Minnesota University of Minnesota [email protected] [email protected] [email protected] Anuradha Uduwage Morten Warncke-Wang GroupLens Research GroupLens Research University of Minnesota University of Minnesota [email protected] [email protected] ABSTRACT 1. INTRODUCTION Session identification is a common strategy used to develop In 2012, we had an idea for a measurement strategy that metrics for web analytics and behavioral analyses of user- would bring insight and understanding to the nature of par- facing systems. Past work has argued that session identifica- ticipation in an online community. While studying partic- tion strategies based on an inactivity threshold is inherently ipation in Wikipedia, the open, collaborative encyclopedia, arbitrary or advocated that thresholds be set at about 30 we found ourselves increasingly curious about the amount of minutes. In this work, we demonstrate a strong regularity time that volunteer contributors invested into the encyclope- in the temporal rhythms of user initiated events across sev- dia's construction. Past work measuring editor engagement eral different domains of online activity (incl. video gaming, relied on counting the number of contributions made by a search, page views and volunteer contributions). We de- user1, but we felt that the amount of time editors spent scribe a methodology for identifying clusters of user activity editing might serve as a more appropriate measure.
    [Show full text]
  • Critical Point of View: a Wikipedia Reader
    w ikipedia pedai p edia p Wiki CRITICAL POINT OF VIEW A Wikipedia Reader 2 CRITICAL POINT OF VIEW A Wikipedia Reader CRITICAL POINT OF VIEW 3 Critical Point of View: A Wikipedia Reader Editors: Geert Lovink and Nathaniel Tkacz Editorial Assistance: Ivy Roberts, Morgan Currie Copy-Editing: Cielo Lutino CRITICAL Design: Katja van Stiphout Cover Image: Ayumi Higuchi POINT OF VIEW Printer: Ten Klei Groep, Amsterdam Publisher: Institute of Network Cultures, Amsterdam 2011 A Wikipedia ISBN: 978-90-78146-13-1 Reader EDITED BY Contact GEERT LOVINK AND Institute of Network Cultures NATHANIEL TKACZ phone: +3120 5951866 INC READER #7 fax: +3120 5951840 email: [email protected] web: http://www.networkcultures.org Order a copy of this book by sending an email to: [email protected] A pdf of this publication can be downloaded freely at: http://www.networkcultures.org/publications Join the Critical Point of View mailing list at: http://www.listcultures.org Supported by: The School for Communication and Design at the Amsterdam University of Applied Sciences (Hogeschool van Amsterdam DMCI), the Centre for Internet and Society (CIS) in Bangalore and the Kusuma Trust. Thanks to Johanna Niesyto (University of Siegen), Nishant Shah and Sunil Abraham (CIS Bangalore) Sabine Niederer and Margreet Riphagen (INC Amsterdam) for their valuable input and editorial support. Thanks to Foundation Democracy and Media, Mondriaan Foundation and the Public Library Amsterdam (Openbare Bibliotheek Amsterdam) for supporting the CPOV events in Bangalore, Amsterdam and Leipzig. (http://networkcultures.org/wpmu/cpov/) Special thanks to all the authors for their contributions and to Cielo Lutino, Morgan Currie and Ivy Roberts for their careful copy-editing.
    [Show full text]
  • ORES Scalable, Community-Driven, Auditable, Machine Learning-As-A-Service in This Presentation, We Lay out Ideas
    ORES Scalable, community-driven, auditable, machine learning-as-a-service In this presentation, we lay out ideas... This is a conversation starter, not a conversation ender. We put forward our best ideas and thinking based on our partial perspectives. We look forward to hearing yours. Outline - Opportunities and fears of artificial intelligence - ORES: A break-away success - The proposal & asks We face a significant challenge... There’s been a significant decline in the number of active contributors to the English-language Wikipedia. Contributors have fallen by 40% over the past eight years, to about 30,000. Hamstrung by our own success? “Research indicates that the problem may be rooted in Wikipedians’ complex bureaucracy and their often hard-line responses to newcomers’ mistakes, enabled by semi-automated tools that make deleting new changes easy.” - MIT Technology Review (December 1, 2015) It’s the revert-new-editors’-first-few-edits-and-alienate-them problem. Imagine an algorithm that can... ● Judge whether an edit was made in good faith, so that new editors don’t have their first contributions wiped out. ● Detect bad-faith edits and help fight vandalism. Machine learning and prediction have the power to support important work in open contribution projects like Wikipedia at massive scales. But it’s far from a no-brainer. Quite the opposite. On the bright side, Machine learning has the potential to help our projects scale by reducing the workload of editors and enhancing the value of our content. by Luiz Augusto Fornasier Milani under CC BY-SA 4.0, from Wikimedia Commons. On the dark side, AIs have the potential to perpetuate biases and silence voices in novel and insidious ways.
    [Show full text]
  • Package 'Wikipedir'
    Package ‘WikipediR’ February 5, 2017 Type Package Title A MediaWiki API Wrapper Version 1.5.0 Date 2017-02-04 Author Oliver Keyes [aut, cre], Brock Tilbert [ctb] Maintainer Oliver Keyes <[email protected]> Description A wrapper for the MediaWiki API, aimed particularly at the Wikimedia 'production' wikis, such as Wikipedia. It can be used to retrieve page text, information about users or the history of pages, and elements of the category tree. License MIT + file LICENSE Imports httr, jsonlite Suggests testthat, knitr, WikidataR, pageviews BugReports https://github.com/Ironholds/WikipediR/issues URL https://github.com/Ironholds/WikipediR/ VignetteBuilder knitr RoxygenNote 5.0.1 NeedsCompilation no Repository CRAN Date/Publication 2017-02-05 08:44:55 R topics documented: categories_in_page . .2 pages_in_category . .3 page_backlinks . .4 page_content . .5 page_external_links . .6 page_info . .7 page_links . .8 1 2 categories_in_page query . .9 random_page . .9 recent_changes . 10 revision_content . 11 revision_diff . 12 user_contributions . 14 user_information . 15 WikipediR . 17 Index 18 categories_in_page Retrieves categories associated with a page. Description Retrieves categories associated with a page (or list of pages) on a MediaWiki instance Usage categories_in_page(language = NULL, project = NULL, domain = NULL, pages, properties = c("sortkey", "timestamp", "hidden"), limit = 50, show_hidden = FALSE, clean_response = FALSE, ...) Arguments language The language code of the project you wish to query, if appropriate. project The project you wish to query ("wikiquote"), if appropriate. Should be provided in conjunction with language. domain as an alternative to a language and project combination, you can also provide a domain ("rationalwiki.org") to the URL constructor, allowing for the querying of non-Wikimedia MediaWiki instances. pages A vector of page titles, with or without spaces, that you want to retrieve cate- gories for.
    [Show full text]
  • A Roll of the Dice
    Confidence Displays and Training in Recommender Systems Sean M. McNee, Shyong K. Lam, Catherine Guetzlaff, Joseph A. Konstan, and John Riedl GroupLens Research Project Department of Computer Science and Engineering University of Minnesota, MN 55455 USA {mcnee, lam, guetzlaf, konstan, riedl}@cs.umn.edu Abstract: Recommender systems help users sort through vast quantities of information. Sometimes, however, users do not know if they can trust the recommendations they receive. Adding a confidence metric has the potential to improve user satisfaction and alter user behavior in a recommender system. We performed an experiment to measure the effects of a confidence display as a component of an existing collaborative filtering- based recommender system. Minimal training improved use of the confidence display compared to no training. Novice users were less likely to notice, understand, and use the confidence display than experienced users of the system. Providing training about a confidence display to experienced users greatly reduced user satisfaction in the recommender system. These results raise interesting issues and demonstrate subtle effects about how and when to train users when adding features to a system. Keywords: Recommender Systems, Collaborative Filtering, Confidence Displays, Risk, Training range. For example, a fan of science fiction may 1 Introduction receive a recommendation for a particular art film because other science fiction fans liked it. This People face the problem of information overload serendipity in recommendations is one of the every day. As the number of web sites, books, strengths of collaborative filtering. magazines, research papers, and so on continue to It is also one of its weaknesses. CF algorithms rise, it is getting harder to keep up.
    [Show full text]