Towards a Better Understanding of Peer-Produced Structured Content Value

Total Pages: 16

File Type: PDF, Size: 1020 KB

Towards a Better Understanding of Peer-Produced Structured Content Value

A THESIS SUBMITTED TO THE FACULTY OF THE UNIVERSITY OF MINNESOTA BY Andrew Hall IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Advisor: Loren Terveen

July, 2019

Copyright © Andrew Hall 2019

Acknowledgements

I would first like to thank my Ph.D. advisor, Dr. Loren Terveen, for his assistance over the last five years. Throughout the Ph.D. program, I had countless informal meetings with Loren in which he taught me how to be a more logical and critical thinker. Rather than directly providing answers to many of the questions I brought to him, Loren helped me develop the tools needed to derive such answers myself. He also frequently pushed me out of my comfort zone to pursue challenging research questions and directions with large potential impact.

I would like to thank Dr. Aaron Halfaker, a Principal Research Scientist at the Wikimedia Foundation. I started actively working with Aaron during the spring of 2017 as a research intern at the Wikimedia Foundation. Our collaboration did not end with the internship, however, and for the last two years Aaron has served in a role that is essentially that of a co-advisor. I greatly appreciate Aaron's willingness to discuss low-level technical details, high-level research questions, and everything in between during our weekly meetings.

I would also like to thank many of the professors at the University of Minnesota for helping me succeed with both coursework and research. Specifically, I would like to thank Drs. Haiyi Zhu and Daniel Keefe for serving on my thesis committee and for providing feedback when I proposed the thesis. Additionally, I'm grateful to Dr. Eric Van Wyk, who introduced me to research as an undergraduate student and helped me realize that graduate school could play an important role in achieving my career goals.

Many current and former members of GroupLens Research have helped me succeed in the projects I performed. GroupLens is a very collaborative lab, and I believe this has strengthened my research tremendously. I would especially like to thank Dr. Jacob Thebault-Spieker, Allen Lin, Sarah McRoberts, Dr. Isaac Johnson, Dr. Brent Hecht, and Dr. Shilad Sen.

Finally, I would like to thank my parents and brother for all the support they have provided over a challenging five years of my life.

Dedication

To my parents, grandparents, and brother.

Abstract

Over the last 30 years, peer production has created everything from software (e.g., Linux) to encyclopedia articles (e.g., Wikipedia) to geographic data (e.g., OpenStreetMap). In recent years, peer production has increased its focus on producing structured (key-value pair) content, which is designed to be consumed by applications and algorithms. This thesis explores two challenges in generating content that is as valuable as possible to these applications and algorithms. The first challenge is unique to the context of peer-produced structured data: a tension between the core peer production ethos of contributor freedom and the need for highly standardized data so that applications and algorithms can operate effectively. To explore this tension between freedom and standardization, I qualitatively analyze the ways in which it surfaces and then quantitatively analyze its impact.
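The abstract's first challenge concerns key-value structured data and the tension between contributor freedom and standardization. As a minimal, hypothetical illustration (the tags and values below are invented for this sketch and are not drawn from the thesis), the same real-world place can be described with more or less standardized key-value tags in an OpenStreetMap-style data model, and only the standardized form is easy for a consuming application to use:

    # Hypothetical OpenStreetMap-style key-value tags for the same coffee shop.
    standardized_tags = {
        "amenity": "cafe",          # value from a community-agreed vocabulary
        "name": "River Coffee",
        "cuisine": "coffee_shop",
        "wheelchair": "yes",
    }

    free_form_tags = {
        "amenity": "coffeehouse",   # non-standard value; data consumers may ignore it
        "name": "River Coffee",
        "serves": "espresso and pastries",
        "accessible": "wheelchair friendly",
    }

    def is_cafe(tags):
        """Toy consumer logic that only understands the standard vocabulary."""
        return tags.get("amenity") == "cafe"

    print(is_cafe(standardized_tags))  # True
    print(is_cafe(free_form_tags))     # False: valid-looking data, invisible to the app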
For the second challenge, I compare how different levels of automation affect content value. Contributions in peer production come from manual editing, semi-automated tool editing, and fully automated bot editing. I use two important lenses to study the value provided by these different types of contributions. Specifically, I study value by considering 1) the relationship between content quality and demand, and 2) problematic societal-level content biases (e.g., along male versus female, Global North versus Global South, and urban versus rural lines). While peer-production research has explored these two lenses of value in the past, it has not sought to develop a robust understanding in the context of structured content. To ensure that automated and manual contributions are effectively differentiated, I also develop a bot detection model. Finally, I provide implications based on my results. For example, my work motivates socio-technical tools that can reduce the manual effort required to contribute structured data and tools that direct effort towards in-demand content.

Table of Contents

List of Tables ... vii
List of Figures ... viii
Notes on the Content in this Thesis ... ix
1 Introduction ... 1
1.1 Defining the Problem Space ... 1
1.1.1 Challenge 1: Contributor Freedom versus Data Standardization ... 1
1.1.2 Challenge 2: Understanding the Different Roles that Manual and Automated Contributions Play in Affecting Content Value ... 3
1.2 Research Questions ... 3
1.3 Summary of Challenge 1 Studies ... 3
1.3.1 Exploring the Tension Between Freedom and Data Standardization ... 3
1.3.2 Measuring Contributor Freedom's Effect on Data Standardization ... 4
1.4 Summary of Challenge 2 Studies ... 4
1.4.1 Unidentified Bot Detection ... 4
1.4.2 Comparing Content Value Produced by Manual and Automated Contributions Along Three Intuitive Dimensions ... 4
1.5 Thesis Organization ... 5
2 Background and Related Work ... 6
2.1 Background on Peer Production ... 6
2.2 Brief Background on Communities Studied in this Thesis ... 6
2.2.1 OpenStreetMap ... 6
2.2.2 Wikidata ... 7
2.3 Performing Contributions in Peer Production ... 8
2.3.1 Manual Contributions ... 8
2.3.2 Automated Contributions ... 8
2.4 Research Studying Contributor Freedom within Peer Production ... 9
2.5 Studies of Content Value in Peer Production ... 10
3 Understanding the Causes of a Tension Between Freedom and Standardization in Peer-Produced Structured Content ... 12
3.1 Introduction ... 12
3.2 Related Work ... 12
3.3 Method ... 13
3.4 Results and Interpretations ... 14
3.4.1 Theme 1: Freedom vs. Metadata Completeness ... 15
3.4.2 Theme 2: Project-Specific Freedom and Metadata Correctness: Humanitarian OpenStreetMap (HOT) ... 17
3.4.3 Theme 3: Cultural Differences Make Global Metadata Correctness Standards Difficult to Achieve and Maintain ... 18
3.4.4 Theme 4: Community-Management Obstacles to Achieving Consensus ... 22
3.4.5 Theme 5: Data Representation Prevents Conceptual Correctness ... 23
3.4.6 Theme 6: Data Entry Tools May Harm Metadata Correctness and Privilege Certain Users ... 24
3.5 Reflecting on Correctness, Community, and Code ...
Recommended publications
  • A GroupLens Perspective
    From: AAAI Technical Report WS-98-08. Compilation copyright © 1998, AAAI (www.aaai.org). All rights reserved. Recommender Systems: A GroupLens Perspective. Joseph A. Konstan, John Riedl, Al Borchers, and Jonathan L. Herlocker. GroupLens Research Project, Dept. of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455 (http://www.cs.umn.edu/Research/GroupLens/) and Net Perceptions, Inc., 11200 West 78th Street, Suite 300, Minneapolis, MN 55344 (http://www.netperceptions.com/). ABSTRACT: In this paper, we review the history and research findings of the GroupLens Research project and present the four broad research directions that we feel are most critical for recommender systems. INTRODUCTION: A History of the GroupLens Project. The GroupLens Research project began at the Computer Supported Cooperative Work (CSCW) Conference in 1992. One of the keynote speakers at the conference lectured on his vision of an emerging information economy, in which most of the effort in the economy would revolve around production, distribution, and consumption of information, rather than physical goods and services. ... identifying sets of articles by keyword does not scale to a situation in which there are thousands of articles that contain any imaginable set of keywords. Taken together, these two weaknesses represented an opportunity for a new type of filtering that would focus on finding which available articles match human notions of quality and taste. Such a system would be able to produce a list of articles that each user would like, independent of their content. We decided to apply our ideas in the domain of Usenet news. Usenet screams for better information filtering, with hundreds of thousands of articles posted daily. Many of the articles in each Usenet newsgroup are on the same topic, so
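    The excerpt above describes collaborative filtering: recommending items based on the opinions of like-minded users rather than on item content. As a minimal illustration only (the toy ratings below are invented and this is not code from the GroupLens project), a user-user collaborative filtering sketch in Python looks like this:

        import math

        # Toy ratings: user -> {item: rating}. Invented data for illustration.
        ratings = {
            "ann": {"article_a": 5, "article_b": 1, "article_c": 4},
            "bob": {"article_a": 4, "article_b": 2, "article_c": 5, "article_d": 4},
            "eve": {"article_a": 1, "article_b": 5, "article_d": 2},
        }

        def cosine_sim(u, v):
            """Cosine similarity over the items two users have both rated."""
            common = set(ratings[u]) & set(ratings[v])
            if not common:
                return 0.0
            dot = sum(ratings[u][i] * ratings[v][i] for i in common)
            norm_u = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
            norm_v = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
            return dot / (norm_u * norm_v)

        def predict(user, item):
            """Similarity-weighted average of neighbors' ratings for an unseen item."""
            neighbors = [(cosine_sim(user, v), v) for v in ratings
                         if v != user and item in ratings[v]]
            numerator = sum(sim * ratings[v][item] for sim, v in neighbors)
            denominator = sum(abs(sim) for sim, _ in neighbors)
            return numerator / denominator if denominator else None

        # Ann has not rated article_d; the prediction (about 3.4) leans toward the
        # more similar neighbor, bob, rather than the dissimilar neighbor, eve.
        print(predict("ann", "article_d"))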
  • An Analysis of Contributions to Wikipedia from Tor
    Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor. Chau Tran (Department of Computer Science & Engineering, New York University, New York, USA), Kaylea Champion (Department of Communication, University of Washington, Seattle, USA), Andrea Forte (College of Computing & Informatics, Drexel University, Philadelphia, USA), Benjamin Mako Hill (Department of Communication, University of Washington, Seattle, USA), and Rachel Greenstadt (Department of Computer Science & Engineering, New York University, New York, USA). Abstract—User-generated content sites routinely block contributions from users of privacy-enhancing proxies like Tor because of a perception that proxies are a source of vandalism, spam, and abuse. Although these blocks might be effective, collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible. One of the largest and most important user-generated content sites, Wikipedia, has attempted to block contributions from Tor users since as early as 2005. We demonstrate that these blocks have been imperfect and that thousands of attempts to edit on Wikipedia through Tor have been successful. We draw upon several data sources and analytical techniques to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions from Tor users to those from other groups of Wikipedia users. Our analysis suggests that although Tor users who slip through Wikipedia's ban contribute content that is more likely to be reverted and to revert others, their contributions are otherwise similar in quality to those from other unregistered participants and to the initial contributions of registered users. (Fig. 1 of the paper shows the page a user is shown when they attempt to edit the Wikipedia article on "Privacy" while using Tor.)
  • A Recommender System for Groups of Users
    PolyLens: A Recommender System for Groups of Users Mark O’Connor, Dan Cosley, Joseph A. Konstan, and John Riedl Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455 USA {oconnor;cosley;konstan;riedl}@cs.umn.edu Abstract. We present PolyLens, a new collaborative filtering recommender system designed to recommend items for groups of users, rather than for individuals. A group recommender is more appropriate and useful for domains in which several people participate in a single activity, as is often the case with movies and restaurants. We present an analysis of the primary design issues for group recommenders, including questions about the nature of groups, the rights of group members, social value functions for groups, and interfaces for displaying group recommendations. We then report on our PolyLens prototype and the lessons we learned from usage logs and surveys from a nine-month trial that included 819 users. We found that users not only valued group recommendations, but were willing to yield some privacy to get the benefits of group recommendations. Users valued an extension to the group recommender system that enabled them to invite non-members to participate, via email. Introduction Recommender systems (Resnick & Varian, 1997) help users faced with an overwhelming selection of items by identifying particular items that are likely to match each user’s tastes or preferences (Schafer et al., 1999). The most sophisticated systems learn each user’s tastes and provide personalized recommendations. Though several machine learning and personalization technologies can attempt to learn user preferences, automated collaborative filtering (Resnick et al., 1994; Shardanand & Maes, 1995) has become the preferred real-time technology for personal recommendations, in part because it leverages the experiences of an entire community of users to provide high quality recommendations without detailed models of either content or user tastes.
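    The excerpt above raises the design question of a social value function: how individual members' predicted ratings get merged into a single recommendation for the group. A minimal sketch of one commonly discussed option, a "least misery" merge (illustrative only, with invented predicted ratings; not code from PolyLens), looks like this:

        # Predicted movie ratings per group member (invented for illustration).
        predictions = {
            "movie_x": {"mark": 4.5, "dan": 3.9, "joe": 4.1},
            "movie_y": {"mark": 4.8, "dan": 1.5, "joe": 4.6},
            "movie_z": {"mark": 3.7, "dan": 3.8, "joe": 3.6},
        }

        def least_misery(member_scores):
            """Score a movie for the whole group by its least-happy member."""
            return min(member_scores.values())

        # Rank movies so that no group member is predicted to be miserable.
        ranked = sorted(predictions, key=lambda m: least_misery(predictions[m]), reverse=True)
        print(ranked)  # ['movie_x', 'movie_z', 'movie_y']: movie_y drops despite two high scores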
  • Community, Impact and Credit: Where Should I Submit My Papers?
    Panels, February 23–27, 2013, San Antonio, Texas, USA. Community, Impact and Credit: Where should I submit my papers? Aaron Halfaker (GroupLens Research, University of Minnesota), R. Stuart Geiger (School of Information, University of CA, Berkeley), Cliff Lampe (School of Information, Michigan State University), Loren Terveen (GroupLens Research, University of Minnesota), Amy Bruckman (College of Computing, Georgia Inst. of Technology), Brian Keegan (Northeastern University), Aniket Kittur (HCI Institute, Carnegie Mellon University), and Geraldine Fitzpatrick (Vienna University of Technology). Abstract: We (the authors of CSCW's program) have finite time and energy that can be invested into our publications and the research communities we value. While we want our work to have the most impact possible, we also want to grow and support productive research communities within which to have this impact. This panel discussion explores the costs and benefits of submitting papers to various tiers of conferences and journals surrounding CSCW and reflects on the value of investing hours into building up a research community. Author Keywords: community; credit; impact; publishing; peer review. ACM Classification Keywords: H.5.0. [Information Interfaces and Presentation (e.g. HCI)]: General. Introduction: We (the authors of CSCW's program) have finite time and energy that can be invested into our publications and the research communities we value. In order to allow our work to have an impact, we must also grow and maintain ... Copyright is held by the author/owner(s).
  • Improvements in Holistic Recommender System Research
    Improvements in Holistic Recommender System Research. A DISSERTATION SUBMITTED TO THE FACULTY OF THE UNIVERSITY OF MINNESOTA BY Daniel Allen Kluver IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY. Advisor: Joseph A. Konstan. August, 2018. © Daniel Allen Kluver 2018. ALL RIGHTS RESERVED. Dedication: This dissertation is dedicated to my family, my friends, my advisers John Riedl and Joseph Konstan, my colleagues, both at GroupLens Research and at Macalester College, and everyone else who believed in me and supported me along the way. Your support meant everything when I couldn't support myself. Your belief meant everything when I couldn't believe in myself. I couldn't have done this without your help. Abstract: Since the mid 1990s, recommender systems have grown to be a major area of deployment in industry and research in academia. A through-line in this research has been the pursuit, above all else, of the perfect algorithm. With this admirable focus has come a neglect of the full scope of building, maintaining, and improving recommender systems. In this work I outline a system deployment and a series of offline and online experiments dedicated to improving our holistic understanding of recommender systems. This work explores the design, algorithms, early performance, and interfaces of recommender systems within the scope of how they are interconnected with other aspects of the system. This work explores many individual aspects of a recommender system while keeping in mind how they are connected to other aspects of
  • Detecting Undisclosed Paid Editing in Wikipedia
    Detecting Undisclosed Paid Editing in Wikipedia. Nikesh Joshi, Francesca Spezzano, Mayson Green, and Elijah Hill. Computer Science Department, Boise State University, Boise, Idaho, USA. {nikeshjoshi,maysongreen,elijahhill}@u.boisestate.edu. ABSTRACT: Wikipedia, the free and open-collaboration based online encyclopedia, has millions of pages that are maintained by thousands of volunteer editors. As per Wikipedia's fundamental principles, pages on Wikipedia are written with a neutral point of view and maintained by volunteer editors for free with well-defined guidelines in order to avoid or disclose any conflict of interest. However, there have been several known incidents where editors intentionally violate such guidelines in order to get paid (or even extort money) for maintaining promotional spam articles without disclosing such. In this paper, we address for the first time the problem of identifying undisclosed paid articles in Wikipedia. We propose a machine learning-based framework using a set of features based on both ... 1 INTRODUCTION: Wikipedia is the free online encyclopedia based on the principle of open collaboration; for the people by the people. Anyone can add and edit almost any article or page. However, volunteers should follow a set of guidelines when editing Wikipedia. The purpose of Wikipedia is "to provide the public with articles that summarize accepted knowledge, written neutrally and sourced reliably" and the encyclopedia should not be considered as a platform for advertising and self-promotion. Wikipedia's guidelines strongly discourage any form of conflict-of-interest (COI) editing and require editors to disclose any COI contribution. Paid editing is a form of COI editing and refers to editing Wikipedia (in the majority of the cases for promotional purposes) in exchange for compensation.
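    The excerpt above proposes a machine-learning framework built on features of articles and the editors who create them. As a rough, hypothetical sketch of that general approach (the feature names, data, and model choice below are invented for illustration and are not the paper's actual feature set or classifier):

        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split

        # Hypothetical per-article feature vectors:
        # [promotional_word_ratio, external_link_count, creator_edit_count, creator_account_age_days]
        X = [
            [0.12, 14, 3, 2],
            [0.01, 2, 950, 1400],
            [0.09, 11, 5, 7],
            [0.02, 3, 1200, 2100],
        ]
        y = [1, 0, 1, 0]  # 1 = undisclosed paid article, 0 = ordinary article (toy labels)

        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.5, random_state=0, stratify=y)

        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(X_train, y_train)

        # Probability that each held-out article is undisclosed paid content.
        print(clf.predict_proba(X_test)[:, 1])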
  • Truth and Lies of What People Are Really Thinking
    Dedication For Lex and Stella Epigraph Everything we see is a perspective, not the truth —Falsely attributed to Marcus Aurelius (author unknown) Contents Cover Title Page Dedication Epigraph Introduction Part One Genuine Deceptions 1 Body Language Lies 2 Powerful Thinking 3 Mind Your Judgments 4 Scan for Truth Part Two Dating 5 They’re Totally Checking Me Out! 6 Playing Hard-to-Get 7 Just Feeling Sorry for Me 8 I’m Being Ghosted 9 What a Complete Psycho! 10 They Are Running the Show 11 I’m Going to Pay for That! 12 They Are So Mad at Me 13 A Lying Cheat? 14 Definitely into My Friend 15 A Match Made in Heaven? 16 They Are So Breaking Up! Part Three Friends and Family 17 Thick as Thieves 18 My New Bff? 19 Fomo 20 Control Freak 21 Too Close for Comfort 22 They’ll Never Fit in with My Family 23 House on Fire! 24 I Am Boring the Pants off Them 25 Lying Through Their Teeth 26 Persona Non Grata 27 Invisible Me Part Four Working Life 28 I Aced That Interview—So, Where’s the Job Offer? 29 They Hate My Work 30 Big Dog 31 Never Going to See Eye to Eye 32 Cold Fish 33 They’re Gonna Blow! 34 This Meeting Is a Waste of Time 35 Looks Like a Winning Team 36 So, You Think You’re the Boss 37 Hand in the Cookie Jar Summary Bonus Bluff Learn more Acknowledgments Notes About the Authors Praise for Truth & Lies Also by Mark Bowden Copyright About the Publisher Introduction WE CAN ALL RECALL EXAMPLES when our sense of what another person was thinking turned out to be the truth.
  • User Session Identification Based on Strong Regularities in Inter-Activity Time
    User Session Identification Based on Strong Regularities in Inter-activity Time. Aaron Halfaker (Wikimedia Foundation), Os Keyes (Wikimedia Foundation), Daniel Kluver, Jacob Thebault-Spieker, Tien Nguyen, Kenneth Shores, Anuradha Uduwage, and Morten Warncke-Wang (GroupLens Research, University of Minnesota). ABSTRACT: Session identification is a common strategy used to develop metrics for web analytics and behavioral analyses of user-facing systems. Past work has argued that session identification strategies based on an inactivity threshold are inherently arbitrary, or has advocated that thresholds be set at about 30 minutes. In this work, we demonstrate a strong regularity in the temporal rhythms of user-initiated events across several different domains of online activity (incl. video gaming, search, page views and volunteer contributions). We describe a methodology for identifying clusters of user activity ... 1. INTRODUCTION: In 2012, we had an idea for a measurement strategy that would bring insight and understanding to the nature of participation in an online community. While studying participation in Wikipedia, the open, collaborative encyclopedia, we found ourselves increasingly curious about the amount of time that volunteer contributors invested into the encyclopedia's construction. Past work measuring editor engagement relied on counting the number of contributions made by a user, but we felt that the amount of time editors spent editing might serve as a more appropriate measure.
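    The excerpt above builds on the standard approach of splitting a user's event stream into sessions wherever the gap between consecutive events exceeds an inactivity threshold; the paper's contribution is showing a regularity in inter-activity times that lets the threshold be grounded in data rather than chosen arbitrarily. A minimal sketch of the baseline threshold technique (invented timestamps; the 30-minute cutoff is just the conventional value cited in the excerpt):

        # Event timestamps (in seconds) for one user, assumed sorted. Invented data.
        events = [0, 40, 200, 9000, 9100, 9200, 30000]

        INACTIVITY_THRESHOLD = 30 * 60  # the oft-cited 30-minute cutoff, in seconds

        def sessionize(timestamps, threshold=INACTIVITY_THRESHOLD):
            """Group events into sessions; a gap larger than `threshold` starts a new one."""
            sessions = []
            current = []
            for t in timestamps:
                if current and t - current[-1] > threshold:
                    sessions.append(current)
                    current = []
                current.append(t)
            if current:
                sessions.append(current)
            return sessions

        for session in sessionize(events):
            print(session, "duration:", session[-1] - session[0], "seconds")
        # Prints three sessions: [0, 40, 200], [9000, 9100, 9200], and [30000].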
  • Critical Point of View: a Wikipedia Reader
    Critical Point of View: A Wikipedia Reader. Editors: Geert Lovink and Nathaniel Tkacz. INC Reader #7. Editorial Assistance: Ivy Roberts, Morgan Currie. Copy-Editing: Cielo Lutino. Design: Katja van Stiphout. Cover Image: Ayumi Higuchi. Printer: Ten Klei Groep, Amsterdam. Publisher: Institute of Network Cultures, Amsterdam 2011. ISBN: 978-90-78146-13-1. Contact: Institute of Network Cultures, phone: +3120 5951866, fax: +3120 5951840, email: [email protected], web: http://www.networkcultures.org. Order a copy of this book by sending an email to: [email protected]. A pdf of this publication can be downloaded freely at: http://www.networkcultures.org/publications. Join the Critical Point of View mailing list at: http://www.listcultures.org. Supported by: The School for Communication and Design at the Amsterdam University of Applied Sciences (Hogeschool van Amsterdam DMCI), the Centre for Internet and Society (CIS) in Bangalore and the Kusuma Trust. Thanks to Johanna Niesyto (University of Siegen), Nishant Shah and Sunil Abraham (CIS Bangalore), Sabine Niederer and Margreet Riphagen (INC Amsterdam) for their valuable input and editorial support. Thanks to Foundation Democracy and Media, Mondriaan Foundation and the Public Library Amsterdam (Openbare Bibliotheek Amsterdam) for supporting the CPOV events in Bangalore, Amsterdam and Leipzig. (http://networkcultures.org/wpmu/cpov/) Special thanks to all the authors for their contributions and to Cielo Lutino, Morgan Currie and Ivy Roberts for their careful copy-editing.
  • ORES: Scalable, Community-Driven, Auditable, Machine Learning-as-a-Service
    ORES: Scalable, community-driven, auditable, machine learning-as-a-service. In this presentation, we lay out ideas... This is a conversation starter, not a conversation ender. We put forward our best ideas and thinking based on our partial perspectives. We look forward to hearing yours. Outline - Opportunities and fears of artificial intelligence - ORES: A break-away success - The proposal & asks. We face a significant challenge... There's been a significant decline in the number of active contributors to the English-language Wikipedia. Contributors have fallen by 40% over the past eight years, to about 30,000. Hamstrung by our own success? "Research indicates that the problem may be rooted in Wikipedians' complex bureaucracy and their often hard-line responses to newcomers' mistakes, enabled by semi-automated tools that make deleting new changes easy." - MIT Technology Review (December 1, 2015) It's the revert-new-editors'-first-few-edits-and-alienate-them problem. Imagine an algorithm that can... ● Judge whether an edit was made in good faith, so that new editors don't have their first contributions wiped out. ● Detect bad-faith edits and help fight vandalism. Machine learning and prediction have the power to support important work in open contribution projects like Wikipedia at massive scales. But it's far from a no-brainer. Quite the opposite. On the bright side, machine learning has the potential to help our projects scale by reducing the workload of editors and enhancing the value of our content. (Image: Luiz Augusto Fornasier Milani, CC BY-SA 4.0, Wikimedia Commons.) On the dark side, AIs have the potential to perpetuate biases and silence voices in novel and insidious ways.
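    The "good faith" and "damaging" judgments described above were exposed by ORES as a public scoring service. A minimal sketch of querying it (assuming the historical ores.wikimedia.org v3 endpoint and the English Wikipedia "damaging" and "goodfaith" models; the revision ID is arbitrary and the response layout is sketched from memory, so treat this as illustrative rather than authoritative):

        import requests

        rev_id = 123456789  # arbitrary example revision ID
        url = "https://ores.wikimedia.org/v3/scores/enwiki"
        params = {"models": "damaging|goodfaith", "revids": rev_id}

        response = requests.get(url, params=params, timeout=30)
        response.raise_for_status()

        # Pull out the per-model predictions for this revision.
        scores = response.json()["enwiki"]["scores"][str(rev_id)]
        for model, result in scores.items():
            print(model, "->", result["score"]["prediction"])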
  • Package 'WikipediR'
    Package ‘WikipediR’ February 5, 2017 Type Package Title A MediaWiki API Wrapper Version 1.5.0 Date 2017-02-04 Author Oliver Keyes [aut, cre], Brock Tilbert [ctb] Maintainer Oliver Keyes <[email protected]> Description A wrapper for the MediaWiki API, aimed particularly at the Wikimedia 'production' wikis, such as Wikipedia. It can be used to retrieve page text, information about users or the history of pages, and elements of the category tree. License MIT + file LICENSE Imports httr, jsonlite Suggests testthat, knitr, WikidataR, pageviews BugReports https://github.com/Ironholds/WikipediR/issues URL https://github.com/Ironholds/WikipediR/ VignetteBuilder knitr RoxygenNote 5.0.1 NeedsCompilation no Repository CRAN Date/Publication 2017-02-05 08:44:55 R topics documented: categories_in_page . .2 pages_in_category . .3 page_backlinks . .4 page_content . .5 page_external_links . .6 page_info . .7 page_links . .8 1 2 categories_in_page query . .9 random_page . .9 recent_changes . 10 revision_content . 11 revision_diff . 12 user_contributions . 14 user_information . 15 WikipediR . 17 Index 18 categories_in_page Retrieves categories associated with a page. Description Retrieves categories associated with a page (or list of pages) on a MediaWiki instance Usage categories_in_page(language = NULL, project = NULL, domain = NULL, pages, properties = c("sortkey", "timestamp", "hidden"), limit = 50, show_hidden = FALSE, clean_response = FALSE, ...) Arguments language The language code of the project you wish to query, if appropriate. project The project you wish to query ("wikiquote"), if appropriate. Should be provided in conjunction with language. domain as an alternative to a language and project combination, you can also provide a domain ("rationalwiki.org") to the URL constructor, allowing for the querying of non-Wikimedia MediaWiki instances. pages A vector of page titles, with or without spaces, that you want to retrieve cate- gories for.
  • A Roll of the Dice
    Confidence Displays and Training in Recommender Systems. Sean M. McNee, Shyong K. Lam, Catherine Guetzlaff, Joseph A. Konstan, and John Riedl. GroupLens Research Project, Department of Computer Science and Engineering, University of Minnesota, MN 55455 USA. {mcnee, lam, guetzlaf, konstan, riedl}@cs.umn.edu. Abstract: Recommender systems help users sort through vast quantities of information. Sometimes, however, users do not know if they can trust the recommendations they receive. Adding a confidence metric has the potential to improve user satisfaction and alter user behavior in a recommender system. We performed an experiment to measure the effects of a confidence display as a component of an existing collaborative filtering-based recommender system. Minimal training improved use of the confidence display compared to no training. Novice users were less likely to notice, understand, and use the confidence display than experienced users of the system. Providing training about a confidence display to experienced users greatly reduced user satisfaction in the recommender system. These results raise interesting issues and demonstrate subtle effects about how and when to train users when adding features to a system. Keywords: Recommender Systems, Collaborative Filtering, Confidence Displays, Risk, Training. 1 Introduction: People face the problem of information overload every day. As the number of web sites, books, magazines, research papers, and so on continue to rise, it is getting harder to keep up. ... range. For example, a fan of science fiction may receive a recommendation for a particular art film because other science fiction fans liked it. This serendipity in recommendations is one of the strengths of collaborative filtering. It is also one of its weaknesses. CF algorithms