A GroupLens Perspective

From: AAAI Technical Report WS-98-08. Compilation copyright © 1998, AAAI (www.aaai.org). All rights reserved.

Recommender Systems: A GroupLens Perspective

Joseph A. Konstan*†, John Riedl*†, Al Borchers,* and Jonathan L. Herlocker*

*GroupLens Research Project, Dept. of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455, http://www.cs.umn.edu/Research/GroupLens/

†Net Perceptions, Inc., 11200 West 78th Street, Suite 300, Minneapolis, MN 55344, http://www.netperceptions.com/

ABSTRACT

In this paper, we review the history and research findings of the GroupLens Research project¹ and present the four broad research directions that we feel are most critical for recommender systems.

¹ GroupLens™ is a trademark of Net Perceptions, Inc, which develops and markets the GroupLens Recommendation Engine. Net Perceptions allows the University of Minnesota to use the name "GroupLens Research" for continuity. The ideas and opinions expressed in this paper are those of the authors and do not represent opinions of Net Perceptions, Inc.

INTRODUCTION: A History of the GroupLens Project

The GroupLens Research project began at the Computer Supported Cooperative Work (CSCW) Conference in 1992. One of the keynote speakers at the conference lectured on his vision of an emerging information economy, in which most of the effort in the economy would revolve around production, distribution, and consumption of information, rather than physical goods and services. Paul Resnick, then a student at MIT, and now a professor at the University of Michigan, and one of us (Riedl) were moved by the talk to consider the technical challenges that would have to be overcome to enable the information economy. We realized that as the amount of information increased enormously, while people's ability to process information remained stable, one of the critical challenges would be technology that would automate matching people with the information they would find most valuable.

There were two main thrusts of research activity in this area that we knew of: (1) Artificial Intelligence (AI) research to develop tools that would serve as a "knowledge robot", or knowbot, continually seeking out information, reading and understanding it, and returning with the information that the knowbot determined would be most valuable to its user. (2) Information Filtering (IF) research to develop even more efficient tools for selecting documents that contain keywords of interest to a user. These techniques were, and continue to be, fruitful, but we felt they each have one serious weakness. In the case of the knowbot, the weakness is that we are still a significant distance from technology that can understand articles in the way a human does. In the case of Information Filtering, the weakness is that identifying sets of articles by keyword does not scale to a situation in which there are thousands of articles that contain any imaginable set of keywords. Taken together, these two weaknesses represented an opportunity for a new type of filtering, that would focus on finding which available articles match human notions of quality and taste. Such a system would be able to produce a list of articles that each user would like, independent of their content.

We decided to apply our ideas in the domain of Usenet news. Usenet screams for better information filtering, with hundreds of thousands of articles posted daily. Many of the articles in each Usenet newsgroup are on the same topic, so syntactic techniques that identify topic are much less valuable in Usenet. Further, different people value very different sets of articles, with some people participating in long discussion threads that other people couldn't imagine even reading.

We developed a system that falls into the class that is now called automatic collaborative filtering. It collects ratings from people on articles, combines the ratings statistically, and produces recommendations for other people of how much they are likely to like each article.

We invited people to participate in using GroupLens from all over the Internet, and studied the effect of the system on users. Users resisted our early attempts to establish multi-dimensional rating schemes, including characteristics such as quality of the writing, and suitability of the topic for the newsgroup. Rating on multiple dimensions was too much work. We changed to single-dimension ratings, with the dimension being "What score would you have liked GroupLens to predict for you for this article?"

We found that users did change behavior in response to the recommendations, reading a much higher percentage of the articles that GroupLens predicted they would like than of either randomly selected articles, or articles GroupLens predicted they would not like. However, there were many articles for which GroupLens was unable to provide ratings, because even with two to three hundred users, there were simply too many articles in the six newsgroups we were studying. A greater density of ratings by article would have improved the usability of the system for most users. The low ratings density was compounded by the first rater problem, which is the problem that a pure collaborative filtering system cannot possibly make recommendations to the first person that reads each article. One effect of these two problems is that some beginning users of the system saw little value from GroupLens initially, and hence never developed the habit of contributing ratings, though they continued to use GroupLens-enabled news readers.

Because most users did not like most articles, and because GroupLens was effective at identifying articles users would like, users requested the ability to scan a newsgroup for the articles that were predicted to be of high interest to them. This led to our exploring a different style of interface to a collaborative filtering system, the TopN interface. Rather than predicting a score for each article, a TopN interface greedily seeks articles that are likely to have high scores for an individual user, and recommends those articles to that user. Eventually, such an interface might be able to present each of us with a list of the 20-30 most interesting articles for us from all of Usenet each morning.

Our key lesson learned was that a very high volume, low quality system like Usenet would require a very large number of users for collaborative filtering to be successful. For our research purposes, we needed a lower volume, higher density testbed. Our colleagues from Digital Equipment Corporation were closing down their research system on movie recommendations, and offered us the data to jump-start a similar system using GroupLens. We launched our system in the summer of 1997, and have been running it since at www.movielens.umn.edu. MovieLens is entirely web-based, and has several thousand regular users. Users rate movies, and MovieLens recommends other movies to them.

Over the past six years of research, we have learned that people are hungry for effective tools for information filtering, and that collaborative filtering is an exciting complement to existing filtering systems. Users value both the taste-based recommendations, and the sense of community they get by participating in a group filtering process. However, there are many open research problems still in collaborative filtering. Below we discuss our early results on some of these problems, and outline the remaining problems we feel to be most important to the [...]

[...] may yield more accurate recommendations. Even if the increased accuracy is offset by the smaller number of items available to establish user correlations, partitioning may be valuable because it can help scale the performance of the system; each partition can be run in parallel on a separate server.

To explore the potential of item partitioning, we considered three partitioning strategies for MovieLens: random partitions, partitions by movie genre, and partitions generated algorithmically by clustering based on ratings. Clustering-based partitions produced a slight loss in prediction accuracy as partitions grew smaller, but showed promise for a reasonable trade-off between performance and accuracy. Movie genre partitions yielded less accurate recommendations than cluster-based ones (though some genres were much more accurate, and others much less so). Random partitions were slightly worse still. The value of item partitions clearly depends on the domain of the recommendation system and the density of ratings within and across potential partitions (our earlier Usenet work found that mixing widely different newsgroups together reduced accuracy). One advantage of the clustering result is that it may be more broadly applicable in domains where items lack obvious attributes for partitioning.

We also looked at the value of user partitioning, starting with the extreme case of pre-computed symmetric neighborhoods based on our clustering algorithm; these were small partitions of about 200 users. If symmetric neighborhoods yield good results, time per recommendation can be reduced dramatically, since substantial per-neighborhood computation can be performed incrementally and amortized across the neighbors. We found that the accuracy of recommendations was almost as good as using the full data set, but that the coverage (i.e., the number of movies for which we could compute a recommendation) fell by 14%. To restore coverage we introduced a two-level hierarchy of users.

Users from each other neighborhood were collapsed into a single composite user. Each neighborhood then had all users represented; similar users were represented at full resolution and the more distant
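
The rating, prediction, and TopN steps described in this excerpt can be sketched as follows. This is only an illustration under stated assumptions, not the GroupLens implementation: the excerpt says merely that ratings are combined "statistically", so the Pearson-correlation weighting used here, the dictionary layout of the ratings, and the function names `pearson`, `predict`, and `top_n` are all hypothetical choices made for the example.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    common = [item for item in ratings_a if item in ratings_b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    cov = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict(ratings, user, item):
    """Predict a rating as the user's mean plus weighted neighbor deviations.

    Returns None when no correlated user has rated the item, which is the
    coverage gap (and first-rater problem) discussed in the text.
    The result is not clamped to the rating scale.
    """
    own = ratings[user]
    if not own:
        return None
    user_mean = sum(own.values()) / len(own)
    weighted_sum = 0.0
    weight_total = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        weight = pearson(own, theirs)
        if weight == 0.0:
            continue
        other_mean = sum(theirs.values()) / len(theirs)
        weighted_sum += weight * (theirs[item] - other_mean)
        weight_total += abs(weight)
    if weight_total == 0.0:
        return None
    return user_mean + weighted_sum / weight_total


def top_n(ratings, user, n):
    """Return up to n unrated items, ordered by predicted rating (highest first)."""
    seen = set(ratings[user])
    candidates = {item for r in ratings.values() for item in r} - seen
    scored = []
    for item in candidates:
        score = predict(ratings, user, item)
        if score is not None:
            scored.append((score, item))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored[:n]]
```

Here `ratings` is assumed to map each user name to a dictionary of item ratings, for example `{"ann": {"a": 5, "b": 1}}`. Restricting the loop in `predict` to a pre-computed neighborhood rather than all users would correspond to the user partitioning the authors describe, at the cost of more `None` results (lower coverage).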
Recommended publications
  • The Use of Time Dimension in Recommender Systems for Learning
    The Use of Time Dimension in Recommender Systems for Learning. Eduardo José de Borba¹, Isabela Gasparini¹ and Daniel Lichtnow². ¹Graduate Program in Applied Computing (PPGCA), Department of Computer Science (DCC), Santa Catarina State University (UDESC), Paulo Malschitzki 200, Joinville, Brazil. ²Polytechnic School, Federal University of Santa Maria (UFSM), Av. Roraima 1000, Santa Maria, Brazil. Keywords: Recommender System, Context-aware, Time, Learning. Abstract: When the amount of learning objects is huge, especially in the e-learning context, users could suffer cognitive overload. That way, users cannot find useful items and might feel lost in the environment. Recommender systems are tools that suggest items to users that best match their interests and needs. However, traditional recommender systems are not enough for learning, because this domain needs more personalization for each user profile and context. For this purpose, this work investigates Time-Aware Recommender Systems (Context-aware Recommender Systems that uses time dimension) for learning. Based on a set of categories (defined in previous works) of how time is used in Recommender Systems regardless of their domain, scenarios were defined that help illustrate and explain how each category could be applied in learning domain. As a result, a Recommender System for learning is proposed. It combines Content-Based and Collaborative Filtering approaches in a Hybrid algorithm that considers time in Pre-Filtering and Post-Filtering phases. 1 INTRODUCTION Nowadays, there are distinct educational approaches, e.g. online learning, blended learning, face-to-face learning, etc. [...] potential to improve the quality of recommendation (Campos et al., 2014). This work aims to identify how Time-aware RS (Context-aware RS that uses time context) can be
  • Mobile Application Recommender System
    UPTEC IT 10 025. Degree project (Examensarbete), 30 credits, December 2010. Mobile Application Recommender System. Christoffer Davidsson. Faculty of Science and Technology (Teknisk-naturvetenskaplig fakultet), UTH unit. Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0. Postal address: Box 536, 751 21 Uppsala. Telephone: 018 – 471 30 03. Fax: 018 – 471 30 00. Website: http://www.teknat.uu.se/student. Abstract: With the amount of mobile applications available increasing rapidly, users have to put a lot of effort into finding applications of interest. The purpose of this thesis is to investigate how to aid users in the process of discovering new mobile applications by providing them with recommendations. A prototype system is then built as a proof-of-concept. The work of the thesis is divided into three phases where the aim of the first phase is to study related work and related systems to identify promising concepts and features. During the second phase, a prototype system is designed and implemented. The outcome and result of the first two phases is then evaluated and analyzed in the third and final phase. The prototype system integrates and extends an existing recommender engine previously used to recommend media items. As a part of the system, an Android application is developed, which observes user actions and presents recommended applications to the user. In parallel to the development, the system was tested by a small group of users recruited among colleagues at Ericsson. The data generated during this test period is analyzed to show the usefulness of observed user actions over explicit ratings and the dependency on context for application usage.
  • A Systematic Review and Taxonomy of Explanations in Decision Support and Recommender Systems
    A Systematic Review and Taxonomy of Explanations in Decision Support and Recommender Systems. Ingrid Nunes · Dietmar Jannach. Abstract With the recent advances in the field of artificial intelligence, an increasing number of decision-making tasks are delegated to software systems. A key requirement for the success and adoption of such systems is that users must trust system choices or even fully automated decisions. To achieve this, explanation facilities have been widely investigated as a means of establishing trust in these systems since the early years of expert systems. With today's increasingly sophisticated machine learning algorithms, new challenges in the context of explanations, accountability, and trust towards such systems constantly arise. In this work, we systematically review the literature on explanations in advice-giving systems. This is a family of systems that includes recommender systems, which is one of the most successful classes of advice-giving software in practice. We investigate the purposes of explanations as well as how they are generated, presented to users, and evaluated. As a result, we derive a novel comprehensive taxonomy of aspects to be considered when designing explanation facilities for current and future decision support systems. The taxonomy includes a variety of different facets, such as explanation objective, responsiveness, content and presentation. Moreover, we identified several challenges that remain unaddressed so far, for example related to fine-grained issues associated with the presentation of explanations and how explanation facilities are evaluated. Keywords Explanation · Decision Support System · Recommender System · Expert System · Knowledge-based System · Systematic Review · Machine Learning · Trust · Artificial Intelligence I.
  • User Data Analytics and Recommender System for Discovery Engine
    User Data Analytics and Recommender System for Discovery Engine. Yu Wang. Master of Science Thesis, Stockholm, Sweden 2013. TRITA-ICT-EX-2013: 88. User Data Analytics and Recommender System for Discovery Engine. Yu Wang [email protected] Whaam AB, Stockholm, Sweden; Royal Institute of Technology, Stockholm, Sweden. June 11, 2013. Supervisor: Yi Fu, Whaam AB. Examiner: Prof. Johan Montelius, KTH. Abstract: On social bookmarking websites, besides saving, organizing and sharing web pages, users can also discover new web pages by browsing others' bookmarks. However, as more and more content is added, it is hard for users to find interesting or related web pages or other users who share the same interests. In order to make bookmarks discoverable and build a discovery engine, sophisticated user data analytic methods and a recommender system are needed. This thesis addresses the topic by designing and implementing a prototype of a recommender system for recommending users, links and linklists. User and linklist recommendations are calculated by a content-based method, which analyzes the cosine similarity of tags. Link recommendations are calculated by a linklist-based collaborative filtering method. The recommender system contains an offline and an online subsystem. The offline subsystem calculates data statistics and provides recommendation candidates, while the online subsystem filters the candidates and returns them to users. The experiments show that in a social bookmark service like Whaam, the tag-based cosine similarity method improves the mean average precision by 45% compared to the traditional collaborative method for user and linklist recommendation. For link recommendation, the linklist-based collaborative filtering method increases the mean average precision by 39% compared to the user-based collaborative filtering method.
  • A Recommender System for Groups of Users
    PolyLens: A Recommender System for Groups of Users Mark O’Connor, Dan Cosley, Joseph A. Konstan, and John Riedl Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455 USA {oconnor;cosley;konstan;riedl}@cs.umn.edu Abstract. We present PolyLens, a new collaborative filtering recommender system designed to recommend items for groups of users, rather than for individuals. A group recommender is more appropriate and useful for domains in which several people participate in a single activity, as is often the case with movies and restaurants. We present an analysis of the primary design issues for group recommenders, including questions about the nature of groups, the rights of group members, social value functions for groups, and interfaces for displaying group recommendations. We then report on our PolyLens prototype and the lessons we learned from usage logs and surveys from a nine-month trial that included 819 users. We found that users not only valued group recommendations, but were willing to yield some privacy to get the benefits of group recommendations. Users valued an extension to the group recommender system that enabled them to invite non-members to participate, via email. Introduction Recommender systems (Resnick & Varian, 1997) help users faced with an overwhelming selection of items by identifying particular items that are likely to match each user’s tastes or preferences (Schafer et al., 1999). The most sophisticated systems learn each user’s tastes and provide personalized recommendations. 
Though several machine learning and personalization technologies can attempt to learn user preferences, automated collaborative filtering (Resnick et al., 1994; Shardanand & Maes, 1995) has become the preferred real-time technology for personal recommendations, in part because it leverages the experiences of an entire community of users to provide high quality recommendations without detailed models of either content or user tastes.
  • Community, Impact and Credit: Where Should I Submit My Papers?
    Panels, February 23–27, 2013, San Antonio, Texas, USA. Community, Impact and Credit: Where should I submit my papers? Aaron Halfaker (GroupLens Research, University of Minnesota), R. Stuart Geiger (School of Information, University of CA, Berkeley), Cliff Lampe (School of Information, Michigan State University), Loren Terveen (GroupLens Research, University of Minnesota), Amy Bruckman (College of Computing, Georgia Inst. of Technology), Brian Keegan (Northeastern University), Aniket Kittur (HCI Institute, Carnegie Mellon University), Geraldine Fitzpatrick (Vienna University of Technology). Abstract: We (the authors of CSCWs program) have finite time and energy that can be invested into our publications and the research communities we value. While we want our work to have the most impact possible, we also want to grow and support productive research communities within which to have this impact. This panel discussion explores the costs and benefits of submitting papers to various tiers of conferences and journals surrounding CSCW and reflects on the value of investing hours into building up a research community. Author Keywords: community; credit; impact; publishing; peer review. ACM Classification Keywords: H.5.0. [Information Interfaces and Presentation (e.g. HCI)]: General. Introduction: We (the authors of CSCWs program) have finite time and energy that can be invested into our publications and the research communities we value. In order to allow our work to have an impact, we must also grow and maintain [...] Copyright is held by the author/owner(s).
  • Chapter: Music Recommender Systems
    Chapter 13: Music Recommender Systems. Markus Schedl, Peter Knees, Brian McFee, Dmitry Bogdanov, and Marius Kaminskas. 13.1 Introduction: Boosted by the emergence of online music shops and music streaming services, digital music distribution has led to an ubiquitous availability of music. Music listeners, suddenly faced with an unprecedented scale of readily available content, can easily become overwhelmed. Music recommender systems, the topic of this chapter, provide guidance to users navigating large collections. Music items that can be recommended include artists, albums, songs, genres, and radio stations. In this chapter, we illustrate the unique characteristics of the music recommendation problem, as compared to other content domains, such as books or movies. To understand the differences, let us first consider the amount of time required for a user to consume a single media item. There is obviously a large discrepancy in consumption time between books (days or weeks), movies (one to a few hours), and a song (typically a few minutes). Consequently, the time it takes for a user to form opinions for music can be much shorter than in other domains, which contributes to the ephemeral, even disposable, nature of music. Similarly, in music, a single [...] M. Schedl • P. Knees: Department of Computational Perception, Johannes Kepler University Linz, Linz, Austria, e-mail: [email protected]; [email protected]. B. McFee: Center for Data Science, New York University, New York, NY, USA, e-mail: [email protected]. D. Bogdanov: Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain, e-mail: [email protected]. M. Kaminskas: Insight Centre for Data Analytics, University College Cork, Cork, Ireland, e-mail: [email protected]. © Springer Science+Business Media New York 2015. F. Ricci et al. (eds.), Recommender Systems Handbook, DOI 10.1007/978-1-4899-7637-6_13
  • A Multitask Ranking System
    Recommending What Video to Watch Next: A Multitask Ranking System. Zhe Zhao, Lichan Hong, Li Wei, Jilin Chen, Aniruddh Nath, Shawn Andrews, Aditee Kumthekar, Maheswaran Sathiamoorthy, Xinyang Yi, Ed Chi. Google, Inc. {zhezhao,lichan,liwei,jilinc,aniruddhnath,shawnandrews,aditeek,nlogn,xinyang,edchi}@google.com ABSTRACT In this paper, we introduce a large scale multi-objective ranking system for recommending what video to watch next on an industrial video sharing platform. The system faces many real-world challenges, including the presence of multiple competing ranking objectives, as well as implicit selection biases in user feedback. To tackle these challenges, we explored a variety of soft-parameter sharing techniques such as Multi-gate Mixture-of-Experts so as to efficiently optimize for multiple ranking objectives. Additionally, we mitigated the selection biases by adopting a Wide & Deep framework. We demonstrated that our proposed techniques can lead to substantial improvements on recommendation quality on one of the world's largest video sharing platforms. [...] promising items. We present experiments and lessons learned from building such a ranking system on a large-scale industrial video publishing and sharing platform. Designing and developing a real-world large-scale video recommendation system is full of challenges, including: • There are often different and sometimes conflicting objectives which we want to optimize for. For example, we may want to recommend videos that users rate highly and share with their friends, in addition to watching. • There is often implicit bias in the system. For example, a user might have clicked and watched a video simply because it was being ranked high, not because it was the one that the
  • When Diversity Met Accuracy: a Story of Recommender Systems †
    Proceedings, Extended Abstract. When Diversity Met Accuracy: A Story of Recommender Systems † Alfonso Landin *, Eva Suárez-García and Daniel Valcarce. Department of Computer Science, University of A Coruña, 15071 A Coruña, Spain; [email protected] (E.S.-G.); [email protected] (D.V.) * Correspondence: [email protected]; Tel.: +34-881-01-1276 † Presented at the XoveTIC Congress, A Coruña, Spain, 27–28 September 2018. Published: 14 September 2018. Abstract: Diversity and accuracy are frequently considered as two irreconcilable goals in the field of Recommender Systems. In this paper, we study different approaches to recommendation, based on collaborative filtering, which intend to improve both sides of this trade-off. We performed a battery of experiments measuring precision, diversity and novelty on different algorithms. We show that some of these approaches are able to improve the results in all the metrics with respect to classical collaborative filtering algorithms, proving to be both more accurate and more diverse. Moreover, we show how some of these techniques can be tuned easily to favour one side of this trade-off over the other, based on user desires or business objectives, by simply adjusting some of their parameters. Keywords: recommender systems; collaborative filtering; diversity; novelty. 1. Introduction: Over the years the user experience with different services has shifted from a proactive approach, where the user actively looks for content, to one where the user is more passive and content is suggested to her by the service. This has been possible due to the advance in the field of recommender systems (RS), making it possible to make better suggestions to the users, personalized to their preferences.
  • Improvements in Holistic Recommender System Research
    Improvements in Holistic Recommender System Research. A DISSERTATION SUBMITTED TO THE FACULTY OF THE UNIVERSITY OF MINNESOTA BY Daniel Allen Kluver IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY. Joseph A. Konstan. August, 2018. © Daniel Allen Kluver 2018 ALL RIGHTS RESERVED. Dedication: This dissertation is dedicated to my family, my friends, my advisers John Riedl and Joseph Konstan, my colleagues, both at GroupLens research and at Macalester College, and everyone else who believed in me and supported me along the way. Your support meant everything when I couldn't support myself. Your belief meant everything when I couldn't believe in myself. I couldn't have done this without your help. Abstract: Since the mid 1990s, recommender systems have grown to be a major area of deployment in industry, and research in academia. A through-line in this research has been the pursuit, above all else, of the perfect algorithm. With this admirable focus has come a neglect of the full scope of building, maintaining, and improving recommender systems. In this work I outline a system deployment and a series of offline and online experiments dedicated to improving our holistic understanding of recommender systems. This work explores the design, algorithms, early performance, and interfaces of recommender systems within the scope of how they are interconnected with other aspects of the system. This work explores many individual aspects of a recommender system while keeping in mind how they are connected to other aspects of
  • Recommender Systems User-Facing Decision Support Systems
    Recommender Systems: User-Facing Decision Support Systems. Michael Hahsler, Intelligent Data Analysis Lab (IDA@SMU), CSE, Lyle School of Engineering, Southern Methodist University. EMIS 5/7357: Decision Support Systems, February 22, 2012. Michael Hahsler (IDA@SMU) Recommender Systems EMIS 5/7357: DSS 1 / 44. Table of Contents: 1 Recommender Systems & DSS; 2 Content-based Approach; 3 Collaborative Filtering (CF): Memory-based CF, Model-based CF; 4 Strategies for the Cold Start Problem; 5 Open-Source Implementations; 6 Example: recommenderlab for R; 7 Empirical Evidence. Decision Support Systems: "Decision Support Systems are defined broadly [...] as interactive computer-based systems that help people use computer communications, data, documents, knowledge, and models to solve problems and make decisions." Power (2002). Main Components: Knowledge base, Model, User interface. A recommender system is a decision support system that helps a seller choose suitable items to offer given a limited information channel. Recommender Systems: Recommender systems apply statistical and knowledge discovery techniques to the problem of making product recommendations. Sarwar et al. (2000). Advantages of recommender systems (Schafer et al., 2001): Improve conversion rate: Help customers find a product she/he wants to buy.
  • Evaluating Recommendation Systems
    Evaluating Recommendation Systems. Guy Shani and Asela Gunawardana. Abstract: Recommender systems are now popular both commercially and in the research community, where many approaches have been suggested for providing recommendations. In many cases a system designer that wishes to employ a recommendation system must choose between a set of candidate approaches. A first step towards selecting an appropriate algorithm is to decide which properties of the application to focus upon when making this choice. Indeed, recommendation systems have a variety of properties that may affect user experience, such as accuracy, robustness, scalability, and so forth. In this paper we discuss how to compare recommenders based on a set of properties that are relevant for the application. We focus on comparative studies, where a few algorithms are compared using some evaluation metric, rather than absolute benchmarking of algorithms. We describe experimental settings appropriate for making choices between algorithms. We review three types of experiments, starting with an offline setting, where recommendation approaches are compared without user interaction, then reviewing user studies, where a small group of subjects experiment with the system and report on the experience, and finally describe large scale online experiments, where real user populations interact with the system. In each of these cases we describe types of questions that can be answered, and suggest protocols for experimentation. We also discuss how to draw trustworthy conclusions from the conducted experiments. We then review a large set of properties, and explain how to evaluate systems given relevant properties. We also survey a large set of evaluation metrics in the context of the property that they evaluate.