Getting to Know You: Learning New User Preferences in Recommender Systems

Al Mamunur Rashid, Istvan Albert, Dan Cosley, Shyong K. Lam, Sean M. McNee, Joseph A. Konstan, John Riedl
GroupLens Research Project
Department of Computer Science and Engineering
University of Minnesota
Minneapolis, MN 55455 USA
{arashid, ialbert, cosley, lam, mcnee, konstan, riedl}@cs.umn.edu

ABSTRACT
Recommender systems have become valuable resources for users seeking intelligent ways to search through the enormous volume of information available to them. One crucial unsolved problem for recommender systems is how best to learn about a new user. In this paper we study six techniques that collaborative filtering recommender systems can use to learn about new users. These techniques select a sequence of items for the collaborative filtering system to present to each new user for rating. The techniques include the use of information theory to select the items that will give the most value to the recommender system, aggregate statistics to select the items the user is most likely to have an opinion about, balanced techniques that seek to maximize the expected number of bits learned per presented item, and personalized techniques that predict which items a user will have an opinion about. We study the techniques through offline experiments with a large pre-existing user data set, and through a live experiment with over 300 users. We show that the choice of learning technique significantly affects the user experience, in both the user effort and the accuracy of the resulting predictions.

Keywords
Recommender systems, collaborative filtering, information filtering, startup problem, entropy, user modeling.

INTRODUCTION
People make decisions every day. "Which movie should I see?" "What city should I visit?" "What book should I read?" "What web page has the information I need?" We have far too many choices and far too little time to explore them all. The exploding availability of information that the web provides makes this problem even tougher.

Recommender systems help people make decisions in these complex information spaces. Recommenders suggest to the user items that she may value, based on knowledge about her and the space of possible items. A news service, for example, might remember the articles a user has read. The next time she visits the site, the system can recommend new articles to her based on the ones she has read before.

Collaborative filtering is one technique for producing recommendations. Given a domain of choices (items), users can express their opinions (ratings) of items they have tried before. The recommender can then compare the user's ratings to those of other users, find the "most similar" users based on some criterion of similarity, and recommend items that similar users have liked in the past.

When new users come along, however, the system knows nothing about them. This is called the new user problem for recommender systems [1, 2, 6]. The system must acquire some information about the new user in order to make personalized predictions. The most direct way to do this is to ask for ratings directly by presenting items to the user.

However, the system must be careful to present useful items that garner information. A food recommender, for instance, probably should not ask whether a new user likes vanilla ice cream. Most people like vanilla ice cream, so knowing that a new user likes it tells you little about the user. At the same time, the recommender should ask about items the user is likely to have an opinion about. A travel recommender would probably not benefit by asking a new user if she liked Burkina Faso, for instance. The recommender system is likely to learn only that, like most people, she has not visited Burkina Faso, which is of little value in forming future travel recommendations.
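The collaborative filtering loop described above can be made concrete with a small sketch. The data, the choice of cosine similarity, and all names below are illustrative assumptions; the paper leaves the similarity criterion open ("some criterion of similarity"), and real systems use far larger rating matrices.

```python
from math import sqrt

# Hypothetical toy ratings on a 1-5 scale: user -> {item: rating}.
ratings = {
    "ann":  {"A": 5, "B": 4, "C": 1},
    "bob":  {"A": 5, "B": 5, "C": 1, "D": 4},
    "carl": {"A": 1, "B": 1, "C": 5, "D": 2},
}

def similarity(u, v):
    """Cosine similarity over co-rated items (one possible criterion)."""
    common = ratings[u].keys() & ratings[v].keys()
    if not common:
        return 0.0
    num = sum(ratings[u][i] * ratings[v][i] for i in common)
    den = (sqrt(sum(ratings[u][i] ** 2 for i in common)) *
           sqrt(sum(ratings[v][i] ** 2 for i in common)))
    return num / den

def recommend(user):
    """Score items the user has not rated by the similarity-weighted
    ratings of other users, highest predicted rating first."""
    scores, weights = {}, {}
    for other in ratings:
        if other == user:
            continue
        s = similarity(user, other)
        for item, r in ratings[other].items():
            if item in ratings[user]:
                continue
            scores[item] = scores.get(item, 0.0) + s * r
            weights[item] = weights.get(item, 0.0) + s
    return sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)

print(recommend("ann"))
```

Here ann agrees closely with bob and disagrees with carl, so the predicted rating for the unseen item D lands near bob's rating of 4 rather than carl's 2.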
The choice of exactly what questions to ask a new user, then, is critical. An intelligent recommender interface will minimize a new user's effort and get him to the fun part—using the system and seeing recommendations—while still learning enough to make good recommendations.

In this paper we explore approaches for choosing which items to present to new users for rating. We consider this problem in the general case of recommender systems, illustrating strategies and performing experiments using the MovieLens movie recommender. We first survey related work in the areas of decision theory and recommender systems, then consider approaches for selecting movies to present to users. We test these approaches on historical data drawn from the 7.5 million-rating MovieLens dataset. We also test three of the most promising strategies on over 300 new MovieLens users. We then discuss the results and suggest directions for future work.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. IUI'02, January 13-16, 2002, San Francisco, California, USA. Copyright 2002 ACM 1-58113-459-2/02/0001…$5.00.

[…] on attributes of items) and asking questions that help choose an appropriate model for a user. When these models are accurate they can be quite useful, but the premise of personalized recommender systems and collaborative filtering is that a person's preferences are a better predictor of other preferences than other attributes. Category and demographic models are thus less general than the methods we present; they apply only to certain domains, and require domain-specific expertise.
RELATED WORK
We briefly mention related work in the field of decision theory and survey work that has been done on the new user problem in the area of recommender systems.

Decision theory and entropy
Decision theory has proved useful in determining models for re-ordering search results [4]. This application of utility functions has also been used in recommender systems [13, 14].

Analysis of data for entropy—its theoretical information content—has been a standard technique in information retrieval [10], medical diagnostic systems [9], and sequential classification problems [3] for many years. Lately, researchers have extended the use of entropy into areas such as probabilistic models for information retrieval [7] and value-of-information analysis [16].

We apply decision theory techniques to a new problem: choosing the items to first present to a new user of a recommender system. Our problem is in some ways the converse of the cited research; we are selecting items as questions to present to the user, rather than choosing which answers to present for a user's question.

Recommender systems and the new user problem
There has been little work on solving the new user problem by analyzing ratings data to make smart decisions. Pennock and Horvitz proposed the use of a "value-of-information" calculation to discover the most valuable ratings information to gather next from a user [14]. To our knowledge, they have not published any implementations or evaluations of their calculations.

Kohrs and Merialdo make use of entropy and variance in their ratings data in order to generate more accurate predictions for new users [12]. Our work expands their results by using a number of strategies that we consider more suitable than variance or entropy. We also have a much larger dataset for our offline experiments, and we verify our findings in a live experiment.

Filterbots are a technique to overcome the startup problem for new items in a collaborative filtering system by injecting ratings agents that rate every item in the system according to their algorithmic analysis of the content of the item [6]. Filterbots can make sure that every item in the system has many ratings to help users find the items they are most interested in. However, filterbots do not directly attack the new user problem.

Others have integrated agents into a collaborative filtering environment to extract user preference information transparently [17]. This method has the advantage of collecting implicit information in addition to explicitly provided ratings, and should gather data for new users more rapidly. Using implicit data in addition to explicit data is a promising approach, and is complementary to our approach of carefully selecting which explicit data to collect.

STRATEGIES FOR SELECTING ITEMS TO PRESENT
There are trade-offs to be made when choosing a strategy for presenting items. As discussed in the introduction, requiring too much effort of the user will cause some users to give up, while not asking enough questions will result in poor recommendations. We identify four dimensions that a strategy might choose to support: (a) User effort: how hard was it to sign up? (b) User satisfaction: how well did the user like the signup process? (c) Recommendation accuracy: how well can the system make recommendations to the user? (d) System utility: how well will the system be able to serve all users, given what it learns from this one?

We choose to focus on user effort and accuracy. We chose these two because they are easy to measure, and they can be measured in both off-line and on-line experiments. User satisfaction studies are difficult to do off-line from historical data, and we believe that user satisfaction will rise as user effort falls. While we touch on a few issues related to system utility, such as the danger of introducing biases into a system's ratings database when using certain strategies, we do not focus on it since our primary focus is
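The entropy idea that runs through the discussion above (vanilla ice cream tells you little; the item-selection strategies trade information value against the chance a user can rate the item at all) can be sketched as follows. The rating histograms, the user count, and the particular "expected bits" weighting are illustrative assumptions, not the paper's exact measures.

```python
from math import log2

# Hypothetical rating histograms: item -> counts of ratings 1 through 5.
histograms = {
    "vanilla ice cream": [1, 2, 5, 40, 52],    # nearly everyone rates it high
    "anchovy pizza":     [30, 20, 10, 20, 20],  # opinions are split
    "obscure dish":      [1, 0, 1, 0, 1],       # divisive, but rarely rated
}

def entropy(counts):
    """Shannon entropy (in bits) of an item's rating distribution."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def expected_bits(counts, num_users):
    """Entropy weighted by the fraction of users who rated the item at all:
    a simple 'balanced' score in the spirit of bits learned per presentation."""
    return (sum(counts) / num_users) * entropy(counts)

num_users = 100
for item, h in histograms.items():
    print(f"{item:18s} entropy={entropy(h):.2f} bits  "
          f"expected={expected_bits(h, num_users):.2f}")
```

With these numbers, the divisive "anchovy pizza" has the highest entropy, the near-unanimous "vanilla ice cream" has low entropy, and the "obscure dish" scores high on raw entropy but poorly on the popularity-weighted measure, since almost no one can rate it.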