Eliciting Pairwise Preferences in Recommender Systems
Total Page:16
File Type:pdf, Size:1020Kb
Eliciting Pairwise Preferences in Recommender Systems Saikishore Kalloori, Francesco Ricci and Rosella Gennari Free University of Bozen - Bolzano, Italy Piazza Domenicani 3, I - 39100, Bolzano, Italy [email protected],[email protected],[email protected] ABSTRACT from absolute item evaluations, e.g., user ratings or likes [31]. How- Preference data in the form of ratings or likes for items are widely ever, this type of preferences have few disadvantages, illustrated in used in many Recommender Systems. However, previous research previous literature [5, 22, 23]. For instance, if a user has assigned has shown that even item comparisons, which generate pairwise the highest rating to an item and, subsequently, the user nds that preference data, can be used to model user preferences. Moreover, he prefers another item to the rst one, the user has no choice but pairwise preferences can be eectively combined with ratings to to also give to this new item the highest rating [11]. Moreover, if compute recommendations. In such hybrid approaches, the Rec- most of the items are rated with the largest rating, then it is dicult ommender System requires to elicit both types of preference data to understand which items the user does like the most. from the user. In this work, we aim at identifying how and when to For such reasons, few researchers [5, 11, 22] have developed elicit pairwise preferences, i.e., when this form of user preference recommendation techniques that leverage an alternative way to data is more meaningful for the user to express and more benecial collect user preferences, namely, pairwise comparisons, such as for the system. We conducted an online A/B test and compared a item i is preferred to item j. In these scenarios, users provide their rating-only based system variant with another variant that allows preferences by expressing which item is preferred in a pair (i;j). the user to enter both types of preferences. Our results demonstrate In [24, 25], it was shown that pairwise preferences can be eec- that pairwise preferences are valuable and useful, especially when tively used for training matrix factorization and nearest neighbor the user is focusing on a specic type of items. By incorporating approaches, and ultimately to compute eective recommendations. pairwise preferences, the system can generate beer recommenda- It is worth noting that pairwise preferences naturally arise and tions than a state of the art rating-only based solution. Additionally, are expressed by users (directly or indirectly) in many decision- our results indicate that there seems to be a dependency between making scenarios. Actually, in everyday life, there are situations the user’s personality, the perceived system usability and the sat- where rating alternative options is not the most natural mechanism isfaction for the preference elicitation procedure, which varies if for expressing preferences and making decisions. For instance, we only ratings or a combination of ratings and pairwise preferences do not rate sweaters when we want to buy one. It is more likely that are elicited. we will compare them and purchase the preferred one. However, understanding which pairwise preferences the user should express, CCS CONCEPTS so as to learn her user model and when to acquire these comparisons, is a pivotal issue in designing eective RSs based on this form of •Information systems ! Recommender systems; preference data. KEYWORDS For this reason, in this work, we aim at understanding when eliciting pairwise preferences is meaningful and benecial. We Pairwise Preferences; Ratings; Recommender Systems hypothesize that pairwise preferences are more eective, if com- ACM Reference format: pared with ratings, in situations and scenarios where the user has a Saikishore Kalloori, Francesco Ricci and Rosella Gennari. 2018. Eliciting rather clear objective and is looking for a specic type of items (rec- Pairwise Preferences in Recommender Systems. In Proceedings of Twelh ommendation). For instance, when a user wants to enjoy a beach ACM Conference on Recommender Systems, Vancouver, BC, Canada, October resort along with his family during a weekend, it is more likely 2–7, 2018 (RecSys ’18), 9 pages. that he will choose the best option by comparing the available ones DOI: 10.1145/3240323.3240364 instead of rating them. We focus on such user situations having a specic goal/context, and we model their preferences. Our intuition 1 INTRODUCTION is that in order to make a comparison, the user requires a (set of) Most of the current Recommender Systems (RSs) generate person- criteria to formulate a judgment. Moreover, we also hypothesize alised suggestions by leveraging preference information derived that the items to be compared should not be too dierent; a buyer Permission to make digital or hard copies of all or part of this work for personal or can compare two cars but not a car and a bicycle. In fact, even classroom use is granted without fee provided that copies are not made or distributed the adjective “comparable” means that the compared objects are for prot or commercial advantage and that copies bear this notice and the full citation somewhat similar. on the rst page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permied. To copy otherwise, or We also aim at addressing the issue of which specic items the republish, to post on servers or to redistribute to lists, requires prior specic permission user should compare. In fact, in addition to the issue mentioned and/or a fee. Request permissions from [email protected]. above (i.e., that the compared items should be somewhat similar), RecSys ’18, Vancouver, BC, Canada © 2018 Copyright held by the owner/author(s). Publication rights licensed to ACM. an additional downside of acquiring user preference information 978-1-4503-5901-6/18/10...$15.00 from pairwise comparison is that the number of possible compar- DOI: 10.1145/3240323.3240364 isons in a set of N items is N 2. Even though it has been shown 329 RecSys ’18, October 2–7, 2018, Vancouver, BC, Canada Saikishore Kalloori, Francesco Ricci and Rosella Gennar that in practice RSs based on pairwise preferences do not require a comprehensive discussion of the obtained results. Finally, we more preference data than systems based on ratings [5], it is still formulate our conclusions and discuss future work. important and critical to identify, for each user, a convenient set of comparisons that she should express. In order to eectively 2 RELATED WORK elicit pairwise preferences, the RS needs a usable GUI and an active 2.1 Pairwise Preferences learning method that can identify which pairs of items the users could and should compare. Pairwise preferences (relative preferences or comparisons) are Hence, in this paper, we develop a specic interaction model, GUI widely used even outside the RS research area, e.g., in decision components, an active learning strategy for pairwise preferences making and marketing. For instance, popular techniques, such as, and a personalized ranking algorithm that uses pairwise preferences Analytic Hierarchy Processing (AHP) [34] or Conjoint Analysis in order to validate the following hypotheses: [37], use relative comparison between product features to under- stand or elicit users interest. In Behavioral Economics, according to (H ) Pairwise preferences can lead to a larger system usability 1 Preference Reversal theory [40], user’s preferences while evaluating compared to ratings. items individually (Separate Evaluation) dier from the preferences (H2) e usage of pairwise preferences can lead to a larger user expressed while comparing items together (Joint Evaluation). Users satisfaction for the preference elicitation process compared tend to express preferences dierently in these two situations [1] to the usage of ratings, when the choice set has been al- and therefore it is important for RSs to analyse the quality of rec- ready reduced. ommendations generated by alternative preference models (e.g., ratings vs. pairwise preferences). Moreover, pairwise comparison (H3) Pairwise preferences can lead to higher RS accuracy and are shown to be useful in obtaining reliable teacher assessments quality compared to ratings when the user has a clear [19] and peer assessment of undergraduate studies [21]. decision making objective. In previous research, it has been already shown that pairwise Modeling user preferences in the form of pairwise preferences and preference judgments are also faster than ratings to express, and identifying which situation is best suited for eliciting pairwise pref- users prefer to provide their preferences in this form [12, 22]. Ad- erences has not been explored in RSs so far. erefore, in order to ditionally, the authors of [22] showed that pairwise preferences are validate our research hypotheses, we have implemented techniques more stable than ratings. More in general, some research works in for pairwise preference elicitation and recommendation generation RSs, instead of modeling user preferences using ratings, tried mod- in a mobile RS, called South Tyrol Suggests (STS). STS recommends eling user preferences in other forms such as group based elicitation interesting Places of Interests (POIs) in South Tyrol region in Italy approach using groups of items [32, 39] and choice based prefer- [7]. STS is an Android-based RS that provides users with context- ence elicitation [18]. ese approaches have shown to yield good aware recommendations. We have implemented two RSs variants, performance in terms of reducing user eort, time and improving using STS, and conducted a live user study. One variant is based user satisfaction when compared to ratings. on ratings only while the other employs our newly developed algo- Focussing on recommendation algorithms, while there are nu- rithms, gathers user preferences in the form of pairwise preferences merous RSs that generate recommendations by using user pref- and makes recommendations using them together with a possibly erences in the form of ratings, recently, few authors developed pre-existent rating data set.