Infusing Collaborative Recommenders with Distributed Representations
Greg Zanotti¹, Miller Horvath², Lucas Nunes Barbosa² ∗, Venkata Trinadh Kumar Gupta Immedisetty¹, Jonathan Gemmell¹

¹ Center for Web Intelligence, School of Computing, DePaul University, Chicago, IL, USA
² CAPES Foundation, Ministry of Education of Brazil, Brasilia, DF, BRA

gzanotti,millerhorvath,lnbarbosa,vimmedis,[email protected]

∗ Corresponding Author

arXiv:1608.06298v1 [cs.IR] 22 Aug 2016

ABSTRACT

Recommender systems assist users in navigating complex information spaces and focus their attention on the content most relevant to their needs. Often these systems rely on user activity or descriptions of the content. Social annotation systems, in which users collaboratively assign tags to items, provide another means to capture information about users and items. Each of these data sources provides unique benefits, capturing different relationships.

In this paper, we propose leveraging multiple sources of data: ratings data as users report their affinity toward an item, tagging data as users assign annotations to items, and item data collected from an online database. Taken together, these datasets provide the opportunity to learn rich distributed representations by exploiting recent advances in neural network architectures. We first produce representations that subjectively capture interesting relationships among the data. We then empirically evaluate the utility of the representations to predict a user's rating on an item and show that they outperform more traditional representations. Finally, we demonstrate that traditional representations can be combined with representations trained through a neural network to achieve even better results.

CCS Concepts

• Information systems → Collaborative filtering; Social recommendation; Social tagging; • Computing methodologies → Neural networks; Ensemble methods

Keywords

Collaborative filtering; neural networks; recommender systems; distributed representations

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
RecSys-DLRS September 15, 2016, Boston, MA, USA
© 2016 ACM. ISBN 123-4567-24-567/08/06. $15.00
DOI: 10.475/123 4

1. INTRODUCTION

Modern websites provide a myriad of conveniences. Users can browse news articles, watch movies, download books or post their latest vacation pictures. The sheer volume of available content can be overwhelming and users often suffer from information overload. Consequently, websites often offer some level of personalization.

By personalizing a user's interaction with the website, the user's attention can be focused on the content most related to his or her interests. Personalization might be achieved by allowing the user to manually select topics of interest. For example, a visitor to a news aggregation site might select "soccer" and "finance" from a list of available topics. Other techniques identify a user's interests by passively observing their activity on the site.

Recommender systems identify relevant content for a user based on interests they have exhibited in the past. For example, users that have rated popular science fiction movies highly in the past might be recommended the newest sci-fi blockbuster.

Recommendation algorithms often work by computing the similarity between users or between items. To compute this similarity, a representation is needed. Users can be represented as a vector over the item space. Likewise, items are often represented as the users that have consumed them.

Representing users and items in this way has proven to be useful. The representations are simple to construct and robust when built from large amounts of data. However, their inherent simplicity means they fail to capture richer semantic relationships.

In this work, we construct alternative user and item representations learned through a neural network. These resulting representations have several benefits. First, they are relatively quick to create when compared to other neural network techniques. Second, they capture meaningful semantic information. Third, by exploiting this semantic information, recommender systems can better predict a user's interests.

Our experiments on a real-world dataset first offer subjective evidence that distributed representations capture meaningful semantic relationships. These representations can even be manipulated to produce new representations. For example, after computing the representations for users, movies, directors, actors and tags in the movie domain, we observe that "Dr. No" minus "Sean Connery" plus "Roger Moore" produces a new representation, most closely related to the representation of "Octopussy". The representations appear to capture the notion that Moore replaced Connery in the Bond franchise.

Our evaluation then turns to the utility of incorporating these representations into a recommender system. Experiments conducted on the MovieLens 10M Dataset and the Internet Movie Database show that distributed representations provide useful information to the system and improve the root mean square error of predicted ratings.

The rest of this paper is organized as follows. In Section 2 we present related work. Section 3 describes how we learn the distributed representations. We discuss how we incorporate these representations in a recommender engine in Section 4. Our experimental evaluation is presented in Section 5 and our experimental results in Section 6. We conclude the paper with a discussion of our results and directions for future work in Section 7.

2. RELATED WORK

Collaborative filtering algorithms work on the assumption that users that have behaved similarly in the past are likely to exhibit similar behavior in the future. User-based collaborative filtering [15, 30] works by identifying similar users and recommending items these neighbors have liked. Alternatively, item-based collaborative filtering [5, 28] identifies items similar to those the user has liked in the past. Content-based recommender systems [1, 23] model users and items in a content space, such as the terms in a newspaper article or the meta-data of a movie.

In social annotation systems, users describe the content of online resources by collaboratively assigning tags. For example, a user might annotate Dr. No with "Spy". Taken together, the annotations of many users create a rich description of users and items. In such systems, recommenders can promote items [7], tags [6, 13] or even other users [37]. Tags have also been used in models to predict ratings [29, 38] and have been integrated into traditional collaborative filtering [32] and content-based recommenders [4].

An alternative way to learn features about items and users is [...]ing television programs [2] or online products [31].

This paper extends these previous efforts in recommender systems, distributed representations, and hybrid algorithms. Deep learning and neural networks have been applied to recommender systems before; for example, in [34] the authors propose a deep learning-based framework to fuse content and ratings information, while [33] uses a convolutional neural network to create a content-based music recommender. Our model learns distributed representations for users, movies, directors, actors, and tags, contextual information for which representations are useful in their own right (e.g. for finding actors that a user may like, or for recommending tags). These representations can be used in place of those commonly employed in user-based or item-based recommenders, making it easy to adapt existing recommender systems to our method. We then combine the results of the traditional representations with those of the new representations in a hybrid recommender system.

3. DISTRIBUTED REPRESENTATIONS

Distributed representations have a long history of use in machine learning due to their generality and flexibility in representing higher-dimensional data [11]. The process of creating a distributed representation may add meaningful constructive properties to elements in the transformed vector space as well. Elements from the space can then be used as high-quality lower-dimensional features in other learning problems.

Recently, developments in continuous space language models have produced effective methods to provide meaningful distributed representations of words and related concepts in the domain of natural language processing. In particular, models utilizing a single hidden layer neural network architecture with a specially designed structure have been shown to be efficient and accurate on a variety of word similarity tasks [17, 18]. These architectures avoid nonlinear hidden layers, thereby significantly reducing training and inference complexity. The distributed representations produced by these models are of relatively low dimension, create meaningful neighborhoods of words, and have useful compositional properties that capture semantic features [19]. Two models in particular are used to produce these representations: the Continuous Bag of Words model and the Skip-gram model.
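
The compositional property described in this paper, where "Dr. No" minus "Sean Connery" plus "Roger Moore" lands nearest to "Octopussy", can be illustrated with a minimal sketch. The three-dimensional vectors below are hand-picked toy values and the helper names `cosine` and `analogy` are hypothetical; none of this is the authors' learned model, whose representations come from training and have many more dimensions.

```python
import math

# Toy vectors invented for illustration only (an assumption, not learned data).
vectors = {
    "Dr. No":       [0.9, 0.8, 0.1],
    "Octopussy":    [0.9, 0.1, 0.8],
    "Sean Connery": [0.1, 0.9, 0.0],
    "Roger Moore":  [0.1, 0.0, 0.9],
    "Toy Story":    [-0.7, 0.2, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(start, minus, plus, table):
    """Return the entry nearest to table[start] - table[minus] + table[plus].

    The three query entries are excluded from the candidates, a common
    convention when evaluating vector analogies.
    """
    query = [a - b + c
             for a, b, c in zip(table[start], table[minus], table[plus])]
    excluded = {start, minus, plus}
    candidates = [name for name in table if name not in excluded]
    return max(candidates, key=lambda name: cosine(query, table[name]))


result = analogy("Dr. No", "Sean Connery", "Roger Moore", vectors)
print(result)  # prints "Octopussy" for these toy vectors
```

Cosine similarity is used here as one plausible choice of nearness measure; with trained representations the same lookup would simply run over the learned vector table instead of the toy dictionary.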