
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)

A Hybrid Collaborative Filtering Model with Deep Structure for Recommender Systems

Xin Dong, Lei Yu, Zhonghuo Wu, Yuxia Sun, Lingfeng Yuan, Fangxi Zhang
Ctrip Travel Network Technology (Shanghai) Co., Limited, Shanghai, P.R. China
{dongxin, yu lei, zh wu, yx sun, lfyuan, fxzhang}@ctrip.com

Abstract

Collaborative filtering (CF) is a widely used approach in recommender systems to solve many real-world problems. Traditional CF-based methods employ the user-item matrix, which encodes the individual preferences of users for items, to learn to make recommendations. In real applications, the rating matrix is usually very sparse, causing CF-based methods to degrade significantly in recommendation performance. In this case, some improved CF methods utilize the increasing amount of side information to address the data sparsity problem as well as the cold start problem. However, the learned latent factors may not be effective due to the sparse nature of the user-item matrix and the side information. To address this problem, we utilize advances in learning effective representations from deep learning, and propose a hybrid model which jointly performs deep learning of users' and items' latent factors from side information and collaborative filtering from the rating matrix. Extensive experimental results on three real-world datasets show that our hybrid model outperforms other methods in effectively utilizing side information and achieves performance improvements.

Introduction

In recent years, with the growing number of choices available online, recommender systems are becoming more and more indispensable. The goal of recommender systems is to help users identify the items that best fit their personal tastes from a large repository of items. Many commerce companies also use recommender systems to target their customers by recommending items. Over the years, various algorithms for recommender systems have been developed. Such algorithms can roughly be categorized into two groups (Shi, Larson, and Hanjalic 2014): content-based and collaborative filtering (CF) based methods. Content-based methods (Lang 1995) utilize user profile or item content information for recommendation. CF-based methods (Salakhutdinov and Mnih 2011), on the other hand, ignore user or item content information and use past activities or preferences, such as a user's buying/viewing history or ratings on items, to make recommendations. Nevertheless, CF-based methods are often preferred to content-based methods because of their impressive performance (Su and Khoshgoftaar 2009).

The most successful approach among CF-based methods is to learn effective latent factors directly from the user-item rating matrix by matrix factorization techniques (Koren et al. 2009). However, the rating matrix is often very sparse in the real world, causing CF-based methods to degrade significantly in learning the appropriate latent factors. This phenomenon is particularly serious on online travel agent (OTA) websites such as Ctrip.com, since users access these websites with lower frequency. Moreover, another limitation of CF-based methods is how to provide recommendations when a new item arrives in the system, which is also known as the cold start problem: the system cannot recommend new items that have not yet received any rating information from users.

In order to overcome the cold start and data sparsity problems, it is inevitable for CF-based methods to exploit additional sources of information about the users or items, also known as side information, and hence hybrid CF methods have gained popularity in recent years (Shi, Larson, and Hanjalic 2014). The side information can be obtained from user profiles and item content, such as demographics of users, properties of items, etc. Some hybrid CF-based methods (Singh and Gordon 2008; Nickel, Tresp, and Kriegel 2011; Wang and Blei 2011) have integrated side information into matrix factorization to learn effective latent factors. However, these methods employ the side information only as regularization, and the learned latent factors are often not effective, especially when both the rating matrix and the side information are very sparse (Agarwal, Chen, and Long 2011). Therefore, it is highly desirable to solve this latent factor learning problem on such datasets.

Recently, one of the most powerful methods to learn effective representations is deep learning (Hinton and Salakhutdinov 2006; Hinton, Osindero, and Teh 2006). Thus, with large-scale ratings and rich additional side information, it is natural to integrate deep learning into recommender systems to learn latent factors. Several studies have applied deep learning directly to the task of collaborative filtering. Salakhutdinov, Mnih, and Hinton (2007) employ restricted Boltzmann machines to perform CF. Although this method combines deep learning and CF, it does not incorporate side information, which is crucial for accurate recommendation. Other work (Van den Oord, Dieleman, and Schrauwen 2013; Wang and Wang 2014) directly uses convolutional neural networks (CNN) or deep belief networks (DBN) to obtain latent factors for content information, but these are content-based methods which only infer latent factors for items, and they are especially suited to music datasets. Furthermore, work that applies Bayesian stacked denoising autoencoders (SDAE) or marginalized SDAE to CF (Wang, Wang, and Yeung 2015; Li, Kawale, and Fu 2015) requires learning a large number of manually adjusted hyper-parameters.

In this paper, to address the challenges above, we propose a hybrid collaborative filtering model with deep structure for recommender systems. We first present a novel deep learning model called the additional stacked denoising autoencoder (aSDAE), which extends the stacked denoising autoencoder to integrate additional side information into the inputs, thereby alleviating the cold start and data sparsity problems. With this, we then present our hybrid model, which tightly couples deep representation learning for the additional side information with collaborative filtering for the rating matrix. Experiments show that our hybrid model significantly outperforms the state of the art. Specifically, the main contributions of this paper can be summarized in the following three aspects:

• We propose a hybrid collaborative filtering model which integrates deep representation learning and matrix factorization. It simultaneously extracts effective latent factors from side information and captures the implicit relationship between users and items.

• We present a novel deep learning model, aSDAE, which is a variant of SDAE and can efficiently integrate side information into the learned latent factors.

• We conduct experiments on three real-world datasets to evaluate the effectiveness of our hybrid model. Experimental results show that our hybrid model outperforms four state-of-the-art methods in terms of root mean squared error (RMSE) and recall metrics.
Preliminaries

In this section, we first formulate the problem discussed in this paper, and then briefly review matrix factorization.

Problem Definition

Similar to some existing works (Hu, Koren, and Volinsky 2008), this paper takes implicit feedback as training and testing data to complete the recommendation task. In a standard recommendation setting, we have m users, n items, and an extremely sparse rating matrix R ∈ R^{m×n}. Each entry R_ij of R corresponds to user i's rating on item j. If R_ij ≠ 0, the rating of user i on item j is observed; otherwise it is unobserved. Each user i can be represented by a partially observed vector s_i^{(u)} = (R_{i1}, ..., R_{in}) ∈ R^n. Identically, each item j can be represented by a partially observed vector s_j^{(i)} = (R_{1j}, ..., R_{mj}) ∈ R^m. Moreover, the additional side information matrices of users and items are denoted by X ∈ R^{m×p} and Y ∈ R^{n×q}, respectively.

Let u_i, v_j ∈ R^k be user i's latent factor vector and item j's latent factor vector respectively, where k is the dimensionality of the latent space. The corresponding matrix forms of the latent factors for users and items are U = u_{1:m} and V = v_{1:n}, respectively. Given the sparse rating matrix R and the side information matrices X and Y, the goal is to learn the user latent factors U and the item latent factors V, and hence to predict the missing ratings in R.

Matrix Factorization

An effective collaborative filtering approach is matrix factorization (Koren et al. 2009). By factorizing the user-item interaction matrix, matrix factorization maps both users and items to a joint latent factor space, so that user-item interactions are modeled as inner products in that space. Formally, matrix factorization decomposes the original rating matrix R into two low-rank matrices U and V consisting of the user and item latent factor vectors respectively, such that R ≈ UV^T. Given the latent factor vectors of a user and an item, the user's rating for the item is predicted by the inner product of those vectors.

The objective function of matrix factorization can be written as:

arg min_{U,V} L(R, UV^T) + λ(||U||_F^2 + ||V||_F^2),

where L(·,·) is a loss function that measures the distance between two matrices of the same size, the last two terms are regularizations used to avoid overfitting, and ||·||_F denotes the Frobenius norm. By specifying different L(·,·), many matrix factorization models have been proposed, for example, non-negative matrix factorization (Lee and Seung 2001), probabilistic matrix factorization (Salakhutdinov and Mnih 2011), Bayesian probabilistic matrix factorization (Salakhutdinov and Mnih 2008), max-margin matrix factorization (Srebro, Rennie, and Jaakkola 2004), etc.

When side information is available, some matrix factorization models generate a rating from the product of latent factor vectors which contain additional information about users or items. Various models show that additional side information can act as a useful informative prior that can significantly improve results (Porteous, Asuncion, and Welling 2010; Singh and Gordon 2008).
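As a concrete illustration of the objective above, the following is a minimal sketch (ours, not the authors' code) of matrix factorization with a squared loss, trained by stochastic gradient descent over the observed entries only; all names and defaults are illustrative.

```python
import numpy as np

def mf_sgd(R, k=64, lam=0.01, eta=0.004, epochs=50):
    """Sketch: minimize sum over observed (i,j) of (R_ij - u_i . v_j)^2
    plus lam * (||U||_F^2 + ||V||_F^2) by SGD, assuming a dense R with
    zeros marking unobserved entries."""
    m, n = R.shape
    U = 0.01 * np.random.randn(m, k)
    V = 0.01 * np.random.randn(n, k)
    rows, cols = R.nonzero()                 # indices of observed ratings
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = R[i, j] - U[i] @ V[j]      # prediction error on one entry
            ui = U[i].copy()                 # keep old u_i for the v_j update
            U[i] += eta * (err * V[j] - lam * U[i])
            V[j] += eta * (err * ui - lam * V[j])
    return U, V
```

In practice one would shuffle the observed entries each epoch and stop when the validation error no longer improves.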
Experi- nov and Mnih 2011), Bayesian probabilistic matrix factor- mental results show that our hybrid model outperforms ization (Salakhutdinov and Mnih 2008), max-margin matrix four state-of-art methods in terms of root mean squared factor (Srebro, Rennie, and Jaakkola 2004), etc. error(RMSE) and recall metrics. When side information are available, some matrix factor- ization models generate a rating from the product of latent Preliminaries factor vectors which contain additional information about users or items. Various models show that additional side in- In this section, we start with formulating the problem dis- formation can act as a useful informative prior that can sig- cussed in this paper, and then have a brief view on matrix nificantly improve results (Porteous, Asuncion, and Welling factorization. 2010; Singh and Gordon 2008). Problem Definition Additional Stacked Denoising Similar to some existing works(Hu, Koren, and Volinsky 2008), this paper also takes implicit feedback as training and In this section we first provide a introduction of additional testing data to complete the recommendation task. In a stan- denoising autoencoder and then give a detailed description dard recommendation setting, we have m users, n items, and of additional stacked denoising autoencoder(aSDAE). an extremely sparse rating matrix R ∈ Rm×n. Each entry Rij of R corresponds to user i’s rating on item j.IfRij =0 , Additional Denoising Autoencoder it means the rating about user i on item j is observed, oth- An autoencoder is a specific form of neural network, which erwise unobserved. Each user i can be represented by a par- consists of an encoder and a decoder component. The en- (u) ∈ Rn tially observed vector si =(Ri1, ..., Rin) . Identi- coder g(·) takes a given input s and maps it to a hidden rep- cally, each item j can be represented by a partially observed resentation g(s), while the decoder f(·) maps this hidden (i) ∈ Rm vector sj =(R1j , ..., Rmj) . Moreover, the addi- representation back to a reconstructed version of s, such that tional side information matrix of user and item are denoted f(g(s)) ≈ s. The parameters of the autoencoder are learned by X ∈ Rm×p and Y ∈ Rn×q, respectively. to minimize the reconstruction error, measured by some loss

A denoising autoencoder (DAE) incorporates a slight modification to this setup: it reconstructs the input from a corrupted version of it, with the motivation of learning a more effective representation (Vincent et al. 2008). A denoising autoencoder is trained to reconstruct the original input s from its corrupted version s̃ by minimizing L(s, f(g(s̃))). Usual choices of corruption include additive isotropic Gaussian noise or binary masking noise (Vincent et al. 2008). Moreover, various types of autoencoders have been developed and have shown promising results in several domains (Chen et al. 2012; Lee et al. 2009).

In this paper, we extend the denoising autoencoder to integrate additional side information into the inputs, as shown in Figure 1(a). Given a sample set S = [s_1, ..., s_n] and the corresponding side information set X = [x_1, ..., x_n], the additional denoising autoencoder (aDAE) considers random corruptions over S and X to obtain S̃ and X̃. It then encodes and decodes the inputs as follows:

h = g(W_1 s̃ + V_1 x̃ + b),
ŝ = f(W_2 h + b'),
x̂ = f(V_2 h + b'),

where h represents the hidden latent representation of the inputs, s̃ and x̃ represent the corrupted versions of the original inputs s and x, {W, V} are weight matrices, {b, b'} are bias vectors, and g(·) and f(·) are activation functions such as tanh(·). Ŝ and X̂ represent the reconstructions of S and X.

The objective function considers the losses between all the inputs and their reconstructions. An aDAE therefore solves the following optimization problem:

arg min_{W,V,b} α ||S − Ŝ||_F^2 + (1 − α) ||X − X̂||_F^2 + λ (||W||_F^2 + ||V||_F^2),   (1)

where α is a trade-off parameter which balances the two reconstruction losses and λ is a regularization parameter.

Figure 1: The models of aDAE and aSDAE. (a) Additional Denoising Autoencoder (aDAE); (b) Additional Stacked Denoising Autoencoder (aSDAE).

Additional Stacked Denoising Autoencoder

Existing literature has shown that multiple stacked layers can generate rich representations in the hidden layers, and therefore lead to better performance on various tasks (Rifai et al. 2011; Glorot, Bordes, and Bengio 2011; Chen et al. 2012; Kavukcuoglu et al. 2009). The stacked denoising autoencoder (SDAE) stacks several DAEs together to create higher-level representations (Vincent et al. 2010). Inspired by the stacked denoising autoencoder, we stack multiple aDAEs together to form the additional stacked denoising autoencoder (aSDAE). The model of aSDAE is shown in Figure 1(b), and its generative process is as follows:

• For each hidden layer l ∈ {1, ..., L − 1} of the aSDAE model, the hidden representation h_l is computed as

  h_l = g(W_l h_{l−1} + V_l x̃ + b_l),

  where h_0 = s̃ is one of the corrupted inputs.

• For the output layer L, the outputs are computed as

  ŝ = f(W_L h_{L−1} + b_L),
  x̂ = f(V_L h_{L−1} + b_L).

Note that the first L/2 layers of the model act as an encoder and the last L/2 layers act as a decoder. The aSDAE employs a deep network to reconstruct the inputs and minimizes the squared loss between the inputs and their reconstructions; the objective function of aSDAE is similar to Equation (1). Accordingly, we can learn {W_l, V_l, b_l} for each layer using the back-propagation algorithm. In our aSDAE model, we assume that only one hidden layer should be close to the latent factor, and the latent factor vector is generated from the middle (L/2-th) layer, given the total number of layers L.

A Hybrid Collaborative Filtering Model

In some CF-based methods, the main challenges are to infer effective and high-level latent factor vectors for users and items from raw inputs. MF-based methods are able to meet these requirements so as to capture the implicit relationship between users and items; however, they suffer from the cold start and data sparsity problems. Moreover, deep learning models have been shown to be highly effective in discovering high-level hidden representations from raw input data for a variety of tasks (Shen et al. 2014; Li, Kawale, and Fu 2015; Wang, Wang, and Yeung 2015). Therefore, it is straightforward to borrow the expressive ability of deep learning to improve collaborative filtering algorithms. In this section, we propose a hybrid collaborative filtering model which unifies our aSDAE model with matrix factorization for recommender systems.

Overview

The proposed model is a hybrid model which makes use of both the rating matrix and the side information, and combines aSDAE and matrix factorization together. Matrix factorization is a widely used model-based CF method with excellent scalability and accuracy, and aSDAE is a powerful way to extract high-level representations from raw inputs. The combination of these two learning models leverages their benefits to build a more expressive model.

Given the user-item rating matrix R, we first transform R into the set S^{(u)} containing m instances {s_1^{(u)}, ..., s_m^{(u)}}, where s_i^{(u)} = (R_{i1}, ..., R_{in}) is the n-dimensional feedback vector of user i on all the items. Similarly, we obtain the set S^{(i)} with n instances {s_1^{(i)}, ..., s_n^{(i)}}, where s_j^{(i)} = (R_{1j}, ..., R_{mj}) is the m-dimensional feedback vector of item j rated by all the users. Our hybrid model learns the user and item latent factors (i.e., U and V) from R, S^{(u)}, S^{(i)} and the additional side information (i.e., X and Y) through the following optimization objective:

arg min_{U,V} L_R(R, UV^T) + λ(||U||_F^2 + ||V||_F^2) + β L(S^{(u)}, X, U) + δ L(S^{(i)}, Y, V),   (2)

where L_R(·,·) is the loss function for decomposing the rating matrix R into the two latent factor matrices U and V, L(·,·,·) is the function that connects the user or item side information with the latent factors, β and δ are trade-off parameters, and λ is a regularization parameter. Note that the last two terms in Equation (2) are devised using our aSDAE model, which extracts a latent factor matrix from the rating matrix and the additional side information.
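To make the layer-wise computation above concrete, here is a minimal numpy sketch (ours) of one aSDAE forward pass under the generative process just described. We assume sigmoid activations and, since ŝ and x̂ have different dimensions, a separate output bias for the side-information reconstruction (the equations above write a single b_L); all shapes and names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def asdae_forward(s_tilde, x_tilde, W, V, b, b_x, L):
    """One aSDAE forward pass: every hidden layer l receives the corrupted
    side information x_tilde in addition to the previous hidden state.
    W, V, b are length-L lists of appropriately shaped weights/biases;
    b_x is the output bias for x_hat (an assumption, see lead-in)."""
    h = s_tilde                                   # h_0 = corrupted rating vector
    hiddens = []
    for l in range(L - 1):                        # hidden layers 1 .. L-1
        h = sigmoid(W[l] @ h + V[l] @ x_tilde + b[l])
        hiddens.append(h)
    s_hat = sigmoid(W[L - 1] @ h + b[L - 1])      # reconstruction of s
    x_hat = sigmoid(V[L - 1] @ h + b_x)           # reconstruction of x
    latent = hiddens[L // 2 - 1]                  # latent factor: the L/2-th layer
    return s_hat, x_hat, latent
```

The reconstruction loss α ||s − ŝ||^2 + (1 − α) ||x − x̂||^2 can then be computed from the returned pair, as in Equation (1).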

Our Hybrid Model

Let S^{(u)} ∈ R^{m×n} and S^{(i)} ∈ R^{n×m} denote the matrices obtained in the above section, and let S̃^{(u)} and S̃^{(i)} denote their corrupted versions respectively. Moreover, X ∈ R^{m×p} and Y ∈ R^{n×q} are the additional side information matrices about users and items respectively, and their corresponding corrupted versions are X̃ and Ỹ. Figure 2 illustrates our hybrid collaborative filtering model. It indicates that the inputs of the hybrid model are S̃^{(u)}, S̃^{(i)}, X̃, Ỹ and R.

Figure 2: The structure of the proposed hybrid model. The model contains three components: the upper and lower components are two aSDAEs which extract latent factor vectors for users and items respectively; the middle component decomposes the rating matrix R into two latent factor matrices.

As shown in Equation (2), the first term is the loss function of matrix factorization, which decomposes the rating matrix R into the user and item latent factor matrices, i.e.,

L_R(R, UV^T) = Σ_{i,j} I_ij (R_ij − u_i v_j^T)^2,

where I is an indicator matrix indicating the non-empty entries in R.

The last two terms are the loss functions of our aSDAE models, which extract latent factors from the hidden layers for users and items respectively. For simplicity, we set β and δ to 1 in Equation (2). Therefore, the objective function of our hybrid model is formulated as follows:

L = Σ_{i,j} I_ij (R_ij − u_i v_j^T)^2 + α_1 Σ_i (s_i^{(u)} − ŝ_i^{(u)})^2 + (1 − α_1) Σ_i (x_i − x̂_i)^2 + α_2 Σ_j (s_j^{(i)} − ŝ_j^{(i)})^2 + (1 − α_2) Σ_j (y_j − ŷ_j)^2 + λ · f_reg,   (3)

where α_1, α_2 are trade-off parameters and f_reg is the regularization term that prevents overfitting, i.e.,

f_reg = Σ_i ||u_i||_F^2 + Σ_j ||v_j||_F^2 + Σ_l (||W_l||_F^2 + ||V_l||_F^2 + ||b_l||_F^2 + ||W'_l||_F^2 + ||V'_l||_F^2 + ||b'_l||_F^2),

where W_l, V_l and W'_l, V'_l are the weight matrices of the two aSDAEs at layer l, and b_l and b'_l are the corresponding bias vectors.

Generally, the middle layers of the two aSDAEs serve as bridges between the ratings and the additional side information. These two middle layers are the key that enables our hybrid model to simultaneously learn effective latent factors and capture the similarity and relationship between users and items.

Optimization

Although the optimization of the objective function is not jointly convex in all the variables, it is convex in each of them when the others are fixed. Therefore, we can alternately optimize each of the variables in the objective function.

For u_i and v_j, we use the stochastic gradient descent (SGD) algorithm to learn these latent factors. For simplicity, we let L(U, V) denote the objective function when the other variables, irrelevant to U and V, are fixed. The update rules are:

u_i = u_i − η ∂L(U, V)/∂u_i,
v_j = v_j − η ∂L(U, V)/∂v_j,

where η is the learning rate, and the detailed gradients are as follows:

∂L(U, V)/∂u_i = α Σ_i (s_i^{(u)} − ŝ_i^{(u)}) ∂ŝ_i^{(u)}/∂u_i + (1 − α) Σ_i (x_i − x̂_i) ∂x̂_i/∂u_i − Σ_{(i,j) ∈ I} (R_ij − u_i v_j^T) v_j + λ u_i,

∂L(U, V)/∂v_j = α Σ_j (s_j^{(i)} − ŝ_j^{(i)}) ∂ŝ_j^{(i)}/∂v_j + (1 − α) Σ_j (y_j − ŷ_j) ∂ŷ_j/∂v_j − Σ_{(i,j) ∈ I} (R_ij − u_i v_j^T) u_i + λ v_j.
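Putting the two alternating steps together, the following sketch (ours) shows how the latent-factor updates above interleave with back-propagation on the aSDAE weights; `reconstruct_grad` and `backprop_step` are hypothetical placeholders standing in for the corresponding gradient computations, not an API from the paper.

```python
def train_hybrid(R, U, V, user_net, item_net, eta=0.004, lam=0.01, epochs=20):
    """Alternating optimization sketch for the hybrid objective in Eq. (3)."""
    rows, cols = R.nonzero()
    for _ in range(epochs):
        # Step 1: fix the aSDAE weights and update u_i, v_j by SGD.
        for i, j in zip(rows, cols):
            err = R[i, j] - U[i] @ V[j]
            # reconstruct_grad returns the gradient of the aSDAE
            # reconstruction terms with respect to the latent factor.
            gu = -err * V[j] + lam * U[i] + user_net.reconstruct_grad(i, U[i])
            gv = -err * U[i] + lam * V[j] + item_net.reconstruct_grad(j, V[j])
            U[i] -= eta * gu
            V[j] -= eta * gv
        # Step 2: fix U and V, update the aSDAE weights by back-propagation.
        user_net.backprop_step(U)
        item_net.backprop_step(V)
    return U, V
```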

Note that we set α_1 equal to α_2 in Equation (3) for simplicity. Moreover, given U and V, we can learn the weight matrices and biases of each layer using the popular back-propagation algorithm. By alternating the updates of the variables, a local optimum of L can be found. Nevertheless, common techniques such as a momentum term can be used to alleviate the local optimum problem.

Prediction

After the latent factors for each user and item are learned, we approximate the predicted rating R̂_ij as R̂_ij ≈ u_i v_j^T, and then a list of ranked items is generated for each user based on these predicted ratings.

Experiments

In this section, we evaluate the performance of our hybrid model on three real-world datasets from different domains, and compare it with four state-of-the-art algorithms.

Datasets

We use three datasets from different real-world domains, two from MovieLens and one from Book-Crossing, for our experiments. The first two, MovieLens-100K and MovieLens-1M, are commonly used for evaluating the performance of recommender systems (Wang, Shi, and Yeung 2015; Li, Kawale, and Fu 2015). The MovieLens-100K dataset contains 100K ratings from 943 users on 1682 movies, and the MovieLens-1M dataset contains more than 1 million ratings from 6040 users on 3706 movies. Each rating is an integer between 1 and 5. We binarize the explicit data by keeping ratings of four or higher and interpreting them as implicit feedback. As a result, MovieLens-1M is much sparser, with only 2.57% of its user-item matrix entries containing ratings, while MovieLens-100K has ratings in 3.49% of its user-item matrix entries. Moreover, we extract the user and item information provided by the datasets to construct the additional matrices X and Y respectively. To summarize, the user side information contains the user's ID, age, gender, occupation and zip code, encoded into a binary-valued vector of length 1943. Identically, the item side information contains the item's title, release date and 18 categories of movie genre, encoded into a binary-valued vector of length 1822.

The last dataset, Book-Crossing, contains 1,149,780 ratings from 278,858 users on 271,379 books. Ratings are expressed on a scale from 0 to 10, with higher values denoting higher appreciation. We binarize the explicit data by keeping ratings of six or higher and interpreting them as implicit feedback. This leads to a user-item matrix with a sparsity of 99.99%. Some attributes of users and books are also provided in this dataset. The user and item additional matrices are generated as for the above datasets, and the lengths of the two binary vectors are 1973 and 3679.

Evaluation Metric

We employ the root mean squared error (RMSE) as one of the evaluation metrics:

RMSE = sqrt( (1/|T|) Σ_{R_ij ∈ T} (R_ij − R̂_ij)^2 ),

where R_ij is the rating of user i on item j, R̂_ij denotes the corresponding predicted rating, T is the test set and |T| is the total number of ratings in the test set.

Similar to (Wang, Wang, and Yeung 2015; Wang and Blei 2011), we use recall as another evaluation metric, since the rating information is in the form of implicit feedback (Hu, Koren, and Volinsky 2008; Rendle et al. 2009). Specifically, another common metric, precision, is not suited to implicit feedback, because a zero rating in the user-item matrix may mean either that the user is not interested in the item or that the user is unaware of it. To evaluate our hybrid model, we sort the predicted ratings of all the items for each user, and then recommend the top K items to each user. The recall@K for each user is defined as:

recall@K = (number of items the user likes in top K) / (total number of items the user likes).

The final metric result is the average recall over all users.
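Both metrics are straightforward to compute from the learned factors. Below is a small sketch (ours) for dense numpy rating matrices, using R̂_ij = u_i · v_j as in the Prediction section; the exclusion of training items from the ranking is a standard convention we assume here.

```python
import numpy as np

def rmse(R_test, U, V):
    """RMSE over the held-out (nonzero) entries of R_test."""
    rows, cols = R_test.nonzero()
    preds = np.sum(U[rows] * V[cols], axis=1)      # u_i . v_j per test pair
    return float(np.sqrt(np.mean((R_test[rows, cols] - preds) ** 2)))

def recall_at_k(R_train, R_test, U, V, K=50):
    """Average recall@K: rank items by predicted rating for each user."""
    recalls = []
    for i in range(R_train.shape[0]):
        liked = set(R_test[i].nonzero()[0])        # test items the user likes
        if not liked:
            continue
        scores = U[i] @ V.T
        scores[R_train[i].nonzero()[0]] = -np.inf  # exclude already-rated items
        top_k = np.argsort(-scores)[:K]
        recalls.append(len(liked.intersection(top_k)) / len(liked))
    return float(np.mean(recalls))
```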
Baselines and Parameter Settings

In order to evaluate the performance of our model, we compare it with the following recommendation algorithms:

• PMF. Probabilistic Matrix Factorization (Salakhutdinov and Mnih 2011) factorizes the user-item matrix into user and item factors, assuming Gaussian observation noise and Gaussian priors on the latent factor vectors.

• CMF. Collective Matrix Factorization (Singh and Gordon 2008) simultaneously factorizes multiple sources, including the user-item matrix and matrices containing the additional side information.

• CDL. Collaborative Deep Learning (Wang, Wang, and Yeung 2015) is a hierarchical deep Bayesian model which performs deep representation learning for the item information and collaborative filtering for the user-item matrix.

• DCF. Deep Collaborative Filtering (Li, Kawale, and Fu 2015) combines PMF with marginalized denoising stacked autoencoders to achieve recommendation.

• Ours. Our approach as described above: a hybrid collaborative filtering model which unifies our aSDAE model with matrix factorization.

For all the compared models, we train each method with different percentages (60%, 80% and 95%) of the ratings. We randomly select the training set from each dataset and use the remaining data as the test set. We repeat the evaluation five times with different randomly selected training sets, and report the average performance. For our hybrid model, we set the parameters α, β and λ to 0.2, 0.8 and 0.01, respectively. The learning rate η used in the SGD algorithm is set to 0.004. Similar to (Wang, Wang, and Yeung 2015), we use masking noise with a noise level of 0.3 to obtain the corrupted inputs from the raw inputs. In terms of the deep network architecture, the number of layers is set to 4 in our experiments, and the dimensionality of the learned latent factors for users and items is set to 64.
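For reference, the settings listed above can be collected in one place; the dictionary layout below is our own convenience, only the values come from the text.

```python
# Hyper-parameters as reported above; the structure is illustrative only.
EXPERIMENT_CONFIG = {
    "train_ratios": [0.60, 0.80, 0.95],  # random splits, averaged over 5 runs
    "alpha": 0.2,                        # trade-off parameter in Eq. (3)
    "beta": 0.8,
    "lambda": 0.01,                      # regularization strength
    "eta": 0.004,                        # SGD learning rate
    "masking_noise": 0.3,                # corruption level for aSDAE inputs
    "num_layers": 4,                     # depth L of each aSDAE
    "latent_dim": 64,                    # dimensionality k of u_i and v_j
}
```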

Summary of Experimental Results

Table 1: Average RMSE of the compared models with different percentages of training data on the three datasets.

            MovieLens-100K          MovieLens-1M            Book-Crossing
Model     60%     80%     95%     60%     80%     95%     60%     80%     95%
PMF     0.7024  0.5941  0.5673  0.6966  0.5715  0.5415  1.7534  1.2453  1.1896
CMF     0.6986  0.5881  0.5454  0.6758  0.5709  0.5382  1.5467  1.0563  0.9715
CDL     0.6601  0.5667  0.5213  0.6546  0.5435  0.5221  1.4465  0.9921  0.9652
DCF     0.6516  0.5516  0.5135  0.6635  0.5467  0.5335  1.3413  0.9784  0.9448
Ours    0.6436  0.5435  0.5079  0.6449  0.5236  0.5023  1.3206  0.9579  0.9244

Table 1 shows the average RMSE of PMF, CMF, CDL, DCF and our hybrid model with different percentages of training data on the three datasets. We can observe from Table 1 that CMF, CDL, DCF and our hybrid model achieve better performance than PMF, which demonstrates the effectiveness of incorporating additional side information. Moreover, CDL, DCF and our hybrid model outperform PMF and CMF; that is, deep structures can extract better-quality features from the side information. Furthermore, our hybrid model obtains lower RMSE than CDL and DCF, which validates the strength of the latent factor vectors learned by our aSDAE models. The RMSE metric therefore demonstrates the effectiveness of our hybrid model.

Figure 3: Performance comparison of PMF, CMF, CDL, DCF and Ours based on recall@K for the three datasets. (a) Recall on MovieLens-100K; (b) Recall on MovieLens-1M; (c) Recall on Book-Crossing.

Figure 3 shows the recall results comparing PMF, CMF, CDL, DCF and our hybrid model on the three datasets. We can see that PMF is the worst model because of its lack of additional side information. Moreover, CMF performs worse than CDL, DCF and our hybrid model. This may be related to the discussion in (Agarwal, Chen, and Long 2011): when the side information is sparse, CMF may not work well. Figure 3 also shows that our hybrid model achieves much better performance than CDL and DCF, as it takes advantage of our aSDAE models. Consequently, by seamlessly combining our aSDAE models for the additional side information with matrix factorization for the user-item rating matrix, our hybrid model can handle both the sparse user-item rating matrix and the sparse side information much better, learn much more effective latent factors for each user and item, and hence provide more accurate recommendations.

Conclusion

In this paper, we presented a hybrid collaborative filtering model which bridges our aSDAE and matrix factorization. Our hybrid model can learn effective latent factors for users and items from both the user-item rating matrix and the side information. Moreover, the proposed deep learning model, aSDAE, is a variant of SDAE and can efficiently integrate side information into the learned latent factors. Our experimental results show that our hybrid model outperforms four other state-of-the-art algorithms. As part of future work, we will investigate other deep learning models to replace aSDAE for boosting performance further, e.g., recurrent neural networks and convolutional neural networks.

References

Agarwal, D.; Chen, B.-C.; and Long, B. 2011. Localized factor models for multi-context recommendation. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 609–617. ACM.

Chen, M.; Xu, Z.; Weinberger, K.; and Sha, F. 2012. Marginalized denoising autoencoders for domain adaptation. arXiv preprint arXiv:1206.4683.

Glorot, X.; Bordes, A.; and Bengio, Y. 2011. Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), 513–520.

Hinton, G. E., and Salakhutdinov, R. R. 2006. Reducing the dimensionality of data with neural networks. Science 313(5786):504–507.

Hinton, G. E.; Osindero, S.; and Teh, Y.-W. 2006. A fast learning algorithm for deep belief nets. Neural Computation 18(7):1527–1554.

Hu, Y.; Koren, Y.; and Volinsky, C. 2008. Collaborative filtering for implicit feedback datasets. In 2008 Eighth IEEE International Conference on Data Mining, 263–272. IEEE.

Kavukcuoglu, K.; Fergus, R.; LeCun, Y.; et al. 2009. Learning invariant features through topographic filter maps. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), 1605–1612. IEEE.

Koren, Y.; Bell, R.; Volinsky, C.; et al. 2009. Matrix factorization techniques for recommender systems. Computer 42(8):30–37.

Lang, K. 1995. NewsWeeder: Learning to filter netnews. In Proceedings of the 12th International Conference on Machine Learning, 331–339.

Lee, D. D., and Seung, H. S. 2001. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems, 556–562.

Lee, H.; Pham, P.; Largman, Y.; and Ng, A. Y. 2009. Unsupervised feature learning for audio classification using convolutional deep belief networks. In Advances in Neural Information Processing Systems, 1096–1104.

Li, S.; Kawale, J.; and Fu, Y. 2015. Deep collaborative filtering via marginalized denoising auto-encoder. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 811–820. ACM.

Nickel, M.; Tresp, V.; and Kriegel, H.-P. 2011. A three-way model for collective learning on multi-relational data. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), 809–816.

Porteous, I.; Asuncion, A. U.; and Welling, M. 2010. Bayesian matrix factorization with side information and Dirichlet process mixtures. In AAAI.

Rendle, S.; Freudenthaler, C.; Gantner, Z.; and Schmidt-Thieme, L. 2009. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, 452–461. AUAI Press.

Rifai, S.; Vincent, P.; Muller, X.; Glorot, X.; and Bengio, Y. 2011. Contractive auto-encoders: Explicit invariance during feature extraction. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), 833–840.

Salakhutdinov, R., and Mnih, A. 2008. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th International Conference on Machine Learning, 880–887. ACM.

Salakhutdinov, R., and Mnih, A. 2011. Probabilistic matrix factorization. In NIPS, volume 20, 1–8.

Salakhutdinov, R.; Mnih, A.; and Hinton, G. 2007. Restricted Boltzmann machines for collaborative filtering. In Proceedings of the 24th International Conference on Machine Learning, 791–798. ACM.

Shen, Y.; He, X.; Gao, J.; Deng, L.; and Mesnil, G. 2014. A latent semantic model with convolutional-pooling structure for information retrieval. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, 101–110. ACM.

Shi, Y.; Larson, M.; and Hanjalic, A. 2014. Collaborative filtering beyond the user-item matrix: A survey of the state of the art and future challenges. ACM Computing Surveys (CSUR) 47(1):3.

Singh, A. P., and Gordon, G. J. 2008. Relational learning via collective matrix factorization. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 650–658. ACM.

Srebro, N.; Rennie, J.; and Jaakkola, T. S. 2004. Maximum-margin matrix factorization. In Advances in Neural Information Processing Systems, 1329–1336.

Su, X., and Khoshgoftaar, T. M. 2009. A survey of collaborative filtering techniques. Advances in Artificial Intelligence 2009:4.

Van den Oord, A.; Dieleman, S.; and Schrauwen, B. 2013. Deep content-based music recommendation. In Advances in Neural Information Processing Systems, 2643–2651.

Vincent, P.; Larochelle, H.; Bengio, Y.; and Manzagol, P.-A. 2008. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning, 1096–1103. ACM.

Vincent, P.; Larochelle, H.; Lajoie, I.; Bengio, Y.; and Manzagol, P.-A. 2010. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research 11(Dec):3371–3408.

Wang, C., and Blei, D. M. 2011. Collaborative topic modeling for recommending scientific articles. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 448–456. ACM.

Wang, X., and Wang, Y. 2014. Improving content-based and hybrid music recommendation using deep learning. In Proceedings of the 22nd ACM International Conference on Multimedia, 627–636. ACM.

Wang, H.; Shi, X.; and Yeung, D.-Y. 2015. Relational stacked denoising autoencoder for tag recommendation. In AAAI, 3052–3058.

Wang, H.; Wang, N.; and Yeung, D.-Y. 2015. Collaborative deep learning for recommender systems. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1235–1244. ACM.
