Deep Learning Architecture for Collaborative Filtering Recommender Systems

Applied Sciences, Article

Jesus Bobadilla *, Santiago Alonso and Antonio Hernando

Dpto. Sistemas Informáticos, Escuela Técnica Superior de Ingeniería de Sistemas Informáticos, Universidad Politécnica de Madrid, 28031 Madrid, Spain; [email protected] (S.A.); [email protected] (A.H.)
* Correspondence: [email protected]

Received: 17 March 2020; Accepted: 30 March 2020; Published: 3 April 2020

Appl. Sci. 2020, 10, 2441; doi:10.3390/app10072441; www.mdpi.com/journal/applsci

Abstract: This paper provides an innovative deep learning architecture to improve collaborative filtering results in recommender systems. It exploits the potential of the reliability concept to raise the quality of predictions and recommendations by incorporating prediction errors (reliabilities) in the deep learning layers. The underlying idea is to recommend highly predicted items that have also been found to be reliable. We use the deep learning architecture to extract the existing non-linear relations between predictions, reliabilities, and accurate recommendations. The proposed architecture consists of three related stages, providing three stacked abstraction levels: (a) real prediction errors, (b) predicted errors (reliabilities), and (c) predicted ratings (predictions). In turn, each abstraction level requires a learning process: (a) Matrix Factorization from ratings, (b) Multilayer Neural Network fed with real prediction errors and hidden factors, and (c) Multilayer Neural Network fed with reliabilities and hidden factors. A complete set of experiments has been run involving three representative and open datasets and a state-of-the-art baseline. The results show strong prediction improvements and also important recommendation improvements, particularly for the recall quality measure.

Keywords: collaborative filtering; reliabilities; deep learning; recommender systems; matrix factorization

1. Introduction

Recommender Systems (RS) are powerful tools to address the information overload problem on the Internet. They make use of diverse sources of information. Explicit votes from users to items, and implicit interactions, are the basis of Collaborative Filtering (CF) RS. Examples of implicit interactions are clicks, listened songs, watched movies, likes, dislikes, etc. Often, hybrid RS [1] are designed to ensemble CF with some other types of information filtering: demographic [2], context-aware [3], content-based [4], social information [5], etc. RS cover a wide range of recommendation targets, such as travels [6], movies [7], restaurants [8], fashion [9], news [10], etc.

Traditionally, CF RS have been implemented using machine learning methods and algorithms, mainly the memory-based K nearest neighbor algorithm, and more recently the Matrix Factorization (MF) method. MF is an efficient model-based approach that obtains accurate recommendations; it is easy to understand and to implement, and it is based on dimension reduction, where a reduced set of hidden factors contains the ratings information. Predictions are obtained by making the dot product of the user and the item hidden factors. There are several implementations of MF, such as PMF [11] and BNMF. An important drawback of the MF predictions is that the linear dot product cannot catch the complex non-linear relations existing among the set of hidden factors. Neural models do not have this restriction and they are called to lead the CF research in RS.

Currently, RS research is directed toward deep learning approaches [12,13]. It is usual to find neural content-based approaches: images [8,9], text [10,14], music [15,16], videos [17,18], etc. The above targets are processed by using different neural network architectures: Convolutional Neural Networks [19], Recurrent Neural Networks [20], Multilayer Neural Networks (MNN) [3,21], and autoencoders [7,22] are usual approaches. The preceding publications are focused on content-based or hybrid filtering. CF neural approaches can be classified as: (1) Deep Factorization Machines (deepFM) [23] and (2) Neural Collaborative Filtering (NCF) [24]. NCF simultaneously processes item and user information using a dual neural network. The two main NCF implementations are (a) NCF [24] and (b) Neural Network Matrix Factorization (NNMF) [25]. Since NCF reports better accuracy than NNMF, we use it as our baseline. Figure 1 shows the NCF architecture: the input sparse data (ratings of the users and ratings of the items) is converted into dense information by means of an embedding layer. Then, a Multilayer Perceptron combines both feature paths by concatenating them. Training is performed using the known ratings as labels. Once the architecture has learned, it can predict unknown ratings in a non-linear process. Our proposed architecture makes use of the NCF one, expanding it to hold training using known errors and then predicting reliabilities. We will refer to the proposed architecture as Reliability-based Neural Collaborative Filtering (RNCF).

Figure 1. Neural Collaborative Filtering (NCF) architecture.

There are some publications that combine wide and deep learning [26,27] to support CF RS. The wide side is implemented using a perceptron, whereas the deep side is implemented using an MNN. The wide component learns simple relations (memorization); the deep component learns complex relations (generalization). The proposed architecture (RNCF) focuses on the deep learning approach to catch complex relations both to predict reliabilities and to predict rating values. The wide learning cannot perform these functions, but it could slightly improve the overall results. A potential future work is to expand RNCF to hold a wide learning component. The deepFM architecture [28] joins a deep and a wide component to process the raw input vectors. It simultaneously learns low and high-order feature interactions. Compared to NCF, deepFM achieves a slight accuracy improvement (0.33% on the Criteo dataset, 0.48% on the Company dataset). The proposed approach (RNCF) fits better with the NCF architecture than with the deepFM one. Additionally, the wide architectures have the size of the input layer as a drawback that jeopardizes their scalability. Given the limited room for improvement using the deepFM model, we have chosen the more regular and scalable NCF architecture as a base to create our model. There are some other approaches that provide neural networks based on improved information coming from data, such as [29], where authors extract interaction behavior from bipartite graphs.
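
To make the contrast drawn in this section concrete, the following minimal sketch compares an MF prediction (a dot product of user and item hidden factors) with an NCF-style prediction (a Multilayer Perceptron applied to the concatenated embeddings). It is an illustration under stated assumptions, not the authors' implementation: the sizes, the single hidden layer, the ReLU activation, the random untrained weights, and the names `mf_predict` and `ncf_predict` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
num_users, num_items, num_factors = 5, 7, 4

# Hidden factors (embeddings). They are random here; in practice they
# would be learned from the known ratings.
user_factors = rng.normal(size=(num_users, num_factors))
item_factors = rng.normal(size=(num_items, num_factors))


def mf_predict(u, i):
    """MF prediction: dot product of the user and item hidden factors."""
    return float(user_factors[u] @ item_factors[i])


# Hypothetical, untrained MLP weights: one ReLU hidden layer followed by
# a linear output unit. A real model would learn these from the ratings.
hidden_units = 8
w1 = rng.normal(size=(2 * num_factors, hidden_units))
b1 = np.zeros(hidden_units)
w2 = rng.normal(size=hidden_units)
b2 = 0.0


def ncf_predict(u, i):
    """NCF-style prediction: concatenate both embeddings, then apply an MLP."""
    x = np.concatenate([user_factors[u], item_factors[i]])
    h = np.maximum(0.0, x @ w1 + b1)  # ReLU introduces the non-linearity
    return float(h @ w2 + b2)


if __name__ == "__main__":
    print("MF prediction :", mf_predict(0, 1))
    print("NCF prediction:", ncf_predict(0, 1))
```

With trained weights, `ncf_predict` can represent interactions between hidden factors that the fixed dot product in `mf_predict` cannot, which is the motivation the text gives for building on NCF.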
