Multi-Feature Discrete Collaborative Filtering for Fast Cold-Start Recommendation
The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

Yang Xu,1 Lei Zhu,1∗ Zhiyong Cheng,2 Jingjing Li,3 Jiande Sun1
1Shandong Normal University
2Shandong Computer Science Center (National Supercomputer Center in Jinan)
2Qilu University of Technology (Shandong Academy of Sciences)
3University of Electronic Science and Technology of China
[email protected]

∗The corresponding author
Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Hashing is an effective technique for addressing the large-scale recommendation problem, due to its high computation and storage efficiency in calculating user preferences on items. However, existing hashing-based recommendation methods still suffer from two important problems: 1) Their recommendation process mainly relies on the user-item interactions and a single specific content feature. When the interaction history or the content feature is unavailable (the cold-start problem), their performance is seriously degraded. 2) Existing methods learn the hash codes with relaxed optimization or adopt discrete coordinate descent to directly solve the binary hash codes, which results in significant quantization loss or consumes considerable computation time.

In this paper, we propose a fast cold-start recommendation method, called Multi-Feature Discrete Collaborative Filtering (MFDCF), to solve these problems. Specifically, a low-rank self-weighted multi-feature fusion module is designed to adaptively project the multiple content features into binary yet informative hash codes by fully exploiting their complementarity. Additionally, we develop a fast discrete optimization algorithm to directly compute the binary hash codes with simple operations. Experiments on two public recommendation datasets demonstrate that MFDCF outperforms the state-of-the-art methods on various aspects.

Introduction

With the development of online applications, recommender systems have been widely adopted by many online services to help their users find desirable items. However, it is still challenging to accurately and efficiently match items to their potential users, particularly with the ever-growing scales of items and users (Batmaz et al. 2019).

In the past, Collaborative Filtering (CF), as exemplified by Matrix Factorization (MF) algorithms (Koren, Bell, and Volinsky 2009), has demonstrated great success in both academia and industry. MF factorizes an n × m user-item rating matrix to project both users and items into an r-dimensional latent feature space, where a user's preference scores for items are predicted by the inner products between their latent features. However, the time complexity of generating a top-k item recommendation for all users is O(nmr + nm log k) (Zhang, Lian, and Yang 2017). Therefore, MF-based methods are often computationally expensive and inefficient when handling large-scale recommendation applications (Cheng et al. 2018; 2019).

Recent studies show that hashing-based recommendation algorithms, which encode both users and items into binary codes in Hamming space, are promising for tackling the efficiency challenge (Zhang et al. 2016; 2014). In these methods, the preference score can be efficiently computed by Hamming distance. However, learning binary codes is generally NP-hard (Håstad 2001) due to the discrete constraints. To tackle this problem, researchers resort to a two-stage hash learning procedure (Liu et al. 2014; Zhang et al. 2014): relaxed optimization and binary quantization. Continuous representations are first computed by the relaxed optimization, and the hash codes are subsequently generated by binary quantization. This learning strategy indeed simplifies the optimization challenge. However, it inevitably suffers from significant quantization loss according to (Zhang et al. 2016). Hence, several solutions have been developed to directly optimize the binary hash codes from matrix factorization with discrete constraints. Although much progress has been achieved, these methods still suffer from two problems: 1) Their recommendation process mainly relies on the user-item interactions and a single specific content feature. Under such circumstances, they cannot provide meaningful recommendations for new users (e.g., users who have no interaction history with the items). 2) They learn the hash codes with Discrete Coordinate Descent (DCD), which learns the hash codes bit by bit and thus results in significant quantization loss or consumes considerable computation time.

In this paper, we propose a fast cold-start recommendation method, called Multi-Feature Discrete Collaborative Filtering (MFDCF), to alleviate these problems. Specifically, we propose a low-rank self-weighted multi-feature fusion module to adaptively preserve the multiple content features of users in compact yet informative hash codes by sufficiently exploiting their complementarity. Our method is inspired by the success of multiple feature fusion in other relevant areas (Wang et al. 2018; 2012; Lu et al. 2019a; Zhu et al. 2017a). Further, we develop an efficient discrete optimization approach to directly solve the binary hash codes with simple, efficient operations and without quantization errors. Finally, we evaluate the proposed method on two public recommendation datasets and demonstrate its superior performance over state-of-the-art competing baselines.

The main contributions of this paper are summarized as follows:

• We propose a Multi-Feature Discrete Collaborative Filtering (MFDCF) method to alleviate the cold-start recommendation problem. MFDCF directly and adaptively projects the multiple content features of users into binary hash codes by sufficiently exploiting their complementarity. To the best of our knowledge, there is still no similar work.

• We develop an efficient discrete optimization strategy to directly learn the binary hash codes without relaxed quantization. This strategy avoids the performance penalties of the widely adopted discrete coordinate descent as well as the storage cost of a huge interaction matrix.

• We design a feature-adaptive hash code generation strategy to generate user hash codes that accurately capture the dynamic variations of cold-start user features. Experiments on the public recommendation datasets demonstrate the superior performance of the proposed method over the state-of-the-arts.

Related Work

In this paper, we investigate hashing-based collaborative filtering in the presence of multiple content features for fast cold-start recommendation. Hence, in this section, we mainly review recent advanced hashing-based recommendation and cold-start recommendation methods.

A pioneering work (Das et al. 2007) exploits Locality-Sensitive Hashing (LSH) (Gionis, Indyk, and Motwani 1999) to generate hash codes for Google News readers based on their item-sharing history similarity. Following this, (Karatzoglou, Smola, and Weimer 2010; Zhou and Zha 2012) adopted the idea of Iterative Quantization (Gong et al. 2013) to project real-valued latent representations into hash codes. To enhance the discriminative capability of hash codes, a de-correlation constraint (Liu et al. 2014) and a Constant Feature Norm (CFN) constraint (Zhang et al. 2014) are imposed. […] applies Deep Belief Network (DBN) to extract item representations from item content information, and combines the DBN with DCF. Discrete content-aware matrix factorization methods (Lian et al. 2017; Lian, Xie, and Chen 2019) develop discrete optimization algorithms to learn binary codes for users and items in the presence of their respective content information. Discrete Factorization Machines (DFM) (Liu et al. 2018) learns hash codes for any side feature and models the pair-wise interactions between feature codes. Besides, since the above binary cold-start recommendation frameworks solve the hash codes with bit-by-bit discrete optimization, they still consume considerable computation time.

The Proposed Method

Notations. Throughout this paper, we utilize bold lowercase letters to represent vectors and bold uppercase letters to represent matrices. All of the vectors in this paper denote column vectors. Non-bold letters represent scalars. We denote tr(·) as the trace of a matrix and ||·||_F as the Frobenius norm of a matrix. We denote sgn(·): R → {±1} as the round-off function.

Low-rank Self-weighted Multi-Feature Fusion

Given a training dataset O = {o_i}_{i=1}^n containing n users' multi-feature information represented with M different content features (e.g. demographic information such as age, gender, and occupation, and interaction preferences extracted from item side information). The m-th content feature is X^(m) = [x_1^(m), ..., x_n^(m)] ∈ R^{d_m×n}, where d_m is the dimensionality of the m-th content feature. Since a user's multiple content features are quite diverse and heterogeneous, in this paper we aim at adaptively mapping the multiple content features {X^(m)}_{m=1}^M into a consensus multi-feature representation H ∈ R^{r×n} (r is the hash code length) in a shared homogeneous space. Specifically, it is important to consider the complementarity of the multiple content features and the generalization ability of the fusion module. Motivated by these considerations, we introduce a self-weighted fusion strategy and formulate the multi-feature fusion part as:

min_{W^(m), H} Σ_{m=1}^M ||H − W^(m) X^(m)||_F    (1)
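To make the efficiency argument from the Introduction concrete, the Hamming-space preference scoring can be sketched as follows. This is a minimal NumPy/Python illustration, not code from the paper: the bit-packing scheme and the function names are our own, and it only demonstrates the identity that for r-bit ±1 codes the inner product equals r − 2 × (Hamming distance), so scoring reduces to an XOR and a popcount.

```python
import numpy as np

def pack_bits(codes):
    """Pack rows of {-1,+1} codes into Python ints (one int per row)."""
    bits = (codes > 0).astype(np.uint8)            # map {-1,+1} -> {0,1}
    return [int("".join(map(str, row)), 2) for row in bits]

def preference(u_int, i_int, r):
    """Preference via Hamming distance: <b_u, b_i> = r - 2 * dist."""
    dist = bin(u_int ^ i_int).count("1")           # popcount of the XOR
    return r - 2 * dist

r = 8
rng = np.random.default_rng(0)
user = np.where(rng.standard_normal(r) >= 0, 1, -1)
item = np.where(rng.standard_normal(r) >= 0, 1, -1)
u, i = pack_bits(np.stack([user, item]))
assert preference(u, i, r) == int(user @ item)     # matches the inner product
```

The XOR/popcount pair is why Hamming scoring avoids the O(r) floating-point inner product per user-item pair that makes MF-based top-k generation expensive.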
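The two-stage procedure criticized in the Introduction (relaxed optimization followed by binary quantization with the round-off function sgn(·)) can be illustrated with a short sketch. This is our own toy example, assuming the relaxed representations are simply rounded off; it shows where the quantization loss the paper refers to comes from.

```python
import numpy as np

rng = np.random.default_rng(1)
r, n = 16, 100
H = rng.standard_normal((r, n))        # relaxed (continuous) representations
B = np.sign(H)                         # binary quantization: sgn(.) -> {-1, +1}
loss = np.linalg.norm(B - H, "fro")    # quantization loss ||B - H||_F
```

Because B is constrained to ±1 while H is unconstrained, the residual ||B − H||_F is nonzero in general; MFDCF's direct discrete optimization is motivated by avoiding exactly this gap.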
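The fusion objective in Eq. (1) can be evaluated numerically as a sanity check. The sketch below is our own illustration with hypothetical dimensions (M = 2 features, r = 4 bits, n = 10 users); it only computes the sum over m of the unsquared Frobenius residuals ||H − W^(m) X^(m)||_F, not the paper's optimization algorithm.

```python
import numpy as np

def fusion_objective(H, Ws, Xs):
    """Eq. (1): sum over m of ||H - W^(m) X^(m)||_F (unsquared Frobenius norm)."""
    return sum(np.linalg.norm(H - W @ X, "fro") for W, X in zip(Ws, Xs))

rng = np.random.default_rng(2)
r, n, dims = 4, 10, [6, 8]                       # hypothetical d_1, d_2
Xs = [rng.standard_normal((d, n)) for d in dims]  # content features X^(m)
Ws = [rng.standard_normal((r, d)) for d in dims]  # projections W^(m)
H = rng.standard_normal((r, n))                   # consensus representation
val = fusion_objective(H, Ws, Xs)
```

If H coincides with one projected feature (H = W^(1) X^(1) and M = 1), the objective is zero, which matches the intuition that H should agree with every projected view.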