An Item–Item Collaborative Filtering Recommender System Using Trust and Genre to Address the Cold-Start Problem

An Item–Item Collaborative Filtering Recommender System Using Trust and Genre to Address the Cold-Start Problem

big data and cognitive computing Article An Item–Item Collaborative Filtering Recommender System Using Trust and Genre to Address the Cold-Start Problem Mahamudul Hasan 1,*,† and Falguni Roy 2,† 1 Department of Computer Science and Engineering, East West University, Dhaka 1212, Bangladesh 2 Institute of Information Technology, Noakhali Science and Technology University, Sonapur 3814, Noakhali, Bangladesh * Correspondence: [email protected]; Tel.: +88-0173-432-5208 † These authors contributed equally to this work. Received: 10 June 2019; Accepted: 2 July 2019; Published: 8 July 2019 Abstract: Item-based collaborative filtering is one of the most popular techniques in the recommender system to retrieve useful items for the users by finding the correlation among the items. Traditional item-based collaborative filtering works well when there exists sufficient rating data but cannot calculate similarity for new items, known as a cold-start problem. Usually, for the lack of rating data, the identification of the similarity among the cold-start items is difficult. As a result, existing techniques fail to predict accurate recommendations for cold-start items which also affects the recommender system’s performance. In this paper, two item-based similarity measures have been designed to overcome this problem by incorporating items’ genre data. An item might be uniform to other items as they might belong to more than one common genre. Thus, one of the similarity measures is defined by determining the degree of direct asymmetric correlation between items by considering their association of common genres. However, the similarity is determined between a couple of items where one of the items could be cold-start and another could be any highly rated item. Thus, the proposed similarity measure is accounted for as asymmetric by taking consideration of the item’s rating data. Another similarity measure is defined as the relative interconnection between items based on transitive inference. In addition, an enhanced prediction algorithm has been proposed so that it can calculate a better prediction for the recommendation. The proposed approach has experimented with two popular datasets that is Movielens and MovieTweets. In addition, it is found that the proposed technique performs better in comparison with the traditional techniques in a collaborative filtering recommender system. The proposed approach improved prediction accuracy for Movielens and MovieTweets approximately in terms of 3.42% & 8.58% mean absolute error, 7.25% & 3.29% precision, 7.20% & 7.55% recall, 8.76% & 5.15% f-measure and 49.3% and 16.49% mean reciprocal rank, respectively. Keywords: item-based collaborative filtering; cold-start problem; genre; asymmetric correlation; transitive inference 1. Introduction For the easy access of the internet through different smart objects, the world wide web (WWW) has become a vast information pool. Nowadays, achieving required information from this information pool is an onerous job because of the complexity of searching required information is also high. To resolve this problem, the recommender system (RS) has been introduced in the World Wide Web and ease internet users by suggesting them with their required information, items or services [1]. The recommender system (RS) is broadly used in e-commerce and amusement based websites. Amazon, Big Data Cogn. Comput. 2019, 3, 39; doi:10.3390/bdcc3030039 www.mdpi.com/journal/bdcc Big Data Cogn. Comput. 2019, 3, 39 2 of 15 Facebook, Youtube, and IMDB are the real-life applications of recommender systems. Currently, e-learning and e-tourism based websites are also implementing the recommender system to aid users by gaining the required information smoothly from different websites [2,3]. Basically, a recommender system (RS) is a web information personalization system that is used to suggest a set of items or services from the vast items or services on the internet to the users by predicting their preferences [1]. Personal preference prediction is done by generating a rating prediction model on the basis of analyzing the user’s rating pattern of similar items. According to the analysis methodology of the user’s rating pattern, RS is classified as content-based recommending, collaborative filtering and hybrid recommender system [1], but collaborative filtering is more efficient than others and also easy to implement to discover complex patterns [4]. In collaborative filtering (CF), selection and recommendation of the items are done by defining the similarities between items or users’ tastes. Based on the similarity of the subject, CF can be categorized as either user or item-based CF. An item-based CF technique defines the similarity between items and then recommends a new item to the user on the basis of similar items characteristic [5]. User-based CF also does the same thing by defining the similarity of the preferences of users. Traditional item-based CF performs excellently for a dense dataset, but, in the case of the sparse dataset, it performs inversely and data sparsity also leads to a cold-start problem in the RS. Sometimes, in RS, there exists a high amount of new users or items, again, some old users or items that have a low quantity of rating information so that traditional RS cannot make any inferences. In RS, such users and items are depicted as cold-start users and cold-start items and cannot predict preferences accurately [6]. To improve the recommendation accuracy, some of the researchers have proposed some techniques to improve prediction quality of the simple item-based RS by using side data to address the cold-start problem [7–9] and some of the researchers have suggested integrating trust in the existing CF which will be defined by the degree of similarity between subjects named as implicit trust [10,11]. According to the trust theory, four main trust properties are utilized. They are asymmetry, transitivity, dynamicity, and context-dependence [12]. The trust value between two subjects should not be the same in the asymmetry property. If there exists a trust relationship between subject ‘a’ with subject ‘b’ and subject ‘a’ with subject ‘c’ then there could be a relative trust relationship between subject ‘a’ and subject ‘c’ in the transitive property of the trust. The trust relationship between two subjects is not constant and it might be changed over time and objects which are determined in the dynamicity and context-dependence of trust [12]. 1.1. Motivation Usually, the users’ behavior is a complicated process and is not constant to every user so it is too arduous to define uniformity between users by using a single correlation determination method. Some correlation determination method wouldn’t work well in all circumstances. Different methods work well in different scenarios. Moreover, the performance of traditional correlation or similarity methods is not up to the mark for the case of the sparse dataset and cannot predict recommendation for cold-start users or items. As most of the real-life applications deal with the sparse dataset so this problem should be handled efficiently. Thus, here, a new fast and efficient item-based CF method have been proposed with better prediction accuracy to overcome the existing issue of cold-start items. The novelty of the proposed method is, firstly, that the direct asymmetric similarity is calculated between items based on the genre of items on the basis of the asymmetric characteristic of the trust relationship. Secondly, the relative similar relation between items is calculated based on the transitive property of items’ trust values. Again, defining the similarity between items by using the item’s genre is a new approach in this domain, which is not addressed earlier. When a new item enters the system, it might belong to any genre. In addition, an item that is rated hardly must belong to at least one genre. As the information of genre is available previously, the similarity between the items can be defined by taking the help of genre which is the key to alleviating the problem of the cold-start item. However, the genre-based items similarity method may treat a new item as equally similar to the highly-rated Big Data Cogn. Comput. 2019, 3, 39 3 of 15 item of the system for their presence in more common genres. However, as one of the items is new and another one may be highly popular in the system and contains users’ rating information, so the degree of similarity between them should not be uniform. By taking account of the corresponding item’s rating data, the asymmetric similar relationship is incorporated in the genre-based items’ similarity. Furthermore, an enhancement of the prediction method has been proposed to increase the accuracy of the recommender system performances. 1.2. Paper Contribution The key purpose of this paper is to design a new item-based collaborative filtering model by calculating the similarity of items based on items’ genre and also improve the prediction accuracy of recommendation. The contribution of this paper is classified as fourfold and these are: A single asymmetric similarity method has been proposed by taking account of items’ genre with • the reliability between them which defines the direct asymmetric similar relation of items. Another new similarity method has been defined by identifying the correlations of items based • on the transitive relations of reliability between them. A prediction algorithm is proposed to increase the accuracy of recommendation. • A detailed experiment is done to prove our statement that the proposed methodology outperforms • existing methods. The organization of the paper is as follows: Section2 demonstrates an overview of the related works with the explanation of previous works’ issues and also the aim of the paper. Section3 describes the proposed approach to serve the aim of the paper by getting rid of the existing cold start problem. Section4 presents the list of datasets on which the proposed approach will be tested and the methods that are used to validate proposed approach’s performance.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    15 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us