The LFM-1B Dataset for Music Retrieval and Recommendation

Total Page:16

File Type:pdf, Size:1020Kb

The LFM-1B Dataset for Music Retrieval and Recommendation The LFM-1b Dataset for Music Retrieval and Recommendation Markus Schedl Department of Computational Perception Johannes Kepler University Linz, Austria [email protected] ABSTRACT However, researchers interested in conducting experiments We present the LFM-1b dataset of more than one billion in music retrieval and recommendation on a large scale — in music listening events created by more than 120,000 users particular those working in academia — are facing the chal- of Last.fm. Each listening event is characterized by artist, lenge that they frequently have to acquire the data for their album, and track name, and further includes a timestamp. experiments themselves, which results in non-standardized collections, hindering reproducibility . While online music On the (anonymous) user level, basic demographics and a 1 2 3 selection of more elaborate user descriptors are included. platforms such as Spotify, Last.fm, or Soundcloud of- The dataset is foremost intended for benchmarking in mu- fer convenient API endpoints that provide access to their sic information retrieval and recommendation. To facili- databases, it takes a considerable amount of time to build tate experimentation in a straightforward manner, it also collections of substantial size necessary for large-scale eval- includes a precomputed user-item-playcount matrix. In ad- uation. The few publicly available datasets for these pur- dition, sample Python scripts showing how to load the data poses, the most well-known of which is probably the Million and perform efficient computations are provided. An imple- Song Dataset (MSD) [2], might represent an alternative, but mentation of a simple collaborative filtering recommender come with certain restrictions. For instance, while the MSD rounds off the code package. offers a great variety of pieces of information (among oth- We discuss in detail the LFM-1b dataset’s acquisition, ers, genre labels, tags, term weights of lyrics, song similar- availability, statistics, and content, and place it in the con- ity information, and aggregated playcount data), user- and text of existing datasets. We also showcase its usage in a listening-specific information is provided rather scarcely, on simple artist recommendation task, whose results are in- a high level, or in a summarized form only. tended to serve as baseline against which more elaborate Since we believe that listener-specific information is key techniques can be assessed. The two unique features of the to build personalized music retrieval systems, the focus and dataset in comparison to existing ones are (i) its substantial unique feature of the LFM-1b dataset presented here is de- size and (ii) a wide range of additional user descriptors that tailed information on the level of listeners and of listen- reflect their music taste and consumption behavior. ing events. For instance, the dataset provides user-specific scores about their music taste (among others, measures of mainstreaminess and inclination to listen to novel music). Keywords Next to this, the size of the dataset, which includes more Dataset, Analysis, Music Information Retrieval, Music Rec- than a billion listening events of approximately 120,000 users ommendation, Collaborative Filtering, Experimentation should be sufficient to perform experimentation on a large scale on real-world data. The paper is organized as follows. We first review re- 1. MOTIVATION lated datasets for music retrieval and recommendation (Sec- Research and development in music information retrieval tion 2). Subsequently, we outline the data acquisition pro- (MIR) and music recommender systems has seen a sharp cedure, present the dataset’s structure and content, pro- increase during the past few years, not least due to the pro- vide basic statistics and analyze them, and point to sample liferation of music streaming services [8, 5]. Having tens Python scripts that show how to access the components of of millions of music pieces available at the listeners’ finger- the dataset (Section 3). We further illustrate how to exploit tips requires novel retrieval, recommendation, and interac- the dataset for the use case of building a music recommender tion techniques for music. system that implements various recommendation algorithms (Section 4). We eventually round off the paper with a sum- Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed mary and discussion of possible extensions (Section 5). for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the 2. RELATED WORK author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission The need for user-aware and multimodal approaches to and/or a fee. Request permissions from [email protected]. music retrieval and recommendation has been acknowledged ICMR'16, June 06 - 09, 2016, New York, NY, USA 1https://developer.spotify.com/web-api c 2016 Copyright held by the owner/author(s). Publication rights licensed to ACM. 2 ISBN 978-1-4503-4359-6/16/06. $15.00 http://www.last.fm/api 3 DOI: http://dx.doi.org/10.1145/2911996.2912004 https://developers.soundcloud.com/docs/api/guide many times and is meanwhile widely accepted [9, 11, 14, 15]. users, only including artists they most frequently listened to. However, respective scientific work is still in its fledgling The other subset offers full listening data of nearly a thou- stage. One of the reasons for this is that involving users, sand users, where each listening event is annotated with a which is an obvious necessity to build user-aware approaches, timestamp, artist, and track name. Both subsets include is time-consuming and hardly feasible on a large scale — at gender, age, country, and date of registering at Last.fm, as least not in academia. As a consequence, datasets offering provided by their API. user-specific information are scarce. Other datasets that are related to LFM-1b to a smaller On the other hand, thanks to evaluation campaigns in the extent include the AotM-2011 dataset of playlists extracted fields of music information retrieval and music recommenda- from Art of the Mix9 and the MagnaTagATune10 dataset [7] tion, including the Music Information Retrieval Evaluation of user-generated tags and relative similarity judgments be- eXchange4 (MIREX) and the KDD Cup 2011 5 [4], the re- tween triples of tracks. search community has been given several datasets that can In comparison to the datasets most similar to the one be used for a wide range of MIR tasks, from tempo estima- proposed here — the MSD and Celma’s [3] — the LFM-1b tion to melody extraction to emotion classification. Most of dataset offers the following unique features: (i) substantially these datasets, however, are specific to a particular task, e.g., more listening events, i.e., over one billion, in comparison onset detection or genre classification. What is more, for to roughly 48 and 19 million, respectively, for MSD and content-based or audio-based approaches, the actual audio Celma’s [3]; (ii) exact timestamps of each listening event, can typically not be shared, because of restrictions imposed unlike MSD; (iii) demographic information about listeners by intellectual property rights. in an anonymous way, unlike MSD; and (iv) additional infor- Datasets that can be used to some extent for evaluat- mation describing the listeners’ music preferences and con- ing personalized approaches to music retrieval and recom- sumption behavior, unlike both MSD and Celma’s [3]. These mendation include the Yahoo! Music dataset [4], which cur- additional descriptors include temporal aspects of listening rently represents the largest available music recommenda- behavior as well as novelty and mainstreaminess scores as tion dataset, including more than 262 million ratings of more proposed in [12], among others. than 620 thousand music items created by more than one million users. The ratings cover a time range from 1999 to 3. THE LFM-1B DATASET 2010. However, the dataset is completely anonymized, i.e., not only users, but also items are unknown. The absence In the following, we outline the data acquisition procedure of any descriptive metadata and ignorance of music domain from Last.fm, describe in detail the dataset’s components, knowledge therefore restricts the usage of the dataset to rat- analyze basic statistical properties of the dataset, provide ing prediction and collaborative filtering [13]. download links, and refer to some sample code in Python, The Million Song Dataset 6 (MSD) [2] is perhaps one of which is also available for download. Please note that the LFM-1b dataset is considered derivative work according to the most widely used datasets in MIR research. It offers a 11 wealth of information, among others, audio content descrip- paragraph 4.1 of Last.fm’s API Terms of Service. tors such as tempo, key, or loudness estimates, editorial item metadata, user-generated tags, term vector representations 3.1 Data Acquisition We first use the overall 250 top tags12 to gather their top of lyrics, and playcount information. While the MSD pro- 13 vides a great amount of information about one million songs, artists using the Last.fm API. For these artists, we fetch it has also been criticized, foremost for its lack of audio mate- the top fans, which results in 465,000 active users. For a ran- domly chosen subset of 120,322 users, we then obtain their rial, the obscurity of the approaches used to extract content 14 descriptors, and the improvable integration of the different listening histories. For approximately 5,000 users, we cap parts of the dataset. The MSD Challenge7 [10] further in- the fetched listening histories at 20,000 listening events in creased the popularity of the dataset.
Recommended publications
  • The Use of Time Dimension in Recommender Systems for Learning
    The Use of Time Dimension in Recommender Systems for Learning Eduardo José de Borba1, Isabela Gasparini1 and Daniel Lichtnow2 1Graduate Program in Applied Computing (PPGCA), Department of Computer Science (DCC), Santa Catarina State University (UDESC), Paulo Malschitzki 200, Joinville, Brazil 2Polytechnic School, Federal University of Santa Maria (UFSM), Av. Roraima 1000, Santa Maria, Brazil Keywords: Recommender System, Context-aware, Time, Learning. Abstract: When the amount of learning objects is huge, especially in the e-learning context, users could suffer cognitive overload. That way, users cannot find useful items and might feel lost in the environment. Recommender systems are tools that suggest items to users that best match their interests and needs. However, traditional recommender systems are not enough for learning, because this domain needs more personalization for each user profile and context. For this purpose, this work investigates Time-Aware Recommender Systems (Context-aware Recommender Systems that uses time dimension) for learning. Based on a set of categories (defined in previous works) of how time is used in Recommender Systems regardless of their domain, scenarios were defined that help illustrate and explain how each category could be applied in learning domain. As a result, a Recommender System for learning is proposed. It combines Content-Based and Collaborative Filtering approaches in a Hybrid algorithm that considers time in Pre- Filtering and Post-Filtering phases. 1 INTRODUCTION potential to improve the quality of recommendation (Campos et al., 2014). Nowadays, there are distinct educational approaches, This work aims to identify how Time-aware RS e.g. online learning, blended learning, face-to-face (Context-aware RS that uses time context) can be learning, etc.
    [Show full text]
  • Infinite Setlist: Analyzing Pioneer DJ's Catalogue Streaming Partnerships
    Cybaris® Volume 12 Issue 1 Article 2 2021 Infinite Setlist: Analyzing Pioneer DJ’s Catalogue Streaming Partnerships with Beatport and SoundCloud Nicholas Rivera Follow this and additional works at: https://open.mitchellhamline.edu/cybaris Part of the Entertainment, Arts, and Sports Law Commons, and the Intellectual Property Law Commons Recommended Citation Rivera, Nicholas (2021) "Infinite Setlist: Analyzing Pioneer DJ’s Catalogue Streaming Partnerships with Beatport and SoundCloud," Cybaris®: Vol. 12 : Iss. 1 , Article 2. Available at: https://open.mitchellhamline.edu/cybaris/vol12/iss1/2 This Article is brought to you for free and open access by the Law Reviews and Journals at Mitchell Hamline Open Access. It has been accepted for inclusion in Cybaris® by an authorized administrator of Mitchell Hamline Open Access. For more information, please contact [email protected]. © Mitchell Hamline School of Law CYBARIS®, AN INTELLECTUAL PROPERTY LAW REVIEW INFINITE SETLIST: ANALYZING PIONEER DJ’S CATALOGUE STREAMING PARTNERSHIPS WITH BEATPORT AND SOUNDCLOUD Nicholas Rivera1 Table of Contents Introduction ................................................................................................................................... 36 The Story Thus Far ................................................................................................................... 38 The Rise of Streaming .............................................................................................................. 39 Brief History of DJing
    [Show full text]
  • A Grouplens Perspective
    From: AAAI Technical Report WS-98-08. Compilation copyright © 1998, AAAI (www.aaai.org). All rights reserved. RecommenderSystems: A GroupLensPerspective Joseph A. Konstan*t , John Riedl *t, AI Borchers,* and Jonathan L. Herlocker* *GroupLensResearch Project *Net Perceptions,Inc. Dept. of ComputerScience and Engineering 11200 West78th Street University of Minnesota Suite 300 Minneapolis, MN55455 Minneapolis, MN55344 http://www.cs.umn.edu/Research/GroupLens/ http://www.netperceptions.com/ ABSTRACT identifying sets of articles by keyworddoes not scale to a In this paper, wereview the history and research findings of situation in which there are thousands of articles that the GroupLensResearch project I and present the four broad contain any imaginable set of keywords. Taken together, research directions that we feel are most critical for these two weaknesses represented an opportunity for a new recommender systems. type of filtering, that would focus on finding which INTRODUCTION:A History of the GroupLensProject available articles matchhuman notions of quality and taste. The GroupLens Research project began at the Computer Such a system would be able to produce a list of articles Supported Cooperative Work (CSCW)Conference in 1992. that each user wouldlike, independentof their content. Oneof the keynote speakers at the conference lectured on a Wedecided to apply our ideas in the domain of Usenet his vision of an emerging information economy,in which news. Usenet screamsfor better information filtering, with most of the effort in the economywould revolve around hundreds of thousands of articles posted daily. Manyof the production, distribution, and consumptionof information, articles in each Usenet newsgroupare on the sametopic, so rather than physical goods and services.
    [Show full text]
  • Mobile Application Recommender System
    UPTEC IT 10 025 Examensarbete 30 hp December 2010 Mobile Application Recommender System Christoffer Davidsson Abstract Mobile Application Recommender System Christoffer Davidsson Teknisk- naturvetenskaplig fakultet UTH-enheten With the amount of mobile applications available increasing rapidly, users have to put a lot of effort into finding applications of interest. The purpose of this thesis is to Besöksadress: investigate how to aid users in the process of discovering new mobile applications by Ångströmlaboratoriet Lägerhyddsvägen 1 providing them with recommendations. A prototype system is then built as a Hus 4, Plan 0 proof-of-concept. Postadress: The work of the thesis is divided into three phases where the aim of the first phase is Box 536 751 21 Uppsala to study related work and related systems to identify promising concepts and features. During the second phase, a prototype system is designed and implemented. Telefon: The outcome and result of the first two phases is then evaluated and analyzed in the 018 – 471 30 03 third and final phase. Telefax: 018 – 471 30 00 The prototype system integrates and extends an existing recommender engine previously used to recommend media items. As a part of the system, an Android Hemsida: application is developed, which observes user actions and presents recommended http://www.teknat.uu.se/student applications to the user. In parallel to the development, the system was tested by a small group of users recruited among colleagues at Ericsson. The data generated during this test period is analyzed to show the usefulness of observed user actions over explicit ratings and the dependency on context for application usage.
    [Show full text]
  • A Systematic Review and Taxonomy of Explanations in Decision Support and Recommender Systems
    Noname manuscript No. (will be inserted by the editor) A Systematic Review and Taxonomy of Explanations in Decision Support and Recommender Systems Ingrid Nunes · Dietmar Jannach Received: date / Accepted: date Abstract With the recent advances in the field of artificial intelligence, an increasing number of decision-making tasks are delegated to software systems. A key requirement for the success and adoption of such systems is that users must trust system choices or even fully automated decisions. To achieve this, explanation facilities have been widely investigated as a means of establishing trust in these systems since the early years of expert systems. With today's increasingly sophisticated machine learning algorithms, new challenges in the context of explanations, accountability, and trust towards such systems con- stantly arise. In this work, we systematically review the literature on expla- nations in advice-giving systems. This is a family of systems that includes recommender systems, which is one of the most successful classes of advice- giving software in practice. We investigate the purposes of explanations as well as how they are generated, presented to users, and evaluated. As a result, we derive a novel comprehensive taxonomy of aspects to be considered when de- signing explanation facilities for current and future decision support systems. The taxonomy includes a variety of different facets, such as explanation objec- tive, responsiveness, content and presentation. Moreover, we identified several challenges that remain unaddressed so far, for example related to fine-grained issues associated with the presentation of explanations and how explanation facilities are evaluated. Keywords Explanation · Decision Support System · Recommender System · Expert System · Knowledge-based System · Systematic Review · Machine Learning · Trust · Artificial Intelligence I.
    [Show full text]
  • User Data Analytics and Recommender System for Discovery Engine
    User Data Analytics and Recommender System for Discovery Engine Yu Wang Master of Science Thesis Stockholm, Sweden 2013 TRITA-ICT-EX-2013: 88 User Data Analytics and Recommender System for Discovery Engine Yu Wang [email protected] Whaam AB, Stockholm, Sweden Royal Institute of Technology, Stockholm, Sweden June 11, 2013 Supervisor: Yi Fu, Whaam AB Examiner: Prof. Johan Montelius, KTH Abstract On social bookmarking website, besides saving, organizing and sharing web pages, users can also discovery new web pages by browsing other’s bookmarks. However, as more and more contents are added, it is hard for users to find interesting or related web pages or other users who share the same interests. In order to make bookmarks discoverable and build a discovery engine, sophisticated user data analytic methods and recommender system are needed. This thesis addresses the topic by designing and implementing a prototype of a recommender system for recommending users, links and linklists. Users and linklists recommendation is calculated by content- based method, which analyzes the tags cosine similarity. Links recommendation is calculated by linklist based collaborative filtering method. The recommender system contains offline and online subsystem. Offline subsystem calculates data statistics and provides recommendation candidates while online subsystem filters the candidates and returns to users. The experiments show that in social bookmark service like Whaam, tag based cosine similarity method improves the mean average precision by 45% compared to traditional collaborative method for user and linklist recommendation. For link recommendation, the linklist based collaborative filtering method increase the mean average precision by 39% compared to the user based collaborative filtering method.
    [Show full text]
  • The Spotify Paradox: How the Creation of a Compulsory License Scheme for Streaming On-Demand Music Platforms Can Save the Music Industry
    UCLA UCLA Entertainment Law Review Title The Spotify Paradox: How the Creation of a Compulsory License Scheme for Streaming On-Demand Music Platforms Can Save the Music Industry Permalink https://escholarship.org/uc/item/7n4322vm Journal UCLA Entertainment Law Review, 22(1) ISSN 1073-2896 Author Richardson, James H. Publication Date 2014 DOI 10.5070/LR8221025203 Peer reviewed eScholarship.org Powered by the California Digital Library University of California The Spotify Paradox: How the Creation of a Compulsory License Scheme for Streaming On-Demand Music Platforms Can Save the Music Industry James H. Richardson* I. INTRODUCTION �����������������������������������������������������������������������������������������������������46 II. ILLEGAL DOWNLOADING LOCALLY STORED MEDIA, AND THE RISE OF STREAMING MUSIC ����������������������������������������������������������������������������������������������������������������47 A. The Digitalization of Music, and the Rise of Locally Stored Content. ......47 B. The Road to Legitimacy: Digital Media in Light of A&M Records, Inc. ..48 C. Legitimacy in a Sea of Piracy: The iTunes Music Store. ...........................49 D. Streaming and the Future of Digital Music Service. ���������������������������������50 III. THE COPYRIGHT AND DIGITALIZATION �������������������������������������������������������������������51 A. Statutory Background ................................................................................51 B. Digital Performance Right in Sound Recordings Act ................................52
    [Show full text]
  • A Recommender System for Groups of Users
    PolyLens: A Recommender System for Groups of Users Mark O’Connor, Dan Cosley, Joseph A. Konstan, and John Riedl Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455 USA {oconnor;cosley;konstan;riedl}@cs.umn.edu Abstract. We present PolyLens, a new collaborative filtering recommender system designed to recommend items for groups of users, rather than for individuals. A group recommender is more appropriate and useful for domains in which several people participate in a single activity, as is often the case with movies and restaurants. We present an analysis of the primary design issues for group recommenders, including questions about the nature of groups, the rights of group members, social value functions for groups, and interfaces for displaying group recommendations. We then report on our PolyLens prototype and the lessons we learned from usage logs and surveys from a nine-month trial that included 819 users. We found that users not only valued group recommendations, but were willing to yield some privacy to get the benefits of group recommendations. Users valued an extension to the group recommender system that enabled them to invite non-members to participate, via email. Introduction Recommender systems (Resnick & Varian, 1997) help users faced with an overwhelming selection of items by identifying particular items that are likely to match each user’s tastes or preferences (Schafer et al., 1999). The most sophisticated systems learn each user’s tastes and provide personalized recommendations. Though several machine learning and personalization technologies can attempt to learn user preferences, automated collaborative filtering (Resnick et al., 1994; Shardanand & Maes, 1995) has become the preferred real-time technology for personal recommendations, in part because it leverages the experiences of an entire community of users to provide high quality recommendations without detailed models of either content or user tastes.
    [Show full text]
  • Licensing Rules Repertoire Definition
    CIS14-0091R40 Source language: English 30/06/2021 The most recent updates are marked in red Licensing Rules Repertoire Definition This document sets out the repertoire definitions claimed directly by European Licensors in Europe and in some cases outside Europe. the repertoires are defined per Licensor / territory / types of on-line exploitations and DSPs with starting dates when necessary. Differences with the last version of this document are written in red. The document is a snap shot of the current information available to the TOWGE and is updated on an ongoing basis. Please note that the repertoires are the repertoires applied by the Licensors and are not precedential nor can they bind other licensing entities. The applicable repertoire will always be the one set out in the respective representation agreement between Licensors. Towge best practices on repertoires update are: • to communicate an update preferably 3 months but no later than 1 month before the starting period of the repertoire in order to allow enough time for Towge to communicate a new "repertoire definition document" to both Licensors and Licensees, which will allow these to adapt their programs accordingly • to not re-process invoices that have already been generated by Licensors and processed by Licensees Date of Publication Repertoire Licensing Body Pan-European Repertoire Definition DSPs Use Type Start Date Notes End Date Contact licensing Apr-19 WCM Anglo-American ICE Warner Chappell Music Publishing repertoire (mechanical and CP rights) licensable under the PEDL arrangement where the 7 Digital Ltd all digital 01/01/2010 Steve.Meixner repertoire author/composer of the Musical Work (or part thereof as applicable) is non-society or a member of PRS, IMRO, ASCAP, BMI, Amazon Music Unlimited (Steve.Meixner@u SESAC, SOCAN, SAMRO or APRA.
    [Show full text]
  • Chapter: Music Recommender Systems
    Chapter 13 Music Recommender Systems Markus Schedl, Peter Knees, Brian McFee, Dmitry Bogdanov, and Marius Kaminskas 13.1 Introduction Boosted by the emergence of online music shops and music streaming services, digital music distribution has led to an ubiquitous availability of music. Music listeners, suddenly faced with an unprecedented scale of readily available content, can easily become overwhelmed. Music recommender systems, the topic of this chapter, provide guidance to users navigating large collections. Music items that can be recommended include artists, albums, songs, genres, and radio stations. In this chapter, we illustrate the unique characteristics of the music recommen- dation problem, as compared to other content domains, such as books or movies. To understand the differences, let us first consider the amount of time required for ausertoconsumeasinglemediaitem.Thereisobviouslyalargediscrepancyin consumption time between books (days or weeks), movies (one to a few hours), and asong(typicallyafewminutes).Consequently,thetimeittakesforausertoform opinions for music can be much shorter than in other domains, which contributes to the ephemeral, even disposable, nature of music. Similarly, in music, a single M. Schedl (!)•P.Knees Department of Computational Perception, Johannes Kepler University Linz, Linz, Austria e-mail: [email protected]; [email protected] B. McFee Center for Data Science, New York University, New York, NY, USA e-mail: [email protected] D. Bogdanov Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain e-mail: [email protected] M. Kaminskas Insight Centre for Data Analytics, University College Cork, Cork, Ireland e-mail: [email protected] ©SpringerScience+BusinessMediaNewYork2015 453 F. Ricci et al. (eds.), Recommender Systems Handbook, DOI 10.1007/978-1-4899-7637-6_13 454 M.
    [Show full text]
  • A Multitask Ranking System
    Recommending What Video to Watch Next: A Multitask Ranking System Zhe Zhao, Lichan Hong, Li Wei, Jilin Chen, Aniruddh Nath, Shawn Andrews, Aditee Kumthekar, Maheswaran Sathiamoorthy, Xinyang Yi, Ed Chi Google, Inc. {zhezhao,lichan,liwei,jilinc,aniruddhnath,shawnandrews,aditeek,nlogn,xinyang,edchi}@google.com ABSTRACT promising items. We present experiments and lessons learned from In this paper, we introduce a large scale multi-objective ranking building such a ranking system on a large-scale industrial video system for recommending what video to watch next on an indus- publishing and sharing platform. trial video sharing platform. The system faces many real-world Designing and developing a real-world large-scale video recom- challenges, including the presence of multiple competing ranking mendation system is full of challenges, including: objectives, as well as implicit selection biases in user feedback. To • There are often diferent and sometimes conficting objec- tackle these challenges, we explored a variety of soft-parameter tives which we want to optimize for. For example, we may sharing techniques such as Multi-gate Mixture-of-Experts so as to want to recommend videos that users rate highly and share efciently optimize for multiple ranking objectives. Additionally, with their friends, in addition to watching. we mitigated the selection biases by adopting a Wide & Deep frame- • There is often implicit bias in the system. For example, a user work. We demonstrated that our proposed techniques can lead to might have clicked and watched a video simply because it substantial improvements on recommendation quality on one of was being ranked high, not because it was the one that the the world’s largest video sharing platforms.
    [Show full text]
  • When Diversity Met Accuracy: a Story of Recommender Systems †
    proceedings Extended Abstract When Diversity Met Accuracy: A Story of Recommender Systems † Alfonso Landin * , Eva Suárez-García and Daniel Valcarce Department of Computer Science, University of A Coruña, 15071 A Coruña, Spain; [email protected] (E.S.-G.); [email protected] (D.V.) * Correspondence: [email protected]; Tel.: +34-881-01-1276 † Presented at the XoveTIC Congress, A Coruña, Spain, 27–28 September 2018. Published: 14 September 2018 Abstract: Diversity and accuracy are frequently considered as two irreconcilable goals in the field of Recommender Systems. In this paper, we study different approaches to recommendation, based on collaborative filtering, which intend to improve both sides of this trade-off. We performed a battery of experiments measuring precision, diversity and novelty on different algorithms. We show that some of these approaches are able to improve the results in all the metrics with respect to classical collaborative filtering algorithms, proving to be both more accurate and more diverse. Moreover, we show how some of these techniques can be tuned easily to favour one side of this trade-off over the other, based on user desires or business objectives, by simply adjusting some of their parameters. Keywords: recommender systems; collaborative filtering; diversity; novelty 1. Introduction Over the years the user experience with different services has shifted from a proactive approach, where the user actively look for content, to one where the user is more passive and content is suggested to her by the service. This has been possible due to the advance in the field of recommender systems (RS), making it possible to make better suggestions to the users, personalized to their preferences.
    [Show full text]