A Bayesian Hierarchical Model for Personalized Playlist Generation
Total Page:16
File Type:pdf, Size:1020Kb
Groove Radio: A Bayesian Hierarchical Model for Personalized Playlist Generation Shay Ben-Elazar y∗ Gal Lavee y∗ Noam Koenigstein y∗ Oren Barkany Hilik Bereziny Ulrich Paquet yz Tal Zaccaiy y Microsoft R&D, Israel z Microsoft Research, UK {shaybe, galla, noamko, orenb, hilikbe, ulripa, talzacc}@microsoft.com ABSTRACT Central to the approach taken in this research is the idea This paper describes an algorithm designed for Microsoft's of estimating the relatedness of a \candidate" track to the Groove music service, which serves millions of users world previously selected tracks already in the playlist. Such relat- wide. We consider the problem of automatically generating edness can depend on multiple information sources, such as personalized music playlists based on queries containing a meta-data and domain semantics, the acoustic audio signal, \seed" artist and the listener's user ID. Playlist generation and popularity patterns, as well as information extracted may be informed by a number of information sources in- using collaborative filtering techniques. cluding: user specific listening patterns, domain knowledge The multiplicity of useful signals has motivated recent encoded in a taxonomy, acoustic features of audio tracks, works [10, 24] to combine several information sources in or- and overall popularity of tracks and artists. The importance der to model playlists. While it is known that all these fac- assigned to each of these information sources may vary de- tors play a role in the construction of a quality playlist [5], a pending on the specific combination of user and seed artist. key distinction of Groove's model is the idea that the impor- The paper presents a method based on a variational Bayes tance of each of these information sources varies depending solution for learning the parameters of a model containing a on the specific seed artist. For example, when composing four-level hierarchy of global preferences, genres, sub-genres a playlist for a Jazz artist such as John Coltrane, the im- and artists. The proposed model further incorporates a per- portance of acoustic similarity features may be high. In sonalization component for user-specific preferences. Em- contrast, in the case of a Contemporary Pop artist, such as pirical evaluations on both proprietary and public datasets Lady Gaga, features based on collaborative filtering may be demonstrate the effectiveness of the algorithm and showcase more important. In Section 5, evaluations are provided in the contribution of each of its components. support of this assumption. Another key element of the model in this paper is per- sonalization based on the user. For example, a particular 1. INTRODUCTION user may prefer strictly popular tracks, while another prefers Online music services such as Spotify, Pandora, Google more esoteric content. Our method provides for modeling Play Music and Microsoft's Groove serve as a major growth user specific preferences via a personalization component. engine for today's music industry. A key experience is the The model in this paper is designed to support any artist ability to stream automatically generated playlists based in Groove's catalog, far beyond the small number of artists on some criteria chosen by the user. This paper consid- that dominate the lion's share of user listening patterns. ers the problem of automatically generating personalized The distribution of artists in the catalog contains a long music playlists based on queries containing a \seed" artist tail of less popular artists for which insufficient informa- and the listener's user ID . We describe a solution designed tion exists to learn artist specific parameters. Therefore, the for Microsoft's Groove internet radio service, which serves Groove model also leverages the hierarchical music domain playlists to millions of users world wide. taxonomy of genres, sub-genres and artists. When particu- In recommender systems research, collaborative filtering lar artists or possibly even sub-genres are underrepresented approaches such as matrix factorization are often used to in the data, the Groove model can still allow prediction by learn relations between users and items [18]. The playlist \borrowing"information from sibling nodes sharing the same generation task is fundamentally different as it requires learn- parent in the hierarchical domain taxonomy. ing a coherent sequence to allow smooth track transitions. The contributions of this work are enumerated below. The first contribution is a novel model specification combining ∗Corresponding Author several properties considered advantageous for playlist gen- Permission to make digital or hard copies of all or part of this work for personal or eration: (i) a hierarchical encapsulation of the music domain classroom use is granted without fee provided that copies are not made or distributed taxonomy, (ii) flexibility to weight a number of similarity for profit or commercial advantage and that copies bear this notice and the full cita- tion on the first page. Copyrights for components of this work owned by others than types (audio, meta-data, usage, popularity), and (iii) a per- ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or re- sonalization component to model per-user preferences. The publish, to post on servers or to redistribute to lists, requires prior specific permission second major contribution is an efficient variational Bayes and/or a fee. Request permissions from [email protected]. algorithm to learn the parameters of the model from data. WSDM 2017, February 06-10, 2017, Cambridge, United Kingdom The final contribution of this paper is in describing a playlist c 2017 ACM. ISBN 978-1-4503-4675-7/17/02. $15.00 generation approach that serves the basis for a currently de- DOI: http://dx.doi.org/10.1145/3018661.3018718 ployed Groove radio algorithm. While some changes do exist (global, genre, sub-genre and artist), considers multiple simi- between this paper and the production system, this work is larity types (audio, meta-data, usage), and incorporates per- the only published description of the methods underlying sonalization. Section 5 gives an extensive experimental eval- such a large-scale commercial music service. uation, showing the importance of each of these parts to the Our paper is organized as follows: Section 2 discusses rel- quality of the playlist. evant related work. Section 3 motivates our approach, dis- cusses the specification of our model and explains the algo- rithm we apply to learn the parameters from data. Section 3. MODELING CONTEXTUAL PLAYLISTS 4 describes the features we use to encode playlist context. The algorithm in this paper is designed to generate per- Section 5 gives the details of our experimental evaluation. sonalized playlists in the context of a seed artist and a spe- Section 6 describes additional modifications that allow the cific user. In Groove music, a playlist request can be called proposed approach to work in large-scale scenarios. Finally, by each subscriber using any artist in the catalog as a seed. section 7 summarizes the paper. An artist seed is a common scenario in many alternative online music services, but the algorithm in this paper can 2. RELATED WORK easily be extended to support also track seeds, genre seeds, Automatic playlist generation is an active research prob- seeds based on labels as well as various combinations of the lem for which many formulations, approaches and evaluation above (multi-seed). In this paper, we limit the discussion to frameworks have been proposed. The problem has been var- the case of a single artist seed. iously formalized as: constraint satisfaction [28], the trav- As explained earlier, constructing a playlist depends on eling salesman [17], and clustering in a metric space [27]. multiple similarities used for estimating the relatedness of a Other works [13, 15, 33] opt for a recommendation oriented new candidate track to the playlist. These similarities are approach i.e., identifying the best tracks in the catalog given based on meta-data, usage, acoustic features, and popular- the user and some context. To our knowledge, this paper is ity patterns. However, the importance of each feature may among the first publications to describe a playlist algorithm be different for each seed. In the presence of a user with powering a large scale commercial music service. For more historical listening data, the quality of the playlist may be background on playlist generation, we refer the reader to the further improved with personalization. A key contribution in-depth survey and discussion by Bonnin and Jannach [4]. of the model in this paper is the ability to learn different Gillhofer and Schedl [11] empirically illustrate the im- importance weights for each combination of seed artist and portance of personalization in selecting appropriate tracks. listening user. Personalization via latent space representation of users is a An additional contribution of the proposed model is the mainstay of classical recommendation systems [18]. Ferw- ability to support completely new or \cold" (i.e. sparsely erda and Schedl [10] proposed modeling additional user fea- represented in the training data) combinations of users and tures encoding personality and emotional state to enhance artists. If there is insufficient information on the user, the music recommendations. proposed model performs a graceful fall-back, relying only Many works discuss computing similarity metrics between on parameters related to the seed artist. In the case of an musical tracks. Collaborative Filtering (CF) or usage fea- unknown or \cold" artist, the proposed model uses the hi- tures relate tracks consumed by similar users [1, 8, 22]. erarchical domain taxonomy, relying on the parameters at Meta-Data features, relating tracks with similar semantic the sub-genre level. This property applies also to underrep- attributes, are explored in [3, 23, 27, 31]. Finally, acoustic resented sub-genres and even genres, afforded by relying on audio features, relating tracks based on their audio signal, correspondingly higher levels in the domain taxonomy.