Empirical Analysis of Track Selection and Ordering in Electronic Dance Music Using Audio Feature Extraction
Total Page:16
File Type:pdf, Size:1020Kb
EMPIRICAL ANALYSIS OF TRACK SELECTION AND ORDERING IN ELECTRONIC DANCE MUSIC USING AUDIO FEATURE EXTRACTION Thor Kell George Tzanetakis IDMIL, CIRMMT, McGill University University of Victoria [email protected] [email protected] ABSTRACT similarity between tracks, in terms of features represent- ing timbre, key, tempo and loudness. The source data for Disc jockeys are in some ways the ultimate experts at this investigation is the British Broadcasting Corporation’s selecting and playing recorded music for an audience, es- Essential Mix radio program 1 . Broadcast since 1993, the pecially in the context of dance music. In this work, we Essential Mix showcases exceptional DJs of various genres empirically investigate factors affecting track selection and of electronic dance music (EDM), playing for one or two ordering using DJ-created mixes of electronic dance mu- hours. It is considered one of the most reputable and in- sic. We use automatic content-based analysis and discuss fluential radio programs in the world. By investigating the the implications of our findings to playlist generation and relationships between tracks in DJ sets we hope to better ordering. Timbre appears to be an important factor when understand track selection by DJs and inform the design of selecting tracks and ordering tracks, and track order itself algorithms and audio features for automatic playlisting. matters, as shown by statistically significant differences in The automatic estimation of music similarity between the transitions between the original order and a shuffled two tracks has been a primary focus of music information version. We also apply this analysis to ordering heuristics retrieval (MIR) research. [4] Several methods for comput- and suggest that the standard playlist generation model of ing music similarity have been proposed based on content- returning tracks in order of decreasing similarity to the ini- analysis, metadata (such as artist similarity, web reviews), tial track may not be optimal, at least in the context of track and usage information (such as ratings and download pat- ordering for electronic dance music. terns in peer to peer networks). [1] Music similarity is the basis of query-by-example which is a fundamental MIR 1. INTRODUCTION task, and also one of the first tasks explored in MIR litera- ture. In this paradigm the user submits a query consisting The invention of recording lead to the possibility of select- of one or more ‘seed’ pieces of music, sometimes also in- ing recorded music to entertain a group of people. The cluding metadata and user preferences. The system then idea of listening to records instead of listening to bands responds by returning a playlist of music pieces ranked took off after the second world war, when sound systems by their similarity to the query, and set in some order. In and record players began to appear in night clubs and cafes contrast, our approach is analytic. Rather than generating in New York, Jamaica, London, Paris, and beyond [5]. playlists, we investigate existing DJ sets through audio fea- Since then, the disk jockey (DJ) has evolved from a ture extraction and examine the transitions between tracks simple selector and orderer of music into a sophisticated in terms of audio features representing timbre, loudness, performer with considerable skill and training. Although tempo, and key. We also compare the results of our empir- these performance aspects are compelling, the primary fo- ical investigation to common ordering methods, and offer cus of this paper is the basic selection and ordering of mu- some suggestions for improving current playlisting heuris- sic. DJs generally bring a limited amount of their music tics. collection to any given gig, and play a reasonably large subset of it. Two important questions to consider are ‘What 2. RELATED WORK tracks go into a playlist?’, and ‘What is the best ordering of these tracks?’. Track ordering is not a well understood Early MIR work investigating the automatic calculation of process, even by DJs. Many DJs will say only that two music similarity and how to evaluate different approaches tracks ‘work’ or ‘do not work’ together, and not be able formulated a general methodology that is followed by the to comment further. [5] We investigate this selection and majority of existing work to this day. In this methodol- ordering process in terms of the automatically computed ogy, the primary goal is assessing the relative performance of different algorithms for computing music similarity by somehow evaluating the ’quality’ of the generated playlists. Permission to make digital or hard copies of all or part of this work for The most common approach of generating a playlist is personal or classroom use is granted without fee provided that copies are to consider the N closest neighbors in terms of automat- not made or distributed for profit or commercial advantage and that copies ically calculated similarity to a particular query. Several bear this notice and the full citation on the first page. c 2012 International Society for Music Information Retrieval. 1 http://www.bbc.co.uk/programmes/b006wkfp automatic playlists are generated from seed queries repre- of playlist generation by tracking user listening informa- senting the desired diversity of the music considered. This tion has also been explored [9]. A different approach alto- set of automatically generated playlists is then evaluated, gether is to create playlists visually, based on some graph- typically using one or both of two approaches: objective ical representation of the music collection [18]. In all of evaluation using proxy ground truth for relevance, or sub- this literature, different approaches to playlist generation jective evaluation through user studies. The basic idea is are also evaluated with a combination of objective mea- to evaluate a playlist by considering it good if it contains a sures and user studies, comparing different configurations high number of ‘relevant items’ to the query [1]. The rele- to a random or a simple algorithmic baseline. For example, vance ground truth can be provided by users in subjective a recent study compared two recommender systems (based evaluation but this is a time consuming and labor intensive on artist similarity and acoustic content) with the Apple process that does not scale well. iTunes Genius recommender which is believed to be based Objective evaluation has the advantage that it can scale on collaborative filtering [3]. to any number of queries and playlists, as long as the tracks have some associated meta-data that can be used as a proxy 3. MOTIVATION AND PROBLEM for relevance. Common examples of such proxy sources FORMULATION include artist, genre, and song [1,10]. In addition to music The motivation behind our work is to investigate the pro- similarity calculated based on audio content analysis, other cess of playlist/mix creation by analyzing existing mixes sources of information such as web reviews, download pat- created by experts - in this case, DJs. Existing work has terns, ratings and explicit editorial artist similarity [7] can mostly focused on more general playlists created by av- also be used for estimating music similarity [6]. Playlists erage listeners. Rather than relying on user surveys, we themselves have also been used to calculate artist and track focus on empirical analysis based on audio feature extrac- similarity based on co-occurrence. Sources of playlist data tion. This allows us to investigate what audio attributes DJs include the Art of the Mix, a website that contains a large use when selecting and order their tracks. We further com- number of hobbyist playlists [4], and listings of radio sta- pare these attributes to collections of random EDM tracks, tions [12]. DJ sets remain an untapped resource, however. and to artist albums. We specifically investigate whether The most common relevance-based evaluation measures track order matters. We also examine important assump- (such as precision, recall and F-measure [2]) are borrowed tions that are frequently made by automatic playlist gen- from text information retrieval and only consider the items eration systems. Specifically, we investigate if ordering contained in the set of returned results, without taking into based on similarity ranking is a good choice, and if so, in account their order. The paradigm of a single seed query what manner. song creating a list of N items ranked by similarity has re- In existing literature these assumptions are typically man- mained a common approach to automatic playlisting and ifested in the design of an automatic playlist algorithm, music recommendation. Some notable exceptions in terms and the results are evaluated through objective or subjec- of ordering include: heuristics about trajectory for the or- tive approaches. Issues such as the sameness problem in dering of returned items [10], using song sets instead of playlists formed from collections of music that do not have single seeds [11], ordering based on the traveling sales- stylistic diversity, or the playlist drift problem in large di- man problem [16], and considering both a start track and verse collections are also discussed but are not empirically an end track for the playlist [8]. The assumption of simi- supported [8]. In contrast, our approach is complimentary larity has also been challenged by the finding that in many and attempts to test these assumptions directly on existing cases users prefer diverse playlists [17] as measured by au- mixes. Our methodology can be also viewed as an em- tomatic feature analysis. This is the closest work in terms pirical musicological approach to understanding how DJs of approach to the work described in this paper. select and order music. Another theme of more recent work has been providing more user control to the process of automatic playlist gen- 4. METHODOLOGY eration. If the tracks considered are associated with a rich set of attribute/value pairs then techniques from constraint 4.1 Data satisfaction programming and inductive learning can be We obtained 114 Essential Mix DJ sets from the archival used to generate playlists that to some extent optimally sat- website themixingbowl.org (DJS).