A Bayesian Approach to Understanding Music Popularity Heather Shapiro Advisor: Merlise Clyde Department of Statistical Science, Duke University
[email protected];
[email protected] Abstract The Billboard Hot 100 has been the main record chart for popular music in the American music industry since its first official release in 1958. Today, this rank- ing is based upon the frequency of which a song is played on the radio, streamed online, and its sales. Over time, however, the limitations of the chart have become more pronounced and record labels have tried different strategies to maximize a song’s potential on the chart in order to increase sales and success. This paper intends to analyze metadata and audio analysis features from a random sample of one million popular tracks, dating back to 1922, and assess their potential on the Billboard Hot 100 list. We compare the results of Bayesian Additive Regression Trees (BART) to other decision tree methods for predictive accuracy. Through the use of such trees, we can determine the interaction and importance of differ- ent variables over time and their effects on a single’s success on the Billboard chart. With such knowledge, we can assess and identify past music trends, and provide producers with the steps to create the ‘perfect’ commercially successful song, ultimately removing the creative artistry from music making. Keywords: Bayesian, BART, popular music, American music industry, regression trees, sentiment analysis. 1 1 Introduction Popular music is typically created to be mass distributed to a broad range of music listeners. This music is often stored and distributed in non-written form and can only be successful as a commodity in an ‘industrial monetary economy’.1 The goal of popular music is to sell as much music as possible to a large quantity of people at little cost.