Estimating song composition year using chord features

Alastair Porter

May 2, 2011

1 Introduction

This report presents a method of estimating the year that a song was written using symbolic information about the song. The method uses the chord progressions of 1013 popular songs released between the late 1950s and the 1990s. Several different methods of extracting features from the chord progressions were compared.

2 Background

As far as can be determined, no prior work has been done on estimating the year that a song was written using symbolic data. One reason for this is the relative scarcity of symbolic datasets that contain a significant number of data points. For example, Harte's Beatles dataset (Harte, Sandler, Abdallah, and Gómez 2005) contains transcriptions for fewer than 200 songs, all written in a relatively small time period, and by the same band. This experiment used data from the Billboard dataset (The McGill Billboard Project 2011). This dataset contains manually transcribed chord progressions and keys for over 1000 popular songs taken from the Billboard Hot 100 chart over four decades. Also included is the year that the song appeared on the chart.

3 Method

A section of the Billboard dataset is presented in Figure 1. In this transcription, the number at the front of each line indicates the time in seconds at which the chords on that line start. Any labels at the beginning of the line before the first | symbol indicate broad sections, such as verses and choruses. Bars are delimited by the | symbol. If a sequence of bars repeats exactly, it can be represented by a repeat marker at the end of the line (e.g. x4).

90.755759637 C, chorus, | Em D/F# | G D/F# | x2
98.996371882 | Em D/F# | G E/G# | A F#/A# | B |
107.228140589 D, bridge, | C#m | F#m | C#m | F#m | x2

Figure 1: Sample of transcription of Jane by Jefferson Starship

E:min, D:maj/F#, G:maj, D:maj/F#, E:min, D:maj/F#, G:maj, D:maj/F#, E:min, D:maj/F# ...etc

Figure 2: The expanded list of chords from Figure 1

E G B, F# D F# A, G B D, F# D F# A, E G B, F# D F# A, G B D, F# D F# A, E G B ... etc

Figure 3: The notes in each chord from Figure 2, with the roots at the beginning of each list

A suite of software tools was written to read these transcribed files and convert them into a machine-readable list of chords, using the representation format described by Harte (2005) in order to represent all songs consistently (Figure 2). Section information such as verses and choruses was discarded. Information about the start and end of each chord was preserved by splitting bars up evenly between the start time of one line and the start time of the next. The different forms used by different transcribers to represent chord qualities were converted into a standard format. For example, m, mi, min, and the - symbol were all used to describe a minor chord; these were all changed into the single representation "min". The notes that make up each chord were also extracted, ensuring that the notation in the original transcription was preserved (Figure 3). This representation was used for the features based on the roots of chords. In total, 1013 transcriptions were generated, each including the list of chords, the key of the song, and the year the song was on the Billboard charts.

Six different sets of features were extracted from the chord files. They were:
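The quality-normalisation step described above can be sketched as follows. The alias table is an assumption beyond the minor-chord spellings (m, mi, min, -) that the report names explicitly, and the function names are illustrative rather than taken from the actual tools.

```python
# Sketch of the chord-quality normalisation step. Only the minor-chord
# aliases are stated in the report; the other entries are assumptions.
QUALITY_ALIASES = {
    "m": "min",
    "mi": "min",
    "min": "min",
    "-": "min",
    "maj": "maj",
    "": "maj",  # assume a bare root letter denotes a major triad
}

def normalise_chord(symbol):
    """Convert a transcription symbol like 'Em' or 'D/F#' into
    Harte-style root:quality(/bass) form, e.g. 'E:min', 'D:maj/F#'."""
    if "/" in symbol:
        symbol, bass = symbol.split("/", 1)
    else:
        bass = None
    root, rest = symbol[0], symbol[1:]
    if rest[:1] in ("#", "b"):          # accidental belongs to the root
        root, rest = root + rest[0], rest[1:]
    quality = QUALITY_ALIASES.get(rest, rest)
    out = f"{root}:{quality}"
    return out + f"/{bass}" if bass else out

print(normalise_chord("Em"))    # -> E:min
print(normalise_chord("D/F#"))  # -> D:maj/F#
```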

1. A bag of chords. Each chord that occurred in the song was listed along with the number of times it occurred.

2. A bag of roots. Each root note of the chords in the song was listed along with the number of times it occurred.

3. A bag of chords relative to the key of the song. As above, but each chord was represented as a degree of the key. This had the effect of allowing more than one song to appear to use the same chords, even if they were written in different keys.

4. A bag of roots relative to the key of the song.

5. A bag of all chord qualities used in the song. This set of features was chosen because of a hypothesis that the complexity of songs changed over time.

6. A bag of the most popular chords in the corpus. The entire corpus was scanned and the top 10 chord qualities obtained. These qualities were major, minor, dominant 7th, minor 7th, major 7th, 5th ("power" chord), sus4, dominant 7th sus4, major 7th add 9, and major 6th. A chord was added to an instance's bag only if its quality was one of these ten.
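The key-relative features (3 and 4) can be illustrated with standard pitch-class arithmetic. The degree labels below are an assumption, since the report does not specify its degree notation; the point is only that two songs in different keys map onto the same feature values.

```python
# Sketch of feature 3, the key-relative bag of chords. Roots are
# expressed as semitone offsets from the key's tonic, so transposed
# songs share features. Degree labels are illustrative assumptions.
from collections import Counter

PITCH_CLASS = {"C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3,
               "E": 4, "F": 5, "F#": 6, "Gb": 6, "G": 7, "G#": 8,
               "Ab": 8, "A": 9, "A#": 10, "Bb": 10, "B": 11}

def relative_bag_of_chords(chords, key_root):
    """Count chords with roots rewritten as semitone distances from
    the key, e.g. in E, 'G:maj' becomes '3:maj'."""
    key_pc = PITCH_CLASS[key_root]
    bag = Counter()
    for chord in chords:
        root, quality = chord.split(":", 1)
        degree = (PITCH_CLASS[root] - key_pc) % 12
        bag[f"{degree}:{quality}"] += 1
    return bag

# The 'Jane' excerpt in E minor: E:min and G:maj map to the same
# degrees as A:min and C:maj would in a song in A minor.
print(relative_bag_of_chords(["E:min", "D:maj", "G:maj"], "E"))
```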

Year estimation was performed using the libsvm library (Chang and Lin 2001), which provides a set of tools for creating support vector machines.

Feature                    1 year bins   2 year bins   5 year bins   10 year bins
random chance baseline     3.13%         5.88%         12.5%         20%
bag of chords              5.05%         10.50%        23.76%        44.06%
bag of roots               5.25%         10.30%        22.87%        43.76%
normalised bag of chords   4.85%         10.0%         23.76%        46.93%
normalised bag of roots    6.34%         9.70%         24.16%        45.64%
bag of qualities           6.24%         10.89%        23.47%        44.55%
bag of top qualities       6.14%         11.19%        25.35%        47.03%
number of bins             32            17            8             5

Table 1: Accuracy after 5-fold cross-validation (to 2 d.p.)

Each instance was converted into a line that could be read by the sample libsvm tools. A bag of features was turned into numerical features by listing the number of times each chord, root, or quality occurred in that instance. The count of each attribute was then scaled to fall between 0 and 1, as suggested in the libsvm documentation. libsvm uses the RBF kernel by default, and comes with a tool to automatically perform a search to find optimal values for the C and γ parameters to the kernel. This tool performed 5-fold cross-validation of the training data and returned an accuracy rating. The training was performed with a number of different sets of classes. One dataset was created where the class to estimate was the exact year associated with the song. Three additional sets were made with the year falling into 2, 5, and 10 year bins, respectively. For example, for 5 year bins, the year 1978 would fall into the 1975 bin along with all years up to and including 1979.
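The count scaling and year binning described above can be sketched as follows. The sparse line layout follows the standard libsvm input format (label index:value ...); the function names and the choice to scale by the maximum count are illustrative assumptions.

```python
# Sketch of the class binning and libsvm instance formatting
# described above. Names and the scaling scheme are illustrative.
def year_bin(year, bin_size):
    """Map a year onto the start of its bin, e.g. 1978 -> 1975 for
    5-year bins (1975 covers all years up to and including 1979)."""
    return year - (year % bin_size)

def libsvm_line(label, counts, max_count):
    """Scale raw feature counts into [0, 1] and emit one sparse
    libsvm instance line: 'label index:value index:value ...'."""
    feats = " ".join(f"{i}:{c / max_count:.4f}"
                     for i, c in sorted(counts.items()))
    return f"{label} {feats}"

print(year_bin(1978, 5))                              # -> 1975
print(libsvm_line(year_bin(1978, 5), {1: 4, 3: 2}, 4))
```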

4 Results

Results for each of the combinations of feature sets and class bin sizes are given in Table 1. The baseline given refers to the probability of getting the year correct if one of the possible classes were chosen at random (assuming a uniform distribution of test instances). From the table it can be seen that every set of features gives a result that is better than the baseline. No choice of feature significantly out-performs the others.

4.1 Discussion

It is worth considering the implication of the results presented, especially for the smaller bins (one and two years). While the accuracy is higher than random chance, the equivalent question being asked of a human would be "can you guess the exact year that this song was written in?" A more sensible question might be to place the song in a decade, or at least a part of a decade (beginning or end). An additional evaluation metric might be to consider the estimated year in a sliding window around the actual year. For example, with a 10 year sliding window, a song written in 1985 would be considered correct if the estimate falls between 1980 and 1990. It is also unknown how accurate the years in the dataset are. The year used as the class is the year that the song charted on the Billboard Hot 100. If the song was written by the artist that performed it, then there is a good chance that it was written in the year that it charted; however, there are also a number of covers of songs by different artists on the charts. Further investigation of the occurrence of these songs would be required to see whether this affects the results.
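The sliding-window metric suggested in the discussion could be computed as below. The half-window-on-each-side interpretation is an assumption, though it matches the 1985 example (correct between 1980 and 1990 for a 10 year window).

```python
# Sketch of the sliding-window evaluation metric: an estimate counts
# as correct if it falls within half the window of the true year.
def window_accuracy(true_years, estimated_years, window):
    half = window // 2
    hits = sum(abs(t - e) <= half
               for t, e in zip(true_years, estimated_years))
    return hits / len(true_years)

# 1989 is within 5 years of 1985 (hit); 1960 is 12 from 1972 (miss).
print(window_accuracy([1985, 1972], [1989, 1960], 10))  # -> 0.5
```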

5 Further work

There are other chord-based features that could be used in year estimation. One feature that was not implemented is the number of times common chord progressions occur. New MIR datasets are also becoming available. One such dataset that could be of use in a task like this is the Million Song Dataset (Lab ROSA 2011). The Million Song Dataset provides computer-generated transcriptions of frequency-intensity information. This information could be used along with the known chord progressions from the Billboard dataset to generate more chord progressions on which to train a year detection system.

6 Acknowledgements

Thanks are given to Ashley Burgoyne for providing the Billboard dataset, and for his help with identifying suitable features for use in the SVM models.

References

Chang, C.-C., and C.-J. Lin. 2001. LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

Harte, C., M. Sandler, S. Abdallah, and E. Gómez. 2005. Symbolic representation of musical chords: A proposed syntax for text annotations. In Proc. ISMIR, 66–71. Citeseer.

Lab ROSA. 2011. Million Song Dataset. Last accessed 21 March 2011, http://labrosa.ee.columbia.edu/millionsong/.

The McGill Billboard Project. 2011. The McGill Billboard Project. Last accessed 21 March 2011, http://billboard.music.mcgill.ca/.