
Computer Science Clinic

Final Report for Auditude

Music Similarity and Recommendation

May 13, 2003

Team Members: Paul Ruvolo (Team Leader), Brad Poon, Elizabeth Schoof, Nicholas Taylor

Advisor: Melissa O’Neill

Liaison: Nicholas Seet ’99

Abstract

The Auditude clinic team has investigated content-based similarity relationships between recorded musical performances. If a computer system can automatically determine whether two recordings are similar, it can assist in managing a music collection and make recommendations for possible new acquisitions. Similarity is a complex concept involving many judgments; our team has focused our attention on a combination of rhythm, timbre, and apparent loudness. We have developed software that extracts these features from a recording and uses them to organize music. We have not used any metadata in our process, though we have designed it so that metadata-based or other similarities could readily be integrated. The system can be used to categorize music, to make recommendations, and to generate playlists that arrange music in a sequence with smooth transitions between songs. The project as a whole takes a significant step towards making the experience of discovering and listening to new digital music effortless.

Contents

Abstract

1 Introduction
  1.1 Auditude
  1.2 Problem Statement
  1.3 Overview of Our Solution
  1.4 Deliverables

2 Feature Extraction
  2.1 Psychoacoustics and Loudness Sensation
  2.2 Beat Spectrum
  2.3 Timbre

3 Distance Metrics
  3.1 Euclidean
  3.2 Mahalanobis
  3.3 Cosine
  3.4 Dimensionality Reduction
  3.5 Combining Feature Vectors
  3.6 EMD and Clustering

4 Generating Maps
  4.1 The Self-Organizing Map
  4.2 The Growing Hierarchical Self-Organizing Map
  4.3 Our Application of SOMs
  4.4 Edge Weights
  4.5 Motivation for Maps and Hierarchical Maps
  4.6 Expanding a GHSOM
  4.7 Multiple Trees and Multiple Features
  4.8 Map Quality

5 Similarity Lookups
  5.1 Similarity Lookups Defined
  5.2 Topological Distance
  5.3 Complexity
  5.4 Multiple Features
  5.5 Algorithms for Lookups

6 Playlist Generation
  6.1 Complexity of Generating an Optimal Playlist
  6.2 A Greedy Algorithm
  6.3 Approximation Algorithms
  6.4 Genetic Algorithms

7 User Testing
  7.1 First User Test
  7.2 Second User Test
  7.3 Professor O’Neill’s Playlist
  7.4 Presentation Days Demo

8 Conclusions and Future Work

A Manuals and End to End Description
  A.1 Fluctuation Strength
  A.2 Calculating the Beat Spectrum
  A.3 Timbre Similarity
  A.4 Generating Maps
  A.5 Playlist Generation Using the Greedy Algorithm
  A.6 Playlist Generation Using the Genetic Algorithm
  A.7 Performing Music Recommendation

B Unexplored Possibilities
  B.1 Representing Trees in a Database
  B.2 More Efficient Dimensionality Reduction
  B.3 Other Algorithms for Playlist Generation

C Saga of Project (with Pictures)
  C.1 Cast of Characters
  C.2 Ground Zero
  C.3 La Mèrde de Paris
  C.4 Forget Paris... Please
  C.5 Casey at the Bat
  C.6 The Ghetto Girl Makes Good
  C.7 Wait, We Have to do a Presentation?
  C.8 The Day of Reckoning
  C.9 There and Back Again, a Project Manager’s Story

List of Figures

1.1 An overview of our solution

2.1 Spectra of three popular songs and a classical piece
2.2 Critical band spectra of three popular songs and a classical piece
2.3 Critical band spectra, modified by the spreading function
2.4 Phon values for the example songs
2.5 Sone values for the example songs
2.6 Modulation amplitude values for the example songs
2.7 Fluctuation strength values for the four example songs
2.8 Similarity matrix of Vivaldi’s Spring
2.9 The beat spectrum
2.10 The FFT of the beat spectrum
2.11 The waveform representation
2.12 MFCC data for three songs

3.1 A sample covariance matrix
3.2 Songs projected into the x-y plane
3.3 Contribution of additional dimensions

4.1 A view of a hierarchical SOM

6.1 Envisioning playlist generation as a shortest path problem

7.1 Comparing machine and human similarity judgments

A.1 The GHSOM code in action

B.1 Two different views of the same tree
B.2 A numbered tree
B.3 A sample split

C.1 Adventures in Paris
C.2 Projects Day, 2003

Chapter 1

Introduction

Music similarity judgments are valuable to both the music industry and consumers. Stores can use this information to organize their merchandise and recommend similar artists or albums. Individual listeners can use similarity to arrange playlists and select new music to extend their collections. Although people make these judgments fairly quickly even without formal music training, it is prohibitively time-consuming for people to analyze a meaningful fraction of all the recorded music that exists today. This motivates the quest for a computer-generated music similarity judgment useful for arranging playlists and recommending new recordings.

1.1 Auditude

Auditude has a very accurate music recognition system which is currently employed by several service providers as well as in consumer applications. The problem with this recognition system is, in a sense, that it is too accurate. It has been intentionally tuned to prevent similar-sounding songs from being mistaken for each other. Though their system can recognize a particular recording even if it has been corrupted by radio static and cell phone compression, it cannot calculate the less precise matching required for similarity and recommendation.

1.2 Problem Statement

For this project, we have explored music similarity, feature extraction, and comparison metrics. We have also developed frameworks which use these metrics to group songs hierarchically and to create pleasantly ordered playlists. These frameworks can also easily be extended to use new metrics as they are developed.

Figure 1.1: An overview of our solution. (The original flowchart shows two stages: database creation, done one time, in which audio is run through feature extraction — sone/loudness, MFCC/timbre, and beat spectrum — and a GHSOM neural network to build a saved tree of relationships; and database lookup, done many times, in which a query song is identified with existing Auditude song-ID technology and looked up in the tree to return similar songs, optionally ordered.)

1.3 Overview of Our Solution

Our system attempts to determine similarity relationships from a database of audio files. To do this, it first determines a ‘feature vector,’ or a series of numbers that somehow represent the way the song sounds, from each audio file. These feature vectors are then used as input to a neural network, which finds similarity relationships and arranges the songs into a hierarchy. An overview of this process is shown in the top half of Figure 1.1. Details of the procedures used may be found in Chapters 2, 3, and 4. Once the hierarchy of relationships has been generated, it can be used many times for similarity lookups from either content-based queries or Auditude-generated song IDs. Once the song name has been obtained, the song in question is located in the tree of relationships, and nearby songs are found. This procedure is shown in the bottom half of Figure 1.1. The specifics of this process may be found in Chapter 5. Our system can also order the results of a lookup, or any other set of songs from the database, in a manner that will reduce the overall dissimilarity between adjacent songs. The methods used for sequencing are discussed in Chapter 6.

1.4 Deliverables

In the first semester, we gave Auditude a proposal describing what we hoped to accomplish this year and a mid-year report detailing our progress through November. Now, at the end of the project, we are delivering the code we wrote and a final report. This document describes the possible approaches we researched, the algorithms we implemented, and the results we obtained. It also includes a user’s manual for the programs we developed and some suggestions for future development.

Chapter 2

Feature Extraction

Feature vectors lie at the core of our technique for discerning similarity. A feature vector represents the coordinates of a particular song in a high-dimensional vector space. It is an abstraction of the original raw audio data, focusing on some feature(s) of interest. The advantage of this abstract representation is that it allows straightforward comparisons between different songs. Additionally, feature vectors are ideal input for our mapping techniques, which further abstract music relationships. In this chapter, we will examine three different features we have used in feature extraction, and methods for combining these features into a single feature vector.

2.1 Psychoacoustics and Loudness Sensation

Psychoacoustics is the study of how people perceive and process sound (Zwicker & Fastl, 1999). It is a very broad and interdisciplinary field, drawing from medicine, physics, engineering, music, and a host of other areas of study. For this project, however, we are focusing on two areas of psychoacoustics that were used successfully in Rauber et al. (2002) and Pampalk et al. (2002): loudness sensation and fluctuation strength.

2.1.1 Theory

Loudness sensation is a measure of how loud people perceive a sound to be, relative to a fixed reference sound. The loudness sensation of a sound is highly dependent on frequency, and does not scale linearly with sound pressure level; thus the results of comparisons between loudness sensation levels can be quite different from simply comparing the spectra of two signals.

Loudness sensation has some of the same disadvantages as comparison of spectra, namely that it is not time invariant; the spectrum of a signal may change completely if the starting point is shifted by a small amount. Therefore, a different measure is needed, and following the approach taken by Rauber et al. (2002) we have chosen to use fluctuation strength. Fluctuation strength is a measure of the rise and fall of loudness sensation over time, and, since such patterns are generally constant for a short period of time, it is relatively time invariant.

2.1.2 Method

Determining loudness sensation is a two-step process. The first step is to convert the signal from PCM audio into a physical property of sound, pressure as a function of time. This conversion is done by scaling the sound sample values, which are real numbers in the range [−1, 1], by a factor that will make the loudest sound in the recording the desired decibel level (75 dB); the scaling factor is determined from the relationship

p = p_0 · 10^(L/20)    (2.1)

where L is the sound pressure level in dB, p is the sound pressure in Pascals, and p_0 is the threshold of hearing, 10^−5 Pa. The samples are all then squared, converting them from pressure to power, or sound pressure level, I in W/m² (Zwicker & Fastl, 1999, p. 1). The next step is the computation of the spectrum of the signal using the fast Fourier transform (Stearns & David, 1996, ch. 3); we break the signal up into short, overlapping windows of about 23 ms, so we have a data point roughly every 12 ms. For each data point, there are

0.0232 seconds × 11025 samples/second = 256 samples    (2.2)

and since the Fourier transform produces the same number of coefficients as it takes in data points, there are 256 coefficients (though only at 128 distinct frequencies). Thus we have transformed the data, but have not significantly reduced the amount of it. Figure 2.1 shows the spectra for three popular music songs, “Eight Days a Week” and “Day Tripper” by The Beatles and “Material Girl” by Madonna, as well as a classical piece for baritone and orchestra, “Wenn mein Schatz Hochzeit macht” from Gustav Mahler’s Songs of a Wayfarer. By using another product of psychoacoustical research we can simplify the data without losing any meaningful information. The Bark scale (Zwicker & Fastl, 1999, ch. 6) breaks the frequency range of the human ear up into 24 regions, known as critical bands, that correspond to the regions of the frequency spectrum that

Figure 2.1: Spectra of three popular songs and a classical piece. (Panels: (a) “Eight Days a Week”, (b) “Day Tripper”, (c) “Material Girl”, (d) “Songs of a Wayfarer”; each plots frequency in Hz against time in seconds, with spectrum level in dB.)

humans distinguish. Zwicker & Fastl (1999, p. 159) list the frequency bands that make up the Bark scale. Figure 2.2 shows the spectra from Figure 2.1 condensed into the critical bands. Here we modify the critical band spectrum to account for the asymmetric spreading effects between critical bands, using the method of Rauber et al. (2002), which is based on the equations of Schroeder et al. (1979). We do this by convolving the spreading function

10 log10 B(z) = 15.81 + 7.5(z + 0.474) − 17.5 · sqrt(1 + (z + 0.474)²) dB    (2.3)

with the spectrum in each time window. The results of applying this transformation to the songs in Figure 2.2 can be seen in Figure 2.3. The excitation levels for each critical band are then converted to sound pressure level using the relationship

L = 10 log10(I / I_0)    (2.4)

where I_0 = 10^−12 W/m², again the threshold of hearing. From there, we can begin to take into account the level at which people perceive sound. This is done by taking experimental data (Fletcher & Munson, 1933) of the levels of a sound necessary to maintain a constant volume at different frequencies; Allen & Neely (1997) use polynomial regression on the raw data to provide the cubic polynomials used in our work. The loudness level, in the unit phon, of a tone is defined to be the sound pressure level of a tone at 1 kHz that sounds as loud as that tone. In general, higher sound pressure levels are needed at high and low frequencies to maintain the loudness sensation felt at mid-range frequencies between 1 and 5 kHz (Zwicker & Fastl, 1999, p. 204). Figure 2.4 shows the results of applying this transformation to the data from Figure 2.2. If the center of the critical band did not fall directly on one of the 11 loudness curves from Allen & Neely (1997), we used a weighted average of the two closest curves. From here we must take into account the perceived loudness of different phon levels, or, put another way, how much louder or softer one sound seems than another. The standard way to do this is by conversion from phon to the unit of loudness, sone; one sone is the loudness of a 1 kHz tone at 40 dB. We used the conversion formulas from Bladon & Lindblom (1981). Since the human ear perceives quiet sounds differently from loud sounds, there are two different formulas; L signifies loudness level in phon, and S signifies loudness in sone. For L ≥ 40 phon, loudness is given by

S = 2^((L − 40)/10)    (2.5)

Figure 2.2: Critical band spectra of three popular songs and a classical piece. (Panels: (a) “Eight Days a Week”, (b) “Day Tripper”, (c) “Material Girl”, (d) “Songs of a Wayfarer”; each plots critical band in Bark against time in seconds, with power in W/m².)

Figure 2.3: Critical band spectra, modified by the spreading function. (Same four example songs as Figure 2.2.)

Figure 2.4: Phon values for the example songs. (Panels: (a) “Eight Days a Week”, (b) “Day Tripper”, (c) “Material Girl”, (d) “Songs of a Wayfarer”.)

while for L < 40 phon, loudness is given by

S = (L/40)^2.642    (2.6)

The results of applying this transformation to the loudness levels in Figure 2.4 are shown in Figure 2.5. The results shown in Figure 2.5 differ greatly between the songs, and the plots are certainly more distinct than any of the previous measures of loudness; however, there is still so much data that direct comparison is difficult. As mentioned earlier, these results are not time invariant, and shifting the start or end of the sampled region would also shift all of the values. This feature is therefore not ideal for a similarity engine, since it is dependent on the exact choice of sampling window, and also provides a rather restrictive definition of similarity. However, if one examines the short-term change in loudness at different frequencies, interesting patterns begin to emerge. Given that changes in loudness are known, the process for computing fluctuation strength is quite simple. One begins by taking the Fourier transform of the loudness for each critical band over reasonably short windows of time; following Rauber et al. (2002), we use six-second windows. The lower-order coefficients are retained, since the human ear is most sensitive to fluctuations at modulation frequencies of between 1 and 10 Hz (Zwicker & Fastl, 1999, ch. 10); they are grouped by frequency to reduce the amount of data being stored. Graphs of fluctuation strength from the example songs are shown in Figure 2.6. The final step in the calculation of fluctuation strength is to weight the groups of coefficients by how well humans perceive fluctuations at that modulation frequency; we use the method of Zwicker & Fastl (1999, ch. 10), which gives the following relationship for fluctuation strength F as a function of modulation amplitude ∆L and modulation frequency f_mod:

F ∼ ∆L / ((f_mod / 4 Hz) + (4 Hz / f_mod))    (2.7)

The equation clearly shows that humans are most sensitive to fluctuations at a modulation frequency of around 4 Hz.
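As a concrete illustration, the phon-to-sone conversion (Eqs. 2.5 and 2.6) and the perceptual weighting of Eq. 2.7 amount to only a few lines. The following is a minimal numpy sketch of the formulas alone; the function names are ours, not the team's code:

```python
import numpy as np

def phon_to_sone(L):
    """Eqs. 2.5 and 2.6: loudness S in sone from loudness level L in phon."""
    L = np.asarray(L, dtype=float)
    return np.where(L >= 40.0, 2.0 ** ((L - 40.0) / 10.0), (L / 40.0) ** 2.642)

def fluctuation_weight(f_mod):
    """Eq. 2.7 without the Delta-L factor: relative sensitivity to a
    modulation frequency f_mod in Hz, peaking at 4 Hz."""
    return 1.0 / (f_mod / 4.0 + 4.0 / f_mod)

# One sone is a 1 kHz tone at 40 dB; each 10 phon above that doubles loudness.
print(phon_to_sone([40.0, 50.0, 60.0]))  # -> [1. 2. 4.]
```

Note how `fluctuation_weight` is largest at exactly 4 Hz, matching the sensitivity peak described above.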
After applying this weighting function to the data from Figure 2.6, the relationships between the songs become clearer, as the uninteresting lower-modulation-frequency fluctuations are weighted down. The result is shown in Figure 2.7. To generate a single vector from all of a song’s fluctuation strength matrices, we take the element-wise median at each point in the matrix; this gives a reasonably good approximation of general patterns throughout the song, and is a technique

Figure 2.5: Sone values for the example songs. (Panels: (a) “Eight Days a Week”, (b) “Day Tripper”, (c) “Material Girl”, (d) “Songs of a Wayfarer”.)

Figure 2.6: Modulation amplitude values for the example songs. (Panels: (a) “Eight Days a Week”, (b) “Day Tripper”, (c) “Material Girl”, (d) “Songs of a Wayfarer”; each plots critical band in Bark against modulation frequency in Hz.)

Figure 2.7: Fluctuation strength values for the four example songs. (Same layout as Figure 2.6.)

successfully used previously (Rauber et al., 2002). These 20 × 20 matrices are then flattened into a 400-dimensional vector for the purposes of comparison; the order of dimensions is unimportant for the naïve feature vector comparisons we use.
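The collapse into a single 400-dimensional vector described above might look as follows (a hedged numpy sketch, assuming the per-window matrices arrive as a list of 20 × 20 arrays):

```python
import numpy as np

def song_feature_vector(fs_matrices):
    """Element-wise median over a song's per-window 20x20
    fluctuation-strength matrices, flattened to 400 dimensions."""
    stacked = np.stack(fs_matrices)            # shape (n_windows, 20, 20)
    return np.median(stacked, axis=0).ravel()  # shape (400,)
```

The median (rather than the mean) keeps a brief loud or quiet passage from dominating the song-level summary.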

2.2 Beat Spectrum

The beat spectrum is a useful similarity metric that compactly captures the rhythm and tempo features of a song. The beat spectrum represents periodicities in audio: repetitive music, for example, will have strong beat spectrum peaks at the repetition times. The beat spectrum will give us a measure for comparing rhythmic similarity, which is helpful in ordering songs based on their rhythmic properties. Our algorithm for calculating the beat spectrum was developed by Jonathan Foote (Foote et al., 2002; Foote & Cooper, 2002). We calculate the beat spectrum in three steps, which we discuss in detail below. First, we parameterize the audio to obtain its representative feature vectors. Second, we embed the feature vectors into a two-dimensional matrix by calculating the cosine distance between the vectors. Finally, we create the beat spectrum by performing autocorrelation on the similarity matrix; this method yields a more accurate approximation of the beat spectrum than simply summing along the diagonals.

2.2.1 Parameterizing the Audio

In order to parameterize the audio, we first made Hamming windows from the raw audio. Hamming windows are small sampling windows which contain the amplitudes of the discrete frequency components of a sound signal (Pradeep & Gupta, 2003). In testing, we used 256-sample frames overlapped by 128 samples, although other frame sizes can certainly be used. Computing the fast Fourier transform on each frame results in a series of feature vectors that characterizes the spectral content. In addition to Hamming windows, Foote has used MFCCs (see Section 2.3) to parameterize the audio, but we did not implement that method of parameterization.
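This parameterization step can be sketched in a few lines of numpy (frame and hop sizes taken from the text; the function name is ours):

```python
import numpy as np

def parameterize(audio, frame=256, hop=128):
    """Split the signal into Hamming-windowed frames of `frame` samples,
    overlapped by `hop` samples, and return each frame's FFT magnitudes
    as its spectral feature vector."""
    n = 1 + (len(audio) - frame) // hop
    window = np.hamming(frame)
    frames = np.stack([audio[i * hop : i * hop + frame] * window
                       for i in range(n)])
    return np.abs(np.fft.rfft(frames, axis=1))  # one row per frame
```

With a 256-sample frame, the real FFT yields 129 magnitude bins per frame, so the result has one 129-dimensional feature vector per 128-sample hop.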

2.2.2 Creating the Similarity Matrix

Once the audio has been parameterized, we embed the feature vectors into a two-dimensional matrix by computing the scalar (dot) product of all pairwise combinations of frames i and j from the set of feature vectors v. Normalizing the product reduces dependency on magnitude, and thus energy, which is useful for songs with

a significant portion of silence:

D(i, j) = (v_i · v_j) / (‖v_i‖ ‖v_j‖)    (2.8)

Figure 2.8: Similarity matrix of Vivaldi’s Spring. (Time in seconds, 0 to 30, on both axes.)
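In numpy, the normalized pairwise comparison of Eq. 2.8 reduces to one matrix product (a sketch; `V` holds one feature vector per row, and the epsilon guard for silent all-zero frames is our own addition):

```python
import numpy as np

def similarity_matrix(V, eps=1e-12):
    """Cosine similarity (Eq. 2.8) between all pairs of frame feature
    vectors; S[i, j] is near 1 for similar frames, near 0 for dissimilar."""
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    U = V / np.maximum(norms, eps)  # unit-normalize each frame vector
    return U @ U.T
```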

The similarity measure is performed on all pairwise combinations of the feature vectors to create the similarity matrix S. Higher numbers correspond to higher measures of self-similarity, while lower numbers correspond to dissimilarity. Figure 2.8 is an example of a self-similarity matrix computed in Matlab for 30 seconds of Vivaldi’s Spring. Regions of high self-similarity appear as bright squares along the main diagonal. In the figure, four main squares are visible along the diagonal simply by observing the differences in brightness. These four squares are highlighted in red. The first two represent the opening, which repeats again at 8 seconds but softer. The next square, at 15 seconds, transitions into the next main theme of the song, which again repeats at around 23 seconds with a softer amplitude. Notice that smaller bright squares are visible within the four main squares themselves. These smaller squares are representative of the repetitive beat within each repeat of the chorus.

The entire similarity matrix is not necessary for computing the beat spectrum. We are only concerned with similarity within a few seconds of each frame, because the beat spectrum captures the rhythmic properties of the entire similarity matrix within a small lag time. For example, a high beat spectrum peak at a lag time of 1 second says that there is a repetitive beat every second. We define the lag domain L(i, j) to be the range of similarity values within the lag time l = j − i. Calculating frame similarity within a small lag domain reduces the number of computations needed to obtain the beat spectrum, reducing the algorithmic complexity from O(n²) for a full similarity matrix to O(n); in practice, the beat spectrum may be computed several times faster than real-time.

2.2.3 Deriving the Beat Spectrum

The periodicity of the audio can be calculated from the similarity matrix as the beat spectrum. Self-similarity as a function of the lag time is called the beat spectrum (Foote et al., 2002). We can estimate the beat spectrum through two methods. First, by simply summing along the diagonals, we find that

B(l) ≈ Σ_{k ⊂ R} S(k, k + l)    (2.9)

Second, by performing autocorrelation on the matrix, we find that (Foote et al., 2002)

B(l) = Σ_{i,k} S(i, k) · S(i + l, k + l)    (2.10)

Foote found that the autocorrelation of the similarity matrix works well across a range of musical genres, tempos, and rhythmic structures. In practice, we used autocorrelation to calculate the beat spectrum, although the diagonal sum method is significantly faster. Figures 2.9(a) and (b) are two examples of beat spectra that show a similar beat. We are mainly concerned with the peaks, which tell us at what point in time a repetitive beat occurs. Note that the first peaks for both “Eight Days a Week” and “The Good Life” occur at around 0.80 seconds, even though these beats are produced by different instruments!
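The faster diagonal-sum estimate (Eq. 2.9) amounts to collapsing each superdiagonal of S to one number. A hedged numpy sketch follows; we average rather than sum so that shorter diagonals at larger lags are not penalized, which is our own normalization choice:

```python
import numpy as np

def beat_spectrum(S, max_lag):
    """Estimate B(l) for l = 0 .. max_lag-1 by averaging the similarity
    values along each superdiagonal of S (the diagonal-sum method)."""
    return np.array([np.diagonal(S, offset=l).mean() for l in range(max_lag)])
```

A strong beat every l frames shows up as a large value of `B[l]`, since frames l apart then tend to be similar.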

2.2.4 Processing the Beat Spectrum

The beat spectrum vectors are processed in the following manner. First, we truncate all the vectors to the same lag time. A lag time of about 5 seconds will suffice and is long enough to encompass most distinct beats in a song. Next, we subtract the mean and normalize the vector by dividing by the maximum value so that the

Figure 2.9: The beat spectrum. (Panels: (a) Beatles, “Eight Days a Week”; (b) Weezer, “The Good Life”; each plots power against lag time in seconds, from 0 to 5 seconds.)

peak magnitude has a value of 1. Finally, since all beat spectra are similar at zero lag, we truncate the first 0.120 seconds. This is an arbitrarily chosen number, but Foote also uses it in his method (Foote et al., 2002).
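The three processing steps above can be sketched as follows (numpy; `rate` is the number of beat-spectrum samples per second of lag, an assumed parameter):

```python
import numpy as np

def process_beat_spectrum(b, rate, max_lag_s=5.0, skip_s=0.120):
    """Truncate to a common 5 s lag range, zero the mean, scale the peak
    magnitude to 1, then drop the first 0.120 s near zero lag."""
    b = np.asarray(b, dtype=float)[: int(max_lag_s * rate)]
    b = b - b.mean()
    b = b / np.max(np.abs(b))
    return b[int(skip_s * rate):]
```

After this step, every song's beat spectrum covers the same lag range and shares the same scale, so the vectors can be compared directly.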

2.2.5 Facilitating Beat Comparison using FFT

Although we can see parallels between the two beat spectra shown in Figure 2.9, a straightforward comparison of feature-vector coefficients may not reflect this similarity. Both songs have their main beats about once a second, but they do not have the exact same beat: while the beats are fairly closely aligned on the left-hand side of the graph, they drift apart on the right-hand side. Thus, we should avoid directly comparing beat spectra and instead use a better metric. One approach is to take the FFT of the beat spectrum to obtain beat frequencies. Figure 2.10 shows the results of taking the FFT of our two example songs. The green line shows the results from blurring the spectrum slightly to spread the peaks.
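A sketch of this comparison metric, with an optional moving-average blur (the blur width here is our own illustrative choice, not a value from the report):

```python
import numpy as np

def beat_frequencies(b, blur_width=0):
    """Magnitude FFT of a processed beat spectrum; optionally smooth with
    a short moving average to spread the peaks before comparison."""
    mag = np.abs(np.fft.rfft(b))
    if blur_width > 1:
        kernel = np.ones(blur_width) / blur_width
        mag = np.convolve(mag, kernel, mode="same")
    return mag
```

In this representation, two songs whose beats drift out of phase still produce peaks at nearly the same beat frequency, which is exactly the robustness motivated above.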

2.3 Timbre

Another characteristic of music that people notice is the mixture of instruments used to perform the piece. The characteristic voice of an instrument is its timbre. The timbre of a piece of music may be represented as the first several Mel Frequency Cepstral Coefficients (MFCCs) of the recording (Logan & Salomon, 2001b). MFCCs may be calculated with a modified Fourier transform which imitates the way human ears modify sound as they process it. Several researchers have explored this, and disagree on how many coefficients should be used in order to capture the timbre of the sound without the pitch (Aucouturier & Pachet, 2002; Logan & Salomon, 2001b; Rabiner & Juang, 1993).

2.3.1 Theory

Mel Frequency Cepstral Coefficients (MFCCs) are a standard transformation of sound in speech recognition, but are also effective for musical sound analysis. This feature extraction method builds upon the work of psychoacoustic modeling (Rabiner & Juang, 1993). Our algorithm takes audio files, as in Figure 2.11, and extracts their MFCCs. The first 10 MFCCs of the song from Figure 2.11 are shown in Figure 2.12(a).

Figure 2.10: The FFT of the beat spectrum. (Panels: (a) Beatles, “Eight Days a Week”; (b) Weezer, “The Good Life”; each plots power against beat frequency in Hz, from 0 to 25 Hz.)

Figure 2.11: The waveform representation of the Beatles’ song “Eight Days a Week”. (Amplitude against time, over 30 seconds.)

Acoustical Modeling: Mel-Frequency Scale

Since our target audience is the average music listener, and not musicologists or professional classical pianists, it makes sense for us to base our metrics on what people hear, rather than on what is played. Even if we had the score for every piece of music, and could use it in our metrics, it would not be as helpful as accurate data about what an average human actually hears when that piece is performed. Ears filter and transform sound waves in complex ways as they travel through the ear canal. The mel scale is a filter for digital signal processing which approximates the filtering effects of the human ear.

Cepstrum of Mel Frequency

The cepstrum of this psycho-acoustic model of the sound provides valuable information about the timbre and pitch of the original sound. As a song’s timbre changes over time, so do the MFCCs of that song; thus MFCCs must be calculated at multiple points throughout the song. (In our work we have used overlapping windows of 25 ms.) The nth cepstral coefficient, c̃_n, is given by the equation (Aucouturier & Pachet, 2002):

\[
\tilde{c}_n = \frac{1}{2\pi} \int_{-\pi}^{+\pi} \log\!\bigl(S(e^{j\omega})\bigr)\, e^{j\omega n}\, d\omega \tag{2.11}
\]

where $\omega$ is the frequency and $S(e^{j\omega})$ is the frequency response. The first $K$ coefficients of this frequency response are denoted $\tilde{S}_k$. These are used when we combine the cepstrum and the mel frequency as follows (Rabiner & Juang, 1993):

\[
\tilde{c}_n = \sum_{k=1}^{K} \bigl(\log \tilde{S}_k\bigr) \cos\!\left[ n \left( k - \tfrac{1}{2} \right) \frac{\pi}{K} \right] \tag{2.12}
\]
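Equation 2.12 is a discrete cosine transform of the log mel filterbank energies. As an illustrative sketch only (this is not the code we used), the final step might look like the following, assuming the mel filterbank energies $\tilde{S}_k$ have already been computed:

```python
import math

def cepstral_coefficients(log_mel_energies, n_coeffs):
    """Compute cepstral coefficients from log mel filterbank energies
    via Equation 2.12: c_n = sum_k (log S_k) cos[n (k - 1/2) pi / K].
    Here log_mel_energies[k] holds log(S_{k+1}) for k = 0..K-1."""
    K = len(log_mel_energies)
    return [
        sum(log_mel_energies[k] * math.cos(n * (k + 0.5) * math.pi / K)
            for k in range(K))
        for n in range(n_coeffs)
    ]
```

For a flat spectrum all coefficients above the zeroth vanish, which is a quick sanity check on the transform.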

2.3.2 Comparing MFCCs

Our algorithm currently generates ten floating point numbers per time window, or 29,970 floats for a 30-second song sample. Even a coarse, human evaluation of this data, as in Figure 2.12, reveals some timbral distinctions. We have explored a variety of ways to compare the MFCCs of songs objectively.

In order to facilitate fast live comparisons, we have considered a number of methods of reducing the volume of data, hoping to extract the essence of each song while sacrificing the less helpful data. Using the raw data as a feature vector is impractical for large-scale implementations: the space required to store these vectors would be excessive. While the space issue could be solved by simply averaging the 2997 vectors, averaging tends to mask the variation within a piece of music. With averaging, a piece in which violins and flutes play together would look very much the same as a piece in which the two instruments take turns in a call-and-response manner. Clustering the data (as described in Section 3.6), on the other hand, maintains the distinction between the two timbres in the latter example and finds no equivalent distinction in the former. These two considerations led us to cluster the data.

We also tried simpler methods of reducing the data. The method currently used in our tree code (Chapter 5) is a simple Fourier transform, reducing the size of a song’s feature vector to only 10 × 15 floats.¹ We have also found some success using the first several coefficients of the Fourier transform of the full 10 by 2997 matrix.

1. The Sandia clinic team, which has been working on clustering using only the first 2 of these 10 averaged coefficients, achieved fairly good clusterings. Unfortunately, this work was based on a different corpus of songs with different clustering algorithms. We have not had the opportunity to duplicate their results.

Figure 2.12: MFCC data for three songs: (a) “Eight Days a Week”, (b) “Come Out and Play”, (c) “I Left My Heart in San Francisco”, and (d) average MFCC values for the Beatles, the Offspring, and Tony Bennett. In these graphs, the first three coefficients illuminate the distinction between a guitar and drums combo (Beatles and Offspring) and mellow vocals with piano (Tony Bennett).

Chapter 3

Distance Metrics

Once we had feature vectors for different songs, it was necessary to find some way to compare them. The result of this comparison is a distance that defines how similar the two vectors are to each other. We experimented with a number of different distance metrics, both simple and complicated, and compared their performance and quality of results.

3.1 Euclidean

The Euclidean distance metric is probably the simplest to understand. It is defined to be the linear distance between the ends of two vectors in $\mathbb{R}^n$. From the Pythagorean Theorem, this can be determined to be, for two $n$-dimensional vectors $x$ and $y$,

\[
d_E(x, y) = |x - y| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{3.1}
\]

We can also express this in linear algebra terms, assuming $x$ and $y$ are column vectors:

\[
d_E(x, y) = \sqrt{(x - y)^T (x - y)} \tag{3.2}
\]

This metric is commonly used to compare feature vectors, due to its simplicity (Hand et al., 2001, ch. 2). In addition, it was the default in the map generating code described below.
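Equation 3.1 transcribes directly into code (a sketch, not the project’s actual implementation):

```python
import math

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length feature vectors (Eq. 3.1)."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
```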

3.2 Mahalanobis

In an attempt to refine the somewhat blunt nature of the Euclidean distance, we studied a more sophisticated metric known as Mahalanobis distance. It adjusts

Euclidean distance to give less weight to dimensions that vary together and more to those that vary independently. We had hoped that this would enhance patterns in the data. The first step in the Mahalanobis distance is to compute a covariance matrix of all the feature vectors. For two dimensions $x$ and $y$ with $m$ observations of each, the covariance is

\[
\mathrm{Cov}(x, y) = \frac{1}{m} \sum_{i=1}^{m} \bigl(x(i) - \bar{x}\bigr)\bigl(y(i) - \bar{y}\bigr) \tag{3.3}
\]

where $x(i)$ denotes the $i$th observation of dimension $x$; this number is positive if the two dimensions vary together (i.e. one is high when the other is high), negative if the two vary oppositely (i.e. one is low when the other is high), and close to zero if the two vary independently. To create the covariance matrix $\Sigma$, we simply let $\Sigma_{i,j}$ be the covariance between the $i$th and the $j$th dimensions. A sample covariance matrix is shown in Figure 3.1. To add this weighting to the Euclidean distance, we insert the inverse of the covariance matrix $\Sigma$ between the vectors in Equation 3.2, giving the Mahalanobis distance between two column vectors $x$ and $y$ as

\[
d_{MH}(x, y) = \sqrt{(x - y)^T \Sigma^{-1} (x - y)} \tag{3.4}
\]

This distance metric has a strong statistical basis, since it follows from the definition of the $n$-dimensional normal distribution (Hand et al., 2001, ch. 2, 9).

Note that the addition of the matrix multiplication to the distance metric increases the complexity of determining the distance between two $n$-dimensional feature vectors from $O(n)$ to $O(n^2)$. In our tests, this drastically increased the amount of time needed for map generation. Since the weighting did not seem to improve the quality of the results, we did not pursue this metric beyond initial testing.
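Given a precomputed inverse covariance matrix, Equation 3.4 can be sketched as follows (illustrative only; in practice the inverse would come from a linear algebra library):

```python
import math

def mahalanobis_distance(x, y, sigma_inv):
    """Mahalanobis distance (Eq. 3.4); sigma_inv is the inverse covariance
    matrix, given as a list of rows."""
    d = [xi - yi for xi, yi in zip(x, y)]
    # quadratic form (x - y)^T Sigma^{-1} (x - y)
    quad = sum(d[i] * sigma_inv[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))
    return math.sqrt(quad)
```

With $\Sigma^{-1}$ equal to the identity this reduces to the Euclidean distance; a dimension with large variance gets a small entry in $\Sigma^{-1}$ and so contributes less to the total.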

3.3 Cosine

Cosine distance is another relatively simple distance metric. It leverages the fact that the dot product of two vectors $x$ and $y$ in $\mathbb{R}^n$ is the cosine of the angle between them multiplied by the product of their lengths. Thus it is computationally efficient to extract the cosine of the angle between them, which gives an indication of whether the two vectors point in the same direction, regardless of their relative lengths; it ranges from $-1$ to $1$, with $1$ indicating parallel, $0$ indicating perpendicular, and $-1$ indicating anti-parallel. From the formula for the dot product,

\[
x \cdot y = |x|\,|y| \cos\theta \tag{3.5}
\]

Figure 3.1: A sample covariance matrix for flattened fluctuation strength vectors. The repeating pattern every 20 dimensions shows that pixels adjacent in the 20 by 20 matrix tend to vary together. Notice that all entries are greater than or equal to zero, showing that no dimensions tended to vary oppositely.

we can extract the cosine distance between $x$ and $y$ as

\[
d_c(x, y) = \cos\theta = \frac{x \cdot y}{|x|\,|y|} \tag{3.6}
\]
\[
= \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}} \tag{3.7}
\]

This distance measure has been used effectively in many information retrieval ap- plications (Hand et al., 2001, ch. 14).
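Equations 3.6 and 3.7 translate directly into code (sketch):

```python
import math

def cosine_distance(x, y):
    """Cosine of the angle between two vectors (Eqs. 3.6-3.7);
    ranges from -1 (anti-parallel) to 1 (parallel)."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return dot / (math.sqrt(sum(xi * xi for xi in x)) *
                  math.sqrt(sum(yi * yi for yi in y)))
```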

3.4 Dimensionality Reduction

To reduce the amount of data, and therefore to increase the speed of processing, we experimented with multidimensional scaling (Hand et al., 2001, ch. 3), a method for reducing the dimensionality of multidimensional data. This technique projects the distances between feature vectors in a high-dimensional space into a lower-dimensional space, attempting to put the maximum variation between vectors in the lower-order dimensions.

We found that this technique gave relatively good results. Figure 3.2 shows a subset of our test songs displayed in the x-y plane using scaled fluctuation strength data. The groupings appear reasonably good. Figure 3.3 shows the amount of variance that remains after doing the dimensionality reduction.

Doing the dimensionality reduction is relatively computationally expensive. Since multidimensional scaling requires the construction of a pairwise distance matrix, it is inherently Ω(m²), where m is the number of data points. Even for our small test set of around 160 songs, the dimensionality reduction took orders of magnitude longer than the map generation. Since map generation was already much faster than feature extraction, we decided not to pursue this technique further.
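To give a concrete flavor of the technique (this is not the implementation we used), classical multidimensional scaling double-centers the squared distance matrix and reads coordinates off its leading eigenvectors. A one-dimensional sketch using power iteration:

```python
import math

def classical_mds_1d(D, iters=200):
    """Project points with pairwise distance matrix D onto one dimension,
    preserving as much variance as possible (classical MDS sketch)."""
    m = len(D)
    D2 = [[d * d for d in row] for row in D]
    row_mean = [sum(row) / m for row in D2]
    grand = sum(row_mean) / m
    # double-centered Gram matrix: B = -1/2 * J D^2 J
    B = [[-0.5 * (D2[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(m)] for i in range(m)]
    # power iteration for the leading eigenpair of B
    v = [float(i + 1) for i in range(m)]
    lam = 0.0
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(m)) for i in range(m)]
        lam = math.sqrt(sum(x * x for x in w))
        v = [x / lam for x in w]
    return [vi * math.sqrt(max(lam, 0.0)) for vi in v]
```

For points that actually lie on a line, the embedding recovers their pairwise distances exactly; for higher-dimensional data it keeps only the direction of maximum variance.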

3.5 Combining Feature Vectors

Figure 3.2: A subset of our test songs projected into the x-y plane, using the fluctuation strength data. Notice some logical groupings; for example, the four songs from Mahler’s “Songs of a Wayfarer” are close to each other.

Figure 3.3: Contribution of additional dimensions. (a) Relative contribution of the first fifty dimensions; (b) amount of variance captured using a subset of dimensions. These graphs show the contribution of each dimension in the scaled output to the overall variance of the data. The amount of variance can be thought of as corresponding roughly to the amount of information.

When combining feature vectors from different sources into one über-feature vector, we had to normalize the data so that the choice of units does not affect the overall distance. For the distance metrics discussed above, if one feature vector had values that were on average 1000 times higher than those in another feature vector, differences in values from the first feature vector would outweigh any differences in values from the second when comparing two combined feature vectors. Using a common strategy to compensate for this (Hand et al., 2001, ch. 2), we divide each dimension in the combined feature vector by its standard deviation. To emphasize differences between values, as opposed to the actual magnitude of values, we also subtract off the mean, yielding a statistical measure known as a t-score. This gives an overall transformation for each dimension x of

\[
x'(i) = \frac{x(i) - \bar{x}}{s_x} \tag{3.8}
\]

where $x(i)$ is an individual measurement of dimension $x$, $\bar{x}$ is the mean of the observed measurements of dimension $x$, and $s_x$ is the standard deviation of the measurements of dimension $x$.
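Equation 3.8 applied to one dimension of the combined feature vectors (a sketch; the population standard deviation is used here, matching the $1/m$ convention of Equation 3.3):

```python
import math

def t_score(values):
    """Normalize the observations of one dimension (Eq. 3.8):
    subtract the mean, then divide by the standard deviation."""
    m = len(values)
    mean = sum(values) / m
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / m)
    return [(v - mean) / sd for v in values]
```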

3.6 EMD and Clustering

We used earth mover’s distance (EMD) as a distance metric for comparing timbre feature vectors (Logan & Salomon, 2001a). The technique could potentially prove useful for many other types of features, however. The techniques we used for making timbral comparisons with earth mover’s distance were drawn from Beth Logan’s work in this area (2001a).

3.6.1 Clustering Method

EMD requires clustered data. We chose K-means as our method of clustering the MFCC data for EMD, largely because it was already implemented in Matlab and seems to produce fairly good results in similarity applications (Fung, 2001). Since K-means is sensitive to its normally random initial conditions, we selected the initial conditions deterministically. To avoid bias towards any part of the song, we took data from windows equally spaced throughout the sample. Gaussian mixture models would probably also work well for clustering the MFCCs.
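A sketch of K-means with the deterministic, equally spaced initialization described above (our real clustering ran in Matlab; this version is illustrative only):

```python
def kmeans(points, k, iters=50):
    """K-means with initial centers drawn from points equally spaced
    through the sample, so the result is deterministic."""
    step = len(points) // k
    centers = [list(points[i * step]) for i in range(k)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:  # assign each point to its nearest center
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        for c in range(k):  # recompute each centroid
            if clusters[c]:
                n = len(clusters[c])
                centers[c] = [sum(dim) / n for dim in zip(*clusters[c])]
    return centers, clusters
```

The cluster sizes `len(clusters[c])` then serve as the weights that EMD requires.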

3.6.2 Earth Mover’s Distance

The EMD metric calculates the distance between two sets of clusters. Each set may contain any number of clusters. For each cluster, a centroid (a vector in n-dimensional space) and a weight (a count of how many points are in the cluster) are required. Given a metric for the distance between any two points in this space, EMD calculates how much “work” is necessary to transform one set of clusters into the other. In this context, work is simply the quantity moved times the distance it is moved. The name for this algorithm comes from one conceptual explanation of it. One set of clusters is a set of piles of dirt. The locations and sizes of the piles are represented by the centroids and weights. The

other set of clusters is like a set of holes. The algorithm calculates how to fill all the holes with all the dirt, minimizing the amount of work (dirt moving) that must be done.
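In general EMD is solved as a transportation problem, but in one dimension it reduces to the area between the two normalized cumulative weight curves, which permits a compact sketch (illustrative only; our actual clusters live in a higher-dimensional MFCC space):

```python
def emd_1d(pos_a, w_a, pos_b, w_b):
    """Earth mover's distance between two weighted point sets on a line.
    Weights are normalized so both sides carry equal total mass."""
    total_a, total_b = sum(w_a), sum(w_b)

    def cdf(positions, weights, total, x):
        # fraction of mass at or to the left of x
        return sum(w for p, w in zip(positions, weights) if p <= x) / total

    pts = sorted(set(pos_a) | set(pos_b))
    work = 0.0
    for x0, x1 in zip(pts, pts[1:]):
        # mass imbalance to the left of x0 must cross the gap to x1
        work += abs(cdf(pos_a, w_a, total_a, x0) -
                    cdf(pos_b, w_b, total_b, x0)) * (x1 - x0)
    return work
```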

3.6.3 Symmetric K-L Distance

The symmetric Kullback-Leibler distance is one of many metrics EMD could use to calculate the work needed to move one unit of dirt from one place to another.¹ This distance metric uses the means and covariances of two clusters to calculate the distance between them. This accounts for the variation in cluster sizes: some are densely packed, others are quite sparse. The distance between two distributions $A$ and $B$ with means $\mu_a$ and $\mu_b$ and variances $\sigma_a^2$ and $\sigma_b^2$ may be calculated as

\[
KL(A; B) = \frac{1}{4} \sum_i \left[ \frac{\sigma_{a,i}^2}{\sigma_{b,i}^2} + \frac{\sigma_{b,i}^2}{\sigma_{a,i}^2} + (\mu_{a,i} - \mu_{b,i})^2 \left( \frac{1}{\sigma_{a,i}^2} + \frac{1}{\sigma_{b,i}^2} \right) - 2 \right] \tag{3.9}
\]
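For diagonal-covariance clusters, Equation 3.9 can be transcribed directly (sketch):

```python
def symmetric_kl(mu_a, var_a, mu_b, var_b):
    """Symmetric Kullback-Leibler distance between two Gaussian clusters
    with diagonal covariances (Eq. 3.9)."""
    return 0.25 * sum(
        va / vb + vb / va + (ma - mb) ** 2 * (1.0 / va + 1.0 / vb) - 2.0
        for ma, va, mb, vb in zip(mu_a, var_a, mu_b, var_b)
    )
```

Identical distributions yield a distance of zero, as a metric-like quantity should.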

1. Logan, Beth. Personal communication. 7 Mar. 2003.

Chapter 4

Generating Maps

An ideal but inefficient way to generate similarity judgments would be to directly compare feature vectors. In such a scheme, finding ten songs similar to Pearl Jam’s “Better Man” would entail comparing it with all songs in the library and choosing the ten most similar. This approach does not scale well with an increasing corpus size. A more compact and abstract representation of song relationships is needed. There are many neural networking techniques that are well suited to creating these sorts of abstract relationships. One such technique is the self-organizing map. One can think of a self-organizing map as a translation between a high-dimensional feature space and a low-dimensional topology (e.g. a tree or a grid). We used feature vectors as input to these various mapping techniques; our goal was to create a more compact representation of relationships between songs. This compact representation would be better suited to our lookup algorithms than raw feature vectors.

4.1 The Self-Organizing Map

The self-organizing map (SOM) is a widely used neural networking technique for clustering high dimensional vectors onto a two dimensional grid. Each cell in this grid is a map unit. The connections between the various map units define the topology of the map. The most common topologies are rectangular and hexagonal grids. Each map unit contains a model vector (or neuron) that represents the consensus of the input vectors represented by that map unit. The basic idea is to project high dimensional vectors onto a representative model neuron (Kohonen, 1982). These maps are created using an iterative training procedure. In the simplest form there is a fixed set of neurons. As the training progresses, each input vector is presented to the neural network. The neural network matches each input with the

most similar neuron. In this way each input vector is mapped into the topology of the SOM. The model vectors are gradually updated during this procedure to mold the map to fit the data. The traditional SOM has the disadvantage of requiring the user to know how many neurons the map should contain prior to training. The update rule for training the SOM is as follows, where $m_i$ represents the $i$th model vector in our set of model vectors $m$:

\[
m_i(t+1) =
\begin{cases}
m_i(t) + \alpha(t)\,[x(t) - m_i(t)], & i \in N_c(t) \\
m_i(t), & i \notin N_c(t)
\end{cases}
\]
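One training step of the simplest fixed-size SOM can be sketched as follows, with the neighborhood $N_c$ taken to be the winning unit and its neighbors within a given radius on a one-dimensional chain (real implementations use 2D grids and a decaying learning rate $\alpha$):

```python
def som_train_step(models, x, alpha, radius=1):
    """Present one input vector x to a 1-D chain of model vectors and
    apply the SOM update rule to the winner's neighborhood."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # find the best matching unit (most similar neuron)
    winner = min(range(len(models)), key=lambda i: sq_dist(models[i], x))
    for i in range(len(models)):
        if abs(i - winner) <= radius:  # i is in the neighborhood N_c
            models[i] = [m + alpha * (xj - m) for m, xj in zip(models[i], x)]
    return winner
```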

An extension to the traditional SOM is the growing neural gas (sometimes also referred to as a growing self-organizing map, GSOM). This type of map overcomes one of the principal limitations of the SOM in that new neurons are added on demand as the map is being trained (Fritzke, 1995). The algorithm uses thresholding to decide if the current set of model vectors is inadequately representing the input data. If this is the case, a new neuron is added to represent the data. In this way the topology of the map evolves over the training process. The final map is designed to map similar input vectors to the same neuron. This is a sort of clustering of our input data. One can begin to see the performance benefits of this sort of representation. Given a seed song, we can find the neuron that contains it in our map. If we simply select other songs that are in the same node (neuron) as this seed song, we can say that these songs are similar without ever having to explicitly compare any feature vectors.

4.2 The Growing Hierarchical Self-Organizing Map

While a SOM lets us drastically reduce or possibly eliminate any direct feature comparisons we would have to perform during similarity lookup, we can do better. As the number of songs begins to reach the millions, a flat 2D mapping becomes increasingly inadequate. The problem is similar to the reason we do not use direct feature comparisons: a 2D map with a large number of map units requires too many explicit comparisons to be useful on a large corpus. Hierarchical structures solve this problem and are traditionally used to group large amounts of data; an example would be the cataloging of books in a library (grouped by subject hierarchy). A hierarchical structure allows us to index a huge number of songs and perform similarity lookups on them quickly. To accomplish this task we extend the idea of a GSOM by adding hierarchy to form a growing hierarchical self-organizing map (GHSOM).

Figure 4.1: A view of a hierarchical SOM, showing layers 0 through 3.

This algorithm is the original SOM algorithm with the added idea of sub-maps. Each map unit may contain child maps that descend further in the hierarchy. Each node in the map still contains a model vector; however, a node now has the added ability to point to sub-maps that classify the data at greater levels of detail. Figure 4.1 gives a graphical representation of the relationship between a map and its sub-maps. One can also think of this as the parent-child relationship between nodes in a tree. The training process is similar to that of the GSOM, except that when the map is no longer adequately representing the input data, instead of always adding a new map unit, sometimes a whole new sub-map is generated. A new sub-map is created if the input vectors attached to a given map unit are not sufficiently similar. This algorithm has several nice properties. The map can represent our input data to an arbitrary degree of detail (as more and more map levels are generated). Also, the technique is completely unsupervised, in that it needs no prior knowledge of how many levels or how many map units the GHSOM should contain.

4.3 Our Application of SOMs

Self-organizing maps provide a perfect way to abstract the similarity relationships between songs. One can think of this as the third abstraction in our general architecture. The least abstract layer contains raw song data. The next level of abstraction is feature vectors. Next we use a GHSOM to create abstract relationships between feature vectors.

Our application of the GHSOM creates a partitioning of our corpus of songs. Once this partition is created, each part is repartitioned recursively. The topmost layer of the map can be thought of, by analogy, as dividing the songs into genres. However, the GHSOM does not place semantic meaning on its categories (e.g. genres) but instead creates categories based entirely on patterns in the input vectors. As you descend the hierarchy of the map, the distinctions between the vectors in the map units become finer.

This map creates a tree topology of songs. The leaves of the tree are collections of songs that are similar to each other. The non-leaf nodes do not contain songs, but rather contain links to the lower layers of the hierarchy. This hierarchical topology allows us to perform similarity lookups efficiently.

We use a GHSOM implementation from the Vienna University of Technology (Rauber et al., 2002). This implementation was specifically designed to cluster song feature vectors, although it will work equally well with any vectors. The implementation allows for the tuning of various facets of map creation, including the propensity for adding map units at a given map layer (controlling the width of the map) and the propensity for adding new sub-maps (controlling the depth of the map).

We made numerous enhancements to this implementation, including tools for visualizing the structure of the map, automatically linking song titles to the corresponding MP3s, and allowing alternate distance metrics to be plugged into the GHSOM. One enhancement was adding the notion of edge weights. Edge weights play a role in the lookup algorithms described in Chapter 5.

4.4 Edge Weights

Edge weights allow us to extend our tree-based topology of songs to a weighted tree topology. The weight of an edge in our tree is defined to be the distance between the model vectors of the child and parent map units. The distance metric is the same one used to generate the map. This gives us a notion of which child units at a given parent node are most similar to the parent. This added information allows us to make slightly more sophisticated similarity judgments.

4.5 Motivation for Maps and Hierarchical Maps

The tree-based topology of the GHSOM was a necessary abstraction over our raw song feature vectors. Given the Auditude architecture of many clients to one server, the server must be able to generate similarity lookups quickly. Direct feature vector comparisons between the seed song and the millions of songs in the database would take an inordinate amount of time. It is true that generating the hierarchical map uses many direct feature vector comparisons; however, this procedure need only be performed once. After the map is generated, a representation of it is cached on disk for use with similarity lookups. This hugely reduces the complexity of performing a lookup, and thus increases the number of clients an Auditude server could handle.

4.6 Expanding a GHSOM

Given that generating our song maps is costly in terms of time, it is important to determine how new songs can be added to the map without having to regenerate it. Algorithms for this on-the-fly map expansion are not specified by traditional GHSOM algorithms. However, we have come up with a heuristic that we feel will give good results.

1. Extract feature vector of new song

2. Begin at the root of the tree. Compare the feature vector to each of the model vectors of the map units, and follow the edge with the least distance.

3. Traverse the hierarchy of the tree until the song reaches a leaf.

4. Use the same thresholding mechanism of GHSOM to decide whether or not the leaf needs to be split into sub-maps.

This is only a temporary solution; if many songs are added after a tree is gener- ated, it will no longer give very accurate results.
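Steps 1-3 of the heuristic above amount to a greedy descent of the tree. A sketch, using a hypothetical node structure with model vectors and children (the threshold test of step 4 is omitted):

```python
class MapUnit:
    """Hypothetical GHSOM node: a model vector plus child sub-maps."""
    def __init__(self, model, children=None):
        self.model = model
        self.children = children or []

def find_leaf(root, feature):
    """Descend the hierarchy, at each level following the child whose
    model vector is nearest the new song's feature vector (steps 2-3)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    node = root
    while node.children:
        node = min(node.children, key=lambda c: sq_dist(c.model, feature))
    return node
```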

4.7 Multiple Trees and Multiple Features

Given our three different feature vectors, there are a large number of possible maps that we could generate. We generated maps of each of the features by themselves, as well as maps using combinations of various features. There are issues with this combination approach: given the various dimensionalities of the features, one feature has a tendency to outweigh the others when the feature vectors are compared.

An alternative to this combination approach is to generate many separate maps and use them in tandem when performing lookups. This allows us not only to weight each feature differently in the lookup procedure, but also to use a different distance metric with each feature. This is particularly useful with our timbre feature vectors, which can be compared with the earth mover’s distance metric (see Section 3.6), generating a much more precise genre match than conventional Euclidean distance. Once a good set of weights is determined for the individual tree approach, these weights could be used to build a combined feature map. A combined feature vector could be created that is simply the concatenation of the feature vectors from the individual features. The appropriate components of this combined feature vector could then be scaled using a good set of feature weightings.

4.8 Map Quality

The maps we generated seemed to produce mostly meaningful results. Typically, songs from the same artist would end up very near each other in the generated map. However, there were a number of anomalies in which music from completely opposite genres (e.g. rap and classical) was grouped together. Evaluating these maps and their groupings is still an open issue. User testing would be helpful in evaluating the quality of the maps we produced with the GHSOM.

Chapter 5

Similarity Lookups

The ultimate goal of our clinic project is to generate similarity judgments about songs. The first three steps towards achieving this goal have been outlined in Chapters 2, 3 and 4. Now that we have a representation of both the feature vectors of a song and a topology of song relations on the corpus of songs, we can begin to perform similarity lookups. The general lookup framework is to use the topology of our cached tree to narrow the search space, and then use direct feature vector comparisons to generate the final ranked list of songs. The ordering of this list is generated entirely by direct feature vector comparisons.

5.1 Similarity Lookups Defined

Your friend has just given you a copy of a great new song (legally, of course). You fall in love with the song and want to determine other songs that you should purchase. Traditionally, you would ask your friend what other songs he or she thought were similar to this new song. Our system provides a completely automated way of doing this. Given a seed song, our system generates a ranked list of songs that it deems to be similar to this seed song. This knowledge is intended either to be displayed directly to the user or to be fed into our playlist generation algorithm to generate a pleasant sequence for this list of similar songs.

5.2 Topological Distance

Topological distance is encoded entirely by the structure of our generated hierarchical map. The idea behind generating a map was to be able to narrow down the search space without having to do direct feature vector comparisons. Topological distance is what allows us to do this. Songs that are in the same leaf of the tree are said to have a topological distance of 0. The topological distance between any other two songs is defined to be the sum of the edge weights along the path between the leaves containing each of these songs (see Section 4.4).
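A sketch of this definition, using a hypothetical node type that stores a parent pointer and the weight of the edge to its parent:

```python
class TreeNode:
    """Hypothetical map-tree node with a weighted edge to its parent."""
    def __init__(self, parent=None, edge_weight=0.0):
        self.parent = parent
        self.edge_weight = edge_weight

def topological_distance(a, b):
    """Sum of edge weights on the path between leaves a and b."""
    # cumulative weight from a up to each of its ancestors
    up_from_a, node, total = {}, a, 0.0
    while node is not None:
        up_from_a[id(node)] = total
        total += node.edge_weight
        node = node.parent
    # climb from b until we meet an ancestor of a
    node, total = b, 0.0
    while id(node) not in up_from_a:
        total += node.edge_weight
        node = node.parent
    return total + up_from_a[id(node)]
```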

5.3 Complexity

The complexity of performing a similarity lookup is

\[
O\bigl(\log n + r \times d_{mc}(L)\bigr) \tag{5.1}
\]

where $n$ is the number of songs in the corpus, $r$ is the number of songs in the result set, $L$ is the dimensionality of the feature vectors, and $d_{mc}$ is the complexity of the distance metric being used to compare the feature vectors.

5.4 Multiple Features

In Section 4.7 we discussed the tradeoffs between generating one tree using multiple concatenated feature vectors and generating multiple trees and combining them in the lookup algorithms. Our lookup algorithms support a multiple tree, multiple feature set model. We accomplish this by giving each feature a relative weight. This weight defines how much the similarity lookups will be influenced by each of the various features. For instance, if the user wished to bias his or her similarity judgments towards choosing songs in the same genre, he or she could increase the weight of the MFCC feature, since it is good at distinguishing musical genres.

5.5 Algorithms for Lookups

There are two cases for our similarity lookups. The first is the user supplying a seed song and requesting recommendations from the entire Auditude library. The second is the user supplying a seed song and requesting recommendations from the collection of songs on his or her own computer. This second algorithm was not implemented in our final code; however, we have included both algorithms for completeness.

5.5.1 Definitions

While outlining our lookup algorithm it makes sense to define some notational shortcuts. These notations cover aspects of the topology tree and distances between feature vectors.

Topological distance ($top(s_1, s_2)$) is the sum of the edge weights along the path from $s_1$ to $s_2$. For example, two songs whose paths branch off near the bottom of the tree are more likely to be closer than ones that split off at the root.

Feature distance ($feature(s_1, s_2)$) is the distance between $s_1$ and $s_2$ as defined by direct feature vector comparisons. The distance is the sum of the distances over all features in the system, weighted by the relative feature weights. The specific distance metric used to calculate the distance for a specific feature may be unique to that feature. For example, the sone feature vectors might be compared with Euclidean distance, while the MFCC vectors might be compared with earth mover’s distance.

It is helpful to define some notational conventions for our features.

F = the set of all features
fw(f) = the relative feature weight of feature f
Node(q, f) = the node N that contains song q for feature f

5.5.2 Recommendation from Entire Library The user wishes to find r songs that are similar to a query song q from the entire database of music.

1. Determine the collection that contains the query song

\[
\forall f \in F, \quad N_f = \mathrm{Node}(q, f) \tag{5.2}
\]

2. Determine the number of needed songs for each feature f. The algorithm narrows the search space down to approximately 2r songs (where r is the number of desired results). This decision was made to allow the topological filtering to pull in more varied songs.

\[
\mathrm{target}_f = 2r \, \frac{fw(f)}{\sum_i fw(f_i)} \tag{5.3}
\]

3. For each feature f in the set F, add all songs that are stored in the collection at node N_f to the results. If this number is greater than target_f, stop and proceed to the next step. If not, add in all songs that are descendants of the parent of N_f, let N_f become its parent, and repeat this step.

4. Reduce the size of the result set by retaining the r songs that are closest to q.
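For a single feature, steps 1-4 can be sketched as follows, assuming hypothetical lookup tables for the tree structure (leaves holding songs, nodes knowing their parents) and a feature-vector store:

```python
def recommend(q, r, leaf_of, parent_of, songs_under, feature, dist):
    """Sketch of the entire-library lookup for one feature.
    leaf_of[s]     -> leaf node containing song s
    parent_of[n]   -> parent of node n (None at the root)
    songs_under[n] -> all songs in the subtree rooted at n
    feature[s]     -> feature vector of song s
    dist(u, v)     -> feature distance metric
    """
    target = 2 * r                        # step 2 (single feature)
    node = leaf_of[q]                     # step 1
    while len(songs_under[node]) < target and parent_of[node] is not None:
        node = parent_of[node]            # step 3: widen the search space
    candidates = [s for s in songs_under[node] if s != q]
    candidates.sort(key=lambda s: dist(feature[q], feature[s]))
    return candidates[:r]                 # step 4: keep the r closest
```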

5.5.3 Recommendation from User’s Collection

There are two ways that a recommendation could be generated from a user’s collection. The first is to perform feature extraction and map generation on the user’s machine. With this information, the algorithm for performing a lookup on the user’s set of songs reduces to the algorithm outlined in Section 5.5.2. However, for both performance and intellectual property reasons, Auditude prefers approaches where the majority of the computation is performed on a central server (see Section 1.3). Instead, the lookup will be performed on the server in the same manner as for the entire-library case, the only difference being that the search space is initially restricted to songs from the user’s collection. The user will request this type of search mainly when he or she is making a playlist, whereas in the entire-library case the user is most likely looking for similar music to purchase.

5.5.4 Performance

By exploiting the topological structure of the tree, we can drastically reduce the number of explicit feature vector comparisons that need to be made. This use of hierarchical structure dramatically reduces the complexity of our lookup algorithms.

Additionally, preprocessing could be used to calculate and cache feature distances between songs that are topologically close in the tree, since such songs are very likely to have direct feature comparisons performed on them. We leave an exploration of this preprocessing as future work.

Chapter 6

Playlist Generation

In this chapter, we will show how our similarity metrics can be used to create playlists that maximize inter-song similarity. Such playlists can demonstrate that our similarity metrics are valid and show one of the ways in which similarity information can be used. There are many ideas of what a 'good' playlist might be. Some people like the transitions between songs to be smooth. Others might like loudness or beat to be similar, or similar genres grouped together. Still others despise it when two songs by the same artist are adjacent on the playlist. By comparing the feature vectors of songs using either cosine or Euclidean distances, we are able to obtain a measure of similarity. Even then, however, our different feature vectors will find different areas of similarity between songs. For example, the sone feature vectors will order songs of similar loudness while the beat spectrum will order songs of similar beat, and these orderings might be completely different. Depending on what type of playlist we wish to generate, the choice and combination of feature vectors matters.
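The two distance computations mentioned here are straightforward; a short sketch follows. The feature vectors are made up for illustration, and note that the report's 'cosine distance' is a similarity: values near one indicate similar songs.

```python
import math

def euclidean_distance(u, v):
    # Straight-line distance between two feature vectors; smaller = more similar.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    # Cosine of the angle between the vectors: 1.0 means identical direction.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical sone (loudness) feature vectors for two songs.
song_a = [0.8, 0.4, 0.1]
song_b = [0.7, 0.5, 0.2]
similarity = cosine_similarity(song_a, song_b)   # close to 1.0 here
```

Because cosine similarity ignores overall vector magnitude, it compares the shape of a loudness or beat profile rather than its absolute level, which is often what we want after volume normalization.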

6.1 Complexity of Generating an Optimal Playlist

It turns out that playlist generation is a more difficult problem than originally anticipated, which becomes clear when we recognize it as an instance of a known problem in computer science: shortest Hamiltonian path (SHP). The ordinary Hamiltonian path problem attempts to find a path that includes each vertex of an arbitrary graph exactly once, whereas the SHP problem involves finding the shortest such path through all vertices in a fully connected graph. Our playlist problem, in which we want a sequence of songs that maximizes inter-song similarity, corresponds exactly to this problem: each song is a vertex in the graph, and the distances between vertices correspond to the similarity between songs. Furthermore, if we wish to 'loop' the playlist such that the last song in the playlist returns to the first song, then playlist generation reduces to the traveling salesman problem (TSP). Both SHP and TSP are NP-complete, so generating the optimal playlist is an NP-complete problem.

[Figure 6.1: Envisioning playlist generation as a shortest path problem. The panels show six songs A-F and the distances between them: (a) songs & distances; (b) shortest Hamiltonian path (SHP); (c) shortest tour (TSP); (d) Hamiltonian path using shortest edges; (e) tour with shortest edges.]

Figure 6.1(a) shows an example of six songs that have been embedded in a two-dimensional space; the Euclidean distance between two songs indicates how similar they are to each other. Figure 6.1(b) shows the shortest Hamiltonian path, whereas Figure 6.1(c) shows the best tour solving the TSP. With only six songs, we are able to calculate the optimal path quickly, but the computational time required to find the optimal path grows exponentially as we try to order more songs.

Towards the end of the project, we realized that minimizing the total distance between all songs in the playlist may not necessarily produce the best listening experience. It is possible that the variations of TSP and SHP shown in Figures 6.1(d) and 6.1(e), which minimize the longest single edge used rather than the total length, may produce more pleasing results. The TSP variant is known as the bottleneck traveling salesman problem.
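For a handful of songs, the optimal ordering can indeed be found by exhaustive search over all permutations; the factorial blow-up is exactly what makes this infeasible for longer playlists. A sketch with hypothetical pairwise distances (songs placed on a line):

```python
from itertools import permutations

def path_length(order, dist):
    # Sum of transition distances along a playlist ordering.
    return sum(dist[a][b] for a, b in zip(order, order[1:]))

def optimal_playlist(songs, dist):
    # Shortest Hamiltonian path by brute force: examines all n! orderings.
    return min(permutations(songs), key=lambda p: path_length(p, dist))

# Six hypothetical songs embedded on a line; distance = |position difference|.
pos = {"A": 0, "B": 1, "C": 3, "D": 6, "E": 10, "F": 11}
dist = {a: {b: abs(pos[a] - pos[b]) for b in pos} for a in pos}
best = optimal_playlist(list(pos), dist)
```

With six songs this is 720 orderings; with twenty songs it is already over 10^18, which is why the approximation and heuristic methods below are needed.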

6.2 A Greedy Algorithm

For a small number of songs, an approximation of the optimal ordering might still yield a pleasing playlist, and thus a greedy approach will suffice. We pick one starting song (either specified by the user or chosen at random) and repeatedly move to the closest song not yet in the playlist, until every song has been placed. A better playlist might be found by running this algorithm with every song as the starting point, and then picking the playlist that yields the smallest total distance. The pitfalls of this approach are apparent: the first few transitions will sound very similar, but the last few might sound awful. Better algorithms must be used if we wish to generate a near-optimal playlist.
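The greedy procedure, including the try-every-start refinement, can be sketched as follows. This is an illustrative reconstruction rather than the project's code; the line-embedded songs are hypothetical.

```python
def greedy_playlist(start, songs, dist):
    """Nearest-neighbor ordering: from the current song, always move to the
    closest song not yet in the playlist."""
    playlist = [start]
    remaining = set(songs) - {start}
    while remaining:
        nxt = min(remaining, key=lambda s: dist[playlist[-1]][s])
        playlist.append(nxt)
        remaining.remove(nxt)
    return playlist

def best_greedy_playlist(songs, dist):
    # Try every song as the starting point and keep the shortest result.
    def length(p):
        return sum(dist[a][b] for a, b in zip(p, p[1:]))
    return min((greedy_playlist(s, songs, dist) for s in songs), key=length)

# Hypothetical songs on a line; distance = |position difference|.
pos = {"A": 0, "B": 1, "C": 3, "D": 6, "E": 10, "F": 11}
dist = {a: {b: abs(pos[a] - pos[b]) for b in pos} for a in pos}
playlist = best_greedy_playlist(list(pos), dist)
```

On this particular example the greedy approach happens to find the optimum, but in general nothing forces it to: the last remaining songs can be arbitrarily far from the current one.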

6.3 Approximation Algorithms

There are several polynomial-time approximation algorithms for the traveling salesman problem. It is important to note that these algorithms only guarantee a solution for a 'loop' path, so that the beginning song of a playlist must also be the last song. Furthermore, the approximation algorithms require that the edge weights satisfy the triangle inequality (which our feature-vector distances do). We will briefly

describe two approximation algorithms, detailed in Chen's notes on approximation algorithms (Chen, 2003).

6.3.1 x2 Algorithm

The x2 algorithm returns a tour which is no worse than twice as long as the optimal tour. The algorithm is as follows:

1. Select a vertex r to be the ‘root’

2. Find a minimum spanning tree T from root r using Prim’s algorithm

3. Let L be the list of vertices visited in a preorder tree walk of T (a preorder tree walk visits the root first, then the left child, then the right child)

4. Return the Hamiltonian cycle H that visits the vertices in the order L

The total running time of this algorithm is O(n²), due to the complexity of Prim's algorithm. For more information on this algorithm, see Cormen et al. (2001, sec. 35.2).
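The four steps above can be sketched as follows (an illustrative Python version, not the project's code; the example distance matrix is hypothetical). Note this simple Prim implementation scans all candidate edges each iteration, so the sketch is O(n³); the O(n²) bound requires the standard key-array form of Prim's algorithm.

```python
def approx_tour(vertices, dist, root):
    """2-approximation TSP tour: build a minimum spanning tree from the root
    with Prim's algorithm, then visit vertices in preorder of that tree."""
    # Prim's algorithm: repeatedly add the cheapest edge leaving the tree.
    in_tree = {root}
    parent = {}
    while len(in_tree) < len(vertices):
        u, v = min(((u, v) for u in in_tree for v in vertices if v not in in_tree),
                   key=lambda e: dist[e[0]][e[1]])
        parent[v] = u
        in_tree.add(v)
    # Preorder walk of the spanning tree.
    children = {v: [] for v in vertices}
    for v, u in parent.items():
        children[u].append(v)
    order = []
    def walk(u):
        order.append(u)
        for c in children[u]:
            walk(c)
    walk(root)
    return order + [root]   # close the cycle back to the root

# Hypothetical songs on a line; distance = |position difference|.
pos = {"A": 0, "B": 1, "C": 3, "D": 6, "E": 10, "F": 11}
dist = {a: {b: abs(pos[a] - pos[b]) for b in pos} for a in pos}
tour = approx_tour(list(pos), dist, "A")
```

The 2x guarantee comes from the triangle inequality: the preorder walk shortcuts a double traversal of the MST, and the MST weighs no more than the optimal tour.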

6.3.2 x1.5 Algorithm (Christofides' Algorithm)

This algorithm was developed by Nicos Christofides, and guarantees a tour which is no worse than 1.5 times as long as the optimal tour. The algorithm is slightly different from the x2 algorithm:

1. Select a vertex r to be the ’root’.

2. Find a minimum spanning tree T1 from root r.

3. Let S be the set of vertices with an odd degree.

4. Find a minimum weight matching M on S.

5. Find an Euler tour T2 from new graph T1 + M

6. Construct the Hamiltonian path by ’short-cutting’ T2 (visit the vertices in order of T2 only if they have not been encountered yet)

The running time of this algorithm is O(n³).

6.4 Genetic Algorithms

Genetic algorithms provide a heuristic approach to solving the traveling salesman problem and related problems, and have been widely studied since the early 1970s (Larrañaga et al., 1999). In general, a genetic algorithm simulates the process of evolution by following the rules of natural selection. This technique does not guarantee the optimal solution, but yields a good approximation to an NP-complete problem in a reasonable amount of time. This is the method we ended up implementing for playlist generation.

A genetic algorithm consists of three basic processes: mutation, crossover, and selection. Mutation occurs when a gene site is randomly replaced by another gene. Crossover involves taking two parent gene strings and combining them such that the genes of the child are determined by the genes of both parents. Finally, selection determines whether a new gene string is 'fit' to survive compared to other gene strings. We can easily apply these concepts to playlist generation: in our case, every song is a gene, a gene string is a particular ordering of the songs, and the fitness criterion is the total similarity distance of that ordering.

There are many different ways to solve the traveling salesman problem using a genetic algorithm, depending on which crossover methods and fitness functions are used. Several of these methods are described by Larrañaga et al. (1999). In our genetic algorithm implementation, we used an order-based crossover method (Syswerda, 1991). This crossover method selects several random gene positions from one parent and imposes these positions on the other parent, resulting in a new child. One of the attractive features of the genetic-algorithm approach is that the fitness function provides a good deal of flexibility. Our current code can seek solutions for either TSP or SHP and can also attempt to find the opposite, 'longest path', solutions (i.e., playlists that maximize contrast between songs).
Although our code does not currently address the bottleneck versions of these problems, a trivial change to the fitness function would allow these solutions to be found.
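A minimal sketch of this approach follows. It is not the team's actual implementation; the crossover is order-based in the spirit of Syswerda's operator, the fitness is the SHP-style total transition distance, and the line-embedded example songs are hypothetical.

```python
import random

def fitness(order, dist):
    # Total transition distance along the playlist; smaller is better.
    return sum(dist[a][b] for a, b in zip(order, order[1:]))

def crossover(p1, p2):
    """Order-based crossover: copy several randomly chosen positions from p1,
    then fill the gaps with the remaining songs in the order they occur in p2."""
    n = len(p1)
    keep = set(random.sample(range(n), n // 2))
    kept_songs = {p1[i] for i in keep}
    fill = (s for s in p2 if s not in kept_songs)
    return [p1[i] if i in keep else next(fill) for i in range(n)]

def mutate(order):
    # Swap two random positions (a song is a gene; an ordering is a gene string).
    i, j = random.sample(range(len(order)), 2)
    order[i], order[j] = order[j], order[i]

def evolve(songs, dist, pop_size=30, generations=200, mutation_rate=0.2):
    pop = [random.sample(songs, len(songs)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda o: fitness(o, dist))
        pop = pop[:pop_size // 2]                      # selection: fittest half
        while len(pop) < pop_size:
            a, b = random.sample(pop[:pop_size // 2], 2)
            child = crossover(a, b)
            if random.random() < mutation_rate:
                mutate(child)
            pop.append(child)
    return min(pop, key=lambda o: fitness(o, dist))

# Hypothetical songs on a line; distance = |position difference|.
pos = {"A": 0, "B": 1, "C": 3, "D": 6, "E": 10, "F": 11}
dist = {a: {b: abs(pos[a] - pos[b]) for b in pos} for a in pos}
best = evolve(list(pos), dist)
```

Seeking a maximally contrasting 'longest path' playlist is just a matter of negating (or inverting the comparison on) the fitness function, which illustrates the flexibility noted above.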

Chapter 7

User Testing

There is no guaranteed correlation between what simple numbers deem similar and what actual people deem similar when they hear songs. For this reason, user testing becomes critical as a means to determine whether similarity recommendations are sound. While our user testing was by no means comprehensive, it served as a useful 'reality check'.

7.1 First User Test

Our first user test was simple, and was intended to find out which feature vectors lent themselves better to various qualities of playlists. We picked twelve songs from our test corpus and had users rate four different orderings of the songs. Using the genetic algorithm, we generated four different playlists based on beat transition, beat, sone, and the combination of the previous three. Users were then asked to rate, on a scale of 1 to 5 (1 being worst, 5 being best), the following qualities:

Song Transition: Does the end of one song flow 'smoothly' into the next?
Beat: Do consecutive songs have similar overall beat/tempo?
Loudness: Do consecutive songs have similar overall loudness?
Genre: Does the playlist group songs into their specific genres well?
Overall: Is this a good ordering for a playlist overall?

The Test Corpus

The user test included the following 12 songs:

Abba  "Dancing Queen"
ACDC  "Highway to Hell"
Beach Boys  "California Girls"

Beatles  "Day Tripper"
Beatles  "Here Comes the Sun"
Bee Gees  "Stayin' Alive"
Chordettes  "Mr. Sandman"
Cindy Lauper  "Girls Just Wanna Have Fun"
Jimi Hendrix  "Day Tripper"
Madonna  "Material Girl"
The Doors  "Break on Through"
The Doors  "People are Strange"

7.1.1 Results

We tested 3 users, who each rated the four playlists. The averages of the rated qualities are as follows:

Playlist  Description  Song Trans.  Beat  Loudness  Genre  Overall
1         Beat Trans.  3.0          2.67  2.33      3.0    3.67
2         Beat         3.33         3.0   2.33      2.33   2.67
3         Sone         4.0          3.33  2.67      3.33   3.67
4         All          4.0          4.67  3.67      2.67   3.67

7.1.2 Analysis

Although the data we obtained is simply not enough to draw any substantial conclusion, there are some trends to be noted. While playlists generated by individual feature vectors received lower averages, Playlist 4, which combined all three feature vectors, received higher averages. This somewhat confirmed our hypothesis that we could generate a better playlist by using more feature vectors at once for comparison. In retrospect, we realized that the 'loudness' quality of a playlist was somewhat of a misunderstood judgment: some songs were relatively louder than others when users listened to them on their computers, even though we normalized the volume before calculating sone. Furthermore, all three of our users complained about the selection of songs as a test corpus, noting the extreme differences in genre (e.g., the Chordettes' "Mr. Sandman" and ACDC's "Highway to Hell"). While we originally envisioned an excellent ordering of the corpus as grouping the songs into three main genres (80's, Easy Rock, and Rock), we realized that in practice, ordinary users normally would not place these songs in the same playlist. The end result was playlists that users thought contained 'weird' and 'awful' transitions.

It is worth noting that while Playlist 4 ordered Jimi Hendrix's and The Beatles' versions of "Day Tripper" next to each other, one user simply did not like it when two versions of the same song were played consecutively in a playlist. In this case, a 'maximally similar' pairing of two songs was considered a 'bad' sequence, showing that numerical similarity does not always correspond to people's conception of a 'pleasing' ordering.

7.2 Second User Test

Our second user test was designed to gauge how well the users' conception of similarity between songs compared with the feature vectors' measure of similarity. For our test songs, we chose 50 songs from our corpus and extracted a 10-second clip from the 30-40 second region of each song. Again using the genetic algorithm, we generated four playlists: two finding the 'best' and 'worst' orderings of the beat and sone vectors combined, one finding the 'best' ordering of just the sone feature vectors, and one randomly generated playlist. We asked users to rate each 'pairing' of songs on a scale of 1 to 5. There were 50 songs, hence 49 pairings to rate. Since these were clips of songs, we asked users not to rate the pairing based on transition, but rather on how 'similar' they judged the pairing to be, as if they were listening to the entire songs. It is important to note that our clinic team hand-picked the test corpus for this user test; these were not songs selected by our similarity lookup engine. While our method guaranteed that some songs would be different, the latter method may have picked a test corpus more suitable for playlist generation.

7.2.1 Results

After obtaining data on how each user perceived similarity between each pairing of songs, we averaged the values for each pairing. We then plotted those values against the cosine-distance similarity values from our feature vectors to see if there was any correlation between the users' and feature vectors' judgments of similarity. Figure 7.1(a) shows data from playlist 1 (the 'worst' ordering using the beat and sone feature vectors). We can see that for some of the pairings there is indeed some correlation between judgments of similarity, while for other pairings there is absolutely no correlation. We also noticed that users' judgments tended to vary erratically. Figure 7.1(b) shows data from playlist 2 (the 'best' ordering using the beat and sone feature vectors). Again, users' judgments varied quite a bit between pairings. There is reasonably good correlation over song pairings 1-10, but in song pairings 10-35, where the feature vectors say the pairings are good, users did not see any consistent pattern.

[Figure 7.1: Comparing machine and human similarity judgments. (a) Judgments for the 'bad' playlist; (b) judgments for the 'good' playlist. Each panel plots normalized similarity judgment (0.2-1.1) against song pairing (0-50), with one curve for similarity based on feature vectors and one for the users' judgment of similarity.]

7.2.2 Analysis

We noticed that when the feature vectors saw good similarity between pairings of songs, users tended to agree sometimes, but not as often as we had hoped. Because of the erratic ratings users gave the pairings, we believe that our test corpus was not similar enough to begin with. Many users complained about the system's inability to pair rap songs with other, non-rap songs. We believe that a playlist generated from a similarity lookup on a very large corpus would yield better results.
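One way to quantify the agreement described here is a Pearson correlation between the users' averaged ratings and the feature vectors' similarity values per pairing. The report does not state that this statistic was computed, so the sketch below is a suggestion, and the ratings in it are entirely made up.

```python
import math

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: users' average rating per consecutive pairing (rescaled
# to 0-1) and the feature vectors' cosine similarity for the same pairings.
user_ratings = [0.8, 0.6, 0.9, 0.3, 0.5]
machine_sims = [0.95, 0.70, 0.97, 0.55, 0.65]
r = pearson(user_ratings, machine_sims)   # values near +1 indicate agreement
```

A single number like this would complement the per-pairing plots, summarizing over all 49 pairings how closely the two judgments track each other.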

7.3 Professor O’Neill’s Playlist

Out of curiosity, our clinic advisor Professor O'Neill wanted to see how our playlist generation method compared to her personal ordering of a small collection of songs and to handmade orderings of the same songs by members of the clinic team. The test corpus included songs from the electronica, rock, and new-age genres. Professor O'Neill's original playlist was grouped first by artist and then in order of decreasing "aggressiveness".

The Chemical Brothers  Come with Us
New Order  Crystal
Lloyd Cole  Too Much E
Lloyd Cole  Butterfly
Lloyd Cole  Rattlesnakes
Lloyd Cole  Like Lovers Do (Stephen Street Mix)
Kings of Convenience  Failure
Kings of Convenience  Leaning against the Wall
OP8  If I Think of Love
Belle & Sebastian  The Stars of Track and Field
Belle & Sebastian  Like Dylan in the Movies
Dead Can Dance  American Dreaming
Dead Can Dance  Nierika
Moby  Natural Blues
Moby  Everloving
David van Tieghem  Deep Sky
David van Tieghem  A Wing and a Prayer
Patrick O'Hearn  Sacrifice

7.3.1 Random Playlists The participants in the test, both human and machine, did not have the advantage of knowing Professor O’Neill’s preferred order, nor did they have much familiarity with the songs themselves. Before showing the human- and machine-determined results, let us examine two random orderings of the songs:

David van Tieghem  Deep Sky
Patrick O'Hearn  Sacrifice
Lloyd Cole  Like Lovers Do (Stephen Street Mix)
Dead Can Dance  Nierika
Kings of Convenience  Failure
David van Tieghem  A Wing and a Prayer
Lloyd Cole  Too Much E
Moby  Natural Blues
Belle & Sebastian  The Stars of Track and Field
Kings of Convenience  Leaning against the Wall
OP8  If I Think of Love
Lloyd Cole  Rattlesnakes
New Order  Crystal
Lloyd Cole  Butterfly
Belle & Sebastian  Like Dylan in the Movies
Moby  Everloving
The Chemical Brothers  Come with Us
Dead Can Dance  American Dreaming

Lloyd Cole  Like Lovers Do (Stephen Street Mix)
New Order  Crystal
OP8  If I Think of Love
Moby  Natural Blues
The Chemical Brothers  Come with Us
Kings of Convenience  Leaning against the Wall
Patrick O'Hearn  Sacrifice
Lloyd Cole  Butterfly
Lloyd Cole  Too Much E
Kings of Convenience  Failure
Dead Can Dance  Nierika
Dead Can Dance  American Dreaming
Belle & Sebastian  The Stars of Track and Field
Lloyd Cole  Rattlesnakes

David van Tieghem  Deep Sky
Moby  Everloving
David van Tieghem  A Wing and a Prayer
Belle & Sebastian  Like Dylan in the Movies

One thing we can observe from these random playlists is that it is not uncom- mon for two songs by the same artist to be grouped together. Random playlists contain both bad and good transitions, but human ingenuity can often rationalize even the strangest of transitions.

7.3.2 Nick's Playlist

Nick's ordering of the songs most closely matched Professor O'Neill's playlist. There are some differences, but Nick's list generally places "rock" music first and then transitions into calmer music.

The Chemical Brothers  Come with Us
New Order  Crystal
Lloyd Cole  Too Much E
Moby  Natural Blues
Dead Can Dance  Nierika
Moby  Everloving
Lloyd Cole  Rattlesnakes
Lloyd Cole  Butterfly
Lloyd Cole  Like Lovers Do (Stephen Street Mix)
OP8  If I Think of Love
Dead Can Dance  American Dreaming
Kings of Convenience  Failure
Belle & Sebastian  Like Dylan in the Movies
Belle & Sebastian  The Stars of Track and Field
Kings of Convenience  Leaning against the Wall
David van Tieghem  Deep Sky
David van Tieghem  A Wing and a Prayer
Patrick O'Hearn  Sacrifice

7.3.3 Liz's Playlist

Liz's playlist reflects different choices than Nick and Professor O'Neill made.

David van Tieghem  Deep Sky
The Chemical Brothers  Come with Us

Dead Can Dance  Nierika
Lloyd Cole  Too Much E
Kings of Convenience  Failure
Lloyd Cole  Like Lovers Do (Stephen Street Mix)
Lloyd Cole  Rattlesnakes
Kings of Convenience  Leaning against the Wall
Moby  Natural Blues
Belle & Sebastian  The Stars of Track and Field
David van Tieghem  A Wing and a Prayer
Moby  Everloving
Patrick O'Hearn  Sacrifice
New Order  Crystal
Lloyd Cole  Butterfly
Belle & Sebastian  Like Dylan in the Movies
OP8  If I Think of Love
Dead Can Dance  American Dreaming

7.3.4 Brad's Playlist

Brad's playlist was different again, although we can see common themes with the other playlists.

Moby  Natural Blues
The Chemical Brothers  Come with Us
New Order  Crystal
Dead Can Dance  American Dreaming
Kings of Convenience  Failure
Lloyd Cole  Butterfly
OP8  If I Think of Love
Moby  Everloving
Kings of Convenience  Leaning against the Wall
Belle & Sebastian  Like Dylan in the Movies
Belle & Sebastian  The Stars of Track and Field
Lloyd Cole  Too Much E
Lloyd Cole  Like Lovers Do (Stephen Street Mix)
Lloyd Cole  Rattlesnakes
Dead Can Dance  Nierika
David van Tieghem  A Wing and a Prayer
Patrick O'Hearn  Sacrifice
David van Tieghem  Deep Sky

What we can learn from looking at the orderings of these songs by four different people is that there is no single "correct" playlist. Different people will weigh different aspects of the songs in different ways. We should, therefore, expect to see similar findings in machine-generated playlists.

7.3.5 Loudness-Based Playlist

A playlist based on loudness sensation seems promising. Feature-vector comparisons are based on cosine distance, with the actual distances shown in the playlist below; values closer to one indicate better matches. Notice that several songs by the same artist have been grouped together, more than we would expect by chance alone. Some of the songs by the same artist that are not grouped together match reasonable expectations; the two Moby songs, for example, are quite different and were also split up by Brad, Liz, and Nick in their playlists.

The Chemical Brothers  Come with Us  0.936
Moby  Natural Blues  0.959
OP8  If I Think of Love  0.953
Belle & Sebastian  Like Dylan in the Movies  0.948
Belle & Sebastian  The Stars of Track and Field  0.954
Dead Can Dance  American Dreaming  0.956
Patrick O'Hearn  Sacrifice  0.921
Kings of Convenience  Leaning against the Wall  0.921
Lloyd Cole  Too Much E  0.954
Lloyd Cole  Rattlesnakes  0.951
Lloyd Cole  Butterfly  0.941
New Order  Crystal  0.968
Lloyd Cole  Like Lovers Do (Stephen Street Mix)  0.944
Kings of Convenience  Failure  0.951
Dead Can Dance  Nierika  0.931
David van Tieghem  Deep Sky  0.965
David van Tieghem  A Wing and a Prayer  0.921
Moby  Everloving

7.3.6 Beat-Based Playlist

Beat also appears to give a good playlist. In this case, we have used the fft of the beat spectrum for playlist generation, again with cosine distance for comparison. We can also see that this playlist shares commonalities with the previous playlist: three Lloyd Cole songs are grouped together (interestingly, not the same three),

and it has done well at grouping "new age" songs together. One of the strangest transitions, from Dead Can Dance's "Nierika" to Lloyd Cole's "Too Much E", is clearly marked as a poor transition, with a cosine distance of only 0.630; it was apparently the best transition our algorithm could find.

Dead Can Dance  Nierika  0.630
Lloyd Cole  Too Much E  0.768
Lloyd Cole  Rattlesnakes  0.976
Lloyd Cole  Like Lovers Do (Stephen Street Mix)  0.962
OP8  If I Think of Love  0.981
Kings of Convenience  Leaning against the Wall  0.990
New Order  Crystal  0.990
David van Tieghem  A Wing and a Prayer  0.999
David van Tieghem  Deep Sky  0.988
Patrick O'Hearn  Sacrifice  0.981
Dead Can Dance  American Dreaming  0.967
Moby  Everloving  0.946
Belle & Sebastian  Like Dylan in the Movies  0.934
Kings of Convenience  Failure  0.901
Belle & Sebastian  The Stars of Track and Field  0.913
Lloyd Cole  Butterfly  0.902
Moby  Natural Blues  0.815
The Chemical Brothers  Come with Us

7.3.7 Timbre-Based Playlist

A playlist based on timbre also yields relatively good results. In the playlist below, we have used the fft of the mfccs with cosine distance as the metric. Again the Lloyd Cole songs are grouped together; this time all four songs end up together, presumably because of the singer's distinctive voice and the predominance of the guitar.

David van Tieghem  A Wing and a Prayer  0.963
The Chemical Brothers  Come with Us  0.988
Dead Can Dance  American Dreaming  0.971
New Order  Crystal  0.979
Lloyd Cole  Rattlesnakes  0.972
Lloyd Cole  Too Much E  0.959
Lloyd Cole  Like Lovers Do (Stephen Street Mix)  0.992
Lloyd Cole  Butterfly  0.987
Moby  Natural Blues  0.990

OP8  If I Think of Love  0.987
Belle & Sebastian  Like Dylan in the Movies  0.981
Belle & Sebastian  The Stars of Track and Field  0.984
Kings of Convenience  Failure  0.988
Moby  Everloving  0.994
Kings of Convenience  Leaning against the Wall  0.976
Dead Can Dance  Nierika  0.972
David van Tieghem  Deep Sky  0.967
Patrick O'Hearn  Sacrifice

Below is another timbre-based playlist; in this case the song comparisons were made using the earth-mover's distance on clustered mfcc data. As you can see, we obtained substantially similar results to those given above. The fact that these two playlists are almost identical helps to confirm that taking the fft of the mfcc data is a valid approximation. (In the listing below, the inter-song distances are earth-mover's distances, where lower numbers indicate greater similarity.)

David van Tieghem  A Wing and a Prayer  1.504
The Chemical Brothers  Come with Us  0.822
Dead Can Dance  American Dreaming  1.208
New Order  Crystal  0.801
Lloyd Cole  Rattlesnakes  1.011
Lloyd Cole  Too Much E  1.379
Lloyd Cole  Like Lovers Do (Stephen Street Mix)  0.568
Lloyd Cole  Butterfly  0.685
OP8  If I Think of Love  0.722
Moby  Natural Blues  0.842
Belle & Sebastian  Like Dylan in the Movies  1.105
David van Tieghem  Deep Sky  1.027
Belle & Sebastian  The Stars of Track and Field  0.868
Kings of Convenience  Failure  0.949
Moby  Everloving  0.964
Kings of Convenience  Leaning against the Wall  1.330
Dead Can Dance  Nierika  2.994
Patrick O'Hearn  Sacrifice

7.3.8 Combined-Features Playlist

Our final playlist uses a combination of all three distance metrics, with equal weighting given to each feature. It is also a good playlist, although in this case the overall order is reversed compared with the other three machine-generated playlists. (When the distance between each pair of songs is symmetric, the sum of the distances is identical to the sum of the distances when the entire playlist is reversed.)

Dead Can Dance  Nierika  0.825
Kings of Convenience  Failure  0.937
Belle & Sebastian  The Stars of Track and Field  0.944
Belle & Sebastian  Like Dylan in the Movies  0.929
Moby  Everloving  0.948
Kings of Convenience  Leaning against the Wall  0.931
Patrick O'Hearn  Sacrifice  0.954
David van Tieghem  Deep Sky  0.967
David van Tieghem  A Wing and a Prayer  0.954
New Order  Crystal  0.958
Dead Can Dance  American Dreaming  0.943
Lloyd Cole  Butterfly  0.939
Lloyd Cole  Rattlesnakes  0.970
Lloyd Cole  Like Lovers Do (Stephen Street Mix)  0.963
OP8  If I Think of Love  0.953
Moby  Natural Blues  0.904
The Chemical Brothers  Come with Us  0.850
Lloyd Cole  Too Much E

Overall, we can see that our techniques can produce a meaningful playlist. It isn’t exactly the same playlist as our sample humans created, but it is nevertheless quite reasonable.

7.4 Presentation Days Demo

During Harvey Mudd's Presentation Days, we had an opportunity to demonstrate our system to a live audience. Members of the audience were given a list of all the songs in our test corpus and could suggest seed songs for the system to use for similarity lookups. We made several observations during our demonstrations. In our second presentation, the audience chose a rock song for a lookup and was skeptical about the returned results. This discrepancy in our system can be explained by an inconsistent addition of songs to our corpus: the day before the presentation, we had increased the test corpus by around 600 songs and run our feature-extraction code on the files.

These new files were downsampled using a different method than our original 150-song corpus. Oddly enough, during lookups the system seemed to group songs from only one of the two sets: songs came either from our original corpus or from the newly added one, but not both. Despite this, our system returned a Metallica song using The Beatles' "Eight Days a Week" as the seed song, a recommendation that many audience members found to be odd; the audience seemed unwilling to acknowledge that the two songs do, in fact, share some striking similarities. In our third presentation, the audience was pleased by a lookup using an Enya song as the seed song, which returned all four Enya songs in the corpus as recommendations. The lookup also returned a number of similar new-age songs. We noticed a significant improvement in our lookups when we used a larger corpus of similar genres (i.e., Professor O'Neill's newly added collection). Our first corpus of 150 songs contained a mix of classic rock, alternative rock, 1980s pop, and rap, and returned very odd recommendations. To be honest, we were not entirely convinced that our system was performing good similarity recommendations until we had increased the corpus size significantly.

Chapter 8

Conclusions and Future Work

This year, the Auditude clinic team has taken a broad problem statement and focused it into a more manageable, but still ambitious, endeavor. We have studied the work of many researchers in the field of music information retrieval and data mining. We have taken ideas, algorithms, and even fragments of code from several generous experts, combined them in new ways, thrown in a few of our own ideas, and developed the basic skeleton of a music similarity engine. We are eager to see Auditude develop this into a finished product, one which we look forward to using ourselves. In the course of our project, we explored several avenues which, in the end, were not part of our main work, but which Auditude or others may find useful. We have included these in Appendix B.

Appendix A

Manuals and End to End Description

The system we have outlined in the preceding chapters was converted from theory into code. What follows is an end-to-end description of how to run the code we generated. The process covers feature extraction, map generation, and finally similarity lookups.

A.1 Fluctuation Strength

The text of this section, the Matlab source files, and x86 Linux executables for the fluctuation strength feature extraction may be found in a directory called sone on the CD-ROM.

A.1.1 How to Run

To perform feature extraction, run the fluct_files function with a list of .wav files as arguments. It will process each of them in turn and write a .sone.mat file for each. This .sone.mat file contains a vector of matrices called fluct; each of these matrices is the fluctuation strength matrix for a window of the song (6-second windows in this case). To get useful information out, run the process_flucts function with a list of .sone.mat files as arguments. It will read in each of these files and for each of them output a .sone.beg, a .sone.end, and a .sone.all file. Each of these files is a feature vector (i.e., it has been flattened column-wise). The .beg file contains the first frame of the song, the .end file the last, and the .all file the median matrix over the entire song. These files are in text format, i.e., a series of ASCII real numbers separated by spaces.

A.1.2 How to Build

Since it is easier to integrate with other components as a stand-alone binary, the fluct_files and process_flucts functions are written so that they can be turned into such binaries by the Matlab compiler. To do so, run the commands

mcc -m fluct_files
mcc -m process_flucts

from within Matlab. This will build binaries in the current working directory. You may need to fiddle with dynamic linker options to get the binaries to run; under Linux on x86 it was necessary to set $LD_LIBRARY_PATH to

$MATLAB/bin/glnx86:$MATLAB/extern/lib/glnx86:$MATLAB/sys/os/glnx86

where $MATLAB is the root directory of the Matlab installation.
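For example, in a Bourne-style shell (the /usr/local/matlab install location here is only an assumption; substitute your own Matlab root):

```shell
# Hypothetical Matlab install root; adjust to your installation.
MATLAB=/usr/local/matlab
export LD_LIBRARY_PATH="$MATLAB/bin/glnx86:$MATLAB/extern/lib/glnx86:$MATLAB/sys/os/glnx86"
```

After setting this, the compiled fluct_files and process_flucts binaries should be able to locate the Matlab runtime libraries at startup.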

A.1.3 Implementation Notes

Below we examine the implementation of fluct_files and process_flucts in detail.

fluct_files

This script transforms a raw audio file into a series of fluctuation strength matrices. This transformation takes a number of steps. First, the audio samples are scaled so that the loudest sound is 75 dB; this compensates for variations in the recording level of different songs. At this point, the sample values are pressure in Pa. The sound samples are then transformed from the time domain into the frequency domain using a fast Fourier transform (FFT); the transform is done over windows of 256 samples. The pressure in each frequency band is then converted to intensity (in W/m²). We then break the spectrum up into the bands of the bark scale. Each window is then processed to account for spreading between bands caused by the human ear. These pressure values are then converted to dB. The dB sound-pressure levels are then converted into sone values using the method of Bladon and Lindblom (1981); the sone values are converted to phon using the method of Allen (1997). At this point we have phon values for each frequency band in each window. To remove time-dependence, we then convert this data to modulation amplitude. This involves taking windows of 6 seconds and using another FFT to determine periodic trends in the loudness sensations in each frequency band. We can weight the resulting matrices by how sensitive humans are to these changes at different modulation frequencies; this gives fluctuation strength. The data for weighting comes from Zwicker & Fastl (1999, Section 10.2).

process_flucts

It was necessary to write a simple function, print_mat, to print out a matrix in ASCII to a text file. For some reason, compiled Matlab functions can only write binary .mat files. print_mat uses the fopen, fprintf, and fclose primitive I/O operators to do the actual printing. The script reads all feature vectors into a giant matrix, and then prints out the median, beginning, and ending vector for each song. This isn't strictly necessary currently, since we don't do any processing on the feature vectors; however, a couple of methods we tried out for more sophisticated feature vector comparison, dimensionality reduction and covariance-matrix-weighted Euclidean distance (a.k.a. Mahalanobis distance), required all of the data to be in memory at the same time so it could be processed. This code is commented out in the current version of process_flucts, but is left in to show how one could do similar manipulations in the future.
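To make the fluct_files pipeline concrete, here is a minimal sketch of three of its framing stages: scaling so the loudest sample corresponds to 75 dB SPL, windowed FFTs of 256 samples, and the second FFT across windows that exposes periodic loudness trends. All function names here are our own, and the bark-band grouping, spreading, and sone/phon conversions are deliberately omitted; this is an illustration, not the actual Matlab code.

```python
import numpy as np

P_REF = 2e-5  # reference pressure for dB SPL, in Pa (20 micropascals)

def scale_to_75db(samples):
    """Scale samples (arbitrary units) so the loudest sample is 75 dB SPL, in Pa."""
    target = P_REF * 10 ** (75.0 / 20.0)
    return samples / np.max(np.abs(samples)) * target

def power_frames(pressure, win=256):
    """Non-overlapping windows of `win` samples -> per-bin power (proportional to intensity)."""
    n = len(pressure) // win
    frames = pressure[: n * win].reshape(n, win)
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def modulation_amplitude(band_levels):
    """FFT over time in each band: the magnitudes are modulation amplitudes,
    which fluct_files then weights by modulation-frequency sensitivity."""
    return np.abs(np.fft.rfft(band_levels, axis=0))
```

In the real pipeline the input to the modulation step would be phon values per bark band over a 6-second window, not raw spectral power.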

A.2 Calculating the Beat Spectrum

This section describes how to generate the beat spectrum for .wav files. There are two Matlab scripts for calculating the beat spectrum: the first processes a single audio file, and the second processes multiple files.

A.2.1 beatspec_single.m

This is a simple script that allows you to view the similarity matrix and beat spectrum in graphical form. It can only be run on a single wav file, and it must be done in Matlab. To run it, type in Matlab:

>> beatspec_single('filename.wav')

where filename.wav is the file you want to run the beat spectrum on. The script will then calculate the beat spectrum and generate two graphs: a grayscale matrix that represents the similarity matrix using cosine distance, and a plot of the beat spectrum using auto-correlation. No data is saved when using this script.

A.2.2 beatspec.m

This is a more powerful script that gives you several options for beat spectrum calculation. This is the script to run if you want to generate beat spectrum files for song comparison. The Matlab code is designed to be compiled into C code by typing in Matlab:

>> mcc -m beatspec.m

This will generate an executable which takes two arguments:

• the files to run the beat spectrum on

• the output directory

If no output directory is specified, it will save the files in the current directory. The executable is run by calling:

./beatspec input-files output-directory

You will then be asked if you want to run the beat spectrum using the default configuration. Enter 'y' or simply push enter to run the beat spectrum using default values; otherwise enter 'n'. The following are the options you can change:

1. Use Auto-correlation (DEFAULT) or Diagonal sum to calculate beat spectrum? Auto-correlation produces more accurate and robust results, but takes longer to calculate. Diagonal sum is faster, but less robust.

2. Desired window size? (DEFAULT: 256 frames) How many frames each “window” will have when we parameterize the audio.

3. Desired start time to run beat spectrum? (DEFAULT: 40 secs)

4. Desired end time to run beat spectrum? (DEFAULT: 70 secs) You can also enter 0 to run beat spectrum on the entire song. This may freeze the program if the song is too long, though.

5. Desired clip time for beginning/end beat spectrum clips? (DEFAULT: 10 secs) For example, if the clip time is x seconds, then the program will calculate the beat spectrum for the first and last x seconds of the song. This is useful for smooth transitions in playlist generation. If you do not want to calculate the beat spectrum for these clip times, enter 0.

6. Desired lag-time? (DEFAULT: 5 secs) The beat spectrum is a function of self-similarity versus lag time. Lag time affects how much data is ultimately included in the final beat spectrum calculation. Usually a lag time of 4–8 seconds is sufficient. It is important to remember that you can only compare vectors of the same length, and thus only beat spectra with the same lag-time (and sampling rate).

Once the options have been set, beatspec will analyze the .wav file(s), calculate the beat spectra, and generate .beat.all files, which contain a list of floating point numbers, in the directory you specified. Also, if you specified to run the beat spectrum on beginning/end clips, it will generate .beat.beg and .beat.end files respectively.

A.2.3 Implementation Notes

In Section 2.2, we discussed the theory behind the beat spectrum. In this section, we discuss how the theory applies to the actual implementation, in case you would like to tweak or change the way it is calculated.

Audio Parameterization

Before we parameterize the audio, we must first extract the specific portions of the audio we will run the beat spectrum on, as specified by the options above. First we find how many frames per second (FPS) there will be based on the song's sampling rate. FPS is calculated by dividing the sampling rate by half the window size. Once we have the FPS, we can find which frame corresponds to a given time in the song, and we can split the audio data into the 'chunks' that we will analyze. For example, with the default options we will want the first and last 10 seconds of the audio, as well as the 40–70 second chunk. Once we have the specified audio, we must parameterize it into its spectral representation. This is typically done by 'windowing' the audio. A helper function spectrum.m is used to do this.
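The frame arithmetic above can be sketched in a few lines (helper names are ours, not code from beatspec.m):

```python
def frames_per_second(sample_rate, window_size=256):
    """FPS = sampling rate divided by half the window size
    (windows overlap by half, per the description above)."""
    return sample_rate / (window_size / 2)

def time_to_frame(seconds, sample_rate, window_size=256):
    """Index of the frame that represents a given time in the song."""
    return int(seconds * frames_per_second(sample_rate, window_size))
```

With the defaults (44.1 kHz audio, 256-sample windows) this gives 344.53125 frames per second, so the 40–70 second chunk spans roughly frames 13781 to 24117.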

Similarity Matrix

In Section 2.2.2, we mentioned that calculating the entire similarity matrix was not necessary for calculating the beat spectrum. Unfortunately, because of Matlab's inefficiencies with for loops, the auto-correlation requires that the entire similarity matrix be present if we are to obtain an accurate representation of the beat spectrum. Therefore, we must calculate the entire similarity matrix for each audio chunk, regardless of the specified lag-time. Ideally, if you are just using diagonal sum to calculate the beat spectrum, you would only need to calculate the similarity matrix for the values in between the main diagonal and the lag-time. However, since we only used auto-correlation in practice, we did not implement a method to truncate the similarity matrix accordingly.

Beat Spectrum Calculation

As you may recall from Section 2.2.3, there are two ways of extracting the beat spectrum from the similarity matrix. The two methods, auto-correlation and diagonal sum, are implemented in autocor.m and diagsum.m respectively. Once the calculation is complete, the appropriate files are saved to the specified output directory.
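The two extraction methods can be sketched as follows. This is a simplified stand-in for autocor.m and diagsum.m, assuming a frames-by-bins spectral parameterization; the actual Matlab code may differ in normalization details.

```python
import numpy as np

def cosine_similarity_matrix(frames):
    """Pairwise cosine similarity between spectral frames (cf. simcosi.m)."""
    unit = frames / np.maximum(np.linalg.norm(frames, axis=1, keepdims=True), 1e-12)
    return unit @ unit.T

def beat_spectrum_diagsum(S, max_lag):
    """Diagonal sum: average each superdiagonal; peaks mark repetition periods."""
    return np.array([np.diagonal(S, k).mean() for k in range(max_lag)])

def beat_spectrum_autocor(S, max_lag):
    """Auto-correlation of the similarity matrix along the lag axis
    (slower than the diagonal sum, but more robust)."""
    n = S.shape[0]
    B = np.array([np.sum(S[:, : n - k] * S[:, k:]) for k in range(max_lag)])
    return B / B[0]
```

A frame pattern that repeats every four windows produces peaks at lags 0, 4, 8, ... in both variants.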

Standardizing Vectors

If you have a sizeable number of feature vectors in your corpus, you may wish to 'standardize' the vectors so that they are weighted equally. This can be done with the normalize.m file, or, once it has been compiled to C code, by typing:

./normalize input-vectors output-directory

This function will standardize the vector set by subtracting the mean and dividing by the standard deviation for each data point in the set. For example, if each song is a row with each column representing a data point, it will subtract the mean and divide by the standard deviation for each column. It will then output the results in the specified output directory.
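The column-wise standardization just described amounts to a z-score per column; a minimal sketch (the real normalize.m may handle edge cases differently):

```python
import numpy as np

def standardize(vectors):
    """Z-score each column: subtract the column mean, divide by the column
    standard deviation. Rows are songs, columns are data points."""
    mean = vectors.mean(axis=0)
    std = vectors.std(axis=0)
    std = np.where(std == 0, 1.0, std)   # guard: constant columns pass through
    return (vectors - mean) / std
```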

A.2.4 Description of Files

The following is a list of files and helper functions included in the Beat Spectrum package:

autocor.m Takes in an n × n matrix and returns the auto-correlation

beatspec.m Calculates the beat spectrum given .wav files

beatspec_single.m Displays the similarity matrix and beat spectrum given a single .wav file

diagsum.m Takes in an n × n matrix and returns the sum of all the super diagonals

get_file.m Strips the path from an input string and returns just the file- name.

normalize.m Normalizes a set of vectors

print_mat.m Saves output to a file

simcosi.m Calculates cosine distance on two matrices

spectrum.m Parameterizes audio into spectral components

A.3 Timbre Similarity

The timbre feature-extraction process works well for recognizing artists' voices and coarse genre classification. The MFCCs which are extracted are too big to use directly. They may be reduced using either EMD or FFTs.

A.3.1 Relevant Code Files

The following source files comprise timbre feature extraction:

mfcc.m taken from http://rvl4.ecn.purdue.edu/~malcolm/interval/1998-010/, calculates mel frequency cepstral coefficients from pulse code modulation encodings of sound (commonly wav files)

emd.c taken from http://robotics.stanford.edu/~rubner/emd/default.htm, calculates the distance between two distributions. Please see this website for more information about EMD and this particular implementation of it.

emd.h taken from http://robotics.stanford.edu/~rubner/emd/default.htm, but significantly modified to match our features and ground distance, as suggested by the webpage.

timbre_mfcc.m takes a wav file, returns a 10 × 2997 matrix of floats: 10 coefficients for each of 2997 overlapping windows in a 30-second sample. If the input wav is at least seventy seconds, it takes the 40–70 second window. Otherwise, the function uses the first 30 seconds.

save_mfcc.m saves the data from cluster_mfcc to a .mfcc file.

emd_cmp.c a standalone program that takes a file with clustered data for two songs and uses the symmetric K-L distance to calculate the Earth Mover's Distance between them.

ghsom_emd.c defines emd_kl_wrapper, a function which takes clustered mfcc data and calculates the distance between them with EMD and K-L.

mfcc_avg.m takes the output of timbre_mfcc (a 10 × 2997 matrix of floats) and saves the averages of each row to a .smfcc file.

mfcc_generator.m Generates MFCCs for all the .wav files in a directory and saves them as .mfcc files. Changing the value of cluster determines the number of clusters that will be generated by the K-means algorithm.

fft_generator.m Generates FFTs of the MFCCs for all the .wav files in a directory and saves them as .fft files. Changing the value of fftcount changes the number of FFT coefficients that will be stored in the .fft file.
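One plausible reading of the reduction fft_generator performs is an FFT of each coefficient's trajectory over time, keeping the first fftcount magnitudes. This is our own sketch; the exact orientation and coefficient count in the real script are assumptions.

```python
import numpy as np

def fft_reduce(mfcc, fftcount=50):
    """Summarize a (coeffs x windows) MFCC matrix: FFT each coefficient's
    trajectory over time and keep the first `fftcount` magnitudes."""
    return np.abs(np.fft.rfft(mfcc, axis=1))[:, :fftcount]
```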

A.3.2 File Formats

The following file extensions are used by the timbre feature-extraction code:

.mfcc clustered mfcc data—2 matrices of COEFFS × CLUSTERS floats each, one for means and one for variances, then COEFFS weights.

.smfcc COEFFS coeffs, the averages of the raw mfcc data

.fft a matrix of FFTCOUNT × COEFFS floats.

A.3.3 Calculating MFCC

To generate the MFCC of a song (in Matlab):

>> beatlesMFCC = timbre_mfcc('Beatles - Eight days a week.wav');

To generate the MFCCs of all the songs in a directory, from that directory:

>> mfcc_generator;

The files will be saved with the same names as the originals, except with the .wav changed to .mfcc. To generate FFTs of all the songs in a directory, from that directory:

>> fft_generator;

A.3.4 Comparing MFCCs

To compare two MFCC feature vectors using EMD as the distance metric (using K-L distance to compare clusters), run

./emd_cmp file1.mfcc file2.mfcc

The EMD distance is printed to standard output. This utility is mostly provided for quick sanity checking of MFCC data.
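The ground distance EMD uses here is the symmetric K-L distance between clusters. Assuming each cluster is a diagonal Gaussian (a mean and a variance per coefficient, as in the .mfcc format), that ground distance could look like this sketch; the C implementation in emd.h may differ in detail:

```python
import math

def kl_gauss(m1, v1, m2, v2):
    """KL divergence between univariate Gaussians N(m1, v1) and N(m2, v2)."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def symmetric_kl(means1, vars1, means2, vars2):
    """Symmetrized K-L between diagonal Gaussians, summed over dimensions."""
    return sum(
        kl_gauss(m1, v1, m2, v2) + kl_gauss(m2, v2, m1, v1)
        for m1, v1, m2, v2 in zip(means1, vars1, means2, vars2)
    )
```

The symmetrization matters because plain K-L is not a metric: KL(p‖q) generally differs from KL(q‖p).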

A.4 Generating Maps

In order to generate our hierarchical maps we began with a basic GHSOM implementation from the Vienna University of Technology. A number of enhancements and additional options have been added to this base implementation. We have also created a number of helper tools to allow for easy data transformation from feature vector files to the input form required by the GHSOM code. This section is intended as a supplement to the full GHSOM manual available from the Vienna University of Technology: http://www.ifs.tuwien.ac.at/~andi/ghsom/download/ghsom_guide.html

A.4.1 Transforming Data

In order to generate a hierarchical map, a number of files must be created.

1. Prop file: defines map-specific tunable parameters (e.g. thresholds for map splitting). Also, includes paths to other input files.

2. Template file: defines a feature vector for this map (includes number of di- mensions and number of input vectors).

3. Data file: includes the input vectors

One can find a detailed description of how to create these files by hand in the full GHSOM manual. However, we have created a number of data transformation tools to aid the creation of these input files.

The first tool is designed to take a directory of feature vector files (most likely generated through our feature-extraction scripts) and set up all the files necessary for a run of the GHSOM code. The script is named tools/data_transform/transform.pl. It has the following variables defined at the beginning of the Perl file.

$dataDir This is the input feature vector data. Each file should be the name of the song from which the features were extracted. The extension of the file will be stripped off.

$outputDir The directory to write the generated GHSOM files to.

$corpusUrl A URL containing MP3 files for the songs in the corpus. Each MP3 file must have the same label as its corresponding feature vector file. However, any spaces that occur in the name of the song must be changed to underscores. This is a result of a limitation of the GHSOM implementation.

$type Either “GHSOM” or “SOM”. Specifies the type of map to generate. SOM is a flat map; GHSOM is a hierarchical map.

$makePairWiseDistance 0 or 1. Controls whether to make a pairwise distance file for use with Sandia clustering software. This feature is essentially useless except to people with access to Sandia's proprietary clustering software; however, it has been included in this guide for the sake of completeness.

Once the parameters have been set inside the script, the script must be run. This will generate all the files needed for a GHSOM run.

A second script, located at tools/data_transform/glue.pl, concatenates feature vectors. It has the following variables defined at the beginning of the Perl file.

$dataDir1 The first input directory for feature vectors. Each file should be the name of the song from which the features were extracted. The extension of the file will be stripped off.

$dataDir2 The second input directory. The format is the same as for the first dataDir. The song labels must be the same in each directory so the appropriate feature vectors can be concatenated.

$outputDir The place to store the concatenated feature vectors.

Once these input files are generated they can be fed directly into the GHSOM executable.

A.4.2 Running the Map Code

GHSOM is run in the following manner:

./ghsom prop-file

This will generate a map and save the results to the outputDirectory (this is defined in the prop file, but it is set to ./output by our data transform script). The results can be viewed through a web server by making the output directory web accessible. The GHSOM code will generate an HTML version of the map.

A.4.3 Enhancements to the Map Code

A number of enhancements were made to the map code. These enhancements were designed to tune the GHSOM implementation to our specific needs.

New Properties

corpusURL (defined in the .prop file) This should be set to point to a web server that has copies of MP3 files of the input songs. The names of the files must have all spaces removed.

saveAsTree This should be set to true if the map generated by the GHSOM is to be used with the similarity lookup code. This property will save a representation of the map to a .tree file.

useEMD This should be set to true if EMD is to be the default distance metric for building the map. This feature has been found to be largely useless for reasons explained in our distance metrics section; however, it has been included for completeness.

coVarianceMatrixFile Defines a file containing a covariance matrix. The covariance matrix should contain tab-delimited values.

Visualization Graphics

These are created to show a graphical representation of the model and song vectors. They are automatically generated and displayed on the GHSOM output HTML.

Edge Weights

Edge weights are now calculated and displayed on the output HTML.

A.4.4 Sample Map

The map in Figure A.1(a) was generated using only sone data. The screenshot presented is from the first level of hierarchy of the map. The greyscale grid images are used as aids to visualize the vectors. The images next to the songs represent input vectors, and the ones next to “QE” represent model vectors. Figure A.1(b) shows an expanded view of a low-level map unit. This map unit requires no additional division in order to adequately classify the songs.

A.5 Playlist Generation Using the Greedy Algorithm

This section details how to generate playlists from feature vectors using the greedy approach. It will address the playlist.m and playlist2.m Matlab scripts. These files are designed to be compiled into C code by typing in Matlab:

>> mcc -m playlist.m

or

>> mcc -m playlist2.m

A.5.1 playlist.m

This is the simpler version of playlist generation, which takes in a single set of feature vectors and generates a playlist using a brute force method. The size of the vectors does not matter, as long as they are all the same size. It can be run by typing:

./playlist n { -bt } feature-vectors

Specifying the Seed Song

n specifies the song to begin playlist generation. The default value is 0, which selects a random seed song. Note: If you do not specify a seed song when you run the program, you will receive a syntax error, but this will not affect the program. This is due to the fact that it is trying to convert a non-numerical string into a number.

Beat Transition (-bt)

If you want to generate a playlist based on beat transition, then you need to specify it with the -bt option. If you select this option, then you must have two sets of feature vectors, .beg and .end, present in the same directory. You only have to specify the location of the .beg files and the program will automatically search for the .end files in the same directory.

(a) The top-level map

(b) A final partition

Figure A.1: The GHSOM code in action

The program will finally generate an ordered list of songs by comparing the given feature vectors. The algorithm for determining this ordering is detailed in chapter 6. No data is saved when this program is run.

A.5.2 playlist2.m

This is the more powerful version of playlist generation. It can take several sets of feature vectors and will automatically concatenate them. For example, you may want to generate a playlist based on both the beat spectrum and sone vectors, and weight one at 70% and the other at 30%. This function also writes the distance matrix to the file songs.matrix, which is useful if you want to run the genetic algorithm or do testing. The size of the vectors does not matter and they will be weighted as you specify, but each set MUST have the same sized vectors within the set. Also, if multiple sets are specified, it is important that all of the files for the various features of the same song have the same root; otherwise an error may occur. The program is run by typing:

./playlist2 n { -bt beginning-feature-vectors } feature-vectors

Specifying the Seed Song

The n argument specifies the song to begin playlist generation. The default value is 0, which will use a random seed song. If you enter -1, the algorithm will generate a playlist using every song as the seed song, and then return the one with the shortest total distance. If you enter -2, the algorithm will use dimensionality reduction to find the optimal solution. However, this method has not been tested and probably does not return very accurate results.

Beat Transition Like playlist.m,thisprogramalso supports comparing transition vectors, but is only limited to a single set. In other words, I cannot compare both the beat spectrum and sone transition vectors. I have to choose one or the other. If you specify the -bt option, you must first specify the .beg files as the first vector set.

Generating an m3u File

Once you run the program, you will be asked whether you would like to generate an m3u playlist. The default answer is yes. If you choose yes, the program will generate a playlist and output the ordering into an m3u file called song_playlist.m3u in the current directory. This file can be loaded into an mp3-compatible player, such as Winamp. The mp3 files must be in the same directory as the m3u file.

Generating the Distance Matrix

The program will automatically generate the distance matrix, calculated by applying the cosine distance metric to the feature vectors, into a file called songs.matrix. The file itself just contains a matrix of numbers, but the genetic algorithm code requires the matrix. As the program runs, it will count the number of songs based on the feature vector files it is given and display the final count. If the count does not seem correct, it may be because your feature vector files are named inconsistently. Next, it will analyze and glue the vector sets specified. It will output a count of the number of feature vectors it was given. If there is more than one set of vectors, you are given the option to give each set a weight from 0 to 1. You may simply push enter without entering a weight and the program will weight each vector set equally. For example, if you have 4 vector sets, it will give each set a weight of 0.25. Finally, the program will output an ordered list of songs, and generate an m3u playlist if specified. It will also output the distance matrix.

A.5.3 The Algorithm

The playlist generation algorithm is a greedy approach to the next-shortest-distance problem. The distances are determined by calculating the cosine distance between the vectors. First it generates a matrix by calculating cosine distance, similar to the way the similarity matrix is created in the beat spectrum. If there are multiple feature vectors, then it will calculate a distance matrix for each vector set, multiply it by the specified weight, and add it to the previously calculated matrix. The result is a single n × n distance matrix (where n is the number of songs) that represents each song's similarity to every other song. If we are doing beat transition, then the matrix is calculated slightly differently, because we need to compare a song's end data with the next song's beginning data for a 'smooth' transition. Finally, we set the diagonal of the matrix to −1, because we do not care whether or not a song is similar to itself. Once we have the matrix, the algorithm starts at the seed song and finds the next closest song (i.e. the maximum value in that song's row, since with the cosine measure larger values mean greater similarity) that is not already in the playlist being generated. If the song is already in the playlist, it sets that value to −1 and finds the next maximum value. It then adds the song's index to the playlist, and finds the next closest song from the new index. It follows this path until all songs have been added to the playlist.
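The greedy loop just described can be sketched as follows. This is our own simplification of the algorithm in playlist.m; the real code also handles beat transitions and the weighted combination of multiple matrices.

```python
import numpy as np

def greedy_playlist(similarity, seed=0):
    """Order songs greedily over an n-by-n cosine matrix: start at `seed`
    and repeatedly jump to the most similar song not yet in the playlist."""
    S = np.array(similarity, dtype=float)
    np.fill_diagonal(S, -1.0)          # a song is never its own neighbor
    order = [seed]
    S[:, seed] = -1.0                  # used songs can never be picked again
    for _ in range(S.shape[0] - 1):
        nxt = int(np.argmax(S[order[-1]]))
        S[:, nxt] = -1.0
        order.append(nxt)
    return order
```

Zeroing out entire columns as songs are used is a cheap way of guaranteeing that argmax only ever sees unvisited songs.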

A.5.4 Description of Files

The following is a list of files and helper functions included in the Playlist Generation package:

get_file.m Strips the path from an input string and returns just the file- name

playlist.m The simple version of playlist generation

playlist2.m The more powerful version of playlist generation

playlist2m3u.m Generates an m3u file given a list of song names

print_mat.m Saves output to a file

simcosi.m Calculates cosine distance on two matrices

A.6 Playlist Generation Using the Genetic Algorithm

This section briefly describes how to generate a playlist using the genetic algorithm code. The code is written in C++ and is therefore somewhat independent of the Matlab code described above. It requires a distance matrix file of floating-point numbers and a list of song labels given as input. However, this genetic-algorithm code is integrated into the Tree Lookup code, and thus does not require stand-alone usage.

A.6.1 Running the Code The algorithm is called by typing:

./natselplay { option-switches } distance-matrix song-labels

Option Switches Switches can be used to modify the program behavior as follows:

-d Produce debugging output, showing the best individual and the gen- eration number each time a better individual is found.

-g n Specify the number of generations to run for (default 100).

-m n Specify the probability of genetic mutation (default 0.001). Each time a new organism is generated, it is mutated with a probability equal to this value. If the organism mutates, two random positions in the sequence are swapped.

-p n Specify the size of the population of organisms (default 1000). This is the number of organisms that compete to survive into the next generation.

-s n Specify the size of the selection pool (default 500). This is the number of organisms that survive into the next generation. Note that surviving gives an organism a chance to become a parent, but does not guarantee it. The difference between the -p value and the -s value is the number of new organisms produced at the beginning of each generation.

-S n Specify a seed for the random-number generator. If no random seed is given, one is derived from the time of day.
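The mutation and survival rules these switches control can be sketched as follows. All names are ours, and crossover and the actual fitness function are omitted; this illustrates only swap mutation and the population/pool relationship described above.

```python
import random

def mutate(playlist, rate=0.001, rng=random):
    """With probability `rate`, swap two random positions in the sequence."""
    child = list(playlist)
    if rng.random() < rate:
        i = rng.randrange(len(child))
        j = rng.randrange(len(child))
        child[i], child[j] = child[j], child[i]
    return child

def next_generation(population, fitness, pop_size=1000, pool_size=500,
                    rate=0.001, rng=random):
    """Keep the `pool_size` fittest organisms (lower fitness = shorter
    playlist), then breed `pop_size - pool_size` mutated offspring from
    randomly chosen survivors."""
    survivors = sorted(population, key=fitness)[:pool_size]
    offspring = [mutate(rng.choice(survivors), rate, rng)
                 for _ in range(pop_size - pool_size)]
    return survivors + offspring
```

Swap mutation is convenient here because it always maps a valid playlist (a permutation of the songs) to another valid playlist.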

A.7 Performing Music Recommendation

We have created an application that unifies all the major features of our clinic work. This application is located in the tree_code directory. It allows for the following functionality:

1. Generate similarity judgments

2. Create playlists

3. Play playlists and individual songs

A.7.1 Description of Files 1. Makefile - compiles the tree code

2. Genetic Algorithm Files

colony.cpp implementation file for a colony of playlists
colony.hpp header file for a colony of playlists
natselenv.hpp header file for the natural selection environment
natselenv.cpp implementation file for the natural selection environment
natselplay.cpp main driver for running playlist generation using genetic algorithms
natselplay.h wrapper function for playlist generation via genetic algorithms
organism.hpp header file for an organism (which is really a playlist)
organism.cpp implementation for the organism class
random.hpp header file for a random number generator
random.cpp implementation of the random number generator

3. Distance Metric files

distancemetric.h definition of the abstract distance metric class
cosine.cc implementation of the cosine distance metric
cosine.h header file for the cosine distance metric
euclid.cc implementation of the Euclidean distance metric
euclid.h header file for the Euclidean distance metric
emd_distance.cc implementation of the earth mover's distance metric (this calls an external wrapper function)
emd_distance.h header file for the EMD metric

4. EMD implementation and wrapper

emd/emd.c main implementation of EMD
emd/emd.h predeclarations for the implementation of EMD
emd/tree_emd.h wrapper function to call from the distance metric file
emd/tree_emd.c implementation of the wrapper. Allows input of a raw data vector instead of a matrix.

5. General Files

lookuptree.cc unifies all features, trees, and feature vectors. Provides functionality to perform similarity lookups and playlist generation.
lookuptree.h header file for lookuptree.cc
playlist.h a class used to define all the parameters to generate a playlist. The results of the playlist are also stored in this object.
playlist.cc implementation of the playlist class
query.h a class that defines all information about a similarity lookup. The results of the lookup are also stored in this object.
query.cc implementation of the query class
sample_lookup.cc the interactive shell for doing lookups and playlist generation
treetag.h a class that defines a tree tag. These tags are located in the .tree files generated by the GHSOM.
treetag.cc implementation of the treetag class
treetokenizer.h takes a string of characters and chops it into tree tag tokens
treetokenizer.cc implementation of the treetokenizer class
weightedtree.h encapsulates the weighted trees that represent our hierarchical maps
weightedtree.cc implementation of the weightedtree class

A.7.2 Input Property Files

In an effort to be consistent with the GHSOM code, we have borrowed some conventions. The property files that we use as input to the tree code are formatted as follows. Note that the dataFile can be automatically constructed using the data transformation script, and the treeFile is produced as a result of running the GHSOM code with the saveAsTree flag set to true.

feature feature-id
weight relative-feature-weight
metric euclid | emd | cosine
dataFile path-to-datafile
treeFile path-to-treeFile

Each of these entries defines a single feature. There can be as many features defined in the input file as needed. The last feature must be followed by a single line containing the keyword done. Also, note that the treeFile is an optional part of the feature definition; this means that a feature need not have a corresponding hierarchical map. The treeFile will be automatically generated by the GHSOM code if the map is run with the saveAsTree flag set to true. Also, the dataFile is in the same format as the one used to run the GHSOM.

A sample property file:

feature sone
weight 1.0
metric euclid
datafile ./sample_trees/sone_ghsom_corpus2.data

treefile ./sample_trees/sone_ghsom_corpus2_77_1.tree

feature beat
weight 1.0
metric cosine
datafile ./sample_trees/beat_ghsom_corpus2.data
treefile ./sample_trees/beat_ghsom_corpus2_77_1.tree

feature timbre
weight 1.0
metric emd
datafile ./sample_trees/ghsom_mfcc.data
treefile ./sample_trees/fft_main_corpus_77_1.tree

done
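A minimal parser sketch for the format above (our own helper, not part of the tree code; it treats keys case-insensitively, since the sample file uses datafile/treefile while the description uses dataFile/treeFile):

```python
def parse_prop(text):
    """Parse feature definitions (terminated by 'done') into a list of dicts."""
    features, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        key = key.lower()
        if key == "done":
            break
        if key == "feature":
            current = {"feature": value}
            features.append(current)
        elif current is not None:
            current[key] = value       # weight, metric, datafile, treefile
    return features
```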

A.7.3 Running the Tree Code

The tree code is run in the following manner:

./sample_lookup path-to-prop-file

A.7.4 Commands Available in the Tree Application

Once the tree code is run, a shell prompt will be displayed. The system is now ready for interaction. There are a number of commands available to the user.

1. help Displays a list of all available commands.

2. quit Quits the program.

3. lookup result-set Perform a similarity lookup. All parameters for the lookup must have been set via the appropriate shell commands. result-set specifies a name under which to store the results of the lookup. The name should be a single word. It can be referenced in other contexts inside the shell.

4. setTopologyWeight topology-weight Set the topology weight of the next lookup. This basically states how much the lookup should be based on topology versus direct feature comparisons. Default value is 0.5.

5. getTopologyWeight Displays the current topology weights

6. setCyclical { true | false } Determines if the next playlist generated will be cyclical (i.e. ideal for loop- ing). Default value is false.

7. getCyclical Displays whether the next playlist will be cyclical.

8. genPlaylist result-set Generate a playlist and store the results in result-set.

9. setSeedSong song-name | song-id Set the seed song for the next similarity lookup. The argument to setSeedSong is either the full name of the song or simply the song id number (this can be determined by running the list command).

10. getSeedSong Displays the current seed song.

11. setNumResults number-of-results Set the number of results desired for the next lookup. Default value is 20.

12. getNumResults Displays the number of results for the next lookup.

13. setPlaylistSongs result-set Determine which songs should be made into a playlist the next time genPlaylist is called. result-set must be an existing result set.

14. createResultSet result-set index … done Allows the user to create a custom result set. This is useful for testing the playlist generator using arbitrary input songs. The indices are simply song id numbers separated by white space.

15. list Displays the songs in the corpus.

16. listResultSet result-set Output the songs in a given result set. 86 Manuals and End to End Description

17. playSong song-name | song-id Plays a given song using the default MP3 player.

18. playList result-set Plays a playlist containing the songs in result-set.

A.7.5 Sample Music Recommendation Session

In this session a user performs a similarity lookup on Van Halen’s “Right Now” (song 147). Then, the user creates a playlist from these songs.

50 > ./sample_lookup ./sample_trees/beat_sone.prop
> setSeedSong 147
> getSeedSong
Seed Song = Van_Halen_-_Right_now
> setTopologyWeight .4
> setNumResults 20
> lookup results
Executing Query on Songs -
Parameters -
numResults = 20
topologyWeight = 0.4
Van_Halen_-_Right_now
Results -
1. Tony_Bennett_-_Fly_Me_to_the_Moon
2. The_Who_-_Baba_O’Riley
3. Mahler_-_Songs_of_a_Wayfarer_1
4. Mahler_-_Songs_of_a_Wayfarer_4
5. Bee_Gees_-_Stayin_alive
6. Guns_&_Roses_-_Knockin_on_heavens_door
7. Wagner_-_Ride_of_the_Valkyries
8. Pearl_Jam_-_Yellow_ledbetter
9. Black_Sabbath_-_Iron_man
10. Tony_Bennett_-_I_Left_My_Heart_In_San_Francisco
11. Metallica_-_Master_Of_Puppets
12. Abba_-_Dancing_queen
13. Metallica_-_Nothing_else_matter
14. Smashing_Pumpkins_-_1979
15. Billy_Joel_-_Piano_man
16. Sublime_-_40oz_to_freedom
17. Big_Bad_Voodoo_Daddy_-_Maddest_Kind_of_Love
18. Ccr_-_Long_as_i_can_see_the_light
19. Tony_Bennett_-_Rags_To_Riches
20. Goo_Goo_Dolls_-_Iris
> setPlaylistSongs results
> genPlaylist playlist
Generating a playlist -
Results -
1. Metallica_-_Master_Of_Puppets
2. Ccr_-_Long_as_i_can_see_the_light
3. Mahler_-_Songs_of_a_Wayfarer_4
4. Big_Bad_Voodoo_Daddy_-_Maddest_Kind_of_Love
5. Wagner_-_Ride_of_the_Valkyries
6. Abba_-_Dancing_queen
7. Tony_Bennett_-_I_Left_My_Heart_In_San_Francisco
8. Sublime_-_40oz_to_freedom
9. Mahler_-_Songs_of_a_Wayfarer_1
10. Goo_Goo_Dolls_-_Iris
11. Smashing_Pumpkins_-_1979
12. The_Who_-_Baba_O’Riley
13. Guns_&_Roses_-_Knockin_on_heavens_door
14. Van_Halen_-_Right_now
15. Black_Sabbath_-_Iron_man
16. Metallica_-_Nothing_else_matter
17. Bee_Gees_-_Stayin_alive
18. Billy_Joel_-_Piano_man
19. Pearl_Jam_-_Yellow_ledbetter
20. Tony_Bennett_-_Fly_Me_to_the_Moon
21. Tony_Bennett_-_Rags_To_Riches
> quit

Appendix B

Unexplored Possibilities

During the course of the year, the team researched techniques that initially seemed applicable to the project. Mostly due to time constraints, however, some of these ideas were never fully explored or implemented.

B.1 Representing Trees in a Database

Originally, one of the goals of our clinic was to implement all of our lookup procedures in a database of some type. As the project progressed, however, it became increasingly clear that the major contribution of our clinic would be to deal with feature extraction and mapping issues rather than scalability issues. Nevertheless, we did a good deal of planning for the eventual necessity of using our system with a database.

B.1.1 Requirements for Our Database

In order to evaluate database types in a uniform manner, we formalized some requirements that the chosen database must meet.

1. The database must be able to incorporate new songs without having to regenerate the existing stored data. Auditude’s library of songs is immense, and creating the database from this library will be a very time-consuming operation. Auditude updates their library of music monthly, and should be able to add these updates midstream into our database.

2. The database must be able to perform similarity lookups quickly.

This requirement follows from the idea that Auditude’s central servers will perform recommendation functions for a large number of distributed client applications.

3. The database must be free or reasonably priced. We do not have the funds necessary to purchase an extremely expensive proprietary database; an open-source, completely free database is the most desirable. Auditude may swap our specific database implementation for a more expensive option at a later date. Given the modularity of our code, upgrading the database should not be a particularly difficult operation.

4. The database must be able to efficiently store and manipulate a tree structure. At the heart of our storage and lookup procedures is a tree, created using GHSOMs (Growing Hierarchical Self-Organizing Maps). The database we choose must be particularly quick and flexible when it comes to encoding this structure.

B.1.2 Choosing a Database Type

For this design decision we looked at two basic types of databases: relational and directory. We evaluated each type against the requirements set forth in Section B.1.1.

Directory based database

These databases store data in an explicitly hierarchical structure. The most common type is built on the X.500 DAP (Directory Access Protocol). Recently, however, a newer technology, LDAP (Lightweight Directory Access Protocol), has become the preferred type of directory database on the Internet (Howes, 1995). These databases are lightweight because they leave out the list operation in favor of the search operation; this shift in available operations makes LDAP databases optimized for reading rather than writing (Howes, 1995). When we evaluated these types of databases against our requirements, a number of problems became evident. Storing a tree structure in an explicit directory hierarchy has a number of undesirable properties, which will become evident in Section B.1.3 when we outline the tree model that we have decided to use. The issue of cost also becomes a factor: the fastest directory-based databases are not open-source and are in fact extremely expensive, while the free alternatives simply do not perform at a high enough level. The simplest type of directory database one could imagine is simply storing information on the host filesystem. However, this would be completely platform dependent, space inefficient, non-standard, and extremely slow.

Relational database

These databases use tables to store information. Relationships between data objects are encoded with keys that refer to data in other tables and rows, and all tables reside in a flat namespace. At first glance, this seems inadequate for storing a hierarchical structure such as a tree. However, we were able to find a model for storing trees in a relational database that was not only adequate but in some ways better than the explicit hierarchy of a directory database. This model allows for easy insertions and removals of songs from the database. Using various query optimization techniques, these databases have managed to give very good performance for a wide range of applications. One specific implementation, MySQL, is designed to offer the bare-bones features of a relational SQL database with an emphasis on performance. The web journal Databasics (Harkins & Reid, 2001) gives a nice overview of the principal benefits of this implementation:

1. It’s inexpensive.
2. It’s highly optimized.
3. It provides flexible interfaces to many different databases.

For these reasons we decided that a relational database was the way to go, and specifically that MySQL was the implementation that we would use.

B.1.3 Representing Trees in a Relational Database

Using explicit links to create a hierarchical structure is something that relational databases, and SQL in particular, do not do very well. What SQL excels at is representing set relationships (Celko, 1996). Consider storing explicit links between the nodes of a tree in an SQL table: in order to navigate from the root to a node, the table would have to be queried many times to follow the links down to the desired data. We found a better solution, the Nested-Set Model (Celko, 1996), for representing trees. The idea behind the nested-set model is, instead of viewing a node in a tree as consisting of links to its parent node, to think of a node as representing the set of all nodes that fall below it in the tree. This representation is attractive to us because SQL can very easily encode the nested-set view of a tree. To do so, we need to number the nodes in an appropriate fashion.

Figure B.1: Two different views of the same tree. A is a graph, B is a nested-set.

Joe Celko (1996) presents a model for representing trees in SQL. Each node is given two numbers, which one can think of as a left and a right index. In order to define the properties of this model, we need to define some symbols.

L(N) = the left index of node N
R(N) = the right index of node N
T = the set of all nodes in the tree
D(N) = the set of all nodes that are direct or indirect descendants of node N

The choice of left and right indices must satisfy the following invariant.

∀N ∈ T, ∀c ∈ D(N): L(c) > L(N) ∧ R(c) < R(N)

As long as we can come up with a numbering of the nodes that satisfies this invariant, extracting relationships from the tree will be very easy. It is trivial to determine the root-to-node path in the tree, and finding all descendants of a node is extremely easy as well. These two tasks form the backbone of our similarity lookups.

Figure B.2: A numbered tree

The numbering of the nodes is determined by a depth-first search. Initialize a counting variable i = 1. When a node is visited for the first time, its left index is set to i, and then i is incremented. When a node is visited for the last time, its right index is set to i, and then i is incremented. Perform this numbering by initiating a depth-first search starting at the root of the tree.

B.1.4 Applying the Nested-Set Model

Now that we have a method for representing trees in SQL, we need a framework for deciding what data we will store in this tree and the procedures by which we will store it. The requirements fall out of the algorithm we use to grow the tree. Our tree generation groups songs at the leaves of the tree, and multiple songs can be stored at each leaf, so our tree model must handle this type of structure. Each leaf of our tree is called a collection. A collection contains multiple songs, but can be manipulated as a single unit inside the tree. Each intermediate node in the tree contains no data except its left and right index. Next, we need to define a number of procedures that we will perform on this tree structure. The primary procedures are building the tree and performing lookups on it.

B.1.5 Growing Our Tree

Our algorithm for growing the tree progressively splits the library of songs into finer and finer divisions. If the songs at a given node are determined to be too dissimilar, then new sub-nodes are split off to hold portions of the old node. Fortunately, the nested-set tree model handles this procedure easily.

Start with all songs in one collection at the root of the tree, and give the tree an initial numbering. If the insertion of a song does not require the splitting of a collection, then no modification to our tree structure is needed at all; we simply add that song to the collection. When a split in a collection is needed, the mapping in the database between collection and song must be modified, as must the structure of the tree. Specifically, we must preserve the left- and right-index properties of the tree. Since we are only dealing with insertions at the leaves of our tree, this renumbering is easy. We can formalize the procedure for splitting node N into c child nodes:

∀n ∈ T:
    if L(n) > L(N) then L(n) = L(n) + 2c
    if R(n) > L(N) then R(n) = R(n) + 2c
for i = 1, 3, 5, …, 2c − 1:
    insert a node j into T with L(j) = L(N) + i ∧ R(j) = L(N) + i + 1

Collapsing multiple collections that share a common ancestor is accomplished by reversing this procedure.
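The split procedure above can be sketched directly in code. This is an illustrative implementation, not our delivered system; the dictionary representation and the child-naming scheme are assumptions made for the example.

```python
def split_node(indices, parent, c):
    """Split `parent` into c child collections while preserving the
    nested-set invariant. `indices` maps each node to (left, right)."""
    L = indices[parent][0]
    # Shift every index greater than L(parent) up by 2c to make room.
    for node, (l, r) in list(indices.items()):
        indices[node] = (l + 2 * c if l > L else l,
                         r + 2 * c if r > L else r)
    # Insert the c children as empty leaves just inside the parent,
    # at (L + i, L + i + 1) for i = 1, 3, ..., 2c - 1.
    for k in range(c):
        i = 2 * k + 1
        indices[f"{parent}.child{k}"] = (L + i, L + i + 1)
    return indices

# Splitting a lone root (1, 2) into two children renumbers the root to
# (1, 6) and inserts children at (2, 3) and (4, 5).
idx = split_node({"root": (1, 2)}, "root", 2)
print(idx)
```

Because only indices greater than L(parent) move, every untouched subtree keeps its relative ordering, so the invariant holds without a full renumbering.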

B.2 More Efficient Dimensionality Reduction

Figure B.3: A sample split. All red numbers have been increased by four (2 × 2 new nodes)

There exist other, more efficient algorithms for dimensionality reduction. We did not choose to pursue them because map creation was fast enough that further optimization seemed unnecessary. Had we chosen to pursue the idea further, we probably would have tried Principal Component Analysis (PCA) next. PCA has the advantage of running in time O(np² + p³), where n is the number of data points and p is the dimensionality of each data point. Since the method we used, multidimensional scaling, runs in time O(n²), PCA would clearly be preferable for large data sets. For more information, see Hand et al. (2001, ch. 3).
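To make the cost structure concrete, here is a minimal PCA sketch using NumPy. It is not part of our delivered code; the data is random and purely illustrative.

```python
import numpy as np

def pca(X, k):
    """Reduce n points of dimension p to k dimensions with PCA.
    The cost is dominated by forming the p-by-p covariance matrix
    (O(n p^2)) and its eigendecomposition (O(p^3))."""
    Xc = X - X.mean(axis=0)                  # center the data
    cov = Xc.T @ Xc / (len(X) - 1)           # p x p covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues, ascending order
    top_k = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # top-k components
    return Xc @ top_k

# 200 hypothetical feature vectors of dimension 10, projected to 2-D.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = pca(X, 2)
print(Y.shape)   # (200, 2)
```

Note that nothing here scales with n² — the only n-dependent step is the covariance accumulation — which is why PCA would be preferable to multidimensional scaling for large corpora.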

B.3 Other Algorithms for Playlist Generation

Though the Traveling Salesman problem described in Chapter 6 is NP-complete, as mentioned before, relatively simple polynomial-time approximation algorithms do exist for special cases of it. We had hoped to implement one such algorithm, based on minimum spanning trees, which could have generated cyclic playlists that were within a factor of two of optimal. However, time constraints did not permit this. It would be interesting to see how this approach compares to the genetic algorithm approach used in our solution. For more details on this algorithm, see Cormen et al. (2001, sec. 35.2). In addition, it would be interesting to compare playlists like those described above with playlists that minimize the maximum discrepancy between adjacent songs. Further research would have to be done to determine if efficient algorithms exist for this problem; the genetic algorithm we used could be modified simply by changing the fitness function.
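The MST-based approximation we had hoped to implement can be sketched as follows. This is a hypothetical illustration: the song representation and distance function are stand-ins, and the factor-of-two bound only holds when the distance is a metric.

```python
def mst_playlist(songs, dist):
    """Sketch of the MST-based 2-approximation (Cormen et al. 2001,
    sec. 35.2): grow a minimum spanning tree with Prim's algorithm,
    then order the songs by a preorder walk of that tree."""
    start = songs[0]
    in_tree, children = {start}, {s: [] for s in songs}
    while len(in_tree) < len(songs):
        # Cheapest edge from the tree to a song not yet in it.
        u, v = min(((u, v) for u in in_tree
                    for v in songs if v not in in_tree),
                   key=lambda e: dist(e[0], e[1]))
        children[u].append(v)
        in_tree.add(v)
    order = []
    def walk(u):            # preorder traversal yields the playlist
        order.append(u)
        for v in children[u]:
            walk(v)
    walk(start)
    return order

# Stand-in example: songs as scalar loudness values, distance = |a - b|.
print(mst_playlist([0, 10, 2, 8], lambda a, b: abs(a - b)))   # [0, 2, 8, 10]
```

A minimax variant (minimizing the largest adjacent-song discrepancy, as discussed above) would require a different objective; in our genetic-algorithm framework that amounts to swapping the fitness function.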

Appendix C

Saga of Project (with Pictures)

The following is the greatest story ever told. Any similarity to actual events is not a coincidence, since it is a true story.

C.1 Cast of Characters

The winds of fate threw together five people, their destinies tenuously linked...

• Professor O’Neill — An idealistic young professor with a dream. A dream of what we do not know, but a dream nonetheless.

• Paul Ruvolo — An idealistic young college student. He had a certain spring in his step and gleam in his eye. These would only last the first semester. He was the project manager for this wily bunch. He was also devastatingly good looking.

• Nick Taylor — An idealistic young... why the hell are they all idealistic? Nick was not idealistic, he was, uhhh, anti-idealistic. He had a certain flair about him, maybe it was due to his intense love of jazz, particularly Glenn Miller’s “Take Five.”

• Liz Schoof — A ghetto fabulous girl who brought her knowledge of the streets to the project.

• Brad Poon — A real maverick who plays by his own rules. He doesn’t take no for an answer. That’s right establishment, you best be watching out when the G-funk is in da house. No really, he’s this Asian dude from Orange County.

C.2 Ground Zero

Our cast of characters began with a simple task: to generate a performance-independent polyphonic music recommendation engine. Unfortunately most of them didn’t have the faintest idea what that meant. Which was probably good, because they would later learn that it is a diabolical problem that has foiled brilliant minds around the world. After looking over a dictionary for an hour or so, our fearless project manager realized the severity of the task at hand. Essentially the group, to be known heretofore as the A-Team, was supposed to create a system that, given a database of song performances, could match a new song performance to the appropriate entry in the database. The system must do this without any knowledge other than the waveform of the new song performance. Liz was stunned. She thought the project was to lay down some funky beats and some mad fresh lyrics to create a hit rap track. However, she decided to lend her best to the noble quest.

C.3 La Mèrde de Paris

The team began by researching the state of the art in polyphonic music recognition. Unfortunately the state of the art was not very advanced. However, the team dutifully examined several key strategies commonly used to tackle the problem. During this research phase the team learned of a magical faraway land. A land known only as Europia, and a city dubbed Paris. There was a special squadron of the true and just convening at this location in only a month. Could the team possibly participate in the festivities on such short notice? “The proposal is due before the conference even starts!” Paul cried in protest. However, the team was able to push back this deadline in order to attend the conference in the hopes of gaining new insight into what had been discovered to be a very difficult project. The plan was to send one or two members of the team; however, good fortune smiled upon the A-Team. A beleaguered Professor Hodas offered to send the entire team to Paris. Paul was slightly upset when he discovered that there was an actual city of Paris and that the conference was not being held at the Paris hotel in Las Vegas. The cries of envy were quick to come from the rest of the students at Harvey Mudd College. The A-Team responded with dignity and humility, as is the mark of a classy unit.

The conference the team attended in Paris was dubbed Ismir (International Conference on Music Information Retrieval). The team set off for Paris with high hopes of solving the task that had been bestowed upon them. The team had many exploits set against the lush backdrop of the city of life. The team gained new direction in their quest as well. The original problem bestowed upon the team proved to be much beyond the current reach of MIR technology. The team instead decided to focus on the area of music recommendation and similarity. The A-Team would develop an application capable of generating similarity judgments on a corpus of songs and composing them into playlists. The shift in focus was inspired by a number of outstanding individuals encountered at the conference. The most notable were Jonathan Foote of Xerox PARC and Beth Logan of Hewlett-Packard. Jonathan Foote had developed a similarity metric for extracting beat and rhythm information from a piece of music. The team immediately realized that this could be potentially useful for determining similarity between pieces of music. Beth Logan’s work was focused in the realm of timbre. She used MFCCs (Mel Frequency Cepstrum Coefficients) to examine the timbre of pieces of music. This was another direction the team wanted to go in.

C.4 Forget Paris... Please

As the team returned to Claremont they realized that they were well behind schedule on their new project. However, the new direction gained from Ismir proved invaluable. The team reformulated a development plan for the rest of the semester: the team was to prototype three feature vectors in Matlab. The first semester concluded with some beginning-stage feature extraction code, a huge step for the previously lost A-Team.

C.5 Casey at the Bat

Second semester began in earnest. The team was hitting lead-off in the presentation order. They did not hesitate to hit a home run. To say it was an epic presentation of masterful skill would be an understatement. To say that the team is modest would be an overstatement. Development went ahead as planned. Paul began to develop plans for performing lookup algorithms while the other three maggots... err team members... continued work on feature extraction. Liz worked on extracting timbre information, Brad on beat, and Nick on loudness sensation. Nick’s loudness sensation was the first to be completed. Brad’s beat spectrum was close but there were still bugs. Liz’s MFCCs were generated, but there was the

(a) Paris, City of Cafés (b) Paris, City of Tourism
(c) Ircam, Home to Ismir (d) The Pompidou Center
(e) Brad starts to “Go Native” (f) Liz wants to “Take Five”

Figure C.1: Adventures in Paris.

(a) The Obligatory Poster Pose
(b) Brad Gets into the Beat (c) Paul Leads Everyone in Prayer

Figure C.2: Projects Day, 2003.

problem of how to compare them. These delays were well accounted for in the project schedule. As the features became more mature, Paul began to experiment with map code that he stole... ummm downloaded... from the website of a participant of Ismir. This code was up and running and generating maps in a week, an enormous gain of time, since the team had planned on implementing this technique themselves.

C.6 The Ghetto Girl Makes Good

Brad got his beat spectrum working after a quick e-mail to Jonathan Foote. With this task complete he began working on generating playlists, basically on schedule. Liz took a page from Beth Logan’s playbook and decided to use Earth Mover’s Distance to compare the MFCCs. Implementing this proved to be very challenging. In the meantime, our hero Paul was working on an application to unify all features and functionality to allow users to perform similarity lookups. This application took a few weeks to come together, but it was complete before the frosty chill of code freeze. Liz persevered and after a few hiccoughs produced code to allow comparison of her MFCCs. All was well for the A-Team. Brad talked to the masses and got some feedback on our system through user testing. It remains to be seen whether he was in collusion with the good people at Big Bowl Cafe, as he used their tasty grease-laden Thai food to bribe users into participating. All threads of the project came together in the final week. The team now had a system that performed all the major functions they had outlined in their proposal, and had also solicited feedback from the populace. A job well done. Or was it... Well, if anything, it was a job done. Or was it...

C.7 Wait, We Have to do a Presentation?

Late Monday night Paul was browsing through his e-mail from first semester. Apparently there was something called Presentation Days, and the A-Team was supposed to be one of the leading contributors. “No wayyyyyy.... We ain’t be doing dat shizzzneeet,” cried Brad. However, there was no avoiding it. The team began preparing an extra-large corpus in order to demo their system to the adoring masses. Much to the team’s delight, the expanded corpus performed extremely well.

C.8 The Day of Reckoning

The air was thick with anticipation as the team awoke from their all too brief collective slumber. Nick was manning the poster as the team made their way down to the battlefield (Jacobs room 134). The team was in the zone. Every word flowed like molasses. Every analogy left the audience dumbfounded. Every second was a masterpiece. The presentation went off without a hitch, the demo impressed, and the team’s work was well received by most.

C.9 There and Back Again, a Project Manager’s Story

As I sit here writing the final text of my year in clinic the memories come flooding back. However, I have to go so you don’t get to read about any of them. Clinic is finished as of the completion of this sentence... this sentence that I am typing right now... PERIOD.

Bibliography

Allen, Jont B., & Neely, Stephen T. 1997. Modeling the relation between the intensity of just-noticeable difference and loudness for pure tones and wideband noise. Journal of the Acoustical Society of America, 102(6), 3628–3645.

Aucouturier, Jean-Julien, & Pachet, François. 2002. Music similarity measures: What’s the use? In: Fingerhut (2002).

Bladon, R. A. W., & Lindblom, Björn. 1981. Modeling the judgement of vowel quality differences. Journal of the Acoustical Society of America, 69(5), 1414–1422.

Celko, Joe. 1996. A look at SQL trees. DBMS online.

Chen, Hui. 2003. Approximation algorithms for TSP. http://www.msci.memphis.edu/~giri/7713/f00/HuiChen/HuiChen2.htm.

Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., & Stein, Clifford. 2001. Introduction to algorithms. Second edn. Cambridge: MIT Press.

Fingerhut, Michael (ed). 2002. ISMIR 2002 conference proceedings. Paris: IRCAM, for ISMIR.

Fletcher, H., & Munson, W. 1933. Loudness, its definition, measurement, and calculation. Journal of the Acoustical Society of America, 5, 82–108.

Foote, Jonathan, & Cooper, Matthew. 2002. Automatic music summarization via similarity analysis. In: Fingerhut (2002).

Foote, Jonathan, Cooper, Mathew, & Nam, Unjung. 2002. Audio retrieval by rhyth- mic similarity. In: Fingerhut (2002).

Fritzke, Bernd. 1995. A growing neural gas network learns topologies. Advances in neural information processing systems.

Fung, Glenn. 2001 (June). A comprehensive overview of basic clustering algorithms. www.cs.wisc.edu/~gfung/clustering.pdf.

Hand, David, Mannila, Heikki, & Smyth, Padhraic. 2001. Principles of data mining. Adaptive Computation and Machine Learning. Cambridge: Bradford.

Harkins, Susan Sales, & Reid, Martin W.P. 2001. Many web developers prefer MySQL. Databasics.

Howes, Timothy A. 1995. The lightweight directory access protocol: X.500 lite. CTI technical report 95-8.

Kohonen, T. 1982. Self-organized formation of topologically correct feature maps. Biological cybernetics.

Larrañaga, P., Kuijpers, C.M.H., Murga, R.H., Inza, I., & Dizdarevic, S. 1999. Genetic algorithms for the travelling salesman problem: A review of representations and operators. Artificial Intelligence Review.

Logan, Beth, & Salomon, Ariel. 2001a (June). A content-based music similarity function. Tech. rept. CRL-2001-2. Cambridge Research Laboratory.

Logan, Beth, & Salomon, Ariel. 2001b. A music similarity function based on signal analysis. ICME.

Pampalk, E., Rauber, A., & Merkl, D. 2002. Content-based organization and visualization of music archives. In: Proceedings of the ACM Multimedia. Juan les Pins, France: ACM.

Pradeep, Gatram, & Gupta, Shalabh. 2003. Extraction of frequency from words. http://www.cse.iitk.ac.in/~amit/courses/768/00/gatram/freq.html.

Rabiner, Lawrence, & Juang, Biing-Hwang. 1993. Fundamentals of speech recogni- tion. Englewood Cliffs: Prentice Hall.

Rauber, Andreas, Pampalk, Elias, & Merkl, Dieter. 2002. Using psycho-acoustic models and self-organizing maps to create a hierarchical structuring of music by sound similarity. In: Fingerhut (2002).

Schroeder, M. R., Atal, B. S., & Hall, J. L. 1979. Optimizing digital speech coders by exploiting masking properties of the human ear. Journal of the Acoustical Society of America, 66(6), 1647–1652.

Stearns, Samuel D., & David, Rath A. 1996. Signal processing algorithms in Matlab. Prentice Hall Signal Processing Series. Saddle River: Prentice Hall.

Syswerda, G. 1991. Schedule optimization using genetic algorithms. In: Larrañaga et al. (1999).

Zwicker, E., & Fastl, H. 1999. Psychoacoustics. Second edn. Springer Series in Information Sciences, no. 22. Berlin: Springer.