
DISSERTATION

Computational Models of Music Similarity and their Application in Music Information Retrieval

ausgeführt zum Zwecke der Erlangung des akademischen Grades Doktor der technischen Wissenschaften

unter der Leitung von

Univ. Prof. Dipl.-Ing. Dr. techn. Gerhard Widmer
Institut für Computational Perception
Johannes Kepler Universität Linz

eingereicht an der

Technischen Universität Wien
Fakultät für Informatik

von

Elias Pampalk
Matrikelnummer: 9625173
Eglseegasse 10A, 1120 Wien

Wien, März 2006

Kurzfassung

Die Zielsetzung dieser Dissertation ist die Entwicklung von Methoden zur Unterstützung von Anwendern beim Zugriff auf und bei der Entdeckung von Musik. Der Hauptteil besteht aus zwei Kapiteln. Kapitel 2 gibt eine Einführung in berechenbare Modelle von Musikähnlichkeit. Zudem wird die Optimierung der Kombination verschiedener Ansätze beschrieben und die größte bisher publizierte Evaluierung von Musikähnlichkeitsmaßen präsentiert. Die beste Kombination schneidet in den meisten Evaluierungskategorien signifikant besser ab als die Ausgangsmethode. Besondere Vorkehrungen wurden getroffen, um Overfitting zu vermeiden. Um eine Gegenprobe zu den Ergebnissen der Genreklassifikation-basierten Evaluierung zu machen, wurde ein Hörtest durchgeführt. Die Ergebnisse vom Test bestätigen, dass Genre-basierte Evaluierungen angemessen sind, um effizient große Parameterräume zu evaluieren. Kapitel 2 endet mit Empfehlungen bezüglich der Verwendung von Ähnlichkeitsmaßen.

Kapitel 3 beschreibt drei Anwendungen von solchen Ähnlichkeitsmaßen. Die erste Anwendung demonstriert, wie Musiksammlungen organisiert und visualisiert werden können, so dass die Anwender den Ähnlichkeitsaspekt, der sie interessiert, kontrollieren können. Die zweite Anwendung demonstriert, wie auf der Künstlerebene Musiksammlungen hierarchisch in sich überlappende Gruppen organisiert werden können. Diese Gruppen werden mittels Wörter von Webseiten zusammengefasst, welche mit den Künstlern assoziiert sind. Die dritte Anwendung demonstriert, wie mit minimalen Anwendereingaben Playlisten generiert werden können.

Abstract

This thesis aims at developing techniques which support users in accessing and discovering music. The main part consists of two chapters. Chapter 2 gives an introduction to computational models of music similarity. The combination of different approaches is optimized and the largest evaluation of music similarity measures published to date is presented. The best combination performs significantly better than the baseline approach in most of the evaluation categories. A particular effort is made to avoid overfitting. To cross-check the results from the evaluation based on genre classification a listening test is conducted. The test confirms that genre-based evaluations are suitable to efficiently evaluate large parameter spaces. Chapter 2 ends with recommendations on the use of similarity measures.

Chapter 3 describes three applications of such similarity measures. The first application demonstrates how music collections can be organized and visualized so that users can control the aspect of similarity they are interested in. The second application demonstrates how music collections can be organized hierarchically into overlapping groups at the artist level. These groups are summarized using words from web pages associated with the respective artists. The third application demonstrates how playlists can be generated which require minimum user input.

Acknowledgments

This work was carried out while I was working as a researcher at the Austrian Research Institute for Artificial Intelligence (OFAI). I want to thank Gerhard Widmer for giving me the opportunity to work at OFAI and supervising me. His constant feedback and guidance have shaped this thesis from the wording of the title to the structure of the conclusions. I want to thank Andreas Rauber for reviewing this thesis, for convincing me to start working on MIR in the first place (back in 2001), and for his supervision during my Master’s studies.

Colleagues at OFAI

I want to thank my colleagues at OFAI for lots of interesting discussions, various help, and making life at OFAI so enjoyable: Simon Dixon, Werner Goebl, Dominik Schnitzer, Martin Gasser, Søren Tjagvad Madsen, Asmir Tobudic, Paolo Petta, Markus Mottl, Fabien Gouyon, and those who have moved to Linz in the meantime: Tim Pohle, Peter Knees, and Markus Schedl. I especially want to thank: Simon for pointers to related work, help with signal processing, Linux, English expressions, lots of useful advice, and helpful comments on this thesis; Werner for introducing me to the fascinating world of expressive piano performances; Dominik for some very inspiring discussions and for the collaboration on industry projects related to this thesis; Martin for helping me implement the “Simple Playlist Generator” and for lots of help with Java; Paolo for always being around when I was looking for someone to talk to; Arthur for interesting discussions related to statistical evaluations; Markus M. for help with numerical problems; Fabien for interesting discussions related to rhythm features and helpful comments on this thesis; Tim for the collaboration on feature extraction and playlist generation; Peter for the collaboration on web-based artist similarity; and Markus S. for the collaboration on visualizations. I also want to thank the colleagues from the other groups at OFAI, the system administrators, and the always helpful secretariat. It was a great time and I would have never managed to complete this thesis without all the help I got.

Other Colleagues

I want to thank Xavier Serra and Mark Sandler for inviting me to visit their labs where I had the chance to get to know many great colleagues. I want to thank Perfecto Herrera for all his help during my visit to the Music Technology Group at UPF, for his help designing and conducting listening tests (to evaluate similarity measures for drum sounds), and for many valuable pointers to related work. I’m also very grateful to Perfe for his tremendous efforts writing and organizing the SIMAC project proposal. Without SIMAC this thesis might not have been related to MIR. (Special thanks to the European Commission for funding it!) Juan Bello has been very helpful in organizing my visit to the Centre for Digital Music at QMUL. I’m thankful for the interesting discussions on chord progressions and harmony in general. Which brings me to Christopher Harte who has taught me virtually everything I know about chords, harmony, and tonality. I’m also grateful for the guitar lessons he gave me. I want to thank Peter Hlavac for very interesting discussions on the music industry, and in particular for help and ideas related to the organization of drum sample libraries. I want to thank John Madsen for insightful discussions which guided the direction of my research. I’m grateful to many other colleagues in the MIR community who have been very helpful in many ways. For example, my opinions and views on evaluation procedures for music similarity measures were influenced by the numerous discussions on the MIREX and Music-IR mailing lists. In this context I particularly want to thank Kris West for all his enthusiasm and efforts. I’m also thankful to: Jean-Julien Aucouturier for a helpful review of part of my work; Beth Logan for help understanding the KL-Divergence; Masataka Goto for giving me the motivation I needed to finish this thesis a lot sooner than I would have otherwise; Stephan Baumann for interesting discussions about the future of MIR; and Emilia Gómez for interesting discussions related to harmony. Special thanks also to Markus Frühwirth who helped me get started with MIR.

Friends & Family

I’m very grateful to my friends for being who they are, and for helping me balance my life and giving me valuable feedback and ideas related to MIR applications. I especially want to thank the Nordsee gang for many very interesting discussions. Finally, I want to thank those who are most important: my parents and my two wonderful sisters. I want to thank them for sharing so much with me, for always being there for me, for all the love, encouragement, and support.

Financial Support

This work was supported by the EU project SIMAC (FP6-507142) and by the project Y99-INF, sponsored by the FWF in the form of a START Research Prize.

Contents

1 Introduction
  1.1 Outline of this Thesis
    1.1.1 Contributions of this Doctoral Thesis
  1.2 Matlab Syntax

2 Audio-based Similarity Measures
  2.1 Introduction
    2.1.1 Sources of Information
    2.1.2 Related Work
  2.2 Techniques
    2.2.1 The Basic Idea (ZCR Illustration)
    2.2.2 Preprocessing (MFCCs)
      2.2.2.1 Power Spectrum
      2.2.2.2 Mel Frequency
      2.2.2.3 Decibel
      2.2.2.4 DCT
      2.2.2.5 Parameters
    2.2.3 Spectral Similarity
      2.2.3.1 Related Work
      2.2.3.2 Thirty Gaussians and Monte Carlo Sampling (G30)
      2.2.3.3 Thirty Gaussians Simplified (G30S)
      2.2.3.4 Single Gaussian (G1)
      2.2.3.5 Computation Times
      2.2.3.6 Distance Matrices
    2.2.4 Fluctuation Patterns
      2.2.4.1 Details
      2.2.4.2 Illustrations
    2.2.5 Simple Features
      2.2.5.1 Time Domain
      2.2.5.2 Power Spectrum
      2.2.5.3 Mel Power Spectrum
      2.2.5.4 Fluctuation Patterns
      2.2.5.5 Illustrations
    2.2.6 Linear Combination
    2.2.7 Anomalies
      2.2.7.1 Always Similar
      2.2.7.2 Triangular Inequality
      2.2.7.3 Always Dissimilar
  2.3 Optimization and Evaluation
    2.3.1 Related Work
      2.3.1.1 MIREX
    2.3.2 Procedure
    2.3.3 Data
    2.3.4 Artist Filter
    2.3.5 Optimization
      2.3.5.1 Combining Two (Feature Selection)
      2.3.5.2 Combining Seven
      2.3.5.3 Overfitting
    2.3.6 Evaluation
      2.3.6.1 Analyzing the Improvement
      2.3.6.2 Impact of the Length of the Analyzed Audio
      2.3.6.3 Listening Test
  2.4 Limitations and Outlook
  2.5 Alternative: Web-based Similarity
    2.5.1 Related Work
    2.5.2 Similarity Computations (Technique)
    2.5.3 Limitations
  2.6 Conclusions
    2.6.1 Optimization and Evaluation Procedures
    2.6.2 Recommendations

3 Applications
  3.1 Introduction
  3.2 Islands of Music
    3.2.1 Introduction
    3.2.2 Related Work
    3.2.3 The Self-Organizing Map
    3.2.4 Visualizations
      3.2.4.1 Smoothed Data Histograms
      3.2.4.2 Weather Charts and Other Visualizations
    3.2.5 Aligned Self-Organizing Maps
      3.2.5.1 Details
      3.2.5.2 Illustration
    3.2.6 Browsing Different Views (Demonstration)
    3.2.7 Conclusions
  3.3 Fuzzy Hierarchical Organization
    3.3.1 Introduction
    3.3.2 Background
    3.3.3 Hierarchical Clustering
    3.3.4 Term Selection for Cluster Description
      3.3.4.1 Techniques
      3.3.4.2 Domain-Specific Dictionary
    3.3.5 Results and Discussion
      3.3.5.1 User Interface
      3.3.5.2 Comparison of Term Selection Techniques
      3.3.5.3 Discussion
    3.3.6 Conclusions
  3.4 Dynamic Playlist Generation
    3.4.1 Introduction
    3.4.2 Related Work
    3.4.3 Method
    3.4.4 Evaluation
      3.4.4.1 Procedure (Hypothetical Use Cases)
      3.4.4.2 Data
      3.4.4.3 Results
    3.4.5 Implementation and Examples
    3.4.6 Conclusions
  3.5 Conclusions

4 Conclusions

Chapter 1 Introduction

This chapter briefly describes the motivation and context of this thesis. In Section 1.1 the thesis is outlined (including a summary of the major contributions). In Section 1.2 some examples are given which explain the syntax of Matlab expressions. (In Chapter 2 Matlab syntax is used to define algorithms.)

Motivation

The value of a large music collection is limited by how efficiently a user can explore it. Portable audio players can store over 20,000 songs and online music shops offer more than 1 million tracks. Furthermore, a number of new services are emerging which give users nearly unlimited access to music (see e.g. [KL05]).

New tools are necessary to deal with this abundance of music. Of particular interest are tools which can give recommendations, create playlists, and organize music collections. One solution is to utilize the power of Internet communities with techniques such as collaborative filtering. Furthermore, communities can share playlists as well as lists of their favorite artists.

This thesis deals with tools which do not use input from communities. The first part of this thesis is on computational models of audio-based music similarity. The second part demonstrates applications which can be built on such similarity measures.

Music Information Retrieval (MIR)

Music Information Retrieval is an interdisciplinary research field which deals with techniques to search and retrieve music related data. A list of relevant topics can be found online.1 The topics include, among many others, computational models of music similarity and their applications. The major MIR conference is the annual ISMIR International Conference on Music Information Retrieval, which started in the year 2000.2 Many of the papers referenced in this thesis have been published at ISMIR conferences. Besides the conferences, the primary communication channel within the community is the MUSIC-IR mailing list.2

1 http://ismir2006.ismir.net/callforpapers.html
2 http://www.ismir.net

Related Music Services

The overall goal of this thesis is to support users in accessing and discovering music. There are a number of music services available which already provide this functionality. Particularly well known are Apple’s iPod and the associated iTunes Music Store, which offers its customers various ways of discovering music. Recently the billionth song was sold over the portal. Apple’s iTunes, Amazon, and other online stores are changing the distribution channels for music. Brick and mortar stores cannot compete with unlimited shelf space, highly efficient recommendations based on customer profiles, and 24/7 opening hours (every day of the year).

Furthermore, a number of services offer (nearly or completely free) access to huge music collections. Such services often have the form of easily personalizable Internet radio stations and are powerful tools to discover or simply enjoy new music. Such services include Yahoo! Launchcast or Musicmatch, Pandora, and Last.FM. There are numerous other services which supply metadata (and automatically tag music), organize personal collections, create playlists, or support searching for similar artists. These services, tools, or projects include FreeDB, MusicMoz, MusicBrainz, MusicMagic Mixer by Predixis, GraceNote’s recommendation engine, Musiclens, Liveplasma, and Gnoosic. Of particular interest are Internet platforms which allow communities to exchange their music experiences. Such platforms include UpTo11, LiveJournal, Audioscrobbler, Webjay, and MySpace.

All in all, tools which enhance the experience of listening to music are rapidly developing. In this context the content-based similarity measures and the applications described in this thesis are a small building block to improve these tools.

1.1 Outline of this Thesis

This thesis consists of two major chapters. Chapter 2 deals with audio-based similarity measures. First, an introduction to similarity measures is given and state-of-the-art techniques are described. Second, combinations of these techniques are optimized and evaluated. The presented evaluation is the largest published to date. Six different collections with a total of over 20,000 pieces from over 60 genres are used. Various precautions are taken to avoid overfitting. The appropriateness of using genre classification based evaluations to explore large parameter spaces is confirmed by a listening test. In particular, the results of the listening test show that the differences measured through genre-based evaluations correspond to human listeners’ ratings.

One of the findings of Chapter 2 is that overfitting is a very critical issue. Overfitting occurs when the performance of a similarity measure is so highly optimized for a specific music collection that its performance on a different collection is poorer than the non-optimized version. The precautions demonstrated in this thesis are highly recommendable for further work in this direction. In particular, when using genre classification based evaluations, these are:

1. Artist filter
An artist filter should be used to avoid overestimating the performance of the algorithm and to measure improvements more accurately. An artist filter ensures that the test and training set contain different artists. As shown in this thesis, the classification accuracies are significantly lower if such a filter is used.

2. Multiple collections
Several collections should be used both for training and for testing. As shown in this thesis, there is a high risk of overestimating performances when generalizing results based solely on one collection. The collections should be from different sources. (They should not only have different audio files but also have different genre taxonomies.)

3. Different methods to measure the same similarity aspect
In Chapter 2 different similarity measures are combined. The goal is to combine aspects of similarity which complement each other. Given such a combination one measure can be replaced by another if it describes the same aspect. The resulting non-systematic deviations indicate how robust the combination is and how well the results can be generalized.

A further outcome of Chapter 2 is a set of recommendations on the use of audio-based similarity measures. For a standard application the combination which performs best in the experiments described in this thesis is recommended. Compared to the baseline approach this combination is more robust towards anomalies in the similarity space (e.g. the triangular inequality is fulfilled more often), it performs best on all collections (although on the largest collection the improvements are not significant), and it is computationally not significantly more expensive than the baseline.

Chapter 3 describes three applications of similarity measures. For each application the necessary techniques (e.g. Aligned Self-Organizing Maps) are described, a demonstration is given, and limitations are discussed. The applications are:

1. Islands of Music
A metaphor of geographic maps is used to visualize the structure of a music collection. Islands represent groups of similar pieces. Similar pieces are located close to each other on the map. A technique which allows the user to gradually shift the focus between different aspects of similarity is described. When the focus is shifted the islands are gradually rearranged according to the changing focus on similarity aspects.

2. Fuzzy Hierarchical Organization
Music collections are hierarchically organized into overlapping groups. This is done at the artist level. That is, instead of pieces of music (as in the other applications) the smallest entities are artists. Each group of similar artists is summarized with words co-occurring on web pages containing the artists’ names.

3. Dynamic Playlist Generation
The user interaction necessary to generate a playlist is minimized. Using the skip button, the users interactively teach the system their current listening preferences. An evaluation using hypothetical usage scenarios shows that the suggested heuristic drastically reduces the number of necessary skips.

1.1.1 Contributions of this Doctoral Thesis

The contributions can be divided into two categories. The contributions described in Chapter 2 deal with (mainly audio-based) music similarity measures. The contributions in Chapter 3 deal with the application of similarity measures. Most of the work was carried out in close collaboration with a number of coauthors.

Similarity Measures

◦ Development of evaluation procedures and evaluation of similarity measures [PDW03b; Pam04; PFW05b] including the procedures and evaluation presented in this thesis, which is the largest evaluation published to date. Important findings include the necessity of using an artist filter, and the fact that genre classification based evaluations can be used instead of listening tests to efficiently evaluate large parameter spaces.

◦ Development of the freely (GNU/GPL) available Matlab MA Toolbox [Pam04]3 which implements several audio similarity measures, among them the approach presented by Aucouturier & Pachet [AP02a] which is described in this thesis (G30). Furthermore, all necessary code to implement G1C, the similarity measure which performs best in the evaluations presented in Chapter 2, is listed in this thesis.

◦ Development of new features (e.g. Gravity and Focus [PFW05b]), modification of similarity measures (e.g. the MFCC-based version of the Fluctuation Patterns and the G30S approach described in this thesis and in [Pam05]), and the development of optimized combinations [PFW05b] (including the new combination presented in this thesis).

Applications

◦ Development of the Aligned Self-Organizing Map algorithm and demonstration of its ability to organize and visualize music collections [PGW03; PDW03a; Pam03; PDW04]. In addition, this algorithm has been successfully applied to analyze expressive piano performances [PGW03; GPW04].

◦ Development and demonstration of an approach to hierarchically cluster music collections into overlapping groups and summarize groups of similar artists with words [PFW05a]. This clustering technique was presented in [PHH04] (it was developed to organize large drum sample libraries) and is based on insights gained in work presented in [PWC04].

◦ Development of a simple approach to dynamically generate playlists using minimal user interaction based on skipping behavior [PPW05a]. (Including an effective evaluation procedure to evaluate the suggested heuristics.)

1.2 Matlab Syntax

In Chapter 2 all code necessary to implement the similarity measure which performs best in the evaluations is given. Matlab syntax was chosen because of its compact form, and because numerous good tutorials are available online. Furthermore, all code in this thesis should be compatible with Octave4 which is freely (GNU/GPL) available. This section briefly describes the syntax used in the next chapter.

3 http://www.ofai.at/~elias.pampalk/ma

Matlab has a similar expression syntax to standard programming languages. There are several data types in Matlab. The most important is the standard type which is a 2-dimensional matrix. In general a matrix is n-dimensional. The data type of the values is automatically determined and can be, e.g., double (by default), single, logical, complex, etc. In contrast to most other programming languages the index of the first element in an array in Matlab is always 1 (not 0). In the following code fragment the first line creates a matrix A with 6 elements. All other lines link the expression on the left hand side of the equation to the right hand side (all equations are true). The intention is to explain the left hand side. The following lines of code are examples of how a matrix can be indexed:

A = [11 12; 21 22; 31 32]; %% 3 rows, 2 columns
size(A) == [3 2]           %% size(A,1) == 3, size(A,2) == 2
length(A) == 3             %% length(A) == max(size(A))
A(1,2) == 12               %% 1st row, 2nd column
A(1,:) == [11 12]          %% index all columns using ":"
length(A(:)) == 6          %% a 2-dim matrix can be interpreted as vector
                           %% if only one instead of two indices are used
A(end-3) == 31             %% "end" in this case equals 6

The “%%” is used to mark comments. Sometimes “...” is used to continue long lines on the next line. The following operators are frequently used: assignment “=”, equal “==”, and not equal “~=”. Here are some additional examples of Matlab syntax:

1:2:5 == [1 3 5]           %% vector: "start value : step size : end value"
1:10 == 1:1:10
A(1:2:3,1) == [11; 31]     %% result is a column vector
[11; 31]' == [11 31]       %% quote transposes matrix/vector
A(1,1:end) == A(1,:)       %% in this case: "1:end == 1:2"
A([3 2 1],[2 2]) == [32 32; 22 22; 12 12]
A(A~=31 & A>20)' == [21 22 32]
(A(:)' == 31) == [0 0 1 0 0 0]
find(A==31) == 3           %% index of an element
linspace(a,b,n) == a:(b-a)/(n-1):b %% linearly spaced with n steps

4http://www.octave.org 1.2 Matlab Syntax 7

Frequently used operators are the (matrix) multiplication “*” and the element-wise multiplication “.*”. The following code demonstrates the use of loops and a matrix multiplication of A with B.

%% C = A*B; %% size(A,2) == size(B,1)
C = zeros(size(A,1),size(B,2)); %% allocate memory
for i = 1:size(A,1),
    for j = 1:size(A,2),
        for k = 1:size(B,2),
            C(i,k) = C(i,k) + A(i,j)*B(j,k);
        end
    end
end

Without memory allocation (creating a matrix of zeros) the code is still correct, but would run much slower as Matlab would create a new (bigger) matrix in every loop of the iteration and copy the values of the old matrix. Finally, here is the code used to load the 2 minutes from the center of a WAV file into Matlab. The amplitude is normalized for the Decibel computations (see Algorithm 2.7).

function wav = loadwav(filename)
max_dB = 96;
[siz fs] = wavread(filename,'size');
%% use max (center) 2min
if fs~=22050,
    error([filename,': fs~=22050 (',num2str(fs),')'])
end
if siz(1) > fs*120,
    x0 = ceil(siz(1)/2 - fs*60);
    x1 = floor(siz(1)/2 + fs*60);
else
    x0 = 1;
    x1 = siz(1);
end
%% wavread returns values in the range -1 to 1
wav = wavread(filename,[x0 x1]) * (10^(max_dB/20));
%% end loadwav

Chapter 2 Audio-based Similarity Measures

Computational models of audio-based music similarity are the core component of two of the applications described in Chapter 3. This chapter gives an introduction to such similarity measures. Different approaches, which complement each other, are described. The combination of these is optimized and a large scale evaluation is conducted (including a listening test). Overall, the chosen optimization and evaluation procedures, including different approaches to avoid overfitting (such as using an artist filter), can serve as a guideline for future work in this direction. The best combination outperforms the baseline (which is the best individual similarity measure) on all of the collections used for the experiments, and is significantly more robust with respect to anomalies in the similarity space. However, on the largest test collection the performance improvements are negligible. This chapter concludes with a recommendation for the computation of similarity which considers computation time, robustness, and quality. All details necessary (i.e. Matlab code) to implement this recommendation are given in this chapter.

2.1 Introduction

The perception of music similarity is subjective and context dependent. Relevant aspects of similarity include: instrumentation, timbre, melody, harmony, rhythm, tempo, mood, lyrics, sociocultural background, structure, and complexity. The targeted application defines the type of similarity which is of interest. For example, is the similarity measure intended to support musicologists analyzing pieces with a similar structure in a large archive? Or is it used by listeners who love classical music to explore similar interpretations of their favorite sonatas? Or is it used to replace the random shuffle function of a mobile audio player? (Such a playlist generation algorithm is one of the intended applications of the similarity measures described in this chapter.) Generally, in this thesis any definition of similarity is of interest which can be used for the applications described in Chapter 3. For example, similarity can be defined in terms of what pieces in the same playlist have (or should have) in common.

Context-dependent Similarity

The context of music similarity depends on the application and the user. In an ideal case the users could easily define the aspect of similarity they are currently interested in. However, as the low-level features used in this thesis are more or less meaningless to the users, one cannot expect them to define which of these correspond to the concepts they have in mind. Alternatively, the users could give the system some examples defining which pieces they consider to be similar and which not, implicitly defining the context. This could be extended to some form of relevance feedback where the user could interactively respond to the system. However, a relatively large number of examples are required to avoid overfitting. Furthermore, the users might get frustrated, as it is impossible for the system to understand music the same way the users do. In particular, based on the low-level features used in this thesis it is not possible to accurately describe any of the aspects of similarities mentioned above. Thus, this section solely aims at developing a rough overall similarity which, for example, can outperform the currently very popular random shuffle function to generate playlists.

Structure of this Chapter

This chapter is structured as follows. The remainder of this section discusses which sources can be used to obtain similarity information and gives some pointers to related work. Section 2.2 describes different techniques to compute audio-based similarity. The combination of these techniques is optimized and evaluated in Section 2.3. Section 2.4 discusses limitations and possible directions for future work. Section 2.5 describes an alternative using web-based community data. Finally, Section 2.6 concludes this chapter and gives recommendations on the choice of the similarity measure depending on the application.

2.1.1 Sources of Information

Obviously, the source is the music itself and the whole sociocultural background. However, this information cannot be directly interpreted by a machine which is, e.g., trying to give a user recommendations. A machine can rely on either judgments by human experts or large communities of music listeners, or content analysis algorithms.

Experts

High quality judgments require experts. However, due to relatively high costs the fraction of music that can be dealt with is very small. The spectrum of music worth annotating with information which allows similarity computations is limited to music which is expected to sell. Examples of systems using expert judgments include the All Music Guide1 and the Music Genome Project2. The trained musicians working on the Music Genome Project annotate about 7000 pieces per month, according to an interview given by Tim Westergren.3 Each piece is described using a set of 400 “genes”. The genes used to describe a piece include, for example, “mild rhythmic syncopation” or “major key tonality”. It takes an expert about 20-30 minutes to annotate one piece. Considering that about 40,000 new albums are released each year in the US alone makes it obvious that it will never be possible to annotate all music this way.

Communities

Large communities can deal with much more music than a few paid experts. One of the techniques to gain insights from communities is collaborative filtering. Examples include the Internet radio station Last.fm4 and Amazon5. Last.fm uses user feedback (such as: “I don’t like this song”) to create playlists of similar songs. Amazon uses its customers’ shopping behavior to recommend similar items. The basic idea is to recommend pieces which other users with a similar profile enjoy. However, this focus on popularity also means that it is difficult for unknown pieces (e.g. from garage bands) to surface.

Content Analysis Algorithms

Algorithms which directly analyze the music audio content are the essence of this chapter. The data is given in the form of audio signals, lyrics, and music videos. Algorithms can deal with much more music, and in different ways, than large communities. Their low cost allows them to be applied to commercially less interesting music. Such music includes, for example, music released under a creative commons license.6 However, the performance of content analysis algorithms is limited and surely not comparable to an expert’s rating.

1 http://allmusic.com
2 http://pandora.com/mgp.shtml
3 http://thisweekintech.com/itn6 (3 Jan 2006)
4 http://last.fm
5 http://amazon.com

2.1.2 Related Work

A topic closely related to similarity is audio-based genre classification. One of the first systems was developed by Tzanetakis and Cook [TC02]. More recent approaches include, for example, [MB03; LOL03; WC04; ME05; CEK05; LR05; WC05; MST05; SZ05]. An interesting point to mention about the work on genre classification is that there has been a lot of optimism concerning the results. Higher accuracies were reported every year. Several publications report classification accuracies beyond 80%. Some authors suggest that the errors their algorithms make are comparable to the errors humans make when classifying music. In contrast to these findings, the genre classification results which are reported in this thesis are far from the performance one could expect from human listeners. In fact, on one of the music collections used for evaluation (DB-30) the classification accuracy is below 15%. One important issue is how to measure the classification accuracies, and in particular, the use of an artist filter which is discussed later on.

There are two links between genre classification and similarity measures. One is that similarity measures can be evaluated using a genre classification scenario (this is discussed in Section 2.3). The other is that features which work well for genre classification are likely to also work well for similarity computations. An overview and evaluation of many popular features used for classification can be found in [Poh05; PPW05b]. In addition, it has been suggested that it is possible to automatically extract features [ZP04].

Closely related is also work on self-similarity. Self-similarity is usually used to analyze the structure (in particular repetitions) of a piece. Repetitions can be used, for example, to summarize a piece. Techniques used to compute the similarity between segments are also of interest for similarity measures which compare whole pieces. Related work includes, e.g., [Foo99; Got03; OH04; Jeh05]. Directly related to summarization is also segmentation. Good segmentations could help improve the performance of a genre classifier or music similarity measure (see e.g. [WC05]). Work directly related to spectral similarity measures is reviewed within Subsection 2.2.3. Work directly related to the evaluation of audio-based similarity measures is reviewed in Subsection 2.3.1.

6 http://creativecommons.org

2.2 Techniques

To compute similarities it is necessary to extract features (also known as descriptors) from the audio signal. Examples for good features are instrumentation (e.g. is a guitar present?) or mood (e.g. is it a depressing song?). However, current state-of-the-art techniques cannot reliably identify the instruments in a piece, and are even less reliable when it comes to determining the mood of a piece. The application context defines which features are of interest. For example, generating better-than-random playlists does not necessarily require knowledge of the melody or chords, but would definitely benefit from knowledge concerning the tempo or instrumentation. Extracting good features is far from easy and involves techniques and theories from signal processing, psychoacoustics, music perception, and (exploratory) data analysis in general.

Low-Level Features

Most features discussed in this section are low-level features. This means that although they are somehow related to a human listener’s perception, in general it is very difficult to assign high-level terms to them such as “timbre” or “rhythm”. Developing higher level features is ongoing work in the MIR research community and is briefly discussed in Section 2.4.

Temporal Scope of a Feature

Features can have different temporal scopes. Some features are extracted on the frame level. That is, the audio signal is chopped into frames with a fixed length. Typical lengths vary from 12 milliseconds to 12 seconds. Alternatively, a more meaningful segmentation can be used. For example, the signal can be segmented with respect to note onsets. If a feature has a frame or segment level scope, then it can be interpreted as a multivariate time series. Comparing time series data directly is rather difficult. In particular, repetitions, slight tempo deviations and so forth need to be dealt with. Thus, in general some form of summarization of the time series data is used to describe a whole piece. This is especially difficult because pieces are often very inhomogeneous (e.g. Bohemian Rhapsody by Queen). Simply calculating the mean of the extracted features cannot describe, for example, differences between the chorus and the verse. Closely related to the summarization is the question of how to compare the representation of one piece with another. Depending on the chosen representation this might not be trivial. For example, some of the similarity measures described in this section (which use features with a frame-level scope) use Monte Carlo sampling or the Kullback-Leibler divergence to compare pieces.

Computational Limits

In general it is not possible to model every nerve cell in the human auditory system when processing music archives with terabytes of data. Again the intended application defines the requirements. A similarity measure that runs on a mobile device will have other constraints than one which can be run in parallel on a massive server farm. Furthermore, it makes a big difference if the similarities are computed for a collection of a few hundred pieces, or for a catalog of a few million pieces. Finding the optimal trade-off between required resources (including memory and processing time) and quality might not be trivial.

Structure of this Section

This section is structured as follows. The next subsection gives a simple introduction to similarity computations using the Zero Crossing Rate as an example. Subsection 2.2.2 describes how the time domain representation of the audio signals is transformed to the frequency domain. Subsections 2.2.3–2.2.5 describe different features and how they are used to compute similarity. The main focus is on spectral similarity (which is somehow related to timbre) and Fluctuation Patterns (which are somehow related to rhythmical properties). Subsection 2.2.6 describes how the different approaches are combined linearly. Subsection 2.2.7 describes anomalies in the similarity space. In particular, the triangular inequality does not always hold, and a few pieces are estimated to be highly similar to a very large number of pieces while others are highly dissimilar to almost all other pieces.

2.2.1 The Basic Idea (ZCR Illustration)

This subsection illustrates the concept of audio-based music similarity using the Zero Crossing Rate (ZCR) as an example. The ZCR is very simple to compute and has been applied in speech processing to distinguish voiced sections from noise. Furthermore, it has been applied to MIR tasks such as classifying percussive sounds, or genres. For example, the winning entry of the MIREX 2005 genre classification contest used the ZCR among other features.7

7 The MIREX contest will be discussed in more detail in Subsection 2.3.1.1.

Figure 2.1: Illustration of the ZCR computation using a 5 milliseconds audio excerpt (amplitude over time in ms). The dotted line marks zero amplitude. The 15 circles mark the zero crossings.

[Figure 2.2 shows six waveform excerpts with the following ZCR values: Dave Brubeck Quartet − Blue Rondo A La Turk (1.4/ms); Dave Brubeck Quartet − Kathy’s Waltz (1.1/ms); Bon Jovi − Bad Medicine (3.7/ms); Evanescence − Bring Me To Life (2.7/ms); DJs @ Work − Someday (1.3/ms); Maurice Ravel − Bolero (1.9/ms).]

Figure 2.2: Audio excerpts with a length of 10 seconds each and corresponding ZCR values. The amplitude is plotted on the y-axis, time on the x-axis.

The ZCR is the average number of times the audio signal crosses the zero amplitude line per time unit. Figure 2.1 illustrates this. (A minimal computation sketch is given at the end of this subsection.) The ZCR is higher if the signal is noisier. Examples are given in Figure 2.2. The first and second excerpts (both jazz) are close to each other with values below 1.5/ms, and the third and fourth (both hard pop) are close to each other with values above 2.5/ms. Thus the ZCR seems capable of distinguishing jazz from hard pop (at least on these 4 examples). The fifth excerpt is electronic dance music with very fast and strong percussive beats. However, the ZCR value is very low. The sixth excerpt is a soft, slow, and bright orchestra piece in which wind instruments play a dominant role. Considering that the ZCR is supposed to measure noisiness the ZCR value is surprisingly high.

The limitations of the ZCR as a noisiness feature for music are obvious when considering that with a single sine wave any ZCR value can be generated (by varying the frequency). In particular, the ZCR does not only measure noise, but also pitch. The fifth excerpt has relatively low values because its energy is mainly in the low frequency bands (bass beats). On the other hand, the sixth excerpt has a lot of energy in the higher frequency bands.

This leads to the question of how to evaluate features and to ensure that they perform well in general and not only on the subset used for testing. Obviously it is necessary to use larger test collections. However, in addition it is also helpful to understand what the features describe and how this relates to a human listener’s perception (in terms of similarity). Evaluation procedures will be discussed in detail in Section 2.3. Another question this leads to is how to extract additional information from the audio signal which might solve problems the ZCR cannot. Other features which can be extracted directly from the audio signal include, for example, the Root Mean Square (RMS) energy which is correlated with loudness (see Subsection 2.2.5). However, in general it is a standard procedure to first transform the audio signal from the time domain to the spectral domain before extracting any further features. This will be discussed in the next subsection.
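As a minimal sketch of this computation (assuming the signal is available as the vector wav returned by loadwav in Section 1.2 and sampled at fs Hz; only the strict sign changes between consecutive samples are counted), the ZCR per millisecond could be computed as follows:

%% Minimal ZCR sketch (assumes wav and fs as in Section 1.2).
%% Count sign changes between consecutive samples and normalize
%% by the length of the signal in milliseconds.
zc  = sum(wav(1:end-1).*wav(2:end) < 0);
zcr = zc / (length(wav)/fs*1000);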

2.2.2 Preprocessing (MFCCs)

The basic idea of preprocessing is to transform the raw data so that the interesting information is more easily accessible. In the case of computational models of music similarity this means drastically reducing the overall amount of data. However, this needs to be achieved without losing too much of the information which is critical to a human listener’s perception of similarity.

MFCCs

Mel Frequency Cepstral Coefficients (see e.g. [RJ93]) are a standard preprocessing technique in speech processing. They were originally developed for automatic speech recognition [Opp69], and have proven to be useful for music information retrieval (see e.g. [Foo97; Log00]). Most of the features described in this section are based on MFCCs. Alternatives include, e.g., the sonograms as used in [Pam01]. Sonograms are based on psychoacoustic models [ZF99] which are slightly more complex than those used for MFCCs. However, there are practical limitations for the complexity of the models. In particular, very accurate models, which simulate the behavior of individual nerve cells, are too computationally expensive to be applied to large music collections. Relevant Matlab toolboxes include the Auditory Toolbox8 (which includes an MFCC implementation) and HUTear9.

Basically, to compute MFCCs some very simple (and computationally efficient) filters and transformations are applied which roughly model some of the characteristics of the human auditory system. The important aspects of the human auditory system which MFCCs model are: (1) the non-linear frequency resolution using the Mel frequency scale, (2) the non-linear perception of loudness using Decibel, and to some extent (3) spectral masking effects using a Discrete Cosine Transform (a tone is spectrally masked if it becomes inaudible due to a simultaneous and louder tone with a different frequency). These steps, starting with the transformation of the audio signal to the frequency domain, are described in the following paragraphs.

2.2.2.1 Power Spectrum

Transforming the audio signal from the time domain to the frequency domain is important because the human auditory system applies a similar transformation. In particular, in the cochlea (which is a part of the inner ear) different nerves respond to different frequencies (see e.g. [Har96] or [ZF99]). First, the signal is divided into short overlapping segments (e.g. 23ms and 50% overlap). Second, a window function (e.g. Hann window) is applied to each segment. This is necessary to reduce spectral leakage. Third, the power spectrum matrix P is computed using a Fast Fourier Transformation (FFT, see e.g. [PTVF92]). The power spectrum can be computed with the following Matlab code.

8 http://www.slaney.org/malcolm/pubs.html
9 http://www.acoustics.hut.fi/software/HUTear

First, the variables are initialized and memory is allocated:

01 seg_size = 512; %% 23ms if fs == 22050Hz                        (2.1)
02 hop_size = 256;
03 num_segments = floor((length(wav)-seg_size)/hop_size)+1;
04 P = zeros(seg_size/2+1,num_segments); %% allocate memory
05 %% hann window function, alternative: w = hann(seg_size)
06 w = 0.5*(1-cos(2*pi*(0:seg_size-1)/(seg_size-1)))';

Second, in the main loop, for each frame the FFT is computed:

07 for i = 1:num_segments,                                         (2.2)
08     idx = (1:seg_size)+(i-1)*hop_size;
09     x = abs(fft(wav(idx).*w)/sum(w)*2).^2;
10     P(:,i) = x(1:end/2+1);
11 end

Line 9 computes the Fourier transform (resulting in a complex vector with a length of seg_size). These complex values are normalized and the (real valued) power is computed. Since the Fourier transform was applied to a real valued signal, the result is symmetric and only the first half is kept for further processing in line 10. Further information on the computation of the power spectrum can be found in any standard signal processing text. Each column in matrix P represents the power spectrum for a specific time frame (the number of columns is num_segments). Each row represents one of the seg_size/2+1 linearly spaced frequency bins in the range of 0Hz to fs/2Hz (Nyquist frequency). Figure 2.3 illustrates a segment before and after applying the window function, as well as the corresponding power spectrum.

2.2.2.2 Mel Frequency

The Mel-scale [SVN37] is approximately linear for low frequencies (<500Hz), and logarithmic for higher frequencies (see Figure 2.4). The reference point to the linear frequency scale is a 1000Hz tone which is defined as 1000 Mel. A tone with a pitch perceived twice as high is defined to have 2000 Mel, a tone perceived half as high is defined to have 500 Mel, and so forth. The Mel-scale is defined as

m_Mel = 1127.01048 log(1 + f_Hz/700).    (2.3)
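As a small illustration of Equation 2.3 (the chosen frequencies and the inverse mapping are included here for illustration only; the algorithms below use only the forward mapping):

%% Evaluate the Mel-scale (Equation 2.3) for a few frequencies.
f_hz  = [100 500 1000 2000 5000 10000];
f_mel = 1127.01048 * log(1 + f_hz/700); %% log is the natural logarithm
%% inverse mapping (Mel back to Hz), shown for illustration only
f_rec = 700 * (exp(f_mel/1127.01048) - 1);

The reference point of the scale is recovered: 1000Hz maps to approximately 1000 Mel.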

Figure 2.3: Visualization of the variables in Algorithm 2.1 (panels: segment wav(idx), window function w, weighted segment wav(idx).*w, and power spectrum 10*log10(P(:,i))). Each is a one-dimensional vector. The elements of the vectors are plotted along the x-axis. The dimensions of the segments are time on the x-axis and amplitude on the y-axis. The power spectrum has the dimension Decibel (dB) on the y-axis and frequency bins on the x-axis. The first frequency bin corresponds to 0Hz, the last bin (257th in this case) to the Nyquist frequency (11025Hz).

Figure 2.4: Relative width of the frequency bands according to the Mel-scale as defined in Equation 2.3 (x-axis: frequency in Hz, y-axis: relative width of the frequency band). Wider frequency bands result in lower frequency resolutions. For example, the resolution at 12800Hz is over 40 times lower compared to the resolution at 50Hz.

Figure 2.5: Triangular filters used to transform the power spectrum to the Mel-scale using 34 (Mel-spaced) frequency bands (x-axis: frequency bin, y-axis: filter weight). The upper plot shows all uneven triangles, the lower all even ones.

The power spectrum is transformed to the Mel-scale using a filter bank consisting of triangular filters. Each triangular filter defines the response of one frequency band and is normalized such that the sum of weights for each triangle is the same. In particular, the height of each triangle is 2/d where d is the width of the frequency band. The triangles overlap each other such that the center frequency of one triangle is the starting point for the next triangle, and the end point of the previous triangle (see Figure 2.5). The triangular filters can be computed in Matlab as follows.10 First, the variables are initialized and memory allocated.

01 num_filt = 36; %% number of Mel frequency bands                 (2.4)
02
03 f = linspace(0,fs/2,seg_size/2+1); %% frequency bins of P
04 mel = log(1+f/700)*1127.01048;
05 mel_idx = linspace(0,mel(end),num_filt+2);
06 mel_filter = zeros(num_filt,seg_size/2+1);
07
08 f_idx = zeros(num_filt+2,1);
09 for i=1:num_filt+2,
10     [tmp f_idx(i)] = min(abs(mel - mel_idx(i)));
11 end
12 freqs = f(f_idx);
13
14 %% height of triangles
15 h = 2./(freqs(3:num_filt+2)-freqs(1:num_filt));

Second, in the main loop for each triangular filter the weights are computed.

16 for i=1:num_filt,                                               (2.5)
17     mel_filter(i,:) = ...
18         (f > freqs(i) & f <= freqs(i+1)).* ...
19         h(i).*(f-freqs(i))/(freqs(i+1)-freqs(i)) + ...
20         (f > freqs(i+1) & f < freqs(i+2)).* ...
21         h(i).*(freqs(i+2)-f)/(freqs(i+2)-freqs(i+1));
22 end

10 This code is based on Malcolm Slaney's Auditory Toolbox.

Figure 2.6: Mel filter matrix (mel_filter) as computed in Algorithm 2.4 (high values are visualized black), and Mel power spectrum (10*log10(M(:,i))) of the audio signal in Figure 2.3. The dimensions of the filter matrix are Mel frequency bands on the y-axis and FFT frequency bins (Hz) on the x-axis. The dimensions of the Mel power spectrum are dB on the y-axis and Mel on the x-axis.

In lines 18–21 the variable freqs(i) is the lower bound of the frequency band, freqs(i+1) is the center frequency, and freqs(i+2) is the upper bound. The mel_filter matrix and its effect on the power spectrum is visualized in Figure 2.6. As can be seen, the spectrum is smoothed. Specifically, details in the higher frequencies are lost. The mel_filter is applied by adding the following two lines to Algorithms 2.1 and 2.2.

04b M = zeros(num_filt,num_segments);                              (2.6)
10b M(:,i) = mel_filter * P(:,i);

2.2.2.3 Decibel

Similarly to the non-linear perception of frequency, the human auditory system does not perceive loudness linearly (with respect to the physical properties of the audio signal). In particular, the just noticeable difference in loudness for sounds with a low intensity (Watts/m²) is much smaller than for sounds with a high intensity. A useful approximation of the loudness perception is to use a logarithmic ratio scale known as Decibel (dB). The important part of this scale is the reference to which the ratio is computed. In the examples used in this thesis the reference is 1, and the audio signals are rescaled with respect to this reference as described in Section 1.2. This reference is the threshold of hearing. Decibel values are computed as follows. (First, all values smaller than the reference are set to the reference.)

M(M<1) = 1;                                                        (2.7)
M_dB = 10*log10(M);

Figure 2.7: DCT matrix, MFCCs (mfcc), and reconstructed Mel power spectrum (M_dB_rec, in dB) for the audio signal used in Figure 2.3. High values in the DCT matrix are visualized as black. The dimensions of the DCT matrix are DCT coefficients on the y-axis and Mel frequency bands on the x-axis.

2.2.2.4 DCT

The Discrete Cosine Transform is applied to compress the Mel power spectrum. In particular, the num_filt (e.g. 36) frequency bands are represented by num_coeffs (e.g. 20) coefficients. Alternatives include, e.g., the Principal Component Analysis. A side effect of the compression is that the spectrum is smoothed along the frequency axis which can be interpreted as a simple approximation of the spectral masking in the human auditory system. The DCT matrix can be computed as follows.11

01 num_coeffs = 20;                                                (2.8)
02
03 DCT = 1/sqrt(num_filt/2) * ...
04     cos((0:num_coeffs-1)'*(0.5:num_filt)*pi/num_filt);
05 DCT(1,:) = DCT(1,:)*sqrt(2)/2;

Thus the DCT matrix has num_coeffs rows and num_filt columns. Each of the rows corresponds to an eigenvector, starting with the most important one (highest eigenvalue) in the first row. The first eigenvector describes the mean of the spectrum. The second describes a spectral pattern with high energy in the lower half of the frequencies and low energy in the upper half. The eigenvectors are orthogonal. The DCT is applied to the Mel power spectrum (in Decibel) as follows:

mfcc = DCT * M_dB;                                                 (2.9)

11 This code is based on Malcolm Slaney's Auditory Toolbox.

Figure 2.8: DCT'*DCT for num_coeffs = 5, 10, 15, 20, and 36; each plot is a 36 by 36 matrix. The dimensions of both axes are Mel frequency bands.

The effects of the DCT matrix on the Mel power spectrum are shown in Figure 2.7. The resulting MFCCs are a compressed representation of the original data. In particular, while the original audio signal has 512 samples per 23ms segment (22050Hz, mono) the MFCC representation only requires 20 values per 12ms (using 50% overlap for the power spectrum). Depending on the application the number of coefficients (line 1 in Algorithm 2.8) can range from 8 to 40. However, the number of coefficients is always lower than the number of Mel frequency bands used. The number of Mel frequency bands can be adjusted in Algorithm 2.4, line 1.

To understand the smoothing effects of the DCT it is useful to look at the reconstructed Mel power spectrum (Figure 2.7) and compare it to the original Mel power spectrum (Figure 2.6). The reconstructed spectrum is computed as:

M_dB_rec = DCT’ * mfcc; (2.10)

In addition, to further understand the smoothing it is useful to illustrate DCT'*DCT for different values of num_coeffs (see Figure 2.8). If the number of coefficients equals the number of Mel filters then DCT'*DCT is the identity matrix (thus no smoothing occurs); a small numerical check of this is sketched below. For lower values, the smoothing between neighboring frequency bands is clearly visible. However, it is also important to realize that the smoothing effects are not limited to neighboring frequency bands.

To conclude the computation of the MFCC coefficients, Figure 2.9 illustrates the computation steps on the first 10 second sequence shown in Figure 2.2. Noticeable are (1) the changes in frequency resolution when transforming the power spectrum to the Mel power spectrum, (2) that MFCCs when viewed directly are difficult to interpret and that most of the variations occur in the lower coefficients, and (3) the effects of the DCT-based smoothing (when comparing the Mel power spectrum with the reconstructed version).
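The following is the small numerical check mentioned above (a sketch reusing the DCT construction from Algorithm 2.8 with num_coeffs set equal to num_filt; with fewer coefficients the off-diagonal entries of DCT'*DCT correspond to the smoothing across frequency bands):

%% Sketch: with num_coeffs == num_filt the DCT basis is orthonormal,
%% so DCT'*DCT is (numerically) the identity matrix and no smoothing occurs.
num_filt = 36; num_coeffs = 36;
DCT = 1/sqrt(num_filt/2) * ...
    cos((0:num_coeffs-1)'*(0.5:num_filt)*pi/num_filt);
DCT(1,:) = DCT(1,:)*sqrt(2)/2;
max(max(abs(DCT'*DCT - eye(num_filt)))) %% practically zero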

Figure 2.9: MFCC computation steps for the first audio excerpt from Figure 2.2 (panels from top to bottom: power spectrum 10*log10(P), Mel power spectrum M_dB, MFCCs mfcc, and reconstructed Mel power spectrum M_dB_rec). Time is plotted on the x-axis (the total length is 10 seconds, the temporal resolution is 256/22050 seconds, which is about 12ms).

2.2.2.5 Parameters

Unless stated otherwise, for the remaining subsections the following parameters are used to preprocess an audio file. First, the audio file is converted to 22050Hz mono. Second, if the piece is longer than 2 minutes, then only 2 minutes from the center are used for further processing (otherwise the whole piece is used). See Subsection 2.3.6.2 for a discussion of the effects of using only a section of the audio signal. Third, the number of Mel filters is 36, the FFT segment size is 512, and the hop size is 512 (no overlap, although a Hann window is used). Thus, the only difference to the parameters in Algorithms 2.1 and 2.4 is the hop size. Increasing the hop size by a factor of two also reduces computation time significantly, because the FFT is the most expensive preprocessing step. The number of DCT coefficients is 19 as recommended in [LS01]. The first coefficient, which describes the average energy of each frame, is ignored. In particular, the DCT matrix is set to DCT=DCT(2:20,:) before executing the code in Algorithm 2.9. As shown in [AP04a] the exact number of coefficients is not a critical parameter for spectral similarity.
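For reference, these settings can be summarized in a short sketch (variable names as introduced in Algorithms 2.1, 2.4 and 2.8; the last line assumes the DCT matrix has already been computed with Algorithm 2.8 using num_coeffs = 20):

%% Parameter settings used in the remaining subsections (sketch).
seg_size   = 512;       %% FFT segment size, about 23ms at 22050Hz
hop_size   = 512;       %% no overlap (instead of 256 in Algorithm 2.1)
num_filt   = 36;        %% Mel frequency bands
num_coeffs = 20;        %% computed coefficients, the first one is dropped
DCT = DCT(2:20,:);      %% ignore the first (average energy) coefficient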

2.2.3 Spectral Similarity

Spectral similarity describes aspects related to timbre. However, important timbre characteristics such as the attack or decay of a sound are not modeled. Spectral similarity is related to the timbre of a piece of music as much as color histograms are related to the color composition of a painting. To compute spectral similarity, the audio signal is chopped into thousands of very short (e.g. 23ms) frames and their temporal order is ignored. The spectrum of each frame is described by MFCCs. The large set of frames is summarized by clustering the frames. The distance between two pieces of music is computed by comparing their cluster models. In this subsection three approaches to compute spectral similarity are described. The main difference between them (several orders of magnitude) is the required computation time. Otherwise, all three are very similar in terms of the information they use and how they use it. However, as shown in Section 2.3 they do not behave identically, and their deviations help estimate the significance of improvements.

2.2.3.1 Related Work

The first approach was presented by Foote [Foo97] and is based on a global set of clusters for all pieces in the collection. This global set was obtained from a classifier and is the main weak point. To be able to describe a broad range of music a huge number of clusters is necessary. Furthermore, there is no guarantee that music not used to train the clusters can be described meaningfully. The first localized approach was presented by Logan and Salomon [LS01]. For each piece an individual set of clusters is used. The distances between these are computed using the Kullback-Leibler Divergence combined with the Earth Mover's Distance [RTG00]. Aucouturier and Pachet suggested using the computationally more expensive Monte Carlo sampling instead [AP02a; AP04a]. A simplified approach using a fast approximation of the Monte Carlo sampling was presented in [Pam05]. Mandel and Ellis [ME05] propose an even simpler approach using only one cluster per piece and comparing the clusters using the Kullback-Leibler Divergence. All three are described in detail in this subsection. Alternative techniques to compute spectral similarity include, for example, the anchor space similarity [BEL03], the spectrum histograms [PDW03a], or simply using the means and standard deviations of the MFCCs (e.g. [TC02]).

2.2.3.2 Thirty Gaussians and Monte Carlo Sampling (G30)

This approach was originally presented in [AP02a]. Extensive evaluation results were reported in [AP04a]. A Matlab implementation based on these won the ISMIR 2004 genre classification contest [Pam04].12 The approach consists of two steps: clustering the frames and computing the cluster model similarity. First, the various spectra (frames represented by MFCCs) which occur in the piece are summarized (i.e., the distribution is modeled) by using a clustering algorithm to find typical spectra (cluster centers), describing how typical they are (prior probabilities), and how the other spectra vary with respect to these few typical spectra (variances). Second, to compute the distance between two pieces, the distributions of their spectra are compared. If two pieces have similar distributions (i.e., if their spectra can be described using similar typical spectra, with similar variances and priors) they are assumed to be similar.

12 http://ismir2004.ismir.net/genre_contest/index.htm

Frame Clustering

The frames are clustered using a Gaussian Mixture Model (GMM) and Expectation Maximization (see e.g. [Bis95]). GMMs are a standard technique used for clustering with soft assignments, or for modeling probability density distributions. A reference implementation can be found, e.g., in the Netlab toolbox [Nab01] for Matlab. A multivariate Gaussian probability density function is defined as:

\mathcal{N}(x|\mu,\Sigma) = \left(\tfrac{1}{2\pi}\right)^{p/2} |\Sigma|^{-1/2} \exp\!\left(-\tfrac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right), \qquad (2.11)

where x is the observation (a 19-dimensional MFCC frame), µ is the mean (a 19-dimensional vector describing a typical spectrum), and Σ is a 19×19 covariance matrix (for G30 only a diagonal covariance is used, i.e., only the values on the diagonal are non-zero). A GMM is a mixture of M Gaussians where the contribution of the m-th component is weighted by a prior P_m, with P_m ≥ 0 and \sum_m P_m = 1:

p(x|\Theta) = \sum_{m=1}^{M} P_m \, \mathcal{N}(x|\mu_m,\Sigma_m), \qquad (2.12)

where Θ are the parameters that need to be estimated per GMM (i.e., per piece of music): Θ = {µ_m, Σ_m, P_m | m = 1..M}. Examples of what such a model looks like for a piece of music are shown in the last paragraph of this subsection. The optimal estimate for Θ maximizes the likelihood that the frames X = {x_1, ..., x_N} were generated by the GMM, where N is the number of frames (about 5200 for the parameters used in the preprocessing). The standard measure used is the log-likelihood, which is computed as:

L(X|\Theta) = \log \prod_n p(x_n|\Theta) = \sum_n \log p(x_n|\Theta). \qquad (2.13)

To find good estimates for Θ a standard approach is to use the Expectation Maximization (EM) algorithm. The EM algorithm is iterative (based on an old estimate a better estimate is computed) and converges after relatively few iterations. The initial estimates can be completely random, or can be computed using other clustering algorithms such as k-means. (Alternatively, as suggested in [LS01], k-means can be used alone to cluster the frames.) The EM algorithm consists of two steps. First, the expectation is computed, that is, the probability (expectation) that an observation x_n was generated by the m-th component. Second, the expectation (and thus the likelihood) is maximized, that is, the parameters in Θ are recomputed based on the expectations. The expectation step is:

P(m|x_n,\Theta) = \frac{p(x_n|m,\Theta)\,P_m}{p(x_n)} = \frac{\mathcal{N}(x_n|\mu_m,\Sigma_m)\,P_m}{\sum_{m'=1}^{M} \mathcal{N}(x_n|\mu_{m'},\Sigma_{m'})\,P_{m'}}. \qquad (2.14)

The maximization step is:

\mu_m^* = \frac{\sum_n P(m|x_n,\Theta)\,x_n}{\sum_{n'} P(m|x_{n'},\Theta)} \qquad (2.15)

\Sigma_m^* = \frac{\sum_n P(m|x_n,\Theta)\,(x_n-\mu_m)(x_n-\mu_m)^T}{\sum_{n'} P(m|x_{n'},\Theta)}

P_m^* = \frac{1}{N}\sum_n P(m|x_n,\Theta).
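To make the two steps more concrete, the following is a minimal Matlab sketch of GMM-EM with diagonal covariances, following Equations 2.12 to 2.15. It is not the implementation used in this thesis (a reference implementation is available in the Netlab toolbox); the variable mfcc, the fixed number of iterations, the random initialization, and the small variance floor are assumptions made for illustration only.

%% Minimal GMM-EM sketch (diagonal covariances) following Eqs. 2.12-2.15.
%% mfcc is assumed to be the 19 x N matrix of MFCC frames (Algorithm 2.9).
X = mfcc';                           %% N x d observations
[N, d] = size(X);
M = 30;                              %% number of components (as in G30)

idx = randperm(N);                   %% random initialization
mu = X(idx(1:M), :);                 %% M x d centers
sg = repmat(var(X, 0, 1), M, 1);     %% M x d diagonal variances
pr = ones(M, 1) / M;                 %% priors

for iter = 1:20
    %% E-step: P(m|x_n,Theta) for every frame (Eq. 2.14), in the log domain
    logp = zeros(N, M);
    for m = 1:M
        D = X - repmat(mu(m,:), N, 1);
        logp(:,m) = log(pr(m)) - 0.5*sum(log(2*pi*sg(m,:))) ...
            - 0.5*sum(D.^2 ./ repmat(sg(m,:), N, 1), 2);
    end
    post = exp(logp - repmat(max(logp, [], 2), 1, M));
    post = post ./ repmat(sum(post, 2), 1, M);

    %% M-step: update centers, variances, and priors (Eq. 2.15)
    nk = sum(post, 1)';              %% effective number of frames per component
    mu = (post' * X) ./ repmat(nk, 1, d);
    for m = 1:M
        D = X - repmat(mu(m,:), N, 1);
        sg(m,:) = (post(:,m)' * D.^2) / nk(m) + 1e-6;  %% small variance floor
    end
    pr = nk / N;
end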

Cluster Model Similarity

To compute the similarity of pieces A and B, a sample is drawn from each GMM, X^A and X^B respectively. (It is not feasible to store and use the original MFCC frames instead, due to memory constraints when dealing with large collections.) In the remainder of this section a sample size of 2000 is used. The log-likelihood L(X|Θ) that a sample X was generated by the model Θ is computed for each piece/sample combination (Equation 2.13). Aucouturier and Pachet [AP02a] suggest computing the distance as:

d_{AB} = L(X^A|\Theta^A) + L(X^B|\Theta^B) - L(X^A|\Theta^B) - L(X^B|\Theta^A). \qquad (2.16)

Note that L(X^A|Θ^B) and L(X^B|Θ^A) are generally different values. However, a symmetric similarity measure (d_{AB} = d_{BA}) is very desirable for most applications. Thus, both terms are used for d_{AB}. The self-similarities are added to normalize the results. In most cases the following statements are true: L(X^A|Θ^A) > L(X^A|Θ^B) and L(X^A|Θ^A) > L(X^B|Θ^A). The distance is (slightly) different every time it is computed due to the randomness of the sampling. Furthermore, if some randomness is used to initialize the GMM, different models will be produced every time, which will also lead to different distance values.
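To illustrate how Equation 2.16 can be evaluated via Monte Carlo sampling, a hedged sketch is given below. The models are assumed to be stored as structs with fields mu (M x d centers), sg (M x d diagonal variances), and pr (M x 1 priors); these names and the helper functions are illustrative only and do not correspond to the implementation in the MA toolbox.

%% Sketch of the G30 cluster model similarity (Eq. 2.16) via Monte Carlo
%% sampling. A and B are structs with fields mu, sg (diagonal), and pr.
function d = g30_distance(A, B)
    XA = gmm_sample(A, 2000);
    XB = gmm_sample(B, 2000);
    d = gmm_loglik(XA, A) + gmm_loglik(XB, B) ...
      - gmm_loglik(XA, B) - gmm_loglik(XB, A);
end

function X = gmm_sample(G, n)
    %% draw n frames from a diagonal-covariance GMM
    [M, d] = size(G.mu);
    c = cumsum(G.pr(:));
    X = zeros(n, d);
    for i = 1:n
        m = find(rand <= c, 1);                       %% pick a component
        X(i,:) = G.mu(m,:) + sqrt(G.sg(m,:)) .* randn(1, d);
    end
end

function L = gmm_loglik(X, G)
    %% log-likelihood of a sample under a diagonal-covariance GMM (Eq. 2.13)
    N = size(X, 1);
    M = size(G.mu, 1);
    logp = zeros(N, M);
    for m = 1:M
        D = X - repmat(G.mu(m,:), N, 1);
        logp(:,m) = log(G.pr(m)) - 0.5*sum(log(2*pi*G.sg(m,:))) ...
            - 0.5*sum(D.^2 ./ repmat(G.sg(m,:), N, 1), 2);
    end
    mx = max(logp, [], 2);
    L = sum(mx + log(sum(exp(logp - repmat(mx, 1, M)), 2)));
end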

ISMIR 2004 Genre Classification Contest

An implementation of G30 using a nearest neighbor classifier with the following parameters won the ISMIR 2004 genre classification contest. Three minutes from the center of each piece (22kHz, mono) were used for analysis. The MFCCs were computed using 19 coefficients (the first is ignored). The FFT window size was 512 with 50% overlap (hop size 256). To initialize the GMM, k-means was used. The number of samples used to compute the distance was 2000. The necessary computation time exceeded by far the time constraints of the ISMIR 2005 (MIREX) competition. The implementation is available in the MA toolbox for Matlab [Pam04].

Illustrations

Figure 2.10 shows some characteristics of G30. One observation is that the first and last rows are very similar. That is, the original frames and the 2000 frames sampled from the GMM generate very similar histograms. Thus, the GMM appears to be well suited to represent the distribution of the data. A further observation is that only a few of the original frames and sampled frames have a high probability; the majority have a very low probability. This can be seen in rows 4 and 5. Note that both rows would look quite different if a new GMM were trained (or, in the case of row 5, if a new sample were drawn from the same GMM). Row 2 shows that most centers have a rather similar shape. Row 3 shows that the variances for some pieces are larger than for others. For example, Someday has less variance than Kathy's Waltz. Also noticeable is the typical shape of a spectrum: in higher frequency bands there is only little energy, and in the lower frequency bands there is usually more variance.

2.2.3.3 Thirty Gaussians Simplified (G30S)

G30S [Pam05] is essentially a computational optimization of G30 and is based on merging ideas from [LS01] and [AP02a]. As suggested in [LS01], k-means is used to cluster the MFCC frames instead of GMM-EM. In addition, two clusters are automatically merged if they are very similar. In particular, first k-means is used to find 30 clusters. If the distance between two of these is below a (manually) defined threshold they are merged and k-means is used to find 29 clusters. This is repeated until all clusters have at least a minimum distance to each other. Empty clusters (i.e. clusters which do not represent any frames) are deleted. The maximum number of clusters per piece is 30 and the minimum is 1. The threshold is set so that most pieces have 30 clusters and only very few have fewer than 20. In practice a piece never has only 1 cluster (unless it is mostly silent). This optimization can be very useful since the distance computation time depends quadratically on the number of clusters. A rough sketch of this clustering procedure is given below.
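The following Matlab sketch illustrates the procedure just described; it is not the implementation used for the experiments. kmeans, pdist, and squareform are from the Statistics Toolbox, the threshold value is an arbitrary placeholder, and the handling of empty clusters is omitted for brevity.

%% Rough sketch of the G30S frame clustering: k-means with iterative
%% merging (by reducing k) while any two centers are closer than a
%% threshold. thres is a placeholder value.
X = mfcc';                           %% N x d MFCC frames
k = 30;
thres = 15;                          %% merging threshold (placeholder)
while true
    [idx, C] = kmeans(X, k);
    if k == 1, break; end
    D = squareform(pdist(C));        %% pairwise distances between centers
    D(1:k+1:end) = inf;              %% ignore the diagonal
    if min(D(:)) >= thres, break; end
    k = k - 1;                       %% "merge" by redoing k-means with one cluster less
end
%% summarize each cluster by prior and diagonal variance so that the
%% model can be interpreted as a GMM (empty clusters would be removed here)
pr = zeros(k, 1);
sg = zeros(k, size(X, 2));
for m = 1:k
    sel = (idx == m);
    pr(m) = sum(sel) / size(X, 1);
    sg(m,:) = var(X(sel,:), 0, 1) + 1e-6;
end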

Figure 2.10: Illustration of the cluster model for G30. Each column shows different representations of one piece of music; the pieces are the same as the ones used in Figure 2.2. In each plot the x-axis is the Mel frequency band and the y-axis is dB, i.e., all MFCCs are reconstructed and visualized in the Mel space. The first row is a 2-dimensional histogram of all spectra (MFCCs). The second row shows the 30 GMM centers; the gray shading corresponds to the prior of each component (black being the highest prior). The third row shows a flattened probability density of the GMM; in particular, the variances of the components are visible. The fourth row shows the probability of the original MFCC frames; the frame with the highest probability is plotted in black. The fifth row shows the probabilities of the 2000 sample frames drawn from the GMM; most of them are not visible because their probability is very small compared to the highest one. The last row shows the histogram of the drawn sample.

Unlike G30, no random samples are drawn from the cluster models. Instead, the cluster centers are used as the sample (as suggested in [LS01]). However, instead of using the Kullback-Leibler Divergence in combination with the Earth Mover's Distance, the probability of each point of this sample is computed by interpreting the cluster model as a GMM with diagonal covariances. Since such a sample does not reflect the probability distribution (due to the different priors), the log-likelihood of each sample point is weighted according to its prior before summation:

L(X^A|\Theta^B) = \sum_{m=1}^{M_A} P_m^A \log\!\left( \sum_{m'=1}^{M_B} P_{m'}^B \, \mathcal{N}(\mu_m^A | \mu_{m'}^B, \Sigma_{m'}^B) \right), \qquad (2.17)

where M_A is the number of centers in model Θ^A and P_m^A is the prior probability of center m of model A. To compute the distances, Equation 2.16 is used. L(X^A|Θ^A) can be precomputed since no new sample is drawn for each distance computation (unlike G30).
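A minimal sketch of Equation 2.17, under the same assumptions about the model structs as above (fields mu, sg, and pr), might look as follows; the function name is illustrative only.

%% Sketch of Eq. 2.17: the centers of model A serve as a prior-weighted
%% sample which is evaluated under model B (diagonal covariances).
function L = g30s_loglik(A, B)
    MA = size(A.mu, 1);
    MB = size(B.mu, 1);
    L = 0;
    for m = 1:MA
        p = 0;
        for m2 = 1:MB
            D = A.mu(m,:) - B.mu(m2,:);
            p = p + B.pr(m2) * exp(-0.5*sum(log(2*pi*B.sg(m2,:))) ...
                                   - 0.5*sum(D.^2 ./ B.sg(m2,:)));
        end
        L = L + A.pr(m) * log(p + realmin);   %% realmin guards against log(0)
    end
end

The distance then follows Equation 2.16, with the two self terms L(X^A|Θ^A) and L(X^B|Θ^B) precomputed once per piece.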

MIREX 2005 Genre Classification and Artist Identification

A nearest neighbor classifier based on a similarity measure which combined G30S with additional features was submitted to the MIREX 2005 artist identification and genre classification competitions [Pam05].13 The main difference to the approach described here is that an FFT window length and hop size of 1024 samples was used. G30S was combined with Fluctuation Patterns and two simple descriptors (Focus and Gravity) which will be described later on. In terms of computation time G30S performed very well. However, in terms of the measured accuracy G30S was clearly outperformed by some of the other submissions. These results will be discussed in more detail in Section 2.3.

Illustrations

Figure 2.11 shows the same plots for G30S which were already discussed for G30 in Figure 2.10. The main observation is that G30 and G30S are highly similar. The only noticeable deviations are in rows 4 and 5. However, these rows significantly deviate if a different random sample is drawn or if the GMM is trained with a different initialization.

13 http://www.music-ir.org/mirex2005

Figure 2.11: Illustration of the cluster model for G30S. The plots are generated as described in Figure 2.10.

2.2.3.4 Single Gaussian (G1)

Mandel and Ellis use a single Gaussian with full covariance to model a piece of music [ME05]. Their implementation, which uses an SVM classifier, won the MIREX 2005 artist identification contest and performed very well for genre classification. To compare two Gaussians, a symmetric version of the Kullback-Leibler divergence is used, based on the recommendations in [MHV03]. Similar approaches are popular in the speaker identification and verification community [BMCM95]. In Matlab, the Gaussian representing a piece of music can be computed with:

01 m = mean(mfcc,2);    %% center                        (2.18)
02 co = cov(mfcc');     %% covariance
03 ico = inv(co);       %% inverse covariance

The inverse covariance is computed to avoid recomputing it for each distance computation. The symmetric Kullback-Leibler divergence between two Gaussians can be computed with:

04 d = trace(co1*ico2) + trace(co2*ico1) + ...           (2.19)
05     trace((ico1+ico2)*(m1-m2)*(m1-m2)');

The distance is rescaled to improve results when combining the spectral distance with other information. In particular, the rescaled distance is computed as:

d = -exp(-1/fact*d).                                     (2.20)

The effect of rescaling is clearly visible in Figure 2.14. Using fact=450 as rescaling factor gave the best results in the combinations described later in this chapter. However, the results were stable in the range from 100 to 650; only below 50 and above 1000 did the accuracy clearly decrease. When analyzing the effect of the rescaling on the distance matrix for a larger music collection14 the difference can be described in terms of the ratio between the maximum value and the median value. Without rescaling, the ratio is on the order of 10^12. For fact=10 the ratio is about 1, for fact=5000 it is about 100, and for fact=450 it is about 10. For 3 of the pieces used in the experiments reported in this thesis (i.e. 3 out of more than 20000) G1 could not be computed. In particular, there were problems computing the inverse covariance matrix.

14 The collection used is DB-L and will be described in Section 2.3.

Figure 2.12: Full covariance matrices for 6 songs (G1). On both axes the dimensions are Mel frequency bands; the gray shading is in dB.

In general, the problem pieces had only little variance in their spectra. For example, one of them was very short (30 seconds) and calm. Such cases can easily be identified and excluded (e.g., all pieces which have a value larger than 10^10 in the inverse covariance can be ignored).
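For convenience the snippets above (Equations 2.18 to 2.20) can be collected into two small functions; the sketch below also includes the simple check for degenerate covariances just mentioned. The function names, the error handling, and the default rescaling factor are illustrative assumptions, not the implementation used for the experiments.

%% Sketch wrapping the G1 computation (Eqs. 2.18-2.20) into two functions.
function G = g1_model(mfcc)
    G.m   = mean(mfcc, 2);           %% center
    G.co  = cov(mfcc');              %% full covariance
    G.ico = inv(G.co);               %% inverse covariance
    if any(abs(G.ico(:)) > 1e10)     %% degenerate case, e.g. near-silent pieces
        error('G1 model is degenerate (covariance close to singular)');
    end
end

function d = g1_distance(A, B, fact)
    if nargin < 3, fact = 450; end
    d = trace(A.co*B.ico) + trace(B.co*A.ico) + ...
        trace((A.ico+B.ico)*(A.m-B.m)*(A.m-B.m)');
    d = -exp(-1/fact*d);             %% rescaling (Eq. 2.20)
end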

Illustrations

Figure 2.12 shows the covariances for the 6 songs used in the previous figures. As can be seen, there is a lot of information besides the diagonal. Noticeable is that the variances for lower frequencies are higher. Furthermore, for some of the songs there is a negative covariance between low frequencies and mid frequencies. Figure 2.13 shows the same plots for G1 which were already discussed for G30 and G30S in Figures 2.10 and 2.11. Since G1 uses only one Gaussian, there is only one line plotted in the second row. Also noticeable is that there are more lines visible in rows 4 and 5. This indicates that there are fewer frames (sampled or original) which have much higher probabilities than all others. Otherwise, in particular the third and last rows are very similar to those of G30 and G30S and indicate that the models are very similar.

2.2.3.5 Computation Times

The CPU times for G30, G30S, and G1 are given in Table 2.1. The frame clustering (FC) time is less critical than the time needed to compute the cluster model similarity (CMS): while FC can be computed offline, either all possible distances need to be precomputed (and at least partially stored) or the system needs to compute all distances of interest very fast to minimize the system's response time. Note that the FC time for G30 can easily be reduced to the time of G30S (this is mainly a question of accuracy). However, there is no way to reduce the computation times of G30 or G30S to those of G1, which is clearly orders of magnitude faster.

Figure 2.13: Illustration of the cluster model for G1. The plots are generated as described in Figure 2.10.

        G30     G30S    G1
FC      25000   700     30.0
CMS     400     2       0.1

Table 2.1: Approximate CPU times in milliseconds on an Intel Pentium M 2GHz (755) for frame clustering (FC, per piece) and cluster model similarity (CMS, per pair of pieces). The approximate time for loading a 120 second (22kHz, mono) audio file in WAV format into Matlab is 0.25 seconds. Computing the MFCCs takes about 1.5 seconds using no overlap between frames and a segment size of 512 (23ms). The number of frames is about 5200 (for 2 minutes of audio).

2.2.3.6 Distance Matrices

Figure 2.14 shows the distance matrices for the 6 songs using the three spectral similarity measures described in this section. The matrices computed for G30 and G30S are very similar. Furthermore, the difference between the original and the rescaled distance matrix for G1 is clearly noticeable. Rescaling G1 is very important when combining the distance matrix with additional information, as discussed in the subsequent sections. Furthermore, a balanced distance matrix is also important for techniques which visualize whole collections, such as the Islands of Music discussed in the next chapter. However, if only a ranked list of similar pieces is required then the scaling is not critical. Compared to the ZCR results, it seems that one problem has been solved: the piece of classical music is now differentiated from the other pieces. However, the hard pop and electronic dance pieces are not distinguishable. One solution is to add information related to beats and rhythm, which is the topic of the next section.

2.2.4 Fluctuation Patterns

Fluctuation Patterns (FPs) describe the amplitude modulation of the loudness per frequency band [Pam01; PRM02a] and are based on ideas developed in [Frü01; FR01]. They describe characteristics of the audio signal which are not described by the spectral similarity measure.

Figure 2.14: Distance matrices for the 6 songs, computed with G30, G30S, G1, and G1 rescaled. The songs are in the same order as in Figure 2.13. That is, the first row (and column) shows all distances to “Blue Rondo A La Turk”; the last row (and column) shows all distances to “Bolero”.

The loudness modulation has different effects on our sensation depending on the frequency. The sensation of fluctuation strength [Ter68; Fas82] is most intense around 4Hz and gradually decreases up to a modulation frequency of 15Hz (see Figure 2.15). At 15Hz the sensation of roughness starts to increase, reaches its maximum at about 70Hz, and starts to decrease at about 150Hz. Above 150Hz the sensation of hearing three separately audible tones increases [ZF99]. The following computation steps are based on a frequency/loudness/time representation. This can be a sonogram as used in [Pam01] or a Mel-frequency dB spectrogram as used in [Pam05]. An evaluation of the impact of different preprocessing steps on the accuracy of genre classification using FPs based on sonograms can be found in [LR05]. The advantage of using FPs based on MFCCs is that (when using FPs in combination with spectral similarity) the MFCCs are already available. The 4 computation steps for FPs are: (1) Cut the spectrogram into short segments, e.g., 6 seconds as used in [Pam01] or 3 seconds as described in this subsection. (2) For each segment and each frequency band use an FFT to compute the amplitude modulation frequencies of the loudness in the range of 0-10Hz. (3) Weight the modulation frequencies using a model of perceived fluctuation strength. (4) Apply some filters to emphasize certain types of patterns. The resulting FP is a matrix with rows corresponding to frequency bands and columns corresponding to modulation frequencies (in the range of 0-10Hz). The elements of this matrix describe the fluctuation strength. To summarize all FPs representing the different segments of a piece, the median of all FPs is computed. Finally, one FP matrix represents an entire piece. The distance between pieces is computed by interpreting the FP matrices as high-dimensional vectors and computing the Euclidean distance. The following paragraphs describe the computation in more detail based on M_dB computed with the parameters described in Subsection 2.2.2.5. At the end of this section there are some paragraphs describing illustrations which help understand basic characteristics of the FPs.

2.2.4.1 Details

The parameters used are: segment size 128 (about 3 seconds), hop size 64 (50% overlap). Instead of the 36 frequency bands of the Mel spectrum only 12 are used. The grouping of frequency bands is described below. The resolution of the modulation frequency in the range of 0 to 10Hz is 30. This results in 360 (12×30) dimensional FPs. The main reason for reducing the frequency resolution is to reduce the overall dimensionality of the FPs, and thus the required memory to store one pattern. Furthermore, a high frequency resolution is not necessary. For example, in [Pam01] 20 frequency bands were used. Analysis of the eigenvectors showed that especially the higher frequency bands are highly correlated. In [Pam05] only 12 frequency bands were used. In particular, the following mapping was used to group the 36 Mel bands into 12 frequency bands:

01 t = zeros(1,36);                                      (2.21)
02 t(1)   = 1;  t( 7: 8) = 5;  t(15:18) =  9;
03 t(2)   = 2;  t( 9:10) = 6;  t(19:23) = 10;
04 t(3:4) = 3;  t(11:12) = 7;  t(24:29) = 11;
05 t(5:6) = 4;  t(13:14) = 8;  t(30:36) = 12;
06
07 mel2 = zeros(12,size(M_dB,2));
08 for i=1:12,
09     mel2(i,:) = sum(M_dB(t==i,:),1);
10 end

The actual values are more or less arbitrary. However, they are based on the observation that interesting details are often in the lower frequency bands. Note that the energy is added up. Thus, the 12th frequency band of mel2 represents the sum of 7 M_dB bands while the first and second bands only represent one each. This can be seen in Figure 2.16. In particular, the frequency bands 9-11 have higher values. The following defines the constants used for the computations, in particular the fluctuation strength weights (flux), the filter which smoothes over the frequency bands (filt1), and the filter which smoothes over the modulation frequencies (filt2):

Figure 2.15: The relationship between the modulation frequency and the weighting factors of the fluctuation strength.

11 f = linspace(0,22050/512/2,64+1);                     (2.22)
12 flux = repmat(1./(f(2:32)/4+4./f(2:32)),12,1);
13 w = [0.05, 0.1, 0.25, 0.5, 1, 0.5, 0.25, 0.1, 0.05];
14 filt1 = filter2(w,eye(12));   %% 12x12
15 filt1 = filt1./repmat(sum(filt1,2),1,12);
16 filt2 = filter2(w,eye(30));   %% 30x30
17 filt2 = (filt2./repmat(sum(filt2,2),1,30))';

The weighting vector flux for the amplitude modulation is based on a model of perceived fluctuation strength [Fas82] and was primarily designed as a model for speech. The weights are shown in Figure 2.15. The filter is applied as shown in line 25 of Algorithm 2.24. From a practical point of view, the main purpose of the weighting is to reduce the influence of very low modulation frequencies, which generally have higher energies. Alternatively, a weighting function based on preferred tempo could be used. For example, the periodicity histograms defined in [PDW03a] use a resonance model [Moe02] weighting function which has its maximum response around 120bpm (2Hz). The two filters (filt1 and filt2) are the same as used in [Pam01]. They are applied to smooth the patterns. The effect of the fluctuation strength weighting and of the smoothing filters can be seen in Figure 2.16. In general, very short pieces (e.g. shorter than 15 seconds) should be excluded from the computations. However, if for some reason there is an interest in dealing with short pieces (e.g. when computing the similarity between segments) it is necessary to pad the Mel spectrogram with zeros.

This can be done, for example, with:

18 if size(mel2,2)<128,   %% pad with zeros              (2.23)
19     mel2 = [zeros(12,ceil(128-size(mel2,2))),mel2];
20 end

The following algorithm shows the main part of the computation of the FP. The frequency modulation of the amplitude for each of the 12 frequency bands is computed with an FFT. Alternatively, a computationally more expensive comb-filter could be used (e.g. [PDW03a]). The amplitude modulation is weighted according to the fluctuation strength, the difference filter is applied to emphasize vertical lines, and finally the pattern is smoothed with the filters.

21 num_segments = floor((size(mel2,2)-128)/64+1);        (2.24)
22 fp_all = zeros(num_segments,12*30);
23 for i=1:num_segments,
24     X = fft(mel2(:,(1:128)+64*(i-1)),128,2);
25     X2 = abs(X(:,2:32)).*flux;   %% amplitude spectrum
26     X2 = filt1*abs(diff(X2,1,2))*filt2;
27     fp_all(i,:) = X2(:)';
28 end

For a 2 minute piece, a window size of 128 (about 3 seconds) with 50% overlap results in 79 FPs. A song can be summarized by simply computing the median of these patterns:

29 fp = median(fp_all); (2.25)

Finally, to compare two pieces the Euclidean distance is computed:

d = sqrt(sum((fp1-fp2).^2)); %% or: d = norm(fp1-fp2); (2.26)

2.2.4.2 Illustrations

Figure 2.16 shows a segment before and after the four main computation steps. The difference between mel and mel2 is clearly visible. The amplitude modulation shows the kind of information that is extracted using the FFT. For example, a vertical line in the 28th modulation frequency bin (just below 10Hz) is clearly visible. The effect of the weighting is noticeable particularly for the low modulation frequencies (first and second columns). One of the main effects of the difference filter is that the width of the vertical line in the 28th column is doubled. The effect of the smoothing filters is also clearly visible.

Figure 2.16: FP computation steps for one segment. The panels show mel, mel2, abs(X(:,2:32)), abs(X(:,2:32)).*flux, abs(diff(X2)), and filt1*abs(diff(X2,1,2))*filt2.

Figure 2.17: FPs of all segments computed for the song Someday. The gray shadings of the plots are scaled globally. Black represents high values, white represents low values. The temporal order of the plots is left-right, top-down. The segment used in Figure 2.16 is number 15 (2nd row, 7th column). The last pattern (80th, last row, last column) is the median of all.


Figure 2.18: FP mean and standard deviation (STD) for the different genres of the DB-MS collection.

Figure 2.17 shows the FPs for all 79 segments computed for a song. As can be seen, the median does not reflect the characteristics of the sequence in the 7th and 8th rows. In fact, the patterns of the median are hardly noticeable compared to the patterns with the highest values in this sequence. Computing the median is surely not the optimal solution; however, it is one of the simplest. Figure 2.18 shows the differences between genres of the DB-MS collection (see Subsection 2.3.3 for details on this collection). The average patterns for classical, electronic, and punk are distinguishable. On the other hand, world, pop/rock, and jazz/blues seem rather similar.

2.2.5 Simple Features

This section describes "simple" features, that is, features which can be extracted based on the algorithms already mentioned previously with a single line (or only a few lines) of Matlab code. All of these are one-dimensional and continuous, and distances are computed as absolute differences. These features are best characterized as low-level audio statistics. Although some higher level concepts can sometimes be associated with them, such associations might cause more confusion than help explain their true meaning. These features are intended to complement the spectral similarity and Fluctuation Patterns. The combination is described in the next section. The features presented here are only a tiny selection of all features which have been published and used in MIR research. A recent overview of different features (also known as descriptors) can be found in [Poh05]. Slightly older but nevertheless very insightful overviews can be found in [TC02] and [AP03].

2.2.5.1 Time Domain

The main advantage of time domain features is that they can be computed very efficiently. A typical example is the ZCR already mentioned previously. In Matlab notation the average ZCR per second can be computed as (where fs is the sampling frequency in Hertz):

zcr = sum(abs(diff(sign(wav))))/2/length(wav)*fs; (2.27)

Other features which can be extracted directly from the audio signal include, for example, the Root Mean Square (RMS) energy which is correlated with the perception of loudness:

rms = sqrt(mean(wav.^2)); (2.28)

2.2.5.2 Power Spectrum

Based on the power spectrum a number of features can be extracted. One example is the noisiness. Several variations exist; one option to implement it is:

01 P_dB = P;                                             (2.29)
02 P_dB(P_dB<1) = 1;
03 P_dB = 10*log10(P_dB);
04 P_dB_new = zeros(size(P_dB,1)-10,floor(size(P_dB,2)/10));
05 for j=1:floor(size(P_dB,2)/10),
06     P_dB_new(:,j) = ...
07         mean(P_dB(11:end,(1:10)+(j-1)*10),2);
08 end
09 noisiness = sum(sum(abs(diff(P_dB_new))));

In words this computation can be described as follows. First a smoothed version of the spectrogram (P_dB_new) is computed as the mean energy over 10 subsequent frames (about 115ms). Frequencies below 800Hz in the spectrogram are ignored. Then for each frame in P_dB_new the absolute difference between adjacent frequency bins is computed (abs(diff(P_dB_new))). Finally, the noisiness is the sum of all these absolute differences. The noisiness differentiates between rather noisy and rather harmonic sounds. A noisy sound will have a low value, because its spectrum is rather flat. On the other hand, a not so noisy signal where the sounds change within significantly less than 115ms will also be affected by the smoothing, and the computed value might not correspond to the perceived noisiness. Furthermore, the noisiness as defined here is correlated with the loudness such that a loud sound will generally be less noisy. This is clearly a major limitation and could be addressed by normalizing the loudness.

2.2.5.3 Mel Power Spectrum

Several simple features can be extracted based on the reconstructed Mel power spectrum, or any other form of spectrogram (such as sonograms). Examples include loudness related features, Percussiveness (which describes the strength of percussive sounds), and the Spectral Centroid. There are different options to compute each of these descriptors; one option for each is given below. The average loudness is an indicator of how energetic a piece is. However, the average loudness depends very much on production effects such as compression.

avg_loudness = mean(M_dB(:)); (2.30)

Computing the difference of two subsequent frames (also known as spectral difference) is a standard approach for onset detection and beat tracking (e.g. [BDA+05]). This information can also be used to compute Percussiveness. Alternatives include the approach presented in [HSG04].

percussiveness = mean(mean(abs(diff(M_dB,1,2))));        (2.31)

The spectral centroid is correlated with the perception of brightness. In particular, sounds with more energy in higher frequencies are usually perceived as brighter. However, as a descriptor of the spectral shape the centroid is very limited and can be misleading. For example, the spectral centroid cannot appropriately describe the mixture of a very bright (high pitched) sound with a very low pitched sound.

01 spectral_centroid = ...                               (2.32)
02     sum((1:num_filt).*sum(M_dB,2)'./max(sum(M_dB(:)),eps));

2.2.5.4 Fluctuation Patterns

Based on any higher-level representation of the spectrogram it is usually possible to extract further features. The features described here were presented in [Pam01; PFW05b]. An evaluation of a larger feature set extracted from the Fluctuation Patterns can be found in [LR05]. In the lines of code given below, FP is defined as FP = reshape(fp,12,30). That is, instead of the vector representation used in Section 2.2.4, a matrix representation is used. The maximum fluctuation strength is the highest value in the pattern. Pieces of music which are dominated by strong beats have very high values. Typical examples with high values include electronic and house music, whereas, for example, classical music has very low values.

fp_max = max(FP(:));                                     (2.33)

Alternatively, the sum can be computed:

fp_sum = sum(FP(:)); (2.34)

The bass is calculated as the sum of the values in the two lowest frequency bands with a modulation frequency higher than 1Hz.

fp_bass = sum(sum(FP(1:2,3:30))); (2.35)

The aggressiveness is measured as the ratio of the sum of values within the frequency bands 2-12 and modulation frequencies below 1Hz compared to the sum of all. Less aggressive pieces have higher values.

fp_aggr = sum(sum(FP(2:end,1:4)))/max(max(FP(:)),eps); (2.36)

The domination of low frequencies is calculated as the ratio between the sum of the values in the 3 lowest and the sum in the 4 highest frequency bands.

fp_DLF = sum(sum(FP(1:3,:)))/max(sum(sum(FP(9:12,:))),eps); (2.37)

Gravity

The Gravity (FP.G) describes the center of gravity of the FP on the modulation frequency axis. Given 30 modulation frequency bins (linearly spaced in the range from 0-10Hz) the center usually lies between the 10th and the 15th bin, and is computed as

fp_g = \frac{\sum_j j \sum_i FP_{ij}}{\sum_{ij} FP_{ij}}, \qquad (2.38)

where i is the index of the frequency band and j the index of the modulation frequency. In Matlab notation this is computed with:

fpg = sum(sum(FP).*(1:30))/max(sum(FP(:)),eps); (2.39)

Low values indicate that the piece might be perceived as slow. However, FP.G is not intended to model the perception of tempo. Effects such as vibrato or tremolo are also reflected in the FP.

Focus

The Focus (FP.F) describes the distribution of energy in the FP. In particular, FP.F is low if the energy is focused in small regions of the FP, and high if the energy is spread out over the whole FP. FP.F is computed as the mean value of all values in the FP matrix, after normalizing the FP such that the maximum value equals 1. The Focus is computed with:

fpf = mean(FP(:)./max(max(FP(:)),eps));                  (2.40)

2.2.5.5 Illustrations

Figure 2.19 shows the range of values per feature and genre using the DB-MS collection.15 The overlap between genres is generally quite large; the genres are clearly distinguishable only in a few cases. For example, the average loudness feature can distinguish classical music from metal/punk. In addition, correlations are made visible, e.g., between the spectral centroid and the average loudness. Figures 2.20 and 2.21 show the correlations between distance matrices. The correlation is measured by comparing the values of the upper triangles of the 17 distance matrices computed from the 13 features, the Fluctuation Patterns, and the 3 spectral similarity measures. Noticeable is the high correlation of the spectral similarity measures (G30, G30S, G1), although the correlation of G1 is significantly lower. The observation from Figure 2.19 that the spectral centroid is correlated with the average loudness is confirmed. The reason for this might be that for a piece to be louder on average, the energy must be spread over many frequency bands including higher ones. As can be seen in Figure 2.19, classical pieces have a lower spectral centroid than aggressive metal/punk pieces. Not particularly surprising is the correlation between FP Max and FP Sum, or the correlation between average loudness and RMS. However, quite surprising is the relatively high correlation between the average loudness and the spectral similarity measures, specifically because the first coefficient (the average loudness of each frame) is not used for the spectral similarity. This indicates that omitting the first coefficient does not completely remove loudness information. Another observation is the high correlation between Percussiveness, FP, FP Max, FP Sum, and FP Bass. The reason is that the Percussiveness is high if there are strong beats. Usually beats are periodic, thus they will also have an impact on the FP. Strong beats lead to high FP Max and FP Sum values. Similar FPs will have a similar FP Sum. Furthermore, a piece that does not have very strong beats will also not have very strong bass beats. Also noticeable is that the correlation between ZCR and noisiness is very low. On one hand, this is due to the limitations of the noisiness descriptor discussed previously; on the other hand, the relatively high correlation between the spectral centroid and the ZCR also explains part of it. The differences between Figures 2.20 and 2.21 show how reliable the values are. Depending on the music used, the values will vary; in this case, the deviations are around 0.1.

15 Details of the DB-MS collection are described in Subsection 2.3.3.
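As an aside, correlations of this kind can be computed along the following lines; the cell array D holding one square distance matrix per feature or similarity measure is an assumption made only for this sketch.

%% Sketch: pairwise correlations of the upper triangles of distance
%% matrices, as in Figures 2.20 and 2.21. D is assumed to be a cell
%% array with one n x n distance matrix per feature/similarity measure.
n = size(D{1}, 1);
mask = logical(triu(ones(n), 1));    %% upper triangle, excluding the diagonal
V = zeros(nnz(mask), numel(D));
for i = 1:numel(D)
    V(:,i) = D{i}(mask);
end
C = corrcoef(V);                     %% e.g. a 17 x 17 correlation matrix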

Figure 2.19: The boxes show the range of values for the 13 simple features per genre, computed on the DB-MS collection (see Subsection 2.3.3). The panels are ZCR (×10^-3), RMS (×10^-4), Spectral Difference (×10^-5), Spectral Centroid, Average Loudness, Percussiveness, FP Gravity, FP Focus, FP Sum (×10^-4), FP Max (×10^-3), FP Aggressiveness, FP Bass (×10^-3), and FP DLF; within each panel the rows are the genres Classical, Electronic, Jazz/Blues, Metal/Punk, Pop/Rock, and World. Each box has a line at the lower quartile, median, and upper quartile values. The whiskers extend from the box out to the most extreme data value within 1.5 times the interquartile range of the sample. Outliers are data with values beyond the ends of the whiskers and are marked with "+". The notches represent a robust estimate of the uncertainty about the medians for box-to-box comparison. Boxes whose notches do not overlap indicate that the medians of the two groups differ at the 5% significance level (for details see the Matlab boxplot function documentation).

Figure 2.20: Correlation matrix for the 17 distance matrices computed on the DB-MS collection (729 pieces). The values are rounded and zeros are not displayed. The 17 measures are G30, G30S, G1, ZCR, RMS, Noisiness, SC, Avg. Loudness, Percussiveness, FP, FP Gravity, FP Focus, FP Max, FP Sum, FP Aggressiveness, FP Bass, and FP DLF.

Figure 2.21: Correlation matrix for the 17 distance matrices computed on the DB-L collection (2381 pieces). The measures are the same as in Figure 2.20.

2.2.6 Linear Combination

A straightforward approach to combine the spectral similarity with the Fluctuation Patterns and the simple features is to compute a weighted sum of the individual distances. This linear combination is similar to the approach used for the aligned Self-Organizing Maps (SOMs) in Section 3.2.5. Originally, the idea was that the user should be able to manually set the weights of the linear combination. However, as the meaning of the different features is not very intuitive, it is necessary to determine the weights automatically. Before combining the distances, they are normalized such that the standard deviation of all pairwise distances within a music collection equals 1. For the experiments reported in Section 2.3 the means and standard deviations were computed on the DB-L collection. In general the combination has the following form:

d = \sum_i w_i (d_i - \mu_i)/\sigma_i, \qquad (2.41)

where d_i is the distance computed according to the i-th similarity measure, the normalization factor µ_i is the mean of all pairwise distances, and σ_i is their standard deviation. The weights w_i define the contribution of each similarity measure. For example, for the MIREX'05 submission described in [Pam05] the weights were 65% for G30S, 15% for FP, 5% for FP Focus, and 15% for FP Gravity. For the best combination found in Section 2.3 (referred to as G1C later on) the combination (including the normalization factors) is:

d = 0.7 (-exp(-d_G1/450) + 0.7950)/0.1493 +              (2.42)
    0.1 (d_FP - 1688.4)/878.23 +
    0.1 (d_FPG - 1.2745)/1.1245 +
    0.1 (d_FPB - 1064.8)/932.79.

There is a conceptual problem with the linear combination: a human listener does not compute a weighted sum of the similarities with respect to different aspects. In contrast, a single aspect which is similar is sufficient to consider pieces similar. Thus, a model which defines the overall distance to be the minimum distance over its components seems more appropriate. However, it is also necessary to realize that the aspects which are combined in this thesis are not of a kind that a human listener would ever consider when judging similarity. Nevertheless, developing alternative models to combine different similarity measures is an interesting direction for future work.
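In Matlab notation, and assuming the four component distances have already been computed (dG1 is the unscaled G1 distance of Equation 2.19, dFP the Euclidean FP distance, and dFPG and dFPB the absolute differences of FP Gravity and FP Bass), Equation 2.42 corresponds to the following sketch; the variable names are illustrative only.

%% Sketch of the G1C combination (Eq. 2.42).
d = 0.7 * (-exp(-dG1/450) + 0.7950) / 0.1493 + ...
    0.1 * (dFP  - 1688.4) / 878.23 + ...
    0.1 * (dFPG - 1.2745) / 1.1245 + ...
    0.1 * (dFPB - 1064.8) / 932.79;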

2.2.7 Anomalies

Finally, before turning to the evaluation, this subsection describes three anomalies in the similarity space. These are caused by the spectral similarity and to some extent contradict a human's intuition of a similarity space.

Figure 2.22: Number of times each piece in DB-L is one of the 10 closest neighbors of the other pieces. The black line is G1, the other is G1C. The pieces are sorted according to their G1 values. The highest frequency for G1 is 253, the highest for G1C is 81.

2.2.7.1 Always Similar

Aucouturier and Pachet [AP04a] pointed out that some pieces are very similar to a large number of other pieces although they are not perceptually similar (false positives). This anomaly can be studied (as suggested in [AP04a]) by computing the 10 nearest neighbors of each piece in the collection. The examples given here are computed from the DB-L collection which has a total of 2381 tracks. There are 2380 top 10 lists a piece can occur in. Without further knowledge one would expect a piece to occur in about 10 of the top 10 lists. However, for G1 there are 275 songs which do not occur in any of the top 10 lists. For G1C this number is reduced to 174. For G1 there is one song which occurs in 253 of the top 10 lists. This song is “Smudo schweift aus” by “Die Fantastischen Vier” (a German hip hop group) from the album “Die 4. Dimension”. The song has a duration of 3'47" of which the last 50 seconds are instrumental. However, these 50 seconds are not considered for the similarity computation (only 2 minutes from the center of each piece are used). From a perceptual point of view there are no particularities which could explain the anomaly. The same song occurs in only 33 of the top 10 lists for G1C. The most frequent song in the G1C top 10 lists appears 81 times; the same song appears 125 times in the G1 top 10 lists (which is rank 3). It is from the same album and has the title “Zu Geil für diese Welt”. In total there are 5 songs by this group in the top 20 for G1. Figure 2.22 shows for each piece in DB-L how many times it occurred in the top 10 lists for G1 and for G1C. As can be seen, the outliers are far less extreme for G1C. The reason for this is that the “always similar” problem is caused by the spectral similarity and is “fixed” to some extent by the features (FP etc.) which define a vector space. Since such “always similar” pieces are likely to have a negative impact on any application based on audio similarity, any approach which reduces the effect is favorable.
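A sketch of how these counts (i.e., the data behind Figure 2.22) can be obtained from a precomputed distance matrix is given below; the matrix D and its layout are assumptions for illustration.

%% Sketch: count how often each piece occurs in the 10 nearest neighbor
%% lists of all other pieces. D is an n x n distance matrix.
n = size(D, 1);
counts = zeros(n, 1);
for i = 1:n
    d = D(i, :);
    d(i) = inf;                      %% exclude the seed itself
    [dummy, order] = sort(d);        %% ascending distances
    nn = order(1:10);                %% the 10 nearest neighbors of piece i
    counts(nn) = counts(nn) + 1;
end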

2.2.7.2 Triangular Inequality

The triangular inequality is one of the properties of a metric space: d(A,C) ≤ d(A,B) + d(B,C). It does not hold for the spectral similarity measures. This restricts the techniques that can be used to efficiently index large music collections. The extent to which the triangular inequality does not hold can be measured by computing the percentage of all triangles in a collection (DB-L in this case) that do not fulfill the inequality. Instead of computing all possible triangles, 150000 triangles were randomly sampled. For G1 the inequality holds on average for 29% of the cases, and for 36% with G1C. Again, this increase is favorable because it makes the similarity space easier to deal with (e.g. when developing heuristics for playlist generation). However, the triangular inequality is not necessarily a concept humans apply for similarity judgments. For example, let A be a piece of classical music with a popular melody, such as “Für Elise” by Beethoven. Let C be some extreme techno piece. Let B be an extreme techno piece with the melody of Für Elise. Obviously, A and B have some similarities, and so do B and C. However, A and C could hardly be any more dissimilar.
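The sampling procedure can be sketched as follows, again assuming a precomputed n x n distance matrix D.

%% Sketch: estimate how often the triangular inequality holds by sampling
%% random triangles from the distance matrix D.
n = size(D, 1);
num_triangles = 150000;
ok = 0;
for t = 1:num_triangles
    idx = randperm(n);
    a = idx(1); b = idx(2); c = idx(3);
    ok = ok + (D(a,c) <= D(a,b) + D(b,c));
end
fraction_holds = ok / num_triangles;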

2.2.7.3 Always Dissimilar

Some pieces are totally different from all others according to G1. This can be seen in the distance matrix shown in Figure 2.23 as black vertical and horizontal lines. These pieces originate from three albums. One album is from the genre a cappella: Golden Gate Quartet, When the Saints Go Marching In. The two other albums are from the genre jazz guitar: Barney Kessel, Solo; and Herb Ellis & Joe Pass, Two For The Road. About 1.5% of the songs in DB-L are extremely dissimilar. The fact that only so few pieces are affected makes it difficult to measure the effect in terms of the nearest neighbor genre classification evaluation procedure used later on. The extremely high dissimilarity means that these pieces would seldom or even never occur in automatically created playlists, in contrast to the “always similar” pieces. Although the combination of G1 with FP and other features reduces the effect, further work is needed to ensure that the distances are less extreme. One option might be to optimize the scaling of G1. To evaluate the effects it is necessary not only to consider the most similar piece to each piece but rather to evaluate whether a similarity measure can retrieve all similar pieces in a collection.

2.3 Optimization and Evaluation

Different features have different advantages. Thus, it is straightforward to combine them. This leads to two questions. First, how can the performance of a (combined) similarity measure be measured? Second, how can the search space of all possible combinations be limited?

Figure 2.23: Distance matrix for DB-L computed with G1C. The pieces are grouped by genre: A Cappella, Blues, Bossa Nova, Celtic, Death Metal, DnB, Downtempo, Electronic, Euro-Dance, Folk-Rock, German Hip Hop, Hard Core Rap, Heavy M. - Thrash, Italian, Jazz, Jazz Guitar, Melodic Metal, Punk, Reggae, Trance, and Trance2.

Measuring Similarity Measures?

In an ideal case, a listening test is used to assess the performance of a similarity measure. Such a test is presented in Subsection 2.3.6.3 to evaluate the best combination compared to a baseline. However, it is impracticable to use listening tests to evaluate thousands of different variations of similarity measures. A simple alternative is to evaluate combinations using a genre classification based approach. The assumption is that the most similar pieces to each piece belong to the same genre. The exact procedure and the problems involved in this approach are described later on.

Limiting the Search Space?

The complete space of interesting combinations according to Equation 2.41 is huge. With 14 features and one spectral similarity measure there are already 15 weights that can be set. (The weights are non-negative and sum up to 1.) If the weights are set to multiples of 0.1, almost 2 million combinations are possible. For multiples of 0.05 the number of possible combinations is beyond 10^9. In this section the search space is limited by selecting the most interesting features. In particular, each feature is combined with the spectral similarity and only those which perform well are considered further.

Structure of this Section

The next subsection gives an overview of related work dealing with different approaches used to evaluate audio-based music similarity measures. Subsection 2.3.2 describes the procedure, in particular the genre-based evaluation approach. Subsection 2.3.3 presents the data used for the optimization and evaluations. In total there are 6 collections with more than 20000 pieces and more than 60 genres. Subsection 2.3.4 describes the concept of an artist filter which should be used for the evaluation of audio-based genre classification and music similarity. Subsection 2.3.5 presents the optimization of the parameters for the linear combination. Particular effort is made to avoid overfitting. Subsection 2.3.6 evaluates the best combination compared to a baseline (which uses only spectral similarity).

2.3.1 Related Work

Several approaches to compute audio-based music similarity have been published. However, evaluating these approaches is difficult. First, most implementations have not been made freely available. Second, in most cases it is not possible to share the music collections used for evaluation due to copyright restrictions. Third, results on different music collections (and genre taxonomies) are not comparable. One solution has been to reimplement approaches by other authors. For example, in [PDW03b] five different approaches were implemented. However, there is no guarantee that these implementations are correct. In fact, one approach ([LS01], and in particular the KL-divergence) was not implemented correctly in [PDW03b]. Another example is [BP03], where the authors reimplemented one approach and developed their own variation. An alternative is the collaboration of authors (e.g. [BLEW03]), the use of Creative Commons music which is easily available (such as the Magnatune collection used for the ISMIR'04 contest), or collections especially created for such purposes such as the RWC music database [GHNO03]. For commercially interesting music (and thus not easily accessible music) one solution is a centralized evaluation system [DFT04], as used for the ISMIR'05 evaluation exchange (MIREX).

Logan & Salomon

Given different algorithms, or variations of the same algorithm, different procedures have been used for evaluation. One of the first was published by Logan & Salomon [LS01]. They presented three evaluations. First, they evaluated the optimum number of MFCCs. Second, they used a listening test to compare their best parameter settings against random shuffling. Third, they measured how well the original version of a song can be retrieved given a clipped version of the original as query. They evaluated the optimum number of MFCCs as follows. For each song in the collection, they computed a list of 20 similar pieces. (The song to which the similarities are computed is referred to as the seed.) In these lists they counted the average number of songs (1) in the same genre as the seed, (2) from the same artist as the seed, and (3) on the same album as the seed. The correlation of the results for these 3 criteria is very high. The listening test Logan & Salomon used was set up as follows. Two independent listeners were asked to rate 20 lists of similar songs (each with 20 songs). The 20 seed songs were randomly selected. For each song the listeners were requested to rate whether they consider it similar to the seed or not (yes/no). The concept of similarity was only defined vaguely. Nevertheless, there was high agreement: the listeners disagreed in only 12% of the cases. This consistency between listeners is confirmed by the results of the listening test presented in this thesis. Logan & Salomon report the average number of similar songs (in the top 5, 10, 20 ranked positions) compared to random and show that their approach works significantly better. In particular, random selection finds on average 0.8 similar songs in the top 20 while their approach finds 8.2 songs.

Aucouturier & Pachet

In [AP02a] the authors hand-optimized parameters such as the number of clusters and MFCCs. They report the following evaluation statistics. (1) They report the average number of songs in the same genre (for the closest 1, 5, 10, 20, 100 songs). (2) They measure the "overlap on the same genre". Given a seed song, let dm be the mean distance between songs from the seed's genre. Let n1 be the number of pieces not belonging to the seed's genre, and let n2 be the number of pieces not belonging to the seed's genre which have a distance to the seed smaller than dm. The overlap is defined as the ratio n2/n1. In addition they report the "overlap on a different genre", which is the proportion of songs in the seed's genre which have a distance to the seed larger than dm. (3) They report precision and recall. Furthermore, in addition to these statistics the authors briefly report a listening test with 10 listeners who were asked to rate songs by similarity. In particular, given a seed they were asked to rate the similarity of two songs; one of these was closer to the seed and the other was further away (according to their similarity measure). The listeners agreed in 80% of the cases with the ordering of the similarity measure. In [AP04a] the authors report results of an extensive evaluation. In particular, they analyzed the impact of parameters such as the number of MFCCs, the number of Gaussians used for the GMM, and the window size for the MFCC computation. To measure the differences they use the R-precision (that is, the precision within the R closest songs to the seed, where R is the number of relevant songs). In [AP04b] the authors describe the framework and tools used to obtain the results published in [AP04a]. The authors identify two types of similarity evaluations: (1) rankings and (2) classes. With rankings they refer to the style of evaluations conducted in, e.g., [BLEW03]. This requires similarity data which is usually difficult to gather. The second type is the approach they use themselves and has been used, e.g., in [LS01] and [PDW03b]. The authors suggest using TREC16 style evaluation metrics for evaluations of this type. This is done, e.g., in [SK05] and [LR05]. The use of precision and recall statistics is also discussed in [Bau05].

Others

In [PDW03b] an evaluation of 5 different similarity measures was presented using the following cluster quality criterion. First, the average distance between all pieces was computed. This average distance was then compared to the average distance within a group (groups were defined either in terms of artists, genres, albums, tones, or styles). A ratio of one means that the distances between arbitrary pieces and the distances between members of a group are about the same. On the other hand, if the distances within the groups are very small compared to the overall average distance, then the groups are better distinguishable from each other.
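One plausible reading of this criterion, written out as a small sketch (assuming a distance matrix D and one group label per piece; a ratio well below one indicates well-separated groups):

    import numpy as np

    def cluster_quality_ratio(D, groups):
        """Average within-group distance divided by the average distance over all pairs."""
        groups = np.asarray(groups)
        i, j = np.triu_indices(len(groups), k=1)   # each pair of pieces counted once
        pair_dist = D[i, j]
        overall = pair_dist.mean()
        within = pair_dist[groups[i] == groups[j]].mean()
        return within / overall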

16 http://trec.nist.gov

One of the outcomes of this evaluation was that spectrum histograms performed very well. However, in [Pam04], where the same 5 measures were evaluated using the R-precision, the results were very different. In particular, it seems that cluster quality is a suboptimal criterion for evaluating similarity measures.

In [AHH+03] the authors briefly report a listening test where they compare their algorithm against random shuffling. They selected 20 random seeds and computed the 5 nearest neighbors to each seed. In addition, for each seed they randomly selected one piece from the collection, and the piece furthest away from the seed (according to the similarity measure). They asked 10 subjects to rate the similarity of these 7 pieces. As a result the authors are able to show that their approach works better than random.

The measurable difference to a purely random approach depends on the music collection used. For example, [PE02] evaluated their playlist generation approach against random shuffling on a collection consisting of only jazz music. The differences they observed are much smaller than those reported in [AHH+03]. Ultimately, evaluating similarity measures within their application context is the only way of finding out whether they are useful and whether differences are significant. Application-based evaluations have been reported, e.g., in [BH04; VP05; GG05; PPW05a; PPW05c].

In [LEB03; BLEW03] the authors suggest a solution to the problem of not being able to share music collections for evaluation purposes. In particular, they recommend sharing features extracted from the audio instead of the audio data itself. Furthermore, they use subjective similarity data gathered from surveys, experts, playlists, and directly from the contents of user collections to evaluate similarity measures. One major disadvantage of such data is its sparsity. That is, similarity connections are usually only defined for a tiny fraction of all possible connections.

In [PFW05b] the criterion used to evaluate the measures is the accuracy of a nearest neighbor classifier in combination with leave-one-out cross-validation. This means that for each song in the collection the nearest neighbor is computed. The average number of times this nearest neighbor is from the same genre as the seed is the accuracy of the "classifier". In addition an artist filter is used, which will be discussed later on. One of the advantages of using classification accuracies to evaluate similarity measures is that they allow a comparison to the large number of publications on genre classification.

2.3.1.1 MIREX

Recently, within the MIR community there has been a strong effort to standardize evaluations. As part of these efforts, there have been contests to evaluate the best genre classification and artist identification algorithms at ISMIR'04 and at ISMIR'05, where the contest was named MIREX (Music Information Retrieval EXchange).17

Of particular interest is that at MIREX'06 there will be a music similarity task for the first time. The discussion on the optimum evaluation procedure is ongoing at the time of writing this thesis. Before describing the 2004 and 2005 contests, the following paragraphs discuss the relationship between genre classification and artist identification, and how genre classification can be used to evaluate similarity measures.

Genre Classification and Artist Identification

In 2004 and 2005 one of the tasks was audio-based artist identification. The task is to identify the artist given a piece. The advantage of artist identification is that there is an objective ground truth. However, it is not so clear what the real world applications could be. Although rich metadata is rare, tracks usually have an artist name assigned to them. If a track does not have any metadata associated with it, then an obvious approach would be to use fingerprinting technology to identify the track and obtain its metadata.

Compared to genre classification there is an obvious technical difference: the artist identification task has more and smaller categories. This can be a problem for some classifiers whose complexity depends on the number of categories.

In general, an algorithm that performs well on artist identification might not perform well on genre classification. In particular, this can be the case if the algorithm focuses on production effects or a specific instrument (or voice) which distinguishes an artist (or even a specific album), that is, if the algorithm focuses on characteristics which a human listener would not consider relevant for defining a genre.

However, genre classification is often evaluated on music collections where all pieces from an artist have the same genre label without using an artist filter. An artist filter ensures that, given a piece to classify, no other pieces from the same artist have been used for training the classifier. An algorithm that can identify an artist will also perform well on genre classification if no artist filter is used. In such a case the genre classification and artist identification tasks are far more similar than they might appear at first. In 2004 and 2005 no artist filters were used and all pieces from an artist were assigned to the same genre.

Similarity and Genre Classification

One of the simplest ways to evaluate similarity measures is through genre classification. The assumption is that very similar pieces belong to the same genre. A classifier used to evaluate similarity measures should be directly based on the similarity computations. A straightforward choice is to use a nearest neighbor classifier.

17 http://www.music-ir.org/mirexwiki

The question which remains is whether a classifier should be allowed to fine-tune some similarity parameters based on the training data. One approach to answering this question is to look at specific application scenarios. For example, if the similarity measure is applied to a collection which has already been classified (manually) into genres, then such optimizations make sense. In this chapter this assumption is not made.

ISMIR 2004

The first audio-based genre classification contest was organized by the Music Technology Group (MTG) at Universitat Pompeu Fabra (Barcelona, Spain) in 2004. MTG supplied a training set on which participants could train their classifiers. The participants would then submit a classifier which would be run on the test set located at MTG. The data used was the Magnatune18 collection, which is available under the creative commons license. A total of 5 groups participated in the contest. An approach based on G30 won the contest with 84% classification accuracy; an approach based on Fluctuation Patterns scored 70%. The results are available online.19

ISMIR 2005 (MIREX)

Based on the experiences gathered in 2004, the Java-based M2K (Music to Knowledge)20 framework, built on D2K (Data to Knowledge),21 was designed to standardize the interface for submissions within the IMIRSEL project [DFT04].

For the genre classification and artist identification contests two collections were used: the Magnatune collection as in 2004, and in addition the USPOP 2002 collection (which has been used previously, e.g., in [WL02]). As the USPOP collection contains copyright protected music, it was not possible to share it among the participants. Instead, the participants needed to submit algorithms that would be trained on the computers at IMIRSEL. The Magnatune collection was used by some groups to optimize their submissions, and some also had access to USPOP.

As in 2004 the main criterion was the classification accuracy. For the Magnatune collection a hierarchical taxonomy was used to differentiate between different types of errors (e.g., confusing jazz and blues versus jazz and heavy metal). Of particular interest are also the reported computation times: the measured differences were of a factor of 20 and more. Participants were given 24 hours for their algorithms to complete. Memory usage was not reported.

A total of 12 groups participated in the genre classification task and 7 in the artist identification task, which is a large increase compared to 2004. Participants were allowed to submit more than one algorithm.

18 http://www.magnatune.com
19 http://ismir2004.ismir.net/genre_contest/results.htm
20 http://www.music-ir.org/evaluation/m2k
21 http://alg.ncsa.uiuc.edu/do/tools/d2k

Rank  Participant           Hierarch.  Norm. Hierarch.  Raw    Norm. Raw  Time [hh:mm]  CPU Type
1     Bergstra et al. (2)   77.75      73.04            75.10  69.49      –             –
2     Bergstra et al. (1)   77.25      72.13            74.71  68.73      06:30         B
3     Mandel & Ellis        71.96      69.63            67.65  63.99      02:25         A
9     Pampalk               69.90      70.91            66.47  66.26      00:55         B

Table 2.2: Partial genre classification results for the Magnatune collection. 1005 tracks were used for training, 510 tracks for testing, and about seven genres needed to be classified. CPU Type A is a system with WinXP, Intel P4 3.0GHz, and 3GB RAM. CPU Type B is a system with CentOS, Dual AMD Opteron 64 1.6GHz, and 4GB RAM.

The details of the results are available online.22 Some of the results are given in Tables 2.2-2.5. Normalized accuracies are computed by weighting the accuracy with respect to each genre equally (and thus ignoring the uneven distribution of genres). The difference between hierarchical and raw is that for the former the hierarchical taxonomy was used to weight errors differently.

The winning algorithm for the genre classification task was submitted by Bergstra et al. [BCE05] and is based on a powerful AdaBoost classifier. However, it is not straightforward to adapt this approach for similarity computations. The winning algorithm for artist identification, submitted by Mandel & Ellis [ME05], is based on the G1 spectral similarity described earlier. However, instead of using a nearest neighbor classifier to evaluate the performance in terms of a similarity measure, an SVM variant is used.

The algorithm submitted by the author of this thesis was based on G30S combined with FP, FP Gravity, and FP Focus (this combination is referred to as M'05; for details see [Pam05]). It was the only algorithm using a nearest neighbor classifier for the genre classification task and one of two for the artist identification task. As will be shown later in this section (Table 2.11), M'05 performs about the same as G1 on the Magnatune collection when both use a nearest neighbor classifier.
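A small sketch of how such a normalized accuracy could be computed (one reading of the description above, not the MIREX evaluation code):

    import numpy as np

    def normalized_accuracy(y_true, y_pred):
        """Mean of the per-genre accuracies, i.e. each genre is weighted equally."""
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        per_genre = [np.mean(y_pred[y_true == g] == g) for g in np.unique(y_true)]
        return float(np.mean(per_genre))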

2.3.2 Procedure

In this section two approaches are used. One is based on genre classification. It serves to explore large parameter spaces. The other is based on a listening test and is used to evaluate if the improvements are significant to human listeners.

22 http://www.music-ir.org/evaluation/mirex-results

Rank  Participant           Raw    Norm. Raw  Time [hh:mm]  CPU Type
1     Bergstra et al. (2)   86.92  82.91      –             –
2     Bergstra et al. (1)   86.29  82.50      06:30         B
3     Mandel & Ellis        85.65  76.91      02:11         A
4     Pampalk               80.38  78.74      00:52         B

Table 2.3: Partial genre classification results for the USPOP’02 collection. 940 tracks were used for training, 474 tracks for testing, and about four genres needed to be classified.

Rank  Participant           Raw    Norm. Raw  Time [hh:mm]  CPU Type
1     Bergstra et al. (1)   77.26  79.64      24:00         B
2     Mandel & Ellis        76.60  76.62      03:05         A
3     Bergstra et al. (2)   74.45  74.51      –             –
4     Pampalk               66.36  66.48      01:11         B

Table 2.4: Partial artist identification results for the Magnatune collection. 1158 tracks were used for training and 642 for testing.

Rank  Participant           Raw    Norm. Raw  Time [hh:mm]  CPU Type
1     Mandel & Ellis        68.30  67.96      02:51         A
2     Bergstra et al. (1)   59.88  60.90      24:00         B
3     Bergstra et al. (2)   58.96  58.96      –             –
4     Pampalk               56.20  56.03      01:12         B

Table 2.5: Partial artist identification results for the USPOP'02 collection. 1158 tracks were used for training and 653 for testing.

Nearest Neighbor Genre Classifier

To evaluate various parameter settings a genre-based approach is used. In particular, the classification accuracy of a nearest neighbor classifier is evaluated using leave-one-out cross-validation (as described in the related work). The accuracies are not normalized with respect to class probabilities. The basic assumption is that very similar pieces belong to the same genre.

The main advantage of this approach is its simplicity. Genre tags for music are readily available, the results are easy to interpret, and they are comparable to related work published on genre classification (which is particularly interesting as the amount of work published on genre classification is much larger than the work directly related to similarity measures). However, evaluating only the nearest neighbor also has limitations. No distinction is made as to whether all pieces which are close to the seed are perceptually similar, and no distinction is made as to whether all pieces perceptually similar to the seed are also close to the seed.

There are two points to consider when arguing against using only the nearest neighbor. First, not all songs in a genre are similar to each other. Thus, it is not possible to assume that all pieces from the same genre are similar to the seed. If similar songs were defined manually, an evaluation which considers more than just the nearest neighbor would make a lot more sense. Second, assuming that the application is playlist generation, one might only need very few similar songs per song to create an interesting "chain" of songs. That is, depending on the application, it might not be necessary to find all songs similar to a single seed. Nevertheless, the final results in this section indicate that considering more than only the nearest neighbor leads to more accurate evaluations. However, the question remains whether differences which can only be measured in such a way are significant.
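As a concrete sketch of this evaluation (assuming a precomputed distance matrix D and one genre label per piece; an illustration, not the exact code used for the experiments), the leave-one-out k-NN accuracy can be computed as follows:

    import numpy as np

    def loo_knn_accuracy(D, genres, k=1):
        """Leave-one-out k-NN genre classification accuracy."""
        genres = np.asarray(genres)
        correct = 0
        for seed in range(len(genres)):
            d = D[seed].copy()
            d[seed] = np.inf                          # leave the query piece out
            votes = genres[np.argsort(d)[:k]]
            labels, counts = np.unique(votes, return_counts=True)
            prediction = labels[np.argmax(counts)]    # majority vote, ties broken by label order
            correct += (prediction == genres[seed])
        return correct / len(genres)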

Genre Taxonomies

Much can be said about the usefulness or uselessness of music genres. The fact is that genre taxonomies are inconsistent and have a number of other limitations (which are discussed, e.g., in [PC00]). An obvious issue is that many artists have a very individual mix of several styles which is often difficult to pigeonhole. However, genres are widely used to manage large music collections, and genre labels for artists are readily available. Furthermore, as will be discussed in Subsection 2.3.6, the quality of most available genre annotations is significantly better than the quality which current genre classifiers and similarity measures can achieve. Thus, there is still a lot to be learned from genre annotations.

Avoiding Overfitting

Avoiding overfitting is one of the main topics throughout this section. Overfitting occurs when the parameters of a model are optimized in such a way that they produce good results on the specific data used to optimize them, but poor results in general. The following strategies are applied to avoid overfitting.

A. Multiple Collections
To avoid overfitting, 6 different music collections which are organized according to different taxonomies are used. Two collections are used to optimize the parameters and four collections to evaluate the results and test how well they generalize. The approach chosen in this thesis is very conservative. Alternatively, one could argue that overfitting is indeed context specific learning. However, such an assumption would require an appropriate evaluation.

B. Artist Filter
This important topic is discussed in detail in Subsection 2.3.4. The basic idea is to reduce the opportunities an algorithm has to optimize its parameters based on information that is perceptually not relevant.

C. Cross-Checking Spectral Similarity
In Subsection 2.2.3 three different approaches to compute spectral similarity were described. These are more or less identical in terms of the similarity aspect they describe. One of these approaches might be better than the others. However, in general any non-systematic deviation between them points towards an insignificant variance. In particular, any optimizations of G1 are cross-checked with G30S and vice versa to see whether the optimization can be generalized.

Optimization and Evaluation Outline
The procedure chosen in this thesis is as follows. First, features are selected to reduce the hypothesis space. Second, the parameters of the combination are optimized for the selected features using two music collections. Third, the parameters are evaluated on different collections. Finally, a listening test is used to test whether the differences measured through the genre classification approach are significant.

2.3.3 Data

For the optimization and evaluation a total of six collections are used. Each has its own genre taxonomy. There are more than 20000 pieces in total which are assigned to more than 60 different genres. The term genre is used very flexibly; for example, "genres" include "others" or "romantic dinner". Statistics of the collections are given in Table 2.6. A list of genres with the number of pieces and artists associated with each is given in Table 2.7.

        Genres  Artists  Tracks   Artists/Genre     Tracks/Genre
                                  Min      Max      Min      Max
DB-S    16      63       100      2        7        4        8
DB-L    22      101      2381     3        6        42       255
DB-MS   6       128      729      5        40       26       320
DB-ML   10      147      3249     2        40       22       1278
DB-30   29      528      1470     4        48       19       80
DB-XL   17      566      15335    5        113      68       3330

Table 2.6: Statistics of the six collections.

The collections DB-S, DB-L, DB-MS, and DB-ML have been previously used in [PFW05b]. Deviations in the exact number of pieces are caused by the MP3 decoder used and whether all files could be decoded correctly. For the experiments reported in this thesis the robust MAD23 decoder was used. Furthermore, in some cases very short pieces (e.g. shorter than 15 seconds) were removed from the collections.

In-House Small (DB-S)
The smallest collection consists of 100 pieces. The collection also includes one non-music category, namely speech (German cabaret). This collection has a very good (i.e., low) ratio of tracks per artist. However, due to its size the results need to be treated with caution. In this section DB-S is mainly used to demonstrate overfitting effects.

In-House Large (DB-L)
This collection is hierarchically structured according to genre/artist/album. All pieces from an artist (and album) are assigned to the same genre, which is questionable but common practice. Two pieces overlap between DB-L and DB-S, namely Take Five and Blue Rondo by the Dave Brubeck Quartet. The genres are user defined and inconsistent. In particular, there are two different definitions of trance. Furthermore, there are overlaps, for example, jazz and jazz guitar, heavy metal and death metal, etc. DB-L is one of the two collections used for optimization. In addition, the normalization parameters (for the variance normalization which is necessary to combine different similarity measures) are computed based on DB-L.

Magnatune Small (DB-MS)
This collection was used as the training set for the ISMIR'04 genre classification contest. The music originates from Magnatune24 and is licensed under creative commons.

23 http://www.underbit.com/products/mad
24 http://www.magnatune.com

MTG25 compiled the collection. Although it is a subset of DB-ML it is particularly interesting as it has been made available to the research community. Furthermore, to some extent it can be used to compare results to those of the ISMIR'04 contest. However, while the code which won the contest achieves 79% accuracy on this training set, the accuracy on the test set was 84%. This is probably related to the artist filter issue discussed below, as the pieces of each album were split evenly between training and test set and all pieces from an artist belong to the same genre. The genre labels are given on the Magnatune website. The collection is very unbalanced: most pieces belong to the genre classical, and a large number of pieces in world sound like classical music. Some of the original Magnatune classes were merged by MTG due to ambiguities and the small number of tracks in some of the genres. DB-MS is one of the two collections used to optimize the parameters.

Magnatune Large (DB-ML)
DB-ML is a superset of DB-MS. The number of artists is not much larger than in DB-MS and the genres are equally unbalanced. The genres which were merged for the ISMIR'04 contest are separated. DB-ML is used to see whether the parameters optimized using DB-MS perform as well on a larger collection with very similar characteristics.

In-House 30 Seconds (DB-30)
All pieces in this collection have a length of 30 seconds. This length is of interest as many online music shops offer their customers 30-second previews. The genre tags for this collection are those shown to the customers of an online music store. Some of these genres are obviously very difficult if not impossible to distinguish for a simple audio-based similarity measure. For example, the genre named Christmas contains Christmas songs which are primarily recognizable by their lyrics. DB-30 is of particular interest as it reflects the taxonomies currently in use by music distributors and points out the limits.

In-House Extra Large (DB-XL)
This is the largest collection used. The genre labels are assigned on a piece level according to Gracenote.26 DB-XL is used to test whether the optimizations can be generalized or not. Due to its size it is by far the most interesting collection.

2.3.4 Artist Filter

A key issue in evaluating classification is the use of an artist filter [PFW05b]. An artist filter ensures that the training and test set contain different artists.

25 http://www.iua.upf.es/mtg
26 http://www.gracenote.com

DB-S alternative (6/4), blues (8/2), classic orchestra (6/6), classic piano (7/2), dance (6/4), (5/5), happy sound (6/3), hard pop (5/2), hip hop (7/3), mystera (8/2), pop (6/5), (8/7), rock (4/4), rock & roll (6/5), romantic dinner (6/6), speech (6/3)

DB-L a cappella (112/4), acid jazz (68/4), blues (63/4), bossa nova (72/4), celtic (132/5), death metal (69/4), DnB (71/5), downtempo (117/4), electronic (63/4), euro-dance (96/6), folk-rock (233/5), German hip hop (123/6), hard core rap (122/5), heavy metal/thrash (241/5), Italian (142/5), jazz (63/4), jazz guitar (70/5), melodic metal (119/4), punk (255/6), reggae (44/3), trance (64/5), trance2 (42/4)

DB-MS classical (320/40), electronic (115/30), jazz/blues (26/5), metal/punk (45/8), pop/rock (101/26), world (122/19)

DB-ML ambient (143/6), classical (1278/40), electronic (459/30), jazz (104/5), metal (116/6), (191/13), pop (22/2), punk (64/2), rock (383/24), world (489/19)

DB-30 ambient/new age (65/12), blues (42/11), children/others (52/12), Christmas (39/22), classical (44/20), country (59/18), dance (56/28), Dutch (19/4), /oldies (69/33), electronic (66/17), folk (52/16), German pop (60/23), gospel/Christian music (59/16), indie/alternative (28/15), instrumental (60/7), jazz (49/16), latin (52/9), Italian (22/8), pop (80/48), others (58/13), rap/hip hop (23/12), reggae (30/7), rock (50/36), Schlager (69/21), soul/R&B/funk (44/28), soundtracks (58/29), speech (75/10), Variété Française (33/24), world (57/31)

DB-XL alternative & punk (2076/95), blues (254/13), Christmas (68/5), classical (626/25), country (1968/88), easy listening (119/8), electronica/dance (1861/93), folk (158/11), hip hop/rap (682/28), jazz (3331/113), latin (366/18), new age (263/16), pop (551/32), R&B (1097/50), reggae (131/6), rock (1491/53), world (294/17)

Table 2.7: List of genres for each collection. For each genre the first value in the brackets is the number of pieces, the second value is the number of artists.

             DB-S    DB-MS             DB-ML             DB-L
      AF      1      1   5   10  20    1   5   10  20    1   5   10  20
G30   –      52     79  77  74  70    80  79  78  75    71  68  67  63
      +      29     64  67  67        56  59  60        27  28  30  31
G30S  –      53     78  75  71  68    79  77  76  73    71  69  68  65
      +      29     62  65  63        51  55  57        25  28  30  31
G1    –      56     78  76  74  70    81  79  77  74    76  73  72  69
      +      31     63  65  66        54  57  59        28  30  31  31
FP    –      47     61  62  64  64    64  64  64  63    45  45  44  43
      +      30     54  56  60        50  53  54        24  23  23  22

Table 2.8: Leave-one-out cross-validation results using a k-NN classifier with k = 1, 5, 10, 20. If in any of the genres less than k items remained after applying the artist filter, then the k value was not evaluated and left blank.

In particular, when using spectral information the classification accuracies are otherwise overestimated. Furthermore, an algorithm optimized without an artist filter might focus on perceptually less relevant information such as production effects. Thus, an artist filter is necessary to measure generalization capabilities more accurately.

In addition, one can argue that filtering out results from the same artist is important in most application scenarios for similarity measures. This is particularly true for recommendation, where one can expect the user to know the artists of the piece which the user selected as query. Furthermore, it is also true for automatically created playlists, where in general a good playlist is not one which contains only pieces from the same artist.

Table 2.8 shows the results of a nearest neighbor classification (measured using leave-one-out cross-validation) for four of the similarity measures described earlier. AF+ marks lines where an artist filter was used; AF− marks lines evaluated without an artist filter. As can be seen, the difference between AF+ and AF− almost reaches 50 percentage points for G1 and DB-L. This large difference is partly due to the low number of artists per genre. However, even for DB-XL (results in Table 2.11) the difference between AF+ and AF− is still 25 percentage points.

In addition to the nearest neighbor, k-NN accuracies are given for k = 5, 10, and 20 to study the effect of using (or not using) an artist filter more accurately. Particularly interesting is that the AF+ accuracies consistently decrease with increasing k while the AF− accuracies increase. This confirms that results measured without an artist filter are not reliable performance indicators. Noticeable is also that FP is not as strongly affected. This suggests that artist-based overfitting is particularly a problem when using spectral similarity.
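In terms of the leave-one-out evaluation sketched earlier, the artist filter simply excludes all candidates by the seed's artist before ranking; a minimal illustration (artist labels assumed to be available per track):

    import numpy as np

    def loo_knn_accuracy_af(D, genres, artists, k=1):
        """Leave-one-out k-NN accuracy with an artist filter."""
        genres, artists = np.asarray(genres), np.asarray(artists)
        correct = 0
        for seed in range(len(genres)):
            d = D[seed].copy()
            d[artists == artists[seed]] = np.inf   # artist filter (also removes the seed itself)
            votes = genres[np.argsort(d)[:k]]
            labels, counts = np.unique(votes, return_counts=True)
            correct += (labels[np.argmax(counts)] == genres[seed])
        return correct / len(genres)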

Conclusions

One obvious conclusion is that all results reported in the remainder of this thesis (unless stated otherwise) use an artist filter. Another conclusion from Table 2.8 is that no further G30 results are computed. The results show a high correlation with G1 and G30S (Figures 2.20 and 2.21), the performance is similar (Table 2.8), and G30 is more than a factor of 100 slower than G30S and more than a factor of 1000 slower than G1 (Table 2.1).

2.3.5 Optimization

The goal of this subsection is to optimize the weights of the linear combination. First, the size of the parameter space is reduced by selecting the most promising features for the combination. Second, the results of an exhaustive search in the remaining space are presented.

2.3.5.1 Combining Two (Feature Selection)

Section 2.2 describes a total of 14 features which can be combined (according to Equation 2.41) with a spectral similarity measure. Assuming that all weights are in the range of 0 to 1 with a step size of 0.1 means that nearly 2 million different combinations would need to be explored, which is computationally too expensive. Thus, it is necessary to select a subset of features for the combination. To select this subset the 14 features are combined individually with spectral similarity. Those which perform best are used for further analysis. To avoid overfitting effects the combinations are measured with G1 and G30S and the results are averaged. In addition, two collections are used (DB-MS and DB-L).

Figure 2.24 shows all the combination results. In addition to combining the 14 features with spectral similarity, G1 is also combined with G30S and vice versa (this combination is abbreviated as "Spec. Sim."). Based on the assumption that combining these does not introduce new information, this serves as a baseline for significant improvements. The last column in Figure 2.24 shows the classification accuracies if the features are used by themselves. FPs achieve the highest individual accuracy with 54% on DB-MS and 24% on DB-L.

Noticeable is that most combinations improve the spectral similarity only marginally. In several cases there is no improvement. Using the combination of G1 and G30S as significance baseline, improvements of up to 1 percentage point can be considered insignificant on DB-MS. Furthermore, noticeable are the smooth changes of the classification accuracy between similar weights. That is, changing the weights by 10 percentage points generally does not have a large impact on the accuracies. The exception is when the weight of spectral similarity is dropped from 10% to 0%. At this point a large decrease in accuracy is noticeable for most of the features.
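The combinations evaluated here follow the weighted-sum scheme of Equation 2.41; a rough sketch of one way to implement such a combination, assuming each aspect is available as a distance matrix which is normalized with mean and standard deviation estimated on DB-L (as mentioned for the normalization parameters above):

    import numpy as np

    def combine_distances(distance_matrices, weights, norm_stats):
        """Weighted sum of normalized distance matrices.

        distance_matrices -- list of (n x n) matrices, one per similarity aspect
        weights           -- mixing coefficients summing to 1
        norm_stats        -- list of (mean, std) pairs, e.g. estimated on DB-L
        """
        combined = np.zeros_like(distance_matrices[0], dtype=float)
        for D, w, (mu, sigma) in zip(distance_matrices, weights, norm_stats):
            combined += w * (D - mu) / sigma
        return combined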

[Figure 2.24 consists of four accuracy grids (G30S/DB-MS, G1/DB-MS, G30S/DB-L, G1/DB-L) with one row per feature (RMS, ZCR, Noisiness, SC, Avg. Loudness, Perc., FP, FP Gravity, FP Focus, FP Max, FP Sum, FP Bass, FP Aggr., FP DLF, Spec. Sim.) and one column per mixing coefficient from 100% to 0% spectral similarity.]

Figure 2.24: The results of combining two measures for DB-MS and DB-L. All numbers are given in percent. The numbers below the tables show the mixing coefficients. The first column uses only (100%) spectral similarity, the last column uses none (0%). For example, in the table in the upper left (DB-MS/G30S) the fourth column in the second row is the accuracy for combining 30% ZCR with 70% G30S.

              Overall   DB-MS              DB-L               G30S    G1
              Sum       G30S  G1    Sum    G30S  G1    Sum    Sum     Sum
FP Gravity    17.5      3.8   4.5   8.4    6.6   2.5   9.1    10.4    7.0
FP            16.1      1.6   2.3   4.0    7.9   4.2   12.1   9.5     6.5
Perc.         8.4       3.3         3.3    4.1   1.0   5.1    7.4     1.0
Noisiness     5.9                          4.4   1.6   5.9    4.4     1.6
FP Bass       5.5       1.6   1.1   2.7    2.7         2.7    4.4     1.1
FP DLF        5.4       1.9   1.8   3.7    1.7         1.7    3.6     1.8
Spec. Sim.    5.2       1.5   1.1   2.6    2.6         2.6    4.1     1.1
SC            3.7                          3.5   0.2   3.7    3.5     0.2
Avg. Loud.    3.3       0.5         0.5    2.0   0.8   2.8    2.6     0.8
FP Sum        3.2                          3.0   0.2   3.2    3.0     0.2
FP Focus      2.7             1.8   1.8    0.9         0.9    0.9     1.8
FP Aggr.      2.1                          2.1         2.1    2.1
FP Max        1.7                          1.0   0.7   1.7    1.0     0.7
ZCR           1.3                          1.3         1.3    1.3
RMS           0.1                          0.1         0.1    0.1

Table 2.9: Maximum increase in accuracy (absolute percentage points) per feature in combination with spectral similarity. Zero values are omitted.

Based on these results the 14 features are ranked in Table 2.9. For each feature the maximum possible improvement is considered. Particularly interesting is that the ranking does not correspond to the individual performance of the features. For example, the individual performance of RMS is higher than that of several of the other features. However, RMS is ranked last in Table 2.9. Every improvement below the improvement of combining G1 with G30S can be considered insignificant. Particularly noticeable are the poor performances of RMS, ZCR, and FP Focus.

FP and FP Gravity are ranked top whether the average performance per collection or the average performance per spectral similarity measure is considered. Except for those features which completely fail, it is not so obvious which of the other features are useful. Some work well combined with G1 (e.g. FP DLF), others with G30S (e.g. Percussiveness). Some show significant improvements only on DB-MS (e.g. FP Focus), others on DB-L (e.g. Noisiness). For simplicity, the top 6 features (all ranked above the G1-G30S combination) are selected for further optimization. These features are FP Gravity, FP, Percussiveness, Noisiness, FP Bass, and FP DLF.

2.3.5.2 Combining Seven

There are 8008 possible combinations to consider given the 6 features selected above plus one spectral similarity measure, and using weights with a step size of 0.1. The outcome of the evaluation of these 8008 combinations is summarized in Figure 2.25.

The best combination weight for spectral similarity (for all 4 variations) is around 60-80%. The largest increase occurs for DB-L with G30S, where the increase is more than 10 percentage points (compared to using only G30S). Particularly interesting is that for DB-L there are combinations without spectral similarity which achieve the same or better performance as the spectral similarity by itself. For DB-MS there are combinations which use only 10% spectral similarity and outperform spectral similarity. Finding combinations which do not use spectral similarity can be of interest if a vector space is a requirement of the application (or if computation time is very critical). Furthermore, noticeable are the smooth changes in accuracy on the x-axis, except between 10% and 0% spectral similarity. Features which do not have their maximum value in the first column require a minimum contribution greater than 0 to achieve the best results. For example, for DB-MS with G30S at least 10% contribution of FP Gravity and FP DLF is required to achieve 68% accuracy.

Table 2.10 lists the 10 combinations with the highest score. The score of a combination is computed as the mean absolute increase (in percentage points) compared to the baseline (i.e., using only spectral similarity). The highest score is an average increase of 6.14 percentage points. Considering how much room there is for improvement this is rather disappointing. Noticeable is that the best combinations frequently include FP, FP Gravity, and FP Bass. However, in most cases their contribution is not higher than 10%. Very interesting is that the difference between the highest possible scores and the highest average score is not very large. The largest difference is 1.2 percentage points for G1 and DB-L. These small differences do not justify using specifically optimized weights for each collection. The best combination, 10% FP + 10% FP Gravity + 10% FP Bass + 70% G1, will from now on be referred to as G1C (where the C stands for combined).

A question which remains open is how significant these improvements are. Figures 2.26-2.28 give an impression of the variance one should expect. Figure 2.26 shows that combinations with zero contribution of spectral similarity score consistently worst. This is clearly visible in the zero variance for the combinations from about rank 5000-8008. (The reason for the zero variance is that there is no difference between G1 and G30S combinations if the contribution of the spectral similarity is 0.) The deviations between G1 and G30S are relatively small compared to the performance increase. At least the top 500 combinations seem significantly better than the baseline. These combinations have in common that they use at least 20% spectral similarity.

Figure 2.27 shows even lower variances for DB-L (compared to DB-MS). Noticeable is that several 0% spectral similarity combinations score better than the 100% baseline. At least the top 1000 combinations seem significantly better than the baseline.
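The figure of 8008 corresponds to all ways of distributing a total weight of 1.0 over seven measures in steps of 0.1 (C(16,6) = 8008 compositions). A small sketch of how such a grid can be enumerated (illustrative only; the scoring itself would use the leave-one-out accuracy described above):

    from itertools import product

    def weight_grid(num_measures=7, steps=10):
        """All mixing-coefficient vectors with the given step size that sum to 1."""
        for parts in product(range(steps + 1), repeat=num_measures - 1):
            rest = steps - sum(parts)
            if rest >= 0:
                yield tuple(p / steps for p in parts) + (rest / steps,)

    grid = list(weight_grid())
    assert len(grid) == 8008   # 7 measures, step size 0.1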

[Figure 2.25 consists of four grids (G30S/DB-MS, G1/DB-MS, G30S/DB-L, G1/DB-L) with one row per measure (Spec. Sim., Noisiness, Perc., FP, FP Gravity, FP Bass, FP DLF) and one column per mixing coefficient.]

Figure 2.25: Summary of the 4×8008 results for combining 7 similarity measures computed on DB-MS and DB-L using G30S and G1. All values are given in percent. The mixing coefficients for the spectral similarity (the first row) are given above the table, for all other rows below. Each entry is the highest accuracy of all possible combinations with the given contribution of the respective measure. For example, the second row, third column depicts the highest accuracy obtained from all possible combinations with 20% Noisiness. The unspecified 80% can be any valid combination of mixing coefficients, e.g. 80% spectral similarity, or 70% spectral similarity and 10% FP Gravity, etc.

DB-MS DB-L FP Gravity Noisiness FP Bass FP DLF Spec. Sim. Rank Perc. FP G1 G30S G1 G30S Score 1 10 10 10 70 67.4 67.4 32.4 35.2 6.14 2 10 10 10 10 60 67.1 66.4 33.0 34.6 5.83 3 10 10 10 70 66.8 66.4 31.8 34.7 5.46 4 10 10 80 67.4 65.7 32.1 34.4 5.44 5 10 10 20 60 66.1 66.9 31.5 34.9 5.42 6 10 20 10 60 65.7 66.4 32.6 34.5 5.36 7 10 10 10 10 60 63.9 66.1 33.6 35.6 5.35 8 10 10 80 66.8 66.1 31.8 34.1 5.26 9 10 20 10 10 50 64.9 66.1 32.7 35.1 5.25 10 10 10 10 10 60 67.2 66.8 30.9 33.9 5.25 11 10 10 80 68.2 66.7 31.0 32.9 5.25 25 10 20 10 60 64.1 65.2 32.7 35.6 4.92 515 30 20 50 66.0 68.4 26.5 29.5 3.15 2666 100 62.8 62.4 27.6 25.0 0.00

Table 2.10: Top 10 combinations. The last line is the baseline using only spectral similarity. The first column is the rank. All values (other than the ranks) are given in percent. Values marked with a line below and above are the highest accuracy achieved for the specific combination of similarity measure and collection.

Figure 2.26: Accuracies for all 8008 combinations on DB-MS. The first plot shows the average accuracy for each combination (that is, the average performance when using G1 and G30S as spectral similarity). The combinations are sorted according to this average score. The dotted horizontal line marks the accuracy of the combination which scored best on average on DB-MS and DB-L (see Table 2.10). The solid horizontal line marks the accuracy of the baseline. The solid vertical line marks the position of the baseline combination (i.e., the combination which uses 100% spectral similarity). The last plot shows the contribution of the spectral similarity.

Figure 2.27: Accuracies for all 8008 combinations on DB-L. See Figure 2.26 for a description of the plots.

On the other hand, the improvements beyond the dotted line (the combination which performed best on average) are obviously not significant. The variances in Figure 2.28 are significantly higher compared to those in the previous two figures. In fact, except for G30S/DB-L, only the top 100 combinations seem to offer significant improvements compared to the baseline. This clearly shows the importance of using more than one music collection when evaluating similarity measures.

2.3.5.3 Overfitting

To demonstrate overfitting the DB-S collection is used. The 8008 combinations are evaluated for G1. The best combination has an accuracy of 43% (the baseline is 31%). The combination which performed best on average on DB-L and DB-MS (G1C) achieves 38%. Figure 2.29 shows that the variance is extremely high. In fact, the 11th best combination for DB-S has a score of −0.8 percentage points on average on DB-L and DB-MS.

Figure 2.28: Accuracies for all 8008 combinations sorted by their average scores on DB-MS and DB-L. For a description of the plots see Figure 2.26.

Figure 2.29: Demonstration of overfitting on the DB-S collection. The upper plot shows the (sorted) accuracies for the 8008 combinations. The lower plot shows the average score for DB-MS and DB-L. The dotted line is the score of the best combination on average (G1C). The solid line is the baseline (G1).

The best combination on DB-S has an average score of 2.6 on DB-L and DB-MS. There are a number of combinations in the top ranked positions which have accuracies lower than the baseline. Thus, the optimization in this case can be interpreted as classical overfitting. What has happened is that the collection used for the optimization of the combination was too small. However, DB-MS and DB-L are also quite small collections which reflect only a tiny fraction of all different styles of music. The following subsection evaluates whether the improvements are significant (to human listeners) and whether they are observable on larger collections.

2.3.6 Evaluation

This subsection will first analyze the difference between the baseline and the improvements. Second, the impact of the length of the analyzed audio is studied. Finally, a listening test is presented. The measures of interest are:

◦ G1: is much faster than G30S and the combinations perform more or less identically in terms of quality.

◦ G1C: is the best combination found in Subsection 2.3.5. G1C is the combination of 70% G1 + 10% FP + 10% FP Gravity + 10% FP Bass.

◦ M'05: is the similarity measure submitted by the author to the MIREX 2005 genre classification contest [Pam05] and combines 65% G30S + 15% FP + 15% FP Gravity + 5% FP Focus. M'05 was optimized using DB-L, DB-MS, DB-S, and DB-ML.

2.3.6.1 Analyzing the Improvement

Two questions will be dealt with in the following paragraphs. First, what is the difference between G1 and G1C? Second, is this difference related to overfitting?

To show the difference between G1 and G1C, confusion matrices are used for DB-L (Figures 2.30-2.32) and later for DB-XL (Figures 2.34-2.36). As can be seen in Figure 2.30, there are a number of confusions which make sense. For example, death metal is confused with heavy metal, and German hip hop with hard core rap. However, quite a number of confusions are very surprising, such as heavy metal/thrash with a cappella. Figure 2.31 shows that this particular confusion does not occur for G1C. However, other confusions are introduced which are not much better. Figure 2.32 shows the differences between the two. The most noticeable improvements are that 30 punk pieces are no longer classified as heavy metal/thrash, and 28 German hip hop and 19 hard core rap pieces are no longer estimated to sound like jazz. A cappella and punk are better distinguished from other genres, etc. However, on the other hand, 35 folk-rock pieces are classified as punk, the confusion between German hip hop and hard core rap is much larger, jazz is generally misclassified more often, etc.

Overall, one would expect that fewer confusions should occur between genres which have strong differences in characteristics related to rhythm, tempo, or beats. On the other hand, genres which exhibit large variations in these aspects and less variation in instrumentation might tend to be confused more often. Although there are such tendencies, it is rather difficult to find evidence for this in the confusion matrices.

The next question is: How does G1C perform on data not used for the optimization? In particular, is G1C a result of overfitting? That is, is G1 just as good (or even better) in general? To answer these questions the nearest neighbor accuracies (with leave-one-out cross-validation) were computed for all six collections. Of interest are the four not used to optimize the parameters (DB-S, DB-ML, DB-XL, DB-30). The results are shown in Table 2.11.

As can be seen, G1C performs better than G1 (measured using an artist filter and nearest neighbor classifier) on DB-S (5 percentage points), DB-ML (3 percentage points), and DB-30 (2 percentage points). However, on the large DB-XL collection there is no improvement (when considering only the nearest neighbor). Furthermore, the improvement on DB-ML is not a big surprise as DB-MS is a subset of DB-ML. M'05 performs slightly worse than G1C on DB-ML, DB-L, and DB-XL, and slightly better on the other collections. Considering the much larger number of features used in this thesis (compared to [PFW05b]) the improvements are rather disappointing.

An important question is whether these improvements matter to a listener. That is, how does the improvement of 3 percentage points in genre classification accuracy relate to a listener's experience? This question will be dealt with in Subsection 2.3.6.3. (As will be shown, in the case of DB-L the 3 percentage point improvement in genre classification accuracy corresponds to an improvement of over 7 percentage points in listeners' similarity ratings.)
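A minimal sketch of how such a confusion matrix can be derived from the leave-one-out nearest neighbor predictions (with the artist filter applied as before; row i, column j counts pieces of true genre i whose nearest neighbor belongs to genre j):

    import numpy as np

    def nn_confusion_matrix(D, genres, artists):
        """Confusion matrix of a leave-one-out 1-NN genre classifier with artist filter."""
        genres, artists = np.asarray(genres), np.asarray(artists)
        classes = list(np.unique(genres))
        C = np.zeros((len(classes), len(classes)), dtype=int)
        for seed in range(len(genres)):
            d = D[seed].copy()
            d[artists == artists[seed]] = np.inf   # artist filter
            predicted = genres[np.argmin(d)]
            C[classes.index(genres[seed]), classes.index(predicted)] += 1
        return classes, C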


Figure 2.30: Confusion matrix for DB-L and G1. Rows are the true genres, columns are the predicted genres.


Figure 2.31: Confusion matrix for DB-L and G1C. 2.3 Optimization and Evaluation 79


Figure 2.32: Difference between the DB-L confusion matrices of G1C and G1 (G1C minus G1).

             DB-S    DB-MS             DB-ML             DB-L
      AF      1      1   5   10  20    1   5   10  20    1   5   10  20
G1    –      56     78  76  74  70    81  79  77  74    76  73  72  69
      +      31     63  65  66        54  57  59        28  30  31  31
G1C   –      60     80  78  75  74    80  79  78  76    70  70  69  65
      +      36     67  68  68        57  59  60        32  35  34  33
M'05  –      56     82  76  72  70    80  79  77  74    74  72  70  67
      +      38     67  67  64        55  56  60        30  33  34  35

             DB-XL              DB-30
      AF      1   5   10  20     1   5   10
G1    –      59  58  58  57     41  36  34
      +      31  33  35  37     11  11  13
G1C   –      55  55  55  54     39  34  32
      +      31  35  36  38     13  13  15
M'05  –      55  55  55  54     41  36  35
      +      30  33  35  36     14  13  14

Table 2.11: Leave-one-out cross-validation results using a k-NN classifier with k = 1, 5, 10, 20. If in any of the genres less than k items remained after applying the artist filter, then the k value was not evaluated and left blank. 80 2 Audio-based Similarity Measures


Figure 2.33: Detailed differences between G1 and G1C measured on DB-XL with- out and with artist filter. The plots show the average percentage of pieces at a given rank position (the ranks are sorted according to the distance to the seed) which belong to the same genre as the seed. In particular, the closest 200 pieces to each piece in DB-XL are analyzed.

Figure 2.33 analyzes the difference between G1 and G1C on DB-XL in more detail. As can be seen, both perform the same for the nearest neighbor. However, if more neighbors are considered, then G1C consistently outperforms G1. A question which is not answered in this thesis is whether these improvements are noticeable to a human listener (within the context of an application).
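The curves in Figure 2.33 can be reproduced along the following lines (a sketch under the same assumptions as before; the artist filter is optional so that both panels can be computed):

    import numpy as np

    def same_genre_per_rank(D, genres, artists=None, max_rank=200):
        """Fraction of same-genre pieces at each rank position, averaged over all seeds."""
        genres = np.asarray(genres)
        hits = np.zeros(max_rank)
        for seed in range(len(genres)):
            d = D[seed].copy()
            if artists is not None:
                d[np.asarray(artists) == artists[seed]] = np.inf   # artist filter on
            else:
                d[seed] = np.inf                                   # artist filter off
            order = np.argsort(d)[:max_rank]
            hits += (genres[order] == genres[seed])
        return hits / len(genres)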

Interesting in Figure 2.33 is that the artist filter effect is very noticeable. In particular, if no artist filter is used, the performance for the 20 nearest neighbors is very high. Beyond these, the performance is very similar to the one obtained using an artist filter. On average there are 27 pieces per artist. This indicates that for most pieces 2/3 of the other pieces by the same artist are spectrally extremely similar.

Figures 2.34-2.36 show the confusion matrices for G1 and G1C on DB-XL. Both completely fail to classify Christmas songs (which is not a surprise) and easy listening. G1 frequently and G1C very frequently confuse country, alternative & punk, and rock. G1C distinguishes electronica/dance from other genres better than G1. In particular, the confusion with hip hop/rap is reduced. Furthermore, classical music is better distinguished by G1C. On the other hand, jazz is better distinguished by G1.

Another question that arises from these results is why the performance on DB-30 is so extremely low. One possible answer might be that the similarity measures do not work for very short (30 second) excerpts. This is analyzed in the following paragraph.


Figure 2.34: Confusion matrix for DB-XL and G1.


Figure 2.35: Confusion matrix for DB-XL and G1C.


Figure 2.36: Difference between the DB-XL confusion matrices for G1C and G1 (G1C minus G1).

2.3.6.2 Impact of the Length of the Analyzed Audio

As mentioned in Subsection 2.2.2.5, two minutes from the center of each piece are used for analysis. Table 2.12 shows the results when between 0.5 and 4 minutes from the center of each piece are used for analysis (using DB-MS). The general tendency is that the longer the analyzed excerpt, the better the results. However, the difference between 120 and 240 seconds is only minimal, while the feature extraction time is twice as long. Since even 30-second excerpts yield reasonable accuracies on DB-MS, the answer to the question raised above is that the performance on DB-30 is not low because of the short excerpts but because of the type of music and the taxonomy. The classification accuracies below 15% clearly point out the limits of using audio-based similarity measures to automatically classify music into the taxonomies used by music distributors.
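The excerpt selection itself is simple. The following sketch (a hypothetical helper, assuming the audio has already been decoded into a sample array; it is not the feature extraction code used in the experiments) shows how a centered excerpt of a given length can be cut out:

import numpy as np

def center_excerpt(samples, sample_rate, excerpt_seconds=120):
    """Return an excerpt of the requested length taken from the center
    of the piece; shorter pieces are returned unchanged."""
    excerpt_len = int(excerpt_seconds * sample_rate)
    if len(samples) <= excerpt_len:
        return samples
    start = (len(samples) - excerpt_len) // 2
    return samples[start:start + excerpt_len]

# Example: a 4-minute mono signal at 22.05 kHz reduced to its central 2 minutes.
signal = np.zeros(4 * 60 * 22050)
excerpt = center_excerpt(signal, 22050, excerpt_seconds=120)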

2.3.6.3 Listening Test

So far, all improvements have been measured using genre classification accuracies. However, it remains unclear whether these percentage points relate to judgments by human listeners. To measure this relationship the following listening test was conducted.

Setup

The listeners were presented with a seed song and two similar pieces. One of these pieces is the nearest neighbor according to G1, the other according to G1C. The listeners are not informed which of the two alternatives is which. Furthermore, all artist, album, and track names are removed. The listeners were asked to rate how much they like the seed and how similar they consider each of the two songs to be. The scale ranges from 1 to 9; the values are defined as follows: 1 terrible, 5 acceptable, 9 could not be better.

Length (s)   G1: AF−              G1: AF+       G1C: AF−             G1C: AF+
             1    5    10   20    1    5   10   1    5    10   20    1    5   10
30           74   73   70   68    59   65  62   75   74   73   69    63   65  65
60           79   74   71   69    63   65  64   77   74   73   71    64   66  65
90           80   76   73   69    63   66  66   79   77   75   72    65   68  66
120          78   76   74   70    63   65  66   80   78   75   74    67   68  68
180          81   77   76   70    64   66  67   82   77   77   75    68   68  69
240          81   76   76   72    63   67  68   81   79   77   73    66   69  69

Table 2.12: Impact of the length of the analyzed section of each piece on the classification accuracies. All accuracies are given in percent and are computed on DB-MS.

A stratified random sample of 100 seeds was selected from DB-L. Stratified, such that the ratio of seeds for which G1 and G1C have a nearest neighbor from the same genre as the seed corresponds to the distribution in the data. Thus, for G1C slightly more seeds were selected which had a nearest neighbor from the same genre. Ensuring a fair distribution is important if one assumes that there is a correlation between pieces being from the same genre and pieces being similar. Furthermore, a total of 11% of all possible seeds were ignored because for those G1 and G1C have the exact same nearest neighbor. That is, in 11% of the cases there is no difference at all (if only the nearest neighbor is considered). The reported results only apply to the remaining 89% of the cases. This needs to be considered when computing the overall difference.

One could argue that 100 seeds are not enough to reflect, on the one hand, the variation of music in the collection and, on the other hand, the variation of the performance of each similarity measure. In fact, it would be possible to select 100 seeds for which one measure performs perfectly, or, even easier, to select 100 seeds for which one measure fails. Furthermore, in addition to using as many seeds as possible, it would be very interesting to analyze more than just the nearest neighbor. Ideally, the listeners would be asked to rate the closest 20 or more pieces. (As shown in Figure 2.33, sometimes the differences are only measurable beyond the nearest neighbor.) However, organizing a large-scale listening test was beyond the scope of this thesis. To show significant differences between G1 and G1C on DB-XL a very large-scale test would have been necessary. Instead, the listening test reported here uses DB-L, where the difference in classification accuracies is 3 percentage points.

The listeners were asked not to focus on melody, harmony, cultural context, or rhythm but to focus on overall sound characteristics.
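The stratified selection of seeds described above can be sketched as follows (the genre flags below are randomly generated placeholders, not the actual DB-L data; the exclusion of seeds with identical nearest neighbors is omitted):

import numpy as np

def stratified_seeds(same_genre_flags, n_seeds=100, rng=None):
    """Select seeds such that the fraction of seeds whose nearest neighbor
    comes from the same genre matches the fraction in the whole collection.
    same_genre_flags: boolean array, True if the candidate seed's nearest
    neighbor (according to the similarity measure) is from the same genre."""
    rng = np.random.default_rng(rng)
    idx = np.arange(len(same_genre_flags))
    same, diff = idx[same_genre_flags], idx[~same_genre_flags]
    n_same = round(n_seeds * len(same) / len(idx))
    picked = np.concatenate([rng.choice(same, n_same, replace=False),
                             rng.choice(diff, n_seeds - n_same, replace=False)])
    return rng.permutation(picked)

# Hypothetical flags for 1000 candidate seeds.
flags = np.random.default_rng(3).random(1000) < 0.4
seeds = stratified_seeds(flags, n_seeds=100, rng=0)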

[Figure 2.37 appears here: a histogram with the absolute rating difference (0–8) on the x-axis and the number of occurrences on the y-axis.]

Figure 2.37: Histogram of the absolute difference of differences for listeners' ratings. For example, listener A rates the similarity of the two songs with 5 and 7, respectively, and listener B rates the similarity with 7 and 8. Then the absolute difference between the differences is 1 point (|(5 − 7) − (7 − 8)|).

In general, the listeners who were musically untrained had fewer problems with this definition of similarity. The 100 seeds were randomly grouped into 10 blocks. Most listeners were asked to rate only one block and only a few rated 2 blocks. The average time to answer one block was between 15 and 40 minutes. The listeners were allowed to listen to each song as long, as many times, and in any sequence as they wanted. They were told that they could use the position slider of the media player to zap through the piece (e.g. to skip the introduction). In particular, the listeners were asked to get an impression of the piece but not to listen to the entire piece. The total number of participants was 25, of which 11 had musical training. There were 4 female and 21 male participants. The 2 neighbors to each seed were rated by 3 listeners, resulting in a total of 600 similarity ratings.

Results

The correlation between the listeners' ratings was surprisingly high. This is shown in Figure 2.37. As can be seen, in many cases the listeners fully agreed on the difference between the ratings of the two nearest neighbors; in most cases, the disagreement was only 1 point. Only in a few cases did the listeners truly disagree. This high agreement between listeners, despite the weak definition of similarity, confirms the observations in [LS01].

The overall mean rating for G1 on DB-L is 5.73, and for G1C it is 6.37. The absolute difference is 0.64, which is less than the resolution of the scale. This corresponds to 8 percentage points on the scale from 1 to 9. Considering that in 11% of the cases the nearest neighbors for G1 and G1C are identical (and have been ignored in the test), the overall improvement is about 7 percentage points, or about 0.57 points on the scale. This small difference makes it obvious that the even smaller difference measured on DB-XL could not have been analyzed with such a small-scale listening test.

[Figure 2.38 appears here: a histogram and box plots of the ratings (1–9) for G1 and G1C; the y-axis shows the number of occurrences.]

Figure 2.38: Histogram and box plots of all ratings comparing G1 to G1C.

Nevertheless, there are some very significant differences. For example, for G1C a match was rated with 9 (i.e. "perfect") 37 times, while for G1 only 19 matches received the highest rating. A histogram of all ratings is shown in Figure 2.38. No nearest neighbor consistently got the lowest rating (i.e. "terrible") from the three listeners it was exposed to. However, some pieces got very low scores. The following four examples demonstrate the weaknesses of the similarity measures (2 examples are from G1C, 2 from G1).

A. The seed song is from DJ Shadow and is titled "What does your soul look like (Part 1 - Blue Sky Revisit)". It can be described as calm electronic music. The nearest neighbor computed with G1C is "Amazing Discoveries" by Fettes Brot. The listeners rated this match with 1.67 on average. This match is actually not music. Instead, German speakers simulate a television advertising show. Perhaps such mistakes could be avoided using metadata. Nevertheless, it is very difficult to understand why the similarity measure considers the two tracks similar. The nearest neighbor for G1 is "Quoterman" by Kinderzimmer Productions and is rated with 5 on average. It is a German rap song.

B. The seed song is "Run Along for a Long Time" by the Heavenly Light Quartet, which is an a cappella gospel group. The song has a few rap elements. G1C matches the rap song "Constant Elevation" by the Gravediggaz to it, which is rated with 1.67 on average. While this match is understandable from its acoustic aspects, it is a complete failure from a cultural point of view. Again, metadata could perhaps be used to filter such mistakes. G1 matches a song by the South African a cappella group Ladysmith Black Mambazo. The song is titled "Liph' lquiniso". This match is far better and is rated with 7 on average.

C. The seed is "FantasMic" by Nightwish. The piece can best be described as a gothic, romantic metal ballad. It starts relatively softly, with a calm female voice, electric guitars, and drums. The second half is a lot more energetic and faster (the song is longer than 8 minutes). G1 matches "Pay Attention" by US3 to it, which is rated with 2 on average. The song features female rappers and very different instruments; for example, it contains no electric guitars. On the other hand, G1C matches "Luminous" by Stratovarius, which is rated with 4 on average. Luminous is a heavy metal ballad and is much more similar, and not only culturally.

D. The seed is "Land of ..." by St. Germain. It can be described as a fusion of jazz and house music. There are no vocals. G1 matches "Michi Gegen Die Gesellschaft" by Die Fantastischen Vier. This German hip hop song features a male rapper and has a very different style compared to the seed. It is rated with 2 on average. G1C also matches a song by Die Fantastischen Vier: "Weiter Als Du denkst", which is rated with 4.67 on average. It features male and female voices, and is slower and much softer than the other match.

In total, a G1 match was considered unacceptable (i.e. rated below 5 on average) 32 times, compared to 14 times for G1C. There were 9 songs for G1 which were consistently (by all three listeners) rated below 5, and 6 songs for G1C. For some applications even a low number of such failures can be unacceptable.

One of the main purposes of this listening test was to see how the results from the genre-based evaluation correspond to human listeners' ratings. For one, this has already been supported by the fact that G1C performs better than G1 in the listening test. To further analyze this, Figure 2.39 shows the ratings for pieces from the same genre as the seed, and the ratings for pieces from different genres than the seed. One would expect the ratings to be significantly higher for pieces from the same genre. The average rating for songs from the same genre is 6.21; the average rating for songs from a different genre is 5.89. The difference is smaller than expected. However, it is necessary to consider that, according to the similarity measure, all matches are very similar to the seeds. If additional pieces with large distances (according to the similarity measure) had been included, the numbers would have been quite different. Although these results confirm the assumption that genre classification can be used to evaluate similarity measures, they also demonstrate the limits of this approach. In particular, more songs from different genres are rated as "perfect" matches than songs from the same genre.

Finally, it is interesting to study whether there are any correlations between how much the listeners liked a seed and how the nearest neighbors were rated. For example, if listeners do not like heavy metal ("it all sounds the same"), they might tend not to differentiate as much as if they liked the music. The correlation between how much the listeners like the seeds and the absolute difference in ratings between the two matches is 0.05. There is a positive tendency, but it is negligible. A second question is whether listeners generally tend to give the similarity measures lower ratings if they dislike the seed song. The correlation between how much they like the seed and how high they rated the neighbors is 0.13. Again, this is a positive tendency, but it seems negligible.

[Figure 2.39 appears here: a histogram and box plots of the ratings (1–9) for same-genre and different-genre pieces; the y-axis shows the number of occurrences.]

Figure 2.39: Histogram and box plots of ratings for pieces from the same genre as the seed compared to ratings for pieces from different genres than the seed.

[Figure 2.40 appears here: a histogram of the per-seed average rating difference G1 − G1C (x-axis from −5 to 5); the y-axis shows the number of occurrences.]

Figure 2.40: Histogram of the average difference between ratings per seed (G1 − G1C). The shift to the left is significant according to the Wilcoxon test.

Statistical Significance

To test the statistical significance of the difference between G1 and G1C on DB-L the non-parametric Wilcoxon test can be used. The only assumption this test makes is that the values are distributed symmetrically around a common mean. This is confirmed to some extent in Figure 2.40. The figure shows the histogram of differences between G1 and G1C for each seed song. In particular, for each seed song all three differences (between the G1 and the G1C match) are computed and the average thereof is used. Thus, the three ratings for each seed-neighbor pair are treated as multiple measurements of the same information (which are not independent). The left-sided p-value for the test is 0.0005. This means that G1C is on average rated significantly higher than G1 on DB-L.
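A minimal sketch of how such a test can be run, for example with SciPy (the ratings below are randomly generated placeholders, not the actual listening test data):

import numpy as np
from scipy.stats import wilcoxon

# Hypothetical data: for each of the 100 seeds, the three listeners' ratings
# for the G1 match and the G1C match (values on the 1..9 scale).
rng = np.random.default_rng(0)
ratings_g1 = rng.integers(1, 10, size=(100, 3))
ratings_g1c = rng.integers(1, 10, size=(100, 3))

# Average the three ratings per seed (they are treated as repeated
# measurements of the same information) and take the per-seed difference.
diff = ratings_g1.mean(axis=1) - ratings_g1c.mean(axis=1)

# Left-sided Wilcoxon test: are the differences (G1 - G1C) shifted below zero,
# i.e. is G1C rated significantly higher than G1?
stat, p_value = wilcoxon(diff, alternative='less')
print(f"left-sided p-value: {p_value:.4f}")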

2.4 Limitations and Outlook

The similarity ratings of the computational models are far from perfect. For example, in the listening test, users rated the average performance around 6 on a scale from 1 (terrible) over 5 (acceptable) to 9 (perfect). An interesting question is how much the similarity measures can be further improved. Aucouturier and Pachet [AP04a] observed in their experiments that it seems impossible to significantly improve spectral similarity without higher-level analysis of the music. The results in this chapter confirm this. Using only low-level audio statistics there is a clear limit, and it is unlikely that variations of the features discussed in this chapter will significantly improve the quality. In fact, since the publication of [AP04a] there have been only very small improvements in terms of quality.

While it seems that the limit for low-level audio statistics has been reached (or at least is very close), there are still many opportunities based on higher-level analysis. In particular, there are two directions which seem promising: harmonic analysis (including tonality, key, and chord progressions) and rhythmic analysis (including rhythm patterns and tempo). Examples of recent work on harmony include [GB05; BP05; MKC05]. Examples of recent work on rhythm include [DPW03; DGW04; GDPW04; Gou05]. However, despite these opportunities, it is necessary to realize that in the foreseeable future computational models of audio-based music similarity will not be able to judge similarity the same way humans do. Although the judgment of music similarity seems effortless to a human listener, it is a very complex process. On the other hand, certain applications might not require perfect similarity measures. For example, a certain degree of "variance" might be acceptable for automatically generated playlists.

Another important limitation of the audio-based similarity measures is that they cannot consider the quality of a piece. For example, they cannot distinguish between a novice playing around on a piano and a famous pianist playing a Mozart sonata. Some of the difficulties of developing algorithms which are capable of rating music were pointed out in [WE04]. Furthermore, in general, emotions are not considered. A few exceptions exist; for example, aggressiveness is often correlated with loudness, strong beats, and noise.

While only small improvements have been made in recent years in terms of quality, there has been a large reduction (by several factors) in the necessary computation time. Further improvements are likely. Directions which have not been investigated thoroughly include, for example, the application of indexing methods such as M-Trees [CPZ97], or multi-resolution approaches such as the one suggested in [RAPB05]. Another option to further improve the performance of audio-based similarity measures is to combine them with additional information. Such information could be based on collaborative filtering data, or on web-based community data as described in the next section.

2.5 Alternative: Web-based Similarity

This section briefly describes how data available on the web can be used to compute the similarity of artists. In particular, common web pages are used; of special interest are pages containing artist reviews. To find these pages Google is used. Word occurrence lists are created for the retrieved pages, and artists are compared based on their word lists. In contrast to audio-based similarity measures such an approach can describe the sociocultural context of an artist. This approach is used in Chapter 3 to hierarchically organize and describe music collections at the artist level.

2.5.1 Related Work

One of the first approaches using web-based data to compute artist similarities was presented in [PWL01]. Co-occurrences on playlists from radio stations and compilation CD databases were used to cluster a set of 12 songs and a set of 100 artists hierarchically according to similarity. The approach was demonstrated using single-linkage agglomerative clustering.

Another source is common web pages [WL02; BH03]. The main idea is to retrieve top-ranked sites from Google queries and apply standard text-processing techniques. Using the obtained word lists, the artist similarities are computed. A drastically simplified approach is to use the number of pages found by Google for a query containing two artist names [ZF04; SKW05b]. As the evaluation of artist similarity is quite difficult [EWBL02], it is tempting to resort to a genre classification scenario [KPW04]. Other web-based sources include expert opinions (such as those available from the All Music Guide27), album reviews [WE04], or song lyrics [LKM04].

A combination of audio-based and web-based sources is very desirable; however, that is beyond the scope of this thesis. First approaches demonstrating the advantages of the combination can be found, for example, in [WS02; WE04; BPS04]. A particularly interesting approach is the work presented in [CRH05], which analyzes online user profiles, news related to music, and the audio content to give recommendations. A major difficulty is that the evaluation requires music collections with a significant number of artists.

2.5.2 Similarity Computations (Technique)

This subsection describes a technique to compute the similarity between artists. The reason for not computing similarities between tracks or albums is that one can expect more web pages describing and reviewing an artist than pages reviewing an individual song (or album). This approach is a simplified version of the approach originally presented in [WL02] and is based on standard text information retrieval techniques.

In [KPW04] this approach was successfully applied to classify artists into genres. For each artist a query string consisting of the artist's name as an exact phrase extended by the keywords +music +review is sent to Google using Google's SOAP interface. This service is offered free of charge but is limited to 1000 queries a day per registered user. Each query returns 10 pages.28 For each artist the 50 top-ranked pages are retrieved. All HTML markup tags are removed, taking only the plain text content into account. Very frequent and unwanted terms are removed using a stop word list.29

Let the term frequency tfta be the number of occurrences (frequency) of word (term) t in the pages retrieved for artist a. Let the document frequency dfta be the number of web pages (documents) for a in which t occurs at least once, and let dft = Σa dfta. First, for computational reasons, for each artist all terms for which dfta < 3 are removed. (The highest possible value for dfta is 50.) Then all individual term lists are merged into one global list. From this list all terms for which there is no dfta ≥ 10 are removed. (That is, for at least one artist the term must occur in at least 10 pages.) In an experiment with 224 artists 4139 terms remained [PFW05a]. A list of the artists is available online.30 The data is inherently extremely high-dimensional, as the first 200 eigenvalues (using an eigenvalue decomposition, also known as PCA) are needed to describe 95% of the variance.

The frequency lists are combined using the term frequency × inverse document frequency (tf × idf) function (here the ltc variant [SB88] is used). The term weight per artist is computed as

wta = (1 + log2 tfta) log2(N/dft), if tfta > 0, and wta = 0 otherwise,   (2.43)

where N is the total number of pages retrieved. This results in a vector of term weights for each artist. The weights are normalized such that the length of the vector equals 1 (cosine normalization) to remove the influence the document length would otherwise have. Finally, the distance between two artists is computed as the Euclidean distance of the normalized term weight vectors.

The evaluation of this approach within a genre classification context can be found in [KPW04]. For the set of 224 artists (manually assigned to 14 genres) which are used in the experiments described in Section 3.3 the classification accuracy is 85% for leave-one-out evaluation using a nearest neighbor classifier.
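A minimal sketch of the weighting and distance computation described above (the term frequencies are a hypothetical toy example; the df-based pruning is assumed to have been applied already):

import numpy as np

def ltc_weights(tf, df_t, n_pages):
    """Compute the ltc term weights of Equation 2.43 and cosine-normalize them.
    tf:      (n_artists x n_terms) term frequencies tf_ta
    df_t:    (n_terms,) document frequencies df_t summed over all artists
    n_pages: total number of retrieved pages N"""
    idf = np.log2(n_pages / df_t)
    w = np.where(tf > 0, (1.0 + np.log2(np.maximum(tf, 1))) * idf, 0.0)
    # Cosine normalization: each artist's weight vector gets unit length.
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    return w / np.where(norms > 0, norms, 1.0)

# Hypothetical toy data: 3 artists, 5 terms, 50 pages per artist (N = 150).
tf = np.array([[12, 0, 3, 0, 7],
               [ 0, 5, 0, 2, 1],
               [ 8, 1, 0, 0, 9]])
df_t = np.array([40, 12, 6, 4, 30])
W = ltc_weights(tf, df_t, n_pages=150)

# The distance between two artists is the Euclidean distance of their
# normalized term-weight vectors.
print(np.linalg.norm(W[0] - W[1]))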

27 http://www.allmusic.com
28 http://www.google.com/apis
29 http://www.ofai.at/~elias.pampalk/wa/stopwords.txt
30 http://www.cp.jku.at/people/knees/artistlist224.html

2.5.3 Limitations

One of the main problems is that this approach relies on artist names. In many cases the name might have several meanings, making it difficult to retrieve relevant web pages. Whitman and Lawrence [WL02] pointed out some problematic examples such as "Texas" or "Cure". Another problem is that many new and not so well known artists do not appear on web pages. Furthermore, it is unclear how well this technique performs for artists from other (non-western) cultures.

2.6 Conclusions

In this chapter, an introduction to computational models of audio-based music similarity was given. A number of different approaches were described. A procedure to select the most promising features for combinations was demonstrated. In particular, the number of candidates was reduced from 14 down to 6. This reduced the number of combinations in the parameter space from over 2 million to 8008. Furthermore, a procedure to optimize the weights for a linear combination of similarity measures was demonstrated, using 2 collections and 2 spectral similarity measures to avoid overfitting. The importance of using an artist filter was illustrated.

The optimizations were evaluated using 4 different collections. On 5 of the 6 collections used for the experiments the improvements were significant. On the largest collection (DB-XL) the optimization showed only minimal improvements compared to the baseline. Overall, the improvements were in the range of 0–5 percentage points. An additional argument for the improvements is the reduced occurrence of anomalies. In particular, mixing features from a vector space with the spectral similarity reduced the number of pieces in the collection which are either always similar or dissimilar (to all other pieces), and increased the number of cases where the triangular inequality holds.

A listening test was presented which shows that human listeners' similarity judgments correspond to the genre-based evaluation. This justifies efficient optimizations based solely on genre data. The listening test compared the baseline (G1) to the best combination of similarity measures (G1C). The test showed that a 3 percentage point increase in genre classification accuracy corresponded to about 7 percentage points higher similarity ratings by human listeners.

The limitations of audio-based similarity measures were discussed. In particular, it was pointed out that there have been only small improvements in the last years in terms of quality (while the necessary computation times have been reduced immensely). Reaching beyond the current limits might be possible through higher-level features based on harmonic or rhythmic information. Recent work in this direction seems promising. Finally, an alternative to audio-based similarity computations was presented, which is based on the analysis of web pages retrieved via Google.

2.6.1 Optimization and Evaluation Procedures

An important part of this chapter has been the procedures to optimize and evaluate similarity measures. The three most important points have been:

1. Demonstrating the connection between human listeners' similarity ratings and the use of genre classification to evaluate similarity measures.

2. The analysis of more than just computation times and classification accuracies. In particular, anomalies (always similar, triangular inequality, and always dissimilar) were studied.

3. The use of three strategies to avoid overfitting: (3a) the use of an artist filter, (3b) using several collections for optimization and several different collections for evaluation, (3c) using different implementations of the same approach (different types of spectral similarity) to avoid overfitting (when combining spectral similarity with additional features).

2.6.2 Recommendations

Based on the observations from the experiments reported in this thesis the following recommendations can be made. For small projects where the simplicity of the code is a main issue (e.g. for a simple prototype demonstrating playlist generation) the use of G1 is recommendable. For larger projects G1C (G1 combined according to Equation 2.42) is preferable. The additional features lead to only a minor increase in computation time. On the collections used for the experiments in this thesis the quality is at least as good as, if not significantly better than, the performance of G1. In addition, the effects of the similarity space anomalies are reduced.

For applications where the anomalies can cause serious trouble, or applications where a vector space is highly desirable, combinations where the contribution of G1 is reduced might be of interest. For example, the combination ranked 9th in Table 2.10 uses only 50% G1. Although not further analyzed in this thesis, the results presented in Figure 2.25 suggest that combinations without spectral similarity exist which achieve similar performance.

For real-world applications it is necessary to use additional information to filter results (e.g. Christmas songs). One approach could be to use the technique described in Section 2.5 or any other source (including lyrics, metadata supplied by the music distributor or third parties such as Gracenote or the All Music Guide, and collaborative filtering).

Chapter 3

Applications

This chapter describes three applications which are based on similarity measures. The first application (Section 3.2) is the organization and visualization of music collections using a metaphor of geographic maps [Pam01; PRM02b; PRM02a; Pam03; PDW03a; PGW03; PDW04]. The second application (Section 3.3) is the organization of music collections at the artist level. In contrast to the first application a fuzzy hierarchy is used to structure the collection and automatic summaries are created based on words found on web pages containing the artist’s name [PHH04; PFW05a]. The third application (Section 3.4) is the dynamic generation of playlists. Feedback given by the users by pushing a skip button is used to dynamically improve a playlist [PPW05a].

3.1 Introduction

There are various applications for music similarity measures. Applications include simple retrieval, recommendation based on user profiles, exploration, and music analysis. Each application assumes specific usage scenarios in which the user plays the central role.

There are several categories into which applications can be classified. One is how much interaction they require. Roughly, there are two types. One type allows active exploration to discover new music or new insights into music. The other type is intended for more or less passive consumption of music. The first two applications in this chapter belong to the first category; the third application belongs to the second category.

Another category into which applications can be classified is the users' familiarity with the music. For example, does the application support the users in discovering something new? Or does it help the users manage their own (and thus to some extent familiar) collections? In this context an interesting question is also the size of the collections the applications are designed for. Some applications might work best for sizes of a few hundred pieces while others can (theoretically) deal with a few million.

Another difference is the level at which the applications work. In this chapter there are two levels, namely, the artist level and the track level. Some approaches can only work at the artist level (e.g. the web-based approach described in the previous chapter). Other approaches, such as the audio-based similarity, can deal with any level (track, album, artist). In general, the type of similarity used (e.g. audio or web) also has a strong impact on the application and its usage scenarios. For example, audio-based similarity cannot be used to rate the quality of music, and thus requires additional precautions when it is used to discover new music (to avoid flooding the user with low-quality music).

Section   User Interaction   Size of the Collection   User's Familiarity with the Music   Type of Similarity   Level
3.2       active             small                    any                                 audio                track
3.3       active             very large               unfamiliar                          web                  artist
3.4       passive            large                    familiar                            audio                track

Table 3.1: Rough categorization of the applications in this chapter.

This chapter primarily serves to describe some ideas on how similarity measures can be applied. The applications in this chapter can roughly be categorized as described in Table 3.1.

Related Work

There are a large number of applications besides those described in this chapter. For example, given a similarity measure for melody, one application is query-by-humming and its variations (see e.g. [GLCS95]). An overview of MIR systems (many of which are applications of similarity measures) can be found in [TWV05]. Work specifically related to the visualization of music collections is discussed in Subsection 3.2.2. Work related to the representation of music collections at the artist level is discussed in Subsection 3.3.2, and work related to playlist generation in Subsection 3.4.2.

Structure of this Chapter

This chapter is structured as follows. In Section 3.2 an approach to organize and visualize music collections using a metaphor of geographic maps is described. The approach allows the user to interactively shift the focus between different aspects of similarity. In Section 3.3 an approach to organize music collections at the artist level is described. In particular, artists are organized hierarchically into overlapping groups. These groups are automatically summarized with words co-occurring with the artists' names on web pages. In Section 3.4 an approach to dynamically generate playlists is described. Different heuristics are evaluated using hypothetical use cases. In Section 3.5 conclusions are drawn.

3.2 Islands of Music

The goal is to develop visualization tools which allow users to efficiently explore large music collections. Such visualizations would complement but not substitute current approaches which are more or less based on lists of items within a category (e.g. artists belonging to a genre).

Figure 3.1: Islands of Music

This section describes the Islands of Music visualization approach. In addition, an extension which enables the user to browse different views is described.

3.2.1 Introduction

The Islands of Music idea is to organize music collections on a map such that similar pieces are located close to each other. The structure is visualized using a metaphor of geographic maps. Figure 3.1 shows a 3-dimensional image of a map of 77 pieces of music. The idea is that each island represents a different style of music. Similar styles are connected by land passages. Mountains and hills on an island represent styles within styles, separated by valleys. These mountains and hills can be labeled with abstract terms describing low-level audio signal characteristics to help the user identify interesting regions. Furthermore, "weather charts" are used to describe the value of one attribute (e.g. bass energy) across the different regions of the map [Pam01; PRM02a].

This section describes the techniques used to compute Islands of Music. In addition, an extension is described which facilitates browsing different views [PGW03; PDW03a; Pam03; PDW04]. A view is defined by a specific aspect of similarity, e.g., rhythm or timbre. As the user gradually changes the focus between different views the map smoothly adapts itself by reorganizing the pieces and thus the structure of the islands. In particular, islands might merge, split, sink, or rise.

Limitations

Considering the original Islands of Music approach there are 3 major limitations.

A. Performance of the Similarity Measure

As discussed in the previous chapter, the performance of the audio-based similarity measures is far from perfect. In several cases the similarity matches are unacceptable. It is questionable if, and how many, such errors the user is willing to accept on a map where every piece is supposed to be close to all other pieces sounding similar. So far no user studies have been conducted to answer this question. This limitation is addressed in Section 3.4 where an application is suggested which does not rely on perfect similarity matches.

B. Fuzziness of Music Similarity

One can argue that there is no single best place to map a piece of music on the map. In particular, a piece might exhibit different styles and might thus belong to several islands. Music similarity is inherently fuzzy. This limitation is addressed in Section 3.3 where the music collections are structured hierarchically (at the artist level) in such a way that an artist can be in different categories at the same time. This makes it easier to find artists which could belong to more than one category (or island).

C. Various Aspects of Similarity

As already mentioned in the previous chapter, there are various aspects of similarity. Thus it is of interest to allow the users to adjust the aspects they are interested in exploring. This is addressed in this section. However, this approach is still limited by the similarity measures which describe the aspects of similarity.

Structure of this Section

The remainder of this section is structured as follows. The next subsection gives an overview of related work regarding the organization and visualization of music collections. Subsection 3.2.3 describes the Self-Organizing Map (SOM) algorithm which is used to organize and map the music collections onto 2 dimensions. Subsection 3.2.4 describes visualizations which can be used in combination with SOMs. In Subsection 3.2.5 the Aligned-SOMs algorithm is described. This algorithm is used to align different views, each defined by a different aspect of similarity. In Subsection 3.2.6 a demonstration of browsing different views is given. Conclusions are drawn in Subsection 3.2.7.

3.2.2 Related Work

The metaphor of geographic maps can be used to visualize various types of information (including, e.g., the visualization of large document archives [BWD02; Sku04]). In general the work presented in this section is related to many other approaches which visualize document collections, and especially to those using Self-Organizing Maps (e.g. [KHLK98; MR00]). An implementation of the Islands of Music which allows the creation of playlists by drawing lines on the maps was presented in [NDR05]. Hierarchical extensions were presented in [RPM02; Sch03].

Mörchen et al. [MUNS05] present an approach similar to Islands of Music. The authors use a variation of Self-Organizing Maps called Emergent Self-Organizing Maps. Instead of the Smoothed Data Histograms visualization they use the U-matrix to visualize clusters, they use a different color mapping where highly populated areas on the map are represented by valleys (instead of mountains), and they use toroidal maps to avoid border effects. Lübbers [Lüb05] uses a metaphor of geographic maps in addition to an aural interface which allows the user to hear from a distance what music can be found in each direction. Similar work on aural interfaces for sounds was presented in [FB01].

Vignoli et al. [vGVvdW04] present an approach to visualize a music collection at the artist level. They use a variation of the Spring Embedder algorithm to optimize the layout of a graph. In [vGV05] the authors use this visualization to support playlist generation. A different approach using graphs to visualize music collections at the artist level was presented by Schedl et al. [SKW05a]. Further alternatives to visualize and explore music collections include the work presented in [CKGB02] which is based on the FastMap algorithm, the MusicSurfer application presented in [CKW+05], or the work presented in [THA04] which is based on metadata.

Related work which allows the user to adjust parameters of the similarity measure includes [AP02a; BPS04; vGV05]. In [AP02a] the authors suggest a slider to control "Aha-effects". In particular, the slider can be used to control the interestingness of matches. An interesting match is, for example, a piece that sounds very similar but belongs to a different genre. In [BPS04] the users can control the weights of a similarity measure. The different similarity aspects that can be weighted are lyrics, sound, and style. In [vGV05] a system is presented where the users can move "magnets" to adjust the aspects of similarity they are interested in.

3.2.3 The Self-Organizing Map

The SOM [Koh82; Koh01] is an unsupervised neural network with applications in various domains including audio and music analysis (e.g. [CPL94; FG94; SP01; FR01]). As a clustering algorithm, the SOM is very similar to other partitioning algorithms such as k-means [Mac67]. In terms of topology preservation for the visualization of high-dimensional data, alternatives include, e.g., Multi-Dimensional Scaling [KW78], Sammon's mapping [Sam69], and Generative Topographic Mapping [BSW98].

The SOM maps high-dimensional data to a 2-dimensional map in such a way that similar items are placed close to each other. The SOM consists of an ordered set of units which are arranged on a 2-dimensional grid (visualization space) called the map. Common choices to arrange the map units are rectangular or hexagonal grids. Each unit is assigned a model vector in the high-dimensional data space. A data item is mapped to the best-matching unit, which is the unit with the most similar model vector.

The SOM can be initialized randomly, i.e., random vectors in the data space are assigned to each model vector. Alternatives include, for example, initializing the model vectors using the first two principal components of the data [Koh01]. After initialization, two steps are repeated iteratively until convergence (i.e., until there are no more significant changes in the mapping). The first step is to find the best-matching unit for each data item. In the second step the best-matching units and their neighbors are updated to better represent the mapped data. The neighborhood of each unit is defined through a neighborhood function and decreases with each iteration.

Details

To formalize the basic SOM algorithm, let D be the data matrix, let Mt be the model vector matrix, let U be the distance matrix, let Nt be the neighborhood matrix, let Pt be the partition matrix, and let St be the spread activation matrix. The data matrix D is of size n × d, where n is the number of data items, and d is the number of dimensions of the data. In case the data does not lie in a vector space the distance matrix can be used. This was done, for example, in [PDW03b]. For larger collections it is possible to reduce the dimensionality either by analyzing principal components or (in the case of very large collections) by random projections [Kas98; BM01]. The model vector matrix Mt is of size m × d, where m is the number of map units. The values of Mt are updated in each iteration t. The matrix U of size m × m defines the distances between the units on the map. The neighborhood matrix Nt can be calculated, for example, as

Nt = exp(−U^2 / (2 rt^2)),   (3.1)

where rt defines the neighborhood radius and monotonically decreases with each iteration. The matrix Nt is of size m × m, symmetrical, with high values on the diagonal, and defines the influence of one unit on another. The sparse partition matrix Pt of size n × m is calculated given D and Mt as

Pt(i, j) = 1 if unit j is the best match for item i, and 0 otherwise.   (3.2)

The spread activation matrix St, with size n × m, defines the responsibility of each unit for each data item at iteration t and is calculated as

St = PtNt. (3.3)

At the end of each loop, the new model vectors Mt+1 are calculated as

Mt+1 = St* D,   (3.4)

where St* denotes the spread activation matrix St normalized so that the sum of each column equals one, except for units to which no items are mapped. There are two main parameters for the SOM algorithm. One is the map size; the other is the final neighborhood radius. A larger map gives a higher resolution of the mapping but is computationally more expensive. The final neighborhood radius defines the smoothness of the mapping and should be adjusted depending on the noise level in the data.
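A minimal sketch of one batch iteration following Equations 3.1–3.4 (the data and the rectangular map grid are a hypothetical toy example):

import numpy as np

def batch_som_step(D, M, U, r_t):
    """One batch SOM iteration (Equations 3.1-3.4).
    D: (n x d) data items, M: (m x d) model vectors,
    U: (m x m) map distances between units, r_t: neighborhood radius."""
    N = np.exp(-U**2 / (2 * r_t**2))                      # Eq. 3.1
    dists = np.linalg.norm(D[:, None, :] - M[None, :, :], axis=2)
    P = np.zeros((D.shape[0], M.shape[0]))                # Eq. 3.2
    P[np.arange(D.shape[0]), dists.argmin(axis=1)] = 1.0
    S = P @ N                                             # Eq. 3.3
    col_sums = S.sum(axis=0)
    S_star = S / np.where(col_sums > 0, col_sums, 1.0)    # column-normalized S
    M_new = S_star.T @ D                                  # Eq. 3.4
    M_new[col_sums == 0] = M[col_sums == 0]               # units without items keep their vectors
    return M_new

# Toy example: 100 items in 2-D mapped onto a 4x3 rectangular grid.
rng = np.random.default_rng(1)
D = rng.normal(size=(100, 2))
grid = np.array([(x, y) for x in range(4) for y in range(3)], dtype=float)
U = np.linalg.norm(grid[:, None, :] - grid[None, :, :], axis=2)
M = rng.normal(size=(len(grid), 2))
for r_t in np.linspace(2.0, 0.5, 30):                     # shrinking neighborhood radius
    M = batch_som_step(D, M, U, r_t)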


Figure 3.2: Illustration of the SOM. (a) The probability distribution from which the sample was drawn. (b) The model vectors of the SOM. (c) The SDH and (d) the U-matrix visualizations.

Illustrations

Figure 3.2 illustrates some important characteristics of the SOM. Samples are drawn from a 2-dimensional probability density function. A SOM (8×6) is trained so that the model vectors adapt to the topological structure of the data. There are two important characteristics of the non-linear adaptation. First, the number of data items mapped to each unit is not equal. Especially in sparse areas some units might represent no data items. Second, the model vectors are not equally spaced. In particular, in sparse areas the adjacent model vectors are relatively far apart while they are close together in areas with higher densities. Both characteristics can be exploited to visualize the cluster structure of the SOM using smoothed data histograms (SDH) [PRM02b] and the U-matrix [US90], respectively. The SDH visualizes how many items are mapped to each unit. The smoothing is controlled by a parameter. The U-matrix visualizes the distance between the model vectors. The SDH visualization (Figure 3.2(c)) shows the cluster structure of the SOM: each of the 5 clusters is identifiable. The U-matrix mainly reveals that there is a big difference between the clusters in the lower right and the upper right.

3.2.4 Visualizations

Various methods to visualize clusters based on the SOM have been developed. This subsection describes the Smoothed Data Histograms and other visualizations which are used for Islands of Music.

3.2.4.1 Smoothed Data Histograms

To create the Islands of Music metaphor the SDH visualization [PRM02b] can be used. The basic idea is that each data item votes for the map units which represent it best, based on some function of the distance to the respective model vectors. All votes are accumulated for each map unit and the resulting distribution is visualized on the map.

A robust ranking function is used to gather the votes. The unit closest to a data item gets n points, the second closest n−1, the third n−2, and so forth, for the n closest map units. Basically the SDH approximates the probability density of the data on the map, which is then visualized using a specific color scale.

Figure 3.3: Weather chart metaphor for the distribution of bass energy on the map.

A Matlab toolbox for the SDH is available online.1 For details on the impact of the parameter n and a comparison to other cluster visualization techniques see [PRM02b]. Figure 3.2 demonstrates the SDH visualization using a toy example.
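A minimal sketch of the SDH voting scheme (the distances between items and model vectors are assumed to be given; the data is a hypothetical toy example):

import numpy as np

def sdh(dists, n=3):
    """Smoothed data histogram.
    dists: (items x units) distances between data items and model vectors.
    Each item gives n points to its closest unit, n-1 to the second closest,
    and so on for its n closest units; votes are accumulated per unit."""
    votes = np.zeros(dists.shape[1])
    ranking = np.argsort(dists, axis=1)[:, :n]     # n closest units per item
    for points, units in zip(range(n, 0, -1), ranking.T):
        np.add.at(votes, units, points)
    return votes / votes.sum()                     # approximated density on the map

# Toy example: 5 items, 6 map units.
rng = np.random.default_rng(2)
print(sdh(rng.random((5, 6)), n=3))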

3.2.4.2 Weather Charts and Other Visualizations

Any attribute (such as an automatically extracted feature describing the bass energy of a piece, or genre metadata) can be used to describe different regions on the map. For example, landmarks such as mountains and hills can be labeled with descriptions which indicate what type of music can be found in the respective area. Details on the labeling of the Islands of Music can be found in [Pam01]. Alternatively a metaphor of weather charts can be used. For example, areas with a strong bass are visualized as areas with high temperatures, while areas with low bass correspond to cooler regions. Hence, for example, it becomes apparent that the pieces are organized so that those with a strong bass are in the west and those with less bass are in the east. Figure 3.3 shows an example of a weather chart visualization using the same map shown in Figure 3.1. The contours of the islands are drawn to help the users keep their orientation.

Visualizing genre metadata can be very useful to evaluate the organization of a map. An example where a map computed for DB-ML (see Subsection 2.3.3 for a description of this collection) is visualized in terms of the genre distributions is shown in Figure 3.4. Noticeable is that some genres, such as punk, form very compact clusters, while others, such as world, are spread over the entire map.

1 www.ofai.at/~elias.pampalk/sdh

[Figure 3.4 appears here: one map panel per genre — ambient, classical, electronic, folk, jazz, metal, new age, pop, punk, rock, and world.]

Figure 3.4: Distribution of genres on the SOM computed for DB-ML.

Figure 3.5: Aligned-SOM architecture.

3.2.5 Aligned Self-Organizing Maps

The SOM is a useful tool for exploring a data set given a similarity measure. However, when exploring music, the concept of similarity is not clearly defined, because there are many aspects to consider. Aligned-SOMs [PGW03; PDW03a; Pam03; PDW04] are an extension to the basic SOM that enables the user to interactively shift the focus between different aspects and explore the resulting gradual changes in the organization of the data.

The Aligned-SOMs architecture consists of several mutually constrained SOMs stacked on top of each other, as shown in Figure 3.5. Each map has the same number of units arranged in the same way (e.g. on a rectangular grid), and all maps represent the same pieces of music, but organized with a different focus, for example in terms of aspects of timbre or rhythm. The individual SOMs are trained such that each layer maps similar data items close to each other within the layer, and neighboring layers are further constrained to map the same items to similar locations. To that end, a distance between individual SOM layers is defined. This distance is made to depend on how similar the respective views are. The information between layers and between different views of the same layer is shared based on the location of the pieces on the map. Thus, organizations from arbitrary sources can be aligned. Figure 3.6 shows the neighborhood radius which constrains units on the same map and units on neighboring maps. Each map has its own similarity space.

3.2.5.1 Details

The basic assumption is that the similarity measure is parametrized. For example, a parameter could control the weighting between rhythm-related and instrumentation-related features. Given only one similarity parameter p ∈ {pmin . . . pmax} to explore, a SOM is assigned to each of the extreme values. These two SOMs are laid on top of each other, representing extreme views of the same data. Between these a certain (manually defined, limited by computational restrictions) number of SOMs is inserted, resulting in a stack of SOMs.

Figure 3.6: The Aligned-SOM neighborhood radius and updating a model vector (o) according to a data item (X) in different similarity spaces.

Neighboring SOM layers in the stack represent slightly different parameter settings. All SOMs have the same map size and are initialized with the same orientation obtained, for example, by training the SOM in the middle of the stack and setting all other layers to have the same orientation. (This can be done by first using the assignment of pieces for the middle layer for all other layers. Second, all model vectors are initialized based on this assignment.) To align the SOMs during training it is necessary to define a distance between layers which controls how smooth the transitions between layers are. Based on this distance the pairwise distances for all units in the stack are calculated and used to align the layers, in the same way the distances between units within a map are used to preserve the topology.

The (online) training is based on the normal SOM training algorithm. A data item and a layer are selected randomly. The best-matching unit for the item is calculated within the layer. The adaptation strength for all units in the stack depends on their distance to the best-matching unit. The model vectors within the selected layer are adapted as in the basic SOM algorithm. For adaptations in all other layers the respective representation of the data is used. The online training algorithm is outlined in Table 3.2. For the experiments described in this thesis a batch version of the Aligned-SOMs was used, which is based on the batch training algorithm for individual SOMs and is described in the next paragraphs.

The batch version of the Aligned-SOMs training algorithm is very similar to the batch SOM training algorithm described in Subsection 3.2.3. To train the SOM layers, the distance matrix U is extended to contain the distances between all units in all layers; thus the size of U is ml × ml, where m is the number of units per layer, and l is the total number of layers. The neighborhood matrix is calculated according to Equation 3.1. For each aspect a of similarity, a sparse partition matrix Pat of size n × ml is needed. (In the demonstration discussed in Subsection 3.2.6, there are three different aspects: two are calculated from the spectrum and periodicity histograms, and one is based on meta-information.) The partition matrices for the first two aspects are calculated using Equation 3.2 with the extension that the best-matching unit for a data item is selected for each layer.

0. Initialize all SOM layers.
1. Randomly select a data item and a layer.
2. Calculate the best-matching unit c for the item within the layer.
3. For each unit (in all layers):
   (a) Calculate the adaptation strength using a neighborhood function given
       – the distance between the unit and c, and
       – the radius and learning rate for this iteration.
   (b) Adapt the respective model vector using the vector space of the particular layer.

Table 3.2: Online training algorithm for Aligned-SOMs. Steps 1–3 are repeated iteratively until convergence.

Thus, the sum of each row equals the number of layers. The spread activation matrix Sat for each aspect a is calculated as in Equation 3.3. For each aspect a and layer i, mixing coefficients wai with Σa wai = 1 are defined that specify the relative strength of each aspect. The spread activation for each layer is calculated as

Sit = Σa wai Sait.   (3.5)

Finally, for each layer i and aspect a with data Da, the updated model vectors Mait+1 are calculated as

Mait+1 = Sit* Da,   (3.6)

where Sit* is the normalized version of Sit. In the demonstration in Subsection 3.2.6, the Aligned-SOMs are initialized based on the meta-information organization, for which only the partition matrix is given, which assigns each piece of music to a map unit. For the two views based on vector spaces, first the partition matrices are initialized, and then the model vectors are calculated from these.
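A minimal sketch of the per-layer batch update of Equations 3.5 and 3.6, reusing the quantities defined for the batch SOM in Subsection 3.2.3 (the per-aspect spread activation matrices, mixing coefficients, and data matrices are assumed to be given):

import numpy as np

def aligned_layer_update(S_a, w_i, D_a, M_i):
    """Update the model vectors of one layer i (Equations 3.5 and 3.6).
    S_a: list of (n x m) spread activation matrices, one per aspect,
         restricted to the m units of this layer
    w_i: sequence of mixing coefficients w_ai for this layer, summing to 1
    D_a: list of (n x d_a) data matrices, one per aspect
    M_i: list of (m x d_a) current model vectors of this layer, one per aspect"""
    # Equation 3.5: mix the spread activations of all aspects.
    S_i = sum(w * S for w, S in zip(w_i, S_a))
    col_sums = S_i.sum(axis=0)
    S_star = S_i / np.where(col_sums > 0, col_sums, 1.0)
    # Equation 3.6: one set of updated model vectors per aspect.
    M_new = []
    for D, M in zip(D_a, M_i):
        M_a = S_star.T @ D
        M_a[col_sums == 0] = M[col_sums == 0]   # units without responsibility keep their vectors
        M_new.append(M_a)
    return M_new

In this sketch the mixing happens on the spread activations, so each aspect keeps its own data space and model vectors while sharing the common map locations, which is what aligns the layers.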

Computation Time

The necessary resources in terms of CPU time and memory increase rapidly with the number of layers and depend on the degree of congruence (or incongruence) of the views. The overall computational load is of a higher order of magnitude than training a single SOM. For larger datasets, several optimizations are possible; in particular, applying an extended version of the fast winner search proposed by Kaski [Kas99] would improve the efficiency drastically, because there is a high redundancy in the multiple layer structure.

3.2.5.2 Illustration

Figure 3.7 illustrates the characteristics of Aligned-SOMs using a simple animal dataset [Koh01]. The dataset contains 16 animals with 13 boolean features describing appearance and activities, such as the number of legs and the ability to swim. The necessary assumption for an Aligned-SOM application is that there is a parameter of interest which controls the similarity calculation. A suitable parameter for this illustration is the weighting between activity and appearance features.

To visualize the effects of this parameter 31 Aligned-SOMs are trained. The first layer uses a weighting ratio between appearance and activity features of 1:0. The 16th layer, i.e., the center layer, weights both equally. The last layer uses a weighting ratio of 0:1 and thus focuses only on activities. The weighting ratios of all other layers are linearly interpolated. The size of all maps is 3×4. The distance between layers was set to 1/10 of the distance between two adjacent units within a layer. Thus, the alignment between the first and the last layer is enforced with about the same strength as the topological constraint between the upper left and lower right unit of each map. All layers were initialized according to the organization obtained by training the center layer with the basic SOM algorithm.

From the resulting Aligned-SOMs 5 layers are depicted in Figure 3.7. The cluster structure is visualized using the SDH visualization. For interactive exploration an HTML version with all 31 layers is available online.2 When the focus is only on appearance all small birds are located together in the lower right corner of the map. The Eagle is an outlier because it is bigger than the other birds. All mammals are located in the upper half of the map, separating medium-sized ones on the left from large ones on the right. As the focus is gradually shifted to activity descriptors the organization changes smoothly. When the focus is solely on activity the predators are located on the left and the others on the right. Although there are several significant changes regarding individuals, the overall structure has remained the same, enabling the user to easily identify similarities and differences between two different ways of viewing the same data.

3.2.6 Browsing Different Views (Demonstration)

To demonstrate the Aligned-SOMs applied to music collections an HTML-based interface was implemented [PDW03a]. Three different aspects are combined. One is based on periodicity histograms, which can be related to rhythmic characteristics. The second aspect is based on spectrum histograms, which relate to timbre. (Periodicity histograms and spectrum histograms are described in [PDW03a].) The third aspect is based on metadata. In this specific case the pieces were manually placed on the map. This aspect is referred to as the "user defined aspect" below. Any other placement could be used (e.g. using album release date on the x-axis and beats per minute on the y-axis). A screenshot is shown in Figure 3.8. An online demonstration is available.2

For this demonstration a small collection of 77 pieces from different genres was used.

2 http://www.ofai.at/~elias.pampalk/aligned-soms


Figure 3.7: Aligned-SOMs trained with the animal dataset. From left to right the figures are (a) the first layer with weighting ratio 1:0 between appearance and activity features, (b) 3:1, (c) 1:1, (d) 1:3, (e) 0:1. The shadings represent the density calculated using SDH (n = 2 with bicubic interpolation). White corresponds to high densities, gray to low densities.

For this demonstration a small collection of 77 pieces from different genres was used. Although real world music collections are significantly larger, in some cases even small numbers can be of interest as they might occur, for example, in a result set of a query for the top 100 in the charts. The limitation in size is mainly due to the simple HTML interface. Larger collections would require a hierarchical extension to the interface. For example, one approach would be to represent each island only by its most typical member, and allow the user to zoom in and out of the map.

The user interface (Figure 3.8) is divided into 4 parts: the navigation unit, the map, and two codebook visualizations. The navigation unit has the shape of a triangle, where each corner represents an organization according to a particular aspect. The user defined view is located at the top, periodicity on the left, and spectrum on the right. The user can navigate between these views by moving the mouse over the intermediate nodes. In total there are 73 different nodes the user can browse. The current position in the triangle is marked red (the red marker is set to the top corner in the screenshot). Thus, the current map displays the user defined organization. For example, all classical pieces in the collection are grouped together in the upper left. On the other hand, the island in the upper right of the map represents pieces by Bomfunk MCs. The island in the lower right contains a mixture of different pieces by Papa Roach, Limp Bizkit, Guano Apes, and others which are partly very aggressive. The other islands contain more or less arbitrary mixtures of pieces, although the one located closer to the Bomfunk MCs island contains music with stronger beats.

Below the map are the two codebook visualizations, i.e., the model vectors for each unit. This allows the user to interpret the map and primarily serves research purposes. In particular, the codebooks explain why certain pieces are located in a specific region and what the differences between regions are. The codebook visualizations reveal that the user defined organization is not completely arbitrary with respect to the features extracted from the audio.

Figure 3.8: Screenshot of the HTML-based user interface. The navigation unit is located in the upper left, the map to its right, and beneath the map are the codebook visualizations where each subplot represents a unit of the 10×5 SOM trained on 77 pieces of music. On the left are the periodicity histogram codebooks. The x-axis of each subplot represents the range from 40 (left) to 240 bpm (right) with a resolution of 5 bpm. The y-axis represents the strength level of a periodic beat at the respective frequency. The color shadings correspond to the number of frames within a piece that reach or exceed the respective strength level at the specific periodicity. On the right are the spectrum histogram codebooks. Each subplot represents a spectrum histogram mirrored on the y-axis. The y-axis represents 20 frequency bands while the x-axis represents the loudness. The color shadings correspond to the number of frames within a piece that reach or exceed the respective loudness in the specific frequency band.

Figure 3.9: Codebooks of the organization focusing solely on the periodicity histograms. On the left is the codebook of the periodicity histograms, on the right the codebook of the spectrum histograms. See Figure 3.8 for a description of the axes.

Figure 3.10: Codebooks of the organization focusing solely on the spectrum histograms. See Figure 3.8 for a description of the axes.

Figure 3.11: Two extreme views of the data and the resulting Islands of Music. On the left the focus is solely on the periodicity histograms, on the right the focus is solely on spectrum histograms.

For example, the periodicity histogram has the highest peaks around the Bomfunk MCs island and the spectrum histogram has a characteristic shape around the classical music island. This characteristic shape occurs when most of the energy is in the low and mid frequencies. The shadings are a result of the high variations in the loudness, while the overall relatively thin shape is due to the fact that the maximum level of loudness is not constantly reached.

The codebooks of the extreme perspectives are shown in Figures 3.9 and 3.10. When the focus is solely on periodicity the model vectors of the SOM can better adapt to variations between histograms and thus represent the histograms with higher detail. The same applies for the codebook representing the spectrum histograms. Also noticeable is how the organization of the model vectors changes as the focus is shifted. While there is no obvious global structure in the organization of the spectrum histograms when the focus is on periodicity, the organization becomes apparent if the focus is changed to spectrum information.

An important characteristic of Aligned-SOMs is the global alignment of different views. In this demonstration the user defined organization forces the periodicity patterns of music by Bomfunk MCs to be located in the upper right. If trained individually, these periodicity histograms would be found on the lower right, which is the furthest region from the upper left where pieces such as, e.g., Für Elise by Beethoven can be found.

Figure 3.11 shows the shapes of the islands for the two extreme views focusing only on spectrum or periodicity. When the focus is on spectral features the island of classical music (upper left) is split into two islands where "H" represents piano pieces and "I" orchestra. Inspecting the codebook reveals that the difference is that orchestra music uses a broader frequency range. On the other hand, when the focus is on periodicity a large island is formed which accommodates all classical pieces on one island "A". This island is connected to island "G" where also non-classical music can be found, such as the song Little Drummer Boy by Crosby & Bowie or Yesterday by the Beatles. Although there are several differences between the maps the general orientation remains the same. For example, the island in the upper right always accommodates the pieces by Bomfunk MCs.

3.2.7 Conclusions

This section described the Islands of Music idea and an extension which allows the users to adjust the aspect of similarity they are interested in. This allows interactive exploration of music collections by browsing different views. The Aligned-SOM algorithm performs well on the toy example used and also on a different application where the task is to analyze expressive piano performances [PGW03; GPW04]. However, the results for the music collections are not very convincing. The main reason for this is that the performance of the similarity measures used to describe different aspects of music is very limited.

Nevertheless, Aligned-SOMs can be used to compare two similarity measures (e.g. [PDW03b]). Furthermore, they are not limited to the use of audio-based similarity measures. Any form of similarity could be used (e.g. similarity defined by experts). In addition, future improvements of audio-based similarity will make the mapping of pieces into 2 dimensions more reliable. Perhaps it will be possible to reach a level which will make it interesting to conduct user studies. Alternatively, one approach is to avoid giving the user the impression that the few pieces located close to each other are all of the similar pieces in the collection. This will be dealt with in the next section. A second alternative is to develop applications where, from a user's perspective, some variance is actually desirable. This will be dealt with in Section 3.4.

3.3 Fuzzy Hierarchical Organization

This section describes an approach to hierarchically organize music collections at the artist level. Artists are organized into overlapping groups according to similarity. The reason for dealing with artists instead of tracks is that the similarities are computed by analyzing the content of web pages ranked by Google (as described in Section 2.5), which can only be done at the artist level. However, any similarity measure can be used, also one which allows computation of the similarities at the track level. Furthermore, in this section the advantages and disadvantages of different strategies to automatically describe these clusters of artists with words are analyzed. A domain specific dictionary is used to improve the results. An HTML-based demonstration is available.3 This section describes the work presented in [PFW05a].

3http://www.ofai.at/˜elias.pampalk/wa

3.3.1 Introduction

Hierarchical structures are necessary when dealing with large collections. The approach described in this section uses a fuzzy hierarchical organization. This allows an artist to be located in several branches of the hierarchy. As mentioned in Subsection 3.2.1 music is inherently fuzzy. A piece of music can exhibit different styles, making it impossible to pigeonhole it. One approach to deal with this is to use overlapping clusters. For example, if the first level in the hierarchy gives the user the choice between rock and electronic music, it would be difficult to find rock pieces with electronic influences (or the other way round). In particular, it would require the users to search all possible branches to ensure that they have not missed artists of interest (which is not the purpose of a hierarchical organization).

In addition to the fuzziness of music, the performance of the similarity measure is suboptimal. Rock pieces might wrongly be classified into the electronic branch. However, for such pieces the second choice might have been to place them in the rock branch. Introducing fuzziness (thus increasing the number of pieces per branch) allows the system to correct false negatives. Furthermore, this avoids giving the user a wrong impression about the accuracy of the structure.

A very important issue for hierarchical interfaces is how to summarize the contents of a branch. That is, how can a group of similar artists be summarized and described to the user? A good summary should be very short. A user should be able to quickly read (or listen to) several summaries. Subsection 3.3.4 describes an approach which uses a domain specific dictionary (i.e. a dictionary with terms used to describe music) to create summaries based on words which occur on web pages containing the artists' names.

Structure of this Section

This section is structured as follows. The next subsection gives an overview of the background. In particular, some currently existing systems are briefly described. Subsection 3.3.3 describes the fuzzy hierarchical clustering algorithm used to structure the data. Subsection 3.3.4 describes the techniques used to select words to describe the clusters of artists. Subsection 3.3.5 describes and discusses the results obtained for a music collection with 224 artists. Finally, conclusions are drawn in Subsection 3.3.6.

3.3.2 Background

Approaches to organize music collections at the artist level exist and are in use. One common approach is to use genres. Although such organizations are very common they have limitations. As already discussed in Subsection 2.3.2 genre taxonomies are inconsistent. There are usually no clear boundaries between genres and many artists create music which fits more than just one genre.

Figure 3.12: Screenshot of Ishkur's Guide to Electronic Music showing subgenres for techno music.

In some cases an expert is required to correctly classify an artist into one of hundreds of genres and subgenres. For example, Ishkur's Guide to Electronic Music distinguishes 180 different genres just for electronic music.4 A screenshot of this guide is shown in Figure 3.12. In such cases it is questionable how useful the genre labels (without further information) are for non-experts.

The main advantage of organizing music into genres is that it enables users to easily find similar music. In particular, if the users know that their favorite artist belongs to a certain genre, they might want to explore other artists belonging to the same genre. However, finding similar artists does not require categorization into genres. For example, Amazon5 (Figure 3.13) and the All Music Guide6 (Figure 3.14) offer lists of similar artists to explore. Amazon customers can browse genres (or categories) if they like. However, once they have searched for (or selected) a specific artist this option is not displayed on the artist's web page. Similarly, the All Music Guide categorizes artists into genres and styles. However, the list of similar artists is placed easily accessible next to the list of genres and styles on the web page.

4http://www.di.fm/edmguide
5http://amazon.com
6http://allmusic.com

Figure 3.13: Screenshot of the information Amazon displays for an artist (Ben Harper in this case). In the lower left of the screen is a list of similar artists.

Figure 3.14: Screenshot of the information the All Music Guide displays for an artist (Ben Harper in this case). In the lower center of the screen is a list of similar artists.

Figure 3.15: Screenshot of Liveplasma with the query results for "Ben Harper".

Another available system which visually organizes music at the artist level is Liveplasma.7 A screenshot is shown in Figure 3.15. Each artist is represented by a disc. The size of the disc corresponds to the popularity. Lines connect similar artists.

In contrast to these commercial systems this section describes an approach which does not display artist names to the user. The assumption is that the user does not know the artist names. Furthermore, the users do not need to start with a familiar artist. Another difference is that the approach described in this section does not require knowledge of customers' shopping behavior or experts to annotate the data. Instead the similarities are computed using common web pages.

The same hierarchical approach used here to organize artists has been previously applied to organize large collections of drum sample libraries [PHH04]. The basic idea is similar to the approach used by the search engine vivisimo.8 Other approaches which organize music collections at the artist level include [vGVvdW04] and [SKW05a].

3.3.3 Hierarchical Clustering

The user interface defines the requirements for the clustering algorithm. The user interface is shown in Figure 3.19. Instead of complex graphical visualizations simple lists of words are used to present the information. The user interface provides room for 5 nodes per level of the hierarchy. This number is more or less arbitrary and can easily be changed. However, the important point is that the number is fixed and does not vary.

A clustering technique which fulfills these simple requirements is the one-dimensional Self-Organizing Map, which can be structured hierarchically [Mii90; KO90]. More flexible approaches include the Growing Hierarchical SOM [DMR00] and variations (e.g. [PWC04]). Other alternatives include, for example, Hierarchical Agglomerative Clustering as used in [PWL01].

The SOM groups similar items into clusters and places similar clusters close to each other. To create an overlap between clusters the size of each cluster is increased by 20% after training the SOM. This is done by adding the artists closest to the model vector. This overlap makes it easier to find artists which are on the border of two or more clusters. Recursively, for each cluster another one-dimensional SOM is trained (for all artists assigned to the cluster) until the cluster size falls below a certain limit (e.g. if less than 7 artists remain). Figure 3.16 shows the basic architecture. In [PHH04] the same one-dimensional overlapping clustering approach and a similar user interface were used to organize large drum sample libraries. The feedback gathered from the users was very positive.
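A minimal sketch of this recursion (a one-dimensional SOM with 5 units per level, each cluster enlarged by 20%, recursion stopping below 7 artists) is given below. The tiny SOM and all names are illustrative assumptions, not the implementation used for the demonstration.

    import numpy as np

    def train_1d_som(vectors, n_units=5, n_iter=50, lr=0.1, sigma=1.0):
        """Very small one-dimensional SOM (illustrative, fixed learning rate)."""
        rng = np.random.default_rng(0)
        codebook = vectors[rng.choice(len(vectors), n_units)].astype(float)
        for _ in range(n_iter):
            for x in vectors:
                bmu = int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
                for u in range(n_units):
                    h = np.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))
                    codebook[u] += lr * h * (x - codebook[u])
        assignments = [int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
                       for x in vectors]
        return assignments, codebook

    def build_hierarchy(artist_ids, vectors, n_units=5, overlap=0.2, min_size=7):
        """Recursively cluster artists into overlapping groups."""
        if len(artist_ids) < min_size:
            return {"artists": list(artist_ids), "children": []}
        assignments, codebook = train_1d_som(vectors, n_units)
        children = []
        for u in range(n_units):
            members = [i for i, a in enumerate(assignments) if a == u]
            if len(members) == len(artist_ids):
                # Degenerate case (all artists in one unit): stop recursing here.
                children.append({"artists": list(artist_ids), "children": []})
                continue
            # Enlarge the cluster by 20% with the non-members closest to its model vector.
            n_extra = int(np.ceil(overlap * len(members)))
            others = sorted((i for i in range(len(artist_ids)) if i not in members),
                            key=lambda i: np.linalg.norm(vectors[i] - codebook[u]))
            members += others[:n_extra]
            children.append(build_hierarchy([artist_ids[i] for i in members],
                                            vectors[members], n_units, overlap, min_size))
        return {"artists": list(artist_ids), "children": children}

    # Toy usage: tree = build_hierarchy(list(range(50)), np.random.rand(50, 20))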

7http://liveplasma.com
8http://www.vivisimo.com

Figure 3.16: Basic architecture of the fuzzy hierarchy.

Evaluation of the Clustering

The easiest way to evaluate the cluster hierarchy is to test whether artists from the same genre are grouped together. For the evaluation described here a collection of 224 artists was used. A list of the artists is online.9 Each of the 14 genres is represented by 16 artists. The genres are listed in Figure 3.17.

Figure 3.17 shows the distribution of the genres within the nodes (i.e. clusters) in the hierarchical structure. (A screenshot of how the hierarchy is displayed in the user interface is shown in Figure 3.19.) At the first level classical music (node n1) is well separated from all other music. The effects of the overlap are immediately visible as the sum of artists mapped to all units in the first layer exceeds 224. One example of a direct effect of the overlap is that there is jazz music in node n1, which would not be there otherwise. The nodes n1 and n5 are the only ones at the first level containing jazz music. Electronic and rap/hip-hop are only contained in n2 and n3, and blues only in n4 and n5.

At the second level most nodes have specialized. For example, n5.1 contains 34 artists mainly from jazz and blues and a few from rock & roll. Another nice example is n3.2 which contains mostly punk but also some alternative. An interesting observation is that the one-dimensional ordering of the SOM (i.e. similar units should be close to each other) is not apparent. One reason for this might be the extremely high-dimensional data (as mentioned in Subsection 2.5.2, 200 eigenvectors are necessary to preserve 95% of the variance in the data). Furthermore, it is noticeable that there are some nodes which seem to contain a bit of almost every genre.

Figure 3.18 shows what happens at the third level (in the subbranch of node n2). For example, while node n2.3 contains artists from punk and soul/R&B none of its children mix the two. Another positive example is that node n2.5.1 captures most of the electronic music in n2.5. However, as can be seen the clustering is far from perfect and leaves a lot of room for improvement. These deficiencies can be traced back to the performance of the similarity measure.
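The genre distributions shown in Figures 3.17 and 3.18 can be computed directly from such a hierarchy. A small illustrative helper (assuming the node dictionaries from the sketch above and a genre lookup table, both hypothetical) might look as follows.

    from collections import Counter

    def genre_distribution(node, genre_of):
        """Histogram of genres for the artists mapped to one node.

        genre_of maps an artist id to its genre label."""
        return Counter(genre_of[a] for a in node["artists"])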

9http://www.cp.jku.at/people/knees/artistlist224.html

Figure 3.17: Distribution of the 14 genres in the nodes of the first level (L1, first row) and second level (L2, all columns starting with the second row). For example, n2.4 (L2 node) is the fourth child of parent n2 (L1 node). Each subplot represents the distribution of genres in a node (visualized as histogram and displayed in two lines to save space). Black corresponds to high values. The boxes in the histogram correspond to the following genres: A alternative/indie, B blues, C classic, D country, E electronic, F folk, G heavy, H jazz, I pop, J punk, K rap/hip-hop, L reggae, M R&B/soul, N rock & roll. The numbers in brackets are the number of artists mapped to the node.

3.3.4 Term Selection for Cluster Description

Term selection is a core component of the user interface. The goal is to select words which best summarize a group of artists. The approach described here is based on three assumptions with respect to the user interaction. First, the artists are mostly unknown to the user (otherwise it would be possible to label the nodes with the artists' names). Second, it is unknown which artists the user knows (otherwise these could be used to describe the nodes). Third, the assumption is made that space is very limited and thus each node should be described with as few words as possible. Dropping the second assumption could lead to very interesting interactive user interfaces. However, this is beyond the scope of this thesis.

In the experiments described here five term selection techniques are compared. Furthermore, two different sets of terms to start with are compared: in addition to the straightforward approach of using the terms which are also used for the similarity computations, an approach based on a domain-specific dictionary is used.

Figure 3.18: Distribution of the 14 genres in the nodes of the second (L2) and third (L3) level in the subbranch of node n2.

3.3.4.1 Techniques

Given are the term frequency tf_ta, document frequency df_ta, and the Cosine normalized tf × idf weight w_ta for each term t and artist a (see Subsection 2.5.2).

A straightforward approach is to use the tf × idf computations, i.e. w_ta. For each node (i.e. cluster c) the average over the assigned artists is computed,

    w_{tc} = \frac{1}{|c|} \sum_{a \in c} w_{ta},

and the terms with the highest values are selected.

The second approach is called "LabelSOM" [Rau99] and has successfully been applied to label large document collections organized by SOMs. LabelSOM is built on the observation that terms with a very high w_tc and a high variance (i.e., they are very rare in some of the documents in the cluster) are usually poor descriptors. Thus, instead of w_tc the variance of w_ta in c is used to rank the terms (better descriptors have lower variances). Since terms which do not occur in c (w_tc = 0) have variance 0, terms with w_tc below a manually defined threshold are removed from the list of possible candidates. This threshold depends on the number of input dimensions and how the vectors are normalized. In the experiments described here the input dimension is 4139 (see Subsection 2.5.2) and a threshold of 0.045 is used. For the approach with the dictionary (see below), where the input dimension is 1269, a threshold of 0.1 is used. In cases where a cluster consists of only one artist the tf × idf ranking is used instead.

Neither tf × idf ranking nor LabelSOM try to find terms which discriminate two nodes. However, emphasizing differences between nodes of the same parent helps reduce redundancies in the descriptions. Furthermore, the assumption can be made that the user already knows what the children have in common after reading the description of the parent. A standard technique to select discriminative terms is the χ² (chi-square) test (e.g. [YP97]). The χ²-value measures the independence of t from group c and is computed as

    \chi^2_{tc} = \frac{N(AD - BC)^2}{(A + B)(A + C)(B + D)(C + D)},        (3.7)

where A is the number of documents in c containing t, B the number of documents not in c containing t, C the number of documents in c without t, D the number of documents not in c without t, and N is the total number of retrieved documents. As N is equal for all terms, it can be ignored. The terms with the highest χ²_tc values are selected because they are least independent from c. Note that the document frequency is very informative because df_ta describes the percentage of times the terms occur in the 50 retrieved documents per artist (as described in Subsection 2.5.2).

The fourth technique was proposed by Lagus and Kaski (LK) [LK99]. Like LabelSOM it was developed to label large document collections organized by SOMs. While χ² only uses df, LK only uses tf. The heuristically motivated ranking formula (higher values are better) is

    f_{tc} = \frac{tf_{tc}}{\sum_{t'} tf_{t'c}} \cdot \frac{tf_{tc} / \sum_{t'} tf_{t'c}}{\sum_{c'} \left( tf_{tc'} / \sum_{t'} tf_{t'c'} \right)},        (3.8)

where tf_tc is the average term frequency in cluster c. The left side of the product is the importance of t in c defined through the frequency of t relative to the frequency of other terms in c. The right side is the importance of t in c relative to the importance of t in all other clusters.

The fifth approach is a variation of LK. It serves to demonstrate the effects of extreme discrimination. In particular, in this variation the tf_tc are normalized over the whole collection such that a word which occurs 100 times in cluster c and never in any other cluster is equally important as a word that occurs once in c and never otherwise.
As shown later this approach can only produce meaningful results when used in combination with a specialized dictionary. All terms which do not occur in at least 10% of the documents per cluster are ignored. The ranking function (higher values are better) is

    f_{tc} = \frac{tf_{tc}}{\sum_{c'} tf_{tc'}} \cdot \frac{tf_{tc} / \sum_{c'} tf_{tc'}}{\sum_{c''} \left( tf_{tc''} / \sum_{c'} tf_{tc'} \right)}.        (3.9)

In addition two combinations are implemented, namely LK combined with χ², and the LK variant combined with χ². In both cases the values are combined by multiplication.
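The discriminative rankings can be summarized in a few lines of code. The sketch below computes the χ² score of Equation 3.7 (with the constant N dropped) and the Lagus & Kaski score of Equation 3.8 from simple term-by-cluster matrices; the matrix names and layout are assumptions made for illustration, not the original implementation.

    import numpy as np

    def chi2_scores(docs_with_term, docs_per_cluster):
        """docs_with_term[t, c]: number of retrieved documents of cluster c containing term t;
        docs_per_cluster[c]: total number of retrieved documents of cluster c."""
        A = docs_with_term.astype(float)                        # in c, containing t
        B = A.sum(axis=1, keepdims=True) - A                    # not in c, containing t
        C = docs_per_cluster[np.newaxis, :] - A                 # in c, without t
        D = (docs_per_cluster.sum() - docs_per_cluster)[np.newaxis, :] - B  # not in c, without t
        denom = (A + B) * (A + C) * (B + D) * (C + D) + 1e-12   # avoid division by zero
        return (A * D - B * C) ** 2 / denom

    def lagus_kaski_scores(tf):
        """tf[t, c]: average term frequency of term t in cluster c (Eq. 3.8)."""
        rel = tf / tf.sum(axis=0, keepdims=True)        # importance of t within cluster c
        return rel * rel / rel.sum(axis=1, keepdims=True)  # times importance relative to other clusters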

3.3.4.2 Domain-Specific Dictionary

One of the main pillars of this section is the use of a dictionary to avoid describing clusters with artist, album, and other specialized words likely to be unknown to the user. This dictionary contains general words used to describe music, such as genre names. The dictionary contains 1398 entries,10 1269 of which occur in the retrieved documents. The dictionary was manually compiled by the author in a sloppy manner by copying lists from various sources such as Wikipedia, the Yahoo directory, allmusic.com, and other sources which contained music genres (and subgenres), instruments, or adjectives. The dictionary is far from complete and contains terms which should be removed (e.g. world, uk, band, and song). However, to get a better understanding of possible drawbacks the original version of the dictionary was not modified.

Each retrieved web page is parsed and the term frequencies, document frequencies, and tf × idf weights are computed. A question that arises is why the dictionary was not used in the first place to compute the similarities. There are two reasons. First, the classification performance using k-nearest neighbors with leave-one-out validation is only about 79% compared to the 85% of the standard approach. Considering the size of the collection this might not be significant. However, the explanation for this is that the standard approach captures a lot of very specific words such as the artists' names, the names of their albums, and many other terms which co-occur on related artist pages. Second, while the dictionary is an important pillar of this approach an effort is made not to rely too much upon it. By manipulating the dictionary it is very likely that 100% classification accuracy could be achieved on the set of 224 artists. However, such results could not be generalized to other music collections. Furthermore, in the current approach the specialized dictionary can be replaced at any time without impact on the hierarchical structure.

Limitations

As already mentioned the dictionary has some limitations. For example, it includes terms that do not help summarize clusters, such as song, uk, band, world, and musical. As can be seen in Tables 3.3 and 3.4 these words occur very frequently. On the other hand, some terms were completely forgotten, such as love and hate. Another problem is that equivalent terms are not dealt with. Examples include: hip-hop and hip hop, rock and roll and rock & roll, rock and pop and pop/rock. Furthermore, no attempt is made to deal with different languages; the dictionary contains only English terms.

3.3.5 Results and Discussion

In this subsection first the implemented user interface is described. Second, different term selection techniques are compared. This is done using simple lists of words (see Tables 3.3 and 3.4). Finally, the approach is discussed.

10http://www.ofai.at/˜elias.pampalk/wa/dict.txt

Figure 3.19: Screenshot of the HTML user interface.

3.3.5.1 User Interface

To demonstrate the approach a very simple HTML interface was implemented.11 There are two parts of the interface: the hierarchy of clusters visualized as a grid of boxed texts and, just to the right of it, a display of the list of artists mapped to the currently selected cluster. The clusters of the first level in the hierarchy are visualized using the five boxes in the first (top) row. After the user selects a cluster, a second row appears which displays the children of the selected cluster. The selected clusters are highlighted in a different color. The hierarchy is displayed in such a way that the user can always see every previously made decision on a higher level. The number of artists mapped to a cluster is visualized by a bar next to the cluster.

Inside a text box, the highest ranked terms are displayed. In general the top 10 are shown; however, if a term's value is below 10% of the highest value then it is not displayed. The value of the ranking function for each term is coded through the color in which the term is displayed. The best term is always black and as the values decrease the color fades out. For debugging purposes it is also possible to display the list of all ranked words for a cluster. Figure 3.19 shows what the user interface looks like (using LK labeling) after node n2.5.1 was selected (thus 4 levels are visible).
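The selection of the terms actually shown in a box follows the rule just described (at most the top 10, and no term below 10% of the best score). A hypothetical helper, with an assumed data layout, could look like this:

    def terms_to_display(ranked_terms, max_terms=10, rel_threshold=0.1):
        """ranked_terms: list of (term, score) pairs sorted by descending score."""
        if not ranked_terms:
            return []
        best_score = ranked_terms[0][1]
        return [(term, score) for term, score in ranked_terms[:max_terms]
                if score >= rel_threshold * best_score]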

11http://www.ofai.at/˜elias.pampalk/wa

node: with dictionary | without dictionary

tf × idf
n1: classical, piano, orchestra, symphony, musical | classical, piano, orchestra, works, composer
n2: song, pop, world, uk, band | listen, color, news, pop, size
n3: song, band, pop, world, guitar | band, listen, great, pop, live
n4: song, band, guitar, pop, world | color, listen, live, pop, size
n5: song, blues, band, guitar, world | color, size, family, listen, blues

LabelSOM
n1: world, musical, concert, song, uk | two, information, musical, recordings, great
n2: world, musical, classical, song, real | content, know, people, listen, sound
n3: musical, world, song, group, pop | great, sound, news, listen, live
n4: musical, world, classical, song, pop | content, great, information, pop, listen
n5: musical, classical, group, song, world | information, great, content, listen, pop

χ²
n1: piano, orchestra, symphony, concert, opera | classical, composer, musical, great, piano
n2: dance, rap, hip-hop, beats, group | news, pop, sound, track, release
n3: guitar, musical, group, punk, metal | band, sound, live, great, pop
n4: musical, guitar, country, group, blues | live, band, pop, news, policy
n5: blues, band, country, pop, jazz | blues, jazz, country, hits, policy

Lagus & Kaski
n1: piano, symphony, classical, orchestra, opera | op, bach, piano, symphony, classical
n2: song, rap, hip-hop, pop, dance | hop, hip, rap, listen, pop
n3: band, punk, metal, song, pop | band, punk, metal, bands, great
n4: country, song, band, guitar, pop | country, alice, elvis, brooks, rate
n5: blues, jazz, country, soul, reggae | blues, jazz, color, john, size

χ² · LK
n1: piano, orchestra, symphony, opera, violin | classical, piano, composer, orchestra, symphony
n2: rap, dance, hip-hop, beats, uk | news, pop, hop, hip, track
n3: punk, guitar, metal, musical, group | band, punk, live, sound, great
n4: country, guitar, musical, group, blues | country, live, band, pop, hits
n5: blues, jazz, country, band, soul | blues, jazz, country, john, hits

Lagus & Kaski variant
n1: rondo, fortepiano, contralto, fugue, mezzo | nabucco, leopold, cycles, figaro, sonatas
n2: hardcore techno, latin rap, southern rap, east coast rap | pies, grandmaster, hash, tricky, pimp
n3: pop-metal, melodic metal, detroit rock, flamenco guitar, math rock | roisin, pies, hash, dez, voulez
n4: new traditionalist, british folk-rock, progressive bluegrass, gabba, slowcore | csn, dez, voulez, shapes, daltrey
n5: rockabilly revival, new beat, progressive country, vocalion | hodges, precious, shanty, broonzy, dez

χ² · LK variant
n1: piano, orchestra, symphony, opera, violin | classical, symphony, composer, piano, orchestra
n2: rap, hip-hop, beats, dance, cool | hop, hip, rap, eminem, dj
n3: punk, metal, guitar, punk rock, hard | band, punk, metal, bands, live
n4: country, guitar, folk, group | country, brooks, elvis, dylan, hits
n5: blues, jazz, country, soul | blues, jazz, willie, otis, john

Table 3.3: List of top ranked terms for nodes on the first level.

node: with dictionary | without dictionary

Lagus & Kaski
n5.1: blues, jazz, guitar, band, orchestra | blues, jazz, john, coltrane, basie
n5.2: soul, blues, song, gospel, pop | aretha, soul, redding, king, franklin
n5.3: reggae, ska, song, world, dancehall | marley, reggae, tosh, cliff, baez
n5.4: country, country music, song, bluegrass, folk | country, hank, elvis, cash, kenny
n5.5: band, song, pop, guitar, blues | elvis, roll, rate, band, bo

LK variant (with dictionary)
n5.1: hot jazz, post-bop, vocalion, rondo, soul-jazz, classic jazz, hard bop, superstitious, octet
n5.2: british blues, pornographic, colored, classic soul, sensual, erotic, precious, rap rock, stylish
n5.3: vocal house, soca, british punk, gong, ragga, ska, dancehall, dancehall reggae, hard house
n5.4: new traditionalist, yodelling, middle aged, country boogie, outlaw country, rockabilly revival
n5.5: experimental rock, boogie rock, castanets, psychedelic pop, pagan, crunchy

Table 3.4: List of top ranked terms for the children of node n5.

3.3.5.2 Comparison of Term Selection Techniques

Table 3.3 lists all top-ranked words for the different approaches at the first level. Table 3.4 lists some examples of the second level (for the children of node n5). Comparing these tables with Figures 3.17 and 3.18 shows how different types of genres are described. For example, in most cases describing the classical cluster (node n1) works very well. In general the following observations were made:

First, the results using the dictionary are better in most cases. The difference is more obvious at the second level (for the children of node n5). In particular, using the dictionary avoids the frequent appearance of artist names.

Second, not all words in the domain specific dictionary make sense. Although not directly noticeable at the first level, there are some words which appear frequently among the top-ranked words but do not convey much information: world, uk, band, song, musical. On the other hand, words such as love and hate are missing in the dictionary. Having a few meaningless words is not such a big problem. In an interactive interface the user could simply click on such words to remove them and the interface could be updated immediately. However, adding missing words to the dictionary requires scanning all the retrieved documents for occurrences.

Third, the performance of the non-discriminating approaches (tf × idf, LabelSOM) is very poor. On the other hand, all discriminative approaches (χ², LK, and combinations) yield interesting results with the dictionary. However, the LK variant by itself focuses too much on the differences. Obviously, truly judging the quality of the different variations would require user studies. However, it seems that the approach from Lagus & Kaski performs slightly better than the others.

3.3.5.3 Discussion

One of the main problems is that the similarity measure relies on artist names. In many cases a name might have several meanings, making it difficult to retrieve relevant web pages. Another problem is that many new and not so well known artists do not appear on web pages. Furthermore, the dictionary used contains terms mainly used in western culture. This limits the implemented approach to yesterday's mainstream western culture. However, the dictionary could easily be replaced. Another issue is the dynamics of web contents (e.g. [LG99]). This was studied in [KPW04] and [Kne04]. So far significant changes were observed in the Google ranks, but these did not have a significant impact on the similarity measure. In an ideal case, the web-based similarity measure would be complemented with data from other sources such as market basket analysis, collaborative filtering, and audio signal analysis.

3.3.6 Conclusions

This section demonstrated possibilities to hierarchically organize music at the artist level. In particular, hierarchical overlapping clusters were described using a domain-specific dictionary. The results are very promising. However, no user-based evaluation has been conducted so far.

3.4 Dynamic Playlist Generation

Common approaches to creating playlists are to randomly shuffle a collection (e.g. iPod shuffle) or to manually select songs. In this section heuristics to automatically adapt playlists given a song to start with (seed song) and immediate user feedback are presented and evaluated. Instead of rich metadata, audio-based similarity is used. The users give feedback by pressing a skip button if they dislike the current song. Songs similar to skipped songs are removed, while songs similar to accepted ones are added to the playlist. The heuristics are evaluated with hypothetical use cases. For each use case a specific user behavior (e.g. the user always skips songs by a particular artist) is assumed. The results show that using audio similarity and simple heuristics the number of necessary skips can be reduced drastically. This section describes the work presented in [PPW05a].

3.4.1 Introduction

There are different ways to create playlists. One extreme is to very carefully select each piece and the order in which the pieces are played. Another extreme is to randomly shuffle all pieces in a collection. While the first approach is very time consuming, the second approach produces useless results if the collection is very diverse. This section describes an alternative which requires little user interaction even for very diverse collections. The goal is to minimize user input and maximize satisfaction. The assumptions are:

1. First, a seed song is given. The problem of browsing a large collection to find a song to start with is not addressed. (If more than one song to start with is given then the task is simplified.)

2. Furthermore, a skip button is available and easily accessible to the user. For example, this is the case if the user runs Winamp while browsing the Internet.

3. Finally, the assumption is made that the user is “lazy” and is willing to sacrifice quality for time. In particular, the assumption is that all the user is willing to do is press a skip button if the song currently played is a bad choice.

This section describes simple heuristics to dynamically propose the next song to be played in a playlist. The approach is based on audio similarity and takes the user’s skipping behavior into account. The idea is to avoid songs similar to songs which were skipped and focus on songs similar to accepted ones. The heuristics are evaluated based on hypothetical use cases. These use cases are: (1) The user wants to listen to songs from the same genre as the seed song. (2) The user does not like an artist in this genre. (3) The user gets bored of the genre and wants to listen to a related genre. (The use cases are described in more detail later on.)

Structure of this Section

The remainder of this section is structured as follows. The next subsection reviews related work. Subsection 3.4.3 describes the heuristics to generate playlists and respond to user feedback. Subsection 3.4.4 describes the evaluation procedure and the results. Subsection 3.4.5 describes the implementation used and gives some examples. Conclusions are drawn in Subsection 3.4.6.

3.4.2 Related Work

Previous work on playlist generation has partly dealt with algorithms to efficiently find a playlist which fulfills given constraints (e.g. [PRC00; AT01; AP02b; Pvd05]). These approaches assume the existence of rich metadata. A commercial product to generate playlists from proprietary metadata is available from Gracenote.12 This "dynamic" playlist generator updates playlists when the music collection is modified. A conceptually very similar approach to the one described in this section is the Internet radio station Last.FM,13 which creates a user profile based on immediate user feedback. Last.FM is built on collaborative filtering and uses "Love", "Skip", and "Ban" buttons as input.

12http://www.gracenote.com/gn products/playlist.html
13http://last.fm

Another approach using immediate user feedback and rich metadata was presented in [PE02]. The main difference to the approach described here is that in the evaluations presented below random shuffling would completely fail. Furthermore, the heuristics described here have no parameters which need to be trained and thus require less user feedback. Unlike these previous approaches the approach in this section does not rely on metadata or collaborative filtering.

Purely audio-based playlist generation was proposed for example in [Log02] and [PPW05c]. In [Log02] the author showed that simply using the n nearest songs to a seed song as a playlist performs relatively well. In [PPW05c] traveling salesman algorithms are used to find a path through the whole collection.

An interesting approach to playlist generation which partly uses audio-based similarity measures was presented by Goto & Goto [GG05]. The Musicream system they developed allows the user to playfully explore streams of music. Various features such as finding similar pieces, organizing and sorting playlists, and browsing experiences from the past are implemented. In contrast to the approach described here most features of the system assume a slightly more active user. Furthermore, while the assumption here is that the playlists are generated from the user's music collection, Musicream was primarily designed for "all-you-can-hear" music subscription services.

3.4.3 Method

This approach is based on an audio-based music similarity measure. Here a combination of 65% G30S, 15% FP, 5% FP Focus, and 15% FP Gravity is used (see Chapter 2). The details of the combination are described in [PFW05b] and [Pam05]. From a statistical (or machine learning) point of view it is problematic to use complex models (or learners) with a high number of parameters to learn user preferences. The main reason for this is that in an ideal case the number of negative examples is extremely low (e.g. less than 5) and even the number of positive examples is not very high (e.g. 20 tracks can fill an hour of music).

To generate the playlists the following 4 heuristics are used. Candidate songs are all songs in the collection which have not been played (or skipped) yet.

A. As suggested in [Log02] the n nearest neighbors to the seed song are played (n = accepted + skipped). This heuristic creates a static playlist and is the baseline.14

B. The candidate song closest to the last song accepted by the user is played. This is similar to heuristic A with the only difference that the seed song is always the last song accepted.

14A purely random playlist generator is not used as baseline because the generator would completely fail in the use cases described later on. The reason for this is the large number of genres in the collection.


Figure 3.20: Illustration of the differences between the 4 heuristics.


C. The candidate song closest to any of the accepted songs is played. Using the minimum distance for recommendations from song sets was proposed in [Log04].

D. For each candidate song, let d_a be the distance to the nearest accepted song, and let d_s be the distance to the nearest skipped song. If d_a < d_s, then add the candidate to the set S. From S play the song with the smallest d_a. If S is empty, then play the candidate song which has the best (i.e. the lowest) d_a/d_s ratio.

Figure 3.20 shows the differences between the heuristics. On the left the first two songs were accepted by the user. There are two choices for the third song to be played. Heuristic A would choose the song on the left which is closer to the seed song; all other heuristics would choose the song closer to the second accepted song. The case illustrated in the center of Figure 3.20 shows the difference between heuristic B on one side and C and D on the other side. In particular, the first 4 songs were accepted by the user. Heuristic B would select the song in the upper right which is closest to the last accepted song. On the other hand, C and D would select the song in the lower left which is closer to song 2. The third case illustrated on the right side of Figure 3.20 shows the difference between heuristics C and D. The first 3 songs were accepted by the user. The fourth song was rejected. Heuristic C ignores this and selects the song on the left which is very close to the rejected song. Heuristic D selects the song which is further away from the rejected song.
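Heuristic D can be sketched in a few lines. The following Python fragment is an illustration only, where dist is assumed to be the audio-based distance function (smaller values mean more similar) and songs are arbitrary objects; it is not the thesis implementation.

    def next_song_heuristic_d(candidates, accepted, skipped, dist):
        """Pick the next song according to heuristic D."""
        scored = []
        for song in candidates:
            d_a = min(dist(song, a) for a in accepted)
            d_s = min(dist(song, s) for s in skipped) if skipped else float("inf")
            scored.append((song, d_a, d_s))

        # Prefer candidates closer to an accepted song than to any skipped song.
        preferred = [(song, d_a) for song, d_a, d_s in scored if d_a < d_s]
        if preferred:
            return min(preferred, key=lambda x: x[1])[0]
        # Otherwise fall back to the best (lowest) d_a / d_s ratio.
        return min(scored, key=lambda x: x[1] / (x[2] + 1e-12))[0]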

3.4.4 Evaluation

Evaluation of a playlist generation algorithm ultimately requires a user study. However, as shown in Chapter 2, the results from user ratings relate to the results from the much simpler to conduct genre-based evaluation. The approach described in this subsection is based on simulating user behavior and uses genre metadata.

3.4.4.1 Procedure (Hypothetical Use Cases)

In the hypothetical use cases the assumption is made that the user wants to listen to one hour of music, which is approximately 20 songs. The number of skips is counted until these 20 songs are played. The use cases (UC) are the following:

UC-1. The user wants to listen to songs similar to the seed. This is measured by equating similarity with genre membership. Any song outside of the seed's genre is skipped. The evaluation is run using every song in the collection as seed.

UC-2. The user wants to listen to similar music but dislikes a particular artist (for reasons which cannot be measured, such as personal taste). To measure this the same approach as for UC-1 is used. An unwanted artist from the seed's genre (not the artist of the seed song) is randomly selected. Every time a song outside the seed's genre or from the unwanted artist is played, skip is pressed. The evaluation is run using every song in the collection as seed.

UC-3. The user's preferences change over time. This is measured as follows. Let A be the genre of the seed song and B a related genre which the user starts to prefer. The first 5 songs are accepted if they are from genre A. The next 10 are accepted if they are from either A or B. The last 5 are accepted if they are from B. Pairs of genres have been selected manually for this use case. The list of pairs can be found in Table 3.7. The evaluation is run using every song in genre A as seed. Unlike in UC-1 and UC-2, it is possible that in UC-3 a state is reached where none of the candidate songs would be accepted although the number of accepted songs is less than 20. In such cases the remaining songs in the collection are added to the skip count.
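The skip counts reported below are obtained by simulating this behavior. A minimal sketch for UC-1, assuming genre labels as ground truth and a next-song function such as the heuristic sketched above, could look as follows; all names are illustrative assumptions rather than the evaluation code used for the experiments.

    def count_skips_uc1(collection, seed, genre_of, next_song, target=20):
        """Simulate UC-1: only songs from the seed's genre are accepted."""
        accepted, skipped, skips = [seed], [], 0
        candidates = [s for s in collection if s != seed]
        while len(accepted) < target and candidates:
            song = next_song(candidates, accepted, skipped)
            candidates.remove(song)
            if genre_of[song] == genre_of[seed]:
                accepted.append(song)
            else:
                skipped.append(song)
                skips += 1
        return skips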

One of the biggest problems for this evaluation is that there are not enough artists per genre to implement an artist filter. That is, playing several songs from the same artist right after each other is not avoided. Another issue is that the assumption is made that only songs the user dislikes are skipped. However, if a song is skipped because, e.g., the user just heard it on the radio (but likes it otherwise) the heuristics will be misled. To evaluate this some random skips could have been included. To solve this the user could be given more feedback options. For example, how long or hard the skip button is pressed could indicate how dissimilar the next song should be.

3.4.4.2 Data

The collection used contains 2522 tracks from 22 genres (see Table 3.5 for further statistics). It is the same collection described in Subsection 2.3.3 as DB-L. The difference is that this collection has not been cleaned. That is, there are a number of very short tracks which are not songs (for example, introductions to an album etc.). The genres and the number of tracks per genre are listed in Fig. 3.22.

Genres   Artists   Tracks   Artists/Genre (Min, Max)   Tracks/Genre (Min, Max)
22       103       2522     3, 6                       45, 259

Table 3.5: Statistics of the music collection.

        Heuristic   Min   Median   Mean    Max
UC-1    A             0     37.0   133.0   2053
        B             0     30.0   164.4   2152
        C             0     14.0    91.0   1298
        D             0     11.0    23.9    425
UC-2    A             0     52.0   174.0   2230
        B             0     36.0   241.1   2502
        C             0     17.0   116.9   1661
        D             0     15.0    32.9    453

Table 3.6: Number of skips for UC-1 and UC-2.

3.4.4.3 Results

For UC-1, using random shuffle to generate the playlist would require more than 300 skips in half of the cases, while heuristic A requires less than 37 skips in half of the cases. Table 3.6 shows the results for UC-1 and UC-2. The main observation is that the performance increases from heuristic A to D. In general, there are a lot of outliers, which is reflected in the large difference between mean and median. In a few cases almost all songs from the collection are proposed until 20 songs from the seed genre are in the playlist. Heuristic D has significantly fewer outliers. Half of all cases for heuristic D in UC-1 require less than 11 skips, which might almost be acceptable. Fig. 3.21 shows that for D/UC-1 there is a large number of skips after the first song (seed song).

Figure 3.21: Skips per playlist position for UC-1. (Panels: (a) heuristic A, (b) heuristic D; each panel shows the mean number of skips at playlist positions 1 to 20.)

Figure 3.22: Mean number of skips per genre for heuristic D in UC-1. For example, the first line shows how many songs from each genre were skipped, on average, for playlists starting with an a cappella seed. The number on the left of the table (e.g. 112) is the total number of tracks in the genre; the number on the right side of the table (e.g. 4.9) is the mean total number of skips. The genres and their track counts are: A Cappella 112, Acid Jazz 68, Blues 63, Bossa Nova 72, Celtic 132, Death Metal 72, DnB 72, Downtempo 124, Electronic 63, Euro-Dance 97, Folk-Rock 238, German Hip Hop 143, Hard Core Rap 164, Heavy Metal - Thrash 242, Italian 142, Jazz 64, Jazz Guitar 70, Melodic Metal 169, Punk 259, Reggae 47, Trance 64, Trance2 45.

Table 3.7: Number of skips for UC-3 (median and mean number of skips for heuristics A to D, for each pair of start and destination genre).

Figure 3.23: Average skips and genre ratio per playlist position for heuristic D in UC-3, shown for the transitions Trance ⇒ Euro-Dance, Euro-Dance ⇒ Trance, Jazz Guitar ⇒ Bossa Nova, and Bossa Nova ⇒ Jazz Guitar. The genre ratio is 0 if only genre A (the genre of the seed) is played and 1 if only genre B (destination genre) is played. The circle marks the last and first song which is forced to be from a specific genre.

Once the system has a few positive examples the number of skips decreases. On the other hand, for heuristic A, the number of skips gradually increases with the playlist position. (Note that one must be careful when interpreting the mean because it is strongly influenced by a few outliers.)

Fig. 3.22 shows that for D/UC-1 some genres work very well (e.g. jazz guitar or heavy metal - thrash), while others fail (e.g. electronic or downtempo). However, some of the failures make sense. For example, before 20 pieces from electronic are played, on average almost 18 pieces from downtempo are proposed.

Table 3.7 gives the results for UC-3. As for the other use cases the performance increases from A to D in most cases. The pair a cappella to death metal was included as an extreme to show the limitations (such a transition is not considered to be a likely user scenario). In three of the cases for heuristic D the median seems to be acceptably low. However, using an artist filter these results would surely be significantly worse. The number of skips depends a lot on the direction of the transition. For example, moving from jazz guitar to bossa nova requires, in half of the cases, less than 6 skips. Moving in the other direction requires almost 3 times as many skips. This is also reflected in Fig. 3.22. Specifically, jazz guitar to bossa nova works well because jazz guitar is mainly confused with bossa nova. On the other hand bossa nova is confused with many other genres. The same can be observed, e.g., for the pair trance and euro-dance.

Fig. 3.23 shows where skips occur for UC-3 and heuristic D, and how often each genre was played per playlist position. In some cases during the transition phase (where genre A or B is accepted) basically only genre A is played. When the transition is enforced (after the 15th song in the playlist) the number of skips drastically increases. In other cases the transition works very nicely. An obvious direction for further improvement is to include a memory effect to allow the system to forget previous user choices. However, preliminary experiments conducted in this direction did not show significant improvements.

3.4.5 Implementation and Examples

To make the results more tangible, a Matlab-based interface was implemented (see Figure 3.24). One of the first findings was that some form of artist filter is needed to generate interesting playlists. An example is shown in Table 3.8. The seed song was “When the Saints Go Marching In”. The next songs in the list are all from the same artist. Table 3.9 shows the effect of a simple artist filter using the same seed. In particular, the filter enforces that the next n songs cannot be from the same artist. Here n was set to 3 (AF=3).

Tables 3.10 and 3.11 show how user feedback changes the playlists. In the case of Table 3.10 the feedback leads to a clear improvement. In the case of Table 3.11 the system tries mixing a different genre into the playlist every time: first melodic metal is mixed into the list; after melodic metal is rejected, death metal is mixed into the list; after death metal is rejected, heavy metal is mixed into the list. At least it is obvious that the system is not repeating the same mistakes.
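This adaptive behavior can be obtained with a fairly simple selection rule over the accepted and rejected songs. The sketch below is only one plausible formulation and is not claimed to be the exact heuristic used here; distance() stands for an arbitrary song-level dissimilarity measure.

```python
# Sketch of skip-aware candidate selection: prefer candidates that are close
# to some accepted song and far from all rejected songs. This is an
# illustrative rule, not the exact heuristic evaluated in this section.

def next_song(candidates, accepted, rejected, distance):
    # 'accepted' is assumed to be non-empty (it contains at least the seed).
    def score(song):
        d_accepted = min(distance(song, a) for a in accepted)
        if not rejected:
            return d_accepted
        d_rejected = min(distance(song, r) for r in rejected)
        # smaller is better: close to accepted songs, far from rejected ones
        return d_accepted - d_rejected
    return min(candidates, key=score)
```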

Figure 3.24: Screenshot of the Matlab user interface. The skip button is located at the center left. The heuristics can be adjusted at the lower center; the artist filter can be set at the lower left. A “+” in front of a song marks an accepted song, a “−” marks a rejected one. The two main ways to start a playlist are either to randomly jump around in the collection until an acceptable seed is found, or to manually select a particular song.

+ A Cappella / Golden Gate Quartet / When the Saints Go Marching In
A Cappella / Golden Gate Quartet / I heard Zion moan
A Cappella / Golden Gate Quartet / Noah
A Cappella / Golden Gate Quartet / Dipsy Doodle
A Cappella / Golden Gate Quartet / Sampson
A Cappella / Golden Gate Quartet / Lead me on and on
...

Table 3.8: Example playlist where no artist filter was used.

+ A Cappella / Golden Gate Quartet / When the Saints Go Marching In
Blues / John Lee Hooker / Sugar Mama
Jazz Guitar / Martin Taylor / Polka Dots And Moonbeams
Jazz Guitar / Kenny Burrell / Well You Needn’t
Jazz / Dave Brubeck / Pennies From Heaven
Blues / John Lee Hooker / Thought I Heard
...

Table 3.9: Example where an artist filter was used (AF=3) with heuristic D.
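The artist filter itself only needs to remember the artists of the last n songs in the playlist. A minimal sketch follows; the dictionary-based song representation and the distance() function are assumptions for illustration, not the interface of the Matlab implementation.

```python
# Sketch of the artist filter (AF=n): a candidate is excluded if its artist
# occurs among the last n songs of the current playlist. Songs are assumed to
# be dicts with an 'artist' key; distance() is any song-level dissimilarity.

def passes_artist_filter(candidate, playlist, n=3):
    recent_artists = {song['artist'] for song in playlist[-n:]}
    return candidate['artist'] not in recent_artists

def pick_next(candidates, playlist, seed, distance, n=3):
    allowed = [c for c in candidates if passes_artist_filter(c, playlist, n)]
    pool = allowed or candidates  # fall back if the filter removes everything
    # here: simply the candidate most similar to the seed song
    return min(pool, key=lambda c: distance(c, seed))
```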

+ Acid Jazz / Jazzanova / Introspection
Punk / Rancid / Life Won’t Wait - Life Won’t Wait
Italian / Eros Ramazzotti / Dove c’è Musica - Yo sin Ti
...
+ Acid Jazz / Jazzanova / Introspection
− Punk / Rancid / Life Won’t Wait - Life Won’t Wait
Electronic / Kaito / Scene
...

Table 3.10: Example of how user feedback improves the playlist using heuristic D and AF=3. The “−” marks a rejected song; the “+” marks an accepted song.

3.4.6 Conclusions

This section described an approach to dynamically create playlists based on the user’s skipping behavior. The approach was evaluated using hypothetical use cases for which specific behavior patterns were assumed. Compared to the approach suggested in [Log02], heuristic D reduces the number of skips drastically. In some of the cases the necessary number of skips seems low enough for a real-world application. The main limitation of the evaluation is that no artist filter was used (to avoid having a large number of pieces from the same artist right after each other in a playlist). A filter was not implemented due to the small number of artists per genre. The heuristic depends most of all on the similarity measure; any improvement there would lead to fewer skips. However, implementing memory effects (to forget past decisions of the user) or allowing the similarity measure to adapt to the user’s behavior are also interesting directions. For use cases related to changing user preferences a key issue might be to track the direction of this change. Incorporating additional information such as web-based artist similarity, or modeling the user’s context more accurately (based on data from long-term usage), are other options.

Although evaluations based on hypothetical use cases seem to be sufficient for the current development state, experiments with human listeners will be necessary in the long run. However, this would require implementing a number of additional functions. For example, it is necessary to allow the users to adjust how much variance they desire. Basically the question is how many songs should be played until the same pieces, or different pieces from the same artists, are repeated. In addition, different types of skips, tools to manage the user’s feedback history, and tools to manage different radio stations (similar to the interface implemented by pandora.com) are desirable functionalities.

+ Punk / Bad Religion / The Empire Strikes First - The Quickening
+ Punk / Green Day / Nimrod - Haushinka
Melodic Metal / Nightwish / Once - Wish I Had an Angel
Melodic Metal / Stratovarious / Elements Part II - Dreamweaver
... (the next 10 include 2 melodic metal, 1 death metal and 7 punk songs)
+ Punk / Bad Religion / The Empire Strikes First - The Quickening
+ Punk / Green Day / Nimrod - Haushinka
− Melodic Metal / Nightwish / Once - Wish I Had an Angel
Death Metal / Borknagar / Epic - Cyclus
Punk / Bad Religion / The Empire Strikes First - All There Is
... (the next 10 include 2 death metal and 8 punk songs)
+ Punk / Bad Religion / The Empire Strikes First - The Quickening
+ Punk / Green Day / Nimrod - Haushinka
− Melodic Metal / Nightwish / Once - Wish I Had an Angel
− Death Metal / Borknagar / Epic - Cyclus
Punk / Bad Religion / The Empire Strikes First - All There Is
Punk / Green Day / Nimrod - Jinx
Punk / Rancid / Life Won’t Wait - 1998
... (the next 10 include 3 heavy metal and 1 Italian and 7 punk songs)

Table 3.11: Example of how user feedback affects the playlist using heuristic D and AF=3.

3.5 Conclusions

In this chapter, three different applications of music similarity measures were described. The first application demonstrated how a metaphor of geographic maps can be used as an interface which supports the exploration of music collections. The interface allows the user to browse different views. Each view is defined by a specific aspect of similarity (or a combination of different aspects). The second application demonstrated how web-based similarity can be used to hierarchically organize a music collection at the artist level into overlapping groups. In addition, each group was summarized to allow the user to easily identify interesting groups without requiring the user to know any of the artists’ names. The third application demonstrated how minimal user interaction can be used to generate playlists. In particular, based on the user’s skipping behavior the playlist is dynamically updated.

Each of these applications has its limitations. The primary limitation is the performance of the similarity measure. However, each application deals differently with this limitation. The Islands of Music application gives the user the responsibility of fine-tuning the similarity measure. This would work if the aspects of similarity the user is asked to mix were more meaningful. The overlap in the hierarchical organization increases the number of pieces considered similar to any given piece and thus avoids omitting similar pieces, at the cost of presenting the user more pieces than necessary. However, the main advantage is that the user is not given a wrong impression about the accuracy of the system. For playlist generation a certain amount of variance is very desirable; a perfect similarity measure is not necessary to create interesting results. However, it would be important to improve the similarity measures to the point that no unacceptable matches are returned.

Future work includes combining different approaches (for example, audio- and web-based similarity). Furthermore, bringing the applications to a level which allows user studies is highly desirable. Finally, in the long run, integrating these functionalities into commonly used tools (e.g. developing iTunes or Winamp plugins) would allow realistic tests and seems promising in some cases.

Chapter 4

Conclusions

In this thesis computational models of music similarity and their application were described. Chapter 2 gave an introduction to similarity measures. A thorough evaluation of state-of-the-art techniques was presented. Particular precautions were taken to avoid overfitting. The evaluation procedures, and especially the use of an artist filter, are highly recommended for future evaluations. Furthermore, the results of the presented listening test justify the use of genre-classification-based evaluations for similarity measures (in particular, to efficiently evaluate large parameter spaces). One of the outcomes of Chapter 2 is a combined similarity measure which performs significantly better than the baseline technique on most collections and exhibits fewer anomalies in the similarity space.

Chapter 3 described three applications of similarity measures. These applications demonstrated how music collections can be visualized, organized hierarchically, and summarized with words found on web pages, and how playlists can be generated with minimal user interaction.

Outlook and Future Work

Some of the techniques described in this thesis are ready for deployment. Audio-based similarity measures produce acceptable results in most cases. As add-on functionality they can be integrated into various applications such as media players. Specifically, the dynamic playlist generation application described in Chapter 3 is almost at the point where its integration into common media players makes sense. Nevertheless, a large number of research questions remain open.

An obvious direction for future work is to improve the similarity measure. One option is to improve the features. For example, the “Noisiness” feature described in Chapter 2 can easily be improved. In addition, there are a large number of features which could be tried instead of the ones used in this thesis. In the long run, replacing the low-level audio statistics with high-level features is very desirable. Another option to improve the similarity measure is to improve the combination; the linear combination used in this thesis is surely not the best solution. Furthermore, an important direction is the further reduction of computation times and the development of techniques to deal with music collections containing millions of pieces.

Interesting directions for future work also include alternative approaches to applying computational models of music similarity. In addition, each of the three applications described in Chapter 3 offers a lot of room for improvement. A particularly interesting direction is the combination of information from different sources (e.g. audio-based analysis combined with information from the web). Finally, user studies are needed to verify various assumptions made about user requirements and behavior.

Bibliography

[AHH+03] Eric Allamanche, Jürgen Herre, Oliver Hellmuth, Thorsten Kastner, and Christian Ertel, A Multiple Feature Model for Musical Similarity Retrieval, Proceedings of the ISMIR International Conference on Music Information Retrieval (Baltimore, MD), 2003, pp. 217–218.

[AP02a] Jean-Julien Aucouturier and François Pachet, Music Similarity Measures: What’s the Use?, Proceedings of the ISMIR International Conference on Music Information Retrieval (Paris, France), 2002, pp. 157–163.

[AP02b] , Scaling up music playlist generation, Proceedings of the IEEE International Conference on Multimedia and Expo (Lausanne, Switzerland), 2002.

[AP03] , Representing Musical Genre: A State of the Art, Journal of New Music Research 32 (2003), no. 1, 83–93.

[AP04a] , Improving Timbre Similarity: How high is the sky?, Journal of Negative Results in Speech and Audio Sciences 1 (2004), no. 1, http://journal.speech.cs.cmu.edu/articles/2004/3.

[AP04b] , Tools and Architecture for the Evaluation of Similarity Mea- sures: Case Study of Timbre Similarity, Proceedings of the ISMIR Interna- tional Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 198–203.

[AT01] Masoud Alghoniemy and Ahmed H. Tewfik, A network flow model for playlist generation, Proceedings of the IEEE International Conference Mul- timedia and Expo (Tokyo, Japan), 2001.

[Bau05] Stephan Baumann, Artificial Listening Systems: Modellierung und Approximation der subjektiven Perzeption von Musikähnlichkeit (Models for the approximation of the subjective perception of music similarity), Ph.D. thesis, Technical University of Kaiserslautern, Germany, 2005.

[BCE05] James Bergstra, Norman Casagrande, and Douglas Eck, Genre Classifi- cation: Timbre- and Rhythm-Based Multiresolution Audio Classification, Proceedings of the MIREX Annual Music Information Retrieval eXchange (London), 2005.

[BDA+05] Juan Pablo Bello, Laurent Daudet, Samer Abdallah, Chris Duxbury, Mike Davies, and Mark B. Sandler, A Tutorial on Onset Detection in Music Signals, IEEE Transactions on Speech and Audio Processing 13 (2005), no. 5, 1035–1047.

[BEL03] Adam Berenzweig, Dan P.W. Ellis, and Steve Lawrence, Anchor Space for Classification and Similarity Measurement of Music, Proceedings of the IEEE International Conference on Multimedia and Expo (Baltimore, MD), 2003, pp. 29–32. [BH03] Stephan Baumann and Oliver Hummel, Using Cultural Metadata for Artist Recommendation, Proceedings of the WedelMusic Conference (Leeds, UK), 2003, pp. 138–141.

[BH04] Stephan Baumann and John Halloran, An Ecological Approach to Multi- modal Subjective Music Similarity Perception, Proceedings of the Confer- ence on Interdisciplinary Musicology (Graz, Austria), 2004.

[Bis95] Christopher M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, 1995.

[BLEW03] Adam Berenzweig, Beth Logan, Daniel P.W. Ellis, and Brian Whitman, A Large-Scale Evaluation of Acoustic and Subjective Music Similarity Mea- sures, Proceedings of the ISMIR International Conference on Music Infor- mation Retrieval (Baltimore, MD), 2003, pp. 99–105.

[BM01] Ella Bingham and Heikki Mannila, Random Projection in Dimensionality Reduction: Applications to Image and Text Data, Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Min- ing, 2001, pp. 245–250.

[BMCM95] Frédéric Bimbot, Ivan Magrin-Chagnolleau, and Luc Mathan, Second-Order Statistical Measures for Text-Independent Speaker Identification, Speech Communication 17 (1995), no. 1-2, 177–192.

[BP03] Stephan Baumann and Tim Pohle, A Comparison of Music Similarity Mea- sures for a P2P Application, Proceedings of the International Conference on Digital Audio Effects (London), 2003, pp. 285–288.

[BP05] Juan P. Bello and Jeremy Pickens, A Robust Mid-level Representation for Harmonic Content in Music Signals, Proceedings of the ISMIR Interna- tional Conference on Music Information Retrieval (London), 2005, pp. 304– 311.

[BPS04] Stephan Baumann, Tim Pohle, and Vembu Shankar, Towards a Socio- cultural Compatibility of MIR Systems, Proceedings of the ISMIR Interna- tional Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 460–465.

[BSW98] Christopher M. Bishop, Markus Svensén, and Christopher K. I. Williams, GTM: The Generative Topographic Mapping, Neural Computation 10 (1998), no. 1, 215–234.

[BWD02] Kevin W. Boyack, Brian N. Wylie, and George S. Davidson, Domain Visu- alization Using VxInsight for Science and Technology Management, Journal of the American Society for Information Science and Technology 53 (2002), no. 9, 764–774.

[CEK05] Norman Casagrande, Douglas Eck, and Balázs Kégl, Frame-Level Audio Feature Extraction Using AdaBoost, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 345–350. [CKGB02] Pedro Cano, Martin Kaltenbrunner, Fabien Gouyon, and Eloi Batlle, On the Use of FastMap for Audio Retrieval and Browsing, Proceedings of the ISMIR International Conference on Music Information Retrieval (Paris, France), 2002, pp. 275–276.

[CKW+05] Pedro Cano, Markus Koppenberger, Nicolas Wack, José Pedro Garcia, Jaume Masip, Òscar Celma, David García, Emilia Gómez, Fabien Gouyon, Enric Guaus, Perfecto Herrera, Jordi Masseguer, Bee Suan Ong, Miguel Ramirez, Sebastian Streich, and Xavier Serra, An Industrial-Strength Content-based Music Recommendation System, Proceedings of 28th Annual International ACM SIGIR Conference (Salvador, Brazil), 2005, p. 673.

[CPL94] Piero Cosi, Giovanni De Poli, and Giampaolo Lauzzana, Auditory Modeling and Self-Organizing Neural Networks for Timbre Classification, Journal of New Music Research 23 (1994), 71–98.

[CPZ97] Paolo Ciaccia, Marco Patella, and Pavel Zezula, M-tree: An Efficient Ac- cess Method for Similarity Search in Metric Spaces, Proceedings of the In- ternational Conference on Very Large Data Bases (Athens, Greece), 1997, pp. 426–435.

[CRH05] Òscar Celma, Miguel Ramírez, and Perfecto Herrera, Foafing the music: A music recommendation system based on RSS feeds and user preferences, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 464–467.

[DFT04] J. Stephen Downie, Joe Futrelle, and David Tcheng, The International Mu- sic Information Retrieval Systems Evaluation Laboratory: Governance, Ac- cess and Security, Proceedings of the ISMIR International Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 9–14.

[DGW04] Simon Dixon, Fabien Gouyon, and Gerhard Widmer, Towards Characteri- sation of Music via Rhythmic Patterns, Proceedings of the ISMIR Interna- tional Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 509–516.

[DMR00] Michael Dittenbach, Dieter Merkl, and Andreas Rauber, The Growing Hi- erarchical Self-Organizing Map, Proceedings of the International Joint Con- ference on Neural Networks (Como, Italy), vol. VI, IEEE Computer Society, 2000, pp. 15–19.

[DPW03] Simon Dixon, Elias Pampalk, and Gerhard Widmer, Classification of Dance Music by Periodicity Patterns, Proceedings of the ISMIR International Conference on Music Information Retrieval (Baltimore, MD, USA), 2003, pp. 159–166.

[EWBL02] Daniel Ellis, Brian Whitman, Adam Berenzweig, and Steve Lawrence, The Quest For Ground Truth in Musical Artist Similarity, Proceedings of the ISMIR International Conference on Music Information Retrieval (Paris, France), 2002. [Fas82] Hugo Fastl, Fluctuation Strength and Temporal Masking Patterns of Amplitude-Modulated Broad-Band Noise, Hearing Research 8 (1982), 59– 69.

[FB01] Mikael Fernström and Eoin Brazil, Sonic Browsing: An Auditory Tool for Multimedia Asset Management, Proceedings of the International Conference on Auditory Display (Espoo, Finland), 2001, pp. 132–135.

[FG94] Bernhard Feiten and Stefan Günzel, Automatic Indexing of a Sound Database Using Self-Organizing Neural Nets, Computer Music Journal 18 (1994), no. 3, 53–65.

[Foo97] Jonathan T. Foote, Content-Based Retrieval of Music and Audio, Proceed- ings of SPIE Multimedia Storage and Archiving Systems II (Bellingham, WA), vol. 3229, SPIE, 1997, pp. 138–147.

[Foo99] Jonathan T. Foote, Visualizing Music and Audio using Self-Similarity, Pro- ceedings of ACM Multimedia ’99 (Orlando, Florida, USA), October 1999, pp. 77–80.

[FR01] Markus Frühwirth and Andreas Rauber, Self-Organizing Maps for Content-Based Music Clustering, Proceedings of the Twelfth Italian Workshop on Neural Nets (Vietri sul Mare, Salerno, Italy), IIAS, 2001.

[Frü01] Markus Frühwirth, Automatische Analyse und Organisation von Musikarchiven (Automatic Analysis and Organization of Music Archives), Master’s thesis, Vienna University of Technology, Austria, 2001.

[GB05] Emilia Gómez and Jordi Bonada, Tonality visualization of polyphonic audio, Proceedings of International Computer Music Conference 2005 (Barcelona, Spain), 2005, pp. 57–60.

[GDPW04] Fabien Gouyon, Simon Dixon, Elias Pampalk, and Gerhard Widmer, Eval- uating rhythmic descriptors for musical genre classification, Proceedings of the AES 25th International Conference (London, UK), 2004.

[GG05] Masataka Goto and Takayuki Goto, Musicream: New Music Playback In- terfaces for Streaming, Sticking, Sorting, and Recalling Musical Pieces, Pro- ceedings of the ISMIR International Conference on Music Information Re- trieval (London), 2005, pp. 404–411.

[GHNO03] Masataka Goto, Hiroki Hashiguchi, Takuichi Nishimura, and Ryuichi Oka, RWC Music Database: Music Genre Database and Musical Instrument Sound Database, Proceedings of the ISMIR International Conference on Music Information Retrieval (Baltimore, MD), 2003, pp. 229–230.

[GLCS95] Asif Ghias, Jonathan Logan, David Chamberlin, and Brian C. Smith, Query by Humming: Musical Information Retrieval in an Audio Database, Proceedings of the 3rd ACM International Conference on Multimedia (San Francisco, CA), 1995, pp. 231–236. [Got03] Masataka Goto, A Chorus-Section Detecting Method for Musical Audio Signals, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. V, 2003, pp. 437–440.

[Gou05] Fabien Gouyon, A computational approach to rhythm description — Au- dio features for the computation of rhythm periodicity functions and their use in tempo induction and music content processing, Ph.D. thesis, Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain, 2005.

[GPW04] Werner Goebl, Elias Pampalk, and Gerhard Widmer, Exploring Expressive Performance Trajectories: Six Famous Pianists Play Six Chopin Pieces, Proceedings of the 8th International Conference on Music Perception and Cognition (ICMPC8) (Adelaide, Australia), 2004, pp. 505–509.

[Har96] William H. Hartmann, Signals, Sound, and Sensation, AIP Press, 1996.

[HSG04] Perfecto Herrera, Vegard Sandvold, and Fabien Gouyon, Percussion-Related Semantic Descriptors of Music Audio Files, Proceedings of the AES 25th International Conference (London, UK), 2004.

[Jeh05] Tristan Jehan, Hierarchical Multi-class Self Similarities, Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio Acoustics (New Paltz, NY), 2005.

[Kas98] Samuel Kaski, Dimensionality reduction by random mapping, Proceedings of the International Joint Conference on Neural Networks, vol. 1, 1998, pp. 413–418.

[Kas99] , Fast Winner Search for SOM-Based Monitoring and Retrieval of High-Dimensional Data, Proceedings of the Ninth International Conference on Artificial Neural Networks (London), vol. 2, 1999, pp. 940–945.

[KHLK98] Samuel Kaski, Timo Honkela, Krista Lagus, and Teuvo Kohonen, WEBSOM – Self-Organizing Maps of Document Collections, Neurocomputing 21 (1998), 101–117.

[KL05] David Kusek and Gerd Leonhard, The Future of Music, Berklee Press, 2005.

[Kne04] Peter Knees, Automatische Klassifikation von Musikkünstlern basierend auf Web-Daten (Automatic Classification of Music Artists Based on Web Data), Master’s thesis, Vienna University of Technology, 2004.

[KO90] Pasi Koikkalainen and Erkki Oja, Self-Organizing Hierarchical Feature Maps, Proceedings of the International Joint Conference on Neural Net- works (San Diego, CA), 1990.

[Koh82] Teuvo Kohonen, Self-Organizing Formation of Topologically Correct Feature Maps, Biological Cybernetics 43 (1982), 59–69.

[Koh01] , Self-Organizing Maps, 3rd ed., Springer Series in Information Sci- ences, vol. 30, Springer, Berlin, 2001. [KPW04] Peter Knees, Elias Pampalk, and Gerhard Widmer, Artist Classification with Web-based Data, Proceedings of the ISMIR International Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 517–524.

[KW78] Joseph B. Kruskal and Myron Wish, Multidimensional Scaling, Paper Se- ries on Quantitative Applications in the Social Sciences, no. 07-011, Sage Publications, Newbury Park, CA, 1978.

[LEB03] Beth Logan, Daniel P.W. Ellis, and Adam Berenzweig, Toward Evaluation Techniques for Music Similarity, Proceedings of the SIGIR Workshop on the Evaluation of Music Information Retrieval Systems (Toronto, Canada), 2003.

[LG99] Steve Lawrence and C. Lee Giles, Accessibility of Information on the Web, Nature 400 (1999), no. 6740, 107–109.

[LK99] Krista Lagus and Samuel Kaski, Keyword Selection Method for Charac- terizing Text Document Maps, Proceedings of ICANN99, Ninth Interna- tional Conference on Artificial Neural Networks (London), vol. 1, IEE, 1999, pp. 371–376.

[LKM04] Beth Logan, Andrew Kositsky, and Pedro Moreno, Semantic Analysis of Song Lyrics, Proceedings of IEEE International Conference on Multimedia and Expo 2004 (Taipei, Taiwan), 2004.

[Log00] Beth Logan, Mel Frequency Cepstral Coefficients for Music Modeling, Pro- ceedings of the ISMIR International Symposium on Music Information Re- trieval (Plymouth, MA), 2000.

[Log02] , Content-Based Playlist Generation: Exploratory Experiments, Pro- ceedings of the ISMIR International Conference on Music Information Re- trieval (Paris, France), 2002, pp. 295–296.

[Log04] , Music Recommendation from Song Sets, Proceedings of the IS- MIR International Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 425–428.

[LOL03] Tao Li, Mitsunori Ogihara, and Qi Li, A Comparative Study on Content-based Music Genre Classification, Proceedings of the ACM SIGIR International Conference on Research and Development in Information Retrieval (Toronto, Canada), 2003, pp. 282–289.

[LR05] Thomas Lidy and Andreas Rauber, Evaluation of Feature Extractors and Psycho-Acoustic Transformations for Music Genre Classification, Proceed- ings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 34–41.

[LS01] Beth Logan and Ariel Salomon, A Music Similarity Function Based on Signal Analysis, Proceedings of the IEEE International Conference on Multimedia and Expo (Tokyo, Japan), 2001. [Lüb05] Dominik Lübbers, SoniXplorer: Combining Visualization and Auralization for Content-Based Exploration of Music Collections, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 590–593.

[Mac67] J. MacQueen, Some Methods for Classification and Analysis of Multivariate Observations, Proceedings of the Fifth Berkeley Symposium on Mathemat- ical Statistics and Probability (Berkeley and Los Angeles, CA) (L. M. Le Cam and J. Neyman, eds.), Statistics, vol. I, University of California Press, 1967, pp. 281–297.

[MB03] Martin F. McKinney and Jeroen Breebaart, Features for Audio and Music Classification, Proceedings of the ISMIR International Conference on Music Information Retrieval (Baltimore, MD), 2003, pp. 151–158.

[ME05] Michael Mandel and Dan Ellis, Song-Level Features and Support Vector Machines for Music Classification, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 594–599.

[MHV03] Pedro J. Moreno, Purdy P. Ho, and Nuno Vasconcelos, A Kullback-Leibler Divergence Based Kernel for SVM Classification in Multimedia Applica- tions, Proceedings of Neural Information Processing Systems (Vancouver, Canada), 2003.

[Mii90] Risto Miikkulainen, Script Recognition with Hierarchical Feature Maps, Connection Science 2 (1990), 83–101.

[MKC05] Meinard Müller, Frank Kurth, and Michael Clausen, Audio Matching via Chroma-Based Statistical Features, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 288–295.

[Moe02] Dirk Moelants, Preferred Tempo Reconsidered, Proceedings of the 7th In- ternational Conference on Music Perception and Cognition (Sydney), 2002, pp. 580–583.

[MR00] Dieter Merkl and Andreas Rauber, Document Classification with Un- supervised Neural Networks, Soft Computing in Information Retrieval (F. Crestani and G. Pasi, eds.), Physica Verlag, 2000, pp. 102–121.

[MST05] Anders Meng and John Shawe-Taylor, An Investigation of Feature Models for Music Genre Classification Using the Support Vector Classifier, Proceed- ings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 604–609.

[MUNS05] Fabian Mörchen, Alfred Ultsch, Mario Nöcker, and Christian Stamm, Databionic Visualization of Music Collections According to Perceptual Distance, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 396–403.

[Nab01] Ian Nabney, Netlab: Algorithms for Pattern Recognition, Advances in Pat- tern Recognition, Springer, Berlin, 2001. [NDR05] Robert Neumayer, Michael Dittenbach, and Andreas Rauber, PlaySOM and PocketSOMPlayer, Alternative Interfaces to Large Music Collections, Pro- ceedings of the ISMIR International Conference on Music Information Re- trieval (London), 2005, pp. 618–623.

[OH04] Bee-Suan Ong and Perfecto Herrera, Computing Structural Descriptions of Music through the Identification of Representative Excerpts from Audio Files, Proceedings of 25th International AES Conference (London), 2004.

[Opp69] Alan V. Oppenheim, A speech analysis-synthesis system based on homomorphic filtering, Journal of the Acoustical Society of America 45 (1969), 458–465.

[Pam01] Elias Pampalk, Islands of Music: Analysis, Organization, and Visualization of Music Archives, Master’s thesis, Vienna University of Technology, Department of Software Technology and Interactive Systems, 2001.

[Pam03] , Aligned Self-Organizing Maps, Proceedings of the Workshop on Self-Organizing Maps (WSOM’03) (Kitakyushu, Japan), Kyushu Institute of Technology, 2003, pp. 185–190.

[Pam04] Elias Pampalk, A Matlab Toolbox to Compute Music Similarity From Audio, Proceedings of the ISMIR International Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 254–257.

[Pam05] Elias Pampalk, Speeding Up Music Similarity, Proceedings of the MIREX Annual Music Information Retrieval eXchange (London), 2005.

[PC00] François Pachet and Daniel Cazaly, A Taxonomy of Musical Genres, Proceedings of RIAO 2000 Content-Based Multimedia Information Access (Paris, France), 2000.

[PDW03a] Elias Pampalk, Simon Dixon, and Gerhard Widmer, Exploring Music Collections by Browsing Different Views, Proceedings of the ISMIR International Conference on Music Information Retrieval (Baltimore, MD), Johns Hopkins University, 2003, pp. 201–208.

[PDW03b] , On the Evaluation of Perceptual Similarity Measures for Music, Proceedings of the Sixth International Conference on Digital Audio Effects (DAFx-03) (London, UK), 2003, pp. 7–12.

[PDW04] , Exploring Music Collections by Browsing Different Views, Com- puter Music Journal 28 (2004), no. 2, 49–62.

[PE02] Steffen Pauws and Berry Eggen, PATS: Realization and User Evaluation of an Automatic Playlist Generator, Proceedings of the ISMIR International Conference on Music Information Retrieval (Paris, France), 2002.

[PFW05a] Elias Pampalk, Arthur Flexer, and Gerhard Widmer, Hierarchical Organi- zation and Description of Music Collections at the Artist Level, Proceedings of the 9th European Conference on Research and Advanced Technology for Digital Libraries (Vienna, Austria), 2005, pp. 37–48. [PFW05b] , Improvements of Audio-Based Music Similarity and Genre Clas- sification, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 628–633.

[PGW03] Elias Pampalk, Werner Goebl, and Gerhard Widmer, Visualizing Changes in the Structure of Data for Exploratory Feature Selection, Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Washington DC), 2003.

[PHH04] Elias Pampalk, Peter Hlavac, and Perfecto Herrera, Hierarchical Organiza- tion and Visualization of Drum Sample Libraries, Proceedings of the 7th International Conference on Digital Audio Effects (Naples, Italy), 2004, pp. 378–383.

[Poh05] Tim Pohle, Extraction of Audio Descriptors and their Evaluation in Music Classification Tasks, Master’s thesis, Technische Universität Kaiserslautern, Austrian Research Institute for Artificial Intelligence (OFAI), Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), 2005.

[PPW05a] Elias Pampalk, Tim Pohle, and Gerhard Widmer, Dynamic Playlist Gener- ation Based on Skipping Behaviour, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 634–637.

[PPW05b] Tim Pohle, Elias Pampalk, and Gerhard Widmer, Evaluation of Frequently Used Audio Features for Classification of Music into Perceptual Categories, Proceedings of the Fourth International Workshop on Content-Based Mul- timedia Indexing (CBMI’05) (Riga, Latvia), 2005.

[PPW05c] , Generating Similarity-Based Playlists Using Traveling Salesman Algorithms, Proceedings of the International Conference on Digital Audio Effects (Madrid, Spain), 2005, pp. 220–225.

[PRC00] François Pachet, Pierre Roy, and Daniel Cazaly, A Combinatorial Approach to Content-Based Music Selection, IEEE Multimedia 7 (2000), no. 1, 44–51.

[PRM02a] Elias Pampalk, Andreas Rauber, and Dieter Merkl, Content-Based Organi- zation and Visualization of Music Archives, Proceedings of the ACM Mul- timedia (Juan les Pins, France), 2002, pp. 570–579.

[PRM02b] , Using Smoothed Data Histograms for Cluster Visualization in Self- Organizing Maps, Proceedings of the International Conference on Artifical Neural Networks (ICANN’02) (Madrid, Spain), 2002, pp. 871–876.

[PTVF92] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, Cam- bridge University Press, UK, 1992.

[Pvd05] Steffen Pauws and Sander van de Wijdeven, User Evaluation of a New Interactive Playlist Generation Concept, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 638–643. [PWC04] Elias Pampalk, Gerhard Widmer, and Alvin Chan, A New Approach to Hierarchical Clustering and Structuring of Data with Self-Organizing Maps, Journal of Intelligent Data Analysis 8 (2004), no. 2, 131–149.

[PWL01] François Pachet, Gert Westerman, and Damien Laigre, Musical Data Mining for Electronic Music Distribution, Proceedings of the WedelMusic Conference, 2001.

[RAPB05] Pierre Roy, Jean-Julien Aucouturier, François Pachet, and Anthony Beurivé, Exploiting the Tradeoff Between Precision and Cpu-Time to Speed Up Nearest Neighbor Search, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 230–237.

[Rau99] Andreas Rauber, LabelSOM: On the Labeling of Self-Organizing Maps, Pro- ceedings of the International Joint Conference on Neural Networks, (Wash- ington, DC), 1999.

[RJ93] Lawrence R. Rabiner and Biing-Hwang Juang, Fundamentals of Speech Recognition, Prentice, Englewood Cliffs, NJ, 1993.

[RPM02] Andreas Rauber, Elias Pampalk, and Dieter Merkl, Using Psycho-Acoustic Models and Self-Organizing Maps to Create a Hierarchical Structuring of Music by Sound Similarities, Proceedings of the ISMIR International Con- ference on Music Information Retrieval (Paris, France), 2002, pp. 71–80.

[RTG00] Yossi Rubner, Carlo Tomasi, and Leonidas J. Guibas, The earth mover’s distance as a metric for image retrieval, International Journal of Computer Vision 40 (2000), no. 2, 99–121.

[Sam69] John W. Sammon, A Nonlinear Mapping for Data Structure Analysis, IEEE Transactions on Computers 18 (1969), 401–409.

[SB88] Gerard Salton and Chris Buckley, Term-weighting approaches in automatic text retrieval, Information Processing and Management 24 (1988), no. 5, 513–523.

[Sch03] Markus Schedl, An Explorative, Hierarchical User Interface to Structured Music Repositories, Master’s thesis, Vienna University of Technology, Institut für Medizinische Kybernetik und Artificial Intelligence der Universität Wien, 2003.

[SK05] Richard Stenzel and Thomas Kamps, Improving Content-based Similarity Measures by Training a Collaborative Model, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 264–271.

[Sku04] André Skupin, The World of Geography: Mapping a Knowledge Domain with Cartographic Means, Proceedings of the National Academy of Sciences 101 (2004), no. 1, 5274–5278. [SKW05a] Markus Schedl, Peter Knees, and Gerhard Widmer, Discovering and Visualizing Prototypical Artists by Web-Based Co-Occurrence Analysis, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 21–28.

[SKW05b] , A Web-Based Approach to Assessing Artist Similarity Using Co- occurrences, Proceedings of the Workshop on Content-Based Multimedia Indexing, 2005.

[SP01] Christian Spevak and Richard Polfreman, Sound Spotting – A Frame-Based Approach, Proceedings of the ISMIR International Symposium of Music Information Retrieval (Bloomington, IN), 2001, pp. 35–36.

[SVN37] Stanley Smith Stevens, John Volkman, and Edwin B. Newman, A scale for the measurement of the psychological magnitude of pitch, Journal of the Acoustical Society of America 8 (1937), 185–190.

[SZ05] Nicolas Scaringella and Giorgio Zoia, On the Modeling of Time Information for Automatic Genre Recognition Systems in Audio Signals, Proceedings of the ISMIR International Conference on Music Information Retrieval (Lon- don), 2005, pp. 666–671.

[TC02] George Tzanetakis and Perry Cook, Musical Genre Classification of Audio Signals, IEEE Transactions on Speech and Audio Processing 10 (2002), no. 5, 293–302.

[Ter68] Ernst Terhardt, Über Akustische Rauhigkeit und Schwankungsstärke (On the Acoustic Roughness and Fluctuation Strength), Acustica 20 (1968), 215–224.

[THA04] Mark Torrens, Patrick Hertzog, and Josep-Lluís Arcos, Visualizing and Exploring Personal Music Libraries, Proceedings of the ISMIR International Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 421–424.

[TWV05] Rainer Typke, Frans Wiering, and Remco C. Veltkamp, A Survey of Mu- sic Information Retrieval Systems, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 153–160.

[US90] Alfred Ultsch and H. Peter Siemon, Kohonen’s Self-Organizing Feature Maps for Exploratory Data Analysis, Proceedings of the International Neural Network Conference (INNC’90) (Dordrecht, Netherlands), Kluwer, 1990, pp. 305–308.

[vGV05] Rob van Gulik and Fabio Vignoli, Visual Playlist Generation on the Artist Map, Proceedings of the ISMIR International Conference on Music Infor- mation Retrieval (London), 2005, pp. 520–523.

[vGVvdW04] Rob van Gulik, Fabio Vignoli, and Huub van de Wetering, Mapping Mu- sic In The Palm Of Your Hand, Explore And Discover Your Collection, Proceedings of the ISMIR International Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 409–414. [VP05] Fabio Vignoli and Steffen Pauws, A Music Retrieval System Based on User- Driven Similarity and its Evaluation, Proceedings of the ISMIR Interna- tional Conference on Music Information Retrieval (London), 2005, pp. 272– 279.

[WC04] Kristopher West and Stephen Cox, Features and classifiers for the automatic classification of musical audio signals, Proceedings of the ISMIR Interna- tional Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 531–536.

[WC05] , Finding An Optimal Segmentation for Audio Genre Classification, Proceedings of the ISMIR International Conference on Music Information Retrieval (London), 2005, pp. 680–685.

[WE04] Brian Whitman and Dan Ellis, Automatic Record Reviews, Proceedings of the ISMIR International Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 470–477.

[WL02] Brian Whitman and Steve Lawrence, Inferring Descriptions and Similarity for Music from Community Metadata, Proceedings of the 2002 International Computer Music Conference (Göteborg, Sweden), 2002, pp. 591–598.

[WS02] Brian Whitman and Paris Smaragdis, Combining Musical and Cultural Fea- tures for Intelligent Style Detection, Proceedings of the 3rd International Conference on Music Information Retrieval (Paris, France), 2002, pp. 47– 52.

[YP97] Yiming Yang and Jan O. Pedersen, A comparative study on feature selection in text categorization, Proceedings of ICML-97, 14th International Confer- ence on Machine Learning (Nashville, US) (D. H. Fisher, ed.), Morgan Kaufman Publishers, San Francisco, US, 1997, pp. 412–420.

[ZF99] Eberhard Zwicker and Hugo Fastl, Psychoacoustics, Facts and Models, 2nd ed., Springer, Berlin, 1999.

[ZF04] Mark Zadel and Ichiro Fujinaga, Web Services for Music Information Re- trieval, Proceedings of the ISMIR International Conference on Music Infor- mation Retrieval (Barcelona, Spain), 2004, pp. 478–483.

[ZP04] Aymeric Zils and François Pachet, Automatic Extraction Of Music Descriptors From Acoustic Signals, Proceedings of the ISMIR International Conference on Music Information Retrieval (Barcelona, Spain), 2004, pp. 353–356.

Elias Pampalk

I was born in 1978 in Maputo (Mozambique). From 1996 to 2006 I studied computer science at the Vienna University of Technology. In 2001 I wrote my Master’s thesis on “Islands of Music: Analysis, Organization, and Visualization of Music Archives” [Pam01]. In 2006 I completed my studies with the defense of this doctoral thesis. From 2002 to 2006 I worked at the Austrian Research Institute for Artificial Intelligence (OFAI) under the supervision of Gerhard Widmer. From 2002 to 2003 I contributed to the project “Computer-Based Music Research: Artificial Intelligence Models of Musical Expression”. From 2004 to 2006 I contributed to the European project SIMAC (Semantic Interaction with Music Audio Contents).

The picture on the right shows the making of this thesis. Those two machines did most of the work and most of the figures and tables on the wall did not make it into the final version (they were part of the experiments described in Chapter 2). The picture was taken at my OFAI desk on January 6, 2006.