Here, and in Chronological Order, I Give Them My Eternal Gratitude
Sponsoring Committee: Professor Morwaread M. Farbood
Professor Juan P. Bello
Doctor Tristan Jehan

DISCOVERING STRUCTURE IN MUSIC: AUTOMATIC APPROACHES AND PERCEPTUAL EVALUATIONS

Oriol Nieto

Program in Music Technology
Department of Music and Performing Arts Professions

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Steinhardt School of Culture, Education, and Human Development
New York University
2015

Copyright © 2015 Oriol Nieto

To Amalia, Ana, Antonio and Juan.

ACKNOWLEDGEMENTS

This epic adventure would have been impossible to undertake without the help of many wonderful individuals who unconditionally shared their seemingly endless wisdom with me. Here, and in chronological order, I give them my eternal gratitude.

First of all, to family. Especially: my parents, who never gave up on me; my sister, who is probably the strongest person I know (I am so proud of you); and my grandparents, to whom I dedicate this work. To my half brother Daniel Bolsa Madrid, for teaching me Life since I was eight.

To my second family, also known as La Caverna, or The Old School, or Can Paco’s Crew, including, in alphabetical order: Santi Bonillo, Daniel Fàbregas, Daniel González, Bernat Maspons, Albert Mesas, Guillem and Arnau Mayoral, and Adrià Montanya. Special mention to Eduard and Roger Piqueras (Chewbacca!), Anna Mercader, and Sergi Mansilla, for showing me video games (and peach juice), how to deal with our parents’ divorce, and GNU/Linux, respectively.

To Maria Josep, my math teacher in high school, who made me learn (and enjoy!) Calculus. To Lluís, my music teacher in high school, who unraveled the world of The Beatles to me.

To Jordi Bosch, Pau Farell, and Gerson Gelabert for making me sing in Darkgeon, my first metal band.
To Carles Ferreiro, Jordi Llobet, Marc Prim, Felip Sánchez (miss you my friend), David Alarcón and Albert Comerma for all the Sargon years, which I look back on with pride, nostalgia, and happiness (I strongly believe that Vida is the best metal album in Catalan to date). To Dani “de Aquitania,” for inspiring me in all possible ways, and for the never-ending nights with Madee, Buses, Trains, Chairs, Volls, Cazallas, and Marsellas.

To Marc Alier for being the example of the teacher I would like to become and the ideal advisor for my Computer Science undergraduate thesis. To Miquel Barceló, for publishing Neal Stephenson in Spain.

To all my BEST friends, especially: João Terra (that train in Siberia will not ever be forgotten), Cemil Ozan, Páll Jens, the rest of the participants and organizers of the amazing course in Ekaterinburg; Marc Velasco, Alba Gil, Elena Guasch, Beta Foix, Erik Abner, and the rest of the LBG Barcelona. Europe got much smaller, my English got much better, and my desire to study abroad got much stronger with you around.

To all the wonderful people I met during the years of living in Barcelona, particularly: Amanda “Amandarina” Fernández, Leslie Cristobal, Sandra Maya, Ivan Rivero, Donato Lorenzo, Anna Basagaña, and the rest of València 477. Also thanks to all the guys at Madee, for giving me the opportunity to tour with them across Spain on several occasions.

To Justin Salamon (to whom, at the end of my very first day at the Music Technology Group as a master’s student, I asked: “what is a Fourier Transform?”), for being my colleague at the MTG, roommate in Barcelona, and friend in life. To Vassilis Pantazis, for showing me Meshuggah. To Elena Martínez, for teaching me the basics of signal processing. To Jordi Janer, for making me use Python. To Xavier Serra, for showing me the fascinating world of music technology, and inspiring me to pursue a Ph.D. To Jordi Bonada, for being the best advisor I could ever have had during my master’s thesis.
To La Caixa Fellowship, for giving me this once-in-a-lifetime opportunity to cross the Atlantic to keep studying my two passions, computers and music, in the best schools in the world. Thanks to my friends who also got this scholarship in 2008, especially: Ferran Masip, Franc Camps, Carlos Fernández, Daniel Climent, Jordi Graupera, Marta Martínez, Juan Astasio, Marta Fenollosa, Almudena Toral, Sara Cabal, Sergi Casanelles, Tomàs Peire, Juan Argote, and Pau Guinart. After having taken a little peek at your wonderfully bright minds, I still wonder why I obtained this fellowship.

To Professor Dimitar Deliysky and his beautiful family, who hosted me when I visited the University of South Carolina, and later gave me the opportunity to “scream” in front of the world experts on voice production at the gala dinner of the AQL conference.

To my professors at Stanford, particularly: Jonathan Abel, Julius O. Smith, Ge Wang, and Jonathan Berger. You have inspired me in so many ways that I still feel I have to somehow pay it back. Also, to my fellow students and friends: Nick Bryan, Roy Fejgin, Blair Kaneshiro, Nick Kruge, Puja Kumar, June Oh, Colin Raffel, Adam Sheppard, Adam Somers, and Sean Zhang. What a year we had together in California!

Special thanks to Jordan Rudess, an inspiration since I was sixteen, whom I now have the privilege to have as a friend (thanks for coming all the way to Stanford to play a couple of shows with us!). Thanks to him, I have met two of the best musicians I know: Eyal Amir and Eren Başbuğ. Collaborating with these titans has been such a powerful inspiration; I feel so lucky and thankful for having had this opportunity.

To my advisor at NYU, Mary Farbood, for all the many good hours encouraging and advising me in the pursuit of this Ph.D. This road would have been much more painful without you. To my “second” advisor, Juan P.
Bello, for challenging me in every possible way, and making me not only a much better researcher, but a much better human. To Dennis Shasha, for his strenuous class of heuristic problem solving, and all the pleasant and stimulating lunches together. To the rest of the professors at MARL, especially: Agnieszka Roginska, Tae Hong Park, Alex Ruthman, Tom Beyer, and Panayotis Mavromatis. To my fantastic C Programming students, who really teach me how to teach.

To my fellow students and friends at NYU: Areti Andreopoulou, Rachel Bittner, Braxton Boren, Taemin Cho, Jon Forsyth, Aron Glennon, Brian McFee, Michael Musick, Andrew Telichan, and Finn Upham. Very special thanks to Eric Humphrey. I doubt I would have finished this without him; it has been a pleasure to share the office and to have learned so much by his titanic side. Also big thanks to those of you who reviewed the “Catalanglish” of this document, I owe you so much (or so many beers, at the very least).

To the Caja Madrid scholarship, since I would not have been able to continue my Ph.D. without their financial help.

To the people at The Echo Nest, who treated me like one of them since the very first day of my internship. Especially Tristan Jehan (who gave me the opportunity in the first place, and from whom I learned so much as an engineer, researcher, and friend), Ruofeng Chen, Brian Whitman, Nicola Montechio, Hunter McCurry, Noura Howell, and Amanda Bulger. Thanks to Ava Vitali for the good and geeky times in Boston.

To Nathalie Alegre for making me a better human being in all imaginable aspects. I am so lucky and thankful to have such a great partner in life. Special thanks to her family, particularly to Isabel, Pablo, and El Nono, for making me feel at home for the two and a half months I was in Lima, where this document originated in June of 2014.
To my ISMIR colleagues, especially: Amélie Anglade, Eric Battenberg, Sebastian Böck, Michael Casey, Oscar Celma, Tom Collins, Emanuele Coviello, Sander Dieleman, Frederic Font, Masataka Goto, Philippe Hamel, Katie Kinnaird, Matthias Mauch, Matt McVicar, Geoffroy Peeters, Jan Schlüter, Erik Schmidt, Jeff Scott, Joan Serrà, Siddharth Sigtia, Moha Sordo, Jessica Thompson, and Aäron van den Oord. Special thanks to the “music structure analysis” reading group: Jordan Smith, Nanzhu Jiang, and Meinard Müller. ISMIR, in general, would not make sense without you.

To the guys at my current metal band, Midnight Blue: Seth, Jason, Giovani, and Jan. Now that I am about to become a doctor, we should have no problem saving the world with our music.

Finally, thanks to Thiru Kumar (“The Dosas Man”) for making the best food in New York City. Thanks to Bare Burger for the wonderful beers and, of course, the (now veggie) burgers. And thanks to Vim and LaTeX for letting me create this document without suffering as much as I would have without their existence.

You (and all of those whom I left out due to my terrible memory, sorry!) are all titans.
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER

I INTRODUCTION
    1 Scope of this Study
    2 Motivation
    3 Dissertation Outline
    4 Contributions
    5 Associated Publications by the Author
        5.1 Peer-Reviewed Articles
        5.2 Algorithms Submitted to MIREX

II REVIEW OF CURRENT APPROACHES AND EVALUATIONS
    1 Music Structure Analysis Review
        1.1 Music Information Retrieval
        1.2 Music Perception and Cognition
    2 Current Approaches
        2.1 Feature Extraction
        2.2 Tools for Discovering Structure
    3 Current Evaluations
        3.1 F-measure
        3.2 Boundaries Evaluation
        3.3 Structure Evaluation
        3.4 Music Segmentation Evaluation Criticism
        3.5 Pattern Discovery Evaluation
        3.6 mir_eval
    4 Summary

III MIR METHODS: MUSIC SUMMARIES AND PATTERNS
    1 Introduction
    2 Audio Representation
        2.1 Tracking the Beats
    3 Summarizing Music Using a Criterion
        3.1 Feature Quantization
        3.2 Defining an Audio Summary