Knowledge-Based Music Recommendation: Models, Algorithms and Exploratory Search

PHD THESIS In Partial Fulfilment of the Requirements for the Degree of Doctor of Philosophy from Sorbonne University Specialization: Data Science Knowledge-based Music Recommendation: Models, Algorithms and Exploratory Search Pasquale LISENA Defended on 11/10/2019 before a committee composed of: Reviewer Michel BUFFA, Université Côte d’Azur, INRIA, Sophia Antipolis, France Reviewer Mounia LALMAS, Spotify, University College London, United Kingdom Examiner Gaël RICHARD, TELECOM Paris, France Examiner Tommaso DI NOIA, Politecnico di Bari, Italy Examiner Pietro MICHIARDI, EURECOM, Sophia Antipolis, France Thesis Director Benoit HUET, EURECOM, Sophia Antipolis, France Thesis Co-Director Raphäel TRONCY, EURECOM, Sophia Antipolis, France Dedicated to my family Acknowledgements Firstly, I would like to sincerely thank my advisor Raphäel Troncy, for having strongly wanted me to start this PhD, for having directed and guided my research, for having constantly given value and importance to my work, and for having shared with me goals and responsibilities. I would like to thank the members of my research group for their feedback and cooperation, in particular Enrico Palumbo, which shared with me most of this journey. An acknowledgement goes to the members of the DOREMUS project, which it was a pleasure to work with. I would like to thank also the research group of VU Amsterdam which welcomed me in early 2019 in their vibrant research environment, and in particular Frank van Harmelen, Albert Meroño Peñuela and Ilaria Tiddi. A very special gratitude goes out to all colleagues at EURECOM which contributed to create a pleasant work environment, with a special thank to those that have become genuine friends for me here in France. Finally, I would like to thank my parents, for having always supported me and having repre- sented a standing point for my life. I would like to thank my grandpa Pasquale, the first of my supporters, that is for me a big source of motivation. Lastly, I am extremely grateful to Valeria, for having chosen to stay next to me and for having helped, sustained, inspired, and encouraged me. Sophia Antipolis, 11 October 2019 Pasquale Lisena i Abstract Representing information about music is a complex activity that involves different sub-tasks. This thesis mostly focuses on classical music, researching how to represent and exploit rich metadata. Our main goal is to investigate knowledge representation and discovery strategies applied to classical music, including research topics such as Knowledge-Base population, metadata prediction and recommender systems. We first propose a complete workflow for the management of music metadata using Semantic Web technologies. We introduce a specialised ontology and a set of controlled vocabularies for the different concepts specific to music. Then, we present an approach for converting data, in order to go beyond the librarian practice currently in use, relying on mapping rules and interlinking with controlled vocabularies. Finally, we show how these data can be exploited. In particular, we study approaches based on embeddings computed on structured metadata, titles, and symbolic music for ranking and recommending music. Several demo applications have been realized for testing the previous approaches and resources. iii Abrégé Représenter l’information décrivant la musique est une activité complexe, qui implique dif- férentes sous-tâches. Cette thèse porte principalement sur la musique classique et étudie comment représenter et exploiter ces informations. L’objectif principal est l’étude de stratégies de représentation et de découverte de connaissances appliquées à la musique classique, dans des domaines tels que la production de bases de connaissances, la prédiction de métadon- nées et les systèmes de recommandation. Nous proposons tout d’abord une architecture pour la gestion des métadonnées de musique à l’aide des technologies du Web Sémantique. Nous introduisons une ontologie spécialisée et un ensemble de vocabulaires contrôlés pour les différents concepts spécifiques à la musique. Ensuite, nous présentons une approche de conversion des données, afin d’aller au-delà de la pratique bibliothécaire actuellement utilisée, en s’appuyant sur des règles d’appariement et sur l’interconnexion avec des vocabulaires contrôlés. Enfin, nous montrons comment ces données peuvent être exploitées. En particulier, nous étudions des approches basées sur des plongements calculés sur des métadonnées struc- turées, des titres et de la musique symbolique pour classer et recommander de la musique. Plusieurs applications de démonstration ont été réalisées pour tester les approches et les ressources produites. v Contents Acknowledgementsi Abstract iii List of Figures xi List of Tables xiii 1 Introduction 1 1.1 Motivation.........................................1 1.1.1 Track-based vs work-based approach.....................3 1.1.2 Status of Classical music metadata online..................4 1.1.3 Why recommend classical music........................6 1.2 Research context: the DOREMUS project.......................7 1.3 Research Questions....................................9 1.4 Summary of contributions................................9 1.5 Thesis outline....................................... 10 I Building a Music Graph 11 2 Related Work 15 2.1 Knowledge Representation in the Semantic Web................... 15 2.2 Music Ontologies in the literature........................... 16 2.2.1 Schema.org.................................... 18 2.3 Digital Libraries in the Semantic Web......................... 19 2.4 Data access........................................ 20 2.5 Conclusion......................................... 22 3 A Music Model 23 3.1 The DOREMUS ontology................................ 23 3.2 Mapping to Schema.org................................. 26 3.2.1 Choose the starting node............................ 28 vii Contents 3.2.2 Identify similar classes.............................. 28 3.2.3 Identify similar properties............................ 29 3.2.4 Simplify the graph................................ 29 3.2.5 Limits of the mapping.............................. 31 3.3 Evaluation......................................... 32 3.4 Conclusion......................................... 33 4 Controlled Vocabularies for Music Metadata 37 4.1 Music Vocabularies.................................... 38 4.1.1 Collection of interlinked vocabularies..................... 39 4.1.2 New vocabularies................................. 41 4.2 Modelling process.................................... 42 4.2.1 Vocabulary Alignments............................. 43 4.3 String2Vocabulary.................................... 44 4.4 Conclusion......................................... 45 5 Data Conversion 47 5.1 MARC and the librarian practice............................ 48 5.2 From MARC to RDF.................................... 50 5.3 A set of interlinked graph................................ 53 5.4 Conclusion......................................... 54 6 Developing smart Web APIs 57 6.1 Motivation......................................... 58 6.2 Related Work........................................ 60 6.3 The JSON query syntax.................................. 61 6.3.1 The prototype definition............................. 63 6.3.2 The root $-properties.............................. 64 6.4 Implementation...................................... 65 6.4.1 Integration in grlc and Tapas......................... 66 6.5 Evaluation......................................... 68 6.5.1 Quantitative evaluation............................. 68 6.5.2 User Survey.................................... 69 6.6 Conclusion and Future Work.............................. 71 II Exploit the Music Knowledge 73 7 Related Work 77 7.1 Music recommendation and metadata........................ 77 7.2 Recommender Systems and Knowledge Graphs................... 78 viii Contents 7.3 Recommending with Embeddings........................... 79 7.4 Context-based recommendation............................ 80 7.5 Symbolic Music for MIR................................. 80 7.6 Conclusion......................................... 81 8 Embeddings and Similarity 83 8.1 Feature Embeddings................................... 84 8.1.1 Music Embeddings................................ 85 8.1.2 Years and Places embeddings.......................... 86 8.2 Embeddings Combination................................ 88 8.3 Euclidean Similarity with Penalty............................ 89 8.4 Conclusion......................................... 91 9 Playlists and Weights 93 9.1 Real world data: concerts and playlists........................ 94 9.1.1 Dataset description................................ 94 9.1.2 Dimensions homogeneity inside playlists.................. 96 9.2 Weights for playlist ranking............................... 100 9.3 Evaluation......................................... 100 9.4 Conclusions and Future Work.............................. 102 9.5 The role of playlist title: Title2Rec........................... 102 9.5.1 Algorithm..................................... 103 9.5.2 Optimisation................................... 104 9.5.3 Results and Future Work............................. 106 9.6 The role of playlist emotions.............................. 106 9.6.1 Emotion Recognition in Song Lyrics...................... 107 9.6.2 Playlist Classification............................... 108 10 Explore 111 10.1 Explore the Music Graph with OVERTURE ....................... 111 10.2 Discover music in the

Knowledge-Based Music Recommendation: Models, Algorithms and Exploratory Search

Recommended Formats Statement 2019-2020

Troubling the Waters for Healing of the Church

The Music Encoding Initiative As a Document-Encoding Framework

Petrichor Magazine 20Th August 2019 Third Edition

Live Band Karaoke I Wanna Dance with Jenny Jenny 867-5309 Rolling Stones, the Steve Miller Band Somebody the Name Says It All

Volume X, Issue I

The Mainstream Source Production Music Catalog Is Very Proud to Include Outstanding Performances by Legendary Artists and Grammy Winners. Our Cutting

Praise & Worship

The Weekly Word Having Just Celebrated the 4Th of July and Our

Gulf Shores Passes Changes to Parasail Regulations

MAGNUM 28. März 2018 Stuttgart Im Wizemann the Road to Eternity

November—December 2018