
MUSIC RECOMMENDATION AND DISCOVERY IN THE LONG TAIL

Òscar Celma Herrada, 2008

© Copyright by Òscar Celma Herrada, 2008

All Rights Reserved

To Alex and Claudia, who bring the whole endeavour into perspective.

Acknowledgements

I would like to thank my supervisor, Dr. Xavier Serra, for giving me the opportunity to work on this very fascinating topic at the Music Technology Group (MTG). Also, I want to thank Perfecto Herrera for providing support, countless suggestions, reading all my writings, giving ideas, and devoting much time to me during this long journey. This thesis would not exist if it weren't for the help and assistance of many people. At the risk of unfair omission, I want to express my gratitude to them. I would like to thank all the colleagues from the MTG that were —directly or indirectly— involved in some bits of this work. Special mention goes to Mohamed Sordo, Koppi, Pedro Cano, Martín Blech, Emilia Gómez, Dmitry Bogdanov, Owen Meyers, Jens Grivolla, Cyril Laurier, Nicolas Wack, Xavier Oliver, Vegar Sandvold, José Pedro García, Nicolas Falquet, David García, Miquel Ramírez, and Otto Wüst. Also, I thank the MTG/IUA Administration Staff (Cristina Garrido, Joana Clotet and Salvador Gurrera), and the sysadmins (Guillem Serrate, Jordi Funollet, Maarten de Boer, Ramón Loureiro, and Carlos Atance). They provided help, hints and patience when I played around with the machines. During my six-month stay at the Center for Computing Research of the National Polytechnic Institute (Mexico City) in 2007, I met a lot of interesting people from different disciplines. I thank Alexander Gelbukh for inviting me to work in his research group, the Natural Language Laboratory. Also, I would like to thank Grigori Sidorov, Tine Stalmans, Obdulia Pichardo, Sulema Torres, and Yulema Ledeneva for making my stay so wonderful. This thesis would be much more difficult to read —except for the "Spanglish" experts— if it weren't for the excellent work of the following people: Paul Lamere, Owen Meyers, Terry Jones, Kurt Jacobson, Douglas Turnbull, Tom Slee, Kalevi Kilkki, Perfecto Herrera, Alberto Lumbreras, Daniel McEnnis, Xavier Amatriain, and Neil Lathia. They have not only helped me to improve the text, but have also provided feedback, comments, suggestions, and —of course— criticism.

Many people have influenced my research during these years. Furthermore, I have been lucky enough to meet some of them. In this sense, I would like to acknowledge Elias Pampalk, Paul Lamere, Justin Donaldson, Jeremy Pickens, Markus Schedl, Peter Knees, and Stephan Baumann. I had very interesting discussions with them at several ISMIR (and other) conferences. Other researchers from whom I have learnt a lot, and with whom I have worked, are: Massimiliano Zanin, Javier Buldú, Raphaël Troncy, Michael Hausenblas, Roberto García, and Yves Raimond. I also want to thank some MTG veterans, whom I met and worked with before starting the PhD. They are: Alex Loscos, Jordi Bonada, Pedro Cano, Oscar Mayor, Jordi Janer, Lars Fabig, Fabien Gouyon, and Enric Mieza. Also, special thanks goes to Esteban Maestre and Pau Arumí for having such a great time while being PhD students. Last but not least, this work would have never been possible without the encouragement of my wife Claudia, who has provided me love and patience, and my lovely son Alex —who altered my last.fm and YouTube accounts with his favourite music. Nowadays, Cri–Cri, Elmo and Barney coexist with The Dogs d'Amour and other rock bands. I reckon that the two systems are a bit lost when trying to recommend me music and videos! Also, a special warm thanks goes to my parents Tere and Toni, my brother Marc and my sister-in-law Marta, and the whole family in Barcelona and Mexico. At least, they will understand what my work is about... hopefully. This research was performed at the Music Technology Group of the Universitat Pompeu Fabra in Barcelona, Spain. Primary support was provided by the EU projects FP6-507142 SIMAC1 and FP6-045035 PHAROS2, and by a Mexican grant from the Secretaría de Relaciones Exteriores (Ministry of Foreign Affairs) for a six-month stay at the Center for Computing Research of the National Polytechnic Institute (Mexico City).

1 http://www.semanticaudio.org
2 http://www.pharos-audiovisual-search.eu/

Abstract

Music consumption is biased towards a few popular artists. For instance, in 2007 only 1% of all digital tracks accounted for 80% of all sales. Similarly, 1,000 albums accounted for 50% of all album sales, and 80% of all albums sold were purchased less than 100 times. There is a need to assist people to filter, discover, personalise and recommend music from the huge amount of content available along the Long Tail. Current music recommendation algorithms try to accurately predict what people demand to listen to. However, quite often these algorithms tend to recommend popular —or well–known to the user— music, decreasing the effectiveness of the recommendations. These approaches focus on improving the accuracy of the recommendations. That is, they try to make accurate predictions about what a user could listen to, or buy next, independently of how useful the provided recommendations could be to the user. In this Thesis we stress the importance of the user's perceived quality of the recommendations. We model the Long Tail curve of artist popularity to predict —potentially— interesting and unknown music, hidden in the tail of the popularity curve. Effective recommendation systems should promote novel and relevant material (non–obvious recommendations), taken primarily from the tail of a popularity distribution. The main contributions of this Thesis are: (i) a novel network–based approach for recommender systems, based on the analysis of the item (or user) similarity graph, and the popularity of the items, (ii) a user–centric evaluation that measures the user's perceived quality and the novelty of the recommendations, and (iii) two prototype systems that implement the ideas derived from the theoretical work. Our findings have significant implications for recommender systems that assist users to explore the Long Tail, digging for content they might like.

Resum

Avui en dia, la música està esbiaixada cap al consum d'alguns artistes molt populars. Per exemple, el 2007 només l'1% de totes les cançons en format digital va representar el 80% de les vendes. De la mateixa manera, només 1.000 àlbums varen representar el 50% de totes les vendes, i el 80% de tots els àlbums venuts es varen comprar menys de 100 vegades. És clar que hi ha una necessitat per tal d'ajudar a les persones a filtrar, descobrir, personalitzar i recomanar música, a partir de l'enorme quantitat de contingut musical disponible. Els algorismes de recomanació de música actuals intenten predir amb precisió el que els usuaris demanen escoltar. Tanmateix, molt sovint aquests algoritmes tendeixen a recomanar artistes famosos, o coneguts d'avantmà per l'usuari. Això fa que disminueixi l'eficàcia i utilitat de les recomanacions, ja que aquests algorismes es centren bàsicament en millorar la precisió de les recomanacions. És a dir, tracten de fer prediccions exactes sobre el que un usuari pugui escoltar o comprar, independentment de quant útils siguin les recomanacions generades. En aquesta tesi destaquem la importància que l'usuari valori les recomanacions rebudes. Per aquesta raó modelem la corba de popularitat dels artistes, per tal de poder recomanar música interessant i desconeguda per l'usuari. Les principals contribucions d'aquesta tesi són: (i) un nou enfocament basat en l'anàlisi de xarxes complexes i la popularitat dels productes, aplicat als sistemes de recomanació, (ii) una avaluació centrada en l'usuari, que mesura la importància i la desconeixença de les recomanacions, i (iii) dos prototips que implementen les idees derivades de la tasca teòrica. Els resultats obtinguts tenen clares implicacions per aquells sistemes de recomanació que ajuden a l'usuari a explorar i descobrir continguts que els pugui agradar.

Resumen

Actualmente, el consumo de música está sesgado hacia algunos artistas muy populares. Por ejemplo, en el año 2007 sólo el 1% de todas las canciones en formato digital representaron el 80% de las ventas. De igual modo, únicamente 1.000 álbumes representaron el 50% de todas las ventas, y el 80% de todos los álbumes vendidos se compraron menos de 100 veces. Existe, pues, una necesidad de ayudar a los usuarios a filtrar, descubrir, personalizar y recomendar música a partir de la enorme cantidad de contenido musical existente. Los algoritmos de recomendación musical existentes intentan predecir con precisión lo que la gente quiere escuchar. Sin embargo, muy a menudo estos algoritmos tienden a recomendar o bien artistas famosos, o bien artistas ya conocidos de antemano por el usuario. Esto disminuye la eficacia y la utilidad de las recomendaciones, ya que estos algoritmos se centran en mejorar la precisión de las recomendaciones. Con lo cual, tratan de predecir lo que un usuario pudiera escuchar o comprar, independientemente de lo útiles que sean las recomendaciones generadas. En este sentido, la tesis destaca la importancia de que el usuario valore las recomendaciones propuestas. Para ello, modelamos la curva de popularidad de los artistas con el fin de recomendar música interesante y, a la vez, desconocida para el usuario. Las principales contribuciones de esta tesis son: (i) un nuevo enfoque basado en el análisis de redes complejas y la popularidad de los productos, aplicado a los sistemas de recomendación, (ii) una evaluación centrada en el usuario que mide la calidad y la novedad de las recomendaciones, y (iii) dos prototipos que implementan las ideas derivadas de la labor teórica. Los resultados obtenidos tienen importantes implicaciones para los sistemas de recomendación que ayudan al usuario a explorar y descubrir contenidos que le puedan gustar.

Prologue

I met Timothy John Taylor (aka Tyla3) in 2000, when he settled in Barcelona. He was playing some acoustic gigs, and back then I used to record a lot of gigs with a portable DAT. After a remarkable night, I sent him an email telling him that I had recorded the gig, so I could give him a copy. After all, we were living in the same city. He said "yeah sure, come to my house, and give me the CDs". So there I am, another nervous fan, trying to look cool while walking to his home...

My big brother, the first "music recommender" that I can recall, bought a vinyl of The Dogs d'Amour in 1989. He liked the cover art —painted by the singer, Tyla— so he purchased it. The English rock band was just starting to become somewhat famous worldwide. They were in the UK charts, and had also played on Top of the Pops. Then, they moved to L.A. to record an album. Rock magazines used to talk about their chaotic and unpredictable concerts, as well as the excesses of the members. Both my brother and myself fell in love with the band after listening to the album.

Tyla welcomes me at his home. We have a long chat surrounded by guitars, old amps, and unfinished paintings. I give him a few CDs including his last concert in Barcelona, as well as two other gigs that I had recorded years before. All of a sudden, he mentions the latest project he is involved in: he has just re–joined the classic Dogs d'Amour line–up, after more than six years of inactivity. In fact, they were recording a new album. He was very excited and happy (ever after) about the project. I asked why they decided to re–join after all these years. He said: "We've just noticed how much interest there is on the Internet about the band". Indeed, not being able to find the old releases made a lot of profit for eBayers and the like.

When I joined The Dogs d'Amour Yahoo! mailing list in 1998 we were just a few dozen fans discussing the disbanded band, their solo projects, and related artists to fall upon.

3 http://www.myspace.com/tylaandthedogsdamour

One day, the members of the band joined the list, too. It was like a big —virtual— family. Being part of the mailing list allowed us to have updated information about what the band was up to, and chat with them. One day in 2000, they officially announced that the band was active again, and they had a new album! (...and I already knew that!). Sadly, the reunion only lasted for a couple of years, ending with a remarkable UK Monsters of Rock tour supporting Alice Cooper.

During the last few years, Tyla has released a set of solo albums. He has made his living —with the help of his fans— by setting up gigs, and selling albums and paintings online, as well as at concerts. Nowadays, he has much more control of the whole creative process than ever. The income allows him not to need any record label —he had some bad experiences with record labels back in the 80s, when they controlled everything. Moreover, from the fan's point of view, living in the same city allowed me to help him in the creation process of a couple of albums. I even played some guitar bits in two songs (and since then, I own one of his vintage Strats!). Up to now, he is still very active; he plays, paints, manages his tours, and a long etcetera. Yet, he is in the "long tail" of popularity.

It is difficult to discover this type of artist when using music recommenders that do not support "less–known" artists. Indeed, for a music lover it is very rewarding to discover unknown artists that fit into her music taste. In my case, music serendipity dates from 1989; with a cool album cover, and the taste of my older brother. Now, I am willing to experience these feelings again...

Contents

Acknowledgements

Abstract

Resum

Resumen

Prologue

1 Introduction
   1.1 Motivation
      1.1.1 Academia
      1.1.2 Industry
   1.2 The Problem
   1.3 The Solution
   1.4 Summary of contributions
   1.5 Thesis outline

2 The recommendation problem
   2.1 Formalisation of the recommendation problem
   2.2 Use cases
   2.3 General model
   2.4 User profile representation
      2.4.1 Initial generation
      2.4.2 Maintenance
      2.4.3 Adaptation
   2.5 Recommendation methods
      2.5.1 Demographic filtering
      2.5.2 Collaborative filtering
      2.5.3 Content–based filtering
      2.5.4 Context–based filtering
      2.5.5 Hybrid methods
   2.6 Factors affecting the recommendation problem
   2.7 Summary

3 Music recommendation
   3.1 Use Cases
      3.1.1 Artist recommendation
      3.1.2 Neighbour recommendation
      3.1.3 Playlist generation
   3.2 User profile representation
      3.2.1 Type of listeners
      3.2.2 Related work
      3.2.3 User profile representation proposals
   3.3 Item profile representation
      3.3.1 The music information plane
      3.3.2 Editorial metadata
      3.3.3 Cultural metadata
      3.3.4 Acoustic metadata
   3.4 Recommendation methods
      3.4.1 Collaborative filtering
      3.4.2 Content–based filtering
      3.4.3 Context–based filtering
      3.4.4 Hybrid methods
   3.5 Summary

4 The Long Tail in recommender systems
   4.1 Introduction
   4.2 The Music Long Tail
   4.3 Definitions
      4.3.1 Qualitative, informal definition
      4.3.2 Quantitative, formal definition
      4.3.3 Qualitative versus quantitative definition
   4.4 Characterising a Long Tail distribution
   4.5 The dynamics of the Long Tail
   4.6 Novelty, familiarity and relevance
      4.6.1 Recommending the unknown
      4.6.2 Related work
   4.7 Summary

5 Evaluation metrics
   5.1 Evaluation strategies
   5.2 System–centric evaluation
      5.2.1 Predictive–based metrics
      5.2.2 Decision–based metrics
      5.2.3 Rank–based metrics
      5.2.4 Other metrics
      5.2.5 Limitations
   5.3 Network–centric evaluation
      5.3.1 Navigation
      5.3.2 Connectivity
      5.3.3 Clustering
      5.3.4 Related work in music information retrieval
      5.3.5 Limitations
   5.4 User–centric evaluation
      5.4.1 Metrics
      5.4.2 Limitations
   5.5 Summary

6 Network–centric evaluation
   6.1 Network analysis and the Long Tail model
   6.2 Artist network analysis
      6.2.1 Datasets
      6.2.2 Network analysis
      6.2.3 Popularity analysis
      6.2.4 Discussion
   6.3 User network analysis
      6.3.1 Datasets
      6.3.2 Network analysis
      6.3.3 Popularity analysis
      6.3.4 Discussion
   6.4 Summary

7 User–centric evaluation
   7.1 Music Recommendation Survey
      7.1.1 Procedure
      7.1.2 Datasets
      7.1.3 Participants
   7.2 Results
      7.2.1 Participants
      7.2.2 Music Recommendation
   7.3 Discussion
   7.4 Limitations

8 Applications
   8.1 Searchsounds: Music discovery in the Long Tail
      8.1.1 Motivation
      8.1.2 Goals
      8.1.3 System overview
      8.1.4 Summary
   8.2 FOAFing the Music: Music recommendation in the Long Tail
      8.2.1 Motivation
      8.2.2 Goals
      8.2.3 System overview
      8.2.4 Summary

9 Conclusions and Further Research
   9.1 Summary of the Research
      9.1.1 Scientific contributions
      9.1.2 Industrial contributions
   9.2 Limitations and Further Research
   9.3 Outlook

Appendix A. Publications

Bibliography

List of Figures

1.1 Amazon recommendations for The Beatles' "White Album"
1.2 The Long Tail of items in a recommender system
1.3 The key elements of the Thesis
1.4 Outline of the Thesis and its corresponding chapters

2.1 General model of the recommendation problem
2.2 Pre–defined training set to model user preferences
2.3 User–item matrix
2.4 User–item matrix with co–rated items
2.5 Distance among items using content–based similarity
2.6 A 3–order tensor example for social tagging
2.7 Comparing two users' tag clouds

3.1 Type of music listeners: savants, enthusiasts, casuals, and indifferents
3.2 The music information plane
3.3 Editorial metadata and the music information plane
3.4 Cultural metadata and the music information plane
3.5 Acoustic metadata and the music information plane
3.6 A user's listening habits using frequency distribution
3.7 User listening habits using the complementary cumulative distribution

4.1 Last.fm versus Myspace playcounts
4.2 The Long Tail for artist popularity in log–lin scale
4.3 The Long Tail for artist popularity in log–log scale
4.4 Cumulative percentage of playcounts in the Long Tail
4.5 Fitting a heavy–tailed distribution with the F(x) model
4.6 The dynamics of the Long Tail
4.7 A user profile represented in the Long Tail
4.8 Trade–off between user's novelty and relevance
4.9 A 3D representation of the Long Tail

5.1 System–centric evaluation
5.2 Network–centric evaluation
5.3 User–centric evaluation
5.4 System–, network–, and user–centric evaluation methods

6.1 Network–centric evaluation
6.2 Cumulative indegree distribution for the artist networks
6.3 Assortative mixing, indegree–indegree correlation
6.4 Correlation between artist total playcounts and its similar artists
6.5 Markov decision process to navigate along the Long Tail
6.6 Correlation between artists' indegree and total playcounts
6.7 Clustering coefficient C(k) versus k
6.8 Cumulative indegree distribution for the user networks
6.9 Assortative mixing in user similarity networks
6.10 Example of a user's location in the Long Tail
6.11 Correlation between users' indegree and total playcounts

7.1 User–centric evaluation
7.2 Screenshot of the Music recommendation survey
7.3 Demographic information of the survey's participants
7.4 Musical background information of the survey's participants
7.5 Histogram of the ratings when the subject knows the artist and song
7.6 Histogram of the ratings when the participant only knows the artist
7.7 Histogram of the ratings when the recommended song is unknown
7.8 Box–and–whisker plot for unknown songs
7.9 Tukey's test for the ratings of unknown songs
7.10 The three recommendation approaches in the novelty vs. relevance axis

8.1 Searchsounds and the music information plane
8.2 Architecture of the SearchSounds system
8.3 Screenshot of the SearchSounds application
8.4 Foafing the Music and the music information plane
8.5 Architecture of the Foafing the Music system
8.6 Daily accesses to Foafing the Music

List of Tables

1.1 Number of scientific articles related to music recommendation
1.2 Papers related to music recommendation presented in ISMIR

2.1 Elements involved in the recommendation problem

3.1 A list of prominent Country artists using Web–MIR
3.2 The Dogs d'Amour similar artists using CF Pearson correlation
3.3 Artist similarity using audio content–based analysis
3.4 The Dogs d'Amour similar artists using social tagging data
3.5 The Dogs d'Amour similar artists using a hybrid method

4.1 Top–10 artists from last.fm in 2007
4.2 Top–10 artists in 2006 based on total digital track sales
4.3 Top–10 artists in 2006 based on total album sales
4.4 The dynamics of the Long Tail

5.1 Contingency table to derive Precision and Recall measures
5.2 A summary of the evaluation methods

6.1 Datasets for the artist similarity networks
6.2 Artist network properties for social, content, and expert–based
6.3 Indegree distribution for the artist networks
6.4 Bruce Springsteen genres matched from last.fm tags
6.5 Mixing by genre in last.fm network
6.6 Mixing by genre in AMG expert–based network
6.7 Mixing by genre in the content–based network
6.8 Mixing by genre r coefficient for the three networks
6.9 Artist similarity and their location in the Long Tail
6.10 Navigation along the Long Tail using a Markovian stochastic process
6.11 Top–10 artists with higher indegree
6.12 Datasets for the user similarity networks
6.13 User network properties for CF and CB
6.14 Indegree distribution for the user networks
6.15 User similarity and their location in the Long Tail
6.16 User Long Tail navigation using a Markovian stochastic process
6.17 Top–5 indegree users

7.1 Results for the user–centric evaluation

8.1 Harvesting music from RSS feeds

Listings

2.1 Example of a user profile in APML
3.1 Example of a user profile in UMIRL
3.2 Example of a user profile in MPEG-7
3.3 Example of a user interest using FOAF
3.4 Example of an artist description in FOAF
3.5 Example of a user's FOAF profile
6.1 Snippet of Last.fm tags for Bruce Springsteen
8.1 Example of a media RSS feed
8.2 RDF example of an artist individual
8.3 Example of a track individual
8.4 Example of a FOAF interest with a given dc:title

Chapter 1

Introduction

1.1 Motivation

In recent years typical music consumption behaviour has changed dramatically. Personal music collections have grown, aided by technological improvements in networks, storage, portability of devices and Internet services. The number and the availability of songs have de-emphasised their value; it is usually the case that users own many digital music files that they have only listened to once, or not at all. It seems reasonable to suppose that with efficient ways to create a personalised order of users' collections, as well as ways to explore hidden "treasures" inside them, the value of their music collections would drastically increase.

Users own huge music collections that need proper storage and labelling. Search within digital collections gives rise to new methods for accessing and retrieving data. But, sometimes, there is no metadata —or only file names— to inform us of the audio content, and that is not enough for an effective navigation and discovery of the music collection. Users can, then, get lost searching in their own digital collections. Furthermore, the web is increasingly becoming the primary source of music titles in digital form. With millions of tracks available from thousands of websites, finding the right songs, and being informed of new music releases, has become problematic.

On the digital music distribution front, there is a need to find ways of improving music retrieval and personalisation. Artist, title, and genre information might not be the only criteria to help music consumers find music they like. This is achieved using cultural or editorial metadata ("this artist is somehow related to that one"), or exploiting existing purchasing behaviour data ("since you bought this artist, you might also enjoy this one"). A largely unexplored —and potentially interesting— complement is using semantic descriptors automatically extracted from music files, or gathered from the community of users, via social tagging. All this information can be combined and used for music recommendation.

Year   Num. papers
1994   1
—      —
2001   3
2002   4
2003   3
2004   8
2005   14
2006   19
2007   21

Table 1.1: Number of scientific articles related to music recommendation, indexed by Google Scholar (page accessed on October 1st, 2008).

1.1.1 Academia

With one early exception, Shardanand's master's thesis (Shardanand, 1994) published in 1994, research in music recommendation did not really begin until 2001. To show the increasing interest in this field, Table 1.1 presents the number of papers related to music recommendation since 2001. The table shows the list of related papers indexed by Google Scholar1. From 2004 onwards we have seen a sharp increase in the number of papers published in this field.

A closer look, focusing on the Music Information Retrieval (MIR) community, also shows an increasing interest in music recommendation and discovery. Table 1.2 shows the list of related papers presented in ISMIR (International Society for Music Information Retrieval) conferences since 2000. The early papers focused on content–based methods (Logan, 2002, 2004), and user profiling aspects (Chai and Vercoe, 2000; Uitdenbogerd and van Schnydel, 2002). Since 2005, research community attention has broadened to other areas, including: prototype systems (Celma et al., 2005; van Gulik and Vignoli, 2005; Pampalk and Goto, 2007), playlist generation including user–feedback (Pampalk et al., 2005; Pampalk and Gasser, 2006; Pauws and van de Wijdeven, 2005; Oliver and Kregor-Stickles, 2006), and sociological aspects (Cunningham et al., 2006; McEnnis and Cunningham, 2007). The "Music Recommendation Tutorial" (Celma and Lamere, 2007), presented at the ISMIR 2007 conference, summarised part of the work done in this field.

Year   Papers   References
2000   1        (Chai and Vercoe, 2000)
2001   0        —
2002   3        (Logan, 2002), (Pauws and Eggen, 2002), (Uitdenbogerd and van Schnydel, 2002)
2003   0        —
2004   1        (Logan, 2004)
2005   4        (Celma et al., 2005), (Pampalk et al., 2005), (Pauws and van de Wijdeven, 2005), (van Gulik and Vignoli, 2005)
2006   6        (Cunningham et al., 2006), (Hu et al., 2006), (Oliver and Kregor-Stickles, 2006), (Pampalk and Gasser, 2006), (Pauws et al., 2006), (Yoshii et al., 2006)
2007   7        (Anglade et al., 2007b), (Celma and Lamere, 2007), (Donaldson, 2007), (McEnnis and Cunningham, 2007), (Pampalk and Goto, 2007), (Tiemann and Pauws, 2007), (Yoshii et al., 2007)

Table 1.2: Papers related to music recommendation presented in the ISMIR conference since 2000. For each year, references are ordered alphabetically according to the first author.

1 We count, for each year, the number of results from http://scholar.google.com that contain "music recommendation" or "music recommender" in the title of the article. Accessed on October 1st, 2008.

1.1.2 Industry

Recommender systems play an important role in e–Commerce. Examples such as Amazon or Netflix, where the provided recommendations are critical to retain users, show that most of the product sales result from the recommendations. Greg Linden, who implemented the first recommendation engine for Amazon, states2:

“(Amazon.com) recommendations generated a couple orders of magnitude more sales than just showing top sellers.”

Since October 2006, this field has enjoyed an increase of interest thanks to the Netflix competition. The competition offers a prize of $1,000,000 to those that improve their movie recommendation system3. Also, the Netflix competition provides the largest open dataset, containing more than 100 million movie ratings from anonymous users. The research community was challenged to develop algorithms that improve the accuracy of the current Netflix recommendation system.

2 http://glinden.blogspot.com/2007/05/google-news-personalization-paper.html
3 The goal is to reduce by 10% the Root mean squared error (RMSE) of the predicted movies' ratings.

State of the Music Industry

The Long Tail4 is composed of a small number of popular items (the hits), while the rest are located in the tail of the curve (Anderson, 2006). The main goal of the Long Tail economics —originated by the huge shift from physical media to digital media, and the fall in production costs— is to make everything available, in contrast to the limitations of the brick–and–mortar stores. Thus, personalised recommendations and filters are needed to help users find the right content in the digital space. On the music side, the 2007 "State of the Industry" report by Nielsen SoundScan presents some interesting information about music consumption in the United States (Soundscan, 2007). Around 80,000 albums were released in 2007 (not counting music available on Myspace.com and similar sites). However, traditional CD sales are down 31% since 2004 —but digital music sales are up 490%. Indeed, 844 million digital tracks were sold in 2007, but only 1% of all digital tracks accounted for 80% of all track sales. Also, 1,000 albums accounted for 50% of all album sales, and 450,344 of the 570,000 albums sold were purchased less than 100 times. Music consumption based on sales is biased towards a few popular artists. Ideally, by providing personalised filters and discovery tools to users, music consumption would diversify. There is a need to assist people to discover, recommend, personalise and filter the huge amount of music content.

4 From now on, considered as a proper noun with capitalised letters.

1.2 The Problem

Nowadays, we have an overwhelming number of choices of which music to listen to. We see this each time we browse a non–personalised music catalog, such as Myspace or iTunes. Schwartz (2005) states that we, as consumers, often become paralysed and doubtful when facing an overwhelming number of choices. There is a need to eliminate some of the choices, and this can be achieved by providing personalised filters and recommendations to ease users' decisions.

Music ≠ movies and books

Several music recommendation paradigms have been proposed in recent years, and many commercial systems have appeared with more or less success. Most of these approaches apply or adapt existing recommendation algorithms, such as collaborative filtering, to the music domain. However, music is somewhat different from other entertainment domains, such as movies or books. Tracking users' preferences is mostly done implicitly, via their listening habits (instead of asking users to explicitly rate the items). Any user can consume an item (e.g., a track or a playlist) several times, even repeatedly and continuously. Regarding the evaluation process, music recommendation allows users instant feedback via brief audio excerpts. The context is another big difference between music and the other two domains. People consume different music in different contexts; e.g. hard–rock early in the morning, classical sonatas while working, and Lester Young's cool jazz while having dinner. Thus, a music recommender has to deal with contextual information.

Predictive accuracy vs. perceived quality

Current music recommendation algorithms try to accurately predict what people will want to listen to. However, these algorithms tend to recommend popular (or well–known to the user) artists, which decreases the user's perceived quality of the recommendations. The algorithms focus, then, on the predictive accuracy of the recommendations. That is, they try to make accurate predictions about what a user could listen to, or buy next, independently of how useful the provided recommendations are to the user.

Figure 1.1 depicts this phenomenon. It shows Amazon similar albums for The Beatles' White Album5, based on the consumption habits of users. The top–30 recommendations for The Beatles' White Album are strictly made of other Beatles albums (then suddenly, on the fourth page of results, there is the first non–Beatles album; Exile on Main St. by the Rolling Stones). For the system these are the most accurate recommendations and, ideally, the ones that maximise their goal—to make a user buy more goods. Still, one might argue about the usefulness of the provided recommendations. In fact, the goals of a recommender are not always aligned with the goals of a listener. The goal of the Amazon recommender is to sell goods, whereas the goal for a user visiting Amazon may be to find some new and interesting music.

Figure 1.1: Amazon recommendations for The Beatles' "White Album".

5 http://www.amazon.com/Beatles-White-Album/dp/B000002UAX, accessed on October 9th, 2008.

1.3 The Solution

The main idea of our solution is to focus on the user's perceived quality, instead of the system's predictive accuracy, of the recommendations. To allow users to discover new music, recommender systems should exploit the long tail of popularity (e.g., number of total plays, or album sales) that exists in any large music collection.

Figure 1.2 depicts the long tail of popularity, and how recommender systems should help us in finding interesting information (Anderson, 2006). Personalised filters assist us in filtering the available content, and in selecting those —potentially— novel and interesting items according to the user's profile. In this sense, the algorithm strengthens the user's perceived quality and usefulness of the recommendations. Two key elements to drive the users from the head to the tail of the curve are novelty, and personalised relevance. Effective recommendation systems should promote novel and relevant material (non–obvious recommendations), taken primarily from the tail of a distribution, rather than focus on accuracy.

Figure 1.2: The Long Tail of items in a recommender system. An important role of a recommender is to drive the user from the head region (popular items) to the long tail of the curve (Anderson, 2006).

Novelty and relevance

Novelty is a property of a recommender system that promotes unknown items to a user. Novelty is the opposite of the user's familiarity with the recommended items. Yet, serendipity, that is, novel and relevant recommendations for a given user, cannot be achieved without taking into account the user profile. Personalised relevance filters the available content, and selects those (potentially novel) items according to user preferences.

Ideally, a user should also be familiar with some of the recommended items, to improve the confidence and trust in the system. The system should also give an explanation of why the items were recommended, providing higher confidence and transparency of novel recommendations. The difficult job for a recommender is, then, to find the proper level of familiarity, novelty and relevance for each user. This way, recommendations can use the long tail of popularity. Furthermore, the proper levels of familiarity, novelty and relevance for a user will change over time. As a user becomes comfortable with the recommendations, the amount of familiar items could be reduced.

Figure 1.3: Diagram that depicts the key elements of this Thesis. It consists of the similarity graph, the long tail of item popularity, the user profile, the provided recommendations, and the evaluation part.

Proposed approach

Figure 1.3 depicts the main elements involved in this Thesis. The item (or user) similarity graph defines the relationship among the items (or users). This information is used for recommending items (or like–minded people) to a given user, based on her preferences. The long tail curve models the popularity of the items in the dataset, according to the shared knowledge of the whole community. The user profile is represented along the popularity curve, using her list of preferred items. Using the information from the similarity graph, the long tail of item popularity, and the user profile, we should be able to provide the proper level of familiarity, novelty and relevance in the recommendations to the users. Finally, an assessment of the provided recommendations is needed. This is done in two complementary ways. First, using a novel user–agnostic evaluation method based on the analysis of the item (or user) similarity network, and the item popularity. Secondly, with a user–based evaluation that provides feedback on the list of recommended items.

1.4 Summary of contributions

The main contributions of this Thesis are:

1. A novel user–agnostic evaluation method (or network–based evaluation) for recommender systems, based on the analysis of the item (or user) similarity network, and the item popularity. This method has the following properties:

(a) it measures the novelty component of a recommendation algorithm,
(b) it makes use of complex network analysis to analyse the similarity graph,
(c) it models the item popularity curve,
(d) it combines both the complex network and the item popularity analysis to determine the underlying characteristics of the recommendation algorithm, and
(e) it does not require any user intervention in the evaluation process.

We apply this evaluation method to artist, and large–scale user similarity graphs.

2. A user–centric evaluation based on the immediate feedback of the provided recommendations. This evaluation method has the following advantages (compared to other system–oriented evaluations):

(a) it measures the novelty factor of a recommendation algorithm in terms of user knowledge,
(b) it measures the relevance (e.g., like it or not) of the recommendations, and
(c) the users provide immediate feedback to the evaluation system, so the system can react accordingly.

This method complements the previous, user–agnostic, evaluation approach. We use this method to evaluate three different music recommendation approaches (social–based, content–based, and a hybrid approach using expert human knowledge). In the experiment, 288 subjects rated their personalised recommendations in terms of novelty (does the user know the recommended song/artist?), and relevance (does the user like the recommended song?).

3. A system prototype, named FOAFing the music, to provide music recommendations based on the user preferences and her listening habits. The main goal of the Foafing the Music system is to recommend, discover and explore music content, based on user profiling, context–based information (extracted from music–related RSS feeds), and content–based descriptions (automatically extracted from the audio itself). Foafing the Music allows users to:

(a) get new music releases from iTunes, Amazon, Yahoo Shopping, etc.,
(b) download (or stream) audio from MP3–blogs and podcast sessions,
(c) discover music with radio–a–la–carte (i.e., personalised playlists),
(d) view upcoming concerts happening near the user's location, and
(e) read album reviews.

4. A music search engine, named Searchsounds, that allows users to discover unknown music mentioned on music–related blogs. Searchsounds provides keyword–based search, as well as the exploration of similar songs using audio similarity.

1.5 Thesis outline

This Thesis is structured as follows: chapter 2 introduces the basics of the recommendation problem, and presents the general framework that includes user preferences and representation. Then, chapter 3 adapts the recommendation problem to the music domain, and presents related work in this area. Once the users, items, and recommendation methods are presented, chapter 4 introduces the Long Tail model and its usage in recommender systems. Chapters 5, 6 and 7 present the different ways of evaluating and comparing different recommendation algorithms. Chapter 5 presents the existing metrics for system–, network–, and user–centric approaches. Then, chapter 6 presents a complement to the classic system–centric evaluation, focusing on the analysis of the item (or user) similarity network, and its relationships with the popularity of the items. Chapter 7 complements the previous approach by entering the users in the evaluation loop, allowing them to evaluate the quality of the recommendations via immediate feedback. Chapter 8 presents two real prototypes. These systems, named Searchsounds and FOAFing the music, show how to exploit music–related content that is available on the web, for music discovery and recommendation. Finally, chapter 9 draws some conclusions and discusses open issues and future work. To summarise the outline of the Thesis, Figure 1.4 presents an extension of Figure 1.3, including the main elements of the Thesis and its related chapters.

Figure 1.4: Extension of Figure 1.3, adding the corresponding chapters.

Chapter 2

The recommendation problem

Generally speaking, the reason people could be interested in using a recommender system is that they have so many items to choose from—in a limited period of time—that they cannot evaluate all the possible options. A recommender should be able to filter all this information and bring the relevant items to the user. Nowadays, the most successful recommender systems have been built for entertainment content domains, such as: movies, music, or books (Herlocker et al., 2004). This chapter is structured as follows: section 2.1 introduces a formal definition of the recommendation problem. After that, section 2.2 presents some use cases to stress the possible usages of a recommender. Section 2.3 presents the general model of the recommendation problem. An important aspect of a recommender system is how to model the user preferences and how to represent a user profile. This is discussed in section 2.4. After that, section 2.5 presents the existing recommendation methods to recommend items (and also like–minded people) to users. Finally, section 2.6 presents some key elements that affect the recommendation problem.

2.1 Formalisation of the recommendation problem

Intuitively, the recommendation problem can be split into two subproblems. The first one is a prediction problem, and is about the estimation of the items' likeliness for a given user. The second problem is to recommend a list of N items—assuming that the system can predict likeliness for yet unrated items. Actually, the most relevant problem is the estimation. Once the system can estimate items into a totally ordered set, the recommendation problem reduces to listing the top–N items with the highest estimated value.

The prediction problem can be formalised as follows (Sarwar et al., 2001):

• Let $U = \{u_1, u_2, \ldots, u_m\}$ be the set of all users, and let $I = \{i_1, i_2, \ldots, i_n\}$ be the set of all possible items that can be recommended. Each user $u_i$ has a list of items $I_{u_i}$. This list represents the items in which the user has expressed her interest. Note that $I_{u_i} \subseteq I$, and it is possible that $I_{u_i}$ be empty1, $I_{u_i} = \emptyset$. Then, the function $P_{u_a,i_j}$ is the predicted likeliness of item $i_j$ for the active user $u_a$, such that $i_j \notin I_{u_a}$.

• The recommendation problem is reduced to bringing a list of N items, $I_r \subset I$, that the user will like the most (i.e. the ones with the highest $P_{u_a,i_j}$ value). The recommended list should not contain items from the user's interests, i.e. $I_r \cap I_{u_a} = \emptyset$.

The space $I$ of possible items can be very large. Similarly, the user space $U$ can also be enormous. In most recommender systems, the prediction function is usually represented by a rating. User ratings are triples $\langle u, i, r \rangle$, where $r$ is the value assigned—explicitly or implicitly—by the user $u$ to a particular item $i$. Usually, this value is a real number (e.g. from 0 to 1), a value in a discrete range (e.g. from 1 to 5), or a binary variable (e.g. like/dislike).

There are many approaches to solve the recommendation problem. One widely used approach is when the system stores the interaction (implicit or explicit) between a user and the item set. The system can provide informed guesses based on the interaction that all the users have provided. This approximation is called collaborative filtering. Another approach is to collect information describing the items and then, based on the user preferences, the system is able to predict which items the user will like the most. This approach is generally known as content–based filtering, as it does not rely on other users' ratings but on the description of the items. Another approach is demographic filtering, which stereotypes the kind of users that like a certain item. The context–based filtering approach uses contextual information about the items to describe them. Finally, the hybrid approach combines some of the previous approaches. Section 2.5 presents all these approaches.

Before presenting the methods to solve the recommendation problem, the following section explains the most common usages of a recommender. After that, section 2.4 explains how to model the user preferences.

1 Especially when the user creates an account with a recommender system.
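To make the formalisation concrete, the following minimal sketch (in Python) shows how the top–N step reduces to sorting the yet unrated items by their predicted value. The prediction function is assumed to be given (any of the methods presented in section 2.5 could implement it), and the item identifiers are hypothetical.

def recommend_top_n(predict, user, user_items, all_items, n=10):
    # Return the N items with the highest predicted value P(user, item),
    # excluding the items the user has already expressed interest in,
    # so that the recommended list and the user's list do not overlap.
    candidates = all_items - user_items
    scored = [(item, predict(user, item)) for item in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, score in scored[:n]]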

2.2 Use cases

Once the recommendation problem has been specified, the next step is to define general use cases that make a recommender system useful. Herlocker et al. (2004) identify some common usages of a recommender:

• Find good items. The aim of this use case is to provide a ranked list of items, along with a prediction of how much the user would like each item. Ideally, a user would expect some novel items that are unknown to the user, as well as some familiar items, too.

• Find all good items. The difference of this use case from the previous one is with regard to the coverage. In this case, the false positive rate should be lower, thus presenting items with a higher precision.

• Recommend sequence. This use case aims at bringing to the user an ordered sequence of items that is pleasing as a whole. A paradigmatic example is a music recommender's automatic playlist generation.

• Just browsing. In this case, users find it pleasant to browse the system, even if they are not willing to purchase any item; simply as entertainment.

• Find credible recommender. Users do not automatically trust a recommender. Thus, they "play around" with the system to see if the recommender does the job well. A user interacting with a music recommender will probably search for one of her favourite artists, and check the output results (e.g. similar artists, playlist generation, etc.).

• Express self. For some users it is important to express their opinions. A recommender that offers a way to communicate and interact with other users (via forums, weblogs, etc.) allows the self–expression of users. Thus, other users can get more information—from tagging, reviewing or blogging processes—about the items being recommended to them.

• Influence others. This use case is the most negative of the ones presented. There are some situations where users might want to influence the community in viewing or purchasing a particular item. For example, movie studios could rate their latest release highly, to push others to go and see the movie. In a similar way, record labels could try to promote their artists in the recommender.

All these use cases are important when evaluating a recommender. The first task of the evaluators should be to identify the most important use cases for which the recommender will be used, and base their decisions on that.

2.3 General model

The main elements of a recommender are users and items. Users need to be modelled in a way that the recommender can exploit their profiles and preferences. Besides, an accurate description of the items is also crucial to achieve good results when recommending items to users. Figure 2.1 describes the major entities and processes involved in the recommendation problem. The first step is to model both the users and the items, and it is presented in section 2.4. After that, two types of recommendations can be computed. The first one is to present the recommended items to the user (Top–N predicted items). The second is to match like–minded people (Top–N predicted neighbours). This is presented in section 2.5. Once the user gets a list of the recommended items, she can provide feedback, so the system can update her profile accordingly.

2.4 User profile representation

There are two key elements when describing user preferences: the generation and maintenance of the profiles, and the exploitation of the profile using a recommendation algorithm (Montaner et al., 2003). On the one hand, profile generation involves the representation, initial generation, and adaptation techniques. On the other hand, profile exploitation involves the information filtering method used (i.e. the recommendation method), the matching between a user profile and the items, and the matching between user profiles (i.e. creation of neighbourhoods). There are several approaches to represent user preferences. For instance, using the history of purchases in an e–Commerce website, web usage mining (analysis of the links, and time spent on a webpage), the listening habits (songs that a user listens to), etc.

Figure 2.1: General model of the recommendation problem.

2.4.1 Initial generation

Empty

An important aspect of a user profile is its initialisation. The simplest way is to create an empty profile that will be updated as soon as the user interacts with the system. However, the system will not be able to provide any recommendations until the user has been using the system for a while.

Manual

Another approach is to manually create a profile. In this case, a system might ask the users to register their interests (via tags, keywords or topics) as well as some demographic information (e.g. age, marital status, gender, etc.), geographic data (city, country, etc.) and psychographic data (interests, lifestyle, etc.). The main drawbacks are the user's effort, and the fact that some interests could still be unknown to the user himself.

Data import

To avoid the manual creation of a profile, the system can ask the user for available, external information that already describes her. In this case, the system only has to import this information from the external sources that contain relevant information about the user2. Besides, there have been some attempts to allow users to share their own interests in a machine–readable format (e.g. XML), so any system can use it and extend it. An interesting proposal is the Attention Profile Markup Language (APML)3. The following example4 shows a fragment of an APML file derived from the listening habits of a last.fm user5. The APML document contains a tag cloud representation created from the tags defined in the user's top artists.

2 A de–facto standard in the Semantic Web community is the Friend of a Friend (FOAF) initiative. FOAF provides conventions and a language "to tell" a machine the sort of things that a user says about herself. This approach is the one used in our prototype, presented in chapter 8.
3 http://www.apml.org
4 Generated via TasteBroker.org
5 http://research.sun.com:8080/AttentionProfile/apml/last.fm/ocelma

Figure 2.2: Example of a pre–defined training set to model user preferences when a user creates an account in iLike.

...

Listing 2.1: Example of a user profile in APML.

Training set

Another method to gather information is using a pre–defined training set. The user has to provide feedback on concrete items, marking them as relevant or irrelevant to her interests. The main problem, though, is to select representative examples. For instance, in the music domain, the system might ask for concrete genres or styles, and filter a set of artists to be rated by the user. Figure 2.2 shows an example from the iLike music recommender. Once a user creates an account, the system presents a list of artists that the user has to rate. This process is usually perceived by the users as tedious and unnecessary work. Yet, it gives some information to the system to avoid the user cold–start problem (see section 2.6 for more details).

Stereotyping

Finally, the system can gather initial information using stereotyping. This method resembles a clustering problem. The main idea is to assign a new user to a cluster of similar users that are represented by their stereotype, according to some demographic, geographic, or psychographic information.

2.4.2 Maintenance

Once the profile has been created, it does not remain static: users' interests might (and probably will) change. A recommender system needs up–to–date information to automatically update a user profile. Feedback can be explicit or implicit.

Explicit feedback

One option is to ask the users for relevance feedback about the provided recommendations. Explicit feedback usually comes in the form of ratings, and can be positive or negative. Usually, users provide more positive feedback, although negative examples can be very useful for the system. Ratings can be on a discrete scale (e.g. from 0 to N), or a binary value (like/dislike). Yet, it has been shown that users sometimes rate inconsistently (Hill et al., 1995), thus ratings are usually biased towards some values, and this can also depend on the user's perception of the ratings' scale. Inconsistency in the ratings introduces a natural variability when the system is predicting the ratings. Herlocker et al. (2004) present a study showing that even the best algorithm could not get beyond a Root mean squared error (RMSE) of 0.73, on a five–point scale. This has strong consequences for recommender systems based on maximising the predictive accuracy, and also sets a theoretical upper bound to the Netflix competition.

Another way to gather explicit feedback is to allow users to write comments and opinions about the items. In this case, the system can present the opinions to the target user, along with the recommendations. This extra piece of information eases the decision–making process of the target user, although she has to read and interpret other users' opinions.
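As a point of reference, the RMSE over a test set $T$ of $\langle u, i, r \rangle$ triples is $\mathrm{RMSE} = \sqrt{\frac{1}{|T|}\sum_{(u,i,r)\in T}(\hat{r}_{u,i} - r)^2}$, where $\hat{r}_{u,i}$ is the predicted rating. The sketch below computes it for a handful of made-up ratings on a five–point scale; the data is purely illustrative.

import math

def rmse(predicted, actual):
    # Root mean squared error between predicted and actual ratings,
    # both keyed by (user, item) pairs.
    errors = [(predicted[key] - rating) ** 2 for key, rating in actual.items()]
    return math.sqrt(sum(errors) / len(errors))

# Hypothetical ratings on a five-point scale
actual = {("alice", "album1"): 4, ("alice", "album2"): 2, ("bob", "album1"): 5}
predicted = {("alice", "album1"): 3.5, ("alice", "album2"): 2.5, ("bob", "album1"): 4.0}
print(round(rmse(predicted, actual), 2))  # 0.71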

Implicit feedback

A recommender can also gather implicit feedback from the user. A system can infer the user preferences passively by monitoring the user's actions: for instance, by analysing the history of purchases, the time spent on a webpage, the links followed by the user, the mouse movements, or media player usage (tracking the play, pause, skip and stop buttons). However, negative feedback is not reliable when using implicit feedback, because the system can only observe positive (implicit) feedback by analysing the user's actions. On the other hand, implicit feedback is not as intrusive as explicit feedback.
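As an illustration, a system could turn raw media player events into per-track preference scores by rewarding completed plays and penalising skips. The sketch below is only a toy example: the event names and weights are invented, and, as noted above, a low score obtained this way should not be read as reliable negative feedback.

from collections import defaultdict

# Hypothetical weights for player events; a real system would have to tune these.
EVENT_WEIGHTS = {"play_complete": 1.0, "play_partial": 0.3, "skip": -0.2, "stop": 0.0}

def implicit_scores(events):
    # Aggregate (user, track, event) tuples into per-(user, track) preference scores.
    scores = defaultdict(float)
    for user, track, event in events:
        scores[(user, track)] += EVENT_WEIGHTS.get(event, 0.0)
    return scores

events = [("alice", "song_a", "play_complete"),
          ("alice", "song_a", "play_complete"),
          ("alice", "song_b", "skip")]
print(dict(implicit_scores(events)))  # {('alice', 'song_a'): 2.0, ('alice', 'song_b'): -0.2}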

2.4.3 Adaptation

As explained in the previous section, relevance feedback implies that the system has to adapt to changes in the users' profiles. Adapting to new interests and forgetting old ones can be done in three different ways: first, manually by the user, although this requires some effort from her; secondly, by adding new information to the user profile while keeping the old interests; and finally, by gradually forgetting the old interests and promoting the new ones (Webb and Kuzmycz, 1996).

2.5 Recommendation methods

Once the user profile is created, the next step is to exploit the user preferences to provide her with interesting recommendations. User profile exploitation is tightly related to the method for filtering information. The method adopted for information filtering has led to the standard classification of recommender systems, that is: demographic filtering, collaborative filtering, content–based and hybrid approaches. We add another method, named context–based, which has recently grown in popularity due to the feasibility of gathering external information about the items (e.g. gathering information from weblogs, analysing the reviews about the items, etc.). The following sections present the recommendation methods for one user. It is worth mentioning that another type of (group–based) recommender also exists. These recommenders focus on providing recommendations to a group of users, thus trying to maximise the overall satisfaction of the group (McCarthy et al., 2006; Chen et al., 2008).

2.5.1 Demographic filtering

Demographic filtering can be used to identify the kind of users that like a certain item (Rich, 1979). For example, one might expect to learn the type of person that likes a certain singer (e.g. finding the stereotypical user that listens to the Jonas Brothers6 band). This technique classifies the user profiles into clusters according to some personal data (age, marital status, gender, etc.), geographic data (city, country) and psychographic data (interests, lifestyle, etc.). An early example of a demographic filtering system is the Grundy system (Rich, 1979). Grundy recommended books based on personal information gathered from an interactive dialogue.

Limitations

The main problem of this method is that the system recommends the same items to people with similar demographic profiles, so recommendations are too general (or, at least, not very specific for a given user). Another drawback is the generation of the profile, which requires some effort from the user. Some approaches try to get (unstructured) information from users' webpages, weblogs, etc. In this case, text classification techniques are used to create the clusters and classify the users (Pazzani, 1999). All in all, this is the simplest recommendation method.

2.5.2 Collaborative filtering

The collaborative filtering approach predicts user preferences for items by learning past user–item relationships. That is, the user gives feedback to the system, so the system can provide informed guesses based on the feedback (e.g. ratings) that other users have provided. The first system that implemented the collaborative filtering method was the Tapestry project at Xerox PARC (Goldberg et al., 1992). The project coined the term collaborative filtering. Other early systems are: a music recommender named Ringo (Shardanand, 1994; Shardanand and Maes, 1995), and GroupLens, a system for rating USENET articles (Resnick et al., 1994). A compilation of other systems from that time period can be found in Resnick and Varian (1997). CF methods work by building a matrix of the user preferences (e.g. ratings) for the items. Each row represents a user profile, whereas the columns are items. The value R_{u_i,i_j} is the rating of the user u_i for the item i_j. Figure 2.3 depicts the matrix of user–item ratings.

6http://www.jonasbrothers.com/

Figure 2.3: User–item matrix for the collaborative filtering approach.

User–based neighbourhood

The predicted rating value of item i for the active user u, P_{u,i}, can be computed as the mean of the ratings of the users similar to u. Equation 2.1 shows the predicted rating score of item i for user u. \bar{R}_u is the average rating of user u, and R_{u,i} denotes the rating of the user u for the item i.

P_{u,i} = \bar{R}_u + \frac{\sum_{v \in Neighbours(u)}^{k} sim(u,v)\,(R_{v,i} - \bar{R}_v)}{\sum_{v \in Neighbours(u)}^{k} sim(u,v)}   (2.1)

This approach is also known as user–based collaborative filtering. Yet, to predict P_{u,i}, the algorithm needs to know beforehand the set of users similar to u (e.g. like–minded people), v \in Neighbours(u), how similar they are, sim(u,v), and the size of this set, k. This is analogous to solving the user–profile matching problem (see figure 2.1). The most common approaches to find the neighbours of u are Pearson correlation (see Equation 2.4), cosine similarity (see Equation 2.2), and clustering based on stereotypes (Montaner et al., 2003).
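A minimal sketch of the user–based prediction of Equation 2.1, assuming the neighbours and their similarities have already been computed; the toy ratings below are illustrative.

    def predict_user_based(user_mean, neighbours, item):
        """Predict P_{u,i} as in Eq. 2.1.

        neighbours: list of (similarity, neighbour_mean, neighbour_ratings dict).
        Only neighbours that actually rated `item` contribute.
        """
        num, den = 0.0, 0.0
        for sim, v_mean, v_ratings in neighbours:
            if item in v_ratings:
                num += sim * (v_ratings[item] - v_mean)
                den += abs(sim)
        return user_mean if den == 0 else user_mean + num / den

    # Toy data: the active user has a mean rating of 3.5; two neighbours rated item "i1"
    neighbours = [
        (0.9, 3.0, {"i1": 4, "i2": 2}),
        (0.4, 4.0, {"i1": 5}),
    ]
    print(round(predict_user_based(3.5, neighbours, "i1"), 2))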

Item–based neighbourhood

The item–based method exploits the similarity among the items. This method looks into the set of items that a user has rated, and computes their similarity to the target item (to decide whether it is worth recommending to the user or not). Figure 2.4 depicts the co–rated items from different users. In this case it shows the similarity between items i_j and i_k. Note that only users u_2 and u_i are taken into account, but u_{m-1} is not, because it has not rated both items. The first step is to obtain the similarity between two items, i and j. This similarity can be calculated using cosine similarity, Pearson correlation, adjusted cosine, or computing the conditional probability, P(j|i). Let the set of users who rated i and j be denoted by U, and let R_{u,i} denote the rating of user u on item i. Equation 2.2 shows the definition of the cosine similarity:

sim(i,j) = \cos(\vec{i},\vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\|\,\|\vec{j}\|} = \frac{\sum_{u \in U} R_{u,i}\,R_{u,j}}{\sqrt{\sum_{u \in U} R_{u,i}^2}\,\sqrt{\sum_{u \in U} R_{u,j}^2}}   (2.2)

However, for the item–based similarity, the cosine similarity does not take into account the differences in rating scale between different users. The adjusted cosine similarity (Equation 2.3) makes use of the user average rating for each co–rated pair, and copes with this limitation of cosine similarity. \bar{R}_u is the average rating of the u–th user:

sim(i,j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2}\,\sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}}   (2.3)
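A sketch of the adjusted cosine similarity of Equation 2.3 over a small user–item rating dictionary; the ratings are invented for the example.

    from math import sqrt

    def adjusted_cosine(ratings, i, j):
        """Adjusted cosine similarity (Eq. 2.3) between items i and j.

        ratings: dict user -> dict item -> rating. Only users who rated
        both items (the co-rated pairs) are taken into account.
        """
        co_rated = [u for u, r in ratings.items() if i in r and j in r]
        if not co_rated:
            return 0.0
        num = den_i = den_j = 0.0
        for u in co_rated:
            mean_u = sum(ratings[u].values()) / len(ratings[u])
            di, dj = ratings[u][i] - mean_u, ratings[u][j] - mean_u
            num += di * dj
            den_i += di ** 2
            den_j += dj ** 2
        if den_i == 0 or den_j == 0:
            return 0.0
        return num / (sqrt(den_i) * sqrt(den_j))

    ratings = {
        "u1": {"i1": 4, "i2": 5, "i3": 1},
        "u2": {"i1": 2, "i2": 3},
        "u3": {"i2": 4, "i3": 4},   # did not rate i1, so ignored for (i1, i2)
    }
    print(round(adjusted_cosine(ratings, "i1", "i2"), 3))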

Figure 2.4: User–item matrix with co–rated items for item–based similarity. To compute the similarity between items i_j and i_k, only users u_2 and u_i are taken into account, but u_{m-1} is not, because it has not rated both items (its rating for i_k is ∅).

Correlation–based similarity commonly uses the Pearson r correlation. The correlation between two variables reflects the degree to which the variables are related. Equation 2.4 defines the correlation similarity. \bar{R}_i is the average rating of the i–th item:

sim(i,j) = \frac{Cov(i,j)}{\sigma_i\,\sigma_j} = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_i)^2}\,\sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_j)^2}}   (2.4)

Equation 2.5 defines similarity using the conditional probability, P(j|i):

sim(i,j) = P(j|i) \simeq \frac{f(i \cap j)}{f(i)}   (2.5)

where f(X) equals the number of customers who have purchased the item set X. This is the only metric that is asymmetric, that is, sim(i,j) \neq sim(j,i).

Once the similarity among the items has been computed, the next step is to predict, for the target user u, a value for the active item i. A common way is to capture how the user rates the items similar to i. Let S^k(i;u) denote the set of k neighbours of item i that the user u has rated. The predicted value is based on the weighted sum of the user's ratings,

\forall j \in S^k(i;u). Equation 2.6 shows the predicted value of item i for user u.

P_{u,i} = \frac{\sum_{j \in S^k(i;u)} sim(i,j)\,R_{u,j}}{\sum_{j \in S^k(i;u)} sim(i,j)}   (2.6)

Limitations

Collaborative filtering is one of the most used methods in existing social–based recommender systems, yet the approach presents some drawbacks:

• Data sparsity and high dimensionality are two inherent properties of the datasets. With a relatively large number of users and items, the main problem is the low coverage of the users' ratings among the items. It is common to have a sparse user–item matrix with 1% coverage. Thus, sometimes it can be difficult to find reliable neighbours (for user–based CF).

• Another problem, related to the previous one, is that users with atypical tastes (that vary from the norm) will not have many users as neighbours, which leads to poor recommendations. This problem is also known as gray sheep (Claypool et al., 1999).

• Cold–start problem. This problem appears for both elements of a recommender: users and items. Because CF is based on users' ratings, new users with only a few ratings become more difficult to categorise. The same problem occurs with new items: they do not have many ratings when they are added to the collection, and cannot be recommended until the users start rating them. This problem is known as the early–rater problem (Avery and Zeckhauser, 1997). Moreover, the first user that rates a new item gets only little benefit (the new item does not match with any other item yet).

• CF is based only on the feedback provided by the users (in terms of ratings, purchases, downloads, etc.), and does not take into account the description of the items. It is a subjective method that aggregates the social behaviour of the users, thus commonly leading towards recommending the most popular items.

• Related to the previous issue, the popularity bias is another problem that commonly happens in CF. It is analogous to the "rich gets richer" paradigm. Popular items of the dataset are similar to (or related with) lots of items. Thus, it is more probable that the system recommends these popular items. This clearly happens for item–based similarity using conditional probability (defined in Equation 2.5). The main drawback is that the recommendations are sometimes biased towards popular items, thus not exploring the Long Tail of unknown items. Sometimes, these less–popular items could be more interesting and novel for the users.

• Given the interactive behaviour of CF systems, previous social interaction influences the current user behaviour, which, in turn, feeds back into the system, creating a loop. This issue is also known as the feedback loop (Salganik et al., 2006). This effect has strong consequences when the system starts gathering initial feedback from the users. Indeed, the early raters have an effect on the recommendations that the incoming users will receive when entering the system.

Figure 2.5: Distance among items using content–based similarity.

2.5.3 Content–based filtering

In the content–based (CB) filtering approach, the recommender collects information describing the items and then, based on the user's preferences, it predicts which items the user could like. This approach does not rely on other users' ratings but on the description of the items. The process of characterising the item data set can be automatic (e.g. extracting features by analysing the content), based on manual annotations by domain experts, or even using the tags from the community of users (e.g. using those tags from the folksonomy that clearly describe the content of the items). The key component of this approach is the similarity function among the items (see Figure 2.5).

Initial CB approaches have their roots in the information retrieval (IR) field. The early systems focused on the text domain, and applied techniques from IR to extract meaningful information from the text. Yet, some solutions have recently appeared that cope with more complex domains, such as music. This has been possible, partly, because the multimedia community emphasised and improved the feature extraction and machine learning algorithms.

The similarity function computes the distance between two items. Content–based similarity focuses on an objective distance among the items, without introducing any subjective factor into the metric (as CF does). Most of the distance metrics deal with numeric attributes, or single feature vectors. Some common distances, given two feature vectors x and y, are: Euclidean (Equation 2.7), Manhattan (Equation 2.8), Chebychev (Equation 2.9), cosine distance for vectors (see the previously defined Equation 2.2), and the Mahalanobis distance (Equation 2.10).

d(x,y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}   (2.7)

d(x,y) = \sum_{i=1}^{n} |x_i - y_i|   (2.8)

d(x,y) = \max_{i=1..n} |x_i - y_i|   (2.9)

d(x,y) = \sqrt{(x-y)^T S^{-1} (x-y)}   (2.10)

The Euclidean, Manhattan and Chebychev distances assume that the attributes are orthogonal. The Mahalanobis distance is more robust to dependencies among attributes, as it uses the covariance matrix S.
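A compact sketch of these four distances using NumPy; the feature vectors and covariance matrix are illustrative, whereas in practice they would come from the item descriptors.

    import numpy as np

    def euclidean(x, y):
        return np.sqrt(np.sum((x - y) ** 2))          # Eq. 2.7

    def manhattan(x, y):
        return np.sum(np.abs(x - y))                   # Eq. 2.8

    def chebyshev(x, y):
        return np.max(np.abs(x - y))                   # Eq. 2.9

    def mahalanobis(x, y, S):
        d = x - y                                      # Eq. 2.10, S is the covariance matrix
        return np.sqrt(d @ np.linalg.inv(S) @ d)

    x = np.array([0.2, 0.7, 1.5])
    y = np.array([0.1, 0.9, 1.0])
    S = np.cov(np.random.default_rng(0).normal(size=(100, 3)), rowvar=False)
    print(euclidean(x, y), manhattan(x, y), chebyshev(x, y), mahalanobis(x, y, S))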

If the attributes are nominal (not numeric), a delta function can be used. A simple definition of a delta function could be: \delta(a,b) = 0 \Leftrightarrow a = b, and \delta(a,b) = 1 otherwise. Then, a distance metric among nominal attributes can be defined as (where \omega is a reduction factor, e.g. 1/n):

d(x,y) = \omega \sum_{i=1}^{n} \delta(x_i, y_i)   (2.11)

Finally, if the distance to be computed has to cope with both numeric and nominal attributes, then the final distance has to combine two equations (2.11 for the nominal attributes and one of 2.7–2.10 for the numeric ones).

In some cases, items are not modelled with a single feature vector, but using a bag–of–vectors, a time series, or a distribution over the feature space (section 3.4.2 presents some examples of similarity metrics using more complex data than a single feature vector). Yet, similarity measures are not always objective. In some domains, similarity is very context–dependent. Actually, the subjective part is a big factor of the measure, and these measures do not take it into account. There are several context–dependent elements that should be considered (e.g. to whom?, when?, where?, and especially why?).

Limitations

The CB approach presents some drawbacks:

• The cold–start problem occurs when a new user enters the system. The system has yet to adapt to the user preferences.

• The gray–sheep problem (users with atypical tastes) can occur, too, depending on the size of the collection, or if the collection is biased towards a concrete genre.

• Another potential caveat is the novelty problem. Assuming that the similarity function works accurately, one might expect that a user will always receive items too similar to the ones in her profile. To cope with this shortcoming, the recommender should use other factors to promote the eclecticness of the recommended items.

• Depending on the domain complexity, another drawback is the limitation of the features that can be (automatically) extracted from the objects. For instance, in the multimedia arena it is nowadays still difficult to extract high–level descriptors with a clear meaning for the user. Music analysis is not yet ready to accurately predict the mood but, on the other hand, it does the job well when dealing with descriptors such as harmony, rhythm, etc. Thus, an item description is not close enough to the user, but the automatic description is still useful to compute item similarity (e.g. among songs).

• Another shortcoming is that the recommender focuses on finding similarity among items using only the features describing the items. The method is limited by the features that are explicitly associated with the items. This means that subjectivity (or personal opinions) is not taken into account when recommending items to users.

CB methods solve some of the shortcomings of collaborative filtering. The early–rater problem disappears: when a new item is added to the collection—and its similarity to the rest of the items is computed—it can be recommended without being rated by any user. The popularity bias is solved too. Because there is no human intervention in the process, all the items are considered (in principle) to be of equal importance.

2.5.4 Context–based filtering

Context vs. content

Context is any information that can be used to characterise the situation of an entity (Abowd et al., 1999). Context–based recommendation, then, uses contextual information to describe and characterise the items. To compare content and context–based filtering, a clear example is the different methods used for email spam detection. The common one is based on the text analysis of the mail (i.e. content–based), whereas context filtering does not deal with the content of the mail; it rather uses the context of the SMTP connection to decide whether an email should be marked as spam or not. In this section, we briefly outline two techniques, web mining and social tagging, that can be used to derive similarity among the items (or users), and can also provide effective recommendations. Web mining is based on analysing the available content on the Web, as well as the usage of and interaction with the content. Social tagging mines the information gathered from a community of users that annotate (tag) the items.

Web Mining

Web mining techniques aim at discovering interesting and useful information from the analysis of the content and its usage. Kosala and Blockeel (2000) identify three different web mining categories: content, structure and usage mining.

• Web content mining includes text, hypertext, markup, and multimedia mining. From the analysis of the content, item similarity can be derived. Some examples are: opinion extraction (sentiment analysis), weblog analysis, mining customer reviews, extracting information from forums or chats, topic recognition and demographic identification (gender, age, etc.), and trend identification.

• Web structure mining focuses on link analysis (in– and out–links), that is, the analysis of the network topology (e.g. hubs, authorities), and algorithms that exploit the topology (e.g. HITS and PageRank).

• Web usage mining uses the information available in session logs. This information can be used to derive user habits and preferences, link prediction, or item similarity based on co–occurrences in the session log. Thus, web usage mining can determine sequential patterns of usage (e.g. "people who visit this page also visited this one"). For instance, Mobasher et al. (2000) use association rules to determine the sequential patterns of web pages, and recommend web pages to the users.
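As an illustration of usage mining, the sketch below derives a simple item–to–item similarity from co–occurrences in session logs, in the spirit of the conditional probability of Equation 2.5; the sessions are hypothetical.

    from collections import Counter
    from itertools import combinations

    # Hypothetical session logs: each session is the set of items visited together.
    sessions = [
        {"a", "b"},
        {"a", "b", "c"},
        {"a", "c"},
        {"a", "d"},
    ]

    item_count = Counter()
    pair_count = Counter()
    for s in sessions:
        item_count.update(s)
        pair_count.update(frozenset(p) for p in combinations(sorted(s), 2))

    def cooccurrence_sim(i, j):
        """Conditional-probability style similarity: P(j | i) ~ f(i and j) / f(i)."""
        return pair_count[frozenset((i, j))] / item_count[i]

    print(cooccurrence_sim("a", "b"), cooccurrence_sim("b", "a"))  # asymmetric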

Combining these three approaches, a recommender system derives the similarity among the items (e.g. items that co–occur in the same pages, items that are visited in the same session log, etc.) and also models a user, based on her interaction with the content. If the information about the content is in textual form, classic measures of Information Retrieval can be applied to characterise the items. For instance, vector space–based models can be used to model both the items and the user profile. Similarity between an item description (using the bag–of–words model) and a user profile can be computed using cosine–based similarity.

Cosine–based similarity between an item i_j and a user profile u_i is defined as:

sim(u_i, i_j) = \frac{\sum_{t} w_{t,u_i}\, w_{t,i_j}}{\sqrt{\sum_{t} w_{t,u_i}^2}\,\sqrt{\sum_{t} w_{t,i_j}^2}}   (2.12)

A common term weighting function, w_{i,j}, is TF–IDF. TF stands for Term Frequency, whereas IDF is the Inverse Document Frequency (Salton and McGill, 1986). The term frequency in a given document measures the importance of the term i within that particular document. Equation 2.13 defines TF:

TF_i = \frac{n_i}{\sum_k n_k}   (2.13)

with n_i being the number of occurrences of the considered term, and the denominator is the number of occurrences of all the terms in the document. The Inverse Document Frequency, IDF, measures the general importance of the term in the whole collection of items:

IDF_i = \log \frac{|D|}{|\{d_i \supset t_i\}|}   (2.14)

where |D| is the total number of items, and the denominator counts the number of items in which t_i appears. Finally, the weighting function w_{t,j} of a term t in the item description d_j is computed as:

w_{t,j} = TF \cdot IDF   (2.15)

Another useful measure to compute item similarity is the Pointwise Mutual Information (PMI). PMI estimates the semantic similarity between a pair of terms by how frequently they co–occur. The PMI of two terms i and j quantifies the discrepancy between their joint distribution probability and their individual distribution probabilities (assuming independence):

PMI(i,j) = \log \frac{p(i,j)}{p(i)\,p(j)}   (2.16)

The PMI measure is symmetric, that is, PMI(x,y) = PMI(y,x).
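A small sketch of TF–IDF weighting (Equations 2.13–2.15) and the cosine match of Equation 2.12 over toy textual descriptions; the documents and the "user profile" bag of words are made up for the example.

    from math import log, sqrt
    from collections import Counter

    docs = {
        "item1": "melancholic piano ballad piano",
        "item2": "energetic guitar rock anthem",
        "user":  "piano ballad lover",   # the user profile, treated as a bag of words
    }

    tokenised = {name: text.split() for name, text in docs.items()}
    n_docs = len(tokenised)

    def tf_idf(tokens):
        """TF-IDF vector (Eqs. 2.13-2.15) for one bag of words."""
        tf = Counter(tokens)
        total = sum(tf.values())
        vec = {}
        for term, count in tf.items():
            df = sum(1 for toks in tokenised.values() if term in toks)
            vec[term] = (count / total) * log(n_docs / df)
        return vec

    def cosine(a, b):
        """Cosine similarity (Eq. 2.12) between two sparse term-weight vectors."""
        num = sum(a[t] * b.get(t, 0.0) for t in a)
        den = sqrt(sum(w * w for w in a.values())) * sqrt(sum(w * w for w in b.values()))
        return 0.0 if den == 0 else num / den

    user_vec = tf_idf(tokenised["user"])
    for item in ("item1", "item2"):
        print(item, round(cosine(user_vec, tf_idf(tokenised[item])), 3))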

Social tagging

Social tagging (also known as Folksonomy, or Collaborative tagging) aims at annotating web content using tags. Tags are freely chosen keywords, not constrained to a predefined vocabulary. A bottom–up classification emerges when grouping all the annotations (tags) from the community of users. Mining social tagging data can help recommender systems to derive item (or user) similarity. When users tag items, we get tuples of ⟨user, item, tag⟩. These triples form a 3–order matrix (also called a tensor, a multidimensional matrix). Figure 2.6 depicts a 3–order tensor containing the tags that the users apply to the items.

Figure 2.6: 3–order tensor containing ⟨user, item, tag⟩ triples.

There are two main approaches to use social tagging information in recommendation:

1. Unfold the 3–order tensor into three bidimensional matrices (user–tag, item–tag and user–item matrices), and

2. Directly use the 3–order tensor.

Unfolding the 3–order tensor consists of decomposing the multidimensional data into the following bidimensional matrices:

• User–Tag (U matrix). U_{i,j} contains the number of times user i applied the tag j. Using matrix U, a recommender system can derive a user profile (e.g. a tag cloud for each user, denoting her interests, or the items she tags). U can also be used to compute user similarity, comparing two user tag clouds of interests using the cosine similarity of the two vectors.

• Item–Tag (I matrix). I_{i,j} contains the number of times an item i has been annotated with tag j. The matrix I contains the contextual description of the items, based on the tags that have been applied to them. Matrix I can be used to compute item or user similarity. As an example, Figure 2.7 shows a way to derive user similarity from I: it depicts two user tag clouds (top and middle images) and their intersection (bottom image), using matrix I. In this example, the users' tag clouds are derived from their last.fm listening habits, using their top–N most listened artists —in this case, the items in I. The third image (bottom) shows the tags that co-occur the most in the two profiles. Similarity between the two users is computed by constructing a new tag vector where each tag's weight is given by the minimum of the tag's weights in the two users' vectors. Using this approach, the similarity value between the ocelma and lamere last.fm users is 70.89%. Another similarity metric could be the cosine distance, using TF–IDF to weight each tag.

• User–Item (R binary matrix). R_{i,j} denotes whether the user i has tagged the item j. In this case, classic collaborative filtering techniques can be applied on top of R.

Figure 2.7: Two examples of users' tag clouds derived from their last.fm listening habits. Top and middle images show two last.fm user tag clouds. The third image (bottom) shows the tags that co-occur the most in the two profiles. According to Anthony Liekens' algorithm, the similarity value between the ocelma and lamere last.fm users is 70.89%. Image courtesy of Anthony Liekens, taken from http://anthony.liekens.net/pub/scripts/last.fm/compare.php.

To recap, item similarity using I, or user similarity derived from U or I, can be computed using the cosine–based distance (see Equation 2.2), or also by applying dimensionality reduction techniques —to deal with the sparsity problem—, such as Singular Value Decomposition (SVD), or Non–negative Matrix Factorisation (NMF). Once the item (or user) similarity is computed, either the R user–item matrix, or the user profile (tag cloud) obtained from U or I, is used to predict the recommendations for a user. For instance, Ji et al. (2007) present a framework based on the three matrices, U, I and R, to recommend web pages (based on http://del.icio.us data). Also, Tso-Sutter et al. (2008) use matrix I to improve the accuracy of the recommendations, after combining I with the results obtained by classic collaborative filtering. Levy and Sandler (2007) apply Latent Semantic Analysis (that is, SVD and cosine similarity in the reduced space) to compute and visualise artist similarity derived from tags gathered from last.fm. Finally, it is worth mentioning that by transposing either the U or I matrices, one can also compute tag similarity. Tag similarity has many uses in recommendation and search engines. For instance, tag synonym detection can be used for query expansion, or tag suggestion when annotating the content.
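A sketch of the tag–cloud intersection described above: each user is represented by a tag→weight vector, and the shared profile takes the minimum weight per tag. The tag clouds are invented, and the normalisation into a single score is one arbitrary choice, not the exact formula used by Liekens.

    # Hypothetical tag clouds (tag -> weight), e.g. aggregated from matrix I
    # over each user's most listened artists.
    user_a = {"indie rock": 40, "singer-songwriter": 25, "jazz": 10}
    user_b = {"indie rock": 30, "jazz": 35, "electronic": 20}

    def shared_cloud(a, b):
        """Intersection cloud: per-tag weight is the minimum of the two profiles."""
        return {t: min(a[t], b[t]) for t in a.keys() & b.keys()}

    def overlap_score(a, b):
        """Normalised overlap: shared weight over the smaller profile's total weight."""
        shared = sum(shared_cloud(a, b).values())
        return shared / min(sum(a.values()), sum(b.values()))

    print(shared_cloud(user_a, user_b))          # {'indie rock': 30, 'jazz': 10}
    print(round(overlap_score(user_a, user_b), 2))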

Using the 3–order tensor (instead of decomposing it into bidimensional matrices) is the second approach to mine the data and provide recommendations. The available techniques are (high–order) extensions of SVD and NMF. HOSVD is a higher–order generalisation of matrix SVD for tensors, and Non-negative Tensor Factorisation (NTF) is a generalisation of NMF. Symeonidis et al. (2008) apply HOSVD to a music dataset (user–artists–tags) taken from last.fm. Their results show significant improvements in terms of effectiveness, measured through precision and recall. Xu et al. (2006) present a similar method using bookmarking data from del.icio.us. They apply SVD on the R matrix, compute the cosine distance among the users (to find the neighbours), and then apply classic CF user–based recommendation (see section 2.5.2). The authors improved the results over a CF approach based on SVD and cosine similarity (e.g. Latent Semantic Analysis).

Limitations of Social Tagging

One of the main limitations of social tagging is coverage. It is quite common that only the most popular items are described by several users, creating a compact description of the item. On the other hand, long tail items usually do not have enough tags to characterise them. This makes the recommendation process very difficult, especially when trying to promote these unknown items. Another issue is that, without being constrained to a controlled vocabulary, tags present the following problems: polysemy (the tag love meaning "I love this song" versus "this song is about love"), synonymy (hip–hop, hiphop, and rap), and the limited usefulness of personal tags (e.g. seen live, or to check) to derive similarity among users or items. These issues make it more difficult to mine and extract useful relationships among the items and the users. Finally, tag usage is another problem. In some domains, some tags are widely used (e.g. rock, in the music domain), whereas other tags are rarely applied (e.g. gretsch guitar). A biased distribution of the terms also has consequences when exploiting social tagging data.

2.5.5 Hybrid methods

The main purpose of a hybrid method is to achieve a better prediction by combining some of the previous stand–alone approaches. Most commonly, collaborative filtering is combined with other techniques. There are different ways to integrate different approaches into a hybrid recommender. Burke (2002) defines the following methods:

• Weighted. A hybrid method that combines the output of separate approaches using, for instance, a linear combination of the scores of each recommendation technique (see the sketch after this list).

• Switching. The system uses some criterion to switch between recommendation techniques. One possible solution is that the system uses one technique and, if the results are not confident enough, it switches to another technique to improve the recommendations.

• Mixed. In this approach, the recommender does not combine but rather expands the description of the data sets, by taking into account both the users' ratings and the description of the items. The new prediction function has to cope with both types of descriptions.

• Cascade. The cascade involves a step–by–step process. In this case, a recommendation technique is applied first, producing a coarse ranking of items. Then, a second technique refines or re–ranks the results obtained in the first step.

A hybrid method can alleviate some of the drawbacks of a single technique.
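A minimal sketch of the weighted hybridisation strategy, linearly combining the per–item scores of two stand–alone recommenders; the component scores and the weights are illustrative.

    def weighted_hybrid(score_lists, weights):
        """Linearly combine per-item scores from several recommenders."""
        combined = {}
        for scores, w in zip(score_lists, weights):
            for item, s in scores.items():
                combined[item] = combined.get(item, 0.0) + w * s
        return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

    cf_scores = {"songA": 0.9, "songB": 0.4, "songC": 0.7}   # collaborative filtering
    cb_scores = {"songA": 0.2, "songB": 0.8, "songD": 0.6}   # content-based
    print(weighted_hybrid([cf_scores, cb_scores], weights=[0.6, 0.4]))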

2.6 Factors affecting the recommendation problem

Novelty and serendipity

The novelty factor is a very important aspect of the recommendation problem. It has been largely acknowledged that providing obvious recommendations can decrease user satisfaction (Herlocker et al., 2004; McNee et al., 2006). Obvious recommendations have two practical disadvantages: users who are interested in those items probably already know them, and secondly, managers in stores (i.e. experts in the items' domain) do not need any recommender to tell them which products are popular overall. Still, obvious recommendations do have some value for new users: users like to receive some recommendations they are already familiar with (Swearingen and Sinha, 2001). This is related to the Find credible recommender use case (see Section 2.2). Yet, there is a trade–off between the desire for novelty and for familiar recommendations. A high novelty rate might mean, for a user, that the quality of the recommendation is poor, because the user is not able to identify most of the items in the list of recommendations. However, by providing explanations (transparency) for the recommendations, the user can feel that it is a credible recommender, and thus be more open to receiving novel, justified, recommendations. Another important feature, closely related to novelty, is the serendipity effect; that is, the good luck in making unexpected and fortunate discoveries. A recommender should help the user to find a surprisingly interesting item that she might not be able to discover otherwise. Recommendations that are serendipitous are also novel and relevant for a user.

Explainability

Explainability (or transparency) of the recommendations is another important element. Giving explanations about the recommended items can increase the user's trust in and loyalty to the system, as well as her satisfaction. A recommender should be able to explain to the user why the system recommends the list of top–K items (Sinha and Swearingen, 2002). Herlocker et al. (2000) present experimental evidence showing that providing explanations can improve the acceptance of recommender systems based on CF. Actually, giving explanations about why the items were recommended is as important as the actual list of recommended items. Tintarev and Masthoff (2007) summarise the possible aims of providing explanations about the recommendations. These are: transparency, scrutability, trust, effectiveness, persuasiveness, efficiency, and satisfaction. The authors also stress the importance of personalising the explanations to the user.

Cold start problem

The cold start problem of a recommender (also known as the learning rate curve, or the bottleneck problem) happens when a new user (or a new item) enters the system (Maltz and Ehrlich, 1995). On the one hand, cold start is a problem for new users that start playing around with the system, because the system does not have enough information about them. If the user profile initialisation is empty (see section 2.4.1), the user has to dedicate some effort to using the system before getting any reward (i.e. useful recommendations). On the other hand, when a new item is added to the collection, the system should have enough information to be able to recommend this item to the users.

Data sparsity and high dimensionality

Data sparsity is an inherent property of the dataset. With a relatively large number of users and items, the main problem is the low coverage of the users' interaction with the items. A related factor is the high dimensionality of the dataset, which consists of many users and items. There are some methods, based on matrix factorisation, to alleviate the data sparsity and the high dimensionality of the dataset. Singular Value Decomposition (SVD) and Non–negative Matrix Factorisation (NMF) (Paatero and Tapper, 1994; Lee and Seung, 1999) are the two most used methods in recommendation. Also, in (Takács et al., 2008), the authors present several matrix factorisation algorithms, and evaluate the results against the Netflix Prize dataset.
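As an illustration, a low–rank approximation of a toy user–item matrix via truncated SVD with NumPy; in practice the matrix is far larger and sparser, and specialised factorisation methods are used instead of a dense decomposition.

    import numpy as np

    # Toy user-item rating matrix (0 = unrated)
    R = np.array([
        [5, 4, 0, 1],
        [4, 0, 0, 1],
        [1, 1, 0, 5],
        [0, 1, 5, 4],
    ], dtype=float)

    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    k = 2                                    # keep the k strongest latent factors
    R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

    print(np.round(R_hat, 2))                # dense reconstruction; unrated cells get predictions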

Coverage

The coverage of a recommender measures the percentage of the items in the collection over which the system can form predictions, or make recommendations. A recommender with low coverage of the domain might be less valuable to users, as it limits the space of possible items to recommend. Moreover, this feature is important for the Find all good items use case (see section 2.2). Also, a low coverage of the collection can be very frustrating for the users, and clearly affects the novelty and serendipity factors.

Trust

Trust–aware recommender systems determine which users are reliable, and which are not. Trust computational models are needed, for instance, in user–based CF to decide which of the user's neighbours to rely on. O'Donovan and Smyth (2005) present two computational models of trust and show how they can be readily incorporated into CF. Furthermore, combining trust and classic CF can improve the predictive accuracy of the recommendations. In (Massa and Avesani, 2007), the authors emphasise the "web of trust" provided by every user. They use the "web of trust" to propagate trust among users, and also to alleviate the data sparsity problem. An empirical evaluation shows that using trust information improves predictive accuracy, as well as the coverage of the recommendations.

Temporal effects

Temporal effects play an important role in recommender systems. The timestamp of an item (e.g. when the item was added to the collection) is an important factor for the recommendation algorithm. The prediction function should take into account the age of the items; a common approach is to treat the older items as less relevant than the new ones. Also, the system has to decide which items from the user profile are taken into account for the predictions. Should the system use all the information about a user, or only the latest? This can clearly change the provided recommendations. In this context, Shani et al. (2002) present the recommendation problem as a sequential optimisation problem, based on Markov decision processes (MDP). The MDP approach models the long–term effects of the recommendations, but it is also configurable to use only the last k actions of a user. The main problem, though, is the computational complexity of the algorithm, which makes it unusable for large datasets.
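A simple sketch of one common temporal treatment: exponentially decaying the weight of a user's past ratings with their age. The half–life value is an arbitrary choice made for the example, not a recommendation from the literature.

    from math import exp, log

    HALF_LIFE_DAYS = 90.0   # arbitrary: a rating loses half its weight every 90 days

    def time_weight(age_days, half_life=HALF_LIFE_DAYS):
        """Exponential decay weight for an observation `age_days` old."""
        return exp(-log(2) * age_days / half_life)

    # Weighted average of past ratings, favouring the most recent ones
    ratings = [(5, 10), (3, 200), (4, 400)]   # (rating, age in days)
    num = sum(r * time_weight(a) for r, a in ratings)
    den = sum(time_weight(a) for r, a in ratings)
    print(round(num / den, 2))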

Understanding the users

Modelling user preferences, including psychographic information, is another challenging problem. Psychographic variables include attributes related to personality, such as attitudes, interests, or lifestyle. It is not straightforward to encode all this information and make it useful for a system. A similar problem exists in Information Retrieval (IR) systems, where the user has to express her needs via a keyword–based query: there is always a loss of information when a user formulates a query using a language that the machine can understand and process.

2.7 Summary

This chapter has presented and formalised the recommendation problem. The main components of a recommender are the users and the items. Based on the user preferences and the exploitation of a user profile, a recommender can solve the problem of recommending items to users. There are several factors that affect the recommendation problem. In this thesis we focus especially on novelty; we believe that this is an important topic that deserves to be analysed in depth. To recap, Table 2.1 presents the main elements involved in the recommendation problem, that is, user profiling and the recommendation methods. Then, chapter 3 applies all these concepts in the music recommendation domain. Furthermore, the special requirements to solve the music recommendation problem are presented too.

User profile
    Initial generation: empty, manual, data import, training set, stereotyping
    Maintenance: implicit relevance feedback, explicit relevance feedback
    Adaptation: manual, add new information, gradually forget old interests

Recommendation methods
    Matching: user–item profile, user–user profile (neighbours)
    Filtering method: demographic filtering, collaborative filtering, content–based filtering, context–based filtering, hybrid methods

Table 2.1: Summary of the elements involved in the recommendation problem.

Chapter 3

Music recommendation

This chapter presents the music recommendation problem. Section 3.1 presents some common use cases in the music domain. After that, section 3.2 discusses user profiling and modelling, and how to link the components of a user profile with the music concepts. Then, section 3.3 presents the elements that describe the musical items (i.e. artists and songs). The existing music recommendation methods (collaborative filtering, content–based, context–based, and hybrid) are presented in section 3.4. Finally, section 3.5 summarises the work in this area, and provides some links with the remaining chapters of the thesis.

3.1 Use Cases

The main task of a music recommendation system is to propose interesting music to the user, including unknown artists and their available tracks, based on the user's musical taste. Music is somewhat different from other entertainment domains, such as movies or books. Tracking user preferences is done implicitly, via the user's listening habits. Usually, explicit feedback is not gathered in terms of ratings, but in terms of playing, skipping, or stopping a recommended track. Most of the work done in music recommendation focuses on presenting to a user a list of artists, or creating an ordered sequence of songs (a personalised playlist). To achieve this, the most common approaches are based on collaborative filtering and audio content–based filtering. Yet, other (context–based) approaches, such as social tagging and music web mining, have recently appeared that can also be used for this purpose.


3.1.1 Artist recommendation

According to the general model presented in chapter 2 (see Figure 2.1), artist recommendation follows the user–item matching, where items are recommended to a user according to her profile. However, artist recommendation should involve a broader experience with the user, beyond presenting a list of relevant artists and the associated metadata. In this sense, there is a lot of music–related information on the Internet: music performed by "unknown" —long tail— artists that can suit new recommendations perfectly, new music releases, related news, announcements of concerts, album reviews, mp3–blogs, podcast sessions, etc. Indeed, music websites syndicate (part of) their web content—notifying the user about new releases, artists' related news, upcoming gigs, etc.—in the form of RSS (Really Simple Syndication) feeds. For instance, the iTunes Music Store1 provides an RSS feed generator2, updated once a week, that publishes all the new releases of the week. A music recommendation system should take advantage of these publishing services, as well as integrating them into the system, to filter and recommend music related information to the user.
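As a sketch of how such feeds could be consumed, the snippet below parses an RSS 2.0 document with the Python standard library and keeps the entries that mention artists in the user's profile. The feed content, URLs and artist list are hypothetical; a real system would fetch the feed over HTTP and use the actual feed schema.

    import xml.etree.ElementTree as ET

    # Hypothetical RSS 2.0 feed of new releases (normally fetched over HTTP)
    rss = """<rss version="2.0"><channel>
      <item><title>P.J. Harvey - New Album</title><link>http://example.org/1</link></item>
      <item><title>Some Other Band - Single</title><link>http://example.org/2</link></item>
    </channel></rss>"""

    user_artists = {"p.j. harvey", "the dogs d'amour"}

    root = ET.fromstring(rss)
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        if any(artist in title.lower() for artist in user_artists):
            print(title, "->", item.findtext("link"))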

3.1.2 Neighbour recommendation

The goal of neighbour recommendation is to find like–minded people, and through them discover unknown and interesting music. Neighbour similarity can be computed using the user–user profile matching presented in Figure 2.1. One of the main advantages of creating neighbourhoods is that a user can explore her similar users, easing the music discovery process. Also, it permits the creation of social networks, connecting people that share similar interests.

3.1.3 Playlist generation

Playlist generation is an important application in music recommendation, as it allows users to listen to the music and provide immediate feedback, so the system can react accordingly. There are several ways to automatically create a playlist: shuffle (i.e. random), based on a given song —or artist— seed, or based on a user profile (including her like–minded neighbours). There are two main modes of playlist generation: (i) using tracks drawn from the user's own collection (which is typical of shuffle play), and (ii) using tracks drawn from the celestial jukebox (i.e. available from outside the user's own collection), where shuffle play is not very useful at all, but a personalised playlist makes more sense.

1http://www.apple.com/itunes
2http://phobos.apple.com/WebObjects/MZStoreServices.woa/wa/MRSS/

Shuffle, random playlists

Interestingly enough, some experiments have been carried out to investigate serendipity in random playlists. Nowadays, shuffle is still the usual way to generate playlists on personal computers and portable music players. Leong et al. (2005) study the serendipity property through shuffle playlists, and report some user experiences. The authors argue that shuffle can invest new meaning in a particular song: it provides opportunities for unexpected re-discoveries, and re–connects songs with old memories. However, we believe that serendipity can be achieved by creating more personalised and clever playlists.

Personalised playlists

Radio–a–la–carte, or personalised playlists, are another way to propose music to a user. In this case, music is selected in terms of the user preferences. The user can also provide feedback (e.g. Skip this song, More like this, etc.) according to her taste and the actual listening context. Playlists based on song co–occurrences typically use web data mining techniques to infer the similarity of the songs; that is, to crawl public playlists and compute song co–occurrence from this dataset. However, the assumption that song co–occurrence in a playlist means that these songs are similar is arguable. Another problem is the garden state effect —where a single album can drive artist similarities because that album appears often as a playlist. Also, what if the playlist was created randomly or, on the other hand, it was created for a very specific purpose (e.g. a birthday party)? In these cases, similarity derived from co-occurrence is not very useful. Audio content–based (CB) similarity playlists are still not mature. Audio CB does not take into account any context when computing similarity among songs, thus it can produce a very eclectic playlist spanning very different genres and styles. Playlists can contain a lot of context, and only humans are able to interpret it (e.g. "music about my 1984 holidays"). In fact, according to a user survey, only 25% of the mixes are organised using content–related information, such as artist, genre or style; the rest are based on contextual information (Cunningham et al., 2006).

Research in the MIR field includes immediate user feedback and audio similarity (Pampalk et al., 2005; Pampalk and Gasser, 2006), user evaluation (Pauws and van de Wijdeven, 2005), measuring diversity in playlists (Slaney and White, 2006), and playlists with physiological purposes (e.g. jogging) (Oliver and Kregor-Stickles, 2006).

3.2 User profile representation

Music is an important vehicle for telling other people something relevant about our personality, history, etc. Musical taste and music preferences are affected by several factors, including demographic and personality traits. It seems reasonable to think that combining music preferences and personal aspects —such as age, gender, origin, occupation, musical education, etc.— can improve music recommendation (Uitdenbogerd and van Schnydel, 2002). User modelling has been studied for many years. Yet, extending a user profile with music–related information has not been largely investigated. This is an interesting way to communicate with other people, and to express music preferences3.

3.2.1 Type of listeners

Jennings (2007) summarises the four degrees of interest in music, or types of listeners, identified in the UK 2006 Project Phoenix 2. The study is based on the analysis of different types of listeners, in an age group ranging from 16 to 45. The classification includes:

• Savants. Everything in life seems to be tied up with music. Their musical knowledge is very extensive. As expected, they only represent 7% of the 16–45 age group.

• Enthusiasts. Representing 21% of the 16–45 age group, for the enthusiasts music is a key part of life, but is also balanced by other interests.

• Casuals. Music plays a welcome role, but other things are far more important. They represent 32% of the 16–45 age group.

• Indifferents would not lose much sleep if music ceased to exist. Representing 40% of the 16–45 age group, they are the predominant type of listener in the whole population.

3Nowadays, it is very common to embed in a webpage a small widget that displays the most recent tracks a user has played.

Figure 3.1: The four types of music listeners: savants, enthusiasts, casuals, and indifferents. Each type of listener needs a different type of recommendations.

Each type of listener needs a different type of recommendations. Savants do not really need popular recommendations, but risky and clever ones; they are the most difficult listeners to provide recommendations for, because they are very demanding. Enthusiasts appreciate a balance between interesting, unknown recommendations and familiar ones. Casuals and indifferents (72% of the population) do not need any complicated recommendations; probably popular, mainstream music that they can easily identify would fit their musical needs. Thus, a recommender system should be able to detect the type of user and act accordingly.

3.2.2 Related work

In this section, we present some relevant work about user profiling in the MIR field.

Context in music perception

Lesaffre et al. (2006) reveal that music perception is affected by the context, and that this depends on each user. The study explores the dependencies of demographic and musical background for different users in an annotation experiment. Subject dependencies are found for age, music expertise, musicianship, taste and familiarity with the music. The authors propose a semantic music retrieval system based on fuzzy logic. The system incorporates the annotations of the experiment, and music queries are done using semantic descriptors. The results are returned to the user based on her profile and preferences. One of the main conclusions of their research is that music search and retrieval systems should distinguish between the different categories of users.

Subjective perception of music similarity

In Vignoli and Pauws (2005), the authors present a music recommendation engine based on the user's perceived similarity. User similarity is defined as a combination of timbre, genre, tempo, year and mood. The system allows users to define the weights for personalised playlist generation. Sotiropoulos et al. (2007) state that different users assess music similarity via different feature sets, which are in fact subsets of some set of objective features. They define a subset of features for a specific user using relevance feedback and a neural network for incremental learning. Going one step beyond, Sandvold et al. (2006) allow users to define their own semantic concepts, providing some instances —sound excerpts— that characterise each concept. The system can then adapt to the user's concepts, and it can predict (using audio content–based similarity) the labels for newly added songs (i.e. autotagging). Also, the system can generate a playlist based on one or more of the user's concepts.

The user in the community

A single user profile can be extended by taking into account the user's interaction with the community of peers. Tracking activity allows a system to infer user preferences. Social networks have a big potential not only for the social interactions among the users, but also to exploit recommendations based on the behaviour of the community, and even for group–based recommendations. In (Kazienko and Musial, 2006), the authors present a recommendation framework based on social filtering. The user profile consists of static and dynamic aspects. The dynamic social aspect includes the interaction with other users and the relationships among users (e.g. duration, mutual watching of web pages, common communications, etc.). Furthermore, by analysing this information, the authors present novel ways of providing social filtering recommendations. Bluetuna is a "socialiser engine" based on sharing user preferences for music (Baumann et al., 2007). Bluetuna allows users to share musical tastes with other people who are (physically) nearby. The system runs on bluetooth–enabled mobile phones. The idea is to select those users that have similar musical tastes, facilitating the meeting process. Using social tagging information derived from collective annotation (tagging), Firan et al. (2007) create tag–based user profiles. Once a user is described using a tag cloud, the authors present several approaches to compute music recommendations. The results show an accuracy improvement using tag–based profiles over traditional CF at the track level.

Privacy issues

When dealing with user profiles and sensitive personal information, privacy is an important aspect. In (Perik et al., 2004), the authors present some research about the acquisition, storage and application of sensitive personal information. There is a trade–off between the benefits of receiving personalised music recommendations and the loss of privacy. According to Perik et al. (2004), the factors that influence disclosing sensitive personal information are:

• the purpose of the information disclosure,

• the people that get access to the information,

• the degree of confidentiality of the sensitive information, and

• the benefits they expect to gain from disclosing it.

3.2.3 User profile representation proposals

As noted in the previous section, music recommendation is highly dependent on the type of user. Also, music is an important vehicle for conveying to others something relevant about our personality, history, etc. User modelling, then, is a crucial step in understanding user preferences. However, in the music recommendation field, there have been few attempts to explicitly extend user profiles by adding music–related information. The most relevant (music–related) user profile representation proposals are: the User Modelling for Information Retrieval Language, the MPEG-7 standard description of user preferences, and the Friend of a Friend (FOAF) initiative (hosted by the Semantic Web community). The complexity, in terms of semantics, increases with each proposal. The following sections present these three approaches.

User modelling for Information Retrieval (UMIRL)

The UMIRL language, proposed by Chai and Vercoe (2000), allows one to describe perceptual and qualitative features of the music. It is specially designed for music information retrieval systems. The profile can contain both demographic information and direct information about the music objects: favourite bands, styles, songs, etc. Moreover, a user can add her own definition of a perceptual feature, and its meaning, using music descriptions. For instance: "a romantic piece has a slow tempo, lyrics are related with love, and has a soft intensity, and the context to use this feature is while having a special dinner with the user's girlfriend". The representation they proposed uses XML syntax, without any associated schema or document type definition to validate the profiles. Listing 3.1 shows a possible user profile:

    Joan Blanc / MsC / Catalan / none / guitar / rock / To bring you my love / P.J. Harvey

Listing 3.1: Example of a user profile in UMIRL.

This proposal is one of the first attempts in the Music Information Retrieval community. The main goal was to propose a representation format, as a way to interchange profiles among systems; however, it lacks formal semantics to describe the meaning of its descriptors and attributes. To cope with this limitation, the following section presents an approach using the descriptors defined in the MPEG-7 standard.

MPEG-7 User Preferences

MPEG-7, formally named Multimedia Content Description Interface, is an ISO/IEC standard developed by the Moving Picture Experts Group (MPEG). The main goal of the MPEG-7 standard is to provide structural and semantic description mechanisms for multimedia content. The standard provides a set of description schemes (DS) to describe multimedia assets. Here, we only focus on the descriptors that describe user preferences for multimedia content; a concise description of the whole standard appears in Manjunath et al. (2002). User preferences in MPEG-7 include content filtering, searching and browsing preferences. The usage history, which represents the user's history of interaction with multimedia items, can be denoted too. Filtering and searching preferences include the user preferences regarding classification (i.e. country of origin, language, available reviews and ratings, reviewers, etc.) and creation preferences. The creation preferences describe the creators of the content (e.g. favourite singer, guitar player, composer, and music bands). Also, they allow one to define a set of keywords, a location and a period of time. Using a preference value attribute, the user can express positive (likes) and negative (dislikes) preferences for each descriptor. The following example shows a hypothetical user profile definition, stating that she likes the album "To bring you my love" by P.J. Harvey:

    Joan Blanc / To bring you my love / Singer / Polly Jean Harvey / dramatic / fiery / 1995-01-01 / P1825D

Listing 3.2: Example of a user profile in MPEG-7.

MPEG-7 usage history is defined following the usage history description scheme. The UsageHistory DS contains a history of user actions: a list of actions (play, play-stream, record, etc.), with an associated observation period. Each action has a program identifier (an identifier of the multimedia content for which the action took place) and, optionally, a list of related links or resources. Tsinaraki and Christodoulakis (2005) present a way to overcome some of the limitations of describing user preferences in MPEG-7. They argue that there is still a lack of semantics when defining user preferences, as the whole MPEG-7 standard is based on XML Schemas. For example, filtering and search preferences allow one to specify a list of textual keywords, without being related to any taxonomy or ontology. Their implementation is integrated into a framework based on an upper ontology that covers the MPEG-7 multimedia description schemes. That upper ontology uses the OWL notation, as does the next proposal, based on the FOAF initiative.

MPEG-7 usage history is defined following the usage history description scheme. Usage- History DS contains a history of user actions. It contains a list of actions (play, play-stream, record, etc.), with an associated observation period. The action has a program identifier (an identifier of the multimedia content for which the action took place) and, optionally, a list of related links or resources. Tsinaraki and Christodoulakis (2005) present a way to overcome some of the limitations of describing user preferences in MPEG-7. They argue that there is still a lack of semantics when defining user preferences, as the whole MPEG-7 standard is based on XML Schemas. For example, filtering and search preferences allow one to specify a list of textual keywords, without being related to any taxonomy nor ontology. Their implementation is integrated into a framework based on an upper ontology that covers the MPEG-7 multimedia descrip- tion schemes. That upper ontology uses the OWL notation, so it does the next proposal, based on the FOAF initiative.

FOAF: User profiling in the Semantic Web

The FOAF (Friend Of A Friend) project provides conventions and a language "to tell" a machine the type of things a user says about himself in his homepage. FOAF is based on the RDF/XML vocabulary. As we noted before, the knowledge held by a community of "peers" about music is also a source of valuable metadata. FOAF nicely allows one to easily relate and connect people. FOAF profiles include demographic information (name, gender, age, nickname, homepage, depiction, web accounts, etc.), geographic information (city and country, geographic latitude and longitude), social information (relationships with other persons), psychographic information (i.e. the user's interests) and behavioural information (usage patterns). There are some approaches that allow modelling music taste in a FOAF profile. The simplest way to show interest for an artist is shown in the following example:

    rdf:resource="http://www.pjharvey.net" dc:title="P.J. Harvey" />

Listing 3.3: Example of a user interest using FOAF.

The Semantic Web approach facilitates the integration of different ontologies. Listing 3.4 shows how to express that a user likes an artist, using the general Music Ontology proposed in (Giasson and Raimond, 2007).

[XML listing omitted here: the user's interest in the artist "P.J. Harvey", described with Music Ontology terms.]

Listing 3.4: Example of an artist description in FOAF.

To conclude this section, Listing 3.5 shows a complete FOAF profile. This profile contains demographic and geographic information, as well as the user's interests, with different levels of granularity when describing the artists.

[XML listing omitted here: a FOAF profile for the user with nickname "ocelma", including a birthday (04-17), gender (male), a web account identified by a hashed ID (ce24ca...a1f0), and interests in Gretsch guitars, The Dogs d'Amour, and P.J. Harvey.]

Listing 3.5: Example of a user's FOAF profile.
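Such FOAF profiles can also be generated programmatically. The following is a minimal, illustrative sketch using the Python rdflib library; the Music Ontology namespace usage and the example URIs are assumptions made for the sketch, not part of any prototype described in this thesis.

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import FOAF, RDF

    MO = Namespace("http://purl.org/ontology/mo/")   # Music Ontology namespace

    g = Graph()
    g.bind("foaf", FOAF)
    g.bind("mo", MO)

    me = URIRef("http://example.org/foaf/ocelma#me")        # hypothetical profile URI
    artist = URIRef("http://example.org/artists/pj-harvey") # hypothetical artist URI

    g.add((me, RDF.type, FOAF.Person))
    g.add((me, FOAF.nick, Literal("ocelma")))
    g.add((artist, RDF.type, MO.MusicArtist))
    g.add((artist, FOAF.name, Literal("P.J. Harvey")))
    g.add((me, FOAF.topic_interest, artist))   # the user is interested in the artist

    print(g.serialize(format="xml"))           # RDF/XML output, as in the listings above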

This approach, based on the FOAF notation, is the one used in one of the two prototypes, named Foafing the music, presented in chapter 8 (section 8.2).

3.3 Item profile representation

Now we describe the representation and modelling of the music items, that is, the main elements that describe artists and songs. First we introduce, in section 3.3.1, the music information plane (MIP). The MIP defines the different levels of complexity and abstraction of the descriptions. After that, we classify these semantic descriptions using Pachet's (2005) music knowledge classification. The three categories that Pachet defines are: editorial, cultural and acoustic metadata.

3.3.1 The music information plane

In the last twenty years, the signal processing and computer music communities have developed a wealth of techniques and technologies to describe audio and music content at the lowest (or close-to-signal) level of representation. However, the gap between these low-level descriptors and the concepts that music listeners use to relate to music collections (the so-called "semantic gap") is still, to a large extent, waiting to be bridged. Due to the inherent complexity of describing multimedia objects, a layered approach with different levels of granularity is needed. In the multimedia field and, especially, in the music field we foresee three levels of abstraction: low-level basic features, mid-level semantic features, and high-level human understanding. The first level includes physical features of the objects, such as the sampling rate of an audio file, as well as some basic features like the spectral centroid of an audio frame, or even the predominant chord in a sequential list of frames. The mid-level of abstraction aims at describing concepts such as a guitar solo, or tonality information (e.g. key and mode) of a track. Finally, the highest level should use reasoning methods and semantic rules to retrieve, for instance, several audio files with "similar" guitar solos over the same key.
We describe the music information plane in two dimensions. One dimension considers the different media types that serve as input data (audio, text and image). The other dimension is the level of abstraction in the information extraction process applied to this data. Figure 3.2 depicts the music information plane. The input media types, on the horizontal axis, include data coming from: audio (music recordings), text (lyrics, editorial text, press releases, etc.) and image (video clips, CD covers, printed scores, etc.). On the other side, for each media type there are different levels of information extraction (on the vertical axis). The lowest level is that of signal features. This level lies far away from what an end-user might find meaningful. Nevertheless, it is the basis that allows us to describe the content and to produce more elaborate descriptions of the media objects. This level includes basic audio features (such as energy, frequency, mel frequency cepstral coefficients, or even the predominant chord in a sequential list of frames), or basic natural language processing for the text media. At the mid-level (the content objects level), the information extraction process and the elements described are a bit closer to the end-user. This level includes descriptions of musical concepts (such as a guitar solo, or tonality information, e.g. key and mode, of a music title), or named entity recognition for text information. Finally, the highest level, human knowledge, includes information related to human beings when they interact with music knowledge. This level could use inference methods and semantic rules to retrieve, for instance, several audio files with similar guitar solos over the same key. At this highest level there is the user, and the social relationships within a community of users.

Figure 3.2: The music information plane. The horizontal axis includes the input media types. The vertical axis represents the different levels of information extraction for each media type. At the top, a user interacts with the music content and the social network of users.

Nonetheless, the existing semantic gap between content objects and human knowledge invalidates any possible direct assignment of music descriptors to users. This has many consequences for music understanding and music recommendation. There are still open questions, such as: what are the musical elements that make a person feel certain emotions, or that evoke particular memories? How is personal identity linked with music? Only a multi-modal approach, one that takes into account as many elements from the MIP as possible, would be able to (partly) answer some of these questions. Furthermore, we argue that user intervention is important for adding semantics to music understanding. That said, we believe that neither pure bottom-up nor top-down approaches can bridge this gap on their own. We foresee, then, an approach from both directions: users need to interact with the content to add proper (informal) semantics (e.g. via tagging), and content object descriptions must be somehow understandable by the users. Pachet (2005) classifies music knowledge management into three categories. This classification allows one to create meaningful descriptions of music, and to exploit these descriptions to build music recommendation systems. The three categories that Pachet defines are: editorial, cultural and acoustic metadata. We include this classification as an orthogonal axis that lies over the music information plane.

3.3.2 Editorial metadata

Editorial metadata (EM) consists of information manually entered by an editor. Usually, the information is decided by an expert, or a group of experts. Figure 3.3 depicts the relationship between editorial metadata and the music information plane. EM includes simple creation and production information (e.g. the song C'mon Billy, written by P.J. Harvey in 1995, was produced by John Parish and Flood, and appears as track number 4 on the album "To bring you my love"). EM also includes artist biographies, genre information, relationships among artists, etc. As can be seen, editorial information is not necessarily objective. It is often the case that different experts cannot agree on assigning a concrete genre to a song or to an artist. Even more difficult is reaching a consensus on a taxonomy of musical genres. The scope of EM is rather broad. Yet, it usually refers to these items: the creator (or author) of the content, the content itself, and the structure of the content. Regarding the latter, editorial metadata can be fairly complex. For example, an opera performance description has to include the structure of the opera: it is divided into several acts; each act has some scenes; in a given scene there is a soprano singing an aria, and many musicians playing; it has lyrics to sing, and these can be in different languages (sung in Italian, but displayed in English), etc.

Figure 3.3: Editorial metadata and the music information plane.

In terms of music recommendation, EM forms the core of the non content–based methods for music recommenders.

3.3.3 Cultural metadata

Cultural metadata (CM) is defined as the information that is implicitly present in huge amounts of data. This data is usually gathered from the Internet, via weblogs, forums, music radio programs, etc. CM has a clear subjective component, as it is based on the aggregation of personal opinions.

Figure 3.4: Cultural metadata and the music information plane.

Figure 3.4 depicts the relationship between cultural metadata and the music information plane. Turnbull et al. (2008) present five different ways to collect annotations at the artist (or song) level. The approaches are:

• mining web documents,
• harvesting social tags,
• autotagging audio content,
• deploying annotation games, and
• conducting a survey.

In the following section we describe web document mining. Autotagging is briefly mentioned in section 3.3.4.

Web–MIR techniques to describe artists

Web Music Information Retrieval (Web–MIR) is a recent field of research in the MIR community. Web–MIR focuses on the analysis and exploitation of cultural information. So far, performance close to that of classic content–based approaches has been reported for artist genre classification and artist similarity (Whitman and Lawrence, 2002; Schedl et al., 2008; Knees et al., 2008). Yet, it is not clear how Web–MIR methods can deal with long tail content. The origins of Web–MIR can be found in the earlier work of Whitman and Lawrence (2002); Whitman (2003). They describe artists using a list of weighted terms. To gather artist-related terms, they query a general search engine with the name of the artist. To limit the size of the page results, they add some keywords to the query, such as "music" and "review". From the retrieved pages, the authors extract unigrams, bigrams and noun phrases. Whitman (2003) uses an unsupervised method for music understanding, using the power spectral density (PSD) estimate over each 5 seconds of audio. Then, it keeps the semantic dimensions that carry the most significant meanings. Similarly, Baumann and Hummel (2005) improved this approach by filtering irrelevant content from the web pages (e.g. adverts, menus, etc.). The description of an artist is formed by the terms with the highest normalised TF/IDF value. That includes the most relevant nouns, adjectives and simple phrases, as well as untagged unigrams and bigrams. In Geleijnse and Korst (2006), the authors present different ways to describe artists using web data, based on co-occurrence analysis between an artist and the labels used. The set of labels is defined beforehand, and forms a corpus of music-related terms (e.g. genres, instruments, moods, etc.). The three methods they use are: Pagecount–based mapping (PCM), Pattern–based mapping (PM), and Document–based mapping (DM). PCM uses the total number of hits retrieved by the Google search engine. However, some terms appear more often than others (e.g. pop or rock versus cumbia), so they provide a normalised version, inspired by pointwise mutual information (see section 2.5.4).

Artist            # occurrences
Garth Brooks      2
Hank Williams     2
Shania Twain      2
Johnny Cash       1
Crystal Gayle     1
Alan Jackson      1
Webb Pierce       1
Carl Smith        1
Jimmie Rodgers    1
Gary Chapman      1

Table 3.1: A list of prominent Country artists obtained using Pattern–based matching on Google.

Pattern–based mapping uses a set of predefined English phrase patterns, e.g. "(genre) artists such as (artist)". An instance of the pattern could be: "Country artists such as". This way, the method can retrieve the most prominent Country artists. Table 3.1 shows the results for the Country style pattern4. Finally, document–based mapping analyses the content of the top–K pages returned by Google. That is, the algorithm downloads the most representative pages, according to the query, and then counts the music-related terms found in the K pages. It is worth noting that these three methods can be used not only to characterise the artists, but also to compute artist similarity. Similar work based on co-occurrences is presented in (Schedl et al., 2008) and (Knees et al., 2008). Schedl et al. (2008) define artist similarity as the conditional probability that an artist occurs on a web page returned in response to a query for another artist. In (Knees et al., 2008), the authors focus on artist genre classification, using three different genre taxonomies. The assignment of an artist to a genre is treated as a special form of co-occurrence analysis. Evaluation over a small dataset shows an accuracy of over 85%. One of the main drawbacks of Web–MIR is the polysemy of some artists' names, such as Kiss, Bush, or Porn (Schedl et al., 2005b). This problem is partially solved by the same authors in (Schedl et al., 2005a). Based on TF/IDF, they penalise the terms with a high DF, that is, the terms that appear in a large number of documents.

4 The query was performed on September 9th, 2008, using the Google search engine. The results were manually analysed, and only the first page (top–10 results) was used.
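To make the page-count mapping concrete, the following sketch scores an artist-term pair with a pointwise-mutual-information style normalisation of raw search engine hit counts. The counts used here are made up for illustration; obtaining them from a real search engine API is outside the scope of this sketch.

    import math

    def pcm_score(artist_hits, term_hits, joint_hits, total_pages):
        """PMI-style normalisation of page counts: how much more often the artist
        and the term co-occur on the web than expected if they were independent."""
        if joint_hits == 0:
            return float("-inf")
        p_joint = joint_hits / total_pages
        p_artist = artist_hits / total_pages
        p_term = term_hits / total_pages
        return math.log(p_joint / (p_artist * p_term))

    # Hypothetical hit counts for an artist name, the term "country", and the
    # conjunctive query of both, over an assumed index of 10^9 pages.
    print(pcm_score(artist_hits=120_000, term_hits=5_000_000,
                    joint_hits=80_000, total_pages=1_000_000_000))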

Another common drawback of all the previous approaches is the high dimensionality of the datasets. To avoid this problem, Pohle et al. (2007) use Non-negative Matrix Factorisation to reduce the dimensionality of the artist–term matrix. They also use a predefined vocabulary of music terms, and analyse the content of the top–100 web pages related to each artist. To get the most relevant pages, they use a similar approach to (Whitman and Lawrence, 2002). The original matrix contains all the terms applied to the artists, using TF/IDF weights. This matrix is decomposed into 16 factors, or "archetypical" concepts, using non-negative matrix factorisation. Then, each artist is described by a 16-dimensional vector. After that, a music browser application allows users to navigate the collection by adjusting the weights of the derived concepts, and can also recommend similar artists using the cosine distance over the artists' vectors. Finally, Pachet et al. (2001) compute artist and song co-occurrences from radio sources, and also from a large database of CD compilations extracted from CDDB. Zadel and Fujinaga (2004) investigate artist similarity using the Amazon and Listmania! APIs, and then Google to refine the results, using artist co-occurrences in webpages.
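As a rough illustration of the NMF-based reduction described above, the sketch below factorises an artist-term matrix into 16 latent concepts with scikit-learn and then compares artists with the cosine similarity of their factor vectors. The matrix here is randomly generated; with a real TF/IDF matrix in its place, the nearest neighbours would be the similar-artist candidates.

    import numpy as np
    from sklearn.decomposition import NMF
    from sklearn.metrics.pairwise import cosine_similarity

    rng = np.random.default_rng(0)
    artist_term = rng.random((200, 1000))      # stand-in for a real artist x term TF/IDF matrix

    nmf = NMF(n_components=16, init="nndsvd", max_iter=500, random_state=0)
    artist_factors = nmf.fit_transform(artist_term)   # each artist becomes a 16-dim vector

    similarities = cosine_similarity(artist_factors)
    top10_for_artist0 = np.argsort(-similarities[0])[1:11]   # most similar artists to artist 0
    print(top10_for_artist0)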

Collecting ground truth data

An important aspect when trying to evaluate similarity metrics using cultural metadata is the creation of reliable ground truth data. Different proposals are presented in (Ellis et al., 2002), (Baumann and Hummel, 2005), (Pachet, 2005) and (Geleijnse et al., 2007). The problem of gathering ground truth for music similarity evaluation is outlined in Berenzweig et al. (2003). The most recent proposal, by Geleijnse et al. (2007), focuses on creating a dynamic ground truth for artist tagging and artist similarity. The idea is to adapt to the dynamically changing data being harvested by social tagging (e.g. from last.fm), instead of defining a static and immutable ground truth. Cultural information, based on Web–MIR and social tagging techniques, is the basis for context–based music recommenders. Section 3.4.3 presents the main ideas to exploit cultural information, and use it to provide music recommendations.

3.3.4 Acoustic metadata

The last category of semantic music description is acoustic metadata. Acoustic metadata is obtained using content analysis of an audio file. Semantic acoustic descriptors are the

basis for content–based music recommenders (see section 3.4.2). Figure 3.5 depicts the relationship between acoustic metadata and the music information plane.

Figure 3.5: Acoustic metadata and the music information plane.

Most of the current music content processing systems operating on complex audio signals are mainly based on computing low–level signal features. These features are good at characterising the acoustic properties of the signal, returning a description that can be associated with a texture. A more general approach consists of describing music content according to several "musical facets" (i.e. rhythm, harmony, melody, timbre, etc.) by incorporating higher–level semantic descriptors. Semantic descriptors can be computed directly from the audio signal, combining signal processing, machine learning, and musical knowledge. Several

of the shortcomings of the purely data-driven techniques can be overcome by applying musical knowledge, and this musical knowledge should not be exclusive to musically trained people. The following sections are devoted to outlining some relevant music description facets.

Timbre and instrumentation

Extracting truly instrumental information from music, as pertaining to separate instruments or types of instrumentation, implies classifying, characterising and describing information which is buried behind many layers of highly correlated data. Given that the current technologies do not allow a sufficiently reliable separation, work has concentrated on the characterisation of the "overall" timbre or "texture" of a piece of music as a function of low–level signal features. This approach implies describing mostly the acoustical features of a given recording, gaining little knowledge about its instrumental contents (Aucouturier and Pachet, 2004). Even though it is not possible to separate the different contributions and "lines" of the instruments, there are some simplifications that can provide useful descriptors (e.g. lead instrument recognition, solo detection). The recognition of idiosyncratic instruments, such as percussive ones, is another valuable simplification. The presence, amount and type of percussion instruments are very distinctive features of some music genres and, hence, can be exploited to provide other natural partitions of large music collections. Herrera et al. (2004) have defined semantic descriptors such as the percussion index or the percussion profile. Although they can be computed after some source separation, reasonable approximations can be achieved using simpler sound classification approaches that do not attempt separation (Yoshii et al., 2004). Additionally, Chetry et al. (2005) contributed to the current state of the art in instrument identification of mono-instrumental music, using line spectral frequencies (LSF) and a k-means classifier (Herrera et al., 2006).

Rhythm

In its most generic sense, rhythm refers to all of the temporal aspects of a musical work, whether represented in a score, measured from a performance, or existing only in the perception of the listener (Gouyon and Dixon, 2005). In the literature, the concept of "automatic rhythm description" groups many applications as diverse as tempo induction, beat tracking,

rhythm quantisation, meter induction and characterisation of timing deviations, to name a few. Many of these different aspects have been investigated, from low–level onset detection to the characterisation of music according to rhythmic patterns.

At the core of automatic rhythmic analysis lies the issue of identifying the start, or onset time, of events in the musical data. As an alternative to standard energy-based approaches, other methodologies have recently appeared: methods that work solely with phase information (Bello and Sandler, 2003), or that are based on predicting the phase and energy of signal components in the complex domain (Bello et al., 2004), greatly improving results for both percussive and tonal onsets. However, there is more to rhythm than the absolute timings of successive musical events. For instance, Davies and Plumbley (2004) have proposed a general model for beat tracking, based on the use of comb-filtering techniques on a continuous representation of "onset emphasis", i.e. an onset detection function. Subsequently, the method was expanded to combine this general model with a context-dependent model by including a state space switching model. This extension has been shown to significantly improve upon previous results, in particular with respect to maintaining a consistent metrical level and preventing phase switching between off-beats and on-beats.

Furthermore, the work done by Gouyon and Dixon (2004) and Dixon et al. (2004) demonstrates the use of high–level rhythmic descriptors for genre classification of recorded audio. An example is a tempo-based classification showing the high relevance of this feature when trying to characterise dance music (Gouyon and Dixon, 2004). However, this approach is limited by the assumption that, given a musical genre, the tempo of any instance is among a very limited set of possible tempi. To address this, Dixon et al. (2004) use bar-length rhythmic patterns for the classification of dance music. The method dynamically estimates the characteristic rhythmic pattern of a given musical piece, by a combination of beat tracking, meter annotation and a k-means classifier. Genre classification results are greatly improved by using these high–level descriptors, showing the relevance of musically meaningful representations for Music Information Retrieval (MIR) tasks. Finally, a holistic approach toward automated beat tracking, taking into account music structure, is presented in (Dannenberg, 2005).

Harmony

The harmony of a piece of music can be defined by the combination of simultaneous notes, or chords; the succession of these chords over time, in progressions; and their distribution, which is closely related to the key or tonality of the piece. Chords, their progressions, and the key are relevant aspects of music perception that can be used to accurately describe and classify music content.

Harmony-based retrieval has not been extensively explored before. A successful approach at identifying harmonic similarities between audio and symbolic data was presented in (Pickens et al., 2002). It relied on automatic transcription, a process that is only partially effective within a highly constrained subset of musical recordings (e.g. mono-timbral, no drums or vocals, small polyphonies). To avoid such constraints, (Gómez, 2006b) adopts an approach that describes the harmony of the piece without attempting to estimate the pitch of the notes in the mixture. Avoiding the transcription step allows the method to operate on a wide variety of music. This approach requires the use of a feature set that is able to emphasise the harmonic content of the piece, such that this representation can be exploited for further, higher–level, analysis. The feature set of choice is known as the Chroma or Pitch Class Profile, which represents the relative intensity of each of the twelve semitones of the equal-tempered scale.

Gómez and Herrera (2004) present tonality estimation by correlating chroma distributions with key profiles derived from music cognition studies. Results show high recognition rates for a database of recorded classical music. The studies done in (Harte and Sandler, 2005) have also concentrated on chord estimation based on chroma features, using tuning estimation and a simple template-based model of chords. Recognition rates of over 66% were found for a database of recorded classical music, though the algorithm has also been used with other musical genres. A recent development includes the generation of a harmonic representation using a Hidden Markov Model, initialised and trained using music-theoretical and cognitive considerations (Bello and Pickens, 2005). This methodology has already shown great promise for both chord recognition and structural segmentation.
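As an illustration of the correlation-based tonality estimation mentioned above, the sketch below averages a chroma representation over a whole track and correlates it with rotated key profiles. It is only an approximation of the published methods: the profile values are the commonly cited Krumhansl-Schmuckler numbers, and librosa is assumed to be available for the chroma computation.

    import numpy as np
    import librosa

    # Approximate Krumhansl-Schmuckler key profiles (major and minor), one value per semitone.
    MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
    MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17])
    NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

    def estimate_key(audio_path):
        """Correlate the track's average chroma vector with all 24 rotated key profiles."""
        y, sr = librosa.load(audio_path)
        chroma = librosa.feature.chroma_stft(y=y, sr=sr).mean(axis=1)
        best_score, best_key = -2.0, None
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            for tonic in range(12):
                score = np.corrcoef(chroma, np.roll(profile, tonic))[0, 1]
                if score > best_score:
                    best_score, best_key = score, f"{NOTES[tonic]} {mode}"
        return best_key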

For a complete and deeper overview of all these techniques, the reader is referred to (Gómez, 2006a).

Intensity

Subjective intensity, or the sensation of energy we get from music, is a concept commonly and easily used to describe music content. Although intensity has a clear subjective facet, Sandvold et al. hypothesised that it could be grounded on automatically extracted audio descriptors. Inspired by the findings of Zils and Pachet (2003), Sandvold and Herrera (2004) created a model of subjective intensity built from energy and timbre low–level descriptors extracted from the audio data. They proposed a model that decides among five labels (ethereal, soft, moderate, energetic, and wild), with an estimated effectiveness of nearly 80%. The model was developed and tested using several thousand subjective judgements.

Structure

Music structure refers to the ways music materials are presented, repeated, varied or confronted along a piece of music. Strategies for doing so are artist-, genre- and style-specific (e.g. the A-B theme exposition, development and recapitulation of a sonata form, or the intro-verse-chorus-verse-chorus-outro of "pop music"). Detecting the different structural sections, the most repetitive segments, or even the least repeated segments, provides powerful ways of interacting with audio content based on summaries, fast-listening and musical gist-conveying devices, and on-the-fly identification of songs.

The section segmenter developed by Ong and Herrera (2005) extracts segments that roughly correspond to the usual sections of a pop song or, in general, to sections that are different (in terms of timbre and tonal structure) from the adjacent ones. The algorithm first performs a rough segmentation with the help of change detectors, morphological filters adapted from image analysis, and similarity measurements using low–level descriptors. It then refines the segment boundaries using a different set of low–level descriptors. Complementing this type of segmentation, the most repetitive musical pattern in a music file can also be determined by looking at self-similarity matrices in combination with a rich set of descriptors including timbre and tonality (i.e. harmony) information (Ong and Herrera, 2005). Ground-truth databases for evaluating this task are still under construction, but first evaluations yielded a section boundary detection effectiveness higher than 70%.

3.4 Recommendation methods

In this section, we present the music recommendation methods to match user preferences (see section 3.2) with the item descriptions (presented in section 3.3).

3.4.1 Collaborative filtering

Collaborative filtering (CF) techniques have been widely applied in the music domain. CF makes use of the editorial and cultural information. Early research was based on explicit feedback, that is, on ratings of songs or artists. Yet, tracking user listening habits has become the most common approach in music recommendation. In this sense, CF has to deal with implicit feedback (instead of explicit ratings).

Explicit feedback

Ringo, described in (Shardanand, 1994), is the first music recommender based on collaborative filtering and explicit feedback. The author applies a user-based CF approach (see section 2.5.2). Similarity among users is computed with the Pearson normalised correlation (see Equation 2.4). Then, the recommendations are computed as the mean of the ratings given by the users most similar to the active user (see Equation 2.1). Racofi (Rule Applying COllaborative FIltering) Music combines collaborative filtering based on ratings with a set of logic rules based on Horn clauses (Anderson et al., 2003). The rules are applied after the ratings have been gathered. The five rating dimensions they define are: impression, lyrics, music, originality, and production. The objective of the rules is to prune the output of the collaborative filtering, and to promote the items that the user will be most familiar with. Anderson et al. (2003) exemplify a rule:

“If a user rates 9 the originality of an album by artist X then the predicted originality rating, for this user, of all other albums by artist X is increased by a value of 0.5 ”.

These kinds of rules implicitly modify the ratings that a user has previously given. The Indiscover music recommender system5 implements this approach.
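A minimal sketch of the Ringo-style, user-based prediction described above is shown below. It is a simplified reading of the equations referenced from chapter 2, with made-up data: similarities between users are computed with Pearson correlation over co-rated items, and the prediction for the active user is a mean-centred, similarity-weighted average of the neighbours' ratings.

    import numpy as np

    def pearson(u, v):
        """Pearson correlation over the items that both users have rated (0 = unrated)."""
        mask = (u > 0) & (v > 0)
        if mask.sum() < 2:
            return 0.0
        return float(np.corrcoef(u[mask], v[mask])[0, 1])

    def predict(ratings, active, item, k=10):
        """Predict the active user's rating for an item from the k most similar users."""
        mean = lambda r: r[r > 0].mean()
        sims = np.array([pearson(ratings[active], ratings[u]) for u in range(len(ratings))])
        sims[active] = 0.0
        neighbours = [u for u in np.argsort(-sims)[:k] if sims[u] > 0 and ratings[u, item] > 0]
        if not neighbours:
            return mean(ratings[active])
        num = sum(sims[u] * (ratings[u, item] - mean(ratings[u])) for u in neighbours)
        den = sum(abs(sims[u]) for u in neighbours)
        return mean(ratings[active]) + num / den

    # Toy 1..5 rating matrix (rows = users, columns = songs); 0 means "not rated".
    R = np.array([[5, 4, 0, 1],
                  [4, 5, 3, 1],
                  [1, 2, 5, 4],
                  [0, 1, 4, 5]])
    print(predict(R, active=0, item=2))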

5 http://www.indiscover.net

Implicit feedback

Implicit feedback in the music domain is usually gathered from listening habits. The main drawback is that the value a user assigns to an item is not in a predefined range (e.g. from 1..5, or like it/hate it). Instead, the interaction between users and items is described by total playcounts. Thus, the system can only track positive feedback (i.e. tracks that a user listens to); implicit negative feedback cannot be gathered. When users explicitly rate the content, the range of values includes both positive and negative feedback (e.g. from 1..5, where 1 means the user does not like the item, 3 indifference, and 5 that she loves it). Furthermore, recommendations are usually performed at the artist level, but the listening habits are at the song level. In this case, an aggregation process, from song plays to artist total playcounts, is needed. To use CF with implicit feedback at the artist level, there are different options:

• Convert the implicit data into a binary user–artist matrix. Non-zero cells mean that the user has listened to the artist at least once.

• Transform the implicit data into a normalised matrix. Instead of assigning 0/1 to a cell, the value can denote how much a user listens to the artist (e.g. [5..1], where 5 denotes that she listens to the artist a lot, and 1 means only from time to time). This matrix gives a more fine-grained description of the user's listening habits than the previous, binary, normalisation.

• Normalise each row (user), so that the sum of the row entries equals 1. This option, then, describes the artist probability distribution of a user.

• Create a user–artist matrix with the total playcounts in the cells. In this case there is no normalisation, as the matrix contains the absolute values.

In any case, once the dataset is represented in the user–artist matrix, one can apply the CF methods for explicit feedback (presented in section 2.5.2). We have done some experiments with data obtained from last.fm. The dataset contains the listening habits of more than 500,000 users, and a total of around 30 million ⟨user, artist, plays⟩ triples. To clean the list of artists, we only use those artists that have a Musicbrainz6 ID, and that at least 10 users have listened to. After the cleaning process, we get a list of around 95,000 distinct artists.

6 http://www.musicbrainz.org

Figure 3.6: A user's listening habits represented as the frequency distribution of playcounts per artist in the user's profile.

To apply CF, we transformed the listening habits dataset into a user–artist matrix M, where M_{i,j} represents the number of times user i has listened to artist j. To normalise the matrix we followed the second approach, that is, to assign a value in the range [1..5] to M_{i,j} from the ⟨user_i, artist_j, plays_{i,j}⟩ data. Usually, a user's listening habits distribution is skewed to the right, showing a heavy-tailed curve. That is, a few artists have lots of plays in the user profile, and the rest of the artists have far fewer playcounts. We compute the complementary cumulative distribution of artist plays in the user profile. Artists located in the top 80-100% of the distribution get a score of 5, artists in the 60-80% range get a 4, and so on (down to the artists with the fewest playcounts, in the 0-20% range, which get assigned a 1). Figure 3.6 depicts the listening habits of a user in terms of total playcounts. The horizontal axis contains her top–50 artists, ranked by total plays (i.e. the artist at position 1 has 238 playcounts). Figure 3.7 shows the complementary cumulative distribution of the artist playcounts from Figure 3.6. This distribution is the one used to normalise the user playcounts into the range 5..1. Sometimes, the listening habits distribution of a user is not skewed, but very homogeneous (a small standard deviation, and a median close to the mean value). To detect this type of distribution, we use the coefficient of variation, CV. CV is a normalised measure of dispersion of a probability distribution that divides the standard deviation by the mean value, CV = σ/μ. If CV ≤ 0.5 we do not use the complementary cumulative distribution. Instead, we assign a value of 3 to all the user's artists, meaning that all the artists in the profile have a similar number of plays. Once the normalisation process is done, it is straightforward to compute the average value of normalised plays for an artist, as well as for a user, in case the item similarity measure to be used is either Pearson correlation (Equation 2.4) or adjusted cosine (Equation 2.3). The next step is to compute artist similarity using the user–artist matrix M that contains the listening habits, normalised into the range [1..5].
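A sketch of this normalisation step is given below, under one plausible reading of the scheme just described; the exact binning used in the experiments may differ in minor details.

    import numpy as np

    def normalise_playcounts(playcounts):
        """Map a user's raw artist playcounts onto a 1..5 scale using the complementary
        cumulative distribution of plays; fall back to a flat 3 when the profile is
        homogeneous (coefficient of variation <= 0.5)."""
        counts = np.asarray(playcounts, dtype=float)
        if counts.std() / counts.mean() <= 0.5:
            return np.full(len(counts), 3, dtype=int)
        order = np.argsort(-counts)                   # ranks, most played artist first
        shares = counts[order] / counts.sum()
        ccdf = shares[::-1].cumsum()[::-1]            # plays accounted for from this rank to the tail
        ratings = np.clip(np.ceil(ccdf / 0.2), 1, 5).astype(int)
        out = np.empty(len(counts), dtype=int)
        out[order] = ratings
        return out

    # A skewed toy profile: the most played artists end up near 5, the tail near 1.
    print(normalise_playcounts([238, 180, 90, 60, 40, 25, 12, 8, 5, 2]))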

An example

Using matrix M, we present two concrete examples of item similarity, using Pearson correlation and conditional probability (defined in Equation 2.5). Table 3.2 (left) shows the top–10 similar artists of The Dogs d'Amour7, computed with Pearson correlation, whilst the right column shows the results obtained using conditional probability similarity.

7 http://en.wikipedia.org/wiki/The_Dogs_D'Amour

Figure 3.7: User listening habits from Figure 3.6 represented with the complementary cumulative distribution. The top-1 and top-2 artists received a score of 5. Artists at positions 3..7 got a score of 4, and so on.

The Dogs d'Amour           Similarity (Pearson)      The Dogs d'Amour      Similarity (Cond. Prob.)
los fabulosos cadillacs    0.806                     guns n' roses         0.484
electric boys              0.788                                           0.416
lillian axe                0.784                     ac/dc                 0.379
michael jackson            0.750                     led zeppelin          0.360
ginger                     0.723                                           0.354
the decemberists           0.699                     alice cooper          0.342
the byrds                  0.667                     mötley crüe           0.341
zero 7                     0.661                     david bowie           0.335
rancid                     0.642                                           0.334
the sonics                 0.629                     the beatles           0.334

Table 3.2: The Dogs d'Amour top–10 similar artists using CF with Pearson correlation distance (left) and conditional probability (right).

We can see that the asymmetric conditional probability metric is completely biased towards popular artists, whilst the Pearson similarity list contains artists from across the long tail, spanning different styles (including some unexpected results, such as Michael Jackson or Zero 7). The top–10 similar artists list obtained with conditional probability contains some of the most representative and prototypical artists of the seed artist's main styles (that is, glam, rock, and hard rock). The similarity value using conditional probability is also quite informative; 48.4% of the users who listen to The Dogs d'Amour also listen to Guns n' Roses (but not the other way around!).
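The conditional probability measure itself is simple to state in code. A sketch over a binary user-artist matrix (assumed to be derived from the listening habits as described above):

    import numpy as np

    def cond_prob_similarity(listened, a, b):
        """P(b | a): among the users who listen to artist a, the fraction who also listen
        to artist b. Note the asymmetry: P(b | a) is in general not equal to P(a | b)."""
        users_a = listened[:, a] > 0
        if users_a.sum() == 0:
            return 0.0
        return float((listened[users_a, b] > 0).mean())

    # Toy binary matrix (rows = users, columns = artists).
    M = np.array([[1, 1, 0],
                  [1, 1, 1],
                  [0, 1, 1],
                  [0, 1, 0]])
    print(cond_prob_similarity(M, a=0, b=1))   # 1.0: everyone who listens to 0 also listens to 1
    print(cond_prob_similarity(M, a=1, b=0))   # 0.5: the reverse direction is lower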

3.4.2 Content–based filtering

Recommender systems using content–based filtering are based on item-to-item similarity. Audio content–based methods are used to rank music titles based on audio similarity. Thus, a recommender system has to compute the similarity among songs, and use this information to recommend music. Artist similarity can also be computed, by aggregating song similarity results. There are two orthogonal ways to annotate songs: automatically or manually. The following sections present each approach.

Automatic feature extraction

Generally speaking, once the audio has been semantically annotated (see section 3.3), and the similarity among the items has been computed, content–based filtering for a given user is rather simple. It is based on presenting songs (or artists) that "sound" similar to the user profile. The first work on music similarity focused on low–level descriptors, such as the Mel Frequency Cepstral Coefficients (MFCC). These approaches aimed at deriving timbre similarity, but have also been used to take on other problems, such as genre classification. Foote proposed a music indexing system based on MFCC histograms (Foote, 1997). Aucouturier and Pachet (2002) presented a Gaussian mixture model based on MFCC. They could also generate playlists based on timbre similarity and some global constraints on the output playlist. Similarity measures on top of the MFCC+GMM combination include the Kullback-Leibler (KL) divergence and the earth mover's distance. The KL divergence measures the relative similarity between two single-Gaussian distributions of the data; a small divergence between the distributions means that the two songs are similar. The earth mover's distance (EMD) has been widely applied in the image community to retrieve similar images. The adoption

in the music field was presented in Logan and Salomon (2001). Audio signatures can be compared using the EMD, which allows the comparison of histograms with disparate bins. However, none of these methods capture information about long-term structural elements, such as the melody, rhythm, or harmony. To cope with this limitation, Tzanetakis (2002) extracted a set of features representing the spectrum, rhythm and harmony (chord structure). All the features are merged into a single vector, which is used to determine similarity. For a complete overview of audio similarity, the reader is referred to (Pampalk, 2006). Cataltepe (2007) presents a music recommendation system based on audio similarity that also takes into account the user's listening history. The hypothesis is that users give more importance to different aspects of music. These aspects can be described and classified using semantic audio features. Using this adaptive content–based recommendation scheme, as opposed to a static set of features, resulted in up to a 60% increase in the accuracy of the recommendations. Users' relevance feedback for content–based music systems is presented in (Hoashi et al., 2003). To reduce the burden on users of providing training data to the system, they propose a method to generate user profiles based on genre preferences, with a posterior refinement based on relevance feedback on the recommendations (Rocchio, 1971).
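As an illustration of the single-Gaussian timbre models and the KL divergence mentioned above, the following sketch fits one Gaussian to the MFCC frames of each song and compares two songs with the symmetrised, closed-form KL divergence between Gaussians. librosa is assumed for the feature extraction, and a single Gaussian is a simplification of the cited MFCC+GMM systems, which use mixtures.

    import numpy as np
    import librosa

    def song_model(path, n_mfcc=13):
        """Fit a single Gaussian (mean vector, covariance matrix) to a song's MFCC frames."""
        y, sr = librosa.load(path)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # shape (n_mfcc, frames)
        return mfcc.mean(axis=1), np.cov(mfcc)

    def kl_gauss(m1, c1, m2, c2):
        """Closed-form KL divergence between two multivariate Gaussians."""
        d = len(m1)
        c2_inv = np.linalg.inv(c2)
        diff = m2 - m1
        return 0.5 * (np.trace(c2_inv @ c1) + diff @ c2_inv @ diff - d
                      + np.log(np.linalg.det(c2) / np.linalg.det(c1)))

    def timbre_distance(song_a, song_b):
        """Symmetrised KL divergence between the two single-Gaussian timbre models."""
        (m1, c1), (m2, c2) = song_model(song_a), song_model(song_b)
        return kl_gauss(m1, c1, m2, c2) + kl_gauss(m2, c2, m1, c1)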

Manual feature extraction

Human–based annotation of music is very time consuming, but can be more accurate than automatic feature extraction methods. Pandora’s approach is based on manual descriptions of the audio content. Pandora’s web site explains their procedure8:

“(...) our team of thirty musician-analysts have been listening to music, one song at a time, studying and collecting literally hundreds of musical details on every song. It takes 20-30 minutes per song to capture all of the little details that give each recording its magical sound —melody, harmony, instrumentation, rhythm, vocals, lyrics ... and more— close to 400 attributes! (...)”

The analysts have to annotate around 400 parameters per song, using a ten-point scale [0..10] per attribute. There is a clear scalability problem; time constraints allow the analysts to add about 15,000 songs per month. Also, they have to deal with the variability across the

8 http://www.pandora.com/corporate/index.shtml (last accessed: September 10th, 2008)

analysts. Cross-validation is also needed in order to assure the quality of the annotations (and to avoid analysts' bias). A simple weighted Euclidean distance is used to find similar songs9. Song selection is, then, based on nearest neighbours. However, they assign specific weights to important attributes, such as genre. For artist similarity they only use specific songs, not an average over all the artist's songs. Pandora's ultimate goal is to offer a mix of familiarity, diversity, and discovery.
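A sketch of this kind of weighted nearest-neighbour selection is shown below; the attribute vectors and weights are hypothetical, since Pandora's actual attribute set and weighting are not public.

    import numpy as np

    def weighted_euclidean(x, y, w):
        """Euclidean distance where each annotated attribute has its own importance weight."""
        x, y, w = (np.asarray(v, dtype=float) for v in (x, y, w))
        return float(np.sqrt(np.sum(w * (x - y) ** 2)))

    def nearest_songs(seed, catalogue, w, k=5):
        """Select the k songs whose attribute vectors are closest to the seed song."""
        distances = [weighted_euclidean(seed, song, w) for song in catalogue]
        return np.argsort(distances)[:k]

    # Toy example: 4 attributes per song, with the first attribute weighted most heavily.
    weights = [3.0, 1.0, 1.0, 0.5]
    seed = [7, 2, 5, 9]
    catalogue = [[7, 3, 5, 8], [1, 2, 5, 9], [6, 2, 4, 9]]
    print(nearest_songs(seed, catalogue, weights, k=2))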

An example

Now we present an example of artist similarity derived from automatic audio feature extraction. To compute artist similarity, we apply content–based audio analysis to an in-house music collection T of 1.3 million tracks of 30-second samples. Our audio analysis considers not only timbral features (e.g. Mel frequency cepstral coefficients), but also musical descriptors related to rhythm (e.g. beats per minute, binary/ternary metre) and tonality (e.g. chroma features, key and mode), among others (Cano et al., 2005). Then, to compute artist similarity we used the most representative tracks, T_a, of an artist a, with a maximum of 100 tracks per artist. For each track t_i ∈ T_a, we obtain the most similar tracks (excluding those from artist a):

sim(t_i) = argmin_{∀t ∈ T} (distance(t_i, t))    (3.1)

and get the artists' names, A_{sim(t_i)}, of the similar tracks. The list of (top–20) similar artists of a comes from the union of the A_{sim(t_i)}, ranked by a combination of the artist frequency (how many songs from the artist are similar to the seed track t_i) and the similarity distance:

similar_artists(a) = ∪_{∀ t_i ∈ T_a} A_{sim(t_i)}    (3.2)

Table 3.3 shows the top–20 similar artists for two seed artists, Aerosmith10 and Alejandro Sanz11. Regarding Aerosmith's top–20 similar artists, most of the bands belong to the same genre, that is, classic hard rock. Yet, some bands belong to the punk/rock style (e.g. NOFX, MxPx, The Damned, among others). These bands could still

9 Personal communication with Pandora staff, July 2007.
10 For more information about the artist see http://en.wikipedia.org/wiki/Aerosmith
11 For more information about the artist see http://en.wikipedia.org/wiki/Alejandro_Sanz

Aerosmith            Similarity (CB)      Alejandro Sanz            Similarity (CB)
bon jovi             3.932                ricky martin              3.542
.38 special          3.397                jackson browne            2.139
guns n' roses        3.032                gipsy kings               1.866
def leppard          2.937                presuntos implicados      1.781
                     2.795                emmylou harris            1.723
helloween            2.454                luis miguel               1.668
kiss                 2.378                laura pausini             1.529
bryan adams          2.180                ry cooder                 1.479
poison               2.088                harry chapin              1.370
the damned           2.044                dwight yoakam             1.332
tesla                2.030                nek                       1.331
die schäfer          1.963                miguel bosé               1.298
mötley crüe          1.949                maná                      1.241
nofx                 1.807                the doobie brothers       1.235
mxpx                 1.733                uncle kracker             1.217
new found glory      1.718                seal                      1.184
slick shoes          1.677                anika moa                 1.174
die flippers         1.662                graham central station    1.158
uriah heep           1.659                the imperials             1.157
alice cooper         1.608                the corrs                 1.152

Table 3.3: Similar artists for Aerosmith (left column) and Alejandro Sanz (right column).

be considered relevant to a user whose musical taste ranges from classic hard rock to punk/rock styles. However, there are two surprising and unexpected results: Die Schäfer and Die Flippers. Both bands fall into the German folk/pop style, and their music is very different from Aerosmith (or any other band in Aerosmith's top–20 similar artists). Our guess is that they appear due to Aerosmith's quiet pop/rock ballads. Still, these two German artists can be considered "outliers". Alejandro Sanz is a Spanish singer/songwriter. His music fits into Latin pop, ballads and related styles, all merged with a flamenco touch. Even though content–based similarity is context agnostic, some similar artists also sing in Spanish (Gipsy Kings, Ricky Martin, Presuntos Implicados, Luis Miguel, Laura Pausini, Miguel Bosé and Maná). Furthermore, most of the similar artists come from his pop songs, like Ricky Martin, Presuntos Implicados, Nek, Seal, Maná, Miguel Bosé and The Corrs. His flamenco and acoustic facets are also present in the Gipsy Kings. Luis Miguel appears in the list because of Alejandro Sanz's quiet ballads. The rest of the artists fall into the broad range of singer/songwriter, folk and Americana styles, and include: Jackson Browne, Emmylou Harris, Ry Cooder, Dwight Yoakam, Uncle Kracker and Harry Chapin. In these cases, the similarity with Alejandro Sanz is more arguable. Also, a few similar artists are female singers (Anika Moa, The Corrs, Presuntos Implicados, Emmylou Harris, and Laura Pausini). In these cases, music similarity and production artifacts probably predominate over melody and voice. Finally, there are some strange and incomprehensible artists, such as Graham Central Station (a long tail band, playing a mix of funk, soul, and rhythm and blues), and The Imperials (also a long tail band, which plays doo-wop and gospel music). Without any explanation or transparency about these recommendations, a user will probably perceive some of the similar artists as not relevant. Unexpectedly, with the exception of a few bands, neither Aerosmith's nor Alejandro Sanz's similar artists are unknown, long tail artists. This is somewhat strange, as in principle CB is not biased towards popularity (we used a maximum of 100 songs per artist, so no artist has more songs than the others in the dataset). An in-depth analysis of this artist similarity dataset is presented in chapter 6.
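For completeness, a sketch of the aggregation expressed in equations 3.1 and 3.2 is given below. The track-similarity function and the catalogue lookups are assumed to exist (they stand in for the in-house audio analysis), so only the ranking logic is shown.

    from collections import defaultdict

    def similar_artists(seed_artist, tracks_of, nearest_tracks, artist_of, top_n=20):
        """Aggregate track-level audio similarity into an artist-level ranking.
        `tracks_of(a)` returns the representative tracks of artist a (at most 100),
        `nearest_tracks(t)` returns (track, distance) pairs excluding the seed artist,
        and `artist_of(t)` maps a track back to its artist."""
        frequency = defaultdict(int)     # how many seed tracks each candidate artist matched
        distance = defaultdict(float)    # accumulated distance, used to break ties
        for t in tracks_of(seed_artist):
            for neighbour, dist in nearest_tracks(t):
                candidate = artist_of(neighbour)
                frequency[candidate] += 1
                distance[candidate] += dist
        ranked = sorted(frequency, key=lambda a: (-frequency[a], distance[a] / frequency[a]))
        return ranked[:top_n]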

3.4.3 Context–based filtering

As introduced in section 3.3.3, context–based filtering uses cultural information to compute artist or song similarity. Context–based filtering either uses web mining techniques, or data

The Dogs d'Amour                  Similarity (LSA)
d-a-d                             0.9605
mike tramp                        0.9552
metal majesty                     0.9541
nightvision                       0.9540
bulent ortacgil - sebnem ferah    0.9540
and                               0.9540
hey hey jump                      0.9539
camp freddy                       0.9538
hard rocket                       0.9537
paine                             0.9536

Table 3.4: The Dogs d’Amour top–10 similar artists using social tagging data from last.fm. Similarity is computed using LSA (SVD with 100 factors, and cosine distance) from the artist–tag matrix.

from collaborative tagging (see section 2.5.4).

An example

Now, we present some examples from the 3-order tensor of ⟨user, artist, tag⟩ triples, using last.fm data. We decompose the tensor, and use the artist–tag matrix A, where A_{i,j} contains the number of times artist i has been tagged with tag j. The matrix contains 84,838 artists and 187,551 distinct tags. Then, we apply Latent Semantic Analysis (LSA). LSA uses Singular Value Decomposition (SVD) to infer the hidden relationships in the data. LSA is used in Information Retrieval to compute document similarity, and also to detect term similarity (e.g. synonyms). In our case, we can consider that a document corresponds to an artist, and the terms that appear in the document are the artist's tags. Then, we use SVD and reduce the matrix A to 100 dimensions. After that, cosine similarity is used to derive artist similarity. Table 3.4 shows the top–10 similar artists to The Dogs d'Amour. One problem with this approach is that the similarity to the seed artist (in the 100-dimensional space) is very high, close to 1, even for an artist at position top–100 in the similarity list. For instance, The Dogs d'Amour's top–20 similar artist, Gilby Clarke, has a similarity value of 0.936, and the artist at top–100 (Babylon A.D.) has 0.868. Both artists could easily appear in the list of The Dogs d'Amour similar artists, but probably they will not (at least, they will not appear on the first page). Then, when presenting a list of The Dogs d'Amour similar artists, the user can miss some artists that are at positions around

top–80 and that are still relevant. This happens because the semantic distance based on tags (using the 100 factors after applying SVD) is very coarse. To overcome this problem, the following section presents a hybrid approach that combines collaborative filtering and social tagging, producing more reliable results.
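Before moving on, a sketch of the tag-based LSA similarity just described is shown below, using scikit-learn's truncated SVD. The matrix here is randomly generated for illustration; in our experiments it would be the 84,838 x 187,551 artist-tag count matrix.

    import numpy as np
    from scipy.sparse import random as sparse_random
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    # Stand-in for the real artist x tag count matrix (sparse, mostly zeros).
    artist_tag = sparse_random(500, 2000, density=0.01, random_state=0).tocsr()

    svd = TruncatedSVD(n_components=100, random_state=0)
    artist_lsa = svd.fit_transform(artist_tag)     # each artist in a 100-factor space

    sims = cosine_similarity(artist_lsa)
    seed = 0
    print(np.argsort(-sims[seed])[1:11])           # tag-based top-10 neighbours of the seed artist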

3.4.4 Hybrid methods

The combination of different approaches allows a system to minimise the issues that a single method can have. One way to combine different recommendation methods is the cascade approach (see section 2.5.5). Cascade is a step-by-step process: one technique is applied first, obtaining a ranked list of items; then, a second technique refines or re-ranks the results obtained in the first step. To compute artist similarity, a system can first apply CF, and then reorder and combine the results according to the semantic distance obtained from social tagging (LSA).

An example

Table 3.5 shows The Dogs d'Amour similar artists using a cascade hybrid method. First, The Dogs d'Amour top–100 similar artists are computed using CF, with Pearson correlation distance. In a second step, for each artist in this top–100 list we compute, using LSA (SVD with 100 factors) and cosine similarity over the social tagging data, the similarity between that artist and the seed artist (The Dogs d'Amour). After that, we combine the results from Pearson CF with the results obtained in this second step. We use a linear combination function, setting α = 0.5:

sim(a_i, a_j)_Hybrid = (1 − α) · sim(a_i, a_j)_CF,Pearson + α · sim(a_i, a_j)_Context,LSA    (3.3)

This way we can improve the original CF results, and also the results obtained solely from social tagging. Indeed, the Pearson CF approach returned some strange and irrelevant results, such as Michael Jackson or Zero 7 (see Table 3.5, left). After reordering the results, both artists disappear. Also, some artists that were not in the CF top–10 appear in the final set of similar artists (Table 3.5, right), due to the linear combination of the two approaches (Pearson CF and LSA from tags). In this case, the cascade chain method makes sense. The first results are obtained taking

The Dogs d'Amour           Similarity (Pearson)      The Dogs d'Amour       Similarity (Hybrid)
los fabulosos cadillacs    0.806                     electric boys          0.868
electric boys              0.788                     lillian axe            0.826
lillian axe                0.784                     ginger                 0.752
michael jackson            0.750                     enuff z'nuff           0.732
ginger                     0.723                                            0.724
the decemberists           0.699                     hardcore superstar     0.692
the byrds                  0.667                     faster pussycat        0.691
zero 7                     0.661                     firehouse              0.690
rancid                     0.642                     nashville pussy        0.677
the sonics                 0.629                     the wildhearts         0.651

Table 3.5: The Dogs d’Amour top–10 similar artists using CF with Pearson correlation distance (left), and (right) a hybrid version using only the top–100 similar artists from CF, and reordering the artists using LSA and cosine distance from social tagging.

into account the music users listen to; “people who listen to The Dogs d’Amour also listen to X”. Then, the second step promotes those artists X that are closer, in the semantic community annotation space, to the seed artist. Also, the results reported in Table 3.4, based only on LSA from social tagging, are very different from the final hybrid results12.
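A sketch of the cascade combination of equation 3.3 is given below; the CF and LSA similarity inputs are assumed to be precomputed (for instance, by the earlier sketches), and the artist names and scores are made up.

    def hybrid_similar_artists(seed, cf_top100, lsa_sim, alpha=0.5, top_n=10):
        """Blend the Pearson CF score of each of the seed's top-100 CF neighbours with its
        tag-based LSA similarity to the seed, as in equation 3.3.
        `cf_top100` maps candidate artist -> Pearson similarity to the seed;
        `lsa_sim(seed, artist)` returns the cosine similarity in the 100-factor tag space."""
        blended = {artist: (1 - alpha) * s_cf + alpha * lsa_sim(seed, artist)
                   for artist, s_cf in cf_top100.items()}
        return sorted(blended.items(), key=lambda kv: -kv[1])[:top_n]

    # Toy usage with hypothetical numbers:
    cf = {"electric boys": 0.788, "michael jackson": 0.750, "lillian axe": 0.784}
    lsa = lambda seed, a: {"electric boys": 0.95, "michael jackson": 0.30, "lillian axe": 0.87}[a]
    print(hybrid_similar_artists("the dogs d'amour", cf, lsa))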

Related work

Related work in hybrid music recommendation is presented in (Yoshii et al., 2008, 2007). The origins of their work can be found in (Yoshii et al., 2006). Yoshii et al. (2006) present a hybrid music recommender system based on a probabilistic generative model named the three-way aspect model (Popescul et al., 2001). The model explains the generative process for the observed data by introducing a set of latent variables. Their system integrates both explicit collaborative filtering and audio content–based features. The collaborative filtering component contains the users' ratings for the songs, on a [0..2] scale: a zero means that the user does not like the song, 1 means indifference, and 2 that the user likes the song. The content–based audio features consist of a Gaussian Mixture Model over the 13 MFCC coefficients. In (Yoshii et al., 2007), the authors improve the efficiency and scalability of the previous approach, using incremental learning.

12 After some inspection, and according to the author's knowledge of the band, the hybrid approach produces much better results than both LSA from social tagging and Pearson CF alone.

Tiemann and Pauws (2007) investigate ensemble learning methods for hybrid music recommender algorithms. Their approach combines social and content–based recommender algorithms. Each method produces a weak learner. Then, using a combination rule, the system unifies the output of the weak learners. The results suggest that the hybrid approach reduces the mean absolute prediction error, compared to using the weak learners alone.

3.5 Summary

This chapter has presented all the elements of music recommendation: user profile and item representation, and the existing recommendation methods. User preferences depend on the type of listener and her level of engagement with the music. Furthermore, music perception is very subjective, and it is influenced by the context. In this sense, user profile representation is an important aspect. We have presented three different notations: UMIRL, MPEG-7 based, and FOAF. The former is one of the first attempts in this field; the UMIRL language is not formal enough, but it is a proposal that contains some interesting ideas. User preferences in MPEG-7 are the first serious attempt to formalise user modelling related to multimedia content. The main problem of this approach is that the MPEG-7 standard is too complex and verbose; it is not straightforward to generate user profiles following the notation proposed by the standard. The last proposal, FOAF profiles, is based on the Semantic Web initiative. It is the most flexible approach. As it is based on the Semantic Web premises, FOAF profiles can embed different ontologies, so the approach is extensible, and has richer semantics than the other two. In music recommendation, item-based similarity is the most common way to predict the recommendations. Item profile representation is the first step to compute item similarity, and to provide recommendations to the users. Some recommendations for a rock band, The Dogs d'Amour, are also provided for most of the recommendation methods presented. An informal evaluation shows that a hybrid approach, using a mix of CF and context–based filtering from social tagging, produces the most interesting results.

Links with the following chapters

An important remaining task is the formal evaluation of item (and user) similarity, as it is the basis to provide music recommendations. This evaluation is presented in chapters

5, which presents the metrics, and 6, which contains the actual evaluation on real, large datasets. Also, the user's perceived quality of the recommendations is very important. We present, in chapter 7, an experiment with 288 subjects that analyses the effects of providing novel and relevant recommendations to users.

Chapter 4

The Long Tail in recommender systems

4.1 Introduction

The Long Tail is composed of a small number of popular items, the well-known hits, while the rest are located in the heavy tail: the items that do not sell that well. The Long Tail offers the possibility to explore and discover vast amounts of data, using automatic tools such as recommenders or personalised filters. Until now, the world was ruled by the Hit or Miss categorisation, due in part to the shelf space limitations of brick-and-mortar stores: a world where a music band could only succeed by selling millions of albums and touring worldwide. Nowadays, we are moving towards the Hit vs. Niche paradigm, where there is a large enough availability of choice to satisfy even the most "Progressive-obscure-Spanish-metal" fan. The problem, though, is to filter and present the right artists to the user, according to her musical taste. Chris Anderson introduces in his book, "The Long Tail", a couple of important conditions to exploit the content available in niche markets. These are: (i) make everything available, and (ii) help me find it (Anderson, 2006). It seems that the former condition is already fulfilled; the distribution and inventory costs are nearly negligible. Yet, to satisfy the latter we need recommender systems that exploit the "from hits to niches" paradigm. The main question, though, is whether current recommendation techniques are ready to assist us in this discovery task, providing recommendations of the hidden jewels in the Long


Tail.

In fact, recommenders that appropriately discount popularity may increase total sales, as well as potentially increase the margins by suggesting more novel, or less known, products (Fleder and Hosanagar, 2007). Tucker and Zhang (2008) develop a theoretical model which shows how the existence of popular items can, in fact, benefit the perceived quality of niche products. As these niche items are less likely to attract customers, the ones they do attract perceive the products as being of higher quality than the mainstream ones. The authors' findings contribute to the understanding of how popularity affects the long tail of e-Commerce. Even though web 2.0 tools based on the user's history of purchases promote the popular goods, their results suggest that mainstreamness benefits the perceived quality of niche products. Again, the big problem is to develop filters and tools that allow users to find and discover these niche products.

Pre– and post–filters

In the brick-and-mortar era, the market pre-filtered the products with a lower probability of being bought. The main problem was the limited physical space to store the goods. Nowadays, with unlimited shelf space, there is no need to pre-filter any product (Anderson, 2006). Instead, what users need are post-filters to make the products available and visible, and to get personalised recommendations according to their interests. Still, when publishers or producers pre-filter the content they also contribute to cultural production. For example, many books or albums would be a lot worse without their editors and producers.

One should assume that there are some extremely poor quality products along the Long Tail. These products do not need to be removed by the gatekeepers anymore, but can remain in the Long Tail forever. The advisors are the ones in charge of not recommending low quality goods. In this sense, Salganik et al. (2006) proved that increasing the strength of social influence increased both the inequality and the unpredictability of success. As a consequence, popularity was only partly determined by quality. In fact, the quality of a work cannot be assessed in isolation, because our experience is so tied up with other people's experience of that work. Therefore, one can find items to match anyone's taste along the Long Tail. It is the job of the post–filters to ease the task of finding them.

4.2 The Music Long Tail

As already mentioned in Chapter 1, the “State of the Industry” report (Soundscan, 2007) presents some insights about the long tail in music consumption. For instance, 844 million digital tracks were sold in 2007, but only 1% of all digital tracks—the head part of the curve—accounted for 80% of all track sales. Also, 1,000 albums accounted for 50% of all album sales, and 450,344 of the 570,000 albums sold were purchased fewer than 100 times. Music consumption is biased towards a few mainstream artists. Ideally, by providing personalised filters and discovery tools to the listeners, music consumption would be diversified.

The Long Tail of sales versus the Long Tail of plays

When computing a Long Tail distribution, one should define how to measure the popularity of the items. In the music domain, this can be achieved using the total number of sales or the total number of plays. On the one hand, the total number of sales denotes the current trends in music consumption. On the other hand, the total number of playcounts tells us what people listen to, independently of the release year of the album (or song). In terms of coverage, total playcounts are more useful, as they can represent a larger number of artists. An artist does not need to have a released album, only a Myspace–like page, which includes the playcounts for each song. Gathering information about the number of plays is easier than collecting the albums an artist has sold. Usually, the number of sales is shown in absolute values, aggregating all the information, and these numbers are used to compare the evolution of music consumption over the years. The total number of plays gives us more accurate information, as it describes what people listen to. Thus, we define the Long Tail in music using the total playcounts per artist. As an example, Table 4.1 shows the overall most played artists at last.fm in July, 2007. These results come from more than 20 million registered users. Although the list of top–10 artists is biased towards this set of users, it still represents the listening habits of a large amount of people. In contrast, Table 4.2 shows the top–10 artists in 2006 based on total digital track sales (last column) according to the Soundscan (2006) report. The second column (values in parenthesis) shows the corresponding last.fm artist rank. There is no clear correlation between the two lists, and only one artist (Red Hot Chili Peppers) appears in both top–10 lists.

1. The Beatles (50,422,827)
2. Radiohead (40,762,895)
3. System of a Down (37,688,012)
4. Red Hot Chili Peppers (37,564,100)
5. Muse (30,548,064)
6. Death Cab for Cutie (29,335,085)
7. Pink Floyd (28,081,366)
8. Coldplay (27,120,352)
9. Nine Inch Nails (24,095,408)
10. Blink 182 (23,330,402)

Table 4.1: Top–10 popular artists in last.fm according to the total number of plays (last column). Data gathered during July, 2007.

1. (912) (3,792,277)
2. (175) (3,715,579)
3. (205) Fray (3,625,140)
4. (154) All-American Rejects (3,362,528)
5. (119) Justin Timberlake (3,290,523)
6. (742) Pussycat Dolls (3,277,709)
7. (4) Red Hot Chili Peppers (3,254,306)
8. (92) Nelly Furtado (3,052,457)
9. (69) Eminem (2,950,113)
10. (681) Sean Paul (2,764,505)

Table 4.2: Top–10 artists in 2006 based on total digital track sales (last column) according to the Nielsen report (Soundscan, 2006). The second column (values in parentheses) shows the corresponding last.fm artist rank.

1. (912) Rascal Flatts (4,970,640)
2. (70) Johnny Cash (4,826,320)
3. (175) Nickelback (3,160,025)
4. (1514) Carrie Underwood (3,016,123)
5. (1) The Beatles (2,812,720)
6. (1568) Tim McGraw (2,657,675)
7. (2390) Andrea Bocelli (2,524,681)
8. (1575) Mary J. Blige (2,485,897)
9. (1606) Keith Urban (2,442,577)
10. (119) Justin Timberlake (2,437,763)

Table 4.3: Top–10 selling artists in 2006 (based on total album sales, last column) according to the Nielsen report (Soundscan, 2006). The second column (values in parentheses) shows the corresponding last.fm artist rank.

Furthermore, Table 4.3 shows the top–10 selling artists in 2006 based on total album sales (last column), again according to the Nielsen report (Soundscan, 2006). In this case, classic artists such as Johnny Cash (top–2) or The Beatles (top–5) appear. This reflects the type of users that still buy CDs. Regarding Carrie Underwood, she is an American country pop singer who became famous after winning the fourth season of American Idol (2005). Her album, released in late 2005, became the fastest selling debut Country album. Keith Urban, Tim McGraw and Rascal Flatts are American country/pop acts with a leading male singer. In all these cases, they are not so popular in the last.fm community. All in all, only The Beatles (in Table 4.3) and Red Hot Chili Peppers (in Table 4.2) appear in the top–10 last.fm chart (see Table 4.1). It is worth noting that in 2006 The Beatles music collection was not (legally) available for purchase in digital form. On the other hand, last.fm listening habits denote what people listen to, and that does not necessarily correlate with the best sellers. For instance, classic bands such as Pink Floyd, Led Zeppelin (at top–15), Tool (top–16) or Nirvana (top–18) did not release any new album during 2006, but they are still in the top–20 (as of mid–2007). From this informal analysis we conclude that popularity is a nebulous concept that can be viewed in different ways. From now on, we characterise music popularity using the total playcounts of an artist, keeping in mind that this data is not correlated with the actual number of sales, and also that the data will be biased towards the subset of users that are taken into account (in our case, the entire last.fm community).

Collecting playcounts for the music Long Tail

In the music field, total artist playcounts allow us to determine artist popularity. There are at least two different ways to collect artists' plays from the web. The first one is using last.fm data, and the second one is using data from Myspace. In principle, one should expect a clear correlation between the two datasets. That is, if an artist has a lot of plays in one system then the same should happen in the other one. However, each system measures different listening habits. On the one hand, last.fm monitors what users listen to on virtually any device, whereas Myspace only tracks the number of times a song has been played in their embedded Flash player. On the other hand, Myspace data can track the number of plays for those artists that have not released any album, but only a list of songs (or demos) that are available on the Myspace artist profile. In this case, it is very unlikely to gather this data from last.fm because the only available source to listen to the songs is via Myspace (especially if the artist forbids users to download the songs from Myspace). For example, the artist Thomas Aussenac has (on October 21st, 2008) 12,486 plays in Myspace¹ but only 63 in last.fm². Therefore, sometimes (e.g. for head and mid artists) both systems can provide similar listening habits, whilst in other cases they track and measure different trends. Some plausible reasons for these differences could be the demographics and locale of both users and artists in the two systems. Figure 4.1 depicts the total playcounts for an artist in last.fm versus the total playcounts in Myspace (data gathered during January, 2008). That is, given the playcounts of an artist in last.fm, it plots its total plays in Myspace. We remark two interesting zones, upper left and bottom right (depicted in dark red and violet). These areas are the ones with those artists whose playcounts are clearly uncorrelated between the two datasets. For instance, the upper left (dark red) area shows the artists that have lots of plays in Myspace, but just a few in last.fm. The formula used to select the artists in this area is (analogous for the last.fm versus Myspace case):

$$ Plays_{Myspace} > 10^{5} \;\wedge\; \frac{\log(Plays_{Myspace})}{\log(Plays_{Last.fm})} \geq 1.5 \qquad (4.1) $$

That is, artists that have more than 100,000 plays in Myspace, but many fewer in last.fm. In this case, we could consider that some of these artists are well–known in the Myspace area, having lots of fans that support them, but still with no effect outside Myspace.

1 http://www.myspace.com/thomasaussenac
2 http://www.last.fm/music/thomas+aussenac

Figure 4.1: Correlation between last.fm and Myspace artist playcounts. Data gathered during January, 2008.

Maybe this type of artist can reach a broader popularity after releasing an album. For instance, Michael Imhof³, a German house and r&b artist, has more than 200,000 playcounts in Myspace, but only 2 in last.fm. A more extreme example is Curtis Young⁴ (aka Hood Surgeon), the son of legendary hip–hop producer Dr. Dre, who has 13,814,586 plays in Myspace but fewer than 20,000 in last.fm. It is worth mentioning that there are some services⁵ that allow a Myspace artist to automatically increase their total playcounts, without the need for real users. Manipulating the total “real” playcounts is a problem when combining the results from both datasets. All in all, there are different ways of measuring an artist's popularity, and there might even exist different domains of popularity; what is popular in one domain can be unknown in another.

3 http://www.myspace.com/michaelimhof
4 http://www.myspace.com/curtisyoungofficial
5 Such as http://www.somanymp3s.com/

Figure 4.2: The Long Tail for artist popularity. A log–linear plot depicting the total number of plays. Data gathered during July, 2007, for a list of 260,525 artists.

As previously stated, popularity is a nebulous concept that can be viewed in different ways.

An example

Figure 4.2 depicts the Long Tail popularity, using total playcounts, for 260,525 music artists. The horizontal axis contains the list of artists ranked by their total playcounts; e.g. The Beatles, at position 1, has more than 50 million playcounts. This data was gathered from last.fm during July, 2007. Last.fm provides plugins for almost any desktop music player (as well as iPhones and other mobile devices) to track users' listening behaviour. It also provides a Flash player embedded in their website, and a client for PC, Mac and Linux that can create personalised audio streams. Figure 4.2 corroborates the music consumption reports by Nielsen (Soundscan, 2007): a few artists concentrate most of the total plays, whilst many musicians hold the rest.

Figure 4.3: The Long Tail for artist popularity. Same plot as Figure 4.2 in log–log scale. The best fit is a log–normal distribution, with a mean of log µ = 6.8, and standard deviation of log σ = 2.18. The fast drop in the tail is in part due to misspelled artists (e.g. incorrect metadata in the ID3 tags).

Figure 4.3 presents the same data as Figure 4.2, in log–log scale. The best fit for the curve is a log–normal distribution, with parameters mean of log µ = 6.8, and standard deviation of log σ = 2.18 (more information about fitting a curve with a distribution model is presented in section 4.3.2). It is worth noting that the fast drop in the tail is in part due to misspelled artists (e.g. incorrect metadata in the ID3 tags).
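As a rough illustration of how such a fit can be obtained, the following sketch estimates the log–normal parameters by maximum likelihood. It is not the code used in the thesis; the vector playcounts, holding one total per artist, is an assumption.

```r
# Minimal sketch, assuming `playcounts` is a numeric vector with one total
# playcount per artist (zeros removed, since log(0) is undefined).
library(MASS)

fit <- fitdistr(playcounts[playcounts > 0], "lognormal")
fit$estimate  # meanlog and sdlog, to be compared with mu = 6.8 and sigma = 2.18
```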

4.3 Definitions

The Long Tail of a catalog is measured using the frequency distribution (e.g. of purchases, downloads, etc.), ranked by item popularity. We now present two definitions for the Long Tail. The first one is an informal, intuitive one. The second one is a quantitative definition that uses a formal model to characterise the shape of the curve, and a method to fit the data to some well–known distributions (e.g. power–law, power–law with exponential decay, log–normal, etc.).

4.3.1 Qualitative, informal definition

According to Anderson (2006), the Long Tail is divided into two separate parts: the head and the tail. The head part contains the items one can find in the old brick–and–mortar markets. The tail of the curve is characterised by the remainder of the existing products. This includes the items that are available in on–line markets. Chris Anderson's definition, based on the economics of the markets, is:

“The Long Tail is about the economics of abundance; what happens when the bottlenecks that stand between supply and demand in our culture start to disappear and everything becomes available to everyone”.

The definition emphasises the existence of two distinct markets: the familiar one (the Head), and the long ignored one, emerging since the explosion of the web (the Tail), consisting of small niche markets. Another definition is the one by Jason Foster:

“The Long Tail is the realization that the sum of many small markets is worth as much, if not more, than a few large markets”.6

Both definitions are based on markets and economics, and do not propose any computational model to compute and characterise a tail curve, nor to fit the data to any existing distribution. Indeed, Anderson (2006) does not define how to split the head and the tail parts, which are the two key elements in both definitions.

Physical apples versus online oranges

Since The Long Tail book became a top–seller, there has been a lot of criticism of Anderson's theory. The most common criticism is the lack of scientific backup when comparing different data sources. That is, when comparing the online world to the physical world, Anderson simplifies too much. For instance, he considers only one brick–and–mortar store

6 From http://www.thelongtail.com/the_long_tail/2005/01/definitions_fin.html

(e.g. Walmart), and compares their music catalog with the one found in the Rhapsody online store. However, in the real world there are many more music stores than Walmart. Indeed, there are specialised music stores that carry ten times the volume of Walmart's music catalog. Sadly enough, these are completely ignored in Anderson's studies (Slee, 2006). In addition, there is no clear evidence that online stores can monetise the Long Tail. According to Elberse (2008), there is no evidence of a shift in online markets towards promoting the tail. The tail is long, but extremely flat. In their results, hit–driven economies are found in both physical and online markets. Furthermore, in an older study, Elberse and Oberholzer-Gee (2006) found that the long tail of movies, those that sell only a few copies every week, nearly doubled during their study period. However, the number of non–selling titles rose four times, thus increasing the size of the tail. Regarding the head of the curve, a few mainstream movies still accounted for most of the sales. Another drawback of the theory is the creation of online oligarchies. “Make everything available” is commonly achieved by One–Big–Virtual–Tent rather than Many–Small–Tents⁷. That is to say, there is only one Amazon that provides most of the content. Last but not least, Anderson's theory states that the Long Tail follows a power–law distribution, that is, a straight line in a log–log plot. However, only plotting a curve in a log–log scale is not enough to verify that the curve follows a power–law. It may fit better to other distributions, such as a log–normal or a power–law with an exponential decay of the tail. We need, then, a model that allows us to quantitatively define the shape of the Long Tail curve, without needing to link it with niche markets, economics, or profitable (or not) e–Commerce websites.

4.3.2 Quantitative, formal definition

The Long Tail model, F (x), simulates any heavy–tailed distribution (Kilkki, 2007). It models the cumulative distribution of the Long Tail data. F (x) represents the share (%) of total volume covered by objects up to rank x:

$$ F(x) = \frac{\beta}{\left(\frac{N_{50}}{x}\right)^{\alpha} + 1} \qquad (4.2) $$

7 See Tom Slee's critical reader's companion to “The Long Tail” book at http://whimsley.typepad.com/whimsley/2007/03/the_long_tail_l.html

where α is the factor that defines the S–shape of the function, β is the total volume share (and also describes the amount of latent demand), and N50, the median, is the number of objects that cover half of the total volume, that is, F(N50) = 50. Once the Long Tail is modelled using F(x), we can divide the curve into three parts: head, mid, and tail. The boundary between the head and the mid part of the curve is defined by:

$$ X_{head \rightarrow mid} = N_{50}^{2/3} \qquad (4.3) $$

Likewise, the boundary between the mid part and the tail is:

$$ X_{mid \rightarrow tail} = N_{50}^{4/3} \simeq X_{head \rightarrow mid}^{2} \qquad (4.4) $$

Figure 4.4 depicts the cumulative distribution of the Long Tail of the 260,525 music artists presented in Figure 4.2. Interestingly enough, the top–737 artists, 0.28% of all the artists, account for 50% of the total playcounts, F(737) = 50, and only the top–30 artists hold around 10% of the plays. In this sense, the Gini coefficient measures the inequality of a given distribution, and it determines the degree of imbalance (Gini, 1921). In our Long Tail example, 14% of the artists hold 86% of total playcounts, yielding a Gini coefficient of 0.72. This value describes a skewed distribution, more unequal than the classic 80/20 Pareto rule, which corresponds to a value of 0.6. Figure 4.4 also shows the three different sections of the Long Tail.

The head of the curve, up to $X_{head \rightarrow mid}$, consists of only 82 artists, whilst the mid part contains 6,573 more ($X_{mid \rightarrow tail} = 6,655$). The rest of the artists are located in the tail.
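To make the split concrete, the following sketch reproduces the boundary values from $N_{50}$ and computes a simple Gini coefficient over the playcounts. The variable names are illustrative and N50 = 737 is taken from the text above:

```r
# Minimal sketch, assuming N50 has already been estimated (737 in the text)
# and `playcounts` holds one total per artist.
N50 <- 737
head_mid <- N50^(2/3)   # ~82: boundary between the head and the mid part
mid_tail <- N50^(4/3)   # ~6,657: boundary between the mid part and the tail

# A simple Gini coefficient of the playcount distribution.
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}
gini(playcounts)  # ~0.72 for the last.fm artist data described above
```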

Fitting a heavy–tailed distribution using F (x)

To use the F(x) function we need to fit the curve with an estimation of the α, β and N50 parameters. We do a non–linear regression, using the Gauss–Newton method for non–linear least squares, to fit the observations of the cumulative distribution to F(x)⁸. Figure 4.5 shows an example of the fitted distribution using the F(x) model. The data is the one from artist popularity in last.fm (Figure 4.4).

8 To solve the non–linear least squares we use the R statistical package. The code is available at http://mtg.upf.edu/~ocelma/PhD
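A minimal sketch of this kind of fit is shown below. It is not the code released at the URL above; the variable names, the starting values, and the scale of the share vector (a fraction in [0,1] here; if expressed as a percentage, β scales accordingly) are assumptions:

```r
# Minimal sketch, assuming `ranks` (1..N) and `share` (cumulative fraction of
# total playcounts covered by the top-ranked artists) were computed beforehand.
obs <- data.frame(rank = ranks, share = share)

# nls() uses the Gauss-Newton algorithm by default.
fit <- nls(share ~ beta / ((N50 / rank)^alpha + 1),
           data  = obs,
           start = list(alpha = 0.7, beta = 1, N50 = 700))
coef(fit)  # estimated alpha, beta and N50
```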

Figure 4.4: Example of the Long Tail model. It shows the cumulative percentage of playcounts of the 260,525 music artists from Figure 4.2. Only the top–737 artists, 0.28% of all the artists, accumulate 50% of total playcounts (N50). Also, the curve is divided into three parts: head, mid and tail (Xhead→mid = 82, and Xmid→tail = 6,655), so each artist is located in one section of the curve.

Figure 4.5: Example of fitting a heavy–tailed distribution (the one in Figure 4.4) with F (x). The black dots represent the observations while the white dotted curve represents the fitted model, with parameters α = 0.73, and β = 1.02.

4.3.3 Qualitative versus quantitative definition

On the one hand, the qualitative definition by Anderson (2006) emphasises the economics of the markets, and the shift from physical to virtual, online, goods. On the other hand, the quantitative definition is based on a computational model that allows us to fit a set of observations (of the cumulative distribution) to a given function, F (x).

The main difference between the two definitions (qualitative and quantitative) is the way each method splits the curve into different sections (e.g. the head and the tail). The qualitative approach is based on the % covered by x (e.g. “20% of the products represent 80% of sales”), whereas the quantitative definition splits the x (log) axis equally into three (head, mid, and tail) parts. The main problem is that when adding many more products to the curve (e.g. $10^4$), the changes in the head and tail boundaries are very radical in the qualitative definition. The quantitative approach does not suffer from this problem; the changes in the section boundaries are not so extreme.

4.4 Characterising a Long Tail distribution

An early mention of the “long tail”, in the context of the Internet, was Clay Shirky's essay in February, 2003⁹. After that, Anderson (2006) converted the term into a proper noun, and defined a new trend in economics. Since then, the spotlight on the “Long Tail” noun has created many different opinions about it. In our context, we use a “Long Tail” curve to describe the popularity phenomenon in any recommender system, to show how popularity can affect the recommendations. So, given a long tail distribution of the items' popularity, an important step is to characterise the shape of the curve to understand the amount of . We characterise a Long Tail distribution using Kilkki's function F(x). Its parameters α, β, and N50 define the shape of the curve. Yet, it is also important to determine the shape of the curve according to well–known probability density distributions.

Not all Long Tails are power–law

There are different probability density functions that can fit a heavy–tailed curve. We present some of them here: the power–law, the power–law with exponential decay, and the log–normal distribution. A power–law distribution is described using the probability density function (pdf), f(x):

$$ f(x) = a x^{-\gamma} \qquad (4.5) $$

A power–law distribution has the property of (asymptotic) scale invariance. This type of distribution cannot be entirely characterised by its mean and variance. Also, if the power–law exponent γ has a value close to 1, γ ≃ 1, then this means that the long tail is fat¹⁰. In other words, a power–law with γ ≫ 1 consists of a thin tail (with values close to 0), and a short head with a high probability value. A power–law with an exponential decay distribution differs from a power–law by the shape of the tail. Its pdf is defined by:

$$ f(x) = x^{-\gamma} e^{-\lambda x} \qquad (4.6) $$

9 See http://shirky.com/writings/powerlaw_weblog.html
10 This is the only case where Anderson's Long Tail theory can be applied.

There exists an N that denotes the threshold between the power–law distribution (x ≤ N) and the exponential decay (x > N). This means that, sometimes, there is a characteristic scale in the power–law that is better represented with an exponential cut–off. In a log–normal distribution the logarithm of the variable is normally distributed. That is to say, if a variable X is normally distributed, then Y = e^X has a log–normal distribution. The log–normal distribution promotes the head of the curve. It is a distribution skewed to the right, where the popular items have a strong effect, whilst the tail has a very small contribution to the pdf:

$$ f(x) = \frac{1}{x} \, e^{-\frac{(\ln(x) - \mu)^2}{2\sigma^2}} \qquad (4.7) $$

Thus, the main problem is, given a curve —in a log–log scale representation—, to decide which is the best model that explains the curve. It is worth noting that, according to Anderson's theory (i.e. the Long Tail is profitable), the curve should be modelled as a power–law with γ ≃ 1, meaning that the tail is fat. However, if the best fit is obtained using another distribution, such as a log–normal —which is very common—, then Anderson's theory cannot be strictly applied in that particular domain and context.

A model selection: power–law or not power–law?

To characterise a heavy–tailed distribution, we follow the steps described in (Clauset et al., 2007). As previously mentioned, the main drawbacks when fitting a Long Tail distribution are: (i) plotting the distribution on a log–log plot, and seeing whether it follows a straight line or not, and (ii) using linear regression by least squares to fit a line in the log–log plot, and then using R² to measure the fraction of variance accounted for by the fit. This approach gives a poor estimate of the model parameters, as it is meant to be applied to regression curves, not to compare distributions. Instead, to decide whether a heavy–tailed curve follows a power–law distribution, Clauset et al. (2007) propose the following steps (a sketch covering steps 1 and 3 appears after this list):

1. Estimate γ. Use the maximum likelihood estimator (MLE) for the γ scaling exponent. The MLE converges to the correct value of the scaling exponent.

2. Detect xmin. Use the goodness of fit value to estimate where the scaling region begins (xmin). The curve can follow a power–law on the right or upper tail, that is, above a given threshold xmin. The authors propose a method that can empirically find the best scaling region, based on the Kolmogorov–Smirnov D statistic.

3. Goodness of the model. Use, again, the Kolmogorov–Smirnov D statistic to compute the discrepancy between the empirical distribution and the theoretical one. The Kolmogorov–Smirnov (K–S) D statistic converges to zero if the empirical distribution follows the theoretical one (e.g. a power–law). The K–S D statistic for a given cumulative distribution function F(x), and its empirical distribution function Fn(x), is:

$$ D_n = \sup_{x} \, \left| F_n(x) - F(x) \right| \qquad (4.8) $$

where sup S is the supremum of the set S, that is, the lowest value that is greater than or equal to every element of S. The supremum is also referred to as the least upper bound.

4. Model selection. Once the data is fitted to a power–law distribution, the only remaining task is to check among the different alternatives. That is, to detect whether other, non power–law, distributions could have produced the data. This is done using pairwise comparisons (e.g. power–law versus power–law with exponential decay, power–law versus log–normal, etc.), and Clauset et al. (2007) use Vuong's test (Vuong, 1989). Vuong's test uses the log–likelihood ratio and the Kullback–Leibler information criterion to make probabilistic statements about the two models. Vuong's statistical test is used for the model selection problem, where one can determine which distribution is closer to the real data. A large, positive Vuong test statistic provides evidence in favour of the power–law over the other distribution, while a large, negative test statistic is evidence of the contrary.
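The following sketch illustrates steps 1 and 3 under the continuous approximation described by Clauset et al. (2007). The vectors and the candidate xmin are assumptions, and the full procedure would repeat this for every candidate xmin and add the Vuong comparison of step 4:

```r
# Minimal sketch of steps 1 and 3, assuming `x` holds artist playcounts and
# `xmin` is a candidate threshold for the scaling region.
x_tail    <- x[x >= xmin]
gamma_hat <- 1 + length(x_tail) / sum(log(x_tail / xmin))  # step 1: MLE of gamma

emp  <- ecdf(x_tail)                                # empirical cdf above xmin
theo <- function(t) 1 - (t / xmin)^(1 - gamma_hat)  # power-law cdf for t >= xmin
D    <- max(abs(emp(x_tail) - theo(x_tail)))        # step 3: K-S D statistic
```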

4.5 The dynamics of the Long Tail

An important aspect of any Long Tail is its dynamics. E.g., does an artist stay in the head region forever? Or, the other way around, will niche artists always remain in the long tail? Figure 4.6 depicts the increase of the Long Tail popularity after 6 months, using 50,000 out of the 260,525 last.fm artists (see Figure 4.2). Figure 4.6 shows the dynamics of the curve comparing two snapshots; one from July 2007, and the other from January 2008. The most important aspect is the increase of total playcounts in each area of the curve.

Figure 4.6: The dynamics of the Long Tail after 6 months (between July, 2007 and January, 2008). Radiohead, at top–2, is now closer to The Beatles (top–1), due to the release of their In Rainbows album.

Long Tail region    Increase (%)
Head                61.20
Mid                 62.29
Tail                62.32

Table 4.4: Increase of the Long Tail regions (in %) after 6 months (comparing two snapshots in July, 2007 and January, 2008).

Strike a chord?

Table 4.4 shows the playcount increment, in %. In all three regions —head, mid, and tail— the percentage increment of plays is almost the same (around 62%), meaning that not many artists move between the regions. For instance, in the head area, Radiohead at top–2 is much closer to top–1, The Beatles, due to the release of the In Rainbows album. Still, the band remains at top–2. An interesting example in the tail area is the Nulla Costa band. This band was at rank 259,962 in July, 2007. After six months they increased from 3 last.fm playcounts to 4,834, moving to rank 55,000. Yet, the band is still in the tail region. We could not detect any single artist that clearly moved from the tail to the mid region¹¹. There exist niche artists, and the main problem is to find them. The only way to leverage the long tail is by providing recommendations that promote unknown artists. Once the Long Tail is formally described, the next step is to use this knowledge when providing recommendations. The following section presents how one can exploit the Long Tail to provide novel or familiar recommendations, taking into account the user profile.

4.6 Novelty, familiarity and relevance

“If you like The Beatles you might like...X”. Now, ask several different people and you will get lots of different X's. Each person, according to her ties with the band's music, would be able to propose interesting, surprising or expected X's. Nonetheless, asking the same question to different recommender systems, we are likely to get similar results. Indeed, two out of five tested music recommenders contain , Paul McCartney and in their top–10 (last.fm and the.echotron.com). Yahoo! Music recommends John Lennon and Paul McCartney (1st and 4th position), whereas Mystrands.com only contains John Lennon (at top–10).

11 Last.fm has the “hype artist” weekly chart, http://www.last.fm/charts/hypeartist, a good source to track the movements in the Long Tail curve.

Neither ilike nor Allmusic.com contain any of these musicians in their list of Beatles' similar artists. Furthermore, Amazon's top–30 recommendations for the Beatles' White Album consist strictly of other Beatles albums (all of a sudden, at the fourth page of the navigation there is the first non–Beatles album; Exile on Main St. by The Rolling Stones). Finally, creating a playlist from OneLlama.com—starting with a Beatles seed song—one gets four out of ten songs from the Beatles, plus one song from John Lennon, making up half of the playlist. It is worth mentioning that these recommenders use different approaches, such as: collaborative filtering, web mining and co–occurrence analysis of playlists. To conclude this informal analysis, the most noticeable fact is that only last.fm remembers Ringo Starr! One can agree or disagree with all these Beatles' similar artist lists. However, there are very few, if any, serendipitous recommendations (the rest of the similar artists were, in no particular order: The Who, The Rolling Stones, The Beach Boys, The Animals, and so on). Indeed, some of the aforementioned systems provide filters, such as: “surprise me!” or the “popularity slider”, to dive into the Long Tail of the catalog (Anderson, 2006). Thus, novel recommendations are sometimes necessary to improve the user's experience and discovery in the recommendation workflow. It is not our goal to decide whether one can monetise the Long Tail or exploit the niche markets, but to help people discover those items that are lost in the tail. Hits exist and they always will. Our goal is to motivate and guide the discovery process, presenting to users rare, non–hit items they could find interesting.

4.6.1 Recommending the unknown

It has been largely acknowledged that item popularity can decrease user satisfaction by providing obvious recommendations (Herlocker et al., 2004; McNee et al., 2006). Yet, there is no clear recipe for providing good and useful recommendations to users. We can foresee at least three key elements that should be taken into account. These are: novelty and serendipity, familiarity, and relevance (Celma and Lamere, 2007). According to the Wordnet dictionary¹², novel (adj.) has two senses: “new – original and of a kind not seen before”; and “refreshing – pleasantly new or different”. Serendipity (noun) is defined as “good luck in making unexpected and fortunate discoveries”. Familiar (adj.) is defined as “well known or easily recognised”. In our context, we measure the novelty for a given user u as the ratio of unknown items in the list of top–N recommended items, $L_N$:

12 http://wordnet.princeton.edu

Figure 4.7: A user profile represented in the Long Tail. The profile is exhibited as the number of times the user has interacted with that item.

$$ Novelty(u) = \frac{\sum_{i \in L_N} \left( 1 - Knows(u,i) \right)}{N} \qquad (4.9) $$

being Knows(u,i) a binary function that returns 1 if user u already knows item i, and 0 otherwise. Likewise, the user's familiarity with the list of recommended items can be defined as $Familiar(u) = 1 - Novelty(u)$.

Ideally, a user should be familiar with some of the recommended items, to improve confidence and trust in the system. Also, some items should be unknown to the user (discovering hidden items in the catalog). A system should also give an explanation of why those —unknown— items were recommended, providing a higher confidence and transparency on these recommendations. The difficult job for a recommender is, then, to find the proper level of familiarity, novelty and relevance for each user. Figure 4.7 shows the long tail of item popularity, and it includes a user profile. The profile is exhibited as the number of times the user has interacted with each item. Taking into account item popularity plus the user profile information, a recommender can provide personalised, relevant recommendations that are also novel to the user.
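As a small illustration, Novelty(u) and Familiar(u) reduce to a one–line computation once the set of items the user already knows is available. This is a hypothetical helper, not code from the thesis:

```r
# Minimal sketch, assuming `recommended` holds the top-N item ids for user u
# and `known` holds the ids of the items u has already interacted with.
novelty <- function(recommended, known) {
  mean(!(recommended %in% known))   # fraction of unknown items in the top-N
}
familiarity <- function(recommended, known) {
  1 - novelty(recommended, known)
}
```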

Figure 4.8: Trade–off between novelty and relevance for a user u.

Trade–off between novelty and relevance

However, there is a trade–off between novelty and relevance for the user. The more novel, unknown items a recommender presents to a user, the less relevant they can be perceived by her. Figure 4.8 presents the trade–off between novelty and relevance. It shows the different recommendation states for a given user u, given a large collection of items (not only the user's personal collection). The gray triangle represents the area a recommender should focus on to provide relevant items to u. On the one hand, laid–back recommendations (bottom–right) appear when the system recommends familiar and relevant items to u. On the other hand, the discovery process (top–right) starts when the system provides to the user (potentially) unknown items that could fit in her profile. The provided recommendations should conform to the user's intentions; sometimes a user is expecting familiar recommendations (laid–back state), while in other cases she is actively seeking to discover new items. There are two more cases: when the recommender provides popular items, and when it provides random ones.

This can happen when there is not enough information about the user (e.g. the user cold–start problem). In this case, the system can recommend popular items (bottom–left). Popular items are expected to be somehow familiar to the user, but not necessarily relevant to her. The other situation is when the system provides random recommendations to u (top–left). This case is similar to a shuffle playlist generator, with the difference that in our case the items' catalog is much bigger than the personal music collection of u. Thus, there is less chance that user u might like any of the random recommendations, as they are not personalised at all.

4.6.2 Related work

Serendipity and novelty are relevant aspects in the recommendation workflow (McNee et al., 2006). Indeed, there is some related work that explicitly addresses these aspects. For instance, five measures to capture redundancy are presented in (Zhang et al., 2002). These measures allow one to infer whether an item—that is considered relevant—contains any novel information for the user. Yang and Li (2005) define novelty in terms of the user's knowledge and her degree of interest in a given item. Weng et al. (2007) propose a way to improve the quality and novelty of the recommendations by means of a topic taxonomy–based recommender, and hot topic detection using association rules. Other proposals include disregarding items if they are too similar to other items that the user has already seen (Billsus and Pazzani, 2000), or simple metrics to measure novelty and serendipity based on the average popularity of the recommended items (Ziegler et al., 2005). Even though all these approaches focus on providing novel and serendipitous recommendations, there is no framework that consistently evaluates the provided recommendations. Thus, there is a need to design evaluation metrics to deal with the effectiveness of novel recommendations, not only measuring prediction accuracy, but taking into account other aspects such as usefulness and quality (Herlocker et al., 2004; Adomavicius and Tuzhilin, 2005). Novelty metrics should look at how well a recommender system made a user aware of previously unknown items, as well as to what extent users accept the new recommendations (Herlocker et al., 2004). Generally speaking, the most popular items in the collection are the ones a given user is most likely to recognise, or be broadly familiar with. Likewise, one can assume that items with less interaction—rating, purchasing, previewing—within the community of users are more likely to be unknown (Ziegler et al., 2005). In this sense,

the Long Tail of the items’ catalog (Anderson, 2006) assists us in deciding how novel or familiar an item could be. Yet, a recommender system must predict whether an item could be relevant, and then be recommended, to a user.

4.7 Summary

Effective recommendation systems should promote novel and relevant material (non–obvious recommendations), taken primarily from the tail of a popularity distribution. In this sense, the Long Tail can be described in terms of niche markets' economics, but also by describing the item popularity curve. We use the latter definition —the Long Tail model, F(x)— to describe the cumulative distribution of the curve. In the music field, the F(x) model allows us to define artist popularity, and the artist's location in the curve (head, mid or tail region). Hence, F(x) denotes the shared knowledge about an artist, by a community of listeners. From this common knowledge, we can derive whether an artist can be novel and relevant to a given user profile. Our results show that music listening habits follow the hit–driven (or mainstream) paradigm: 0.28% (737 out of 260,525) of the artists account for 50% of the total playcounts. The best fit (in the log–log plot) for the music Long Tail is a log–normal distribution. A log–normal distribution concentrates most of the information in the head region. Even though we use playcounts and not total sales to populate the curve, this finding casts doubt on Anderson's theory about the economics and monetisation of the Long Tail. Regardless of the failure or success of Anderson's theory, the main idea is still an interesting way to explain the changes the web has provoked, in terms of the availability of all kinds of products —from hits to niches. One of the goals of a recommender should be to promote the tail of the curve by providing relevant, personalised, novel recommendations to its users. That is, to smoothly interconnect the head and mid regions with the tail, so the recommendations can drive interest from one to the other. Figure 4.9 presents this idea. It depicts a 3D representation of the Long Tail; showing the item popularity curve, a user profile example (denoted by her preferred items, in gray colour), and the similarities among the items. The set of candidate items to be recommended to the user are shown (in violet) and their height denotes the relevance for the user. Candidate items located in the tail part are considered more novel —and, potentially relevant— than the ones in the head region.

Figure 4.9: (best seen in colour) A 3D representation of the Long Tail. It adds another dimension; the similarities among the items, including the representation of a user profile (in gray). The set of candidate items to be recommended to the user are shown (in violet) and their height denotes the relevance for the user. Candidate items located in the tail part are considered more novel —and, potentially relevant— than the ones in the head region.

Links with the following chapters

In this chapter we have presented the basics for novelty detection in a recommender system, using the popularity information and its Long Tail shape. The next step is to evaluate these types of recommendations. We can foresee two different ways to evaluate novel recommendations, related to (i) exploring the available (and usually very large) item catalog, and (ii) filtering new incoming items. In this Thesis we mainly focus on the former case, and we present two complementary evaluation methods. On the one hand, the network–centric evaluation method (presented in chapter 6) focuses on analysing the items' similarity graph, created using any item–based recommendation algorithm. The aim is to detect whether the intrinsic topology of the items' network has any pathology that hinders novel recommendations, promoting the most popular items. On the other hand, a user–centric evaluation aims at measuring the perceived quality of novel recommendations. This evaluation is presented in chapter 7. Yet, before presenting the evaluation results we introduce, in chapter 5, the metrics that we use.

Chapter 5

Evaluation metrics

This chapter presents the different evaluation methods for a recommender system. We introduce the existing metrics, as well as the pros and cons of each method. This chapter is the background for chapters 6 and 7, where the proposed metrics are applied to real, large recommendation datasets.

5.1 Evaluation strategies

We classify the evaluation of recommender algorithms into three groups: system–, network–, and user–centric.

• System–centric evaluation measures how accurately the system can predict the actual values that users have previously assigned. This approach has been extensively used in collaborative filtering, with explicit feedback (e.g. ratings).

• Network–centric evaluation aims at measuring the topology of the item (or user) similarity network. It uses metrics from complex network analysis (CNA).

• User–centric evaluation focuses on the user's perceived quality and usefulness of the recommendations.

The following sections are devoted to explaining each evaluation method.


5.2 System–centric evaluation

As of today, system–centric evaluation has been extensively studied. The most common approaches are based on the leave–n–out method (Breese et al., 1998), which resembles the classic n–fold cross validation. Given a dataset of items a user has implicitly or explicitly interacted with (via ratings, purchases, downloads, previews, etc.), the dataset is split in two—usually disjoint—sets: training and test. The evaluation of the accuracy is based only on a user's dataset, so the rest of the items are ignored. Figure 5.1 presents the method.

The evaluation process then includes several metrics, such as: predictive accuracy metrics (Mean Absolute Error, Root Mean Square Error), decision–based metrics (Mean Average Precision, Recall, F–measure, and ROC), and rank–based metrics (Spearman's ρ, Kendall–τ, and half–life utility) (Herlocker et al., 2004). The main problem, though, is to develop evaluation metrics that deal with the effectiveness of the recommendations. That is, not only measuring prediction accuracy, but taking into account other aspects such as usefulness and quality (Adomavicius and Tuzhilin, 2005).

5.2.1 Predictive–based metrics

Predictive metrics aim at comparing the predicted values against the actual values. The result is the average over the deviations.

Mean Absolute Error (MAE)

Mean Absolute Error (MAE) measures the deviation between the predicted value and the real value.

$$ MAE = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{R}_i - R_i \right| \qquad (5.1) $$

where $\hat{R}_i$ is the predicted value and $R_i$ the true value.

Figure 5.1: System–centric evaluation is based on the analysis of the subcollection of items of a user, using the leave–n–out method (Breese et al., 1998), and aggregating (e.g. averaging) the results for all users to provide a final compact metric.

Root Mean Squared Error (RMSE)

Mean Squared Error (MSE) is also used to compare the predicted value with the real preference value a user has assigned to an item.

$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (\hat{R}_i - R_i)^2 \qquad (5.2) $$

The difference between MAE and MSE is that MSE heavily emphasises large errors.

Root Mean Squared Error (RMSE) equals the square root of the MSE value.

$$ RMSE = \sqrt{MSE} \qquad (5.3) $$

RMSE is one of the most widely used metrics in collaborative filtering based on explicit ratings. RMSE is the metric used in the Netflix $1,000,000 contest.
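As a quick illustration, both metrics are one–liners once the predictions and the ground truth are aligned. The vectors below are hypothetical, not taken from the thesis:

```r
# Minimal sketch, assuming `predicted` and `actual` are numeric vectors of the
# same length, aligned by (user, item) pair.
mae  <- mean(abs(predicted - actual))
rmse <- sqrt(mean((predicted - actual)^2))
```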

5.2.2 Decision–based metrics

Decision–based metrics evaluate the top–N recommendations for a user. Recommendations come in a ranked list of items, ordered by decreasing relevance. There are four different cases to take into account:

• True positive (TP). The system recommends an item the user is interested in.
• False positive (FP). The system recommends an item the user is not interested in.
• True negative (TN). The system does not recommend an item the user is not interested in.

• False negative (FN). The system does not recommend an item the user is interested in.

                    Relevant    Not relevant
Recommended         TP          FP
Not recommended     FN          TN

Table 5.1: Contingency table showing the categorisation of the recommended items in terms of relevant or not. Precision and recall metrics are derived from the table.

Precision (P) and recall (R) are obtained from the 2x2 contingency table (or confusion matrix) shown in Table 5.1. The recommended items are separated into two classes: relevant or not relevant, according to the user profile. When the rating scale is not binary, we need to transform it into a binary scale to decide whether the item is relevant or not; e.g. in a rating scale of [1..5], ratings of 4 or 5 are considered relevant, and ratings from 1 to 3 as not relevant.

Precision

Precision measures the fraction of relevant items over the recommended ones.

$$ Precision = \frac{TP}{TP + FP} \qquad (5.4) $$

Recall

The recall measures the coverage of the recommended items, and is defined as:

$$ Recall = \frac{TP}{TP + FN} \qquad (5.5) $$

Recall is also known as sensitivity, true positive rate (TPR), or hit–rate.

F–measure

F–measure combines P and R results, using the weighted harmonic mean. The general formula (for a non-negative real β) is:

$$ F_\beta = \frac{(1 + \beta^2) \cdot (precision \cdot recall)}{\beta^2 \cdot precision + recall} \qquad (5.6) $$

Two common F–measures are F1 and F2. In F1 recall and precision are evenly weighted, and F2 weights recall twice as much as precision.

Accuracy

Accuracy is the simplest way to evaluate the predicted recommendations. Accuracy measures the ratio of correct predictions versus the total number of items evaluated. Accuracy is also obtained from the 2x2 contingency table.

$$ Accuracy = \frac{TP + TN}{TP + FP + TN + FN} \qquad (5.7) $$
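A compact sketch of these decision–based metrics follows. The four counts are hypothetical and would come from a contingency table like Table 5.1:

```r
# Minimal sketch, assuming the counts tp, fp, tn and fn were obtained by
# thresholding the ratings (e.g. ratings of 4 or 5 are considered "relevant").
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)   # a.k.a. sensitivity, TPR, hit-rate
f1        <- 2 * precision * recall / (precision + recall)
accuracy  <- (tp + tn) / (tp + fp + tn + fn)
```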

Receiver Operating Characteristic (ROC) curve

The Receiver Operating Characteristic (ROC) curve measures the selection of high–quality items from the recommended list. ROC measures the trade–off between hit–rates (TPR) and false–alarm rates (or false positive rates, FPR). The hit–rate, or true positive rate, is defined as TPR = Recall. The false positive rate equals FPR = FP / (FP + TN). ROC can visualise the trade–off between TPR and FPR. The random curve assigns a probability of 50% to each of the two classes (recommended, not recommended). The area under the curve (AUC) is a measure that summarises a ROC result. A random curve has an AUC of 0.5. The closer the AUC is to 1, the better.
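A rough sketch of how the ROC points and a step–wise AUC could be computed from a ranked recommendation list follows; the vectors score and relevant are assumptions, and the AUC is only a step–wise approximation:

```r
# Minimal sketch, assuming `score` holds the predicted relevance (higher means
# ranked earlier) and `relevant` the 0/1 ground truth for the same items.
ord <- order(score, decreasing = TRUE)
rel <- relevant[ord]
tpr <- cumsum(rel) / sum(rel)           # hit-rate after each cut-off
fpr <- cumsum(1 - rel) / sum(1 - rel)   # false-alarm rate after each cut-off
auc <- sum(diff(c(0, fpr)) * tpr)       # step-wise approximation of the AUC
```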

The main drawback of decision–based metrics is that they do not take into account the ranking of the recommended items. Thus, an item at top–1 has the same relevance as an item at top–20. To avoid this problem, we can use rank–based metrics.

5.2.3 Rank–based metrics

Rank–based metrics use the item position in the predicted list of recommendations. The idea is that top items should be considered more relevant than the items at the bottom of the recommendation list.

Spearman’s rho (ρ)

Spearman’s ρ computes the rank–based Pearson correlation of two ranked lists. It compares the predicted list with the user profile information (e.g. the ground truth data), and it takes into account the ranking position of each recommended item. Spearman’s ρ is defined as:

$$ \rho = 1 - \frac{6 \sum d_i^2}{n (n^2 - 1)} \qquad (5.8) $$

where $d_i = x_i - y_i$ denotes the difference between the ranks of corresponding values $\hat{R}_i$ and $R_i$.

Kendall–tau (τ)

Kendall–τ also compares the recommended list with the user’s list of items (e.g. the ground truth data). Kendall–τ rank correlation coefficient is defined as:

$$ \tau = \frac{n_c - n_d}{\frac{1}{2} n (n - 1)} \qquad (5.9) $$

where $n_c$ is the number of concordant pairs, and $n_d$ is the number of discordant pairs in the data set.
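In practice both coefficients are available in base R; a minimal sketch with hypothetical rank vectors would be:

```r
# Minimal sketch, assuming `predicted_rank` and `true_rank` hold the ranks of
# the same items in the recommended list and in the ground truth, respectively.
rho <- cor(predicted_rank, true_rank, method = "spearman")
tau <- cor(predicted_rank, true_rank, method = "kendall")
```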

Average Reciprocal Hit–Rate

Average Reciprocal Hit–Rate (ARHR) was first used in (Karypis, 2001). ARHR rewards each hit based on where it is located in the top–N list. ARHR is defined as:

$$ ARHR = \frac{1}{n} \sum_{i=1}^{h} \frac{1}{p_i} \qquad (5.10) $$

where h is the number of hits that occurred at positions $p_1, p_2, \ldots, p_h$ within the top–N lists. Hits that occur earlier in the top–N lists are weighted higher than hits that occur later in the list. ARHR resembles the Mean Reciprocal Rank metric from IR.
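A small sketch of this measure, following Equation 5.10 with hypothetical inputs:

```r
# Minimal sketch, assuming `hit_positions` holds the positions (1..N) at which
# relevant items were found across the users' top-N lists, and `n` is the
# number of users.
arhr <- sum(1 / hit_positions) / n
```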

5.2.4 Other metrics

Half–life utility

The half–life utility metric attempts to evaluate the utility of the predicted list of items (Breese et al., 1998). The utility is defined as the deviation between a user's rating and the default rating for an item. So, half–life utility can be used in algorithms that are based on explicit user feedback, such as ratings. Breese et al. (1998) describe the likelihood that a user will view each successive item in the ranked list with an exponential decay function. The strength of the decay is described by a half–life parameter α. Half–life utility is defined as:

$$ HL = \sum_{i} \frac{\max(R_{u,i} - d_i, 0)}{2^{(i-1)/(\alpha - 1)}} \qquad (5.11) $$

where $R_{u,i}$ represents the rating of user u on item i of the ranked list, $d_i$ is the default rating for item i, and α is the half–life.
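A small sketch of the per–user computation, following Equation 5.11 with hypothetical inputs:

```r
# Minimal sketch, assuming `ratings` holds the user's ratings for the top-N list
# in ranked order, `d` the default (neutral) rating, and `alpha` the half-life.
half_life_utility <- function(ratings, d, alpha) {
  i <- seq_along(ratings)
  sum(pmax(ratings - d, 0) / 2^((i - 1) / (alpha - 1)))
}
```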

Normalised distance–based performance

Normalised distance–based performance (NDPM) was introduced in (Balabanovic and Shoham, 1997) to evaluate their collaborative filtering recommender system, named FAB. NDPM is a normalised distance (in the range [0..1]) between the user's classification of a set of documents and the system's classification of the same documents (Yao, 1995). In recommender systems, NDPM measures the difference between the user's and the system's choices. NDPM is defined as:

$$ NDPM = \frac{2\,C(-) + C(u)}{2\,C(i)} \qquad (5.12) $$

where C(−) is the number of mismatched preference relations between the system and user rankings, C(u) is the number of compatible preference relations, and C(i) is the total number of preferred relationships in the user's ranking.

A/B testing

In A/B testing, the system deploys two different versions of an algorithm (or two completely different algorithms), and sees which one performs best. The performance is measured by the impact the new algorithm has on the visitors' behaviour, compared with the baseline algorithm. A/B testing became very popular on the Web, because it is easy to create different webpage versions and show them to visitors. One of the first examples of A/B testing was Amazon.com. The evaluation is performed by only changing a few aspects between the two versions. Once a baseline is established, the system starts optimising the algorithm by making one change at a time, and evaluating the results and impact with real visitors of the page.

5.2.5 Limitations

The main limitation of system–centric evaluation is the set of items it can evaluate. System–centric evaluation cannot avoid the selection bias of the dataset. Users do not rate all the items they receive, but rather select the ones to rate. The observations a system–centric approach can evaluate are a skewed, narrow and unrepresentative population of the whole collection of items. That is, for a given user, the system–centric approach only evaluates the items the user has interacted with, neglecting the rest of the collection. The same procedure is applied for the rest of the users, and the final metrics are averaged over all the users. These metrics present some drawbacks that are intrinsic to the approach used:

• The coverage of the recommended items cannot be measured. The collection of items used in the evaluation is limited to the set of items that a user has interacted with.

• The novelty of the recommendations cannot be measured. System–centric evaluation only considers the set of items a user has interacted with. Thus, it cannot evaluate the items that are outside this set. Some of these items could be unknown, yet relevant, to the user.

• Neither the transparency (explainability) nor the trustworthiness (confidence) of the recommendations can be measured using system–centric metrics.

• The perceived quality of the recommendations cannot be measured. Usefulness and effectiveness of the recommendations are two very important aspects for the users. However, system–based metrics cannot measure user satisfaction.

Other user–related aspects that a system–centric approach cannot evaluate are the eclecticness (preference for disparate and dissimilar items) and mainstreamness (preference for popular items) of a user. To summarise, system–centric metrics evaluate how well a recommender system can predict items that are already in a user profile (assuming that the profile is split into train and test sets). However, accuracy is not correlated with the usefulness and subjective quality of the recommendations (McNee et al., 2006).

5.3 Network–centric evaluation

Network–centric evaluation measures the inherent structure of the item (or user) similarity network. The similarity network is the basis for providing the recommendations. Thus, it is important to analyse and understand the underlying topology of the similarity network. Network–centric evaluation complements the metrics proposed in the system–centric approach. It actually measures other components of the recommender system, such as the coverage or diversity of the recommendations. However, it only focuses on the collection of items, so the user stays outside the evaluation process. Figure 5.2 depicts this idea.

Complex network analysis

We propose several metrics to analyse a recommendation graph G := (V, E), where V is a set of nodes and E a set of unordered pairs of nodes, named edges. The items (or users) are nodes, and the edges denote the (weighted) similarity among them, using any recommendation algorithm. When using the item similarity graph, we focus on the algorithms that use item–based neighbour similarity. On the other hand, the user similarity graph is the basis for the algorithms that use user–based neighbour similarity. It is worth mentioning that in either case, the similarity network can be created using any recommendation method (e.g. collaborative filtering, content–based, hybrid, etc.). All the proposed metrics are derived from Complex Network and Social Network analysis.

Figure 5.2: Network–centric evaluation determines the underlying topology of the item (or user) similarity network.

5.3.1 Navigation

Average shortest path

The average shortest path (or mean geodesic length) measures the distance between two vertices i and j. They are connected if one can go from i to j following the edges in the graph. The path from i to j may not be unique. The minimum path distance (or geodesic path) is the shortest path distance from i to j, dij. The average shortest path in the network is:

$$ \langle d \rangle = \frac{1}{\frac{1}{2} n (n+1)} \sum_{i,j \in V,\; i \neq j} d_{ij} \qquad (5.13) $$

In a random graph, the average path approximates to:

$$ \langle d_r \rangle \sim \frac{\log N}{\log \langle k \rangle} \qquad (5.14) $$

where $N = |V|$, and $\langle k \rangle$ denotes the mean degree of all the nodes. The longest path in the network is called its diameter (D). In a recommender system, the average shortest path and the diameter inform us about the global navigation through the network of items.
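With the igraph R package, for instance, these quantities can be obtained directly from a similarity graph. This is a sketch; g is assumed to be an igraph object built from the item similarity lists:

```r
# Minimal sketch, assuming `g` is an igraph graph whose vertices are items and
# whose edges connect each item to its most similar items.
library(igraph)

avg_path <- mean_distance(g)   # average shortest path length, <d>
diam     <- diameter(g)        # longest geodesic in the network, D
```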

Giant component

The strong giant component, SGC, of a network is the set of vertices that are connected via one or more geodesics, and are disconnected from all other vertices. Typically, networks have one large component that contains most of the vertices. It is measured as the percentage of nodes contained in the giant component. In a recommender system, SGC informs us about the catalog coverage, that is, the total percentage of available items the recommender recommends to users (Herlocker et al., 2004).
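A minimal sketch of this coverage measure, again assuming G is the similarity graph built earlier:

import networkx as nx

def catalog_coverage(G):
    """Size of the strong giant component, as a percentage of all nodes (SGC)."""
    und = G.to_undirected()
    giant = max(nx.connected_components(und), key=len)
    return 100.0 * len(giant) / und.number_of_nodes()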

5.3.2 Connectivity

Degree distribution

The degree distribution, pk, is the number of vertices with degree k:

p_k = \sum_{v \in V \,|\, \deg(v) = k} 1,    (5.15)

where v is a vertex, and deg(v) its degree. More frequently, the cumulative degree distribution (the fraction of vertices having degree k or larger), is plotted:

P_c(k) = \sum_{k'=k}^{\infty} p_{k'}    (5.16)

A cumulative plot avoids fluctuations at the tail of the distribution and facilitates the computation of the power coefficient γ, if the network follows a power law. P_c(k) is, then, usually plotted as the complementary cumulative distribution function (ccdf). The complementary cumulative distribution function, F_c(x), is defined as:

F_c(x) = P[X > x] = 1 - F(x)    (5.17)

where F(x) is the cumulative distribution function (cdf):

F(x) = P[X \leq x]    (5.18)

F(x) can be regarded as the proportion of the population whose value is less than x.

Thus, P_c(k), derived from F_c(x), denotes the fraction of nodes with a degree greater than or equal to k.

In a directed graph, that is, when a recommender algorithm only computes the top–n most similar items, P(k_in) and P(k_out), the cumulative incoming (outgoing) degree distributions, are more informative. The complementary cumulative indegree distribution, P_c(k_in), detects whether a recommendation network has some nodes that act as hubs, that is, nodes with a large amount of attached links. This clearly affects the recommendations and the navigability of the network.

Also, the shape of the curve helps us to identify the network's topology. Regular networks have a constant distribution, "random networks" have a Poisson degree distribution (Erdős and Rényi, 1959), meaning that there are no hubs, and "scale–free networks" follow a power–law distribution in the cumulative degree distribution (Barabási and Albert, 1999), so there are a few hubs that control the network. It is worth noting that many real–world networks, including the world wide web linking structure, are known to show a right–skewed distribution (often a power law P(k) ∝ k^{-γ} with 2 < γ < 3).
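The cumulative indegree distribution is straightforward to compute from the similarity graph. The following helper is a sketch (not part of the original experiments) that returns P_c(k) for plotting on a log–log scale:

import numpy as np
import networkx as nx

def cumulative_indegree_distribution(G):
    """Complementary cumulative indegree distribution Pc(k) (equation 5.16):
    the fraction of nodes whose indegree is greater than or equal to k."""
    degrees = np.array([d for _, d in G.in_degree()])
    ks = np.arange(degrees.max() + 1)
    pc = np.array([(degrees >= k).mean() for k in ks])
    return ks, pc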

Degree–degree correlation

Another metric used is the degree correlation. It is equal to the average nearest– neighbour degree, knn, as a function of k:

k_{nn}(k) = \sum_{k'=0}^{\infty} k' \, p(k'|k),    (5.19)

where p(k'|k) is the fraction of edges that are attached to a vertex of degree k whose other ends are attached to a vertex of degree k'. Thus, k_{nn}(k) is the mean degree of the vertices we find by following a link emanating from a vertex of degree k.

A closely related concept is the degree–degree correlation coefficient, also known as assortative mixing, which is the Pearson r correlation coefficient for the degrees of the vertices at either end of a link. A monotonically increasing (decreasing) k_{nn} means that high–degree vertices are connected to other high–degree (low–degree) vertices, resulting in a positive (negative) value of r (Newman, 2002). In recommender systems, it measures to which extent nodes are connected preferentially to other nodes with similar characteristics.
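networkx exposes both quantities directly. The snippet below is a sketch under the assumption that G is a directed similarity graph; the x='in', y='in' options select the indegree–indegree correlation used throughout the experiments in chapter 6.

import networkx as nx

# G: the directed similarity graph (e.g. from build_similarity_graph above)
# indegree-indegree Pearson correlation coefficient (assortative mixing by degree)
r = nx.degree_assortativity_coefficient(G, x='in', y='in')

# per-node average nearest-neighbour degree; grouping these values by the
# degree of the source node yields the knn(k) curve of equation 5.19
knn = nx.average_neighbor_degree(G, source='in', target='in')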

Mixing patterns

We can generalise the vertex assortative mixing to any network pattern. Assortative mixing has an impact on the structural properties of the network. Mixing by a discrete characteristic of the network (e.g. race, language, or age in social networks) tends to separate the network into different communities. In social networks, this is also known as homophily. We use the formula defined in (Newman, 2003a) to compute mixing patterns for discrete attributes. Let E be an N × N matrix, where E_{ij} contains the number of edges in the network that connect a vertex of type i to one of type j (E_{ij} = E_{ji} in undirected networks). The normalised mixing matrix is defined as:

e = \frac{E}{\|E\|}    (5.20)

where ||x|| means the sum of all the elements in the matrix x. The mixing characteristics are measured in the normalised matrix e. Matrix e satisfies the following sum rules:

\sum_{ij} e_{ij} = 1,    (5.21)

\sum_{j} e_{ij} = a_i,    (5.22)

\sum_{i} e_{ij} = b_j,    (5.23)

where ai and bi are the fraction of each type of an end of an edge that is attached to nodes of type i. The assortative mixing coefficient r is defined as:

r = \frac{\sum_i e_{ii} - \sum_i a_i b_i}{1 - \sum_i a_i b_i} = \frac{\mathrm{Tr}(e) - \|e^2\|}{1 - \|e^2\|}    (5.24)

This quantity equals 0 in a randomly mixed network, and 1 in a perfectly mixed network. Disassortative networks have a negative r value, whilst assortative networks have a positive one.
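Equation 5.24 reduces to a few lines of numpy. The function below is a sketch that takes any (unnormalised) mixing matrix E, normalises it as in equation 5.20, and returns r; it relies on the fact that ||e²|| equals Σ_i a_i b_i.

import numpy as np

def assortative_mixing_coefficient(E):
    """Assortative mixing coefficient r (equation 5.24) from a mixing matrix E."""
    e = np.asarray(E, dtype=float)
    e = e / e.sum()                # normalised mixing matrix (equation 5.20)
    s = (e @ e).sum()              # ||e^2|| = sum_i a_i * b_i
    return (np.trace(e) - s) / (1.0 - s)

For mixing by a discrete node attribute stored on the graph, networkx also provides nx.attribute_mixing_matrix(G, 'genre') and nx.attribute_assortativity_coefficient(G, 'genre'), which compute e and r directly; the attribute name 'genre' is only an example.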

5.3.3 Clustering

Clustering is a fundamental facet to describe the navigation in a network.

Local clustering coefficient

The local clustering coefficient, C_i, of a node i represents the probability that its neighbours are connected to each other.

C_i = \frac{2 |E_i|}{k_i (k_i - 1)},    (5.25)

where E_i is the set of existing edges among the direct neighbours of i, and k_i the degree of i. C_i denotes, then, the portion of actual edges of i out of the potential number of total edges. ⟨C⟩ is defined as the average over the local measure C_i, ⟨C⟩ = \frac{1}{n} \sum_{i=1}^{n} C_i (Watts and Strogatz, 1998).

Global clustering coefficient

The global clustering coefficient is a sign of how cliquish (tightly knit) a network is. It estimates the conditional probability that two neighbouring vertices of a given vertex are neighbours themselves. The global clustering coefficient, C, is quantified by the abundance of triangles in a network, where a triangle is formed when three vertices are all linked to one another.

C = \frac{3 \times \text{number of triangles}}{\text{number of connected triples}}.    (5.26)

Here, a connected triple means a pair of vertices connected via another vertex. Since a triangle contains three triples, C is equal to the probability that two neighbours of a vertex are connected as well. For random graphs, the clustering coefficient is defined as C_r ∼ ⟨k⟩/N. Typically, real networks have a higher clustering coefficient than C_r. Some real–world networks are known to show a behaviour of C(k) ∝ k^{-1}, usually attributed to the hierarchical nature of the network (Ravasz and Barabási, 2003). This behaviour has been found in metabolic networks, as well as in the WWW, and movie actor networks (Ravasz et al., 2002). The reasons for modular organisation in these networks relate, respectively, to the function in metabolic interaction networks, the topology of the Internet, and the social activities in social networks.
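Both clustering measures are available in networkx. The following sketch assumes G is the (directed) similarity graph and, as done for the networks analysed in chapter 6, evaluates them on its undirected version:

import networkx as nx

und = G.to_undirected()                 # G: the similarity graph
C_avg = nx.average_clustering(und)      # <C>, average of the local Ci (equation 5.25)
C_global = nx.transitivity(und)         # 3 x triangles / connected triples (equation 5.26)
C_local = nx.clustering(und)            # local Ci per node, useful to plot C(k)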

5.3.4 Related work in music information retrieval

During the last few years, complex network analysis has been applied to music information retrieval, and music recommendation in particular. In (Cano et al., 2006), we compared different music recommendation algorithms based on the network topology. The results show that social based recommenders present a scale–free network topology, whereas human expert–based controlled networks do not. An empirical study of the evolution of a social network constructed under the influence of musical tastes, based on playlist co-occurrence, appears in (Martin-Buldú et al., 2007). The analysis of a collaboration network among contemporary musicians, in which two musicians are connected if they have performed in or produced an album together, appears in (Park et al., 2007). Anglade et al. (2007a) present a user clustering algorithm that exploits the topology of a user–based similarity network. Aucouturier and Pachet (2008) present a network of similar songs based on timbre similarity. Interestingly enough, the network is scale–free, thus a few songs appear in almost any list of similar tracks. This poses some problems when generating automatic playlists. Jacobson and Sandler (2008) present an analysis of the Myspace social network, and conclude that artists tend to form on-line communities with artists of the same musical genre. Lambiotte and Ausloos (2005) present a method of clustering genres by analysing correlations between them. The analysis is based on the users' listening habits, gathered from last.fm. From the ⟨user, artist, plays⟩ triples, the authors compute genre similarity based on the percolation idea in complex networks, and also visualise a cartography, using a tree representation.

5.3.5 Limitations

The main limitation of the network–centric approach is that users remain outside the evaluation process. There is no user intervention; not even the information of a user profile is taken into account in the evaluation. The main drawbacks of the network–centric approach are:

• Accuracy of the recommendations cannot be measured. In the network–centric approach there is no way to evaluate "how well" the algorithm is predicting the items already in a user's profile.

• Neither transparency (explainability) nor trustworthiness (confidence) of the recommendations can be measured.

• The perceived quality (i.e. usefulness and effectiveness) of the recommendations cannot be measured. The only way to solve this limitation is by letting users step into the evaluation process.

Figure 5.3: User–centric evaluation, including feedback about the received recommendations.

5.4 User–centric evaluation

User–centric evaluation aims at measuring the user's perceived quality and usefulness of the recommendations. In this case, the evaluation requires user intervention to provide feedback on the recommendations received. User–centric evaluation copes with the limitations of both the system– and network–centric approaches. Once the system gathers the feedback from the users, the next step is to analyse the results.

Figure 5.3 depicts this method, which we name user–centric evaluation plus feedback. Two important limitations of the system– and network–centric approaches are the impossibility to evaluate the novelty and the perceived quality of the recommendations. User–centric evaluation allows us to evaluate these two elements. The main difference with a system–centric approach is that the user–centric approach expands the evaluation dataset to those items that the user has not yet seen (i.e. rated, purchased, previewed, etc.).

5.4.1 Metrics

In the user–centric approach, the recommender system presents relevant items (from outside the user's dataset), and asks the user for feedback. Feedback gathering can be done in two ways: implicitly or explicitly. Implicit feedback includes, for instance, the time spent on the item's webpage, whether the item is purchased, whether it is previewed, etc. Explicit feedback is based on two related questions: (i) whether the user already knew the item (novelty), and (ii) whether she likes it or not (perceived quality). Obviously, it requires an extra effort from the users, but at the same time it provides unequivocal information about the intended dimensions (which in the case of implicit measures could be ambiguous or inaccurate).

Perceived quality

The easiest way to measure the perceived quality of the recommended items is by explicitly asking the users. Users must examine the recommended items and validate, to some extent, whether they like the items or not (Herlocker et al., 2004). In this sense, a user needs the maximum information about the item (e.g. metadata information, a preview, etc.), and, if possible, the reasons why the item was recommended. Then, the user has to rate the quality of each recommended item (e.g. on a rating scale of [1..5]), or the quality of the list as a whole. Last but not least, the user should be able to select those attributes of the item that make her feel that the novel item is relevant to her taste.

Novelty

To evaluate novel items we need, again, to ask the users whether they recognise the predicted item or not. Users have to examine the list of recommended items and express, for each item, whether they previously knew the item or not. Combining both aspects, perceived quality and novelty, allows the system to infer whether a user likes to receive and discover unknown items, or, in contrast, prefers to get more conservative and familiar recommendations. Adding transparency (explainability) to the recommendations, the user can perceive the new items as being of higher quality, as the system can give an explanation of why this unknown item was recommended to the user. All in all, the user's intentions with regard to novelty detection depend on the context and the recommendation domain. Furthermore, it is expected that the intentions change over time. For instance, a user is sometimes open to discovering new artists and songs, while

sometimes she just wants to listen to her favourites. Detecting these modes and acting accordingly would increase the user's satisfaction with the system.

5.4.2 Limitations

The main limitation of the user–centric approach is the need for user intervention in the evaluation process. Gathering feedback can be tedious for some users (filling in surveys, rating items, providing feedback, etc.). In this sense, the system should ease and minimise the user intervention, using (whenever possible) an unintrusive way. On the other hand, the main limitations of the two previous approaches (perceived quality and novelty detection) are solved in this approach.

5.5 Summary

We classify the evaluation of recommender algorithms into system–, network–, and user–centric approaches. System–centric evaluation measures how accurately the recommender system can predict the actual values that users have previously assigned. Network–centric evaluation aims at measuring the topology of the item (or user) similarity network, and it uses metrics from complex network analysis. Finally, user–centric evaluation focuses on the user's perceived quality and usefulness of the recommendations. Combining the three methods we can cover all the facets of a recommender algorithm; the system–centric approach evaluates the performance accuracy of the algorithm, the network–centric approach analyses the structure of the similarity network, and with the inclusion of user intervention we can measure the users' satisfaction with the recommendations they receive. Figure 5.4 depicts this idea. We can see that, when using the three evaluation approaches, all the components are evaluated: algorithm accuracy, similarity network analysis, and feedback from users. Last but not least, Table 5.2 summarises the limitations of each approach. The table presents some of the factors that affect the recommendations, and whether each approach can evaluate them or not. Applying the three evaluation approaches, we can assess all the facets of a recommender system, and also cope with the limitations of each evaluation approach.

                    Accuracy   Coverage   Novelty   Diversity   Transp.   Quality
System–centric
Network–centric
User–centric

Table 5.2: A summary of the evaluation methods’ limitations. It shows the factors that affect the recommendations, and whether the approach can evaluate it or not.

Figure 5.4: System–, network–, and user–centric evaluation methods. Combining the three methods we can cover all the facets of a recommender algorithm.

Links with the following chapters

In this chapter we have presented three methods to evaluate recommender algorithms. In the following two chapters we apply the metrics to real recommendation datasets. The network–centric evaluation is presented in chapter 6. Then, the user–centric evaluation is presented in chapter 7.

Chapter 6

Network–centric evaluation

In this chapter we present the network–centric evaluation approach. This method analyses the similarity network, created using any recommendation algorithm. Network–centric evaluation uses complex network analysis to characterise the item collection. Also, we can combine the results from the network analysis with the popularity of the items, using the Long Tail model. We perform several experiments in the music recommendation field. The first experiment aims at evaluating the popularity effect using three music artist recommendation approaches: collaborative filtering (CF), content–based audio similarity (CB), and human expert–based resemblance. The second experiment compares two user networks created by CF and CB, derived from the users' listening habits. In all the experiments, we measure the popularity effect by contrasting the properties of the network with the Long Tail information (e.g. are the hubs in the recommendation network the most popular items? Or, are the most popular items connected with other popular items?).

6.1 Network analysis and the Long Tail model

Figure 6.1 presents the framework for the network–centric evaluation. It includes the similarity network and the Long Tail of item popularity. This approach combines the analysis of the similarity network with the Long Tail of popularity. Once each item in the recommendation network is located in the head, mid, or tail part (see section 4.3.2), the next step is to combine the similarity network with the Long Tail information.


Figure 6.1: General framework for the network–centric evaluation. The network–centric approach determines the underlying topology of the similarity network, and combines this information with the Long Tail of popularity.

Two main analyses are performed. First, we measure the similarity among the items in each part of the curve. That is, for each item that belongs to the head part, we compute the percentage of similar items that are located in the head, mid and tail parts (and similarly for the items in the mid and tail parts). This measures whether the most popular items are connected with other popular items, and vice versa. Second, we measure the correlation between an item's rank in the Long Tail and its indegree. This measure allows us to detect whether the hubs in the network are also the most popular items. Section 6.2 presents the experiments about the popularity effect in three different music

artist recommendation algorithms: collaborative filtering (CF) from last.fm, content–based audio filtering (CB), and expert–based recommendations (EX) from Allmusic.com (AMG) musicologists. Then, section 6.3 compares two user similarity networks: one created using collaborative filtering (CF), again from last.fm, and another derived from the users' listening habits, where we use content–based audio similarity (CB) to create the links among users.

6.2 Artist network analysis

We aim to evaluate three artist similarity networks: collaborative filtering (CF), content– based audio similarity (CB), and human expert–based resemblance. Also, we analyse the popularity effect for each recommendation network. We measure the popularity effect by contrasting the properties from the network with the Long Tail information of the catalog.

6.2.1 Datasets

Social–based, collaborative filtering network

Artist similarity is gathered from last.fm, using the Audioscrobbler web services¹, and selecting the top–20 similar artists. Last.fm has a strong social component, and their recommendations are based on a combination of item–based collaborative filtering and the information derived from social tagging. We denote this network as CF.

Human expert–based network

We have gathered human–based expert recommendations from the All Music Guide (AMG)². AMG makes use of professional editors to interconnect artists according to several aspects, such as: influenced by, followers of, similar artists, performed songs by, etc. In order to create a homogeneous network, we only use the similar artists links. We denote this network as EX. Table 6.1 shows the number of nodes and edges for each network.

¹http://www.audioscrobbler.net/data/webservices/
²http://www.allmusic.com

                                 Number of artists   Number of relations
Last.fm social filtering (CF)    122,801             1,735,179
Allmusic.com expert–based (EX)   74,494              407,483
Content–based (CB)               59,583              1,179,743

Table 6.1: Datasets for the artist similarity networks.

Content–based network

To compute artist similarity in the CB network, we apply content–based audio analysis to an in–house music collection (T) of 1.3 million tracks of 30 second samples. Our audio analysis considers not only timbral features (e.g. Mel frequency cepstral coefficients), but also some musical descriptors related to rhythm and tonality, among others (Cano et al., 2005). Then, to compute artist similarity we use the most representative tracks, T_a, of an artist a, with a maximum of 100 tracks per artist. For each track, t_i ∈ T_a, we obtain the most similar tracks (excluding those from artist a):

sim(t_i) = \underset{\forall t \in T}{\operatorname{argmin}} \, (distance(t_i, t)),    (6.1)

and get the artists' names, A_{sim(t_i)}, of the similar tracks. The list of (top–20) similar artists of a is composed by all A_{sim(t_i)}, ranked by frequency and weighted by the audio similarity distance:

similar\_artists(a) = \bigcup_{\forall t_i \in T_a} A_{sim(t_i)}    (6.2)

6.2.2 Network analysis

Small world navigation

Table 6.2 shows the network properties of the three datasets. All the networks exhibit the small–world phenomenon (Watts and Strogatz, 1998). Each network has a small average directed shortest path ⟨d_d⟩, comparable to that of its respective random network. Also, all the clustering coefficients, C, are significantly higher than those of the equivalent random networks, C_r. This is an important property, because recommender systems can be structurally optimised to allow surfing to any part of a music collection with a few mouse clicks, so that they are easy to navigate using only local information (Kleinberg, 2000; Newman, 2003b).

Property          CF (Last.fm)     EX (AMG)          CB
N                 122,801          74,494            59,583
⟨k⟩               14.13            5.47              19.80
⟨d_d⟩ (⟨d_r⟩)     5.64 (4.42)      5.92 (6.60)       4.48 (4.30)
D                 10               9                 7
SGC               99.53%           95.80%            99.97%
γ_in              2.31 (±0.22)     NA (log–normal)   1.61 (±0.07)
r                 0.92             0.14              0.17
C (C_r)           0.230 (0.0001)   0.027 (0.00007)   0.025 (0.0002)

Table 6.2: Artist recommendation network properties for last.fm collaborative filtering (CF), content–based audio filtering (CB), and Allmusic.com (AMG) expert–based (EX) networks. N is the number of nodes, ⟨k⟩ the mean degree, ⟨d_d⟩ is the avg. shortest directed path, and ⟨d_r⟩ the equivalent for a random network of size N, D is the diameter of the (undirected) network. SGC is the size (percentage of nodes) of the strong giant component for the undirected network, γ_in is the power–law exponent of the cumulative indegree distribution, r is the indegree–indegree Pearson correlation coefficient (assortative mixing), C is the clustering coefficient for the undirected network, and C_r for the equivalent random network.

The human–expert network has a giant component, SGC, smaller than CF and CB networks. More than 4% of the artists in the human–expert network are isolated, and cannot be reached from the rest. This has strong consequences concerning the coverage of the recommendations and network navigation.

Clustering coefficient

The clustering coefficient for the CF network is significantly higher than that of the CB or

EX networks (CCF = 0.230). This means, given an artist a, the neighbours of a are also connected with each other with a probability of 0.230. For instance, U2 ’s list of similar artists includes INXS and Crowded House, and these two bands are also connected, forming a triangle with U2. This has an impact on the navigation of the network, as one might get stuck in a small cluster.

Indegree distribution

The shape of the (complementary) cumulative indegree distribution informs us about the topology of the recommendation network (random, or scale-free). We follow the steps 142 CHAPTER 6. NETWORK–CENTRIC EVALUATION

defined in section 4.4 to decide whether or not the indegree distribution follows a power–law (and, thus, whether it is a scale-free network).

Figure 6.2: Cumulative indegree distribution for the three artist networks.

          power–law   power–law + cut–off               log–normal         support for power–law
          p           LLR        p      x_cutoff        LLR       p
CF        0.9         -165.48    0.00   ≈ 102           -25.15    0.00     with exp. decay cut–off
Expert    0.43        -41.05     0.00   ≈ 66            -5.86     0.00     moderate, with cut–off
CB        0.12        -905.96    0.00   ≈ 326           -99.68    0.00     moderate, with cut–off

Table 6.3: Model selection for the indegree distribution of the three artist networks. For each network we give a p–value for the fit to the power-law model (first column). The first p–value equals the Kolmogorov–Smirnov D statistic (see equation 4.8). We also present the likelihood ratios for the alternative distributions (power–law with an exponential cut–off, and log–normal), and the p–values for the significance of each of the likelihood ratio tests (LLR).

Table 6.3 presents the model selection for the indegree distribution. For each network we give a p–value for the fit to the power-law model (first column). A higher p–value means that the distribution is likely to follow a power–law. In Table 6.3, we also present the likelihood ratios for the alternative distributions (power–law with an exponential cut–off, and log–normal), and the p–values for the significance of each of the likelihood ratio tests. In this case, a p–value close to zero means that the alternative distribution can also fit the distribution. In all the three networks, the distribution can be fitted using either a power–law with an exponential decay, or a log–normal. For the log–normal, non-nested 6.2. ARTIST NETWORK ANALYSIS 143

alternative, we give the normalised log likelihood ratio R/(\sqrt{n}\,\sigma), as in Clauset et al. (2007). For the power law with an exponential cut–off, a nested distribution, we give the actual log likelihood ratio. The final column of the table lists our judgement of the statistical support for the power-law hypothesis for each artist network. The best fit for the CF network (according to the log–likelihood³) is obtained with a power–law with an exponential decay (starting at x_cutoff ≈ 102), x^{-2.31} e^{-7x}. In the expert–based network, the best fit (with a log–likelihood of 581.67) is obtained with a log–normal distribution, \frac{1}{x} e^{-\frac{(\ln(x) - \mu)^2}{2\sigma^2}}, with parameters mean of log µ = 7.36, and standard deviation of log, σ = 3.58. Finally, the CB network follows a moderate power–law with an exponential decay, x^{-1.61} e^{-7.19x} (x_cutoff ≈ 326). Yet, in this case the log–normal can be considered as good as the power–law distribution with cut–off. Figure 6.2 shows the cumulative indegree distribution for each network. EX follows a log–normal distribution, whereas CF and CB follow a power law with an exponential decay (cut–off). CF has a power–law exponent, γ = 2.31, similar to those detected in many scale free networks, including the world wide web linking structure (Barabási et al., 2000). These networks are known to show a right–skewed power law distribution, P(k) ∝ k^{-γ} with 2 < γ < 3, relying on a small subset of hubs that control the network (Barabási and Albert, 1999).
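The model selection in Table 6.3 follows the procedure of Clauset et al. described in section 4.4. As a rough illustration only (not the code used for these experiments), the Python powerlaw package automates a comparable comparison; indegrees is assumed to be the list of node indegrees of a similarity network.

import powerlaw

def indegree_model_selection(indegrees):
    """Fit a discrete power law and compare it against the alternatives."""
    fit = powerlaw.Fit(indegrees, discrete=True)
    print("alpha =", fit.power_law.alpha, "xmin =", fit.power_law.xmin)
    # log-likelihood ratio tests against the alternative distributions
    print(fit.distribution_compare('power_law', 'lognormal'))
    print(fit.distribution_compare('power_law', 'truncated_power_law'))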

Assortative mixing

Another difference in the three networks is the assortative mixing, or indegree–indegree correlation. Figure 6.3 shows the correlation for each network. The CF network presents a high assortative mixing (r = 0.92). That means that the most connected artists are prone to be similar to other top connected artists. Neither CB nor EX present indegree–indegree correlation, thus artists are connected independently of their inherent properties.

Mixing by genre

We are also interested in the assortative mixing of the network, according to the musical genre. E.g. do similar artists tend to belong to the same genre? To do this, we gather the artists’ tags from last.fm, and filter those tags that do not refer to a genre. To match the tags with a predefined list of 13 seed genres, we follow the approach presented in (Sordo et al.,

³Not to be confused with the log–likelihood ratio (LLR), that we use to compare two distributions.

Figure 6.3: Indegree–indegree correlation (assortative mixing) for the three artist recommendation networks: collaborative filtering (CF) from last.fm, content–based (CB), and Allmusic.com experts. CF clearly presents the assortative mixing phenomenon (r_CF = 0.92). Neither CB nor expert–based present any correlation (r_CB = 0.14, r_Expert = 0.17).

2008). Listing 6.1 shows a snippet of the last.fm normalised tags for Bruce Springsteen (tag weights range over [1..100]):

Bruce Springsteen   classic rock   100
Bruce Springsteen   rock            95
Bruce Springsteen   pop             80
Bruce Springsteen   80s             72
Bruce Springsteen   classic         50
Bruce Springsteen   folk-rock       25
...

Listing 6.1: Snippet of Last.fm tags for Bruce Springsteen.

Table 6.4 shows the result after applying our algorithm to match the genres from the list of weighted tags (Sordo et al., 2008). We can see that the tag 80s is filtered out, and the classic rock and rock tags are merged into the Rock genre (the weight is the sum of the two tags' weights). Once we get the matched genres for all the artists, we can analyse whether similar artists tend to belong to the same (or a semantically close) genre. The mixing by genre correlation coefficient r is computed using equation 5.24, over the normalised correlation matrix e (see equation 5.20).

Tag                  Matched genre   Weight
classic rock, rock   Rock            195
pop                  Pop             80
classic              Classical       50
folk-rock            Folk            25

Table 6.4: Assigned genres for Bruce Springsteen from the artist's tag cloud presented in Listing 6.1.

We create the correlation matrix e for the three networks following three steps:

1. For each artist a_i, get the list of weighted genres G_{a_i}, as well as the list of genres from the similar artists of a_i, G_{sim(a_i)}.

2. Create the correlation matrix E. For each genre g_{a_i} ∈ G_{a_i}, and g_j ∈ G_{Sim(a_i)}, increment E_{g_{a_i}, g_j} combining the artist similarity value, similarity(a_i, a_j), for artists a_j ∈ Sim(a_i), with the sum of the two genres' weights:

E_{g_{a_i}, g_j} = E_{g_{a_i}, g_j} + similarity(a_i, a_j) \cdot (g_{a_i} + g_j)

3. Create the normalised correlation matrix e from E, using equation 5.20, and normalising it such that \sum_{ij} e_{ij} = 100.
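A minimal sketch of this construction follows. The data structures (genres_of, similar_to, similarity, genre_index) are hypothetical placeholders for the tag-to-genre mapping, the similar-artist lists, the artist similarity scores, and the genre-to-index map used in the experiments; they are not part of the original implementation.

import numpy as np

def genre_mixing_matrix(artists, genres_of, similar_to, similarity, genre_index):
    """Steps 1-3 above: accumulate E over linked artist pairs, then normalise to e."""
    n = len(genre_index)
    E = np.zeros((n, n))
    for a in artists:
        for g_a, w_a in genres_of[a]:              # weighted genres of artist a
            for b in similar_to[a]:                # similar artists of a
                for g_b, w_b in genres_of[b]:
                    E[genre_index[g_a], genre_index[g_b]] += similarity[a, b] * (w_a + w_b)
    return 100.0 * E / E.sum()                     # normalised matrix e (sums to 100)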

Tables 6.5, 6.6, and 6.7 present the matrices e for the CF, EX and CB networks, respectively. Then, Table 6.8 shows the r assortative mixing coefficient for each network, computed over e (using equation 5.20). The highest r coefficient is found in the human expert network, r_EX = 0.411. According to human experts, then, artist genre is a relevant factor to determine artist similarity. As expected, the content–based network does not present mixing by genre (r_CB = 0.089). Our results are aligned with the findings of Jacobson and Sandler (2008). They use the Myspace.com network of artists' friends, and set only one genre label per artist. The mixing by genre coefficient value they obtain is r = 0.350. Therefore, Myspace artists prefer to maintain friendship links with other artists in the same genre. In our three artist networks, metal, pop, punk and rock genres accumulate more than

50% of the fraction of links (see a_i, the last column of the tables). So, the three networks are biased towards these few genres, which have a big impact on the similarity network. This bias concords with the type of users in the last.fm community, and the tags they apply the most.

            blues  classic  ctry   elec   folk   jazz   metal  pop    punk   rock    rap    regg   soul   a_i
blues       1.09   0.01     0.27   0.05   0.11   0.18   0.12   0.36   0.08   0.35    0.02   0.02   0.07   2.74
classical   0.01   0.07     0.01   0.06   0.02   0.04   0.08   0.15   0.07   0.15    0.03   0.00   0.01   0.71
country     0.47   0.02     2.31   0.08   0.22   0.12   0.06   0.36   0.10   0.37    0.04   0.02   0.04   4.22
electronic  0.03   0.03     0.03   4.17   0.07   0.13   0.48   1.27   0.52   1.14    0.3    0.06   0.05   8.28
folk        0.07   0.01     0.11   0.08   0.59   0.04   0.10   0.29   0.08   0.33    0.02   0.01   0.01   1.73
jazz        0.19   0.03     0.11   0.29   0.07   1.30   0.20   0.46   0.20   0.44    0.11   0.04   0.10   3.53
metal       0.09   0.05     0.02   0.54   0.10   0.12   8.74   1.81   1.20   2.95    0.20   0.04   0.01   15.88
pop         0.23   0.07     0.13   1.15   0.26   0.18   1.54   7.28   1.46   3.46    0.31   0.06   0.06   16.2
punk        0.06   0.05     0.04   0.57   0.09   0.12   1.37   1.85   5.29   1.80    0.24   0.06   0.03   11.58
rock        0.34   0.09     0.26   1.83   0.45   0.34   3.33   4.71   2.23   12.06   0.52   0.16   0.20   26.52
rap         0.02   0.01     0.01   0.40   0.02   0.08   0.22   0.45   0.26   0.42    2.50   0.04   0.04   4.46
reggae      0.02   0.01     0.02   0.14   0.02   0.07   0.08   0.26   0.16   0.25    0.09   2.23   0.04   3.38
soul        0.04   0.00     0.02   0.05   0.01   0.05   0.01   0.09   0.04   0.12    0.04   0.01   0.28   0.76
b_j         2.66   0.44     3.34   9.42   2.03   2.77   16.34  19.33  11.69  23.86   4.42   2.76   0.93   100

Table 6.5: Normalised mixing matrix eCF for the last.fm network.

            blues  classic  ctry   elec   folk   jazz    metal  pop    punk   rock    rap    regg   soul   a_i
blues       2.75   0.06     0.60   0.03   0.18   0.67    0.18   0.40   0.08   0.80    0.01   0.02   0.09   5.88
classical   0.03   0.20     0.03   0.05   0.03   0.21    0.06   0.12   0.06   0.35    0.02   0.01   0.01   1.18
country     0.84   0.05     6.07   0.05   0.45   0.41    0.04   0.32   0.05   0.74    0.02   0.02   0.04   9.09
electronic  0.04   0.07     0.05   1.66   0.05   0.16    0.18   0.41   0.17   0.74    0.15   0.03   0.03   3.75
folk        0.15   0.03     0.31   0.05   0.99   0.09    0.05   0.20   0.04   0.52    0.01   0.01   0.01   2.46
jazz        0.70   0.33     0.28   0.18   0.10   11.71   0.10   0.27   0.10   1.14    0.08   0.05   0.12   15.17
metal       0.18   0.09     0.04   0.19   0.08   0.09    4.17   1.28   0.63   2.84    0.12   0.04   0.02   9.78
pop         0.33   0.13     0.19   0.38   0.22   0.19    1.06   3.48   0.56   3.39    0.17   0.05   0.05   10.21
punk        0.07   0.07     0.05   0.22   0.05   0.10    0.68   0.87   1.74   1.61    0.12   0.05   0.03   5.66
rock        0.79   0.44     0.72   0.60   0.48   1.34    2.10   2.71   0.88   20.35   0.24   0.18   0.19   31.01
rap         0.01   0.02     0.02   0.16   0.01   0.06    0.09   0.18   0.09   0.30    1.32   0.02   0.04   2.33
reggae      0.03   0.01     0.02   0.06   0.02   0.06    0.05   0.13   0.07   0.25    0.03   2.07   0.03   2.82
soul        0.06   0.01     0.02   0.03   0.01   0.08    0.02   0.07   0.02   0.22    0.03   0.01   0.07   0.66
b_j         5.98   1.53     8.41   3.65   2.66   15.18   8.78   10.46  4.49   33.24   2.32   2.59   0.72   100

Table 6.6: Normalised mixing matrix eEX for the AMG human–expert network.

            blues  classic  ctry    elec   folk   jazz   metal  pop    punk   rock    rap    regg   soul   a_i
blues       0.68   0.10     1.33    0.11   0.28   0.57   0.17   0.66   0.15   0.92    0.09   0.04   0.06   5.18
classical   0.07   0.03     0.18    0.03   0.04   0.06   0.15   0.25   0.10   0.39    0.01   0.01   0.01   1.32
country     1.70   0.26     6.03    0.27   0.89   1.05   0.49   2.35   0.47   2.38    0.30   0.12   0.25   16.56
electronic  0.11   0.04     0.28    0.12   0.08   0.10   0.27   0.48   0.24   0.71    0.05   0.05   0.01   2.55
folk        0.20   0.04     0.65    0.07   0.23   0.16   0.07   0.27   0.08   0.42    0.02   0.02   0.02   2.25
jazz        0.54   0.09     0.90    0.12   0.23   0.84   0.13   0.51   0.12   0.65    0.11   0.04   0.05   4.32
metal       0.17   0.16     0.5     0.27   0.09   0.11   2.44   2.26   1.85   4.06    0.07   0.15   0.02   12.16
pop         0.56   0.24     1.90    0.47   0.38   0.41   2.04   3.40   1.58   5.41    0.14   0.19   0.06   16.77
punk        0.19   0.16     0.58    0.30   0.12   0.15   2.06   2.63   2.49   4.02    0.10   0.16   0.02   12.98
rock        0.6    0.31     1.52    0.63   0.45   0.38   3.45   4.43   2.25   7.06    0.09   0.23   0.05   21.46
reggae      0.16   0.04     0.41    0.06   0.05   0.18   0.10   0.37   0.12   0.43    0.50   0.06   0.06   2.52
rap         0.03   0.02     0.10    0.05   0.02   0.03   0.15   0.24   0.11   0.40    0.06   0.10   0.01   1.32
soul        0.05   0.01     0.17    0.01   0.03   0.05   0.02   0.08   0.02   0.14    0.02   0.01   0.01   0.61
b_j         5.05   1.49     14.54   2.52   2.90   4.10   11.55  17.93  9.57   27.00   1.55   1.18   0.63   100

Table 6.7: Normalised mixing matrix eCB for the audio content–based network. 6.2. ARTIST NETWORK ANALYSIS 147

Network   Mixing coeff. r
CF        0.343
EX        0.411
CB        0.089

Table 6.8: Assortative mixing by genre coefficient r for the three networks.

The EX and CB networks have more country artists than the CF network. Also, in the expert network there are a lot of jazz artists. Additionally, in the three networks there is an underrepresentation of classical, folk and soul artists. The reality is that a recommender system has to deal with biased collections, and make the best out of them. In terms of genre cohesion, classical is always "misclassified" as pop/rock. In our case, the problem with the classical genre is that some non–classical music artists are tagged as classic. Our algorithm matches this tag with the seed genre Classical (see the Bruce Springsteen example in Table 6.4). Actually, if we remove the classical genre from the list of 13 genres, the r correlation coefficient increases by 0.1 in the CF and EX networks. In the audio CB network, the country and rock genres dominate over the rest. Country subsumes the blues, jazz and soul genres. For instance, folk artists share a high fraction of links with country artists (e^{CB}_{folk,country} = 0.65, compared with e^{CB}_{folk,folk} = 0.23), yet e^{CB}_{folk,rock} also presents a high correlation. This finding is aligned with our previous research presented in (Sordo et al., 2008), where we conclude that the folk and country genres are similar, using content–based audio similarity. Similarly, the same phenomenon happens for e^{CB}_{blues,country} and e^{CB}_{jazz,country}, although in the latter case the similarity between the two genres is more arguable. Actually, in the CB network the bias towards the rock and country genres is more prominent than in the two other networks. Artist similarity is derived from audio track similarity, thus preponderant genres have more chances to receive links from other artists' genres. This is the reason why artists from infrequent genres correlate and "collapse" with the most prevalent ones (see Table 6.7). Contrastingly, in the experts' network, country, jazz and soul artists present a high intra–correlation value (a high fraction of vertices linking artists of the same genre, e^{EX}_{i,i}). For instance, e^{EX}_{jazz,jazz} = 11.71, and the sum of the row (last column), a_{jazz}, is 15.17. So, given a jazz artist, 77% of his similar artists are also jazz musicians (e^{EX}_{jazz,jazz} / a^{EX}_{jazz} = 0.77). Similar values are found for country and soul artists.

Neither in the CF nor in the CB networks can we find these high intra–correlation values (only for the reggae genre in the CF network, with e^{CF}_{reggae,reggae} / a^{CF}_{reggae} = 0.66). At this point, we conclude the analysis of the similar artists' networks. Now, the following section presents the main findings about the correlation between artist popularity and their prominence in the similarity network.

6.2.3 Popularity analysis

We have outlined in the previous section the main topological differences among the three networks. We add now the popularity factor (measured with the total playcounts per artist), by combining artists’ rank in the Long Tail with the results from the network analysis. Two experiments are performed. The former reports the relationships among popular and unknown artists. The latter experiment aims at analysing the correlation between artists’ indegree in the network and their popularity.

Artist similarity

Figure 6.4 depicts the correlation between an artist's total playcounts and the total playcounts of its similar artists. That is, given the total playcounts of an artist (x axis), it shows, on the vertical axis, the average playcounts of its similar artists. The CF network has a clear correlation

(rCF = 0.503); the higher the playcounts of a given artist, the higher the avg. playcounts of its similar artists. The AMG human expert network presents a moderate correlation (rEX = 0.259). Thus, in some cases artists are linked according to their popularity. CB network does not present correlation (rCB = 0.08). In this case, artists are linked independently of their popularity. Table 6.9 presents artist similarity divided into the three sections of the Long Tail curve.

Given an artist, a_i, it shows (in %) the Long Tail location of its similar artists (results are averaged over all artists). In the CF network, given a very popular artist, the probability of reaching (in one click) a similar artist in the tail is zero. Actually, half of the similar artists are located in the head part (which contains only 82 artists), and the rest are in the mid area. Artists in the mid part are tightly related (71.75%), and only 1/5 of the similar artists are in the tail part. Finally, given an artist in the tail, its similar artists remain in the same area. Contrastingly, the CB and EX networks promote the mid and tail parts much more in all the cases (especially in the head part).

Figure 6.4: A log–log plot depicting the correlation between an artist's total playcounts and its similar artists' playcounts (average values are shown in black, whilst grey dots display all the values). Pearson correlation coefficient r values are: r_CF = 0.503, r_EX = 0.259 and r_CB = 0.081.

a_i → a_j           Head     Mid      Tail
CF        Head      45.32%   54.68%   0%
          Mid       5.43%    71.75%   22.82%
          Tail      0.24%    17.16%   82.60%
Expert    Head      5.82%    60.92%   33.26%
          Mid       3.45%    61.63%   34.92%
          Tail      1.62%    44.83%   53.55%
CB        Head      6.46%    64.74%   28.80%
          Mid       4.16%    59.60%   36.24%
          Tail      2.83%    47.80%   49.37%

Table 6.9: Artist similarity and their location in the Long Tail. Given an artist, ai, it shows (in %) the Long Tail location of its similar artists (results are averaged over all artists). Each row represents, also, the Markov chain transition matrix for CF, CB, and expert–based methods.

Figure 6.5: Example of the Markov decision process to navigate along the Long Tail in the CF network. This information is directly derived from Table 6.9.

Similarly to the mixing by genre, where we compute the correlation among the genres of linked artists, we can do the same for artist popularity. In fact, Table 6.9 directly provides us this information. For instance, given an artist in the Head part, Table 6.9 shows the fraction of edges that are attached to the artist whose other ends are attached to artists of type Head, Mid or Tail. The mixing by popularity correlation coefficients are: r_CF = 0.397, r_EX = -0.002, and r_CB = -0.032. Again, the correlation values show that the CF network presents assortative mixing by popularity, whilst neither EX nor CB does.

          k    P^(k), with P^(0) = (1_H, 0_M, 0_T)   π                              n
CF        5    (0.075_H, 0.512_M, 0.413_T)           (0.044_H, 0.414_M, 0.542_T)    26
Expert    2    (0.030_H, 0.560_M, 0.410_T)           (0.027_H, 0.544_M, 0.429_T)    8
CB        2    (0.038_H, 0.562_M, 0.400_T)           (0.037_H, 0.550_M, 0.413_T)    7

Table 6.10: Navigation along the Long Tail of artists in terms of a Markovian stochastic process. The second and third columns depict the number of clicks (k) needed to reach the tail from the head part with a probability p_{head,tail} ≥ 0.4. The fourth and fifth columns show the stationary distribution π, as well as the number of steps, n, to reach π (with an error ≤ 10^{-6}).

From head to tail

To simulate a user surfing the recommendation network, we apply a Markovian stochastic process (Meyn and Tweedie, 1993). Indeed, each row in Table 6.9 can be seen as a Markov chain transition matrix, M, where the head, mid and tail parts are the different states. For example, Figure 6.5 shows the Markov chain for the CF network. The values of matrix M denote the transition probabilities, p_{i,j}, between two states i and j (e.g. p^{CF}_{head,mid} = 0.5468). The Markovian transition matrix, M^k, denotes the probability of going from any state to another state in k steps (clicks). The initial distribution vector, P^{(0)}, sets the probabilities of being at a determined state at the beginning of the process. Then, P^{(k)} = P^{(0)} × M^k denotes the probability distribution after k clicks, starting in the state defined by P^{(0)}.

Using P^{(k)} and defining P^{(0)} = (1_H, 0_M, 0_T), we can get the probability of reaching any state, starting in the head part. Table 6.10 shows the number of clicks needed to reach the tail from the head, with a probability p_{head,tail} ≥ 0.4. In CF, one needs five clicks to reach the tail, whereas in CB and expert–based only two clicks are needed.

Finally, the stationary distribution π is a fixed point (row) vector whose entries sum to 1, and that satisfies π = πM. The last two columns in Table 6.10 present the stationary distribution vector for each algorithm, and the number of steps to converge to π, with an error ≤ 10^{-6}. The CF transition matrix needs more than three times the number of steps of CB or EX to reach the steady state, due to the transition p^{CF}_{head,tail} = 0. Furthermore, even though the probability to stay in the tail in CF is higher than in CB or EX, this is due to the high probability of remaining in the tail once it is reached (p^{CF}_{tail,tail} = 0.8260).
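The whole simulation fits in a few lines of numpy. The sketch below uses the CF transition matrix taken from Table 6.9 and illustrates how the figures in Table 6.10 (clicks needed until p_tail ≥ 0.4, and the stationary distribution with an error ≤ 10^{-6}) can be obtained; it is illustrative rather than the exact code of the experiments.

import numpy as np

# CF transition matrix over the (head, mid, tail) states (rows of Table 6.9)
M = np.array([[0.4532, 0.5468, 0.0000],
              [0.0543, 0.7175, 0.2282],
              [0.0024, 0.1716, 0.8260]])

p = np.array([1.0, 0.0, 0.0])           # P(0): start in the head part
clicks = 0
while p[2] < 0.4:                       # clicks needed until p_tail >= 0.4
    p = p @ M
    clicks += 1

pi, steps = np.array([1.0, 0.0, 0.0]), 0
while True:                             # iterate until convergence to pi = pi M
    nxt = pi @ M
    steps += 1
    if np.abs(nxt - pi).max() <= 1e-6:
        break
    pi = nxt

print(clicks, nxt, steps)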

CF
k_in     Artist                  Long Tail rank
976      Donald Byrd             6,362
791      Little Milton           19,190
772      Rufus Thomas            14,007
755      Mccoy Tyner             7,700
755      Joe Henderson           8,769
744      R.E.M.                  88
738      Wayne Shorter           4,576
717      U2                      35
712      Horace Silver           5,751
709      Freddie Hubbard         7,579

Expert
k_in     Artist                  Long Tail rank
180      R.E.M.                  88
157      Radiohead               2
137      The Beatles             1
119      David Bowie             62
117      Nirvana                 19
111      Tool                    17
111      Pavement                245
109      Foo Fighters            45
104      Soundgarden             385
103      Weezer                  51

CB
k_in     Artist                  Long Tail rank
1,955                            2,632
1,820    Neil Diamond            1,974
1,771    Chris Ledoux            13,803
1,646                            1,624
1,547    Cat Stevens             623
1,514    Peter Frampton          4,411
1,504    Steely Dan              1,073
1,495    Lynyrd Skynyrd          668
1,461    Toby Keith              2,153
1,451    Charlie Daniels Band    22,201

Table 6.11: Top–10 artists with the highest indegree (k_in) for each recommendation network. The table also shows the artist's ranking in the Long Tail.

Artist indegree

Up to now, we have analysed popularity in terms of the relationships among the artists. Now, we analyse the correlation between the artists' indegree in the network and their popularity. As a starting point, we present in Table 6.11 the top–10 artists with the highest indegrees for each network. CF and expert–based contain two and eight mainstream artists, respectively. CF contains U2 and R.E.M., but the rest of the list contains more or less well known jazz musicians, including some in the top of the tail area. The whole list for the expert–based AMG network is made up of very popular artists. Our guess is that the editors connect long tail artists with the most popular ones, because these popular artists are considered influential and many bands are considered followers of these mainstream artists. The CB network has a more eclectic top–10 list, as one would expect. Oddly enough, there are no new or current artists, but some classic bands and artists ranging over several musical genres. Some bands are, in fact, quite representative of a genre (e.g. Lynyrd Skynyrd and The Charlie Daniels Band for Southern–rock, The Carpenters for Pop in the 70's, George Strait for Country, and Cat Stevens for Folk/Rock). Probably, their high indegree is due to being very influential in their respective musical styles. In some sense, there are other bands that "cite" or imitate their sound. The results could be somewhat biased, though; our sampled CF and expert networks are subsets of the whole last.fm and AMG similar artist networks, thus our sampling might not be a good representation of the whole dataset. Furthermore, the differences in the maximum indegree value (k_in for the top–1 artist) among the three networks are due to the different sizes (N) and average degrees ⟨k⟩ of the networks (5.47_EX versus 14.13_CF and 19.80_CB), but also due to the topology of the networks. CF and CB follow a power–law cumulative indegree distribution, whereas EX best fits a log–normal distribution. Therefore the maximum indegree k_in for EX is much smaller than that of CF or CB. To conclude this analysis, Figure 6.6 shows the correlation between artists' indegree

(k_in) and artists' popularity, using the artist's total playcounts. The figure shows whether the artists with the highest indegree in the network (hubs) are the most popular artists. Again, we can see that in the CF and expert–based networks the artists with higher indegree (hubs) are mostly located in the head and mid parts, whereas in CB they are more spread out along the whole curve. Both CF and expert–based networks confirm the expectations, as there is a clear correlation between the artist indegree and total playcounts (r_CF = 0.621, and r_EX = 0.475). Artists with high indegree are the most popular ones. In CB, given

Figure 6.6: A log–log plot showing the correlation between artist indegree (k_in, on the horizontal axis) and its total playcounts (avg. values in black), on the vertical axis. Pearson r values are: r_CF = 0.621, r_EX = 0.475, and r_CB = 0.032.

a high indegree value, we find, on average, artists spanning different levels of popularity

(rCB = 0.032).

6.2.4 Discussion

The results show that the last.fm social–based recommender tends to reinforce popular artists, at the expense of discarding less–known music. Thus, the popularity effect derived from the community of users has consequences on the recommendation network. This reveals a somewhat poor discovery ratio when just browsing through the network of similar music artists. It is not easy to reach relevant long tail artists starting from the head or mid parts (see Table 6.10). This could be related to the existence of positive feedback loops in social–based recommenders. The first users that enter the system heavily affect the initial relationships among the items. After that, the users that come later find an environment shaped by the earlier users. These new users will be affected by the early raters that created the similarities among the items. Thus, positive feedback also affects the navigation through the Long Tail. Given a long tail artist, its similar artists are all located in the tail area as well. This does not always guarantee novel music recommendations; a user that knows an artist in the Long Tail quite well is likely to know most of the similar artists too (e.g. the solo project of the band's singer, with other musicians, and so on). Thus, these might not be considered good novel recommendations to that user, but familiar ones. CF contains, then, all the elements to conclude that popularity has a strong effect on the recommendations, because: (i) it presents assortative mixing (indegree–indegree correlation), see Figure 6.3, (ii) there is a strong correlation between an artist's total playcounts and the total playcounts of its similar artists (see Figure 6.4), (iii) most of the hubs in the network are popular artists (see Figure 6.6), and (iv) it is not easy to reach relevant Long Tail artists starting from the head or mid parts (see Table 6.10). Human expert–based recommendations are more expensive to create and have a smaller Long Tail coverage compared to automatically generated recommendations like those in CF and CB. Regarding popularity, the hubs in the expert network are comprised of mainstream music, thus potentially creating a network dominated by popular artists (see Table 6.11 and Figure 6.6). However, the topology (especially the log–normal cumulative indegree distribution) indicates that these artists do not act as hubs, as in the power law distributions with a γ exponent between 2 and 3 (Barabási and Albert, 1999). Furthermore, the expert network does not present assortative mixing (see Figure 6.3), so artists are linked in

a heterogeneous way: popular artists are connected with other less–known artists, and the other way around (see Table 6.9 and Figure 6.4). According to the stationary distribution π (see Table 6.10), the key Long Tail area in the CB and EX networks is the artists in the mid part. These artists allow users to navigate inside the Long Tail, acting as entry points as well as main destinations when leaving the Long Tail. Also, users that listen mainly to very Long Tail music are likely to discover artists unknown to them that are in the mid part, and that are easily reachable from the artists in the tail. One should pay attention to the quality of the data in the Long Tail as well. Assuming that there exists some extremely poor quality music, CB is not able to clearly discriminate against it. In some sense, the popularity effect drastically filters out all these low quality items. However, it has been shown by Salganik et al. (2006) that increasing the strength of social influence increased both the inequality and the unpredictability of success and, as a consequence, popularity was only partly determined by quality.

6.3 User network analysis

One of the main goals of neighbourhood–based recommendation algorithms is to find like– minded people, and through them, discover unknown music. In this sense, a user similarity network resembles a social network, automatically connecting people that share similar interests. We present an evaluation of two user similarity networks. Both networks are derived from the users’ listening habits. The first one is based on collaborative filtering (CF). Again, we gather this information from last.fm. For the second network we use content–based audio similarity (CB) to compute user similarity.

6.3.1 Datasets

Social–based, collaborative filtering network

User similarity is gathered from last.fm, using the Audioscrobbler web services. For each user we collect the top–20 similar users. Last.fm derives user similarity from the item–based approach, so it connects users that share common musical tastes. Table 6.12 shows the number of users and links in the network.

                                Number of users   Number of relations
Last.fm social filtering (CF)   158,209           3,164,180
Content–based (CB)              207,863           4,137,500

Table 6.12: Datasets for the user similarity networks.

Content–based network

User similarity for the CB network is computed using content–based audio analysis from a music collection (T) of 1.3 million tracks of 30 sec. samples. To compute similar users we use all the tracks, T_u, that a user u has listened to. For each track, t_i ∈ T_u, we obtain the most similar tracks like this:

sim(t_i) = \underset{\forall t \in T}{\operatorname{argmin}} \, (distance(t_i, t)),    (6.3)

and get all the users, U_{sim(t_i)}, that listened to any track similar to t_i. The list of (top–20) similar users of u is composed by the users in U_{sim(t_i)} for all t_i ∈ T_u, weighted by the audio similarity distance:

similar\_users(u) = \bigcup_{\forall t_i \in T_u} U_{sim(t_i)}    (6.4)

To select the maximum number of similar users per user we compute, for all the users, the average distance between the user and her top–20 similar users. We use this average distance as a threshold to get the top–N most similar users, setting a maximum of N = 20. The main difference between the two approaches is that in CF two users have to share at least one artist in order to become (potentially) similar. In CB we can have two similar users that do not share any artist, yet the music they listen to is similar. For instance, two users that listen to, respectively, u_i = [Ramones, The Clash, Buzzcocks, and

Dead Kennedys], and u_j = [Sex Pistols, The Damned, The Addicts, and Social Distortion] could be very similar using CB, but not using CF (unless the recommender system also makes use of higher–level information, such as a tag cloud representation of the artists). However, in the CB network, if no constraint is applied to the user profiles, a user with a high number of total playcounts has a higher chance of being considered similar to other users.
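A compact sketch of the aggregation in equations 6.3 and 6.4 is given below. The data structures (user_tracks, similar_tracks, listeners) are hypothetical stand-ins for the listening histories, the audio similarity search over T, and a track-to-listeners index; the weighting by similarity distance is likewise only illustrative.

from collections import defaultdict

def similar_users(u, user_tracks, similar_tracks, listeners, top_n=20):
    """Rank candidate neighbours of user u by accumulated audio similarity."""
    scores = defaultdict(float)
    for t in user_tracks[u]:                      # tracks the user listened to (Tu)
        for t2, dist in similar_tracks(t):        # audio-similar tracks and distances
            for v in listeners[t2]:               # users who listened to those tracks
                if v != u:
                    scores[v] += 1.0 / (1.0 + dist)   # closer tracks contribute more
    return sorted(scores, key=scores.get, reverse=True)[:top_n]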

Property               CF (last.fm)         CB
N                      158,209              207,863
⟨k⟩                    20                   19.90
SGC                    100%                 99.97%
γ_in                   NA (log–normal)      NA (log–normal)
⟨d_dir⟩ (⟨d_rand⟩)     9.72 (3.97)          7.36 (4.09)
D                      12                   10
r                      0.86                 0.17
C (C_rand)             0.071 (1.2·10^-4)    0.164 (9.57·10^-5)
C(k) ∼ k^-α            0.57                 0.87

Table 6.13: User network properties for the last.fm collaborative filtering network (CF), and content–based audio filtering (CB). N is the number of nodes, ⟨k⟩ the mean degree, ⟨d_d⟩ is the avg. shortest directed path, and ⟨d_r⟩ the equivalent for a random network of size N, D is the diameter of the (undirected) network. SGC is the size (percentage of nodes) of the strong giant component for the undirected network, γ_in is the power–law exponent of the cumulative indegree distribution (if applicable), r is the indegree–indegree Pearson correlation coefficient (assortative mixing), C is the clustering coefficient for the undirected network, C_r for the equivalent random network, and C(k) ∼ k^{-α} gives the α exponent for the clustering coefficient as a function of node degree (scaling law).

6.3.2 Network analysis

Small world navigation

Table 6.13 presents the properties of the two networks. The two networks moderately present the small–world phenomenon (Watts and Strogatz, 1998). They have a small average directed shortest path, ⟨d_d⟩, but higher than the ⟨d_r⟩ of the equivalent random network (twice as much). Also, the two clustering coefficients, C, are significantly higher than those of the equivalent random networks, C_r.

Clustering coefficient

Figure 6.7 shows the clustering coefficient as a function of node degree, C(k), for the undirected network. We can see that the higher the indegree of a user, the lower her clustering coefficient. In this sense, the CB network resembles a hierarchical network (Ravasz and Barabási, 2003), although it is not a scale free network. In a hierarchical network there are many small densely linked clusters that are combined to form larger but less cohesive groups, that a few prominent nodes interconnect. In our CB network, C_CB(k) ∼ k^−0.87; starting at k = 20, the exponent α = 0.87 is close to the scaling law, C(k) ∼ k^−1. The scaling law is used to determine the presence of hierarchy in real networks (Ravasz and Barabási, 2003). C(k) is computed for the undirected networks. That is the reason that the C_CB(k) ∼ k^−0.87 power law starts at k = 20. In the undirected network most of the nodes have k ≥ 20 —the node outlinks, k_out, plus the incoming links they receive, k_in. However, in some cases a node has k_out < 20, because the threshold has been applied (see the creation of datasets, in section 6.3.1). These few nodes are located on the left side of Figure 6.7 (0 < k < 20).

Figure 6.7: Clustering coefficient C(k) versus degree k. The CB network resembles a hierarchical network (C_CB(k) ∼ k^−0.87), although it is not a scale free network.
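A sketch of how the C(k) curve and its scaling exponent α can be estimated with networkx and numpy is given below; the grouping by degree and the log–log fit are illustrative simplifications, not the exact fitting procedure behind Table 6.13.

from collections import defaultdict
import numpy as np
import networkx as nx

def clustering_vs_degree(G, k_min=20):
    """Average clustering coefficient per degree, and alpha in C(k) ~ k^-alpha."""
    clustering = nx.clustering(G)               # per-node clustering coefficient
    by_degree = defaultdict(list)
    for node, c in clustering.items():
        by_degree[G.degree(node)].append(c)
    ks = np.array(sorted(k for k in by_degree if k >= k_min))
    ck = np.array([np.mean(by_degree[k]) for k in ks])
    mask = ck > 0                                # avoid log(0) in the fit
    # linear fit in log-log space: log C(k) = -alpha * log k + const
    slope, _ = np.polyfit(np.log(ks[mask]), np.log(ck[mask]), 1)
    return dict(zip(ks.tolist(), ck.tolist())), -slope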

Indegree distribution

Table 6.14 presents the model selection for the indegree distribution. For each network we give a p–value of the fit to the power–law model (first column). A higher p–value means that the distribution is likely to follow a power–law. We also present the likelihood ratios for the alternative distributions (power–law with an exponential cut–off, and log–normal), and the p–values for the significance of the likelihood ratio tests. In this case, a p–value close to zero means that the alternative distribution can also fit the distribution (see section 4.4 for an in–depth explanation about fitting a probability density distribution, and the model selection procedure).

Figure 6.8: Cumulative indegree distribution for the CF and CB user networks.

      power–law      power–law + cut–off       log–normal          support for
      p              LLR          p            LLR         p       power–law
CF    0.00           -192.20      0.00         -14.41      0.00    none
CB    0.00           -836.89      0.00         -37.05      0.00    none

Table 6.14: Model selection for the indegree distribution of the two user networks. For each network we give a p–value for the fit to the power–law model (first column). We also present the likelihood ratios for the alternative distributions (power–law with an exponential cut–off, and log–normal), and the p–values for the significance of each of the likelihood ratio tests (LLR).
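Model selection of this kind can be reproduced, for instance, with the powerlaw Python package; the sketch below assumes indegrees is the list of node indegrees, and it approximates the comparison reported in Table 6.14 rather than reproducing the exact procedure of section 4.4.

import powerlaw

def indegree_model_selection(indegrees):
    """Compare power-law, truncated power-law and log-normal fits of the indegrees."""
    fit = powerlaw.Fit(indegrees, discrete=True)
    # log-likelihood ratio (LLR) and p-value of each pairwise comparison;
    # a negative LLR favours the second (alternative) distribution
    llr_cut, p_cut = fit.distribution_compare('power_law', 'truncated_power_law')
    llr_ln, p_ln = fit.distribution_compare('power_law', 'lognormal')
    return {
        'power_law_alpha': fit.power_law.alpha,
        'lognormal_mu': fit.lognormal.mu,
        'lognormal_sigma': fit.lognormal.sigma,
        'LLR_vs_cutoff': (llr_cut, p_cut),
        'LLR_vs_lognormal': (llr_ln, p_ln),
    }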

Figure 6.8 shows the cumulative indegree distribution for each network. Neither of the two networks is scale free, because the cumulative indegree distribution does not follow a power law (see Table 6.14, first column). In both networks the best fitting distribution, according to their log–likelihood, is a log–normal distribution. The best fit for the CF network is obtained with a log–normal distribution, f(x) = (1/x) e^{−(ln(x)−µ)²/(2σ²)}. The parameters are mean of log, µ = 6.49, and standard deviation of log, σ = 2.80. The best fit for the CB network is also obtained with a log–normal distribution. The parameters are mean of log, µ = 8.51, and standard deviation of log, σ = 2.74.

Figure 6.9: Assortative mixing in the two user networks. CF presents assortative mixing, whilst CB does not (r_CF = 0.86 and r_CB = 0.17).

Assortative mixing

Figure 6.9 depicts the assortative mixing —the indegree–indegree correlation— in the two user networks. CF presents assortative mixing, whilst CB does not (r_CF = 0.86 and r_CB = 0.17). The CF user similarity network resembles a social network, where it is very common to find homophily. Users with a high indegree, k_in, are connected to other users also with a high k_in, whereas users with a low indegree are connected to peers that also have a low indegree. At this point, we conclude the analysis of the two user networks. The following section presents the analysis of the correlation between the user's location in the Long Tail of artist popularity and the user's prominence in the similarity network.
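The indegree–indegree assortativity coefficient r can be obtained directly with networkx, as in the brief sketch below (shown for a directed graph; this is an illustration rather than the exact computation used for Table 6.13).

import networkx as nx

def indegree_assortativity(G):
    """Pearson correlation between the indegrees at both ends of each directed edge."""
    # x='in', y='in' selects the indegree of the source and the target node
    return nx.degree_pearson_correlation_coefficient(G, x='in', y='in')

# usage: indegree_assortativity(cf_graph) is expected to be high (~0.86) for CF,
# and close to zero (~0.17) for the CB network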

6.3.3 Popularity analysis

Similar to the analysis performed in the artist networks, we present two experiments about the popularity effect in the user networks. The first reports the relationships among the users and their location in the Long Tail. The user's location in the Long Tail is measured by averaging the Long Tail location of the artists in the user profile. The second experiment analyses the correlation between users' indegree in the network and their location in the Long Tail.

Figure 6.10: Example of a user's location in the Long Tail of artists. The circle denotes the user's location, computed as the weighted average of the user profile artists' playcounts and popularity.

User similarity

To compute a user's location in the music Long Tail, we get the artists that user u listens to the most, A_u. The artists in A_u must account for at least 66% of the user's total playcounts, so it is a sound representation of the musical tastes of u. Then, the user's Long Tail location is computed as the weighted average of A_u. That is, for each a ∈ A_u we combine the user playcounts for artist a with the Long Tail location of a. Figure 6.10 shows an example of a user's location in the Long Tail. Interestingly, most of the users are located in the Mid part of the curve. Thus, on average a user listens to mainstream music (from the head and mid areas), but also some unknown bands. Because the Mid area is very dense, we split this part into three subsections: Mid_top, Mid_middle, Mid_end. Table 6.15 presents the user similarity in terms of Long Tail locations. The main difference between the two similarity networks is for the users in the Head part. In the CF network more than 55% of the similar users are also located in the head part or

u_i → u_j        Head       Mid_top    Mid_middle    Mid_end     Tail
CF   Head         9.36%     46.22%     26.66%        14.97%      2.78%
     Mid          1.11%     20.52%     41.96%        30.22%      6.18%
     Tail         0.41%      7.23%     26.98%        42.43%     22.95%
CB   Head        10.64%     23.70%     34.42%        25.91%      5.32%
     Mid          3.79%     15.43%     37.95%        34.92%      7.90%
     Tail         1.92%      8.34%     26.94%        40.81%     21.98%

Table 6.15: Similarities among the users, and their location in the Long Tail. Given a user, u_i, it shows (in %) the Long Tail location of her similar users, u_j. The results are averaged over all users in each part of the curve.

      P^(3), with P^(0) = (0, 0, 1, 0, 0)        π                                                                 n
CF    (0.210 Left, 0.407 Stay, 0.383 Right)      (0.012 Head, 0.199 M−top, 0.407 M−mid, 0.309 M−end, 0.074 Tail)   5
CB    (0.190 Left, 0.368 Stay, 0.442 Right)      (0.039 Head, 0.151 M−top, 0.368 M−mid, 0.351 M−end, 0.091 Tail)   5

Table 6.16: Long Tail navigation in terms of a Markovian stochastic process. The second column shows the probability distribution of a user in the Mid_middle after 3 clicks. The third and fourth columns show the stationary distribution π, as well as the number of steps, n, to reach π (with an error ≤ 10⁻⁵).

in the top of the Mid part (Mid_top), whilst in the CB network this value is less than 35%. We represent each row in Table 6.15 as a Markov transition matrix. Using a Markovian stochastic process we can simulate a user surfing the similarity network. In the artist network (see section 6.2.2), we were interested in the navigation from head to tail artists. Now, in the user network, the users are already located in the Long Tail according to the artists' popularity in their profile. Thus, we are more interested in the Long Tail location of the similar users, rather than in the navigation from head to tail users. For instance, using P^(3) and defining P^(0) = (0_Head, 0_M−top, 1_M−mid, 0_M−end, 0_Tail), we get the probability of a user located in the mid part of the curve (Mid_middle) to move to the left side (Head, and M_top), to stay in the same Mid_middle area, or to move to the right (Mid_end, and Tail). Table 6.16 shows the probability distributions. The second column shows the probability distribution of a user located in the Mid_middle after 3 clicks, P^(3). The CF network has a tendency to stay in the same Mid_middle area, whilst in the CB network the user slightly moves towards the right, tail, area. In both cases, the probability to move to the Head (left) is around 0.2. Table 6.16 also shows the stationary distribution π, that satisfies π = πM. The last two columns present the stationary distribution vector for each algorithm, and the number

      k_in     LT Rank    Plays      Artists (number of plays)
CF    2,877    123        1,307      Arcade Fire (47), The Shins (43), Sufjan Stevens (42)
      2,675    75         2,995      Interpol (116), Arcade Fire (108), Radiohead (107)
      2,266    191        4,585      Broken Social Scene (172), Decemberists (128), Arch. (128)
      2,225    176        38,614     The Beatles (23,090), (1,822), (1,588)
      2,173    101        3,488      Decemberists (106), TV on the Radio (101), Arcade Fire (100)
CB    5,568    217        88,689     Red Hot Chili Peppers (27,618), Led Zeppelin (6,595), GN'R (3,457)
      4,706    789        105,768    Interpol (31,281), AFI (5,358), The Faint (3,056)
      4,207    1,222      21,762     (8,271), The Killers (4,040), The Strokes (2,184)
      3,991    121        77,433     The Cure (13,945), NIN (12,938), Smashing Pumpkins (8,460)
      3,884    550        44,006     Muse (19,178), The Killers (3,255), Green Day (3,168)

Table 6.17: Top–5 indegree (kin) users. Influential users in CF are those located in the head of the Long Tail (column LT Rank), whilst influentials in CB are the ones with most playcounts (column Plays).

of steps to converge to π, with an error ≤ 10⁻⁵. Both networks need the same number of steps to reach the steady state, confirming that overall the probability distributions are not very dissimilar.
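A compact numpy sketch of this Markovian simulation is given below. It propagates an initial distribution P^(0) through a row–stochastic transition matrix M (here, the 5×5 matrix over Head, Mid_top, Mid_middle, Mid_end and Tail derived from Table 6.15, which is not reproduced) and iterates π = πM until convergence; it illustrates the procedure rather than the exact matrices used.

import numpy as np

def markov_navigation(M, p0, n_clicks=3, tol=1e-5, max_iter=1000):
    """Distribution after n_clicks, stationary distribution pi, and steps to reach it."""
    M = np.asarray(M, dtype=float)
    p = np.asarray(p0, dtype=float)
    p_after_clicks = p @ np.linalg.matrix_power(M, n_clicks)   # P^(n_clicks)
    steps = 0
    while steps < max_iter:                                    # iterate p <- p M
        nxt = p @ M
        steps += 1
        if np.abs(nxt - p).max() <= tol:
            break
        p = nxt
    return p_after_clicks, nxt, steps

# usage sketch: p0 = [0, 0, 1, 0, 0] places the user in the Mid_middle area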

User indegree

We analyse the correlation between the users' indegree and their location in the Long Tail. Table 6.17 shows, for each network, the top–5 users with the highest indegrees. Users in the network with a high indegree can be considered as influential users, or simply influentials. There is a big difference between the two networks; the influentials in CB are the users with the most playcounts, while the influentials in CF are the users that are closer to the Head part of the curve, independently of their total playcounts. In fact, only the fourth–ranked user in the CF network has the same order of magnitude of total plays as the top–5 users in the CB network. Yet, around 60% of that user's playcounts correspond to The Beatles, the top–1 artist in the Long Tail of artist popularity. Therefore, the reason this CF user has a high indegree is not the high number of playcounts, but the fact that most of the music she listens to is very mainstream. Indeed, looking at the whole distribution of users —not only the top–5— in Figure 6.11, CF presents no correlation between the users' Long Tail position and their network indegree (r_CF = −0.012). However, the CB network presents a correlation of r_CB = 0.446. Thus, as previously stated, users with a higher indegree are the ones with the higher total playcounts in the CB network.

Figure 6.11: Correlation between users' indegree and total playcounts. CB has a correlation of r_CB = 0.446, whilst CF does not present any correlation (r_CF = −0.012).

6.3.4 Discussion

The results of the analysis show that the CB user similarity network resembles a hierarchical network (with the exception that CB is not a scale–free network). Thus, in the CB network there are a few nodes that connect smaller clusters. These nodes are the ones with the highest indegree which, according to Figure 6.11, are the ones with the higher total playcounts. Therefore, the users that listen to more music are the authorities in the CB network, independently of the quality or popularity of the music they listen to. This affects the navigation of the user similarity network. Contrastingly, in the CF network the users with a higher indegree are the ones that listen to more mainstream music. These users could have an impact for a recommender algorithm that uses user–based, instead of item–based, recommendations. The key Long Tail area in the two user similarity networks is the Mid part. This area concentrates most of the users. To improve music discovery through user similarity, the recommendation algorithm should also promote users in the tail area. When computing user similarity, a recommender should take into account the users' location in the Long Tail curve. An important missing aspect in our analysis is the dynamics of the user networks. It

would be interesting to detect who the tastemakers (or trendsetters) are. Users that create trends and have an impact on the musical tastes of other users are very relevant. This is related to the taxonomy of users presented in section 3.2.1. Ideally, the Savants should be correlated with the tastemakers and influentials in the network. Detecting and tracking these users is a key point to improve music discovery through the network of similar users. However, detecting tastemakers can only be achieved by constantly gathering information about the users' music consumption. This way, we could analyse the dynamics and evolution of the user similarity network.

6.4 Summary

Recommender systems should assist us in the process of filtering and discovering relevant information hidden in the Long Tail. Popularity is the element that defines the characteristic shape of the Long Tail. We measure popularity in terms of total playcounts, and the Long Tail model is used in order to rank all music artists. We have analysed the topology and the popularity bias in two music recommendation scenarios; artist and user similarity. As expected by its inherent social component, the collaborative filtering approach is prone to popularity bias. This has some consequences on the discovery ratio, as well as navigation through the Long Tail. Music recommender systems have to deal with biased datasets; a bias towards mainstream popular artists, towards a few prominent musical genres, or towards a particular type of user. Assortative mixing measures the correlation of these elements in the similarity network. In this sense, it is important to understand which contextual attributes have an impact when computing artist similarity (e.g. popularity, genre, decade, language, activity, etc.), or user similarity (e.g. age, race, language, etc.). The Last.fm social–based recommender presents several assortative mixing patterns. The artist network has assortative mixing on the nodes' indegree, but also presents mixing by genre, and mixing by popularity; i.e. the classical homophily issues that arise in social networks. Yet, as we will see in the next chapter, this does not necessarily have an impact on the quality of the recommendations. The temporal effects in the Long Tail are another aspect one should take into account. Some new artists can be very popular, gathering a spike of attention when they release an album, but then they can slowly move towards the mid or tail area of the curve as time goes

by. Thus, one–time hit items can be lost and forgotten in the Long Tail. Indeed, the music back–catalogue located in the Long Tail is an example of old and forgotten items that offer the possibility to be re–discovered by the users. A recommender system should be able to present and recommend these items to the user.

Links with the following chapters

We have presented a network–centric analysis of the similarities between artists, and between users. The network–based approach does not put the user into the evaluation loop. Without any user intervention it is impossible to evaluate the quality and user satisfaction of the recommendations, which does not necessarily correlate with predicted accuracy (McNee et al., 2006). So, we still need to evaluate the quality of the recommendations as well as the popularity effect when providing recommendations to the users. For this reason, we present the user–based evaluation in the next chapter.

Chapter 7

User–centric evaluation

Up to now, we have presented a user agnostic network–based analysis of the recommendations. In this chapter we present a user–centric evaluation of the recommender algorithms. This user–based approach focuses on evaluating the user's perceived quality and usefulness of the recommendations. The evaluation method considers not only the subset of items that the user has interacted with, but also the items outside the user's profile. The recommender algorithm predicts recommendations to a particular user —taking into account her profile—, and then the user provides feedback about the recommended items. Figure 7.1 depicts the approach.

7.1 Music Recommendation Survey

We aim at measuring the novelty and the perceived quality of music recommendations, as neither system– nor network–centric approaches can measure these two aspects. To do so, we need to explicitly ask the users whether they already know the provided recommendations or not.

The proposed experiment is based on providing song recommendations to users, using three different music recommendation algorithms. Feedback gathered from the users consists of (i) whether a user already knows the song, and (ii) the relevance of the recommendations —whether she likes the recommended song or not.


Figure 7.1: User–centric evaluation focuses on evaluating the user’s relevance and usefulness of the recommendations. The evaluation method considers not only the subset of items that the user has interacted with, but also the items outside the user’s profile.

7.1.1 Procedure

We designed a web–based survey experiment to evaluate the novelty and relevance of music recommendations from the point of view of the users1. The survey is divided into two sections. The first one asks the participants for basic demographic information (age range and gender), previous musical knowledge, and the average number of listening hours per day. The second part of the survey provides a set of rounds, each round containing an unsorted list of ten recommended songs evenly distributed among three different recommendation approaches. The participants do not know which recommendation method is used to recommend each song. A participant has to rate at least 10 songs, but she can rate as many songs as she likes. The participant's feedback includes whether she knows the song (no, recall only the artist, recall artist name and song title), and the quality of the recommendations —whether she likes the song or not— on a rating scale from 1 (I don't like it) to 5 (I like it very much). The recommended songs do not contain any metadata, neither artist name nor song title, but only an audio preview of 30 seconds. The participant can listen to the preview of the recommended song as many times as she wishes. Figure 7.2 shows a screenshot of the experiment.

1The experiment is available at: http://foafing-the-music.iua.upf.edu/survey

Figure 7.2: Screenshot of the Music recommendation survey.

7.1.2 Datasets

The three music recommendation algorithms used are: collaborative filtering (CF), content–based audio similarity (CB), and a hybrid approach (HY) combining Allmusic.com human expert information and content–based similarity. CF song similarity comes, again, from last.fm2, using the Audioscrobbler web services (API v1.0). The CB method is the one explained in section 6.2.1, equation 6.1. The hybrid method (HY) is based on combining related artists from Allmusic.com musicologists, and CB audio similarity at track level. That is, to get the similar tracks from a seed track, first it gets the related artists (according to the AMG human experts) of the seed track's artist. Then, it ranks the retrieved tracks from the related artists using content–based audio similarity with the seed track.
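A minimal sketch of this hybrid ranking is shown below; related_artists(artist) (the expert–based relations) and audio_distance(t1, t2) (the content–based measure of section 6.2.1) are assumed helper functions, and tracks are assumed to expose an artist attribute, so this is an illustration of the approach rather than the actual implementation.

def hybrid_similar_tracks(seed_track, track_catalogue, related_artists,
                          audio_distance, top_n=20):
    """HY approach: restrict candidates to tracks by expert-related artists,
    then rank them by content-based audio similarity to the seed track."""
    relatives = set(related_artists(seed_track.artist))
    candidates = [t for t in track_catalogue
                  if t.artist in relatives and t != seed_track]
    # smaller audio distance means more similar to the seed track
    candidates.sort(key=lambda t: audio_distance(seed_track, t))
    return candidates[:top_n]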

7.1.3 Participants

In order to characterise the participants, at the beginning of the survey they were asked to provide some basic demographic information (age range, and gender), as well as the

2See for example http://www.last.fm/music/U2/_/One/+similar

participants' musical background knowledge, the average number of listening hours per day (more than 4 hours a day, between 2 and 4 hours a day, less than 2 hours a day, almost never listen to music), and the context while listening to music. All the fields were optional, so the participants could fill–in the information or not (only 9 participants did not fill–in all the data). Regarding the musical background, the survey offered the following single choice options:

• None: no particular interest in music related topics.
• Basic: lessons at school, reading music magazines, blogs, etc.
• Advanced: regular choir singing, amateur instrument playing, remixing or editing music with the computer, etc.
• Professional: professional musician —conductor, composer, high level instrument player—, music conservatory student, audio engineer, etc.

Regarding the context while listening to music, the participants were asked to choose (multiple selection was allowed) the situations where they often listen to music. The options are:

• While working,
• Reading,
• Cleaning,
• Traveling,
• Doing sport,
• Cooking,
• Usually I just listen to music (and don't do anything else), and
• Other (please specify)

Furthermore, musical tastes of the participants were modelled using some seed tracks of their top–20 most played artists from their last.fm profile. These seed tracks are the ones used to provide song similarity using the CF, CB and HY approaches. To assemble a significant number of participants, we sent an email to the MIR–list3 that described the survey and the procedure. Also, the survey was kindly announced in Paul Lamere's Duke Listens blog4 on March 3rd, 2008.

3Message sent to [email protected] on February, 28th, 2008
4http://blogs.sun.com/plamere/entry/evaluating_music_recommendations

Figure 7.3: Demographic information (age and gender distribution) of the participants.

7.2 Results

After running the experiment during the first two weeks in March 2008, 5,573 tracks were rated by 288 participants (with an average of 19 tracks rated per participant). Section 7.2.1 presents the analysis of the participants’ data. Then, section 7.2.2 presents the results of the three music recommendation approaches, including the analysis of the perceived quality, as well as the novelty and familiarity elements.

7.2.1 Participants

We present the results of the demographic and musical background data gathered from the participants. Figure 7.3 shows the information about the participants' demographics. Most of the participants were adult males between 19 and 35 years old. Figure 7.4 shows the distribution of the participants' musical background. Participants had a basic or advanced musical background, and most of them spent an average of two or more hours per day listening to music. The four pie charts have 3% of not–available (NA) missing data. This missing data comes from the nine participants that answered none of the questions. To recap, our predominant participants were male young adults, with a basic or advanced musical background, who listen to quite a lot of music during the day. We consider that this

is a biased sample of the population of listeners open to receiving music recommendations. Yet, it is the group we could engage to answer the survey.

Figure 7.4: Musical background and daily listening hours information of the participants.

7.2.2 Music Recommendation

Now, we present the results of the second part of the survey, which consists of the evaluation of the three music recommendation methods. During the experiment, a list of 5,573 tracks rated by 288 participants was compiled. A participant's feedback about a recommended song includes whether she identifies the song (no, recall only the artist, recall artist name and song title), and the relevance of the recommendation (on a [1..5] scale) based on the 30 second audio excerpt.

Overall results

Table 7.1 presents the overall results for the three algorithms. It shows, for each algorithm, the percentage of recommended songs that the participants identified (i.e. they are familiar with), as well as the unknown —novel— ones. The last column shows the relevance of the recommendations (average rating on a scale of [1..5], and standard deviation).

Method    Case             %        Avg. Rating (Stdev)
          Recall A&S       14.93    4.64 (±0.67)
CF        Recall only A    12.23    3.88 (±0.99)
          Unknown          71.69    3.03 (±1.19)
          Recall A&S       10.07    4.55 (±0.81)
HY        Recall only A    10.31    3.67 (±1.18)
          Unknown          78.34    2.77 (±1.20)
          Recall A&S        9.91    4.56 (±1.21)
CB        Recall only A     7.95    3.61 (±1.10)
          Unknown          80.97    2.57 (±1.19)

Table 7.1: User–centric evaluation of the novelty component for collaborative filtering (CF), Hybrid (HY), and audio content–based (CB) algorithms. Recall A&S means that a participant recognises both artist and song title. Recall only A means that a participant identifies only the artist but not the song title.

Novelty and familiarity analysis based on perceived quality

Figures 7.5, 7.6, and 7.7 show the histogram of the ratings when the participant knows the artist name and song title (Figure 7.5), only identifies the artist (Figure 7.6), and when the song is completely unknown to the participant (Figure 7.7). In the three approaches, familiar recommendations score very high; especially when the participant identifies the song, but also when she only recognises the artist. Yet, providing familiar recommendations is not the most challenging part of a recommender system. In fact, one can always play songs from the artists in the user's profile, but then the discovery ratio will be null.

As expected, the quality of the ratings drastically decreases when the participants do not recognise the recommendations. The worst case is for the novel songs. Only the CF approach has an average rating score above 3 (see Table 7.1, and the box–and–whisker plots in Figure 7.8). These bad results are understandable because in the experiment we intentionally did not provide any context about the recommendations, not even basic metadata such as the artist name or song title. One of the goals of the experiment is also to measure the novelty component, so the only input the participants can receive is the audio content. Our belief is that by adding basic metadata and an explanation of why the song was recommended, the perceived relevance of the novel songs could be drastically increased in the three algorithms.

Figure 7.5: Histogram of the ratings (on a [1..5] scale) when the participant identifies the artist and song (left: CF, center: CB, and Right: HY).

Figure 7.6: Histogram of the ratings (on a [1..5] scale) when the participant only recognises the artist (left: CF, center: CB, and Right: HY).

Figure 7.7: Histogram of the ratings (on a [1..5] scale) when the recommended song is unknown to the participant (left: CF, center: CB, and Right: HY).

Analysis of variance

We use the overall results from Table 7.1 to compare the three algorithms, performing a one–way ANOVA within subjects, at 95% confidence level. As for familiar recommendations (including both artist and song known and recall only artist), there is no statistically significant difference in the relevance of the recommendations for the three algorithms. The main differences are found in the ratings of unknown songs, F = 29.13, with p ≪ 0.05, and in the percentage of known songs, F = 7.57, p ≪ 0.05. In the former case, the Tukey's test for pairwise comparisons confirms that CF average rating scores higher than HY and CB, at 95% family–wise confidence level (see Figures 7.8 and 7.9). However, according to the latter case (percentage of known songs), CF generates more familiar songs than CB and HY. Thus, CB and HY provide more novel recommendations, although their quality is not as good as CF.

Figure 7.8: Box–and–whisker plot for the ratings of unknown songs.
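For reference, this kind of comparison can be sketched with scipy and statsmodels as below, assuming ratings_cf, ratings_hy and ratings_cb hold the ratings of unknown songs for each algorithm; it is an illustrative one–way ANOVA followed by Tukey's HSD, not the exact within–subjects procedure reported above.

import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

def compare_unknown_ratings(ratings_cf, ratings_hy, ratings_cb):
    """One-way ANOVA across the three algorithms, followed by Tukey's HSD test."""
    f_stat, p_value = stats.f_oneway(ratings_cf, ratings_hy, ratings_cb)
    values = np.concatenate([ratings_cf, ratings_hy, ratings_cb])
    groups = (['CF'] * len(ratings_cf) + ['HY'] * len(ratings_hy)
              + ['CB'] * len(ratings_cb))
    tukey = pairwise_tukeyhsd(values, groups, alpha=0.05)
    return f_stat, p_value, tukey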

7.3 Discussion

The results from the user–centric evaluation show that user perceived quality for novel, unknown recommendations —in the three methods— is on the negative side (avg. rating around 3/5 or less, in Table 7.1). This emphasises the need for adding more context when recommending unknown music. Users might want to understand why a song was recommended to them. Recommender systems should give as many reasons as possible, even including links to external sources (reviews, entries, etc.) to support their decision. Besides, the limitation in the experiment of using only 30 sec. samples did not help to assess the quality of the song. Yet, there are lots of industrial music recommender systems

that can only preview songs due to licensing constraints. This constraint, then, is not that far from reality.

Figure 7.9: Tukey's test for the ratings of unknown songs. Tukey's test does a pairwise comparison of the average ratings of unknown songs, and it confirms that the CF avg. rating scores higher than the HY and CB approaches, at 95% family–wise confidence level.

We were expecting some correlation between the users' musical background and the ratings or percentage of unknown songs. For instance, a user that listens to many hours of music daily could have more chances to identify more recommended songs. Yet, no big statistically significant differences were found regarding the age, gender, musical background, number of hours, or context when listening to music. Only two minor statistically significant findings were found, with a p–value p ≪ 0.05. The first one is that participants aged 36–45 (7% of the total) give lower ratings for the known songs than the rest of the participants. The second finding is that participants with no musical background (9% of the total) are the ones that penalise the unknown songs with lower ratings. Yet, these two results could have appeared by chance, given the low percentage of these two groups of participants. An interesting experiment would be to identify each participant as a savant, enthusiast, casual or indifferent (see section 3.2.1), and see whether there is any difference in the ratings when providing novel music. This would measure how open to receiving novel recommendations each type of user is. Indeed, this would help music recommender systems to

decide whether to be risky or confident with the personalised recommendations. However, with the participants' data that we gathered it was not straightforward to decide which type of user each participant was.

Figure 7.10: Location of the three music recommendation approaches in the novelty vs. relevance axis (presented in chapter 4, Figure 4.8).

Regarding recommendation approaches, the context–free and popularity agnostic CB algorithm sometimes points in the wrong direction (it is not that easy to discriminate between a, say, classical guitar and a harpsichord, based solely on the audio content), and gives poor or nonsensical recommendations. This leaves room for improving the audio similarity algorithm. In this sense, the proposed hybrid approach drastically reduces the space of possible similar tracks to those artists related to the original artist. This avoids, most of the time, the mistakes made by the pure CB, but on the other hand the HY results are less eclectic than CB. CF tends to be more conservative, providing fewer novel recommendations, but of higher quality and relevance to the user. Figure 7.10 summarises the comparison of the three approaches, based on the trade–off between novelty and relevance (presented in chapter 4, Figure 4.8).

We can envision different solutions to cope with novelty in recommender systems. The first one is to use CF, promoting unknown artists by means of exploiting the Long Tail popularity of the catalog and the topology of the recommendation network. Another option

is switching among algorithms when needed. For instance, to avoid the cold–start problem whilst promoting novelty, one option is to use CB or the hybrid approach, although the latter heavily relies on human resources. After a while, the system can move to the more stable CF or HY approaches. Or, we could also take into account the artist's (or user's) location in the Long Tail, and use one or another algorithm accordingly. Furthermore, the system should be able to change the recommendation approach according to the user's needs. Sometimes, a user is open to discovering new artists and songs (novelty), while sometimes she just wants to listen to her favourites (familiarity). Detecting these modes and acting accordingly should increase the user's satisfaction with the system.

7.4 Limitations

To conclude, we also want to point out some limitations of the experiment. Users had to rate songs using only a 30 second audio preview. Even though the participants could listen to the songs repeatedly, it is not easy to rate a song the first time one listens to it. Sometimes, one can love a song after hearing it several times, in different contexts and moods. We could not measure this effect in the experiment. One solution could be to allow participants to download the full songs, and then after a period of time (e.g. one week, one month) they would notify us of the total playcounts for each recommended song. Relevant songs could be inferred from the listening habits about the recommended songs. However, in this case a limitation is that we would collect fewer answers from the participants (i.e. only the songs that were listened to at least once). Another issue is that musical tastes of the participants were gathered from last.fm, which is also one of the recommendation approaches used. This means that, beforehand, the participants were used to this system and the recommendations it provides. Yet, we decided that this music profile is more compact and reliable than asking the participant, at the beginning of the experiment, to enter a list of her favourite artists. Furthermore, another constraint is that only users with a last.fm account could participate in the survey. The blind recommendation approach —without providing any context— does not help in assessing the relevance of the novel recommendations. It might be the case that some of the novel songs were rated badly, but when explaining the relationships with the user's favourite artists, the artist biography, images, etc. the perceived quality could be increased. In real recommender systems, blind recommendations with no explanations are

useless. Why is as important as what is being recommended. Last but not least, we are not interested in judging which recommendation method performs the best, but in detecting the main differences among the approaches, and how people respond to each approach. In this sense, it is not fair to compare a real system like last.fm to the other two straightforward, plain approaches. In addition, we did not include a fourth method, say a random recommender, that could serve as a baseline for the recommendations. This way, we could assess whether the three methods perform, at least, better than the baseline. Instead, we chose to gather more ratings from the three real methods rather than adding another —baseline— method to the survey.

Chapter 8

Applications

This chapter presents two implemented prototypes related with music discovery and recommendation. The first system, named Searchsounds, is a music search engine based on text keyword searches, as well as a more like this button that allows users to discover music by means of audio similarity. Thus, Searchsounds allows users to dig into the Long Tail by providing music discovery using audio content–based similarity. The second system, named FOAFing the Music, is a music recommender system that focuses on the Long Tail of popularity, promoting unknown artists. The system also provides related information about the recommended artists, using information available on the web gathered from music related RSS feeds. The main difference between the two prototypes is that Searchsounds is a non–personalised music search engine, whilst FOAFing the Music takes into account the user profile and the listening habits to provide personalised recommendations.

8.1 Searchsounds: Music discovery in the Long Tail

SearchSounds is a web–based music search engine that allows users to discover music using content–based similarity. Section 8.1.1 introduces the motivations and background of the implemented system. In section 8.1.3 we present the architecture of the system. Finally, the last section summarises the work done and outlines the remaining work regarding the functionality of the system.


8.1.1 Motivation

Nowadays, the increasing amount of music available on the World Wide Web makes it very difficult for the user to find music she would like to listen to. To overcome this problem, there are some audio search engines1 that can fit the user's needs. Some of the currently existing search engines are, nevertheless, not fully exploited because their companies would have to deal with copyright infringing material. As with general search engines, music search engines have a crucial component: an audio crawler that scans the web for audio files, and also gathers related information about the files (Knopke, 2004).

Syndication of Web Content

During the last years, syndication of web content —a section of a website made available for other sites to use— has become a common practice for websites. This originated with news and weblog sites, but nowadays it is increasingly used to syndicate any kind of information. Since the beginning of 2003, a special type of weblog, named audio weblogs (or MP3 blogs), has become very popular. These blogs make music titles available for download. The posted music is explained by the blog author, and usually it has links that allow users to buy the complete album or work. Sometimes, the music is hard to find or has not been issued in many years, and many MP3 blogs link strictly to music that is authorised for free distribution. In other cases, MP3 blogs include a disclaimer stating that they are willing to remove music if the copyright owner objects. Anyway, this source of semi–structured information is a jewel for web crawlers, as it contains the user's object of desire —the music— and some textual information that refers to the audio file. The file format used to syndicate web content is XML. Web syndication is based on the RSS family and Atom formats. The RSS abbreviation is used to refer to the following standards: Really Simple Syndication (RSS 2.0), Rich Site Summary (RSS 0.91 and 1.0) or RDF Site Summary (1.0). Of special interest are the feeds that syndicate multimedia content. These feeds publish audiovisual information that is available on the net. An interesting example is the Media RSS (mRSS) specification2, led by Yahoo! and the multimedia RSS community. mRSS

1To mention a few (accessed on September, 21st, 2008): http://audio.search.yahoo.com/, http://www.audiocrawler.com/, and http://www.altavista.com/audio/
2http://search.yahoo.com/mrss/

allows bloggers to syndicate multimedia files (audio, video, image) in RSS feeds, and adds several enhancements to RSS enclosures. Although mRSS is not yet widely used on the net, some websites syndicate their multimedia content following the specification3. These feeds contain textual information, plus a link to the actual audiovisual file. As an example, listing 8.1 shows a partial RSS feed4.

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example of a mRSS feed</title>
    <link>http://www.ourmedia.org/user/45801</link>
    <description>Recently published media items from Ourmedia.org</description>
    <language>en</language>
    <item>
      <title>Fanky beats</title>
      <link>http://www.ourmedia.org/node/...</link>
      <description>Rock music with a funky beat and electric lead guitar riffs (...)</description>
      <pubDate>Mon, 17 Apr 2007 01:35:49 -0500</pubDate>
      <dc:creator>John Brettbutter</dc:creator>
      <category>funk</category>
      <category>guitar</category>
      <enclosure url="..." type="audio/mpeg" />
    </item>
    <item>
      <title>Another item ...</title>
    </item>
  </channel>
</rss>

Listing 8.1: Example of a media RSS feed.

3One of the most important ones is http://www.ourmedia.org
4Adapted from a real example published in the OurMedia website. http://www.ourmedia.org

The example shows an item with all its information: the title of the item, the description, the publication date, the editor of the entry, and a set of categories (similar to tags, but controlled from a given taxonomy). SearchSounds mines this information in order to retrieve relevant audio files based on keywords.

8.1.2 Goals

The main goal of the system is to allow users to discover unknown music. For this reason, SearchSounds mines music related information available in MP3–weblogs, and attaches textual information to the audio files. This way, users can search and retrieve music related to the query, as well as music that sounds similar to the retrieved audio files. This exploration mode allows users to discover music —related to her original (keyword based) query— that would be more difficult to discover using only textual queries. Figure 8.1 shows the relationship between the music information plane (see section 3.3), and the information that SearchSounds uses.

8.1.3 System overview

SearchSounds exploits and mines all the music related information available from MP3–weblogs. The system gathers editorial, cultural, and acoustic information from the crawled audio files. The input of the system is a query composed of text keywords. From these keywords, the system is able to retrieve a list of audio files related with the query. Each audio file provides a link to the original weblog, and a list of similar titles. This similarity is computed using content–based audio description. Thus, from the results of a keyword query, a user can discover related music by navigating onto the audio similarity plane. It is worth mentioning that there is no user profiling or any kind of user representation stored in the system. This is a limitation, as the system does not make any personalised recommendations. However, this limitation is solved in the next prototype (explained in section 8.2). The main components of the system are the audio crawler and the audio retrieval system. Figure 8.2 depicts the architecture of the system.

Audio Crawler

The system has an audio spider module that crawls the web. All the gathered information is stored into a relational database. The audio crawler starts the process from a manually

Figure 8.1: SearchSounds makes use of editorial, cultural and acoustic metadata. The system retrieves (1) audio files from a keyword query, as well as (2) a list of (content–based) similar titles.

Figure 8.2: SearchSounds architecture. The main components are the audio crawler, and the audio retrieval system.

selected list of RSS links (that point to MP3–blogs). Each RSS file contains a list of entries (or items) that link to audio files. The crawler looks for new incoming items —using the pubDate item value and comparing it with the latest entry in the database— and stores the new information in the database. Thus, the audio crawler system keeps a history of all the items that have appeared in a feed. From the previous RSS example (see example 8.1, presented in section 8.1.1), the audio crawler stores the title, the content of the description, the assigned terms from the taxonomy (category tags), and the link to the audio file (extracted from the enclosure url attribute).
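As an illustration, the core of such a crawling step could look like the following sketch, built on the feedparser Python library; the save_item callback and the latest_pubdate value stand in for the relational database layer, which is not detailed here.

import time
import feedparser

def crawl_feed(feed_url, latest_pubdate, save_item):
    """Fetch an MP3-blog feed and store only the items newer than the last crawl."""
    feed = feedparser.parse(feed_url)
    for entry in feed.entries:
        published = entry.get('published_parsed')
        if published is None or time.mktime(published) <= latest_pubdate:
            continue                      # already seen in a previous crawl
        audio_links = [enc.get('href') for enc in entry.get('enclosures', [])
                       if enc.get('type', '').startswith('audio')]
        save_item({
            'title': entry.get('title', ''),
            'description': entry.get('summary', ''),
            'categories': [t.get('term', '') for t in entry.get('tags', [])],
            'published': time.mktime(published),
            'audio_urls': audio_links,
        })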

Audio Retrieval System

The logical view of a crawled feed item can be described by the bag–of–words approach: a document is represented as a number of unique words, with a weight (in our case, the tf/idf function) assigned to each word (Baeza-Yates and Ribeiro-Neto, 1999). Special weights are assigned to the music related terms, as well as to the metadata (e.g. ID3 tags) extracted from the audio file. Similar to our approach, Vembu and Baumann (2004) propose modifying the weights of the terms pertaining to the musical domain. Moreover, basic natural language processing methods are applied to reduce the size of the item description (elimination of stopwords, and Porter's stemming algorithm (Porter, 1980)). The information retrieval (IR) model used is the classic vector model approach, where a given document is represented as a vector in a multidimensional space of words (each word of the vocabulary is a coordinate in the space).

The similarity function, sim(d_j, q), between a query (q) and a document (d_j) is based on the cosine similarity, using the TF/IDF weighting function (already presented in section 2.5.4). Our approach is well suited not only for querying via artists' or songs' names, but for more complex keyword queries such as: "funky guitar riffs" or "traditional Irish tunes". The retrieval system outputs the documents (i.e. feed entries) that are relevant to the user's query, ranked by the similarity function. Figure 8.3 depicts the retrieved audio files for the "traditional Irish music" query.

Figure 8.3: Screenshot of the SearchSounds application, showing the first 10 results from the "traditional Irish music" query.

Based on the results obtained from the user's textual query, the system allows users to find similar titles using content–based audio similarity. Each link to an audio file has a "Find similar" button that retrieves the most similar audio files, based on a set of low and mid–level audio descriptors. These descriptors are extracted from the audio and represent properties such as: rhythm, harmony, timbre and instrumentation, intensity, structure and complexity (Cano et al., 2005). This exploration via browsing allows users to discover music —related to her original (keyword based) query— that would be more difficult to discover by using textual queries only. There is an analogy between this type of navigation and, for example, Google's "find web pages that are similar to a given HTML page". In our case, similarity among items is based on audio similarity, whereas Google's approach is based on the textual content of the HTML page. Still, both browsing approaches are based on the content analysis of the retrieved object.
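A minimal sketch of this keyword retrieval step, using scikit-learn's TF–IDF vectoriser and cosine similarity, is shown below; it illustrates the vector–space ranking described above (the special weighting of musical terms and the stemming step are omitted) and is not the production SearchSounds code.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_feed_items(documents, query, top_k=10):
    """Rank feed-item documents by TF-IDF cosine similarity to a keyword query."""
    vectorizer = TfidfVectorizer(stop_words='english')
    doc_matrix = vectorizer.fit_transform(documents)       # one row per feed item
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, doc_matrix).ravel()
    ranked = scores.argsort()[::-1][:top_k]
    return [(int(i), float(scores[i])) for i in ranked]

# usage: rank_feed_items(item_texts, "traditional Irish music")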

8.1.4 Summary

We developed a web–based audio crawler that focuses on MP3–weblogs. Out of the crawling process, each feed item is represented as a text document, containing the content of the item, as well as the links to the audio files. Then, a classic text retrieval system outputs relevant feed items related to the user's query. Furthermore, content–based navigation allows users to browse through the retrieved items and discover new music and artists using audio similarity. Ongoing work includes the automatic extraction of music related tags (e.g. guitar, rock, 70's) from the text, as well as applying autotagging to incoming audio files using audio content–based similarity (Sordo et al., 2007). We also plan to add relevance feedback to tune the system and get more accurate results, especially for the content–based similarity. The system is available at http://www.searchsounds.net.

8.2 FOAFing the Music: Music recommendation in the Long Tail

Now we present the second of the two prototypes developed. It is a music recommender system, named FOAFing the Music, that allows users to discover a wide range of music located along the Long Tail. The system exploits music related information that is being syndicated (as RSS feeds) on thousands of websites. Using the crawled information, the system is able to filter it and recommend it to the user, according to her profile and listening habits.

8.2.1 Motivation

The World Wide Web has become the host and distribution channel for a broad variety of digital multimedia assets. Although the Internet infrastructure allows simple, straightforward acquisition, the value of these resources is limited by the lack of powerful content management, retrieval and visualisation tools. Music content is no exception: although there is a sizeable amount of text–based information related to music (album reviews, artist biographies, etc.), this information is hardly ever associated with the objects it refers to, that is, the music files themselves (MIDI or audio). Moreover, music is an important vehicle for communicating to other people something relevant about our personality, history, etc.

There is a clear interest in the Semantic Web field in creating a Web of machine–readable homepages describing people, the links among them, and the things they create and do. The FOAF (Friend Of A Friend) project5 provides conventions and a language to describe homepage–like content and social networks. The FOAF vocabulary provides properties and classes for describing common features of people and their social networks. FOAF is based on the RDF/XML6 vocabulary.

We foresee that with a complete user's FOAF profile, our system would get a better representation of the user's musical needs. On the other hand, the RSS vocabulary7 allows one to syndicate Web content on the Internet. Syndicated content includes data such as news, event listings, headlines, project updates, as well as music related information, such as new music releases, album reviews, podcast sessions, and upcoming gigs.

To our knowledge, there is currently no system that recommends items to a user based on her FOAF profile. Yet, it is worth mentioning the FilmTrust system8. It is part of a research study aimed at understanding how social preferences might help web sites to present information in a more useful way (Golbeck and Parsia, 2005). The system collects user reviews and ratings about movies, and holds them in the user's FOAF profile (Golbeck, 2005).

5http://www.foaf-project.org
6http://www.w3.org/RDF
7http://web.resource.org/rss/1.0/
8http://trust.mindswap.org/FilmTrust

Figure 8.4: FOAFing the Music and the music information plane.

8.2.2 Goals

The main goal of the FOAFing the Music system is to recommend, discover and explore music content, based on user profiling (via FOAF descriptions), context based information (extracted from music related RSS feeds), and content based descriptions (automatically extracted from the audio itself); all of it based on a common ontology that describes the musical domain. Figure 8.4 shows the relationship between the music information plane and the different sources of metadata that the system exploits. Compared to the first prototype (Searchsounds), Foafing the Music holds a user profile representation, based on the FOAF initiative (already presented in section 3.2). A FOAF user profile allows the system to filter music related information according to the user's preferences.

8.2.3 System overview

The overview of the Foafing the Music system is depicted in Fig. 8.5. The system is divided into two main components: (i) how to gather data from external third party sources (presented in section 8.2.3), and (ii) how to recommend music to the user based on the crawled data and the semantic description of the music titles (section 8.2.3).

Gathering music related information

Personalised services can raise privacy concerns due to the acquisition, storage and application of sensitive personal information (Perik et al., 2004). In our system, information about the user is not stored in the system in any way. Instead, the system has only a link pointing to the user's FOAF profile (often a link to a Livejournal account). Thus, the sensitivity of this data is up to the user, not to the system. Users' profiles in Foafing the Music are distributed over the net. Regarding music related information, our system exploits the mashup approach. The system uses a set of publicly available APIs and web services sourced from third party websites. This information can come in any of the different RSS formats (v2.0, v1.0, v0.92 and Yahoo! Media RSS), as well as in the Atom format. Thus, the system has to deal with syntactically and structurally heterogeneous data. Moreover, the system keeps track of all the new items that are published in the feeds, and stores the new incoming data in a historic relational database. Input data of the system is based on the following information sources:

• User listening habits. To keep track of the user's listening habits, the system uses the services provided by last.fm. This system offers a list of RSS feeds that provide the most recent tracks a user has played. Each item feed includes the artist name, the song title, and a timestamp —indicating when the user has listened to the track.
• New music releases. The system uses a set of RSS feeds that gathers new music releases from iTunes, Amazon, Yahoo! Shopping and Rhapsody.
• Upcoming concerts. The system uses a set of RSS feeds that syndicates music related events. The websites are: Eventful.com, and Upcoming.org. Once the system has gathered the new items, it queries the Google Maps API to get the geographic location of the venues, so it can be filtered according to the user's location.
• Podcast sessions. The system gathers information from a list of RSS feeds that publish podcast sessions.
• MP3 Blogs. The system gathers information from a list of MP3 blogs that talk about artists and new music releases.
• Album reviews. Information about album reviews is crawled from the RSS feeds published by Rateyourmusic.com, Pitchforkmedia.com, online magazines Rolling Stone9, BBC10, New York Times11, and 75 or less records12.

Source               # RSS seed feeds    # Items stored
New releases         44                  426,839
MP3 blogs            127                 600,838
Podcast sessions     830                 146,922
Album Reviews        12                  127,367
Upcoming concerts    14                  292,526

Table 8.1: Information gathered from music related RSS feeds is stored into a relational database. Based on the user’s FOAF profile, the system filters this information, and presents the most relevant items according to her musical taste.

Table 8.1 shows some basic statistics of the data that has been gathered since mid April 2005 until the first week of May 2008. These numbers show that the system has to deal with daily incoming data.

9http://www.rollingstone.com/
10http://www.bbc.co.uk/
11http://www.nytimes.com/
12http://www.75orless.com/

Figure 8.5: Architecture of the Foafing the Music system.

An ontology is an explicit and formal specification of a conceptualisation (Gruber, 1993). In general, an ontology formally describes a domain of discourse. The requirements for ontology languages are: a well–defined syntax, a formal semantics, and reasoning support that checks the consistency of the ontology, checks for unintended relationships between classes, and automatically classifies instances in classes. The Web Ontology Language (OWL) has a richer vocabulary description language for describing properties and classes than RDFS. OWL has relations between classes, cardinality, equality, characteristics of properties and enumerated classes. The OWL language is built on RDF and RDFS, and uses RDF/XML syntax. OWL documents are, then, RDF documents. On the other hand, we have defined a simple music recommendation OWL DL ontology (http://foafing-the-music.iua.upf.edu/music-ontology#) that describes some basic properties of the artists and music titles, as well as some descriptors automatically extracted from the audio files (e.g. tonality, rhythm, moods, music intensity, etc.). In (Garcia and Celma, 2005) we propose a way to map our ontology and the Musicbrainz ontology onto the MPEG-7 standard, which acts as an upper–ontology for multimedia

description. This way we can link our dataset with the Musicbrainz information in a straightforward manner. A focused web crawler has been implemented to add instances to our music ontology. The crawler extracts metadata of artists and songs, and the relationships between artists (such as: "related with", "influenced by", "followers of", etc.), and converts it to RDF/XML notation. The seed sites to start the crawling process are music metadata providers, such as MP3.com, Yahoo! Music, and RockDetector, as well as independent music labels (Magnatune, CDBaby, Garageband, etc.). Based on our lightweight music recommendation ontology, listing 8.2 shows the RDF/XML description of an artist from GarageBand.

Randy Coleman
1990
2000
Pop

Listing 8.2: RDF example of an artist individual

Listing 8.3 shows the description of an individual track by the previous artist, including basic editorial metadata and some features extracted automatically from the audio file.

Listing 8.3: Example of a track individual. (The RDF/XML markup was lost during text extraction; the surviving literal values are "Last Salutation", "247", "Energetic", "D Major", "0.84" and "72".)

These individuals are used in the recommendation process, to retrieve artists and songs related with the user’s musical taste.
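A hedged sketch of the kind of lookup this implies, expressed as a SPARQL query over the artist repository (the property names influencedBy and followerOf are assumptions about the ontology, and the file name is hypothetical):

# Illustrative sketch: retrieve artists related to a seed artist via graph relations.
from rdflib import Graph

g = Graph()
g.parse("artists.rdf")  # hypothetical dump of the crawled artist repository

query = """
PREFIX fm: <http://foafing-the-music.iua.upf.edu/music-ontology#>
SELECT DISTINCT ?related ?name
WHERE {
  ?seed fm:name "Randy Coleman" .
  { ?seed fm:influencedBy ?related } UNION { ?related fm:followerOf ?seed } .
  ?related fm:name ?name .
}
"""

for row in g.query(query):
    print(row.name)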

Providing music recommendation

This section explains the music recommendation process, based on all the information that has continuously been gathered from the RSS feeds and the crawler. Music recommendations, in the Foafing the Music system, are generated according to the following steps (a toy sketch of these steps follows the list):

1. Get music related information from user’s FOAF interests, and listening habits from last.fm,

2. Detect artists and bands,

3. Compute similar artists, and

4. Rate the results by relevance, according to the user’s profile.
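A compact sketch of the four steps above. This is a deliberately simplified stand-in for the real pipeline; the similarity data, the items and the weighting scheme are made up for illustration:

# Illustrative sketch of the recommendation steps: detect artists, expand via
# similarity, and rank candidate items by relevance to the profile.
from collections import defaultdict

profile_artists = {"The Dogs d'Amour": 1.0}            # steps 1-2: from FOAF + last.fm
similar = {"The Dogs d'Amour": [("Hanoi Rocks", 0.8),   # step 3: assumed similarity data
                                ("Faces", 0.6)]}

# Candidate items gathered from the RSS feeds, tagged with the artist they mention.
items = [("New release: Hanoi Rocks reissue", "Hanoi Rocks"),
         ("Album review: Faces box set", "Faces"),
         ("Upcoming gig: The Dogs d'Amour", "The Dogs d'Amour")]

# Step 4: score each item by how strongly its artist relates to the profile.
scores = defaultdict(float)
for seed, weight in profile_artists.items():
    scores[seed] += weight
    for artist, sim in similar.get(seed, []):
        scores[artist] += weight * sim

ranked = sorted(items, key=lambda it: scores[it[1]], reverse=True)
for title, artist in ranked:
    print(f"{scores[artist]:.2f}  {title}")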

To gather music related information from a FOAF profile, the system extracts the information from the FOAF interest property: if a dc:title is given, its value is used; otherwise the system gathers the text from the title tag of the HTML resource.

<foaf:interest rdf:resource="http://www.tylaandthedogsdamour.com/"
               dc:title="The Dogs d'Amour" />

Listing 8.4: Example of a FOAF interest with a given dc:title.

The system can also extract information from a user's FOAF interest that includes an artist description based on the general Music Ontology (Giasson and Raimond, 2007). Based on the music related information gathered from the user's profile and listening habits, the system detects the artists and bands that the user is interested in by issuing a SPARQL query against the artist RDF repository. Once the user's artists have been detected, artist similarity is computed. This is achieved by exploiting the RDF graph of artists' relationships (e.g. influenced by, followers of, worked with, etc.), as shown in Listing 8.2.

The system offers two ways of recommending music information. On the one hand, static recommendations are based on the favourite artists found in the FOAF profile. We assume that a FOAF profile is rarely updated or modified manually. On the other hand, dynamic recommendations are based on the user's listening habits, which are updated much more often than the profile. Following this approach, a user can discover a wide range of new music and artists on a daily basis. Once the recommended artists have been computed, Foafing the Music filters the music related information gathered by the system (see section 8.2.3) to:

• Get new music releases from iTunes, Amazon, Yahoo! Shopping, etc.,
• Download (or stream) audio from MP3 blogs and podcast sessions,
• Automatically create XSPF playlists based on audio similarity (XSPF, http://www.xspf.org/, is a playlist format based on XML syntax),
• View upcoming gigs happening near the user's location, and
• Read album reviews.

Syndication of the website content is done via an RSS 1.0 feed. For most of the previous functionalities there is a feed subscription option to get the results.

Usage data

Since its inception in August 2005, the system has had an average of 60 daily unique accesses, from more than 4,000 registered users, including casual users that try the demo option. More than half of the users automatically created an account using an external FOAF profile (most of the time, around 70%, the profile came from their LiveJournal FOAF account). Also, more than 65% of the users added their last.fm account, so the system can use their listening habits from last.fm. Figure 8.6 shows the number of logins over time, from August 2005 until July 2008. The peaks are clearly correlated with news about the project (e.g. local TV and radio interviews, and reviews on the web).

Figure 8.6: Daily accesses to Foafing the Music. The system has an average of 60 daily unique accesses, from more than 4,000 registered users and also casual users that try the demo option.
8.2.4 Summary

We have proposed a system that filters music related information based on a given user's FOAF profile and her listening habits. A system based on FOAF profiles and listening habits allows it to "understand" a user in two complementary ways: psychological factors (personality, demographic preferences, social relationships) and explicit musical preferences. In the music field, we expect that filtering information about new music releases, artists' interviews, album reviews, and so on, can improve user satisfaction, as it provides the context and information needed to back up the system's recommendations.

Describing music assets is a crucial task for a music recommender system. The success of a music recommender can depend on the accuracy and level of detail of the descriptions of the musical objects, and on their links within a user profile. For this reason, we formalise into an ontology the basic musical concepts involved in the recommendation process. Linking these musical objects with the user profile eases the recommendation process.

Furthermore, high-level musical descriptors can increase the accuracy of content retrieval, as well as provide better personalised recommendations. Thus, going one step further, it would be desirable to combine mid-level acoustic features with as much editorial and cultural metadata as possible. From this combination, more sophisticated inferences and semantic rules would be possible. These rules could derive hidden high-level metadata that is easily understood by the end user, also enhancing their profiles. Given the existence of the general Music Ontology (MO) (Giasson and Raimond, 2007), we foresee that by linking our recommendation ontology with it, and by using all the linked information available in the Web of Data (see http://linkeddata.org/), we can improve our recommender, turning it into a truly semantically enhanced music recommender.

Foafing the Music is accessible through http://foafing-the-music.iua.upf.edu.

Chapter 9

Conclusions and Further Research

Research in recommender systems is multidisciplinary. It includes several areas, such as search and filtering, data mining, personalisation, social networks, text processing, complex networks, user interaction, information visualisation, signal processing, and domain specific models, among others. Furthermore, current research in recommender systems has a strong industry impact, resulting in many practical applications.

In this thesis we focused on the central pillar of any recommender system: the similarity among objects. We proposed new approaches to evaluate the effectiveness of the recommendations in the music domain. Our goal is to promote the discovery of items via the functionality offered by recommender systems. In this sense, novelty and relevance of the recommendations are the two most important aspects. We make use of the Long Tail shape to model the popularity bias that exists in any recommender system, and use this data to recommend unknown items hidden in the tail of the popularity curve. Our experience is that by using the F(x) function to model the Long Tail curve we get more accurate results than by fitting the curve to well-known distributions, such as power-law or log-normal (Kilkki, 2007).
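For reference, a sketch of that model, assuming the three-parameter form proposed by Kilkki (2007); the exact parameterisation used in earlier chapters is not restated in this section, so the notation below is illustrative:

F(x) = \frac{\beta}{\left( N_{50}/x \right)^{\alpha} + 1}

where F(x) is the share of the total volume accounted for by the x most popular items, N_{50} is the number of items that cover half of that volume, \alpha controls how sharply the curve rises, and \beta is the share covered by the whole catalogue (close to 100%).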
We have an overwhelming number of choices about which music to listen to. As stated in Schwartz (2005), we, as consumers, often become paralysed and doubtful when facing such an overwhelming number of choices. The main problem, then, is the awareness of content in the tail, not the actual access to that content. Here is where personalised filters and recommender systems enter as part of the solution. Effective recommender systems should promote novel and relevant material (non-obvious recommendations), taken primarily from the tail of the popularity distribution.

9.1 Summary of the Research

This thesis has presented a number of novel ideas that address existing limitations in recommender systems, and the lack of systematic methods to evaluate the novelty and perceived quality of recommendations. Furthermore, two real web-based systems have been implemented to demonstrate the ideas derived from the theoretical work. The main products of the thesis are:

1. A novel user-agnostic evaluation method for recommender systems, based on the analysis of the item (or user) similarity network, and its combination with the items' popularity, using the Long Tail curve.

2. A user-centric evaluation, based on the immediate feedback on the provided recommendations, that measures the user's perceived quality and the novelty factor of the recommendations.

3. A music search engine, named Searchsounds, that allows users to discover unknown music available on music related blogs.

4. A system prototype, named FOAFing the Music, that provides music recommendations based on the user's preferences and listening habits.

The first two results are scientific, whilst the third and fourth contributions are more engineering and industry oriented.

9.1.1 Scientific contributions

A network-based evaluation method for recommender systems

We have formulated a network-based evaluation method for recommender systems, based on the analysis of the item (or user) similarity network combined with item popularity. This method has the following advantages:

1. It measures the novelty component of a recommendation algorithm.

2. It models the item popularity curve.

3. It combines the complex network analysis and the item popularity analysis to determine the underlying characteristics of the recommendation algorithm.

4. It does not require any user intervention in the evaluation process.

We have applied the network-based analysis to two different similarity graphs: one for artists and one for users. The results from the artist network analysis show that the last.fm social-based recommender tends to reinforce popular artists, at the expense of discarding less-known music. Thus, the popularity effect derived from the community of users has consequences in the recommendation network. This reveals a somewhat poor discovery ratio when just browsing through the network of similar music artists. Allmusic.com expert-based recommendations are more expensive to create, and also have a smaller Long Tail coverage, compared to automatically generated recommendations such as collaborative filtering or audio content-based similarity. Regarding popularity, the hubs in the expert network are comprised of mainstream music. Our guess is that the editors connect long tail artists with the most popular ones, either because the latter are influential or because many bands are considered followers of these mainstream artists. An audio content-based similarity network is not affected by the popularity bias of the artists; however, it is prone to the musical genre biases of the collection, where the predominant genres include most of the similar artists. The main problem of audio content-based systems is the assumption that just because two songs sound similar, any user will like both of them. It is very unlikely that a user will love both a Franz Schubert piano sonata and a Meat Loaf piano ballad (such as "Heaven Can Wait") just because both contain a prominent piano melody.
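As a rough illustration of the kind of network-centric analysis described above (not the exact procedure used in earlier chapters; the graph, playcounts and artists below are made up), one can relate how often an artist is recommended (its in-degree in the similarity network) to its position in the popularity curve:

# Illustrative sketch: relate an artist's in-degree in a similarity graph
# to its popularity rank; a strong correlation suggests a popularity bias.
import networkx as nx

# Toy directed "similar artist" graph: edge A -> B means "B is recommended from A".
G = nx.DiGraph()
G.add_edges_from([
    ("Coldplay", "Radiohead"), ("Coldplay", "Keane"),
    ("Keane", "Coldplay"), ("Travis", "Coldplay"),
    ("Randy Coleman", "Coldplay"),  # long-tail artist pointing to a hub
])

# Hypothetical total playcounts, standing in for the Long Tail popularity data.
playcounts = {"Coldplay": 5_000_000, "Radiohead": 4_000_000,
              "Keane": 900_000, "Travis": 700_000, "Randy Coleman": 3_000}

# Rank artists by popularity (rank 1 = most popular = head of the curve).
rank = {a: r for r, (a, _) in
        enumerate(sorted(playcounts.items(), key=lambda kv: -kv[1]), start=1)}

# If the most recommended artists (largest in-degree) are mostly head artists,
# the similarity network reinforces the popularity bias.
for artist in G.nodes():
    print(f"{artist:15s} in-degree={G.in_degree(artist)}  popularity rank={rank[artist]}")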
The results from the user network analysis show that the user similarity network derived from collaborative filtering resembles a social network, whilst the network derived from audio content-based similarity has the properties of a hierarchy, where a few nodes connect small clusters. The authorities in the CB network are the users who listen to the most music, independently of the quality or popularity of what they listen to. Contrastingly, the authorities in the CF network are the users who listen to the most mainstream music. These considerations have a big impact on recommendation algorithms that compute recommendations by means of user similarity and neighbourhood information.

A user-based evaluation method for recommender systems

Our proposed evaluation measures the user's perceived quality and the novelty of the recommendations. The user-centric evaluation approach has the following advantages:

1. It measures the novelty factor of a recommendation algorithm, considering the user's knowledge of the items.

2. It measures the perceived quality (e.g., like it or not) of the recommendations.

3. Users provide immediate feedback to the evaluation system, so the algorithm can adapt accordingly.

This method complements the previous, user-agnostic, network-based evaluation approach. We use the user-centric method to evaluate and compare three different music recommendation approaches. In this experiment, 288 subjects rated the recommendations in terms of novelty (i.e., does the user know the recommended song/artist?) and relevance (i.e., does the user like the recommended song?).

The results from the music recommendation survey show that, in general, users' perceived quality for novel recommendations is neutral or negative (mean rating around 3/5 or less). This emphasises the need for adding context when recommending unknown music. Recommender systems should give as many reasons as possible to support their decisions. In terms of algorithms, the rating scores for the last.fm social-based approach are higher than those for the hybrid and the pure audio content-based similarity. However, the social-based recommender generates more familiar (less novel) songs than CB and HY. Thus, content-based and hybrid approaches provide more novel recommendations, although their quality is not as good as that of the last.fm recommendations.
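As a toy illustration of how such feedback can be aggregated (the survey analysis in the thesis is more involved; the field names, users and ratings below are made up), the 1-5 relevance ratings can be split by whether the user already knew the recommended item:

# Illustrative sketch: split perceived-quality ratings by novelty.
# Each tuple: (user, artist, rating 1-5, user_already_knew_it)
feedback = [
    ("u1", "Randy Coleman", 4, False),
    ("u1", "Coldplay",      5, True),
    ("u2", "Randy Coleman", 2, False),
    ("u2", "Keane",         4, True),
]

def mean(xs):
    return sum(xs) / len(xs) if xs else float("nan")

novel    = [r for (_, _, r, known) in feedback if not known]
familiar = [r for (_, _, r, known) in feedback if known]

print(f"mean rating, novel items:    {mean(novel):.2f}")   # the novelty vs. quality trade-off
print(f"mean rating, familiar items: {mean(familiar):.2f}")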
9.1.2 Industrial contributions

FOAFing the Music: a music recommendation system

The system prototype, named FOAFing the Music, provides music recommendations based on the user's preferences and listening habits. The main goal of FOAFing the Music is to recommend, discover and explore music content via user profiling, context-based information (extracted from music related RSS feeds), and content-based descriptions (automatically extracted from the audio itself). The system has an average of 60 daily unique accesses, from more than 4,000 registered users and also casual users that try the demo option. FOAFing the Music allows users to:

1. get new music releases from iTunes, Amazon, Yahoo! Shopping, etc.,

2. download (or stream) audio from MP3 blogs and podcast sessions,

3. discover music with radio a la carte (i.e., personalised playlists),

4. view upcoming nearby concerts, and

5. read album reviews.

Given the existence of the general Music Ontology (MO) (Giasson and Raimond, 2007), we foresee that by linking our recommendation ontology with it, and by exploiting all the linked information available in the Web of Data (see http://linkeddata.org/), we can improve our system, turning it into a truly semantically enhanced music recommender.

Searchsounds: a music search engine

We have implemented a music search engine, named Searchsounds, that allows users to discover unknown music mentioned on music related blogs. Searchsounds provides keyword-based search, as well as the exploration of similar songs using audio similarity. The system allows users to dig into the Long Tail by providing music discovery based on audio content-based similarity, surfacing music that could not easily be retrieved using classic text retrieval techniques. Over 400,000 audio files are currently indexed, using both text and audio features. Ongoing work includes the automatic extraction of music related tags (e.g. guitar, rock, 70's) from the text, as well as applying autotagging to incoming audio files using audio content-based similarity (Sordo et al., 2007).

9.2 Limitations and Further Research

Dynamic versus static data

It goes without saying that there are many ways in which the work presented in this thesis could be extended or improved. One of the main limitations of our approach is that it is not dynamic. We work with a snapshot of the item (or user) similarity network, and the analysis is based on this data. However, the dynamics of the recommendation network is an important aspect of a recommender system. Users' tastes change over time, and so does the similarity among items. Further work in this area would include a detailed study of a dynamic model of the network, including trend and hype-item detection, and a comparison with the stationary model.

Domain specific

The work done has been applied only to music recommendation. Even though we did not use any domain-specific metrics in the network-centric evaluation, our findings cannot be directly extrapolated to other domains. Future work includes extending the network-centric experiments to other domains, such as movie recommendation using the Netflix dataset. Besides, the user-centric evaluation contains a lot of particularities of the music recommendation domain. In other domains (e.g., movies, books, or travel), explicit user feedback about the recommended items cannot be provided in real time. Furthermore, our music recommendation survey design is based on providing blind recommendations. Future work includes comparing our results with a new experiment that provides contextual information and transparency about the music being recommended. The related question is whether the ratings of novel items increase (i.e., the items are perceived as better quality) when more information about the recommended items is provided.

User evaluation

In our user-centric evaluation we could not classify participants into the four types of listeners (savants, enthusiasts, casuals and indifferents). In fact, it would be interesting to look at recommendation evaluations through the lens of these four types of listeners.
The type and utility of recommendations vary greatly depending on the type of user. When testing against the general population, and since most listeners fall into the casual or indifferent bucket, recommenders that appeal to these types of listeners would score well compared to recommenders that are designed for the enthusiast or the savant. However, enthusiasts and savants are likely to be much more active consumers, so from an economic point of view there may be more value in targeting them. Recommenders for savants and enthusiasts would probably favour novelty and long tail content, while recommendations for a casual listener would probably favour low-risk exploration. Indeed, a new task for music recommenders could be to help casual listeners appreciate diversity and explore unknown content.

User understanding

User understanding is another important aspect when providing personalised recommendations. Our approach to modelling a user profile is a rather simple list of preferred artists. Extending the user profile model, adding relevant and contextual information, would allow recommender systems to gain a better understanding of the user.

Ideally, a recommender system should provide different, personalised recommendations for a given item. That is, when a user visits the Beatles' White Album in the Amazon store, the system should present the list of recommendations according to the user's profile. Depending on the user's taste, the system should stress the pop side of the band, whilst in other situations it could promote the more psychedelic or experimental music they made. Ongoing work by Lamere and Maillet (2008) is aligned with this idea. They have implemented a prototype system that creates transparent, steerable recommendations: users can modify the list of recommended items by changing the seed artist's tag cloud.

Recommendations with no explanation

Blind recommendations do not provide any context or explanation. Thus, they do not help in assessing the relevance of novel recommendations. It might be the case that some of the recommended novel songs are perceived as non-relevant, but when the ties with the user profile are explained, the perceived quality could increase. In fact, why is as important as what is being recommended. Again, Lamere and Maillet (2008) is a novel example of a system that gives transparent explanations about the provided recommendations.

9.3 Outlook

We are witnessing an explosion of practical applications coming from MIR research: music identification systems, music recommenders and playlist generators, music search engines, etc. This is just the beginning (a detailed list of research MIR systems is available at http://mirsystems.info/). A few years ago, music was a key factor in taking the Internet from its text-centered origins to being a complete multimedia environment. Music might do the same for the next web generation. The "Celestial Jukebox" is about to become a reality.

Publications

2008

1. Òscar Celma and Perfecto Herrera. "A new approach to evaluating novel recommendations". In ACM Conference on Recommender Systems, Lausanne, Switzerland, 2008.

2. Òscar Celma and Pedro Cano. "From hits to niches? or how popular artists can bias music recommendation and discovery". In 2nd Workshop on Large-Scale Recommender Systems and the Netflix Prize Competition (ACM KDD), Las Vegas, USA, 2008.
3. Òscar Celma and Xavier Serra. "Foafing the music: Bridging the semantic gap in music recommendation". Web Semantics: Science, Services and Agents on the World Wide Web, 6(4):250–256, 2008.

4. Òscar Celma and Yves Raimond. "Zempod: A semantic web approach to podcasting". Journal of Web Semantics, 6(2):162–169, 2008.

5. Massimiliano Zanin, Pedro Cano, Javier M. Buldú, and Òscar Celma. "Complex networks in recommendation systems". In Proceedings of the 2nd WSEAS International Conference on Computer Engineering and Applications, Acapulco, Mexico, In Electrical And Computer Engineering, World Scientific Advanced Series, 2008.

6. Roberto García, Chrisa Tsinaraki, Òscar Celma, and Stavros Christodoulakis. "Multimedia Content Description using Semantic Web Languages", Chapter 2 (book chapter). Springer-Verlag, 2008.

7. Mohamed Sordo, Òscar Celma, Martín Blech, and Enric Guaus. "The quest for musical genres: Do the experts and the wisdom of crowds agree?" In 9th International Conference on Music Information Retrieval, Philadelphia, USA, 2008.

2007

8. Òscar Celma, Stamatia Dasiopoulou, Michael Hausenblas, Suzanne Little, Chrisa Tsinaraki, Raphael Troncy. "MPEG-7 and the Semantic Web". W3C Technical report, 2007.

9. Juyong Park, Òscar Celma, Markus Koppenberger, Pedro Cano, and Javier M. Buldú. "The social network of contemporary popular musicians". International Journal of Bifurcation and Chaos (IJBC), 17:2281–2288, 2007.

10. Raphael Troncy, Òscar Celma, Suzanne Little, Roberto García, and Chrisa Tsinaraki. "MPEG-7 based multimedia ontologies: Interoperability support or interoperability issue?" In 1st Workshop on Multimedia Annotation and Retrieval enabled by Shared Ontologies, Genova, Italy, 2007.

11. Susanne Boll, Tobias Burger, Òscar Celma, Christian Halaschek-Wiener, and Erik Mannens. "Multimedia vocabularies on the Semantic Web". W3C Technical report, 2007.

12. Mohamed Sordo, Cyril Laurier, and Òscar Celma. "Annotating music collections: how content-based similarity helps to propagate labels". In 8th International Conference on Music Information Retrieval, Vienna, Austria, 2007.

2006

13. Òscar Celma. "Foafing the music: Bridging the semantic gap in music recommendation". In 5th International Semantic Web Conference (ISWC), Athens, GA, USA, 2006.

14. Òscar Celma, Pedro Cano, and Perfecto Herrera. "Search sounds: An audio crawler focused on weblogs". In 7th International Conference on Music Information Retrieval (ISMIR), Victoria, Canada, 2006.

15. Òscar Celma, Perfecto Herrera, and Xavier Serra. "Bridging the music semantic gap". In 1st International Conference on Semantics And digital Media Technology (SAMT), Athens, Greece, 2006.

16. Pedro Cano, Òscar Celma, Markus Koppenberger, and Javier M. Buldú. "Topology of music recommendation networks". Chaos: An Interdisciplinary Journal of Nonlinear Science, 16, 2006.

17. Vegard Sandvold, Thomas Aussenac, Òscar Celma, and Perfecto Herrera. "Good vibrations: Music discovery through personal musical concepts". In 7th International Conference on Music Information Retrieval (ISMIR), Victoria, Canada, 2006.

2005

18. Òscar Celma, Miguel Ramírez, and Perfecto Herrera. "Foafing the music: A music recommendation system based on rss feeds and user preferences". In 6th International Conference on Music Information Retrieval (ISMIR), London, UK, 2005.

19. Òscar Celma, Miguel Ramírez, and Perfecto Herrera. "Getting music recommendations and filtering newsfeeds from foaf descriptions". In 1st Workshop on Scripting for the Semantic Web (co-located with the 2nd European Semantic Web Conference), Heraklion, Greece, 2005.

20. Roberto García and Òscar Celma. "Semantic integration and retrieval of multimedia metadata". In 2nd European Workshop on the Integration of Knowledge, Semantic and Digital Media, Galway, Ireland, 2005.

21. Pedro Cano, Òscar Celma, Markus Koppenberger, and Javier M. Buldú. "The topology of music artists' graphs". In XIII Congreso de Física Estadística, Madrid, Spain, 2005.

22. P. Herrera, Òscar Celma, J. Massaguer, P. Cano, E. Gómez, F. Gouyon, M. Koppenberger, D. García, J. G. Mahedero, and N. Wack. "Mucosa: a music content semantic annotator". In 6th International Conference on Music Information Retrieval (ISMIR), London, UK, 2005.

23. P. Cano, M. Koppenberger, N. Wack, J. G. Mahedero, J. Masip, Òscar Celma, D. García, E. Gómez, F. Gouyon, E. Guaus, P. Herrera, J. Massaguer, B. Ong, M. Ramírez, S. Streich, and X. Serra. "An industrial-strength content-based music recommendation system". In 28th Annual International ACM SIGIR Conference, Salvador, Brazil, 2005.

24. P. Cano, M. Koppenberger, N. Wack, J. G. Mahedero, T. Aussenac, R. Marxer, J. Masip, Òscar Celma, D. García, E. Gómez, F. Gouyon, E. Guaus, P. Herrera, J. Massaguer, B. Ong, M. Ramírez, S. Streich, and X. Serra. "Content-based music audio recommendation". In ACM Multimedia, Singapore, 2005.

25. P. Herrera, J. Bello, G. Widmer, M. Sandler, Òscar Celma, F. Vignoli, E. Pampalk, P. Cano, S. Pauws, and X. Serra. "Simac: Semantic interaction with music audio contents". In 2nd European Workshop on the Integration of Knowledge, Semantic and Digital Media Technologies, London, UK, 2005.

2004

26. Òscar Celma, Miguel Ramírez, and Perfecto Herrera. "Semantic interaction with music content using FOAF". In Proceedings of 1st Workshop on Friend of a Friend, Social Networking and the Semantic Web, Galway, Ireland, 2004.

27. Òscar Celma, E. Gómez, J. Janer, F. Gouyon, P. Herrera, and D. García. "Tools for content-based retrieval and transformation of audio using MPEG-7: the Spoffline and the MDTools". In 25th AES International Conference. Metadata for Audio, London, UK, 2004.

28. Òscar Celma and Enric Mieza. "An opera information system based on MPEG-7". In 5th International Conference on Music Information Retrieval (ISMIR), Barcelona, Spain, 2004.

29. Otto Wüst and Òscar Celma. "An MPEG-7 database system and application for content-based management and retrieval of music". In 5th International Conference on Music Information Retrieval (ISMIR), Barcelona, Spain, 2004.

30. P. Cano, M. Koppenberger, P. Herrera, Òscar Celma, and V. Tarasov. "Sound effect taxonomy management in production environments". In 25th AES International Conference. Metadata for Audio, London, UK, 2004.

Bibliography

Abowd, G. D., Dey, A. K., Brown, P. J., Davies, N., Smith, M., and Steggles, P. (1999). Towards a better understanding of context and context-awareness. In Proceedings of the 1st international symposium on Handheld and Ubiquitous Computing, pages 304–307, London, UK. Springer-Verlag.
IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749.</p><p>Anderson, C. (2006). The Long Tail. Why the future of <a href="/tags/Business/" rel="tag">business</a> is selling less of more. Hyperion Verlag.</p><p>Anderson, M., Ball, M., Boley, H., Greene, S., Howse, N., Lemire, D., and McGrath, S. (2003). Racofi: A rule-applying collaborative filtering system. In Proceedings of COLA’03. IEEE/WIC.</p><p>Anglade, A., Tiemann, M., and Vignoli, F. (2007a). Complex-network theoretic clustering for identifying groups of similar listeners in p2p systems. In Proceedings of the ACM conference on Recommender systems, pages 41–48, Minneapolis, USA. ACM.</p><p>Anglade, A., Tiemann, M., and Vignoli, F. (2007b). Virtual communities for creating shared music channels. In Proceedings of 8th International Conference on Music Information Retrieval, Vienna, Austria.</p><p>Aucouturier, J.-J. and Pachet, F. (2002). Music similarity measures: What’s the use? In Proceedings of 3rd International Conference on Music Information Retrieval, pages 157–163, Paris, France.</p><p>215 216 BIBLIOGRAPHY</p><p>Aucouturier, J.-J. and Pachet, F. (2004). Improving timbre similarity: how high’s the sky. In Journal of Negative Results in Speech and Audio Science.</p><p>Aucouturier, J.-J. and Pachet, F. (2008). A scale-free distribution of false positives for a large class of audio similarity measures. Pattern Recognition, 41(1):272–284.</p><p>Avery, C. and Zeckhauser, R. (1997). Recommender systems for evaluating computer mes- sages. Communications of the ACM, 40(3):88–89.</p><p>Baeza-Yates, R. and Ribeiro-Neto, B. (1999). Modern Information Retrieval. Addison- Wesley, first edition.</p><p>Balabanovic, M. and Shoham, Y. (1997). Fab: Content-based, collaborative recommenda- tion. volume 40, pages 66–72.</p><p>Barab´asi, A. L. and Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439):509–512.</p><p>Barab´asi, A.-L., Albert, R., Jeong, H., and Bianconi, G. (2000). Power-law distribution of the world wide web. Science, 287:2115a.</p><p>Baumann, S. and Hummel, O. (2005). Enhancing music recommendation algorithms using cultural metadata. Journal of New Music Research, 34(2).</p><p>Baumann, S., <a href="/tags/Jung_(surname)/" rel="tag">Jung</a>, B., Bassoli, A., and Wisniowski, M. (2007). Bluetuna: let your neigh- bour know what music you like. In CHI ’07 extended abstracts on Human factors in computing systems, pages 1941–1946, New York, NY, USA. ACM.</p><p>Bello, J. P., Duxbury, C., Davies, M. E., and Sandler, M. B. (2004). On the use of phase and energy for musical complex domain. In IEEE Signal Processing Letters, pages 533–556.</p><p>Bello, P. and Pickens, J. (2005). A robust mid-level representation for harmonic content in music signals. In Proceedings of 6th International Conference on Music Information Retrieval, London, UK.</p><p>Bello, P. and Sandler, M. (2003). Phase-based note onset detection for music signals. In Proceedings of IEEE ICASSP. BIBLIOGRAPHY 217</p><p>Berenzweig, A., Logan, B., Ellis, D., and Whitman, B. (2003). A large-scale evalutation of acoustic and subjective music similarity measures. In Proceedings of 4th International Symposium on Music Information Retrieval, Baltimore, Maryland.</p><p>Billsus, D. and Pazzani, M. J. (2000). User modeling for adaptive news access. User Modeling and User-Adapted Interaction, 10(2-3):147–180.</p><p>Breese, J. S., Heckerman, D., and Kadie, C. (1998). Empirical analysis of predictive algo- rithms for collaborative filtering. 
Technical report.</p><p>Burke, R. (2002). Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction, 12(4):331–370.</p><p>Cano, P., Celma, O., Koppenberger, M., and Martin-Buld´u, J. (2006). Topology of music recommendation networks. Chaos: An Interdisciplinary Journal of Nonlinear Science, 16(013107).</p><p>Cano, P., Koppenberger, M., and Wack, N. (2005). An industrial-strength content-based music recommendation system. In Proceedings of 28th International ACM SIGIR Con- ference, Salvador, Brazil.</p><p>Cataltepe, Z. Altinel, B. (2007). Music recommendation based on adaptive feature and user grouping. In 22nd International International Symposium on Computer and Information Sciences, Ankara, Turkey.</p><p>Celma, O. and Lamere, P. (2007). Music recommendation tutorial. In Proceedings of 8th International Conference on Music Information Retrieval, Vienna, Austria.</p><p>Celma, O., Ramirez, M., and Herrera, P. (2005). Foafing the music: A music recommenda- tion system based on rss feeds and user preferences. In Proceedings of 6th International Conference on Music Information Retrieval, London, UK.</p><p>Chai, W. and Vercoe, B. (2000). Using user models in music information retrieval systems. Proceedings of 1st International Conference on Music Information Retrieval.</p><p>Chen, Y.-L., <a href="/tags/Cheng_(surname)/" rel="tag">Cheng</a>, L.-C., and Chuang, C.-N. (2008). A group recommendation system with consideration of interactions among group members. Expert Syst. Appl., 34(3):2082– 2090. 218 BIBLIOGRAPHY</p><p>Chetry, N., Davies, M., and Sandler, M. (2005). Musical instrument identification using lsf and k-means. In Proc. of the 118th Convention of the AES.</p><p>Clauset, A., Shalizi, C. R., and Newman, M. E. J. (2007). Power-law distributions in empirical data. SIAM Reviews.</p><p>Claypool, M., Gokhale, A., Miranda, T., and Murnikov, P. (1999). Combining content-based and collaborative filters in an online newspape. Proceedings of ACM SIGIR Workshop on Recommender Systems.</p><p>Cunningham, S. J., Bainbridge, D., and Falconer, A. (2006). More of an Art than a Sci- ence: Supporting the creation of playlists and mixes. In Proceedings of 7th International Conference on Music Information Retrieval, pages 240–245, Victoria, Canada.</p><p>Dannenberg, R. (2005). Toward automated holistic beat tracking, music analysis, and understanding. In Proceedings of 6th International Conference on Music Information Retrieval, London, UK.</p><p>Davies, M. E. P. and Plumbley, M. D. (2004). Causal tempo tracking of audio. In Proceedings of 5th International Conference on Music Information Retrieval, Barcelona, Spain.</p><p>Dixon, S., Gouyon, F., and Widmer, G. (2004). Towards characterization of music via rhythmic patterns. In Proceedings of 5th International Conference on Music Information Retrieval, Barcelona, Spain.</p><p>D.Maltz and Ehrlich, K. (1995). Pointing the way: active collaborative filtering. In Pro- ceedings of SIGCHI conference on Human factors in computing systems, pages 202–209, New York, USA. ACM Press/Addison-Wesley Publishing Co.</p><p>Donaldson, J. (2007). Music recommendation mapping and interface based on structural network entropy. In Proceedings of 8th International Conference on Music Information Retrieval, pages 811–817, Vienna, Austria.</p><p>Elberse, A. (2008). Should you invest in the long tail? Harvard Business Review, 86(7/8):88– 96.</p><p>Elberse, A. and Oberholzer-Gee, F. (2006). 
Superstars and underdogs: An examination of the long tail phenomenon in video sales. Harvard Business School Working Paper, (07-015). BIBLIOGRAPHY 219</p><p>Ellis, D., Whitman, B., A.Berenzweig, and S.Lawrence (2002). The quest for ground truth in musical artist similarity. In Proceedings of 3rd International Symposium on Music Information Retrieval, pages 170–177, Paris.</p><p>Erd¨os, P. and R´eyi, A. (1959). On random graphs. Science, 6(290):290–298.</p><p>Firan, C. S., Nejdl, W., and Paiu, R. (2007). The benefit of using tag-based profiles. In Proceedings of the 2007 Latin American Web Conference (LA-WEB), pages 32–41, Washington, DC, USA. IEEE Computer Society.</p><p>Fleder, D. M. and Hosanagar, K. (2007). Blockbuster culture’s next rise or fall: The impact of recommender systems on sales diversity. SSRN eLibrary.</p><p>Foote, J. (1997). Content-based retrieval of music and audio. Multimedia Storage and Archiving Systems II. Proceedings of SPIE, pages 138–147.</p><p>Garcia, R. and Celma, O. (2005). Semantic integration and retrieval of multimedia <a href="/tags/Meta_(academic_company)/" rel="tag">meta</a>- data. In Proceedings of 4rd International Semantic Web Conference. Knowledge Markup and Semantic Annotation Workshop, Galway, Ireland.</p><p>Geleijnse, G. and Korst, J. (2006). Web-based artist categorization. In Proceedings of the 7th International Conference on Music Information Retrieval, pages 266 – 271, Victoria, Canada.</p><p>Geleijnse, G., Schedl, M., and Knees, P. (2007). The Quest for Ground Truth in Musical Artist Tagging in the Social Web Era. In Proceedings of the 8th International Conference on Music Information Retrieval, Vienna, Austria.</p><p>Giasson, F. and Raimond, Y. (2007). Music ontology specification. Working draft.</p><p>Gini, C. (1921). Measurement of inequality and incomes. The Economic Journal, 31:124– 126.</p><p>Golbeck, J. (2005). Computing and Applying Trust in Web-based Social Networks. PhD thesis.</p><p>Golbeck, J. and Parsia, B. (2005). Trust network-based filtering of aggregated claims. In International Journal of Metadata, Semantics and Ontologies. 220 BIBLIOGRAPHY</p><p>Goldberg, D., D, N., Oki, B. M., and Terry, D. (1992). Collaborative filtering to weave and information tapestry. Communications of the ACM, 35(12):61–70.</p><p>G´omez, E. (2006a). Tonal Description of Music Audio Signals. PhD thesis.</p><p>G´omez, E. (2006b). Tonal description of polyphonic audio for music content processing. INFORMS Journal on Computing, Special Cluster on Computation in Music, 18(3).</p><p>G´omez, E. and Herrera, P. (2004). Estimating the tonality of polyphonic audio files: Cog- nitive versus machine learning modelling strategies. Proceedings of 5th International Conference on Music Information Retrieval.</p><p>Gouyon, F. and Dixon, S. (2004). Dance music classification: A tempo-based approach. Proceedings of 5th International Conference on Music Information Retrieval.</p><p>Gouyon, F. and Dixon, S. (2005). A review of automatic rhythm description systems. Computer Music Journal, 29:34–54.</p><p>Gruber, T. R. (1993). Towards principles for the design of ontologies used for knowledge sharing. In Guarino, N. and Poli, R., editors, Formal Ontology in Conceptual Analysis and Knowledge Representation, Deventer, The Netherlands. Kluwer Academic Publishers.</p><p>Harte, C. A. and Sandler, M. (2005). Automatic chord identification using a quantised chromagram. Proc. of the 118th Convention. of the AES.</p><p>Herlocker, J. L., Konstan, J. A., and Riedl, J. (2000). 
Explaining collaborative filtering recommendations. In Proceedings of the 2000 ACM conference on Computer supported cooperative work, pages 241–250, New York, NY, USA. ACM.</p><p>Herlocker, J. L., Konstan, J. A., Terveen, L. G., and Riedl, J. T. (2004). Evaluating collaborative filtering recommender systems. ACM Trans. Inf. Syst., 22(1):5–53.</p><p>Herrera, P., Klapuri, A., and Davy, M. (2006). Automatic classification of pitched musical instrument sounds. Signal processing methods for music transcription, Springer, 29:34–54.</p><p>Herrera, P., Sandvold, V., and Gouyon, F. (2004). Percussion-related semantic descriptors of music audio files. In Proceedings of 25th International AES Conference, London, UK. BIBLIOGRAPHY 221</p><p>Hill, W., Stead, L., Rosenstein, M., and Furnas, G. (1995). Recommending and evaluating choices in a <a href="/tags/Virtual_community/" rel="tag">virtual community</a> of use. In Proceedings of SIGCHI conference on Human factors in computing systems, pages 194–201, New York, USA.</p><p>Hoashi, K., Matsumoto, K., and Inoue, N. (2003). Personalization of user profiles for content-based music retrieval based on relevance feedback. In Proceedings of eleventh ACM international conference on Multimedia, pages 110–119, New York, NY, USA. ACM Press.</p><p>Hu, X., Downie, J. S., and Ehmann, A. F. (2006). Exploiting recommended usage meta- data: Exploratory analyses. In Proceedings of 7th International Conference on Music Information Retrieval, pages 19–22, Victoria, Canada.</p><p>Jacobson, K. and Sandler, M. (2008). Musically meaningful or just noise? an analysis of on-line artist networks. In Proceedings of the 6th International Symposium on Computer Music Modeling and Retrieval.</p><p>Jennings, D. (2007). Net, Blogs and Rock ’n’ Roll: How Digital Discovery Works and What it Means for Consumers. Nicholas Brealey Publishing.</p><p>Ji, A.-T., Yeon, C., Kim, H.-N., and Jo, G. (2007). Collaborative tagging in recommender systems. In Australian Conference on Artificial Intelligence, volume 4830 of Lecture Notes in Computer Science, pages 377–386. Springer.</p><p>Karypis, G. (2001). Evaluation of item-based top-n recommendation algorithms. In Pro- ceedings of the tenth international conference on Information and knowledge management, pages 247–254, Atlanta, Georgia, USA. ACM Press.</p><p>Kazienko, P. and Musial, K. (2006). Recommendation framework for online social networks. In Advances in Web Intelligence and Data Mining, volume 23 of Studies in Computational Intelligence, pages 111–120. Springer.</p><p>Kilkki, K. (2007). A practical model for analyzing long tails. First Monday, 12(5).</p><p>Kleinberg, J. M. (2000). Navigation in a small world. Nature, 406:845.</p><p>Knees, P., Schedl, M., and Pohle, T. (2008). A Deeper Look into Web-based Classification of Music Artists. In Proceedings of 2nd Workshop on Learning the Semantics of Audio Signals, Paris, France. 222 BIBLIOGRAPHY</p><p>Knopke, I. (2004). Aroooga: An audio search engine for the world wide web. In Proceedings of 5th International Conference on Music Information Retrieval, Barcelona, Spain.</p><p>Kosala, R. and Blockeel, H. (2000). Web mining research: A survey. SIGKDD Explorations, 2:1–15.</p><p>Lambiotte, R. and Ausloos, M. (2005). Uncovering collective listening habits and music genres in bipartite networks. Physical Review E, 72:066107.</p><p>Lamere, P. and Maillet, F. (2008). Creating transparent, steerable recommendations. 
In Late–breaking Proceedings of the 8th International Conference on Music Information Re- trieval, Philadelphia, USA.</p><p>Lee, D. D. and Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755):788–791.</p><p>Leong, T. W., Vetere, F., and Howard, S. (2005). The serendipity shuffle. In Proceedings of 19th conference of the computer-human interaction special interest group, pages 1–4, Narrabundah, Australia.</p><p>Lesaffre, M., Leman, M., and Martens, J.-P. (2006). A user-oriented approach to music information retrieval. In Content-Based Retrieval.</p><p>Levy, M. and Sandler, M. (2007). A semantic space for music derived from social tags. In Proceedings of the 8th International Conference on Music Information Retrieval, Vienna, Austria.</p><p>Logan, B. (2002). Content-based playlist generation: Exploratory experiments. In Proceed- ings of 3rd International Conference on Music Information Retrieval, Paris, France.</p><p>Logan, B. (2004). Music recommendation from song sets. In Proceedings of 5th International Conference on Music Information Retrieval, Barcelona, Spain.</p><p>Logan, B. and Salomon, A. (2001). A music similarity function based on signal analysis. IEEE International Conference on Multimedia and Expo, 2001. ICME 2001, pages 745– 748.</p><p>Manjunath, B. S., Salembier, P., and Sikora, T. (2002). Introduction to MPEG 7: Multi- media Content Description Language. Ed. Wiley. BIBLIOGRAPHY 223</p><p>Martin-Buld´u, J., Cano, P., Koppenberger, M., Almendral, J., and Boccaletti, S. (2007). The complex network of musical tastes. New Journal of Physics, 9(172).</p><p>Massa, P. and Avesani, P. (2007). Trust-aware recommender systems. In RecSys ’07: Proceedings of the 2007 ACM conference on Recommender systems, pages 17–24, New York, NY, USA. ACM.</p><p>McCarthy, K., Salam´o, M., Coyle, L., McGinty, L., Smyth, B., and Nixon, P. (2006). Group recommender systems: a critiquing based approach. In Proceedings of the 11th international conference on Intelligent User Interfaces, pages 267–269, New York, NY, USA. ACM.</p><p>McEnnis, D. and Cunningham, S. J. (2007). Sociology and music recommendation systems. In Proceedings of 8th International Conference on Music Information Retrieval, Vienna, Austria.</p><p>McNee, S. M., Riedl, J., and Konstan, J. A. (2006). Being accurate is not enough: how accuracy metrics have hurt recommender systems. In Computer Human Interaction. Human factors in computing systems, pages 1097–1101, New York, NY, USA. ACM.</p><p>Meyn, S. P. and Tweedie, R. L. (1993). Markov chains and stochastic stability. Springer– Verlag.</p><p>Mobasher, B., Cooley, R., and Srivastava, J. (2000). Automatic personalization based on web usage mining. Communications of the ACM, 43(8):142–151.</p><p>Montaner, M., Lopez, B., and de la Rosa, J. L. (2003). A taxonomy of recommender agents on the internet. Artificial Intelligence Review, 19:285–330.</p><p>Newman, M. E. J. (2002). Assortative mixing in networks. Physical Review Letters, 89(20).</p><p>Newman, M. E. J. (2003a). Mixing patterns in networks. Physical Review E, 67.</p><p>Newman, M. E. J. (2003b). The structure and function of complex networks. SIAM Review, 45(2):167–256.</p><p>O’Donovan, J. and Smyth, B. (2005). Trust in recommender systems. In Proceedings of the 10th international conference on Intelligent user interfaces, pages 167–174, New York, NY, USA. ACM. 224 BIBLIOGRAPHY</p><p>Oliver, N. and Kregor-Stickles, L. (2006). 
Papa: Physiology and purpose-aware automatic playlist generation. In Proceedings of 7th International Conference on Music Information Retrieval, pages 250–253, Victoria, Canada.</p><p>Ong, B. and Herrera, P. (2005). Semantic segmentation of music audio contents. Proceedings of International Computer Music Conference.</p><p>Paatero, P. and Tapper, U. (1994). Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5(2):111– 126.</p><p>Pachet, F. (2005). Knowledge Management and Musical Metadata. Idea Group.</p><p>Pachet, F., Westermann, G., and Laigre, D. (2001). Musical data mining for electronic music distribution.</p><p>Pampalk, E. (2006). Computational Models of Music Similarity and their Application to Music Information Retrieval. PhD thesis.</p><p>Pampalk, E. and Gasser, M. (2006). An implementation of a simple playlist generator based on audio similarity measures and user feedback. In Proceedings of 7th International Conference on Music Information Retrieval, pages 389–390, Victoria, Canada.</p><p>Pampalk, E. and Goto, M. (2007). Musicsun: A new approach to artist recommendation. In Proceedings of 8th International Conference on Music Information Retrieval, Vienna, Austria.</p><p>Pampalk, E., Pohle, T., and Widmer, G. (2005). Dynamic playlist generation based on skipping behavior. In Proceedings of 6th International Conference on Music Information Retrieval, London, UK.</p><p>Park, J., Celma, O., Koppenberger, M., Cano, P., and Martin-Buld´u, J. (2007). The social network of contemporary popular musicians. International Journal of Bifurcation and Chaos, 17(7):2281–2288.</p><p>Pauws, S. and Eggen, B. (2002). Pats: Realization and user evaluation of an automatic playlist generator. In Proceedings of 3rd International Conference on Music Information Retrieval, Paris, France. BIBLIOGRAPHY 225</p><p>Pauws, S. and van de Wijdeven, S. (2005). User evaluation of a new interactive playlist generation concept. In Proceedings of 6th International Conference on Music Information Retrieval, pages 638–643, London, UK.</p><p>Pauws, S., Verhaegh, W., and Vossen, M. (2006). Fast generation of optimal music playlists using local search. In Proceedings of 7th International Conference on Music Information Retrieval, pages 138–143, Victoria, Canada.</p><p>Pazzani, M. J. (1999). A framework for collaborative, content-based and demographic filtering. In Artificial Intelligence Review, volume Vol. 13, Numbers 5-6, pages 393–408.</p><p>Perik, E., de Ruyter, B., Markopoulos, P., and Eggen, B. (2004). The sensitivities of user profile information in music recommender systems. In Proceedings of Private, Security, Trust.</p><p>Pickens, J., Bello, J. P., Monti, G., Crawford, T., Dovey, M., Sandler, M., and Byrd, D. (2002). Polyphonic score retrieval using polyphonic audio queries: A harmonic modelling approach. Proceedings of 3rd International Conference on Music Information Retrieval, pages 140–149.</p><p>Pohle, T., Knees, P., Schedl, M., and Widmer, G. (2007). Building an Interactive Next- Generation Artist Recommender Based on Automatically Derived High-Level Concepts. In Proceedings of the 5th International Workshop on Content-Based Multimedia Indexing, Bordeaux, France.</p><p>Popescul, A., Ungar, L., Pennock, D., and Lawrence, S. (2001). Probabilistic models for unified collaborative and content-based recommendation in sparse-data environments. 
In 17th Conference on Uncertainty in Artificial Intelligence, pages 437–444, Seattle, Wash- ington.</p><p>Porter, M. F. (1980). An algorithm for suffix stripping. In Program 14, pages 130–137.</p><p>Ravasz, E. and Barab´asi, A. L. (2003). Hierarchical organization in complex networks. Phys Rev E Stat Nonlin Soft Matter Phys, 67(2 Pt 2).</p><p>Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N., and Barab´asi, A. L. (2002). Hierarchical organization of modularity in metabolic networks. Science, 297(5586):1551– 5. 226 BIBLIOGRAPHY</p><p>Resnick, P., Iacovou, N., Suchak, M., Bergstorm, P., and Riedl, J. (1994). Grouplens: An open architecture for collaborative filtering of netnews. In Proceedings of ACM 1994 Conference on Computer Supported Cooperative Work, pages 175–186. ACM, ACM Press.</p><p>Resnick, P. and Varian, H. R. (1997). Recommender systems. Communications of the ACM, 40(3):56–58.</p><p>Rich, E. (1979). User modeling via stereotypes. In Cognitive Science: A Multidisciplinary Journal, volume Vol. 3, No. 4, pages 329–354.</p><p>Rocchio, J. J. (1971). Relevance feedback in information retrieval. In Salton, G., edi- tor, The SMART Retrieval System: Experiments in Automatic Document Processing, Prentice-Hall Series in Automatic Computation, chapter 14, pages 313–323. Prentice- Hall, Englewood Cliffs NJ.</p><p>Salganik, M. J., Dodds, P. S., and Watts, D. J. (2006). Experimental study of inequality and unpredictability in an artificial cultural market. Science, 311(5762):854–856.</p><p>Salton, G. and McGill, M. J. (1986). Introduction to Modern Information Retrieval. McGraw-Hill, Inc., New York, NY, USA.</p><p>Sandvold, V., Aussenac, T., Celma, O., and Herrera, P. (2006). Good vibrations: Music dis- covery through personal musical concepts. In Proceedings of 7th International Conference on Music Information Retrieval, Victoria, Canada.</p><p>Sandvold, V. and Herrera, P. (2004). Towards a semantic descriptor of subjective intensity in music. In Proceedings of 5th International Conference on Music Information Retrieval, Barcelona, Spain.</p><p>Sarwar, B., Karypis, G., Konstan, J., and Reidl, J. (2001). Item-based collaborative filtering recommendation algorithms. In WWW’01: Proceedings of 10th International Conference on World Wide Web, pages 285–295.</p><p>Schedl, M., Knees, P., Pohle, T., and Widmer, G. (2008). Towards an automatically gen- erated music information system via web content mining. In Proceedings of the 30th European Conference on Information Retrieval (ECIR’08), Glasgow, Scotland. BIBLIOGRAPHY 227</p><p>Schedl, M., Knees, P., and Widmer, G. (2005a). Improving prototypical artist detection by penalizing exorbitant popularity. In Proceedings of 3rd International Symposium on Computer Music Modeling and Retrieval, pages 196–200.</p><p>Schedl, M., Knees, P., and Widmer, G. (2005b). A web-based approach to assessing artist similarity using co-occurrences. In Proceedings of 4th International Workshop on Content- Based Multimedia Indexing (CBMI’05).</p><p>Schwartz, B. (2005). The Paradox of Choice: Why More Is Less. Harper Perennial.</p><p>Shani, G., Brafman, R. I., and Heckerman, D. (2002). An mdp-based recommender system. In Journal of Machine Learning Research, pages 453–460. Morgan Kaufmann.</p><p>Shardanand, U. (1994). Social information filtering for music recommendation. Master’s thesis, Massachussets Institute of Technology.</p><p>Shardanand, U. and Maes, P. (1995). Social information filtering: Algorithms for automaing “word of mouth”. 
In Proceedings of CHI’95.</p><p>Sinha, R. and Swearingen, K. (2002). The role of transparency in recommender systems. In CHI ’02 extended abstracts on Human factors in computing systems, pages 830–831, New York, NY, USA. ACM.</p><p>Slaney, M. and White, W. (2006). Measuring playlist diversity for recommendation systems. In Proceedings of the 1st ACM workshop on Audio and music computing multimedia, pages 77–82, New York, NY, USA. ACM.</p><p>Slee, T. (2006). A critical reader’s companion to the long tail.</p><p>Sordo, M., Celma, O., Blech, M., and Guaus, E. (2008). The quest for musical genres: Do the experts and the wisdom of crowds agree? Philadelphia, USA.</p><p>Sordo, M., Laurier, C., and Celma, O. (2007). Annotating music collections how content- based similarity helps to propagate labels. Vienna, Austria.</p><p>Sotiropoulos, D. N., Lampropoulos, A. S., and Tsihrintzis, G. A. (2007). Evaluation of modeling music similarity perception via feature subset selection. In User Modeling, volume 4511 of Lecture Notes in Computer Science, pages 288–297. Springer. 228 BIBLIOGRAPHY</p><p>Soundscan, N. (2006). Year–end music industry report.</p><p>Soundscan, N. (2007). State of the industry. National Association of Recording Merchan- disers.</p><p>Swearingen, K. and Sinha, R. (2001). Beyond algorithms: An hci perspective on recom- mender systems. In ACM SIGIR. Workshop on Recommender Systems, volume Vol. 13, Numbers 5-6, pages 393–408.</p><p>Symeonidis, P., Ruxanda, M., Nanopoulos, A., and Manolopoulos, Y. (2008). Ternary semantic analysis of social tags for personalized music recommendation. In Proceedings of 9th International Conference on Music Information Retrieval, Philadelphia, USA.</p><p>Tak´acs, G., Pil´aszy, I., N´emeth, B., and Tikk, D. (2008). Investigation of various matrix factorization methods for large recommender systems. In Proceedings of the 2nd KDD Workshop on Large Scale Recommender Systems and the Netflix Prize Competition.</p><p>Tiemann, M. and Pauws, S. (2007). Towards ensemble learning for hybrid music recommen- dation. In Proceedings of 8th International Conference on Music Information Retrieval, Vienna, Austria.</p><p>Tintarev, N. and Masthoff, J. (2007). Effective explanations of recommendations: user- centered design. In Proceedings of the 2007 ACM conference on Recommender systems, pages 153–156, Minneapolis, MN, USA. ACM.</p><p>Tsinaraki, C. and Christodoulakis, S. (2005). Semantic user preference descriptions in mpeg-7/21.</p><p>Tso-Sutter, K. H. L., Marinho, L. B., and Schmidt-Thieme, L. (2008). Tag-aware recom- mender systems by fusion of collaborative filtering algorithms. In Proceedings of the 2008 ACM symposium on Applied computing, pages 1995–1999, New York, NY, USA. ACM.</p><p>Tucker, C. and Zhang, J. (2008). How does popularity information affect choices? theory and a field experiment. SSRN eLibrary.</p><p>Turnbull, D., Barrington, L., and Lanckriet, G. (2008). Five approaches to collecting tags for music. In Proceedings of the 9th International Conference on Music Information Retrieval, pages 225–230, Philadelphia, USA. BIBLIOGRAPHY 229</p><p>Tzanetakis, G. (2002). Manipulation, analysis and retrieval systems for audio signals. PhD thesis.</p><p>Uitdenbogerd, A. and van Schnydel, R. (2002). A review of factors affecting music recom- mender success. In Proceedings of 3rd International Conference on Music Information Retrieval, Paris, France. van Gulik, R. and Vignoli, F. (2005). Visual playlist generation on the artist map. 
Vembu, S. and Baumann, S. (2004). A self-organizing map based knowledge discovery for music recommendation systems. In Computer Music Modeling and Retrieval, Esbjerg, Denmark.
Vignoli, F. and Pauws, S. (2005). A music retrieval system based on user-driven similarity and its evaluation. In Proceedings of the 6th International Conference on Music Information Retrieval, pages 272–279, London, UK.
Vuong, Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica, 57(2):307–333.
Watts, D. J. and Strogatz, S. H. (1998). Collective dynamics of 'small-world' networks. Nature, 393(6684):440–442.
Webb, G. and Kuzmycz, M. (1996). Feature based modelling: A methodology for producing coherent, consistent, dynamically changing models of agents' competencies. User Modeling and User-Adapted Interaction, pages 117–150.
Weng, L.-T., Xu, Y., Li, Y., and Nayak, R. (2007). Improving recommendation novelty based on topic taxonomy. In Proceedings of the IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology, pages 115–118, Washington, DC, USA. IEEE Computer Society.
Whitman, B. (2003). Semantic rank reduction of music audio. In Proceedings of the 2003 Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pages 135–138.
Whitman, B. and Lawrence, S. (2002). Inferring descriptions and similarity for music from community metadata. In Proceedings of the International Computer Music Conference, Göteborg, Sweden.
Xu, Y., Zhang, L., and Liu, W. (2006). Cubic analysis of social bookmarking for personalized recommendation. Frontiers of WWW Research and Development, pages 733–738.
Yang, Y. and Li, J. Z. (2005). Interest-based recommendation in digital library. Journal of Computer Science, 1(1):40–46.
Yao, Y. Y. (1995). Measuring retrieval effectiveness based on user preference of documents. Journal of the American Society for Information Science, 46(2):133–145.
Yoshii, K., Goto, M., Komatani, K., Ogata, T., and Okuno, H. G. (2006). Hybrid collaborative and content-based music recommendation using probabilistic model with latent user preferences. In Proceedings of 7th International Conference on Music Information Retrieval, pages 296–301, Victoria, Canada.
Yoshii, K., Goto, M., Komatani, K., Ogata, T., and Okuno, H. G. (2007). Improving efficiency and scalability of model-based music recommender system based on incremental training. In Proceedings of 8th International Conference on Music Information Retrieval, Vienna, Austria.
Yoshii, K., Goto, M., Komatani, K., Ogata, T., and Okuno, H. G. (2008). An efficient hybrid music recommender system using an incrementally trainable probabilistic generative model. IEEE Transactions on Audio, Speech, and Language Processing, 16(2):435–447.
Yoshii, K., Goto, M., and Okuno, H. G. (2004). Automatic drum sound description for real-world music using template adaptation and matching methods. In Proceedings of 5th International Conference on Music Information Retrieval, Barcelona, Spain.
Zadel, M. and Fujinaga, I. (2004). Web services for music information retrieval. In Proceedings of 5th International Conference on Music Information Retrieval, Barcelona, Spain.
Zhang, Y., Callan, J., and Minka, T. (2002). Novelty and redundancy detection in adaptive filtering. In Proceedings of the 25th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 81–88, New York, NY, USA. ACM.
Ziegler, C.-N., McNee, S. M., Konstan, J. A., and Lausen, G. (2005). Improving recommendation lists through topic diversification. In Proceedings of the 14th International Conference on World Wide Web, pages 22–32, New York, NY, USA. ACM.
Zils, A. and Pachet, F. (2003). Extracting automatically the perceived intensity of music titles. In Proceedings of the 6th International Conference on Digital Audio Effects.