Distributed Computing

Identifying music and inferring similarity

Bachelor's Thesis

Tobias Schlüter
[email protected]

Distributed Computing Group
Computer Engineering and Networks Laboratory
ETH Zürich

Supervisors:
Samuel Welten
Prof. Dr. Roger Wattenhofer

December 24, 2012

Acknowledgements

I would like to thank Samuel Welten, not only for offering this interesting and challenging project but also for supporting me actively and passionately. I would also like to thank all proofreaders for their useful comments, as well as the people behind Coursera for providing online lectures about machine learning, neural networks and scientific writing.

Abstract

Downloading and storing large amounts of music has become easier thanks to faster internet connections and cheaper storage. Finding the desired music in a large collection still poses a challenge, which intelligent music players try to solve: instead of organizing music in a hierarchical folder structure, they present it in a more intuitive way using similarity relations between artists and tracks. We provide the foundation that intelligent music players can use to organize the user's music collection. We infer the similarity relations between artists from collected data about users' music taste. Our system unambiguously identifies artists and transforms the collected music taste data into an intermediate representation that we use to embed artists into a Euclidean space where similar artists are nearby. In an experimental study we evaluate our