An Exploration of Data Mining Techniques to Predict Mood Based On
Total Page:16
File Type:pdf, Size:1020Kb
The College of Wooster Libraries Open Works Senior Independent Study Theses 2013 Let's Get in the Mood: An Exploration of Data Mining Techniques to Predict Mood Based on Musical Properties of Songs Sarah Smith-Polderman The College of Wooster, [email protected] Follow this and additional works at: https://openworks.wooster.edu/independentstudy Part of the Applied Mathematics Commons Recommended Citation Smith-Polderman, Sarah, "Let's Get in the Mood: An Exploration of Data Mining Techniques to Predict Mood Based on Musical Properties of Songs" (2013). Senior Independent Study Theses. Paper 4933. https://openworks.wooster.edu/independentstudy/4933 This Senior Independent Study Thesis Exemplar is brought to you by Open Works, a service of The oC llege of Wooster Libraries. It has been accepted for inclusion in Senior Independent Study Theses by an authorized administrator of Open Works. For more information, please contact [email protected]. © Copyright 2013 Sarah Smith-Polderman Let’s Get In The Mood: an exploration and implementation of data mining techniques to predict mood based on musical properties of songs Independent Study Thesis Presented in Partial Fulfillment of the Requirements for the Degree Bachelor of Arts in the Department of Mathematics at The College of Wooster by Sarah Smith-Polderman The College of Wooster 2013 Advised by: Dr. James Hartman (Mathematics) Abstract This thesis explores the possibility of predicting the mood a song will evoke in a person based on certain musical properties that the song exhibits. First, I introduce the topic of data mining and establish its significant relevance in this day and age. Next, I explore the several tasks that data mining can accomplish, and I identify classification and clustering as the two most relevant tasks for mood prediction based on musical properties of songs. Chapter 3 introduces in detail two specific classification techniques: Naive Bayes Classification and k-Nearest Neighbor Classification. Similarly, Chapter 4 introduces two specific clustering techniques: k-Means Clustering and k-Modes Clustering. Next, Chapter 5 implements these previously discussed classification and clustering techniques on a data set involving musical property combinations and mood, and makes conclusions about which musical property combinations will most likely evoke a happy, sad, calm, or agitated mood. It can be concluded very generally that songs that are in major keys, have consonant harmonies, faster tempos, and firm rhythms will most likely evoke a happy mood in a person. Songs that are in minor keys, have slower tempos, and firm rhythms will probably evoke a sad mood in a person. Songs that are in major keys and have flowing rhythms tend to evoke a calm mood in a person. And last but not least, songs that are in minor keys, have dissonant harmonies, and firm rhythms will evoke an agitated mood in a person. v This work is dedicated to those who have a deep passion for math and music, and a yearning to learn more about how the two can be intertwined to produce both interesting and relevant knowledge that, if used productively, could potentially have a significant and positive impact on our everyday lives. vii Acknowledgments I would like to acknowledge Dr. Peter Mowrey for his helpful insight on how musical parameters are linked to emotional response. Special thanks to my advisor, Dr. James Hartman, for the countless hours he spent with me writing code and his intuition on the path to take for my project. Thank you to my great friend, Ms. Julia Land, for taking the time to help identify the musical properties of the songs used in this study. And furthermore, I would like to thank all of my friends and family for the endless support that they have provided me throughout this year-long journey. And last but not least, I would like to acknowledge Gault 3rd floor. ix Contents Abstractv Dedication vii Acknowledgments ix Contents xi CHAPTER PAGE 1 Let’s Get In The Mood1 2 Introduction to Data Mining 11 2.1 The Need for Data Mining . 11 2.2 Description . 12 2.3 Estimation . 13 2.4 Prediction . 14 2.5 Classification . 15 2.6 Clustering . 16 2.7 Association . 17 2.8 Which tasks are most relevant? . 18 3 Classification 21 3.1 Naive Bayes Classification . 21 3.1.1 Example: Naive Bayes Classification . 25 3.2 k-Nearest Neighbor Classification . 43 3.2.1 Example: k-Nearest Neighbor Classification . 48 4 Clustering 55 4.1 k-Means Clustering . 56 4.1.1 Example: k-Means . 57 4.2 k-Modes Clustering . 61 4.2.1 Example: k-Modes . 67 5 Application to Songs and Mood 73 5.1 Predicting Moods Using Classification Techniques . 79 5.1.1 Naive Bayes Classification . 80 5.1.2 k-Nearest Neighbor Classification . 95 5.2 Predicting Moods Using Clustering Techniques . 108 5.2.1 k-Means Clustering . 110 xi 5.2.2 k-Modes Clustering . 121 6 Limitations and Extensions 139 6.1 Limitations . 139 6.2 Improvements and Extensions . 147 APPENDIX PAGE A Maple Code and Music/Mood Sheet 151 B Musical Property and Mood Training Data Set 165 C Musical Property and Mood Training Data Set Continued 177 D Musical Property and Mood Test Data Set 187 References 197 xii CHAPTER 1 Let’s Get In The Mood Imagine, just for a moment, that you are listening to your all-time favorite song. Who is the composer or artist? Does the song remind you of anyone? When do you listen to this song? Are you usually with others, or alone while listening to it? Do you have a favorite place where you like to listen to it? Do you prefer to play this song loudly or quietly? If it were really playing would you sing along, pretend to play the chords on an instrument, move to the rhythm, or simply listen? Can you pinpoint exactly why this is your favorite song? And, most importantly, how does this song make you feel? Regardless of your answers to the questions above, it is undeniable that music holds a very special place in our society, and can be connected to almost every aspect of our lives. And because of this, it is no wonder that music seems to always have been, and continues to be, one of the most common forms of leisurely activity to engage in. It is rare to encounter someone who does not enjoy listening to music or creating music in some way. Furthermore, with the recent advancements in technology, obtaining music is easier now that it has ever been before. In Rentfrow and Gosling’s article entitled The Do Re Mi’s of Everday Life: The Structure and Personality Correlates of Music Preference, they state, “...the fact that our participants reported listening to music more often than any other activity across 1 2 1. Let’s Get In The Mood a wide variety of contexts confirms that music plays an integral role in people’s everyday lives,” [15]. This statement is indisputable. People are awakened every morning by a musical tune. People listen to music in the car on the way to work, while they exercise at the gym, while cooking dinner or while doing homework. We encounter music in schools, in the mall, in church, and in restaurants. Parents sing songs to their children to help them fall asleep. People can listen to music loudly–so that everyone can hear it–or can listen to music privately, simply by putting on headphones. We can listen to music alone in our own homes or with thousands of others in a concert hall. The number of ways in which people can engage in musical activity every day is practically infinite. As we all know, along with the numerous ways that music is embedded in our daily lives, there are several different types of music. The most common way to distinguish between these differences is to label an artist or song as belonging to a specific genre of music, such as classical, popular, folk, or hip-hop, just to name a few. And inevitably so, as people become more familiar with different types of music, some tend to develop a certain musical preference. One can only imagine the wide variety of reasons why people have different musical preferences. Some may prefer songs about certain things, songs with a given tempo, or songs with specific instruments. I bet, however, that most people have a hard time putting into words exactly why they prefer one type of music over another. With the wide range of music available to us, along with the significantly larger number of people in our society, one might easily think that there would be a substantial number of musical preferences. However, from a study involving 1,704 undergraduate students, Rentfrow and Gosling conclude that there are only 4 music preference dimensions: Reflective and Complex, Intense and Rebellious, Upbeat and Conventional, and Energetic and Rhythmic, [15]. Some might argue that most, or even all, genres could fit neatly under these umbrella music preference dimensions. 3 I argue, however, that believing that is a very simplistic way of viewing musical genres and these proposed music preference dimensions. It is not the case that every pop song would be classified as upbeat and conventional, for example. Thinking more realistically, we can see that within genres there are a wide range of songs that are very different, and without question, give a different feel. In response to what a person’s favorite kind of music to listen to is, it is common to hear, “It depends on what I am in the mood for.” This statement is crucial, as it seems to imply two things.