Deep Neural Models for Jazz Improvisations
Deep Neural Models for Jazz Improvisations

Shunit Haviv Hakimi

Technion - Computer Science Department - M.Sc. Thesis MSC-2021-03 - 2021

Research Thesis

Submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science

Submitted to the Senate of the Technion — Israel Institute of Technology
Kislev 5781, Haifa, November 2020

This research was carried out under the supervision of Prof. Ran El-Yaniv, in the Faculty of Computer Science. Some results in this thesis have been published as an article by the author and research collaborators in a conference during the course of the author's research period, the most up-to-date version of which is:

Shunit Haviv Hakimi, Nadav Bhonker, and Ran El-Yaniv. BebopNet: Deep neural models for personalized jazz improvisations. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR 2020), Best Research Award, 2020.

The generous financial help of the Technion is gratefully acknowledged.

Contents

List of Figures
Abstract
1 Introduction
2 BebopNet: Deep Neural Models for Personalized Jazz Improvisations
  2.1 Introduction
  2.2 Related Work
  2.3 Problem Statement
  2.4 Methods
    2.4.1 BebopNet: Jazz Model Learning
    2.4.2 User Preference Elicitation
    2.4.3 User Preference Metric Learning
    2.4.4 Optimized Music Generation
  2.5 Experiments
    2.5.1 Harmonic Coherence
    2.5.2 Analyzing Personalized Models
  2.6 Plagiarism Analysis
3 Musical Features
  3.1 Introduction
  3.2 Musical Feature List
    3.2.1 Number of notes
    3.2.2 Duration entropy
    3.2.3 Pentatonic scale
    3.2.4 Blues scale
    3.2.5 Syncopation
    3.2.6 Chord note on a downbeat
    3.2.7 Chromatic before a chord note on a downbeat
    3.2.8 Chord match
    3.2.9 Scale match
    3.2.10 Altered scale match
    3.2.11 Tension
    3.2.12 Intervals
    3.2.13 Duration repetition within a whole measure
    3.2.14 Duration repetition within a half measure
    3.2.15 Pitch repetition
  3.3 Dataset Feature Analysis
  3.4 Feature-Guided Jazz Improvisation Generation
4 Active Learning
  4.1 Introduction
  4.2 Related Work
  4.3 Methods
  4.4 Experiments and Results
    4.4.1 Active Learning vs. Passive Learning: Predefined Target Feature Vector
    4.4.2 Active Learning vs. Passive Learning: Human-Labeled Dataset
  4.5 Conclusions
5 Reinforcement Learning
  5.1 Introduction
  5.2 Related Work
    5.2.1 Deep Q-Learning
    5.2.2 Policy Gradient – A2C
  5.3 Reinforcement Learning with BebopNet
    5.3.1 Jazz Environment
    5.3.2 Reward Function
  5.4 Experiments and Results
    5.4.1 User Preference Reward
6 Conclusions and Open Questions
7 Appendix: BebopNet – Personalized Improvisation Generation
  7.1 Implementation on GitHub
  7.2 Music Samples
  7.3 Methods
    7.3.1 Dataset Details
    7.3.2 Musical Preference Labeling System
    7.3.3 Beam Search
  7.4 Experiments
    7.4.1 Hyper-parameter Search
    7.4.2 Harmonic Guided Generation
    7.4.3 Per-User Personalization
    7.4.4 Plagiarism Analysis
  7.5 Jazz Dataset (XML files)
Hebrew Abstract

List of Figures

1.1 A short excerpt generated by BebopNet.
2.1 An example of a measure in music notation and its vector representation. Integers are converted to one-hot representations.
2.2 The BebopNet architecture for next-note prediction. Each note is represented by concatenating the embeddings of the pitch (red bar), the duration (purple bar), and the four pitches comprising the current chord (green bars).
The output of the LSTM is passed to two heads (orange bars), one the size of the pitch embedding (top) and the other the size of the duration embedding (bottom).
2.3 (2.3i) Predictions of the preference model on sequences from a validation set. Green: sequences labeled with a positive score (yτ > 0); yellow: neutral (yτ = 0); red: negative (yτ < 0). The blue vertical lines indicate the thresholds β1, β2 used for selective prediction. (2.3ii) Risk-coverage plot for the predictions of the preference model. β1, β2 (green lines) are defined to be the thresholds that yield a minimum error on the contour of 25% coverage.
3.1 Feature distribution over the dataset (histogram (bars), estimated distribution (lines)).
4.1 Active Learning vs. Passive Learning convergence checks: 1-hot feature target preferred. The graphs are an average of 10 runs, with a 95% confidence interval.
4.2 Active Learning vs. Passive Learning convergence checks: 5-hot feature target preferred. The graphs are an average of 10 runs, with a 95% confidence interval.
4.3 Active Learning vs. Passive Learning convergence checks: manually labeled dataset.
5.1 The loss of the pre-trained LSTM model (BebopNet for RL).
5.2 The accuracy of the pitch of the estimated next note (by BebopNet for RL).
5.3 The accuracy of the duration of the estimated next note (by BebopNet for RL).
5.4 The complete model architecture we used. (model) contains the pre-trained LSTM, (rnn) is the additional LSTM, and (actor_base), (actor_pitch), (actor_dur), (critic), and (critic_linear) are feed-forward layers dedicated to the A2C predictions.
5.5 Example of the convergence of the reward during the training process.
7.1 Digital CRDI controlled by a user to provide continuous preference feedback.
7.2 User 1: (7.2i) Predictions of the preference model on sequences from a validation set. Green: sequences labeled with a positive score (yτ > 0); yellow: neutral (yτ = 0); red: negative (yτ < 0). Blue vertical lines indicate the thresholds β1, β2 used for selective prediction. (7.2ii) Risk-coverage plot for the predictions of the preference model. β1, β2 (green lines) are defined to be the thresholds that yield minimum error on the contour of 25% coverage.
7.3 User 2: (7.3i) Predictions of the preference model on sequences from a validation set. Green: sequences labeled with a positive score (yτ > 0); yellow: neutral (yτ = 0); red: negative (yτ < 0). Blue vertical lines indicate the thresholds β1, β2 used for selective prediction. (7.3ii) Risk-coverage plot for the predictions of the preference model. β1, β2 (green lines) are defined to be the thresholds that yield minimum error on the contour of 25% coverage.
7.4 User 3: (7.4i) Predictions of the preference model on sequences from a validation set. Green: sequences labeled with a positive score (yτ > 0); yellow: neutral (yτ = 0); red: negative (yτ < 0). Blue vertical lines indicate the thresholds β1, β2 used for selective prediction. (7.4ii) Risk-coverage plot for the predictions of the preference model. β1, β2 (green lines) are defined to be the thresholds that yield minimum error on the contour of 25% coverage.
7.5 User 4: (7.5i) Predictions of the preference model on sequences from a validation