Classification of Carnatic Thumbnails Using CNN-RNN Models
Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering by Research

by

Amulya Sri Pulijala
201450827
[email protected]

International Institute of Information Technology
Hyderabad - 500 032, India
May 2021

Copyright © Amulya Sri Pulijala, 2021
All Rights Reserved

International Institute of Information Technology
Hyderabad, India

CERTIFICATE

It is certified that the work contained in this thesis, titled “Classification of Carnatic Thumbnails using CNN-RNN Models” by Amulya Sri Pulijala, has been carried out under my supervision and is not submitted elsewhere for a degree.

Date
Adviser: Dr. Suryakanth V Gangashetty

To my Mom and Grandparents

Acknowledgements

om ajñāna-timirāndhasya jñānāñjana-śalākayā
chaksur unmīlitam yena tasmai śrī-gurave namah

Translation: I offer obeisance unto Śrī Guru, who has opened my eyes, which were blinded by the cataract of ignorance, with the collyrium of knowledge.

I would like to thank my supervisor Dr. Suryakanth V Gangashetty for his support and care throughout my master's journey. I cannot thank him enough for his unreserved backing and for bearing with me even though I took so much time over every single task. I have huge gratitude and respect for his simplicity and timely help, and I admire his patience in guiding me.

I would like to thank Dr. Venkatesh Choppella, who inspired me to pursue a master's degree. I was fortunate to attend the classes of Prof. Yegnanarayana Bayya, a phenomenal teacher who changed the way I perceive research. I am grateful to Dr. Sakthi Balan and Aditya, whose interactions were always encouraging and fruitful. On a personal note, I would like to thank my peers and friends, who made my life easy at each and every step.
I would like to extend my heartfelt gratitude to Ramakrishna Sir, CVRS Sastry Sir, and Narayan Rao Sir from the Indian Space Research Organisation (ISRO), who constantly encouraged me to pursue research. I would like to thank the management of ISRO for giving me permission to complete my master's degree.

This thesis would not have been possible without the support of my mother Lakshmi, grandparents BSSSG Krishna Murthy and Saroja, husband Phani Mahesh and little daughter Vnnela. I extend my heartfelt thanks to my mother-in-law Kameswari for her support. Finally, I would like to thank aunt Prameela Vani, Uncle Jayababu and sisters Kavitha and Sahithy for their continuous support and encouragement. I am also grateful to Uncle Phani, aunt Suneetha and brothers Manu and Abhi for cheering me up whenever I felt low. Without their help, pursuing a master's degree would have remained a dream. This family is a unique gift that I am blessed with, and I am always grateful for their cooperation in every phase of my life.

At last, I would like to thank the Lord Almighty and my spiritual guide, who taught me the 'Bhagavad Gita', which is the main reason for what I am today. I bow my head to his compassionate gesture of teaching me the value and purpose of life.

Amulya Sri Pulijala

Abstract

Are repetitive parts representative too in music? Music signal processing is a sub-branch of signal processing and a promising area of research. Music analysis based on signal processing techniques has paved a new way of generating and analysing music. Music analysis and recognition of various swaras and ragas is inherent to human understanding. Music signal processing includes various areas of research, such as synthesis of music, transcription, classification, Music Information Retrieval, raga classification, tala classification, instrument/voice identification, audio matching, source separation, tonic identification, intonation/melodic/rhythmic analysis, and music emotion recognition.
The concept of Raga and Tala is an integral part of Indian Classical music. Raga is the melodic component, while Tala is the rhythmic component of the music. Hence, classification and identification of Raga and Tala is a paramount problem in Music Information Retrieval (MIR) systems. Although there are seven basic Talas in Carnatic music, further subdividing them gives a total of 175 talas. There are 72 melakartha ragas and more than a thousand janya ragas. Statistical and machine learning approaches have been proposed in the literature to classify Ragas and Talas. However, they use the complete musical recording for training and testing. In this thesis, a novel approach is proposed, for the first time in Carnatic music, to classify Carnatic music recordings using repetitive structures called Thumbnails. We propose parallel CNN-RNN models to classify Ragas and Talas in Carnatic music using 'Thumbnails'.

Keywords: Music Signal Processing, Music Information Retrieval, Carnatic Music, Hindustani Music, Raga Classification, Tala Classification, Melody/Intonation analysis, Source Separation, Audio Thumbnails, SVM, CNN, RNN

Contents

Abstract
1 Introduction to Indian Classical Art and Audio Thumbnailing
  1.1 Indian Art Music Traditions
    1.1.1 Swara
    1.1.2 Raga
    1.1.3 Tala
      1.1.3.1 Tala Schemes in Carnatic Music
      1.1.3.2 Saptha Tala System
    1.1.4 Tonic
    1.1.5 Carnatic Concert
    1.1.6 Classification of Indian Musical Instruments
  1.2 Machine Learning and Neural Networks
    1.2.1 Programming Vs Learning
    1.2.2 Supervised Learning
    1.2.3 Unsupervised Learning
  1.3 Deep Learning
  1.4 Audio Thumbnailing
  1.5 Motivation and Goals
  1.6 Thesis Outline
2 Literature Survey of Music Signal Processing
  2.1 Introduction
  2.2 Areas of Research and Related Work
    2.2.1 Related Work With Respect to Source Separation
    2.2.2 Emotion Recognition
    2.2.3 Raga Classification
    2.2.4 Tala Classification
    2.2.5 Intonation/Rhythmic Analysis
    2.2.6 Tonic Identification
    2.2.7 Music Note Representation
  2.3 Existing methodologies in Audio Thumbnailing
  2.4 Audio Classification Techniques
  2.5 Summary and Conclusions
3 Classification and Computation of Audio Thumbnails
  3.1 Proposed Methodology
    3.1.1 Computation of Self Similarity Matrix
      3.1.1.1 Enhancement Strategies
    3.1.2 Generation of Thumbnails
    3.1.3 Classification Model
  3.2 Summary and Conclusions
4 Results of Classification of Carnatic Thumbnails using CNN-RNN Models
  4.1 Experimental Setup
    4.1.1 Dataset for Raga Classification
    4.1.2 Dataset for Tala Classification
  4.2 Summary and Conclusions
5 Summary and Conclusions
  5.1 Future Work
Related Publications
Bibliography

List of Figures

1.1 Classification of Musical Instruments
1.2 Machine Learning Application adopted from https://www.guru99.com/machine-learning-tutorial.html
1.3 Regression Algorithms
1.4 Unsupervised learning Algorithms
3.1 Sample Self Similarity Matrix
3.2 Chromagram of Song Inta Chala in Adi Talam
3.3 Detailed view of Self Similarity Matrix
3.4 Self Similarity Matrix - Procedure
3.5 Architecture of the neural network classification model
4.1 Self Similarity Matrix before Enhancing and Smoothing for Song Inta Chala in Adi Tala
4.2 Self Similarity Matrix After Enhancing and Smoothing for Song Inta Chala in Adi Tala

List of Tables

1.1 Table illustrating the note of each swara
1.2 Table illustrating twelve swaras
2.1 Comparison between existing Thumbnailing Approaches
4.1 Table Describing Dataset for Raga Classification
4.2 Table Describing Dataset for Raga Ahiri
4.3 Table describing Dataset for Raga Darbari
4.4 Table describing Dataset for Raga Darbari Kannada
4.5 Table describing Dataset for Raga Suruthi
4.6 Table describing Dataset for Raga Varali
4.7 Table describing Dataset for Raga Kuntala Varali
4.8 Table describing Dataset for Raga Sahana
4.9 Table describing Dataset for Raga Nattai
4.10 Table describing Dataset for Raga Saurastram
4.11 Table describing Dataset for Raga Nilambari
4.12 Table describing Dataset for Raga Vasantha
4.13 Table describing Dataset for Raga Kapi
4.14 Table describing Dataset for Raga Kedaram
4.15 Table describing Dataset for Raga Kanada
4.16 Table describing Dataset for Raga Khamas
4.17 Table describing Dataset for Raga Sri
4.18 Table describing Dataset for Raga Ahiri
4.19 Table describing Dataset for Raga Darbari
4.20 Table describing Dataset for Raga Darbari Kanada
4.21 Table describing Dataset for Raga Surthi
4.22 Table describing Dataset for Raga Varali
4.23 Table describing Dataset for Raga Sahana
4.24 Table describing Dataset for Raga Nattai
4.25 Table describing Dataset for Raga Kuntalavarali
4.26 Table describing Dataset for Raga Saurastram
4.27 Table describing Dataset for Raga Nilambari
4.28 Table describing Dataset for Raga Vasantha
4.29 Table describing Dataset for Raga Kapi
4.30 Table describing Dataset for Raga Kedaram
4.31 Table describing Dataset for Raga Kanada
4.32 Table describing Dataset for Raga Khamus
4.33 Table describing Dataset for Raga Sri
4.34 Table Describing Dataset for Tala Classification
4.35 Table describing Dataset for Tala Adi
4.36 Table describing Dataset for Tala Rupaka
4.37 Table describing Dataset for Tala Tisra Jati Eka