Identifying Raga Similarity Through Embeddings Learned from Compositions’ Notation
Joe Cheri Ross¹, Abhijit Mishra³, Kaustuv Kanti Ganguli², Pushpak Bhattacharyya¹, Preeti Rao²
¹ Dept. of Computer Science & Engineering, ² Dept. of Electrical Engineering, Indian Institute of Technology Bombay, India
³ IBM Research India

© Joe Cheri Ross, Abhijit Mishra, Kaustuv Kanti Ganguli, Pushpak Bhattacharyya, Preeti Rao. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Joe Cheri Ross, Abhijit Mishra, Kaustuv Kanti Ganguli, Pushpak Bhattacharyya, Preeti Rao. “Identifying Raga Similarity Through embeddings learned from Compositions’ notation”, 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.

ABSTRACT

Identifying similarities between ragas in Hindustani music impacts tasks like music recommendation, music information retrieval and automatic analysis of large-scale musical content. Quantifying raga similarity is extremely challenging because it demands assimilation of both intrinsic (viz. notes, tempo) and extrinsic (viz. raga singing-time, emotions conveyed) properties of ragas. This paper introduces novel frameworks for quantifying similarities between ragas based on their melodic attributes alone, available in the form of bandish (composition) notation. Based on the hypothesis that notes in a particular raga are characterized by the company they keep, we design and train several deep recurrent neural network variants with Long Short-Term Memory (LSTM) units to learn distributed representations of notes in ragas from bandish notations. We refer to these distributed representations as note-embeddings. Note-embeddings, as we observe, capture a raga’s identity, and thus the similarity between note-embeddings signifies the similarity between the ragas. Evaluations with a perplexity measure and a clustering-based method show improved performance in identifying similarities using note-embeddings over n-gram and uni-directional LSTM baselines. While our metric may not capture similarity between ragas in their entirety, it could be quite useful in various computational music settings that rely heavily on melodic information.

1. INTRODUCTION

Hindustani music is one of the Indian classical music traditions; it developed in the northern part of India and was influenced by the music of Persia and Arabia [17]. The south Indian music tradition is referred to as Carnatic music [30]. The compositions and their performances in both these classical traditions are strictly based on the grammar prescribed by the raga framework. A raga is a melodic mode or tonal matrix providing the grammar for the notes and melodic phrases, but not limiting the improvisatory possibilities in a performance [25].

Since raga is one of the most prominent categorization aspects of Hindustani music, identifying similarities between ragas is of prime importance to many tasks specific to Hindustani music, such as music information retrieval, music recommendation and automatic analysis of large-scale musical content. Generally, similarity between ragas is inferred through attributes associated with the ragas. For instance, in Hindustani music, the classification of ragas based on the tonal material involved is termed thaat. There are 10 thaats in Hindustani music [8]. Prahar, jati, vadi, samvadi etc. are the other important attributes. Most of the accepted similarities between ragas encompass similarities in many of these attributes, but they cannot always be derived exclusively from these attributes. Melodic similarity is a strong substitute and close to perceived similarity. The melodic similarity between Hindustani ragas is largely unavailable in documented form. This necessitates devising systems for raga similarity measurement, even though the number of ragas in the Hindustani classical framework is fixed.

A composed musical piece termed a bandish is written to be performed in a particular raga, giving the performer ample freedom to improvise. As the literal meaning suggests, a bandish is tied to its raga, tala (rhythm) and lyrics. The bandish is taken as the basic framework for a performance, which is enriched with improvisation as the performer renders it. Realization of a bandish in a performance brings out all the colors and characteristics of a raga. Given this, audio performances of bandishes can be deemed excellent sources for analyzing raga similarities from a computational perspective. However, methods for automatic transcription of notation from audio performances have been elusive, which restricts the possibilities of exploiting audio resources. Our work on raga similarity identification thus relies on notations, which give an abstract representation of a performance covering most dimensions of the composition’s raga. We use the bandish notation dataset available from swarganga.org [16].

Our proposed approach, based on a deep recurrent neural network with bi-directional LSTMs as recurrent units, learns note-embeddings for each raga from the bandish notations available for that raga. We partition our data by raga and train the model independently for each raga. This produces as many sets of note-embeddings as there are ragas represented in the dataset. The cosine similarity between the note-embeddings serves for analyzing the similarity between the ragas. Our evaluations with a perplexity measure and clustering-based methods show improved performance in identifying similarities using note-embeddings from our approach over (a) a baseline that uses n-gram overlaps of notes in bandishes for raga similarity computation, (b) a baseline that uses pitch class distribution (PCD) and (c) our approach with a uni-directional LSTM. We believe our approach can be seamlessly adapted to the Carnatic music style, as it follows most of the same principles as Hindustani music.

2. RELATED WORK

To the best of our knowledge, no such attempts to identify raga similarity have been made so far. The work closest to ours is by Bhattacharjee and Srinivasan [5], who discuss raga identification of Hindustani classical audio performances through a transition-probability-based approach. They also discuss validating the raga identification method by identifying known raga relationships between the 10 ragas considered in their work. A good number of research works have addressed raga identification in Hindustani music using note intonation [3], chromagram patterns [11] and note histograms [12]. Pandey et al. [22] proposed an HMM-based approach on notation data automatically transcribed from audio. There have also been quite a few raga recognition attempts in Carnatic music [28, 4, 27, 24].

3. RAGA SIMILARITY BASED ON NOTATION: MOTIVATION AND CENTRAL IDEA

While the general notion of raga similarity is based on various dimensions of ragas like thaat, prahar, jati, vadi, samvadi etc., the similarities perceived by humans (musicians and expert listeners) are predominantly based on the melodic structure. A raga-similarity method based solely on notational (melodic) information can be quite relevant to computational music tasks involving Indian classical music.

Theoretically, the identity of a raga lies in how certain notes and note sequences (called phrases) are used in its compositions. We hypothesize that capturing the semantic association between different notes appearing in the composition can possibly reveal the identity of a raga. Moreover, it can also provide insights into how similar or dissimilar two ragas can be, based on how similar or dissimilar the semantic associations of notes in the compositions are. [...] could account for how notes are preceded and succeeded by other notes in compositions.

Formally, in a composition, a note $x \in V$ (where $V$ represents a vocabulary of all notes in three octaves) can be represented as a $d$-dimensional vector that captures semantic information specific to the raga that the compositions belong to. Such distributed note representations, referred to as note-embeddings (a $|V| \times d$ matrix), can be expected to capture more information than sparse representations (such as representing notes with unique integers). We propose a bi-directional LSTM [14] based architecture, motivated by the work of Huang and Wu [15], to learn note-embeddings characterizing a particular style of music. We learn note-embeddings for each raga separately from the compositions available for the raga.

How can note-embeddings help capture similarities between ragas? We hypothesize that the embeddings learned for a given note will be more similar for similar ragas. For example, the representation for note Ma-elevated (equivalent to the note F# in the C scale) in raga Yaman can be expected to be very similar to that in Yaman Kalyan, as these two ragas share very similar melodic characteristics.

[Figure: input notes $x_1, \dots, x_n$; a $|V| \times d$ embedding representation $e_1, \dots, e_n$; stacked LSTM layers with merge (+) operations; softmax outputs giving the note distribution.]
Figure 1. Bi-directional LSTM architecture for learning note-embeddings

4. NEURAL NETWORK ARCHITECTURE FOR LEARNING NOTE-EMBEDDINGS

We design a deep recurrent neural network (RNN), with bi-directional LSTMs as recurrent units, that learns to predict the forthcoming notes that are highly likely to appear in a bandish composition,