Quantized Melodic Contours in Indian Art Music Perception: Application to Transcription
Total Page:16
File Type:pdf, Size:1020Kb
QUANTIZED MELODIC CONTOURS IN INDIAN ART MUSIC PERCEPTION: APPLICATION TO TRANSCRIPTION Ranjani, H. G. Deepak Paramashivan Thippur V. Sreenivas Dept of ECE, Dept of Music, Dept of ECE, Indian Institute of Science, University of Alberta, Indian Institute of Science, Bangalore Canada Bangalore [email protected] [email protected] [email protected] ABSTRACT 300 Ragas¯ in Indian Art Music have a florid dynamism asso- M P G ciated with them. Owing to their inherent structural intri- 250 cacies, the endeavor of mapping melodic contours to mu- sical notation becomes cumbersome. We explore the po- ! tential of mapping, through quantization of melodic con- 200 tours and listening test of synthesized music, to capture the Pitch (Hz) nuances of ragas¯ . We address both Hindustani and Car- 150 natic music forms of Indian Art Music. Two quantization schemes are examined using stochastic models of melodic pitch. We attempt to quantify the salience of raga¯ per- 100 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 ception from reconstructed melodic contours. Perception Time (s) experiments verify that much of the raga¯ nuances inclu- ! sive of the gamaka (subtle ornamentation) structures can Figure 1. [Color Online] Contour of a vocal melodic clip be retained by sampling and quantizing critical points of (rendition by Vid. Deepak Paramashivan) in Thodi raga¯ of melodic contours. Further, we show application of this re- Carnatic music form (tonic frequency at 146.83 Hz). The sult to automatically transcribe melody of Indian Art Mu- transcribed notes correspond to ‘MaP aGa’. The presence sic. of ornamentations pose difficulty for transcription. 1. INTRODUCTION notes from pitch contours [11]. The difficulty involved in Melody contours are often perceived as continuous func- identifying notes from a rendered melodic contour can be tions though generated from notes which assume discrete seen in Figure 1. pitch values. The rendition of a raga¯ , the melodic frame- In this work, we analyze pitch contours in an attempt to work of Indian Art Music (IAM), is a florid movement understand (i) how musicians possibly assess a correctly across notes, embellished with suitable ornamentations rendered note (ii) how they approach subsequent note(s) in (gamakas). Several engineering approaches to analyse a raga¯ . We explore the possibility to incorporate this un- and/or model pitch contours rely on ‘stable’ notes [5, 12]; derstanding to engineer an automated framework to repre- yet, it contradicts the perceptions and claims of musicians sent a raga¯ in terms of note sequences. A perceptual study in both Carnatic and Hindustani forms of music and also of the effects of two quantization schemes on raga¯ char- that of detailed experiments which assert that it is actu- acteristics (raga¯ -bhava) is explored. For a more detailed ally the manner of approaching notes that characterizes a exposition of ragas¯ in Indian Art Music, interested readers raga¯ [1, 6]. Algorithms to automatically align note tran- can refer to [19, 20]. scription to melodic contours show promise more at a rhythm cycle level rather than at a note level [21], lead- 1.1 Complexity of Pitch Contours in Indian Art Music ing to a hypothesis that it is necessary to study the role of raga pitch in rendering notes rather than finding / transcribing A ¯ contains 3 structures of information : (i) Pitch po- sitions of notes (swarasthana¯ ) (ii) Ornamentation of notes (Gamaka) (iii) Note movement (swarasanchara¯ ). All the c Ranjani, H. G., Deepak Paramashivan, Thippur V. three structures are coupled in a raga¯ rendition. The note Sreenivas. Licensed under a Creative Commons Attribution 4.0 Inter- position is embellished with gamakas, and is also depen- national License (CC BY 4.0). Attribution: Ranjani, H. G., Deepak dent on the note transitions themselves. Paramashivan, Thippur V. Sreenivas. “Quantized melodic contours in Indian Art Music Perception: Application to transcription”, 18th Inter- In [13], different notes and their transitions are stud- national Society for Music Information Retrieval Conference, Suzhou, ied and classified as ‘inflected intervals’, ‘transient notes’ China, 2017. and ‘transient inflexions’, while acknowledging that musi- 174 Proceedings of the 18th ISMIR Conference, Suzhou, China, October 23-27, 2017 175 cal insight and knowledge is necessary to distinguish be- tions of same raga¯ . Let τ = f t j ryn(t) = 0 g be the set tween different transitions. of critical points and x = f yn(τ) g be the corresponding From an engineering perspective, in a raga¯ , we observe critical pitch values. The tuple X = (x; τ) are the critical that inspite of all such note transitions detected at both finer points of yn(t). Mathematically, critical points can be ob- and coarser levels, the peaks, valleys and stable regions tained only if a function is differentiable. We estimate X correspond to discrete pitch values with an error factor. It from the zero crossings of numerical gradient of yn(t). is also logical that a musician perceives these points to as- sess if the intended note has been reached 1 . The salience 2.1 Semi-Continuous Gaussian Mixture Model of peaks, valleys and stable regions is utilized for motif Consider the Semi-Continuous Gaussian Mixture Model spotting in Carnatic music in [10]. However, not all peaks, (SC-GMM) [18] with K number of components whose valley regions can correspond to notes as per conventional means, µk8k 2 f1; 2;:::;Kg within an octave are fixed in transcriptions [15]. As an engineering approach, we pro- accordance to the note ratios used in IAM, as shown in Ta- pose to discretize or quantize the pitch contours at these ble 1. The distribution of pitch values in yn and the critical critical points using a semi-continuous Gaussian mixture pitch values x can be modeled using SC-GMM as: model (SC-GMM) proposed in [18] (Section 2) and thus K map continuous pitch contours to discrete note sequences. X We believe such a mapping brings us closer to understand- p(y) = αk;yN (yn; µk; σk;y) (1) ing the structures of ragas¯ in accordance with the theory k=1 K that discrete elements carry the structural information of X musical forms while expressions are realized in continu- p(x) = αk;xN (x; µk; σk;x) (2) ous variations [14]. k=1 While analysing contours w.r.t. discrete pitch values, For a fixed K components, the set of parameters estimated we often encounter scenarios in which pitch values can ei- from distribution of pitch are fαy; σyg and fαx; σxg. µ ther overshoot or undershoot the intended values as can be parameters are not estimated since they are fixed and are seen in Figure 1; this is also reported in literature [13, 15]. same in both cases. The following reasons can be attributed to such detours w.r.t. discrete pitch values - (i) Performers’ intent to gener- 2.2 Quantization using SC-GMM ate certain perceived effect in the listener (ii) Possible de- We use the above model to quantize pitch contours. Each viations during learning/ fast renditions (iii) Creative free- pitch sample of y (t) can be quantized to a nearest com- dom and margin of error allowed in rendering a raga¯ as an n ponent of SC-GMM which maximizes its probability: art form. Any deviation which does not bring about the re- quired perceptual effect can cause a connoisseur/musician ∗ ky(t) = arg max αk;yN (yn(t); µk; σk;y) (3) to not appreciate the rendition in its totality. k2f1;2;:::;Kg In this work, we assume the deviations to be due to any Similarly, every critical pitch of x can be quantized as: of above reasons and hence is part of errors in quantizing pitch contours. If the quantization process has disregarded ∗ kx(τ) = arg max αk;xN (x(τ); µk; σk;x) (4) the musically intended overshooting and undershooting of k2f1;2;:::;Kg pitch values, it only implies that the effect of the raga¯ is Thus, both y and x are now quantized and correspond not captured completely in the quantized sample. In or- n to a sequence of notes; their temporal information (corre- der to analyze the importance of limits of quantization, sponding to ftg and fτg) are retained. we reconstruct the melody from quantized sequences and conduct perception experiments on these melodies (Sec- tion 3.3). Further, we propose a framework by using these 3. SYNTHESIS FROM QUANTIZED SEQUENCE quantized notes to transcribe a contour (Section 4.1). OF NOTES To check if this mapping process captures the essence of 2. QUANTIZATION MODEL ragas¯ and to assess the effect of quantization on raga¯ per- ception, we conduct perception experiments. Audio clips Given pitch contours, y(t) estimated from audio record- are synthesized for perception. In order to synthesize audio ings, it is possible to identify the tonic frequency f as T from quantized note sequences, we first synthesize melody shown in [7, 18]. The pitch contours are tonic normal- contours, and use the same to synthesize audio clips. ized and mapped to a common tonic, fU ; let yn(t) = y(t) ∗ fU =fT denote pitch contours mapped to common 3.1 Quantized Pitch Contour tonic frequency 2 . This helps to analyze different rendi- ∗ ∗ The un-quantized yn(t), the quantized ky(t) and kx(τ) are 1 This also explains the fact that music listeners do not perceive in- interpolated to obtain a contour sampled at Fs, the sam- termediate notes during note transitions which are greater than a semi- 3 tone; for example, when a musician glides from Sa (tonic) to Pa (fifth) pling frequency of the discrete-time audio signal . Piece- in a raga¯ , we do not perceive all the intermediatary semi-tones which the wise cubic hermite interpolating polynomial is used with glide passes through.