HMM-Based Glissando Detection for Recordings of Chinese Bamboo Flute
Changhong Wang¹, Emmanouil Benetos¹, Xiaojie Meng², Elaine Chew¹
¹ Centre for Digital Music, Queen Mary University of London, UK
² Department of Chinese Music, China Conservatory of Music, China

C. Wang is funded by the China Scholarship Council (CSC). E. Benetos is supported by a UK RAEng Research Fellowship (RF/128).

Copyright: © 2019 Changhong Wang, Emmanouil Benetos, Xiaojie Meng, Elaine Chew. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

ABSTRACT

Playing techniques such as ornamentations and articulation effects constitute important aspects of music performance. However, their computational analysis is still at an early stage due to a lack of instrument diversity, established methodologies and informative data. Focusing on the Chinese bamboo flute, we introduce a two-stage glissando detection system based on hidden Markov models (HMMs) with Gaussian mixtures. A rule-based segmentation process extracts glissando candidates that are consecutive note changes in the same direction. Glissandi are then identified by two HMMs. The study uses a newly created dataset of Chinese bamboo flute recordings, including both isolated glissandi and real-world pieces. The results, based on both frame- and segment-based evaluation for ascending and descending glissandi respectively, confirm the feasibility of the proposed method for glissando detection. Better detection performance of ascending glissandi over descending ones is obtained due to their more regular patterns. Inaccurate pitch estimation forms a main obstacle for successful fully-automated glissando detection. The dataset and method can be used for performance analysis.

1. INTRODUCTION

Computational analysis of expressive patterns in music signals plays an important role in music information research. For instrumental music, these expressive patterns are frequently the result of playing techniques. Automated analysis of playing techniques can benefit automatic music transcription [1], computer-aided music pedagogy [2], instrument classification [3, 4], and performance analysis [5]. However, computational analysis of playing techniques is still in its early stages, lacking instrument diversity, established methodologies, and informative data.

Most existing work on computational analysis of playing techniques focuses on Western instruments such as guitar [6–8], violin [9–11], piano [12], and drums [13, 14]. Playing techniques in non-Western instruments, while similarly important, are often overlooked. Take, for example, one of the world's most ancient instruments, the Chinese bamboo flute (also known as the Dizi or Zhudi, hereafter referred to as CBF): many listeners are captivated by its unique timbre, which belies the twenty or more playing techniques invoked when performing on the instrument. To our knowledge, only Ayers [15, 16] has done some analysis of CBF playing techniques through synthesis. This work focused only on trills, tremolos and flutter-tongue, and many other techniques remain to be explored. For other non-Western instruments, limited computational work can be found [5, 17].

For playing technique detection, methods adopted in the literature are typically frame-wise classifiers based on high dimensional feature inputs [6, 18], with little explanation of why the methods work. Support vector machines (SVMs) are the most frequently used class of methods. A series of electric bass guitar playing techniques was classified into plucking or expressive styles using SVMs in [6]; [10] applied SVMs to distinguish five fundamental guitar playing techniques. A multimodal input using SVMs was used for analysing piano pedalling techniques in [12]. Su et al. [11] proposed new features as input to an SVM based on sparse modeling of magnitude and phase-derived spectra before classifying violin playing techniques. Other work used dynamic time warping [19], COSFIRE filters [20], spectrogram templates [21], and the filter diagonalisation method [22] for analysis of playing techniques.

Datasets used in playing technique research consist mainly of playing techniques performed in isolation. Isolated techniques can vary greatly from the same techniques used in live performance. For ecological validity, we argue that playing techniques should be collected in context. A challenge of obtaining playing technique examples in real-world settings is that some techniques may be rare. Thus, it may be hard to find pieces covering a wide range of playing techniques and with sufficient repeated instances of these techniques to obtain a variety of samples for a specific technique.

To address these limitations, we use the CBF as our instrument of choice and glissando, a rarely explored audio gesture in the literature, as our starting point, aiming to build a systematic methodology for automatically analysing playing techniques. Glissando here refers to a rapid slide up or down the musical scale [23], which is not comparable to the one defined as a continuous slide from one note to another in [24]. Fig. 1 shows a spectrogram of a series of two ascending and two descending CBF glissandi. As can be seen, they exhibit a readily recognisable pattern, resembling rapid scale segments. Glissando detection in CBF playing is not straightforward: CBF glissandi are less regular than the stair-like glissando patterns in piano and guitar playing [18]. For the same glissando type, variations exist in the ways they are executed between different players, different pieces, and even different parts of the same piece. The main characteristic of glissando is the consecutive note change, which we claim can be captured by latent states of a hidden Markov model (HMM) [25, 26]. HMMs enable the decoding of note evolution while smoothing outlier variations within performed glissandi.

Figure 1. Spectrogram of two ascending and two descending glissando examples in Chinese bamboo flute music (axes: time in s, frequency in kHz).

In this paper, we make a first attempt at the computational analysis of CBF glissandi. A new dataset including both isolated glissandi and real-world pieces is created and is being prepared for public release. Based on the analysis of ground truth statistics, we propose a two-stage detection system. A rule-based segmentation process first extracts glissando candidates that are consecutive note changes in the same direction. Different from traditional binary classification, the false positives obtained in the segmentation stage, which exhibit similar pitch evolution and duration as the ground truth, are used to train a non-glissando HMM (NG-HMM). A glissando HMM (G-HMM) is trained using all ground truth glissandi in the training set. Glissandi are then identified by two HMMs at test time.

2. DATASET

2.1 Dataset Information

The glissando analysis dataset, CBF-GlissDB, comprises recordings by ten expert CBF players from the China Conservatory of Music. All data is recorded in a professional recording studio using a Zoom H6 recorder at 44.1 kHz/24-bit. Each player performs both isolated glissandi covering all notes on the CBF and one full-length piece: Busy Delivering Harvest or Morning […] recording length and number of glissandi in each group are shown in Table 1.

Table 1. Dataset information.

                 Isolated glissandi            Whole-piece recordings
Players  Flute   Length   #glissandi           Piece, style                        Length   #glissandi
                 (mins)   [↑, ↓]                                                   (mins)   [↑, ↓]
1-3      C       2.4      [58, 47]             Morning, Southern                   16.0     [24, 0]
4-10     G       5.0      [117, 112]           Busy Delivering Harvest, Northern   28.0     [23, 106]

In order to assess the performance of the proposed glissando detection system independent of the performance of pitch estimation methods, pitch ground truth for all recordings is created. The fundamental frequency of each recording is first estimated using the pYIN algorithm [27], chosen because the recordings are strictly monophonic. All errors are then manually corrected by the first author using Sonic Visualiser. Both isolated and performed glissandi are annotated and verified by the players on the score. The final annotation is created by the first author after consulting with the players.

2.2 Dataset Statistics

To verify the intuition of the difference between isolated and performed glissandi, characteristic statistics of the ground truth are calculated. Fig. 2 shows two-dimensional histograms for four types of glissandi in CBF-GlissDB: ascending and descending isolated glissandi; and ascending and descending performed glissandi. As can be seen, performed glissandi have shorter durations than isolated glissandi; for descending glissandi in particular, performed ones last almost half as long as isolated ones. Further analysis of note durations within each glissando shows little difference among isolated glissandi, while ascending performed glissandi have larger variation than descending performed ones. This may be attributed to the performers' tendency to lengthen the start or end note in an ascending performed glissando.

3. METHOD

To automatically detect glissando from real-world CBF recordings, we propose a two-stage detection system based on rule-based segmentation (Sec. 3.1) and HMM-based identification (Sec. 3.2).

3.1 Rule-based Segmentation

To obtain glissando candidates from the whole-piece recordings, we introduce a rule-based segmentation component using pitch with a 20 ms hop size as input, as demonstrated in Fig. 3. The pitch is first smoothed to exclude noisy variations and quantised to the nearest notes in 12-tone equal temperament.
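The segmentation idea described in Sec. 3.1 (smooth the pitch track, quantise it to 12-tone equal temperament, then keep runs of consecutive note changes in the same direction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the median-filter width, the minimum of three notes per candidate, the A4 = 440 Hz reference, the convention that unvoiced frames carry a pitch of 0, and all function names are assumptions made only for illustration. Only the 20 ms hop size comes from the text.

```python
import math
from statistics import median

HOP_S = 0.02  # 20 ms hop size, as stated in the text


def hz_to_note(f0_hz):
    """Nearest 12-TET note as a MIDI number (A4 = 440 Hz = 69); None if unvoiced."""
    if f0_hz <= 0:
        return None
    return int(round(69 + 12 * math.log2(f0_hz / 440.0)))


def median_smooth(f0_hz, width=5):
    """Running median over voiced frames; the window width is an assumed value."""
    half = width // 2
    out = []
    for i, f in enumerate(f0_hz):
        if f <= 0:
            out.append(0.0)  # keep unvoiced frames unvoiced
            continue
        window = [v for v in f0_hz[max(0, i - half): i + half + 1] if v > 0]
        out.append(median(window))
    return out


def note_events(notes):
    """Collapse frame-wise notes into (note, start_frame, end_frame) events."""
    events = []
    for i, n in enumerate(notes):
        if events and events[-1][0] == n:
            events[-1][2] = i + 1
        else:
            events.append([n, i, i + 1])
    return [tuple(e) for e in events if e[0] is not None]


def glissando_candidates(f0_hz, min_notes=3):
    """Return (direction, start_s, end_s) for runs of same-direction note changes."""
    events = note_events([hz_to_note(f) for f in median_smooth(f0_hz)])
    candidates, run, direction = [], [], 0

    def emit():
        # Keep the current run only if it spans enough notes (assumed threshold).
        if len(run) >= min_notes:
            label = "up" if direction > 0 else "down"
            candidates.append((label, round(run[0][1] * HOP_S, 3),
                               round(run[-1][2] * HOP_S, 3)))

    for ev in events:
        if not run:
            run = [ev]
            continue
        step = ev[0] - run[-1][0]
        sign = (step > 0) - (step < 0)
        contiguous = ev[1] == run[-1][2]
        if contiguous and sign != 0 and direction in (0, sign):
            run.append(ev)
            direction = sign
            continue
        emit()
        # On a direction reversal the turning note also starts the next run.
        run = [run[-1], ev] if contiguous and sign != 0 else [ev]
        direction = sign if len(run) == 2 else 0
    emit()
    return candidates
```

Under these assumptions, a pitch track that climbs through four notes and then falls through three yields one ascending and one descending candidate that share the turning note. In the two-stage system described above, such candidates would then be passed to the HMM-based identification stage, which separates true glissandi from false positives.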