NOTE, CUT AND STRIKE DETECTION FOR TRADITIONAL IRISH FLUTE RECORDINGS

Islah Ali-MacLachlan, Maciej Tomczak, Carl Southall, Jason Hockman
DMT Lab, Birmingham City University
{islah.ali-maclachlan, maciej.tomczak, carl.southall, jason.hockman}@bcu.ac.uk

ABSTRACT

This paper addresses the topic of note, cut and strike detection in Irish traditional music (ITM). In order to do this we first evaluate state-of-the-art onset detection methods for identifying note boundaries. Our method utilises the results from manually and automatically segmented flute recordings. We then demonstrate how this information may be utilised for the detection of notes and of the single-note articulations idiomatic of this genre, for the purposes of player style identification. Results for manually annotated onsets achieve 86%, 70% and 74% accuracies for note, cut and strike classification respectively. Results for automatically segmented recordings are considerably lower; we therefore perform an analysis of the onset detection results per event class to establish which musical patterns contain the most errors.

[Figure 1: Frequency over time of cut and strike articulations showing change of pitch. Long and short rolls are also shown with pitch deviations (eighth-note lengths shown for reference).]

1. INTRODUCTION

1.1 Background

Irish traditional music (ITM) is a form of dance music played on a variety of traditional instruments including the flute. Within the tradition of ITM, players from different backgrounds are individuated based on their use of techniques such as ornamentation, a key factor alongside melodic and rhythmic variation, phrasing and articulation in determining individual player style (McCullough, 1977; Hast & Scott, 2004).

To automatically detect a player's style in audio signals, a critical first step is to detect these notes and ornamentation types. In this paper we evaluate both notes and single-note ornaments known as cuts and strikes. Both ornaments generate a pitch deviation: a cut is performed by quickly lifting a finger from a tonehole and then replacing it; a strike involves momentarily covering the tonehole below the note being played. We also analyse the cut and strike elements of multi-note ornaments known as the short roll and long roll. Figure 1 shows the pitch deviation for cuts and strikes. Long and short rolls are also displayed, showing the inclusion of cut and strike figures. A long roll occupies the same duration as three eighth notes, whereas a short roll is equivalent to two eighth notes. In practice, ITM follows a swing rhythm: while there is a regular beat, swing follows an irregular rhythm, and therefore each eighth-note section may not be of equal duration in normal playing (Schuller, 1991).

1.2 Related work

The approach undertaken in this paper utilises onset detection as a crucial first step in the identification of notes and ornaments. There are relatively few studies in the literature that deal specifically with onset detection within ITM, particularly with reference to the flute. Onsets were found by Gainza et al. (2004) using band-specific thresholds in a technique similar to Scheirer (1998) and Klapuri (1999); a decision tree was then used to determine note, cut or strike based on duration and pitch. Kelleher et al. (2005) used a similar system to analyse ornaments on the fiddle within Irish music, as bowed instruments also produce slow onsets.

Köküer et al. (2014) also analysed flute recordings through the incorporation of three kinds of information and a fundamental frequency estimation method using the YIN algorithm by de Cheveigné & Kawahara (2002). As in Gainza et al. (2004), a filterbank with fourteen bands optimised for the flute was used. More recently, Jančovič et al. (2015) presented a method for the transcription of ITM flute recordings with ornamentation using hidden Markov models.

Unlike the above flute-specific methods, which rely on signal-processing-based onset detection algorithms, state-of-the-art generalised onset detection methods use probabilistic modelling. The number of onset detection methods using neural networks has risen substantially since Lacoste & Eck (2005). OnsetDetector by Eyben et al. (2010) uses bidirectional long short-term memory neural networks, and performs well in a range of onset detection tasks including solo wind instruments.

In this paper we perform an evaluation using several modern onset detection algorithms and a dataset comprising 79 real flute performances of ITM. We then demonstrate how this information may be utilised towards the determination of notes and single-note ornamentations.

The remainder of this paper is structured as follows: Section 2 details the method of segmentation, feature extraction and classification. In Section 3 we discuss evaluations of a range of onset detection methods and the classification of notes, cuts and strikes. Results of the studies into onset detection and ornament classification are presented in Section 4, and finally conclusions and further work are discussed in Section 5.

2. METHOD

Figure 2 shows an overview of the proposed method. We extract features from audio segments representing events (notes, cuts, strikes) and propose an event type classification approach using the segmented event features.

For a fully automated method we use onset detection for segmentation. Event features are then extracted from inter-onset intervals (IOIs). These features are used in a supervised learning algorithm to classify the segments as one of three distinct classes: notes, cuts and strikes. For onset detection we use the top-performing algorithm from the evaluation presented in Section 3.2, and in the following discuss only the remaining feature extraction and classification stages.

[Figure 2: Overview of the proposed classification method of notes, cuts and strikes in flute signals. The first phase shows feature extraction from segmented audio events and the second phase shows classification of the events.]
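To make the segmentation stage concrete, the following is a minimal Python sketch of onset-based IOI segmentation. librosa's default spectral-flux onset detector is only a stand-in for the detectors actually evaluated in Section 3, and the function name and default sample rate are illustrative assumptions, not the paper's implementation.

import librosa

def segment_events(path, sr=11025):
    # Load the recording as mono, downsampled (the paper uses 11,025 Hz).
    y, sr = librosa.load(path, sr=sr, mono=True)
    # Detect onsets; this generic detector stands in for those
    # compared in Section 3.
    onsets = librosa.onset.onset_detect(y=y, sr=sr, units='samples')
    # Each inter-onset interval (IOI) becomes one candidate event
    # (note, cut or strike) to be classified later.
    bounds = list(onsets) + [len(y)]
    return [y[s:e] for s, e in zip(bounds[:-1], bounds[1:])], sr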
2.1 Feature extraction

In order to capture the differences between each event type we extract features related to rhythm, timbre and pitch; event durations are included, as these are also important in class distinction. The change in timbre is caused by the player's fleeting finger motion as a tonehole is temporarily opened or closed. This results in a unique timbre that differs from notes. For that purpose we extract 13 Mel-frequency cepstral coefficients (MFCCs), excluding the first coefficient, and 12 chroma features to capture the timbre and pitch changes in each of the articulations.

To extract features from the audio segments, the input audio (mono WAV files) is downsampled to 11,025 Hz. Following the approach in Mauch & Dixon (2010), we calculate the MFCC and chroma features using a Hanning window of 1024 samples with 50% overlap. The extracted features are then normalised to the range [0, 1] for every corresponding feature type. Each audio segment is assigned to its class Ω (e.g. note). An n × 26 matrix F_Ω is created, where n represents the number of segments, each with 26 features (i.e., MFCCs, chroma, duration).

Each F_Ω segment appears in the context of musical patterns such as rolls, shakes or simply consecutive notes in the recording. To account for the rhythmic, timbral and pitch changes of each event type in the context of these patterns, we concatenate the first derivatives of all features onto every F_Ω segment.
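A minimal sketch of this per-segment feature extraction follows. Mean-pooling the frame-wise MFCC and chroma features over each segment is an assumption (the paper does not state how frames are summarised), as is taking the first derivatives across consecutive segments; the function names are illustrative.

import numpy as np
import librosa

def event_features(seg, sr=11025):
    # 1024-sample Hann window with 50% overlap (hop of 512 samples).
    mfcc = librosa.feature.mfcc(y=seg, sr=sr, n_mfcc=14,
                                n_fft=1024, hop_length=512)[1:]   # drop 1st coeff -> 13
    chroma = librosa.feature.chroma_stft(y=seg, sr=sr,
                                         n_fft=1024, hop_length=512)  # 12 pitch classes
    duration = len(seg) / sr  # rhythmic feature
    # Mean-pool frames so each segment yields one 26-dimensional vector.
    return np.concatenate([mfcc.mean(axis=1), chroma.mean(axis=1), [duration]])

def feature_matrix(segments, sr=11025):
    F = np.vstack([event_features(s, sr) for s in segments])      # n x 26
    # Normalise each feature to [0, 1] across all segments.
    F = (F - F.min(axis=0)) / (F.max(axis=0) - F.min(axis=0) + 1e-9)
    # First derivatives across consecutive segments capture pattern context.
    dF = np.vstack([np.zeros((1, F.shape[1])), np.diff(F, axis=0)])
    return np.hstack([F, dF])                                     # n x 52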
2.2 Neural network classification

Audio segments are then classified into note, cut and strike classes using a feed-forward neural network.

[Figure 3: Neural network architecture containing two hidden layers a^1 and a^2, with weights w and biases b.]

The proposed neural network, shown in Figure 3, consists of two hidden layers containing 20 neurons each. Backpropagation is used to train the neural network, updating the weights and biases iteratively using scaled conjugate gradient on the output errors. A maximum iteration limit is set to 10,000, and the weights and biases are initialised with random non-zero values to ensure that training commences correctly. A validation set is used to prevent over-fitting, and cross entropy is used as the performance measure.

The output of each layer of an L-layered neural network can be calculated using:

    a^{(l)}(t) = f_l(a^{(l-1)}(t) W^{l} + b^{l}),    (1)

where a^{(l)} is the output at layer l, and W^{l} and b^{l} are the weights and biases for that layer.
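As a minimal numpy illustration of Equation (1) with the stated 20-20 hidden-layer topology, a forward pass is sketched below. The sigmoid hidden activations and softmax output are assumptions (the excerpt does not name f_l), the 52-dimensional input assumes the first-derivative features are appended, and training with scaled conjugate gradient is not reproduced.

import numpy as np

rng = np.random.default_rng(0)
# 26 base features (52 with the appended first derivatives), two hidden
# layers of 20 neurons, and 3 output classes: note, cut, strike.
sizes = [52, 20, 20, 3]
# Small random non-zero initial weights and biases, as in Section 2.2.
W = [0.1 * rng.standard_normal((i, o)) for i, o in zip(sizes[:-1], sizes[1:])]
b = [0.1 * rng.standard_normal(o) for o in sizes[1:]]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(a):
    # Hidden layers follow Equation (1): a^l = f_l(a^{l-1} W^l + b^l).
    for Wl, bl in zip(W[:-1], b[:-1]):
        a = sigmoid(a @ Wl + bl)
    # Output layer gives class probabilities for note, cut and strike.
    return softmax(a @ W[-1] + b[-1])

Training would then minimise the cross entropy between these outputs and one-hot note/cut/strike labels, with a held-out validation set to stop over-fitting, as described above.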