Towards Measuring Intonation Quality of Choir Recordings: a Case Study on Bruckner’S Locus Iste
Total Page:16
File Type:pdf, Size:1020Kb
TOWARDS MEASURING INTONATION QUALITY OF CHOIR RECORDINGS: A CASE STUDY ON BRUCKNER’S LOCUS ISTE Christof Weiß, Sebastian J. Schlecht, Sebastian Rosenzweig, Meinard Müller International Audio Laboratories Erlangen, Germany [email protected] ABSTRACT a) b) ) Unaccompanied vocal music is a central part of West- cents ern art music, yet it requires excellent skills for singers to ( achieve proper intonation. In this paper, we analyze into- pitch pitch nation deficiencies by introducing an intonation cost mea- sure that can be computed from choir recordings and may Deviation help to assess the singers’ intonation quality. With our ap- proach, we measure the deviation between the recording’s c) d) local salient frequency content and an adaptive reference grid based on the equal-tempered scale. The adaptivity ) introduces invariance of the local intonation measure to cents global intonation drifts. In our experiments, we compute ( pitch this measure for several recordings of Anton Bruckner’s pitch choir piece Locus Iste. We demonstrate the robustness of Deviation the proposed measure by comparing scenarios of different complexity regarding the availability of aligned scores and multi-track recordings, as well as the number of singers per time time part. Even without using score information, our cost mea- Figure 1. Note-wise analysis for polyphonic music. sure shows interesting trends, thus indicating the potential (a) Global intonation matching a fixed reference grid. of our method for real-world applications. (b) Global intonation higher than a fixed reference grid. (c) Intonation drift shown against a fixed reference grid. (d) Intonation drift with deviations from an adaptive grid. 1. INTRODUCTION Unaccompanied vocal music constitutes the nucleus of Western art music and the starting point of polyphony’s idealized situation, where each note-wise pitch lies on a evolution. Despite an increasing number of studies [1,4,5, fixed reference grid. In Fig. 1b, all pitches are sharp (too 8–11,16,17,21] dating back to the 1930s [29], many facets high) compared to the same reference grid. The intonation of polyphonic a cappella singing are yet to be explored and quality, however, should be considered equivalent in both understood. A central challenge of a cappella singing is the cases. Fig. 1c illustrates a different situation with down- adjustment of pitch in order to stay in tune relative to the ward intonation drift. Using a fixed reference grid, the de- fellow singers. Even if choirs achieve locally good intona- viations gradually accumulate. To compensate for this, one tion, they may suffer from intonation drifts slowly evolving can use an adaptive reference grid [12, 15, 31] so that the over time [8–10,15–17,21,23]. Thus, one has to deal with vertical intonation quality is separated from the horizontal different intonation issues that refer to harmonic (or verti- intonation drift. The residual pitch deviations—relative to cal) and melodic (or horizontal) intonation. In Fig. 1, we the adaptive grid—only refer to vertical intonation prob- show how these different aspects of intonation quality may lems (Fig. 1d), which are in the focus of our analysis. be visualized separately. Our schematic example illustrates In this paper, we propose an intonation cost measure the assessment of note-wise pitch deviations (color-coded) that can be computed from choir recordings and may help in the presence of a global pitch drift. Fig. 1a shows the to assess the singers’ intonation quality. Developing such a measure encompasses two central challenges: (i) accu- rate estimation of the local salient frequency content from c Christof Weiß, Sebastian J. Schlecht, Sebastian Rosen- a choir recording, and, (ii) reliable measurement of intona- zweig, Meinard Müller. Licensed under a Creative Commons Attribution tion quality corresponding to human perception on the one 4.0 International License (CC BY 4.0). Attribution: Christof Weiß, Se- hand and to music theory on the other hand. bastian J. Schlecht, Sebastian Rosenzweig, Meinard Müller. “Towards Measuring Intonation Quality of Choir Recordings: A Case Study on Concerning the estimation of the local salient frequency Bruckner’s Locus Iste”, 20th International Society for Music Information content (i), recordings of polyphonic choir pieces con- Retrieval Conference, Delft, The Netherlands, 2019. stitute extremely difficult scenarios. Often, the different 276 Proceedings of the 20th ISMIR Conference, Delft, Netherlands, November 4-8, 2019 parts of a musical composition are highly correlated both , cent ∆ in rhythm (joint on- and offsets) and harmony (overlap of cent partials). In the case of a mix recording (single-track), this cent , ∆ leads to overlaps in time–frequency representations, which cent , ∆ makes the estimation of fundamental frequencies (F0s) [3] or partial tracking [25] hard. On the other hand, captur- Figure 2. Grid deviations ∆τ (fn) in cents of frequency ing intonation at the sub-semitone level requires high fre- components fn (solid gray lines) from a 12-TET grid quency resolutions. One can leverage these problems using (dashed blue lines) shifted by τ. The corresponding am- dedicated recording scenarios where singers are isolated plitudes an are indicated by the grayscale colors. acoustically [5] or recorded sequentially [4]. Alternatively, special devices such as Larynx microphones [6, 16, 28] or additional score information [9] can help to simplify the tuple (fn; an) 2 P denotes the frequency fn and ampli- F0-estimation problem [18, 27]. In Section 3.2, we detail tude an of an individual component. the strategy used for this paper’s experiments. First, we convert a given frequency f in Hertz (Hz) to For assessing intonation quality (ii), we aim towards de- cents by veloping a robust intonation cost measure. Ideally, a high f value of this measure indicates passages of low intonation Fcent(f) := 1200 · log2 ; (1) , f quality. Hereby, intonation quality may relate to human cent 0 ∆ cent perception such as the measure proposed in [30] based on where f0 := 55 Hz is an arbitrary but fixed reference fre- cent , ∆ psychometric curves [26]. On the other hand, intonation quency. We compute the deviation of the frequency com- , quality may be guided by music theory or historical per- ponent f from a 12-TET gridcent ∆ formance practice [9]. In particular, choirs often aim for just intonation, which involves complex adjustment strate- ∆τ (f) := min jFcent(f) − τ − 100ij ; (2) i2 gies according to the harmonic context [10]. In contrast to Z such ideas, we follow a simplified approach based on 12- where τ 2 [−50; 50[ specifies the overall grid shift in tone equal temperament (12-TET). Even though 12-TET cents, see Fig. 2. Applying a Gaussian-like function to the is not considered to be the ideal intonation practice for grid deviation ∆τ (f), we define the intonation cost (IC) Western choir performance, it is a first approximation and Θτ as can provide useful feedback [15]. As a major advantage, 2 P ∆τ (f) our strategy estimates the sub-semitone intonation quality (f;a)2P a 1 − exp − 2σ2 : for any type of notated chord irrespective of its harmonic Θτ (P) = P ; (3) (f;a)2P a consonance—in contrast to other methods [30] that mea- sure a mixture of harmonic consonance and intonation. where the deviations are weighted and then normalized by As our main contribution, we propose a 12-TET-based the corresponding amplitudes. Due to the normalization, intonation cost measure (Section 2). Inspired by prior we obtain Θτ (P) 2 [0; 1] where Θτ (P) = 1 indicates the work [15, 24], we accumulate the overall deviation of maximal IC. The parameter σ adjusts the cost for deviating frequency components from an adaptive 12-TET grid from the grid. As suggested by [24], we choose a value of weighted by their corresponding amplitudes. For testing σ = 16 cents. To obtain invariance to pitch drifts, i.e., this measure, we compiled a small but diverse dataset of variation of the reference frequency, we choose the grid Anton Bruckner’s Graduale Locus Iste using different per- shift τ in an adaptive way so that the IC is minimized: formances (Section 3). We evaluate the robustness of our ∗ τ := arg min Θτ (P): (4) method by comparing scenarios of varying complexity re- τ2[−50;50[ garding the availability of aligned scores and multi-track recordings (Section 4.1). Finally, we apply our method to We then define the intonation cost Θ as different performances and show its benefit for assessing : ∗ the overall intonation quality of a recording (Section 4.2). Θ(P) = Θτ (P): (5) Section 5 concludes the paper and gives an outlook on fu- For instance, in a scenario where a choir performs with ture work and practical applications. good local intonation but is affected by a pitch drift, τ ∗ slowly changes over time while Θ is constantly small. 2. MEASURING INTONATION QUALITY 2.2 Example with Synthetic Signals In the following, we describe the computation of an into- nation cost measure based on frequency deviations from a In the following, we illustrate the properties of the IC mea- 12-tone equal-temperament (12-TET) grid. sure Θ(P) by means of synthetic examples. To this end, we define a harmonic tone xf : R ! R with K partials and fundamental frequency f as 2.1 Intonation Cost Measure K The proposed measure operates on a set of N frequency X xf (t) := ak · sin (2πkft) ; (6) : components P = f(f1; a1);:::; (fN ; aN )g, where each k=1 277 Proceedings of the 20th ISMIR Conference, Delft, Netherlands, November 4-8, 2019 Figure 3. ICs Θτ for two harmonic tones xduad with an interval of size If1;f2 in cents and three grid shifts: adap- tive grid τ ∗, fixed grid τ = 0, and, fixed grid τ = −25. The F0 of the lower tone is fixed at f1 = 220 Hz.