Workshop HfM Karlsruhe Music Information Retrieval
Harmony Analysis
Christof Weiß, Frank Zalkow, Meinard Müller
International Audio Laboratories Erlangen
[email protected] [email protected] [email protected]

Book: Fundamentals of Music Processing
Meinard Müller: Fundamentals of Music Processing. Audio, Analysis, Algorithms, Applications. 483 p., 249 illus., hardcover, ISBN: 978-3-319-21944-8, Springer, 2015
Accompanying website: www.music-processing.de

Chapter 5: Chord Recognition
5.1 Basic Theory of Harmony
5.2 Template-Based Chord Recognition
5.3 HMM-Based Chord Recognition
5.4 Further Notes
In Chapter 5, we consider the problem of analyzing harmonic properties of a piece of music by determining a descriptive progression of chords from a given audio recording. We take this opportunity to first discuss some basic theory of harmony including concepts such as intervals, chords, and scales. Then, motivated by the automated chord recognition scenario, we introduce template-based matching procedures and hidden Markov models, a concept of central importance for the analysis of temporal patterns in time-dependent data streams including speech, gestures, and music.

Dissertation: Tonality-Based Style Analysis
Christof Weiß: Computational Methods for Tonality-Based Style Analysis of Classical Music Audio Recordings. Dissertation, Ilmenau University of Technology, 2017
Chapter 5: Analysis Methods for Key and Scale Structures
Chapter 6: Design of Tonal Features

Recall: Chroma Features
▪ Human perception of pitch is periodic
▪ Two components: tone height (octave) and chroma (pitch class)
[Figures: chromatic circle; Shepard's helix of pitch]

Recall: Chroma Representations
[Figure: chroma representation (salience/likelihood) over time in seconds]
▪ Orchestra: L. van Beethoven, Fidelio, Overture, Slovak Philharmonic
▪ Piano: Fidelio, Overture, arr. Alexander Zemlinsky; M. Namekawa, D. R. Davies, piano four hands

Tonal Structures
▪ Western music (and most other music): different aspects of harmony
▪ Referring to different time scales:
  ▪ Movement level: global key (e.g., C major) → global key detection
  ▪ Segment level: local keys (e.g., C major, G major, C major) → local key detection
  ▪ Chord level: chords (e.g., CM, GM7, Am) → chord recognition
  ▪ Note level: melody, middle voices, bass line → music transcription

Chord Recognition
[Example: chord annotations from www.ultimate-guitar.com]
[Figure: chroma representation and chord recognition result over time (seconds)]

Chord recognition pipeline:
Audio representation → Chroma representation → Pattern matching → Recognition result
▪ Prefiltering: compression, overtones, smoothing
▪ Postfiltering: smoothing, transition probabilities (HMM)
[Figure: chord templates for major triads and minor triads]

Chord Recognition: Basics
▪ Musical chord: group of three or more notes
▪ Combination of three or more tones which sound simultaneously
▪ Types: triads (major, minor, diminished, augmented), seventh chords, …
▪ Here: focus on major and minor triads
▪ Examples: C major (C), C minor (Cm)
▪ Enharmonic equivalence: 12 different root notes possible → 24 chords

Chord Recognition: Basics
Chords appear in different forms:
▪ Inversions
▪ Different voicings
▪ Harmonic figuration: broken chords (arpeggio)
▪ Melodic figuration: different melody notes (suspension, passing tone, …)
▪ Further: additional notes, incomplete chords

Chord Recognition: Basics
▪ Templates: major triads
[Figure: binary chroma templates for the 12 major triads C, D♭, D, E♭, E, F, G♭, G, A♭, A, B♭, B; vertical axis: chroma C, C♯/D♭, D, D♯/E♭, E, F, F♯/G♭, G, G♯/A♭, A, A♯/B♭, B]

▪ Templates: minor triads
[Figure: binary chroma templates for the 12 minor triads Cm, C♯m, Dm, E♭m, Em, Fm, F♯m, Gm, G♯m, Am, B♭m, Bm]

Chord Recognition: Template Matching
▪ Chroma vector for each audio frame
▪ 24 chord templates (12 major, 12 minor), e.g.:

Chroma   C  C♯  D  …  Cm  C♯m  Dm  …
B        0  0   0  …  0   0    0   …
A♯       0  0   0  …  0   0    0   …
A        0  0   1  …  0   0    1   …
G♯       0  1   0  …  0   1    0   …
G        1  0   0  …  1   0    0   …
F♯       0  0   1  …  0   0    0   …
F        0  1   0  …  0   0    1   …
E        1  0   0  …  0   1    0   …
D♯       0  0   0  …  1   0    0   …
D        0  0   1  …  0   0    1   …
C♯       0  1   0  …  0   1    0   …
C        1  0   0  …  1   0    0   …

▪ Compute for each frame the similarity of the chroma vector to the 24 templates

Chord Recognition: Template Matching
▪ Similarity measure: cosine similarity (inner product of normalized vectors)
▪ Chord template: $\mathbf{t} \in \{0,1\}^{12}$
▪ Chroma vector: $\mathbf{x} \in \mathbb{R}^{12}$
▪ Similarity measure: $s(\mathbf{t}, \mathbf{x}) = \dfrac{\langle \mathbf{t}, \mathbf{x} \rangle}{\lVert \mathbf{t} \rVert \cdot \lVert \mathbf{x} \rVert}$ (see the code sketch below)
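As a minimal illustration (not the reference implementation from the book), the cosine similarity between a binary chord template and a chroma vector can be computed with NumPy; the chroma values below are made up for the example:

```python
import numpy as np

# Binary template for a C major triad (chroma order: C, C#, D, ..., B)
template_c_major = np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0], dtype=float)

# Hypothetical chroma vector of one audio frame (values made up for illustration)
chroma = np.array([0.8, 0.1, 0.0, 0.1, 0.6, 0.1, 0.0, 0.7, 0.1, 0.0, 0.1, 0.0])

def cosine_similarity(t, x, eps=1e-9):
    """Inner product of the normalized vectors t and x."""
    return float(np.dot(t, x) / (np.linalg.norm(t) * np.linalg.norm(x) + eps))

print(cosine_similarity(template_c_major, chroma))  # close to 1 for a clear C major frame
```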
Chord Recognition: Template Matching
[Figure: chroma representation and chord similarity values over time (seconds)]

Chord Recognition: Label Assignment
▪ Chroma vector for each audio frame; 24 chord templates (12 major, 12 minor), see the table above
▪ Compute for each frame the similarity of the chroma vector to the 24 templates
▪ Assign to each frame the chord label of the template that maximizes the similarity to the chroma vector (see the code sketch below)
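The complete frame-wise procedure can be sketched as follows (illustrative NumPy code; the random chromagram merely stands in for a real chroma feature sequence):

```python
import numpy as np

CHROMA_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def chord_templates():
    """24 binary templates: 12 major and 12 minor triads (shape 24 x 12)."""
    major = np.zeros(12); major[[0, 4, 7]] = 1   # root, major third, fifth
    minor = np.zeros(12); minor[[0, 3, 7]] = 1   # root, minor third, fifth
    templates, labels = [], []
    for root in range(12):
        templates.append(np.roll(major, root)); labels.append(CHROMA_NAMES[root])
        templates.append(np.roll(minor, root)); labels.append(CHROMA_NAMES[root] + 'm')
    return np.array(templates), labels

def recognize_chords(chromagram):
    """Assign to each frame the label of the most similar template.
    chromagram: array of shape (12, N), one chroma vector per frame."""
    templates, labels = chord_templates()
    t_norm = templates / np.linalg.norm(templates, axis=1, keepdims=True)
    c_norm = chromagram / (np.linalg.norm(chromagram, axis=0, keepdims=True) + 1e-9)
    similarities = t_norm @ c_norm          # shape (24, N): cosine similarities
    best = np.argmax(similarities, axis=0)  # index of the best template per frame
    return [labels[i] for i in best]

# Toy usage with a random "chromagram" standing in for real chroma features
rng = np.random.default_rng(0)
chromagram = rng.random((12, 8))
print(recognize_chords(chromagram))
```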
[Figure: resulting frame-wise chord labels over time (seconds)]

Chord Recognition: Evaluation
[Figure: estimated chord labels compared with the reference annotation C, G, Am, F, C, G, F, C over time (seconds)]
▪ "No-Chord" annotations: not every frame is labeled
▪ Different evaluation measures:
  ▪ Precision: $P = \dfrac{\#\mathrm{TP}}{\#\mathrm{TP} + \#\mathrm{FP}}$
  ▪ Recall: $R = \dfrac{\#\mathrm{TP}}{\#\mathrm{TP} + \#\mathrm{FN}}$
▪ F-measure (balances precision and recall): $F = \dfrac{2 \cdot P \cdot R}{P + R}$
▪ Without "No-Chord" label: $P = R = F$ (see the code sketch below)
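A sketch of the frame-wise evaluation in Python; the label 'N' for "No-Chord" frames and the example sequences are assumptions for illustration:

```python
def evaluate_frames(est, ref, no_chord='N'):
    """Frame-wise precision, recall, and F-measure.
    est, ref: lists of chord labels per frame; no_chord marks "No-Chord" frames."""
    tp = sum(1 for e, r in zip(est, ref) if e == r and r != no_chord)
    fp = sum(1 for e, r in zip(est, ref) if e != no_chord and e != r)
    fn = sum(1 for e, r in zip(est, ref) if r != no_chord and e != r)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f_measure

# Example: 8 frames, the reference contains one "No-Chord" frame
ref = ['C', 'C', 'G', 'G', 'Am', 'N', 'F', 'C']
est = ['C', 'C', 'G', 'C', 'Am', 'F', 'F', 'C']
print(evaluate_frames(est, ref))
```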
Chord Recognition: Smoothing
▪ Apply an average filter of length $L \in \mathbb{N}$ (see the code sketch below)
[Figures: examples over time (seconds) illustrating the effect of smoothing]
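A minimal sketch of such an average (moving-mean) filter applied to each chroma band; applying it to the chroma sequence as prefiltering is one option here, since the filter could equally be applied to the similarity values:

```python
import numpy as np

def smooth_chromagram(chromagram, L):
    """Moving-average filter of length L along the time axis of a (12, N) chromagram."""
    kernel = np.ones(L) / L
    return np.apply_along_axis(lambda band: np.convolve(band, kernel, mode='same'),
                               axis=1, arr=chromagram)

# Toy usage: smooth a random chromagram with L = 5 frames
rng = np.random.default_rng(0)
chromagram = rng.random((12, 100))
smoothed = smooth_chromagram(chromagram, L=5)
print(smoothed.shape)  # (12, 100)
```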
Chord Recognition: Smoothing
▪ Evaluation on all Beatles songs
[Figure: F-measure (roughly 0.35 to 0.8) as a function of the smoothing length (1 to 25) for binary templates]

Markov Chains
▪ Probabilistic model for sequential data
▪ Markov property: next state only depends on current state (no "memory")
▪ Consists of:
  ▪ Set of states (hidden)
  ▪ State transition probabilities
  ▪ Initial state probabilities
[Figure: Markov chain with states C, G, F and transition probabilities]

Markov Chains
Notation:
▪ States $\alpha_i$ for $i \in [1:I]$
▪ State transition probabilities $a_{ij}$, collected in the matrix $A = (a_{ij})$ with rows and columns indexed by $\alpha_1, \ldots, \alpha_I$
▪ Initial state probabilities $c_i$, collected in the vector $C = (c_1, \ldots, c_I)$ (see the code sketch below)
[Figure: Markov chain with states C, G, F and transition probabilities]
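In code, this notation reduces to a transition matrix A and an initial distribution C. The following sketch uses one plausible reading of the C/G/F diagram for A; the uniform initial distribution is an assumption, since no values for C are given here:

```python
import numpy as np

states = ['C', 'G', 'F']

# Transition matrix A: A[i, j] = P(next state = states[j] | current state = states[i]).
# Values follow one reading of the C/G/F diagram (self-transitions 0.8, 0.7, 0.6).
A = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.3, 0.6]])

# Initial state probabilities C (uniform here; an assumption, not given numerically)
C = np.array([1/3, 1/3, 1/3])

def sample_sequence(A, C, length, seed=0):
    """Draw a state sequence from the Markov chain."""
    rng = np.random.default_rng(seed)
    s = [rng.choice(len(C), p=C)]
    for _ in range(length - 1):
        s.append(rng.choice(len(C), p=A[s[-1]]))
    return [states[i] for i in s]

print(sample_sequence(A, C, length=10))
```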
Markov Chains
▪ Application examples:
  ▪ Compute the probability of a sequence given a model (evaluation; see the code sketch below)
  ▪ Compare two sequences using a given model
  ▪ Evaluate a sequence with two different models (classification)
[Figure: Markov chain with states C, G, F and transition probabilities]
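For the evaluation use case, the probability of a given state sequence is the product of its initial probability and the traversed transition probabilities; a small self-contained sketch with the same toy model as above:

```python
import numpy as np

# Same illustrative model as above: states C, G, F
A = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.3, 0.6]])
C = np.array([1/3, 1/3, 1/3])

def sequence_probability(state_indices, A, C):
    """P(s_1, ..., s_N) = C[s_1] * prod_n A[s_{n-1}, s_n]."""
    prob = C[state_indices[0]]
    for prev, cur in zip(state_indices[:-1], state_indices[1:]):
        prob *= A[prev, cur]
    return prob

# Probability of the chord sequence C, C, G, G, F, C (indices 0, 0, 1, 1, 2, 0)
print(sequence_probability([0, 0, 1, 1, 2, 0], A, C))
```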
Hidden Markov Models
▪ States as hidden variables
▪ Consists of:
  ▪ Set of states (hidden)
  ▪ State transition probabilities
  ▪ Initial state probabilities
  ▪ Observations (visible)
  ▪ Emission probabilities
[Figure: HMM with hidden states C, G, F, transition probabilities, and emission probabilities for the observation symbols]

Hidden Markov Models
Notation:
▪ States $\alpha_i$ for $i \in [1:I]$
▪ State transition probabilities $a_{ij}$, collected in the matrix $A = (a_{ij})$ with rows and columns indexed by $\alpha_1, \ldots, \alpha_I$
▪ Initial state probabilities $c_i$, collected in the vector $C = (c_1, \ldots, c_I)$
▪ Observation symbols $\beta_k$ for $k \in [1:K]$
▪ Emission probabilities $b_{ik}$, collected in the matrix $B = (b_{ik})$ with rows indexed by the states $\alpha_i$ and columns by the symbols $\beta_k$ (see the code sketch below)
[Figure: HMM with hidden states C, G, F, transition probabilities, and an emission probability matrix for three observation symbols]
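Putting the notation together, an HMM is specified by the triple Θ = (A, C, B). The numerical values in the following sketch are illustrative only (the emission values from the slide figure are not reproduced here):

```python
import numpy as np

states = ['C', 'G', 'F']            # hidden states (alpha_i)
observations = ['b1', 'b2', 'b3']   # observation symbols (beta_k)

# Model parameters Theta = (A, C, B); all values are illustrative
A = np.array([[0.8, 0.1, 0.1],      # state transition probabilities a_ij
              [0.2, 0.7, 0.1],
              [0.1, 0.3, 0.6]])
C = np.array([1/3, 1/3, 1/3])       # initial state probabilities c_i
B = np.array([[0.7, 0.2, 0.1],      # emission probabilities b_ik
              [0.1, 0.8, 0.1],
              [0.2, 0.1, 0.7]])

def sample_hmm(A, C, B, length, seed=0):
    """Generate a hidden state sequence and an observation sequence from the HMM."""
    rng = np.random.default_rng(seed)
    hidden, observed = [], []
    state = rng.choice(len(C), p=C)
    for _ in range(length):
        hidden.append(states[state])
        observed.append(observations[rng.choice(len(observations), p=B[state])])
        state = rng.choice(len(C), p=A[state])
    return hidden, observed

print(sample_hmm(A, C, B, length=6))
```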
Markov Chains
▪ Analogy: a student's life
▪ Consists of:
  ▪ Set of states (hidden)
  ▪ State transition probabilities
  ▪ Initial state probabilities
[Figure: Markov chain with states sleep, coding, eating, social activity and transition probabilities]

Hidden Markov Models
▪ Analogy: a student's life
▪ Consists of:
  ▪ Set of states (hidden)
  ▪ State transition probabilities
  ▪ Initial state probabilities
  ▪ Observations (visible)
  ▪ Emission probabilities
[Figure: HMM with hidden states sleep, coding, eating, social activity and observations noise, light, smell]

Hidden Markov Models
▪ Only the observation sequence is visible! Different algorithmic problems:
  ▪ Evaluation problem
    ▪ Given: observation sequence and model
    ▪ Calculate how well the model matches the sequence
  ▪ Uncovering problem
    ▪ Given: observation sequence and model
    ▪ Find: optimal hidden state sequence
  ▪ Estimation problem ("training" the HMM)
    ▪ Given: observation sequence
    ▪ Find: model parameters
    ▪ Baum-Welch algorithm (Expectation-Maximization)

Uncovering problem
▪ Given: observation sequence $O = (o_1, \ldots, o_N)$ of length $N \in \mathbb{N}$ and HMM $\Theta$ (model parameters)
▪ Find: optimal hidden state sequence $S^* = (s_1^*, \ldots, s_N^*)$
▪ Corresponds to the chord estimation task!

Example:
Observation sequence $O = (o_1, o_2, o_3, o_4, o_5, o_6) = (\beta_1, \beta_3, \beta_1, \beta_3, \beta_3, \beta_2)$
Hidden state sequence $S^* = (s_1^*, \ldots, s_6^*) = (\alpha_1, \alpha_1, \alpha_1, \alpha_3, \alpha_3, \alpha_1)$, i.e., the chord labels C, C, C, G, G, C

Uncovering problem
▪ Optimal hidden state sequence?
  ▪ "Best explains" the given observation sequence $O$
  ▪ Maximizes the probability $P(O, S \mid \Theta)$:
    $\mathrm{Prob}^* = \max_S P(O, S \mid \Theta)$, $\quad S^* = \mathrm{argmax}_S\, P(O, S \mid \Theta)$
▪ Straightforward computation (naive approach, see the code sketch below):
  ▪ Compute the probability for each possible sequence $S$
  ▪ Number of possible sequences of length $N$ ($I$ = number of states):
    $\underbrace{I \cdot I \cdot \ldots \cdot I}_{N \text{ factors}} = I^N$ → computationally infeasible!
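The naive approach can be written down directly by enumerating all $I^N$ state sequences; it is only feasible for tiny examples. The model values below are the illustrative ones from above, and the observation sequence is the one from the example (β1, β3, β1, β3, β3, β2):

```python
import itertools
import numpy as np

# Illustrative toy model (not the values from the slide figure)
A = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.3, 0.6]])
C = np.array([1/3, 1/3, 1/3])
B = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.1, 0.7]])

def joint_probability(S, O, A, C, B):
    """P(O, S | Theta) for a state sequence S and observation sequence O (index lists)."""
    prob = C[S[0]] * B[S[0], O[0]]
    for n in range(1, len(O)):
        prob *= A[S[n - 1], S[n]] * B[S[n], O[n]]
    return prob

def uncover_brute_force(O, A, C, B):
    """Maximize P(O, S | Theta) by enumerating all I**N state sequences."""
    I, N = len(C), len(O)
    return max(itertools.product(range(I), repeat=N),
               key=lambda S: joint_probability(S, O, A, C, B))

O = [0, 2, 0, 2, 2, 1]  # observation sequence (beta_1, beta_3, beta_1, beta_3, beta_3, beta_2)
print(uncover_brute_force(O, A, C, B))  # optimal hidden state indices
```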
Viterbi Algorithm
▪ Based on dynamic programming (similar to DTW)
▪ Idea: recursive computation from subproblems
▪ Use truncated versions of the observation sequence:
  $O(1\!:\!n) := (o_1, \ldots, o_n)$ of length $n \in [1:N]$
▪ Define $\mathbf{D}(i, n)$ as the highest probability along a single state sequence $(s_1, \ldots, s_n)$ that ends in state $s_n = \alpha_i$:
  $\mathbf{D}(i, n) = \max_{(s_1, \ldots, s_{n-1})} P(O(1\!:\!n), s_1, \ldots, s_{n-1}, s_n = \alpha_i \mid \Theta)$
▪ Then, our solution is the state sequence yielding
  $\mathrm{Prob}^* = \max_{i \in [1:I]} \mathbf{D}(i, N)$

Viterbi Algorithm
▪ $\mathbf{D}$: matrix of size $I \times N$
▪ Recursive computation of $\mathbf{D}(i, n)$ along the column index $n$
▪ Initialization: $n = 1$
  ▪ Truncated observation sequence: $O(1) = (o_1)$
  ▪ Current observation: $o_1 = \beta_{k_1}$
  ▪ $\mathbf{D}(i, 1) = c_i \cdot b_{i k_1}$ for $i \in [1:I]$

Viterbi Algorithm
▪ $\mathbf{D}$: matrix of size $I \times N$
▪ Recursive computation of $\mathbf{D}(i, n)$ along the column index $n$
▪ Recursion: $n \in [2:N]$
  ▪ Truncated observation sequence: $O(1\!:\!n) = (o_1, \ldots, o_n)$
  ▪ Last observation: $o_n = \beta_{k_n}$
  ▪ Splitting off the last step: the best state sequence ending in $s_n = \alpha_i$ passes through some predecessor $s_{n-1} = \alpha_j$ whose partial probability must be maximal:
    $\mathbf{D}(i, n) = b_{i k_n} \cdot a_{j^* i} \cdot \mathbf{D}(j^*, n-1)$, where $j^*$ is the maximizing index
  ▪ This yields the recursion:
    $\mathbf{D}(i, n) = b_{i k_n} \cdot \max_{j \in [1:I]} \big( a_{j i} \cdot \mathbf{D}(j, n-1) \big)$ for $i \in [1:I]$

Viterbi Algorithm
▪ $\mathbf{D}$ given – find the optimal state sequence $S^* = (s_1^*, \ldots, s_N^*) := (\alpha_{i_1}, \ldots, \alpha_{i_N})$
▪ Backtracking procedure (reverse order, see the code sketch below)
▪ Last element: $n = N$
  ▪ Optimal state: $\alpha_{i_N}$ with $i_N = \mathrm{argmax}_{j \in [1:I]}\, \mathbf{D}(j, N)$
▪ Further elements: $n = N-1, N-2, \ldots, 1$
  ▪ Optimal state: $\alpha_{i_n}$ with $i_n = \mathrm{argmax}_{j \in [1:I]} \big( a_{j\, i_{n+1}} \cdot \mathbf{D}(j, n) \big)$
▪ Simplification of backtracking: keep track of the maximizing index $j^*$ in
  $\mathbf{D}(i, n) = b_{i k_n} \cdot \max_{j \in [1:I]} \big( a_{j i} \cdot \mathbf{D}(j, n-1) \big)$
▪ Define the $I \times (N-1)$ matrix $\mathbf{E}$:
  $\mathbf{E}(i, n-1) = \mathrm{argmax}_{j \in [1:I]} \big( a_{j i} \cdot \mathbf{D}(j, n-1) \big)$
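A compact NumPy sketch of the Viterbi algorithm following the D/E formulation above. The toy model values are the illustrative ones used earlier, and the observation sequence is the one from the slides; the result should match the optimal probability found by the brute-force search:

```python
import numpy as np

def viterbi(A, C, B, O):
    """Viterbi algorithm.
    A: (I, I) transition matrix, C: (I,) initial distribution,
    B: (I, K) emission matrix, O: observation sequence as a list of symbol indices.
    Returns the optimal state index sequence and its probability."""
    I, N = len(C), len(O)
    D = np.zeros((I, N))                  # D[i, n]: best probability ending in state i at position n
    E = np.zeros((I, N - 1), dtype=int)   # E[i, n-1]: maximizing predecessor index

    D[:, 0] = C * B[:, O[0]]              # initialization: D(i, 1) = c_i * b_{i k_1}
    for n in range(1, N):                 # recursion over the column index n
        trans = A * D[:, n - 1][:, None]  # trans[j, i] = a_ji * D(j, n-1)
        E[:, n - 1] = np.argmax(trans, axis=0)
        D[:, n] = B[:, O[n]] * np.max(trans, axis=0)

    # Backtracking (reverse order)
    path = np.zeros(N, dtype=int)
    path[-1] = np.argmax(D[:, -1])
    for n in range(N - 2, -1, -1):
        path[n] = E[path[n + 1], n]
    return path, np.max(D[:, -1])

# Illustrative toy model and the observation sequence from the example
A = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.3, 0.6]])
C = np.array([1/3, 1/3, 1/3])
B = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.1, 0.7]])
O = [0, 2, 0, 2, 2, 1]                    # (beta_1, beta_3, beta_1, beta_3, beta_3, beta_2)
path, prob = viterbi(A, C, B, O)
print(path, prob)
```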
Viterbi Algorithm Summary
[Figure: trellis of states $i \in [1:I]$ over the sequence index $n \in [1:N]$; initialization $\mathbf{D}(i, 1) = c_i \cdot b_{i k_1}$ in the first column, followed by the recursion column by column]