<<

1 Kristen Masada and Razvan Bunescu: A Segmental CRF Model for Chord Recognition in Symbolic

A. Types of Chords in Tonal Music minished triads most frequently contain a diminished A chord is a group of notes that form a cohesive har- interval (9 half steps), producing a fully di- monic unit to the listener when sounding simulta- minished , or a interval, neously (Aldwell et al., 2011). We design our sys- creating a half- chord. tem to handle the following types of chords: triads, augmented 6th chords, suspended chords, and power A.2 Augmented 6th Chords chords. An augmented 6th chord is a type of chromatic chord defined by an interval between the A.1 Triads lowest and highest notes of the chord (Aldwell et al., A triad is the prototypical instance of a chord. It is 2011). The three most common types of augmented based on a note, which forms the lowest note of a 6th chords are Italian, German, and French in standard position. A and a fifth are then chords, as shown in Figure 8 in the key of A minor. built on top of this root to create a three-note chord. In- In a , Italian sixth chords can be seen as verted triads also exist, where the third or fifth instead iv chords with a sharpened root, in the first inversion. appears as the lowest note. The chord labels used in Thus, they can be created by stacking the sixth, first, our system do not distinguish among inversions of the and sharpened fourth scale degrees. In minor, Ger- same chord. However, once the basic triad is deter- man sixth chords are iv7 (i.e. minor seventh) chords mined by the system, finding its inversion can be done with a sharpened root, in the first inversion. They are in a straightforward post-processing step, as a function formed by combining the sixth, first, third, and sharp- of the note in the chord. The quality of the third ened fourth scale degrees. Lastly, French sixth chords and fifth intervals of a chord in standard position deter- are created by stacking the sixth, first, second, and mines the mode of a triad. For our system, we consider sharpened fourth scale degrees. Thus, they are ii 7 three triad modes: major (maj), minor (min), and di- (i.e. half-diminished seventh) chords with a sharpened minished (dim). A major triad consists of a third, in . interval (i.e. 4 half steps) between the root and third, as well as a perfect fifth (7 half steps) between the root and fifth. A minor triad has a interval (3 half steps) between the root and third. Lastly, a dimin- ished triad maintains the minor third between the root and third, but contains a diminished fifth (6 half steps) between the root and fifth. Figure 7 shows these three Figure 8: Common types of augmented 6th chords, triad modes, together with three versions of a C ma- shown for the A minor scale. The same notes would jor chord, one for each possible type of added note, as also be used for the A . explained below. A.3 Suspended and Power Chords Both suspended and power chords are similar to triads in that they contain a root and a perfect fifth. They dif- fer, however, in their omission of the third. As shown in Figure 9, suspended second chords (sus2) use a sec- ond as replacement for this third, forming a major sec- Figure 7: Triads in 3 modes and with 3 added notes. ond (2 half steps) with the root, while suspended fourth chords (sus4) employ a as replacement A triad can contain an added note, or a fourth note. (Taylor, 1989). The suspended second and fourth of- We include three possible added notes in our system: ten resolve to a more stable third. In addition to these a fourth, a sixth, and a seventh. A fourth chord (add4) two kinds of suspended chords, our system considers contains an interval of a perfect fourth (5 half steps) suspended fourth chords that contain an added minor between the root and the added note for all modes. In seventh, forming a dominant seventh suspended fourth contrast, the interval between the root and added note chord (7sus4). of a sixth chord (add6) of any mode is a (9 half steps). For seventh chords (add7), the added note interval varies. If the triad is major, the added note can form a (11 half steps) with the root, called a . It can also form a minor seventh (10 half steps) to create a . If the triad is minor, the added sev- Figure 9: Suspended and power chords. enth can again either form an interval of a major sev- enth, creating a minor-major seventh chord, or a minor In contrast with suspended chords, power chords seventh, forming a . Finally, di- (pow) do not contain a replacement for the missing 2 Kristen Masada and Razvan Bunescu: A Segmental CRF Model for Chord Recognition in Symbolic Music

third. They simply consist of a root and a perfect fifth. are with respect to the segments they belong Though they are not formally considered to be chords to. in , they are commonly referred to in both rock and pop music (Denyer, 1992).

A.4 Chord Ambiguity Sometimes, the same set of notes can have multiple chord interpretations. For example, the German sixth chord shown in Figure 8 can also be interpreted as an F dominant seventh chord. Added notes can also lead Figure 11: Examples of figuration notes, in red. to other types of ambiguity, for example {D, F, A, C} Suspension: Note n belongs to the first event of could be an F major sixth chord (i.e. F major with segment s. There is an anchor note m in the previous an added sixth) or a D minor seventh chord (i.e. D event (last event in the previous segment) such that: m with an added minor seventh). Human and n have the same pitch; n is either tied with m (i.e. annotators can determine the correct chord interpre- held over) or m’s offset coincides with n’s onset (i.e. tation based on cues such as inversions and context. restruck); n is not longer than m; n is non-harmonic The semi-CRF model described in this paper captures with respect to chord y, while m is harmonic with re- inversions through the bass features (Appendix C.3), spect to the previous chord. whereas context is taken into account through the Anticipation: Note n belongs to the last event of chord bigram features (Appendix C.4). This could be segment s. There is an anchor note m in the next further improved by adding other features, such as de- event (first event in the next segment) such that: n termining how notes in the current chord resolve to and m have the same pitch; m is either tied with n (i.e. notes in the next chord. held over) or n’s offset coincides with m’s onset (i.e. B. Figuration Heuristics restruck); n is not longer than m; n is non-harmonic with respect to chord y, while m is harmonic relative We designed a set of heuristics to determine whether to all other notes in its event. a note n from a segment s is a figuration note with re- Furthermore, because the weak semi-CRF features spect to a candidate chord label y. The heuristic rules shown below discover four types of figurations: pass- shown in Equation ?? do not have access to the candi- date label yk 1 of the previous segment sk 1, we need ing and neighbor notes (Figure 10), and suspensions − − and anticipations (Figure 11). a heuristic to determine whether an anchor note is har- monic whenever the anchor note belongs to the previ- ous segment. The heuristic simply looks at the other notes in the event containing the anchor note: if the event contains 2 or more other notes, at least 2 of them need to be consonant with the anchor, i.e. intervals of , fifths, thirds, and their inversions; if the event contains just one note other than the anchor note, it Figure 10: Examples of figuration notes, in red. has to be consonant with the anchor.

Passing: There are two anchor notes n1 and n2 We emphasize that the rules mentioned above for such that: n1’s offset coincides with n’s onset; n2’s on- detecting figuration notes are only approximations. We set coincides with n’s offset; n1 is one scale step below recognize that correctly identifying figuration notes n and n2 is one step above n, or n1 is one step above n can also depend on subtler stylistic and contextual and n2 one step below; n is not longer than either n1 cues, thus allowing for exceptions to each of these or n2; the accent value of n is strictly smaller than the rules. accent value of n1; at least one of the two anchor notes belongs to segment s; n is non-harmonic with respect C. Chord Recognition Features to chord y, i.e. n is not equivalent to the root, third, Given a segment s and chord y, we will use the follow- fifth, or added note of y; both n1 and n2 are harmonic ing notation: with respect to the segments they belong to. • s.Notes, s.N = the set of notes in the segment s. Neighbor: There are two anchor notes n1 and n2 • s.Events, s.E = the sequence of events in s. such that: n1’s offset coincides with n’s onset; n2’s on- • e.len, n.len = the length (i.e. duration) of event set coincides with n’s offset; n1 and n2 are both either e or note n, in quarters. one step below or one step above n; n is not longer • e.acc, n.acc = the accent value of event e or note than either n1 or n2; the accent value of n is strictly n, as computed by the beatStrength() function smaller than the accent value of n1; at least one of in Music211. the two anchor notes belongs to segment s; n is non- harmonic with respect to chord y; both anchor notes 1Link to Music21: http://web.mit.edu/music21 3 Kristen Masada and Razvan Bunescu: A Segmental CRF Model for Chord Recognition in Symbolic Music

• y.root, y.third, and y.fifth = the triad tones of the The accent-weighted version f3(s, y) of the purity fea- chord y. ture weighs each note n by its accent weight n.acc: • y.added = the added note of chord y, if y is an X added tone chord. 1[n y] n.acc n s.Notes ∈ ∗ • s.Fig(y) = the set of notes in s that are figuration f3(s, y) ∈ X = n.acc with respect to chord y. n s.Notes • s.NonFig(y) s.Notes s.Fig(y) = the set of notes ∈ = − in s that are not figuration with respect to y. The 3 real-valued features are discretized using the default bin set B. Note that a note may contain multiple events, as such the note length n.len can be seen as the sum of the C.1.1 Figuration-Controlled Segment Purity length of all events that span the duration of that note. For each segment purity feature, we create a figuration- For example, the first G3 in the bass of Figure 2 has a controlled version that ignores notes that were heuris- length of a quarter – it corresponds to the G3 in mea- tically detected as figuration, i.e. replace s.Notes with sure 2 of Figure 1 and is shown as a tied note to sim- s.NonFig(y) in each feature formula. plify the description. Therefore its n.len 1. Each of the two events that span its duration have= a length of C.2 Chord Coverage an eighth, hence e1.len e2.len 0.5. The chord coverage features determine which of the = = chord notes belong to the segment. In this section, The accent value is determined based on the metri- each of the coverage features are non-zero only for ma- cal position of a note or event, e.g. in a song written jor, minor, and diminished triads and their added note in a 4/4 time signature, the first beat position would counterparts. This is implemented by first defining an have a value of 1.0, the third beat 0.5, and the second indicator function y.Triad that is 1 only for triads and and fourth beats 0.25. Any other eighth note position chords with added notes, and then multiplying it into within a beat would have a value of 0.125, any six- all the triad features from this section. teenth note position strictly within the beat would have a value of 0.0625, and so on. To determine whether y.Triad 1[y.mode {maj, min, dim}] a note n from a segment s is a figuration note with = ∈ respect to a candidate chord label y, we use a set of Furthermore, we compress notation by showing the heuristics, as detailed in Appendix B. mode predicates as attributes of the label, e.g. y.ma j is a predicate equivalent with testing whether y.mode The duration and accent-weighted segment-level = maj. Thus, an equivalent formulation of y.Triad is as features introduced in this section have real values. follows: Given a real-valued feature f (s, y) that takes values in [0,1], we discretize it into K 2 Boolean features by par- y.Triad 1[y.ma j y.min y.dim] + titioning the [0,1] interval into a set of K subinterval = ∨ ∨ To avoid clutter, we do not show y.Triad in any of the bins B {(bk 1,bk ] 1 k K }. For each bin, the corre- = − | ≤ ≤ sponding Boolean feature determines whether f (s, y) features below, although it is assumed to be multiplied ∈ into all of them. The first 3 coverage features refer to (bk 1,bk ]. Additionally, two Boolean features are de- − fined for the boundary cases f (s, y) 0 and f (s, y) 1. the triad notes: For each real-valued feature, unless= specified other-= f4(s, y) 1[y.root s.Notes] wise, we use the bin set B [0,0.1,...,0.9,1.0]. = ∈ = f (s, y) 1[y.third s.Notes] 5 = ∈ f (s, y) 1[y.fifth s.Notes] C.1 Segment Purity 6 = ∈

The segment purity feature f1(s, y) computes the frac- A separate feature determines if the segment contains tion of the notes in segment s that are harmonic, i.e. all the notes in the chord: belong to chord y: Y f7(s, y) 1[n s.Notes] = n y ∈ X ∈ 1[n y] A chord may have an added tone y.added, such as a n s.Notes ∈ f1(s, y) ∈ 4th, a 6th, or a 7th. If a chord has an added tone, we = s.Notes | | define two features that determine whether the seg- ment contains the added note: The duration-weighted version f2(s, y) of the purity feature weighs each note n by its length n.len: f8(s, y) 1[ y.added y.added s.Notes] = ∃ ∧ ∈ f9(s, y) 1[ y.added y.added s.Notes] X = ∃ ∧ ∉ 1[n y] n.len Through the first feature, the system can learn to pre- n s.Notes ∈ ∗ f2(s, y) ∈ X fer the added tone version of the chord when the seg- = n.len n s.Notes ment contains it, while the second feature enables the ∈ 4 Kristen Masada and Razvan Bunescu: A Segmental CRF Model for Chord Recognition in Symbolic Music

system to learn to prefer the triad-only version if no • y.5th = the note that forms a perfect fifth (for added tone is in the segment. To prevent the system German 6th chords) or a diminished fifth (for from recognizing added chords too liberally, we add French) above y.bass. a feature that is triggered whenever the total length of The features defined in this section are non-zero only the added notes in the segment is greater than the total for augmented 6th chord labels. Similar to Ap- length of the root: pendix C.2, we define an indicator function y.AS that X is 1 only for augmented 6th chords and implicitly mul- alen(s, y) 1[n y.added] n.len tiply this into each of the features from this section. = n s.Notes = ∗ ∈ X rlen(s, y) 1[n y.root] n.len y.AS 1[y.mode {it6, fr6, ger6}] = ∈ = n s.Notes = ∗ ∈ y.AS 1[y.i t6 y.f r 6 y.ger 6] f10(s, y) 1[ y.added] 1[alen(s, y) rlen(s, y)] = ∨ ∨ = ∃ ∗ > We define an additional indicator function y.FG that is The duration-weighted versions of the chord cover- 1 only for French and German 6th chords. age features weigh each chord tone by its total dura- y.FG 1[y.fr6 y.ger6] tion in the segment. For the root, the feature would be = ∨ computed as shown below: The coverage features for augmented 6th chords are X overall analogous to the ones for triad chords. 1[n y.root] n.len = ∗ n s.Notes as (s, y) 1[y.bass s.Notes] f11(s, y) ∈ X 1 = n.len = ∈ as2(s, y) 1[y.3rd s.Notes] n s.Notes ∈ = ∈ as3(s, y) 1[y.6th s.Notes] Similar features f and f are computed for the third = ∈ 12 13 as (s, y) 1[y.FG y.5th s.Notes] and the fifth. The corresponding accent-weighted fea- 4 = ∧ ∈ tures f14, f15, and f16 are computed in a similar way, The duration-weighted versions are as follows: by replacing the note duration n.len in the duration- X 1[n y.bass] n.len weighted formulas with the note accent value n.acc. n s.Notes = ∗ as5(s, y) ∈ X The duration-weighted feature for the added tone = n.len is computed similarly: n s.Notes ∈ X 1[ y.added] 1[n y.added] n.len X ∃ ∗ = ∗ 1[n y.5th] n.len n s.Notes = ∗ f17(s, y) ∈ X n s.Notes = n.len as6(s, y) 1[y.FG] ∈ X = ∗ n.len n s.Notes ∈ n s.Notes ∈ Furthermore, by replacing n.len with n.acc, we also As before, we replace n.len with n.acc to obtain the obtain the accent-weighted version f18. accent-weighted versions of as5 and as6. We also de- An alternative definition of duration-weighted fea- fine segment-based duration-weighted features: tures is based on the proportion of the segment time X that is covered by a particular chord note. The corre- 1[y.bass e] e.len e s.Events ∈ ∗ sponding duration-weighted feature for the chord root as7(s, y) ∈ X = e.len is shown below: e s.Events X ∈ 1[y.root e] e.len C.2.2 Chord Coverage for Suspended and Power Chords e s.Events ∈ ∗ f19(s, y) ∈ X As before, we define the features in this section to be = e.len non-zero only for suspended or labels by e s.Events ∈ implicitly multiplying them with an indicator function Similar duration-weighted features normalized by the y.SP. segment length are defined for thirds, fifths, and added y.SP 1[y.sus2 y.sus4 y.7sus4 y.pow] notes. = ∨ ∨ ∨ All duration-weighted and accent-weighted fea- The coverage features for suspended and power chords tures are discretized using the default bin set B. are also similar to the ones defined for triad chords. sp (s, y) 1[y.root s.Notes] C.2.1 Chord Coverage for Augmented 6th Chords 1 = ∈ We label each note appearing in an augmented 6th sp (s, y) 1[y.sus2 y.2nd s.Notes 2 = ∧ ∈ ∨ chord as follows: (y.sus4 y.7sus4) y.4th s.Notes)] • y.bass = the lowest note. ∨ ∧ ∈ sp (s, y) 1[y.5th s.Notes] • y.3rd = the note that is a third above y.bass. 3 = ∈ sp (s, y) 1[y.7sus4 y.7th s.Notes] • y.6th = the note that is an augmented sixth 4 = ∧ ∈ above y.bass. sp (s, y) 1[y.7sus4 y.7th s.Notes] 5 = ∧ ∉ 5 Kristen Masada and Razvan Bunescu: A Segmental CRF Model for Chord Recognition in Symbolic Music

X alen(s, y) 1[n y.7th] n.len mine s.E e.bass. The corresponding features will be: = n s.Notes = ∗ ∈ ∈ X f24(s, y) 1[y.root min e.bass] rlen(s, y) 1[n y.root] n.len = = e s.E = n s.Notes = ∗ ∈ ∈ f25(s, y) 1[y.third min e.bass] sp (s, y) 1[y.7sus4 alen(s, y) rlen(s, y)] = = e s.E 6 = ∧ > ∈ f26(s, y) 1[y.fifth min e.bass] = = e s.E ∈ The duration-weighted versions are as follows: f27(s, y) 1[ y.added y.added min e.bass] = ∃ ∧ = e s.E ∈ rlen(s, y) The duration-weighted version of the bass features sp7(s, y) X weigh each chord tone by the time it is used as the = n.len n s.Notes lowest note in each segment event, normalized by the ∈ alen(s, y) duration of the bass notes in all the events. For the sp8(s, y) 1[y.7sus4] X root, the feature is computed as shown below: = ∗ n.len n s.Notes X ∈ 1[e.bass y.root] e.len e s.Events = ∗ f28(s, y) ∈ X = e.len We also define accent-weighted versions of sp7 and e s.Events ∈ sp , as well as segment-based duration-weighted fea- 8 Similar features f29 and f30 are computed for the third tures: and the fifth. The duration-weighted feature for the X added tone is computed as follows: 1[y.root e] e.len X e s.Events ∈ ∗ 1[ y.added] 1[e.bass y.added] e.len sp9(s, y) ∈ X ∃ ∗e s.E = ∗ = e.len f31(s, y) ∈ X e s.Events = e.len ∈ e s.E ∈ The corresponding accent-weighted features f , f , C.2.3 Figuration-Controlled Chord Coverage 32 33 f34, and f35 are computed in a similar way, by replacing For each chord coverage feature, we create a the bass duration e.bass.len in the duration-weighted figuration-controlled version that ignores notes that formulas with the note accent value e.bass.acc. were heuristically detected as figuration, i.e. replace All duration-weighted and accent-weighted fea- s.Notes with s.NonFig(y) in each feature formula. tures are discretized using the default bin set B.

C.3 Bass C.3.1 Bass Features for Augmented 6th Chords The provides the foundation for the Similar to the chord coverage features in Ap- of a musical segment. For a correct segment, its bass pendix C.2.1, we assume that the indicator y.AS is mul- note often matches the root of its chord label. If the tiplied into all features in this section, which means bass note instead matches the chord’s third or fifth, they are non-zero only for augmented 6th chords. or is an added dissonance, this may indicate that the as8(s, y) 1[s.e1.bass y.bass] chord is inverted. Thus, comparing the bass note with = = as (s, y) 1[s.e .bass y.3rd] the chord tones can provide useful features for deter- 9 = 1 = as (s, y) 1[s.e .bass y.6th] mining whether a segment is compatible with a chord 10 = 1 = label. As in Appendix C.2, we implicitly multiply each as (s, y) 1[(y.f r 6 y.ger 6) s.e .bass y.5th] 11 = ∨ ∧ 1 = of these features with y.Triad so that they are non-zero only for triads and chords with added notes. as12(s, y) 1[y.bass min e.bass] = = e s.E ∈ There are multiple ways to define the bass note of a as13(s, y) 1[y.3rd min e.bass] = = e s.E segment s. One possible definition is the lowest note of ∈ as14(s, y) 1[y.6th min e.bass] the first event in the segment, i.e. s.e1.bass. Comparing = = e s.E ∈ it with the root, third, fifth, and added tones of a chord as15(s, y) 1[y.FG y.5th min e.bass] = ∧ = e s.E results in the following features: ∈ We define the following duration-weighted version for the augmented sixth bass and fifth. f (s, y) 1[s.e .bass y.root] 20 = 1 = X 1[e.bass y.bass] e.len f21(s, y) 1[s.e1.bass y.third] = = e s.E = ∗ f (s, y) 1[s.e .bass y.fifth] as16(s, y) ∈ X 22 = 1 = = e.len f (s, y) 1[ y.added s.e .bass y.added] e s.E 23 = ∃ ∧ 1 = ∈X 1[e.bass y.5th] e.len e s.E = ∗ An alternative definition of the bass note of a seg- as17(s, y) 1[y.FG] ∈ X = ∗ e.len ment is the lowest note in the entire segment, i.e. e s.E ∈ 6 Kristen Masada and Razvan Bunescu: A Segmental CRF Model for Chord Recognition in Symbolic Music

C.3.2 Bass Features for Suspended and Power Chords • Augmented 6th chords: Italian 6th (it6), French The indicator y.SP is multiplied into all features in this 6th (fr6), and German 6th (ger6). section like in Appendix C.2.2, meaning they are non- • Suspended chords: suspended second (sus2), zero only for suspended and power chords. suspended fourth (sus4), dominant seventh sus- pended fourth (7sus4). sp (s, y) 1[s.e .bass y.root] 10 = 1 = • Power (pow) chords. sp (s, y) 1[y.sus2 s.e .bass y.2nd Correspondingly, the chord bigrams can be generated 11 = ∧ 1 = ∨ (y.sus4 y.7sus4) s.e .bass y.4th] using the feature template below: ∨ ∧ 1 = sp12(s, y) 1[s.e1.bass y.5th] g (y, y0) 1[(y.mode, y0.mode) M M = = 1 = ∈ × sp13(s, y) 1[y.7sus4 s.e1.bass y.7th] = ∧ = (y.added, y0.added) { ,4,6,7} { ,4,6,7} ∧¯ ¯ ∈ ; × ; ¯y.root y0.root¯ {0,1,...,11}] ∧ − = sp14(s, y) 1[y.root min e.bass] = = e s.E ∈ Note that y.r oot is replaced with y.bass for augmented sp15(s, y) 1[y.sus2 y.2nd min e.bass = ∧ = e s.E ∨ 6th chords. Additionally, y.added is always none ( ) ∈ ; (y.sus4 y.7sus4) y.4th min e.bass] for augmented 6th, suspended, and power chords. ∨ ∧ =e s.E ∈ Thus, g1(y, y0) is a feature template that can generate sp16(s, y) 1[y.5th min e.bass] = = e s.E (3 triad modes 4 added + 3 aug6 modes + 3 sus ∈ × 2 sp17(s, y) 1[y.7sus4 y.7th min e.bass] modes + 1 pow mode) 12 intervals = 4,332 distinct = ∧ = e s.E × ∈ features. To reduce the number of features, we use The duration-weighted version for the root and sev- only the (mode.added)–(mode.added)’–interval combi- enth are computed as follows: nations that appear in the manually annotated chord bigrams from the training data. X 1[e.bass y.root] e.len e s.E = ∗ C.5 Chord Changes and Metrical Accent sp18(s, y) ∈ X = e.len In general, repeating a chord creates very little ac- e s.E cent, whereas changing a chord tends to attract an ac- ∈ X 1[e.bass y.7th] e.len cent (Aldwell et al., 2011). Although conflict between e s.E = ∗ sp19(s, y) 1[y.7sus4] ∈ X meter and harmony is an important compositional re- = ∗ e.len source, in general chord changes support the meter. e s.E ∈ Correspondingly, a new feature is defined as the accent C.3.3 Figuration-Controlled Bass value of the first event in a candidate segment: For each bass feature, we create a figuration-controlled f (s, y) s.e .acc version that ignores event bass notes that were heuris- 36 = 1 tically detected as figuration, i.e. replace e s.Events ∈ with e s.Events e.bass s.Fig(y) in each feature for- References mula. ∈ ∧ ∉ Aldwell, E., Schachter, C., and Cadwallader, A. (2011). Harmony and Voice Leading. Schirmer, 4th edition. C.4 Chord Bigrams Denyer, R. (1992). The Handbook: A Unique The arrangement of chords in chord progressions is Source Book for the Guitar Player. Knopf. an important component of harmonic syntax (Aldwell Radicioni, D. P. and Esposito, R. (2010). BREVE: an et al., 2011). A first-order semi-Markov CRF model can HMPerceptron-based chord recognition system. In capture chord sequencing information only through Advances in Music Information Retrieval, pages the chord labels y and y0 of the current and previ- 143–164. Springer Berlin Heidelberg. ous segment. To obtain features that generalize to un- Taylor, E. (1989). The AB Guide to Part 1. seen chord sequences, we follow Radicioni and Espos- ABRSM. ito (2010) and create chord bigram features using only the mode, the added note, and the interval in between the roots of the two chords. We define the possible modes of a chord label as follows:

M {maj, min, dim} = {it6, fr6, ger6} ∪ {sus2, sus4, 7sus4, pow} ∪ Other than the common major (maj), minor (min), and diminished (dim) modes, the following chord types have been included in M as modes: