Exploring the ii-V-I Chord Progression

CHAPTER 9: EXPLORING THE II-V-I CHORD PROGRESSION

The purpose of this chapter is to explore an important source of structure in Western music: a musically related sequence of chords called a chord progression. The chapter begins by discussing chord progressions in the context of establishing tonality. It then describes an important chord progression in jazz called the ii-V-I progression. The goal of this chapter is to train a number of different networks on this progression; when provided one chord, the network is trained to respond with the next chord in the progression that should be played. While all of the networks in the chapter are trained on the same jazz progression problem, different networks use different codes to represent input and output chords. The question of interest is whether the choice of encoding has any effect on the ease of discovering a solution to the progression problem. We present results that show that it does: when an abstract encoding is employed, a multilayer perceptron with many hidden units is required to learn the chord progression. In contrast, when different encodings are used the same problem can be learned by a perceptron that has no hidden units at all. We illustrate a practical implication of this by showing that a simple network can be easily interpreted. We end the chapter by pointing out that other factors might influence encoding selection; depending on the goals of a simulation, one may not always be seeking the encoding that leads to the simplest network solution.

9.1 Tonality and Chord Progressions
9.2 The ii-V-I Progression
9.3 The Importance of Encodings
9.4 Four Encodings of the ii-V-I Problem
9.5 Simulations With Pitch-Class Encoding
9.6 Simulations Using Pitch Encodings of Forms
9.7 Simulations Using Pitch Encodings of Inverted Forms
9.8 Simulations Using Encodings
9.9 Interpreting A Lead Sheet Perceptron
9.10 A Progression of Progressions
9.11 Summary and Implications
9.12 References

© Michael R. W. Dawson 2014 Chapter 9 Exploring the ii-V-I Progression 2

9.1 Tonality and Chord Progressions

Tonality is a central characteristic of Western music (Piston, 1962; Schoenberg, 1969). Tonality is the sense that a composition belongs to a particular musical key, and has been a topic central to many of the earlier parts of this book, including Chapter 3 on identifying scale tonics, Chapter 4 on identifying scale modes, and Chapter 5 on key finding.

In what is known as the era of ‘common practice’ in Western music, which spans the 18th and 19th centuries, establishing a musical key almost always meant establishing a major or minor key of the type that we have already encountered in previous chapters. Although other modes can and have been used in Western compositions, they are essentially ignored by common practice’s focus on the ‘major-minor system’ of music. “We are so imbued with this tradition that we tend to interpret music based on other modes as being in either major or minor, usually with somewhat unsatisfactory results” (Piston, 1962, p. 30).

In our earlier chapters on the modality and tonics of musical scales we noted that one could define a musical key by constraining pitch-class choice: specifically, by only using tones that belonged to a particular scale. For instance, to set a composition in the key of C major a composer would only select its tones from the C major scale, and not employ tones that do not belong to this scale.

However, establishing tonality is more complex than merely restricting the use of particular tones. The tones that define a musical scale have an organized relationship to one another, relationships that even listeners with no musical training are aware of (Krumhansl, 1990). Because of these relationships, different tones in a scale – typically identified by their scale degree (i.e. the Roman numerals used earlier in Figures 4-11 and 7-15), such as I for the tonic, IV for the subdominant, and V for the dominant – have specific tonal functions.

For example, “dominant and subdominant seem to give an impression of balanced support of the tonic, like two equidistant weights on either side of a fulcrum” (Piston, 1962, p. 31). Alternatively, some tones are intrinsically unstable in a particular musical key; their presence produces a musical tension that demands resolution in the form of hearing a more stable musical entity.

Establishing tonality, then, also requires communicating and exploiting the various relationships amongst tones in a musical key. “It is more a process of setting forth the organized relationship of these tones to one among them which is to be the tonal center” (Piston, 1962, p. 31). In common practice the process of establishing tonality and its structure is accomplished by using harmony.

The basic element of harmony is the musical interval, the simultaneous presence of two tones a specific musical distance apart. Chords involve the presentation of more than two simultaneous tones, and therefore the presence of more than one musical interval. Earlier in Chapter 7 we saw how one could construct particular chords (triads and tetrachords) on each of the degrees of a musical scale (Figure 7-15). Accounts of harmony often begin by considering the use of, and the relationship between, triads (Piston, 1962; Schoenberg, 1969).

Just as the presence of a single tone cannot by itself establish a musical key, the occurrence of a triad in isolation cannot establish tonality. “A triad standing alone is entirely indefinite in its harmonic meaning; it may be the tonic of one tonality or one degree of several others” (Schoenberg, 1969, p. 1). In order for tonality to be established, a succession of triads must be presented. The succession must be structured so that the relationship from one triad to the next takes the listener to an intended goal. Such a structured succession of chords is called a chord progression. In jazz a chord progression is often called ‘the changes’.

There are two strongly related aspects to creating a chord progression: establishing a sequence of chord roots, and defining the structure of each chord to establish an appropriate voice leading. Let us briefly consider these two aspects in turn.

Defining a particular succession of chords requires considering the succession of chord roots independently of the form (the inversion) of each chord. “Chord succession can be reduced to root succession (or root progression), which in turn can be translated into Roman numerals representing a succession of scale degrees” (Piston, 1962, p. 18). Piston notes that common practice reveals a set of typical root progressions which are summarized in Table 9-1. Each row of this table provides the root of the current chord in a progression, and notes the root of the next chord that typically follows, that sometimes follows, and that less often follows. For instance its first row can be interpreted as “If the current chord has I as its root, it is typically followed by a IV or by a V chord, it is sometimes followed by a VI chord, and it is less often followed by a II or III chord.”

Root Of Current Chord   Typically Followed By   Sometimes Followed By   Less Often Followed By
I                       IV or V                 VI                      II or III
II                      V                       VI                      I, III, or IV
III                     VI                      IV                      II or V
IV                      V                       I or II                 III or VI
V                       I                       VI or IV                III or II
VI                      II or V                 III or IV               I
VII                     III

Table 9-1. The usual progression of chord roots in common practice. See text for details.

Why do the root progressions summarized in Table 9-1 emerge from common practice? The reason is that the relations amongst chords with these roots in terms of a particular tonal center are such that they instill a particular musical direction to a listener. For instance, the progression from V to I defines the root progression of what is called the perfect cadence, which is a musical phrase that produces a satisfying dispersal of tension that can be used to signify the end of a phrase or of a composition. A similar, but less satisfying, effect is produced by the plagal cadence that proceeds from IV to I.

Other chord progressions intensify tension instead of relieving it. For example, an imperfect cadence or a half cadence ends on a V chord, and can be preceded by any of a number of different chords (e.g. IV or I). The tension produced by ending on the V chord provides a clear signal that further music is coming. It is “like a comma, indicating a partial stop in an unfinished statement” (Piston, 1962, p. 60).

Root progressions can also be used to perform other functions, such as modulating from one musical key to another (Schoenberg, 1969). We saw earlier in Chapter 4 that different musical scales are similar to one another because they share many tones. As a result the same chord can be found in more than one musical key; these are called common chords.

For instance, A minor is a common chord found in both the key of C major (where it is built on the VI scale degree) and in the key of G major (where it is built on the II scale degree). One can therefore use A minor as a pivot chord in a cadence that modulates the key of a composition from C major to G major.

Root successions are only one aspect of a chord progression. A second strongly related aspect is voice leading (see the earlier discussion of this term in Section 4.5.3). In choral music different voices perform the component notes of each chord in a progression. In addition to defining the succession of chord roots in this progression, a composer of choral music must also decide which voice is to move from one tone in the first chord to another tone in the second.

Common practice adopts principles that lead to efficient voice leading, which attempts to minimize the musical distance travelled by each voice as it moves from tone to tone in successive chords (Tymoczko, 2006, 2008, 2011). In compositional terms, this means choosing the form of each chord – in particular, the inversion of each chord (see the discussion of Figure 6-1) – that leads to the most efficient voice leading.

Although the notion of efficient voice leading is typically framed in the context of choral music, it plays an important role in other kinds of composition as well. For instance, when performing chord progressions on a piano efficient voice leading translates into using chord forms that minimize finger movements from one chord to the next (Sudnow, 1978). We will see later in this chapter, and again in Chapter 10, that using efficient voice leading to motivate the encoding of musical problems for a network can have a profound impact on network simplicity.

The harmonic structures of common practice are not restricted to classical music. Popular music also uses these principles, although typically in simpler form (e.g. Schoenberg, 1969, p. 2). Chord progressions also define the structure of most jazz pieces (Sudnow, 1978); a recent study examines ‘lead sheets’ – an abbreviated notation of chord progressions – that define a jazz repertory of 1,186 songs (Broze & Shanahan, 2013).

The harmonic structure of jazz can be as complex as that found in common practice. Indeed, there exist strong relationships between the harmonic structures of jazz and of classical music.

For instance, radical new harmonies were introduced to jazz in the 1940s and 1950s by such be-bop pioneers as Thelonious Monk, Charlie Parker, and Dizzy Gillespie (Kelley, 2009). However, analysis of its structure reveals the same tonal hierarchies that are the foundation of common practice harmony (Jarvinen, 1995). “The underlying structures of two different-sounding pieces of music, for example a Schubert lied and an improvisation by Hank Mobley, share a remarkably similar tonal hierarchy” (Jarvinen, 1995, p. 435).

Similarly, a casual listen to the free-form improvisations of saxophonist John Coltrane does not reveal traditional harmonic structure. His music can easily be described as a radical departure from be-bop (Porter, 1998). However, careful analysis of Coltrane’s music reveals structures that are inspired by classical music (Demsey, 1991). In particular, Demsey discovered that sections of songs on Coltrane’s seminal Giant Steps album were strongly related to exercises in Nicolas Slonimsky’s Thesaurus of Scales and Melodic Patterns (Slonimsky, 1947). This book was written for an intended audience of classical composers, but was part of Coltrane’s daily practice regimen in the 1950s. “Slonimsky may be the most direct link between John Coltrane and structural principles of the late nineteenth century” (Demsey, 1991, p. 155).

The current chapter presents our first attempt at examining harmonic structure in the context of successions of chords. It does so by exploring a common and important progression called the ii-V-I progression. Artificial neural networks learn this progression in the sense that when presented some chord (in a particular musical key) that belongs to the progression, the network will respond with the next chord (in the same musical key) that belongs to the progression.

The current chapter uses the ii-V-I progression to introduce another key idea that must be considered when using artificial neural networks to explore music: encoding. In any musical task, a researcher must make design decisions about how to represent musical stimuli for a network, and how to represent the musical responses of a network. A primary goal of the current chapter is to show that the choice of an encoding can substantially impact the nature of the network that learns the chord progression.

Recall from Sections 3.5 and 4.1 that the complexity of a classification problem is reflected in the complexity of the network that learns to perform the classification. Simple classification problems, such as the identification of scale tonics, can be performed by perceptrons which do not contain hidden units. More complex classification problems, such as the identification of scale mode, require more complicated networks that include hidden units (i.e. multilayer perceptrons).

We will see that when one encoding of the ii-V-I progression problem is used, a fairly complicated multilayer perceptron is required for its solution. However, if the identical jazz progression problem is encoded in a different fashion then a simpler network can solve the problem.

9.2 The ii-V-I Progression

9.2.1 Three Tetrachords Per Key

The topic of the current chapter is a succession of chords called the ii-V-I chord progression. This progression is extremely important and popular; it is likely the most commonly encountered in jazz (Levine, 1989). In its most basic form this progression involves three different tetrachords, each defined in the same musical key; as a result the ii-V-I progression can be written for each of the twelve different major keys in Western music. The three chords in any of these versions of the progression are constructed using particular notes in a major scale as their root; the scale used defines the key of the progression.

The first tetrachord in the ii-V-I progression is the minor seventh chord constructed using the second note of the major scale as its root. This is the ii chord; its Roman numeral name is written in lower case because it is minor, and also indicates the position of the chord’s root in the major scale for the chord’s musical key. For instance, the second note in the C major scale is D, so the ii tetrachord for the key of C is Dm7 which includes the notes D, F, A and C.

The second tetrachord in the ii-V-I progression is the dominant seventh tetrachord constructed using the fifth note of its key’s major scale as its root. In the C major scale this note is G, so in the key of C the V chord in the progression is G7 which uses the notes G, B, D, and F.

The third tetrachord in the ii-V-I progression is the major seventh tetrachord constructed using the first note of its key’s major scale as its root. In the C major scale this note is C, so in the key of C the I chord in the progression is Cmaj7 which contains the notes C, E, G, and B.

The procedure described above for constructing the three chords in the key of C is used to construct the ii-V-I progression using any other major scale. Table 9-2 provides the three chords in this progression for each of the possible major keys in Western music. Figure 9-1 presents a musical score which represents the chords in this progression in every key, with each chord in its root position.

Key          ii chord         V chord        I chord
A            Bm7              E7             Amaj7
A# or B♭     Cm7              F7             A#maj7 or B♭maj7
B            C#m7 or D♭m7     F#7 or G♭7     Bmaj7
C            Dm7              G7             Cmaj7
C# or D♭     D#m7 or E♭m7     G#7 or A♭7     C#maj7 or D♭maj7
D            Em7              A7             Dmaj7
D# or E♭     Fm7              A#7 or B♭7     D#maj7 or E♭maj7
E            F#m7 or G♭m7     B7             Emaj7
F            Gm7              C7             Fmaj7
F# or G♭     G#m7 or A♭m7     C#7 or D♭7     F#maj7 or G♭maj7
G            Am7              D7             Gmaj7
G# or A♭     A#m7 or B♭m7     D#7 or E♭7     G#maj7 or A♭maj7

Table 9-2. The three tetrachords that define the ii-V-I progression for each major key. Where appropriate two different enharmonic names of the same chord are provided.

The ii-V-I progression is important in jazz compositions for several reasons. First, it establishes tonality. For any major key, the most stable tones are notes I, IV, and V (Krumhansl, 1990), and the most stable chords are the ones built on those three notes. In other words, the ii-V-I progression involves two of the most stable pitch-classes of a major key, including chords built using the I and V pitch-classes as roots.

Second, in the perception of chord sequences there are definite preferences for the IV chord to resolve into the V chord, and for the V chord to resolve into the I chord, producing the IV-V-I progression that is common in classical music (Bharucha, 1984; Jarvinen, 1995; Katz, 1995; Krumhansl, Bharucha, & Kessler, 1982; Rosner & Narmour, 1992). The role of the IV chord in this relationship can also be served by a ii chord because of its minor nature (Steedman, 1984). Thus the ii-V-I progression is a powerful tool for establishing the tonality of a musical piece, operating in an analogous fashion to the IV-V-I.

Third, the ii-V-I progression lends itself to a further progression of chord progressions. That is, the ii-V-I progression in one key leads naturally to the ii-V-I progression in a different key. In particular, it is very easy to move from the last chord of the progression in one key to the first chord of the progression in a key that is a full tone lower. For instance, in the key of C the progression ends with Cmaj7; a performer can easily move from this chord to a Cmin7 which is the first chord of the ii-V-I progression in the key of B♭. As a result, one can move from one key to another, providing variety but also establishing tonality because the same progression is used in different but related keys.

Figure 9-1. The ii-V-I progression for every key with chords represented in root position. Each pair of bars presents the three tetrachords that provide the progression in a particular key; the key is defined by the major seventh chord that ends the progression.
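
The construction procedure of Section 9.2.1 can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the simulations reported in this chapter: the names NOTE_NAMES, CHORD_INTERVALS, PROGRESSION and ii_v_i are hypothetical, pitch-classes are numbered with C = 0, and only sharp spellings are used (so the key of B♭ must be requested as "A#").

```python
# Illustrative sketch only; all names below are hypothetical.
# Pitch-classes are numbered 0-11 with C = 0, using sharp spellings only.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets above the chord root for each tetrachord type.
CHORD_INTERVALS = {
    "m7": (0, 3, 7, 10),    # minor seventh
    "7": (0, 4, 7, 10),     # dominant seventh
    "maj7": (0, 4, 7, 11),  # major seventh
}

# (semitones of the chord root above the tonic, chord type) for ii, V and I.
PROGRESSION = [(2, "m7"), (7, "7"), (0, "maj7")]


def ii_v_i(key):
    """Return [(chord name, [note names])] for the ii-V-I progression in a major key."""
    tonic = NOTE_NAMES.index(key)
    chords = []
    for root_offset, quality in PROGRESSION:
        root = (tonic + root_offset) % 12
        notes = [NOTE_NAMES[(root + step) % 12] for step in CHORD_INTERVALS[quality]]
        chords.append((NOTE_NAMES[root] + quality, notes))
    return chords


for name, notes in ii_v_i("C"):
    print(name, notes)
```

Calling ii_v_i with each of the twelve note names reproduces the sharp-spelled entries of Table 9-2.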

9.2.2 The ii-V-I Progression Problem

In the current chapter we will be interested in training networks to generate the ii-V-I progression in any key; we will not be concerned with building networks to generate a progression of these progressions from key to key.

The ii-V-I progression problem is defined by considering it from the perspective of pattern classification. In pattern classification, a network generates a discrete class name in response to a presented stimulus; the name that it generates classifies the input pattern.

The ii-V-I progression can be viewed as involving exactly this sort of pattern classification. Imagine a situation in which a tetrachord is being presented to a network. Our goal is to have the network classify this input chord by generating a class name. However, in the ii-V-I progression problem, the discrete class name that is output is in fact another tetrachord. In particular, when presented one chord, the network’s task is to generate a representation of the next chord in the progression.

For example, consider the ii-V-I progression in the key of C, which involves the Dmin7, G7, and Cmaj7 chords. We want to train a network so that when Dmin7 is presented to its input units it responds with a representation of G7 in its output units. Similarly, when G7 is presented to its input units, it should generate Cmaj7 in its output units. We want analogous behavior from the network for the other eleven possible musical keys. Each key involves defining two input/output pairs, one involving the minor seventh and the dominant seventh chords, the other involving the dominant seventh and the major seventh chords. A major seventh chord is never used as an input pattern, and when properly trained the network will never generate a minor seventh chord as a response. The entire training set consists of 24 different input/output pattern pairs.

One focus of the current chapter is on the nature of an artificial neural network that can learn the ii-V-I progression problem. A second focus concerns the effects of encoding its tetrachords using different representational formats. We can encode the input and output chords for the ii-V-I progression problem in a number of different ways. Of particular interest in the current chapter is whether the choice of encoding impacts the complexity of the network required to learn the progression.

Before presenting the results of training networks on the ii-V-I progression problem, let us first discuss the importance of encoding, and how there are several different approaches to encoding tetrachords that are worthy of exploration, and which may have an effect on the kind of network required to learn the problem.
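
The structure of the training set described in this section can be sketched as follows. The sketch is an assumption-laden illustration rather than the code used for the simulations: chords are represented only by their names (no encoding has been chosen yet), sharp spellings are used throughout, and NOTE_NAMES and training_pairs are hypothetical names.

```python
# Illustrative sketch only; NOTE_NAMES and training_pairs are hypothetical names.
# Chords are represented by name; how to encode them is a separate decision.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def training_pairs():
    """Build the (input chord, output chord) pairs for all twelve major keys."""
    pairs = []
    for tonic in range(12):
        ii_chord = NOTE_NAMES[(tonic + 2) % 12] + "m7"  # minor seventh on degree 2
        v_chord = NOTE_NAMES[(tonic + 7) % 12] + "7"    # dominant seventh on degree 5
        i_chord = NOTE_NAMES[tonic] + "maj7"            # major seventh on the tonic
        pairs.append((ii_chord, v_chord))  # ii is presented, V is the response
        pairs.append((v_chord, i_chord))   # V is presented, I is the response
    return pairs


pairs = training_pairs()
print(len(pairs), pairs[:2])
```

Under these assumptions the twelve keys yield two pairs each, for 24 pairs in total; no major seventh chord appears as an input and no minor seventh chord appears as an output.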

9.3 The Importance of Encodings

9.3.1 Dasein and Design

In Being and Time (Heidegger, 1927/1962), philosopher Martin Heidegger explored a fundamental question: what does it mean for an entity to exist? In attempting to answer this question, Heidegger investigated different modes of being, and introduced the concept Dasein, which literally means ‘there-being’, and which is typically translated as Being-in-the-world.

Being-in-the-world is a notion that human existence can only be defined by recognizing that this existence is embedded or immersed in the day-to-day world. “Dasein’s understanding of Being pertains with equal primordiality both to an understanding of something like a ‘world’, and to the understanding of the Being of those entities which become accessible in the world” (Heidegger, 1927/1962, p. 33).

What does it mean for entities in the world to become accessible? Heidegger proposed that part of an agent’s engagement with the world involves using equipment. Equipment consists of entities that are experienced by agents in terms of the potential actions or experiences that they make available. Thus Heidegger’s notion of equipment seems similar to the later notion of affordance central to the ecological theory of perception (Gibson, 1979). According to Gibson (p. 127) "the affordances of the environment are what it offers the animal, what it provides or furnishes, either for good or ill."

Heidegger (1927/1962) also introduced the notion of readiness-to-hand as a property of equipment. Readiness-to-hand occurs when agents are properly engaged with equipment. With readiness-to-hand an entity’s affordances are properly experienced, but its other properties (like its physical existence) disappear. It is as if we are able to use a tool to interact with the world, and can experience this use, but the tool itself is invisible to us.

Heidegger’s philosophy plays a founding role in an important school of cognitive science called embodied cognitive science (Winograd & Flores, 1987). Embodied cognitive science attempts to explain cognition by focusing on the intrinsic relationships between agents, their bodies, and the structure of the world (Calvo & Gomila, 2008; Chemero, 2009; Dawson, 2013; Dawson, Dupuis, & Wilson, 2010; Shapiro, 2011, 2014; Varela, Thompson, & Rosch, 1991). Embodied cognitive science can easily be described as a school of thought that opposes many of the key assumptions of connectionist cognitive science (Dawson, 2013).

Heidegger’s concept of readiness-to-hand has played a key role in the debate between different schools of cognitive science (Vera & Simon, 1993; Winograd & Flores, 1987). Furthermore, this concept has served the purpose of finding links between embodied cognitive science and the science of design. Winograd and Flores took readiness-to-hand as evidence of good design; we only become aware of equipment itself when the structural coupling between world, equipment, and agent breaks down. In other words, if we are aware of the existence of a tool, then the tool is poorly designed.

The invisibility of artifacts – the readiness-to-hand of equipment – is also frequently characterized as being evidence of good design (Dourish, 2001; Norman, 1998, 2002, 2004). Winograd and Flores took the goal of designing equipment, such as human-computer interfaces, to be creating artifacts that are invisible to us when they are used. “A successful word processing device lets a person operate on the words and paragraphs displayed on the screen, without being aware of formulating and giving commands” (Winograd & Flores, 1987, p. 164).

9.3.2 Solutions by Design

Readiness-to-hand is not only relevant to artifacts and their design, but is also important to problem-solving. In cognitive science, problem-solving is typically described as searching a problem space (Newell & Simon, 1972). A problem space is a representation of the current state of knowledge about a problem. Applying a rule to change our knowledge of the problem is equivalent to moving to a new location (from one state of knowledge to the next) in the problem space. Solving a problem involves searching the problem space to find a route that moves one from the problem’s initial state through intermediate states of knowledge, finally ending at a solution to the problem.

The amount of time required to search through a problem space to find a route to the problem’s solution reflects problem difficulty. The longer the search, the harder the problem.

Crucially, search complexity depends in part upon the manner in which states of knowledge about the problem are encoded. If a problem is encoded using one representational scheme, then its solution may require a long and difficult search. However, if the same problem is encoded in a different format, then its difficulty can be drastically reduced.

For example, consider this famous anecdote that emerged from Thomas Edison’s laboratory in Menlo Park (Josephson, 1961). When mathematical physicist Francis Upton was initially hired, his first task for Edison was to calculate the volume of a pear-shaped glass bulb used for experiments on electric lighting. Upton represented this problem in a format suited for mathematical analysis. “Upton drew the shape of the bulb exactly on paper, and got the equation of its lines, with which he was going to calculate its contents” (Josephson, 1961, p. 193). After an hour, Edison asked Upton for the results, and was told that the mathematician was only halfway done and needed more time. Edison responded that a different representation of the problem would produce faster results: “‘Why’, said Edison, ‘I would simply take that bulb, fill it with a liquid, and measure its volume directly’.”

Edison’s example shows that one can represent a problem in such a way that its solution becomes trivial. That is, with the proper encoding, a problem’s solution exhibits readiness-to-hand: the solution is immediately apparent, and the process of searching for the solution is so trivial that it becomes invisible.

The idea that problem representations can make problem solving exhibit readiness-to-hand was central to Herbert Simon’s account of the sciences of the artificial (Simon, 1969). Simon recognized the importance of finding problem representations that worked by revealing solutions effortlessly: "All mathematical derivation can be viewed simply as change in representation, making evident what was previously true but obscure." (Simon, 1969, p. 77).

Simon argued that a great many different disciplines, including cognitive science, were in reality sciences of design because they studied the interface between inner and outer environments. When this interface is optimal, it exhibits readiness-to-hand, and disappears from experience. As a result, "in large part, the proper study of mankind is the science of design." (Simon, 1969, p. 83).

The theme of the current chapter and the next is to explore the connectionist cognitive science of music in the context of efficient design. In particular, it is possible to use many different encodings of the same musical problem. Even though the musical problem remains constant, changing the problem’s encoding can make it much more difficult – or much easier – for a network to learn.

9.4 Four Encodings of the ii-V-I Problem

In designing a training set to be used to teach a network the ii-V-I progression, one must decide how to represent tetrachords both as stimuli and as responses. Ideally, the choice of representation would be ‘theory neutral’ (Pylyshyn, 1984): regardless of our choice of representation, the results of training a network on the task would be the same. Not surprisingly, though, this ideal situation does not arise: different choices of how to represent tetrachords for the network lead to very different simulation results.

Let us first describe four plausible methods for representing tetrachords to networks that must learn the ii-V-I progression. Later in the chapter we will present results that clearly show that these choices are not theory neutral!

9.4.1 Pitch-Class Encoding

Most of the networks that have been described earlier in this book have employed a pitch-class representation, which is the first kind of encoding to consider for the ii-V-I progression. In this representation, only twelve units are required. Each unit represents the presence or absence of one of the possible pitch-classes in Western music.

One major advantage of pitch-class representation is its simplicity: a very small number of input and output units are required to represent any of the different tetrachords that can occur in the progression. A pitch-class representation of the ii-V-I problem requires only 12 input units to represent an input tetrachord, and the same number of output units to represent the tetrachord response generated by the network.

In pitch-class encoding, as we have seen in earlier chapters, a tetrachord stimulus is represented by turning on the four input units that represent the chord’s component pitch-classes, and by turning all of the other eight input units off. For the ii-V-I problem the network uses the same encoding to represent its tetrachord responses in the output units.

9.4.2 Pitch Encoding

All of the chords in the ii-V-I progressions of Figure 9-1 are presented in root position. That is, the lowest note of each tetrachord in either score is the chord’s root: the lowest note of Dm7 is D, the lowest note of G7 is G, and so on.

One consequence of having each tetrachord in root position is that there is a marked similarity in chord ‘shape’, which is the spacing between adjacent notes in the chord. Tetrachords of the same type (minor seventh, dominant seventh or major seventh) have very similar shape: four notes that are evenly spaced as they are stacked upon each other on the staff.

One can imagine that the input units used for pitch-class encoding are the keys of a small piano. The mapping between the input units and the piano keyboard is illustrated in Figure 9-2. However, this mapping reveals a possible disadvantage of pitch-class representation: when this encoding is adopted the similarity of shape between different chords of the same type is necessarily lost. That is, different spacing between notes – different chord inversions – is required to fit any of the tetrachords from Figure 9-1 on this keyboard because of its small size.

Figure 9-2. The mapping of the input units used for pitch-class encoding onto a 12-key piano keyboard.

This is demonstrated in Figure 9-3. This figure uses 12-key keyboards to represent four different tetrachords that are of the same type (minor seventh chords) but belong to different keys. Each belongs to the ii-V-I progression in a particular key. However, to fit each of these chords onto the small keyboard, different chord shapes are required.

For example, Figure 9-3 shows that the Amin7 can be fit on this keyboard in root position (the A is the lowest note, which is the leftmost note colored grey in the illustration). In contrast, Cmin7 must be fit using its first inversion (C is the second lowest note), Dmin7 must be fit using its second inversion (D is the second highest note), and Gmin7 must be fit using its third inversion (G is the highest note). This shows that a pitch-class encoding is not capable of preserving chord shape.

In order to create a representation that preserves tetrachord shape, we must abandon the central assumption that serves as the foundation of pitch-class encoding: octave equivalence. We must adopt an encoding that explicitly indicates that two different notes (e.g. C4, middle C, and C5, the C an octave higher than C4) do not belong to the same pitch-class, but are instead distinct pitches.

Pitch encoding is an alternative to pitch-class encoding, and abandons the octave equivalence assumption. In pitch encoding, each input unit represents the presence or absence of a particular pitch, and not of a pitch-class. For example, C4 and C5 are encoded with different input units. This is illustrated in Figure 9-4, which shows the mapping between pitches (particular piano keys) and input units over a two octave range. Note that in Figure 9-4 the input units are labeled as representing particular pitches (C4, C5, etc.) instead of pitch classes.

Figure 9-4. The mapping of the input units used for pitch encoding onto a 24-key piano keyboard.

In order to use pitch encoding to represent all of the tetrachords in Figure 9-1 more than 24 input units are required to capture all of the pitches. In our version of the problem, the highest key of the progression was G#, and the highest note was C#6 (the highest note in the D#7 Figure 9-3. Four different minor seventh tetrachords; notes that belong to the chord tetrachord for this key). Similarly, the lowest are shaded gre. To fit these chords on the 12- key of the progression was A, and as a key keyboard, four different chord shapes or result the lowest note that we used was A3 inversions are required. See text for details. (the lowest note in the Ama7 tetrachord for this key). As a result our pitch encoding of Using a representation that preserves a chords when all chords were in root position tetrachord’s shape could be critical, required 29 input units which represented all particularly if a chord’s shape provides of the pitches from A3 to C#6. information about its identity. Figure 9-3
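The contrast between the two encodings can be made concrete with a short sketch. It is an illustration only, not the code used for the simulations: the function names, the use of MIDI note numbers (C4 = 60), the ordering of the twelve pitch-class units from A to G#, and the octaves chosen for the four minor seventh chords are all assumptions introduced here. Only the 12-unit and 29-unit sizes and the A3 to C#6 range come from the text.

```python
# Hypothetical helper names; MIDI numbering (C4 = 60) is an assumption.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitones above the root for root-position seventh chords.
CHORD_INTERVALS = {"m7": (0, 3, 7, 10), "7": (0, 4, 7, 10), "maj7": (0, 4, 7, 11)}


def midi(name, octave):
    """MIDI note number of a named pitch, with C4 = 60."""
    return 12 * (octave + 1) + NOTE_NAMES.index(name)


LOWEST = midi("A", 3)                  # lowest pitch unit named in the text
HIGHEST = midi("C#", 6)                # highest pitch unit named in the text
N_PITCH_UNITS = HIGHEST - LOWEST + 1   # 29 units, A3 to C#6


def root_position(root_name, octave, chord_type):
    """Pitches of a root-position tetrachord."""
    root = midi(root_name, octave)
    return [root + step for step in CHORD_INTERVALS[chord_type]]


def pitch_class_code(pitches):
    """12 units ordered A..G# (assumed order); octave information is discarded."""
    code = [0] * 12
    for p in pitches:
        code[(p - LOWEST) % 12] = 1
    return code


def pitch_code(pitches):
    """29 units, one per pitch from A3 to C#6."""
    code = [0] * N_PITCH_UNITS
    for p in pitches:
        code[p - LOWEST] = 1
    return code


def shape(code):
    """Spacing between adjacent active units, read left to right."""
    on = [i for i, bit in enumerate(code) if bit]
    return tuple(b - a for a, b in zip(on, on[1:]))


# Four minor seventh chords in root position (octaves are illustrative).
chords = [root_position(*c, "m7") for c in (("A", 3), ("C", 4), ("D", 4), ("G", 4))]
pitch_shapes = {shape(pitch_code(c)) for c in chords}
pitch_class_shapes = {shape(pitch_class_code(c)) for c in chords}
```

With these assumptions the pitch encoding gives every minor seventh chord the same spacing, while the 12-unit pitch-class encoding gives each of the four chords a different one, which is the loss of shape that the text attributes to octave equivalence.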

© Michael R. W. Dawson 2014 Chapter 9 Exploring the ii-V-I Progression 12

9.4.3 Pitch Encoding of Inversions

Pitch encoding can be used to represent tetrachords in the form in which they are presented in Figure 9-1. However, as soon as octave equivalence is abandoned other versions of the ii-V-I progression problem are possible.

For instance, imagine that someone was interested in performing a ii-V-I progression on the piano. The Figure 9-1 score can certainly be performed on this instrument. However, a pianist might prefer alternative versions of the chords that reduce the hand and finger movement required when one moves from one chord to the next.

For instance, if one uses the second inversion of every dominant seventh chord in the progression, then a ‘least action’ version of the progression emerges. The second inversion of a dominant seventh chord is created by taking the two lowest notes in the chord’s root position and raising each an octave. Figure 9-5 provides a version of the Figure 9-1 score in which each of the dominant seventh chords has been inverted. If one compares the Figure 9-5 score to the Figure 9-1 score, then the difference in shape between the dominant seventh tetrachords in each will be apparent.

Figure 9-5. The ii-V-I progression for each possible key. The score is identical to Figure 9-1 with the exception that all dominant seventh chords are represented as second inversions. See text for details.
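The ‘least action’ idea can be checked with a few lines of code. The sketch below is an illustration under stated assumptions rather than a description of the scores in Figures 9-1 and 9-5: the MIDI note numbers chosen for the key of C are assumed voicings, and the helper names are hypothetical. The text builds the second inversion by raising the two lowest notes an octave; the sketch instead lowers the two highest notes, which yields the same pitch-classes in the same order one octave lower and keeps the chord in the register of its neighbours.

```python
# Assumed root-position voicings for the key of C, as MIDI numbers (C4 = 60).
DM7 = [62, 65, 69, 72]      # D4 F4 A4 C5
G7_ROOT = [67, 71, 74, 77]  # G4 B4 D5 F5
CMAJ7 = [60, 64, 67, 71]    # C4 E4 G4 B4


def second_inversion(root_position):
    """Second inversion (fifth in the bass), built by lowering the two
    highest notes an octave. This is an assumed variant of the text's
    'raise the two lowest notes' recipe; the pitch-classes are the same."""
    low, mid_low, mid_high, high = sorted(root_position)
    return [mid_high - 12, high - 12, low, mid_low]


def fingers_moved(chord_a, chord_b):
    """Number of voices (fingers) that change key between two chords,
    pairing notes from lowest to highest."""
    return sum(1 for a, b in zip(sorted(chord_a), sorted(chord_b)) if a != b)


G7_INVERTED = second_inversion(G7_ROOT)   # D4 F4 G4 B4

root_version = [fingers_moved(DM7, G7_ROOT), fingers_moved(G7_ROOT, CMAJ7)]
inverted_version = [fingers_moved(DM7, G7_INVERTED), fingers_moved(G7_INVERTED, CMAJ7)]
```

Under these assumed voicings every finger moves at each step of the root-position version, whereas only two fingers move at each step when the dominant seventh is inverted.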


How does inverting the middle chord of the ii-V-I progression enable least action movement for a pianist? Figure 9-6 illustrates voice leading – i.e. finger movements from one chord to the next – for the ii-V-I progression in the key of C to shed light on this issue.

The top three keyboards in Figure 9-6 illustrate the voice leading when the dominant seventh chord is in root position. The arrows indicate finger movements from chord to chord. Note that because the middle chord is in root position, substantial movement from chord to chord is required: each finger moves to a different key to play the next chord, and the hand must move up and then back down along the keyboard.

Figure 9-6. Voice leading for two versions of the ii-V-I progression. See text for details.

The lower half of Figure 9-6 shows that if the middle chord is played in second inversion form, much less movement is required. The hand stays at the same position along the keyboard, and moving from one chord to the next only requires changing the position of two fingers. Two fingers press the same keys in successive chords for this version of the progression!

In short, an alternative approach to encoding the ii-V-I progression problem is to use pitch encoding, but to also take advantage of its flexibility by presenting dominant seventh chords in their second inversion form. One consequence of this is that slightly fewer processing units are required; all of the tetrachords can be encoded using 24 input units, with the lowest unit representing A3 and the highest unit representing G#5.

9.4.4 Lead Sheet Encoding

All of the encodings that have been described above represent each pitch-class or each pitch in a tetrachord. As a result, all involve activating four processing units, and turning all of the remaining processors off.

However, there are many other ways in which tetrachords could be represented, and some of these representations are not concerned with detailing each note in a chord.

For instance, one popular approach to teaching adults how to play piano (Houston, 2004) attempts to simplify music reading by eliminating traditional notation of chords (notation like that found in Figures 9-1 and 9-5). Instead chords are represented in what is called lead sheet notation: they are simply written as a combination of the name of one note (to provide the chord’s root) and some additional symbols which indicate the type of chord. For instance, if one was using lead sheet notation for the ii-V-I progression in the key of C, the chords would merely be written as ‘Dm7’, ‘G7’, and ‘Cmaj7’.

We can easily create a lead sheet encoding for an artificial neural network that is to learn the ii-V-I progression. This encoding is very simple, and only requires 15 processors, as is illustrated in Figure 9-7. Three of these processors are used to indicate a chord’s type, where only three chord types (m7, 7, maj7) are involved in the ii-V-I progression problem. The remaining twelve processors represent the chord’s root pitch using pitch-class encoding. For example, Figure 9-7 demonstrates how the Dm7 tetrachord can be represented by only activating two units: the unit that represents that the chord is a minor seventh and the unit that indicates that the chord’s root is the pitch-class D.

Figure 9-7. Lead sheet encoding of the Dm7 tetrachord for an artificial neural network. See text for details.

9.4.5 Summary and Implications

The sections above have discussed four different methods for encoding stimuli (and responses) for the ii-V-I progression problem. The first is pitch-class encoding, which has been employed in previous chapters. It has the advantage of simplicity, requiring a reasonably small number of input units. However, it has the disadvantage of using different chord shapes to represent chords of the same type that come from different keys.

The second is pitch encoding, which abandons octave equivalence and represents notes that belong to the same pitch-class, but to different octaves, with different processors. This encoding has the disadvantage of requiring more processing units, but has the advantage of preserving chord shape across keys.

Third, because pitch encoding abandons octave equivalence it also permits different chord inversions to be presented to the network. This permits us to explore the possibility that chord forms that are easier to play (because of their ‘least action’ shapes) may also be easier for a network to learn.

Finally, alternative encodings that are not intent on representing every note in a chord can also be employed. One that was described for the ii-V-I progression problem is lead sheet encoding. This type of encoding has the disadvantage of not explicitly representing a chord’s pitches or its shape. However, it has the advantage of being extremely simple because any chord in a ii-V-I progression can be represented by simply activating two processing units.

With these possible encodings of the ii-V-I progression described, we can now investigate the effect of problem encoding on network learning. Does problem representation affect network complexity? Does problem encoding alter the amount of training required for a network to solve the ii-V-I progression problem?
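The lead sheet encoding of Section 9.4.4 is simple enough to sketch directly. The code below is a minimal illustration: the function names and the parsing of chord symbols are hypothetical, and the ordering of the 15 units (three chord types followed by the roots A to G#) is an assumption borrowed from the layout of Table 9-3 rather than something Figure 9-7 is known to prescribe.

```python
CHORD_TYPES = ["m7", "7", "maj7"]   # three chord-type units
ROOTS = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def parse_symbol(symbol):
    """Split a lead sheet symbol such as 'Dm7' into (root, chord type)."""
    for suffix in ("maj7", "m7", "7"):   # test the longest suffixes first
        if symbol.endswith(suffix):
            return symbol[: -len(suffix)], suffix
    raise ValueError(f"unsupported chord symbol: {symbol}")


def lead_sheet_code(symbol):
    """15-unit code: exactly one chord-type unit and one root unit are on."""
    root, chord_type = parse_symbol(symbol)
    code = [0] * (len(CHORD_TYPES) + len(ROOTS))
    code[CHORD_TYPES.index(chord_type)] = 1
    code[len(CHORD_TYPES) + ROOTS.index(root)] = 1
    return code


dm7 = lead_sheet_code("Dm7")
progression_in_c = [lead_sheet_code(s) for s in ("Dm7", "G7", "Cmaj7")]
```

Every chord in the progression activates exactly two of the 15 units, in contrast to the four active units used by the pitch-class and pitch encodings.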


9.5 Simulations With Pitch-Class Encoding

Figure 9-8. A multilayer perceptron trained on the ii-V-I progression task. This network encoded the input and output tetrachords using pitch-class representation. See text for details.

9.5.1 Task

All of the networks to be described in the remaining sections of this chapter learn the ii-V-I progression problem that was described in Section 9.2 using one of the encodings that were discussed in Section 9.4. The current section describes training an artificial neural network when the inputs and outputs of the ii-V-I progression problem are represented using pitch-class encoding.

9.5.2 Training Set

The networks described in Section 9.5 all use pitch-class encoding of the ii-V-I progression problem. The training set for this problem consists of 24 different input/output pairs, where each member of a pair is a particular tetrachord.

9.5.3 Network Architecture

All of the networks described in this section require 12 input units and 12 output units because pitch-class encoding was used. All of the output processors were value units that employ the Gaussian activation function.

Pilot studies were conducted to determine whether a network that uses pitch-class encoding for the ii-V-I progression problem requires hidden units, and if so, then how many. Results indicated that a network with 7 hidden value units could reliably converge on a solution to either the two-chord-per-key or the three-chord-per-key versions of the task. On occasion, a network with 6 hidden value units could solve the problem, but in most cases networks of this size failed to converge after 30,000 epochs of training or more.

As a result, we decided that a multilayer perceptron with 12 output value units, 7 hidden value units, and 12 input units was the most appropriate for learning the pitch-class version of either ii-V-I progression task. The structure of such a network is illustrated in Figure 9-8.

9.5.4 Training

When a multilayer perceptron was trained on the ii-V-I progression problem, the learning rate was 0.01, and connection weights were randomly initialized to values in the range from -0.1 to 0.1. All µs were started at zero, but were trained during learning. (When µs were held constant at 0 networks did not learn to solve the problem.) Typically a network solved this problem in between 3000 and 4000 epochs, where (as in previous chapters) convergence was defined as generating a hit for every output unit on every training pattern.

To quantify network performance we conducted a small experiment in which ten different multilayer perceptrons (our ‘subjects’) were trained to convergence using the architecture and training settings detailed above. Every one of these networks solved the problem. On average convergence was achieved after 3960.6 epochs of training (SD = 1196.1).

When the number of hidden units was reduced from 7 to 6, and the network was trained using these settings, convergence was rarely achieved. However, on rare occasions a network was successful in learning the ii-V-I progression. When this occurred between 23,000 and 33,000 epochs of training were required.

9.5.5 Network Interpretation

With converged networks for the ii-V-I progression problem, the typical next step in our research program would be to interpret a network’s internal structure. There is no reason to believe that this could not be done for one of the converged networks that used pitch-class encoding. The fact that such a network has seven hidden units suggests that interpretation might be challenging and time consuming, but it is certainly tractable.

However, before attempting to interpret one of the networks trained above we could explore different networks that learn the same problem, but use different input/output encodings. This is because it is possible that a change in encoding might produce a network that is simpler and is therefore easier to interpret.

For our purposes, ‘simpler’ has an objective definition: a network with fewer hidden units is simpler than a network with more hidden units. For instance, if a change of encoding permitted a network with only 4 hidden units to solve the ii-V-I progression problem, then this encoding makes the problem simpler than pitch-class encoding (which requires 7 hidden units, as illustrated in Figure 9-8). Furthermore, if a multilayer perceptron can be replaced by a perceptron that has no hidden units, then the encoding has dramatically simplified the problem.

One of the key advantages of solving a problem with a simpler network is that the internal structure of a simpler network should be easier to interpret. In addition, there may be some important theoretical issues that simpler networks permit to be addressed.

So, let us first explore the results of training networks using different encodings of the ii-V-I progression problem before deciding on which network to interpret!
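To give a sense of the size of the network described in Section 9.5.3, the following sketch builds an untrained 12-7-12 network of value units and passes one pitch-class coded chord through it. It is an illustration only. The section says that value units use a Gaussian activation function but does not give its formula, so the form exp(-π(net − µ)²) used here is an assumption, as are the function names and the A to G# ordering of the input units. The weight range and the starting value of µ follow Section 9.5.4; no learning rule is implemented.

```python
import math
import random


def value_unit(net, mu):
    """Gaussian activation (assumed form): 1.0 when net equals mu,
    falling towards 0 as net moves away from mu."""
    return math.exp(-math.pi * (net - mu) ** 2)


def init_layer(n_in, n_out, rng):
    """Weights drawn uniformly from [-0.1, 0.1]; every mu starts at zero."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    mus = [0.0] * n_out
    return weights, mus


def layer_output(inputs, weights, mus):
    """Activity of each value unit in a layer for one input pattern."""
    return [
        value_unit(sum(w * x for w, x in zip(row, inputs)), mu)
        for row, mu in zip(weights, mus)
    ]


rng = random.Random(0)                  # fixed seed so the sketch is repeatable
hidden_layer = init_layer(12, 7, rng)   # 12 input units -> 7 hidden value units
output_layer = init_layer(7, 12, rng)   # 7 hidden units -> 12 output value units

# Dm7 in pitch-class code, units ordered A..G# (assumed): A, C, D and F are on.
dm7 = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0]

hidden_activity = layer_output(dm7, *hidden_layer)
output_activity = layer_output(hidden_activity, *output_layer)

# Modifiable components: two weight matrices plus one mu per value unit.
n_parameters = 12 * 7 + 7 + 7 * 12 + 12
```

Counted this way the pitch-class network has 187 modifiable components, all of which would have to be examined when interpreting it.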


9.6 Simulations Using Pitch Encodings of Root Forms

Figure 9-9. A perceptron trained on the ii-V-I progression task. This network encoded the input and output tetrachords using pitch representation of chords in root position. Every input unit is connected to every output unit; only a subset of these connections are illustrated in the figure. See text for details.

9.6.1 Task

The next networks to consider were also trained on the ii-V-I progression problem. However, the difference between these networks and those discussed in Section 9.5 is that the current networks used pitch encoding (instead of pitch-class encoding), and encoded every tetrachord in root position.

9.6.2 Network Architecture

Because all of the networks described in this section used pitch encoding of chords in root position, they all employed 29 input and 29 output units. The lowest pitch represented by an input (or output) unit was A3, and the highest pitch represented by an input (or output) unit was C#6. All of the output processors were value units that employ the Gaussian activation function.

Importantly, pilot tests revealed that, unlike the networks described in Section 9.5, no hidden units were required to solve either problem when it was encoded in this fashion. By changing the encoding of the input/output pairs it was now possible for a perceptron to discover a solution to either version of the ii-V-I problem! The network capable of solving the problems is illustrated in Figure 9-9.

9.6.3 Training

Training proceeded with a learning rate of 0.1, and connection weights started randomly in the range from -0.1 to 0.1. All µs were held at zero throughout learning. Typically a perceptron would converge on a solution to the problem in fewer than 80 epochs of training, where convergence was defined as generating a hit for every output unit on every training pattern. We conducted a small study in which ten different perceptrons were trained on this problem. All ten ‘subjects’ learned to solve the problem. With this encoding on average 63.7 epochs of training were required for a network to learn the ii-V-I progression (SD = 7.36).

9.6.4 Implications

Though the networks were trained on the same tasks, the choice of encoding had enormous impact. When pitch-class encoding was used, multilayer perceptrons that contained 7 hidden units were required to solve the two versions of the problem, and did so after about 4000 epochs of training. In contrast, encoding the same problem in terms of pitches resulted in a much simpler network – a perceptron – that converged after only about 65 epochs of training.


9.7 Simulations Using Pitch Encodings of Inverted Forms

9.7.1 Task

The next networks to consider were also trained on the ii-V-I progression problem using pitch encoding. However, the difference between these networks and those discussed in Section 9.6 is that the current networks took advantage of pitch encoding’s flexibility and encoded dominant seventh chords as second inversions. Minor seventh and major seventh chords were still encoded in root position. As discussed in Section 9.4.3, inverting the dominant seventh chords in this way produces ‘least action’ transformations between chords in the ii-V-I progression.

9.7.2 Network Architecture

While the networks described in this section use pitch encoding (as did the networks described in Section 9.6), using the second inversion of dominant sevenths meant that fewer input (and output) units were required to represent chords. The networks discussed in the current section only require 24 input and output units. The lowest pitch represented by a unit was A3, and the highest pitch represented by a unit was G#5. All other pitches between these two extremes were represented by an input (and output) processor. All of the output units were value units. Once again pilot studies revealed that a perceptron was capable of learning the ii-V-I progression with this representation of inputs and outputs.

9.7.3 Training

Training proceeded with a learning rate of 0.1, and connection weights started randomly in the range from -0.1 to 0.1. All µs were held at zero throughout learning. Typically a perceptron would converge on a solution to the problem in fewer than 60 epochs of training, where convergence was defined as generating a hit for every output unit on every training pattern. We conducted a small study in which ten different perceptrons were trained on this problem. All ten ‘subjects’ learned to solve the problem. On average 46.7 epochs of training were required for a network to learn the ii-V-I progression (SD = 7.82).

While the networks of the current section and those of Section 9.6 are all perceptrons, it seems that the current networks converge after less training than did those that only faced chords in root position. We used an independent t-test to compare the performance of 10 ‘subjects’ trained under the conditions of Section 9.7.3 with the 10 networks that were discussed in Section 9.6.3. This test revealed that the current networks converged to a problem solution significantly faster than did the previous set of perceptrons (t = 5.005, df = 18, p < 0.001).

9.7.4 Implications

In Section 9.4.3 it was argued that if dominant seventh chords in the ii-V-I progression were in second inversion form then the progression is easier to play in the sense that ‘least action’ is possible. A pianist can move from chord to chord in the progression by only changing the position of two fingers when the middle chord is inverted in this way.

There is no reason to expect that the potential for ‘least action’ would have any effect on network complexity or performance. This is because processing units do not map onto actions, particularly in the computations involved when networks learn or respond to inputs.

Perhaps not surprisingly, network complexity was not affected by using inverted chords, because a simple network – a perceptron – could learn the progression whether second inversions were used or not. More surprisingly, though, network training was affected by the presence of second inversions. Networks were able to take advantage of their presence to learn the progression significantly faster than did networks that were only presented chords in root position. Possible implications of this result are considered later in the chapter.
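The comparison between the root-position and inverted-form perceptrons can be reproduced from the summary statistics alone. The sketch below assumes the standard pooled-variance form of the independent-samples t-test; the function name is hypothetical, and the only inputs are the means, standard deviations and group sizes reported in Sections 9.6.3 and 9.7.3.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-samples t statistic with a pooled variance estimate."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Root-position perceptrons (Section 9.6.3) versus inverted-form
# perceptrons (Section 9.7.3): mean epochs to convergence, SD, group size.
t_value, df = pooled_t(63.7, 7.36, 10, 46.7, 7.82, 10)
```

The statistic computed from the rounded summary values comes out very close to the reported t = 5.005 with df = 18; the small difference is what one would expect from rounding the means and standard deviations.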


9.8 Simulations Using Lead Sheet Encodings

Figure 9-10. A perceptron that can learn the ii-V-I progression problem encoded in lead sheet format.

9.8.1 Task and Architecture

The last networks to consider were trained on the ii-V-I progression problem using the lead sheet encoding that was described in Section 9.4.4. These networks require 15 input and output units to represent chords using lead sheet notation. Three of these units represent chord type (minor seventh, dominant seventh, major seventh). The remaining 12 units represent the root of the chord using pitch-classes. All of the output units were value units. Once again pilot studies revealed that a perceptron like the one in Figure 9-10 was capable of learning the ii-V-I progression with this representation of inputs and outputs.

9.8.2 Training

Training proceeded with a learning rate of 0.1, and connection weights started randomly in the range from -0.1 to 0.1. All µs were initialized to a value of 0, but were modified during training. This is because these networks would not converge when all µs were held at zero. Typically a perceptron would converge on a solution to the problem almost immediately, requiring only 5 or 6 epochs of training to generate a hit for every output unit on every training pattern. We conducted a small study in which ten different perceptrons were trained on this problem. All ten ‘subjects’ learned to solve the problem. On average only 5.8 epochs of training were required for a network to learn the ii-V-I progression (SD = 0.422).

9.8.3 Implications

Once again these simulations reveal the importance of exploring different encoding schemes. All of the networks that have been described in this chapter have been faced with learning the same input/output mappings. However, the ease with which these mappings are acquired depends dramatically on the choice of encoding. If pitch-class encoding is used to encode the four pitch-classes in each tetrachord, then a multilayer perceptron that contains 7 hidden value units is required to reliably achieve convergence in about 4000 epochs. In contrast, if the very simple lead sheet encoding is used, then a perceptron can learn the identical input/output mapping, and do so after only about 6 epochs of training.

We have now explored a wide range of architectures for learning the ii-V-I progression under a variety of encodings. We have discovered that different encodings have profound impacts on both network complexity and on the amount of training required to learn the ii-V-I progression problem. Let us next turn to exploring the internal structure of a couple of these networks.


9.9 Interpreting A Lead Sheet Perceptron

9.9.1 Encoding and Interpretation

Earlier in this chapter we noted that there were many different ways in which the same problem could be encoded for network training. The simulation results that were reported earlier revealed that the choice of encoding had an enormous impact on network complexity. In particular, with one encoding the ii-V-I progression problem could be solved with a value unit perceptron, while with another the same problem required a multilayer network of value units that included seven hidden units.

The choice of encoding also has important implications for network interpretation. Of course, this is largely related to network complexity: if a particular encoding leads to a simpler network, then it is expected that such a network is easier to interpret. However, other factors are also at play.

For instance, one could use the property ‘abstractness’ to compare and contrast the pitch-class encoding described in Section 9.4.1 with the pitch encoding described in Sections 9.4.2 and 9.4.3.

With pitch-class encoding a tetrachord is only defined by its component pitch-classes. As a result, it fails to make explicit some properties of chords that could be important. For instance, it was shown earlier in Figure 9-3 that this encoding eliminates information about the shape of a chord (i.e. the relative spacing between the chord’s notes on a staff or on a keyboard). Various chords of the same type have different shapes using this encoding.

In other words, the multilayer perceptron of Figure 9-8 cannot learn the ii-V-I progression by simply learning to directly map the shape of an input chord into the shape of an output chord. Instead, the hidden units have to capture some more abstract property, which is why so many hidden units are required in the network. Obviously the seven-dimensional hidden unit space is capturing important musical properties, and in principle one could peer into this network to uncover what these properties are. However, interpreting this network is a challenging task, a task made less palatable with the knowledge that simpler networks for the same problem are also available for interpretation!

A very different situation emerges with pitch encoding. This encoding maps input and output units directly onto a piano keyboard, and therefore makes chord shape an explicit property of chord codes. In this sense, pitch encoding is more concrete: one could literally represent input and output units with piano keys (e.g. Figure 9-4), revealing how they might actually be played.

Interestingly, this encoding might be too concrete to reveal interesting musical properties. The fact that a perceptron can solve the ii-V-I problem when this encoding is used means that network interpretation reduces to examining the weights of direct connections between input and output units in the context of each output unit’s µ. One might expect that this would reveal a repeating pattern of connection weights that maps one chord shape into another.

However, when weights of trained networks were examined such repeating patterns of connection weights were not found. Instead, the network’s structure was too concrete: weights were assigned in such a way that a particular output unit would turn on when a particular set of four input units were activated, and off to other input patterns. However, all of the weights were very specialized to each set of causal links. A general, repeated pattern of connectivity was not discovered and exploited.

It is almost as if the perceptrons learned to map individual notes to other individual notes without recognizing that patterns of notes belonged together in a more abstract category (e.g. as a tetrachord, or a tetrachord of a particular type, or a tetrachord of a particular shape). This would be analogous to a pianist teaching a novice the ii-V-I progression by having them remember that when their fingers are here then next they will be there, but without bothering to teach them that the notes are related as entities called chords.

The lead sheet encoding that was also explored seems to offer a compromise level of abstraction between the two types of encoding discussed above. On the one hand, it is an abstract encoding in the sense that it does not represent the individual notes of a chord, but instead makes explicit a chord’s root and its abstract type. On the other hand, the abstract properties that it makes explicit are not so abstract that a complicated multilayer perceptron is required to solve the ii-V-I progression. Indeed, when this type of network is interpreted some basic musical properties – in particular, the intervallic relationships between chord roots in the progression – are laid bare in a simple network structure.

The remainder of this section proceeds as follows. First, we will provide an alternative account of the ii-V-I progression using the circle of perfect fifths. Second, we will examine the connection weights of a perceptron trained with lead sheet encoding to demonstrate that it mirrors this geometric account of this particular progression.

9.9.2 Geometry of the ii-V-I

Earlier in this chapter it was noted that one key task in establishing tonality was deciding upon which roots to use in each of a succession of chords. The ii-V-I progression is interesting because the progression of roots for its three chords in any key can be determined by following a particular map: the circle of perfect fifths.

This is illustrated in Figure 9-11. The top circle in the figure arranges the twelve pitch-classes of Western music around the circle of perfect fifths, so that adjacent pitch-classes are a musical interval of a perfect fifth apart. The middle circle adds spokes to this circle, as well as chord names, to represent the three chords of the ii-V-I progression for the key of C major. Note how the three spokes pick out three positions adjacent to one another along the circle of perfect fifths. If one rotates the three spokes to a different position within the circle, then the roots of three chords for the same progression in a different musical key are revealed. For example, the spokes in the bottom illustration of the figure reveal the progression for the key of B♭ major.

Figure 9-11. The circle of perfect fifths provides a map between the roots of the three chords in the ii-V-I progression for any key. See text for details.

Figure 9-11 demonstrates that the circle of perfect fifths can be used to map the transition from chord root to chord root in the ii-V-I progression. When this progression is encoded with lead sheet notation, it can be learned by a perceptron. One musical property that lead sheet encoding makes explicit is the root of each tetrachord. We might therefore expect to find that the circle of fifths is encoded in some fashion within the connection weights of this perceptron. Let us proceed with interpreting one of these perceptrons to determine whether or not this is indeed true.
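The geometric account of Section 9.9.2 translates directly into code: build the circle of perfect fifths, then read off three adjacent positions for any tonic. The sketch below is a hedged illustration. The function names are hypothetical, sharps are used for all accidentals (so B♭ appears as A#), and the construction of 24 training pairs as ii to V and V to I in each of the 12 keys is an assumption about how the 24 input/output pairs of the training set are formed.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Circle of perfect fifths: each step clockwise is 7 semitones higher.
CIRCLE = [PITCH_CLASSES[(7 * k) % 12] for k in range(12)]


def ii_v_i(tonic):
    """Lead sheet symbols of the ii-V-I progression for a major key.
    The three roots occupy adjacent positions on the circle of fifths."""
    pos = CIRCLE.index(tonic)
    return [
        CIRCLE[(pos + 2) % 12] + "m7",   # ii: two steps clockwise of the tonic
        CIRCLE[(pos + 1) % 12] + "7",    # V: one step clockwise of the tonic
        CIRCLE[pos] + "maj7",            # I: the tonic itself
    ]


def training_pairs():
    """Assumed training set: (ii, V) and (V, I) for each of the 12 keys."""
    pairs = []
    for tonic in PITCH_CLASSES:
        ii, v, i = ii_v_i(tonic)
        pairs.append((ii, v))
        pairs.append((v, i))
    return pairs


key_of_c = ii_v_i("C")
key_of_b_flat = ii_v_i("A#")   # B flat spelled with a sharp
pairs = training_pairs()
```

Rotating the spokes in Figure 9-11 corresponds to changing `pos`: the same offsets of two, one and zero steps pick out the roots of the progression in every key.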


9.9.3 Network Interpretation

                                                  Output Unit
Input     m7      7   maj7      A     A#      B      C     C#      D     D#      E      F     F#      G     G#
µ      -0.68   0.33  -0.32  -0.53   0.55  -0.55  -0.51   0.57   0.55   0.58   0.54   0.57   0.54   0.56   0.54
m7     -0.35  -0.34  -0.65  -0.32   0.32  -0.32  -0.33   0.29   0.31   0.29   0.33   0.27   0.33   0.32   0.29
7      -0.39   0.68   0.36  -0.31   0.32  -0.33  -0.32   0.31   0.31   0.30   0.34   0.26   0.33   0.33   0.29
maj7   -0.09  -0.09   0.09  -0.06  -0.07   0.07   0.04  -0.02   0.08  -0.09  -0.09  -0.03  -0.03  -0.08  -0.06
A      -0.07   0.04  -0.07  -0.11   0.16  -0.24  -0.19   0.08  -0.76   0.10   0.11   0.11   0.09   0.10   0.20
A#     -0.04  -0.03  -0.08  -0.13   0.10  -0.07  -0.11   0.21   0.19  -0.78   0.06   0.16   0.07   0.05   0.09
B      -0.03  -0.01  -0.03  -0.10   0.08  -0.09  -0.11   0.04   0.06   0.08  -0.78   0.12   0.09   0.08   0.10
C      -0.17  -0.01  -0.06  -0.16   0.16  -0.10  -0.17   0.17   0.10   0.19   0.14  -0.72   0.10   0.06   0.21
C#     -0.09   0.01  -0.03  -0.14   0.07  -0.09  -0.06   0.11   0.05   0.04   0.13   0.16  -0.77   0.08   0.12
D      -0.03   0.01   0.00  -0.08   0.07  -0.12  -0.14   0.06   0.06   0.06   0.12   0.10   0.09  -0.79   0.13
D#     -0.07   0.02  -0.02  -0.11   0.04  -0.14  -0.10   0.12   0.11   0.12   0.08   0.10   0.10   0.06  -0.76
E       0.01   0.03  -0.03   0.76   0.06  -0.09  -0.15   0.06   0.11   0.07   0.05   0.16   0.11   0.03   0.11
F      -0.17   0.06  -0.08  -0.14  -0.77  -0.09  -0.15   0.05   0.17   0.03   0.07   0.09   0.09   0.04   0.04
F#     -0.05   0.04  -0.03  -0.16   0.14   0.77  -0.14   0.20   0.18   0.18   0.09   0.17   0.14   0.17   0.16
G      -0.01   0.07  -0.01  -0.12   0.07  -0.12   0.75   0.09   0.16   0.09   0.05   0.18   0.13   0.13   0.10
G#     -0.17  -0.01  -0.07  -0.16   0.29  -0.06  -0.23  -0.73   0.17   0.11   0.19   0.23   0.17   0.18   0.17

Table 9-3. The connection weights for a perceptron that has learned the ii-V-I progression in lead sheet notation. Each row corresponds to an input source (µ or an input unit) and each column corresponds to an output unit.

A perceptron that learns the ii-V-I progression using the lead sheet encoding has 225 modifiable connection weights (because each of its 15 input units is connected to each of its 15 output units) as well as 15 different µs (one for each output value unit). The value for each of these 240 components for one network that learned the progression after 6 epochs of training is provided in Table 9-3. Stored within this table of numbers is this particular perceptron’s knowledge of the ii-V-I changes. Fortunately, an inspection of Table 9-3 indicates the presence of many patterns that permit the network’s structure to be simplified for interpretation; these key elements are highlighted in the table.

First, there is a distinct pattern of connection weights between pairs of input and output units that represent chord types. Two of these input units (for minor seventh and dominant seventh chords) have two moderate weights to chord type output units (with values around ±0.35) and one more extreme connection weight to a third chord-type output unit (a value around ±0.68). In contrast, the input unit for major seventh chords has a near zero connection weight to each of the three chord-type output units.

Second, there is a repetitive pattern of connection weights between input units that represent chord types and output units that represent pitch-classes. In particular, the minor seventh and dominant seventh chord type input units have nearly identical connection weights to the same output pitch-class unit, and all of these weights have a value around ±0.32. In contrast, the major seventh input unit has a near zero connection weight to any pitch-class output unit.

Third, each of the output units that represent a pitch-class has only one incoming connection weight that has an extreme value (around ±0.74). Importantly, this weight comes from the input unit that represents a pitch-class that is a perfect fifth away from the output unit’s pitch-class.

Fourth, the remaining connection weights between input and output units that represent pitch-class are either near zero in value, or have a relatively small value (±0.17). For the purpose of network interpretation this means that these weights can be ignored, because their values are such that a signal sent through them will not result in the output unit turning on.

In short, the behavior of this particular perceptron can be simplified by focusing on the subset of connection weights that are involved in activating output units when a tetrachord is presented to the network. Only a handful of connection weights are involved in converting an input pattern into an output response. This ‘functional pattern of connectivity’ for the Table 9-3 network is illustrated in Figure 9-12.

Figure 9-12. The functional pattern of connectivity for a perceptron that has learned the ii-V-I progression using lead sheet encoding. The connection weights and values for µ in this figure are taken from Table 9-3. See text for details.
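To make the size of this perceptron concrete, the following Python sketch builds a lead sheet input pattern and counts the trainable components. It assumes, as the unit labels in Table 9-3 suggest, that a chord is coded by turning on one chord-type unit and one root pitch-class unit; the names `UNITS` and `encode_chord` are hypothetical and are used only for illustration.

```python
# The 15 lead sheet units, in the row/column order of Table 9-3.
UNITS = ["m7", "7", "maj7",
         "A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def encode_chord(root, chord_type):
    """Return a 15-element 0/1 pattern with the type unit and root unit on.

    Assumption: exactly one chord-type unit and one root unit are active.
    """
    return [1 if unit in (root, chord_type) else 0 for unit in UNITS]


# Every input unit connects to every output unit, and each output unit has a µ.
N_WEIGHTS = len(UNITS) * len(UNITS)   # 15 x 15 connection weights
N_BIASES = len(UNITS)                 # one µ per output unit

if __name__ == "__main__":
    print(encode_chord("D", "m7"))
    print(N_WEIGHTS, N_BIASES, N_WEIGHTS + N_BIASES)
```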

Figure 9-12 illustrates the input and the output units for the perceptron whose full set of connection weights is provided in Table 9-3. The number inside each of the output units is that processor’s µ. Note that none of these values is equal to 0. As a result, a slightly different account of what will turn an output unit on is required. In the Figure 9-12 network, an output unit will only turn on when the net input that it receives ‘cancels out’ its µ. That is, if a value unit has a bias equal to µ, then it will only turn on when its net input equals -µ. Note that many of the networks described in previous chapters, whose output units (with µ = 0) turned on when their net input equaled 0, follow a special case of this more general rule because 0 = -0.

In Figure 9-12, the order of pitch-class units in the input layer is different from that for the output units. The units have been rearranged in the figure so that a particular input unit is directly below the output unit that represents a pitch-class a perfect fifth away from the input unit’s.

The number in the rectangle directly above each input pitch-class unit is the weight of the connection between the input unit and the output pitch-class unit directly above it in the figure. A vertical line in the figure indicates the existence of this connection from an input unit to an output unit.

The input units for minor seventh and dominant seventh chords have important connections with each output pitch-class unit; the weight of this connection (from either of these two different input units) is provided in each rectangle directly below an output pitch-class unit. Note that the major seventh input unit has no such connections; indeed, it has no functional purpose in this network and therefore there are no connections drawn from this input unit to any of the output units!

There are important connections from the minor seventh and the dominant seventh input units to each of the three output units for chord type. Each of these connections has been drawn in Figure 9-12, but the weight values have not been included in the figure to avoid clutter. These values are presented in Table 9-3 and their function will be described shortly.

With the properties of Figure 9-12 described, we are now in a position to explain how this perceptron operates.

To begin, let us consider the conditions that will cause a particular output unit to turn on when this output unit represents the root note of the output chord. Such an output unit will only activate when 1) either the m7 or D7 input unit is on and 2) the input unit representing a pitch-class that is a perfect fifth away from the output unit’s is also on. It is only in such a circumstance that the net input to the output unit will be equal to -1 times its value of µ, cancelling µ out.

For instance, consider the output unit representing the pitch-class A, which has µ = -0.53. When activated, the m7 input unit or the D7 input unit will send a signal of -0.32 to this output unit. When the E input unit (which is a perfect fifth away from A) is activated, it will send a strong signal of 0.76 to the A output unit. The two signals being received by this output unit in this situation (-0.32 and 0.76) sum to a total of 0.44 which, when combined with µ, produces a net input of -0.09. This net input is close enough to zero to produce high activity in the output unit (0.97) when the Gaussian activation function is employed. A similar account for any output pitch-class unit can be extracted from Figure 9-12.

In order to generate a complete response in the ii-V-I progression problem, the network must also activate one of its chord type output units. This is accomplished via the signals coming from the same types of units in the input pattern. In general, the presence of one chord type in the input results in a different chord type being generated in the output. The connection weights in Table 9-3 reveal that the strength of the connection from one chord type to the next is almost equal to -µ, where µ is the bias of the output unit. For instance, the weight from the m7 input unit to the D7 output unit is -0.34, while the D7 output unit has µ = 0.33. The other connection weights between input and output chord types are such that an input chord unit will only activate the appropriate next chord type, and will fail to activate the other two (incorrect) chord type units.

Figure 9-13 relates the causal relations that we have just described for the network to the geometric description of the ii-V-I progression that was developed in Section 9.9.2. The top of this figure displays three chord type units; the arrows between them indicate the causal links between units (i.e. each arrow shows that input activity in the unit at its base causes output activity in the unit at its arrowhead). So, when the m7 unit is turned on, it causes output activity in the D7 unit. Similarly, when the D7 unit is turned on, it causes output activity in the Maj7 unit. The Maj7 unit causes no activity in any other units, which is why no arrows emanate from it.

Activity in either the m7 or the D7 unit also sends activity to pitch-class units. This is represented in Figure 9-13 by having a second arrow from each unit connect them to an apex of a triangle which in turn sends signals to pitch-class units. Both are connected to the same apex because (as shown earlier in Figure 9-12 and Table 9-3) both of these input units have essentially the same connection weight to a pitch-class unit. The arrow from the apex to each pitch-class unit indicates the role that either the m7 or the D7 unit plays in turning an output pitch-class unit on.

The pitch-classes in Figure 9-13 are arranged around a circle of perfect fifths. Arrows around this circle indicate causal links from an input pitch-class to an output pitch-class. For instance, for the A unit to turn on, it must receive a signal from the E unit which is adjacent to it in the circle. No other pitch-class unit will turn A on. However, A will only turn on if it receives a signal from E and from either the m7 or the D7 unit at the same time.

In short, when lead sheet notation was used to encode stimuli and responses, a perceptron learned to carve the ii-V-I problem up into two different tasks: the causal relations between input and output chord types, and the causal relations between input and output pitch-classes. Its solution to the problem identified the fact that pitch-class relations were organized in terms of perfect fifths, but also required input from appropriate chord types. The solution also recognized that chord-type relations were independent of chord roots. What is amazing is that such an elegant solution was achieved after such a small amount of training!
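The arithmetic in this account can be checked with a short Python sketch that uses only numbers copied from Table 9-3. Two assumptions are made and should be read as such: the Gaussian activation is taken to be exp(-π·x²) applied to the net input plus µ (this form reproduces the 0.97 quoted above), and the small weights from pitch-class input units to chord-type output units are ignored when the next chord type is chosen. The function names are hypothetical.

```python
import math

ROOTS = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

# Weights into the A output unit (the "A" column of Table 9-3) and its µ.
MU_A = -0.53
TYPE_TO_A = {"m7": -0.32, "7": -0.31, "maj7": -0.06}
ROOT_TO_A = dict(zip(ROOTS, [-0.11, -0.13, -0.10, -0.16, -0.14, -0.08,
                             -0.11, 0.76, -0.14, -0.16, -0.12, -0.16]))

# Chord-type block of Table 9-3: TYPE_W[input][output], plus each output's µ.
TYPE_MU = {"m7": -0.68, "7": 0.33, "maj7": -0.32}
TYPE_W = {
    "m7": {"m7": -0.35, "7": -0.34, "maj7": -0.65},
    "7": {"m7": -0.39, "7": 0.68, "maj7": 0.36},
}


def gaussian(x):
    """Assumed value-unit activation: 1.0 at x == 0, falling off on both sides."""
    return math.exp(-math.pi * x * x)


def a_output_activity(chord_type, root):
    """Activity of the A output unit for one lead sheet input chord."""
    net = TYPE_TO_A[chord_type] + ROOT_TO_A[root]
    return gaussian(net + MU_A)


def next_type(chord_type):
    """Chord-type output unit with the highest activity (root inputs ignored)."""
    weights = TYPE_W[chord_type]
    return max(weights, key=lambda out: gaussian(weights[out] + TYPE_MU[out]))


if __name__ == "__main__":
    print(round(a_output_activity("m7", "E"), 2))   # the worked example in the text
    print(round(a_output_activity("m7", "D"), 2))   # wrong root: unit stays off
    print(next_type("m7"), next_type("7"))
```

With these numbers, the A output unit is strongly active only when the root input is E; every other root leaves its activity below 0.1.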

Figure 9-13. Causal links between chord type units and pitch-class units. These causal links are taken from Figure 9-12 and Table 9-3, but are illustrated here using the circle of perfect fifths. See text for details.


9.10 A Progression of Progressions

9.10.1 Second-order Progression

The ii-V-I progression problem that we have been discussing throughout the current chapter is the simplest version. One interesting property of this progression is that it is fairly easy to play the progression in one key, and then play the same progression in a key that is a full tone lower. As a result, one can perform a progression of progressions, changing key until one reaches the same key that was played at the beginning, although now the chords will be an octave lower than the first ones played. This might be called a second-order progression.

The reason that one can have such a progression of progressions is that it is fairly straightforward to change from the major seventh chord that ends the ii-V-I progression to a minor seventh chord that begins the progression in the next key. This is because the two chords are built upon the same root.

This is illustrated geometrically in Figure 9-14. Each circle in this figure arranges chord roots around a circle of perfect fifths. The three spokes in the top circle pick out the three chords for the ii-V-I progression in the key of C, a progression that ends with Cmaj7. One can take this major seventh chord and change two of its notes to produce a Cm7 chord. This chord is the first part of the ii-V-I progression for the key of B♭ major. The three chords for the ii-V-I progression in B♭ major are identified by the three spokes in the bottom circle of Figure 9-14.

Note that Figure 9-14 illustrates that moving from the ii-V-I progression in one key to the same progression in the next key involves taking the three spokes that pick out the first three chords and rotating them counterclockwise by 60°. If the bottom set of spokes is rotated in the same direction by the same amount they will point to the ii-V-I progression chords in the next key (A♭ major). Six such rotations from the beginning position (the top of Figure 9-14) will return them to their original location.

Figure 9-14. Geometric illustration of the relation between two ii-V-I progressions in adjacent keys. See text for details.

The musical score illustrated in Figure 9-15 provides the second-order progression that is created by beginning with the top chords illustrated in Figure 9-14 (the ii-V-I progression for C major) and repeatedly employing the 60° rotation rule until the key of C major is reached again. Notice that the first three chords in the Figure 9-15 score are each an octave higher than their respective chords at the end of the score.


Figure 9-15. The progression of ii-V-I progressions created by starting in the key of C major and following the procedure illustrated in Figure 9-14 to move from key to key.

An inspection of Figure 9-15, as well as a consideration of the procedure illustrated in Figure 9-14, reveals that the method for producing a second-order ii-V-I progression only produces the progression for half of the available major keys. Indeed, the different keys for which this method generates chords all belong to the same circle of major seconds; none of the keys that belong to the other circle of major seconds have their chords generated. In order to do so, the initial position of the spokes in Figure 9-14 must be changed to pick out the chords for a key that belongs to the other circle of major seconds (e.g. D♭ major). When this is done, and the 60° rotation procedure is implemented, the chords that belong to the remaining six major keys are generated. This second version of the second-order ii-V-I progression is presented in the musical score found in Figure 9-16.

Figure 9-16. The progression of ii-V-I progressions created by starting in the key of D♭ major and following the procedure illustrated in Figure 9-14 to move from key to key. This progression generates the chords for the six major keys that are not represented in Figure 9-15.
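The two second-order progressions can be generated with a small Python sketch. It treats a 60° rotation on the circle of fifths as a drop of a whole tone (two semitones) in key, which is what the text describes; the function names are hypothetical, octave register is ignored, and sharps stand in for enharmonic flats (A# for B♭, C# for D♭).

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def second_order_keys(start):
    """Six major keys reached by repeatedly dropping a whole tone from `start`."""
    first = PITCH_CLASSES.index(start)
    return [PITCH_CLASSES[(first - 2 * step) % 12] for step in range(6)]


def second_order_progression(start):
    """Chord names for the ii-V-I changes in each of the six keys, in order."""
    chords = []
    for key in second_order_keys(start):
        tonic = PITCH_CLASSES.index(key)
        ii_root = PITCH_CLASSES[(tonic + 2) % 12]   # a whole tone above the tonic
        v_root = PITCH_CLASSES[(tonic + 7) % 12]    # a perfect fifth above the tonic
        chords += [ii_root + "m7", v_root + "7", key + "maj7"]
    return chords


if __name__ == "__main__":
    print(second_order_keys("C"))
    print(second_order_keys("C#"))
    print(second_order_progression("C"))
```

Starting from C visits one circle of major seconds and starting from C# (D♭) visits the other, so the two runs together cover all twelve major keys. In each run the root of a major seventh chord is also the root of the minor seventh chord that follows it.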


9.10.2 Second-order Problem

The creation of the two versions of the second-order ii-V-I progression permits us to create a slightly more complicated version of the ii-V-I progression problem to be learned by artificial neural networks. In this second-order problem, the task for the network is the same: when presented an input chord, generate the next chord in the progression. However, the second-order version of the problem permits major seventh chords to be inputs that result in the generation of an output chord: the minor seventh chord that begins the ii-V-I progression in the next key. This is really the only difference between this new version of the problem and the simpler version that has been the subject of earlier sections in the current chapter.

In the old (first-order) version of the problem, the major seventh chord was only a response, and never a stimulus. Similarly, the minor seventh chord was only a stimulus, and never a response. As a result the first-order version of the problem only required 24 patterns in its training set.

In the second-order version of the ii-V-I problem, there is an additional stimulus, because a major seventh chord input now leads to a minor seventh chord response. As a result, the second-order ii-V-I progression problem has a total of 36 training patterns instead of 24.

Apart from an additional 12 stimuli, the second-order version of the ii-V-I problem is nearly identical to the first-order version. In particular, input and output chords are treated in the same fashion, and can be encoded in the various formats that were detailed in Section 9.4. We used these encodings to create four different versions of the second-order ii-V-I progression problem, and then determined how problem encoding impacted network complexity.

9.10.3 Training Results

This section provides a brief account of the results of training networks on different encodings of the second-order ii-V-I progression problem. The results presented below are intended to complement the more detailed results for training networks on the first-order version of the problem that were presented earlier in this chapter.

When pitch-class encoding was employed, the second-order ii-V-I problem was more complicated than the first-order problem. In order to achieve reliable and relatively fast convergence, two more hidden value units had to be added to the multilayer perceptron illustrated in Figure 9-8. With 9 hidden units, with µs trained during learning, and with a learning rate of 0.01, a solution to the problem was typically achieved in between 4500 and 6500 epochs of training. If the number of hidden units was reduced to 8, then the network typically failed to converge after more than 25,000 epochs of training, and a smaller learning rate (0.005) was required to achieve some progress. On occasion an 8 hidden unit network would converge after a larger number of training sweeps (at least 23,000), and even more rarely a network would converge after fewer than 8,000 sweeps. It would appear that an 8 hidden unit network would only converge if its randomly selected starting state was highly advantageous.

Another version of the second-order ii-V-I progression problem was encoded using a pitch representation of tetrachords in root position. Similar to the results for the first-order problem (Section 9.6), this encoding permitted a value unit perceptron to learn a solution with output unit µs held constant at zero throughout learning. With a learning rate of 0.1 this kind of simple network would typically learn a solution to the problem in between 100 and 200 epochs of training.

Interestingly, when pitch encoding was used to represent inverted chords in the second-order ii-V-I problem, the problem was more difficult than was the case for the first-order version of the problem. In contrast to the situation in which non-inverted chords were presented, a value unit perceptron was not able to learn a solution to the problem. This was somewhat surprising because we expected that the inverted chords would be easier to learn. The simplest network that would learn the second-order problem was a multilayer perceptron that had a single hidden unit, and also had direct connections between input and output units. A complete account of why inverted chords cause problems for the second-order problem, but not for the first, would require a detailed analysis of the internal structure of networks. However, such an analysis will not be presented here.

The final representation that we examined for the second-order ii-V-I progression problem was lead sheet encoding. As was the case for the first-order problem, this kind of encoding led to fast solutions by simple networks. With a learning rate of 0.1 and with µs modified during learning, a value unit perceptron would learn the second-order problem after approximately 40 epochs of training. This suggests that even with this encoding the second-order problem was more difficult than the first-order problem, in the sense that slightly more training was required. However, solutions to either version of the ii-V-I progression problem could be discovered by a value unit perceptron when lead sheet encoding was employed.

9.10.4 Network Interpretation

                                                     Output Unit

Input Unit     m7      7   maj7      A     A#      B      C     C#      D     D#      E      F     F#      G     G#
µ            0.59   0.55  -0.54  -0.92   0.92   0.94   0.96  -0.96  -0.93  -0.92   0.96   0.98  -0.96  -0.89   0.90
m7           0.60  -0.59  -0.61  -0.60   0.61   0.62   0.60  -0.59  -0.61  -0.65   0.61   0.60  -0.59  -0.66   0.66
7            0.59   0.62   0.59  -0.60   0.61   0.62   0.60  -0.59  -0.61  -0.65   0.60   0.60  -0.58  -0.66   0.66
maj7        -0.62   0.61  -0.62   0.26  -0.25  -0.25  -0.27   0.28   0.25   0.22  -0.27  -0.27   0.28   0.21  -0.21
A            0.03   0.05  -0.05   0.60   0.33   0.30   0.31  -0.33   1.54  -0.32   0.29   0.27  -0.32  -0.34   0.32
A#           0.03   0.05  -0.05  -0.31  -0.60   0.33   0.31  -0.31  -0.31   1.57   0.35   0.29  -0.33  -0.30   0.32
B            0.03   0.05  -0.05  -0.32   0.32  -0.62   0.29  -0.31  -0.31  -0.29  -1.57   0.28  -0.33  -0.30   0.30
C            0.03   0.05  -0.05  -0.34   0.32   0.32  -0.63  -0.31  -0.32  -0.27   0.32  -1.57  -0.31  -0.32   0.30
C#           0.03   0.05  -0.05  -0.37   0.34   0.32   0.32   0.62  -0.33  -0.31   0.31   0.31   1.54  -0.31   0.33
D            0.03   0.05  -0.05  -0.32   0.34   0.32   0.30  -0.32   0.61  -0.31   0.29   0.29  -0.32   1.55   0.31
D#           0.03   0.05  -0.05  -0.32   0.35   0.32   0.30  -0.33  -0.30   0.64   0.30   0.28  -0.32  -0.31  -1.55
E            0.03   0.05  -0.05   1.53   0.33   0.33   0.31  -0.34  -0.32  -0.29  -0.63   0.28  -0.33  -0.30   0.30
F            0.03   0.05  -0.05  -0.33  -1.53   0.32   0.32  -0.31  -0.31  -0.31   0.40  -0.64  -0.34  -0.33   0.30
F#           0.03   0.05  -0.05  -0.32   0.34  -1.56   0.34  -0.30  -0.32  -0.28   0.31   0.32   0.61  -0.32   0.31
G            0.03   0.05  -0.05  -0.32   0.35   0.32  -1.56  -0.33  -0.32  -0.30   0.29   0.31  -0.31   0.62   0.30
G#           0.03   0.05  -0.05  -0.32   0.33   0.31   0.29   1.55  -0.33  -0.30   0.31   0.28  -0.32  -0.30  -0.62

Table 9-4. The connection weights for a perceptron that has learned the second-order ii-V-I progression in lead sheet notation. Each row corresponds to an input source (µ or an input unit) and each column corresponds to an output unit.

In order to complete the parallels between our earlier examination of the first-order ii-V-I progression problem and the current consideration of the second-order ii-V-I problem, let us proceed with an interpretation of the perceptron’s structure for solving the second-order problem when lead sheet encoding is employed.

Table 9-4 presents the connection weights of one such perceptron. As was the case when we examined Table 9-3, within all of the connection weights in Table 9-4 there is a tractable subset of connection weights that are functionally important; these weights have been highlighted in the table.

An examination of Table 9-4 reveals that it shares a great deal of the functional structure seen earlier in Table 9-3, structure that was used to create Figure 9-12. First, the most extreme weight feeding into an output pitch-class unit comes from an input pitch-class unit that is a perfect fifth away. Second, the connections to these output units from either of the m7 or D7 input units are equal in weight. Third, the sum of the signal from an m7 or a D7 unit plus a signal from an input pitch-class unit a perfect fifth away is sufficient to nearly cancel out the output unit’s µ. In short, this perceptron is structured to respond to input minor seventh or input dominant seventh chords in exactly the same way that was illustrated earlier in Figure 9-12.

The differences between Tables 9-3 and 9-4 reveal that the perceptron trained on the second-order ii-V-I problem has additional important connection weights that permit it to respond correctly when major seventh chords are presented to it.

First, the next most extreme connection weight that feeds into an output pitch-class unit comes from an input pitch-class unit that represents the same pitch-class (i.e. the input is an interval of perfect unison away from the output). For example, the connection from the input unit representing A to the output unit representing A has a weight of 0.60. This relation makes sense because in the second-order version of the progression, a major seventh chord leads into a minor seventh chord that has the same root note.

Second, the connection weights from the major seventh input unit to each of the output pitch-class units are now moderately large (in comparison to the same kinds of weights in Table 9-3), and are the same sign as the connection weights to the same output unit from the input pitch-class unit that represents the same pitch. As a result, the two signals (one from the major seventh input unit, the other from a pitch-class unit) combine to create a more extreme signal. Finally, this combination sums to a value that nearly cancels out the output pitch-class unit’s µ, turning it on.

For example, consider the output pitch-class unit for A with µ of -0.92. The weight to it from the input A unit is 0.60, and the weight to it from the input major seventh unit is 0.26. When these two input units are turned on they together send a total signal of 0.86, which combines with µ to create a net input of -0.06, which is close enough to zero to turn the output unit on.

Third, the connection weights from the input major seventh chord unit to each of the three output chord type units are now all substantially different from zero. The weights are such that when the major seventh input unit is on it will turn on the minor seventh output unit, and will fail to activate the other two chord type units.

In short, the perceptron whose structure is detailed in Table 9-4 is functionally identical to the perceptron of Table 9-3, but includes additional weights. These weights provide additional functionality that produces correct responses when major seventh chords are presented to the network.

As was illustrated in Figure 9-13, the functional operation of the perceptron separates chord type responses from chord root responses.

With respect to the activation of an output chord type unit, each of these output units will respond only when a particular input chord type unit is turned on. A minor seventh chord output will only be activated by a major seventh chord input. A dominant seventh chord output will only be activated by a minor seventh chord input. A major seventh chord output will only be activated by a dominant seventh chord input.

With respect to chord root, only three different situations will cause an output pitch-class unit to turn on. First, it will turn on if a minor seventh chord unit is on at the same time that the input pitch-class unit a perfect fifth away is activated. Second, it will turn on if a dominant seventh chord unit is on at the same time that the input pitch-class unit a perfect fifth away is activated. Third, it will turn on if a major seventh chord unit is on at the same time that the input pitch-class unit a unison away is activated.

Of these three rules, the first two are identical to those found in the perceptron for the first-order ii-V-I progression problem. This new perceptron solves the second-order ii-V-I problem by discovering that structure, and adding a small amount of additional functionality to deal with the progression of progressions.
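The rules extracted from Table 9-4 can be restated as a small lookup in Python. This is a symbolic summary of the interpretation above, not the trained network itself: the names `next_chord` and `training_patterns` are hypothetical, the root moves five semitones upward (equivalently, a perfect fifth downward, as in E to A) after a minor or dominant seventh chord, and the Gaussian exp(-π·x²) is again an assumed form of the value-unit activation.

```python
import math

ROOTS = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

# Chord-type rule: m7 -> 7 -> maj7 -> m7 (the last link is second-order only).
NEXT_TYPE = {"m7": "7", "7": "maj7", "maj7": "m7"}
# Root rule, in semitones upward: a fifth away after m7 or 7, unison after maj7.
ROOT_SHIFT = {"m7": 5, "7": 5, "maj7": 0}


def next_chord(root, chord_type):
    """Return (root, type) of the chord that follows the given chord."""
    index = (ROOTS.index(root) + ROOT_SHIFT[chord_type]) % 12
    return ROOTS[index], NEXT_TYPE[chord_type]


def training_patterns(second_order=True):
    """All (stimulus, response) pairs; maj7 stimuli exist only when second_order."""
    types = ["m7", "7", "maj7"] if second_order else ["m7", "7"]
    return [((r, t), next_chord(r, t)) for t in types for r in ROOTS]


# Worked example from Table 9-4: maj7 and A inputs feeding the A output unit.
NET_A = 0.60 + 0.26 + (-0.92)
ACTIVITY_A = math.exp(-math.pi * NET_A ** 2)   # assumed Gaussian activation

if __name__ == "__main__":
    print(next_chord("D", "m7"), next_chord("G", "7"), next_chord("C", "maj7"))
    print(len(training_patterns(False)), len(training_patterns(True)))
    print(round(NET_A, 2), round(ACTIVITY_A, 2))
```

Dropping the `maj7` entries gives the 24 first-order patterns; keeping them gives the 36 second-order patterns described in Section 9.10.2.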


9.11 Summary and Implications

At the start of this book (e.g. Chapter 2, Chapter 4, Figure 4-1) artificial neural networks were introduced as artifacts that are primarily used for pattern classification. That is, they arrange input patterns as points in a space (either a pattern space or a hidden unit space, depending upon network type), and output units carve this space into decision regions. If a pattern falls into one decision region, the network generates one kind of response (i.e. one kind of ‘pattern name’); if it falls into a different decision region, a different response is generated.

In earlier chapters we have demonstrated that pattern classification is a general ability that can be applied very neatly to a variety of musical problems. For example, we have used it to identify scale tonics, scale modes, musical keys, and chord types.

The current chapter has demonstrated a further flexible use of pattern classification in which the response generated by a network to an input chord is a special name: the name of another chord. This permits a network to represent chord progressions in its internal structure. We demonstrated this ability by training networks on two different versions (first-order and second-order) of an important chord progression, the ii-V-I changes.

In addition to demonstrating this ability, this chapter also explored the importance of how one encodes network stimuli and responses. All of the networks described in this chapter learned the same chord progression. However, networks differed from one another in how input and output chords were encoded. One of the main results of the current chapter was that choice of encoding had enormous impact on problem complexity.

In particular, we discovered that when the ii-V-I progression is encoded using the very abstract pitch-class representation of individual chord notes, the problem was very difficult. Multilayer perceptrons with several hidden value units were required to converge to a solution when this encoding was employed.

In contrast, other encodings of the ii-V-I progression problems permitted much simpler networks to learn the problem. For instance, we discovered that pitch encoding of chords, and of chord inversions, led to very simple networks (in most cases perceptrons) solving the same problem that required a multilayer network when the more abstract encoding was used. A lead sheet encoding for both first-order and second-order problems also was quickly solved by a perceptron. The structure of these perceptrons was easy to analyze, and was easily related to a traditional geometric account of the ii-V-I progression.

The purpose of the current chapter was simply to illustrate the importance of encoding choices. However, it is important to keep in mind the implications of such choices.

Obviously problem difficulty is impacted by problem encoding. What encoding, then, should we choose for our networks? It might be very tempting to explore a variety of different and plausible encodings, and then to choose the one that generates the simplest networks.

In some cases this might very well be the appropriate strategy. However, other factors must also be considered when choosing an encoding.

For example, perhaps the goal of a network is to provide insight into the formal regularities that govern a specific musical problem. In this case, the encoding that leads to the simplest network may not be the most appropriate, because the encoding may make certain musical regularities disappear. We saw earlier in this chapter that one key element of the musical theory of chord progressions is voice leading. The lead sheet notation described in this chapter generates simple networks, but essential properties related to voice leading are hidden by this encoding. So, if one is interested in using networks to explore regularities of voice leading, then the encoding that leads to the simplest network may not be the most appropriate.


As another example, perhaps the goal of training a musical network is to discover representations that serve as the basis for musical cognition. In this case, we may not be searching for the encoding that produces the simplest networks. We might be searching for the encoding that generates the greatest similarity between various measures of network performance and structure and measures of performance of human listeners in a musical cognition experiment.

From the perspective of musical cognition, human listeners are ‘black boxes’. This is because we cannot directly observe the internal structures and processes that mediate musical cognition. Instead we can only infer these internal properties on the basis of observations of external behavior. This process of inference is called reverse engineering: by observing human responses to musical stimuli in a variety of clever experimental situations, we attempt to discover the structures, processes, or algorithms inside the black box.

Reverse engineering is hard enough because we cannot directly see inside the black box. A second issue that makes reverse engineering challenging is that each input/output or stimulus/response pairing that we can observe can be mediated by more than one process. There is a many-to-one mapping from possible structures, processes, or algorithms to input/output relations (Dawson, 2013). As a result, we might believe that one process is responsible for mediating the behavior that we observe, but in reality a very different process might be responsible. What is required are some special observations that might be useful for distinguishing one theory about what is inside the black box from another.

Fortunately, black boxes will generate some observable behaviors that are side effects of the processes inside the black box. These side effects, called artifacts by Dawson (2013), can provide critical information for theory validation (Pylyshyn, 1980, 1984).

For instance, one consequence of representing a problem in a particular format might be that some instances of the problem can be solved quickly, while other instances are more difficult to solve. In performing mental arithmetic, for example, one might expect that if numbers were represented mentally in columns then addition problems that require carrying digits from one column to another would take longer than problems that did not require this operation. One can collect relative complexity evidence (Pylyshyn, 1984) to investigate artifacts of this type. With relative complexity evidence, one varies the nature of problems presented to a system, and then explores the relationship between the properties of the problems and the time required to solve them.

A related kind of data concerns intermediate state evidence (Pylyshyn, 1984). This kind of evidence presumes that information processing inside the black box requires a number of different processing stages, and that each stage might represent intermediate results in a different format. To collect intermediate state evidence, one attempts to determine the number and nature of these intermediate results. For example, when researchers determined that items in short-term memory were confused with similar sounding items (Conrad, 1964) and not with items with similar meaning, this suggested that an intermediate memory store used an acoustic encoding (Waugh & Norman, 1965).

A particular type of data, called error evidence (Pylyshyn, 1984), is very well suited to determining intermediate states. When extra demands are placed on a system’s resources, it may not function as designed, and its internal workings are likely to become more evident (Simon, 1969). This is not just because the overtaxed system makes errors in general, but because these errors are often systematic, and their systematicity reflects the underlying representation. One study (Yaremchuk & Dawson, 2005) investigated a multilayer perceptron trained to identify tetrachord types. It was discovered that when some of its hidden units were removed, the network only made very specific errors: it failed to identify tetrachords as being major when, and only when, they were in their second inversion form. This suggested that the role of the missing hidden units was to permit the network to deal with this rather specialized type of input.
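The mental arithmetic example of relative complexity evidence can be made concrete with a small sketch. Assuming that numbers are represented in decimal columns, the function below counts how many carries a column-by-column addition requires; under that assumption, problems with more carries would be predicted to take longer. The function name and the use of the carry count as a difficulty measure are illustrative assumptions, not a procedure taken from Pylyshyn (1984).

```python
def count_carries(a, b):
    """Count carry operations when adding two non-negative integers column by column."""
    if a < 0 or b < 0:
        raise ValueError("expected non-negative integers")
    carries = 0
    carry = 0
    while a > 0 or b > 0:
        # Add the current (rightmost) column, including any carry from the previous one.
        column_sum = a % 10 + b % 10 + carry
        carry = 1 if column_sum >= 10 else 0
        carries += carry
        a //= 10
        b //= 10
    return carries


if __name__ == "__main__":
    for x, y in [(123, 456), (158, 267), (999, 1)]:
        print(x, "+", y, "requires", count_carries(x, y), "carries")
```

Relating a property of this kind to observed solution times, across many problems, is the sort of comparison that relative complexity evidence calls for.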

What is the relationship between relative complexity evidence, intermediate state evidence, error evidence, and choice of encoding? In many cases researchers are specifically interested in using artificial neural networks to serve as models of human musical cognition (Griffith & Todd, 1999; Todd & Loy, 1991). In this case establishing the validity of the model likely requires collecting all three types of evidence, not only from the human subjects, but also from the neural network model. The hope is to find a close relation between the evidence collected from the human subjects and the evidence collected from the neural network model. Importantly, this match is likely to be highly related to choice of encoding. In other words, a music cognition researcher may not be interested in seeking the encoding that leads to the simplest network, but instead in seeking the encoding that leads to the best match between subject and model.
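One simple way to operationalize the idea of a best match, offered here only as a sketch, is to correlate a measure taken from human listeners with the same measure taken from networks trained under each encoding, and then to prefer the encoding with the highest correlation. The encoding labels, the numbers, and the use of Pearson correlation are all placeholder assumptions; they are not results from this chapter or from any experiment.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def best_matching_encoding(human, model_by_encoding):
    """Return the encoding whose model measure correlates best with the human measure."""
    scores = {name: pearson(human, values)
              for name, values in model_by_encoding.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Placeholder values only: one difficulty score per problem (hypothetical).
HUMAN_DIFFICULTY = [1.0, 2.0, 3.0, 4.0]
MODEL_DIFFICULTY = {
    "encoding_a": [2.0, 4.0, 6.0, 8.0],
    "encoding_b": [4.0, 3.0, 2.0, 1.0],
}

if __name__ == "__main__":
    best, scores = best_matching_encoding(HUMAN_DIFFICULTY, MODEL_DIFFICULTY)
    print("best match:", best, scores)
```

In practice the comparison would draw on all three kinds of evidence, and a single correlation is only one of many possible measures of fit.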

9.12 References

Bharucha, J. J. (1984). Anchoring effects in music: The resolution of dissonance. Cognitive Psychology, 16(4), 485-518.
Broze, Y., & Shanahan, D. (2013). Diachronic changes in jazz harmony: A cognitive perspective. Music Perception, 31(1), 32-45. doi: 10.1525/mp.2013.31.1.32
Calvo, P., & Gomila, A. (2008). Handbook Of Cognitive Science: An Embodied Approach. Oxford: Elsevier.
Chemero, A. (2009). Radical Embodied Cognitive Science. Cambridge, MA: MIT Press.
Conrad, R. (1964). Information, acoustic confusion, and memory span. British Journal of Psychology, 55, 429-432.
Dawson, M. R. W. (2013). Mind, Body, World: Foundations Of Cognitive Science. Edmonton, AB: Athabasca University Press.
Dawson, M. R. W., Dupuis, B., & Wilson, M. (2010). From Bricks To Brains: The Embodied Cognitive Science Of LEGO Robots. Edmonton, AB: Athabasca University Press.
Demsey, D. (1991). Chromatic third relations in the music of John Coltrane. Annual Review Of Jazz Studies, 5, 145-180.
Dourish, P. (2001). Where The Action Is: The Foundations Of Embodied Interaction. Cambridge, MA: MIT Press.
Gibson, J. J. (1979). The Ecological Approach To Visual Perception. Boston, MA: Houghton Mifflin.
Griffith, N., & Todd, P. M. (1999). Musical Networks: Parallel Distributed Perception And Performance. Cambridge, MA: MIT Press.
Heidegger, M. (1927/1962). Being And Time. New York: Harper.
Houston, S. (2004). Play Piano In A Flash! (1st Hyperion ed.). New York: Hyperion.
Jarvinen, T. (1995). Tonal hierarchies in jazz improvisation. Music Perception, 12(4), 415-437.
Josephson, M. (1961). Edison. New York: McGraw Hill.
Katz, B. F. (1995). Harmonic resolution, neural resonance, and positive affect. Music Perception, 13(1), 79-108.
Kelley, R. D. G. (2009). Thelonious Monk: The Life And Times Of An American Original (1st Free Press hardcover ed.). New York: Free Press.
Krumhansl, C. L. (1990). Cognitive Foundations Of Musical Pitch. New York: Oxford University Press.
Krumhansl, C. L., Bharucha, J. J., & Kessler, E. J. (1982). Perceived harmonic structure of chords in three related musical keys. Journal of Experimental Psychology: Human Perception and Performance, 8(1), 24-36.
Levine, M. (1989). The Jazz Piano Book. Petaluma, CA: Sher Music Co.
Newell, A., & Simon, H. A. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Norman, D. A. (1998). The Invisible Computer. Cambridge, MA: MIT Press.
Norman, D. A. (2002). The Design Of Everyday Things (1st Basic paperback ed.). New York: Basic Books.
Norman, D. A. (2004). Emotional Design: Why We Love (Or Hate) Everyday Things. New York: Basic Books.
Piston, W. (1962). Harmony (3rd ed.). New York: W. W. Norton.
Porter, L. (1998). John Coltrane: His Life And Music. Ann Arbor: University of Michigan Press.
Pylyshyn, Z. W. (1980). Computation and cognition: Issues in the foundations of cognitive science. Behavioral and Brain Sciences, 3(1), 111-132.
Pylyshyn, Z. W. (1984). Computation And Cognition. Cambridge, MA: MIT Press.
Rosner, B. S., & Narmour, E. (1992). Harmonic closure: Music theory and perception. Music Perception, 9(4), 383-411.
Schoenberg, A. (1969). Structural Functions Of Harmony (Rev. ed.). New York: W. W. Norton.
Shapiro, L. A. (2011). Embodied Cognition. New York: Routledge.
Shapiro, L. A. (2014). The Routledge Handbook Of Embodied Cognition (1st ed.). London: Routledge.
Simon, H. A. (1969). The Sciences Of The Artificial. Cambridge, MA: MIT Press.
Slonimsky, N. (1947). Thesaurus Of Scales And Melodic Patterns. New York: Coleman-Ross Company, Inc.
Steedman, M. J. (1984). A generative grammar for jazz chord sequences. Music Perception, 2(1), 52-77.
Sudnow, D. (1978). Ways Of The Hand: The Organization Of Improvised Conduct. Cambridge, MA: Harvard University Press.
Todd, P. M., & Loy, D. G. (1991). Music And Connectionism. Cambridge, MA: MIT Press.
Tymoczko, D. (2006). The geometry of musical chords. Science, 313(5783), 72-74.
Tymoczko, D. (2008). Scale theory, serial theory and voice leading. Music Analysis, 27(1), 1-49. doi: 10.1111/j.1468-2249.2008.00257.x
Tymoczko, D. (2011). A Geometry Of Music: Harmony And Counterpoint In The Extended Common Practice (E-pub ed.). New York: Oxford University Press.
Varela, F. J., Thompson, E., & Rosch, E. (1991). The Embodied Mind: Cognitive Science And Human Experience. Cambridge, MA: MIT Press.
Vera, A. H., & Simon, H. A. (1993). Situated action: A symbolic interpretation. Cognitive Science, 17, 7-48.
Waugh, N. C., & Norman, D. A. (1965). Primary memory. Psychological Review, 72, 89-104.
Winograd, T., & Flores, F. (1987). Understanding Computers And Cognition. New York: Addison-Wesley.
Yaremchuk, V., & Dawson, M. R. W. (2005). Chord classifications by artificial neural networks revisited: Internal representations of circles of major thirds and minor thirds. Artificial Neural Networks: Biological Inspirations - ICANN 2005, Pt 1, Proceedings, 3696, 605-610.
